Evaluating multi-label classifiers and recommender systems in the financial service sector

Published on Dec 1, 2019 in European Journal of Operational Research (4.213)
DOI: 10.1016/J.EJOR.2019.05.037
Matthias Bogaert (UGent: Ghent University), Estimated H-index: 4
Justine Lootens (UGent: Ghent University), Estimated H-index: 1
+ 1 author
Michel Ballings (UT: University of Tennessee), Estimated H-index: 11
Abstract The objective of this paper is to evaluate multi-label classification techniques and recommender systems for cross-sell purposes in the financial services sector. We carried out three analyses using data obtained from an international financial services provider. First, we tested four multi-label classification techniques, of which the two problem transformation methods were combined with several base classifiers. Second, we benchmarked the performance of five state-of-the-art recommender approaches. Third, we compared the best-performing multi-label classification and recommender approaches with each other. The results identify user-based collaborative filtering as the top-performing recommender system, with a cross-validated F1 measure of 42.20% and G-mean of 42.64%. Classifier chains binary relevance with adaboost and binary relevance with random forest are the top-performing multi-label classification algorithms for the F1 measure and G-mean, respectively, yielding a cross-validated F1 measure of 53.33% and G-mean of 54.37%. The statistical comparison between the best-performing approaches confirms the superiority of multi-label classification techniques. Our study provides important recommendations for financial services providers that are interested in the most effective methods to determine cross-sell opportunities. In previous studies, multi-label classification techniques and recommender systems were always investigated independently of each other. To the best of our knowledge, our study is therefore the first to compare both techniques in the financial services sector.
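The abstract reports results as an F1 measure and a G-mean but does not define either. As a rough illustration only, the sketch below computes both from a binary customer-by-product matrix. It assumes micro-averaging over all customer-product cells and assumes that G-mean means the geometric mean of precision and recall; another common definition uses sensitivity and specificity, and the paper itself may use either. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def micro_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives over all
    cells of two equally shaped 0/1 matrices (rows: customers, columns: products)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for actual, predicted in zip(true_row, pred_row):
            if actual and predicted:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    return tp, fp, fn


def precision_recall(y_true, y_pred):
    """Micro-averaged precision and recall (an assumption, see lead-in)."""
    tp, fp, fn = micro_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def g_mean(precision, recall):
    """Geometric mean of precision and recall (assumed definition)."""
    return math.sqrt(precision * recall)


# Hypothetical toy data: two customers, three products.
owned = [[1, 0, 0], [0, 1, 0]]
predicted = [[1, 1, 0], [0, 1, 1]]

p, r = precision_recall(owned, predicted)
print(p, r, f1_measure(p, r), g_mean(p, r))
```

Under this assumed definition the G-mean can never be smaller than the F1 measure for the same predictions, because a geometric mean is always at least as large as the corresponding harmonic mean.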
#1 Philipp Baumann (University of Bern), H-Index: 8
#2 Dorit S. Hochbaum (University of California, Berkeley), H-Index: 55
Last. Yan T. Yang (Amazon.com), H-Index: 3
We present here a computational study comparing the performance of leading machine learning techniques to that of recently developed graph-based combinatorial optimization algorithms (SNC and KSNC). The surprising result of this study is that SNC and KSNC consistently show the best or close to best performance in terms of their F1-scores, accuracy, and recall. Furthermore, the performance of SNC and KSNC is considerably more robust than that of the other algorithms; the others may perform well o...
16 Citations
#1 Steven Debaere (Lille Catholic University), H-Index: 3
#2 Kristof Coussement (Lille Catholic University), H-Index: 17
Last. Tom De Ruyck, H-Index: 5
Abstract Online innovation communities are defined as internet-based platforms for communication and exchange among customers interested in building innovations for a given product or technology. As firms recognize an online innovation community as a valuable resource for integrating external consumer knowledge into innovation processes, they increasingly seek to build long-term interactions and collaborations. However, in the pursuit of a long-term community, moderators face enormous challeng...
9 Citations
#1 Matthias Bogaert (UGent: Ghent University), H-Index: 4
#2 Michel Ballings (UT: University of Tennessee), H-Index: 11
Last. Dirk Van den Poel (UGent: Ghent University), H-Index: 52
The purpose of this paper is to evaluate which communication types on social media are most indicative for romantic tie prediction. In contrast to analyzing communication as a composite measure, we take a disaggregated approach by modeling separate measures for commenting, liking and tagging focused on an alter’s status updates, photos, videos, check-ins, locations and links. To ensure that we have the best possible model we benchmark 8 classifiers using different data sampling techniques. The r...
9 Citations
#1 Stijn Geuens (Lille Catholic University), H-Index: 1
#2 Kristof Coussement (Lille Catholic University), H-Index: 17
Last. Koen W. De Bock, H-Index: 12
Abstract This study proposes a decision support framework to help e-commerce companies select the best collaborative filtering algorithms (CF) for generating recommendations on the basis of online binary purchase data. To create this framework, an experimental design applies several CF configurations, which are characterized by different data-reduction techniques, CF methods, and similarity measures, to binary purchase data sets with distinct input data characteristics, i.e., sparsity level, pur...
31 Citations
#1 Matthias Bogaert, H-Index: 4
#2 Michel Ballings, H-Index: 11
Last. Dirk Van den Poel, H-Index: 52
This study assesses the feasibility of identifying self-reported sports practitioners (soccer players) on Facebook. The main goal is to develop a system to support marketers with the decision as to which prospects to target for advertising purposes. To do so, we benchmark several algorithms (i.e., random forest, logistic regression, adaboost, rotation forest, neural networks, and kernel factory) using five times twofold cross-validation. To evaluate performance and variable importances, we build...
4 Citations
#1 Bernd Bischl (LMU: Ludwig Maximilian University of Munich), H-Index: 27
#2 Michel Lang (LMU: Ludwig Maximilian University of Munich), H-Index: 11
Last. Zachary M. Jones (LMU: Ludwig Maximilian University of Munich), H-Index: 6
The MLR package provides a generic, object-oriented, and extensible framework for classification, regression, survival analysis and clustering for the R language. It provides a unified interface to more than 160 basic learners and includes meta-algorithms and model selection techniques to improve and extend the functionality of basic learners with, e.g., hyperparameter tuning, feature selection, and ensemble construction. Parallel high-performance computing is natively supported. The package tar...
264 Citations
#1 Damien Zufferey (University of Fribourg), H-Index: 5
#2 Thomas Hofer, H-Index: 2
Last. Stefano Bromuri, H-Index: 11
We are motivated by the issue of classifying diseases of chronically ill patients to assist physicians in their everyday work. Our goal is to provide a performance comparison of state-of-the-art multi-label learning algorithms for the analysis of multivariate sequential clinical data from medical records of patients affected by chronic diseases. As a matter of fact, the multi-label learning approach appears to be a good candidate for modeling overlapped medical conditions, specific to chronicall...
32 Citations
#1 Cataldo Musto, H-Index: 24
#2 Giovanni Semeraro, H-Index: 39
Last. Georgios Lekkas, H-Index: 2
Recommendation of financial investment strategies is a complex and knowledge-intensive task. Typically, financial advisors have to discuss at length with their wealthy clients and have to sift through several investment proposals before finding one able to completely meet investors' needs and constraints. As a consequence, a recent trend in wealth management is to improve the advisory process by exploiting recommendation technologies. This paper proposes a framework for recommendation of asset a...
30 Citations
Extensive study of 12 multi-label learning methods with interactivity constraints. Focus on the beginning of the classification task where few examples are available. Experimental evaluation with a protocol independent of any implementation environment. Classifier performances are evaluated for 7 quality and time criteria on 12 datasets. RF-PCT obtains the best predictive performance while being computationally efficient. Interactive classification aims at introducing user preferences in the learnin...
10 Citations
#1 Jie Lu (UTS: University of Technology, Sydney), H-Index: 66
#2 Dianshuang Wu (UTS: University of Technology, Sydney), H-Index: 11
Last. Guangquan Zhang (UTS: University of Technology, Sydney), H-Index: 59
A recommender system aims to provide users with personalized online product or service recommendations to handle the increasing online information overload problem and improve customer relationship management. Various recommender system techniques have been proposed since the mid-1990s, and many sorts of recommender system software have been developed recently for a variety of applications. Researchers and managers recognize that recommender systems offer great opportunities and challenges for b...
662 Citations
Cited By (7)
#1 Justin Munoz (RMIT: RMIT University)
#2 Ahmad Asgharian Rezaei (RMIT: RMIT University), H-Index: 3
Last. Laleh Tafakori (RMIT: RMIT University), H-Index: 1
Abstract A fundamental component of managing a marketing campaign is identifying prospects and selecting leads. Current lead generation models focus on predicting the intention of a customer to purchase a product; however, with financial products, particularly loans, this can be insufficient as there are many factors to consider, such as risk, utility, and financial maturity. Developing a marketing campaign for loan prospecting should consider not only customers who need a loan, but ...
#1 Gonzalo Nápoles (Tilburg University), H-Index: 14
#2 Marilyn Bello (Central University, India), H-Index: 4
Last. Yamisleydi Salgueiro (University of Talca), H-Index: 4
Abstract This paper presents a neural system to deal with multi-label classification problems that might involve sparse features. The architecture of this model involves three sequential blocks with well-defined functions. The first block consists of a multilayered feed-forward structure that extracts hidden features, thus reducing the problem dimensionality. This block is useful when dealing with sparse problems. The second block consists of a Long-term Cognitive Network-based model that operat...
1 Citation
Recommendation systems play an indispensable role in tourists’ decision-making process. An important issue for tourists concerns the selection of accommodation in accordance with the criteria on th...
1 Citation
#1 Lihi Dery (Ariel University), H-Index: 4
We survey multi-label ranking tasks, specifically multi-label classification and label ranking classification. We highlight the unique challenges, and re-categorize the methods, as they no longer fit into the traditional categories of transformation and adaptation. We survey developments in the last demi-decade, with a special focus on state-of-the-art methods in deep learning multi-label mining, extreme multi-label classification and label ranking. We conclude by offering a few future research ...
1 Citation
#1 Shang Gao (SUSE: Sichuan University of Science and Engineering), H-Index: 4
#2 Wenlu Dong (Jiangsu University), H-Index: 1
Last. Hualong Yu (SUSE: Sichuan University of Science and Engineering), H-Index: 17
Multi-label learning is a popular area of machine learning research as it is widely applicable to many real-world scenarios. In comparison with traditional binary and multi-classification tasks, the multi-label data are more easily impacted or destroyed by an imbalanced data distribution. This paper describes an adaptive decision threshold-based extreme learning machine algorithm (ADT-ELM) that addresses the imbalanced multi-label data classification problem. Specifically, the macro and micro F-...
2 Citations
#1 Mukul Gupta (Indian Institute of Management Indore), H-Index: 2
#2 Pradeep Kumar (Indian Institute of Management Lucknow), H-Index: 10
Abstract In today's era of electronic markets with information overload, generating personalized recommendations for e-commerce users is a challenging and interesting problem. Recommending top-N items of interest to e-commerce users is more challenging using binary implicit feedback. The training data is usually highly sparse and has binary values capturing a user's action or inaction. Due to the sparseness of data and lack of explicit user preferences, neighborhood-based and model-based approac...
10 Citations