Genetic Programming and Adaboosting based churn prediction for Telecom

Published on Dec 13, 2012 in SMC (Systems, Man and Cybernetics)
DOI: 10.1109/ICSMC.2012.6377917
Adnan Idris
Estimated H-index: 10
(PIEAS: Pakistan Institute of Engineering and Applied Sciences),
Asifullah Khan
Estimated H-index: 36
(PIEAS: Pakistan Institute of Engineering and Applied Sciences),
Yeon Soo Lee
Estimated H-index: 17
(DCU: Catholic University of Daegu)
Abstract
A churn prediction model guides customer relationship management in retaining customers who are expected to quit. In recent times, a number of tree-based ensemble classifiers have been used to model churn prediction in telecom. These models predict churners quite satisfactorily; however, there is considerable room for improvement. In telecom, the enormous size, imbalanced nature, and high dimensionality of the training dataset are the main reasons classification algorithms struggle to predict churners accurately. In this paper, we use a Genetic Programming (GP) based approach to model the challenging problem of churn prediction in telecom. AdaBoost-style boosting is used to evolve a number of programs per class. Finally, predictions are made with the resulting programs by computing, for each class, a weighted sum of the outputs of that class's programs and selecting the class with the higher output. The prediction accuracy is evaluated using 10-fold cross-validation on standard telecom datasets, and an area under the curve (AUC) score of 0.89 is observed. We hope that such an efficient churn prediction approach might be significantly beneficial for the competitive telecom industry.
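The final prediction step described in the abstract (a weighted sum of program outputs per class, with the higher sum winning) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the evolved GP programs are stood in for by hand-written lambdas, the two features (a normalized complaint rate and tenure in months) are hypothetical, and the vote weight uses the classic AdaBoost formula, which the abstract does not confirm.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A "program" stands in for an evolved GP expression: features -> real-valued output.
Program = Callable[[Sequence[float]], float]
WeightedPrograms = List[Tuple[float, Program]]


def adaboost_weight(error: float, eps: float = 1e-10) -> float:
    """Classic AdaBoost vote weight for a learner with the given weighted error.

    Assumption: the paper's exact weighting scheme is not given in the abstract.
    """
    error = min(max(error, eps), 1.0 - eps)  # keep the logarithm finite
    return 0.5 * math.log((1.0 - error) / error)


def class_score(programs: WeightedPrograms, x: Sequence[float]) -> float:
    """Weighted sum of the outputs of one class's programs."""
    return sum(alpha * program(x) for alpha, program in programs)


def predict(ensembles: Dict[str, WeightedPrograms], x: Sequence[float]) -> str:
    """Return the class whose weighted sum of program outputs is highest."""
    scores = {label: class_score(progs, x) for label, progs in ensembles.items()}
    return max(scores, key=scores.get)


# Hypothetical stand-ins for evolved programs; x = (complaint_rate, tenure_months).
ensembles: Dict[str, WeightedPrograms] = {
    "churn": [
        (adaboost_weight(0.2), lambda x: x[0] - 0.5),
        (adaboost_weight(0.4), lambda x: 1.0 if x[1] < 3 else -1.0),
    ],
    "non_churn": [
        (adaboost_weight(0.2), lambda x: 0.5 - x[0]),
        (adaboost_weight(0.4), lambda x: 1.0 if x[1] >= 3 else -1.0),
    ],
}

print(predict(ensembles, (0.9, 1)))   # high complaint rate, short tenure
print(predict(ensembles, (0.1, 12)))  # low complaint rate, long tenure
```

In this toy setup the first customer is labelled `churn` and the second `non_churn`. Programs with lower training error receive larger weights, so they dominate the per-class sum.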
References (16)
#1 B. Q. Huang (UCD: University College Dublin), H-Index: 4
#2 Mohand Tahar Kechadi (UCD: University College Dublin), H-Index: 5
Last. Brian Buckley (UCD: University College Dublin), H-Index: 1
This paper presents a new set of features for land-line customer churn prediction, including 2 six-month Henley segmentation, precise 4-month call details, line information, bill and payment information, account information, demographic profiles, service orders, complain information, etc. Then the seven prediction techniques (Logistic Regressions, Linear Classifications, Naive Bayes, Decision Trees, Multilayer Perceptron Neural Networks, Support Vector Machines and the Evolutionary Data Mining A...
163 Citations
#1 Koen W. De Bock (Lille Catholic University), H-Index: 12
#2 Dirk Van den Poel (UGent: Ghent University), H-Index: 52
Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combin...
84 Citations
#1 Antanas Verikas (KTU: Kaunas University of Technology), H-Index: 22
#2 Adas Gelzinis (KTU: Kaunas University of Technology), H-Index: 16
Last. Marija Bacauskiene (KTU: Kaunas University of Technology), H-Index: 19
Random forests (RF) has become a popular technique for classification, prediction, studying variable importance, variable selection, and outlier detection. There are numerous application examples of RF in a variety of fields. Several large scale comparisons including RF have been performed. There are numerous articles, where variable importance evaluations based on the variable importance measures available from RF are used for data exploration and understanding. Apart from the literature survey...
435 Citations
In this article, we test the usefulness of the popular data mining models to predict churn of the clients of the Polish cellular telecommunication company. When comparing to previous studies on this topic, our research is novel in the following areas: (1) we deal with prepaid clients (previous studies dealt with postpaid clients) who are far more likely to churn, are less stable and much less is known about them (no application, demographical or personal data), (2) we have 1381 potential variabl...
79 Citations
Mar 1, 2010 in SMC (Systems, Man and Cybernetics)
#1 P.G. Espejo (UCO: University of Córdoba (Spain)), H-Index: 1
#2 Sebastián Ventura (UCO: University of Córdoba (Spain)), H-Index: 48
Last. Francisco Herrera (UGR: University of Granada), H-Index: 154
Classification is one of the most researched questions in machine learning and data mining. A wide range of real problems have been stated as classification problems, for example credit scoring, bankruptcy prediction, medical diagnosis, pattern recognition, text categorization, software quality assessment, and many more. The use of evolutionary algorithms for training classifiers has been studied in the past few decades. Genetic programming (GP) is a flexible and powerful evolutionary technique ...
399 Citations
#1 Lior Rokach (BGU: Ben-Gurion University of the Negev), H-Index: 60
Ensemble methodology, which builds a classification model by integrating multiple classifiers, can be used for improving prediction performance. Researchers from various disciplines such as statistics, pattern recognition, and machine learning have seriously explored the use of ensemble methodology. This paper presents an updated survey of ensemble methods in classification tasks, while introducing a new taxonomy for characterizing them. The new taxonomy, presented from the algorithm designer's ...
176 Citations
#1 Yaya Xie (THU: Tsinghua University), H-Index: 4
#2 Xiu Li (THU: Tsinghua University)
Last. Weiyun Ying (Xi'an Jiaotong University), H-Index: 5
Churn prediction is becoming a major focus of banks in China who wish to retain customers by satisfying their needs under resource constraints. In churn prediction, an important yet challenging problem is the imbalance in the data distribution. In this paper, we propose a novel learning method, called improved balanced random forests (IBRF), and demonstrate its application to churn prediction. We investigate the effectiveness of the standard random forests approach in predicting customer churn, ...
235 Citations
#1 Chunxia Zhang (Xi'an Jiaotong University), H-Index: 15
#2 Jiangshe Zhang (Xi'an Jiaotong University), H-Index: 23
This paper presents a novel ensemble classifier generation technique RotBoost, which is constructed by combining Rotation Forest and AdaBoost. The experiments conducted with 36 real-world data sets available from the UCI repository, among which a classification tree is adopted as the base learning algorithm, demonstrate that RotBoost can generate ensemble classifiers with significantly lower prediction error than either Rotation Forest or AdaBoost more often than the reverse. Meanwhile, RotBoost...
120 Citations
50 Citations
Cited By (29)
#2 Xinfeng Zhang (Beijing University of Technology), H-Index: 4
Last. Hui Li
Customer retention is a major challenge in several business sectors and diverse companies identify the customer churn prediction (CCP) as an important process for retaining the customers. CCP in the telecommunication sector has become an essential need owing to a rise in the number of the telecommunication service providers. Recently, machine learning (ML) and deep learning (DL) models have begun to develop effective CCP model. This paper presents a new improved synthetic minority over-sampling ...
1 Citation
#2 Sérgio Moro (ISCTE-IUL: ISCTE – University Institute of Lisbon), H-Index: 16
#1 Praveen Lalwani, H-Index: 6
#2 Manas Kumar Mishra, H-Index: 6
The customer churn prediction (CCP) is one of the challenging problems in the telecom industry. With the advancement in the field of machine learning and artificial intelligence, the possibilities to predict customer churn has increased significantly. Our proposed methodology, consists of six phases. In the first two phases, data pre-processing and feature analysis is performed. In the third phase, feature selection is taken into consideration using gravitational search algorithm. Next, the data...
3 Citations
#2 Alireza Shafieinejad (TMU: Tarbiat Modares University), H-Index: 4
Last. Halim Yanikomeroglu, H-Index: 52
Subscriber authentication is a primitive operation in mobile networks required by each operator prior to offering any service to end users. In this paper, we propose a novel blockchain-based Authentication and Key Agreement (AKA) protocol for roaming services in 5G networks. Each Home Network (HN) creates its own smart contract and publishes its address to inform other operators who want to offer roaming services to HN subscribers. All subsequent communication between the HN and Serving Network ...
2 Citations
#2 Joshua Denton (MSU: Mississippi State University), H-Index: 1
Last. Abbas Keramati, H-Index: 16
Knowledge-based churn prediction and decision making is invaluable for telecom companies due to highly competitive markets. The comprehensiveness and action ability of a data-driven churn prediction system depend on the effective extraction of hidden patterns from the data. Generally, data analytics is employed to extrapolate the extracted patterns from the training dataset to the test set. In this study, one more step is taken; the improved prediction performance is attained by capturing the in...
2 Citations
In today’s date where machine learning is the key to solve so many problems in different fields, one really should know the extent of its importance in their field. One of the major applications of machine learning is Predictive Analytics. Churn prediction is one of the key steps for customer retention in this saturating market scenario [31]. This is one of the major objectives and any toolkit which can give insights on this can be really beneficial for any service providing companies. Furthermo...
1 Citation
#1 Yongtao Zhang (ZJU: Zhejiang University), H-Index: 2
#2 Shibo He (ZJU: Zhejiang University), H-Index: 35
Last. Jiming Chen (ZJU: Zhejiang University), H-Index: 74
Customer churn as a common but extremely harmful phenomenon haunts service operators in telecommunications industry for a long time. Discriminating customers prone to churning in the early stage and taking precaution measures can mitigate customer churn effectively. Unlike previous works which mainly considered the inter-operator customer churn, we focus on a new problem of intra-operator customer churn, that is, customers abandon their fourth generation (4G) mobile communication services and sw...
1 Citation