Using PCA to predict customer churn in telecommunication dataset

Published on Nov 19, 2010 in ADMA (Advanced Data Mining and Applications)
DOI: 10.1007/978-3-642-17313-4_32
T. Sato (UCD: University College Dublin), estimated H-index: 1
B. Q. Huang (UCD: University College Dublin), estimated H-index: 4
+ 2 authors
B. Buckley, estimated H-index: 1
Failure to identify potential churners significantly affects a company's revenues and the services it can provide. The imbalanced distribution of instances between churners and non-churners and the size of the customer dataset are the main concerns when building a churn prediction model. This paper presents a local PCA classifier approach that avoids these problems by comparing the eigenvalues of the best principal component. The experiments were carried out on a large real-world telecommunication dataset and assessed on a churn prediction task. The experimental results showed that the local PCA classifier generally outperformed Naive Bayes, logistic regression, SVM and the C4.5 decision tree in terms of true churn rate.
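The abstract does not spell out the decision rule, so the following is only a minimal sketch of the general idea of a class-wise ("local") PCA classifier, not the authors' method. It swaps in a common variant: one PCA is fitted per class, and a sample is assigned to the class whose principal subspace reconstructs it with the smallest error. The names `LocalPCAClassifier` and `true_churn_rate` are hypothetical, the toy data are made up for illustration, and "true churn rate" is assumed here to mean the fraction of actual churners that are predicted as churners.

```python
import numpy as np


class LocalPCAClassifier:
    """Class-wise PCA classifier (illustrative sketch, not the paper's rule)."""

    def __init__(self, n_components=1):
        self.n_components = n_components
        self.models_ = {}  # label -> (class mean, principal axes)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        for label in np.unique(y):
            Xc = X[y == label]
            mean = Xc.mean(axis=0)
            # Rows of vt are the principal axes of the centered class data.
            _, _, vt = np.linalg.svd(Xc - mean, full_matrices=False)
            self.models_[label] = (mean, vt[: self.n_components])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        labels = list(self.models_)
        errors = np.empty((X.shape[0], len(labels)))
        for j, label in enumerate(labels):
            mean, axes = self.models_[label]
            centered = X - mean
            # Project onto the class subspace and measure what is left over.
            reconstructed = centered @ axes.T @ axes
            errors[:, j] = np.linalg.norm(centered - reconstructed, axis=1)
        return np.array(labels)[errors.argmin(axis=1)]


def true_churn_rate(y_true, y_pred, churn_label=1):
    """Assumed definition: share of actual churners predicted as churners."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    churners = y_true == churn_label
    return float((y_pred[churners] == churn_label).mean())


# Toy, imbalanced data (invented for illustration): 50 non-churners, 5 churners.
t_major = np.linspace(-1.0, 1.0, 50)
t_minor = np.linspace(-1.0, 1.0, 5)
X_demo = np.vstack([
    np.column_stack([t_major, np.zeros(50)]),          # label 0: non-churn
    np.column_stack([np.full(5, 5.0), 5.0 + t_minor]),  # label 1: churn
])
y_demo = np.array([0] * 50 + [1] * 5)

clf = LocalPCAClassifier(n_components=1).fit(X_demo, y_demo)
print(true_churn_rate(y_demo, clf.predict(X_demo)))
```

Because each class gets its own model, the minority class is not swamped by the majority class during fitting, which is one plausible reason such an approach is attractive for imbalanced churn data.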
Aug 30, 2009 in DaWaK (Data Warehousing and Knowledge Discovery)
#1B. Q. Huang (UCD: University College Dublin)H-Index: 4
#2M-T. Kechadi (UCD: University College Dublin)H-Index: 4
Last. B. BuckleyH-Index: 2
Although churn prediction has been an area of research in the voice branch of telecommunications services, more focused studies on the huge growth area of Broadband Internet services are limited. Therefore, this paper presents a new set of features for broadband Internet customer churn prediction, based on Henley segments, broadband usage, dial types, dial-up spend, line information, bill and payment information, and account information. Then the four prediction techniques (Logistic Regre...
28 Citations
#1Show-Jane Yen (MCU: Ming Chuan University)H-Index: 16
#2Yue-Shi Lee (MCU: Ming Chuan University)H-Index: 17
For classification problems, the training data significantly influence classification accuracy. However, data in real-world applications often have an imbalanced class distribution, that is, most of the data are in the majority class and few are in the minority class. In this case, if all the data are used as training data, the classifier tends to predict that most of the incoming data belong to the majority class. Hence, it is important to select the suitable training data for c...
408 Citations
50 Citations
#1Kristof Coussement (UGent: Ghent University)H-Index: 17
#2Dirk Van den Poel (UGent: Ghent University)H-Index: 52
CRM gains increasing importance due to intensive competition and saturated markets. With the purpose of retaining customers, academics as well as practitioners find it crucial to build a churn prediction model that is as accurate as possible. This study applies support vector machines in a newspaper subscription context in order to construct a churn model with a higher predictive performance. Moreover, a comparison is made between two parameter-selection techniques, needed to implement support v...
327 Citations
#1Kristof CoussementH-Index: 17
#2D. Van Den PoelH-Index: 5
91 Citations
#1Luo BinH-Index: 2
#2Shao Peiji (University of Electronic Science and Technology of China)H-Index: 5
Last. Liu Juan (University of Electronic Science and Technology of China)H-Index: 1
Nowadays, churn prediction and management are critical for more and more companies in the fast-changing and strongly competitive telecommunication market. In order to improve customer retention, telecommunication companies must be able to identify at-risk customers who are prone to switch service provider. In this study, to overcome the limited information available on customers of the Personal Handyphone System Service (PHSS) and to build an effective and accurate customer churn model, three res...
49 Citations
#1Hee-Su KimH-Index: 1
#2Choong-Han Yoon (Hanyang University)H-Index: 1
By using a binomial logit model based on a survey of 973 mobile users in Korea, the determinants of subscriber churn and customer loyalty are identified in the Korean mobile telephony market. The probability that a subscriber will switch carrier is dependent on the level of satisfaction with alternative-specific service attributes including call quality, tariff level, handsets, brand image, as well as income, and subscription duration. However, only factors such as call quality, handset...
326 Citations
#1Nitesh V. ChawlaH-Index: 69
#2Nathalie Japkowicz (U of O: University of Ottawa)H-Index: 30
Last. Aleksander KotczH-Index: 1
1,468 Citations
#1Wai-Ho AuH-Index: 12
#2Keith C. C. ChanH-Index: 40
Last. Xin YaoH-Index: 112
Classification is an important topic in data mining research. Given a set of data records, each of which belongs to one of a number of predefined classes, the classification problem is concerned with the discovery of classification rules that can allow records with unknown class membership to be correctly classified. Many algorithms have been developed to mine large data sets for classification models and they have been shown to be very effective. However, when it comes to determining the likeli...
297 Citations
#1Chih-Ping Wei (NSYSU: National Sun Yat-sen University)H-Index: 1
#2I-Tang Chiu (Chunghwa Telecom)H-Index: 1
As deregulation, new technologies, and new competitors open up the mobile telecommunications industry, churn prediction and management has become of great concern to mobile service providers. A mobile service provider wishing to retain its subscribers needs to be able to predict which of them may be at-risk of changing services and will make those subscribers the focus of customer retention efforts. In response to the limitations of existing churn-prediction systems and the unavailabili...
337 Citations
Cited By: 6
#1Ming Zhao (Chongqing Technology and Business University)
#2Qingjun Zeng (Chongqing Technology and Business University)
Last. Jiafu Su (Chongqing Technology and Business University)
Customer churn will cause the value flowing from customers to enterprises to decrease. If customer churn continues to occur, the enterprise will gradually lose its competitive advantage. When the growth of new customers cannot meet the needs of enterprise development, the enterprise will fall into a survival dilemma. Focusing on the customer churn prediction model, this paper takes the telecom industry in China as the research object, establishes a customer churn prediction model by using a logi...
Feb 26, 2018 in ICML (International Conference on Machine Learning)
#1Shaoying Cui (Dalian Maritime University)H-Index: 1
#2Ning Ding (Dalian Maritime University)H-Index: 1
In this era of strong competition, data mining can provide effective support to bank operators in their effort to analyze and forecast the real needs of their customers. Applying bank data mining results to the actual business enables banks to develop products that not only meet customer needs but also deliver maximum bank profitability. In this paper, a data mining model for bank customers is established, and after extraction and transformation, the data are loaded into a data warehouse specifi...
1 Citation
Jul 18, 2016 in ICDM (Industrial Conference on Data Mining)
#1Bingquan Huang (UCD: University College Dublin)
#2Ying Huang (UCD: University College Dublin)H-Index: 3
Last. Mohand Tahar Kechadi (UCD: University College Dublin)H-Index: 5
Customer churn has emerged as a critical issue for Customer Relationship Management and customer retention in the telecommunications industry; churn prediction is therefore necessary and valuable for retaining customers and reducing losses. Recently, rule-based classification methods designed to interpret classification results transparently have become preferable in customer churn prediction. However, most rule-based learning algorithms, designed under the assumption of well-balanced datasets, may prov...
1 Citation
Industrial businesses must respond efficiently to market demands; therefore, industrial construction must accurately predict the project duration at the pre-investment stage. In practice, project duration predictions rely on the experience of project managers. To provide impartial expertise and quantitatively estimate the duration of constructing an industrial building, an extensive history of industrial building cases was collected to form a database. Principal component analysis was a...
5 Citations
Last. Min Li
Telecom operators face an urgent problem of customer churn. This paper divides customers into value levels according to their three-month average consumption, builds models with decision tree and clustering algorithms from data mining, evaluates them with a confusion matrix, and uses the rule set output by the model for targeted customer-retention marketing, so as to reduce customer churn, improve t...
#1Nittaya Kerdprasop (Suranaree University of Technology)H-Index: 9
Last. Kittisak Kerdprasop (Suranaree University of Technology)H-Index: 9
In the era of digital technologies, most enterprises have collected huge amounts of data in electronic form. Business intelligence technology has emerged as a tool to support information summarization, pattern extraction, knowledge discovery, and other knowledge-related tasks. The main part of most business intelligence software is the data mining engine, which analyzes and reports relationships that exist in the stored data. Visualization tools are created to help data analysts easily explore the in...
9 Citations