Stable and accurate customer churn prediction: comparative analysis of eight classification algorithms

Vincent Alexander Haris, Muhammad Ilyas Arsyad, Nathanael Septhian Adi Nugraha, Yasi Dani, Maria Artanta Ginting

Abstract


Predicting customer churn is a challenging problem in many subscription-based industries, yet retaining existing customers is generally considered more cost-effective than acquiring new ones. In this research, customer churn is predicted using a public dataset from an internet service provider, with 72,274 instances and a 55% churn rate. The main contribution is a comprehensive comparison of the stability and performance of eight classification algorithms for customer churn prediction on a large-scale public dataset. The research process includes data collection, data preprocessing, feature engineering, and model evaluation. Models are evaluated on test accuracy, accuracy gap, precision, recall, F1-score, and ROC AUC, using stratified K-Fold cross-validation. Since the proportions of churn and non-churn instances in the dataset are relatively balanced, the F1-score is used as the primary evaluation metric, as it provides a balanced assessment of precision and recall for both classes. The results show that CatBoost and XGBoost are the most effective models, achieving F1-scores of 94.97% and 94.92%, respectively.
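The evaluation protocol named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names are hypothetical, the fold count of 5 is an assumption (the abstract does not state K), and the accuracy gap is assumed here to mean training accuracy minus test accuracy, which the abstract does not define.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class ratios per fold.

    k=5 is an assumed value; the abstract does not give the fold count.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_gap(train_acc, test_acc):
    """Assumed definition: training accuracy minus test accuracy."""
    return train_acc - test_acc


# Toy labels with the same 55% churn rate as the dataset (1 = churn).
toy_labels = [1] * 55 + [0] * 45
fold_churn_counts = [
    sum(toy_labels[i] for i in test)
    for _, test in stratified_k_fold(toy_labels, k=5)
]
```

With 55 churn and 45 non-churn toy labels, each of the five test folds holds 11 churn and 9 non-churn instances, so every fold keeps the 55% churn rate of the full set. Under the assumed definition, a small accuracy gap would indicate that a model's test performance stays close to its training performance.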

Keywords


Customer churn prediction; Machine learning; Classification algorithm; Evaluation metric; Performance evaluation


DOI: http://doi.org/10.11591/ijeecs.v41.i2.pp655-665


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
