Analyzing the Impact of Feature Selection on Customer Churn Prediction in the Retail E-Commerce Industry
Meryem Chajia, El Habib Nfaoui, Soufiyan Ouali
July 1, 2026

Abstract

Customer churn has become a major challenge in the retail industry, where customer loyalty directly affects business success and sustainability. Despite significant progress in Artificial Intelligence, especially in prediction tasks, its use in the retail e-commerce domain remains limited and underexplored, largely because available datasets are scarce and of limited quality. To address these challenges, this paper proposes a churn prediction approach designed to handle data scarcity while maintaining accurate performance. We combined various feature selection techniques with several Machine Learning and Deep Learning models and evaluated their performance on a limited tabular dataset, systematically analyzing the impact of feature selection on predictive performance. The results show that feature selection plays an important role in improving model performance by identifying the features most relevant to the classification task. L1-based Logistic Regression feature selection combined with the Extreme Gradient Boosting classifier achieved the best performance, with a Macro F1-score of 95.25%. Based on these results, companies can identify potential churners and implement retention strategies. These findings may provide a useful reference point for future researchers in the retail e-commerce industry.
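
As a brief illustration of the metric reported above: the Macro F1-score is the unweighted mean of the per-class F1 scores, so the usually smaller churner class counts as much as the retained class. The sketch below is not the authors' evaluation code. The function name `macro_f1` and the ten labels are hypothetical, chosen only to show how Macro F1 can fall well below accuracy when the minority class is predicted poorly.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives for class c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives for class c
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives for class c
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels (assumption): 1 = churned, 0 = retained.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # one churner is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, macro F1 = {score:.3f}")
```

In this toy case accuracy is 0.900 while Macro F1 is about 0.804, because the F1 of the churner class (2/3) is averaged with equal weight against the F1 of the retained class (16/17).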

IPC Classification

G06

Keywords

analyzing, impact, feature, selection, customer, churn, prediction, retail, e-commerce, industry, become, major, challenge, where, loyalty, directly, affects, business, success, sustainability, despite, significant, progress, artificial
