[WSS17] Churn Classification of Mobile Telecom CDR Data

Churn Classification for Mobile Telecom CDR Data


Objective and Business Use Case

Churn in the telecommunications industry occurs when customers leave their current operator and move to another telecom company. As churn increases, operators need a process for retaining their profitable customers; this is known as churn management. In the communication service provider (CSP) industry, each company offers customers large incentives to lure them into switching to its service.

Telecom churn is traditionally classified into two main categories: involuntary and voluntary. Of the two, involuntary churn is easier to identify. Involuntary churners are customers whom the CSP decides to remove from its subscriber base, for instance because of fraud or non-payment. Voluntary churn, where the customer decides to unsubscribe from the service provider, is much harder to determine manually, given the amount of data and the frequency at which it is generated. Voluntary churn can be further classified as either incidental or deliberate. Incidental churn occurs without any prior planning by the churner, due to a change in financial condition, location, quality, etc. Deliberate churn happens for reasons of technological advancement, economics, quality and convenience. Most operators focus mainly on these types of churn. Identifying potential churners predictively is vital for a CSP to maintain its revenue: acquiring a new customer costs almost six times as much as retaining an existing one.

The tendency to churn can be detected by analysing the Call/Charging Data Records (CDR), which the core network generates automatically, every minute of every day, whenever there is any activity in the telecom system. The approach here is to aggregate the customer-related data needed for the analysis and classify the customers who might churn. The outcome can be used for various business use cases: to improve targeted marketing, improve product design, identify network faults that lead to churn, and detect potential fraud. With the impending risk of OTT VoIP cannibalising voice calls, a clear understanding of customer behaviour towards churn is vital for operators to maintain steady revenue. SPSS and SAS, together with SQL, have long been the traditional tools for CDR analysis. Python and R are relatively new in this domain and are limited to a few prototypes. Using Wolfram Language as a new approach looks beneficial, owing to its vast set of built-in math functions, its features and its ease of use for problems of this kind.

Data Description

In a real-world scenario, CDR data has hundreds of parameters, as defined by the 3GPP TS 32.298 standard, "Charging Data Record (CDR) parameter description". The operator chooses which of them to configure.

For prototyping, we limit ourselves to a few key parameters relevant to customer calls. The data fields are described below:

phone - (discrete) - number that uniquely identifies a subscriber
account length - (continuous) - tenure of the customer with the brand
number vmail messages - (continuous) - number of voice mail messages
total day minutes - (continuous) - total number of minutes of mobile usage during the daytime hours
total day calls - (continuous) - total number of calls made during the daytime hours
total day charge - (continuous) - total charges incurred for usage during the daytime hours
total eve minutes - (continuous) - total number of minutes of mobile usage during the evening hours
total eve calls - (continuous) - total number of calls made during the evening hours
total eve charge - (continuous) - total charges incurred for usage during the evening hours
total night minutes - (continuous) - total number of minutes of mobile usage during the night hours
total night calls - (continuous) - total number of calls made during the night hours
total night charge - (continuous) - total charges incurred for usage during the night hours
total intl minutes - (continuous) - total number of minutes of mobile usage for international outgoing calls
total intl calls - (continuous) - total number of international outgoing calls
total intl charge - (continuous) - total charges incurred for international outgoing calls
number customer service calls - (continuous) - total number of customer support calls made

"Phone" should not be used as part of the training, since it has no predictive value.

The last column, "Churn", is the classification label (True, False).
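To make the column handling concrete, here is a minimal Python sketch (not the project's Wolfram Language code, which lives in the linked repository). It assumes a CSV whose header uses the field names listed above plus a final "churn" column holding True/False; the two sample rows are invented for illustration and are not taken from the dataset. The "phone" column is read and then discarded, so that only the 15 numeric fields reach the model.

```python
import csv
import io

# Assumed header: the field names from the list above, then the label column.
COLUMNS = [
    "phone", "account length", "number vmail messages",
    "total day minutes", "total day calls", "total day charge",
    "total eve minutes", "total eve calls", "total eve charge",
    "total night minutes", "total night calls", "total night charge",
    "total intl minutes", "total intl calls", "total intl charge",
    "number customer service calls", "churn",
]

# Invented rows, for illustration only.
SAMPLE_CSV = "\n".join([
    ",".join(COLUMNS),
    "555-0101,120,10,200.5,100,34.09,180.0,90,15.30,220.0,95,9.90,10.0,4,2.70,1,False",
    "555-0102,45,0,310.2,120,52.73,250.1,110,21.26,190.4,80,8.57,12.5,2,3.38,5,True",
])


def load_rows(text):
    """Return (features, labels); 'phone' is dropped, 'churn' becomes a bool."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        labels.append(row.pop("churn").strip().lower() == "true")
        row.pop("phone")  # identifier only, no predictive value
        features.append([float(value) for value in row.values()])
    return features, labels


features, labels = load_rows(SAMPLE_CSV)
```

Each entry of `features` is a list of 15 floats in the column order above, and `labels` holds the matching churn flags.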

The data used for this project can be found here: https://github.com/lazycrazyowl/Wolfram-Summer-School-2017-Jassim-Moideen/tree/master/Project/Assets

Approach

The approach is to aggregate the CDR data into the per-customer call details needed for the analysis, with each customer labelled as a churner or non-churner. The dataset is divided into training and test sets in an 80:20 ratio. A simple neural network is then built and trained on the training set to classify customers who might churn, and the model is evaluated on the test set.
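The split, train and test steps can be sketched as follows. This is a Python illustration under stated assumptions, not the Wolfram Language implementation from the project: a single sigmoid unit (that is, logistic regression) stands in for the "simple neural network", and the two features and the labelling rule are synthetic, chosen only so that the example runs on its own.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80:20 into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    # Clamp z so math.exp cannot overflow for extreme inputs.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict(model, x):
    """Probability of churn for one feature vector."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(rows, epochs=200, lr=0.1):
    """Fit one sigmoid unit by per-sample gradient descent on the log loss."""
    weights, bias = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        for x, y in rows:
            # Gradient of the log loss with respect to the pre-activation.
            err = predict((weights, bias), x) - (1.0 if y else 0.0)
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def accuracy(model, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum((predict(model, x) >= 0.5) == y for x, y in rows)
    return hits / len(rows)


# Synthetic stand-in data: two features already scaled to [0, 1] and an
# invented rule for the churn label. Real features would need scaling first.
rng = random.Random(42)
data = []
for _ in range(200):
    x = [rng.random(), rng.random()]
    data.append((x, x[0] + x[1] > 1.0))

train_rows, test_rows = split_80_20(data)
model = train(train_rows)
test_accuracy = accuracy(model, test_rows)
```

Holding out the 20% test set before any training means `test_accuracy` is measured on customers the model has never seen, which is the point of the split.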

Code

The code used here can be found on GitHub: https://github.com/lazycrazyowl/Wolfram-Summer-School-2017-Jassim-Moideen/tree/master/Project

Conclusion

This is a proof of concept towards building better models with a similar approach for telecom CDR analytics. What still needs to be explored is validating this model on real-time data in batch-processed mode, with a few hundred more parameters, on an NVIDIA GPU such as the 1070. Once the approach is understood, the next step would be to build a model using deep learning networks to predict customer churn in a mobile telecommunication network on a daily basis.

References

Papers:

     1. A Customer Churn Prediction Model in Telecom Industry Using Boosting - Ning Lu et al. - IEEE Transactions on Industrial Informatics, Volume 10, Issue 2, May 2014
     2. Customer Churn in Mobile Markets: A Comparison of Techniques - E-ISSN 1913-9012 - https://arxiv.org/ftp/arxiv/papers/1607/1607.07792.pdf
     3. Deep Learning in Customer Churn Prediction: Unsupervised Feature Learning on Abstract Company Independent Feature Vectors - Philip Spanoudes, Thomson Nguyen

Texts:

     1. Telecom Churn Management: The Golden Opportunity by Rob Mattison.
     2. Customer Churn Reduction and Retention for Telecoms: Models for All Marketers by Arthur Middleton Hughes. 

CDR Standard:

     1. 3GPP TS 32.298: Charging Data Record (CDR) parameter description standard.

My special thanks to my mentor Riccardo Di Virgilio and the rest of the WSS17 mentors for their kind support.

POSTED BY: Jassim Moideen
19 days ago
