AEM Group4 Project Report-1
Applied Econometrics for Managers Group - 04
Table of Contents
PROBLEM STATEMENT
Gathering Data
Describing Data
Data Verification
Exploring Data
EXPLORATORY RESEARCH
Descriptive Test
Data Visualization
DATA PREPARATION
REGRESSION MODEL
Model 1: Probit Model
Model 2: Logit Model
LR Test
Calculation of APE (Average Partial Effect)
Through voice and data communication services, the telecom sector is crucial in bridging the
gap between individuals and organisations around the globe. It includes a wide range of
services, such as internet access, mobile and landline phone services, and more recently,
In the telecom sector, it is essential to analyse customer turnover rate and its contributing
causes for a number of reasons. First, it assists telecom firms in identifying and resolving
problems with network quality, price, or customer service that result in customer churn. In
order to keep important clients, suppliers may improve their marketing and retention tactics
by having a better grasp of churn. Finally, lowering churn is a crucial indicator for telecom
organisations to track and manage since it may greatly affect profitability and long-term
The rate at which consumers stop doing business with a company is known as the churn rate,
often referred to as the attrition rate. This measure, which is frequently represented as the
percentage of service subscribers that cancel their subscriptions within a specified time frame,
reveals the effectiveness of the company's customer service division and its potential for
overall growth. The major segments within these sub-sectors include the following:
Wireless Telecommunications
Fixed-Line Telecommunications
Equipment Manufacturers
The industry is reportedly one of the top producers of new jobs and receives significant NRI
investment. Government monopolies are facing a torrent of new competitors as they are being
privatised in several countries across the world. The growth of mobile services is surpassing that
of fixed-line services, and voice is starting to lose ground to the Internet as the primary method
The telecommunications industry's churn rate fluctuates for a variety of reasons. Using
CHURN as the dependent variable, the objective is to identify the factors that contribute
most to customer turnover.
The following are the major problems which will be addressed through the study:
The effect of the customer's choice of service type, such as 2G, 3G, or 4G
Gathering Data
The dataset has been obtained from the telecommunication sector.
Describing Data
Each row in the dataset corresponds to a unique customer and each column to a separate
attribute. The dataset describes 7038 distinct customers across 11 attributes.
Data Verification
To ensure that the data is consistent, the dataset is examined for any attributes with missing
Exploring Data
Character data columns, binary data columns, and numeric data columns are all mixed
All the columns in the dataset or the attributes of the data are described below in the table.
In order to draw some conclusions from the data, statistical tools have been used to analyse it.
It also describes the dataset's properties, such as the kind of variables used and how they are
used. Additionally, it gives a brief overview of a number of factors that could be significant
Descriptive Test
Inference: The proportion of missing data is very small, so we omit those rows.
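A minimal sketch of this check in R, assuming a small made-up data frame in place of the real dataset. The values are hypothetical; the names mydata and mydata1 follow the appendix, and the columns Churn, charges, and tenure are used only for illustration.

```r
# Hypothetical toy data standing in for the telecom dataset (not the real data).
mydata <- data.frame(
  Churn   = c("Yes", "No", "No", NA, "No", "Yes"),
  charges = c(70.5, 20.1, NA, 55.0, 89.9, 45.3),
  tenure  = c(1, 34, 2, 45, 8, 22)
)

# Share of missing values in each column.
missing_share <- colMeans(is.na(mydata))
print(round(missing_share, 3))

# Share of rows with at least one missing value.
row_share <- mean(!complete.cases(mydata))
print(row_share)

# Drop the incomplete rows, as done in the appendix.
mydata1 <- na.omit(mydata)
print(nrow(mydata1))
```

Checking the row-level share before calling na.omit is one way to confirm that dropping rows discards only a small part of the sample.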
Data Visualization
It can be observed that almost 73% of the customers have stayed with their services, while
the other 27% have left their operator or services. The churn rate is therefore about 27%,
and the remaining 73% of customers have stayed with the network operators or services that
they chose initially.
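The churn share reported here is a simple proportion of a frequency table. A minimal sketch in R, assuming a made-up churn vector with 27 churners out of 100 customers (hypothetical values; the name status follows the appendix):

```r
# Hypothetical churn indicator: 27 churners out of 100 customers.
churn <- c(rep("Yes", 27), rep("No", 73))

# Frequency table and the corresponding proportions.
status <- table(churn)
churn_share <- prop.table(status)
print(churn_share)

# Churn rate = customers who left / all customers.
churn_rate <- churn_share[["Yes"]]
print(churn_rate)
```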
Fig 3: Box plot of the monthly charges paid by customers, by churn status
It is clear that the fees for consumers who have abandoned the services are somewhat greater
than the fees for users who have not abandoned the services. We may conclude that a small
number of clients abandoned the services because the prices were too high, as opposed to those
It can be observed that the number of customers leaving the services is almost the same for
males and females. This indicates that gender is not a major factor in a customer's
decision to leave or to continue with the service.
It can be observed that senior citizens are more likely to leave the services than
non-senior citizens. This indicates that senior citizen status is a significant factor in a
customer's decision to leave or to continue with the service.
It can be observed that the number of customers leaving Airtel and VI is slightly lower
than the number leaving BSNL and RJio. Overall, we can also see that most customers stay
with their operators.
However, this also means that all the operators need to change their strategies so as to
It can be observed that travelling is not a major factor in customer churn, as all three
categories have a similar and low number of customers leaving the operator. Travelling has
a slightly higher value than Indoor and Outdoor, but the difference is very small.
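Each of the group-wise comparisons in this section is a row-wise proportion of a two-way table. A minimal sketch in R, assuming made-up data (the values are hypothetical and do not reproduce the report's figures; the column names network_type and Churn follow the appendix):

```r
# Hypothetical toy data: ten customers across three network types.
toy <- data.frame(
  network_type = c("2G", "2G", "2G", "3G", "3G", "3G", "4G", "4G", "4G", "4G"),
  Churn        = c("Yes", "Yes", "No", "Yes", "No", "No", "Yes", "No", "No", "No")
)

# Cross-tabulate group against churn status.
tab <- table(toy$network_type, toy$Churn)

# margin = 1 normalises within each row, giving the churn rate per group.
rate_by_group <- prop.table(tab, margin = 1)[, "Yes"]
print(round(rate_by_group, 3))
```

Normalising within rows rather than over the whole table keeps the comparison fair when the groups differ in size.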
It can be observed that the customers are shifting towards the new era and new technology.
The churn rate for the 2G type of network is the highest with almost 35% of its users leaving.
Then comes the 3G type of network where the churn rate is around 32%. Here, the customers
would most probably shift towards the 4G type of network or even better technologies. 4G
type of network also has a significant churn rate with more than 25% of its customers leaving,
but this is comparatively less when compared to the other two types of networks we are
Fig 8: Proportion of customer churn rate for different tech support received
It can be observed that operators that do not provide any support services have the
highest churn rate, with about 48% of customers leaving, while operators that provide
support services have a comparatively low churn rate of less than 25%.
Surprisingly, operators that do not provide internet service have the lowest churn rate.
This indicates that the customer is likely to leave the services if the operator does not provide
The initial and most important phase is eliminating outliers and uncommon data types. In
addition, phases of data cleansing are typically discussed. Before being processed and
analysed, raw data must be transformed and cleansed. This phase frequently includes data
reformatting, data restoration, and blending data sources to enrich data. The processing phase
Regression analysis is a statistical technique that helps determine the relationship between
variables. Numerous modelling and analytical techniques are included to help determine the
illustrates how dependent variables change when one or more independent variables change.
In this instance, logistic regression was used to analyse the dataset provided.
We ran a regression in which the dependent variable is customer churn, i.e., whether
the customer has left the service, and the independent variables are SeniorCitizen, operator,
network_type, gender, Support, charges, and other variables. We developed and compared
two distinct models, one using the "Logit" link and the other the "Probit" link.
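A minimal sketch of fitting both models with glm in R, assuming simulated data. The variable names SeniorCitizen, charges, and Churn follow the report, but the values and the coefficients used to simulate them are hypothetical, and only two regressors are included to keep the example short.

```r
set.seed(1)
n <- 500

# Hypothetical simulated data; names follow the report, values do not.
toy <- data.frame(
  SeniorCitizen = rbinom(n, 1, 0.2),
  charges       = runif(n, 20, 120)
)

# Simulate churn from a logistic model with assumed coefficients.
p <- plogis(-2 + 0.6 * toy$SeniorCitizen + 0.015 * toy$charges)
toy$Churn <- rbinom(n, 1, p)

# Same formula, two link functions.
probit_fit <- glm(Churn ~ SeniorCitizen + charges, data = toy,
                  family = binomial(link = "probit"))
logit_fit  <- glm(Churn ~ SeniorCitizen + charges, data = toy,
                  family = binomial(link = "logit"))

# The two sets of coefficients are on different scales, so compare
# signs and significance rather than raw magnitudes.
print(cbind(probit = coef(probit_fit), logit = coef(logit_fit)))
```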
Model1 was summarised using the Probit model. The significant factors identified were
McFadden's pseudo R-squared for the first model is 0.054, which indicates that this model
Model2 was summarised using the logit model. The significant factors identified were
McFadden’s pseudo R-squared for the second model is 0.054, which indicates that this model
Model3 was summarised using the logit model. The significant factors identified were
McFadden’s pseudo R-squared for the third model is 0.054, which indicates that this model
LR Test
The likelihood ratio test demonstrates the overall significance of the models. Also, it can
be seen that the pseudo R-squared values for all three models are comparable, indicating that all of them
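A minimal sketch of how McFadden's pseudo R-squared and the likelihood ratio statistic can be obtained from a fitted glm object in R, assuming simulated data (the variables x and y and the coefficients are hypothetical). The pseudo R-squared uses the same deviance ratio as the appendix; for a 0/1 outcome this coincides with one minus the ratio of log-likelihoods.

```r
set.seed(2)
n <- 400

# Hypothetical simulated data.
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 0.8 * x))

fit      <- glm(y ~ x, family = binomial(link = "logit"))
null_fit <- glm(y ~ 1, family = binomial(link = "logit"))

# McFadden's pseudo R-squared, computed two equivalent ways.
mcfadden    <- 1 - fit$deviance / fit$null.deviance
mcfadden_ll <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null_fit))

# LR test of overall significance: drop in deviance against the
# intercept-only model, compared with a chi-squared distribution.
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(mcfadden = mcfadden, lr_stat = lr_stat, df = lr_df, p_value = lr_p))
```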
The coefficients of the Probit and Logit models cannot be interpreted directly as changes
in the probability of churn, because the effect of each factor depends on the values of the
other regressors. Thus, we calculate the Average Partial Effect (APE), which gives the
average change in churn probability associated with each factor.
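A minimal sketch of the APE calculation for a logit model in base R, assuming simulated data (the variables x, d, and y and the coefficients are hypothetical). The appendix uses logitmfx for this step; the version below computes the quantities by hand to show what is being averaged.

```r
set.seed(3)
n <- 1000

# Hypothetical simulated data: one continuous and one binary regressor.
x <- rnorm(n)
d <- rbinom(n, 1, 0.3)
y <- rbinom(n, 1, plogis(-1 + 0.5 * x + 0.7 * d))
toy <- data.frame(y, x, d)

fit <- glm(y ~ x + d, data = toy, family = binomial(link = "logit"))

# Continuous regressor: coefficient times the logistic density at each
# observation's linear predictor, averaged over the sample.
xb    <- predict(fit, type = "link")
ape_x <- coef(fit)[["x"]] * mean(dlogis(xb))

# Binary regressor: average difference in predicted probability when
# d is switched from 0 to 1 for every observation.
p1    <- predict(fit, newdata = transform(toy, d = 1), type = "response")
p0    <- predict(fit, newdata = transform(toy, d = 0), type = "response")
ape_d <- mean(p1 - p0)

print(c(ape_x = ape_x, ape_d = ape_d))
```

For a probit model the same steps apply with dnorm in place of dlogis.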
1. All variables except for gender were found to be significantly affecting the Churn
0.063%. This suggests that the operators should focus on the services provided to
Senior Citizens.
3. As for the Operators, in comparison to Airtel, BSNL customers have the highest
probability of 0.187 and RJio with the smallest increase in the probability of
networks reduce the likelihood of customer churn by 0.45. Therefore, the better
5. Monthly Charges contribute to the increase in the probability of consumer churn. The
6. The lifetime of a customer's relationship with a particular service provider plays a role.
7. Using the Difference-in-Differences (DID) method, we analysed the combined effect
of senior citizen status and technical support. Even though the factor is not
statistically significant, it can be observed that senior citizen customers who receive
Technical Support have a lower probability of churning than senior citizens
who do not receive Technical Support. In this case, the probability decreases by
#load packages
#Descriptive Analysis
##The proportion of missing data is very small, so we omit those rows.
mydata1 <- na.omit(mydata)
##Data Visualization
status <- table(mydata1$Churn)
##Building Model
#family = binomial(link=probit)
#McFadden's pseudo Rsquared
##Gives the proportion of the null deviance explained by the model
1 - Model1$deviance/Model1$null.deviance
1 - Model2$deviance/Model2$null.deviance
1 - Model3$deviance/Model3$null.deviance
#LR test statistics for overall significance
#APE model
logitmfx (factor(Churn) ~ SeniorCitizen + (operator) + (network_type) + (gender) +