Customer Segmentation Using Machine Learning

The document discusses the importance of customer segmentation in business using machine learning techniques, particularly the K-Means clustering algorithm. It highlights how effective segmentation can help companies identify target groups based on shared characteristics, ultimately leading to better marketing strategies and increased profits. The study provides a methodology for implementing K-Means clustering on a dataset from a mall store to categorize customers based on their income and spending behavior.

Uploaded by

jkaka0481


CUSTOMER SEGMENTATION USING MACHINE LEARNING

Anand Kasaudhan
Computer Science and Engineering
Galgotias University
Greater Noida, India
akdev.0811@gmail.com

Dr S. Srinivasan
Computer Science and Engineering
Galgotias University
Greater Noida, India
s.srinivasan@galgotiasuniversity.edu.in

Rahul Kumar Gupta
Computer Science and Engineering
Galgotias University
Greater Noida, India
rahulkumargupta8821@gmail.com

Abstract—Effective decisions are essential for any company to generate good revenue. Competition today is intense, and every company pursues its own strategy, so decisions should be based on data. Every customer is different, and we do not know in advance what each one buys or likes. With machine learning techniques, however, one can sort the data and find the target group by applying suitable algorithms to the dataset. Without such techniques, it is very difficult to find groups of people with similar characteristics and interests in a large dataset. Here, customer segmentation using K-Means clustering groups records with similar attributes, which is what helps the business most. We use the elbow method to find the number of clusters and finally visualize the data.

Keywords—K-means algorithm, Machine Learning, unsupervised learning, Customer segmentation, Clustering, Python

1. INTRODUCTION
The corporate sector has experienced significant growth in recent years. Businesses set new goals every day and make every effort to reach them. This has created a highly competitive environment in the corporate sector. Whether a company is small or big, it is competing with other companies. The problem is that many of these competitors are not successful. There are many reasons why businesses fail, but in our opinion one of the biggest is that companies choose not to learn from their customers. Any company has potential, but many do not understand the market. In short, companies do not segment the market. The solution to this problem is to understand customer segmentation (also known as market segmentation). Customer segmentation can be compared to a game in which a child sorts balls and cubes according to their shape and color. Simply put, customer segmentation means separating customers by different criteria and grouping them based on similar characteristics.

Why use customer segmentation now? Today's market is growing at a very fast rate, as is the number of customers. The smartphone revolution has brought the community closer together. Almost all businesses have moved to online platforms, expanding their reach to large customer groups. Customers are also happy to accept this change, and each customer generates a large amount of data. So why should companies fall behind? Companies also need to change the way they work and use available resources to support growth. Customer segmentation supports many of these business goals. How do businesses benefit from this?

For example, suppose a company starts using customer segmentation and wants to group its customers by region. The company can then see which products are likely to be rated highest in which locations, and can use this information to plan its advertising campaigns, strategies and more. Indirectly, this brings more profit to the business.

3. Literature Review

3.1 Customer Segmentation
The business world has been highly competitive for many years, and organizations need to increase their revenue and profits by meeting customer demands and acquiring new customers in order to grow. Identifying customers and meeting their needs is a very complex and time-consuming task, because customers differ in their requirements, tastes, preferences, and so on. Instead of a "one-size-fits-all" approach, customer segmentation divides customers into groups that share the same characteristics or behavioral traits [5]. Customer segmentation is therefore a strategy for dividing the market into homogeneous groups.

3.2 Clustering and K-Means Algorithm
Clustering algorithms generate clusters such that the objects within a cluster are similar based on some characteristics. Similarity is defined in terms of how close the objects are in space. K-means is one of the most popular centroid-based algorithms. Suppose a data set, D, contains n objects in space. Partitioning methods distribute the objects in D into k clusters, C1,...,Ck, that is, Ci ⊂ D and Ci ∩ Cj = ∅ for 1 ≤ i, j ≤ k with i ≠ j. A centroid-based partitioning technique uses the centroid of a cluster, Ci, to represent that cluster. Conceptually, the centroid of a cluster is its center point. The difference between an object p ∈ Ci and ci, the representative of the cluster, is measured by dist(p, ci), where dist(x, y) is the Euclidean distance between two points x and y.

Algorithm: The k-means algorithm for partitioning, where the center of each cluster is represented by the mean value of the objects in the cluster.
Input: k: the number of clusters; D: a dataset containing n objects.
Output: A set of k clusters.
Method:
(1) arbitrarily select k objects from D as the initial cluster centers;
(2) repeat
(3) (re)assign each object to the cluster to which the object is most similar, based on the mean value of the objects in the cluster;
(4) update the cluster means, i.e., calculate the mean value of the objects for each cluster;
(5) until no change.
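
The five steps of the method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `kmeans`, the `seed` and `max_iter` parameters, the rule for empty clusters, and the six sample points are all hypothetical and chosen only for the example. It uses the Euclidean distance, as in the definition of dist(x, y) above.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Partition `points` (a list of equal-length tuples) into k clusters."""
    rng = random.Random(seed)
    # Step (1): arbitrarily select k objects from D as the initial centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):  # Step (2): repeat
        # Step (3): (re)assign each object to the nearest cluster center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step (5): stop once no assignment has changed.
        if new_labels == labels:
            break
        labels = new_labels
        # Step (4): update each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical (annual income in k$, spending score) pairs, for illustration only.
sample = [(15, 80), (16, 82), (17, 78), (85, 15), (86, 17), (87, 13)]
sample_labels, sample_centers = kmeans(sample, k=2)
print(sample_labels)
print(sample_centers)
```

Because the initial centers are chosen arbitrarily, different seeds can lead to different partitions on less clearly separated data; running the algorithm several times and keeping the best result is a common precaution.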

4. Methodology
A mall store provided the dataset for clustering using the K-means algorithm. The dataset has five attributes and 200 tuples, representing the information of 200 customers. The attributes are CustomerId, gender, age, annual income (k$), and spending score on a scale of 1 to 100.

Table 1. Dataset

To begin with, we need to clarify what data we will work with (see Table 1). We use a straightforward yet comprehensive dataset that includes customer ID, gender, age, annual income and spending score. The value of a customer's purchases or spending at the mall is represented by a spending score that ranges from 1 to 100 (the higher the number, the greater the amount spent). The structure of the dataset was displayed correctly and there are no missing values.

If the dataset contains nulls, duplicates, or other noisy data, data cleaning is required. Data cleaning ensures that the information is reliable, usable and available for analysis. Once we have the data, we can visualize it by comparing annual income and spending score by gender. According to the study, the plot shows five groups of customers, characterized by the following combinations of annual income and spending score:
1. High income - low spending score
2. Low income - high spending score
3. Low income - low spending score
4. Average income - average spending score
5. High income - high spending score

Figure 1. Annual Income vs Spending Score

The plot suggests that there are several groups, but not in enough detail to fix their number, so the next step is to choose k for the K-means model. K-means clustering is run for a range of k (say 1 to 10), and for each value we compute the sum of the squared distances from each point to its assigned center, the WCSS (Within-Cluster Sum of Squares). The choice can also be checked against the silhouette score by selecting the number of clusters that gives the best score; the silhouette coefficient is defined only for two or more clusters. We notice that once K=5 is reached, the WCSS no longer falls rapidly, so K=5 is taken as the number of clusters. Refer to the illustration.

Figure 2. Silhouette approach result.
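
The text mentions two different quantities for choosing the number of clusters: the WCSS, which the elbow method plots against k, and the silhouette score. The sketch below computes both for a given labeling so the difference is concrete. It is an illustration under assumptions, not the code behind Figure 2: the function names `wcss` and `mean_silhouette`, the singleton-cluster convention, and the four sample points are hypothetical. The silhouette formula follows Rousseeuw [3].

```python
import math


def wcss(points, labels):
    """Within-cluster sum of squares: squared distance of each point to its cluster mean."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, other in zip(points, labels) if other == lab]
        center = [sum(c) / len(members) for c in zip(*members)]
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette is undefined for a single cluster")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumption: singleton clusters score 0
            continue
        # a: mean distance from p to the other members of its own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the members of another cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical points: two tight pairs far apart, labeled well and badly.
demo = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(wcss(demo, [0, 0, 1, 1]), mean_silhouette(demo, [0, 0, 1, 1]))
print(wcss(demo, [0, 1, 0, 1]), mean_silhouette(demo, [0, 1, 0, 1]))
```

WCSS always decreases as k grows, which is why the elbow method looks for the point where the decrease slows rather than for a minimum. The silhouette score lies between -1 and 1 and can be compared directly across values of k from 2 upward.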

We can divide the plot into various groups, determine which cluster can be prioritized, and then assign a label to each using the method stated above. The K-means result can be used to decide which of the five clusters should be targeted, namely customers with Moderate Income-Moderate Spending Score, High Income-High Spending Score, and Low Income-High Spending Score. The required customers have been located, as shown in Figure 3.

Figure 3. Final cluster of customers

5. EXPERIMENT RESULTS
Mall shoppers can be divided into five groups based on their annual income and spending. First, the yellow group refers to people who have high incomes and high spending scores; this is an excellent target for a mall or retail center, because these are the most profitable customers. Such a customer may be a frequent shopper at the mall and is easy for the mall to capture as a repeat customer. The blue group, on the other hand, consists of those who have a lot of money but spend very little. This is an interesting case because there are many possible reasons for the formation of such a cluster. Let us assume that they are people who like to shop but are not satisfied with the current offer or facilities of the mall. These are good targets too, but we have to find out why they are spending so little. A department head or mall authority could design or build facilities to attract these groups and meet their needs. The orange group has average income and average spending. We can assume that these are people who do not always buy things, but have a strong desire to spend despite their financial limits. A manager would avoid marketing strategies that target this population as much as possible, because they do not represent a significant source of income for the shopping center. However, a number of data analysis techniques can be used to help increase their spending. The purple group includes people with low income but high spending scores; despite their low income, people in this group like spending money. This is also possible if customers are satisfied with the services of the mall and therefore feel inclined to spend. The fifth group, the green group, has low annual income and low spending scores. It makes sense that they are on a tight budget and cut corners wherever possible, which is a sensible decision given their circumstances. People in this cluster should be given the lowest priority by the mall manager.

By analyzing data, we can predict customer behavior based on annual income and spending scores. This cluster analysis can be applied to a number of consumer marketing methods. We want to keep our target clientele, who have a high income and a high spending score, because they provide the largest profit margin. Customers with a high income and a low spending score may be attracted to the mall supermarket by a wide variety of items that suits their lifestyle. Customers with a low income and a low spending score can be offered more promotions and may be tempted to spend by frequent offers and discounts. Cluster analysis can be used to determine what customers wish to consume, allowing more targeted marketing efforts to be developed. Potential clients in this situation are people in groups 3 and 4.

6. CONCLUSION
This study demonstrates that customer segmentation in shopping malls is achievable. Because this form of machine learning application is highly useful in the market, a manager can concentrate on each cluster that has been discovered and meet its requirements. Mall managers must be able to understand what customers require and, more importantly, how to meet those needs: they should analyze customers' purchasing habits and establish frequent interactions that make customers feel comfortable in order to satisfy their demands.

REFERENCES
[1] “Customer segmentation based on survival character,” IEEE, Jul. 2003.
[2] “Customer Segmentation Using K Means Clustering,” Towards Data Science, Apr. 2019.
[3] Peter J. Rousseeuw (1987). "Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis". Journal of Computational and Applied Mathematics. 20: 53–65. doi:10.1016/0377-0427(87)90125-7.
[4] R.C. de Amorim, C. Hennig (2015). "Recovering the number of clusters in data sets with noise features using feature rescaling factors". Information Sciences. 324: 126–145. arXiv:1602.06989. doi:10.1016/j.ins.2015.06.039.
[5] Leonard Kaufman; Peter J. Rousseeuw (1990). Finding groups in data: An introduction to cluster analysis. Hoboken, NJ: Wiley-Interscience. p. 87. doi:10.1002/9780470316801. ISBN 9780471878766.
[6] Kriegel, Hans-Peter; Schubert, Erich; Zimek, Arthur (2016). "The (black) art of runtime evaluation: Are we comparing algorithms or implementations?". Knowledge and Information Systems. 52 (2): 341–378. doi:10.1007/s10115-016-1004-2. ISSN 0219-1377. S2CID 40772241.
[7] Fader, P. S., Hardie, B. G., & Lee, K. L. (2005). RFM and CLV: Using iso-value curves for customer base analysis. Journal of Marketing Research, 42(4), 415-430.
[8] Tkachenko, Yegor. Autonomous CRM Control via CLV Approximation with Deep Reinforcement Learning in Discrete and Continuous Action Space. (April 8, 2015). arXiv.org: https://arxiv.org/abs/1504.01840
[9] Yeh, I-Cheng, Yang, King-Jang, and Ting, Tao-Ming, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, 2009.
[10] Robert L. Thorndike (December 1953). "Who Belongs in the Family?". Psychometrika. 18 (4): 267–276. doi:10.1007/BF02289263.
[11] Williamson, D & Parker, RA & Kendrick, Juliette. (1989). The box plot: A simple visual method to interpret data. Annals of Internal Medicine. 110. 916-21. 10.1059/0003-4819-110-11-916.
[12] Bhaya, Wesam. (2017). Review of Data Preprocessing Techniques in Data Mining. Journal of Engineering and Applied Sciences. 12. 4102-4107. 10.3923/jeasci.2017.4102.4107.
[13] Li, Youguo & Wu, Haiyan. (2012). A Clustering Method Based on K-Means Algorithm. Physics Procedia. 25. 1104-1109. 10.1016/j.phpro.2012.03.206.
[14] Wei, Jo-Ting & Lin, Shih-Yen & Wu, Hsin-Hung. (2010). A review of the application of RFM model. African Journal of Business Management December Special Review. 4. 4199-4206.
