Customer Segmentation Analysis

✦ Project Objective

The goal of this project is to analyze a retail customer dataset and divide the customer base into distinct groups based on purchasing behavior and demographic attributes. Understanding these segments helps a business target its marketing, improve customer retention, and increase revenue.

✦ Dataset

The data used in this project is the Mall Customer Segmentation Data sourced from Kaggle. It contains synthetic data on supermarket customers designed specifically for learning clustering algorithms.

Features used for clustering:
- Age
- Annual Income (k$)
- Spending Score (1-100): A proprietary score assigned to customers based on purchasing behavior and history.
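
The feature selection above can be sketched as follows. This is a minimal illustration, not the project's actual loading code: the helper name `load_features`, the `CustomerID` and `Gender` column names, and the sample rows are assumptions made up so the snippet runs without the real file.

```python
import io

import pandas as pd

# The three clustering features listed above.
FEATURES = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]


def load_features(source):
    """Read a CSV (path or file-like object) and keep only the clustering columns."""
    df = pd.read_csv(source)
    missing = [col for col in FEATURES if col not in df.columns]
    if missing:
        raise KeyError(f"missing expected columns: {missing}")
    return df[FEATURES]


# Made-up rows in the assumed column layout (not real data from the dataset).
sample_csv = io.StringIO(
    "CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)\n"
    "1,Male,19,15,39\n"
    "2,Female,21,15,81\n"
    "3,Female,35,78,12\n"
)
features = load_features(sample_csv)
print(features.shape)
```

With the repository checked out, the same helper would presumably be called as `load_features("data/Mall_Customers.csv")`.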

✦ Project Structure

customer-segmentation/
│
├── data/
│   └── Mall_Customers.csv         # The dataset
│
├── notebooks/
│   └── eda_and_clustering.ipynb  
│
├── src/
│   ├── eda.py                     # Exploratory Data Analysis
│   ├── kmeans_2d.py               # 2D Clustering (Income vs Spending)
│   └── kmeans_3d.py               # 3D Clustering (Age, Income, Spending)
│
├── visuals/                       
│   ├── elbow_curve.png            
│   ├── 3d Scatterplot of Age vs Annual Income vs Spending Score
│   ├── Cluster of Spending Score vs Annual Income
│   ├── Distribution of Age
│   ├── Distribution of Annual Income
│   ├── Distribution of Spending Score
│   ├── Gender Distribution
│   └── Scatterplot of the clusters of Spending Score vs Annual Income
│
├── requirements.txt              
└── README.md

✦ Methodology

This project utilizes K-Means Clustering, an unsupervised machine learning algorithm.

1. Exploratory Data Analysis (EDA): Visualized the distributions of age, annual income, spending score, and gender.
2. Determining Optimal Clusters: Used the Elbow Method (plotting the Within-Cluster Sum of Squares against the number of clusters) to choose the number of clusters (k=5).
3. 2D Clustering: Grouped customers based on Annual Income and Spending Score.
4. 3D Clustering: Added 'Age' as a third dimension for a more granular segmentation.
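
The elbow and clustering steps above can be sketched with scikit-learn as shown below. This is an illustration under stated assumptions, not the code in `src/`: the helper names `elbow_wcss` and `fit_kmeans`, the random seed, and the synthetic blob data are all made up for the example. scikit-learn exposes the Within-Cluster Sum of Squares as the fitted model's `inertia_` attribute.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_wcss(X, k_values, seed=42):
    """Return the WCSS (scikit-learn's inertia_) for each candidate k."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ]


def fit_kmeans(X, k, seed=42):
    """Fit K-Means with k clusters and return (labels, centroids)."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    return model.labels_, model.cluster_centers_


# Synthetic stand-in for (Annual Income, Spending Score): five noisy blobs.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]], dtype=float)
X_2d = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centers])

# Elbow method: WCSS for k = 1..10; the "elbow" is where the curve flattens.
wcss = elbow_wcss(X_2d, range(1, 11))

# 2D clustering on income and spending.
labels_2d, centroids_2d = fit_kmeans(X_2d, k=5)

# 3D clustering: prepend a synthetic Age column as the third dimension.
age = rng.uniform(18, 70, size=(X_2d.shape[0], 1))
X_3d = np.hstack([age, X_2d])
labels_3d, centroids_3d = fit_kmeans(X_3d, k=5)
```

Plotting `wcss` against k gives the elbow curve. Whether the project scales the features before clustering is not stated here, so the sketch leaves them unscaled.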

✦ Key Insights & Customer Segments

Based on the K-Means algorithm, we successfully grouped the customer base into 5 distinct segments:

- Cluster 0 (Target Customers): High Income, High Spending Score. (Prime targets for premium marketing)
- Cluster 1 (Careful Customers): High Income, Low Spending Score. (Target for retention and discount campaigns to boost spending)
- Cluster 2 (Standard Customers): Average Income, Average Spending Score. (The bulk of the customer base)
- Cluster 3 (Sensible Customers): Low Income, Low Spending Score.
- Cluster 4 (Careless Customers): Low Income, High Spending Score.
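
K-Means assigns cluster indices arbitrarily (they depend on initialization), so the index-to-name mapping above holds for one particular run. One way to name segments independently of the index is to look at where each centroid sits on the income and spending axes. The sketch below is an assumption-laden illustration: `describe_centroid`, the `margin` threshold, the income range, and the centroid values are all hypothetical and are not results from this project.

```python
def describe_centroid(income, spending, income_range, spending_range, margin=0.15):
    """Label a centroid Low / Average / High on each axis."""

    def level(value, lo, hi):
        # Position of the value within [lo, hi], scaled to 0..1.
        pos = (value - lo) / (hi - lo)
        if pos < 0.5 - margin:
            return "Low"
        if pos > 0.5 + margin:
            return "High"
        return "Average"

    return level(income, *income_range), level(spending, *spending_range)


# Segment names from the list above, keyed by (income level, spending level).
SEGMENT_NAMES = {
    ("High", "High"): "Target",
    ("High", "Low"): "Careful",
    ("Average", "Average"): "Standard",
    ("Low", "Low"): "Sensible",
    ("Low", "High"): "Careless",
}

# Hypothetical centroids as (income in k$, spending score); not project output.
centroids = [(85.0, 82.0), (88.0, 17.0), (52.0, 50.0), (25.0, 20.0), (26.0, 79.0)]

names = []
for income, spending in centroids:
    # Assumed axis ranges: income 0-100 k$ (hypothetical), spending score 1-100.
    levels = describe_centroid(income, spending, (0, 100), (1, 100))
    names.append(SEGMENT_NAMES.get(levels, "Unlabelled"))

print(names)
```

Centroids that fall in a combination not listed above (for example High income with Average spending) come back as "Unlabelled" rather than being forced into one of the five names.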

3D Cluster View

✦ Installation & Setup

1. Clone the repository

git clone https://github.com/swechchhapatel/Customer-segmentation-analysis.git
cd Customer-segmentation-analysis

2. Install dependencies

Make sure you have Python installed, then run:

pip install -r requirements.txt

3. Run the scripts

To view the Exploratory Data Analysis:

python src/eda.py

To run the 2-Dimensional Clustering model:

python src/kmeans_2d.py

To run the 3-Dimensional Clustering model and print the customer groups:

python src/kmeans_3d.py

✦ Contribution

Contributions are welcome and appreciated!

If you'd like to improve this project, please follow these steps:

1. Fork the repository

2. Create a new branch

git checkout -b feature/your-feature-name

3. Make changes and commit
```
git commit -m "Add: your message"
```
4. Push to your branch and open a Pull Request

Feel free to improve features, UI, or model performance.

Made with ❤️
