timothynn/customer-churn-prediction-system


Customer Churn Prediction System

A machine learning system that predicts which customers are likely to close their accounts and identifies effective retention strategies. The project targets a 10-15% reduction in churn and an increase in customer lifetime value.

🎯 Business Impact

  • High Impact: Targets a 10-15% reduction in customer churn
  • Increased Revenue: Improves customer lifetime value through targeted retention
  • Proactive Approach: Automated alerting system for high-risk customers
  • Data-Driven Insights: SHAP-based model interpretability for actionable insights

🔑 Key Skills Developed

  • Classification: Advanced ML techniques for churn prediction
  • Feature Selection: Behavioral pattern analysis and feature engineering
  • Model Interpretation: SHAP values for explainable AI
  • Customer Segmentation: Risk-based customer categorization

🏗️ Architecture

customer-churn-prediction-system/
├── src/
│   ├── models/          # ML model training and prediction
│   ├── features/        # Feature engineering and selection
│   ├── visualization/   # Data visualization and plotting
│   ├── utils/           # Utility functions and helpers
│   └── dashboard/       # Interactive dashboard application
├── data/
│   ├── raw/             # Original datasets
│   ├── processed/       # Cleaned and preprocessed data
│   ├── external/        # External data sources
│   └── models/          # Trained model artifacts
├── notebooks/           # Jupyter notebooks for exploration
├── tests/               # Unit and integration tests
├── config/              # Configuration files
├── scripts/             # Data processing and utility scripts
└── docs/                # Documentation

🚀 Quick Start with Nix Flakes

This project uses Nix flakes for reproducible development environments and dependency management.

Prerequisites

  • Nix with flakes enabled
  • Git

Setup

  1. Clone the repository:

    git clone <repository-url>
    cd customer-churn-prediction-system
  2. Enter the development environment:

    nix develop
  3. Initialize project structure:

    make setup
  4. Download sample data:

    make data-download

Alternative Setup (without Nix)

If you prefer not to use Nix, you can set up the environment manually:

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev,jupyter]"

# Initialize project structure
make setup

📊 Datasets

The project supports multiple datasets:

  1. Kaggle: Telco Customer Churn - Telecommunications customer data
  2. Bank Customer Churn Dataset - Financial services customer data

Download scripts are provided in scripts/download_data.py.

🛠️ Development Workflow

Available Commands

# Development environment
make setup          # Initialize project structure
nix develop         # Enter development shell

# Data processing
make data-download  # Download datasets
make data-process   # Process raw data

# Model development
make train          # Train churn prediction model
make predict        # Run predictions
make model-evaluate # Evaluate model performance

# Development tools
make test           # Run tests
make lint           # Check code style
make format         # Format code
make type-check     # Type checking

# Applications
make jupyter        # Start Jupyter Lab
make dashboard      # Start interactive dashboard

Using Nix Apps

You can also use Nix apps for common tasks:

nix run .#jupyter   # Start Jupyter Lab
nix run .#dashboard # Start dashboard

🧠 Implementation Steps

  1. Data Analysis (notebooks/01-exploratory-data-analysis.ipynb)

    • Analyze customer behavior patterns
    • Examine transaction history and trends
    • Identify key churn indicators
  2. Feature Engineering (src/features/build_features.py)

    • Transaction frequency metrics
    • Balance trend analysis
    • Service usage patterns
    • Behavioral change detection
  3. Model Development (src/models/train_model.py)

    • Ensemble methods (Random Forest, XGBoost, LightGBM)
    • Handle class imbalance
    • Cross-validation and hyperparameter tuning
  4. Model Interpretation (notebooks/03-model-interpretation.ipynb)

    • SHAP values for feature importance
    • Local and global explanations
    • Business-friendly interpretation
  5. Customer Segmentation (src/models/segment_customers.py)

    • High/Medium/Low risk categorization
    • Behavioral clustering
    • Personalized retention strategies
  6. Dashboard Development (src/dashboard/app.py)

    • Interactive Plotly/Dash dashboard
    • Real-time churn monitoring
    • Customer success team interface
  7. Alerting System (src/utils/alerting.py)

    • Automated high-risk customer detection
    • Email/Slack notifications
    • Integration with CRM systems
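As a rough illustration of the feature engineering step, the sketch below computes two of the listed features (transaction frequency and balance trend) for a single customer using only the standard library. It is not the contents of src/features/build_features.py: the function names `transaction_frequency` and `balance_trend`, the 30-day window, and the input shapes (a list of transaction dates, a list of periodic balances) are all assumptions made for this example.

```python
from datetime import date
from statistics import mean


def transaction_frequency(dates, window_days=30):
    """Transactions per `window_days`, measured over the span between
    the first and last observed transaction (hypothetical definition)."""
    if len(dates) < 2:
        return float(len(dates))
    span_days = (max(dates) - min(dates)).days or 1  # avoid division by zero
    return len(dates) * window_days / span_days


def balance_trend(balances):
    """Least-squares slope of balance against observation index.
    A negative slope means the balance is falling over time."""
    n = len(balances)
    if n < 2:
        return 0.0
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(balances)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, balances))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Illustrative (made-up) inputs for one customer
tx_dates = [date(2024, 1, 1), date(2024, 1, 11), date(2024, 1, 21), date(2024, 1, 31)]
monthly_balances = [100.0, 90.0, 80.0, 70.0]

print(transaction_frequency(tx_dates))
print(balance_trend(monthly_balances))
```

A steadily falling balance or a drop in transaction frequency is the kind of behavioral change such features are meant to surface; the real pipeline would likely compute them per customer over a DataFrame rather than plain lists.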

🏒 Business Value

Retention Strategies by Risk Level

  • High Risk: Immediate intervention with personalized offers
  • Medium Risk: Proactive engagement and loyalty programs
  • Low Risk: Maintain satisfaction with regular check-ins
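The mapping from a predicted churn probability to one of these risk levels could look like the sketch below. The README does not state the cut-offs, so the thresholds (0.7 and 0.4) and the names `risk_tier`, `RETENTION_ACTIONS`, and `recommend` are hypothetical; in practice the thresholds would be tuned against retention budget and model calibration.

```python
# Retention actions taken from the list above, keyed by risk level
RETENTION_ACTIONS = {
    "High": "Immediate intervention with personalized offers",
    "Medium": "Proactive engagement and loyalty programs",
    "Low": "Maintain satisfaction with regular check-ins",
}


def risk_tier(churn_probability, high=0.7, medium=0.4):
    """Bucket a churn probability into High/Medium/Low.
    The default thresholds are assumptions, not values from the project."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= high:
        return "High"
    if churn_probability >= medium:
        return "Medium"
    return "Low"


def recommend(churn_probability):
    """Return (risk level, suggested retention action) for one customer."""
    tier = risk_tier(churn_probability)
    return tier, RETENTION_ACTIONS[tier]


for p in (0.85, 0.5, 0.1):
    print(p, recommend(p))
```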

Expected Outcomes

  • 10-15% reduction in customer churn
  • Increased customer lifetime value
  • Improved customer satisfaction scores
  • Data-driven retention budget allocation

🔧 Technologies

  • Python: Core development language
  • scikit-learn: Machine learning framework
  • SHAP: Model interpretability
  • Plotly/Dash: Interactive dashboards
  • SQL: Data querying and analysis
  • Nix: Reproducible development environment

🧪 Testing

Run the test suite:

make test

For specific test categories:

pytest tests/unit/              # Unit tests
pytest tests/integration/       # Integration tests
pytest -m "not slow"           # Skip slow tests

📈 Monitoring and Deployment

The system includes:

  • Model performance monitoring
  • Data drift detection
  • Automated retraining pipelines
  • A/B testing framework for retention strategies
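The README does not say how data drift is detected. One common option is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time against its distribution in recent data. The sketch below is a minimal standard-library version under that assumption; the function name, the equal-width binning, and the `eps` floor for empty bins are choices made for this example, not the project's implementation.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample (`expected`) and a recent sample
    (`actual`), using equal-width bins derived from the reference sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back if all reference values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative (made-up) samples: the recent data is shifted upward
reference = [float(v) for v in range(100)]
recent = [v + 50.0 for v in reference]

print(population_stability_index(reference, reference))
print(population_stability_index(reference, recent))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift that may justify retraining; these cut-offs are conventions rather than guarantees.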

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests and linting
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📞 Support

For questions and support, please open an issue on GitHub.
