Churn Risk Intelligence

Customer churn prediction using machine learning. Achieves 82% accuracy with comprehensive analytics pipeline for proactive retention strategies.

Machine Learning Completed 2 months
Churn Risk Intelligence

Business Impact

Problem Solved

Customer acquisition costs are 5-25x higher than retention costs. This solution identifies at-risk customers with 82% accuracy.

Value Delivered

Enables proactive retention campaigns, reduces revenue loss, and optimizes marketing budget allocation.

Key Results

82.19% accuracy with Logistic Regression, identifying high-risk customers for targeted retention strategies.

Model Performance

Model Accuracy Precision (Churn) Recall (Churn) Business Use Case
Logistic Regression 82.19% 69% 60% Balanced Performance
KNN (k=15) 77.93% 60% 51% Standard Classification
Logistic + SMOTE 76.86% 55% 72% Higher Recall Priority
KNN + SMOTE 72.89% 49% 81% Maximum Churn Detection

Technical Architecture

1

Data Preprocessing

Missing value imputation, categorical encoding, feature scaling, and data leakage detection.

2

Feature Engineering

One-hot encoding, data leakage detection, and train-test splitting with stratification.

3

Model Development

Logistic Regression and K-Nearest Neighbors with hyperparameter optimization using GridSearchCV.

4

Class Imbalance Handling

SMOTE (Synthetic Minority Oversampling Technique) for balanced model training.

5

Evaluation & Insights

Comprehensive metrics analysis with business-focused interpretation and actionable recommendations.

Key Code Components

Main Pipeline Orchestrator

Model Training Pipeline

Dataset Information

Source

Telco Customer Churn Dataset

Size

7,043 customers × 21 features

Target

Customer churn (Yes/No)

Class Distribution

26.5% churn rate (imbalanced dataset)

Feature Categories

Demographics

Gender, age range, partner/dependent status

Account Information

Contract type, payment method, tenure, billing preferences

Services

Phone, internet, security, backup, streaming services

Financial

Monthly charges, total charges

Business Recommendations

For High-Value Customer Businesses

  • Use KNN + SMOTE model for maximum churn detection (81% recall)
  • Implement comprehensive retention campaigns for all flagged customers
  • Focus on customers with fiber optic internet and month-to-month contracts

For Cost-Conscious Operations

  • Use Logistic Regression model for efficient targeting (69% precision)
  • Prioritize customers with electronic check payments and high monthly charges
  • Develop automated retention workflows for scalability

Key Risk Factors to Monitor

Contract Type

Month-to-month contracts show highest churn rates

Internet Service

Fiber optic users demonstrate elevated churn risk

Payment Method

Electronic check payments correlate with increased churn

Tenure

New customers (< 6 months) require attention

Technology Stack

Python Scikit-learn Pandas NumPy Matplotlib Seaborn Jupyter Joblib