Churn Risk Intelligence
Customer churn prediction using machine learning. The best model reaches 82% accuracy, and the surrounding analytics pipeline supports proactive retention strategies.

Business Impact
Problem Solved
Acquiring a new customer costs 5-25x more than retaining an existing one. This solution flags at-risk customers before they leave: its best model reaches 82% accuracy, with 69% precision and 60% recall on the churn class.
Value Delivered
Enables proactive retention campaigns, reduces revenue loss, and optimizes marketing budget allocation.
Key Results
82.19% accuracy with Logistic Regression, identifying high-risk customers for targeted retention strategies.
Model Performance
Model | Accuracy | Precision (Churn) | Recall (Churn) | Business Use Case |
---|---|---|---|---|
Logistic Regression | 82.19% | 69% | 60% | Balanced Performance |
KNN (k=15) | 77.93% | 60% | 51% | Standard Classification |
Logistic + SMOTE | 76.86% | 55% | 72% | Higher Recall Priority |
KNN + SMOTE | 72.89% | 49% | 81% | Maximum Churn Detection |
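As a worked example of how the three columns relate, the sketch below derives accuracy, precision, and recall from confusion-matrix counts. The counts are illustrative assumptions, not figures from the project: they were chosen so that a hypothetical 1,409-customer test set (a 20% split of 7,043 rows) matches the Logistic Regression row.

```python
# Illustrative counts only (assumed 20% test split); NOT taken from the project.
tp = 224  # churners correctly flagged
fp = 101  # loyal customers flagged by mistake
fn = 150  # churners the model missed
tn = 934  # loyal customers correctly left alone

total = tp + fp + fn + tn
accuracy = (tp + tn) / total   # share of all predictions that are right
precision = tp / (tp + fp)     # share of flagged customers who really churn
recall = tp / (tp + fn)        # share of real churners who get flagged

print(f"accuracy={accuracy:.2%} precision={precision:.0%} recall={recall:.0%}")
```

Accuracy is dominated by the majority (non-churn) class, which is why a model can score 82% overall while still missing 40% of churners.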
Technical Architecture
Data Preprocessing
Missing value imputation, categorical encoding, feature scaling, and data leakage detection.
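A minimal sketch of these preprocessing steps with scikit-learn is shown below. It is not the project's code: the toy rows are hypothetical, the column names are assumed to follow the public Telco dataset, and median imputation is an assumed choice. Fitting the transformer on the training split only is one common way to avoid leaking test-set statistics.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy rows; column names assumed from the public Telco data.
raw = pd.DataFrame({
    "tenure": [1, 24, np.nan, 60],
    "MonthlyCharges": [70.0, 55.5, 99.9, 20.0],
    "TotalCharges": ["70.0", "1300.5", " ", "1200.0"],  # blank string = missing
    "Contract": ["Month-to-month", "One year", "Month-to-month", "Two year"],
})

# Coerce text to numbers; unparseable entries become NaN for the imputer.
raw["TotalCharges"] = pd.to_numeric(raw["TotalCharges"], errors="coerce")

numeric_cols = ["tenure", "MonthlyCharges", "TotalCharges"]
categorical_cols = ["Contract"]

numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # zero mean, unit variance
])

preprocess = ColumnTransformer(
    transformers=[
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(raw)
print(X.shape)  # 3 scaled numeric columns + 3 one-hot contract columns
```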
Feature Engineering
One-hot encoding, data leakage checks, and stratified train-test splitting.
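The sketch below illustrates one way these steps could fit together on synthetic data. The correlation-based leakage check, its 0.95 threshold, and the `cancellation_flag` column are hypothetical illustrations, not the project's actual method.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in data with the reported 26.5% churn rate.
labels = np.zeros(n, dtype=int)
labels[:265] = 1
rng.shuffle(labels)
y = pd.Series(labels, name="Churn")

df = pd.DataFrame({
    "tenure": rng.integers(0, 72, size=n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], size=n),
    "cancellation_flag": labels,  # hypothetical leaky column: a copy of the target
})

# One-hot encode the categorical column.
X = pd.get_dummies(df, columns=["Contract"], dtype=float)


def suspicious_features(features, target, threshold=0.95):
    """Flag columns that track the target almost perfectly (assumed heuristic)."""
    corr = features.corrwith(target).abs()
    return corr[corr > threshold].index.tolist()


leaky = suspicious_features(X, y)
X = X.drop(columns=leaky)

# Stratify so both splits keep the same churn rate as the full data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(leaky, round(y_train.mean(), 3), round(y_test.mean(), 3))
```

Without `stratify=y`, a random split of an imbalanced dataset can leave the test set with a noticeably different churn rate, which distorts precision and recall.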
Model Development
Logistic Regression and K-Nearest Neighbors with hyperparameter optimization using GridSearchCV.
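A hedged sketch of such a search is shown below, using synthetic data in place of the Telco features. The parameter grids, the F1 scoring choice, and the 5-fold setting are assumptions for illustration; only k=15 is taken from the results table.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with roughly the reported class balance.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.735, 0.265], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical grids; the project's actual search space is not documented here.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"clf__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "knn": (
        KNeighborsClassifier(),
        {"clf__n_neighbors": [5, 15, 25]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling lives inside the pipeline so each CV fold is scaled independently.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="f1")
    search.fit(X_train, y_train)
    results[name] = (search.best_params_, search.score(X_test, y_test))

for name, (params, test_f1) in results.items():
    print(name, params, round(test_f1, 3))
```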
Class Imbalance Handling
SMOTE (Synthetic Minority Oversampling Technique) for balanced model training.
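To show the idea behind SMOTE, the following is a from-scratch NumPy sketch of its core interpolation step, not the project's implementation. In practice the `SMOTE` class from the imbalanced-learn package is the usual choice. The function name `smote_oversample` and its parameters are hypothetical. Oversampling is normally applied to the training split only, so the test set keeps the real class balance.

```python
import numpy as np


def smote_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared distances between minority samples.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)                 # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]   # k nearest minority neighbours

    base = rng.integers(0, n, size=n_new)                      # random seed points
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                               # position on the segment

    # Each synthetic row lies on the line between a point and its neighbour.
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Usage on hypothetical minority-class rows.
X_minority = np.random.default_rng(1).normal(size=(20, 2))
synthetic = smote_oversample(X_minority, n_new=30)
print(synthetic.shape)
```

Because the new rows are interpolated rather than duplicated, the classifier sees a denser minority region, which tends to raise recall at the cost of precision, as the SMOTE rows in the results table show.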
Evaluation & Insights
Accuracy, precision, and recall on the churn class, interpreted in business terms and translated into actionable retention recommendations.
Key Code Components
Main Pipeline Orchestrator
Model Training Pipeline
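The sketch below shows one way an orchestrator could chain a training pipeline end to end. It is a hypothetical illustration under stated assumptions, not the project's code: every function name is invented, and `load_data` returns synthetic data rather than the Telco file.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def load_data():
    """Stand-in loader: synthetic data with roughly a 26.5% positive rate."""
    return make_classification(
        n_samples=1000, n_features=12, n_informative=6,
        weights=[0.735, 0.265], random_state=0,
    )


def train_model(X_train, y_train):
    """Model training pipeline: scaling followed by logistic regression."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return model.fit(X_train, y_train)


def evaluate(model, X_test, y_test):
    """Report the same three metrics used in the results table."""
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision_churn": precision_score(y_test, pred, zero_division=0),
        "recall_churn": recall_score(y_test, pred, zero_division=0),
    }


def run_pipeline():
    """Main orchestrator: load, split, train, evaluate."""
    X, y = load_data()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    model = train_model(X_train, y_train)
    return evaluate(model, X_test, y_test)


if __name__ == "__main__":
    print(run_pipeline())
```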
Dataset Information
Source
Telco Customer Churn Dataset
Size
7,043 customers × 21 features
Target
Customer churn (Yes/No)
Class Distribution
26.5% churn rate (imbalanced dataset)
Feature Categories
Demographics
Gender, senior-citizen status, partner/dependent status
Account Information
Contract type, payment method, tenure, billing preferences
Services
Phone, internet, security, backup, streaming services
Financial
Monthly charges, total charges
Business Recommendations
For High-Value Customer Businesses
- Use the KNN + SMOTE model for maximum churn detection (81% recall, at 49% precision)
- Implement comprehensive retention campaigns for all flagged customers
- Focus on customers with fiber optic internet and month-to-month contracts
For Cost-Conscious Operations
- Use the Logistic Regression model for efficient targeting (69% precision, at 60% recall)
- Prioritize customers with electronic check payments and high monthly charges
- Develop automated retention workflows for scalability
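The trade-off between the two recommendations can be made concrete with a small expected-value calculation. The precision and recall figures come from the Model Performance table. The customer value, contact cost, and save rate are hypothetical assumptions chosen only to illustrate the reasoning.

```python
# (precision, recall) on the churn class, from the Model Performance table.
MODELS = {
    "Logistic Regression": (0.69, 0.60),
    "KNN (k=15)": (0.60, 0.51),
    "Logistic + SMOTE": (0.55, 0.72),
    "KNN + SMOTE": (0.49, 0.81),
}
CHURN_RATE = 0.265  # from Dataset Information


def net_value(precision, recall, customers, value_saved, contact_cost, save_rate):
    """Campaign benefit minus campaign cost for one model."""
    churners = customers * CHURN_RATE
    caught = recall * churners      # true positives
    flagged = caught / precision    # everyone the campaign contacts
    return caught * save_rate * value_saved - flagged * contact_cost


def best_model(**scenario):
    return max(MODELS, key=lambda name: net_value(*MODELS[name], **scenario))


# Hypothetical scenarios: same campaign, different customer value.
high_value = dict(customers=1000, value_saved=1000.0, contact_cost=20.0, save_rate=0.3)
low_margin = dict(customers=1000, value_saved=100.0, contact_cost=20.0, save_rate=0.3)

print(best_model(**high_value))  # KNN + SMOTE
print(best_model(**low_margin))  # Logistic Regression
```

When each saved customer is worth far more than a contact costs, catching more churners outweighs the wasted contacts, so recall dominates. When margins are thin, false alarms eat the benefit and precision matters more.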
Key Risk Factors to Monitor
Contract Type
Month-to-month contracts show the highest churn rates
Internet Service
Fiber optic users demonstrate elevated churn risk
Payment Method
Electronic check payments correlate with increased churn
Tenure
New customers (tenure under 6 months) need early retention attention
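Risk factors like these can be monitored by tracking the churn rate per segment. The sketch below does this for contract type with pandas. The rows are hypothetical toy data, and the column names are assumed to follow the public Telco dataset.

```python
import pandas as pd

# Hypothetical toy rows; real monitoring would read the customer table instead.
df = pd.DataFrame({
    "Contract": ["Month-to-month"] * 4 + ["One year"] * 3 + ["Two year"] * 3,
    "Churn": ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No", "No", "No"],
})

# Convert Yes/No to booleans so the group mean equals the churn rate.
churn_flag = df["Churn"].eq("Yes")
rate_by_contract = (
    churn_flag.groupby(df["Contract"]).mean().sort_values(ascending=False)
)
print(rate_by_contract)
```

The same pattern applies to the other factors by swapping the grouping column, for example internet service or payment method, or a binned tenure column.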