Bank Term Deposit Prediction

A machine learning solution for predicting bank term deposit subscriptions. It optimizes marketing campaigns through predictive analytics, using a comprehensive binary classification pipeline.

Machine Learning | Completed | 2 months

Business Impact

Problem Solved

Banks invest significant resources in marketing campaigns. This solution optimizes marketing spend by targeting high-probability prospects.

Value Delivered

Increases campaign conversion rates, improves customer experience, and maximizes return on investment for marketing activities.

Key Results

A comprehensive binary classification pipeline that analyzes ~41,000 customers and compares multiple ML algorithms to produce actionable insights.

Dataset Information

Source

Bank Marketing Campaign Data

Size

~41,000 customers × 20+ features

Target

Term deposit subscription (yes/no)

Class Distribution

88% no subscription, 12% subscription
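
With an 88/12 split, a model that always predicts "no" already reaches about 88% accuracy, so train/test splits should preserve the class ratio. The following is a minimal sketch of a stratified split, assuming scikit-learn's train_test_split; the labels are synthetic placeholders that mimic the stated ratio, not the project's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels mimicking the ~88/12 class split (illustrative, not real data).
y = np.array([0] * 880 + [1] * 120)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder single feature

# stratify=y keeps the positive rate (roughly) equal in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

train_rate = y_train.mean()
test_rate = y_test.mean()
print(f"positive rate: train={train_rate:.3f}, test={test_rate:.3f}")
```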

Feature Categories

Demographic

Age, job, marital status, education, default status

Financial

Housing loan, personal loan status

Campaign

Contact method, month, day, duration, contacts

Economic

Employment rate, CPI, confidence index

Model Comparison

Naive Bayes

Probabilistic
  • Fast training and prediction
  • Good baseline performance
  • Handles categorical features well
  • Probabilistic outputs

Decision Tree (Gini)

Tree-based
  • Interpretable decision rules
  • Handles mixed data types
  • Provides feature importance
  • No feature scaling required

Decision Tree (Entropy)

Tree-based
  • Alternative splitting criterion
  • Comparable performance to Gini
  • Different decision boundaries
  • Information theory approach
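
To make the two splitting criteria concrete, the sketch below computes Gini impurity and Shannon entropy for a node with the 88/12 class distribution listed above. The helper names gini and entropy are illustrative and are not taken from the project code.

```python
import math

def gini(probabilities):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p ** 2 for p in probabilities)

def entropy(probabilities):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping zero terms."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

root = [0.88, 0.12]  # class proportions before any split
root_gini = gini(root)
root_entropy = entropy(root)
print(f"gini={root_gini:.4f}, entropy={root_entropy:.4f}")
```

Both measures are zero for a pure node and largest for a 50/50 node. A decision tree picks the split that reduces the chosen measure the most, which is why the two criteria often, though not always, select similar splits.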

Key Insights & Findings

Data Insights

Class Imbalance

~88% of customers do not subscribe to term deposits

High Correlation

Strong correlation between economic indicators

Seasonal Patterns

March and December show higher subscription rates

Contact Method

Cellular contact generally outperforms telephone
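
Insights such as the seasonal and contact-method patterns come down to comparing subscription rates across groups. A minimal sketch of that calculation with pandas follows; the column names (contact, subscribed) and the six toy rows are assumptions for illustration and are not drawn from the dataset.

```python
import pandas as pd

# Toy rows for illustration only; not taken from the bank dataset.
df = pd.DataFrame({
    "contact": ["cellular", "cellular", "telephone",
                "telephone", "cellular", "telephone"],
    "subscribed": ["yes", "no", "no", "no", "yes", "yes"],
})

def subscription_rate(frame, by, target="subscribed", positive="yes"):
    """Share of positive outcomes per category of `by`, highest first."""
    return (
        frame[target].eq(positive)   # boolean Series: True where subscribed
        .groupby(frame[by])          # group by the category column
        .mean()                      # mean of booleans = subscription rate
        .sort_values(ascending=False)
    )

rates = subscription_rate(df, "contact")
print(rates)
```

The same helper could be called with by="month" or by="job", assuming those columns exist in the loaded frame.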

Business Recommendations

Target Segments

Focus campaigns on students and retired individuals

Timing

Schedule major campaigns in March and December

Contact Method

Prioritize cellular over telephone contact

Previous Success

Heavily weight previous campaign success in targeting

Technical Architecture

1. Data Processing

Data loading, cleaning, and data quality validation, followed by comprehensive EDA with visualizations.

2. Feature Engineering

Creates meaningful features, handles multicollinearity, and implements categorical encoding.
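
One common way to handle multicollinearity and categorical encoding is sketched below: drop one column from each highly correlated numeric pair, then one-hot encode the categoricals. The 0.9 threshold, the drop_correlated helper, and the column names are assumptions for illustration; the project's own approach may differ.

```python
import numpy as np
import pandas as pd

def drop_correlated(frame, threshold=0.9):
    """Drop one column from each numeric pair whose |correlation| exceeds threshold."""
    corr = frame.select_dtypes("number").corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return frame.drop(columns=to_drop), to_drop

# Synthetic frame: indicator_b is a near copy of indicator_a (illustrative only).
rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "indicator_a": base,
    "indicator_b": base * 2 + rng.normal(scale=0.01, size=200),
    "age": rng.integers(18, 90, size=200),
    "contact": rng.choice(["cellular", "telephone"], size=200),
})

reduced, dropped = drop_correlated(df)
encoded = pd.get_dummies(reduced, columns=["contact"])  # one-hot encode categoricals
print(dropped, list(encoded.columns))
```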

3. Model Training

Implements and compares Naive Bayes and Decision Tree models with comprehensive evaluation.
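
The comparison described here can be sketched as follows. This is a minimal example on synthetic data from make_classification, weighted to resemble the 88/12 class split. The choice of GaussianNB as the Naive Bayes variant and max_depth=5 for the trees are assumptions, since the source does not state either.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the bank data (illustrative only).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.88, 0.12], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model variants named in the comparison; hyperparameters are assumed.
models = {
    "naive_bayes": GaussianNB(),
    "tree_gini": DecisionTreeClassifier(criterion="gini", max_depth=5, random_state=0),
    "tree_entropy": DecisionTreeClassifier(criterion="entropy", max_depth=5, random_state=0),
}

fitted = {name: model.fit(X_train, y_train) for name, model in models.items()}
accuracy = {name: model.score(X_test, y_test) for name, model in fitted.items()}

for name, value in accuracy.items():
    print(f"{name}: accuracy={value:.3f}")
```

Because of the class imbalance, accuracy alone can be misleading, so it is best read alongside the confusion matrix and ROC metrics from the evaluation step.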

4. Model Evaluation

Complete assessment with metrics, visualizations, confusion matrices, and ROC curves.
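
A minimal sketch of the numeric part of such an evaluation, assuming scikit-learn's metrics module and a synthetic imbalanced dataset in place of the real one. Plotting is omitted, and the model and its settings are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with roughly 12% positives (illustrative only).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.88, 0.12], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
tn, fp, fn, tp = cm.ravel()
auc = roc_auc_score(y_test, y_score)   # threshold-independent ranking quality

print(f"TN={tn} FP={fp} FN={fn} TP={tp} AUC={auc:.3f}")
print(classification_report(y_test, y_pred, target_names=["no", "yes"], zero_division=0))
```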

5. Model Persistence

Automated saving of trained models with timestamps and comprehensive reporting.
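
Timestamped saving could look like the sketch below, which uses Joblib from the technology stack. The save_model helper, the file naming pattern, and the temporary directory are assumptions for illustration; the project itself saves .pkl files under models/.

```python
import tempfile
from datetime import datetime
from pathlib import Path

import joblib
from sklearn.tree import DecisionTreeClassifier

def save_model(model, name, directory):
    """Save `model` as <name>_<YYYYmmdd_HHMMSS>.pkl inside `directory`."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = directory / f"{name}_{stamp}.pkl"
    joblib.dump(model, path)
    return path

# Tiny placeholder model so the example is self-contained.
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(
    [[0], [1], [2], [3]], [0, 0, 1, 1]
)

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model(model, "decision_tree_gini", tmp)
    restored = joblib.load(saved_path)  # round trip to confirm the file is usable
    restored_prediction = int(restored.predict([[3]])[0])

print(saved_path.name, restored_prediction)
```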

Key Code Components

Main Pipeline Orchestrator
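
A hypothetical sketch of how an orchestrator in main.py might chain the five stages from the technical architecture. All function names and the shared context dictionary are assumptions, and the stage bodies are stubs that stand in for the real modules.

```python
from typing import Callable, Dict, List, Optional, Tuple

Stage = Tuple[str, Callable[[Dict], Dict]]

def run_pipeline(stages: List[Stage], context: Optional[Dict] = None) -> Dict:
    """Run each stage in order, passing a shared context dict between them."""
    context = dict(context or {})
    for name, stage in stages:
        context = stage(context)
        context.setdefault("completed", []).append(name)
    return context

# Stub stages standing in for the real modules (hypothetical names and outputs).
def process_data(ctx):
    return {**ctx, "rows": [{"age": 30, "y": "no"}, {"age": 61, "y": "yes"}]}

def engineer_features(ctx):
    return {**ctx, "features": [[row["age"]] for row in ctx["rows"]]}

def train_models(ctx):
    return {**ctx, "models": ["naive_bayes", "tree_gini", "tree_entropy"]}

def evaluate_models(ctx):
    return {**ctx, "evaluated": list(ctx["models"])}

def persist_models(ctx):
    return {**ctx, "saved": [f"{name}.pkl" for name in ctx["models"]]}

result = run_pipeline([
    ("data_processing", process_data),
    ("feature_engineering", engineer_features),
    ("model_training", train_models),
    ("model_evaluation", evaluate_models),
    ("model_persistence", persist_models),
])
print(result["completed"])
```

Keeping each stage as a plain function of the context makes it straightforward to rerun or test a single step in isolation.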

Data Processing Module
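
A hedged sketch of what a loading and validation routine in data_processing.py could look like. The load_and_clean function, the target column name "y", and the comma-delimited sample are assumptions; the actual bank_data.csv may use different column names or a different delimiter.

```python
import io

import pandas as pd

def load_and_clean(source, target="y"):
    """Load a CSV, normalize column names, drop duplicate rows, and report data quality."""
    df = pd.read_csv(source)
    df.columns = df.columns.str.strip().str.lower()
    df = df.drop_duplicates().reset_index(drop=True)
    report = {
        "rows": len(df),
        "missing_per_column": df.isna().sum().to_dict(),
        "target_values": sorted(df[target].dropna().unique().tolist()),
    }
    return df, report

# In-memory sample standing in for data/bank_data.csv (illustrative rows only).
sample_csv = io.StringIO(
    "Age,Job,y\n"
    "30,student,yes\n"
    "30,student,yes\n"
    "45,,no\n"
)

df, report = load_and_clean(sample_csv)
print(report)
```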

Project Structure

src/
  config.py
  data_processing.py
  feature_engineering.py
  model_training.py
  evaluation.py
  main.py
data/
  bank_data.csv
models/
  saved models (.pkl)
results/
  plots and reports

Technology Stack

Python, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, Jupyter, Joblib