Focus: Real-world pipelines · Model tuning · SHAP & LIME explainability · API integration · Capstone project deployment
This week focused on advanced machine learning concepts and practices that go beyond training models: tuning, interpretation, automation, and deployment-level workflow design.
- Revisited bagging vs boosting principles
- Explored XGBoost and LightGBM internals
- Installed and configured libraries
- ✅ Practiced: Trained a basic XGBoost model on the Iris dataset
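A minimal sketch of that Iris exercise, assuming a standard train/test split; the hyperparameters shown are illustrative, not necessarily the ones used in the original notebook:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Load Iris and hold out a test set
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Basic XGBoost classifier; parameter values here are illustrative
model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1, eval_metric="mlogloss")
model.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```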
- Understood cross-validation (CV) strategies: KFold, StratifiedKFold
- Implemented GridSearchCV for hyperparameter tuning
- ✅ Practiced: Tuned `max_depth`, `learning_rate`, and `n_estimators` for XGBoost
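A sketch of that tuning step, assuming the Iris data and an illustrative search grid over the three parameters listed above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from xgboost import XGBClassifier

X, y = load_iris(return_X_y=True)

# Illustrative search space; the grid actually used this week may differ
param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 200],
}

# StratifiedKFold keeps class proportions stable across folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(XGBClassifier(eval_metric="mlogloss"), param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print("Best params:", search.best_params_)
print("Best CV accuracy:", search.best_score_)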
- Compared LightGBM vs XGBoost
- Trained a LightGBM model on the Breast Cancer dataset
- Used RandomizedSearchCV for efficient tuning
- ✅ Practiced: Evaluated and compared both models
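A comparison sketch on the Breast Cancer dataset; the `param_dist` values and the accuracy-only comparison are assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Randomized search over a small LightGBM space (values are illustrative)
param_dist = {
    "num_leaves": [15, 31, 63],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 200, 300],
}
lgbm_search = RandomizedSearchCV(LGBMClassifier(), param_dist, n_iter=10, cv=5, random_state=42)
lgbm_search.fit(X_train, y_train)

# Untuned XGBoost as a baseline for comparison
xgb = XGBClassifier(eval_metric="logloss").fit(X_train, y_train)

print("LightGBM accuracy:", accuracy_score(y_test, lgbm_search.predict(X_test)))
print("XGBoost accuracy: ", accuracy_score(y_test, xgb.predict(X_test)))
```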
- Learned why interpretability matters in ML
- Used SHAP to visualize:
  - Global feature importance (`summary_plot`)
  - Individual prediction reasons (`force_plot`)
  - Feature interactions (`dependence_plot`)
- ✅ Practiced: Visual explanations on the Breast Cancer dataset
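A sketch of those SHAP plots for an XGBoost model on the Breast Cancer dataset; the model and the feature chosen for the dependence plot are assumptions:

```python
import pandas as pd
import shap
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

model = XGBClassifier(eval_metric="logloss").fit(X, y)

# TreeExplainer works directly with tree ensembles like XGBoost
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X)                    # global feature importance
shap.dependence_plot("mean radius", shap_values, X)  # interaction view for one feature
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0], matplotlib=True)  # one prediction
```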
- Learned how LIME provides local explanations for black-box models
- Compared SHAP vs LIME
- Used LimeTabularExplainer for per-instance interpretation
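A per-instance LIME sketch on the Breast Cancer dataset; the underlying XGBoost model and `num_features=10` are assumptions:

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=42)

model = XGBClassifier(eval_metric="logloss").fit(X_train, y_train)

explainer = LimeTabularExplainer(
    X_train,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Local explanation for a single test instance
exp = explainer.explain_instance(X_test[0], model.predict_proba, num_features=10)
print(exp.as_list())
```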
- Explored REST APIs: What they are and how to call them with Python
- ✅ Practiced: Parsed a real-time public API (e.g., Cat Facts) using `requests` and `json`
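A sketch of that API call; the Cat Facts endpoint URL is one public example and may change:

```python
import json
import requests

# Call a public REST endpoint and parse the JSON body
response = requests.get("https://catfact.ninja/fact", timeout=10)
response.raise_for_status()

data = json.loads(response.text)  # same result as response.json()
print(data.get("fact"))
```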
- Designed an ML data pipeline: Ingest → Clean → Model → Export
- Used:
  - `pandas` for preprocessing
  - `xgboost` for modeling
  - `joblib` for saving models
  - SQLite & CSV for output storage
- ✅ Practiced: Full pipeline built using the Titanic dataset
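A compressed sketch of that Ingest → Clean → Model → Export flow; the file name `titanic.csv`, the column choices, and the output paths are assumptions:

```python
import sqlite3

import joblib
import pandas as pd
from xgboost import XGBClassifier

# Ingest
df = pd.read_csv("titanic.csv")

# Clean: keep a few features and encode 'Sex'
df = df[["Survived", "Pclass", "Sex", "Age", "Fare"]].dropna()
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Model
X, y = df.drop(columns="Survived"), df["Survived"]
model = XGBClassifier(eval_metric="logloss").fit(X, y)

# Export: persist the model, then write predictions to CSV and SQLite
joblib.dump(model, "titanic_xgb.joblib")
df["prediction"] = model.predict(X)
df.to_csv("titanic_predictions.csv", index=False)
with sqlite3.connect("titanic_predictions.db") as conn:
    df.to_sql("predictions", conn, if_exists="replace", index=False)
```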
- End-to-end ML project using the Telco Customer Churn dataset
- Steps performed:
  - Load and explore dataset
  - Clean and encode features
  - Train-test split + XGBoost with GridSearchCV
  - Interpret model using SHAP & LIME
  - Save predictions to CSV and SQLite
- ✅ Outcome: A complete, interpretable, production-ready ML pipeline
| Category | Libraries |
|---|---|
| Modeling | `xgboost`, `lightgbm` |
| Tuning | `sklearn.model_selection` |
| Interpretation | `shap`, `lime` |
| Data handling | `pandas`, `numpy` |
| Deployment Ready | `joblib`, `sqlite3`, `csv` |
| API Integration | `requests`, `json` |
- Gradient Boosting Algorithms
- Model Hyperparameter Tuning
- SHAP & LIME Interpretation
- Data Pipeline Design
- API Calls with Python
- Model & Output Storage (CSV, DB)
✅ All completed notebooks and projects from this week are uploaded here:
🔗 [Advanced-ML](https://github.com/sushma-prog/customer-churn-prediction)
Sushma Sandanshiv
BTech Data Science | Aspiring Data Scientist
🔗 LinkedIn • 🔗 GitHub