📘 Advanced Machine Learning — End-to-End Learning Pipeline

Focus: Real-world pipelines · Model tuning · SHAP & LIME explainability · API integration · Capstone project deployment

✅ Week Overview

This week focused on advanced machine learning concepts and practices that go beyond just training models — diving into tuning, interpretation, automation, and deployment-level workflow design.

📅 Day-by-Day Breakdown

🔹 Day 1: Introduction to Gradient Boosting

Revisited bagging vs boosting principles
Explored XGBoost and LightGBM internals
Installed and configured libraries
✅ Practiced: Trained a basic XGBoost model on the Iris dataset

🔹 Day 2: Cross-Validation & GridSearchCV

Understood cross-validation (CV) strategies: KFold, StratifiedKFold
Implemented GridSearchCV for hyperparameter tuning
✅ Practiced: Tuned max_depth, learning_rate, and n_estimators for XGBoost

🔹 Day 3: LightGBM & RandomizedSearchCV

Compared LightGBM vs XGBoost
Trained a LightGBM model using Breast Cancer dataset
Used RandomizedSearchCV for efficient tuning
✅ Practiced: Evaluated and compared both models

🔹 Day 4: Model Interpretation with SHAP

Learned why interpretability matters in ML
Used SHAP to visualize:
- Global feature importance (summary_plot)
- Individual prediction reasons (force_plot)
- Feature interactions (dependence_plot)
✅ Practiced: Visual explanations on Breast Cancer dataset

🔹 Day 5: LIME + API Basics

Learned how LIME provides local explanation for black-box models
Compared SHAP vs LIME
Used LimeTabularExplainer for per-instance interpretation
Explored REST APIs: What they are and how to call them with Python
✅ Practiced: Parsed real-time public API like cat facts using requests and json

🔹 Day 6: Building a Mini Data Pipeline

Designed an ML data pipeline: Ingest → Clean → Model → Export
Used:
- pandas for preprocessing
- xgboost for modeling
- joblib for saving models
- SQLite & CSV for output storage
✅ Practiced: Full pipeline built using Titanic dataset

🔹 Day 7: Capstone Project — Customer Churn Prediction

End-to-end ML project using Telco Customer Churn dataset
Steps performed:
1. Load and explore dataset
2. Clean and encode features
3. Train-test split + XGBoost with GridSearchCV
4. Interpret model using SHAP & LIME
5. Save predictions to CSV and SQLite
✅ Outcome: A complete, interpretable, production-ready ML pipeline

🎯 Tools & Libraries Used

Category	Libraries
Modeling	`xgboost`, `lightgbm`
Tuning	`sklearn.model_selection`
Interpretation	`shap`, `lime`
Data handling	`pandas`, `numpy`
Deployment Ready	`joblib`, `sqlite3`, `csv`
API Integration	`requests`, `json`

📌 Key Skills Mastered

Gradient Boosting Algorithms
Model Hyperparameter Tuning
SHAP & LIME Interpretation
Data Pipeline Design
API Calls with Python
Model & Output Storage (CSV, DB)

📁 Project Repositories

✅ All completed notebooks and projects from this Week are uploaded here:
🔗 [Advanced-ML](https://github.com/sushma-prog/customer-churn-prediction

👩‍💻 Author

Sushma Sandanshiv
BTech Data Science | Aspiring Data Scientist 🔗 LinkedIn • 🔗 GitHub

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
Building a Data Pipeline		Building a Data Pipeline
Capstone project- Customer Churn Prediction		Capstone project- Customer Churn Prediction
Cross Validation & GridSearchCV		Cross Validation & GridSearchCV
Intro to LIME + API Basics		Intro to LIME + API Basics
Introduction to Gradient Boosting		Introduction to Gradient Boosting
LightGBM & Hyperparameter Tuning		LightGBM & Hyperparameter Tuning
Model Interpritation with SHAP		Model Interpritation with SHAP
.gitattributes		.gitattributes
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

📘 Advanced Machine Learning — End-to-End Learning Pipeline

✅ Week Overview

📅 Day-by-Day Breakdown

🔹 Day 1: Introduction to Gradient Boosting

🔹 Day 2: Cross-Validation & GridSearchCV

🔹 Day 3: LightGBM & RandomizedSearchCV

🔹 Day 4: Model Interpretation with SHAP

🔹 Day 5: LIME + API Basics

🔹 Day 6: Building a Mini Data Pipeline

🔹 Day 7: Capstone Project — Customer Churn Prediction

🎯 Tools & Libraries Used

📌 Key Skills Mastered

📁 Project Repositories

👩‍💻 Author

About

Uh oh!

Releases

Packages

Languages

sushma-prog/Advanced-ML

Folders and files

Latest commit

History

Repository files navigation

📘 Advanced Machine Learning — End-to-End Learning Pipeline

✅ Week Overview

📅 Day-by-Day Breakdown

🔹 Day 1: Introduction to Gradient Boosting

🔹 Day 2: Cross-Validation & GridSearchCV

🔹 Day 3: LightGBM & RandomizedSearchCV

🔹 Day 4: Model Interpretation with SHAP

🔹 Day 5: LIME + API Basics

🔹 Day 6: Building a Mini Data Pipeline

🔹 Day 7: Capstone Project — Customer Churn Prediction

🎯 Tools & Libraries Used

📌 Key Skills Mastered

📁 Project Repositories

👩‍💻 Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages