A hands-on repository covering essential feature engineering techniques used in Machine Learning. Includes real-world examples using Python, NumPy, and pandas — perfect for students and beginners looking to improve model performance through better data preprocessing.

Abdullah-Niaz/Feature-Engineering

Feature Engineering:

Welcome to the Feature Engineering repository! This project contains hands-on implementations of essential techniques for preprocessing and transforming data for machine learning models. The focus is on building a strong foundation in practical feature engineering skills.


Topics Covered

🔹 Handling Missing Data

  • Drop missing values
  • Mean/Median/Mode Imputation
  • Random Sample Imputation
  • Capturing NaNs with Indicators
  • End of Distribution Imputation
  • Arbitrary Value Imputation
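
As a rough illustration of the imputation strategies listed above, here is a minimal pandas sketch. It is not taken from the repository's notebooks: the `age` column, its values, the `-1` fill value, and the "mean + 3 standard deviations" rule for end-of-distribution imputation are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the name "age" and its values are illustrative only.
df = pd.DataFrame({"age": [22.0, np.nan, 35.0, 41.0, np.nan, 29.0]})
missing = df["age"].isna()

# Indicator: record where values were missing before filling them in.
df["age_was_missing"] = missing.astype(int)

# Median imputation (mean or mode work the same way).
df["age_median"] = df["age"].fillna(df["age"].median())

# End-of-distribution imputation: assumed here to be mean + 3 standard deviations.
end_value = df["age"].mean() + 3 * df["age"].std()
df["age_end_dist"] = df["age"].fillna(end_value)

# Arbitrary value imputation: -1 is an assumed out-of-range marker.
df["age_arbitrary"] = df["age"].fillna(-1)

# Random sample imputation: draw replacements from the observed values.
sampled = df["age"].dropna().sample(n=int(missing.sum()), replace=True, random_state=0)
df["age_random"] = df["age"]
df.loc[missing, "age_random"] = sampled.to_numpy()
```

In a real project the fill statistics would normally be computed on the training split only and then reused on validation and test data.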

🔹 Encoding Categorical Variables

  • One Hot Encoding
  • Ordinal Encoding
  • Count/Frequency Encoding
  • Target Mean Encoding
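
The four encodings above can be sketched with plain pandas. This is an illustrative example under assumed data, not the repository's code: the `shirt_size` and `target` columns and the `S < M < L` ordering are hypothetical.

```python
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "shirt_size": ["S", "M", "L", "M", "S", "M"],
    "target": [0, 1, 1, 0, 0, 1],
})

# One-hot encoding: one binary column per category.
one_hot = pd.get_dummies(df["shirt_size"], prefix="size")

# Ordinal encoding: the S < M < L order is an assumption supplied by hand.
order = {"S": 0, "M": 1, "L": 2}
df["size_ordinal"] = df["shirt_size"].map(order)

# Count and frequency encoding: replace each category by how often it occurs.
df["size_count"] = df["shirt_size"].map(df["shirt_size"].value_counts())
df["size_freq"] = df["shirt_size"].map(df["shirt_size"].value_counts(normalize=True))

# Target mean encoding: replace each category by the mean target within it.
df["size_target_mean"] = df["shirt_size"].map(df.groupby("shirt_size")["target"].mean())
```

Target mean encoding uses the label, so the category means should be learned on training data only to avoid leaking the target into validation or test rows.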

🔹 Variable Transformation

  • Logarithmic Transformation
  • Box-Cox Transformation
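
To make the two transformations concrete, the sketch below applies a log transform and a Box-Cox transform with a fixed, hand-picked lambda. The `box_cox` helper is a hypothetical function written for this example; in practice lambda is usually estimated from the data (for instance, `scipy.stats.boxcox` fits it by maximum likelihood when no lambda is passed).

```python
import numpy as np

# Hypothetical right-skewed values; both transforms below require x > 0.
x = np.array([1.0, 10.0, 100.0, 1000.0])

# Logarithmic transformation (np.log1p is a common choice when zeros occur).
log_x = np.log(x)


def box_cox(values, lam):
    """Box-Cox transform for a given lambda (illustrative helper, not a library API)."""
    values = np.asarray(values, dtype=float)
    if np.any(values <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda -> 0 limit of the formula is the natural log.
        return np.log(values)
    return (values ** lam - 1.0) / lam


# lam=0.5 is an assumed value chosen only to show the call.
transformed = box_cox(x, 0.5)
```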

🔹 Outlier Handling

  • IQR Method
  • Z-score Method
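
Both outlier rules can be sketched in a few lines of pandas. The data, the 1.5 × IQR fences, and the |z| > 3 cutoff are common conventions assumed for this example rather than values taken from the repository.

```python
import pandas as pd

# Hypothetical measurements with one obvious outlier (95).
values = pd.Series([10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
                    12, 10, 13, 11, 12, 11, 12, 13, 10, 95])

# IQR method: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = values[(values < lower) | (values > upper)]

# Z-score method: flag points more than 3 standard deviations from the mean.
z_scores = (values - values.mean()) / values.std()
z_outliers = values[z_scores.abs() > 3]
```

An extreme value also inflates the mean and standard deviation it is measured against, so the z-score rule can miss outliers in small samples; the IQR rule is less sensitive to that effect.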

🔹 Discretization (Binning)

  • Equal Width Binning
  • Equal Frequency Binning
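
The difference between the two binning schemes is easiest to see side by side. This is a small sketch using `pd.cut` and `pd.qcut`; the ages, the number of bins, and the bin labels are assumptions made for the example.

```python
import pandas as pd

# Hypothetical ages; values and labels are illustrative only.
ages = pd.Series([18, 22, 25, 31, 38, 45, 52, 67])

# Equal width binning: 3 intervals of the same width across the value range.
equal_width = pd.cut(ages, bins=3, labels=["low", "mid", "high"])

# Equal frequency binning: 4 quantile-based bins with roughly equal counts.
equal_freq = pd.qcut(ages, q=4, labels=["q1", "q2", "q3", "q4"])

width_counts = equal_width.value_counts()
freq_counts = equal_freq.value_counts()
```

Equal width bins can end up unevenly populated when the data are skewed, while equal frequency bins trade uniform widths for balanced counts.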

🔹 Feature Scaling

  • Min-Max Scaling
  • Standardization (Z-score Normalization)
  • Robust Scaling
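
The three scalers above reduce to short formulas, sketched here with NumPy on made-up data. scikit-learn offers `MinMaxScaler`, `StandardScaler`, and `RobustScaler` for the same purpose; this example writes the arithmetic out by hand and is not the repository's implementation.

```python
import numpy as np

# Hypothetical feature with one large value to show how each scaler reacts.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-max scaling: map the values onto [0, 1].
min_max = (x - x.min()) / (x.max() - x.min())

# Standardization: zero mean and unit (population) standard deviation.
standardized = (x - x.mean()) / x.std()

# Robust scaling: center on the median and divide by the interquartile range.
q1, median, q3 = np.percentile(x, [25, 50, 75])
robust = (x - median) / (q3 - q1)
```

With the outlier present, min-max scaling squeezes the four small values close to 0, whereas robust scaling keeps them spread out because the median and IQR are barely affected by the extreme point.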

🔹 Feature Extraction

  • Date/Time Feature Extraction
  • Text Feature Extraction
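
As a sketch of what these two kinds of extraction can look like, the example below derives calendar features from a timestamp column and a few simple features from a text column. The `timestamp` and `review` columns and their contents are hypothetical, and the text features shown (lengths and binary word presence) are only one basic option among many.

```python
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-15 08:30:00", "2024-07-06 22:05:00"]),
    "review": ["great product great price", "slow delivery"],
})

# Date/time features via the .dt accessor.
df["year"] = df["timestamp"].dt.year
df["month"] = df["timestamp"].dt.month
df["day_of_week"] = df["timestamp"].dt.dayofweek  # Monday=0 ... Sunday=6
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = (df["day_of_week"] >= 5).astype(int)

# Simple text features: length in words and in characters.
df["word_count"] = df["review"].str.split().str.len()
df["char_count"] = df["review"].str.len()

# Binary bag of words: 1 if the word occurs in the review, else 0 (not a count).
bag_of_words = df["review"].str.get_dummies(sep=" ")
```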

🔹 Feature Selection

  • Filter Methods (Correlation)
  • Wrapper Methods
  • Embedded Methods
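
Of the three families above, the correlation-based filter is the simplest to show without extra libraries. The sketch below uses synthetic data and an assumed 0.9 correlation threshold; the feature names `x1`, `x2`, `x3` are hypothetical. Wrapper and embedded methods generally need a model in the loop and are not shown here.

```python
import numpy as np
import pandas as pd

# Synthetic data: x2 nearly duplicates x1, x3 is unrelated noise.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
x3 = rng.normal(size=n)
y = pd.Series(2 * x1 + rng.normal(scale=0.1, size=n))
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Relevance: absolute correlation of each feature with the target.
target_corr = X.corrwith(y).abs()

# Redundancy: look only at the upper triangle so each pair is checked once.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop the later feature of any pair correlated above the assumed 0.9 threshold.
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X_reduced = X.drop(columns=to_drop)
```

Pearson correlation only captures linear relationships, so a low score here does not prove that a feature is useless to a nonlinear model.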

🚀 Usage

Each topic is implemented and explained step-by-step in Jupyter notebooks. You can clone the repository and run the notebooks for practice and learning:

git clone https://github.com/Abdullah-Niaz/Feature-Engineering.git
cd Feature-Engineering

Open the notebooks in JupyterLab or Google Colab to explore the techniques interactively.


👨‍💻 Author

Abdullah Niaz

Connect with me on GitHub
