This project segments customers using demographic data. The goal is to cluster customers by their attributes and identify distinct groups for targeted marketing. The work involves data cleaning, feature engineering, and unsupervised learning (PCA followed by K-Means).
- Data Cleaning
  - Handle missing values and convert missing value codes to NaN.
  - Remove columns and rows with excessive missing data.
  - Re-encode categorical and mixed-type features.
- Feature Engineering
  - Create new features from existing columns (`PRAEGENDE_JUGENDJAHRE` and `CAMEO_INTL_2015`).
  - Standardize features for clustering.
- Clustering
  - Apply dimensionality reduction techniques (PCA) to reduce feature space.
  - Use clustering algorithms (K-Means) to identify customer segments.
- Analysis
  - Compare customer segments with the general population.
  - Identify key characteristics of each segment for targeted marketing.
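
The cleaning and feature-engineering steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names, the 30% thresholds, and the toy data are hypothetical; `split_two_digit_code` assumes a mixed-type column such as `CAMEO_INTL_2015` packs two attributes into the tens and ones digits of a two-digit code; and the mapping passed to `decode_codes` (for a column such as `PRAEGENDE_JUGENDJAHRE`) is a placeholder that would have to come from the dataset's own data dictionary.

```python
import pandas as pd


def clean_demographics(df, missing_codes, max_col_missing=0.3, max_row_missing=0.3):
    """Turn missing-value codes into NaN, then drop sparse columns and rows.

    `missing_codes` maps a column name to the codes that mean "unknown".
    Both thresholds are illustrative defaults, not values from the project.
    """
    out = df.copy()
    for col, codes in missing_codes.items():
        if col in out.columns:
            # Entries equal to one of the listed codes become NaN.
            out[col] = out[col].mask(out[col].isin(codes))
    # Keep columns whose share of missing values is within the threshold.
    out = out.loc[:, out.isna().mean(axis=0) <= max_col_missing]
    # Keep rows whose share of missing values is within the threshold.
    out = out.loc[out.isna().mean(axis=1) <= max_row_missing]
    return out


def split_two_digit_code(series):
    """Split a two-digit code into (tens digit, ones digit) as two Series."""
    codes = pd.to_numeric(series, errors="coerce")
    return codes // 10, codes % 10


def decode_codes(series, mapping):
    """Re-encode a coded column through a lookup; unmapped codes become NaN."""
    return pd.to_numeric(series, errors="coerce").map(mapping)


# Toy data: the values, and the code-to-decade mapping, are made up.
raw = pd.DataFrame({
    "A": [1, -1, 3, 0],
    "B": [-1, -1, -1, 2],
    "CAMEO_INTL_2015": ["51", "24", "XX", "13"],
    "PRAEGENDE_JUGENDJAHRE": [1, 2, 3, 1],
})
unknown = {"A": [-1, 0], "B": [-1], "CAMEO_INTL_2015": ["XX"]}
clean = clean_demographics(raw, unknown, max_col_missing=0.5, max_row_missing=0.5)

tens, ones = split_two_digit_code(clean["CAMEO_INTL_2015"])
clean = clean.assign(CAMEO_TENS=tens, CAMEO_ONES=ones).drop(columns="CAMEO_INTL_2015")

placeholder_mapping = {1: 40, 2: 40, 3: 50}  # hypothetical, not the real dictionary
clean["DECADE"] = decode_codes(clean["PRAEGENDE_JUGENDJAHRE"], placeholder_mapping)
```

In this toy run column `B` is dropped because three of its four values are missing, while `A` sits exactly at the 50% threshold and is kept. Missing-value codes are converted before any thresholds are applied so that coded "unknown" entries count toward the missing share.
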
The project identifies distinct customer segments based on demographic attributes. These segments can be used to tailor marketing strategies, improve customer engagement, and optimize resource allocation.
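
One way to find the segments worth targeting is to fit the clustering on the general population and then compare how often each cluster occurs among customers. The sketch below uses scikit-learn and assumes the input is already numeric. The median imputation step, the component and cluster counts, the function names, and the synthetic data are all assumptions made for illustration, not settings taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_segmenter(population, n_components=2, n_clusters=3, seed=0):
    """Fit impute -> standardize -> PCA -> K-Means on the general population."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="median"),  # assumption: fill remaining NaNs
        StandardScaler(),                  # zero mean, unit variance per feature
        PCA(n_components=n_components, random_state=seed),
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
    )
    pipeline.fit(population)
    return pipeline


def segment_shares(pipeline, population, customers):
    """Return the share of each cluster in both groups and their difference."""
    n_clusters = pipeline[-1].n_clusters

    def shares(frame):
        labels = pipeline.predict(frame)
        counts = np.bincount(labels, minlength=n_clusters)
        return counts / counts.sum()

    table = pd.DataFrame({
        "population": shares(population),
        "customers": shares(customers),
    })
    # Positive values mark clusters that are over-represented among customers.
    table["difference"] = table["customers"] - table["population"]
    return table


def make_toy_frame(rng, sizes):
    """Synthetic data: three well-separated groups in four features."""
    centers = np.array([[0, 0, 0, 0], [6, 6, 0, 0], [0, 0, 6, 6]], dtype=float)
    parts = [rng.normal(c, 1.0, size=(n, 4)) for c, n in zip(centers, sizes)]
    return pd.DataFrame(np.vstack(parts), columns=["f1", "f2", "f3", "f4"])


rng = np.random.default_rng(0)
population = make_toy_frame(rng, sizes=(100, 100, 100))
customers = make_toy_frame(rng, sizes=(90, 5, 5))  # skewed toward one group

segmenter = fit_segmenter(population)
table = segment_shares(segmenter, population, customers)
print(table.round(2))
```

Fitting on the population and only predicting on customers keeps the cluster definitions identical for both groups, so the `difference` column is directly comparable. Clusters with a large positive difference are candidates for targeted marketing. Their defining features can then be read from the K-Means centroids mapped back through the PCA components.
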