A modular, on-premise data analytics platform for processing, querying, and visualizing OECD productivity data using Apache NiFi, PostgreSQL, MinIO, FastAPI, and optional LLM integration.
This platform automates the ingestion, transformation, and analysis of OECD economic indicators (e.g., productivity data) from the SDMX API. It supports batch processing, a REST API for querying the data, and integration with large language models (LLMs) for natural language insights.
| Layer | Technology |
|---|---|
| Data Ingestion | Apache NiFi (Docker) |
| Storage | MinIO (S3-compatible) |
| Transformation | Python (Pandas, JOLT optional) |
| Database | PostgreSQL |
| API Service | FastAPI |
| Visualization | [TBD] React / Superset |
| LLM (optional) | GPT-4 / LLaMA / Mistral (local or remote) |
| CI/CD | GitLab CI, Jenkins (planned) |
- 🚀 Automated ETL with NiFi
- 💾 Data lake storage using MinIO
- 🧮 PostgreSQL analytics-ready schema
- 🔍 REST API (FastAPI) for querying:
  - `/countries`
  - `/industries`
  - `/productivity/{country_code}`
  - `/productivity/compare?countries=...`
  - `/trends/{country_code}`
  - `/summary/germany` (custom use case)
- 🧠 Optional: ask an LLM natural-language questions such as “What was Germany’s productivity growth in 2023?”
git clone https://github.com/YOUR_USERNAME/oecd-analytics-platform.git
cd oecd-analytics-platform

Create a `.env` file with the following:
# PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=oecd_analytics
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
# MinIO
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin

docker-compose up -d

Fetch JSON from the SDMX API and upload it to MinIO under:
bucket: oecd-raw-data
prefix: oecd/productivity/
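
As a rough sketch of how the settings above fit together, the snippet below reads the MinIO variables from the environment and builds an object key under the `oecd/productivity/` prefix. The function names, the fallback values, and the file-naming scheme (`productivity_<timestamp>.json`) are assumptions made for illustration; the NiFi flow in this repository may name objects differently.

```python
import os
from datetime import datetime, timezone

RAW_BUCKET = "oecd-raw-data"        # bucket named in this README
RAW_PREFIX = "oecd/productivity/"   # prefix named in this README


def minio_settings(env=os.environ):
    """Collect MinIO connection settings from environment variables."""
    return {
        "endpoint": env.get("MINIO_ENDPOINT", "localhost:9000"),
        "access_key": env.get("MINIO_ACCESS_KEY", ""),
        "secret_key": env.get("MINIO_SECRET_KEY", ""),
    }


def raw_object_key(fetched_at):
    """Build an object key under the raw-data prefix (hypothetical naming scheme)."""
    stamp = fetched_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{RAW_PREFIX}productivity_{stamp}.json"


if __name__ == "__main__":
    settings = minio_settings()
    key = raw_object_key(datetime.now(timezone.utc))
    print(f"s3://{RAW_BUCKET}/{key} via {settings['endpoint']}")
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.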
python etl/process_oecd_productivity.py

cd microservices/api
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Go to http://localhost:8000/docs for interactive API documentation.
- `GET /countries` – List of all countries in the dataset
- `GET /industries` – Available industries
- `GET /productivity/DEU` – Germany’s productivity data
- `GET /productivity/compare?countries=DEU,FRA,ESP`
- `GET /summary/germany` – Prebuilt German dashboard data
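
The comparison endpoint above could be called from Python roughly as follows. This is a minimal standard-library sketch: it assumes the API is running locally on port 8000 and returns a JSON body, and the helper names `compare_url` and `fetch_json` are hypothetical. The response schema is not documented here, so the sketch does not assume any particular fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # matches the uvicorn command in this README


def compare_url(countries, base_url=BASE_URL):
    """Build the URL for /productivity/compare from a list of country codes."""
    # Keep commas unescaped so the query matches the documented form.
    query = urlencode({"countries": ",".join(countries)}, safe=",")
    return f"{base_url}/productivity/compare?{query}"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body (requires the API to be running)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = compare_url(["DEU", "FRA", "ESP"])
    print(url)
    # print(fetch_json(url))  # uncomment once the API is running
```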
- Batch ingestion with NiFi + MinIO
- PostgreSQL schema and ETL
- FastAPI backend with analytics endpoints
- React-based visualization dashboard
- LLM endpoint integration (`/ask`)
- CI/CD with GitLab + Jenkins
- Dockerized deployment
- LLM grounding on structured PostgreSQL results
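
One possible shape for the LLM grounding item above is to turn PostgreSQL query results into plain-text context and place the user's question after it. The sketch below is an assumption about how that might look, not the project's implementation: the function name and the row keys (`country_code`, `year`, `value`) are hypothetical, and the call to the model itself is left out.

```python
def build_grounded_prompt(question, rows):
    """Format query rows as plain-text context and append the user's question.

    `rows` is a list of dicts with hypothetical keys: country_code, year, value.
    """
    if not rows:
        context = "(no matching rows)"
    else:
        context = "\n".join(
            f"{row['country_code']} {row['year']}: {row['value']}" for row in rows
        )
    return (
        "Answer using only the data below. "
        "If the data is insufficient, say so.\n\n"
        f"Data:\n{context}\n\n"
        f"Question: {question}"
    )
```

Restricting the model to the supplied rows, and telling it to say when they are insufficient, is one common way to reduce invented figures in answers.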
.
├── etl/ # ETL scripts (JSON → PostgreSQL)
├── microservices/
│ └── api/ # FastAPI microservice
├── data/ # OECD JSON sample files (optional)
├── docker-compose.yml # Runs NiFi + MinIO
├── .env # Local environment config
└── README.md
MIT License. See LICENSE.
Murilo Polla
Built as part of an on-premise, AI-driven OECD analytics pipeline.
Interested in circularity, sustainability, and responsible AI.