Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
- 
            Updated
            Aug 26, 2022 
- Python
Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
One framework to develop, deploy and operate data workflows with Python and SQL.
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
Data Engineering Project with Hadoop HDFS and Kafka
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
Code examples showing flow deployment to various types of infrastructure
Classwork projects and home works done through Udacity data engineering nano degree
Let your pipe lines flow thru the Python code in xonsh.
Agentic Data Integrator that helps you build production-ready data pipelines so you can connect to more systems, faster. You run it in your terminal as a workflow wizard.
Deploy a Prefect flow to serverless AWS Lambda function
Apache Spark Guide
Distributed Data Processing Pipeline for MCP.
A end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra Data is processed on AWS EC2 with Apache Kafka and stored in a local Cassandra database.
A fully serverless, event-driven data pipeline that ingests, enriches, validates, and visualizes real-time news data using AWS services. Designed for cost-efficient, scalable deployment using only free-tier AWS services.
A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Docker. Data from kaggle and youtube-api
Analysis of 311 Service Requests for the City of NYC (from 2010 to 2023) Tech: Prefect cloud, dbt core, BigQuery, Compute Engine, CloudRun, Artifact Registry, Terraform, Docker
Reusable data engineering toolkit My personal data infrastructure
Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups
Add a description, image, and links to the data-engineering-pipeline topic page so that developers can more easily learn about it.
To associate your repository with the data-engineering-pipeline topic, visit your repo's landing page and select "manage topics."