This is a concept for what a Rails-inspired small data platform for startups and SMEs could look like. After using a variety of end-to-end solutions like DOMO, Keboola, Mozart Data and others, I keep wishing for something that covers 80% of ELT + BI needs out of the box, without the price surprises.
This project is an attempt to stitch together a set of solid and reliable open-source tools that combine into a lean platform where one data engineer can own the entire lifecycle, from ELT to data modelling to deploying and scaling in production.
- 🧪 From laptop to production in minutes - Develop locally with DuckDB, deploy with the same code. No more "it works on my machine" problems.
- ⚡ Lightning-fast analytics on any data size - DuckDB's column-oriented design handles gigabytes of data on modest hardware. Query billions of rows in seconds.
- 📊 Beautiful dashboards - Drag-and-drop dataviz with Metabase. Perfect for everyone, technical and non-technical alike.
- 💸 Scale without breaking the bank - Enterprise-grade data stack for as little as $30/month. The efficiency of DuckDB + SQLMesh means lower compute costs than Snowflake or BigQuery.
- 🔄 30+ ready-to-use integrations - Instant integrations with dlt for Stripe, GitHub, Salesforce, and more. Connect your SaaS tools with minimal code.
- 🤖 Just ask your DB - Ask questions in plain English with DuckDB's MCP. Get immediate answers without writing complex queries.
- 🔍 End-to-end data lineage - SQLMesh tracks transformations from raw to gold data. Understand exactly where metrics come from and debug easily.
Caution
The project is very much in the pre-alpha stage. It is more of an experiment and is not meant for production workloads.
- Local-first development for the entire stack.
- Support companies that can't afford heavy, expensive data tools or large teams.
- No "SSO tax": all tools should be either fully free or affordable once deployed for serious production use.
- No k8s, so a small data team can be self-sufficient.
- Cheap path to production and scaling.
- Extract (planned): dlt
- Transform: SQLMesh
- Data Storage: DuckDB
- BI / data viz: Metabase
- Deployment: Dokploy
- uv
- mise (recommended)
- claude (recommended)
- Clone this repository
- Download the DuckDB driver for Metabase:
make download-duckdb-driver
- Start the services:
docker-compose up -d
- Access Metabase at http://localhost:3000
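Metabase can take a little while to boot after `docker-compose up -d`, so it may help to check readiness before opening the browser. The following is a minimal sketch, not part of this repository: it assumes Metabase is listening on the default `http://localhost:3000` and polls its `/api/health` endpoint. The function name `metabase_is_up` is hypothetical.

```python
import json
import urllib.error
import urllib.request


def metabase_is_up(base_url: str = "http://localhost:3000", timeout: float = 3.0) -> bool:
    """Return True if the Metabase health endpoint reports status "ok"."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error while still starting up,
        # or a non-JSON body: treat all of these as "not ready yet".
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

Calling `metabase_is_up()` in a short retry loop is one way to wait for the container before running anything that depends on it.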
TODO
This project can be deployed to DigitalOcean/Hetzner/EC2 using Dokploy with the following architecture:
- Metabase Container:
  - Dedicated hostname (e.g., metabase.yourdomain.com)
  - Access to mounted DuckDB volume
- dlt + SQLMesh Container:
  - Combined container for data processing
  - Access to the same DuckDB volume
- Shared Storage:
  - Used for persistent DuckDB storage
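The layout above can be sketched as a Compose-style definition with two services sharing one named volume. This is an illustration under stated assumptions, not the repository's actual configuration: the service name `pipeline`, the volume name `duckdb_data`, and the `/data` mount path are hypothetical placeholders. The output is JSON, which Compose can read because JSON is valid YAML.

```python
import json


def build_compose() -> dict:
    """Build a Compose-style mapping: two services, one shared DuckDB volume."""
    # Hypothetical volume name and mount path; both containers use the same one.
    shared_mount = "duckdb_data:/data"
    return {
        "services": {
            # BI container; hostname routing is assumed to be handled by Dokploy.
            "metabase": {
                "image": "metabase/metabase",
                "ports": ["3000:3000"],
                "volumes": [shared_mount],
            },
            # Combined dlt + SQLMesh container, built from the project itself.
            "pipeline": {
                "build": ".",
                "volumes": [shared_mount],
            },
        },
        # Named volume that persists the DuckDB file across restarts.
        "volumes": {"duckdb_data": {}},
    }


print(json.dumps(build_compose(), indent=2))
```

One design point to keep in mind with a shared file: DuckDB allows either a single read-write process or multiple read-only processes on the same database file, so the BI side would likely need to open it read-only while the pipeline is the only writer.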
- Add SQLMesh
- Add MCP for DuckDB
- Add dlt
- Add Dokploy deployment configuration
- Create a DigitalOcean box for a public demo
- Add installation docs
- Add usage docs
- Add Aider docs
Greg Goltsov - @gregoltsov, gregoltsov.bsky.social.
Here are some projects which inspired my thinking and this project: