
Commit 26fe64e (parent 19b4e7d)

feat: Add comprehensive chDB support to MCP ClickHouse server

- Add chDB embedded OLAP engine integration
- Implement run_chdb_select_query tool
- Add new configurations (CHDB_ENABLED, CHDB_DATA_PATH)
- Create comprehensive chDB prompt
- Enable hybrid deployments with independent ClickHouse/chDB operation

File tree

9 files changed: +770 −178 lines changed

.github/workflows/ci.yaml

Lines changed: 10 additions & 2 deletions

```diff
@@ -30,7 +30,15 @@ jobs:
       - name: Install Project
         run: uv sync --all-extras --dev
 
-      - name: Run tests
+      - name: Run chDB tests
+        env:
+          CHDB_ENABLED: "true"
+          CLICKHOUSE_ENABLED: "false"
+          CHDB_DATA_PATH: ":memory:"
+        run: |
+          uv run pytest tests/test_chdb_tool.py
+
+      - name: Run ClickHouse tests
         env:
           CLICKHOUSE_HOST: "localhost"
           CLICKHOUSE_PORT: "8123"
@@ -39,7 +47,7 @@ jobs:
           CLICKHOUSE_SECURE: "false"
           CLICKHOUSE_VERIFY: "false"
         run: |
-          uv run pytest tests
+          uv run pytest tests/test_tool.py
 
       - name: Lint with Ruff
         run: uv run ruff check .
```

README.md

Lines changed: 101 additions & 2 deletions

````diff
@@ -8,7 +8,7 @@ An MCP server for ClickHouse.
 
 ## Features
 
-### Tools
+### ClickHouse Tools
 
 * `run_select_query`
   * Execute SQL queries on your ClickHouse cluster.
@@ -22,8 +22,17 @@ An MCP server for ClickHouse.
   * List all tables in a database.
   * Input: `database` (string): The name of the database.
 
+### chDB Tools
+
+* `run_chdb_select_query`
+  * Execute SQL queries using chDB's embedded OLAP engine.
+  * Input: `sql` (string): The SQL query to execute.
+  * Query data directly from various sources (files, URLs, databases) without ETL processes.
+
 ## Configuration
 
+This MCP server supports both ClickHouse and chDB. You can enable either or both depending on your needs.
+
 1. Open the Claude Desktop configuration file located at:
    * On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
    * On Windows: `%APPDATA%/Claude/claude_desktop_config.json`
@@ -90,6 +99,63 @@ Or, if you'd like to try it out with the [ClickHouse SQL Playground](https://sql
 }
 ```
 
+For chDB (embedded OLAP engine), add the following configuration:
+
+```json
+{
+  "mcpServers": {
+    "mcp-clickhouse": {
+      "command": "uv",
+      "args": [
+        "run",
+        "--with",
+        "mcp-clickhouse",
+        "--python",
+        "3.13",
+        "mcp-clickhouse"
+      ],
+      "env": {
+        "CHDB_ENABLED": "true",
+        "CLICKHOUSE_ENABLED": "false",
+        "CHDB_DATA_PATH": "/path/to/chdb/data"
+      }
+    }
+  }
+}
+```
+
+You can also enable both ClickHouse and chDB simultaneously:
+
+```json
+{
+  "mcpServers": {
+    "mcp-clickhouse": {
+      "command": "uv",
+      "args": [
+        "run",
+        "--with",
+        "mcp-clickhouse",
+        "--python",
+        "3.13",
+        "mcp-clickhouse"
+      ],
+      "env": {
+        "CLICKHOUSE_HOST": "<clickhouse-host>",
+        "CLICKHOUSE_PORT": "<clickhouse-port>",
+        "CLICKHOUSE_USER": "<clickhouse-user>",
+        "CLICKHOUSE_PASSWORD": "<clickhouse-password>",
+        "CLICKHOUSE_SECURE": "true",
+        "CLICKHOUSE_VERIFY": "true",
+        "CLICKHOUSE_CONNECT_TIMEOUT": "30",
+        "CLICKHOUSE_SEND_RECEIVE_TIMEOUT": "30",
+        "CHDB_ENABLED": "true",
+        "CHDB_DATA_PATH": "/path/to/chdb/data"
+      }
+    }
+  }
+}
+```
+
 3. Locate the command entry for `uv` and replace it with the absolute path to the `uv` executable. This ensures that the correct version of `uv` is used when starting the server. On a mac, you can find this path using `which uv`.
 
 4. Restart Claude Desktop to apply the changes.
@@ -115,7 +181,22 @@ CLICKHOUSE_PASSWORD=clickhouse
 
 ### Environment Variables
 
-The following environment variables are used to configure the ClickHouse connection:
+The following environment variables are used to configure the ClickHouse and chDB connections:
+
+#### chDB Variables
+
+* `CHDB_ENABLED`: Enable/disable chDB functionality
+  * Default: `"false"`
+  * Set to `"true"` to enable chDB tools
+* `CHDB_DATA_PATH`: The path to the chDB data directory
+  * Required when `CHDB_ENABLED=true`
+  * Use `:memory:` for an in-memory database (recommended for testing)
+  * Use a file path for persistent storage (e.g., `/path/to/chdb/data`)
+* `CLICKHOUSE_ENABLED`: Enable/disable ClickHouse functionality
+  * Default: `"true"`
+  * Set to `"false"` to disable ClickHouse tools when using chDB only
+
+#### ClickHouse Variables
 
 #### Required Variables
 
@@ -184,6 +265,24 @@ CLICKHOUSE_PASSWORD=
 # Uses secure defaults (HTTPS on port 8443)
 ```
 
+For chDB only (in-memory):
+
+```env
+# chDB configuration
+CHDB_ENABLED=true
+CLICKHOUSE_ENABLED=false
+CHDB_DATA_PATH=:memory:
+```
+
+For chDB with persistent storage:
+
+```env
+# chDB configuration
+CHDB_ENABLED=true
+CLICKHOUSE_ENABLED=false
+CHDB_DATA_PATH=/path/to/chdb/data
+```
+
 You can set these variables in your environment, in a `.env` file, or in the Claude Desktop configuration:
 
 ```json
````
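The enable/disable switches above reduce to string comparisons on environment variables. A minimal sketch of how a server might gate its tool sets on them (illustrative only; `chdb_enabled` and `clickhouse_enabled` are hypothetical helper names, not this project's actual code):

```python
import os

def chdb_enabled() -> bool:
    # CHDB_ENABLED defaults to "false"; only the string "true" enables chDB.
    return os.environ.get("CHDB_ENABLED", "false").lower() == "true"

def clickhouse_enabled() -> bool:
    # CLICKHOUSE_ENABLED defaults to "true", so ClickHouse stays on
    # unless explicitly switched off for chDB-only deployments.
    return os.environ.get("CLICKHOUSE_ENABLED", "true").lower() == "true"

# chDB-only configuration, matching the in-memory example above
os.environ["CHDB_ENABLED"] = "true"
os.environ["CLICKHOUSE_ENABLED"] = "false"
os.environ["CHDB_DATA_PATH"] = ":memory:"

print(chdb_enabled())        # True
print(clickhouse_enabled())  # False
```

Note the asymmetric defaults: leaving both variables unset yields a ClickHouse-only server, preserving pre-chDB behavior.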

mcp_clickhouse/__init__.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -3,11 +3,17 @@
     list_databases,
     list_tables,
     run_select_query,
+    create_chdb_client,
+    run_chdb_select_query,
+    chdb_initial_prompt,
 )
 
 __all__ = [
     "list_databases",
     "list_tables",
     "run_select_query",
     "create_clickhouse_client",
+    "create_chdb_client",
+    "run_chdb_select_query",
+    "chdb_initial_prompt",
 ]
```
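A pattern worth keeping as the export list grows: check that every imported name is re-exported through `__all__`. Sketched here against the literal names visible in this diff rather than the installed package (the original import list may contain further names outside the hunk's context):

```python
# Names newly imported in mcp_clickhouse/__init__.py by this commit
new_imports = [
    "create_chdb_client",
    "run_chdb_select_query",
    "chdb_initial_prompt",
]
# Contents of __all__ after this commit, per the diff
dunder_all = [
    "list_databases",
    "list_tables",
    "run_select_query",
    "create_clickhouse_client",
    "create_chdb_client",
    "run_chdb_select_query",
    "chdb_initial_prompt",
]
# Every new chDB name is re-exported alongside the existing API.
missing = [n for n in new_imports if n not in dunder_all]
print(missing)  # []
```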

mcp_clickhouse/chdb_prompt.py

Lines changed: 119 additions & 0 deletions (new file)

````python
"""chDB prompts for MCP server."""

CHDB_PROMPT = """
# chDB Assistant Guide

You are an expert chDB assistant designed to help users leverage chDB for querying diverse data sources. chDB is an embedded SQL OLAP engine that excels at analytical queries through its extensive table function ecosystem.

## Available Tools

- **run_chdb_select_query**: Execute SELECT queries using chDB's table functions

## Table Functions: The Core of chDB

chDB's strength lies in its **table functions** - special functions that act as virtual tables, allowing you to query data from various sources without traditional ETL processes. Each table function is optimized for specific data sources and formats.

### File-Based Table Functions

#### **file() Function**
Query local files directly with automatic format detection:
```sql
-- Auto-detect format
SELECT * FROM file('/path/to/data.parquet');
SELECT * FROM file('sales.csv');

-- Explicit format specification
SELECT * FROM file('data.csv', 'CSV');
SELECT * FROM file('logs.json', 'JSONEachRow');
SELECT * FROM file('export.tsv', 'TSV');
```

### Remote Data Table Functions

#### **url() Function**
Access remote data over HTTP/HTTPS:
```sql
-- Query CSV from URL
SELECT * FROM url('https://example.com/data.csv', 'CSV');

-- Query Parquet from URL
SELECT * FROM url('https://data.example.com/logs/data.parquet');
```

#### **s3() Function**
Direct S3 data access:
```sql
-- Single S3 file
SELECT * FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/aapl_stock.csv', 'CSVWithNames');

-- S3 with credentials and wildcard patterns
SELECT count() FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/mta/*.tsv', '<KEY>', '<SECRET>', 'TSVWithNames');
```

#### **hdfs() Function**
Hadoop Distributed File System access:
```sql
-- HDFS file access
SELECT * FROM hdfs('hdfs://namenode:9000/data/events.parquet');

-- HDFS directory scan
SELECT * FROM hdfs('hdfs://cluster/warehouse/table/*', 'TSV');
```

### Database Table Functions

#### **sqlite() Function**
Query SQLite databases:
```sql
-- Access SQLite table
SELECT * FROM sqlite('/path/to/database.db', 'users');

-- Join with other data
SELECT u.name, s.amount
FROM sqlite('app.db', 'users') u
JOIN file('sales.csv') s ON u.id = s.user_id;
```

#### **postgresql() Function**
Connect to PostgreSQL:
```sql
-- PostgreSQL table access
SELECT * FROM postgresql('localhost:5432', 'mydb', 'orders', 'user', 'password');
```

#### **mysql() Function**
MySQL database integration:
```sql
-- MySQL table query
SELECT * FROM mysql('localhost:3306', 'shop', 'products', 'user', 'password');
```

## Table Function Best Practices

### **Performance Optimization**
- **Predicate Pushdown**: Apply filters early to reduce data transfer
- **Column Pruning**: Select only needed columns

### **Error Handling**
- Test table function connectivity with `LIMIT 1`
- Verify data formats match function expectations
- Use `DESCRIBE` to understand the schema before writing complex queries

## Workflow with Table Functions

1. **Identify Data Source**: Choose the appropriate table function
2. **Test Connection**: Use simple `SELECT * LIMIT 1` queries
3. **Explore Schema**: Use `DESCRIBE table_function(...)`
4. **Build Query**: Combine table functions as needed
5. **Optimize**: Apply filters and column selection

## Getting Started

When helping users:
1. **Identify their data source type** and recommend the appropriate table function
2. **Show table function syntax** with their specific parameters
3. **Demonstrate data exploration** using the table function
4. **Build analytical queries** combining multiple table functions if needed
5. **Optimize performance** through proper filtering and column selection

Remember: chDB's table functions eliminate the need for data loading - you can query data directly from its source, making analytics faster and more flexible.
"""
````
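The exploration workflow the prompt recommends (probe with `LIMIT 1`, then `DESCRIBE`, then query) can be mechanized. A small sketch, assuming nothing beyond the prompt text; `probe_queries` is a hypothetical helper, not part of this commit:

```python
def probe_queries(table_fn: str) -> list[str]:
    """Return the exploration queries the workflow recommends for a
    chDB table-function expression: a connectivity check, a schema
    check, then a cheap aggregate."""
    return [
        f"SELECT * FROM {table_fn} LIMIT 1",
        f"DESCRIBE {table_fn}",
        f"SELECT count() FROM {table_fn}",
    ]

# Each string could be passed to run_chdb_select_query in turn.
for q in probe_queries("file('sales.csv', 'CSV')"):
    print(q)
```

Running the probes in this order surfaces connectivity and format errors on a one-row query before any expensive full scan.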
