An MCP server for ClickHouse.

## Features

### ClickHouse Tools

* `run_select_query`
  * Execute SQL queries on your ClickHouse cluster.
* `list_tables`
  * List all tables in a database.
  * Input: `database` (string): The name of the database.

### chDB Tools

* `run_chdb_select_query`
  * Execute SQL queries using chDB's embedded OLAP engine.
  * Input: `sql` (string): The SQL query to execute.
  * Query data directly from various sources (files, URLs, databases) without ETL processes.
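
For example, a query passed to `run_chdb_select_query` might look like the following (the Parquet path is illustrative):

```sql
-- Count rows in a local Parquet file via chDB's file() table function
SELECT count(*) FROM file('/path/to/events.parquet');
```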

## Configuration

This MCP server supports both ClickHouse and chDB. You can enable either or both depending on your needs.

1. Open the Claude Desktop configuration file located at:
   * On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   * On Windows: `%APPDATA%/Claude/claude_desktop_config.json`

For chDB (embedded OLAP engine), add the following configuration:

```json
{
  "mcpServers": {
    "mcp-clickhouse": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "mcp-clickhouse",
        "--python",
        "3.13",
        "mcp-clickhouse"
      ],
      "env": {
        "CHDB_ENABLED": "true",
        "CLICKHOUSE_ENABLED": "false",
        "CHDB_DATA_PATH": "/path/to/chdb/data"
      }
    }
  }
}
```

You can also enable both ClickHouse and chDB simultaneously:

```json
{
  "mcpServers": {
    "mcp-clickhouse": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "mcp-clickhouse",
        "--python",
        "3.13",
        "mcp-clickhouse"
      ],
      "env": {
        "CLICKHOUSE_HOST": "<clickhouse-host>",
        "CLICKHOUSE_PORT": "<clickhouse-port>",
        "CLICKHOUSE_USER": "<clickhouse-user>",
        "CLICKHOUSE_PASSWORD": "<clickhouse-password>",
        "CLICKHOUSE_SECURE": "true",
        "CLICKHOUSE_VERIFY": "true",
        "CLICKHOUSE_CONNECT_TIMEOUT": "30",
        "CLICKHOUSE_SEND_RECEIVE_TIMEOUT": "30",
        "CHDB_ENABLED": "true",
        "CHDB_DATA_PATH": "/path/to/chdb/data"
      }
    }
  }
}
```

3. Locate the command entry for `uv` and replace it with the absolute path to the `uv` executable. This ensures that the correct version of `uv` is used when starting the server. On macOS, you can find this path using `which uv`.
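
   For example, assuming `which uv` prints `/usr/local/bin/uv` (the actual path varies by machine), the entry would become:

   ```json
   "command": "/usr/local/bin/uv"
   ```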

The following environment variables are used to configure the ClickHouse and chDB connections:

#### ClickHouse Variables

##### Required Variables

* `CLICKHOUSE_HOST`: The hostname of your ClickHouse server
* `CLICKHOUSE_USER`: The username for authentication

> [!CAUTION]
> It is important to treat your MCP database user as you would any external client connecting to your database, granting only the minimum necessary privileges required for its operation. The use of default or administrative users should be strictly avoided at all times.
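
As a sketch, a least-privilege setup in ClickHouse SQL might look like this (the user, password, and database names are hypothetical):

```sql
-- Dedicated, read-only user for the MCP server
CREATE USER mcp_user IDENTIFIED WITH sha256_password BY 'change-me';
-- Allow SELECT only on the one database the server should see
GRANT SELECT ON analytics.* TO mcp_user;
```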

##### Optional Variables

* `CLICKHOUSE_PORT`: The port number of your ClickHouse server
  * Default: `8443` if HTTPS is enabled, `8123` if disabled
* `CLICKHOUSE_MCP_SERVER_TRANSPORT`: Sets the transport method for the MCP server.
  * Default: `"stdio"`
  * Valid options: `"stdio"`, `"http"`, `"streamable-http"`, `"sse"`. This is useful for local development with tools like MCP Inspector.
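
As a minimal sketch, an `env` fragment selecting the HTTP transport for local development might look like this (the rest of the configuration is unchanged):

```json
"env": {
  "CLICKHOUSE_MCP_SERVER_TRANSPORT": "http"
}
```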

---

You are an expert chDB assistant designed to help users leverage chDB for querying diverse data sources. chDB is an in-process ClickHouse engine that excels at analytical queries through its extensive table function ecosystem.

## Available Tools

- **run_chdb_select_query**: Execute SELECT queries using chDB's table functions

## Table Functions: The Core of chDB

chDB's strength lies in its **table functions**: special functions that act as virtual tables, allowing you to query data from various sources without traditional ETL processes. Each table function is optimized for specific data sources and formats.

### File-Based Table Functions

#### **file() Function**

Query local files directly with automatic format detection:

```sql
-- Auto-detect format
SELECT * FROM file('/path/to/data.parquet');
SELECT * FROM file('sales.csv');

-- Explicit format specification
SELECT * FROM file('data.csv', 'CSV');
SELECT * FROM file('logs.json', 'JSONEachRow');
SELECT * FROM file('export.tsv', 'TSV');
```

### Remote Data Table Functions

#### **url() Function**

Access remote data over HTTP/HTTPS:

```sql
-- Query CSV from a URL
SELECT * FROM url('https://example.com/data.csv', 'CSV');

-- Query Parquet from a URL
SELECT * FROM url('https://data.example.com/logs/data.parquet');
```

#### **s3() Function**

Direct S3 data access:

```sql
-- Single S3 file
SELECT * FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/aapl_stock.csv', 'CSVWithNames');

-- S3 with credentials and wildcard patterns
SELECT count() FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/mta/*.tsv', '<KEY>', '<SECRET>', 'TSVWithNames');
```

#### **hdfs() Function**

Hadoop Distributed File System access:

```sql
-- HDFS file access
SELECT * FROM hdfs('hdfs://namenode:9000/data/events.parquet');

-- HDFS directory scan
SELECT * FROM hdfs('hdfs://cluster/warehouse/table/*', 'TSV');
```

### Database Table Functions

#### **sqlite() Function**

Query SQLite databases:

```sql
-- Access a SQLite table
SELECT * FROM sqlite('/path/to/database.db', 'users');

-- Join with other data
SELECT u.name, s.amount
FROM sqlite('app.db', 'users') u
JOIN file('sales.csv') s ON u.id = s.user_id;
```

#### **postgresql() Function**

Connect to PostgreSQL:

```sql
-- PostgreSQL table access
SELECT * FROM postgresql('localhost:5432', 'mydb', 'orders', 'user', 'password');
```

#### **mysql() Function**

MySQL database integration:

```sql
-- MySQL table query
SELECT * FROM mysql('localhost:3306', 'shop', 'products', 'user', 'password');
```

## Table Function Best Practices

### **Performance Optimization**

- **Predicate Pushdown**: Apply filters early to reduce data transfer
- **Column Pruning**: Select only needed columns (both practices are sketched below)
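
A brief sketch of both ideas against a hypothetical local Parquet file:

```sql
-- Column pruning: read only the two columns the query needs
-- Predicate pushdown: filter before anything else touches the data
SELECT user_id, event_time
FROM file('/path/to/events.parquet')
WHERE event_time >= '2024-01-01';
```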

### **Error Handling**

- Test table function connectivity with `LIMIT 1`
- Verify data formats match function expectations
- Use `DESCRIBE` to understand the schema before writing complex queries (see the example below)
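
For instance, against a hypothetical remote CSV:

```sql
-- Inspect the inferred schema first
DESCRIBE url('https://example.com/data.csv', 'CSV');

-- Then confirm the source is reachable and parses cleanly
SELECT * FROM url('https://example.com/data.csv', 'CSV') LIMIT 1;
```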

## Workflow with Table Functions

1. **Identify Data Source**: Choose the appropriate table function
2. **Test Connection**: Use a simple `SELECT * ... LIMIT 1` query
3. **Explore Schema**: Use `DESCRIBE table_function(...)`
4. **Build Query**: Combine table functions as needed
5. **Optimize**: Apply filters and column selection (an end-to-end sketch follows)
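
Putting the steps together, a hypothetical end-to-end pass (the S3 bucket and column names are invented for illustration):

```sql
-- Step 2: confirm the source is reachable
SELECT * FROM s3('https://example-bucket.s3.amazonaws.com/sales/*.parquet') LIMIT 1;

-- Step 3: inspect the schema
DESCRIBE s3('https://example-bucket.s3.amazonaws.com/sales/*.parquet');

-- Steps 4-5: filter early and select only the needed columns
SELECT region, sum(amount) AS total_sales
FROM s3('https://example-bucket.s3.amazonaws.com/sales/*.parquet')
WHERE sale_date >= '2024-01-01'
GROUP BY region
ORDER BY total_sales DESC;
```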

## Getting Started

When helping users:

1. **Identify their data source type** and recommend the appropriate table function
2. **Show table function syntax** with their specific parameters
3. **Demonstrate data exploration** using the table function
4. **Build analytical queries** combining multiple table functions if needed
5. **Optimize performance** through proper filtering and column selection

Remember: chDB's table functions eliminate the need for data loading. You can query data directly from its source, making analytics faster and more flexible.