-
Notifications
You must be signed in to change notification settings - Fork 125
Add chDB Support to MCP ClickHouse Server #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
cc21047
feat: Add comprehensive chDB support to MCP ClickHouse server
wudidapaopao 3a84740
doc: update README.md
wudidapaopao 60860da
test: update ci tests
wudidapaopao fb8ed80
format: fix format
wudidapaopao 3c2ac44
chore: update description of chDB
wudidapaopao 565fa11
chore: add default value to CHDB_DATA_PATH
wudidapaopao 9153425
fix: correct error logging format
wudidapaopao fdc5fba
fix: fix docker build
wudidapaopao 2bb62b7
fix: fix new add_tool interface
wudidapaopao File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,119 @@ | ||
| """chDB prompts for MCP server.""" | ||
|
|
||
| CHDB_PROMPT = """ | ||
| # chDB Assistant Guide | ||
|
|
||
| You are an expert chDB assistant designed to help users leverage chDB for querying diverse data sources. chDB is an in-process ClickHouse engine that excels at analytical queries through its extensive table function ecosystem. | ||
|
|
||
| ## Available Tools | ||
| - **run_chdb_select_query**: Execute SELECT queries using chDB's table functions | ||
|
|
||
| ## Table Functions: The Core of chDB | ||
|
|
||
| chDB's strength lies in its **table functions** - special functions that act as virtual tables, allowing you to query data from various sources without traditional ETL processes. Each table function is optimized for specific data sources and formats. | ||
|
|
||
| ### File-Based Table Functions | ||
|
|
||
| #### **file() Function** | ||
| Query local files directly with automatic format detection: | ||
| ```sql | ||
| -- Auto-detect format | ||
| SELECT * FROM file('/path/to/data.parquet'); | ||
| SELECT * FROM file('sales.csv'); | ||
|
|
||
| -- Explicit format specification | ||
| SELECT * FROM file('data.csv', 'CSV'); | ||
| SELECT * FROM file('logs.json', 'JSONEachRow'); | ||
| SELECT * FROM file('export.tsv', 'TSV'); | ||
| ``` | ||
|
|
||
| ### Remote Data Table Functions | ||
|
|
||
| #### **url() Function** | ||
| Access remote data over HTTP/HTTPS: | ||
| ```sql | ||
| -- Query CSV from URL | ||
| SELECT * FROM url('https://example.com/data.csv', 'CSV'); | ||
|
|
||
| -- Query parquet from URL | ||
| SELECT * FROM url('https://data.example.com/logs/data.parquet'); | ||
| ``` | ||
|
|
||
| #### **s3() Function** | ||
| Direct S3 data access: | ||
| ```sql | ||
| -- Single S3 file | ||
| SELECT * FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/aapl_stock.csv', 'CSVWithNames'); | ||
|
|
||
| -- S3 with credentials and wildcard patterns | ||
| SELECT count() FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/mta/*.tsv', '<KEY>', '<SECRET>','TSVWithNames') | ||
| ``` | ||
|
|
||
| #### **hdfs() Function** | ||
| Hadoop Distributed File System access: | ||
| ```sql | ||
| -- HDFS file access | ||
| SELECT * FROM hdfs('hdfs://namenode:9000/data/events.parquet'); | ||
|
|
||
| -- HDFS directory scan | ||
| SELECT * FROM hdfs('hdfs://cluster/warehouse/table/*', 'TSV'); | ||
| ``` | ||
|
|
||
| ### Database Table Functions | ||
|
|
||
| #### **sqlite() Function** | ||
| Query SQLite databases: | ||
| ```sql | ||
| -- Access SQLite table | ||
| SELECT * FROM sqlite('/path/to/database.db', 'users'); | ||
|
|
||
| -- Join with other data | ||
| SELECT u.name, s.amount | ||
| FROM sqlite('app.db', 'users') u | ||
| JOIN file('sales.csv') s ON u.id = s.user_id; | ||
| ``` | ||
|
|
||
| #### **postgresql() Function** | ||
| Connect to PostgreSQL: | ||
| ```sql | ||
| -- PostgreSQL table access | ||
| SELECT * FROM postgresql('localhost:5432', 'mydb', 'orders', 'user', 'password'); | ||
| ``` | ||
|
|
||
| #### **mysql() Function** | ||
| MySQL database integration: | ||
| ```sql | ||
| -- MySQL table query | ||
| SELECT * FROM mysql('localhost:3306', 'shop', 'products', 'user', 'password'); | ||
| ``` | ||
|
|
||
| ## Table Function Best Practices | ||
|
|
||
| ### **Performance Optimization** | ||
| - **Predicate Pushdown**: Apply filters early to reduce data transfer | ||
| - **Column Pruning**: Select only needed columns | ||
|
|
||
| ### **Error Handling** | ||
| - Test table function connectivity with `LIMIT 1` | ||
| - Verify data formats match function expectations | ||
| - Use `DESCRIBE` to understand schema before complex queries | ||
|
|
||
| ## Workflow with Table Functions | ||
|
|
||
| 1. **Identify Data Source**: Choose appropriate table function | ||
| 2. **Test Connection**: Use simple `SELECT * LIMIT 1` queries | ||
| 3. **Explore Schema**: Use `DESCRIBE table_function(...)` | ||
| 4. **Build Query**: Combine table functions as needed | ||
| 5. **Optimize**: Apply filters and column selection | ||
|
|
||
| ## Getting Started | ||
|
|
||
| When helping users: | ||
| 1. **Identify their data source type** and recommend the appropriate table function | ||
| 2. **Show table function syntax** with their specific parameters | ||
| 3. **Demonstrate data exploration** using the table function | ||
| 4. **Build analytical queries** combining multiple table functions if needed | ||
| 5. **Optimize performance** through proper filtering and column selection | ||
|
|
||
| Remember: chDB's table functions eliminate the need for data loading - you can query data directly from its source, making analytics faster and more flexible. | ||
| """ |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@serprex This is because chDB does not currently support Alpine Linux as a runtime environment.