
Commit 51faf0a

antejavor and matea16 authored
Document sql2graph (#1449)
* Init agents docs.
* Delete examples.
* Update wording.
* Update title.
* Update paragraphs.
* Update pages with paragraphs.
* Update structure.

Co-authored-by: matea16 <mateapesic@hotmail.com>
1 parent 99ab79a commit 51faf0a

File tree

7 files changed: +380 -10 lines changed


pages/ai-ecosystem.mdx

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ This section of Memgraph’s documentation is your guide to using Memgraph for A
- [GraphChat in Memgraph Lab](/memgraph-lab/features/graphchat): Explore how
  natural language querying (GraphChat) ties into the GraphRAG ecosystem, making
  complex graphs accessible to everyone.
- [Agents in Memgraph](/ai-ecosystem/agents): Discover how you can leverage AI
  agents to automate graph modeling and migration tasks.

<CommunityLinks/>

pages/ai-ecosystem/_meta.ts

Lines changed: 1 addition & 0 deletions
@@ -2,4 +2,5 @@ export default {
  "graph-rag": "GraphRAG",
  "integrations": "Integrations",
  "machine-learning": "Machine learning",
  "agents": "Agents"
}

pages/ai-ecosystem/agents.mdx

Lines changed: 339 additions & 0 deletions
@@ -0,0 +1,339 @@
---
title: Agents
description: Memgraph agents are built to help you build graph applications faster by leveraging the power of AI.
---
import { Callout, Steps } from 'nextra/components'
import {CommunityLinks} from '/components/social-card/CommunityLinks'

# Memgraph Agents

**Memgraph Agents** are specialized tools designed to streamline and enhance the
development of graph applications. These agents leverage **Large Language Models
(LLMs)** to provide intelligent solutions for various graph-related tasks. Given
the current maturity of the technology, some agents may be experimental and are
continuously evolving to better serve your needs.

## SQL2Graph Agent

The **SQL2Graph Agent** is an intelligent database migration agent that
transforms **relational databases** (MySQL, PostgreSQL) into **graph databases** using
AI-powered analysis. <br/>
It leverages Large Language Models (LLMs) to understand the
semantics of your relational schema and generate an optimized property graph
model for Memgraph. <br/>
The agent enables interactive modeling and refinement of the graph schema, as
well as validation after the migration.

<h3 className="custom-header">Key capabilities</h3>

- **Automatic database migration**: Performs end-to-end migration from SQL to
  graph with minimal user input.
- **Interactive graph modeling**: Enables users to review and refine the
  generated graph model incrementally, before executing the migration.
- **Validation**: Provides pre- and post-migration validation to communicate
  the quality and correctness of the migration.

**Modeling strategies**

1. **Deterministic strategy**: Rule-based mapping of tables to nodes and foreign
   keys to relationships.
2. **LLM strategy**: AI-powered analysis using LLMs to generate a semantically
   rich graph model.

**Operation modes**

1. **Automatic mode**: Fully automated migration without user interaction.
2. **Incremental mode**: Step-by-step review and refinement of the graph model
   before migration.

Both the strategy and the mode are controlled via CLI flags and environment
variables.
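
For example, incremental mode with the LLM strategy can be selected either
through CLI flags or through the corresponding environment variables (the full
list is in the CLI reference below):

```bash
# Select the mode and strategy via CLI flags...
uv run main.py --mode incremental --strategy llm

# ...or via environment variables (the same values you can set in .env)
export SQL2MG_MODE=incremental
export SQL2MG_STRATEGY=llm
uv run main.py
```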

### Supported databases

| Type | Supported options |
|------|-------------------|
| **Source databases** | PostgreSQL, MySQL |
| **Target database** | Memgraph |

### How to use the agent

From this point onward, it is assumed that you have Memgraph installed and
running. If you haven't done so, please refer to the [Memgraph installation
guide](/memgraph/installation).

Start Memgraph with schema tracking enabled:

```bash
docker run -p 7687:7687 memgraph/memgraph --schema-info-enabled=true
```

<Callout type="warning">
**Important**: Memgraph must be started with `--schema-info-enabled=true` for
full functionality.
</Callout>

It is also assumed that you have a running instance of either PostgreSQL or
MySQL with a sample database to migrate.
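
If you don't have a source database handy, a disposable container is enough for
a test run. The sketch below uses the official `postgres` image with placeholder
credentials that match the sample `.env` in the next section:

```bash
# Throwaway PostgreSQL instance for trying out the migration (example values)
docker run --name sql2graph-postgres -d -p 5432:5432 \
  -e POSTGRES_USER=username \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=mydb \
  postgres:16
```

You still need to load some sample data into it before running the agent.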

### Installation

To use the agent, you first need to clone the repository and install the
dependencies:

```bash
# Clone the repository
git clone https://github.com/memgraph/ai-toolkit

# Navigate to the sql2graph directory
cd ai-toolkit/agents/sql2graph

# Install dependencies using uv
uv pip install -e .
```

### Configuration

The configuration enables you to control the agent flow via environment
variables. The key information needed is the source database connection
details, the target Memgraph connection details, the LLM API keys, and the
agent configuration.

Create a `.env` file and fill in the following variables:

```bash
# Source Database
SOURCE_DB_TYPE=postgresql # or mysql

# PostgreSQL Configuration
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DATABASE=mydb
POSTGRES_USER=username
POSTGRES_PASSWORD=password
POSTGRES_SCHEMA=public

# MySQL Configuration (if using MySQL)
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_DATABASE=mydb
MYSQL_USER=username
MYSQL_PASSWORD=password

# Target Memgraph Database
MEMGRAPH_URL=bolt://localhost:7687
MEMGRAPH_USERNAME=
MEMGRAPH_PASSWORD=
MEMGRAPH_DATABASE=memgraph

# LLM API Keys (for AI-powered features)
# Only provide the key for your chosen provider
OPENAI_API_KEY=sk-... # For GPT models
# ANTHROPIC_API_KEY=sk-ant-... # For Claude models
# GOOGLE_API_KEY=AI... # For Gemini models

# Optional: Specify LLM model (defaults shown)
# LLM_MODEL=gpt-4o-mini # OpenAI default
# LLM_MODEL=claude-3-5-sonnet-20241022 # Anthropic default
# LLM_MODEL=gemini-2.0-flash-exp # Google default

# Migration Defaults (can be overridden via CLI flags)
SQL2MG_MODE=automatic # Options: automatic, incremental
SQL2MG_STRATEGY=deterministic # Options: deterministic, llm
SQL2MG_META_POLICY=auto # Options: auto, reset, skip
SQL2MG_LOG_LEVEL=INFO
```

> 💡 **Tip:** Use `.env.example` in `agents/sql2graph` as a template.
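
For example, from the `agents/sql2graph` directory:

```bash
# Start from the provided template and adjust the values for your setup
cp .env.example .env
```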

#### Quick start - automatic migration

Run with default settings (automatic mode, deterministic strategy):

```bash
uv run main.py
```

The agent will:
1. Validate your environment and database connections
2. Analyze the source database schema
3. Generate a complete graph model
4. Execute the migration
5. Validate the results

In **automatic mode**, no user interaction is required, and the entire process
is automated. This means that `SQL2MG_MODE` is set to `automatic` and
`SQL2MG_STRATEGY` is set to `deterministic`. `SQL2MG_MODE` refers to the
**modeling mode** and represents how much user interaction is involved, while
`SQL2MG_STRATEGY` refers to how the graph model is generated.
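
Once the run finishes, you can spot-check the migrated graph directly in
Memgraph. The sketch below assumes the Docker setup from above and uses
`mgconsole`, which ships with the `memgraph/memgraph` image; any Memgraph client
(for example, Memgraph Lab) works just as well:

```bash
# Find the running Memgraph container started earlier
MEMGRAPH_CONTAINER=$(docker ps -q --filter ancestor=memgraph/memgraph)

# Count the migrated nodes and relationships
echo "MATCH (n) RETURN count(n) AS nodes;" | docker exec -i "$MEMGRAPH_CONTAINER" mgconsole
echo "MATCH ()-[r]->() RETURN count(r) AS relationships;" | docker exec -i "$MEMGRAPH_CONTAINER" mgconsole
```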

#### Refinement with incremental mode

For more control, run in incremental mode to review and refine the model
step-by-step:

```bash
uv run main.py --mode incremental
```

The agent will:
1. Analyze the source database schema
2. Generate an initial graph model
3. Present each table's proposed transformation for review
4. Allow you to accept, skip, or modify each table's mapping
5. After reviewing all tables, optionally enter a refinement loop for final
   adjustments
6. Execute the migration
7. Validate the results

This is a predictable and repeatable flow, so you can iteratively improve the
graph model before migration. Each table is processed one at a time, and you
have full control over the transformations: for each table, the agent shows all
the proposed nodes, relationships, and properties, and you can choose to accept
them as-is, skip the table entirely, or modify the mapping details.

#### Automatic migration with LLM

Use LLM-powered modeling for AI-driven design:

```bash
uv run main.py --strategy llm
```

The agent auto-detects which LLM provider to use based on available API keys. In
this strategy, the agent will:
1. Analyze your SQL schema semantically using an LLM
2. Generate an initial graph model with AI-optimized design
3. Execute the migration
4. Validate the results

Keep in mind that in this mode, the entire migration is still automatic and
LLM-driven.

#### Incremental migration with review

Control each step of the transformation:

```bash
uv run main.py --mode incremental --strategy llm
```

In incremental mode:
1. The AI generates a complete graph model for all tables
2. You review each table's mapping one at a time
3. Accept or modify individual table transformations
4. After processing all tables, optionally enter a refinement loop
5. Interactively adjust the entire model before final migration

In this mode, the LLM is used to generate the initial model, but you have full
control to review and refine each table's mapping before migration. After each
modification, the LLM will try to regenerate the model based on your feedback
and validation errors to improve it iteratively.
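
A typical combined run that also pins the provider and increases the log
verbosity (all flags are documented in the CLI reference below) might look like
this:

```bash
# Incremental review with LLM modeling, an explicit provider, and verbose logs
uv run main.py --mode incremental --strategy llm --provider openai --log-level DEBUG
```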

### CLI reference

#### Command-line options

| Flag | Environment Variable | Description | Default |
|------|---------------------|-------------|---------|
| `--mode` | `SQL2MG_MODE` | `automatic` or `incremental` | interactive prompt |
| `--strategy` | `SQL2MG_STRATEGY` | `deterministic` or `llm` | interactive prompt |
| `--provider` | _(none)_ | `openai`, `anthropic`, or `gemini` | auto-detect from API keys |
| `--model` | `LLM_MODEL` | Specific model name | provider default |
| `--meta-graph` | `SQL2MG_META_POLICY` | `auto`, `skip`, or `reset` | `auto` |
| `--log-level` | `SQL2MG_LOG_LEVEL` | `DEBUG`, `INFO`, `WARNING`, `ERROR` | `INFO` |

#### Usage examples

```bash
# Use a specific Gemini model
uv run main.py --strategy llm --provider gemini --model gemini-2.0-flash-exp

# Skip meta-graph comparison (treat as fresh migration)
uv run main.py --meta-graph skip

# Enable debug logging
uv run main.py --log-level DEBUG

# Fully configured non-interactive run
uv run main.py \
  --mode automatic \
  --strategy deterministic \
  --meta-graph reset \
  --log-level INFO
```

### LLM provider support

| Provider | Models |
|----------|--------|
| **OpenAI** | GPT-4o, GPT-4o-mini |
| **Anthropic** | Claude 3.5 Sonnet |
| **Google** | Gemini 2.0 Flash |

#### Provider selection

The agent automatically selects a provider based on available API keys:
1. Checks for `OPENAI_API_KEY`
2. Falls back to `ANTHROPIC_API_KEY`
3. Falls back to `GOOGLE_API_KEY`

Override with the `--provider` flag:

```bash
# Force Anthropic even if an OpenAI key exists
uv run main.py --strategy llm --provider anthropic
```

#### Model selection

Each provider has sensible defaults:
- **OpenAI**: `gpt-4o-mini`
- **Anthropic**: `claude-3-5-sonnet-20241022`
- **Google**: `gemini-2.0-flash-exp`

Override with `--model` or the `LLM_MODEL` environment variable:

```bash
# Use a more powerful OpenAI model
uv run main.py --strategy llm --model gpt-4o

# Or via environment variable
export LLM_MODEL=claude-3-opus-20240229
uv run main.py --strategy llm --provider anthropic
```

### Architecture overview

If you are interested in the implementation details, here is a high-level
overview of the project structure:

```
sql2graph/
├── main.py                     # CLI entry point
├── core/
│   ├── migration_agent.py      # Main orchestration
│   └── hygm/                   # Graph modeling engine
│       ├── hygm.py             # HyGM core
│       ├── models/             # Data models
│       ├── strategies/         # Modeling strategies
│       └── validation/         # Validation system
├── database/
│   ├── analyzer.py             # Schema analysis
│   ├── factory.py              # Database adapter factory
│   └── adapters/               # DB-specific adapters
├── query_generation/
│   ├── cypher_generator.py     # Cypher query builder
│   └── schema_utilities.py     # Schema helpers
└── utils/
    ├── environment.py          # Env validation
    └── config.py               # Configuration
```

<CommunityLinks/>

pages/data-migration.mdx

Lines changed: 6 additions & 0 deletions
@@ -29,6 +29,12 @@ In order to learn all the pre-requisites for importing data into Memgraph, check

</Callout>

<Callout type="info" title="SQL2Graph Agent">
If you have a SQL data model and want to migrate it to Memgraph, you can try out
our [Agent](/ai-ecosystem/agents) that leverages LLMs to automate the process of
modeling and migration.
</Callout>

## File types

### CSV files
