---
title: Agents
description: Memgraph agents are designed to help you build graph applications faster by leveraging the power of AI.
---
import { Callout, Steps } from 'nextra/components'
import {CommunityLinks} from '/components/social-card/CommunityLinks'

# Memgraph Agents

**Memgraph Agents** are specialized tools designed to streamline and enhance the
development of graph applications. These agents leverage **Large Language Models
(LLMs)** to provide intelligent solutions for various graph-related tasks. Because
the technology is still maturing, some agents are experimental and are
continuously evolving to better serve your needs.

## SQL2Graph Agent

The **SQL2Graph Agent** is an intelligent database migration agent that
transforms **relational databases** (MySQL, PostgreSQL) into **graph databases** using
AI-powered analysis. <br/>
It leverages Large Language Models (LLMs) to understand the
semantics of your relational schema and generate an optimized property graph
model for Memgraph. <br/>
The agent enables interactive modeling and refinement of the
graph schema, as well as validation after the migration.

<h3 className="custom-header">Key capabilities</h3>

- **Automatic database migration**: Performs end-to-end migration from SQL to
  graph with minimal user input.
- **Interactive graph modeling**: Enables users to review and refine the
  generated graph model incrementally, before executing the migration.
- **Validation**: Provides pre- and post-migration validation to communicate
  the quality and correctness of the migration.

**Modeling strategies**

1. **Deterministic strategy**: Rule-based mapping of tables to nodes and foreign
   keys to relationships.
2. **LLM strategy**: AI-powered analysis using LLMs to generate a semantically
   rich graph model.

**Operation modes**

1. **Automatic mode**: Fully automated migration without user interaction.
2. **Incremental mode**: Step-by-step review and refinement of the graph model
   before migration.

These are controlled via CLI flags and environment variables.
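
For example, once the agent is installed as described below, the mode and
strategy can be combined freely on the command line (see the CLI reference
later on this page for all options):

```bash
# Fully automated, rule-based migration
uv run main.py --mode automatic --strategy deterministic

# Step-by-step review of an LLM-generated model
uv run main.py --mode incremental --strategy llm
```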

### Supported databases

| Type | Supported options |
|------|-------------------|
| **Source databases** | PostgreSQL, MySQL |
| **Target database** | Memgraph |

### How to use the agent

From this point onward, it is assumed that you have Memgraph installed and
running. If you haven't done so, please refer to the [Memgraph installation
guide](/memgraph/installation).

Start Memgraph with schema tracking enabled:

```bash
docker run -p 7687:7687 memgraph/memgraph --schema-info-enabled=true
```

<Callout type="warning">
**Important**: Memgraph must be started with `--schema-info-enabled=true` for
full functionality.
</Callout>
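
To confirm that schema tracking is active, you can run the `SHOW SCHEMA INFO;`
query against the running instance. A minimal sketch, assuming `mgconsole` is
installed on the host (any Cypher client works just as well):

```bash
# Should return a schema description rather than an error
echo "SHOW SCHEMA INFO;" | mgconsole --host 127.0.0.1 --port 7687
```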

It is also assumed that you have a running instance of either PostgreSQL or
MySQL with a sample database to migrate.
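
If you just want to try the agent out, a disposable PostgreSQL container is
enough. The sketch below uses the official `postgres` image with credentials
matching the configuration example later in this guide; the container name and
image tag are arbitrary, and you still need to load a sample schema into it
before migrating:

```bash
# Throwaway PostgreSQL instance for experimenting with the agent
docker run --name sql2graph-postgres \
  -e POSTGRES_USER=username \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=mydb \
  -p 5432:5432 -d postgres:16
```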

### Installation

To use the agent, you first need to clone the repository and install the
dependencies:

```bash
# Clone the repository
git clone https://github.com/memgraph/ai-toolkit

# Navigate to the sql2graph directory
cd ai-toolkit/agents/sql2graph

# Install dependencies using uv
uv pip install -e .
```
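
The steps above assume that [uv](https://docs.astral.sh/uv/) is already
available on your machine. If it is not, one way to get it is from PyPI:

```bash
# Install uv and confirm it is on the PATH
pip install uv
uv --version
```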

### Configuration

The configuration lets you control the agent flow via environment variables. The
key information needed is the source database connection details, the target
Memgraph connection details, the LLM API keys, and the agent configuration.

Create a `.env` file and fill in the following variables:

```bash
# Source Database
SOURCE_DB_TYPE=postgresql # or mysql

# PostgreSQL Configuration
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DATABASE=mydb
POSTGRES_USER=username
POSTGRES_PASSWORD=password
POSTGRES_SCHEMA=public

# MySQL Configuration (if using MySQL)
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_DATABASE=mydb
MYSQL_USER=username
MYSQL_PASSWORD=password

# Target Memgraph Database
MEMGRAPH_URL=bolt://localhost:7687
MEMGRAPH_USERNAME=
MEMGRAPH_PASSWORD=
MEMGRAPH_DATABASE=memgraph

# LLM API Keys (for AI-powered features)
# Only provide the key for your chosen provider
OPENAI_API_KEY=sk-... # For GPT models
# ANTHROPIC_API_KEY=sk-ant-... # For Claude models
# GOOGLE_API_KEY=AI... # For Gemini models

# Optional: Specify LLM model (defaults shown)
# LLM_MODEL=gpt-4o-mini # OpenAI default
# LLM_MODEL=claude-3-5-sonnet-20241022 # Anthropic default
# LLM_MODEL=gemini-2.0-flash-exp # Google default

# Migration Defaults (can be overridden via CLI flags)
SQL2MG_MODE=automatic # Options: automatic, incremental
SQL2MG_STRATEGY=deterministic # Options: deterministic, llm
SQL2MG_META_POLICY=auto # Options: auto, reset, skip
SQL2MG_LOG_LEVEL=INFO
```
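
Rather than writing this file from scratch, you can copy the provided template
and adjust the values (assuming you are in the `agents/sql2graph` directory, as
noted in the tip below):

```bash
# Start from the bundled template and edit it to match your setup
cp .env.example .env
```
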
> 💡 **Tip:** Use `.env.example` in `agents/sql2graph` as a template.

#### Quick start - automatic migration

Run with default settings (automatic mode, deterministic strategy):

```bash
uv run main.py
```

The agent will:
1. Validate your environment and database connections
2. Analyze the source database schema
3. Generate a complete graph model
4. Execute the migration
5. Validate the results

In **automatic mode**, no user interaction is required, and the entire process
is automated. This corresponds to `SQL2MG_MODE` set to `automatic` and
`SQL2MG_STRATEGY` set to `deterministic`. `SQL2MG_MODE` refers to the **modeling
mode** and determines how much user interaction is involved, while
`SQL2MG_STRATEGY` determines how the graph model is generated.
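
The same run can also be expressed with explicit CLI flags, which override the
defaults from `.env` (see the CLI reference below):

```bash
# Equivalent to the defaults shown above
uv run main.py --mode automatic --strategy deterministic
```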

#### Refinement with incremental mode

For more control, run in incremental mode to review and refine the model
step-by-step:

```bash
uv run main.py --mode incremental
```

The agent will:
1. Analyze the source database schema
2. Generate an initial graph model
3. Present each table's proposed transformation for review
4. Allow you to accept, skip, or modify each table's mapping
5. After reviewing all tables, optionally enter a refinement loop for final
   adjustments
6. Execute the migration
7. Validate the results

This is a predictable and repeatable flow, so you can iteratively improve the
graph model before migration. Tables are processed one at a time, and you have
full control over the transformations: for each table, the agent shows all the
proposed nodes, relationships, and properties, and you can accept them as-is,
skip the table entirely, or modify the mapping details.
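
If you prefer to configure this once instead of passing the flag on every run,
the same behavior can be selected through the environment, for example:

```bash
# Equivalent to passing --mode incremental on the command line
export SQL2MG_MODE=incremental
uv run main.py
```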

#### Automatic migration with the LLM strategy

Use LLM-powered modeling for AI-driven design:

```bash
uv run main.py --strategy llm
```

The agent auto-detects which LLM provider to use based on available API keys. In
this strategy, the agent will:
1. Analyze your SQL schema semantically using an LLM
2. Generate an initial graph model with AI-optimized design
3. Execute the migration
4. Validate the results

Keep in mind that in this mode, the entire migration is still automatic and
LLM-driven.
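
The LLM strategy requires an API key for the chosen provider. A minimal sketch
using OpenAI (the key value is a placeholder):

```bash
# Export the key for the provider you want to use, then run the agent
export OPENAI_API_KEY=sk-...
uv run main.py --strategy llm --provider openai
```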

#### Incremental migration with review

Control each step of the transformation:

```bash
uv run main.py --mode incremental --strategy llm
```

In incremental mode:
1. The AI generates a complete graph model for all tables
2. You review each table's mapping one at a time
3. Accept or modify individual table transformations
4. After processing all tables, optionally enter a refinement loop
5. Interactively adjust the entire model before final migration

In this mode, the LLM is used to generate the initial model, but you have full
control to review and refine each table's mapping before migration. After each
modification, the LLM regenerates the mapping based on your feedback and any
validation errors, improving the model iteratively.
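
When iterating like this, you may also want to control how the agent compares
the new run against the meta-graph left by a previous one; the `--meta-graph`
flag described in the CLI reference below covers this. For example, to skip the
comparison and treat the run as a fresh migration:

```bash
# Skip meta-graph comparison (treat as fresh migration)
uv run main.py --mode incremental --strategy llm --meta-graph skip
```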

### CLI reference

#### Command-line options

| Flag | Environment Variable | Description | Default |
|------|---------------------|-------------|---------|
| `--mode` | `SQL2MG_MODE` | `automatic` or `incremental` | interactive prompt |
| `--strategy` | `SQL2MG_STRATEGY` | `deterministic` or `llm` | interactive prompt |
| `--provider` | _(none)_ | `openai`, `anthropic`, or `gemini` | auto-detect from API keys |
| `--model` | `LLM_MODEL` | Specific model name | provider default |
| `--meta-graph` | `SQL2MG_META_POLICY` | `auto`, `skip`, or `reset` | `auto` |
| `--log-level` | `SQL2MG_LOG_LEVEL` | `DEBUG`, `INFO`, `WARNING`, `ERROR` | `INFO` |

#### Usage examples

```bash
# Use a specific Gemini model
uv run main.py --strategy llm --provider gemini --model gemini-2.0-flash-exp

# Skip meta-graph comparison (treat as fresh migration)
uv run main.py --meta-graph skip

# Enable debug logging
uv run main.py --log-level DEBUG

# Fully configured non-interactive run
uv run main.py \
  --mode automatic \
  --strategy deterministic \
  --meta-graph reset \
  --log-level INFO
```

### LLM provider support

| Provider | Models |
|----------|--------|
| **OpenAI** | GPT-4o, GPT-4o-mini |
| **Anthropic** | Claude 3.5 Sonnet |
| **Google** | Gemini 2.0 Flash |

#### Provider selection

The agent automatically selects a provider based on available API keys:
1. Checks for `OPENAI_API_KEY`
2. Falls back to `ANTHROPIC_API_KEY`
3. Falls back to `GOOGLE_API_KEY`

Override with the `--provider` flag:

```bash
# Force Anthropic even if OpenAI key exists
uv run main.py --strategy llm --provider anthropic
```

#### Model selection

Each provider has sensible defaults:
- **OpenAI**: `gpt-4o-mini`
- **Anthropic**: `claude-3-5-sonnet-20241022`
- **Google**: `gemini-2.0-flash-exp`

Override with `--model` or the `LLM_MODEL` environment variable:

```bash
# Use a more powerful OpenAI model
uv run main.py --strategy llm --model gpt-4o

# Or via environment variable
export LLM_MODEL=claude-3-opus-20240229
uv run main.py --strategy llm --provider anthropic
```

### Architecture overview

If you are interested in the implementation details, here is a high-level
overview of the project structure:

```
sql2graph/
├── main.py                    # CLI entry point
├── core/
│   ├── migration_agent.py     # Main orchestration
│   └── hygm/                  # Graph modeling engine
│       ├── hygm.py            # HyGM core
│       ├── models/            # Data models
│       ├── strategies/        # Modeling strategies
│       └── validation/        # Validation system
├── database/
│   ├── analyzer.py            # Schema analysis
│   ├── factory.py             # Database adapter factory
│   └── adapters/              # DB-specific adapters
├── query_generation/
│   ├── cypher_generator.py    # Cypher query builder
│   └── schema_utilities.py    # Schema helpers
└── utils/
    ├── environment.py         # Env validation
    └── config.py              # Configuration
```

<CommunityLinks/>