Merged
70 changes: 51 additions & 19 deletions docsrc/enrolling.qmd
@@ -3,36 +3,68 @@ title: "Enrolling an existing db"
order: 2
---

If you use fastmigrate with valid migration scripts, fastmigrate can guarantee the version of your database presented to your application code. This is the value of managed migrations.
## Enroll an existing db

However, to provide this guarantee, you need your database to be managed by fastmigrate. If you created the database with fastmigrate (using `create_db` or the `fastmigrate_create_db` CLI command), then it is managed.
Here's how to add an existing db to be managed by fastmigrate.

But what if you are starting with an application built outside of fastmigrate, and you want to _enroll_ the database in fastmigrate? Here is how to think about it, and how to do it correctly:
0. Back up your database, of course.

To clarify the background: the key invariant which we need to maintain is this: *any database which has a fastmigrate version number (like 1, or like 3) is exactly in the state which would be produced by the migration script with that version number (like by `0001-initialize.sql` or `0003-unify-users.sql`).*
1. Run `fastmigrate_enroll_db --db /path/to/database.db` against your current database.

Now if you create a db with fastmigrate, it is created with version 0, and the version only advances as a result of running migration scripts. So this maintains the invariant.
This will modify your database, marking it as version 1 by adding a `_meta` table.

It will also generate an initial migration script, `0001-initialize.sql`, which is for creating an empty "version 1" database with the same schema as your current database.

2. Move your current database to somewhere safe, update your application's db setup code to use fastmigrate, and run it.

But if you are enrolling an existing db into fastmigrate, then you need to do three things.
``` python
import fastmigrate
fastmigrate.create_db("/path/to/database.db")
fastmigrate.run_migrations("/path/to/database.db")
# application code continues from here
```

- First, write a migration script `0001-initialize.sql` which will produce the schema of the database which you are working with right now.
Since you moved your real db, fastmigrate will create a new db at that path based on the initial migration script.

3. Check your app, to see if it is working fine.

Why? You need this so that, when you are starting fresh instances of your application, fastmigrate can create a database which is equivalent to what you have created now. The easiest way to create this script is to run `sqlite3 data.db .schema > 0001-initialize.sql` on your current database, which will create a sql file `0001-initialize.sql` which creates a fresh db with the same schema as your current db.
If it is, your initialization script is correct, and you can move your real database back into place.

This is now your first migration script. Because it matches the current state of your current database, it will not be run on your current database. But it will ensure that newly created databases match your current database.
If not, then you will need to edit that initialization script, so that it produces a database which is equivalent to your current database.

From an abundance of caution, you should use it to create a db and confirm that it is indeed equivalent to your current db.
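
One way to carry out that equivalence check is sketched below, using only Python's standard `sqlite3` module. The helper names `schema_of` and `same_schema` are hypothetical (they are not part of fastmigrate), and the sketch assumes that two databases count as equivalent when their stored `CREATE` statements match after whitespace normalization, ignoring the `_meta` table which fastmigrate adds.

```python
import sqlite3


def schema_of(db_path, ignore=("_meta",)):
    """Return a set of (type, name, normalized SQL) for the schema objects in a db.

    Hypothetical helper, not a fastmigrate API. The `_meta` table is ignored
    by default, so an enrolled db can be compared with an unenrolled one.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT type, name, sql FROM sqlite_master WHERE sql IS NOT NULL"
        ).fetchall()
    finally:
        conn.close()
    return {
        (kind, name, " ".join(sql.split()))  # collapse runs of whitespace
        for kind, name, sql in rows
        if not name.startswith("sqlite_") and name not in ignore
    }


def same_schema(db_a, db_b):
    """True if both databases define the same tables, indexes, views and triggers."""
    return schema_of(db_a) == schema_of(db_b)
```

A call such as `same_schema("/path/to/backup.db", "/path/to/database.db")` would compare your backed-up database with the one fastmigrate created. This compares the schema only, so it says nothing about any rows your application expects to find.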

- Second, manually modify your current database to add a fastmigrate version tag and set its version to 1. You can do this by using fastmigrate's internal API. Doing this constitutes asserting that the db is in fact in the state which would be produced by the migration script 0001. After doing this, fastmigrate will recognize your db as managed. Here is how to do it:
## The reason for this procedure

As long as you use fastmigrate with valid migration scripts, fastmigrate can guarantee which version of your database is presented to your application code. This is the value of managed migrations.

However, to provide this guarantee, you need your database to be managed by fastmigrate -- that is, to be explicitly marked with a version, in its `_meta` table. If you created the database with fastmigrate (using `create_db` or the `fastmigrate_create_db` CLI command), then it is managed.

But what if you are starting with an application built outside of fastmigrate, and you want to _enroll_ the database in fastmigrate?

To recap the basic idea of what migrations are, the fundamental guarantee which we need to maintain is this: *any database which has a fastmigrate version number (like 1, or like 3) is in the state which would be produced by the migration script with that version number (like by `0001-initialize.sql` or `0003-unify-users.sql`).*

So when enrolling an existing db, you need to assign a version to the db you already have. But since that version number takes its meaning from the migration script which _would_ produce it, you also need to create a migration script which would produce a database like yours. That script is also practically useful. If you ever want to deploy a new instance of your database, or run fresh instances for debugging, you need that initialization script to create the initial, empty state of a db for your application to use.

```python
from fastmigrate.core import _ensure_meta_table, _set_db_version
_ensure_meta_table("path/to/data.db")
_set_db_version("path/to/data.db", 1)
```
`fastmigrate_enroll_db` is merely a helper for those tasks. It marks your database, and generates an initialization migration script.

### One reason enrollment needs manual inspection

Why is this not 100% automatic?

The tool generates the migration script based on the _schema_ of your existing database. In many cases, that is all that matters for defining the version of a database, because the schema is all that the application code depends on.

However, this will not be enough if your application requires not only a particular table schema, but also certain _initial data values_ to be present. In that case you will need to add code to the initialization script which not only creates the necessary tables but also inserts those values.

For instance, if your application code merely required a `user` table which tracked settings, you would expect a line like this:

``` sql
CREATE TABLE user (id INTEGER, settings TEXT);
```

- Third, update your application code.
But if your application code also required that the database start with one row in that table, defining a user with an ID of 1 and settings which were an empty pair of braces, then you would also add a line like so:

You should update it so that it no longer manually creates and initializes a database if it is missing by itself (as it might do now), but instead uses fastmigrate to create the db and to run the migrations, as is shown in the readme. You should check the migration scripts into version control alongside your application code. Your application code should now all be written under the assumption that it will find the database in the state defined by the highest-numbered migration script in the repo.

``` sql
INSERT INTO user VALUES (1, '{}');
```

This subtlety is a reason why it is not strictly accurate to say that migration versions exist only to track the schema. In fact, they define versions which should track what the application code expects, which likely includes the schema but not only the schema.
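
The gap between a schema-only script and what the application expects can be seen in a small sketch using Python's standard `sqlite3` module. The helper `dump_schema` is hypothetical and only approximates what a schema export does; it is not fastmigrate's own implementation. The in-memory database stands in for an existing application database with one required row.

```python
import sqlite3


def dump_schema(conn):
    """Collect the CREATE statements SQLite stored for this database.

    Hypothetical helper for illustration, not a fastmigrate API.
    """
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE sql IS NOT NULL"
    ).fetchall()
    return "\n".join(f"{sql};" for name, sql in rows if not name.startswith("sqlite_"))


# Stand-in for an existing application database that relies on one seeded row.
existing = sqlite3.connect(":memory:")
existing.executescript("""
CREATE TABLE user (id INTEGER, settings TEXT);
INSERT INTO user VALUES (1, '{}');
""")

# A schema-only script recreates the table but not the row.
schema_only = dump_schema(existing)
fresh = sqlite3.connect(":memory:")
fresh.executescript(schema_only)
rows_schema_only = fresh.execute("SELECT id, settings FROM user").fetchall()

# Appending the seed row by hand gives a fresh db the data the app expects.
seeded = sqlite3.connect(":memory:")
seeded.executescript(schema_only + "\nINSERT INTO user VALUES (1, '{}');")
rows_seeded = seeded.execute("SELECT id, settings FROM user").fetchall()
```

Here `rows_schema_only` is empty while `rows_seeded` holds the required row, which is why a generated initialization script needs a manual look before you rely on it.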

13 changes: 8 additions & 5 deletions fastmigrate/cli.py
@@ -134,20 +134,23 @@ def enroll_db(
migrations: Path = DEFAULT_MIGRATIONS, # Path to the migrations directory
config_path: Path = DEFAULT_CONFIG # Path to config file
) -> None:
"""Enroll an existing SQLite database for versioning, adding a default initial migration, then running it.
"""Enroll an existing SQLite database for versioning, and generate a draft initial migration.

Note: command line arguments take precedence over values from a
config file, unless they are equal to default values.
"""
db_path, migrations_path = _get_config(config_path, db, migrations)
try:
db_version = core.get_db_version(db_path)
print(f"Cannot enroll, since this database is already managed.\nIt is marked as version {db_version}")
sys.exit(1)
except sqlite3.Error: pass
if not migrations_path.exists(): migrations_path.mkdir(parents=True)
initial_migration = migrations_path / "0001-initial.sql"
initial_migration = migrations_path / "0001-initialize.sql"
schema = core.get_db_schema(db_path)
initial_migration.write_text(schema)
core._ensure_meta_table(db_path)
success = core.run_migrations(db_path, migrations_path, verbose=True)
if not success:
sys.exit(1)
core._set_db_version(db_path,1)


@call_parse
5 changes: 3 additions & 2 deletions tests/test_cli.py
@@ -328,7 +328,7 @@ def test_cli_enroll_db_already_versioned(tmp_path):
], capture_output=True, text=True)

# Should exit with non-zero status because the database is already versioned
assert result.returncode == 0
assert result.returncode != 0

# Verify the database version wasn't changed
conn = sqlite3.connect(db_path)
@@ -515,4 +515,5 @@ def test_cli_with_testsuite_a(tmp_path):
)
assert cursor.fetchone() is not None

conn.close()
conn.close()