@corylanou corylanou commented Sep 5, 2025

Summary

Adds directory replication support for multi-tenant applications where each tenant has their own SQLite database file.

Closes #42

Configuration

dbs:
  - directory: /var/lib/tenants
    pattern: "*.db"
    recursive: true
    replica:
      type: s3
      bucket: backups
      path: tenants

Replica Path Behavior

Each database gets a unique replica path by appending its relative path from the directory root:

Example:

directory: /var/lib/databases
replica path: backups/prod

Results:
  /var/lib/databases/tenant1.db        → backups/prod/tenant1.db/ltx/...
  /var/lib/databases/team-a/db2.db     → backups/prod/team-a/db2.db/ltx/...

This ensures isolated storage per database with no collision risk.
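The derivation above can be sketched in Go. This is an illustrative helper, not the PR's actual function (`replicaPathFor` is a hypothetical name): it computes the database's path relative to the scanned directory, normalizes it to forward slashes for object-store keys, and joins it onto the replica base path.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// replicaPathFor derives a unique replica path for a database by joining
// the configured base path with the database's path relative to the
// scanned directory root. filepath.ToSlash normalizes OS-specific
// separators so the result is usable as a cloud-storage key.
func replicaPathFor(dir, dbPath, basePath string) (string, error) {
	rel, err := filepath.Rel(dir, dbPath)
	if err != nil {
		return "", err
	}
	return path.Join(basePath, filepath.ToSlash(rel)), nil
}

func main() {
	p, _ := replicaPathFor("/var/lib/databases", "/var/lib/databases/team-a/db2.db", "backups/prod")
	fmt.Println(p) // backups/prod/team-a/db2.db
}
```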

Features

  • Scan directories for SQLite databases with configurable patterns (*.db, *.sqlite, etc.)
  • Recursive directory scanning
  • SQLite header validation (prevents replicating non-SQLite files)
  • Mix directory and single-database configs
  • Preserves directory structure in replica paths
  • Works with all storage backends (S3, GCS, ABS, SFTP, WebDAV, File, NATS)

Testing

  • Comprehensive test coverage for path uniqueness
  • Tests for subdirectory handling, duplicate filenames, special characters
  • All existing tests passing

@corylanou corylanou marked this pull request as draft September 5, 2025 22:56
@corylanou corylanou force-pushed the feature/directory-replication branch from bdc882a to 689debe Compare September 13, 2025 18:52
@corylanou corylanou marked this pull request as ready for review September 13, 2025 18:52
This implements the ability to replicate entire directories of SQLite databases,
perfect for multi-tenant applications where each tenant has their own database file.

Features:
- Scan directories for SQLite databases with configurable file patterns
- Support for recursive directory scanning
- Validate files by checking SQLite headers
- Mix single database and directory configurations in the same config file

Configuration example:
```yaml
dbs:
  - directory: /var/lib/myapp/databases
    pattern: "*.db"
    recursive: true
    replica:
      type: s3
      bucket: my-bucket
      path: backups
```

Implementation:
- Extended DBConfig with Directory, Pattern, and Recursive fields
- Added NewDBsFromDirectoryConfig() to create multiple DB instances
- Added FindSQLiteDatabases() for directory scanning
- Added IsSQLiteDatabase() for file validation
- Updated ReplicateCommand to handle directory configurations
- Added comprehensive test coverage

Documentation Note:
A follow-up issue should be created in the litestream.io repository to document:
- New configuration options (directory, pattern, recursive)
- Use cases for multi-tenant applications
- Examples of mixed single/directory configurations
- Current limitations (no dynamic discovery without restart)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@corylanou corylanou force-pushed the feature/directory-replication branch from 689debe to 5d204f8 Compare November 3, 2025 17:44
corylanou and others added 2 commits November 3, 2025 13:31
CRITICAL BUG FIX: Prevents data corruption in directory replication.

Problem:
- All databases in a directory were sharing the same replica path
- This caused databases to overwrite each other's LTX files
- Result: Complete data corruption and inability to restore

Root Cause:
- NewDBsFromDirectoryConfig performed shallow copy of DBConfig
- Replica/Replicas fields (pointers) were shared across all databases
- All databases wrote to same path (e.g., backups/databases/ltx/0/...)

Solution:
- Deep copy Replica and Replicas configs for each database
- Append relative path from directory root to replica base path
- Use filepath.ToSlash() to normalize paths for cloud storage
- Each database now gets unique isolated replica path

Path Behavior (Option B - Relative Path):
  Config:
    directory: /var/lib/databases
    replica path: backups/prod

  Results:
    /var/lib/databases/db1.db           → backups/prod/db1.db/ltx/...
    /var/lib/databases/team-a/db2.db    → backups/prod/team-a/db2.db/ltx/...
    /var/lib/databases/deep/dir/db3.db  → backups/prod/deep/dir/db3.db/ltx/...

This preserves directory structure and eliminates collision risk even
with duplicate filenames in different subdirectories.
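The essence of the fix is copying the config struct by value rather than sharing the pointer. A minimal sketch with a simplified stand-in for the replica config type (field names are illustrative, not Litestream's actual struct):

```go
package main

import (
	"fmt"
	"path"
)

// ReplicaConfig is a simplified stand-in; only Path matters here.
type ReplicaConfig struct {
	Type string
	Path string
}

// cloneWithRelativePath returns an independent copy of the replica
// config with the database's relative path appended. Dereferencing and
// copying the struct value is what prevents all databases from sharing
// (and overwriting) a single replica path.
func cloneWithRelativePath(rc *ReplicaConfig, rel string) *ReplicaConfig {
	clone := *rc // value copy: each DB gets its own config
	if rel != "" && rel != "." {
		clone.Path = path.Join(clone.Path, rel)
	}
	return &clone
}

func main() {
	base := &ReplicaConfig{Type: "s3", Path: "backups/prod"}
	a := cloneWithRelativePath(base, "db1.db")
	b := cloneWithRelativePath(base, "team-a/db2.db")
	fmt.Println(a.Path)    // backups/prod/db1.db
	fmt.Println(b.Path)    // backups/prod/team-a/db2.db
	fmt.Println(base.Path) // backups/prod (unchanged)
}
```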

Changes:
- cmd/litestream/main.go: Fix NewDBsFromDirectoryConfig()
  - Calculate relative path for each database
  - Deep copy replica configs
  - Append normalized relative path to replica base path
  - Handle both 'replica' and 'replicas' (deprecated) fields

- cmd/litestream/main_test.go: Add comprehensive tests
  - TestNewDBsFromDirectoryConfig_UniquePaths
  - TestNewDBsFromDirectoryConfig_SubdirectoryPaths
  - TestNewDBsFromDirectoryConfig_DuplicateFilenames
  - TestNewDBsFromDirectoryConfig_SpecialCharacters
  - TestNewDBsFromDirectoryConfig_EmptyBasePath
  - TestNewDBsFromDirectoryConfig_ReplicasArray

- etc/litestream-directory-example.yml: Document path behavior
  - Add detailed explanation of automatic path uniqueness
  - Show examples of resulting paths
  - Clarify directory structure preservation

Breaking Change Notice:
Since this feature currently causes data corruption, any existing
users of directory replication will need to:
1. Delete corrupted replicas
2. Update to this fixed version
3. Restart replication (will use new path structure)

Fixes #738 (P0 bug found during review)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Refactored the replica path cloning logic to be more robust and maintainable.

Improvements:
- Extract cloneReplicaConfigWithRelativePath() helper function
  - Better separation of concerns
  - Proper error handling and propagation
  - Handles both Path and URL configuration styles

- Add appendRelativePathToURL() for URL path manipulation
  - Ensures proper URL path rooting
  - Handles edge cases (empty paths, dots, etc.)

- Type-specific path handling:
  - File backend: Use OS-specific paths (filepath.Join)
  - Cloud backends: Use forward slashes (path.Join)
  - URL-based configs: Modify URL path component directly

- Better edge case handling:
  - Empty relative paths ("" or ".")
  - Nil replica configs
  - URL parsing errors

This refactoring makes the code cleaner, more testable, and easier to
understand while maintaining all existing functionality.
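The URL-manipulation half can be sketched with the standard library. This is an assumed shape for `appendRelativePathToURL`, not the PR's exact implementation: it parses the replica URL, joins the relative path onto the rooted path component, and re-serializes.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// appendRelativePathToURL joins a relative replica path onto the path
// component of a URL-style replica config. path.Join with a leading "/"
// keeps the result rooted and collapses empty segments and dots.
func appendRelativePathToURL(rawURL, rel string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if rel != "" && rel != "." {
		u.Path = path.Join("/", u.Path, rel)
	}
	return u.String(), nil
}

func main() {
	s, _ := appendRelativePathToURL("s3://my-bucket/backups", "team-a/db2.db")
	fmt.Println(s) // s3://my-bucket/backups/team-a/db2.db
}
```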

Co-Authored-By: Cory LaNou <cory@lanou.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@corylanou corylanou changed the title Add directory replication support feat: add directory replication support for multi-tenant databases Nov 3, 2025
Directory replication will be documented in the litestream.io repo
alongside other configuration examples rather than as a separate
example file here.
@corylanou commented:

Directory Replication Field Report

Environment

litestream binary: ./litestream (go build ./cmd/litestream)
litestream-test:   ./bin/litestream-test (go build -o bin/litestream-test ./cmd/litestream-test)
MinIO container:   docker run -d --name litestream-minio -p 9500:9000 -p 9600:9090 \
                       -e MINIO_ROOT_USER=litestream \
                       -e MINIO_ROOT_PASSWORD=litestream123 \
                       minio/minio server /data --console-address :9090
mc client:         /tmp/mc (curl download)
Bucket:            litestream-directory-tests

MinIO credentials were also provided in the Litestream config via the access-key-id and secret-access-key fields.

Local Directory Layout

SQLite databases populated under /tmp/litestream-dir-test-hJj7VN:

/tmp/litestream-dir-test-hJj7VN
├─ db1.db
├─ team-a/
│  └─ db2.db
├─ team-b/
│  └─ east/
│     └─ db3.db
└─ shared/
   └─ db4.db

Each database was created with ~5 MB of data via:

./bin/litestream-test populate -db <path> -target-size 5MB

Litestream Configuration

/tmp/litestream-dir-test-config.yml

logging:
  level: info
  type: text

dbs:
  - directory: /tmp/litestream-dir-test-hJj7VN
    pattern: "*.db"
    recursive: true
    replica:
      type: s3
      bucket: litestream-directory-tests
      path: directory-run
      region: us-east-1
      endpoint: http://127.0.0.1:9500
      force-path-style: true
      skip-verify: true
      access-key-id: litestream
      secret-access-key: litestream123

Replication Runs

Spin up Litestream using the config above:

./litestream replicate -config /tmp/litestream-dir-test-config.yml

Key log excerpts (/tmp/litestream-dir-test-run.log etc.):

time=… level=INFO msg="found databases in directory" directory=/tmp/litestream-dir-test-hJj7VN count=4
time=… level=INFO msg="replicating to" path=directory-run/db1.db …
time=… level=INFO msg="replicating to" path=directory-run/team-a/db2.db …
time=… level=INFO msg="replicating to" path=directory-run/team-b/east/db3.db …
time=… level=INFO msg="replicating to" path=directory-run/shared/db4.db …

Expected warnings seen when stopping the process early:

  • compaction failed … page size not initialized yet (harmless startup race)
  • failed to close database … transaction has already been committed or rolled back (signal during shutdown)

Load Generation

While replication ran, each database received write/read load from litestream-test:

for db in \
  "/tmp/litestream-dir-test-hJj7VN/db1.db" \
  "/tmp/litestream-dir-test-hJj7VN/team-a/db2.db" \
  "/tmp/litestream-dir-test-hJj7VN/team-b/east/db3.db" \
  "/tmp/litestream-dir-test-hJj7VN/shared/db4.db"; do
  ./bin/litestream-test load \
    -db "$db" \
    -duration 6s \
    -write-rate 40 \
    -workers 2 \
    -read-ratio 0.1
done

Results (aggregated from /tmp/litestream-dir-test-run-load*.log):

Database            Writes/sec  Reads/sec  Errors
db1.db              35–44       3–6        0
team-a/db2.db       35–44       3–6        0
team-b/east/db3.db  34–45       4–6        0
shared/db4.db       35–45      4–5        0

(All runs lasted 6–8 seconds with two workers and 1 KB payloads.)

Remote Bucket Snapshot

/tmp/mc tree --files litestream/litestream-directory-tests/directory-run

directory-run/
├─ db1.db/
│  ├─ 0000/0000000000000002-000000000000000a.ltx
│  └─ 0001/{0000000000000001-0000000000000001,…,0000000000000002-000000000000000a}.ltx
├─ team-a/db2.db/
│  ├─ 0000/0000000000000002-000000000000000a.ltx
│  ├─ 0001/0000000000000001-000000000000000a.ltx
│  └─ 0009/0000000000000001-0000000000000001.ltx
├─ team-b/east/db3.db/
│  ├─ 0000/…000a.ltx
│  ├─ 0001/0000000000000001-0000000000000007.ltx
│  └─ 0009/0000000000000001-0000000000000001.ltx
└─ shared/db4.db/
   ├─ 0000/…000a.ltx
   ├─ 0001/0000000000000001-0000000000000001.ltx
   └─ 0009/0000000000000001-0000000000000001.ltx

Each database replicates into a unique prefix derived from its relative path, confirming the directory-expansion fix works under active load.

Artifacts

  • Replication logs: /tmp/litestream-dir-test-run*.log
  • Load logs: /tmp/litestream-dir-test-run-load*.log
  • Config: /tmp/litestream-dir-test-config.yml
  • Remote tree snapshot: /tmp/mc tree --files …/directory-run

Cleanup Checklist

docker rm -f litestream-minio
rm -rf /tmp/litestream-dir-test-hJj7VN /tmp/litestream-dir-test-*.log /tmp/litestream-dir-test-config.yml
rm -f /tmp/mc litestream

All directory-replication databases replicated independently to MinIO, even with concurrent writes, and no path collisions were observed.

@benbjohnson benbjohnson (Owner) left a comment

This is a good first pass. The complicated thing about directory support is that many times databases are created after Litestream has been started so you'll need to add a file & directory watcher to check for new database files & to check for deleted databases.

Go ahead and merge this in and add watcher support as a separate PR.

- Rename Directory → Dir in config for brevity
- Make pattern field required (no default)
- Clear dir/pattern/recursive fields in individual DB configs

Addresses review feedback from @benbjohnson on PR #738.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@corylanou corylanou merged commit e87000c into main Nov 3, 2025
14 checks passed
@corylanou corylanou deleted the feature/directory-replication branch November 3, 2025 23:13
corylanou added a commit that referenced this pull request Nov 6, 2025
Implement real-time monitoring of directory replication paths using fsnotify.
The DirectoryMonitor automatically detects when SQLite databases are created
or removed from watched directories and dynamically adds/removes them from
replication without requiring restarts.

Key features:
- Automatic database discovery with pattern matching
- Support for recursive directory watching
- Thread-safe database lifecycle management
- New Store.AddDB() and Store.RemoveDB() methods for dynamic management
- Comprehensive integration tests for lifecycle validation

This enhancement builds on the existing directory replication feature (#738)
by making it fully dynamic for use cases like multi-tenant SaaS where
databases are created and destroyed frequently.
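The thread-safe lifecycle management behind the new Store.AddDB()/Store.RemoveDB() methods can be sketched as a mutex-guarded registry (a hypothetical simplification; the real Store also starts and stops replication for each database):

```go
package main

import (
	"fmt"
	"sync"
)

// store sketches the bookkeeping behind dynamic add/remove: the
// directory monitor calls these methods from its event loop as
// databases appear and disappear, so access must be serialized.
type store struct {
	mu  sync.Mutex
	dbs map[string]struct{} // keyed by database path
}

func newStore() *store { return &store{dbs: make(map[string]struct{})} }

// AddDB registers a database path; it reports false if the path is
// already being replicated.
func (s *store) AddDB(path string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.dbs[path]; ok {
		return false
	}
	s.dbs[path] = struct{}{}
	return true
}

// RemoveDB deregisters a database path; it reports false if the path
// was not being replicated.
func (s *store) RemoveDB(path string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.dbs[path]; !ok {
		return false
	}
	delete(s.dbs, path)
	return true
}

func main() {
	s := newStore()
	fmt.Println(s.AddDB("/var/lib/tenants/a.db"))    // true
	fmt.Println(s.AddDB("/var/lib/tenants/a.db"))    // false (duplicate)
	fmt.Println(s.RemoveDB("/var/lib/tenants/a.db")) // true
}
```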

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
corylanou added a commit that referenced this pull request Nov 8, 2025 (same directory-monitor commit message as above)
corylanou added a commit that referenced this pull request Nov 11, 2025 (same commit message)
corylanou added a commit that referenced this pull request Dec 12, 2025 (same commit message)