feat: add directory replication support for multi-tenant databases #738
This implements the ability to replicate entire directories of SQLite databases,
perfect for multi-tenant applications where each tenant has their own database file.
Features:
- Scan directories for SQLite databases with configurable file patterns
- Support for recursive directory scanning
- Validate files by checking SQLite headers
- Mix single database and directory configurations in the same config file
Configuration example:
```yaml
dbs:
  - directory: /var/lib/myapp/databases
    pattern: "*.db"
    recursive: true
    replica:
      type: s3
      bucket: my-bucket
      path: backups
```
Implementation:
- Extended DBConfig with Directory, Pattern, and Recursive fields
- Added NewDBsFromDirectoryConfig() to create multiple DB instances
- Added FindSQLiteDatabases() for directory scanning
- Added IsSQLiteDatabase() for file validation
- Updated ReplicateCommand to handle directory configurations
- Added comprehensive test coverage
Documentation Note:
A follow-up issue should be created in the litestream.io repository to document:
- New configuration options (directory, pattern, recursive)
- Use cases for multi-tenant applications
- Examples of mixed single/directory configurations
- Current limitations (no dynamic discovery without restart)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
CRITICAL BUG FIX: Prevents data corruption in directory replication.
Problem:
- All databases in a directory were sharing the same replica path
- This caused databases to overwrite each other's LTX files
- Result: Complete data corruption and inability to restore
Root Cause:
- NewDBsFromDirectoryConfig performed shallow copy of DBConfig
- Replica/Replicas fields (pointers) were shared across all databases
- All databases wrote to same path (e.g., backups/databases/ltx/0/...)
Solution:
- Deep copy Replica and Replicas configs for each database
- Append relative path from directory root to replica base path
- Use filepath.ToSlash() to normalize paths for cloud storage
- Each database now gets unique isolated replica path
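The root cause is easy to reproduce in isolation. The sketch below uses simplified, hypothetical `DBConfig` and `ReplicaConfig` types (the real structs have more fields) to show why copying a struct that holds a pointer leaves every copy sharing one replica config, and how copying the pointed-to value avoids that.

```go
package main

import "fmt"

// ReplicaConfig and DBConfig are simplified stand-ins for illustration only.
type ReplicaConfig struct {
	Type, Bucket, Path string
}

type DBConfig struct {
	Path    string
	Replica *ReplicaConfig
}

// shallowCopy copies the struct, so the Replica pointer is shared.
func shallowCopy(base DBConfig, dbPath string) DBConfig {
	c := base
	c.Path = dbPath
	return c
}

// deepCopy also copies the value behind the Replica pointer.
func deepCopy(base DBConfig, dbPath string) DBConfig {
	c := base
	c.Path = dbPath
	if base.Replica != nil {
		r := *base.Replica
		c.Replica = &r
	}
	return c
}

func main() {
	base := DBConfig{Replica: &ReplicaConfig{Type: "s3", Bucket: "my-bucket", Path: "backups"}}

	a, b := shallowCopy(base, "a.db"), shallowCopy(base, "b.db")
	a.Replica.Path = "backups/a.db"
	fmt.Println("shallow:", b.Replica.Path) // b observes a's change: same pointer

	base.Replica.Path = "backups"
	c, d := deepCopy(base, "c.db"), deepCopy(base, "d.db")
	c.Replica.Path = "backups/c.db"
	fmt.Println("deep:", d.Replica.Path) // d keeps its own copy
}
```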
Path Behavior (Option B - Relative Path):
Config:
  directory: /var/lib/databases
  replica path: backups/prod
Results:
  /var/lib/databases/db1.db → backups/prod/db1.db/ltx/...
  /var/lib/databases/team-a/db2.db → backups/prod/team-a/db2.db/ltx/...
  /var/lib/databases/deep/dir/db3.db → backups/prod/deep/dir/db3.db/ltx/...
This preserves directory structure and eliminates collision risk even
with duplicate filenames in different subdirectories.
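The mapping above can be sketched with three standard library calls. The helper name `replicaPathFor` is an assumption for this example, not a function from the PR, and the `ltx/...` suffix shown in the results is assumed to be appended later by the replica itself.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// replicaPathFor is a hypothetical helper. It derives a per-database replica
// path by appending the database's path relative to dirRoot onto basePath,
// using forward slashes as expected by cloud storage keys.
func replicaPathFor(dirRoot, dbPath, basePath string) (string, error) {
	rel, err := filepath.Rel(dirRoot, dbPath)
	if err != nil {
		return "", err
	}
	return path.Join(basePath, filepath.ToSlash(rel)), nil
}

func main() {
	root := "/var/lib/databases"
	for _, db := range []string{
		"/var/lib/databases/db1.db",
		"/var/lib/databases/team-a/db2.db",
		"/var/lib/databases/deep/dir/db3.db",
	} {
		p, err := replicaPathFor(root, db, "backups/prod")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(db, "->", p)
	}
}
```

Because the relative path includes subdirectories, two files named `db.db` in different subdirectories still map to different prefixes.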
Changes:
- cmd/litestream/main.go: Fix NewDBsFromDirectoryConfig()
  - Calculate relative path for each database
  - Deep copy replica configs
  - Append normalized relative path to replica base path
  - Handle both 'replica' and 'replicas' (deprecated) fields
- cmd/litestream/main_test.go: Add comprehensive tests
  - TestNewDBsFromDirectoryConfig_UniquePaths
  - TestNewDBsFromDirectoryConfig_SubdirectoryPaths
  - TestNewDBsFromDirectoryConfig_DuplicateFilenames
  - TestNewDBsFromDirectoryConfig_SpecialCharacters
  - TestNewDBsFromDirectoryConfig_EmptyBasePath
  - TestNewDBsFromDirectoryConfig_ReplicasArray
- etc/litestream-directory-example.yml: Document path behavior
  - Add detailed explanation of automatic path uniqueness
  - Show examples of resulting paths
  - Clarify directory structure preservation
Breaking Change Notice:
Since this feature currently causes data corruption, any existing
users of directory replication will need to:
1. Delete corrupted replicas
2. Update to this fixed version
3. Restart replication (will use new path structure)
Fixes #738 (P0 bug found during review)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Refactored the replica path cloning logic to be more robust and maintainable.
Improvements:
- Extract cloneReplicaConfigWithRelativePath() helper function
  - Better separation of concerns
  - Proper error handling and propagation
  - Handles both Path and URL configuration styles
- Add appendRelativePathToURL() for URL path manipulation
  - Ensures proper URL path rooting
  - Handles edge cases (empty paths, dots, etc.)
- Type-specific path handling:
  - File backend: Use OS-specific paths (filepath.Join)
  - Cloud backends: Use forward slashes (path.Join)
  - URL-based configs: Modify URL path component directly
- Better edge case handling:
  - Empty relative paths ("" or ".")
  - Nil replica configs
  - URL parsing errors
This refactoring makes the code cleaner, more testable, and easier to
understand while maintaining all existing functionality.
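As an illustration of the URL case described above, the following is a minimal sketch of appending a relative path to a replica URL. It is not the PR's `appendRelativePathToURL()`; the name `appendRelToURL` and the exact edge-case behavior are assumptions for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// appendRelToURL is a hypothetical helper. It appends rel to the path
// component of rawURL, keeping the result rooted at "/". An empty or "."
// relative path leaves the URL unchanged.
func appendRelToURL(rawURL, rel string) (string, error) {
	if rel == "" || rel == "." {
		return rawURL, nil
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse replica url: %w", err)
	}
	// path.Join always uses forward slashes, which suits URL paths.
	u.Path = path.Join("/", u.Path, rel)
	return u.String(), nil
}

func main() {
	for _, rel := range []string{"db1.db", "team-a/db2.db", "."} {
		out, err := appendRelToURL("s3://my-bucket/backups", rel)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(rel, "->", out)
	}
}
```

Modifying only the parsed path component leaves the scheme, host, and any query parameters of the replica URL untouched.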
Co-Authored-By: Cory LaNou <cory@lanou.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Directory replication will be documented in the litestream.io repo alongside other configuration examples rather than as a separate example file here.
Directory Replication Field Report
Environment
MinIO credentials were also exported into the Litestream config using the
Local Directory Layout
SQLite databases populated under
Each database was created with ~5 MB of data via:
./bin/litestream-test populate -db <path> -target-size 5MB
Litestream Configuration
logging:
  level: info
  type: text
dbs:
  - directory: /tmp/litestream-dir-test-hJj7VN
    pattern: "*.db"
    recursive: true
    replica:
      type: s3
      bucket: litestream-directory-tests
      path: directory-run
      region: us-east-1
      endpoint: http://127.0.0.1:9500
      force-path-style: true
      skip-verify: true
      access-key-id: litestream
      secret-access-key: litestream123
Replication Runs
Spin up Litestream using the config above:
./litestream replicate -config /tmp/litestream-dir-test-config.yml
Key log excerpts (
Expected warnings seen when stopping the process early:
Load Generation
While replication ran, each database received write/read load from
for db in \
  "/tmp/litestream-dir-test-hJj7VN/db1.db" \
  "/tmp/litestream-dir-test-hJj7VN/team-a/db2.db" \
  "/tmp/litestream-dir-test-hJj7VN/team-b/east/db3.db" \
  "/tmp/litestream-dir-test-hJj7VN/shared/db4.db"; do
  ./bin/litestream-test load \
    -db "$db" \
    -duration 6s \
    -write-rate 40 \
    -workers 2 \
    -read-ratio 0.1
done
Results (aggregated from
(All runs lasted 6–8 seconds with two workers and 1 KB payloads.)
Remote Bucket Snapshot
Each database replicates into a unique prefix derived from its relative path, confirming the directory-expansion fix works under active load.
Artifacts
Cleanup Checklist
docker rm -f litestream-minio
rm -rf /tmp/litestream-dir-test-hJj7VN /tmp/litestream-dir-test-*.log /tmp/litestream-dir-test-config.yml
rm -f /tmp/mc litestream
All directory-replication databases replicated independently to MinIO, even with concurrent writes, and no path collisions were observed.
benbjohnson
left a comment
This is a good first pass. The complicated thing about directory support is that many times databases are created after Litestream has been started so you'll need to add a file & directory watcher to check for new database files & to check for deleted databases.
Go ahead and merge this in and add watcher support as a separate PR.
- Rename Directory → Dir in config for brevity
- Make pattern field required (no default)
- Clear dir/pattern/recursive fields in individual DB configs
Addresses review feedback from @benbjohnson on PR #738.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implement real-time monitoring of directory replication paths using fsnotify. The DirectoryMonitor automatically detects when SQLite databases are created or removed from watched directories and dynamically adds/removes them from replication without requiring restarts.
Key features:
- Automatic database discovery with pattern matching
- Support for recursive directory watching
- Thread-safe database lifecycle management
- New Store.AddDB() and Store.RemoveDB() methods for dynamic management
- Comprehensive integration tests for lifecycle validation
This enhancement builds on the existing directory replication feature (#738) by making it fully dynamic for use cases like multi-tenant SaaS where databases are created and destroyed frequently.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Adds directory replication support for multi-tenant applications where each tenant has their own SQLite database file.
Closes #42
Configuration
Replica Path Behavior
Each database gets a unique replica path by appending its relative path from the directory root:
Example:
This ensures isolated storage per database with no collision risk.
Features
- Configurable file patterns (*.db, *.sqlite, etc.)
Testing