@corylanou

Summary

This PR introduces a comprehensive documentation system specifically designed for AI agents (Claude, GitHub Copilot, Cursor, etc.) working with the Litestream codebase.

Why In-Repo Documentation vs litestream.io?

These docs are intentionally placed in the repository rather than on the litestream.io website for several critical reasons:

1. AI Agent Context Window Access

  • AI agents analyzing code need immediate access to technical documentation
  • In-repo docs are automatically included in the agent's context when browsing the codebase
  • Reduces need for external lookups that break workflow

2. Version Synchronization

  • Documentation stays in sync with code changes
  • PRs can update both code and relevant documentation atomically
  • No lag between implementation changes and documentation updates

3. PR Review Validation

  • Reviewers can verify that AI agents are following correct patterns
  • Documentation serves as a contract for expected AI behavior
  • Reduces incorrect assumptions and hallucinations

4. Reduced AI Hallucination

  • An authoritative in-repo reference gives agents verified details to rely on instead of guessing

What's Included

Core Documentation

  • AGENT.md: Main entry point with architecture overview and common pitfalls
  • docs/SQLITE_INTERNALS.md: SQLite fundamentals including WAL format and the critical 1GB lock page
  • docs/LTX_FORMAT.md: Complete LTX format specification with binary layouts
  • docs/ARCHITECTURE.md: Deep technical dive into Litestream components
  • docs/REPLICA_CLIENT_GUIDE.md: Storage backend implementation guide
  • docs/TESTING_GUIDE.md: Comprehensive testing strategies including >1GB database tests
  • docs/V050_CHANGES.md: v0.5.0 migration guide and breaking changes

Key Features

  • 🎯 Addresses Real Issues: Documents solutions to actual problems from recent PRs
  • 📊 Visual Understanding: Mermaid diagrams for architecture and data flow
  • ⚠️ Anti-Patterns: Clear DO and DON'T examples based on real mistakes
  • 🔍 Critical Edge Cases: Emphasizes 1GB lock page, eventual consistency, etc.
  • 🚀 v0.5.0 Aligned: Updated for single replica, new compaction levels, pure Go

Impact on AI-Generated PRs

With this documentation, AI agents will:

  1. Understand the critical SQLite lock page at 1GB (page 262145 for 4KB pages)
  2. Know to read from local files during compaction (not remote)
  3. Use proper locking patterns (Lock, not RLock, for writes)
  4. Preserve CreatedAt timestamps during compaction
  5. Understand v0.5.0 constraints (single replica only)
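
The page number in item 1 follows from SQLite's locking region, which starts at byte offset 0x40000000 (1 GiB); the page containing that offset is the lock page. The following is a minimal Go sketch of that arithmetic. The helper names `lockPageNumber` and `isLockPage` are hypothetical and chosen for illustration; they are not taken from the Litestream codebase.

```go
package main

import "fmt"

// pendingByte is the byte offset where SQLite's locking region starts (1 GiB).
const pendingByte int64 = 0x40000000

// lockPageNumber returns the 1-based number of the page that contains
// pendingByte for the given page size. Hypothetical helper for illustration.
func lockPageNumber(pageSize int64) int64 {
	return pendingByte/pageSize + 1
}

// isLockPage reports whether pgno is the lock page for the given page size.
// Code that walks database pages would skip the page when this returns true.
func isLockPage(pgno, pageSize int64) bool {
	return pgno == lockPageNumber(pageSize)
}

func main() {
	for _, size := range []int64{4096, 8192, 65536} {
		fmt.Printf("page size %d: lock page %d\n", size, lockPageNumber(size))
	}
}
```

For 4KB pages this yields 262145, matching the figure in item 1. The lock page only exists in databases larger than 1 GiB, which is why tests with >1GB databases matter.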

Testing

  • All markdown files pass linting
  • Documentation has been validated against current codebase
  • References to line numbers and functions have been verified

Removed

  • docs/RELEASE.md: Outdated release documentation (superseded by new process)

This documentation system ensures that AI agents have the deep understanding of SQLite internals and LTX format necessary to make correct modifications to the Litestream codebase, reducing the review burden on maintainers.

corylanou and others added 7 commits October 10, 2025 12:24
This commit introduces a complete documentation system specifically designed
for AI agents working with the Litestream codebase.

Added:
- AGENT.md: Main entry point with architecture overview and common pitfalls
- docs/SQLITE_INTERNALS.md: SQLite fundamentals including WAL and lock page
- docs/LTX_FORMAT.md: Complete LTX format specification
- docs/ARCHITECTURE.md: Deep technical dive into components
- docs/REPLICA_CLIENT_GUIDE.md: Storage backend implementation guide
- docs/TESTING_GUIDE.md: Comprehensive testing strategies
- docs/V050_CHANGES.md: v0.5.0 migration guide and breaking changes

Key features:
- Emphasizes critical concepts like 1GB lock page handling
- Documents common pitfalls from recent PRs (#760, #748)
- Aligns with v0.5.0 changes (single replica, new compaction levels)
- Provides mermaid diagrams for visual understanding
- Includes anti-patterns and correct approaches

Removed:
- docs/RELEASE.md: Outdated release documentation

This documentation lives in the repo rather than on litestream.io because:
1. AI agents need immediate access to technical details during code analysis
2. Docs can be versioned alongside code changes
3. PR reviewers can verify AI understanding matches implementation
4. Reduces hallucination by providing authoritative in-repo reference

Based on Ben Johnson's feedback in PR #783, added comprehensive documentation about:

- Architectural boundaries between DB and Replica layers
- Proper placement of database restoration logic (DB.init() not Replica.Start())
- Atomic file operations pattern (temp file + rename)
- Proper error handling (return errors, don't just log and continue)
- Leveraging existing mechanisms (e.g., verify() for snapshots)

These patterns help AI agents understand proper Litestream architecture and avoid common mistakes when contributing fixes.

Remove specific PR references to make the documentation stand on its own
as architectural guidance rather than historical context. The documentation
should focus on patterns and anti-patterns, not where we learned about them.

Adopt AGENTS.md standard for universal AI agent support across Claude,
GitHub Copilot, Cursor, Gemini, and other AI coding assistants.

Changes:
- Renamed AGENT.md to AGENTS.md (emerging standard)
- Added agent-specific sections for each major AI assistant
- Created llms.txt index for universal documentation discovery
- Added symlinks for tool compatibility (.cursorrules, copilot-instructions.md)
- Created GEMINI.md for Gemini-specific configuration
- Added .aiexclude for Gemini file filtering (like .gitignore)

This unified approach ensures consistent AI assistance across all major
coding assistants while minimizing documentation maintenance overhead.

Implement comprehensive Claude Code support infrastructure:

Agents (.claude/agents/):
- sqlite-expert: SQLite WAL and page management expertise
- replica-client-developer: Storage backend implementation guide
- ltx-compaction-specialist: LTX format and compaction expert
- test-engineer: Comprehensive testing strategies
- performance-optimizer: Performance and resource optimization

Commands (.claude/commands/):
- analyze-ltx: Analyze LTX file structure
- debug-wal: Debug WAL replication issues
- test-compaction: Test compaction scenarios
- trace-replication: Trace replication flow
- validate-replica: Validate replica implementations
- add-storage-backend: Create new storage backends
- fix-common-issues: Diagnose and fix common problems
- run-comprehensive-tests: Execute full test suite

Configuration:
- Force include .claude/ in git despite global gitignore
- Exclude logs, hooks, and local settings from commits
- Update CLAUDE.md to reference .claude resources

This provides Claude Code with specialized knowledge and tools for
effective Litestream development and debugging.

User updates to align documentation with actual implementation:
- Updated .claude/agents with current interface signatures and patterns
- Updated .claude/commands with correct command patterns and workflows
- Aligned AGENTS.md with current constraints and architectural boundaries
- Updated technical documentation (LTX_FORMAT, ARCHITECTURE, REPLICA_CLIENT_GUIDE, TESTING_GUIDE)
- Removed outdated V050_CHANGES.md
- Updated llms.txt index with correct file references

Net change: -212 lines (significant cleanup and consolidation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Fixed parse error in Layer Responsibilities diagram:
- Removed periods from node labels (DB.pos → DB position)
- HTML-escaped parentheses in method names for safer parsing

Resolves: "Parse error on line 3: got 'PS'" in Mermaid renderer

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>