95 changes: 95 additions & 0 deletions .changeset/perf-indexing-ux-improvements.md
@@ -0,0 +1,95 @@
---
"@lytics/dev-agent-core": minor
"@lytics/dev-agent-cli": minor
"@lytics/dev-agent": patch
---

Massive indexing performance and UX improvements

**Performance Optimizations (184% faster):**
- **63x faster metadata collection**: Eliminated 863 individual git calls by using a single batched git command
- **Removed storage size calculation**: Deferred to on-demand in `dev stats` (saves 1-3s)
- **Simplified ownership tracking**: Author contributions now calculated on-demand in `dev owners` (1s), removed SQLite pre-indexing overhead
- **Total speedup**: Indexing now completes in ~33s vs ~95s (61s improvement!)
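
The batching idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parseBatchedLog` and the `FileMeta` shape are hypothetical names, and the input is assumed to be the output of one `git log --name-only --format=%x00%ae%x09%aI` call (a NUL-prefixed header per commit, followed by the paths it touched).

```typescript
// Hypothetical shape; the real metadata in dev-agent-core may differ.
interface FileMeta {
  commits: number; // change frequency, derived from author contributions
  authors: Map<string, number>; // author email -> commits touching this file
  lastModified: string; // ISO date of the newest commit touching this file
}

// Parses the output of a single batched call (assumed format):
//   git log --name-only --format=%x00%ae%x09%aI
function parseBatchedLog(raw: string): Map<string, FileMeta> {
  const files = new Map<string, FileMeta>();
  for (const chunk of raw.split('\0')) {
    const lines = chunk.split('\n').filter((line) => line.length > 0);
    const header = lines.shift();
    if (!header) continue; // text before the first NUL is empty
    const [email, date] = header.split('\t');
    if (!email || !date) continue; // skip malformed headers
    for (const path of lines) {
      // git log lists newest commits first, so the first date seen is the latest.
      const meta: FileMeta = files.get(path) ?? {
        commits: 0,
        authors: new Map(),
        lastModified: date,
      };
      meta.commits += 1;
      meta.authors.set(email, (meta.authors.get(email) ?? 0) + 1);
      files.set(path, meta);
    }
  }
  return files;
}
```

One pass over one process's output replaces a separate `git` invocation per file, which is where a saving of this kind would come from.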

**Architecture Simplifications:**
- Removed `file_authors` SQLite table (on-demand is fast enough)
- Removed `appendFileAuthors()` and `getFileAuthors()` from MetricsStore
- Removed `authorContributions` from IndexUpdatedEvent
- Cleaner separation: metrics for analytics, ownership for developer insights

**UX Improvements (no more silent gaps):**
- **Section-based progress display**: Clean, informative output inspired by Homebrew/Cargo
- **Applied to 4 commands**: `dev index`, `dev update`, `dev git index`, `dev github index`
- **Live progress updates**: Shows current progress for each phase (scanning, embedding, git, GitHub)
- **Clean indexing plan**: Removed INFO timestamps from plan display
- **Helpful next steps**: Suggests relevant commands after indexing completes
- **More frequent scanner progress**: Logs every 2 batches OR every 10 seconds (was every 50 files)
- **Slow file detection**: Debug logs for files/batches taking >5s to process
- **Cleaner completion summary**: Removed storage size from index output (shown in `dev stats` instead)
- **Continuous feedback**: Maximum 1-second gaps between progress updates
- **Context-aware `dev owners` command**: Adapts output based on git status and current directory
- **Changed files mode**: Shows ownership of uncommitted changes with real-time git log analysis
- **Root directory mode**: High-level overview of top areas (packages/cli/, packages/core/)
- **Subdirectory mode**: Detailed expertise for specific area
- **Smart ownership display**: Asymmetric icons that only flag exceptions (⚠️ for others' files, 🆕 for new files)
- **Last touched timestamps**: Shows when files were last modified (catches stale code and active development)
- **Recent activity detection**: Warns when others recently touched your files (prevents conflicts)
- **Suggested reviewers**: Automatically identifies who to loop in for code reviews
- **Visual hierarchy**: Tree branches (├─, └─) and emojis (📝, 📁, 👤) for better readability
- **Activity-focused**: Sorted by last active, not file count (no more leaderboard vibes)
- **Git root detection**: Works from any subdirectory within the repository
- **Better developer grouping**: `dev owners` now groups by GitHub handle instead of email (merges multiple emails for same developer)
- **Graceful degradation**: Verbose mode and non-TTY environments show traditional log output
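
The "every 2 batches OR every 10 seconds" rule for scanner progress could look something like the sketch below. `ProgressThrottle` is a hypothetical helper written for illustration; the scanner in this PR may structure the check differently. The clock is injectable only so the behavior can be exercised without waiting.

```typescript
// Hypothetical helper: decides when the scanner should emit a progress line.
class ProgressThrottle {
  private readonly everyBatches: number;
  private readonly maxSilenceMs: number;
  private readonly now: () => number;
  private batchesSinceLog = 0;
  private lastLogAt: number;

  constructor(everyBatches = 2, maxSilenceMs = 10_000, now: () => number = () => Date.now()) {
    this.everyBatches = everyBatches;
    this.maxSilenceMs = maxSilenceMs;
    this.now = now;
    this.lastLogAt = now();
  }

  /** Call once per finished batch; returns true when a progress line is due. */
  shouldLog(): boolean {
    this.batchesSinceLog += 1;
    const current = this.now();
    const due =
      this.batchesSinceLog >= this.everyBatches ||
      current - this.lastLogAt >= this.maxSilenceMs;
    if (due) {
      // Reset both counters so the next window starts from this log line.
      this.batchesSinceLog = 0;
      this.lastLogAt = current;
    }
    return due;
  }
}
```

A scan loop would call `shouldLog()` after each batch and print only when it returns `true`, so a single slow batch still produces output once the time limit has passed.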

**Technical Details:**
- Added `log-update` dependency for smooth single-line progress updates
- New `ProgressRenderer` class for section-based progress display
- Optimized `buildCodeMetadata()` to derive change frequency from author contributions instead of making separate git calls
- Scanner now tracks time since last log and ensures updates every 10s
- Storage size calculation moved from index-time to query-time (lazy evaluation)
- TTY detection for graceful fallback in CI/CD environments
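
A dependency-free sketch of section-based progress with a TTY fallback is shown below. It mirrors the method names that `git.ts` calls on `ProgressRenderer` (`setSections`, `updateSection`, `completeSection`, `done`), but the body is an assumption: the real class uses `log-update` to redraw a single line, whereas `SketchProgressRenderer` simply writes lines through an injected `write` callback and takes `isTTY` as an explicit option (a caller would typically pass `process.stdout.isTTY`).

```typescript
interface SketchRendererOptions {
  isTTY: boolean; // assumed to come from process.stdout.isTTY
  verbose?: boolean;
  write?: (line: string) => void; // defaults to console.log
}

// Illustrative only; not the ProgressRenderer implementation from this PR.
class SketchProgressRenderer {
  private sections: string[] = [];
  private current = 0;
  private readonly interactive: boolean;
  private readonly write: (line: string) => void;

  constructor(opts: SketchRendererOptions) {
    // Verbose mode and non-TTY output (CI logs, pipes) skip live updates.
    this.interactive = opts.isTTY && !opts.verbose;
    this.write = opts.write ?? ((line) => console.log(line));
  }

  setSections(names: string[]): void {
    this.sections = names;
    this.current = 0;
    this.startSection();
  }

  updateSection(message: string): void {
    if (this.interactive) this.write(`  ${message}`);
  }

  completeSection(summary: string, durationSeconds: number): void {
    const name = this.sections[this.current] ?? 'Unknown';
    this.write(`✓ ${name} (${durationSeconds.toFixed(1)}s)`);
    this.write(`  ${summary}`);
    this.current += 1;
    this.startSection();
  }

  done(): void {
    // Nothing to flush in this sketch; a log-update based renderer would finalize here.
  }

  private startSection(): void {
    const name = this.sections[this.current];
    if (name !== undefined && this.interactive) this.write(`▸ ${name}`);
  }
}
```

In this sketch a non-interactive run prints only the completed-section lines, which keeps CI logs short; whether the PR's fallback prints more detail is not shown in the diff.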

**Before:**
```
[14:27:37] typescript 3450/3730 (92%)
← ~2.5 MINUTES OF SILENCE
[14:30:09] typescript 3600/3730 (97%)
← EMBEDDING COMPLETES
← 63 SECONDS OF SILENCE
[14:31:12] Starting git extraction
```

**After:**
```
▸ Scanning Repository
  357/433 files (82%, 119 files/sec)
✓ Scanning Repository (3.2s)
  433 files → 2,525 components

▸ Embedding Vectors
  1,600/2,525 documents (63%, 108 docs/sec)
✓ Embedding Vectors (20.7s)
  2,525 documents

▸ Git History
  150/252 commits (60%)
✓ Git History (4.4s)
  252 commits

▸ GitHub Issues/PRs
  82/163 documents (50%)
✓ GitHub Issues/PRs (7.8s)
  163 documents

✓ Repository indexed successfully!

  Indexed: 433 files • 2,525 components • 252 commits • 163 GitHub docs
  Duration: 33.5s

💡 Next steps:
   dev map       Explore codebase structure
   dev owners    See contributor stats
   dev activity  Find active files
```
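
The live progress lines in the sample output follow a simple pattern: count, total, rounded percentage, and throughput. A small formatter along these lines would produce them. `formatProgress` is a hypothetical name, and the elapsed times in the usage comments are made up so that the numbers line up with the sample; they do not appear in the original output.

```typescript
// Hypothetical formatter for a "done/total unit (pct%, rate unit/sec)" progress line.
function formatProgress(done: number, total: number, unit: string, elapsedSec: number): string {
  const fmt = (n: number): string => n.toLocaleString('en-US');
  // Guard against division by zero before any work has been discovered or timed.
  const pct = total > 0 ? Math.round((done / total) * 100) : 0;
  const rate = elapsedSec > 0 ? Math.round(done / elapsedSec) : 0;
  return `${fmt(done)}/${fmt(total)} ${unit} (${pct}%, ${fmt(rate)} ${unit}/sec)`;
}

// Assumed elapsed times, chosen to match the sample output:
// formatProgress(357, 433, 'files', 3)     -> "357/433 files (82%, 119 files/sec)"
// formatProgress(1600, 2525, 'docs', 14.8) -> "1,600/2,525 docs (63%, 108 docs/sec)"
```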

1 change: 1 addition & 0 deletions packages/cli/package.json
@@ -45,6 +45,7 @@
     "chalk": "^5.6.2",
     "cli-table3": "^0.6.5",
     "commander": "^12.1.0",
+    "log-update": "^6.1.0",
     "ora": "^8.0.1",
     "terminal-size": "^4.0.0"
   },
18 changes: 10 additions & 8 deletions packages/cli/src/commands/commands.test.ts
@@ -82,7 +82,7 @@ describe('CLI Commands', () => {
       );
     });

-    it('should display storage size after indexing', async () => {
+    it('should display indexing summary without storage size', async () => {
       const indexDir = path.join(testDir, 'index-test');
       await fs.mkdir(indexDir, { recursive: true });

@@ -120,13 +120,15 @@ export class Calculator {
       exitSpy.mockRestore();
       console.log = originalConsoleLog;

-      // Verify storage size is in the output (new compact format shows it after duration)
-      const storageSizeLog = loggedMessages.find(
-        (msg) => msg.includes('Duration:') || msg.includes('Storage:')
-      );
-      expect(storageSizeLog).toBeDefined();
-      // Check for storage size in compact format: "Duration: X • Storage: Y"
-      expect(loggedMessages.some((msg) => /\d+(\.\d+)?\s*(B|KB|MB|GB)/.test(msg))).toBe(true);
+      // Verify summary shows duration (storage size calculated on-demand in `dev stats`)
+      const durationLog = loggedMessages.find((msg) => msg.includes('Duration:'));
+      expect(durationLog).toBeDefined();
+      // Verify storage size is NOT shown (deferred to `dev stats`)
+      const hasStorageSize = loggedMessages.some((msg) => msg.includes('Storage:'));
+      expect(hasStorageSize).toBe(false);
+      // Verify indexed stats are shown
+      const indexedLog = loggedMessages.find((msg) => msg.includes('Indexed:'));
+      expect(indexedLog).toBeDefined();
     }, 30000); // 30s timeout for indexing
   });
 });
84 changes: 63 additions & 21 deletions packages/cli/src/commands/git.ts
@@ -10,12 +10,12 @@ import {
   LocalGitExtractor,
   VectorStorage,
 } from '@lytics/dev-agent-core';
-import { createLogger } from '@lytics/kero';
 import chalk from 'chalk';
 import { Command } from 'commander';
 import ora from 'ora';
-import { keroLogger, logger } from '../utils/logger.js';
+import { createIndexLogger, logger } from '../utils/logger.js';
 import { output, printGitStats } from '../utils/output.js';
+import { ProgressRenderer } from '../utils/progress.js';

 /**
  * Create Git indexer with centralized storage
@@ -48,49 +48,91 @@ export const gitCommand = new Command('git')
   .addCommand(
     new Command('index')
       .description('Index git commit history for semantic search')
-      .option('--limit <number>', 'Maximum commits to index (default: 500)', Number.parseInt, 500)
+      .option(
+        '--limit <number>',
+        'Maximum commits to index (default: 500)',
+        (val) => Number.parseInt(val, 10),
+        500
+      )
       .option(
         '--since <date>',
         'Only index commits after this date (e.g., "2024-01-01", "6 months ago")'
       )
       .option('-v, --verbose', 'Verbose output', false)
       .action(async (options) => {
-        const spinner = ora('Loading configuration...').start();
+        const spinner = ora('Initializing git indexer...').start();

         // Create logger for indexing
-        const indexLogger = options.verbose
-          ? createLogger({ level: 'debug', format: 'pretty' })
-          : keroLogger.child({ command: 'git-index' });
+        const indexLogger = createIndexLogger(options.verbose);

         try {
-          spinner.text = 'Initializing git indexer...';
-
           const { indexer, vectorStore } = await createGitIndexer();

-          spinner.text = 'Indexing git commits...';
+          // Stop spinner and switch to section-based progress
+          spinner.stop();
+
+          // Initialize progress renderer
+          const progressRenderer = new ProgressRenderer({ verbose: options.verbose });
+          progressRenderer.setSections(['Extracting Commits', 'Embedding Commits']);
+
+          const startTime = Date.now();
+          const extractStartTime = startTime;
+          let embeddingStartTime = 0;
+          let inEmbeddingPhase = false;

           const stats = await indexer.index({
             limit: options.limit,
             since: options.since,
             logger: indexLogger,
             onProgress: (progress) => {
               if (progress.phase === 'storing' && progress.totalCommits > 0) {
+                // Transitioning to embedding phase
+                if (!inEmbeddingPhase) {
+                  const extractDuration = (Date.now() - extractStartTime) / 1000;
+                  progressRenderer.completeSection(
+                    `${progress.totalCommits.toLocaleString()} commits extracted`,
+                    extractDuration
+                  );
+                  embeddingStartTime = Date.now();
+                  inEmbeddingPhase = true;
+                }
+
+                // Update embedding progress
                 const pct = Math.round((progress.commitsProcessed / progress.totalCommits) * 100);
-                spinner.text = `Embedding ${progress.commitsProcessed}/${progress.totalCommits} commits (${pct}%)`;
+                progressRenderer.updateSection(
+                  `${progress.commitsProcessed}/${progress.totalCommits} commits (${pct}%)`
+                );
               }
             },
           });

-          spinner.succeed(chalk.green('Git history indexed!'));
+          // Complete embedding section
+          if (inEmbeddingPhase) {
+            const embeddingDuration = (Date.now() - embeddingStartTime) / 1000;
+            progressRenderer.completeSection(
+              `${stats.commitsIndexed.toLocaleString()} commits`,
+              embeddingDuration
+            );
+          }

-          // Display stats
-          logger.log('');
-          logger.log(chalk.bold('Indexing Stats:'));
-          logger.log(` Commits indexed: ${chalk.yellow(stats.commitsIndexed)}`);
-          logger.log(` Duration: ${chalk.cyan(stats.durationMs)}ms`);
-          logger.log('');
-          logger.log(chalk.gray('Now you can search with: dev git search "<query>"'));
-          logger.log('');
+          const totalDuration = (Date.now() - startTime) / 1000;
+
+          // Finalize progress display
+          progressRenderer.done();
+
+          // Display success message
+          output.log('');
+          output.success(`Git history indexed successfully!`);
+          output.log(
+            ` ${chalk.bold('Indexed:')} ${stats.commitsIndexed.toLocaleString()} commits`
+          );
+          output.log(` ${chalk.bold('Duration:')} ${totalDuration.toFixed(1)}s`);
+          output.log('');
+          output.log(chalk.dim('💡 Next step:'));
+          output.log(
+            ` ${chalk.cyan('dev git search "<query>"')} ${chalk.dim('Search commit history')}`
+          );
+          output.log('');

           await vectorStore.close();
         } catch (error) {
@@ -111,7 +153,7 @@ export const gitCommand = new Command('git')
     new Command('search')
       .description('Semantic search over git commit messages')
       .argument('<query>', 'Search query (e.g., "authentication bug fix")')
-      .option('--limit <number>', 'Number of results', Number.parseInt, 10)
+      .option('--limit <number>', 'Number of results', (val) => Number.parseInt(val, 10), 10)
       .option('--json', 'Output as JSON')
       .action(async (query, options) => {
         const spinner = ora('Loading configuration...').start();
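
Both `--limit` changes in this file wrap `Number.parseInt` in an arrow function. The likely reason is that Commander invokes a custom option parser with two arguments, the raw value and the previous value (initially the default), so passing `Number.parseInt` directly makes the default act as the radix. The sketch below reproduces that call shape without importing Commander; `applyParser` and `OptionParser` are hypothetical stand-ins for what the library does internally.

```typescript
// Commander calls a custom option parser as parser(value, previous),
// where `previous` starts out as the option's default value.
type OptionParser<T> = (value: string, previous: T) => T;

// Hypothetical stand-in for Commander's internal call.
function applyParser<T>(parser: OptionParser<T>, raw: string, defaultValue: T): T {
  return parser(raw, defaultValue);
}

// Before: the default (500) is passed as the radix, which is outside 2..36.
const limitBefore = applyParser(Number.parseInt, '100', 500);

// After: the radix is pinned to 10 and the second argument is ignored.
const limitAfter = applyParser((val: string) => Number.parseInt(val, 10), '100', 500);

console.log(Number.isNaN(limitBefore), limitAfter); // prints "true 100"
```

With the `search` command's default of 10 the old form happened to work, because 10 is a valid radix; pinning the radix makes the behavior independent of the default.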