@dslovinsky dslovinsky commented Jan 9, 2026

Description

This PR migrates the content indexing system from docs-site to the docs repository and refactors it to read from the filesystem in most cases, which eliminates the GitHub API limit issue. The new system reads manual documentation directly from disk and fetches only SDK references (which won't be edited locally) via the GitHub API. As before, you can check the README here for a solid overview of what this does and how.
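As a rough illustration of the filesystem-first approach, the sketch below walks a docs directory and loads every Markdown/MDX file from disk without any API calls. It is not the indexer's actual code: `collectDocFiles`, `readDocs`, the `.md`/`.mdx` filter, and the use of synchronous reads are all assumptions made for the example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: recursively list .md/.mdx files under `root`,
// returned as paths relative to `root`, sorted for stable output.
export function collectDocFiles(root: string, dir: string = root): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...collectDocFiles(root, fullPath));
    } else if (/\.mdx?$/.test(entry.name)) {
      files.push(path.relative(root, fullPath));
    }
  }
  return files.sort();
}

// Hypothetical helper: read every doc file into memory, keyed by relative path.
// Everything comes from the local checkout, so no GitHub API quota is used.
export function readDocs(root: string): Map<string, string> {
  const docs = new Map<string, string>();
  for (const relPath of collectDocFiles(root)) {
    docs.set(relPath, fs.readFileSync(path.join(root, relPath), "utf8"));
  }
  return docs;
}
```

Only content that does not live in the local checkout (here, the SDK references) would still need a network fetch.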

The indexer has been split into three independent services: the main indexer (manual docs from the local filesystem), the SDK indexer (SDK references from the aa-sdk repo via the API), and the changelog indexer (local changelog files). Each indexer can run independently and updates branch-scoped keys in Redis ({branch}/path-index:main) to enable preview environments. The system supports two modes: production mode uploads to both Redis and Algolia with permanent keys, while preview mode updates only Redis with a 30-day TTL and skips Algolia, so previews use the production search indices.
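The two modes can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not the PR's implementation: `planWrite`, `WritePlan`, and `IndexerMode` are hypothetical names, and it assumes the suffix after `path-index:` identifies the indexer, based only on the `{branch}/path-index:main` pattern quoted above.

```typescript
type IndexerMode = "production" | "preview";

interface WritePlan {
  redisKey: string;
  ttlSeconds: number | null; // null means the key never expires
  uploadToAlgolia: boolean;
}

// 30 days, matching the preview TTL described in the PR.
const PREVIEW_TTL_SECONDS = 30 * 24 * 60 * 60;

// Hypothetical helper: decide where and how an indexer run should write.
export function planWrite(
  branch: string,
  indexName: string, // assumed to name the indexer, e.g. "main"
  mode: IndexerMode,
): WritePlan {
  return {
    redisKey: `${branch}/path-index:${indexName}`,
    ttlSeconds: mode === "preview" ? PREVIEW_TTL_SECONDS : null,
    uploadToAlgolia: mode === "production",
  };
}
```

For example, `planWrite("my-branch", "main", "preview")` yields a key scoped to `my-branch`, an expiry of 2,592,000 seconds, and no Algolia upload.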

The architecture uses a 3-phase pipeline (scan → batch fetch → process) with intelligent caching and bidirectional navigation tree merging that allows the main and SDK indexers to update the shared wallets navigation without overwriting each other's changes. Comprehensive test coverage (95-100%) ensures reliability with 4,000+ lines of tests covering all core components.
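One way to picture the merge that keeps two indexers from clobbering each other is to tag each navigation entry with the indexer that owns it. The sketch below is an assumption-heavy simplification: it uses a flat list rather than a tree, and `NavItem`, `Source`, and `mergeNav` are hypothetical names, not taken from the PR.

```typescript
type Source = "main" | "sdk";

interface NavItem {
  title: string;
  path: string;
  source: Source; // which indexer owns this entry
}

// Hypothetical merge: the writing indexer replaces only the entries it owns
// and leaves the other indexer's entries untouched. Because the same function
// serves either writer, the merge works in both directions.
export function mergeNav(
  existing: NavItem[],
  incoming: Omit<NavItem, "source">[],
  writer: Source,
): NavItem[] {
  const preserved = existing.filter((item) => item.source !== writer);
  const fresh = incoming.map((item) => ({ ...item, source: writer }));
  return [...preserved, ...fresh];
}
```

Ordering is deliberately naive here (preserved entries first); a real navigation tree would need to merge at each level and keep a stable order.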

For the most part, none of this affects existing code, and the Fern site still works as normal. Unfortunately, I had to switch the package type to module, which meant adding .js to all JS file imports. We didn't have much JS before, so it's not a huge deal.

Related Issues

https://app.asana.com/1/1129441638109975/project/1211825853436056/task/1212670258654565?focus=true

Changes Made

  • Moved the content indexer from the docs-site repo to src/content-indexer/
  • Created 3 GitHub Actions workflows that run the indexers automatically when specific files change
  • Added cross-repo automation: aa-sdk triggers the SDK indexer in the docs repo via repository_dispatch
  • Added 21 test suites covering all indexer components
  • Added dependencies: @upstash/redis, algoliasearch, octokit, gray-matter, remove-markdown, vitest + @vitest/coverage-v8 for testing infrastructure
  • Converted the package to ESM ("type": "module", required for octokit)
  • Switched from ts-node to tsx
  • Added .env.example with the new env vars required for running the indexers
  • Added several new npm scripts for running indexers and tests
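For readers unfamiliar with repository_dispatch, the cross-repo trigger listed above boils down to one authenticated POST to GitHub's `/repos/{owner}/{repo}/dispatches` endpoint. The sketch below only builds that request. The owner, repo, event type, and payload values are placeholders, and `buildDispatchRequest` is a hypothetical helper; the PR itself may send the event differently (for example via octokit or a workflow step).

```typescript
interface DispatchOptions {
  owner: string;
  repo: string;
  eventType: string; // must match a `types` entry in the receiving workflow
  token: string; // a token allowed to dispatch events to the target repo
  clientPayload?: Record<string, unknown>;
}

interface DispatchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the HTTP request for a repository_dispatch event.
export function buildDispatchRequest(opts: DispatchOptions): DispatchRequest {
  return {
    url: `https://api.github.com/repos/${opts.owner}/${opts.repo}/dispatches`,
    init: {
      method: "POST",
      headers: {
        Accept: "application/vnd.github+json",
        Authorization: `Bearer ${opts.token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        event_type: opts.eventType,
        client_payload: opts.clientPayload ?? {},
      }),
    },
  };
}
```

Sending it is then `await fetch(req.url, req.init)`; GitHub replies with 204 No Content on success, and a workflow in the target repo that listens for `repository_dispatch` with the same event type starts running.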

Testing

  • I have tested these changes locally
  • I have run the validation scripts (pnpm run validate)
  • I have checked that the documentation builds correctly

@dslovinsky dslovinsky self-assigned this Jan 9, 2026
github-actions bot commented Jan 9, 2026

🌿 Documentation Preview

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| Alchemy Docs | ✅ Ready | 🔗 Visit Preview | Jan 9, 2026, 10:53 PM |

@github-actions github-actions bot temporarily deployed to docs-preview January 9, 2026 19:46 Destroyed
@github-actions github-actions bot temporarily deployed to docs-preview January 9, 2026 22:01 Destroyed
@github-actions github-actions bot temporarily deployed to docs-preview January 9, 2026 22:09 Destroyed
@dslovinsky dslovinsky marked this pull request as ready for review January 9, 2026 22:16
Copilot AI review requested due to automatic review settings January 9, 2026 22:16
@dslovinsky dslovinsky requested a review from a team as a code owner January 9, 2026 22:16
@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.


Copilot AI left a comment

Pull request overview

This PR migrates the content-indexer system from the docs-site repository to the docs repository, enabling direct content processing and indexing capabilities. The content-indexer builds path indexes, navigation trees, and Algolia search records by processing documentation from three sources: main docs (local filesystem), SDK references (GitHub API), and changelog entries.

Changes:

  • Added complete content-indexer implementation with three independent indexers (main, SDK, changelog)
  • Configured package.json as ES module with new indexer scripts and dependencies
  • Added GitHub Actions workflows for automated indexing on content changes

Reviewed changes

Copilot reviewed 88 out of 92 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vitest.config.ts | New Vitest configuration for testing |
| src/utils/*.ts | Updated imports to include .js extension for ES modules |
| src/content-indexer/**/* | New content-indexer implementation with visitors, collectors, and uploaders |
| package.json | Added type: "module" and new dependencies/scripts |
| fern/components/**/*.tsx | Updated imports to include .js extension |
| eslint.config.ts | Added no-non-null-assertion rule |
| .github/workflows/*.yml | New workflows for automated indexing |
| .env.example | New environment variable template |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

@dslovinsky

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.

@dslovinsky

> To use Codex here, create a Codex account and connect to github.

It's already connected!! 😠
