influxdata · jstirnaman · Nov 7, 2025 · Nov 24, 2025
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -42,6 +42,9 @@ jobs:
       - run:
           name: Hugo Build
           command: yarn hugo --environment production --logLevel info --gc --destination workspace/public
+      - run:
+          name: Generate LLM-friendly Markdown
+          command: yarn build:md
       - persist_to_workspace:
           root: workspace
           paths:
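
Note on the new CI step: DOCS-DEPLOYING.md (added later in this PR) describes the generator as using memory-bounded parallelism with `CONCURRENCY = process.env.CI ? 10 : 20`. The following is a minimal sketch of one way to bound concurrency, assuming a simple promise pool. `runWithLimit` and `convert` are hypothetical names used for illustration; `scripts/build-llm-markdown.js` may use a different mechanism.

```javascript
// Concurrency setting quoted from DOCS-DEPLOYING.md.
const CONCURRENCY = process.env.CI ? 10 : 20;

// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane() {
    while (next < items.length) {
      const index = next++; // no race: check and increment run without yielding
      results[index] = await worker(items[index], index);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}

// Usage with a stand-in for the real HTML-to-Markdown conversion.
const convert = async (file) => `${file}.md`;
runWithLimit(['a/index', 'b/index', 'c/index'], CONCURRENCY, convert)
  .then((out) => console.log(out));
```

A lower limit in CI trades some speed for a smaller peak memory footprint, which matches the intent stated in the guide's Performance section.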

diff --git a/.claude/settings.json b/.claude/settings.json
@@ -0,0 +1,16 @@
+{
+  "permissions": {
+    "allow": [
+    ],
+    "deny": [
+      "Read(./.env)",
+      "Read(./.env.*)",
+      "Read(./secrets/**)",
+      "Read(./config/credentials.json)",
+      "Read(./build)"
+    ],
+    "ask": [
+      "Bash(git push:*)"
+    ]
+  }
+}
diff --git a/.gitignore b/.gitignore
@@ -38,10 +38,25 @@ tmp
 
 # TypeScript build output
 **/dist/
+**/dist-lambda/
 
 # User context files for AI assistant tools
 .context/*
 !.context/README.md
 
 # External repos
 .ext/*
+
+# Lambda deployment artifacts
+deploy/llm-markdown/lambda-edge/markdown-generator/*.zip
+deploy/llm-markdown/lambda-edge/markdown-generator/package-lock.json
+deploy/llm-markdown/lambda-edge/markdown-generator/.package-tmp/
+deploy/llm-markdown/lambda-edge/markdown-generator/yarn.lock
+deploy/llm-markdown/lambda-edge/markdown-generator/config.json
+
+# JavaScript/TypeScript build artifacts
+*.tsbuildinfo
+*.d.ts
+*.d.ts.map
+*.js.map
+.eslintcache
diff --git a/.s3deploy.yml b/.s3deploy.yml
@@ -4,5 +4,5 @@ routes:
       headers:
          Cache-Control: "max-age=630720000, no-transform, public"
       gzip: true
-    - route: "^.+\\.(html|xml|json|js)$"
+    - route: "^.+\\.(html|xml|json|js|md)$"
       gzip: true
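
Note on the route change: a quick way to sanity-check the widened pattern is to run it against a few sample keys. The sketch below copies the new pattern from `.s3deploy.yml`; the sample paths are hypothetical, and it assumes JavaScript's regex semantics agree with s3deploy's Go regexp engine for a pattern this simple.

```javascript
// Pattern copied from the updated .s3deploy.yml route (YAML "\\." is the regex "\.").
const gzipRoute = /^.+\.(html|xml|json|js|md)$/;

// Hypothetical object keys, for illustration only.
const samples = [
  'influxdb3/core/get-started/index.html',
  'influxdb3/core/get-started/index.md',
  'influxdb3/core/get-started/index.section.md',
  'img/logo.png',
];

for (const key of samples) {
  console.log(`${gzipRoute.test(key) ? 'gzip' : 'skip'}  ${key}`);
}
```

Both `index.md` and `index.section.md` end in `.md`, so the single added alternative covers both kinds of generated file.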
diff --git a/DOCS-DEPLOYING.md b/DOCS-DEPLOYING.md
@@ -0,0 +1,332 @@
+# Deploying InfluxData Documentation
+
+This guide covers deploying the docs-v2 site to staging and production environments, as well as LLM markdown generation.
+
+## Table of Contents
+
+- [Staging Deployment](#staging-deployment)
+- [Production Deployment](#production-deployment)
+- [LLM Markdown Generation](#llm-markdown-generation)
+- [Testing and Validation](#testing-and-validation)
+- [Troubleshooting](#troubleshooting)
+- [Deployment Checklist](#deployment-checklist)
+- [Related Documentation](#related-documentation)
+
+## Staging Deployment
+
+Staging deployments are manual and run locally with your AWS credentials.
+
+### Prerequisites
+
+1. **AWS Credentials** - Configure AWS CLI with appropriate permissions:
+   ```bash
+   aws configure
+   ```
+
+2. **s3deploy** - Install the s3deploy binary:
+   ```bash
+   ./deploy/ci-install-s3deploy.sh
+   ```
+
+3. **Environment Variables** - Set required variables:
+   ```bash
+   export STAGING_BUCKET="test2.docs.influxdata.com"
+   export AWS_REGION="us-east-1"
+   export STAGING_CF_DISTRIBUTION_ID="E1XXXXXXXXXX"  # Optional
+   ```
+
+### Deploy to Staging
+
+Use the staging deployment script:
+
+```bash
+yarn deploy:staging
+```
+
+Or run the script directly:
+
+```bash
+./scripts/deploy-staging.sh
+```
+
+### What the Script Does
+
+1. **Builds Hugo site** with staging configuration (`config/staging/hugo.yml`)
+2. **Generates LLM-friendly Markdown** (`yarn build:md`)
+3. **Uploads to S3** using s3deploy
+4. **Invalidates CloudFront cache** (if `STAGING_CF_DISTRIBUTION_ID` is set)
+
+### Optional Environment Variables
+
+Skip specific steps for faster iteration:
+
+```bash
+# Skip Hugo build (use existing public/)
+export SKIP_BUILD=true
+
+# Skip markdown generation
+export SKIP_MARKDOWN=true
+
+# Build only (no S3 upload)
+export SKIP_DEPLOY=true
+```
+
+### Example: Build and Generate Markdown Without Deploying
+
+```bash
+SKIP_DEPLOY=true ./scripts/deploy-staging.sh
+```
+
+## Production Deployment
+
+Production deployments run **automatically** in CircleCI when changes are merged to `master`.
+
+### Workflow
+
+1. **Build Job** (`.circleci/config.yml`):
+   - Installs dependencies
+   - Builds Hugo site with production config
+   - Generates LLM-friendly Markdown (`yarn build:md`)
+   - Persists workspace for deploy job
+
+2. **Deploy Job**:
+   - Attaches workspace
+   - Uploads to S3 using s3deploy
+   - Invalidates CloudFront cache
+   - Posts success notification to Slack
+
+### Environment Variables (CircleCI)
+
+Production deployment requires the following environment variables set in CircleCI:
+
+- `BUCKET` - Production S3 bucket name
+- `REGION` - AWS region
+- `CF_DISTRIBUTION_ID` - CloudFront distribution ID
+- `SLACK_WEBHOOK_URL` - Slack notification webhook
+
+### Trigger Production Deploy
+
+```bash
+git push origin master
+```
+
+CircleCI will automatically build and deploy.
+
+## LLM Markdown Generation
+
+Both staging and production deployments generate LLM-friendly Markdown files at build time.
+
+### Output Files
+
+The build generates two types of markdown files in `public/`:
+
+1. **Single-page markdown** (`index.md`)
+   - Individual page content with frontmatter
+   - Contains: title, description, URL, product, version, token estimate
+
+2. **Section bundles** (`index.section.md`)
+   - Aggregated section with all child pages
+   - Includes child page list in frontmatter
+   - Optimized for LLM context windows
+
+### Generation Script
+
+```bash
+# Generate all markdown
+yarn build:md
+
+# Generate for specific path
+node scripts/build-llm-markdown.js --path influxdb3/core/get-started
+
+# Limit number of files (for testing)
+node scripts/build-llm-markdown.js --limit 100
+```
+
+### Configuration
+
+Edit `scripts/build-llm-markdown.js` to adjust:
+
+```javascript
+// Skip files smaller than this (Hugo alias redirects)
+const MIN_HTML_SIZE_BYTES = 1024;
+
+// Token estimation ratio
+const CHARS_PER_TOKEN = 4;
+
+// Concurrency (workers)
+const CONCURRENCY = process.env.CI ? 10 : 20;
+```
+
+### Performance
+
+- **Speed**: \~105 seconds for 5,000 pages + 500 sections
+- **Memory**: \~300MB peak (well within a 2GB CircleCI memory limit)
+- **Rate**: \~23 files/second with memory-bounded parallelism
+
+## Testing and Validation
+
+### Local Testing
+
+Test markdown generation locally before deploying:
+
+```bash
+# Prerequisites
+yarn install
+yarn build:ts
+npx hugo --quiet
+
+# Generate markdown for testing
+yarn build:md
+
+# Generate markdown for specific path
+node scripts/build-llm-markdown.js --path influxdb3/core/get-started --limit 10
+
+# Run validation tests
+node cypress/support/run-e2e-specs.js \
+  --spec "cypress/e2e/content/markdown-content-validation.cy.js"
+```
+
+### Validation Checks
+
+The Cypress tests validate:
+
+- ✅ No raw Hugo shortcodes (`{{< >}}` or `{{% %}}`)
+- ✅ No HTML comments
+- ✅ Proper YAML frontmatter with required fields
+- ✅ UI elements removed (feedback forms, navigation)
+- ✅ GitHub-style callouts (Note, Warning, etc.)
+- ✅ Properly formatted tables, lists, and code blocks
+- ✅ Product context metadata
+- ✅ Clean link formatting
+
+See [DOCS-TESTING.md](DOCS-TESTING.md) for comprehensive testing documentation.
+
+## Troubleshooting
+
+### s3deploy Not Found
+
+Install the s3deploy binary:
+
+```bash
+./deploy/ci-install-s3deploy.sh
+```
+
+Verify installation:
+
+```bash
+s3deploy -V
+```
+
+### Missing Environment Variables
+
+Check that the required variables are set:
+
+```bash
+echo $STAGING_BUCKET
+echo $AWS_REGION
+```
+
+Set them if missing:
+
+```bash
+export STAGING_BUCKET="test2.docs.influxdata.com"
+export AWS_REGION="us-east-1"
+```
+
+### AWS Permission Errors
+
+Ensure your AWS credentials have the required permissions:
+
+- `s3:PutObject` - Upload files to S3
+- `s3:DeleteObject` - Delete old files from S3
+- `cloudfront:CreateInvalidation` - Invalidate cache
+
+Check your AWS profile:
+
+```bash
+aws sts get-caller-identity
+```
+
+### Hugo Build Fails
+
+Check for:
+
+- Missing dependencies (`yarn install`)
+- TypeScript compilation errors (`yarn build:ts`)
+- Invalid Hugo configuration
+
+Build Hugo separately to isolate the issue:
+
+```bash
+yarn hugo --environment staging
+```
+
+### Markdown Generation Fails
+
+Confirm that:
+
+- The Hugo build completed successfully
+- TypeScript compiled (`yarn build:ts`)
+- Sufficient memory is available
+
+Test markdown generation separately:
+
+```bash
+yarn build:md --limit 10
+```
+
+### CloudFront Cache Not Invalidating
+
+If you see stale content after deployment:
+
+1. Check `STAGING_CF_DISTRIBUTION_ID` is set correctly
+2. Verify AWS credentials have `cloudfront:CreateInvalidation` permission
+3. Invalidate the cache manually:
+   ```bash
+   aws cloudfront create-invalidation \
+     --distribution-id E1XXXXXXXXXX \
+     --paths "/*"
+   ```
+
+### Deployment Timing Out
+
+For large deployments:
+
+1. **Skip markdown generation** if unchanged:
+   ```bash
+   SKIP_MARKDOWN=true ./scripts/deploy-staging.sh
+   ```
+
+2. **Use s3deploy's incremental upload**:
+   - s3deploy only uploads changed files
+   - First deploy is slower, subsequent deploys are faster
+
+3. **Check network speed**:
+   - Large uploads require good bandwidth
+   - Consider deploying from a machine in or near the bucket's AWS region
+
+## Deployment Checklist
+
+### Before Deploying to Staging
+
+- [ ] Run linters locally (`yarn lint`)
+- [ ] Build Hugo successfully (`yarn hugo --environment staging`)
+- [ ] Generate markdown successfully (`yarn build:md`)
+- [ ] Set staging environment variables
+- [ ] Have AWS credentials configured
+
+### Before Merging to Master (Production)
+
+- [ ] Test on staging first
+- [ ] Verify LLM markdown quality
+- [ ] Check for broken links (`yarn test:links`)
+- [ ] Run code block tests (`yarn test:codeblocks:all`)
+- [ ] Review CircleCI configuration changes
+- [ ] Ensure all tests pass
+
+## Related Documentation
+
+- [Contributing Guide](DOCS-CONTRIBUTING.md)
+- [Testing Guide](DOCS-TESTING.md)
+- [CircleCI Configuration](.circleci/config.yml)
+- [S3 Deploy Configuration](.s3deploy.yml)
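
Note on the Configuration constants in DOCS-DEPLOYING.md: they suggest how the generator filters inputs and estimates tokens. The following is a minimal sketch under stated assumptions. `shouldSkipHtml` and `estimateTokens` are hypothetical helper names, and rounding up is an assumption; `scripts/build-llm-markdown.js` may implement either step differently.

```javascript
// Values taken from the Configuration section of DOCS-DEPLOYING.md.
const MIN_HTML_SIZE_BYTES = 1024;
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: treat very small HTML files as Hugo alias redirects.
function shouldSkipHtml(html) {
  return Buffer.byteLength(html, 'utf8') < MIN_HTML_SIZE_BYTES;
}

// Hypothetical helper: rough token estimate from character count.
// Rounding up is an assumption, not confirmed by the source.
function estimateTokens(markdown) {
  return Math.ceil(markdown.length / CHARS_PER_TOKEN);
}

// A typical alias page is a tiny meta-refresh stub, far below the threshold.
const aliasPage =
  '<html><head><meta http-equiv="refresh" content="0; url=/new/"></head></html>';
console.log('skip alias page:', shouldSkipHtml(aliasPage));
console.log('estimated tokens:', estimateTokens('# Get started\n\nInstall InfluxDB 3 Core.'));
```

A fixed characters-per-token ratio is only an approximation; it is cheap to compute at build time and good enough for a frontmatter hint, but it will not match any specific tokenizer exactly.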