Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@vercel
Copy link

vercel bot commented Nov 5, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Ready Ready Preview Comment Nov 5, 2025 5:14am
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Nov 5, 2025 5:14am
rivet-inspector Ignored Ignored Preview Nov 5, 2025 5:14am
rivet-site Ignored Ignored Preview Nov 5, 2025 5:14am

Copy link
Member Author

NathanFlurry commented Nov 5, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Nov 5, 2025

PR Review: Auto Shut Down Runners on Source Code Updates

Summary

This PR adds a file watching mechanism to automatically shut down Next.js runners when source code is updated during development. The implementation uses polling instead of filesystem watchers to avoid cross-platform issues.


Code Quality & Best Practices

Strengths:

  • Well-documented code with clear comments explaining the workarounds
  • Good separation of concerns with distinct functions for each responsibility
  • Proper cleanup of intervals and event listeners to prevent memory leaks
  • Type-safe with explicit Promise return type on fetchWrapper

Issues & Suggestions:

  1. Hardcoded Route Path (Critical) - mod.ts:160-163

    const routePath = join(
        process.cwd(),
        ".next/server/app/api/rivet/[...all]/route.js",
    );

    This assumes a fixed API route structure. This will break if users:

    • Use a different route path (e.g., /api/custom/[...all])
    • Use TypeScript in production builds
    • Have a custom Next.js build directory

    Recommendation: Make the route path configurable via inputConfig or detect it dynamically.

  2. Log Level Inconsistency - mod.ts:167

    logger().info("checking");

    This logs "checking" every second at INFO level, which will spam logs. Should be debug() or removed entirely.

  3. Environment Check Inconsistency - mod.ts:58 vs mod.ts:17

    • Line 58 checks NODE_ENV !== "development"
    • Line 17 checks NODE_ENV !== "production"

    These are not equivalent (could be "test", "staging", etc.). Consider standardizing on one approach or explicitly handling all cases.

  4. Missing Cleanup on Abort Signal - mod.ts:93

    request.signal?.addEventListener("abort", abortMerged);

    The event listener is never removed, causing a potential memory leak if the mergedController outlives the request. Consider:

    const cleanup = () => {
        request.signal?.removeEventListener("abort", abortMerged);
        clearInterval(watchIntervalId);
    };
    mergedController.signal.addEventListener("abort", cleanup);

Potential Bugs

  1. Race Condition in Stream Wrapping - mod.ts:131-134
    The interval might not be cleared if the stream is read asynchronously after the response is returned. The onFinish callback could fire after the function returns but before cleanup happens.

  2. File Watcher Not Stopped on Error - mod.ts:123
    If fetch(newReq) throws an exception, the interval will never be cleared. Wrap in try-finally:

    try {
        const response = await fetch(newReq);
        // ... existing code
    } catch (err) {
        clearInterval(watchIntervalId);
        throw err;
    }
  3. Body Already Consumed - mod.ts:112
    Passing body: request.body when creating a new Request may fail if the body has already been read. Consider checking request.bodyUsed.


Performance Considerations

  1. Per-Request File Watchers - mod.ts:99
    Creating a polling interval for every request is expensive. With concurrent requests, you could have many timers checking the same file. Consider:

    • Use a singleton watcher shared across requests
    • Increase polling interval to 2-3 seconds (1s seems aggressive)
    • Debounce file checks
  2. Synchronous File Operations - mod.ts:169-173
    existsSync() and statSync() block the event loop. With many concurrent requests, this could cause performance issues. Consider using async versions or caching results briefly.

  3. Stream Wrapping Overhead - mod.ts:204-229
    Every chunk passes through an extra transform, adding latency to streaming responses. This is probably acceptable for dev mode but worth noting.


Security Concerns

No significant security issues identified. The file watching is limited to development mode and watches a build artifact, not user input.


Test Coverage

Missing Tests:

  • No test files found for this package
  • Should add tests for:
    • File change detection and abort triggering
    • Interval cleanup in various scenarios (normal completion, error, abort)
    • Stream wrapping and cleanup
    • Environment variable branching logic

Additional Recommendations

  1. Configuration Option: Add ability to disable this feature via config for users who don't want it:

    inputConfig.disableFileWatcher?: boolean
  2. Metrics/Telemetry: Log when file changes are detected and runners are terminated to help with debugging.

  3. Error Handling: The error in checkFile is logged but might benefit from alerting if it happens repeatedly.

  4. Type Safety: The fetch parameter type (request: Request, ...args: any) => Response | Promise<Response> uses any. Consider making it more specific or generic.


Verdict

The implementation solves a real problem with Next.js hot reloading, but needs refinement around:

  • Configurable route paths
  • Memory leak prevention
  • Error handling
  • Performance optimization for concurrent requests
  • Test coverage

The approach is reasonable given Next.js limitations, but the hardcoded path and per-request watchers are blockers that should be addressed before merging.

@NathanFlurry NathanFlurry force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from e376715 to 943ae66 Compare November 5, 2025 04:55
@NathanFlurry NathanFlurry marked this pull request as ready for review November 5, 2025 04:55
@NathanFlurry NathanFlurry requested review from MasterPtato and jog1t and removed request for MasterPtato November 5, 2025 04:55
@claude
Copy link

claude bot commented Nov 5, 2025

PR Review: Auto shut down runners when source code updates

Summary

This PR implements a development-mode file watcher that automatically shuts down runners when Next.js hot-reloads route files. The implementation addresses a legitimate issue where long-running requests (like SSE streams) persist across hot reloads in development.


Code Quality & Best Practices

Strengths:

  • Clear separation of concerns with dedicated functions for file watching and stream handling
  • Good documentation explaining the workarounds for Next.js limitations
  • Proper cleanup of intervals to prevent memory leaks
  • Type safety with explicit Promise<Response> return type

Issues:

  1. Incomplete comment (rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts:652)

    // If we did not use a graceful shutdown, the runner would

    This comment is cut off mid-sentence. Please complete it to explain the consequences.

  2. Environment check inconsistency (rivetkit-typescript/packages/next-js/src/mod.ts:58)

    • Line 17: process.env.NODE_ENV !== "production"
    • Line 58: process.env.NODE_ENV !== "development"

    The file watcher only runs in development, but the condition checks for !== "development" which would also activate in test/staging environments. Consider using === "development" for clarity.

  3. Duplicate event listener (rivetkit-typescript/packages/next-js/src/mod.ts:93,102)

    request.signal?.addEventListener("abort", abortMerged);  // Line 93
    request.signal.addEventListener("abort", () => { ... });  // Line 102

    Two separate abort listeners are registered on request.signal. The second one (clearing interval) is redundant since the abort will trigger mergedController.abort() via the first listener, which will cause the stream wrapper to clean up. Consider consolidating:

    request.signal?.addEventListener("abort", () => {
        abortMerged();
        logger().debug("clearing file watcher interval: request aborted");
        clearInterval(watchIntervalId);
    });

Potential Bugs

  1. Optional chaining inconsistency (rivetkit-typescript/packages/next-js/src/mod.ts:93,102)

    • Line 93 uses request.signal?.addEventListener (optional chaining)
    • Line 102 uses request.signal.addEventListener (no optional chaining)

    This will throw if request.signal is undefined. Use optional chaining consistently.

  2. Race condition in file watching (rivetkit-typescript/packages/next-js/src/mod.ts:190-192)
    The interval starts immediately with checkFile(), then every 1000ms. If the file changes between the initial check and first interval tick, there's a 1-second window where changes won't be detected. Consider:

    const intervalId = setInterval(checkFile, 1000);
    checkFile(); // Run after setting up interval
    return intervalId;

    Actually, looking more carefully, the current implementation calls checkFile() first (line 190), then starts the interval. This is correct, but there's still a race: if the file changes between line 190 and 192, and then changes back before the next interval, the change could be missed.

  3. Request body stream consumption (rivetkit-typescript/packages/next-js/src/mod.ts:112)
    When creating newReq, the original request.body is passed. Request bodies can only be read once. If the body was already consumed (e.g., for logging/middleware), this will fail. Consider checking request.bodyUsed or cloning the request.

  4. Hardcoded route path (rivetkit-typescript/packages/next-js/src/mod.ts:160-163)

    const routePath = join(
        process.cwd(),
        ".next/server/app/api/rivet/[...all]/route.js",
    );

    This assumes:

    • The API route is always at /api/rivet
    • Next.js app router (not pages router)
    • Specific build output structure

    If users configure a different route path or use pages router, this will fail silently. Consider making this configurable or documenting the assumption clearly.


Performance Considerations

  1. Polling interval efficiency (rivetkit-typescript/packages/next-js/src/mod.ts:192)
    1-second polling with statSync() on every active request could be expensive with many concurrent connections:

    • 10 SSE connections = 10 stats/second
    • 100 SSE connections = 100 stats/second

    Consider:

    • Increasing the interval (2-3 seconds might be acceptable for dev)
    • Using a singleton watcher with reference counting
    • Documenting the performance characteristics
  2. Stream wrapper overhead (rivetkit-typescript/packages/next-js/src/mod.ts:200-230)
    Every chunk passes through the custom ReadableStream wrapper. This adds overhead but is necessary for cleanup. The implementation looks efficient.


Security Concerns

  1. Development-only code
    The file watching is correctly limited to development mode. Good separation.

  2. File system access (rivetkit-typescript/packages/next-js/src/mod.ts:169,173)
    Uses existsSync() and statSync() on a predictable path. Since this is development-only and uses process.cwd() as the base, this is acceptable. Error handling is present.


Test Coverage

Missing tests:

  • No tests found for the next-js package
  • Consider adding tests for:
    • File watching triggers on mtime change
    • Interval cleanup on various scenarios (abort, stream finish, no body)
    • Stream wrapper correctly passes through data
    • AbortController merge logic
    • Environment-based behavior switching

Recommended test cases:

describe("handleRequestWithFileWatcher", () => {
  it("should clean up interval when stream finishes");
  it("should clean up interval when request is aborted");
  it("should clean up interval immediately if no response body");
  it("should abort merged controller when file changes");
  it("should handle file stat errors gracefully");
});

Additional Recommendations

  1. Configuration option
    Consider adding an option to disable file watching even in development, e.g., disableDevFileWatcher: boolean in config. This would help with debugging or performance testing.

  2. Metrics/observability
    Consider adding metrics for:

    • Number of active file watchers
    • File change detections
    • Watcher cleanup events
  3. Error boundary
    If file watching fails to initialize, should the request still proceed? Currently it will (errors are caught), which seems correct, but consider logging a warning on the first failure.

  4. Type safety improvement (rivetkit-typescript/packages/next-js/src/mod.ts:87)

    fetch: (request: Request, ...args: any) => Response | Promise<Response>

    The any type could be more specific. What are the actual additional arguments?


Summary

The PR solves a real development experience problem with a practical workaround. The implementation is generally solid, but has several issues that should be addressed:

Must fix:

  • Complete the truncated comment in actor-driver.ts:652
  • Fix optional chaining inconsistency for request.signal
  • Consolidate duplicate abort event listeners

Should fix:

  • Request body consumption issue
  • Environment check clarity
  • Document/make configurable the hardcoded route path

Nice to have:

  • Add test coverage
  • Add configuration option to disable feature
  • Optimize polling with reference-counted singleton watcher

The core logic is sound, but the edge cases and cleanup paths need attention to prevent subtle bugs in production.

@graphite-app graphite-app bot changed the base branch from 11-04-fix_rivetkit_fix_duplicate_runner_key_in_serverless to graphite-base/3369 November 5, 2025 05:11
@graphite-app graphite-app bot force-pushed the graphite-base/3369 branch from 3763786 to fdd89a7 Compare November 5, 2025 05:12
@graphite-app graphite-app bot force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from 943ae66 to d42c437 Compare November 5, 2025 05:12
@graphite-app graphite-app bot changed the base branch from graphite-base/3369 to main November 5, 2025 05:13
@graphite-app graphite-app bot force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from d42c437 to ddef08f Compare November 5, 2025 05:13
@claude
Copy link

claude bot commented Nov 5, 2025

Pull Request Review

Overview

This PR implements automatic shutdown of runners when source code updates in Next.js development mode. The implementation adds file watching functionality to detect changes and gracefully abort running requests.


Code Quality & Best Practices

Positive aspects:

  • Good separation of concerns with dedicated functions (handleRequestWithFileWatcher, watchRouteFile, waitForStreamFinish)
  • Comprehensive inline documentation explaining the "why" behind design decisions
  • Proper cleanup of intervals and event listeners
  • Appropriate use of environment checks to limit file watching to development mode only

Suggestions:

  1. Incomplete comment in actor-driver.ts:652 - The comment is cut off mid-sentence:

    // If we did not use a graceful shutdown, the runner would

    This should be completed to explain the consequences of not using graceful shutdown.

  2. Memory leak potential - In handleRequestWithFileWatcher (line 102-105), there's a duplicate event listener registration:

    • Line 93: request.signal?.addEventListener('abort', abortMerged)
    • Line 102: request.signal.addEventListener('abort', () => {...})

    The first listener (abortMerged) should be removed when the interval is cleared. Consider removing it in the interval cleanup callback.

  3. Hard-coded route path - The path .next/server/app/api/rivet/[...all]/route.js (line 162) is hard-coded. If users have a different route structure or Next.js changes its build output structure, this will break silently. Consider:

    • Adding a configuration option to override the path
    • Detecting the route path dynamically
    • Adding a warning if the file doesn't exist after the first check

Potential Bugs

  1. Race condition in stream cleanup - If the stream finishes before watchRouteFile detects a change, the interval is cleared. However, if a file change is detected while the stream is finishing, both paths might try to clear the interval. While clearInterval is idempotent, there's a brief window where abortController.abort() might be called after the stream completes. This is likely harmless but worth noting.

  2. Request body consumption - Line 112 passes request.body to the new Request constructor. If the original request body was already read/consumed, this will fail. Consider checking if the body is readable or handling this edge case.

  3. Missing null check - Line 102 accesses request.signal.addEventListener without the optional chaining used on line 93. This could throw if request.signal is undefined. Should be:

    request.signal?.addEventListener('abort', () => {...})

Performance Considerations

  1. Polling interval - The 1-second polling interval (line 192) seems reasonable for development, but creates one file stat per request per second. For applications with many concurrent long-running requests, this could add up. Consider:

    • Making the interval configurable
    • Using a shared file watcher across requests (with proper cleanup on hot reload)
    • Increasing the interval to 2-3 seconds for less aggressive checking
  2. File stat overhead - Using statSync (line 173) blocks the event loop briefly. For a 1-second interval this is probably fine, but consider stat (async) with proper error handling if you want to reduce blocking.


Security Concerns

No significant security issues identified. The code:

  • Only runs in development mode
  • Doesn't expose file system information to the client
  • Properly handles abort signals
  • Uses safe file system operations

Test Coverage

The PR doesn't include tests. Consider adding:

  1. Unit tests for waitForStreamFinish to verify cleanup in success/error/cancel scenarios
  2. Integration tests verifying file change detection triggers abort
  3. Tests for cleanup of intervals and event listeners to prevent memory leaks
  4. Edge case tests for missing files, permission errors, etc.

Additional Recommendations

  1. Logging consistency - Some logs use structured format (e.g., { msg: '...', routePath }) while others don't (e.g., line 103). Standardize on structured logging throughout.

  2. Error handling in watchRouteFile - The error is logged but the watcher continues running even if the file doesn't exist. Consider stopping the interval after N consecutive failures to avoid unnecessary work.

  3. Environment variable check - Line 58 checks NODE_ENV !== 'development' while line 17 checks NODE_ENV !== 'production'. This creates three states: development, production, and "other". Make sure the behavior for the "other" state is intentional.


Summary

The implementation is well-structured and solves a real development pain point with Next.js hot reloading. The main concerns are:

  • Critical: Missing null check on line 102
  • Important: Incomplete comment in actor-driver.ts
  • Nice to have: Memory leak prevention, configurable polling interval, test coverage

Overall, this is a solid solution to a tricky problem. With the fixes above, this should be ready to merge.

@NathanFlurry NathanFlurry merged commit ce0f835 into main Nov 5, 2025
7 of 11 checks passed
@NathanFlurry NathanFlurry deleted the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch November 5, 2025 05:14
@claude
Copy link

claude bot commented Nov 5, 2025

Pull Request Review

Summary

This PR implements auto-shutdown of runners when source code updates in Next.js development mode. The solution uses a polling-based file watcher to detect changes to the compiled route file and gracefully aborts requests to trigger runner restarts.


Code Quality & Best Practices

Positive Aspects

  • Well-documented: Excellent use of comments explaining the workarounds and design decisions (especially the HACK comments which are helpful)
  • Clean separation: The new functionality is properly isolated into separate functions (handleRequestWithFileWatcher, watchRouteFile, waitForStreamFinish)
  • Proper resource cleanup: Good attention to clearing intervals in multiple scenarios (stream finish, abort, no body)
  • Type safety: Return type annotation added to fetchWrapper improves type safety

Suggestions for Improvement

1. Environment check inconsistency (mod.ts:58)
Line 17 checks !== production but line 58 checks !== development. This means file watching will activate in all non-development modes (production, test, staging, etc.). Recommendation: Change line 58 to process.env.NODE_ENV === development to match the intent.

2. Potential memory leak with abort listeners (mod.ts:93, 102)
Two separate abort listeners are attached to request.signal. The first listener (abortMerged) is never removed. Recommendation: Either use a single listener or store references for proper cleanup.

3. Hard-coded route path (mod.ts:160-163)
The path /api/rivet is hard-coded but could theoretically be customized by users. Recommendation: Make this configurable or derive it from the config if possible.


Potential Bugs & Issues

Critical Issues

1. Race condition in file watcher (mod.ts:166-193)
If the file doesn't exist on first check, lastMtime stays null. When the file is created later, it won't trigger an abort (since lastMtime === null). Impact: Runner won't restart when switching branches or after clean builds. Recommendation: Initialize lastMtime even when file doesn't exist, or handle the file created case.

2. Incomplete comment in actor-driver.ts:652
Comment is incomplete (sentence cuts off mid-thought). Recommendation: Complete the sentence to explain what would happen.

3. Body stream consumption issue (mod.ts:108-120)
request.body is a stream that can only be read once. If the body was already partially consumed, this will fail. Recommendation: Check if body was consumed or use spread operator to copy properties.

Medium Priority Issues

4. Interval not cleared on immediate abort (mod.ts:99-105)
If request.signal is already aborted when the function is called, the interval is created but may not be cleared immediately. Recommendation: Check request.signal.aborted before starting the interval.

5. Error handling in stream wrapper (mod.ts:218-221)
controller.error() can throw if called after controller.close(). Recommendation: Add guard or wrap in try-catch.


Performance Considerations

1. Polling every 1 second per request (mod.ts:192)
Issue: Creates one interval per concurrent request (O(n) intervals for n requests). With many concurrent SSE connections, this creates significant overhead. Each interval performs a synchronous statSync() call every second. Impact: High CPU usage during development with multiple active connections. Recommendation: Use a singleton watcher shared across all requests with observers, or increase interval to 2-3 seconds.

2. Synchronous file system calls (mod.ts:169, 173)
existsSync and statSync block the event loop. Recommendation: Consider using async versions although the current approach is acceptable for development-only code.


Security Concerns

1. Development-only feature - File watching is properly restricted to development mode. No security concerns for production deployments.

2. Path traversal protection - Hard-coded path with join() prevents path traversal. No user input in file path construction.

3. Resource exhaustion potential - Multiple concurrent requests each create an interval. An attacker could open many connections to consume CPU/memory. Mitigation: This is development-only, so risk is minimal. Recommendation: Consider adding a maximum concurrent watcher limit.


Test Coverage

Missing Tests: No test files found for next-js package. Critical path (file watching, stream wrapping, abort handling) is untested. Recommendation: Add tests for file change detection, interval cleanup, stream wrapper behavior, memory leak prevention, and edge cases.


Summary & Recommendations

Priority Fixes:

  1. Fix environment check (mod.ts:58) - Use === development instead of !== development
  2. Fix file creation race condition - Handle case when file doesn't exist initially
  3. Clean up abort listeners - Prevent memory leaks
  4. Optimize polling - Consider singleton watcher or longer interval

Nice to Have:

  • Complete the comment in actor-driver.ts:652
  • Add test coverage for new functionality
  • Make route path configurable
  • Use async file system operations

Overall Assessment:
The PR implements a pragmatic solution to a real problem with Next.js hot reloading. The code is well-structured and documented. However, there are some potential bugs (environment check, race conditions) and performance concerns (per-request polling) that should be addressed before merging.

Recommended Action: Request changes to fix the critical environment check issue and file creation race condition. Other issues can be addressed in follow-up PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants