fix(next-js): auto shut down runners when source code updates #3369

NathanFlurry · 2025-11-05T04:05:17Z

No description provided.

vercel · 2025-11-05T04:05:24Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
rivetkit-serverless	Ready	Preview	Comment	Nov 5, 2025 5:14am

3 Skipped Deployments

Project	Deployment	Preview	Updated (UTC)
rivet-cloud	Ignored		Nov 5, 2025 5:14am
rivet-inspector	Ignored	Preview	Nov 5, 2025 5:14am
rivet-site	Ignored	Preview	Nov 5, 2025 5:14am

NathanFlurry · 2025-11-05T04:05:47Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

_{An organization admin has enabled the Graphite Merge Queue in this repository.} _{Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.}

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude · 2025-11-05T04:14:47Z

PR Review: Auto Shut Down Runners on Source Code Updates

Summary

This PR adds a file watching mechanism to automatically shut down Next.js runners when source code is updated during development. The implementation uses polling instead of filesystem watchers to avoid cross-platform issues.

Code Quality & Best Practices

Strengths:

Well-documented code with clear comments explaining the workarounds
Good separation of concerns with distinct functions for each responsibility
Proper cleanup of intervals and event listeners to prevent memory leaks
Type-safe with explicit Promise return type on fetchWrapper

Issues & Suggestions:

Hardcoded Route Path (Critical) - mod.ts:160-163
```
const routePath = join(
    process.cwd(),
    ".next/server/app/api/rivet/[...all]/route.js",
);
```
This assumes a fixed API route structure. This will break if users:
- Use a different route path (e.g., /api/custom/[...all])
- Use TypeScript in production builds
- Have a custom Next.js build directory
Recommendation: Make the route path configurable via inputConfig or detect it dynamically.
Log Level Inconsistency - mod.ts:167
```
logger().info("checking");
```
This logs "checking" every second at INFO level, which will spam logs. Should be debug() or removed entirely.
Environment Check Inconsistency - mod.ts:58 vs mod.ts:17
- Line 58 checks NODE_ENV !== "development"
- Line 17 checks NODE_ENV !== "production"
These are not equivalent (could be "test", "staging", etc.). Consider standardizing on one approach or explicitly handling all cases.

Missing Cleanup on Abort Signal - mod.ts:93

request.signal?.addEventListener("abort", abortMerged);

The event listener is never removed, causing a potential memory leak if the mergedController outlives the request. Consider:

const cleanup = () => {
    request.signal?.removeEventListener("abort", abortMerged);
    clearInterval(watchIntervalId);
};
mergedController.signal.addEventListener("abort", cleanup);

Potential Bugs

Race Condition in Stream Wrapping - mod.ts:131-134
The interval might not be cleared if the stream is read asynchronously after the response is returned. The onFinish callback could fire after the function returns but before cleanup happens.

File Watcher Not Stopped on Error - mod.ts:123
If fetch(newReq) throws an exception, the interval will never be cleared. Wrap in try-finally:

try {
    const response = await fetch(newReq);
    // ... existing code
} catch (err) {
    clearInterval(watchIntervalId);
    throw err;
}

Body Already Consumed - mod.ts:112
Passing body: request.body when creating a new Request may fail if the body has already been read. Consider checking request.bodyUsed.

Performance Considerations

Per-Request File Watchers - mod.ts:99
Creating a polling interval for every request is expensive. With concurrent requests, you could have many timers checking the same file. Consider:
- Use a singleton watcher shared across requests
- Increase polling interval to 2-3 seconds (1s seems aggressive)
- Debounce file checks
Synchronous File Operations - mod.ts:169-173
existsSync() and statSync() block the event loop. With many concurrent requests, this could cause performance issues. Consider using async versions or caching results briefly.
Stream Wrapping Overhead - mod.ts:204-229
Every chunk passes through an extra transform, adding latency to streaming responses. This is probably acceptable for dev mode but worth noting.

Security Concerns

No significant security issues identified. The file watching is limited to development mode and watches a build artifact, not user input.

Test Coverage

Missing Tests:

No test files found for this package
Should add tests for:
- File change detection and abort triggering
- Interval cleanup in various scenarios (normal completion, error, abort)
- Stream wrapping and cleanup
- Environment variable branching logic

Additional Recommendations

Configuration Option: Add ability to disable this feature via config for users who don't want it:
```
inputConfig.disableFileWatcher?: boolean
```
Metrics/Telemetry: Log when file changes are detected and runners are terminated to help with debugging.
Error Handling: The error in checkFile is logged but might benefit from alerting if it happens repeatedly.
Type Safety: The fetch parameter type (request: Request, ...args: any) => Response | Promise<Response> uses any. Consider making it more specific or generic.

Verdict

The implementation solves a real problem with Next.js hot reloading, but needs refinement around:

Configurable route paths
Memory leak prevention
Error handling
Performance optimization for concurrent requests
Test coverage

The approach is reasonable given Next.js limitations, but the hardcoded path and per-request watchers are blockers that should be addressed before merging.

claude · 2025-11-05T04:57:16Z

PR Review: Auto shut down runners when source code updates

Summary

This PR implements a development-mode file watcher that automatically shuts down runners when Next.js hot-reloads route files. The implementation addresses a legitimate issue where long-running requests (like SSE streams) persist across hot reloads in development.

Code Quality & Best Practices

Strengths:

Clear separation of concerns with dedicated functions for file watching and stream handling
Good documentation explaining the workarounds for Next.js limitations
Proper cleanup of intervals to prevent memory leaks
Type safety with explicit Promise<Response> return type

Issues:

Incomplete comment (rivetkit-typescript/packages/rivetkit/src/drivers/engine/actor-driver.ts:652)
```
// If we did not use a graceful shutdown, the runner would
```
This comment is cut off mid-sentence. Please complete it to explain the consequences.
Environment check inconsistency (rivetkit-typescript/packages/next-js/src/mod.ts:58)
- Line 17: process.env.NODE_ENV !== "production"
- Line 58: process.env.NODE_ENV !== "development"
The file watcher only runs in development, but the condition checks for !== "development" which would also activate in test/staging environments. Consider using === "development" for clarity.

Duplicate event listener (rivetkit-typescript/packages/next-js/src/mod.ts:93,102)

request.signal?.addEventListener("abort", abortMerged);  // Line 93
request.signal.addEventListener("abort", () => { ... });  // Line 102

Two separate abort listeners are registered on request.signal. The second one (clearing interval) is redundant since the abort will trigger mergedController.abort() via the first listener, which will cause the stream wrapper to clean up. Consider consolidating:

request.signal?.addEventListener("abort", () => {
    abortMerged();
    logger().debug("clearing file watcher interval: request aborted");
    clearInterval(watchIntervalId);
});

Potential Bugs

Optional chaining inconsistency (rivetkit-typescript/packages/next-js/src/mod.ts:93,102)
- Line 93 uses request.signal?.addEventListener (optional chaining)
- Line 102 uses request.signal.addEventListener (no optional chaining)
This will throw if request.signal is undefined. Use optional chaining consistently.
Race condition in file watching (rivetkit-typescript/packages/next-js/src/mod.ts:190-192)
The interval starts immediately with checkFile(), then every 1000ms. If the file changes between the initial check and first interval tick, there's a 1-second window where changes won't be detected. Consider:
```
const intervalId = setInterval(checkFile, 1000);
checkFile(); // Run after setting up interval
return intervalId;
```
Actually, looking more carefully, the current implementation calls checkFile() first (line 190), then starts the interval. This is correct, but there's still a race: if the file changes between line 190 and 192, and then changes back before the next interval, the change could be missed.
Request body stream consumption (rivetkit-typescript/packages/next-js/src/mod.ts:112)
When creating newReq, the original request.body is passed. Request bodies can only be read once. If the body was already consumed (e.g., for logging/middleware), this will fail. Consider checking request.bodyUsed or cloning the request.
Hardcoded route path (rivetkit-typescript/packages/next-js/src/mod.ts:160-163)
```
const routePath = join(
    process.cwd(),
    ".next/server/app/api/rivet/[...all]/route.js",
);
```
This assumes:
- The API route is always at /api/rivet
- Next.js app router (not pages router)
- Specific build output structure
If users configure a different route path or use pages router, this will fail silently. Consider making this configurable or documenting the assumption clearly.

Performance Considerations

Polling interval efficiency (rivetkit-typescript/packages/next-js/src/mod.ts:192)
1-second polling with statSync() on every active request could be expensive with many concurrent connections:
- 10 SSE connections = 10 stats/second
- 100 SSE connections = 100 stats/second
Consider:
- Increasing the interval (2-3 seconds might be acceptable for dev)
- Using a singleton watcher with reference counting
- Documenting the performance characteristics
Stream wrapper overhead (rivetkit-typescript/packages/next-js/src/mod.ts:200-230)
Every chunk passes through the custom ReadableStream wrapper. This adds overhead but is necessary for cleanup. The implementation looks efficient.

Security Concerns

Development-only code
The file watching is correctly limited to development mode. Good separation.
File system access (rivetkit-typescript/packages/next-js/src/mod.ts:169,173)
Uses existsSync() and statSync() on a predictable path. Since this is development-only and uses process.cwd() as the base, this is acceptable. Error handling is present.

Test Coverage

Missing tests:

No tests found for the next-js package
Consider adding tests for:
- File watching triggers on mtime change
- Interval cleanup on various scenarios (abort, stream finish, no body)
- Stream wrapper correctly passes through data
- AbortController merge logic
- Environment-based behavior switching

Recommended test cases:

describe("handleRequestWithFileWatcher", () => {
  it("should clean up interval when stream finishes");
  it("should clean up interval when request is aborted");
  it("should clean up interval immediately if no response body");
  it("should abort merged controller when file changes");
  it("should handle file stat errors gracefully");
});

Additional Recommendations

Configuration option
Consider adding an option to disable file watching even in development, e.g., disableDevFileWatcher: boolean in config. This would help with debugging or performance testing.
Metrics/observability
Consider adding metrics for:
- Number of active file watchers
- File change detections
- Watcher cleanup events
Error boundary
If file watching fails to initialize, should the request still proceed? Currently it will (errors are caught), which seems correct, but consider logging a warning on the first failure.
Type safety improvement (rivetkit-typescript/packages/next-js/src/mod.ts:87)
```
fetch: (request: Request, ...args: any) => Response | Promise<Response>
```
The any type could be more specific. What are the actual additional arguments?

Summary

The PR solves a real development experience problem with a practical workaround. The implementation is generally solid, but has several issues that should be addressed:

Must fix:

Complete the truncated comment in actor-driver.ts:652
Fix optional chaining inconsistency for request.signal
Consolidate duplicate abort event listeners

Should fix:

Request body consumption issue
Environment check clarity
Document/make configurable the hardcoded route path

Nice to have:

Add test coverage
Add configuration option to disable feature
Optimize polling with reference-counted singleton watcher

The core logic is sound, but the edge cases and cleanup paths need attention to prevent subtle bugs in production.

claude · 2025-11-05T05:14:13Z

Pull Request Review

Overview

This PR implements automatic shutdown of runners when source code updates in Next.js development mode. The implementation adds file watching functionality to detect changes and gracefully abort running requests.

Code Quality & Best Practices

Positive aspects:

Good separation of concerns with dedicated functions (handleRequestWithFileWatcher, watchRouteFile, waitForStreamFinish)
Comprehensive inline documentation explaining the "why" behind design decisions
Proper cleanup of intervals and event listeners
Appropriate use of environment checks to limit file watching to development mode only

Suggestions:

Incomplete comment in actor-driver.ts:652 - The comment is cut off mid-sentence:
```
// If we did not use a graceful shutdown, the runner would
```
This should be completed to explain the consequences of not using graceful shutdown.
Memory leak potential - In handleRequestWithFileWatcher (line 102-105), there's a duplicate event listener registration:
- Line 93: request.signal?.addEventListener('abort', abortMerged)
- Line 102: request.signal.addEventListener('abort', () => {...})
The first listener (abortMerged) should be removed when the interval is cleared. Consider removing it in the interval cleanup callback.
Hard-coded route path - The path .next/server/app/api/rivet/[...all]/route.js (line 162) is hard-coded. If users have a different route structure or Next.js changes its build output structure, this will break silently. Consider:
- Adding a configuration option to override the path
- Detecting the route path dynamically
- Adding a warning if the file doesn't exist after the first check

Potential Bugs

Race condition in stream cleanup - If the stream finishes before watchRouteFile detects a change, the interval is cleared. However, if a file change is detected while the stream is finishing, both paths might try to clear the interval. While clearInterval is idempotent, there's a brief window where abortController.abort() might be called after the stream completes. This is likely harmless but worth noting.
Request body consumption - Line 112 passes request.body to the new Request constructor. If the original request body was already read/consumed, this will fail. Consider checking if the body is readable or handling this edge case.
Missing null check - Line 102 accesses request.signal.addEventListener without the optional chaining used on line 93. This could throw if request.signal is undefined. Should be:
```
request.signal?.addEventListener('abort', () => {...})
```

Performance Considerations

Polling interval - The 1-second polling interval (line 192) seems reasonable for development, but creates one file stat per request per second. For applications with many concurrent long-running requests, this could add up. Consider:
- Making the interval configurable
- Using a shared file watcher across requests (with proper cleanup on hot reload)
- Increasing the interval to 2-3 seconds for less aggressive checking
File stat overhead - Using statSync (line 173) blocks the event loop briefly. For a 1-second interval this is probably fine, but consider stat (async) with proper error handling if you want to reduce blocking.

Security Concerns

No significant security issues identified. The code:

Only runs in development mode
Doesn't expose file system information to the client
Properly handles abort signals
Uses safe file system operations

Test Coverage

The PR doesn't include tests. Consider adding:

Unit tests for waitForStreamFinish to verify cleanup in success/error/cancel scenarios
Integration tests verifying file change detection triggers abort
Tests for cleanup of intervals and event listeners to prevent memory leaks
Edge case tests for missing files, permission errors, etc.

Additional Recommendations

Logging consistency - Some logs use structured format (e.g., { msg: '...', routePath }) while others don't (e.g., line 103). Standardize on structured logging throughout.
Error handling in watchRouteFile - The error is logged but the watcher continues running even if the file doesn't exist. Consider stopping the interval after N consecutive failures to avoid unnecessary work.
Environment variable check - Line 58 checks NODE_ENV !== 'development' while line 17 checks NODE_ENV !== 'production'. This creates three states: development, production, and "other". Make sure the behavior for the "other" state is intentional.

Summary

The implementation is well-structured and solves a real development pain point with Next.js hot reloading. The main concerns are:

Critical: Missing null check on line 102
Important: Incomplete comment in actor-driver.ts
Nice to have: Memory leak prevention, configurable polling interval, test coverage

Overall, this is a solid solution to a tricky problem. With the fixes above, this should be ready to merge.

claude · 2025-11-05T05:15:34Z

Pull Request Review

Summary

This PR implements auto-shutdown of runners when source code updates in Next.js development mode. The solution uses a polling-based file watcher to detect changes to the compiled route file and gracefully aborts requests to trigger runner restarts.

Code Quality & Best Practices

Positive Aspects

Well-documented: Excellent use of comments explaining the workarounds and design decisions (especially the HACK comments which are helpful)
Clean separation: The new functionality is properly isolated into separate functions (handleRequestWithFileWatcher, watchRouteFile, waitForStreamFinish)
Proper resource cleanup: Good attention to clearing intervals in multiple scenarios (stream finish, abort, no body)
Type safety: Return type annotation added to fetchWrapper improves type safety

Suggestions for Improvement

1. Environment check inconsistency (mod.ts:58)
Line 17 checks !== production but line 58 checks !== development. This means file watching will activate in all non-development modes (production, test, staging, etc.). Recommendation: Change line 58 to process.env.NODE_ENV === development to match the intent.

2. Potential memory leak with abort listeners (mod.ts:93, 102)
Two separate abort listeners are attached to request.signal. The first listener (abortMerged) is never removed. Recommendation: Either use a single listener or store references for proper cleanup.

3. Hard-coded route path (mod.ts:160-163)
The path /api/rivet is hard-coded but could theoretically be customized by users. Recommendation: Make this configurable or derive it from the config if possible.

Potential Bugs & Issues

Critical Issues

1. Race condition in file watcher (mod.ts:166-193)
If the file doesn't exist on first check, lastMtime stays null. When the file is created later, it won't trigger an abort (since lastMtime === null). Impact: Runner won't restart when switching branches or after clean builds. Recommendation: Initialize lastMtime even when file doesn't exist, or handle the file created case.

2. Incomplete comment in actor-driver.ts:652
Comment is incomplete (sentence cuts off mid-thought). Recommendation: Complete the sentence to explain what would happen.

3. Body stream consumption issue (mod.ts:108-120)
request.body is a stream that can only be read once. If the body was already partially consumed, this will fail. Recommendation: Check if body was consumed or use spread operator to copy properties.

Medium Priority Issues

4. Interval not cleared on immediate abort (mod.ts:99-105)
If request.signal is already aborted when the function is called, the interval is created but may not be cleared immediately. Recommendation: Check request.signal.aborted before starting the interval.

5. Error handling in stream wrapper (mod.ts:218-221)
controller.error() can throw if called after controller.close(). Recommendation: Add guard or wrap in try-catch.

Performance Considerations

1. Polling every 1 second per request (mod.ts:192)
Issue: Creates one interval per concurrent request (O(n) intervals for n requests). With many concurrent SSE connections, this creates significant overhead. Each interval performs a synchronous statSync() call every second. Impact: High CPU usage during development with multiple active connections. Recommendation: Use a singleton watcher shared across all requests with observers, or increase interval to 2-3 seconds.

2. Synchronous file system calls (mod.ts:169, 173)
existsSync and statSync block the event loop. Recommendation: Consider using async versions although the current approach is acceptable for development-only code.

Security Concerns

1. Development-only feature - File watching is properly restricted to development mode. No security concerns for production deployments.

2. Path traversal protection - Hard-coded path with join() prevents path traversal. No user input in file path construction.

3. Resource exhaustion potential - Multiple concurrent requests each create an interval. An attacker could open many connections to consume CPU/memory. Mitigation: This is development-only, so risk is minimal. Recommendation: Consider adding a maximum concurrent watcher limit.

Test Coverage

Missing Tests: No test files found for next-js package. Critical path (file watching, stream wrapping, abort handling) is untested. Recommendation: Add tests for file change detection, interval cleanup, stream wrapper behavior, memory leak prevention, and edge cases.

Summary & Recommendations

Priority Fixes:

Fix environment check (mod.ts:58) - Use === development instead of !== development
Fix file creation race condition - Handle case when file doesn't exist initially
Clean up abort listeners - Prevent memory leaks
Optimize polling - Consider singleton watcher or longer interval

Nice to Have:

Complete the comment in actor-driver.ts:652
Add test coverage for new functionality
Make route path configurable
Use async file system operations

Overall Assessment:
The PR implements a pragmatic solution to a real problem with Next.js hot reloading. The code is well-structured and documented. However, there are some potential bugs (environment check, race conditions) and performance concerns (per-request polling) that should be addressed before merging.

Recommended Action: Request changes to fix the critical environment check issue and file creation race condition. Other issues can be addressed in follow-up PRs.

NathanFlurry mentioned this pull request Nov 5, 2025

fix(rivetkit): fix duplicate runner key in serverless #3366

Closed

NathanFlurry mentioned this pull request Nov 5, 2025

chore(runner): improve runner connection logs #3365

Closed

vercel bot deployed to Preview – rivetkit-serverless November 5, 2025 04:06 View deployment

NathanFlurry force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from e376715 to 943ae66 Compare November 5, 2025 04:55

NathanFlurry marked this pull request as ready for review November 5, 2025 04:55

NathanFlurry requested review from MasterPtato and jog1t and removed request for MasterPtato November 5, 2025 04:55

NathanFlurry mentioned this pull request Nov 5, 2025

chore(pegboard): allow configuring reschedule_backoff_max_exponent #3370

Merged

vercel bot deployed to Preview – rivetkit-serverless November 5, 2025 04:57 View deployment

graphite-app bot changed the base branch from 11-04-fix_rivetkit_fix_duplicate_runner_key_in_serverless to graphite-base/3369 November 5, 2025 05:11

graphite-app bot force-pushed the graphite-base/3369 branch from 3763786 to fdd89a7 Compare November 5, 2025 05:12

graphite-app bot force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from 943ae66 to d42c437 Compare November 5, 2025 05:12

graphite-app bot changed the base branch from graphite-base/3369 to main November 5, 2025 05:13

fix(next-js): auto shut down runners when source code updates

ddef08f

graphite-app bot force-pushed the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch from d42c437 to ddef08f Compare November 5, 2025 05:13

NathanFlurry merged commit ce0f835 into main Nov 5, 2025
7 of 11 checks passed

vercel bot deployed to Preview – rivetkit-serverless November 5, 2025 05:14 View deployment

NathanFlurry deleted the 11-04-fix_next-js_auto_shut_down_runners_when_source_code_updates branch November 5, 2025 05:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(next-js): auto shut down runners when source code updates #3369

fix(next-js): auto shut down runners when source code updates #3369

Uh oh!

NathanFlurry commented Nov 5, 2025

Uh oh!

vercel bot commented Nov 5, 2025 •

edited

Loading

Uh oh!

NathanFlurry commented Nov 5, 2025 •

edited

Loading

Uh oh!

claude bot commented Nov 5, 2025

Uh oh!

claude bot commented Nov 5, 2025

Uh oh!

claude bot commented Nov 5, 2025

Uh oh!

Uh oh!

claude bot commented Nov 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix(next-js): auto shut down runners when source code updates #3369

fix(next-js): auto shut down runners when source code updates #3369

Uh oh!

Conversation

NathanFlurry commented Nov 5, 2025

Uh oh!

vercel bot commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NathanFlurry commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How to use the Graphite Merge Queue

Uh oh!

claude bot commented Nov 5, 2025

PR Review: Auto Shut Down Runners on Source Code Updates

Summary

Code Quality & Best Practices

Potential Bugs

Performance Considerations

Security Concerns

Test Coverage

Additional Recommendations

Verdict

Uh oh!

claude bot commented Nov 5, 2025

PR Review: Auto shut down runners when source code updates

Summary

Code Quality & Best Practices

Potential Bugs

Performance Considerations

Security Concerns

Test Coverage

Additional Recommendations

Summary

Uh oh!

claude bot commented Nov 5, 2025

Pull Request Review

Overview

Code Quality & Best Practices

Potential Bugs

Performance Considerations

Security Concerns

Test Coverage

Additional Recommendations

Summary

Uh oh!

Uh oh!

claude bot commented Nov 5, 2025

Pull Request Review

Summary

Code Quality & Best Practices

Positive Aspects

Suggestions for Improvement

Potential Bugs & Issues

Critical Issues

Medium Priority Issues

Performance Considerations

Security Concerns

Test Coverage

Summary & Recommendations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vercel bot commented Nov 5, 2025 •

edited

Loading

NathanFlurry commented Nov 5, 2025 •

edited

Loading