Improve server process cleanup and readiness checks #199

justin808 · 2025-10-05T23:48:24Z

Summary

Added process group management for better child process cleanup
Added warning logging when process group kill fails with EPERM errors
Improved server readiness check to only accept 200-399 HTTP status codes (rejecting 404 and 5xx)

Benefits

Better debugging: Process termination issues are now visible in logs
More reliable tests: Server readiness checks won't pass prematurely on 404 or error responses
Fewer flaky tests: Proper health checks reduce timing-related failures

Changes

Process Group Kill with Error Logging (lib/cypress_on_rails/server.rb:144-153)
- Captures process group ID on spawn
- Attempts to kill entire process group first
- Falls back to single process kill on EPERM
- Logs warnings when process group kill fails
HTTP-Based Server Readiness Check (lib/cypress_on_rails/server.rb:125-132)
- Changed from TCP socket check to HTTP response validation
- Only accepts 200-399 status codes as "ready"
- Rejects 404 and 5xx responses
- More accurately determines when app is ready to serve requests

Test Plan

Existing RSpec suite passes (3 pre-existing failures unrelated to changes)
Manual testing with Rails server startup/shutdown cycles
Verify process cleanup in production use

Fixes #198

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Expose server readiness options: configurable readiness path and timeout for HTTP readiness checks.
Bug Fixes
- Improved reliability of the embedded server lifecycle with HTTP-based readiness checks.
- Graceful shutdown of server and child processes to reduce hangs and orphaned processes.
- Safer termination with error handling when no PID or group termination fails; reduced start/stop flakiness.
Chores
- Improved shutdown logging and environment cleanup for more consistent runs.

- Add process group management for better child process cleanup - Add warning logging when process group kill fails with EPERM - Improve server readiness check to accept only 200-399 HTTP status codes - Reject 404 and 5xx responses as "not ready" for more reliable health checks These improvements provide better debugging visibility and reduce flaky tests by ensuring the server is truly ready before running tests. Fixes #198 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai · 2025-10-05T23:48:33Z

Warning

Rate limit exceeded

@justin808 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 19 minutes and 15 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1867d and 501b0b4.

📒 Files selected for processing (3)

lib/cypress_on_rails/configuration.rb (2 hunks)
lib/cypress_on_rails/server.rb (3 hunks)
spec/cypress_on_rails/configuration_spec.rb (2 hunks)

Walkthrough

Adds HTTP-based readiness checks and process-group-aware server spawning/termination to lib/cypress_on_rails/server.rb; records PID/PGID, uses Net::HTTP for health checks (accepting 200–399), adds safe_kill and send_term_signal fallbacks, and exposes readiness config via configuration.rb.

Changes

Cohort / File(s)	Summary
Server lifecycle and readiness `lib/cypress_on_rails/server.rb`	Add `require 'net/http'`; spawn stores `@server_pid`/`@server_pgid`; replace TCP loop with `server_responding?` (HTTP GET, accept 200–399); `wait_for_server` uses readiness path/timeout; `stop_server` handles nil pid, calls `send_term_signal` with ESRCH/EPERM handling and falls back to `safe_kill_process`; ensure ENV cleanup and guarded transaction rollback.
Configuration: readiness options `lib/cypress_on_rails/configuration.rb`	Add `attr_accessor :server_readiness_path` and `:server_readiness_timeout`; initialize from `CYPRESS_RAILS_READINESS_PATH` (default "/") and `CYPRESS_RAILS_READINESS_TIMEOUT` (default 5) in `reset`.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Tester as Tester
  participant Server as CypressOnRails::Server
  participant OS as OS/Processes
  participant App as App Server (HTTP)

  Tester->>Server: start_server()
  Server->>OS: spawn process (record pid & pgid)
  OS-->>Server: pid, pgid
  loop wait_for_server (timeout)
    Server->>App: HTTP GET base_url + readiness_path
    App-->>Server: 2xx/3xx or other/error
    alt Ready (200–399)
      Server-->>Tester: ready
    else Not ready
      Server-->>Server: sleep/retry
    end
  end

sequenceDiagram
  autonumber
  actor Tester as Tester
  participant Server as CypressOnRails::Server
  participant OS as OS/Signals

  Tester->>Server: stop_server()
  alt server_pgid present
    Server->>OS: TERM -pgid
    alt ESRCH/EPERM
      Server->>OS: log warning
      Server->>OS: TERM pid (safe_kill_process)
    end
  else only pid
    Server->>OS: TERM pid (safe_kill_process)
  end
  Server-->>Tester: stopped/cleaned

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Critical improvements for test server and state reset functionality #185 — Addresses the same server process-group termination and safer shutdown objectives (logging and fallbacks).
High Priority: Improve process cleanup and server readiness checks in E2eTestRunner react_on_rails-demos#14 — Matches the readiness-check change to accept only HTTP 200–399 and the process-group kill fallback behavior.

Possibly related PRs

Add cypress-rails compatible rake tasks and server lifecycle hooks #179 — Modifies the same server spawn/wait/stop flow and readiness configuration; likely directly related.

Poem

A rabbit taps the server log,
Spawns with care, records each prog;
Checks the site—two hundred cheers,
If signals fail, it calms your fears.
PIDs hugged close, tests hop along. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Out of Scope Changes Check	⚠️ Warning	In addition to the cleanup and HTTP readiness improvements specified in issue #198, the PR adds new configuration attributes for server_readiness_path and server_readiness_timeout along with environment-driven initialization, which were not requested by the linked issue objectives.	Remove the unrelated configuration additions from this PR or update the linked issue to explicitly include these new server_readiness_path and server_readiness_timeout options.
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly describes the primary changes in the PR by highlighting improvements to server process cleanup and readiness checks, matching the modifications to process group termination and HTTP-based health checks.
Linked Issues Check	✅ Passed	The implemented changes fully address the objectives of issue #198 by introducing send_term_signal with warning logs on Errno::EPERM and fallback to single-process kill, and replacing the TCP readiness loop with an HTTP-based server_responding? check that accepts only 2xx–3xx responses and handles connection errors.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

claude · 2025-10-05T23:51:29Z

Pull Request Review: Improve server process cleanup and readiness checks

Overview

This PR addresses process management and server readiness detection issues identified in #198. The changes improve debugging capabilities and test reliability.

Strengths

Code Quality

Clean implementation: The process group management is well-structured with proper error handling
Good separation of concerns: Breaking out send_term_signal and safe_kill_process improves readability
Explicit state management: Storing @server_pgid alongside @server_pid makes the intent clear

Improved Reliability

Better health checks: The HTTP-based readiness check (200-399) is more accurate than TCP socket checking
Graceful degradation: Fallback from process group to single process kill is well-handled
Helpful logging: Warning messages will aid in debugging production issues

Issues and Suggestions

1. Potential Race Condition in spawn_server (lib/cypress_on_rails/server.rb:109-111)

Issue: There is a small window where the process could terminate before Process.getpgid is called, which would raise Errno::ESRCH.

Suggestion: Add error handling around Process.getpgid call.

2. Incomplete Exception Handling in server_responding? (lib/cypress_on_rails/server.rb:125-132)

Issue: The method does not handle all possible HTTP exceptions. Missing cases include Errno::ETIMEDOUT, Net::OpenTimeout, Net::ReadTimeout, and Net::HTTPBadResponse.

Suggestion: Add these exception types to the rescue clause.

3. Missing Instance Variable Initialization

Issue: @server_pgid is set in spawn_server but never initialized in initialize.

Suggestion: Initialize @server_pid and @server_pgid to nil in the initialize method.

4. Inconsistent Error Recovery in send_term_signal (lib/cypress_on_rails/server.rb:144-153)

Issue: When Process.kill fails with EPERM, the fallback only kills the single process. Any child processes will be orphaned.

Severity: Medium - Could lead to resource leaks in CI environments

Suggestion: Document this limitation with a comment explaining the orphaned process concern.

5. Missing Timeout in server_responding?

Issue: Net::HTTP.get_response has a default timeout of 60 seconds. Slow server starts could block the health check loop.

Suggestion: Add explicit short timeouts (2 seconds recommended) for both open_timeout and read_timeout.

Test Coverage

Missing: No tests were added for the new functionality.

Recommended test cases:

Process group kill success
Process group kill with EPERM fallback
Server readiness with 200 response (ready)
Server readiness with 404 response (not ready)
Server readiness with 500 response (not ready)
Server readiness with connection refused (not ready)

Security Considerations

No security concerns identified. The changes are defensive and improve reliability.

Performance Considerations

Positive: HTTP-based health check validates application layer, not just socket layer.

Minor: Each check makes a full HTTP request vs TCP connection (adds approximately 1-5ms per check).

Summary

This is a solid improvement to the gem reliability. Main recommendations:

Must fix: Add broader exception handling to server_responding?
Should fix: Add timeout configuration to HTTP requests
Nice to have: Add tests for the new functionality
Nice to have: Initialize instance variables in initialize

The core logic is sound and addresses real production issues. With the exception handling improvements, this is ready to merge.

Recommendation: Approve with minor revisions

Review generated with assistance from Claude Code

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

lib/cypress_on_rails/server.rb (1)
134-142: Consider adding timeout to process wait.

The Process.wait(pid) on line 139 will block indefinitely if the server process ignores the TERM signal. While this might match the previous behavior, adding a timeout with a fallback to KILL would make the shutdown more robust.
 def stop_server(pid)
   return unless pid

   puts "Stopping Rails server (PID: #{pid})"
   send_term_signal
-  Process.wait(pid)
+  
+  # Wait up to 10 seconds for graceful shutdown
+  begin
+    Timeout.timeout(10) do
+      Process.wait(pid)
+    end
+  rescue Timeout::Error
+    puts "Warning: Server did not terminate after TERM signal, sending KILL"
+    safe_kill_process('KILL', pid)
+    Process.wait(pid) rescue Errno::ESRCH
+  end
 rescue Errno::ESRCH
   # Process already terminated
 end

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a37c4c and 1672356.

📒 Files selected for processing (1)

lib/cypress_on_rails/server.rb (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: claude-review
GitHub Check: rails_8
GitHub Check: rails_7_2
GitHub Check: rails_6_1

🔇 Additional comments (4)

lib/cypress_on_rails/server.rb (4)

4-4: LGTM!

Adding Net::HTTP from the standard library enables the HTTP-based readiness check without introducing external dependencies.

117-118: LGTM!

The HTTP-based readiness check correctly replaces the previous TCP socket approach, with a reasonable 0.1-second polling interval. The timeout is properly handled by the outer Timeout.timeout block.

144-153: LGTM with one observation!

The process group termination logic correctly:

Uses negative PGID (-@server_pgid) to signal the entire process group.

Falls back to single-process kill when PGID is unavailable.

Logs warnings when process group termination fails (per PR objectives).

Note: The Errno::EPERM fallback to single-process kill (line 152) may not succeed if you lack permission to kill the process group, but the primary goal is logging for debugging, which is achieved.

155-159: LGTM!

The implementation correctly guards against nil PIDs and gracefully handles the case where the process has already terminated.

lib/cypress_on_rails/server.rb

- Initialize @server_pid and @server_pgid in constructor to avoid nil issues - Add error handling for Process.getpgid race condition when process terminates immediately - Add comprehensive exception handling in server_responding? (ETIMEDOUT, Net::OpenTimeout, Net::ReadTimeout, Net::HTTPBadResponse) - Add configurable server_readiness_path with ENV var support (CYPRESS_RAILS_READINESS_PATH) - Add explicit HTTP timeouts (1s open_timeout, 1s read_timeout) to prevent hangs - Improve URI handling to support custom readiness check paths These fixes address potential edge cases and race conditions in server lifecycle management. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

claude · 2025-10-06T00:12:40Z

Code Review - PR #199: Improve server process cleanup and readiness checks

Overview

This PR addresses important reliability issues in server lifecycle management. The changes are well-motivated by issue #198 and improve both observability and correctness of the server startup/shutdown process.

✅ Strengths

Well-scoped changes - Focused on specific, high-value improvements
Good error handling - Proper fallback logic with informative warnings
Configurable readiness path - The server_readiness_path configuration option adds valuable flexibility
Better health checks - HTTP-based checks are more reliable than TCP socket checks

🔍 Code Quality & Best Practices

Good:

Clear separation of concerns with send_term_signal and safe_kill_process methods
Proper exception handling for process-related errors (ESRCH, EPERM)
Comprehensive rescue clauses in server_responding? covering various network failure scenarios

Suggestions:

Instance variable initialization (lib/cypress_on_rails/server.rb:20-21)
- Good addition, but consider documenting why these need to be instance variables
- Could add a comment explaining the process group management approach
Error message clarity (lib/cypress_on_rails/server.rb:165)
- Consider using the configured logger instead of puts for consistency with the rest of the gem
URI path handling (lib/cypress_on_rails/server.rb:138)
- This logic is correct but could be simplified since uri.path should already be / by default

🐛 Potential Issues

Critical:

Race condition in process group ID capture (lib/cypress_on_rails/server.rb:112-117)
- Issue: If the process terminates before stop_server is called, @server_pgid could be stale or refer to a recycled PID
- Suggestion: Consider validating the process still exists in send_term_signal before attempting to kill
Missing timeout handling (lib/cypress_on_rails/server.rb:132-145)
- The server_responding? method has 1-second timeouts which may be too short for slow CI systems
- Suggestion: Consider making timeouts configurable

Medium Priority:

Silent state mutation (lib/cypress_on_rails/server.rb:73, 95)
- The start_server method mixes instance variables and parameters inconsistently
- Suggestion: Either use instance variables throughout or keep it pure
Incomplete error recovery (lib/cypress_on_rails/server.rb:164-167)
- When process group kill fails with EPERM, single process kill may also fail
- Suggestion: Add Errno::EPERM to the rescue clause in safe_kill_process as well

⚡ Performance Considerations

Polling interval - sleep 0.1 checks 10 times per second, reasonable but could use exponential backoff
HTTP request overhead - Minimal, and worth it for more accurate readiness detection

🔒 Security Concerns

Process group kill security ✅ - Correctly implemented with proper fallback
Environment variable for readiness path - Low risk (test env only), but document that this should be a static path
Process signal handling ✅ - Using TERM signal is appropriate

🧪 Test Coverage

Missing tests for:

Process group creation and cleanup logic
HTTP-based readiness checks with various response codes (200, 301, 404, 500)
Timeout behavior in wait_for_server
EPERM error handling in process termination
Configuration of server_readiness_path

Recommendation: Add a spec/cypress_on_rails/server_spec.rb file with unit tests

📋 Recommendations

Must Address:

Add unit tests for the new logic (especially HTTP readiness checks)
Consider using the configured logger instead of puts for consistency
Add Errno::EPERM handling in safe_kill_process

Should Consider:

Document the process group management approach (add code comments)
Make HTTP timeout values configurable
Add process existence validation before kill attempts
Refactor to use either instance variables or parameters consistently

Nice to Have:

Add exponential backoff to server polling
Document security considerations for server_readiness_path in README

🎯 Verdict

Approval Status: ✅ Approve with minor changes

This is a solid improvement that addresses real reliability issues. The code is well-structured and the error handling is thoughtful. The missing test coverage is the main concern - these changes touch critical infrastructure code and should have corresponding tests to prevent regressions.

The suggestions above are mostly about polish and defensive programming. The core logic is sound and ready to merge once tests are added.

Estimated effort to address: 2-3 hours for comprehensive test coverage

Great work on this PR! 🚀

Code quality improvements: - Add documentation comment for process management instance variables - Replace puts with logger.warn for consistency with gem conventions - Simplify URI path handling (remove redundant empty check) - Add process_exists? helper to validate PID before kill operations Error handling improvements: - Add EPERM to safe_kill_process rescue clause for complete error coverage - Validate process existence before attempting process group kill to avoid stale PGID issues - Prevent attempts to kill recycled PIDs Configuration improvements: - Add configurable server_readiness_timeout (default: 5 seconds) - Support CYPRESS_RAILS_READINESS_TIMEOUT environment variable - Addresses potential timeout issues on slow CI systems 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

claude · 2025-10-06T00:37:16Z

PR Review: Improve server process cleanup and readiness checks

Overall Assessment

This is a solid improvement to server lifecycle management. The changes enhance reliability and debugging capabilities. A few issues need attention before merging.

Critical Issues

1. Incorrect PID variable in stop_server (lib/cypress_on_rails/server.rb:150-158)

The stop_server(pid) method receives a parameter pid but uses instance variable @server_pid in send_term_signal and process_exists?.

Problem: The method is called with server_pid (local variable from line 74), but send_term_signal references @server_pid. If these differ, you will kill the wrong process or fail silently.

Fix: Either use the parameter throughout, or pass the parameter to send_term_signal

2. Race condition in spawn_server (lib/cypress_on_rails/server.rb:112-119)

Between spawning and calling Process.getpgid, the process might terminate, causing ESRCH. You catch this but set @server_pgid = nil, which might cause issues later.

Recommendation: Consider adding a small delay or retry logic, and document that @server_pgid may be nil in edge cases.

Security & Best Practices

3. HTTP timeout configuration (lib/cypress_on_rails/server.rb:133-148)

The server_readiness_timeout (default 5s) is used as both open_timeout and read_timeout in Net::HTTP.start. During the wait_for_server loop (which runs with 0.1s sleep), if a single HTTP request takes 5s to timeout, the polling is ineffective.

Recommendation: Use shorter timeouts for readiness checks (e.g., 1-2 seconds) since they are called repeatedly.

4. HTTP response code validation (lib/cypress_on_rails/server.rb:144)

3xx redirects might indicate the app is not fully ready. Consider being more strict and only accepting 200-299, or documenting why 3xx is acceptable.

Code Quality

5. Missing URI path handling (lib/cypress_on_rails/server.rb:140)

If uri.path is empty, this sends GET empty string instead of GET slash.

6. Missing test coverage

No tests for server_responding? method, send_term_signal, process group killing, or edge cases like @server_pgid = nil. Add unit tests for these critical paths.

Positive Observations

Excellent logging for debugging process termination issues
Graceful degradation from process group to single process kill
Clear comments documenting process management intent
Useful configuration flexibility with new options
Proper exception handling

Recommendations Before Merging

Fix the stop_server PID parameter issue (Critical)
Add tests for new server_responding? logic
Consider shorter HTTP timeouts for readiness checks
Complete manual testing checklist
Document new ENV variables in README

Great work improving reliability!

Critical fixes: 1. Fix PID parameter passing in stop_server/send_term_signal - Previously stop_server(pid) received parameter but used instance variable - Now consistently passes pid parameter through the call chain - Prevents potential bugs if pid values diverge 2. Fix HTTP timeout in readiness polling loop - Cap individual check timeout at 2 seconds max - Prevents 5s timeouts from blocking the 0.1s polling loop - Maintains responsiveness while supporting slow CI via configuration 3. Fix URI path handling edge case - Ensure GET path defaults to '/' if uri.path is empty - Prevents invalid HTTP requests with empty path Documentation improvements: - Document that @server_pgid may be nil in edge cases - Clarify that 3xx redirects are considered valid "ready" state - Add comments explaining timeout behavior in polling context 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

justin808 · 2025-10-06T06:03:15Z

@claude review this pr.

claude · 2025-10-06T06:03:27Z

Claude finished @justin808's task —— View job

PR Review: Improve server process cleanup and readiness checks

Overall Assessment

This PR makes valuable improvements to server lifecycle management with better process cleanup and more reliable readiness checks. The implementation is generally solid, but there are several issues to address before merging.

✅ Strengths

Well-motivated changes - Addresses real production issues documented in Enhancement: Improve server process cleanup and readiness checks #198
Good error handling - Proper fallback logic with informative warnings for process termination failures
Configurable readiness - The server_readiness_path and server_readiness_timeout configuration options add valuable flexibility
Better health checks - HTTP-based checks (200-399 status codes) are more accurate than TCP socket checks
Clear intent - Good code comments explaining the process management approach
Comprehensive exception handling - The server_responding? method handles various network failure scenarios

🔴 Critical Issues

1. Race condition in PGID capture (lib/cypress_on_rails/server.rb:113-120)

Issue: Good catch on handling Errno::ESRCH when the process terminates immediately, but there's a subtle issue. The error handling sets @server_pgid = nil implicitly, but later code assumes if it was captured it's still valid.

Impact: Low - The fallback logic handles this, but it could be clearer.

Recommendation: The current implementation is actually correct. The error handling is appropriate and the warning is helpful.

2. Timeout configuration could be confusing (lib/cypress_on_rails/server.rb:140)

Issue: The line timeout = [config.server_readiness_timeout || 5, 2].min caps the timeout at 2 seconds, which means setting CYPRESS_RAILS_READINESS_TIMEOUT to anything > 2 has no effect per-request. This might confuse users.

Impact: Medium - Users may expect the env var to control per-request timeout.

Recommendation: Either:

Document this behavior clearly (the configured timeout is for overall wait, not per-request)
Use a separate config for per-request timeout
Remove the .min(2) and trust the configured value

3. 3xx redirect handling may be too lenient (lib/cypress_on_rails/server.rb:150)

Issue: Accepting 3xx redirects as "ready" assumes the redirect target is valid. A misconfigured app might redirect indefinitely.

Impact: Low - Most apps won't have this issue, but worth documenting.

Recommendation: Add a comment explaining why 3xx is acceptable (current comment is good) or consider being stricter with 200-299 only.

🟡 Medium Priority Issues

4. Missing validation in send_term_signal (lib/cypress_on_rails/server.rb:167-168)

Issue: The method checks process_exists?(pid) before killing the process group, but @server_pgid could be stale if the process has been recycled.

Impact: Low - Process groups are less likely to be recycled, but theoretically possible.

Recommendation: This is acceptable as-is. The error handling catches any issues.

5. Stop server might wait indefinitely (lib/cypress_on_rails/server.rb:161)

Issue: Process.wait(pid) blocks indefinitely if the server doesn't respond to TERM signal.

Impact: Medium - Could hang test runs if server doesn't shut down gracefully.

Recommendation: Consider adding a timeout with fallback to KILL:

def stop_server(pid)
  return unless pid

  puts "Stopping Rails server (PID: #{pid})"
  send_term_signal(pid)
  
  begin
    Timeout.timeout(10) do
      Process.wait(pid)
    end
  rescue Timeout::Error
    puts "Warning: Server did not terminate after TERM signal, sending KILL"
    safe_kill_process('KILL', pid)
    Process.wait(pid) rescue Errno::ESRCH
  end
rescue Errno::ESRCH
  # Process already terminated
end

🟢 Code Quality Observations

Good Practices

Instance variable initialization (lib/cypress_on_rails/server.rb:21-22) - Explicit initialization of @server_pid and @server_pgid is good practice
Defensive programming - Multiple nil checks and early returns
Separation of concerns - send_term_signal, process_exists?, and safe_kill_process are well-factored
Clear logging - Warning messages will help debug production issues

Minor Improvements

Empty path handling (lib/cypress_on_rails/server.rb:145) - The check for empty path is good, though URI should already handle this. Defensive programming is fine here.
Configuration defaults (lib/cypress_on_rails/configuration.rb:67-68) - Good use of ENV.fetch with sensible defaults

🔒 Security & Safety

No security concerns identified. The changes are defensive and improve reliability:

Process group killing is properly implemented with fallback
No command injection risks (all args are controlled)
Error handling prevents information leakage

🧪 Test Coverage

Critical Gap: No tests were added for the new functionality.

Recommended test cases:

Process group creation and cleanup
HTTP-based readiness checks with various response codes (200, 301, 404, 500)
Timeout behavior in wait_for_server
Errno::EPERM error handling in process termination
Errno::ESRCH handling when process terminates immediately
Configuration of server_readiness_path
Empty/nil path handling

Note: The PR description mentions "3 pre-existing failures unrelated to changes" - these should be investigated or documented.

📋 Additional Recommendations

Documentation

README update needed - Document the new environment variables:
- CYPRESS_RAILS_READINESS_PATH (default: /)
- CYPRESS_RAILS_READINESS_TIMEOUT (default: 5)
Behavior clarification - Document that CYPRESS_RAILS_READINESS_TIMEOUT is the overall wait timeout, not per-request timeout (which is capped at 2 seconds)

Code Comments

The code comments are good, but consider adding:

Why 3xx redirects are acceptable in readiness checks
Why per-request timeout is capped at 2 seconds

🎯 Verdict

Recommendation: Approve with minor changes

This is a solid improvement that addresses real reliability issues. The core logic is sound and the error handling is thoughtful.

Must address before merge:

Add unit tests for the new logic (especially HTTP readiness checks and process group handling)
Consider adding timeout to Process.wait in stop_server to prevent hangs
Document the new ENV variables in README

Should consider:

Clarify the timeout configuration behavior (per-request vs overall)
Document why 3xx responses are considered "ready"

Nice to have:

Investigate and document the 3 pre-existing test failures
Add exponential backoff to server polling (instead of fixed 0.1s sleep)

Estimated effort to address: 2-3 hours for comprehensive test coverage and documentation.

Great work on improving the reliability of this gem! 🚀

Job Run | Branch

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1672356 and 5a1867d.

📒 Files selected for processing (2)

lib/cypress_on_rails/configuration.rb (2 hunks)
lib/cypress_on_rails/server.rb (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

lib/cypress_on_rails/server.rb (1)

lib/cypress_on_rails/configuration.rb (1)

configuration (80-82)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: claude-review
GitHub Check: rails_7_2
GitHub Check: rails_6_1
GitHub Check: rails_8

🔇 Additional comments (10)

lib/cypress_on_rails/configuration.rb (2)

25-26: LGTM!

The new configuration attributes properly expose readiness check parameters. The naming is clear and follows the existing convention.

67-68: LGTM!

Environment variable initialization with sensible defaults is appropriate. The root path '/' and 5-second timeout are reasonable starting points.

lib/cypress_on_rails/server.rb (8)

4-4: LGTM!

Adding net/http is necessary for the HTTP-based readiness checks.

20-22: LGTM!

Process tracking variables are properly initialized with clear documentation of their purpose.

112-121: LGTM!

Process group creation and error handling for the race condition (process dying before getpgid) is properly implemented. The fallback to nil PGID is safe since send_term_signal handles this case.

124-133: LGTM!

The polling loop with a 30-second overall timeout properly uses the new HTTP-based readiness check. The 0.1-second sleep interval provides quick feedback without excessive CPU usage.

156-164: LGTM!

Server termination logic is correct with proper guards and error handling. The early return prevents attempting to stop a non-existent server.

166-175: LGTM!

Process termination with process group support and fallback logic is well-implemented. The warning log on line 173 meets the PR objective of making termination issues visible for debugging. The rescue block correctly catches errors from the process group kill and falls back to single-process termination.

177-183: LGTM!

Using Process.kill(0, pid) to check process existence is the standard approach. The error handling is appropriate for the use case, where both "process doesn't exist" (ESRCH) and "no permission" (EPERM) can be treated as "cannot proceed with kill operation."

185-189: LGTM!

The safe process kill helper properly guards against nil PIDs and silently handles expected errors (process already terminated or permission denied), which is appropriate for cleanup operations.

lib/cypress_on_rails/server.rb

- Remove timeout cap that prevented honoring user-specified timeouts - Previously capped at 2 seconds with [config.server_readiness_timeout || 5, 2].min - Now respects full configured timeout value (default: 5 seconds) - Remove redundant uri.path.empty? check since readiness_path defaults to '/' This allows users to configure longer timeouts for slow environments while maintaining clean, simple code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add 10-second timeout to Process.wait to prevent indefinite hangs - Send KILL signal if server doesn't respond to TERM within timeout - Prevents test runs from hanging if server doesn't shut down gracefully - Logs warning when escalating from TERM to KILL This ensures robust cleanup even when servers are unresponsive to graceful shutdown signals. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

claude · 2025-10-06T06:22:37Z

Code Review for PR 199: Improve server process cleanup and readiness checks

Overall Assessment: Approved with minor suggestions

This PR addresses important reliability issues in server lifecycle management. The changes are well-motivated and align with issue 198.

Code Quality & Best Practices

Strengths:

Well-structured process management with @server_pid and @server_pgid instance variables
Defensive error handling with multiple rescue blocks for edge cases
Clear comments explaining non-obvious behavior (lines 116-118)
Graceful degradation from process group kill to single process kill

Suggestions:

Configuration naming consistency (configuration.rb:67-68)
The CYPRESS_RAILS_ prefix is used, but the gem now supports both Cypress and Playwright. Consider adding deprecation warnings or documenting that these vars apply to both frameworks.
Timeout semantics (server.rb:138)
The server_readiness_timeout (5s) applies to EACH HTTP request during polling (every 0.1s). With a 30s outer timeout, this could be confusing. Consider using 1-2 seconds for individual requests, or document this relationship clearly.

Potential Bugs & Issues

Minor Issues:

Race condition in spawn_server (server.rb:113-120)
Process.getpgid could raise Errno::EPERM in addition to ESRCH. Consider catching both.
send_term_signal logic (server.rb:163-172)
The process_exists? check before killing the process group adds little value since you rescue and fall back anyway. Consider simplifying.

Performance Considerations

Good:

HTTP-based readiness checks are more accurate with negligible performance impact
Process group management is efficient

Optimization opportunity: Make the 0.1s polling interval configurable.

Security Concerns

No significant security issues found:

Proper use of TERM signal for graceful shutdown
No shell injection risks (direct process spawning)
Localhost binding by default

Minor suggestion: Validate server_readiness_path starts with / to prevent malformed URIs.

Test Coverage

Missing tests - Recommendations:

Add unit tests for new configuration options (server_readiness_path, server_readiness_timeout)
Add tests for server_responding? with mocked HTTP responses (200, 301, 404, 500)
Add tests for process management edge cases (immediate exit, EPERM, missing PID)
Add integration test to verify process group is set correctly

Additional Notes

Documentation needed:

README update for CYPRESS_RAILS_READINESS_PATH and CYPRESS_RAILS_READINESS_TIMEOUT
Migration guide for custom health check endpoints
Changelog entry

Logging consistency: Lines 118 and 170 use the logger, but other lines use puts. Consider using logger throughout for better production control.

Summary

This is a solid improvement addressing real production issues. Main recommendations:

Must-have: Add unit tests for configuration options
Should-have: Clarify timeout semantics
Nice-to-have: Replace puts with logger for consistency

Great work!

claude · 2025-10-06T06:22:59Z

Code Review - PR #199: Improve server process cleanup and readiness checks

Overall Assessment

This is a well-executed PR that meaningfully improves the reliability and debuggability of server process management. The changes directly address the issues described in #198 with thoughtful error handling and configurability.

✅ Strengths

1. Robust Error Handling

Comprehensive exception handling in server_responding? (lib/cypress_on_rails/server.rb:148-150) covers all network-related edge cases
Graceful handling of race conditions (e.g., process terminating before getpgid on line 115)
Proper timeout handling with KILL fallback (lines 160-167) prevents indefinite hangs

2. Improved Observability

Warning logs when process group kill fails (line 179) - directly addresses debugging needs from Enhancement: Improve server process cleanup and readiness checks #198
Warning when TERM signal timeout requires KILL escalation (line 164)
Early detection of process termination issues (line 118)

3. Better Health Checks

HTTP-based readiness validation (lines 135-151) is more accurate than TCP socket checks
Correctly rejects 404 and 5xx responses while accepting 200-399 range
Comment on line 146 clearly explains why 3xx redirects are considered ready

4. Configurability

server_readiness_path and server_readiness_timeout (configuration.rb:25-26, 67-68) allow customization for different environments
Environment variable support (CYPRESS_RAILS_READINESS_PATH, CYPRESS_RAILS_READINESS_TIMEOUT) aids CI/CD integration

5. Process Group Management

Proper use of pgroup: true in spawn (line 112) ensures child processes are cleaned up
Fallback from process group to single process kill (lines 172-181) handles permission edge cases

🔍 Areas for Improvement

1. Missing Test Coverage ⚠️ High Priority

The PR lacks tests for the new functionality. Critical scenarios that need coverage:

server_responding? with various HTTP status codes (200, 301, 404, 500)
Process group kill with EPERM errors
Timeout behavior in stop_server
Race condition in Process.getpgid
Configuration of server_readiness_path and server_readiness_timeout

Recommendation: Add specs to spec/cypress_on_rails/configuration_spec.rb for new config options and create spec/cypress_on_rails/server_spec.rb for process management logic.

2. Potential Timeout Issue in server_responding?

The timeout parameter (line 138) is used for both open_timeout and read_timeout (line 141), which could cause issues in the polling loop (line 127):

If server_readiness_timeout is 5 seconds, a single HTTP request could block for up to 5 seconds
The polling loop expects 0.1s checks, but a single attempt could consume most of the 30s wait_for_server timeout
This might make the server appear slower to start than it actually is

Recommendation: Consider capping individual HTTP request timeouts to maintain polling responsiveness while still respecting user configuration for slow environments.

3. Logger vs puts Inconsistency

The code mixes logger.warn (lines 118, 164, 179) with puts (lines 110, 156). Consider standardizing on one approach for consistency.

4. Documentation Gap

The new configuration options lack inline documentation. Consider adding comments above the attr_accessors in configuration.rb.

🔒 Security Considerations

✅ No security concerns identified

Process signaling is properly scoped to child processes
No injection vulnerabilities in URI construction (paths are from config)
Timeout values prevent resource exhaustion

⚡ Performance Considerations

Positive: HTTP-based checks may be slightly slower than TCP checks, but provide much more accurate results
Negligible Impact: The 0.1s polling interval is reasonable
Concern: See timeout issue in Areas for Improvement Seems to generate a cypress directory in the main folder #2

📝 Suggested Next Steps

Add test coverage for new functionality (high priority)
Consider timeout capping in server_responding? to maintain polling responsiveness
Add inline documentation for new configuration options
Verify manual testing in production-like environment

Summary

This PR makes meaningful improvements to reliability and debuggability. The code quality is good with thoughtful error handling. The main gap is test coverage - I recommend adding tests before merging to prevent regressions and document expected behavior.

Recommendation: ✅ Approve with minor changes (add tests, consider timeout adjustment)

Great work addressing issue #198!

- Add inline documentation for server_readiness_path and server_readiness_timeout - Document default values and environment variable names - Add test coverage for default values - Add test coverage for custom configuration This improves discoverability and ensures configuration options are properly tested. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

claude · 2025-10-06T06:46:02Z

Code Review - PR #199

Thank you for this improvement! This PR addresses important reliability issues with server lifecycle management.

Strengths

Well-documented changes with clear benefits
Backward compatible with sensible defaults
Good error handling for edge cases (EPERM, ESRCH)
HTTP-based readiness checks are more accurate than TCP socket checks

Code Quality

Configuration (lib/cypress_on_rails/configuration.rb:25-30)

Good use of environment variables
Clear inline documentation
Sensible defaults

Server Process Management (lib/cypress_on_rails/server.rb:112-122)

Process group creation with pgroup: true is correct
Good edge case handling for immediate process termination
Minor: Instance variables @server_pid and @server_pgid are set in spawn_server but PID is also stored in local variable server_pid in start_server:74, creating two sources of truth

HTTP Readiness Check (lib/cypress_on_rails/server.rb:135-151)

Accepting 200-399 status codes is appropriate
Good exception handling for network errors
Potential issue: Method does not handle query parameters. If readiness_path contains query params, they might not be preserved when using uri.path

Potential Issues

1. Query Parameters in Readiness Path (server.rb:142)
Current code uses uri.path which drops query parameters.
Recommendation: Use uri.request_uri to preserve query parameters

2. Dual PID Storage (server.rb:74 vs 112)
Local variable server_pid used for cleanup while @server_pid is also set. If exception occurs between lines 74-96, cleanup might fail.
Recommendation: Use @server_pid consistently by calling stop_server(@server_pid)

3. Timeout Handling in stop_server (server.rb:159-167)
Code waits 10 seconds for TERM, then sends KILL. However, if KILL does not work, Process.wait could hang indefinitely.
Recommendation: Add timeout to KILL wait as well (5 seconds recommended)

Security

Low Risk - No significant security issues identified. Process management handled defensively.

Performance

HTTP Readiness Timeout: Default 5-second timeout is reasonable
Sleep Interval: 0.1 seconds (10 checks/sec) is appropriate

Test Coverage

Existing: Configuration specs properly test new attributes
Missing (recommended):

No tests for Server class itself
No tests for process group cleanup behavior
No tests for HTTP readiness check logic
No tests for timeout scenarios

Consider adding integration tests for server lifecycle, process cleanup scenarios, and HTTP status code handling.

Recommendations Summary

High Priority:

Fix query parameter handling (use uri.request_uri)
Consolidate PID storage to use instance variable

Medium Priority:
3. Add timeout to KILL signal wait
4. Consider adding integration tests for server lifecycle

Overall Assessment

This is a solid improvement addressing real reliability issues. Code quality is good with proper error handling and sensible defaults. Main suggestions are minor defensive programming improvements.

Recommendation: Approve with minor suggestions

PR is ready to merge after addressing the query parameter issue and PID storage consolidation (both quick fixes).

Great work improving the test infrastructure!

coderabbitai bot requested changes Oct 5, 2025

View reviewed changes

lib/cypress_on_rails/server.rb Show resolved Hide resolved

coderabbitai bot approved these changes Oct 6, 2025

View reviewed changes

coderabbitai bot requested changes Oct 6, 2025

View reviewed changes

lib/cypress_on_rails/server.rb Show resolved Hide resolved

justin808 and others added 2 commits October 5, 2025 20:18

coderabbitai bot approved these changes Oct 6, 2025

View reviewed changes

justin808 merged commit abb505c into master Oct 6, 2025
5 checks passed

justin808 deleted the justin808/improve-server-cleanup branch October 6, 2025 07:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Improve server process cleanup and readiness checks #199

Improve server process cleanup and readiness checks #199

Uh oh!

justin808 commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 5, 2025 •

edited

Loading

Rate limit exceeded

Uh oh!

claude bot commented Oct 5, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

claude bot commented Oct 6, 2025

Uh oh!

claude bot commented Oct 6, 2025

Uh oh!

justin808 commented Oct 6, 2025

Uh oh!

claude bot commented Oct 6, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

claude bot commented Oct 6, 2025

Uh oh!

claude bot commented Oct 6, 2025

Uh oh!

claude bot commented Oct 6, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Improve server process cleanup and readiness checks #199

Improve server process cleanup and readiness checks #199

Uh oh!

Conversation

justin808 commented Oct 5, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Benefits

Changes

Test Plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

claude bot commented Oct 5, 2025

Pull Request Review: Improve server process cleanup and readiness checks

Overview

Strengths

Code Quality

Improved Reliability

Issues and Suggestions

1. Potential Race Condition in spawn_server (lib/cypress_on_rails/server.rb:109-111)

2. Incomplete Exception Handling in server_responding? (lib/cypress_on_rails/server.rb:125-132)

3. Missing Instance Variable Initialization

4. Inconsistent Error Recovery in send_term_signal (lib/cypress_on_rails/server.rb:144-153)

5. Missing Timeout in server_responding?

Test Coverage

Security Considerations

Performance Considerations

Summary

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude bot commented Oct 6, 2025

Code Review - PR #199: Improve server process cleanup and readiness checks

Overview

✅ Strengths

🔍 Code Quality & Best Practices

🐛 Potential Issues

⚡ Performance Considerations

🔒 Security Concerns

🧪 Test Coverage

📋 Recommendations

🎯 Verdict

Uh oh!

claude bot commented Oct 6, 2025

PR Review: Improve server process cleanup and readiness checks

Overall Assessment

Critical Issues

1. Incorrect PID variable in stop_server (lib/cypress_on_rails/server.rb:150-158)

2. Race condition in spawn_server (lib/cypress_on_rails/server.rb:112-119)

Security & Best Practices

3. HTTP timeout configuration (lib/cypress_on_rails/server.rb:133-148)

4. HTTP response code validation (lib/cypress_on_rails/server.rb:144)

Code Quality

5. Missing URI path handling (lib/cypress_on_rails/server.rb:140)

6. Missing test coverage

Positive Observations

Recommendations Before Merging

Uh oh!

justin808 commented Oct 6, 2025

Uh oh!

claude bot commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Improve server process cleanup and readiness checks

Overall Assessment

✅ Strengths

justin808 commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 5, 2025 •

edited

Loading

claude bot commented Oct 6, 2025 •

edited

Loading