-
Notifications
You must be signed in to change notification settings - Fork 497
Server: Robust shutdown on stdio detach (signals, stdin/parent monitor, forced exit) #363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…r, forced exit)\nTests: move telemetry tests to tests/ and convert to asserts
…duplicates; fix Windows ValueError in parent monitor; tests: add autouse cwd fixture for telemetry to locate pyproject.toml
…guarded exit timers, and os._exit force-exit in UnityMcpServer~ entry points
WalkthroughSignal handling and process lifecycle management have been added to multiple MCP server implementations. Three server.py files now include background monitors for stdin/parent-process detachment, signal handlers for SIGTERM/SIGINT, global shutdown coordination flags, and enhanced main() entry points. One telemetry test file has been removed and replaced with a new structured test module with enhanced fixture support. Changes
Sequence Diagram(s)sequenceDiagram
actor main as main()
participant sig as Signal Handlers
participant mon as Monitor Thread
participant mcp as mcp.run()
participant proc as OS
main->>sig: Install signal handlers<br/>(SIGTERM, SIGINT, ...)
main->>mon: Start _monitor_stdin()<br/>(daemon thread)
par Monitor Loop
mon->>proc: Poll stdin/parent
Note over mon: Check for EOF or<br/>parent detachment
and
main->>mcp: Enter mcp.run(transport='stdio')
proc->>sig: Send signal (SIGTERM/SIGINT)
sig->>sig: _signal_handler triggered
sig->>mon: Set _shutdown_flag
end
alt Graceful Path
mon->>mcp: Detect shutdown_flag or stdin EOF
mon->>mcp: Trigger controlled shutdown
mcp->>main: Return normally
main->>main: Schedule _force_exit() with short delay
else Exception Path
mcp->>main: Raise (KeyboardInterrupt, BrokenPipeError)
main->>main: Catch exception, log, set _shutdown_flag
main->>main: Call _force_exit(1) after delay
else Timeout Path
alt Not already scheduled
main->>main: Schedule _force_exit via Timer
Note over main: Ensures process exits<br/>if graceful path stalls
end
end
main->>proc: os._exit(code)
Note over proc: Forced abrupt termination
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
MCPForUnity/UnityMcpServer~/src/server.py(3 hunks)Server/server.py(3 hunks)Server/test_telemetry.py(0 hunks)UnityMcpBridge/UnityMcpServer~/src/server.py(3 hunks)tests/test_telemetry_server.py(1 hunks)
💤 Files with no reviewable changes (1)
- Server/test_telemetry.py
🧰 Additional context used
🪛 Ruff (0.14.2)
UnityMcpBridge/UnityMcpServer~/src/server.py
204-204: Unused function argument: frame
(ARG001)
241-241: Do not catch blind exception: Exception
(BLE001)
251-251: Do not catch blind exception: Exception
(BLE001)
264-266: try-except-pass detected, consider logging the exception
(S110)
264-264: Do not catch blind exception: Exception
(BLE001)
MCPForUnity/UnityMcpServer~/src/server.py
201-201: Unused function argument: frame
(ARG001)
238-238: Do not catch blind exception: Exception
(BLE001)
248-248: Do not catch blind exception: Exception
(BLE001)
262-264: try-except-pass detected, consider logging the exception
(S110)
262-262: Do not catch blind exception: Exception
(BLE001)
Server/server.py
201-201: Unused function argument: frame
(ARG001)
238-238: Do not catch blind exception: Exception
(BLE001)
248-248: Do not catch blind exception: Exception
(BLE001)
262-264: try-except-pass detected, consider logging the exception
(S110)
262-262: Do not catch blind exception: Exception
(BLE001)
| if parent_pid is not None: | ||
| try: | ||
| os.kill(parent_pid, 0) | ||
| except ValueError: | ||
| # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing | ||
| parent_pid = None | ||
| except (ProcessLookupError, OSError): | ||
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | ||
| break | ||
|
|
||
| try: | ||
| if sys.stdin.closed: | ||
| logger.info("stdin.closed is True; client disconnected") | ||
| break | ||
| fd = sys.stdin.fileno() | ||
| if fd < 0: | ||
| logger.info("stdin fd invalid; client disconnected") | ||
| break | ||
| except (ValueError, OSError, AttributeError): | ||
| # Closed pipe or unavailable stdin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prevent false parent-death detection on Windows.
Identical to the bridge variant: os.kill(parent_pid, 0) throws PermissionError/EINVAL on Windows when signal 0 isn’t supported, yet this code treats every OSError as “parent vanished” and forces shutdown. Any Windows run will therefore exit immediately. Please gate on the errno, disable probing for the unsupported cases, and keep exiting only when we actually get ProcessLookupError.
@@
-import signal
-import sys
-import threading
+import errno
+import signal
+import sys
+import threading
@@
- except ValueError:
- # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing
- parent_pid = None
- except (ProcessLookupError, OSError):
+ except ValueError:
+ parent_pid = None
+ except ProcessLookupError:
logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
break
+ except OSError as exc:
+ if exc.errno in (errno.EPERM, errno.EACCES, errno.EINVAL, errno.ENOSYS):
+ logger.debug("Parent probe unsupported; disabling parent monitoring", exc_info=True)
+ parent_pid = None
+ continue
+ logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
+ break📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if parent_pid is not None: | |
| try: | |
| os.kill(parent_pid, 0) | |
| except ValueError: | |
| # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing | |
| parent_pid = None | |
| except (ProcessLookupError, OSError): | |
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | |
| break | |
| try: | |
| if sys.stdin.closed: | |
| logger.info("stdin.closed is True; client disconnected") | |
| break | |
| fd = sys.stdin.fileno() | |
| if fd < 0: | |
| logger.info("stdin fd invalid; client disconnected") | |
| break | |
| except (ValueError, OSError, AttributeError): | |
| # Closed pipe or unavailable stdin | |
| if parent_pid is not None: | |
| try: | |
| os.kill(parent_pid, 0) | |
| except ValueError: | |
| parent_pid = None | |
| except ProcessLookupError: | |
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | |
| break | |
| except OSError as exc: | |
| if exc.errno in (errno.EPERM, errno.EACCES, errno.EINVAL, errno.ENOSYS): | |
| logger.debug("Parent probe unsupported; disabling parent monitoring", exc_info=True) | |
| parent_pid = None | |
| continue | |
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | |
| break | |
| try: | |
| if sys.stdin.closed: | |
| logger.info("stdin.closed is True; client disconnected") | |
| break | |
| fd = sys.stdin.fileno() | |
| if fd < 0: | |
| logger.info("stdin fd invalid; client disconnected") | |
| break | |
| except (ValueError, OSError, AttributeError): | |
| # Closed pipe or unavailable stdin |
🤖 Prompt for AI Agents
In MCPForUnity/UnityMcpServer~/src/server.py around lines 217-236, the
parent-probing code treats all OSErrors as "parent vanished" which causes
immediate exits on Windows; change the exception handling so PermissionError and
OSError with errno indicating signal-0 unsupported (e.g., errno.EINVAL or
errno.EPERM) do not trigger shutdown but instead disable parent probing by
setting parent_pid = None, while only ProcessLookupError should log "parent no
longer exists" and break; use the errno module to inspect e.errno and keep the
existing ValueError handling as-is.
| if parent_pid is not None: | ||
| try: | ||
| os.kill(parent_pid, 0) | ||
| except ValueError: | ||
| # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing | ||
| parent_pid = None | ||
| except (ProcessLookupError, OSError): | ||
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | ||
| break | ||
|
|
||
| try: | ||
| if sys.stdin.closed: | ||
| logger.info("stdin.closed is True; client disconnected") | ||
| break | ||
| fd = sys.stdin.fileno() | ||
| if fd < 0: | ||
| logger.info("stdin fd invalid; client disconnected") | ||
| break | ||
| except (ValueError, OSError, AttributeError): | ||
| # Closed pipe or unavailable stdin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against unsupported signal 0 on Windows.
Here too, os.kill(parent_pid, 0) will raise PermissionError/EINVAL on Windows whenever signal 0 isn’t supported. Treating all OSError values like ProcessLookupError means the server kills itself immediately even though the parent is fine. Please special‑case those errnos, disable parent probing, and only shut down when we genuinely see ProcessLookupError.
@@
-import signal
-import sys
-import threading
+import errno
+import signal
+import sys
+import threading
@@
- except ValueError:
- # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing
- parent_pid = None
- except (ProcessLookupError, OSError):
+ except ValueError:
+ parent_pid = None
+ except ProcessLookupError:
logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
break
+ except OSError as exc:
+ if exc.errno in (errno.EPERM, errno.EACCES, errno.EINVAL, errno.ENOSYS):
+ logger.debug("Parent probe unsupported; disabling parent monitoring", exc_info=True)
+ parent_pid = None
+ continue
+ logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
+ breakCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In Server/server.py around lines 217-236, the except handling for
os.kill(parent_pid, 0) treats all OSError the same and can cause premature
shutdown on Windows; change it to special-case signal-not-supported errors by
importing errno and checking the exception errno: if the caught exception is
PermissionError or has errno in (errno.EINVAL, errno.EPERM) then set parent_pid
= None (disable parent probing) and continue, but only log and break for
ProcessLookupError or errors with errno == errno.ESRCH that indicate the parent
truly doesn't exist; re-raise or handle other unexpected errors appropriately.
| break | ||
|
|
||
| if parent_pid is not None: | ||
| try: | ||
| os.kill(parent_pid, 0) | ||
| except ValueError: | ||
| # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing | ||
| parent_pid = None | ||
| except (ProcessLookupError, OSError): | ||
| logger.info(f"Parent process {parent_pid} no longer exists; shutting down") | ||
| break | ||
|
|
||
| try: | ||
| if sys.stdin.closed: | ||
| logger.info("stdin.closed is True; client disconnected") | ||
| break | ||
| fd = sys.stdin.fileno() | ||
| if fd < 0: | ||
| logger.info("stdin fd invalid; client disconnected") | ||
| break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix parent-process probe on Windows to avoid false shutdowns.
On Windows, os.kill(parent_pid, 0) often raises PermissionError or OSError with errno=EINVAL/EPERM even when the parent is still alive (signal 0 simply isn’t supported). Because we route every OSError through the ProcessLookupError path, the server will immediately assume the parent died and tear itself down. That turns a healthy Windows deploy into an instant self‑shutdown. Please treat those “unsupported” errnos like the ValueError case and disable parent probing instead of exiting. Add the necessary errno import and keep shutting down only when we genuinely see ProcessLookupError (parent gone).
@@
-import signal
-import sys
-import threading
+import errno
+import signal
+import sys
+import threading
@@
- except ValueError:
- # Signal 0 unsupported on this platform (e.g., Windows); disable parent probing
- parent_pid = None
- except (ProcessLookupError, OSError):
+ except ValueError:
+ parent_pid = None
+ except ProcessLookupError:
logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
break
+ except OSError as exc:
+ if exc.errno in (errno.EPERM, errno.EACCES, errno.EINVAL, errno.ENOSYS):
+ logger.debug("Parent probe unsupported; disabling parent monitoring", exc_info=True)
+ parent_pid = None
+ continue
+ logger.info(f"Parent process {parent_pid} no longer exists; shutting down")
+ breakCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In UnityMcpBridge/UnityMcpServer~/src/server.py around lines 218 to 237, the
parent-process probe treats all OSError from os.kill(parent_pid, 0) as
process-not-found and triggers shutdown; modify the exception handling to treat
Windows "unsupported" errors (PermissionError or OSError with errno
EINVAL/EPERM) the same as the existing ValueError case by disabling parent
probing instead of exiting, and only shut down when ProcessLookupError (parent
gone) is raised; add an import for errno at the top and check exception.errno
against errno.EINVAL and errno.EPERM (or handle PermissionError separately) to
decide to set parent_pid = None rather than logging shutdown and breaking.
…t monitor, forced exit) (CoplayDev#363)" This reverts commit ca01fc7.
Problem
The MCP server could become a zombie process when the client disconnected or the parent process exited, because it didn't reliably detect stdio detach or handle shutdown signals. (Noted in, and should fix #359)
Solution
Implemented robust shutdown coordination across all server entry points:
ValueErrorhandling for unsupportedos.kill(pid, 0))threading.Eventflag prevents multiple concurrent exit timers across shutdown paths_force_exit()to directos._exit()to bypass background threads and atexit handlersChanges
Server/server.py: Added shutdown coordination with signal handlers, monitor thread, and guarded exit logicUnityMcpBridge/UnityMcpServer~/src/server.py: Mirrored shutdown improvementsMCPForUnity/UnityMcpServer~/src/server.py: Mirrored shutdown improvementstests/test_telemetry_server.py: Added autouse pytest fixture tochdirintoUnityMcpServer~/srcso telemetry can locatepyproject.tomlfor version detectionTesting
Windows compatibility
ValueErrorwhenos.kill(parent_pid, 0)is unsupported on WindowsThis ensures the server exits cleanly in all scenarios, preventing zombie processes and improving resource cleanup.
Summary by CodeRabbit
New Features
Tests