-
Notifications
You must be signed in to change notification settings - Fork 555
UN-2717 [FEAT] Sigterm Implementation in Sidecar Tool container #1523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
UN-2717 [FEAT] Sigterm Implementation in Sidecar Tool container #1523
Conversation
Summary by CodeRabbit
WalkthroughAdds graceful shutdown to log monitoring: registers SIGINT/SIGTERM handlers to set a shutdown flag, updates the monitoring loop to observe this flag, performs a final log collection via a new helper, logs completion, and exits cleanly. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant OS as OS
participant Main as main()
participant Sig as signal_handler
participant Mon as monitor_logs()
participant File as log file
Note over Main: Startup
Main->>Main: register SIGINT/SIGTERM handlers
Main->>Mon: start monitoring loop
loop monitor loop
Mon->>File: read new lines
File-->>Mon: lines (if any)
Mon->>Mon: process lines
alt shutdown not requested
Mon->>Mon: sleep/continue
end
end
Note over OS,Sig: Shutdown initiated
OS-->>Sig: SIGINT/SIGTERM
Sig->>Mon: set _shutdown_requested = True
opt On next iteration
Mon->>Mon: detect shutdown flag
Mon->>File: _final_log_collection(file_handle)
File-->>Mon: remaining lines
Mon->>Mon: process remaining lines
Mon-->>Main: exit monitoring loop
end
Main->>Main: log completion and exit
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py (1)
98-113
: Honor SIGTERM while waiting for the log file (otherwise can hang up to 300s).If SIGTERM arrives before the file exists, the loop ignores it and can block until timeout, likely causing a forced SIGKILL in Kubernetes. Check the shutdown flag inside the wait loop.
Apply this diff:
def wait_for_log_file(self, timeout: int = 300) -> bool: """Wait for the log file to be created by the tool container. @@ - start_time = time.time() - while not os.path.exists(self.log_path): + start_time = time.time() + while not os.path.exists(self.log_path): + if _shutdown_requested: + logger.info("Shutdown requested while waiting for log file; aborting wait.") + return False if time.time() - start_time > timeout: return False time.sleep(0.1) return True
🧹 Nitpick comments (4)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py (4)
227-227
: Remove unnecessaryglobal
declaration.
monitor_logs
only reads_shutdown_requested
;global
is not required.Apply this diff:
- global _shutdown_requested
266-290
: Keep parsing behavior consistent: avoid.strip()
here.Normal path passes raw lines (with trailing newline) to
process_log_line
, whichjson.loads
already handles. Stripping only in the final drain can introduce subtle differences (e.g., leading/trailing spaces are significant in some non‑JSON lines you log verbatim).Apply this diff:
- log_line = self.process_log_line(line.strip()) + log_line = self.process_log_line(line)
230-230
: Optional: shorten or customize the timeout exception message (Ruff TRY003).Either shorten the message or introduce a small custom exception class to carry context.
Apply this minimal tweak:
- raise TimeoutError("Log file was not created within timeout period") + raise TimeoutError("Log file not created within timeout")
296-300
: Signal registration LGTM; confirm pod termination settings.Handlers for SIGTERM/SIGINT are correctly registered in
main()
.
- Ensure your pod’s
terminationGracePeriodSeconds
comfortably exceeds the final-drain window (recommend ≥5–10s).- If you use a
preStop
hook, confirm it doesn’t block the sidecar from receiving SIGTERM promptly.- Verify in your manifests/Helm charts.
Would you like a follow-up PR checklist to validate this in the integration cluster?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (1)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py
(4 hunks)
🧰 Additional context used
🪛 Ruff (0.13.1)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py
33-33: Unused function argument: frame
(ARG001)
230-230: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py (1)
235-240
: Shutdown check and final drain look good.Early exit with a brief final drain achieves graceful behavior without heavy work inside the signal handler.
def signal_handler(signum, frame): | ||
"""Handle shutdown signals gracefully.""" | ||
global _shutdown_requested | ||
signal_name = signal.Signals(signum).name | ||
logger.info(f"Received {signal_name}, initiating graceful shutdown...") | ||
_shutdown_requested = True | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix unused frame
arg in signal handler (lint) and keep handler minimal.
Rename the unused parameter to underscore to satisfy Ruff ARG001 and keep the handler lean.
Apply this diff:
-def signal_handler(signum, frame):
+def signal_handler(signum, _frame):
"""Handle shutdown signals gracefully."""
global _shutdown_requested
signal_name = signal.Signals(signum).name
logger.info(f"Received {signal_name}, initiating graceful shutdown...")
_shutdown_requested = True
Optional: consider replacing the module‑level bool with a threading.Event
for clearer intent and to avoid globals.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def signal_handler(signum, frame): | |
"""Handle shutdown signals gracefully.""" | |
global _shutdown_requested | |
signal_name = signal.Signals(signum).name | |
logger.info(f"Received {signal_name}, initiating graceful shutdown...") | |
_shutdown_requested = True | |
def signal_handler(signum, _frame): | |
"""Handle shutdown signals gracefully.""" | |
global _shutdown_requested | |
signal_name = signal.Signals(signum).name | |
logger.info(f"Received {signal_name}, initiating graceful shutdown...") | |
_shutdown_requested = True |
🧰 Tools
🪛 Ruff (0.13.1)
33-33: Unused function argument: frame
(ARG001)
🤖 Prompt for AI Agents
In tool-sidecar/src/unstract/tool_sidecar/log_processor.py around lines 33 to
39, the signal_handler currently accepts an unused parameter named frame which
triggers lint ARG001; rename the unused parameter to _frame (or just _ ) to
satisfy Ruff and keep the handler minimal by leaving only the logging and
setting of the shutdown flag (remove any extra comments or logic inside the
handler). Optionally, consider replacing the module-level boolean
_shutdown_requested with a threading.Event to avoid globals and make shutdown
signaling clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kirtimanmishrazipstack can you address this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
# Check for shutdown signal | ||
if _shutdown_requested: | ||
logger.info("Shutdown requested, performing final log collection...") | ||
self._final_log_collection(f) | ||
break | ||
|
||
# Remember current position | ||
where = f.tell() | ||
line = f.readline() | ||
|
||
if not line: | ||
# No new data, check if tool container is done | ||
if os.path.exists( | ||
os.path.join(os.path.dirname(self.log_path), "completed") | ||
): | ||
break | ||
|
||
# Sleep briefly to avoid CPU spinning | ||
time.sleep(0.1) | ||
# Go back to where we were | ||
f.seek(where) | ||
continue | ||
|
||
# Process the log line | ||
log_line = self.process_log_line(line) | ||
if log_line.is_terminated: | ||
logger.info("Completion signal received") | ||
break | ||
|
||
logger.info("Log monitoring completed") | ||
|
||
def _final_log_collection(self, file_handle) -> None: | ||
"""Perform final collection of any remaining logs before shutdown.""" | ||
logger.info("Performing final log collection...") | ||
|
||
# Give main container brief moment to write final logs | ||
time.sleep(0.2) | ||
|
||
lines_collected = 0 | ||
|
||
# Read any remaining lines in the file | ||
while True: | ||
line = file_handle.readline() | ||
if not line: | ||
break | ||
|
||
lines_collected += 1 | ||
log_line = self.process_log_line(line.strip()) | ||
|
||
if log_line.is_terminated: | ||
logger.info("Found completion signal during final collection") | ||
|
||
logger.info( | ||
f"Final log collection completed, processed {lines_collected} remaining lines" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kirtimanmishrazipstack I don't think this handling is required. Similar to what we did in the sdk, we just need to trap the signal. No need to do anything else for now.
What
Why
How
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
Checklist
I have read and understood the Contribution Guidelines.