# UN-2865 [FIX] Remove premature COMPLETED status update in general worker after async file orchestration #1574
This fix addresses a critical bug where the general worker incorrectly marked workflow executions as COMPLETED immediately after orchestrating async file processing, while files were still being processed.

Changes:
- Removed the `WorkflowExecutionStatusUpdate` that set the status to COMPLETED
- Removed the incorrect `execution_time` update (it captured only orchestration time)
- Removed the incorrect `total_files` calculation
- Updated comments to clarify orchestration vs. execution completion
- Updated logging to reflect the async orchestration behavior

The callback worker now properly handles setting the final COMPLETED or ERROR status after all files finish processing, matching the pattern used by the API deployment worker.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
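The split the description enforces, between orchestration (general worker) and finalization (callback worker), can be sketched as below. All names here are illustrative stand-ins, not the actual helpers in `workers/general/tasks.py` or `workers/callback/tasks.py`:

```python
import time

EXECUTING, COMPLETED, ERROR = "EXECUTING", "COMPLETED", "ERROR"


class ExecutionRecord:
    """Stand-in for the persisted workflow-execution row."""

    def __init__(self):
        self.status = EXECUTING
        self.execution_time = None


def orchestrate(record, files, dispatch):
    """General-worker side: dispatch files asynchronously and return.

    Before the fix, this step also set record.status = COMPLETED and stored
    the elapsed time here -- measuring only orchestration (seconds), not file
    processing (potentially minutes). Both updates are removed; the record
    stays EXECUTING until the callback worker finalizes it.
    """
    start = time.time()
    for f in files:
        dispatch(f)  # fire-and-forget async dispatch (e.g. a Celery task)
    return time.time() - start  # orchestration time only


def finalize(record, file_results, total_elapsed):
    """Callback-worker side: runs after every file has reported back."""
    record.status = COMPLETED if all(file_results) else ERROR
    record.execution_time = total_elapsed  # true end-to-end time
```

With this shape, an execution observed mid-run always reads EXECUTING, and the displayed `execution_time` reflects actual processing rather than dispatch overhead.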
## Summary by CodeRabbit

**Walkthrough**

The general worker's orchestration flow was adjusted to no longer finalize workflow status. It now records orchestration execution time, logs accordingly, and defers status completion and cache cleanup to a callback worker. An import of `WorkflowExecutionStatusUpdate` was removed.
**Sequence Diagram**

```mermaid
sequenceDiagram
    autonumber
    participant T as Trigger
    participant GW as General Worker (async_execute_bin_general)
    participant AJ as Async File Processing
    participant CB as Callback Worker
    participant DS as Data Store
    T->>GW: Start workflow execution
    GW->>DS: Mark workflow status: EXECUTING
    GW->>AJ: Dispatch async file processing jobs
    Note right of GW: Record orchestration-only execution_time<br/>(not total runtime)
    GW-->>T: Return orchestration success (async processing ongoing)
    par For each file
        AJ-->>CB: Completion callback with file result
    end
    CB->>DS: Aggregate results and finalize status (e.g., COMPLETED/FAILED)
    CB->>DS: Perform cache cleanup
```
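The per-file fan-in shown in the `par` block above amounts to counting completion callbacks until every file has reported, then deriving a final status. A minimal sketch of that aggregation, with illustrative names rather than the actual callback-worker code:

```python
class CallbackAggregator:
    """Tracks per-file completion callbacks until the execution can be finalized."""

    def __init__(self, total_files: int):
        self.total_files = total_files
        self.succeeded = 0
        self.failed = 0

    def record(self, ok: bool) -> bool:
        """Record one file result; return True once all files have reported."""
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1
        return self.done()

    def done(self) -> bool:
        return self.succeeded + self.failed == self.total_files

    def final_status(self) -> str:
        # Only meaningful once every file has reported back.
        assert self.done()
        return "COMPLETED" if self.failed == 0 else "ERROR"
```

The key property is that `final_status()` is only consulted once `done()` is true, which is exactly why the general worker must not write a terminal status at dispatch time.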
## What

- `workers/general/tasks.py`: removed the premature COMPLETED status update after orchestrating async file processing
- Removed the `WorkflowExecutionStatusUpdate` import and usage that incorrectly marked the execution as complete before files finished processing
- Removed the incorrect `total_files` calculation based on input parameters

## Why

- `execution_time` was only measuring orchestration time (~2 seconds), not actual file processing time (potentially minutes), leading to an incorrect execution-time display
- The `total_files` calculation was incorrect: it used the input parameter `hash_values_of_files` instead of the actual count of discovered source files

## How

- Removed the code block that imported `WorkflowExecutionStatusUpdate` and called `update_workflow_execution_status()` with COMPLETED status
- Removed `WorkflowExecutionStatusUpdate` from the imports (line 19), as it is no longer needed
- Updated the log message from "Successfully completed general workflow execution" to "Successfully orchestrated general workflow execution (files processing asynchronously)"
- This matches the `workers/api-deployment/tasks.py` behavior, which correctly leaves final status determination to the callback worker
- The callback worker (`workers/callback/tasks.py`) already handles setting the final COMPLETED or ERROR status using `_determine_execution_status_unified()` after all files complete

## Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No, this PR will not break any existing features. In fact, it fixes broken behavior in `_execute_general_workflow()`.

## Database Migrations
## Env Config

## Relevant Docs

- `workers/WORKERS_ARCHITECTURE_CONFLUENCE.md`
- `workers/callback/tasks.py` (`_determine_execution_status_unified()`)

## Related Issues or PRs

- `workers/api-deployment/tasks.py` (correct implementation)

## Dependencies Versions
## Notes on Testing

- `_determine_execution_status_unified()` (callback worker)
- `_execute_general_workflow()` (general worker)
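One way to guard against this regression in tests is to assert that orchestration never emits a terminal status. The sketch below is hypothetical: the function and the status-update hook are illustrative stand-ins for the real worker code, not its actual API:

```python
def orchestrate_files(files, dispatch, update_status):
    """Illustrative orchestrator: marks EXECUTING, dispatches each file,
    and returns without finalizing -- the behavior this PR enforces."""
    update_status("EXECUTING")
    for f in files:
        dispatch(f)
    # Deliberately no update_status("COMPLETED") here; the callback
    # worker owns the terminal status.


def test_no_premature_completed():
    updates, dispatched = [], []
    orchestrate_files(["a.txt", "b.txt"], dispatched.append, updates.append)
    assert updates == ["EXECUTING"]  # COMPLETED is left to the callback worker
    assert dispatched == ["a.txt", "b.txt"]
```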
## Screenshots

N/A - Backend worker logic fix only
## Checklist

- [x] I have read and understood the Contribution Guidelines.