OCPBUGS-69394: improve conflict error handling in hosted cluster status updates #7414
base: main
Add proper conflict resolution for Status().Update() calls throughout the hosted cluster controller. When API conflicts occur, return requeue instead of propagating errors to prevent unnecessary error states.

This resolves issues where concurrent status updates could cause controller reconciliation failures due to resource version conflicts.

Changes:
- Add IsConflict() checks in hosted cluster status updates
- Return appropriate requeue results for conflict scenarios
- Maintain error handling for non-conflict status update failures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Signed-off-by: Juan Manuel Parrilla Madrid <jparrill@redhat.com>
@jparrill: This pull request references Jira Issue OCPBUGS-69394, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
In response to this:
Walkthrough
Modified the hosted cluster controller to implement robust conflict handling across multiple reconciliation paths. When Kubernetes API updates encounter conflicts due to optimistic concurrency, the controller now requeues operations instead of failing, replacing direct error returns with requeue responses throughout status updates and finalizer removal operations.
Estimated code review effort: 3 (Moderate), ~20 minutes
/auto-cc
@coderabbitai review
Actions performed: review triggered.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jparrill
Summary
This PR improves conflict error handling in the HostedCluster controller by properly handling API resource version conflicts during status updates.
This is part of the resolution of OCPBUGS-69394; more information is still needed about the behavior on Integration after this PR. The reasoning behind this is:
This PR addresses the status updates made by the HyperShift Operator (HO) so that conflict errors on those updates are handled properly.
Problem
The HostedCluster controller was experiencing reconciliation failures when concurrent status updates occurred, leading to resource version conflicts. These conflicts were being treated as hard errors rather than temporary conditions that should trigger requeue.
Solution
IsConflict() checks for all Status().Update() calls in the hosted cluster controller
Changes
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
Testing
Fixes
Related Issues
🤖 Generated with Claude Code