Do not fail PipelineRun if pvc creation error is because of exceeded quotas #8903
base: main
/wip
4a5edbf to 52464a8
The following is the coverage report on the affected files.
/hold
…quotas
If PVC creation (from a volumeClaimTemplate) fails because of a quota error (quota exceeded), do not fail with a permanent error; instead, mark the PipelineRun as pending. Once quota becomes available again, the PipelineRun will be able to start.
Signed-off-by: Vincent Demeester <vdemeest@redhat.com>
52464a8 to 6b0c7ae
/retest
Timeout is taken into account 👼🏼
/assign
/cc @twoGiants
Good job, it makes sense to set the status to pending and let it retry 😸 👍.
Now it will re-queue right away. Do you want to make the re-queue behavior configurable at some point, as suggested in the issue?
My comments are below. I would add unit tests, simplify the conditional logic, and remove the re-declaration of the errors in affinity_assistant.go.
@@ -73,14 +80,20 @@ func (c *defaultPVCHandler) CreatePVCFromVolumeClaimTemplate(ctx context.Context
    if apierrors.IsAlreadyExists(err) {
Optional: We could use the chance to untangle the conditions a bit. If you pick this suggestion the comment right below can be skipped :).
    if apierrors.IsAlreadyExists(err) {
        _, getErr := c.clientset.CoreV1().PersistentVolumeClaims(claim.Namespace).Get(ctx, claim.Name, metav1.GetOptions{})
        switch {
        case apierrors.IsNotFound(getErr):
            _, createErr := c.clientset.CoreV1().PersistentVolumeClaims(claim.Namespace).Create(ctx, claim, metav1.CreateOptions{})
            if createErr == nil {
                c.logger.Infof("Created PersistentVolumeClaim %s in namespace %s", claim.Name, claim.Namespace)
                return nil
            }
            if apierrors.IsAlreadyExists(createErr) {
                c.logger.Infof("Tried to create PersistentVolumeClaim %s in namespace %s, but it already exists",
                    claim.Name, claim.Namespace)
                return nil
            }
            // This is a retry-able error
            if apierrors.IsForbidden(createErr) && strings.Contains(createErr.Error(), "exceeded quota") {
                return fmt.Errorf("%w: %v", ErrPvcCreationFailedRetryable, createErr.Error())
            }
            return fmt.Errorf("%w for %s: %w", ErrPvcCreationFailed, claim.Name, createErr)
        case getErr != nil:
            return fmt.Errorf("%w: failed to retrieve PVC %s: %w", ErrPvcCreationFailed, claim.Name, getErr)
        }
    }
        // This is a retry-able error
        return fmt.Errorf("%w: %v", ErrPvcCreationFailedRetryable, err.Error())
    }
    return fmt.Errorf("%w for %s: %w", ErrPvcCreationFailed, claim.Name, err)
This return and the one in the else right after can be simplified.
        return fmt.Errorf("%w for %s: %w", ErrPvcCreationFailed, claim.Name, err)
    } else if apierrors.IsForbidden(err) && strings.Contains(err.Error(), "exceeded quota") {
        // This is a retry-able error
        return fmt.Errorf("%w: %v", ErrPvcCreationFailedRetryable, err.Error())
    }
    return fmt.Errorf("%w for %s: %w", ErrPvcCreationFailed, claim.Name, err)
Unit tests for the new conditional logic in CreatePVCFromVolumeClaimTemplate should be added.
@@ -50,7 +50,8 @@ const (
 )

 var (
-    ErrPvcCreationFailed = errors.New("PVC creation error")
+    ErrPvcCreationFailed = volumeclaim.ErrPvcCreationFailed
I wouldn't re-declare them here if they are already declared in volumeclaim and public. You don't use them in this file anyway.
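One way to see why the re-declaration adds nothing: assigning a sentinel error to a second variable copies the same error value, so `errors.Is` matches through either name. The sketch below assumes a single file for brevity. `errAlias` is a hypothetical name standing in for the re-declared variable in the other package, and the wrapped message format is only illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPvcCreationFailed stands in for the exported sentinel in volumeclaim.
var ErrPvcCreationFailed = errors.New("PVC creation error")

// errAlias stands in for a re-declaration such as
// `ErrPvcCreationFailed = volumeclaim.ErrPvcCreationFailed` in another package.
var errAlias = ErrPvcCreationFailed

// sameSentinel reports whether a wrapped error matches the original sentinel
// and whether it matches the alias.
func sameSentinel() (viaOriginal, viaAlias bool) {
	wrapped := fmt.Errorf("%w for %s", ErrPvcCreationFailed, "pvc-b9eea16dce")
	return errors.Is(wrapped, ErrPvcCreationFailed), errors.Is(wrapped, errAlias)
}

func main() {
	a, b := sameSentinel()
	fmt.Println(a, b) // true true
}
```

Since both names compare equal, callers and tests can refer to the volumeclaim sentinel directly.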
@@ -577,54 +577,56 @@ func TestCreateOrUpdateAffinityAssistantsAndPVCs_Failure(t *testing.T) {
     name: "pvc creation failed - per workspace",
     failureType: "pvc",
     aaBehavior: aa.AffinityAssistantPerWorkspace,
-    expectedErr: fmt.Errorf("%w: failed to create PVC pvc-b9eea16dce: error creating persistentvolumeclaims", ErrPvcCreationFailed),
+    expectedErr: fmt.Errorf("%w for pvc-b9eea16dce: error creating persistentvolumeclaims", ErrPvcCreationFailed),
You could use volumeclaim.ErrPvcCreationFailed instead, then you don't need to re-declare them in affinity_assistant.go.
@@ -755,6 +755,10 @@ func (c *Reconciler) reconcile(ctx context.Context, pr *v1.PipelineRun, getPipel
         pr.Status.MarkFailed(volumeclaim.ReasonCouldntCreateWorkspacePVC,
             "Failed to create PVC for PipelineRun %s/%s correctly: %s",
             pr.Namespace, pr.Name, err)
+    case errors.Is(err, ErrPvcCreationFailedRetryable):
Here I would also use it from volumeclaim.ErrPvcCreationFailed.
Changes
If PVC creation (from a volumeClaimTemplate) fails because of a
quota error (quota exceeded), do not fail with a permanent error;
instead, mark the PipelineRun as pending. Once quota becomes
available again, the PipelineRun will be able to start.
Signed-off-by: Vincent Demeester <vdemeest@redhat.com>
Closes #7672
/kind feature
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
Release Notes