-
Notifications
You must be signed in to change notification settings - Fork 82
OADP-6653: CloudStorage exponential backoff by removing RequeueAfter #1937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: oadp-dev
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
@kaovilai: This pull request references OADP-6653 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
from scrum backoff for transient/unknown 500 errs |
3a9b0b4
to
bb041c9
Compare
@kaovilai: This pull request references OADP-6653 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
if updateErr := b.Client.Status().Update(ctx, &bucket); updateErr != nil { | ||
logger.Error(updateErr, "failed to update CloudStorage status") | ||
} | ||
return ctrl.Result{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ctrl.Result{}, err
makes it requeue with exponential backoff if err != nil
.
// Return error to trigger exponential backoff | ||
if err != nil { | ||
return ctrl.Result{}, err | ||
} | ||
return ctrl.Result{}, fmt.Errorf("bucket creation failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ctrl.Result{}, err
makes it requeue with exponential backoff if err != nil
.
// Return error to trigger exponential backoff | ||
return ctrl.Result{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ctrl.Result{}, err
makes it requeue with exponential backoff if err != nil
.
/hold to rebase against #1936 |
Reviewers are asked to review only second commit for this PR. |
@kaovilai: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
- Add Conditions field to CloudStorageStatus for better observability - Implement exponential backoff by returning errors on bucket operations - Controller-runtime automatically handles retries (5ms to 1000s max) - Add condition constants for type-safe reason strings - Create mock bucket client for improved testing - Add comprehensive tests for backoff behavior and conditions Key improvements: - Standard Kubernetes pattern using built-in workqueue backoff - Self-healing: continues retrying with increasing delays - Better observability through status conditions - Per-item backoff: each CloudStorage CR gets independent retry timing 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
bb041c9
to
9cd79a3
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
} | ||
|
||
// Update status with updated value | ||
bucket.Status.LastSynced = &metav1.Time{Time: time.Now()} | ||
bucket.Status.Name = bucket.Spec.Name | ||
|
||
b.Client.Status().Update(ctx, &bucket) | ||
if err := b.Client.Status().Update(ctx, &bucket); err != nil { | ||
logger.Error(err, "failed to update CloudStorage status") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also use a backoff here? Otherwise, could this result in the condition remaining not ready?
- Return error instead of just logging when final status update fails - Add documentation test explaining the change - Ensures controller-runtime's exponential backoff is triggered for status update failures Addresses PR comment openshift#1937 discussion_r2330918689 Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
* Initial plan * Add exponential backoff for status update failures - Return error instead of just logging when final status update fails - Add documentation test explaining the change - Ensures controller-runtime's exponential backoff is triggered for status update failures Addresses PR comment openshift#1937 discussion_r2330918689 Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Description
This PR implements exponential backoff for the CloudStorage controller using Kubernetes-native libraries. The controller leverages controller-runtime's built-in workqueue backoff mechanism for automatic retries with increasing delays when bucket operations fail.
Key Changes
Exponential Backoff Implementation
Status Conditions for Observability
Conditions
field toCloudStorageStatus
for standard Kubernetes status reportingBucketReady
condition with multiple reason constants:ReasonBucketCreated
: Bucket successfully createdReasonBucketReady
: Bucket exists and is readyReasonBucketCreationFailed
: Bucket creation failed (triggers backoff)ReasonBucketCheckError
: Error checking bucket status (triggers backoff)ReasonSTSSecretError
: STS secret operation failed (triggers backoff)Improved Testing
mockBucketClient
for dependency injection in testsBucketClientFactory
to CloudStorageReconciler for test mockingBenefits
Testing
Related Issues
Note
Responses generated with Claude