OADP-6653: CloudStorage stop retrying after 3 errors #1937

kaovilai · 2025-09-03T13:54:31Z

- Fix unnecessary secret updates and logging in STS flow (remove after rebase against OADP-6652: Fix unnecessary secret updates and logging in STS flow #1936)
- Add retry logic and status conditions to CloudStorage controller

Why the changes were made

How to test the changes made

The operator was repeatedly logging "Secret already exists, updating" and "Following standardized STS workflow, secret created successfully" even when the secret content hadn't changed. This was happening because the CloudStorage controller calls STSStandardizedFlow() on every reconciliation, which always attempted to create the secret first, then caught the AlreadyExists error and performed an update.

Changed the approach to:
- First check if the secret exists
- Compare existing data with desired data
- Only update when there are actual differences
- Skip updates and avoid logging when content is identical
- Changed CloudStorage controller to use Debug level and more accurate message when STS secret is available (not necessarily created)

This eliminates unnecessary API calls to the Kubernetes cluster and reduces noise in the operator logs.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
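The check-compare-update flow described in this commit message can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code from the PR: `secretAction`, `dataEqual`, and `decideSecretAction` are hypothetical names, and the secret is modeled as a plain `map[string][]byte` rather than a Kubernetes `Secret` object.

```go
package main

import (
	"bytes"
	"fmt"
)

// secretAction is a hypothetical type naming what a reconcile pass should do.
type secretAction string

const (
	actionCreate secretAction = "create"
	actionUpdate secretAction = "update"
	actionSkip   secretAction = "skip"
)

// dataEqual reports whether two secret data maps hold the same keys
// with byte-identical values.
func dataEqual(a, b map[string][]byte) bool {
	if len(a) != len(b) {
		return false
	}
	for k, av := range a {
		bv, ok := b[k]
		if !ok || !bytes.Equal(av, bv) {
			return false
		}
	}
	return true
}

// decideSecretAction mirrors the steps listed above (assumed shape):
// create when the secret is missing, skip when the content is identical,
// update only when there is an actual difference.
func decideSecretAction(exists bool, existing, desired map[string][]byte) secretAction {
	if !exists {
		return actionCreate
	}
	if dataEqual(existing, desired) {
		return actionSkip
	}
	return actionUpdate
}

func main() {
	desired := map[string][]byte{"credentials": []byte("example-content")}
	stale := map[string][]byte{"credentials": []byte("old-content")}

	fmt.Println(decideSecretAction(false, nil, desired))
	fmt.Println(decideSecretAction(true, desired, desired))
	fmt.Println(decideSecretAction(true, stale, desired))
}
```

In a real controller the `skip` branch is where both the API write and the log line would be avoided, which is the noise reduction the commit message describes.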

openshift-ci · 2025-09-03T13:54:49Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci · 2025-09-03T13:55:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kaovilai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [kaovilai]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2025-09-03T18:27:57Z

@kaovilai: This pull request references OADP-6653 which is a valid jira issue.

In response to this:

Remove after rebase against OADP-6652: Fix unnecessary secret updates and logging in STS flow #1936 Fix unnecessary secret updates and logging in STS flow

Add retry logic and status conditions to CloudStorage controller

Why the changes were made

How to test the changes made

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

kaovilai · 2025-09-04T14:34:16Z

From scrum: back off on transient/unknown 500 errors.
If it is a known issue, use a very long backoff. This would also cover the case where there is nothing to watch, such as STS tokens, which on rotation do not result in secret updates.

- Add Conditions field to CloudStorageStatus for better observability
- Implement exponential backoff by returning errors on bucket operations
- Controller-runtime automatically handles retries (5ms to 1000s max)
- Add condition constants for type-safe reason strings
- Create mock bucket client for improved testing
- Add comprehensive tests for backoff behavior and conditions

Key improvements:
- Standard Kubernetes pattern using built-in workqueue backoff
- Self-healing: continues retrying with increasing delays
- Better observability through status conditions
- Per-item backoff: each CloudStorage CR gets independent retry timing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 3, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 3, 2025

kaovilai changed the title ~~CloudStorage LimitedRetries~~ OADP-6653: CloudStorage stop retrying after 3 errors Sep 3, 2025

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 3, 2025

kaovilai force-pushed the CloudStorage-LimitedRetries branch from dbbfdab to 3a9b0b4 Compare September 5, 2025 14:10

kaovilai force-pushed the CloudStorage-LimitedRetries branch from 3a9b0b4 to bb041c9 Compare September 5, 2025 14:53
