🐛 catalog status is not updated during re-unpack after pod reset #1626

@perdasilva

Description

If the controller pod is deleted (and recreated by the deployment), the underlying catalog cache is deleted and the controller must re-cache the catalog images. During this period, the catalog still reports positive "progressing" and "serving" conditions, but the server returns a 404. We may want to reset these conditions to reflect the fact that the catalog needs to be unpacked again. We may want to do something similar for bundle unpacking.
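
A minimal sketch of the condition reset proposed above, using only the Go standard library. Everything here is hypothetical and for illustration: the `Condition` type is a simplified stand-in for `metav1.Condition`, the `cachePresent` flag is assumed to come from some check against the local cache, and the reason strings (`CacheMissing`, `Unpacking`) and status values are assumptions, not the project's actual ones.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// setCondition replaces the condition with the same Type, or appends it.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// reconcileConditions overwrites stale conditions when the local cache is gone.
// cachePresent is assumed to be the result of a check against the catalog cache.
func reconcileConditions(conds []Condition, cachePresent bool) []Condition {
	if cachePresent {
		return conds // cache intact: the existing conditions are still accurate
	}
	conds = setCondition(conds, Condition{
		Type:    "Serving",
		Status:  "False",
		Reason:  "CacheMissing", // hypothetical reason
		Message: "catalog content must be re-unpacked",
	})
	conds = setCondition(conds, Condition{
		Type:    "Progressing",
		Status:  "True",
		Reason:  "Unpacking", // hypothetical reason
		Message: "re-unpacking catalog image after cache loss",
	})
	return conds
}

func main() {
	// Conditions left over from before the pod was deleted.
	stale := []Condition{
		{Type: "Progressing", Status: "False", Reason: "Succeeded"},
		{Type: "Serving", Status: "True", Reason: "Available"},
	}
	for _, c := range reconcileConditions(stale, false) {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```

In a real controller the same idea would more likely go through `meta.SetStatusCondition` from `k8s.io/apimachinery` and a status update against the API server; the sketch only shows the decision to reset conditions when the cache is missing.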

Note that this bug is the cause of the upgrade-e2e test flakiness. It seems that there isn't enough time for the cache to rebuild itself and for the cluster extension upgrade to progress, so the test enters its eventually loop because, from its point of view, the catalog appears to be up and everything looks OK.

related to #1550

⚠ When addressing this issue, we should also update the upgrade-e2e test and remove the flakiness mitigation.
Once fixed, it ought to be sufficient to wait for a signal from the catalog to ensure it has been successfully re-unpacked and is being served.
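
As a rough illustration of what "wait for a signal from the catalog" could look like in the test, here is a small polling helper built on the standard library only. The names `pollUntil` and `waitForServing` are hypothetical, and the `isServing` callback is an assumed stand-in for reading the catalog's "serving" condition from the API server; the actual e2e test may use a different mechanism.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollUntil calls check immediately and then once per interval until it
// returns true, returns an error, or ctx is done.
func pollUntil(ctx context.Context, interval time.Duration, check func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := check()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("catalog not serving: %w", ctx.Err())
		case <-ticker.C:
		}
	}
}

// waitForServing bounds pollUntil with a timeout. isServing is a hypothetical
// callback that reports whether the catalog's serving condition is true.
func waitForServing(timeout, interval time.Duration, isServing func() (bool, error)) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return pollUntil(ctx, interval, isServing)
}

func main() {
	// Simulated catalog that starts serving on the third check.
	attempts := 0
	isServing := func() (bool, error) {
		attempts++
		return attempts >= 3, nil
	}
	if err := waitForServing(time.Second, 10*time.Millisecond, isServing); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("catalog serving after", attempts, "checks")
}
```

This only helps once the conditions are trustworthy: while the catalog keeps reporting a stale positive "serving" condition during re-unpack, a wait like this would return immediately.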

Labels

kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.