Description
If the controller pod is deleted (and recreated by the deployment), the underlying catalog cache is deleted with it, and the controller must re-cache the catalog images. During this period, the catalog still reports positive "progressing" and "serving" conditions, but the server returns a 404. We may want to reset these conditions to reflect the fact that the catalog needs to be unpacked again. We may want to do something similar for bundle unpacking.
Note that this bug is the cause of the upgrade-e2e test flakiness. It seems that there isn't enough time for the cache to rebuild itself and for the cluster extension upgrade to progress, so the test gets stuck in its eventually loop: from the test's point of view the catalog is up and everything is OK.
related to #1550
⚠ When addressing this issue, we should also update the upgrade-e2e test and remove the flakiness mitigation.
Once fixed, it ought to be sufficient to wait for a signal from the catalog that it has been successfully re-unpacked and is being served.