WIP 🐛 Fix: Truncate large error messages in status conditions (OCPBUGS-59518, OCPBUGS-38567) #2169
When upgrading operators, CRD validation errors can be very large (50KB+). Kubernetes rejects status updates whose condition message exceeds 32KB with "Too long: may not be more than 32768 bytes". This causes ClusterExtension upgrades to fail and get stuck.

Added a `truncateMessage()` function that cuts messages over 30KB. It is applied to the status condition functions that handle large errors:

- `setStatusProgressing()` - handles CRD validation errors
- `ensureAllConditionsWithReason()` - handles resolution errors
- `setInstalledStatusConditionUnknown()` - handles bundle errors

Messages keep the important info at the start and get a "... [message truncated]" suffix. Upgrades now complete successfully even with large CRD validation errors.

Added unit tests for the truncation logic and CRD error scenarios.

Assisted-by: Cursor
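For readers who want to see the idea in code, here is a minimal sketch of the truncation described above. The constant value and suffix text are the ones quoted in this PR's diff and review thread; the function body is an assumption about how such a helper could look, not the PR's actual implementation. In particular, whether the suffix counts toward the 30000-byte cap is a guess.

```go
package main

const (
	// Value taken from the diff in this PR.
	maxConditionMessageLength = 30000
	// Suffix text as quoted in the review thread.
	truncationSuffix = "\n\n... [message truncated]"
)

// truncateMessage returns message unchanged when it fits. Otherwise it keeps
// the beginning of the message and appends truncationSuffix, so the result is
// exactly maxConditionMessageLength bytes long.
// Assumption: the suffix is counted inside the cap.
func truncateMessage(message string) string {
	if len(message) <= maxConditionMessageLength {
		return message
	}
	keep := maxConditionMessageLength - len(truncationSuffix)
	return message[:keep] + truncationSuffix
}
```

Keeping the head of the message matches the description's point that the important information sits at the start of the error.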
Can you include some details of the messages that are too long? I feel like arbitrarily truncating the message is sort of papering over the underlying issue, which is that 30k-byte messages in conditions are a poor UX, and the real solution would be to make the message shorter to begin with.

/hold
@@ -27,6 +27,23 @@ import (
 	ocv1 "github.com/operator-framework/operator-controller/api/v1"
 )

+const (
+	// maxConditionMessageLength is the Kubernetes limit minus some buffer for safety
+	maxConditionMessageLength = 30000
If the maximum length is 32768 according to Kubernetes validation, we can use all of that. There's no need to have an extra bit of space for safety.
We need some extra room, at least for `truncationSuffix = "\n\n... [message truncated]"`.
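Both points in this thread can be satisfied at once: use the full 32768-byte limit, but reserve the suffix length out of it. The sketch below shows that arithmetic. `truncateWithSuffix` and `kubernetesMessageLimit` are hypothetical names, and the step that backs up to a rune boundary is an addition not discussed in the thread, included so that a multi-byte UTF-8 character is never cut in half.

```go
package main

import "unicode/utf8"

const (
	// kubernetesMessageLimit is the limit quoted in the validation error.
	kubernetesMessageLimit = 32768
	// Suffix text as quoted in the review thread (25 bytes).
	truncationSuffix = "\n\n... [message truncated]"
)

// truncateWithSuffix (hypothetical helper) caps message at limit bytes in
// total, suffix included. limit is assumed to be non-negative.
func truncateWithSuffix(message string, limit int) string {
	if len(message) <= limit {
		return message
	}
	if limit <= len(truncationSuffix) {
		// Degenerate case: no room for the suffix, fall back to a plain byte cut.
		return message[:limit]
	}
	// Budget for the original text: the limit minus the bytes the suffix needs.
	keep := limit - len(truncationSuffix)
	// Back up until keep points at the first byte of a rune.
	for keep > 0 && !utf8.RuneStart(message[keep]) {
		keep--
	}
	return message[:keep] + truncationSuffix
}
```

With the 25-byte suffix above, up to 32743 bytes of the original message survive, compared with 30000 (or fewer) under the fixed buffer.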
@@ -160,7 +160,7 @@ func ensureAllConditionsWithReason(ext *ocv1.ClusterExtension, reason v1alpha1.C
 	Type:    condType,
 	Status:  metav1.ConditionFalse,
 	Reason:  string(reason),
-	Message: message,
+	Message: truncateMessage(message),
If there's a limit imposed on all condition messages, it seems like we need to make sure that we truncate all condition messages.
This is one of many places where we set condition messages, right?
We may need to implement a wrapper around `meta.SetStatusCondition()` that:
- truncates messages
- everything throughout our project uses.
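A rough sketch of what such a wrapper could look like follows. To keep the example self-contained, `Condition` and `setStatusCondition` are simplified stand-ins for `metav1.Condition` and the upstream `meta.SetStatusCondition` helper in `k8s.io/apimachinery/pkg/api/meta`; a real version would presumably call the upstream helper directly. The wrapper name `SetCondition` and the choice of 32768 as the cap are assumptions.

```go
package main

// Condition is a stand-in for metav1.Condition, reduced to the fields this
// sketch needs.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

const (
	maxConditionMessageLength = 32768 // limit quoted in the validation error
	truncationSuffix          = "\n\n... [message truncated]"
)

// setStatusCondition is a simplified stand-in for meta.SetStatusCondition:
// it replaces the condition with the same Type, or appends a new one.
func setStatusCondition(conditions *[]Condition, c Condition) {
	for i := range *conditions {
		if (*conditions)[i].Type == c.Type {
			(*conditions)[i] = c
			return
		}
	}
	*conditions = append(*conditions, c)
}

// SetCondition (hypothetical wrapper) is the single entry point the project
// would use to set conditions, so every message is capped in one place.
func SetCondition(conditions *[]Condition, c Condition) {
	if len(c.Message) > maxConditionMessageLength {
		keep := maxConditionMessageLength - len(truncationSuffix)
		c.Message = c.Message[:keep] + truncationSuffix
	}
	setStatusCondition(conditions, c)
}
```

Call sites such as `ensureAllConditionsWithReason()` would then pass the raw message and rely on the wrapper, instead of each calling `truncateMessage()` themselves.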
Yes, I think a wrapper will be better as well. +1
Hi @joelanford, thank you for the fast review.
I remember when we discussed it in the past the idea was just trunc