Resolve confusing use of TooManyRequests error for eviction #133097
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hi @kei01234kei. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: kei01234kei. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/sig apps
Let me also assign you as reviewers because I saw you in the issue discussion.
/ok-to-test
@kei01234kei: The following tests failed; say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
@@ -434,7 +434,25 @@ func (r *EvictionREST) checkAndDecrement(namespace string, podName string, pdb p
	}
	if pdb.Status.DisruptionsAllowed == 0 {
		err := errors.NewTooManyRequests("Cannot evict pod as it would violate the pod's disruption budget.", 0)
		err.ErrStatus.Details.Causes = append(err.ErrStatus.Details.Causes, metav1.StatusCause{Type: policyv1.DisruptionBudgetCause, Message: fmt.Sprintf("The disruption budget %s needs %d healthy pods and has %d currently", pdb.Name, pdb.Status.DesiredHealthy, pdb.Status.CurrentHealthy)})
		condition := meta.FindStatusCondition(pdb.Status.Conditions, policyv1.DisruptionAllowedCondition)
		if condition.Status == metav1.ConditionFalse {
condition.Status will panic if condition is nil, which FindStatusCondition will return if the condition is not present.
I'd suggest customizing how we construct the message based on CurrentHealthy / DesiredHealthy / presence of a SyncFailedReason or other False condition, with a sensible generic fallback to avoid being confusing. I'd also suggest keeping the existing message as-is if CurrentHealthy <= DesiredHealthy since that is not confusing.
condition := meta.FindStatusCondition(pdb.Status.Conditions, policyv1.DisruptionAllowedCondition)
var msg string
switch {
case pdb.Status.CurrentHealthy <= pdb.Status.DesiredHealthy:
msg = fmt.Sprintf("The disruption budget %s needs %d healthy pods and has %d currently", pdb.Name, pdb.Status.DesiredHealthy, pdb.Status.CurrentHealthy)
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0 && condition.Reason == policy.SyncFailedReason:
msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently because it failed sync: %v", pdb.Name, condition.Message)
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0:
msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently: %v", pdb.Name, condition.Message)
default:
msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently", pdb.Name)
}
err.ErrStatus.Details.Causes = append(err.ErrStatus.Details.Causes, metav1.StatusCause{Type: policyv1.DisruptionBudgetCause, Message: msg})
+1 for the above flow
Perhaps this part could also output conditions without a condition.Message and also print the condition.Reason, for example as sketched after the excerpt below:
switch {
...
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0:
msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently: %v", pdb.Name, condition.Message)
...
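For illustration, that extra case might look like this (a sketch only; the case guard and the exact message wording are assumptions, not part of the suggestion above):
switch {
...
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) == 0 && len(condition.Reason) > 0:
	// Hypothetical: the condition carries no Message, so surface its Reason instead of the generic fallback.
	msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently (reason: %s)", pdb.Name, condition.Reason)
...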
@kei01234kei It would be great if we could test the new errors. I think we can add new cases to this unit test (a rough sketch of one such case follows below):
func TestEvictionIgnorePDB(t *testing.T) {
Btw, the name TestEvictionIgnorePDB does not describe all of its test cases well anymore, because not all the cases ignore PDBs. The easiest way to fix this is as follows, IMO:
s/TestEviction/TestEvictionWithETCD
s/TestEvictionIgnorePDB/TestEviction
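A rough sketch of the kind of table entry that could cover the sync-failure path (the field names and the pdbWithCondition helper are hypothetical, not the actual structure of the test table):
{
	// Hypothetical case: the PDB has DisruptionsAllowed == 0 and a DisruptionAllowed=False
	// condition with the SyncFailed reason, so the new sync-failure message is expected.
	name: "eviction blocked because the PDB failed to sync",
	pdb: pdbWithCondition(metav1.Condition{
		Type:    policyv1.DisruptionAllowedCondition,
		Status:  metav1.ConditionFalse,
		Reason:  policy.SyncFailedReason,
		Message: "found no controller ref for pod",
	}),
	expectError: "does not allow evicting pods currently because it failed sync",
},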
What type of PR is this?
/kind bug
What this PR does / why we need it:
To resolve the issue "Confusing use of TooManyRequests error for eviction."
Which issue(s) this PR is related to:
Fixes #106286
Special notes for your reviewer:
#106286 (comment)
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: