fix nil pointer dereference when NodeInfo.RemovePod #97609
@xiaoanyunfei: This issue is currently awaiting triage.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: xiaoanyunfei
/retest
pkg/scheduler/framework/types.go
Outdated
@@ -581,7 +585,7 @@ func (n *NodeInfo) RemovePod(pod *v1.Pod) error {
 			return nil
 		}
 	}
-	return fmt.Errorf("no corresponding pod %s in pods of node %s", pod.Name, n.node.Name)
+	return fmt.Errorf("no corresponding pod %s in pods of node %s", pod.Name, n.name)
Getting here means there is another bug, so before we decide whether it is worth carrying the node name in NodeInfo, let's debug why we are here in the first place.
/cc @alculquicondor
This would only happen if the Pod was already removed, which in turn would be when 2 Pod delete events are received.
But we can use pod.Spec.NodeName.
Unless this is a bug that we already solved (can't find the PR).
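As a rough illustration of the pod.Spec.NodeName suggestion, the sketch below uses simplified stand-in types (Pod, PodSpec, Node and NodeInfo here are hypothetical reductions, not the real scheduler or v1 definitions). It only shows that the error path no longer touches n.node, so a second delete for the same pod returns an error instead of panicking.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real v1.Pod and framework.NodeInfo.
type PodSpec struct{ NodeName string }

type Pod struct {
	Name string
	Spec PodSpec
}

type Node struct{ Name string }

type NodeInfo struct {
	node *Node // may be nil, e.g. after the node has been removed
	pods []*Pod
}

// RemovePod deletes the pod from the list, or returns an error if it is absent.
func (n *NodeInfo) RemovePod(pod *Pod) error {
	for i := range n.pods {
		if n.pods[i].Name == pod.Name {
			n.pods = append(n.pods[:i], n.pods[i+1:]...)
			return nil
		}
	}
	// The node name comes from the pod itself, so n.node is never dereferenced.
	return fmt.Errorf("no corresponding pod %s in pods of node %s", pod.Name, pod.Spec.NodeName)
}

func main() {
	pod := &Pod{Name: "pod-1", Spec: PodSpec{NodeName: "node-a"}}
	n := &NodeInfo{node: nil, pods: []*Pod{pod}}

	fmt.Println(n.RemovePod(pod)) // first delete event: pod found and removed
	fmt.Println(n.RemovePod(pod)) // second delete event: error, no panic
}
```

Whether a duplicate delete event can legitimately occur is the open question in this thread; the sketch does not answer it.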
Is there a legitimate case where we get two delete events for a pod?
@xiaoanyunfei could you clarify which version you were using?
aaa0e3a to dec0193
I'm using v1.19.0
Could you retry in the latest 1.19 release?
I'll take a look at the code just in case.
Also, I would appreciate it if you could provide repro steps (from a real-world scenario), including the initial distribution of pods in the node. The stack trace also seems incomplete.
/triage needs-information
@xiaoanyunfei: PR needs rebase.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please close it. Send feedback to sig-contributor-experience at kubernetes/community.
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
/close
@alculquicondor: Closed this PR.
What type of PR is this?
Add one of the following kinds:
/kind bug
What this PR does / why we need it:
When RemoveNode is called before RemovePod, k8s will panic: "invalid memory address or nil pointer dereference".
This PR fixes the nil pointer dereference in NodeInfo.RemovePod.
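A minimal sketch of the failure described above, using simplified stand-in types rather than the real scheduler code. It assumes that RemoveNode leaves the NodeInfo with a nil node pointer and that the pod is no longer in the pod list; the helper removePodRecovered is hypothetical and exists only to capture the panic for inspection.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the scheduler types.
type Node struct{ Name string }

type Pod struct{ Name string }

type NodeInfo struct {
	node *Node
	pods []*Pod
}

// RemoveNode clears the node pointer (an assumption based on the PR description).
func (n *NodeInfo) RemoveNode() { n.node = nil }

// RemovePod mirrors the pre-fix error path: it reads n.node.Name
// when the pod is not found, which panics if n.node is nil.
func (n *NodeInfo) RemovePod(pod *Pod) error {
	for i, p := range n.pods {
		if p.Name == pod.Name {
			n.pods = append(n.pods[:i], n.pods[i+1:]...)
			return nil
		}
	}
	return fmt.Errorf("no corresponding pod %s in pods of node %s", pod.Name, n.node.Name)
}

// removePodRecovered calls RemovePod and turns a panic into a string.
func removePodRecovered(n *NodeInfo, pod *Pod) (panicMsg string, err error) {
	defer func() {
		if r := recover(); r != nil {
			panicMsg = fmt.Sprint(r)
		}
	}()
	err = n.RemovePod(pod)
	return "", err
}

func main() {
	n := &NodeInfo{node: &Node{Name: "node-a"}}
	n.RemoveNode() // node removed first
	msg, err := removePodRecovered(n, &Pod{Name: "pod-1"})
	fmt.Println("panic:", msg, "err:", err)
}
```

In this sketch the panic is only reachable when the pod is missing from the list, which matches the reviewers' point that reaching this line may indicate a separate bug.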
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/sig-scheduler