
[FG:InPlacePodVerticalScaling] Don't read AllocatedResources from PodStatus during admission #133281

Open
wants to merge 1 commit into
base: master
from

Conversation

tallclair
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

The static CPU and static memory policies attempt to read resource requests from AllocatedResources in the container status, but the way this is done can be problematic if the container status is set without allocated resources. Allocated resources should instead be read directly from the allocation manager.

Fortunately, the requests are only read through the Allocate, GetTopologyHints and GetPodTopologyHints methods, which are only called through TopologyManager.Admit, and we already overwrite the desired resources with allocated resources in admission, so we can simply remove the logic to explicitly read the allocated resources.
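
To illustrate the concern, here is a minimal sketch in Go. It uses simplified, hypothetical stand-in types (`ResourceList`, `Container`, `ContainerStatus`, `Pod`) and hypothetical helper names rather than the real Kubernetes API or kubelet code. It only shows why preferring the status value can go wrong when a status entry exists without allocated resources, and why reading the spec is sufficient if admission has already overwritten the requests with the allocated values.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the Kubernetes API types.
// The real types carry many more fields.
type ResourceList map[string]int64 // resource name -> milli-units

type Container struct {
	Name     string
	Requests ResourceList
}

type ContainerStatus struct {
	Name               string
	AllocatedResources ResourceList // nil if the status was set without allocations
}

type Pod struct {
	Containers []Container
	Statuses   []ContainerStatus
}

// cpuRequestFromStatus sketches the pattern being removed: whenever a status
// entry exists for the container, its AllocatedResources value wins.
func cpuRequestFromStatus(pod Pod, c Container) int64 {
	cpu := c.Requests["cpu"]
	for _, cs := range pod.Statuses {
		if cs.Name == c.Name {
			// A status without AllocatedResources silently yields 0 here,
			// because reading a missing key from a nil map returns the zero value.
			cpu = cs.AllocatedResources["cpu"]
		}
	}
	return cpu
}

// cpuRequestFromSpec sketches the simplified approach. It assumes admission
// has already replaced the spec requests with the allocated values.
func cpuRequestFromSpec(c Container) int64 {
	return c.Requests["cpu"]
}

func main() {
	c := Container{Name: "app", Requests: ResourceList{"cpu": 2000}}
	// The status entry exists, but AllocatedResources was never populated.
	pod := Pod{Containers: []Container{c}, Statuses: []ContainerStatus{{Name: "app"}}}

	fmt.Println(cpuRequestFromStatus(pod, c)) // prints 0
	fmt.Println(cpuRequestFromSpec(c))        // prints 2000
}
```

In this sketch the status-based lookup reports a CPU request of 0 for a container that requested 2000 millicores, while the spec-based lookup returns the expected value.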

Which issue(s) this PR is related to:

N/A

Special notes for your reviewer:

This can wait until v1.35

Does this PR introduce a user-facing change?

NONE

/sig node
/priority important-longterm
/assign @natasha41575
/cc @pravk03 @esotsal

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jul 29, 2025
@k8s-ci-robot k8s-ci-robot requested a review from esotsal July 29, 2025 17:26
@k8s-ci-robot
Contributor

@tallclair: GitHub didn't allow me to request PR reviews from the following users: pravk03.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.


@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 29, 2025
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 29, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet labels Jul 29, 2025
Contributor

@natasha41575 natasha41575 left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: a1713ca1582c2764a241dce03213edbc9f7e7b12

@natasha41575
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 29, 2025
@pravk03
Contributor

pravk03 commented Jul 29, 2025

/lgtm

@k8s-ci-robot
Contributor

@pravk03: changing LGTM is restricted to collaborators


@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@esotsal
Contributor

esotsal commented Jul 30, 2025

/cc @ffromani for approval

@k8s-ci-robot k8s-ci-robot requested a review from ffromani July 30, 2025 01:54
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2025
@esotsal
Contributor

esotsal commented Jul 30, 2025

/test pull-kubernetes-node-kubelet-serial-cpu-manager
/test pull-kubernetes-node-kubelet-serial-cpu-manager-kubetest2

@esotsal
Contributor

esotsal commented Jul 30, 2025

/lgtm

@esotsal
Contributor

esotsal commented Jul 30, 2025

/retest

@esotsal
Contributor

esotsal commented Jul 30, 2025

@tallclair since the allocation manager was introduced in v1.34, I think it would be great if this commit could be part of v1.34 as well. We can monitor for flaky tests and corner cases. I think this commit has a low risk of impacting behaviour, since the IPPVS static policy is safeguarded, but just in case.

@natasha41575
Contributor

natasha41575 commented Jul 30, 2025

since the allocation manager was introduced in v1.34, I think it would be great if this commit could be part of v1.34 as well. We can monitor for flaky tests and corner cases. I think this commit has a low risk of impacting behaviour, since the IPPVS static policy is safeguarded, but just in case.

  1. Allocation manager was introduced in v1.33 ([FG:InPlacePodVerticalScaling] Move pod resource allocation management out of the status manager #130254)

  2. I agree that there is probably low risk in including this in 1.34, but what makes it important enough to warrant an exception one week after code freeze?

@natasha41575 natasha41575 moved this from Triage to Needs Approver in SIG Node: code and documentation PRs Jul 30, 2025
@natasha41575 natasha41575 moved this from Needs Approver to Waiting on Author in SIG Node: code and documentation PRs Jul 30, 2025
@esotsal
Contributor

esotsal commented Jul 30, 2025

During the last weeks of v1.34, a couple of tasks, like PodAdmission and resize, were moved to the allocation manager. I thought it would be a good idea to include this as well, to have them all in one bundle. There is no other special reason.

@tallclair
Member Author

Yeah, I didn't intend for this to be included in v1.34, since I don't have any indication that it's an actual bug. This would only be a problem if something initiated the container status before it was admitted by the kubelet. I think we should wait for v1.35 to merge this.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 6, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2025
@natasha41575
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 422e57b60f0c5e260db6b5be3f89cea9acc5b3b6

@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@esotsal
Contributor

esotsal commented Aug 8, 2025

The failure in the Windows pipeline is not related to this commit; there is a known flaky test issue tracked in #133297.

@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/kubelet
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
  • release-note-none: Denotes a PR that doesn't merit a release note.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Waiting on Author
6 participants