Skip to content

Add PSA for blocking .host on pod probes #125271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

tssurya
Copy link
Contributor

@tssurya tssurya commented Jun 1, 2024

What type of PR is this?

/kind bug
/kind feature

What this PR does / why we need it:

This PR aims to add the following fields as part of PSA so that users can be blocked/warned when they create probes with the .Host field set.

* spec.containers[*].livenessProbe.httpGet.host
* spec.containers[*].readinessProbe.httpGet.host
* spec.containers[*].startupProbe.httpGet.host
* spec.containers[*].livenessProbe.tcpSocket.host
* spec.containers[*].readinessProbe.tcpSocket.host
* spec.containers[*].startupProbe.tcpSocket.host
* spec.containers[*].lifecycle.postStart.tcpSocket.host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
* spec.containers[*].lifecycle.preStop.tcpSocket.host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
* spec.containers[*].lifecycle.postStart.httpGet.host
* spec.containers[*].lifecycle.preStop.httpGet.host
* spec.initContainers[*].livenessProbe.httpGet.host
* spec.initContainers[*].readinessProbe.httpGet.host
* spec.initContainers[*].startupProbe.httpGet.host
* spec.initContainers[*].livenessProbe.tcpSocket.host
* spec.initContainers[*].readinessProbe.tcpSocket.host
* spec.initContainers[*].startupProbe.tcpSocket.host
* spec.initContainers[*].lifecycle.postStart.tcpSocket.host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
* spec.initContainers[*].lifecycle.preStop.tcpSocket.host // Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for backward compatibility.
* spec.initContainers[*].lifecycle.postStart.httpGet.host
* spec.initContainers[*].lifecycle.preStop.httpGet.host

Which issue(s) this PR fixes:

Fixes #99425
Fixes kubernetes/enhancements#4940

Special notes for your reviewer:

This is my first PR to pod-security-admission area, so I might need some hand-holding, but I will mark the PR as draft till I get all the pieces done.

So one question I have is should I fix the tests on sig-node side like I have done here today by making them use the privileged mode? Or should I do something like 77fa562

Ran the following tests locally:

  • UPDATE_POD_SECURITY_FIXTURE_DATA=true go test -v ./test/... -run TestFixtures
  • go test -v ./test/... -run TestFixtures
  • go test -v ./policy/... -run TestHostProbesAndHostLifecycle
  • go test -v ./policy/... -timeout 30s
  • go test -v ./test/... -timeout 30s
  • make test-integration WHAT=./test/integration/auth GOFLAGS="-v" KUBE_TEST_ARGS='-run ^TestPodSecurity$$'

Does this PR introduce a user-facing change?

The `baseline` and `restricted` pod security admission levels now block setting the `host` field on probe and lifecycle handlers

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Please use the following format for linking documentation:


@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 1, 2024
@k8s-ci-robot k8s-ci-robot requested review from deads2k and krmayankk June 1, 2024 19:37
@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 1, 2024
@tssurya tssurya force-pushed the psa-probe-lifecycle-handler-host-option branch 2 times, most recently from 7ff5459 to 24e1078 Compare July 17, 2024 16:02
@tssurya tssurya changed the title Add PSS for blocking .host on pod probes Add PSA for blocking .host on pod probes Jul 17, 2024
@ibihim
Copy link
Contributor

ibihim commented Sep 9, 2024

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 9, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 29, 2025
@tssurya
Copy link
Contributor Author

tssurya commented Jan 29, 2025

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 29, 2025
@tssurya tssurya force-pushed the psa-probe-lifecycle-handler-host-option branch from 24e1078 to cc1d5c7 Compare March 24, 2025 09:37
@liggitt liggitt self-assigned this May 21, 2025
@liggitt liggitt added this to the v1.34 milestone Jun 10, 2025
@Rajalakshmi-Girish Rajalakshmi-Girish moved this to Pending inclusion in [sig-release] Bug Triage Jun 30, 2025
@adilGhaffarDev
Copy link
Contributor

Hello @tssurya @liggitt
This PR has not been updated for 4 months, so I'd like to check what's the status. If there's anything we can do, please let us know. The code freeze is starting 02:00 UTC Friday 25th July 2025 (about 4 weeks from now). Please make sure the PR has both lgtm and approved labels before the code freeze.
Thanks!

Copy link
Member

@liggitt liggitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple nits, then lgtm

please also open the documentation PR to update https://kubernetes.io/docs/concepts/security/pod-security-standards/#baseline for the dev-1.34 branch in https://github.com/kubernetes/website/

@tssurya
Copy link
Contributor Author

tssurya commented Jul 23, 2025

a couple nits, then lgtm

please also open the documentation PR to update https://kubernetes.io/docs/concepts/security/pod-security-standards/#baseline for the dev-1.34 branch in https://github.com/kubernetes/website/

thank you @liggitt for taking the time and patiently reviewing this and for your valuable suggestions, I'll fix the last set of comments now..
I have opened a draft docs PR here: kubernetes/website#51498 will update the docs

tssurya added 6 commits July 23, 2025 21:15
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
This commit adds the fixture tests for the
new .host field restrictions on probe
and lifecycle handlers.

ran UPDATE_POD_SECURITY_FIXTURE_DATA=true go test -v ./test/... -run TestFixtures

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
The sig-node tests have scenarios of doing probes and
lifecycle handler tests with post-start and pre-stop hooks
setting the host field to be another pod.

In baseline level such things won't be allowed because of
the PSA rules we are adding in this PR. So unsetting
the host field means it uses the podIP of self for doing
the checks and using that in the pre-stop and post-start
hooks is tricky because of the timing issues with when the
container is actually up v/s running the test.

So I have changed the tests to be privileded for them to
use the .host fields if they desire to.

See kubernetes#133091
which is an issue opened to properly refactor these tests.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
@tssurya tssurya force-pushed the psa-probe-lifecycle-handler-host-option branch from 1b05dbf to 4c87e60 Compare July 23, 2025 19:19
@liggitt
Copy link
Member

liggitt commented Jul 23, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 23, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 9953493bdf0e2230d06acc863906b76cd9a3b1f8

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 23, 2025
@k8s-ci-robot k8s-ci-robot merged commit 49cd871 into kubernetes:master Jul 23, 2025
13 checks passed
@github-project-automation github-project-automation bot moved this from Tracked to Done in [sig-release] Bug Triage Jul 23, 2025
@github-project-automation github-project-automation bot moved this from Archive-it to Done in SIG Node CI/Test Board Jul 23, 2025
@github-project-automation github-project-automation bot moved this from In Review to Closed / Done in SIG Auth Jul 23, 2025
@michaelasp
Copy link
Contributor

michaelasp commented Jul 24, 2025

It seems like this PR may have broken https://testgrid.k8s.io/sig-api-machinery-emulated-version

@tssurya it seems to be breaking poststart hooks, is this also occurring in regular conformance tests?

@tssurya
Copy link
Contributor Author

tssurya commented Jul 24, 2025

@michaelasp : Looking at the test grid link I can see its the same two sig-node tests around Lifecycle Hooks - if the tests are using '.Host' in the life cycle hook then yes I might have broken them :( starting this PR
I did fix these two failing tests in regular conformance to run in privileged mode..(See last commit in this PR: 4c87e60) Should I have also done that for n-1 somehow ? LMK if I missed a spot somehow..I am new to the test grid so not exactly sure what the version compatbility test matrix does - but if its doing what I think its doing which is running my PSA code in older n-1 versions then yes we will fail with:

https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kind-compatibility-versions-n-minus-1/1948226823225610240
	3s
{ failed [FAILED] Error creating Pod: pods "pod-with-prestop-http-hook" is forbidden: violates PodSecurity "baseline:latest": probe or lifecycle host (container "pod-with-prestop-http-hook" uses probe or lifecycle host "10.244.1.78")

We added a PSA blocking all .Host fields from being set in baseline mode. I'm not sure how in the past these cases have been handled... If I knew where to look for version compatibility tests I can fix them there too - Is it about having two versions of the tests or something and keeping a Min version to run these?

Also, those two sig-node tests also need a cleanup (#133091).

@danwinship
Copy link
Contributor

danwinship commented Jul 24, 2025

It seems like this might be the first new PSA rule added since KEP-4330: Compatibility Versions in Kubernetes?

I think the answer here is that PSA application needs to be based on the --emulated-version rather than the actual apiserver version. @liggitt @jpbetz?

@liggitt
Copy link
Member

liggitt commented Jul 24, 2025

hrmm... yeah... emulating an older minor should really adjust what PSA sees as the "latest" minor for the checks it chooses to run. I started plumbing emulated version into admission and it got really big really fast. I opened #133176 as a stop-gap for this one feature, will follow up in 1.35 to make admission and PSA generally emulation-version-aware.

@liggitt
Copy link
Member

liggitt commented Jul 28, 2025

(following up, https://testgrid.k8s.io/sig-testing-kind#compatibility-version-test-n-minus-1 jobs are green again after merge of #133176)

@AkihiroSuda
Copy link
Member

This one is still failing since 2025-07-24:

• [FAILED] [6.590 seconds]
[sig-node] Container Lifecycle Hook when create a pod with lifecycle hook [It] should execute poststart https hook properly [MinimumKubeletVersion:1.23] [NodeConformance] [sig-node, NodeConformance]
k8s.io/kubernetes/test/e2e/common/node/lifecycle_hook.go:194
   Timeline >>
  STEP: Creating a kubernetes client @ 07/24/25 07:49:56.477
  I0724 07:49:56.477650 2815 util.go:453] >>> kubeConfig: /home/prow/go/src/k8s.io/kubernetes/_rundir/188a9770-7f1e-4d01-8648-e5dce8a2ca03/kubeconfig
  STEP: Building a namespace api object, basename container-lifecycle-hook @ 07/24/25 07:49:56.478
  STEP: Waiting for a default service account to be provisioned in namespace @ 07/24/25 07:49:56.561
  STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 07/24/25 07:49:56.569
  STEP: create the container to handle the HTTPGet hook request. @ 07/24/25 07:49:56.59
  STEP: create the pod with lifecycle hook @ 07/24/25 07:50:02.706
  I0724 07:50:02.712557 2815 lifecycle_hook.go:101] Unexpected error: Error creating Pod: 
      <*errors.StatusError | 0xc00063c000>: 
      pods "pod-with-poststart-https-hook" is forbidden: violates PodSecurity "baseline:latest": probe or lifecycle host (container "pod-with-poststart-https-hook" uses probe or lifecycle host "10.244.0.7")
      {
          ErrStatus: 
              code: 403
              details:
                kind: pods
                name: pod-with-poststart-https-hook
              message: 'pods "pod-with-poststart-https-hook" is forbidden: violates PodSecurity
                "baseline:latest": probe or lifecycle host (container "pod-with-poststart-https-hook"
                uses probe or lifecycle host "10.244.0.7")'
              metadata: {}
              reason: Forbidden
              status: Failure,
      }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Closed / Done
Status: Done
Development

Successfully merging this pull request may close these issues.

Add PSA to block setting .host field from ProbeHandler and LifecycleHandler Pod probes lead to blind SSRF from the node
10 participants