Skip to content

[WIP] refactor: clean up the watch server handlers #131325

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

karlkfi
Copy link
Contributor

@karlkfi karlkfi commented Apr 16, 2025

What type of PR is this?

/kind cleanup
/sig api-machinery

What this PR does / why we need it

  • Always stop the watcher when done reading events from the result
    channel. ListResource already stops the watch, but it's multiple
    layers above, which means unit tests of the internal methods need
    to replicate that behavior. So this change simplifies testing and
    ensures the watcher is stopped at least once. This should also make
    it easier to simplify the WatchServer in the future.
  • Avoid calling Done & ResultChan multiple times, to reduce locking
    and memory allocations.
  • Replace the TimeoutFactory with a Context, to reduce complexity.
  • Use the -Ch suffix for channel variables.
  • Add a done channel to the WatchServer to handle cleanup:
    stop the context and the storage watcher.
  • Fix some flaky tests that were trying to use SimpleStorage.fakeWatcher
    before Watch was called.
  • Validate url.Parse does not error in tests.
  • Add apitesting http methods for handling response body drain, read,
    and closure.

Does this PR introduce a user-facing change?

NONE

Dependencies

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 16, 2025
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.33 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.33.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Tue Apr 15 19:35:15 UTC 2025.

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 16, 2025
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver labels Apr 16, 2025
@k8s-ci-robot k8s-ci-robot requested review from apelisse and sttts April 16, 2025 00:21
@karlkfi karlkfi changed the title chore: clean up the watch server handlers [WIP] chore: clean up the watch server handlers Apr 16, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 16, 2025
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 17, 2025
@karlkfi
Copy link
Contributor Author

karlkfi commented Apr 17, 2025

Needs rebase on top of #131339

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2025
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 25, 2025
@karlkfi karlkfi changed the title [WIP] chore: clean up the watch server handlers [WIP] refactor: clean up the watch server handlers Apr 25, 2025
@karlkfi karlkfi force-pushed the karl-watch-close branch 3 times, most recently from cf15132 to 5767ca0 Compare April 25, 2025 05:47
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: karlkfi
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

karlkfi added 2 commits April 26, 2025 02:10
- Use testify assert & require
- Use io instead of ioutil
- Use named sub-tests, instead of just loops
- Fix flaky tests that weren't waiting for Watch to be called before
  sending events.
- Validate url.Parse does not error in tests.
- Close response body
- Add apitesting http methods for handling response body and websocket
  read, drain, and closure.
- Pass test context to http requests
- Fix client-go Watch and WatchList to cancel the request on error.
  This frees up resources in the client and server, allowing re-use of
  the TCP connection.
- Fix net.IsProbableEOF to catch read failures on closed responses.
  This prevents StreamWatcher.receive from sending a Decode error event
  when the response body has been closed asynchronously, which prevents
  the watch client from sending a decode error on the watch result
  channel after the watcher has been stopped by the event consumer or
  after the watch context has been cancelled or timed out.
- Update the apiserver watch endpoint tests to validate that the
  storage watcher result channel, timeout done channel, and response
  body are closed on both success and failure.
- Update the client-go rest tests to stop the watcher and wait for the
  result channel to close.
- Add channel testing helper functions to k8s.io/client-go/util/testing
- Always stop the watcher when done reading events from the result
  channel. ListResource already stops the watch, but it's multiple
  layers above, which means unit tests of the internal methods need
  to replicate that behavior. So this change simplifies testing and
  ensures the watcher is stopped at least once. This should also make
  it easier to simplify the WatchServer in the future.
- Avoid calling Done & ResultChan multiple times, to reduce locking
  and memory allocations.
- Replace the TimeoutFactory with a Context, to reduce complexity.
- Add a done channel to the WatchServer to handle cleanup:
  stop the context and the storage watcher.
- Fix some flaky tests that were trying to use SimpleStorage.fakeWatcher
  before Watch was called.
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Apr 28, 2025

@karlkfi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-e2e-gce bf78458 link true /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 9, 2025
@dims dims added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 12, 2025
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants