Description
What happened?
We have CRDs with multiple apiVersions. Even if the old (non-storage) apiVersions are not used at all, we regularly receive conversion requests for them.
We get roughly one conversion request for each non-storage apiVersion per kube-apiserver instance for every CR create/update (in practice slightly fewer, but we are not sure why).
So if we have a CRD with 5 old apiVersions and a cluster with 3 kube-apiservers
=> we get roughly 15 conversion requests for every create/update on a CR.
What did you expect to happen?
I would expect to only get conversion requests when conversion is required, e.g. if a client requests a CR in a different apiVersion than the one in which the object is stored in etcd.
How can we reproduce it (as minimally and precisely as possible)?
Create a Kubernetes cluster via kind
$ kind create cluster
Deploy the Cluster CRD
$ kubectl apply -f ./crd_cluster.yaml
crd_cluster.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.17.0
  name: clusters.cluster.x-k8s.io
spec:
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1", "v1beta1"]
      clientConfig:
        service:
          namespace: system
          name: webhook-service
          path: /convert
  group: cluster.x-k8s.io
  names:
    kind: Cluster
    listKind: ClusterList
    plural: clusters
    singular: cluster
  scope: Namespaced
  versions:
  - deprecated: true
    name: v1alpha3
    schema:
      openAPIV3Schema:
        properties:
          spec:
            properties:
              paused:
                type: boolean
            type: object
        type: object
    served: false
    storage: false
  - deprecated: true
    name: v1alpha4
    schema:
      openAPIV3Schema:
        properties:
          spec:
            properties:
              paused:
                type: boolean
            type: object
        type: object
    served: true
    storage: false
  - name: v1beta1
    schema:
      openAPIV3Schema:
        properties:
          spec:
            properties:
              paused:
                type: boolean
            type: object
        type: object
    served: true
    storage: true
Deploy the Cluster CR
$ kubectl apply -f ./cr_cluster.yaml
cr_cluster.yaml
kind: Cluster
apiVersion: cluster.x-k8s.io/v1beta1
metadata:
  name: cluster-1
  namespace: default
Observe kube-apiserver logs
$ kubectl -n kube-system logs -f kube-apiserver-kind-control-plane
...
E0204 13:21:54.207206 1 watcher.go:567] failed to prepare current and previous objects: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=Cluster failed: Post "https://webhook-service.system.svc:443/convert?timeout=30s": service "webhook-service" not found
...
E0204 13:30:47.583960 1 cacher.go:478] cacher (clusters.cluster.x-k8s.io): unexpected ListAndWatch error: failed to list cluster.x-k8s.io/v1alpha4, Kind=Cluster: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=Cluster failed: Post "https://webhook-service.system.svc:443/convert?timeout=30s": service "webhook-service" not found; reinitializing...
W0204 13:30:48.586795 1 reflector.go:569] storage/cacher.go:/cluster.x-k8s.io/clusters: failed to list cluster.x-k8s.io/v1alpha4, Kind=Cluster: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=Cluster failed: Post "https://webhook-service.system.svc:443/convert?timeout=30s": service "webhook-service" not found
...
E0204 13:30:45.575802 1 cacher.go:478] cacher (clusters.cluster.x-k8s.io): unexpected ListAndWatch error: failed to list cluster.x-k8s.io/v1alpha3, Kind=Cluster: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=Cluster failed: Post "https://webhook-service.system.svc:443/convert?timeout=30s": service "webhook-service" not found; reinitializing...
W0204 13:30:46.579452 1 reflector.go:569] storage/cacher.go:/cluster.x-k8s.io/clusters: failed to list cluster.x-k8s.io/v1alpha3, Kind=Cluster: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=Cluster failed: Post "https://webhook-service.system.svc:443/convert?timeout=30s": service "webhook-service" not found
Some comments:
- First we deploy a CRD with the following apiVersions:
  - v1alpha3: served: false, storage: false
  - v1alpha4: served: true, storage: false
  - v1beta1: served: true, storage: true
- Then we deploy a v1beta1 Cluster CR
- We can then see in the apiserver logs that the apiserver tries to create a ListWatch for v1alpha3 & v1alpha4
- This simple example to reproduce the issue doesn't implement an actual conversion webhook, so we simply get errors.
- If we implemented a conversion webhook, the ListWatch would be created successfully and we could observe conversion requests for v1alpha3 / v1alpha4 (as mentioned above, roughly one for every single create/update of a Cluster CR)
- As not a single CR has been read or written with v1alpha3 or v1alpha4, I would have expected to receive no conversion requests at all. Instead we see a very high number of conversion requests. This problem multiplies with the number of kube-apiserver instances.
Anything else we need to know?
We opened a Slack thread for this issue and did some initial triage: https://kubernetes.slack.com/archives/C0EG7JC6T/p1736528576393239
While debugging through the apiserver we found the following:
- A GET request to one of our APIs leads to a call of https://github.com/kubernetes/kubernetes/blob/439d2f7b4028638b3d8d9261bb046c3ba8d9[…]apiextensions-apiserver/pkg/apiserver/customresource_handler.go
- getOrCreateServingInfoFor then iterates through all versions of our CRD and calls customresource.NewStorage for them
- Then, a few layers deeper, reflectors are created for all versions
So, if we understand this correctly, the apiserver creates reflectors (with list & watch) for all versions of all CRDs, regardless of whether the versions are served or not.
We think these reflectors then later call the conversion webhooks.
@sttts opened a PR with the goal to stop creating ListWatches for unserved versions: #129709
Kubernetes version
$ kubectl version
Client Version: v1.32.0
Kustomize Version: v5.5.0
Server Version: v1.32.0
Cloud provider
OS version
Apple Silicon M2