
[APF] low priorities have larger effective shares than high priorities #121982


Description

@linxiulei

What happened?

When a cluster is under load (i.e. there is a fair amount of traffic across all priority levels), we notice that critical requests are not served on time (i.e. they get 429 responses). For us, critical requests commonly fall into the leader-election and node-high priority levels, and failure to serve them results in controller restarts and unhealthy nodes. We believe the total volume of these critical requests is well within the capacity of the machine that hosts kube-apiserver; in fact, it is requests from other priority levels that consume the majority of the machine's capacity.

What did you expect to happen?

With default settings, kube-apiserver is able to prioritize processing requests from the leader-election and node-high priority levels. More specifically, I hope there is a way to define a priority order across all priority levels and enforce a mechanism in which higher-priority requests are always served first (while each priority level still gets a fair minimal share); only after that, if the machine has remaining capacity, are lower priorities served. A rough sketch of such an allocation follows the list below.

This has a few pros:

  1. Users don't need to precisely calculate and configure the nominal shares for each priority level while accounting for different machine capacities, workload types, etc.
  2. Concurrency shares are well utilized, since unused shares are passed down and shared with lower priority levels.
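
To make the request concrete, here is a minimal sketch of the kind of allocation described above. This is not how APF works today; the type, function, and demand numbers are all hypothetical, and the guaranteed minimums are simply the "Guaranteed" values from the default configuration table further down.

```go
package main

import "fmt"

// priorityLevel is a hypothetical description of one priority level: its
// name, a guaranteed minimum number of concurrency seats, and the number
// of seats it currently wants (its demand).
type priorityLevel struct {
	name       string
	guaranteed int
	demand     int
}

// allocate hands out serverCL seats in strict priority order (levels must
// be ordered from highest to lowest priority). Every level first receives
// its guaranteed minimum (capped by its demand); the remaining capacity is
// then handed to levels from the top down until it runs out.
func allocate(levels []priorityLevel, serverCL int) map[string]int {
	out := make(map[string]int, len(levels))
	remaining := serverCL

	// Pass 1: satisfy the guaranteed minimum of every level.
	for _, l := range levels {
		g := min(min(l.guaranteed, l.demand), remaining)
		out[l.name] = g
		remaining -= g
	}

	// Pass 2: give leftover capacity to higher priorities first.
	for _, l := range levels {
		if remaining == 0 {
			break
		}
		extra := min(l.demand-out[l.name], remaining)
		out[l.name] += extra
		remaining -= extra
	}
	return out
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func main() {
	levels := []priorityLevel{ // highest priority first; demands are made up
		{name: "leader-election", guaranteed: 10, demand: 15},
		{name: "node-high", guaranteed: 30, demand: 80},
		{name: "system", guaranteed: 20, demand: 50},
		{name: "workload-high", guaranteed: 20, demand: 200},
		{name: "workload-low", guaranteed: 10, demand: 500},
	}
	fmt.Println(allocate(levels, 600))
}
```

With this scheme, workload-low can still use whatever capacity is left over, but it can never crowd out leader-election or node-high beyond their guaranteed minimums.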

How can we reproduce it (as minimally and precisely as possible)?

  1. Run a cluster and send a high volume of requests across all priority levels (PLs).
  2. Run kubectl get --raw /debug/api_priority_and_fairness/dump_priority_levels to observe the effective concurrency shares.
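
For step 2, the same dump can also be fetched programmatically. A minimal client-go sketch, assuming a kubeconfig at the default location and a user permitted to read that non-resource URL:

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Load the local kubeconfig (adjust the path for your environment).
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Equivalent of:
	//   kubectl get --raw /debug/api_priority_and_fairness/dump_priority_levels
	raw, err := clientset.CoreV1().RESTClient().
		Get().
		AbsPath("/debug/api_priority_and_fairness/dump_priority_levels").
		DoRaw(context.TODO())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```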

Anything else we need to know?

I believe it is currently the case that APF does not implement prioritization; instead, it reserves a portion of capacity for each priority level, and that reservation can be overcommitted. In practice, however, some requests are always more important than others. Although this is configurable, I hope the default settings can take it into consideration.

According to the APF KEP and code, the default values for the non-exempt priority levels give system, workload-high, and workload-low roughly 70% of the total concurrency shares (170 of 245), with workload-low alone holding the largest single portion:

| Name | Nominal Shares | Lendable | Proposed Borrowing Limit | Guaranteed |
|---|---|---|---|---|
| leader-election | 10 | 0% | none | 10 |
| node-high | 40 | 25% | none | 30 |
| system | 30 | 33% | none | 20 |
| workload-high | 40 | 50% | none | 20 |
| workload-low | 100 | 90% | none | 10 |
| global-default | 20 | 50% | none | 10 |
| catch-all | 5 | 0% | none | 5 |
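
For context, the nominal shares above only become concrete limits once the server-wide concurrency limit is applied. The sketch below assumes the proportional split described in the KEP (each level gets ServerCL * shares / total shares, rounded up) and a server concurrency limit of 600 (the default --max-requests-inflight=400 plus --max-mutating-requests-inflight=200); the exact rounding and borrowing adjustments in the real implementation may differ.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Nominal shares from the table above.
	shares := []struct {
		name string
		ncs  float64
	}{
		{"leader-election", 10},
		{"node-high", 40},
		{"system", 30},
		{"workload-high", 40},
		{"workload-low", 100},
		{"global-default", 20},
		{"catch-all", 5},
	}

	// Assumed server-wide concurrency limit: default --max-requests-inflight
	// (400) plus --max-mutating-requests-inflight (200).
	const serverCL = 600.0

	var total float64
	for _, s := range shares {
		total += s.ncs
	}

	// Proportional split: each level's nominal concurrency limit is its
	// fraction of the total shares applied to the server-wide limit.
	for _, s := range shares {
		cl := math.Ceil(serverCL * s.ncs / total)
		fmt.Printf("%-16s share=%5.1f%%  nominalCL≈%3.0f\n", s.name, 100*s.ncs/total, cl)
	}
}
```

With these numbers, workload-low ends up with roughly 41% of the shares (about 245 of 600 seats), while leader-election gets about 4% (about 25 seats).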

If we want to apply some degree of prioritization, we might think of the priority order as leader-election > node-high > system > .... However, the default settings effectively prioritize workload-low, since it gets the largest share of the total.

Kubernetes version

$ kubectl version
# paste output here

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

Metadata


Labels

kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
