CAP_NET_ADMIN for agent and CLI binaries on Linux #9881

Closed
coadler opened this issue Sep 26, 2023 · 3 comments
Labels: networking

@coadler
Contributor

coadler commented Sep 26, 2023

Right now we run the agent and CLI binaries without any privilege escalation, so we cannot increase the buffer sizes of our UDP sockets beyond the kernel defaults. This shows up in the following log message:

2023-09-26 21:30:17.014 [debu]  net.wgengine: magicsock: [warning] failed to force-set UDP write buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)

I tested giving the coder binaries CAP_NET_ADMIN (which lets a process force-set socket buffer sizes past the kernel limits) and saw throughput increase by 50% or more on two cores.
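For illustration, here is a minimal Python sketch of the behaviour behind that log line: try the privileged "force" socket options first, and fall back to the ordinary ones when the kernel refuses. This is not the magicsock code. The function name `set_udp_buffers` is hypothetical, and the numeric option values are the Linux asm-generic ones, written out by value because Python's `socket` module may not export them by name.

```python
import socket

# Linux option numbers from <asm-generic/socket.h> (assumption: a common
# architecture such as x86-64 or arm64; a few architectures use other values).
SO_SNDBUFFORCE = 32
SO_RCVBUFFORCE = 33


def set_udp_buffers(sock, size):
    """Try to force-set both buffers; fall back to the unprivileged options.

    Returns a dict mapping "send"/"recv" to (forced, effective_size).
    """
    results = {}
    for name, force_opt, plain_opt in (
        ("send", SO_SNDBUFFORCE, socket.SO_SNDBUF),
        ("recv", SO_RCVBUFFORCE, socket.SO_RCVBUF),
    ):
        forced = True
        try:
            # Needs CAP_NET_ADMIN; otherwise the kernel returns EPERM.
            sock.setsockopt(socket.SOL_SOCKET, force_opt, size)
        except OSError:
            forced = False
            # Unprivileged request: silently capped by the kernel limits.
            sock.setsockopt(socket.SOL_SOCKET, plain_opt, size)
        # Linux reports twice the requested value to account for bookkeeping.
        effective = sock.getsockopt(socket.SOL_SOCKET, plain_opt)
        results[name] = (forced, effective)
    return results


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print(set_udp_buffers(s, 7340032))
```

Without the capability, both entries should come back with `forced` set to `False` and an effective size bounded by `net.core.wmem_max` / `net.core.rmem_max`.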

Without CAP_NET_ADMIN (both agent and CLI)

INTERVAL       THROUGHPUT
0.00-1.02 sec  362.4548 Mbits/sec
1.02-2.06 sec  419.8718 Mbits/sec
2.06-3.07 sec  529.4830 Mbits/sec
3.07-4.07 sec  536.7345 Mbits/sec
4.07-5.06 sec  663.2100 Mbits/sec
-----------------------------------
0.00-5.06 sec  500.8592 Mbits/sec

With CAP_NET_ADMIN (both agent and CLI)

INTERVAL       THROUGHPUT
0.00-1.00 sec  786.4530 Mbits/sec
1.00-2.02 sec  822.9956 Mbits/sec
2.02-3.04 sec  875.3655 Mbits/sec
3.04-4.05 sec  877.9789 Mbits/sec
4.05-5.04 sec  865.6894 Mbits/sec
-----------------------------------
0.00-5.04 sec  845.7073 Mbits/sec
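As a quick sanity check on the two summary rows above, the relative change can be computed directly. This is only arithmetic on the reported averages from a single run each, not a statistical claim; the helper name `relative_increase` is made up for the example.

```python
def relative_increase(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# Summary rows from the two runs above, in Mbits/sec.
without_cap = 500.8592
with_cap = 845.7073

gain = relative_increase(without_cap, with_cap)
print(f"{gain:.1%}")
```

On these two runs the gain comes out at roughly 69%.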

Adding CAP_NET_ADMIN to agents should be pretty straightforward, as long as the workspace contains setcap. For CLI installs, we might be able to add it automatically via the install script. It's worth noting that the speed increase only happens when both the agent and the CLI have CAP_NET_ADMIN. If either is missing it, throughput stays at the lower level.
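Since the gain depends on both sides actually holding the capability, it can be useful to confirm it from inside a running process. A rough sketch, assuming Linux and a readable `/proc/self/status`: the `CapEff` line is a hex bitmask of effective capabilities, and CAP_NET_ADMIN is bit 12. The function names are hypothetical.

```python
CAP_NET_ADMIN = 12  # bit index from <linux/capability.h>


def parse_cap_eff(status_text):
    """Extract the effective-capability bitmask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_capability(mask, cap):
    """Return True if bit `cap` is set in the capability mask."""
    return bool((mask >> cap) & 1)


def current_process_has_net_admin():
    """Check the calling process (Linux only)."""
    with open("/proc/self/status") as f:
        return has_capability(parse_cap_eff(f.read()), CAP_NET_ADMIN)
```

A binary that was given the file capability with something like `setcap cap_net_admin+ep <path-to-binary>` would be expected to report `True` here, unless the environment blocks file capabilities.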

It might be good to experiment with higher buffer sizes to find a happy medium for our use case.

@coadler added the feature, s3, and networking labels Sep 26, 2023
@coadler self-assigned this Sep 26, 2023
@cdr-bot added the bug label Sep 26, 2023
@coadler removed the bug label Sep 26, 2023
@cdr-bot added the bug label Sep 26, 2023
@kylecarbs
Member

Very cool!

@spikecurtis removed the s3 and bug labels Sep 27, 2023
@spikecurtis
Contributor

This would be great to get in. We should make sure that everything continues to work if that capability is unavailable, e.g. by testing on OpenShift with a PodSecurity policy that prohibits privilege escalation.

@coadler
Contributor Author

coadler commented Oct 3, 2023

Options were added to the agent and CLI install scripts to automatically set CAP_NET_ADMIN on coder binaries.

CLI: use curl -L https://coder.com/install.sh | sh -s -- --net-admin
Agent: set USE_CAP_NET_ADMIN=true in the workspace env

@coadler closed this as completed Jan 8, 2024