
Commit a027f6c

feat: add scaling doc and clean up requirements doc (#1086)
* feat: add scaling doc and clean up requirements doc
* Update setup/scaling.md
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
* Update setup/requirements.md
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
* Update setup/requirements.md
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
* Update setup/scaling.md
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
* Update setup/scaling.md
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
* Update scaling.md
  removed database info, not required. It would be rarely used.
  Co-authored-by: Kyle Carberry <kyle@carberry.com>
1 parent 8c368ee commit a027f6c

File tree

4 files changed
+95 -30 lines changed

assets/setup/resource-request.png

7.86 KB

manifest.json

+3
@@ -180,6 +180,9 @@
     {
       "path": "./setup/configuration.md"
     },
+    {
+      "path": "./setup/scaling.md"
+    },
     {
       "path": "./setup/air-gapped/index.md",
       "children": [

setup/requirements.md

+7 -30
@@ -1,43 +1,20 @@
 ---
-title: "Requirements"
+title: "System Requirements"
 description: Learn about the prerequisite infrastructure requirements.
 ---

-Coder is deployed onto Kubernetes clusters, and we recommend the following
-resource allocation minimums to ensure quality performance.
+Coder is deployed into a Kubernetes cluster namespace. We recommend the
+following resource minimums to ensure quality performance.

 ## Compute

-For the Coder control plane (which consists of the `coderd` pod and any
-additional replicas) allocate at least 2 CPU cores, 4 GB of RAM, and 20 GB of
-storage.
-
-In addition to sizing the control plane node(s), you can configure the `coderd`
-pod's resource requests/limits and number of replicas in the Helm chart. The
-current defaults for both CPU and memory are the following:
-
-```yaml
-resources:
-  requests:
-    cpu: "250m"
-    memory: "512Mi"
-  limits:
-    cpu: "250m"
-    memory: "512Mi"
-```
-
-By default, Coder is a single-replica deployment. For production systems,
-consider using at least three replicas to provide failover and load balancing
-capabilities.
-
-If you expect roughly ten or more concurrent users, we recommend increasing
-these figures to improve platform performance (we also recommend regular
-performance testing in a staging environment).
+See [Scaling](./scaling.md) for more information.

 For **each** active developer using Coder, allocate additional resources. The
 specific amount required per developer varies, though we recommend starting with
-4 CPUs and 16 GB of RAM, then iterating as needed. Developers are free to
-request the resource allocation that fits their usage:
+4 CPUs and 4 GB of RAM, especially when resource-intensive JetBrains IDEs are
+in use. Developers are free to request the resource allocation that fits their
+usage:

 ![Workspace resource request](../assets/setup/resource-request.png)

setup/scaling.md

+85
@@ -0,0 +1,85 @@
---
title: "Scaling Coder"
description: Learn about best practices to properly scale Coder to meet developer and workspace needs.
---

Coder's control plane (`coderd`) and workspaces are deployed in a Kubernetes
namespace. This document outlines vertical and horizontal scaling techniques to
ensure the `coderd` pods can accommodate user and workspace load on a Coder
deployment.

> Vertical scaling is preferred over horizontal scaling!

## Vertical Scaling

Vertically scaling, or scaling up, the Coder control plane (which consists of
the `coderd` pod and any additional replicas) is done by adding computing
resources to the `coderd` pods in the Helm chart's `values.yaml` file.

Download the values file for a deployed Coder release with the following
command:

```console
helm get values coder > values.yaml -n coder
```

Experiment with increasing CPU and memory requests and limits as the number of
workspaces in your Coder deployment increases. Pay particular attention to
whether users have their workspaces configured to auto-start at the same time
each day, which produces spike loads on the `coderd` pods. To best prevent
Out of Memory conditions (OOM kills), configure the memory request and limit
to the same value, e.g., `8000Mi`.

> Increasing `coderd` CPU and memory resources requires Kubernetes node machine
> types large enough to accommodate `coderd`, Coder workspaces, and any
> additional system and third-party pods in the same cluster namespace.

These are example `values.yaml` resources for `coderd`'s CPU and memory for a
larger deployment with hundreds of workspaces autostarting at the same time
each day:

```yaml
coderd:
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "8"
      memory: "8Gi"
```
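
After editing `values.yaml`, roll the new resource settings out with
`helm upgrade`. A minimal sketch, assuming the release name and namespace used
in the command above and a chart reference of `coder/coder` (adjust to wherever
your chart actually comes from):

```console
# Apply the edited values to the existing release; the chart reference
# coder/coder is an assumption and may differ in your environment.
helm upgrade coder coder/coder --namespace coder --values values.yaml
```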

Leading indicators of undersized `coderd` pods include users experiencing
disconnects in the web terminal or in a web IDE like code-server, or slowness
within the Coder UI dashboard. One condition that may be occurring is an OOM
kill, where one or more `coderd` pods fail, restart, or fail to restart and
enter a CrashLoopBackOff status. If `coderd` restarts while there are active
workspaces and user sessions, those sessions are reconnected to a new `coderd`
pod, causing a disconnect. As a Kubernetes administrator, you can also spot
restarts by watching for a frequently changing, low `AGE` column when getting
the pods:

```console
kubectl get pods -n coder | grep coderd
```
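
To confirm whether a restart was caused by an OOM kill, inspect the pod's last
termination state; a sketch, with `<coderd-pod-name>` standing in for a name
taken from the `kubectl get pods` output above:

```console
# Show the reason for the pod's last termination
# (an OOM kill is reported as "OOMKilled").
kubectl describe pod <coderd-pod-name> -n coder | grep -A5 "Last State"
```
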
## Horizontal Scaling

Another way to distribute user and workspace load on a Coder deployment is to
add additional `coderd` pods.

```yaml
coderd:
  replicas: 3
```

Coder load balances user and workspace requests across the `coderd` replicas,
ensuring sufficient resources and response time.
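
After applying the replica change (for example with the `helm upgrade` sketch
above), you can verify that the new pods become ready; this sketch assumes the
chart creates a Deployment named `coderd`:

```console
# Wait for all coderd replicas to become ready; the Deployment name
# "coderd" is an assumption about the chart's resources.
kubectl rollout status deployment/coderd -n coder
kubectl get pods -n coder | grep coderd
```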

> There is not a linear relationship between nodes and `coderd` replicas, so
> experiment with incrementing replicas as you increase nodes, e.g., 8 nodes
> and 3 `coderd` replicas.

### Horizontal Pod Autoscaling

Horizontal Pod Autoscaling (HPA) is another Kubernetes technique to
automatically add and remove `coderd` pods when the existing pods exceed
sustained CPU and memory thresholds. Consult the [Kubernetes HPA
documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
for the various API version implementations of HPA.
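
As an illustrative sketch only (not part of this commit), an HPA for `coderd`
could look like the following. It assumes the `autoscaling/v2` API (Kubernetes
1.23 or later) and that the Helm chart creates a Deployment named `coderd`; the
replica bounds and CPU target are assumptions to tune for your cluster:

```yaml
# Hypothetical HPA for the coderd Deployment; the target name,
# replica bounds, and CPU threshold are assumptions, not Coder defaults.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coderd
  namespace: coder
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coderd
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Apply the manifest with `kubectl apply -f <file>.yaml` and inspect scaling
activity with `kubectl get hpa -n coder`.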
