feat: add scaling doc and clean up requirements doc #1086

Merged
merged 7 commits into from Aug 18, 2022
Binary file modified assets/setup/resource-request.png
3 changes: 3 additions & 0 deletions manifest.json
@@ -180,6 +180,9 @@
{
"path": "./setup/configuration.md"
},
{
"path": "./setup/scaling.md"
},
{
"path": "./setup/air-gapped/index.md",
"children": [
37 changes: 7 additions & 30 deletions setup/requirements.md
@@ -1,43 +1,20 @@
---
title: "Requirements"
title: "System Requirements"
description: Learn about the prerequisite infrastructure requirements.
---

Coder is deployed onto Kubernetes clusters, and we recommend the following
resource allocation minimums to ensure quality performance.
Coder is deployed into a Kubernetes cluster namespace. We recommend the
following resource minimums to ensure quality performance.

## Compute

For the Coder control plane (which consists of the `coderd` pod and any
additional replicas) allocate at least 2 CPU cores, 4 GB of RAM, and 20 GB of
storage.

In addition to sizing the control plane node(s), you can configure the `coderd`
pod's resource requests/limits and number of replicas in the Helm chart. The
current defaults for both CPU and memory are the following:

```yaml
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "250m"
memory: "512Mi"
```

By default, Coder is a single-replica deployment. For production systems,
consider using at least three replicas to provide failover and load balancing
capabilities.

If you expect roughly ten or more concurrent users, we recommend increasing
these figures to improve platform performance (we also recommend regular
performance testing in a staging environment).
See [Scaling](./scaling.md) for more information.

For **each** active developer using Coder, allocate additional resources. The
specific amount required per developer varies, though we recommend starting with
4 CPUs and 16 GB of RAM, then iterating as needed. Developers are free to
request the resource allocation that fits their usage:
4 CPUs and 4 GB of RAM, especially when resource-intensive JetBrains IDEs are
used. Developers are free to request the resource allocation that fits their
usage:

![Workspace resource request](../assets/setup/resource-request.png)

85 changes: 85 additions & 0 deletions setup/scaling.md
@@ -0,0 +1,85 @@
---
title: "Scaling Coder"
description: Learn about best practices to properly scale Coder to meet developer and workspace needs.
---

Coder's control plane (`coderd`) and workspaces are deployed in a Kubernetes
namespace. This document outlines vertical and horizontal scaling techniques to
ensure the `coderd` pods can accommodate user and workspace load on a Coder
deployment.

> Vertical scaling is preferred over horizontal scaling!

## Vertical Scaling

Vertically scaling, or scaling up, the Coder control plane (which consists of
the `coderd` pod and any additional replicas) is done by adding compute
resources to the `coderd` pods in the Helm chart's `values.yaml` file.

Download the values file for a deployed Coder release with the following
command:

```console
helm get values coder -n coder > values.yaml
```

Experiment with increasing CPU and memory requests and limits as the number of
workspaces in your Coder deployment increases. Pay particular attention to
whether users have their workspaces configured to auto-start at the same time
each day, which produces load spikes on the `coderd` pods. To help prevent
out-of-memory conditions (OOM kills), configure the memory request and limit to
the same value, e.g., `8000Mi`.

> Increasing `coderd` CPU and memory resources requires Kubernetes node machine
> types large enough to accommodate `coderd`, Coder workspaces, and any
> additional system and third-party pods running in the same cluster.

These are example `values.yaml` CPU and memory resources for `coderd` in a
larger deployment with hundreds of workspaces auto-starting at the same time
each day:

```yaml
coderd:
resources:
requests:
cpu: "4"
memory: "8Gi"
limits:
cpu: "8"
memory: "8Gi"
```
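
After editing `values.yaml`, the new resource settings can be applied with
`helm upgrade`. This is a minimal sketch, assuming the release is named `coder`
and lives in the `coder` namespace; `<repo>/coder` is a placeholder for the
chart reference used in your original installation:

```console
# Apply the updated values to the existing release; <repo>/coder is a placeholder chart reference.
helm upgrade coder <repo>/coder -n coder -f values.yaml
```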

Leading indicators of undersized `coderd` pods include users experiencing
disconnects in the web terminal or a web IDE like code-server, or slowness
within the Coder UI dashboard. One condition that may be occurring is an OOM
kill, where one or more `coderd` pods fail, restart, or fail to restart and
enter a CrashLoopBackOff status. If `coderd` restarts while there are active
workspaces and user sessions, they are reconnected to a new `coderd` pod,
causing a brief disconnect. As a Kubernetes administrator, you can also spot
restarts by watching for frequently changing, low values in the `AGE` column
when listing the pods:

```console
kubectl get pods -n coder | grep coderd
```
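
To confirm whether a restart was caused by an OOM kill, one option is to
inspect the pod's last terminated state; the pod name below is a placeholder
taken from the output of the command above:

```console
# "Reason: OOMKilled" under Last State indicates the container exceeded its memory limit.
kubectl describe pod coderd-<pod-suffix> -n coder | grep -A 5 "Last State"
```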

## Horizontal Scaling

Another way to distribute user and workspace load on a Coder deployment is to run additional `coderd` replicas.

```yaml
coderd:
replicas: 3
```

Coder load balances user and workspace requests across the `coderd` replicas, ensuring sufficient resources and response times.

> There is not a linear relationship between nodes and `coderd` replicas, so experiment with incrementing replicas as you add nodes (e.g., 8 nodes and 3 `coderd` replicas).

### Horizontal Pod Autoscaling

Horizontal Pod Autoscaling (HPA) is another Kubernetes technique that
automatically adds and removes `coderd` pods when the existing pods exceed
sustained CPU and memory thresholds. Consult the [Kubernetes HPA
documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
for the various API version implementations of HPA.
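
As a rough sketch, an HPA targeting the `coderd` Deployment could look like the
following, assuming the `autoscaling/v2` API, a Deployment named `coderd` in
the `coder` namespace, and illustrative replica bounds and a CPU target that
you would tune for your own deployment:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coderd
  namespace: coder # assumes Coder is installed in the "coder" namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coderd # assumed Deployment name; verify with `kubectl get deployments -n coder`
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # illustrative threshold; tune for your workload
```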