diff --git a/assets/setup/resource-request.png b/assets/setup/resource-request.png
index e25bfafc0..2cb3fd8f7 100644
Binary files a/assets/setup/resource-request.png and b/assets/setup/resource-request.png differ
diff --git a/manifest.json b/manifest.json
index 724ed8c89..eea555fe7 100644
--- a/manifest.json
+++ b/manifest.json
@@ -180,6 +180,9 @@
     {
       "path": "./setup/configuration.md"
     },
+    {
+      "path": "./setup/scaling.md"
+    },
     {
       "path": "./setup/air-gapped/index.md",
       "children": [
diff --git a/setup/requirements.md b/setup/requirements.md
index 6c57d25ea..a7c856444 100644
--- a/setup/requirements.md
+++ b/setup/requirements.md
@@ -1,43 +1,20 @@
 ---
-title: "Requirements"
+title: "System Requirements"
 description: Learn about the prerequisite infrastructure requirements.
 ---
 
-Coder is deployed onto Kubernetes clusters, and we recommend the following
-resource allocation minimums to ensure quality performance.
+Coder is deployed into a Kubernetes cluster namespace. We recommend the
+following resource minimums to ensure quality performance.
 
 ## Compute
 
-For the Coder control plane (which consists of the `coderd` pod and any
-additional replicas) allocate at least 2 CPU cores, 4 GB of RAM, and 20 GB of
-storage.
-
-In addition to sizing the control plane node(s), you can configure the `coderd`
-pod's resource requests/limits and number of replicas in the Helm chart. The
-current defaults for both CPU and memory are the following:
-
-```yaml
-resources:
-  requests:
-    cpu: "250m"
-    memory: "512Mi"
-  limits:
-    cpu: "250m"
-    memory: "512Mi"
-```
-
-By default, Coder is a single-replica deployment. For production systems,
-consider using at least three replicas to provide failover and load balancing
-capabilities.
-
-If you expect roughly ten or more concurrent users, we recommend increasing
-these figures to improve platform performance (we also recommend regular
-performance testing in a staging environment).
+See [Scaling](./scaling.md) for more information.
 
 For **each** active developer using Coder, allocate additional resources. The
 specific amount required per developer varies, though we recommend starting with
-4 CPUs and 16 GB of RAM, then iterating as needed. Developers are free to
-request the resource allocation that fits their usage:
+4 CPUs and 4 GB of RAM, especially when resource-intensive JetBrains IDEs are
+in use. Developers are free to request the resource allocation that fits
+their usage:
 
 ![Workspace resource request](../assets/setup/resource-request.png)
diff --git a/setup/scaling.md b/setup/scaling.md
new file mode 100644
index 000000000..870653dc1
--- /dev/null
+++ b/setup/scaling.md
@@ -0,0 +1,85 @@
+---
+title: "Scaling Coder"
+description: Learn about best practices for scaling Coder to meet developer and workspace needs.
+---
+
+Coder's control plane (`coderd`) and workspaces are deployed in a Kubernetes
+namespace. This document outlines vertical and horizontal scaling techniques
+to ensure that the `coderd` pods can accommodate user and workspace load on a
+Coder deployment.
+
+> Vertical scaling is preferred over horizontal scaling!
+
+## Vertical Scaling
+
+Vertically scaling, or scaling up, the Coder control plane (which consists of
+the `coderd` pod and any additional replicas) means adding computing resources
+to the `coderd` pods in the Helm chart's `values.yaml` file.
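+
+The stanza to edit is `coderd.resources`. As a minimal sketch, the request and
+limit values shown below are the defaults previously documented in the system
+requirements; raise them to scale up:
+
+```yaml
+coderd:
+  resources:
+    requests:
+      cpu: "250m"      # default; increase as workspace count grows
+      memory: "512Mi"
+    limits:
+      cpu: "250m"
+      memory: "512Mi"
+```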
+
+Download the values file for a deployed Coder release with the following
+command:
+
+```console
+helm get values coder -n coder > values.yaml
+```
+
+Experiment with increasing CPU and memory requests and limits as the number of
+workspaces in your Coder deployment increases. Pay particular attention to
+whether users have their workspaces configured to auto-start at the same time
+each day, which produces load spikes on the `coderd` pods. To help prevent
+out-of-memory (OOM) kills, configure the memory request and limit to the same
+value, e.g., `8000Mi`.
+
+> Increasing `coderd` CPU and memory resources requires Kubernetes node
+> machine types large enough to accommodate `coderd`, Coder workspaces, and
+> any additional system and third-party pods on the same cluster.
+
+These are example `values.yaml` CPU and memory resources for `coderd` in a
+larger deployment with hundreds of workspaces auto-starting at the same time
+each day:
+
+```yaml
+coderd:
+  resources:
+    requests:
+      cpu: "4"
+      memory: "8Gi"
+    limits:
+      cpu: "8"
+      memory: "8Gi"
+```
+
+Leading indicators of undersized `coderd` pods include user disconnects in the
+web terminal or a web IDE such as code-server, and slowness in the Coder
+dashboard. One condition that may be occurring is an OOM kill, in which one or
+more `coderd` pods fail, restart, or fail to restart and enter a
+`CrashLoopBackOff` status. If `coderd` restarts while there are active
+workspaces and user sessions, those sessions are reconnected to a new `coderd`
+pod, causing a disconnect. As a Kubernetes administrator, you can also spot
+restarts by watching for a low or frequently changing `AGE` column when
+listing the pods:
+
+```console
+kubectl get pods -n coder | grep coderd
+```
+
+## Horizontal Scaling
+
+Another way to distribute user and workspace load on a Coder deployment is to
+add `coderd` pods by increasing the replica count in `values.yaml`:
+
+```yaml
+coderd:
+  replicas: 3
+```
+
+Coder load balances user and workspace requests across the `coderd` replicas,
+ensuring sufficient resources and response times.
+
+> There is not a linear relationship between nodes and `coderd` replicas, so
+> experiment with incrementing replicas as you add nodes, e.g., 8 nodes and
+> 3 `coderd` replicas.
+
+### Horizontal Pod Autoscaling
+
+Horizontal Pod Autoscaling (HPA) is another Kubernetes technique for
+automatically adding and removing `coderd` pods when the existing pods exceed
+sustained CPU and memory thresholds. Consult the [Kubernetes HPA
+documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
+for the various API version implementations of HPA.
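+
+As an illustrative sketch only (the target Deployment name `coderd` and the
+thresholds below are assumptions, not values taken from the Coder chart), an
+`autoscaling/v2` HPA that scales `coderd` on sustained CPU utilization might
+look like this:
+
+```yaml
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: coderd
+  namespace: coder
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: coderd    # assumed name of the Deployment backing the coderd pods
+  minReplicas: 3
+  maxReplicas: 10
+  metrics:
+    - type: Resource
+      resource:
+        name: cpu
+        target:
+          type: Utilization
+          averageUtilization: 70   # scale out when sustained CPU exceeds 70%
+```
+
+A memory target can be added as a second `metrics` entry. Apply the manifest
+with `kubectl apply -f`, and note that when an HPA manages the replica count,
+pinning `coderd.replicas` in `values.yaml` can cause a subsequent
+`helm upgrade` to reset the count the autoscaler has chosen.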