From a248f8f1f204a28aa25a85e656a0fc8e6ba37c16 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Tue, 17 Dec 2024 22:17:18 +0000 Subject: [PATCH 01/15] add scaling best practice doc --- docs/manifest.json | 15 +- docs/tutorials/best-practices/scale-coder.md | 321 ++++++++++++++++++ .../WorkspaceTiming/Chart/XAxis.tsx | 4 +- 3 files changed, 332 insertions(+), 8 deletions(-) create mode 100644 docs/tutorials/best-practices/scale-coder.md diff --git a/docs/manifest.json b/docs/manifest.json index 2a67e78e991e1..ce5c5e818c93f 100644 --- a/docs/manifest.json +++ b/docs/manifest.json @@ -761,16 +761,21 @@ "description": "Guides to help you make the most of your Coder experience", "path": "./tutorials/best-practices/index.md", "children": [ - { - "title": "Security - best practices", - "description": "Make your Coder deployment more secure", - "path": "./tutorials/best-practices/security-best-practices.md" - }, { "title": "Organizations - best practices", "description": "How to make the best use of Coder Organizations", "path": "./tutorials/best-practices/organizations.md" }, + { + "title": "Scale Coder", + "description": "How to prepare a Coder deployment for scale", + "path": "./tutorials/best-practices/scale-coder.md" + }, + { + "title": "Security - best practices", + "description": "Make your Coder deployment more secure", + "path": "./tutorials/best-practices/security-best-practices.md" + }, { "title": "Speed up your workspaces", "description": "Speed up your Coder templates and workspaces", diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md new file mode 100644 index 0000000000000..c0fe630e0cca2 --- /dev/null +++ b/docs/tutorials/best-practices/scale-coder.md @@ -0,0 +1,321 @@ +# Scale Coder + +December 20, 2024 + +--- + +This best practice guide helps you prepare a low-scale Coder deployment so that +it can be scaled up to a high-scale deployment as use grows, and keep it +operating smoothly with a high number of active users and workspaces. + +## Observability + +Observability is one of the most important aspects to a scalable Coder +deployment. + +Identify potential bottlenecks before they negatively affect the end-user +experience. It will also allow you to empirically verify that modifications you +make to your deployment to increase capacity have their intended effects. + +- Capture log output from Coder Server instances and external provisioner + daemons and store them in a searchable log store. + + - For example: Loki, CloudWatch Logs, etc. + + - Retain logs for a minimum of thirty days, ideally ninety days. This allows + you to look back to see when anomalous behaviors began. + +- Metrics: + + - Capture infrastructure metrics like CPU, memory, open files, and network I/O + for all Coder Server, external provisioner daemon, workspace proxy, and + PostgreSQL instances. + + - Capture metrics from Coder Server and external provisioner daemons via + Prometheus. + + - On Coder Server + + - Enable Prometheus metrics: + + ```yaml + CODER_PROMETHEUS_ENABLE=true + ``` + + - Enable database metrics: + + ```yaml + CODER_PROMETHEUS_COLLECT_DB_METRICS=true + ``` + + - Configure agent stats to avoid large cardinality: + + ```yaml + CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name + ``` + + - To disable Agent stats: + + ```yaml + CODER_PROMETHEUS_COLLECT_AGENT_STATS=false + ``` + + - Retain metric time series for at least six months. This allows you to see + performance trends relative to user growth. 
+ + - Integrate metrics with an observability dashboard, for example, Grafana. + +### Key metrics + +**CPU and Memory Utilization** + +- Monitor the utilization as a fraction of the available resources on the + instance. Its utilization will vary with use throughout the day and over the + course of the week. Monitor the trends, paying special attention to the daily + and weekly peak utilization. Use long-term trends to plan infrastructure + upgrades. + +**Tail latency of Coder Server API requests** + +- Use the `coderd_api_request_latencies_seconds` metric. +- High tail latency can indicate Coder Server or the PostgreSQL database is + being starved for resources. + +**Tail latency of database queries** + +- Use the `coderd_db_query_latencies_seconds` metric. +- High tail latency can indicate the PostgreSQL database is low in resources. + +Configure alerting based on these metrics to ensure you surface problems before +end users notice them. + +## Coder Server + +### Locality + +If increased availability of the Coder API is a concern, deploy at least three +instances. Spread the instances across nodes (e.g. via anti-affinity rules in +Kubernetes), and/or in different availability zones of the same geographic +region. + +Do not deploy in different geographic regions. Coder Servers need to be able to +communicate with one another directly with low latency, under 10ms. Note that +this is for the availability of the Coder API – workspaces will not be fault +tolerant unless they are explicitly built that way at the template level. + +Deploy Coder Server instances as geographically close to PostgreSQL as possible. +Low-latency communication (under 10ms) with Postgres is essential for Coder +Server's performance. + +### Scaling + +Coder Server can be scaled both vertically for bigger instances and horizontally +for more instances. + +Aim to keep the number of Coder Server instances relatively small, preferably +under ten instances, and opt for vertical scale over horizontal scale after +meeting availability requirements. + +Coder's +[validated architectures](../../admin/infrastructure/validated-architectures.md) +give specific sizing recommendations for various user scales. These are a useful +starting point, but very few deployments will remain stable at a predetermined +user level over the long term, so monitoring and adjusting of resources is +recommended. + +We don't recommend that you autoscale the Coder Servers. Instead, scale the +deployment for peak weekly usage. + +Although Coder Server persists no internal state, it operates as a proxy for end +users to their workspaces in two capacities: + +1. As an HTTP proxy when they access workspace applications in their browser via + the Coder Dashboard + +1. As a DERP proxy when establishing tunneled connections via CLI tools + (`coder ssh`, `coder port-forward`, etc.) and desktop IDEs. + +Stopping a Coder Server instance will (momentarily) disconnect any users +currently connecting through that instance. Adding a new instance is not +disruptive, but removing instances and upgrades should be performed during a +maintenance window to minimize disruption. + +## Provisioner daemons + +### Locality + +We recommend you disable provisioner daemons within your Coder Server: + +```yaml +CODER_PROVISIONER_DAEMONS=0 +``` + +Run one or more +[provisioner daemon deployments external to Coder Server](../../admin/provisioners.md). +This allows you to scale them independently of the Coder Server. 
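
As a minimal sketch of what an external provisioner daemon deployment needs, the environment below points a daemon at the deployment and authenticates it with a pre-shared key. The values are placeholders and the PSK is only one supported auth method, so verify the variable names against the external provisioners documentation for your Coder version:

```yaml
# Illustrative environment for one external provisioner daemon instance.
# The URL and secret are placeholders; PSK is one supported auth method.
CODER_URL=https://coder.example.com
CODER_PROVISIONER_DAEMON_PSK=<same-pre-shared-key-configured-on-coder-server>
```

Give each tagged provisioner deployment its own set of these values so that each pool can be scaled independently.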
+ +We recommend deploying provisioner daemons within the same cluster as the +workspaces they will provision or are hosted in. + +- This gives them a low-latency connection to the APIs they will use to + provision workspaces and can speed builds. + +- It allows provisioner daemons to use in-cluster mechanisms (for example + Kubernetes service account tokens, AWS IAM Roles, etc.) to authenticate with + the infrastructure APIs. + +- If you deploy workspaces in multiple clusters, run multiple provisioner daemon + deployments and use template tags to select the correct set of provisioner + daemons. + +- Provisioner daemons need to be able to connect to Coder Server, but this need + not be a low-latency connection. + +Provisioner daemons make no direct connections to the PostgreSQL database, so +there's no need for locality to the Postgres database. + +### Scaling + +Each provisioner daemon instance can handle a single workspace build job at a +time. Therefore, the number of provisioner daemon instances within a tagged +deployment equals the maximum number of simultaneous builds your Coder +deployment can handle. + +If users experience unacceptably long queues for workspace builds to start, +consider increasing the number of provisioner daemon instances in the affected +cluster. + +You may wish to automatically scale the number of provisioner daemon instances +throughout the day to meet demand. If you stop instances with `SIGHUP`, they +will complete their current build job and exit. `SIGINT` will cancel the current +job, which will result in a failed build. Ensure your autoscaler waits long +enough for your build jobs to complete before forcibly killing the provisioner +daemon process. + +If deploying in Kubernetes, we recommend a single provisioner daemon per pod. On +a virtual machine (VM), you can deploy multiple provisioner daemons, ensuring +each has a unique `CODER_CACHE_DIRECTORY` value. + +Coder's +[validated architectures](../../admin/infrastructure/validated-architectures.md) +give specific sizing recommendations for various user scales. Since the +complexity of builds varies significantly depending on the workspace template, +consider this a starting point. Monitor queue times and build times to adjust +the number and size of your provisioner daemon instances. + +## PostgreSQL + +PostgreSQL is the primary persistence layer for all of Coder's deployment data. +We also use `LISTEN` and `NOTIFY` to coordinate between different instances of +Coder Server. + +### Locality + +Coder Server instances must have low-latency connections (under 10ms) to +PostgreSQL. If you use multiple PostgreSQL replicas in a clustered config, these +must also be low-latency with respect to one another. + +### Scaling + +Prefer scaling PostgreSQL vertically rather than horizontally for best +performance. Coder's +[validated architectures](../../admin/infrastructure/validated-architectures.md) +give specific sizing recommendations for various user scales. + +## Workspace proxies + +Workspace proxies proxy HTTP traffic from end users to workspaces for Coder apps +defined in the templates, and HTTP ports opened by the workspace. By default +they also include a DERP Proxy. + +### Locality + +We recommend each geographic cluster of workspaces have an associated deployment +of workspace proxies. This ensures that users always have a near-optimal proxy +path. + +### Scaling + +Workspace proxy load is determined by the amount of traffic they proxy. 
We +recommend you monitor CPU, memory, and network I/O utilization to decide when to +resize the number of proxy instances. + +We do not recommend autoscaling the workspace proxies because many applications +use long-lived connections such as websockets, which would be disrupted by +stopping the proxy. We recommend you scale for peak demand and scale down or +upgrade during a maintenance window. + +## Workspaces + +Workspaces represent the vast majority of resources in most Coder deployments. +Because they are defined by templates, there is no one-size-fits-all advice for +scaling. + +### Hard and soft cluster limits + +All Infrastructure as a Service (IaaS) clusters have limits to what can be +simultaneously provisioned. These could be hard limits, based on the physical +size of the cluster, especially in the case of a private cloud, or soft limits, +based on configured limits in your public cloud account. + +It is important to be aware of these limits and monitor Coder workspace resource +utilization against the limits, so that a new influx of users doesn't encounter +failed builds. Monitoring these is outside the scope of Coder, but we recommend +that you set up dashboards and alerts for each kind of limited resource. + +As you approach soft limits, you might be able to justify an increase to keep +growing. + +As you approach hard limits, you will need to consider deploying to additional +cluster(s). + +### Workspaces per node + +Many development workloads are "spiky" in their CPU and memory requirements, for +example, peaking during build/test and then ebbing while editing code. This +leads to an opportunity to efficiently use compute resources by packing multiple +workspaces onto a single node. This can lead to better experience (more CPU and +memory available during brief bursts) and lower cost. + +However, it needs to be considered against several trade-offs. + +- There are residual probabilities of "noisy neighbor" problems negatively + affecting end users. The probabilities increase with the amount of + oversubscription of CPU and memory resources. + +- If the shared nodes are a provisioned resource, for example, Kubernetes nodes + running on VMs in a public cloud, then it can sometimes be a challenge to + effectively autoscale down. + + - For example, if half the workspaces are stopped overnight, and there are ten + workspaces per node, it's unlikely that all ten workspaces on the node are + among the stopped ones. + + - You can mitigate this by lowering the number of workspaces per node, or + using autostop policies to stop more workspaces during off-peak hours. + +- If you do overprovision workspaces onto nodes, keep them in a separate node + pool and schedule Coder control plane (Coder Server, PostgreSQL, workspace + proxies) components on a different node pool to avoid resource spikes + affecting them. + +Coder customers have had success with both: + +- One workspace per AWS VM +- Lots of workspaces on Kubernetes nodes for efficiency + +### Cost control + +- Use quotas to discourage users from creating many workspaces they don't need + simultaneously. + +- Label workspace cloud resources by user, team, organization, or your own + labelling conventions to track usage at different granularities. + +- Use autostop requirements to bring off-peak utilization down. + +## Networking + +Set up your network so that most users can get direct, peer-to-peer connections +to their workspaces. This drastically reduces the load on Coder Server and +workspace proxy instances. 
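
As one hedged example of the kind of setting involved: direct connections rely on STUN being reachable from both clients and workspaces, and Coder exposes a server option for the STUN addresses used when negotiating them. The address below is a placeholder; confirm the option name and defaults in the networking documentation for your version:

```yaml
# Illustrative only: STUN servers must be reachable by clients and workspaces
# for connections to upgrade from relayed (DERP) to direct peer-to-peer paths.
CODER_DERP_SERVER_STUN_ADDRESSES=stun.example.com:3478
```

If most sessions stay relayed through DERP, check for NATs or firewalls that block UDP between clients and workspaces.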
diff --git a/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx b/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx index 9ab5054957992..4863b08ec19bd 100644 --- a/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx +++ b/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx @@ -121,9 +121,7 @@ export const XGrid: FC = ({ columns, ...htmlProps }) => { // A dashed line is used as a background image to create the grid. // Using it as a background simplifies replication along the Y axis. -const dashedLine = ( - color: string, -) => ` +const dashedLine = (color: string) => ` `; From 46831a1cb26e28615540ec2ebbf152f17d1a871b Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Tue, 17 Dec 2024 22:24:49 +0000 Subject: [PATCH 02/15] fix links --- docs/tutorials/best-practices/scale-coder.md | 6 +++--- site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx | 4 +++- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index c0fe630e0cca2..248be4db976e8 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -117,7 +117,7 @@ under ten instances, and opt for vertical scale over horizontal scale after meeting availability requirements. Coder's -[validated architectures](../../admin/infrastructure/validated-architectures.md) +[validated architectures](../../admin/infrastructure/validated-architectures/index.md) give specific sizing recommendations for various user scales. These are a useful starting point, but very few deployments will remain stable at a predetermined user level over the long term, so monitoring and adjusting of resources is @@ -197,7 +197,7 @@ a virtual machine (VM), you can deploy multiple provisioner daemons, ensuring each has a unique `CODER_CACHE_DIRECTORY` value. Coder's -[validated architectures](../../admin/infrastructure/validated-architectures.md) +[validated architectures](../../admin/infrastructure/validated-architectures/index.md) give specific sizing recommendations for various user scales. Since the complexity of builds varies significantly depending on the workspace template, consider this a starting point. Monitor queue times and build times to adjust @@ -219,7 +219,7 @@ must also be low-latency with respect to one another. Prefer scaling PostgreSQL vertically rather than horizontally for best performance. Coder's -[validated architectures](../../admin/infrastructure/validated-architectures.md) +[validated architectures](../../admin/infrastructure/validated-architectures/index.md) give specific sizing recommendations for various user scales. ## Workspace proxies diff --git a/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx b/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx index 4863b08ec19bd..9ab5054957992 100644 --- a/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx +++ b/site/src/modules/workspaces/WorkspaceTiming/Chart/XAxis.tsx @@ -121,7 +121,9 @@ export const XGrid: FC = ({ columns, ...htmlProps }) => { // A dashed line is used as a background image to create the grid. // Using it as a background simplifies replication along the Y axis. 
-const dashedLine = (color: string) => ` +const dashedLine = ( + color: string, +) => ` `; From d9aebc6e6376100c38e389c2a8ffdf1515dbdc3e Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Thu, 19 Dec 2024 16:10:49 +0000 Subject: [PATCH 03/15] copy edit --- docs/tutorials/best-practices/scale-coder.md | 60 +++++++++----------- 1 file changed, 27 insertions(+), 33 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 248be4db976e8..c920ee358990c 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -13,59 +13,53 @@ operating smoothly with a high number of active users and workspaces. Observability is one of the most important aspects to a scalable Coder deployment. -Identify potential bottlenecks before they negatively affect the end-user -experience. It will also allow you to empirically verify that modifications you -make to your deployment to increase capacity have their intended effects. +[Monitor your Coder deployment](../../admin/monitoring/index.md) with log output and metrics to identify potential bottlenecks before they negatively affect the end-user experience and measure the effects of modifications you make to your deployment. -- Capture log output from Coder Server instances and external provisioner - daemons and store them in a searchable log store. +**Log output** - - For example: Loki, CloudWatch Logs, etc. +- Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server instances and external provisioner + daemons and store them in a searchable log store. - Retain logs for a minimum of thirty days, ideally ninety days. This allows you to look back to see when anomalous behaviors began. -- Metrics: +**Metrics** - - Capture infrastructure metrics like CPU, memory, open files, and network I/O - for all Coder Server, external provisioner daemon, workspace proxy, and - PostgreSQL instances. +- Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. - - Capture metrics from Coder Server and external provisioner daemons via - Prometheus. +### Capture Coder server metrics with Prometheus - - On Coder Server +To capture metrics from Coder Server and external provisioner daemons with [Prometheus](../../admin/integrations/prometheus.md): - - Enable Prometheus metrics: +1. Enable Prometheus metrics: - ```yaml - CODER_PROMETHEUS_ENABLE=true - ``` + ```yaml + CODER_PROMETHEUS_ENABLE=true + ``` - - Enable database metrics: +1. Enable database metrics: - ```yaml - CODER_PROMETHEUS_COLLECT_DB_METRICS=true - ``` + ```yaml + CODER_PROMETHEUS_COLLECT_DB_METRICS=true + ``` - - Configure agent stats to avoid large cardinality: +1. Configure agent stats to avoid large cardinality: - ```yaml - CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name - ``` + ```yaml + CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name + ``` - - To disable Agent stats: + - To disable agent stats: - ```yaml - CODER_PROMETHEUS_COLLECT_AGENT_STATS=false - ``` + ```yaml + CODER_PROMETHEUS_COLLECT_AGENT_STATS=false + ``` - - Retain metric time series for at least six months. This allows you to see - performance trends relative to user growth. +Retain metric time series for at least six months. This allows you to see performance trends relative to user growth. - - Integrate metrics with an observability dashboard, for example, Grafana. 
+For a more comprehensive overview, integrate metrics with an observability dashboard, for example, [Grafana](../../admin/monitoring/index.md). -### Key metrics +### Observability key metrics **CPU and Memory Utilization** From b0787cd7abd365e2bada71eb09762d3fa9111d27 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Thu, 19 Dec 2024 16:10:49 +0000 Subject: [PATCH 04/15] copy edit --- docs/tutorials/best-practices/scale-coder.md | 39 +++++++++----------- 1 file changed, 17 insertions(+), 22 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index c920ee358990c..209f585fbffd4 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -1,4 +1,4 @@ -# Scale Coder +kl# Scale Coder December 20, 2024 @@ -61,45 +61,40 @@ For a more comprehensive overview, integrate metrics with an observability dashb ### Observability key metrics +Configure alerting based on these metrics to ensure you surface problems before they affect the end-user experience. + **CPU and Memory Utilization** -- Monitor the utilization as a fraction of the available resources on the - instance. Its utilization will vary with use throughout the day and over the - course of the week. Monitor the trends, paying special attention to the daily - and weekly peak utilization. Use long-term trends to plan infrastructure +- Monitor the utilization as a fraction of the available resources on the instance. + + Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure upgrades. **Tail latency of Coder Server API requests** -- Use the `coderd_api_request_latencies_seconds` metric. -- High tail latency can indicate Coder Server or the PostgreSQL database is - being starved for resources. +- High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. + + Use the `coderd_api_request_latencies_seconds` metric. **Tail latency of database queries** -- Use the `coderd_db_query_latencies_seconds` metric. - High tail latency can indicate the PostgreSQL database is low in resources. -Configure alerting based on these metrics to ensure you surface problems before -end users notice them. + Use the `coderd_db_query_latencies_seconds` metric. ## Coder Server ### Locality -If increased availability of the Coder API is a concern, deploy at least three -instances. Spread the instances across nodes (e.g. via anti-affinity rules in -Kubernetes), and/or in different availability zones of the same geographic -region. +To ensure increased availability of the Coder API, deploy at least three instances. Spread the instances across nodes with anti-affinity rules in +Kubernetes or in different availability zones of the same geographic region. + +Do not deploy in different geographic regions. -Do not deploy in different geographic regions. Coder Servers need to be able to -communicate with one another directly with low latency, under 10ms. Note that -this is for the availability of the Coder API – workspaces will not be fault -tolerant unless they are explicitly built that way at the template level. +Coder Servers need to be able to +communicate with one another directly with low latency, under 10ms. Note that this is for the availability of the Coder API. Workspaces are not fault tolerant unless they are explicitly built that way at the template level. 
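
The anti-affinity guidance above can be expressed with a standard Kubernetes scheduling rule. The sketch below assumes Coder Server pods carry an `app.kubernetes.io/name: coder` label; adjust the selector to match the labels your deployment actually uses:

```yaml
# Sketch: prefer placing Coder Server replicas on different nodes.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: coder
```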
-Deploy Coder Server instances as geographically close to PostgreSQL as possible. -Low-latency communication (under 10ms) with Postgres is essential for Coder -Server's performance. +Deploy Coder Server instances as geographically close to PostgreSQL as possible. Low-latency communication (under 10ms) with Postgres is essential for Coder Server's performance. ### Scaling From 6ca6b8cdddcef51de9bbc80400a1b2200df9625c Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Mon, 23 Dec 2024 19:50:26 +0000 Subject: [PATCH 05/15] make fmt --- docs/tutorials/best-practices/scale-coder.md | 58 +++++++++++++------- 1 file changed, 38 insertions(+), 20 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 209f585fbffd4..9049a17d09615 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -13,23 +13,30 @@ operating smoothly with a high number of active users and workspaces. Observability is one of the most important aspects to a scalable Coder deployment. -[Monitor your Coder deployment](../../admin/monitoring/index.md) with log output and metrics to identify potential bottlenecks before they negatively affect the end-user experience and measure the effects of modifications you make to your deployment. +[Monitor your Coder deployment](../../admin/monitoring/index.md) with log output +and metrics to identify potential bottlenecks before they negatively affect the +end-user experience and measure the effects of modifications you make to your +deployment. **Log output** -- Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server instances and external provisioner - daemons and store them in a searchable log store. +- Capture log output from Loki, CloudWatch logs, and other tools on your Coder + Server instances and external provisioner daemons and store them in a + searchable log store. - Retain logs for a minimum of thirty days, ideally ninety days. This allows you to look back to see when anomalous behaviors began. **Metrics** -- Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. +- Capture infrastructure metrics like CPU, memory, open files, and network I/O + for all Coder Server, external provisioner daemon, workspace proxy, and + PostgreSQL instances. ### Capture Coder server metrics with Prometheus -To capture metrics from Coder Server and external provisioner daemons with [Prometheus](../../admin/integrations/prometheus.md): +To capture metrics from Coder Server and external provisioner daemons with +[Prometheus](../../admin/integrations/prometheus.md): 1. Enable Prometheus metrics: @@ -49,30 +56,36 @@ To capture metrics from Coder Server and external provisioner daemons with [Prom CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name ``` - - To disable agent stats: +- To disable agent stats: - ```yaml - CODER_PROMETHEUS_COLLECT_AGENT_STATS=false - ``` + ```yaml + CODER_PROMETHEUS_COLLECT_AGENT_STATS=false + ``` -Retain metric time series for at least six months. This allows you to see performance trends relative to user growth. +Retain metric time series for at least six months. This allows you to see +performance trends relative to user growth. -For a more comprehensive overview, integrate metrics with an observability dashboard, for example, [Grafana](../../admin/monitoring/index.md). 
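
For example, a plain Prometheus scrape job like the following is enough to collect the metrics described above. It assumes the default metrics port of 2112 and an in-cluster service address, both of which should be adjusted for your deployment:

```yaml
# Illustrative Prometheus scrape job for Coder Server metrics.
scrape_configs:
  - job_name: coder
    scrape_interval: 30s
    static_configs:
      - targets: ["coder.coder.svc.cluster.local:2112"]
```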
+For a more comprehensive overview, integrate metrics with an observability +dashboard, for example, [Grafana](../../admin/monitoring/index.md). ### Observability key metrics -Configure alerting based on these metrics to ensure you surface problems before they affect the end-user experience. +Configure alerting based on these metrics to ensure you surface problems before +they affect the end-user experience. **CPU and Memory Utilization** -- Monitor the utilization as a fraction of the available resources on the instance. +- Monitor the utilization as a fraction of the available resources on the + instance. - Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure - upgrades. + Utilization will vary with use throughout the course of a day, week, and + longer timelines. Monitor trends and pay special attention to the daily and + weekly peak utilization. Use long-term trends to plan infrastructure upgrades. **Tail latency of Coder Server API requests** -- High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. +- High tail latency can indicate Coder Server or the PostgreSQL database is low + on resources. Use the `coderd_api_request_latencies_seconds` metric. @@ -86,15 +99,20 @@ Configure alerting based on these metrics to ensure you surface problems before ### Locality -To ensure increased availability of the Coder API, deploy at least three instances. Spread the instances across nodes with anti-affinity rules in +To ensure increased availability of the Coder API, deploy at least three +instances. Spread the instances across nodes with anti-affinity rules in Kubernetes or in different availability zones of the same geographic region. Do not deploy in different geographic regions. -Coder Servers need to be able to -communicate with one another directly with low latency, under 10ms. Note that this is for the availability of the Coder API. Workspaces are not fault tolerant unless they are explicitly built that way at the template level. +Coder Servers need to be able to communicate with one another directly with low +latency, under 10ms. Note that this is for the availability of the Coder API. +Workspaces are not fault tolerant unless they are explicitly built that way at +the template level. -Deploy Coder Server instances as geographically close to PostgreSQL as possible. Low-latency communication (under 10ms) with Postgres is essential for Coder Server's performance. +Deploy Coder Server instances as geographically close to PostgreSQL as possible. +Low-latency communication (under 10ms) with Postgres is essential for Coder +Server's performance. ### Scaling From f8c5158b2d2ef307e0e00dc0d339c2435aedf355 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Fri, 3 Jan 2025 17:43:46 +0000 Subject: [PATCH 06/15] adjust to md rules --- docs/tutorials/best-practices/scale-coder.md | 39 +++++++------------- 1 file changed, 14 insertions(+), 25 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 9049a17d09615..67b591ea849e8 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -18,20 +18,13 @@ and metrics to identify potential bottlenecks before they negatively affect the end-user experience and measure the effects of modifications you make to your deployment. 
-**Log output** - -- Capture log output from Loki, CloudWatch logs, and other tools on your Coder - Server instances and external provisioner daemons and store them in a +- Log output + - Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server instances and external provisioner daemons and store them in a searchable log store. + - Retain logs for a minimum of thirty days, ideally ninety days. This allows you to look back to see when anomalous behaviors began. - - Retain logs for a minimum of thirty days, ideally ninety days. This allows - you to look back to see when anomalous behaviors began. - -**Metrics** - -- Capture infrastructure metrics like CPU, memory, open files, and network I/O - for all Coder Server, external provisioner daemon, workspace proxy, and - PostgreSQL instances. +- Metrics + - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. ### Capture Coder server metrics with Prometheus @@ -73,27 +66,23 @@ dashboard, for example, [Grafana](../../admin/monitoring/index.md). Configure alerting based on these metrics to ensure you surface problems before they affect the end-user experience. -**CPU and Memory Utilization** +#### CPU and Memory Utilization -- Monitor the utilization as a fraction of the available resources on the - instance. +Monitor the utilization as a fraction of the available resources on the instance. - Utilization will vary with use throughout the course of a day, week, and - longer timelines. Monitor trends and pay special attention to the daily and - weekly peak utilization. Use long-term trends to plan infrastructure upgrades. +Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure upgrades. -**Tail latency of Coder Server API requests** +#### Tail latency of Coder Server API requests -- High tail latency can indicate Coder Server or the PostgreSQL database is low - on resources. +High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. - Use the `coderd_api_request_latencies_seconds` metric. +- Use the `coderd_api_request_latencies_seconds` metric. -**Tail latency of database queries** +#### Tail latency of database queries -- High tail latency can indicate the PostgreSQL database is low in resources. +High tail latency can indicate the PostgreSQL database is low in resources. - Use the `coderd_db_query_latencies_seconds` metric. +- Use the `coderd_db_query_latencies_seconds` metric. 
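
One way to turn these metrics into alerts is a Prometheus alerting rule on tail latency. The sketch below assumes the latency metric is exported as a histogram and uses an illustrative threshold of a one-second 95th percentile sustained for five minutes; tune both to your own baseline:

```yaml
# Sketch of a Prometheus alerting rule for Coder API tail latency.
groups:
  - name: coder-latency
    rules:
      - alert: CoderAPIHighTailLatency
        expr: |
          histogram_quantile(0.95,
            sum(rate(coderd_api_request_latencies_seconds_bucket[5m])) by (le)
          ) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Coder API p95 request latency is above 1s
```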
## Coder Server From 6451c298f8894e959702fce2f872f18a8189aa21 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Thu, 9 Jan 2025 19:46:28 +0000 Subject: [PATCH 07/15] copy edit --- docs/tutorials/best-practices/scale-coder.md | 142 +++++++++---------- 1 file changed, 67 insertions(+), 75 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 67b591ea849e8..0334f5a028529 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -1,17 +1,14 @@ -kl# Scale Coder +# Scale Coder -December 20, 2024 - ---- - -This best practice guide helps you prepare a low-scale Coder deployment so that -it can be scaled up to a high-scale deployment as use grows, and keep it -operating smoothly with a high number of active users and workspaces. +This best practice guide helps you prepare a low-scale Coder deployment that you can +scale up to a high-scale deployment as use grows, and keep it operating smoothly with a +high number of active users and workspaces. ## Observability -Observability is one of the most important aspects to a scalable Coder -deployment. +Observability is one of the most important aspects to a scalable Coder deployment. +When you have visibility into performance and usage metrics, you can make informed +decisions about what changes you should make. [Monitor your Coder deployment](../../admin/monitoring/index.md) with log output and metrics to identify potential bottlenecks before they negatively affect the @@ -19,16 +16,18 @@ end-user experience and measure the effects of modifications you make to your deployment. - Log output - - Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server instances and external provisioner daemons and store them in a - searchable log store. - - Retain logs for a minimum of thirty days, ideally ninety days. This allows you to look back to see when anomalous behaviors began. + - Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server + instances and external provisioner daemons and store them in a searchable log store. + - Retain logs for a minimum of thirty days, ideally ninety days. + This allows you to look back to see when anomalous behaviors began. - Metrics - - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. + - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all + Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. ### Capture Coder server metrics with Prometheus -To capture metrics from Coder Server and external provisioner daemons with +Edit your Helm `values.yaml` to capture metrics from Coder Server and external provisioner daemons with [Prometheus](../../admin/integrations/prometheus.md): 1. Enable Prometheus metrics: @@ -49,40 +48,35 @@ To capture metrics from Coder Server and external provisioner daemons with CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name ``` -- To disable agent stats: + - To disable agent stats: - ```yaml - CODER_PROMETHEUS_COLLECT_AGENT_STATS=false - ``` + ```yaml + CODER_PROMETHEUS_COLLECT_AGENT_STATS=false + ``` Retain metric time series for at least six months. This allows you to see performance trends relative to user growth. 
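
If Prometheus runs through the community kube-prometheus-stack chart, the six-month retention target is typically a values-file setting. The field names below are an assumption to verify against the chart version you deploy:

```yaml
# Hypothetical kube-prometheus-stack values: roughly six months of metrics,
# bounded by available disk.
prometheus:
  prometheusSpec:
    retention: 180d
    retentionSize: 200GB
```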
For a more comprehensive overview, integrate metrics with an observability -dashboard, for example, [Grafana](../../admin/monitoring/index.md). +dashboard like [Grafana](../../admin/monitoring/index.md). ### Observability key metrics Configure alerting based on these metrics to ensure you surface problems before they affect the end-user experience. -#### CPU and Memory Utilization - -Monitor the utilization as a fraction of the available resources on the instance. - -Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure upgrades. - -#### Tail latency of Coder Server API requests - -High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. +- CPU and Memory Utilization + - Monitor the utilization as a fraction of the available resources on the instance. -- Use the `coderd_api_request_latencies_seconds` metric. + Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure upgrades. -#### Tail latency of database queries +- Tail latency of Coder Server API requests + - High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. + - Use the `coderd_api_request_latencies_seconds` metric. -High tail latency can indicate the PostgreSQL database is low in resources. - -- Use the `coderd_db_query_latencies_seconds` metric. +- Tail latency of database queries + - High tail latency can indicate the PostgreSQL database is low in resources. + - Use the `coderd_db_query_latencies_seconds` metric. ## Coder Server @@ -116,8 +110,7 @@ Coder's [validated architectures](../../admin/infrastructure/validated-architectures/index.md) give specific sizing recommendations for various user scales. These are a useful starting point, but very few deployments will remain stable at a predetermined -user level over the long term, so monitoring and adjusting of resources is -recommended. +user level over the long term. We recommend monitoring and adjusting resources as needed. We don't recommend that you autoscale the Coder Servers. Instead, scale the deployment for peak weekly usage. @@ -128,28 +121,26 @@ users to their workspaces in two capacities: 1. As an HTTP proxy when they access workspace applications in their browser via the Coder Dashboard -1. As a DERP proxy when establishing tunneled connections via CLI tools - (`coder ssh`, `coder port-forward`, etc.) and desktop IDEs. +1. As a DERP proxy when establishing tunneled connections with CLI tools like `coder ssh`, `coder port-forward`, and others, and with desktop IDEs. Stopping a Coder Server instance will (momentarily) disconnect any users currently connecting through that instance. Adding a new instance is not -disruptive, but removing instances and upgrades should be performed during a +disruptive, but you should remove instances and perform upgrades during a maintenance window to minimize disruption. ## Provisioner daemons ### Locality -We recommend you disable provisioner daemons within your Coder Server: +We recommend that you run one or more +[provisioner daemon deployments external to Coder Server](../../admin/provisioners.md) +and disable provisioner daemons within your Coder Server. 
+This allows you to scale them independently of the Coder Server: ```yaml CODER_PROVISIONER_DAEMONS=0 ``` -Run one or more -[provisioner daemon deployments external to Coder Server](../../admin/provisioners.md). -This allows you to scale them independently of the Coder Server. - We recommend deploying provisioner daemons within the same cluster as the workspaces they will provision or are hosted in. @@ -157,15 +148,15 @@ workspaces they will provision or are hosted in. provision workspaces and can speed builds. - It allows provisioner daemons to use in-cluster mechanisms (for example - Kubernetes service account tokens, AWS IAM Roles, etc.) to authenticate with + Kubernetes service account tokens, AWS IAM Roles, and others) to authenticate with the infrastructure APIs. - If you deploy workspaces in multiple clusters, run multiple provisioner daemon deployments and use template tags to select the correct set of provisioner daemons. -- Provisioner daemons need to be able to connect to Coder Server, but this need - not be a low-latency connection. +- Provisioner daemons need to be able to connect to Coder Server, but this does not need + to be a low-latency connection. Provisioner daemons make no direct connections to the PostgreSQL database, so there's no need for locality to the Postgres database. @@ -173,30 +164,31 @@ there's no need for locality to the Postgres database. ### Scaling Each provisioner daemon instance can handle a single workspace build job at a -time. Therefore, the number of provisioner daemon instances within a tagged -deployment equals the maximum number of simultaneous builds your Coder -deployment can handle. +time. Therefore, the maximum number of simultaneous builds your Coder deployment +can handle is equal to the number of provisioner daemon instances within a tagged +deployment. If users experience unacceptably long queues for workspace builds to start, consider increasing the number of provisioner daemon instances in the affected cluster. -You may wish to automatically scale the number of provisioner daemon instances -throughout the day to meet demand. If you stop instances with `SIGHUP`, they -will complete their current build job and exit. `SIGINT` will cancel the current -job, which will result in a failed build. Ensure your autoscaler waits long -enough for your build jobs to complete before forcibly killing the provisioner -daemon process. +You might need to automatically scale the number of provisioner daemon instances +throughout the day to meet demand. -If deploying in Kubernetes, we recommend a single provisioner daemon per pod. On -a virtual machine (VM), you can deploy multiple provisioner daemons, ensuring +If you stop instances with `SIGHUP`, they will complete their current build job +and exit. `SIGINT` will cancel the current job, which will result in a failed build. +Ensure your autoscaler waits long enough for your build jobs to complete before +it kills the provisioner daemon process. + +If you deploy in Kubernetes, we recommend a single provisioner daemon per pod. +On a virtual machine (VM), you can deploy multiple provisioner daemons, ensuring each has a unique `CODER_CACHE_DIRECTORY` value. Coder's [validated architectures](../../admin/infrastructure/validated-architectures/index.md) give specific sizing recommendations for various user scales. Since the complexity of builds varies significantly depending on the workspace template, -consider this a starting point. Monitor queue times and build times to adjust +consider this a starting point. 
Monitor queue times and build times and adjust the number and size of your provisioner daemon instances. ## PostgreSQL @@ -232,20 +224,22 @@ path. ### Scaling -Workspace proxy load is determined by the amount of traffic they proxy. We -recommend you monitor CPU, memory, and network I/O utilization to decide when to -resize the number of proxy instances. +Workspace proxy load is determined by the amount of traffic they proxy. + +Monitor CPU, memory, and network I/O utilization to decide when to resize +the number of proxy instances. + +Scale for peak demand and scale down or upgrade during a maintenance window. We do not recommend autoscaling the workspace proxies because many applications use long-lived connections such as websockets, which would be disrupted by -stopping the proxy. We recommend you scale for peak demand and scale down or -upgrade during a maintenance window. +stopping the proxy. ## Workspaces Workspaces represent the vast majority of resources in most Coder deployments. Because they are defined by templates, there is no one-size-fits-all advice for -scaling. +scaling workspaces. ### Hard and soft cluster limits @@ -259,25 +253,23 @@ utilization against the limits, so that a new influx of users doesn't encounter failed builds. Monitoring these is outside the scope of Coder, but we recommend that you set up dashboards and alerts for each kind of limited resource. -As you approach soft limits, you might be able to justify an increase to keep -growing. +As you approach soft limits, you can increase limits to keep growing. -As you approach hard limits, you will need to consider deploying to additional -cluster(s). +As you approach hard limits, consider deploying to additional cluster(s). ### Workspaces per node Many development workloads are "spiky" in their CPU and memory requirements, for -example, peaking during build/test and then ebbing while editing code. This -leads to an opportunity to efficiently use compute resources by packing multiple +example, they peak during build/test and then ebb while editing code. +This leads to an opportunity to efficiently use compute resources by packing multiple workspaces onto a single node. This can lead to better experience (more CPU and memory available during brief bursts) and lower cost. -However, it needs to be considered against several trade-offs. +There are a number of things you should consider before you decide how many +workspaces you should allow per node: -- There are residual probabilities of "noisy neighbor" problems negatively - affecting end users. The probabilities increase with the amount of - oversubscription of CPU and memory resources. +- "Noisy neighbor" issues: Users share the node's CPU and memory resources and might +be susceptible to a user or process consuming shared resources. 
- If the shared nodes are a provisioned resource, for example, Kubernetes nodes running on VMs in a public cloud, then it can sometimes be a challenge to From fbedc4edc039d18533856d8ddc9a2a4f1403ae2a Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Thu, 9 Jan 2025 20:18:17 +0000 Subject: [PATCH 08/15] s/ebb/lower --- docs/tutorials/best-practices/scale-coder.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 0334f5a028529..cfe7ee1e72bea 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -260,7 +260,7 @@ As you approach hard limits, consider deploying to additional cluster(s). ### Workspaces per node Many development workloads are "spiky" in their CPU and memory requirements, for -example, they peak during build/test and then ebb while editing code. +example, they peak during build/test and then lower while editing code. This leads to an opportunity to efficiently use compute resources by packing multiple workspaces onto a single node. This can lead to better experience (more CPU and memory available during brief bursts) and lower cost. From 30a62075cbcd343f141095b62577f7daa8745b8a Mon Sep 17 00:00:00 2001 From: Edward Angert Date: Fri, 10 Jan 2025 08:59:52 -0500 Subject: [PATCH 09/15] Apply suggestions from code review Co-authored-by: Spike Curtis --- docs/tutorials/best-practices/scale-coder.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index cfe7ee1e72bea..5be2ba7a7e0b5 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -1,6 +1,6 @@ # Scale Coder -This best practice guide helps you prepare a low-scale Coder deployment that you can +This best practice guide helps you prepare a Coder deployment that you can scale up to a high-scale deployment as use grows, and keep it operating smoothly with a high number of active users and workspaces. @@ -249,11 +249,11 @@ size of the cluster, especially in the case of a private cloud, or soft limits, based on configured limits in your public cloud account. It is important to be aware of these limits and monitor Coder workspace resource -utilization against the limits, so that a new influx of users doesn't encounter +utilization against the limits, so that a new influx of users don't encounter failed builds. Monitoring these is outside the scope of Coder, but we recommend that you set up dashboards and alerts for each kind of limited resource. -As you approach soft limits, you can increase limits to keep growing. +As you approach soft limits, you can request limit increases to keep growing. As you approach hard limits, consider deploying to additional cluster(s). 
From da810802585bfd59780ca90a30b29d409e04e316 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Fri, 10 Jan 2025 19:44:20 +0000 Subject: [PATCH 10/15] add suggestions from review --- docs/tutorials/best-practices/scale-coder.md | 27 ++++++++++++-------- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 5be2ba7a7e0b5..41de10752c7dc 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -16,8 +16,8 @@ end-user experience and measure the effects of modifications you make to your deployment. - Log output - - Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server - instances and external provisioner daemons and store them in a searchable log store. + - Capture log output from from Coder Server instances and external provisioner daemons + and store them in a searchable log store like Loki, CloudWatch logs, or other tools. - Retain logs for a minimum of thirty days, ideally ninety days. This allows you to look back to see when anomalous behaviors began. @@ -42,13 +42,15 @@ Edit your Helm `values.yaml` to capture metrics from Coder Server and external p CODER_PROMETHEUS_COLLECT_DB_METRICS=true ``` -1. Configure agent stats to avoid large cardinality: +1. For a high scale deployment, configure agent stats to avoid large cardinality or disable them: - ```yaml - CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name - ``` + - Configure agent stats: + + ```yaml + CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name + ``` - - To disable agent stats: + - Disable agent stats: ```yaml CODER_PROMETHEUS_COLLECT_AGENT_STATS=false @@ -68,10 +70,13 @@ they affect the end-user experience. - CPU and Memory Utilization - Monitor the utilization as a fraction of the available resources on the instance. - Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure upgrades. + Utilization will vary with use throughout the course of a day, week, and longer timelines. + Monitor trends and pay special attention to the daily and weekly peak utilization. + Use long-term trends to plan infrastructure upgrades. - Tail latency of Coder Server API requests - - High tail latency can indicate Coder Server or the PostgreSQL database is low on resources. + - High tail latency can indicate Coder Server or the PostgreSQL database is underprovisioned + for the load. - Use the `coderd_api_request_latencies_seconds` metric. - Tail latency of database queries @@ -82,8 +87,8 @@ they affect the end-user experience. ### Locality -To ensure increased availability of the Coder API, deploy at least three -instances. Spread the instances across nodes with anti-affinity rules in +If increased availability of the Coder API is a concern, deploy at least three +instances of Coder Server. Spread the instances across nodes with anti-affinity rules in Kubernetes or in different availability zones of the same geographic region. Do not deploy in different geographic regions. 
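
To complement node-level anti-affinity, a topology spread constraint can balance Coder Server replicas across availability zones within the region. As with the earlier sketch, the label selector is an assumption and should match your deployment's pod labels:

```yaml
# Sketch: spread Coder Server replicas across availability zones.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: coder
```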
From ba303be7f9e9bb51b8140f5c3e58dbb5a4c79252 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Wed, 15 Jan 2025 16:06:00 +0000 Subject: [PATCH 11/15] copy edit --- docs/tutorials/best-practices/scale-coder.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 41de10752c7dc..85dd7030eaf54 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -19,7 +19,7 @@ deployment. - Capture log output from from Coder Server instances and external provisioner daemons and store them in a searchable log store like Loki, CloudWatch logs, or other tools. - Retain logs for a minimum of thirty days, ideally ninety days. - This allows you to look back to see when anomalous behaviors began. + This allows you investigate when anomalous behaviors began. - Metrics - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all From f110af3e9c7190f4d8e51c87c80d85172958f8a2 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Fri, 17 Jan 2025 20:59:41 +0000 Subject: [PATCH 12/15] rearrage observability subheadings --- docs/tutorials/best-practices/scale-coder.md | 57 ++++++++++---------- 1 file changed, 29 insertions(+), 28 deletions(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 85dd7030eaf54..a329e1b903b6e 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -24,8 +24,36 @@ deployment. - Metrics - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. + - Capture Coder Server and External Provisioner daemons metrics [via Prometheus](#how-to-capture-coder-server-metrics-with-prometheus). -### Capture Coder server metrics with Prometheus +Retain metric time series for at least six months. This allows you to see +performance trends relative to user growth. + +For a more comprehensive overview, integrate metrics with an observability +dashboard like [Grafana](../../admin/monitoring/index.md). + +### Observability key metrics + +Configure alerting based on these metrics to ensure you surface problems before +they affect the end-user experience. + +- CPU and Memory Utilization + - Monitor the utilization as a fraction of the available resources on the instance. + + Utilization will vary with use throughout the course of a day, week, and longer timelines. + Monitor trends and pay special attention to the daily and weekly peak utilization. + Use long-term trends to plan infrastructure upgrades. + +- Tail latency of Coder Server API requests + - High tail latency can indicate Coder Server or the PostgreSQL database is underprovisioned + for the load. + - Use the `coderd_api_request_latencies_seconds` metric. + +- Tail latency of database queries + - High tail latency can indicate the PostgreSQL database is low in resources. + - Use the `coderd_db_query_latencies_seconds` metric. + +### How to capture Coder server metrics with Prometheus Edit your Helm `values.yaml` to capture metrics from Coder Server and external provisioner daemons with [Prometheus](../../admin/integrations/prometheus.md): @@ -56,33 +84,6 @@ Edit your Helm `values.yaml` to capture metrics from Coder Server and external p CODER_PROMETHEUS_COLLECT_AGENT_STATS=false ``` -Retain metric time series for at least six months. 
This allows you to see -performance trends relative to user growth. - -For a more comprehensive overview, integrate metrics with an observability -dashboard like [Grafana](../../admin/monitoring/index.md). - -### Observability key metrics - -Configure alerting based on these metrics to ensure you surface problems before -they affect the end-user experience. - -- CPU and Memory Utilization - - Monitor the utilization as a fraction of the available resources on the instance. - - Utilization will vary with use throughout the course of a day, week, and longer timelines. - Monitor trends and pay special attention to the daily and weekly peak utilization. - Use long-term trends to plan infrastructure upgrades. - -- Tail latency of Coder Server API requests - - High tail latency can indicate Coder Server or the PostgreSQL database is underprovisioned - for the load. - - Use the `coderd_api_request_latencies_seconds` metric. - -- Tail latency of database queries - - High tail latency can indicate the PostgreSQL database is low in resources. - - Use the `coderd_db_query_latencies_seconds` metric. - ## Coder Server ### Locality From 23f33004601b0b36d20eeb202bd37a50472a22ae Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Tue, 21 Jan 2025 19:40:20 +0000 Subject: [PATCH 13/15] cross-linking scale-testing --- docs/admin/infrastructure/scale-testing.md | 37 ++++++++------ docs/admin/infrastructure/scale-utility.md | 53 +++++++++++++------- docs/manifest.json | 5 ++ docs/tutorials/best-practices/scale-coder.md | 10 +++- 4 files changed, 69 insertions(+), 36 deletions(-) diff --git a/docs/admin/infrastructure/scale-testing.md b/docs/admin/infrastructure/scale-testing.md index 37a79f5d6f742..4d4645085a38e 100644 --- a/docs/admin/infrastructure/scale-testing.md +++ b/docs/admin/infrastructure/scale-testing.md @@ -5,7 +5,7 @@ without compromising service. This process encompasses infrastructure setup, traffic projections, and aggressive testing to identify and mitigate potential bottlenecks. -A dedicated Kubernetes cluster for Coder is recommended to configure, host and +A dedicated Kubernetes cluster for Coder is recommended to configure, host, and manage Coder workloads. Kubernetes provides container orchestration capabilities, allowing Coder to efficiently deploy, scale, and manage workspaces across a distributed infrastructure. This ensures high availability, fault @@ -13,27 +13,29 @@ tolerance, and scalability for Coder deployments. Coder is deployed on this cluster using the [Helm chart](../../install/kubernetes.md#4-install-coder-with-helm). +For more information about scaling, see our [Coder scaling best practices](../../tutorials/best-practices/scale-coder.md). + ## Methodology Our scale tests include the following stages: 1. Prepare environment: create expected users and provision workspaces. -2. SSH connections: establish user connections with agents, verifying their +1. SSH connections: establish user connections with agents, verifying their ability to echo back received content. -3. Web Terminal: verify the PTY connection used for communication with Web +1. Web Terminal: verify the PTY connection used for communication with Web Terminal. -4. Workspace application traffic: assess the handling of user connections with +1. Workspace application traffic: assess the handling of user connections with specific workspace apps, confirming their capability to echo back received content effectively. -5. Dashboard evaluation: verify the responsiveness and stability of Coder +1. 
Dashboard evaluation: verify the responsiveness and stability of Coder dashboards under varying load conditions. This is achieved by simulating user interactions using instances of headless Chromium browsers. -6. Cleanup: delete workspaces and users created in step 1. +1. Cleanup: delete workspaces and users created in step 1. ## Infrastructure and setup requirements @@ -54,13 +56,16 @@ channel for IDEs with VS Code and JetBrains plugins. The basic setup of scale tests environment involves: 1. Scale tests runner (32 vCPU, 128 GB RAM) -2. Coder: 2 replicas (4 vCPU, 16 GB RAM) -3. Database: 1 instance (2 vCPU, 32 GB RAM) -4. Provisioner: 50 instances (0.5 vCPU, 512 MB RAM) +1. Coder: 2 replicas (4 vCPU, 16 GB RAM) +1. Database: 1 instance (2 vCPU, 32 GB RAM) +1. Provisioner: 50 instances (0.5 vCPU, 512 MB RAM) + +The test is deemed successful if: -The test is deemed successful if users did not experience interruptions in their -workflows, `coderd` did not crash or require restarts, and no other internal -errors were observed. +- Users did not experience interruptions in their +workflows, +- `coderd` did not crash or require restarts, and +- Noo other internal errors were observed. ## Traffic Projections @@ -90,11 +95,11 @@ Database: ## Available reference architectures -[Up to 1,000 users](./validated-architectures/1k-users.md) +- [Up to 1,000 users](./validated-architectures/1k-users.md) -[Up to 2,000 users](./validated-architectures/2k-users.md) +- [Up to 2,000 users](./validated-architectures/2k-users.md) -[Up to 3,000 users](./validated-architectures/3k-users.md) +- [Up to 3,000 users](./validated-architectures/3k-users.md) ## Hardware recommendation @@ -107,7 +112,7 @@ guidance on optimal configurations. A reasonable approach involves using scaling formulas based on factors like CPU, memory, and the number of users. While the minimum requirements specify 1 CPU core and 2 GB of memory per -`coderd` replica, it is recommended to allocate additional resources depending +`coderd` replica, we recommend that you allocate additional resources depending on the workload size to ensure deployment stability. #### CPU and memory usage diff --git a/docs/admin/infrastructure/scale-utility.md b/docs/admin/infrastructure/scale-utility.md index b3094c49fbca4..110a4e9d7a4a8 100644 --- a/docs/admin/infrastructure/scale-utility.md +++ b/docs/admin/infrastructure/scale-utility.md @@ -1,20 +1,23 @@ # Scale Tests and Utilities -We scale-test Coder with [a built-in utility](#scale-testing-utility) that can +We scale-test Coder with a built-in utility that can be used in your environment for insights into how Coder scales with your -infrastructure. For scale-testing Kubernetes clusters we recommend to install +infrastructure. For scale-testing Kubernetes clusters we recommend that you install and use the dedicated Coder template, [scaletest-runner](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner). Learn more about [Coder’s architecture](./architecture.md) and our [scale-testing methodology](./scale-testing.md). +For more information about scaling, see our [Coder scaling best practices](../../tutorials/best-practices/scale-coder.md). + ## Recent scale tests -> Note: the below information is for reference purposes only, and are not -> intended to be used as guidelines for infrastructure sizing. Review the -> [Reference Architectures](./validated-architectures/index.md#node-sizing) for -> hardware sizing recommendations. 
+The information in this doc is for reference purposes only, and is not intended +to be used as guidelines for infrastructure sizing. + +Review the [Reference Architectures](./validated-architectures/index.md#node-sizing) for +hardware sizing recommendations. | Environment | Coder CPU | Coder RAM | Coder Replicas | Database | Users | Concurrent builds | Concurrent connections (Terminal/SSH) | Coder Version | Last tested | |------------------|-----------|-----------|----------------|-------------------|-------|-------------------|---------------------------------------|---------------|--------------| @@ -25,8 +28,7 @@ Learn more about [Coder’s architecture](./architecture.md) and our | Kubernetes (GKE) | 4 cores | 16 GB | 2 | db-custom-8-30720 | 2000 | 50 | 2000 simulated | `v2.8.4` | Feb 28, 2024 | | Kubernetes (GKE) | 2 cores | 4 GB | 2 | db-custom-2-7680 | 1000 | 50 | 1000 simulated | `v2.10.2` | Apr 26, 2024 | -> Note: a simulated connection reads and writes random data at 40KB/s per -> connection. +> Note: A simulated connection reads and writes random data at 40KB/s per connection. ## Scale testing utility @@ -34,17 +36,24 @@ Since Coder's performance is highly dependent on the templates and workflows you support, you may wish to use our internal scale testing utility against your own environments. -> Note: This utility is experimental. It is not subject to any compatibility -> guarantees, and may cause interruptions for your users. To avoid potential -> outages and orphaned resources, we recommend running scale tests on a -> secondary "staging" environment or a dedicated -> [Kubernetes playground cluster](https://github.com/coder/coder/tree/main/scaletest/terraform). -> Run it against a production environment at your own risk. +
+ +This utility is experimental. + +It is not subject to any compatibility guarantees and may cause interruptions +for your users. +To avoid potential outages and orphaned resources, we recommend that you run +scale tests on a secondary "staging" environment or a dedicated +[Kubernetes playground cluster](https://github.com/coder/coder/tree/main/scaletest/terraform). + +Run it against a production environment at your own risk. + +
### Create workspaces The following command will provision a number of Coder workspaces using the -specified template and extra parameters. +specified template and extra parameters: ```shell coder exp scaletest create-workspaces \ @@ -56,8 +65,6 @@ coder exp scaletest create-workspaces \ --job-timeout 5h \ --no-cleanup \ --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces.json" - -# Run `coder exp scaletest create-workspaces --help` for all usage ``` The command does the following: @@ -70,6 +77,12 @@ The command does the following: 1. If you don't want the creation process to be interrupted by any errors, use the `--retry 5` flag. +For more built-in `scaletest` options, use the `--help` flag: + +```shell +coder exp scaletest create-workspaces --help +``` + ### Traffic Generation Given an existing set of workspaces created previously with `create-workspaces`, @@ -105,7 +118,11 @@ The `workspace-traffic` supports also other modes - SSH traffic, workspace app: 1. For SSH traffic: Use `--ssh` flag to generate SSH traffic instead of Web Terminal. 1. For workspace app traffic: Use `--app [wsdi|wsec|wsra]` flag to select app - behavior. (modes: _WebSocket discard_, _WebSocket echo_, _WebSocket read_). + behavior. + + - `wsdi`: WebSocket discard + - `wsec`: WebSocket echo + - `wsra`: WebSocket read ### Cleanup diff --git a/docs/manifest.json b/docs/manifest.json index d4a8b4841da68..8a26c0a46497c 100644 --- a/docs/manifest.json +++ b/docs/manifest.json @@ -243,6 +243,11 @@ "title": "Scaling Utilities", "description": "Tools to help you scale your deployment", "path": "./admin/infrastructure/scale-utility.md" + }, + { + "title": "Scaling best practices", + "description": "How to prepare a Coder deployment for scale", + "path": "./tutorials/best-practices/scale-coder.md" } ] }, diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index a329e1b903b6e..17ddcbf66bdb0 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -125,9 +125,10 @@ Although Coder Server persists no internal state, it operates as a proxy for end users to their workspaces in two capacities: 1. As an HTTP proxy when they access workspace applications in their browser via - the Coder Dashboard +the Coder Dashboard. -1. As a DERP proxy when establishing tunneled connections with CLI tools like `coder ssh`, `coder port-forward`, and others, and with desktop IDEs. +1. As a DERP proxy when establishing tunneled connections with CLI tools like +`coder ssh`, `coder port-forward`, and others, and with desktop IDEs. Stopping a Coder Server instance will (momentarily) disconnect any users currently connecting through that instance. Adding a new instance is not @@ -313,3 +314,8 @@ Coder customers have had success with both: Set up your network so that most users can get direct, peer-to-peer connections to their workspaces. This drastically reduces the load on Coder Server and workspace proxy instances. 
+ +## Next steps + +- [Scale Tests and Utilities](../../admin/infrastructure/scale-utility.md) +- [Scale Testing](../../admin/infrastructure/scale-testing.md) From 58a573532c0468606ab09aeb25504c106665eda5 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Tue, 21 Jan 2025 19:52:01 +0000 Subject: [PATCH 14/15] md edit --- docs/tutorials/best-practices/scale-coder.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/tutorials/best-practices/scale-coder.md b/docs/tutorials/best-practices/scale-coder.md index 17ddcbf66bdb0..9a640a051be58 100644 --- a/docs/tutorials/best-practices/scale-coder.md +++ b/docs/tutorials/best-practices/scale-coder.md @@ -24,7 +24,8 @@ deployment. - Metrics - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances. - - Capture Coder Server and External Provisioner daemons metrics [via Prometheus](#how-to-capture-coder-server-metrics-with-prometheus). + - Capture Coder Server and External Provisioner daemons metrics + [via Prometheus](#how-to-capture-coder-server-metrics-with-prometheus). Retain metric time series for at least six months. This allows you to see performance trends relative to user growth. From 79f5af76e8dfb2165db13b095ef9dddcbffcb553 Mon Sep 17 00:00:00 2001 From: EdwardAngert Date: Tue, 21 Jan 2025 19:55:20 +0000 Subject: [PATCH 15/15] typo fix --- docs/admin/infrastructure/scale-testing.md | 2 +- docs/admin/infrastructure/scale-utility.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/admin/infrastructure/scale-testing.md b/docs/admin/infrastructure/scale-testing.md index 4d4645085a38e..de36131531fbe 100644 --- a/docs/admin/infrastructure/scale-testing.md +++ b/docs/admin/infrastructure/scale-testing.md @@ -65,7 +65,7 @@ The test is deemed successful if: - Users did not experience interruptions in their workflows, - `coderd` did not crash or require restarts, and -- Noo other internal errors were observed. +- No other internal errors were observed. ## Traffic Projections diff --git a/docs/admin/infrastructure/scale-utility.md b/docs/admin/infrastructure/scale-utility.md index 110a4e9d7a4a8..a3162c9fd58f3 100644 --- a/docs/admin/infrastructure/scale-utility.md +++ b/docs/admin/infrastructure/scale-utility.md @@ -119,7 +119,7 @@ The `workspace-traffic` supports also other modes - SSH traffic, workspace app: Terminal. 1. For workspace app traffic: Use `--app [wsdi|wsec|wsra]` flag to select app behavior. - + - `wsdi`: WebSocket discard - `wsec`: WebSocket echo - `wsra`: WebSocket read
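To close the loop on the alerting recommendation in the observability section, a Prometheus alerting rule on the API tail-latency metric could look roughly like the following sketch. The rule group name, the one-second threshold, and the labels are illustrative assumptions, and it presumes `coderd_api_request_latencies_seconds` is exported as a Prometheus histogram with the usual `_bucket` series.

```yaml
groups:
  - name: coder-server-latency # assumed group name
    rules:
      - alert: CoderAPITailLatencyHigh
        # p95 of Coder Server API request latency over the last five minutes,
        # computed from the metric's histogram buckets.
        expr: |
          histogram_quantile(0.95,
            sum(rate(coderd_api_request_latencies_seconds_bucket[5m])) by (le)
          ) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Coder Server API p95 latency has been above 1s for 10 minutes"
```

A similar rule over `coderd_db_query_latencies_seconds_bucket` covers the database-query latency metric mentioned alongside it.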