
docs: provide hardware recommendations for reference architectures #12534


Merged · 39 commits · Mar 15, 2024
51 changes: 51 additions & 0 deletions docs/admin/architectures/1k-users.md
# Reference Architecture: up to 1,000 users

The 1,000-user architecture is designed to cover a wide range of workflows. It
suits organizations such as medium-sized tech startups, educational
institutions, or small to mid-sized enterprises.

**Target load**: API: up to 180 RPS

**High Availability**: non-essential for small deployments

## Hardware recommendations

### Coderd nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | ------------------- | ------------------- | --------------- | ---------- | ----------------- |
| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 nodes / 1 coderd each | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |

**Footnotes**:

- For small deployments (approximately 100 users, 10 concurrent workspace
  builds), it is acceptable to deploy provisioners on `coderd` nodes.
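
As a rough illustration, the sizing in the table above maps onto Kubernetes
resource requests as in the following sketch. The Deployment name, labels, and
image tag are assumptions for illustration, not the official Helm chart output:

```yaml
# Sketch: a coderd Deployment sized to match the table above.
# Name, labels, and image tag are illustrative; adapt to your deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coder
spec:
  replicas: 2 # 1-2 replicas, 1 coderd each
  selector:
    matchLabels:
      app: coder
  template:
    metadata:
      labels:
        app: coder
    spec:
      containers:
        - name: coder
          image: ghcr.io/coder/coder:latest
          resources:
            requests:
              cpu: "2" # 2 vCPU per replica
              memory: 8Gi # 8 GB per replica
            limits:
              cpu: "2"
              memory: 8Gi
```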

### Provisioner nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 1,000 | 8 vCPU, 32 GB memory | 2 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- An external provisioner is deployed as a Kubernetes pod; a minimal sketch
  follows below.
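
The sketch below shows one way to run external provisioners as a Deployment
pinned to a dedicated node pool. The start command, authentication, and node
label are assumptions; consult the external provisioner documentation for the
exact invocation:

```yaml
# Sketch: external provisioners as a Kubernetes Deployment.
# Command, authentication, and node label are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coder-provisioner
spec:
  replicas: 60 # 2 nodes x 30 provisioners each, per the table above
  selector:
    matchLabels:
      app: coder-provisioner
  template:
    metadata:
      labels:
        app: coder-provisioner
    spec:
      nodeSelector:
        workload: provisioners # hypothetical label on the provisioner node pool
      containers:
        - name: provisionerd
          image: ghcr.io/coder/coder:latest
          args: ["provisionerd", "start"] # assumed invocation; verify for your version
          resources:
            requests:
              cpu: 250m # ~30 pods per 8 vCPU / 32 GB node
              memory: 1Gi
```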

### Workspace nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------------- | ---------------- | ------------ | ----------------- |
| Up to 1,000 | 8 vCPU, 32 GB memory | 64 nodes / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- We assume that a workspace user needs at minimum 2 GB of memory. We
  recommend against over-provisioning memory for developer workloads, as this
  may lead to OOMKiller invocations (see the sketch below).
- Maximum number of Kubernetes workspace pods per node: 256
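
As a hedged sketch, these two footnotes translate into the following
configuration. The pod spec is illustrative (real workspace pods come from
Coder templates), and `maxPods` is a kubelet setting that managed node pools
usually expose as a node-pool creation flag:

```yaml
# Sketch: per-workspace memory sizing. Setting the memory limit equal to
# the request keeps usage predictable, so the OOMKiller fires only if the
# workspace itself exceeds 2Gi, not because the node was over-committed.
apiVersion: v1
kind: Pod
metadata:
  name: workspace-example # illustrative; real workspaces come from templates
spec:
  containers:
    - name: dev
      image: ubuntu:22.04
      resources:
        requests:
          cpu: 500m # 16 workspaces per 8 vCPU node
          memory: 2Gi # minimum per workspace user
        limits:
          memory: 2Gi
---
# Sketch: kubelet-level cap on pods per node.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 256
```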

### Database nodes

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | ------------------- | -------- | ------- | ------------------ | ------------- | ----------------- |
| Up to 1,000 | 2 vCPU, 8 GB memory | 1 | 512 GB | `db-custom-2-7680` | `db.t3.large` | `Standard_D2s_v3` |
59 changes: 59 additions & 0 deletions docs/admin/architectures/2k-users.md
# Reference Architecture: up to 2,000 users

In the 2,000-user architecture, there is a moderate increase in traffic,
suggesting a growing user base or expanding operations. This setup is well
suited for mid-sized companies experiencing growth or for universities
accommodating expanding user populations.

Users can be evenly distributed between two regions or attached to different
clusters.

**Target load**: API: up to 300 RPS

**High Availability**: The mode is _enabled_; multiple replicas provide higher
deployment reliability under load.

## Hardware recommendations

### Coderd nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------------- | --------------- | ----------- | ----------------- |
| Up to 2,000 | 4 vCPU, 16 GB memory | 2 nodes / 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |
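
With HA enabled, it is worth ensuring the two `coderd` replicas never land on
the same node. A minimal sketch using pod anti-affinity (names, labels, and
image tag are illustrative):

```yaml
# Sketch: spread coderd replicas across distinct nodes so losing one
# node does not take down the whole control plane.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coder
spec:
  replicas: 2
  selector:
    matchLabels:
      app: coder
  template:
    metadata:
      labels:
        app: coder
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: coder
              topologyKey: kubernetes.io/hostname # one replica per node
      containers:
        - name: coder
          image: ghcr.io/coder/coder:latest
```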

### Provisioner nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 2,000 | 8 vCPU, 32 GB memory | 4 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- An external provisioner is deployed as a Kubernetes pod.
- It is not recommended to run provisioner daemons on `coderd` nodes.
- Consider separating provisioners into different namespaces to support
  zero-trust or multi-cloud deployments; a minimal sketch follows below.
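
A minimal sketch of that namespace separation (names are illustrative):

```yaml
# Sketch: one namespace per trust boundary or cloud. Each namespace would
# run its own provisioner Deployment, matched to templates via provisioner tags.
apiVersion: v1
kind: Namespace
metadata:
  name: provisioners-cloud-a # illustrative name
---
apiVersion: v1
kind: Namespace
metadata:
  name: provisioners-cloud-b # illustrative name
```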

### Workspace nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------ | ---------------- | ------------ | ----------------- |
| Up to 2,000 | 8 vCPU, 32 GB memory | 128 nodes / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- We assume that a workspace user needs at minimum 2 GB of memory.
- Maximum number of Kubernetes workspace pods per node: 256
- Nodes can be distributed across two regions, not necessarily evenly split,
  depending on developer team sizes; a minimal sketch follows below.
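
A hedged sketch of pinning a workspace to one of the two regions via the
well-known topology label (pod name, image, and region value are illustrative):

```yaml
# Sketch: constrain a workspace pod to a region using the standard
# topology label; the region value below is illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: workspace-eu-example
spec:
  nodeSelector:
    topology.kubernetes.io/region: europe-west1
  containers:
    - name: dev
      image: ubuntu:22.04
```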

### Database nodes

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | -------------------- | -------- | ------- | ------------------- | -------------- | ----------------- |
| Up to 2,000 | 4 vCPU, 16 GB memory | 1 | 1 TB | `db-custom-4-15360` | `db.t3.xlarge` | `Standard_D4s_v3` |
**Review comment (Member):**
> We've been using a `db-custom-8-32768` for our 2,000-user scale tests, so the
> CPU numbers here may be slightly inaccurate. Granted, for a regular
> (non-scaletest) deployment, I think 4 vCPU is borderline sufficient, as it's
> the 2,000 active workspaces that push the DB CPU load up to ~80% (8 vCPU).

**Reply (Member, Author):**
> On the other hand, our scale-testing methodology was really aggressive, so I
> would expect a similar pattern on the user side 🤔
> What is your recommendation @maf? Should we switch to `db-custom-8-32768`?


**Footnotes**:

- Consider adding more replicas if workspace activity exceeds 500 workspace
  builds per day or to achieve higher RPS.
62 changes: 62 additions & 0 deletions docs/admin/architectures/3k-users.md
# Reference Architecture: up to 3,000 users

The 3,000-user architecture targets large-scale enterprises, possibly with
both on-premises networks and cloud deployments.

**Target load**: API: up to 550 RPS

**High Availability**: Typically, this scale requires a fully managed HA
PostgreSQL service and all Coder observability features enabled for
operational purposes.

**Observability**: Deploy monitoring solutions to gather Prometheus metrics and
visualize them with Grafana to gain detailed insights into infrastructure and
application behavior. This allows operators to respond quickly to incidents and
continuously improve the reliability and performance of the platform.
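
A minimal sketch of a Prometheus scrape job for `coderd`. It assumes metrics
export is enabled (for example via `CODER_PROMETHEUS_ENABLE=true`) and that the
metrics port is reachable in-cluster; the target, port, and path below are
illustrative:

```yaml
# Sketch: prometheus.yml scrape job for coderd metrics.
# Target, port, and metrics path are assumptions; adjust to your deployment.
scrape_configs:
  - job_name: coder
    static_configs:
      - targets: ["coder.coder.svc.cluster.local:2112"] # illustrative target
```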

## Hardware recommendations

### Coderd nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ----------------- | --------------- | ----------- | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 4 nodes / 1 coderd each | `n1-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

### Provisioner nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------ | ---------------- | ------------ | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 8 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- An external provisioner is deployed as a Kubernetes pod.
- It is strongly discouraged to run provisioner daemons on `coderd` nodes at
  this level of scale.
- Separate provisioners into different namespaces to support zero-trust or
  multi-cloud deployments.

### Workspace nodes

| Users | Node capacity | Replicas | GCP | AWS | Azure |
| ----------- | -------------------- | ------------------------------ | ---------------- | ------------ | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 256 nodes / 12 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- Assumed that a workspace user needs 2 GB memory to perform
- Maximum number of Kubernetes workspace pods per node: 256
- As workspace nodes can be distributed between regions, on-premises networks
and cloud areas, consider different namespaces in favor of zero-trust or
multi-cloud deployments.

### Database nodes

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
| ----------- | -------------------- | -------- | ------- | ------------------- | --------------- | ----------------- |
| Up to 3,000 | 8 vCPU, 32 GB memory | 2 | 1.5 TB | `db-custom-8-30720` | `db.t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- Consider adding more replicas if workspace activity exceeds 1,500 workspace
  builds per day or to achieve higher RPS.