@@ -205,27 +205,60 @@ this option enabled unless there are compelling reasons to disable it.
Inactive users do not consume Coder resources.

- #### HTTP API latency
+ #### Scaling formula

- API latency/response time average number of HTTP requests
+ When determining scaling requirements, consider the following factors:

- depending on database perf
+ - 1 vCPU x 2 GB memory x 250 users: A reasonable formula to determine resource
+   allocation based on the number of users and their expected usage patterns
+   (see the worked sketch after this list).
+ - API latency/response time: Monitor API latency and response times to ensure
+   optimal performance under varying loads.
+ - Average number of HTTP requests: Track the average number of HTTP requests to
+   gauge system usage and identify potential bottlenecks.
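+
+ As a rough illustration of the first factor, here is a minimal sketch (not part
+ of Coder; the 2,000-user figure is only an example) that applies the
+ 1 vCPU x 2 GB per 250 users ratio to a target user count:
+
+ ```python
+ import math
+
+ # Baseline ratio from the list above: 1 vCPU and 2 GB of memory per 250 users.
+ USERS_PER_VCPU = 250
+ MEMORY_GB_PER_VCPU = 2
+
+ def coderd_resources(target_users: int) -> tuple[int, int]:
+     """Estimate total (vCPU, memory in GB) to allocate across coderd replicas."""
+     vcpus = math.ceil(target_users / USERS_PER_VCPU)
+     return vcpus, vcpus * MEMORY_GB_PER_VCPU
+
+ # Example: 2,000 users -> (8, 16), i.e. 8 vCPUs and 16 GB of memory in total.
+ print(coderd_resources(2000))
+ ```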

- TODO
+ **Node Autoscaling**

- #### Scaling formula
+ We recommend disabling autoscaling for `coderd` nodes. Autoscaling can cause
+ interruptions for user connections; see [Autoscaling](../scale.md#autoscaling)
+ for more details.
+
+ ### Workspaces

- reasonable ratio/formula: CPU x memory x users reasonable ratio/formula:
- provisionerd x users API latency/response time average number of HTTP requests
+ Assumptions:

- TODO
+ Workspaces also run on the same Kubernetes cluster (we recommend a different
+ namespace/node pool).

- ### Workspaces
+ Developers can pick between 4-8 CPU and 4-16 GB RAM workspaces (limits).
+
+ Developers have a resource quota of 16 CPU cores and 32 GB RAM (2 maxed-out
+ workspaces); see the capacity sketch after these assumptions.

- TODO
+ However, the Coder agent itself requires at minimum 0.1 CPU cores and 256 MB
+ of memory to run inside a workspace.
+
+ Web microservice development use case: resources are mostly underutilized but
+ spike during builds.
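+
+ To make these assumptions concrete, here is a minimal, illustrative capacity
+ sketch. It bounds the workspace node pool between "only the agent is running"
+ and "every developer maxes out their quota"; the developer count and
+ workspaces-per-developer values are assumptions for the example, not Coder
+ guidance:
+
+ ```python
+ # Assumptions taken from the section above.
+ QUOTA_CPU, QUOTA_MEMORY_GB = 16, 32      # per-developer quota (2 maxed-out workspaces)
+ AGENT_CPU, AGENT_MEMORY_GB = 0.1, 0.256  # minimum per workspace for the Coder agent
+
+ # Illustrative inputs (assumptions, not measurements).
+ developers = 100
+ workspaces_per_developer = 2
+
+ workspaces = developers * workspaces_per_developer
+
+ # Lower bound: every workspace is idle and only the agent is running.
+ min_cpu = workspaces * AGENT_CPU
+ min_memory_gb = workspaces * AGENT_MEMORY_GB
+
+ # Upper bound: every developer maxes out their quota (e.g. during builds).
+ max_cpu = developers * QUOTA_CPU
+ max_memory_gb = developers * QUOTA_MEMORY_GB
+
+ print(f"{workspaces} workspaces: {min_cpu:.0f}-{max_cpu} vCPU, "
+       f"{min_memory_gb:.1f}-{max_memory_gb} GB memory")
+ ```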
+
+ Case study:
+
+ Developers in the up to 2,000 users architecture are split evenly across two
+ regions (each region has its own cluster). In practice, this doesn't change
+ much besides the diagram and the workspaces node pool autoscaling
+ configuration, since it still uses the central provisioner. We recommend
+ multiple provisioner groups for zero-trust and multi-cloud use cases.
+ Developers in the up to 3,000 users architecture are also on an on-premises
+ network. Document a provisioner running in a different cloud environment, and
+ the zero-trust benefits of that setup.
+
+ Scaling formula:
+
+ provisionerd x users: Another formula to consider, focusing on the capacity of
+ provisioner nodes relative to the number of workspace builds triggered by users
+ (see the sketch below).
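+
+ A minimal sketch of the provisionerd x users idea, with purely illustrative
+ numbers: estimate the peak number of concurrent builds and size the provisioner
+ daemons to match, assuming one build per daemon at a time:
+
+ ```python
+ import math
+
+ # Illustrative inputs; none of these numbers are Coder guidance.
+ users = 1000
+ peak_build_fraction = 0.05    # assumed share of users triggering a build at once
+ builds_per_provisionerd = 1   # assumed: one build per provisioner daemon at a time
+
+ peak_concurrent_builds = math.ceil(users * peak_build_fraction)
+ provisioners_needed = math.ceil(peak_concurrent_builds / builds_per_provisionerd)
+ print(f"{peak_concurrent_builds} builds -> {provisioners_needed} provisioner daemons")
+ ```
+
+ If fewer daemons are available than peak concurrent builds, builds typically
+ queue, so undersizing provisioners tends to show up as slower build times
+ rather than failures.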
### Database

- TODO
+ Coder uses a PostgreSQL database.
+
+ Measure and document the impact of `dbcrypt`.

###