Commit 6ca6b8c

make fmt
1 parent b0787cd commit 6ca6b8c

1 file changed: +38 -20 lines changed

docs/tutorials/best-practices/scale-coder.md

Lines changed: 38 additions & 20 deletions
@@ -13,23 +13,30 @@ operating smoothly with a high number of active users and workspaces.
 Observability is one of the most important aspects to a scalable Coder
 deployment.

-[Monitor your Coder deployment](../../admin/monitoring/index.md) with log output and metrics to identify potential bottlenecks before they negatively affect the end-user experience and measure the effects of modifications you make to your deployment.
+[Monitor your Coder deployment](../../admin/monitoring/index.md) with log output
+and metrics to identify potential bottlenecks before they negatively affect the
+end-user experience and measure the effects of modifications you make to your
+deployment.

 **Log output**

-- Capture log output from Loki, CloudWatch logs, and other tools on your Coder Server instances and external provisioner
-daemons and store them in a searchable log store.
+- Capture log output from Loki, CloudWatch logs, and other tools on your Coder
+Server instances and external provisioner daemons and store them in a
+searchable log store.

 - Retain logs for a minimum of thirty days, ideally ninety days. This allows
 you to look back to see when anomalous behaviors began.

 **Metrics**

-- Capture infrastructure metrics like CPU, memory, open files, and network I/O for all Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances.
+- Capture infrastructure metrics like CPU, memory, open files, and network I/O
+for all Coder Server, external provisioner daemon, workspace proxy, and
+PostgreSQL instances.

 ### Capture Coder server metrics with Prometheus

-To capture metrics from Coder Server and external provisioner daemons with [Prometheus](../../admin/integrations/prometheus.md):
+To capture metrics from Coder Server and external provisioner daemons with
+[Prometheus](../../admin/integrations/prometheus.md):

 1. Enable Prometheus metrics:

@@ -49,30 +56,36 @@ To capture metrics from Coder Server and external provisioner daemons with [Prom
 CODER_PROMETHEUS_AGGREGATE_AGENT_STATS_BY=agent_name
 ```

-- To disable agent stats:
+- To disable agent stats:

-```yaml
-CODER_PROMETHEUS_COLLECT_AGENT_STATS=false
-```
+```yaml
+CODER_PROMETHEUS_COLLECT_AGENT_STATS=false
+```

-Retain metric time series for at least six months. This allows you to see performance trends relative to user growth.
+Retain metric time series for at least six months. This allows you to see
+performance trends relative to user growth.

-For a more comprehensive overview, integrate metrics with an observability dashboard, for example, [Grafana](../../admin/monitoring/index.md).
+For a more comprehensive overview, integrate metrics with an observability
+dashboard, for example, [Grafana](../../admin/monitoring/index.md).

 ### Observability key metrics

-Configure alerting based on these metrics to ensure you surface problems before they affect the end-user experience.
+Configure alerting based on these metrics to ensure you surface problems before
+they affect the end-user experience.

 **CPU and Memory Utilization**

-- Monitor the utilization as a fraction of the available resources on the instance.
+- Monitor the utilization as a fraction of the available resources on the
+instance.

-Utilization will vary with use throughout the course of a day, week, and longer timelines. Monitor trends and pay special attention to the daily and weekly peak utilization. Use long-term trends to plan infrastructure
-upgrades.
+Utilization will vary with use throughout the course of a day, week, and
+longer timelines. Monitor trends and pay special attention to the daily and
+weekly peak utilization. Use long-term trends to plan infrastructure upgrades.

 **Tail latency of Coder Server API requests**

-- High tail latency can indicate Coder Server or the PostgreSQL database is low
-on resources.
+- High tail latency can indicate Coder Server or the PostgreSQL database is low
+on resources.

 Use the `coderd_api_request_latencies_seconds` metric.

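A note on the tail-latency bullet in the hunk above: "tail latency" usually means a high quantile such as p95 or p99. The sketch below estimates a quantile from cumulative histogram buckets by linear interpolation, which is roughly the approach behind Prometheus's `histogram_quantile`. It assumes `coderd_api_request_latencies_seconds` is exposed as a Prometheus histogram; `estimate_quantile` is a hypothetical helper, and the bucket bounds and counts are made-up sample data, not measurements from a Coder deployment.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets is a list of (upper_bound_seconds, cumulative_count) pairs sorted
    by upper bound; the last bound may be math.inf. Values are assumed to be
    spread evenly inside each bucket (linear interpolation).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("at least one bucket is required")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            # The open-ended bucket has no upper bound to interpolate toward,
            # so fall back to the highest finite bound seen so far.
            if math.isinf(bound) or in_bucket == 0:
                return prev_bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up sample data: (upper bound in seconds, cumulative request count).
SAMPLE_BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p95 = estimate_quantile(0.95, SAMPLE_BUCKETS)
print(f"estimated p95 latency: {p95:.3f}s")
```

Because the estimate depends on bucket boundaries, it is only as precise as the buckets around the quantile of interest; an alert threshold chosen this way is a starting point to tune against observed behavior.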
@@ -86,15 +99,20 @@ Configure alerting based on these metrics to ensure you surface problems before

 ### Locality

-To ensure increased availability of the Coder API, deploy at least three instances. Spread the instances across nodes with anti-affinity rules in
+To ensure increased availability of the Coder API, deploy at least three
+instances. Spread the instances across nodes with anti-affinity rules in
 Kubernetes or in different availability zones of the same geographic region.

 Do not deploy in different geographic regions.

-Coder Servers need to be able to
-communicate with one another directly with low latency, under 10ms. Note that this is for the availability of the Coder API. Workspaces are not fault tolerant unless they are explicitly built that way at the template level.
+Coder Servers need to be able to communicate with one another directly with low
+latency, under 10ms. Note that this is for the availability of the Coder API.
+Workspaces are not fault tolerant unless they are explicitly built that way at
+the template level.

-Deploy Coder Server instances as geographically close to PostgreSQL as possible. Low-latency communication (under 10ms) with Postgres is essential for Coder Server's performance.
+Deploy Coder Server instances as geographically close to PostgreSQL as possible.
+Low-latency communication (under 10ms) with Postgres is essential for Coder
+Server's performance.

 ### Scaling

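A note on the Locality hunk above: the "under 10ms" figure can be spot-checked from a Coder Server host before committing to a topology. The sketch below times TCP connects and reports the median. A connect takes about one network round trip, so this approximates network latency only, not query or API latency. `tcp_connect_latency_ms` and `within_budget` are hypothetical helpers for illustration, not part of Coder.

```python
import socket
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=2.0):
    """Return the median TCP connect time to host:port in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        timings.append(elapsed * 1000.0)
    timings.sort()
    return timings[len(timings) // 2]


def within_budget(latency_ms, budget_ms=10.0):
    """True if the measured latency is under the budget (10ms in the text above)."""
    return latency_ms < budget_ms
```

For example, `within_budget(tcp_connect_latency_ms("postgres.internal", 5432))` would check the path to a database, where `postgres.internal` is a placeholder host name and 5432 is the default PostgreSQL port. Taking the median of several samples keeps a single slow connect from skewing the result.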