@@ -24,8 +24,36 @@ deployment.
 - Metrics
   - Capture infrastructure metrics like CPU, memory, open files, and network I/O for all
     Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances.
+  - Capture Coder Server and external provisioner daemon metrics [via Prometheus](#how-to-capture-coder-server-metrics-with-prometheus).
 
-### Capture Coder server metrics with Prometheus
+Retain metric time series for at least six months. This allows you to see
+performance trends relative to user growth.
+
+For a more comprehensive overview, integrate metrics with an observability
+dashboard like [Grafana](../../admin/monitoring/index.md).
+
+### Observability key metrics
+
+Configure alerting based on these metrics to ensure you surface problems before
+they affect the end-user experience.
+
+- CPU and Memory Utilization
+  - Monitor the utilization as a fraction of the available resources on the instance.
+
+    Utilization will vary with use throughout the course of a day, week, and longer timelines.
+    Monitor trends and pay special attention to the daily and weekly peak utilization.
+    Use long-term trends to plan infrastructure upgrades.
+
+- Tail latency of Coder Server API requests
+  - High tail latency can indicate Coder Server or the PostgreSQL database is underprovisioned
+    for the load.
+  - Use the `coderd_api_request_latencies_seconds` metric.
+
+- Tail latency of database queries
+  - High tail latency can indicate the PostgreSQL database is low in resources.
+  - Use the `coderd_db_query_latencies_seconds` metric.
+
+### How to capture Coder server metrics with Prometheus
 
 Edit your Helm `values.yaml` to capture metrics from Coder Server and external provisioner daemons with
 [Prometheus](../../admin/integrations/prometheus.md):
@@ -56,33 +84,6 @@ Edit your Helm `values.yaml` to capture metrics from Coder Server and external p
 CODER_PROMETHEUS_COLLECT_AGENT_STATS=false
 ```
 
-Retain metric time series for at least six months. This allows you to see
-performance trends relative to user growth.
-
-For a more comprehensive overview, integrate metrics with an observability
-dashboard like [Grafana](../../admin/monitoring/index.md).
-
-### Observability key metrics
-
-Configure alerting based on these metrics to ensure you surface problems before
-they affect the end-user experience.
-
-- CPU and Memory Utilization
-  - Monitor the utilization as a fraction of the available resources on the instance.
-
-    Utilization will vary with use throughout the course of a day, week, and longer timelines.
-    Monitor trends and pay special attention to the daily and weekly peak utilization.
-    Use long-term trends to plan infrastructure upgrades.
-
-- Tail latency of Coder Server API requests
-  - High tail latency can indicate Coder Server or the PostgreSQL database is underprovisioned
-    for the load.
-  - Use the `coderd_api_request_latencies_seconds` metric.
-
-- Tail latency of database queries
-  - High tail latency can indicate the PostgreSQL database is low in resources.
-  - Use the `coderd_db_query_latencies_seconds` metric.
-
 ## Coder Server
 
 ### Locality
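
Note on the tail-latency bullets moved by this change: they point at `coderd_api_request_latencies_seconds` and `coderd_db_query_latencies_seconds` without showing how a tail percentile is derived. The sketch below is one rough way to illustrate it, assuming both metrics are exposed as Prometheus histograms with cumulative buckets. The function name `estimate_quantile` and the bucket values are hypothetical and chosen only for illustration. The interpolation is similar in spirit to PromQL's `histogram_quantile`, not a copy of it.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound_seconds, cumulative_count) pairs,
    where the last upper bound is math.inf, as in a Prometheus histogram.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total  # number of observations at or below the quantile
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the overflow bucket: report the
                # highest finite bound, since no upper limit is known.
                return prev_bound
            if count == prev_count:
                return bound  # empty bucket, avoid dividing by zero
            # Linear interpolation inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical API latency buckets: (upper bound in seconds, cumulative count).
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p95 = estimate_quantile(0.95, example_buckets)
print(f"estimated p95 latency: {p95:.3f}s")
```

An alert on tail latency would compare an estimate like this against a threshold that fits the deployment. The right threshold and percentile are deployment-specific and are not stated in this change.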