feat: add metrics to workspace agent scripts #11132

Emyrk · 2023-12-11T16:18:37Z

What this does

Add metrics for agent scripts

agent_scripts_executed_total{success="true/false",...}

Metric that tracks the total number of scripts executed for a workspace and whether each one succeeded or failed. This includes scripts the agent runs on a cron schedule. It was easy to include, and tracking script failures felt beneficial.

coderd_agentstats_startup_script_s{...} 1.969900304

Metric that indicates how long the startup script took to execute, in seconds.

Other features

Adds template_name to all agent metrics as a label. Example:

agent_scripts_executed_total{agent_name="main",success="true",template_name="docker",username="admin",workspace_name="gvf"}

This allows grouping metrics by template_name, so you can see, for example, the average startup script time for a given template.

Emyrk · 2023-12-11T23:32:05Z

I do not think this is a breaking change. I added a label, template_name, to agent metrics, so all existing queries that collect the metrics will still work. However, new workspaces will produce entirely new series, since the metrics now carry a new label value.

agent/agent.go

agent/agent_test.go

mtojek · 2023-12-13T09:42:41Z

coderd/prometheusmetrics/aggregator_test.go

@@ -119,6 +124,10 @@ func verifyCollectedMetrics(t *testing.T, expected []agentsdk.AgentMetric, actua
 		}

 		dtoLabels := asMetricAgentLabels(d.GetLabel())
+		// dto labels are sorted in alphabetical order.


Maybe we should move this routine to asMetricAgentLabels, so the function produces compliant DTO labels.

The expected labels that we pass in have to be sorted; the DTO ones are already sorted.
I figured it was fine to sort them in the verify function. Otherwise, maybe we would have to move the sort closer to where we define expected?

coder/coderd/prometheusmetrics/aggregator_test.go

Lines 66 to 79 in 8bfa9ad

	expected := []agentsdk.AgentMetric{
		{Name: "a_counter_one", Type: agentsdk.AgentMetricTypeCounter, Value: 1, Labels: commonLabels},
		{Name: "b_counter_two", Type: agentsdk.AgentMetricTypeCounter, Value: 4, Labels: commonLabels},
		{Name: "c_gauge_three", Type: agentsdk.AgentMetricTypeGauge, Value: 5, Labels: commonLabels},
		{Name: "c_gauge_three", Type: agentsdk.AgentMetricTypeGauge, Value: 2, Labels: []agentsdk.AgentMetricLabel{
			{Name: "agent_name", Value: testAgentName},
			{Name: "foobar", Value: "Foobaz"},
			{Name: "hello", Value: "world"},
			{Name: "username", Value: testUsername},
			{Name: "workspace_name", Value: testWorkspaceName},
			{Name: "template_name", Value: testTemplateName},
		}},
		{Name: "d_gauge_four", Type: agentsdk.AgentMetricTypeGauge, Value: 6, Labels: commonLabels},
	}

But this only needs to be sorted when comparing to DTO labels.

I see now what you mean. I guess we can leave it as is, there won't be a huge gain if you refactor it 👍

coderd/database/modelmethods.go

coderd/coderd.go

mtojek · 2023-12-13T09:57:31Z

coderd/batchstats/batcher.go

@@ -13,6 +13,7 @@ import (

 	"cdr.dev/slog"
 	"cdr.dev/slog/sloggers/sloghuman"
+


nit: many of these changes are sneaking in, should we adjust our linter?

I just added this to my editor config, maybe that will fix it:

coder/.golangci.yaml

Line 126 in 8bfa9ad

local-prefixes: coder.com,cdr.dev,go.coder.com,github.com/cdr,github.com/coder

In my experience, the issue is that the formatter does not enforce a single import layout: both the layout in this PR and grouping the imports together are valid. Packages like https://github.com/daixiang0/gci make import ordering deterministic. That would prevent this, but it would require us to set up an extra tool.

agent/metrics.go

agent/agentscripts/agentscripts.go

agent/agent.go

mtojek

LGTM! Feel free to ship it if CI is happy.

github-actions bot assigned Emyrk Dec 11, 2023

Emyrk added 2 commits December 11, 2023 10:40

wip: work on agent script metrics

2f978c3

push startup script metrics to agent

c567eab

Emyrk force-pushed the stevenmasley/agent_script_metrics branch from ebf974d to c567eab Compare December 11, 2023 16:40

Emyrk added 4 commits December 11, 2023 16:04

Add metrics pushing to prometheus

5d39495

make metric a normal prom metric exported, rather than a first class stat

07fe74a

linting/fmt/naming

753f1a6

add template_name label to workspace agent stats

aafd29e

Emyrk changed the title ~~wip: work on agent script metrics~~ feat: workspace agent script metrics Dec 11, 2023

Fix prom tests

6c5a560

Emyrk added 3 commits December 11, 2023 17:37

Update prom docs

ad3f47f

seed template id in test

19428fa

fixup! seed template id in test

8bfa9ad

Emyrk changed the title ~~feat: workspace agent script metrics~~ feat: add metrics to workspace agent script Dec 12, 2023

Emyrk changed the title ~~feat: add metrics to workspace agent script~~ feat: add metrics to workspace agent scripts Dec 12, 2023

Emyrk marked this pull request as ready for review December 12, 2023 15:13

Emyrk requested a review from mtojek December 12, 2023 15:31

mtojek reviewed Dec 13, 2023

Emyrk added 6 commits December 13, 2023 09:08

PR feedback, _s -> seconds, start -> startup

044b942

change string params to a struct

2db41b6

Change the name of the labels argument struct

dd32401

Clarify comment

b4a3607

Remove extra words

c5da250

measure workspace startup closer to script finish in lines of code

eb1d173

Emyrk requested a review from mtojek December 13, 2023 15:19

mtojek approved these changes Dec 13, 2023

Emyrk added 3 commits December 13, 2023 09:34

Fix unit test call args

e803436

Make gen

99483fe

Formatting

2f89b62

Emyrk merged commit b7bdb17 into main Dec 13, 2023

Emyrk deleted the stevenmasley/agent_script_metrics branch December 13, 2023 17:45

github-actions bot locked and limited conversation to collaborators Dec 13, 2023
