Add prometheus metrics for full workspace start up time #10479

Closed
bpmct opened this issue Nov 1, 2023 · 3 comments · Fixed by #11132

@bpmct
Member

bpmct commented Nov 1, 2023

Background

We have customers who are very interested in measuring and optimizing the time it takes for an engineer to create or start a workspace. When a workspace is starting, three primary steps are kicked off:

  • provisioner job (terraform apply to create infrastructure)
    • current metric: coderd_provisionerd_job_timings_seconds
  • waiting for agent to download and connect on workspace
    • current metric: coderd_agents_up (kinda, it's not a timing but it changes from 0 to 1)
  • waiting for startup script to complete
    • current metric: none

We have coderd_provisionerd_job_timings_seconds for provisioner jobs, but nothing that measures how long the startup script takes to run.

Proposal

  • Add a metric for how long the startup script takes to complete
  • Add a metric for how long the agent takes to connect
  • Add an overall metric for total workspace start time

cdr-bot (bot) added the feature label on Nov 1, 2023

@bpmct
Member Author

bpmct commented Nov 1, 2023

I'm not a Prometheus/Grafana expert though, so I may be misunderstanding the best way to measure this.

@Emyrk
Member

Emyrk commented Nov 27, 2023

In v1 I added a way for the agent to push arbitrary metrics, which makes adding workspace-side metrics very easy.

I think right now we send over a custom metrics payload?

@Emyrk
Member

Emyrk commented Dec 8, 2023

Hmm, we only support counters and gauges at present:

coder/agent/metrics.go

Lines 71 to 84 in 78517ca

if metric.Counter != nil {
	collected = append(collected, agentsdk.AgentMetric{
		Name:   metricFamily.GetName(),
		Type:   agentsdk.AgentMetricTypeCounter,
		Value:  metric.Counter.GetValue(),
		Labels: labels,
	})
} else if metric.Gauge != nil {
	collected = append(collected, agentsdk.AgentMetric{
		Name:   metricFamily.GetName(),
		Type:   agentsdk.AgentMetricTypeGauge,
		Value:  metric.Gauge.GetValue(),
		Labels: labels,
	})

Makes some things a bit challenging.
