feat: fetch prebuilds metrics state in background #17792
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
5da546e to e73dae6
e73dae6 to fcbfb7f
@@ -55,20 +57,34 @@ var (
		labels,
		nil,
	)
	lastUpdateDesc = prometheus.NewDesc(
		"coderd_prebuilt_workspaces_metrics_last_updated",
		"The unix timestamp when the metrics related to prebuilt workspaces were last updated; these metrics are cached.",
Is a unix timestamp easy to alert on? For example, can you write something like unix_now() - metric_value > 1000 in Grafana and similar tools? If not, it might be better if this were a duration since the last successful fetch instead.
+1 from me for duration since last successful fetch
The idiomatic approach is to use unix timestamps; see prometheus_config_last_reload_success_timestamp_seconds.
So I guess we have an existing metric for the coder server start timestamp?
I don't think so (or at least not one we export), but I think as long as this metric is updated relative to itself and up is taken into consideration, it should be useful.
Collect() is called whenever the /metrics endpoint is hit to retrieve metrics. The queries used in prebuilds metrics collection are quite heavy, and we want to avoid having them run concurrently or too often, to keep db load down.
Here I'm moving towards a background retrieval of the state required to set the metrics, which gets invalidated every interval.
Also introduces coderd_prebuilt_workspaces_metrics_last_updated, which operators can use to determine when these metrics go stale.
See #17789 as well.