Prometheus: expose Coder insights as metrics #9983

Closed
mtojek opened this issue Oct 2, 2023 · 8 comments · Fixed by #10574

@mtojek
Member

mtojek commented Oct 2, 2023

Our customers may find it beneficial if we expose Coder insights via the Prometheus endpoint, so they can integrate them with their observability solutions.

Items:

  • users and session time
  • expose license utilization somewhere, trailing 90-day active users
  • show the license limit on the graph

mtojek self-assigned this Oct 2, 2023
cdr-bot added the feature label Oct 2, 2023

@spikecurtis
Contributor

How does this work with multiple Coderd replicas?

@mtojek
Member Author

mtojek commented Oct 2, 2023

The same way as the other collectors, like the ones in prometheusmetrics.go. Technically, you can't sum up this data.

Do you see any workaround or a better solution we can adopt, @spikecurtis?

@spikecurtis
Contributor

> Technically, you can't sum up this data.

It's probably fine that you can't sum, as long as other aggregations like max or avg give sensible results. Whether this is true depends on how the metrics are computed.

For example, if a metric is computed entirely from "deployment"-wide state, like a database query, such that one can reasonably expect that all coderds will compute and expose the same value, then that's probably fine.

However, if a metric is computed from state that is local to each coderd, then we need to understand how the different values interact such that there is no under- or over-counting.
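To illustrate the deployment-wide case, here is a minimal Go sketch. It assumes three replicas that each ran the same database query and therefore report the same value; the sample values and the helper names (`sumOf`, `maxOf`, `avgOf`) are hypothetical and not taken from the Coder codebase.

```go
package main

import "fmt"

// sumOf adds up one sample per replica.
func sumOf(vs []float64) float64 {
	total := 0.0
	for _, v := range vs {
		total += v
	}
	return total
}

// maxOf returns the largest sample, or 0 for an empty slice.
func maxOf(vs []float64) float64 {
	if len(vs) == 0 {
		return 0
	}
	m := vs[0]
	for _, v := range vs[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

// avgOf returns the mean of the samples, or 0 for an empty slice.
func avgOf(vs []float64) float64 {
	if len(vs) == 0 {
		return 0
	}
	return sumOf(vs) / float64(len(vs))
}

func main() {
	// Hypothetical: three replicas each report 42 active users,
	// all derived from the same deployment-wide database state.
	perReplica := []float64{42, 42, 42}

	fmt.Println("sum:", sumOf(perReplica)) // over-counts by the number of replicas
	fmt.Println("max:", maxOf(perReplica)) // matches the deployment-wide value
	fmt.Println("avg:", avgOf(perReplica)) // matches the deployment-wide value
}
```

If the replicas computed their values from local state instead, none of these three aggregations would be guaranteed to give the right answer, which is the case that needs closer analysis.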

@mtojek
Member Author

mtojek commented Oct 2, 2023

> For example, if a metric is computed entirely from "deployment"-wide state, like a database query, such that one can reasonably expect that all coderds will compute and expose the same value, then that's probably fine.

This is the only use case we have. Query a database, transform results, and expose them as metrics. One can argue that the same query is executed N times, where N is the number of replicas.

@bpmct
Member

bpmct commented Oct 2, 2023

Is Prometheus the right tool for user session data?

@mtojek
Member Author

mtojek commented Oct 3, 2023

> Is Prometheus the right tool for user session data?

What do you mean by data? Personal data like emails or IDs? I think we're already exposing them through the Prometheus channel.

BTW, I copied this idea from the sprint theme in the Notion doc. Isn't that what you suggested? Alternatively, we can focus on generic stats first, like "total active users per template", "used parameters with values", etc.

@spikecurtis
Contributor

Wait, are these metrics per user, or are they aggregates like DAUs?

I wouldn't expect someone to go to Prometheus to answer a question like, "How much has Marcin used Coder this last month?"

@mtojek
Member Author

mtojek commented Oct 5, 2023

I will go with the "safe" metrics first, and leave the user-related ones for later 👍

Battle plan:

3 participants