-
Notifications
You must be signed in to change notification settings - Fork 874
feat: collect database metrics #17635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be done instead in enablePrometheus()
.
I don't see a point registering the collector if prometheus is not enabled for the deployment.
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
go.mod
Outdated
@@ -489,6 +489,7 @@ require ( | |||
require ( | |||
github.com/coder/preview v0.0.1 | |||
github.com/fsnotify/fsnotify v1.9.0 | |||
github.com/jackc/pgx/v5 v5.7.4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New dependencies aren't too huge, but still insane for just parsing a DSN. I didn't find any other lib as rigorous as this one; open to suggestions.
$ gsa /home/coder/coder/build/coder_2.21.3-devel+9645ed314_linux_amd64 2>/dev/null | grep jackc
│ 0.07% │ github.com/jackc/pgx/v5 │ 202 kB │ vendor │
│ 0.00% │ github.com/jackc/pgservicefile │ 3.0 kB │ vendor │
│ 0.00% │ github.com/jackc/pgpassfile │ 2.8 kB │ vendor │
(source)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much should we care about getting an accurate db name? Some? Just hardcode "coder" and call it a day?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially thinking that, but I suspect seeing db_name=coder
as a label when they're configured it as coder_com_pg
or whatever would cause unnecessary confusion.
It's also possible that multiple coder installations (with separate DBs) exist in the same infra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also possible that multiple coder installations (with separate DBs) exist in the same infra.
True enough, but the dbname is not enforced to be unique across the infra, it's local to a particular DB server instance. So you can already have multiple Coder deployments with separate DBs with the same label.
I suspect seeing db_name=coder as a label when they're configured it as coder_com_pg or whatever would cause unnecessary confusion.
What happens if we set it to ""
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go_sql_idle_connections{db_name=""} 3
Prometheus will drop the empty label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't hate that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, me neither. Cool, lemme refactor 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is assuming whatever scrapes the metrics augments the metrics to add pod/namespace/container labels. Without that, it might be hard to triangulate where these metrics are coming from in a complex environment without some reference to coder
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True for every one of our generic go_
metrics, unfortunately.
I've never heard of a scraping system that doesn't augment with additional labels...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's see how it's received. Easy to change later.
Made the change.
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Currently we don't have a way to get insight into Postgres connections being exhausted.
By using the prometheus'
DBStats
collector, we get some insight out-of-the-box.go_sql_wait_count_total
is the metric I'm most interested in gaining, but the others are also very useful.Changing the prefix is easy (
prometheus.WrapRegistererWithPrefix
), but getting rid of thego_
segment is not quite so easy. I've kept the changeset small for now.NOTE: I imported a library to determine the database name from the given conn string. It's not as simple as one might hope. The database name is used for the
db_name
label.