Skip to content

feat: add metrics to PGPubsub #11971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 1, 2024
Merged

feat: add metrics to PGPubsub #11971

merged 1 commit into from
Feb 1, 2024

Conversation

spikecurtis
Copy link
Contributor

Adds prometheus metrics to PGPubsub for monitoring its health and performance in production.

Related to #11950 --- additional diagnostics to help figure out what's happening

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @spikecurtis and the rest of your teammates on Graphite Graphite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file isn't new, it's just renamed from pubsub_test.go since it was Linux only and the metrics tests runs on any platform.

@spikecurtis spikecurtis force-pushed the spike/11950-pubsub-metrics branch from bbf6e1b to 899ec80 Compare February 1, 2024 05:19
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a few questions/suggestions.

p.logger.Debug(p.ctx, "publish", slog.F("event", event), slog.F("message_len", len(message)))
// This is safe because we are calling pq.QuoteLiteral. pg_notify doesn't
// support the first parameter being a prepared statement.
//nolint:gosec
_, err := p.db.ExecContext(p.ctx, `select pg_notify(`+pq.QuoteLiteral(event)+`, $1)`, message)
if err != nil {
p.publishesTotal.WithLabelValues("false").Inc()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does false label represent error here? Is that common prom metric practice? (Just checking since I'm not that experienced adding prom metrics.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label name is success with values true or false. Just trying to keep our labeling consistent within the product and that's how we do it at

}, []string{"success"}),

p.mut.Lock()
defer p.mut.Unlock()
defer func() {
if err != nil {
// if we hit an error, we need to close the queue so we don't
// leak its goroutine.
newQ.close()
p.subscribesTotal.WithLabelValues("false").Inc()
} else {
p.subscribesTotal.WithLabelValues("true").Inc()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about including the event as well? Might be useful to see if there's an abnormal amount of subs/pubs/data going for a specific event.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bad idea because some events include the UUID of the thing, e.g. workspace, which would make the set of label values unbounded, which is a cardinal sin in metrics.

Not for this PR, but if we decide we really want that information, we could standardize event channel names to be a struct with a static root and then zero or more specific IDs, and then we could label the publishes, subscribes and messages with the static root.

@spikecurtis spikecurtis force-pushed the spike/11950-pubsub-metrics branch from 899ec80 to ff526cc Compare February 1, 2024 06:59
@spikecurtis spikecurtis merged commit 5a359d5 into main Feb 1, 2024
Copy link
Contributor Author

Merge activity

@spikecurtis spikecurtis deleted the spike/11950-pubsub-metrics branch February 1, 2024 07:25
@github-actions github-actions bot locked and limited conversation to collaborators Feb 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants