Skip to content

feat(coderd): add times_used to coder_apps in insights API #13292

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mafredri
Copy link
Member

For now, only applied to coder_apps, same logic can be implemented for
VS Code, SSH, etc.

Part of #13099

Copy link
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @mafredri and the rest of your teammates on Graphite Graphite

@mafredri mafredri requested review from mtojek, johnstcn and stirby May 16, 2024 11:53
@@ -249,7 +249,7 @@ WITH
apps.slug,
apps.display_name,
apps.icon,
tus.app_usage_mins
(tus.app_usage_mins -> apps.slug)::smallint AS usage_mins
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by optimization to avoid jsonb_each in the following CTE.

@mafredri mafredri marked this pull request as ready for review May 16, 2024 11:55
For now, only applied to `coder_app`s, same logic can be implemented for
VS Code, SSH, etc.

Part of #13099
@mafredri mafredri force-pushed the mafredri/05-16-feat_coderd_add_times_used_to_coder_app_s_in_insights_api branch from 8b606bf to 7347b81 Compare May 16, 2024 12:00
-- there can be duplicate start_times. To handle this, we use the
-- dense_rank() function, otherwise row_number() would suffice.
start_time - (
dense_rank() OVER (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL dense_rank()

times_used AS (
SELECT DISTINCT ON (user_id, slug, display_name, icon, uniq)
slug,
display_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: If somebody changes the display_name for the app, will it break the calculation? Probably a question related with other subqueries too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll handle it according to current behavior -- a change in display_name will be treated as a new app.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it should nit as technically the app is the same app.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends, in aggregation across the whole deployment (which we don't show, but is possible), the apps might not be the same at all. For now I would not want to introduce one feature that behaves differently from the rest, we already show multiple entries for apps and parameters if something (like the display name) has changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I am aware if the existing behavior and suggesting that we should change it sometimes in the future to be more robust to cosmetic changes to a coder_app.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matifali actually, have you recently confirmed this is still a problem? I think this is mainly a problem with parameters. If the app name changes it should only use the most recent name, at least since the rollup refactor.

(SUM(ai.usage_mins) * 60)::bigint AS usage_seconds,
COALESCE((
SELECT
COUNT(*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just about cardinality of app usage, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are only interested in rows of this table, not of it's content.

Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but would be interesting to see an EXPLAIN before vs an EXPLAIN after.

Comment on lines +284 to +304
times_used AS (
SELECT DISTINCT ON (user_id, slug, display_name, icon, uniq)
slug,
display_name,
icon,
-- Turn start_time into a unique identifier that identifies a users
-- continuous app usage. The value of uniq is otherwise garbage.
--
-- Since we're aggregating per user app usage across templates,
-- there can be duplicate start_times. To handle this, we use the
-- dense_rank() function, otherwise row_number() would suffice.
start_time - (
dense_rank() OVER (
PARTITION BY
user_id, slug, display_name, icon
ORDER BY
start_time
) * '30 minutes'::interval
) AS uniq
FROM
template_usage_stats_with_apps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What sort of additional load does this place on the DB?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't measure but I don't expect it to be a lot, this is similar in weight (or lighter) as the app_insights CTE and could theoretically be run in parallel (don't know if PG will, though).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted the explains, it seems to be parallel and very cheap.

@mafredri
Copy link
Member Author

mafredri commented May 16, 2024

LGTM, but would be interesting to see an EXPLAIN before vs an EXPLAIN after.

Don't have a good dataset to test, but with a little bit of data it looks like this:

Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mafredri mafredri merged commit a0fce36 into main May 16, 2024
31 checks passed
@mafredri mafredri deleted the mafredri/05-16-feat_coderd_add_times_used_to_coder_app_s_in_insights_api branch May 16, 2024 13:53
@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2024
@stirby
Copy link
Collaborator

stirby commented May 16, 2024

Awesome!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants