feat(coderd/database): keep only 1 day of workspace_agent_stats after rollup (#12674)
```diff
@@ -90,7 +90,35 @@ ORDER BY
 	date ASC;
 
 -- name: DeleteOldWorkspaceAgentStats :exec
-DELETE FROM workspace_agent_stats WHERE created_at < NOW() - INTERVAL '180 days';
+DELETE FROM
+	workspace_agent_stats
+WHERE
+	created_at < (
+		SELECT
+			COALESCE(
+				-- When generating initial template usage stats, all the
+				-- raw agent stats are needed, after that only ~30 mins
+				-- from last rollup is needed. Deployment stats seem to
+				-- use between 15 mins and 1 hour of data. We keep a
+				-- little bit more (1 day) just in case.
+				MAX(start_time) - '1 days'::interval,
+				-- Fall back to 6 months ago if there are no template
+				-- usage stats so that we don't delete the data before
+				-- it's rolled up.
+				NOW() - '6 months'::interval
+			)
+		FROM
+			template_usage_stats
+	)
+	AND created_at < (
+		-- Delete at most in batches of 3 days (with a batch size of 3 days, we
+		-- can clear out the previous 6 months of data in ~60 iterations) whilst
+		-- keeping the DB load relatively low.
+		SELECT
+			COALESCE(MIN(created_at) + '3 days'::interval, NOW())
+		FROM
+			workspace_agent_stats
+	);
 
 -- name: GetDeploymentWorkspaceAgentStats :one
 WITH agent_stats AS (
```

Review thread on this hunk:

Comment: This change may be a little too soon to drop. If customers identify problems with the release, they will decide to roll back to the previous version. Boom!

Reply: I agree, I made this a separate PR so we can defer it for later.

Reply: Rollbacks require db backups at present. We just do not support downgrades, and not supported like, you can't even really do it. You have to run the
Comment: Is there a maximum number of rows that could be deleted here? Would it make sense to add a limit on the number of rows deleted? My thinking is that we could choose a reasonably large upper limit and hopefully converge after a few iterations.
Reply: Not a bad idea. Limiting would only be relevant the first time(s) this query runs; after that there's no point. But if we want to avoid the initial performance penalty of doing a huge delete, it would be sensible. There's no limit for deletes, though, so we could instead rely on selecting the min(created_at) and only deleting 1 (or N) days of data.
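For context on why "there's no limit for deletes": Postgres does not support `DELETE ... LIMIT` directly, so a hard cap on deleted rows would need a workaround such as a `ctid` subquery. A minimal sketch of that alternative, where the `100000` cap and the 6-month cutoff are illustrative values, not taken from this PR:

```sql
-- Hypothetical row-capped delete: Postgres DELETE has no LIMIT clause,
-- so the cap is applied via a ctid subquery that selects the oldest rows.
DELETE FROM workspace_agent_stats
WHERE ctid IN (
	SELECT ctid
	FROM workspace_agent_stats
	WHERE created_at < NOW() - INTERVAL '6 months'
	ORDER BY created_at ASC
	LIMIT 100000
);
```

The PR instead bounds each run by a time window derived from `min(created_at)`, which sidesteps the missing `LIMIT` entirely.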
Reply: I think deleting the oldest day of data every interval will probably work fine here. The magnitude of that data should scale with the size of the deployment. We'll have to assume the database is correctly sized to handle that level of deletes, but if not, that is likely a sizing issue with the database itself in relation to the rest of the deployment.
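The "delete the oldest day of data every interval" approach could be sketched as below. This is a simplified illustration only, not the query merged in this PR, which uses 3-day batches and additionally gates deletion on the `template_usage_stats` rollup cutoff:

```sql
-- Sketch: each run removes at most the oldest day of data, so the work
-- per interval stays proportional to one day's worth of agent stats.
-- COALESCE handles the empty-table case (MIN returns NULL).
DELETE FROM workspace_agent_stats
WHERE created_at < (
	SELECT COALESCE(MIN(created_at) + '1 day'::interval, NOW())
	FROM workspace_agent_stats
);
```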