-
Notifications
You must be signed in to change notification settings - Fork 896
feat: implement expiration policy logic for prebuilds #17996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement expiration policy logic for prebuilds #17996
Conversation
8977d49
to
5a27ee4
Compare
// filterExpiredWorkspaces splits running workspaces into expired and non-expired | ||
// based on the preset's InvalidateAfterSecs TTL. If TTL is missing or zero, | ||
// all workspaces are considered non-expired. | ||
func filterExpiredWorkspaces(preset database.GetTemplatePresetsWithPrebuildsRow, runningWorkspaces []database.GetRunningPrebuiltWorkspacesRow) (nonExpired []database.GetRunningPrebuiltWorkspacesRow, expired []database.GetRunningPrebuiltWorkspacesRow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There’s an alternative to perform this filtering directly in the database, but for now, opted for an in-memory approach to keep the implementation simpler and the PR smaller. The main trade-off here is performance at scale, while in-memory filtering is sufficient for the current scale, we may want to revisit a DB-based solution if we encounter issues at higher scale. Let me know if you think a database solution would be a better fit now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, the un-filtered data still comes directly from the database but is filtered in-memory.
From what I can tell, the changes required to move this logic inside the database would involve passing in the preset TTL and doing the TTL calculation inside the database. Is that correct?
If so, I agree that it can be kept in-memory for the moment, but we should add a follow-up issue to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, the un-filtered data still comes directly from the database but is filtered in-memory.
Yes, that's correct, the filtering currently happens in-memory after fetching all the running prebuilds (expired and non-expired) from the database.
From what I can tell, the changes required to move this logic inside the database would involve passing in the preset TTL and doing the TTL calculation inside the database. Is that correct?
Right, moving it into the database would require doing the TTL check within the query. We don’t need to pass the preset TTL manually since it's already stored in the database, so the query can join against the template_version_presets
table and compute the expiration inline.
We’d also need to update the current GetRunningPrebuiltWorkspacesRow
function to return only the non-expired running prebuilds.
If so, I agree that it can be kept in-memory for the moment, but we should add a follow-up issue to track this.
I’ll create a follow-up issue to track moving it into the database for better efficiency/scalability 👍
// - Running: prebuilds running and non-expired | ||
// - Expired: prebuilds running and expired due to the preset's TTL | ||
// - InProgress: prebuilds currently in progress | ||
// - Backoff: holds failure info to decide if prebuild creation should be backed off | ||
type PresetSnapshot struct { | ||
Preset database.GetTemplatePresetsWithPrebuildsRow | ||
Running []database.GetRunningPrebuiltWorkspacesRow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not a fan of the Running
field name here, it’s a bit misleading since now it only includes non-expired running prebuilds. Maybe something like RunningValid
would make that clearer. Open to other naming ideas too if anyone has suggestions.
I can make this change in a follow-up PR to avoid adding more entropy here.
5a27ee4
to
ef33f48
Compare
ef33f48
to
5538be5
Compare
5538be5
to
a38ee7c
Compare
LGTM, but given my proximity to the project, I'll defer formal approval to an independent reviewer. |
// filterExpiredWorkspaces splits running workspaces into expired and non-expired | ||
// based on the preset's InvalidateAfterSecs TTL. If TTL is missing or zero, | ||
// all workspaces are considered non-expired. | ||
func filterExpiredWorkspaces(preset database.GetTemplatePresetsWithPrebuildsRow, runningWorkspaces []database.GetRunningPrebuiltWorkspacesRow) (nonExpired []database.GetRunningPrebuiltWorkspacesRow, expired []database.GetRunningPrebuiltWorkspacesRow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, the un-filtered data still comes directly from the database but is filtered in-memory.
From what I can tell, the changes required to move this logic inside the database would involve passing in the preset TTL and doing the TTL calculation inside the database. Is that correct?
If so, I agree that it can be kept in-memory for the moment, but we should add a follow-up issue to track this.
…builds-cache-invalidation
Despite |
You’re right that I agree it would be valuable to add tests at the |
Summary
This PR introduces support for expiration policies in prebuilds. The TTL (time-to-live) is retrieved from the Terraform configuration (terraform-provider-coder PR):
Note: Since there is no need for precise TTL enforcement down to the second, in this implementation expired prebuilds are handled in a single reconciliation cycle: they are deleted, and new instances are created only if needed to match the desired count.
Changes
Prebuilt workspaces
pageterraform-provider-coder
to version 2.5.0: https://github.com/coder/terraform-provider-coder/releases/tag/v2.5.0Depends on: coder/terraform-provider-coder#404
Fixes: #17916