Skip to content

fix(coderd/database): consider tag sets when calculating queue position #16685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Mar 3, 2025

Conversation

evgeniy-scherbina
Copy link
Contributor

@evgeniy-scherbina evgeniy-scherbina commented Feb 24, 2025

Relates to #15843

PR Contents

  • Reimplementation of the GetProvisionerJobsByIDsWithQueuePosition SQL query to take into account provisioner job tags and provisioner daemon tags.
  • Unit tests covering different tag sets, job statuses, and job ordering scenarios.

Notes

  • The original row order is preserved by introducing the ordinality field.
  • Unnecessary rows are filtered as early as possible to ensure that expensive joins operate on a smaller dataset.
  • A "fake" join with provisioner_jobs is added at the end to ensure sqlc.embed compiles successfully.
  • Backward compatibility is preserved—only the SQL query has been updated, while the Go code remains unchanged.

@evgeniy-scherbina evgeniy-scherbina changed the title reimplement GetProvisionerJobsByIDsWithQueuePosition query fix(coderd/database): consider tag sets when calculating queue position Feb 24, 2025
@evgeniy-scherbina evgeniy-scherbina marked this pull request as ready for review February 25, 2025 19:27
Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good, the SQL query is compelx, but well structured.

Can we ensure we have unit tests for the following cases:

  • N jobs (1 job with 0 tags) & 0 provisioners exist
  • N jobs (1 with 0 tags) & N provisioners
    • 1 job not matching any provisioner
  • 0 jobs & 0 provisioners
  • N jobs with at least 1 being completed (not pending)

Just want to cover all cases

Comment on lines 89 to 90
COALESCE(MIN(rj.queue_position), 0) :: BIGINT AS queue_position, -- Best queue position across provisioners
COALESCE(MAX(rj.queue_size), 0) :: BIGINT AS queue_size, -- Max queue size across provisioners
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this 0. Should we instead use -1 to indicate no valid queue_position or queue_size?

0 could be misinterpreted as "complete" imo. -1 is more obviously "not in a queue".

@evgeniy-scherbina evgeniy-scherbina force-pushed the 15843-queue-position branch 2 times, most recently from 59cd0fd to 29048ec Compare February 26, 2025 21:06
@evgeniy-scherbina
Copy link
Contributor Author

@Emyrk I added test cases according to your recommendations: e7693ee

Also in previous version I made a mistake, I filtered out provisioner jobs by passed @ids too early, meaning if jobID is not passed to GetProvisionerJobsByIDsWithQueuePosition it won't be considered during queue_pos & queue_size calculation - so I fixed it and added test cases to cover it, they use this approach:

// GetProvisionerJobsByIDsWithQueuePosition takes jobIDs as a parameter.
// If skipJobIDs is empty, all jobs are passed to the function; otherwise, the specified jobs are skipped.
// NOTE: Skipping job IDs means they will be excluded from the result,
// but this should not affect the queue position or queue size of other jobs.
skipJobIDs map[int]struct{}

Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

Emyrk
Emyrk previously requested changes Feb 28, 2025
Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot about dbmem. 😢

SQL and tests look good 👍

@evgeniy-scherbina evgeniy-scherbina force-pushed the 15843-queue-position branch 2 times, most recently from 45f8bc6 to aee0a11 Compare February 28, 2025 22:43
@@ -0,0 +1 @@
CREATE INDEX idx_provisioner_jobs_status ON provisioner_jobs USING btree (job_status);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: it took me a minute to figure out if this is supported, but apparently it is! The docs on generated columns specify that only stored generated columns are supported, and these are always written to disk. TIL!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnstcn
yes, that's what I figured as well
Also I thought about using Hash Index instead of BTree, because so far we're only using = or != operations, but wasn't sure, maybe we'll need other operations in the future.

Copy link
Member

@johnstcn johnstcn Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this is probably fine, almost all of our other indexes are using BTree. If you were curious, I suppose you could validate with some EXPLAINs on a sample dataset.

name: "test-case-12",
jobTags: []database.StringMap{},
daemonTags: []database.StringMap{},
queueSizes: nil, // TODO(yevhenii): should it be empty array instead?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I prefer to return an empty slice for something that is always expected to be present. However, as these responses end up being converted to Typescript objects, we kind of always have to specify arrays as being nullable due to Go's type system. @Emyrk may have some more thoughts here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for a test case. I also generally prefer an empty slice in this case for the equality assertion. But I have no strong opinion in this test case.


For the typescript side of things, (off the top of my head an quick glancing) slices are never null. So a nil slice --> [] on the typescript side of things.

readonly audit_logs: readonly AuditLog[];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obligatory reminder to check migration number before merge!

@johnstcn johnstcn dismissed Emyrk’s stale review March 3, 2025 14:29

dbmem is done now

Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Changes look good to me! Nice tests!

@evgeniy-scherbina evgeniy-scherbina merged commit b85ba58 into main Mar 3, 2025
30 checks passed
@evgeniy-scherbina evgeniy-scherbina deleted the 15843-queue-position branch March 3, 2025 15:02
@github-actions github-actions bot locked and limited conversation to collaborators Mar 3, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants