Skip to content

fix(coderd/database): exclude canceled jobs in queue position #15835

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 12, 2024

Conversation

johnstcn
Copy link
Member

When calculating the queue position in GetProvisionerJobsByIDsWithQueuePosition we only counted jobs with started_at = NULL. This is misleading, as it allows canceling or canceled jobs to take up rows in the computed queue position, giving an impression that the queue is larger than it really is.

This modifies the query to also exclude jobs with a null canceled_at, completed_at, or error field for the purposes of calculating the queue position, and also adds a test to validate this behaviour.

(Note: due to the behaviour of dbgen.ProvisionerJob with dbmem I had to use other proxy methods to validate the corresponding dbmem implementation.)

@johnstcn johnstcn self-assigned this Dec 11, 2024
@@ -50,23 +50,29 @@ WHERE
id = ANY(@ids :: uuid [ ]);

-- name: GetProvisionerJobsByIDsWithQueuePosition :many
WITH unstarted_jobs AS (
WITH pending_jobs AS (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review: Renamed for clarity.

Comment on lines +60 to +65
AND
canceled_at IS NULL
AND
completed_at IS NULL
AND
error IS NULL
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review: this is in line with the generated provisioner job status:

CASE
    WHEN (completed_at IS NOT NULL) THEN
    CASE
        WHEN (error <> ''::text) THEN 'failed'::provisioner_job_status
        WHEN (canceled_at IS NOT NULL) THEN 'canceled'::provisioner_job_status
        ELSE 'succeeded'::provisioner_job_status
    END
    ELSE
    CASE
        WHEN (error <> ''::text) THEN 'failed'::provisioner_job_status
        WHEN (canceled_at IS NOT NULL) THEN 'canceling'::provisioner_job_status
        WHEN (started_at IS NULL) THEN 'pending'::provisioner_job_status
        ELSE 'running'::provisioner_job_status
    END
END)

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fixing the mentioned issue, this LGTM, nice work. Down the line I think we need to address provisioner matching for these queries too.

AND
completed_at IS NULL
AND
error IS NULL
),
queue_position AS (
SELECT
id,
ROW_NUMBER() OVER (ORDER BY created_at ASC) AS queue_position
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to amend this to take into account matching tags. Queue position could be 1/3 for the provisioners that match, but 20/50 on a larger scale (but that's nonsensical). Do we have an issue that tracks this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Account matching tags? Is that provisioner tags?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take into account 😄. Matching provisioner tags yes 👍🏻

Copy link
Member Author

@johnstcn johnstcn Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will create a follow-up issue to track this. 👍

Edit: #15843

Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat we are adding this kinda of stuff 👍

@@ -3804,35 +3804,92 @@ func (q *FakeQuerier) GetProvisionerJobsByIDsWithQueuePosition(_ context.Context
q.mutex.RLock()
defer q.mutex.RUnlock()

jobs := make([]database.GetProvisionerJobsByIDsWithQueuePositionRow, 0)
queuePosition := int64(1)
// WITH pending_jobs AS (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is lovely, but I can't wait to get rid of dbmem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a love/hate relationship with dbmem. Part of me will be a bit sad 😆

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m with you @Emyrk, many times it has been a nice way to verify query logic/results by forcing you to reproduce the logic in an alternative way. Still going to be happy that it’s gone though 🥲

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
@johnstcn johnstcn merged commit 36c2cf8 into main Dec 12, 2024
30 checks passed
@johnstcn johnstcn deleted the cj/ignore-canceled-provisioner-jobs-in-queue-position branch December 12, 2024 12:37
@github-actions github-actions bot locked and limited conversation to collaborators Dec 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants