-
Notifications
You must be signed in to change notification settings - Fork 962
fix: don't create autostart workspace builds with no available provisioners #19067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
provisioner is not available Signed-off-by: Callum Styan <callumstyan@gmail.com>
transition so that we don't accumulate autostart jobs that will eventually have to be cleaned up by the reaper routine Signed-off-by: Callum Styan <callumstyan@gmail.com>
DB/provisioner to be available, so add a way to skip the check Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
…ndling This commit addresses the requirements to: 1. Comment out provisionerCloser.Close() in TestExecutorAutostartSkipsWhenNoProvisionersAvailable 2. Remove the need for SkipProvisionerCheck by refactoring tests Changes: - Remove SkipProvisionerCheck field and logic from Executor - Remove SkipProvisionerCheck option from coderdtest.Options - Add proper provisioner daemon tags to match template requirements - Add helper functions for waiting for provisioner availability - Update tests to use real provisioner availability checks - Fix timing issues in provisioner staleness tests - Comment out provisioner close call to test proper behavior The test now correctly validates that provisioners remain available when not explicitly closed, and uses the real hasAvailableProvisioners logic instead of bypassing it. Signed-off-by: Callum Styan <callumstyan@gmail.com>
9971794
to
78a8f5e
Compare
autobuild path have a valid provisioner available and pass an appropriate time.Time to the channel for triggering the build based on whether the provisioner is supposed to be valid or stale Signed-off-by: Callum Styan <callumstyan@gmail.com>
Okay @deansheather thanks for your patience, I think this is finally ready for another review. Took quite a bit of digging with the help of Blink to find all the bits and pieces that had to be updated to get the tests working without the The TL; DR of the changes is that we in tests which deal with autobuilds we need:
|
TestExecutorAutostartSkipsWhenNoProvisionersAvailable Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
any before that Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
p, err := coderdtest.GetProvisionerForTags(db, time.Now(), workspace.OrganizationID, map[string]string{}) | ||
require.NoError(t, err) | ||
// When: the autobuild executor ticks after the scheduled time | ||
go func() { | ||
tickCh <- sched.Next(workspace.LatestBuild.CreatedAt) | ||
tickTime := sched.Next(workspace.LatestBuild.CreatedAt) | ||
coderdtest.UpdateProvisionerLastSeenAt(t, db, p.ID, tickTime) | ||
tickCh <- tickTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W.R.T the last comment from the previous review, this is what I'm referring to. Is this change (and other similar changes) necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, AFAICT it is. Otherwise the test is flakey, it would only pass if you run the test within StaleInterval
seconds of the top of any hour.
When one of these tests starts, the provisioner has a LastSeenAt
time of time.Now()
(sometimes + 1 second or so), but sched.Next(workspace.LatestBuild.CreatedAt)
will return a value that is the top of the hour.
For example: right now if I comment out the UpdateProvisionerLastSeenAt
line here, in hasValidProvisioner
the pd.LastSeenAt
time is 2025-08-12 16:34:02.908042 +0000 UTC
but t
from the tickCh
is 2025-08-12 17:00:00 +0000 UTC
. So in a real running deployment of Coder that provisioner would be considered stale.
This is the exact thing we were trying to solve for with the addition of hasValidProvisioner
. So any test case which utilized the autobuild functionality but needed to continue passing needed to update the LastSeenAt
value. In the test I have added, which should see the provisioner as stale and then assert that there were no autobuild transitions initiated, we ensure the LastSeenAt
time is far enough in the past that the provisioner is stale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh I get it, because the autostart time is way in the future.
Perhaps a better option for these autostarty tests to set a value on the lifecycle executor like TestIgnoreNoDaemons: true
, and then just set it to true for these tests. Avoids having to get the provisioners and update them right before ticking, which feels like it could be racey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that essentially just going back to what I had originally? #19067 (comment)
I guess we're doing the configuration in a cleaner way, but we'd still be setting that value to true for essentially all of these tests, which is what we wanted to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think I may have looped... sorry. I think what you have now is good then 👍
new test Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
nil here Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
This should fix #17941
With just the test addition the test will fail since the workspace build is triggered, we attempt to move the workspace into the
start
state:require.Equal(t, codersdk.WorkspaceTransitionStop, workspace.LatestBuild.Transition)
With the 2nd commit we skip the calls to
wsbuilder.New
andbuilder.Build
and so the transition is never scheduled.However, many tests don't fully setup provisioner or wait for the provisioner + DB to be fully available, so the third commit adds a somewhat smelly way to skip the provisioner check in tests that don't need to test that functionality.