-
Notifications
You must be signed in to change notification settings - Fork 881
fix: use background context for inmem provisionerd #11545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In this stack: This stack of pull requests is managed by Graphite. Learn more about stacking. Join @spikecurtis and the rest of your teammates on |
@@ -1371,10 +1371,10 @@ func newProvisionerDaemon( | |||
connector[string(database.ProvisionerTypeTerraform)] = sdkproto.NewDRPCProvisionerClient(terraformClient) | |||
} | |||
|
|||
return provisionerd.New(func(ctx context.Context) (proto.DRPCProvisionerDaemonClient, error) { | |||
return provisionerd.New(func(dialCtx context.Context) (proto.DRPCProvisionerDaemonClient, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I don't think this rename is required as I feel this is the default way we should think about contexts, unless explicitly specified otherwise. But there's also no harm so feel free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear from here that this function argument is dialer
, so I thought it would be a nice reminder.
// drop a job that we locked in the database but never passed to the provisionerd. The | ||
// provisionerd is local, in-mem, so there isn't a danger of losing contact with it and | ||
// having a dead connection we don't know the status of. | ||
err := server.Serve(context.Background(), serverSession) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check: will this result in provisioner jobs not being cancelled on shutdown? And is that really what we want?
Let's say a provisioner job (terraform) is running and it will take 5 minutes, I imagine we would like to ensure that job is cancelled gracefully, so we only wait for the cleanup but not the full 5 minutes. Is that what will happen or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in cli/server.go we explicitly close the provisionerDaemons before closing the API. Closing the provisioner daemon fails the active job. That's not exactly graceful, but at least it leaves the job in an understandable state.
This test case fails with an error log, showing "context canceled" when trying to send an acquired job to an in-mem provisionerd.
https://github.com/coder/coder/runs/20331469006
In this case, we don't want to supress this error, since it could mean that we acquired a job, locked it in the database, then failed to send it to a provisioner.
(We also don't want to mark the job as failed because we don't know whether the job made it to the provisionerd or not --- in the failed test you can see that the job is actually processed just fine).
The reason we got context canceled is because the API was shutting down --- we don't want provisionerdserver to abruptly stop processing job stuff as the API shuts down as this will leave jobs in a bad state. This PR fixes up the use of contexts with provisionerdserver and the associated drpc service calls.