Workspace goes to failed state and cannot be started, stopped or deleted after cancel-start->stop->start actions #2683
Comments
@bpmct can we get a couple of us together and see:
I'm tackling this issue from the perspective that we currently terminate the Terraform process immediately on a cancel request, which does not give Terraform a chance to clean up resources. This will be changed so that cancellation sends an interrupt signal and we wait for Terraform to clean up. A second, forceful cancellation request terminates Terraform and may leave resources behind, but it can be necessary in cases where Terraform is stuck cleaning up.
This change allows terraform commands to be gracefully cancelled on Unix-like platforms by signaling interrupt on provision cancellation.

One implementation detail to note is that we do not necessarily kill a running terraform command immediately, even if the stream is closed. The reason for this is to allow for graceful cancellation even in such an event. Currently the timeout is set to one minute, which was chosen arbitrarily.

Also note that the `force` flag was added to `provisioner.proto` and is handled in the `terraform` package; however, it is not used by the `provisionerd/runner`. The reason is that the `runner` would need to be refactored. Currently the "force stop" context is used as a base for streams, which means that in a force stop event the stream would close and we could not send out our force stop cancellation message.

Related: #2683. The above issue may be partially or fully fixed by this change.
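To make the interrupt-then-wait behaviour concrete, here is a minimal sketch in Python (not the project's Go code, and not the actual implementation). The names `cancel_gracefully` and `grace_seconds` are hypothetical, and the sketch assumes a Unix-like platform where `SIGINT` can be delivered to a child process. It interrupts the child, waits up to a grace period for it to clean up, and only then kills it.

```python
import signal
import subprocess


def cancel_gracefully(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Interrupt a child process and kill it only if it does not exit in time.

    Hypothetical helper for illustration. Returns "graceful" if the child
    exited within the grace period, "forced" if it had to be killed.
    """
    # Step 1: ask the child to stop, giving it a chance to clean up.
    proc.send_signal(signal.SIGINT)
    try:
        # Step 2: wait for the cleanup to finish, bounded by the grace period.
        proc.wait(timeout=grace_seconds)
        return "graceful"
    except subprocess.TimeoutExpired:
        # Step 3: the child is stuck; terminate it forcefully.
        # This may leave resources behind.
        proc.kill()
        proc.wait()
        return "forced"
```

A caller would start the long-running command with `subprocess.Popen(...)` and call `cancel_gracefully(proc, grace_seconds)` on a cancel request. The grace period is left as a parameter here because the value is a tuning choice rather than something the technique dictates.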
* fix: Allow terraform provisions to be gracefully cancelled

  This change allows terraform commands to be gracefully cancelled on Unix-like platforms by signaling interrupt on provision cancellation.

  One implementation detail to note is that we do not necessarily kill a running terraform command immediately even if the stream is closed. The reason for this is to allow for graceful cancellation even in such an event. Currently the timeout is set to 5 minutes by default.

  Related: #2683. The above issue may be partially or fully fixed by this change.

* fix: Remove incorrect minimumTerraformVersion variable
* Allow init to return provision complete response
With #3526 merged, I'm hoping this issue is fixed but I'll leave this ticket open for now in case there are more reports of it (it'll be part of the next release, the one after
Let's reopen if there is a new report.
Problem
Workspace goes to failed state and cannot be started or stopped or deleted after cancel-start->stop->autostart actions
I ran into an issue where I cannot work with my workspace. I get the error `Error: Error creating instance: googleapi: Error 409: The resource 'projects/coder-dogfood/zones/europe-west4-b/instances/coder-nadzeya-nadzeya-test-autostart' already exists`
Steps to reproduce:
coder version:
Coder v0.7.4-devel+bbbd5241
Start fails with:
`Error: Error creating instance: googleapi: Error 409: The resource 'projects/coder-dogfood/zones/us-central1-a/instances/coder-nadzeya-nadzeya-windows' already exists, alreadyExists`
After that, no other actions can be performed on the workspace.
Events order list sample:
Definition of Done
If a start is cancelled, a user should be able to start/stop a workspace without problems