-
Notifications
You must be signed in to change notification settings - Fork 896
feat(agent/agentcontainers): recreate devcontainers concurrently #18042
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(agent/agentcontainers): recreate devcontainers concurrently #18042
Conversation
d17bca2
to
0ebc262
Compare
This change introduces a refactor of the devcontainers recreation logic which is now handled asynchronously rather than being request scoped. The response was consequently changed from "No Content" to "Accepted" to reflect this. A new `Status` field was introduced to the devcontainer struct which replaces `Running` (bool). This reflects that the devcontainer can now be in various states (starting, running, stopped or errored). The status field also protects against multiple concurrent recrations, as long as they are initiated via the API. Updates #16424
0ebc262
to
9328228
Compare
// @Param workspaceagent path string true "Workspace agent ID" format(uuid) | ||
// @Param container path string true "Container ID or name" | ||
// @Success 204 | ||
// @Success 202 {object} codersdk.Response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: the response code change here does not appear to affect any endpoints currently in use by the FE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, not yet.
err = json.NewDecoder(rec.Body).Decode(&resp) | ||
require.NoError(t, err, "unmarshal response failed after recreation") | ||
require.Len(t, resp.Devcontainers, 1, "expected one devcontainer in response after recreation") | ||
assert.Equal(t, codersdk.WorkspaceAgentDevcontainerStatusRunning, resp.Devcontainers[0].Status, "devcontainer is not stopped after recreation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this could also be a require?
err = json.NewDecoder(rec.Body).Decode(&resp) | ||
require.NoError(t, err, "unmarshal response failed after error") | ||
require.Len(t, resp.Devcontainers, 1, "expected one devcontainer in response after error") | ||
assert.Equal(t, codersdk.WorkspaceAgentDevcontainerStatusError, resp.Devcontainers[0].Status, "devcontainer is not in an error state after up failure") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this could also be a require?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To what benefit? Assert is useful in that it can allow multiple conditions to fail and give you a larger picture of what's wrong. Require is ofc needed whenever you need that condition to be true, like nil and lengths checks so that the rest of the code doesn't panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My nit is very much stylistic here; having an assert
on the last line below one or more require
doesn't help us against panics or any of the above you mentioned. Feel free to ignore!
// The devcontainer state must be set to starting and the recreateWg must be | ||
// incremented before calling this function. | ||
func (api *API) recreateDevcontainer(dc codersdk.WorkspaceAgentDevcontainer, configPath string) { | ||
defer api.recreateWg.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm nervous that a hanging request to recreateDevcontainer
could make closing the devcontainers API hang on api.wg.Wait()
. I wonder if it makes sense to have a 'graceful' shutdown context and a 'hard' shutdown context that will forcefully cancel all in-flight recreation waitgroups after a certain period of time elapses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses api.ctx
which is essentially that "hard" shutdown context, there isn't any reason to have a graceful one AFAICT.
This change introduces a refactor of the devcontainers recreation logic
which is now handled asynchronously rather than being request scoped.
The response was consequently changed from "No Content" to "Accepted" to
reflect this.
A new
Status
field was introduced to the devcontainer struct whichreplaces
Running
(bool). This reflects that the devcontainer can nowbe in various states (starting, running, stopped or errored).
The status field also protects against multiple concurrent recrations,
as long as they are initiated via the API.
Updates #16424