-
Notifications
You must be signed in to change notification settings - Fork 928
feat(cli): Add support for delay_login_until_ready
#5851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cli/ssh.go
Outdated
@@ -242,6 +244,7 @@ func ssh() *cobra.Command { | |||
cliflag.BoolVarP(cmd.Flags(), &forwardGPG, "forward-gpg", "G", "CODER_SSH_FORWARD_GPG", false, "Specifies whether to forward the GPG agent. Unsupported on Windows workspaces, but supports all clients. Requires gnupg (gpg, gpgconf) on both the client and workspace. The GPG agent must already be running locally and will not be started for you. If a GPG agent is already running in the workspace, it will be attempted to be killed.") | |||
cliflag.StringVarP(cmd.Flags(), &identityAgent, "identity-agent", "", "CODER_SSH_IDENTITY_AGENT", "", "Specifies which identity agent to use (overrides $SSH_AUTH_SOCK), forward agent must also be enabled") | |||
cliflag.DurationVarP(cmd.Flags(), &wsPollInterval, "workspace-poll-interval", "", "CODER_WORKSPACE_POLL_INTERVAL", workspacePollInterval, "Specifies how often to poll for workspace automated shutdown.") | |||
cliflag.BoolVarP(cmd.Flags(), &skipDelayLoginUntilReady, "skip-delay-login-until-ready", "", "CODER_SSH_SKIP_DELAY_LOGIN_UNTIL_READY", false, "Specifies whether to login to a workspace that has not finished starting up (only applicable when the delay login until ready option is enabled).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we go with something simpler, shorter here? Maybe --no-delay-login
, --no-wait
. I'd like to leave room for the opposite flag as well --[no-]wait
to allow waiting even when the template doesn't specify it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--no-wait
is succinct 👍
cli/cliui/agent.go
Outdated
} | ||
case codersdk.WorkspaceAgentLifecycleReady: | ||
return nil | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay or nay?
default: | |
case codersdk.WorkspaceAgentLifecycleStartError: | |
showMessage() | |
return errors.New("...") | |
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case case
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should fail-open here. Let folks try to connect, but warn that it might not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I agree. Let's make it so!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, this is what it looks like now:
❯ coder ssh test
> The workspace agent ran into a problem during startup.
See troubleshooting instructions at:
https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates
Agent startup script exited with non-zero status, use --no-wait to login anyway.
Run 'coder ssh --help' for usage.
And with --no-wait
we still warn about the problem:
❯ coder ssh test --no-wait
> The workspace agent ran into a problem during startup.
See troubleshooting instructions at:
https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates
mafredri@test ~
❯
|
||
func waitingMessage(agent codersdk.WorkspaceAgent) (m *message) { | ||
m = &message{ | ||
Prompt: "Don't panic, your workspace is booting up!", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add directions on how to bypass this e.g.
To skip this check, use the `--no-wait` argument
(Or whatever we end up calling it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this message (this state is before agent is connected so we can't connect yet).
We could add it for the other message, but I wonder if it would encourage bypassing the prompt, esp. without understanding what the flag does. You just go Ctrl+C -> --no-wait
yay I'm in (and not everything is working). Would it suffice to explain this in the troubleshooting link?
I think this need will also be alleviated by streaming the startup log once we have it available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it suffice to explain this in the troubleshooting link?
That's fine either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is definitely a good step in troubleshooting failed workspace builds.
One possible avenue of future work I see is adding WorkspaceAgentDownloading
/ WorkspaceAgentDownloaded
states. We could add a random pre-shared nonce (even the build ID could work here) to the workspace agent downloaded URL and use that to determine if the workspace attempted to download the agent.
Updated messaging:
|
m.Prompt = "The workspace is taking longer than expected to get ready, the agent startup script is still executing." | ||
case codersdk.WorkspaceAgentLifecycleStartError: | ||
m.Spin = "" | ||
m.Prompt = "The workspace ran into a problem while getting ready, the agent startup script exited with non-zero status." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this copy! Once we start streaming build logs, I imagine we can also link to view the logs + the troubleshooting URL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely! We can also stream the log in the terminal while the user is waiting, like 5 latest rows (keeps updating). Maybe that'd enabled/disabled via flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's even better!
This PR adds support to
ssh
andcoder ssh
fordelay_login_until_ready
.Ref: #5749
Here are examples of when
coder_agent.delay_login_until_ready = true
(for false, behavior is as before, sans formatting changes):On 30+s start,
start_timeout
=>start_error
:Ultimately in this case, the user can log in withcoder ssh --skip-delay-login-until-ready
.We might want to change behavior ofstart_error
to exit with non-zero status instead blocking.