"Running" on workspace list isn't helpful if the agent is disconnected #6461
Comments
I'm thinking about how we should represent this in the UI/CLI. What makes the most sense to me is to have a workspace health field, since we still want to differentiate between a started and a stopped state. The two columns could essentially be combined into one, since health for a stopped workspace doesn't make much sense, but this should illustrate the general idea. This is a bit trickier to express in the CLI, since we don't have popups to give extra information. Perhaps we can simply add a new column:
The user can then use
|
I'm definitely in favor of more of a single |
We could do the following:
|
I'm not a fan of hovering to see additional error details, as not every user will intuitively hover. For example, the agent troubleshooting URL is hard to find. On the workspace list page, that is fair, but I think on the page itself we should promote restarting with a big button. |
@mafredri it looks like an FE task. Is there anything you want to do in the BE? Or are you interested in taking the FE as well? |
I think the task #6462 is related to it as well |
@BrunoQuaresma happy to defer FE to you. As for the BE, I think potential requirements may be easier to identify after we have an implementation of this. For example, do we want a single field on the agent that says |
@mafredri this would be great! |
Alright @BrunoQuaresma and @bpmct, we can add a We could start off with a simple enum: The only question in my mind is, how can the UI help the user understand why a workspace is unhealthy based on the new To document my thought process, and possibly prompt some ideas from all of you. Here are the (currently possible) enumerated states of an agent (status:lifecycle state):
States that I would consider healthy:
There's one state representing a timeout or an unknown state; we don't know if it's healthy or unhealthy, nor do we know if it's ever going to be:
There are two states representing soft timeouts; these could be considered either healthy or unhealthy:
The rest could all be considered unhealthy. If we ever want to support restarting agents on request, even these states could be considered healthy:
|
I don't think we should add more values to the However, I think we could surface the "reason" as a different property in the API request, perhaps one that shows up when you hover over "unhealthy." I think https://dev.coder.com/api/v2/debug/health has a good design for this. I assume we're not storing health information in the DB, right? Just computing it per-request based on workspace/agent state? In unhealthy scenarios (e.g. agent disconnected), we should offer a button to restart the workspace. Hope that helps! |
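To make the "compute it per request and surface a reason" idea concrete, here is a minimal Go sketch. The `AgentStatus` values, the `Agent` and `Health` types, and `computeHealth` are all hypothetical names chosen for illustration; they are not taken from the Coder codebase, and the rule "any agent that is not connected makes the workspace unhealthy" is an assumption rather than the agreed behavior.

```go
package main

import "fmt"

// AgentStatus is a hypothetical, simplified connection status for an agent.
type AgentStatus string

const (
	AgentConnecting   AgentStatus = "connecting"
	AgentConnected    AgentStatus = "connected"
	AgentDisconnected AgentStatus = "disconnected"
	AgentTimeout      AgentStatus = "timeout"
)

// Agent is a hypothetical stand-in for a workspace agent.
type Agent struct {
	Name   string
	Status AgentStatus
}

// Health is a hypothetical response shape: a boolean plus a separate
// human-readable reason that a UI could show in a tooltip.
type Health struct {
	Healthy bool
	Reason  string // empty when Healthy is true
}

// computeHealth derives health from the current agent states on each call.
// Nothing is stored; the first agent that is not connected determines the reason.
func computeHealth(agents []Agent) Health {
	for _, a := range agents {
		if a.Status != AgentConnected {
			return Health{
				Healthy: false,
				Reason:  fmt.Sprintf("agent %q is %s", a.Name, a.Status),
			}
		}
	}
	return Health{Healthy: true}
}

func main() {
	h := computeHealth([]Agent{{Name: "main", Status: AgentDisconnected}})
	fmt.Printf("healthy=%t reason=%s\n", h.Healthy, h.Reason)
}
```

Keeping the reason out of the health value itself means the set of health values stays small, while the UI can still explain why a workspace is unhealthy.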
Also, I'm more than happy to help with the user-facing tooltip/ |
I created a draft implementation of the health field, borrowing concepts from I still need to think some more about how this field will behave when e.g. the workspace is stopping. Stopping is technically healthy, but the workspace could be considered "unhealthy" from a usability perspective. |
@BrunoQuaresma I created a draft for this, but could use some guidance for the popover/tooltip: #8387. Also, does this align with your vision? |
Hey @mafredri and @BrunoQuaresma! I have just seen this issue, so I'm wondering if we can close this one. Otherwise, could you please document what is left? |
@mtojek I'd say we still want to improve With There's also |
Could we instead have a combined state that displays whether the agent is connected, whether it's still running the startup script, whether it's reachable or unreachable, etc.?
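As a rough illustration of what such a combined state could look like, here is a small Go sketch. The function name `combinedStatus`, its boolean parameters, and the label strings are all hypothetical and only meant to show the idea of folding agent information into the displayed status.

```go
package main

import "fmt"

// combinedStatus folds workspace and agent information into one display label.
// All parameters and labels are hypothetical; a real implementation would
// likely work from richer status types than three booleans.
func combinedStatus(workspaceRunning, agentConnected, startupScriptDone bool) string {
	switch {
	case !workspaceRunning:
		return "Stopped"
	case !agentConnected:
		return "Running (agent disconnected)"
	case !startupScriptDone:
		return "Running (startup script in progress)"
	default:
		return "Running"
	}
}

func main() {
	// A started workspace whose agent has dropped off no longer shows a bare "Running".
	fmt.Println(combinedStatus(true, false, false))
}
```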