feature: provide a way to view all pending or running provisioner jobs #15084
Comments
Can we also add the option to "force cancel" a job that is hanging? @johnstcn mentioned we have some existing endpoints https://coder.com/docs/reference/api/templates#cancel-template-version-by-id but the ability to click "cancel" or "delete" for a job with no matching provisioner on this page would be awesome. Happy to extract to another issue if needed. |
Hey @chrifro, would you have some bandwidth for a quick UI mockup for part 3 of the above? It doesn't have to be anything overly fancy; even a simple table will probably suffice. |
Sure, I can help @johnstcn. But I need a bit more context first. I'm not fully familiar with the provisioner setup yet.
screenshot of the current provisioner information on organization level |
That makes sense!
I think that covers it for now. But it actually might make sense to defer this until we have a corresponding CLI command defined, as that would probably inform us of any information I've missed. What do you think? |
Perfect, then I will draft something based on that and we can always iterate on it and adjust the shown information. Does it work if I have something by end of week/early next week? |
That sounds fine to me! |
Here is a first draft. Note: the design is not final yet. I'd like to align on the information structure and direction first. Is this what you had in mind? Is any information missing? Feel free to comment directly in Figma.
Could you provide an example? What kind of information would you like to see here? |
@johnstcn I've drafted up an example of the outputs for the CLI commands, do these match the expected behavior and is there any missing/excess data?
❯ coder provisioner jobs list
CREATED AT JOB ORGANIZATION STATUS TYPE QUEUE COMPLETED BY PROVISIONER TAGS AVAILABLE PROVISIONERS
2024-10-19T16:22:39Z 9739dfd4-d69e-4369-9eed-f6e8a0c5cb57 Coder completed workspace_build coder-provisioner-789964695d-48cw6 []
2024-10-24T16:27:56Z 6976d3c0-64b1-4ace-8c5f-6bb620fae1ee Coder failed workspace_build coder-provisioner-789964695d-k285t [foo=bar] []
2024-10-26T16:28:20Z b1547e9d-502b-47a3-ba0b-b05e56825b36 Coder pending template_version_import [baz=qux] []
2024-10-26T16:29:26Z d0eb5f9f-81c7-4042-896e-055d095ae9cc Coder pending template_version_import 1/1 [coder-provisioner-789964695d-48cw6,coder-provisioner-789964695d-k285t]
Note: I considered adding template version/workspace name, but it could be looked up by job ID instead.
❯ coder provisioner list
NAME ORGANIZATION STATUS CURRENT JOB PREVIOUS JOB PREVIOUS_JOB_STATUS JOBS LAST 24H TYPE CREATED AT LAST SEEN VERSION TAGS
coder-provisioner-789964695d-48cw6 Coder connected d0eb5f9f-81c7-4042-896e-055d095ae9cc 9739dfd4-d69e-4369-9eed-f6e8a0c5cb57 completed 2 psk 2024-10-19T16:22:39Z 2024-10-26T16:29:26Z up to date []
coder-provisioner-789964695d-k285t Coder connected 6976d3c0-64b1-4ace-8c5f-6bb620fae1ee failed 2 psk 2024-10-26T16:28:20Z 2024-10-26T16:29:26Z up to date [foo=bar]
coder-provisioner-789964695d-r822d Coder disconnected 66f8290b-2ac0-4fd4-aa16-c926d8b8b61c completed 0 psk 2024-10-26T16:28:20Z 2024-10-26T16:29:26Z up to date [baz=qux]
The outputs are fairly wide, I'd love to reduce it, but I felt like all of this information is useful in trying to find potential issues. |
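For readers wondering how the AVAILABLE PROVISIONERS column in the draft could be derived, here is a rough Python sketch. It assumes a provisioner is eligible when it is connected and its tags include every tag on the job. That rule is only inferred from the sample rows above and may not match Coder's real matching logic; the `Provisioner` class and `available_provisioners` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Provisioner:
    # Hypothetical record mirroring a few columns of the draft output.
    name: str
    connected: bool
    tags: dict = field(default_factory=dict)


def available_provisioners(job_tags, provisioners):
    """Return names of connected provisioners that carry every job tag.

    Assumption: subset matching, inferred from the draft output only.
    """
    return [
        p.name
        for p in provisioners
        if p.connected and all(p.tags.get(k) == v for k, v in job_tags.items())
    ]


# Sample data copied from the draft `coder provisioner list` output.
provisioners = [
    Provisioner("coder-provisioner-789964695d-48cw6", True),
    Provisioner("coder-provisioner-789964695d-k285t", True, {"foo": "bar"}),
    Provisioner("coder-provisioner-789964695d-r822d", False, {"baz": "qux"}),
]

# An untagged job versus a job tagged baz=qux.
print(available_provisioners({}, provisioners))
print(available_provisioners({"baz": "qux"}, provisioners))
```

Under this assumed rule, the job tagged baz=qux has no available provisioner because the only provisioner with that tag is disconnected, which is the kind of situation the column is meant to surface.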
@mafredri I think we can show basic information by default but provide additional optional columns:
Apart from that, the basic gist of the output looks good to me! 👍 |
Seeing the number of jobs in the queue for each provisioner will be nice too. |
I think showing available provisioners could be a good default since it's what can tell you immediately if there's a problem or not. But maybe we just show a number, available provisioners =
I agree, maybe those columns are not even needed. We could add another column instead:
@matifali could you expand on what you're thinking of? Is it different from |
from sprint planning:
|
What I mean is to report the total number of jobs in queue. It would make better sense in UI and can be skipped in CLI. |
@mafredri @chrifro It'd make sense to visualize the queue in the UI for cases where a provisioner is not ready to pick up a job, or if there are no compatible provisioners running. Since 1 provisioner can only pick up 1 job at a time, I'm not sure how the drill-down by provisioner would work. The CLI output totally makes sense to me. Maybe the UI can mirror the same logical structure? |
@bpmct that makes sense to me, provisioners and provisioner jobs are intertwined but also distinct, having two separate views would be ideal. 👍🏻 |
@bpmct does showing a queue of jobs for provisioners also imply that the admin can "force cancel" a queued job, too (despite it being perfectly healthy, at least in theory)? |
Yep! I imagine that could be a quick prerequisite. Jobs that are assigned to a provisioner can be cancelled, but I don't think we actually support cancelling or deleting "pending" jobs that have not been assigned yet, even though we have had many requests for it. |
Yes. See: #12331 |
By that do you mean adding more details to the Based on this early mockup, what other changes would you like to see? What kind of information is still missing? |
This change adds metadata to provisioner jobs to help with rendering related templates and workspaces in the UI. Updates #15084
Motivation
Relates to #15047
There is currently no way to get a top-level view of all pending or running provisioner jobs.
This is useful for administrators troubleshooting issues with hung or pending jobs in combination with tagged provisioner daemons. Having this available would have been helpful in surfacing the root cause of the linked issue.
Proposed Solution
Required permissions: read all templates in org, read all workspaces in org.
Add a CLI command that queries the above endpoint. We may also potentially expose a command to list all registered provisioners.
Add a UI endpoint to list all provisioner jobs. We should have enough contextual information available to link to associated workspace builds and template versions.
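As a rough illustration of the "all pending or running jobs" view described in this issue, the Python sketch below filters a job list down to those two statuses and orders it oldest first. The dictionary keys (`id`, `status`, `created_at`) and the `active_jobs` helper are hypothetical and do not reflect the actual API response shape; the sample values reuse rows from the draft CLI output in the comments.

```python
from datetime import datetime

# Statuses this view cares about, per the issue title.
ACTIVE_STATUSES = {"pending", "running"}


def parse_timestamp(value):
    # Convert a trailing "Z" into an explicit UTC offset for fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def active_jobs(jobs):
    """Keep only pending or running jobs, sorted by creation time (oldest first)."""
    return sorted(
        (job for job in jobs if job["status"] in ACTIVE_STATUSES),
        key=lambda job: parse_timestamp(job["created_at"]),
    )


# Hypothetical job records built from the draft `coder provisioner jobs list` rows.
jobs = [
    {"id": "9739dfd4-d69e-4369-9eed-f6e8a0c5cb57", "status": "completed",
     "created_at": "2024-10-19T16:22:39Z"},
    {"id": "d0eb5f9f-81c7-4042-896e-055d095ae9cc", "status": "pending",
     "created_at": "2024-10-26T16:29:26Z"},
    {"id": "b1547e9d-502b-47a3-ba0b-b05e56825b36", "status": "pending",
     "created_at": "2024-10-26T16:28:20Z"},
    {"id": "6976d3c0-64b1-4ace-8c5f-6bb620fae1ee", "status": "failed",
     "created_at": "2024-10-24T16:27:56Z"},
]

for job in active_jobs(jobs):
    print(job["created_at"], job["id"], job["status"])
```

Sorting oldest first is a design choice for this sketch: the jobs that have waited longest are usually the ones an administrator wants to look at first when troubleshooting hung or pending work.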