Improve template/workspace build diagnostic #15447
Comments
The scope of this issue is fairly wide. I think it might need to be broken down a bit further.
Which specific error points?
We'll need to add code for each specific provider. How about we focus on the top 3: Docker, Kubernetes, AWS?
Are these not covered by our existing documentation?
They are, but they could be improved with explicit instructions (specifically regarding the Docker socket permission issues in the context of Coder). We also need to link to them from within the dashboard so the user lands on actionable troubleshooting steps. I would prefer error codes and docs similar to the ones we have for the health status page.
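A rough sketch of what error codes with linked docs could look like. The codes (`EBUILD01`, `EBUILD02`), the `BuildDiagnostic` type, and the docs URL are all hypothetical placeholders, not existing Coder identifiers.

```python
from dataclasses import dataclass

# Hypothetical base URL; a real implementation would point at the actual docs.
DOCS_BASE = "https://example.com/docs/troubleshooting/builds"


@dataclass(frozen=True)
class BuildDiagnostic:
    code: str     # hypothetical error code shown in the dashboard
    summary: str  # one-line, user-facing description
    anchor: str   # docs anchor holding the troubleshooting steps

    @property
    def docs_url(self) -> str:
        return f"{DOCS_BASE}#{self.anchor}"

    def render(self) -> str:
        """Format the message a dashboard could show next to a failed build."""
        return f"[{self.code}] {self.summary} See {self.docs_url}"


# Illustrative registry keyed by code; the entries are placeholders.
DIAGNOSTICS = {
    d.code: d
    for d in (
        BuildDiagnostic("EBUILD01", "Docker socket is not accessible.", "docker-socket"),
        BuildDiagnostic("EBUILD02", "Provider authentication failed.", "provider-auth"),
    )
}

print(DIAGNOSTICS["EBUILD01"].render())
```

Keeping the code, summary, and docs anchor in one record means the dashboard message and the docs page cannot drift apart silently.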
Yes, we can start with these 3. It would also be nice if we could run some preflight checks and suggest or auto-import the templates whose infrastructure the Coder server can connect to.
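As an illustration of such a preflight check for the Docker case, here is a minimal sketch that inspects the socket before any template is pushed. The function name `check_docker_socket` and the default path `/var/run/docker.sock` are assumptions for the example; it only checks file-level access and does not talk to the Docker daemon.

```python
import os
import stat


def check_docker_socket(path: str = "/var/run/docker.sock") -> str:
    """Return "ok", or a short reason why the Docker socket looks unusable."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return f"{path} does not exist; is Docker running and the socket mounted?"
    except PermissionError:
        return f"cannot stat {path}; check permissions on the parent directory"
    if not stat.S_ISSOCK(mode):
        return f"{path} exists but is not a Unix socket"
    # Connecting to the socket requires both read and write permission.
    if not os.access(path, os.R_OK | os.W_OK):
        return f"{path} is not readable and writable by the current user; check Docker group membership"
    return "ok"


if __name__ == "__main__":
    print(check_docker_socket())
```

A check like this would need to run as the same user as the Coder server, since the result depends on that user's group membership.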
More verbose template build errors in the web template editor. They could relate to anything from wrong syntax to provider authentication issues.
Just got linked to this issue via support and this would be super helpful for users to be able to debug some of their own issues. We're using Kubernetes, and a workspace build with the wrong image tag (so it can't pull the image) or one requesting more resources than can be scheduled leads to a completely opaque error. I've added some internal docs for how people should debug this (describe the deployment to see the events and then map those event error strings to what is actually wrong and how to fix it), but it's hard to make those docs discoverable and for people to know what to do. If instead Coder could return those errors (either the raw error on the Kubernetes events for the deployment, or better yet something less raw and more user facing), then that would help a ton compared to just a timeout waiting for the deployment to complete.
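A minimal sketch of the event-to-hint mapping described above. It assumes the events have already been fetched and reduced to dicts with `reason` and `message` keys; the function name `explain_events` and the sample event text are illustrative, and real event messages vary by cluster and Kubernetes version.

```python
def explain_events(events: list[dict]) -> list[str]:
    """Translate raw event dicts ("reason", "message") into user-facing hints."""
    pull_markers = ("ErrImagePull", "ImagePullBackOff", "pull image", "pulling image")
    hints = []
    for event in events:
        reason = event.get("reason", "")
        message = event.get("message", "").rstrip(".")
        if reason == "FailedScheduling":
            hints.append(
                f"Workspace could not be scheduled ({message}). "
                "Try requesting less CPU or memory."
            )
        elif any(marker in message for marker in pull_markers):
            hints.append(
                f"Image could not be pulled ({message}). "
                "Check the image name and tag."
            )
    if hints:
        return hints
    # Nothing recognized: surface the raw events instead of hiding them.
    return [f"{e.get('reason', '')}: {e.get('message', '')}" for e in events]


# Illustrative sample events, not captured from a real cluster.
sample = [
    {"reason": "FailedScheduling", "message": "0/3 nodes are available: 3 Insufficient cpu."},
    {"reason": "Failed", "message": 'Failed to pull image "example/image:missing-tag"'},
]
for hint in explain_events(sample):
    print(hint)
```

Falling back to the raw events keeps unrecognized failures visible, which is still better than a bare timeout.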
removing from docs for now - once we have it implemented or part of a release, we'll be able to document it
Problem Description
Template creation in Coder can fail due to various setup and infrastructure requirements, and the lack of clear diagnostics for troubleshooting hurts the user experience. This is especially critical for new deployments evaluating Coder.
Templates can be created via: `coder templates push`.
During provisioning, Coder runs `terraform plan` and `terraform apply`. Failures may occur due to unmet dependencies, incorrect configurations, or provider authentication issues, but diagnostics are limited, often only showing a generic `terraform exit code 1` without specifics. For instance, the Docker template requires:
Failures vary across setups, such as Docker Compose, Kubernetes, and system services, making troubleshooting challenging. Missing diagnostics for Docker socket, permissions, or provider authentication errors contribute to user frustration and delays in setup.
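To make the gap concrete, here is a small sketch of how captured Terraform output could be matched against known failure causes instead of surfacing only the exit code. The patterns, hint text, and the `diagnose` function are assumptions for illustration; actual provider error messages differ between providers and versions.

```python
import re

# Illustrative patterns only; real provider output varies by version.
HINTS = [
    (
        re.compile(r"permission denied.*docker\.sock", re.IGNORECASE | re.DOTALL),
        "The user running Coder cannot access the Docker socket; check Docker group membership.",
    ),
    (
        re.compile(r"cannot connect to the docker daemon", re.IGNORECASE),
        "Docker does not appear to be running or reachable from the Coder server.",
    ),
    (
        re.compile(r"unauthorized|authentication failed", re.IGNORECASE),
        "The provider could not authenticate; check the credentials available to the provisioner.",
    ),
]

FALLBACK = "No known cause matched; see the full Terraform output."


def diagnose(output: str) -> str:
    """Return the first matching hint for the given Terraform output."""
    for pattern, hint in HINTS:
        if pattern.search(output):
            return hint
    return FALLBACK


# Illustrative input, modeled on a typical Docker socket permission error.
print(diagnose(
    "Error: permission denied while trying to connect to the Docker daemon "
    "socket at unix:///var/run/docker.sock"
))
```

Order matters in `HINTS`: the more specific socket-permission pattern is checked before the generic connection and authentication patterns.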
Desired Solution
To improve diagnostics, we need actionable logs, UI hints, and expanded documentation for common setup issues.
Enhanced Logging and Diagnostics:
Actionable logs for `terraform plan` and `terraform apply`, especially around Docker socket access and provider authentication.
UI Feedback and Suggested Fixes:
UI hints that suggest fixes, such as `add_groups` or setting up Docker group permissions.
Documentation Update:
Implementing these steps will streamline template creation, enabling users to resolve failures independently and improving the initial user experience with Coder.
Related issues: