-
Notifications
You must be signed in to change notification settings - Fork 887
fix: Add trap
to agent startup script to sleep on failure
#2873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
32ecb43
to
3a0d162
Compare
provisionersdk/agent.go
Outdated
@@ -18,6 +18,7 @@ Start-Process -FilePath $env:TEMP\sshd.exe -ArgumentList "agent" -PassThru` | |||
|
|||
linuxScript = `#!/usr/bin/env sh | |||
set -eux pipefail | |||
trap '[ $? -ne 0 ] && echo === Agent script exited with non-zero code. Sleeping infinitely to preserve logs... && sleep infinity' EXIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about sleep infinity
compatibility. From a quick test it doesn't seem to work on macOS. It does seem supported in busybox
, but e.g. in toolbox
it doesn't (more common in Android world I think).
I wonder if it's be enough to sleep just a "long time", say something in the ballpark of 1-24h? I think anything above 15s should spare the Docker container from being erased.
PS. It doesn't hurt, but I don't think we need the non-zero exit check here since the exec
at the end replaces this process, the trap should also go away. Up to you which you prefer, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful catch, I'll make it a static duration (I agree 24h should be plenty).
The non-zero exit code can be removed with exec
, another great catch!
The Docker Terraform provider removes containers immediately on exit, making it difficult to debug a failed container start with Coder. This will sleep on exit and output a friendly log, which should assist with debugging failures.
3a0d162
to
b3c8a87
Compare
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
The Docker Terraform provider removes containers immediately on exit, making
it difficult to debug a failed container start with Coder. This will sleep on
exit and output a friendly log, which should assist with debugging failures.