Commit 50b4a64

docs(templates): Document startup_script_behavior in-depth
Fixes #7759
1 parent a2e1290 commit 50b4a64

1 file changed

+92
-28
lines changed

docs/templates/index.md

+92-28
@@ -145,12 +145,20 @@ by all child processes of the agent, including SSH sessions. See the
145145
[Coder Terraform Provider documentation](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent)
146146
for the full list of supported arguments for the `coder_agent`.
147147

148-
#### startup_script
148+
#### `startup_script`
149149

150150
Use the Coder agent's `startup_script` to run additional commands like
151151
installing IDEs, [cloning dotfiles](../dotfiles.md#templates), and cloning
152152
project repos.
153153

154+
**Note:** By default, the startup script is executed in the background, allowing users to access the workspace before the script completes. To change this, see `startup_script_behavior` below.
155+
156+
Here are a few guidelines for writing a good startup script (more on these below):
157+
158+
1. Use `set -e` to exit the script if any command fails and `|| true` for commands that are allowed to fail
159+
2. Use `&` to start a process in the background, allowing the startup script to complete
160+
3. Inform the user about what's going on via `echo`
161+
154162
```hcl
155163
resource "coder_agent" "coder" {
156164
os = "linux"
@@ -163,27 +171,53 @@ resource "coder_agent" "coder" {
163171
# that does not require root permissions. Note that /tmp may be mounted in tmpfs which
164172
# can lead to increased RAM usage. To avoid this, you can pre-install code-server inside
165173
# the Docker image or VM image.
174+
echo "Installing code-server..."
166175
curl -fsSL https://code-server.dev/install.sh | sh -s -- --method=standalone --prefix=/tmp/code-server --version 4.8.3
167176
168177
# The & prevents the startup_script from blocking so the next commands can run.
169178
# The stdout and stderr of code-server is redirected to /tmp/code-server.log.
179+
echo "Starting code-server..."
170180
/tmp/code-server/bin/code-server --auth none --port 13337 >/tmp/code-server.log 2>&1 &
171181
172-
# var.repo and var.dotfiles_uri is specified
173-
# elsewhere in the Terraform code as input
174-
# variables.
182+
# Notice: var.repo and var.dotfiles_uri are specified elsewhere in the Terraform
183+
# code as input variables.
184+
REPO=${var.repo}
185+
DOTFILES_URI=${var.dotfiles_uri}
175186
176187
# clone repo
177188
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
178-
git clone --progress git@github.com:${var.repo}
189+
echo "Cloning $REPO..."
190+
git clone --progress git@github.com:"$REPO"
179191
180192
# use coder CLI to clone and install dotfiles
181-
coder dotfiles -y ${var.dotfiles_uri}
182-
193+
echo "Cloning dotfiles..."
194+
coder dotfiles -y "$DOTFILES_URI"
183195
EOT
184196
}
185197
```
186198

199+
The startup script can contain important steps that must succeed for the workspace to be in a usable state. For this reason, we recommend using `set -e` (exit on error) at the top of the script, and `|| true` (allow the command to fail) on commands that may fail safely, so that the user is notified when something goes wrong. These are not shown in the example above because, while useful, they need to be used with care.
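
As a minimal sketch of how `set -e` and `|| true` interact, consider the script below. The directory and file names are hypothetical placeholders and are not taken from any template:

```sh
#!/bin/sh
# Minimal sketch; the paths below are hypothetical placeholders.
set -e  # Exit immediately if any following command fails.

echo "Creating project directory..."
WORKDIR="$(mktemp -d)"
mkdir -p "$WORKDIR/project"

echo "Removing stale cache (allowed to fail)..."
# The file does not exist, so rm fails; `|| true` keeps `set -e` from aborting.
rm "$WORKDIR/project/stale-cache" 2>/dev/null || true

echo "Setup finished."
```

Without `|| true`, the failing `rm` would stop the script before "Setup finished." is printed.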
200+
201+
We also recommend that startup scripts always terminate, meaning that long-running processes should be run in the background. This is usually achieved by adding `&` to the end of the command. For example, `sleep 10 &` will run the command in the background and allow the startup script to complete.
202+
203+
If a backgrounded command (`&`) writes to stdout or stderr, the startup script will not complete until the command completes or closes the file descriptors. To avoid this, redirect stdout and stderr to a file. For example, `sleep 10 >/dev/null 2>&1 &` redirects stdout and stderr to `/dev/null` (discarding the output) and runs the command in the background.
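
The redirect-and-background pattern can be sketched as follows. This is an illustration only: `sleep 30` stands in for a real long-running service, and the log file location is a hypothetical placeholder:

```sh
#!/bin/sh
# Minimal sketch; `sleep 30` stands in for a long-running service and the
# log file location is a hypothetical placeholder.
LOG_FILE="$(mktemp)"

echo "Starting background service..."
# Redirect stdout and stderr to a file so the background process does not
# keep the startup script's output streams open.
sleep 30 >"$LOG_FILE" 2>&1 &
SERVICE_PID=$!

echo "Service started (PID $SERVICE_PID); the script can now finish."
```

Writing to a log file instead of `/dev/null` keeps the output available for later debugging.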
204+
205+
Note that each step in the example starts with `echo "..."` to give the user feedback about what is happening. This is especially useful when the startup script behavior is set to blocking, because the user is told why they're waiting to access their workspace.
206+
207+
#### `startup_script_behavior`
208+
209+
Use the Coder agent's `startup_script_behavior` to change the behavior between `blocking` and `non-blocking` (default). The blocking behavior is recommended for most use cases because it allows the startup script to complete before the user accesses the workspace. For example, let's say you want to check out a very large repo in the startup script. If the startup script is non-blocking, the user may log in via SSH or open the IDE before the repo is fully checked out. This can lead to a poor user experience.
210+
211+
Whichever behavior is enabled, the user can still choose to override it by specifying the appropriate flags (or environment variables) in the CLI when connecting to the workspace. For example, `coder ssh --no-wait` will connect to the workspace without waiting for the startup script to complete.
212+
213+
```hcl
214+
resource "coder_agent" "coder" {
215+
os = "linux"
216+
arch = "amd64"
217+
startup_script_behavior = "blocking"
218+
startup_script = "echo 'Starting...'"
}
219+
```
220+
187221
### Start/stop
188222

189223
[Learn about resource persistence in Coder](./resource-persistence.md)
@@ -372,37 +406,67 @@ practices:
372406
- The Coder agent shutdown script logs are typically stored in `/tmp/coder-shutdown-script.log`
373407
- This can also happen if the websockets are not being forwarded correctly when running Coder behind a reverse proxy. [Read our reverse-proxy docs](https://coder.com/docs/v2/latest/admin/configure#tls--reverse-proxy)
374408

375-
### Agent does not become ready
409+
### Startup script issues
376410

377-
If the agent does not become ready, it means the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) is still running or has exited with a non-zero status. This also means the [login before ready](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#login_before_ready) option hasn't been set to true.
411+
Depending on the contents of the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script), and whether the [startup script behavior](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script_behavior) is set to blocking or non-blocking, you may notice issues related to the startup script. This section covers common scenarios and how to resolve them.
378412

379-
```console
380-
$ coder ssh myworkspace
381-
⢄⡱ Waiting for [agent] to become ready...
382-
```
413+
#### Unable to access workspace, startup script is still running
383414

384-
To troubleshoot readiness issues, check the agent logs as suggested above. You can connect to the workspace using `coder ssh` with the `--no-wait` flag. Please note that while this makes login possible, the workspace may be in an incomplete state.
415+
If you're trying to access your workspace and are unable to because the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) is still running, it means the [startup script behavior](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script_behavior) option is set to blocking or you have enabled the `--wait` option (for example, with `coder ssh` or `coder config-ssh`). In such an event, you can always access the workspace by using the web terminal, or via SSH using the `--no-wait` option. To force the startup script to exit, you can try terminating the processes started by the startup script or the startup script itself (on Linux, `ps` and `kill` are useful tools).
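
As a rough sketch of the `ps` and `kill` workflow on Linux, the script below starts a stand-in process, inspects it, and terminates it. The `sleep 300` command is a hypothetical substitute for a stuck startup command; in a real workspace you would look up the PID in the output of `ps` (for example, `ps -ef`) rather than capture it with `$!`:

```sh
#!/bin/sh
# Minimal sketch; `sleep 300` is a hypothetical stand-in for a stuck command.
sleep 300 >/dev/null 2>&1 &
STUCK_PID=$!

# Inspect the process. `ps` may be missing on minimal images, so do not
# treat a failure here as fatal.
ps -o pid,etime,args -p "$STUCK_PID" || true

echo "Terminating PID $STUCK_PID..."
kill "$STUCK_PID"
wait "$STUCK_PID" 2>/dev/null || true  # Reap the process; it exits non-zero.

if kill -0 "$STUCK_PID" 2>/dev/null; then
  STUCK_STATE="running"
else
  STUCK_STATE="terminated"
fi
echo "Process is now $STUCK_STATE."
```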
385416

386-
```console
387-
$ coder ssh myworkspace --no-wait
417+
For tips on how to write a startup script that doesn't run forever, see the [`startup_script`](#startup_script) section, or read about [debugging the startup script](#debugging-the-startup-script) for help in resolving issues.
388418

389-
> The workspace is taking longer than expected to get
390-
ready, the agent startup script is still executing.
391-
See troubleshooting instructions at: [...]
419+
Template authors can also set the [startup script behavior](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script_behavior) option to non-blocking, which will allow users to access the workspace while the startup script is still running. Note that the workspace must be updated after changing the option.
392420

393-
user@myworkspace $
394-
```
421+
Useful commands:
395422

396-
If the startup script is expected to take a long time, you can try raising the timeout defined in the template:
423+
- `coder ssh --wait my-workspace` or `export CODER_SSH_WAIT=true; ssh coder.my-workspace`
424+
- `coder ssh --no-wait my-workspace` or `export CODER_SSH_NO_WAIT=true; ssh coder.my-workspace`
425+
- `coder config-ssh --wait` or `coder config-ssh --no-wait`
397426

398-
```tf
399-
resource "coder_agent" "main" {
400-
# ...
401-
login_before_ready = false
402-
startup_script_timeout = 1800 # 30 minutes in seconds.
403-
}
427+
#### Your workspace may be incomplete
428+
429+
If you see a warning that your workspace may be incomplete, it means that programs, files, or settings may be missing from your workspace. This can happen if the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) is still running or has exited with a non-zero status (see [startup script exited with an error](#startup-script-exited-with-an-error)). No action is necessary, but you may want to check the [startup script logs](#debugging-the-startup-script) to see if there are any issues.
430+
431+
#### Session was started before the startup script finished (web terminal)
432+
433+
The web terminal may show this message if it was started before the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) finished, but the startup script has since finished. This message can safely be dismissed; however, be aware that your preferred shell or dotfiles may not yet be activated for this shell session. You can either start a new session or source your dotfiles manually. Note that starting a new session means that commands running in the terminal will be terminated and you may lose unsaved work.
434+
435+
Examples for activating your preferred shell or sourcing your dotfiles:
436+
437+
- `exec zsh -l`
438+
- `source ~/.bashrc`
439+
440+
#### Startup script exited with an error
441+
442+
When the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) exits with an error, it means the last command run by the script failed. When `set -e` is used, this means that any failing command will immediately exit the script and the remaining commands will not be executed. This also means that [your workspace may be incomplete](#your-workspace-may-be-incomplete). If you see this error, you can check the [startup script logs](#debugging-the-startup-script) to figure out what the issue is.
443+
444+
Common causes for startup script errors:
445+
446+
- A missing command or file
447+
- A command that fails due to missing permissions
448+
- Network issues (e.g., unable to reach a server)
449+
450+
#### Debugging the startup script
451+
452+
The simplest way to debug the [startup script](https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#startup_script) is to open the workspace in the Coder dashboard and click "Show startup log" (if not already visible). This will show all the output from the script. Another option is to view the log file inside the workspace (usually `/tmp/coder-startup-script.log`). If the logs don't indicate what's going on or going wrong, you can increase verbosity by adding `set -x` to the top of the startup script (note that this will show all commands run and may output sensitive information). Alternatively, you can add `echo` statements to show what's going on.
453+
454+
Here's a short example of an informative startup script:
455+
456+
```sh
457+
echo "Running startup script..."
458+
echo "Run: long-running-command"
459+
/path/to/long-running-command
460+
status=$?
461+
echo "Done: long-running-command, exit status: ${status}"
462+
if [ "$status" -ne 0 ]; then
463+
echo "Startup script failed, exiting..."
464+
exit "$status"
465+
fi
404466
```
405467

468+
This script tells us what command is being run and what the exit status is. If the exit status is non-zero, it means the command failed. Note that here we don't need `set -x` because we're manually echoing the commands, which protects against sensitive information being shown in the log. We also don't need `set -e` because we're manually checking the exit status and exiting if it's non-zero.
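
When a script has many steps, the same echo-and-check pattern can be wrapped in a small helper. The sketch below is one possible approach; `run_step` is a hypothetical function name, not something provided by Coder, and the marker directory is a placeholder:

```sh
#!/bin/sh
# Minimal sketch; run_step is a hypothetical helper, not part of Coder.
run_step() {
  name=$1
  shift
  echo "Run: $name"
  "$@"
  status=$?
  echo "Done: $name, exit status: $status"
  return "$status"
}

MARKER_DIR="$(mktemp -d)"  # hypothetical placeholder location
run_step "create marker file" touch "$MARKER_DIR/marker" || exit "$?"
run_step "list marker dir" ls "$MARKER_DIR" || exit "$?"
echo "Startup script finished."
```

Only the step name is echoed, not the command line, so arguments that contain sensitive values stay out of the log.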
469+
406470
## Template permissions (enterprise)
407471

408472
Template permissions can be used to give users and groups access to specific
