Graceful shutdowns for coder agents and shutdown scripts

In the ideal scenario where a workspace agent is connected to the coder server, we should initiate agent shutdown _before_ we initiate Terraform provisioning to destroy/re-create the resource(s).

**Why?**

- Providers behave differently: some may not initiate graceful shutdowns, and we might not be able to control their timeouts
- Template authors may use the agent `shutdown_script` to perform a critical task that must complete successfully (e.g. backing up the filesystem)
	- The task/script may take a long time
- We can leave the workspace running and allow debugging an agent that didn't successfully execute its `shutdown_script`

**To consider:**

- At the end of a graceful shutdown, we must not _exit_ the agent process
	- _Only applies when graceful shutdown is initiated by coder server!_
	- We should de-register signal handlers and wait "indefinitely"
	- Let the next signal terminate the process
	- Why? This prevents e.g. a systemd service from restarting the agent
- Should behavior differ if a workspace is started/stopped?
- What happens when an agent is disconnected and we can't tell it to shut down? -> Block or allow de-provisioning? Require the use of "force"?

Related: #4677, #5914, #6139

Graceful shutdowns for coder agents and shutdown scripts #6175
