Skip to content

feat: Add support for shutdown_script #5171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 28 additions & 4 deletions agent/agent.go
Original file line number Diff line number Diff line change
Expand Up @@ -370,14 +370,22 @@ func (a *agent) runCoordinator(ctx context.Context, network *tailnet.Conn) error
}

func (a *agent) runStartupScript(ctx context.Context, script string) error {
return a.runScript(ctx, "startup", script)
}

func (a *agent) runShutdownScript(ctx context.Context, script string) error {
return a.runScript(ctx, "shutdown", script)
}

func (a *agent) runScript(ctx context.Context, lifecycle, script string) error {
if script == "" {
return nil
}

a.logger.Info(ctx, "running startup script", slog.F("script", script))
writer, err := a.filesystem.OpenFile(filepath.Join(a.tempDir, "coder-startup-script.log"), os.O_CREATE|os.O_RDWR, 0o600)
a.logger.Info(ctx, "running script", slog.F("lifecycle", lifecycle), slog.F("script", script))
writer, err := a.filesystem.OpenFile(filepath.Join(a.tempDir, fmt.Sprintf("coder-%s-script.log", lifecycle)), os.O_CREATE|os.O_RDWR, 0o600)
if err != nil {
return xerrors.Errorf("open startup script log file: %w", err)
return xerrors.Errorf("open %s script log file: %w", lifecycle, err)
}
defer func() {
_ = writer.Close()
Expand Down Expand Up @@ -569,7 +577,7 @@ func (a *agent) createCommand(ctx context.Context, rawCommand string, env []stri

rawMetadata := a.metadata.Load()
if rawMetadata == nil {
return nil, xerrors.Errorf("no metadata was provided: %w", err)
return nil, xerrors.Errorf("no metadata was provided")
}
metadata, valid := rawMetadata.(codersdk.WorkspaceAgentMetadata)
if !valid {
Expand Down Expand Up @@ -943,6 +951,22 @@ func (a *agent) Close() error {
}
close(a.closed)
a.closeCancel()

rawMetadata := a.metadata.Load()
if rawMetadata == nil {
return xerrors.Errorf("no metadata was provided")
}
metadata, valid := rawMetadata.(codersdk.WorkspaceAgentMetadata)
if !valid {
return xerrors.Errorf("metadata is the wrong type: %T", metadata)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably shouldn't do an early exit here and leave closure half-way after it's already started.


ctx := context.Background()
err := a.runShutdownScript(ctx, metadata.ShutdownScript)
if err != nil {
a.logger.Error(ctx, "shutdown script failed", slog.Error(err))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to do this earlier. And also, we might not want to continue on error. Critical workspace tasks could be performed at this step, like syncing filesystem to cold storage or something similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also, we might not want to continue on error. Critical workspace tasks could be performed at this step, like syncing filesystem to cold storage or something similar.

I assumed the opposite condition. I wouldn't like to end up with 100 workspaces I can't kill. I suppose that it depends on which option is less evil. I'm fine to change the approach to be aligned with your suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s a valid thought. Another way to look at it is that if it’s allowed to fail. Why have it at all? Both are of course valid. In the PR review comment I mentioned a force stop would allow shutting down such workspaces. That’s one option, alternatively it could be a provider option but we should be careful with adding those until there are real use-cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the PR review comment I mentioned a force stop would allow shutting down such workspaces

I might have missed this comment, but do you mean to have the option to configure the forced-stop somewhere in metadata which can be changed using UI or a different place?

Copy link
Member

@mafredri mafredri Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hinting that coder stop would receive a new flag, e.g. --force. Similarly the WebUI might receive a new button. This would (probably) be cli/server-only and not wait for the a-OK from the agent, instead it would perform the terraform teardown immediately.


if a.network != nil {
_ = a.network.Close()
}
Expand Down
53 changes: 53 additions & 0 deletions agent/agent_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -695,6 +695,59 @@ func TestAgent(t *testing.T) {
return err == nil
}, testutil.WaitShort, testutil.IntervalFast)
})

t.Run("ShutdownScriptOnce", func(t *testing.T) {
t.Parallel()

expected := "this-is-shutdown"
client := &client{
t: t,
agentID: uuid.New(),
metadata: codersdk.WorkspaceAgentMetadata{
DERPMap: tailnettest.RunDERPAndSTUN(t),
StartupScript: "echo 1",
ShutdownScript: "echo " + expected,
},
statsChan: make(chan *codersdk.AgentStats),
coordinator: tailnet.NewCoordinator(),
}

fs := afero.NewMemMapFs()
agent := agent.New(agent.Options{
Client: client,
Logger: slogtest.Make(t, nil).Leveled(slog.LevelInfo),
Filesystem: fs,
})

// agent.Close() loads the shutdown script from the agent metadata.
// The metadata is populated just before execution of the startup script, so it's mandatory to wait
// until the startup starts.
require.Eventually(t, func() bool {
outputPath := filepath.Join(os.TempDir(), "coder-startup-script.log")
content, err := afero.ReadFile(fs, outputPath)
if err != nil {
t.Logf("read file %q: %s", outputPath, err)
return false
}
return len(content) > 0 // something is in the startup log file
}, testutil.WaitShort, testutil.IntervalMedium)

err := agent.Close()
require.NoError(t, err, "agent should be closed successfully")

outputPath := filepath.Join(os.TempDir(), "coder-shutdown-script.log")
logFirstRead, err := afero.ReadFile(fs, outputPath)
require.NoError(t, err, "log file should be present")
require.Equal(t, expected, string(bytes.TrimSpace(logFirstRead)))

// Make sure that script can't be executed twice.
err = agent.Close()
require.NoError(t, err, "don't need to close the agent twice, no effect")

logSecondRead, err := afero.ReadFile(fs, outputPath)
require.NoError(t, err, "log file should be present")
require.Equal(t, string(bytes.TrimSpace(logFirstRead)), string(bytes.TrimSpace(logSecondRead)))
})
}

func setupSSHCommand(t *testing.T, beforeArgs []string, afterArgs []string) *exec.Cmd {
Expand Down
1 change: 1 addition & 0 deletions coderd/database/databasefake/databasefake.go
Original file line number Diff line number Diff line change
Expand Up @@ -2474,6 +2474,7 @@ func (q *fakeQuerier) InsertWorkspaceAgent(_ context.Context, arg database.Inser
ConnectionTimeoutSeconds: arg.ConnectionTimeoutSeconds,
TroubleshootingURL: arg.TroubleshootingURL,
MOTDFile: arg.MOTDFile,
ShutdownScript: arg.ShutdownScript,
}

q.workspaceAgents = append(q.workspaceAgents, agent)
Expand Down
5 changes: 4 additions & 1 deletion coderd/database/dump.sql

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
ALTER TABLE workspace_agents DROP COLUMN shutdown_script;
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
ALTER TABLE workspace_agents ADD COLUMN shutdown_script varchar(65534);

COMMENT ON COLUMN workspace_agents.shutdown_script IS 'Script that is executed before the agent is stopped.';
2 changes: 2 additions & 0 deletions coderd/database/models.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 16 additions & 7 deletions coderd/database/queries.sql.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 3 additions & 2 deletions coderd/database/queries/workspaceagents.sql
Original file line number Diff line number Diff line change
Expand Up @@ -56,10 +56,11 @@ INSERT INTO
resource_metadata,
connection_timeout_seconds,
troubleshooting_url,
motd_file
motd_file,
shutdown_script
)
VALUES
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17) RETURNING *;
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18) RETURNING *;

-- name: UpdateWorkspaceAgentConnectionByID :exec
UPDATE
Expand Down
4 changes: 4 additions & 0 deletions coderd/provisionerdserver/provisionerdserver.go
Original file line number Diff line number Diff line change
Expand Up @@ -899,6 +899,10 @@ func InsertWorkspaceResource(ctx context.Context, db database.Store, jobID uuid.
ConnectionTimeoutSeconds: prAgent.GetConnectionTimeoutSeconds(),
TroubleshootingURL: prAgent.GetTroubleshootingUrl(),
MOTDFile: prAgent.GetMotdFile(),
ShutdownScript: sql.NullString{
String: prAgent.ShutdownScript,
Valid: prAgent.ShutdownScript != "",
},
})
if err != nil {
return xerrors.Errorf("insert agent: %w", err)
Expand Down
2 changes: 2 additions & 0 deletions coderd/provisionerdserver/provisionerdserver_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -772,6 +772,7 @@ func TestInsertWorkspaceResource(t *testing.T) {
Apps: []*sdkproto.App{{
Slug: "a",
}},
ShutdownScript: "shutdown",
}},
})
require.NoError(t, err)
Expand All @@ -786,6 +787,7 @@ func TestInsertWorkspaceResource(t *testing.T) {
require.Equal(t, "amd64", agent.Architecture)
require.Equal(t, "linux", agent.OperatingSystem)
require.Equal(t, "value", agent.StartupScript.String)
require.Equal(t, "shutdown", agent.ShutdownScript.String)
want, err := json.Marshal(map[string]string{
"something": "test",
})
Expand Down
2 changes: 2 additions & 0 deletions coderd/telemetry/telemetry.go
Original file line number Diff line number Diff line change
Expand Up @@ -542,6 +542,7 @@ func ConvertWorkspaceAgent(agent database.WorkspaceAgent) WorkspaceAgent {
StartupScript: agent.StartupScript.Valid,
Directory: agent.Directory != "",
ConnectionTimeoutSeconds: agent.ConnectionTimeoutSeconds,
ShutdownScript: agent.ShutdownScript.Valid,
}
if agent.FirstConnectedAt.Valid {
snapAgent.FirstConnectedAt = &agent.FirstConnectedAt.Time
Expand Down Expand Up @@ -741,6 +742,7 @@ type WorkspaceAgent struct {
LastConnectedAt *time.Time `json:"last_connected_at"`
DisconnectedAt *time.Time `json:"disconnected_at"`
ConnectionTimeoutSeconds int32 `json:"connection_timeout_seconds"`
ShutdownScript bool `json:"shutdown_script"`
}

type WorkspaceApp struct {
Expand Down
2 changes: 2 additions & 0 deletions coderd/workspaceagents.go
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,7 @@ func (api *API) workspaceAgentMetadata(rw http.ResponseWriter, r *http.Request)
Directory: apiAgent.Directory,
VSCodePortProxyURI: vscodeProxyURI,
MOTDFile: workspaceAgent.MOTDFile,
ShutdownScript: apiAgent.ShutdownScript,
})
}

Expand Down Expand Up @@ -686,6 +687,7 @@ func convertWorkspaceAgent(derpMap *tailcfg.DERPMap, coordinator tailnet.Coordin
Apps: apps,
ConnectionTimeoutSeconds: dbAgent.ConnectionTimeoutSeconds,
TroubleshootingURL: troubleshootingURL,
ShutdownScript: dbAgent.ShutdownScript.String,
}
node := coordinator.Node(dbAgent.ID)
if node != nil {
Expand Down
2 changes: 2 additions & 0 deletions codersdk/workspaceagents.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ type WorkspaceAgent struct {
DERPLatency map[string]DERPRegion `json:"latency,omitempty"`
ConnectionTimeoutSeconds int32 `json:"connection_timeout_seconds"`
TroubleshootingURL string `json:"troubleshooting_url"`
ShutdownScript string `json:"shutdown_script,omitempty"`
}

type WorkspaceAgentResourceMetadata struct {
Expand Down Expand Up @@ -132,6 +133,7 @@ type WorkspaceAgentMetadata struct {
StartupScript string `json:"startup_script"`
Directory string `json:"directory"`
MOTDFile string `json:"motd_file"`
ShutdownScript string `json:"shutdown_script"`
}

// AuthWorkspaceGoogleInstanceIdentity uses the Google Compute Engine Metadata API to
Expand Down
1 change: 1 addition & 0 deletions docs/templates.md
Original file line number Diff line number Diff line change
Expand Up @@ -336,6 +336,7 @@ practices:
- Manually connect to the resource and check the agent logs (e.g., `docker exec` or AWS console)
- The Coder agent logs are typically stored in `/var/log/coder-agent.log`
- The Coder agent startup script logs are typically stored in `/var/log/coder-startup-script.log`
- The Coder agent shutdown script logs are typically stored in `/var/log/coder-shutdown-script.log`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and previous) documentation seems wrong. We currently always store in /tmp (or rather, os tempdir), even if that isn't ideal.


## Template permissions (enterprise)

Expand Down
9 changes: 5 additions & 4 deletions docs/workspaces.md
Original file line number Diff line number Diff line change
Expand Up @@ -84,10 +84,11 @@ coder update <your workspace name> --always-prompt

Coder stores macOS and Linux logs at the following locations:

| Service | Location |
| ---------------- | ------------------------------- |
| `startup_script` | `/tmp/coder-startup-script.log` |
| Agent | `/tmp/coder-agent.log` |
| Service | Location |
| ----------------- | -------------------------------- |
| `startup_script` | `/tmp/coder-startup-script.log` |
| `shutdown_script` | `/tmp/coder-shutdown-script.log` |
| Agent | `/tmp/coder-agent.log` |

---

Expand Down
Loading