feat: add support for `coder_script` #9584

kylecarbs · 2023-09-07T17:57:22Z

This allows users to define separate log streams for different scripts, and to customize their names and icons.

The diff here is massive, but about 90% of it is boilerplate for piping a significant amount of new data through to the workspace agent. The only real refactor is how the workspace agent handles scripts.

As it stands, this breaks any customers who manually pipe logs through our external API, so I'll fix that before merging. I think I'll use a hard-coded UUID for an external source, and call it "External" for now... then eventually our external tools can catch up.

kylecarbs · 2023-09-07T20:37:11Z

@mafredri I'm going to just make the FE work as it did before, and then I believe this could be ready to merge. It's obviously massive, but I think there's a very low chance of regression here. I manually tested with an old version of the provider as well, and everything just works.

mafredri

Big fan of the changes in this PR, nice work!

I identified a few issues: building and tests seem broken, and I think we'll introduce two breaking changes:

- The startup blocking behavior for old clients will break.
- Old agents won't be able to send logs, or may have issues communicating with coderd (didn't verify, just a hunch).

agent/agentscripts/agentscripts.go

mafredri · 2023-09-08T14:00:01Z

agent/agentscripts/agentscripts.go

+	if script.Timeout > 0 {
+		var cancel context.CancelFunc
+		// Add a buffer to forcefully kill with the context.
+		ctx, cancel = context.WithTimeout(ctx, script.Timeout+(3*time.Second))


This may require a rethink. We're passing the agent context in New and using it here (r.ctx). This means that when the agent is interrupted, all scripts will be terminated, and shutdown scripts won't be able to run.

Currently in (*agent).Close we use a background context as the base, not the agent context, so I don't think it's a good idea to tie the scripts context and the agent context together. Instead we should rely on New and Close to perform the appropriate startup/teardown. In (*agent).Close, the Execute (stop) should run for as long as it needs to, or until the script timeout is hit (which will interrupt + wait X seconds + kill).

PS. I think the 3-second buffer could be bumped to 5 or 10 seconds at least.

coderd/provisionerdserver/provisionerdserver.go

coderd/workspaceagents.go

codersdk/workspaceagents.go

provisioner/terraform/resources.go

mafredri · 2023-09-08T14:43:14Z

cli/ssh.go

-					wait = false
-				default:
-					return xerrors.Errorf("unknown startup script behavior %q", workspaceAgent.StartupScriptBehavior)
+				for _, script := range workspaceAgent.Scripts {


Seeing this change, I believe old clients connecting to coderd will be broken by this since the API no longer includes the startup script behavior.

I'll fix this!

@mafredri do you think I should just include the field and leave it as a static value to not break old clients?

@kylecarbs I'm assuming static would be non-blocking? For anyone using blocking I suppose it would still be a breaking change. But I'm not sure it's worth including so much legacy to keep piping it to the DB, so I think it's acceptable. But if we leave it like that, I think we should mention it in the release notes.

bpmct · 2023-09-23T09:11:28Z

This could be a separate PR, but thoughts on hiding these by default from the resources list on the dashboard?

mafredri

I left a few comments on the code, but other than that I also noticed one other issue:

I have two startup scripts, one blocks login and both have timed out. The workspace has entered start_timeout lifecycle but the CLI still thinks the scripts are running.

❯ CODER_SSH_WAIT=no ssh coder.test
👋 Your workspace is outdated! Update it here: http://127.0.0.1:3000/@admin/test

==> ⧗ Running workspace agent startup scripts (non-blocking)
2023-09-25 18:19:23.047Z Startup Script 2: Hello, world 2!
2023-09-25 18:19:23.047Z Startup Script: Hello, world!
Notice: The startup scripts are still running and your workspace may be incomplete.

And this shows that the lifecycle has indeed changed (also visible in logs, but didn't attach those).

❯ ./scripts/coder-dev.sh list -o json | grep lifecycle
              "lifecycle_state": "start_timeout",

Both the CLI from this PR and current main behave the same way. I took a quick look but didn't see why this would be.

Finally, one more wish:

It would be nice to differentiate scripts in the UI when no icon has been configured. Currently a hover is required, which isn't practical for thousands of rows. Perhaps we could have an array of about 10 colors that we assign to scripts based on their order or name. I say a fixed set of 10 because picking colors randomly seems risky (a red one could feel like an error).
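One way to get stable colors from a fixed set is to hash the script name and index into the palette. This is only a sketch of the idea: the palette values and the function `colorFor` are hypothetical, and the real change may well live in the frontend rather than in Go.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// palette is a hypothetical set of 10 colors that deliberately avoids red,
// so that no script's logs look like errors.
var palette = []string{
	"#1f77b4", "#2ca02c", "#9467bd", "#8c564b", "#17becf",
	"#7f7f7f", "#bcbd22", "#e377c2", "#3182bd", "#31a354",
}

// colorFor maps a script name to a palette entry. The same name always
// yields the same color because the choice depends only on the name's hash.
func colorFor(name string) string {
	h := fnv.New32a()
	_, _ = h.Write([]byte(name)) // hash.Hash writes never return an error
	return palette[h.Sum32()%uint32(len(palette))]
}

func main() {
	for _, name := range []string{"Startup Script", "Startup Script 2"} {
		fmt.Printf("%s -> %s\n", name, colorFor(name))
	}
}
```

Two different names can still land on the same color with only 10 slots; assigning by order instead of by hash would avoid that for the first 10 scripts.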

agent/agentscripts/agentscripts.go

mafredri · 2023-09-25T18:35:11Z

agent/agentscripts/agentscripts_other.go

+
+func cmdCancel(cmd *exec.Cmd) func() error {
+	return func() error {
+		return syscall.Kill(-cmd.Process.Pid, syscall.SIGINT)


When e.g. bash is executing a script, SIGINT gets captured and passed to the command currently being executed. That command exits, and the script then moves on to the next command, essentially circumventing what we're trying to do here.

Changing this to SIGHUP will work better for what we're trying to do and will ensure everything is canceled. This is what a terminal typically sends to a program when the attached terminal goes away, so seems alright.

The downside is that perhaps it's more common to have handling for SIGINT, in which case this could cause a dirty exit for some script authors. But we can just document this behavior instead so it's a non-issue and can still be handled.

That makes sense. I appreciate the thorough thought here! 😍
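For reference, a small sketch of the SIGHUP variant discussed above. It is Unix-only (it relies on `syscall.Kill` and `Setpgid`) and assumes `sh` is available. The helper names `startInGroup` and `hangUpGroup` are hypothetical. The script is started in its own process group so that a negative PID targets the shell and every command it has spawned.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startInGroup starts a shell script as the leader of a new process group,
// so the group ID equals the shell's PID.
func startInGroup(script string) (*exec.Cmd, error) {
	cmd := exec.Command("sh", "-c", script)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// hangUpGroup sends SIGHUP to the whole process group. A negative PID
// addresses the group rather than a single process.
func hangUpGroup(cmd *exec.Cmd) error {
	return syscall.Kill(-cmd.Process.Pid, syscall.SIGHUP)
}

func main() {
	// Two commands in sequence: the second should never get to run.
	cmd, err := startInGroup("sleep 30; sleep 30")
	if err != nil {
		panic(err)
	}
	if err := hangUpGroup(cmd); err != nil {
		panic(err)
	}
	// Wait reports how the shell ended (typically killed by the signal).
	fmt.Println("script ended:", cmd.Wait())
}
```

Script authors who need a clean exit can still install a handler for SIGHUP in their script, which is the behavior that would need documenting.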

mafredri

Pre-approving, this really turned out nicely btw, awesome job!

kylecarbs · 2023-09-25T21:33:31Z

@mafredri I did the deterministic colors - that was a great suggestion!

kylecarbs added 6 commits September 3, 2023 17:53

Add basic migrations

7199651

Merge branch 'main' into execscripts

51b0079

Improve schema

c18a401

Merge branch 'main' into execscripts

9ae6e62

Refactor agent scripts into it's own package

70ebaf3

Support legacy start and stop script format

89c7af1

kylecarbs self-assigned this Sep 7, 2023

kylecarbs added 7 commits September 7, 2023 18:24

Pipe the scripts!

d5df133

Finish the piping

58964c9

Fix context usage

00a4e73

It works!

942fde6

Fix sql query

92dedad

Fix SQL query

7cf6f0c

Rename LogSourceID -> SourceID

5b6f264

kylecarbs requested a review from mafredri September 7, 2023 20:36

kylecarbs added 4 commits September 7, 2023 20:40

Fix the FE

e2c9f91

Merge branch 'main' into execscripts

6fab755

fmt

51e08f4

Rename migrations

9a38131

kylecarbs force-pushed the execscripts branch from c981da9 to 9a38131 Compare September 7, 2023 20:51

kylecarbs added 4 commits September 8, 2023 13:51

Fix log tests

c0fac6b

Fix lint err

f7f1c7a

Fix gen

f0a8f53

Fix story type

66f9185

mafredri reviewed Sep 8, 2023

kylecarbs added 4 commits September 13, 2023 17:47

Rename source to script

78f01d1

Fix schema jank

75388f7

Uncomment test

8810326

Rename proto to TimeoutSeconds

45b395e

kylecarbs and others added 17 commits September 24, 2023 20:17

Fix agent leaking script process

dd5abdf

Fix migrations

9e85d7b

Merge branch 'main' into execscripts

c26a01b

Fix stories

9513acf

Merge branch 'main' into execscripts

2e3611b

Fix duplicate logs appearing

e8b1e43

Merge branch 'execscripts' of https://github.com/coder/coder into exe…

ee1fe11

…cscripts

Gen

b837aac

Fix log location

f1ff5cc

Fix tests

4ec3a87

Fix tests

d36ab53

Fix log output

eeddb52

Show display name in output

aa5540b

Fix print

f866a92

Return timeout on start context

a99b6dd

Gen

1865590

Fix fixture

3b26aa0

mafredri reviewed Sep 25, 2023

mafredri approved these changes Sep 25, 2023

kylecarbs added 6 commits September 25, 2023 19:03

Fix the agent status

f2f69bb

Fix startup timeout msg

aa68796

Fix command using shared context

73a7a78

Fix timeout draining

d1f4963

Change signal type

784e616

Add deterministic colors to startup script logs

7ac782b

kylecarbs merged commit 1262eef into main Sep 25, 2023

kylecarbs deleted the execscripts branch September 25, 2023 21:47

github-actions bot locked and limited conversation to collaborators Sep 25, 2023
