flake: agent/agentscripts TestTimeout: signal: hangup #329

Closed
johnstcn opened this issue Jan 29, 2025 · 7 comments · Fixed by coder/coder#17293 or coder/coder#17300

Comments

@johnstcn
Member

Seen here: https://github.com/coder/coder/actions/runs/13030093986/job/36347281581

=== FAIL: agent/agentscripts TestTimeout (0.10s)
    t.go:106: 2025-01-29 11:15:43.271 [info]  initializing agent scripts  script_count=1  log_dir=/tmp/TestTimeout57977522/001
    t.go:106: 2025-01-29 11:15:43.271 [info]  running agent script  log_source_id=7e7e3b20-dd73-43ba-916d-f91cfc17c62b  log_path=/tmp/TestTimeout57977522/001/coder-script-7e7e3b20-dd73-43ba-916d-f91cfc17c62b.log  script_data_dir=/tmp/TestTimeout57977522/002/coder-script-data/7e7e3b20-dd73-43ba-916d-f91cfc17c62b  script="sleep infinity"
    t.go:106: 2025-01-29 11:15:43.274 [warn]  /tmp/TestTimeout57977522/001/coder-script-7e7e3b20-dd73-43ba-916d-f91cfc17c62b.log script failed  log_source_id=7e7e3b20-dd73-43ba-916d-f91cfc17c62b  log_path=/tmp/TestTimeout57977522/001/coder-script-7e7e3b20-dd73-43ba-916d-f91cfc17c62b.log  script_data_dir=/tmp/TestTimeout57977522/002/coder-script-data/7e7e3b20-dd73-43ba-916d-f91cfc17c62b  execution_time=3.356ms  exit_code=-1  error="signal: hangup"
    agentscripts_test.go:112: 
        	Error Trace:	/home/runner/work/coder/coder/agent/agentscripts/agentscripts_test.go:112
        	Error:      	Target error should be in err chain:
        	            	expected: "script timed out"
        	            	in chain: "run agent script \"7e7e3b20-dd73-43ba-916d-f91cfc17c62b\": signal: hangup"
        	            		"signal: hangup"
        	Test:       	TestTimeout
@johnstcn johnstcn added the flake label Jan 29, 2025
@johnstcn
Member Author

johnstcn commented Jan 29, 2025

I'm not able to reproduce this easily. Possibly something on the runner sent a SIGHUP to the process?

The Go docs state:

"Of the asynchronous signals, the SIGHUP signal is sent when a program loses its controlling terminal."

We could possibly switch the agent script runner out to use an agentexec.Execer?

@johnstcn
Member Author

Opened coder/coder#16324 to at least add some more logging around this so we know if we did this ourselves.

johnstcn added a commit to coder/coder that referenced this issue Jan 29, 2025
Relates to coder/internal#329

It's currently unclear where the SIGHUP came from; adding some logging
to make it more clear if it happens again in future.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
@johnstcn johnstcn self-assigned this Jan 29, 2025
aslilac pushed a commit to coder/coder that referenced this issue Jan 29, 2025
Relates to coder/internal#329

It's currently unclear where the SIGHUP came from; adding some logging
to make it more clear if it happens again in future.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
@johnstcn
Member Author

Closing this out as it hasn't cropped up in a while.

@spikecurtis spikecurtis reopened this Apr 7, 2025
@spikecurtis

@johnstcn
Member Author

johnstcn commented Apr 8, 2025

This looks like a side-effect of an overloaded runner based on the error log:

2025-04-07T06:48:55.1249925Z     t.go:106: 2025-04-07 06:47:10.671 [warn]  /tmp/TestTimeout496770596/001/coder-script-005d748a-c60f-4e90-8c8f-596d50f109f5.log script failed  log_source_id=005d748a-c60f-4e90-8c8f-596d50f109f5  log_path=/tmp/TestTimeout496770596/001/coder-script-005d748a-c60f-4e90-8c8f-596d50f109f5.log  script_data_dir=/tmp/TestTimeout496770596/002/coder-script-data/005d748a-c60f-4e90-8c8f-596d50f109f5  execution_time=4.634ms  exit_code=-1  error="signal: hangup"

Note that it took more than 4ms to return an exit code of -1, while the test at the time of writing is configured with a timeout of only 1ms:

	err := runner.Init([]codersdk.WorkspaceAgentScript{{
		LogSourceID: uuid.New(),
		Script:      "sleep infinity",
		Timeout:     time.Millisecond,
	}}, aAPI.ScriptCompleted)

From exec_posix.go:

// ExitCode returns the exit code of the exited process, or -1
// if the process hasn't exited or was terminated by a signal.
func (p *ProcessState) ExitCode() int {
	// return -1 if the process hasn't started.
	if p == nil {
		return -1
	}
	return p.status.ExitStatus()
}

So the sleep infinity process never even started before the test failed.

johnstcn added a commit to coder/coder that referenced this issue Apr 8, 2025
Fixes coder/internal#329

This was due to a race between the process starting and the timeout of
the agent startup script executor. I'm taking the 'lazy' route here and
increasing the timeout to 100ms. This does technically mean that this
makes the test 100 times longer to execute. However, if it takes more
than 100ms to run a `sleep infinity` command on our test runner, I think
we have other issues.
@ethanndickson
Member

ethanndickson commented Apr 9, 2025

Seen again on a new branch off main: https://github.com/coder/coder/actions/runs/14350318934/job/40227755031

     agentscripts_test.go:114: 
        	Error Trace:	/Users/runner/work/coder/coder/agent/agentscripts/agentscripts_test.go:114
        	Error:      	Target error should be in err chain:
        	            	expected: "script timed out"
        	            	in chain: "run agent script \"ddb9f9e9-548c-4981-9a57-59f91734d1c0\": exit status 1"
        	            		"exit status 1"
        	Test:       	TestTimeout

@ethanndickson ethanndickson reopened this Apr 9, 2025
@johnstcn
Member Author

johnstcn commented Apr 9, 2025

Skipping this test on macOS for now. coder/coder#17300
