Skip to content

flake: TestProvisionerd/MaliciousTar #584

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
spikecurtis opened this issue Apr 15, 2025 · 1 comment · Fixed by coder/coder#17448
Closed

flake: TestProvisionerd/MaliciousTar #584

spikecurtis opened this issue Apr 15, 2025 · 1 comment · Fixed by coder/coder#17448
Assignees
Labels

Comments

@spikecurtis
Copy link

 === FAIL: provisionerd TestProvisionerd/MaliciousTar (0.00s)
    t.go:106: 2025-04-14 11:22:05.116 [debu]  provisionerd: dialing coderd
    t.go:106: 2025-04-14 11:22:05.116 [debu]  provisionerd: successfully connected to coderd
    t.go:106: 2025-04-14 11:22:05.116 [debu]  provisionerd: start of acquireAndRunOne
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: graceful acquire done  job_id=test  error=<nil>
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: acquired job  initiator_username=""  provisioner=someprovisioner  job_id=test
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd.runner: acquired job is template import  job_id=test  user_variable_values=[]
    t.go:106: 2025-04-14 11:22:05.117 [info]  test-provisioner: unpacking template source archive  session_id=309d435b-edb2-4ef9-900b-af87256b5c6b  size_bytes=1536
    t.go:106: 2025-04-14 11:22:05.117 [debu]  test-provisioner: read archive entry  session_id=309d435b-edb2-4ef9-900b-af87256b5c6b  name=../../../etc/  mod_time="1970-01-01T00:00:00Z"  size=0
    t.go:106: 2025-04-14 11:22:05.117 [debu]  test-provisioner: cleaned up work directory  session_id=309d435b-edb2-4ef9-900b-af87256b5c6b
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd.runner: sending FailedJob  job_id=test
    t.go:106: 2025-04-14 11:22:05.117 [info]  provisionerd: closing provisionerd
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: failing active job because of close
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd.runner: sent FailedJob  job_id=test
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: exiting acquire; provisionerd is closing
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: acquire loop exited
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: waiting for goroutines to exit
    t.go:106: 2025-04-14 11:22:05.117 [debu]  provisionerd: connect loop exited
    t.go:106: 2025-04-14 11:22:05.118 [debu]  provisionerd: closing server with error  error=<nil>
    provisionerd_test.go:1273: 
        	Error Trace:	/home/runner/work/coder/coder/provisionerd/provisionerd_test.go:1273
        	            				/home/runner/work/coder/coder/provisionerd/provisionerd_test.go:1213
        	            				/home/runner/work/coder/coder/provisionerd/proto/provisionerd_drpc.pb.go:197
        	            				/home/runner/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmux/handle_rpc.go:33
        	            				/home/runner/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:124
        	            				/home/runner/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:66
        	            				/home/runner/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:114
        	            				/home/runner/go/pkg/mod/storj.io/drpc@v0.0.33/drpcctx/tracker.go:35
        	            				/opt/hostedtoolcache/go/1.24.1/x64/src/runtime/asm_amd64.s:1700
        	Error:      	Received unexpected error:
        	            	context canceled
        	Test:       	TestProvisionerd/MaliciousTar
    t.go:106: 2025-04-14 11:22:05.120 [info]  provisionerd: attempting graceful shutdown
    t.go:106: 2025-04-14 11:22:05.120 [info]  provisionerd: gracefully shutdown
    t.go:106: 2025-04-14 11:22:05.120 [info]  provisionerd: closing provisionerd
    t.go:106: 2025-04-14 11:22:05.120 [debu]  provisionerd: waiting for first closer to complete
    t.go:106: 2025-04-14 11:22:05.120 [debu]  provisionerd: first closer completed

seen here: https://github.com/coder/coder/actions/runs/14444260419/job/40501111874

@spikecurtis spikecurtis self-assigned this Apr 15, 2025
@spikecurtis
Copy link
Author

dRPC is just racy in that it will sometimes return context canceled when sending a message even if it has successfully sent the message prior to canceling.

It does a flush routine that ends

return s.checkCancelError(errs.Wrap(s.wr.Flush()))

where

func (s *Stream) checkCancelError(err error) error {
	if s.sigs.cancel.IsSet() {
		return s.sigs.cancel.Err()
	}
	return err
}

spikecurtis added a commit to coder/coder that referenced this issue Apr 17, 2025
fixes coder/internal#584

Ignore canceled error when sending an acquired job, since dRPC is racy and will sometimes return this error even after successfully sending the job, if the test is quickly finished.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant