flake: TestPortForward/UDP_OnePort - EOF #11294

Closed
spikecurtis opened this issue Dec 20, 2023 · 0 comments · Fixed by #11306

@spikecurtis

    portforward_test.go:150: 
        	Error Trace:	/home/runner/actions-runner/_work/coder/coder/cli/portforward_test.go:351
        	            				/home/runner/actions-runner/_work/coder/coder/cli/portforward_test.go:336
        	            				/home/runner/actions-runner/_work/coder/coder/cli/portforward_test.go:150
        	Error:      	Received unexpected error:
        	            	EOF
        	Test:       	TestPortForward/UDP_OnePort
        	Messages:   	read payload

seen here: https://github.com/coder/coder/actions/runs/7226765867/job/19693041460#step:5:320

@spikecurtis spikecurtis self-assigned this Dec 20, 2023
@cdr-bot cdr-bot bot added the chore label Dec 20, 2023
spikecurtis added a commit that referenced this issue Jan 2, 2024
We're seeing some flaky tests related to agent connectivity - https://github.com/coder/coder/actions/runs/7286675441/job/19856270998

I'm pretty sure that in this run the client opened a connection while the wgengine was reconfiguring the wireguard device, so the peer becoming "active" as a result of that traffic went unnoticed.

The test calls `AwaitReachable()` but this only tests the disco layer, so it doesn't wait for wireguard to come up.

I think we should be using TSMP for pinging and reachability, since TSMP operates at the IP layer and therefore only succeeds once wireguard is up.

This should also help with the problems we have seen where a TCP connection starts before wireguard is up and the initial round trip has to wait for the 5-second wireguard handshake retry.

fixes: #11294