fix: agent disconnects from coordinator #7430

Merged
spikecurtis merged 4 commits into main from spike/coordinator-timeouts on May 5, 2023

spikecurtis
Contributor

fixes #7428

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis requested review from coadler and kylecarbs May 5, 2023 08:00
@spikecurtis spikecurtis changed the title work around websocket deadline bug fix: agent disconnects from coordinator May 5, 2023
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
sendClientNode(&tailnet.Node{})
clientNodes := <-agentNodeChan
require.Len(t, clientNodes, 1)

// wait longer than the internal wait timeout.
// this tests for regression of https://github.com/coder/coder/issues/7428
time.Sleep(tailnet.WriteTimeout * 3 / 2)
Member

Perhaps we should shorten the write timeout for tests so that we don't add a long delay? Even though we're running tests concurrently, if we have a lot of these it'll end up impacting test times.

Contributor Author

I mean, I could.

I could make the write timeout an option, and plumb it through. However, when we run these tests in parallel, it can slow them down quite a bit, so I wouldn't want to set the timer too short, lest we get a flaky test under load. Maybe 1 second? 500 ms?

The current sleep is 7.5s, and I'd be able to shave it down to maybe 750ms, which doesn't really feel worth the trouble to me.
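For reference, a minimal sketch of what "make the write timeout an option, and plumb it through" could look like, together with the sleep arithmetic above. The `coordinator` type, `Option`, `WithWriteTimeout`, `newCoordinator`, and `regressionSleep` are hypothetical names for illustration, not the actual tailnet API, and the 5s default is an assumption inferred from the 7.5s sleep being 1.5x the timeout.

```go
package main

import (
	"fmt"
	"time"
)

// defaultWriteTimeout is assumed to be 5s, inferred from the 7.5s sleep
// (1.5x the timeout) mentioned in this thread.
const defaultWriteTimeout = 5 * time.Second

// coordinator is a hypothetical stand-in, not the real tailnet coordinator.
type coordinator struct {
	writeTimeout time.Duration
}

// Option is a hypothetical functional option for the stand-in coordinator.
type Option func(*coordinator)

// WithWriteTimeout overrides the default write timeout, e.g. in tests.
func WithWriteTimeout(d time.Duration) Option {
	return func(c *coordinator) { c.writeTimeout = d }
}

// newCoordinator applies the options on top of the default timeout.
func newCoordinator(opts ...Option) *coordinator {
	c := &coordinator{writeTimeout: defaultWriteTimeout}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// regressionSleep mirrors the test's expression: 1.5x the write timeout.
func regressionSleep(c *coordinator) time.Duration {
	return c.writeTimeout * 3 / 2
}

func main() {
	prod := newCoordinator()
	short := newCoordinator(WithWriteTimeout(500 * time.Millisecond))
	fmt.Println(regressionSleep(prod))  // 7.5s
	fmt.Println(regressionSleep(short)) // 750ms
}
```

With a 500 ms timeout the test would sleep 750 ms instead of 7.5s, at the cost of threading the option through every constructor that builds a coordinator.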

Member

@mafredri mafredri May 5, 2023

To avoid plumbing, how about something like:

var WriteTimeout = func() time.Duration {
	if inTest() {
		return 1 * time.Second
	}
	return 5 * time.Second
}()

?

Maybe that'd be testutil.InTest() or just an inline check in this package. I think that'd be fine for now; at some point we may want a better way to track all the places where we modify behavior for tests.

Perhaps not ideal since this would affect all tests. I'll leave this up to you.
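One possible shape for the `inTest()` check, as a rough sketch only. It assumes the test binary keeps the default `go test` name (ending in `.test`, or `.test.exe` on Windows), which will not hold if the binary is built under a custom name. `inTest` and `pickWriteTimeout` are hypothetical helpers, not existing functions in this repository. Note that a `flag.Lookup("test.v")` check is likely unsuitable here: since Go 1.13 the testing flags are registered by `testing.Init`, which runs after package-level variables are initialized. On Go 1.21 and later, `testing.Testing()` may be a cleaner alternative.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// inTest is a hypothetical helper. It guesses whether the process is a
// `go test` binary from the executable name; this is a heuristic and an
// assumption, not a guarantee.
func inTest() bool {
	if len(os.Args) == 0 {
		return false
	}
	name := os.Args[0]
	return strings.HasSuffix(name, ".test") || strings.HasSuffix(name, ".test.exe")
}

// pickWriteTimeout returns the shorter timeout under test and the
// production timeout otherwise, using the values suggested above.
func pickWriteTimeout(underTest bool) time.Duration {
	if underTest {
		return 1 * time.Second
	}
	return 5 * time.Second
}

// WriteTimeout is evaluated once, during package initialization.
var WriteTimeout = pickWriteTimeout(inTest())

func main() {
	fmt.Println(WriteTimeout)
}
```

Splitting the decision into `pickWriteTimeout` keeps the timeout selection itself testable without depending on how the binary was launched.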

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis merged commit dc3d39b into main May 5, 2023
@spikecurtis spikecurtis deleted the spike/coordinator-timeouts branch May 5, 2023 16:29
@github-actions github-actions bot locked and limited conversation to collaborators May 5, 2023

Successfully merging this pull request may close these issues.

tailnet coordinator periodically disconnects clients & agents
4 participants