chore: retry TestAgent_Dial subtests #19387

deansheather · 2025-08-18T05:12:34Z

Adds a new wrapper function testutil.RunRetry that will run the provided function multiple times until the test succeeds or the limit is reached. To accomplish this without failing the parent test, we use a fake testing.TB implementation that swallows failures until the final attempt.

Updates the TestAgent_Dial subtests to use this new wrapper. I believe the failures are coming from dropped UDP packets due to high load on the CI runner.

Closes coder/internal#595
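A minimal sketch of the retry idea described above, not the code from this PR. The names `runRetry`, `attemptT`, and the `logf` parameter are hypothetical. The real helper is `testutil.RunRetry`, which wraps a fake `testing.TB` and passes failures through on the final attempt; this sketch instead returns a bool and leaves failing the parent test to the caller.

```go
package main

import (
	"runtime"
	"sync"
)

// attemptT is a hypothetical stand-in for the fake testing.TB: it records
// failures for a single attempt instead of failing the parent test.
type attemptT struct {
	mu     sync.Mutex
	failed bool
	logf   func(format string, args ...any)
}

// Errorf logs the message and marks this attempt as failed.
func (a *attemptT) Errorf(format string, args ...any) {
	a.logf("attempt failure: "+format, args...)
	a.mu.Lock()
	defer a.mu.Unlock()
	a.failed = true
}

// Fatalf marks the attempt as failed and stops the attempt's goroutine,
// mirroring how the stdlib's FailNow uses runtime.Goexit.
func (a *attemptT) Fatalf(format string, args ...any) {
	a.Errorf(format, args...)
	runtime.Goexit()
}

// Failed reports whether this attempt recorded a failure.
func (a *attemptT) Failed() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.failed
}

// runRetry runs fn up to attempts times and reports whether any attempt
// passed. Each attempt runs in its own goroutine so that Fatalf can end
// the attempt with runtime.Goexit without ending the caller.
func runRetry(logf func(format string, args ...any), attempts int, fn func(t *attemptT)) bool {
	for i := 1; i <= attempts; i++ {
		at := &attemptT{logf: logf}
		done := make(chan struct{})
		go func() {
			defer close(done) // deferred calls still run after Goexit
			fn(at)
		}()
		<-done
		if !at.Failed() {
			return true
		}
		logf("attempt %d/%d failed", i, attempts)
	}
	return false
}
```

In a real test, the caller could pass `t.Logf` as `logf` and call `t.Fatal` only when `runRetry` returns false, so earlier failed attempts never mark the parent test as failed.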

ethanndickson

Your flake hypothesis sounds very plausible, and this solution seems fine 👍 Just two minor comments.

testutil/t.go

mafredri

I think having the fakeT implementation will be a very useful addition to the testing package, thanks!

Although, I wonder if it's the right solution here. We could also fake the network stack instead, but at the same time we obviously lose a bit of realism. So just to be clear, and considering that, I'm fine with this solution.

I left a few suggestions, and I think we should definitely change the ctx passing in RunRetry, but the other part of that suggestion is optional/for your consideration.

testutil/retry.go

mafredri · 2025-08-18T07:49:12Z

testutil/retry.go

+	t.mu.Lock()
+	defer t.mu.Unlock()
+	t.failed = true
+	t.T.Log("WARN: t.Fail called in testutil.RunRetry closure")


Suggestion: We could give a hint here, such as "rewrite the test with an error and early return if needed".

deansheather

I'm not 100% sure what you mean by this comment. I refactored the handler a bit though, so let me know if you still want changes.

mafredri

I simply meant that the "WARN: XXX called in testutil.RunRetry closure" messages may be a bit confusing when followed by runtime.Goexit. Adding a short tip on how the user could rewrite their retrying test may be beneficial. But feel free to ignore.

deansheather

Hmmm, in the stdlib testing package I don't think failing logs anything at all, so this is most certainly an improvement over that at least. Other than the added log message, this now matches the behavior of stdlib.

I think let's just merge this, and if anyone gets confused we can change the logs very easily in the future.

testutil/retry.go

mafredri

Looks great 👍🏻

deansheather requested review from mafredri and ethanndickson August 18, 2025 05:12

github-actions bot assigned deansheather Aug 18, 2025

ethanndickson approved these changes Aug 18, 2025

testutil/t.go (two outdated review threads)

PR comments

fc53b59

deansheather requested a review from ethanndickson August 18, 2025 06:32

fixup! PR comments

a6acb60

ethanndickson approved these changes Aug 18, 2025

mafredri reviewed Aug 18, 2025

deansheather added 2 commits August 18, 2025 13:36

PR comments

6df4273

PR comments 2

f19a7f5

deansheather requested a review from mafredri August 18, 2025 13:40

mafredri approved these changes Aug 18, 2025

deansheather enabled auto-merge (squash) August 18, 2025 13:45

deansheather merged commit e2ba9e7 into main Aug 18, 2025
30 checks passed

deansheather deleted the dean/flake-agent-dial branch August 18, 2025 13:51

github-actions bot locked and limited conversation to collaborators Aug 18, 2025
