Skip to content

test: add logging to TestWorkspaceActivityBump/Dial #15089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 16, 2024

Conversation

Emyrk
Copy link
Member

@Emyrk Emyrk commented Oct 15, 2024

My initial notes: coder/internal#102 (comment)

I am a bit split on how best to solve this. It is a classic problem of what is now.

At it's core, what is happening is:

  1. Workspace activity bump happens
  2. Detect bump
  3. Take now
  4. Assert deadeline-now=1hr

For an unknown reason, Detect Bump seems to have taken 10s in the flake failure. So now has drifted 10s later then step 1.

Our time tolerance is 10s, so the test fails.

Without more logs, it's difficult to know for certain. What makes this more confusing, is the Detect bump should detect when the bump happens, meaning now should be relative to when this loop unblocks.

So the implementation of taking now after the loop is correct.

What did I do then?

I added logging information to output timing information of this for loop. If we get another failure, this timing information should be helpful.

I also made the time drift allowed (was 10s) to match the maximum waiting time of the for loop. It's not perfect, but now the loop should take max ~15s, and the time comparison will allow 15s leeway.


Just a note that I do not believe there is any bugs in the production code. Looking at the failure logs, the activity bump works as expected.

If a test is running slow, the workspace deadline check can be
slow. Since we use 'now' as a reference point, any time drift
will affect the results.
@Emyrk Emyrk changed the title chore: add log statements to detect a slow test check test: add logging to TestWorkspaceActivityBump/Dia Oct 15, 2024
Emyrk added 2 commits October 15, 2024 12:12
If the hypothesis of a 10s time drift is true, this loop might be
acting slow.
waitedFor := time.Now()
// lastChecked is for logging within the Eventually loop.
// Debouncing log lines to every second to prevent spam.
lastChecked := time.Time{}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made to time.Time so the first loop is always logged.

@Emyrk Emyrk changed the title test: add logging to TestWorkspaceActivityBump/Dia test: add logging to TestWorkspaceActivityBump/Dial Oct 15, 2024
@Emyrk Emyrk requested a review from spikecurtis October 15, 2024 21:27
@Emyrk Emyrk merged commit 774c9dd into main Oct 16, 2024
29 checks passed
@Emyrk Emyrk deleted the stevenmasley/activitybump_flake branch October 16, 2024 15:01
@github-actions github-actions bot locked and limited conversation to collaborators Oct 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants