Skip to content

feat: bump workspace deadline on user activity #4119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 20, 2022
Merged

feat: bump workspace deadline on user activity #4119

merged 10 commits into from
Sep 20, 2022

Conversation

ammario
Copy link
Member

@ammario ammario commented Sep 19, 2022

TODO:

  • Document behavior in the WorkspaceScheduleForm and the Workspace Schedule popout
  • Test in coderd
  • Keep bumping for duration of connection
    • Bump async, it shouldn't hold up a request from responding

@ammario ammario requested a review from a team as a code owner September 20, 2022 03:24
@ammario ammario requested review from jsjoeio and removed request for a team September 20, 2022 03:24
@ammario ammario linked an issue Sep 20, 2022 that may be closed by this pull request
@kylecarbs kylecarbs requested a review from mafredri September 20, 2022 12:40
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice feature! 🤩 Some small things here and there but looks good in general. 👍🏻

build, err := s.GetLatestWorkspaceBuildByWorkspaceID(ctx, workspace.ID)
log.Debug(ctx, "build", slog.F("build", build))
if errors.Is(err, sql.ErrNoRows) {
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't want to surface an error? It seems like an inconsistency to me that a workspace is being bumped but doesn't exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the bump occurs asynchronously, there is no guarantee the workspace exists when this line executes. But, that is not an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense 👍🏻. Should we btw consider moving the log above to blow the error handling?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was an artifact from testing so I will remove.


return client, workspace, func(want bool) {
if !want {
time.Sleep(testutil.IntervalMedium)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not trigger a CI flake currently, but its behavior is racy. We can't guarantee that the goroutine has run to completion here so we also can't guarantee we're testing the correct thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it sucks, but I couldn't think of a clean alternative.

Copy link
Member

@mafredri mafredri Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option would be dependency injection, e.g. store the activity bumping method on coderd.API, replace it with a wrapper in the test that can inform e.g. on a channel when it has run.

This would be a bit cleaner as an internal test so that nothing needs to be exported.

Another idea: I'm not sure what we're sending on the websocket, but could we write a payload informing the other side about what we did (e.g. event: bumped workspace)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option would be dependency injection, e.g. store the activity bumping method on coderd.API, replace it with a wrapper in the test that can inform e.g. on a channel when it has run.

In this particular test we're looking for an absence of a call. So, it's not as simple as replacing the activity bump method. We should also strive to test behavior from the user's perspective. The more we dependency inject and replace internals in our tests the further away we're getting from the user. We may be better off accepting a little bit of raciness for more test accuracy here.

This would be a bit cleaner as an internal test so that nothing needs to be exported.

I, too, enjoy internal tests, but that is a codebase decision larger than this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explanatory comment to the code. Let me know your thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular test we're looking for an absence of a call.

I thought we would be testing for the presence of a call, but the absence of modification? What’s the purpose of testing for the absence of a call?

The only case that the call wouldn’t happen is if the updateDB variable is false which seems like a detail that shouldn’t be tested in this test. If we really need to test that case then we should try to expose that some way so we don’t have racy tests.

In my personal opinion we should avoid test racyness as much as possible because those lead to CI flakes that in turn result in loss of developer productivity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, since updateDB is false, there be no call. Let's say we restructure the code so it always calls a replaceable function. In that case, there would still be raciness because we're looking to assert that the bump never happens. We can't wait for infinity, so we'd have to wait for some period or some event. We could wait for X ticks of the Agent Stat Loop, but there's still raciness.

Copy link
Contributor

@jsjoeio jsjoeio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE ✅

@ammario ammario requested a review from mafredri September 20, 2022 16:52
@bpmct
Copy link
Member

bpmct commented Sep 20, 2022

Does this only bump when a request occurs or is WebSocket activity accounted for? For example, my code-server IDE is open and I am typing in it but I do not refresh the page for an hour.

Testing this rn with 3 workspaces: code-server, tunnel/SSH, and web terminal

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see the test racyness amended in some way, but I'll leave that up to you. That aside, PR looks good.

@ammario
Copy link
Member Author

ammario commented Sep 20, 2022

Does this only bump when a request occurs or is WebSocket activity accounted for? For example, my code-server IDE is open and I am typing in it but I do not refresh the page for an hour.

Testing this rn with 3 workspaces: code-server, tunnel/SSH, and web terminal

It uses the same logic as the Last Seen / DAU features, so it should register long lived connections.

@ammario ammario enabled auto-merge (squash) September 20, 2022 19:30
@ammario ammario merged commit d30945c into main Sep 20, 2022
@ammario ammario deleted the activity-stop branch September 20, 2022 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Activity Based Auto-stop
4 participants