fix(cli): port-forward: update workspace last_used_at #12659
Conversation
Mostly nits & small {que,sugge}stions
Very clean! LGTM
Co-authored-by: Danny Kopping <danny@coder.com>
High-level review first
coderd/workspaceusage/tracker.go (outdated)
	LastUsedAt: now,
	IDs:        ids,
}); err != nil {
	wut.log.Error(ctx, "failed updating workspaces last_used_at", slog.F("count", count), slog.Error(err))
Is there any retry mechanism if the batch fails?
There is a built-in assumption here that any failure of a single batch can be ignored as it gets retried frequently. I suppose consecutive failures should surface some form of error somewhere. I wonder if the database health page could show a summary of queries that have failed multiple consecutive executions?
There is a built-in assumption here that any failure of a single batch can be ignored as it gets retried frequently. I suppose consecutive failures should surface some form of error somewhere
I might have missed it, but it would be cool to leave a comment stating this.
Done
Nice!
Awesome 👍
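To make the agreed-upon behaviour concrete, here is a minimal, self-contained sketch of the log-and-move-on flush discussed in this thread. Every name in it (usageTracker, updateBatch, mark) is an illustrative stand-in rather than the code in this PR; updateBatch plays the role of BatchUpdateWorkspacesLastUsedAt.

package flushsketch

import (
	"context"
	"log"
	"sync"
	"time"
)

// usageTracker is a stand-in for the tracker added in this PR.
type usageTracker struct {
	mu      sync.Mutex
	pending map[string]struct{} // workspace IDs marked as used since the last flush

	// updateBatch stands in for the real store call (BatchUpdateWorkspacesLastUsedAt).
	updateBatch func(ctx context.Context, ids []string, now time.Time) error
}

// mark records that a workspace was used; it will be included in the next flush.
func (t *usageTracker) mark(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.pending[id] = struct{}{}
}

// flush writes all pending IDs in one batch. A failed batch is logged and
// dropped rather than retried: workspaces that are still in use will be
// marked again before the next tick, so the update happens then instead.
func (t *usageTracker) flush(ctx context.Context, now time.Time) {
	t.mu.Lock()
	ids := make([]string, 0, len(t.pending))
	for id := range t.pending {
		ids = append(ids, id)
	}
	t.pending = make(map[string]struct{})
	t.mu.Unlock()

	if len(ids) == 0 {
		return
	}
	if err := t.updateBatch(ctx, ids, now); err != nil {
		log.Printf("failed updating workspaces last_used_at: count=%d err=%v", len(ids), err)
	}
}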
cli/portforward_test.go (outdated)
db, ps = dbtestutil.NewDB(t)
wuTick = make(chan time.Time)
wuFlush = make(chan int, 1)
wut = workspaceusage.New(db, workspaceusage.WithFlushChannel(wuFlush), workspaceusage.WithTickChannel(wuTick))
wat
// the number of marked workspaces every time Tracker flushes.
// For testing only and will panic if used outside of tests.
func WithFlushChannel(c chan int) Option {
	if flag.Lookup("test.v") == nil {
Nice strictness check! Feels like this could be a testutil func, even if obvious.
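A sketch of what such a shared helper could look like (a hypothetical testutil function; the name is made up and nothing like it is claimed to exist in the repo):

package testutil

import "flag"

// MustBeInTest panics unless the binary was built by `go test`. The check
// works because the testing package registers the test.v flag only in test
// binaries.
func MustBeInTest() {
	if flag.Lookup("test.v") == nil {
		panic("developer error: this may only be used in tests")
	}
}

Options like WithFlushChannel could then call this helper instead of repeating the flag.Lookup check inline.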
coderd/workspaceusage/tracker.go (outdated)
// Calling Loop after Close() is an error.
select {
case <-wut.doneCh:
	panic("developer error: Loop called after Close")
If we need to do this instead of just returning, I'd prefer returning an error here; I wouldn't want this to happen on a production instance after some changes to the code.
Feels like this mostly exists for tests (otherwise it could be launched from New()), so all the more reason not to add a possibility of panic.
If we return an error, then we have two options:
1) Exit with the error, which brings us back to potentially impacting prod, but just in a nicer-looking way.
2) Just log the error and do nothing, which means the error will likely pass unnoticed until folks wonder why their workspace usage isn't being tracked.
I don't think 2) is a valid option. We're not guaranteed to catch it in tests, especially because of the possibility of ignoring errors in slogtest.
There's an alternative possibility 1a) where we check the error and recreate the tracker if it's non-nil. WDYT?
I forgot about option c): have New() start the loop and unexport the function.
Now it's un-possible!
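For illustration, a simplified sketch of that shape (not the actual tracker code): New starts the unexported loop itself, and Close stops it, so there is no exported Loop left to call at the wrong time.

package trackersketch

import (
	"context"
	"sync"
	"time"
)

type Tracker struct {
	tick      *time.Ticker
	doneCh    chan struct{}
	closeOnce sync.Once
	flush     func(ctx context.Context, now time.Time) // stands in for the batch update
}

func New(interval time.Duration, flush func(context.Context, time.Time)) *Tracker {
	t := &Tracker{
		tick:   time.NewTicker(interval),
		doneCh: make(chan struct{}),
		flush:  flush,
	}
	// loop is unexported and started here, so it cannot be called after Close.
	go t.loop()
	return t
}

func (t *Tracker) loop() {
	defer t.tick.Stop()
	for {
		select {
		case <-t.doneCh:
			return
		case now := <-t.tick.C:
			t.flush(context.Background(), now)
		}
	}
}

func (t *Tracker) Close() {
	t.closeOnce.Do(func() { close(t.doneCh) })
}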
coderd/coderdtest/coderdtest.go (outdated)
// Did last_used_at not update? Scratching your noggin? Here's why.
// Workspace usage tracking must be triggered manually in tests.
// The vast majority of existing tests do not depend on last_used_at
// and adding an extra background goroutine to all existing tests may
// lead to future flakes and goleak complaints.
// To do this, pass in your own WorkspaceUsageTracker like so:
//
//   db, ps = dbtestutil.NewDB(t)
//   wuTick = make(chan time.Time)
//   wuFlush = make(chan int, 1)
//   wut = workspaceusage.New(db, workspaceusage.WithFlushChannel(wuFlush), workspaceusage.WithTickChannel(wuTick))
//   client = coderdtest.New(t, &coderdtest.Options{
//     WorkspaceUsageTracker: wut,
//     Database: db,
//     Pubsub: ps,
//   })
//
// See TestPortForward for how this works in practice.
Lol, love the personality in this comment 😄
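As a follow-on to the quoted comment, here is a self-contained sketch of the test pattern it describes: the test owns the tick channel, so it decides when a flush happens, and it reads from the flush channel to know the flush has finished before asserting anything. The tracker here is a stand-in, not workspaceusage.Tracker.

package trackertest

import (
	"testing"
	"time"
)

// loop is a stand-in for the tracker's internal loop: every tick it
// "flushes" (the real code batch-updates last_used_at) and reports the
// number of flushed workspaces on the flush channel.
func loop(tick <-chan time.Time, flush chan<- int, done <-chan struct{}) {
	for {
		select {
		case <-done:
			return
		case <-tick:
			flush <- 1
		}
	}
}

func TestManualFlush(t *testing.T) {
	tick := make(chan time.Time)
	flush := make(chan int, 1)
	done := make(chan struct{})
	defer close(done)

	go loop(tick, flush, done)

	// Instead of waiting for a real ticker, the test fires the tick itself...
	tick <- time.Now()
	// ...and blocks here until the flush has happened, so assertions about
	// last_used_at only run after the update.
	if count := <-flush; count == 0 {
		t.Fatal("expected at least one workspace to be flushed")
	}
}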
Fixes #12431

For the CLI to report workspace usage, we currently depend on the agent sending a stats report with a non-zero connection count. This is sourced from SetConnStatsCallback (tailnet/conn.go:621).

The problem with this approach is that the tailscale connection tracking only appears to track "connection established" events, and is not equivalent to running e.g. ss -plunt. This means that traffic over already-established connections does not register as ongoing usage.

This PR updates the coder port-forward command to periodically inform coderd that the workspace is being used:
- adds workspaceusage.Tracker, which periodically batch-updates workspace LastUsedAt,
- updates coder port-forward to periodically hit this endpoint,
- modifies BatchUpdateWorkspacesLastUsedAt to avoid overwriting with stale data.

In later PRs, we can:
- update coder ssh to also use this behaviour,
- modify ActivityBumpWorkspace so that last_used_at and deadline are handled in the same place.

Follow-ups: nothing currently ensures that only one coder process on a user's workstation is updating the workspace at a time. My assumption here is that most users will have at most one or two port-forwards running at a time.
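To make the client-side half concrete, a rough sketch of the periodic reporting described above (illustrative only: reportUsage is a placeholder for the real API call, and the interval is arbitrary):

package usageloop

import (
	"context"
	"log"
	"time"
)

// keepWorkspaceActive runs for as long as the port-forward does, telling
// coderd at a fixed interval that the workspace is in use.
func keepWorkspaceActive(ctx context.Context, interval time.Duration, reportUsage func(context.Context) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Per the description above, the server-side batch update avoids
			// overwriting a newer last_used_at with stale data, so an
			// occasional late or retried report is harmless.
			if err := reportUsage(ctx); err != nil {
				log.Printf("report workspace usage: %v", err)
			}
		}
	}
}

Reporting on a timer rather than per connection sidesteps the "connection established" limitation described above: usage keeps being recorded even when no new connections are made through the forward.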