chore: refactor notifier to use quartz.TickerFunc #15134
Conversation
// Stop() waits for the in-progress flush to complete, meaning we have to advance the time such that sync triggers
// a total of (batchSize/StoreSyncBufferSize)-1 times. The -1 is because once we run the penultimate sync, it
// clears space in the buffer for the last dispatches of the batch, which allows graceful shutdown to continue
// immediately, without waiting for the last trigger.
@dannykopping this behavior of the manager surprised me: it doesn't immediately flush when you call Stop(); it waits until the notifier exits. In that sense, it was never really Stop() that relieved the backpressure in this test; we just wait around until the sync triggers enough times to relieve it. Calling Stop() early does prevent another fetch from starting, though.
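For readers following along, here is a minimal sketch of the timing logic being discussed. It is not the actual backpressure test: newTestManager, batchSize, storeSyncBufferSize, and storeSyncInterval are assumptions made for illustration, and only the clock-advancing loop mirrors the comment above.

```go
package notifications_test

import (
	"context"
	"testing"
	"time"

	"github.com/stretchr/testify/require"

	"github.com/coder/quartz"
)

// TestBackpressureSketch illustrates the clock-advancing logic described
// above; it is not the real backpressure test.
func TestBackpressureSketch(t *testing.T) {
	t.Parallel()

	const (
		batchSize           = 50              // hypothetical number of enqueued notifications
		storeSyncBufferSize = 10              // hypothetical size of the results buffer
		storeSyncInterval   = 2 * time.Second // hypothetical sync interval
	)

	ctx := context.Background()
	mClock := quartz.NewMock(t)
	manager := newTestManager(t, mClock) // hypothetical helper that wires in the mock clock

	// ... enqueue batchSize notifications and let the notifier dispatch them,
	// filling the results buffer and creating backpressure ...

	// Each advance past the sync interval triggers one sync, draining up to
	// storeSyncBufferSize results and freeing space in the buffer. Only
	// (batchSize/storeSyncBufferSize)-1 triggers are needed: the penultimate
	// sync already makes room for the final dispatches, so Stop() can finish
	// without waiting for a last tick.
	for i := 0; i < batchSize/storeSyncBufferSize-1; i++ {
		mClock.Advance(storeSyncInterval).MustWait(ctx)
	}
	require.NoError(t, manager.Stop(ctx))
}
```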
Thanks for adding additional clarification in the comment here.
This looks great!

"this behavior of the manager surprised me"

This is a smell for me; is there a way we could make this more explicit? I worry about requiring a sequence of calls (i.e. Drain() + Stop()) to properly shut down the manager: that would make the behavior clearer, but it would introduce some potential problems if not called correctly, or at all.
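To make the concern concrete, a hypothetical two-phase shutdown might look like the sketch below. Drain() does not exist in the current Manager API; the interface and function here are purely illustrative.

```go
package main

import (
	"context"
	"fmt"
)

// drainStopper is a hypothetical shape for the two-phase shutdown being
// discussed; the real notifications Manager only exposes Stop() today.
type drainStopper interface {
	Drain(ctx context.Context) error // would flush buffered dispatch results to the store
	Stop(ctx context.Context) error  // would halt the notifier without waiting on the buffer
}

// shutdown shows the calling sequence the comment is worried about:
// correctness would depend on Drain always being called before Stop.
func shutdown(ctx context.Context, m drainStopper) error {
	if err := m.Drain(ctx); err != nil {
		return fmt.Errorf("drain notification manager: %w", err)
	}
	return m.Stop(ctx)
}
```

Nothing in this shape forces callers to invoke Drain first, or at all, which is exactly the risk named above.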
I do worry that we can end up waiting a long time for graceful shutdown. It's up to 2s by default for a sync to trigger, then the sync itself times out after 30s, and the delivery of notifications times out after 60s.

Normally we expect graceful shutdown to take 5-15 seconds or less; beyond that, humans or cluster managers become likely to just kill us anyway. Maybe the initial, high-level context can serve this purpose, but right now it's just tied to the CLI command and doesn't get canceled in practice. If you moved the Manager to within coderd, then I think the API context does eventually get canceled.
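As a rough illustration of bounding graceful shutdown, a caller could derive a deadline-limited context before stopping the manager. This is a sketch only: stopWithBudget is a hypothetical helper, the 15-second budget is an illustrative number rather than anything specified in this PR, and it assumes Manager.Stop honors context cancellation.

```go
package shutdown

import (
	"context"
	"time"

	"github.com/coder/coder/v2/coderd/notifications"
)

// stopWithBudget is a hypothetical helper: it caps graceful shutdown of the
// notifications Manager at a fixed budget so slow syncs or dispatches cannot
// hold up process exit indefinitely (assuming Stop honors ctx cancellation).
func stopWithBudget(ctx context.Context, m *notifications.Manager) error {
	shutdownCtx, cancel := context.WithTimeout(ctx, 15*time.Second)
	defer cancel()
	return m.Stop(shutdownCtx)
}
```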
@@ -88,7 +88,7 @@ require (
 	github.com/cli/safeexec v1.0.1
 	github.com/coder/flog v1.1.0
 	github.com/coder/pretty v0.0.0-20230908205945-e89ba86370e0
-	github.com/coder/quartz v0.1.0
+	github.com/coder/quartz v0.1.2
Interestingly, the backpressure test surfaced some bugs in Quartz that I had to fix to get the mock to behave like the real TickerFunc.
LGTM, thanks a mil for looking into this
In investigating coder/internal#109 I noticed many of the notification tests are still using time.Sleep and require.Eventually. This is an initial effort to start converting these to Quartz.

One product change is to switch the notifier to use a TickerFunc instead of a normal Ticker, since it allows the test to assert that a batch process is complete via the Quartz Mock clock. This does introduce one slight behavioral change in that the notifier waits the fetch interval before processing its first batch. In practice, this is inconsequential: no one will notice if we send notifications immediately on startup, or just a little later. But it does make a difference to some tests, which are fixed up here.
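For context, a rough sketch of the shape of this change is below. It is not the actual notifier code: the pared-down notifier struct and its field names are assumptions. The relevant property of quartz's TickerFunc is that each tick runs a callback and the mock clock can wait for that callback to return, which is what lets tests assert that a batch has been fully processed instead of sleeping or polling.

```go
package notifier

import (
	"context"
	"time"

	"github.com/coder/quartz"
)

// notifier is a pared-down stand-in for the real type; only the fields the
// sketch needs are shown, and their names are assumptions.
type notifier struct {
	clock         quartz.Clock
	fetchInterval time.Duration
}

func (n *notifier) process(ctx context.Context) error {
	// Fetch and dispatch one batch of pending notifications (elided).
	return nil
}

// run shows the shape of a loop built on TickerFunc: the callback runs every
// fetchInterval (so the first batch happens one interval after startup), and
// the ticker stops when ctx is canceled.
func (n *notifier) run(ctx context.Context) error {
	tick := n.clock.TickerFunc(ctx, n.fetchInterval, func() error {
		return n.process(ctx)
	}, "notifier")

	// Wait returns once the ticker has stopped, i.e. when ctx is canceled.
	return tick.Wait()
}
```

In a test, a quartz mock clock can then advance by fetchInterval and wait for that callback to complete, replacing the time.Sleep / require.Eventually patterns mentioned above.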