Skip to content

Commit e4294f1

Browse files
committed
chore: add README to clock testing
1 parent 808e1c0 commit e4294f1

File tree

1 file changed

+268
-0
lines changed

1 file changed

+268
-0
lines changed

clock/README.md

+268
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,268 @@
1+
# Quartz
2+
A Go time testing library for writing deterministic unit tests
3+
4+
_Note: Quartz is the name I'm targeting for the standalone open source project when we spin this
5+
out._
6+
7+
## Why another time testing library?
8+
9+
Writing good unit tests for components and functions that use the `time` package is difficult, and
10+
several open source libraries exist. In building Quartz, we took some inspiration from
11+
12+
* [github.com/benbjohnson/clock](https://github.com/benbjohnson/clock)
13+
* Tailscale's [tstest.Clock](https://github.com/coder/tailscale/blob/main/tstest/clock.go)
14+
* [github.com/aspenmesh/tock](https://github.com/aspenmesh/tock)
15+
16+
Quartz shares the high level design of a `Clock` interface that closely resembles the functions in
17+
the `time` standard library, and a "real" clock passes thru to the standard libary in production,
18+
while a mock clock gives precise control in testing.
19+
20+
Our high level goal is to write unit tests that
21+
22+
1. execute quickly
23+
2. don't flake
24+
3. are straightforward to write and understand
25+
26+
For several reasons, this is a tall order when it comes to code that depends on time, and we found
27+
the existing libraries insufficient for our goals.
28+
29+
For tests to execute quickly without flakes, we want to focus on _determinism_: the test should run
30+
the same each time, and it should be easy to force the system into a known state (no races) before
31+
executing test assertions. `time.Sleep`, `runtime.Gosched()`, and
32+
polling/[Eventually](https://pkg.go.dev/github.com/stretchr/testify/assert#Eventually) are all
33+
symptoms of an inability to do this easily.
34+
35+
### Preventing test flakes
36+
37+
The following example comes from the README from benbjohnson/clock:
38+
39+
```go
40+
mock := clock.NewMock()
41+
count := 0
42+
43+
// Kick off a timer to increment every 1 mock second.
44+
go func() {
45+
ticker := mock.Ticker(1 * time.Second)
46+
for {
47+
<-ticker.C
48+
count++
49+
}
50+
}()
51+
runtime.Gosched()
52+
53+
// Move the clock forward 10 seconds.
54+
mock.Add(10 * time.Second)
55+
56+
// This prints 10.
57+
fmt.Println(count)
58+
```
59+
60+
The first race condition is fairly obvious: moving the clock forward 10 seconds may generate 10
61+
ticks on the `ticker.C` channel, but there is no guarantee that `count++` executes before
62+
`fmt.Println(count)`.
63+
64+
The second race condition is more subtle, but `runtime.Gosched()` is the tell. Since the ticker
65+
is started on a separate goroutine, there is no guarantee that `mock.Ticker()` executes before
66+
`mock.Add()`. `runtime.Gosched()` is an attempt to get this to happen, but it makes no hard
67+
promises. On a busy system, especially when running tests in parallel, this can flake, advance the
68+
time 10 seconds first, then start the ticker and never generate a tick.
69+
70+
Let's talk about how Quartz tackles these problems.
71+
72+
In our experience, an extremely common use case is creating a ticker then doing a 2-arm `select`
73+
with ticks in one and context expiring in another, i.e.
74+
75+
```go
76+
t := time.NewTicker(duration)
77+
for {
78+
select {
79+
case <-ctx.Done():
80+
return ctx.Err()
81+
case <-t.C:
82+
err := do()
83+
if err != nil {
84+
return err
85+
}
86+
}
87+
}
88+
```
89+
90+
In Quartz, we refactor this to be more compact and testing friendly:
91+
92+
```go
93+
t := clock.TickerFunc(ctx, duration, do)
94+
return t.Wait()
95+
```
96+
97+
This affords the mock `Clock` the ability to explicitly know when processing of a tick is finished
98+
because it's wrapped in the function passed to `TickerFunc` (`do()` in this example).
99+
100+
In Quartz, when you advance the clock, you are returned an object you can `Wait()` on to ensure all
101+
ticks and timers triggered are finished. This solves the first race condition in the example.
102+
103+
(As an aside, we still support a traditional standard library-style `Ticker`, if you want to keep
104+
your code as close as possible to the standard library, or if you need to use the channel in a
105+
larger `select` block, but in that case you'll have to find some other mechanism to sync tick
106+
processing to your test code.)
107+
108+
To prevent race conditions related to the starting of the ticker, Quartz allows you to set "traps"
109+
for calls that access the clock.
110+
111+
```go
112+
func TestTicker(t *testing.T) {
113+
mClock := quartz.NewMock(t)
114+
trap := mClock.Trap().TickerFunc()
115+
defer trap.Close() // stop trapping at end
116+
go runMyTicker(mClock) // async calls TickerFunc()
117+
call := trap.Wait(context.Background()) // waits for a call and blocks its return
118+
call.Release() // allow the TickerFunc() call to return
119+
// optionally check the duration using call.Duration
120+
// Move the clock forward 1 tick
121+
mClock.Advance(time.Second).MustWait(context.Background())
122+
// assert results of the tick
123+
}
124+
```
125+
126+
Trapping and then releasing the call to `TickerFunc()` ensures the ticker is started at a
127+
deterministic time, so our calls to `Advance()` will have a predictable effect.
128+
129+
Take a look at `TestExampleTickerFunc` in `example_test.go` for a complete worked example.
130+
131+
### Complex time dependence
132+
133+
Another difficult issue to handle when unit testing is when some code under test makes multiple
134+
calls that depend on the time, and you want to simulate some time passing between them.
135+
136+
A very basic example is measuring how long something took:
137+
138+
```go
139+
var measurement time.Duration
140+
go func(clock quartz.Clock) {
141+
start := clock.Now()
142+
doSomething()
143+
measurement = clock.Since(start)
144+
}(mClock)
145+
146+
// how to get measurement to be, say, 5 seconds?
147+
```
148+
149+
The two calls into the clock happen asynchronously, so we need to be able to advance the clock after
150+
the first call to `Now()` but before the call to `Since()`. Doing this with the libraries we
151+
mentioned above means that you have to be able to mock out or otherwise block the completion of
152+
`doSomething()`.
153+
154+
But, with the trap functionality we mentioned in the previous section, you can deterministically
155+
control the time each call sees.
156+
157+
```go
158+
trap := mClock.Trap().Since()
159+
var measurement time.Duration
160+
go func(clock quartz.Clock) {
161+
start := clock.Now()
162+
doSomething()
163+
measurement = clock.Since(start)
164+
}(mClock)
165+
166+
c := trap.Wait(ctx)
167+
mClock.Advance(5*time.Second)
168+
c.Release()
169+
```
170+
171+
We wait until we trap the `clock.Since()` call, which implies that `clock.Now()` has completed, then
172+
advance the mock clock 5 seconds. Finally, we release the `clock.Since()` call. Any changes to the
173+
clock that happen _before_ we release the call will be included in the time used for the
174+
`clock.Since()` call.
175+
176+
As a more involved example, consider an inactivity timeout: we want something to happen if there is
177+
no activity recorded for some period, say 10 minutes in the following example:
178+
179+
```go
180+
type InactivityTimer struct {
181+
mu sync.Mutex
182+
activity time.Time
183+
clock quartz.Clock
184+
}
185+
186+
func (i *InactivityTimer) Start() {
187+
i.mu.Lock()
188+
defer i.mu.Unlock()
189+
next := i.clock.Until(i.activity.Add(10*time.Minute))
190+
t := i.clock.AfterFunc(next, func() {
191+
i.mu.Lock()
192+
defer i.mu.Unlock()
193+
next := i.clock.Until(i.activity.Add(10*time.Minute))
194+
if next == 0 {
195+
i.timeoutLocked()
196+
return
197+
}
198+
t.Reset(next)
199+
})
200+
}
201+
```
202+
203+
The actual contents of `timeoutLocked()` doesn't matter for this example, and assume there are other
204+
functions that record the latest `activity`.
205+
206+
We found that some time testing libraries hold a lock on the mock clock while calling the function
207+
passed to `AfterFunc`, resulting in a deadlock if you made clock calls from within.
208+
209+
Others allow this sort of thing, but don't have the flexibility to test edge cases. There is a
210+
subtle bug in our `Start()` function. The timer may pop a little late, and/or some measurable real
211+
time may elapse before `Until()` gets called inside the `AfterFunc`. If there hasn't been activity,
212+
`next` might be negative.
213+
214+
To test this in Quartz, we'll use a trap. We only want to trap the inner `Until()` call, not the
215+
initial one, so to make testing easier we can "tag" the call we want. Like this:
216+
217+
```go
218+
func (i *InactivityTimer) Start() {
219+
i.mu.Lock()
220+
defer i.mu.Unlock()
221+
next := i.clock.Until(i.activity.Add(10*time.Minute))
222+
t := i.clock.AfterFunc(next, func() {
223+
i.mu.Lock()
224+
defer i.mu.Unlock()
225+
next := i.clock.Until(i.activity.Add(10*time.Minute), "inner")
226+
if next == 0 {
227+
i.timeoutLocked()
228+
return
229+
}
230+
t.Reset(next)
231+
})
232+
}
233+
```
234+
235+
All Quartz `Clock` functions, and functions on returned timers and tickers support zero or more
236+
string tags that allow traps to match on them.
237+
238+
```go
239+
func TestInactivityTimer_Late(t *testing.T) {
240+
// set a timeout on the test itself, so that if Wait functions get blocked, we don't have to
241+
// wait for the default test timeout of 10 minutes.
242+
ctx, cancel := context.WithTimeout(10*time.Second)
243+
defer cancel()
244+
mClock := quartz.NewMock(t)
245+
trap := mClock.Trap.Until("inner")
246+
defer trap.Close()
247+
248+
it := &InactivityTimer{
249+
activity: mClock.Now(),
250+
clock: mClock,
251+
}
252+
it.Start()
253+
254+
// Trigger the AfterFunc
255+
w := mClock.Advance(10*time.Minute)
256+
c := trap.Wait(ctx)
257+
// Advance the clock a few ms to simulate a busy system
258+
mClock.Advance(3*time.Millisecond)
259+
c.Release() // Until() returns
260+
w.MustWait(ctx) // Wait for the AfterFunc to wrap up
261+
262+
// Assert that the timeoutLocked() function was called
263+
}
264+
```
265+
266+
This test case will fail with our bugged implementation, since the triggered AfterFunc won't call
267+
`timeoutLocked()` and instead will reset the timer with a negative number. The fix is easy, use
268+
`next <= 0` as the comparison.

0 commit comments

Comments
 (0)