fake database locks during transactions #2473

spikecurtis · 2022-06-17T17:51:15Z

Might fix #2436 entirely, but definitely will fix in the fake db case.

There are additional bugs lurking around basically every use of InTx() in our tests that this fixes, because until this PR, transactions exposed intermediate state to other goroutines with the fake database.

Signed-off-by: Spike Curtis <spike@coder.com>

…_transaction_isolation

Emyrk · 2022-06-17T18:02:04Z

Huh TIL intermediate state was causing db issues. Makes sense though...

Can we write a unit test for this?

johnstcn

It's a really cool idea, but I wonder at what point in this rabbit hole it makes sense to stop reinventing the wheel...?

johnstcn · 2022-06-17T17:57:34Z

coderd/database/databasefake/reentrantlock.go

+// reentrantLock is a lock that can be locked multiple times from the same goroutine.
+//
+// Go maintainers insist that this is a Bad Idea and refuse to implement it in the standard library, so let's talk about
+// why we're doing it here.
+//
+// We want to support locking the fake database for the duration of a transaction, so that other goroutines cannot see
+// uncommitted transactions.  However, we also need to lock the database during queries that are not explicitly in a
+// transaction.  When a goroutine executing a transaction calls a query, it is already holding the lock, so attempting
+// to lock a standard mutex again will deadlock.  A reentrant lock neatly solves this problem.
+//
+// The argument I've heard around why reentrant locks are a Bad Idea points out that it indicates a problem with your
+// interface, because some methods must leave the database in an inconsistent state.  That criticism applies here
+// because the whole reason we need transactions is sometimes individual queries leave the database in an inconsistent
+// state. However valid the criticism, the assumption that it becomes the most important factor in all cases is flawed
+// and, frankly, patronizing.
+//
+// Here we do not have the luxury of reinventing the interface, because this fake database is attempting to
+// emulate another piece of software which does have this interface: postgres. Basically, the logic that enforces the
+// consistency of the database resides at a higher layer than we are emulating.
+//
+// Some alternatives considered, but rejected:
+//
+//  1. create an explicit transaction type, which are serialized to a channel and then processed in order.
+//     * requires implementing each query function twice, once wrapping it in a transaction, and once doing the real
+//       work
+//     * cannot support recursive transactions
+//  2. store whether we're in a transaction in the Context passed to the queries.
+//     * changes InTx(func(store) error) -> InTx(ctx, func(ctx2, store) error).  Inside the transaction function,
+//       callers **must use** ctx2 to query.  Use of other contexts, like ctx, will deadlock.  Adding this tripmine
+//       to every use of transactions seems like a recipe for bugs that are hard to diagnose (and only show up with the
+//       fake database).


🤌 11/10 documentation

johnstcn · 2022-06-17T18:03:30Z

coderd/database/databasefake/reentrantlock.go

+	b := make([]byte, 64)
+	b = b[:runtime.Stack(b, false)]
+	b = bytes.TrimPrefix(b, []byte("goroutine "))
+	b = b[:bytes.IndexByte(b, ' ')]
+	n, _ := strconv.ParseUint(string(b), 10, 64)
+	return n


😬

At what point does it make more sense to just dispense with the fake database entirely?

you and me both, but we do need a suite of tests that run fast, and our postgres tests are slow.

CLI tests, by and large, shouldn't be calling the server at all, and instead be mocking out the codersdk. That will make them simpler (because you don't need a bunch of preamble to get the DB into the state you want), and super fast.

Coderd tests can and should mock out the database interface for most cases.

We still will want a couple integration tests that go all the way from the CLI -> API -> Database, but there will only be a few and we could use postgres for that.

Once we're in that state, we could consider killing the fake database. I'll repeat this in another issue.

Emyrk · 2022-06-17T18:10:24Z

Can we just make the mutex an interface, and when calling InTx() just pass this dummy mutex forward?

func (q *fakeQuerier) InTx(fn func(database.Store) error) error {
        q.mutex.Lock()
        defer q.mutex.Unlock()
	return fn(&fakeQuerier{
                // This mutex Lock()/Unlock() do nothing
		mutex:           &notAMutex{},
                // Same reference to all data
		fakeQuerierData: q.fakeQuerierData,
	})
}

deansheather · 2022-06-17T18:10:10Z

coderd/database/databasefake/reentrantlock.go

+//     * changes InTx(func(store) error) -> InTx(ctx, func(ctx2, store) error).  Inside the transaction function,
+//       callers **must use** ctx2 to query.  Use of other contexts, like ctx, will deadlock.  Adding this tripmine
+//       to every use of transactions seems like a recipe for bugs that are hard to diagnose (and only show up with the
+//       fake database).


You could probably implement this with a ruleguard rule and it would not require the reentrant lock

Our InTx currently doesn't require a context param, I think it's better to keep it that way.

I think it's easier just to make a wrapping struct that can contain some state. So store.IsInTx() could return true/false for example.

I started doing that, and that was the point as which I said, a reentrant lock is easier.

The ruleguard helps if it actually catches your problem (impossible to get to 100% on that one), and you remember to run it. It's natural when faced with test failures to want to fix them before you start worrying about things like linting.

The sqldb currently keeps the "is in a transaction" state on itself. See #2470

I'd rather copy that than use ctx to pass the "is in a tx" state

dwahler · 2022-06-17T18:13:57Z

Can we just make the mutex an interface

Seconded. I guess this approach has the slight downside of requiring a bit of code churn to add an extra layer of indirection (everything that currently accesses fields of q would have to be updated to use q.fakeQuerierData).

On the other hand, I think it makes it easier to add support for rolling back transactions. The simple, stupid approach would be to clone the entire inner fakeQuerierData value, pass the clone to fn, and then copy it back to the original if the transaction didn't abort.

Emyrk · 2022-06-17T18:18:14Z

Seconded. I guess this approach has the slight downside of requiring a bit of code churn to add an extra layer of indirection (everything that currently accesses fields of q would have to be updated to use q.fakeQuerierData).

If you anonymously embed, the fields can be accessed directly. So no need to update accesses.

type fakeQuerier struct {
	mutex Mutex
	*fakeQuerierData
}

On the other hand, I think it makes it easier to add support for rolling back transactions. The simple, stupid approach would be to clone the entire inner fakeQuerierData value, pass the clone to fn, and then copy it back to the original if the transaction didn't abort.

This might get tricky with deletes and things. It might not be too bad to merge states, but that is going to require some code to handle modifies vs deletes vs adds. Probably some reflect magic based on ID field for most structs.

spikecurtis · 2022-06-17T18:36:31Z

Closing in favor of #2478 (thanks @Emyrk for the suggestion!)

fake database locks during transactions

9e830f3

Signed-off-by: Spike Curtis <spike@coder.com>

spikecurtis requested a review from a team June 17, 2022 17:51

Merge branch 'main' of github.com:coder/coder into spike/2436_fake_db…

54fbfeb

…_transaction_isolation

johnstcn reviewed Jun 17, 2022

View reviewed changes

johnstcn assigned spikecurtis Jun 17, 2022

johnstcn requested a review from a team June 17, 2022 18:06

deansheather reviewed Jun 17, 2022

View reviewed changes

spikecurtis closed this Jun 17, 2022

spikecurtis mentioned this pull request Jul 18, 2022

Change the bulk of our tests to "true" unit tests, and kill the fake database #2476

Closed

github-actions bot deleted the spike/2436_fake_db_transaction_isolation branch February 4, 2023 20:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fake database locks during transactions #2473

fake database locks during transactions #2473

Uh oh!

spikecurtis commented Jun 17, 2022 •

edited

Loading

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

johnstcn left a comment

Uh oh!

johnstcn Jun 17, 2022

Uh oh!

johnstcn Jun 17, 2022

Uh oh!

spikecurtis Jun 17, 2022

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

deansheather Jun 17, 2022

Uh oh!

Emyrk Jun 17, 2022

Uh oh!

spikecurtis Jun 17, 2022

Uh oh!

Emyrk Jun 17, 2022

Uh oh!

dwahler commented Jun 17, 2022

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

spikecurtis commented Jun 17, 2022

Uh oh!

Uh oh!

fake database locks during transactions #2473

fake database locks during transactions #2473

Uh oh!

Conversation

spikecurtis commented Jun 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

johnstcn left a comment

Choose a reason for hiding this comment

Uh oh!

johnstcn Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

johnstcn Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

spikecurtis Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

deansheather Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

Emyrk Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

spikecurtis Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

Emyrk Jun 17, 2022

Choose a reason for hiding this comment

Uh oh!

dwahler commented Jun 17, 2022

Uh oh!

Emyrk commented Jun 17, 2022

Uh oh!

spikecurtis commented Jun 17, 2022

Uh oh!

Uh oh!

spikecurtis commented Jun 17, 2022 •

edited

Loading