fix: serialize updateEntitlements() #14974

Merged
merged 1 commit into from
Oct 5, 2024

spikecurtis
Contributor

@spikecurtis spikecurtis commented Oct 4, 2024

fixes #14961

Adding the license and updating entitlements is flaky, especially at the start of our coderdent testing. While the actual modifications to the entitlements.Set were thread-safe, multiple goroutines could read from the database and write to the set concurrently, so we could end up writing stale data.

This enforces serialization on updates, so that if you modify the database and kick off an update, you know the state of the Set is at least as fresh as your database update.
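A minimal sketch of this kind of serialization, not the code from this PR: a one-slot channel acts as a token that must be held for the whole fetch-then-store step. `Set`, `fakeDB`, `demo`, and the license strings are hypothetical stand-ins, and the real `entitlements.Set` stores a full entitlements struct rather than a string.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Set is a hypothetical stand-in for entitlements.Set holding one value.
type Set struct {
	mu      sync.RWMutex
	current string
	// token is a one-slot channel used as a lock. Unlike a mutex, waiting
	// for it can be abandoned when the caller's context is canceled.
	token chan struct{}
}

func NewSet() *Set {
	s := &Set{token: make(chan struct{}, 1)}
	s.token <- struct{}{}
	return s
}

// Update runs fetch and stores the result while holding the token, so one
// caller's read-and-write can never interleave with another's.
func (s *Set) Update(ctx context.Context, fetch func(context.Context) (string, error)) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-s.token:
		defer func() { s.token <- struct{}{} }()
	}
	v, err := fetch(ctx)
	if err != nil {
		return err
	}
	s.mu.Lock()
	s.current = v
	s.mu.Unlock()
	return nil
}

func (s *Set) Current() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.current
}

// fakeDB is a hypothetical stand-in for the license table.
type fakeDB struct {
	mu      sync.Mutex
	license string
}

func (d *fakeDB) set(v string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.license = v
}

func (d *fakeDB) get(context.Context) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.license, nil
}

// demo races background updates against "insert license, then update".
func demo() string {
	db := &fakeDB{license: "none"}
	set := NewSet()
	ctx := context.Background()

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = set.Update(ctx, db.get)
		}()
	}
	db.set("premium")
	_ = set.Update(ctx, db.get) // the set is now at least as fresh as our write
	got := set.Current()
	wg.Wait()
	return got
}

func main() {
	fmt.Println(demo()) // prints "premium"
}
```

Because the fetch happens while the token is held, any update that runs after ours also reads the database after our write, so it cannot put older data back.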


@spikecurtis spikecurtis requested a review from Emyrk October 4, 2024 12:25
@spikecurtis spikecurtis marked this pull request as ready for review October 4, 2024 12:26
@Emyrk
Member

Emyrk commented Oct 4, 2024

Just so I understand, and jotting notes as I break for a minute. I am going to play with this after some morning calls.

Update (which was Replace) does an entire entitlements replacement. The multiple goroutines calling updateEntitlements all source from the license in the DB.

Is the issue that, across the multiple calls in unit tests, the earlier calls do not have a license? The earlier call happens in parallel with the later call (with license). If the later one finishes first, the earlier call then overwrites it back to no license.

The fix is to force serialization so that the order of calls to updateEntitlements is respected. And because the DB lookup happens within updateEntitlements, if the earlier execution in the previous case is delayed, its DB lookup is also delayed, so it will fetch the latest value.

This looks to be a good solution. The test does not exercise the bug when I remove the channel, but I only briefly looked at it atm.
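The interleaving described above is hard to reproduce with real goroutines, so here is a deterministic replay of it as plain sequential steps. This is an illustrative sketch only; `replayUnserialized`, `replaySerialized`, and the string values are hypothetical and do not come from the coder codebase.

```go
package main

import "fmt"

// replayUnserialized replays the bad ordering: the read and the write of
// each update are separate steps that may interleave with another update.
func replayUnserialized() string {
	db := "no license"
	var set string

	earlyRead := db // early update reads before the license exists
	db = "licensed" // the test inserts the license
	lateRead := db  // later update reads the fresh value
	set = lateRead  // later update writes first
	set = earlyRead // early update finishes last and writes stale data
	return set
}

// replaySerialized treats each update as one indivisible read-then-write.
func replaySerialized() string {
	db := "no license"
	var set string
	update := func() { set = db }

	update()        // early update runs entirely before the insert
	db = "licensed" // the test inserts the license
	update()        // later update
	update()        // a delayed early update reads the DB late, so it is fresh
	return set
}

func main() {
	fmt.Println(replayUnserialized()) // prints "no license"
	fmt.Println(replaySerialized())   // prints "licensed"
}
```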


An aside: when I extracted this logic out into this package, I was not really happy with the API it exposed. But it was already used throughout the codebase, and refactoring it everywhere felt like a large lift.

Since we only have 1 entitlement set per coderd, it could probably be its own goroutine and take feature sets to merge into the existing one, ensuring everything is done in order if multiple threads are working with it. Order would still matter though, as there is no time information to know which Feature should take precedence.
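One way the single-goroutine idea could look, as a rough sketch under assumptions rather than a proposal for the actual package: a goroutine owns the state and applies merge requests strictly in arrival order. `Owner`, `Features`, `Merge`, `Snapshot`, and the feature names are all hypothetical.

```go
package main

import "fmt"

// Features is a hypothetical feature set: feature name -> enabled.
type Features map[string]bool

type mergeReq struct {
	features Features
	done     chan struct{}
}

// Owner serializes all access by funneling it through one goroutine.
type Owner struct {
	merges chan mergeReq
	reads  chan chan Features
}

// NewOwner starts the owning goroutine. In this sketch it runs for the
// life of the process and has no shutdown path.
func NewOwner() *Owner {
	o := &Owner{
		merges: make(chan mergeReq),
		reads:  make(chan chan Features),
	}
	go o.run()
	return o
}

func (o *Owner) run() {
	state := Features{}
	for {
		select {
		case r := <-o.merges:
			// Later merges overwrite earlier ones key by key.
			for k, v := range r.features {
				state[k] = v
			}
			close(r.done)
		case reply := <-o.reads:
			// Hand out a copy so callers never share the owned map.
			snap := make(Features, len(state))
			for k, v := range state {
				snap[k] = v
			}
			reply <- snap
		}
	}
}

// Merge blocks until the owner goroutine has applied the feature set.
func (o *Owner) Merge(f Features) {
	done := make(chan struct{})
	o.merges <- mergeReq{features: f, done: done}
	<-done
}

// Snapshot returns a copy of the current state.
func (o *Owner) Snapshot() Features {
	reply := make(chan Features)
	o.reads <- reply
	return <-reply
}

func main() {
	o := NewOwner()
	o.Merge(Features{"audit_log": true})
	o.Merge(Features{"audit_log": false, "scim": true})
	snap := o.Snapshot()
	fmt.Println(snap["audit_log"], snap["scim"]) // prints "false true"
}
```

The second merge wins for `audit_log` only because it arrived second, which is the ordering concern raised above: without time information, arrival order alone decides precedence.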

@spikecurtis
Contributor Author

> Is the issue that in the multiple calls in unit tests, the earlier calls do not have a license. This earlier call happens in parallel to the later call (with license). If the later one finishes first, then the earlier call overwrites it back to no license.

Yeah, exactly.

> The test does not exercise the bug when I remove the channel, but I only briefly looked at it atm.

Hard to hit the bug consistently, since it's a race over which one writes first.

Member

@Emyrk Emyrk left a comment

> Hard to hit the bug consistently, since it's a race over which one writes first.

I figured it would be.


I refactored the entitlements package out because I had to access it from the idpsync package, and there would be an import loop if entitlements were kept in coderd.

I think if done from scratch, the update code would look a lot different.

Regardless, I like this serialized approach. Modify can still get around this, but it's really useful for unit tests.

@spikecurtis spikecurtis merged commit 288df75 into main Oct 5, 2024
31 checks passed
@spikecurtis spikecurtis deleted the spike/14961-update-entitlements branch October 5, 2024 02:58
@github-actions github-actions bot locked and limited conversation to collaborators Oct 5, 2024
Development

Successfully merging this pull request may close these issues.

flake: TestProxyRegisterDeregister
2 participants