Skip to content

[Cache] Decrease the probability of invalidation loss on tag eviction #43301

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

sbelyshkin
Copy link
Contributor

Q A
Branch? 5.3
Bug fix? yes
New feature? no
Deprecations? no
Tickets -
License MIT
Doc PR -

When using TagAwareAdapter with volatile caches such as Memcached and Redis with 'alkeys-lru' eviction policy, eviction of any key may happen.

Currently, tag-based invalidation is achieved by incrementing tag version numbers starting from 0. The 0th version is assumed by default and is not even stored in the cache, only non-zero versions are stored. Because of that, when a tag version key is evicted from cache, version counter is "reset" to 0, and if there are items still stored with tag version 0 (had not been accessed/updated by the moment of tag eviction), such items become valid again while must be deemed invalidated. Similarly invalifation may be lost when tag is invalidated particular number of times after its eviction (for items with non-zero tag version).

This scenario is more likely for item-tag pairs which are accessed and invalidated less frequently than others (due to LRU policy and lower version numbers) but this measure is quite relative. LRU caches implement different algorithms (e.g. https://redis.io/topics/lru-cache, https://memcached.org/blog/modern-lru/) and free to evict any key of their choice so even a tag which is accessed or invalidated pretty frequently can be evicted when LRU cache is full and under havy loads.

In order to prevent invalidation losses, any non-repetative numbers can be used as initial value for tag versions.

@simonberger
Copy link
Contributor

Not yet assessing if this fix should work correctly, but the code in 4.4 is looking identical. It probably should be targeted to this branch.

@sbelyshkin
Copy link
Contributor Author

There are no big differences in TagAwareAdapter between 4.4 and 5.3, but there is one conflict caused by this update
8f03a1f#diff-47997875179d5ba246b68c6ca2aac323cd05de24661476198f2758d2b3094ef7R144

@sbelyshkin
Copy link
Contributor Author

sbelyshkin commented Oct 4, 2021

One possible issue to consider is that this update requires tag versions to exist, otherwise they are treated as evicted and will be initialized and stored. Since 0 version is not stored at the moment, all items with 0th tag versions will be invalidated.

In order to mitigate possible stampede, some kind of migration utility can be created which scans caches for tagged items and creates tags with 0th version. It will be a low-level solution since utility needs to know adapter's configuration and simulate its behavior, or to use reflection.
Or, we might provide intermediate version of TagAwareAdapter which stores 0th version along with others. This way implies some period of exploitation before upgrade to the next fixed version.
Alternatively, it may be recommended to invalidate some part of tags beforehand.

@@ -416,14 +412,29 @@ private function getTagVersions(array $tagsByKey, array &$invalidatedTags = [])
return $tagVersions;
}

$tod = gettimeofday();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be better to use mt_rand()?

Copy link
Contributor Author

@sbelyshkin sbelyshkin Oct 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mt_rand() is a fairly good solution for this problem, it's even simpler than converting microseconds to an integer :) I chose timestamp because of the way of invalidation - incrementing of the value, so it's good to use monotonically growing value wich grows faster than increments. Another consideration was that random numbers may repeat current versions (0, 1, 100000, any of them, I don't know where is the lowest boundary), and the first invalidation will be lost (not more likey than losing any other version, though). Frankly speaking, even with timestamp a version may be overwritten if there is more than one process and there is significant time discrepancy.

So really mt_rand(0, 2**62-1) is a neat solution (all 63 bits cannot be used since incremented counter may overflow large integer to a float).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, should I care about 32-bit versions of php?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please. a random number would be fine honestly, the monotony doesn't provide anything to me.
About overflows, we should deal with them by looping back to zero (or to PHP_INT_MIN btw if negative numbers are OK)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicolas-grekas What's the purpose of the last condition 1 !== $version - $tagVersions[$tag]?
here

if ($tagVersions[$tag] !== $version && 1 !== $version - $tagVersions[$tag]) {

and here
if ($itemTags[$tag] !== $version && 1 !== $itemTags[$tag] - $version) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of those lines and similar arithmetic here

$version -= $this->knownTagVersions[$tag][1];

we cannot use full range of integers from PHP_INT_MIN to PHP_INT_MAX.
So I left only positive numbers.

Copy link
Contributor Author

@sbelyshkin sbelyshkin Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I have managed to adjust all that arithmetics for negative numbers. Now the range of random numbers is 2^32 for 32-bit platforms and 2^64 for 64-bit ones.

@sbelyshkin sbelyshkin force-pushed the prevent-tag-invalidation-losses branch 3 times, most recently from 9d119df to 7e59ced Compare October 7, 2021 09:09
@sbelyshkin sbelyshkin force-pushed the prevent-tag-invalidation-losses branch 2 times, most recently from 74a3506 to 183e0fa Compare October 10, 2021 11:36
@nicolas-grekas
Copy link
Member

Closing in favor of #44015

fabpot added a commit that referenced this pull request Nov 16, 2021
…on tag eviction (nicolas-grekas)

This PR was merged into the 5.4 branch.

Discussion
----------

[Cache] Decrease the probability of invalidation loss on tag eviction

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Tickets       | Fix #43301
| License       | MIT
| Doc PR        | -

From `@sbelyshkin` in the linked PR:

> When using TagAwareAdapter with volatile caches such as Memcached and Redis with 'alkeys-lru' eviction policy, eviction of _any_ key may happen.
>
> Currently, tag-based invalidation is achieved by incrementing tag version numbers starting from 0. The 0th version is assumed by default and is not even stored in the cache, only non-zero versions are stored. Because of that, when a tag version key is evicted from cache, version counter is "reset" to 0, and if there are items still stored with tag version 0 (had not been accessed/updated by the moment of tag eviction), such items become valid again while must be deemed invalidated. Similarly invalidation may be lost when tag is invalidated particular number of times after its eviction (for items with non-zero tag version).
>
> This scenario is more likely for item-tag pairs which are accessed and invalidated less frequently than others (due to LRU policy and lower version numbers) but this measure is quite relative. LRU caches implement different algorithms (e.g. https://redis.io/topics/lru-cache, https://memcached.org/blog/modern-lru/) and free to evict any key of their choice so even a tag which is accessed or invalidated pretty frequently can be evicted when LRU cache is full and under havy loads.
>
> In order to prevent invalidation losses, any non-repetative numbers can be used as initial value for tag versions.

This PR goes one step further and borrows from #42997: instead of incrementing the version number on invalidation, we delete the value from the backend, and recreate it when needed with a random offset.

This PR also removes auto-commit on tag-invalidation: this behavior adds complexity, is not needed and is not consistent with Redis|FilesystemTagAwareAdapter.

/cc `@sbelyshkin` can you please give it a go in a review?

Commits
-------

4cfc727 [Cache] Decrease the probability of invalidation loss on tag eviction
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants