Invalidate any existing prefetch cache on PUT requests. #4668

tomchristie · 2016-11-10T15:43:30Z

Invalidate the prefetch cache instead of reloading the object from the database when a prefetch_related queryset is used with a PUT request.

Would be nice to also have a test case demonstrating why this approach has a different behavioral effect.

Closes #4661.
Refs #4553.

spectras · 2016-11-11T12:54:43Z

The only drawback I can see to this approach is, if I remember well, in the absence of a prefetch cache, every use of the related objects from that point forward will trigger a query. (but double-check that, I did not follow the latest Django updates very closely).
Other than this, it seems to be better than the former approach in every aspect: faster, won't reload queryset that won't get used, will not raise a 404 if the update took the object out of the queryset.

I was wondering whether this could be a possible solution too, digging just a single bit deeper into the queryset internals:

cache = getattr(instance, '_prefetched_objects_cache', None)
if cache:
    for key, queryset in cache.items():
        cache[key] = queryset.all()

Or, invalidating the cache in place instead:

cache = getattr(instance, '_prefetched_objects_cache', None)
if cache:
    for queryset in cache.values():
        queryset._result_cache = None

The idea is using related objects will still use the prefetch cache. First use will re-load data from the database, subsequent loads will not.

(I for one would love to see a class-level flag allowing to disable that code section entirely, but if I am too isolated a user, I will just use my own altered UpdateMixin where I need it).

tomchristie · 2016-11-11T13:02:01Z

Thanks for all that! Given that we use the instance once-only after removing the cache it's not clear that'd benefit us, tho would be perfectly happy to consider any pull requests that came along with assertNumQueries to demonstrate any improvements that we could still make here.
We'll roll with what we've got for now. 😄

spectras · 2016-11-11T13:15:58Z

I don't either, let's just say that if someone does he'll make his own mixin too. Well, thanks for taking the time to come up with an improved solution.
I still believe third-party packages should not "fix" or deviate from Django behaviors when they have an equivalent but, what can I say, it's your project and I have great respect for your work. I will live with it - and much more easily so with this new approach.

fredpalmer · 2016-11-16T03:51:50Z

I'm not sure this patch is the most diplomatic approach...

People spend a lot of time tweaking select_related and prefetch_related queries to minimize the amount of queries needed for m2m and other related objects. At least this is only happing on updates. However, it's sneaky in that one has dive into the depths of the DRF framework to understand why one's data is incurring a possible 2x+ the normal amount of queries for a PATCH request that updates a simple property only on the root object. Although, I'm currently not on this version - Unless this is changed, I now have make sure to bypass this behavior when I upgrade to the next point release. Don't get me wrong, I have to do this as well currently. However, there a few differences in my approach:

The primary differences are:

When the cache is cleared (before any updates take place)
What has to be reloaded at the end (only the related items)

    def update(self, instance, validated_data):
        """
        Surgically removes potentially stale, prefetched, cached, related objects
        in order to ensure non-stale related data in the serialized object after updating it.
        """
        # There are several reports of issues around this, e.g.:
        #   https://github.com/tomchristie/django-rest-framework/pull/1754
        #   https://github.com/tomchristie/django-rest-framework/pull/4668
        if hasattr(instance, "_prefetched_objects_cache"):
            # noinspection PyProtectedMember
            prefetch_objects_cache = instance._prefetched_objects_cache

            # It's common practice to support nested writes by popping it from `validated_data`
            # before going through the standard `update` from DRF
            initial_data_keys = self.initial_data.keys() if self.initial_data else []
            validated_data_keys = validated_data.keys()
            all_possible = set(initial_data_keys + validated_data_keys)
            prefetch_keys_to_clear = set(prefetch_objects_cache.keys()).intersection(all_possible)
            for key in prefetch_keys_to_clear:
                del prefetch_objects_cache[key]

        # This would essentially go on to DRF's `update`
        return super(BaseModelSerializer, self).update(instance, validated_data)

Summing Up...

The current implementation is not just re-loading the the related objects, it's reloading the root object, the m2m related objects and any other dependencies that could be needed and otherwise optimized within select_related.
Clearing the _prefetched_objects_cache before updating the object fixes the issue without the need to reload everything.
Developers are very deliberate about what they choose to prefetch_related. Automatically clearing might lead to unhappy users.
The ability to opt-in to this new behavior might also be more palatable (mentioned already somewhere else).
There may be more one could do here in my sample implementation to ascertain the need to clear an individual entry in the cache. For example:
- One already has incurred the cost to populate the cache, one could determine if there's a delta between the already saved, related objects and the validated ones.

I see there's already been a ton of discussion around this - sorry I'm late to the party. Thanks for the wonderful framework - I hope that helps

spectras · 2016-11-16T07:09:35Z

@fredpalmer > current release re-invokes get_object() indeed. The point of the change in this very issue is to invalidate the cache instead. Other than this I share your feeling. The issue being, some developers are not that deliberate about what they choose to prefetch_related and they regularly come and complain of stale cache entries because they fail to reload or update them.

For now, the best workaround is to have your own UpdateMixin that does not reload the object, and pull that in in all your ViewSets. I sure wish it was opt-in but, given the rationale, as I understand it, is some developers can't read, they would not read about the opt-in either.

At least with this patch here, the impact will be much more limited that with current release and it will avoid some pitfalls.

carltongibson · 2016-11-16T10:21:24Z

@tomchristie Is the behaviour here settled? (It's currently unreleased right?)

Lots of people run into this when first using prefetch_related — so doing something as a correct but maybe slow default seems good.

But equally, we're all adults, and I keep coming back to the upstream decision not to handle this, i.e. to leave it to application developers.

For now, the best workaround is to have your own UpdateMixin that does not reload the object, and pull that in in all your ViewSets

This seems (to me) an overly heavy burden to put on everybody just in order to add what are essentially training wheels for using prefetch_related.

If we do want this, do we not also want — although "want" isn't quite what I mean to say — a setting and/or an GenericAPIView-level attribute?

tomchristie · 2016-11-16T10:40:55Z

Is the behaviour here settled?

Yes, it's settled. (Tho the improvement is currently unreleased, yes.)

We've got two options:

Have slow behavior when prefetch_related is used with updates.
Have broken behavior when prefetch_related is used with updates.

There's no question which of those two we should do.

It may be that there's still further improvements we could make, and I'm very happy to consider productive discussion there, but that's a different issue.

Invalidate any existing prefetch cache on PUT requests.

384f3a0

tomchristie added the Bug label Nov 10, 2016

tomchristie added this to the 3.5.4 Release milestone Nov 10, 2016

Add regression test for #4661

fa8322d

tomchristie mentioned this pull request Nov 11, 2016

Fix prefetch_related updates. #4553

Merged

tomchristie merged commit 24791cb into master Nov 11, 2016

tomchristie deleted the invalidate-prefetch-cache branch November 11, 2016 09:44

K0Te mentioned this pull request Aug 8, 2018

UpdateModelMixin invlidate select_related objects cache #5241

Closed

yuekui mentioned this pull request Jun 19, 2021

Re-prefetch related objects after updating #8043

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Invalidate any existing prefetch cache on PUT requests. #4668

Invalidate any existing prefetch cache on PUT requests. #4668

Uh oh!

tomchristie commented Nov 10, 2016

Uh oh!

spectras commented Nov 11, 2016 •

edited

Loading

Uh oh!

tomchristie commented Nov 11, 2016

Uh oh!

spectras commented Nov 11, 2016

Uh oh!

fredpalmer commented Nov 16, 2016 •

edited

Loading

Uh oh!

spectras commented Nov 16, 2016

Uh oh!

carltongibson commented Nov 16, 2016 •

edited

Loading

Uh oh!

tomchristie commented Nov 16, 2016

Uh oh!

Uh oh!

Uh oh!

Invalidate any existing prefetch cache on PUT requests. #4668

Invalidate any existing prefetch cache on PUT requests. #4668

Uh oh!

Conversation

tomchristie commented Nov 10, 2016

Uh oh!

spectras commented Nov 11, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tomchristie commented Nov 11, 2016

Uh oh!

spectras commented Nov 11, 2016

Uh oh!

fredpalmer commented Nov 16, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summing Up...

Uh oh!

spectras commented Nov 16, 2016

Uh oh!

carltongibson commented Nov 16, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tomchristie commented Nov 16, 2016

Uh oh!

Uh oh!

spectras commented Nov 11, 2016 •

edited

Loading

fredpalmer commented Nov 16, 2016 •

edited

Loading

carltongibson commented Nov 16, 2016 •

edited

Loading