Skip to content

Update existing vary headers in response instead of overwriting them. #5047

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 7, 2017

Conversation

kirberich
Copy link
Contributor

@kirberich kirberich commented Apr 4, 2017

Description

I've changed the behaviour of finalize_response to update vary headers in the response - previously, any existing vary headers would be wiped out going through DRF. Using patch_vary_headers assures that existing headers remain.

I ran the tests but am getting a bunch of (hopefully unrelated?) problems, so I'm passing the problem on to travis.

I'll happily add a test for this as well, some guidance on where it might best go would be good though :)

@kirberich kirberich force-pushed the merge-vary-headers branch from 936de4f to 6a15f3e Compare April 4, 2017 14:31
@xordoquy
Copy link
Collaborator

xordoquy commented Apr 4, 2017

I'd be happy to read a bit about why the change is required.
Why can't the vary be set by the views ?
Why should they be added instead of replaced ?

@kevin-brown
Copy link
Member

kevin-brown commented Apr 5, 2017

I'm +0 on this, since I see where DRF is completely wiping out any existing Vary headers and just setting Vary: Accept when there are multiple renderers. Based on looking through the code, this means that even if you set

return Response(headers={'Vary': 'X-My-Special-Header'})

It will always be returned with just Vary: Accept instead of Vary: X-My-Special-Header, Accept.

Assuming that's the case, that has the potential to cause issues with downstream proxies who rely on the Vary header.

@kirberich
Copy link
Contributor Author

Ah yes I blew straight past the context there didn't I 😀

Because the vary header is used by caches to decide when not to cache things, removing parts of the header can result in dangerous behaviour where responses get cached that the view defined shouldn't be cached. Here's a (contrived) example:

@api_view(['GET'])
@auth_classes([SuperSecretHTTPAuth])
@allow_caching(vary='Authorization') # super important, vary on authorization to make sure users can only get their own data from the cache!
def secret_user_details(request):
    user = user_from_auth_header(request.META['HTTP_AUTHORIZATION'])
    serializer = SecretUserData(instance=user)
    return Response(serializer.data)

The example is very silly, but we have this issue in a more complicated version of that, where we determine permissions from a JWT and need to cache responses, but need to make sure the cached responses are tied to a user.

With the current logic, when the response gets to DRF's finalize_response method the Vary header will be wiped out, resulting in an upstream cache just seeing Vary: Accept, instead of Vary: Accept,Authorization.

Patching the vary header instead of replacing it is also the common pattern in other bits of django, for example in the csrf middleware:

        # Content varies with the CSRF cookie, so set the Vary header.
        patch_vary_headers(response, ('Cookie',))

@xordoquy
Copy link
Collaborator

xordoquy commented Apr 5, 2017

current status: http://www.commitstrip.com/en/2016/01/26/genius-or-stupid/

@tomchristie what's the rational behind the vary header being wiped ?

@tomchristie
Copy link
Member

Possibly a simpler alternative would be to only set headers in the view that haven't already been set on the response. That'd be a simple and consistent approach that I think would cover what we'd need.

Ie, here https://github.com/kirberich/django-rest-framework/blob/6a15f3e47f31d1992cc5569414755991924bc923/rest_framework/views.py#L423

Instead use:

for key, value in self.headers.items():
    if key not in response:
        response[key] = value

That'd work fine for your use case @kirberich, right? You'd just want to make sure to included 'Accept' yourself.

Thoughts?

@tomchristie
Copy link
Member

what's the rational behind the vary header being wiped ?

Historical only.

Out of our defaults "Accept" and "Vary", would either of them be set already on a Response instance, if they'd not been added explicitly by the user?

@kirberich
Copy link
Contributor Author

@tomchristie that fix would solve our particular use case, but it might introduce weird bugs for people, because I think it can break content negotiation.

Here's another very contrived example to explain my thinking:

  1. User uses their browser to go to /api/endpoint/
  2. DRF renders HTML because of the request's Accept: text/html and use of the browseable API renderer
  3. The request gets cached, with the vary header having been overwritten as Vary: Authorization
  4. The user now makes an API request to the same url, with Accept: Application/json
  5. The cache sees that the two requests have the same Authorization header and returns the cached HTML version of the request, instead of rendering a new JSON response

I'm not sure about the Allow header, which is the only other header DRF sets (I think), but with Vary the right thing always seems to be to patch the headers instead of replacing them. You could require users to add the right headers, but this would require user code to know about DRF internals, like adding Vary: Accept headers if there's more than one renderer.

It's annoying that django doesn't have a simple patch_all_headers function to take care of this complexity, but as DRF only adds two headers it seems like a bit of special casing for Vary should be acceptable (especially as it's using standard django code to do it).

@tomchristie
Copy link
Member

tomchristie commented Apr 7, 2017

I don't have a problem with the user being responsible for ensuring that they set the Vary header correctly if they do set it, though perhaps I'm misreading here?

"with the vary header having been overwritten as Vary: Authorization"

Do you mean because user code explicitly included headers = {'Vary': 'authorization'}?
If so then I think it's just a constraint that any users setting the Vary header should be making sure to correctly include 'Accept' as one of the value parts.

@kirberich
Copy link
Contributor Author

I completely agree that the user should be responsible for setting the headers correctly, i.e. not overwriting existing ones, but I don't think the user can be responsible for also including headers that third-party code might add, as it requires knowledge of what goes on inside DRF and ties those behaviours together:

@api_view(['GET'])
def test_view(request):
    response = Response(some_serializer.data)

    # make sure we vary this response per user, be careful not to wipe out existing headers!
    patch_vary_headers(response, ['Authorization'])

    # Return response with patched headers
    return response

With a view like that, one of the following would happen (assuming multiple renderer classes are active)

  1. with the current code, the response will contain Vary: Accept (which can leak user data)
  2. with the "replace only if not set by user" method, the response will contain Vary: Authorization (which breaks content negotation)
  3. with the submitted patch, the response will contain Vary: Accept,Authorization

If adding the Vary:Accept was made to the be responsibility of the user, the code would have to look like this:

from rest_framework.settings import api_settings


@api_view(['GET'])
def test_view(request):
    response = Response(some_serializer.data)

    # make sure we vary this response per user, be careful not to wipe out existing headers!
    vary_headers = ['Authorization']
    # DRF adds a vary: accept header if multiple renderer classes are used, make sure we include this header here.
    if len(api_settings.DEFAULT_RENDERER_CLASSES) > 1:
        vary_headers.append('Accept')
    
    patch_vary_headers(response, vary_headers)

    # Return response with patched headers
    return response

I hope this makes sense! I'm also happy to do a quick hangout or something at some point to talk through this properly 🙂

# Add new vary headers to existing ones instead of overwriting.
new_vary_headers = cc_delim_re.split(self.headers.pop('Vary'))
patch_vary_headers(response, new_vary_headers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'm convinced.

Let's tweak the code block slightly tho. The pop is hidden away inside the rest of the code so it's not totally clear at first why it doesn't jsut later get overridden.

How about...

vary = self.headers.pop('Vary', None)
if vary is not None:
    ...

@tomchristie
Copy link
Member

Thanks for taking the time to walk through things, @kirberich.
I think the "REST framework only sets the header some of the time" is a deciding factor here.

Previously, any existing vary headers would simply be wiped out by DRF. Using patch_vary_headers assures that existing headers remain.
@kirberich kirberich force-pushed the merge-vary-headers branch from 6a15f3e to 9ebd5a2 Compare April 7, 2017 16:06
@kirberich
Copy link
Contributor Author

@tomchristie no worries, thank you for the quick responses!

I updated the code, let me know if that looks better 🙂 - I'm a bit unsure about the comment, I think it needs one, but I'm not sure if it's being clear enough.

@tomchristie tomchristie merged commit d300c3c into encode:master Apr 7, 2017
@tomchristie
Copy link
Member

Yeah works for me.

@tomchristie tomchristie added this to the 3.6.3 Release milestone Apr 7, 2017
@kirberich
Copy link
Contributor Author

Thank you! ✨

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants