Skip to content

[HttpKernel] Bugfix/last modified response strategy #42355

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 27, 2022

Conversation

aschempp
Copy link
Contributor

@aschempp aschempp commented Aug 3, 2021

Q A
Branch? 4.4
Bug fix? yes
New feature? no
Deprecations? no
Tickets Fix #41666
License MIT
Doc PR ~

The HttpCache uses the ResponseCacheStrategy to merge caching headers when merging ESI or SSI fragments into the main response content. Caching headers are merged as far as possible (age, cache-control/max-age, etc.). However, we forgot to merge the Last-Modified header which prevents revalidation of responses.

from @mpdude in #41666

Currently, ResponseCacheStrategy will always remove the Last-Modified header when merging ESI responses. This makes it impossible to revalidate the result.

My suggestion is to keep the maximum = most recent Last-Modified value on the final response as long as all (sub-)responses provided Last-Modified.

@carsonbot carsonbot added this to the 4.4 milestone Aug 3, 2021
@carsonbot carsonbot changed the title Bugfix/last modified response strategy [HttpKernel] Bugfix/last modified response strategy Aug 3, 2021
@aschempp
Copy link
Contributor Author

aschempp commented Aug 4, 2021

The failing tests don't seem related to me.

@OskarStark
Copy link
Contributor

OskarStark commented Aug 4, 2021

from the original issue:

I see @xabbuh has added the feature flag, but I wonder if this isn't actually a bug?

@xabbuh and @symfony/mergers , should we handle this as bug or feature?

@mpdude as you created the issue, can you please give us a review here? Thanks

@aschempp
Copy link
Contributor Author

aschempp commented Aug 5, 2021

I also stumbled over these lines:

// Last-Modified and Etag headers cannot be merged, they render the response uncacheable
// by default (except if the response also has max-age etc.).
if (\in_array($response->getStatusCode(), [200, 203, 300, 301, 410])
&& null === $response->getLastModified()
&& null === $response->getEtag()
) {
return false;
}

I can't remember why we added the check for Last-Modified and Etag, the RFC clearly states that 200, 203, 300, 301, 410 can be cached. A Last-Modified header would trigger revalidation as far as I can tell, but it can still be stored in the cache. Does anyone have more knowledge on this?

@aschempp
Copy link
Contributor Author

aschempp commented Aug 5, 2021

I tried to find an answer to my own question. @Toflar pointed out that ResponseCacheStrategy is based on RFC2616 (at least some comments say so) but RFC2616 has been superseded by RFC7234. I would suggest to update the implementation with RFC7234 but as a feature PR instead. RFC7234 is more clear on what can actually be considered cacheable.

@aleho
Copy link
Contributor

aleho commented Aug 5, 2021

Sorry if I'm not getting this entirely, but the idea of this PR is to allow e.g. a request including ESI to pass the Last-Modified header so that the client can actually try to check with If-Modified-Since (which is lost as soon as a page includes and ESI), right?

Wouldn't this change break if there's an actual 304 being returned by the app?

AbstractSurrogate::handle() does an if (!$response->isSuccessful()) { …, but 304 is not a valid response, so each request hitting the cache will break because of this change.

I tested this locally and could reproduce that behavior.

This could be easily fixed with if (!$response->isSuccessful() && $response->getStatusCode() !== Response::HTTP_NOT_MODIFIED) { but I'm really not sure whether I'm totally missing the point here.

@aschempp
Copy link
Contributor Author

aschempp commented Aug 9, 2021

Interesting that you could reproduce this locally, I didn't check on that at all. What I figured out is:

  1. the HttpCache already stores Last-Modified (and all other headers) in the cache.
  2. It does send If-Modified-Since to the backend if that header is present
    https://github.com/symfony/symfony/blob/5.4/src/Symfony/Component/HttpKernel/HttpCache/HttpCache.php#L383-L386
  3. if the response is 304, it should re-use the existing cache entry
    https://github.com/symfony/symfony/blob/5.4/src/Symfony/Component/HttpKernel/HttpCache/HttpCache.php#L397-L399

The point of the PR (or what @mpdude) meant in the original issue was to forward the most-recent Last-Modified header from multiple fragments to the final response. If that would be the case, then HttpCache would receive an If-Modified-Since header from the browser and would then in turn send a 304 if the final response generated after merging ESI still validates

$response->isNotModified($request);

@aleho
Copy link
Contributor

aleho commented Aug 9, 2021

@aschempp My steps to reproduce this are not that trivial to explain, I hope I can though. ;)

Request:

  1. Browser (If-Not-Modified-Since as stored)
  2. HttpCache (If-Not-Midified-Since also as stored in local cache)
  3. App

Response:

  1. App does the usual Response::isNotModified(), because the source data (e.g. updated_at field) didn't change it will return a 304.
  2. HttpCache does two requests, main request and subrequest (ESI) to a controller. App for both requests returns a Last-Modified and 304.
    2a. HttpCache replaces the response code with 200, doesn't care about modified headers.
    2b. HttpCache determines the Last-Modified header, response code is 304.

2b. is where I get a 500 because the response is not considered successful.

With this PR and low maxage I can observe one request being fine, one failing each time the cache is fresh enough.

@aschempp
Copy link
Contributor Author

aschempp commented Aug 9, 2021

I think I understand the problem, but rather because we're dealing with subrequests a lot in Contao (we even have our own fragment handler). The isSuccessful() check means it is currently not possible to return a 304 in an ESI fragment.

However, that's not really part of this PR and should probably be a separate issue? The proposed changes would neither fix nor worsen that. But it would allow the main request (the HttpCache) to send a 304 to the browser if possible. Would you agree?

@aleho
Copy link
Contributor

aleho commented Aug 9, 2021

In my testing, depending on cache freshness, there were 500 status responses, caused by this throw:
https://github.com/symfony/http-kernel/blob/c838cc1203534a030d035ff1380393bfd4c55e1c/HttpCache/AbstractSurrogate.php#L95-L102

Upstream isn't doing anything wrong – it's allowed to send a 304 as far as I'm aware of – and the app should not be forced to change it's behavior, no matter whether a CacheKernel is being used or not (again, as far as I understand it).

The changes in this PR would lead to a 500 response when there was none before. Even worse, serving a newly generated response works and subsequent responses won't, so it might be hard to debug that.

In my opinion this is a BC break, because nowhere in the docs there's a warning "Don't ever response with a 304 in fragment sub-requests" and, even though the status code was always 200, it would work as of now.

@aschempp
Copy link
Contributor Author

But the code lines you mentioned are not added by this pull request? The current surrogate (ESI) implementation do not allow for a response code <> 2xx, therefore this problem should also appear without this pull request? Can you please try to reproduce this in your local setup without my changes?

@aleho
Copy link
Contributor

aleho commented Aug 10, 2021

Let me try to explain it differently (I hope I can, language barrier is restricting).

Setup

Let's say there's a Symfony project that has been setting the last-modified header on sub-requests.
Haven't found any statement at all that would forbid this. I know I've been doing it (and its the goal of this PR, isn't it, am I missing something?).

Have a look at https://github.com/aleho/symfony-httpcache for a demo, it doesn't even set the upstream response to 304, just a last-modified header and 200, including content.

Problem

Before your changes, no response would include a Last-Modified header. No client would request with If-Modified-Since. Problem never observed.

After this PR there will be a Last-Modified header present, so each request by every browser will now set the since-header, resulting in a 500 (next request will work again because, afaik, browsers are supposed to clear the cache on a temporary 500).

It's true that this PR doesn't cause the root problem and the issue currently can also be triggered by setting the request header.

Nevertheless, this PR will definitely always trigger it, because clients implementing a cache are expected to request with If-Modified-Since.

Solution (just my suggestion)

Don't merge this PR unless AbstractSurrogate::handle() has been fixed in another PR or fix that simple if-statment in this PR.

@aleho
Copy link
Contributor

aleho commented Nov 10, 2021

@aschempp Are there any updates on this?

(Edit / PS: A made a PR fixing the 304 issue described above.)

@fabpot
Copy link
Member

fabpot commented Jun 27, 2022

It is a considered a new feature as it introduces something that was explicitly not supported before. Can you rebase on 6.2 so that I can merge (the code looks good to me.)

@fabpot fabpot modified the milestones: 4.4, 6.2 Jun 27, 2022
@fabpot
Copy link
Member

fabpot commented Jul 20, 2022

@aschempp Are you still interested in this PR?

fabpot added a commit that referenced this pull request Jul 20, 2022
…aleho)

This PR was merged into the 4.4 branch.

Discussion
----------

[HttpKernel] [HttpCache] Don't throw on 304 Not Modified

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | Fixes  #43997
| License       | MIT
| Doc PR        | ~

If the response cache keeps a `Last-Modified` header clients will request with `If-Modified-Since`.
The surrogate will not handle a `304 Not Modified` correctly, resulting in a 500 and a failed request.

This fixes that request / response cycle, as observed in testing PR #42355.

Commits
-------

d27f02a [HttpKernel] [HttpCache] Don't throw on 304 Not Modified
@aschempp aschempp changed the base branch from 4.4 to 6.2 July 27, 2022 06:58
@aschempp aschempp force-pushed the bugfix/last-modified-response-strategy branch from 71db7e7 to 2f862bb Compare July 27, 2022 06:58
@aschempp
Copy link
Contributor Author

PR updated 😊
The failing checks seem unrelated to me

@fabpot fabpot force-pushed the bugfix/last-modified-response-strategy branch from b39c5f1 to f3aae33 Compare July 27, 2022 16:01
@fabpot
Copy link
Member

fabpot commented Jul 27, 2022

Thank you @aschempp.

@fabpot fabpot merged commit 91d0168 into symfony:6.2 Jul 27, 2022
@aschempp aschempp deleted the bugfix/last-modified-response-strategy branch October 11, 2022 06:29
@fabpot fabpot mentioned this pull request Oct 24, 2022
nicolas-grekas added a commit that referenced this pull request Aug 16, 2024
…ional requests (mpdude)

This PR was squashed before being merged into the 6.4 branch.

Discussion
----------

[HttpKernel] ESI fragment content may be missing in conditional requests

| Q             | A
| ------------- | ---
| Branch?       | 6.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        |
| License       | MIT

The content for ESI-embedded fragments may be missing from the combined (main) response generated by the `HttpCache` under certain conditions:

1. The main request sent to the cache must be a conditional request with either `If-Modified-Since` or `If-None-Match`
2. The embedded response processed by the cache matches the validator (last-modified or etag)
3. The resulting (combined) response generated by the `HttpCache` does not match the validator

Condition 3 is necessary since otherwise the cache returns an empty 304 response. In that case, the issue still exists, but we don't see or care about it and the wrong body is not sent at all.

Regarding condition 2, it does not matter where this embedded response comes from. It may be a fresh (cached) response taken from the cache, it may be a stale cache entry that has been revalidated; probably it can even be a non-cacheable response.

In practice, the conditional request will always use `If-Modified-Since`: We're dealing with ESI subrequests, and the combined response created by the `HttpCache` does not provide `ETag`s, so a client has nothing to validate against.

Only since #42355 (merged in to 6.2) the main response will include a `Last-Modified` header, given that all of the included responses provided one. Probably that is the reason why this bug was not spotted earlier - it required that change and all of the responses processed by the cache must provide `Last-Modified` data.

Conditions 2 + 3 together seem unlikely, but may in fact happen easily when you have an application that generates different chunks of cacheable content for the main and embedded requests and adds last-modified information to them. For example:

* First request:  Main response modified at time 1, embedded fragment modified at time 2 -> last-modified at 2
* Data for main response changes at time 3
* Second request with "If-Modified-Since" time 2
* Embedded fragment still modified at time 2, main response at time 3 -> last-modified at 3
  * the embedded fragment is considered as not modified, content is stripped
  * main response is generated, fragment content is missing

Commits
-------

9fed8dc [HttpKernel] ESI fragment content may be missing in conditional requests
nicolas-grekas added a commit that referenced this pull request Aug 16, 2024
…omputation with ESI responses (mpdude)

This PR was merged into the 6.4 branch.

Discussion
----------

[HttpKernel] Improve and add tests for `Last-Modified` computation with ESI responses

| Q             | A
| ------------- | ---
| Branch?       | 6.4
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Issues        |
| License       | MIT

This PR slightly improves the test cases for #42355 (which fixed #41666).

It adds two tests for the cases where `Last-Modified` should _not_ be added,
and updates the existing test to make it more obvious that we only look at
the existence of `Last-Modified` headers, whereas the `ETag` is not
relevant.

Commits
-------

40d5455 Improve and add tests for Last-Modified computation with ESI responses
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HttpCache should provide Last-Modified for ESI responses when possible
6 participants