-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[Routing] [UrlGenerator] Encode slashes and adapt closer to RFC 3986 #39339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is a proof of concept for symfony#13017 and addresses some pressing issues when comparing the current implementation with RFC 3986. If this would be adapted in Symfony, the changed behaviour probably should become optional as some changes will break backwards compatibility. For now this should be fine to point out the changed behaviour.
Hey! I did a quick review of this PR, I think most things looks good. To help keep things organized, we don't allow "Draft" pull requests. Could you please click the "ready for review" button or close this PR and open a new one when you are done? Note that a pull request does not have to be "perfect" or "ready for merge" when you first open it. We just want it to be ready for a first review. Cheers! Carsonbot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for being up to your words!
'%2C' => ',', | ||
'%24' => '$', | ||
'%26' => '&', | ||
'%27' => '\'', // TODO: we may want an option to include this as original implementation claims usage in HTML, although this is wrong and htmlspecialchars et. al should be used here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should remove this from the list to preserve the encoding of single quotes
even though encoding this char is not required by the RFC, the URLs generated by Symfony should preserve this property of being safe from the pov of HTML and JavaScript escaping.
I would not make this configurable.
*/ | ||
protected $decodedChars = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should preserve this property and its semantic for BC reasons.
@@ -200,12 +223,12 @@ protected function doGenerate(array $variables, array $defaults, array $requirem | |||
return ''; | |||
} | |||
|
|||
$url = $token[1].$mergedParams[$varName].$url; | |||
$url = $token[1].self::encodeVariable($mergedParams[$varName]).$url; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check: why isn't $token[1]
encoded in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Had that in mind but did not care for now as this was not covered by any test...
$token[1]
should most likely handled as decoded so we would need to encode here.
TODO: add test for encoding $path
from Route
to document intended bahviour.
@@ -303,11 +323,11 @@ protected function doGenerate(array $variables, array $defaults, array $requirem | |||
} | |||
|
|||
if ($extra && $query = http_build_query($extra, '', '&', \PHP_QUERY_RFC3986)) { | |||
$url .= '?'.strtr($query, self::QUERY_FRAGMENT_DECODED); | |||
$url .= '?'.strtr($query, self::decodeQueryFragmentChars()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method call should be replaced by a static variable or a const.
} | ||
|
||
if ('' !== $fragment) { | ||
$url .= '#'.strtr(rawurlencode($fragment), self::QUERY_FRAGMENT_DECODED); | ||
$url .= '#'.strtr(rawurlencode($fragment), self::decodeQueryFragmentChars()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method call should be replaced by a static variable or a const.
|
||
private static function encodePath(string $path): string | ||
{ | ||
return strtr(rawurlencode($path), self::decodePathChars()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest removing the method and encoding inline
|
||
private static function encodeVariable(string $var): string | ||
{ | ||
return strtr(rawurlencode($var), self::decodeVariableChars()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest removing the method and encoding inline
public function testEncodingOfSlashInPath() | ||
{ | ||
$routes = $this->getRoutes('test', new Route('/dir/{path}/dir2', [], ['path' => '.+'])); | ||
$this->assertSame('/app.php/dir/foo%2Fbar/dir2', $this->getGenerator($routes)->generate('test', ['path' => 'foo/bar'])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a serious change with BC implications.
I would suggest a more conservative behavior that would make this test pass:
$this->assertSame('/app.php/dir/foo/bar%2Fbaz/dir2', $this->getGenerator($routes)->generate('test', ['path' => 'foo/bar%2Fbaz']));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on the implications (also mentioned in the description).
The more conservative behaviour looks confusing to the user.
I would expect UrlGenerator to do the encoding for me, so I could pass any $data
to $urlGenerator('route', ['path' => $data]);
.
Taking your suggestion I would have to call it as $urlGenerator('route', ['path' => rawurlencode($data)]);
.
I would have to play a bit more with code to get a stronger opinion on this and the exact implications.
In the long run, I see the best benefit moving the responsibility and ability to encode any arbitrary parameter into UrlGenerator.
yield ['/app.php/a%2Fb/b(ar/c%2Fd%2Fe', '/{foo}/b(ar/{baz}', '.+(?=/b\\(ar/)']; | ||
yield ['/app.php/a%2Fb/bar/c%2Fd%2Fe', '/{foo}/bar/{baz}', '.+(?!$)']; | ||
yield ['/app.php/bar/a%2Fb/bam/c%2Fd%2Fe', '/bar/{foo}/bam/{baz}', '(?<=/bar/).+']; | ||
yield ['/app.php/bar/a%2Fb/bam/c%2Fd%2Fe', '/bar/{foo}/bam/{baz}', '(?<!^).+']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these changes required?
if not, better add new lines in the provider and keep existing lines as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the current implementation these changes are required. I think this will change for the final implementation to maintain some level of BC and will adapt to keep changes to existing tests to a minimum. For now that outlines the implications.
Thank you for your review. I will adapt your suggestions when getting back to the code. Now it may be a good point to have some discussion on how in incorporate these changes. As written before...
Some ways I had in mind... I think that changing UrlGenerator in that way may cause unforseen consequences in many applications. So the proposed new behaviour should become optional and avoid breaking BC. I consider taking the changes as suggested into the current UrlGenerator is not acceptable. An option would be to introduce another UrlGenerator with just copying the current code. Copying usually sounds bad. Refactor the current UrlGenerator in a way that it can be extended for the changed behaviour. I am not very deep into the symfony development and so I would stand back from touching any single line of code that has broad usage and proven reliablity. Although it may provide a reasonable result. Add an option to activate the changed bahviour? For now my favourite way as only a small fraction of code is adapted. Finally there were changes on when and which characters to encode in path, query and fragment. Just change them as part of this PR? Make them part of the change behaviour? Introduce another option? Make this a new PR? Don't change? |
Just a note here for reference: Apache and encoding slashes don't like each other very much. Look for "apache slash encoding" in a search engine, and you will see plenty of references. Not sure about recent versions of Apache, but I know it was a real mess a few years back. |
Apache needs to explicitly allow encoded slashes. See http://httpd.apache.org/docs/current/mod/core.html#allowencodedslashes |
I propose #39732 instead. |
Let's continue on #39732 as I think this PR goes a little bit too far. |
…ed slashes when generating URLs (nicolas-grekas) This PR was merged into the 5.3-dev branch. Discussion ---------- [Routing] don't decode nor double-encode already encoded slashes when generating URLs | Q | A | ------------- | --- | Branch? | 5.x | Bug fix? | no | New feature? | yes | Deprecations? | - | Tickets | Fix #13017 | License | MIT | Doc PR | - Replaces #39339 Commits ------- eaac18b [Routing] don't decode nor double-encode already encoded slashes when generating URLs
…ed slashes when generating URLs (nicolas-grekas) This PR was merged into the 5.3-dev branch. Discussion ---------- [Routing] don't decode nor double-encode already encoded slashes when generating URLs | Q | A | ------------- | --- | Branch? | 5.x | Bug fix? | no | New feature? | yes | Deprecations? | - | Tickets | Fix #13017 | License | MIT | Doc PR | - Replaces symfony/symfony#39339 Commits ------- eaac18be6f [Routing] don't decode nor double-encode already encoded slashes when generating URLs
I hope that just opening a PR ist appropriate to get some feedback on changing UrlGenerator. If this would be adapted in Symfony, the changed behaviour probably should become optional as some changes will break backwards compatibility.
For now this should be fine to focus on changed behaviour.
This is a proof of concept for #13017 and addresses some pressing issues when comparing the current implementation with RFC 3986.
This PR changes how urlencoding is performed. Instead of just encoding the whole URL after replacing parameters, the tokens are encoded while prepending them to the constructed URL. That allows changing the behaviour for different kinds of tokens, so that parameters are encoded included the slashes.
The character set that is reprsented unencoded has been altered while verifying the implementation against the RFC.