-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[DomCrawler] Allow pipe (|) character in link tags when using Xpath expressions #20235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ee7bf5c
to
ce3f110
Compare
151af3f
to
6af53a6
Compare
continue 2; | ||
case '[': | ||
case ']': | ||
$openedBrackets += '[' === $xpath[$i] ? 1 : -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would find it more readable to use 2 different cases rather than redoing a check inside it:
case '[':
++$openedBrackets;
continue 2;
case '[':
--$openedBrackets;
continue 2;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
$parenthesis = ''; | ||
$xpathLen = strlen($xpath); | ||
$openedBrackets = 0; | ||
$lastUnion = strspn($xpath, " \t\n\r\0\x0B"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain what you are looking for here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitespaces, chars list comes from trim()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$startPosition
may be a better name than $lastUnion
for this variable, as it actually stores the position of the beginning of the current expression (which is never the position of the last pipe, but a position after it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@nicolas-grekas this is a bugfix IMO, because the current code breaks on valid XPath. Btw, this is a bug affecting Drupal |
6af53a6
to
298ad29
Compare
switch ($xpath[$i]) { | ||
case '"': | ||
case "'": | ||
if (false === $i = strpos($xpath, $xpath[$i], $i + 1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't do assignments in the if condition, I think you can move the assignment just one line before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the contrary, I like dealing with the return value of strpos on the same line than it is called, to make things more readable as far as var types is concerned... so unless there is a strong push to change it, I'd personally prefer to keep it this way...
@nicolas-grekas I updated the gist at https://gist.github.com/stof/5131c79c576600800aac7313b182d44d to include your implementation, as well as an implementation short-circuiting XPath splitting when there is no pipe in the string (the benchmark method fits well the Mink use-case of returning split expressions. The short-circuiting needs to be adapted for BrowserKit here as the loop is also doing the processing directly instead of doing a separate loop over the split XPath for processing). |
I can confirm that this PR fixes the issue for Drupal. Luckily the patch almost applies to Symfony 2.7. It was still a bit hard to actually apply it, for testing locally I did it with this hack (documenting for myself for next time):
So the test file was not patched because Drupal removes it, but that is fine for testing with Drupal. |
@klausi when you install individual components, you indeed need to remap paths to apply the patch, as the patch is for the main repo. @nicolas-grekas performance for your version is indeed much better than #20229 |
@stof thanks for the bench! Then this is ready to merge/vote, isn't it? |
👍 |
@nicolas-grekas FYI, I plan to reuse your faster implementation in Mink as well, (as it needs splitting union parts too, and this is where @klausi submitted the implementation proposed here in #20229). You can send the PR yourselves if you prefer. the code is at https://github.com/minkphp/Mink/blob/master/src/Selector/Xpath/Manipulator.php |
Please do |
298ad29
to
17757d8
Compare
Thank you @klausi. |
…ing Xpath expressions (klausi, nicolas-grekas) This PR was merged into the 2.7 branch. Discussion ---------- [DomCrawler] Allow pipe (|) character in link tags when using Xpath expressions | Q | A | ------------- | --- | Branch? | master | Bug fix? | yes | New feature? | no | BC breaks? | no | Deprecations? | no | Tests pass? | yes | Fixed tickets | #20229 | License | MIT | Doc PR | - @klausi could you please validate this patch? Is it an improvement over yours? (sorry I don't have the proper use case to test.) Commits ------- 17757d8 [DomCrawler] Optimize DomCrawler::relativize() 5b26e33 [DomCrawler] Allow pipe (|) character in link tags when using Xpath expressions
@klausi could you please validate this patch? Is it an improvement over yours? (sorry I don't have the proper use case to test.)