-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[Yaml] Allow tabs as separators between tokens #40514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Yaml] Allow tabs as separators between tokens #40514
Conversation
Hey! I see that this is your first PR. That is great! Welcome! Symfony has a contribution guide which I suggest you to read. In short:
Review the GitHub status checks of your pull request and try to solve the reported issues. If some tests are failing, try to see if they are failing because of this change. When two Symfony core team members approve this change, it will be merged and you will become an official Symfony contributor! I am going to sit back now and wait for the reviews. Cheers! Carsonbot |
The exception message formatting suggested on fabbot.io is unrelated to my changes. Not sure if I should fix that then. |
The tests failing on Appveyor also seem unrelated to me, so again I'm not sure how to make those green. |
Hey! Nice work here, I this makes me happy. I think @NamelessCoder has recently worked with this code. Maybe they can help review this? Cheers! Carsonbot |
The check about the formatting of exception message has some false positive (where we explicitly decide not to wrap in quotes as we are using placeholders for a different reason). |
'foo: bar', // space and tab after colon | ||
]; | ||
|
||
foreach ($yamls as $yaml) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be better implemented as 2 separate tests IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to splitting it up, but I based it on the existing test for tabs as indentation. Should I also change that one then for consistency?
And would a dataprovider also be okay, or do you prefer separate tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a data provider is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests of the YAML component are quite old (that component actually comes from symfony 1 originally), and so they don't really follow PHPUnit best practices. But ideally, new tests should follow them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks for clarifying! I haven't had much time to change it so far but I still intend to follow up on this somewhere in the coming days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why should tests not have a : void
return type?
Thank you @bertramakers. |
@@ -200,7 +200,7 @@ private function doParse(string $value, int $flags) | |||
array_pop($this->refsBeingParsed); | |||
} | |||
} elseif ( | |||
self::preg_match('#^(?P<key>(?:![^\s]++\s++)?(?:'.Inline::REGEX_QUOTED_STRING.'|(?:!?!php/const:)?[^ \'"\[\{!].*?)) *\:( ++(?P<value>.+))?$#u', rtrim($this->currentLine), $values) | |||
self::preg_match('#^(?P<key>(?:![^\s]++\s++)?(?:'.Inline::REGEX_QUOTED_STRING.'|(?:!?!php/const:)?[^ \'"\[\{!].*?)) *\:(( |\t)++(?P<value>.+))?$#u', rtrim($this->currentLine), $values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why group (( |\t)
) insteand of list ([ \t]
)? Lists are a bit faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No specific reason except that I didn't know that there was a performance difference. Is it significant though?
I have no strong opinion about one or the other, but this PR is merged so it'd require a new PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bertramakers I did a benchmark, and list is 1% faster. I do not know if it is considered a micro-optimization, but since it's a parser, maybe it's something significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea either, I just discovered this bug while using the Yaml component in my own app and reported and fixed it to the best of my ability 😄 I think you can always open another PR to optimize it if you think it's worth it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bertramakers yeah, no problem! The fix works.
As described in #40507, the Yaml spec allows tabs as whitespace characters between tokens. However, the Yaml parser crashes on this as it only expects spaces after the colon. https://yaml.org/spec/1.2/spec.html#id2778241
While I'm not a huge fan of it personally, it's an issue when a different linter tells us that a given YAML file with content that we have little control over has valid syntax in an unrelated check, and then our app crashes because it cannot be parsed after all.