[Yaml] Allow tabs as separators between tokens #40514

Merged
merged 1 commit into from
Apr 23, 2021
2 changes: 1 addition & 1 deletion src/Symfony/Component/Yaml/Parser.php
@@ -200,7 +200,7 @@ private function doParse(string $value, int $flags)
array_pop($this->refsBeingParsed);
}
} elseif (
-self::preg_match('#^(?P<key>(?:![^\s]++\s++)?(?:'.Inline::REGEX_QUOTED_STRING.'|(?:!?!php/const:)?[^ \'"\[\{!].*?)) *\:( ++(?P<value>.+))?$#u', rtrim($this->currentLine), $values)
+self::preg_match('#^(?P<key>(?:![^\s]++\s++)?(?:'.Inline::REGEX_QUOTED_STRING.'|(?:!?!php/const:)?[^ \'"\[\{!].*?)) *\:(( |\t)++(?P<value>.+))?$#u', rtrim($this->currentLine), $values)

Why a group (( |\t)) instead of a list ([ \t])? Lists (character classes) are a bit faster.

Contributor Author

No specific reason except that I didn't know that there was a performance difference. Is it significant though?
I have no strong opinion about one or the other, but this PR is merged so it'd require a new PR.

@bertramakers I ran a benchmark, and the list is 1% faster. I don't know whether that counts as a micro-optimization, but since this is a parser, it may be significant.

Contributor Author

I have no idea either. I just discovered this bug while using the Yaml component in my own app, then reported and fixed it to the best of my ability 😄 You can always open another PR to optimize it if you think it's worth it!

@bertramakers yeah, no problem! The fix works.
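
For readers following the group-versus-list discussion above, here is a minimal sketch (not part of the PR) that checks whether the two separator forms accept the same input. The `KEY` pattern and the `matchValue()` helper are hypothetical simplifications introduced for illustration. Only the separator part, `( |\t)++` versus `[ \t]++`, mirrors the regex in the diff.

```php
<?php
// Hypothetical, simplified key pattern; the real one in Parser.php is more involved.
const KEY = '(?P<key>[^:\s]+)';

// Separator written as a group (as merged) and as a list (as suggested in review).
$withGroup = '#^'.KEY.' *\:(( |\t)++(?P<value>.+))?$#u';
$withList = '#^'.KEY.' *\:([ \t]++(?P<value>.+))?$#u';

// Returns the captured value, '' for a key without a value,
// or null when the line does not match at all.
function matchValue(string $pattern, string $line): ?string
{
    if (1 !== preg_match($pattern, $line, $m)) {
        return null;
    }

    return $m['value'] ?? '';
}

$lines = ['foo: bar', "foo:\tbar", "foo: \tbar", "foo:\t bar", 'foo:', 'foo:bar'];

$mismatches = [];
foreach ($lines as $line) {
    $a = matchValue($withGroup, $line);
    $b = matchValue($withList, $line);
    if ($a !== $b) {
        $mismatches[] = $line;
    }
    // Show the input (JSON-escaped so tabs are visible) and what the group form captured.
    printf("%s => %s\n", json_encode($line), var_export($a, true));
}

echo [] === $mismatches ? "Both forms agree\n" : "Forms differ\n";
```

Under these assumptions the two forms behave identically on the sample lines, so any choice between them would rest on performance rather than behavior.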

&& (false === strpos($values['key'], ' #') || \in_array($values['key'][0], ['"', "'"]))
) {
if ($context && 'sequence' == $context) {
73 changes: 57 additions & 16 deletions src/Symfony/Component/Yaml/Tests/ParserTest.php
@@ -52,26 +52,67 @@ public function getNonStringMappingKeysData()
         return $this->loadTestsFromFixtureFiles('nonStringKeys.yml');
     }

-    public function testTabsInYaml()
+    /**
+     * @dataProvider invalidIndentation
+     */
+    public function testTabsAsIndentationInYaml(string $given, string $expectedMessage)
     {
-        // test tabs in YAML
-        $yamls = [
-            "foo:\n bar",
-            "foo:\n bar",
-            "foo:\n bar",
-            "foo:\n bar",
+        $this->expectException(ParseException::class);
+        $this->expectExceptionMessage($expectedMessage);
+        $this->parser->parse($given);
+    }
+
+    public function invalidIndentation(): array
+    {
+        return [
+            [
+                "foo:\n\tbar",
+                "A YAML file cannot contain tabs as indentation at line 2 (near \"\tbar\").",
+            ],
+            [
+                "foo:\n \tbar",
+                "A YAML file cannot contain tabs as indentation at line 2 (near \"\tbar\").",
+            ],
+            [
+                "foo:\n\t bar",
+                "A YAML file cannot contain tabs as indentation at line 2 (near \"\t bar\").",
+            ],
+            [
+                "foo:\n \t bar",
+                "A YAML file cannot contain tabs as indentation at line 2 (near \"\t bar\").",
+            ],
         ];
+    }

-        foreach ($yamls as $yaml) {
-            try {
-                $this->parser->parse($yaml);
+    /**
+     * @dataProvider validTokenSeparators
+     */
+    public function testValidTokenSeparation(string $given, array $expected)
+    {
+        $actual = $this->parser->parse($given);
+        $this->assertEquals($expected, $actual);
+    }

-                $this->fail('YAML files must not contain tabs');
-            } catch (\Exception $e) {
-                $this->assertInstanceOf(\Exception::class, $e, 'YAML files must not contain tabs');
-                $this->assertEquals('A YAML file cannot contain tabs as indentation at line 2 (near "'.strpbrk($yaml, "\t").'").', $e->getMessage(), 'YAML files must not contain tabs');
-            }
-        }
+    public function validTokenSeparators(): array
+    {
+        return [
+            [
+                'foo: bar',
+                ['foo' => 'bar'],
+            ],
+            [
+                "foo:\tbar",
+                ['foo' => 'bar'],
+            ],
+            [
+                "foo: \tbar",
+                ['foo' => 'bar'],
+            ],
+            [
+                "foo:\t bar",
+                ['foo' => 'bar'],
+            ],
+        ];
     }

     public function testEndOfTheDocumentMarker()