-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[Yaml] Add tags support #21194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Yaml] Add tags support #21194
Conversation
Note that the simpler TaggedValue proposal is still compatible with having the Yaml component hydrate things. It would just require a separate class whose responsibility would be solely to hydrate the TaggedValue objects to something else. This would remove this responsibility from the Yaml parser classes, which looks like a good idea to me. |
} | ||
|
||
if (null !== self::$tagResolver && $scalar && '!' === $scalar[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a particular big fan of putting conditions like '&& $scalar' because it takes away some readability. There are a lot of ways of validating '$scalar' (empty(), is_null(), etc...) instead of making me believe (at first sight) that it is a boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we can just remove it, this case is already managed earlier. Thanks!
{ | ||
private $tags = array(); | ||
|
||
public function addTag(TagInterface $tag, $priority = 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be a good idea to add some phpDoc about the parameters here.
|
||
private function prefixTag($tag) | ||
{ | ||
if ($tag && '!' === $tag[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is $tag a bool? If you validate $tag and it is a string I think it makes more sense to use any of the available methods just for improved readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be great to leverage this for DI =iterator
& =closure-proxy
in the same PR so that we can juge on an actual use case.
/** | ||
* @author Guilhem N. <egetick@gmail.com> | ||
* | ||
* @internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure this should be internal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be part of the api in 4.0 but as it never ends in userland for now, making it internal allows us to change its name (or maybe just remove it in favor of a simple ParseException
?).
try { | ||
return self::$tagResolver->resolve($scalar, $tag); | ||
} catch (UnsupportedTagException $e) { | ||
// Ignored for bc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should trigger a deprecation here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I removed the exception in the last commit (and its catch), I think we can do it later to focus on the feature.
@@ -44,6 +48,16 @@ public function __construct($offset = 0, $totalNumberOfLines = null, array $skip | |||
$this->offset = $offset; | |||
$this->totalNumberOfLines = $totalNumberOfLines; | |||
$this->skippedLineNumbers = $skippedLineNumbers; | |||
$this->tagResolver = new TagResolver(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a constructor argument instead? (default null, no need for any instance when not using tags)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #21194 (comment)
* @param TagInterface $tag | ||
* @param int $priority | ||
*/ | ||
public function addTag(TagInterface $tag, $priority = 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for this method, requiring a tagResolver as constructor is enough imho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looked complicated to use to me as there are already 3 internal arguments in the constructor (that should be removed imo but not sure we can do that because of bc).
People would be forced to do new Parser(0, null, array(), new TagResolver());
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #21230
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, then w know how to make that the first arg: with a BC layer
try { | ||
return $this->doParse($value, $flags); | ||
} finally { | ||
Inline::$tagResolver = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be restored to the previous value of Inline::$tagResolver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woudln't it be better to pass it as argument instead of forcing to maintain some state externally ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolas-grekas this is only done for bc (Inline
wasn't internal in 3.0) but it must not be used in userland so I think it's fine to always remove the previous value.
@stof it is following this. I think it would be better to instantiate Inline
and pass the TagResolver
as a constructor argument, wdyt?
try { | ||
return $this->tagResolver->resolve($data, $tag); | ||
} catch (UnsupportedTagException $e) { | ||
// ignored for bc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deprecation
* | ||
* @internal | ||
*/ | ||
final class TagResolver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at this closer, we might not need it: we could simply feed Yaml\Parser
with an associative array of tags-to-tagresolvers:
new Parser(..., ..., array('iterator' => new IteratorTagResolver())
(prefix excluded, or auto added)
we don't need to handle the "priority" + "supports()" logic, it just adds complixity to me, no flexibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolas-grekas imo #21194 (comment) is a big locker in using a constructor argument. I'll submit a PR to see if we can do something about it.
we don't need to handle the "priority" + "supports()" logic, it just adds complixity to me, no flexibility
It could be useful to allow implicit tags (foo
becoming !bar foo
for example, based on regexes and on its path) but I'm not sure this is very useful in userland so will try :)
$tagName = '!'.substr($tagName, 18); | ||
} | ||
|
||
$this->tags[$tagName] = $tag; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that $tag has the required interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should indeed be safer
|
||
foreach ($tags as $tagName => $tag) { | ||
if (!$tag instanceof TagInterface) { | ||
throw new \InvalidArgumentException(sprintf('Expected tag of type "%s", given "%s".', TagInterface::class, get_class($tag) ?: gettype($tag))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_class
triggers a warning when given a non-object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, thanks !
Should I add mappings/sequences support here to be able to implement |
It would be good to have support in this PR I think. |
UPGRADE-4.0.md
Outdated
* The default value of the strict option of the `Choice` Constraint has been | ||
changed to `true` as of 4.0. If you need the the previous behaviour ensure to | ||
changed to `true` as of 4.0. If you need the the previous behaviour ensure to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double "the" while changing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, but not related ;) see #21250
try { | ||
return $this->doParse($value, $flags); | ||
} finally { | ||
Inline::$tags = array(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, when reading this line, I think we should restore to the previous value. I know you explained it's useless, but this requires contextual knowledge, and thus with no guarantee. Better make the code logic local.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you were right, it was creating a bug when entering a block (because of nested parsers).
* | ||
* @return mixed | ||
*/ | ||
public function construct($value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should pass $tag and $value, so that one could create a single object to manage several tags if required.
$tag should match the key used when registering if possible (no namespace added/removed).
About "construct' => "createValue"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR was merged into the 3.2 branch. Discussion ---------- Fix a typo | Q | A | ------------- | --- | Branch? | 3.2 | Bug fix? | no | New feature? | no | BC breaks? | no | Deprecations? | no | Tests pass? | yes | Fixed tickets | #21194 (comment) | License | MIT | Doc PR | <!-- - Bug fixes must be submitted against the lowest branch where they apply (lowest branches are regularly merged to upper ones so they get the fixes too). - Features and deprecations must be submitted against the master branch. - Please fill in this template according to the PR you're about to submit. - Replace this comment by a description of what your PR is solving. --> Commits ------- 2dcb22e Fix a typo
@@ -20,7 +22,7 @@ | |||
*/ | |||
class Parser | |||
{ | |||
const TAG_PATTERN = '((?P<tag>![\w!.\/:-]+) +)?'; | |||
const TAG_PATTERN = '!(?P<tag>[\w!.\/:-]+)'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I create a new constant/inline it instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't wait to see the DI part here also :)
$data = array(); | ||
$context = null; | ||
$allowOverwrite = false; | ||
while ($this->moveToNextLine()) { | ||
|
||
if(null !== ($tag = $this->getLineTag($this->currentLine))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space after if, no need for brackets around assignment
@@ -129,6 +174,15 @@ public function parse($value, $flags = 0) | |||
// array | |||
if (!isset($values['value']) || '' == trim($values['value'], ' ') || 0 === strpos(ltrim($values['value'], ' '), '#')) { | |||
$data[] = $this->parseBlock($this->getRealCurrentLineNb() + 1, $this->getNextEmbedBlock(null, true), $flags); | |||
} elseif (null !== ($subTag = $this->getLineTag(ltrim($values['value'], ' ')))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for brackets
if ($mergeNode) { | ||
// Merge keys | ||
} elseif (!isset($values['value']) || '' == trim($values['value'], ' ') || 0 === strpos(ltrim($values['value'], ' '), '#')) { | ||
} elseif (!isset($values['value']) || '' == trim($values['value'], ' ') || 0 === strpos(ltrim($values['value'], ' '), '#') || null !== ($subTag = $this->getLineTag(ltrim($values['value'], ' ')))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brackets
|
||
private function getLineTag($value) | ||
{ | ||
if ('' === $value || '!' !== $value[0] || 1 !== preg_match('/^'.self::TAG_PATTERN.' *( +#.*)?$/', $value, $matches)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"m" modifier missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Values passed to this function do not contain newlines, so no need for "m".
{ | ||
/** | ||
* @param mixed $value | ||
* @param string $tag the tag name used to register the tag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The + missing alignment (The should start one space after $value, isn't it?)
@nicolas-grekas thanks, i'll fix your comments this evening. |
Last commits added inline support ( Fabbot failure is normal (the current yaml parser can't parse the updated yaml files). |
* | ||
* @internal | ||
*/ | ||
final class IteratorTag implements TagInterface, DumpableTagInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is also "=closure_proxy" which could benefit from yaml tags.
which means we could have only one class - eg "ArgumentTag" for the DI component, able to deal with all tags it'll need to support (these 2 today and more in the future, no forward compat issue here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
$value = array('=iterator' => $value->getValues()); | ||
} elseif ($value instanceof ClosureProxyArgument) { | ||
$value = array('=closure_proxy' => $value->getValues()); | ||
if ($value instanceof IteratorArgument || $value instanceof ClosureProxyArgument) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instanceof ArgumentInterface
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't know the tag corresponding to other instances so I think it's better this way here.
* | ||
* @internal | ||
*/ | ||
final class ArgumentTag implements TagInterface, DumpableTagInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for final, especially if @internal
return new IteratorArgument($value); | ||
} | ||
|
||
if (!is_array($value) || array(0, 1) !== array_keys($value) || !is_string($value[0]) || !is_string($value[1]) || 0 !== strpos($value[0], '@') || 0 === strpos($value[0], '@@')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be safe and explicit, this should be in an if ('closure_proxy' === $tag) {
block, and the method should throw otherwise
$value = array('=closure_proxy' => $value->getValues()); | ||
if ($value instanceof IteratorArgument || $value instanceof ClosureProxyArgument) { | ||
$argument = clone $value; | ||
$argument->setValues($this->dumpValue($argument->getValues())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a code smell to me:
this creates an instance of IteratorArgument
that is filled with a non usable state. Let's imagine that "setValues" does some validation, which it could very well do - that would make the current design fail.
I think we need a more robust design here, that preserves the semantic of objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure there are ways to fix this with the current design.
I still think the "TaggedValue" proposal could be a good idea to try :)
src/Symfony/Component/Yaml/Yaml.php
Outdated
@@ -29,6 +29,7 @@ class Yaml | |||
const DUMP_OBJECT_AS_MAP = 64; | |||
const DUMP_MULTI_LINE_LITERAL_BLOCK = 128; | |||
const PARSE_CONSTANT = 256; | |||
const PARSE_CUSTOM_TAGS = 512; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PARSE_TAGGED_VALUE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if people won't think that built-in tags are also converted to a TaggedValue
with this name. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK for me, keep as is
@trigger_error(sprintf('Using the unquoted scalar value "%s" is deprecated since version 3.3 and will be considered as a tagged value in 4.0. You must quote it.', $scalar), E_USER_DEPRECATED); | ||
} | ||
|
||
// Optimise for returning strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize
$this->totalNumberOfLines = $totalNumberOfLines; | ||
$this->skippedLineNumbers = $skippedLineNumbers; | ||
if (func_num_args() > 0) { | ||
@trigger_error(sprintf('Constructor arguments $offset, $totalNumberOfLines, $skippedLineNumbers of %s are deprecated and will be removed in 4.0', self::class), E_USER_DEPRECATED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this deprecation makes sense. But this PR is already quite big and this change is not really related to the ability to parse YAML tags. Couldn't we move this to a new PR instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is equal to me but I can reopen #21230 if you prefer.
* @param mixed $value | ||
* @param string $tag | ||
*/ | ||
public function __construct($value, $tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about swapping arguments. IMO we usally have key/value pairs, but not value/key pairs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally see the tag as a complement of the value, not as the key; a value could be represented without a tag (a simple plain value).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt the same also - swapping would be my preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, done
<<<YAML | ||
- !foo | ||
- yaml | ||
- !!quz [bar] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tags starting with two exclamation marks are tags defined by the YAML spec. I do not think we should support them through a flag that indicates to parse custom tags (and if we want to parse built-in tags, we will have to validate their names IMO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we throw an exception then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @xabbuh - we need to keep BC of course
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
$nextOffset += strspn($value, ' ', $nextOffset); | ||
|
||
// Is followed by a scalar | ||
if (!isset($value[$nextOffset]) || !in_array($value[$nextOffset], array('[', '{'))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass true
as the third argument?
return; | ||
} | ||
|
||
if ($flags & Yaml::PARSE_CUSTOM_TAGS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everywhere else it's the other way around: YAML::PARSE_CUSTOM_TAGS & $flags
@@ -107,6 +109,10 @@ public static function parse($value, $flags = 0, $references = array()) | |||
$result = self::parseScalar($value, $flags, null, $i, true, $references); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to assume that we can always evaluate the scalar even when we are parsing the value of a tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xabbuh tags aren't supported on scalars yet anyway but updated to make sure we don't forget it in 4.0. thanks!
This PR was squashed before being merged into the 3.3-dev branch (closes #21350). Discussion ---------- [Yaml] Remove internal arguments from the api | Q | A | ------------- | --- | Branch? | master | Bug fix? | no | New feature? | no | BC breaks? | no | Deprecations? | yes | Tests pass? | yes | Fixed tickets | #21230 | License | MIT | Doc PR | Reopening of #21230 because of [@xabbuh's comment](#21194 (comment)). > This PR removes internal constructor arguments of the `Parser` class. > This should break nothing as they are only used to build errors message and are clearly meant for internal uses. > > This would allow to have a nicer api for #21194 (comment). Commits ------- ebae4ff [Yaml] Remove internal arguments from the api
rebase needed |
I think we should flag the feature as experimental. @GuilhemN Can you do that? |
@xabbuh done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Status: Reviewed
Thank you @GuilhemN. |
This PR adds custom tags support to the Yaml component.
Symfony tags (
!!binary
,!str
, etc.) are still managed in the parser to have a lighter diff but we'll be able to convert them later if we want to.The primary addition of this PR is the
TagInterface
:It can be used to register custom tags. An example that could be used to convert the syntax
=iterator
to a tag:If you think this is too complex, @nicolas-grekas proposed an alternative to my proposal externalizing this support by introducing a new class
TaggedValue
.