Skip to content

[Yaml] Add support for dumping null as an empty string by using the Yaml::DUMP_NULL_AS_EMPTY flag #58232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

alexandre-daubois
Copy link
Member

Q A
Branch? 7.2
Bug fix? no
New feature? yes
Deprecations? no
Issues Fix #58155
License MIT

Empty value in Yaml is a valid representation of null, it makes sense to me to support dumping an empty value, just like dumping null as tilde is already supported.

@stof
Copy link
Member

stof commented Sep 11, 2024

should Dumper::dump() throw an exception when you use both DUMP_NULL_AS* flags at the same time ?

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Sep 11, 2024

Indeed!

@stof
Copy link
Member

stof commented Sep 11, 2024

what happens when dumping null as a top-level value in Yaml::dump ?

@alexandre-daubois
Copy link
Member Author

It dumps an empty string. Isn't it what it should do? Did you have something else in mind?

@@ -5,6 +5,7 @@ CHANGELOG
---

* Deprecate parsing duplicate mapping keys whose value is `null`
* Add support for dumping `null` as an empty string by using the `Yaml::DUMP_NULL_AS_EMPTY` flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think that „empty string“ is the correct term here. An empty string would be '' or "".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"As an empty value" maybe?

@stof
Copy link
Member

stof commented Sep 11, 2024

@alexandre-daubois It don't think an empty string is a valid representation for a null value in the parser. We need to ensure that the dumped file can be parsed again.

AFAIK, representing null by omitting it is only valid inside sequences and mapping values (because the syntax for them makes it unambiguous that a value is expected in that position). I'm not even sure it is valid in inline sequences and mappings.

@xabbuh
Copy link
Member

xabbuh commented Sep 11, 2024

I'm not even sure it is valid in inline sequences and mappings.

That’s indeed a good point. And we should make sure that the dumper results are actually parseable by our own parser.

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Sep 11, 2024

I have two solutions to propose:

  • We can only dump an empty value for null with the flag in arrays/object dumps. This complexifies (a little) the dumper, but it is doable
  • We keep the proposition as is. After playing a bit, the parser returns null on an empty string at root level, and it also works well with something like this:
foo:
  -

I can add one or two tests if necessary to emphasize the point if someone comes across the commit later on.

For the record, here's my approach on the first solution:

diff --git a/src/Symfony/Component/Yaml/Inline.php b/src/Symfony/Component/Yaml/Inline.php
index ebd409328f..1c194d7c00 100644
--- a/src/Symfony/Component/Yaml/Inline.php
+++ b/src/Symfony/Component/Yaml/Inline.php
@@ -95,12 +95,13 @@ class Inline
     /**
      * Dumps a given PHP variable to a YAML string.
      *
-     * @param mixed $value The PHP variable to convert
-     * @param int   $flags A bit field of Yaml::DUMP_* constants to customize the dumped YAML string
+     * @param mixed $value  The PHP variable to convert
+     * @param int   $flags  A bit field of Yaml::DUMP_* constants to customize the dumped YAML string
+     * @param bool  $nested True if processing a value nested in an array or object
      *
      * @throws DumpException When trying to dump PHP resource
      */
-    public static function dump(mixed $value, int $flags = 0): string
+    public static function dump(mixed $value, int $flags = 0, bool $nested = false): string
     {
         switch (true) {
             case \is_resource($value):
@@ -138,7 +139,7 @@ class Inline
             case \is_array($value):
                 return self::dumpArray($value, $flags);
             case null === $value:
-                return self::dumpNull($flags);
+                return self::dumpNull($flags, $nested);
             case true === $value:
                 return 'true';
             case false === $value:
@@ -224,7 +225,7 @@ class Inline
         if (($value || Yaml::DUMP_EMPTY_ARRAY_AS_SEQUENCE & $flags) && !self::isHash($value)) {
             $output = [];
             foreach ($value as $val) {
-                $output[] = self::dump($val, $flags);
+                $output[] = self::dump($val, $flags, true);
             }
 
             return \sprintf('[%s]', implode(', ', $output));
@@ -247,19 +248,19 @@ class Inline
                 $key = (string) $key;
             }
 
-            $output[] = \sprintf('%s: %s', self::dump($key, $flags), self::dump($val, $flags));
+            $output[] = \sprintf('%s: %s', self::dump($key, $flags), self::dump($val, $flags, true));
         }
 
         return \sprintf('{ %s }', implode(', ', $output));
     }
 
-    private static function dumpNull(int $flags): string
+    private static function dumpNull(int $flags, bool $nested = false): string
     {
         if (Yaml::DUMP_NULL_AS_TILDE & $flags) {
             return '~';
         }
 
-        if (Yaml::DUMP_NULL_AS_EMPTY & $flags) {
+        if (Yaml::DUMP_NULL_AS_EMPTY & $flags && $nested) {
             return '';
         }

@stof
Copy link
Member

stof commented Sep 11, 2024

We should indeed have tests covering the other usages:

  • top-level null
  • null in expanded mapping
  • null in expanded sequence
  • null in inline mapping
  • null in inline sequence

And we should probably have those tests trying the full roundtrip too, to ensure that parsing gives the expected result.

@alexandre-daubois
Copy link
Member Author

I'm investigating in https://yaml.org/spec/1.2.2/#empty-nodes. I implemented tests for each case you mentioned. However, there's (only) one case that the Symfony parser doesn't handle well: foo: [ bar, , baz ]. Currently, Symfony ignores the value in the middle.

Here's what I have I mind: let's keep the PR as is, because it is valid YAML. In the meantime, I'm investigating on the Inline::parseSequence() side to bring a bug fix in 5.4 to support empty values in inline sequences. Does it sound good?

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Sep 11, 2024

There are a few limitations in the current parser: there this empty value problem in sequences, but also empty keys aren't supported in mapping and sequences. This makes the potential fix way trickier.

Unless I missed something, I'm not sure it worth the extra complexity and pretty big changes across the parser to be able to dump null as an empty value.

Thanks to everyone involved in the review of this PR.

@stof
Copy link
Member

stof commented Sep 11, 2024

Both the node’s properties and node content are optional. This allows for a completely empty node. Completely empty nodes are only valid when following some explicit indication for their existence.

Based on this quote from the specification (from the link you gave), I don't think dumping a top-level null value as empty content is valid in Yaml (there is no indication of existence in such case)

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Sep 11, 2024

Indeed. Also the blocker for sequences is that having a comma followed by spaces then by a new comma is the "explicit indication" of existence. However, the current sequence parsing logic makes it difficult to do this type of check, thus the complexity it would bring to support this. Supporting null keys in seq and mappings is another serious challenge by itself.

To me, it would worth it if there's a big demand to support these features. But just to be able to dump null in a different format the same way everywhere, it seems a bit too much.

@stof
Copy link
Member

stof commented Sep 11, 2024

Supporting null keys in sequences does not make sense as sequences don't have keys.

Supporting null keys in mappings is a no-go as we parse mappings into PHP arrays that don't support null as keys (we don't support boolean, custom tags or mappings as keys either in our parser due to that reason)

@alexandre-daubois
Copy link
Member Author

Indeed, I meant "supporting null keys for mapping and null values in seq" 😄

To be sure to understand, do you advocate for or against the feature proposed by this PR, after this discussion?

@stof
Copy link
Member

stof commented Sep 11, 2024

To me, either we should support the feature (where valid, so not for top-level null) while also ensuring that we can parse it back, or we should not support the flag at all. So this gives us 3 solutions:

  • status quo
  • adding support for the flag but making dump as empty content only in expanded mapping and sequence values, not in inline values (where we would fallback to null or ~ instead).
  • adding support for null values in sequences and supporting for dumping null as empty everywhere except top-level null values (again, we don't care about null values in mapping keys as we won't ever be able to dump a PHP array containing a null key anyway)

@alexandre-daubois
Copy link
Member Author

Right, that's crystal clear. I'll try to come up with something that matches one of the solution. Thanks 👍

xabbuh added a commit that referenced this pull request Dec 29, 2024
…e by using the `Yaml::DUMP_NULL_AS_EMPTY` flag (alexandre-daubois)

This PR was merged into the 7.3 branch.

Discussion
----------

[Yaml] Add support for dumping `null` as an empty value by using the `Yaml::DUMP_NULL_AS_EMPTY` flag

| Q             | A
| ------------- | ---
| Branch?       | 7.2
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Issues        | Fix #58155
| License       | MIT

Another try after #58232. I went for the second solution from `@stof` (#58232 (comment)), namely dumping empty value as null just in expanded sequences and mappings. When the inline limit of the dumper is reached, it fallbacks on using `null` (examples in tests).

I'm not 100% happy about the name of the constant, `DUMP_NULL_AS_EMPTY_IN_EXPANDED_NOTATION`. It's verbose. I'm open to suggestions!

Commits
-------

1fc1379 [Yaml] Add support for dumping `null` as an empty value by using the `Yaml::DUMP_NULL_AS_EMPTY` flag
symfony-splitter pushed a commit to symfony/yaml that referenced this pull request Dec 29, 2024
…e by using the `Yaml::DUMP_NULL_AS_EMPTY` flag (alexandre-daubois)

This PR was merged into the 7.3 branch.

Discussion
----------

[Yaml] Add support for dumping `null` as an empty value by using the `Yaml::DUMP_NULL_AS_EMPTY` flag

| Q             | A
| ------------- | ---
| Branch?       | 7.2
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Issues        | Fix #58155
| License       | MIT

Another try after #58232. I went for the second solution from `@stof` (symfony/symfony#58232 (comment)), namely dumping empty value as null just in expanded sequences and mappings. When the inline limit of the dumper is reached, it fallbacks on using `null` (examples in tests).

I'm not 100% happy about the name of the constant, `DUMP_NULL_AS_EMPTY_IN_EXPANDED_NOTATION`. It's verbose. I'm open to suggestions!

Commits
-------

1fc1379028 [Yaml] Add support for dumping `null` as an empty value by using the `Yaml::DUMP_NULL_AS_EMPTY` flag
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Yaml] Dumping with no value (like Docker Compose volumes section) isn't supported
5 participants