-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[Serializer][FrameworkBundle] Add a CSV encoder #19197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -977,6 +978,13 @@ private function registerSerializerConfiguration(array $config, ContainerBuilder | |||
$definition->addTag('serializer.normalizer', array('priority' => -900)); | |||
} | |||
|
|||
if (class_exists('Symfony\Component\Serializer\Encoder\CsvEncoder')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use ::class here also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird (accessing a constant on a class that doesn't exist) but I confirm it works.
LGTM 👍 |
$result = array(); | ||
$this->flatten($value, $result); | ||
|
||
if (!$headersAdded) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dunglas, instead of this condition, why don’t you just set csv headers before the foreach ?
forget it, you need to extract the keys from the value… sorry.
@dunglas , thanks for your work ;) Still, I may be wrong, but I think this case may be problematic with your implementation: Given an entity (Person) with an 0,n attribute association (Addresses). I know it’s a pretty annoying use case but I think that flattening in this is case may be more a con that a pro. A workaround could be a kind of serialization of the related entities and merge them together in one csv column. Not very elegant but maybe more robust in term of cohesion of the generated file. |
|
||
$this->assertEquals($expected, $this->encoder->decode(<<<'CSV' | ||
foo,bar.baz.bat | ||
a,b,c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why there is c
here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To test that it is silently ignored.
@gorghoa another solution can be to generate something like that:
WDYT? |
@dunglas , that may work but I still see two flaws: 1st, we’ll need to iterate through all the rows first in order to generate the headers (EDIT, we may compute them during the rows iterations and put them as first line of our file at the end of the loop. Loosing the possibility to stream content as it goes, though). 2nd, If another 0,n attribute (say phones) is present, we can’t use the trick of the last facultatives columns (or we have to fill empty columns with null value). Finally, I think that, with this dynamic set of column, it will be more complex for consumer systems (apart from excel and associates) to handle the csv representation. |
return $result; | ||
} | ||
|
||
// If there is only one data line in the document, return it is not considered has a collection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand well the last part of this phrase: [...] return it is not considered has a collection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should probably be // If there is only one data line in the document, return it (the line), the result is not considered as a collection
(has => as). Not sure this is very good design : if you can't be sure you can't get one-line long documents, you are going to have to write a special case just for this in client code. I think there should be a dedicated method for this use case if it is really needed, that would throw an exception if more than one line is found. BTW, this does not look like it is unit tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind this is to allow encoding single items (e.g. /product/1
) as a CSV document containing 1 one line without having to tune the normalizer. And this is unit tested (testEncode
/testEncodeCollection
/ testDecode
/testDecodeCollection
).
It's an implementation detail for 1 and 2 (and I've an idea of how do that, even if it will decrease a bit performances). IMO this approach is still better - even for naive clients - than generating and output not covered by the RFC (like columns in the column). |
@gorghoa I've looked at how various projects handle that (Sales Force, Drupal) and they all use a solution similar to yours. So after a thinking again, I'll implement what you suggest. Probably something like that:
|
@dunglas maybe we should also ask for the opinion of Sonata guys (mostly @soullivaneuh and @greg0ire) because they have experience creating CSV writers (they even have a separate project for that: https://github.com/sonata-project/exporter) |
fwrite($handle, $data); | ||
rewind($handle); | ||
|
||
$headers = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this var is intended to store an array later, it can be initialized as an empty array IMO. I don't see any special case null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get your point @roukmoute. Can you please show where is this happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep is as is to have a code as close as possible in encode
and decode
(and in encode $headers
cannot be initialized as an empty array because the normalized data can contain an empty array).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phansys: sorry I write a mistake.
IIRC, the Sonata Exporter has no special handling for multiple values. You need to handle this case yourself by adding a special method to the entity returning a string. |
|
/** | ||
* {@inheritdoc} | ||
*/ | ||
public function encode($data, $format, array $context = array()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if $data
is huge ? It would be great to have an interator here instead be able to use an iterator here. Not sure line 50 is compatible with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be an array because of the current design of the Serializer (Normalizers return arrays).
Not speaking for @soullivaneuh here, as far as I'm concerned, I'm just maintaining these writers, I didn't actually create them. I think a good inspiration source could be https://github.com/ddeboer/data-import/blob/master/src/Writer/CsvWriter.php cc @sagikazarmark @Baachi @ddeboer (input from those guys would be great IMO) |
There is another issue with the solution proposed by @gorghoa. If it's a collection of complex structures (like a collection of associative arrays), it becomes impossible to display it (or it will require to include a full CSV document inside the cell, and it looks bad to me). However, the flatten column solution works as expected:
Thought? |
} | ||
|
||
rewind($handle); | ||
$value = stream_get_contents($handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem here, what if $data
is huge? $value
will be too. Not sure you can do anything against that though, it depends if Symfony can work with a stream rather than a string…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not currently, but it impacts all other decoders as well. However it would be a nice addition to the component to be able to deal with streams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would indeed! Php has yield
now, which should make this kind of thing easier to work with (What if normalizers yielded values instead of returning a huge array they build ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be done in Symfony 3 because the minimal PHP version is now 5.5. However encoders must stay backward compatible with normalizers returning raw arrays (they must support both cases).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I create a new issue about this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding that to my todo list!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #19238
I think CSV is not the best format to represent deep data structures, and that you should simply throw an exception if the input data goes to deep. If people want to flatten things, they can still do it before feeding the result to your writer, which should IMO stay focused on doing one thing and doing it well. If you still feel a strong need however, then you should probably write a flattener class that would be able to wrap this one (which would be best with a stream friendly serializer |
@greg0ire's idea implemented (It's by far the best idea. Thanks.) and other comments fixed. |
I don't get it… Is it me or is your last push is 4 days old ? 😕 I don't see anything new, but then again github has been having strange caching issues lately. |
@greg0ire he rebased, this is why ^^ |
$headers = array_keys($result); | ||
fputcsv($handle, $headers, $this->delimiter, $this->enclosure, $this->escapeChar); | ||
} elseif (array_keys($result) !== $headers) { | ||
throw new InvalidArgumentException('To use the CSV encoder, each line in the data array must have the same structure. You may want to use a custom normalizer class to normalize the data format before passing it to the CSV encoder.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe linebreak this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, you convinced me !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fabpot even if I'm personnaly convinced about no linebreak for method and class signatures, I don't really understand why we can't linebreak this thing? It's a new class, no merge conflict possible here.
Could you tell us why? I'm curious. :-) Maybe a coding standard I missed?
I think we can at least put the string argument onto a new line.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the code to be dense vertically. Nowadays, screens have a lot of horizontal space. And anyway, this is just an error message, I don't want exceptions to take too much screen space when reading the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error messages should be keep on one line to allow searching them more easily in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nowadays, screens have a lot of horizontal space
That's why we disagree apparently : I use the fact that my screen is big to split windows and often have 4 files on the same screen. I think this is the very first factor of speed on my workflow, because it makes navigating files way easier than browsing through tabs IMO, but I know many people seem to like to use their screen for one file at a time, while they often need to switch between a controller, a service, a view and a translation file in applications, a class and its test in libraries.
Other things to take into consideration is that Github (for instance) will not display this nicely, and also long lines are harder to edit : people scroll way faster vertically than horizontally.
But anyway, your repo, your rules, so thanks for taking the time to give your point of view even if you're probably very busy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error messages should be keep on one line to allow searching them more easily in the code.
That's if you linebreak them in the middle of a sentence / proposition. This is still quite greppable :
throw new InvalidArgumentException(
'To use the CSV encoder,
each line in the data array must have the same structure.
You may want to use a custom normalizer class to normalize the data format
before passing it to the CSV encoder.'
);
Rebasing means you have to push again and rewrite commits, but ok… |
@greg0ire but rewriting commits can keep the original commit date, and github use the commit date, not the push date. |
Makes sense 💡 Thanks! |
I've added 2 unit tests for empty arrays and empty documents (and I fixed a bug in the decoder thanks to this test). |
Rebased. ping @symfony/deciders |
ping @symfony/deciders |
* @param array $result | ||
* @param string $parentKey | ||
*/ | ||
private function flatten(array $array, array &$result, $parentKey = '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be moved after public
methods
Idea that might be useful - a way of customising the column names. There might be value in being able to pass a mapping of names in the serializer context An example, you might have had this output:
But with a name mapping:
Changes to
It would be a simple look up when the headers are set, but allow much more flexible CSV generation. |
@mcfedr good idea but IMO it deserves its own PR. @symfony/deciders are you ok to merge this one? |
Thank you @dunglas. |
This PR was submitted for the master branch but it was merged into the 3.2 branch instead (closes #7040). Discussion ---------- [Serializer] Docs for CSV and YAML encoders Docs for symfony/symfony#19197 and symfony/symfony#19326. Commits ------- 5687972 [Serializer] Docs for CSV and YAML encoders.
} | ||
|
||
// If there is only one data line in the document, return it (the line), the result is not considered as a collection | ||
return $result[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as here: e71f5be#commitcomment-20876740
…r (GuilhemN) This PR was submitted for the 3.2 branch but it was merged into the 3.3 branch instead (closes #7654). Discussion ---------- [Serializer] Document the CsvEncoder and the YamlEncoder Document the CsvEncoder and the YamlEncoder introduced in symfony/symfony#19197 and symfony/symfony#19326. Commits ------- 6944ea1 [Serializer] Document the CsvEncoder and the YamlEncoder
Usage:
CSV files must contain a header line with property names as keys.
ping @clementtalleu @Simperfit @gorghoa