Skip to content

[Serializer] Fix security issue on CsvEncoder about CSV injection #24508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

welcoMattic
Copy link
Member

@welcoMattic welcoMattic commented Oct 10, 2017

Q A
Branch? master (4.1)
Bug fix? no
New feature? yes
BC breaks? yes
Deprecations? no
Tests pass? yes
License MIT

I read this article about CSV injection and I thought it best to update the CsvEncoder so that it does not generate potentially malicious CSV files by default.

@sroze
Copy link
Contributor

sroze commented Oct 10, 2017

Great initiative 👍 I'd target 3.4.

Copy link
Member

@javiereguiluz javiereguiluz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to prefixing a \t to problematic values, another solution proposed in the comments of the article is to prefix a = before the "..." because that forces Excel to interpret the value as text. Not sure if it's really better.

@welcoMattic welcoMattic force-pushed the fix/csv-encoder-formulas-char-escaping branch from 6d7dc03 to e254fcb Compare October 10, 2017 13:34
@xabbuh xabbuh added this to the 4.1 milestone Oct 11, 2017
@welcoMattic
Copy link
Member Author

Thanks @chalasr @kaznovac @xabbuh for your comments and review.
I just push fixes. I'm agree to set $escapeFormulas at false by default, after all negative numbers doesn't have to be escaped. And it will avoid BC Break.

Should I replace all assertEquals by assertSame in CsvEncoderTest class?

@welcoMattic welcoMattic force-pushed the fix/csv-encoder-formulas-char-escaping branch from 527dc6a to fc743a1 Compare October 11, 2017 08:38
@chalasr chalasr removed the BC Break label Oct 11, 2017
@chalasr
Copy link
Member

chalasr commented Oct 11, 2017

@welcoMattic Nope, that would create merge conflicts when merging lower branches up to master. Thanks.

@xabbuh
Copy link
Member

xabbuh commented Oct 11, 2017

Should we also allow to pass this option through the context so that you do not necessary have to configure this option globally?

@welcoMattic
Copy link
Member Author

@xabbuh why not. We should also make the context $escapeFormulas value overrides the global $escapeFormulas value. It could be set to true globally, but for one particular encoding it could be set to false.

@chalasr
Copy link
Member

chalasr commented Oct 11, 2017

👍 to allow overriding this option in the context.

@welcoMattic welcoMattic force-pushed the fix/csv-encoder-formulas-char-escaping branch 2 times, most recently from 09a5ed5 to 9c0733d Compare October 11, 2017 13:48
@welcoMattic
Copy link
Member Author

@chalasr @xabbuh Overriding is now possible

@dunglas
Copy link
Member

dunglas commented Oct 12, 2017

Great!

But as stated in the article, it's definitely not a security issue on our side. The current implementation respects the RFC (Excel and Google Sheets don't), and most software consuming CSV expect the current behavior.
It should made be clear in the docs PR that this option should only be used when the generated CSV will be imported in spreadsheets or it can cause interoperability issues.

@welcoMattic
Copy link
Member Author

Thanks @dunglas!

For documentation, should I create a new page under Serializer section? Named CsvEncoder, and where I explain why this options exists?

@ostrolucky
Copy link
Contributor

ostrolucky commented Oct 16, 2017

Alternative approach might be to just add a normalizer. That would help with deserialization of such sanitized csvs too and doesn't pollute __construct arguments.

I don't like having this in core anyway though, as it's security issue in Excel and Google sheets only and this is destructive kind of escaping

@fabpot
Copy link
Member

fabpot commented Dec 1, 2017

Not sure if we need to support that in core. /cc @symfony/mergers

@Tobion
Copy link
Contributor

Tobion commented Dec 10, 2017

👍 to include that as it's security feature that people should be aware about.

Copy link
Contributor

@dbu dbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think in 95% of the use cases, the serializer will be used to export data, which might include user provided input.

there is only a problem if this component is used to generate sheets with formula. i would add an explaination in the phpdoc to explain what this flag is about and say to set it to true when you don't intend to have formula in the csv you generate.

i am no merger, but +1 on this. it is a relevant security topic and obviously excel/libreoffice/google and whoever else is affected decided they rather not properly fix it, so its up to generators of csv files to ensure security.

i think this should be against 3.3 as that is the branch for security fixes.

@dunglas
Copy link
Member

dunglas commented Dec 20, 2017

👍 to support this feature too. But as it's definitely not a security fix (on our side), it's a new feature that should be merged in 4.1.

there is only a problem if this component is used to generate sheets with formula.

There is also a problem when the value start with '=', '-', '+', '@' and the parser supports correctly the RFC: a \t will be added, but it should not (actually, when using this flag, it's not really a true CSV file anymore).

@welcoMattic
Copy link
Member Author

Thanks @dbu @dunglas for your comments. I think that we can support this "fix" and explain it explicitly in PHPDoc.

Should I update the PR now, or we need to wait other opinions?

@dbu
Copy link
Contributor

dbu commented Dec 20, 2017

i'd update the changelog and phpdoc as that will be relevant either way. i hope one of the maintainers can decide which version this should go. i can see the point of kevin, and its a good reason why its not on by default. but i feel it still is security relevant, even if its not a bug of symfony, but symfony helping around lax security of other tools - and as such should go into 3.3 or 3.4.

@dunglas
Copy link
Member

dunglas commented Jan 11, 2018

I think that you can update and rebase the PR @welcoMattic

@welcoMattic
Copy link
Member Author

@nicolas-grekas Done, I also fix @ostrolucky's comment.

@nicolas-grekas
Copy link
Member

@welcoMattic thanks. Can you rebase to get rid of the merge commit? I invite you to squash while doing so.

@welcoMattic welcoMattic force-pushed the fix/csv-encoder-formulas-char-escaping branch 2 times, most recently from b98b9ab to 414a61d Compare February 4, 2018 18:30

public function __construct(string $delimiter = ',', string $enclosure = '"', string $escapeChar = '\\', string $keySeparator = '.')
public function __construct(string $delimiter = ',', string $enclosure = '"', string $escapeChar = '\\', string $keySeparator = '.', $escapeFormulas = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing bool type declaration for the new argument

@welcoMattic welcoMattic force-pushed the fix/csv-encoder-formulas-char-escaping branch from 414a61d to a1b0bdb Compare February 6, 2018 10:21
@welcoMattic
Copy link
Member Author

Thanks for review @Tobion 👍

@fabpot
Copy link
Member

fabpot commented Feb 7, 2018

Thank you @welcoMattic.

@fabpot fabpot merged commit a1b0bdb into symfony:master Feb 7, 2018
fabpot added a commit that referenced this pull request Feb 7, 2018
…V injection (welcoMattic)

This PR was merged into the 4.1-dev branch.

Discussion
----------

[Serializer] Fix security issue on CsvEncoder about CSV injection

| Q             | A
| ------------- | ---
| Branch?       | master (4.1)
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | yes
| Deprecations? | no
| Tests pass?   | yes
| License       | MIT

I read [this article](http://georgemauer.net/2017/10/07/csv-injection.html) about CSV injection and I thought it best to update the `CsvEncoder` so that it does not generate potentially malicious CSV files by default.

Commits
-------

a1b0bdb Fix security issue on CsvEncoder
@fabpot fabpot mentioned this pull request May 7, 2018
@welcoMattic welcoMattic deleted the fix/csv-encoder-formulas-char-escaping branch February 4, 2020 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.