Closed
Symfony version(s) affected
5.4.37
Description
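The Crawler encodes HTML entities before parsing the content. This also changes the contents of script elements (everything between <script> and </script>), where entities are not decoded again.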
How to reproduce
use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addContent('<!doctype html><html><script>var foo = "bär";</script></html>', 'text/html; charset=UTF-8');
echo $crawler->filterXPath('//script')->text();
// output: var foo = "b&#228;r";
// expected: var foo = "bär";
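For illustration, the effect can be reproduced without the Crawler. This is a minimal sketch, assuming the conversion amounts to encoding every non-ASCII character as a numeric entity with mb_encode_numericentity() (requires the mbstring extension). The helper name encodeNonAscii() is hypothetical, and the real private method may differ in detail.

```php
<?php
// Hypothetical stand-in for the Crawler's entity conversion (assumption:
// the real convertToHtmlEntities() behaves roughly like this).
function encodeNonAscii(string $html, string $charset = 'UTF-8'): string
{
    // Encode every code point from U+0080 upwards as a decimal numeric entity.
    return mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], $charset);
}

$html = '<p>bär</p><script>var foo = "bär";</script>';
$encoded = encodeNonAscii($html);

echo $encoded, PHP_EOL;
// <p>b&#228;r</p><script>var foo = "b&#228;r";</script>
```

An HTML parser decodes the entity inside <p> back to ä, but it treats the contents of <script> as raw text, so the entity stays there literally.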
Possible Solution
I’m not sure what the best way to fix this is, as convertToHtmlEntities() cannot distinguish between content outside and inside <script>. But since convertToHtmlEntities() seems to be there only to work around issues with the libxml parser, maybe it can be skipped when the Masterminds\HTML5 parser is used?
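As a point of comparison, here is a rough sketch of what distinguishing between outside and inside <script> could look like. It is not the fix proposed above and not part of Symfony: encodeOutsideScripts() is a hypothetical helper, it assumes the conversion is done with mb_encode_numericentity() (mbstring extension), and the regular expression is a simplification that ignores other raw text elements such as <style> and edge cases like a script tag inside a comment.

```php
<?php
// Hypothetical helper, not part of Symfony: encode non-ASCII characters
// everywhere except inside <script>...</script> blocks.
function encodeOutsideScripts(string $html, string $charset = 'UTF-8'): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between ordinary
    // markup (even indexes) and captured script blocks (odd indexes).
    $parts = preg_split('~(<script\b[^>]*>.*?</script\s*>)~is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return $html; // regex failure: leave the input untouched
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Only markup outside script blocks is entity-encoded.
            $parts[$i] = mb_encode_numericentity($part, [0x80, 0x10FFFF, 0, 0x1FFFFF], $charset);
        }
    }

    return implode('', $parts);
}

$html = '<p>bär</p><script>var foo = "bär";</script><p>ö</p>';
$result = encodeOutsideScripts($html);

echo $result, PHP_EOL;
// <p>b&#228;r</p><script>var foo = "bär";</script><p>&#246;</p>
```

Skipping the conversion entirely when the HTML5 parser is in use, as suggested above, would avoid this kind of pattern matching altogether.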
Additional Context
No response