[DomCrawler] Use the native HTML5 parser on PHP 8.4 #61475
Merged
+340
−201
This PR keeps the `DOM*`-based API but uses the native HTML5 parser on PHP 8.4 instead of masterminds/html5.

This works by parsing HTML strings using `Dom\HTMLDocument`, then serializing to XML, and loading again using `DOMDocument::loadXML()`.

This basically replaces #61356 since it removes any BC breaks.
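
For readers unfamiliar with the new extension classes, the parse, serialize, reload pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `parseHtml5ToLegacyDom()` is hypothetical, and the choice of the `LIBXML_NOERROR` and `Dom\HTML_NO_DEFAULT_NS` options is an assumption about what a caller would typically want.

```php
<?php
// Requires PHP 8.4+ with ext-dom. Hypothetical helper, for illustration only.

function parseHtml5ToLegacyDom(string $html): \DOMDocument
{
    // First pass: the native, spec-compliant HTML5 parser.
    // LIBXML_NOERROR silences tokenizer/tree-builder warnings on sloppy markup.
    // Dom\HTML_NO_DEFAULT_NS keeps elements out of the XHTML namespace
    // (an assumption here, so namespace-less XPath queries keep matching).
    $modern = \Dom\HTMLDocument::createFromString(
        $html,
        \LIBXML_NOERROR | \Dom\HTML_NO_DEFAULT_NS
    );

    // Serialize the parsed tree as XML.
    $xml = $modern->saveXml();

    // Second pass: load the XML into a classic DOMDocument,
    // so callers keep receiving DOM* objects.
    $legacy = new \DOMDocument('1.0', 'UTF-8');
    $legacy->loadXML($xml);

    return $legacy;
}

$doc = parseHtml5ToLegacyDom('<!DOCTYPE html><p>Hello<br>world</p>');
echo $doc->getElementsByTagName('p')->item(0)->textContent, "\n";
```

The two calls to a parser (`createFromString()` and `loadXML()`) are the double-parsing cost that this approach trades for an unchanged public API.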

The drawback compared to a more native approach is the double parsing that happens.
This could be worked on later by providing a way to leverage the new `Dom\*` API directly. Whether that is worth it remains to be proven first.
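
For comparison, this is roughly what working with the new `Dom\*` API directly looks like on PHP 8.4, with a single parse and no XML round trip. It is only a sketch of the extension's own API, not a proposal for how DomCrawler would expose it; the variable names are illustrative.

```php
<?php
// Requires PHP 8.4+ with ext-dom. Illustrative only.

$native = \Dom\HTMLDocument::createFromString(
    '<!DOCTYPE html><ul><li class="a">one</li><li>two</li></ul>',
    \LIBXML_NOERROR
);

// The new API supports CSS selectors natively.
$first = $native->querySelector('li.a');
$items = $native->querySelectorAll('li');

echo $first->textContent, "\n";
echo count($items), "\n";
```

Note that the returned nodes are `Dom\Element` instances rather than `DOMElement`, which is why exposing them directly would not fit the existing `DOM*`-based API.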