Closed
Symfony version(s) affected
5.4.2
Description
When using the flag `PREG_OFFSET_CAPTURE` with `$string->match()` on a `UnicodeString`, the position of the matched characters is given in bytes, not in characters (graphemes).
This is in fact an issue of PHP's `preg_match_all()`; I reported it at https://bugs.php.net/bug.php?id=80166. Outcome: they didn't change anything, but at least documented the status quo ;-)
So I was hoping that Symfony's String Component "fixed" it...
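The underlying PHP behaviour can be seen without Symfony. The following is only a minimal sketch in plain PHP; it writes `ö` as the escape `"\u{00F6}"` so the result does not depend on the encoding of the source file.

```php
<?php
// "\u{00F6}" is "ö", which UTF-8 encodes as two bytes.
$subject = "\u{00F6}a";

preg_match_all('/a/', $subject, $matches, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each match is a [matchedText, offset] pair.
[$text, $offset] = $matches[0][0];

var_dump($text);            // string(1) "a"
var_dump($offset);          // int(2): a byte offset, not a character offset
var_dump(strlen($subject)); // int(3): total length in bytes
```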
How to reproduce
use Symfony\Component\String\UnicodeString;

$string = new UnicodeString('öa');
// PREG_PATTERN_ORDER is only there to make Symfony use `preg_match_all()` instead of `preg_match()`
$result = $string->match('/a/', \PREG_PATTERN_ORDER | \PREG_OFFSET_CAPTURE);
dd($result);
Actual result (in the innermost array):
1 => 2 // `ö` is counted as 2 bytes, therefore `a` is at index-position 2
Expected result:
1 => 1 // `ö` should be counted as 1 Unicode character (grapheme), then `a` would be at index-position 1
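Until the offsets are reported in characters, one possible userland workaround is to convert each byte offset after the fact. The sketch below is an assumption about how that could be done, not something Symfony or PHP provides: `byteOffsetToCharOffset()` is a hypothetical helper name, it requires the mbstring extension, and it counts code points. If grapheme clusters are needed, `grapheme_strlen()` from the intl extension could be used in place of `mb_strlen()`.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or Symfony): converts a byte offset
 * reported by preg_match_all() into an offset counted in UTF-8 code points.
 */
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    // substr() is byte-based, so the prefix ends exactly at the reported
    // offset; mb_strlen() then counts the code points in that prefix.
    return mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
}

$subject = "\u{00F6}a"; // "öa"
preg_match_all('/a/', $subject, $matches, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);

foreach ($matches as $g => $group) {
    foreach ($group as $i => $match) {
        // $match is [matchedText, byteOffset]; negative offsets are left untouched.
        if ($match[1] >= 0) {
            $matches[$g][$i][1] = byteOffsetToCharOffset($subject, $match[1]);
        }
    }
}

var_dump($matches[0][0][1]); // int(1)
```

The conversion re-scans the prefix for every match, so for subjects with many matches it may be worth converting offsets incrementally instead.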
Possible Solution
No response
Additional Context
No response