Better testing for U+0CC2 and U+0DCF collation edge cases #1101

@hsivonen

Two characters in the root collation, U+0CC2 and U+0DCF, have the special property that they occur in the middle of a contraction without also occurring at the start of one. Therefore, checking whether U+0CC2 or U+0DCF may start a contraction is not a sufficient check for whether it can contract with the next character.
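To illustrate where these two characters sit, here is a small sketch using Python's standard `unicodedata` module. It assumes (this is not stated above) that the contractions in question mirror the canonical decompositions of U+0CCB and U+0DDD; the helper name `nfd_hex` is made up for this example.

```python
import unicodedata


def nfd_hex(ch: str) -> list[str]:
    """Return the NFD form of ch as a list of upper-case hex code points."""
    return [f"{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


# Assumption: the root-collation contractions discussed here correspond to
# these canonical decompositions.
kannada_oo = nfd_hex("\u0CCB")  # KANNADA VOWEL SIGN OO
sinhala_oo = nfd_hex("\u0DDD")  # SINHALA VOWEL SIGN KOMBUVA HAA DIGA AELA-PILLA

print(kannada_oo)  # U+0CC2 is the middle element
print(sinhala_oo)  # U+0DCF is the middle element
```

In both decompositions the character of interest is neither first nor last, which matches the "middle of a contraction" situation described above.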

This special case is worth testing explicitly, since it can cause a bug when a collator skips over the identical prefix of the strings being compared: if the skipped prefix ends with one of these characters, the comparison may resume in the middle of a contraction.
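A toy model of the failure mode, not taken from any particular collator: the contraction table below is a tiny hypothetical stand-in for the root collation data, and the function and variable names are invented for this sketch. It compares a "may start a contraction" check against a "may contract with the next character" check when backing up from the end of the identical prefix.

```python
# Toy contraction table (assumption: these Kannada sequences are contractions;
# the real root collation data is far larger).
CONTRACTIONS = ["\u0CC6\u0CC2", "\u0CC6\u0CD5", "\u0CC6\u0CC2\u0CD5"]

# Characters that begin some contraction.
STARTS = {c[0] for c in CONTRACTIONS}
# Characters in any non-final position, i.e. that may contract with what follows.
NON_FINAL = {ch for c in CONTRACTIONS for ch in c[:-1]}


def common_prefix_len(a: str, b: str) -> int:
    """Length of the identical prefix of a and b, in code points."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def resume_index(a: str, b: str, unsafe: set[str]) -> int:
    """Skip the identical prefix, then back up past 'unsafe' characters."""
    i = common_prefix_len(a, b)
    while i > 0 and a[i - 1] in unsafe:
        i -= 1
    return i


left = "\u0CC8\u0CC6\u0CC2\u0CD6"
right = "\u0CC8\u0CC6\u0CC2\u0CD5"

# The prefix ends in U+0CC2, which starts no contraction, so the
# start-only check does not back up and resumes after U+0CC2.
naive = resume_index(left, right, STARTS)
# The non-final check backs up to U+0CC6, where the contraction begins.
robust = resume_index(left, right, NON_FINAL)
print(naive, robust)
```

With the start-only check the comparison would resume at U+0CD5 and U+0CD6 in isolation, losing the context of the preceding U+0CC6 U+0CC2.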

For U+0CC2, I suggest manually injecting the following into the collation test suite:
0CC8 0CC6 0CC2 0CD6
is less than
0CC8 0CC6 0CC2 0CD5

Here the initial 0CC8 is an arbitrary filler character, included so that the interesting case does not occur right at the start of the input.
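As a sanity check on the shape of the proposed test case, the two lines can be parsed into strings and their identical prefix inspected. This is only a sketch: `parse_line` is a hypothetical helper, and the snippet does not verify the collation order itself, which would require an actual collator.

```python
import os.path


def parse_line(line: str) -> str:
    """Turn a line of space-separated hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in line.split())


lesser = parse_line("0CC8 0CC6 0CC2 0CD6")
greater = parse_line("0CC8 0CC6 0CC2 0CD5")

# os.path.commonprefix compares character by character, so it also works
# on arbitrary strings, not just paths.
shared = os.path.commonprefix([lesser, greater])
shared_hex = [f"{ord(c):04X}" for c in shared]

print(shared_hex)
```

The identical prefix ends exactly at U+0CC2 and, thanks to the filler, is longer than the contraction prefix alone.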

A similar case can probably be constructed for U+0DCF.
