-
Notifications
You must be signed in to change notification settings - Fork 220
Make merge separator compare less than U+0000 on the identical strength #6814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
See unicode-org#6811. This change does not fix the issue for sort keys.
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback. |
return Decomposition::new(left_tail.$left_to_iter(), self.decompositions, self.tables).cmp( | ||
Decomposition::new(right_tail.$right_to_iter(), self.decompositions, self.tables), | ||
return Decomposition::new(left_tail.$left_to_iter(), self.decompositions, self.tables).map(|c| if c != '\u{FFFE}' { c as i32 } else { -1i32 }).cmp( | ||
Decomposition::new(right_tail.$right_to_iter(), self.decompositions, self.tables).map(|c| if c != '\u{FFFE}' { c as i32 } else { -1i32 }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if
condition is backwards due to the assumption that absent other information, LLVM treats the first branch as more likely and we still after a decade aren't allowed to say unlikely
on non-nightly Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the best place to put this logic? It seems un-elegant to put this directly into the main compare
function macro, which to this point has just delegated to other functions and hasn't had any code-point-specific logic?
Are we sure that U+FFFE is the only code point with this special case? It is the one that the conformance test discovered, but I wonder if it was pointing at a bigger class of issues?
Yes. It needs to go into a
It makes sense for the relevant iterators to be inside the macro, since the concrete types end up differing per macro instantiation. And, as noted above, the
It looks a lot like this isn't a bigger class of issues but a very special case. In the case of ICU4C sort key generation, only U+FFFE is special-cased: https://searchfox.org/mozilla-central/rev/f6a806c38c459e0e0d797d264ca0e8ad46005105/intl/icu/source/i18n/bocsu.cpp#132 |
I do see U+FFFE specifically called out in the CLDR collator spec, so maybe this really is just a single isolated special case: https://www.unicode.org/reports/tr35/dev/tr35-collation.html#CLDR_Collation_Algorithm |
Why is this a 4 line change but #6823 a much bigger change? How come there aren't tests failing right now given an inconsistency in behavior between sort keys and compare? |
Signed 32-bit integers offer a way to easily switch from In the sort key case, the requirement is that
Because we in general lack test coverage to show consistency between sort keys and compare. It would make sense to fuzz them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve but I'd like @echeran to also take a look.
Thanks. I'm assuming that @echeran is doing a post-landing review. Landing now to avoid merge conflicts showing up and to avoid keeping the tree in a state where regular comparison and sort keys don't agree. |
See #6811.
This change does not fix the issue for sort keys.
Includes the test changeset from #6812.