Skip to content

Fix remaining issues with multivalued #618

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

macchiati
Copy link
Member

@macchiati macchiati commented Nov 29, 2023

There were two further problems with multivalued support

  1. \p{exemplar} (has no exemplar value) returned the reverse of what was expected
  2. \p{scx=Arab} didn't work right.

The solution to these was to (1) allow for "" properly as a string value, and (2) allow script values that were not present in ScriptExtensions.txt as single values.

UnicodeJsps/src/main/java/org/unicode/jsp/ScriptTester.java

  • Supply the Script values in addition to the Script_Extension values as valid alternate aliases.

UnicodeJsps/src/main/java/org/unicode/jsp/UnicodeSetUtilities.java

  • Expand an empty value to be "No" only for binary, enumerated, and catalogue properties

UnicodeJsps/src/main/java/org/unicode/jsp/XPropertyFactory.java

  • main change is to allow "" as the default value in the scx map

UnicodeJsps/src/test/java/org/unicode/jsptest/TestMultivalued.java

  • Add some tests for the problems seen above

unicodetools/src/main/java/org/unicode/props/UnicodeProperty.java

  • add a mask for BINARY_OR_ENUMERATED_OR_CATALOG_MASK
  • if a property is multivalued, don't skip over string values.

Notes:

  1. As before, Eclipse helpfully adds missing @OverRide annotations
  2. The exemplars really should be treated like enumerations (over locales), but that was a bit too complicated for a first run.

} catch (Exception e) {
if (prop.isType(UnicodeProperty.BINARY_OR_ENUMERATED_OR_CATALOG_MASK)) {
try {
status = applyPropertyAlias0(prop, "No", result, !invert);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that binary properties default to "No", but each enumerated/catalog property has its own @missing value, so why should those default to "No" rather than their actual @missing value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just restricting the "No" default from applying to string/misc values, not introducing anything new for the others. Before changing that, we should have a discussion, probably in the CLDR/ICU design group to pull in everyone.

I think we explicitly wanted the "No" value to work in some enumerated properties, the ones with Yes, No, Maybe, for consistency.

Note that @missing isn't necessarily the best fallback value; it is just an feature of the file structure that allows us to minimize the number of lines we list. We would need to make that part of the discussion.

@macchiati macchiati requested a review from markusicu November 29, 2023 20:06
@macchiati macchiati marked this pull request as ready for review November 29, 2023 22:22
@macchiati macchiati merged commit 7631d20 into unicode-org:main Nov 29, 2023
@macchiati macchiati deleted the Fix-remaining-issues-with-multivalued branch November 29, 2023 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants