Member

@srl295 srl295 commented Aug 12, 2025

CLDR-18889

ALLOW_MANY_COMMITS=true

Member

@macchiati macchiati left a comment

Nice!

srl295 added 5 commits August 19, 2025 13:17
- I would update the docs to recommend running this tool after generating algorithmic, but the docs aren't migrated over <https://unicode-org.atlassian.net/browse/CLDR-18606?focusedCommentId=186098>
- TestLocale doesn't look like it is a useful test, filed https://unicode-org.atlassian.net/browse/CLDR-18890
- add @cdata tag to request CDATA output
- update transformer to skip rbnfRules for DAIP
- update XML generator
@srl295 srl295 force-pushed the brs48/cldr-18889/algorithmic branch from 7ddf2a2 to b679fcd Compare August 19, 2025 18:49
@jira-pull-request-webhook

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@srl295
Member Author

srl295 commented Aug 19, 2025

Had to redo this. Algorithmic transforms are cumulative, so I can't just re-run it.

@srl295
Member Author

srl295 commented Aug 19, 2025

Help? What is this error?

/Users/srl295/src/cldr3/common/main/yue_Hans.xml:2959:42: error
yue_Hans [Cantonese (Simplified)]       error   ▸Date_&_Time|Gregorian|Formats_-_Standard_-_Time_Formats|full◂  〈h:mm:ss a zzzz〉      【】    〈ah:mm:ss [zzzz]〉     «=»     【】    ⁅inconsistent time pattern⁆     ❮Error: Time format inconsistent with supplemental time data for territory "CN". Use 'H' for 24 hour clock.❯ https://st.unicode.org/cldr-apps/v#/yue_Hans//1b89d7c2d516faca
/Users/srl295/src/cldr3/common/main/yue_Hans.xml:2965:39: error
yue_Hans [Cantonese (Simplified)]       error   ▸Date_&_Time|Gregorian|Formats_-_Standard_-_Time_Formats|long◂  〈h:mm:ss a z〉 【】    〈ah:mm:ss [z]〉        «=»     【】    ⁅inconsistent time pattern⁆     ❮Error: Time format inconsistent with supplemental time data for territory "CN". Use 'H' for 24 hour clock.❯ https://st.unicode.org/cldr-apps/v#/yue_Hans//924d14e677266de
/Users/srl295/src/cldr3/common/main/yue_Hans.xml:2971:35: error
yue_Hans [Cantonese (Simplified)]       error   ▸Date_&_Time|Gregorian|Formats_-_Standard_-_Time_Formats|medium◂        〈h:mm:ss a〉   【】    〈ah:mm:ss〉    «=»     【】    ⁅inconsistent time pattern⁆     ❮Error: Time format inconsistent with supplemental time data for territory "CN". Use 'H' for 24 hour clock.❯ https://st.unicode.org/cldr-apps/v#/yue_Hans//1d5a33b204d548da
/Users/srl295/src/cldr3/common/main/yue_Hans.xml:2977:32: error
yue_Hans [Cantonese (Simplified)]       error   ▸Date_&_Time|Gregorian|Formats_-_Standard_-_Time_Formats|short◂ 〈h:mm a〉      【】    〈ah:mm〉       «=»     【】    ⁅inconsistent time pattern⁆     ❮Error: Time format inconsistent with supplemental time data for territory "CN". Use 'H' for 24 hour clock.❯ https://st.unicode.org/cldr-apps/v#/yue_Hans//235af2cdaa201be0

@AEApple
Contributor

AEApple commented Aug 19, 2025

yue_Hans, which defaults to CN, requires H as the preferred hour format:

<hours preferred="H" allowed="H hB hb h" regions="CN LV TL zu_ZA"/>

I assume that yue_Hans is being generated with h formats since HK prefers h:

<hours preferred="h" allowed="h hB hb H" regions="AE BH DZ EG EH HK IQ JO KW LB LY MO MR OM PH PS QA SA SD SY TN YE ar_001"/>

Nice to see the error working :)

@srl295
Member Author

srl295 commented Aug 20, 2025

The transform is from yue to yue_Hans. However, yue most likely resolves to HK, while yue_Hans most likely resolves to CN.

<likelySubtag from="yue" to="yue_Hant_HK"/>		
<likelySubtag from="yue_CN" to="yue_Hans_CN"/>	
<likelySubtag from="yue_Hans" to="yue_Hans_CN"/>
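
As a rough illustration of the mismatch described above, the sketch below resolves a locale's likely region and then looks up the preferred hour cycle for that region. It uses only the `likelySubtag` and `hours` entries quoted in this thread. The function names `likely_region` and `preferred_hour` are hypothetical, and the lookup is simplified (it ignores locale-specific keys such as `zu_ZA` and `ar_001`), so it is not how the CLDR tooling itself performs the check.

```python
# Subset of supplemental data, copied from the snippets quoted in this thread.
LIKELY_SUBTAGS = {
    "yue": "yue_Hant_HK",
    "yue_CN": "yue_Hans_CN",
    "yue_Hans": "yue_Hans_CN",
}

# (preferred hour symbol, space-separated regions), as in the <hours> elements.
HOURS = [
    ("H", "CN LV TL zu_ZA"),
    ("h", "AE BH DZ EG EH HK IQ JO KW LB LY MO MR OM PH PS QA SA SD SY TN YE ar_001"),
]


def likely_region(locale):
    """Return the last subtag of the maximized locale (assumed to be the region)."""
    maximized = LIKELY_SUBTAGS.get(locale, locale)
    return maximized.split("_")[-1]


def preferred_hour(locale):
    """Return the preferred hour symbol for the locale's likely region, or None."""
    region = likely_region(locale)
    for preferred, regions in HOURS:
        if region in regions.split():
            return preferred
    return None


for locale in ("yue", "yue_Hans"):
    print(locale, likely_region(locale), preferred_hour(locale))
```

Under these assumptions, `yue` resolves to HK with preferred `h`, while `yue_Hans` resolves to CN with preferred `H`, which is consistent with the error reported when the `h` patterns are carried over unchanged.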

Comment on lines 2959 to 2978
+<pattern>HH:mm:ss [zzzz]</pattern>
+<datetimeSkeleton>HHmmssz</datetimeSkeleton>
 </timeFormat>
 </timeFormatLength>
 <timeFormatLength type="long">
 <timeFormat>
-<pattern>ah:mm:ss [z]</pattern>
-<datetimeSkeleton>ahmmssz</datetimeSkeleton>
+<pattern>HH:mm:ss [z]</pattern>
+<datetimeSkeleton>HHmmssz</datetimeSkeleton>
 </timeFormat>
 </timeFormatLength>
 <timeFormatLength type="medium">
 <timeFormat>
-<pattern>ah:mm:ss</pattern>
-<datetimeSkeleton>ahmmss</datetimeSkeleton>
+<pattern>HH:mm:ss</pattern>
+<datetimeSkeleton>HHmmss</datetimeSkeleton>
 </timeFormat>
 </timeFormatLength>
 <timeFormatLength type="short">
 <timeFormat>
-<pattern>ah:mm</pattern>
-<datetimeSkeleton>ahmm</datetimeSkeleton>
+<pattern>HH:mm</pattern>
+<datetimeSkeleton>HHmm</datetimeSkeleton>
Member Author

FYI, here are the 'promotions' I did, changing ah to HH. Is the datetimeSkeleton correct?
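
The promotion in this hunk can be approximated as a string rewrite: drop the day-period field `a` and widen the hour field to `HH`. The helper `promote_to_24_hour` below is hypothetical and is not taken from the PR. It ignores quoted literals and any whitespace around `a`, so treat it as a sketch of the mapping only, not as a check that the resulting skeleton is the right one.

```python
import re


def promote_to_24_hour(pattern: str) -> str:
    """Rewrite a 12-hour pattern or skeleton into a 24-hour one (simplified)."""
    # Remove the day-period field (AM/PM marker).
    without_period = re.sub(r"a+", "", pattern)
    # Replace the 12-hour field with a two-digit 24-hour field.
    return re.sub(r"h+", "HH", without_period)


for old in ("ah:mm:ss [z]", "ahmmssz", "ah:mm:ss", "ahmmss", "ah:mm", "ahmm"):
    print(old, "->", promote_to_24_hour(old))
```

Applied to the long, medium, and short entries above, this yields the same patterns and skeletons that appear on the replacement lines.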

@srl295
Member Author

srl295 commented Aug 20, 2025

Looks like tests pass.

@AEApple
Contributor

AEApple commented Aug 26, 2025

Which docs aren't migrated?

@srl295
Member Author

srl295 commented Aug 27, 2025

You migrated them since I wrote that and it was updated.

Member

@macchiati macchiati left a comment

I have some questions

@@ -168,7 +168,7 @@ x.x: << Komma >>;
1: ­=%spellout-ordinal=;
%%ste2:
0: ste;
1: ‘ =%spellout-ordinal=;
Member

The replacement of curly by straight apostrophe is suspicious. I don't think that should be in the transform, or otherwise be special-cased.

Member Author

I'm not sure. Perhaps it's an effect of DAIP.

Member Author

no, it's converted properly from:

so if de.xml has the wrong content, that's another bug (which I'll file). Maybe @grhoten hand-corrected this line previously?

Member

The straight quote is correct. The curly quote is incorrect. The straight quote means quote the space.
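
One way to look for stray curly quotes like the one discussed here is to scan the rule text for U+2018 and report the lines that contain it. This is a minimal sketch: `find_left_curly_quotes` is a hypothetical helper, not part of the CLDR tools, and it only flags candidates for review rather than deciding whether a given character is wrong.

```python
LEFT_SINGLE_QUOTE = "\u2018"  # the curly quote questioned in this review


def find_left_curly_quotes(rules: str):
    """Return (line_number, stripped_line) for each line containing U+2018."""
    hits = []
    for line_number, line in enumerate(rules.splitlines(), start=1):
        if LEFT_SINGLE_QUOTE in line:
            hits.append((line_number, line.strip()))
    return hits


sample = "%%ste2:\n0: ste;\n1: \u2018 =%spellout-ordinal=;\n"
for line_number, text in find_left_curly_quotes(sample):
    print(f"line {line_number}: {text}")
```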

Member

In any case, this line change seems correct to me.

@@ -209,221 +209,10 @@ x.x: =#,##0.#=;
%spellout-ordinal-m:
-x: minus >>;
x.x: =#,##0.#=;
0: =%spellout-ordinal=m;
]]></rbnfRules>
<!-- The following redundant ruleset elements have been deprecated and will be removed in the next release. Please use the rbnfRules contents instead. -->
Member

We are leaving in the old rules for one release, so they shouldn't disappear. Also, very odd to keep just one ("30").

Member Author

It's because of the MINIMIZE rules. If the transformed value equals the original, the value is dropped.

'30' has dreissig, which is transformed from dreißig. That's the only modified item, so the rest get dropped.
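
The minimizing behavior described here can be sketched as a filter that keeps only the entries whose transformed value differs from the original. The function name `minimize` and the dictionary layout are assumptions made for illustration, and the entries for 20 and 40 are illustrative filler. The actual MINIMIZE handling in the CLDR tooling may differ in detail.

```python
def minimize(original: dict, transformed: dict) -> dict:
    """Keep only entries whose transformed value differs from the original."""
    return {
        key: value
        for key, value in transformed.items()
        if original.get(key) != value
    }


# Illustrative values; only the entry for 30 comes from this thread.
original = {"20": "zwanzig", "30": "dreißig", "40": "vierzig"}
transformed = {"20": "zwanzig", "30": "dreissig", "40": "vierzig"}

print(minimize(original, transformed))
```

With these inputs only the `30` entry survives, which matches the single remaining rule seen in the diff.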

@@ -12,324 +12,28 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<territory type="CH"/>
</identity>
<annotations>
<annotation cp="{">↑↑↑</annotation>
Member

It looks like all of the ↑↑↑ lines are disappearing. That's ok; I'm a little surprised that they were in the last release.

Member Author

It was a separate bug that was fixed.

@@ -115,10 +115,10 @@ x.x: << koma >>;
1000000000000000000: =#,##0=;
%%ordi:
0: i;
1: ‘ i =%spellout-ordinal=;
Member

Again, odd that ‘ is being converted to '

Co-authored-by: Mark Davis <mark@unicode.org>
@@ -115,10 +115,10 @@ x.x: << koma >>;
1000000000000000000: =#,##0=;
%%ordi:
0: i;
1: i =%spellout-ordinal=;
1: ' i =%spellout-ordinal=;
Member Author

I don't know where the left hand side came from - maybe by hand.
However, this line is actually converted from:

which shows '

Member

The straight quote is correct. The curly quote is incorrect. This error was also introduced in this commit from 2021: d4cb32a#diff-15b116f0c5be91a93b36df309a3a5f8d78b7bc8c0d2de99d84c2b8c083f7d133

This change seems correct to me.

@srl295 srl295 requested a review from macchiati August 30, 2025 17:25
@srl295
Member Author

srl295 commented Aug 30, 2025

@macchiati the U+2018 did NOT occur in sr.xml or de.xml. It looks like a typo introduced accidentally by @grhoten in #4779. I think the conversion is happening properly.

@srl295
Member Author

srl295 commented Aug 30, 2025

I re-ran cldrmodify with no arguments and with -fP, with no change (as per the changes in #4996).

@srl295 srl295 marked this pull request as draft August 30, 2025 18:45
@srl295
Member Author

srl295 commented Aug 30, 2025

OK, I see the issue with the old-style rules being dropped.

- retain rbnfrules even with MINIMIZE
- add CLI options for partial generation
@srl295 srl295 marked this pull request as ready for review August 30, 2025 19:47
@srl295
Member Author

srl295 commented Aug 30, 2025

Does (did) rulesetGrouping block inheritance? If it inherits, then the minimized de_CH data is OK and we can drop the last commit in this PR (but it's harmless).

If it does block, then this PR is optimal as is.

Member

@macchiati macchiati left a comment

Approving so we can get the bulk of the data in. We can look at the ' ‘ issue later.

@srl295 srl295 merged commit 7fa7166 into unicode-org:main Sep 2, 2025
15 checks passed
@srl295 srl295 deleted the brs48/cldr-18889/algorithmic branch September 2, 2025 18:07
@srl295
Member Author

srl295 commented Sep 2, 2025

Filed https://unicode-org.atlassian.net/browse/CLDR-18926 to track the U+2018 typo noted above.

@macchiati
Member

macchiati commented Sep 2, 2025 via email
