
Improvements to property history #1105


Merged: 2 commits, Apr 25, 2025

Conversation

eggrobin
Member

@eggrobin eggrobin commented Apr 25, 2025

Old dumps.

PropList-2.1.8.txt includes a diff between Unilib properties and what it calls « UnicodeData ». The « UnicodeData » values appear to prefigure the 3.1 derivation for, e.g., Alphabetic, but without the Other_Alphabetic exceptions. Because we currently ignore the Discrepancy heading, we take the union of the Unilib and « UnicodeData » sets, which is nonsense. The Unilib set is the more useful one for historical investigation, as it is consistent with the files included in 2.1.5 and 2.1.9 (whereas taking the « UnicodeData » set introduces massive discrepancies that are immediately reverted in 2.1.9).

Currently, \P{U2.1.8:Alphabetic=@U2.1.9:Alphabetic@} has 21226 code points, and \P{U2.1.8:Alphabetic=@U2.1.5:Alphabetic@} has 21236 code points, whereas \P{U2.1.9:Alphabetic=@U2.1.5:Alphabetic@} has 10 code points. With this change, U2.1.9:Alphabetic and U2.1.8:Alphabetic are identical.
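As a rough illustration of the problem, here is a minimal Python sketch. The dump layout, the `Property:` and `Discrepancy` line prefixes, and the `parse` helper are all invented for this example; they do not reproduce the real PropList-2.1.8.txt format or the actual parser. The sketch also reads the queries above as counting the code points where two versions of a binary property disagree, which for sets is the size of the symmetric difference.

```python
# Toy data only: the layout and line prefixes are hypothetical and do not
# reproduce the real PropList-2.1.8.txt format.
TOY_DUMP = """\
Property: Alphabetic
0041..005A
0061..007A
Discrepancy: UnicodeData
0041..005A
0030..0039
Property: Dash
002D
Discrepancy: UnicodeData
2010..2015
"""


def parse(dump, skip_discrepancies):
    """Return {property name: set of code points} for the toy format."""
    props = {}
    current = None
    in_discrepancy = False
    for line in dump.splitlines():
        if line.startswith("Property: "):
            # A new property section starts; leave any discrepancy block.
            current = props.setdefault(line[len("Property: "):], set())
            in_discrepancy = False
        elif line.startswith("Discrepancy"):
            in_discrepancy = True
        elif line and current is not None:
            if in_discrepancy and skip_discrepancies:
                continue  # keep only the first (Unilib-like) set
            first, _, last = line.partition("..")
            current.update(range(int(first, 16), int(last or first, 16) + 1))
    return props


# Ignoring the heading merges both sets; honouring it keeps only the first.
merged = parse(TOY_DUMP, skip_discrepancies=False)
unilib_only = parse(TOY_DUMP, skip_discrepancies=True)

# Code points on which the two readings of Alphabetic disagree.
disagreement = merged["Alphabetic"] ^ unilib_only["Alphabetic"]
print(len(merged["Alphabetic"]), len(unilib_only["Alphabetic"]), len(disagreement))
```

In the toy data, the digits listed only under the discrepancy block leak into the merged Alphabetic set, which is the same kind of spurious difference as the one described above.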

History tables.

Currently, the Unihan history for U+3400 looks like (A) below.
Because of multiple changes to a few properties, many other properties are squished into a column so narrow that the null does not fit on one line. In addition, there is no relation between the columns: synchronous changes are not aligned, and aligned changes may be decades apart.

With this change, the history of the same properties is displayed as in (B), both more compact and more informative as to the synchronicity of changes and the lifetime of properties.

(A) (B) [screenshots of the history tables before and after the change; images not preserved]
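One way to get aligned changes is to give every property the same version axis, so that each row is a version and each column a property. The sketch below only illustrates that idea: the property names, values, versions, and the `aligned_rows` helper are hypothetical, and the layout actually produced by this change is the one in the screenshots, which may differ.

```python
# Hypothetical histories, not real Unihan data. Each entry is
# (version in which the value starts to apply, value), sorted by version.
HISTORIES = {
    "kExampleA": [("3.0", "a"), ("5.2", "b")],
    "kExampleB": [("3.0", None), ("5.2", "x"), ("13.0", "y")],
    "kExampleC": [("13.0", "z")],
}


def version_key(version):
    """Sort '13.0' after '5.2' by comparing numeric components."""
    return tuple(int(part) for part in version.split("."))


def aligned_rows(histories):
    """Build one row per version in which any property changed."""
    versions = sorted(
        {v for changes in histories.values() for v, _ in changes},
        key=version_key,
    )
    names = list(histories)
    rows = []
    for version in versions:
        row = [version]
        for name in names:
            changes = dict(histories[name])
            first_version = histories[name][0][0]
            if version in changes:
                row.append(repr(changes[version]))
            elif version_key(version) < version_key(first_version):
                row.append("")   # the property does not exist yet
            else:
                row.append('"')  # unchanged since the row above
        rows.append(row)
    return ["version"] + names, rows


header, rows = aligned_rows(HISTORIES)
for line in [header] + rows:
    print(" | ".join(cell.ljust(10) for cell in line))
```

With a shared axis, changes that happened in the same version land on the same row, and the empty cells at the top of a column show when a property first appeared.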

@eggrobin eggrobin requested a review from markusicu April 25, 2025 13:01
Member

@markusicu markusicu left a comment


I didn't try to really follow the logic... amazing effects...

@eggrobin eggrobin merged commit 6520922 into unicode-org:main Apr 25, 2025
19 of 20 checks passed