[Feature #20724] Bump Unicode version to 16.0.0 #13117
Removing gsub may make this much slower. There should be no need for such big changes.
template/unicode_norm_gen.tmpl
Outdated
quick_checks = %w(NFD_QC NFC_QC NFKD_QC NFKC_QC)

File.foreach('enc/unicode/data/16.0.0/ucd/DerivedNormalizationProps.txt') do |line|
Use InputDataDir and vpath, not the hardcoded path.
Suggested change:
-File.foreach('enc/unicode/data/16.0.0/ucd/DerivedNormalizationProps.txt') do |line|
+vpath.foreach("#{InputDataDir}/DerivedNormalizationProps.txt") do |line|
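The point of the suggestion is that the parsing logic should not care where the data file lives. A minimal sketch of that separation, assuming a hypothetical helper `parse_quick_checks` (not part of the PR) that accepts any enumerable of lines in the DerivedNormalizationProps.txt format:

```ruby
QUICK_CHECKS = %w(NFD_QC NFC_QC NFKD_QC NFKC_QC).freeze

# Hypothetical helper (an assumption for illustration, not code from the PR).
# Builds { property => { value => [codepoints] } } from UCD-style lines such as
#   "00C0..00C5    ; NFD_QC; N # ..."
def parse_quick_checks(lines)
  table = Hash.new { |h, prop| h[prop] = Hash.new { |hh, val| hh[val] = [] } }
  lines.each do |line|
    data = line.sub(/#.*/, "").strip          # drop trailing comments
    next if data.empty?
    range, property, value = data.split(/\s*;\s*/)
    next unless QUICK_CHECKS.include?(property)
    first, last = range.split("..").map { |hex| hex.to_i(16) }
    table[property][value].concat((first..(last || first)).to_a)
  end
  table
end

sample = [
  "# comment line\n",
  "00C0..00C5    ; NFD_QC; N # six code points\n",
  "037E          ; NFC_QC; N # single code point\n",
]
table = parse_quick_checks(sample)
puts table["NFD_QC"]["N"].size
```

In the generator itself, the lines would presumably come from the `vpath.foreach("#{InputDataDir}/DerivedNormalizationProps.txt")` call quoted in the review, so the Unicode version appears in only one place.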
lib/unicode_normalize/normalize.rb
Outdated
def self.to_nfkd_arr(string)
  kompatibled_arr = string.each_char.flat_map { kompatible_one(it) }
  decomposed_arr = kompatibled_arr.each.flat_map { decompose_one(it) }
  ordered_arr = canonical_ordering(decomposed_arr)
ordered_arr is assigned but never used.
Suggested change:
-ordered_arr = canonical_ordering(decomposed_arr)
+canonical_ordering(decomposed_arr)
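Dropping the assignment does not change behavior, because a Ruby method returns the value of its last evaluated expression. The sketch below illustrates this with the same three-step pipeline. The method names come from the diff, but their bodies and the tiny lookup tables here are assumptions made up for illustration (the real tables are generated). It uses explicit block parameters rather than `it`, which requires Ruby 3.4.

```ruby
# Toy tables: assumptions for illustration only.
KOMPATIBILITY   = { "\uFB01" => ["f", "i"] }.freeze        # fi ligature
DECOMPOSITION   = { "\u00E9" => ["e", "\u0301"] }.freeze    # e with acute
COMBINING_CLASS = { "\u0301" => 230, "\u0323" => 220 }.freeze

def kompatible_one(char)
  KOMPATIBILITY.fetch(char, [char])
end

def decompose_one(char)
  DECOMPOSITION.fetch(char, [char])
end

# Stable sort of each starter-plus-marks run by combining class.
def canonical_ordering(chars)
  chars
    .slice_when { |_, nxt| COMBINING_CLASS.fetch(nxt, 0).zero? }
    .flat_map do |run|
      run.sort_by.with_index { |c, i| [COMBINING_CLASS.fetch(c, 0), i] }
    end
end

def to_nfkd_arr(string)
  kompatibled_arr = string.each_char.flat_map { |c| kompatible_one(c) }
  decomposed_arr  = kompatibled_arr.flat_map { |c| decompose_one(c) }
  canonical_ordering(decomposed_arr) # last expression is the return value
end

p to_nfkd_arr("\uFB01\u00E9\u0323")
```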
diff --git a/common.mk b/common.mk
index 8d1ea8815a1..d8d6e2525e3 100644
--- a/common.mk
+++ b/common.mk
@@ -1245,7 +1245,12 @@ srcs-extra: $(EXTRA_SRCS)
realclean-srcs-extra::
$(Q)$(RM) $(EXTRA_SRCS)
-LIB_SRCS = $(srcdir)/lib/unicode_normalize/tables.rb
+UNICODE_NORMALIZE_TABLES = $(srcdir)/lib/unicode_normalize/tables.rb
+encs: $(ALWAYS_UPDATE_UNICODE:yes=update-unicode_normalize-tables)
+
+update-unicode_normalize-tables: $(UNICODE_NORMALIZE_TABLES)
+
+LIB_SRCS = $(UNICODE_NORMALIZE_TABLES)
srcs-lib: $(LIB_SRCS)
unicode_normalize/tables.rb should not be updated only when ALWAYS_UPDATE_UNICODE=yes. It should also be updated, for example, when template/unicode_norm_gen.tmpl is changed. And makefile-related changes should be kept independent of the update to Unicode 16.0.0.
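The dependency rule being asked for is the usual one: regenerate the output whenever it is missing or older than any of its inputs, not only when a flag is set. A minimal Ruby sketch of that check, assuming a hypothetical helper `stale?` (the actual build expresses this through make prerequisites, not Ruby code):

```ruby
require "tmpdir"

# Hypothetical helper (an assumption for illustration): true when the target
# is missing or older than at least one of its sources.
def stale?(target, sources)
  return true unless File.exist?(target)
  target_mtime = File.mtime(target)
  sources.any? { |src| File.mtime(src) > target_mtime }
end

Dir.mktmpdir do |dir|
  template = File.join(dir, "unicode_norm_gen.tmpl")
  tables   = File.join(dir, "tables.rb")
  File.write(template, "template")
  File.write(tables, "generated")

  now = Time.now
  File.utime(now, now - 60, template) # template older than tables
  File.utime(now, now, tables)
  puts stale?(tables, [template])

  File.utime(now, now + 60, template) # template edited after generation
  puts stale?(tables, [template])
end
```

The first check reports the tables as up to date and the second reports them as stale, which is the behavior the reviewer wants for edits to the template.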
3685ad0 to 97f35c7
0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
97f35c7 to f7b53e3
@duerst I removed the makefile changes from this PR.
https://www.unicode.org/versions/Unicode16.0.0/
https://bugs.ruby-lang.org/issues/20724