Commit 51e225d

Expand set of predefined ICU locales

Install language+region combinations even if they are not distinct from the language's base locale. This gives better long-term stability of the set of predefined locales and makes the predefined locales less implementation-dependent and more practical for users.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>

1 parent 1f6d515 commit 51e225d

2 files changed: +18 -10 lines changed

doc/src/sgml/charset.sgml

+6 -7
@@ -653,9 +653,8 @@ SELECT a COLLATE "C" &lt; b COLLATE "POSIX" FROM test1;
 string will be accepted as a locale name.)
 See <ulink url="http://userguide.icu-project.org/locale"></ulink> for
 information on ICU locale naming. <command>initdb</command> uses the ICU
-APIs to extract a set of locales with distinct collation rules to populate
-the initial set of collations. Here are some example collations that
-might be created:
+APIs to extract a set of distinct locales to populate the initial set of
+collations. Here are some example collations that might be created:
 
 <variablelist>
 <varlistentry>
@@ -677,9 +676,9 @@ SELECT a COLLATE "C" &lt; b COLLATE "POSIX" FROM test1;
 <listitem>
 <para>German collation for Austria, default variant</para>
 <para>
-(As of this writing, there is no,
-say, <literal>de-DE-x-icu</literal> or <literal>de-CH-x-icu</literal>,
-because those are equivalent to <literal>de-x-icu</literal>.)
+(There are also, say, <literal>de-DE-x-icu</literal>
+or <literal>de-CH-x-icu</literal>, but as of this writing, they are
+equivalent to <literal>de-x-icu</literal>.)
 </para>
 </listitem>
 </varlistentry>
@@ -690,6 +689,7 @@ SELECT a COLLATE "C" &lt; b COLLATE "POSIX" FROM test1;
 <para>German collation for Austria, phone book variant</para>
 </listitem>
 </varlistentry>
+
 <varlistentry>
 <term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
 <listitem>
@@ -724,7 +724,6 @@ SELECT a COLLATE "C" &lt; b COLLATE "POSIX" FROM test1;
 <programlisting>
 CREATE COLLATION german FROM "de_DE";
 CREATE COLLATION french FROM "fr-x-icu";
-CREATE COLLATION "de-DE-x-icu" FROM "de-x-icu";
 </programlisting>
 </para>
 

src/backend/commands/collationcmds.c

+12 -3
@@ -667,7 +667,16 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 }
 #endif /* READ_LOCALE_A_OUTPUT */
 
-/* Load collations known to ICU */
+/*
+ * Load collations known to ICU
+ *
+ * We use uloc_countAvailable()/uloc_getAvailable() rather than
+ * ucol_countAvailable()/ucol_getAvailable(). The former returns a full
+ * set of language+region combinations, whereas the latter only returns
+ * language+region combinations if they are distinct from the language's
+ * base collation. So there might not be a de-DE or en-GB, which would be
+ * confusing.
+ */
 #ifdef USE_ICU
 {
 int i;
@@ -676,7 +685,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
  * Start the loop at -1 to sneak in the root locale without too much
  * code duplication.
  */
-for (i = -1; i < ucol_countAvailable(); i++)
+for (i = -1; i < uloc_countAvailable(); i++)
 {
 /*
  * In ICU 4.2, ucol_getKeywordValuesForLocale() sometimes returns
@@ -706,7 +715,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 if (i == -1)
     name = ""; /* ICU root locale */
 else
-    name = ucol_getAvailable(i);
+    name = uloc_getAvailable(i);
 
 langtag = get_icu_language_tag(name);
 collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
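
The loop touched by this change can be sketched without ICU as follows. This is only an illustration, not the backend code: `sample_locales`, `sample_count()`, `sample_get()`, `sample_language_tag()` and `sample_import()` are hypothetical stand-ins for the `uloc_countAvailable()`/`uloc_getAvailable()` pair and for `get_icu_language_tag()`. The locale list is made up, and the tag conversion only handles the root locale and the `_` to `-` replacement. The `-x-icu` suffix is assumed from the collation names shown in the documentation hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical locale list standing in for what uloc_getAvailable() yields. */
static const char *const sample_locales[] = {"de", "de_AT", "de_DE", "en_GB"};

/* Stand-in for uloc_countAvailable(). */
static int
sample_count(void)
{
	return (int) (sizeof(sample_locales) / sizeof(sample_locales[0]));
}

/* Stand-in for uloc_getAvailable(). */
static const char *
sample_get(int i)
{
	return sample_locales[i];
}

/*
 * Simplified stand-in for get_icu_language_tag(): the root locale ""
 * becomes "und", and '_' separators become '-'.
 */
static void
sample_language_tag(const char *name, char *buf, size_t buflen)
{
	size_t		i;

	assert(buflen > 0);
	if (strcmp(name, "") == 0)
		name = "und";
	for (i = 0; name[i] != '\0' && i + 1 < buflen; i++)
		buf[i] = (name[i] == '_') ? '-' : name[i];
	buf[i] = '\0';
}

/*
 * Fill names[] with one collation name per locale and return how many
 * were written.  Starting at -1 handles the root locale inside the same
 * loop body as every other locale.
 */
static int
sample_import(char names[][64], int max)
{
	int			i;
	int			n = 0;

	for (i = -1; i < sample_count() && n < max; i++)
	{
		char		langtag[32];
		const char *name = (i == -1) ? "" : sample_get(i);

		sample_language_tag(name, langtag, sizeof(langtag));
		snprintf(names[n], 64, "%s-x-icu", langtag);
		n++;
	}
	return n;
}
```

With the sample list, `sample_import()` produces an entry for the root locale first, followed by one entry per language and per language+region combination, including combinations such as `de_DE` that the collator-based enumeration might have omitted.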
