Commit 17ec2c5

doc: Add more ICU rules examples
In particular, add an example EBCDIC collation.

Author: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/35cc1684-e516-4a01-a256-351632d47066@manitou-mail.org
1 parent 27a36f7 commit 17ec2c5

3 files changed: +62 -13 lines changed

doc/src/sgml/charset.sgml

Lines changed: 57 additions & 1 deletion
@@ -1481,7 +1481,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
 </sect3>
 
 <sect3 id="icu-locale-examples">
-<title>Examples</title>
+<title>Collation Settings Examples</title>
 
 <variablelist>
 <varlistentry id="collation-managing-create-icu-de-u-co-phonebk-x-icu">
@@ -1530,6 +1530,62 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
 </variablelist>
 </sect3>
 
+<sect3 id="icu-tailoring-rules">
+<title>ICU Tailoring Rules</title>
+
+<para>
+If the options provided by the collation settings shown above are not
+sufficient, the order of collation elements can be changed with tailoring
+rules, whose syntax is detailed at <ulink
+url="https://unicode-org.github.io/icu/userguide/collation/customization/"></ulink>.
+</para>
+
+<para>
+This small example creates a collation based on the root locale with a
+tailoring rule:
+<programlisting>
+<![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
+</programlisting>
+With this rule, the letter <quote>W</quote> is sorted after
+<quote>V</quote>, but is treated as a secondary difference similar to an
+accent. Rules like this are contained in the locale definitions of some
+languages. (Of course, if a locale definition already contains the
+desired rules, then they don't need to be specified again explicitly.)
+</para>
+
+<para>
+Here is a more complex example. The following statement sets up a
+collation named <literal>ebcdic</literal> with rules to sort US-ASCII
+characters in the order of the EBCDIC encoding.
+
+<programlisting>
+<![CDATA[CREATE COLLATION ebcdic (provider = icu, locale = 'und',
+rules = $$
+& ' ' < '.' < '<' < '(' < '+' < \|
+< '&' < '!' < '$' < '*' < ')' < ';'
+< '-' < '/' < ',' < '%' < '_' < '>' < '?'
+< '`' < ':' < '#' < '@' < \' < '=' < '"'
+<*a-r < '~' <*s-z < '^' < '[' < ']'
+< '{' <*A-I < '}' <*J-R < '\' <*S-Z <*0-9
+$$);]]>
+
+SELECT c
+FROM (VALUES ('a'), ('b'), ('A'), ('B'), ('1'), ('2'), ('!'), ('^')) AS x(c)
+ORDER BY c COLLATE ebcdic;
+c
+---
+!
+a
+b
+^
+A
+B
+1
+2
+</programlisting>
+</para>
+</sect3>
+
 <sect3 id="icu-external-references">
 <title>External References for ICU</title>

doc/src/sgml/ref/create_collation.sgml

Lines changed: 4 additions & 9 deletions
@@ -165,9 +165,8 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
 <listitem>
 <para>
 Specifies additional collation rules to customize the behavior of the
-collation. This is supported for ICU only. See <ulink
-url="https://unicode-org.github.io/icu/userguide/collation/customization/"/>
-for details on the syntax.
+collation. This is supported for ICU only. See <xref
+linkend="icu-tailoring-rules"/> for details.
 </para>
 </listitem>
 </varlistentry>
@@ -257,12 +256,8 @@ CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');
 <programlisting>
 <![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
 </programlisting>
-With this rule, the letter <quote>W</quote> is sorted after
-<quote>V</quote>, but is treated as a secondary difference similar to an
-accent. Rules like this are contained in the locale definitions of some
-languages. (Of course, if a locale definition already contains the desired
-rules, then they don't need to be specified again explicitly.) See the ICU
-documentation for further details and examples on the rules syntax.
+See <xref linkend="icu-tailoring-rules"/> for further details and examples
+on the rules syntax.
 </para>
 
 <para>

doc/src/sgml/ref/create_database.sgml

Lines changed: 1 addition & 3 deletions
@@ -232,9 +232,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
 <para>
 Specifies additional collation rules to customize the behavior of the
 default collation of this database. This is supported for ICU only.
-See <ulink
-url="https://unicode-org.github.io/icu/userguide/collation/customization/"/>
-for details on the syntax.
+See <xref linkend="icu-tailoring-rules"/> for details.
 </para>
 </listitem>
 </varlistentry>
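
Note on the EBCDIC example in charset.sgml: the expected ordering in the SELECT output can be cross-checked without ICU by sorting the same sample values by their EBCDIC byte values. The sketch below is not part of the commit. It assumes code page 037 (Python's built-in `cp037` codec) as the EBCDIC variant, which the commit does not name, and the helper `ebcdic_key` is a hypothetical name introduced only for illustration.

```python
# Sample values taken from the SELECT in the charset.sgml example.
samples = ["a", "b", "A", "B", "1", "2", "!", "^"]


def ebcdic_key(text: str) -> bytes:
    """Return the cp037 bytes of text, used here as a sort key.

    Assumption: cp037 (EBCDIC US/Canada) stands in for "the EBCDIC
    encoding"; other EBCDIC code pages place some punctuation differently.
    """
    return text.encode("cp037")


# Sorting by raw EBCDIC bytes mimics what the tailoring rules aim for.
ordered = sorted(samples, key=ebcdic_key)
print(ordered)
```

Under that assumption the result should match the row order shown in the documentation example: punctuation and lowercase letters first, then uppercase letters, then digits. This only compares byte values; it does not exercise ICU rule parsing or PostgreSQL itself.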
