
Commit ec48314

Revert per-index collation version tracking feature.
Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again.

Commits reverted:

1bf946b: Doc: Document known problem with Windows collation versions.
cf00200: Remove no-longer-relevant test case.
ef387be: Fix bogus collation-version-recording logic.
0fb0a05: Hide internal error for pg_collation_actual_version(<bad OID>).
ff94205: Suppress "warning: variable 'collcollate' set but not used".
d50e3b1: Fix assertion in collation version lookup.
f24b156: Rethink extraction of collation dependencies.
257836a: Track collation versions for indexes.
cd6f479: Add pg_depend.refobjversion.
7d1297d: Remove pg_collation.collversion.

Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
1 parent a288d94 commit ec48314


55 files changed: +463 -1354 lines changed

doc/src/sgml/catalogs.sgml

+11 -12
@@ -2374,6 +2374,17 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 <symbol>LC_CTYPE</symbol> for this collation object
 </para></entry>
 </row>
+
+<row>
+<entry role="catalog_table_entry"><para role="column_definition">
+<structfield>collversion</structfield> <type>text</type>
+</para>
+<para>
+Provider-specific version of the collation. This is recorded when the
+collation is created and then checked when it is used, to detect
+changes in the collation definition that could lead to data corruption.
+</para></entry>
+</row>
 </tbody>
 </tgroup>
 </table>
@@ -3317,18 +3328,6 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 A code defining the specific semantics of this dependency relationship; see text
 </para></entry>
 </row>
-
-<row>
-<entry role="catalog_table_entry"><para role="column_definition">
-<structfield>refobjversion</structfield> <type>text</type>
-</para>
-<para>
-An optional version for the referenced object. Currently used for
-indexes' collations (see <xref linkend="collation-versions"/>).
-</para>
-</entry>
-</row>
-
 </tbody>
 </tgroup>
 </table>

doc/src/sgml/charset.sgml

-48
@@ -948,54 +948,6 @@ CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-tr
 </tip>
 </sect3>
 </sect2>
-
-<sect2 id="collation-versions">
-<title>Collation Versions</title>
-
-<para>
-The sort order defined by a collation is not necessarily fixed over time.
-<productname>PostgreSQL</productname> relies on external libraries that
-are subject to operating system upgrades, and can also differ between
-servers involved in binary replication and file-system-level migration.
-Persistent data structures such as B-trees that depend on sort order might
-be corrupted by any resulting change.
-<productname>PostgreSQL</productname> defends against this by recording the
-current version of each referenced collation for any index that depends on
-it in the
-<link linkend="catalog-pg-depend"><structname>pg_depend</structname></link>
-catalog, if the collation provider makes that information available. If the
-provider later begins to report a different version, a warning will be
-issued when the index is accessed, until either the
-<xref linkend="sql-reindex"/> command or the
-<xref linkend="sql-alterindex"/> command is used to update the version.
-</para>
-<para>
-Version information is available from the
-<literal>icu</literal> provider on all operating systems. For the
-<literal>libc</literal> provider, versions are currently only available
-on systems using the GNU C library (most Linux systems), FreeBSD and
-Windows.
-</para>
-
-<note>
-<para>
-When using the GNU C library for collations, the C library's version
-is used as a proxy for the collation version. Many Linux distributions
-change collation definitions only when upgrading the C library, but this
-approach is imperfect as maintainers are free to back-port newer
-collation definitions to older C library releases.
-</para>
-<para>
-When using Windows collations, version information is only available for
-collations defined with BCP 47 language tags such as
-<literal>en-US</literal>. Currently, <command>initdb</command> selects
-a default locale using a traditional Windows language and country
-string such as <literal>English_United States.1252</literal>. The
-<literal>--lc-collate</literal> option can be used to provide an explicit
-locale name in BCP 47 format.
-</para>
-</note>
-</sect2>
 </sect1>
 
 <sect1 id="multibyte">

doc/src/sgml/func.sgml

+5 -3
@@ -26547,9 +26547,11 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 </para>
 <para>
 Returns the actual version of the collation object as it is currently
-installed in the operating system. <literal>null</literal> is returned
-on operating systems where <productname>PostgreSQL</productname>
-doesn't have support for versions.
+installed in the operating system. If this is different from the
+value in
+<structname>pg_collation</structname>.<structfield>collversion</structfield>,
+then objects depending on the collation might need to be rebuilt. See
+also <xref linkend="sql-altercollation"/>.
 </para></entry>
 </row>

doc/src/sgml/ref/alter_collation.sgml

+63
@@ -21,6 +21,8 @@ PostgreSQL documentation
 
 <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -86,9 +88,70 @@ ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_sche
 </listitem>
 </varlistentry>
 
+<varlistentry>
+<term><literal>REFRESH VERSION</literal></term>
+<listitem>
+<para>
+Update the collation's version.
+See <xref linkend="sql-altercollation-notes"/> below.
+</para>
+</listitem>
+</varlistentry>
 </variablelist>
 </refsect1>
 
+<refsect1 id="sql-altercollation-notes" xreflabel="Notes">
+<title>Notes</title>
+
+<para>
+When using collations provided by the ICU library, the ICU-specific version
+of the collator is recorded in the system catalog when the collation object
+is created. When the collation is used, the current version is
+checked against the recorded version, and a warning is issued when there is
+a mismatch, for example:
+<screen>
+WARNING: collation "xx-x-icu" has version mismatch
+DETAIL: The collation in the database was created using version 1.2.3.4, but the operating system provides version 2.3.4.5.
+HINT: Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx-x-icu" REFRESH VERSION, or build PostgreSQL with the right library version.
+</screen>
+A change in collation definitions can lead to corrupt indexes and other
+problems because the database system relies on stored objects having a
+certain sort order. Generally, this should be avoided, but it can happen
+in legitimate circumstances, such as when
+using <command>pg_upgrade</command> to upgrade to server binaries linked
+with a newer version of ICU. When this happens, all objects depending on
+the collation should be rebuilt, for example,
+using <command>REINDEX</command>. When that is done, the collation version
+can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+VERSION</literal>. This will update the system catalog to record the
+current collator version and will make the warning go away. Note that this
+does not actually check whether all affected objects have been rebuilt
+correctly.
+</para>
+<para>
+When using collations provided by <literal>libc</literal> and
+<productname>PostgreSQL</productname> was built with the GNU C library, the
+C library's version is used as a collation version. Since collation
+definitions typically change only with GNU C library releases, this provides
+some defense against corruption, but it is not completely reliable.
+</para>
+<para>
+Currently, there is no version tracking for the database default collation.
+</para>
+
+<para>
+The following query can be used to identify all collations in the current
+database that need to be refreshed and the objects that depend on them:
+<programlisting><![CDATA[
+SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS "Collation",
+pg_describe_object(classid, objid, objsubid) AS "Object"
+FROM pg_depend d JOIN pg_collation c
+ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
+WHERE c.collversion <> pg_collation_actual_version(c.oid)
+ORDER BY 1, 2;
+]]></programlisting></para>
+</refsect1>
+
 <refsect1>
 <title>Examples</title>
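
The per-collation check restored by the Notes section in the alter_collation.sgml diff above compares the version recorded in pg_collation.collversion with the version the provider currently reports. The following is a rough sketch of that comparison in Python, not PostgreSQL code. The function names, the dictionaries, and the collation name "yy-x-icu" are hypothetical; the version strings for "xx-x-icu" are the placeholders from the sample warning. None stands in for SQL NULL, so a collation with no known version never counts as mismatched, as with the `<>` comparison in the documented query.

```python
from typing import Dict, List, Optional


def version_mismatch(recorded: Optional[str], actual: Optional[str]) -> bool:
    """Return True only when both versions are known and they differ.

    None models SQL NULL (no version available); in that case nothing
    can be compared, so no mismatch is reported.
    """
    if recorded is None or actual is None:
        return False
    return recorded != actual


def collations_needing_refresh(
    recorded: Dict[str, Optional[str]],
    actual: Dict[str, Optional[str]],
) -> List[str]:
    """List, in sorted order, the collations whose recorded version is stale."""
    return sorted(
        name
        for name, version in recorded.items()
        if version_mismatch(version, actual.get(name))
    )


# Hypothetical data; the "xx-x-icu" versions mirror the sample warning text.
recorded_versions = {"xx-x-icu": "1.2.3.4", "yy-x-icu": "5.6", "C": None}
actual_versions = {"xx-x-icu": "2.3.4.5", "yy-x-icu": "5.6", "C": None}

print(collations_needing_refresh(recorded_versions, actual_versions))
```

As the restored documentation notes, refreshing the recorded version only silences the warning. It does not verify that dependent objects were actually rebuilt.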

doc/src/sgml/ref/alter_index.sgml

-15
@@ -25,7 +25,6 @@ ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> RENA
 ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> SET TABLESPACE <replaceable class="parameter">tablespace_name</replaceable>
 ALTER INDEX <replaceable class="parameter">name</replaceable> ATTACH PARTITION <replaceable class="parameter">index_name</replaceable>
 ALTER INDEX <replaceable class="parameter">name</replaceable> [ NO ] DEPENDS ON EXTENSION <replaceable class="parameter">extension_name</replaceable>
-ALTER INDEX <replaceable class="parameter">name</replaceable> ALTER COLLATION <replaceable class="parameter">collation_name</replaceable> REFRESH VERSION
 ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
 ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">storage_parameter</replaceable> [, ... ] )
 ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> ALTER [ COLUMN ] <replaceable class="parameter">column_number</replaceable>
@@ -113,20 +112,6 @@ ALTER INDEX ALL IN TABLESPACE <replaceable class="parameter">name</replaceable>
 </listitem>
 </varlistentry>
 
-<varlistentry>
-<term><literal>ALTER COLLATION <replaceable class="parameter">collation_name</replaceable> REFRESH VERSION</literal></term>
-<listitem>
-<para>
-Silences warnings about mismatched collation versions, by declaring
-that the index is compatible with the current collation definition.
-Be aware that incorrect use of this command can hide index corruption.
-If you don't know whether a collation's definition has changed
-incompatibly, <xref linkend="sql-reindex"/> is a safe alternative.
-See <xref linkend="collation-versions"/> for more information.
-</para>
-</listitem>
-</varlistentry>
-
 <varlistentry>
 <term><literal>SET ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
 <listitem>

doc/src/sgml/ref/create_collation.sgml

+21
@@ -27,6 +27,7 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> (
 [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
 [ PROVIDER = <replaceable>provider</replaceable>, ]
 [ DETERMINISTIC = <replaceable>boolean</replaceable>, ]
+[ VERSION = <replaceable>version</replaceable> ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -148,6 +149,26 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
 </listitem>
 </varlistentry>
 
+<varlistentry>
+<term><replaceable>version</replaceable></term>
+
+<listitem>
+<para>
+Specifies the version string to store with the collation. Normally,
+this should be omitted, which will cause the version to be computed
+from the actual version of the collation as provided by the operating
+system. This option is intended to be used
+by <command>pg_upgrade</command> for copying the version from an
+existing installation.
+</para>
+
+<para>
+See also <xref linkend="sql-altercollation"/> for how to handle
+collation version mismatches.
+</para>
+</listitem>
+</varlistentry>
+
 <varlistentry>
 <term><replaceable>existing_collation</replaceable></term>

doc/src/sgml/ref/pgupgrade.sgml

-15
@@ -215,21 +215,6 @@ PostgreSQL documentation
 </listitem>
 </varlistentry>
 
-<varlistentry>
-<term><option>--index-collation-versions-unknown</option></term>
-<listitem>
-<para>
-When upgrading indexes from releases before 14 that didn't track
-collation versions, <application>pg_upgrade</application>
-assumes by default that the upgraded indexes are compatible with the
-currently installed versions of relevant collations (see
-<xref linkend="collation-versions"/>). Specify
-<option>--index-collation-versions-unknown</option> to mark
-them as needing to be rebuilt instead.
-</para>
-</listitem>
-</varlistentry>
-
 <varlistentry>
 <term><option>-?</option></term>
 <term><option>--help</option></term>

doc/src/sgml/ref/reindex.sgml

-9
@@ -40,15 +40,6 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { IN
 several scenarios in which to use <command>REINDEX</command>:
 
 <itemizedlist>
-<listitem>
-<para>
-The index depends on the sort order of a collation, and the definition
-of the collation has changed. This can cause index scans to fail to
-find keys that are present. See <xref linkend="collation-versions"/> for
-more information.
-</para>
-</listitem>
-
 <listitem>
 <para>
 An index has become corrupted, and no longer contains valid
