Skip to content

Commit 1ddc270

Browse files
committed
Work around deadlock problems with VACUUM FULL/CLUSTER on system catalogs,
as per my recent proposal. First, teach IndexBuildHeapScan to not wait for INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples to commit unless the index build is checking uniqueness/exclusion constraints. If it isn't, there's no harm in just indexing the in-doubt tuple. Second, modify VACUUM FULL/CLUSTER to suppress reverifying uniqueness/exclusion constraint properties while rebuilding indexes of the target relation. This is reasonable because these commands aren't meant to deal with corrupted-data situations. Constraint properties will still be rechecked when an index is rebuilt by a REINDEX command. This gets us out of the problem that new-style VACUUM FULL would often wait for other transactions while holding exclusive lock on a system catalog, leading to probable deadlock because those other transactions need to look at the catalogs too. Although the real ultimate cause of the problem is a debatable choice to release locks early after modifying system catalogs, changing that choice would require pretty serious analysis and is not something to be undertaken lightly or on a tight schedule. The present patch fixes the problem in a fairly reasonable way and should also improve the speed of VACUUM FULL/CLUSTER a little bit.
1 parent 1c05b0b commit 1ddc270

File tree

5 files changed

+94
-67
lines changed

5 files changed

+94
-67
lines changed

src/backend/catalog/index.c

+87-56
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/catalog/index.c,v 1.333 2010/02/07 20:48:09 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/catalog/index.c,v 1.334 2010/02/07 22:40:33 tgl Exp $
1212
*
1313
*
1414
* INTERFACE ROUTINES
@@ -1541,6 +1541,8 @@ IndexBuildHeapScan(Relation heapRelation,
15411541
IndexBuildCallback callback,
15421542
void *callback_state)
15431543
{
1544+
bool is_system_catalog;
1545+
bool checking_uniqueness;
15441546
HeapScanDesc scan;
15451547
HeapTuple heapTuple;
15461548
Datum values[INDEX_MAX_KEYS];
@@ -1560,6 +1562,13 @@ IndexBuildHeapScan(Relation heapRelation,
15601562
*/
15611563
Assert(OidIsValid(indexRelation->rd_rel->relam));
15621564

1565+
/* Remember if it's a system catalog */
1566+
is_system_catalog = IsSystemRelation(heapRelation);
1567+
1568+
/* See whether we're verifying uniqueness/exclusion properties */
1569+
checking_uniqueness = (indexInfo->ii_Unique ||
1570+
indexInfo->ii_ExclusionOps != NULL);
1571+
15631572
/*
15641573
* Need an EState for evaluation of index expressions and partial-index
15651574
* predicates. Also a slot to hold the current tuple.
@@ -1652,6 +1661,7 @@ IndexBuildHeapScan(Relation heapRelation,
16521661
{
16531662
/* do our own time qual check */
16541663
bool indexIt;
1664+
TransactionId xwait;
16551665

16561666
recheck:
16571667

@@ -1710,29 +1720,31 @@ IndexBuildHeapScan(Relation heapRelation,
17101720
case HEAPTUPLE_INSERT_IN_PROGRESS:
17111721

17121722
/*
1713-
* Since caller should hold ShareLock or better, we should
1714-
* not see any tuples inserted by open transactions ---
1715-
* unless it's our own transaction. (Consider INSERT
1716-
* followed by CREATE INDEX within a transaction.) An
1717-
* exception occurs when reindexing a system catalog,
1718-
* because we often release lock on system catalogs before
1719-
* committing. In that case we wait for the inserting
1720-
* transaction to finish and check again. (We could do
1721-
* that on user tables too, but since the case is not
1722-
* expected it seems better to throw an error.)
1723+
* Since caller should hold ShareLock or better, normally
1724+
* the only way to see this is if it was inserted earlier
1725+
* in our own transaction. However, it can happen in
1726+
* system catalogs, since we tend to release write lock
1727+
* before commit there. Give a warning if neither case
1728+
* applies.
17231729
*/
1724-
if (!TransactionIdIsCurrentTransactionId(
1725-
HeapTupleHeaderGetXmin(heapTuple->t_data)))
1730+
xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
1731+
if (!TransactionIdIsCurrentTransactionId(xwait))
17261732
{
1727-
if (!IsSystemRelation(heapRelation))
1728-
elog(ERROR, "concurrent insert in progress");
1729-
else
1733+
if (!is_system_catalog)
1734+
elog(WARNING, "concurrent insert in progress within table \"%s\"",
1735+
RelationGetRelationName(heapRelation));
1736+
1737+
/*
1738+
* If we are performing uniqueness checks, indexing
1739+
* such a tuple could lead to a bogus uniqueness
1740+
* failure. In that case we wait for the inserting
1741+
* transaction to finish and check again.
1742+
*/
1743+
if (checking_uniqueness)
17301744
{
17311745
/*
17321746
* Must drop the lock on the buffer before we wait
17331747
*/
1734-
TransactionId xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
1735-
17361748
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
17371749
XactLockTableWait(xwait);
17381750
goto recheck;
@@ -1749,30 +1761,27 @@ IndexBuildHeapScan(Relation heapRelation,
17491761
case HEAPTUPLE_DELETE_IN_PROGRESS:
17501762

17511763
/*
1752-
* Since caller should hold ShareLock or better, we should
1753-
* not see any tuples deleted by open transactions ---
1754-
* unless it's our own transaction. (Consider DELETE
1755-
* followed by CREATE INDEX within a transaction.) An
1756-
* exception occurs when reindexing a system catalog,
1757-
* because we often release lock on system catalogs before
1758-
* committing. In that case we wait for the deleting
1759-
* transaction to finish and check again. (We could do
1760-
* that on user tables too, but since the case is not
1761-
* expected it seems better to throw an error.)
1764+
* Similar situation to INSERT_IN_PROGRESS case.
17621765
*/
17631766
Assert(!(heapTuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI));
1764-
if (!TransactionIdIsCurrentTransactionId(
1765-
HeapTupleHeaderGetXmax(heapTuple->t_data)))
1767+
xwait = HeapTupleHeaderGetXmax(heapTuple->t_data);
1768+
if (!TransactionIdIsCurrentTransactionId(xwait))
17661769
{
1767-
if (!IsSystemRelation(heapRelation))
1768-
elog(ERROR, "concurrent delete in progress");
1769-
else
1770+
if (!is_system_catalog)
1771+
elog(WARNING, "concurrent delete in progress within table \"%s\"",
1772+
RelationGetRelationName(heapRelation));
1773+
1774+
/*
1775+
* If we are performing uniqueness checks, assuming
1776+
* the tuple is dead could lead to missing a uniqueness
1777+
* violation. In that case we wait for the deleting
1778+
* transaction to finish and check again.
1779+
*/
1780+
if (checking_uniqueness)
17701781
{
17711782
/*
17721783
* Must drop the lock on the buffer before we wait
17731784
*/
1774-
TransactionId xwait = HeapTupleHeaderGetXmax(heapTuple->t_data);
1775-
17761785
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
17771786
XactLockTableWait(xwait);
17781787
goto recheck;
@@ -2402,7 +2411,7 @@ IndexGetRelation(Oid indexId)
24022411
* reindex_index - This routine is used to recreate a single index
24032412
*/
24042413
void
2405-
reindex_index(Oid indexId)
2414+
reindex_index(Oid indexId, bool skip_constraint_checks)
24062415
{
24072416
Relation iRel,
24082417
heapRelation,
@@ -2411,6 +2420,7 @@ reindex_index(Oid indexId)
24112420
IndexInfo *indexInfo;
24122421
HeapTuple indexTuple;
24132422
Form_pg_index indexForm;
2423+
volatile bool skipped_constraint = false;
24142424

24152425
/*
24162426
* Open and lock the parent heap relation. ShareLock is sufficient since
@@ -2448,6 +2458,17 @@ reindex_index(Oid indexId)
24482458
/* Fetch info needed for index_build */
24492459
indexInfo = BuildIndexInfo(iRel);
24502460

2461+
/* If requested, skip checking uniqueness/exclusion constraints */
2462+
if (skip_constraint_checks)
2463+
{
2464+
if (indexInfo->ii_Unique || indexInfo->ii_ExclusionOps != NULL)
2465+
skipped_constraint = true;
2466+
indexInfo->ii_Unique = false;
2467+
indexInfo->ii_ExclusionOps = NULL;
2468+
indexInfo->ii_ExclusionProcs = NULL;
2469+
indexInfo->ii_ExclusionStrats = NULL;
2470+
}
2471+
24512472
/* We'll build a new physical relation for the index */
24522473
RelationSetNewRelfilenode(iRel, InvalidTransactionId);
24532474

@@ -2466,33 +2487,38 @@ reindex_index(Oid indexId)
24662487

24672488
/*
24682489
* If the index is marked invalid or not ready (ie, it's from a failed
2469-
* CREATE INDEX CONCURRENTLY), we can now mark it valid. This allows
2470-
* REINDEX to be used to clean up in such cases.
2490+
* CREATE INDEX CONCURRENTLY), and we didn't skip a uniqueness check,
2491+
* we can now mark it valid. This allows REINDEX to be used to clean up
2492+
* in such cases.
24712493
*
24722494
* We can also reset indcheckxmin, because we have now done a
24732495
* non-concurrent index build, *except* in the case where index_build
24742496
* found some still-broken HOT chains.
24752497
*/
2476-
pg_index = heap_open(IndexRelationId, RowExclusiveLock);
2498+
if (!skipped_constraint)
2499+
{
2500+
pg_index = heap_open(IndexRelationId, RowExclusiveLock);
24772501

2478-
indexTuple = SearchSysCacheCopy(INDEXRELID,
2479-
ObjectIdGetDatum(indexId),
2480-
0, 0, 0);
2481-
if (!HeapTupleIsValid(indexTuple))
2482-
elog(ERROR, "cache lookup failed for index %u", indexId);
2483-
indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
2502+
indexTuple = SearchSysCacheCopy(INDEXRELID,
2503+
ObjectIdGetDatum(indexId),
2504+
0, 0, 0);
2505+
if (!HeapTupleIsValid(indexTuple))
2506+
elog(ERROR, "cache lookup failed for index %u", indexId);
2507+
indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
24842508

2485-
if (!indexForm->indisvalid || !indexForm->indisready ||
2486-
(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain))
2487-
{
2488-
indexForm->indisvalid = true;
2489-
indexForm->indisready = true;
2490-
if (!indexInfo->ii_BrokenHotChain)
2491-
indexForm->indcheckxmin = false;
2492-
simple_heap_update(pg_index, &indexTuple->t_self, indexTuple);
2493-
CatalogUpdateIndexes(pg_index, indexTuple);
2509+
if (!indexForm->indisvalid || !indexForm->indisready ||
2510+
(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain))
2511+
{
2512+
indexForm->indisvalid = true;
2513+
indexForm->indisready = true;
2514+
if (!indexInfo->ii_BrokenHotChain)
2515+
indexForm->indcheckxmin = false;
2516+
simple_heap_update(pg_index, &indexTuple->t_self, indexTuple);
2517+
CatalogUpdateIndexes(pg_index, indexTuple);
2518+
}
2519+
2520+
heap_close(pg_index, RowExclusiveLock);
24942521
}
2495-
heap_close(pg_index, RowExclusiveLock);
24962522

24972523
/* Close rels, but keep locks */
24982524
index_close(iRel, NoLock);
@@ -2513,6 +2539,11 @@ reindex_index(Oid indexId)
25132539
* do CCI after having collected the index list. (This way we can still use
25142540
* catalog indexes while collecting the list.)
25152541
*
2542+
* We also skip rechecking uniqueness/exclusion constraint properties if
2543+
* heap_rebuilt is true. This avoids likely deadlock conditions when doing
2544+
* VACUUM FULL or CLUSTER on system catalogs. REINDEX should be used to
2545+
* rebuild an index if constraint inconsistency is suspected.
2546+
*
25162547
* Returns true if any indexes were rebuilt. Note that a
25172548
* CommandCounterIncrement will occur after each index rebuild.
25182549
*/
@@ -2594,7 +2625,7 @@ reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
25942625
if (is_pg_class)
25952626
RelationSetIndexList(rel, doneIndexes, InvalidOid);
25962627

2597-
reindex_index(indexOid);
2628+
reindex_index(indexOid, heap_rebuilt);
25982629

25992630
CommandCounterIncrement();
26002631

src/backend/commands/cluster.c

+1-2
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
*
1212
*
1313
* IDENTIFICATION
14-
* $PostgreSQL: pgsql/src/backend/commands/cluster.c,v 1.198 2010/02/07 20:48:10 tgl Exp $
14+
* $PostgreSQL: pgsql/src/backend/commands/cluster.c,v 1.199 2010/02/07 22:40:33 tgl Exp $
1515
*
1616
*-------------------------------------------------------------------------
1717
*/
@@ -948,7 +948,6 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex,
948948
*/
949949
if (!is_system_catalog &&
950950
!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
951-
952951
elog(WARNING, "concurrent insert in progress within table \"%s\"",
953952
RelationGetRelationName(OldHeap));
954953
/* treat as live */

src/backend/commands/indexcmds.c

+2-2
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/commands/indexcmds.c,v 1.191 2010/02/07 20:48:10 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/commands/indexcmds.c,v 1.192 2010/02/07 22:40:33 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -1588,7 +1588,7 @@ ReindexIndex(RangeVar *indexRelation)
15881588

15891589
ReleaseSysCache(tuple);
15901590

1591-
reindex_index(indOid);
1591+
reindex_index(indOid, false);
15921592
}
15931593

15941594
/*

src/include/catalog/index.h

+2-2
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
10-
* $PostgreSQL: pgsql/src/include/catalog/index.h,v 1.82 2010/02/07 20:48:11 tgl Exp $
10+
* $PostgreSQL: pgsql/src/include/catalog/index.h,v 1.83 2010/02/07 22:40:33 tgl Exp $
1111
*
1212
*-------------------------------------------------------------------------
1313
*/
@@ -70,7 +70,7 @@ extern double IndexBuildHeapScan(Relation heapRelation,
7070

7171
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
7272

73-
extern void reindex_index(Oid indexId);
73+
extern void reindex_index(Oid indexId, bool skip_constraint_checks);
7474
extern bool reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt);
7575

7676
extern bool ReindexIsProcessingHeap(Oid heapOid);

src/test/regress/parallel_schedule

+2-5
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# ----------
2-
# $PostgreSQL: pgsql/src/test/regress/parallel_schedule,v 1.59 2010/02/07 20:48:13 tgl Exp $
2+
# $PostgreSQL: pgsql/src/test/regress/parallel_schedule,v 1.60 2010/02/07 22:40:33 tgl Exp $
33
#
44
# By convention, we put no more than twenty tests in any one parallel group;
55
# this limits the number of connections needed to run the tests.
@@ -52,10 +52,7 @@ test: copy copyselect
5252
# ----------
5353
# Another group of parallel tests
5454
# ----------
55-
test: constraints triggers create_misc create_aggregate create_operator inherit typed_table drop_if_exists create_cast
56-
57-
# XXX temporarily run this by itself
58-
test: vacuum
55+
test: constraints triggers create_misc create_aggregate create_operator inherit typed_table vacuum drop_if_exists create_cast
5956

6057
# Depends on the above
6158
test: create_index create_view

0 commit comments

Comments
 (0)