Commit e3914bd
For inplace update, send nontransactional invalidations.
The inplace update survives ROLLBACK.  The inval didn't, so another
backend's DDL could then update the row without incorporating the
inplace update.  In the test this fixes, a mix of CREATE INDEX and ALTER
TABLE resulted in a table with an index, yet relhasindex=f.  That is a
source of index corruption.  Back-patch to v12 (all supported versions).

The back branch versions don't change WAL, because those branches just
added end-of-recovery SIResetAll().  All branches change the ABI of
extern function PrepareToInvalidateCacheTuple().  No PGXN extension
calls that, and there's no apparent use case in extensions.

Reviewed by Nitin Motiani and (in earlier versions) Andres Freund.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
1 parent dca6824 commit e3914bd

11 files changed, +290 -118 lines changed

src/backend/access/heap/heapam.c

Lines changed: 36 additions & 7 deletions
@@ -6383,6 +6383,24 @@ heap_inplace_update_and_unlock(Relation relation,
 	if (oldlen != newlen || htup->t_hoff != tuple->t_data->t_hoff)
 		elog(ERROR, "wrong tuple length");
 
+	/*
+	 * Construct shared cache inval if necessary.  Note that because we only
+	 * pass the new version of the tuple, this mustn't be used for any
+	 * operations that could change catcache lookup keys.  But we aren't
+	 * bothering with index updates either, so that's true a fortiori.
+	 */
+	CacheInvalidateHeapTupleInplace(relation, tuple, NULL);
+
+	/*
+	 * Unlink relcache init files as needed.  If unlinking, acquire
+	 * RelCacheInitLock until after associated invalidations.  By doing this
+	 * in advance, if we checkpoint and then crash between inplace
+	 * XLogInsert() and inval, we don't rely on StartupXLOG() ->
+	 * RelationCacheInitFileRemove().  That uses elevel==LOG, so replay would
+	 * neglect to PANIC on EIO.
+	 */
+	PreInplace_Inval();
+
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
 
@@ -6426,17 +6444,28 @@
 		PageSetLSN(BufferGetPage(buffer), recptr);
 	}
 
+	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
+
+	/*
+	 * Send invalidations to shared queue.  SearchSysCacheLocked1() assumes we
+	 * do this before UnlockTuple().
+	 *
+	 * If we're mutating a tuple visible only to this transaction, there's an
+	 * equivalent transactional inval from the action that created the tuple,
+	 * and this inval is superfluous.
+	 */
+	AtInplace_Inval();
+
 	END_CRIT_SECTION();
+	UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);
 
-	heap_inplace_unlock(relation, oldtup, buffer);
+	AcceptInvalidationMessages();	/* local processing of just-sent inval */
 
 	/*
-	 * Send out shared cache inval if necessary.  Note that because we only
-	 * pass the new version of the tuple, this mustn't be used for any
-	 * operations that could change catcache lookup keys.  But we aren't
-	 * bothering with index updates either, so that's true a fortiori.
-	 *
-	 * XXX ROLLBACK discards the invalidation.  See test inplace-inval.spec.
+	 * Queue a transactional inval.  The immediate invalidation we just sent
+	 * is the only one known to be necessary.  To reduce risk from the
+	 * transition to immediate invalidation, continue sending a transactional
+	 * invalidation like we've long done.  Third-party code might rely on it.
 	 */
 	if (!IsBootstrapProcessingMode())
 		CacheInvalidateHeapTuple(relation, tuple, NULL);
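
Pieced together from the two hunks above, the resulting ordering in heap_inplace_update_and_unlock() is roughly the following (a condensed sketch, not the verbatim function body; the "..." stands for the unchanged tuple-overwrite and WAL-logging code):

	CacheInvalidateHeapTupleInplace(relation, tuple, NULL);	/* build invals */
	PreInplace_Inval();			/* relcache init file work, ahead of WAL */

	START_CRIT_SECTION();
	/* ... overwrite the tuple in place, XLogInsert(), PageSetLSN() ... */
	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
	AtInplace_Inval();			/* send invals to the shared queue */
	END_CRIT_SECTION();

	UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);
	AcceptInvalidationMessages();	/* apply the just-sent invals locally */

	if (!IsBootstrapProcessingMode())
		CacheInvalidateHeapTuple(relation, tuple, NULL);	/* legacy transactional inval */

The key property: the shared-queue send happens inside the critical section and before UnlockTuple(), outside the transactional inval machinery, so a subsequent ROLLBACK can no longer discard it.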

src/backend/access/transam/xact.c

Lines changed: 18 additions & 8 deletions
@@ -1276,14 +1276,24 @@ RecordTransactionCommit(void)
 
 		/*
 		 * Transactions without an assigned xid can contain invalidation
-		 * messages (e.g. explicit relcache invalidations or catcache
-		 * invalidations for inplace updates); standbys need to process those.
-		 * We can't emit a commit record without an xid, and we don't want to
-		 * force assigning an xid, because that'd be problematic for e.g.
-		 * vacuum.  Hence we emit a bespoke record for the invalidations. We
-		 * don't want to use that in case a commit record is emitted, so they
-		 * happen synchronously with commits (besides not wanting to emit more
-		 * WAL records).
+		 * messages.  While inplace updates do this, this is not known to be
+		 * necessary; see comment at inplace CacheInvalidateHeapTuple().
+		 * Extensions might still rely on this capability, and standbys may
+		 * need to process those invals.  We can't emit a commit record
+		 * without an xid, and we don't want to force assigning an xid,
+		 * because that'd be problematic for e.g. vacuum.  Hence we emit a
+		 * bespoke record for the invalidations.  We don't want to use that in
+		 * case a commit record is emitted, so they happen synchronously with
+		 * commits (besides not wanting to emit more WAL records).
+		 *
+		 * XXX Every known use of this capability is a defect.  Since an XID
+		 * isn't controlling visibility of the change that prompted invals,
+		 * other sessions need the inval even if this transaction aborts.
+		 *
+		 * ON COMMIT DELETE ROWS does a nontransactional index_build(), which
+		 * queues a relcache inval, including in transactions without an xid
+		 * that had read the (empty) table.  Standbys don't need any ON COMMIT
+		 * DELETE ROWS invals, but we've not done the work to withhold them.
 		 */
 		if (nmsgs != 0)
 		{
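
For orientation, the branch beneath the rewritten comment is the one that emits the bespoke invalidation record; in current sources it reads approximately as below (this commit changes only the comment; LogStandbyInvalidations() is the pre-existing helper):

		if (nmsgs != 0)
		{
			LogStandbyInvalidations(nmsgs, invalMessages,
									RelcacheInitFileInval);
			wrote_xlog = true;	/* not strictly necessary */
		}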

src/backend/catalog/index.c

Lines changed: 9 additions & 2 deletions
@@ -2895,12 +2895,19 @@ index_update_stats(Relation rel,
 	if (dirty)
 	{
 		systable_inplace_update_finish(state, tuple);
-		/* the above sends a cache inval message */
+		/* the above sends transactional and immediate cache inval messages */
 	}
 	else
 	{
 		systable_inplace_update_cancel(state);
-		/* no need to change tuple, but force relcache inval anyway */
+
+		/*
+		 * While we didn't change relhasindex, CREATE INDEX needs a
+		 * transactional inval for when the new index's catalog rows become
+		 * visible.  Other CREATE INDEX and REINDEX code happens to also queue
+		 * this inval, but keep this in case rare callers rely on this part of
+		 * our API contract.
+		 */
 		CacheInvalidateRelcacheByTuple(tuple);
 	}
 

src/backend/replication/logical/decode.c

Lines changed: 11 additions & 15 deletions
@@ -547,23 +547,19 @@ DecodeHeapOp(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 
 			/*
 			 * Inplace updates are only ever performed on catalog tuples and
-			 * can, per definition, not change tuple visibility.  Since we
-			 * don't decode catalog tuples, we're not interested in the
-			 * record's contents.
+			 * can, per definition, not change tuple visibility.  Inplace
+			 * updates don't affect storage or interpretation of table rows,
+			 * so they don't affect logicalrep_write_tuple() outcomes.  Hence,
+			 * we don't process invalidations from the original operation.  If
+			 * inplace updates did affect those things, invalidations wouldn't
+			 * make it work, since there are no snapshot-specific versions of
+			 * inplace-updated values.  Since we also don't decode catalog
+			 * tuples, we're not interested in the record's contents.
 			 *
-			 * In-place updates can be used either by XID-bearing transactions
-			 * (e.g. in CREATE INDEX CONCURRENTLY) or by XID-less
-			 * transactions (e.g. VACUUM).  In the former case, the commit
-			 * record will include cache invalidations, so we mark the
-			 * transaction as catalog modifying here.  Currently that's
-			 * redundant because the commit will do that as well, but once we
-			 * support decoding in-progress relations, this will be important.
+			 * WAL contains likely-unnecessary commit-time invals from the
+			 * CacheInvalidateHeapTuple() call in heap_inplace_update().
+			 * Excess invalidation is safe.
 			 */
-			if (!TransactionIdIsValid(xid))
-				break;
-
-			SnapBuildProcessChange(builder, xid, buf->origptr);
-			ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
 			break;
 
 		case XLOG_HEAP_CONFIRM:
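
Net effect on the decoder: the XLOG_HEAP_INPLACE arm now unconditionally ignores the record, whether or not the transaction has an XID. A condensed sketch of the resulting case arm:

			case XLOG_HEAP_INPLACE:
				/* catalog-only change; no visibility or row-image impact */
				break;

In particular, XID-bearing inplace updaters are no longer marked catalog-modifying here; per the new comment, the commit-time invals already in WAL suffice, and excess invalidation is safe.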

src/backend/utils/cache/catcache.c

Lines changed: 4 additions & 3 deletions
@@ -2131,7 +2131,8 @@ void
 PrepareToInvalidateCacheTuple(Relation relation,
 							  HeapTuple tuple,
 							  HeapTuple newtuple,
-							  void (*function) (int, uint32, Oid))
+							  void (*function) (int, uint32, Oid, void *),
+							  void *context)
 {
 	slist_iter	iter;
 	Oid			reloid;
@@ -2172,7 +2173,7 @@ PrepareToInvalidateCacheTuple(Relation relation,
 		hashvalue = CatalogCacheComputeTupleHashValue(ccp, ccp->cc_nkeys, tuple);
 		dbid = ccp->cc_relisshared ? (Oid) 0 : MyDatabaseId;
 
-		(*function) (ccp->id, hashvalue, dbid);
+		(*function) (ccp->id, hashvalue, dbid, context);
 
 		if (newtuple)
 		{
@@ -2181,7 +2182,7 @@ PrepareToInvalidateCacheTuple(Relation relation,
 			newhashvalue = CatalogCacheComputeTupleHashValue(ccp, ccp->cc_nkeys, newtuple);
 
 			if (newhashvalue != hashvalue)
-				(*function) (ccp->id, newhashvalue, dbid);
+				(*function) (ccp->id, newhashvalue, dbid, context);
 		}
 	}
 }
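
Since the commit message flags the PrepareToInvalidateCacheTuple() ABI change, any out-of-tree caller would have to grow its callback by the new void *context parameter. A hypothetical sketch (MyInvalState and my_register_catcache_inval are illustrative names, not PostgreSQL symbols):

	typedef struct MyInvalState
	{
		int			ninvals;	/* invalidations registered so far */
	} MyInvalState;

	/* Hypothetical callback matching the new four-argument signature. */
	static void
	my_register_catcache_inval(int cacheId, uint32 hashValue, Oid dbId,
							   void *context)
	{
		MyInvalState *state = (MyInvalState *) context;

		state->ninvals++;	/* caller-supplied state; no file-level global needed */
	}

The context pointer lets a caller thread its own state through to the callback rather than relying on globals.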
