Skip to content

Commit 0bc3ed9

Browse files
committed
Account for catalog snapshot in PGXACT->xmin updates.
The CatalogSnapshot was not plugged into SnapshotResetXmin()'s accounting for whether MyPgXact->xmin could be cleared or advanced. In normal transactions this was masked by the fact that the transaction snapshot would be older, but during backend startup and certain utility commands it was possible to re-use the CatalogSnapshot after MyPgXact->xmin had been cleared, meaning that recently-deleted rows could be pruned even though this snapshot could still see them, causing unexpected catalog lookup failures. This effect appears to be the explanation for a recent failure on buildfarm member piculet. To fix, add the CatalogSnapshot to the RegisteredSnapshots heap whenever it is valid. In the previous logic, it was possible for the CatalogSnapshot to remain valid across waits for client input, but with this change that would mean it delays advance of global xmin in cases where it did not before. To avoid possibly causing new table-bloat problems with clients that sit idle for long intervals, add code to invalidate the CatalogSnapshot before waiting for client input. (When the backend is busy, it's unlikely that the CatalogSnapshot would be the oldest snap for very long, so we don't worry about forcing early invalidation of it otherwise.) In passing, remove the CatalogSnapshotStale flag in favor of using "CatalogSnapshot != NULL" to represent validity, as we do for the other special snapshots in snapmgr.c. And improve some obsolete comments. No regression test because I don't know a deterministic way to cause this failure. But the stress test shown in the original discussion provokes "cache lookup failed for relation 1255" within a few dozen seconds for me. Back-patch to 9.4 where MVCC catalog scans were introduced. (Note: it's quite easy to produce similar failures with the same test case in branches before 9.4. But MVCC catalog scans were supposed to fix that.) Discussion: <16447.1478818294@sss.pgh.pa.us>
1 parent 60de884 commit 0bc3ed9

File tree

3 files changed

+95
-36
lines changed

3 files changed

+95
-36
lines changed

src/backend/tcop/postgres.c

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3947,6 +3947,12 @@ PostgresMain(int argc, char *argv[],
39473947

39483948
initStringInfo(&input_message);
39493949

3950+
/*
3951+
* Also consider releasing our catalog snapshot if any, so that it's
3952+
* not preventing advance of global xmin while we wait for the client.
3953+
*/
3954+
InvalidateCatalogSnapshotConditionally();
3955+
39503956
/*
39513957
* (1) If we've reached idle state, tell the frontend we're ready for
39523958
* a new query.

src/backend/utils/time/snapmgr.c

Lines changed: 88 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
/*-------------------------------------------------------------------------
2+
*
23
* snapmgr.c
34
* PostgreSQL snapshot manager
45
*
@@ -8,27 +9,30 @@
89
* (tracked by separate refcounts on each snapshot), its memory can be freed.
910
*
1011
* The FirstXactSnapshot, if any, is treated a bit specially: we increment its
11-
* regd_count and count it in RegisteredSnapshots, but this reference is not
12+
* regd_count and list it in RegisteredSnapshots, but this reference is not
1213
* tracked by a resource owner. We used to use the TopTransactionResourceOwner
1314
* to track this snapshot reference, but that introduces logical circularity
1415
* and thus makes it impossible to clean up in a sane fashion. It's better to
1516
* handle this reference as an internally-tracked registration, so that this
1617
* module is entirely lower-level than ResourceOwners.
1718
*
1819
* Likewise, any snapshots that have been exported by pg_export_snapshot
19-
* have regd_count = 1 and are counted in RegisteredSnapshots, but are not
20+
* have regd_count = 1 and are listed in RegisteredSnapshots, but are not
2021
* tracked by any resource owner.
2122
*
23+
* Likewise, the CatalogSnapshot is listed in RegisteredSnapshots when it
24+
* is valid, but is not tracked by any resource owner.
25+
*
2226
* The same is true for historic snapshots used during logical decoding,
23-
* their lifetime is managed separately (as they life longer as one xact.c
27+
* their lifetime is managed separately (as they live longer than one xact.c
2428
* transaction).
2529
*
2630
* These arrangements let us reset MyPgXact->xmin when there are no snapshots
27-
* referenced by this transaction. (One possible improvement would be to be
28-
* able to advance Xmin when the snapshot with the earliest Xmin is no longer
29-
* referenced. That's a bit harder though, it requires more locking, and
30-
* anyway it should be rather uncommon to keep temporary snapshots referenced
31-
* for too long.)
31+
* referenced by this transaction, and advance it when the one with oldest
32+
* Xmin is no longer referenced. For simplicity however, only registered
33+
* snapshots not active snapshots participate in tracking which one is oldest;
34+
* we don't try to change MyPgXact->xmin except when the active-snapshot
35+
* stack is empty.
3236
*
3337
*
3438
* Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
@@ -66,7 +70,7 @@
6670
* SecondarySnapshot is a snapshot that's always up-to-date as of the current
6771
* instant, even in transaction-snapshot mode. It should only be used for
6872
* special-purpose code (say, RI checking.) CatalogSnapshot points to an
69-
* MVCC snapshot intended to be used for catalog scans; we must refresh it
73+
* MVCC snapshot intended to be used for catalog scans; we must invalidate it
7074
* whenever a system catalog change occurs.
7175
*
7276
* These SnapshotData structs are static to simplify memory allocation
@@ -82,11 +86,6 @@ static Snapshot SecondarySnapshot = NULL;
8286
static Snapshot CatalogSnapshot = NULL;
8387
static Snapshot HistoricSnapshot = NULL;
8488

85-
/*
86-
* Staleness detection for CatalogSnapshot.
87-
*/
88-
static bool CatalogSnapshotStale = true;
89-
9089
/*
9190
* These are updated by GetSnapshotData. We initialize them this way
9291
* for the convenience of TransactionIdIsInProgress: even in bootstrap
@@ -125,8 +124,7 @@ static ActiveSnapshotElt *ActiveSnapshot = NULL;
125124

126125
/*
127126
* Currently registered Snapshots. Ordered in a heap by xmin, so that we can
128-
* quickly find the one with lowest xmin, to advance our MyPgXat->xmin.
129-
* resowner.c also tracks these.
127+
* quickly find the one with lowest xmin, to advance our MyPgXact->xmin.
130128
*/
131129
static int xmin_cmp(const pairingheap_node *a, const pairingheap_node *b,
132130
void *arg);
@@ -201,6 +199,12 @@ GetTransactionSnapshot(void)
201199
/* First call in transaction? */
202200
if (!FirstSnapshotSet)
203201
{
202+
/*
203+
* Don't allow catalog snapshot to be older than xact snapshot. Must
204+
* do this first to allow the empty-heap Assert to succeed.
205+
*/
206+
InvalidateCatalogSnapshot();
207+
204208
Assert(pairingheap_is_empty(&RegisteredSnapshots));
205209
Assert(FirstXactSnapshot == NULL);
206210

@@ -232,9 +236,6 @@ GetTransactionSnapshot(void)
232236
else
233237
CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
234238

235-
/* Don't allow catalog snapshot to be older than xact snapshot. */
236-
CatalogSnapshotStale = true;
237-
238239
FirstSnapshotSet = true;
239240
return CurrentSnapshot;
240241
}
@@ -243,7 +244,7 @@ GetTransactionSnapshot(void)
243244
return CurrentSnapshot;
244245

245246
/* Don't allow catalog snapshot to be older than xact snapshot. */
246-
CatalogSnapshotStale = true;
247+
InvalidateCatalogSnapshot();
247248

248249
CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
249250

@@ -318,36 +319,72 @@ GetNonHistoricCatalogSnapshot(Oid relid)
318319
* scan a relation for which neither catcache nor snapshot invalidations
319320
* are sent, we must refresh the snapshot every time.
320321
*/
321-
if (!CatalogSnapshotStale && !RelationInvalidatesSnapshotsOnly(relid) &&
322+
if (CatalogSnapshot &&
323+
!RelationInvalidatesSnapshotsOnly(relid) &&
322324
!RelationHasSysCache(relid))
323-
CatalogSnapshotStale = true;
325+
InvalidateCatalogSnapshot();
324326

325-
if (CatalogSnapshotStale)
327+
if (CatalogSnapshot == NULL)
326328
{
327329
/* Get new snapshot. */
328330
CatalogSnapshot = GetSnapshotData(&CatalogSnapshotData);
329331

330332
/*
331-
* Mark new snapshost as valid. We must do this last, in case an
332-
* ERROR occurs inside GetSnapshotData().
333+
* Make sure the catalog snapshot will be accounted for in decisions
334+
* about advancing PGXACT->xmin. We could apply RegisterSnapshot, but
335+
* that would result in making a physical copy, which is overkill; and
336+
* it would also create a dependency on some resource owner, which we
337+
* do not want for reasons explained at the head of this file. Instead
338+
* just shove the CatalogSnapshot into the pairing heap manually. This
339+
* has to be reversed in InvalidateCatalogSnapshot, of course.
340+
*
341+
* NB: it had better be impossible for this to throw error, since the
342+
* CatalogSnapshot pointer is already valid.
333343
*/
334-
CatalogSnapshotStale = false;
344+
pairingheap_add(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
335345
}
336346

337347
return CatalogSnapshot;
338348
}
339349

340350
/*
341-
* Mark the current catalog snapshot as invalid. We could change this API
342-
* to allow the caller to provide more fine-grained invalidation details, so
343-
* that a change to relation A wouldn't prevent us from using our cached
344-
* snapshot to scan relation B, but so far there's no evidence that the CPU
345-
* cycles we spent tracking such fine details would be well-spent.
351+
* InvalidateCatalogSnapshot
352+
* Mark the current catalog snapshot, if any, as invalid
353+
*
354+
* We could change this API to allow the caller to provide more fine-grained
355+
* invalidation details, so that a change to relation A wouldn't prevent us
356+
* from using our cached snapshot to scan relation B, but so far there's no
357+
* evidence that the CPU cycles we spent tracking such fine details would be
358+
* well-spent.
359+
*/
360+
void
361+
InvalidateCatalogSnapshot(void)
362+
{
363+
if (CatalogSnapshot)
364+
{
365+
pairingheap_remove(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
366+
CatalogSnapshot = NULL;
367+
SnapshotResetXmin();
368+
}
369+
}
370+
371+
/*
372+
* InvalidateCatalogSnapshotConditionally
373+
* Drop catalog snapshot if it's the only one we have
374+
*
375+
* This is called when we are about to wait for client input, so we don't
376+
* want to continue holding the catalog snapshot if it might mean that the
377+
* global xmin horizon can't advance. However, if there are other snapshots
378+
* still active or registered, the catalog snapshot isn't likely to be the
379+
* oldest one, so we might as well keep it.
346380
*/
347381
void
348-
InvalidateCatalogSnapshot()
382+
InvalidateCatalogSnapshotConditionally(void)
349383
{
350-
CatalogSnapshotStale = true;
384+
if (CatalogSnapshot &&
385+
ActiveSnapshot == NULL &&
386+
pairingheap_is_singular(&RegisteredSnapshots))
387+
InvalidateCatalogSnapshot();
351388
}
352389

353390
/*
@@ -364,6 +401,7 @@ SnapshotSetCommandId(CommandId curcid)
364401
CurrentSnapshot->curcid = curcid;
365402
if (SecondarySnapshot)
366403
SecondarySnapshot->curcid = curcid;
404+
/* Should we do the same with CatalogSnapshot? */
367405
}
368406

369407
/*
@@ -381,6 +419,9 @@ SetTransactionSnapshot(Snapshot sourcesnap, TransactionId sourcexid,
381419
/* Caller should have checked this already */
382420
Assert(!FirstSnapshotSet);
383421

422+
/* Better do this to ensure following Assert succeeds. */
423+
InvalidateCatalogSnapshot();
424+
384425
Assert(pairingheap_is_empty(&RegisteredSnapshots));
385426
Assert(FirstXactSnapshot == NULL);
386427
Assert(!HistoricSnapshotActive());
@@ -769,7 +810,15 @@ xmin_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
769810
* Even if there are some remaining snapshots, we may be able to advance our
770811
* PGXACT->xmin to some degree. This typically happens when a portal is
771812
* dropped. For efficiency, we only consider recomputing PGXACT->xmin when
772-
* the active snapshot stack is empty.
813+
* the active snapshot stack is empty; this allows us not to need to track
814+
* which active snapshot is oldest.
815+
*
816+
* Note: it's tempting to use GetOldestSnapshot() here so that we can include
817+
* active snapshots in the calculation. However, that compares by LSN not
818+
* xmin so it's not entirely clear that it's the same thing. Also, we'd be
819+
* critically dependent on the assumption that the bottommost active snapshot
820+
* stack entry has the oldest xmin. (Current uses of GetOldestSnapshot() are
821+
* not actually critical, but this would be.)
773822
*/
774823
static void
775824
SnapshotResetXmin(void)
@@ -855,7 +904,7 @@ AtEOXact_Snapshot(bool isCommit)
855904
{
856905
/*
857906
* In transaction-snapshot mode we must release our privately-managed
858-
* reference to the transaction snapshot. We must decrement
907+
* reference to the transaction snapshot. We must remove it from
859908
* RegisteredSnapshots to keep the check below happy. But we don't bother
860909
* to do FreeSnapshot, for two reasons: the memory will go away with
861910
* TopTransactionContext anyway, and if someone has left the snapshot
@@ -895,7 +944,7 @@ AtEOXact_Snapshot(bool isCommit)
895944
/*
896945
* As with the FirstXactSnapshot, we needn't spend any effort on
897946
* cleaning up the per-snapshot data structures, but we do need to
898-
* unlink them from RegisteredSnapshots to prevent a warning below.
947+
* remove them from RegisteredSnapshots to prevent a warning below.
899948
*/
900949
foreach(lc, exportedSnapshots)
901950
{
@@ -907,6 +956,9 @@ AtEOXact_Snapshot(bool isCommit)
907956
exportedSnapshots = NIL;
908957
}
909958

959+
/* Drop catalog snapshot if any */
960+
InvalidateCatalogSnapshot();
961+
910962
/* On commit, complain about leftover snapshots */
911963
if (isCommit)
912964
{

src/include/utils/snapmgr.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ extern void SnapshotSetCommandId(CommandId curcid);
3232
extern Snapshot GetCatalogSnapshot(Oid relid);
3333
extern Snapshot GetNonHistoricCatalogSnapshot(Oid relid);
3434
extern void InvalidateCatalogSnapshot(void);
35+
extern void InvalidateCatalogSnapshotConditionally(void);
3536

3637
extern void PushActiveSnapshot(Snapshot snapshot);
3738
extern void PushCopiedSnapshot(Snapshot snapshot);

0 commit comments

Comments
 (0)