Skip to content

Commit 6fea655

Browse files
committed
Tighten ComputeXidHorizons' handling of walsenders.
ComputeXidHorizons (nee GetOldestXmin) thought that it could identify walsenders by checking for proc->databaseId == 0. Perhaps that was safe when the code was written, but it's been wrong at least since autovacuum was invented. Background processes that aren't connected to any particular database, such as the autovacuum launcher and logical replication launcher, look like that too. This imprecision is harmful because when such a process advertises an xmin, the result is to hold back dead-tuple cleanup in all databases, though it'd be sufficient to hold it back in shared catalogs (which are the only relations such a process can access). Aside from being generally inefficient, this has recently been seen to cause regression test failures in the buildfarm, as a consequence of the logical replication launcher's startup transaction preventing VACUUM from marking pages of a user table as all-visible. We only want that global hold-back effect for the case where a walsender is advertising a hot standby feedback xmin. Therefore, invent a new PGPROC flag that says that a process' xmin should be considered globally, and check that instead of using the incorrect databaseId == 0 test. Currently only a walsender sets that flag, and only if it is not connected to any particular database. (This is for bug-compatibility with the undocumented behavior of the existing code, namely that feedback sent by a client who has connected to a particular database would not be applied globally. I'm not sure this is a great definition; however, such a client is capable of issuing plain SQL commands, and I don't think we want xmins advertised for such commands to be applied globally. Perhaps this could do with refinement later.) While at it, I rewrote the comment in ComputeXidHorizons, and re-ordered the commented-upon if-tests, to make them match up for intelligibility's sake. This is arguably a back-patchable bug fix, but given the lack of complaints I think it prudent to let it age awhile in HEAD first. Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us
1 parent bdb71db commit 6fea655

File tree

3 files changed

+46
-17
lines changed

3 files changed

+46
-17
lines changed

src/backend/replication/walsender.c

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -285,6 +285,21 @@ InitWalSender(void)
285285
MarkPostmasterChildWalSender();
286286
SendPostmasterSignal(PMSIGNAL_ADVANCE_STATE_MACHINE);
287287

288+
/*
289+
* If the client didn't specify a database to connect to, show in PGPROC
290+
* that our advertised xmin should affect vacuum horizons in all
291+
* databases. This allows physical replication clients to send hot
292+
* standby feedback that will delay vacuum cleanup in all databases.
293+
*/
294+
if (MyDatabaseId == InvalidOid)
295+
{
296+
Assert(MyProc->xmin == InvalidTransactionId);
297+
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
298+
MyProc->statusFlags |= PROC_AFFECTS_ALL_HORIZONS;
299+
ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
300+
LWLockRelease(ProcArrayLock);
301+
}
302+
288303
/* Initialize empty timestamp buffer for lag tracking. */
289304
lag_tracker = MemoryContextAllocZero(TopMemoryContext, sizeof(LagTracker));
290305
}

src/backend/storage/ipc/procarray.c

Lines changed: 28 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1810,21 +1810,28 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
18101810
TransactionIdOlder(h->shared_oldest_nonremovable, xmin);
18111811

18121812
/*
1813-
* Normally queries in other databases are ignored for anything but
1814-
* the shared horizon. But in recovery we cannot compute an accurate
1815-
* per-database horizon as all xids are managed via the
1816-
* KnownAssignedXids machinery.
1813+
* Normally sessions in other databases are ignored for anything but
1814+
* the shared horizon.
18171815
*
1818-
* Be careful to compute a pessimistic value when MyDatabaseId is not
1819-
* set. If this is a backend in the process of starting up, we may not
1820-
* use a "too aggressive" horizon (otherwise we could end up using it
1821-
* to prune still needed data away). If the current backend never
1822-
* connects to a database that is harmless, because
1823-
* data_oldest_nonremovable will never be utilized.
1816+
* However, include them when MyDatabaseId is not (yet) set. A
1817+
* backend in the process of starting up must not compute a "too
1818+
* aggressive" horizon, otherwise we could end up using it to prune
1819+
* still-needed data away. If the current backend never connects to a
1820+
* database this is harmless, because data_oldest_nonremovable will
1821+
* never be utilized.
1822+
*
1823+
* Also, sessions marked with PROC_AFFECTS_ALL_HORIZONS should always
1824+
* be included. (This flag is used for hot standby feedback, which
1825+
* can't be tied to a specific database.)
1826+
*
1827+
* Also, while in recovery we cannot compute an accurate per-database
1828+
* horizon, as all xids are managed via the KnownAssignedXids
1829+
* machinery.
18241830
*/
1825-
if (in_recovery ||
1826-
MyDatabaseId == InvalidOid || proc->databaseId == MyDatabaseId ||
1827-
proc->databaseId == 0) /* always include WalSender */
1831+
if (proc->databaseId == MyDatabaseId ||
1832+
MyDatabaseId == InvalidOid ||
1833+
(statusFlags & PROC_AFFECTS_ALL_HORIZONS) ||
1834+
in_recovery)
18281835
{
18291836
/*
18301837
* We can ignore this backend if it's running CREATE INDEX
@@ -2681,10 +2688,14 @@ ProcArrayInstallRestoredXmin(TransactionId xmin, PGPROC *proc)
26812688
/* Install xmin */
26822689
MyProc->xmin = TransactionXmin = xmin;
26832690

2684-
/* Flags being copied must be valid copy-able flags. */
2685-
Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
2686-
MyProc->statusFlags = proc->statusFlags;
2687-
ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
2691+
/* walsender cheats by passing proc == MyProc, don't check its flags */
2692+
if (proc != MyProc)
2693+
{
2694+
/* Flags being copied must be valid copy-able flags. */
2695+
Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
2696+
MyProc->statusFlags = proc->statusFlags;
2697+
ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
2698+
}
26882699

26892700
result = true;
26902701
}

src/include/storage/proc.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,9 @@ struct XidCache
6060
#define PROC_VACUUM_FOR_WRAPAROUND 0x08 /* set by autovac only */
6161
#define PROC_IN_LOGICAL_DECODING 0x10 /* currently doing logical
6262
* decoding outside xact */
63+
#define PROC_AFFECTS_ALL_HORIZONS 0x20 /* this proc's xmin must be
64+
* included in vacuum horizons
65+
* in all databases */
6366

6467
/* flags reset at EOXact */
6568
#define PROC_VACUUM_STATE_MASK \

0 commit comments

Comments
 (0)