Commit f22624c

Fix Hot-Standby initialization of clog and subtrans.
These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read-only queries, by marking committed transactions as uncommitted. The likelihood of such corruption is small unless the primary has a high transaction rate.

5a031a5 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superfluous, since clog extensions are actually WAL logged.

f44eedc/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unnecessary - but screwed up twice: once because latestObservedXid wasn't updated anymore in that state due to the earlier commit, and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record, old transactions' commit bits in pg_clog could be overwritten if they started and committed in that window.

Fix this mess by not performing ExtendCLOG() in HS at all anymore, since it's unneeded and evidently dangerous, and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING.

Analysis and patch by Andres Freund. Reported by Christophe Pettus.

Backpatch down to 9.0, like the previous commit that caused this.
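The page-boundary condition in the message can be illustrated with a little arithmetic. The following is a minimal, standalone sketch and not the PostgreSQL implementation: the helper names (xid_to_page, set_status, zero_page and so on) and the ClogPage type are hypothetical, and it assumes the default 8192-byte block size with two status bits per transaction, which is where the figure of 32768 transactions per page comes from. It shows that zeroing a page that already holds commit bits silently turns "committed" back into "in progress".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed defaults: 8 kB pages, 2 status bits per transaction. */
#define BLCKSZ              8192
#define CLOG_BITS_PER_XACT  2
#define CLOG_XACTS_PER_BYTE 4
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

static_assert(CLOG_XACTS_PER_PAGE == 32768, "default page holds 32768 xids");

typedef uint32_t TransactionId;

/* Status values as used in the commit log: 0 means "in progress". */
enum { XACT_IN_PROGRESS = 0x00, XACT_COMMITTED = 0x01 };

/* Hypothetical stand-in for one commit-log page. */
typedef struct { uint8_t bytes[BLCKSZ]; } ClogPage;

/* Which page an xid lives on, and its slot within that page. */
static uint32_t xid_to_page(TransactionId xid)       { return xid / CLOG_XACTS_PER_PAGE; }
static uint32_t xid_to_page_index(TransactionId xid) { return xid % CLOG_XACTS_PER_PAGE; }

/* True when the two xids fall on different pages. */
static int crosses_page_boundary(TransactionId from, TransactionId to)
{
    return xid_to_page(from) != xid_to_page(to);
}

/* Store a 2-bit status for xid in the page. */
static void set_status(ClogPage *p, TransactionId xid, unsigned status)
{
    uint32_t idx    = xid_to_page_index(xid);
    uint32_t byteno = idx / CLOG_XACTS_PER_BYTE;
    unsigned shift  = (idx % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;

    p->bytes[byteno] = (uint8_t) ((p->bytes[byteno] & ~(0x03u << shift))
                                  | ((status & 0x03u) << shift));
}

/* Read the 2-bit status for xid back out of the page. */
static unsigned get_status(const ClogPage *p, TransactionId xid)
{
    uint32_t idx    = xid_to_page_index(xid);
    uint32_t byteno = idx / CLOG_XACTS_PER_BYTE;
    unsigned shift  = (idx % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;

    return (p->bytes[byteno] >> shift) & 0x03u;
}

/* What a page "extension" does: every slot becomes "in progress". */
static void zero_page(ClogPage *p)
{
    memset(p->bytes, 0, sizeof p->bytes);
}
```

In this model, calling zero_page() on a page after set_status(..., XACT_COMMITTED) has been replayed for one of its xids loses that commit bit, which is the failure mode the message describes for a redundant extension during recovery.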
1 parent 3379263 commit f22624c

2 files changed: +41 -29 lines changed

src/backend/access/transam/clog.c

Lines changed: 1 addition & 1 deletion
@@ -622,7 +622,7 @@ ExtendCLOG(TransactionId newestXact)
 	LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
 
 	/* Zero the page and make an XLOG entry about it */
-	ZeroCLOGPage(pageno, !InRecovery);
+	ZeroCLOGPage(pageno, true);
 
 	LWLockRelease(CLogControlLock);
 }

src/backend/storage/ipc/procarray.c

Lines changed: 40 additions & 28 deletions
@@ -472,7 +472,7 @@ ProcArrayClearTransaction(PGPROC *proc)
  * ProcArrayInitRecovery -- initialize recovery xid mgmt environment
  *
  * Remember up to where the startup process initialized the CLOG and subtrans
- * so we can ensure its initialized gaplessly up to the point where necessary
+ * so we can ensure it's initialized gaplessly up to the point where necessary
  * while in recovery.
  */
 void
void
@@ -482,9 +482,10 @@ ProcArrayInitRecovery(TransactionId initializedUptoXID)
 	Assert(TransactionIdIsNormal(initializedUptoXID));
 
 	/*
-	 * we set latestObservedXid to the xid SUBTRANS has been initialized upto
-	 * so we can extend it from that point onwards when we reach a consistent
-	 * state in ProcArrayApplyRecoveryInfo().
+	 * we set latestObservedXid to the xid SUBTRANS has been initialized upto,
+	 * so we can extend it from that point onwards in
+	 * RecordKnownAssignedTransactionIds, and when we get consistent in
+	 * ProcArrayApplyRecoveryInfo().
 	 */
 	latestObservedXid = initializedUptoXID;
 	TransactionIdRetreat(latestObservedXid);
@@ -653,17 +654,23 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 	pfree(xids);
 
 	/*
-	 * latestObservedXid is set to the the point where SUBTRANS was started up
-	 * to, initialize subtrans from thereon, up to nextXid - 1.
+	 * latestObservedXid is at least set to the the point where SUBTRANS was
+	 * started up to (c.f. ProcArrayInitRecovery()) or to the biggest xid
+	 * RecordKnownAssignedTransactionIds() was called for. Initialize
+	 * subtrans from thereon, up to nextXid - 1.
+	 *
+	 * We need to duplicate parts of RecordKnownAssignedTransactionId() here,
+	 * because we've just added xids to the known assigned xids machinery that
+	 * haven't gone through RecordKnownAssignedTransactionId().
 	 */
 	Assert(TransactionIdIsNormal(latestObservedXid));
+	TransactionIdAdvance(latestObservedXid);
 	while (TransactionIdPrecedes(latestObservedXid, running->nextXid))
 	{
-		ExtendCLOG(latestObservedXid);
 		ExtendSUBTRANS(latestObservedXid);
-
 		TransactionIdAdvance(latestObservedXid);
 	}
+	TransactionIdRetreat(latestObservedXid);	/* = running->nextXid - 1 */
 
 	/* ----------
 	 * Now we've got the running xids we need to set the global values that
@@ -748,10 +755,6 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
 
 	Assert(standbyState >= STANDBY_INITIALIZED);
 
-	/* can't do anything useful unless we have more state setup */
-	if (standbyState == STANDBY_INITIALIZED)
-		return;
-
 	max_xid = TransactionIdLatest(topxid, nsubxids, subxids);
 
 	/*
757760
/*
@@ -778,6 +781,10 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
 	for (i = 0; i < nsubxids; i++)
 		SubTransSetParent(subxids[i], topxid, false);
 
+	/* KnownAssignedXids isn't maintained yet, so we're done for now */
+	if (standbyState == STANDBY_INITIALIZED)
+		return;
+
 	/*
 	 * Uses same locking as transaction commit
 	 */
@@ -2634,18 +2641,11 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
 {
 	Assert(standbyState >= STANDBY_INITIALIZED);
 	Assert(TransactionIdIsValid(xid));
+	Assert(TransactionIdIsValid(latestObservedXid));
 
 	elog(trace_recovery(DEBUG4), "record known xact %u latestObservedXid %u",
 		 xid, latestObservedXid);
 
-	/*
-	 * If the KnownAssignedXids machinery isn't up yet, do nothing.
-	 */
-	if (standbyState <= STANDBY_INITIALIZED)
-		return;
-
-	Assert(TransactionIdIsValid(latestObservedXid));
-
 	/*
 	 * When a newly observed xid arrives, it is frequently the case that it is
 	 * *not* the next xid in sequence. When this occurs, we must treat the
@@ -2656,22 +2656,34 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
 		TransactionId next_expected_xid;
 
 		/*
-		 * Extend clog and subtrans like we do in GetNewTransactionId() during
-		 * normal operation using individual extend steps. Typical case
-		 * requires almost no activity.
+		 * Extend subtrans like we do in GetNewTransactionId() during normal
+		 * operation using individual extend steps. Note that we do not need
+		 * to extend clog since its extensions are WAL logged.
+		 *
+		 * This part has to be done regardless of standbyState since we
+		 * immediately start assigning subtransactions to their toplevel
+		 * transactions.
 		 */
 		next_expected_xid = latestObservedXid;
-		TransactionIdAdvance(next_expected_xid);
-		while (TransactionIdPrecedesOrEquals(next_expected_xid, xid))
+		while (TransactionIdPrecedes(next_expected_xid, xid))
 		{
-			ExtendCLOG(next_expected_xid);
+			TransactionIdAdvance(next_expected_xid);
 			ExtendSUBTRANS(next_expected_xid);
+		}
+		Assert(next_expected_xid == xid);
 
-			TransactionIdAdvance(next_expected_xid);
+		/*
+		 * If the KnownAssignedXids machinery isn't up yet, there's nothing
+		 * more to do since we don't track assigned xids yet.
+		 */
+		if (standbyState <= STANDBY_INITIALIZED)
+		{
+			latestObservedXid = xid;
+			return;
 		}
 
 		/*
-		 * Add the new xids onto the KnownAssignedXids array.
+		 * Add (latestObservedXid, xid] onto the KnownAssignedXids array.
 		 */
 		next_expected_xid = latestObservedXid;
 		TransactionIdAdvance(next_expected_xid);