
Commit 970fb12

Consistency check should compare last record replayed, not last record read.

EndRecPtr is the last record that we've read, but not necessarily yet replayed. CheckRecoveryConsistency should compare minRecoveryPoint with the last replayed record instead. This caused recovery to think it's reached consistency too early.

Now that we do the check in CheckRecoveryConsistency correctly, we have to move the call of that function to after redoing a record. The current place, after reading a record but before replaying it, is wrong. In particular, if there are no more records after the one ending at minRecoveryPoint, we don't enter hot standby until one extra record is generated and read by the standby, and CheckRecoveryConsistency is called. These two bugs conspired to make the code appear to work correctly, except for the small window between reading the last record that reaches minRecoveryPoint, and replaying it.

In the passing, rename recoveryLastRecPtr, which is the last record replayed, to lastReplayedEndRecPtr. This makes it slightly less confusing with replayEndRecPtr, which is the last record read that we're about to replay.

Original report from Kyotaro HORIGUCHI, further diagnosis by Fujii Masao. Backpatch to 9.0, where Hot Standby subtly changed the test from "minRecoveryPoint < EndRecPtr" to "minRecoveryPoint <= EndRecPtr". The former works because where the test is performed, we have always read one more record than we've replayed.
1 parent ad69bd0 commit 970fb12
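
The timing problem described in the commit message can be illustrated with a toy model. The sketch below is not PostgreSQL code: `RecPtr`, `SimResult`, and `simulate_recovery` are hypothetical names, WAL positions are plain integers, and "replaying" a record just advances a counter. It only contrasts the old behaviour (check after reading, compare against the record just read) with the fixed behaviour (check after replaying, compare against the last replayed record).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for XLogRecPtr (assumption: a plain 64-bit integer). */
typedef uint64_t RecPtr;

/* Hypothetical result of one simulated recovery run. */
typedef struct SimResult
{
	bool	reached;			/* was consistency ever reported? */
	RecPtr	replayed_at_report;	/* end+1 of last record replayed at that moment */
} SimResult;

/*
 * Walk over records whose end+1 positions are rec_end[0..nrecs-1], in
 * ascending order, starting replay from position "start".
 *
 * check_after_replay == false mimics the old code: the check runs after a
 * record is read but before it is replayed, and compares against the end of
 * the record just read.
 *
 * check_after_replay == true mimics the fixed code: the check runs after the
 * record is replayed, and compares against the last replayed position.
 */
SimResult
simulate_recovery(const RecPtr *rec_end, size_t nrecs, RecPtr start,
				  RecPtr min_recovery_point, bool check_after_replay)
{
	SimResult	res = {false, 0};
	RecPtr		last_replayed = start;

	assert(rec_end != NULL || nrecs == 0);

	for (size_t i = 0; i < nrecs; i++)
	{
		RecPtr		end_rec_ptr = rec_end[i];	/* record has been read */

		if (!check_after_replay && !res.reached &&
			min_recovery_point <= end_rec_ptr)
		{
			res.reached = true;
			res.replayed_at_report = last_replayed;
		}

		last_replayed = end_rec_ptr;			/* record has been replayed */

		if (check_after_replay && !res.reached &&
			min_recovery_point <= last_replayed)
		{
			res.reached = true;
			res.replayed_at_report = last_replayed;
		}
	}

	return res;
}
```

With records ending at 100, 200, and 300 and a minimum recovery point of 300, the old ordering reports consistency while only the record ending at 200 has been replayed; the fixed ordering reports it only once replay has actually reached 300.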

1 file changed, with 21 additions and 15 deletions
src/backend/access/transam/xlog.c

Lines changed: 21 additions & 15 deletions
@@ -445,11 +445,15 @@ typedef struct XLogCtlData
 	XLogRecPtr lastCheckPointRecPtr;
 	CheckPoint lastCheckPoint;
 
-	/* end+1 of the last record replayed (or being replayed) */
+	/*
+	 * lastReplayedEndRecPtr points to end+1 of the last record successfully
+	 * replayed. When we're currently replaying a record, ie. in a redo
+	 * function, replayEndRecPtr points to the end+1 of the record being
+	 * replayed, otherwise it's equal to lastReplayedEndRecPtr.
+	 */
+	XLogRecPtr lastReplayedEndRecPtr;
 	XLogRecPtr replayEndRecPtr;
 	TimeLineID replayEndTLI;
-	/* end+1 of the last record replayed */
-	XLogRecPtr recoveryLastRecPtr;
 	/* timestamp of last COMMIT/ABORT record replayed (or being replayed) */
 	TimestampTz recoveryLastXTime;
 	/* current effective recovery target timeline */
@@ -5745,7 +5749,7 @@ StartupXLOG(void)
 		}
 
 		/*
-		 * Initialize shared replayEndRecPtr, recoveryLastRecPtr, and
+		 * Initialize shared replayEndRecPtr, lastReplayedEndRecPtr, and
 		 * recoveryLastXTime.
 		 *
 		 * This is slightly confusing if we're starting from an online
@@ -5759,7 +5763,7 @@ StartupXLOG(void)
 		SpinLockAcquire(&xlogctl->info_lck);
 		xlogctl->replayEndRecPtr = ReadRecPtr;
 		xlogctl->replayEndTLI = ThisTimeLineID;
-		xlogctl->recoveryLastRecPtr = EndRecPtr;
+		xlogctl->lastReplayedEndRecPtr = EndRecPtr;
 		xlogctl->recoveryLastXTime = 0;
 		xlogctl->currentChunkStartTime = 0;
 		xlogctl->recoveryPause = false;
@@ -5851,9 +5855,6 @@ StartupXLOG(void)
 				/* Handle interrupt signals of startup process */
 				HandleStartupProcInterrupts();
 
-				/* Allow read-only connections if we're consistent now */
-				CheckRecoveryConsistency();
-
 				/*
 				 * Pause WAL replay, if requested by a hot-standby session via
 				 * SetRecoveryPause().
@@ -5983,16 +5984,19 @@ StartupXLOG(void)
 				}
 
 				/*
-				 * Update shared recoveryLastRecPtr after this record has been
-				 * replayed.
+				 * Update lastReplayedEndRecPtr after this record has been
+				 * successfully replayed.
 				 */
 				SpinLockAcquire(&xlogctl->info_lck);
-				xlogctl->recoveryLastRecPtr = EndRecPtr;
+				xlogctl->lastReplayedEndRecPtr = EndRecPtr;
 				SpinLockRelease(&xlogctl->info_lck);
 
 				/* Remember this record as the last-applied one */
 				LastRec = ReadRecPtr;
 
+				/* Allow read-only connections if we're consistent now */
+				CheckRecoveryConsistency();
+
 				/* Exit loop if we reached inclusive recovery target */
 				if (!recoveryContinue)
 					break;
@@ -6383,10 +6387,11 @@ CheckRecoveryConsistency(void)
 	 * Have we passed our safe starting point? Note that minRecoveryPoint
 	 * is known to be incorrectly set if ControlFile->backupEndRequired,
 	 * until the XLOG_BACKUP_RECORD arrives to advise us of the correct
-	 * minRecoveryPoint. All we prior to that is its not consistent yet.
+	 * minRecoveryPoint. All we know prior to that is that we're not
+	 * consistent yet.
 	 */
 	if (!reachedConsistency && !ControlFile->backupEndRequired &&
-		XLByteLE(minRecoveryPoint, EndRecPtr) &&
+		XLByteLE(minRecoveryPoint, XLogCtl->lastReplayedEndRecPtr) &&
 		XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
 	{
 		/*
@@ -6398,7 +6403,8 @@ CheckRecoveryConsistency(void)
 		reachedConsistency = true;
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
-						(uint32) (EndRecPtr >> 32), (uint32) EndRecPtr)));
+						(uint32) (XLogCtl->lastReplayedEndRecPtr >> 32),
+						(uint32) XLogCtl->lastReplayedEndRecPtr)));
 	}
 
 	/*
@@ -9094,7 +9100,7 @@ GetXLogReplayRecPtr(TimeLineID *targetTLI)
 	XLogRecPtr recptr;
 
 	SpinLockAcquire(&xlogctl->info_lck);
-	recptr = xlogctl->recoveryLastRecPtr;
+	recptr = xlogctl->lastReplayedEndRecPtr;
 	if (targetTLI)
 		*targetTLI = xlogctl->RecoveryTargetTLI;
 	SpinLockRelease(&xlogctl->info_lck);
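
The relationship between the two pointers, as stated in the new struct comment, can be sketched in isolation. This is a single-threaded illustration under stated assumptions, not the PostgreSQL implementation: `ReplayState`, `begin_redo`, `finish_redo`, and `redo_in_progress` are hypothetical names, `RecPtr` is a plain integer, and the spinlock that protects the shared fields in xlog.c is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for XLogRecPtr (assumption: a plain 64-bit integer). */
typedef uint64_t RecPtr;

/* Hypothetical, lock-free model of the two replay pointers. */
typedef struct ReplayState
{
	RecPtr	lastReplayedEndRecPtr;	/* end+1 of last record fully replayed */
	RecPtr	replayEndRecPtr;		/* end+1 of record being replayed, if any */
} ReplayState;

/* A record ending at end_rec_ptr has been read and its redo is starting. */
void
begin_redo(ReplayState *s, RecPtr end_rec_ptr)
{
	assert(end_rec_ptr >= s->lastReplayedEndRecPtr);
	s->replayEndRecPtr = end_rec_ptr;
}

/* The redo function returned successfully; the record now counts as replayed. */
void
finish_redo(ReplayState *s)
{
	s->lastReplayedEndRecPtr = s->replayEndRecPtr;
}

/* Outside a redo function the two pointers are equal. */
bool
redo_in_progress(const ReplayState *s)
{
	return s->replayEndRecPtr != s->lastReplayedEndRecPtr;
}

/* The fixed consistency test: compare against what has been replayed. */
bool
is_consistent(const ReplayState *s, RecPtr min_recovery_point)
{
	return min_recovery_point <= s->lastReplayedEndRecPtr;
}
```

In this model, a caller that asks `is_consistent` between `begin_redo` and `finish_redo` still sees the previous replayed position, which is the window the commit closes.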
