Commit 9f0d2bd

Don't set reachedMinRecoveryPoint during crash recovery.

In crash recovery, we don't reach consistency before replaying all of the WAL. Rename the variable to reachedConsistency, to make its intention clearer.

In master, that was an active bug because of the recent patch to immediately PANIC if a reference to a missing page is found in WAL after reaching consistency, as Tom Lane's test case demonstrated. In 9.1 and 9.0, the only consequence was a misleading "consistent recovery state reached at %X/%X" message in the log at the beginning of crash recovery (the database is not consistent at that point yet). In 8.4, the log message was not printed in crash recovery, even though there was a similar reachedMinRecoveryPoint local variable that was also set early. So, backpatch to 9.1 and 9.0.

1 parent 5d8a894 commit 9f0d2bd
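
The check described in the commit message can be sketched as a small standalone C model. This is not the PostgreSQL code: `WalPos`, `RecoveryModel`, and `check_recovery_consistency` are hypothetical names, a plain integer stands in for `XLogRecPtr`, and the value 0 is assumed to mean "invalid". The sketch only illustrates the ordering of the tests in the patched `CheckRecoveryConsistency`: an invalid minimum recovery point (the crash recovery case) returns early, so the flag can only be set during archive recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogRecPtr; 0 is assumed to mean "invalid". */
typedef uint64_t WalPos;
#define INVALID_WAL_POS ((WalPos) 0)

typedef struct RecoveryModel
{
	WalPos		min_recovery_point;		/* invalid during crash recovery */
	WalPos		backup_start_point;		/* invalid when no backup is pending */
	bool		reached_consistency;	/* models reachedConsistency */
} RecoveryModel;

/*
 * Model of the patched check.  end_rec_ptr is the end of the record that
 * was just replayed.  Returns true only on the call that first reaches
 * consistency.
 */
static bool
check_recovery_consistency(RecoveryModel *m, WalPos end_rec_ptr)
{
	assert(m != NULL);

	/* Crash recovery: not consistent until all WAL has been replayed. */
	if (m->min_recovery_point == INVALID_WAL_POS)
		return false;

	/* Archive recovery: consistent once the safe starting point is passed. */
	if (!m->reached_consistency &&
		m->min_recovery_point <= end_rec_ptr &&
		m->backup_start_point == INVALID_WAL_POS)
	{
		m->reached_consistency = true;
		return true;
	}

	return false;
}
```

In this model, dropping the early return lets the comparison `min_recovery_point <= end_rec_ptr` succeed on the first call when the minimum recovery point is 0, which matches the premature "consistent recovery state reached" message the commit message describes.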

3 files changed: +19 -6 lines changed

src/backend/access/transam/xlog.c

Lines changed: 17 additions & 4 deletions
@@ -562,7 +562,13 @@ static TimeLineID lastPageTLI = 0;
 static XLogRecPtr minRecoveryPoint;		/* local copy of
 										 * ControlFile->minRecoveryPoint */
 static bool updateMinRecoveryPoint = true;
-bool		reachedMinRecoveryPoint = false;
+
+/*
+ * Have we reached a consistent database state? In crash recovery, we have
+ * to replay all the WAL, so reachedConsistency is never set. During archive
+ * recovery, the database is consistent once minRecoveryPoint is reached.
+ */
+bool		reachedConsistency = false;
 
 static bool InRedo = false;
 
@@ -6893,10 +6899,17 @@ StartupXLOG(void)
 static void
 CheckRecoveryConsistency(void)
 {
+	/*
+	 * During crash recovery, we don't reach a consistent state until we've
+	 * replayed all the WAL.
+	 */
+	if (XLogRecPtrIsInvalid(minRecoveryPoint))
+		return;
+
 	/*
 	 * Have we passed our safe starting point?
 	 */
-	if (!reachedMinRecoveryPoint &&
+	if (!reachedConsistency &&
 		XLByteLE(minRecoveryPoint, EndRecPtr) &&
 		XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
 	{
@@ -6906,7 +6919,7 @@ CheckRecoveryConsistency(void)
 		 */
 		XLogCheckInvalidPages();
 
-		reachedMinRecoveryPoint = true;
+		reachedConsistency = true;
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
 						EndRecPtr.xlogid, EndRecPtr.xrecoff)));
@@ -6919,7 +6932,7 @@ CheckRecoveryConsistency(void)
 	 */
 	if (standbyState == STANDBY_SNAPSHOT_READY &&
 		!LocalHotStandbyActive &&
-		reachedMinRecoveryPoint &&
+		reachedConsistency &&
 		IsUnderPostmaster)
 	{
 		/* use volatile pointer to prevent code rearrangement */

src/backend/access/transam/xlogutils.c

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno,
 	 * linger in the hash table until the end of recovery and PANIC there,
 	 * which might come only much later if this is a standby server.
 	 */
-	if (reachedMinRecoveryPoint)
+	if (reachedConsistency)
 	{
 		report_invalid_page(WARNING, node, forkno, blkno, present);
 		elog(PANIC, "WAL contains references to invalid pages");
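
The effect of the flag on invalid-page handling, as far as the hunk above shows it, can be modeled in a few lines of standalone C. The names `InvalidPageAction` and `note_invalid_page` and the `pending` counter are hypothetical; the counter stands in for the hash table mentioned in the comment, and the real function reports and PANICs rather than returning a value. Under those assumptions, a reference to an invalid page is only fatal right away once consistency has been reached; before that it is remembered for a later check.

```c
#include <stdbool.h>

/* Hypothetical outcome of seeing a WAL reference to an invalid page. */
typedef enum InvalidPageAction
{
	INVALID_PAGE_REMEMBERED,	/* recorded; checked again later in recovery */
	INVALID_PAGE_PANIC			/* reported immediately; recovery aborts */
} InvalidPageAction;

/*
 * Model of the branch guarded by reachedConsistency.  *pending counts the
 * entries that would be left in the invalid-page hash table.
 */
static InvalidPageAction
note_invalid_page(bool reached_consistency, int *pending)
{
	if (reached_consistency)
		return INVALID_PAGE_PANIC;

	(*pending)++;
	return INVALID_PAGE_REMEMBERED;
}
```

This is why setting the flag too early mattered in master: with the flag wrongly set during crash recovery, the model takes the PANIC branch for a page reference that would otherwise have been deferred.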

src/include/access/xlog.h

Lines changed: 1 addition & 1 deletion
@@ -190,7 +190,7 @@ typedef enum
 
 extern XLogRecPtr XactLastRecEnd;
 
-extern bool reachedMinRecoveryPoint;
+extern bool reachedConsistency;
 
 /* these variables are GUC parameters related to XLOG */
 extern int	CheckPointSegments;
