Skip to content

Commit 7863ee4

Browse files
committed
Fix control file update done in restartpoints still running after promotion
If a cluster is promoted (aka the control file shows a state different than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still processing, this function could miss an update of the control file for "checkPoint" and "checkPointCopy" but still do the recycling and/or removal of the past WAL segments, assuming that the to-be-updated LSN values should be used as reference points for the cleanup. This causes a follow-up restart attempting crash recovery to fail with a PANIC on a missing checkpoint record if the end-of-recovery checkpoint triggered by the promotion did not complete while the cluster abruptly stopped or crashed before the completion of this checkpoint. The PANIC would be caused by the redo LSN referred in the control file as located in a segment already gone, recycled by the previous restartpoint with "checkPoint" out-of-sync in the control file. This commit fixes the update of the control file during restartpoints so as "checkPoint" and "checkPointCopy" are updated even if the cluster has been promoted while a restartpoint is running, to be on par with the set of WAL segments actually recycled in the end of CreateRestartPoint(). This problem exists in all the stable branches. However, commit 7ff23c6, by removing the last call of CreateCheckPoint() from the startup process, has made this bug much easier to reason about as concurrent checkpoints are not possible anymore. No backpatch is done yet, mostly out of caution from me as a point release is close by, but we need to think harder about the case of concurrent checkpoints at promotion if the bgwriter is not considered as running by the startup process in ~v14, so this change is done only on HEAD for the moment. Reported-by: Fujii Masao, Rui Zhao Author: Kyotaro Horiguchi Reviewed-by: Nathan Bossart, Michael Paquier Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com
1 parent a22652e commit 7863ee4

File tree

1 file changed

+33
-21
lines changed
  • src/backend/access/transam

1 file changed

+33
-21
lines changed

src/backend/access/transam/xlog.c

+33-21
Original file line numberDiff line numberDiff line change
@@ -6921,6 +6921,9 @@ CreateRestartPoint(int flags)
69216921
XLogSegNo _logSegNo;
69226922
TimestampTz xtime;
69236923

6924+
/* Concurrent checkpoint/restartpoint cannot happen */
6925+
Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER);
6926+
69246927
/* Get a local copy of the last safe checkpoint record. */
69256928
SpinLockAcquire(&XLogCtl->info_lck);
69266929
lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr;
@@ -7014,40 +7017,49 @@ CreateRestartPoint(int flags)
70147017
PriorRedoPtr = ControlFile->checkPointCopy.redo;
70157018

70167019
/*
7017-
* Update pg_control, using current time. Check that it still shows
7018-
* DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
7019-
* this is a quick hack to make sure nothing really bad happens if somehow
7020-
* we get here after the end-of-recovery checkpoint.
7020+
* Update pg_control, using current time. Check that it still shows an
7021+
* older checkpoint, else do nothing; this is a quick hack to make sure
7022+
* nothing really bad happens if somehow we get here after the
7023+
* end-of-recovery checkpoint.
70217024
*/
70227025
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
7023-
if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
7024-
ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
7026+
if (ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
70257027
{
7028+
/*
7029+
* Update the checkpoint information. We do this even if the cluster
7030+
* does not show DB_IN_ARCHIVE_RECOVERY to match with the set of WAL
7031+
* segments recycled below.
7032+
*/
70267033
ControlFile->checkPoint = lastCheckPointRecPtr;
70277034
ControlFile->checkPointCopy = lastCheckPoint;
70287035

70297036
/*
7030-
* Ensure minRecoveryPoint is past the checkpoint record. Normally,
7037+
* Ensure minRecoveryPoint is past the checkpoint record and update it
7038+
* if the control file still shows DB_IN_ARCHIVE_RECOVERY. Normally,
70317039
* this will have happened already while writing out dirty buffers,
70327040
* but not necessarily - e.g. because no buffers were dirtied. We do
7033-
* this because a backup performed in recovery uses minRecoveryPoint to
7034-
* determine which WAL files must be included in the backup, and the
7035-
* file (or files) containing the checkpoint record must be included,
7036-
* at a minimum. Note that for an ordinary restart of recovery there's
7037-
* no value in having the minimum recovery point any earlier than this
7038-
* anyway, because redo will begin just after the checkpoint record.
7041+
* this because a backup performed in recovery uses minRecoveryPoint
7042+
* to determine which WAL files must be included in the backup, and
7043+
* the file (or files) containing the checkpoint record must be
7044+
* included, at a minimum. Note that for an ordinary restart of
7045+
* recovery there's no value in having the minimum recovery point any
7046+
* earlier than this anyway, because redo will begin just after the
7047+
* checkpoint record.
70397048
*/
7040-
if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr)
7049+
if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY)
70417050
{
7042-
ControlFile->minRecoveryPoint = lastCheckPointEndPtr;
7043-
ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID;
7051+
if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr)
7052+
{
7053+
ControlFile->minRecoveryPoint = lastCheckPointEndPtr;
7054+
ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID;
70447055

7045-
/* update local copy */
7046-
LocalMinRecoveryPoint = ControlFile->minRecoveryPoint;
7047-
LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
7056+
/* update local copy */
7057+
LocalMinRecoveryPoint = ControlFile->minRecoveryPoint;
7058+
LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
7059+
}
7060+
if (flags & CHECKPOINT_IS_SHUTDOWN)
7061+
ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
70487062
}
7049-
if (flags & CHECKPOINT_IS_SHUTDOWN)
7050-
ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
70517063
UpdateControlFile();
70527064
}
70537065
LWLockRelease(ControlFileLock);

0 commit comments

Comments
 (0)