Commit dca6824

At end of recovery, reset all sinval-managed caches.

An inplace update's invalidation messages are part of its transaction's commit record. However, the update survives even if its transaction aborts or we stop recovery before replaying its transaction commit. After recovery, a backend that started in recovery could update the row without incorporating the inplace update. That could result in a table with an index, yet relhasindex=f. That is a source of index corruption.

This bulk invalidation avoids the functional consequences. A future change can fix the !RecoveryInProgress() scenario without changing the WAL format. Back-patch to v17 - v12 (all supported versions). v18 will instead add invalidations to WAL.

Discussion: https://postgr.es/m/20240618152349.7f.nmisch@google.com
1 parent ad24b75 commit dca6824

3 files changed, 67 additions and 0 deletions

src/backend/access/transam/xlog.c

Lines changed: 25 additions & 0 deletions
@@ -70,6 +70,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "storage/reinit.h"
+#include "storage/sinvaladt.h"
 #include "storage/smgr.h"
 #include "storage/spin.h"
 #include "storage/sync.h"
@@ -8019,6 +8020,30 @@ StartupXLOG(void)
 			CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
 	}
 
+	/*
+	 * Invalidate all sinval-managed caches before READ WRITE transactions
+	 * begin. The xl_heap_inplace WAL record doesn't store sufficient data
+	 * for invalidations. The commit record, if any, has the invalidations.
+	 * However, the inplace update is permanent, whether or not we reach a
+	 * commit record. Fortunately, read-only transactions tolerate caches not
+	 * reflecting the latest inplace updates. Read-only transactions
+	 * experience the notable inplace updates as follows:
+	 *
+	 * - relhasindex=true affects readers only after the CREATE INDEX
+	 *   transaction commit makes an index fully available to them.
+	 *
+	 * - datconnlimit=DATCONNLIMIT_INVALID_DB affects readers only at
+	 *   InitPostgres() time, and that read does not use a cache.
+	 *
+	 * - relfrozenxid, datfrozenxid, relminmxid, and datminmxid have no effect
+	 *   on readers.
+	 *
+	 * Hence, hot standby queries (all READ ONLY) function correctly without
+	 * the missing invalidations. This avoided changing the WAL format in
+	 * back branches.
+	 */
+	SIResetAll();
+
 	/*
 	 * Preallocate additional log files, if wanted.
 	 */
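The hazard this hunk guards against can be illustrated with a toy model. The sketch below is not PostgreSQL code: `ToyClassRow`, `ToyBackend`, and every `toy_*` function are hypothetical names, and the cache is reduced to a single row copy with a validity flag. It assumes a backend that cached a catalog row during recovery, missed the inplace update because no invalidation arrived, and then performed an ordinary whole-row update from its stale copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a pg_class row. */
typedef struct ToyClassRow
{
	bool		relhasindex;
	int			relpages;
} ToyClassRow;

/* Hypothetical backend-local cache holding one row copy. */
typedef struct ToyBackend
{
	ToyClassRow cached;
	bool		cacheValid;
} ToyBackend;

/* Return the cached row, loading it from "disk" only if the cache is invalid. */
static ToyClassRow
toy_fetch(ToyBackend *b, const ToyClassRow *disk)
{
	if (!b->cacheValid)
	{
		b->cached = *disk;
		b->cacheValid = true;
	}
	return b->cached;
}

/* Ordinary update: start from the cached copy, change one field, write the whole row. */
static void
toy_update_relpages(ToyBackend *b, ToyClassRow *disk, int relpages)
{
	ToyClassRow newrow = toy_fetch(b, disk);

	newrow.relpages = relpages;
	*disk = newrow;
	b->cached = newrow;
}

/* Returns the on-disk relhasindex after the backend's first read-write update. */
static bool
toy_relhasindex_after_recovery(bool resetCaches)
{
	ToyClassRow disk = {.relhasindex = false, .relpages = 0};
	ToyBackend	b = {0};

	(void) toy_fetch(&b, &disk);	/* backend caches the row during recovery */
	disk.relhasindex = true;		/* replayed inplace update, no invalidation delivered */

	if (resetCaches)
		b.cacheValid = false;		/* modeled effect of consuming a cache reset */

	toy_update_relpages(&b, &disk, 42);
	return disk.relhasindex;
}
```

In this model, `toy_relhasindex_after_recovery(false)` leaves `relhasindex` false on disk even though the inplace update set it, while `toy_relhasindex_after_recovery(true)` preserves it because the backend reloads the row before updating.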

src/backend/storage/ipc/sinvaladt.c

Lines changed: 41 additions & 0 deletions
@@ -748,6 +748,47 @@ SICleanupQueue(bool callerHasWriteLock, int minFree)
 	}
 }
 
+/*
+ * SIResetAll
+ *		Mark all active backends as "reset"
+ *
+ * Use this when we don't know what needs to be invalidated. It's a
+ * cluster-wide InvalidateSystemCaches(). This was a back-branch-only remedy
+ * to avoid a WAL format change.
+ *
+ * The implementation is like SICleanupQueue(false, MAXNUMMESSAGES + 1), with
+ * one addition. SICleanupQueue() assumes minFree << MAXNUMMESSAGES, so it
+ * assumes hasMessages==true for any backend it resets. We're resetting even
+ * fully-caught-up backends, so we set hasMessages.
+ */
+void
+SIResetAll(void)
+{
+	SISeg	   *segP = shmInvalBuffer;
+	int			i;
+
+	LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
+	LWLockAcquire(SInvalReadLock, LW_EXCLUSIVE);
+
+	for (i = 0; i < segP->lastBackend; i++)
+	{
+		ProcState  *stateP = &segP->procState[i];
+
+		if (stateP->procPid == 0 || stateP->sendOnly)
+			continue;
+
+		/* Consuming the reset will update "nextMsgNum" and "signaled". */
+		stateP->resetState = true;
+		stateP->hasMessages = true;
+	}
+
+	segP->minMsgNum = segP->maxMsgNum;
+	segP->nextThreshold = CLEANUP_MIN;
+
+	LWLockRelease(SInvalReadLock);
+	LWLockRelease(SInvalWriteLock);
+}
+
 
 /*
  * GetNextLocalTransactionId --- allocate a new LocalTransactionId
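The header comment of SIResetAll() says it must set hasMessages because it resets even fully-caught-up backends. The following lock-free toy model sketches why. All `Toy*` and `toy_*` names are hypothetical, the `setHasMessages` parameter exists only to show the counterfactual, and the consumer is an assumed simplification of a reader that returns early when its hasMessages flag is clear.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BACKENDS 3

/* Hypothetical, simplified stand-in for a per-backend queue state. */
typedef struct ToyProcState
{
	int			procPid;		/* 0 means the slot is unused */
	bool		sendOnly;		/* backend sends messages but never reads them */
	bool		resetState;		/* backend must discard all cached state */
	bool		hasMessages;	/* backend has something to consume */
	int			nextMsgNum;
	int			flushCount;		/* toy counter: number of full cache flushes */
} ToyProcState;

typedef struct ToySeg
{
	int			minMsgNum;
	int			maxMsgNum;
	int			lastBackend;
	ToyProcState procState[TOY_MAX_BACKENDS];
} ToySeg;

/* Mark every active, reading backend as reset (locking omitted in this toy). */
static void
toy_reset_all(ToySeg *seg, bool setHasMessages)
{
	for (int i = 0; i < seg->lastBackend; i++)
	{
		ToyProcState *st = &seg->procState[i];

		if (st->procPid == 0 || st->sendOnly)
			continue;

		st->resetState = true;
		if (setHasMessages)
			st->hasMessages = true;
	}
	seg->minMsgNum = seg->maxMsgNum;
}

/* Assumed reader behavior: skip all work unless hasMessages is set. */
static void
toy_consume(ToySeg *seg, ToyProcState *st)
{
	if (!st->hasMessages)
		return;					/* fast path: resetState is never inspected */

	st->hasMessages = false;
	if (st->resetState)
	{
		st->resetState = false;
		st->flushCount++;		/* stands in for a full cache flush */
	}
	st->nextMsgNum = seg->maxMsgNum;
}

/* Returns how many flushes a fully-caught-up backend performs after a reset. */
static int
toy_flushes_after_reset(bool setHasMessages)
{
	ToySeg		seg = {.minMsgNum = 10, .maxMsgNum = 10, .lastBackend = 3};

	seg.procState[0] = (ToyProcState) {.procPid = 100, .nextMsgNum = 10};
	seg.procState[1] = (ToyProcState) {.procPid = 0};
	seg.procState[2] = (ToyProcState) {.procPid = 102, .sendOnly = true};

	toy_reset_all(&seg, setHasMessages);

	/* Unused and send-only slots are skipped. */
	assert(!seg.procState[1].resetState && !seg.procState[2].resetState);

	toy_consume(&seg, &seg.procState[0]);
	return seg.procState[0].flushCount;
}
```

In this model, `toy_flushes_after_reset(true)` yields one flush, while `toy_flushes_after_reset(false)` yields none: the caught-up backend takes the fast path and never notices that it was reset.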

src/include/storage/sinvaladt.h

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ extern void BackendIdGetTransactionIds(int backendID, TransactionId *xid, Transa
 extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
 extern int SIGetDataEntries(SharedInvalidationMessage *data, int datasize);
 extern void SICleanupQueue(bool callerHasWriteLock, int minFree);
+extern void SIResetAll(void);
 
 extern LocalTransactionId GetNextLocalTransactionId(void);
