
Commit 67f30c7

At end of recovery, reset all sinval-managed caches.

An inplace update's invalidation messages are part of its transaction's commit record. However, the update survives even if its transaction aborts or we stop recovery before replaying its transaction commit. After recovery, a backend that started in recovery could update the row without incorporating the inplace update. That could result in a table with an index, yet relhasindex=f. That is a source of index corruption. This bulk invalidation avoids the functional consequences. A future change can fix the !RecoveryInProgress() scenario without changing the WAL format.

Back-patch to v17 - v12 (all supported versions). v18 will instead add invalidations to WAL.

Discussion: https://postgr.es/m/20240618152349.7f.nmisch@google.com
1 parent 3e5ea47 commit 67f30c7
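
The failure mode described in the commit message can be illustrated with a small, single-process toy model. This is only a sketch under stated assumptions: `ToyClassRow`, `ToyBackendCache`, and every `toy_*` function are hypothetical names invented for illustration, not PostgreSQL code, and `reltuples` is an `int` here purely for simplicity. The model shows a backend whose cached row copy predates a replayed inplace update writing that stale copy back, and how discarding the cache at end of recovery avoids the lost `relhasindex`.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a pg_class row (hypothetical, heavily simplified). */
typedef struct ToyClassRow
{
	bool		relhasindex;
	int			reltuples;
} ToyClassRow;

/* Hypothetical backend-local cache entry; not the real relcache/catcache. */
typedef struct ToyBackendCache
{
	bool		valid;
	ToyClassRow copy;
} ToyBackendCache;

/* Reload the cached copy from the shared row only if the entry is invalid. */
void
toy_cache_load(ToyBackendCache *cache, const ToyClassRow *shared)
{
	if (!cache->valid)
	{
		cache->copy = *shared;
		cache->valid = true;
	}
}

/* Inplace update: overwrite the shared row, delivering no invalidation. */
void
toy_inplace_set_hasindex(ToyClassRow *shared)
{
	shared->relhasindex = true;
}

/* Regular update: build the new row from the cached copy and write it whole. */
void
toy_update_reltuples(ToyBackendCache *cache, ToyClassRow *shared, int reltuples)
{
	toy_cache_load(cache, shared);
	assert(cache->valid);
	cache->copy.reltuples = reltuples;
	*shared = cache->copy;
}

/* Run the scenario and return the final relhasindex of the shared row. */
bool
toy_scenario(bool reset_at_end_of_recovery)
{
	ToyClassRow shared = {false, 0};
	ToyBackendCache cache = {false, {false, 0}};

	toy_cache_load(&cache, &shared);	/* backend starts during recovery */
	toy_inplace_set_hasindex(&shared);	/* replayed inplace update */
	if (reset_at_end_of_recovery)
		cache.valid = false;			/* modeled effect of the bulk reset */
	toy_update_reltuples(&cache, &shared, 100); /* first write after recovery */
	return shared.relhasindex;
}
```

In this model, `toy_scenario(false)` ends with `relhasindex` false even though the inplace update set it, while `toy_scenario(true)` preserves it, because the backend rebuilds its copy before writing.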

3 files changed: +67 -0 lines changed

src/backend/access/transam/xlog.c

Lines changed: 25 additions & 0 deletions
@@ -68,6 +68,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "storage/reinit.h"
+#include "storage/sinvaladt.h"
 #include "storage/smgr.h"
 #include "storage/spin.h"
 #include "storage/sync.h"
@@ -7866,6 +7867,30 @@ StartupXLOG(void)
 		CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
 	}
 
+	/*
+	 * Invalidate all sinval-managed caches before READ WRITE transactions
+	 * begin.  The xl_heap_inplace WAL record doesn't store sufficient data
+	 * for invalidations.  The commit record, if any, has the invalidations.
+	 * However, the inplace update is permanent, whether or not we reach a
+	 * commit record.  Fortunately, read-only transactions tolerate caches not
+	 * reflecting the latest inplace updates.  Read-only transactions
+	 * experience the notable inplace updates as follows:
+	 *
+	 * - relhasindex=true affects readers only after the CREATE INDEX
+	 * transaction commit makes an index fully available to them.
+	 *
+	 * - datconnlimit=DATCONNLIMIT_INVALID_DB affects readers only at
+	 * InitPostgres() time, and that read does not use a cache.
+	 *
+	 * - relfrozenxid, datfrozenxid, relminmxid, and datminmxid have no effect
+	 * on readers.
+	 *
+	 * Hence, hot standby queries (all READ ONLY) function correctly without
+	 * the missing invalidations.  This avoided changing the WAL format in
+	 * back branches.
+	 */
+	SIResetAll();
+
 	/*
 	 * Preallocate additional log files, if wanted.
 	 */

src/backend/storage/ipc/sinvaladt.c

Lines changed: 41 additions & 0 deletions
@@ -750,6 +750,47 @@ SICleanupQueue(bool callerHasWriteLock, int minFree)
 	}
 }
 
+/*
+ * SIResetAll
+ *		Mark all active backends as "reset"
+ *
+ * Use this when we don't know what needs to be invalidated.  It's a
+ * cluster-wide InvalidateSystemCaches().  This was a back-branch-only remedy
+ * to avoid a WAL format change.
+ *
+ * The implementation is like SICleanupQueue(false, MAXNUMMESSAGES + 1), with
+ * one addition.  SICleanupQueue() assumes minFree << MAXNUMMESSAGES, so it
+ * assumes hasMessages==true for any backend it resets.  We're resetting even
+ * fully-caught-up backends, so we set hasMessages.
+ */
+void
+SIResetAll(void)
+{
+	SISeg	   *segP = shmInvalBuffer;
+	int			i;
+
+	LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
+	LWLockAcquire(SInvalReadLock, LW_EXCLUSIVE);
+
+	for (i = 0; i < segP->lastBackend; i++)
+	{
+		ProcState  *stateP = &segP->procState[i];
+
+		if (stateP->procPid == 0 || stateP->sendOnly)
+			continue;
+
+		/* Consuming the reset will update "nextMsgNum" and "signaled". */
+		stateP->resetState = true;
+		stateP->hasMessages = true;
+	}
+
+	segP->minMsgNum = segP->maxMsgNum;
+	segP->nextThreshold = CLEANUP_MIN;
+
+	LWLockRelease(SInvalReadLock);
+	LWLockRelease(SInvalWriteLock);
+}
+
 
 /*
  * GetNextLocalTransactionId --- allocate a new LocalTransactionId
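
The header comment of SIResetAll() explains why hasMessages must be set explicitly. A lock-free, single-threaded toy model may make that point concrete. Everything below is a sketch under assumptions: the `Toy*` types and `toy_*` functions are hypothetical, locking and `nextThreshold` are omitted, and `toy_consume()` only approximates how a reader is assumed to react to `resetState` (return -1 and catch up), which this diff does not itself show.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BACKENDS 4

/* Simplified per-backend state; field names follow the diff's ProcState. */
typedef struct ToyProcState
{
	int			procPid;		/* 0 means the slot is unused */
	int			nextMsgNum;
	bool		resetState;
	bool		signaled;
	bool		hasMessages;
	bool		sendOnly;
} ToyProcState;

typedef struct ToySeg
{
	int			minMsgNum;
	int			maxMsgNum;
	int			lastBackend;
	ToyProcState procState[TOY_MAX_BACKENDS];
} ToySeg;

/* Model of SIResetAll(); set_has_messages toggles the "one addition". */
void
toy_reset_all(ToySeg *seg, bool set_has_messages)
{
	assert(seg->lastBackend <= TOY_MAX_BACKENDS);
	for (int i = 0; i < seg->lastBackend; i++)
	{
		ToyProcState *st = &seg->procState[i];

		if (st->procPid == 0 || st->sendOnly)
			continue;
		st->resetState = true;
		if (set_has_messages)
			st->hasMessages = true;
	}
	seg->minMsgNum = seg->maxMsgNum;
}

/* Assumed reader behavior: -1 means "reset, rebuild caches"; 0 means no work. */
int
toy_consume(ToySeg *seg, ToyProcState *st)
{
	if (!st->hasMessages)
		return 0;				/* fast path: never looks at resetState */
	st->hasMessages = false;
	if (st->resetState)
	{
		st->nextMsgNum = seg->maxMsgNum;
		st->resetState = false;
		st->signaled = false;
		return -1;
	}
	st->nextMsgNum = seg->maxMsgNum;	/* message copying omitted */
	return 0;
}

/* One fully-caught-up backend: what does it see after the reset? */
int
toy_caught_up_backend_result(bool set_has_messages)
{
	ToySeg		seg = {.minMsgNum = 5, .maxMsgNum = 10, .lastBackend = 2};

	seg.procState[0] = (ToyProcState) {.procPid = 1234, .nextMsgNum = 10};
	seg.procState[1] = (ToyProcState) {.procPid = 0};
	toy_reset_all(&seg, set_has_messages);
	return toy_consume(&seg, &seg.procState[0]);
}
```

In this model a caught-up backend notices the reset only when `hasMessages` is set alongside `resetState`; otherwise its fast path returns before the reset flag is examined.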

src/include/storage/sinvaladt.h

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ extern void BackendIdGetTransactionIds(int backendID, TransactionId *xid, Transa
 extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
 extern int	SIGetDataEntries(SharedInvalidationMessage *data, int datasize);
 extern void SICleanupQueue(bool callerHasWriteLock, int minFree);
+extern void SIResetAll(void);
 
 extern LocalTransactionId GetNextLocalTransactionId(void);
 