Commit 7747a76

Fix latent(?) race condition in LockReleaseAll.
We have for a long time checked the head pointer of each of the backend's proclock lists and skipped acquiring the corresponding locktable partition lock if the head pointer was NULL. This was safe enough in the days when proclock lists were changed only by the owning backend, but it is pretty questionable now that the fast-path patch added cases where backends add entries to other backends' proclock lists. However, we don't really wish to revert to locking each partition lock every time, because in simple transactions that would add a lot of useless lock/unlock cycles on already-heavily-contended LWLocks.

Fortunately, the only way that another backend could be modifying our proclock list at this point would be if it was promoting a formerly fast-path lock of ours; and any such lock must be one that we'd decided not to delete in the previous loop over the locallock table. So it's okay if we miss seeing it in this loop; we'd just decide not to delete it again.

However, once we've detected a non-empty list, we'd better re-fetch the list head pointer after acquiring the partition lock. This guards against possibly fetching a corrupt-but-non-null pointer if pointer fetch/store isn't atomic. It's not clear if any practical architectures are like that, but we've never assumed that before and don't wish to start here. In any case, the situation certainly deserves a code comment.

While at it, refactor the partition traversal loop to use a for() construct instead of a while() loop with goto's.

Back-patch, just in case the risk is real and not hypothetical.
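
The "peek without the lock, then re-fetch under the lock" idea described above can be sketched in isolation. This is only an illustration, not the PostgreSQL code: `Node`, `Partition`, and `release_partition` are hypothetical names, a pthread mutex stands in for the partition LWLock, and a singly linked list stands in for the SHM_QUEUE.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a PROCLOCK entry and one lock partition. */
typedef struct Node
{
    struct Node *next;
    int          releasable;    /* nonzero if this entry should be removed */
} Node;

typedef struct Partition
{
    pthread_mutex_t lock;       /* stands in for the partition LWLock */
    Node           *head;       /* other threads may add entries under lock */
} Partition;

/* Unlink every releasable entry; return how many were removed. */
static int
release_partition(Partition *p)
{
    int    removed = 0;
    Node **link;

    /*
     * Unlocked peek, used only to decide whether to take the lock at all.
     * The value read here is never dereferenced. (Strictly conforming C11
     * code would use an atomic load; a plain load mirrors the original.)
     */
    if (p->head == NULL)
        return 0;

    pthread_mutex_lock(&p->lock);

    /* Re-fetch the head under the lock instead of reusing the peeked value. */
    link = &p->head;
    while (*link != NULL)
    {
        Node *cur = *link;

        assert(cur != cur->next);   /* sanity check: no self-loop */
        if (cur->releasable)
        {
            *link = cur->next;      /* unlink cur */
            cur->next = NULL;
            removed++;
        }
        else
            link = &cur->next;
    }

    pthread_mutex_unlock(&p->lock);
    return removed;
}
```

The point of the sketch is that the unlocked read only gates the decision to lock; every pointer that is actually followed is read while the lock is held.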
1 parent 4c71f48 commit 7747a76

1 file changed

+46
-24
lines changed
  • src/backend/storage/lmgr

src/backend/storage/lmgr/lock.c

Lines changed: 46 additions & 24 deletions
@@ -2099,19 +2099,39 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
     {
         LWLockId partitionLock = FirstLockMgrLock + partition;
         SHM_QUEUE *procLocks = &(MyProc->myProcLocks[partition]);
+        PROCLOCK *nextplock;
 
-        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
-                                             offsetof(PROCLOCK, procLink));
-
-        if (!proclock)
+        /*
+         * If the proclock list for this partition is empty, we can skip
+         * acquiring the partition lock. This optimization is trickier than
+         * it looks, because another backend could be in process of adding
+         * something to our proclock list due to promoting one of our
+         * fast-path locks. However, any such lock must be one that we
+         * decided not to delete above, so it's okay to skip it again now;
+         * we'd just decide not to delete it again. We must, however, be
+         * careful to re-fetch the list header once we've acquired the
+         * partition lock, to be sure we have a valid, up-to-date pointer.
+         * (There is probably no significant risk if pointer fetch/store is
+         * atomic, but we don't wish to assume that.)
+         *
+         * XXX This argument assumes that the locallock table correctly
+         * represents all of our fast-path locks. While allLocks mode
+         * guarantees to clean up all of our normal locks regardless of the
+         * locallock situation, we lose that guarantee for fast-path locks.
+         * This is not ideal.
+         */
+        if (SHMQueueNext(procLocks, procLocks,
+                         offsetof(PROCLOCK, procLink)) == NULL)
             continue; /* needn't examine this partition */
 
         LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
-        while (proclock)
+        for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+                                                  offsetof(PROCLOCK, procLink));
+             proclock;
+             proclock = nextplock)
         {
             bool wakeupNeeded = false;
-            PROCLOCK *nextplock;
 
             /* Get link first, since we may unlink/delete this proclock */
             nextplock = (PROCLOCK *)
@@ -2124,7 +2144,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
 
             /* Ignore items that are not of the lockmethod to be removed */
             if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
-                goto next_item;
+                continue;
 
             /*
              * In allLocks mode, force release of all locks even if locallock
@@ -2140,7 +2160,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
              * holdMask == 0 and are therefore recyclable
              */
             if (proclock->releaseMask == 0 && proclock->holdMask != 0)
-                goto next_item;
+                continue;
 
             PROCLOCK_PRINT("LockReleaseAll", proclock);
             LOCK_PRINT("LockReleaseAll", lock, 0);
@@ -2169,9 +2189,6 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
                         lockMethodTable,
                         LockTagHashCode(&lock->tag),
                         wakeupNeeded);
-
-        next_item:
-            proclock = nextplock;
         } /* loop over PROCLOCKs within this partition */
 
         LWLockRelease(partitionLock);
@@ -3143,19 +3160,27 @@ PostPrepare_Locks(TransactionId xid)
     {
         LWLockId partitionLock = FirstLockMgrLock + partition;
         SHM_QUEUE *procLocks = &(MyProc->myProcLocks[partition]);
+        PROCLOCK *nextplock;
 
-        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
-                                             offsetof(PROCLOCK, procLink));
-
-        if (!proclock)
+        /*
+         * If the proclock list for this partition is empty, we can skip
+         * acquiring the partition lock. This optimization is safer than the
+         * situation in LockReleaseAll, because we got rid of any fast-path
+         * locks during AtPrepare_Locks, so there cannot be any case where
+         * another backend is adding something to our lists now. For safety,
+         * though, we code this the same way as in LockReleaseAll.
+         */
+        if (SHMQueueNext(procLocks, procLocks,
+                         offsetof(PROCLOCK, procLink)) == NULL)
             continue; /* needn't examine this partition */
 
         LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
-        while (proclock)
+        for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+                                                  offsetof(PROCLOCK, procLink));
+             proclock;
+             proclock = nextplock)
         {
-            PROCLOCK *nextplock;
-
             /* Get link first, since we may unlink/relink this proclock */
             nextplock = (PROCLOCK *)
                 SHMQueueNext(procLocks, &proclock->procLink,
@@ -3167,7 +3192,7 @@ PostPrepare_Locks(TransactionId xid)
 
             /* Ignore VXID locks */
             if (lock->tag.locktag_type == LOCKTAG_VIRTUALTRANSACTION)
-                goto next_item;
+                continue;
 
             PROCLOCK_PRINT("PostPrepare_Locks", proclock);
             LOCK_PRINT("PostPrepare_Locks", lock, 0);
@@ -3178,7 +3203,7 @@ PostPrepare_Locks(TransactionId xid)
 
             /* Ignore it if nothing to release (must be a session lock) */
             if (proclock->releaseMask == 0)
-                goto next_item;
+                continue;
 
             /* Else we should be releasing all locks */
             if (proclock->releaseMask != proclock->holdMask)
@@ -3220,9 +3245,6 @@ PostPrepare_Locks(TransactionId xid)
                                  &proclock->procLink);
 
             PROCLOCK_PRINT("PostPrepare_Locks: updated", proclock);
-
-        next_item:
-            proclock = nextplock;
         } /* loop over PROCLOCKs within this partition */
 
         LWLockRelease(partitionLock);
@@ -3919,7 +3941,7 @@ VirtualXactLockTableInsert(VirtualTransactionId vxid)
  * unblocking waiters.
  */
 void
-VirtualXactLockTableCleanup()
+VirtualXactLockTableCleanup(void)
 {
     bool fastpath;
     LocalTransactionId lxid;
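
The traversal refactoring applied in LockReleaseAll and PostPrepare_Locks (a for() loop whose step expression uses a saved next pointer, with `continue` replacing `goto next_item`) can be illustrated outside of lock.c. The following is a minimal sketch under assumed names: `Item` and `process_matching` are hypothetical, and clearing the link merely simulates the unlink/delete step.

```c
#include <stddef.h>

/* Hypothetical list element; "kind" plays the role of the lock method. */
typedef struct Item
{
    struct Item *next;
    int          kind;
} Item;

/* Process (and detach) every item of the given kind; return the count. */
static int
process_matching(Item *head, int kind)
{
    int   processed = 0;
    Item *item;
    Item *next;     /* declared outside the loop: the step expression needs it */

    for (item = head; item != NULL; item = next)
    {
        /* Get link first, since we may clobber it below. */
        next = item->next;

        /* Skip items of other kinds; previously "goto next_item". */
        if (item->kind != kind)
            continue;

        item->next = NULL;  /* simulate unlinking/deleting the item */
        processed++;
    }

    return processed;
}
```

Because `continue` in a for() loop still runs the step expression, every skip path advances to the saved `next` without needing a label at the bottom of the loop body.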
