Commit 9457508
Fix latent(?) race condition in LockReleaseAll.
We have for a long time checked the head pointer of each of the backend's proclock lists and skipped acquiring the corresponding locktable partition lock if the head pointer was NULL. This was safe enough in the days when proclock lists were changed only by the owning backend, but it is pretty questionable now that the fast-path patch added cases where backends add entries to other backends' proclock lists. However, we don't really wish to revert to locking each partition lock every time, because in simple transactions that would add a lot of useless lock/unlock cycles on already-heavily-contended LWLocks.

Fortunately, the only way that another backend could be modifying our proclock list at this point would be if it was promoting a formerly fast-path lock of ours; and any such lock must be one that we'd decided not to delete in the previous loop over the locallock table. So it's okay if we miss seeing it in this loop; we'd just decide not to delete it again. However, once we've detected a non-empty list, we'd better re-fetch the list head pointer after acquiring the partition lock. This guards against possibly fetching a corrupt-but-non-null pointer if pointer fetch/store isn't atomic. It's not clear if any practical architectures are like that, but we've never assumed that before and don't wish to start here. In any case, the situation certainly deserves a code comment.

While at it, refactor the partition traversal loop to use a for() construct instead of a while() loop with goto's.

Back-patch, just in case the risk is real and not hypothetical.
1 parent 62e69cb commit 9457508
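The commit message describes a pattern worth seeing in isolation: peek at a shared list head without holding its lock, skip the work if the list looks empty, but re-fetch the head after acquiring the lock rather than trusting the unlocked read. Below is a minimal sketch of that pattern using a plain pthread mutex; it is not PostgreSQL's LWLock API, and all names (`Node`, `sum_and_clear`, `partition_lock`) are invented for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Invented stand-in for a proclock list entry; not a PostgreSQL type. */
typedef struct Node
{
    struct Node *next;
    int          payload;
} Node;

static pthread_mutex_t partition_lock = PTHREAD_MUTEX_INITIALIZER;
static Node *list_head = NULL;  /* shared; normally guarded by partition_lock */

/* Sum and empty the list, skipping the lock when the list looks empty. */
static int
sum_and_clear(void)
{
    int sum = 0;

    /*
     * Unlocked peek: cheap skip for the common empty case.  This is only
     * safe if a concurrently-added entry may legitimately be ignored, as
     * the commit message argues for promoted fast-path locks.
     */
    if (list_head == NULL)
        return 0;               /* needn't examine this partition */

    pthread_mutex_lock(&partition_lock);

    /* Re-fetch the head under the lock; the unlocked read may be stale. */
    for (Node *n = list_head; n != NULL; n = n->next)
        sum += n->payload;
    list_head = NULL;

    pthread_mutex_unlock(&partition_lock);
    return sum;
}
```

The key point mirrors the patch: the result of the unlocked read is used only as a hint that there may be work to do; every pointer actually dereferenced is fetched again while the lock is held.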

File tree

  • src/backend/storage/lmgr

1 file changed: +46 −23 lines changed

src/backend/storage/lmgr/lock.c

Lines changed: 46 additions & 23 deletions
@@ -2070,19 +2070,39 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
     {
         LWLockId    partitionLock = FirstLockMgrLock + partition;
         SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+        PROCLOCK   *nextplock;
 
-        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
-                                             offsetof(PROCLOCK, procLink));
-
-        if (!proclock)
+        /*
+         * If the proclock list for this partition is empty, we can skip
+         * acquiring the partition lock.  This optimization is trickier than
+         * it looks, because another backend could be in process of adding
+         * something to our proclock list due to promoting one of our
+         * fast-path locks.  However, any such lock must be one that we
+         * decided not to delete above, so it's okay to skip it again now;
+         * we'd just decide not to delete it again.  We must, however, be
+         * careful to re-fetch the list header once we've acquired the
+         * partition lock, to be sure we have a valid, up-to-date pointer.
+         * (There is probably no significant risk if pointer fetch/store is
+         * atomic, but we don't wish to assume that.)
+         *
+         * XXX This argument assumes that the locallock table correctly
+         * represents all of our fast-path locks.  While allLocks mode
+         * guarantees to clean up all of our normal locks regardless of the
+         * locallock situation, we lose that guarantee for fast-path locks.
+         * This is not ideal.
+         */
+        if (SHMQueueNext(procLocks, procLocks,
+                         offsetof(PROCLOCK, procLink)) == NULL)
             continue;           /* needn't examine this partition */
 
         LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
-        while (proclock)
+        for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+                                           offsetof(PROCLOCK, procLink));
+             proclock;
+             proclock = nextplock)
         {
             bool        wakeupNeeded = false;
-            PROCLOCK   *nextplock;
 
             /* Get link first, since we may unlink/delete this proclock */
             nextplock = (PROCLOCK *)
@@ -2095,7 +2115,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
 
             /* Ignore items that are not of the lockmethod to be removed */
             if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
-                goto next_item;
+                continue;
 
             /*
              * In allLocks mode, force release of all locks even if locallock
@@ -2111,7 +2131,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
              * holdMask == 0 and are therefore recyclable
              */
             if (proclock->releaseMask == 0 && proclock->holdMask != 0)
-                goto next_item;
+                continue;
 
             PROCLOCK_PRINT("LockReleaseAll", proclock);
             LOCK_PRINT("LockReleaseAll", lock, 0);
@@ -2140,9 +2160,6 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
                                 lockMethodTable,
                                 LockTagHashCode(&lock->tag),
                                 wakeupNeeded);
-
-    next_item:
-            proclock = nextplock;
         }                       /* loop over PROCLOCKs within this partition */
 
         LWLockRelease(partitionLock);
@@ -3074,18 +3091,27 @@ PostPrepare_Locks(TransactionId xid)
     {
         LWLockId    partitionLock = FirstLockMgrLock + partition;
         SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+        PROCLOCK   *nextplock;
 
-        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
-                                             offsetof(PROCLOCK, procLink));
-
-        if (!proclock)
+        /*
+         * If the proclock list for this partition is empty, we can skip
+         * acquiring the partition lock.  This optimization is safer than the
+         * situation in LockReleaseAll, because we got rid of any fast-path
+         * locks during AtPrepare_Locks, so there cannot be any case where
+         * another backend is adding something to our lists now.  For safety,
+         * though, we code this the same way as in LockReleaseAll.
+         */
+        if (SHMQueueNext(procLocks, procLocks,
+                         offsetof(PROCLOCK, procLink)) == NULL)
             continue;           /* needn't examine this partition */
 
         LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
-        while (proclock)
+        for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+                                           offsetof(PROCLOCK, procLink));
+             proclock;
+             proclock = nextplock)
         {
-            PROCLOCK   *nextplock;
             LOCKMASK    holdMask;
             PROCLOCK   *newproclock;
 
@@ -3100,7 +3126,7 @@ PostPrepare_Locks(TransactionId xid)
 
             /* Ignore VXID locks */
             if (lock->tag.locktag_type == LOCKTAG_VIRTUALTRANSACTION)
-                goto next_item;
+                continue;
 
             PROCLOCK_PRINT("PostPrepare_Locks", proclock);
             LOCK_PRINT("PostPrepare_Locks", lock, 0);
@@ -3111,7 +3137,7 @@ PostPrepare_Locks(TransactionId xid)
 
             /* Ignore it if nothing to release (must be a session lock) */
             if (proclock->releaseMask == 0)
-                goto next_item;
+                continue;
 
             /* Else we should be releasing all locks */
             if (proclock->releaseMask != proclock->holdMask)
@@ -3175,9 +3201,6 @@ PostPrepare_Locks(TransactionId xid)
              */
             Assert((newproclock->holdMask & holdMask) == 0);
             newproclock->holdMask |= holdMask;
-
-    next_item:
-            proclock = nextplock;
         }                       /* loop over PROCLOCKs within this partition */
 
         LWLockRelease(partitionLock);
@@ -3874,7 +3897,7 @@ VirtualXactLockTableInsert(VirtualTransactionId vxid)
  * unblocking waiters.
  */
 void
-VirtualXactLockTableCleanup()
+VirtualXactLockTableCleanup(void)
 {
     bool        fastpath;
     LocalTransactionId lxid;
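The goto-elimination in the diff above follows a standard shape: a while() loop that advanced via `goto next_item` becomes a for() whose increment uses a next pointer saved at the top of the body, so a plain `continue` both skips an item and survives deletion of the current node. Here is a self-contained sketch of that shape on an ordinary singly linked list; the names (`Item`, `prune`) are invented, and this is not the lock-table code itself.

```c
#include <stdlib.h>

/* Invented list node; stands in for PROCLOCK in this sketch. */
typedef struct Item
{
    struct Item *next;
    int          keep;
} Item;

/* Free every item with keep == 0; return how many items survive. */
static int
prune(Item **head)
{
    Item   *next;
    int     kept = 0;

    for (Item **link = head, *item = *head; item != NULL; item = next)
    {
        /* Get link first, since we may unlink/free this item. */
        next = item->next;

        if (item->keep)
        {
            link = &item->next;
            kept++;
            continue;           /* formerly: goto next_item */
        }

        *link = next;           /* unlink... */
        free(item);             /* ...and delete */
    }
    return kept;
}
```

Because `next` is captured before the body can free `item`, `continue` is safe even on an iteration that deletes the current node, which is exactly the property the `for (...; proclock = nextplock)` rewrite relies on.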
