@@ -228,14 +228,9 @@ a specialized hash function (see proclock_hash).
* Formerly, each PGPROC had a single list of PROCLOCKs belonging to it.
This has now been split into per-partition lists, so that access to a
particular PROCLOCK list can be protected by the associated partition's
- LWLock. (This is not strictly necessary at the moment, because at this
- writing a PGPROC's PROCLOCK list is only accessed by the owning backend
- anyway. But it seems forward-looking to maintain a convention for how
- other backends could access it. In any case LockReleaseAll needs to be
- able to quickly determine which partition each LOCK belongs to, and
- for the currently contemplated number of partitions, this way takes less
- shared memory than explicitly storing a partition number in LOCK structs
- would require.)
+ LWLock. (This rule allows one backend to manipulate another backend's
+ PROCLOCK lists, which was not originally necessary but is now required in
+ connection with fast-path locking; see below.)

* The other lock-related fields of a PGPROC are only interesting when
the PGPROC is waiting for a lock, so we consider that they are protected
@@ -292,20 +287,20 @@ To alleviate this bottleneck, beginning in PostgreSQL 9.2, each backend is
permitted to record a limited number of locks on unshared relations in an
array within its PGPROC structure, rather than using the primary lock table.
This mechanism can only be used when the locker can verify that no conflicting
- locks can possibly exist.
+ locks exist at the time of taking the lock.

A key point of this algorithm is that it must be possible to verify the
absence of possibly conflicting locks without fighting over a shared LWLock or
spinlock. Otherwise, this effort would simply move the contention bottleneck
from one place to another. We accomplish this using an array of 1024 integer
- counters, which are in effect a 1024-way partitioning of the lock space. Each
- counter records the number of "strong" locks (that is, ShareLock,
+ counters, which are in effect a 1024-way partitioning of the lock space.
+ Each counter records the number of "strong" locks (that is, ShareLock,
ShareRowExclusiveLock, ExclusiveLock, and AccessExclusiveLock) on unshared
relations that fall into that partition. When this counter is non-zero, the
- fast path mechanism may not be used for relation locks in that partition. A
- strong locker bumps the counter and then scans each per-backend array for
- matching fast-path locks; any which are found must be transferred to the
- primary lock table before attempting to acquire the lock, to ensure proper
+ fast path mechanism may not be used to take new relation locks within that
+ partition. A strong locker bumps the counter and then scans each per-backend
+ array for matching fast-path locks; any which are found must be transferred to
+ the primary lock table before attempting to acquire the lock, to ensure proper
lock conflict and deadlock detection.

On an SMP system, we must guarantee proper memory synchronization. Here we
@@ -314,19 +309,19 @@ A performs a store, A and B both acquire an LWLock in either order, and B
then performs a load on the same memory location, it is guaranteed to see
A's store. In this case, each backend's fast-path lock queue is protected
by an LWLock. A backend wishing to acquire a fast-path lock grabs this
- LWLock before examining FastPathStrongRelationLocks to check for the presence of
- a conflicting strong lock. And the backend attempting to acquire a strong
+ LWLock before examining FastPathStrongRelationLocks to check for the presence
+ of a conflicting strong lock. And the backend attempting to acquire a strong
lock, because it must transfer any matching weak locks taken via the fast-path
- mechanism to the shared lock table, will acquire every LWLock protecting
- a backend fast-path queue in turn. So, if we examine FastPathStrongRelationLocks
- and see a zero, then either the value is truly zero, or if it is a stale value,
- the strong locker has yet to acquire the per-backend LWLock we now hold (or,
- indeed, even the first per-backend LWLock) and will notice any weak lock we
- take when it does.
+ mechanism to the shared lock table, will acquire every LWLock protecting a
+ backend fast-path queue in turn. So, if we examine
+ FastPathStrongRelationLocks and see a zero, then either the value is truly
+ zero, or if it is a stale value, the strong locker has yet to acquire the
+ per-backend LWLock we now hold (or, indeed, even the first per-backend LWLock)
+ and will notice any weak lock we take when it does.
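
(Illustration only, not part of the patch.) The interplay between the
per-partition strong-lock counters and the per-backend LWLocks can be sketched
as a small Python model. Everything named here (Backend, LockManager,
fp_lock, counter_mutex, main_table) is hypothetical; threading.Lock merely
stands in for an LWLock, and conflict checking and waiting are left out
entirely. It is a sketch under those assumptions, not the real implementation.

```python
import threading

FP_PARTITIONS = 1024  # number of strong-lock counters, per the text above


class Backend:
    """Hypothetical stand-in for a PGPROC with a fast-path lock array."""

    def __init__(self, name):
        self.name = name
        self.fp_lock = threading.Lock()  # stands in for the per-backend LWLock
        self.fast_path = set()           # (relid, mode) pairs taken via fast path


class LockManager:
    """Toy model; lock conflict detection and waiting are omitted."""

    def __init__(self, backends):
        self.backends = backends
        self.strong_counts = [0] * FP_PARTITIONS  # models FastPathStrongRelationLocks
        self.counter_mutex = threading.Lock()     # assumed guard for the counters
        self.main_table = {}                      # relid -> [(owner, mode), ...]
        self.main_lock = threading.Lock()         # assumed guard for main_table

    def _add_main(self, owner, relid, mode):
        with self.main_lock:
            self.main_table.setdefault(relid, []).append((owner, mode))

    def acquire_weak(self, backend, relid, mode):
        # Check the counter only while holding our own fast-path lock, so a
        # strong locker that bumped the counter later will still find our entry.
        with backend.fp_lock:
            if self.strong_counts[relid % FP_PARTITIONS] == 0:
                backend.fast_path.add((relid, mode))
                return "fast-path"
        self._add_main(backend.name, relid, mode)
        return "main"

    def acquire_strong(self, backend, relid, mode):
        # Bump the counter first: no new fast-path locks in this partition.
        with self.counter_mutex:
            self.strong_counts[relid % FP_PARTITIONS] += 1
        # Visit every backend in turn and transfer matching fast-path locks.
        for other in self.backends:
            with other.fp_lock:
                for entry in [e for e in other.fast_path if e[0] == relid]:
                    other.fast_path.remove(entry)
                    self._add_main(other.name, relid, entry[1])
        self._add_main(backend.name, relid, mode)
```

In this model, a weak lock taken before any strong lock lands in the owning
backend's fast-path set; a later acquire_strong on the same relation moves it
into main_table, and further weak locks in that partition bypass the fast path.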

Fast-path VXID locks do not use the FastPathStrongRelationLocks table. The
- first lock taken on a VXID is always the ExclusiveLock taken by its owner. Any
- subsequent lockers are share lockers waiting for the VXID to terminate.
+ first lock taken on a VXID is always the ExclusiveLock taken by its owner.
+ Any subsequent lockers are share lockers waiting for the VXID to terminate.
Indeed, the only reason VXID locks use the lock manager at all (rather than
waiting for the VXID to terminate via some other method) is for deadlock
detection. Thus, the initial VXID lock can *always* be taken via the fast
@@ -335,6 +330,10 @@ whether the lock has been transferred to the main lock table, and if not,
do so. The backend owning the VXID must be careful to clean up any entry
made in the main lock table at end of transaction.

+ Deadlock detection does not need to examine the fast-path data structures,
+ because any lock that could possibly be involved in a deadlock must have
+ been transferred to the main tables beforehand.
+

The Deadlock Detection Algorithm
--------------------------------
@@ -376,7 +375,7 @@ inserted in the wait queue just ahead of the first such waiter. (If we
did not make this check, the deadlock detection code would adjust the
queue order to resolve the conflict, but it's relatively cheap to make
the check in ProcSleep and avoid a deadlock timeout delay in this case.)
- Note special case when inserting before the end of the queue: if the
+ Note special case when inserting before the end of the queue: if the
process's request does not conflict with any existing lock nor any
waiting request before its insertion point, then go ahead and grant the
lock without waiting.
@@ -414,7 +413,7 @@ need to kill all the transactions involved.
indicates a deadlock, but one that does not involve our starting
process. We ignore this condition on the grounds that resolving such a
deadlock is the responsibility of the processes involved --- killing our
- start- point process would not resolve the deadlock. So, cases 1 and 3
+ start-point process would not resolve the deadlock. So, cases 1 and 3
both report "no deadlock".

Postgres' situation is a little more complex than the standard discussion
@@ -620,7 +619,7 @@ level is AccessExclusiveLock.
Regular backends are only allowed to take locks on relations or objects
at RowExclusiveLock or lower. This ensures that they do not conflict with
each other or with the Startup process, unless AccessExclusiveLocks are
- requested by one of the backends.
+ requested by the Startup process.

Deadlocks involving AccessExclusiveLocks are not possible, so we need
not be concerned that a user initiated deadlock can prevent recovery from
@@ -632,3 +631,9 @@ of transaction just as they are in normal processing. These locks are
held by the Startup process, acting as a proxy for the backends that
originally acquired these locks. Again, these locks cannot conflict with
one another, so the Startup process cannot deadlock itself either.
+
+ Although deadlock is not possible, a regular backend's weak lock can
+ prevent the Startup process from making progress in applying WAL, which is
+ usually not something that should be tolerated for very long. Mechanisms
+ exist to forcibly cancel a regular backend's query if it blocks the
+ Startup process for too long.
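
(Illustration only, not part of the patch.) The claim that locks at
RowExclusiveLock or lower never conflict with each other, yet each can block
an AccessExclusiveLock, can be checked against the standard table-level
conflict rules. The Python sketch below covers only the modes relevant to this
paragraph; the names CONFLICTS and modes_conflict are hypothetical, and the
table is transcribed from the documented conflict matrix rather than taken
from the lock manager's own data structures.

```python
# Conflicts for the three weakest table-level lock modes, restricted to what
# this discussion needs.  AccessExclusiveLock conflicts with every mode.
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "RowShareLock": {"ExclusiveLock", "AccessExclusiveLock"},
    "RowExclusiveLock": {
        "ShareLock",
        "ShareRowExclusiveLock",
        "ExclusiveLock",
        "AccessExclusiveLock",
    },
}

# Modes a regular backend may take during recovery, per the text above.
RECOVERY_BACKEND_MODES = list(CONFLICTS)


def modes_conflict(held, requested):
    """Return True if a lock in mode `held` blocks a request in `requested`."""
    if "AccessExclusiveLock" in (held, requested):
        return True
    return requested in CONFLICTS.get(held, set())
```

Under these rules no pair drawn from RECOVERY_BACKEND_MODES conflicts, while
every one of them conflicts with an AccessExclusiveLock requested by the
Startup process, which is the case the cancellation mechanisms address.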