Skip to content

Commit d8d5354

Browse files
committed
Make XactLockTableWait work for transactions that are not yet self-locked
XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas
1 parent 22f5e89 commit d8d5354

File tree

1 file changed

+28
-2
lines changed
  • src/backend/storage/lmgr

1 file changed

+28
-2
lines changed

src/backend/storage/lmgr/lmgr.c

Lines changed: 28 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -522,6 +522,7 @@ XactLockTableWait(TransactionId xid, Relation rel, ItemPointer ctid,
522522
LOCKTAG tag;
523523
XactLockTableWaitInfo info;
524524
ErrorContextCallback callback;
525+
bool first = true;
525526

526527
/*
527528
* If an operation is specified, set up our verbose error context
@@ -555,7 +556,26 @@ XactLockTableWait(TransactionId xid, Relation rel, ItemPointer ctid,
555556

556557
if (!TransactionIdIsInProgress(xid))
557558
break;
558-
xid = SubTransGetParent(xid);
559+
560+
/*
561+
* If the Xid belonged to a subtransaction, then the lock would have
562+
* gone away as soon as it was finished; for correct tuple visibility,
563+
* the right action is to wait on its parent transaction to go away.
564+
* But instead of going levels up one by one, we can just wait for the
565+
* topmost transaction to finish with the same end result, which also
566+
* incurs less locktable traffic.
567+
*
568+
* Some uses of this function don't involve tuple visibility -- such
569+
* as when building snapshots for logical decoding. It is possible to
570+
* see a transaction in ProcArray before it registers itself in the
571+
* locktable. The topmost transaction in that case is the same xid,
572+
* so we try again after a short sleep. (Don't sleep the first time
573+
* through, to avoid slowing down the normal case.)
574+
*/
575+
if (!first)
576+
pg_usleep(1000L);
577+
first = false;
578+
xid = SubTransGetTopmostTransaction(xid);
559579
}
560580

561581
if (oper != XLTW_None)
@@ -572,6 +592,7 @@ bool
572592
ConditionalXactLockTableWait(TransactionId xid)
573593
{
574594
LOCKTAG tag;
595+
bool first = true;
575596

576597
for (;;)
577598
{
@@ -587,7 +608,12 @@ ConditionalXactLockTableWait(TransactionId xid)
587608

588609
if (!TransactionIdIsInProgress(xid))
589610
break;
590-
xid = SubTransGetParent(xid);
611+
612+
/* See XactLockTableWait about this case */
613+
if (!first)
614+
pg_usleep(1000L);
615+
first = false;
616+
xid = SubTransGetTopmostTransaction(xid);
591617
}
592618

593619
return true;

0 commit comments

Comments
 (0)