Commit 75c452a

Block interrupts during HandleParallelMessages().
As noted by Alvaro, there are CHECK_FOR_INTERRUPTS() calls in the shm_mq.c functions called by HandleParallelMessages(). I believe they're all unreachable since we always pass nowait = true, but it doesn't seem like a great idea to assume that no such call will ever be reachable from HandleParallelMessages(). If that did happen, there would be a risk of a recursive call to HandleParallelMessages(), which it does not appear to be designed for --- for example, there's nothing that would prevent out-of-order processing of received messages. And certainly such cases cannot easily be tested. So let's prevent it by holding off interrupts for the duration of the function.

Back-patch to 9.5 which contains identical code.

Discussion: <14869.1470083848@sss.pgh.pa.us>
1 parent 93ac14e commit 75c452a

1 file changed: +16 -5 lines changed
src/backend/access/transam/parallel.c

Lines changed: 16 additions & 5 deletions
@@ -628,14 +628,21 @@ HandleParallelMessages(void)
 {
     dlist_iter iter;
 
+    /*
+     * This is invoked from ProcessInterrupts(), and since some of the
+     * functions it calls contain CHECK_FOR_INTERRUPTS(), there is a potential
+     * for recursive calls if more signals are received while this runs. It's
+     * unclear that recursive entry would be safe, and it doesn't seem useful
+     * even if it is safe, so let's block interrupts until done.
+     */
+    HOLD_INTERRUPTS();
+
     ParallelMessagePending = false;
 
     dlist_foreach(iter, &pcxt_list)
     {
         ParallelContext *pcxt;
         int i;
-        Size nbytes;
-        void *data;
 
         pcxt = dlist_container(ParallelContext, node, iter.cur);
         if (pcxt->worker == NULL)
@@ -645,13 +652,15 @@ HandleParallelMessages(void)
         {
             /*
              * Read as many messages as we can from each worker, but stop when
-             * either (1) the error queue goes away, which can happen if we
-             * receive a Terminate message from the worker; or (2) no more
-             * messages can be read from the worker without blocking.
+             * either (1) the worker's error queue goes away, which can happen
+             * if we receive a Terminate message from the worker; or (2) no
+             * more messages can be read from the worker without blocking.
              */
             while (pcxt->worker[i].error_mqh != NULL)
             {
                 shm_mq_result res;
+                Size nbytes;
+                void *data;
 
                 res = shm_mq_receive(pcxt->worker[i].error_mqh, &nbytes,
                                      &data, true);
@@ -673,6 +682,8 @@ HandleParallelMessages(void)
             }
         }
     }
+
+    RESUME_INTERRUPTS();
 }
 
 /*
