Commit 3bdef6d

Server-side fix for delayed NOTIFY and SIGTERM processing.
Commit 4f85fde introduced some code that was meant to ensure that we'd process cancel, die, sinval catchup, and notify interrupts while waiting for client input. But there was a flaw: it supposed that the process latch would be set upon arrival at secure_read() if any such interrupt was pending. In reality, we might well have cleared the process latch at some earlier point while those flags remained set -- particularly notifyInterruptPending, which can't be handled as long as we're within a transaction.

To fix the NOTIFY case, also attempt to process signals (except ProcDiePending) before trying to read.

Also, if we see that ProcDiePending is set before we read, forcibly set the process latch to ensure that we will handle that signal promptly if no data is available. I also made it set the process latch on the way out, in case there is similar logic elsewhere. (It remains true that we won't service ProcDiePending here unless we need to wait for input.)

The code for handling ProcDiePending during a write needs those changes, too.

Also be a little more careful about when to reset whereToSendOutput, and improve related comments.

Back-patch to 9.5 where this code was added. I'm not entirely convinced that older branches don't have similar issues, but the complaint at hand is just about the >= 9.5 code.

Jeff Janes and Tom Lane

Discussion: https://postgr.es/m/CAOYf6ec-TmRYjKBXLLaGaB-jrd=mjG1Hzn1a1wufUAR39PQYhw@mail.gmail.com
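The lost-wakeup problem described above can be illustrated with a small standalone model. This is only a sketch, not backend code: `ToyBackend`, `toy_read_interrupt()` and `toy_secure_read()` are hypothetical names, the struct fields merely stand in for MyLatch, ProcDiePending, notifyInterruptPending and DoingCommandRead, and the "wait" is reduced to checking whether the latch is set. The `check_before_read` flag switches between the old behavior and the added pre-read check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins for the backend's interrupt state. */
typedef struct ToyBackend
{
	bool latch_set;				/* models MyLatch being set */
	bool die_pending;			/* models ProcDiePending */
	bool notify_pending;		/* models notifyInterruptPending */
	bool doing_command_read;	/* models DoingCommandRead */
	bool died;					/* set once the die request is serviced */
	int  notifies_delivered;	/* counts serviced notify interrupts */
} ToyBackend;

typedef enum
{
	TOY_READ_OK,
	TOY_READ_DIED,
	TOY_READ_SLEEPING
} ToyReadResult;

/* Follows the shape of the patched ProcessClientReadInterrupt(). */
static void
toy_read_interrupt(ToyBackend *b, bool blocked)
{
	if (b->doing_command_read)
	{
		/* Stands in for CHECK_FOR_INTERRUPTS(): service a pending die. */
		if (b->die_pending)
		{
			b->die_pending = false;
			b->died = true;
			return;
		}
		if (b->notify_pending)
		{
			b->notify_pending = false;
			b->notifies_delivered++;
		}
	}
	else if (b->die_pending)
	{
		if (blocked)
		{
			b->die_pending = false;
			b->died = true;
		}
		else
			b->latch_set = true;	/* make sure a later wait wakes up */
	}
}

/* One pass through a simplified secure_read(). */
static ToyReadResult
toy_secure_read(ToyBackend *b, bool data_available, bool check_before_read)
{
	assert(b != NULL);

	if (check_before_read)
	{
		toy_read_interrupt(b, false);
		if (b->died)
			return TOY_READ_DIED;
	}

	if (!data_available)
	{
		/* Only a set latch wakes the wait; otherwise we sleep for input. */
		if (!b->latch_set)
			return TOY_READ_SLEEPING;
		b->latch_set = false;
		toy_read_interrupt(b, true);
		return b->died ? TOY_READ_DIED : TOY_READ_SLEEPING;
	}

	toy_read_interrupt(b, false);
	return b->died ? TOY_READ_DIED : TOY_READ_OK;
}
```

In this model, a backend with a pending notify and an already-cleared latch goes to sleep without delivering anything when `check_before_read` is false, and delivers the notify before sleeping when it is true. A pending die outside a command read behaves similarly: the pre-read call sets the latch, so the wait returns and the die request is serviced.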
1 parent 11359db commit 3bdef6d

2 files changed: +62 -35 lines changed

src/backend/libpq/be-secure.c

Lines changed: 12 additions & 8 deletions
@@ -140,6 +140,9 @@ secure_read(Port *port, void *ptr, size_t len)
 	ssize_t n;
 	int waitfor;
 
+	/* Deal with any already-pending interrupt condition. */
+	ProcessClientReadInterrupt(false);
+
 retry:
 #ifdef USE_SSL
 	waitfor = 0;
@@ -204,9 +207,8 @@ secure_read(Port *port, void *ptr, size_t len)
 	}
 
 	/*
-	 * Process interrupts that happened while (or before) receiving. Note that
-	 * we signal that we're not blocking, which will prevent some types of
-	 * interrupts from being processed.
+	 * Process interrupts that happened during a successful (or non-blocking,
+	 * or hard-failed) read.
 	 */
 	ProcessClientReadInterrupt(false);
 
@@ -243,6 +245,9 @@ secure_write(Port *port, void *ptr, size_t len)
 	ssize_t n;
 	int waitfor;
 
+	/* Deal with any already-pending interrupt condition. */
+	ProcessClientWriteInterrupt(false);
+
 retry:
 	waitfor = 0;
 #ifdef USE_SSL
@@ -282,17 +287,16 @@ secure_write(Port *port, void *ptr, size_t len)
 
 			/*
 			 * We'll retry the write. Most likely it will return immediately
-			 * because there's still no data available, and we'll wait for the
-			 * socket to become ready again.
+			 * because there's still no buffer space available, and we'll wait
+			 * for the socket to become ready again.
 			 */
 		}
 		goto retry;
 	}
 
 	/*
-	 * Process interrupts that happened while (or before) sending. Note that
-	 * we signal that we're not blocking, which will prevent some types of
-	 * interrupts from being processed.
+	 * Process interrupts that happened during a successful (or non-blocking,
+	 * or hard-failed) write.
 	 */
 	ProcessClientWriteInterrupt(false);
 
src/backend/tcop/postgres.c

Lines changed: 50 additions & 27 deletions
@@ -302,7 +302,7 @@ interactive_getc(void)
 
 	c = getc(stdin);
 
-	ProcessClientReadInterrupt(true);
+	ProcessClientReadInterrupt(false);
 
 	return c;
 }
@@ -507,8 +507,9 @@ ReadCommand(StringInfo inBuf)
 /*
  * ProcessClientReadInterrupt() - Process interrupts specific to client reads
  *
- * This is called just after low-level reads. That might be after the read
- * finished successfully, or it was interrupted via interrupt.
+ * This is called just before and after low-level reads.
+ * 'blocked' is true if no data was available to read and we plan to retry,
+ * false if about to read or done reading.
  *
  * Must preserve errno!
  */
@@ -519,23 +520,31 @@ ProcessClientReadInterrupt(bool blocked)
 
 	if (DoingCommandRead)
 	{
-		/* Check for general interrupts that arrived while reading */
+		/* Check for general interrupts that arrived before/while reading */
 		CHECK_FOR_INTERRUPTS();
 
-		/* Process sinval catchup interrupts that happened while reading */
+		/* Process sinval catchup interrupts, if any */
 		if (catchupInterruptPending)
 			ProcessCatchupInterrupt();
 
-		/* Process sinval catchup interrupts that happened while reading */
+		/* Process notify interrupts, if any */
 		if (notifyInterruptPending)
 			ProcessNotifyInterrupt();
 	}
-	else if (ProcDiePending && blocked)
+	else if (ProcDiePending)
 	{
 		/*
-		 * We're dying. It's safe (and sane) to handle that now.
+		 * We're dying. If there is no data available to read, then it's safe
+		 * (and sane) to handle that now. If we haven't tried to read yet,
+		 * make sure the process latch is set, so that if there is no data
+		 * then we'll come back here and die. If we're done reading, also
+		 * make sure the process latch is set, as we might've undesirably
+		 * cleared it while reading.
 		 */
-		CHECK_FOR_INTERRUPTS();
+		if (blocked)
+			CHECK_FOR_INTERRUPTS();
+		else
+			SetLatch(MyLatch);
 	}
 
 	errno = save_errno;
@@ -544,9 +553,9 @@ ProcessClientReadInterrupt(bool blocked)
 /*
  * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
  *
- * This is called just after low-level writes. That might be after the read
- * finished successfully, or it was interrupted via interrupt. 'blocked' tells
- * us whether the
+ * This is called just before and after low-level writes.
+ * 'blocked' is true if no data could be written and we plan to retry,
+ * false if about to write or done writing.
  *
  * Must preserve errno!
  */
@@ -555,25 +564,39 @@ ProcessClientWriteInterrupt(bool blocked)
 {
 	int save_errno = errno;
 
-	/*
-	 * We only want to process the interrupt here if socket writes are
-	 * blocking to increase the chance to get an error message to the client.
-	 * If we're not blocked there'll soon be a CHECK_FOR_INTERRUPTS(). But if
-	 * we're blocked we'll never get out of that situation if the client has
-	 * died.
-	 */
-	if (ProcDiePending && blocked)
+	if (ProcDiePending)
 	{
 		/*
-		 * We're dying. It's safe (and sane) to handle that now. But we don't
-		 * want to send the client the error message as that a) would possibly
-		 * block again b) would possibly lead to sending an error message to
-		 * the client, while we already started to send something else.
+		 * We're dying. If it's not possible to write, then we should handle
+		 * that immediately, else a stuck client could indefinitely delay our
+		 * response to the signal. If we haven't tried to write yet, make
+		 * sure the process latch is set, so that if the write would block
+		 * then we'll come back here and die. If we're done writing, also
+		 * make sure the process latch is set, as we might've undesirably
+		 * cleared it while writing.
 		 */
-		if (whereToSendOutput == DestRemote)
-			whereToSendOutput = DestNone;
+		if (blocked)
+		{
+			/*
+			 * Don't mess with whereToSendOutput if ProcessInterrupts wouldn't
+			 * do anything.
+			 */
+			if (InterruptHoldoffCount == 0 && CritSectionCount == 0)
+			{
+				/*
+				 * We don't want to send the client the error message, as a)
+				 * that would possibly block again, and b) it would likely
+				 * lead to loss of protocol sync because we may have already
+				 * sent a partial protocol message.
+				 */
+				if (whereToSendOutput == DestRemote)
+					whereToSendOutput = DestNone;
 
-		CHECK_FOR_INTERRUPTS();
+				CHECK_FOR_INTERRUPTS();
+			}
+		}
+		else
+			SetLatch(MyLatch);
 	}
 
 	errno = save_errno;
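The three outcomes of the patched write-side logic (die now, leave everything alone, or just re-arm the latch) can be checked with a small standalone model. This is a sketch under stated assumptions, not the backend code: `ToyWriter`, `ToyDest` and `toy_write_interrupt()` are hypothetical names, and servicing the die request is reduced to setting a flag where the real code calls CHECK_FOR_INTERRUPTS().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for DestNone / DestRemote. */
typedef enum
{
	TOY_DEST_NONE,
	TOY_DEST_REMOTE
} ToyDest;

/* Toy model only: mirrors the state the patched function consults. */
typedef struct ToyWriter
{
	bool	latch_set;			/* models MyLatch being set */
	bool	die_pending;		/* models ProcDiePending */
	int		holdoff_count;		/* models InterruptHoldoffCount */
	int		crit_section_count;	/* models CritSectionCount */
	ToyDest	dest;				/* models whereToSendOutput */
	bool	died;				/* set once the die request is serviced */
} ToyWriter;

/* Follows the shape of the patched ProcessClientWriteInterrupt(). */
static void
toy_write_interrupt(ToyWriter *w, bool blocked)
{
	assert(w != NULL);

	if (!w->die_pending)
		return;

	if (blocked)
	{
		/* Touch the output destination only if the die can be serviced. */
		if (w->holdoff_count == 0 && w->crit_section_count == 0)
		{
			if (w->dest == TOY_DEST_REMOTE)
				w->dest = TOY_DEST_NONE;
			w->die_pending = false;
			w->died = true;
		}
	}
	else
		w->latch_set = true;	/* re-arm so a later blocked write wakes up */
}
```

With interrupts held off, a blocked call leaves the destination untouched and the die request pending. With no holdoff, the destination is reset before the request is serviced. A non-blocked call only sets the latch.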
