Skip to content

Commit 4f85fde

Browse files
committed
Introduce and use infrastructure for interrupt processing during client reads.
Up to now large swathes of backend code ran inside signal handlers while reading commands from the client, to allow for speedy reaction to asynchronous events. Most prominently shared invalidation and NOTIFY handling. That means that complex code like the starting/stopping of transactions is run in signal handlers... The required code was fragile and verbose, and is likely to contain bugs. That approach also severely limited what could be done while communicating with the client. As the read might be from within openssl it wasn't safely possible to trigger an error, e.g. to cancel a backend in idle-in-transaction state. We did that in some cases, namely fatal errors, nonetheless. Now that FE/BE communication in the backend employs non-blocking sockets and latches to block, we can quite simply interrupt reads from signal handlers by setting the latch. That allows us to signal an interrupted read, which is supposed to be retried after returning from within the ssl library. As signal handlers now only need to set the latch to guarantee timely interrupt processing, remove a fair amount of complicated & fragile code from async.c and sinval.c. We could now actually start to process some kinds of interrupts, like sinval ones, more often that before, but that seems better done separately. This work will hopefully allow to handle cases like being blocked by sending data, interrupting idle transactions and similar to be implemented without too much effort. In addition to allowing getting rid of ImmediateInterruptOK, that is. Author: Andres Freund Reviewed-By: Heikki Linnakangas
1 parent 387da18 commit 4f85fde

File tree

9 files changed

+194
-434
lines changed

9 files changed

+194
-434
lines changed

src/backend/commands/async.c

Lines changed: 36 additions & 163 deletions
Original file line numberDiff line numberDiff line change
@@ -79,10 +79,12 @@
7979
* either, but just process the queue directly.
8080
*
8181
* 5. Upon receipt of a PROCSIG_NOTIFY_INTERRUPT signal, the signal handler
82-
* can call inbound-notify processing immediately if this backend is idle
83-
* (ie, it is waiting for a frontend command and is not within a transaction
84-
* block). Otherwise the handler may only set a flag, which will cause the
85-
* processing to occur just before we next go idle.
82+
* sets the process's latch, which triggers the event to be processed
83+
* immediately if this backend is idle (i.e., it is waiting for a frontend
84+
* command and is not within a transaction block. C.f.
85+
* ProcessClientReadInterrupt()). Otherwise the handler may only set a
86+
* flag, which will cause the processing to occur just before we next go
87+
* idle.
8688
*
8789
* Inbound-notify processing consists of reading all of the notifications
8890
* that have arrived since scanning last time. We read every notification
@@ -126,6 +128,7 @@
126128
#include "miscadmin.h"
127129
#include "storage/ipc.h"
128130
#include "storage/lmgr.h"
131+
#include "storage/proc.h"
129132
#include "storage/procarray.h"
130133
#include "storage/procsignal.h"
131134
#include "storage/sinval.h"
@@ -334,17 +337,13 @@ static List *pendingNotifies = NIL; /* list of Notifications */
334337
static List *upperPendingNotifies = NIL; /* list of upper-xact lists */
335338

336339
/*
337-
* State for inbound notifications consists of two flags: one saying whether
338-
* the signal handler is currently allowed to call ProcessIncomingNotify
339-
* directly, and one saying whether the signal has occurred but the handler
340-
* was not allowed to call ProcessIncomingNotify at the time.
341-
*
342-
* NB: the "volatile" on these declarations is critical! If your compiler
343-
* does not grok "volatile", you'd be best advised to compile this file
344-
* with all optimization turned off.
340+
* Inbound notifications are initially processed by HandleNotifyInterrupt(),
341+
* called from inside a signal handler. That just sets the
342+
* notifyInterruptPending flag and sets the process
343+
* latch. ProcessNotifyInterrupt() will then be called whenever it's safe to
344+
* actually deal with the interrupt.
345345
*/
346-
static volatile sig_atomic_t notifyInterruptEnabled = 0;
347-
static volatile sig_atomic_t notifyInterruptOccurred = 0;
346+
volatile sig_atomic_t notifyInterruptPending = false;
348347

349348
/* True if we've registered an on_shmem_exit cleanup */
350349
static bool unlistenExitRegistered = false;
@@ -1625,164 +1624,45 @@ AtSubAbort_Notify(void)
16251624
/*
16261625
* HandleNotifyInterrupt
16271626
*
1628-
* This is called when PROCSIG_NOTIFY_INTERRUPT is received.
1629-
*
1630-
* If we are idle (notifyInterruptEnabled is set), we can safely invoke
1631-
* ProcessIncomingNotify directly. Otherwise, just set a flag
1632-
* to do it later.
1627+
* Signal handler portion of interrupt handling. Let the backend know
1628+
* that there's a pending notify interrupt. If we're currently reading
1629+
* from the client, this will interrupt the read and
1630+
* ProcessClientReadInterrupt() will call ProcessNotifyInterrupt().
16331631
*/
16341632
void
16351633
HandleNotifyInterrupt(void)
16361634
{
16371635
/*
16381636
* Note: this is called by a SIGNAL HANDLER. You must be very wary what
1639-
* you do here. Some helpful soul had this routine sprinkled with
1640-
* TPRINTFs, which would likely lead to corruption of stdio buffers if
1641-
* they were ever turned on.
1637+
* you do here.
16421638
*/
16431639

1644-
/* Don't joggle the elbow of proc_exit */
1645-
if (proc_exit_inprogress)
1646-
return;
1647-
1648-
if (notifyInterruptEnabled)
1649-
{
1650-
bool save_ImmediateInterruptOK = ImmediateInterruptOK;
1651-
1652-
/*
1653-
* We may be called while ImmediateInterruptOK is true; turn it off
1654-
* while messing with the NOTIFY state. This prevents problems if
1655-
* SIGINT or similar arrives while we're working. Just to be real
1656-
* sure, bump the interrupt holdoff counter as well. That way, even
1657-
* if something inside ProcessIncomingNotify() transiently sets
1658-
* ImmediateInterruptOK (eg while waiting on a lock), we won't get
1659-
* interrupted until we're done with the notify interrupt.
1660-
*/
1661-
ImmediateInterruptOK = false;
1662-
HOLD_INTERRUPTS();
1663-
1664-
/*
1665-
* I'm not sure whether some flavors of Unix might allow another
1666-
* SIGUSR1 occurrence to recursively interrupt this routine. To cope
1667-
* with the possibility, we do the same sort of dance that
1668-
* EnableNotifyInterrupt must do --- see that routine for comments.
1669-
*/
1670-
notifyInterruptEnabled = 0; /* disable any recursive signal */
1671-
notifyInterruptOccurred = 1; /* do at least one iteration */
1672-
for (;;)
1673-
{
1674-
notifyInterruptEnabled = 1;
1675-
if (!notifyInterruptOccurred)
1676-
break;
1677-
notifyInterruptEnabled = 0;
1678-
if (notifyInterruptOccurred)
1679-
{
1680-
/* Here, it is finally safe to do stuff. */
1681-
if (Trace_notify)
1682-
elog(DEBUG1, "HandleNotifyInterrupt: perform async notify");
1683-
1684-
ProcessIncomingNotify();
1685-
1686-
if (Trace_notify)
1687-
elog(DEBUG1, "HandleNotifyInterrupt: done");
1688-
}
1689-
}
1640+
/* signal that work needs to be done */
1641+
notifyInterruptPending = true;
16901642

1691-
/*
1692-
* Restore the holdoff level and ImmediateInterruptOK, and check for
1693-
* interrupts if needed.
1694-
*/
1695-
RESUME_INTERRUPTS();
1696-
ImmediateInterruptOK = save_ImmediateInterruptOK;
1697-
if (save_ImmediateInterruptOK)
1698-
CHECK_FOR_INTERRUPTS();
1699-
}
1700-
else
1701-
{
1702-
/*
1703-
* In this path it is NOT SAFE to do much of anything, except this:
1704-
*/
1705-
notifyInterruptOccurred = 1;
1706-
}
1643+
/* make sure the event is processed in due course */
1644+
SetLatch(MyLatch);
17071645
}
17081646

17091647
/*
1710-
* EnableNotifyInterrupt
1711-
*
1712-
* This is called by the PostgresMain main loop just before waiting
1713-
* for a frontend command. If we are truly idle (ie, *not* inside
1714-
* a transaction block), then process any pending inbound notifies,
1715-
* and enable the signal handler to process future notifies directly.
1648+
* ProcessNotifyInterrupt
17161649
*
1717-
* NOTE: the signal handler starts out disabled, and stays so until
1718-
* PostgresMain calls this the first time.
1650+
* This is called just after waiting for a frontend command. If a
1651+
* interrupt arrives (via HandleNotifyInterrupt()) while reading, the
1652+
* read will be interrupted via the process's latch, and this routine
1653+
* will get called. If we are truly idle (ie, *not* inside a transaction
1654+
* block), process the incoming notifies.
17191655
*/
17201656
void
1721-
EnableNotifyInterrupt(void)
1657+
ProcessNotifyInterrupt(void)
17221658
{
17231659
if (IsTransactionOrTransactionBlock())
17241660
return; /* not really idle */
17251661

1726-
/*
1727-
* This code is tricky because we are communicating with a signal handler
1728-
* that could interrupt us at any point. If we just checked
1729-
* notifyInterruptOccurred and then set notifyInterruptEnabled, we could
1730-
* fail to respond promptly to a signal that happens in between those two
1731-
* steps. (A very small time window, perhaps, but Murphy's Law says you
1732-
* can hit it...) Instead, we first set the enable flag, then test the
1733-
* occurred flag. If we see an unserviced interrupt has occurred, we
1734-
* re-clear the enable flag before going off to do the service work. (That
1735-
* prevents re-entrant invocation of ProcessIncomingNotify() if another
1736-
* interrupt occurs.) If an interrupt comes in between the setting and
1737-
* clearing of notifyInterruptEnabled, then it will have done the service
1738-
* work and left notifyInterruptOccurred zero, so we have to check again
1739-
* after clearing enable. The whole thing has to be in a loop in case
1740-
* another interrupt occurs while we're servicing the first. Once we get
1741-
* out of the loop, enable is set and we know there is no unserviced
1742-
* interrupt.
1743-
*
1744-
* NB: an overenthusiastic optimizing compiler could easily break this
1745-
* code. Hopefully, they all understand what "volatile" means these days.
1746-
*/
1747-
for (;;)
1748-
{
1749-
notifyInterruptEnabled = 1;
1750-
if (!notifyInterruptOccurred)
1751-
break;
1752-
notifyInterruptEnabled = 0;
1753-
if (notifyInterruptOccurred)
1754-
{
1755-
if (Trace_notify)
1756-
elog(DEBUG1, "EnableNotifyInterrupt: perform async notify");
1757-
1758-
ProcessIncomingNotify();
1759-
1760-
if (Trace_notify)
1761-
elog(DEBUG1, "EnableNotifyInterrupt: done");
1762-
}
1763-
}
1662+
while (notifyInterruptPending)
1663+
ProcessIncomingNotify();
17641664
}
17651665

1766-
/*
1767-
* DisableNotifyInterrupt
1768-
*
1769-
* This is called by the PostgresMain main loop just after receiving
1770-
* a frontend command. Signal handler execution of inbound notifies
1771-
* is disabled until the next EnableNotifyInterrupt call.
1772-
*
1773-
* The PROCSIG_CATCHUP_INTERRUPT signal handler also needs to call this,
1774-
* so as to prevent conflicts if one signal interrupts the other. So we
1775-
* must return the previous state of the flag.
1776-
*/
1777-
bool
1778-
DisableNotifyInterrupt(void)
1779-
{
1780-
bool result = (notifyInterruptEnabled != 0);
1781-
1782-
notifyInterruptEnabled = 0;
1783-
1784-
return result;
1785-
}
17861666

17871667
/*
17881668
* Read all pending notifications from the queue, and deliver appropriate
@@ -2076,9 +1956,10 @@ asyncQueueAdvanceTail(void)
20761956
/*
20771957
* ProcessIncomingNotify
20781958
*
2079-
* Deal with arriving NOTIFYs from other backends.
2080-
* This is called either directly from the PROCSIG_NOTIFY_INTERRUPT
2081-
* signal handler, or the next time control reaches the outer idle loop.
1959+
* Deal with arriving NOTIFYs from other backends as soon as it's safe to
1960+
* do so. This used to be called from the PROCSIG_NOTIFY_INTERRUPT
1961+
* signal handler, but isn't anymore.
1962+
*
20821963
* Scan the queue for arriving notifications and report them to my front
20831964
* end.
20841965
*
@@ -2087,18 +1968,13 @@ asyncQueueAdvanceTail(void)
20871968
static void
20881969
ProcessIncomingNotify(void)
20891970
{
2090-
bool catchup_enabled;
2091-
20921971
/* We *must* reset the flag */
2093-
notifyInterruptOccurred = 0;
1972+
notifyInterruptPending = false;
20941973

20951974
/* Do nothing else if we aren't actively listening */
20961975
if (listenChannels == NIL)
20971976
return;
20981977

2099-
/* Must prevent catchup interrupt while I am running */
2100-
catchup_enabled = DisableCatchupInterrupt();
2101-
21021978
if (Trace_notify)
21031979
elog(DEBUG1, "ProcessIncomingNotify");
21041980

@@ -2123,9 +1999,6 @@ ProcessIncomingNotify(void)
21231999

21242000
if (Trace_notify)
21252001
elog(DEBUG1, "ProcessIncomingNotify: done");
2126-
2127-
if (catchup_enabled)
2128-
EnableCatchupInterrupt();
21292002
}
21302003

21312004
/*

src/backend/libpq/be-secure-openssl.c

Lines changed: 25 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -511,6 +511,7 @@ be_tls_read(Port *port, void *ptr, size_t len)
511511
ssize_t n;
512512
int err;
513513
int waitfor;
514+
int latchret;
514515

515516
rloop:
516517
errno = 0;
@@ -531,12 +532,29 @@ be_tls_read(Port *port, void *ptr, size_t len)
531532
break;
532533
}
533534

535+
waitfor = WL_LATCH_SET;
536+
534537
if (err == SSL_ERROR_WANT_READ)
535-
waitfor = WL_SOCKET_READABLE;
538+
waitfor |= WL_SOCKET_READABLE;
536539
else
537-
waitfor = WL_SOCKET_WRITEABLE;
540+
waitfor |= WL_SOCKET_WRITEABLE;
538541

539-
WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
542+
latchret = WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
543+
544+
/*
545+
* We'll, among other situations, get here if the low level
546+
* routine doing the actual recv() via the socket got interrupted
547+
* by a signal. That's so we can handle interrupts once outside
548+
* openssl, so we don't jump out from underneath its covers. We
549+
* can check this both, when reading and writing, because even
550+
* when writing that's just openssl's doing, not a 'proper' write
551+
* initiated by postgres.
552+
*/
553+
if (latchret & WL_LATCH_SET)
554+
{
555+
ResetLatch(MyLatch);
556+
ProcessClientReadInterrupt(); /* preserves errno */
557+
}
540558
goto rloop;
541559
case SSL_ERROR_SYSCALL:
542560
/* leave it to caller to ereport the value of errno */
@@ -647,6 +665,10 @@ be_tls_write(Port *port, void *ptr, size_t len)
647665
waitfor = WL_SOCKET_WRITEABLE;
648666

649667
WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0);
668+
/*
669+
* XXX: We'll, at some later point, likely want to add interrupt
670+
* processing here.
671+
*/
650672
goto wloop;
651673
case SSL_ERROR_SYSCALL:
652674
/* leave it to caller to ereport the value of errno */

0 commit comments

Comments
 (0)