Commit 3f29aa4

Use _exit(2) for SIGQUIT during ProcessStartupPacket, too.
Bring the signal handling for startup-packet collection into line with the policy established in commits bedadc732 and 8e19a82, namely don't risk running atexit callbacks when handling SIGQUIT.

Ideally, we'd not do so for SIGTERM or timeout interrupts either, but that change seems a bit too risky for the back branches. For now, just improve the comments in this area to describe the risk.

Also relocate where BackendInitialize re-disables these interrupts, to minimize the code span where they're active. This doesn't buy a whole lot of safety, but it can't hurt.

In passing, rename startup_die() to remove confusion about whether it is for the startup process.

Like the previous commits, back-patch to all supported branches.

Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us

1 parent 9f358c5 commit 3f29aa4
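
The distinction the commit message draws between _exit(2) and an exit path that runs callbacks can be seen in isolation with a small POSIX program. The following is only a sketch and uses no PostgreSQL code: plain exit() stands in for proc_exit(), an atexit() callback stands in for an on_proc_exit callback, and the names cleanup_callback, quit_handler, term_handler and run_child_with_signal are made up for illustration. A forked child receives one signal; the parent reports the child's exit status and whether the callback ran.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the report pipe; only meaningful inside the child. */
static int report_fd = -1;

/* Stand-in for an exit callback: tells the parent that it ran. */
static void
cleanup_callback(void)
{
	if (write(report_fd, "x", 1) < 0)
	{
		/* nothing useful can be done here */
	}
}

/* Crash-style handler: leave at once, skipping all atexit callbacks. */
static void
quit_handler(int signo)
{
	(void) signo;
	_exit(2);
}

/*
 * Orderly-style handler: exit() runs the atexit callbacks.  Calling exit()
 * from a signal handler is not async-signal-safe in general; that is the
 * hazard the commit's comments describe.
 */
static void
term_handler(int signo)
{
	(void) signo;
	exit(1);
}

/*
 * Fork a child, deliver signo to it, and return the child's exit status.
 * *callback_ran is set to 1 if the child's atexit callback executed,
 * else 0.  Returns -1 on setup failure or abnormal child termination.
 */
int
run_child_with_signal(int signo, int *callback_ran)
{
	int			fds[2];
	int			status;
	char		buf;
	pid_t		pid;

	if (pipe(fds) != 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
	{
		close(fds[0]);
		report_fd = fds[1];
		atexit(cleanup_callback);
		signal(SIGQUIT, quit_handler);
		signal(SIGTERM, term_handler);
		raise(signo);
		_exit(99);				/* reached only if no handler exited */
	}
	close(fds[1]);
	/* read() returns 0 (EOF) if the child exited without writing. */
	*callback_ran = (read(fds[0], &buf, 1) == 1);
	close(fds[0]);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_with_signal(SIGQUIT, &ran) and then run_child_with_signal(SIGTERM, &ran) from a main() shows the two policies side by side: the SIGQUIT path skips the callback, the SIGTERM path runs it.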

1 file changed: +51 -32 lines changed

src/backend/postmaster/postmaster.c

Lines changed: 51 additions & 32 deletions
@@ -112,6 +112,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/fork_process.h"
+#include "postmaster/interrupt.h"
 #include "postmaster/pgarch.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/syslogger.h"
@@ -401,7 +402,7 @@ static void SIGHUP_handler(SIGNAL_ARGS);
 static void pmdie(SIGNAL_ARGS);
 static void reaper(SIGNAL_ARGS);
 static void sigusr1_handler(SIGNAL_ARGS);
-static void startup_die(SIGNAL_ARGS);
+static void process_startup_packet_die(SIGNAL_ARGS);
 static void dummy_handler(SIGNAL_ARGS);
 static void StartupPacketTimeoutHandler(void);
 static void CleanupBackend(int pid, int exitstatus);
@@ -4337,22 +4338,30 @@ BackendInitialize(Port *port)
 	whereToSendOutput = DestRemote; /* now safe to ereport to client */
 
 	/*
-	 * We arrange for a simple exit(1) if we receive SIGTERM or SIGQUIT or
-	 * timeout while trying to collect the startup packet. Otherwise the
-	 * postmaster cannot shutdown the database FAST or IMMED cleanly if a
-	 * buggy client fails to send the packet promptly. XXX it follows that
-	 * the remainder of this function must tolerate losing control at any
-	 * instant. Likewise, any pg_on_exit_callback registered before or during
-	 * this function must be prepared to execute at any instant between here
-	 * and the end of this function. Furthermore, affected callbacks execute
-	 * partially or not at all when a second exit-inducing signal arrives
-	 * after proc_exit_prepare() decrements on_proc_exit_index. (Thanks to
-	 * that mechanic, callbacks need not anticipate more than one call.) This
-	 * is fragile; it ought to instead follow the norm of handling interrupts
-	 * at selected, safe opportunities.
-	 */
-	pqsignal(SIGTERM, startup_die);
-	pqsignal(SIGQUIT, startup_die);
+	 * We arrange to do proc_exit(1) if we receive SIGTERM or timeout while
+	 * trying to collect the startup packet; while SIGQUIT results in
+	 * _exit(2). Otherwise the postmaster cannot shutdown the database FAST
+	 * or IMMED cleanly if a buggy client fails to send the packet promptly.
+	 *
+	 * XXX this is pretty dangerous; signal handlers should not call anything
+	 * as complex as proc_exit() directly. We minimize the hazard by not
+	 * keeping these handlers active for longer than we must. However, it
+	 * seems necessary to be able to escape out of DNS lookups as well as the
+	 * startup packet reception proper, so we can't narrow the scope further
+	 * than is done here.
+	 *
+	 * XXX it follows that the remainder of this function must tolerate losing
+	 * control at any instant. Likewise, any pg_on_exit_callback registered
+	 * before or during this function must be prepared to execute at any
+	 * instant between here and the end of this function. Furthermore,
+	 * affected callbacks execute partially or not at all when a second
+	 * exit-inducing signal arrives after proc_exit_prepare() decrements
+	 * on_proc_exit_index. (Thanks to that mechanic, callbacks need not
+	 * anticipate more than one call.) This is fragile; it ought to instead
+	 * follow the norm of handling interrupts at selected, safe opportunities.
+	 */
+	pqsignal(SIGTERM, process_startup_packet_die);
+	pqsignal(SIGQUIT, SignalHandlerForCrashExit);
 	InitializeTimeouts();		/* establishes SIGALRM handler */
 	PG_SETMASK(&StartupBlockSig);
 
@@ -4408,8 +4417,8 @@ BackendInitialize(Port *port)
 		port->remote_hostname = strdup(remote_host);
 
 	/*
-	 * Ready to begin client interaction. We will give up and exit(1) after a
-	 * time delay, so that a broken client can't hog a connection
+	 * Ready to begin client interaction. We will give up and proc_exit(1)
+	 * after a time delay, so that a broken client can't hog a connection
 	 * indefinitely. PreAuthDelay and any DNS interactions above don't count
 	 * against the time limit.
 	 *
@@ -4431,6 +4440,12 @@ BackendInitialize(Port *port)
 	 */
 	status = ProcessStartupPacket(port, false, false);
 
+	/*
+	 * Disable the timeout, and prevent SIGTERM/SIGQUIT again.
+	 */
+	disable_timeout(STARTUP_PACKET_TIMEOUT, false);
+	PG_SETMASK(&BlockSig);
+
 	/*
 	 * Stop here if it was bad or a cancel packet. ProcessStartupPacket
 	 * already did any appropriate error reporting.
@@ -4456,12 +4471,6 @@ BackendInitialize(Port *port)
 	pfree(ps_data.data);
 
 	set_ps_display("initializing");
-
-	/*
-	 * Disable the timeout, and prevent SIGTERM/SIGQUIT again.
-	 */
-	disable_timeout(STARTUP_PACKET_TIMEOUT, false);
-	PG_SETMASK(&BlockSig);
 }
 
 
@@ -5351,16 +5360,22 @@ sigusr1_handler(SIGNAL_ARGS)
 }
 
 /*
- * SIGTERM or SIGQUIT while processing startup packet.
+ * SIGTERM while processing startup packet.
  * Clean up and exit(1).
  *
- * XXX: possible future improvement: try to send a message indicating
- * why we are disconnecting. Problem is to be sure we don't block while
- * doing so, nor mess up SSL initialization. In practice, if the client
- * has wedged here, it probably couldn't do anything with the message anyway.
+ * Running proc_exit() from a signal handler is pretty unsafe, since we
+ * can't know what code we've interrupted. But the alternative of using
+ * _exit(2) is also unpalatable, since it'd mean that a "fast shutdown"
+ * would cause a database crash cycle (forcing WAL replay at restart)
+ * if any sessions are in authentication. So we live with it for now.
+ *
+ * One might be tempted to try to send a message indicating why we are
+ * disconnecting. However, that would make this even more unsafe. Also,
+ * it seems undesirable to provide clues about the database's state to
+ * a client that has not yet completed authentication.
  */
 static void
-startup_die(SIGNAL_ARGS)
+process_startup_packet_die(SIGNAL_ARGS)
 {
 	proc_exit(1);
 }
@@ -5381,7 +5396,11 @@ dummy_handler(SIGNAL_ARGS)
 
 /*
  * Timeout while processing startup packet.
- * As for startup_die(), we clean up and exit(1).
+ * As for process_startup_packet_die(), we clean up and exit(1).
+ *
+ * This is theoretically just as hazardous as in process_startup_packet_die(),
+ * although in practice we're almost certainly waiting for client input,
+ * which greatly reduces the risk.
  */
 static void
 StartupPacketTimeoutHandler(void)
