Skip to content

Commit 6693a96

Browse files
committed
Don't run atexit callbacks during signal exits from ProcessStartupPacket.
Although 58c6fec fixed the case for SIGQUIT, we were still calling proc_exit() from signal handlers for SIGTERM and timeout failures in ProcessStartupPacket. Fortunately, at the point where that code runs, we haven't yet connected to shared memory in any meaningful way, so there is nothing we need to undo in shared memory. This means it should be safe to use _exit(1) here, ie, not run any atexit handlers but also inform the postmaster that it's not a crash exit. To make sure nobody breaks the "nothing to undo" expectation, add a cross-check that no on-shmem-exit or before-shmem-exit handlers have been registered yet when we finish using these signal handlers. This change is simple enough that maybe it could be back-patched, but I won't risk that right now. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
1 parent 6a68a23 commit 6693a96

File tree

3 files changed

+51
-39
lines changed

3 files changed

+51
-39
lines changed

src/backend/postmaster/postmaster.c

Lines changed: 33 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -4298,6 +4298,8 @@ report_fork_failure_to_client(Port *port, int errnum)
42984298
* returns: nothing. Will not return at all if there's any failure.
42994299
*
43004300
* Note: this code does not depend on having any access to shared memory.
4301+
* Indeed, our approach to SIGTERM/timeout handling *requires* that
4302+
* shared memory not have been touched yet; see comments within.
43014303
* In the EXEC_BACKEND case, we are physically attached to shared memory
43024304
* but have not yet set up most of our local pointers to shmem structures.
43034305
*/
@@ -4341,27 +4343,14 @@ BackendInitialize(Port *port)
43414343
whereToSendOutput = DestRemote; /* now safe to ereport to client */
43424344

43434345
/*
4344-
* We arrange to do proc_exit(1) if we receive SIGTERM or timeout while
4345-
* trying to collect the startup packet; while SIGQUIT results in
4346-
* _exit(2). Otherwise the postmaster cannot shutdown the database FAST
4347-
* or IMMED cleanly if a buggy client fails to send the packet promptly.
4346+
* We arrange to do _exit(1) if we receive SIGTERM or timeout while trying
4347+
* to collect the startup packet; while SIGQUIT results in _exit(2).
4348+
* Otherwise the postmaster cannot shutdown the database FAST or IMMED
4349+
* cleanly if a buggy client fails to send the packet promptly.
43484350
*
4349-
* XXX this is pretty dangerous; signal handlers should not call anything
4350-
* as complex as proc_exit() directly. We minimize the hazard by not
4351-
* keeping these handlers active for longer than we must. However, it
4352-
* seems necessary to be able to escape out of DNS lookups as well as the
4353-
* startup packet reception proper, so we can't narrow the scope further
4354-
* than is done here.
4355-
*
4356-
* XXX it follows that the remainder of this function must tolerate losing
4357-
* control at any instant. Likewise, any pg_on_exit_callback registered
4358-
* before or during this function must be prepared to execute at any
4359-
* instant between here and the end of this function. Furthermore,
4360-
* affected callbacks execute partially or not at all when a second
4361-
* exit-inducing signal arrives after proc_exit_prepare() decrements
4362-
* on_proc_exit_index. (Thanks to that mechanic, callbacks need not
4363-
* anticipate more than one call.) This is fragile; it ought to instead
4364-
* follow the norm of handling interrupts at selected, safe opportunities.
4351+
* Exiting with _exit(1) is only possible because we have not yet touched
4352+
* shared memory; therefore no outside-the-process state needs to get
4353+
* cleaned up.
43654354
*/
43664355
pqsignal(SIGTERM, process_startup_packet_die);
43674356
pqsignal(SIGQUIT, SignalHandlerForCrashExit);
@@ -4420,8 +4409,8 @@ BackendInitialize(Port *port)
44204409
port->remote_hostname = strdup(remote_host);
44214410

44224411
/*
4423-
* Ready to begin client interaction. We will give up and proc_exit(1)
4424-
* after a time delay, so that a broken client can't hog a connection
4412+
* Ready to begin client interaction. We will give up and _exit(1) after
4413+
* a time delay, so that a broken client can't hog a connection
44254414
* indefinitely. PreAuthDelay and any DNS interactions above don't count
44264415
* against the time limit.
44274416
*
@@ -4449,6 +4438,17 @@ BackendInitialize(Port *port)
44494438
disable_timeout(STARTUP_PACKET_TIMEOUT, false);
44504439
PG_SETMASK(&BlockSig);
44514440

4441+
/*
4442+
* As a safety check that nothing in startup has yet performed
4443+
* shared-memory modifications that would need to be undone if we had
4444+
* exited through SIGTERM or timeout above, check that no on_shmem_exit
4445+
* handlers have been registered yet. (This isn't terribly bulletproof,
4446+
* since someone might misuse an on_proc_exit handler for shmem cleanup,
4447+
* but it's a cheap and helpful check. We cannot disallow on_proc_exit
4448+
* handlers unfortunately, since pq_init() already registered one.)
4449+
*/
4450+
check_on_shmem_exit_lists_are_empty();
4451+
44524452
/*
44534453
* Stop here if it was bad or a cancel packet. ProcessStartupPacket
44544454
* already did any appropriate error reporting.
@@ -5369,23 +5369,21 @@ sigusr1_handler(SIGNAL_ARGS)
53695369

53705370
/*
53715371
* SIGTERM while processing startup packet.
5372-
* Clean up and exit(1).
53735372
*
5374-
* Running proc_exit() from a signal handler is pretty unsafe, since we
5375-
* can't know what code we've interrupted. But the alternative of using
5376-
* _exit(2) is also unpalatable, since it'd mean that a "fast shutdown"
5377-
* would cause a database crash cycle (forcing WAL replay at restart)
5378-
* if any sessions are in authentication. So we live with it for now.
5373+
* Running proc_exit() from a signal handler would be quite unsafe.
5374+
* However, since we have not yet touched shared memory, we can just
5375+
* pull the plug and exit without running any atexit handlers.
53795376
*
5380-
* One might be tempted to try to send a message indicating why we are
5381-
* disconnecting. However, that would make this even more unsafe. Also,
5382-
* it seems undesirable to provide clues about the database's state to
5383-
* a client that has not yet completed authentication.
5377+
* One might be tempted to try to send a message, or log one, indicating
5378+
* why we are disconnecting. However, that would be quite unsafe in itself.
5379+
* Also, it seems undesirable to provide clues about the database's state
5380+
* to a client that has not yet completed authentication, or even sent us
5381+
* a startup packet.
53845382
*/
53855383
static void
53865384
process_startup_packet_die(SIGNAL_ARGS)
53875385
{
5388-
proc_exit(1);
5386+
_exit(1);
53895387
}
53905388

53915389
/*
@@ -5404,16 +5402,12 @@ dummy_handler(SIGNAL_ARGS)
54045402

54055403
/*
54065404
* Timeout while processing startup packet.
5407-
* As for process_startup_packet_die(), we clean up and exit(1).
5408-
*
5409-
* This is theoretically just as hazardous as in process_startup_packet_die(),
5410-
* although in practice we're almost certainly waiting for client input,
5411-
* which greatly reduces the risk.
5405+
* As for process_startup_packet_die(), we exit via _exit(1).
54125406
*/
54135407
static void
54145408
StartupPacketTimeoutHandler(void)
54155409
{
5416-
proc_exit(1);
5410+
_exit(1);
54175411
}
54185412

54195413

src/backend/storage/ipc/ipc.c

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -416,3 +416,20 @@ on_exit_reset(void)
416416
on_proc_exit_index = 0;
417417
reset_on_dsm_detach();
418418
}
419+
420+
/* ----------------------------------------------------------------
421+
* check_on_shmem_exit_lists_are_empty
422+
*
423+
* Debugging check that no shmem cleanup handlers have been registered
424+
* prematurely in the current process.
425+
* ----------------------------------------------------------------
426+
*/
427+
void
428+
check_on_shmem_exit_lists_are_empty(void)
429+
{
430+
if (before_shmem_exit_index)
431+
elog(FATAL, "before_shmem_exit has been called prematurely");
432+
if (on_shmem_exit_index)
433+
elog(FATAL, "on_shmem_exit has been called prematurely");
434+
/* Checking DSM detach state seems unnecessary given the above */
435+
}

src/include/storage/ipc.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,7 @@ extern void on_shmem_exit(pg_on_exit_callback function, Datum arg);
7272
extern void before_shmem_exit(pg_on_exit_callback function, Datum arg);
7373
extern void cancel_before_shmem_exit(pg_on_exit_callback function, Datum arg);
7474
extern void on_exit_reset(void);
75+
extern void check_on_shmem_exit_lists_are_empty(void);
7576

7677
/* ipci.c */
7778
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;

0 commit comments

Comments
 (0)