Skip to content

Commit 58ce9c7

Browse files
committed
Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd party libraries might have installed unsafe atexit() callbacks. After receiving SIGQUIT, we really just want to exit as quickly as possible, so we don't really want to run the atexit() callbacks anyway. The original report by Jimmy Yih was a self-deadlock in startup_die(). However, this patch doesn't address that scenario; the signal handling while waiting for the startup packet is more complicated. But at least this alleviates similar problems in the SIGQUIT handlers, like that reported by Asim R P later in the same thread. Backpatch to 9.3 (all supported versions). Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
1 parent f5973ac commit 58ce9c7

File tree

6 files changed

+68
-92
lines changed

6 files changed

+68
-92
lines changed

src/backend/postmaster/bgwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -345,27 +345,21 @@ BackgroundWriterMain(void)
345345
static void
346346
bg_quickdie(SIGNAL_ARGS)
347347
{
348-
PG_SETMASK(&BlockSig);
349-
350348
/*
351-
* We DO NOT want to run proc_exit() callbacks -- we're here because
352-
* shared memory may be corrupted, so we don't want to try to clean up our
353-
* transaction. Just nail the windows shut and get out of town. Now that
354-
* there's an atexit callback to prevent third-party code from breaking
355-
* things by calling exit() directly, we have to reset the callbacks
356-
* explicitly to make this work as intended.
357-
*/
358-
on_exit_reset();
359-
360-
/*
361-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
362-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
349+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
350+
* because shared memory may be corrupted, so we don't want to try to
351+
* clean up our transaction. Just nail the windows shut and get out of
352+
* town. The callbacks wouldn't be safe to run from a signal handler,
353+
* anyway.
354+
*
355+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
356+
* a system reset cycle if someone sends a manual SIGQUIT to a random
363357
* backend. This is necessary precisely because we don't clean up our
364358
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
365359
* should ensure the postmaster sees this as a crash, too, but no harm in
366360
* being doubly sure.)
367361
*/
368-
exit(2);
362+
_exit(2);
369363
}
370364

371365
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/checkpointer.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -813,27 +813,21 @@ IsCheckpointOnSchedule(double progress)
813813
static void
814814
chkpt_quickdie(SIGNAL_ARGS)
815815
{
816-
PG_SETMASK(&BlockSig);
817-
818816
/*
819-
* We DO NOT want to run proc_exit() callbacks -- we're here because
820-
* shared memory may be corrupted, so we don't want to try to clean up our
821-
* transaction. Just nail the windows shut and get out of town. Now that
822-
* there's an atexit callback to prevent third-party code from breaking
823-
* things by calling exit() directly, we have to reset the callbacks
824-
* explicitly to make this work as intended.
825-
*/
826-
on_exit_reset();
827-
828-
/*
829-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
830-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
817+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
818+
* because shared memory may be corrupted, so we don't want to try to
819+
* clean up our transaction. Just nail the windows shut and get out of
820+
* town. The callbacks wouldn't be safe to run from a signal handler,
821+
* anyway.
822+
*
823+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
824+
* a system reset cycle if someone sends a manual SIGQUIT to a random
831825
* backend. This is necessary precisely because we don't clean up our
832826
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
833827
* should ensure the postmaster sees this as a crash, too, but no harm in
834828
* being doubly sure.)
835829
*/
836-
exit(2);
830+
_exit(2);
837831
}
838832

839833
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/startup.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -68,27 +68,21 @@ static void StartupProcSigHupHandler(SIGNAL_ARGS);
6868
static void
6969
startupproc_quickdie(SIGNAL_ARGS)
7070
{
71-
PG_SETMASK(&BlockSig);
72-
73-
/*
74-
* We DO NOT want to run proc_exit() callbacks -- we're here because
75-
* shared memory may be corrupted, so we don't want to try to clean up our
76-
* transaction. Just nail the windows shut and get out of town. Now that
77-
* there's an atexit callback to prevent third-party code from breaking
78-
* things by calling exit() directly, we have to reset the callbacks
79-
* explicitly to make this work as intended.
80-
*/
81-
on_exit_reset();
82-
8371
/*
84-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
85-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
72+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
73+
* because shared memory may be corrupted, so we don't want to try to
74+
* clean up our transaction. Just nail the windows shut and get out of
75+
* town. The callbacks wouldn't be safe to run from a signal handler,
76+
* anyway.
77+
*
78+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
79+
* a system reset cycle if someone sends a manual SIGQUIT to a random
8680
* backend. This is necessary precisely because we don't clean up our
8781
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
8882
* should ensure the postmaster sees this as a crash, too, but no harm in
8983
* being doubly sure.)
9084
*/
91-
exit(2);
85+
_exit(2);
9286
}
9387

9488

src/backend/postmaster/walwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -328,27 +328,21 @@ WalWriterMain(void)
328328
static void
329329
wal_quickdie(SIGNAL_ARGS)
330330
{
331-
PG_SETMASK(&BlockSig);
332-
333331
/*
334-
* We DO NOT want to run proc_exit() callbacks -- we're here because
335-
* shared memory may be corrupted, so we don't want to try to clean up our
336-
* transaction. Just nail the windows shut and get out of town. Now that
337-
* there's an atexit callback to prevent third-party code from breaking
338-
* things by calling exit() directly, we have to reset the callbacks
339-
* explicitly to make this work as intended.
340-
*/
341-
on_exit_reset();
342-
343-
/*
344-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
345-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
332+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
333+
* because shared memory may be corrupted, so we don't want to try to
334+
* clean up our transaction. Just nail the windows shut and get out of
335+
* town. The callbacks wouldn't be safe to run from a signal handler,
336+
* anyway.
337+
*
338+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
339+
* a system reset cycle if someone sends a manual SIGQUIT to a random
346340
* backend. This is necessary precisely because we don't clean up our
347341
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
348342
* should ensure the postmaster sees this as a crash, too, but no harm in
349343
* being doubly sure.)
350344
*/
351-
exit(2);
345+
_exit(2);
352346
}
353347

354348
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/replication/walreceiver.c

Lines changed: 13 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -775,27 +775,21 @@ WalRcvShutdownHandler(SIGNAL_ARGS)
775775
static void
776776
WalRcvQuickDieHandler(SIGNAL_ARGS)
777777
{
778-
PG_SETMASK(&BlockSig);
779-
780778
/*
781-
* We DO NOT want to run proc_exit() callbacks -- we're here because
782-
* shared memory may be corrupted, so we don't want to try to clean up our
783-
* transaction. Just nail the windows shut and get out of town. Now that
784-
* there's an atexit callback to prevent third-party code from breaking
785-
* things by calling exit() directly, we have to reset the callbacks
786-
* explicitly to make this work as intended.
787-
*/
788-
on_exit_reset();
789-
790-
/*
791-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
792-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
793-
* backend. This is necessary precisely because we don't clean up our
794-
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
795-
* should ensure the postmaster sees this as a crash, too, but no harm in
796-
* being doubly sure.)
779+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
780+
* because shared memory may be corrupted, so we don't want to try to
781+
* clean up our transaction. Just nail the windows shut and get out of
782+
* town. The callbacks wouldn't be safe to run from a signal handler,
783+
* anyway.
784+
*
785+
* Note we use _exit(2) not _exit(0). This is to force the postmaster
786+
* into a system reset cycle if someone sends a manual SIGQUIT to a
787+
* random backend. This is necessary precisely because we don't clean up
788+
* our shared memory state. (The "dead man switch" mechanism in
789+
* pmsignal.c should ensure the postmaster sees this as a crash, too, but
790+
* no harm in being doubly sure.)
797791
*/
798-
exit(2);
792+
_exit(2);
799793
}
800794

801795
/*

src/backend/tcop/postgres.c

Lines changed: 19 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2568,6 +2568,16 @@ quickdie(SIGNAL_ARGS)
25682568
whereToSendOutput = DestNone;
25692569

25702570
/*
2571+
* Notify the client before exiting, to give a clue on what happened.
2572+
*
2573+
* It's dubious to call ereport() from a signal handler. It is certainly
2574+
* not async-signal safe. But it seems better to try, than to disconnect
2575+
* abruptly and leave the client wondering what happened. It's remotely
2576+
* possible that we crash or hang while trying to send the message, but
2577+
* receiving a SIGQUIT is a sign that something has already gone badly
2578+
* wrong, so there's not much to lose. Assuming the postmaster is still
2579+
* running, it will SIGKILL us soon if we get stuck for some reason.
2580+
*
25712581
* Ideally this should be ereport(FATAL), but then we'd not get control
25722582
* back...
25732583
*/
@@ -2582,24 +2592,20 @@ quickdie(SIGNAL_ARGS)
25822592
" database and repeat your command.")));
25832593

25842594
/*
2585-
* We DO NOT want to run proc_exit() callbacks -- we're here because
2586-
* shared memory may be corrupted, so we don't want to try to clean up our
2587-
* transaction. Just nail the windows shut and get out of town. Now that
2588-
* there's an atexit callback to prevent third-party code from breaking
2589-
* things by calling exit() directly, we have to reset the callbacks
2590-
* explicitly to make this work as intended.
2591-
*/
2592-
on_exit_reset();
2593-
2594-
/*
2595-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
2596-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
2595+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
2596+
* because shared memory may be corrupted, so we don't want to try to
2597+
* clean up our transaction. Just nail the windows shut and get out of
2598+
* town. The callbacks wouldn't be safe to run from a signal handler,
2599+
* anyway.
2600+
*
2601+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
2602+
* a system reset cycle if someone sends a manual SIGQUIT to a random
25972603
* backend. This is necessary precisely because we don't clean up our
25982604
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
25992605
* should ensure the postmaster sees this as a crash, too, but no harm in
26002606
* being doubly sure.)
26012607
*/
2602-
exit(2);
2608+
_exit(2);
26032609
}
26042610

26052611
/*

0 commit comments

Comments
 (0)