Skip to content

Commit 8e19a82

Browse files
committed
Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd party libraries might have installed unsafe atexit() callbacks. After receiving SIGQUIT, we really just want to exit as quickly as possible, so we don't really want to run the atexit() callbacks anyway. The original report by Jimmy Yih was a self-deadlock in startup_die(). However, this patch doesn't address that scenario; the signal handling while waiting for the startup packet is more complicated. But at least this alleviates similar problems in the SIGQUIT handlers, like that reported by Asim R P later in the same thread. Backpatch to 9.3 (all supported versions). Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
1 parent 11e22e4 commit 8e19a82

File tree

7 files changed

+77
-108
lines changed

7 files changed

+77
-108
lines changed

src/backend/postmaster/bgworker.c

Lines changed: 9 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -644,28 +644,21 @@ SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
644644
static void
645645
bgworker_quickdie(SIGNAL_ARGS)
646646
{
647-
sigaddset(&BlockSig, SIGQUIT); /* prevent nested calls */
648-
PG_SETMASK(&BlockSig);
649-
650-
/*
651-
* We DO NOT want to run proc_exit() callbacks -- we're here because
652-
* shared memory may be corrupted, so we don't want to try to clean up our
653-
* transaction. Just nail the windows shut and get out of town. Now that
654-
* there's an atexit callback to prevent third-party code from breaking
655-
* things by calling exit() directly, we have to reset the callbacks
656-
* explicitly to make this work as intended.
657-
*/
658-
on_exit_reset();
659-
660647
/*
661-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
662-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
648+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
649+
* because shared memory may be corrupted, so we don't want to try to
650+
* clean up our transaction. Just nail the windows shut and get out of
651+
* town. The callbacks wouldn't be safe to run from a signal handler,
652+
* anyway.
653+
*
654+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
655+
* a system reset cycle if someone sends a manual SIGQUIT to a random
663656
* backend. This is necessary precisely because we don't clean up our
664657
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
665658
* should ensure the postmaster sees this as a crash, too, but no harm in
666659
* being doubly sure.)
667660
*/
668-
exit(2);
661+
_exit(2);
669662
}
670663

671664
/*

src/backend/postmaster/bgwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -399,27 +399,21 @@ BackgroundWriterMain(void)
399399
static void
400400
bg_quickdie(SIGNAL_ARGS)
401401
{
402-
PG_SETMASK(&BlockSig);
403-
404402
/*
405-
* We DO NOT want to run proc_exit() callbacks -- we're here because
406-
* shared memory may be corrupted, so we don't want to try to clean up our
407-
* transaction. Just nail the windows shut and get out of town. Now that
408-
* there's an atexit callback to prevent third-party code from breaking
409-
* things by calling exit() directly, we have to reset the callbacks
410-
* explicitly to make this work as intended.
411-
*/
412-
on_exit_reset();
413-
414-
/*
415-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
416-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
403+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
404+
* because shared memory may be corrupted, so we don't want to try to
405+
* clean up our transaction. Just nail the windows shut and get out of
406+
* town. The callbacks wouldn't be safe to run from a signal handler,
407+
* anyway.
408+
*
409+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
410+
* a system reset cycle if someone sends a manual SIGQUIT to a random
417411
* backend. This is necessary precisely because we don't clean up our
418412
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
419413
* should ensure the postmaster sees this as a crash, too, but no harm in
420414
* being doubly sure.)
421415
*/
422-
exit(2);
416+
_exit(2);
423417
}
424418

425419
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/checkpointer.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -813,27 +813,21 @@ IsCheckpointOnSchedule(double progress)
813813
static void
814814
chkpt_quickdie(SIGNAL_ARGS)
815815
{
816-
PG_SETMASK(&BlockSig);
817-
818816
/*
819-
* We DO NOT want to run proc_exit() callbacks -- we're here because
820-
* shared memory may be corrupted, so we don't want to try to clean up our
821-
* transaction. Just nail the windows shut and get out of town. Now that
822-
* there's an atexit callback to prevent third-party code from breaking
823-
* things by calling exit() directly, we have to reset the callbacks
824-
* explicitly to make this work as intended.
825-
*/
826-
on_exit_reset();
827-
828-
/*
829-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
830-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
817+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
818+
* because shared memory may be corrupted, so we don't want to try to
819+
* clean up our transaction. Just nail the windows shut and get out of
820+
* town. The callbacks wouldn't be safe to run from a signal handler,
821+
* anyway.
822+
*
823+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
824+
* a system reset cycle if someone sends a manual SIGQUIT to a random
831825
* backend. This is necessary precisely because we don't clean up our
832826
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
833827
* should ensure the postmaster sees this as a crash, too, but no harm in
834828
* being doubly sure.)
835829
*/
836-
exit(2);
830+
_exit(2);
837831
}
838832

839833
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/postmaster/startup.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -69,27 +69,21 @@ static void StartupProcSigHupHandler(SIGNAL_ARGS);
6969
static void
7070
startupproc_quickdie(SIGNAL_ARGS)
7171
{
72-
PG_SETMASK(&BlockSig);
73-
74-
/*
75-
* We DO NOT want to run proc_exit() callbacks -- we're here because
76-
* shared memory may be corrupted, so we don't want to try to clean up our
77-
* transaction. Just nail the windows shut and get out of town. Now that
78-
* there's an atexit callback to prevent third-party code from breaking
79-
* things by calling exit() directly, we have to reset the callbacks
80-
* explicitly to make this work as intended.
81-
*/
82-
on_exit_reset();
83-
8472
/*
85-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
86-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
73+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
74+
* because shared memory may be corrupted, so we don't want to try to
75+
* clean up our transaction. Just nail the windows shut and get out of
76+
* town. The callbacks wouldn't be safe to run from a signal handler,
77+
* anyway.
78+
*
79+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
80+
* a system reset cycle if someone sends a manual SIGQUIT to a random
8781
* backend. This is necessary precisely because we don't clean up our
8882
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
8983
* should ensure the postmaster sees this as a crash, too, but no harm in
9084
* being doubly sure.)
9185
*/
92-
exit(2);
86+
_exit(2);
9387
}
9488

9589

src/backend/postmaster/walwriter.c

Lines changed: 9 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -309,27 +309,21 @@ WalWriterMain(void)
309309
static void
310310
wal_quickdie(SIGNAL_ARGS)
311311
{
312-
PG_SETMASK(&BlockSig);
313-
314312
/*
315-
* We DO NOT want to run proc_exit() callbacks -- we're here because
316-
* shared memory may be corrupted, so we don't want to try to clean up our
317-
* transaction. Just nail the windows shut and get out of town. Now that
318-
* there's an atexit callback to prevent third-party code from breaking
319-
* things by calling exit() directly, we have to reset the callbacks
320-
* explicitly to make this work as intended.
321-
*/
322-
on_exit_reset();
323-
324-
/*
325-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
326-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
313+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
314+
* because shared memory may be corrupted, so we don't want to try to
315+
* clean up our transaction. Just nail the windows shut and get out of
316+
* town. The callbacks wouldn't be safe to run from a signal handler,
317+
* anyway.
318+
*
319+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
320+
* a system reset cycle if someone sends a manual SIGQUIT to a random
327321
* backend. This is necessary precisely because we don't clean up our
328322
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
329323
* should ensure the postmaster sees this as a crash, too, but no harm in
330324
* being doubly sure.)
331325
*/
332-
exit(2);
326+
_exit(2);
333327
}
334328

335329
/* SIGHUP: set flag to re-read config file at next convenient time */

src/backend/replication/walreceiver.c

Lines changed: 13 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -854,27 +854,21 @@ WalRcvShutdownHandler(SIGNAL_ARGS)
854854
static void
855855
WalRcvQuickDieHandler(SIGNAL_ARGS)
856856
{
857-
PG_SETMASK(&BlockSig);
858-
859857
/*
860-
* We DO NOT want to run proc_exit() callbacks -- we're here because
861-
* shared memory may be corrupted, so we don't want to try to clean up our
862-
* transaction. Just nail the windows shut and get out of town. Now that
863-
* there's an atexit callback to prevent third-party code from breaking
864-
* things by calling exit() directly, we have to reset the callbacks
865-
* explicitly to make this work as intended.
866-
*/
867-
on_exit_reset();
868-
869-
/*
870-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
871-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
872-
* backend. This is necessary precisely because we don't clean up our
873-
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
874-
* should ensure the postmaster sees this as a crash, too, but no harm in
875-
* being doubly sure.)
858+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
859+
* because shared memory may be corrupted, so we don't want to try to
860+
* clean up our transaction. Just nail the windows shut and get out of
861+
* town. The callbacks wouldn't be safe to run from a signal handler,
862+
* anyway.
863+
*
864+
* Note we use _exit(2) not _exit(0). This is to force the postmaster
865+
* into a system reset cycle if someone sends a manual SIGQUIT to a
866+
* random backend. This is necessary precisely because we don't clean up
867+
* our shared memory state. (The "dead man switch" mechanism in
868+
* pmsignal.c should ensure the postmaster sees this as a crash, too, but
869+
* no harm in being doubly sure.)
876870
*/
877-
exit(2);
871+
_exit(2);
878872
}
879873

880874
/*

src/backend/tcop/postgres.c

Lines changed: 19 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2616,6 +2616,16 @@ quickdie(SIGNAL_ARGS)
26162616
whereToSendOutput = DestNone;
26172617

26182618
/*
2619+
* Notify the client before exiting, to give a clue on what happened.
2620+
*
2621+
* It's dubious to call ereport() from a signal handler. It is certainly
2622+
* not async-signal safe. But it seems better to try, than to disconnect
2623+
* abruptly and leave the client wondering what happened. It's remotely
2624+
* possible that we crash or hang while trying to send the message, but
2625+
* receiving a SIGQUIT is a sign that something has already gone badly
2626+
* wrong, so there's not much to lose. Assuming the postmaster is still
2627+
* running, it will SIGKILL us soon if we get stuck for some reason.
2628+
*
26192629
* Ideally this should be ereport(FATAL), but then we'd not get control
26202630
* back...
26212631
*/
@@ -2630,24 +2640,20 @@ quickdie(SIGNAL_ARGS)
26302640
" database and repeat your command.")));
26312641

26322642
/*
2633-
* We DO NOT want to run proc_exit() callbacks -- we're here because
2634-
* shared memory may be corrupted, so we don't want to try to clean up our
2635-
* transaction. Just nail the windows shut and get out of town. Now that
2636-
* there's an atexit callback to prevent third-party code from breaking
2637-
* things by calling exit() directly, we have to reset the callbacks
2638-
* explicitly to make this work as intended.
2639-
*/
2640-
on_exit_reset();
2641-
2642-
/*
2643-
* Note we do exit(2) not exit(0). This is to force the postmaster into a
2644-
* system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
2643+
* We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
2644+
* because shared memory may be corrupted, so we don't want to try to
2645+
* clean up our transaction. Just nail the windows shut and get out of
2646+
* town. The callbacks wouldn't be safe to run from a signal handler,
2647+
* anyway.
2648+
*
2649+
* Note we do _exit(2) not _exit(0). This is to force the postmaster into
2650+
* a system reset cycle if someone sends a manual SIGQUIT to a random
26452651
* backend. This is necessary precisely because we don't clean up our
26462652
* shared memory state. (The "dead man switch" mechanism in pmsignal.c
26472653
* should ensure the postmaster sees this as a crash, too, but no harm in
26482654
* being doubly sure.)
26492655
*/
2650-
exit(2);
2656+
_exit(2);
26512657
}
26522658

26532659
/*

0 commit comments

Comments
 (0)