Skip to content

Commit 7634bd4

Browse files
committed
Accept SIGQUIT during error recovery in auxiliary processes.
The bgwriter, checkpointer, walwriter, and walreceiver processes claimed to allow SIGQUIT "at all times". In reality SIGQUIT would get re-blocked during error recovery, because we didn't update the actual signal mask immediately, so sigsetjmp() would save and reinstate a mask that includes SIGQUIT. This appears to be simply a coding oversight. There's never a good reason to hold off SIGQUIT in these processes, because it's going to just call _exit(2) which should be safe enough, especially since the postmaster is going to tear down shared memory afterwards. Hence, stick in PG_SETMASK() calls to install the modified BlockSig mask immediately. Also try to improve the comments around sigsetjmp blocks. Most of them were just referencing postgres.c, which is misleading because actually postgres.c manages the signals differently. No back-patch, since there's no evidence that this is causing any problems in the field. Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
1 parent 3c99230 commit 7634bd4

File tree

5 files changed

+65
-8
lines changed

5 files changed

+65
-8
lines changed

src/backend/postmaster/autovacuum.c

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -495,6 +495,12 @@ AutoVacLauncherMain(int argc, char *argv[])
495495
* If an exception is encountered, processing resumes here.
496496
*
497497
* This code is a stripped down version of PostgresMain error recovery.
498+
*
499+
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
500+
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
501+
* signals will be blocked until we complete error recovery. It might
502+
* seem that this policy makes the HOLD_INTERRUPTS() call redundant, but
503+
* it is not since InterruptPending might be set already.
498504
*/
499505
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
500506
{
@@ -1550,7 +1556,15 @@ AutoVacWorkerMain(int argc, char *argv[])
15501556
/*
15511557
* If an exception is encountered, processing resumes here.
15521558
*
1553-
* See notes in postgres.c about the design of this coding.
1559+
* Unlike most auxiliary processes, we don't attempt to continue
1560+
* processing after an error; we just clean up and exit. The autovac
1561+
* launcher is responsible for spawning another worker later.
1562+
*
1563+
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
1564+
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
1565+
* signals will be blocked until we exit. It might seem that this policy
1566+
* makes the HOLD_INTERRUPTS() call redundant, but it is not since
1567+
* InterruptPending might be set already.
15541568
*/
15551569
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
15561570
{

src/backend/postmaster/bgwriter.c

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -115,8 +115,9 @@ BackgroundWriterMain(void)
115115
*/
116116
pqsignal(SIGCHLD, SIG_DFL);
117117

118-
/* We allow SIGQUIT (quickdie) at all times */
118+
/* We allow SIGQUIT (SignalHandlerForCrashExit) at all times */
119119
sigdelset(&BlockSig, SIGQUIT);
120+
PG_SETMASK(&BlockSig);
120121

121122
/*
122123
* We just started, assume there has been either a shutdown or
@@ -140,7 +141,20 @@ BackgroundWriterMain(void)
140141
/*
141142
* If an exception is encountered, processing resumes here.
142143
*
143-
* See notes in postgres.c about the design of this coding.
144+
* You might wonder why this isn't coded as an infinite loop around a
145+
* PG_TRY construct. The reason is that this is the bottom of the
146+
* exception stack, and so with PG_TRY there would be no exception handler
147+
* in force at all during the CATCH part. By leaving the outermost setjmp
148+
* always active, we have at least some chance of recovering from an error
149+
* during error recovery. (If we get into an infinite loop thereby, it
150+
* will soon be stopped by overflow of elog.c's internal state stack.)
151+
*
152+
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
153+
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
154+
* signals other than SIGQUIT will be blocked until we complete error
155+
* recovery. It might seem that this policy makes the HOLD_INTERRUPTS()
156+
* call redundant, but it is not since InterruptPending might be set
157+
* already.
144158
*/
145159
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
146160
{

src/backend/postmaster/checkpointer.c

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -209,8 +209,9 @@ CheckpointerMain(void)
209209
*/
210210
pqsignal(SIGCHLD, SIG_DFL);
211211

212-
/* We allow SIGQUIT (quickdie) at all times */
212+
/* We allow SIGQUIT (SignalHandlerForCrashExit) at all times */
213213
sigdelset(&BlockSig, SIGQUIT);
214+
PG_SETMASK(&BlockSig);
214215

215216
/*
216217
* Initialize so that first time-driven event happens at the correct time.
@@ -231,7 +232,20 @@ CheckpointerMain(void)
231232
/*
232233
* If an exception is encountered, processing resumes here.
233234
*
234-
* See notes in postgres.c about the design of this coding.
235+
* You might wonder why this isn't coded as an infinite loop around a
236+
* PG_TRY construct. The reason is that this is the bottom of the
237+
* exception stack, and so with PG_TRY there would be no exception handler
238+
* in force at all during the CATCH part. By leaving the outermost setjmp
239+
* always active, we have at least some chance of recovering from an error
240+
* during error recovery. (If we get into an infinite loop thereby, it
241+
* will soon be stopped by overflow of elog.c's internal state stack.)
242+
*
243+
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
244+
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
245+
* signals other than SIGQUIT will be blocked until we complete error
246+
* recovery. It might seem that this policy makes the HOLD_INTERRUPTS()
247+
* call redundant, but it is not since InterruptPending might be set
248+
* already.
235249
*/
236250
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
237251
{

src/backend/postmaster/walwriter.c

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -112,8 +112,9 @@ WalWriterMain(void)
112112
*/
113113
pqsignal(SIGCHLD, SIG_DFL);
114114

115-
/* We allow SIGQUIT (quickdie) at all times */
115+
/* We allow SIGQUIT (SignalHandlerForCrashExit) at all times */
116116
sigdelset(&BlockSig, SIGQUIT);
117+
PG_SETMASK(&BlockSig);
117118

118119
/*
119120
* Create a memory context that we will do all our work in. We do this so
@@ -129,7 +130,20 @@ WalWriterMain(void)
129130
/*
130131
* If an exception is encountered, processing resumes here.
131132
*
132-
* This code is heavily based on bgwriter.c, q.v.
133+
* You might wonder why this isn't coded as an infinite loop around a
134+
* PG_TRY construct. The reason is that this is the bottom of the
135+
* exception stack, and so with PG_TRY there would be no exception handler
136+
* in force at all during the CATCH part. By leaving the outermost setjmp
137+
* always active, we have at least some chance of recovering from an error
138+
* during error recovery. (If we get into an infinite loop thereby, it
139+
* will soon be stopped by overflow of elog.c's internal state stack.)
140+
*
141+
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
142+
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
143+
* signals other than SIGQUIT will be blocked until we complete error
144+
* recovery. It might seem that this policy makes the HOLD_INTERRUPTS()
145+
* call redundant, but it is not since InterruptPending might be set
146+
* already.
133147
*/
134148
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
135149
{

src/backend/replication/walreceiver.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -279,8 +279,9 @@ WalReceiverMain(void)
279279
/* Reset some signals that are accepted by postmaster but not here */
280280
pqsignal(SIGCHLD, SIG_DFL);
281281

282-
/* We allow SIGQUIT (quickdie) at all times */
282+
/* We allow SIGQUIT (SignalHandlerForCrashExit) at all times */
283283
sigdelset(&BlockSig, SIGQUIT);
284+
PG_SETMASK(&BlockSig);
284285

285286
/* Load the libpq-specific functions */
286287
load_file("libpqwalreceiver", false);

0 commit comments

Comments
 (0)