Skip to content

Commit 16e1b7a

Browse files
committed
Fix assorted race conditions in the new timeout infrastructure.
Prevent handle_sig_alarm from losing control partway through due to a query cancel (either an asynchronous SIGINT, or a cancel triggered by one of the timeout handler functions). That would at least result in failure to schedule any required future interrupt, and might result in actual corruption of timeout.c's data structures, if the interrupt happened while we were updating those. We could still lose control if an asynchronous SIGINT arrives just as the function is entered. This wouldn't break any data structures, but it would have the same effect as if the SIGALRM interrupt had been silently lost: we'd not fire any currently-due handlers, nor schedule any new interrupt. To forestall that scenario, forcibly reschedule any pending timer interrupt during AbortTransaction and AbortSubTransaction. We can avoid any extra kernel call in most cases by not doing that until we've allowed LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events. Another hazard is that some platforms (at least Linux and *BSD) block a signal before calling its handler and then unblock it on return. When we longjmp out of the handler, the unblock doesn't happen, and the signal is left blocked indefinitely. Again, we can fix that by forcibly unblocking signals during AbortTransaction and AbortSubTransaction. These latter two problems do not manifest when the longjmp reaches postgres.c, because the error recovery code there kills all pending timeout events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal mask is restored. So errors thrown outside any transaction should be OK already, and cleaning up in AbortTransaction and AbortSubTransaction should be enough to fix these issues. (We're assuming that any code that catches a query cancel error and doesn't re-throw it will do at least a subtransaction abort to clean up; but that was pretty much required already by other subsystems.) Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when disabling that event: if a lock timeout interrupt happened after the lock was granted, the ensuing query cancel is still going to happen at the next CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user cancel. Per reports from Dan Wood. Back-patch to 9.3 where the new timeout handling infrastructure was introduced. We may at some point decide to back-patch the signal unblocking changes further, but I'll desist from that until we hear actual field complaints about it.
1 parent 50107ee commit 16e1b7a

File tree

6 files changed

+117
-7
lines changed

6 files changed

+117
-7
lines changed

src/backend/access/transam/xact.c

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@
3434
#include "commands/trigger.h"
3535
#include "executor/spi.h"
3636
#include "libpq/be-fsstubs.h"
37+
#include "libpq/pqsignal.h"
3738
#include "miscadmin.h"
3839
#include "pgstat.h"
3940
#include "replication/walsender.h"
@@ -52,6 +53,7 @@
5253
#include "utils/memutils.h"
5354
#include "utils/relmapper.h"
5455
#include "utils/snapmgr.h"
56+
#include "utils/timeout.h"
5557
#include "utils/timestamp.h"
5658
#include "pg_trace.h"
5759

@@ -2296,6 +2298,22 @@ AbortTransaction(void)
22962298
*/
22972299
LockErrorCleanup();
22982300

2301+
/*
2302+
* If any timeout events are still active, make sure the timeout interrupt
2303+
* is scheduled. This covers possible loss of a timeout interrupt due to
2304+
* longjmp'ing out of the SIGINT handler (see notes in handle_sig_alarm).
2305+
* We delay this till after LockErrorCleanup so that we don't uselessly
2306+
* reschedule lock or deadlock check timeouts.
2307+
*/
2308+
reschedule_timeouts();
2309+
2310+
/*
2311+
* Re-enable signals, in case we got here by longjmp'ing out of a signal
2312+
* handler. We do this fairly early in the sequence so that the timeout
2313+
* infrastructure will be functional if needed while aborting.
2314+
*/
2315+
PG_SETMASK(&UnBlockSig);
2316+
22992317
/*
23002318
* check the current transaction state
23012319
*/
@@ -4222,8 +4240,28 @@ AbortSubTransaction(void)
42224240
AbortBufferIO();
42234241
UnlockBuffers();
42244242

4243+
/*
4244+
* Also clean up any open wait for lock, since the lock manager will choke
4245+
* if we try to wait for another lock before doing this.
4246+
*/
42254247
LockErrorCleanup();
42264248

4249+
/*
4250+
* If any timeout events are still active, make sure the timeout interrupt
4251+
* is scheduled. This covers possible loss of a timeout interrupt due to
4252+
* longjmp'ing out of the SIGINT handler (see notes in handle_sig_alarm).
4253+
* We delay this till after LockErrorCleanup so that we don't uselessly
4254+
* reschedule lock or deadlock check timeouts.
4255+
*/
4256+
reschedule_timeouts();
4257+
4258+
/*
4259+
* Re-enable signals, in case we got here by longjmp'ing out of a signal
4260+
* handler. We do this fairly early in the sequence so that the timeout
4261+
* infrastructure will be functional if needed while aborting.
4262+
*/
4263+
PG_SETMASK(&UnBlockSig);
4264+
42274265
/*
42284266
* check the current transaction state
42294267
*/

src/backend/postmaster/autovacuum.c

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -493,9 +493,8 @@ AutoVacLauncherMain(int argc, char *argv[])
493493
HOLD_INTERRUPTS();
494494

495495
/* Forget any pending QueryCancel or timeout request */
496-
QueryCancelPending = false;
497496
disable_all_timeouts(false);
498-
QueryCancelPending = false; /* again in case timeout occurred */
497+
QueryCancelPending = false; /* second to avoid race condition */
499498

500499
/* Report the error to the server log */
501500
EmitErrorReport();

src/backend/storage/lmgr/proc.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1266,7 +1266,10 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
12661266
} while (myWaitStatus == STATUS_WAITING);
12671267

12681268
/*
1269-
* Disable the timers, if they are still running
1269+
* Disable the timers, if they are still running. As in LockErrorCleanup,
1270+
* we must preserve the LOCK_TIMEOUT indicator flag: if a lock timeout has
1271+
* already caused QueryCancelPending to become set, we want the cancel to
1272+
* be reported as a lock timeout, not a user cancel.
12701273
*/
12711274
if (LockTimeout > 0)
12721275
{
@@ -1275,7 +1278,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
12751278
timeouts[0].id = DEADLOCK_TIMEOUT;
12761279
timeouts[0].keep_indicator = false;
12771280
timeouts[1].id = LOCK_TIMEOUT;
1278-
timeouts[1].keep_indicator = false;
1281+
timeouts[1].keep_indicator = true;
12791282
disable_timeouts(timeouts, 2);
12801283
}
12811284
else

src/backend/tcop/postgres.c

Lines changed: 16 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3803,6 +3803,13 @@ PostgresMain(int argc, char *argv[],
38033803
* always active, we have at least some chance of recovering from an error
38043804
* during error recovery. (If we get into an infinite loop thereby, it
38053805
* will soon be stopped by overflow of elog.c's internal state stack.)
3806+
*
3807+
* Note that we use sigsetjmp(..., 1), so that this function's signal mask
3808+
* (to wit, UnBlockSig) will be restored when longjmp'ing to here. This
3809+
* is essential in case we longjmp'd out of a signal handler on a platform
3810+
* where that leaves the signal blocked. It's not redundant with the
3811+
* unblock in AbortTransaction() because the latter is only called if we
3812+
* were inside a transaction.
38063813
*/
38073814

38083815
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
@@ -3823,11 +3830,17 @@ PostgresMain(int argc, char *argv[],
38233830

38243831
/*
38253832
* Forget any pending QueryCancel request, since we're returning to
3826-
* the idle loop anyway, and cancel any active timeout requests.
3833+
* the idle loop anyway, and cancel any active timeout requests. (In
3834+
* future we might want to allow some timeout requests to survive, but
3835+
* at minimum it'd be necessary to do reschedule_timeouts(), in case
3836+
* we got here because of a query cancel interrupting the SIGALRM
3837+
* interrupt handler.) Note in particular that we must clear the
3838+
* statement and lock timeout indicators, to prevent any future plain
3839+
* query cancels from being misreported as timeouts in case we're
3840+
* forgetting a timeout cancel.
38273841
*/
3828-
QueryCancelPending = false;
38293842
disable_all_timeouts(false);
3830-
QueryCancelPending = false; /* again in case timeout occurred */
3843+
QueryCancelPending = false; /* second to avoid race condition */
38313844

38323845
/*
38333846
* Turn off these interrupts too. This is only needed here and not in

src/backend/utils/misc/timeout.c

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616

1717
#include <sys/time.h>
1818

19+
#include "miscadmin.h"
1920
#include "storage/proc.h"
2021
#include "utils/timeout.h"
2122
#include "utils/timestamp.h"
@@ -259,6 +260,23 @@ handle_sig_alarm(SIGNAL_ARGS)
259260
{
260261
int save_errno = errno;
261262

263+
/*
264+
* We may be executing while ImmediateInterruptOK is true (e.g., when
265+
* mainline is waiting for a lock). If SIGINT or similar arrives while
266+
* this code is running, we'd lose control and perhaps leave our data
267+
* structures in an inconsistent state. Hold off interrupts to prevent
268+
* that.
269+
*
270+
* Note: it's possible for a SIGINT to interrupt handle_sig_alarm even
271+
* before we reach HOLD_INTERRUPTS(); the net effect would be as if the
272+
* SIGALRM event had been silently lost. Therefore error recovery must
273+
* include some action that will allow any lost interrupt to be
274+
* rescheduled. Disabling some or all timeouts is sufficient, or if
275+
* that's not appropriate, reschedule_timeouts() can be called. Also, the
276+
* signal blocking hazard described below applies here too.
277+
*/
278+
HOLD_INTERRUPTS();
279+
262280
/*
263281
* SIGALRM is always cause for waking anything waiting on the process
264282
* latch. Cope with MyProc not being there, as the startup process also
@@ -311,6 +329,20 @@ handle_sig_alarm(SIGNAL_ARGS)
311329
}
312330
}
313331

332+
/*
333+
* Re-allow query cancel, and then service any cancel request that arrived
334+
* meanwhile (this might in particular include a cancel request fired by
335+
* one of the timeout handlers).
336+
*
337+
* Note: a longjmp from here is safe so far as our own data structures are
338+
* concerned; but on platforms that block a signal before calling the
339+
* handler and then un-block it on return, longjmping out of the signal
340+
* handler leaves SIGALRM still blocked. Error cleanup is responsible for
341+
* unblocking any blocked signals.
342+
*/
343+
RESUME_INTERRUPTS();
344+
CHECK_FOR_INTERRUPTS();
345+
314346
errno = save_errno;
315347
}
316348

@@ -387,6 +419,30 @@ RegisterTimeout(TimeoutId id, timeout_handler_proc handler)
387419
return id;
388420
}
389421

422+
/*
423+
* Reschedule any pending SIGALRM interrupt.
424+
*
425+
* This can be used during error recovery in case query cancel resulted in loss
426+
* of a SIGALRM event (due to longjmp'ing out of handle_sig_alarm before it
427+
* could do anything). But note it's not necessary if any of the public
428+
* enable_ or disable_timeout functions are called in the same area, since
429+
* those all do schedule_alarm() internally if needed.
430+
*/
431+
void
432+
reschedule_timeouts(void)
433+
{
434+
/* For flexibility, allow this to be called before we're initialized. */
435+
if (!all_timeouts_initialized)
436+
return;
437+
438+
/* Disable timeout interrupts for safety. */
439+
disable_alarm();
440+
441+
/* Reschedule the interrupt, if any timeouts remain active. */
442+
if (num_active_timeouts > 0)
443+
schedule_alarm(GetCurrentTimestamp());
444+
}
445+
390446
/*
391447
* Enable the specified timeout to fire after the specified delay.
392448
*

src/include/utils/timeout.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@ typedef struct
6767
/* timeout setup */
6868
extern void InitializeTimeouts(void);
6969
extern TimeoutId RegisterTimeout(TimeoutId id, timeout_handler_proc handler);
70+
extern void reschedule_timeouts(void);
7071

7172
/* timeout operation */
7273
extern void enable_timeout_after(TimeoutId id, int delay_ms);

0 commit comments

Comments
 (0)