Commit aa1351f

Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker process per call, on the grounds that the postmaster might otherwise neglect its other duties for too long. However, that seems overly conservative, especially since bad effects only become obvious when many hundreds of bgworkers need to be launched at once. On the other side of the coin is that the existing logic could result in substantial delay of bgworker launches, because ServerLoop isn't guaranteed to iterate immediately after a signal arrives. (My attempt to fix that by using pselect(2) encountered too many portability question marks, and in any case could not help on platforms without pselect().) One could also question the wisdom of using an O(N^2) processing method if the system is intended to support so many bgworkers.

As a compromise, allow that function to launch up to 100 bgworkers per call (and in consequence, rename it to maybe_start_bgworkers). This will allow any normal parallel-query request for workers to be satisfied immediately during sigusr1_handler, avoiding the question of whether ServerLoop will be able to launch more promptly.

There is talk of rewriting the postmaster to use a WaitEventSet to avoid the signal-response-delay problem, but I'd argue that this change should be kept even after that happens (if it ever does).

Backpatch to 9.6 where parallel query was added. The issue exists before that, but previous uses of bgworkers typically aren't as sensitive to how quickly they get launched.

Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
1 parent fda4fec commit aa1351f

1 file changed, +22 -17 lines changed
src/backend/postmaster/postmaster.c

@@ -421,7 +421,7 @@ static void TerminateChildren(int signal);
 
 static int CountChildren(int target);
 static bool assign_backendlist_entry(RegisteredBgWorker *rw);
-static void maybe_start_bgworker(void);
+static void maybe_start_bgworkers(void);
 static bool CreateOptsFile(int argc, char *argv[], char *fullprogname);
 static pid_t StartChildProcess(AuxProcType type);
 static void StartAutovacuumWorker(void);
@@ -1346,7 +1346,7 @@ PostmasterMain(int argc, char *argv[])
 	pmState = PM_STARTUP;
 
 	/* Some workers may be scheduled to start now */
-	maybe_start_bgworker();
+	maybe_start_bgworkers();
 
 	status = ServerLoop();
 
@@ -1813,7 +1813,7 @@ ServerLoop(void)
 
 		/* Get other worker processes running, if needed */
 		if (StartWorkerNeeded || HaveCrashedWorker)
-			maybe_start_bgworker();
+			maybe_start_bgworkers();
 
 #ifdef HAVE_PTHREAD_IS_THREADED_NP
 
@@ -2859,7 +2859,7 @@ reaper(SIGNAL_ARGS)
 				PgStatPID = pgstat_start();
 
 			/* workers may be scheduled to start now */
-			maybe_start_bgworker();
+			maybe_start_bgworkers();
 
 			/* at this point we are really open for business */
 			ereport(LOG,
@@ -5026,7 +5026,7 @@ sigusr1_handler(SIGNAL_ARGS)
 	}
 
 	if (StartWorkerNeeded || HaveCrashedWorker)
-		maybe_start_bgworker();
+		maybe_start_bgworkers();
 
 	if (CheckPostmasterSignal(PMSIGNAL_WAKEN_ARCHIVER) &&
 		PgArchPID != 0)
@@ -5726,22 +5726,23 @@ assign_backendlist_entry(RegisteredBgWorker *rw)
 }
 
 /*
- * If the time is right, start one background worker.
+ * If the time is right, start background worker(s).
  *
- * As a side effect, the bgworker control variables are set or reset whenever
- * there are more workers to start after this one, and whenever the overall
- * system state requires it.
+ * As a side effect, the bgworker control variables are set or reset
+ * depending on whether more workers may need to be started.
  *
- * The reason we start at most one worker per call is to avoid consuming the
+ * We limit the number of workers started per call, to avoid consuming the
  * postmaster's attention for too long when many such requests are pending.
  * As long as StartWorkerNeeded is true, ServerLoop will not block and will
  * call this function again after dealing with any other issues.
  */
 static void
-maybe_start_bgworker(void)
+maybe_start_bgworkers(void)
 {
-	slist_mutable_iter iter;
+#define MAX_BGWORKERS_TO_LAUNCH 100
+	int num_launched = 0;
 	TimestampTz now = 0;
+	slist_mutable_iter iter;
 
 	/*
 	 * During crash recovery, we have no need to be called until the state
@@ -5826,12 +5827,16 @@ maybe_start_bgworker(void)
 			}
 
 			/*
-			 * Quit, but have ServerLoop call us again to look for additional
-			 * ready-to-run workers. There might not be any, but we'll find
-			 * out the next time we run.
+			 * If we've launched as many workers as allowed, quit, but have
+			 * ServerLoop call us again to look for additional ready-to-run
+			 * workers. There might not be any, but we'll find out the next
+			 * time we run.
 			 */
-			StartWorkerNeeded = true;
-			return;
+			if (++num_launched >= MAX_BGWORKERS_TO_LAUNCH)
+			{
+				StartWorkerNeeded = true;
+				return;
+			}
 		}
 	}
 }
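
To make the effect of the per-call cap concrete, here is a minimal standalone sketch of the same control flow. It is not PostgreSQL code. `SketchWorker`, `sketch_start_workers`, `sketch_launch_all`, and `start_worker_needed` are hypothetical names invented for illustration, and the sketch assumes a plain array that is rescanned from the start on every call. It ignores crash handling, restart timing, and actual process creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered worker (not RegisteredBgWorker). */
typedef struct
{
	bool running;
} SketchWorker;

/* Plays the role that StartWorkerNeeded plays in the commit. */
static bool start_worker_needed = false;

/*
 * Scan the array from the start and "launch" idle workers until
 * max_per_call of them have been started. Returns the number of
 * entries examined during this call.
 */
static size_t
sketch_start_workers(SketchWorker *workers, size_t n, int max_per_call)
{
	int num_launched = 0;
	size_t examined = 0;

	assert(max_per_call > 0);
	start_worker_needed = false;

	for (size_t i = 0; i < n; i++)
	{
		examined++;
		if (workers[i].running)
			continue;				/* started by an earlier call */

		workers[i].running = true;	/* stands in for forking the worker */

		if (++num_launched >= max_per_call)
		{
			/* Cap reached: ask the caller to invoke us again. */
			start_worker_needed = true;
			return examined;
		}
	}
	return examined;
}

/*
 * Call sketch_start_workers() repeatedly, as a server loop would, until
 * it stops asking to be called again. Returns the number of calls made
 * (or -1 if allocation fails) and stores the total number of entries
 * examined in *total_examined.
 */
static int
sketch_launch_all(size_t n, int max_per_call, size_t *total_examined)
{
	SketchWorker *workers = calloc(n ? n : 1, sizeof(SketchWorker));
	int calls = 0;

	*total_examined = 0;
	if (workers == NULL)
		return -1;

	do
	{
		*total_examined += sketch_start_workers(workers, n, max_per_call);
		calls++;
	} while (start_worker_needed);

	free(workers);
	return calls;
}
```

In this toy model, launching 1000 workers with a cap of 1 takes 1001 calls and examines 501500 entries, while a cap of 100 takes 11 calls and examines 6500 entries. The extra final call in both cases is the "there might not be any, but we'll find out the next time we run" pass. The rescan cost still grows quadratically with the number of workers, but the larger batch shrinks the constant and, more importantly for the commit's purpose, removes most of the round trips through the caller.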
