Skip to content

Commit facd94e

Browse files
committed
Report failure to start a background worker.
When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
1 parent f4bb60e commit facd94e

File tree

2 files changed

+24
-8
lines changed

2 files changed

+24
-8
lines changed

src/backend/postmaster/bgworker.c

Lines changed: 15 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -907,14 +907,18 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
907907
* Get the PID of a dynamically-registered background worker.
908908
*
909909
* If the worker is determined to be running, the return value will be
910-
* BGWH_STARTED and *pidp will get the PID of the worker process.
911-
* Otherwise, the return value will be BGWH_NOT_YET_STARTED if the worker
912-
* hasn't been started yet, and BGWH_STOPPED if the worker was previously
913-
* running but is no longer.
910+
* BGWH_STARTED and *pidp will get the PID of the worker process. If the
911+
* postmaster has not yet attempted to start the worker, the return value will
912+
* be BGWH_NOT_YET_STARTED. Otherwise, the return value is BGWH_STOPPED.
914913
*
915-
* In the latter case, the worker may be stopped temporarily (if it is
916-
* configured for automatic restart and exited non-zero) or gone for
917-
* good (if it exited with code 0 or if it is configured not to restart).
914+
* BGWH_STOPPED can indicate either that the worker is temporarily stopped
915+
* (because it is configured for automatic restart and exited non-zero),
916+
* or that the worker is permanently stopped (because it exited with exit
917+
* code 0, or was not configured for automatic restart), or even that the
918+
* worker was unregistered without ever starting (either because startup
919+
* failed and the worker is not configured for automatic restart, or because
920+
* TerminateBackgroundWorker was used before the worker was successfully
921+
* started).
918922
*/
919923
BgwHandleStatus
920924
GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
@@ -939,8 +943,11 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
939943
* time, but we assume such changes are atomic. So the value we read
940944
* won't be garbage, but it might be out of date by the time the caller
941945
* examines it (but that's unavoidable anyway).
946+
*
947+
* The in_use flag could be in the process of changing from true to false,
948+
* but if it is already false then it can't change further.
942949
*/
943-
if (handle->generation != slot->generation)
950+
if (handle->generation != slot->generation || !slot->in_use)
944951
pid = 0;
945952
else
946953
pid = slot->pid;

src/backend/postmaster/postmaster.c

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5842,7 +5842,16 @@ maybe_start_bgworker(void)
58425842
{
58435843
if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
58445844
{
5845+
int notify_pid;
5846+
5847+
notify_pid = rw->rw_worker.bgw_notify_pid;
5848+
58455849
ForgetBackgroundWorker(&iter);
5850+
5851+
/* Report worker is gone now. */
5852+
if (notify_pid != 0)
5853+
kill(notify_pid, SIGUSR1);
5854+
58465855
continue;
58475856
}
58485857

0 commit comments

Comments
 (0)