Skip to content

Commit b756440

Browse files
committed
Report failure to start a background worker.
When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
1 parent a4116c4 commit b756440

File tree

2 files changed

+24
-8
lines changed

2 files changed

+24
-8
lines changed

src/backend/postmaster/bgworker.c

Lines changed: 15 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -908,14 +908,18 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
908908
* Get the PID of a dynamically-registered background worker.
909909
*
910910
* If the worker is determined to be running, the return value will be
911-
* BGWH_STARTED and *pidp will get the PID of the worker process.
912-
* Otherwise, the return value will be BGWH_NOT_YET_STARTED if the worker
913-
* hasn't been started yet, and BGWH_STOPPED if the worker was previously
914-
* running but is no longer.
911+
* BGWH_STARTED and *pidp will get the PID of the worker process. If the
912+
* postmaster has not yet attempted to start the worker, the return value will
913+
* be BGWH_NOT_YET_STARTED. Otherwise, the return value is BGWH_STOPPED.
915914
*
916-
* In the latter case, the worker may be stopped temporarily (if it is
917-
* configured for automatic restart and exited non-zero) or gone for
918-
* good (if it exited with code 0 or if it is configured not to restart).
915+
* BGWH_STOPPED can indicate either that the worker is temporarily stopped
916+
* (because it is configured for automatic restart and exited non-zero),
917+
* or that the worker is permanently stopped (because it exited with exit
918+
* code 0, or was not configured for automatic restart), or even that the
919+
* worker was unregistered without ever starting (either because startup
920+
* failed and the worker is not configured for automatic restart, or because
921+
* TerminateBackgroundWorker was used before the worker was successfully
922+
* started).
919923
*/
920924
BgwHandleStatus
921925
GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
@@ -940,8 +944,11 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
940944
* time, but we assume such changes are atomic. So the value we read
941945
* won't be garbage, but it might be out of date by the time the caller
942946
* examines it (but that's unavoidable anyway).
947+
*
948+
* The in_use flag could be in the process of changing from true to false,
949+
* but if it is already false then it can't change further.
943950
*/
944-
if (handle->generation != slot->generation)
951+
if (handle->generation != slot->generation || !slot->in_use)
945952
pid = 0;
946953
else
947954
pid = slot->pid;

src/backend/postmaster/postmaster.c

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5821,7 +5821,16 @@ maybe_start_bgworkers(void)
58215821
{
58225822
if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
58235823
{
5824+
int notify_pid;
5825+
5826+
notify_pid = rw->rw_worker.bgw_notify_pid;
5827+
58245828
ForgetBackgroundWorker(&iter);
5829+
5830+
/* Report worker is gone now. */
5831+
if (notify_pid != 0)
5832+
kill(notify_pid, SIGUSR1);
5833+
58255834
continue;
58265835
}
58275836

0 commit comments

Comments
 (0)