Commit bc0f76e

On Windows, retry process creation if we fail to reserve shared memory.
We've heard occasional reports of backend launch failing because pgwin32_ReserveSharedMemoryRegion() fails, indicating that something has already used that address space in the child process. It's not very clear what, given that we disable ASLR in Windows builds, but suspicion falls on antivirus products. It'd be better if we didn't have to disable ASLR, anyway. So let's try to ameliorate the problem by retrying the process launch after such a failure, up to 100 times.

Patch by me, based on previous work by Amit Kapila and others. This is a longstanding issue, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
1 parent 4d93184 commit bc0f76e

1 file changed: +15 -7 lines changed
src/backend/postmaster/postmaster.c

Lines changed: 15 additions & 7 deletions
@@ -4463,6 +4463,7 @@ internal_forkexec(int argc, char *argv[], Port *port)
 static pid_t
 internal_forkexec(int argc, char *argv[], Port *port)
 {
+    int         retry_count = 0;
     STARTUPINFO si;
     PROCESS_INFORMATION pi;
     int         i;
@@ -4480,6 +4481,9 @@ internal_forkexec(int argc, char *argv[], Port *port)
     Assert(strncmp(argv[1], "--fork", 6) == 0);
     Assert(argv[2] == NULL);
 
+    /* Resume here if we need to retry */
+retry:
+
     /* Set up shared memory for parameter passing */
     ZeroMemory(&sa, sizeof(sa));
     sa.nLength = sizeof(sa);
@@ -4571,22 +4575,26 @@ internal_forkexec(int argc, char *argv[], Port *port)
 
     /*
      * Reserve the memory region used by our main shared memory segment before
-     * we resume the child process.
+     * we resume the child process. Normally this should succeed, but if ASLR
+     * is active then it might sometimes fail due to the stack or heap having
+     * gotten mapped into that range. In that case, just terminate the
+     * process and retry.
      */
     if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
     {
-        /*
-         * Failed to reserve the memory, so terminate the newly created
-         * process and give up.
-         */
+        /* pgwin32_ReserveSharedMemoryRegion already made a log entry */
         if (!TerminateProcess(pi.hProcess, 255))
             ereport(LOG,
                     (errmsg_internal("could not terminate process that failed to reserve memory: error code %lu",
                                      GetLastError())));
         CloseHandle(pi.hProcess);
         CloseHandle(pi.hThread);
-        return -1;              /* logging done made by
-                                 * pgwin32_ReserveSharedMemoryRegion() */
+        if (++retry_count < 100)
+            goto retry;
+        ereport(LOG,
+                (errmsg("giving up after too many tries to reserve shared memory"),
+                 errhint("This might be caused by ASLR or antivirus software.")));
+        return -1;
     }
 
     /*
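
For readers who want to see the control flow in isolation, the bounded retry that this patch adds can be sketched as a plain loop. This is only an illustration, not PostgreSQL code: launch_with_retry(), launch_attempt_fn, fail_n_times() and MAX_LAUNCH_TRIES are hypothetical names, and the callback stands in for "create the child, try to reserve the region, and clean up on failure". The limit of 100 attempts is taken from the diff; the patch itself uses a goto rather than a loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Same upper bound as the patch: at most 100 launch attempts in total. */
#define MAX_LAUNCH_TRIES 100

/*
 * Hypothetical stand-in for one launch attempt.  It returns true on success;
 * on failure it is expected to have released whatever it created.
 */
typedef bool (*launch_attempt_fn) (void *state);

/*
 * Call attempt() until it succeeds or max_tries attempts have failed.
 * Returns the number of attempts used (1..max_tries) on success, -1 if
 * every attempt failed.
 */
static int
launch_with_retry(launch_attempt_fn attempt, void *state, int max_tries)
{
    int         tries;

    assert(attempt != NULL && max_tries > 0);

    for (tries = 1; tries <= max_tries; tries++)
    {
        if (attempt(state))
            return tries;
    }

    fprintf(stderr, "giving up after %d tries\n", max_tries);
    return -1;
}

/*
 * Demo attempt for experimentation: fails while the counter pointed to by
 * state is positive, decrementing it each time, then succeeds.
 */
static bool
fail_n_times(void *state)
{
    int        *remaining_failures = state;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With a counter initialized to 3, launch_with_retry(fail_n_times, &counter, MAX_LAUNCH_TRIES) returns 4: three failed attempts followed by one success. A counter of 100 exhausts the limit and returns -1, which corresponds to the "giving up after too many tries" path in the patch.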
