-
-
Notifications
You must be signed in to change notification settings - Fork 32.6k
bpo-32904: Fix a potential crash in os.chdir() and os.getcwd() on Windows #5802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Modules/posixmodule.c
Outdated
PyMem_RawFree(wbuf2); | ||
return resobj; | ||
} | ||
if (!use_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.getcwdb
is also incorrect. It shouldn't call the CRT's ANSI function. It needs to call win32_wgetcwd
and convert the result to the file-system encoding, which defaults to UTF-8. This was apparently overlooked when implementing PEP 529.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. According to PEP 529, os.getcwdb
should have also been undeprecated. I'll file an issue for that in a while.
This PR is stale because it has been open for 30 days with no activity. |
Modules/posixmodule.c
Outdated
@@ -1508,36 +1541,28 @@ posix_fildes_fd(int fd, int (*func)(int)) | |||
static BOOL __stdcall | |||
win32_wchdir(LPCWSTR path) | |||
{ | |||
wchar_t path_buf[MAX_PATH], *new_path = path_buf; | |||
int result; | |||
wchar_t path_buf[MAXPATHLEN], *new_path; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAXPATHLEN
was changed to 256 in Windows. That's a strange value considering MAX_PATH
is 260 and buffers sometimes need to be MAX_PATH + 1
even with legacy code.
Anyway, I don't think performance is critical for getcwd()
and chdir()
. It would be simpler to merge win32_wgetcwd_buf()
into win32_wgetcwd()
, and change the definition to static wchar_t *win32_wgetcwd(LPDWORD plength)
. The caller would have to free the buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAXPATHLEN
was changed to 256 in Windows. That's a strange value consideringMAX_PATH
is 260 and buffers sometimes need to beMAX_PATH + 1
even with legacy code.
MAXPATHLEN
is indeed strange, I even opened an issue about it: #88822.
I believe I changed it purely for consistency with other code in posixmodule.c, but this might be the case of foolish consistency. I've now switched to MAX_PATH
instead.
Modules/posixmodule.c
Outdated
static PyObject * | ||
win32_wgetcwd(void) | ||
{ | ||
wchar_t wbuf[MAXPATHLEN], *wbuf2; | ||
PyObject *resobj; | ||
|
||
Py_BEGIN_ALLOW_THREADS | ||
wbuf2 = win32_wgetcwd_buf(wbuf, Py_ARRAY_LENGTH(wbuf)); | ||
Py_END_ALLOW_THREADS | ||
|
||
if (!wbuf2) { | ||
return PyErr_SetFromWindowsErr(0); | ||
} | ||
resobj = PyUnicode_FromWideChar(wbuf2, wcslen(wbuf2)); | ||
if (wbuf2 != wbuf) | ||
PyMem_RawFree(wbuf2); | ||
return resobj; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the suggested change to make win32_wgetcwd()
a function that doesn't hold the GIL and always allocates the buffer.
static PyObject * | |
win32_wgetcwd(void) | |
{ | |
wchar_t wbuf[MAXPATHLEN], *wbuf2; | |
PyObject *resobj; | |
Py_BEGIN_ALLOW_THREADS | |
wbuf2 = win32_wgetcwd_buf(wbuf, Py_ARRAY_LENGTH(wbuf)); | |
Py_END_ALLOW_THREADS | |
if (!wbuf2) { | |
return PyErr_SetFromWindowsErr(0); | |
} | |
resobj = PyUnicode_FromWideChar(wbuf2, wcslen(wbuf2)); | |
if (wbuf2 != wbuf) | |
PyMem_RawFree(wbuf2); | |
return resobj; | |
} | |
static wchar_t * | |
win32_wgetcwd(wchar_t *buf, size_t size) | |
{ | |
wchar_t *local_buf = buf; | |
DWORD local_size = (DWORD)size; | |
if (size > DWORD_MAX) { | |
SetLastError(ERROR_INVALID_PARAMETER); | |
errno = EINVAL; | |
return NULL; | |
} | |
// If buf is NULL, allocate a buffer with the given size. If size is 0, | |
// the result of GetCurrentDirectoryW() wil be used. | |
if (buf == NULL && size > 0) { | |
local_buf = PyMem_RawMalloc(dwsize * sizeof(wchar_t)); | |
if (local_buf == NULL) { | |
SetLastError(ERROR_NOT_ENOUGH_MEMORY); | |
errno = ENOMEM; | |
return NULL; | |
} | |
} | |
while (1) { | |
wchar_t *temp; | |
DWORD result = GetCurrentDirectoryW(local_size, local_buf); | |
if (!result) { | |
// Set errno for the sake of completeness, but the result is never | |
// 0 in practice. The call could fail due to an OS exception such | |
// as an access violation, but that will just crash the process if | |
// there's no exception handler. | |
errno = winerror_to_errno(GetLastError()); | |
goto fail; | |
} | |
// Note that the null terminator is not counted on success. | |
if (result < local_size) { | |
break; | |
} | |
// Reallocate only if buf is NULL and size is 0. | |
if (buf != NULL || size != 0) { | |
SetLastError(ERROR_INSUFFICIENT_BUFFER); | |
errno = ERANGE; | |
goto fail; | |
} | |
temp = PyMem_RawRealloc(local_buf, result * sizeof(wchar_t)); | |
if (!temp) { | |
SetLastError(ERROR_NOT_ENOUGH_MEMORY); | |
errno = ENOMEM; | |
goto fail; | |
} | |
local_buf = temp; | |
local_size = result; | |
} | |
return local_buf; | |
fail: | |
if (buf == NULL && local_buff != NULL) { | |
PyMem_RawFree(local_buf); | |
} | |
return NULL; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the suggestion to implement static wchar_t *win32_wgetcwd(wchar_t *buf, size_t size)
more faithfully to POSIX getcwd()
. Like glibc, passing buf
as NULL
supports allocating the buffer. The caller is responsible for freeing an allocated buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the idea of inlining win32_wgetcwd()
back into posix_getcwd()
. In 4 years since my original PR handling of use_bytes
was changed, and now it seems to make sense to have conversion from the native path type to Python's unicode/bytes in the same place for both Windows and Unix.
However, I can't see the value of introducing a pretty generic POSIX-like win32_wgetcwd()
as you suggest. We need to get the current directory in just two places, and I don't see why our needs would change in the future.
I've force-pushed a new commit, which is mostly just a rebase of the original PR + inlining of win32_wgetcwd()
and renaming of win32_wgetcwd_buf()
to win32_wgetcwd()
. ISTM that win32_wgetcwd()
is pretty simple. Having the caller to compare the returned buffer to the passed one is surely a wrinkle, but I don't think it matters, and overall I don't see why using your generic win32_wgetcwd()
would be much better. But if there is some rationale that I'm missing, I'll be happy to revisit this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if there is some rationale that I'm missing, I'll be happy to revisit this.
There's no deep rationale. I just figured I may as well copy the Linux behavior. POSIX getcwd()
only says that the behavior is "unspecified" when buf
is NULL
. Your code implements part of the extended behavior, given buf
is NULL
and buf_size
is 0. However, GetCurrentDirectoryW()
assumes that nBufferLength
accurately represents the size of lpBuffer
. If this size is big enough to hold the path of the current working directory, but lpBuffer
is NULL
, GetCurrentDirectoryW()
will try to write to a NULL
pointer and crash on an access violation. This can't happen in how we're currently using the function, but I felt it wouldn't hurt to round out the implementation. The Linux behavior is to allocate a new buffer in this case. Another option is to simply fail if buf_size
is greater than 0 when buf
is NULL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I've intended win32_getcwd()
purely as a helper for the two cases we have, not as a generic function (and one of those cases shouldn't probably exist at all). I generally don't like to abstract over a small number of use cases. It works for buf
is NULL
and buf_size
is 0 purely by accident. I can add an assert()
or a failure case for buf == NULL
if you like.
I see how it could be tempting to add a more flexible version, but I think that would be justified if we needed to implement Py_getcwd()
for Python/fileutils.c, but not here.
Modules/posixmodule.c
Outdated
if (!use_bytes) | ||
return win32_wgetcwd(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With win32_wgetcwd()
implemented as a lower-level function, more of the work remains in posix_getcwd()
, but it's still relatively simple:
if (!use_bytes) | |
return win32_wgetcwd(); | |
PyObject *obj | |
wchar_t *buf; | |
Py_BEGIN_ALLOW_THREADS | |
buf = win32_wgetcwd(NULL, 0); | |
Py_END_ALLOW_THREADS | |
if (!buf) { | |
PyErr_NoMemory(); | |
return NULL; | |
} | |
obj = PyUnicode_FromWideChar(buf, -1); | |
PyMem_RawFree(buf); | |
if (use_bytes && obj) { | |
Py_SETREF(obj, PyUnicode_EncodeFSDefault(obj)); | |
} | |
return obj; |
Thank you for reviewing, @eryksun. I'll get back to this PR later this week and address your suggestions. |
…Windows Both functions retry GetCurrentDirectoryW() if the initial buffer is insufficient, but don't check whether the newly allocated buffer is sufficient after the second call. This might not be the case if another thread changes the current directory concurrently, and if so, the functions will proceed to use uninitialized memory. Fix this by retrying GetCurrentDirectoryW() in a loop with larger buffers until it succeeds or we run out of memory. Co-authored-by: Eryk Sun <eryksun@gmail.com>
There is also a thing that I don't understand about
But it don't think it's true that |
Back then it couldn't, but long-path support in Windows 10+ does allow the current working directory to be a long path. A child process, however, can't inherit a long path as the current working directory.
Note that only drive and regular UNC paths are supported as the working directory. Device paths such as "\\?\" extended paths are not documented as supported, and the API is actually buggy in this case. It wasn't an issue before because the static buffer in the PEB (process environment block) was limited to
We should use We could also switch back to calling |
Thank you for clarifying the situation. With regard to long paths and At least now it's clear to me that we do need to retry |
This PR is stale because it has been open for 30 days with no activity. |
This comment has been minimized.
This comment has been minimized.
gh pr checkout 5802 |
21-23,25,45,53,80,110,111,113,135,139,143,199,256,443,445,554,587,912,993,995,1720,1723,3306,3389,5900,8080,8888,10257 |
win32_wchdir() retries GetCurrentDirectory() with a larger buffer
if necessary, but doesn't check whether the new buffer is large enough.
Another thread could change the current directory in meanwhile,
so the buffer could turn out to be still not large enough, left in
an uninitialized state and passed to SetEnvironmentVariable() afterwards.
A similar issue occurs in posix_getcwd().
https://bugs.python.org/issue32904