Skip to content

gh-127681: add overflow checks when using PyMem_Malloc #127686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Dec 6, 2024

For now, this only adds checks in Modules and Objects.

@picnixz picnixz changed the title gh-127681: add overflow checks when using memory allocators gh-127681: add overflow checks when using PyMem_Malloc Dec 6, 2024
@picnixz
Copy link
Member Author

picnixz commented Dec 6, 2024

I've added comments for future readers so that they know that something does not need to be checked or not, but I can also remove them if you think it's overly verbose.

@picnixz picnixz marked this pull request as ready for review December 6, 2024 13:37
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed, and found only one site where an integer overflow may be possible. In all other cases it is not possible.

Comment on lines +334 to +343
if (prefix) {
size_t prefix_len = strlen(prefix);
if (len > PY_SIZE_MAX - prefix_len) {
goto oom;
}
len += prefix_len;
}
if (len == PY_SIZE_MAX) {
goto oom;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of overflow? AFAIK, prefix is smaller than 1030 bytes, and suffix also is a short string.

Comment on lines +377 to +383
if (prefix) {
size_t l = strlen(prefix);
if ((size_t)prefix > (size_t)PY_SSIZE_T_MAX - l) {
goto oom;
}
prefix_len += l;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix is always "&" here.

@@ -1666,7 +1685,7 @@ PyCArrayType_init(PyObject *self, PyObject *args, PyObject *kwds)
if (stginfo->format == NULL)
goto error;
stginfo->ndim = iteminfo->ndim + 1;
stginfo->shape = PyMem_Malloc(sizeof(Py_ssize_t) * stginfo->ndim);
stginfo->shape = PyMem_New(Py_ssize_t, stginfo->ndim);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never overflows. ndim < 32.

@@ -3094,7 +3113,7 @@ PyCData_MallocBuffer(CDataObject *obj, StgInfo *info)
/* In python 2.4, and ctypes 0.9.6, the malloc call took about
33% of the creation time for c_int().
*/
obj->b_ptr = (char *)PyMem_Malloc(info->size);
obj->b_ptr = PyMem_New(char, info->size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same.

@@ -4775,7 +4794,7 @@ Array_subscript(PyObject *myself, PyObject *item)
return PyBytes_FromStringAndSize(ptr + start,
slicelen);
}
dest = (char *)PyMem_Malloc(slicelen);
dest = PyMem_New(char, slicelen);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same.

@@ -1192,7 +1195,7 @@ subprocess_fork_exec_impl(PyObject *module, PyObject *process_args,
#endif /* HAVE_SETREUID */
}

c_fds_to_keep = PyMem_Malloc(fds_to_keep_len * sizeof(int));
c_fds_to_keep = PyMem_New(int, fds_to_keep_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It never overflows. fds_to_keep_len is the size of a tuple. sizeof(int) <= sizeof(PyObject *).

@@ -1725,7 +1725,7 @@ prepare_s(PyStructObject *self)
}

/* check for overflow */
if ((ncodes + 1) > ((size_t)PY_SSIZE_T_MAX / sizeof(formatcode))) {
if (ncodes > ((size_t)PY_SSIZE_T_MAX / sizeof(formatcode)) - 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ncodes + 1 does not overflow. ncodes is not longer then the length of str/bytes.

@@ -705,8 +705,7 @@ int PyObject_CopyData(PyObject *dest, PyObject *src)

/* Otherwise a more elaborate copy scheme is needed */

/* XXX(nnorwitz): need to check for overflow! */
indices = (Py_ssize_t *)PyMem_Malloc(sizeof(Py_ssize_t)*view_src.ndim);
indices = PyMem_New(Py_ssize_t, view_src.ndim);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not overflow. ndim < 32.

@@ -484,10 +484,13 @@ _PyObject_Call_Prepend(PyThreadState *tstate, PyObject *callable,
PyObject **stack;

Py_ssize_t argcount = PyTuple_GET_SIZE(args);
if (argcount + 1 <= (Py_ssize_t)Py_ARRAY_LENGTH(small_stack)) {
if (argcount <= (Py_ssize_t)Py_ARRAY_LENGTH(small_stack) - 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argcount + 1 does not overflow. argcount is the length of a tuple.

@@ -3637,8 +3637,7 @@ list_ass_subscript_lock_held(PyObject *_self, PyObject *item, PyObject *value)
step = -step;
}

garbage = (PyObject**)
PyMem_Malloc(slicelength*sizeof(PyObject*));
garbage = PyMem_New(PyObject *, slicelength);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not overflow. slicelength is the length of a slice in existing list. The list already uses at least slicelength*sizeof(PyObject*) bytes.

@bedevere-app
Copy link

bedevere-app bot commented Dec 6, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@picnixz
Copy link
Member Author

picnixz commented Dec 6, 2024

So actually, there are parts that could be removed instead of fortified. I won't work on this one today but my question is: can I actually add the comments so that we know that there won't be someone else wondering "oh maybe we should check for this"?

@picnixz picnixz marked this pull request as draft December 6, 2024 16:28
@serhiy-storchaka
Copy link
Member

I don't like pollution of the code with trivial comments. They actually make reading the code more difficult -- you need to read and comprehend the comment instead of just skipping the presumably correct code. It is responsibility of the contributor to prove that they changes are necessary. The CPython code is mature enough, and most trivial bugs are fixed.

So I suggest you to close this PR and open a new PR with only cases for which you can show at least theoretical possibility of integer overflow.

@picnixz picnixz closed this Dec 6, 2024
@picnixz picnixz deleted the fix/mem/overflow-checks-127681 branch December 6, 2024 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants