list_slice() crashes if the list is mutated indirectly by PyList_New() #75348
Python 3.7 git commit 3ca9f50 compiled with afl-clang-fast on Ubuntu 16 x64. The following script triggers undefined behavior followed by a null pointer dereference and a segfault:

```python
import weakref

class A(object):
    pass

def callback(x):
    del lst[0]

keepali0e = []
for i in range(1):
    lst = [str()]
    a = A()
    a.c = a
    keepali0e.append(weakref.ref(a, callback))
    del a
    while lst:
        keepali0e.append(lst[:])
```

Sanitizer output:

```
Objects/dictobject.c:547:12: runtime error: index 16 out of bounds for type 'int8_t [8]'
AddressSanitizer can not provide additional info.
```
Here is some similar code that crashes for the same reasons:

```python
# create a circular reference with a malicious __del__().
class A:
    def __del__(*args):
        del list1[0]

circ_ref_obj = A()
circ_ref_obj._self = circ_ref_obj
list1 = [None]
list2 = []
del circ_ref_obj
while len(list2) < 10000:
    list2.append(list1[:])
```

IIUC, list_slice() first finds the boundaries of the slice and its length, and then calls PyList_New() with that length; the allocation can trigger a garbage collection that runs destructors or weakref callbacks which shrink the source list, so the copy loop then reads past the list's new end.

Maybe we should prevent collection of garbage with circular references (that has __del__() or weakref callbacks) from PyObject_GC_New()?

ISTM there might be a lot of places with similar issues (e.g. if we replace ...).
Oh, and calls to PyObject_GC_NewVar() might also cause similar issues.
Oh, you are right, Oren. It seems this is the only solution.
There are other solutions. I wrote PR 3911, which checks whether the list size changed after PyList_New(). If it did, a RuntimeError exception is raised. We implemented similar checks in the dict type, which raises RuntimeError if the dict is mutated while iterating over it.
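The dict check mentioned above is observable from pure Python; this small example (illustrative, not part of the PR) shows the behavior the proposed list check would mirror:

```python
# Mutating a dict while iterating over it trips CPython's built-in
# consistency check and raises RuntimeError.
d = {"a": 1}
error = None
try:
    for key in d:
        d["b"] = 2  # resize the dict mid-iteration
except RuntimeError as exc:
    error = str(exc)
print(error)  # dictionary changed size during iteration
```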
"Maybe we should prevent collection of garbage with circular references (that has __del__() or weakref callbacks) from PyObject_GC_New()?" That would be a major change in the garbage collector. I would prefer not to touch the GC; any change can introduce a complex regression. Running the GC when an object is allocated makes sense to me: it reclaims memory and prevents a MemoryError that would otherwise be caused only by a "missed" GC run.
This is a different case from mutating a dict while iterating over it. Here the failure is caused by the GC, and it is always hard to handle such issues. In the case of a dict you can simply copy it before iterating. But what can you do with a RuntimeError from slicing a list, if copying a list is vulnerable to the same issue? An example of yet another solution can be seen in dict_keys() in dictobject.c. I always wondered why that code was needed.
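The "copy the dict before iterating" workaround mentioned above looks like this at the Python level (a minimal sketch; the key names are made up for illustration):

```python
# Snapshotting the keys first makes mutation during the loop safe,
# because the iterator walks the snapshot, not the live dict.
d = {"a": 1, "b": 2}
for key in list(d):              # iterate over a copy of the keys
    d[key + "_copy"] = d[key]    # mutating d is now harmless
print(sorted(d))  # ['a', 'a_copy', 'b', 'b_copy']
```

The point of the comment is that no such user-level escape hatch exists for list slicing: the copy operation itself is the vulnerable one.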
This issue was found by fuzzing with a weird destructor. I don't think anyone will hit the bug with "normal" code in the wild. Otherwise, we would have gotten a bug report before this one.
Or the bug is so hard to reproduce that nobody had enough information to report it.
See a4dd011 and its commit message.
We have the same issue with lists. PR 3915 tries to fix it by applying the same solution: calling PyList_New() again if the source container was resized. list_slice() can no longer be considered safe, because it uses the size calculated before calling PyList_New(). Added _PyList_Copy() for copying a list, to replace the unsafe PyList_GetSlice(). PyList_SetSlice() is not safe either (the PR does not fix this yet). Code that uses the combination of PyList_GetSlice() and PyList_SetSlice() for safety (like in _asynciomodule.c) is not safe. Much code, including most implementations of slicing, would have to be rewritten if we go this way. PR 3915 shows only a small example of such changes. I think that changing the garbage collector would be easier.
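The allocate-then-recheck strategy described above can be modeled in pure Python (a hypothetical sketch; `copy_with_retry` and `evil_allocate` are illustrative names, not CPython APIs):

```python
# Model of the PR's retry strategy: snapshot the source length, allocate the
# destination (which, like PyList_New(), may run arbitrary destructors), and
# if the source was resized in the meantime, start over with the new length.
def copy_with_retry(src, allocate):
    while True:
        n = len(src)
        dst = allocate(n)    # may trigger "GC" that mutates src
        if len(src) == n:    # source unchanged: the snapshot is still valid
            dst[:] = src
            return dst

# An allocator that mutates src exactly once, mimicking a destructor
# fired by a collection during allocation.
src = [0, 1, 2]
state = {"fired": False}

def evil_allocate(n):
    if not state["fired"]:
        state["fired"] = True
        del src[0]           # shrink the source behind the caller's back
    return [None] * n

result = copy_with_retry(src, evil_allocate)
print(result)  # [1, 2]
```

The first pass allocates for length 3, detects the shrink to 2, and retries; the second pass succeeds with a consistent length.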
What does "changing" mean exactly? What will the effects on normal code be? How do you know that it will not create new problems that didn't exist before?
I'm not a GC expert. Maybe we should add a global flag that disables calling nontrivial destructors and set it in PyObject_GC_New(). Objects with a nontrivial destructor would be added to a special queue and destroyed in a "safe" place. There may be a problem with determining the "safe" place. I think that any Py_DECREF() is a "safe" place, and the eval loop is a "safe" place.
I think this shouldn't affect Python code if we perform the deferred destruction in the eval loop. It could affect the execution of extensions if reference loops are created in C code. Some objects in loops could live longer than before.
I don't know. But we can try and see what the result is.
Reproduced on 3.11.
Cannot reproduce in 3.12 (but still see it on 3.11).
Python 3.12 has an important GC change: the cyclic GC is no longer run directly from object allocation; instead, a collection is scheduled and performed later from the eval loop, so destructors and weakref callbacks can no longer fire in the middle of C code like list_slice().
cc @pablogsal |
Yeah, this change was made precisely to address this kind of problem, and many more. I'm glad to confirm it's working :)
I propose to close this issue. Backporting the Python 3.12 change to 3.11 is too risky, and I'm not sure that fixing Python 3.11 differently is worth it. This bug has existed for as long as Python has; people have managed to work around it for many years.