numpy 1.11. Segfault: numpy.random.permutation on list of long strings #7710
Hmm, I don't see this on 64 bit fedora with 16 GiB memory.
What does the 50% reproducibility refer to?
And what is the full Python version? Python 2.7.11 here.
I have 64-bit Ubuntu with 32 GB of memory. If you run it from an interactive session (IPython, for example), you will see the segfault after you quit the session.
With Python 2.7.11 it worked for me too, so I don't know whether it's an issue with numpy or with Python.
OK, I got a segfault, but very unreliably. Ran it once and it happened. Ran it in a loop 1,000,000 times, nada. Not sure what is going on.
The segfault is happening at interpreter shutdown... did you start and then
IIRC, I got a segfault before shutting down.
Clearly this is some sort of memory/GC corruption, so some nondeterminism is to be expected, but it also makes sense that interpreter shutdown would be a particularly likely time to hit the corruption, since when shutting down the interpreter tries to tear down and garbage-collect all objects (and the traceback at the top of this thread shows it being hit during this process in
Ok so I think I've got this one figured out. The issue is when It will pick up the wrong length of the string in the buffer's
Now when we swap a longer element into that buffer it will end up overflowing, almost certainly overwriting something important which will cause the segfault when the garbage collection is run. A simple fix for this would be to explicitly set the
np.random.shuffle will allocate a buffer based on the size of the first element of an array of strings. If the first element is shorter than another element in the array, this buffer will overflow, causing a segfault when garbage is collected. Additionally, if the array contains objects, then one would be left in the buffer and have its refcount erroneously decremented on function exit, causing that object to be deallocated too early. To fix this we change the buffer to be an array of int8 of the size of the array's dtype, which sidesteps both issues. Fixes numpy#7710
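To make the size mismatch described above concrete, here is a minimal sketch in plain Python. It does not reproduce the crash and is not the actual shuffle code; it only compares how many bytes a buffer gets when it is modelled on the first element versus when it is sized from the array's dtype. The variable names (`buf_from_first`, `buf_from_dtype`) are made up for illustration, and it assumes that indexing a fixed-width string array returns a scalar without the trailing padding.

```python
import numpy as np

# Fixed-width byte-string array: every slot is as wide as the longest element.
x = np.array([b"a", b"a much longer string"])

# Buffer modelled on the first element only (illustrative, not numpy's code).
# The scalar x[0] carries no padding, so the buffer is sized for b"a".
buf_from_first = np.empty_like(x[0])

# Raw byte buffer sized from the array's dtype, as the fix describes:
# int8 elements, one per byte of an array slot.
buf_from_dtype = np.empty(x.dtype.itemsize, dtype=np.int8)

print("slot width in x:      ", x.dtype.itemsize)
print("buffer from x[0]:     ", buf_from_first.nbytes)
print("buffer from the dtype:", buf_from_dtype.nbytes)
```

Copying a full-width slot into the smaller buffer is the overflow the comment describes. An int8 buffer also holds no object references, which is presumably why it avoids the refcount problem as well.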
I'm getting a segmentation fault when running numpy.random.permutation on a list of long strings. With short strings it works fine.
NumPy version: 1.11.0
Python 2.7
OS: Ubuntu 14.04.4 LTS
GDB trace:
Reproducibility: ~ 50%