array.tolist() speed enhancement (Trac #1779) #2372
Attachment added by trac user Han on 2011-03-23: callgraph005.jpg |
Attachment added by trac user Han on 2011-03-23: callgraph001.jpg |
Attachment added by trac user Han on 2011-03-23: convert.patch |
Title changed from |
trac user Han wrote on 2011-03-23 Hmmm.. already spotted an issue where this is converted correctly:
But this isn't:
So the patch doesn't cover all bases.. |
trac user Han wrote on 2011-03-23 (Ah, sorry for the noise, that also was not possible in the original conversion, it seems.) |
trac user Han wrote on 2011-03-23 When using multi-dimensional arrays, the advantage of this patch goes away.. Maybe it would be better to improve the conversion mechanism instead.. Converting a 50000x10x10 array 20 times still takes up to 150 seconds with the patch, against 180 seconds without it. (Pure Python takes about 50 seconds using a list comprehension.) |
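The script behind these multi-dimensional timings is not included in the migrated comment; purely as an illustration, the two conversion routes for an array of that shape could be compared along these lines (a hypothetical reconstruction, not the reporter's code):

```python
import numpy as np

a = np.zeros((50000, 10, 10))

# Conversion via ndarray.tolist()
nested = a.tolist()

# Conversion via nested list comprehensions, the pure-Python route
# mentioned in the comment.  Note that the innermost values stay
# NumPy scalars here, whereas tolist() yields plain Python floats.
nested2 = [[list(row) for row in block] for block in a]
```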
@mwiebe wrote on 2011-03-23 I made a pull request which gives some speedup.
|
trac user Han wrote on 2011-03-23 Whoa, that was quick! :-D Your patch also gives a large improvement on multidimensional data:
Very nice, thanks! |
Original ticket http://projects.scipy.org/numpy/ticket/1779 on 2011-03-23 by trac user Han, assigned to unknown.
Hi,
For a while, a small issue has been bugging me: array.tolist() takes a huge amount of time compared to the equivalent operation in plain Python.
To illustrate, here are some timings (on Windows XP):
The conversion of 500 x 100000 elements takes up to 20(!) seconds.
Creating a NumPy array with the same number of items takes 0.5 seconds.
And building an equivalent Python list takes no more than 1 second, which is 20x faster than array.tolist(). So where is this discrepancy coming from?
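The concrete timing commands are not preserved in the migrated ticket; a minimal sketch for reproducing the comparison (array size taken from the text above, absolute numbers will differ by machine and NumPy version) might be:

```python
import timeit
import numpy as np

a = np.zeros((500, 100000))   # 500 x 100000 elements, as in the report

# NumPy array -> nested Python list via tolist()
print("tolist():    ", timeit.timeit(a.tolist, number=1))

# Creating a NumPy array with the same number of items
print("np.zeros:    ", timeit.timeit(lambda: np.zeros((500, 100000)), number=1))

# Building a comparable nested Python list directly
print("pure Python: ",
      timeit.timeit(lambda: [[0.0] * 100000 for _ in range(500)], number=1))
```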
To find out, I did some runs with valgrind on NumPy 1.4.1 dbg (Debian), and used kcachegrind to produce a few calling graphs.
The first thing I noticed was the number of array_alloc and subsequent array_dealloc calls: it adds up to the number of elements in the array! PyArray_NewFromDescr is called once per array element, which creates a lot of overhead.
In NumPy 1.6.1b1, this overhead still exists; it stems from the PyArray_ToList function in convert.c:
For every element in the array, an array_big_item call is made to create a new array containing that element, which is then passed recursively to PyArray_ToList to extract the actual element value.
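The referenced convert.c code is not reproduced in the migrated ticket; as an illustration only, a Python-level model of this recursion (tolist_model is a hypothetical name, not part of NumPy) could look like:

```python
import numpy as np

def tolist_model(a):
    # Hypothetical Python-level model (not NumPy's C code): every a[i]
    # produces a fresh sub-array/scalar object, mirroring the
    # per-element array_big_item / PyArray_NewFromDescr overhead
    # described above.
    if a.ndim == 0:
        return a.item()
    return [tolist_model(a[i]) for i in range(a.shape[0])]

arr = np.arange(12).reshape(3, 4)
assert tolist_model(arr) == arr.tolist()
```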
I added an extra clause to the function to account for 1-dimensional arrays:
Which gets the time down to 2 seconds on Windows.
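The patch itself is only attached, not quoted, in the migrated ticket; in the same hypothetical Python model, an extra 1-D clause in that spirit would correspond to something like:

```python
import numpy as np

def tolist_model_fast(a):
    # Hypothetical model of the patched function: an extra clause for
    # the 1-D case reads items directly, so no intermediate per-element
    # arrays are created at the innermost level.
    if a.ndim == 0:
        return a.item()
    if a.ndim == 1:
        return [a.item(i) for i in range(a.shape[0])]
    return [tolist_model_fast(a[i]) for i in range(a.shape[0])]

arr = np.arange(12).reshape(3, 4)
assert tolist_model_fast(arr) == arr.tolist()
```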
I'm not sure about the patch, though, because it does not handle errors, and it could probably be further optimized or streamlined.
Anyway, I hope it can go into 1.6.0 in one form or another, because it really helps with NumPy->Python conversions!
[attached: calling graphs]