
array.tolist() speed enhancement (Trac #1779) #2372


Closed
numpy-gitbot opened this issue Oct 19, 2012 · 10 comments

@numpy-gitbot

Original ticket http://projects.scipy.org/numpy/ticket/1779 on 2011-03-23 by trac user Han, assigned to unknown.

Hi,

For a while, a small issue has been bugging me: array.tolist() takes a huge amount of time compared to plain Python.

To illustrate, here are some timings (on Windows XP):

>>> timeit.timeit('a.tolist()', 'from numpy import arange; a = arange(1e5)', number=500)
19.07303038231646

Converting 100000 elements 500 times takes almost 20(!) seconds.

>>> timeit.timeit('a = arange(1e5)', 'from numpy import arange', number=500)
0.4194663546483639

While creating a NumPy array with the same number of items takes about half a second.

>>> timeit.timeit('a = range(int(1e5))', number=500)
0.97656364422391562

And creating a Python list takes no more than 1 second; that makes array.tolist() roughly 20x slower. So where is this discrepancy coming from?
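(For anyone who wants to reproduce the comparison, here is a self-contained sketch of the same three timings. Absolute numbers will of course differ by machine, Python, and NumPy version; on Python 3, range() is lazy, so list() is added to match the original list-building case.)

# Self-contained sketch of the comparison above (timings vary by machine,
# Python, and NumPy version).
import timeit

t_tolist = timeit.timeit("a.tolist()",
                         "from numpy import arange; a = arange(1e5)",
                         number=500)
t_arange = timeit.timeit("a = arange(1e5)",
                         "from numpy import arange",
                         number=500)
t_list = timeit.timeit("a = list(range(int(1e5)))", number=500)

print("tolist: %.2fs  arange: %.2fs  list(range): %.2fs"
      % (t_tolist, t_arange, t_list))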

To find out, I did some runs with valgrind on NumPy 1.4.1 dbg (Debian), and used kcachegrind to produce a few calling graphs.

The first thing I noticed was the number of array_alloc and subsequent array_dealloc calls. The number of calls equals the number of elements in the array! PyArray_NewFromDescr is called once per array element, which creates a lot of overhead.

In NumPy 1.6.1b1, this overhead still exists; it stems from the PyArray_ToList function in convert.c:

NPY_NO_EXPORT PyObject *
PyArray_ToList(PyArrayObject *self)
{
    PyObject *lp;
    PyArrayObject *v;
    intp sz, i;

    if (!PyArray_Check(self)) {
        return (PyObject *)self;
    }
    if (self->nd == 0) {
        return self->descr->f->getitem(self->data,self);
    }

    sz = self->dimensions[0];
    lp = PyList_New(sz);
    for (i = 0; i < sz; i++) {
        v = (PyArrayObject *)array_big_item(self, i);
        if (PyArray_Check(v) && (v->nd >= self->nd)) {
            PyErr_SetString(PyExc_RuntimeError,
                            "array_item not returning smaller-" \
                            "dimensional array");
            Py_DECREF(v);
            Py_DECREF(lp);
            return NULL;
        }
        PyList_SetItem(lp, i, PyArray_ToList(v));
        Py_DECREF(v);
    }
    return lp;
}

For every element in the array, an array_big_item call is made to create a new array holding just that element, which is then passed recursively to PyArray_ToList to extract the actual element.
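In rough Python terms, that code path behaves something like the sketch below (an illustration of the mechanism only, not the actual C implementation): every element gets wrapped in a temporary array before being converted.

# Illustrative pure-Python rendering of the old recursive PyArray_ToList
# logic (a sketch only; the real implementation is the C function above).
import numpy as np

def tolist_slow(arr):
    if arr.ndim == 0:
        # 0-d array: extract the Python scalar
        return arr.item()
    # np.asarray(arr[i]) mimics the per-element temporary array that
    # array_big_item creates before the recursive call.
    return [tolist_slow(np.asarray(arr[i])) for i in range(arr.shape[0])]

print(tolist_slow(np.arange(5)))   # [0, 1, 2, 3, 4]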

I added an extra clause to the function to account for 1-dimensional arrays:

    if (self->nd == 1) {
        sz = self->dimensions[0];
        lp = PyList_New(sz);
        for (i = 0; i < sz; i++) {
            PyList_SetItem(lp, i, self->descr->f->getitem(index2ptr(self, i),self));
        }
        return lp;
    }

This gets the time down to about 2 seconds on Windows.
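The idea behind the 1-dimensional fast path can be sketched in Python as well (illustration only, not the C patch itself): pull each element out as a Python scalar directly instead of going through a temporary per-element array.

# Python sketch of the 1-d fast-path idea: convert each element straight
# to a Python scalar, skipping the temporary per-element array.
import numpy as np

def tolist_fast_1d(arr):
    assert arr.ndim == 1
    return [arr.item(i) for i in range(arr.shape[0])]

print(tolist_fast_1d(np.arange(5)))   # [0, 1, 2, 3, 4]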

I'm not sure about the patch, though, because it does not account for errors and could probably be further optimized / streamlined.

Anyway, I hope it can go into 1.6.0 in some form or another, because it really helps NumPy->Python conversions!

[attached: calling graphs]

@numpy-gitbot

Attachment added by trac user Han on 2011-03-23: callgraph005.jpg

@numpy-gitbot

Attachment added by trac user Han on 2011-03-23: callgraph001.jpg

@numpy-gitbot

Attachment added by trac user Han on 2011-03-23: convert.patch

@numpy-gitbot

Title changed from array.tolist() enhancement to array.tolist() speed enhancement by trac user Han on 2011-03-23

@numpy-gitbot

trac user Han wrote on 2011-03-23

Hmmm.. already spotted an issue. This is converted correctly:

>>> a = array([[1,2,3,object()],array([5,6,7,8])])
>>> a.tolist()
[[1, 2, 3, <object object at 0x00A515A0>], [5, 6, 7, 8]]

But this isn't:

>>> a = array([[1,2,3,object()],array([[5,6,7,8]])])
>>> a.tolist()
[[1, 2, 3, <object object at 0x00A51598>], array([[5, 6, 7, 8]])]

So the patch doesn't cover all bases..

@numpy-gitbot

trac user Han wrote on 2011-03-23

(Ah, sorry for the noise, that also was not possible in the original conversion, it seems.)

@numpy-gitbot

trac user Han wrote on 2011-03-23

When using multi-dimensional arrays, the advantage of this patch goes away.. Maybe it would be better to improve the conversion mechanism, instead..

Converting a 50000x10x10 array 20 times still takes up to 150 seconds with the patch, against 180 seconds without it. (Pure Python takes about 50 seconds with a list comprehension.)
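(For reference, the pure-Python comparison was along these lines. This is a reconstruction, since the exact benchmark code was not posted, and the nested comprehension returns NumPy scalars rather than Python floats, so it is only roughly equivalent to tolist().)

# Hypothetical reconstruction of the multi-dimensional comparison;
# absolute numbers depend heavily on machine and NumPy version.
import timeit

setup = "from numpy import zeros; a = zeros((50000, 10, 10))"

t_tolist = timeit.timeit("a.tolist()", setup, number=20)
t_comp = timeit.timeit("[[list(row) for row in block] for block in a]",
                       setup, number=20)

print("tolist: %.1fs  list comprehension: %.1fs" % (t_tolist, t_comp))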

@numpy-gitbot

@mwiebe wrote on 2011-03-23

I made a pull request which gives some speedup.

#59

# BASELINE:
In [16]: timeit a = np.arange(100000)
1000 loops, best of 3: 241 us per loop

In [17]: timeit range(100000)
100 loops, best of 3: 2.29 ms per loop

In [18]: timeit a.tolist()
100 loops, best of 3: 14.9 ms per loop

# WITH PATCH
In [3]: timeit a = np.arange(100000)
1000 loops, best of 3: 245 us per loop

In [4]: timeit range(100000)
100 loops, best of 3: 2.4 ms per loop

In [5]: timeit a.tolist()
100 loops, best of 3: 3.57 ms per loop

@numpy-gitbot

trac user Han wrote on 2011-03-23

Whoa, that was quick! :-D

Your patch also gives a large improvement on multidimensional data:

# Before:

>>> timeit.timeit('a.tolist()', 'from numpy import zeros; a = zeros((1e4,10,10))', number=50)
20.638730049133301

# After:

>>> timeit.timeit('a.tolist()', 'from numpy import zeros; a = zeros((1e4,10,10))', number=50)
4.0468051433563232

Very nice, thanks!

@numpy-gitbot

@mwiebe wrote on 2011-03-23

Fixed in c7040ba (in 1.6 branch) and 89292bf (master).
