
performance regression for record array access in numpy 1.10.1 #6467

Closed
@beckermr

Description


It appears that accessing numpy record arrays by field name is significantly slower in numpy 1.10.1. Below is a simple test that illustrates the issue. (I am aware that this particular example is much better accomplished by other means; the point is that field access is slow, not that this is a representative problem.)

The test script is

#!/usr/bin/env python
import os
import time
import sys
import numpy as np

def test(N=100000,verbose=False):
    d = np.zeros(1,dtype=[('col','f8')])

    t0 = time.time()
    for i in xrange(N):
        d['col'] += i
    t0 = time.time() - t0

    if verbose:
        print 'numpy version:',np.version.version
        print 'time: %g' % t0

if __name__ == "__main__":
    if len(sys.argv) > 1:
        N = int(sys.argv[1])
        test(N=N,verbose=True)
    else:
        test(verbose=True)

Here are the running times for different versions of numpy:

numpy version: 1.9.3
time: 0.262786

numpy version: 1.10.1
time: 3.57254

@esheldon has reproduced the relative timing differences on Linux, in addition to my tests, which were on a Mac.

I profiled the code for v1.10.1 and found this

         3200006 function calls (3000006 primitive calls) in 4.521 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   200000    1.386    0.000    1.883    0.000 _internal.py:372(_check_field_overlap)
600000/400000    0.751    0.000    0.906    0.000 _internal.py:337(_get_all_field_offsets)
        1    0.566    0.566    4.521    4.521 numpy_test.py:7(test)
   200000    0.401    0.000    3.189    0.000 _internal.py:425(_getfield_is_safe)
   200000    0.350    0.000    3.955    0.000 _internal.py:287(_index_fields)
   200000    0.323    0.000    3.513    0.000 {method 'getfield' of 'numpy.ndarray' objects}
   400000    0.279    0.000    0.279    0.000 {range}
   400000    0.155    0.000    0.155    0.000 {method 'update' of 'set' objects}
   400000    0.106    0.000    0.106    0.000 {method 'append' of 'list' objects}
   200000    0.093    0.000    0.093    0.000 {isinstance}
   200000    0.062    0.000    0.062    0.000 {method 'difference' of 'set' objects}
   200000    0.048    0.000    0.048    0.000 {method 'extend' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    4.521    4.521 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 {time.time}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
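For reference, a profile of this shape can be gathered with the standard-library `cProfile`/`pstats` modules. This is a Python 3 sketch of the same hot loop, not the exact command used in the report; the specific call counts and `_internal.py` entries depend on the numpy version:

```python
import cProfile
import io
import pstats

import numpy as np

def test(N=100000):
    # Same hot loop as the script above: repeated field access
    # on a one-element record array.
    d = np.zeros(1, dtype=[('col', 'f8')])
    for i in range(N):
        d['col'] += i

pr = cProfile.Profile()
pr.enable()
test(10000)
pr.disable()

# Print the ten most expensive functions by internal time, matching
# the "Ordered by: internal time" layout shown above.
stream = io.StringIO()
pstats.Stats(pr, stream=stream).sort_stats('tottime').print_stats(10)
print(stream.getvalue())
```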

It appears that new error-checking code added at the Python level (the `_check_field_overlap` / `_getfield_is_safe` machinery in `numpy/core/_internal.py`, per the profile above) is significantly degrading performance, since it runs on every field access.
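One user-level mitigation (my suggestion, not from the report) is to take the field view once outside the loop, so the per-access Python-level checks run only once. Field access on a structured array returns a view that shares memory with the original, so in-place updates on the view write through:

```python
import numpy as np

N = 1000
d = np.zeros(1, dtype=[('col', 'f8')])

# Take the field view once; the safety checks run here, once.
col = d['col']
for i in range(N):
    col += i  # no per-iteration field lookup or checks

# d['col'] now holds sum(range(N)) == N * (N - 1) // 2,
# the same result as updating d['col'] inside the loop.
```

This does not fix the regression itself, but it keeps the overhead out of tight loops.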
