It appears that accessing numpy record arrays by field name is significantly slower in numpy 1.10.1. Below is a simple test that illustrates the issue. (I am aware that this particular example is much better accomplished by other means. The point is that array access is slow, not that this is a representative problem.)
The test script is:
#!/usr/bin/env python
import os
import time
import sys
import numpy as np

def test(N=100000, verbose=False):
    d = np.zeros(1, dtype=[('col', 'f8')])
    t0 = time.time()
    for i in xrange(N):
        d['col'] += i
    t0 = time.time() - t0
    if verbose:
        print 'numpy version:', np.version.version
        print 'time: %g' % t0

if __name__ == "__main__":
    if len(sys.argv) > 1:
        N = int(sys.argv[1])
        test(N=N, verbose=True)
    else:
        test(verbose=True)
Here are the running times for different versions of numpy:
numpy version: 1.9.3
time: 0.262786
numpy version: 1.10.1
time: 3.57254
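To put the two timings above on a per-iteration basis, here is a small arithmetic sketch. It assumes both runs used the script's default N = 100000, which the report does not state explicitly; the variable names are only illustrative.

```python
# Timings reported above, in seconds.
# ASSUMPTION: both runs used the script's default N = 100000.
N = 100000
timings = {"1.9.3": 0.262786, "1.10.1": 3.57254}

# Cost of one `d['col'] += i` iteration, in microseconds.
per_iteration_us = {version: t / N * 1e6 for version, t in timings.items()}

# Relative slowdown of 1.10.1 with respect to 1.9.3.
slowdown = timings["1.10.1"] / timings["1.9.3"]

for version in sorted(per_iteration_us):
    print("numpy %s: %.2f microseconds per iteration"
          % (version, per_iteration_us[version]))
print("slowdown: %.1fx" % slowdown)
```

Under that assumption, the regression is a bit more than an order of magnitude per field access.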
@esheldon has reproduced the relative timing differences on Linux; my own tests were run on a Mac.
I profiled the code for v1.10.1 and found this:
3200006 function calls (3000006 primitive calls) in 4.521 seconds

Ordered by: internal time

       ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       200000    1.386    0.000    1.883    0.000  _internal.py:372(_check_field_overlap)
600000/400000    0.751    0.000    0.906    0.000  _internal.py:337(_get_all_field_offsets)
            1    0.566    0.566    4.521    4.521  numpy_test.py:7(test)
       200000    0.401    0.000    3.189    0.000  _internal.py:425(_getfield_is_safe)
       200000    0.350    0.000    3.955    0.000  _internal.py:287(_index_fields)
       200000    0.323    0.000    3.513    0.000  {method 'getfield' of 'numpy.ndarray' objects}
       400000    0.279    0.000    0.279    0.000  {range}
       400000    0.155    0.000    0.155    0.000  {method 'update' of 'set' objects}
       400000    0.106    0.000    0.106    0.000  {method 'append' of 'list' objects}
       200000    0.093    0.000    0.093    0.000  {isinstance}
       200000    0.062    0.000    0.062    0.000  {method 'difference' of 'set' objects}
       200000    0.048    0.000    0.048    0.000  {method 'extend' of 'list' objects}
            1    0.000    0.000    0.000    0.000  {numpy.core.multiarray.zeros}
            1    0.000    0.000    4.521    4.521  <string>:1(<module>)
            2    0.000    0.000    0.000    0.000  {time.time}
            1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
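For anyone who wants to collect a comparable profile, here is a minimal sketch using the standard-library cProfile and pstats modules, sorted by internal time like the listing above. The helper names `field_access_loop` and `profile_by_internal_time` are made up for this illustration, and the exact command used to produce the original listing is not given in this report. The sketch is written for Python 3, so it uses `range` rather than `xrange`.

```python
import cProfile
import io
import pstats

import numpy as np


def field_access_loop(n=1000):
    """Repeatedly update a structured array through its field name."""
    d = np.zeros(1, dtype=[('col', 'f8')])
    for i in range(n):
        d['col'] += i
    return d


def profile_by_internal_time(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report).

    Hypothetical helper for illustration only.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # 'tottime' sorts by time spent inside each function itself,
    # which pstats labels "internal time" in the report header.
    stats.sort_stats('tottime').print_stats(15)
    return result, stream.getvalue()


result, report = profile_by_internal_time(field_access_loop, 1000)
print(report)
```

The rows that show up will depend on the installed numpy version, since the `_internal.py` helpers listed above are specific to the 1.10.1 code path.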
It appears that the new error-checking code added at the Python level (_getfield_is_safe, _check_field_overlap and _get_all_field_offsets in _internal.py) is significantly degrading performance.
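Since the overhead is paid on every field lookup, one possible user-side mitigiation is to look the field up once and reuse the resulting view. The sketch below only illustrates that idea and is not a fix for the regression itself. The function names are hypothetical, and it relies on single-field indexing returning a view of the original array. It is written for Python 3.

```python
import numpy as np


def lookup_in_loop(n):
    """Index the field by name on every iteration (the slow pattern)."""
    d = np.zeros(1, dtype=[('col', 'f8')])
    for i in range(n):
        d['col'] += i
    return d


def lookup_hoisted(n):
    """Index the field once and update through the resulting view."""
    d = np.zeros(1, dtype=[('col', 'f8')])
    col = d['col']  # a view: in-place updates write through to d
    for i in range(n):
        col += i
    return d


a = lookup_in_loop(100)
b = lookup_hoisted(100)
print(a['col'][0], b['col'][0])
```

Both functions produce the same array; the second simply performs the field-name lookup once instead of on every iteration. Timing the two with the script above would show how much of the cost comes from the lookup.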