Closed
This is mostly a note that says more about Python not being very efficient when used from C than about our implementation, but adding a keyword argument to a ufunc call costs at least twice as much extra time as a simple Python function would suggest is needed. (Note that where=True and subok=True should just be no-ops; I found this while hoping to speed up routines by passing subok=False...)
In [53]: a = np.arange(2.)
In [54]: %timeit np.add(a, a)
The slowest run took 55.55 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 509 ns per loop
In [55]: %timeit np.add(a, a, subok=True)
The slowest run took 22.16 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 786 ns per loop
In [56]: %timeit np.add(a, a, subok=False)
The slowest run took 22.60 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 758 ns per loop
In [57]: %timeit np.add(a, a, where=True)
The slowest run took 39.37 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 818 ns per loop
For comparison, a simple Python class on the same machine:
In [90]: class B:
    ...:     def __call__(self, subok=True):
    ...:         pass
    ...:
    ...:
In [91]: b = B()
In [92]: %timeit b()
The slowest run took 34.52 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 126 ns per loop
In [93]: %timeit b(subok=True)
The slowest run took 25.05 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 216 ns per loop
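Going by the timings above, the keyword argument adds roughly 786 - 509 = 277 ns to np.add, against 216 - 126 = 90 ns for the plain Python class. The comparison can also be run outside IPython; below is a minimal sketch using only the standard library's timeit module. The helper names best_ns_per_call and keyword_overhead_ns are made up for this illustration, and the absolute numbers will differ between machines and Python versions.

```python
import timeit


class B:
    """Minimal callable mirroring the class in the session above."""

    def __call__(self, subok=True):
        pass


def best_ns_per_call(func, number=200_000, repeat=5):
    """Return the best-of-`repeat` time per call of `func`, in nanoseconds."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number * 1e9


def keyword_overhead_ns(call_plain, call_with_kwarg):
    """Time both callables and return (plain, with_kwarg, difference) in ns."""
    plain = best_ns_per_call(call_plain)
    with_kwarg = best_ns_per_call(call_with_kwarg)
    return plain, with_kwarg, with_kwarg - plain


b = B()
# Both calls are wrapped in a lambda, so the wrapper cost is included in
# each measurement and largely cancels out in the difference.
plain, with_kwarg, extra = keyword_overhead_ns(
    lambda: b(), lambda: b(subok=True)
)
print(f"b():            {plain:.0f} ns per call")
print(f"b(subok=True):  {with_kwarg:.0f} ns per call")
print(f"keyword cost:   {extra:.0f} ns per call")
```

With NumPy installed, the same helper can be pointed at the ufunc, e.g. keyword_overhead_ns(lambda: np.add(a, a), lambda: np.add(a, a, subok=True)) with a = np.arange(2.), to compare the two overheads on one machine.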