BUG: min/max is slow, re-implement using NEON (#17989) #20131
This fixes numpy#17989 by adding ARM NEON implementations for min/max and fmin/fmax.

Before: Rosetta faster than native arm64 by `1.2x - 8.6x`.
After: Native arm64 faster than Rosetta by `1.6x - 6.7x`. (2.8x - 15.5x improvement)

**Benchmarks**

```
before after ratio
[b0e1a44] [8301ffd7]
<main> <gh-issue-17989/improve-neon-min-max>
+ 32.6±0.04μs 37.5±0.08μs 1.15 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 2, 1, 'd')
+ 32.6±0.06μs 37.5±0.04μs 1.15 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 1, 'd')
+ 37.8±0.09μs 43.2±0.09μs 1.14 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 4, 4, 'f')
+ 37.7±0.09μs 42.9±0.1μs 1.14 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 2, 'd')
+ 37.9±0.2μs 43.0±0.02μs 1.14 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 2, 2, 'd')
+ 37.7±0.01μs 42.3±1μs 1.12 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'conjugate'>, 2, 2, 'd')
+ 34.2±0.07μs 38.1±0.05μs 1.12 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 4, 2, 'f')
+ 32.6±0.03μs 35.8±0.04μs 1.10 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 4, 1, 'f')
+ 37.1±0.1μs 40.3±0.1μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 1, 2, 'd')
+ 37.2±0.1μs 40.3±0.04μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 4, 'f')
+ 37.1±0.09μs 40.3±0.07μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 1, 2, 'd')
+ 68.6±0.5μs 74.2±0.3μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 4, 4, 'd')
+ 37.1±0.2μs 40.0±0.1μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'conjugate'>, 1, 2, 'd')
+ 2.42±0μs 2.61±0.05μs 1.08 bench_core.CountNonzero.time_count_nonzero_axis(3, 100, <class 'numpy.int16'>)
+ 69.1±0.7μs 73.5±0.7μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'conjugate'>, 4, 4, 'd')
+ 54.7±0.3μs 58.0±0.2μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 2, 4, 'd')
+ 54.5±0.2μs 57.8±0.2μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'conjugate'>, 2, 4, 'd')
+ 3.78±0.04μs 4.00±0.02μs 1.06 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'str'>)
+ 54.8±0.2μs 57.9±0.3μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 4, 'd')
+ 3.68±0.01μs 3.87±0.02μs 1.05 bench_core.CountNonzero.time_count_nonzero_multi_axis(1, 100, <class 'object'>)
+ 69.6±0.2μs 73.1±0.2μs 1.05 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 4, 4, 'd')
+ 229±2μs 241±0.2μs 1.05 bench_random.Bounded.time_bounded('PCG64', [<class 'numpy.uint64'>, 1535])
- 73.0±0.8μs 69.5±0.2μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 4, 4, 'd')
- 37.6±0.1μs 35.7±0.3μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 1, 4, 'f')
- 88.7±0.04μs 84.2±0.7μs 0.95 bench_lib.Pad.time_pad((256, 128, 1), 1, 'wrap')
- 57.9±0.2μs 54.8±0.2μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 2, 4, 'd')
- 39.9±0.2μs 37.2±0.04μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'positive'>, 1, 2, 'd')
- 2.66±0.01μs 2.47±0.01μs 0.93 bench_lib.Nan.time_nanmin(200, 0)
- 2.65±0.02μs 2.46±0.04μs 0.93 bench_lib.Nan.time_nanmin(200, 50.0)
- 2.64±0.01μs 2.45±0.01μs 0.93 bench_lib.Nan.time_nanmax(200, 90.0)
- 2.64±0μs 2.44±0.02μs 0.92 bench_lib.Nan.time_nanmax(200, 0)
- 2.68±0.02μs 2.48±0μs 0.92 bench_lib.Nan.time_nanmax(200, 2.0)
- 40.2±0.01μs 37.1±0.1μs 0.92 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 2, 4, 'f')
- 2.69±0μs 2.47±0μs 0.92 bench_lib.Nan.time_nanmin(200, 2.0)
- 2.70±0.02μs 2.48±0.02μs 0.92 bench_lib.Nan.time_nanmax(200, 0.1)
- 2.70±0μs 2.47±0μs 0.91 bench_lib.Nan.time_nanmin(200, 90.0)
- 2.70±0μs 2.46±0μs 0.91 bench_lib.Nan.time_nanmin(200, 0.1)
- 2.70±0μs 2.42±0.01μs 0.90 bench_lib.Nan.time_nanmax(200, 50.0)
- 11.8±0.6ms 10.6±0.6ms 0.89 bench_core.CountNonzero.time_count_nonzero_axis(2, 1000000, <class 'str'>)
- 42.7±0.1μs 37.8±0.02μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'positive'>, 2, 2, 'd')
- 42.8±0.03μs 37.8±0.2μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rint'>, 2, 2, 'd')
- 43.1±0.2μs 37.7±0.09μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 4, 4, 'f')
- 37.5±0.07μs 32.6±0.06μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rint'>, 2, 1, 'd')
- 41.7±0.03μs 36.3±0.07μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 1, 4, 'd')
- 166±0.8μs 144±1μs 0.87 bench_ufunc.UFunc.time_ufunc_types('fmin')
- 11.6±0.8ms 10.0±0.01ms 0.87 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 1000000, <class 'str'>)
- 167±0.9μs 144±2μs 0.86 bench_ufunc.UFunc.time_ufunc_types('minimum')
- 168±4μs 143±0.5μs 0.85 bench_ufunc.UFunc.time_ufunc_types('fmax')
- 167±1μs 142±0.8μs 0.85 bench_ufunc.UFunc.time_ufunc_types('maximum')
- 7.10±0μs 4.97±0.01μs 0.70 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 2)
- 7.11±0.07μs 4.96±0.01μs 0.70 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 2)
- 7.05±0.07μs 4.68±0μs 0.66 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 4)
- 7.13±0μs 4.68±0.01μs 0.66 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 4)
- 461±0.2μs 297±7μs 0.64 bench_app.MaxesOfDots.time_it
- 7.04±0.07μs 3.95±0μs 0.56 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 2)
- 7.06±0.06μs 3.95±0.01μs 0.56 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 2)
- 7.09±0.06μs 3.24±0μs 0.46 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 1)
- 7.12±0.07μs 3.25±0.02μs 0.46 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 1)
- 14.5±0.02μs 3.98±0μs 0.27 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
- 14.6±0.1μs 4.00±0.01μs 0.27 bench_reduce.MinMax.time_min(<class 'numpy.int64'>)
- 6.88±0.06μs 1.34±0μs 0.19 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 1)
- 7.00±0μs 1.33±0μs 0.19 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 1)
- 39.4±0.01μs 3.95±0.01μs 0.10 bench_reduce.MinMax.time_min(<class 'numpy.float64'>)
- 39.4±0.01μs 3.95±0.02μs 0.10 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
- 254±0.02μs 22.8±0.2μs 0.09 bench_lib.Nan.time_nanmax(200000, 50.0)
- 253±0.1μs 22.7±0.1μs 0.09 bench_lib.Nan.time_nanmin(200000, 0)
- 254±0.06μs 22.7±0.09μs 0.09 bench_lib.Nan.time_nanmin(200000, 2.0)
- 254±0.01μs 22.7±0.03μs 0.09 bench_lib.Nan.time_nanmin(200000, 0.1)
- 254±0.04μs 22.7±0.02μs 0.09 bench_lib.Nan.time_nanmin(200000, 50.0)
- 253±0.1μs 22.7±0.04μs 0.09 bench_lib.Nan.time_nanmax(200000, 0.1)
- 253±0.03μs 22.7±0.04μs 0.09 bench_lib.Nan.time_nanmin(200000, 90.0)
- 253±0.02μs 22.7±0.07μs 0.09 bench_lib.Nan.time_nanmax(200000, 0)
- 254±0.03μs 22.7±0.02μs 0.09 bench_lib.Nan.time_nanmax(200000, 90.0)
- 254±0.09μs 22.7±0.04μs 0.09 bench_lib.Nan.time_nanmax(200000, 2.0)
- 39.2±0.01μs 2.51±0.01μs 0.06 bench_reduce.MinMax.time_max(<class 'numpy.float32'>)
- 39.2±0.01μs 2.50±0.01μs 0.06 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
```

Size change of _multiarray_umath.cpython-39-darwin.so:
Before: 3,890,723
After: 3,924,035
Change: +33,312 (~ +0.856 %)
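For readers who want to re-derive the summary figures, here is a small arithmetic check. It assumes the "ratio" column is after divided by before, which is consistent with the rows shown (for example 37.5 / 32.6 ≈ 1.15); the helper name `speedup` is made up for this sketch.

```python
# Figures copied from the description above.
size_before = 3_890_723
size_after = 3_924_035

delta = size_after - size_before
percent = 100.0 * delta / size_before
print(f"Change: {delta:+,} bytes (~{percent:+.3f} %)")


def speedup(before_us: float, after_us: float) -> float:
    """Inverse of the benchmark 'ratio' column (assumed to be after / before)."""
    return before_us / after_us


# Two rows from the table: float32 and int64 max reductions.
print(f"MinMax.time_max(float32): {speedup(39.2, 2.51):.1f}x faster")
print(f"MinMax.time_max(int64):   {speedup(14.5, 3.98):.1f}x faster")
```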
OT: @Developer-Ecosystem-Engineering I notice that your previous commit messages lack line breaks, which does not work well with text meant to be read in a terminal. Going forward, it would be good to use hard line breaks. Depending on your workflow, it should be possible for a commit to bring up an editor specific to commit messages that takes care of that.
Thanks @Developer-Ecosystem-Engineering! This looks pretty clean and the benchmark and size change numbers are convincing.
There's a lot of code here and this is not my area of expertise, so I hope @seiko2plus, @Qiyu8, @ganesh-k13 or someone else can review in detail and see if this is the right approach.
*/

// Implementation below assumes longlong and ulonglong are 64-bit.
#if @HAVE_NEON_IMPL@
This implementation uses the universal intrinsics framework, but it has to be specific to Neon anyway because of the "longlong is 64-bit" assumption? It's true for some other platforms as well, at least MSVC comes to mind.
> it has to be specific to Neon anyway because of the "longlong is 64-bit" assumption?

Yes, that's correct.
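As a quick way to see what the "longlong is 64-bit" assumption means on a given machine, the following sketch prints the widths of the C integer types using Python's standard `ctypes` module. It only inspects the platform running the script and says nothing about how NumPy itself is built; the `widths` and `longlong_is_64` names are illustrative.

```python
import ctypes

# Width, in bits, of the C integer types on the platform running this script.
widths = {
    "long": ctypes.sizeof(ctypes.c_long) * 8,
    "unsigned long": ctypes.sizeof(ctypes.c_ulong) * 8,
    "long long": ctypes.sizeof(ctypes.c_longlong) * 8,
    "unsigned long long": ctypes.sizeof(ctypes.c_ulonglong) * 8,
}

for name, bits in widths.items():
    print(f"{name:>18}: {bits} bits")

# The assumption discussed above, checked for this platform only.
longlong_is_64 = (
    widths["long long"] == 64 and widths["unsigned long long"] == 64
)
print("long long is 64-bit here:", longlong_is_64)
```

On 64-bit Windows `long` is typically 32 bits while `long long` is 64 bits, whereas on 64-bit Linux and macOS both are typically 64 bits.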
@seiko2plus, @Qiyu8, @ganesh-k13 could you take a look? It seems a future PR could extend this to other architectures. Do the |
Still a rookie here in SIMD :). One thing I noticed is the new file |
This patch requires the modifications mentioned in the following suggestions to be compatible with all architectures. The proposed modifications do not require you to add or support any special intrinsics for these architectures, just redesign the implementation to be more friendly.
However, once you get done with these changes, I will follow it with tweaks/cleanup/benchmark since we already have raw SIMD implementation for max and min located at simd.inc.
static inline npy_intp
simd_reduce_@TYPE@_@kind@(char **args, npy_intp const *dimensions, npy_intp const *steps, npy_intp i)
{
One fact to start from: any ufunc reduction has an identity, which is the value of the first element of the second input array (@seberg, please correct me if I'm wrong). That means the input and output arrays can't be empty; the length is always greater than zero.
In light of the above, your SIMD kernel should use this identity value as the initial value of the final accumulator vector, so that we can loop vector by vector when dealing with small arrays or with SIMD widths larger than 128 bits. See the following example:
// any ufunc reduce always has an identity
// which is the value of the first element of the second input array
assert(len > 0);
const int nlanes = npyv_nlanes_@sfx@;
npyv_@sfx@ acc = npyv_setall_@sfx@(op1[0]); // final accumulator
for (; len >= nlanes*8; len -= nlanes*8, ip += nlanes*8) {
    // unroll goes here
}
for (; len >= nlanes; len -= nlanes, ip += nlanes) {
    acc = npyv_@vop@_@sfx@(npyv_load_@sfx@(ip), acc);
}
npyv_lanetype_@sfx@ r = npyv_reduce_@vop@_@sfx@(acc);
for (; len > 0; --len, ++ip) {
    const npyv_lanetype_@sfx@ a = *ip;
    r = SCALAR_OP(r, a);
}
*op1 = r;
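To make the shape of that loop easier to follow outside the template syntax, here is a plain-Python model of the same structure: broadcast the starting value into every lane, fold in whole vectors, reduce across lanes, then finish with a scalar tail. It is only an illustration; `NLANES` and `simd_style_reduce` are made-up names, lists stand in for vector registers, and the unrolled loop is omitted.

```python
from functools import reduce

NLANES = 4  # hypothetical vector width, e.g. four float32 lanes in 128 bits


def simd_style_reduce(op, start, data):
    """Reduce `data` with the binary `op`, starting from `start`."""
    acc = [start] * NLANES               # broadcast the start value to all lanes
    i, n = 0, len(data)
    while n - i >= NLANES:               # whole vectors
        lanes = data[i:i + NLANES]       # stands in for a vector load
        acc = [op(a, b) for a, b in zip(lanes, acc)]
        i += NLANES
    r = reduce(op, acc)                  # horizontal reduction across lanes
    for x in data[i:]:                   # scalar tail
        r = op(r, x)
    return r


print(simd_style_reduce(min, 5.0, [3.0, 7.0, 1.0, 9.0, 4.0, 0.5]))
print(simd_style_reduce(max, 5.0, [3.0, 7.0, 1.0]))
```

Because every lane starts from the same value, inputs shorter than one vector need no special case: the vector loop is skipped and the scalar tail does all the work.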
Actually... no I do not think you can, reading the 0th element is not actually well defined currently, the macro checking for a "reduction" doesn't ensure that it actually is a reduction (which indeed would have a guaranteed 1-element), but only that it effectively is one.
This is super obscure though, and pretty much implausible to hit (if you are in the "reduce" branch). So the fix is probably that NumPy should guarantee never to call an inner loop with N == 0 (`*dimensions == 0`)?
(This might be one of the reasons NumPy historically "over-allocates" empty arrays, but that habit is only one of those 80% band-aid fixes...)
thanks for the clarification, then we have to test empty arrays.
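A test along those lines could sweep lengths around the vector width and include the zero-length call. The sketch below is a pure-Python stand-in, not NumPy's test suite: `guarded_min_reduce`, `check_lengths`, and `NLANES` are hypothetical names, and the early return models a kernel that never reads the output slot when there is nothing to fold in.

```python
NLANES = 4  # hypothetical vector width


def guarded_min_reduce(out, data):
    """Fold `data` into out[0]; a zero-length call must not touch `out`."""
    if len(data) == 0:
        return                       # early exit: out[0] is never read
    r = out[0]
    for x in data:
        r = x if x < r else r
    out[0] = r


def check_lengths():
    # Lengths around multiples of the vector width, plus the empty case.
    for n in (0, 1, NLANES - 1, NLANES, NLANES + 1, 8 * NLANES, 8 * NLANES + 3):
        data = [float((7 * k) % 11) for k in range(n)]
        out = [100.0]
        guarded_min_reduce(out, data)
        expected = min(data + [100.0])
        assert out[0] == expected, (n, out[0], expected)
    # Even with no output slot at all, a zero-length call must be safe.
    guarded_min_reduce([], [])


check_lengths()
print("all lengths handled")
```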
@ganesh-k13, separating the SIMD kernels into multiple files can increase readability, speed up the build, and reduce the binary size, so I don't see any issue with having a new dispatch-able source.
Yes, but the contributor should emulate any missing intrinsics inside the dispatch-able source using universal intrinsics, so that later we can tweak them and move them to the main interface.
Our docs don't explain how max/fmax/min/fmin treat the sign of zero. Any ideas? Should we unify the behavior across all architectures, or just leave it as is, or let it depend on the behavior of the native instructions?
IIRC, both -0.0 and 0.0 are treated as 0.0 in comparisons, so I guess the question is what should be returned if either is a max or min.
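The point about comparisons can be seen with plain Python floats, which follow IEEE 754 here. This is only an illustration of why the question arises: the two zeros compare equal, the sign bit is still observable, and a min/max built purely on comparisons returns whichever zero its tie-breaking rule picks. The built-in `min`/`max` used below are Python's, not NumPy's.

```python
import math

pos_zero, neg_zero = 0.0, -0.0

print(pos_zero == neg_zero)           # True: the comparison ignores the sign
print(math.copysign(1.0, neg_zero))   # -1.0: the sign bit is still there

# Python's built-in min/max return the first of several equal candidates,
# so the sign of the result depends on the argument order.
print(min(pos_zero, neg_zero), min(neg_zero, pos_zero))
print(max(pos_zero, neg_zero), max(neg_zero, pos_zero))
```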
I don't think we currently have any guarantees, and C99 seems to say that it is undefined for
We are not using C99
If you execute the following code:

import numpy as np
from numpy.core import _simd as simd
from numpy.core._multiarray_umath import __cpu_baseline__ as cpu_baseline

zp = np.array([ 0.], dtype=np.float32).repeat(8)
zn = np.array([-0.], dtype=np.float32).repeat(8)
a = np.ravel(np.column_stack((zp, zn)))
b = np.ravel(np.column_stack((zn, zp)))
reduce = np.concatenate((a, b), axis=None).repeat(2)

print("Operands:")
print("\tfirst:", a)
print("\tsecond:", b)
print("\treduce:", reduce)

print("\nNumPy behaviour:")
print(" CPU Features:", np.lib.utils._opt_info())
print("\tnp.minimum:", np.minimum(a, b))
print("\tnp.maximum:", np.maximum(a, b))
print("\tnp.fmin:", np.fmin(a, b))
print("\tnp.fmax:", np.fmax(a, b))
print(f"\tnp.minimum.reduce:", np.minimum.reduce(reduce))
print(f"\tnp.maximum.reduce:", np.maximum.reduce(reduce))
print(f"\tnp.fmin.reduce:", np.fmin.reduce(reduce))
print(f"\tnp.fmax.reduce:", np.fmax.reduce(reduce))

print(f"\nHW behaviour:")
for k, v in simd.targets.items():
    if k == "baseline":
        fname = f"baseline({', '.join(cpu_baseline)})"
    else:
        fname = k.split('__') # multi-target
        fname = ', '.join(fname)
    if not v:
        print(f"\tescape target {fname}, not supported by current CPU")
        continue
    print(f"\tWith {fname} enabled:")
    print("\t\tmin:", v.min_f32(v.load_f32(a), v.load_f32(b)))
    print("\t\tmax:", v.max_f32(v.load_f32(a), v.load_f32(b)))

The output

On x86

Operands:
first: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
second: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
reduce: [ 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0.
0. 0. -0. -0. 0. 0. -0. -0. 0. 0.]
NumPy behavior:
CPU Features: SSE SSE2 SSE3 SSSE3* SSE41* POPCNT* SSE42* AVX* F16C* FMA3* AVX2* AVX512F? AVX512CD? AVX512_KNL? AVX512_KNM? AVX512_SKX? AVX512_CLX? AVX512_CNL? AVX512_ICL?
np.minimum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.maximum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmin: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmax: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.minimum.reduce: 0.0
np.maximum.reduce: 0.0
np.fmin.reduce: 0.0
np.fmax.reduce: 0.0
HW behavior:
escape target AVX512_SKX, not supported by current CPU
escape target AVX512F, not supported by current CPU
With FMA3, AVX2 enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
With SSE42 enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
With baseline(SSE, SSE2, SSE3) enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>

On x86 (AVX512)

Operands:
first: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
second: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
reduce: [ 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0.
0. 0. -0. -0. 0. 0. -0. -0. 0. 0.]
NumPy behavior:
CPU Features: SSE SSE2 SSE3 SSSE3* SSE41* POPCNT* SSE42* AVX* F16C* FMA3* AVX2* AVX512F* AVX512CD* AVX512_KNL? AVX512_KNM? AVX512_SKX* AVX512_CLX* AVX512_CNL* AVX512_ICL*
np.minimum: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
np.maximum: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
np.fmin: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmax: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.minimum.reduce: 0.0
np.maximum.reduce: 0.0
np.fmin.reduce: 0.0
np.fmax.reduce: 0.0
HW behavior:
With AVX512_SKX enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
With AVX512F enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
With FMA3, AVX2 enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0]>
With SSE42 enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
With baseline(SSE, SSE2, SSE3) enabled:
min: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>
max: <npyv_f32 of [-0.0, 0.0, -0.0, 0.0]>

On aarch64

Operands:
first: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
second: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
reduce: [ 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0.
0. 0. -0. -0. 0. 0. -0. -0. 0. 0.]
NumPy behavior:
CPU Features: NEON NEON_FP16 NEON_VFPV4 ASIMD ASIMDHP? ASIMDDP?
np.minimum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.maximum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmin: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmax: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.minimum.reduce: 0.0
np.maximum.reduce: 0.0
np.fmin.reduce: 0.0
np.fmax.reduce: 0.0
HW behavior:
With baseline(NEON, NEON_FP16, NEON_VFPV4, ASIMD) enabled:
min: <npyv_f32 of [-0.0, -0.0, -0.0, -0.0]>
max: <npyv_f32 of [0.0, 0.0, 0.0, 0.0]>

On ppc64le

Operands:
first: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
second: [-0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0.]
reduce: [ 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. -0. -0. 0. 0.
-0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0. 0. 0. -0. -0.
0. 0. -0. -0. 0. 0. -0. -0. 0. 0.]
NumPy behavior:
CPU Features: VSX VSX2 VSX3*
np.minimum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.maximum: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmin: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.fmax: [ 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0. 0. -0.]
np.minimum.reduce: 0.0
np.maximum.reduce: 0.0
np.fmin.reduce: 0.0
np.fmax.reduce: 0.0
HW behavior:
With VSX3 enabled:
min: <npyv_f32 of [-0.0, -0.0, -0.0, -0.0]>
max: <npyv_f32 of [0.0, 0.0, 0.0, 0.0]>
With baseline(VSX, VSX2) enabled:
min: <npyv_f32 of [-0.0, -0.0, -0.0, -0.0]>
max: <npyv_f32 of [0.0, 0.0, 0.0, 0.0]>
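The "HW behavior" rows above fall into two patterns, which the following sketch models in plain Python. It is a model fitted to the printed results under stated assumptions, not a description of the instructions' full semantics: the x86 rows match a compare-and-select that falls through to the second operand when the operands compare equal, and the aarch64 and ppc64le rows match a rule that orders -0.0 below +0.0. NaN handling is ignored, and all function names are made up.

```python
import math


def is_negative(x):
    """True when the sign bit is set, including for -0.0."""
    return math.copysign(1.0, x) < 0


def compare_select_min(a, b):
    # Equal operands (such as 0.0 and -0.0) fall through to b.
    return a if a < b else b


def compare_select_max(a, b):
    return a if a > b else b


def sign_aware_min(a, b):
    # Orders -0.0 below +0.0 when both operands are zero.
    if a == 0.0 and b == 0.0:
        return -0.0 if is_negative(a) or is_negative(b) else 0.0
    return a if a < b else b


def sign_aware_max(a, b):
    if a == 0.0 and b == 0.0:
        return -0.0 if is_negative(a) and is_negative(b) else 0.0
    return a if a > b else b


first = [0.0, -0.0, 0.0, -0.0]
second = [-0.0, 0.0, -0.0, 0.0]

for name, op in [
    ("compare-select min", compare_select_min),
    ("compare-select max", compare_select_max),
    ("sign-aware min", sign_aware_min),
    ("sign-aware max", sign_aware_max),
]:
    print(f"{name}:", [op(a, b) for a, b in zip(first, second)])
```

With these operands the compare-and-select variants both return the second operand lane by lane, while the sign-aware variants return all -0.0 for min and all 0.0 for max.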
On |
Does this code change the behaviour to enforcing |
Yes but only on
Alright, then no need to unify certain behavior for all architectures.
Thank you all for the feedback, working on a response!
Thank you @seiko2plus for the excellent example. Reorganized code so that it can be used for other architectures. Core implementations and unroll factors should be the same as before for ARM NEON. Beyond reorganizing, we've added default implementations using universal intrinsics for non-ARM-NEON. Additionally, we've moved most min, max, fmin, fmax implementations to a new dispatchable source file: numpy/core/src/umath/loops_minmax.dispatch.c.src

**Testing**
- Apple silicon M1 native (arm64 / aarch64) -- No test failures
- Apple silicon M1 Rosetta (x86_64) -- No new test failures
- iMacPro1,1 (AVX512F) -- No test failures

**Benchmarks**
- Apple silicon M1 native (arm64 / aarch64)
  - Similar improvements as before reorg (comparison below)
- x86_64 (both Apple silicon M1 Rosetta and iMacPro1,1 AVX512F)
  - Some x86_64 benchmarks are better, some are worse
Apple silicon M1 native (arm64 / aarch64) comparison to original implementation / before reorg:

M1 benchmark

before after ratio
[559ddede] [a3463b09]
<gh-issue-17989/improve-neon-min-max> <gh-issue-17989/feedback/round-1>
+ 6.45±0.04μs 7.07±0.09μs 1.10 bench_lib.Nan.time_nanargmin(200, 0.1)
+ 32.1±0.3μs 35.2±0.2μs 1.10 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 2, 1, 'd')
+ 29.1±0.02μs 31.8±0.05μs 1.10 bench_core.Core.time_array_int_l1000
+ 69.0±0.2μs 75.3±3μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 2, 4, 'f')
+ 92.0±1μs 99.5±0.5μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 4, 4, 'd')
+ 9.29±0.1μs 9.99±0.5μs 1.08 bench_ma.UFunc.time_1d(True, True, 10)
+ 338±0.6μs 362±10μs 1.07 bench_function_base.Sort.time_sort('quick', 'int16', ('random',))
+ 4.21±0.03μs 4.48±0.2μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'str'>)
+ 12.3±0.06μs 13.1±0.7μs 1.06 bench_function_base.Median.time_even_small
+ 1.27±0μs 1.35±0.06μs 1.06 bench_itemselection.PutMask.time_dense(False, 'float16')
+ 139±1ns 147±6ns 1.06 bench_core.Core.time_array_1
+ 33.7±0.01μs 35.5±2μs 1.05 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'reciprocal'>, 2, 4, 'f')
+ 69.4±0.1μs 73.1±0.2μs 1.05 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 4, 4, 'f')
+ 225±0.09μs 237±9μs 1.05 bench_random.Bounded.time_bounded('PCG64', [<class 'numpy.uint32'>, 2047])
- 15.7±0.5μs 14.9±0.03μs 0.95 bench_core.CountNonzero.time_count_nonzero_axis(2, 10000, <class 'numpy.int64'>)
- 34.2±2μs 32.0±0.03μs 0.94 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 4, 2, 'f')
- 1.03±0.05ms 955±3μs 0.92 bench_lib.Nan.time_nanargmax(200000, 50.0)
- 6.97±0.08μs 6.43±0.02μs 0.92 bench_ma.UFunc.time_scalar(True, False, 10)
- 5.41±0μs 4.98±0.01μs 0.92 bench_ufunc_strides.AVX_cmplx_arithmetic.time_ufunc('subtract', 2, 'F')
- 22.4±0.01μs 20.6±0.02μs 0.92 bench_core.Core.time_array_float64_l1000
- 1.51±0.01ms 1.38±0ms 0.92 bench_core.CorrConv.time_correlate(1000, 10000, 'same')
- 10.1±0.2μs 9.27±0.09μs 0.92 bench_ufunc.UFunc.time_ufunc_types('invert')
- 8.50±0.02μs 7.80±0.09μs 0.92 bench_indexing.ScalarIndexing.time_assign_cast(1)
- 29.5±0.2μs 26.6±0.03μs 0.90 bench_ma.Concatenate.time_it('masked', 100)
- 2.09±0.02ms 1.87±0ms 0.90 bench_ma.UFunc.time_2d(True, True, 1000)
- 298±10μs 267±0.3μs 0.89 bench_app.MaxesOfDots.time_it
- 10.7±0.2μs 9.60±0.02μs 0.89 bench_ma.UFunc.time_1d(True, True, 100)
- 567±3μs 505±2μs 0.89 bench_lib.Nan.time_nanargmax(200000, 90.0)
- 342±0.9μs 282±5μs 0.83 bench_lib.Nan.time_nanargmax(200000, 2.0)
- 307±0.7μs 244±0.8μs 0.80 bench_lib.Nan.time_nanargmax(200000, 0.1)
- 309±1μs 241±0.1μs 0.78 bench_lib.Nan.time_nanargmax(200000, 0)

AVX512F min/max compare

before after ratio
[b0e1a445] [f62fb2bf]
<main> <gh-issue-17989/feedback/round-1>
+ 10.6±0.1μs 144±100μs 13.60 bench_ufunc.UFunc.time_ufunc_types('bitwise_not')
+ 16.6±0.2μs 140±100μs 8.44 bench_ufunc.UFunc.time_ufunc_types('bitwise_or')
+ 5.33±0.09μs 19.5±0.4μs 3.65 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 4)
+ 5.13±0.04μs 17.9±0.2μs 3.50 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 2)
+ 5.35±0.05μs 17.0±0.9μs 3.18 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 4)
+ 5.23±0.1μs 15.1±0.2μs 2.88 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 2)
+ 6.62±0.1μs 18.7±0.4μs 2.83 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 2)
+ 6.63±0.1μs 18.7±0.3μs 2.82 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 2)
+ 190±2μs 516±200μs 2.72 bench_ufunc.UFunc.time_ufunc_types('negative')
+ 7.49±0.09μs 19.5±0.4μs 2.61 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 4)
+ 7.46±0.07μs 18.7±0.2μs 2.50 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 4)
+ 122±1μs 250±100μs 2.06 bench_indexing.Indexing.time_op('indexes_', ':,I', '')
+ 271±3μs 550±300μs 2.03 bench_ufunc.UFunc.time_ufunc_types('less_equal')
+ 654±20μs 962±30μs 1.47 bench_ufunc.UFunc.time_ufunc_types('sqrt')
+ 3.46±0.05μs 4.88±0.07μs 1.41 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 1)
+ 3.47±0.04μs 4.85±0.03μs 1.40 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 1)
+ 110±3μs 152±5μs 1.38 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 100))
+ 1.63±0.02μs 2.26±0.2μs 1.38 bench_itemselection.PutMask.time_sparse(False, 'longfloat')
+ 34.6±0.4μs 46.8±0.9μs 1.35 bench_function_base.Sort.time_argsort('heap', 'float64', ('uniform',))
+ 1.66±0.02μs 2.25±0.09μs 1.35 bench_itemselection.PutMask.time_sparse(False, 'complex128')
+ 129±2μs 173±4μs 1.34 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 100))
+ 7.67±0.06μs 10.1±1μs 1.32 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'str'>)
+ 37.9±0.3μs 49.7±1μs 1.31 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'bool'>)
+ 75.2±0.8μs 97.6±0.8μs 1.30 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
+ 1.61±0.01μs 2.09±0.1μs 1.30 bench_itemselection.PutMask.time_sparse(False, 'float16')
+ 27.7±0.6μs 35.8±1μs 1.30 bench_function_base.Sort.time_sort('merge', 'int16', ('sorted_block', 100))
+ 38.2±0.1μs 48.9±0.7μs 1.28 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'numpy.int8'>)
+ 1.07±0.01μs 1.36±0.05μs 1.28 bench_itemselection.PutMask.time_sparse(True, 'float32')
+ 1.61±0.02μs 2.05±0.1μs 1.27 bench_itemselection.PutMask.time_sparse(False, 'int16')
+ 5.41±0.04μs 6.87±0.09μs 1.27 bench_core.UnpackBits.time_unpackbits_little
+ 64.4±0.4μs 81.6±5μs 1.27 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 1000))
+ 1.10±0.03μs 1.39±0.07μs 1.26 bench_itemselection.PutMask.time_sparse(True, 'complex64')
+ 1.95±0.04μs 2.45±0.07μs 1.26 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'numpy.int8'>)
+ 71.0±2μs 89.0±1μs 1.25 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 1000))
+ 28.8±0.6μs 36.0±0.7μs 1.25 bench_function_base.Sort.time_sort('merge', 'int16', ('sorted_block', 10))
+ 120±6μs 151±6μs 1.25 bench_function_base.Sort.time_sort('quick', 'float64', ('uniform',))
+ 1.08±0.02μs 1.35±0.08μs 1.25 bench_itemselection.PutMask.time_sparse(True, 'int32')
+ 7.92±0.1μs 9.82±1μs 1.24 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
+ 1.95±0.05ms 2.39±0.1ms 1.23 bench_linalg.Eindot.time_tensordot_a_b_axes_1_0_0_1
+ 1.10±0.05μs 1.35±0.07μs 1.22 bench_itemselection.PutMask.time_sparse(True, 'float64')
+ 137±0.7μs 167±6μs 1.22 bench_core.CountNonzero.time_count_nonzero(3, 1000000, <class 'numpy.int8'>)
+ 62.3±0.7μs 75.4±3μs 1.21 bench_ufunc.UFunc.time_ufunc_types('signbit')
+ 1.69±0.03μs 2.05±0.09μs 1.21 bench_itemselection.PutMask.time_dense(False, 'int16')
+ 1.10±0.01μs 1.33±0.1μs 1.21 bench_itemselection.PutMask.time_sparse(True, 'int64')
+ 1.98±0.06μs 2.38±0.1μs 1.21 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'bool'>)
+ 1.78±0.02μs 2.15±0.1μs 1.20 bench_itemselection.PutMask.time_dense(False, 'complex128')
+ 380±3ns 458±9ns 1.20 bench_array_coercion.ArrayCoercionSmall.time_asanyarray_dtype(1)
+ 28.7±3μs 34.5±0.5μs 1.20 bench_function_base.Sort.time_sort('merge', 'int16', ('random',))
+ 1.18±0.01μs 1.42±0.03μs 1.20 bench_itemselection.PutMask.time_sparse(True, 'complex128')
+ 24.0±0.3μs 28.7±3μs 1.19 bench_ma.UFunc.time_scalar_1d(False, True, 100)
+ 12.5±0.2ms 14.9±0.2ms 1.19 bench_lib.Unique.time_unique(200000, 2.0)
+ 181±3μs 216±4μs 1.19 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 10))
+ 6.48±0.09ms 7.70±0.5ms 1.19 bench_lib.Unique.time_unique(200000, 90.0)
+ 168±1μs 199±100μs 1.18 bench_ufunc.UFunc.time_ufunc_types('isnan')
+ 164±1μs 194±8μs 1.18 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'object'>)
+ 10.3±0.09μs 12.1±1μs 1.18 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'object'>)
+ 196±4μs 231±4μs 1.18 bench_core.CountNonzero.time_count_nonzero_multi_axis(1, 10000, <class 'object'>)
+ 10.9±0.04ms 12.8±0.2ms 1.18 bench_core.CountNonzero.time_count_nonzero(2, 1000000, <class 'object'>)
+ 55.2±0.3μs 64.9±2μs 1.18 bench_core.CountNonzero.time_count_nonzero(1, 10000, <class 'object'>)
+ 1.53±0.02μs 1.80±0.04μs 1.18 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'bool'>)
+ 10.6±0.2μs 12.5±0.1μs 1.17 bench_indexing.ScalarIndexing.time_assign_cast(0)
+ 1.52±0.01μs 1.78±0.02μs 1.17 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'numpy.int8'>)
+ 72.5±0.2μs 84.9±2μs 1.17 bench_core.CountNonzero.time_count_nonzero_axis(1, 10000, <class 'str'>)
+ 1.29±0.03μs 1.51±0.06μs 1.17 bench_itemselection.PutMask.time_dense(True, 'float64')
+ 11.2±0.1μs 13.1±0.3μs 1.17 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'negative'>, 1, 1, 'f')
+ 4.12±0.03μs 4.80±0.2μs 1.17 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'int64')
+ 31.3±0.4μs 36.4±2μs 1.16 bench_function_base.Sort.time_sort('merge', 'int16', ('reversed',))
+ 1.64±0.02μs 1.90±0.03μs 1.16 bench_core.Core.time_ones_100
+ 1.27±0.04μs 1.47±0.09μs 1.16 bench_itemselection.PutMask.time_dense(True, 'int64')
+ 64.2±2μs 74.3±2μs 1.16 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 1, 4, 'f')
+ 5.54±0.09μs 6.41±0.4μs 1.16 bench_indexing.ScalarIndexing.time_index(0)
+ 16.3±0.06ms 18.8±0.5ms 1.15 bench_core.CountNonzero.time_count_nonzero(3, 1000000, <class 'object'>)
+ 12.2±0.1ms 14.1±1ms 1.15 bench_lib.Unique.time_unique(200000, 0)
+ 626±20ns 723±30ns 1.15 bench_array_coercion.ArrayCoercionSmall.time_asanyarray([1])
+ 109±0.5μs 125±2μs 1.15 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'object'>)
+ 99.6±0.3μs 115±5μs 1.15 bench_linalg.Einsum.time_einsum_contig_outstride0(<class 'numpy.float32'>)
+ 132±2μs 152±3μs 1.15 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 1, 'd')
+ 4.07±0.02μs 4.68±0.2μs 1.15 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'longfloat')
+ 1.88±0.02μs 2.16±0.1μs 1.15 bench_itemselection.PutMask.time_sparse(False, 'int64')
+ 1.30±0.01μs 1.49±0.06μs 1.15 bench_ufunc.ArgParsingReduce.time_add_reduce_arg_parsing((array([0., 1.])))
+ 384±1ns 440±10ns 1.15 bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype(1)
+ 368±10ns 422±10ns 1.15 bench_array_coercion.ArrayCoercionSmall.time_array_subok(1)
+ 494±7μs 566±20μs 1.15 bench_linalg.Einsum.time_einsum_mul(<class 'numpy.float32'>)
+ 12.4±0.2ms 14.2±0.9ms 1.15 bench_lib.Unique.time_unique(200000, 0.1)
+ 1.20±0.02μs 1.37±0.06μs 1.14 bench_itemselection.PutMask.time_sparse(True, 'longfloat')
+ 1.31±0.01μs 1.50±0.03μs 1.14 bench_ufunc.ArgParsingReduce.time_add_reduce_arg_parsing((array([0., 1.]), 0, None))
+ 138±1μs 158±3μs 1.14 bench_core.CountNonzero.time_count_nonzero(3, 1000000, <class 'bool'>)
+ 1.87±0.06μs 2.14±0.1μs 1.14 bench_itemselection.PutMask.time_sparse(False, 'complex64')
+ 10.00±0.06ms 11.4±0.6ms 1.14 bench_lib.Unique.time_unique(200000, 50.0)
+ 390±2ns 445±6ns 1.14 bench_array_coercion.ArrayCoercionSmall.time_asanyarray_dtype(5)
+ 1.89±0.04μs 2.15±0.1μs 1.14 bench_itemselection.PutMask.time_sparse(False, 'float64')
+ 22.1±0.2ms 25.1±0.5ms 1.14 bench_io.LoadtxtCSVdtypes.time_loadtxt_dtypes_csv('int64', 10000)
+ 4.72±0.06μs 5.36±0.3μs 1.14 bench_itemselection.Take.time_contiguous((1000, 1), 'clip', 'int32')
+ 336±3ns 382±10ns 1.14 bench_array_coercion.ArrayCoercionSmall.time_asanyarray(1)
+ 438±5ns 497±7ns 1.14 bench_ufunc.Scalar.time_add_scalar
+ 8.52±0.1μs 9.68±0.3μs 1.14 bench_indexing.ScalarIndexing.time_index(2)
+ 673±10ns 764±30ns 1.13 bench_ufunc.ArgParsing.time_add_arg_parsing((array(1.), array(2.), subok=True))
+ 5.55±0.09μs 6.29±0.4μs 1.13 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'int16')
+ 4.07±0.06μs 4.62±0.2μs 1.13 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'complex128')
+ 132±2μs 149±2μs 1.13 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'fabs'>, 1, 1, 'd')
+ 78.5±1μs 88.7±1μs 1.13 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'absolute'>, 1, 4, 'f')
+ 707±6μs 798±60μs 1.13 bench_lib.Pad.time_pad((1024, 1024), 1, 'wrap')
+ 5.33±0.06μs 6.01±0.4μs 1.13 bench_itemselection.Take.time_contiguous((1000, 3), 'clip', 'int16')
+ 5.77±0.03μs 6.50±0.3μs 1.13 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'str'>)
+ 154±3μs 173±7μs 1.13 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 10))
+ 5.53±0.08ms 6.22±0.04ms 1.13 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'object'>)
+ 221±3μs 248±6μs 1.12 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 10000, <class 'str'>)
+ 367±10ns 411±20ns 1.12 bench_array_coercion.ArrayCoercionSmall.time_ascontiguousarray(5)
+ 2.79±0.01μs 3.13±0.08μs 1.12 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 1)
+ 66.1±1μs 74.1±0.8μs 1.12 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'ceil'>, 2, 2, 'f')
+ 3.35±0.03μs 3.75±0.07μs 1.12 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'int32')
+ 91.1±2μs 102±2μs 1.12 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'square'>, 1, 2, 'd')
+ 23.3±0.3ms 26.1±0.5ms 1.12 bench_trim_zeros.TrimZeros.time_trim_zeros(dtype('bool'), 30000)
+ 237±1ms 265±4ms 1.12 bench_trim_zeros.TrimZeros.time_trim_zeros(dtype('bool'), 300000)
+ 396±4ns 443±20ns 1.12 bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype(5)
+ 13.3±0.1μs 14.8±0.5μs 1.12 bench_core.CountNonzero.time_count_nonzero_axis(1, 10000, <class 'numpy.int64'>)
+ 4.70±0.08μs 5.24±0.4μs 1.12 bench_itemselection.Take.time_contiguous((1000, 2), 'clip', 'int16')
+ 96.4±2μs 107±1μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 4, 2, 'f')
+ 4.70±0.07μs 5.24±0.3μs 1.11 bench_itemselection.Take.time_contiguous((1000, 1), 'clip', 'float32')
+ 5.36±0.08μs 5.97±0.4μs 1.11 bench_itemselection.Take.time_contiguous((1000, 3), 'clip', 'float16')
+ 487±6ns 542±20ns 1.11 bench_array_coercion.ArrayCoercionSmall.time_array_all_kwargs(5)
+ 165±2μs 184±9μs 1.11 bench_lib.Pad.time_pad((256, 128, 1), 1, 'reflect')
+ 305±2ns 339±20ns 1.11 bench_array_coercion.ArrayCoercionSmall.time_asarray(5)
+ 5.13±0.04μs 5.70±0.4μs 1.11 bench_ma.Indexing.time_1d(False, 2, 100)
+ 3.18±0.06μs 3.53±0.09μs 1.11 bench_ufunc_strides.AVX_ldexp.time_ufunc('d', 1)
+ 8.71±0.1μs 9.66±0.6μs 1.11 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'clip', 'int32')
+ 11.7±0.3μs 13.0±0.3μs 1.11 bench_lib.Unique.time_unique(200, 2.0)
+ 389±7μs 431±30μs 1.11 bench_linalg.Eindot.time_einsum_i_ij_j
+ 1.81±0.01μs 2.00±0.09μs 1.11 bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype(range(0, 3))
+ 132±2μs 146±1μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 1, 'd')
+ 5.59±0.03μs 6.18±0.3μs 1.11 bench_ma.Indexing.time_1d(True, 1, 10)
+ 649±4μs 718±20μs 1.11 bench_core.VarComplex.time_var(100000)
+ 1.16±0μs 1.28±0.03μs 1.11 bench_core.CountNonzero.time_count_nonzero(1, 10000, <class 'bool'>)
+ 619±6μs 684±10μs 1.11 bench_function_base.Sort.time_argsort('merge', 'float64', ('random',))
+ 10.3±0.1μs 11.3±0.3μs 1.10 bench_function_base.Sort.time_sort('merge', 'float64', ('reversed',))
+ 5.47±0.06μs 6.04±0.4μs 1.10 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'float16')
+ 6.18±0.09μs 6.81±0.3μs 1.10 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'object'>)
+ 4.37±0.03μs 4.82±0.09μs 1.10 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'numpy.int64'>)
+ 1.78±0.01μs 1.96±0.02μs 1.10 bench_core.CountNonzero.time_count_nonzero(2, 100, <class 'object'>)
+ 46.1±0.4μs 50.8±1μs 1.10 bench_shape_base.Block2D.time_block2d((256, 256), 'uint16', (4, 4))
+ 234±0.8μs 258±50μs 1.10 bench_ufunc.UFunc.time_ufunc_types('sign')
+ 397±1μs 437±20μs 1.10 bench_random.Random.time_rng('binomial 10 0.5')
+ 223±1μs 245±7μs 1.10 bench_core.CountNonzero.time_count_nonzero_axis(3, 10000, <class 'str'>)
+ 5.12±0.09μs 5.61±0.3μs 1.10 bench_ma.Indexing.time_1d(False, 1, 10)
+ 452±2ns 495±5ns 1.10 bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype(array([5]))
+ 120±1μs 131±0.9μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'reciprocal'>, 4, 4, 'f')
+ 3.45±0.05μs 3.78±0.1μs 1.09 bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int(<class 'numpy.int32'>, -43)
+ 13.8±0.2μs 15.1±0.4μs 1.09 bench_ma.UFunc.time_2d(False, True, 10)
+ 77.8±1μs 84.9±0.4μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rint'>, 1, 4, 'f')
+ 91.7±0.8μs 100.0±0.8μs 1.09 bench_core.CountNonzero.time_count_nonzero(2, 1000000, <class 'numpy.int8'>)
+ 153±0.7μs 167±2μs 1.09 bench_core.CountNonzero.time_count_nonzero_axis(2, 10000, <class 'str'>)
+ 800±10μs 872±40μs 1.09 bench_lib.Pad.time_pad((1024, 1024), (0, 32), 'edge')
+ 5.65±0.03μs 6.16±0.6μs 1.09 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
+ 297±7μs 324±4μs 1.09 bench_core.UnpackBits.time_unpackbits_axis1_little
+ 4.36±0.05μs 4.73±0.05μs 1.09 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'numpy.int8'>)
+ 2.33±0.04μs 2.53±0.04μs 1.09 bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int(<class 'numpy.int16'>, 43)
+ 5.35±0.05μs 5.80±0.2μs 1.08 bench_ma.Indexing.time_0d(False, 1, 10)
+ 14.2±0.1μs 15.4±0.5μs 1.08 bench_ma.Concatenate.time_it('ndarray', 2)
+ 91.7±1μs 99.3±1μs 1.08 bench_core.CountNonzero.time_count_nonzero(2, 1000000, <class 'bool'>)
+ 6.48±0.1μs 7.01±0.6μs 1.08 bench_lib.Nan.time_nancumprod(200, 90.0)
+ 1.17±0.01μs 1.27±0.02μs 1.08 bench_core.CountNonzero.time_count_nonzero(1, 10000, <class 'numpy.int8'>)
+ 4.59±0.04μs 4.96±0.08μs 1.08 bench_core.CountNonzero.time_count_nonzero_axis(2, 100, <class 'numpy.int16'>)
+ 14.3±0.2ms 15.4±0.2ms 1.08 bench_linalg.Eindot.time_einsum_ij_jk_a_b
+ 337±6μs 363±8μs 1.08 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'log'>, 2, 1, 'd')
+ 2.80±0.01ms 3.01±0.07ms 1.08 bench_ufunc.UFunc.time_ufunc_types('arctanh')
+ 59.1±0.3μs 63.7±0.5μs 1.08 bench_shape_base.Block2D.time_block2d((512, 512), 'uint8', (4, 4))
+ 515±2ns 554±6ns 1.08 bench_ufunc.ArgParsing.time_add_arg_parsing((array(1.), array(2.), array(3.)))
+ 2.80±0.06ms 3.01±0.05ms 1.08 bench_ufunc.UFunc.time_ufunc_types('sinh')
+ 3.93±0.01μs 4.23±0.1μs 1.07 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'bool'>)
+ 3.68±0.04s 3.96±0.02s 1.07 bench_ufunc_strides.Mandelbrot.time_mandel
+ 621±10ns 667±20ns 1.07 bench_ufunc.Scalar.time_add_scalar_conv
+ 5.11±0.09μs 5.49±0.2μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'numpy.int64'>)
+ 337±3μs 362±5μs 1.07 bench_ufunc.UFunc.time_ufunc_types('spacing')
+ 549±10ns 589±2ns 1.07 bench_array_coercion.ArrayCoercionSmall.time_array_invalid_kwarg(5)
+ 1.22±0.01μs 1.30±0.03μs 1.07 bench_core.CountNonzero.time_count_nonzero(1, 100, <class 'object'>)
+ 551±6ns 590±8ns 1.07 bench_core.Core.time_array_l1
+ 2.63±0.03μs 2.81±0.05μs 1.07 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'numpy.int16'>)
+ 194±2μs 207±3μs 1.07 bench_ufunc.UFunc.time_ufunc_types('isinf')
+ 825±10ms 880±10ms 1.07 bench_trim_zeros.TrimZeros.time_trim_zeros(dtype('int64'), 300000)
+ 243±3ns 259±4ns 1.07 bench_array_coercion.ArrayCoercionSmall.time_array_no_copy(array([5]))
+ 543±10ns 578±3ns 1.06 bench_array_coercion.ArrayCoercionSmall.time_array_invalid_kwarg(1)
+ 7.53±0.02μs 8.02±0.2μs 1.06 bench_core.CountNonzero.time_count_nonzero_axis(3, 100, <class 'str'>)
+ 4.78±0.03μs 5.08±0.08μs 1.06 bench_core.CorrConv.time_correlate(1000, 10, 'same')
+ 577±3ns 614±6ns 1.06 bench_ufunc.ArgParsing.time_add_arg_parsing((array(1.), array(2.)))
+ 2.44±0.03μs 2.59±0.03μs 1.06 bench_core.CountNonzero.time_count_nonzero(3, 100, <class 'object'>)
+ 150±2μs 159±1μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 1, 'd')
+ 370±2μs 391±6μs 1.06 bench_core.CountNonzero.time_count_nonzero_axis(2, 10000, <class 'object'>)
+ 530±10μs 559±4μs 1.06 bench_function_base.Sort.time_sort('merge', 'float64', ('random',))
+ 135±0.6ms 142±8ms 1.05 bench_core.CorrConv.time_convolve(100000, 10000, 'full')
+ 58.5±0.2μs 61.4±0.6μs 1.05 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 1, 1, 'd')
- 154±2μs 145±1μs 0.94 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 4, 'f')
- 4.17±0.04μs 3.93±0.05μs 0.94 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'int32')
- 5.33±0.1μs 5.01±0.04μs 0.94 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'numpy.int32'>)
- 4.16±0.04μs 3.91±0.06μs 0.94 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'float32')
- 13.0±0.09μs 12.2±0.1μs 0.94 bench_core.PackBits.time_packbits_axis1(<class 'bool'>)
- 195±0.5μs 182±6μs 0.93 bench_reduce.AddReduceSeparate.time_reduce(0, 'float32')
- 121±2μs 113±2μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sin'>, 2, 2, 'f')
- 51.6±0.8μs 47.6±0.5μs 0.92 bench_shape_base.Block.time_block_simple_row_wise(100)
- 775±10μs 713±8μs 0.92 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sin'>, 2, 2, 'd')
- 130±1μs 119±2μs 0.92 bench_shape_base.Block2D.time_block2d((1024, 1024), 'uint8', (2, 2))
- 298±3μs 272±4μs 0.91 bench_core.UnpackBits.time_unpackbits_axis1
- 495±5μs 453±9μs 0.91 bench_function_base.Sort.time_sort('heap', 'float64', ('ordered',))
- 232±3μs 210±8μs 0.90 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 4, 2, 'd')
- 292±2μs 263±1μs 0.90 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 1000))
- 153±1μs 138±1μs 0.90 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 4, 1, 'f')
- 388±3μs 346±6μs 0.89 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 100))
- 398±5μs 355±6μs 0.89 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 10))
- 696±3μs 621±10μs 0.89 bench_function_base.Sort.time_argsort('heap', 'float64', ('sorted_block', 10))
- 97.0±3μs 86.4±0.7μs 0.89 bench_function_base.Sort.time_sort('quick', 'int16', ('uniform',))
- 312±8μs 277±1μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'fabs'>, 4, 4, 'd')
- 180±4μs 160±1μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 4, 4, 'f')
- 74.6±0.9μs 66.0±0.6μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 2, 'f')
- 155±2μs 137±1μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sin'>, 4, 4, 'f')
- 103±0.4μs 91.1±1μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 2, 4, 'f')
- 490±8μs 431±6μs 0.88 bench_function_base.Sort.time_argsort('quick', 'int64', ('random',))
- 152±3μs 134±1μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 2, 'f')
- 146±0.5μs 128±2μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 2, 'f')
- 105±0.8μs 91.7±0.3μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'reciprocal'>, 2, 4, 'f')
- 4.17±0.09μs 3.65±0.2μs 0.88 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'longfloat')
- 4.10±0.05μs 3.59±0.3μs 0.87 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'complex64')
- 147±2μs 128±2μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 2, 'f')
- 148±2μs 130±1μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 2, 'f')
- 4.12±0.04μs 3.59±0.2μs 0.87 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'int64')
- 271±3μs 236±4μs 0.87 bench_function_base.Sort.time_sort('quick', 'int16', ('sorted_block', 1000))
- 686±4μs 597±20μs 0.87 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 100))
- 7.44±0.3μs 6.43±0.4μs 0.86 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'float64')
- 317±2μs 274±2μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 4, 4, 'd')
- 5.19±0.07μs 4.48±0.3μs 0.86 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'complex128')
- 789±8μs 680±20μs 0.86 bench_function_base.Sort.time_argsort('heap', 'int64', ('random',))
- 99.2±4μs 85.4±3μs 0.86 bench_lib.Nan.time_nanmax(200000, 0)
- 151±4μs 130±2μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 1, 'f')
- 1.79±0.03ms 1.54±0.05ms 0.86 bench_lib.Pad.time_pad((1, 1, 1, 1, 1), 8, 'constant')
- 90.7±5μs 77.9±1μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'negative'>, 1, 4, 'f')
- 150±2μs 128±0.6μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 1, 'f')
- 86.2±2μs 73.7±0.8μs 0.85 bench_function_base.Sort.time_argsort('quick', 'int16', ('reversed',))
- 312±7μs 267±6μs 0.85 bench_function_base.Select.time_select_larger
- 7.54±0.2μs 6.45±0.4μs 0.85 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'int64')
- 4.12±0.02μs 3.52±0.2μs 0.85 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'float64')
- 659±9μs 561±10μs 0.85 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 10))
- 146±2μs 124±1μs 0.85 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 1, 'f')
- 537±6μs 455±10μs 0.85 bench_function_base.Sort.time_argsort('heap', 'float64', ('reversed',))
- 4.21±0.1μs 3.56±0.3μs 0.85 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'complex128')
- 87.1±2μs 73.1±1μs 0.84 bench_function_base.Sort.time_argsort('quick', 'int64', ('uniform',))
- 4.37±0.2μs 3.66±0.2μs 0.84 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'int64')
- 7.11±0.1μs 5.96±0.3μs 0.84 bench_function_base.Sort.time_sort('merge', 'int16', ('ordered',))
- 7.72±0.4μs 6.46±0.4μs 0.84 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'complex64')
- 184±3μs 153±6μs 0.84 bench_function_base.Bincount.time_weights
- 137±6μs 114±2μs 0.83 bench_function_base.Bincount.time_bincount
- 862±9μs 717±20μs 0.83 bench_reduce.AddReduceSeparate.time_reduce(0, 'complex64')
- 148±2μs 122±2μs 0.83 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 1, 'f')
- 490±9μs 405±5μs 0.83 bench_function_base.Sort.time_argsort('heap', 'int64', ('reversed',))
- 13.6±0.02μs 11.3±0.2μs 0.82 bench_function_base.Sort.time_argsort('merge', 'float64', ('reversed',))
- 56.2±0.1μs 46.4±0.6μs 0.82 bench_function_base.Sort.time_argsort('quick', 'int16', ('ordered',))
- 4.34±0.2μs 3.55±0.2μs 0.82 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'int32')
- 654±40μs 528±10μs 0.81 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 1000))
- 839±30μs 675±20μs 0.80 bench_ufunc.UFunc.time_ufunc_types('multiply')
- 109±2μs 86.9±6μs 0.80 bench_lib.Nan.time_nanmin(200000, 0.1)
- 1.17±0.01ms 922±20μs 0.79 bench_core.PackBits.time_packbits_axis0(<class 'numpy.uint64'>)
- 108±2μs 84.9±3μs 0.79 bench_lib.Nan.time_nanmin(200000, 0)
- 8.50±1μs 6.56±0.3μs 0.77 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'complex128')
- 35.0±0.4μs 26.5±0.5μs 0.76 bench_io.CopyTo.time_copyto_sparse
- 126±0.9μs 93.0±0.7μs 0.74 bench_lib.Nan.time_nanmax(200000, 2.0)
- 1.03±0.01ms 727±20μs 0.71 bench_core.PackBits.time_packbits_axis1(<class 'numpy.uint64'>)
- 52.5±2μs 37.0±0.6μs 0.70 bench_core.PackBits.time_packbits(<class 'numpy.uint64'>)
- 357±10μs 234±8μs 0.66 bench_core.PackBits.time_packbits_axis0(<class 'bool'>)
- 144±1μs 87.5±5μs 0.61 bench_lib.Nan.time_nanmin(200000, 2.0)
- 281±2μs 92.3±0.6μs 0.33 bench_lib.Nan.time_nanmin(200000, 90.0)
- 299±3μs 87.9±5μs 0.29 bench_lib.Nan.time_nanmax(200000, 90.0)
- 773±6μs 95.5±4μs 0.12 bench_lib.Nan.time_nanmax(200000, 50.0)
- 772±10μs 87.9±5μs 0.11 bench_lib.Nan.time_nanmin(200000, 50.0)
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.
RERUN-- AVX 512F min/max compare
before after ratio
[b0e1a445] [82801074]
<main> <minmax>
+ 5.10±0.1μs 17.5±0.3μs 3.43 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 2)
+ 5.42±0.1μs 18.6±0.4μs 3.43 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 4)
+ 167±0.8μs 523±90μs 3.13 bench_ufunc.UFunc.time_ufunc_types('conj')
+ 5.35±0.04μs 16.2±0.2μs 3.04 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 4)
+ 5.12±0.3μs 14.7±0.3μs 2.87 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 2)
+ 6.48±0.06μs 18.1±0.2μs 2.79 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 2)
+ 6.48±0.05μs 17.8±0.1μs 2.75 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 2)
+ 7.46±0.05μs 18.4±0.3μs 2.47 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 4)
+ 7.52±0.05μs 18.1±0.2μs 2.41 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 4)
+ 189±0.5μs 405±200μs 2.14 bench_ufunc.UFunc.time_ufunc_types('positive')
+ 238±2μs 462±200μs 1.94 bench_ufunc.UFunc.time_ufunc_types('abs')
+ 119±4μs 175±0.8μs 1.47 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 100))
+ 67.4±2μs 94.9±0.6μs 1.41 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 1000))
+ 3.42±0.02μs 4.76±0.1μs 1.39 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'd', 1)
+ 3.42±0.02μs 4.74±0.02μs 1.39 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'd', 1)
+ 734±2μs 1.02±0.3ms 1.38 bench_core.CountNonzero.time_count_nonzero_axis(1, 1000000, <class 'numpy.int16'>)
+ 8.30±0.07ms 11.5±2ms 1.38 bench_lib.Pad.time_pad((1, 1, 1, 1, 1), 8, 'wrap')
+ 33.9±0.2μs 46.0±0.2μs 1.36 bench_function_base.Sort.time_argsort('heap', 'float64', ('uniform',))
+ 111±0.3μs 146±0.8μs 1.32 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 100))
+ 1.64±0.01μs 2.12±0.01μs 1.29 bench_itemselection.PutMask.time_sparse(False, 'longfloat')
+ 61.2±2μs 79.0±2μs 1.29 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 1000))
+ 1.64±0.02μs 2.12±0.01μs 1.29 bench_itemselection.PutMask.time_sparse(False, 'complex128')
+ 179±0.9μs 230±2μs 1.28 bench_function_base.Sort.time_argsort('merge', 'float64', ('sorted_block', 10))
+ 74.5±0.2μs 95.3±0.5μs 1.28 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
+ 3.60±0.02μs 4.56±0.01μs 1.27 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'complex64')
+ 5.32±0.05μs 6.73±0.06μs 1.27 bench_core.UnpackBits.time_unpackbits_little
+ 1.57±0.02μs 1.98±0.03μs 1.26 bench_itemselection.PutMask.time_sparse(False, 'int16')
+ 54.7±0.1ms 68.4±10ms 1.25 bench_core.CountNonzero.time_count_nonzero_axis(3, 1000000, <class 'object'>)
+ 27.5±0.9μs 34.2±0.2μs 1.24 bench_function_base.Sort.time_sort('merge', 'int16', ('random',))
+ 38.2±0.09μs 47.5±0.3μs 1.24 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'numpy.int8'>)
+ 38.3±0.1μs 47.5±0.2μs 1.24 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'bool'>)
+ 1.57±0.01μs 1.94±0.02μs 1.24 bench_itemselection.PutMask.time_sparse(False, 'float16')
+ 187±0.3μs 230±40μs 1.23 bench_core.CountNonzero.time_count_nonzero_axis(1, 10000, <class 'object'>)
+ 1.05±0.01μs 1.26±0μs 1.20 bench_itemselection.PutMask.time_sparse(True, 'complex64')
+ 29.1±0.3μs 34.7±0.3μs 1.19 bench_function_base.Sort.time_sort('merge', 'int16', ('sorted_block', 10))
+ 28.4±0.8μs 33.6±1μs 1.18 bench_function_base.Sort.time_sort('merge', 'int16', ('sorted_block', 100))
+ 120±4μs 142±0.4μs 1.18 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 1, 4, 'd')
+ 1.75±0.02μs 2.05±0.01μs 1.18 bench_itemselection.PutMask.time_dense(False, 'longfloat')
+ 1.93±0.01μs 2.27±0.03μs 1.17 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'numpy.int8'>)
+ 1.06±0.01μs 1.25±0.01μs 1.17 bench_itemselection.PutMask.time_sparse(True, 'float64')
+ 1.06±0.01μs 1.25±0.01μs 1.17 bench_itemselection.PutMask.time_sparse(True, 'int32')
+ 1.06±0.01μs 1.25±0.01μs 1.17 bench_itemselection.PutMask.time_sparse(True, 'int64')
+ 1.78±0.01μs 2.07±0.01μs 1.17 bench_itemselection.PutMask.time_dense(False, 'complex128')
+ 31.4±0.4μs 36.4±0.3μs 1.16 bench_function_base.Sort.time_sort('merge', 'int16', ('sorted_block', 1000))
+ 11.1±0.1μs 12.9±0.09μs 1.16 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'negative'>, 1, 1, 'f')
+ 61.5±0.1μs 71.1±0.9μs 1.16 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 2, 4, 'f')
+ 10.2±0.08μs 11.8±0.8μs 1.16 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'object'>)
+ 601±10μs 693±4μs 1.15 bench_function_base.Sort.time_argsort('merge', 'float64', ('random',))
+ 1.09±0.01μs 1.25±0.01μs 1.15 bench_itemselection.PutMask.time_sparse(True, 'float32')
+ 122±0.7μs 141±4μs 1.15 bench_function_base.Sort.time_sort('quick', 'float64', ('uniform',))
+ 62.2±1μs 71.5±0.3μs 1.15 bench_ufunc_strides.Unary.time_ufunc(<ufunc '_ones_like'>, 1, 4, 'f')
+ 31.0±0.2μs 35.6±2μs 1.15 bench_function_base.Sort.time_sort('merge', 'int16', ('reversed',))
+ 1.54±0.02μs 1.77±0.02μs 1.15 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'numpy.int8'>)
+ 1.94±0μs 2.23±0.02μs 1.15 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'bool'>)
+ 148±0.4μs 170±1μs 1.14 bench_function_base.Sort.time_sort('merge', 'float64', ('sorted_block', 10))
+ 286±6μs 327±3μs 1.14 bench_core.UnpackBits.time_unpackbits_axis1_little
+ 847±40μs 967±4μs 1.14 bench_ufunc.UFunc.time_ufunc_types('divide')
+ 162±1μs 185±2μs 1.14 bench_core.CountNonzero.time_count_nonzero(3, 10000, <class 'object'>)
+ 7.74±0.04μs 8.82±0.04μs 1.14 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
+ 1.66±0.01μs 1.88±0.01μs 1.14 bench_itemselection.PutMask.time_dense(False, 'float16')
+ 72.6±0.4μs 82.3±2μs 1.13 bench_core.CountNonzero.time_count_nonzero_axis(1, 10000, <class 'str'>)
+ 93.6±0.9μs 106±0.5μs 1.13 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rint'>, 4, 2, 'f')
+ 91.8±0.2μs 104±0.3μs 1.13 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sqrt'>, 2, 4, 'f')
+ 5.40±0.02ms 6.11±0.02ms 1.13 bench_core.CountNonzero.time_count_nonzero(1, 1000000, <class 'object'>)
+ 260±2μs 294±1μs 1.13 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 4, 4, 'd')
+ 383±4ns 433±5ns 1.13 bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype(1)
+ 9.86±0.3μs 11.1±0.03μs 1.13 bench_function_base.Sort.time_sort('merge', 'float64', ('reversed',))
+ 16.2±0.04ms 18.3±0.09ms 1.13 bench_core.CountNonzero.time_count_nonzero(3, 1000000, <class 'object'>)
+ 1.55±0.01μs 1.75±0.02μs 1.12 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'bool'>)
+ 31.0±0.2μs 34.8±0.1μs 1.12 bench_core.Core.time_array_float_l1000_dtype
+ 88.7±0.9μs 99.5±0.7μs 1.12 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'negative'>, 4, 2, 'f')
+ 72.7±0.2μs 81.5±0.5μs 1.12 bench_core.CountNonzero.time_count_nonzero_multi_axis(1, 10000, <class 'str'>)
+ 1.67±0.01μs 1.87±0.02μs 1.12 bench_itemselection.PutMask.time_dense(False, 'int16')
+ 474±6ns 530±30ns 1.12 bench_array_coercion.ArrayCoercionSmall.time_array_all_kwargs(1)
+ 109±0.8μs 122±1μs 1.12 bench_core.CountNonzero.time_count_nonzero(2, 10000, <class 'object'>)
+ 10.8±0.05ms 12.1±0.1ms 1.12 bench_core.CountNonzero.time_count_nonzero(2, 1000000, <class 'object'>)
+ 119±1μs 132±0.3μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 2, 'd')
+ 119±1μs 132±0.4μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 4, 4, 'f')
+ 129±0.5μs 144±0.7μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'fabs'>, 1, 1, 'd')
+ 130±1μs 145±0.7μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 1, 'd')
+ 1.15±0.01μs 1.28±0.01μs 1.11 bench_core.CountNonzero.time_count_nonzero(1, 10000, <class 'bool'>)
+ 6.50±0.05μs 7.23±0.3μs 1.11 bench_core.CountNonzero.time_count_nonzero_multi_axis(1, 100, <class 'object'>)
+ 130±0.6μs 145±1μs 1.11 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 1, 'd')
+ 5.40±0.02μs 6.00±0.02μs 1.11 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
+ 3.34±0.02μs 3.70±0.02μs 1.11 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'int32')
+ 1.14±0.02μs 1.26±0μs 1.11 bench_core.CountNonzero.time_count_nonzero(1, 10000, <class 'numpy.int8'>)
+ 46.7±0.5μs 51.6±0.7μs 1.10 bench_shape_base.Block.time_block_simple_row_wise(100)
+ 362±10μs 400±2μs 1.10 bench_function_base.Sort.time_sort('quick', 'float64', ('sorted_block', 100))
+ 5.44±0.02μs 6.01±0.03μs 1.10 bench_reduce.ArgMax.time_argmax(<class 'bool'>)
+ 138±0.7μs 152±5μs 1.10 bench_core.CountNonzero.time_count_nonzero(3, 1000000, <class 'numpy.int8'>)
+ 7.87±0.03μs 8.64±0.08μs 1.10 bench_reduce.MinMax.time_min(<class 'numpy.float64'>)
+ 232±0.6ns 255±0.8ns 1.10 bench_array_coercion.ArrayCoercionSmall.time_array_no_copy(array([5]))
+ 319±3ns 351±5ns 1.10 bench_array_coercion.ArrayCoercionSmall.time_asanyarray(1)
+ 278±10μs 304±2μs 1.09 bench_function_base.Sort.time_sort('merge', 'int64', ('random',))
+ 306±2ns 335±10ns 1.09 bench_array_coercion.ArrayCoercionSmall.time_asarray(5)
+ 120±0.8μs 132±0.6μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'cos'>, 4, 2, 'f')
+ 58.1±0.3μs 63.5±0.2μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'negative'>, 1, 1, 'd')
+ 284±9μs 310±2μs 1.09 bench_function_base.Sort.time_sort('quick', 'float64', ('sorted_block', 1000))
+ 6.98±0.03μs 7.61±0.2μs 1.09 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'str'>)
+ 6.46±0.3μs 7.04±0.3μs 1.09 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'object'>)
+ 119±1μs 130±3μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'absolute'>, 4, 4, 'f')
+ 152±1μs 166±1μs 1.09 bench_core.CountNonzero.time_count_nonzero_axis(2, 10000, <class 'str'>)
+ 151±3μs 164±2μs 1.09 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 10000, <class 'str'>)
+ 125±0.5μs 136±2μs 1.09 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 1, 2, 'd')
+ 2.33±0.01μs 2.52±0.01μs 1.08 bench_core.CountNonzero.time_count_nonzero(3, 100, <class 'object'>)
+ 316±2ns 343±10ns 1.08 bench_array_coercion.ArrayCoercionSmall.time_asarray(1)
+ 1.17±0.01μs 1.27±0.01μs 1.08 bench_itemselection.PutMask.time_sparse(True, 'complex128')
+ 28.2±0.04μs 30.5±0.06μs 1.08 bench_function_base.Sort.time_argsort('heap', 'int16', ('uniform',))
+ 445±2μs 481±2μs 1.08 bench_function_base.Sort.time_sort('quick', 'float64', ('random',))
+ 689±3μs 743±3μs 1.08 bench_lib.Pad.time_pad((1024, 1024), 1, 'reflect')
+ 220±2μs 237±0.8μs 1.08 bench_core.CountNonzero.time_count_nonzero_axis(3, 10000, <class 'str'>)
+ 1.85±0.01μs 1.99±0.01μs 1.08 bench_itemselection.PutMask.time_sparse(False, 'int64')
+ 8.62±0.06μs 9.28±0.05μs 1.08 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'clip', 'float32')
+ 460±2μs 496±2μs 1.08 bench_random.RNG.time_64bit('MT19937')
+ 161±2μs 173±0.9μs 1.08 bench_lib.Pad.time_pad((256, 128, 1), 1, 'wrap')
+ 639±7ns 687±9ns 1.08 bench_array_coercion.ArrayCoercionSmall.time_array_dtype_not_kwargs([1])
+ 1.85±0.01μs 1.99±0.01μs 1.08 bench_itemselection.PutMask.time_sparse(False, 'complex64')
+ 3.15±0.01μs 3.39±0.02μs 1.08 bench_ufunc_strides.AVX_ldexp.time_ufunc('d', 1)
+ 4.63±0.03μs 4.98±0.03μs 1.08 bench_itemselection.Take.time_contiguous((1000, 2), 'clip', 'float16')
+ 12.1±0.03ms 13.0±0.05ms 1.07 bench_lib.Unique.time_unique(200000, 0)
+ 4.79±0.02μs 5.14±0.09μs 1.07 bench_core.CountNonzero.time_count_nonzero_axis(3, 100, <class 'numpy.int64'>)
+ 2.76±0.01μs 2.97±0.02μs 1.07 bench_ufunc_strides.AVX_BFunc.time_ufunc('minimum', 'f', 1)
+ 4.66±0.01μs 5.00±0.1μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'numpy.int16'>)
+ 169±1μs 181±5μs 1.07 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 2, 'd')
+ 161±0.5μs 172±0.3μs 1.07 bench_shape_base.Block2D.time_block2d((1024, 1024), 'uint8', (4, 4))
+ 1.77±0.01μs 1.90±0.02μs 1.07 bench_core.CountNonzero.time_count_nonzero(2, 100, <class 'object'>)
+ 12.2±0.01ms 13.1±0.03ms 1.07 bench_lib.Unique.time_unique(200000, 2.0)
+ 446±2μs 477±20μs 1.07 bench_function_base.Sort.time_sort('heap', 'int16', ('sorted_block', 100))
+ 2.80±0.02μs 3.00±0.05μs 1.07 bench_ufunc_strides.AVX_BFunc.time_ufunc('maximum', 'f', 1)
+ 12.2±0.04ms 13.1±0.03ms 1.07 bench_lib.Unique.time_unique(200000, 0.1)
+ 46.1±0.3μs 49.3±0.1μs 1.07 bench_core.PackBits.time_packbits_little(<class 'numpy.uint64'>)
+ 5.28±0.03μs 5.63±0.02μs 1.07 bench_itemselection.Take.time_contiguous((1000, 3), 'clip', 'float16')
+ 222±4μs 237±0.9μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 10000, <class 'str'>)
+ 4.86±0.07μs 5.18±0.04μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'numpy.int16'>)
+ 384±2μs 409±3μs 1.07 bench_linalg.Eindot.time_einsum_i_ij_j
+ 4.89±0.03μs 5.21±0.09μs 1.07 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'numpy.int64'>)
+ 29.9±0.2μs 31.8±0.9μs 1.07 bench_linalg.Einsum.time_einsum_sum_mul2(<class 'numpy.float32'>)
+ 4.64±0.02μs 4.94±0.02μs 1.07 bench_itemselection.Take.time_contiguous((1000, 2), 'clip', 'int16')
+ 788±2μs 839±4μs 1.06 bench_lib.Pad.time_pad((1024, 1024), 8, 'reflect')
+ 694±5μs 738±2μs 1.06 bench_lib.Pad.time_pad((1024, 1024), 1, 'wrap')
+ 7.76±0.05μs 8.25±0.06μs 1.06 bench_shape_base.Block.time_no_lists(10)
+ 4.01±0.1ms 4.26±0.03ms 1.06 bench_lib.Pad.time_pad((256, 128, 1), 8, 'wrap')
+ 6.44±0.06μs 6.85±0.2μs 1.06 bench_lib.Nan.time_nancumsum(200, 90.0)
+ 3.89±0.02μs 4.13±0.07μs 1.06 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'bool'>)
+ 432±6ns 459±10ns 1.06 bench_core.Core.time_arange_100
+ 10.7±0.06μs 11.3±0.08μs 1.06 bench_function_base.Where.time_1
+ 24.8±0.1μs 26.4±0.6μs 1.06 bench_ma.UFunc.time_1d(True, False, 1000)
+ 3.56±0.01s 3.78±0.02s 1.06 bench_ufunc_strides.Mandelbrot.time_mandel
+ 5.44±0.02μs 5.77±0.05μs 1.06 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'float16')
+ 4.67±0.03μs 4.96±0.03μs 1.06 bench_itemselection.Take.time_contiguous((1000, 1), 'clip', 'float32')
+ 517±20μs 549±2μs 1.06 bench_function_base.Sort.time_sort('merge', 'float64', ('random',))
+ 5.32±0.03μs 5.65±0.04μs 1.06 bench_itemselection.Take.time_contiguous((1000, 3), 'clip', 'int16')
+ 4.39±0.03μs 4.66±0.09μs 1.06 bench_core.CountNonzero.time_count_nonzero_axis(1, 100, <class 'numpy.int64'>)
+ 4.82±0.02μs 5.11±0.1μs 1.06 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'numpy.int32'>)
+ 11.3±0.07μs 11.9±0.04μs 1.06 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
+ 46.7±0.4μs 49.5±0.6μs 1.06 bench_lib.Nan.time_nanvar(200, 0.1)
+ 146±1μs 154±1μs 1.06 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'fabs'>, 2, 1, 'd')
+ 7.73±0.1μs 8.18±0.05μs 1.06 bench_core.CountNonzero.time_count_nonzero_axis(3, 100, <class 'str'>)
+ 8.39±0.04μs 8.87±0.08μs 1.06 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'object'>)
+ 4.84±0.05μs 5.12±0.04μs 1.06 bench_core.CountNonzero.time_count_nonzero_multi_axis(3, 100, <class 'numpy.int8'>)
+ 8.73±0.05μs 9.23±0.03μs 1.06 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'clip', 'int32')
+ 276±0.8μs 291±2μs 1.05 bench_ufunc.UFunc.time_ufunc_types('nextafter')
+ 1.89±0.02μs 2.00±0.01μs 1.05 bench_itemselection.PutMask.time_sparse(False, 'float64')
+ 4.68±0.01μs 4.93±0.1μs 1.05 bench_core.CountNonzero.time_count_nonzero_axis(3, 100, <class 'numpy.int16'>)
+ 186±0.3μs 196±1μs 1.05 bench_core.CountNonzero.time_count_nonzero_multi_axis(1, 10000, <class 'object'>)
+ 4.82±0.03μs 5.08±0.05μs 1.05 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 100, <class 'numpy.int64'>)
+ 3.99±0.01μs 4.20±0.04μs 1.05 bench_reduce.AnyAll.time_any_slow
+ 5.42±0.02μs 5.71±0.03μs 1.05 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'int16')
+ 7.37±0.07μs 7.75±0.03μs 1.05 bench_lib.Nan.time_nansum(200, 2.0)
+ 6.50±0.03ms 6.83±0.04ms 1.05 bench_lib.Unique.time_unique(200000, 90.0)
- 938±3μs 893±5μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'cosh'>, 4, 4, 'd')
- 32.9±0.2μs 31.3±0.2μs 0.95 bench_linalg.Einsum.time_einsum_noncon_sum_mul(<class 'numpy.float32'>)
- 585±7ns 556±4ns 0.95 bench_scalar.ScalarMath.time_abs('longfloat')
- 224±3μs 214±0.8μs 0.95 bench_ufunc.UFunc.time_ufunc_types('floor')
- 11.0±0.1μs 10.4±0.08μs 0.95 bench_ma.UFunc.time_scalar(True, False, 1000)
- 747±8μs 711±6μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'arcsin'>, 4, 4, 'f')
- 5.60±0.04μs 5.33±0.1μs 0.95 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'float16')
- 779±10ns 741±20ns 0.95 bench_core.CountNonzero.time_count_nonzero(1, 100, <class 'numpy.int64'>)
- 1.01±0.02ms 954±2μs 0.95 bench_reduce.AddReduceSeparate.time_reduce(0, 'complex128')
- 1.29±0.01μs 1.22±0.01μs 0.95 bench_scalar.ScalarMath.time_power_of_two('int64')
- 321±2μs 304±2μs 0.95 bench_ufunc.UFunc.time_ufunc_types('logical_xor')
- 12.9±0.08μs 12.2±0.03μs 0.95 bench_core.PackBits.time_packbits_axis1(<class 'bool'>)
- 745±5μs 706±6μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'arcsin'>, 4, 1, 'f')
- 387±3μs 366±2μs 0.95 bench_function_base.Sort.time_argsort('quick', 'int16', ('sorted_block', 100))
- 599±6μs 567±4μs 0.95 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'arcsinh'>, 4, 1, 'f')
- 53.2±0.7μs 50.3±0.07μs 0.95 bench_ufunc.UFunc.time_ufunc_types('ldexp')
- 25.0±0.5μs 23.6±0.2μs 0.95 bench_scalar.ScalarMath.time_power_of_two('complex64')
- 3.34±0.01μs 3.16±0.01μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'wrap', 'complex64')
- 379±2μs 358±10μs 0.94 bench_function_base.Sort.time_sort('heap', 'int64', ('ordered',))
- 22.6±0.1μs 21.3±0.05μs 0.94 bench_shape_base.Block2D.time_block2d((128, 128), 'uint64', (2, 2))
- 3.34±0.02μs 3.15±0.02μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'wrap', 'float64')
- 4.38±0.03μs 4.13±0.01μs 0.94 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'int16')
- 4.57±0.07μs 4.31±0.02μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'complex256')
- 4.42±0.07μs 4.16±0.03μs 0.94 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'float16')
- 393±2μs 369±1μs 0.94 bench_function_base.Sort.time_argsort('quick', 'int16', ('sorted_block', 10))
- 96.2±0.9μs 90.4±0.5μs 0.94 bench_lib.Nan.time_nanmax(200000, 0)
- 7.36±0.03μs 6.92±0.02μs 0.94 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'int16')
- 6.46±0.03μs 6.07±0.02μs 0.94 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'complex256')
- 4.70±0.02μs 4.41±0.05μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'wrap', 'float16')
- 5.59±0.03μs 5.25±0.02μs 0.94 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'int16')
- 5.92±0.02μs 5.55±0.03μs 0.94 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'int64')
- 700±10ns 657±5ns 0.94 bench_ufunc.ArgParsing.time_add_arg_parsing((array(1.), array(2.), out=array(3.), subok=True, where=True))
- 4.68±0.01μs 4.39±0.02μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'wrap', 'int16')
- 40.5±0.09μs 38.0±0.4μs 0.94 bench_linalg.Linalg.time_op('norm', 'float16')
- 2.76±0.03μs 2.58±0.01μs 0.94 bench_ufunc_strides.AVX_ldexp.time_ufunc('f', 1)
- 2.58±0.04μs 2.42±0.03μs 0.94 bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int(<class 'numpy.uint8'>, 43)
- 569±9μs 532±5μs 0.94 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'log10'>, 4, 2, 'f')
- 5.94±0.05μs 5.56±0.03μs 0.94 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'longfloat')
- 609±2μs 570±4μs 0.94 bench_function_base.Sort.time_sort('heap', 'float64', ('sorted_block', 1000))
- 40.1±0.4μs 37.5±0.2μs 0.94 bench_lib.Pad.time_pad((1, 1, 1, 1, 1), 1, 'edge')
- 4.09±0.06μs 3.83±0.03μs 0.94 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'complex128')
- 570±5μs 533±8μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'log10'>, 4, 1, 'f')
- 5.95±0.02μs 5.56±0.02μs 0.93 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'complex64')
- 294±0.9μs 274±1μs 0.93 bench_function_base.Sort.time_argsort('quick', 'int16', ('sorted_block', 1000))
- 13.0±0.07μs 12.1±0.06μs 0.93 bench_function_base.Sort.time_argsort('merge', 'float64', ('uniform',))
- 7.44±0.04μs 6.93±0.05μs 0.93 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'float16')
- 3.34±0.02μs 3.12±0.01μs 0.93 bench_itemselection.Take.time_contiguous((1000, 2), 'wrap', 'int32')
- 4.07±0.03μs 3.79±0.03μs 0.93 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'longfloat')
- 28.4±0.2μs 26.4±0.08μs 0.93 bench_function_base.Sort.time_argsort('heap', 'int64', ('uniform',))
- 5.96±0.02μs 5.55±0.02μs 0.93 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'float64')
- 1.15±0.01μs 1.07±0.02μs 0.93 bench_itemselection.PutMask.time_sparse(True, 'complex256')
- 148±3μs 138±4μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 4, 'f')
- 607±8μs 563±4μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'arcsinh'>, 4, 2, 'f')
- 154±2μs 142±0.9μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 4, 'f')
- 24.7±0.2ms 22.9±0.1ms 0.93 bench_linalg.Eindot.time_einsum_ijk_jil_kl
- 3.74±0.1μs 3.47±0.03μs 0.93 bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int(<class 'numpy.uint32'>, 8)
- 135±0.3μs 125±0.4μs 0.93 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'logical_not'>, 2, 4, 'f')
- 13.0±0.03μs 12.1±0.05μs 0.92 bench_function_base.Sort.time_argsort('merge', 'float64', ('ordered',))
- 3.67±0.1μs 3.39±0.02μs 0.92 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'complex256')
- 5.98±0.02μs 5.53±0.03μs 0.92 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'complex128')
- 5.28±0.3μs 4.88±0.02μs 0.92 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'float16')
- 151±2μs 140±2μs 0.92 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 4, 1, 'f')
- 8.81±0.02μs 8.13±0.07μs 0.92 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'int32')
- 525±2μs 484±2μs 0.92 bench_function_base.Sort.time_sort('heap', 'float64', ('reversed',))
- 8.79±0.04μs 8.11±0.04μs 0.92 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'wrap', 'float32')
- 3.20±0.1ms 2.95±0ms 0.92 bench_lib.Pad.time_pad((256, 128, 1), 8, 'reflect')
- 7.40±0.04μs 6.81±0.04μs 0.92 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'complex256')
- 3.41±0.1μs 3.13±0.02μs 0.92 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'complex64')
- 7.22±0.09ms 6.63±0.02ms 0.92 bench_reduce.AddReduceSeparate.time_reduce(0, 'float16')
- 3.41±0.09μs 3.13±0.03μs 0.92 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'int64')
- 40.2±0.4μs 36.8±0.3μs 0.92 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rint'>, 2, 1, 'f')
- 3.44±0.1μs 3.15±0.03μs 0.92 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'complex128')
- 3.44±0.1μs 3.16±0.02μs 0.92 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'longfloat')
- 47.2±0.7μs 43.0±0.1μs 0.91 bench_lib.Pad.time_pad((4, 4, 4, 4), 1, 'reflect')
- 4.18±0.1μs 3.81±0.01μs 0.91 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'float16')
- 292±1μs 266±10μs 0.91 bench_function_base.Select.time_select_larger
- 145±0.4μs 132±1μs 0.91 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 2, 'f')
- 4.83±0.2μs 4.39±0.02μs 0.91 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'float32')
- 151±2μs 137±2μs 0.91 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 4, 1, 'f')
- 149±0.7μs 134±2μs 0.90 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 4, 'f')
- 288±1μs 260±1μs 0.90 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 1000))
- 39.6±0.4μs 35.7±0.1μs 0.90 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sqrt'>, 2, 1, 'f')
- 4.23±0.2μs 3.80±0.02μs 0.90 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'int16')
- 114±3μs 102±0.6μs 0.90 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'sign'>, 4, 2, 'f')
- 394±2μs 353±2μs 0.90 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 10))
- 637±40μs 570±5μs 0.89 bench_ufunc.UFunc.time_ufunc_types('maximum')
- 85.0±0.4μs 76.0±1μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'positive'>, 1, 4, 'f')
- 382±2μs 342±1μs 0.89 bench_function_base.Sort.time_argsort('quick', 'int64', ('sorted_block', 100))
- 146±1μs 130±0.6μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 2, 'f')
- 104±1μs 92.9±0.8μs 0.89 bench_lib.Nan.time_nanmax(200000, 0.1)
- 1.23±0.02μs 1.10±0.01μs 0.89 bench_itemselection.PutMask.time_sparse(True, 'float16')
- 1.23±0.01μs 1.09±0.01μs 0.89 bench_itemselection.PutMask.time_sparse(True, 'int16')
- 820±4μs 728±4μs 0.89 bench_function_base.Sort.time_argsort('heap', 'float64', ('random',))
- 2.06±0.02ms 1.83±0.01ms 0.89 bench_reduce.AddReduceSeparate.time_reduce(1, 'float16')
- 103±0.3μs 91.0±2μs 0.89 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'trunc'>, 1, 2, 'd')
- 82.4±0.6μs 72.7±0.3μs 0.88 bench_function_base.Sort.time_argsort('quick', 'int16', ('reversed',))
- 476±3μs 420±2μs 0.88 bench_function_base.Sort.time_argsort('quick', 'int64', ('random',))
- 90.8±0.1μs 80.1±2μs 0.88 bench_indexing.Indexing.time_op('indexes_', 'I', '')
- 144±0.7μs 127±0.7μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 2, 'f')
- 735±9μs 646±3μs 0.88 bench_function_base.Sort.time_argsort('heap', 'float64', ('sorted_block', 100))
- 4.34±0.2μs 3.81±0.02μs 0.88 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'complex256')
- 2.82±0.1ms 2.47±0.01ms 0.88 bench_ufunc.UFunc.time_ufunc_types('tan')
- 146±1μs 128±1μs 0.88 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 2, 1, 'f')
- 678±4μs 593±3μs 0.88 bench_function_base.Sort.time_argsort('heap', 'float64', ('sorted_block', 1000))
- 144±0.7μs 126±0.8μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 2, 'f')
- 5.01±0.2μs 4.37±0.02μs 0.87 bench_itemselection.Take.time_contiguous((1000, 1), 'wrap', 'int32')
- 312±0.7μs 272±1μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'deg2rad'>, 4, 4, 'd')
- 683±3μs 595±1μs 0.87 bench_function_base.Sort.time_argsort('heap', 'float64', ('sorted_block', 10))
- 523±3μs 456±3μs 0.87 bench_function_base.Sort.time_argsort('heap', 'float64', ('reversed',))
- 70.2±0.3μs 61.0±0.2μs 0.87 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'positive'>, 1, 1, 'd')
- 771±5μs 667±3μs 0.87 bench_function_base.Sort.time_argsort('heap', 'int64', ('random',))
- 146±1μs 126±0.9μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 2, 1, 'f')
- 1.98±0.02μs 1.71±0.02μs 0.86 bench_itemselection.PutMask.time_dense(False, 'float32')
- 444±5μs 382±3μs 0.86 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 10000, <class 'object'>)
- 671±3μs 577±4μs 0.86 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 100))
- 84.6±0.6μs 72.5±0.4μs 0.86 bench_function_base.Sort.time_argsort('quick', 'int64', ('uniform',))
- 142±0.7μs 122±0.6μs 0.86 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'rad2deg'>, 1, 1, 'f')
- 44.6±0.5ms 38.1±0.09ms 0.86 bench_core.CountNonzero.time_count_nonzero_multi_axis(2, 1000000, <class 'object'>)
- 646±5μs 550±3μs 0.85 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 10))
- 98.2±0.5μs 83.5±3μs 0.85 bench_function_base.Sort.time_sort('quick', 'int16', ('uniform',))
- 435±1μs 369±2μs 0.85 bench_function_base.Sort.time_argsort('heap', 'int64', ('ordered',))
- 2.00±0.01μs 1.70±0μs 0.85 bench_itemselection.PutMask.time_dense(False, 'int32')
- 143±1μs 121±0.9μs 0.85 bench_ufunc_strides.Unary.time_ufunc(<ufunc 'degrees'>, 1, 1, 'f')
- 13.3±0.07μs 11.1±0.08μs 0.84 bench_function_base.Sort.time_argsort('merge', 'float64', ('reversed',))
- 7.20±0.05μs 6.03±0.03μs 0.84 bench_function_base.Sort.time_sort('merge', 'int16', ('ordered',))
- 617±4μs 517±1μs 0.84 bench_function_base.Sort.time_argsort('heap', 'int64', ('sorted_block', 1000))
- 497±2μs 416±1μs 0.84 bench_function_base.Sort.time_argsort('heap', 'float64', ('ordered',))
- 480±2μs 401±2μs 0.83 bench_function_base.Sort.time_argsort('heap', 'int64', ('reversed',))
- 4.09±0.5μs 3.41±0.01μs 0.83 bench_itemselection.Take.time_contiguous((1000, 3), 'wrap', 'float64')
- 7.26±0.08μs 6.04±0.02μs 0.83 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'longfloat')
- 1.07±0ms 894±4μs 0.83 bench_core.PackBits.time_packbits_axis0(<class 'numpy.uint64'>)
- 136±1μs 113±0.5μs 0.83 bench_function_base.Bincount.time_bincount
- 7.30±0.04μs 6.07±0.03μs 0.83 bench_function_base.Sort.time_sort('merge', 'int16', ('uniform',))
- 7.26±0.03μs 6.04±0.03μs 0.83 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'complex128')
- 4.14±0.06μs 3.44±0.03μs 0.83 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'float32')
- 4.07±0.07μs 3.38±0.01μs 0.83 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'float64')
- 4.06±0.03μs 3.37±0.04μs 0.83 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'complex64')
- 5.11±0.1μs 4.23±0.02μs 0.83 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'complex128')
- 4.09±0.03μs 3.38±0.01μs 0.83 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'int64')
- 7.29±0.02μs 6.03±0.03μs 0.83 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'complex64')
- 4.16±0.03μs 3.43±0.02μs 0.82 bench_itemselection.Take.time_contiguous((1000, 3), 'raise', 'int32')
- 4.12±0.05μs 3.39±0.02μs 0.82 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'complex64')
- 7.31±0.03μs 6.01±0.03μs 0.82 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'float64')
- 853±3μs 698±3μs 0.82 bench_reduce.AddReduceSeparate.time_reduce(0, 'complex64')
- 7.36±0.2μs 6.00±0.04μs 0.82 bench_itemselection.Take.time_contiguous((2, 1000, 1), 'raise', 'int64')
- 4.18±0.1μs 3.41±0.02μs 0.82 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'int64')
- 177±0.6μs 144±1μs 0.81 bench_function_base.Bincount.time_weights
- 109±2μs 87.7±5μs 0.81 bench_lib.Nan.time_nanmin(200000, 0.1)
- 4.23±0.1μs 3.41±0.04μs 0.81 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'longfloat')
- 55.0±0.2μs 44.3±0.2μs 0.81 bench_function_base.Sort.time_argsort('quick', 'int16', ('ordered',))
- 34.8±0.2μs 28.0±1μs 0.80 bench_io.CopyTo.time_copyto_sparse
- 104±0.3μs 82.0±0.4μs 0.79 bench_lib.Nan.time_nanmin(200000, 0)
- 4.31±0.2μs 3.40±0.01μs 0.79 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'int32')
- 4.33±0.6μs 3.39±0.02μs 0.78 bench_itemselection.Take.time_contiguous((1000, 1), 'raise', 'complex128')
- 4.34±0.2μs 3.39±0.01μs 0.78 bench_itemselection.Take.time_contiguous((1000, 2), 'raise', 'float64')
- 881±20μs 649±1μs 0.74 bench_ufunc.UFunc.time_ufunc_types('sqrt')
- 50.7±0.2μs 36.6±0.2μs 0.72 bench_core.PackBits.time_packbits(<class 'numpy.uint64'>)
- 1.00±0ms 707±2μs 0.70 bench_core.PackBits.time_packbits_axis1(<class 'numpy.uint64'>)
- 126±1μs 87.5±4μs 0.69 bench_lib.Nan.time_nanmax(200000, 2.0)
- 143±1μs 92.0±0.7μs 0.64 bench_lib.Nan.time_nanmin(200000, 2.0)
- 347±2μs 222±0.7μs 0.64 bench_core.PackBits.time_packbits_axis0(<class 'bool'>)
- 520±300μs 261±7μs 0.50 bench_ufunc.UFunc.time_ufunc_types('equal')
- 472±200μs 217±1μs 0.46 bench_ufunc.UFunc.time_ufunc_types('deg2rad')
- 476±200μs 217±0.8μs 0.46 bench_ufunc.UFunc.time_ufunc_types('radians')
- 276±2μs 82.9±0.2μs 0.30 bench_lib.Nan.time_nanmin(200000, 90.0)
- 650±20μs 191±6μs 0.29 bench_ufunc.UFunc.time_ufunc_types('negative')
- 298±0.7μs 83.4±1μs 0.28 bench_lib.Nan.time_nanmax(200000, 90.0)
- 764±4μs 91.6±0.2μs 0.12 bench_lib.Nan.time_nanmin(200000, 50.0)
- 757±2μs 87.0±4μs 0.11 bench_lib.Nan.time_nanmax(200000, 50.0)
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.
Edit (seiko2plus): put the benchmark into Edit: Added the re-run as requested
@Developer-Ecosystem-Engineering, On
Thanks @Developer-Ecosystem-Engineering! The Just copying one failure, there are multiple:
We've re-run and updated the comment with the results
This patch significantly regresses performance on SKX with AVX-512:
The
@mattip Thanks, I am taking a look.
@mattip LGTM. Thanks!
Sorry for the late reply; I was away for my wedding and honeymoon :).
Thank you for your cooperation and effort. I have made some extra enhancements to the current implementation to avoid any performance regression on x86, and I have also improved the benchmark tests, as described in the commit messages. I still need a few hours to run benchmarks across all supported architectures to verify my latest changes.
Congratulations Sayed.
- Avoid unrolling vectorized max/min loops by x6/x8 when SIMD width > 128, to avoid a memory bandwidth bottleneck
- tune reduce max/min
- vectorize non-contiguous max/min
- fix code style
- call npyv_cleanup() at end of inner loop
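The first item in the commit message above concerns how far a max/min reduction loop is unrolled. As a rough illustration only, here is a plain-Python sketch of an unrolled max reduction that keeps several independent accumulators; it is not the C/universal-intrinsics code from this PR, and the name `max_unrolled` and its `lanes` parameter are hypothetical. It also assumes the input contains no NaN values.

```python
def max_unrolled(values, lanes=4):
    """Max-reduce `values` with `lanes` independent accumulators.

    Illustration only: `lanes` stands in for the unroll factor of a
    vectorized loop. Assumes `values` holds no NaN entries.
    """
    n = len(values)
    if n == 0:
        raise ValueError("max of an empty sequence")

    # Never use more accumulators than there are elements.
    lanes = min(lanes, n)

    # Seed each accumulator with one of the first `lanes` elements.
    acc = list(values[:lanes])

    # Main loop: each pass updates every accumulator once. The updates
    # do not depend on each other, which is what unrolling exploits.
    main_end = n - (n % lanes)
    for i in range(lanes, main_end, lanes):
        for lane in range(lanes):
            candidate = values[i + lane]
            if candidate > acc[lane]:
                acc[lane] = candidate

    # Combine the accumulators into a single result.
    result = acc[0]
    for partial in acc[1:]:
        if partial > result:
            result = partial

    # Handle the leftover elements that did not fill a full pass.
    for candidate in values[main_end:]:
        if candidate > result:
            result = candidate
    return result
```

For example, `max_unrolled([3, 9, 1, 7, 4, 8, 2, 6, 5, 10])` returns `10`. A larger `lanes` value means more loads in flight per pass, which is the kind of trade-off the commit message describes tuning against memory bandwidth on wider SIMD targets.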
I am satisfied with the performance. The improvement covers all integer and floating-point precision operations on all supported architectures. For the record, the removed raw x86 SIMD code (SSE, AVX512) only supported max and min for single and double precision.
The one performance regression is the argmax operation for single precision. It shouldn't be related to the changes this PR covers, but it is still somehow affected. I am not sure exactly why, since the performance of reduce operations for both fmax and maximum has increased. In any case, the current SIMD code for argmax needs improvement and should be replaced with universal intrinsics. Please check the following performance benchmarks for more information:
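A note on reading the tables that follow, assuming they are `asv` comparison output as produced by `runtests.py --bench-compare`: each row lists the before timing, the after timing, and the ratio of after to before, so a ratio below 1 is a speedup and a ratio above 1 is a slowdown. The helper below is a hypothetical sketch for recomputing that ratio from a row; `to_seconds` and `recompute_ratio` are illustrative names, not part of asv or NumPy.

```python
import re

# Multipliers that convert each unit seen in the tables to seconds.
_UNIT_SECONDS = {"ns": 1e-9, "μs": 1e-6, "ms": 1e-3, "s": 1.0}

# Matches one timing such as "757±2μs": mean, spread, unit.
_TIMING = re.compile(r"([\d.]+)±([\d.]+)(ns|μs|ms|s)")


def to_seconds(timing):
    """Convert a printed timing like '87.0±4μs' to its mean in seconds."""
    match = _TIMING.fullmatch(timing)
    if match is None:
        raise ValueError(f"unrecognized timing: {timing!r}")
    mean, unit = match.group(1), match.group(3)
    return float(mean) * _UNIT_SECONDS[unit]


def recompute_ratio(row):
    """Return (after / before, printed ratio) for one table row.

    A row is expected to look like:
    '<sign> <before> <after> <ratio> <benchmark name...>'
    """
    _sign, before, after, printed = row.split()[:4]
    return to_seconds(after) / to_seconds(before), float(printed)


row = "+ 122±0.2μs 218±2μs 1.79 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)"
ratio, printed = recompute_ratio(row)
print(f"recomputed {ratio:.2f}, printed {printed:.2f}")
```

Because the printed timings are already rounded, a ratio recomputed this way can differ from the printed one in the last digit for some rows.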
X86
CPU
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Stepping: 4
CPU MHz: 3410.808
BogoMIPS: 5999.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 64 KiB
L1i cache: 64 KiB
L2 cache: 2 MiB
L3 cache: 24.8 MiB
NUMA node0 CPU(s): 0-3
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
OS
Linux ip-172-31-32-40 5.11.0-1020-aws #21~20.04.2-Ubuntu SMP Fri Oct 1 13:03:59 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Python 3.8.10
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Benchmark
AVX512_SKX(before) vs AVX512_SKX(after)
unset NPY_DISABLE_CPU_FEATURES
python runtests.py -n --bench-compare parent/main "max|min" -- --sort ratio
before after ratio
[f224ca3c] [b49819a6]
+ 122±0.2μs 218±2μs 1.79 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
+ 431±0.4μs 483±1μs 1.12 bench_ufunc.UFunc.time_ufunc_types('fmax')
+ 423±1μs 466±2μs 1.10 bench_ufunc.UFunc.time_ufunc_types('fmin')
+ 93.4±0.2μs 98.9±1μs 1.06 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'd')
- 103±0.6μs 98.3±0.8μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'I')
- 86.9±0.4μs 82.5±0.7μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'H')
- 87.3±1μs 82.6±1μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'I')
- 75.1±1μs 70.7±1μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'I')
- 73.1±0.2μs 68.6±0.5μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'i')
- 74.3±0.9μs 69.6±0.6μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'h')
- 73.7±0.5μs 69.0±0.4μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'h')
- 76.2±1μs 71.3±1μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'h')
- 73.3±0.3μs 68.4±0.4μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'i')
- 74.5±0.9μs 69.5±0.4μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'i')
- 75.0±0.9μs 69.8±0.9μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'i')
- 73.2±0.2μs 68.1±0.3μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'I')
- 82.9±0.8μs 77.2±0.5μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'H')
- 74.8±1μs 69.4±1μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'I')
- 82.4±0.9μs 76.5±0.8μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'H')
- 6.08±0.2μs 5.64±0.06μs 0.93 bench_lib.Nan.time_nanmin(200, 0.1)
- 6.15±0.1μs 5.70±0.05μs 0.93 bench_lib.Nan.time_nanmin(200, 2.0)
- 6.19±0.1μs 5.71±0.06μs 0.92 bench_lib.Nan.time_nanmin(200, 50.0)
- 1.01±0ms 931±5μs 0.92 bench_lib.Nan.time_nanargmin(200000, 90.0)
- 71.5±0.5μs 65.9±0.2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'h')
- 72.6±0.3μs 66.8±0.1μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'h')
- 72.7±0.3μs 66.8±0.2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'h')
- 6.10±0.2μs 5.61±0.08μs 0.92 bench_lib.Nan.time_nanmax(200, 0)
- 71.2±0.2μs 65.4±0.4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'h')
- 6.14±0.2μs 5.63±0.03μs 0.92 bench_lib.Nan.time_nanmin(200, 0)
- 72.5±0.2μs 66.5±0.5μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'h')
- 1.01±0ms 926±4μs 0.92 bench_lib.Nan.time_nanargmax(200000, 90.0)
- 80.1±0.2μs 73.4±0.5μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'H')
- 79.9±0.3μs 73.1±0.1μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'I')
- 82.1±1μs 75.1±1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'I')
- 81.1±0.7μs 74.1±0.8μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'H')
- 82.1±2μs 75.1±0.9μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'H')
- 81.5±0.6μs 74.4±0.7μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'I')
- 73.2±0.2μs 66.8±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'h')
- 6.17±0.2μs 5.63±0.03μs 0.91 bench_lib.Nan.time_nanmax(200, 0.1)
- 70.1±0.3μs 63.8±0.1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'h')
- 69.6±0.2μs 63.2±0.3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'h')
- 6.23±0.1μs 5.65±0.01μs 0.91 bench_lib.Nan.time_nanmax(200, 50.0)
- 70.7±0.2μs 64.1±0.03μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'h')
- 68.7±0.1μs 62.2±0.1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'h')
- 71.9±0.4μs 65.2±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'h')
- 72.0±0.2μs 65.2±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'h')
- 69.8±0.1μs 63.1±0.08μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'h')
- 6.26±0.2μs 5.66±0.01μs 0.90 bench_lib.Nan.time_nanmax(200, 2.0)
- 69.0±0.1μs 62.4±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'h')
- 70.2±0.1μs 63.5±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'h')
- 80.4±0.2μs 72.6±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'H')
- 78.9±0.2μs 71.1±0.5μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'H')
- 78.8±0.1μs 71.0±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'H')
- 78.0±0.1μs 70.3±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'H')
- 68.9±0.2μs 62.1±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'h')
- 79.2±0.3μs 71.3±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'H')
- 80.4±0.2μs 72.3±0.3μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'H')
- 77.3±0.2μs 69.4±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'H')
- 80.4±0.3μs 72.2±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'H')
- 6.33±0.1μs 5.68±0.02μs 0.90 bench_lib.Nan.time_nanmax(200, 90.0)
- 6.37±0.1μs 5.71±0.03μs 0.90 bench_lib.Nan.time_nanmin(200, 90.0)
- 80.0±0.2μs 71.7±0.4μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'H')
- 77.1±0.2μs 69.1±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'H')
- 77.7±0.3μs 69.5±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'H')
- 76.4±0.2μs 68.3±0.1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'H')
- 76.4±0.2μs 68.2±0.1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'H')
- 77.7±0.1μs 69.4±0.09μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'H')
- 79.6±0.3μs 71.0±0.2μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'H')
- 76.5±0.1μs 68.3±0.06μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'H')
- 559±2μs 492±1μs 0.88 bench_lib.Nan.time_nanargmax(200000, 2.0)
- 560±0.7μs 493±2μs 0.88 bench_lib.Nan.time_nanargmin(200000, 2.0)
- 93.9±0.5μs 81.2±0.8μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'h')
- 94.0±0.2μs 81.2±0.3μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'H')
- 93.5±0.2μs 80.3±0.4μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'H')
- 93.8±0.3μs 80.4±0.8μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'h')
- 449±2μs 385±2μs 0.86 bench_lib.Nan.time_nanargmin(200000, 0.1)
- 93.5±0.3μs 80.1±0.4μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'H')
- 446±2μs 380±0.9μs 0.85 bench_lib.Nan.time_nanargmin(200000, 0)
- 444±1μs 378±1μs 0.85 bench_lib.Nan.time_nanargmax(200000, 0)
- 449±2μs 381±2μs 0.85 bench_lib.Nan.time_nanargmax(200000, 0.1)
- 94.0±0.4μs 79.3±0.8μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'h')
- 78.1±0.3μs 65.3±0.2μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'B')
- 78.0±0.3μs 65.1±0.4μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'b')
- 78.0±0.09μs 64.8±0.3μs 0.83 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'b')
- 6.38±0.09μs 5.28±0.04μs 0.83 bench_reduce.MinMax.time_max(<class 'numpy.float32'>)
- 6.37±0.07μs 5.26±0.02μs 0.83 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
- 85.4±0.2μs 69.2±0.2μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'B')
- 76.9±0.1μs 61.9±0.3μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'b')
- 84.4±0.2μs 67.7±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'B')
- 84.4±0.2μs 67.7±0.1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'B')
- 77.1±0.2μs 61.8±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'b')
- 93.4±0.2μs 74.9±0.8μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'H')
- 77.1±0.2μs 61.6±0.09μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'b')
- 77.4±0.3μs 61.8±0.3μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'B')
- 84.7±0.2μs 67.5±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'B')
- 93.5±0.2μs 74.6±0.5μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'h')
- 77.1±0.2μs 61.5±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'b')
- 77.2±0.1μs 61.5±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'b')
- 77.2±0.2μs 61.5±0.1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'b')
- 83.9±0.1μs 66.8±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'B')
- 83.5±0.1μs 66.3±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'B')
- 84.0±0.1μs 66.7±0.09μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'B')
- 77.4±0.2μs 61.4±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'B')
- 93.1±0.2μs 73.8±0.7μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'h')
- 83.1±0.1μs 65.8±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'B')
- 93.0±0.2μs 73.7±1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'H')
- 83.7±0.1μs 66.3±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'B')
- 84.1±0.1μs 66.6±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'B')
- 92.9±0.7μs 73.5±0.5μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'H')
- 83.4±0.2μs 65.9±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'B')
- 76.5±0.2μs 60.4±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'b')
- 83.2±0.2μs 65.7±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'B')
- 77.7±0.3μs 61.3±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'B')
- 83.7±0.07μs 65.9±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'B')
- 76.7±0.2μs 60.4±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'b')
- 76.7±0.2μs 60.4±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'B')
- 83.1±0.2μs 65.5±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'B')
- 83.3±0.2μs 65.6±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'B')
- 82.9±0.1μs 65.2±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'B')
- 83.0±0.08μs 65.4±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'B')
- 82.9±0.1μs 65.2±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'B')
- 76.8±0.4μs 60.4±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'B')
- 82.8±0.06μs 65.1±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'B')
- 82.8±0.1μs 65.1±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'B')
- 83.1±0.2μs 65.3±0.06μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'B')
- 76.4±0.2μs 60.0±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'b')
- 82.8±0.02μs 64.9±0.04μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'B')
- 82.8±0.06μs 64.9±0.08μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'B')
- 82.8±0.02μs 64.9±0.1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'B')
- 82.7±0.02μs 64.9±0.08μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'B')
- 82.8±0.02μs 64.9±0.09μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'B')
- 76.5±0.2μs 59.9±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'b')
- 93.2±0.8μs 73.0±1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'h')
- 76.6±0.1μs 59.8±0.1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'b')
- 75.9±0.1μs 59.2±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'B')
- 76.1±0.2μs 59.3±0.4μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'B')
- 76.8±0.2μs 59.8±0.5μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'b')
- 76.1±0.1μs 59.3±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'b')
- 76.6±0.1μs 59.5±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'B')
- 76.2±0.2μs 59.0±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'B')
- 76.3±0.09μs 59.0±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'b')
- 76.0±0.3μs 58.7±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'b')
- 76.2±0.2μs 58.8±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'b')
- 76.4±0.2μs 59.0±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'b')
- 75.6±0.2μs 58.4±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'b')
- 76.4±0.2μs 58.9±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'b')
- 75.7±0.2μs 58.4±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'B')
- 75.7±0.09μs 58.3±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'B')
- 75.9±0.2μs 58.4±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'b')
- 93.2±0.3μs 71.7±1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'H')
- 75.9±0.2μs 58.4±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'b')
- 75.7±0.3μs 58.1±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'b')
- 75.5±0.2μs 58.0±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'b')
- 75.6±0.1μs 58.1±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'b')
- 75.7±0.08μs 58.1±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'B')
- 75.4±0.09μs 57.8±0.07μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'b')
- 75.9±0.08μs 58.2±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'b')
- 75.5±0.05μs 57.9±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'b')
- 75.8±0.2μs 58.1±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'B')
- 75.9±0.2μs 58.1±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'b')
- 75.8±0.1μs 58.1±0.04μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'b')
- 75.9±0.1μs 58.1±0.07μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'B')
- 75.5±0.1μs 57.8±0.09μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'b')
- 75.3±0.1μs 57.7±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'B')
- 75.7±0.08μs 58.0±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'b')
- 76.1±0.2μs 58.2±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'b')
- 75.5±0.05μs 57.7±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'b')
- 75.4±0.1μs 57.7±0.09μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'B')
- 75.5±0.1μs 57.7±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'b')
- 93.6±0.3μs 71.5±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'h')
- 76.0±0.3μs 58.1±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'B')
- 75.4±0.06μs 57.6±0.08μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'b')
- 76.1±0.08μs 58.2±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'b')
- 75.4±0.03μs 57.5±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'B')
- 75.4±0.05μs 57.5±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'b')
- 75.3±0.07μs 57.5±0.03μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'b')
- 75.4±0.1μs 57.5±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'B')
- 75.4±0.07μs 57.5±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'b')
- 76.1±0.3μs 58.1±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'b')
- 75.5±0.08μs 57.6±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'B')
- 75.5±0.07μs 57.6±0.08μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'b')
- 75.5±0.1μs 57.5±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'B')
- 75.6±0.06μs 57.6±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'b')
- 75.4±0.04μs 57.5±0.08μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'b')
- 75.5±0.1μs 57.5±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'b')
- 75.3±0.08μs 57.4±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'b')
- 75.5±0.08μs 57.5±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'B')
- 75.4±0.1μs 57.5±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'b')
- 75.5±0.1μs 57.5±0.08μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'b')
- 75.6±0.2μs 57.6±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'B')
- 75.6±0.1μs 57.6±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'b')
- 76.2±0.1μs 58.0±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'b')
- 75.4±0.06μs 57.4±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'B')
- 75.4±0.06μs 57.3±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'b')
- 75.5±0.06μs 57.4±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'B')
- 93.0±0.09μs 69.6±0.6μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'H')
- 93.0±0.6μs 69.2±0.4μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'h')
- 93.1±0.2μs 69.2±0.8μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'h')
- 92.8±0.5μs 68.8±0.3μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'H')
- 92.9±0.2μs 67.1±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'H')
- 92.9±0.2μs 66.9±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'H')
- 93.1±0.2μs 67.0±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'H')
- 93.1±0.2μs 67.0±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'h')
- 92.9±0.1μs 66.8±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'H')
- 93.1±0.2μs 66.9±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'h')
- 93.2±0.1μs 66.8±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'h')
- 93.0±0.1μs 66.5±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'h')
- 92.5±0.2μs 65.8±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'H')
- 92.5±0.06μs 65.7±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'H')
- 92.7±0.2μs 65.6±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'h')
- 92.8±0.2μs 65.5±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'h')
- 92.7±0.1μs 65.3±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'h')
- 92.6±0.2μs 65.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'H')
- 92.6±0.2μs 65.1±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'H')
- 92.9±0.1μs 65.0±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'h')
- 92.1±0.2μs 64.2±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'h')
- 92.1±0.1μs 64.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'H')
- 537±6μs 373±1μs 0.69 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'd')
- 91.6±0.1μs 63.5±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'H')
- 91.9±0.2μs 63.7±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'H')
- 91.4±0.2μs 63.2±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'H')
- 91.6±0.1μs 63.3±0.07μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'h')
- 91.6±0.07μs 63.1±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'h')
- 91.5±0.08μs 63.0±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'h')
- 91.7±0.08μs 63.1±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'H')
- 92.2±0.06μs 63.4±0.08μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'h')
- 90.9±0.2μs 62.4±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'H')
- 90.9±0.09μs 62.4±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'h')
- 90.8±0.09μs 62.4±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'h')
- 90.9±0.1μs 62.3±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'H')
- 90.8±0.1μs 62.3±0.06μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'H')
- 90.8±0.09μs 62.2±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'h')
- 544±6μs 372±0.9μs 0.68 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'd')
- 9.33±0.05μs 6.30±0.03μs 0.67 bench_reduce.MinMax.time_min(<class 'numpy.float64'>)
- 184±0.9μs 124±0.3μs 0.67 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
- 9.40±0.06μs 6.30±0.01μs 0.67 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
- 533±3μs 312±1μs 0.58 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'd')
- 531±5μs 310±0.5μs 0.58 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'd')
- 539±6μs 312±1μs 0.58 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'd')
- 539±5μs 309±0.9μs 0.57 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'd')
- 544±6μs 312±1μs 0.57 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'd')
- 544±5μs 310±0.3μs 0.57 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'd')
- 526±2μs 280±0.4μs 0.53 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'd')
- 583±30μs 306±10μs 0.53 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'd')
- 532±3μs 280±1μs 0.53 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'd')
- 139±0.4μs 72.2±1μs 0.52 bench_lib.Nan.time_nanmin(200000, 0)
- 538±7μs 280±1μs 0.52 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'd')
- 584±30μs 303±10μs 0.52 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'd')
- 142±0.09μs 73.5±1μs 0.52 bench_lib.Nan.time_nanmin(200000, 0.1)
- 544±9μs 279±1μs 0.51 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'd')
- 506±2μs 248±3μs 0.49 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'd')
- 512±2μs 248±2μs 0.48 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'd')
- 520±4μs 249±0.7μs 0.48 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'd')
- 524±2μs 250±0.6μs 0.48 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'd')
- 526±2μs 249±0.8μs 0.47 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'd')
- 531±2μs 248±0.3μs 0.47 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'd')
- 70.6±0.2μs 31.5±1μs 0.45 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'I')
- 71.0±0.2μs 31.3±0.7μs 0.44 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'i')
- 540±20μs 236±8μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'd')
- 540±30μs 236±10μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'd')
- 545±20μs 236±10μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'd')
- 70.9±0.2μs 30.7±0.7μs 0.43 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'i')
- 545±20μs 236±8μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'd')
- 507±0.9μs 217±0.6μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'd')
- 509±2μs 218±0.6μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'd')
- 507±5μs 217±1μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'd')
- 514±0.4μs 217±1μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'd')
- 515±2μs 218±0.5μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'd')
- 514±5μs 217±0.8μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'd')
- 516±2μs 217±0.5μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'd')
- 520±2μs 216±0.7μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'd')
- 177±0.3μs 72.9±0.6μs 0.41 bench_lib.Nan.time_nanmin(200000, 2.0)
- 78.2±0.1μs 31.0±0.6μs 0.40 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'I')
- 519±20μs 198±8μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'd')
- 527±20μs 199±6μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'd')
- 497±0.8μs 186±1μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'f')
- 494±0.4μs 185±0.4μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'd')
- 496±2μs 185±1μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'f')
- 500±2μs 185±0.4μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'd')
- 501±0.5μs 185±0.3μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'd')
- 504±0.6μs 186±0.3μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'd')
- 507±1μs 185±0.7μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'd')
- 203±0.2μs 73.3±0.5μs 0.36 bench_lib.Nan.time_nanmax(200000, 0.1)
- 512±2μs 185±0.3μs 0.36 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'd')
- 15.3±0.05μs 5.52±0.03μs 0.36 bench_reduce.MinMax.time_min(<class 'numpy.int64'>)
- 15.3±0.07μs 5.51±0.03μs 0.36 bench_reduce.MinMax.time_max(<class 'numpy.uint64'>)
- 15.3±0.08μs 5.49±0.03μs 0.36 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
- 199±0.2μs 70.9±1μs 0.36 bench_lib.Nan.time_nanmax(200000, 0)
- 486±2μs 155±0.4μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'd')
- 488±0.4μs 155±0.09μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'f')
- 489±0.5μs 155±0.3μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'f')
- 490±2μs 155±0.4μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'f')
- 490±1μs 155±0.6μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'f')
- 488±1μs 154±0.2μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'd')
- 13.7±0.01μs 4.33±0.05μs 0.32 bench_reduce.FMinMax.time_min(<class 'numpy.float64'>)
- 494±0.8μs 155±0.8μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'd')
- 494±1μs 155±0.6μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'f')
- 494±1μs 155±0.7μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'f')
- 231±0.2μs 72.3±1μs 0.31 bench_lib.Nan.time_nanmax(200000, 2.0)
- 493±0.8μs 154±0.4μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'd')
- 495±1μs 155±0.6μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'd')
- 501±0.4μs 154±0.3μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'd')
- 15.3±0.08μs 4.47±0.02μs 0.29 bench_reduce.MinMax.time_max(<class 'numpy.int32'>)
- 15.3±0.03μs 4.46±0.03μs 0.29 bench_reduce.MinMax.time_max(<class 'numpy.uint32'>)
- 15.4±0.04μs 4.47±0.01μs 0.29 bench_reduce.MinMax.time_min(<class 'numpy.int32'>)
- 487±1μs 140±0.5μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'f')
- 486±2μs 140±0.6μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'f')
- 487±0.8μs 139±0.6μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'f')
- 488±0.3μs 140±0.6μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'f')
- 493±0.8μs 139±0.6μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'f')
- 494±2μs 139±0.4μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'f')
- 21.2±0.09μs 5.52±0.03μs 0.26 bench_reduce.MinMax.time_min(<class 'numpy.uint64'>)
- 482±0.5μs 125±0.3μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'd')
- 15.3±0.04μs 3.96±0.04μs 0.26 bench_reduce.MinMax.time_max(<class 'numpy.uint16'>)
- 15.3±0.04μs 3.95±0.02μs 0.26 bench_reduce.MinMax.time_max(<class 'numpy.int16'>)
- 484±0.4μs 125±0.3μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'f')
- 485±0.6μs 125±0.4μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'f')
- 489±0.6μs 126±0.9μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'f')
- 15.3±0.05μs 3.92±0.02μs 0.26 bench_reduce.MinMax.time_min(<class 'numpy.int16'>)
- 489±2μs 125±0.8μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'f')
- 488±0.8μs 125±0.4μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'f')
- 488±0.6μs 125±0.2μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'f')
- 486±0.5μs 124±0.2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'd')
- 491±0.8μs 125±0.9μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'd')
- 487±0.8μs 124±0.2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'd')
- 494±1μs 124±0.6μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'd')
- 495±0.3μs 124±0.4μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'd')
- 15.3±0.06μs 3.78±0.01μs 0.25 bench_reduce.MinMax.time_max(<class 'numpy.uint8'>)
- 15.3±0.05μs 3.71±0.02μs 0.24 bench_reduce.MinMax.time_min(<class 'numpy.int8'>)
- 15.4±0.05μs 3.70±0.02μs 0.24 bench_reduce.MinMax.time_max(<class 'numpy.int8'>)
- 13.7±0.02μs 3.23±0.05μs 0.24 bench_reduce.FMinMax.time_min(<class 'numpy.float32'>)
- 13.7±0.01μs 3.19±0.04μs 0.23 bench_reduce.FMinMax.time_max(<class 'numpy.float32'>)
- 482±0.5μs 110±0.4μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'f')
- 481±0.3μs 110±0.3μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'f')
- 483±0.7μs 110±0.2μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'f')
- 482±0.3μs 109±0.3μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'f')
- 486±0.8μs 109±0.2μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'f')
- 488±1μs 110±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'f')
- 485±0.8μs 109±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'f')
- 487±0.4μs 109±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'f')
- 486±1μs 109±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'f')
- 487±0.6μs 109±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'f')
- 488±0.5μs 109±0.4μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'f')
- 489±2μs 109±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'f')
- 19.6±0.01μs 4.33±0.04μs 0.22 bench_reduce.FMinMax.time_max(<class 'numpy.float64'>)
- 21.2±0.04μs 4.45±0.02μs 0.21 bench_reduce.MinMax.time_min(<class 'numpy.uint32'>)
- 481±0.4μs 98.1±0.5μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'd')
- 489±0.7μs 98.3±0.5μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'd')
- 497±10μs 98.3±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'f')
- 497±10μs 98.1±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'f')
- 483±0.3μs 94.7±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'f')
- 483±0.4μs 94.5±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'f')
- 486±1μs 93.6±0.2μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'f')
- 486±0.4μs 93.5±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'f')
- 485±0.7μs 93.3±0.2μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'f')
- 486±0.8μs 93.4±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'f')
- 21.2±0.04μs 3.98±0.01μs 0.19 bench_reduce.MinMax.time_min(<class 'numpy.uint16'>)
- 21.3±0.06μs 3.73±0.02μs 0.18 bench_reduce.MinMax.time_min(<class 'numpy.uint8'>)
- 481±0.8μs 79.2±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'f')
- 481±0.2μs 78.8±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'f')
- 481±0.3μs 78.7±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'f')
- 483±0.5μs 79.1±0.2μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'f')
- 481±0.6μs 78.6±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'f')
- 483±0.2μs 79.0±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'f')
- 524±0.7μs 74.2±1μs 0.14 bench_lib.Nan.time_nanmax(200000, 90.0)
- 529±0.5μs 72.7±0.3μs 0.14 bench_lib.Nan.time_nanmin(200000, 90.0)
- 478±0.5μs 62.6±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'f')
- 478±0.3μs 61.6±0.9μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'f')
- 481±0.5μs 59.1±0.9μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'f')
- 480±0.7μs 58.9±0.8μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'f')
- 480±0.3μs 58.6±0.7μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'f')
- 480±0.8μs 58.4±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'f')
- 68.0±0.04μs 8.15±0.2μs 0.12 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'h')
- 75.7±0.1μs 8.48±0.2μs 0.11 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'H')
- 90.4±0.1μs 8.35±0.3μs 0.09 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'H')
- 90.3±0.1μs 8.29±0.1μs 0.09 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'h')
- 1.01±0ms 74.0±0.5μs 0.07 bench_lib.Nan.time_nanmax(200000, 50.0)
- 477±0.7μs 34.3±1μs 0.07 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'f')
- 1.01±0ms 72.6±0.5μs 0.07 bench_lib.Nan.time_nanmin(200000, 50.0)
- 476±1μs 33.8±0.6μs 0.07 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'f')
- 75.3±0.03μs 4.57±0.1μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'b')
- 75.3±0.05μs 4.55±0.2μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'b')
- 75.3±0.02μs 4.55±0.1μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'B')
- 82.7±0.05μs 4.57±0.1μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'B')
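For reading the rows above: the ratio column appears to be the "after" timing divided by the "before" timing, so a ratio of 0.06 means the new code runs in roughly 6% of the original time. Below is a minimal sketch of that arithmetic, assuming each row is whitespace-separated as `marker before after ratio benchmark`. The helper names `to_microseconds` and `speedup` are hypothetical and are not part of asv or NumPy.

```python
import re

# Multipliers to microseconds for the units that appear in the rows above.
_UNIT_TO_US = {"ns": 1e-3, "μs": 1.0, "ms": 1e3, "s": 1e6}
# Matches tokens such as "75.3±0.03μs" or "1.01±0ms".
_TIME = re.compile(r"^([\d.]+)±[\d.]+(ns|μs|ms|s)$")


def to_microseconds(token):
    """Convert one 'value±error<unit>' token to microseconds (error is ignored)."""
    match = _TIME.match(token)
    if match is None:
        raise ValueError(f"unrecognised timing token: {token!r}")
    return float(match.group(1)) * _UNIT_TO_US[match.group(2)]


def speedup(row):
    """Return (ratio, speedup) for one comparison row.

    Assumes the layout '<+/-> <before> <after> <ratio> <benchmark name>'.
    """
    fields = row.split()
    before = to_microseconds(fields[1])
    after = to_microseconds(fields[2])
    return after / before, before / after


row = "- 75.3±0.03μs 4.57±0.1μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'b')"
ratio, gain = speedup(row)
print(f"ratio={ratio:.2f}, speedup={gain:.1f}x")  # ratio=0.06, speedup=16.5x
```

Recomputing from the printed timings only reproduces the ratio to the precision shown, since the table values are themselves rounded.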
**AVX2**
export NPY_DISABLE_CPU_FEATURES="AVX512F AVX512_SKX"
python runtests.py -n --bench-compare parent/main "max|min" -- --sort ratio
+ 122±0.3μs 220±2μs 1.81 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
- 87.5±1μs 83.2±0.8μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'I')
- 86.6±0.5μs 82.3±0.4μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'H')
- 72.9±0.5μs 69.1±0.4μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'i')
- 72.9±0.4μs 69.0±0.3μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'I')
- 422±0.7μs 398±2μs 0.94 bench_ufunc.UFunc.time_ufunc_types('maximum')
- 74.5±1μs 69.7±0.6μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'h')
- 74.8±1μs 69.8±0.6μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'i')
- 74.2±0.5μs 69.2±0.7μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'h')
- 82.6±0.8μs 77.0±0.7μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'H')
- 73.8±0.4μs 68.8±0.07μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'i')
- 83.1±0.6μs 77.2±0.6μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'H')
- 81.8±1μs 76.0±1μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'H')
- 983±3μs 911±5μs 0.93 bench_lib.Nan.time_nanargmax(200000, 90.0)
- 75.1±0.8μs 69.6±0.9μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'i')
- 72.6±0.3μs 67.2±0.5μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'h')
- 73.0±0.1μs 67.5±0.3μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'h')
- 71.3±0.2μs 65.9±0.1μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'h')
- 71.5±0.3μs 66.1±0.4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'h')
- 80.1±0.4μs 73.9±0.4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'H')
- 988±6μs 911±3μs 0.92 bench_lib.Nan.time_nanargmin(200000, 90.0)
- 80.9±0.7μs 74.6±0.8μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'H')
- 73.2±0.5μs 67.4±0.2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'h')
- 73.1±0.2μs 67.1±0.4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'h')
- 82.0±0.9μs 75.0±1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'I')
- 80.1±0.3μs 73.3±0.08μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'I')
- 71.7±0.2μs 65.4±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'h')
- 69.6±0.1μs 63.5±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'h')
- 70.2±0.2μs 64.0±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'h')
- 81.8±0.4μs 74.5±0.6μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'I')
- 68.7±0.1μs 62.6±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'h')
- 70.0±0.03μs 63.7±0.3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'h')
- 68.7±0.2μs 62.5±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'h')
- 69.8±0.08μs 63.4±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'h')
- 80.2±0.3μs 72.9±0.2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'H')
- 80.2±0.3μs 72.8±0.3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'H')
- 419±0.7μs 380±0.5μs 0.91 bench_ufunc.UFunc.time_ufunc_types('minimum')
- 71.0±0.3μs 64.3±0.3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'h')
- 68.9±0.2μs 62.4±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'h')
- 79.8±0.2μs 72.0±0.5μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'H')
- 79.1±0.3μs 71.4±0.5μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'H')
- 78.9±0.2μs 71.2±0.4μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'H')
- 72.1±0.2μs 65.0±0.6μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'h')
- 79.4±0.1μs 71.6±0.3μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'H')
- 80.6±0.1μs 72.6±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'H')
- 543±2μs 489±2μs 0.90 bench_lib.Nan.time_nanargmax(200000, 2.0)
- 77.3±0.1μs 69.4±0.07μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'H')
- 78.4±0.2μs 70.3±0.3μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'H')
- 79.7±0.4μs 71.4±0.2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'H')
- 76.4±0.1μs 68.4±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'H')
- 77.7±0.2μs 69.6±0.07μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'H')
- 77.9±0.2μs 69.8±0.1μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'H')
- 76.5±0.2μs 68.4±0.2μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'H')
- 76.6±0.1μs 68.4±0.2μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'H')
- 77.6±0.08μs 69.3±0.1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'H')
- 548±4μs 485±0.9μs 0.89 bench_lib.Nan.time_nanargmin(200000, 2.0)
- 6.29±0.01μs 5.57±0.1μs 0.89 bench_reduce.MinMax.time_max(<class 'numpy.float32'>)
- 6.28±0μs 5.54±0.09μs 0.88 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
- 435±2μs 376±2μs 0.86 bench_lib.Nan.time_nanargmax(200000, 0)
- 438±1μs 378±2μs 0.86 bench_lib.Nan.time_nanargmax(200000, 0.1)
- 94.5±0.3μs 81.3±0.6μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'H')
- 436±2μs 375±2μs 0.86 bench_lib.Nan.time_nanargmin(200000, 0)
- 94.1±0.3μs 80.8±0.5μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'h')
- 93.8±0.5μs 80.3±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'h')
- 94.1±0.3μs 80.5±0.2μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'H')
- 442±2μs 378±2μs 0.86 bench_lib.Nan.time_nanargmin(200000, 0.1)
- 93.9±0.2μs 80.0±0.5μs 0.85 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'h')
- 94.2±0.6μs 80.2±0.6μs 0.85 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'H')
- 78.3±0.3μs 65.1±0.4μs 0.83 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'b')
- 77.8±0.2μs 64.6±0.2μs 0.83 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'b')
- 78.7±0.3μs 64.8±0.2μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'B')
- 9.28±0.01μs 7.64±0.1μs 0.82 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
- 9.23±0.01μs 7.59±0.1μs 0.82 bench_reduce.MinMax.time_min(<class 'numpy.float64'>)
- 85.3±0.1μs 69.2±0.2μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'B')
- 77.0±0.2μs 61.8±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'b')
- 84.6±0.1μs 67.8±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'B')
- 84.7±0.1μs 67.8±0.08μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'B')
- 84.6±0.5μs 67.7±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'B')
- 93.1±0.2μs 74.5±1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'H')
- 93.4±0.7μs 74.6±0.9μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'H')
- 77.3±0.1μs 61.7±0.4μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'b')
- 77.2±0.2μs 61.6±0.2μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'b')
- 93.8±0.2μs 74.8±1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'H')
- 93.4±0.2μs 74.5±0.8μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'h')
- 77.5±0.2μs 61.7±0.3μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'B')
- 94.0±0.3μs 74.9±0.8μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'h')
- 77.5±0.1μs 61.7±0.07μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'b')
- 84.0±0.2μs 66.9±0.4μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'B')
- 77.5±0.06μs 61.5±0.7μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'B')
- 84.2±0.2μs 66.8±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'B')
- 77.5±0.3μs 61.4±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'b')
- 83.7±0.1μs 66.3±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'B')
- 84.3±0.2μs 66.8±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'B')
- 93.2±0.6μs 73.8±0.8μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'h')
- 77.4±0.3μs 61.3±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'b')
- 83.6±0.05μs 66.1±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'B')
- 83.2±0.1μs 65.7±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'B')
- 76.6±0.1μs 60.5±0.6μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'b')
- 76.7±0.09μs 60.5±0.4μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'b')
- 77.8±0.2μs 61.4±0.3μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'B')
- 83.4±0.2μs 65.8±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'B')
- 76.6±0.2μs 60.5±0.4μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'b')
- 83.3±0.1μs 65.7±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'B')
- 83.3±0.3μs 65.7±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'B')
- 83.7±0.1μs 66.0±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'B')
- 83.2±0.1μs 65.5±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'B')
- 77.0±0.4μs 60.6±0.4μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'B')
- 83.1±0.1μs 65.3±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'B')
- 82.8±0.05μs 65.1±0.06μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'B')
- 76.8±0.1μs 60.4±0.6μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'b')
- 83.1±0.2μs 65.3±0.06μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'B')
- 82.8±0.02μs 65.1±0.08μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'B')
- 82.9±0.3μs 65.2±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'B')
- 82.8±0.04μs 65.0±0.04μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'B')
- 82.8±0.02μs 65.0±0.06μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'B')
- 83.5±0.1μs 65.5±0.1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'B')
- 83.0±0.2μs 65.0±0.03μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'B')
- 83.3±0.2μs 65.2±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'B')
- 82.9±0.06μs 64.8±0.03μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'B')
- 76.8±0.2μs 60.1±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'B')
- 76.0±0.2μs 59.3±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'b')
- 76.1±0.09μs 59.0±0.4μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'b')
- 76.3±0.2μs 59.1±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'b')
- 77.3±0.3μs 59.8±0.6μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'b')
- 76.5±0.2μs 59.2±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'b')
- 76.1±0.07μs 58.8±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'b')
- 77.3±0.2μs 59.7±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'b')
- 77.3±0.3μs 59.7±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'B')
- 76.4±0.2μs 58.9±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'B')
- 76.4±0.2μs 58.9±0.5μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'b')
- 75.7±0.1μs 58.3±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'B')
- 75.8±0.2μs 58.3±0.6μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'B')
- 76.5±0.1μs 58.9±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'B')
- 76.5±0.1μs 58.8±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'B')
- 76.2±0.3μs 58.6±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'B')
- 75.9±0.2μs 58.3±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'b')
- 93.6±0.2μs 72.0±0.8μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'H')
- 76.0±0.2μs 58.4±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'B')
- 75.9±0.1μs 58.3±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'b')
- 75.7±0.2μs 58.2±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'b')
- 76.0±0.2μs 58.3±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'b')
- 75.9±0.2μs 58.2±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'b')
- 75.9±0.2μs 58.2±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'b')
- 75.8±0.2μs 58.1±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'b')
- 76.0±0.2μs 58.2±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'B')
- 75.5±0.1μs 57.8±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'B')
- 75.9±0.3μs 58.1±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'b')
- 75.5±0.1μs 57.8±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'b')
- 75.6±0.1μs 57.9±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'B')
- 75.4±0.03μs 57.6±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'b')
- 75.4±0.1μs 57.7±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'B')
- 75.4±0.03μs 57.6±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'B')
- 75.6±0.1μs 57.8±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'b')
- 76.0±0.2μs 58.1±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'B')
- 75.4±0.04μs 57.6±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'b')
- 76.0±0.2μs 58.0±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'b')
- 76.0±0.2μs 58.1±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'b')
- 75.4±0.02μs 57.6±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'b')
- 75.4±0.06μs 57.6±0.09μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'B')
- 75.5±0.03μs 57.6±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'b')
- 75.5±0.04μs 57.6±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'b')
- 75.5±0.2μs 57.6±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'B')
- 75.9±0.07μs 57.9±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'b')
- 75.4±0.07μs 57.6±0.02μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'B')
- 75.4±0.03μs 57.5±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'B')
- 76.0±0.2μs 58.0±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'b')
- 75.4±0.04μs 57.5±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'B')
- 75.4±0.06μs 57.5±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'b')
- 75.5±0.06μs 57.6±0.03μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'b')
- 75.6±0.06μs 57.6±0.09μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'b')
- 75.6±0.1μs 57.7±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'b')
- 75.4±0.03μs 57.5±0.03μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'b')
- 93.2±0.2μs 71.0±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'h')
- 75.3±0.04μs 57.4±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'b')
- 75.4±0.06μs 57.5±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'b')
- 75.7±0.07μs 57.7±0.05μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'b')
- 75.8±0.1μs 57.8±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'b')
- 75.6±0.07μs 57.6±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'b')
- 75.7±0.1μs 57.6±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'B')
- 75.6±0.1μs 57.6±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'b')
- 75.5±0.1μs 57.5±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'b')
- 75.5±0.07μs 57.4±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'b')
- 93.3±0.2μs 70.2±0.8μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'H')
- 93.2±0.2μs 70.0±0.4μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'h')
- 93.1±0.4μs 69.4±0.4μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'h')
- 93.2±0.6μs 69.3±0.6μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'H')
- 92.9±0.1μs 67.4±0.2μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'H')
- 93.3±0.2μs 67.4±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'H')
- 93.0±0.3μs 67.2±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'h')
- 93.0±0.08μs 67.1±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'h')
- 93.1±0.3μs 67.0±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'h')
- 93.1±0.2μs 67.1±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'h')
- 93.2±0.2μs 67.1±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'H')
- 529±3μs 381±5μs 0.72 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'd')
- 93.3±0.2μs 67.1±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'H')
- 92.5±0.1μs 66.2±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'H')
- 92.9±0.4μs 66.2±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'h')
- 92.6±0.1μs 65.9±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'H')
- 92.9±0.2μs 66.0±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'h')
- 92.9±0.2μs 65.7±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'H')
- 92.6±0.2μs 65.4±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'h')
- 544±4μs 384±6μs 0.71 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'd')
- 92.8±0.1μs 65.3±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'H')
- 92.9±0.2μs 65.1±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'h')
- 92.3±0.08μs 64.6±0.3μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'H')
- 91.8±0.08μs 64.0±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'H')
- 92.4±0.1μs 64.4±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'h')
- 91.7±0.06μs 63.7±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'H')
- 92.0±0.09μs 63.8±0.07μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'h')
- 91.7±0.2μs 63.5±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'h')
- 91.5±0.1μs 63.3±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'H')
- 91.8±0.1μs 63.5±0.08μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'h')
- 91.5±0.1μs 63.3±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'H')
- 90.8±0.1μs 62.5±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'H')
- 92.0±0.2μs 63.4±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'h')
- 90.8±0.08μs 62.4±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'H')
- 90.8±0.1μs 62.5±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'H')
- 90.9±0.2μs 62.4±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'h')
- 91.0±0.09μs 62.4±0.2μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'h')
- 91.0±0.1μs 62.3±0.2μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'h')
- 184±1μs 124±0.2μs 0.67 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
- 527±5μs 318±3μs 0.60 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'd')
- 530±3μs 319±2μs 0.60 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'd')
- 531±3μs 318±1μs 0.60 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'd')
- 644±4μs 380±8μs 0.59 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 4, 'd')
- 541±5μs 318±2μs 0.59 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'd')
- 538±5μs 314±4μs 0.58 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'd')
- 652±4μs 378±6μs 0.58 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'd')
- 544±4μs 314±6μs 0.58 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'd')
- 134±0.2μs 76.2±1μs 0.57 bench_lib.Nan.time_nanmin(200000, 0)
- 138±0.2μs 78.0±0.6μs 0.57 bench_lib.Nan.time_nanmin(200000, 0.1)
- 572±30μs 315±10μs 0.55 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'd')
- 523±3μs 285±1μs 0.54 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'd')
- 527±10μs 286±0.9μs 0.54 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'd')
- 537±10μs 285±2μs 0.53 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'd')
- 533±1μs 282±4μs 0.53 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'd')
- 15.2±0.02μs 7.90±0.1μs 0.52 bench_reduce.MinMax.time_max(<class 'numpy.uint64'>)
- 592±30μs 306±10μs 0.52 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'd')
- 504±2μs 257±3μs 0.51 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'd')
- 637±5μs 316±4μs 0.50 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 2, 'd')
- 632±5μs 314±3μs 0.50 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 4, 'd')
- 637±6μs 315±5μs 0.49 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 4, 'd')
- 514±3μs 253±5μs 0.49 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'd')
- 515±3μs 253±2μs 0.49 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'd')
- 522±2μs 254±1μs 0.49 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'd')
- 657±4μs 316±1μs 0.48 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'd')
- 525±3μs 250±2μs 0.48 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'd')
- 657±5μs 313±0.8μs 0.48 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'd')
- 660±4μs 313±3μs 0.47 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'd')
- 533±4μs 252±3μs 0.47 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'd')
- 70.7±0.2μs 32.0±1μs 0.45 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'I')
- 172±0.1μs 77.7±0.2μs 0.45 bench_lib.Nan.time_nanmin(200000, 2.0)
- 540±30μs 244±10μs 0.45 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'd')
- 535±30μs 241±10μs 0.45 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'd')
- 15.2±0.01μs 6.83±0.1μs 0.45 bench_reduce.MinMax.time_min(<class 'numpy.int64'>)
- 70.7±0.4μs 31.7±0.3μs 0.45 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'i')
- 70.9±0.4μs 31.8±0.7μs 0.45 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'i')
- 629±2μs 280±0.8μs 0.45 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 4, 'd')
- 15.2±0.03μs 6.77±0.1μs 0.44 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
- 639±10μs 283±3μs 0.44 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 1, 'd')
- 546±30μs 239±10μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'd')
- 13.7±0.02μs 5.99±0.06μs 0.44 bench_reduce.FMinMax.time_min(<class 'numpy.float64'>)
- 693±40μs 303±10μs 0.44 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 4, 'd')
- 505±5μs 220±0.4μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'd')
- 506±2μs 220±0.8μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'd')
- 511±2μs 221±1μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'd')
- 507±1μs 219±0.7μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'd')
- 547±20μs 236±10μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'd')
- 652±2μs 281±0.9μs 0.43 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'd')
- 659±6μs 283±2μs 0.43 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'd')
- 721±40μs 309±10μs 0.43 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'd')
- 515±0.5μs 218±3μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'd')
- 517±4μs 219±2μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'd')
- 517±3μs 219±2μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'd')
- 522±2μs 219±1μs 0.42 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'd')
- 606±1μs 251±3μs 0.41 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 4, 'd')
- 78.2±0.3μs 31.7±0.3μs 0.40 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'I')
- 624±3μs 249±3μs 0.40 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 2, 'd')
- 196±0.3μs 78.2±0.8μs 0.40 bench_lib.Nan.time_nanmax(200000, 0.1)
- 634±2μs 252±3μs 0.40 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'd')
- 629±2μs 250±2μs 0.40 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 2, 'd')
- 193±0.4μs 76.6±0.6μs 0.40 bench_lib.Nan.time_nanmax(200000, 0)
- 517±20μs 201±7μs 0.39 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'd')
- 649±2μs 250±2μs 0.39 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'd')
- 495±0.5μs 190±1μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'f')
- 654±1μs 250±0.9μs 0.38 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'd')
- 494±1μs 188±1μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'f')
- 492±0.5μs 187±0.4μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'd')
- 498±0.9μs 188±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 4, 'f')
- 21.1±0.02μs 7.94±0.1μs 0.38 bench_reduce.MinMax.time_min(<class 'numpy.uint64'>)
- 528±20μs 199±7μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'd')
- 500±0.8μs 188±0.3μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'd')
- 502±1μs 187±0.4μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'd')
- 502±0.5μs 186±0.9μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'd')
- 509±2μs 187±1μs 0.37 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'd')
- 511±0.5μs 186±2μs 0.36 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'd')
- 651±30μs 237±10μs 0.36 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 4, 'd')
- 654±30μs 238±9μs 0.36 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 4, 'd')
- 614±4μs 218±1μs 0.36 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 1, 'd')
- 612±0.6μs 218±0.6μs 0.36 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 2, 'd')
- 614±3μs 218±2μs 0.36 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 2, 'd')
- 617±0.9μs 218±1μs 0.35 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 1, 'd')
- 224±0.2μs 78.3±0.2μs 0.35 bench_lib.Nan.time_nanmax(200000, 2.0)
- 682±40μs 236±10μs 0.35 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'd')
- 684±30μs 235±6μs 0.34 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'd')
- 638±3μs 218±1μs 0.34 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'd')
- 640±4μs 219±0.8μs 0.34 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'd')
- 637±1μs 218±1μs 0.34 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'd')
- 644±0.9μs 219±2μs 0.34 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'd')
- 15.2±0.02μs 4.93±0.1μs 0.32 bench_reduce.MinMax.time_max(<class 'numpy.int32'>)
- 488±0.7μs 157±0.6μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'f')
- 489±0.7μs 158±0.7μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'f')
- 490±0.6μs 158±2μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'f')
- 486±1μs 156±0.6μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'd')
- 493±0.6μs 158±0.7μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'f')
- 15.3±0.01μs 4.89±0.1μs 0.32 bench_reduce.MinMax.time_max(<class 'numpy.uint32'>)
- 15.2±0.03μs 4.88±0.2μs 0.32 bench_reduce.MinMax.time_min(<class 'numpy.int32'>)
- 489±1μs 156±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'd')
- 493±1μs 158±0.3μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'd')
- 493±1μs 157±2μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 4, 'f')
- 590±1μs 188±2μs 0.32 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'f')
- 489±0.4μs 156±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'f')
- 492±0.7μs 157±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 4, 'f')
- 494±0.8μs 157±0.6μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'f')
- 626±30μs 198±6μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 4, 'd')
- 497±0.9μs 157±0.8μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 2, 'f')
- 496±1μs 156±1μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'd')
- 496±1μs 156±0.8μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'd')
- 594±0.7μs 186±1μs 0.31 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 2, 'd')
- 501±0.3μs 156±2μs 0.31 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'd')
- 606±8μs 186±0.7μs 0.31 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 1, 'd')
- 608±1μs 186±1μs 0.31 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 1, 'd')
- 19.6±0.02μs 5.99±0.07μs 0.31 bench_reduce.FMinMax.time_max(<class 'numpy.float64'>)
- 660±30μs 200±5μs 0.30 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'd')
- 622±2μs 186±0.8μs 0.30 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'd')
- 635±3μs 187±0.4μs 0.30 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'd')
- 633±2μs 186±0.6μs 0.29 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'd')
- 487±1μs 142±0.5μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'f')
- 489±2μs 143±1μs 0.29 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 4, 'f')
- 487±1μs 142±2μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'f')
- 488±0.5μs 142±0.3μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'f')
- 490±0.7μs 142±0.8μs 0.29 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 4, 'f')
- 487±0.7μs 141±1μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'f')
- 13.7±0.04μs 3.95±0.06μs 0.29 bench_reduce.FMinMax.time_min(<class 'numpy.float32'>)
- 493±0.9μs 141±0.3μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'f')
- 13.7±0.03μs 3.90±0.05μs 0.29 bench_reduce.FMinMax.time_max(<class 'numpy.float32'>)
- 492±0.5μs 140±1μs 0.29 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'f')
- 496±1μs 140±1μs 0.28 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 1, 'f')
- 15.2±0.01μs 4.08±0.1μs 0.27 bench_reduce.MinMax.time_min(<class 'numpy.int16'>)
- 15.2±0.02μs 4.08±0.1μs 0.27 bench_reduce.MinMax.time_max(<class 'numpy.uint16'>)
- 15.2±0.02μs 4.05±0.1μs 0.27 bench_reduce.MinMax.time_max(<class 'numpy.int16'>)
- 585±2μs 157±1μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'f')
- 586±0.9μs 156±0.6μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'f')
- 590±1μs 156±0.5μs 0.26 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'f')
- 587±0.9μs 155±0.5μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 2, 'd')
- 589±1μs 155±0.6μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 2, 'd')
- 487±0.6μs 128±0.7μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'f')
- 484±0.2μs 126±0.4μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'f')
- 594±0.9μs 155±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 1, 'd')
- 488±1μs 127±0.3μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'f')
- 483±0.5μs 126±2μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'd')
- 488±1μs 127±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'f')
- 492±2μs 128±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 2, 'f')
- 485±0.6μs 126±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'f')
- 489±2μs 126±0.9μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'f')
- 487±1μs 126±0.6μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'd')
- 486±0.3μs 125±0.5μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'd')
- 487±0.5μs 125±0.2μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 4, 'f')
- 491±1μs 126±0.7μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'd')
- 490±1μs 126±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 2, 'f')
- 495±0.6μs 125±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'd')
- 496±0.3μs 125±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'd')
- 617±0.8μs 155±0.5μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'd')
- 617±1μs 155±0.3μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'd')
- 621±1μs 155±0.2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'd')
- 15.2±0.01μs 3.74±0.1μs 0.25 bench_reduce.MinMax.time_max(<class 'numpy.uint8'>)
- 15.2±0.01μs 3.74±0.1μs 0.25 bench_reduce.MinMax.time_max(<class 'numpy.int8'>)
- 582±2μs 142±0.3μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'f')
- 15.2±0.02μs 3.70±0.1μs 0.24 bench_reduce.MinMax.time_min(<class 'numpy.int8'>)
- 585±0.9μs 142±0.6μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'f')
- 484±3μs 117±0.6μs 0.24 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 4, 'f')
- 484±0.7μs 115±1μs 0.24 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 4, 'f')
- 589±1μs 140±1μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'f')
- 487±0.8μs 112±0.8μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'f')
- 21.1±0.01μs 4.87±0.2μs 0.23 bench_reduce.MinMax.time_min(<class 'numpy.uint32'>)
- 482±0.4μs 111±0.4μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'f')
- 486±0.5μs 112±0.8μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'f')
- 482±0.7μs 111±0.9μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'f')
- 482±1μs 111±0.8μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'f')
- 481±0.7μs 111±0.7μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'f')
- 488±0.7μs 112±0.6μs 0.23 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 2, 'f')
- 485±0.9μs 111±0.6μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'f')
- 489±0.5μs 112±0.5μs 0.23 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 2, 'f')
- 485±0.7μs 111±1μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'f')
- 488±0.3μs 111±0.4μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'f')
- 488±1μs 110±0.4μs 0.23 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'f')
- 489±0.6μs 110±1μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'f')
- 489±2μs 110±0.7μs 0.22 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'f')
- 491±0.7μs 110±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 1, 'f')
- 491±2μs 109±0.4μs 0.22 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 1, 'f')
- 585±2μs 126±1μs 0.22 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'f')
- 581±0.7μs 125±0.3μs 0.22 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'f')
- 586±1μs 126±0.3μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'f')
- 585±3μs 125±0.6μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 2, 'd')
- 588±0.4μs 125±0.7μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 1, 'd')
- 498±10μs 106±3μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 4, 'f')
- 589±0.3μs 125±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 1, 'd')
- 613±2μs 125±0.8μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'd')
- 616±0.9μs 125±0.3μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'd')
- 617±0.3μs 125±0.4μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'd')
- 578±0.6μs 116±0.5μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'f')
- 579±1μs 116±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'f')
- 483±0.6μs 96.3±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'f')
- 483±0.5μs 96.2±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'f')
- 497±10μs 98.4±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'f')
- 481±0.4μs 94.9±0.4μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'd')
- 486±0.1μs 95.5±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 2, 'f')
- 497±10μs 97.6±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'f')
- 485±0.8μs 94.5±0.4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'f')
- 486±0.7μs 94.6±0.3μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'f')
- 485±0.8μs 94.1±0.7μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'f')
- 21.1±0.02μs 4.09±0.1μs 0.19 bench_reduce.MinMax.time_min(<class 'numpy.uint16'>)
- 486±0.8μs 94.0±0.7μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'f')
- 490±0.6μs 94.4±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'd')
- 583±1μs 112±0.4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'f')
- 490±1μs 94.0±0.05μs 0.19 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 1, 'f')
- 489±0.8μs 93.8±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 1, 'f')
- 584±1μs 112±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'f')
- 585±0.8μs 110±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'f')
- 585±1μs 109±0.4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'f')
- 21.1±0.02μs 3.77±0.1μs 0.18 bench_reduce.MinMax.time_min(<class 'numpy.uint8'>)
- 483±0.4μs 85.4±0.5μs 0.18 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 2, 'f')
- 598±10μs 106±3μs 0.18 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'f')
- 483±0.4μs 84.3±0.8μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 2, 'f')
- 481±0.4μs 83.0±1μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'f')
- 481±0.4μs 81.7±1μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'f')
- 480±1μs 81.2±0.5μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'f')
- 482±1μs 80.9±1μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'f')
- 484±0.4μs 80.6±0.2μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'f')
- 483±0.4μs 80.1±0.4μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'f')
- 486±0.3μs 80.0±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 1, 'f')
- 580±1μs 95.3±1μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'f')
- 583±2μs 94.6±0.05μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'f')
- 583±0.9μs 94.1±0.1μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 1, 'd')
- 583±0.7μs 93.8±0.2μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'f')
- 481±0.6μs 76.8±1μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 2, 'f')
- 611±0.7μs 94.1±0.1μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'd')
- 513±1μs 78.4±0.1μs 0.15 bench_lib.Nan.time_nanmin(200000, 90.0)
- 509±0.5μs 77.8±0.4μs 0.15 bench_lib.Nan.time_nanmax(200000, 90.0)
- 68.1±0.08μs 10.2±0.5μs 0.15 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'h')
- 577±0.6μs 85.2±0.4μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'f')
- 578±0.6μs 84.6±0.9μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'f')
- 580±0.7μs 79.9±0.4μs 0.14 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'f')
- 75.7±0.2μs 10.2±0.4μs 0.14 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'H')
- 575±0.6μs 76.6±0.9μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'f')
- 478±0.6μs 63.4±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'f')
- 479±0.4μs 63.3±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'f')
- 480±0.9μs 61.6±0.4μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'f')
- 479±2μs 60.9±1μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'f')
- 480±0.5μs 60.9±0.8μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'f')
- 483±0.5μs 61.3±0.3μs 0.13 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 1, 'f')
- 481±1μs 60.4±0.8μs 0.13 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 1, 'f')
- 481±0.7μs 60.3±0.5μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'f')
- 90.2±0.1μs 10.2±0.2μs 0.11 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'H')
- 90.3±0.1μs 10.1±0.1μs 0.11 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'h')
- 577±0.8μs 60.6±1μs 0.11 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'f')
- 577±0.9μs 60.4±0.4μs 0.10 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'f')
- 977±2μs 78.6±0.6μs 0.08 bench_lib.Nan.time_nanmax(200000, 50.0)
- 975±2μs 78.3±0.7μs 0.08 bench_lib.Nan.time_nanmin(200000, 50.0)
- 75.3±0.03μs 5.34±0.02μs 0.07 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'B')
- 75.4±0.04μs 5.32±0.03μs 0.07 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'b')
- 75.4±0.03μs 5.31±0.02μs 0.07 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'b')
- 478±0.8μs 33.6±0.5μs 0.07 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'f')
- 478±0.7μs 33.4±0.5μs 0.07 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'f')
- 480±2μs 33.0±0.5μs 0.07 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 1, 'f')
- 82.7±0.04μs 5.31±0.02μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'B')
- 575±1μs 33.8±0.3μs 0.06 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'f')
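For anyone reading these tables without asv experience: each row lists the mean time before, the mean time after, and `after / before` as the ratio, so a ratio of 0.06 is roughly a 17x speedup. Below is a minimal sketch of that arithmetic for a single row. The helper name `parse_row` and the unit table are illustrative assumptions, not part of asv or the NumPy benchmark suite.

```python
import re

# Unit suffixes seen in the tables, mapped to seconds.
# Both the Greek mu and the micro sign are accepted, since either may appear in pasted output.
UNITS = {"ns": 1e-9, "μs": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0}

# One timing cell looks like "575±1μs": mean, "±", spread, unit.
CELL = re.compile(r"([\d.]+)±([\d.]+)(ns|μs|µs|ms|s)")


def parse_row(row):
    """Return (before_seconds, after_seconds, benchmark_name) for one table row.

    Hypothetical helper for illustration only.
    """
    cells = CELL.findall(row)
    if len(cells) < 2:
        raise ValueError(f"not a benchmark row: {row!r}")
    (b_mean, _, b_unit), (a_mean, _, a_unit) = cells[:2]
    before = float(b_mean) * UNITS[b_unit]
    after = float(a_mean) * UNITS[a_unit]
    name = row[row.index("bench_"):]
    return before, after, name


row = ("- 575±1μs 33.8±0.3μs 0.06 "
       "bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'f')")
before, after, name = parse_row(row)
ratio = after / before      # matches the printed ratio column after rounding
speedup = before / after    # how many times faster the "after" build is
print(f"{name}: ratio={ratio:.2f}, speedup={speedup:.1f}x")
```

Ratios recomputed from the printed cells can differ from the printed ratio in the last digit, because the cells are already rounded. For example, 154/125 gives 1.23 where the table shows 1.24.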
Power little-endian
CPU
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
NUMA node(s): 1
Model: 2.2 (pvr 004e 1202)
Model name: POWER9 (architected), altivec supported
L1d cache: 256 KiB
L1i cache: 256 KiB
NUMA node0 CPU(s): 0-7
Vulnerability L1tf: Not affected
Vulnerability Meltdown: Mitigation; RFI Flush
Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Vulnerable
processor : 7
cpu : POWER9 (architected), altivec supported
clock : 2200.000000MHz
revision : 2.2 (pvr 004e 1202)
timebase : 512000000
platform : pSeries
model : IBM pSeries (emulated by qemu)
machine : CHRP IBM pSeries (emulated by qemu)
MMU : Radix
OS
Linux e517009a912a 4.19.0-2-powerpc64le #1 SMP Debian 4.19.16-1 (2019-01-17) ppc64le ppc64le ppc64le GNU/Linux
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Benchmark
baseline(VSX2)
python runtests.py -n --bench-compare parent/main "max|min" -- --sort ratio
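A note on the row labels in the output that follows: as far as I can tell, the three integers in `time_ufunc('maximum', 1, 2, 4, 'f')` are the element strides of the first input, the second input, and the output, and the last argument is the dtype character. The sketch below shows that kind of call under that assumption. The array size `N` and the helper `strided_operands` are made up for illustration and are not copied from `bench_ufunc_strides`.

```python
import numpy as np

N = 10000  # elements per operand; an assumed size, not taken from the benchmark suite


def strided_operands(stride_in0, stride_in1, stride_out, dtype):
    """Build two inputs and one output as strided views of larger buffers.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(0)
    a = rng.random(stride_in0 * N).astype(dtype)[::stride_in0]
    b = rng.random(stride_in1 * N).astype(dtype)[::stride_in1]
    out = np.empty(stride_out * N, dtype=dtype)[::stride_out]
    return a, b, out


# Corresponds to the label ('maximum', 1, 2, 4, 'f'):
# contiguous first input, every 2nd element of the second, every 4th slot of the output.
a, b, out = strided_operands(1, 2, 4, "f")
np.maximum(a, b, out=out)

# Byte strides for float32 (4 bytes per element): (4,), (8,), (16,)
print(a.strides, b.strides, out.strides)
```

Only the `1, 1, 1` case is fully contiguous, which would explain why those rows tend to show the largest gains from a SIMD loop.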
before after ratio
[1684a933] [fd5a2601]
+ 125±0.3μs 154±0.07μs 1.24 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'q')
+ 698±2μs 816±1μs 1.17 bench_ufunc.UFunc.time_ufunc_types('fmax')
+ 140±2μs 154±3μs 1.10 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'L')
+ 136±0.6μs 144±1μs 1.06 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'l')
- 149±1μs 141±0.6μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'l')
- 695±0.6μs 660±4μs 0.95 bench_ufunc.UFunc.time_ufunc_types('maximum')
- 146±2μs 137±1μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'Q')
- 145±1μs 136±1μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'q')
- 146±1μs 136±1μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'L')
- 147±2μs 136±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'l')
- 142±2μs 129±2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'Q')
- 40.7±0.06μs 36.8±0.2μs 0.90 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
- 142±2μs 127±2μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'L')
- 143±2μs 127±1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'Q')
- 143±2μs 127±2μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'l')
- 144±1μs 127±1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'q')
- 143±1μs 126±1μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'l')
- 143±2μs 126±0.8μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'L')
- 139±0.9μs 121±0.6μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'Q')
- 140±1μs 121±0.7μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'q')
- 141±0.6μs 121±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'Q')
- 140±1μs 120±0.8μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'q')
- 139±0.7μs 120±0.8μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'L')
- 140±2μs 119±0.7μs 0.85 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'l')
- 140±1μs 118±0.6μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'L')
- 141±0.9μs 118±0.8μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'l')
- 128±1μs 104±0.2μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'Q')
- 125±0.8μs 102±0.4μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'Q')
- 125±0.5μs 101±0.9μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'q')
- 131±0.3μs 106±0.1μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'q')
- 128±0.5μs 104±0.3μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'q')
- 131±0.4μs 106±0.1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'Q')
- 129±0.4μs 104±0.4μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'Q')
- 128±0.1μs 103±0.3μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'q')
- 132±0.7μs 106±0.5μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'Q')
- 129±0.6μs 103±0.4μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'q')
- 129±0.4μs 103±1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'q')
- 128±0.3μs 103±0.1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'Q')
- 131±0.6μs 105±0.7μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'q')
- 129±0.5μs 102±0.4μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'Q')
- 141±0.2μs 111±1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'B')
- 142±0.1μs 111±0.1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'B')
- 139±0.03μs 109±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'H')
- 141±0.2μs 110±2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'b')
- 139±0.3μs 109±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'H')
- 139±0.4μs 108±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'H')
- 140±0.5μs 109±0.1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'B')
- 140±0.5μs 109±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'B')
- 141±0.09μs 109±0.5μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'H')
- 140±1μs 109±1μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'H')
- 140±0.1μs 109±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'H')
- 139±0.5μs 108±0.07μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'B')
- 141±0.1μs 109±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'B')
- 140±0.4μs 109±0.09μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'H')
- 141±0.1μs 109±0.07μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'B')
- 140±0.03μs 109±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'H')
- 140±0.1μs 109±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'H')
- 140±0.1μs 109±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'H')
- 140±0.06μs 109±0.04μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'B')
- 140±0.5μs 109±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'B')
- 140±1μs 108±0.6μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'B')
- 140±0.6μs 109±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'H')
- 140±0.2μs 108±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'B')
- 140±0.5μs 108±0.03μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'B')
- 140±0.05μs 109±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'B')
- 141±0.2μs 109±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'H')
- 140±0.1μs 108±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'H')
- 141±0.2μs 109±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'H')
- 141±0.1μs 109±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'B')
- 141±0.2μs 109±0.08μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'B')
- 141±0.05μs 109±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'B')
- 141±0.3μs 109±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'H')
- 140±0.7μs 109±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'B')
- 141±0.1μs 109±0.09μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'B')
- 141±0.05μs 109±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'H')
- 141±0.2μs 109±0.3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'H')
- 140±0.08μs 108±0.07μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'H')
- 141±0.1μs 109±0.5μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'B')
- 141±0.08μs 109±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'H')
- 140±0.2μs 109±0.04μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'H')
- 141±0.7μs 109±0.08μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'H')
- 140±0.2μs 109±0.07μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'B')
- 142±0.2μs 109±0.03μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'b')
- 139±1μs 107±3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'B')
- 140±1μs 108±1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'H')
- 141±0.1μs 109±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'H')
- 139±0.09μs 107±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'B')
- 139±0.3μs 108±2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'B')
- 141±0.08μs 109±0.06μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'H')
- 141±0.05μs 109±0.05μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'B')
- 141±0.1μs 109±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'H')
- 141±0.1μs 109±0.03μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'H')
- 140±1μs 108±3μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'B')
- 141±1μs 108±1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'B')
- 132±1μs 101±0.6μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'i')
- 140±0.4μs 107±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'b')
- 141±0.08μs 108±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'b')
- 141±0.3μs 108±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'b')
- 139±0.7μs 106±0.1μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'b')
- 140±0.06μs 107±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'b')
- 141±0.2μs 108±0.6μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'h')
- 138±1μs 106±3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'b')
- 140±1μs 107±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'h')
- 141±0.5μs 108±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'b')
- 141±0.4μs 108±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'b')
- 139±0.7μs 107±0.8μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'b')
- 131±0.2μs 100±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'l')
- 131±0.2μs 100±0.4μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'L')
- 140±0.5μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'h')
- 140±0.9μs 107±2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'b')
- 140±0.5μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'b')
- 141±0.2μs 107±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'b')
- 141±0.06μs 107±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'b')
- 139±0.09μs 106±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'b')
- 140±0.4μs 106±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'h')
- 141±0.2μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'b')
- 133±1μs 101±0.6μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'I')
- 141±0.6μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'h')
- 139±0.3μs 106±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'h')
- 140±0.3μs 106±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'b')
- 141±0.2μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'h')
- 140±0.06μs 106±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'b')
- 141±0.2μs 107±0.6μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'b')
- 141±0.6μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'b')
- 141±0.2μs 107±0.4μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'h')
- 140±0.6μs 106±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'h')
- 141±0.2μs 107±0.04μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'b')
- 140±0.5μs 107±0.06μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'b')
- 140±0.1μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'h')
- 141±0.1μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'b')
- 141±0.6μs 107±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'h')
- 141±0.7μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'h')
- 141±0.1μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'h')
- 140±0.2μs 106±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'h')
- 141±0.2μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'h')
- 140±1μs 106±3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'b')
- 141±0.3μs 107±0.07μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'h')
- 140±0.3μs 106±2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'b')
- 141±1μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'h')
- 141±0.3μs 107±0.3μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'h')
- 140±0.7μs 106±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'h')
- 141±0.7μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'h')
- 141±0.5μs 107±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'h')
- 141±0.4μs 107±0.03μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'h')
- 141±0.8μs 107±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'h')
- 142±0.9μs 107±0.3μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'h')
- 142±0.1μs 107±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'h')
- 131±0.7μs 98.6±0.6μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'i')
- 141±1μs 106±1μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'h')
- 131±0.7μs 97.5±0.9μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'I')
- 131±0.6μs 97.1±0.7μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'I')
- 130±0.4μs 96.6±0.6μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'i')
- 128±0.3μs 94.7±0.5μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'I')
- 128±0.4μs 94.2±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'i')
- 131±0.3μs 96.3±0.8μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'l')
- 129±0.6μs 94.8±0.7μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'i')
- 130±0.6μs 94.8±0.5μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'I')
- 129±0.1μs 94.1±0.5μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'I')
- 131±0.3μs 95.8±0.8μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'L')
- 125±0.3μs 91.1±0.2μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'I')
- 125±0.2μs 90.8±0.2μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'i')
- 129±0.1μs 93.6±0.5μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'i')
- 128±0.2μs 92.8±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'i')
- 125±0.5μs 90.4±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'i')
- 129±0.5μs 93.6±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'i')
- 129±0.6μs 93.7±0.3μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'I')
- 126±0.5μs 91.2±0.3μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'i')
- 125±0.3μs 90.3±0.6μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'I')
- 128±0.3μs 93.1±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'I')
- 124±0.1μs 90.2±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'I')
- 130±0.4μs 94.1±0.9μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'I')
- 125±0.2μs 90.3±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'i')
- 129±0.4μs 93.5±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'I')
- 130±0.7μs 94.0±0.9μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'i')
- 128±0.3μs 92.5±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'I')
- 125±0.1μs 90.6±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'i')
- 129±0.9μs 93.4±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'i')
- 129±0.6μs 92.8±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'L')
- 126±0.4μs 90.6±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'I')
- 128±0.3μs 92.4±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'I')
- 129±0.7μs 92.8±0.3μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'i')
- 128±0.2μs 92.3±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'i')
- 125±0.06μs 90.4±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'I')
- 129±0.2μs 92.8±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'i')
- 129±0.7μs 93.0±0.6μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'L')
- 129±0.4μs 92.8±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'I')
- 129±0.3μs 92.4±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'I')
- 129±0.5μs 92.7±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'l')
- 128±0.2μs 92.1±0.5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'i')
- 129±1μs 92.5±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'l')
- 125±0.5μs 89.8±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'I')
- 128±0.6μs 92.0±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'i')
- 125±0.6μs 89.8±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'i')
- 127±0.2μs 90.7±0.2μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'I')
- 125±0.7μs 89.2±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'l')
- 128±0.2μs 91.5±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'I')
- 125±0.3μs 89.2±0.5μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'L')
- 126±0.7μs 89.6±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'i')
- 127±0.08μs 90.1±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'I')
- 127±0.2μs 90.1±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'i')
- 129±0.8μs 91.3±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'l')
- 126±0.2μs 89.5±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'i')
- 36.6±0.1μs 25.9±0.08μs 0.71 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
- 36.7±0.06μs 25.9±0.06μs 0.71 bench_reduce.MinMax.time_max(<class 'numpy.uint64'>)
- 127±0.2μs 89.4±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'I')
- 125±0.9μs 88.5±0.6μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'i')
- 125±0.2μs 88.5±0.7μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'I')
- 126±0.5μs 88.6±0.4μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'i')
- 128±0.2μs 90.1±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'L')
- 126±0.4μs 88.6±0.4μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'I')
- 130±0.5μs 90.6±0.3μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'L')
- 129±0.5μs 89.8±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'l')
- 44.9±0.08μs 26.5±0.08μs 0.59 bench_reduce.MinMax.time_max(<class 'numpy.float32'>)
- 37.2±0.1μs 20.8±0.03μs 0.56 bench_reduce.MinMax.time_max(<class 'numpy.int32'>)
- 37.3±0.1μs 20.8±0.08μs 0.56 bench_reduce.MinMax.time_max(<class 'numpy.uint32'>)
- 33.0±0.1μs 18.2±0.05μs 0.55 bench_reduce.FMinMax.time_max(<class 'numpy.float64'>)
- 286±1μs 136±0.2μs 0.48 bench_lib.Nan.time_nanmax(200000, 0)
- 38.3±0.06μs 18.1±0.02μs 0.47 bench_reduce.MinMax.time_max(<class 'numpy.uint16'>)
- 38.3±0.3μs 18.1±0.07μs 0.47 bench_reduce.MinMax.time_max(<class 'numpy.int16'>)
- 294±0.7μs 136±0.2μs 0.46 bench_lib.Nan.time_nanmax(200000, 0.1)
- 38.1±0.2μs 16.8±0.06μs 0.44 bench_reduce.MinMax.time_max(<class 'numpy.uint8'>)
- 38.2±0.07μs 16.8±0.01μs 0.44 bench_reduce.MinMax.time_max(<class 'numpy.int8'>)
- 37.6±0.08μs 13.4±0.08μs 0.36 bench_reduce.FMinMax.time_max(<class 'numpy.float32'>)
- 408±3μs 136±0.2μs 0.33 bench_lib.Nan.time_nanmax(200000, 2.0)
- 125±0.2μs 41.7±0.07μs 0.33 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'i')
- 125±0.1μs 41.7±0.04μs 0.33 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'I')
- 890±2μs 276±10μs 0.31 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'd')
- 888±3μs 249±4μs 0.28 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'd')
- 895±5μs 242±10μs 0.27 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'd')
- 892±3μs 240±1μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'd')
- 895±0.6μs 240±2μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'd')
- 892±3μs 239±3μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'd')
- 892±2μs 237±2μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'd')
- 881±10μs 229±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'd')
- 885±3μs 225±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'd')
- 877±2μs 223±0.8μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'd')
- 957±1μs 243±2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'f')
- 957±3μs 243±0.3μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'f')
- 954±4μs 242±0.8μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'f')
- 957±7μs 243±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'f')
- 956±4μs 243±0.8μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'f')
- 952±3μs 241±0.7μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'f')
- 960±0.6μs 243±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'f')
- 961±2μs 242±2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'f')
- 884±3μs 219±3μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'd')
- 884±3μs 219±3μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'd')
- 896±10μs 220±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'd')
- 888±3μs 218±3μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'd')
- 895±2μs 220±2μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'd')
- 891±2μs 219±1μs 0.25 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'd')
- 886±6μs 210±2μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'd')
- 892±7μs 211±5μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'd')
- 879±4μs 205±1μs 0.23 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'd')
- 880±4μs 202±0.4μs 0.23 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'd')
- 882±3μs 201±2μs 0.23 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'd')
- 883±3μs 194±1μs 0.22 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'd')
- 885±3μs 193±0.8μs 0.22 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'd')
- 880±3μs 186±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'd')
- 882±8μs 184±2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'd')
- 958±2μs 199±3μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'f')
- 958±7μs 199±4μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'f')
- 888±4μs 185±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'd')
- 896±2μs 186±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'd')
- 895±2μs 185±3μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'd')
- 883±3μs 182±2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'd')
- 878±10μs 181±2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'd')
- 960±1μs 196±3μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'f')
- 955±1μs 194±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'f')
- 960±1μs 194±0.2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'f')
- 961±2μs 193±0.7μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'f')
- 894±2μs 180±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'd')
- 894±2μs 180±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'd')
- 955±2μs 191±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'f')
- 961±0.9μs 191±0.5μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'f')
- 882±10μs 173±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'd')
- 886±2μs 173±2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'd')
- 141±0.9μs 27.0±0.2μs 0.19 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'H')
- 141±0.9μs 26.9±0.03μs 0.19 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'h')
- 879±2μs 164±0.4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'd')
- 889±1μs 165±4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'd')
- 895±10μs 166±1μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'd')
- 894±3μs 161±3μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'd')
- 887±1μs 158±2μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'd')
- 888±2μs 159±0.9μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'd')
- 895±2μs 159±1μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'd')
- 898±8μs 160±2μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'd')
- 775±3μs 136±0.2μs 0.18 bench_lib.Nan.time_nanmax(200000, 90.0)
- 894±7μs 157±2μs 0.18 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'd')
- 957±4μs 167±0.8μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'f')
- 960±2μs 168±0.9μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'f')
- 959±1μs 167±0.8μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'f')
- 954±3μs 166±0.3μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'f')
- 961±2μs 167±0.2μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'f')
- 954±3μs 166±0.2μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'f')
- 958±1μs 165±0.6μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'f')
- 957±3μs 164±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'f')
- 881±2μs 148±1μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'd')
- 884±0.4μs 143±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'd')
- 887±8μs 144±3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'd')
- 884±1μs 143±0.6μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'd')
- 957±3μs 149±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'f')
- 973±0.9μs 151±0.6μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'f')
- 958±3μs 148±0.3μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'f')
- 954±7μs 147±0.7μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'f')
- 888±2μs 137±0.9μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'd')
- 887±1μs 136±1μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'd')
- 876±4μs 133±0.05μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'd')
- 884±0.9μs 126±1μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'd')
- 883±0.8μs 125±1μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'd')
- 883±10μs 124±0.5μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'd')
- 961±5μs 128±1μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'f')
- 961±2μs 128±0.5μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'f')
- 959±1μs 127±0.4μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'f')
- 961±1μs 127±0.7μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'f')
- 959±2μs 126±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'f')
- 958±0.6μs 125±0.5μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'f')
- 958±0.9μs 125±0.08μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'f')
- 959±0.7μs 125±0.4μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'f')
- 957±4μs 121±0.4μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'f')
- 959±3μs 122±1μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'f')
- 955±3μs 119±0.5μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'f')
- 954±6μs 119±1μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'f')
- 955±2μs 119±0.07μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'f')
- 956±4μs 117±0.03μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'f')
- 973±0.6μs 113±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'f')
- 961±0.7μs 110±0.4μs 0.11 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'f')
- 961±1μs 110±0.3μs 0.11 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'f')
- 955±8μs 109±0.2μs 0.11 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'f')
- 137±1μs 15.5±0.1μs 0.11 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'B')
- 137±1μs 15.4±0.07μs 0.11 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'b')
- 958±3μs 91.5±0.7μs 0.10 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'f')
- 954±1μs 90.1±0.6μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'f')
- 959±2μs 85.5±0.6μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'f')
- 960±1μs 85.2±0.5μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'f')
- 958±2μs 82.5±0.2μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'f')
- 958±0.8μs 82.3±0.2μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'f')
- 876±0.7μs 74.7±0.08μs 0.09 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'd')
- 953±1μs 72.0±0.05μs 0.08 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'f')
- 1.94±0ms 136±0.3μs 0.07 bench_lib.Nan.time_nanmax(200000, 50.0)
- 953±0.7μs 41.9±0.02μs 0.04 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'f')`
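For reading the table above: the `ratio` column is `after / before`, so the last row's 0.04 corresponds to roughly a 23x speedup. A minimal sketch for turning a row into a speedup factor is below; the helper names `parse_row` and `speedup` are hypothetical, and the row format is assumed from the lines pasted in this thread rather than taken from any asv API.

```python
import re

# One timing such as "953±0.7μs" or "1.94±0ms". The format is an assumption
# based on the rows pasted above; both the Greek mu and the micro sign are accepted.
_TIMING = re.compile(r"([\d.]+)±[\d.]+(ns|[\u03bc\u00b5]s|ms|s)")
_TO_MICROSECONDS = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}


def parse_row(line):
    """Return (before_us, after_us, benchmark_name) for one comparison row."""
    timings = _TIMING.findall(line)
    if len(timings) < 2:
        raise ValueError(f"not a benchmark row: {line!r}")
    before, after = (
        float(value)
        * _TO_MICROSECONDS[unit.replace("\u03bc", "u").replace("\u00b5", "u")]
        for value, unit in timings[:2]
    )
    # Benchmark names contain spaces, so take everything from "bench_" onward.
    name = line[line.index("bench_"):].strip()
    return before, after, name


def speedup(line):
    """Speedup factor before / after, i.e. the inverse of the ratio column."""
    before, after, _ = parse_row(line)
    return before / after


row = "- 953±0.7μs 41.9±0.02μs 0.04 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'f')"
print(f"{speedup(row):.1f}x")  # 22.7x
```

Rows that mix units (for example `1.94±0ms` against `136±0.3μs`) are normalized to microseconds before dividing.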
AArch64
CPU
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: ARM
Model: 1
Model name: Neoverse-N1
Stepping: r3p1
BogoMIPS: 243.75
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 2 MiB
L3 cache: 32 MiB
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
OS
Linux ip-172-31-44-172 5.11.0-1020-aws #21~20.04.2-Ubuntu SMP Fri Oct 1 13:01:34 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Benchmark
baseline(ASIMD)
python runtests.py --bench-compare parent/main "max|min" -- --sort ratio
before after ratio
[f224ca3c] [34d15c3d]
<as_min_max^2> <as_min_max>
+ 166±1μs 246±1μs 1.48 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
+ 735±2μs 904±5μs 1.23 bench_ufunc.UFunc.time_ufunc_types('maximum')
+ 727±5μs 892±3μs 1.23 bench_ufunc.UFunc.time_ufunc_types('minimum')
- 1.76±0ms 1.67±0ms 0.95 bench_lib.Nan.time_nanargmax(200000, 50.0)
- 1.75±0ms 1.67±0ms 0.95 bench_lib.Nan.time_nanargmin(200000, 50.0)
- 143±2μs 136±0.7μs 0.95 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'i')
- 1.12±0ms 1.05±0ms 0.94 bench_lib.Nan.time_nanargmin(200000, 90.0)
- 201±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'L')
- 200±2μs 188±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'L')
- 1.12±0ms 1.05±0ms 0.94 bench_lib.Nan.time_nanargmax(200000, 90.0)
- 200±1μs 188±3μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'q')
- 201±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'l')
- 201±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'Q')
- 201±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'Q')
- 202±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'q')
- 201±2μs 189±2μs 0.94 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'l')
- 203±0.9μs 190±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'q')
- 162±3μs 151±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'q')
- 162±2μs 151±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'l')
- 163±1μs 151±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'L')
- 162±1μs 150±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'Q')
- 203±1μs 189±3μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'l')
- 162±2μs 151±1μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'l')
- 22.0±0.1μs 20.5±0.03μs 0.93 bench_reduce.ArgMax.time_argmax(<class 'bool'>)
- 234±2μs 217±3μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'l')
- 203±1μs 189±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'L')
- 202±2μs 187±1μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'Q')
- 163±2μs 151±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'q')
- 203±1μs 188±0.7μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'l')
- 162±2μs 150±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'Q')
- 162±3μs 150±2μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'L')
- 234±2μs 217±3μs 0.93 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'L')
- 202±2μs 187±0.8μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'q')
- 202±1μs 187±0.4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'L')
- 235±2μs 217±3μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'Q')
- 204±1μs 189±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'Q')
- 234±2μs 216±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'Q')
- 235±0.7μs 217±3μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'q')
- 301±2μs 277±4μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'Q')
- 238±0.8μs 219±5μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'L')
- 235±2μs 216±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'L')
- 236±1μs 217±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'q')
- 238±0.7μs 219±3μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'l')
- 298±2μs 273±3μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'L')
- 238±0.5μs 218±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'Q')
- 295±4μs 271±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'q')
- 296±3μs 271±2μs 0.92 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'l')
- 159±2μs 145±0.8μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'i')
- 238±0.4μs 217±4μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'q')
- 299±3μs 273±3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'l')
- 237±2μs 216±3μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'l')
- 236±1μs 215±1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'L')
- 236±2μs 215±0.8μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'l')
- 236±2μs 215±1μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'Q')
- 296±3μs 269±2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'L')
- 236±2μs 214±0.8μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'q')
- 160±0.4μs 145±2μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'i')
- 301±2μs 273±4μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'q')
- 159±1μs 144±0.7μs 0.91 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'I')
- 299±3μs 268±2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'Q')
- 161±1μs 144±2μs 0.90 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'I')
- 131±0.4μs 116±0.3μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'i')
- 131±0.6μs 116±0.5μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'I')
- 131±0.5μs 116±0.8μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'i')
- 131±0.6μs 116±0.4μs 0.89 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'I')
- 132±0.5μs 116±2μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'i')
- 131±0.3μs 116±0.6μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'L')
- 132±0.5μs 116±1μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'I')
- 132±0.4μs 116±1μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'I')
- 132±1μs 116±0.5μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'l')
- 132±0.7μs 116±0.3μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'l')
- 133±0.8μs 116±0.5μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'q')
- 132±0.5μs 116±0.7μs 0.88 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'Q')
- 133±0.8μs 116±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'Q')
- 132±0.6μs 116±0.7μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'q')
- 132±0.4μs 116±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'i')
- 133±0.2μs 116±0.8μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'L')
- 133±0.6μs 116±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'L')
- 132±0.7μs 115±0.9μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'Q')
- 133±0.6μs 116±0.9μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'l')
- 133±0.7μs 116±0.5μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'q')
- 133±0.6μs 115±0.8μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'q')
- 136±0.3μs 118±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'Q')
- 136±0.5μs 118±0.8μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'L')
- 133±0.3μs 115±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'L')
- 133±0.6μs 115±2μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'Q')
- 133±0.7μs 116±0.8μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'l')
- 136±0.1μs 117±1μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'q')
- 136±0.2μs 118±0.9μs 0.87 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'Q')
- 136±0.5μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'Q')
- 137±0.4μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'L')
- 734±3μs 635±1μs 0.86 bench_lib.Nan.time_nanargmin(200000, 2.0)
- 735±1μs 635±1μs 0.86 bench_lib.Nan.time_nanargmax(200000, 2.0)
- 136±0.9μs 118±0.5μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'l')
- 136±0.2μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'Q')
- 136±0.5μs 117±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'L')
- 137±0.7μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'l')
- 136±0.4μs 118±0.8μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'q')
- 136±0.2μs 117±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'q')
- 136±0.5μs 117±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'l')
- 137±0.3μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'L')
- 137±0.4μs 118±1μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'q')
- 137±0.6μs 117±2μs 0.86 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'l')
- 132±0.6μs 112±0.8μs 0.85 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'i')
- 644±2μs 546±2μs 0.85 bench_lib.Nan.time_nanargmax(200000, 0.1)
- 132±1μs 112±0.7μs 0.85 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'I')
- 641±0.8μs 543±0.9μs 0.85 bench_lib.Nan.time_nanargmax(200000, 0)
- 644±1μs 546±1μs 0.85 bench_lib.Nan.time_nanargmin(200000, 0.1)
- 640±1μs 542±2μs 0.85 bench_lib.Nan.time_nanargmin(200000, 0)
- 136±0.6μs 115±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'i')
- 135±0.4μs 114±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'I')
- 135±0.6μs 114±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'i')
- 132±0.4μs 111±0.4μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'i')
- 132±0.4μs 111±0.3μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'I')
- 136±0.2μs 114±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'I')
- 136±0.5μs 114±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'i')
- 136±0.6μs 114±1μs 0.84 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'I')
- 130±0.9μs 107±0.9μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'i')
- 137±0.3μs 112±0.6μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'I')
- 130±0.5μs 107±0.6μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'I')
- 137±0.6μs 112±0.4μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'i')
- 131±1μs 107±1μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'i')
- 131±0.5μs 107±0.9μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'i')
- 131±2μs 107±1μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'i')
- 131±0.4μs 107±0.4μs 0.82 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'I')
- 132±2μs 107±0.5μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'I')
- 132±0.7μs 106±0.7μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'I')
- 125±0.2μs 101±0.7μs 0.81 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'h')
- 124±0.2μs 99.7±0.3μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'h')
- 126±0.7μs 101±0.6μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'h')
- 124±0.3μs 99.2±0.1μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'h')
- 125±0.4μs 99.9±0.4μs 0.80 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'h')
- 125±0.7μs 99.4±0.2μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'h')
- 124±0.08μs 97.7±0.1μs 0.79 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'h')
- 124±0.6μs 97.3±0.2μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'h')
- 124±0.2μs 96.7±0.3μs 0.78 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'h')
- 125±0.3μs 96.7±0.5μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'h')
- 124±0.2μs 94.8±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'h')
- 124±0.4μs 94.9±0.4μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'h')
- 123±0.09μs 94.5±0.2μs 0.77 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'h')
- 124±0.1μs 94.7±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'h')
- 124±0.2μs 94.6±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'h')
- 127±0.3μs 96.6±0.09μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'h')
- 124±0.2μs 94.8±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'h')
- 124±0.4μs 94.7±0.1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'h')
- 127±0.1μs 96.7±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'h')
- 128±0.5μs 97.5±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'Q')
- 129±0.5μs 97.6±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'Q')
- 129±0.2μs 97.6±0.9μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'L')
- 128±0.3μs 97.3±1μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'q')
- 125±0.4μs 94.4±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'h')
- 128±0.2μs 97.3±0.5μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'i')
- 129±0.1μs 97.3±0.5μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'I')
- 124±0.2μs 93.5±0.2μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'h')
- 129±0.3μs 97.2±0.4μs 0.76 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'I')
- 124±0.2μs 93.5±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'h')
- 125±0.1μs 94.6±0.3μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'h')
- 129±0.6μs 97.2±1μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'l')
- 123±0.1μs 93.0±0.5μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'h')
- 125±0.2μs 94.0±0.02μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'h')
- 129±0.3μs 97.0±0.9μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'l')
- 125±0.1μs 94.4±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'h')
- 123±0.2μs 92.9±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'h')
- 129±0.6μs 96.9±0.6μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'q')
- 125±0.04μs 94.3±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'h')
- 125±0.1μs 94.1±0.1μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'h')
- 124±0.2μs 92.9±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'h')
- 125±0.2μs 94.1±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'h')
- 129±0.4μs 96.9±0.6μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'i')
- 124±0.5μs 93.5±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'h')
- 125±0.2μs 94.0±0.05μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'h')
- 124±0.3μs 93.2±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'h')
- 124±0.2μs 92.7±0.4μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'h')
- 129±0.7μs 97.0±2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'L')
- 124±0.5μs 93.3±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'h')
- 126±0.4μs 94.2±0.08μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'h')
- 123±0.1μs 92.1±0.4μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'h')
- 124±0.3μs 92.9±0.09μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'h')
- 123±0.1μs 91.9±0.06μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'h')
- 123±0.1μs 91.9±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'h')
- 123±0.3μs 91.9±0.2μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'h')
- 124±0.2μs 92.2±0.1μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'h')
- 123±0.4μs 91.9±0.1μs 0.75 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'h')
- 123±0.2μs 91.4±0.3μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'h')
- 123±0.2μs 91.2±0.04μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'h')
- 123±0.1μs 91.0±0.2μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'h')
- 133±0.1μs 98.8±0.9μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'I')
- 123±0.1μs 90.9±0.07μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'h')
- 123±0.2μs 91.0±0.3μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'h')
- 123±0.3μs 91.2±0.3μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'h')
- 123±0.1μs 91.0±0.1μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'h')
- 123±0.1μs 91.0±0.1μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'h')
- 133±0.1μs 98.5±1μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'i')
- 133±0.04μs 98.4±0.4μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'I')
- 133±0.1μs 98.3±0.9μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'i')
- 133±0.3μs 98.5±1μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'i')
- 133±0.1μs 98.2±0.8μs 0.74 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'I')
- 134±0.2μs 98.0±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'I')
- 128±0.8μs 93.4±1μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'i')
- 128±0.5μs 93.6±0.4μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'I')
- 129±0.4μs 93.9±0.8μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'i')
- 128±0.2μs 93.2±0.5μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'I')
- 128±0.1μs 93.2±0.3μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'I')
- 134±1μs 97.6±0.7μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'i')
- 128±0.2μs 92.7±0.5μs 0.73 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'i')
- 128±0.3μs 92.9±0.4μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'i')
- 125±0.4μs 90.2±5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'b')
- 125±0.3μs 90.2±5μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'b')
- 128±0.2μs 92.1±0.7μs 0.72 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'I')
- 123±0.2μs 87.9±0.06μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'b')
- 123±0.2μs 87.9±0.5μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'b')
- 123±0.2μs 87.8±0.08μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'b')
- 123±0.2μs 87.7±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'b')
- 123±0.1μs 87.7±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'b')
- 123±0.1μs 87.7±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'b')
- 124±0.3μs 88.4±3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'b')
- 124±0.4μs 88.1±3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'b')
- 123±0.1μs 87.3±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'b')
- 123±0.1μs 87.1±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'b')
- 123±0.1μs 87.1±0.4μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'b')
- 123±0.08μs 87.2±0.3μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'b')
- 123±0.07μs 87.2±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'b')
- 123±0.1μs 87.2±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'b')
- 123±0.1μs 86.7±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'b')
- 123±0.06μs 86.5±0.1μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'b')
- 123±0.07μs 86.7±0.2μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'b')
- 123±0.06μs 86.6±0.09μs 0.71 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'b')
- 122±0.2μs 86.0±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'H')
- 123±0.1μs 86.5±0.06μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'b')
- 123±0.07μs 86.4±0.08μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'b')
- 123±0.2μs 86.6±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'b')
- 123±0.04μs 86.3±0.3μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'b')
- 123±0.2μs 86.4±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'b')
- 123±0.09μs 86.4±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'b')
- 123±0.1μs 86.4±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'b')
- 123±0.1μs 86.4±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'b')
- 123±0.1μs 86.4±0.09μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'b')
- 123±0.09μs 86.3±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'b')
- 123±0.1μs 86.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'b')
- 123±0.05μs 86.2±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'b')
- 123±0.2μs 86.3±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'b')
- 123±0.06μs 86.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'b')
- 123±0.06μs 86.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'b')
- 123±0.1μs 86.2±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'b')
- 127±1μs 89.4±0.9μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'I')
- 123±0.2μs 86.1±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'b')
- 123±0.1μs 86.3±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'b')
- 123±0.1μs 86.3±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'b')
- 122±0.07μs 85.9±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'b')
- 123±0.1μs 86.1±0.07μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'b')
- 129±0.5μs 90.5±0.6μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'i')
- 122±0.05μs 85.8±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'b')
- 123±0.03μs 85.9±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'b')
- 123±0.3μs 86.1±0.09μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'b')
- 123±0.2μs 85.8±0.8μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'b')
- 123±0.1μs 86.1±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'b')
- 123±0.2μs 86.0±0.4μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'b')
- 129±0.5μs 90.3±0.2μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'I')
- 129±0.4μs 90.2±0.5μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'i')
- 122±0.2μs 85.6±0.7μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'b')
- 130±0.1μs 91.1±0.5μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'i')
- 122±0.06μs 85.4±0.05μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'b')
- 122±0.03μs 85.2±0.1μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'b')
- 124±0.3μs 86.5±0.5μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'H')
- 123±0.1μs 85.3±0.06μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'b')
- 129±0.3μs 90.1±0.3μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'I')
- 122±0.08μs 85.1±0.08μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'b')
- 128±2μs 88.8±0.5μs 0.70 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'i')
- 122±0.2μs 84.5±0.1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'H')
- 122±0.3μs 84.6±0.3μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'H')
- 130±0.6μs 90.5±0.6μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'I')
- 129±2μs 89.3±1μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'i')
- 129±2μs 89.3±0.9μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'I')
- 131±0.4μs 90.2±0.4μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'I')
- 121±0.1μs 83.4±0.4μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'H')
- 121±0.2μs 83.4±0.3μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'H')
- 124±0.2μs 85.0±0.3μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'H')
- 131±0.5μs 89.9±0.5μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'i')
- 121±0.2μs 82.6±0.3μs 0.69 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'H')
- 124±0.3μs 84.8±0.5μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'H')
- 124±0.2μs 84.5±0.3μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'H')
- 248±2μs 169±2μs 0.68 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
- 120±0.07μs 81.9±0.3μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'H')
- 121±0.2μs 81.9±0.4μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'H')
- 126±0.2μs 85.6±0.5μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'H')
- 121±0.1μs 82.4±0.2μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'H')
- 124±0.2μs 83.6±0.3μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'H')
- 122±0.1μs 82.7±0.2μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'H')
- 120±0.2μs 80.9±0.4μs 0.68 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'H')
- 122±0.2μs 82.0±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'H')
- 122±0.2μs 82.0±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'H')
- 123±0.3μs 82.6±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'H')
- 120±0.3μs 80.5±0.4μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'H')
- 124±0.3μs 83.3±0.4μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'H')
- 121±0.3μs 81.3±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'H')
- 123±0.5μs 82.7±0.6μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'H')
- 121±0.3μs 81.3±0.1μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'H')
- 121±0.2μs 80.7±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'H')
- 119±0.3μs 79.6±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'H')
- 120±0.1μs 79.9±0.1μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'H')
- 124±0.2μs 82.5±0.3μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'H')
- 121±0.1μs 80.6±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'H')
- 120±0.1μs 79.8±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'H')
- 123±0.3μs 82.2±0.2μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'H')
- 119±0.2μs 79.2±0.09μs 0.67 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'H')
- 119±0.1μs 78.9±0.3μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'H')
- 123±0.1μs 81.8±0.3μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'H')
- 119±0.2μs 79.0±0.3μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'H')
- 123±0.2μs 81.6±0.1μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'H')
- 125±0.2μs 82.9±0.2μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'H')
- 125±0.1μs 82.6±0.2μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'H')
- 123±0.3μs 81.4±0.2μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'H')
- 125±0.2μs 82.1±0.1μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'H')
- 123±0.04μs 80.6±0.2μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'H')
- 125±0.3μs 82.1±0.3μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'H')
- 123±0.1μs 80.7±0.4μs 0.66 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'H')
- 123±0.1μs 80.7±0.1μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'H')
- 128±0.2μs 83.7±0.3μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'i')
- 123±0.2μs 80.6±0.2μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'H')
- 128±0.1μs 83.5±0.3μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'i')
- 128±0.2μs 83.5±0.1μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'I')
- 123±0.2μs 79.9±0.2μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'H')
- 123±0.2μs 79.6±0.1μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'H')
- 128±0.08μs 83.3±0.4μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'I')
- 123±0.07μs 79.6±0.3μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'H')
- 129±0.2μs 83.0±0.4μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'i')
- 129±0.2μs 83.0±0.4μs 0.65 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'I')
- 123±0.09μs 79.0±0.1μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'H')
- 129±0.3μs 82.8±0.2μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'I')
- 123±0.2μs 79.0±0.2μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'H')
- 128±0.4μs 82.5±0.3μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'i')
- 127±0.2μs 81.6±0.4μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'I')
- 123±0.2μs 78.9±0.3μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'H')
- 128±0.5μs 81.3±0.7μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'i')
- 128±0.1μs 81.4±0.4μs 0.64 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'i')
- 128±0.2μs 81.1±0.7μs 0.63 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'I')
- 125±0.7μs 79.1±0.3μs 0.63 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'i')
- 125±0.5μs 78.7±0.5μs 0.63 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'I')
- 126±0.6μs 78.8±0.4μs 0.63 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'I')
- 126±0.5μs 78.8±0.3μs 0.63 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'i')
- 119±0.3μs 73.9±0.1μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'B')
- 120±0.6μs 74.4±0.3μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 4, 'B')
- 124±0.2μs 77.0±0.3μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'i')
- 124±0.2μs 76.9±0.2μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'I')
- 125±0.1μs 77.3±0.4μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'i')
- 119±2μs 73.4±1μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 4, 'B')
- 125±0.2μs 77.1±0.4μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'i')
- 120±0.08μs 73.9±0.1μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'B')
- 125±0.1μs 77.1±0.3μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'I')
- 125±0.4μs 77.3±0.4μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'i')
- 119±0.1μs 73.4±0.08μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 1, 'B')
- 119±0.2μs 73.3±0.2μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'B')
- 119±0.2μs 73.5±0.2μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'B')
- 125±0.4μs 77.0±0.3μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'I')
- 118±1μs 72.7±0.5μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'B')
- 118±0.3μs 72.9±0.05μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 2, 'B')
- 120±0.1μs 73.6±0.2μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 2, 'B')
- 119±0.2μs 73.2±0.1μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 4, 'B')
- 125±0.3μs 77.1±0.2μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'I')
- 119±0.2μs 73.3±0.08μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 1, 'B')
- 119±0.1μs 73.2±0.1μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 4, 'B')
- 119±0.2μs 73.1±0.09μs 0.62 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 4, 'B')
- 118±0.2μs 72.8±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'B')
- 119±0.1μs 73.5±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 4, 'B')
- 125±0.4μs 77.0±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'I')
- 119±0.05μs 73.2±0.09μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 4, 2, 'B')
- 120±0.2μs 73.6±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'B')
- 120±0.2μs 73.4±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 4, 'B')
- 125±0.1μs 77.0±0.3μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'i')
- 119±0.08μs 73.3±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'B')
- 119±0.1μs 73.1±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 2, 'B')
- 118±0.06μs 72.5±0.05μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'B')
- 118±0.04μs 72.5±0.04μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'B')
- 118±0.06μs 72.6±0.04μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'B')
- 120±0.1μs 73.3±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'B')
- 125±0.2μs 76.7±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'i')
- 119±1μs 72.6±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 2, 1, 'B')
- 122±0.3μs 74.9±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'i')
- 122±0.2μs 74.9±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 2, 'I')
- 125±0.1μs 76.6±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 1, 'I')
- 126±0.6μs 76.8±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'i')
- 126±0.3μs 76.7±0.2μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'i')
- 126±0.3μs 76.8±0.4μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'I')
- 126±0.3μs 76.5±0.1μs 0.61 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'I')
- 123±0.06μs 74.2±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 1, 'B')
- 124±0.2μs 74.9±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'I')
- 123±0.1μs 74.3±0.3μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 4, 'B')
- 124±0.2μs 74.8±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'i')
- 123±0.2μs 73.8±0.4μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 1, 'B')
- 123±0.2μs 73.8±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 4, 'B')
- 123±0.2μs 73.9±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'B')
- 123±0.2μs 73.8±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 4, 'B')
- 123±0.2μs 73.5±0.04μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 4, 2, 'B')
- 123±0.1μs 73.6±0.07μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 4, 'B')
- 123±0.1μs 73.5±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'B')
- 123±0.1μs 73.5±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 4, 'B')
- 123±0.1μs 73.3±0.08μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'B')
- 123±0.2μs 73.3±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'B')
- 123±0.2μs 73.3±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 1, 'B')
- 123±0.09μs 73.2±0.1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 2, 'B')
- 123±0.1μs 73.2±0.2μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'B')
- 123±0.08μs 73.1±0.03μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 4, 'B')
- 123±0.3μs 73.2±1μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 4, 'B')
- 123±0.07μs 73.1±0.09μs 0.60 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'B')
- 122±0.06μs 72.9±0.1μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'B')
- 123±0.04μs 72.9±0.06μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 1, 2, 'B')
- 123±0.1μs 72.8±0.1μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 2, 'B')
- 122±0.02μs 72.7±0.6μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 2, 'B')
- 123±0.08μs 72.7±0.1μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'B')
- 122±0.06μs 72.6±0.1μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'B')
- 122±0.05μs 72.5±0.07μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 1, 'B')
- 123±0.1μs 72.7±0.04μs 0.59 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 2, 'B')
- 127±0.3μs 72.7±0.6μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'q')
- 127±0.2μs 72.6±0.4μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'l')
- 128±0.2μs 72.8±0.5μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'Q')
- 128±0.6μs 72.6±0.6μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'L')
- 128±0.2μs 72.4±0.7μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'L')
- 128±0.05μs 72.4±0.5μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'q')
- 128±0.2μs 72.1±0.6μs 0.57 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'Q')
- 128±0.1μs 72.2±0.4μs 0.56 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'l')
- 579±0.9μs 292±3μs 0.50 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'd')
- 579±1μs 290±5μs 0.50 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'd')
- 578±2μs 289±3μs 0.50 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 4, 'd')
- 579±1μs 289±3μs 0.50 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'd')
- 22.0±0.05μs 10.6±0.09μs 0.48 bench_reduce.MinMax.time_max(<class 'numpy.uint64'>)
- 22.0±0.04μs 10.5±0.02μs 0.48 bench_reduce.MinMax.time_min(<class 'numpy.uint64'>)
- 22.0±0.1μs 10.6±0.1μs 0.48 bench_reduce.MinMax.time_max(<class 'numpy.int64'>)
- 22.0±0.03μs 10.5±0.03μs 0.48 bench_reduce.MinMax.time_min(<class 'numpy.int64'>)
- 572±3μs 262±5μs 0.46 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'd')
- 572±1μs 261±5μs 0.46 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'd')
- 572±3μs 260±2μs 0.46 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'd')
- 573±3μs 260±1μs 0.45 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 1, 'd')
- 571±0.6μs 249±2μs 0.44 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 4, 'd')
- 572±0.4μs 249±2μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'd')
- 572±0.9μs 249±2μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'd')
- 571±0.9μs 249±2μs 0.44 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'd')
- 573±1μs 249±1μs 0.43 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'd')
- 571±0.9μs 248±1μs 0.43 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'd')
- 573±0.5μs 248±0.5μs 0.43 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'd')
- 574±0.9μs 248±0.9μs 0.43 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 4, 'd')
- 568±1μs 230±2μs 0.40 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 4, 'd')
- 568±1μs 229±2μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'd')
- 568±0.4μs 229±1μs 0.40 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'd')
- 568±0.8μs 228±0.8μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'd')
- 569±2μs 228±1μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'd')
- 601±20μs 240±6μs 0.40 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'd')
- 600±20μs 239±7μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'd')
- 569±2μs 227±0.7μs 0.40 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 2, 'd')
- 569±2μs 227±2μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'd')
- 602±20μs 240±6μs 0.40 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 4, 'd')
- 599±20μs 239±5μs 0.40 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'd')
- 570±1μs 227±2μs 0.40 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'd')
- 564±1μs 213±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'd')
- 563±0.6μs 212±4μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'd')
- 565±0.5μs 213±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'd')
- 563±0.5μs 212±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'd')
- 563±0.9μs 212±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'd')
- 564±1μs 212±2μs 0.38 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'd')
- 564±0.7μs 211±3μs 0.37 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 4, 'd')
- 566±1μs 211±1μs 0.37 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'd')
- 565±2μs 210±0.8μs 0.37 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 1, 'd')
- 564±1μs 209±0.4μs 0.37 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'd')
- 564±0.5μs 210±2μs 0.37 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'd')
- 563±0.5μs 209±0.6μs 0.37 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 1, 'd')
- 594±20μs 199±3μs 0.34 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 4, 'd')
- 21.9±0.06μs 7.36±0.1μs 0.34 bench_reduce.MinMax.time_max(<class 'numpy.int32'>)
- 21.9±0.05μs 7.35±0.1μs 0.34 bench_reduce.MinMax.time_max(<class 'numpy.uint32'>)
- 595±20μs 199±4μs 0.33 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'd')
- 594±20μs 199±4μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'd')
- 597±20μs 199±5μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'd')
- 599±20μs 200±5μs 0.33 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'd')
- 21.8±0.04μs 7.29±0.04μs 0.33 bench_reduce.MinMax.time_min(<class 'numpy.int32'>)
- 598±20μs 199±6μs 0.33 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 4, 'd')
- 598±20μs 199±5μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'd')
- 597±20μs 199±4μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'd')
- 559±1μs 186±2μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'd')
- 559±0.6μs 185±2μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'd')
- 22.1±0.06μs 7.29±0.03μs 0.33 bench_reduce.MinMax.time_min(<class 'numpy.uint32'>)
- 562±1μs 185±2μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'd')
- 561±0.9μs 183±1μs 0.33 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 1, 'd')
- 560±1μs 183±0.9μs 0.33 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 1, 'd')
- 561±0.5μs 183±1μs 0.33 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'd')
- 559±0.4μs 182±1μs 0.33 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'd')
- 561±0.6μs 182±1μs 0.33 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'd')
- 564±0.7μs 183±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'd')
- 29.2±0.05μs 9.49±0.1μs 0.32 bench_reduce.MinMax.time_max(<class 'numpy.float64'>)
- 565±1μs 183±2μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'd')
- 29.2±0.05μs 9.46±0.06μs 0.32 bench_reduce.MinMax.time_min(<class 'numpy.float64'>)
- 565±0.5μs 182±0.9μs 0.32 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'd')
- 564±0.9μs 182±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 2, 'd')
- 567±0.7μs 182±1μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'd')
- 567±0.8μs 182±0.9μs 0.32 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'd')
- 567±0.5μs 181±0.8μs 0.32 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'd')
- 568±0.8μs 181±0.7μs 0.32 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 2, 'd')
- 586±20μs 175±4μs 0.30 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'd')
- 587±20μs 175±4μs 0.30 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'd')
- 586±20μs 174±3μs 0.30 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'd')
- 587±20μs 174±4μs 0.30 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 4, 'd')
- 561±0.8μs 160±0.5μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'd')
- 562±0.4μs 160±1μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'd')
- 562±0.9μs 159±0.6μs 0.28 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'd')
- 563±0.8μs 159±0.8μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'd')
- 561±0.8μs 159±0.8μs 0.28 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 2, 'd')
- 563±1μs 159±0.6μs 0.28 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 2, 'd')
- 563±1μs 159±0.2μs 0.28 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'd')
- 563±0.8μs 159±0.3μs 0.28 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'd')
- 555±0.8μs 150±2μs 0.27 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'd')
- 555±0.8μs 150±1μs 0.27 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'd')
- 555±0.5μs 148±1μs 0.27 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'd')
- 554±0.4μs 147±0.4μs 0.27 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 1, 'd')
- 555±0.8μs 145±3μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 1, 'f')
- 555±0.6μs 145±3μs 0.26 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 1, 'f')
- 554±0.8μs 144±1μs 0.26 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 1, 'f')
- 554±0.4μs 144±2μs 0.26 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 1, 'f')
- 26.7±0.06μs 6.88±0.05μs 0.26 bench_reduce.FMinMax.time_min(<class 'numpy.float64'>)
- 26.7±0.05μs 6.86±0.05μs 0.26 bench_reduce.FMinMax.time_max(<class 'numpy.float64'>)
- 29.2±0.05μs 7.36±0.08μs 0.25 bench_reduce.MinMax.time_max(<class 'numpy.float32'>)
- 29.2±0.03μs 7.36±0.01μs 0.25 bench_reduce.MinMax.time_min(<class 'numpy.float32'>)
- 560±0.8μs 137±0.8μs 0.25 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'd')
- 558±2μs 137±0.6μs 0.24 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'd')
- 561±1μs 137±1μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'd')
- 559±1μs 136±0.7μs 0.24 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 2, 'd')
- 564±3μs 137±1μs 0.24 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 4, 'f')
- 563±2μs 136±1μs 0.24 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'f')
- 564±3μs 136±0.7μs 0.24 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 4, 'f')
- 563±3μs 136±0.2μs 0.24 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 4, 'f')
- 120±0.5μs 28.4±0.7μs 0.24 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'I')
- 120±0.3μs 28.2±0.5μs 0.23 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'i')
- 124±0.2μs 28.2±0.7μs 0.23 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'I')
- 123±0.2μs 28.0±0.3μs 0.23 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'i')
- 256±0.4μs 56.3±0.4μs 0.22 bench_lib.Nan.time_nanmax(200000, 0)
- 255±0.3μs 56.1±0.2μs 0.22 bench_lib.Nan.time_nanmin(200000, 0)
- 257±0.2μs 56.2±0.4μs 0.22 bench_lib.Nan.time_nanmax(200000, 0.1)
- 257±0.4μs 56.0±0.5μs 0.22 bench_lib.Nan.time_nanmin(200000, 0.1)
- 29.0±0.02μs 6.19±0.1μs 0.21 bench_reduce.MinMax.time_max(<class 'numpy.uint16'>)
- 29.0±0.03μs 6.18±0.1μs 0.21 bench_reduce.MinMax.time_max(<class 'numpy.int16'>)
- 29.0±0.01μs 6.15±0.08μs 0.21 bench_reduce.MinMax.time_min(<class 'numpy.int16'>)
- 29.0±0.02μs 6.16±0.04μs 0.21 bench_reduce.MinMax.time_min(<class 'numpy.uint16'>)
- 553±0.5μs 116±2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'd')
- 553±0.9μs 115±0.6μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'd')
- 552±0.3μs 115±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'd')
- 558±1μs 116±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 4, 'f')
- 558±1μs 116±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 4, 'f')
- 553±0.5μs 115±0.7μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 1, 'd')
- 553±0.6μs 115±0.5μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'd')
- 553±0.5μs 115±0.6μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'd')
- 553±1μs 115±0.4μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 1, 'd')
- 559±0.9μs 116±2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 4, 'f')
- 559±1μs 116±0.7μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 4, 'f')
- 559±1μs 116±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'd')
- 558±2μs 115±0.4μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 4, 'f')
- 552±0.5μs 114±0.3μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'd')
- 559±2μs 115±1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 4, 'f')
- 558±0.9μs 115±0.1μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'd')
- 558±0.8μs 115±0.9μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'd')
- 559±1μs 115±0.5μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'd')
- 558±1μs 115±0.6μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 4, 'f')
- 559±1μs 115±0.2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 4, 'f')
- 558±0.8μs 115±0.6μs 0.21 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'd')
- 558±1μs 115±0.2μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 2, 'd')
- 559±0.8μs 115±0.4μs 0.21 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 2, 'd')
- 559±1μs 115±0.4μs 0.21 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'd')
- 553±1μs 112±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 1, 'f')
- 553±1μs 112±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 1, 'f')
- 553±1μs 112±1μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 1, 'f')
- 552±0.5μs 112±0.9μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 1, 'f')
- 552±0.6μs 111±0.3μs 0.20 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 1, 'f')
- 552±1μs 111±0.6μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 1, 'f')
- 558±0.7μs 111±0.6μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 2, 'f')
- 554±0.7μs 110±0.6μs 0.20 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 1, 'f')
- 552±0.9μs 110±0.7μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 1, 'f')
- 558±0.6μs 111±0.8μs 0.20 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 4, 2, 'f')
- 558±0.8μs 111±0.5μs 0.20 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 4, 2, 'f')
- 558±1μs 111±0.2μs 0.20 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 4, 2, 'f')
- 285±0.3μs 56.2±0.7μs 0.20 bench_lib.Nan.time_nanmax(200000, 2.0)
- 285±0.3μs 55.7±0.2μs 0.20 bench_lib.Nan.time_nanmin(200000, 2.0)
- 559±0.6μs 108±1μs 0.19 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 4, 'f')
- 559±1μs 107±1μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 4, 'f')
- 557±0.9μs 107±0.7μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 4, 'f')
- 559±1μs 107±1μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 4, 'f')
- 558±0.9μs 106±0.3μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 4, 'f')
- 558±0.9μs 106±0.6μs 0.19 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 4, 'f')
- 558±1μs 106±0.4μs 0.19 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 4, 'f')
- 557±0.8μs 106±0.3μs 0.19 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 4, 'f')
- 28.9±0.04μs 5.44±0.1μs 0.19 bench_reduce.MinMax.time_max(<class 'numpy.uint8'>)
- 28.9±0.03μs 5.43±0.1μs 0.19 bench_reduce.MinMax.time_max(<class 'numpy.int8'>)
- 28.9±0.01μs 5.40±0.04μs 0.19 bench_reduce.MinMax.time_min(<class 'numpy.uint8'>)
- 28.9±0.02μs 5.39±0.05μs 0.19 bench_reduce.MinMax.time_min(<class 'numpy.int8'>)
- 26.6±0.06μs 4.73±0.04μs 0.18 bench_reduce.FMinMax.time_max(<class 'numpy.float32'>)
- 26.6±0.05μs 4.69±0.03μs 0.18 bench_reduce.FMinMax.time_min(<class 'numpy.float32'>)
- 556±1μs 97.1±0.9μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'd')
- 555±0.2μs 96.6±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 4, 'f')
- 551±0.9μs 95.9±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 1, 'f')
- 556±0.6μs 96.6±0.4μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 4, 'f')
- 555±0.6μs 95.9±1μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'd')
- 556±0.8μs 96.0±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 2, 'd')
- 556±0.5μs 95.8±2μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'd')
- 555±0.8μs 95.7±0.3μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 4, 'f')
- 551±0.5μs 94.9±0.8μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 1, 'f')
- 556±0.6μs 95.7±0.2μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 4, 'f')
- 553±0.6μs 94.9±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 1, 'f')
- 551±1μs 94.4±0.5μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 1, 'f')
- 552±0.5μs 94.4±0.4μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 1, 'f')
- 553±0.6μs 94.5±0.9μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 1, 'f')
- 552±0.7μs 94.4±0.3μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 1, 'f')
- 552±0.6μs 94.4±0.6μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 1, 'f')
- 555±0.7μs 92.9±0.7μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 4, 'f')
- 554±1μs 92.4±0.6μs 0.17 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 4, 'f')
- 555±0.8μs 92.0±0.9μs 0.17 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 4, 'f')
- 556±0.7μs 91.8±0.9μs 0.17 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 4, 'f')
- 554±0.4μs 91.1±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 4, 'f')
- 554±0.9μs 90.8±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 4, 'f')
- 555±0.6μs 90.6±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 4, 'f')
- 555±0.7μs 90.6±0.6μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 4, 'f')
- 556±2μs 89.6±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'f')
- 556±0.3μs 89.5±0.8μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 2, 2, 'f')
- 556±1μs 89.5±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 2, 2, 'f')
- 556±1μs 89.3±0.4μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 2, 2, 'f')
- 555±1μs 89.1±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 4, 2, 'f')
- 556±0.9μs 89.0±0.5μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 2, 2, 'f')
- 556±1μs 89.0±0.2μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 4, 2, 'f')
- 556±2μs 88.9±0.3μs 0.16 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 4, 2, 'f')
- 568±8μs 88.0±0.8μs 0.16 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 4, 'f')
- 568±8μs 87.0±0.9μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 4, 'f')
- 568±8μs 86.7±1μs 0.15 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 4, 'f')
- 569±9μs 86.7±1μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 4, 'f')
- 554±0.6μs 81.6±0.4μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 4, 2, 'f')
- 553±0.5μs 81.3±0.4μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmax', 4, 1, 2, 'f')
- 555±0.4μs 81.4±0.5μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 4, 2, 'f')
- 553±0.5μs 81.1±0.1μs 0.15 bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 1, 2, 'f')
- 555±0.4μs 81.3±0.5μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 4, 2, 'f')
- 555±0.9μs 81.2±0.2μs 0.15 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 4, 2, 'f')
- 554±1μs 81.0±0.2μs 0.15 bench_ufunc_strides.Binary.time_ufunc('minimum', 4, 1, 2, 'f')
- 553±0.9μs 80.8±0.3μs 0.15 bench_ufunc_strides.Binary.time_ufunc('maximum', 4, 1, 2, 'f')
- 384±0.3μs 55.9±0.6μs 0.15 bench_lib.Nan.time_nanmax(200000, 90.0)
- 386±0.5μs 56.0±0.3μs 0.15 bench_lib.Nan.time_nanmin(200000, 90.0)
- 552±0.5μs 75.4±0.5μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 1, 'f')
- 551±0.8μs 75.2±0.3μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 1, 'f')
- 554±2μs 75.6±0.1μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 2, 2, 'f')
- 554±0.9μs 75.5±0.4μs 0.14 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 2, 2, 'f')
- 555±1μs 75.3±0.3μs 0.14 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 2, 'f')
- 551±0.9μs 74.8±0.3μs 0.14 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 1, 'f')
- 555±2μs 75.3±0.4μs 0.14 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 2, 2, 'f')
- 551±0.6μs 74.6±0.7μs 0.14 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 2, 1, 'f')
- 552±1μs 72.6±0.3μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 2, 'f')
- 553±0.6μs 72.7±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 2, 'f')
- 552±0.4μs 72.2±0.3μs 0.13 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 2, 'f')
- 552±0.7μs 72.1±0.4μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 2, 'f')
- 553±1μs 72.2±0.3μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 2, 'f')
- 553±0.5μs 72.0±0.3μs 0.13 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 2, 'f')
- 552±0.6μs 71.7±0.2μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 2, 'f')
- 552±0.6μs 71.5±0.1μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 2, 'f')
- 551±0.9μs 69.4±0.6μs 0.13 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'd')
- 551±0.5μs 69.4±0.5μs 0.13 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'd')
- 552±0.6μs 68.9±0.4μs 0.12 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 1, 'd')
- 550±0.4μs 68.5±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'd')
- 550±1μs 68.2±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 2, 1, 1, 'f')
- 549±0.6μs 68.0±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmin', 2, 1, 1, 'f')
- 550±1μs 68.0±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('minimum', 2, 1, 1, 'f')
- 549±0.7μs 67.9±0.07μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 2, 1, 'f')
- 550±2μs 67.9±0.1μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 1, 1, 'f')
- 550±0.5μs 67.8±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 2, 1, 'f')
- 550±0.9μs 67.9±0.1μs 0.12 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 2, 1, 'f')
- 550±0.2μs 67.6±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 2, 1, 'f')
- 551±0.8μs 67.2±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 2, 'f')
- 551±0.6μs 67.2±0.3μs 0.12 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 2, 'f')
- 552±0.9μs 67.0±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 2, 'f')
- 550±0.5μs 66.8±0.2μs 0.12 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 2, 'f')
- 118±0.2μs 12.0±0.7μs 0.10 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'H')
- 123±0.08μs 11.6±0.4μs 0.09 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'h')
- 122±0.2μs 11.4±0.4μs 0.09 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'H')
- 123±0.2μs 11.4±0.3μs 0.09 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'h')
- 993±1μs 55.8±0.3μs 0.06 bench_lib.Nan.time_nanmin(200000, 50.0)
- 994±1μs 55.8±0.3μs 0.06 bench_lib.Nan.time_nanmax(200000, 50.0)
- 118±0.2μs 6.48±0.05μs 0.06 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'B')
- 122±0.03μs 6.50±0.05μs 0.05 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'B')
- 124±0.6μs 6.50±0.04μs 0.05 bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 1, 1, 'b')
- 124±0.5μs 6.46±0.03μs 0.05 bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 1, 1, 'b')
- 549±0.4μs 28.1±0.5μs 0.05 bench_ufunc_strides.Binary.time_ufunc('maximum', 1, 1, 1, 'f')
- 550±0.9μs 28.0±0.2μs 0.05 bench_ufunc_strides.Binary.time_ufunc('minimum', 1, 1, 1, 'f')
- 549±0.4μs 27.7±0.3μs 0.05 bench_ufunc_strides.Binary.time_ufunc('fmax', 1, 1, 1, 'f')
- 549±0.8μs 27.6±0.4μs 0.05 bench_ufunc_strides.Binary.time_ufunc('fmin', 1, 1, 1, 'f')
my latest push only adds
2022-01-06T18:55:39.9852972Z Fatal Python error: Aborted
2022-01-06T18:55:39.9854461Z
2022-01-06T18:55:39.9855828Z Current thread 0x00007fd960bc9740 (most recent call first):
2022-01-06T18:55:39.9858195Z File "/home/runner/work/numpy/numpy/numpy/core/tests/test_ufunc.py", line 2158 in test_reducelike_byteorder_resolution
2022-01-06T18:55:39.9861865Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
2022-01-06T18:55:39.9880785Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
2022-01-06T18:55:39.9882503Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
2022-01-06T18:55:39.9884321Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
2022-01-06T18:55:39.9885893Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/python.py", line 1641 in runtest
2022-01-06T18:55:39.9887521Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
2022-01-06T18:55:39.9889161Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
2022-01-06T18:55:39.9890747Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
2022-01-06T18:55:39.9892266Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
2022-01-06T18:55:39.9893800Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 255 in <lambda>
2022-01-06T18:55:39.9895359Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 311 in from_call
2022-01-06T18:55:39.9896933Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
2022-01-06T18:55:39.9898549Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 215 in call_and_report
2022-01-06T18:55:39.9900197Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 126 in runtestprotocol
2022-01-06T18:55:39.9901914Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
2022-01-06T18:55:39.9903544Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
2022-01-06T18:55:39.9905119Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
2022-01-06T18:55:39.9906660Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
2022-01-06T18:55:39.9908240Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
2022-01-06T18:55:39.9909834Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
2022-01-06T18:55:39.9911675Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
2022-01-06T18:55:39.9913207Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
2022-01-06T18:55:39.9914689Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/main.py", line 323 in _main
2022-01-06T18:55:39.9923391Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/main.py", line 269 in wrap_session
2022-01-06T18:55:39.9925014Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
2022-01-06T18:55:39.9926614Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
2022-01-06T18:55:39.9928195Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
2022-01-06T18:55:39.9929743Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
2022-01-06T18:55:39.9931282Z File "/home/runner/work/numpy/numpy/builds/venv/lib/python3.8/site-packages/_pytest/config/__init__.py", line 162 in main
2022-01-06T18:55:39.9932410Z File "/home/runner/work/numpy/numpy/numpy/_pytesttester.py", line 204 in __call__
2022-01-06T18:55:39.9933201Z File "../runtests.py", line 388 in main
2022-01-06T18:55:39.9933855Z File "../runtests.py", line 701 in <module>
2022-01-06T18:55:55.2013682Z ./tools/travis-test.sh: line 79: 4561 Aborted (core dumped) $PYTHON ../runtests.py -n -v $DURATIONS_FLAG -- -rs
Probably related to this pull request; I'm going to investigate it. |
Please feel free to ignore that; it seems to have slipped in through another fix (not sure how it got past CI). I have not yet spent serious effort tracking it down, though. |
@seiko2plus would you like to get this in as-is and then open an issue for improving argmax + float32 ? @Developer-Ecosystem-Engineering thoughts? |
I'm going to ignore it then, thanks for the clarification.
Yes, I would, but the issue is that I don't clearly understand how the new changes negatively affected argmax/float32 and positively affected argmax/float64:
AVX512
+ 122±0.2μs 218±2μs 1.79 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
- 184±0.9μs 124±0.3μs 0.67 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
AVX2
+ 122±0.3μs 220±2μs 1.81 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
- 184±1μs 124±0.2μs 0.67 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
ASIMD
+ 166±1μs 246±1μs 1.48 bench_reduce.ArgMax.time_argmax(<class 'numpy.float32'>)
- 248±2μs 169±2μs 0.68 bench_reduce.ArgMax.time_argmax(<class 'numpy.float64'>)
Are there any code paths for argmax other than the following? numpy/numpy/core/src/multiarray/arraytypes.c.src Lines 3190 to 3320 in f4a3e07
|
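For anyone who wants to check the float32 vs float64 argmax comparison outside of asv, a rough sketch follows. The helper name `time_argmax`, the array size, and the repeat counts are assumptions chosen for illustration; they are not the settings used by `bench_reduce.ArgMax`.

```python
import timeit

import numpy as np


def time_argmax(dtype, size=200_000, repeat=5, number=20):
    """Return the best per-call time (seconds) of ndarray.argmax for `dtype`.

    `size`, `repeat`, and `number` are illustrative defaults, not the
    values used by the asv benchmark suite.
    """
    rng = np.random.default_rng(0)  # fixed seed so runs are comparable
    arr = rng.random(size).astype(dtype)
    timings = timeit.repeat(arr.argmax, repeat=repeat, number=number)
    # The minimum over repeats is the least noisy estimate.
    return min(timings) / number


for dtype in (np.float32, np.float64):
    print(f"{dtype.__name__}: {time_argmax(dtype) * 1e6:.1f} us per call")
```

Running this on builds before and after the change, with the same compiler flags, gives a quick sanity check on whether the opposite-direction shifts reproduce.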
If it is randomly affecting different dtypes in opposite directions, I am willing to blame the typical (~30%) fluctuations we often see :/. And for those, my best guess is compiler (e.g. optimization/code layout) differences due to unrelated code changes. My best idea for mitigation is to try using PGO (profile-guided optimization) on the benchmarks to stabilize them (Python does this). |
A couple of thoughts:
|
Splitting up the work by functions makes sense |
Putting this in, I opened #20785 to track the argmax regression. |
@seberg, both builds (before & after) were merged against the latest main and built with the same compiler flags, so I don't think special profiling (PGO) is going to change the picture.
yes.
right 100%.
Within my latest commits, I covered tests for integer types, and the performance improvement was pretty good.
Same, even when filtering only argmax. As a precaution, I have improved the performance of argmax/argmin in #20846 until I get some free time to investigate it. |
I am basing that guess solely on this: https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html where during Python optimization dead code would cause differences and only PGO would eliminate it. I agree this feels too big of an effect, so probably there is something more/else... but I have no idea what :) EDIT: Actually, the |
xref gh-20863, I did not look closely at this, but it seems likely that this PR may have been the cause of the regression in Sounds like an incorrectly typed temporary, the example converts 2147483647 to 2147483648 (off by one), which happens for this operation:
(The example also has a change for a float32 example, but it mixes that with int32, so the operation may end up using float64 somewhere.) |
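To illustrate the "incorrectly typed temporary" hypothesis: this is not the reproducer from the linked issue, only a sketch of how that particular off-by-one can arise. The largest int32 value is not representable in float32, so routing it through a float32 temporary rounds it up to 2**31, while float64 keeps it exact.

```python
import numpy as np

i32_max = np.iinfo(np.int32).max  # 2147483647, i.e. 2**31 - 1

# float32 has a 24-bit significand, so 2**31 - 1 rounds to the nearest
# representable value, which is 2**31.
as_f32 = np.float32(i32_max)

# float64 has a 53-bit significand and represents the value exactly.
as_f64 = np.float64(i32_max)

print(int(as_f32))  # 2147483648
print(int(as_f64))  # 2147483647
```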
Taking a look |
This occurs in 1.21 prior to this commit
|
@Developer-Ecosystem-Engineering that was just me making a hypothesis about the reason. The problem is described in the linked issue, sorry. It is more something like |
Looking closer at it, it seems the issue is limited to |
Ah, got it. We are able to reproduce, appears to be related to these changes. Looking further. |
Should be resolved by #20872 |
This fixes #17989 by adding ARM NEON implementations for min/max and fmin/max.
Before: Rosetta faster than native arm64 by `1.2x - 8.6x`.
After: Native arm64 faster than Rosetta by `1.6x - 6.7x`. (2.8x - 15.5x improvement)
Benchmarks
Size change of _multiarray_umath.cpython-39-darwin.so: