
ENH: Improve array function overhead by using vectorcall #23020


Merged: 10 commits merged into numpy:main on Jan 17, 2023

Conversation

seberg
Member

@seberg seberg commented Jan 16, 2023

This moves dispatching for __array_function__ into a C-wrapper. This
helps speed for multiple reasons:

  • Avoids one additional dispatching function call to C
  • Avoids the use of *args, **kwargs which is slower.
  • For simple NumPy calls we can stay in the faster "vectorcall" world

This generally speeds things up a little, but it can speed things up a lot
when keyword arguments are used on lightweight functions, for example:

np.can_cast(arr, dtype, casting="same_kind")

is more than twice as fast with this.
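For background, the dispatch being moved to C serves the `__array_function__` protocol (NEP 18). A minimal sketch of that protocol from the user side, with an illustrative duck-array type (`MyDuck` is not part of NumPy):

```python
import numpy as np

# A duck-array type that intercepts NumPy API calls made on it via the
# __array_function__ protocol (NEP 18).  The C wrapper in this PR is the
# layer that routes calls like np.concatenate to such overrides.
class MyDuck:
    def __array_function__(self, func, types, args, kwargs):
        # `func` is the public NumPy function that was called.
        if func is np.concatenate:
            return "handled by MyDuck"
        return NotImplemented

print(np.concatenate((MyDuck(), MyDuck())))  # -> handled by MyDuck
```

Plain ndarrays still take the ordinary path; the speedup here is about how cheaply that routing decision is made per call.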

In principle there is one alternative to get the best speed: we could inline
the "relevant argument"/dispatcher extraction. That changes behavior in
an acceptable but larger way (it passes default arguments).
Unless the C entry point seems unwanted, though, this should be a decent step
in the right direction even if we eventually want to do that.

Closes #20790
Closes #18547 (although not quite sure why)

@@ -169,7 +169,7 @@ def _remove_nan_1d(arr1d, overwrite_input=False):
     s = np.nonzero(c)[0]
     if s.size == arr1d.size:
         warnings.warn("All-NaN slice encountered", RuntimeWarning,
-                      stacklevel=5)
+                      stacklevel=6)
Member Author

I triple-checked this and it seemed just way too low before. It can even now sometimes point inside (but that is due to use at a different depth). (Same below.)
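The effect of `stacklevel` can be seen with a small standalone example (the wrapper names here are illustrative, not NumPy's internals): each extra call frame between the user and the `warnings.warn()` call requires bumping `stacklevel` by one for the warning to point at the user's code.

```python
import warnings

def implementation():
    # stacklevel=2 attributes the warning to implementation()'s caller;
    # an extra dispatch frame in between would require stacklevel=3.
    warnings.warn("All-NaN slice encountered", RuntimeWarning, stacklevel=2)

def public_api():
    implementation()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    public_api()

# The recorded warning points at the call site inside public_api(),
# not at the warnings.warn() line itself.
print(caught[0].filename, caught[0].lineno)
```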

@seberg
Member Author

seberg commented Jan 16, 2023

Ah, seems we still have a 3.8 job or two and PyPy doesn't have PyObject_VectorCall after all. I could add a (slowish) compatibility shim for it?

@mattip
Member

mattip commented Jan 16, 2023

If you mean the pypy failure in the build_test run, it is on PyPy 3.9 and fails on

malloc_consolidate(): unaligned fastbin chunk detected

@mattip
Member

mattip commented Jan 16, 2023

The armv7_simd_test run is on CPython3.8. I think updating that would require modifying the container build

@mattip
Member

mattip commented Jan 16, 2023

I think the debug run is also using an apt-installed python3.8-dbg

@mattip
Member

mattip commented Jan 16, 2023

Are you happy with moving the like argument to be the first one? It seems to be a big change. I understand it needs to transition from kwarg to positional; can it still remain the last argument?

@seberg
Member Author

seberg commented Jan 16, 2023

Ah, I must have mixed up the pypy and debug run. I guess there is a chance that the PyPy failure is a bug in this code in that case.

@seberg
Member Author

seberg commented Jan 16, 2023

Are you happy with moving the like argument to be the first one? It seems to be a big change. I understand it needs to transition from kwarg to positional; can it still remain the last argument?

Sorry, I guess this was unclear: this is an implementation detail, since we never pass like to __array_function__. Our dispatcher now expects like as the first argument, while previously it expected a kwarg like=like and then would delete it.

@mattip
Member

mattip commented Jan 16, 2023

Ahh, and in order to make it a non-optional positional argument you moved it to be the first one, since there could be a different number of other positional arguments?

@seberg seberg force-pushed the faster-array-function branch from 54d9afd to 7da93fe Compare January 16, 2023 18:01
@seberg
Member Author

seberg commented Jan 16, 2023

Turns out the PyPy problem was a refcounting issue that wasn't bad enough to show up in the non-debug build... and the debug build was failing after all :).

pip3 install cython==0.29.30 setuptools\<49.2.0 hypothesis==6.23.3 pytest==6.2.5 'typing_extensions>=4.2.0' &&
apt install -y git python3.9 python3.9-dev &&
ln -s /usr/bin/python3.9 /usr/bin/pythonx &&
pythonx -m pip install --upgrade pip setuptools wheel &&
Member Author

This pythonx pattern was stolen from the old_gcc job above.

@seberg seberg force-pushed the faster-array-function branch 5 times, most recently from 35c9df3 to c3b96e4 Compare January 17, 2023 09:07
Arbitrary positional arguments originally passed into ``public_api``.
kwargs : dict
Arbitrary keyword arguments originally passed into ``public_api``.
overrides when called like.
Contributor

Suggested change
overrides when called like.
overrides when called like ``implementation(*args, **kwargs)``.

Not sure this is the right way to call, but the sentence seems to stop all of a sudden.

seberg and others added 10 commits January 17, 2023 18:40
The refactor to use vectorcall/fastcall is obviously much better
if we don't have to go back and forth, for concatenate we get:

     arr = np.random.random(20)
     %timeit np.concatenate((arr, arr), axis=0)

Going from ~1.2µs to just below 1µs and all the way down to ~850ns
(fluctuates quite a lot down to 822 even).  ~40% speedup in total
which is not too shabby.
This makes these functions much faster when used with keyword arguments
now that the array-function dispatching does not rely on `*args, **kwargs`
anymore.
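The quoted numbers came from a micro-benchmark along these lines (absolute timings vary a lot by machine and NumPy version, so treat this as a sketch of the methodology rather than a reproducible result):

```python
import timeit
import numpy as np

arr = np.random.random(20)

# Time the dispatch-heavy lightweight call from the commit message.
# For a 20-element array, per-call overhead dominates the actual copy.
n = 10_000
total = timeit.timeit(lambda: np.concatenate((arr, arr), axis=0), number=n)
print(f"~{total / n * 1e9:.0f} ns per np.concatenate call")
```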
@seberg seberg force-pushed the faster-array-function branch from cb91aa7 to 2403dbe Compare January 17, 2023 17:46
@seberg
Member Author

seberg commented Jan 17, 2023

I noticed that this PR breaks the slight enhancement from gh-21731 (the test there is subtly wrong). This is somewhat more tedious to get right in C, so I think I would prefer fixing it as a follow-up.

EDIT: Just to note, in general it is a bit nicer, because moving things to C removes the confusing <array-function-internals> stack from the traceback.

@charris charris merged commit 8535df6 into numpy:main Jan 17, 2023
@charris
Member

charris commented Jan 17, 2023

Thanks Sebastian. I don't know if all the changes will be trouble free, but the CI is green and the best way to find out if there are other problems is to get it tested downstream.

@jakirkham
Contributor

Thanks for working on this Sebastian! 🙏

cc @pentschev @leofang (for awareness)

@seberg seberg deleted the faster-array-function branch January 17, 2023 19:45