Tracking issue for implementation of NEP-18 (__array_function__) #12028
It might be good to merge a preliminary "Decorate all public NumPy functions with @array_function_dispatch" for some high-profile functions and request downstream consumers of the protocol to try it out |
Once we merge #12099 I have another PR ready that will add dispatch decorators for most of |
See https://github.com/shoyer/numpy/tree/array-function-easy-impl for my branch implementing all the "easy" overrides on functions with Python wrappers. The leftover parts are Note that I haven't written tests for overrides on each individual function. I'd like to add a few integration tests when we're done (e.g., a duck array that logs all applied operations), but I don't think it would be productive to write dispatching tests for each individual function. The checks in #12099 should catch the most common errors on dispatchers, and every line of code in dispatcher functions should get executed by existing tests. |
@shoyer - on the tests, I agree that it is not particularly useful to write tests for each one; instead, within numpy, it may make most sense to start using the overrides relatively quickly in |
@mhvk sounds good to me, though I'll let someone else who uses/knows MaskedArray take the lead on that. |
@shoyer - seeing some of the implementations, I have two worries:
I think overall these two things are probably benefits, since we simplify the interface and can make the implementations optimized for pure |
I'm not sure I follow here. We are indeed effectively deprecating the old way of overriding functions like |
I think the issue is that if we implement |
@hameerabbasi - yes, that is the problem. Though we need to be careful here how easy we make it to rely on duct-tape solutions that we would really rather get rid of... (which is why I wrote above that my "problems" may actually be benefits...). Maybe there is a case for trying as is in 1.16 and then deciding on actual experience whether we want to provide fall-back of "ignore my |
Re: dispatcher styling: My preferences on style are based on memory/import time considerations, and verbosity. Quite simply, merge the dispatchers where the signature is likely to remain the same. This way, we create the least amount of objects and the cache hits will be higher too. That said, I'm not too opposed to the lambda style. |
The style for writing dispatcher functions has now come up in a few PRs. It would be good to make a consistent choice across NumPy. We have a few options:

Option 1: Write a separate dispatcher for each function, e.g.,

    def _sin_dispatcher(a):
        return (a,)

    @array_function_dispatch(_sin_dispatcher)
    def sin(a):
        ...

    def _cos_dispatcher(a):
        return (a,)

    @array_function_dispatch(_cos_dispatcher)
    def cos(a):
        ...

Advantages:
Disadvantages:
Option 2: Reuse dispatcher functions within a module, e.g.,

    def _unary_dispatcher(a):
        return (a,)

    @array_function_dispatch(_unary_dispatcher)
    def sin(a):
        ...

    @array_function_dispatch(_unary_dispatcher)
    def cos(a):
        ...

Advantages:
Disadvantages:
Option 3: Use lambda functions:

    # inline style (shorter)
    @array_function_dispatch(lambda a: (a,))
    def sin(a):
        ...

    @array_function_dispatch(lambda a, n=None, axis=None, norm=None: (a,))
    def fft(a, n=None, axis=-1, norm=None):
        ...

    # multiline style (more readable?)
    @array_function_dispatch(
        lambda a: (a,)
    )
    def sin(a):
        ...

    @array_function_dispatch(
        lambda a, n=None, axis=None, norm=None: (a,)
    )
    def fft(a, n=None, axis=-1, norm=None):
        ...

Advantages:
Disadvantages:
|
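To make the comparison concrete, here is a minimal pure-Python sketch of how a dispatcher feeds the protocol. It is not NumPy's implementation: this `array_function_dispatch` is a toy stand-in that simply falls back to the default implementation when no override claims the call, and `LoggingArray` is a hypothetical duck array of the kind suggested earlier in the thread for integration tests. It shows that a shared named dispatcher (Option 2) and a lambda (Option 3) behave identically at call time.

```python
import functools
import math


def array_function_dispatch(dispatcher):
    """Toy stand-in for NumPy's decorator (assumption: simplified semantics)."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # The dispatcher returns the arguments to inspect for overrides.
            relevant_args = dispatcher(*args, **kwargs)
            types = tuple({type(arg) for arg in relevant_args})
            for arg in relevant_args:
                override = getattr(type(arg), "__array_function__", None)
                if override is not None:
                    result = override(arg, public_api, types, args, kwargs)
                    if result is not NotImplemented:
                        return result
            # No override handled the call: use the default implementation.
            return implementation(*args, **kwargs)
        return public_api
    return decorator


# Option 2 style: one named dispatcher shared by functions with equal signatures.
def _unary_dispatcher(a):
    return (a,)


@array_function_dispatch(_unary_dispatcher)
def sin(a):
    return math.sin(a)


# Option 3 style: an inline lambda dispatcher.
@array_function_dispatch(lambda a: (a,))
def cos(a):
    return math.cos(a)


class LoggingArray:
    """Hypothetical duck array that records every function applied to it."""

    def __init__(self):
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        self.calls.append(func.__name__)
        return self


duck = LoggingArray()
sin(duck)
cos(duck)
```

After these calls `duck.calls` holds the names of the two public functions in call order, while plain floats still go through the default implementations.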
Note that the error message issues can be fixed by reconstructing the code object, although that will come with some import-time cost. Perhaps worth investigating, and breaking out @nschloe's tuna to compare some options. |
Yep, the decorator module could also be used for generating function definition (it uses a slightly different approach for code generation, a little more like namedtuple in that it uses |
As long as the error is not solved, I think we need to stick to the options with a dispatcher which has a clear name. I'd slightly prefer to bundle dispatchers together (2) for memory reasons, though would then keep the error message very much in mind, so would suggest calling the dispatcher something like Though if we can change the error, things change. E.g., it might be as simple as catching exceptions, replacing |
I agree that the error message needs to be clear, ideally shouldn't change at all. |
I think in order to keep current code working, the actual function must come after the dispatcher. |
Right, but we can give it the same name as the dispatcher. The dispatcher name will get overwritten. |
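The naming trick can be sketched in plain Python. This is an illustration under stated assumptions, not NumPy code: `array_function_dispatch` below is a simplified stand-in that only calls the dispatcher and then the implementation, and `stack_v1` is a hypothetical name used to keep both variants side by side. The decorator expression is evaluated before the decorated `def` rebinds the name, so the dispatcher object survives even though its name is overwritten.

```python
import functools


def array_function_dispatch(dispatcher):
    """Simplified stand-in for NumPy's decorator (assumption: no override lookup)."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Invalid arguments fail here, inside the dispatcher, so the
            # TypeError message carries the dispatcher's name.
            dispatcher(*args, **kwargs)
            return implementation(*args, **kwargs)
        return public_api
    return decorator


# Variant 1: dispatcher with its own name.
def _stack_dispatcher(arrays, axis=None):
    return arrays


@array_function_dispatch(_stack_dispatcher)
def stack_v1(arrays, axis=0):
    return list(arrays)


# Variant 2: dispatcher defined under the public name, then overwritten.
def stack(arrays, axis=None):
    return arrays


@array_function_dispatch(stack)  # captures the dispatcher before rebinding
def stack(arrays, axis=0):
    return list(arrays)


def error_message(func):
    """Return the TypeError text raised for an unexpected keyword argument."""
    try:
        func([], invalid=True)
    except TypeError as exc:
        return str(exc)
    return None


msg_v1 = error_message(stack_v1)
msg_v2 = error_message(stack)
```

With the first variant the message mentions `_stack_dispatcher`; with the second it mentions `stack`, which is what a user calling the public function would expect to see.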
It would be great to be able to define custom dispatching for functions like np.arange or np.empty. I guess one option would be for NumPy to dispatch on scalars as well as arrays. Is this incompatible with the NEP? Would anything break with this change? |
Any objections to adopting this option? I think it's probably the friendliest choice from a user perspective. |
You might want to dispatch on either of those. For example, here is a custom shape object that we might want to dispatch differently on. This example isn't very useful, but the idea is that I have a lazy object that behaves like a shape but returns expressions instead of integers. For example, it would be nice to be able to do something like this:

    class ExprShape:
        def __getitem__(self, i):
            return ('getitem', self, i)

        def __len__(self):
            return ('len', self)

    numpy.empty(ExprShape())

Which I would like to override to return something like |
Yes, in principle we could dispatch on shape, too. That would add additional complexity/overhead to the protocol. Do you have use-cases where using an array as a template (like The other case I can think of is the |
If we are taking an existing API that depends on NumPy and would like to transparently have it work on a different backend, without changing the existing source code. For example, let's say we were attempting to call You can see here it would be helpful if we could change |
This isn't possible in general. Explicit array construction like In this case, switching |
Is there a proposal around making this sort of change or are you saying that supporting dispatching of
Definitely. But there are many cases where either you don't control the source code, or you want to support users in using the functions they know and love as much as possible. There are other ways to provide users with that experience like:
Both of those options might be required in certain cases (like letting users call I understand there is a performance/complexity tradeoff here and that might be a good reason not to implement these. But it might force users to explore other means to get the flexibility they desire. |
NEP 22 has some discussion of the options here. I don't think we can safely change the semantics of The problem is that There are certainly plenty of use-cases where this isn't the case, but switching this behavior would break lots of downstream code, so it's a non-starter. Downstream projects will need to opt-in to at least this aspect of array duck typing.
Yes. NEP 18 is not intended to be a complete solution for drop-in NumPy alternatives, but it's a step in that direction. |
I've drafted a revision to NEP-18 for adding a |
xref numpyGH-12028

Current behavior:

    >>> np.dot(None, None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/shoyer/dev/numpy/numpy/core/overrides.py", line 175, in public_api
        implementation, public_api, relevant_args, args, kwargs)
    TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'

    >>> np.stack([], invalid=True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/shoyer/dev/numpy/numpy/core/overrides.py", line 148, in public_api
        relevant_args = dispatcher(*args, **kwargs)
    TypeError: _stack_dispatcher() got an unexpected keyword argument 'invalid'

With this change:

    >>> np.dot(None, None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 6, in dot
    TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'

    >>> np.stack([], invalid=True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 4, in stack
    TypeError: _stack_dispatcher() got an unexpected keyword argument 'invalid'
It occurs to me that we forgot to wrap the functions in I'm going to do that shortly... |
xref numpyGH-13624, numpyGH-12028 TODO: update tests/CI for NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0
There's one revision that I'd like to see to the NEP, specifically to clarify what guarantees NEP-18 offers to subclass authors: #13633 |
There's also #13728 - a bug in the dispatcher for |
@shoyer can we close this? |
- (… __array_function__ machinery #12005)
- Disable validation when not testing NumPy (if there is a measurable impact on import times) (unnecessary).
- … __skip_array_function__ function attribute to allow for skipping __array_function__ dispatch. (ENH: implement __skip_array_function__ attribute for NEP-18 #13389)
- … numpy/core/overrides.py in C for speed (Tracking issue for implementation of NEP-18 (__array_function__) #12028):
  - get_overloaded_types_and_args
  - array_function_implementation_or_override
  - ndarray.__array_function__?
  - array_function_dispatch?
- numpy.core
  - (… __array_function__ support for most of numpy.core #12115)
  - np.core.defchararray (ENH: __array_function__ for np.core.defchararray #12154)
  - np.einsum and np.block (ENH: __array_function__ for np.einsum and np.block #12163)
- numpy.lib
  - (… __array_function__ support for np.lib, part 1/2 #12116)
  - (… __array_function__ support for np.lib, part 2/2 #12119)
- numpy.fft/numpy.linalg (ENH: __array_function__ support for np.fft and np.linalg #12117)
- (… __array_function__ for multiarray functions #12175)
- arange?
- (… __array_function__ implementation found #12251)
- ndarray.__repr__ should not rely on __array_function__ (MAINT: ndarray.__repr__ should not rely on __array_function__ #12212)
- stacklevel should be increased by 1 for wrapped functions, so tracebacks point to the right place (Warnings need different stacklevel after NEP 18 overrides #13329)
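The stacklevel item can be illustrated with the standard `warnings` module alone. This is a sketch, not NumPy code: `wrap_like_dispatch` is a hypothetical wrapper standing in for the dispatch machinery, assumed to add exactly one stack frame between the caller and the implementation.

```python
import functools
import warnings


def wrap_like_dispatch(implementation):
    """Hypothetical wrapper that adds one frame, as a dispatch wrapper would."""
    @functools.wraps(implementation)
    def public_api(*args, **kwargs):
        return implementation(*args, **kwargs)
    return public_api


def plain():
    # Unwrapped: stacklevel=2 attributes the warning to the caller.
    warnings.warn("old behaviour", DeprecationWarning, stacklevel=2)


@wrap_like_dispatch
def wrapped_unadjusted():
    # Same stacklevel after wrapping: the warning now blames public_api.
    warnings.warn("old behaviour", DeprecationWarning, stacklevel=2)


@wrap_like_dispatch
def wrapped_adjusted():
    # One extra level skips the wrapper frame and blames the caller again.
    warnings.warn("old behaviour", DeprecationWarning, stacklevel=3)


def reported_line(func):
    """Call func from one fixed line and return the line the warning points to."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return caught[0].lineno


line_plain = reported_line(plain)
line_unadjusted = reported_line(wrapped_unadjusted)
line_adjusted = reported_line(wrapped_adjusted)
```

All three functions are called from the same line inside `reported_line`, so `line_plain` and `line_adjusted` agree, while `line_unadjusted` points into the wrapper instead.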