-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
NEP: Array function protocol #11189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NEP: Array function protocol #11189
Conversation
Written with Stephan Hoyer after discussion with Stefan Van der Walt, Charles Harris, Matti Picus, and Jaime Frio
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we save NEP-16 for Nathaniel's An abstract base class for identifying "duck arrays"
in #10706 ? That would make this NEP-17.
Thanks. Done
…On Tue, May 29, 2018 at 11:50 AM, Stephan Hoyer ***@***.***> wrote:
***@***.**** commented on this pull request.
Can we save NEP-16 for Nathaniel's An abstract base class for identifying
"duck arrays" in #10706 <#10706> ?
------------------------------
In doc/neps/nep-0016-array-function-protocol.rst
<#11189 (comment)>:
> + class MyArray:
+ def __array_function__(self, func, overload_types, *args, **kwargs):
+ if func not in HANDLED_FUNCTIONS:
+ return NotImplemented
+ if not all(issubclass(t, MyArray) for t in overload_types):
+ return NotImplemented
+ return HANDLED_FUNCTIONS[func](*args, **kwargs)
+
+ HANDLED_FUNCTIONS = {
+ np.concatenate: my_concatenate,
+ np.broadcast_to: my_broadcast_to,
+ np.sum: my_sum,
+ ...
+ }
+
+*Note from MR: would it make sense instead for them to return the
delete? :)
------------------------------
In doc/neps/nep-0016-array-function-protocol.rst
<#11189 (comment)>:
> @@ -0,0 +1,517 @@
+==================================================
+NEP: Dispatch Mechanism for NumPy's high level API
+==================================================
+
+:Author: Authors: Stephan Hoyer ***@***.***> and Matthew Rocklin ***@***.***>
I think the standard is one line/:Author: entry per author
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#11189 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AASszD_VkAgClzxAFP3kYRjvyF24dbRfks5t3W5FgaJpZM4URsKT>
.
|
|
||
.. code-block:: python | ||
|
||
def __array_function__(self, func, types, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if {'func', 'types'} & kwargs.keys()
is non-empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I think we need to switch this to not use *
and **` unpacking.
This should be merged as nep-0018 once the initial comments are handled. Deeper discussion, as I understand the process, should take place on the mailing list |
.. code:: python | ||
|
||
def broadcast_to(array, shape, subok=False): | ||
success, value = do_array_function_dance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to handle this as a decorator? Perhaps we could use
ba = inspect.signature(func).bind(*args, **kwargs)
ba.apply_defaults()
array_function = ...
array_function(func, *ba.args, **bar.kwargs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we only supported Python 3. But in Python 2, we would lose introspection of function arguments.
Nonetheless this should indeed be mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example really does not look nice!! If this cannot be done more elegantly in python2, then I think we should aim for python3 only - this process is unlikely to be that fast and even if it were, skipping one release would be well worth having a much simpler interface. In particular, in python3 one could use a decorator that looks at annotations (which we'd like anyway!) - much more elegant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nature of types is really rather unclear. Should it perhaps be a list of types the ndarray
implementation does not know what to do with? (This would include ndarray
subclasses if the current function would do asarray
)? More comments on mailing list.
2. Are all arguments of a type that we know how to handle? | ||
|
||
If these conditions hold, ``__array_function__`` should return | ||
implementation for ``func(*args, **kwargs)``. Otherwise, it should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the example below I think you mean "the result from calling its implementation of ...".
``__array_function__`` attribute on those inputs, and call those | ||
methods appropriately until one succeeds. | ||
|
||
This is one additional function of moderate complexity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe comment here that speed will be of the essence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, the speed for regular arrays should not be impacted much even for small arrays.
.. code:: python | ||
|
||
def broadcast_to(array, shape, subok=False): | ||
success, value = do_array_function_dance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example really does not look nice!! If this cannot be done more elegantly in python2, then I think we should aim for python3 only - this process is unlikely to be that fast and even if it were, skipping one release would be well worth having a much simpler interface. In particular, in python3 one could use a decorator that looks at annotations (which we'd like anyway!) - much more elegant.
(Autograd, Tangent), higher order array factorizations (TensorLy), etc. | ||
that add additional functionality on top of the Numpy API. | ||
|
||
We would like to be able to use these libraries together, for example we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add:
Finally, some ``ndarray`` subclasses add new behaviour (e.g., MaskedArray and astropy's Quantity),
which gives different meaning to functions. Indeed, MaskedArray reimplements some of the core
numpy functions, and Quantity has long-standing issues about compatibility with stacking functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how best to weave this text into the NEP. To me this seems like orthogonal functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My text may have been confusing, but I think __array_function__
is definitely very relevant for subclasses.
For instance, currently ma.extras
is full of re-definitions of functions like stack
- i.e., it follows the make-your-own-namespace "solution" that this NEP would like to address. This could be completely replaced by the __array_function__
proposed here!
Similarly, for Quantity
we are currently relying on overrides or hope that a function uses ufuncs and does not do asarray
- and we have no good solution for functions like concatenate
(not being willing to provide our own namespace...). With __array_function__
, this would be solved. For instance, for concatenate
we can ensure that all inputs are converted to the same unit before doing the actual concatenation.
In other words, I probably should have written what looks much like this NEP for MaskedArray
and Quantity
alone - hence the request to mention them!
Aside: For both the above, I see possible implementations very much in terms of "1. prepare input arrays; 2. call original function on those; 3. set some properties on outputs" - step (2) is most logically implemented with a super()
call, which is why I strongly suggest there be an ndarray.__array_function__
.
I cannot find this mentioned on the mailing list, so then here:
|
We still need to send this out to the mailing this. I'll do this this
afternoon unless Matt beats me to it.
…On Wed, May 30, 2018 at 10:06 AM Marten van Kerkwijk < ***@***.***> wrote:
I cannot find this mentioned on the mailing list, so then here:
1. I'm rather unclear about the use of types. It can help me decide
what to do, but I would still have to find the argument in question. If we
pass anything at all, should it just be the argument itself that does not
get immediately recognized by ndarray? Or, better still, perhaps a
tuple of all arguments that were inspected?
2. For subclasses, it would be very handy to have
ndarray.__array_function__, so one can call super after changing
arguments. (For __array_ufunc__, there was lots of question about
whether this was useful, but it really is!!).
3. This ndarray.__array_function__ might also help solve the problem
of cases where coercion is fine: it could have an extra keyword argument
(say coerce) that would call the function with coercion in place.
4. Indeed, the ndarray.__array_function__ could just be used inside
the "dance" function, and then the actual implementation of a given
function would just be a separate, private one.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#11189 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABKS1vVuNaGfNEdEe0nBiLiXpdFfnLyfks5t3tGsgaJpZM4URsKT>
.
|
The authors should send it out to the mailing list. Still needed for merge (as per Nep Workflow, the merge should be sooner rather than later):
|
Another general comment, since this is still not on the mailing list, yet I'm thinking about it now: Since speed for normal operation should be impacted as minimally as possible, there should be obvious ways to ensure no type checking dance is done. Some possible solutions (which I think should be in the NEP, even if as discounted options):
|
Done! My apologies for the delays on this @mattip . |
Thanks @mrocklin |
I've now posted the NEP to the mailing list for discussion @mhvk with regards to With regards to namespaces, please note the section on "Separate namespace" that is already in the NEP text. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two smaller comments; more general ones on the mailing list.
``__array_function__``. | ||
|
||
This protocol is intended to be a catch-all for NumPy functionality that | ||
is not covered by existing protocols, like reductions (like ``np.sum``) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This particular reduction is an example of a ufunc
- maybe stick to the more complicated example of np.mean
?
would only include these as keyword arguments when they have changed | ||
from default values. This is similar to `what NumPy already has | ||
done <https://github.com/numpy/numpy/blob/v1.14.2/numpy/core/fromnumeric.py#L1865-L1867>`__, | ||
e.g., for the optional ``keepdims`` argument in ``sum``: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth mentioning that the dance decorator could do this part as well!
.. code:: python | ||
|
||
def broadcast_to(array, shape, subok=False): | ||
success, value = do_array_function_dance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a mini-thing, but one might as well have the dance return either a result or NotImplemented
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d be in favor of this. No need to complicate things with multiple out args, remember the order and so on.
Are we ready to start the approval process for NEP 18? |
@mattip almost -- I'd like to add short paragraphs on class methods and dtypes to "Alternatives" to make it clear that we aren't aren't ruling out such options in the future. |
Hello, I am a CuPy developer. |
Written with Stephan Hoyer
after discussion with Stefan Van der Walt, Charles Harris, Matti Picus, Jaime Frio and Nathaniel Smith