NEP: Proposal for array creation dispatching with `__array_function__` #14715

pentschev · 2019-10-15T20:43:47Z

This NEP proposes the introduction of a like= argument. The argument is to be
added to all array creation functions. This allows the user to pass a reference
array that will use __array_function__ protocol to dispatch such functions to
downstream libraries.

This NEP proposes the introduction of a `like=` argument. The argument is to be added to all array creation functions. This allows the user to pass a reference array that will use `__array_function__` protocol to dispatch such functions to downstream libraries.

pentschev · 2019-10-15T20:46:08Z

@shoyer @rgommers @seberg after being sidetracked for some time, I finally got the chance to finish the proposal that we discussed in the context of #14441.

cc @mrocklin @jakirkham

doc/neps/nep-0033-array-creation-dispatch-with-array-function.rst

shoyer · 2019-10-15T21:22:57Z

doc/neps/nep-0033-array-creation-dispatch-with-array-function.rst

+This newly proposed keyword shall be removed by the ``__array_function__``
+mechanism from the keyword dictionary before dispatching. The purpose for this
+is twofold:
+
+1. The object will have no use in the downstream library's implementation; and


This part feels questionable to me. I don't think it's a good idea to add special cases to a protocol, so I think it's a good idea to include like in argument even if it wouldn't be useful.

I can also definitely imagine a library with multiple duck array types (e.g., pydata/sparse) that uses a single implementation that relies upon the like argument.

In that case, we could just have a separate __array_function__ on each of our types. But in any case, I would rather the like= kwarg was removed, it'd drive adaptability up quickly... But only for array creation functions, not others.

I can also definitely imagine a library with multiple duck array types (e.g., pydata/sparse) that uses a single implementation that relies upon the like argument.

But that would still rely on the NumPy dispatching system, no? In that case, I would expect the implementation to find out internally what is the actual type it needs to create and call NumPy's API to create a new object, much like what we do now with _meta in Dask.

In that case, we could just have a separate array_function on each of our types.

I'm not sure I understand your comment, but perhaps this is what I also tried to say in my comment just above?

But in any case, I would rather the like= kwarg was removed, it'd drive adaptability up quickly... But only for array creation functions, not others.

This is indeed my intent, I think the removal would be incredibly helpful for downstream libraries, even though I see why @shoyer is concerned about it. Perhaps would be nice to have more opinions here and see if they eventually converge in some direction.

I'm not sure I understand your comment, but perhaps this is what I also tried to say in my comment just above?

I mean that PyData/Sparse has different types for the different kinds of arrays it uses. Each of those types could have its own __array_function__ if we wanted to specialize array creation functions at all, instead of using one overarching __array_function__.

However I could think of da.Array using such a keyword to copy e.g. chunk sizes and inner array types from the source array.

I mean that PyData/Sparse has different types for the different kinds of arrays it uses. Each of those types could have its own array_function if we wanted to specialize array creation functions at all, instead of using one overarching array_function.

Oh, I see now. Yeah, that would be an alternative, but that could be a big burden too. So maybe in that case just not removing like= kwarg would make more sense?

However I could think of da.Array using such a keyword to copy e.g. chunk sizes and inner array types from the source array.

In the case of Dask, we can now use _meta to identify the inner array and use it as the like object. So there I think this wouldn't be a problem.

True, you would get the like= array in the form of self.

but that could be a big burden too

Not really, I consider it a separation of concerns.

Edit: I should clarify: If I wanted separate implementations, adding the protocol on each type of array isn't a big burden. If I wanted a unified one, I could just have one on the ABC, as I do now.

You're right, this makes sense. I was mostly saying this from the quick/seamless adaptability point-of-view.

hgt312 · 2019-10-18T08:50:26Z

How about let the default value of like to be a global editable value?

pentschev · 2019-10-18T11:48:34Z

How about let the default value of like to be a global editable value?

I don't know if I understand your comment correctly, but I think what you're suggesting is that the global value be a module name, something like numpy or cupy. The __array_function__ protocol dispatches based on the function name and array type, so that's really not how it's expected to work. Maybe what you're looking for is #14389.

hgt312 · 2019-10-18T12:07:55Z

@pentschev
I think that a creation function's implementation may like this:

def asarray(a, dtype=None, order='C', like=np.ndarray/None):
    pass

then have an API to set the default ndarray type, then we can make all the following ndarray be the expected type

import cupy as cp
import numpy as np

np.set_default_ndarray_to(cp.ndarray)  # example method
np.ones((2, 2))  # -> a cupy ndarray

mattip · 2019-10-18T12:14:29Z

Modifying global state sounds like a recipe for disaster. Any random package could call that function and cause the whole application to behave differently.

pentschev · 2019-10-18T12:25:23Z

Besides @mattip's comment (which I agree with, btw), the purpose of this NEP or __array_function__ for that matter isn't this. What you're suggesting is more or less what uarray suggests. The like= parameter here takes a reference array, not a type, this reference array can be used by __array_function__ to dispatch. A library such as Dask could for example get a CuPy array passed by the user and dispatch a creation of a new array internally to CuPy via the protocol, all this with Dask internally never knowing about CuPy's existence directly.

seberg · 2019-10-18T15:30:58Z

doc/neps/nep-0033-array-creation-dispatch-with-array-function.rst

+into the C implementation itself. This is not the primary suggestion here due
+to its inherent complexity which would be difficult too long to describe in its
+entirety here, and too tedious for the reader. However, we leave that as an
+option open for discussion.


I think we may want to go the C route (or get out the cython idea again). (The main issue in python is that there is no simple fast kwarg parsing solution available right now; python has the clinic internally, but that does not really help us).

The thing is that if this is in python, it will add at least one more function overhead call, which is fairly significant for a no-op. For many library functions it will not matter, but for the simple numpy functions it may.

I suspected people would lean this way, that's why I suggested this, from my perspective, I'm fine with either way. Maybe would be good to wait and see what others prefer too?

If we go the C path, IMO it doesn't make sense to extend the NEP quite significantly to point out all the adaptations needed to allow this. Unless people disagree with my point, I would only move the suggestion for Python implementation to a "rejected implementation proposal" section and mention that we are going the C way, but not describe all its details, as those details will likely not aggregate any important information to non-NumPy developers using that as a reference.

The only part that is currently not implemented in C is the public_api helper function from insidenumpy.core.overrides.array_function_dispatch:

def public_api(*args, **kwargs): relevant_args = dispatcher(*args, **kwargs) return implement_array_function( implementation, public_api, relevant_args, args, kwargs)

Maybe we could save the overhead of parsing args and kwargs in public_api if we wrote it in C? Honestly I'm not entirely sure.

This uses a closure over implementation and public_api, which is probably the main reason why it wasn't written in C in the first place.

Sorry for being unresponsive. I have been thinking about this, what I suppose could work is to add the like kwarg to the C implementation of each function, and then do the inverse process:

If like is valid, call the C implementation of override;

The C implementation of override would have a callback to the Python implementation which would dispatch to the downstream library.

Admittedly, I don't know much about C Python and the current C implementation of the __array_function__ override, so I don't know whether this is something that is possible to achieve. Besides this, I also don't see any other alternatives to doing it the Python way, as per my proposal in the NEP itself. Thoughts on this @shoyer ?

stefanv · 2019-10-25T04:30:45Z

This NEP should be designated NEP 35. NEP 33 is the dtype NEP, and NEP 34 was introduced 15 days ago by Matti.

… NEP-35

pentschev · 2019-11-04T13:33:29Z

Thanks @stefanv for catching this. I've renamed it in the latest commit.

rgommers · 2020-04-11T19:03:31Z

Comments seem to have been addressed and discussion has stopped, so merging this NEP with status Draft. Discussion can still continue if needed of course.

pentschev · 2020-04-13T23:09:00Z

Thanks @rgommers for merging!

jakirkham · 2020-04-14T04:55:16Z

Is there a need for additional discussion? If so, where should we have that? If not, what should be the next step here?

rgommers · 2020-04-14T09:05:38Z

Is there a need for additional discussion? If so, where should we have that? If not, what should be the next step here?

Yep, @seberg took the lead on organizing a call - forwarded you the invite.

eric-wieser reviewed Oct 15, 2019

View reviewed changes

doc/neps/nep-0033-array-creation-dispatch-with-array-function.rst Outdated Show resolved Hide resolved

shoyer reviewed Oct 15, 2019

View reviewed changes

doc/neps/nep-0033-array-creation-dispatch-with-array-function.rst Outdated Show resolved Hide resolved

shoyer reviewed Oct 15, 2019

View reviewed changes

pentschev added 2 commits October 16, 2019 11:58

MAINT: fix implementation example of NEP-33

edd16e6

MAINT: cite few examples of array creation functions in NEP-33

f4f02e7

pentschev mentioned this pull request Oct 17, 2019

Other *_like creation functions #14441

Closed

seberg reviewed Oct 18, 2019

View reviewed changes

MAINT: change array creation dispatching with __array_function__ to…

2961292

… NEP-35

mattip added the component: NEP label Nov 26, 2019

thomasjpfan mentioned this pull request Jan 24, 2020

Use np.empty_like in PolynomialFeatures for NEP18 support scikit-learn/scikit-learn#16196

Closed

rgommers merged commit e44d0ec into numpy:master Apr 11, 2020

Uh oh!

NEP: Proposal for array creation dispatching with __array_function__ #14715

NEP: Proposal for array creation dispatching with __array_function__ #14715

Uh oh!

Conversation

pentschev commented Oct 15, 2019

Uh oh!

pentschev commented Oct 15, 2019

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hameerabbasi Oct 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hgt312 commented Oct 18, 2019

Uh oh!

pentschev commented Oct 18, 2019

Uh oh!

hgt312 commented Oct 18, 2019

Uh oh!

mattip commented Oct 18, 2019

Uh oh!

pentschev commented Oct 18, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stefanv commented Oct 25, 2019

Uh oh!

pentschev commented Nov 4, 2019

Uh oh!

rgommers commented Apr 11, 2020

Uh oh!

pentschev commented Apr 13, 2020

Uh oh!

jakirkham commented Apr 14, 2020

Uh oh!

rgommers commented Apr 14, 2020

Uh oh!

Uh oh!

NEP: Proposal for array creation dispatching with `__array_function__` #14715

NEP: Proposal for array creation dispatching with `__array_function__` #14715

hameerabbasi Oct 16, 2019 •

edited

Loading