ENH: Propagate like= downstream #17678

pentschev · 2020-10-29T22:54:39Z

After doing some testing with NEP-35 in Dask (see dask/dask#6738), I noticed something I grossly overlooked in the original proposal, which is passing the like= argument downstream iff supported. The benefit of doing that is to achieve the following:

In [1]: import numpy as np, cupy as cp, dask.array as da

In [2]: np.asarray([1, 2, 3], like=da.array(cp.array(())))
Out[2]: dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=cupy.ndarray>

Note how passing a Dask array backed by a CuPy array to like= will make it possible for Dask to also identify that we want chunktype=cupy.ndarray. Without this change, we would only be able to achieve Dask's default, which is chunktype=np.ndarray.

This is a change that is unnecessary (or at least I think so) for CuPy, as a CuPy array is purely CuPy-backed, therefore implementing like= there would have no value. Due to that, this change will attempt passing like= downstream, and if a TypeError is raised, then it will fallback to the previous behavior of simply removing like= before calling the downstream implementation.

It's also good to point out that such change isn't necessary for Dask internally, we can completely get away without that, as is currently the case in dask/dask#6738 . However, this may be useful for other library developers in the future, so I think it's worth expanding the scope of NEP-35 to allow this. Of course, if we agreed on this change, I will update NEP-35's text to reflect this, but I preferred to have the code functional for some review before doing so.

cc @seberg @rgommers @shoyer @hameerabbasi @quasiben @jakirkham @leofang

hameerabbasi · 2020-10-30T09:30:55Z

numpy/core/src/multiarray/arrayfunction_override.c

            method, argument, public_api, types, args, kwargs, NULL);
+        PyObject *err = PyErr_Occurred();
+        if (err && PyErr_GivenExceptionMatches(err, PyExc_TypeError)) {


This would match every TypeError, which could be very slow, and is hacky. I'd suggest something along the lines of if getattr(type, __array_function_like__, False).

Of course, this would require an update to the NEP.

That would also work, the downside to such an approach is that we would need to then follow an all-or-nothing approach, so if the downstream library implements the __array_function_like__ attribute, then all creation functions would need to add like=. That's not necessarily bad, but it increases effort for adoptability and forces us to maintain a list of all functions that add like= in NEP-35, so maintainers know which functions need to add that argument.

I would like to hear more opinions and potentially some other drawbacks from folks before we decide on what path to take.

Not sure I like idea of an additional flag much. I guess this is a bit special due to the assumption that you do not want to force projects to add like= if it is just ignored anyway.

Right now, adding kwargs.pop("like") in your __array_function__ implementation is also possible (using the fact that like isn't used for anything else at the moment). I guess there is a transition problem possible where projects do generalize to these functions, it just never happened yet. And forcing to add like= means they have to catch up. But, in a sense I guess the whole setup is prone to small transition issues like that already.

To be clear, you'd only force projects to add like= to functions iff they also choose to add __array_function_like__ attribute. For example, CuPy wouldn't implement either one, but Dask would probably add __array_function_like__, and only then it would be forced to add like= to all the functions that NEP-35 covers. IOW, it would be opt-in only and once the library decides to add __array_function_like__, they would know already of the associated cost.

Maybe I'm overly concerned about things, I'm not sure yet. I believe you're right on the situations where this may show up, I guess I'm also considering the case where CuPy is older than NumPy 1.20, for example, if one were to upgrade to NumPy master now and remain with CuPy in release version, when using a Dask function that does some implicit like=-related calls to CuPy, this would be a hassle.

It would only be necessary to do that change in CuPy, Dask would be fine as long as it does add like= to all functions where NumPy does too, or remove like= on a per-function basis.

As for backward compatibility, what worries me is that for a reader of NEP-18 there was nothing as like=, but now we're introducing a change that requires an amendment stating they need to explicitly remove like=, I'm not sure if that's ideal, maybe we can rely solely on the DeprecationWarning should we follow that path?

Well, NEP-18 explicitly does not include np.array, if you support it you are beyond what it covers? Also I expect NEP-18 says that signatures should match, which might suggest you have to add the like= kwarg as well.

@pentschev I talked about this with @seberg at today's triage meeting, we decided that this PR isn't needed. Any function with like= implies that if like= is passed in, it's the only thing being dispatched on. Therefore, inside __array_function__, self is equivalent to the like that was passed in. So anytime we need like=, we could use self instead, and copy chunk-types from that. Does that work for your use-case?

@hameerabbasi wow, that's indeed a great idea! I'm now embarrassed I didn't think of it before, but thankfully we have you here to come up with that! I agree, this seems to entirely cover that situation, we can then do one of two things in the __array_function__ definition of those libraries (e.g., Dask) that may eventually have an use for it:

Inspect the function to check whether it implements the like= argument and do func(*args, **kwargs, like=self); or

Special case functions that will use it it and pass func(*args, **kwargs, like=self), in case we it's not possible to do introspection or that's too detrimental for performance.

I'll try to do a bit more testing tomorrow just to ensure I'm not overlooking anything. After that, I'll then close this PR and I'm thinking of making a change to NEP-35 only to add the information on how downstream libraries can achieve this result that we're trying to do for Dask, should someone else need this in the future.

Sorry it took me longer than I expected to come back to this. I did some more checking, and I think @hameerabbasi 's solution indeed covers the cases I had in mind. With that said, I just opened #17725 to add instructions for downstream library maintainers on how to do that, and I'm closing this PR. Thanks @hameerabbasi @seberg and @eric-wieser for feedback here!

ENH: Propagate like= downstream

4e7cb90

github-actions bot added the 01 - Enhancement label Oct 29, 2020

pentschev mentioned this pull request Oct 29, 2020

NEP 35: Finalize like= argument behaviour before 1.21 release #17075

Closed

hameerabbasi reviewed Oct 30, 2020

View reviewed changes

charris added the component: numpy._core label Nov 2, 2020

pentschev mentioned this pull request Nov 6, 2020

NEP: Add NEP-35 instructions on reading like= downstream #17725

Merged

pentschev closed this Nov 6, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

ENH: Propagate like= downstream #17678

ENH: Propagate like= downstream #17678

Uh oh!

pentschev commented Oct 29, 2020

Uh oh!

hameerabbasi Oct 30, 2020

Uh oh!

hameerabbasi Oct 30, 2020

Uh oh!

pentschev Oct 30, 2020

Uh oh!

seberg Oct 30, 2020

Uh oh!

pentschev Oct 30, 2020

Uh oh!

pentschev Nov 3, 2020

Uh oh!

seberg Nov 4, 2020

Uh oh!

hameerabbasi Nov 4, 2020

Uh oh!

pentschev Nov 4, 2020

Uh oh!

pentschev Nov 6, 2020

Uh oh!

Uh oh!

Uh oh!

ENH: Propagate like= downstream #17678

ENH: Propagate like= downstream #17678

Uh oh!

Conversation

pentschev commented Oct 29, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!