A protocol for numpy.ones_like #11074

Closed
mrocklin opened this issue May 10, 2018 · 16 comments

@mrocklin
Contributor

I recently sat down with @ericmjl to see if we could make autograd work with cupy by making them both understand protocols like __array_ufunc__.

The goal here is to improve all projects so that they produce and consume numpy protocols rather than explicitly importing each other through pair-wise plugin mechanisms.

One issue that we ran into is that autograd needs to produce new arrays, like ones (the gradient of sum) and random (not sure why yet). It would be nice to be able to say "produce an array of ones with a particular shape and dtype, but using the module that created this particular array object". This would allow autograd to produce dask arrays of ones, cupy arrays of ones, etc.

The numpy.ones_like function almost does this except for the following two issues:

  1. It takes the shape from the given array (which makes perfect sense given its original objective)
  2. There is no protocol for it, so np.ones_like(my_duck_array, ...) produces a numpy array
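The second point can be illustrated with a minimal sketch. MyDuckArray is a hypothetical stand-in for a dask or cupy array (it is not part of any library); it only exposes __array__, so NumPy coerces it and the result should come back as a plain numpy.ndarray rather than another MyDuckArray:

```python
import numpy as np


class MyDuckArray:
    """Hypothetical duck array: wraps an ndarray and only exposes __array__."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this to coerce the object into a real ndarray.
        return self._data if dtype is None else self._data.astype(dtype)


duck = MyDuckArray([1.0, 2.0, 3.0])
result = np.ones_like(duck)

# The duck type is lost: the result is a NumPy array, not a MyDuckArray.
print(type(result).__name__, result.shape)
```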

Also, to be clear, ones here is an example of a larger problem of how to improve dispatch for a wider set of numpy functions.

@mrocklin
Contributor Author

Actually, I suspect that accepting a shape is not particularly important. My guess is that some of these immediate problems would be solved if functions like ones_like and empty_like were ufuncs.

@mhvk
Contributor

mhvk commented May 10, 2018

Have a look at np.core.umath._ones_like (which is a ufunc). If this is really useful, it may well be possible to expose it. (Note that I just know of its existence through checking that astropy's Quantity covers all ufuncs in np.core.umath -- I don't know why this exists...)

@njsmith
Member

njsmith commented May 10, 2018

I guess empty_like and zeros_like are the primitives, and ones_like is equivalent to empty_like + fill? Or maybe full_like(1)?

empty_like makes some sense as a ufunc, though I guess it might waste some time iterating and calling a no-op loop. zeros_like is a bit tricky because it's a special kind of allocation mode: for in-memory arrays you want to use calloc. I guess we could implement it as full_like(0) and trust that good implementations of full_like will detect special values like this and do the right thing?
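For plain NumPy arrays, the equivalences mentioned here can be checked directly. This is only a sketch of the relationship between the public functions, not a statement about how NumPy implements them internally:

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)

# ones_like expressed three ways.
ones_direct = np.ones_like(x)

ones_via_fill = np.empty_like(x)  # uninitialized memory with x's shape and dtype
ones_via_fill.fill(1)             # then filled in place

ones_via_full = np.full_like(x, 1)

# zeros_like expressed as full_like(0).
zeros_via_full = np.full_like(x, 0)

print(np.array_equal(ones_direct, ones_via_fill))
print(np.array_equal(ones_direct, ones_via_full))
print(np.array_equal(np.zeros_like(x), zeros_via_full))
```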

I feel like @shoyer and I might have discussed this in March, but I don't see anything in our notes and my brain is fried from getting my PyCon talk together for tomorrow. Maybe he remembers more :-)

@shoyer
Member

shoyer commented May 10, 2018

@njsmith I think you can find it briefly mentioned at the end, under "A (very) rough sketch of future plans".

Speaking of which -- we really should finish polishing those notes up into a NEP :).

The one other big question in my mind that we didn't talk about is where these protocols should live:

  • In NumPy proper, e.g., in the actual numpy.ones_like() function. This raises potential backwards compatibility concerns, if somebody is relying on numpy.ones_like() always producing a numpy array (which seems quite plausible).
  • In a separate module inside NumPy, e.g., numpy.api.ones_like()?
  • In a separate package altogether, e.g., @hameerabbasi's arrayish?

Perhaps this would be a good topic of discussion for the NumPy sprint at BIDS in a few weeks (or maybe the next week when @mrocklin is in town).

@dopplershift

This seems related to the problem of making concatenate work for duck arrays.

@charris
Member

charris commented May 10, 2018

IIRC, the zeros_like and empty_like functions were inherited from Numeric and are not ufuncs. When ones_like was added there was some discussion as to whether it should be a ufunc or not. In any case, adjustments to the various functions have been made over the years to make them resemble each other. My memory of events is hazy as it was a long time ago. We might want to revisit all of those functions at some point.

@mhvk
Contributor

mhvk commented May 10, 2018

Indeed, ones_like comes from numeric.py; I still find it amusing that there is an actual _ones_like ufunc in umath... its docstring confirmed your memory:

_ones_like(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This function used to be the numpy.ones_like, but now a specific
function for that has been written for consistency with the other
*_like functions. It is only used internally in a limited fashion now.

@mrocklin
Contributor Author

This seems to be a sensible solution for my immediate problem. Thank you all.

In [1]: import numpy as np

In [2]: import dask.array as da

In [3]: x = da.arange(10, chunks=(5,))

In [4]: np.core.umath._ones_like(x)
Out[4]: dask.array<_ones_like, shape=(10,), dtype=int64, chunksize=(5,)>

I'm curious about the private functions in this namespace. Is this documented somewhere or is there a conversation I can read to get some context about why they exist?
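The reason a ufunc hands back a dask array here is the __array_ufunc__ protocol: any object that defines it gets to decide what a ufunc call returns. A minimal sketch with a hypothetical Wrapped class (not dask's actual implementation) and a public ufunc in place of the private _ones_like:

```python
import numpy as np


class Wrapped:
    """Hypothetical container that stays in control of ufunc calls."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls without out= are handled in this sketch.
        if method != "__call__" or "out" in kwargs:
            return NotImplemented
        # Unwrap our own type, run the ufunc on ndarrays, and re-wrap.
        unwrapped = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*unwrapped, **kwargs))


w = Wrapped([1, 2, 3])
doubled = np.multiply(w, 2)

# The result is still a Wrapped object rather than a bare ndarray.
print(type(doubled).__name__, doubled.data)
```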

@charris
Member

charris commented May 10, 2018

@mrocklin I assume you mean _ones_like. There was a conversation, but I don't know how you would find it these days. There are also traces in the git commit messages.

@mrocklin changed the title from "A protocol for numpy.ones" to "A protocol for numpy.ones_like" on May 17, 2018
@mrocklin
Contributor Author

Conversation seems to have slowed down a bit here.

To summarize what I'm hearing above: it sounds like this was considered at one point, but then an alternative was put in place. It's not clear what the reasoning was at the time, but presumably one could do some sleuthing to find out why. There is some mild concern from @shoyer that some code might be built expecting np.ones_like to always return a numpy array, which seems like a fair concern that should be discussed, but may not be a blocking issue. No one else has yet voiced strong dissent or support on this issue. Are there any other thoughts or concerns about this topic?

From my limited perspective the world would be a nicer place if Numpy would tend towards the default of making operations ufuncs. This would allow the ecosystem to evolve more quickly towards using generic code in gpu, sparse, and parallel situations.

@charris
Member

charris commented May 17, 2018

The ones_like ufunc was the original, but was dropped for a different implementation in order to make it like the other *_like functions. If you really want ufunc equivalents I think you should raise the issue on the mailing list, and if there is more than one, maybe put together an NEP.

@hameerabbasi
Contributor

hameerabbasi commented May 18, 2018

There's a related discussion over on #8994 about making the three-argument version of np.where dispatch to a ufunc.

Overall, I agree with @mrocklin fairly strongly here that Numpy should move towards a protocol-oriented and ufunc-based architecture where at all possible. This would allow for other projects and the ecosystem as a whole to benefit from this as well.

I would be willing to co-author one (or several) NEPs, perhaps in conjunction with @njsmith and @shoyer. Hopefully @mrocklin joins in too.

@ericmjl

ericmjl commented May 18, 2018

If it's possible, would I be able to join in on the NEP? I've never written an enhancement proposal, and would like to observe the process, while also hopefully being able to contribute one end-user's perspective on it.

@mhvk
Contributor

mhvk commented May 18, 2018

One other advantage of a NEP is that it could provide a list of functions that would, in principle, be suitable, and separate them by what extra machinery is needed (e.g., ones_like, like where, can in principle deal with structured arrays and strings, both of which are a pain for ufuncs).

@shoyer
Member

shoyer commented May 10, 2019

This is solved by NEP-18
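NEP 18 introduced the __array_function__ protocol, which lets a duck array intercept NumPy functions such as np.ones_like. A minimal sketch, assuming NumPy 1.17 or later (where the protocol is enabled by default); DuckArray is a hypothetical class, and only np.ones_like with the array passed positionally is handled:

```python
import numpy as np


class DuckArray:
    """Hypothetical duck array that overrides np.ones_like via NEP 18."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.ones_like and args and isinstance(args[0], DuckArray):
            # Delegate to NumPy on the wrapped ndarray, then re-wrap.
            return DuckArray(np.ones_like(args[0].data, *args[1:], **kwargs))
        # Anything else is not supported by this sketch.
        return NotImplemented


duck = DuckArray([1, 2, 3])
result = np.ones_like(duck)

# np.ones_like now returns the duck type instead of a NumPy array.
print(type(result).__name__, result.data)
```

Unsupported functions return NotImplemented, which causes NumPy to raise a TypeError rather than silently coercing the object.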

@shoyer closed this as completed on May 10, 2019
@mrocklin
Contributor Author

Woot. Thanks all!
