A protocol for numpy.ones_like #11074

mrocklin · 2018-05-10T12:22:13Z

I recently sat down with @ericmjl to see if we could make autograd work with cupy by making them both understand protocols like __array_ufunc__.

The goal here is to improve all projects so that they produce and consume numpy protocols rather than explicitly importing each other in pair-wise plugin mechanisms.

One issue that we ran into is that autograd needs to produce new arrays, like ones (the gradient of sum) and random (not sure why yet). It would be nice to be able to say "produce an array of ones with a particular shape and dtype, but using the module that created this particular array object". This would allow autograd to produce dask arrays of ones, cupy arrays of ones, etc.

The numpy.ones_like function almost does this except for the following two issues:

1. It takes the shape from the given array (which makes perfect sense given its original objective).
2. There is no protocol for it, so np.ones_like(my_duck_array, ...) produces a numpy array.
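
A minimal sketch of the second issue, assuming a hypothetical PlainDuck class (not part of any library) that exposes its data only through __array__. With no dispatch protocol implemented, np.ones_like coerces the object and hands back a plain numpy array:

```python
import numpy as np


class PlainDuck:
    """Hypothetical duck array that only knows how to convert itself to numpy."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # Coercion hook: numpy calls this when it needs a real ndarray.
        return self._data if dtype is None else self._data.astype(dtype)


x = PlainDuck([[1.0, 2.0], [3.0, 4.0]])
ones = np.ones_like(x)

# The result has the duck's shape, but the duck type itself is lost.
print(type(ones).__name__, ones.shape)
```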

Also, to be clear, ones here is an example of a larger problem of how to improve dispatch for a wider set of numpy functions.

mrocklin · 2018-05-10T14:15:00Z

Actually, I suspect that accepting a shape is not particularly important. My guess is that some of these immediate problems would be solved if functions like ones_like and empty_like were ufuncs.

mhvk · 2018-05-10T18:13:16Z

Have a look at np.core.umath._ones_like (which is a ufunc). If this is really useful, it may well be possible to expose it. (Note that I just know of its existence through checking that astropy's Quantity covers all ufuncs in np.core.umath -- I don't know why this exists...)

njsmith · 2018-05-10T18:24:01Z

I guess empty_like and zeros_like are the primitives, and ones_like is equivalent to empty_like + fill? Or maybe full_like(1)?

empty_like makes some sense as a ufunc, though I guess it might waste some time iterating and calling a no-op loop. zeros_like is a bit tricky because it's a special kind of allocation mode: for in-memory arrays you want to use calloc. I guess we could implement it as full_like(0) and trust that good implementations of full_like will detect special values like this and do the right thing?
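
The equivalences suggested above can be checked for plain numpy arrays with a short sketch. It only shows that the results agree; it says nothing about how numpy allocates memory internally:

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)

# ones_like expressed as empty_like + fill
via_fill = np.empty_like(x)
via_fill.fill(1)

# ones_like and zeros_like expressed through full_like
via_full = np.full_like(x, 1)
zeros_via_full = np.full_like(x, 0)

# All variants agree with the dedicated functions in value and dtype.
same_ones = np.array_equal(via_fill, np.ones_like(x)) and np.array_equal(via_full, np.ones_like(x))
same_zeros = np.array_equal(zeros_via_full, np.zeros_like(x))
print(same_ones, same_zeros)
```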

I feel like @shoyer and I might have discussed this in March, but I don't see anything in our notes and my brain is fried from getting my PyCon talk together for tomorrow. Maybe he remembers more :-)

shoyer · 2018-05-10T18:32:48Z

@njsmith I think you can find it briefly mentioned at the end, under "A (very) rough sketch of future plans".

Speaking of which -- we really should finish polishing those notes up into a NEP :).

The one other big question in my mind that we didn't talk about is where these protocols should live:

1. In NumPy proper, e.g., in the actual numpy.ones_like() function. This raises potential backwards compatibility concerns, if somebody is relying on numpy.ones_like() always producing a numpy array (which seems quite plausible).
2. In a separate module inside NumPy, e.g., numpy.api.ones_like()?
3. In a separate package altogether, e.g., @hameerabbasi's arrayish?

Perhaps this would be a good topic of discussion for the NumPy sprint at BIDS in a few weeks (or maybe the next week when @mrocklin is in town).

dopplershift · 2018-05-10T18:37:40Z

This seems related to the problem of making concatenate work for duck arrays.

charris · 2018-05-10T19:53:21Z

IIRC, the zeros_like and empty_like functions were inherited from Numeric and are not ufuncs. When ones_like was added there was some discussion as to whether it should be a ufunc or not. In any case, adjustments to the various functions have been made over the years to make them resemble each other. My memory of events is hazy as it was a long time ago. We might want to revisit all of those functions at some point.

mhvk · 2018-05-10T20:32:09Z

Indeed, ones_like comes from numeric.py; I still find it amusing that there is an actual _ones_like ufunc in umath... its docstring confirmed your memory:

_ones_like(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This function used to be the numpy.ones_like, but now a specific
function for that has been written for consistency with the other
*_like functions. It is only used internally in a limited fashion now.

mrocklin · 2018-05-10T20:58:41Z

This seems to be a sensible solution for my immediate problem. Thank you all.

In [1]: import numpy as np

In [2]: import dask.array as da

In [3]: x = da.arange(10, chunks=(5,))

In [4]: np.core.umath._ones_like(x)
Out[4]: dask.array<_ones_like, shape=(10,), dtype=int64, chunksize=(5,)>
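
The session above presumably works because _ones_like is a ufunc, so numpy offers the call to the argument's __array_ufunc__ method before doing anything else. A minimal sketch of that mechanism, assuming a hypothetical WrappedArray class (not part of any library) and using the public ufuncs np.negative and np.add in place of the private _ones_like:

```python
import numpy as np


class WrappedArray:
    """Hypothetical container that keeps its own type through ufunc calls."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls are handled in this sketch; reductions,
        # accumulations and the `out` argument are deliberately ignored.
        if method != "__call__" or "out" in kwargs:
            return NotImplemented
        unwrapped = [i.data if isinstance(i, WrappedArray) else i for i in inputs]
        return WrappedArray(ufunc(*unwrapped, **kwargs))


x = WrappedArray(np.arange(5))
neg = np.negative(x)   # dispatched to WrappedArray.__array_ufunc__
plus = np.add(x, 1)    # mixed inputs are dispatched as well

print(type(neg).__name__, neg.data)
print(type(plus).__name__, plus.data)
```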

I'm curious about the private functions in this namespace. Is this documented somewhere or is there a conversation I can read to get some context about why they exist?

charris · 2018-05-10T21:08:21Z

@mrocklin I assume you mean _ones_like. There was a conversation, but I don't know how you would find it these days. There are also traces in the git commit messages.

mrocklin · 2018-05-17T23:14:12Z

Conversation seems to have slowed down a bit here.

To summarize what I'm hearing above: at one point this was considered, but then an alternative was put in place. It's not clear what the reasoning was at the time, but presumably one could do some sleuthing to find out why. There is some mild concern from @shoyer that some code might be built expecting np.ones_like to definitely return a numpy array, which seems like a fair concern that should be discussed, but may not be a blocking issue. No one else has yet voiced strong dissent or support on this issue. Are there any other thoughts or concerns about this topic?

From my limited perspective the world would be a nicer place if Numpy would tend towards the default of making operations ufuncs. This would allow the ecosystem to evolve more quickly towards using generic code in gpu, sparse, and parallel situations.

charris · 2018-05-17T23:51:30Z

The ones_like ufunc was the original, but was dropped for a different implementation in order to make it like the other *_like functions. If you really want ufunc equivalents I think you should raise the issue on the mailing list, and if there is more than one, maybe put together an NEP.

hameerabbasi · 2018-05-18T05:04:26Z

There's a related discussion over on #8994 about making the three-argument version of np.where dispatch to a ufunc.

Overall, I agree with @mrocklin fairly strongly here that Numpy should move towards a protocol-oriented and ufunc-based architecture where at all possible. This would allow for other projects and the ecosystem as a whole to benefit from this as well.

I would be willing to co-author one or more NEPs, perhaps in conjunction with @njsmith and @shoyer. Hopefully @mrocklin joins in too.

ericmjl · 2018-05-18T12:39:47Z

If it's possible, would I be able to join in on the NEP? I've never written an enhancement proposal, and would like to observe the process, while also hopefully being able to contribute one end-user's perspective on it.

mhvk · 2018-05-18T13:00:04Z

One other advantage of a NEP is that it could provide a list of functions that would, in principle, be suitable, and separate them by what extra machinery is needed (e.g., ones_like, like where, can in principle deal with structured arrays and strings, both of which are a pain for ufuncs).

shoyer · 2019-05-10T21:17:33Z

This is solved by NEP-18.
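
NEP-18 introduced the __array_function__ protocol, which lets non-ufunc functions such as np.ones_like dispatch on their arguments. A minimal sketch, assuming a NumPy version where the protocol is enabled by default (1.17 or later) and a hypothetical DuckArray class that is not part of any library:

```python
import numpy as np


class DuckArray:
    """Hypothetical duck array that overrides np.ones_like via NEP-18."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Handle only np.ones_like in this sketch; returning NotImplemented
        # makes numpy raise TypeError for every other function.
        if func is np.ones_like:
            # args[0] is the DuckArray itself; forward any remaining arguments.
            return DuckArray(np.ones_like(self.data, *args[1:], **kwargs))
        return NotImplemented


x = DuckArray([1, 2, 3])
y = np.ones_like(x)                # stays a DuckArray
z = np.ones_like(x, dtype=float)   # keyword arguments are passed through

print(type(y).__name__, y.data)
print(type(z).__name__, z.data.dtype)
```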

mrocklin · 2019-05-10T22:35:29Z

Woot. Thanks all!

mrocklin changed the title ~~A protocol for numpy.ones~~ A protocol for numpy.ones_like May 17, 2018

This was referenced May 21, 2018

A protocol for tensordot #11128

Closed

A catch-all protocol for numpy-like duck arrays #11129

Closed

shoyer closed this as completed May 10, 2019
