ENH: make np.where a ufunc #8994

Open
eric-wieser opened this issue Apr 26, 2017 · 12 comments
Comments

@eric-wieser
Member

eric-wieser commented Apr 26, 2017

This should be geared strongly towards a bXX->X loop (a boolean condition plus two operands of type X, producing a result of type X), for every possible X.

This would offer:

  • an out argument
  • support for subclasses using __array_ufunc__
  • Not really desirable, but it comes with the package: a where argument (!), such that np.where(c, y, z, out=x, where=w) is a more efficient x = np.where(w, np.where(c, y, z), x)
  • A fix for #5095 (BUG: np.where half-initializes subclass of output)

Problems:

  • Are inner loops for void and other flexible types possible?
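
To illustrate the masked-output bullet above: np.where does not currently accept out or where, so the following is only a sketch that emulates the proposed call with np.copyto, which does exist and takes a where mask. It assumes the usual ufunc semantics, in which where=w marks the positions that get written and every other position of out is left alone. The array values are made up for illustration.

```python
import numpy as np

c = np.array([True, False, True, False])   # selection condition
y = np.array([1, 2, 3, 4])
z = np.array([10, 20, 30, 40])
w = np.array([True, True, False, False])   # output mask
x = np.full(4, -1)                         # pre-existing output buffer

# Nested form: allocates a fresh array for the final result.
expected = np.where(w, np.where(c, y, z), x)

# Emulation of the hypothetical np.where(c, y, z, out=x, where=w):
# only positions where w is True are overwritten in x.
np.copyto(x, np.where(c, y, z), where=w)
```

A real ufunc would also avoid the temporary created by the inner np.where call, which this emulation still allocates.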
@mhvk
Contributor

mhvk commented Apr 26, 2017

Love this idea!

@shoyer
Member

shoyer commented Apr 28, 2017

Yes, this would be awesome! This would only work for the three-argument version of where -- the one-argument version isn't really ufunc-like. So we can keep the public API unchanged, and don't need to support the confusing where(..., where=...) unless we really want to :).

@hameerabbasi
Contributor

I believe that they are possible. However, would it be possible to make the one-argument version work with this? Or would the three-argument version defer to this?

@hameerabbasi
Contributor

hameerabbasi commented Feb 24, 2018

I'd be willing to work up a PR if we can decide on a new home for this ufunc so we support the one-argument where.

@mhvk
Contributor

mhvk commented Feb 24, 2018

@hameerabbasi - I think the idea would be that np.where calls the ufunc for its three-argument form (and for types of argument for which the ufunc works). For the one-argument form, no change would be made (since that behaviour cannot be captured by a ufunc).

Questions to all: what should be the name of the ufunc? We already have np.select. Harking back to C, perhaps np.conditional(condition, a, b). Or, more Fortran-ish, np.merge(a, b, condition)? Or should it be a private function that only gets called by np.where?

p.s. On dealing with void and string: that is a bit trickier, as it needs passing on lengths, etc. Possibly it is best to just start with the regular dtypes...

@hameerabbasi
Contributor

hameerabbasi commented Feb 24, 2018

> Questions to all: what should be the name of the ufunc? We already have np.select. Harking back to C, perhaps np.conditional(condition, a, b). Or, more Fortran-ish, np.merge(a, b, condition)? Or should it be a private function that only gets called by np.where?

In the issue I opened (before opening this one), I suggested if, but it's a Python keyword so not optimal. ternary would also work since it mimics the ternary operator. ifx from LaTeX would make sense too.

> p.s. On dealing with void and string: that is a bit trickier, as it needs passing on lengths, etc. Possibly it is best to just start with the regular dtypes...

Use np.promote_types? Doesn't work for all cases though.

>>> np.promote_types('S8', 'S11')
dtype('S11')
>>> dt1 = np.dtype([('f1', np.int16), ('f2', np.float32)])
>>> dt2 = np.dtype([('f5', np.int16), ('f6', np.float32)])
>>> np.promote_types(dt1, dt2)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: invalid type promotion
>>> np.promote_types(dt1, dt1)
dtype([('f1', '<i2'), ('f2', '<f4')])

@mhvk
Contributor

mhvk commented Feb 24, 2018

I don't like ternary as that's just the number of arguments, and one could think of other ufuncs with three arguments (e.g., the fused multiply and add discussed on the mailing list).

@shoyer
Member

shoyer commented Feb 24, 2018 via email

@hameerabbasi
Contributor

I also had another idea. I know numpy doesn't follow this too strictly, but how about if_, similar to operator.and_ et al.?

@hameerabbasi
Contributor

I'm +1 on the name np.conditional. If someone can point me to similar ufunc implementations I'll try my hand at this one.

@mhvk
Contributor

mhvk commented May 18, 2018

The casting would, I think, be fairly similar to addition (except that of course the boolean does not influence the outcome). The regular ufuncs are all defined in core/src/umath/, in particular loops.c.src; you may want to look at recent additions; e.g., #8774. Though if you haven't done a ufunc before, it might make sense to start without the scripting/looping in a .src file, and just follow the tutorial for writing a ufunc: https://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html (I found this fairly helpful).
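
The casting point can be checked against today's np.where. This is a small sketch with made-up arrays, showing that the result dtype follows the same promotion as addition of the two value operands, while the boolean condition plays no part.

```python
import numpy as np

cond = np.array([True, False, True])
a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([0.5, 1.5, 2.5], dtype=np.float32)

picked = np.where(cond, a, b)   # existing three-argument np.where
summed = a + b                  # np.add on the same operands

# Both dtypes come from promoting int32 with float32; cond is ignored.
same_as_add = picked.dtype == summed.dtype == np.result_type(a, b)
```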

@eric-wieser
Member Author

Regarding naming, I think np.where.ufunc is possibly the simplest place to put the actual ufunc object, which np.where would dispatch to after special-casing its one-argument form. Only users implementing __array_ufunc__ need to know where it is, and they'd find it organically while writing tests for the ufuncs they care about.
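
A minimal sketch of how that might look from the point of view of an __array_ufunc__ implementer. Everything here is hypothetical: where_sketch stands in for np.where, its .ufunc attribute is an object-dtype ufunc built with np.frompyfunc rather than a compiled one, and Recording is a toy subclass that only logs which ufunc it was asked to handle.

```python
import numpy as np

def where_sketch(condition, x, y):
    # Three-argument form only; dispatches to the attached ufunc object.
    return where_sketch.ufunc(condition, x, y)

# Hypothetical attribute, mirroring the suggested np.where.ufunc.
where_sketch.ufunc = np.frompyfunc(lambda c, a, b: a if c else b, 3, 1)

class Recording(np.ndarray):
    """Toy subclass that records each ufunc it intercepts."""
    seen = []

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        Recording.seen.append(ufunc)
        # Strip the subclass so the nested call does not recurse.
        plain = [np.asarray(i) if isinstance(i, Recording) else i for i in inputs]
        return getattr(ufunc, method)(*plain, **kwargs)

a = np.arange(3).view(Recording)
result = where_sketch([True, False, True], a, -1)
```

After the call, Recording.seen holds where_sketch.ufunc, which is the object a subclass author would compare against inside __array_ufunc__.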
