ENH: add ufuncs additional kwargs like `out`, `dtype` etc.. for `np.where` (`out` is needed most) #18516

victor-zou · 2021-03-01T08:42:52Z

Feature

np.where can be regarded as a ternary ufunc computing cond ? x : y, so it is natural
for np.where to accept kwargs like out, dtype, etc. Supporting these kwargs saves
the effort of allocating memory, type casting, etc. In addition, the numexpr library does support
this kind of syntax.
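
To make the gap concrete, here is a small sketch that uses only existing NumPy calls. A ufunc such as np.add accepts out and writes into a preallocated buffer, while np.where returns a freshly allocated array on every call. The array names are illustrative only.

```python
import numpy as np

cond = np.array([True, False, True, False])
x = np.arange(4, dtype=np.float64)
y = -np.ones(4, dtype=np.float64)

# A ufunc can write its result into a preallocated buffer via `out`.
buf = np.empty(4, dtype=np.float64)
res = np.add(x, y, out=buf)
ufunc_reuses_buffer = res is buf

# np.where is called without `out`; the result is a new array each time.
selected = np.where(cond, x, y)
where_allocates = not np.shares_memory(selected, buf)
```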


eric-wieser · 2021-03-01T10:24:53Z

Largely duplicates #8994 I think

victor-zou · 2021-03-03T02:56:29Z

Largely duplicates #8994 I think

Thank you for the reference. I am surprised to find that the feature is still not implemented after four years. It looks like the obstacle lies in the initial immature design, which merges two unrelated things in one function.

So, can we have a second-best choice? That is, if it is not easy to make where a ufunc, simply add an out kwarg. If a user uses the nonzero part and passes in an out, raise a ValueError. Similarly, I think a dtype kwarg is also not hard to implement and works
well with the nonzero part (for example, the user may choose to output int32, uint32, int64, uint64, size_t, ptrdiff_t, ssize_t ...).
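
A rough Python-level sketch of this second-best behaviour is below. The function where_with_out is hypothetical and not part of NumPy; it only illustrates the proposed semantics (ValueError when out is combined with the one-argument, nonzero-style form) using existing calls such as np.nonzero and np.copyto.

```python
import numpy as np

def where_with_out(cond, x=None, y=None, out=None):
    """Hypothetical wrapper (not a NumPy API): np.where plus an `out` argument."""
    if x is None and y is None:
        # One-argument form behaves like np.nonzero, so `out` is rejected.
        if out is not None:
            raise ValueError("out is not supported for the one-argument form")
        return np.nonzero(cond)
    if x is None or y is None:
        raise ValueError("either both or neither of x and y should be given")
    if out is None:
        return np.where(cond, x, y)
    mask = np.asarray(cond, dtype=bool)
    # Fill `out` with y, then overwrite the positions where the mask holds with x.
    np.copyto(out, y)
    np.copyto(out, x, where=mask)
    return out

cond = np.array([True, False, True])
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])
out = np.empty(3)

result = where_with_out(cond, x, y, out=out)   # writes into `out`
indices = where_with_out(cond)                 # nonzero-style call
```

Note that this sketch assumes out does not alias x; if it does, the first copy would overwrite x before it is read.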

eric-wieser · 2021-03-03T08:01:12Z

It looks like the obstacle lies in the initial immature design, which merges two unrelated things in one function.

I don't think this is really the obstacle. A bigger issue is that where works on arbitrary dtypes, but ufuncs only work on simple dtypes (i.e. not string, unicode, void, etc.)
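
As a small illustration of the first half of that point, np.where already handles string arrays today. The values below are made up for the example.

```python
import numpy as np

cond = np.array([True, False, True])
a = np.array(["red", "green", "blue"])   # unicode dtype
b = np.array(["x", "y", "z"])            # unicode dtype with a shorter itemsize

# np.where selects element-wise and finds a common string dtype for the result.
picked = np.where(cond, a, b)
```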

seberg · 2021-03-03T16:19:25Z

Well, I am working on that part (next big item on my agenda). Hopefully we will have better support for "flexible" or "parametric" dtypes fairly soon. We actually do have some support, but it's weird enough that either nobody knew we have it or nobody felt like using it.

On the other hand, where is a bit special, in that the actual inner loop could probably be written without any dtype-specific code, but rather "using" the existing copy code. numexpr probably would require a proper ufunc. Adding out is probably acceptable right now; dtype might be too, but I am not sure how much churn it would be to include casting logic in the current code.

victor-zou · 2021-03-04T02:42:28Z

Well, I am working on that part (next big item on my agenda). Hopefully we will have better support for "flexible" or "parametric" dtypes fairly soon. We actually do have some support, but it's weird enough that either nobody knew we have it or nobody felt like using it.

On the other hand, where is a bit special, in that the actual inner loop could probably be written without any dtype-specific code, but rather "using" the existing copy code. numexpr probably would require a proper ufunc. Adding out is probably acceptable right now; dtype might be too, but I am not sure how much churn it would be to include casting logic in the current code.

Thanks for the reply. Only today did I read the source code and learn that np.where is implemented "simply" via if and copy instead of the npyv_select* macros (namely, the _mm*_blend** SIMD instructions). My personal suggestion is to completely refactor the function and use the _mm*_blend** SIMD instructions for numeric types. The out kwarg is wanted for performance reasons, but the current code is slow enough that the little time saved on memory allocation becomes meaningless.

victor-zou · 2021-03-04T08:25:40Z

Well, I am working on that part (next big item on my agenda). Hopefully we will have better support for "flexible" or "parametric" dtypes fairly soon. We actually do have some support, but it's weird enough that either nobody knew we have it or nobody felt like using it.

On the other hand, where is a bit special, in that the actual inner loop could probably be written without any dtype-specific code, but rather "using" the existing copy code. numexpr probably would require a proper ufunc. Adding out is probably acceptable right now; dtype might be too, but I am not sure how much churn it would be to include casting logic in the current code.

I found that there is already a ternary ufunc named clip, which is de facto composed of two where operations. So it should not be hard to add another ufunc. If keeping compatibility with the old np.where is hard, is it OK to add another ufunc named blend (as "where" and "select" are both already used, and it matches the instruction name), along with another function np.blend?
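
The relationship between clip and where can be sketched as follows. This shows only the semantic equivalence for finite inputs with lo <= hi; it is not a claim about how np.clip is implemented internally.

```python
import numpy as np

a = np.array([-3.0, 0.5, 2.0, 7.0])
lo, hi = 0.0, 5.0

clipped = np.clip(a, lo, hi)

# The same result expressed as two nested where selections.
nested = np.where(a < lo, lo, np.where(a > hi, hi, a))
```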

seberg · 2021-03-04T15:42:24Z

If you don't mind overwriting one of the inputs, np.copyto should actually be a pretty decent solution. Of course it also doesn't do anything particularly fancy (but then aside from a few ufuncs, not a whole lot of things in NumPy do). Although, I wonder if a dedicated meld is actually much faster for most use cases (e.g. if the True/False values are blocked, you may not even have to read both arrays for a chunk larger than a cache line).
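
A minimal sketch of the np.copyto approach, assuming it is acceptable to overwrite y in place (the arrays are illustrative):

```python
import numpy as np

cond = np.array([True, False, True, False])
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])

# Reference result; this call allocates a new array.
expected = np.where(cond, x, y)

# Overwrite y in place: take x where cond is True, keep y elsewhere.
np.copyto(y, x, where=cond)
```

After the call, y holds the same values as np.where(cond, x, y) would have returned, with no new result array allocated.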

I think adding a ternary ufunc for this "meld" operation is fine (assuming it's not an insane amount of churn). Adding it to the main namespace, I am not sure... There is a clip ufunc, but that is just used inside np.clip and not exposed to end users.
I also profoundly dislike the dual use of np.where, but the np.nonzero alternative is probably more "clear". Although, if there is a "canonical" name most packages use, we could try to add it and nudge users towards it very slowly. But despite my personal dislike of np.where due to its dual use, the name is very pleasing and I think it is used a lot in everyday code.

victor-zou · 2021-03-05T02:29:22Z

The purpose of out is to avoid memory allocation, not to write to a specific buffer.

When cond, x, y are contiguous and the dtype is float64, a SIMD version (written with Eigen and wrapped by pybind11, tested on an old Intel CPU that only supports AVX2) gives a nearly 50% performance boost on a large array (I have carefully excluded the time for memory allocation, memory deallocation, etc.). The gain should be even larger on servers with AVX-512 instructions.

The name where is good. I do not care what its name is, only how fast it is. I want np.where to be as optimized as the other parts of the numpy package, instead of the status quo in which it 1) does not use SIMD instructions and 2) cannot avoid memory allocation.
