ENH: np.reduce should allow broadcasting WHERE with ARRAY in either direction #29152

Open
jdtsmith opened this issue Jun 9, 2025 · 1 comment

jdtsmith commented Jun 9, 2025

Proposed new feature or change:

In ufunc.reduce (e.g. np.add.reduce), np.mean, etc., the WHERE argument restricts the reduction to selected elements. WHERE is broadcast in one direction only, to the shape of ARRAY:

WHERE: A boolean array which is broadcasted to match the dimensions of [array], and selects elements to include in the reduction.
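For reference, here is a minimal sketch of the direction that already works, where WHERE is the smaller operand and is broadcast up to ARRAY. The array contents and the names `row_mask`, `col_means`, and `expected` are illustrative assumptions, not part of the original report.

```python
import numpy as np

# ARRAY has shape (5, 10); the values are arbitrary and chosen for illustration.
a = np.arange(50).reshape(5, 10)

# WHERE has shape (5, 1): one flag per row. NumPy broadcasts it to (5, 10).
row_mask = np.array([True, False, True, False, True])[:, np.newaxis]

# Column means computed over rows 0, 2 and 4 only.
col_means = np.mean(a, axis=0, where=row_mask)

# Same result via explicit row selection, for comparison.
expected = a[[0, 2, 4]].mean(axis=0)
```

Here the mask is the operand being stretched, which is the only direction the reduction currently supports.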

But sometimes you'd like the broadcast to go in the other direction. For example, suppose you want to compute the average row number in a 2D array where some condition is true:

import numpy as np
a = np.random.randint(0, 10, (5, 10))
np.mean(np.arange(5)[:, np.newaxis], axis=0, where=(a <= 5.))

This raises:

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (5,10)  and requested shape (5,1)

because the broadcast is tried in one direction only (from WHERE -> ARRAY).

If ARRAY would otherwise be a repeated set of values (e.g. a column vector of row indices), it seems sensible to allow broadcasting between WHERE and ARRAY in either direction. In this example, ARRAY -> WHERE broadcasting would occur. Otherwise, you must waste memory "filling out" ARRAY:

np.mean(np.tile(np.arange(5)[:, np.newaxis], 10), axis=0, where=(a <= 5.))

seberg (Member) commented Jun 10, 2025

I have mixed feelings about the suggestion. NumPy broadcasts in most cases, but there are exceptions where it isn't clear that the arguments are on a similar standing (one example is that out=some_array will never broadcast, but of course that is an extreme case).

So it seems fine to me, but I am also unsure that it wouldn't cause unintended broadcasts, especially when manual broadcasting via a = np.broadcast_to(a, ...) is pretty simple.
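A minimal sketch of that manual workaround, applied to the row-index example from the report, might look like the following. The deterministic array `a` is a stand-in for the random one in the issue, and the names `mask`, `rows`, `rows_view`, and `mean_rows` are illustrative assumptions.

```python
import numpy as np

# Deterministic stand-in for the random (5, 10) array in the report.
a = np.arange(50).reshape(5, 10) % 7
mask = a <= 5                       # WHERE, shape (5, 10)

# Column vector of row indices, shape (5, 1).
rows = np.arange(5)[:, np.newaxis]

# Manually broadcast ARRAY to the shape of WHERE. np.broadcast_to returns a
# read-only view with a zero stride along the repeated axis, so the row
# indices are not copied the way np.tile would copy them.
rows_view = np.broadcast_to(rows, mask.shape)

# Average row index, per column, over the elements where the condition holds.
mean_rows = np.mean(rows_view, axis=0, where=mask)
```

Because `rows_view` is a view rather than a copy, this avoids the memory cost of "filling out" ARRAY while keeping the broadcast explicit at the call site.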
