Skip to content

Conversation

seberg
Copy link
Member

@seberg seberg commented Feb 23, 2014

Boolean array indices will (unless the special case is taken)
always be converted using a nonzero logic. However, for example

arr = np.arange(10)
index = np.array([True])
arr[index]

would thus work as if index is filled up with False. This is
a source of confusion and hardly useful in practice. Especially
if the array has more then one dimension this behaviour can
be very unexpected as it conflicts with broadcasting.

@juliantaylor
Copy link
Contributor

while it makes sense to me to deprecate it (you can always use slicing if you have a sparse bool mask) what do we gain by deprecating it?
it seems it just adds a bunch more code instead of saving any

@seberg
Copy link
Member Author

seberg commented Feb 24, 2014

True, it adds quite a bit of code (could maybe restructure it a bit more, but won't change much). The gain is, that I really dislike this:

a = np.ones((5, 5))
index = np.ones((5, 1), dtype=bool)
a[index]  # Surely this uses broadcasting logic ;)

If nobody else minds that much, we can just leave it as is.

@charris
Copy link
Member

charris commented Feb 25, 2014

@juliantaylor Just to clarify the example, currently

In [1]: a = np.ones((5, 5))

In [2]: index = np.ones((5, 1), dtype=bool)

In [3]: a[index]
Out[3]: array([ 1.,  1.,  1.,  1.,  1.])

In [4]: a[index[:,0]]
Out[4]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

@charris
Copy link
Member

charris commented Aug 25, 2014

@seberg Needs a rebase.

@seberg seberg force-pushed the boolean-bad-shape-dep branch from 2e35648 to af5f0bb Compare August 27, 2014 10:56
@seberg
Copy link
Member Author

seberg commented Aug 27, 2014

Rebased, removed release notes comment for now. Will have to go over this once more though.

@seberg
Copy link
Member Author

seberg commented Aug 27, 2014

One useful thing I could now add (and maybe will), is to tell people that an array was original boolean when there is shape mismatch error. It won't be perfect, but better then nothing.

@seberg seberg force-pushed the boolean-bad-shape-dep branch 3 times, most recently from b9dee35 to 59fde25 Compare August 28, 2014 13:33
@seberg
Copy link
Member Author

seberg commented Oct 30, 2014

Seems like I would need to rebase this... So what is the take would a VisibleDeprecationWarning make sense? And would it be easy to give that from C?

@charris charris added this to the 1.10.0 release milestone Apr 7, 2015
@charris
Copy link
Member

charris commented May 12, 2015

I like this, but am inclined to put it off to 1.11.0 unless there is agreement and @seberg has time.

A VisibleDeprecationWarning would make sense, not sure of the easiest route to use it would be. Maybe define it in the Multiarray module?

@seberg
Copy link
Member Author

seberg commented May 13, 2015

Will have to wait two weeks probably, but after that should be able to find some time.

@seberg seberg force-pushed the boolean-bad-shape-dep branch from 59fde25 to 35104c2 Compare May 18, 2015 15:21
@seberg
Copy link
Member Author

seberg commented May 27, 2015

I have updated this to use VisibleDeprecationWarning which is imported on the fly when needed (we seem to do similar things for the Complex warnings), etc. However, I am not sure if this is quite right, and it might make sense to do the importing in the init instead of when the warning is given?
Have to review the code myself again some time, but it may be more or less done.

@charris
Copy link
Member

charris commented Jun 5, 2015

@seberg I've passed this on to the mailing list if you want to weigh in.

@njsmith
Copy link
Member

njsmith commented Jun 5, 2015

IMO the bizarre behavior of arr[mask1, mask2] is a strong specific argument for deprecating, above and beyond the general confusingness of allowing mismatched shapes. So 👍 here.

@charris
Copy link
Member

charris commented Jun 5, 2015

The consensus seems to be "yes", but I'll wait a bit longer.

@charris
Copy link
Member

charris commented Jun 5, 2015

@seberg The doc/sphinxext file should not be changed. Not sure what the easiest fix is here, maybe git submodule update -f will do, or maybe you need to clear the directory and rebase on master, or maybe something else.

@seberg seberg force-pushed the boolean-bad-shape-dep branch from d9e244f to 8803dbd Compare June 5, 2015 18:04
@seberg
Copy link
Member Author

seberg commented Jun 5, 2015

Fixed that part (in the end, manually checked out the correct submodule branch during a rebase...).

@@ -741,7 +744,8 @@ prepare_index(PyArrayObject *self, PyObject *index,
/*
* At this point indices are all set correctly, no bounds checking
* has been made and the new array may still have more dimensions
* then is possible.
* then is possible. And boolean indexing arrays may have had a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then -> than

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"than is possible and boolean indexing arrays may have an incorrect shape."

if ((indices[i].type == HAS_FANCY) && indices[i].value > 0) {
if (indices[i].value != PyArray_DIM(self, used_ndim)) {
PyObject *m, *warning;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might as well remove this blank line and insert before sprintf.

@charris
Copy link
Member

charris commented Jun 5, 2015

Mostly style stuff. There is not as much new code as I was led to believe.

@seberg seberg force-pushed the boolean-bad-shape-dep branch 3 times, most recently from 7b09f16 to 0247bda Compare June 6, 2015 08:48
@seberg
Copy link
Member Author

seberg commented Jun 6, 2015

Updated, justed the _pyfunc for now, but happy if you change that (other ideas, might be _from_import or just import, but don't want to do bikeshedding about this small thing :))

Boolean array indices will (unless the special case is taken)
always be converted using a nonzero logic. However, for example
```
arr = np.arange(10)
index = np.array([True])
arr[index]
```
would thus work as if index is filled up with `False`. This is
a source of confusion and hardly useful in practice. Especially
if the array has more then one dimension this behaviour can
be very unexpected as it conflicts with broadcasting.

Uses VisibleDeprecationWarning, since this is probably usually a
user bug in the first place.
@seberg seberg force-pushed the boolean-bad-shape-dep branch from 0247bda to c5ad058 Compare June 6, 2015 11:56
charris added a commit that referenced this pull request Jun 7, 2015
DEP: Deprecate boolean array indices with non-matching shape
@charris charris merged commit 01c59d6 into numpy:master Jun 7, 2015
@charris
Copy link
Member

charris commented Jun 7, 2015

Thanks Sebastian.

@seberg seberg deleted the boolean-bad-shape-dep branch June 8, 2015 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants