Skip to content

BUG/TST: Fix for #6724, make numpy.ma.mvoid consistent with numpy.void #6763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gerritholl
Copy link
Contributor

This is an alternative fix for #6724. The other fix is in #6755.

Make indexing on numpy.ma.mvoid consistent with indexing on numpy.void.
Changes behaviour in rare cases (see below). Fixes #6724. Sometimes,
indexing ma.mvoid results in a non-scalar mask. For example, dimension
increases if indexing with a multi-dimensional field. Previously, this
led to a ValueError (truth value ambiguous). With this commit, indexing
now returns an ma.masked_array so that there is no loss of information.

Note that there is a precedence for returning from void to array. Z = zeros((2,), dtype="(2,)i2,(2,)i2"), then Z[0] is a void, but Z[0][0] and Z[0]["f1"] are array. This commit therefore implements behaviour
such that numpy.ma.mvoid is consistent with numpy.void.

Also adds a related test.

The behaviour changes in cases where for a masked array X, X.dtype["A"]
is multidimensional but size 1, such as in the example below. Any case
where X.dtype["A"] is multidimensional but with size>1 would previously
fail.

Old behaviour:

In [15]: X = ma.masked_array(data=[([0],)], mask=[([False],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [16]: X[0]["A"]
Out[16]: array([[[[0]]]], dtype=int16)

In [17]: X = ma.masked_array(data=[([0],)], mask=[([True],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [18]: X[0]["A"]
Out[18]: masked

New behaviour:

In [1]: X = ma.masked_array(data=[([0],)], mask=[([False],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [2]: X[0]["A"]
Out[2]:
masked_array(data =
 [[[[0]]]],
             mask =
 [[[[False]]]],
       fill_value = [[[[16959]]]])

In [3]: X = ma.masked_array(data=[([0],)], mask=[([True],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [4]: X[0]["A"]
Out[4]:
masked_array(data =
 [[[[--]]]],
             mask =
 [[[[ True]]]],
       fill_value = [[[[16959]]]])

The new behaviour is more consistent with indexing the data themselves:

In [7]: X.data[0]["A"]
Out[7]: array([[[[0]]]], dtype=int16)

In theory, this change in behaviour can break code, but I would consider
it very unlikely.

…y.void

Make indexing on numpy.ma.mvoid consistent with indexing on numpy.void.
Changes behaviour in rare cases (see below). Fixes numpy#6724.  Sometimes,
indexing ma.mvoid results in a non-scalar mask.  For example, dimension
increases if indexing with a multi-dimensional field.  Previously, this
led to a ValueError (truth value ambiguous).  With this commit, indexing
now returns an ma.masked_array so that there is no loss of information.

Note that there is a precedence for returning from void to array.  Z =
zeros((2,), dtype="(2,)i2,(2,)i2"), then Z[0] is a void, but Z[0][0]
and Z[0]["f1"] are array.  This commit therefore implements behaviouk
such that numpy.ma.mvoid is consistent with numpy.void.

Also adds a related test.

The behaviour changes in cases where for a masked array `X`, X.dtype["A"]
is multidimensional but size 1, such as in the example below.  Any case
where X.dtype["A"] is multidimensional but with size>1 would previously
fail.

Old behaviour:

In [15]: X = ma.masked_array(data=[([0],)], mask=[([False],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [16]: X[0]["A"]
Out[16]: array([[[[0]]]], dtype=int16)

In [17]: X = ma.masked_array(data=[([0],)], mask=[([True],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [18]: X[0]["A"]
Out[18]: masked

New behaviour:

In [1]: X = ma.masked_array(data=[([0],)], mask=[([False],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [2]: X[0]["A"]
Out[2]:
masked_array(data =
 [[[[0]]]],
             mask =
 [[[[False]]]],
       fill_value = [[[[16959]]]])

In [3]: X = ma.masked_array(data=[([0],)], mask=[([True],)],
        dtype=[("A", "(1,1)i2", (1,1))])

In [4]: X[0]["A"]
Out[4]:
masked_array(data =
 [[[[--]]]],
             mask =
 [[[[ True]]]],
       fill_value = [[[[16959]]]])

The new behaviour is more consistent with indexing the data themselves:

In [7]: X.data[0]["A"]
Out[7]: array([[[[0]]]], dtype=int16)

In theory, this change in behaviour can break code, but I would consider
it very unlikely.
@charris
Copy link
Member

charris commented Dec 3, 2015

@gerritholl @ahaldane Do you have a preference? This looks a bit more direct than the other fix.

@gerritholl
Copy link
Contributor Author

I prefer this one, as I think it's more correct. Arguably, the old behaviour for the one case where the old behaviour didn't crash is buggy. And from the observation that I discover (and sometimes fix) those bugs in exotic structured masked arrays, there don't seem to be very many people using them.

@charris
Copy link
Member

charris commented Dec 3, 2015

OK, I'm going to close the other. It can be reopened if needed.

@@ -5869,6 +5869,18 @@ def __getitem__(self, indx):

"""
m = self._mask
if isinstance(m[indx], ndarray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is good now.

But for any future reference, another possibility which you reminded me of in another PR is if m.dtype[indx].subdtype is not None, which checks specifically for subarrays. But I think a check for ndarray is better because there might be other ways of getting an ndarray when indexing - for instance, multifield indices like marr[['f0', 'f1']] could return an ndarray, which we would want to convert to a masked array. (Currently void scalars to not support multifield indices though, but I think that is an oversight which we might want to fix soon, eg in #6053)

@charris
Copy link
Member

charris commented Dec 4, 2015

OK, let's give this a shot. @ahaldane Are you good with this?

@ahaldane
Copy link
Member

ahaldane commented Dec 4, 2015

Looks good to me, thanks @gerritholl . I'll let @charris do the merge.

charris added a commit that referenced this pull request Dec 4, 2015
…rray_mvoid_alt

BUG/TST: Fix for #6724, make numpy.ma.mvoid consistent with numpy.void
@charris charris merged commit 3cc797e into numpy:master Dec 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ValueError when indexing numpy.ma.core.mvoid with multidimensional structured dtype
3 participants