-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG/TST: Fix for #6724, make numpy.ma.mvoid consistent with numpy.void #6763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG/TST: Fix for #6724, make numpy.ma.mvoid consistent with numpy.void #6763
Conversation
…y.void Make indexing on numpy.ma.mvoid consistent with indexing on numpy.void. Changes behaviour in rare cases (see below). Fixes numpy#6724. Sometimes, indexing ma.mvoid results in a non-scalar mask. For example, dimension increases if indexing with a multi-dimensional field. Previously, this led to a ValueError (truth value ambiguous). With this commit, indexing now returns an ma.masked_array so that there is no loss of information. Note that there is a precedence for returning from void to array. Z = zeros((2,), dtype="(2,)i2,(2,)i2"), then Z[0] is a void, but Z[0][0] and Z[0]["f1"] are array. This commit therefore implements behaviouk such that numpy.ma.mvoid is consistent with numpy.void. Also adds a related test. The behaviour changes in cases where for a masked array `X`, X.dtype["A"] is multidimensional but size 1, such as in the example below. Any case where X.dtype["A"] is multidimensional but with size>1 would previously fail. Old behaviour: In [15]: X = ma.masked_array(data=[([0],)], mask=[([False],)], dtype=[("A", "(1,1)i2", (1,1))]) In [16]: X[0]["A"] Out[16]: array([[[[0]]]], dtype=int16) In [17]: X = ma.masked_array(data=[([0],)], mask=[([True],)], dtype=[("A", "(1,1)i2", (1,1))]) In [18]: X[0]["A"] Out[18]: masked New behaviour: In [1]: X = ma.masked_array(data=[([0],)], mask=[([False],)], dtype=[("A", "(1,1)i2", (1,1))]) In [2]: X[0]["A"] Out[2]: masked_array(data = [[[[0]]]], mask = [[[[False]]]], fill_value = [[[[16959]]]]) In [3]: X = ma.masked_array(data=[([0],)], mask=[([True],)], dtype=[("A", "(1,1)i2", (1,1))]) In [4]: X[0]["A"] Out[4]: masked_array(data = [[[[--]]]], mask = [[[[ True]]]], fill_value = [[[[16959]]]]) The new behaviour is more consistent with indexing the data themselves: In [7]: X.data[0]["A"] Out[7]: array([[[[0]]]], dtype=int16) In theory, this change in behaviour can break code, but I would consider it very unlikely.
@gerritholl @ahaldane Do you have a preference? This looks a bit more direct than the other fix. |
I prefer this one, as I think it's more correct. Arguably, the old behaviour for the one case where the old behaviour didn't crash is buggy. And from the observation that I discover (and sometimes fix) those bugs in exotic structured masked arrays, there don't seem to be very many people using them. |
OK, I'm going to close the other. It can be reopened if needed. |
@@ -5869,6 +5869,18 @@ def __getitem__(self, indx): | |||
|
|||
""" | |||
m = self._mask | |||
if isinstance(m[indx], ndarray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line is good now.
But for any future reference, another possibility which you reminded me of in another PR is if m.dtype[indx].subdtype is not None
, which checks specifically for subarrays. But I think a check for ndarray
is better because there might be other ways of getting an ndarray when indexing - for instance, multifield indices like marr[['f0', 'f1']]
could return an ndarray, which we would want to convert to a masked array. (Currently void scalars to not support multifield indices though, but I think that is an oversight which we might want to fix soon, eg in #6053)
OK, let's give this a shot. @ahaldane Are you good with this? |
Looks good to me, thanks @gerritholl . I'll let @charris do the merge. |
…rray_mvoid_alt BUG/TST: Fix for #6724, make numpy.ma.mvoid consistent with numpy.void
This is an alternative fix for #6724. The other fix is in #6755.
Make indexing on numpy.ma.mvoid consistent with indexing on numpy.void.
Changes behaviour in rare cases (see below). Fixes #6724. Sometimes,
indexing ma.mvoid results in a non-scalar mask. For example, dimension
increases if indexing with a multi-dimensional field. Previously, this
led to a ValueError (truth value ambiguous). With this commit, indexing
now returns an ma.masked_array so that there is no loss of information.
Note that there is a precedence for returning from void to array.
Z = zeros((2,), dtype="(2,)i2,(2,)i2")
, then Z[0] is a void, butZ[0][0] and Z[0]["f1"]
are array. This commit therefore implements behavioursuch that
numpy.ma.mvoid
is consistent withnumpy.void
.Also adds a related test.
The behaviour changes in cases where for a masked array
X
,X.dtype["A"]
is multidimensional but size 1, such as in the example below. Any case
where
X.dtype["A"]
is multidimensional but with size>1 would previouslyfail.
Old behaviour:
New behaviour:
The new behaviour is more consistent with indexing the data themselves:
In theory, this change in behaviour can break code, but I would consider
it very unlikely.