-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG allow subclasses in MaskedArray ufuncs -- for non-ndarray _data #3907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'm a little wary of putting in a change of this complexity into 1.8 at
|
Proposed changes sound good to me. Tested with Test failure is real, needs fixing. |
I think it is too late for 1.8.0. However, it is good that masked arrays are getting a look over. There may be other parts that can be improved or fixed up using some of the machinery that has gone into numpy in the last few years. In particular, I'm wondering if the new |
@njsmith, @rgommers, @charris - having spent several hours looking at the test failure, I now agree this PR is not for 1.8... The failure is actually a combination of two previous bugs canceling each other out (see below). The test that fails effectively does
Now the differences are:
The basic problem is not so much what is done to masked values (FWIW, I think my PR improves that), but rather that masked values should not be used in the assignment. Indeed, there is a commented out check on the index in |
While writing, may as well ask: ideally, I'd like to see that, e.g., |
Looks like you merged master into this. That is generally not a good idea, better to rebase on master, and I think you should do that here: |
"Is it fine to introduce such a standard?" Questions like this are best raised on the list for discussion. |
@charris - OK, rebased (and I'll raise my question on the mailing list) |
Ping Travis. |
@@ -655,8 +655,11 @@ def getdata(a, subok=True): | |||
data = a._data | |||
except AttributeError: | |||
data = np.array(a, copy=False, subok=subok) | |||
if not subok: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation for the function is no longer correct if this goes in. It needs correction. I also wonder if returning a is the right thing when data is an object array. It seems a little off to me and an indirect way of communicating the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a crucial change, which is something I'm having to produce horrible code to work around (I'm giving my classes _data
attributes that override view
, not at all pretty).
My guess is that it simply is an oversight, but really it seems one should not stuff something in arrays here when it doesn't obviously fit: that can still happen inside _MaskedBinaryOperation
.
The specific problem I have for astropy quantities perhaps illustrates this best. There, we use a Unit
class, which if multiplied with an numpy array, gives a quantity:
In[8]: np.arange(3)*Unit('m')
Out[8]: <Quantity [0,1,2] m>
It does this by setting __array_priority__
, and thus its __rmul__
method gets called. Now when I do
In[9]: np.ma.arange(3)*Unit('m')
Out[9]:
masked_array(data = [<Quantity 0 m> <Quantity 1 m> <Quantity 2 m>],
mask = False,
fill_value = ?)
The reason is that because the unit gets wrapped in an object array, it looses its __array_priority__
(and even if it didn't, it wouldn't help, since it is now an array; see #3164 by another astropy developer), and hence the actual multiplication of the unmasked array with the unit-inside-an-array proceeds by feeding every single number to the unit, which happily turns this into a quantity, so that one is left with a masked array full of objects. (Actually, since most installed versions of numpy do not have my NotImplemented bug fix from #3900, I cannot get the above to work at all, but at least with my hacks unit*masked_array
works).
p.s. I definitely should have changed the documentation -- will do so if you agree my change makes sense.
@charris - I had somewhat forgotten about this PR, and looking back at it realised the main thing still missing was documentation of the change in behaviour for |
@mhvk 1.9 is coming up, anything you want to add? |
@charris - thanks for warning me about 1.9. I think this is still a very useful change. Just to be sure it is still OK, I rebased on current master and pushed it again (assuming this triggers a new travis built). It so happens I just started working again on getting masked subclasses to work in |
@charris - an astropy bug report (astropy/astropy#2701) reminded me of this solution. I'm assuming it is too late to get this in 1.9 still; if not, do let me know and I'll rebase & ensure it is mergeable. |
@mhvk Needs a rebase. And the commit messages could use a tweak, see doc/source/dev/gitwash/development_workflow.rst for prefixes. I've been putting off all the uncommited fixes/workarounds from astropy until 1.10 development starts. This doesn't look terribly risky, but at this point I'd prefer all those changes have more time to settle. |
@charris - fair enough. I rebased it anyway, even though as I think you wrote earlier, it may well be best to rewrite |
As an aside, you might be interested in the thread that contains this quote
|
@charris - thanks for the quote! Though I should add that after my involvement in getting |
@charris - thanks for pointing me to #5227. I tried the trial implementation of taking account of However, #5230 does not address that subclasses should be passed on (the main goal of this PR). Since it is simpler, though, maybe it should be done first? |
@mhvk Where do you think this stands with respect to the |
Could use a couple of comments. |
eda932a
to
5a15d3f
Compare
5a15d3f
to
3c6b6ba
Compare
@charris - OK, I rebased this one, which meant that, as hoped, the changes to Apart from that, I added the comments to |
Hmm, this test failure looks potentially real (but maybe not repeatable). I'm going to restart the test but keeping this for reference. The
|
@charris - yes, I had seen that very weird test result, and was going to ask to restart, as it seemed unlikely any change in the python code would lead to a refcount problem. But you beat me to it... oddly, it seems irreproduceable (though at least it means it was not due to this PR). |
BUG allow subclasses in MaskedArray ufuncs -- for non-ndarray _data
Well, lets get this in. Thanks @mhvk. |
A possibly related issue on the mailing list: https://mail.scipy.org/pipermail/numpy-discussion/2016-February/074940.html |
Revert part of #3907 which incorrectly propogated MaskedArray info.
Backport 7296, Revert part of #3907 which incorrectly propogated MaskedArray info.
This is a follow-on to #3900, making two furthe changes to tryto reach the ultimate goal of
MaskedArray
behaving sensibly for subclasses.ndarray
; rather, a possible subclass is kept. This is particularly relevant for my work on aMaskedQuantity
class, where_data
contains aQuantity
which has not just a value (the ndarray) but also a unit.object
arrays, so that non-array Classes with an__array_priority__
set and with a reverse method defined properly lead toNotImplemented
(i.e., making the behaviour match that of regular arrays).In order for this to work, I
subok=False
in thegetdata
calls in the twoBinaryOperation
classesgetdata
to avoid the creation ofobject
arrays, andgetmaskedarray
to deal with the consequences.I also added a test case, which perhaps best makes clear why this is useful.
CC @charris -- I consider the above a correction of a bug, but realise this is subjective; it could also be ENH, even though the API does not change. It would be wonderful if this could still make it to 1.8.