-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: Fix MaskedArray.__setitem__
#8594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
numpy/ma/core.py
Outdated
return data | ||
|
||
# object dtype should not be converted to an array | ||
if np.dtype(dtype) == np.object_: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could use the hasobject
property here, and avoid needlessley creating one:
if dtype is not None and np.dtype(dtype).hasobject:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, for speed this probably should be:
if dtype is not None:
dtype = np.dtype(dtype)
if dtype.hasobject:
return a
return np.array(a, copy=False, subok=subok, dtype=dtype)
This avoids possibly converting a string to dtype twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely do not want hasobject
- that would catch structured dtypes, whereas I specifically want just object types
numpy/ma/core.py
Outdated
try: | ||
data = a._data | ||
except AttributeError: | ||
data = np.array(a, copy=False, subok=subok) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason not to keep the logic here? That avoids having to add the else
clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the logic only made sense in this particular case. This separates the three cases of how the dtype
parameter should be used.
@eric-wieser - apart from the very minor comments, why the check for the object Separately: nice to actually use |
Well, it is! Otherwise |
Ah, I see, it was an existing test that starts breaking with your change; should have considered that possibility. But I fear this will fail the general case. E.g., what happens if one is trying to do Might it be an idea to move this object-checking piece of the logic out of |
Note that it can't fail any existing case, since the only place right now that sets the |
I meant the "general case of Indeed, that you only use the |
Looking more at
Anyway, it does seem we're stuck with this. |
I think that's consistent with how non-ma arrays behave |
p.s. Is there anything against just using |
@mhvk: Very tempting. I'd argue that getdata should become that, not just the use in setitem. Or at the very least, everything in ma.core should maybe use that. |
@mhvk: Actually, that makes much more sense as an approach, as really we want to forward array-conversion behaviour on to the base I'm not sure if it's acceptable to change the public behaviour of |
315886f
to
6aa9541
Compare
@mhvk: Really good spot there - this probably caused other bugs in subtle places as well. Updated to use |
6aa9541
to
c08566f
Compare
@eric-wieser - I tried to do something quite similar in an earlier PR, and at least at the time that was felt to be too risky. My suggestion, therefore, would be to split this PR in an uncontroversial one that fixes the bug (i.e., use But perhaps best to ask a real maintainer... |
I suspect there are a whole class of similar bugs, that mean this change should be made through |
c08566f
to
b4f0210
Compare
f604105
to
31e7726
Compare
@eric-wieser - looking back at this (now with a maintainer hat on): I do very much like the change to So, I think this is best done in two steps, the bug fix just using |
@mhvk: I've been trying to conjure up other bugs that arise from the current implementation, but haven't found any yet. If I don't succeed in doing so, then yes, I guess I'll split the PR |
Also, I moved the |
OK; if it does stay together, then I guess it should be |
Oh, I see what you mean now - I hadn't realized targeting 12.1 was on the cards. How would you feel about me introducing |
My sense remains that we should just fix the |
And fix it for 1.12 too then? In that case, I should rebase on the branch point for 1.12. Where is that? Update: 1718ee8? |
31e7726
to
10bf55e
Compare
@mhvk: Ok, I've stripped it down to just that change, and rebased on the branch point. |
OK, that looks good. I'll merge assuming the tests pass (don't see how they could not, but just to be sure). |
Arguably #8648 is one that should have the 12.1 release milestone |
@mhvk: tests pass :) |
Hmm, yes, it is my astropy background again, where we set the milestone to the lowest relevant release and do backporting after the fact. But I think you've done the right thing here, so will merge both... |
@mhvk Our backport policy is still somewhat ad hoc, exspecially as I the only one who has been doing release. What I currently do is set the milestone to the earlier version, so I will find the PR when looking for backports, then do a backport, label it as such, set the milestone on the backported version, and remove the milestone from the original. I'm not completely happy with the process, so if you have better ideas I'd like to hear them. One option I've considered is a |
@charris - OK, so clearly we should continue to set the milestone to a bug-release version; that is no effort to anyone and keeps things clear. For the rest, it would be nice if things could be more automated. E.g., might it be possible to have some travis magic that does a trial merge & test? But maybe we better discuss this on the mailing list? ... which I will do now. |
MaskedArray.__setitem__
Fixes #8510.
The root cause here is that
np.ma.getdata
does a conversion to np.ndarray that we don't want to happen.This conversion in general is a bad idea, because when delegating to base numpy functions, it doesn't allow that function to do the correct conversion with extra information.