BUG: Fix `MaskedArray.__setitem__` #8594

eric-wieser · 2017-02-09T14:47:30Z

The root cause here is that np.ma.getdata does a conversion to np.ndarray that we don't want to happen.

This conversion in general is a bad idea, because when delegating to base numpy functions, it doesn't allow that function to do the correct conversion with extra information.
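
As a rough illustration of the point above (a sketch with made-up variable names, not code from this PR), eager conversion discards the information that the value was meant to be a single object, while handing the raw value to ndarray lets the object dtype decide:

```python
import numpy as np

x = (1, 2, 3, 4, 5)

# Eager conversion turns the tuple into a five-element array,
# so the fact that it was one Python object is lost.
converted = np.asarray(x)
print(converted.shape)

# Passing the raw value through lets the object dtype decide:
# an integer index on an object array stores the tuple itself.
target = np.empty(1, dtype=object)
target[0] = x
print(type(target[0]))
```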

mhvk · 2017-02-09T17:39:19Z

numpy/ma/core.py

+        return data
+
+    # object dtype should not be converted to an array
+    if np.dtype(dtype) == np.object_:


I think you could use the hasobject property here, and avoid needlessly creating one:

if dtype is not None and np.dtype(dtype).hasobject: ...

Actually, for speed this probably should be:

    if dtype is not None:
        dtype = np.dtype(dtype)
        if dtype.hasobject:
            return a
    return np.array(a, copy=False, subok=subok, dtype=dtype)

This avoids possibly converting a string to dtype twice.

I definitely do not want hasobject - that would catch structured dtypes, whereas I specifically want just object types
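
A small sketch of the distinction being drawn here, assuming a hypothetical structured dtype with one object field: `hasobject` is true for both dtypes, while comparison against `np.object_` only matches the plain object dtype.

```python
import numpy as np

plain = np.dtype(object)
# Hypothetical structured dtype, chosen only for illustration.
structured = np.dtype([("payload", object), ("weight", float)])

# hasobject is True for any dtype that contains an object anywhere.
print(plain.hasobject, structured.hasobject)

# Equality with np.object_ matches only the plain object dtype.
print(plain == np.object_, structured == np.object_)
```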

mhvk · 2017-02-09T17:40:06Z

numpy/ma/core.py

    try:
        data = a._data
    except AttributeError:
-        data = np.array(a, copy=False, subok=subok)


Is there a particular reason not to keep the logic here? That avoids having to add the else clause.

Because the logic only made sense in this particular case. This separates the three cases of how the dtype parameter should be used.

mhvk · 2017-02-09T17:44:50Z

@eric-wieser - apart from the very minor comments, why the check for the object dtype at all? It seems this is not required to make the test case work (I do agree with the sentiment, but this is not really a function that will be used outside of np.ma -- indeed, one could also insist any dtype passed in is already a dtype instance).

Separately: nice to actually use getmask consistently!

eric-wieser · 2017-02-09T17:58:02Z

> It seems this is not required to make the test case work

Well, it is! Otherwise test_set_element_as_object fails.

mhvk · 2017-02-09T18:16:32Z

Ah, I see, it was an existing test that starts breaking with your change; should have considered that possibility. But I fear this will fail the general case. E.g., what happens if one is trying to do ma['object'] = <object> if ma is a masked single-element record array and object is a record with objects?

Might it be an idea to move this object-checking piece of the logic out of getdata and into __setitem__ (where one has self-awareness and can thus choose not to go through getdata at all if one is trying to set a single object)?

eric-wieser · 2017-02-09T18:24:00Z

> But I fear this will fail the general case.

Note that it can't fail any existing case, since the only place right now that sets the dtype parameter of getdata is in __setitem__. Without that argument, the behaviour is as (broken?) as before.

mhvk · 2017-02-09T18:29:20Z

I meant the "general case of __setitem__".

Indeed, that you only use the dtype in __setitem__ is why I suggested to keep the "does self hold objects" part of the logic there. As is, getdata always returns an array, and I think it makes a lot of sense to allow one to tell what dtype the result should have, but it makes less sense to allow it not to return an array any more.

mhvk · 2017-02-09T18:57:24Z

Looking more at test_set_element_as_object, I realize some parts of numpy I don't like. At least, I find the following bizarre:

import numpy as np
x = (1, 2, 3, 4, 5)  # a 5-tuple, matching the output shown below
a = np.ma.empty(5, dtype=object)
a[0] = x
a
# masked_array(data = [(1, 2, 3, 4, 5) None None None None],
#              mask = False,
#        fill_value = ?)
a[:] = x
a
# masked_array(data = [1 2 3 4 5],
#              mask = False,
#        fill_value = ?)
a[0:1] = x
# ValueError: cannot copy sequence with size 5 to array axis with dimension 1

Anyway, it does seem we're stuck with this.

eric-wieser · 2017-02-09T18:58:56Z

I think that's consistent with how non-ma arrays behave
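
For comparison, a minimal sketch of the same three assignments on a plain object ndarray (the exact error text may differ between NumPy versions, so only the exception type is checked):

```python
import numpy as np

x = (1, 2, 3, 4, 5)
a = np.empty(5, dtype=object)

# Integer index: the tuple is stored as a single element.
a[0] = x
first = a[0]

# Full slice: the tuple is treated as a sequence and spread over the elements.
a[:] = x
spread = list(a)

# Length-1 slice: five items cannot be fitted into one slot.
try:
    a[0:1] = x
    raised = False
except ValueError:
    raised = True
```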

mhvk · 2017-02-09T19:03:30Z

p.s. Is there anything against just using dval = getattr(value, '_data', value)? This is in closer analogy to what getmask does, and may solve everything too? At least, I just tried on your branch and there are no errors...
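
A sketch of how this suggestion differs from `np.ma.getdata`, using a hypothetical helper named `raw_data` (not part of NumPy): masked arrays are unwrapped to their underlying data, and anything else is passed through untouched.

```python
import numpy as np

def raw_data(value):
    """Hypothetical helper: unwrap masked arrays, leave everything else alone."""
    return getattr(value, "_data", value)

x = (1, 2, 3, 4, 5)
m = np.ma.array([1, 2, 3], mask=[0, 1, 0])

# A plain tuple has no _data attribute, so it comes back unchanged.
passthrough = raw_data(x)

# A masked array yields its underlying unmasked data.
unwrapped = raw_data(m)

# np.ma.getdata, by contrast, converts the tuple to an ndarray.
converted = np.ma.getdata(x)
```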

eric-wieser · 2017-02-09T21:18:07Z

@mhvk: Very tempting. I'd argue that getdata should become that, not just the use in setitem. Or at the very least, everything in ma.core should maybe use that.

eric-wieser · 2017-02-10T12:14:39Z

@mhvk: Actually, that makes much more sense as an approach, as really we want to forward array-conversion behaviour on to the base ndarray functions, rather than try and emulate the same behaviour ourselves in np.ma.

I'm not sure if it's acceptable to change the public behaviour of getdata in that way though, so perhaps the whole of ma could use _getdata(), and getdata could be deprecated?

eric-wieser · 2017-02-10T15:08:09Z

@mhvk: Really good spot there - this probably caused other bugs in subtle places as well. Updated to use getattr(value, '_data', value) plus some backwards-compatible boilerplate

mhvk · 2017-02-10T16:09:32Z

@eric-wieser - I tried to do something quite similar in an earlier PR, and at least at the time that was felt to be too risky. My suggestion, therefore, would be to split this PR in an uncontroversial one that fixes the bug (i.e., use getattr(value, '_data', value) in __setitem__) and a maintenance PR that updates getdata (the latter should probably also have the generic use of getmask since it is not necessary for the bug fix).

But perhaps best to ask a real maintainer...

eric-wieser · 2017-02-10T16:12:02Z

> split this PR in an uncontroversial one that fixes the bug (i.e., use getattr(value, '_data', value) in __setitem__)

I suspect there is a whole class of similar bugs, which means this change should be made throughout np.ma.core. But perhaps exposing this change publicly is a bad idea.

mhvk · 2017-02-20T17:48:29Z

@eric-wieser - looking back at this (now with a maintainer hat on): I do very much like the change to getdata (as I think it is much more logical, and means there is at least a hope of a MaskedQuantity class, one of my longstanding goals...), but it also still seems to me that that part really is not for a bug fix.

So, I think this is best done in two steps, the bug fix just using getattr(value, '_data', value), and then a MAINT PR that changes getdata. Is that OK with you?

eric-wieser · 2017-02-20T19:12:41Z

@mhvk: I've been trying to conjure up other bugs that arise from the current implementation, but haven't found any yet. If I don't succeed in doing so, then yes, I guess I'll split the PR

eric-wieser · 2017-02-20T19:13:17Z

Also, I moved the np.ma.where stuff to #8647, which it seems the change to getdata requires be merged first

mhvk · 2017-02-20T19:25:30Z

OK; if it does stay together, then I guess it should be MAINT for 1.13 -- one cannot really introduce a deprecation warning in a bug-fix release...

eric-wieser · 2017-02-20T19:37:11Z

Oh, I see what you mean now - I hadn't realized targeting 1.12.1 was on the cards. How would you feel about me introducing _get_data, and using that everywhere in ma/core.py? Then the MAINT commit can just rename that to get_data

mhvk · 2017-02-20T19:58:31Z

My sense remains that we should just fix the __setitem__ bug for the bug fix, not make further changes...

eric-wieser · 2017-02-20T20:00:13Z

And fix it for 1.12 too then?

In that case, I should rebase on the branch point for 1.12. Where is that?

Update: 1718ee8?

eric-wieser · 2017-02-20T20:14:39Z

@mhvk: Ok, I've stripped it down to just that change, and rebased on the branch point.

mhvk · 2017-02-20T20:21:23Z

OK, that looks good. I'll merge assuming the tests pass (don't see how they could not, but just to be sure).

eric-wieser · 2017-02-20T20:28:14Z

Arguably #8648 is the one that should have the 1.12.1 release milestone

eric-wieser · 2017-02-20T20:30:03Z

@mhvk: tests pass :)

mhvk · 2017-02-20T20:38:08Z

Hmm, yes, it is my astropy background again, where we set the milestone to the lowest relevant release and do backporting after the fact. But I think you've done the right thing here, so will merge both...

charris · 2017-02-20T20:48:40Z

@mhvk Our backport policy is still somewhat ad hoc, especially as I am the only one who has been doing releases. What I currently do is set the milestone to the earlier version, so I will find the PR when looking for backports, then do a backport, label it as such, set the milestone on the backported version, and remove the milestone from the original. I'm not completely happy with the process, so if you have better ideas I'd like to hear them. One option I've considered is a backported label in addition to the backport label, then use the latter for things to be backported.

eric-wieser · 2017-02-20T20:49:54Z

Thanks @mhvk. This now unblocks #8511 :)

mhvk · 2017-02-21T18:50:01Z

@charris - OK, so clearly we should continue to set the milestone to a bug-release version; that is no effort to anyone and keeps things clear. For the rest, it would be nice if things could be more automated. E.g., might it be possible to have some travis magic that does a trial merge & test? But maybe we better discuss this on the mailing list? ... which I will do now.

eric-wieser mentioned this pull request Feb 9, 2017

MAINT: make np.ma.apply_along_axis consistent with np.apply_along_axis #8511

Closed

mhvk reviewed Feb 9, 2017

charris added 00 - Bug component: numpy.ma masked arrays labels Feb 9, 2017

eric-wieser force-pushed the MaskedArray.__setitem__ branch from 315886f to 6aa9541 Compare February 10, 2017 15:03

eric-wieser force-pushed the MaskedArray.__setitem__ branch from 6aa9541 to c08566f Compare February 10, 2017 15:57

eric-wieser force-pushed the MaskedArray.__setitem__ branch from c08566f to b4f0210 Compare February 10, 2017 16:27

eric-wieser changed the title ~~BUG: Fix MaskedArray.__setitem__~~ BUG: Fix various bits of MaskedArray Feb 10, 2017

This was referenced Feb 19, 2017

BUG: Make ma.where work with structured types. #5827

Closed

MAINT: Use getmask where possible #8645

Merged

eric-wieser force-pushed the MaskedArray.__setitem__ branch 2 times, most recently from f604105 to 31e7726 Compare February 20, 2017 15:25

eric-wieser changed the title ~~BUG: Fix various bits of MaskedArray~~ BUG: Fix MaskedArray.__setitem__, and change np.ma.getdata Feb 20, 2017

BUG: Fix numpy#8510, making MaskedArray.__setitem__ work

10bf55e

eric-wieser force-pushed the MaskedArray.__setitem__ branch from 31e7726 to 10bf55e Compare February 20, 2017 20:10

eric-wieser mentioned this pull request Feb 20, 2017

BUG: Fix MaskedArray.__setitem__ #8648

Merged

mhvk changed the title ~~BUG: Fix MaskedArray.__setitem__, and change np.ma.getdata~~ BUG: Fix MaskedArray.__setitem__ Feb 20, 2017

mhvk added this to the 1.12.1 release milestone Feb 20, 2017

eric-wieser modified the milestones: 1.13.0 release, 1.12.1 release Feb 20, 2017

mhvk merged commit b8769a2 into numpy:master Feb 20, 2017

eric-wieser deleted the MaskedArray.__setitem__ branch February 20, 2017 20:44

charris changed the title ~~BUG: Fix MaskedArray.__setitem__~~ BUG: Fix MaskedArray.__setitem__ May 9, 2017
