Skip to content

BUG: Ensure MaskedArray.flat can access single items #4585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 11, 2014

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Apr 4, 2014

Currently, the MaskedArray.flat iterator barfs when trying to access single items, because it is tried to overwrite shape of a non-ndarray mask (see below). This PR corrects this, and adds test cases.

import numpy as np
ma = np.ma.MaskedArray([1,2,3], mask=[0,1,0])
ma.flat[1]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-d9496ad7cbf6> in <module>()
----> 1 ma.flat[1]

/usr/lib/python3/dist-packages/numpy/ma/core.py in __getitem__(self, indx)
   2530         if self.maskiter is not None:
   2531             _mask = self.maskiter.__getitem__(indx)
-> 2532             _mask.shape = result.shape
   2533             result._mask = _mask
   2534         return result

AttributeError: attribute 'shape' of 'numpy.generic' objects is not writable

_mask.shape = result.shape
result._mask = _mask
elif _mask:
return masked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

masked is masked scalar, right? Why is that the right thing here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps a better question is, does isinstance(_mask, ndarray) catch all the alternatives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried ma.flat.maskiter[1], [:1], and .next() and it would seem isinstance(_mask, ndarray) catches all the alternatives. Of course, I could also put it in a try/except, or turn the if statement around, if isinstance(_mask, np.bool_); perhaps the latter is clearest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter might be better, it's sort of six of one, half dozen of the other. The underlying type of masked is float64, does that change the previous output type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the advantage of testing for np.bool_ is mostly that it is clearer why one has to do something special. As for previous output type, there was none -- one would get an exception! Of course, not so odd this wasn't caught, as most people would use flat only for the iterator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confess I don't really understand what is going on here, particularly this bit _mask.shape = result.shape. Is this some sort of broadcasting? What is the relation of the two iterators? I could study the code, but the easier route is to ask you ;) Maybe a comment would be helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not really understand either how the two iterators could return something with a different shape, but worry there is a good reason, but one not necessarily covered by the relatively sparse tests. I checked who last changed the lines with git blame and it seems to be @pierregm - Pierre: this is about MaskedIterator.__getitem__; could you comment?

@charris
Copy link
Member

charris commented Apr 4, 2014

Looks generally correct, but I'd like a little reassurance.

@charris
Copy link
Member

charris commented Apr 10, 2014

@mhvk Could you make a stab at understanding what is going on here? I'd like to get this in together with the follow on PR.

@mhvk
Copy link
Contributor Author

mhvk commented Apr 11, 2014

@charris - Thinking more about it, the assignment to shape simply seems wrong, so I removed it. This does not change any tests. As there are rather few, however, I added some more, and those showed that what I had did not yet work for masked record arrays. With the latest commit, this should be resolved.

Note that since single-item access clearly was broken as it was, this should be an improvement no matter what! Let me know if you can think of further tests to add...

@charris
Copy link
Member

charris commented Apr 11, 2014

LGTM, but what do I know ;) I still can't figure why that assignment was there to begin with, although grep shows several such assignments in the code, which I also haven't looked at.

I'm going to put this in and we'll see how it goes. It could use testing with some applications that make heavy use of masked arrays. Matplotlib? Pandas? Thanks for working on the masked array code.

charris added a commit that referenced this pull request Apr 11, 2014
BUG: Ensure MaskedArray.flat can access single items
@charris charris merged commit f9e0771 into numpy:master Apr 11, 2014
@seberg
Copy link
Member

seberg commented Apr 11, 2014

Sorry, not been following things closely. The assignment might be there for masked matrices (some weird beast...). Since matrices can change the shape in unexpected calls. Not sure anyone actually uses masked matrices, or that it is actually necessary here. But it is the reason why the .flat assignments are used as far as I can tell.

@charris
Copy link
Member

charris commented Apr 11, 2014

@seberg Hmm..., so we need some tests for that.

@mhvk mhvk deleted the ma/iterator-get-single-item branch April 11, 2014 13:59
@charris
Copy link
Member

charris commented Apr 12, 2014

@mhvk Looks like we need a fix.

In [8]: a = ma.array(matrix(eye(2)), mask=0)

In [9]: a
Out[9]: 
masked_matrix(data =
 [[1.0 0.0]
 [0.0 1.0]],
              mask =
 [[False False]
 [False False]],
        fill_value = 1e+20)


In [10]: b = a.flat

In [11]: b[:2]
Out[11]: 
masked_matrix(data =
 [[1.0 0.0]],
              mask =
 [False False],
        fill_value = 1e+20)

mhvk added a commit to mhvk/numpy that referenced this pull request Apr 12, 2014
charris added a commit that referenced this pull request Apr 12, 2014
Revert line from #4585 to get mask, data shapes to match in .flat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants