Skip to content

BUG: copy inherited masks in MaskedArray.__array_finalize__ #5203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 21, 2014

Conversation

njsmith
Copy link
Member

@njsmith njsmith commented Oct 18, 2014

Previously, operations which created a new masked array from an old
masked array -- e.g., np.empty_like -- would tend to result in the new
and old arrays sharing the same .mask attribute. This leads to
horrible brokenness in which writes to one array affect the other. In
particular this was responsible for part of the brokenness that
@jenshnielsen reported in gh-5184 in which np.gradient on masked
arrays would modify the original array's mask.

This fixes the worst part of the issues addressed in gh-3404, though
there's still an argument that we ought to deprecate the mask-copying
behaviour entirely so that empty_like returns an array with an empty
mask. That can wait until we find someone who cares though.

I also applied a small speedup to np.gradient (avoiding one copy);
previously this inefficiency was masking (heh) some of the problems
with masked arrays, so removing it is both an optimization and makes
it easier to test that things are working now.

@njsmith
Copy link
Member Author

njsmith commented Oct 19, 2014

While I'd never claim that a patch touching MaskedArray.__array_finalize__ is entirely safe, this should be pretty safe. The only effect it ever has is, in some cases where we used to end up with two MaskedArray objects who shared the same .mask array, it makes a copy so that they don't share the same .mask array. It's hard for me to see how having two different arrays that share the same .mask array is ever a good thing.

# we are being created from (= obj) and us. Or there might not! This
# method can be called in all kinds of places for all kinds of reasons
# -- could be empty_like, could be slicing, could be a ufunc, could be
# a view, who knows! The numpy subclassing interface is wildly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Just the facts, ma'am." ;) Could tone down the emotional content of the comment a bit.

@charris
Copy link
Member

charris commented Oct 19, 2014

As you say, bit of a hack. But the use of the same mask for the new array was clearly wrong.

@jenshnielsen
Copy link

@njsmith Thanks for doing this. It resolves the issue in Matplotlib. The combination of this and edge_order=1 reproduces the original results. A bit of context. We are seeing this issue in a unit test which is intended to test that a shading effect can also be applied to a masked array we don't directly use masked arrays here but a user might supply an image represented by a masked array.

@njsmith
Copy link
Member Author

njsmith commented Oct 19, 2014

@charris: Yeah, unfortunately there's no way to make this not a hack -- we just get to choose between more and less functional hacks :-).

Pushed a revised commit that I think fixes all that review comments.

@njsmith
Copy link
Member Author

njsmith commented Oct 19, 2014

@jenshnielsen: Just for reference, how problematic for you is the difference between edge_order=1 and edge_order=2 after this fix is applied?

@charris
Copy link
Member

charris commented Oct 19, 2014

Test failure not relevant, I restarted it. Leading comment still a bit, umm, wordy, but OK. Probably want to do the

git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^

thing for backport to 1.9.x. Mention it in the PR message.

@jenshnielsen
Copy link

Only cosmetic difference. It is in a purely visual shading effect test that fails. And visually it makes no difference as far as I know.

In the units tests we can just change the tests to not compare the outer edges. I have done this already for the non masked test which works as expected.

It might be possible for a user to produce an array where this is more problematic, but a user might also be able to produce an array where edge_order=2 looks better so I think we are indifferent.

For reference this code is in hillshade in Class LightSource in https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/colors.py
and the tests are in https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/tests/test_colors.py

Previously, operations which created a new masked array from an old
masked array -- e.g., np.empty_like -- would tend to result in the new
and old arrays sharing the same .mask attribute. This leads to
horrible brokenness in which writes to one array affect the other. In
particular this was responsible for part of the brokenness that
@jenshnielsen reported in numpygh-5184 in which np.gradient on masked
arrays would modify the original array's mask.

This fixes the worst part of the issues addressed in numpygh-3404, though
there's still an argument that we ought to deprecate the mask-copying
behaviour entirely so that empty_like returns an array with an empty
mask. That can wait until we find someone who cares though.

I also applied a small speedup to np.gradient (avoiding one copy);
previously this inefficiency was masking (heh) some of the problems
with masked arrays, so removing it is both an optimization and makes
it easier to test that things are working now.
@njsmith
Copy link
Member Author

njsmith commented Oct 21, 2014

I've attempted to do the backport rebase, hopefully it worked :-)

@njsmith njsmith added this to the 1.9.1 milestone Oct 21, 2014
@njsmith
Copy link
Member Author

njsmith commented Oct 21, 2014

...and I added the 1.9.1 milestone, is that sufficient so we don't lose track of it?

@charris
Copy link
Member

charris commented Oct 21, 2014

Thanks Nathaniel.

charris added a commit that referenced this pull request Oct 21, 2014
BUG: copy inherited masks in MaskedArray.__array_finalize__
@charris charris merged commit 72e9072 into numpy:master Oct 21, 2014
@juliantaylor
Copy link
Contributor

hm ok maybe a bit risky but fixes an ugly issue, pushing to 1.9

@juliantaylor
Copy link
Contributor

mh needs the gradient thing merged to 1.9 first for the test, whats the verdict on that?

@njsmith
Copy link
Member Author

njsmith commented Oct 21, 2014

Oh, right, I should have kept the gradient stuff separate I guess. Sorry. You could just drop the function_base stuff in the backport, it's not crucial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants