ma.empty/ones/zeros_like to copy mask properly #3404

tomyun · 2013-06-06T02:31:55Z

As reported in #3145 and #2572, ma.empty_like does not copy, but merely assigns a reference to the original mask, leading to undesirable results.

This patch makes numpy.ma.empty_like() to work properly, in addition to newly introduced numpy.ma.ones_like() and numpy.ma.zeros_like().

However, numpy.empty_like() and siblings still do not handle masks correctly. This could be a work demanding more lower-level changes.

As underlying `empty_like()` knows nothing about mask, it would be `_convert2ma`'s job to properly set up the output mask. `copy` argument is added to figure out which functions need this behavior. `ones_like()` and `zeros_like()` which were mere aliases of regular numpy functions now go through `_convert2ma` as well. Hopefully this fixes numpy#3145, numpy#2572.

njsmith · 2013-06-06T10:20:23Z

numpy/ma/tests/test_regression.py

+        a.mask = False
+        assert_array_equal(a_empty.mask, m)
+        assert_array_equal(a_ones.mask, m)
+        assert_array_equal(a_zeros.mask, m)


So this says that all the *_like functions return arrays in which everything that was masked in the input, is masked in the output. This seems odd to me. Shouldn't e.g. ones_like give you an array whose contents is all-ones and whose mask is all-False?

That might depend on the design of API. Yet, many users would expect *_like functions return arrays of same shape and mask. In either way, current implementation returns a new array with old mask, which is definitely confusing.

Just to follow up, it looks like everyone on the mailing list prefers the API where the *_like functions return a new array with the mask set to nomask, rather than copying the input array's mask, and that's my intuition as well. Want to update the PR to do that? (Or if you have an argument for why it should be the other way around, bring that up on the list?)

Also Pierre GM made a suggestion for how the _convert2ma function might be changed that you might want to take a look at.

seberg · 2013-06-06T12:35:44Z

What I am wondering here is, if the mask should not be copied whenever in array_finalize the memory between the two arrays is not shared. Also in the case where it is shared we should create a view due to reshape bugs (there are some issues about this open). The latter I once started (but have not much clue if it is right nor the time to think much about it in seberg/numpy@master...safemask . Though in that code I used apparently used the owndata flag, which I think should likely be replaced with the may_share_memory check mentioned.

Just if you have some time and interested and use/know masked arrays. I think it might solve other smaller issues elsewhere, too. Though I don't know if someone might actually work with the fact that masks are not copied sometimes when the array is (but it seems weird to me).

njsmith · 2013-06-10T19:48:08Z

Asked about the API:
http://mail.scipy.org/pipermail/numpy-discussion/2013-June/066832.html

On Thu, Jun 6, 2013 at 1:35 PM, seberg notifications@github.com wrote:

What I am wondering here is, if the mask should not be copied whenever in
array_finalize the memory between the two arrays is not shared. Also in the
case where it is shared we should create a view due to reshape bugs (there
are some issues about this open). The latter I once started (but have not
much clue if it is right nor the time to think much about it in
seberg/numpy@master...safemask . Though in
that code I used apparently used the owndata flag, which I think should
likely be replaced with the may_share_memory check mentioned.

Just if you have some time and interested and use/know masked arrays. I
think it might solve other smaller issues elsewhere, too. Though I don't
know if someone might actually work with the fact that masks are not copied
sometimes when the array is (but it seems weird to me).

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/3404#issuecomment-19042020
.

njsmith · 2013-07-17T15:15:21Z

A follow-up that won't be captured in the above thread, with a dissenting opinion:
http://mail.scipy.org/pipermail/numpy-discussion/2013-July/067136.html

charris · 2013-08-16T01:03:58Z

Needs decision.

charris · 2014-03-28T16:49:51Z

Well, that's no help ;) The majority voted for nomask, whereas Gregorio wants a mask copy and has code that uses that assumption. Hmm..., I think backward compatibility (almost) requires the copy, whereas changing the mask values would require deprecation. So I vote for this patch as is. Thoughts?

charris · 2014-03-28T16:55:04Z

Or at least for empty_like, and having the new zeros_like and ones_like behave differently would be confusing. Of course, once they are all in, I suspect further changes are off the table.

charris · 2014-03-28T17:48:08Z

Alternatively, we could use an ma specific keyword to determine which behavior to use, say clearmask=0 or nomask=0.

@jenshnielsen

Previously, operations which created a new masked array from an old masked array -- e.g., np.empty_like -- would tend to result in the new and old arrays sharing the same .mask attribute. This leads to horrible brokenness in which writes to one array affect the other. In particular this was responsible for part of the brokenness that @jenshnielsen reported in numpygh-5184 in which np.gradient on masked arrays would modify the original array's mask. This fixes the worst part of the issues addressed in numpygh-3404, though there's still an argument that we ought to deprecate the mask-copying behaviour entirely so that empty_like returns an array with an empty mask. That can wait until we find someone who cares though. I also applied a small speedup to np.gradient (avoiding one copy); previously this inefficiency was masking (heh) some of the problems with masked arrays, so removing it is both an optimization and makes it easier to test that things are working now.

njsmith · 2014-10-18T02:14:57Z

Closing in favor of #5203. The commentary in this PR is still relevant, but progress is stalled and the actual change currently in the branch is a subset of what's in #5203.

@jenshnielsen

Previously, operations which created a new masked array from an old masked array -- e.g., np.empty_like -- would tend to result in the new and old arrays sharing the same .mask attribute. This leads to horrible brokenness in which writes to one array affect the other. In particular this was responsible for part of the brokenness that @jenshnielsen reported in numpygh-5184 in which np.gradient on masked arrays would modify the original array's mask. This fixes the worst part of the issues addressed in numpygh-3404, though there's still an argument that we ought to deprecate the mask-copying behaviour entirely so that empty_like returns an array with an empty mask. That can wait until we find someone who cares though. I also applied a small speedup to np.gradient (avoiding one copy); previously this inefficiency was masking (heh) some of the problems with masked arrays, so removing it is both an optimization and makes it easier to test that things are working now.

tomyun added 3 commits June 5, 2013 16:13

Include ones_like, zeros_like public for import *

0908295

Add a regression test for numpy#3145

9543818

njsmith reviewed Jun 6, 2013
View reviewed changes

njsmith mentioned this pull request Oct 18, 2014

BUG: copy inherited masks in MaskedArray.__array_finalize__ #5203

Merged

njsmith closed this Oct 18, 2014

gerritholl mentioned this pull request Dec 3, 2015

Masked arrays with nested structured multidimensional dtypes inconsistent between data and mask #6760

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

ma.empty/ones/zeros_like to copy mask properly #3404

ma.empty/ones/zeros_like to copy mask properly #3404

Uh oh!

tomyun commented Jun 6, 2013

Uh oh!

njsmith Jun 6, 2013

Uh oh!

tomyun Jun 6, 2013

Uh oh!

njsmith Jun 14, 2013

Uh oh!

seberg commented Jun 6, 2013

Uh oh!

njsmith commented Jun 10, 2013

Uh oh!

njsmith commented Jul 17, 2013

Uh oh!

charris commented Aug 16, 2013

Uh oh!

charris commented Mar 28, 2014

Uh oh!

charris commented Mar 28, 2014

Uh oh!

charris commented Mar 28, 2014

Uh oh!

njsmith commented Oct 18, 2014

Uh oh!

Uh oh!

Uh oh!

ma.empty/ones/zeros_like to copy mask properly #3404

ma.empty/ones/zeros_like to copy mask properly #3404

Uh oh!

Conversation

tomyun commented Jun 6, 2013

Uh oh!

njsmith Jun 6, 2013

Choose a reason for hiding this comment

Uh oh!

tomyun Jun 6, 2013

Choose a reason for hiding this comment

Uh oh!

njsmith Jun 14, 2013

Choose a reason for hiding this comment

Uh oh!

seberg commented Jun 6, 2013

Uh oh!

njsmith commented Jun 10, 2013

Uh oh!

njsmith commented Jul 17, 2013

Uh oh!

charris commented Aug 16, 2013

Uh oh!

charris commented Mar 28, 2014

Uh oh!

charris commented Mar 28, 2014

Uh oh!

charris commented Mar 28, 2014

Uh oh!

njsmith commented Oct 18, 2014

Uh oh!

Uh oh!