Skip to content

BUG: Ensure __array_finalize__ cannot back-mangle shape #10314

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 29, 2018

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Jan 2, 2018

Just a small step, but __array_finalize__ should not be allowed to change the shape of any obj passed in. Arguably, it should not change the shape of the mask at all, but removing that currently breaks quite a bit more, so perhaps it is best to try to remedy that in a second step.

Need to think a bit about a test case that shows that this actually prevents problems.

numpy/ma/core.py Outdated
@@ -3351,6 +3354,8 @@ def __setattr__(self, attr, value):
self._mask.shape = self.shape
except (AttributeError, TypeError):
pass
elif attr == 'shape' and getmask(self) is not nomask:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be clearer to add dtype and shape @properties that fall back on ndarray.shape.__get__ etc

@@ -273,7 +273,8 @@ def test_testCopySize(self):
assert_(y1.mask is m)

y1a = array(y1, copy=0)
assert_(y1a.mask is y1.mask)
assert_(y1a._mask.__array_interface__ ==
y1._mask.__array_interface__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I'd actually expect the stronger result of y1a is y1 here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is #4043, I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment referencing that issue here?

@@ -397,7 +397,8 @@ def test_copy(self):
y1a = array(y1)
assert_(y1a._data.__array_interface__ ==
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what? This doesn't seem sensible to me at all (#10318)

@mhvk
Copy link
Contributor Author

mhvk commented Jan 3, 2018

@eric-wieser - I agree the changes in behaviour for np.ma.array are not good, and I probably should work around them in the initializer.

Also agreed that setters for shape and dtype would be better than overriding setting for all attributes with __setattr__. I wasn't quite sure how to properly do this for properties - but a bit of googling suggests one has to do something like super().shape.__set__(<shape>).

@mhvk mhvk force-pushed the ma-array-finalize-mask-view branch 2 times, most recently from e39107a to 97628c2 Compare March 25, 2018 19:24
@mhvk
Copy link
Contributor Author

mhvk commented Mar 25, 2018

@eric-wieser - your comment reminded me about the existence of this branch... I now added a second commit replacing __setattr__ with shape and dtype properties and setters. I'm not sure I like it that much more... If you are also in doubt, perhaps I should just remove it so this one can focus on just the fix for the back-mangling of shapes.

I also still need to think of a test case that shows this actually prevents any problems!

numpy/ma/core.py Outdated

@shape.setter
def shape(self, shape):
super(MaskedArray, MaskedArray).shape.__set__(self, shape)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be just super(MaskedArray).shape or perhaps just ndarray.shape

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: I'm wrong. TIL how super(cls, cls) works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this annoyance is in part why I wondered if it was a good idea to replace __setattr__ with this...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage is that subclasses are now able to call Maskedarray.shape.__set__ in the unlikely even they override shape, and it won't accidentally call ndarray.shape.__set__

Copy link
Member

@eric-wieser eric-wieser May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think this should be super(MaskedArray, type(self)), which correctly walks the MRO.

@eric-wieser
Copy link
Member

eric-wieser commented Mar 25, 2018

Is this enough for a test? On master:

>>> a = np.ma.array([1, 2, 3, 4], mask=True)
>>> d = a.data
>>> m = a.mask
>>> a.shape = (2, 2)
>>> d
array([1, 2, 3, 4])
>>> m  # your patch stops this changing shape, which is consistent with d
array([[ True,  True],
       [ True,  True]], dtype=bool)

@mhvk
Copy link
Contributor Author

mhvk commented Mar 26, 2018

Hmm, that particular case is not solved by this PR - it needs your original one, where the mask is always viewed. Indeed, I found it very hard to far to get to a case where my PR matters... I'll try a bit more, but if it fails, perhaps best to split it into two - and at least get rid of the __setattr__ override.

@eric-wieser
Copy link
Member

Ok, we now have a concrete problem caused by this, as documented in #3140

Marten H. van Kerkwijk and others added 2 commits May 16, 2018 11:01
@mhvk mhvk force-pushed the ma-array-finalize-mask-view branch from 97628c2 to b779bae Compare May 16, 2018 15:01
@mhvk
Copy link
Contributor Author

mhvk commented May 16, 2018

Thanks for realizing there was a test case after all! I now added it, and rebased for good measure (and fixed the super following your comment).

try:
self._mask.shape = self.shape
except (AttributeError, TypeError):
pass

@property
def shape(self):
return super(MaskedArray, self).shape
Copy link
Member

@eric-wieser eric-wieser May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this needs to be super(MaskedArray, type(self)).shape.__get__(self, type(self)), unless there's something weird about base classes in C

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, seems there's something weird about base classes. If this were shadowing another property (rather than a getset_descriptor) , you'd need what I have above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's true - I;m fairly sure one can shadow a property getter as I have it above, just not the setter.

@eric-wieser eric-wieser merged commit 6ccb03d into numpy:master May 29, 2018
@eric-wieser
Copy link
Member

Thanks @mhvk!

@mhvk mhvk deleted the ma-array-finalize-mask-view branch May 29, 2018 13:28
@charris charris changed the title BUG: Ensure __array_finalize__ cannot back-mangle shape BUG: Ensure __array_finalize__ cannot back-mangle shape Jun 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants