Skip to content

ENH: Ensure that repr and str work for MaskedArray non-ndarray bases #4576

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 4, 2014

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Apr 1, 2014

(@charris: this is one of the corrections I mentioned in #3907 that we need to make to MaskedArray to get astropy's more complicated ndarray subclasses to be used as base classes.)

Currently, repr(ma) and str(ma) do not deal well with base classes that redefine __repr__ to use a different way to print the class name, or that override __setitem__ to do checks on values, respectively. This PR corrects this, by simply letting __repr__ use __class__.__name__ instead of trying to be smart in how to get it, and by ensuring that __str__ inserts masked_print_option in an ndarray view of the object array that is created for string output (the current way __str__ is handled is perhaps too clever a trick, but I'm not quite sure yet how to make it safer; my PR should at least help a little.).

Specifically, currently, after defining:

import numpy as np

class ComplicatedSubArray(np.ndarray):
    def __new__(cls, arr):
        return np.asanyarray(arr).view(cls)

    def __str__(self):
        return 'myprefix {0} mypostfix'.format(
            super(ComplicatedSubArray, self).__str__())

    def __repr__(self):
        # Return a repr that does not start with 'name('
        return '<{0} {1}>'.format(self.__class__.__name__, self)

    def __setitem__(self, item, value):
        # this ensures direct assignment to masked_print_option will fail
        if not isinstance(value, ComplicatedSubArray):
            raise ValueError("Can only set to MySubArray values")
        super(ComplicatedSubArray, self).__setitem__(item, value)

xcsub = ComplicatedSubArray(np.arange(5))
mxcsub1 = np.ma.masked_array(xcsub)
mxcsub2 = np.ma.masked_array(xcsub, mask=[True, False, True, False, False])

one gets

In [15]: mxcsub1
Out[15]: 
masked_<ComplicatedSubArray myprefix [0 1 2 3 4] mypostfix>(data = myprefix [0 1 2 3 4] mypostfix,
                                                            mask = False,
                                                      fill_value = 999999)
In [16]: mxcsub2
Out[16]: 
.
.
.
ValueError: Can only set to MySubArray values

With the PR,

In [2]: mxcsub1
Out[2]: 
masked_ComplicatedSubArray(data = myprefix [0 1 2 3 4] mypostfix,
                           mask = False,
                     fill_value = 999999)
In [3]: mxcsub2
Out[3]: 
masked_ComplicatedSubArray(data = myprefix [-- 1 -- 3 4] mypostfix,
                           mask = [ True False  True False False],
                     fill_value = 999999)

@charris
Copy link
Member

charris commented Apr 3, 2014

@mhvk I'll try to get to this and the previous PR tomorrow when I'm fresh.

@@ -3612,19 +3611,19 @@ def __repr__(self):

"""
n = len(self.shape)
name = repr(self._data).split('(')[0]
name = ('array' if self.__class__ == np.ndarray else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the parenthesis here. Might read better if it was an ordinary if statement.

@charris
Copy link
Member

charris commented Apr 3, 2014

Looks OK. Man, there is a lot of obfuscated code in ma. The use of lower case aliases for class names is a prime example. And the singleton masked_print_option global is a bit weird.

@mhvk
Copy link
Contributor Author

mhvk commented Apr 4, 2014

@charris - thanks for the comments. I changed to an if statement, and in the process discovered that it was incorrect as written anyway, so that the name would change for an np.ndarray. Sigh. Now added a test case to ensure that this does not happen again.

@charris
Copy link
Member

charris commented Apr 4, 2014

Great. Could you now squash the commits and begin the summary line ENH:? The maximum line length in commit messages should be 72 characters (Linus' Law). Might expand on the implications of the change in the commit message body. This is probably worth a reference in doc/release/1.9.0-notes.rst also.

For repr, use the name of the base class in output as "masked_<name>"
(with name=array for ndarray to match the previous implementation).
For str, insert masked_print_option in an ndarray view of the object
array that is created for string output, to avoid calling __setitem__
in the base class.  Add tests to ensure this works.
@mhvk mhvk changed the title BUG Ensure more complicated base classes work with MaskedArray str, repr ENH: Ensure that repr and str work for MaskedArray non-ndarray bases Apr 4, 2014
@mhvk
Copy link
Contributor Author

mhvk commented Apr 4, 2014

@charris - OK, made everything into a single commit, and added a line to the release notes. Will expand this line in further PRs.

@charris
Copy link
Member

charris commented Apr 4, 2014

Looks good. Thanks @mhvk .

charris added a commit that referenced this pull request Apr 4, 2014
ENH: Ensure that repr and str work for MaskedArray non-ndarray bases
@charris charris merged commit 547765d into numpy:master Apr 4, 2014
@mhvk mhvk deleted the ma/subclass-printing branch April 4, 2014 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants