Skip to content

BUG: Fixed a bug with string representation of masked structured arrays #6094

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 5, 2015

Conversation

astrofrog
Copy link
Contributor

This is a fix for the issue reported in #6019 and includes a regression test.

One thing I am not sure about is what the string representation should look like if the column is e.g. 2-dimensional - at the moment it will include a line return, but I'm not sure if that's desirable? I could strip out the line returns in that case?

@astrofrog astrofrog changed the title Fixed a bug with string representation of masked structured arrays BUG: Fixed a bug with string representation of masked structured arrays Jul 19, 2015
@astrofrog
Copy link
Contributor Author

For non-masked arrays, the row of a structured array column might look like e.g.

(1.0, [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]])

this is with a 2x2x2 cell after a scalar cell. So the str call for these arrays doesn't include newlines. I've updated this PR to remove the newlines, but this results in something like

(1.0, [[[1.0 2.0]  [3.0 4.0]] [[-- --]  [3.0 4.0]]])

i.e the commas are missing. Do we care about having exactly consistent output compared to the non-masked arrays? If so, what would be the best way to obtain that?

(alternatively, we could get rid of the commas in the non-masked output, in my opinion it looks cleaner without)

@astrofrog
Copy link
Contributor Author

Can any numpy.ma experts review this? :) (tagging @rgommers since you've reviewed my other numpy.ma PRs in the past)

@ahaldane
Copy link
Member

I'm no numpy.ma expert, but I've stared at ma code for a while....

I didn't try it myself, but is there a reason you didn't do the same thing as the MaskedArray.__str__ method does here, using _recursive_make_descr and _recursive_printoption? That code seems a bit wierd, but when in Rome do as the Romans do, and it seems to work since the full array prints properly:

>>> print(t_ma)
 masked_array(data = [([1, --, 3],)],
         mask = [([False, True, False],)],
   fill_value = ([999999, 999999, 999999],),
        dtype = [('a', '<i8', (3,))])

@astrofrog
Copy link
Contributor Author

@ahaldane - ah yes, I think that will help. I will implement this later today.

@astrofrog
Copy link
Contributor Author

Sorry for the delay, will get back to this soon.

@astrofrog astrofrog force-pushed the fix-gh-6019 branch 2 times, most recently from cae91ab to e0b2e8c Compare August 21, 2015 12:26
@astrofrog
Copy link
Contributor Author

@ahaldane @rgommers - this is now ready for review!

@astrofrog
Copy link
Contributor Author

Could someone review this? This is currently causing issues in Astropy so it would be nice to have it fixed in Numpy 1.11 - thanks!

p = str(p)
r = [(str(_), p)[int(_m)] for (_, _m) in zip(self._data.tolist(), m)]
return "(%s)" % ", ".join(r)
__repr__ = __str__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, doesn't this lose something? What is thinking here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charris - well, the output was basically the same before between the two. This is backward-compatible for the cases that did work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charris - would you prefer to see this behave differently? If so, do you have suggestions for expected behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems OK to me, since it looks like what normal void-scalars do too (repr is the same as str).

>>> a = np.array([([1, 2, 3],)], dtype=[('a', '<i8', (3,))])
>>> repr(a[0])
'([1, 2, 3],)'

This fixes a bug that caused the string representation of masked structured array rows with multi-dimensional columns to fail (numpy#6019), and includes a regression test.

Since __repr__ suffered from a similar bug, and since previously __repr__ returned the same as __str__ for mvoid, we now set __repr__ to reference the same method as __str__.
@astrofrog
Copy link
Contributor Author

@ahaldane - thanks for your suggestion, it works great! I've updated the pull request. I also added a test for the nested case you showed.

@ahaldane
Copy link
Member

ahaldane commented Oct 5, 2015

Cool, this looks ready to me now. @charris any further comments?

@charris
Copy link
Member

charris commented Oct 5, 2015

@ahaldane I'm good, leaving this to you :)

@ahaldane
Copy link
Member

ahaldane commented Oct 5, 2015

All right, merging. Thanks a lot, @astrofrog !

ahaldane added a commit that referenced this pull request Oct 5, 2015
BUG: Fixed a bug with string representation of masked structured arrays
@ahaldane ahaldane merged commit f8002ab into numpy:master Oct 5, 2015
@gerritholl
Copy link
Contributor

Should this go into maintenance/1.10.x?

@charris
Copy link
Member

charris commented Nov 25, 2015

Could do.

@charris charris added this to the 1.10.2 release milestone Nov 25, 2015
@charris charris removed this from the 1.10.2 release milestone Nov 26, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants