-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: recarray __repr__ gives inaccurate representation #5523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ed10d2b
to
d802e09
Compare
# show zero-length shape unless it is (0,) | ||
lst = "[], shape=%s" % (repr(self.shape),) | ||
|
||
if self.dtype.type == record: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should maybe be self.dtype.type is record
.
Well, that representation is certainly a bit strange, but, I suppose, accurate ;) LGTM modulo possible nitpick. |
In numpy#5483, I solved the problem that a "recarray" and a "record array" (nomenclature defined in numpy#5482) looked identical by making sure that a type's subclass was listed in the repr. However, recarrays are still represented using the function 'rec.array' even though this function technically creates record arrays, not recarrays. So I have updated recarray.__repr__. Setup: >>> a = np.array([(1,'ABC'), (2, "DEF")], dtype=[('foo', int), ('bar', 'S4')]) >>> recordarr = np.rec.array(a) >>> recarr = a.view(np.recarray) Behavior after numpy#5483: >>> recordarr rec.array([(1, 'ABC'), (2, 'DEF')], dtype=(numpy.record, [('foo', '<i8'), ('bar', 'S4')])) >>> recarr rec.array([(1, 'ABC'), (2, 'DEF')], dtype=[('foo', '<i8'), ('bar', 'S4')]) New Behavior: >>> recordarr rec.array([(1, 'ABC'), (2, 'DEF')], dtype=[('foo', '<i8'), ('bar', '|S4')]) >>> recarr array([(1, 'ABC'), (2, 'DEF')], dtype=[('foo', '<i8'), ('bar', 'S4')]).view(numpy.recarray)
d802e09
to
73a74e9
Compare
Done. Thanks for checking over everything! I think this fixes #3581. At least, it explains why that behavior is expected. |
BUG: recarray __repr__ gives inaccurate representation
Merged, thanks @ahaldane . |
this breaks scipy/io/tests/test_idl.py
|
I'll take a closer look tonight. The purpose of the line that raises the error is to make accessing a structured field of an array with I don't think the bug is in this code, but rather I think this exposes the problem seen in a number of other issues, that you can't view object arrays. I happen to be trying to solve this problem in #5548 (see part 1 there for a list of similar problems with recarrays). I haven't had enough time to work on that PR lately, and the scope of changes is getting a little too ambitious. However, I think the portion of that PR that fixes object array views is good to go. I could submit that as a smaller PR on its own, which I think would solve this and related issues. |
gh-5762 was opened for this and marked as a blocker for 1.10. |
I don't think the bug is in this PR, exactly. Rather, this PR uses some views of the original array, and this PR merely exposes the fact that currently views into object arrays do not work properly. This (gh-5762) would be solved by my other PR #5548, which allows views into object arrays. I just ran the test case in gh-5762 with that PR and it works fine. |
This is a followup to #5483, with the goal of solving #3581,
In #5483, I solved the problem that a "recarray" and a "record array" (nomenclature defined in #5482) looked identical by making sure that a type's subclass was listed in the repr. However, there is still the problem that recarrays are represented using the function 'rec.array' even though this function technically creates record arrays, not recarrays.
So I have updated
recarray.__repr__
to give accurate representations. Demonstration:Setup:
(Very) Old Behavior:
Behavior after #5483:
New Behavior:
Representing the recarr using
array
plus a view seems a little clunky, but there aren't many other options. The only other option I can think of is to use the recarray constructor, but that seems even clunkier since it has a totally different signature.However, perhaps the clunkier description will be motivation for people to always use np.rec.array, and help emphasize that taking
view(np.recarray)
gives you a weird hybrid.Also, I now get some extra
|
in the dtype since I use dtype.descr, hopefully that's acceptable.I also fixed some missing parentheses in the record array docs.