-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG/ENH: Improve output for structured non-void types #10381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Mostly nitpicks about comments, to help possible future developers.
numpy/core/arrayprint.py
Outdated
@@ -394,7 +391,12 @@ def _get_format_function(data, **options): | |||
elif issubclass(dtypeobj, _nt.object_): | |||
return formatdict['object']() | |||
elif issubclass(dtypeobj, _nt.void): | |||
return formatdict['void']() | |||
# StructureFormat relies on np.void.__getitem__, so we can't use it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment would seem helpful only if you thought StructureFormat
should be used elsewhere (as it was originally, but which is not all that obvious). Maybe just state here something along the lines of "'void' only deals with unstructured bytes; we use StructureFormat if they are assigned to fields. Note that we cannot use StructureFormat where the void is in a union with a non-void dtype; those will be caught above and the typeset with the regular dtype (though the full dtype will be printed, to indicate to the user it is a union dtype)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you thought StructureFormat should be used elsewhere
That depends on what you think counts as a structured type:
issubdtype(dt, np.void) and dt.names is not None
dt.names is not None
issubdtype(dt, np.flexible)
(a lot of old code refers to flexible types when it means structured types)
Renaming it to StructuredVoidFormat
would make that clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed it is confusing, but this is one place where one can explain what it is in the code!
numpy/core/arrayprint.py
Outdated
@@ -1236,7 +1238,7 @@ def dtype_is_implied(dtype): | |||
dtype = np.dtype(dtype) | |||
if _format_options['legacy'] == '1.13' and dtype.type == bool_: | |||
return False | |||
return dtype.type in _typelessdata | |||
return dtype.type in _typelessdata and dtype.names is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we're adding comments, maybe here note that the latter part guards against union dtypes?
numpy/core/tests/test_multiarray.py
Outdated
CustomScalar.__module__ = None | ||
|
||
dt_custom = np.dtype((CustomScalar, fields)) | ||
str(dt_custom) # segfault? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well test the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Output is meaningless with __module__
assigned to garbage, I think. Can __module__
ever be None
in real code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, true. I tried it myself and indeed got __main__.CustomScalar
- I just wondered whether this is tested anywhere at all...
|
||
ret = PyUString_FromString("("); | ||
if (modulestr != NULL) { | ||
/* Note: if modulestr == NULL, the type is unpicklable */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More background behind this comment would be useful. Can this ever happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm my memory of this is vague now. I think this code had to do with "old style" vs "new style" classes in python2. I think "old style" classes fill in the __module__
entry incompletely when defined in C, or something along those lines. I will have to look it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, happily I left some kind of explanation of that in a github comment. See here (link).
Now as to whether this can be NULL, I think it can, for types defined in C in python 2. If you go to https://docs.python.org/2/extending/newtypes.html, and copy the first "noddy" example, you will find:
>>> import noddy
>>> mynoddy = noddy.Noddy()
>>> mynoddy.__module__
AttributeError: 'noddy.Noddy' object has no attribute '__module__'
>>> import cPickle
>>> cPickle.dumps(mynoddy)
TypeError: can't pickle Noddy objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, should the code continue to be able to deal with __module__ == NULL
? Or is the new behaviour of raising an exception fine? It would probably be good to add a comment with the conclusion regardless...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine to raise an exception if it is NULL, let's go with these changes.I would change the comment, though, it is technically wrong: Instead of this should never happen since types always have these attributes
it should say something like we do not expect this to happen happen since types always have these attributes, except in some very special cases (module may be null for "statically allocated" types in python2)
.
Probably these "special cases" never come up in practice, since they will cause people other problems (eg, unpicklable types).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last comment: I looked more closely at this and I have not been able to construct an example where __module__
is NULL. The type from numpy.core.test_rational import rational
is what I expected to fail, since it does not properly define a module, but if you do rational.__module__
you get __builtin__
, which is not NULL. I tested that np.dtype((rational, 'i4,i4'))
works properly with this PR.
So in summary, this code is fine.
a5a6f36
to
e1eecaf
Compare
@mhvk: Comments updated |
e1eecaf
to
32db2fc
Compare
32db2fc
to
b3c3fd2
Compare
numpy/core/arrayprint.py
Outdated
@@ -1192,13 +1199,16 @@ def __call__(self, x): | |||
else: | |||
return "({})".format(", ".join(str_fields)) | |||
|
|||
# for backwards compatibility | |||
StructureFormat = StructuredVoidFormat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a deprecation warning for use of StructureFormat, the same way we did for LongFloatFormat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, wouldn't hurt
namestr = PyObject_GetAttr((PyObject*)(dtype->typeobj), str_name); | ||
Py_DECREF(str_name); | ||
namestr = PyObject_GetAttrString((PyObject *)dtype->typeobj, "__name__"); | ||
modulestr = PyObject_GetAttrString((PyObject *)dtype->typeobj, "__module__"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of these lines are too long, too
Everything else LGTM! |
Be good to finish this up. |
Will pick this up in the next few days |
I'm gonna roll back the segfault fix - we need a much wider audit of |
b3c3fd2
to
fed44d7
Compare
Alright, that's the fixes for regular structured scalars at least |
LGTM. Circleci fails, but I'll merge anyway. Thanks Eric! |
Fixes gh-9821
Based on the 1.14 branch point, in case we want to backport.