BUG/ENH: Improve output for structured non-void types #10381

eric-wieser · 2018-01-12T06:26:02Z

Based on the 1.14 branch point, in case we want to backport.

mhvk

Looks good. Mostly nitpicks about comments, to help possible future developers.

mhvk · 2018-01-15T17:11:00Z

numpy/core/arrayprint.py

@@ -394,7 +391,12 @@ def _get_format_function(data, **options):
    elif issubclass(dtypeobj, _nt.object_):
        return formatdict['object']()
    elif issubclass(dtypeobj, _nt.void):
-        return formatdict['void']()
+        # StructureFormat relies on np.void.__getitem__, so we can't use it


This comment would seem helpful only if you thought StructureFormat should be used elsewhere (as it was originally, though that is not all that obvious). Maybe just state here something along the lines of "'void' only deals with unstructured bytes; we use StructureFormat if they are assigned to fields. Note that we cannot use StructureFormat where the void is in a union with a non-void dtype; those will be caught above and then typeset with the regular dtype (though the full dtype will be printed, to indicate to the user it is a union dtype)".

If you thought StructureFormat should be used elsewhere

That depends on what you think counts as a structured type:

issubdtype(dt, np.void) and dt.names is not None

dt.names is not None

issubdtype(dt, np.flexible) (a lot of old code refers to flexible types when it means structured types)

Renaming it to StructuredVoidFormat would make that clear

Agreed it is confusing, but this is one place where one can explain what it is in the code!
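For reference, the three candidate definitions above can be compared side by side. This is a minimal sketch using only public NumPy calls; the `classify` helper and the sample dtypes are illustrative and not part of the PR. The `(base, fields)` form used for `union` is assumed to behave as in the NumPy dtype documentation, and is the kind of non-void structured dtype this PR deals with.

```python
import numpy as np

def classify(dt):
    """Return the three candidate 'is structured' predicates for a dtype."""
    dt = np.dtype(dt)
    return (
        np.issubdtype(dt, np.void) and dt.names is not None,  # structured void
        dt.names is not None,                                 # anything with fields
        np.issubdtype(dt, np.flexible),                       # old "flexible" meaning
    )

# Illustrative dtypes (hypothetical field names).
record = np.dtype([('a', np.int32), ('b', np.float64)])   # ordinary structured void
raw_void = np.dtype('V8')                                 # unstructured bytes
byte_str = np.dtype('S4')                                 # flexible, but not void
union = np.dtype((np.int32, {'real': (np.int16, 0), 'imag': (np.int16, 2)}))

for label, dt in [('record', record), ('raw_void', raw_void),
                  ('byte_str', byte_str), ('union', union)]:
    print(label, classify(dt))
```

Only the second predicate picks out both `record` and `union` while rejecting the other two, which is why the choice of definition matters here.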

mhvk · 2018-01-15T17:12:00Z

numpy/core/arrayprint.py

@@ -1236,7 +1238,7 @@ def dtype_is_implied(dtype):
    dtype = np.dtype(dtype)
    if _format_options['legacy'] == '1.13' and dtype.type == bool_:
        return False
-    return dtype.type in _typelessdata
+    return dtype.type in _typelessdata and dtype.names is None


As we're adding comments, maybe here note that the latter part guards against union dtypes?
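To make the guard concrete, here is a hedged sketch of the changed return statement in isolation. The `_typeless` list is a simplified stand-in for arrayprint's `_typelessdata` (an assumption; the real list may differ between NumPy versions), the legacy branch is omitted, and the `lo`/`hi` field names are made up for illustration.

```python
import numpy as np

# Simplified stand-in for arrayprint's _typelessdata (assumption).
_typeless = [np.int_, np.float64, np.complex128, np.bool_]

def is_implied(dtype):
    """Sketch of dtype_is_implied without the legacy='1.13' branch."""
    dtype = np.dtype(dtype)
    # The second condition is the one added in this hunk: a dtype that
    # carries fields must always be printed, even if its scalar type is
    # one whose dtype would normally be omitted from the repr.
    return dtype.type in _typeless and dtype.names is None

# A float64 viewed as two uint32 halves: scalar type float64, but with fields.
union = np.dtype((np.float64, {'lo': (np.uint32, 0), 'hi': (np.uint32, 4)}))

plain_implied = is_implied(np.float64)
union_implied = is_implied(union)
print(plain_implied, union_implied)
```

Without the `dtype.names is None` check, the union dtype would be treated like plain float64 and its fields would be dropped from the repr.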

mhvk · 2018-01-15T17:15:29Z

numpy/core/tests/test_multiarray.py

+        CustomScalar.__module__ = None
+
+        dt_custom = np.dtype((CustomScalar, fields))
+        str(dt_custom)  # segfault?


Might as well test the output?

Output is meaningless with __module__ assigned to garbage, I think. Can __module__ ever be None in real code?

Yes, true. I tried it myself and indeed got __main__.CustomScalar - I just wondered whether this is tested anywhere at all...

eric-wieser · 2018-01-15T17:25:36Z

numpy/core/src/multiarray/descriptor.c


-        ret = PyUString_FromString("(");
-        if (modulestr != NULL) {
-            /* Note: if modulestr == NULL, the type is unpicklable */


More background behind this comment would be useful. Can this ever happen?

Hmm my memory of this is vague now. I think this code had to do with "old style" vs "new style" classes in python2. I think "old style" classes fill in the __module__ entry incompletely when defined in C, or something along those lines. I will have to look it up.

Ah, happily I left some kind of explanation of that in a github comment. See here (link).

Now as to whether this can be NULL, I think it can, for types defined in C in python 2. If you go to https://docs.python.org/2/extending/newtypes.html, and copy the first "noddy" example, you will find:

    >>> import noddy
    >>> mynoddy = noddy.Noddy()
    >>> mynoddy.__module__
    AttributeError: 'noddy.Noddy' object has no attribute '__module__'
    >>> import cPickle
    >>> cPickle.dumps(mynoddy)
    TypeError: can't pickle Noddy objects

So, should the code continue to be able to deal with __module__ == NULL? Or is the new behaviour of raising an exception fine? It would probably be good to add a comment with the conclusion regardless...

It's probably fine to raise an exception if it is NULL, let's go with these changes. I would change the comment, though, as it is technically wrong: instead of "this should never happen since types always have these attributes" it should say something like "we do not expect this to happen since types always have these attributes, except in some very special cases (module may be null for 'statically allocated' types in python2)".

Probably these "special cases" never come up in practice, since they will cause people other problems (eg, unpicklable types).

One last comment: I looked more closely at this and I have not been able to construct an example where __module__ is NULL. The rational type (from numpy.core.test_rational import rational) is what I expected to fail, since it does not properly define a module, but if you do rational.__module__ you get __builtin__, which is not NULL. I tested that np.dtype((rational, 'i4,i4')) works properly with this PR.

So in summary, this code is fine.
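The attribute lookups discussed in this thread can be mirrored at the Python level. The sketch below is not the code in descriptor.c; `qualified_type_name` is a hypothetical helper, and the behaviour shown is that of Python 3 rather than the Python 2 case described above.

```python
def qualified_type_name(tp):
    """Build 'module.name' for a type, failing loudly if either is missing."""
    # getattr without a default raises AttributeError when the attribute is
    # absent, the Python-level counterpart of a NULL return from
    # PyObject_GetAttrString in C.
    name = getattr(tp, "__name__")
    module = getattr(tp, "__module__")
    return "{}.{}".format(module, name)

class CustomScalar(int):
    pass

# Built-in types report a module even though they are defined in C.
builtin_name = qualified_type_name(int)

# Instances of built-in types, unlike the types themselves, have no __module__.
instance_has_module = hasattr(1, "__module__")

# Mirrors the test in this PR: the lookup succeeds, but the text is meaningless.
CustomScalar.__module__ = None
garbage_name = qualified_type_name(CustomScalar)

print(builtin_name, instance_has_module, garbage_name)
```

Setting `__module__` to None does not make the lookup fail; it only produces a nonsensical string, which matches the point made earlier that the output of that test is not worth asserting on.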

eric-wieser · 2018-01-21T20:42:14Z

@mhvk: Comments updated

charris · 2018-02-07T01:11:35Z

I'll leave this to @mhvk and @ahaldane.

ahaldane · 2018-02-07T18:05:57Z

numpy/core/arrayprint.py

@@ -1192,13 +1199,16 @@ def __call__(self, x):
        else:
            return "({})".format(", ".join(str_fields))

+# for backwards compatibility
+StructureFormat = StructuredVoidFormat


Should we add a deprecation warning for use of StructureFormat, the same way we did for LongFloatFormat?

Sure, wouldn't hurt
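One way the suggested deprecation could look is a thin subclass that warns on construction instead of a plain alias. This is a sketch under assumptions: the classes below are stand-ins rather than NumPy's implementation, and the warning text is made up.

```python
import warnings

class StructuredVoidFormat:
    """Stand-in for the renamed formatter; not NumPy's real class."""
    def __init__(self, format_functions):
        self.format_functions = format_functions

    def __call__(self, x):
        str_fields = [fmt(field) for field, fmt in zip(x, self.format_functions)]
        if len(str_fields) == 1:
            return "({},)".format(str_fields[0])
        return "({})".format(", ".join(str_fields))

class StructureFormat(StructuredVoidFormat):
    """Old name, kept for backwards compatibility, that warns when used."""
    def __init__(self, *args, **kwargs):
        # stacklevel=2 attributes the warning to the caller, not this class.
        warnings.warn(
            "StructureFormat has been replaced by StructuredVoidFormat",
            DeprecationWarning, stacklevel=2)
        super().__init__(*args, **kwargs)

# Usage: the old name still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fmt = StructureFormat([str, repr])
result = fmt((1, "a"))
print(result, [w.category.__name__ for w in caught])
```

Compared with a bare `StructureFormat = StructuredVoidFormat` assignment, the subclass costs a few lines but lets existing callers find out about the rename.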

ahaldane · 2018-02-07T18:38:45Z

numpy/core/src/multiarray/descriptor.c

-        namestr = PyObject_GetAttr((PyObject*)(dtype->typeobj), str_name);
-        Py_DECREF(str_name);
+        namestr = PyObject_GetAttrString((PyObject *)dtype->typeobj, "__name__");
+        modulestr = PyObject_GetAttrString((PyObject *)dtype->typeobj, "__module__");


Some of these lines are too long, too

ahaldane · 2018-02-07T18:51:40Z

Everything else LGTM!

charris · 2018-02-12T03:36:18Z

Be good to finish this up.

eric-wieser · 2018-02-13T17:23:01Z

Will pick this up in the next few days

eric-wieser · 2018-02-15T08:45:39Z

I'm gonna roll back the segfault fix - we need a much wider audit of PyUString_ConcatAndDel, even within that file.


eric-wieser · 2018-02-15T08:56:06Z

Alright, that's the fixes for regular structured scalars at least

ahaldane · 2018-02-15T15:27:42Z

LGTM. Circleci fails, but I'll merge anyway.

Thanks Eric!

charris added the 00 - Bug, 01 - Enhancement, component: numpy._core, and 09 - Backport-Candidate labels Jan 14, 2018

mhvk reviewed Jan 15, 2018

eric-wieser commented Jan 15, 2018

eric-wieser force-pushed the fix-segfault branch from a5a6f36 to e1eecaf Compare January 21, 2018 20:38

eric-wieser force-pushed the fix-segfault branch from e1eecaf to 32db2fc Compare January 21, 2018 20:46

eric-wieser force-pushed the fix-segfault branch from 32db2fc to b3c3fd2 Compare February 3, 2018 19:43

charris added this to the 1.14.1 release milestone Feb 5, 2018

ahaldane reviewed Feb 7, 2018

eric-wieser changed the title ~~BUG/ENH: Fix segfault and improve output for structured non-void types~~ BUG/ENH: Improve output for structured non-void types Feb 15, 2018

eric-wieser added 3 commits February 15, 2018 00:53

BUG: Show the base of a compound dtype even when it doesn't subclass void

a30b294

BUG: Fix crash on non-void structured array repr

7a3344a

Fixes numpygh-9821

ENH: Always show dtype fields in the array repr, even for non-void

fed44d7

eric-wieser force-pushed the fix-segfault branch from b3c3fd2 to fed44d7 Compare February 15, 2018 08:55

ahaldane merged commit 69fa37f into numpy:master Feb 15, 2018

charris mentioned this pull request Feb 16, 2018

BUG/ENH: Improve output for structured non-void types #10612

Merged

charris removed the 09 - Backport-Candidate label Feb 16, 2018

charris removed this from the 1.14.1 release milestone Feb 16, 2018
