MAINT: Move dtype string functions to python #10602

eric-wieser · 2018-02-16T04:55:05Z

Doing proper error checking of string concatenation in python super tedious.
As a results, we're not doing it, making us prone to exceptions turning into segfaults.
This moves all of the __str__ and __repr__ functions, where speed is irrelevant, into np.core._dtype.

Minimal refactors made during the conversion from C - mainly just using str.join instead of manual loops. I'm sure more could be done here, but now that's it's python it should be easier for new contributors to modify.

CircleCI failures are because this is working off an earlier branch point

eric-wieser · 2018-02-16T04:57:37Z

numpy/core/_dtype.py

+    """ Normalize byteorder to '<' or '>' """
+    # hack to obtain the native and swapped byte order characters
+    swapped = np.dtype(int).newbyteorder('s')
+    native = swapped.newbyteorder('s')


Is there a better way to get hold of these?

There is np.little_endianor sys.byteorder, but then you have to decide '<' or '>' yourself.

eric-wieser · 2018-02-16T04:58:08Z

numpy/core/_dtype.py

+    from a list of the field names and dtypes with no additional
+    dtype parameters.
+
+    Duplicates the C `is_dtype_struct_simple_unaligned_layout` functio.


I think we should add this as dtype.ispacked in another PR

eric-wieser · 2018-02-16T04:59:27Z

numpy/core/src/multiarray/descriptor.c

-            }
-            PyUString_ConcatAndDel(&ret, PyObject_Repr(title));
-            if (i != names_size - 1) {
-                PyUString_ConcatAndDel(&ret, PyUString_FromString(","));


This kind of thing segfaults if PyUString_FromString(...) returns NULL (ie, we run out of memory). This type of thing what motivated the switch to python - getting this right everywhere is too hard and too verbose in C, and for no gain.

eric-wieser · 2018-08-19T17:14:01Z

Any thoughts on this?

mattip · 2018-08-19T17:50:49Z

Could you rebase this after the coverage integration? I would like to see that we are checking all the new code in tests.

eric-wieser · 2018-08-19T18:26:23Z

Good idea - done

mattip · 2018-08-19T19:10:15Z

numpy/core/_dtype.py

+    if dtype.fields is not None:
+        return _struct_str(dtype, include_align=True)
+    elif dtype.subdtype:
+        return _subarray_str(dtype)


This line not hit in tests, see coverage

mattip · 2018-08-19T19:11:11Z

numpy/core/_dtype.py

+    if byteorder == '=':
+        return native.byteorder
+    if byteorder == 's':
+        return swapped.byteorder


This line not hit in tests

Not sure if it's possible to hit - but this same test existed in the old C code.

Added a TODO

mattip · 2018-08-19T19:14:17Z

Coverage seems pretty good, only two blocks (other than exceptions) not covered

Doing proper error checking of string concatenation in python super tedious. As a results, we're not doing it, making us prone to exceptions turning into segfaults. This moves all of the __str__ and __repr__ functions, where speed is irrelevant, into np.core._dtype.

eric-wieser · 2018-09-05T05:11:54Z

@mattip: Coverage improved, I think

charris · 2018-09-11T17:23:33Z

I agree the this sort of thing should be coded in Python, I'd like to do more like this. Given that we have made a whole new module, I suppose we could use Cython at some point. @mattip Does PyPy speed up this sort of thing? @eric-wieser I suppose if this has a time effect in practice that it will show up in some of the benchmarks. I don't know that timing dtype creation by itself is useful.

charris · 2018-09-11T17:25:24Z

Thanks Eric.

mattip · 2018-09-11T17:42:12Z

PyPy will JIT after ~1000 calls, if someone is calling repr(dtype) that many times I thing something is probably wrong with their code.

ahaldane · 2018-09-30T21:42:52Z

This seems to have triggered a circular import problem when using the dtype-transer debug mode. (set NPY_DT_DBG_TRACING to 1 at the top of dtype_transfer.c).

The problem seems to be that debug tracing tries to print a dtype before the dtype object has been created, I'm still trying to see what happened. It involves getlimits.py which has been the source of circular import issues in the past.

>>> import numpy as np
Calculating dtype transfer from  to 
Traceback (most recent call last):
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/_dtype.py", line 23, in __repr__
    arg_str = _construction_repr(dtype, include_align=False)
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/_dtype.py", line 77, in _construction_repr
    return _scalar_str(dtype, short=short)
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/_dtype.py", line 81, in _scalar_str
    byteorder = _byte_order_str(dtype)
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/_dtype.py", line 151, in _byte_order_str
    swapped = np.dtype(int).newbyteorder('s')
AttributeError: module 'numpy' has no attribute 'dtype'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 200, in _lock_unlock_module
  File "<frozen importlib._bootstrap>", line 163, in _get_module_lock
SystemError: <built-in function acquire_lock> returned a result with an error set

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/__init__.py", line 142, in <module>
    from . import core
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/__init__.py", line 72, in <module>
    from . import getlimits
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/getlimits.py", line 125, in <module>
    tiny=_f16(2 ** -14))
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/getlimits.py", line 75, in __init__
    self.epsilon = self.eps = float_to_float(kwargs.pop('eps'))
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/getlimits.py", line 70, in <lambda>
    float_to_float = lambda v : _fr1(float_conv(v))
  File "/home/allan/nmpy_dev/build/testenv/lib/python3.7/site-packages/numpy/core/getlimits.py", line 29, in _fr1
    a = a.copy()
SystemError: <method 'copy' of 'numpy.ndarray' objects> returned a result with an error set

charris · 2018-09-30T23:48:58Z

@ahaldane Could you open an issue for this?

eric-wieser · 2018-10-18T05:33:12Z

numpy/core/_dtype.py

+        else:
+            return "'%sU%d'" % (byteorder, dtype.itemsize / 4)
+
+    elif dtype.type == np.void:


Previously this caught subclasses of void too. Do we actually want that behavior?

Fixes numpy#12206, which was a regression introduced by numpy#10602 Frankly I'd consider the results expected by `test_void_subclass_unsized` and `test_void_subclass_sized` undesirable, but they match the behavior in 1.12.x

eric-wieser added component: numpy._core 03 - Maintenance labels Feb 16, 2018

eric-wieser force-pushed the dtype-__str__-__repr__-in-python branch from 5a5426f to 73dece0 Compare February 16, 2018 04:56

eric-wieser commented Feb 16, 2018

View reviewed changes

eric-wieser mentioned this pull request Feb 16, 2018

WIP: Move dtype.__repr__ and __str__ to python #10594

Closed

eric-wieser force-pushed the dtype-__str__-__repr__-in-python branch from 73dece0 to 73e328c Compare August 19, 2018 18:26

mattip reviewed Aug 19, 2018

View reviewed changes

eric-wieser force-pushed the dtype-__str__-__repr__-in-python branch from 73e328c to fa616a8 Compare August 20, 2018 04:40

mattip approved these changes Sep 5, 2018

View reviewed changes

eric-wieser mentioned this pull request Sep 11, 2018

BUG: Numpy scalar types sometimes have the same name #10151

Merged

charris merged commit 371cc4d into numpy:master Sep 11, 2018

eric-wieser mentioned this pull request Sep 12, 2018

MAINT: Move np.dtype.name.__get__ to python #11932

Merged

ahaldane mentioned this pull request Sep 30, 2018

Circular import issue involving getlimits and NPY_DT_DBG_TRACING #12063

Closed

eric-wieser mentioned this pull request Oct 18, 2018

BUG: dtype repr of subclasses of np.void throw an exception #12206

Closed

eric-wieser commented Oct 18, 2018

View reviewed changes

eric-wieser mentioned this pull request Oct 22, 2018

BUG: Fix crash in repr of void subclasses #12240

Merged

eric-wieser mentioned this pull request Oct 24, 2018

MAINT: Move ctype -> dtype conversion to python #12254

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MAINT: Move dtype string functions to python #10602

MAINT: Move dtype string functions to python #10602

eric-wieser commented Feb 16, 2018 •

edited

Loading

eric-wieser Feb 16, 2018

mattip Aug 19, 2018

eric-wieser Feb 16, 2018

eric-wieser Feb 16, 2018

eric-wieser commented Aug 19, 2018

mattip commented Aug 19, 2018

eric-wieser commented Aug 19, 2018

mattip Aug 19, 2018

eric-wieser Aug 20, 2018

mattip Aug 19, 2018

eric-wieser Aug 20, 2018

eric-wieser Aug 20, 2018

mattip commented Aug 19, 2018

eric-wieser commented Sep 5, 2018

charris commented Sep 11, 2018

charris commented Sep 11, 2018

mattip commented Sep 11, 2018

ahaldane commented Sep 30, 2018

charris commented Sep 30, 2018

eric-wieser Oct 18, 2018

MAINT: Move dtype string functions to python #10602

MAINT: Move dtype string functions to python #10602

Conversation

eric-wieser commented Feb 16, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eric-wieser commented Aug 19, 2018

mattip commented Aug 19, 2018

eric-wieser commented Aug 19, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mattip commented Aug 19, 2018

eric-wieser commented Sep 5, 2018

charris commented Sep 11, 2018

charris commented Sep 11, 2018

mattip commented Sep 11, 2018

ahaldane commented Sep 30, 2018

charris commented Sep 30, 2018

Choose a reason for hiding this comment

eric-wieser commented Feb 16, 2018 •

edited

Loading