WIP: ENH: print float scalars using double_to_string instead of printf #9932

ahaldane · 2017-10-27T05:05:35Z

Currently, numpy scalars of floating type print differently from both float-array-elements and Python floats:

>>> 0.3, np.float64(0.3), str(np.array([0.3]))
(0.3, 0.29999999999999999, '[ 0.3]')

Similarly, scalars of complex type print differently:

>>> complex(1,np.inf), np.complex128(complex(1,np.inf)), str(np.array([complex(1,np.inf)]))
((1+infj), (1+inf*j), '[ 1.+infj]')

This is because the scalars use the OS's printf to print floats, whereas Python floats and numpy arrays use CPython's version of the dtoa library to print accurate and human-friendly floats (see the discussion in #9919).
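To make the difference concrete, here is a small sketch in pure Python. It assumes that Python's `%`-formatting with a fixed precision is a fair stand-in for the printf path (it is not the OS's printf itself), and that `repr` shows the dtoa-style shortest round-trip output.

```python
x = 0.3

# Fixed 17 significant digits, the style a printf("%.17g") call produces.
printf_style = "%.17g" % x

# Shortest decimal string that still round-trips to the same double.
shortest = repr(x)

print(printf_style)  # 0.29999999999999999
print(shortest)      # 0.3

# Both strings parse back to exactly the same double.
print(float(printf_style) == float(shortest))  # True
```

Both strings identify the same double; the second simply stops as soon as the value is uniquely determined.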

This PR rewrites the scalar float printing code to use PyOS_double_to_string, which uses Python's dtoa algorithm, and tweaks the complex repr too. Thus, scalars now use the same algorithm as Python floats and arrays. Scalars will not print exactly the same as Python floats because numpy prints with increased precision, but otherwise the behavior is the same, e.g. the rounding and the trimming of trailing zeros.

One exception is for printing longfloats, since the dtoa algorithm cannot handle these. In this case we fall back to the OS's printf, but I've also added the zero-trimming behavior from dtoa so it is still more similar to python floats than before.
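As a rough illustration of the zero-trimming idea applied to printf-style output, here is a minimal Python sketch. The helper name `trim_trailing_zeros` is hypothetical and this is not the C code in the PR; keeping one digit after the point (`1.0` rather than `1.`) is an assumption made for the example.

```python
def trim_trailing_zeros(text):
    """Drop trailing zeros from the fractional part of a printf-style string.

    Hypothetical helper for illustration only. One digit is kept after
    the decimal point, and any exponent suffix is left untouched.
    """
    mantissa, sep, exponent = text.partition("e")
    if "." in mantissa:
        mantissa = mantissa.rstrip("0")
        if mantissa.endswith("."):
            mantissa += "0"  # keep "1.0" rather than "1."
    return mantissa + sep + exponent


print(trim_trailing_zeros("%.20f" % 0.5))  # 0.5
print(trim_trailing_zeros("1.000"))        # 1.0
print(trim_trailing_zeros("1.2500e+10"))   # 1.25e+10
print(trim_trailing_zeros("100"))          # 100
```

Strings without a decimal point are returned unchanged, so integer-looking zeros are never removed.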

I've also generalized these functions so it is easy to change the trimming behavior, precision, and format, which may be useful in #9919.

This is a WIP. I think the behavior is finished, but I need to write comments and tests, and I'll also write up some implementation notes in a comment below at some point. I'll ping when I'm done.

eric-wieser · 2017-10-27T05:17:03Z

numpy/core/src/multiarray/scalartypes.c.src

+        }
+    }
+    else {
+        /* we found a trailing nonzero-digit instead of '.' */


Not necessarily - this might also be the one and only zero, right?

eric-wieser · 2017-10-27T05:17:47Z

numpy/core/src/multiarray/scalartypes.c.src

+    while (repr[epos] != '\0') {
+        repr[nzpos++] = repr[epos++];
+    }
+    repr[nzpos] = repr[epos];


Could just use a do while here

mhvk

Did not have a thorough look at all, so these are a bit random.

mhvk · 2017-10-27T14:16:50Z

numpy/core/src/multiarray/scalartypes.c.src

+ *   * format_code: similar to the argument to PyOS_double_to_string,
+ *   * prec: same as the argument to PyOS_double_to_string
+ *   * sign: boolean value, controls whether sign is always printed
+ *   * tail: one of '\0', '.' or '0', to control what happens for


Should include 'r', no? You test for it below.

mhvk · 2017-10-27T14:19:21Z

numpy/core/src/multiarray/scalartypes.c.src

+        flags |= Py_DTSF_ALT;
+    }
+
+    /* 'g' format precision is 1 greater, so decr for consistency with f,e */


I'm a bit confused why we do this if the goal is to be more similar to PyOS_double_to_string. Is it just that we do it elsewhere too?
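For context on the off-by-one in that code comment: with printf-style formatting, the precision of `e` counts digits after the decimal point, while the precision of `g` counts significant digits in total. A quick Python check of that general behavior (the helper `significant_digits` is made up for this example and is not part of the PR):

```python
def significant_digits(text):
    """Count significant digits in a formatted number (illustrative helper)."""
    mantissa = text.lower().split("e")[0]
    return len(mantissa.replace(".", "").replace("-", "").lstrip("0"))


x = 1234.5678
e_form = format(x, ".3e")  # 3 digits after the point, 4 significant digits
g_form = format(x, ".4g")  # 4 significant digits in total

print(e_form, significant_digits(e_form))  # 1.235e+03 4
print(g_form, significant_digits(g_form))  # 1235 4
```

So precision 3 under `e` and precision 4 under `g` carry the same number of significant digits.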


ahaldane · 2017-10-28T05:38:08Z

Hmm, it turns out to be quite tricky to get everything to work using PyOS_double_to_string. The problem is that it has a specially coded "r" mode designed to print float64, which is what is used to print python floats, but which doesn't allow us to control precision. But in numpy we need to control precision, and we want to print things besides float64 with proper rounding. So PyOS_double_to_string is not good enough.

Furthermore, I noticed that in a lot of non-scalar places numpy prints ugly representations. E.g.,

>>> np.arange(10., dtype='f4')/10
array([ 0.        ,  0.1       ,  0.2       ,  0.30000001,  0.40000001,
        0.5       ,  0.60000002,  0.69999999,  0.80000001,  0.89999998], dtype=float32)

Those trailing digits are unnecessary. This leads me to think we should include custom float-printing code in numpy.
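To check that claim without numpy, here is a sketch that searches for the fewest significant digits that still round-trip through float32. It uses only the standard `struct` module; the helper names are hypothetical, and going through a double on the way back (string to double to float32) is a simplification that could differ from a direct string-to-float32 conversion in rare cases.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest float32, returned as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def shortest_float32_repr(x):
    """Return the fewest-digit '%g' string that round-trips through float32."""
    target = to_float32(x)
    # 9 significant digits are always enough to identify a float32.
    for precision in range(1, 10):
        text = "%.*g" % (precision, target)
        if to_float32(float(text)) == target:
            return text
    return repr(target)


value = to_float32(0.3)
print("%.8f" % value)                # 0.30000001, as in the array output above
print(shortest_float32_repr(value))  # 0.3
```

The short form `0.3` already identifies the float32 value uniquely, which is why the extra digits carry no information.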

After some investigation, the Dragon4 algorithm seems promising to me. Ryan Juckett has written it up in C++ here with a license which I think is numpy-compatible. It looks easy to port to C. I already tried modifying it to get float16 and float128 printing with "correct" rounding, which was quite easy to do.

Unless someone has an objection, I'm probably going to close this PR and work on another one to include Dragon4 printing in numpy. This should improve float reprs (1) for float scalars and (2) for non-float64 types in general. I.e., the example above will print the way you would want.

mhvk · 2017-10-28T15:09:23Z

If you're up for it, by all means!


ahaldane · 2017-11-05T03:22:28Z

Closed in favor of #9941

ahaldane added 01 - Enhancement 51 - In progress component: numpy._core labels Oct 27, 2017

ahaldane added this to the 1.14.0 release milestone Oct 27, 2017

ahaldane force-pushed the dtoa_scalars branch 3 times, most recently from a019e96 to 7f3ca0d Compare October 27, 2017 05:15

eric-wieser reviewed Oct 27, 2017


mhvk reviewed Oct 27, 2017


ahaldane force-pushed the dtoa_scalars branch 5 times, most recently from 8f78ad5 to 36fc0cb Compare October 27, 2017 17:54

ENH: print float scalars using double_to_string instead of printf

a4c347e

ahaldane force-pushed the dtoa_scalars branch from 36fc0cb to a4c347e Compare October 27, 2017 18:08

use modes 4/5 in dtoa.c

d912389

asd

ahaldane force-pushed the dtoa_scalars branch from 885aa02 to d912389 Compare October 27, 2017 22:06

charris added the 57 - Close? label Oct 28, 2017

ahaldane mentioned this pull request Oct 29, 2017

ENH: Use Dragon4 algorithm to print floating values #9941

Merged

ahaldane added a commit to ahaldane/numpy that referenced this pull request Nov 4, 2017

TST: New tests for scalar/array reprs with dragon4

9ab9e8b

Fixes numpy#9360 numpy#2643 numpy#6136 numpy#9699 numpy#6908 Closes numpy#9919 numpy#9932

ahaldane closed this Nov 5, 2017
