py/formatfloat: Calculate reference values with pow(). #8985

dpwe · 2022-07-29T16:38:27Z

Formerly, formatfloat struggled with formatting exact powers of 10
as created by parsenum because of small differences in how those
powers of 10 were calculated: formatfloat multiplied together a few
values of 10^(2^Y) from a lookup table, but parsenum used
pow(10, X). This was mitigated in #8905 by adding 2eps of extra
margin when comparing values ("aggressive rounding up").

This patch instead eliminates the discrepency by using pow() in
formatfloat also. This eliminates the need for the 2eps margin, as
well as the lookup tables. It is likely slower to run, however.

In addition, this patch directly estimates the power-of-10 exponent from
the power-of-2 exponent in the floating-point representation.

This change surfaced a previously-noted problem with parsing numbers
including trailing zeros. The fix to that problem, #8980, is included
in this PR to ensure the tests pass.

Signed-off-by: Dan Ellis dan.ellis@gmail.com

dpwe · 2022-08-05T12:53:03Z

@dpgeorge Any hints as to why this is failing on appveyor? All I can see is the build error outputs, which seem catastrophic - my code prints nothing for numbers smaller than 1. But I can't see how this is happening (and can't reproduce, obviously). The Appveyor build appears to be under Windows using an MS compiler - is there some reason why mp_float_union_t wouldn't work there? I got rid of the special case float_union_t (for accessing the bits in a 32-bit float) from formatfloat since it appeared redundant with mp_float_union_t, but now I'm doubting my actions. In particular, the new code computes directly from the exp bits extracted from mp_float_union_t, maybe these read as signed under the MS compiler or something?

Edit: This was fixed by directly accessing the bits in the uint member of the union, rather than trying to use the bit-sliced struct members. So I guess there's something weird with how those structs behave under the MS compiler.

dpwe · 2022-08-05T17:22:43Z

Although the tests pass, there's a big problem with this. Using a float32 build,

>>> print("%.8e" % 1e-13)
0.99999990e-13

That's a malformed "e"-format number, the leading digit cannot be 0 in "e"-format.

The problem occurs with a mismatch between how the leading digit place is calculated, and how the leading digit itself is calculated. For numbers smaller than 1.0, we decide the leading digit by comparing the input number with pow(10, -e) for increasing values of e until we get something less than or equal to the input number. In the case of 1e-13, that returns e = 13 as expected.

We then scale up the input by pow(10, e), and the leading digit is however many times we can subtract 1 from this value and not go negative. Unfortunately, for some values of e (specifically, 13, 17, 19, 28, 31, and 38), the product pow(10, e) * pow(10, -e) is smaller than 1, e.g.:

>>> print("%.12f" % ((1e13) * (1e-13)))
0.999999904633

Thus, when we come to see how many units we can fit in the scaled-up input, it's zero, and we print the incorrectly-normalized "e"-format number above.

I think this was being hidden by the hard FPROUND_TO_ONE logic that I didn't understand and removed in this PR, but I'm looking for a cleaner solution.

Edit: This was fixed by dividing by pow(10, -e) instead of multiplying by pow(10, e), which is obvious in hindsight. Hooray for rubber-duck debugging!

dpwe · 2022-08-05T19:48:54Z

This code is more understandable, I believe more accurate (no aggressive rounding), and should even be smaller (although the changes to parsenum will have increase code size there). PTAL.

dpgeorge · 2022-08-11T01:58:40Z

Code size diff for this PR (which includes #8980):

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:  -192 -0.031% 
unix nanbox:  -208 -0.037% 
      stm32:   -52 -0.013% PYBV10
     cc3200:    -8 -0.004% 
    esp8266:   -48 -0.007% GENERIC
      esp32:  +168 +0.011% GENERIC[incl -48(data)]
     mimxrt:  -184 -0.052% TEENSY40
 renesas-ra:   -48 -0.008% RA6M2_EK
        nrf:   -32 -0.017% pca10040
        rp2:  -176 -0.034% PICO
       samd:   -52 -0.037% ADAFRUIT_ITSYBITSY_M4_EXPRESS

That's great!

py/formatfloat.c

py/parsenum.c

dpgeorge · 2022-08-11T02:16:22Z

This looks like a good improvement, because it reduces code size and fixes the trailing 0 problem. And as you mention it no longer does the aggressive rounding up.

Can you please rebase this on master and clean up the commits, with at least a separate commit to fix the trailing zero problem, and then 1 or more extra commits for the other changes. (At the moment the commits conflict, it seems it was merged to itself.)

Rework the conversion of floats to decimal strings so it aligns precisely with the conversion of strings to floats in parsenum.c. This is to avoid rendering 1eX as 9.99999eX-1 etc. Signed-off-by: Dan Ellis <dan.ellis@gmail.com>

dpwe · 2022-08-11T11:01:30Z

Sounds good. I have rebased and cleaned up the commit history as you requested. I have rebased to HEAD, then there are now two commits, one adds the changes to parsenum.c, and the second adds the changes to formatfloat.c and the tests.

I'm getting more successful with git rebase; however, there are a couple of formatting changes to parsenum in the second commit that don't belong there. I'm hoping this is OK.

dpgeorge · 2022-08-12T13:56:50Z

Thanks for updating the commits.

Rebased and merged in 6cd2e41 and 6f4d424

Final code size diff for this PR is:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:  -144 -0.023% standard
unix nanbox:  -208 -0.037% nanbox
      stm32:   -44 -0.011% PYBV10
     cc3200:    +0 +0.000% 
    esp8266:   -44 -0.006% GENERIC
      esp32:  +156 +0.010% GENERIC[incl -48(data)]
     mimxrt:  -176 -0.049% TEENSY40
 renesas-ra:   -48 -0.008% RA6M2_EK
        nrf:   -24 -0.013% pca10040
        rp2:  -152 -0.030% PICO
       samd:   -44 -0.031% ADAFRUIT_ITSYBITSY_M4_EXPRESS

dpwe force-pushed the format_float_with_pow branch 7 times, most recently from 20938e0 to 2e8e515 Compare August 4, 2022 17:05

dpwe force-pushed the format_float_with_pow branch from 963fe59 to caaf141 Compare August 5, 2022 18:05

dpgeorge added the py-core Relates to py/ directory in source label Aug 11, 2022

dpgeorge reviewed Aug 11, 2022

View reviewed changes

py/formatfloat.c Outdated Show resolved Hide resolved

py/formatfloat.c Show resolved Hide resolved

py/parsenum.c Outdated Show resolved Hide resolved

dpwe force-pushed the format_float_with_pow branch 2 times, most recently from 64d3e17 to 912334c Compare August 11, 2022 10:42

dpwe and others added 2 commits August 11, 2022 11:45

py/parsenum: Ensure that trailing zeros lead to identical results.

cbf5cc1

dpwe force-pushed the format_float_with_pow branch from 912334c to 0419279 Compare August 11, 2022 10:50

dhalbert mentioned this pull request Aug 11, 2022

Incorporate micropython #8985: py/formatfloat adafruit/circuitpython#6748

Closed

dpgeorge closed this Aug 12, 2022

dpgeorge mentioned this pull request Aug 12, 2022

py/parsenum: Ensure that trailing zeros lead to identical results. #8980

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

py/formatfloat: Calculate reference values with pow(). #8985

py/formatfloat: Calculate reference values with pow(). #8985

dpwe commented Jul 29, 2022

dpwe commented Aug 5, 2022 •

edited

Loading

dpwe commented Aug 5, 2022 •

edited

Loading

dpwe commented Aug 5, 2022

dpgeorge commented Aug 11, 2022

dpgeorge commented Aug 11, 2022

dpwe commented Aug 11, 2022

dpgeorge commented Aug 12, 2022

py/formatfloat: Calculate reference values with pow(). #8985

py/formatfloat: Calculate reference values with pow(). #8985

Conversation

dpwe commented Jul 29, 2022

dpwe commented Aug 5, 2022 • edited Loading

dpwe commented Aug 5, 2022 • edited Loading

dpwe commented Aug 5, 2022

dpgeorge commented Aug 11, 2022

dpgeorge commented Aug 11, 2022

dpwe commented Aug 11, 2022

dpgeorge commented Aug 12, 2022

dpwe commented Aug 5, 2022 •

edited

Loading

dpwe commented Aug 5, 2022 •

edited

Loading