Fix #4132 #4133

Ecco · 2018-09-13T12:07:38Z

This patches avoid multiplying with negative powers-of-10 when parsing floating-point values.

Similar behavior can be found in some implementations of strtod.

pfalcon · 2018-09-13T12:11:22Z

py/parsenum.c

@@ -295,7 +295,11 @@ mp_obj_t mp_parse_num_decimal(const char *str, size_t len, bool allow_imag, bool
            exp_val -= SMALL_NORMAL_EXP;
            dec_val *= SMALL_NORMAL_VAL;
        }
-        dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
+        if (exp_val > 0) {
+          dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);


Incorrect indentation.

pfalcon · 2018-09-13T12:11:46Z

py/parsenum.c

@@ -295,7 +295,11 @@ mp_obj_t mp_parse_num_decimal(const char *str, size_t len, bool allow_imag, bool
            exp_val -= SMALL_NORMAL_EXP;
            dec_val *= SMALL_NORMAL_VAL;
        }
-        dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
+        if (exp_val > 0) {


Please add a comment why this is needed.

dpgeorge · 2018-09-13T13:26:05Z

Thanks for the contribution. You'll notice that one of the float tests now fails, an existing number that parsed correctly now doesn't; float('.' + '9' * 70 + 'e-50') == float('1e-50') is now False.

It's tough to get this right and a lot of work has already been put into the parsing routine to at least make it correct within 1LSB of double precision, which IMO is good enough.

Here is a test script I use to check the float parsing, it uses 1016 troublesome float values from the CPython test suite (original source listed in the file). With strict mode disabled in the test (so accepts accuracy within 1LSB) MicroPython parses all values correctly: floating_points2.py.txt

It would be great to get all those tests passing, but it's not easy!

Ecco · 2018-09-13T15:35:30Z

Great, thanks @dpgeorge . Out of curiosity, this seems a bit like reinventing the wheel. Well, the libc at least. Why not leverage strtod(3) or something similar?

dpgeorge · 2018-09-14T04:21:14Z

Why not leverage strtod(3) or something similar?

I guess it's because, when this project started, there was zero dependency on any external libraries. And then required functionality was added as needed and in a minimal way.

It would be good to first understand the general algorithm of strtod functions and how they differ to what is done here.

Ecco · 2018-09-14T07:54:36Z

I guess it's because, when this project started, there was zero dependency on any external libraries. And then required functionality was added as needed and in a minimal way.

It definitely makes sense. There is one drawback though: if MicroPython is integrated alongside some other apps/libraries, this may end up duplicating code. In this case, if one of the other libraries uses strtod, we end up with two ascii-to-double routines.

pfalcon · 2018-09-14T08:16:33Z

@Ecco: Schedule we came to (which, as usual, is likely spread around tickets and not documented in one place, an definitely not fully executed):

As a general statement, we'd like to depend on libc as little as possible, as a general project aim of being minimal.
As a specific case, we don't want to depend on stdio, because it's bloat.
We have to depend on libc for OS services (literally, syscall wrappers in libc).
We also have to depend on libc for float handling, because in general, float binary format is unspecified.

Level of execution of these ideas varies, and some (well, all in general) of them conflicting, depending on a usecase. For example, for unix port float handling definitely should be offloaded to libc. But for STM32 port, where exact float params are known, why depend on outside bloat, if we can do it better? Even for unix port, there's a desire to optimize particular targets.

The current state is thus a mix-up somewhat, and it needs further elaboration (making it all configurable, based on requirements set in the last paragraph).

Ecco · 2018-09-14T09:05:21Z

Thank you for the clear explanation. Makes a lot of sense, thanks! There is one great advantage to internalizing float parsing: it will behave the same on all platforms.

dpgeorge · 2018-09-18T07:35:21Z

@Ecco as for this particular patch here, IMO it doesn't really change the situation much for the float parser. It may parse some floats "better", but others it parses "worse" (see failing test case). And since there are many floats (approx 2^32, or 2^64 for double prec) it's hard to quantify how much better this patch makes things overall. Since the existing parser gives results within 1LSB of the float precision, and this patch doesn't improve that but increases code size, I'd be inclined not to merge this. I'd rather have a full fix that makes it 100% compatible with other parsers, otherwise it'll just be patch on patch not really going anywhere.

Ecco · 2018-09-18T07:49:04Z

Exactly my conclusion as well… I could argue that it parses "more common" floats better, but I agree the argument is weak and subjective.

Ecco · 2018-09-18T07:52:50Z

Now unfortunately I don't think there is a fix using the current method. To avoid losing accuracy, I think you need to work using integers all the way, and build the floating-point at the very last steps from mantissa, exponent, and sign.

Ecco · 2018-09-18T09:15:32Z

Here's an updated patch which I think might be worth merging. It probably doesn't fix all cases, but I think it's a net improvement over the previous version.

As explained in the comment, it favorises exact operations for the small subset of cases where we manipulate small-powers-of-10 exponents.

The added code is super small (just a couple comparisons), fixes some very visible cases (0.1 + 0.2 == 0.3 for example), and most likely adds no regression.

dpgeorge · 2018-09-18T12:04:36Z

py/parsenum.c

+        // turns out small positive powers of 10 do, whereas small negative powers of 10 don't.
+        // So in that case, we'll yield a division of exact values rather than a multiplication
+        // of slightly erroneous values.
+        if (exp_val < 0 && exp_val > -15) { // 15 = log10(2^53)


Remember that this code also needs to work with single precision floats, so -15 would be something like -6 if single precision is used.

Indeed. Would you consider the merge if I fixed this?

Let me first run some tests with the patch as it is, to see how it changes things. I understand from the logic/comment you wrote what the intention is and it does sound like a good idea.

Great, makes perfect sense. If you decide you want to merge it, i'll be happy to make it work with single-precision floats too (well, it's really just an #if statement…) 😄

dpgeorge · 2018-09-19T05:14:12Z

@Ecco I tested that TCL strtod implementation that you linked to above, and it actually performs worse that the current uPy implementation for the floating_points2.py test suite I posted above: it doesn't handle subnormals, and fails another 357 tests (they are off by 1LSB like uPy). uPy also fails 357 tests but I can get that down to 296 by using a variant of your patch here which uses division (instead of multiplication) of the exponent for exp values between -1 and -22 inclusive (for double precision case).

The real issue is that the uPy parser will sometimes not give the closest floating representation of the input string. Eg for "0.3", which can't be represented exactly in floating point, uPy gives a slightly high (by 1LSB) result. This is because it computes the answer as 3 * 0.1, and since 0.1 also can't be represented exactly, multiplying by 3 multiplies up the rounding error in the 0.1. Computing it as 3 / 10 gives an answer which is as close to the true value of "0.3" as possible.

Full implementations of strtod look very complex because they need to do proper rounding of the result, to take all the above into account. If a system already provides strtod then it makes sense to delegate to that to do the string parsing.

My feeling is that it's probably a good idea to include the patch here to get accurate results parsing small fractions, like 0.3.

Ecco · 2018-09-19T09:44:23Z

If a system already provides strtod then it makes sense to delegate to that to do the string parsing.

Agreed. One could argue that it can also be useful to have a consistent behavior across all builds of uPy. Anyway, I think this is outside of the scope of that specific pull request 😄

My feeling is that it's probably a good idea to include the patch here to get accurate results parsing small fractions, like 0.3.

Great news! I'm preparing a commit with single-precision support then. Thanks!

dpgeorge · 2018-09-20T05:35:03Z

Thanks @Ecco, it looks really good now!

One minor thing: EXACT_POWER_OF_10 can actually be higher than you have it. This is because 10 is factorised into 2 and 5, and 2 can be absorbed into the base-2 exponent of a floating point number. Eg if I have 10^15 then 5^15 is stored in the mantissa and the exponent goes up by 15. So the largest power of 10 for double precision is log_5(2^52)=22.4, so EXACT_POWER_OF_10 should be 22. And 9 for single precision. (And then the test should be inclusive of the value of EXACT_POWER_OF_10).

You can see how this works by evaluating (in Python) (1e22).hex() and comparing with hex(5**22<<1).

Ecco · 2018-09-20T07:55:56Z

This is because 10 is factorised into 2 and 5, and 2 can be absorbed into the base-2 exponent of a floating point number.

That's a very good point! Nice arithmetics :-) Updated patch incoming, with a huge-ass comment :-)

dpgeorge · 2018-09-20T12:29:45Z

Thanks! Merged in b768cc6 with minor change: > -EXACT_POWER_OF_10 becomes >= -EXACT_POWER_OF_10. Further tests added in 9849209

This also removes the need to pin share because we don't use the status LED while user code is running. The status flashes fallback to the HW_STATUS LED if no RGB LED is present. Each status has a unique blink pattern as well. One caveat is the REPL state. In order to not pin share, we set the RGB color once. PWM and single color will be shutoff immediately but DotStars and NeoPixels will hold the color until the user overrides it. Fixes micropython#4133

Ecco mentioned this pull request Sep 13, 2018

Floating-point behavior is not consistent with IEEE754 #4132

Closed

pfalcon reviewed Sep 13, 2018

View reviewed changes

dpgeorge reviewed Sep 18, 2018

View reviewed changes

py/parsenum: Avoid rounding errors with negative powers-of-10

4839502

dpgeorge closed this Sep 20, 2018

Ecco mentioned this pull request Nov 3, 2018

License forbids modification, but pull requests accepted numworks/epsilon#38

Closed

ddiminnie mentioned this pull request Apr 3, 2019

Win32 micropython builds corrupt memory upon repeated execution of a particular function #4652

Closed

Uh oh!

Fix #4132 #4133

Fix #4132 #4133

Uh oh!

Conversation

Ecco commented Sep 13, 2018

Uh oh!

pfalcon Sep 13, 2018

Choose a reason for hiding this comment

Uh oh!

pfalcon Sep 13, 2018

Choose a reason for hiding this comment

Uh oh!

dpgeorge commented Sep 13, 2018

Uh oh!

Ecco commented Sep 13, 2018

Uh oh!

dpgeorge commented Sep 14, 2018

Uh oh!

Ecco commented Sep 14, 2018

Uh oh!

pfalcon commented Sep 14, 2018

Uh oh!

Ecco commented Sep 14, 2018

Uh oh!

dpgeorge commented Sep 18, 2018

Uh oh!

Ecco commented Sep 18, 2018

Uh oh!

Ecco commented Sep 18, 2018

Uh oh!

Ecco commented Sep 18, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dpgeorge Sep 18, 2018

Choose a reason for hiding this comment

Uh oh!

Ecco Sep 18, 2018

Choose a reason for hiding this comment

Uh oh!

dpgeorge Sep 18, 2018

Choose a reason for hiding this comment

Uh oh!

Ecco Sep 18, 2018

Choose a reason for hiding this comment

Uh oh!

dpgeorge commented Sep 19, 2018

Uh oh!

Ecco commented Sep 19, 2018

Uh oh!

dpgeorge commented Sep 20, 2018

Uh oh!

Ecco commented Sep 20, 2018

Uh oh!

dpgeorge commented Sep 20, 2018

Uh oh!

Uh oh!

Ecco commented Sep 18, 2018 •

edited

Loading