Fix #4132 #4133
Conversation
py/parsenum.c
Outdated
@@ -295,7 +295,11 @@ mp_obj_t mp_parse_num_decimal(const char *str, size_t len, bool allow_imag, bool
         exp_val -= SMALL_NORMAL_EXP;
         dec_val *= SMALL_NORMAL_VAL;
     }
-    dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
+    if (exp_val > 0) {
+        dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
Incorrect indentation.
py/parsenum.c
Outdated
@@ -295,7 +295,11 @@ mp_obj_t mp_parse_num_decimal(const char *str, size_t len, bool allow_imag, bool
         exp_val -= SMALL_NORMAL_EXP;
         dec_val *= SMALL_NORMAL_VAL;
     }
-    dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
+    if (exp_val > 0) {
Please add a comment why this is needed.
Thanks for the contribution. You'll notice that one of the float tests now fails: an existing number that parsed correctly now doesn't. It's tough to get this right, and a lot of work has already been put into the parsing routine to at least make it correct to within 1 LSB of double precision, which IMO is good enough. Here is a test script I use to check the float parsing; it uses 1016 troublesome float values from the CPython test suite (the original source is listed in the file). With strict mode disabled in the test (so it accepts accuracy within 1 LSB), MicroPython parses all values correctly: floating_points2.py.txt. It would be great to get all those tests passing, but it's not easy!
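For readers unfamiliar with the "within 1 LSB" criterion mentioned above, here is a rough illustrative sketch in C of what such a check could look like (this is not the linked floating_points2.py.txt test script; the helper name and the use of the host strtod as the reference are assumptions for illustration):

```c
#include <math.h>
#include <stdbool.h>

// Hypothetical helper: accept a parsed value if it equals the reference value,
// or is at most one representable double away from it in either direction.
static bool within_one_lsb(double parsed, double reference) {
    return parsed == reference
        || parsed == nextafter(reference, -INFINITY)
        || parsed == nextafter(reference, INFINITY);
}

// Possible usage, taking the host libc parser (strtod, from <stdlib.h>) as reference:
//   double ref = strtod("2.2250738585072011e-308", NULL);
//   bool ok = within_one_lsb(parser_under_test_result, ref);
```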
Great, thanks @dpgeorge. Out of curiosity, this seems a bit like reinventing the wheel (well, the libc at least). Why not leverage strtod(3) or something similar?
I guess it's because, when this project started, there was zero dependency on any external libraries, and then required functionality was added as needed and in a minimal way. It would be good to first understand the general algorithm of strtod functions and how they differ from what is done here.
It definitely makes sense. There is one drawback though: if MicroPython is integrated alongside some other apps/libraries, this may end up duplicating code. In this case, if one of the other libraries uses strtod, we end up with two ASCII-to-double routines.
@Ecco: The schedule we came to (which, as usual, is likely spread around tickets and not documented in one place, and definitely not fully executed):
The level of execution of these ideas varies, and some of them (well, all of them in general) conflict, depending on the use case. For example, for the unix port, float handling definitely should be offloaded to libc. But for the STM32 port, where the exact float parameters are known, why depend on outside bloat if we can do it better? Even for the unix port, there's a desire to optimize for particular targets. The current state is thus somewhat of a mix-up, and it needs further elaboration (making it all configurable, based on the requirements set out in the last paragraph).
Thank you for the clear explanation. Makes a lot of sense, thanks! There is one great advantage to internalizing float parsing: it will behave the same on all platforms.
@Ecco as for this particular patch here, IMO it doesn't really change the situation much for the float parser. It may parse some floats "better", but others it parses "worse" (see the failing test case). And since there are many floats (approx 2^32, or 2^64 for double precision) it's hard to quantify how much better this patch makes things overall. Since the existing parser gives results within 1 LSB of the float precision, and this patch doesn't improve that but increases code size, I'd be inclined not to merge this. I'd rather have a full fix that makes it 100% compatible with other parsers, otherwise it'll just be patch on patch not really going anywhere.
Exactly my conclusion as well… I could argue that it parses "more common" floats better, but I agree the argument is weak and subjective.
Now, unfortunately, I don't think there is a fix using the current method. To avoid losing accuracy, I think you need to work using integers all the way, and build the floating-point value at the very last step from the mantissa, exponent, and sign.
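A minimal sketch of that idea (illustrative only, not MicroPython's parser; the function name is made up): accumulate the digits in an integer, track the decimal exponent and sign separately, and convert to floating point only at the end. A correct strtod additionally needs overflow handling of the integer accumulator, extended precision, and careful rounding, which is exactly the hard part.

```c
#include <stdint.h>

// Sketch: build the value from (mantissa, decimal exponent, sign) at the end.
// While the mantissa stays below 2^53 the conversion to double is exact, so
// rounding error only enters when the decimal exponent is applied.  This naive
// loop can still be off by a few ULPs; real implementations apply the exponent
// using big integers or extended precision to guarantee correct rounding.
static double make_float(uint64_t mantissa, int dec_exp, int sign) {
    double d = (double)mantissa; // exact for mantissa < 2^53
    while (dec_exp > 0) { d *= 10.0; dec_exp--; }
    while (dec_exp < 0) { d /= 10.0; dec_exp++; }
    return sign < 0 ? -d : d;
}
```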
Here's an updated patch which I think might be worth merging. It probably doesn't fix all cases, but I think it's a net improvement over the previous version. As explained in the comment, it favors exact operations for the small subset of cases where we manipulate small powers-of-10 exponents. The added code is super small (just a couple of comparisons) and fixes some very visible cases (…).
py/parsenum.c
Outdated
// turns out small positive powers of 10 do, whereas small negative powers of 10 don't.
// So in that case, we'll yield a division of exact values rather than a multiplication
// of slightly erroneous values.
if (exp_val < 0 && exp_val > -15) { // 15 = log10(2^53)
Remember that this code also needs to work with single precision floats, so -15 would be something like -6 if single precision is used.
Indeed. Would you consider merging it if I fixed this?
Let me first run some tests with the patch as it is, to see how it changes things. I understand from the logic/comment you wrote what the intention is and it does sound like a good idea.
Great, makes perfect sense. If you decide you want to merge it, I'll be happy to make it work with single-precision floats too (well, it's really just an #if statement…) 😄
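For reference, a hedged sketch of what that #if could look like, assuming the existing MICROPY_FLOAT_IMPL configuration macros and using the -15/-6 cutoffs mentioned in the review above (EXACT_DEC_EXP_MIN is a made-up name for illustration):

```c
// Hypothetical precision guard: pick the exponent cutoff to match the float
// implementation.  15 ~ log10(2^53) for double precision, roughly 6 for
// single precision (log10(2^24)), as suggested in the review comments above.
#if MICROPY_FLOAT_IMPL == MICROPY_FLOAT_IMPL_DOUBLE
#define EXACT_DEC_EXP_MIN (-15)
#else
#define EXACT_DEC_EXP_MIN (-6)
#endif

// ... the parser check would then become:
// if (exp_val < 0 && exp_val > EXACT_DEC_EXP_MIN) { ... }
```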
@Ecco I tested that TCL strtod implementation that you linked to above, and it actually performs worse than the current uPy implementation for the floating_points2.py test suite I posted above: it doesn't handle subnormals, and fails another 357 tests (they are off by 1 LSB, like uPy). uPy also fails 357 tests, but I can get that down to 296 by using a variant of your patch here which uses division (instead of multiplication) by the power of 10 for exp values between -1 and -22 inclusive (for the double-precision case).

The real issue is that the uPy parser will sometimes not give the closest floating-point representation of the input string (e.g. for …). Full implementations of strtod look very complex because they need to do proper rounding of the result, to take all the above into account. If a system already provides strtod then it makes sense to delegate to that to do the string parsing.

My feeling is that it's probably a good idea to include the patch here to get accurate results parsing small fractions, like 0.3.
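For clarity, a sketch of the division variant described above, in the style of the diff under review (the -22 bound is the double-precision case mentioned in the comment; 10^22 is the largest power of 10 that is exactly representable as a double):

```c
// Sketch (double-precision case): for exponents in [-22, -1], 10^-exp_val is
// exactly representable, so dividing by an exact power of 10 incurs a single
// rounding, whereas multiplying by pow(10, exp_val) multiplies by a value
// that has already been rounded.
if (exp_val < 0 && exp_val >= -22) {
    dec_val /= MICROPY_FLOAT_C_FUN(pow)(10, -exp_val);
} else {
    dec_val *= MICROPY_FLOAT_C_FUN(pow)(10, exp_val);
}
```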
Agreed. One could argue that it can also be useful to have consistent behavior across all builds of uPy. Anyway, I think this is outside the scope of this specific pull request 😄
Great news! I'm preparing a commit with single-precision support then. Thanks!
Thanks @Ecco, it looks really good now! One minor thing: … You can see how this works by evaluating (in Python): …
That's a very good point! Nice arithmetic :-) Updated patch incoming, with a huge-ass comment :-)
This also removes the need to pin share because we don't use the status LED while user code is running. The status flashes fall back to the HW_STATUS LED if no RGB LED is present. Each status has a unique blink pattern as well. One caveat is the REPL state: in order to not pin share, we set the RGB color once. PWM and single-color LEDs will be shut off immediately, but DotStars and NeoPixels will hold the color until the user overrides it. Fixes micropython#4133
This patch avoids multiplying by negative powers of 10 when parsing floating-point values.
Similar behavior can be found in some implementations of strtod.
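As a small standalone illustration of the effect this description refers to (this is plain C, not MicroPython code, and the exact output can vary with the platform's libm): multiplying by the already-rounded value pow(10, -1) can land one LSB away from the correctly rounded result for 0.3, while dividing by the exact value 10 matches what strtod produces.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // 3 * 10^-1: pow(10, -1) is already rounded, so the product can be off by 1 LSB.
    double via_mul = 3.0 * pow(10.0, -1);
    // 3 / 10: 10 is exact, so the division rounds only once.
    double via_div = 3.0 / pow(10.0, 1);
    // Reference: the host libc parser.
    double via_strtod = strtod("0.3", NULL);
    printf("mul:    %.17g\n", via_mul);
    printf("div:    %.17g\n", via_div);
    printf("strtod: %.17g\n", via_strtod);
    return 0;
}
```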