-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py: Fixes and test coverage for 64-bit big integer representations. #16953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
py: Fixes and test coverage for 64-bit big integer representations. #16953
Conversation
ec883a8
to
c6f3856
Compare
// and passes tests. | ||
|
||
#define MICROPY_LONGINT_IMPL (MICROPY_LONGINT_IMPL_LONGLONG) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yoctopuce suggested we test MICROPY_OBJ_REPR_C
and MICROPY_LONGINT_IMPL_LONGLONG
together, which makes sense - and I don't think we have CI coverage for the former at the moment.
@dpgeorge what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a very good idea to test MICROPY_OBJ_REPR_C
in CI.
It's enabled on esp8266 and stm32 Arduino boards, so it does get tested for a release. But would be great to test it in CI.
Enabling representation C in this new longlong variant is a good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MICROPY_OBJ_REPR_C
seems to need more work than just enabling it for this config, so I'll look at that for a follow-up PR.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #16953 +/- ##
=======================================
Coverage 98.54% 98.54%
=======================================
Files 169 169
Lines 21898 21904 +6
=======================================
+ Hits 21580 21586 +6
Misses 318 318 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code size report:
|
0dd8885
to
5d8e0c1
Compare
The other thing that might be worth considering here is 64-bit arithmetic overflow, should this raise OverflowError? (Currently arithmetic operations return the overflow result, which is UB in C - but I think on all our architectures it will wrap.) |
f8acf08
to
c862fb1
Compare
This is a bit troubling, my local |
The code only fails when run through mpy-cross (test option
2^63 is not a valid positive When using |
Ah of course, thanks! |
c862fb1
to
aa0917b
Compare
I feel like this difference in behaviour is a good example of why we might consider changing the 64-bit arithmetic operations to raise on overflow. |
I'm totally with you |
This PR now includes arithmetic overflow checks for 64-bit bigints! |
Excellent, thank you ! |
py/objint_longlong.c
Outdated
@@ -25,6 +25,7 @@ | |||
* THE SOFTWARE. | |||
*/ | |||
|
|||
#include <errno.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a bit of a nuisance to need to include this header... it sometimes doesn't exist on embedded systems.
At the very least I'd suggest moving this include to within the MICROPY_LONGINT_IMPL == MICROPY_LONGINT_IMPL_LONGLONG
guard, so it's only included if needed (because this file will be compiled on all systems regardless of the config).
In the long run, would be best to remove dependency on errno
, but I'm not sure it's worth the effort given this config is likely not used much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved it down behind the preprocessor guard.
I did start writing a version that doesn't depend on strtoll, using the approach @yoctopuce suggested for mp_parse_num_integer()
to handle 64-bit. It has potential to work quite nicely, but it wasn't particularly simple either so I shelved it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is my version, in case it helps:
const byte* restrict str = (const byte*)*str_;
const byte* restrict top = str + len;
long long int_val = 0;
for (; str < top; str++) {
// get next digit as a value
mp_uint_t dig = *str;
if ('0' <= dig && dig <= '9') {
dig -= '0';
} else if (dig == '_') {
continue;
} else {
dig |= 0x20; // make digit lower-case
if ('a' <= dig && dig <= 'z') {
dig -= 'a' - 10;
} else {
// unknown character
break;
}
}
if (dig >= (mp_uint_t)base) {
break;
}
// add next digit
int_val = int_val * base + dig;
}
// negate value if needed
if (neg) {
int_val = -int_val;
}
mp_obj_t result = mp_obj_new_int_from_ll(int_val);
*str_ = (char*)str;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got inspired to go and finish the version of this I was working on, have updated this PR with the result.
py/objint_longlong.c
Outdated
@@ -144,9 +148,22 @@ mp_obj_t mp_obj_int_unary_op(mp_unary_op_t op, mp_obj_t o_in) { | |||
} | |||
} | |||
|
|||
// 64-bit binary operations have overflow checking when supported by the compiler, | |||
// multiply is used in a few places | |||
#if __has_builtin(__builtin_mul_overflow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be wrapped in #if defined __has_builtin
. See py/misc.h
and https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005fbuiltin.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or follow lib/cmsis/inc/cmsis_gcc.h
:
#ifndef __has_builtin
#define __has_builtin(x) (0)
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also hit that __has_builtin issue when integrating this commit in my dev branch, and as __has_builtin()
is sometime undefined on compilers that do have support for __builtin_*_overflow
, I ended up using this code:
#if defined(_MSC_VER)
#define ENABLE_INTSAFE_SIGNED_FUNCTIONS
#include <intsafe.h>
#define add_overflow(lhs,rhs,res) LongLongAdd(lhs, rhs, res)
#define sub_overflow(lhs,rhs,res) LongLongSub(lhs, rhs, res)
#define mul_overflow(lhs,rhs,res) LongLongMult(lhs,rhs,res)
#elif defined(__EMSCRIPTEN__ ) || (defined(__GNUC__) && __GNUC__ >= 5)
#define add_overflow(lhs,rhs,res) __builtin_add_overflow(lhs, rhs, res)
#define sub_overflow(lhs,rhs,res) __builtin_sub_overflow(lhs, rhs, res)
#define mul_overflow(lhs,rhs,res) __builtin_mul_overflow(lhs, rhs, res)
#endif
#if !defined(add_overflow)
#include <limits.h>
__inline bool add_overflow(long long lhs, long long rhs, long long *result) {
// Check for overflow
if ((rhs > 0) && (lhs > LLONG_MAX - rhs)) {
return 1; // Overflow
}
if ((rhs < 0) && (lhs < LLONG_MIN - rhs)) {
return 1; // Overflow
}
*result = lhs + rhs;
return 0; // No overflow
}
__inline bool sub_overflow(long long lhs, long long rhs, long long *result) {
// Check for overflow
if ((rhs > 0) && (lhs < LLONG_MIN + rhs)) {
return 1; // Overflow
}
if ((rhs < 0) && (lhs > LLONG_MAX + rhs)) {
return 1; // Overflow
}
*result = lhs - rhs;
return 0; // No overflow
}
__inline bool mul_overflow(long long lhs, long long rhs, long long * result)
{
if (lhs > 0) {
if (rhs > 0) {
if (lhs > (LLONG_MAX / rhs)) {
return 1; // Overflow
}
} else {
if (rhs < (LLONG_MIN / lhs)) {
return 1; // Overflow
}
}
} else {
if (rhs > 0) {
if (lhs < (LLONG_MIN / rhs)) {
return 1; // Overflow
}
} else {
if ((lhs != 0) && (rhs <= (LLONG_MAX / lhs))) {
return 1; // Overflow
}
}
}
*result = lhs * rhs;
return 0; // No overflow
}
#endif
My embedded platforms are all using GCC >= 5, so the bigger+slower inline code only ends up in OS-based test builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive! @dpgeorge do you think it's worth pulling any/all of this into MP, or stick with __has_builtin
and fallback to not checking overflow? (__has_builtin
is GCC 10 and up, first released 2020.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I've added the check for defined __has_builtin
already.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we rely only on
__has_builtin
, we are going to significantly increase code size for all compiler version between 2015 ( GCC 5+) and 2020 (GCC 10+).
I think the impact is not that bad. If I build the Zephyr nucleo_wb55rg
board with this branch (and gcc 12) I get 242792 bytes flash size.
If I manually patch the three #if __has_builtin(*overflow)
lines to #if 0
then I get 242800 bytes flash size.
Admittedly, there is now no overflow check for addition and subtraction and it's possible GCC 12 has peephole optimised the manual mp_mul_ll_overflow
function to an overflow check in a way that older GCC can't detect.
It is a bit odd that GCC version determines whether or not an overflow is an exception, though. I'll see how hard it is to patch around that further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the implementation so the overflow checks are consistent regardless of whether __has_builtin(*overflow)
is true or not. Based on your code, so thanks for that! I think my version of the manual implementations are all sound...
The Zephyr build still only goes up by 8 bytes in code size if I swap from the built-in overflow checked ops to the manual ones, though. Even on GCC 5.x the difference in size seems neglible: https://gcc.godbolt.org/z/jhhn16dW6 (One key difference may be that the builtins return a correct "infinite precision" result even in case of overflow, whereas the manual functions can fall back on UB for overflow.)
On balance I'd rather keep the checks simple and fallback to the manual implementations rather than keep adding per-compiler tweaks here. Unless there's a case where it really does balloon out the code size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yoctopuce I'm sorry, I missed your reply from 3 days ago - GitHub hadn't loaded it into the tab I still had open so I just saw it now.
That's interesting you see such a big difference by comparison to the things I checked against. Can put in the suggested extra check for built-ins, then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yoctopuce Added the explicit GCC version check. Curious how the version that's there now compares to your tests. (It's still possible I'm doing something differently or incorrectly in a way that's impacting the code size difference.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh, clearly a case of "more haste less speed" today.
- The Zephyr port code size doesn't change by much because it isn't using
MICROPY_LONGINT_IMPL_LONG_LONG
. I don't know why I thought it did (or why a no-op source change causes an 8 byte difference in code size). - Updated the PR again to rely on the behaviour that an undefined macro has value 0, so we don't need to recursively expand a
defined
clause.
02112a4
to
8555e92
Compare
These will run on all ports which support them, but importantly they'll also run on ports that don't support arbitrary precision but do support 64-bit long ints. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
These tests cover the use of mp_obj_new_int_from_str_len when mp_parse_num_integer overflows the SMALLINT limit. Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
Leans on the test case added in the parent commit, as json.loads() seems to be a trivial way to cause this. The test case is moved to an '_int_64' variant so the json_loads.py test can still run on builds without large int support. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
This relies on errno unfortunately, but there's no other way to disambiguate MAX_LONGLONG and an out of range value (apart from manually re-implementing strtoll). Includes some test workarounds to account for things which now overflow: - uctypes_array_load_store test was failing already, now won't parse. - all the ffi_int tests contain 64-bit unsigned values, that won't parse as long long. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>
Relies on arbitrary precision math, so won't run on a port which has threads & limited bigint support. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>
The other performance tests run and pass with only 64-bit big integer support. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>
Enabled when supported via compiler builtins (i.e. gcc/clang). Also adds an error when shifting by a negative value. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
3b737ad
to
3d14d08
Compare
Will be used in a follow-up commit. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Makes it compatible with the __builtin_mul_overflow() syntax, used in follow-up commit. Signed-off-by: Angus Gratton <angus@redyak.com.au>
3d14d08
to
1962f40
Compare
If long integer support is 'long long' then mp_parse_num_integer() can parse to it directly instead of failing over from small int. One tricky case this correctly overflows on is things like int("9223372036854775808") which is one more than LLONG_MAX in decimal. No unit test case added for this as it's too hard to detect 64-bit long integer mode. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
1962f40
to
fd40fb1
Compare
Summary
As pointed out by @yoctopuce in this discussion and #16932, there is poor test coverage for the optional 64-bit bigint representation and this representation has some bugs.
This PR aims to improve this:
int_64
equivalent.longlong
unix build variant that enables 64-bit long int mode. Mostly useful for CI testing.Testing
longlong
variant on unix, looked for failures.Trade-offs and Alternatives
errno
which is a bit fraught in embedded systems. We could drop this commit out and accept the current saturating behaviour instead. The only way to correctly detect overflow without usingstrtoll
anderrno
would be to add a dependency onsscanf
or to re-implementstrtoll
ourselves (@yoctopuce suggested that when building in 64-bit mode we compilemp_parse_num_integer
to parse 64-bit integers, which could be quite a neat solution.)