Skip to content

py/objint: Add mp_obj_int_get_uint_checked() helper. #4928

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Jongy
Copy link
Contributor

@Jongy Jongy commented Jul 17, 2019

Can be used where mp_obj_int_get_checked() will overflow due
to the sign-bit solely.

Shouldn't affect code size, since it's unused at the moment.
Should any extmod/port use it, they'd "pay the price".

Addressed #2868, #2269 and possibly more.

py/objint.c Outdated
@@ -363,6 +363,10 @@ mp_int_t mp_obj_int_get_checked(mp_const_obj_t self_in) {
return MP_OBJ_SMALL_INT_VALUE(self_in);
}

mp_uint_t mp_obj_int_get_uint_checked(mp_const_obj_t self_in) {
return MP_OBJ_SMALL_INT_VALUE(self_in);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably lean to code reuse here and return mp_obj_int_get_checked so that if the implementation would change you only have to change it once. But I don't know if the compiler is smart enough to skip the extra call, nor if that matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should trust the compiler to always skip the extra call. I could introduce another helper function/macro to be used by both functions, but I don't think it matters that much.

@@ -279,6 +279,11 @@ mp_int_t mp_obj_int_get_checked(mp_const_obj_t self_in) {
return mp_obj_int_get_truncated(self_in);
}

mp_uint_t mp_obj_int_get_uint_checked(mp_const_obj_t self_in) {
// TODO: Check overflow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark as above. Bonus: you wouldn't have to repeat this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reply, as well :)

@stinos
Copy link
Contributor

stinos commented Jul 18, 2019

Shouldn't affect code size, since it's unused at the moment.

I'm not sure that is how it works. The code is there, and part of the public API so it makes it into the binary, no?

Maybe since the mp_obj_int_get_uint_checked implementation for mpz is basically the exact same as for signed int except for one function call the implementations can be merged to reduce code size, but it might be somewhat tricky due to the different return values.

@dpgeorge
Copy link
Member

I'm not sure that is how it works. The code is there, and part of the public API so it makes it into the binary, no?

If it's never used it won't make it into the binary, due to linker "garbage collection".

@stinos
Copy link
Contributor

stinos commented Jul 18, 2019

I was actually thinking in terms of static/shared libraries, my mistake, that's not the standard way MicroPython gets used, so I wonder if that principle also applies there.

@dpgeorge
Copy link
Member

I think there are a few functions which are like this, part of the API but unused by the core (eg mp_module_register, kind of).

@Jongy
Copy link
Contributor Author

Jongy commented Jul 20, 2019

I'm not sure that is how it works. The code is there, and part of the public API so it makes it into the binary, no?

As @dpgeorge also said, since MicroPython is compiled as a complete executable (and not as a shared library, a part of a bigger executable) and combined with the use of LTO / -ffunction-sections + --gc-sections, unused functions are cut off during the final linkage. Actually mpz_as_uint_checked which I call here is an example - it's not used (apart from unix/coverage builds) and it's still compiled whenever mpz is used.

@Jongy
Copy link
Contributor Author

Jongy commented Jul 20, 2019

Found another use-case for this function: machine_mem_get_addr calls mp_obj_int_get_truncated, basically because it can't call mp_obj_int_get_checked since there are legit addresses which don't fit in a signed word. So it calls mp_obj_int_get_truncated which works, but it gets messed up if you really use a value larger than an unsigned word. It should use mp_obj_int_get_uint_checked.

I can fix it here but it will cause users of machine_mem to automatically include mp_obj_int_get_uint_checked into the binary, so I'm not sure.

@dpgeorge What do you think about this function and the change?

@Jongy Jongy force-pushed the objint-get-uint branch from c2fb4a0 to 4b227b2 Compare July 20, 2019 21:09
@Jongy
Copy link
Contributor Author

Jongy commented Jul 20, 2019

Maybe since the mp_obj_int_get_uint_checked implementation for mpz is basically the exact same as for signed int except for one function call the implementations can be merged to reduce code size, but it might be somewhat tricky due to the different return

I made an attempt at it. I don't see any size diff (even when forcing the uint variant into the final binary), probably because the helper I made was inlined into them both. But I don't have time to verify that in the assembly now :(
Anyway I don't think it matters that much. I also think we better keep mp_obj_int_get_checked as it was, even if it makes the new helper function cost a bit more.

@stinos
Copy link
Contributor

stinos commented Jul 21, 2019

Anyway I don't think it matters that much. I also think we better keep mp_obj_int_get_checked as it was, even if it makes the new helper function cost a bit more.

Small incremental size additions could make for a large total.. But I mainly wonder why you think it's better to keep something, if changing it could be beneficial (don't know if it is though, didn't check)?

@Jongy
Copy link
Contributor Author

Jongy commented Jul 21, 2019

Small incremental size additions could make for a large total.. But I mainly wonder why you think it's better to keep something, if changing it could be beneficial (don't know if it is though, didn't check)?

In the long run, after there are many users for mp_obj_int_get_uint_checked, it won't matter. Currently when first pulling this PR I think it's better to insert it as diff-less as possible and allow for individual call-sites to make the choice later.

After all, my intention here is merely to allow future users to choose this API, not to enforce it on current compilations (Which have worked fine and will work fine regardless of this PR)

@Jongy Jongy mentioned this pull request Jan 1, 2020
8 tasks
py/objint_mpz.c Outdated
@@ -396,13 +396,13 @@ mp_int_t mp_obj_int_get_truncated(mp_const_obj_t self_in) {
}
}

mp_int_t mp_obj_int_get_checked(mp_const_obj_t self_in) {
STATIC mp_int_t _mp_obj_int_get_checked(mp_const_obj_t self_in, bool (*as_int)(const mpz_t *i, mp_int_t *value)) {
if (mp_obj_is_small_int(self_in)) {
return MP_OBJ_SMALL_INT_VALUE(self_in);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the intended semantics of mp_obj_int_get_uint_checked() when the arg is a negative small-int? Since it's "checked" IMO it should raise an exception, since that is what mpz_as_uint_checked() will do (if the mpz is negative). If you don't want an exception then wouldn't mp_obj_int_get_truncated() work fine instead of using this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention of using this (in relation to the Linux kernel port) is mainly to catch errors in arithmetic overflow errors before they propagate badly via the kernel FFI. From my personal experience those were usually overflows and not underflows and that's why I think I never encountered this negative small-int problem. That's also why I don't want to use mp_obj_int_get_truncated.

So yeah, it should raise an exception. I'll fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

py/objint.c Outdated
@@ -363,6 +363,10 @@ mp_int_t mp_obj_int_get_checked(mp_const_obj_t self_in) {
return MP_OBJ_SMALL_INT_VALUE(self_in);
}

mp_uint_t mp_obj_int_get_uint_checked(mp_const_obj_t self_in) {
return MP_OBJ_SMALL_INT_VALUE(self_in);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reopening the discussion of #5482 (comment) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this function.

@Jongy
Copy link
Contributor Author

Jongy commented Jan 12, 2020

I've rebased and fixed the negative small-int case for mpz. I split the entire function out because now keeping the logic for int/uint in the same function doesn't give that much of code-size benefit and has greater performance implications.

Regarding longlong and none impl - I wonder if it's even relevant to support them with this function?
mp_obj_int_get_uint_checked is currently an optional API that none of the builds are using. Builds that will use it (e.g Linux kernel) need it because they mess with unsigned machine words and they want protection from overflows. They will never use none/longlong for this case because both of them don't allow unsigned machine-word sized integers.

What do you think?

@Jongy Jongy requested a review from dpgeorge January 12, 2020 22:26
py/objint_mpz.c Outdated
if (mpz_as_uint_checked(&self->mpz, &value)) {
return value;
} else {
overflow:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid labels and goto (which is nice if possible) this function could be easily restructured so that it falls through to the bottom if there's an overflow, and the mp_raise_msg is at the bottom. Please try to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right. I dunno why I've structured it with a goto in the first place. I think it's a remnant of having a single function with mp_obj_int_get_checked.

py/obj.h Outdated
@@ -748,6 +748,8 @@ void mp_obj_cell_set(mp_obj_t self_in, mp_obj_t obj);
mp_int_t mp_obj_int_get_truncated(mp_const_obj_t self_in);
// Will raise exception if value doesn't fit into mp_int_t
mp_int_t mp_obj_int_get_checked(mp_const_obj_t self_in);
// Will raise exception if value doesn't fit into mp_uint_t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write "Will raise exception if value is negative or doesn't fit into mp_uint_t"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair

@dpgeorge
Copy link
Member

I split the entire function out because now keeping the logic for int/uint in the same function doesn't give that much of code-size benefit and has greater performance implications.

+1

Regarding longlong and none impl - I wonder if it's even relevant to support them with this function?

Let's not bother for now. Can be done in the future if needed.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Jan 13, 2020
Can be used where mp_obj_int_get_checked() will overflow due
to the sign-bit solely.
This returns an mp_uint_t, so it also verifies the given integer is
not negative.

Currently implemented only for mpz, since I don't expect this function
to be used by ports based on longlong/none impls.
@Jongy
Copy link
Contributor Author

Jongy commented Jan 13, 2020

Let's not bother for now. Can be done in the future if needed.

Removed them, and fixed other comments.

@dpgeorge
Copy link
Member

Thanks for updating. Merged in 176ab99, with a coverage test added in 3448e69

@dpgeorge dpgeorge closed this Jan 14, 2020
@Jongy
Copy link
Contributor Author

Jongy commented Jan 14, 2020

Cool, I didn't know the unix_coverage build had this "unittest"-alike file.

@Jongy Jongy deleted the objint-get-uint branch December 2, 2020 01:35
tannewt added a commit to tannewt/circuitpython that referenced this pull request Jul 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants