py/objint.c: Add support for int.bit_length(). #11679

dmazzella · 2023-06-01T08:43:37Z

Add support for int.bit_length()

Closes #4065 (feature request issue)

github-actions · 2023-06-01T08:56:40Z

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
        rp2:    +0 +0.000% PICO

codecov · 2023-06-01T12:43:59Z

Codecov Report

Merging #11679 (9dd4f8f) into master (5159304) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 9dd4f8f differs from pull request most recent head d6ee656. Consider uploading reports for the commit d6ee656 to get more accurate results

@@           Coverage Diff           @@
##           master   #11679   +/-   ##
=======================================
  Coverage   98.40%   98.40%           
=======================================
  Files         156      156           
  Lines       20609    20628   +19     
=======================================
+ Hits        20281    20300   +19     
  Misses        328      328

Impacted Files	Coverage Δ
py/mpz.h	`100.00% <ø> (ø)`
py/mpz.c	`100.00% <100.00%> (ø)`
py/objint.c	`100.00% <100.00%> (ø)`
py/objint_mpz.c	`100.00% <100.00%> (ø)`

Signed-off-by: Damiano Mazzella <damianomazzella@gmail.com>

massimosala · 2023-06-05T09:52:32Z

Hi

Wondering about the possible use-cases, and the impact on firmware size,
I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

Rationale:

* many CPUs have specific instructions for these two operations on native integers
* the routines working only on native integers are smaller and easier to maintain
* the code path is more efficient, as it doesn't have to branch based on the object type

With my proposal, I think we can have the two features, bit_length and bit_count, which require less memory than just the bit_length proposal above.

IMHO these scaled down implementations will cover 99% of their real usage.

Addendum: if for special cases (?) you need to use "big integers", you can code the two routines in python and use @micropython.native
In the docs for bit_length and bit_count, we can show the python code for "big integers".

@dpgeorge, what do you think?
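For reference, the pure-Python fallback suggested in the addendum could look roughly like the sketch below. The function names are hypothetical and not part of this PR; the semantics assumed here are CPython's, where both operations act on the absolute value. On MicroPython the functions could additionally be decorated with @micropython.native, which is omitted so the sketch also runs on CPython.

```python
def bit_length(n):
    """Number of bits needed to represent abs(n); 0 for n == 0."""
    n = abs(n)
    bits = 0
    while n:
        n >>= 1
        bits += 1
    return bits


def bit_count(n):
    """Number of set bits in abs(n)."""
    n = abs(n)
    count = 0
    while n:
        n &= n - 1  # clear the lowest set bit
        count += 1
    return count


print(bit_length(-37), bit_length(1 << 100), bit_count(0b1011))
```

Both loops work on arbitrary-precision values, so they do not depend on whether the integer is stored as a small int or as a big int.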

robert-hh · 2023-06-05T11:04:24Z

I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

A user cannot tell which internal representation is used for an integer, and Python code must just work irrespective of the internal representation of a number. With your proposal there is the risk that bit_length() and bit_count() sometimes work and sometimes not. That is hardly acceptable.
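A short illustration of this point, using CPython's int.bit_length() as the reference behaviour. The 2**30 boundary is an assumption about where small ints end on a typical 32-bit MicroPython build; the script itself never needs to know that value, which is exactly the argument being made.

```python
def bit_lengths(values):
    """Return bit_length() for each value, with no knowledge of the storage format."""
    return [v.bit_length() for v in values]


# Assumed small-int boundary on a 32-bit build; purely illustrative.
boundary = 1 << 30
values = [boundary - 1, boundary, boundary + 1, 1 << 64]
print(bit_lengths(values))
```

If bit_length() only worked on native integers, the same call would succeed for the first value and fail for the others, even though all four are plain int objects to the user.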

jimmo

FWIW, this PR is +128 bytes on PYBV11.

I have implemented my suggestions in the review comments here: master...jimmo:micropython:int-bit-length

It comes out slightly smaller at +108 bytes.

I also tried a version that used builtin intrinsic bit operations (specifically CLZ). This didn't help with code size, and because we still need to provide fallbacks, it made the code a lot more complicated.

jimmo · 2023-06-05T14:14:57Z

ports/unix/variants/coverage/mpconfigvariant.h

@@ -42,3 +42,8 @@
 #define MICROPY_TRACKED_ALLOC          (1)
 #define MICROPY_WARNINGS_CATEGORY      (1)
 #define MICROPY_PY_UCRYPTOLIB_CTR      (1)
+
+// Enable int.bit_length(n)
+#ifndef MICROPY_INT_BIT_LENGTH


This needs to be defined (and defaulted) in mpconfig.h.

In general, rather than enabling things in the coverage variant, default them to MICROPY_CONFIG_ROM_LEVEL_AT_LEAST_EVERYTHING in mpconfig.h, which the coverage build will then enable.

Also for consistency I think this should be called MICROPY_PY_BUILTINS_INT_BIT_LENGTH.

jimmo · 2023-06-05T14:15:21Z

py/mpz.c

+        return 0;
+    }
+
+    mpz_t *dest = mpz_clone(n);


Instead of copying the mpz into a new integer and modifying it, you can just count the digits in the mpz.
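The idea of counting digits instead of cloning can be sketched in Python as follows. This is an illustration only, not the code from the PR or from the linked branch: the list-of-digits representation, the helper names, and the 16-bit digit width are all assumptions made for the example.

```python
DIG_SIZE = 16  # assumed digit width in bits


def to_digits(n):
    """Split abs(n) into little-endian unsigned digits (hypothetical helper)."""
    n = abs(n)
    digits = []
    while n:
        digits.append(n & ((1 << DIG_SIZE) - 1))
        n >>= DIG_SIZE
    return digits


def bit_length_from_digits(digits):
    """Bit length from a digit list with no leading zero digits; [] means 0."""
    if not digits:
        return 0
    # Every digit below the top one contributes a full DIG_SIZE bits.
    bits = (len(digits) - 1) * DIG_SIZE
    # Only the most significant digit needs a bit-by-bit scan.
    top = digits[-1]
    while top:
        top >>= 1
        bits += 1
    return bits


print(bit_length_from_digits(to_digits(1 << 100)))
```

Nothing is copied or modified here: the input digits are only read, and the loop touches a single digit regardless of how large the number is.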

jimmo · 2023-06-05T14:16:29Z

py/objint.c

+#if MICROPY_INT_BIT_LENGTH
+STATIC mp_obj_t int_bit_length(size_t n_args, const mp_obj_t *args) {
+    (void)n_args;
+    #if MICROPY_LONGINT_IMPL == MICROPY_LONGINT_IMPL_MPZ


This unnecessarily creates an mpz if the input is a small int.

Instead, these functions should handle small ints separately, and only send them off to mpz if necessary.

jimmo · 2023-06-05T14:17:25Z

py/objint.c

+    #else
+    mp_uint_t dest = MP_OBJ_SMALL_INT_VALUE(args[0]);
+    mp_uint_t num_bits = 0;
+    while (dest > 0) {


This does not work for negative numbers. It will always return 32 (or 64 depending on sizeof(mp_uint_t)) as it will just keep shifting down the sign bit.

The reason the tests pass with negative numbers is that this code is never hit.
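The effect can be reproduced with a small Python emulation of the C loop. The loop body (shift right by one, increment the counter) is an assumption, since the excerpt stops at the `while` line; the function names are hypothetical.

```python
def loop_as_written(value, width=32):
    """Emulate the C loop with value reinterpreted as an unsigned width-bit word."""
    # Assigning a negative mp_int_t to mp_uint_t keeps the two's complement bits.
    dest = value & ((1 << width) - 1)
    num_bits = 0
    while dest > 0:
        dest >>= 1      # assumed loop body, not shown in the excerpt
        num_bits += 1
    return num_bits


def loop_on_magnitude(value, width=32):
    """Same loop applied to abs(value), which matches CPython's definition."""
    return loop_as_written(abs(value), width)


print(loop_as_written(-37), loop_as_written(-37, 64), loop_on_magnitude(-37))
```

For any negative input the top bit of the unsigned word is set, so the first function returns the full word width, while CPython returns 6 for -37.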

jimmo · 2023-06-05T14:18:05Z

py/objint_mpz.c

+    if (n == &n_temp) {
+        mpz_deinit(n);
+    }
+    return mp_obj_new_int_from_ull(res);


res will always fit in a small int, so can just use MP_OBJ_NEW_SMALL_INT directly.

jimmo · 2023-06-05T14:18:47Z

py/objint_mpz.c

+#if MICROPY_INT_BIT_LENGTH
+mp_obj_t mp_obj_int_mpz_bit_length(mp_obj_t size) {
+    mpz_t n_temp;
+    mpz_t *n = mp_mpz_for_int(size, &n_temp);


As above, if you handle small ints directly, you don't need to allocate an mpz here.

jimmo · 2023-06-05T14:19:30Z

tests/basics/int_bit_length.py

+    print('SKIP')
+    raise SystemExit
+
+n = -37


Needs tests for big ints (i.e. mpz). Also some more interesting bit patterns around the boundaries.
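A possible shape for such tests, in the print-and-compare style of the existing test file. The choice of shift values is an assumption, picked to straddle likely digit and word boundaries and to reach well into big-int territory; the helper name is hypothetical.

```python
def boundary_values(shifts=(0, 1, 15, 16, 30, 31, 32, 62, 63, 64, 128)):
    """Values just below, at, and just above each power of two, in both signs."""
    values = [0]
    for s in shifts:
        p = 1 << s
        for v in (p - 1, p, p + 1):
            values.extend((v, -v))
    return values


for n in boundary_values():
    print(n, n.bit_length())
```

The values around 1 << 64 and 1 << 128 exercise the big-int path, and the negated values check that the sign does not affect the result.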

jimmo · 2023-06-05T14:52:58Z

py/objint.c

+    return mp_obj_new_int_from_uint(num_bits);
+    #endif
+}
+STATIC MP_DEFINE_CONST_FUN_OBJ_VAR_BETWEEN(int_bit_length_obj, 0, 1, int_bit_length);


We also need to provide this for the "long long" implementation.

jimmo · 2023-06-05T15:01:29Z

@massimosala

Wondering about the possible use-cases, and the impact on firmware size, I propose to have bit_length and bit_count working only on native integers, not on mpn/mpz.

As @robert-hh has pointed out, MicroPython's integer type is a "small integer" until it needs to be big. But the integer type itself is what says that it has a bit_length function. So what you're proposing is that bit_length just stops working (raises not implemented?) when the integer is too big?

Rationale:

* **many cpus have specific instructions for these two operations, on native integers**

There are operations like clz, but they are not universally available. Additionally, different compilers have different ways of accessing these instructions, plus we have to provide a fallback. Also they come with quirks (e.g. clz is undefined if the input is zero), which adds to code size.

* the routines working only on native integers are smaller and easier to maintain

See my proposed version linked above, the MPZ version is actually simpler because MPZ works entirely in unsigned digits.

* the code path is more efficient, as it doesn't have to branch based on the object type

You still have to fork to decide at runtime whether the given integer is small or big. You end up with extra code to handle the big case anyway (e.g. raising an error).

With my proposal, I think we can have the two features, bit_length and bit_count, which require less memory than just the bit_length proposal above.

Unfortunately this is not the case. These things have to be measured and it takes time and effort and the results can be counterintuitive.

dmazzella · 2023-06-06T07:46:44Z

@jimmo good byte savings and optimizations, I fully agree with your review.

massimosala · 2023-06-13T10:11:44Z

@jimmo wrote:
... results can be counterintuitive

I see, I was too focused on bit banging, where smallints are fine.

If I need to, I'll write a specialized viper routine.
And eventually I will compare its performance against this generic implementation ;-)

massimosala · 2023-06-23T17:36:57Z

FWIW, this PR is +128 bytes on PYBV11.

I have implemented my suggestions in the review comments here: master...jimmo:micropython:int-bit-length

It comes out slightly smaller at +108 bytes.

I also tried a version that used builtin intrinsic bit operations (specifically CLZ). This didn't help with code size, and because we still need to provide fallbacks, it made the code a lot more complicated.

A very nice code rewrite.
If I understand correctly, the loop runs only on the last digit.

Quite smart; it beats both Damiano's code and my attempt to improve it.
(#4065)

Thanks to Damiano for getting this started and to Jim for the improvement.
I hope bit_length will be included in the next MP release.

massimosala · 2023-06-23T18:11:29Z

@jimmo

Hi Jim

Is it possible to add your bit_length to the milestones for MP 1.21?

projectgus · 2024-03-07T23:50:17Z

This is an automated heads-up that we've just merged a Pull Request
that removes the STATIC macro from MicroPython's C API.

See #13763

A search suggests this PR might apply the STATIC macro to some C code. If it
does, then the next time you rebase the PR (or merge from master), please
replace all the STATIC keywords with static.

Although this is an automated message, feel free to @-reply to me directly if
you have any questions about this.

py/objint.c: Add support for int.bit_length

0811b35

dmazzella changed the title ~~py/objint.c: Add support for int.bit_length~~ py/objint.c: Add support for int.bit_length(). Jun 1, 2023

dmazzella added 3 commits June 1, 2023 11:09

code formatting

b71f370

py/objint.c: Add support for int.bit_length().

1747ae9

py/objint.c: Add support for int.bit_length().

9dd4f8f

py/objint.c: Add support for int.bit_length().

d6ee656

Signed-off-by: Damiano Mazzella <damianomazzella@gmail.com>

dpgeorge added the py-core Relates to py/ directory in source label Jun 2, 2023

jimmo reviewed Jun 5, 2023

robert-hh mentioned this pull request Sep 11, 2023

Implement int.bit_length and struct.pack("e") #12422

Closed

This was referenced Feb 29, 2024

global: Remove the STATIC macro. #13763

Merged

global: Prune trailing whitespace. #13777

Closed
