Skip to content

py/nlrthumb: Make non-Thumb2 workaround opt-in. #11353

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 25, 2024

Conversation

neuschaefer
Copy link
Contributor

@neuschaefer neuschaefer commented Apr 27, 2023

py/nlrthumb: Make non-Thumb2 workaround opt-in.

Although the original motivation given for the workaround[1] is correct,
nlr.o and nlrthumb.o are linked with a small enough distance that the
problem does not occur, and the workaround isn't necessary. The distance
between the b instruction and its target (nlr_push_tail) is just 64
bytes[2], well within the ±2046 byte range addressable by an
unconditional branch instruction in Thumb mode.

The workaround induces a relocation in the text section (textrel), which
isn't supported everywhere, notably not on musl-libc3, where it causes
a crash on start-up. With the workaround removed, micropython works on an
ARMv5T Linux system built with musl-libc.

This commit changes nlrthumb.c to use a direct jump by default, but
leaves the long jump workaround as an option for those cases where it's
actually needed.

[1]: commit dd376a2

Author: Damien George damien.p.george@gmail.com
Date: Fri Sep 1 15:25:29 2017 +1000

py/nlrthumb: Get working again on standard Thumb arch (ie not Thumb2).

"b" on Thumb might not be long enough for the jump to nlr_push_tail so
it must be done indirectly.

[2]: Excerpt from objdump -d micropython:

000095c4 <nlr_push_tail>:
    95c4:       b510            push    {r4, lr}
    95c6:       0004            movs    r4, r0
    95c8:       f02d fd42       bl      37050 <mp_thread_get_state>
    95cc:       6943            ldr     r3, [r0, #20]
    95ce:       6023            str     r3, [r4, #0]
    95d0:       6144            str     r4, [r0, #20]
    95d2:       2000            movs    r0, #0
    95d4:       bd10            pop     {r4, pc}

000095d6 <nlr_pop>:
    95d6:       b510            push    {r4, lr}
    95d8:       f02d fd3a       bl      37050 <mp_thread_get_state>
    95dc:       6943            ldr     r3, [r0, #20]
    95de:       681b            ldr     r3, [r3, #0]
    95e0:       6143            str     r3, [r0, #20]
    95e2:       bd10            pop     {r4, pc}

000095e4 <nlr_push>:
    95e4:       60c4            str     r4, [r0, #12]
    95e6:       6105            str     r5, [r0, #16]
    95e8:       6146            str     r6, [r0, #20]
    95ea:       6187            str     r7, [r0, #24]
    95ec:       4641            mov     r1, r8
    95ee:       61c1            str     r1, [r0, #28]
    95f0:       4649            mov     r1, r9
    95f2:       6201            str     r1, [r0, #32]
    95f4:       4651            mov     r1, sl
    95f6:       6241            str     r1, [r0, #36]   @ 0x24
    95f8:       4659            mov     r1, fp
    95fa:       6281            str     r1, [r0, #40]   @ 0x28
    95fc:       4669            mov     r1, sp
    95fe:       62c1            str     r1, [r0, #44]   @ 0x2c
    9600:       4671            mov     r1, lr
    9602:       6081            str     r1, [r0, #8]
    9604:       e7de            b.n     95c4 <nlr_push_tail>

Signed-off-by: Jonathan Neuschäfer j.neuschaefer@gmx.net

@neuschaefer neuschaefer force-pushed the textrel branch 2 times, most recently from d924144 to 57362e2 Compare April 27, 2023 09:48
@github-actions
Copy link

github-actions bot commented Apr 27, 2023

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Apr 27, 2023
@dpgeorge
Copy link
Member

nlr.o and nlrthumb.o are linked with a small enough distance that the problem does not occur, and the workaround isn't necessary

Is it possible to guarantee that these files are always linked close enough?

The workaround induces a relocation in the text section (textrel), which isn't supported everywhere, notably not on musl-libc, where it causes a crash on start-up

I'm curious: how does relocation affect a libc implementation?

@neuschaefer
Copy link
Contributor Author

Is it possible to guarantee that these files are always linked close enough?

My pragmatic approach would be to have a Thumb target in CI, so that it fails loudly when this constraint is violated.

I'm curious: how does relocation affect a libc implementation?

This a dynamic relocation, performed by the dynamic linker/loader (ld.so) at program start-up, and musl brings its own ld.so.

@codecov-commenter
Copy link

codecov-commenter commented Apr 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.39%. Comparing base (49af8ca) to head (7b050b3).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #11353   +/-   ##
=======================================
  Coverage   98.39%   98.39%           
=======================================
  Files         161      161           
  Lines       21204    21204           
=======================================
  Hits        20864    20864           
  Misses        340      340           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dlech
Copy link
Contributor

dlech commented May 4, 2023

Can we not call nlr_push_tail() from C instead of assembly and let the compiler figure out what is best?

@neuschaefer
Copy link
Contributor Author

Can we not call nlr_push_tail() from C instead of assembly and let the compiler figure out what is best?

As a further refactoring, this may be worthwhile, but for now I think my patch is sufficient.

@dpgeorge
Copy link
Member

dpgeorge commented May 9, 2023

My pragmatic approach would be to have a Thumb target in CI, so that it fails loudly when this constraint is violated.

I don't think that's enough. There could be a Thumb target out in the wild that breaks because of this change.

Can we not call nlr_push_tail() from C instead of assembly and let the compiler figure out what is best?

I don't think that will work in general because these assembler functions are "naked" and we can't put any C code in it (only inline assembler).

The only solution I can think of is to add a compile-time config option (eg MICROPY_NLR_THUMB_FORCE_SHORT_JUMP) that forces a short jump.

@neuschaefer
Copy link
Contributor Author

The only solution I can think of is to add a compile-time config option (eg MICROPY_NLR_THUMB_FORCE_SHORT_JUMP) that forces a short jump.

That sounds reasonable, but I would add a migration path off it:

  • Invert the option, i.e. make it MICROPY_NLR_THUMB_FORCE_LONG_JUMP or similar
  • Use the short jump by default, and deprecate the option
  • A few releases down the line, remove the option

This should cover the risk that there's a target that depends on the old code (by giving its maintainers time to notice and speak up), without complicating the codebase forever.


I am, however, not sure how a target would break due to the short jump, except if it gratuitously changes the link order to place nlr.o and nlrthumb.o far apart, which seems like quite an unnecessary change to make.

@dpgeorge
Copy link
Member

  • Invert the option, i.e. make it MICROPY_NLR_THUMB_FORCE_LONG_JUMP or similar

Yes I'm happy with this. Will you make the change to this PR?

@neuschaefer
Copy link
Contributor Author

I'll do it.

@neuschaefer neuschaefer changed the title py/nlrthumb: Remove non-Thumb2 workaround. py/nlrthumb: Make non-Thumb2 workaround opt-in. Jun 18, 2023
@neuschaefer
Copy link
Contributor Author

After a delay due to personal circumstances, I finally got around to iterating this PR.

@neuschaefer
Copy link
Contributor Author

neuschaefer commented Jun 18, 2023

The unix port / coverage_32bit CI shows a test failure in thread/thread_exc1.py. Not sure what the cause is, especially since the tests execute on x86-32.

This error is now gone.

py/nlrthumb.c Outdated
@@ -71,7 +79,7 @@ __attribute__((naked)) unsigned int nlr_push(nlr_buf_t *nlr) {
"str lr, [r0, #8] \n" // store lr into nlr_buf
#endif

#if !defined(__thumb2__)
#if !defined(__thumb2__) && MICROPY_NLR_THUMB_FORCE_LONG_JUMP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest calling the option MICROPY_NLR_THUMB_USE_LONG_JUMP, and removing the defined(__thumb2__) bit from the logic. Ie it would simply be:

#if MICROPY_NLR_THUMB_USE_LONG_JUMP
...

Then the default is to just do a standard b-instruction for the branch, and if anyone runs into linker issues they can just enable the option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @dpgeorge, thank you for your reply! I hope I'll get around to it soon.

Copy link
Contributor Author

@neuschaefer neuschaefer Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgeorge done, and tested with musl-libc, glibc, and uClibc-ng

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating, and for testing.

@neuschaefer
Copy link
Contributor Author

neuschaefer commented Apr 19, 2024

CI currently fails due to Codecov, which is broken for reasons outside of my control addressed in #14341:

[2024-04-19T12:52:33.117Z] ['error'] There was an error running the uploader: Error uploading to [https://codecov.io:](https://codecov.io/) Error: There was an error fetching the storage URL during POST: 404 - {'detail': ErrorDetail(string='Unable to locate build via Github Actions API. Please upload with the Codecov repository upload token to resolve issue.', code='not_found')}

Although the original motivation given for the workaround[1] is correct,
nlr.o and nlrthumb.o are linked with a small enough distance that the
problem does not occur, and the workaround isn't necessary. The distance
between the b instruction and its target (nlr_push_tail) is just 64
bytes[2], well within the ±2046 byte range addressable by an
unconditional branch instruction in Thumb mode.

The workaround induces a relocation in the text section (textrel), which
isn't supported everywhere, notably not on musl-libc[3], where it causes
a crash on start-up. With the workaround removed, micropython works on an
ARMv5T Linux system built with musl-libc.

This commit changes nlrthumb.c to use a direct jump by default, but
leaves the long jump workaround as an option for those cases where it's
actually needed.

[1]: commit dd376a2

Author: Damien George <damien.p.george@gmail.com>
Date:   Fri Sep 1 15:25:29 2017 +1000

    py/nlrthumb: Get working again on standard Thumb arch (ie not Thumb2).

    "b" on Thumb might not be long enough for the jump to nlr_push_tail so
    it must be done indirectly.

[2]: Excerpt from objdump -d micropython:

000095c4 <nlr_push_tail>:
    95c4:       b510            push    {r4, lr}
    95c6:       0004            movs    r4, r0
    95c8:       f02d fd42       bl      37050 <mp_thread_get_state>
    95cc:       6943            ldr     r3, [r0, micropython#20]
    95ce:       6023            str     r3, [r4, #0]
    95d0:       6144            str     r4, [r0, micropython#20]
    95d2:       2000            movs    r0, #0
    95d4:       bd10            pop     {r4, pc}

000095d6 <nlr_pop>:
    95d6:       b510            push    {r4, lr}
    95d8:       f02d fd3a       bl      37050 <mp_thread_get_state>
    95dc:       6943            ldr     r3, [r0, micropython#20]
    95de:       681b            ldr     r3, [r3, #0]
    95e0:       6143            str     r3, [r0, micropython#20]
    95e2:       bd10            pop     {r4, pc}

000095e4 <nlr_push>:
    95e4:       60c4            str     r4, [r0, micropython#12]
    95e6:       6105            str     r5, [r0, micropython#16]
    95e8:       6146            str     r6, [r0, micropython#20]
    95ea:       6187            str     r7, [r0, micropython#24]
    95ec:       4641            mov     r1, r8
    95ee:       61c1            str     r1, [r0, micropython#28]
    95f0:       4649            mov     r1, r9
    95f2:       6201            str     r1, [r0, micropython#32]
    95f4:       4651            mov     r1, sl
    95f6:       6241            str     r1, [r0, micropython#36]   @ 0x24
    95f8:       4659            mov     r1, fp
    95fa:       6281            str     r1, [r0, micropython#40]   @ 0x28
    95fc:       4669            mov     r1, sp
    95fe:       62c1            str     r1, [r0, micropython#44]   @ 0x2c
    9600:       4671            mov     r1, lr
    9602:       6081            str     r1, [r0, micropython#8]
    9604:       e7de            b.n     95c4 <nlr_push_tail>

[3]: https://www.openwall.com/lists/musl/2020/09/25/4

Signed-off-by: J. Neuschäfer <j.ne@posteo.net>
@dpgeorge dpgeorge merged commit 7b050b3 into micropython:master Apr 25, 2024
61 checks passed
@neuschaefer neuschaefer deleted the textrel branch April 25, 2024 08:15
@dpgeorge
Copy link
Member

This broke all of the nrf port boards that use nRF51xx MCUs. That's because the nrf port is using LTO.

Eg:

$ make BOARD=MICROBIT
Use make V=1 or set BUILD_VERBOSE in your environment to increase build verbosity.
GEN build-MICROBIT/genhdr/mpversion.h
CC ../../py/modsys.c
CC ../../extmod/modos.c
CC ../../extmod/modplatform.c
CC ../../shared/runtime/pyexec.c
LINK build-MICROBIT/firmware.elf
lto-wrapper: warning: using serial compilation of 5 LTRANS jobs
lto-wrapper: note: see the '-flto' option documentation for more information
build-MICROBIT/py/nlrthumb.o: in function `nlr_push':
nlrthumb.c:(.text+0x20): relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `nlr_push_tail' defined in .text section in /tmp/cc3NlVpc.ltrans0.ltrans.o
collect2: error: ld returned 1 exit status
make: *** [Makefile:467: build-MICROBIT/firmware.elf] Error 1

Because of LTO, I think the only way to fix these boards is to enable the long jump.

@dpgeorge
Copy link
Member

See #14381 for a fix for nrf boards.

@dhalbert
Copy link
Contributor

dhalbert commented Sep 9, 2024

I also encountered this when merging Micropython v1.23 into CircuitPython. At least one SAMD21 LTO build got the llnker error, so I'll need to set MICROPY_NLR_THUMB_USE_LONG_JUMP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants