py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347

projectgus · 2025-05-23T05:07:07Z

Summary

Currently if a scheduler callbacks queues a new C scheduled callback, that callback will run in the same schedule pass. This means if a scheduler callback re-queues itself, it can "infinite loop" inside the scheduler.

This change means that any call to mp_handle_pending() will only process the callbacks which were queued when mp_handle_pending() started running. Therefore any callback which re-queues itself should have that callback handled the next time mp_handle_pending() is called.

This is an alternative to #17248 and should remove the need for #17264. Fixes #17246.

This does create potential for some additional callback latency (i.e. if an interrupt fires while a different scheduler callback is running, the interrupt callback is now deferred until the next scheduler run). However the worst-case scheduler latency remains similar (interrupt fires at end of one scheduler pass, has to wait until the next one). I think most MicroPython systems only sparsely call C scheduler callbacks, so this shouldn't be noticeable in most cases.

There is also potential for a misbehaving callback that re-schedules itself continuously to degrade performance of MicroPython (as the callback runs continuously with some allowance for Python code to run), whereas before it would have locked up MicroPython entirely which is more obviously broken.

This work was funded through GitHub Sponsors.

Testing

I adapted the test case @andrewleech added in py/scheduler: warning about C callbacks scheduling new tasks. #17248 for the new behaviour and resubmitted here, so coverage test now includes C scheduler callbacks (testing the new logic).
Manually ran the rp2 unit tests on RP2_PICO board. As this is a tickless port I felt it had the most potential for an interrupt timing bug to surface due to this change. All passed.

Trade-offs and Alternatives

Possible to manually add anti-recursion checks in callbacks, i.e. approach in all/mpbthciport.c: Don't restart BLE polling if already running. #17264, but it can get fiddly quickly.

projectgus · 2025-05-23T05:09:15Z

@andrewleech Are you able to easily reproduce the issue that #17264 is fixing? Am interested to know if this fixes it OK, I noticed just now that PR links to your commit af4a8e9 which implies the problem is actually with recursive calls to mp_handle_pending() itself rather than callbacks re-adding themselves, so that might need a different fix...

codecov · 2025-05-23T05:09:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base (b153484) to head (8204853).
Report is 2 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #17347      +/-   ##
==========================================
- Coverage   98.54%   98.54%   -0.01%     
==========================================
  Files         169      169              
  Lines       21910    21943      +33     
==========================================
+ Hits        21591    21623      +32     
- Misses        319      320       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2025-05-23T05:17:48Z

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +8 +0.002% PYBV10
     mimxrt:    +8 +0.002% TEENSY40
        rp2:   +16 +0.002% RPI_PICO_W
       samd:    +8 +0.003% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

andrewleech · 2025-05-23T08:10:04Z

Thanks for this @projectgus yes it's easy to reproduce with a stm32wb55 that's had its rfcore firmware wiped, when trying to start BLE it gets stuck trying to init forever in a loop, the higher level timeout code never runs.
I'll get a unit set up to test the change.

py/scheduler.c

andrewleech · 2025-05-24T11:54:49Z

Current master:

MicroPython v1.26.0-preview.148.g49f81d5046 on 2025-05-24; NUCLEO-WB55 with STM32WB55RGV6
Type "help()" for more information.
>>> 
>>> 
>>> from bluetooth import BLE
>>> BLE().active()
False
>>> BLE().active(True)
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout

Note: Ctrl-C did not working here, I couldn't break out of the error loop other than hard reset.

With this PR:

MicroPython v1.26.0-preview.150.g8b14be1c02 on 2025-05-24; NUCLEO-WB55 with STM32WB55RGV6
Type "help()" for more information.
>>> from bluetooth import BLE
>>> BLE().active()
False
>>> BLE().active(True)
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 110] ETIMEDOUT
>>>

projectgus · 2025-05-29T00:39:45Z

Thanks @andrewleech for confirming this fixes the WB55 issue!

py/scheduler.c

Without this change, a scheduler callback which itself queues a new callback will have that callback executed as part of the same scheduler run. Where a callback may re-queue itself, this can lead to an infinite loop. With this change, each call to mp_handle_pending() will only service the callbacks which were queued when the scheduler pass started - any callbacks added during the run are serviced on the next mp_handle_pending(). This does mean some interrupts may have higher latency (as callback is deferred until next scheduler run), but the worst-case latency should stay very similar. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

Test modified to reschedule itself based on a flag setting. Without the change in the parent commit, this test executes the callback indefinitely and hangs but with the change it runs only once each time mp_handle_pending() is called. Modified-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>

dpgeorge

Looks good now!

projectgus added the py-core Relates to py/ directory in source label May 23, 2025

projectgus requested a review from andrewleech May 23, 2025 05:07

dpgeorge reviewed May 23, 2025

View reviewed changes

py/scheduler.c Outdated Show resolved Hide resolved

andrewleech approved these changes May 24, 2025

View reviewed changes

projectgus commented May 29, 2025

View reviewed changes

py/scheduler.c Outdated Show resolved Hide resolved

projectgus force-pushed the feature/c-scheduler-recurse branch from 8b14be1 to 98e28ef Compare May 30, 2025 06:33

projectgus and others added 2 commits June 4, 2025 11:31

dpgeorge force-pushed the feature/c-scheduler-recurse branch from 98e28ef to 8204853 Compare June 4, 2025 01:31

dpgeorge approved these changes Jun 4, 2025

View reviewed changes

dpgeorge merged commit 8204853 into micropython:master Jun 4, 2025
66 checks passed

projectgus deleted the feature/c-scheduler-recurse branch June 5, 2025 01:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347

py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347

Uh oh!

projectgus commented May 23, 2025 •

edited

Loading

Uh oh!

projectgus commented May 23, 2025

Uh oh!

codecov bot commented May 23, 2025 •

edited

Loading

Uh oh!

github-actions bot commented May 23, 2025 •

edited

Loading

Uh oh!

andrewleech commented May 23, 2025

Uh oh!

Uh oh!

andrewleech commented May 24, 2025

Uh oh!

projectgus commented May 29, 2025

Uh oh!

Uh oh!

dpgeorge left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347

py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347

Uh oh!

Conversation

projectgus commented May 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Trade-offs and Alternatives

Uh oh!

projectgus commented May 23, 2025

Uh oh!

codecov bot commented May 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented May 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andrewleech commented May 23, 2025

Uh oh!

Uh oh!

andrewleech commented May 24, 2025

Uh oh!

projectgus commented May 29, 2025

Uh oh!

Uh oh!

dpgeorge left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

projectgus commented May 23, 2025 •

edited

Loading

codecov bot commented May 23, 2025 •

edited

Loading

github-actions bot commented May 23, 2025 •

edited

Loading