-
-
Notifications
You must be signed in to change notification settings - Fork 8.3k
py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
py/scheduler: Allow C scheduler callbacks to re-queue themselves. #17347
Conversation
@andrewleech Are you able to easily reproduce the issue that #17264 is fixing? Am interested to know if this fixes it OK, I noticed just now that PR links to your commit af4a8e9 which implies the problem is actually with recursive calls to |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #17347 +/- ##
==========================================
- Coverage 98.54% 98.54% -0.01%
==========================================
Files 169 169
Lines 21910 21943 +33
==========================================
+ Hits 21591 21623 +32
- Misses 319 320 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code size report:
|
Thanks for this @projectgus yes it's easy to reproduce with a stm32wb55 that's had its rfcore firmware wiped, when trying to start BLE it gets stuck trying to init forever in a loop, the higher level timeout code never runs. |
Current master: MicroPython v1.26.0-preview.148.g49f81d5046 on 2025-05-24; NUCLEO-WB55 with STM32WB55RGV6
Type "help()" for more information.
>>>
>>>
>>> from bluetooth import BLE
>>> BLE().active()
False
>>> BLE().active(True)
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout Note: Ctrl-C did not working here, I couldn't break out of the error loop other than hard reset. With this PR: MicroPython v1.26.0-preview.150.g8b14be1c02 on 2025-05-24; NUCLEO-WB55 with STM32WB55RGV6
Type "help()" for more information.
>>> from bluetooth import BLE
>>> BLE().active()
False
>>> BLE().active(True)
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
tl_ble_wait_resp: timeout
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 110] ETIMEDOUT
>>> |
Thanks @andrewleech for confirming this fixes the WB55 issue! |
8b14be1
to
98e28ef
Compare
Without this change, a scheduler callback which itself queues a new callback will have that callback executed as part of the same scheduler run. Where a callback may re-queue itself, this can lead to an infinite loop. With this change, each call to mp_handle_pending() will only service the callbacks which were queued when the scheduler pass started - any callbacks added during the run are serviced on the next mp_handle_pending(). This does mean some interrupts may have higher latency (as callback is deferred until next scheduler run), but the worst-case latency should stay very similar. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Test modified to reschedule itself based on a flag setting. Without the change in the parent commit, this test executes the callback indefinitely and hangs but with the change it runs only once each time mp_handle_pending() is called. Modified-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
98e28ef
to
8204853
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now!
Summary
Currently if a scheduler callbacks queues a new C scheduled callback, that callback will run in the same schedule pass. This means if a scheduler callback re-queues itself, it can "infinite loop" inside the scheduler.
This change means that any call to
mp_handle_pending()
will only process the callbacks which were queued whenmp_handle_pending()
started running. Therefore any callback which re-queues itself should have that callback handled the next timemp_handle_pending()
is called.This is an alternative to #17248 and should remove the need for #17264. Fixes #17246.
This does create potential for some additional callback latency (i.e. if an interrupt fires while a different scheduler callback is running, the interrupt callback is now deferred until the next scheduler run). However the worst-case scheduler latency remains similar (interrupt fires at end of one scheduler pass, has to wait until the next one). I think most MicroPython systems only sparsely call C scheduler callbacks, so this shouldn't be noticeable in most cases.
There is also potential for a misbehaving callback that re-schedules itself continuously to degrade performance of MicroPython (as the callback runs continuously with some allowance for Python code to run), whereas before it would have locked up MicroPython entirely which is more obviously broken.
This work was funded through GitHub Sponsors.
Testing
Trade-offs and Alternatives