Skip to content

py/vm.c: Only raise pending exceptions from the main thread. #8838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

jimmo
Copy link
Member

@jimmo jimmo commented Jun 30, 2022

Originally we had a global MP_STATE_VM(mp_pending_exception), which
meant that any thread could handle the pending exception.

In issue #7026 it was decided that only the main thread should handle
pending exceptions (e.g. keyboard interrupt), and PR #7051
(259d9b6 & ca920f7, FYI @dlech) changed this such that the pending exception
became per-thread, and keyboard interrupts were delivered only to the main
thread.

However, the MP_STATE_VM(sched_state) was still global, and this meant
that a non-main thread could race to clear MP_STATE_VM (sched_state), which meant that the main thread would never notice its
MP_STATE_VM(mp_pending_exception).

This could be fixed by having mp_sched_unlock also check if the main thread
still has a pending exception, and leaving MP_STATE_VM(sched_state)
pending. (It currently just checks whether the current thread has a pending exception).

However, there's no mechanism for pending exceptions to be delivered to
anything other than the main thread, so instead, this commit makes
MP_STATE_VM(mp_pending_exception) global again, but allows only the main
thread to process it. On non-threading builds this has no effect. On
threading builds, a port is able to provide its own "am I the main thread"
function, but a fallback is provided.

jimmo added 3 commits June 30, 2022 15:29
Originally we had a global `MP_STATE_VM(mp_pending_exception)`, which
meant that any thread could handle the pending exception.

In issue micropython#7026 it was decided that only the main thread should handle
pending exceptions (e.g. keyboard interrupt), and PR micropython#7051
(259d9b6 & ca920f7) changed this such that the pending exception
became per-thread, and keyboard interrupts were delivered only to the main
thread.

However, the `MP_STATE_VM(sched_state)` was still global, and this meant
that a non-main thread could race to clear `MP_STATE_VM
(sched_state)`, which meant that the main thread would never notice its
`MP_STATE_VM(mp_pending_exception)`.

This could be fixed by having mp_sched_unlock also check if the main thread
still has a pending exception, and leaving `MP_STATE_VM(sched_state)`
pending.

However, there's no mechanism for pending exceptions to be delivered to
anything other than the main thread, so instead, this commit makes
`MP_STATE_VM(mp_pending_exception)` global again, but allows only the main
thread to process it. On non-threading builds this has no effect. On
threading builds, a port is able to provide its own "am I the main thread"
function, but a fallback is provided.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
@jimmo
Copy link
Member Author

jimmo commented Jun 30, 2022

For background, I noticed this on the rp2 port after implementing #8836, which meant that the second core spent less time doing other things and was more likely to "win" the race to clear MP_STATE_VM(sched_state), and the result was that a busy loop running on the second core prevented Ctrl-C from being delivered to a busy loop on the main core. Tracing through in gdb, the main core never got a chance to see a "pending" sched_state.

@codecov-commenter
Copy link

Codecov Report

Merging #8838 (9cf2e15) into master (dd77dbd) will increase coverage by 0.00%.
The diff coverage is 88.88%.

@@           Coverage Diff           @@
##           master    #8838   +/-   ##
=======================================
  Coverage   98.31%   98.31%           
=======================================
  Files         157      157           
  Lines       20342    20347    +5     
=======================================
+ Hits        19999    20004    +5     
  Misses        343      343           
Impacted Files Coverage Δ
py/modthread.c 98.95% <ø> (-0.02%) ⬇️
py/vm.c 98.98% <66.66%> (ø)
py/runtime.c 99.71% <100.00%> (ø)
py/scheduler.c 98.38% <100.00%> (ø)
py/builtinevex.c 100.00% <0.00%> (ø)
py/builtinhelp.c 100.00% <0.00%> (ø)
ports/unix/mpconfigport.h 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd77dbd...9cf2e15. Read the comment docs.

@dlech
Copy link
Contributor

dlech commented Jun 30, 2022

We are using the per-thread exception scheduling with the unix port to be able to cancel threads that are blocking on I/O. It works like this:

Suppose a thread is blocking on a syscall like read and we want to stop this thread. To do this, we call pthread_kill() to interrupt the syscall on this specific thread. However, MicroPython will automatically retry syscalls like read when interrupted (PEP 475). So to prevent the retry, we schedule a SystemExit exception on this thread before raising the signal to interrupt the syscall. Then when the syscall is interrupted, the pending exception is raised on this thread instead of retrying the syscall. This allows context managers and finally statements to properly clean up the thread that was canceled.

Making pending exceptions only occur on the main thread would break our code, so I hope we can come up with a different solution.

For example, could we make it so that the scheduler only runs on the main thread? This is inline with how CPython does things (i.e. Py_AddPendingCall). And it seems to me like it would be beneficial at least in the case where there is an unhandled exception in a scheduled callback. If the scheduler can run on any thread, and there is an unhandled exception in a scheduled callback, then it will just crash the random thread where it ran and the rest of the program will keep running in a bad state. But if all scheduled callbacks only run on the main thread and there is an unhandled exception, then the entire program crashes instead of continuing to run in a bad state. This seems like a more desirable behavior.

@jimmo
Copy link
Member Author

jimmo commented Jul 1, 2022

We are using the per-thread exception scheduling with the unix port to be able to cancel threads that are blocking on I/O.

@dlech OK thank you for the explanation! I wondered if maybe this might be the case but couldn't see any discussion around this. Definitely sounds useful though so we should keep it.

The other way to solve this is to either make it possible for mp_sched_unlock to know whether any threads have pending exceptions (this seems complicated), or probably better would be to decouple the scheduler (and the MP_STATE_VM(sched_state) variable) from pending exceptions.

I also agree that it makes sense to only dispatch the scheduler on the main thread (and I very nearly did that in this PR!). I notice in your commit that you use the same method to detect the main thread. I will make a new PR to address both.

@jimmo
Copy link
Member Author

jimmo commented Jul 1, 2022

Closed in favour of #8845

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants