Skip to content

py: Implement mp_sched_vm_abort() to force a return to the very top level (v2) #10971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 21, 2023

Conversation

dpgeorge
Copy link
Member

@dpgeorge dpgeorge commented Mar 8, 2023

This is an alternative and simpler version of #10241, which adds a way to forcefully abort the VM from an IRQ handler (or from the main running thread).

The version here is very simple because it just has a flag MP_STATE_VM(vm_abort) which is set (eg by an IRQ handler) when the VM should abort. The VM checks this flag when it checks the pending exception variable. This adds some small overhead to the VM, but keeps things simple and easy to reason about. It would be possible to combine all the ways of interrupting the VM (exception, scheduled callback, VM abort) into a single flag, but that's not done in this PR (possibly could be done in the future).

The approach here is able to abort pretty much anything:

  • an infinite Python loop
  • an infinite Python loop running in a (soft) scheduled callback
  • an infinite Python loop running in a hard interrupt

If using this feature, one thing to be careful of is trying to abort the REPL when it's not running any code. In that case it will most likely lead to an uncaught NLR hard fault, because there is no outer NLR context to catch the VM abort.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Mar 8, 2023
@github-actions
Copy link

github-actions bot commented Mar 8, 2023

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
        rp2:    +0 +0.000% PICO

@@ -140,7 +143,12 @@ STATIC int parse_compile_execute(const void *source, mp_parse_input_kind_t input
mp_hal_stdout_tx_strn("\x04", 1);
}

// check for SystemExit
#if MICROPY_SCHED_VM_ABORT
if (nlr.ret_val == NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an existing comment on the definition of ret_val that says:

// always a concrete object (an exception instance)

Perhaps this should be updated to explain the change in semantics when MICROPY_SCHED_VM_ABORT is enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've updated that comment.

@dpgeorge dpgeorge force-pushed the py-sched-vm-abort-v2 branch from 8955074 to 3637512 Compare March 9, 2023 05:15
@dpgeorge
Copy link
Member Author

dpgeorge commented Mar 9, 2023

This feature is a little bit difficult/dangerous to use. The user of it must be careful of race conditions: eg the possibility that mp_sched_vm_abort() is called just after the VM finished running, in which case the abort may be handled outside an NLR context.

My initial thoughts were that this VM abort feature could be useful (somehow) to catch an abort within the runtime. But I now think that it really only makes sense to set up the abort handler to catch an abort at the very top level, essentially to handle a soft reset. Schematically:

for (;;) {
    mp_init();
    machine_init();

    nlr_buf_t nlr;
    if (nlr_push(&nlr) == 0) {
        nlr_set_abort(&nlr);
        pyexec("boot.py");
        pyexec("main.py");
        execute_repl():
    } else {
        if (nlr.ret_val == NULL) {
            // vm abort
        } else {
            // uncaught exception
            print_exception(nlr.ret_val);
        }
    }

    // soft reset
    mp_printf(MP_PYTHON_PRINTER, "MPY: soft reboot\n");
    machine_deinit();
    gc_sweep_all();
    mp_deinit();
}

All ports would then implement that as their main loop (I don't propose to do that now, just an idea how it could work).

@dlech @laurensvalk how would PyBricks use this VM abort feature, would you use it similar to what I have sketched above, in your most outer exec/REPL handler? In your case would it always be that, if there is a VM abort, then it will do a mp_deinit() and restart?

@codecov-commenter
Copy link

Codecov Report

Merging #10971 (3637512) into master (b336b6b) will not change coverage.
The diff coverage is n/a.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@           Coverage Diff           @@
##           master   #10971   +/-   ##
=======================================
  Coverage   98.50%   98.50%           
=======================================
  Files         155      155           
  Lines       20549    20549           
=======================================
  Hits        20241    20241           
  Misses        308      308           
Impacted Files Coverage Δ
py/nlr.c 100.00% <ø> (ø)
py/runtime.h 100.00% <ø> (ø)
py/scheduler.c 96.77% <ø> (ø)
py/vm.c 100.00% <ø> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@laurensvalk
Copy link
Contributor

laurensvalk commented Mar 9, 2023

how would Pybricks use this VM abort feature, would you use it similar to what I have sketched above, in your most outer exec/REPL handler?

Yes, that's essentially how we do it. For context, our main.c-equivalent is this file.

The entry point is pbsys_main_run_program which gets called when the user starts a program.

For the abort, see here. For now it is still using mp_sched_system_exit_or_abort from #9949. We used to have a protection against the race condition you mentioned, but dropped it since it wasn't needed in this approach.

(Unrelated to this, but while you're there: The only other main difference you'll see is the customized builtin import, which scans through one or more concatenated mpy blobs held in RAM, based on #8381.)

In your case would it always be that, if there is a VM abort, then it will do a mp_deinit() and restart?

Yes. Technically, one step further: for the SystemAbort case we mp_deinit and then power off entirely. We only use this for forced shutdown. All other uncaught exceptions essentially lead to a soft reboot: MicroPython exits, then starts again via pbsys_main_run_program if another program is started.

@dpgeorge
Copy link
Member Author

I renamed the config option to MICROPY_ENABLE_VM_ABORT.

I do not intend to enable this on any ports, it's here mainly for PyBricks and others who need this feature.

The 3 top commits on this PR (stm32 and shared/runtime/pyexec changes) will not be merged. They are only to show how the feature can be used and demonstrate that it works as intended.

@jimmo
Copy link
Member

jimmo commented Mar 10, 2023

Looks good. I think adding the flag is a very pragmatic choice rather than all the other approaches we were investigating.

I would like to revisit this whole feature (scheduling, pending exceptions, etc) in the future to try and address some of the other issues we've talked about, but this is a good option for now!

Signed-off-by: Damien George <damien@micropython.org>
This is intended to be used by the very outer caller of the VM/runtime.  It
allows setting a top-level NLR handler that can be jumped to directly, in
order to forcefully abort the VM/runtime.

Enable using:

    #define MICROPY_ENABLE_VM_ABORT (1)

Set up the handler at the top level using:

    nlr_buf_t nlr;
    nlr.ret_val = NULL;
    if (nlr_push(&nlr) == 0) {
        nlr_set_abort(&nlr);
        // call into the VM/runtime
        ...
        nlr_pop();
    } else {
        if (nlr.ret_val == NULL) {
            // handle abort
            ...
        } else {
            // handle other exception that propagated to the top level
            ...
        }
    }
    nlr_set_abort(NULL);

Schedule an abort, eg from an interrupt handler, using:

    mp_sched_vm_abort();

Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge force-pushed the py-sched-vm-abort-v2 branch from d5fa356 to d54208a Compare March 21, 2023 07:11
@dpgeorge dpgeorge changed the title WIP: py: Implement mp_sched_vm_abort() to force a return to the very top level (v2) py: Implement mp_sched_vm_abort() to force a return to the very top level (v2) Mar 21, 2023
@dpgeorge
Copy link
Member Author

I cleaned this up, ready to merge. It no longer includes the changes to stm32 or pyexec. But it does now include some information in the commit message about using this feature.

@laurensvalk
Copy link
Contributor

I do not intend to enable this on any ports, it's here mainly for PyBricks and others who need this feature.

Thank you! We will probably integrate this when we rebase to MicroPython 1.20.

@dpgeorge dpgeorge merged commit d54208a into micropython:master Mar 21, 2023
@dpgeorge dpgeorge deleted the py-sched-vm-abort-v2 branch March 21, 2023 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants