
uasyncio: make uasyncio.Event() safe to call from an interrupt (RFC, WIP) #6056

Closed

Conversation

dpgeorge
Member

The set of commits in this PR add an interface between soft interrupt handlers (ie micropython.schedule calls) and the uasyncio module. For example, one can use it to build an async-pin which can be awaited on for an edge or level change.

There's a lot to discuss here, and at the moment it's more of a proof-of-concept to show what's necessary to make it work.

The approach taken here has two parts to it:

  1. add a way for the asyncio scheduler (namely select.poll) to be woken asynchronously/externally
  2. make uasyncio.Event() "thread safe" so it can be set from a soft interrupt handler, ie from something scheduled by micropython.schedule

The solution for the first part was chosen so that it will work on the unix port, and it's also the same way that CPython works: create a special socket.socketpair() (like a unix file pipe), register that with select.poll and then write to it to signal that the asyncio scheduler should wake up. This is thread-safe and race-free (the signal can come in just before poll is called and it'll still work).
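A minimal sketch of that mechanism against the standard socket/select API (shown with CPython/unix semantics, where poll() reports file descriptors); it only illustrates the pattern, not the PR's actual implementation inside uasyncio:

import socket, select

# The notification pipe: writing to wsock makes rsock readable.
rsock, wsock = socket.socketpair()

poller = select.poll()
poller.register(rsock, select.POLLIN)
# ... the scheduler's ordinary IO streams are registered here as well ...

def notify():
    # Called by whatever wants to wake the scheduler (in the PR this write is
    # done from a soft callback run via micropython.schedule()).  It is
    # race-free: even if it lands just before poll() is entered, poll() will
    # return immediately because rsock is already readable.
    wsock.send(b'\x00')

# Scheduler side: block until an IO stream or the notification pipe is ready.
ready = poller.poll()
if any(fd == rsock.fileno() for fd, _ in ready):
    rsock.recv(64)  # drain the pipe so the next poll() can block again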

The second part (making Event.set() thread safe) is done for efficiency: even if there are hundreds of external events (eg pin irqs, UARTs, ble events, etc) they do not put a burden on poll because only the socketpair (and IO streams) are registered with poll.

The new things in this PR are:

  • addition of micropython.schedule_lock() and micropython.schedule_unlock()
  • addition of usocket.socketpair() including a bare-metal version for ports that need it
  • improve MICROPY_EVENT_POLL_HOOK on stm32 to be atomic wrt waiting for events
  • make uasyncio.Event() thread/IRQ safe

The way to use this feature is as follows (a minimal sketch is given after the list):

  1. create an uasyncio.Event() corresponding to the external event
  2. tasks wait on this event
  3. IRQ triggers soft callback via micropython.schedule()
  4. that callback will set the Event -- internally the event will schedule all waiting tasks (immediately in the soft callback) and then notify the asyncio poll via the socketpair that it should wake up and run the next task
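A minimal sketch of that flow, assuming a pyboard-style pin (the pin name, trigger and helper names here are illustrative); on most ports Pin.irq() already defers its handler via the scheduler, so the explicit micropython.schedule() call below is only there to make step 3 visible:

import machine, micropython, uasyncio as asyncio

ev = asyncio.Event()                      # 1. Event corresponding to the external event

async def waiter():
    await ev.wait()                       # 2. a task waits on the event
    print('got external event')

def _soft_cb(_):
    ev.set()                              # 4. soft callback sets the Event, which also
                                          #    wakes the asyncio poll via the socketpair

def _hard_irq(pin):
    micropython.schedule(_soft_cb, None)  # 3. IRQ defers work to a soft callback

pin = machine.Pin('Y1', machine.Pin.IN, machine.Pin.PULL_DOWN)
pin.irq(_hard_irq, machine.Pin.IRQ_RISING, hard=True)

asyncio.run(waiter())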

@dpgeorge added the extmod label (Relates to extmod/ directory in source) on May 20, 2020
@dpgeorge
Member Author

dpgeorge commented May 20, 2020

Here's an example AsyncPin which allows waiting on an edge and is designed to never miss an edge. It runs on a pyboard:

import time, machine, micropython, pyb, uasyncio as asyncio
from machine import Pin

class AsyncPin:
    def __init__(self, pin, trigger):
        self.pin = pin
        self.event = asyncio.Event()
        self.timestamp = 0
        self.counter = 0
        self.pin.irq(self._cb, trigger)

    def _cb(self, pin):
        # Soft IRQ callback: timestamp the first edge of a burst, count all
        # edges, and set the (IRQ-safe) event to wake any waiting task.
        if self.counter == 0:
            self.timestamp = time.ticks_us()
        self.counter += 1
        self.event.set()

    async def wait_edge(self):
        ev = self.event
        await ev.wait()
        # Lock out scheduled callbacks while reading and resetting the shared
        # state, so no edge is lost between reading the counter and clearing it.
        micropython.scheduler_lock()
        timestamp = self.timestamp
        count = self.counter
        self.counter = 0
        ev.clear()
        micropython.scheduler_unlock()
        return timestamp, count

async def wait_on_pin(pin, trigger):
    apin = AsyncPin(pin, trigger)
    while True:
        timestamp, count = await apin.wait_edge()
        dt = time.ticks_diff(time.ticks_us(), timestamp)
        print('edge dt={} count={} pin={}'.format(dt, count, pin))

async def led_flash(led, freq):
    while True:
        led.toggle()
        await asyncio.sleep(0.5 / freq)

async def main():
    asyncio.create_task(led_flash(pyb.LED(1), 1))
    asyncio.create_task(wait_on_pin(Pin('Y1', Pin.IN, Pin.PULL_DOWN), Pin.IRQ_RISING))
    asyncio.create_task(wait_on_pin(Pin('Y2', Pin.IN, Pin.PULL_UP), Pin.IRQ_FALLING))

    usr = AsyncPin(Pin('USR'), Pin.IRQ_FALLING)
    await usr.wait_edge()

asyncio.run(main())

Edit: removed unused worker() function.

@dpgeorge
Member Author

Here's a simpler example that works on the unix port (coverage build; needs the signal module from micropython-lib) as well as on a pyboard:

import sys, uasyncio

ev = uasyncio.Event()

if sys.platform == 'linux':
    import signal
    signal.signal(signal.SIGPIPE, lambda sig: ev.set())
else:
    import machine
    machine.Pin('USR').irq(lambda p: ev.set())

async def main():
    print('waiting on event', ev.is_set())
    await ev.wait()
    print('event:', ev.is_set())
uasyncio.run(main())

On unix do kill -PIPE $(ps -a | grep micropython | cut -d' ' -f3) to trigger the signal/event.

@peterhinch
Contributor

In your first sample:

    async def wait_edge(self):
        ev = self.event
        await ev.wait()
        micropython.scheduler_lock()
        timestamp = self.timestamp

Is there a possible issue where an edge occurred immediately after await ev.wait() and before the scheduler is locked? Is some protection in place to handle this case?

Incidentally the worker coro in your first example is unused.

@dpgeorge
Member Author

Is there a possible issue where an edge occurred immediately after await ev.wait() and before the scheduler is locked?

No there shouldn't be. The edge counter will just be incremented by 1. The AsyncPin class is designed here so that all edges are accounted for; the only question is whether the count is included in the current or the next await pin.wait_edge().

Incidentally the worker coro in your first example is unused.

Thanks, now fixed.

@peterhinch
Contributor

Sure. My general point is that an IRQ might occur between those instructions. I guess, in cases where this matters, the ISR can detect this case by testing event.is_set().

@dpgeorge
Member Author

Some alternatives with pros/cons relative to the approach in this PR:

  • Instead of making uasyncio.Event.set() soft-IRQ-safe, implement loop.call_soon_threadsafe(func, args) as per CPython. This method is soft-IRQ-safe and can be used to schedule additional work from an IRQ, like setting an event (eg loop.call_soon_threadsafe(event.set)). Pros: requires less use of schedule lock/unlock; more like CPython. Cons: requires 2x callback functions per "external event" (eg pin change).
  • Change all irq handlers to be like POSIX signals and have an associated number (eg Pin IRQ would be associated with Pin.IRQ_SIGNUM), and then handlers are registered via loop.add_signal_handler(signum, callback, args) and handled in a special way by uasyncio. Pros: centralised IRQ dispatch, same as how CPython does it. Cons: requires big changes to API and underlying code.
  • Make all objects pollable, either at the C level (eg BLE object) or in Python by inheriting from uio.IOBase. The latter is already possible without any changes to uasyncio (a sketch is given after this list). Pros: no changes required to uasyncio, can be reused in other situations that use select/poll. Cons: not easy to make it work on non-bare-metal ports like unix, because there poll is the system-provided poll and only knows about file descriptors.
  • Abandon select/poll as the way to "wait for activity/events" and replace with a custom micropython.wait_event() interface that knows how to handle events and sleep while waiting. IRQs can signal an event via micropython.notify_event() (for example). Pros: ?. Cons: lots of changes, need to essentially duplicate poll/select. (Note that micropython.notify_event() is pretty much provided in this PR by a write to the socketpair.)
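As a rough sketch of the third alternative above (the pollable-object approach that already works today): a Python class inheriting from io.IOBase whose ioctl() answers MP_STREAM_POLL queries. The constants follow MicroPython's stream-protocol values, the class name is illustrative, and this only works on ports whose select.poll accepts arbitrary objects (ie not unix):

import io

# Stream-protocol request/flag values as used by MicroPython's stream ioctl.
MP_STREAM_POLL = 3
MP_STREAM_POLL_RD = 0x0001
MP_STREAM_ERROR = -1

class PollableFlag(io.IOBase):
    def __init__(self):
        self._ready = False

    def set(self):
        # May be called from an IRQ handler: it only flips a flag.
        self._ready = True

    def readinto(self, buf):
        # Consume the event: one dummy byte per wake-up.
        self._ready = False
        buf[0] = 0
        return 1

    def ioctl(self, req, arg):
        if req == MP_STREAM_POLL:
            return MP_STREAM_POLL_RD if (self._ready and arg & MP_STREAM_POLL_RD) else 0
        return MP_STREAM_ERROR

A task can then wait on such an object through the stream interface, eg await uasyncio.StreamReader(flag).read(1).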

@tve
Contributor

tve commented May 20, 2020

Interesting PR! I like the fact that the changeset is manageable!

What concerns me is that it appears to layer a simple event mechanism on top of a very heavyweight I/O mechanism. Yes, it's the way CPython does it, but CPython does it the way Unix does it, which has its own decades of constraints.

In embedded systems I'm familiar with, the event mechanism is a tiny little thing that is deep in the core and super efficient. The approach taken in this PR layers events on top of ipoll, which, looking at moduselect, means a busy-polling loop in poll_poll_internal, which calls poll_map_poll, which itself loops over all file descriptors being awaited, calling through the stream ioctl machinery for each one. In the case of network connections that turns into a select call into LwIP for each file descriptor. (Am I getting this right?)

So in a way this PR kicks the proverbial can down the road in terms of making the innermost MP loop efficient and event driven. That may in itself not be a bad step as long as there's some discussion on how ipoll can be made event driven and efficient. It looks to me like that will take rewriting moduselect into a sort of rendez-vous mechanism where MP's core loop is on one side and the I/O providers (LwIP, BLE, USB, ..) are on the other and uselect really decouples the threads churning in those I/O providers from MP.

@dpgeorge
Member Author

Yes, it's the way CPython does it, but CPython does it the way Unix does it, which has its own decades of constraints.

That is a big constraint, because it's important to try and match the CPython API (although that doesn't really dictate the underlying implementation of events), and even more important that it works on the unix port of MicroPython.

In embedded systems I'm familiar with, the event mechanism is a tiny little thing that is deep in the core and super efficient. The approach taken in this PR layers events on top of ipoll

The problem is it's hard to come up with something that will work across many different systems, eg unix, zephyr/RTOS, esp32, bare-metal. Poll is something which is common, and already exists in MicroPython, so that's why it's used as a base.

And I did try to consider efficiency with the approach taken here. The point being that there's only one extra stream to be added to poll (namely socketpair) which now becomes the "event interface": any event at all just needs to write to this stream to signal to poll to wake up.

looking at moduselect, means a busy-polling loop in poll_poll_internal, which calls poll_map_poll, which itself is a loop over all file descriptors being awaited calling through the stream ioctl machinery for each one. In the case of network connections that turns into a select call into LwIP for each file descriptor.

Yes that's about right, although on bare-metal (eg stm32, esp8266) there is no final select call to lwip, it's done in extmod/modlwip.c efficiently. But on esp32 it does call into select.

So in a way this PR kicks the proverbial can down the road in terms of making the innermost MP loop efficient and event driven.

At this stage I'd rather just get the functionality and user interface correct, instead of trying to also rewrite all the internal event/polling code. And it's not clear how to do that yet on all supported platforms. Maybe you have ideas how to do it for esp32?

That may in itself not be a bad step as long as there's some discussion on how ipoll can be made event driven and efficient. It looks to me like that will take rewriting moduselect into a sort of rendez-vous mechanism

As I mention above, an alternative is to abandon select/poll and introduce something like micropython.wait_event()/micropython.notify_event(). But this PR is already close to that: without any IO streams in the picture the wait-event is exactly poll(...) and notify-event is exactly a write to the socketpair.

In the future bare-metal ports like stm32 could optimise the poll(...) call so that it knows which events/IRQs could trigger something to become readable/writable and does a proper sleep, only waking on the correct event.

@tve
Contributor

tve commented May 21, 2020

It sounds like you've already made up your mind...

To make this stuff reasonably efficient, I think what needs to happen is to change the way ioctl is used. I don't know whether the current semantics are dictated by CPython or not. Anyway, currently:

  • the is-readable or is-writable flag is held in the device driver (at least conceptually) and an ioctl(MP_STREAM_POLL_XX) issued by ipoll traverses the stream layers to the device driver to query that flag

This could be augmented by:

  • poll pushes the read/write interest to the device driver: specifically, poll_register makes an ioctl(MP_STREAM_REGISTER_XX) call through to the device driver, passing a callback function. The device driver would then call this callback when the device becomes readable or writable. This callback can then be used to unblock the waiting MP task or to end a WFI.

In order to avoid having to convert all device drivers at once, poll could be made to deal with both types of signaling by first trying the new ioctl and reverting to the old method for devices that don't support it.

With this type of change, whenever poll has to block, if all FDs support the new callback functionality it can WFI or otherwise sleep 'til woken up. If at least one FD doesn't support the new ioctl then it has to do the current busy-wait dance, but only needs to iterate through the ioctl(MP_STREAM_POLL_XX) for those devices.

On the ESP32 the way LwIP would tie into all this is via a separate task (unless there's some new event driven interface to LwIP sockets). That task's sole purpose would be to handle the ioctl(MP_STREAM_REGISTER_RD), perform the actual poll or select call into LwIP, and make the callback to unblock the MP thread.

Thoughts?

@dpgeorge
Member Author

To make this stuff reasonably efficient, I think what needs to happen is to change the way ioctl is used

Your comments about this are correct and there are some good ideas how to make it better with MP_STREAM_REGISTER_XX. But that's a bit of a tangent to the main issue of this PR, which is to have a way for external events to hook into uasyncio. Yes the issues are related, so it's a good discussion to have, but regardless of how poll works (or even if it's used) we still need to make a decision on how external events like pin change IRQs trigger a uasyncio task to wake up.

The idea with this PR is that, even if there are 100 tasks await'ing on 100 different pins, the corresponding poll call is still O(1), not O(number of pins).

If pins were made pollable via an ioctl call then await'ing on 100 pins could lead to a very inefficient poll, even if it did support MP_STREAM_REGISTER_XX (because it'll have to scan and register 100 objects on each poll call, unless there is sophisticated caching of the calls).

On the ESP32 the way LwIP would tie into all this is via a separate task (unless there's some new event driven interface to LwIP sockets). That task's sole purpose would be to handle the ioctl(MP_STREAM_REGISTER_RD), perform the actual poll or select call into LwIP, and make the callback to unblock the MP thread.

It would be possible (and probably a good idea) for the esp32 to switch away from using extmod/moduselect.c and instead use a custom select module which just calls the IDF's poll function (basically use the unix usocket implementation). That would eliminate all the ioctl stuff and potentially allow the IDF to sleep during poll.

@tve
Contributor

tve commented May 22, 2020

OK, just for the record, I didn't suggest:

If pins were made pollable via an ioctl call then await'ing on 100 pins could lead to a very inefficient poll

I would build poll on top of events, not the other way around. But I'll stop beating this dead horse.

It would be possible (and probably a good idea) for the esp32 to switch away from using extmod/moduselect.c and instead use a custom select module which just calls the IDF's poll function (basically use the unix usocket implementation). So that would eliminate all the ioctl stuff, and potentially allowing the IDF to do sleep during poll.

That's an interesting idea. My concern would be that this would mean that the socketpair trick to unblock poll when there's an event turns into a round-trip through LwIP on the other processor. If that's the case there's really a trade-off whether to prioritize the efficiency of Pin or other events vs the efficiency of socket I/O polling. There's also the problem of how one writes to the socketpair from an interrupt handler, that probably adds a helper task somewhere into the mix.

@dpgeorge
Member Author

I would build poll on top of events, not the other way around.

Using such a scheme, can you explain how (eg) pin IRQs would wake the uasyncio scheduler? And how it would work on the unix port? That would give a good alternative to this PR.

There's also the problem of how one writes to the socketpair from an interrupt handler, that probably adds a helper task somewhere into the mix.

The write is from a soft callback scheduled via micropython.schedule() (or the C-level equivalent). So there shouldn't be any issues with safety.

@tve
Contributor

tve commented May 22, 2020

I would build poll on top of events, not the other way around.

Using such a scheme, can you explain how (eg) pin IRQs would wake the uasyncio scheduler? And how it would work on the unix port? That would give a good alternative to this PR.

Do I have a week?

There's also the problem of how one writes to the socketpair from an interrupt handler, that probably adds a helper task somewhere into the mix.

The write is from a soft callback scheduled via micropython.schedule() (or the C-level equivalent). So there shouldn't be any issues with safety.

That's cheating! The point of the socketpair is to wake up the MP task, and micropython.schedule() runs in the MP task, so to wake up the MP task the MP task has to run?? I assume it works now because the MP task never really sleeps and keeps polling.

@dpgeorge
Member Author

Do I have a week?

Sure.

That's cheating! The point of the socketpair is to wake up the MP task, and micropython.schedule() runs in the MP task, so to wake up the MP task the MP task has to run?? I assume it works now because the MP task never really sleeps and keeps polling.

The point of the socketpair is to wake the uasyncio scheduler. This is currently only done from Python code because that's the only way to make external events at the moment (internal ones like stream IO use ioctl polling). Remember that this PR is just about enabling user external events, not making polling event based.

On stm32 it's an efficient sleep because it does a WFI to wait for the next IRQ (all events in the system must originate from an IRQ), eg a pin IRQ. Such a hard IRQ handler may call mp_sched_schedule() and this will be run before the socketpair is polled again. It should be possible (and this is part of the plan) to make the WFI a proper sleep (stm32 STOP mode) when it's known that all IRQ sources can wake the MCU from STOP mode.

On unix it's an efficient sleep because the poll will wait forever until the socketpair becomes readable (eg written by another thread) or a signal arrives (external event) and that signal runs some Python code which writes to the socketpair.

@tve
Contributor

tve commented May 24, 2020

Well, I still believe you're cheating and let me explain why ;-).

Remember that this PR is just about enabling user external events, not making polling event based.

Agreed, but making polling event based is the ultimate goal and my argument here is that by not stacking everything on top of ipoll we'd arrive at a better end result. (I think... there are so many moving parts it's difficult to be sure until it's implemented...)

As to the cheating... The way I understand it, the core of the asyncio loop services three things (not necessarily in this order):

  1. tasks that are runnable
  2. file descriptors that are readable/writable
  3. "sched-units" in the mp sched_queue (what's the name for these sched units?)

In your PR, Event.set runs in a sched-unit and it does two things:

  • it places tasks waiting on the event onto the run queue
  • it makes a FD readable

So if the asyncio loop is blocked in step 2 (waiting on file descriptors) then the sched-unit cannot run since that requires the asyncio loop to pop out of step 2 and reach step 3. Chicken and egg problem. Right?

But you gave two examples that work, so you must be cheating somehow. Here's how you are cheating. (Sorry, this is a bit long, I tried to lay it out in detail to convince myself I'm not smoking something here, even though in the end I may well be ;-).

On unix, step 2 above means calling the poll system call with a timeout. If there's nothing to do, there should be no timeout. But the call to poll is wrapped in MP_HAL_RETRY_SYSCALL and that includes a call to mp_handle_pending, i.e. it services the sched-units. So the mere fact of sending the process a signal already pops the asyncio loop out of the poll system call. The socketpair is only a convoluted way to tell moduselect and MP_HAL_RETRY_SYSCALL to stop calling poll forever. In more detail, if I understand correctly (big IF!!!) the sequence of events is:

  1. asyncio loop calls into ipoll, which makes a poll system call, process blocks
  2. kernel resumes process but in signal handler
  3. signal handler puts Event.set sched-unit onto MP sched queue
  4. signal handler returns causing poll system call to return with EINTR
  5. MP_HAL_RETRY_SYSCALL calls mp_handle_pending
  6. mp_handle_pending sees there's a sched-unit and runs it
  7. sched-unit is Event.set and puts waiting asyncio tasks onto the asyncio run queue and writes into socketpair
  8. MP_HAL_RETRY_SYSCALL blindly calls poll system call again, even though it could have been told not to
  9. kernel sees that an FD is ready (the socketpair) and returns from poll system call a second time
  10. ipoll returns to asyncio loop
  11. asyncio loop consumes dummy chars from socketpair, looks at task queue and runs tasks

On STM32, if I understand things correctly (there's that big IF again!), https://github.com/micropython/micropython/blob/master/ports/stm32/stm32_it.c#L348 clears the SLEEPONEXIT bit, which means that when an interrupt handler returns, if the processor was in sleep mode on a WFI instruction the WFI completes (i.e., the processor does NOT go back to sleep mode). I have not traced through the code and stm32 has so many options that I have difficulties locating the right code, but, if ipoll were to block on a WFI instruction the sequence of events would be as follows:

  1. asyncio loop calls into ipoll, which blocks on WFI
  2. interrupt happens, interrupt handler runs
  3. interrupt handler puts Event.set sched-unit onto MP sched queue
  4. interrupt handler returns causing WFI instruction to complete
  5. the loop in poll_poll_internal calls MICROPY_EVENT_POLL_HOOK which calls mp_handle_pending
  6. mp_handle_pending sees there's a sched-unit and runs it
  7. sched-unit is Event.set and puts waiting asyncio tasks onto the asyncio run queue and writes into socketpair
  8. the loop in poll_poll_internal rolls around and blindly calls poll_map_poll again, even though it could have been told not to
  9. poll_map_poll sees that an FD is ready (the socketpair) and returns from ipoll to asyncio loop
  10. asyncio loop consumes dummy chars from socketpair, looks at task queue and runs tasks

So let me rephrase my overall argument as follows:

  1. If you want to keep the overall structure of your proposal, then in step 8 on both unix and stm32 the code that calls poll again could have checked a flag and opted to return instead. The socketpair becomes unnecessary. (I don't know the history of EINTR on Unix but I suspect that the reason it exists is precisely to enable breaking out of a blocked system call by way of something a signal handler set.)

  2. The bigger argument I have is one of overall structure:

  • I find that MICROPY_EVENT_POLL_HOOK is sprinkled into more places than is comfortable, which makes reasoning about all this stuff hard; it would be beautiful to have a single innermost loop (ok, this is not a hard argument and you may disagree)
  • Setting the "beauty" & "simplicity" argument aside, and focusing just on asyncio, it has three sources it checks (that's my 3-point list at the top of this post). At some point all three come back saying "ain't got nothin'" and (looking ahead beyond this PR) we'd like the loop to block.
  • What we need is for the loop to get unblocked if any of the three sources changes state. At that point the loop goes around again, checks all three, does the work necessary, and eventually blocks again.
  • The two issues we're trying to solve are (1) what does the loop call to block, and (2) how does it get unblocked from that.
  • On unix, I suspect that the only reasonable answer for (1) is the poll system call, and that's because it's the only thing that receives I/O events. The answer for (2) is "anything that causes the poll system call to return" and, looking at the possible things that could necessitate that: for file descriptors the return is built-in and for signals it is also in the form of EINTR.
  • On STM32 we want the answer for (1) to be WFI. That means that the answer for (2) has to be a hardware or software interrupt and here the unblocking happens thanks to the SLEEPONEXIT bit in the processor.

I believe the more difficult cases are the ports with RTOS 'cause they may force the blocking poll model for file descriptors without providing a notion of EINTR.

  • On the esp32 we want the answer for (1) to be either ulTaskNotifyTake (if automatic sleep is enabled) or esp_light_sleep_start (if we want to force sleep). So far the only way I know how to square that with the esp-idf tcp-adapter is to have a helper task that does the actual poll or select call. The answer for (2) then would be xTaskNotifyGive or the fact that any interrupt ends sleep mode.
  • Something like Zephyr would be an interesting case very similar to esp32, in that MP runs in a task and the question is how file I/O events generated by other tasks can wake the MP task.

Overall, what I would be delighted to see is:

  • an asyncio loop in which one sees the polling of the three sources followed by an "if nothing to do then MP_BLOCK".
  • The MP_BLOCK macro implemented by each port in a different manner
  • An MP_UNBLOCK_FROM_SCHED_SCHEDULE macro that mp_sched_schedule calls (again, different implementation per port)
  • An MP_UNBLOCK_FROM_OTHER_TASK macro that other tasks in RTOS systems can call (this should offer a way to integrate other event stuff, such as bluetooth, ESP-NOW, etc)

What this does is say "here's the loop and here's where it blocks" and "here's how you get to unblock it". The current PR has half the loop in asyncio and the other half in poll_poll_internal, and it says you have to go through the file descriptor machinery to unblock MP.

Ugh, long post, I just hope I didn't misunderstand something major that makes it irrelevant.

NB: while re-reading I noticed that on unix in the presence of threads there may be another source of wake-up that would map into the MP_UNBLOCK_FROM_OTHER_TASK macro, but it's been a few years since I looked at kernel and user level threads on unix and linux so I don't know what the low-level implications are.

@dpgeorge
Member Author

Thanks for the detailed post. Just to give a short answer now:

  • mp_sched_schedule() and "sched units" aka soft interrupt callbacks live outside asyncio, they live below asyncio (also below/outside normal running Python code). Such callbacks run "spontaneously" and can be thought of as threads that are started, run the callback, then finish (a small example follows this list). There is an effective "GIL" for such "threads" which means that all bytecodes are atomic wrt scheduled callbacks. Asyncio has no control over the execution of these callbacks.
  • Your understanding of how the code works on unix and stm32 is correct.
  • One must be very careful of race conditions with incoming events/IRQs/signals. "if nothing to do then MP_BLOCK" really needs to be "disable interrupts/signals; check if anything to do; if nothing to do then atomically enable interrupts/signals and block, otherwise enable interrupts/signals and continue".
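For reference, the usual shape of such a soft callback (the pin name is illustrative): the hard IRQ handler does no heap allocation and simply queues the callback, which the VM then runs between bytecodes of the main program:

import machine, micropython

micropython.alloc_emergency_exception_buf(100)

def soft_cb(arg):
    # Runs later, between bytecodes of the main program; ordinary Python
    # (including allocation) is allowed here.
    print('pin event, arg =', arg)

def hard_irq(pin):
    # Hard IRQ context: no heap allocation allowed, so just queue the
    # soft callback via the scheduler.
    micropython.schedule(soft_cb, 0)

machine.Pin('Y1', machine.Pin.IN).irq(hard_irq, hard=True)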

@dpgeorge
Member Author

Overall, what I would be delighted to see is:

  • The MP_BLOCK macro implemented by each port in a different manner
  • An MP_UNBLOCK_FROM_SCHED_SCHEDULE macro that mp_sched_schedule calls

...
What this does is say "here's the loop and here's where it blocks" and "here's how you get to unblock it".

If this scheme were implemented it'd need to support 1) the possibility for multiple MP_BLOCK call sites blocking at the same time; 2) the possibility that a mp_sched_schedule callback does not unblock MP_BLOCK.

For 1 (simultaneous MP_BLOCK call sites) consider a port with threading and a different asyncio scheduler per thread, or one thread using asyncio and another just using poll on its own to wait for IO. This means that each active MP_BLOCK needs to have a unique entity associated with it (eg blocking_id), and this is used like MP_UNBLOCK_FROM_SCHED_SCHEDULE(blocking_id). (The socketpair essentially provides this unique id/object, and the current PR should work ok on unix with multiple threads, a mp_sched_schedule callback running in one thread and waking a different thread via the socketpair write.)

For 2 (mp_sched_schedule callback not leading to an unblocking event) consider any of the following, which should not wake uasyncio:

  • a callback registered on a pin (or any object) which is independent of the asyncio scheduler, eg toggle a pin when a button is pressed; this can be done directly in the callback
  • a pin that can be await'ed on but which has nothing waiting on it: the callback still needs to run to register the pin edge and set the event, but this does not schedule any tasks
  • an object that can be await'ed on and is currently being await'ed on but needs some logic to decide whether to schedule any tasks as runnable, eg a pin-edge-counter that waits for N edges; if the logic decides that no tasks need waking then uasyncio should stay blocking

Handling these situations requires mp_sched_schedule callbacks to execute during blocking, because these callbacks must decide if the blocking is finished. So the macro would be something like MP_BLOCK_WHILE_EXECUTING_CALLBACKS. And also the MP_UNBLOCK_xxx triggers must be called from Python code when it decides that it's time to wake up, eg via micropython.notify_event(blocking_id). (This is essentially what the socketpair write is doing.)

Note: the above assumes that MP_BLOCK should block precisely until there is something to do, otherwise uasyncio is woken unnecessarily.

@tve
Contributor

tve commented May 26, 2020

Thanks for highlighting all these corner cases ;-)

I think what I'll do is back out the socketpair stuff and see whether I can replace it with a simple flag that causes the polling to stop in step 8 of my previous comment, while trying to avoid any looping inside of poll_poll_internal. I'll see what all I run into...

Note: the above assumes that MP_BLOCK should block precisely until there is something to do, otherwise uasyncio is woken unnecessarily.

It seems to me that unnecessarily waking up uasyncio might be an OK trade-off if it makes everything simpler (I'm not saying that it does). After all, the big cost of waking up from sleep mode or of context switching under unix is already paid, and if an "idle" go-around of the uasyncio loop is very expensive then that's a different problem in and of itself (which it currently kind'a is given how poll_map_poll queries each awaited device).

@dpgeorge
Member Author

I think what I'll do is back out the socketpair stuff and see whether I can replace it with a simple flag causing the polling to stop in bullet items 8 in my previous comment and trying to avoid any looping inside of poll_poll_internal.

On unix (and probably windows, zephyr, other OS/RTOS-based ports) I think the system poll() function (or equivalent) will be the only way to do an efficient sleep. And to wake it up requires writing to a file descriptor that it is monitoring, ie the socketpair. Constrained by this, and to make the same uasyncio code run on all ports, that's why I implemented a bare-metal socketpair.

Note that the use of socketpair is really an implementation detail of uasyncio (and has nothing to do with poll itself) and the user shouldn't be exposed to it. So it could be swapped out for something else if that something else is better. That something else needs to provide 1) a way of creating the entity; 2) ability to register that entity with poll; 3) ability to signal that entity to wake up poll; 4) ability for poll to clear the signal that woke it up.
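As a rough illustration of those four requirements, a hypothetical wrapper (the WakeSignal name and its methods are made up for this sketch; the PR just uses the socketpair directly inside uasyncio):

import socket, select

class WakeSignal:
    def __init__(self):
        # 1. create the entity
        self._r, self._w = socket.socketpair()

    def register(self, poller):
        # 2. register it with poll
        poller.register(self._r, select.POLLIN)

    def signal(self):
        # 3. signal it, waking a blocked poll
        self._w.send(b'\x00')

    def clear(self):
        # 4. clear the signal so poll can block again
        self._r.recv(64)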

@dpgeorge
Member Author

ThreadSafeFlag was added in 5e96e89 as a (temporary) way to connect interrupts with uasyncio. And ThreadSafeFlag will eventually be optimised/improved with a more sophisticated way of waiting on events, see #6125 for ideas on that front.

This PR is made obsolete by that.
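For reference, a minimal example of the ThreadSafeFlag approach that superseded this PR (the pin name is illustrative); set() is safe to call from an IRQ handler and the flag is automatically cleared when the waiting task resumes:

import machine, uasyncio as asyncio

tsf = asyncio.ThreadSafeFlag()

def on_pin(pin):
    tsf.set()                  # safe from hard or soft IRQ context

machine.Pin('Y1', machine.Pin.IN).irq(on_pin, hard=True)

async def main():
    while True:
        await tsf.wait()       # resumes on each set(); flag auto-clears
        print('pin event')

asyncio.run(main())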

@dpgeorge closed this on Nov 17, 2021
@dpgeorge deleted the extmod-uasyncio-threadsafe-event branch on November 17, 2021 03:24
tannewt added a commit to tannewt/circuitpython that referenced this pull request Feb 25, 2022
free RX and TX on QTPY-ESP32S2 in non debug builds