Hangs when reopening WiFi connection #167
Comments
Reading around the Espressif forums I've found references to the need periodically to allocate time to the underlying |
There is a machine.idle() method already, which goes like that:
I don't know if that is what you need. |
Robert, that does indeed look promising - thank you, well spotted. I'd assumed its function was the same as the ESP8266 version (which is entirely different) so never thought to investigate it. I'll experiment.
async def _idle_task(self):
    while True:
        await asyncio.sleep_ms(10)
        idle()  # Yield to underlying RTOS
but initial results suggest it hasn't entirely eliminated the need for |
After much experimentation the following precautions are required to make nonblocking sockets play tolerably well with uasyncio and to enable WiFi connections to reliably restart after an outage. I have a function which issues
It would be good to know the reason for this and whether it would be possible to enhance Note that issuing, say, |
@peterhinch thanks for the details on how to "fix" this. This is really a problem with the underlying "operating system" and so should be worked around in C code so that the Python programmer doesn't need to know about it (and of course so that scripts are portable across platforms). |
If progress is reported I'll re-test against the improved firmware build. |
Did anyone ever get to look into this? |
I took a quick look and can confirm that the issue is there. The following code reproduces it without asyncio:
import network
s = network.WLAN(network.STA_IF)
s.active(True)
def wifi_connect():
    print('WiFi connect')
    s.disconnect()
    s.connect('SSID', 'PASSWORD')
    print('Awaiting connection')
    while not s.isconnected():
        pass
    print('Got connection')
    print(s.ifconfig())
wifi_connect()
After a hard reset it works, but second time round (without hard reset) it locks up in the
It does look like the idle task is being starved (by the uPy VM loop) and needs some time to process underlying network events. Changing the uPy task from priority 1 down to priority 0 doesn't help. Inserting a
It's unclear how exactly to fix this but the first thing would be to update to the latest ESP IDF and then retest using that, then go from there. |
Why the busy-wait with pass? Why not idle-loop with a short sleep? That's what I have in my startup handler and it works fine. |
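For illustration, the "idle-loop with a short sleep" suggestion could look roughly like the sketch below. The helper name `wait_until` and its `timeout_ms` and `poll_ms` parameters are hypothetical, not part of MicroPython; the CPython fallback is only there so the sketch can be tried off-device. Whether sleeping between polls gives the underlying RTOS enough time on a given firmware build is not something this sketch establishes.

```python
try:
    # MicroPython: millisecond ticks and sleeps
    from utime import ticks_ms, ticks_diff, sleep_ms
except ImportError:
    # CPython fallback so the sketch can be tried off-device
    import time

    def ticks_ms():
        return int(time.monotonic() * 1000)

    def ticks_diff(end, start):
        return end - start

    def sleep_ms(ms):
        time.sleep(ms / 1000)


def wait_until(predicate, timeout_ms=10000, poll_ms=50):
    """Poll predicate() until it is true or timeout_ms elapses.

    Returns True if the predicate became true, False on timeout.
    (Hypothetical helper, named here for illustration only.)
    """
    start = ticks_ms()
    while not predicate():
        if ticks_diff(ticks_ms(), start) >= timeout_ms:
            return False
        sleep_ms(poll_ms)  # sleep between polls instead of spinning on `pass`
    return True
```

With the reproduction code above, the busy-wait loop would become something like `wait_until(s.isconnected)`, which also returns instead of hanging forever if the connection never comes up.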
It's more to show the bug than as a real example. In real code that uses, eg, uasyncio one can't put short sleeps randomly throughout the code. |
Keep in mind that the main fork of micropython only enables one of the two ESP32 cores, and a bunch of critical work needs to be done in the system idle process which is easy to starve if you're busy-waiting. I've been using the @loboris fork that enables both cores, which seems to handle some of this stuff better. |
Were you able to run asyncio on the loboris fork? I tried it with both cores enabled and not even the simplest async function (even with idle() in it) works. It hangs until the watchdog is triggered. It does not even execute the first command inside the async function, like this:
Results after a few seconds in:
|
I was able to trace the problem down to the function utimeq.peektime(), which behaves correctly on the ESP8266 but on the ESP32 returns a really large number, so asyncio.wait() is called with a really large value. But I'm not sure how this error happens. |
There is some difference between the ESP8266 and loboris ports regarding time stamps. The ESP8266 port treats all time stamps as short integers (31 bit), which eventually may wrap around. The code deals fine with that. In the loboris port, ticks_ms() and ticks_us() return up to 64-bit quantities, but when used by the utimeq module, and eventually by ticks_diff() and ticks_add(), these may be truncated. Funnily enough, it even runs into an error. Just try: |
You are right, thank you. The issue is being fixed by loboris currently. |
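As a rough illustration of the wrap-around arithmetic discussed in the preceding comments, here is a pure-Python model of modular tick counters. It is not the code of either port: the period `TICKS_PERIOD = 1 << 30` is an assumed value chosen for the example, and these `ticks_add` and `ticks_diff` functions are stand-ins written for this sketch. It shows why wrapped tick values still give correct small differences, while naive subtraction between a wide (untruncated) value and a wrapped one gives a huge number.

```python
# Assumed modulus for this sketch; real ports choose their own period.
TICKS_PERIOD = 1 << 30
TICKS_MAX = TICKS_PERIOD - 1
TICKS_HALF = TICKS_PERIOD // 2


def ticks_add(ticks, delta):
    """Add delta to a tick value, wrapping at TICKS_PERIOD."""
    return (ticks + delta) & TICKS_MAX


def ticks_diff(end, start):
    """Signed difference end - start, valid for spans below TICKS_HALF."""
    return ((end - start + TICKS_HALF) & TICKS_MAX) - TICKS_HALF


# A deadline that crosses the wrap-around point still works.
start = TICKS_MAX - 5
end = ticks_add(start, 20)
print(end)                     # 14: the counter wrapped
print(ticks_diff(end, start))  # 20: the difference is still correct

# Mixing a wide counter with a wrapped deadline breaks naive subtraction.
wide_now = (1 << 40) + 100                       # e.g. a 64-bit tick count
deadline = ticks_add(wide_now & TICKS_MAX, 50)   # wrapped deadline
naive_wait = wide_now - deadline                 # huge, meaningless value
safe_wait = ticks_diff(deadline, wide_now & TICKS_MAX)  # 50
```

A scheduler that computed its wait time like `naive_wait` would end up waiting for an enormous interval, which is consistent with the symptom reported above, though the actual cause in the loboris port may differ in detail.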
I've tried reproducing the problem as originally described on an M5Stack using today's 1.9.4 build (specifically:
Just to be sure I'm going through the same process... I copied the text as reported, removed the sleeps and used my local wifi connection details:
import uasyncio as asyncio
from utime import sleep
import network
s = network.WLAN(network.STA_IF)
s.active(True)
async def wifi_connect():
    print('WiFi connect')
    s.disconnect()
    await asyncio.sleep(1)
    s.connect('SSID', 'password')
    print('Awaiting connection')
    while not s.isconnected():
        await asyncio.sleep(1)
    print('Got connection, pausing')
    for _ in range(3):
        await asyncio.sleep(1)
    print('connection done')
    return
loop = asyncio.get_event_loop()
loop.run_until_complete(wifi_connect())
Saved it as
import network, upip
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect('SSID', 'password')
upip.install('micropython-uasyncio')
Then I opened a repl to the device using mpfshell (with the
Was that the right sequence of events to raise the issue? Can anyone else confirm the result? |
It will reconnect after a hard reset or power cycle. The question is whether it can reconnect on a second run (e.g. after a soft reset). You might also want to try @dpgeorge 's code sample as it is a more minimal test case than mine. |
OK, I see - the soft reset is the problem. Unfortunately I can confirm that the problem still exists on |
To make progress on this issue here are some things to try:
|
OK so: The That's also a mechanism where
We also haven't really answered the question of why it works on the first reset ... what changes? [UPDATE: some of that I can't replicate in the cold light of morning ... not sure about taskYIELD.] |
PS: I just noticed this is in the old repo. sigh we're going to have to do something about this repo, archive it off or something. I created micropython/micropython#4269 to carry this issue forward ... |
That is a very good point, and helped me to finally fix this issue. See discussion at micropython/micropython#4269 for further details about the fix. |
This sample works if run after a power cycle. It hangs if a WiFi connection is already present (e.g. on a second run). It works if the commented-out sleep() statements are inserted. The aim here is to achieve a reliable (re)connection regardless of initial conditions and after a WiFi outage.
Is there an issue of yielding to the underlying RTOS? sleep() statements are of course inappropriate in asynchronous code.