docs/ESP32: Reinitialize watchdog timer with longer timeout. #11981

Open: 1 commit to merge into base branch master

IhorNehrutsa
Contributor

@IhorNehrutsa IhorNehrutsa commented Jul 10, 2023

If the WDT timeout is short (a few seconds), a problem occurs when updating the software:
there is not enough time to copy updates to the device.
The ESP32 allows the watchdog to be reinitialized with a longer timeout, for example an hour.

Discussion in
ESP32/machine_wdt.c: Add WDT.deinit() method. #6631

The best explanation at comment

This PR is a copy of #10901, which was damaged by a rebase.

@dpgeorge
Member

I think we discussed this before, that the idea of a WDT is that it cannot be reconfigured. That's a feature, not a limitation, because it prevents code from accidentally disabling the WDT (eg by setting the timeout to a very large number).

If the WDT cannot be fed fast enough then other things should be fixed so it can be fed fast enough.

@IhorNehrutsa
Contributor Author

Let's start from the very beginning.
Please imagine the following.
You have a well-proven, tested, debugged and fully functional program running on a device mounted under the ceiling of a room or outside a window.
The device is connected to Wi-Fi and all communication happens over the air.
The WDT period is 1 second and it works perfectly (meaning you see a single WDT reset per week in the log file, for example due to a not completely reliable external sensor).
You use WebREPL to copy updates to the device (only WebREPL, WebREPL only!).

A problem occurs when you want to update the software:
you don't have enough time to copy updates to the device.
You can connect to WebREPL and you can see the print output, but when you interrupt the program and try to copy the updated main.py,
the WDT resets the MCU after 1 second.

SOLUTION: Re-initializing the WDT gives you time to update the software.

boot.py is not interesting

ESSID = "essid"
PASSWORD = "password"

def do_connect():
    import network
    wlan = network.WLAN(network.STA_IF)
    wlan.active(True)
    if not wlan.isconnected():
        print('connecting to network...')
        wlan.connect(ESSID, PASSWORD)
        while not wlan.isconnected():
            pass
    print('network config:', wlan.ifconfig())
        
do_connect()        
        
import webrepl
webrepl.start()

main.py starts automatically

from machine import WDT

wdt = WDT(timeout=1000)  # ms

try:
    while True:
        # do something useful in less than 1 second
        wdt.feed()
except KeyboardInterrupt:
    wdt = WDT(timeout=1000*60*10)  # 10 minutes

@beyonlo

beyonlo commented Sep 2, 2023

@IhorNehrutsa I like your PR a lot, because I will need to update my MicroPython firmware too (via ESP OTA) on my ESP32-S3. But I think it is possible to update without changing the watchdog timeout.

What I'm thinking of doing is:

  1. Reset (via software) my Main app (which has the 1 s watchdog) and boot into my OTA app (which has no watchdog) to update the firmware. That reset and boot into another app will stop the watchdog timer, right?
  2. In the OTA app, update the firmware (no watchdog timeout anymore).
  3. Reset (via software) the OTA app and boot into the new (updated) Main app again, which has the watchdog.

I think that with this approach you do not need to change the watchdog timer. Does that make sense to you?

Thank you!
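
The three steps above could be sketched roughly as follows. This is only an illustration under assumptions: it uses a marker file to decide which app to start after a reset, which is a simpler stand-in for booting a separate OTA app and is not taken from the thread. The names `OTA_FLAG`, `pick_mode` and `start` are hypothetical.

```python
OTA_FLAG = "ota.flag"  # hypothetical marker file name, not from the thread


def pick_mode(files):
    """Return "ota" when the marker file is present, otherwise "main"."""
    return "ota" if OTA_FLAG in files else "main"


def start(files, run_main, run_ota):
    """Dispatch to one of two entry points after a reset.

    Only run_main is expected to create the watchdog, so the OTA path
    runs without any watchdog timeout.
    """
    if pick_mode(files) == "ota":
        return run_ota()
    return run_main()
```

On the device this might be called as `start(os.listdir(), run_main, run_ota)`, where `run_main` creates `WDT(timeout=1000)` and `run_ota` never creates a WDT. To switch apps, the Main app would create the marker file and call `machine.reset()`, and the OTA app would delete the marker before resetting back.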

@IhorNehrutsa
Contributor Author

@beyonlo
Good idea! I will try it with a generic ESP32.
Could you share links to OTA firmware/functionality/how-to?
Thank you.

@Carglglz
Contributor

Carglglz commented Sep 2, 2023

@beyonlo @IhorNehrutsa
Not sure what your use case is, but if extreme timing precision [1] is not a requirement then I would recommend using asyncio and aiorepl, which works nicely with WebREPL too. That way you could have access to the REPL while feeding the watchdog and running scripts / OTA firmware updates.

I know this works because I've been working on it for a while at asyncmd.
I have not finished all the documentation yet, but I think there is enough to get started.

The only limitation (apart from time precision) I've found so far concerning the number of tasks running at once is memory, which is probably not a problem on an ESP32-S3.

Footnotes

  1. Something like audio sampling or really fast neopixel animations.
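
A minimal sketch of the asyncio idea described above, assuming a dedicated task feeds the watchdog while other tasks (aiorepl, update scripts) keep running. The name `feed_watchdog` and the 0.5 s interval are illustrative and not taken from asyncmd; on the device, `wdt` would be a `machine.WDT` instance.

```python
import asyncio


async def feed_watchdog(wdt, interval_s=0.5):
    """Feed the watchdog forever, yielding to other tasks between feeds."""
    while True:
        wdt.feed()
        # Sleeping hands control back to the scheduler so that other
        # tasks (for example a REPL task) can run.
        await asyncio.sleep(interval_s)
```

With a 1 s timeout, every other task must yield to the scheduler more often than that. If any task blocks for longer, the feeder never gets to run and the WDT still resets the MCU.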

@beyonlo

beyonlo commented Sep 2, 2023

@Carglglz I read about aiorepl and it is wonderful. I want to use it in the future for real-time inspection :) But in my case I need to download a tar.gz file, unpack/unzip it into my flash, and then update everything.

The problem is that writing data to the flash is blocking. So, with a watchdog timeout of 1 second, writing/unpacking all that data to the flash will take more than 1 s. So, even though I'm using asyncio, writing data to the same flash that the MicroPython code is running from is blocking.

@beyonlo

beyonlo commented Sep 2, 2023

> @beyonlo Good idea! I will try it with a generic ESP32. Could you share links to OTA firmware/functionality/how-to? Thank you.

@IhorNehrutsa I have not started writing that OTA app yet (to then boot into it from the Main app), but I want to use the native ESP32 OTA that is already supported by the MicroPython ESP32 port: https://docs.micropython.org/en/latest/library/esp32.html?highlight=ota#flash-partitions

@beyonlo

beyonlo commented Sep 2, 2023

> @beyonlo Good idea! I will try it with a generic ESP32. Could you share links to OTA firmware/functionality/how-to? Thank you.

@IhorNehrutsa There is a PR for ESP32 OTA, #7048. I tested it (quickly) a long time ago and had no success. I plan to check it again when I start on an OTA app!

@Carglglz
Contributor

Carglglz commented Sep 2, 2023

@beyonlo

> The problem is that writing data to the flash is blocking. So, with a watchdog timeout of 1 second, writing/unpacking all that data to the flash will take more than 1 s. So, even though I'm using asyncio, writing data to the same flash that the MicroPython code is running from is blocking.

I recognise that a 1 s watchdog timeout is pretty tight; any reason why 5 s or 10 s wouldn't work?
tbh I haven't tested unpacking yet, but if possible I would split the data into smaller batches and await asyncio.sleep_ms after every chunk is written. And it looks like this could be possible with deflate, since:

> It is itself a stream and implements the standard read/readinto/write/close methods.

The only trade-off I see is that it would be a few seconds slower I guess... 🤔
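
The chunking idea could look something like the sketch below. It is untested on hardware and makes assumptions: `copy_in_chunks` is a hypothetical helper, the 4096-byte chunk size is arbitrary, and `src` is any stream with `readinto` (on the device that could be a deflate stream, in line with the documentation quote above). It uses `asyncio.sleep(0)` rather than `sleep_ms` so that the same code also runs under CPython.

```python
import asyncio


async def copy_in_chunks(src, dst, chunk_size=4096):
    """Copy src to dst one chunk at a time, yielding after every write."""
    buf = bytearray(chunk_size)
    view = memoryview(buf)
    total = 0
    while True:
        n = src.readinto(buf)
        if not n:  # 0 or None means end of stream
            break
        dst.write(view[:n])
        total += n
        # Give other tasks (such as a watchdog feeder) a chance to run.
        await asyncio.sleep(0)
    return total
```

Each individual write still blocks, so the chunk size would have to be small enough that one read plus one write stays well under the watchdog timeout. That would need to be measured on the actual flash.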

@beyonlo

beyonlo commented Sep 2, 2023

Hello @Carglglz

> The problem is that writing data to the flash is blocking. So, with a watchdog timeout of 1 second, writing/unpacking all that data to the flash will take more than 1 s. So, even though I'm using asyncio, writing data to the same flash that the MicroPython code is running from is blocking.

> I recognise that a 1 s watchdog timeout is pretty tight; any reason why 5 s or 10 s wouldn't work?

In my case I need to read some sensors every ~500 ms (it is a requirement; I have a thread just for that), with a maximum of 1 s/2 s, and if they are not read within 2 s, I need to activate a relay (alarm).

> tbh I haven't tested unpacking yet, but if possible I would split the data into smaller batches and await asyncio.sleep_ms after every chunk is written. And it looks like this could be possible with deflate, since:

Yes, I thought of that too. I'm already using the new deflate to compress some log files. The problem is that unpacking/unzipping the ~1.3 MB MicroPython .bin file will take much more than 1 s (maybe even 10 s will not be enough) and it (maybe) can't be split. This tar.gz package also contains a big ~1.5 MB file.js.gz. Well, this file.js.gz can be split at the source and concatenated again afterwards.

In my opinion, it may be quite dangerous to work at the limit of the watchdog timeout during the OTA. I prefer a safer alternative, such as booting into another app (the OTA app) without a watchdog, just to update. That way, even if something takes more time than calculated, it will not be a problem, because the OTA app does not have the watchdog.

docs/ESP32: Reinitialize watchdog timer with longer timeout.

Signed-off-by: Ihor Nehrutsa <IhorNehrutsa@gmail.com>

4 participants