M4 Express can deadlock on certain complex import chains #1283
Comments
I've seen a similar failure when the internal C code uses more stack than the allocated stack space. It then writes into the heap, and the moment the overwritten object is referenced it can cause a hard fault. This case may be different though.
We could add |
@klardotsh Do you have a commit in your repo that provokes the problem? I'd like to test against it.
There's no commit for it explicitly (it got rebased away when I was squashing down my then-WIP branch). However, removing everything above line 37 in https://github.com/KMKfw/kmk_firmware/blob/master/kmk/firmware.py (master branch of KMKfw/kmk_firmware) should trigger at least the RuntimeError. I'm not sure if it repros the deadlock (it may?), and I didn't end up with time this weekend to construct an independent repro example, sadly. Flashing instructions for KMK are available at https://github.com/KMKfw/kmk_firmware/blob/master/docs/flashing.md (it rsyncs over the

If that doesn't repro, I'll try to assemble a specific and shrunken-down repro example this week.
I don't think this is an issue anymore because 1) we check to make sure the stack hasn't overwritten the heap now and go into safe mode if it does and 2) we can enter safe mode manually by clicking reset when the status neopixel is yellow.
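For anyone landing here later, a rough illustration of what a "stack hasn't overwritten the heap" check can look like. This is only a toy model in desktop Python, assuming the check works like a guard word (canary) at the bottom of the stack region; `ToyMemory`, `CANARY`, and the sizes are made up for this sketch and are not CircuitPython names.

```python
CANARY = 0xC0FFEE  # arbitrary marker value, chosen for this sketch


class ToyMemory:
    """Flat word array: heap at low addresses, stack grows down from the top."""

    def __init__(self, size, stack_words):
        self.words = [0] * size
        self.stack_limit = size - stack_words  # lowest address the stack may use
        self.sp = size                         # stack pointer starts past the top
        self.words[self.stack_limit] = CANARY  # guard word at the stack's floor

    def push(self, value):
        # No bounds check, like a raw C stack: it will happily run into the heap.
        self.sp -= 1
        self.words[self.sp] = value

    def stack_ok(self):
        # The guard word survives only if the stack never reached its limit.
        return self.words[self.stack_limit] == CANARY


mem = ToyMemory(size=16, stack_words=4)

for value in range(1, 4):   # three pushes stay above the guard word
    mem.push(value)
ok_before = mem.stack_ok()

for value in range(4, 8):   # four more pushes run through the guard word
    mem.push(value)
ok_after = mem.stack_ok()

print("guard intact after shallow use:", ok_before)
print("guard intact after overflow:", ok_after)
```

A check like this only detects the overflow after the fact, which fits the "go into safe mode" behaviour described above rather than preventing the overwrite.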
I don't have my board handy to provide a proper repro case right now, so I'll do what I can to describe the scenario until I can provide said repro case (and/or crack out my JLINK and just dive in):

- Normally, when stack depth is exceeded (too many nested imports), a `RuntimeError` is raised (and if this happens in `main.py`, safe mode should be triggered)
- It appears some cases of deeply nested import trees will bypass the `RuntimeError` entirely and simply lock the device. After a while, the serial console disconnects, and `dmesg` starts complaining that it needs to reset the device, but that the USB device won't respond to addresses (meaning all execution of anything on the device has stopped, probably including the supervisor)
- If this happens in a REPL, a simple reboot of the device (through the button on ItsyBitsy/Feather) gets you back to a safe state. If this happens in `main.py`, the board is soft-bricked until the internal flash is wiped, which requires a custom build of CircuitPython to be flashed over UF2 that forcibly recreates the internal filesystem, and then another flash of "actual stable" CircuitPython (lest your filesystem be wiped every boot from then on)

Some context:

- `RuntimeError` easily)
- Also interestingly, updating the max stack size in `boot.py` does not fix this. Setting the value to anything below 650 results in the `RuntimeError`, anything over 700 and the modules don't have enough heap space to actually compile (I assume) and fail to import, anything in between and (if I recall correctly - this was a few days ago) I'd deadlock.
- The project branch that triggered this is available here: https://github.com/KMKfw/kmk_firmware/tree/topic-planck-klaranck. In `kmk/firmware.py` I hack around this issue and things work - I believe removing the giant block at the top of the file (everything before `Thanks for sticking around. Now let's do real work, starting below`) may repro one or both of the symptoms described above when trying to use `user_keymaps/klardotsh/klarank_featherm4.py` as `main.py`
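To show why import order can matter for stack depth at all, here is a minimal sketch. It assumes the workaround in `kmk/firmware.py` amounts to importing the deeply nested modules up front in a flat order, which is my reading and not confirmed above. It runs on desktop CPython, not CircuitPython; the `chain_*` module names are hypothetical, and Python frame counts are only a stand-in for the C stack usage on the board.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module names; each one imports the next, like a nested import tree.
CHAIN = ["chain_a", "chain_b", "chain_c", "chain_d"]

# Every generated module records how many frames were live when its body ran.
MODULE_TEMPLATE = """\
import sys
_frame, DEPTH = sys._getframe(), 0
while _frame is not None:
    DEPTH, _frame = DEPTH + 1, _frame.f_back
"""


def write_chain(directory):
    """Write chain_a -> chain_b -> chain_c -> chain_d into the directory."""
    for name, nxt in zip(CHAIN, CHAIN[1:] + [None]):
        body = MODULE_TEMPLATE + (f"import {nxt}\n" if nxt else "")
        with open(os.path.join(directory, name + ".py"), "w") as handle:
            handle.write(body)


def forget_chain():
    """Drop the generated modules from the import cache."""
    for name in CHAIN:
        sys.modules.pop(name, None)


def max_import_depth(order):
    """Import the given modules in order and return the deepest body frame count."""
    forget_chain()
    for name in order:
        importlib.import_module(name)
    return max(sys.modules[name].DEPTH for name in CHAIN)


with tempfile.TemporaryDirectory() as tmp:
    write_chain(tmp)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing only the root makes every import nest inside the previous one.
        nested_depth = max_import_depth([CHAIN[0]])
        # Importing leaf-first means each nested import is already cached.
        flat_depth = max_import_depth(list(reversed(CHAIN)))
    finally:
        sys.path.remove(tmp)
        forget_chain()

print("deepest frame count, nested imports:", nested_depth)
print("deepest frame count, leaf-first imports:", flat_depth)
```

With leaf-first imports every module body runs at the top-level depth, because the inner `import` statements find the module already in `sys.modules` and never execute it again. Whether that is enough to avoid the lockup described above, as opposed to just the `RuntimeError`, is not something this sketch can show.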