Skip to content

esp32/esp32_common.cmake: Switch back to the vendored TinyUSB copy. #17913

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

agatti
Copy link
Contributor

@agatti agatti commented Aug 13, 2025

Summary

This PR reverts d737112, fixing USB issues reported by macOS users on ESP32S2/ESP32S3 boards.

The commit's intention was to rely on the TinyUSB tree provided by Espressif as a dependency of their USB support library (so maintenance and other bug fixes would come from them rather than from MicroPython).

However, this seems to have opened a Pandora's box of incompatibilities triggered by macOS's USB stack behaviour (see discussion #17465 and issues #17560 and #17865).

Whilst this may look like a permanent solution, it is not. There are some important race condition fixes added in Espressif's patched USB code that are not available in MicroPython's TinyUSB copy (along other bug fixes and feature additions). Still, cutting macOS users out of their boards probably isn't the best thing to do.

Testing

Before this change, copying a file to the board from a macOS machine via USB using mpremote would fail either on the first or on the second iteration.

With this change applied, repeated copies to and from an ESP32S3-DevKitC-R8N16 succeed without problems.

Trade-offs and Alternatives

The only trade-off I can see right now is that by reverting to that particular TinyUSB version we're missing on race condition fixes being added both by the TinyUSB authors and by Espressif.

Alternatives? Where to start... There seems to be a bunch of incompatibilities manifesting at the same time only in a very specific situation:

  • This is triggered by macOS's USB stack, things seem to be fine on Linux and Windows
  • The issue shows up as either losing data and deadlocking TinyUSB, not deadlocking anything but silently duplicating data instead, or just hang on the REPL when printing the help message; the whole lot seems to be board-dependent to a certain extent
  • By forcing different versions of the espressif/esp_tinyusb and espressif/tinyusb components in ports/esp32/main/idf_component.yml things seem to be fixed for some users although with the same versions set I could still get my board to lock up during a stress test
  • When letting espressif/tinyusb match the version of what's present in lib/tinyusb, the issue still appears - so this seems related to espressif/esp_tinyusb instead, but again, changing version numbers changes its final behaviour
  • This was reported on both ESP32S3 and ESP32S2, so race conditions and multi-core issues do not seem to be related

I got a macOS machine on loan from a friend and I can at least replicate the mpremote issues on my board so I can slowly work on this in the meantime (I've been busy with a family situation the past couple of weeks).

I've only got one S3 board and to have JTAG exposed on GPIOs in order to have USB CDC available an eFuse must be burnt (thanks @projectgus for the help on that!). I'm a bit hesitant to burn the eFuse on my board right now, being my only S3 specimen; I've ordered a new one but the shipping times combined with a lackluster postal service plus holiday season here means somebody else may fix this issue before I am ever able to get a JTAG probe attached to the new board (if that happens I won't mind at all!).

Incidentally, I am able to replicate the same issue on a whole different set of conditions (WCH CH32V208 single-core MCU using STM's DCD while connecting from Linux) using the vendored TinyUSB copy, but that may be caused by a coding error on my end.

This commit reverts d737112, fixing USB
issues reported by macOS users on ESP32S2/ESP32S3 boards.

The commit's intention was to rely on the TinyUSB tree provided by
Espressif as a dependency of their USB support library (so maintenance
and other bug fixes would come from them rather than from MicroPython).

However, this seems to have opened a Pandora's box of incompatibilities
triggered by macOS's USB stack behaviour (see discussion micropython#17465 and
issues micropython#17560 and micropython#17865).

Whilst this may look like a permanent solution, it is not.  There are
some important race condition fixes added in Espressif's patched USB
code that are not available in MicroPython's TinyUSB copy (along other
bug fixes and feature additions).  Still, cutting macOS users out of
their boards probably isn't the best thing to do.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
@projectgus
Copy link
Contributor

Thanks @agatti for digging into this.

By forcing different versions of the espressif/esp_tinyusb and espressif/tinyusb components in ports/esp32/main/idf_component.yml things seem to be fixed for some users although with the same versions set I could still get my board to lock up during a stress test

Just to check all the details, you don't get the stress test lockup with the version in this PR? Has anyone else reported problems when running the newer versions of the Espressif components?

Given that Espressif are likely to continue to debug and patch issues in their forks and then submit them upstream periodically into the future, strategically it might still be better to use their fork rather than upstream TinyUSB (especially with things like ESP32-P4 support looming on the horizon). What do you think?

If we can isolate the stress test lockup to TinyUSB then we can also report this to them (although one thing I can't account for is neither ESP-IDF or esp-tinyusb have similar bug reports regarding macOS+USB that I can see, which makes me think our unusual TinyUSB integration in MicroPython might be a contributing factor...)

Regarding JTAG debugging the stress test, I can borrow a Mac laptop and do some testing here if I don't need to set up a full development environment on the Mac. Is it easy to explain how to run it?

@dpgeorge
Copy link
Member

I can run stress tests on an old Macbook Air if needed.

@agatti
Copy link
Contributor Author

agatti commented Aug 14, 2025

Just to check all the details, you don't get the stress test lockup with the version in this PR? Has anyone else reported problems when running the newer versions of the Espressif components?

That's correct, I don't encounter lockups with this commit applied.

Regarding newer component versions: yes, there have been problem reports. #17865 (comment) for example, in which an S3 devkit board that used to work before now hangs at REPL with the updated Espressif components. However that same issue is also fixed by those same components on two different boards that also had the issue in the first place.

Given that Espressif are likely to continue to debug and patch issues in their forks and then submit them upstream periodically into the future, strategically it might still be better to use their fork rather than upstream TinyUSB (especially with things like ESP32-P4 support looming on the horizon). What do you think?

Personally I'm all for it. If once this gets sorted out a similar issue arises it should be easy to switch to a local TinyUSB repo copy, that's what we've been doing after all.

If we can isolate the stress test lockup to TinyUSB then we can also report this to them (although one thing I can't account for is neither ESP-IDF or esp-tinyusb have similar bug reports regarding macOS+USB that I can see, which makes me think our unusual TinyUSB integration in MicroPython might be a contributing factor...)

That's the interesting bit. The PHY initialisation code in MPy doesn't set external GPIO pin indices in the configuration structure, but since only the internal interface is used it shouldn't really matter. I'd wager if that was the problem it would have shown up on Linux and WSL too.

Everything else is virtually the same, except for how the TinyUSB task is run. That could be related to the fact that the issue mainly presents on macOS as TinyUSB's state machine may be a step or two behind what it expects to be? I don't think modern macOS versions let you dump USB traffic, but I can take a look at that to see if a particular pattern emerges.

Regarding JTAG debugging the stress test, I can borrow a Mac laptop and do some testing here if I don't need to set up a full development environment on the Mac. Is it easy to explain how to run it?

This should be enough:

dd if=/dev/urandom of=file bs=250000 count=1
for i in $(seq 0 50); do mpremote cp file :file-$i; mpremote cp :file-$i file-$i; diff file file-$i; done

On the laptop/board I run this it hangs on file number 35. Sometimes on smaller runs it'd complete all transfers but then hang on mpremote ls. I haven't tried to add delays between transfers or things like that - the situation was already puzzling to begin with.

I can run stress tests on an old Macbook Air if needed.

For the record, I can make things hang on macOS 13.7.6 - maybe older versions (or newer, that is) may not exhibit the issue.

I wish I could provide more help on this but things are "interesting" on my end to say the least.

@agatti
Copy link
Contributor Author

agatti commented Aug 14, 2025

If that helps, here's a wireshark dump of the USB interface for a failed transfer between macOS 13.7.6 and an ESP32S3 using the release v1.26.0 firmware off micropython.org. There's some fluff at the beginning, but if you need any pointers in the timeline:

  • Packet 587 should be the first packet sent by mpremote
  • Packet 12205 is the last valid file data chunk sent from macOS
  • Packet 12217 is the error packet reported by the board

failed_mpremote_copy.pcapng.gz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants