Description
Port, board and/or hardware
rp2 port, PICO and WeACTStudio boards
MicroPython version
MicroPython v1.24.0-preview.39.g411d66586c on 2024-06-18; WeAct Studio RP2040 with RP2040
(current master)
Reproduction
- Copy this code block to main.py on board:
from machine import Pin
import time
led = Pin(Pin.board.LED, Pin.OUT)
for i in range(100):
print("abcdefghijklmnopqrstuvwxyz")
time.sleep_ms(100)
led(1) # connect while LED is on to trigger bug
for i in range(100):
print("ABC ", i)
time.sleep_ms(50)
led(0)
- Connect the board to USB or hard reset it.
- While the LED is on, connect with "mpremote a0" or similar.
Expected behaviour
Since PR #14485 merged, expected output is something like:
[... similar lines cut from here ...]
ABC 96
ABC 97
ABC 98
ABC 99
MicroPython v1.24.0-preview.39.g411d66586c on 2024-06-18; WeAct Studio RP2040 with RP2040
Type "help()" for more information.
>>>
Observed behaviour
This is a race condition so it depends on timing, but on Linux the output sometimes shows the REPL behaving as if some random input taken from the earlier output has been sent to it, i.e.
[... similar lines cut from here ...]
ABC 96
ABC 97
ABC 98
ABC 99
MicroPython v1.24.0-preview.39.g411d66586c on 2024-06-18; WeAct Studio RP2040 with RP2040
Type "help()" for more information.
>>> stuvwxyz
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'stuvwxyz' isn't defined
>>>
>>> abcdefg
If the output printed earlier from main.py includes valid Python statements, these might even be executed...
Additional Information
The on-the-wire behaviour can be seen in usbmon capture output where the USB captures
- The host sets DTR, RTS via USB-CDC control transfers.
- The rp2 sends all of the buffered data to the host (correctly)
- The host unexpectedly echoes back some of the data to the rp2, one character at a time.
- If step 3 includes a newline then the device might try to evaluate it in the REPL, and returns an error.
Output of strace -tt mpremote a0
shows the last part of the puzzle:
12:37:58.522034 openat(AT_FDCWD, "/dev/ttyACM0", O_RDWR|O_NOCTTY|O_NONBLOCK|O_CLOEXEC) = 3
12:37:58.522533 flock(3, LOCK_EX|LOCK_NB) = 0
12:37:58.522733 ioctl(3, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B9600|CS8|CREAD|HUPCL|CLOCAL, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
12:37:58.522988 ioctl(3, TCGETS, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B9600|CS8|CREAD|HUPCL|CLOCAL, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
12:37:58.523032 ioctl(3, SNDCTL_TMR_START or TCSETS, {c_iflag=, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|, c_cflag=B115200|CS8|CREAD|HUPCL|CLOCAL, c_lflag=, ...}) = 0
12:37:58.523367 ioctl(3, TIOCMBIS, [TIOCM_DTR]) = 0
12:37:58.523480 ioctl(3, TIOCMBIS, [TIOCM_RTS]) = 0
12:37:58.523518 ioctl(3, TCFLSH, TCIFLUSH) = 0
The above shows:
- There is a window of approximately 1ms between when the serial port is opened by the host, and when TCSETS is used to set termios.
- During this window, the port is in the default "input echo" mode (see flags
ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE
). - Therefore if the USB device sends anything to the host during this 1ms window, it gets echoed back from the host to the device by Linux.
- After pyserial sets termios and disables echo, it also immediately flushes the input buffer so the input data received before this point is never actually received by mpremote. But some of it has already been echoed back to the device, and the device's REPL will then echo it back to the host!
I think the necessary conditions to trigger this are:
- main.py has to print a significant amount of output, and then be printing more output over time (i.e. with delays between prints).
- The host serial port has to be opened after the CDC buffer is full but before main.py is done printing.
- Under normal conditions, there is a delay (
cdc_connected_flush_delay
) that means output isn't sent to the host immediately after opening, instead it waits until after the 1ms race window is finished. However, if output is still actively being printed then the device may choose to flush early and triggers the bug.
Suggested fix
I think the fix may be pretty simple here, to make cdc_connected_flush_delay
apply to all flushes.
However this behaviour is odd and hard to see, so I wanted to document it here in case it comes up again in another context.
Code of Conduct
Yes, I agree