esp32: Apply the LWIP active TCP socket limit. #15952

Merged
merged 1 commit into micropython:master from bugfix/lwip_tcp_limit on Oct 10, 2024

projectgus
Contributor

@projectgus projectgus commented Oct 2, 2024

Summary

This is a workaround for a bug in ESP-IDF where the configuration setting for LWIP maximum active TCP sockets (PCBs) is not applied. See espressif/esp-idf#9670

Fixes cases where a lot of short-lived TCP connections can cause:

  • Excessive memory usage (unbounded number of sockets in TIME-WAIT).
  • Much higher risk of stalled connections due to repeated port numbers. The maximum number of active TCP PCBs is reduced from 16 to 12 to further reduce this risk.

This is not a 100% fix for the second point: a peer can still reuse a port number while a previous socket is in TIME-WAIT, and LWIP will reject that connection (in an RFC compliant way) causing the peer to stall.

(Note that this may not be a complete fix for every failure reported in those issues, but it should fix most of them. Will need additional info to reproduce any remaining problems, so probably worth opening a new issue.)
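To illustrate why an enforced limit bounds the number of sockets in TIME-WAIT, here is a toy model in Python. It is only a sketch under stated assumptions: `PcbPool` and `peak_time_wait` are hypothetical names, the 120 second TIME-WAIT period and the 200 ms connection interval are assumed values, and the eviction rule (drop the oldest TIME-WAIT entry when the pool is full) is a simplification, not the code in this PR or in LWIP.

```python
from collections import OrderedDict


class PcbPool:
    """Toy model of a capped TCP PCB pool; not lwIP's real data structures."""

    def __init__(self, limit, time_wait_ms=120_000):
        self.limit = limit                # None means "no cap" (the bug case)
        self.time_wait_ms = time_wait_ms  # assumed TIME-WAIT duration
        self.active = set()
        self.time_wait = OrderedDict()    # conn id -> close time, oldest first

    def expire(self, now_ms):
        """Drop TIME-WAIT entries whose full waiting period has elapsed."""
        while self.time_wait:
            closed_at = next(iter(self.time_wait.values()))
            if now_ms - closed_at < self.time_wait_ms:
                break
            self.time_wait.popitem(last=False)

    def open(self, conn_id):
        """Allocate a PCB; when the pool is full, evict the oldest TIME-WAIT one."""
        in_use = len(self.active) + len(self.time_wait)
        if self.limit is not None and in_use >= self.limit:
            if not self.time_wait:
                return False              # every PCB is active: refuse
            self.time_wait.popitem(last=False)
        self.active.add(conn_id)
        return True

    def close(self, conn_id, now_ms):
        """Move a connection from active to TIME-WAIT."""
        self.active.discard(conn_id)
        self.time_wait[conn_id] = now_ms


def peak_time_wait(limit, connections=1000, interval_ms=200):
    """Open and close one short-lived connection per interval; return the peak."""
    pool = PcbPool(limit)
    peak = 0
    for i in range(connections):
        now_ms = i * interval_ms
        pool.expire(now_ms)
        pool.open(i)
        pool.close(i, now_ms)
        peak = max(peak, len(pool.time_wait))
    return peak
```

Calling `peak_time_wait(12)` never exceeds 12 entries, while `peak_time_wait(None)` keeps growing until the first entries age out of TIME-WAIT. Early eviction only happens when the pool is actually full, which matches the trade-off described in this PR.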

Testing

Using the test program supplied in the issue report for #15844, and this test client:

#!/usr/bin/env python
import time
import requests

IP = '192.168.66.125:80'

start = time.time()

for i in range(1000_000):
    r = requests.get(f'http://{IP}/*JOY;{i};0;0;0;0;0', timeout=20)
    r.close()
    print(i, time.time() - start)
    time.sleep(0.15)
  • On master branch this test fails reliably, often in the first 500 iterations and always before 1000 iterations. Packet captures show a local port is reused by the client and that connection stalls (until either the client times out or the server socket ages out of TIME-WAIT).
  • With the fix in this PR applied but the default max number of TCP PCBs (16), failures were seen between 1000 and 2000 iterations.
  • With all changes in the PR applied, including reducing the max TCP PCBs to 12, a failure happened after 28,000 iterations and 1h40m (unclear if this was due to port reuse or another transient network issue). A follow up test has run over 7,000 iterations without failing. EDIT: This test stopped after 16,579 iterations due to reused port.

The frequency of TCP local port reuse depends on the client system (in this case desktop Linux), so may vary depending on the host OS and other network usage.
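For anyone wanting to check their own packet captures for this, the following is a minimal sketch of how candidate port reuse could be flagged. It assumes the capture has already been reduced to a list of `(timestamp_seconds, client_local_port)` pairs, one per connection, and that TIME-WAIT lasts 120 seconds; the function name and the sample data are hypothetical.

```python
def find_port_reuse(observations, time_wait_s=120.0):
    """Return (index, port, gap_s) for each connection that reuses a local port
    within time_wait_s of the previous connection from that port."""
    last_seen = {}  # local port -> timestamp of the most recent connection
    reused = []
    for index, (timestamp, port) in enumerate(observations):
        previous = last_seen.get(port)
        if previous is not None and timestamp - previous < time_wait_s:
            reused.append((index, port, timestamp - previous))
        last_seen[port] = timestamp
    return reused


# Hypothetical capture: port 50000 is reused after 0.4 s, port 50001 after 199.8 s.
sample = [(0.0, 50000), (0.2, 50001), (0.4, 50000), (200.0, 50001)]
candidates = find_port_reuse(sample)
```

A flagged entry is only a candidate for a stall: whether the server still holds the old socket in TIME-WAIT depends on how many connections were made in between and on the PCB limit.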

Trade-offs and Alternatives

  • The trade-off to sockets being cleaned up out of TIME-WAIT early is the possibility of delayed packets sent to the old socket being received by a new socket on the same port (see RFC1337). This patch compromises by only cleaning up sockets early when a lot of short-lived sockets are created, otherwise the full TIME-WAIT period should apply.
  • As discussed in the linked issues it is possible to lower the MSL instead, so TIME-WAIT always ends more rapidly (say reduced from 120 seconds to 10 seconds). On quality network links this is probably fine, but may lead to problems on poor network links.
  • LWIP could implement the behaviour of many other TCP/IP stacks: accept an out-of-window SYN on a TIME-WAIT socket so the reused port can be opened. Then port reuse would (mostly) not trigger stalled connections. See the last heading in this comment for some discussion. However, this is out of scope for MicroPython to change: it would need to be discussed with the LWIP maintainers, and MicroPython would then have to wait for the change to be adopted by esp-lwip.
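As a rough way to compare the second alternative, the steady-state number of sockets in TIME-WAIT with no PCB limit is approximately the connection rate multiplied by the TIME-WAIT duration. The sketch below assumes a hypothetical rate of 5 connections per second and treats the 120 second and 10 second figures above as TIME-WAIT durations (the text could also be read as MSL values, in which case the durations double).

```python
def steady_state_time_wait(rate_per_s, time_wait_s):
    """Approximate number of sockets sitting in TIME-WAIT at any moment,
    assuming a constant connection rate and no limit on PCBs."""
    return rate_per_s * time_wait_s


RATE_PER_S = 5.0  # assumed: one short-lived connection every 200 ms

default_count = steady_state_time_wait(RATE_PER_S, 120.0)
reduced_count = steady_state_time_wait(RATE_PER_S, 10.0)
print(default_count, reduced_count)
```

Under these assumptions, shortening TIME-WAIT shrinks the backlog in proportion, but it does so for every connection, whereas the approach in this PR only cuts TIME-WAIT short when the PCB pool is full.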

This work was funded through GitHub Sponsors.

@projectgus
Contributor Author

projectgus commented Oct 2, 2024

Uploading some builds for generic boards here if anyone would like to test:

EDIT: instructions for flashing below: #15952 (comment)

@dpgeorge
Member

dpgeorge commented Oct 3, 2024

Ahh, our good old friend --wrap 😂

This looks pretty good to me, a relatively small and self-contained workaround.

I will test it.

@projectgus
Contributor Author

Ahh, our good old friend --wrap 😂

Yeah 😅 . I originally thought I could do this by calling an internal LWIP API, but it was a ton fiddlier than this.

@projectgus
Contributor Author

As this TIME-WAIT phenomenon seems to have haunted me on and off for the past decade, I've decided that I'll put some of my own time into it if the lwIP developers are amenable: https://lists.nongnu.org/archive/html/lwip-devel/2024-10/msg00000.html

@dpgeorge
Member

dpgeorge commented Oct 3, 2024

I've decided that I'll put some of my own time into it if the lwIP developers are amenable

Very good!

@TRadigk

TRadigk commented Oct 4, 2024

Uploading some builds for generic boards here if anyone would like to test:

...

Tried the C3 image on two different ESP32 C3 boards, both of them got stuck in bootloop:

Invalid image block, can't boot.
ets_main.c 333 
ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0xd (SPI_FAST_FLASH_BOOT)
Saved PC:0x40047ed2
SPIWP:0xee
mode:DIO, clock div:1
load:0x3c150020,len:0x3cdb8
load:0x3fc96600,len:0x3044
load:0x40380000,len:0x1ec
load:0x42000020,len:0x14c3d4
Invalid image block, can't boot.
ets_main.c 333 
ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0xd (SPI_FAST_FLASH_BOOT)
Saved PC:0x40047ed2
SPIWP:0xee
mode:DIO, clock div:1
load:0x3c150020,len:0x3cdb8
load:0x3fc96600,len:0x3044
load:0x40380000,len:0x1ec
load:0x42000020,len:0x14c3d4
Invalid image block, can't boot.
ets_main.c 333 
ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x10 (RTCWDT_RTC_RST),boot:0xd (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3c150020,len:0x3cdb8
load:0x3fc96600,len:0x3044
load:0x40380000,len:0x1ec
load:0x42000020,len:0x14c3d4
Invalid image block, can't boot.
ets_main.c 333 

Flashing https://micropython.org/resources/firmware/ESP32_GENERIC_C3-20241003-v1.24.0-preview.378.gca6723b14.bin brought both of them back to life.
The flash command used (path shortened):

esptool --chip esp32c3 --port COM8 --baud 460800 write_flash -z 0x0 "D:\...\ESP32_GENERIC_C3-20241003-v1.24.0-preview.378.gca6723b14.bin"

And the same goes for the ESP32 S3 image.

@dpgeorge
Member

dpgeorge commented Oct 6, 2024

Testing

I have tested this PR. My test setup is:

I ran the following tests:

  • current master with ESP32_GENERIC for 10 minutes to prove it fails
  • this PR with ESP32_GENERIC for 37 hours
  • current master on PYBD-SF6 for 37 hours (parallel with above test)

Current master with ESP32_GENERIC

I ran this for 10 minutes and got 9 failures, all of which were ConnectTimeout errors from requests.

That's about 54 failures per hour.

This PR with ESP32_GENERIC

I ran this test for 37 hours. There were 400297 iterations of the client loop, i.e. that many attempted requests. Of those, 49 had an error, and all of them were of this form:

ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='192.168.0.61', port=8080): Max retries exceeded with url: /*JOY;1338;0;0;0;0;0 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x79bc338ef440>, 'Connection to 192.168.0.61 timed out. (connect timeout=20)'))"))

The first such error was at connection number 1338, after 466 seconds. The rest were more or less evenly distributed over the 37 hours.

Failure rate here is 1.32 failures per hour, about 40 times less than the failure rate on master.

Current master on PYBD-SF6

As a comparison to this PR I also ran in parallel the same test on master on PYBD-SF6 (using the same PC as the client, same access point). This board did nearly twice as many connections in the same time.

There were 715801 iterations of the client loop. And 9 failures.

One failure (the first failure, at iteration 36012) was:

ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))

The other 8 failures (the first at iteration 53639, and more or less distributed throughout the 37 hours) were the same ConnectTimeout as above.

Failure rate here is 0.24 failures per hour. But because this board had more iterations, to compare with ESP32_GENERIC the equivalent failure rate would be 0.14, which is about 10 times less than ESP32_GENERIC with this PR.
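The rates quoted in the three test sections can be reproduced from the raw counts. This is just the arithmetic written out as a short Python sketch; the variable and function names are chosen for illustration.

```python
def failures_per_hour(failures, hours):
    """Average failure rate over the duration of a test run."""
    return failures / hours


# Counts and durations as reported in the test sections above.
master_esp32 = failures_per_hour(9, 10 / 60)   # 10 minute run
pr_esp32 = failures_per_hour(49, 37)           # 37 hour run
master_pybd = failures_per_hour(9, 37)         # 37 hour run

# PYBD-SF6 completed more iterations in the same time, so scale its rate
# down to the number of iterations ESP32_GENERIC managed.
pybd_normalised = master_pybd * 400297 / 715801

print(round(master_esp32), round(pr_esp32, 2),
      round(master_pybd, 2), round(pybd_normalised, 2))
print(round(master_esp32 / pr_esp32, 1), round(pr_esp32 / pybd_normalised, 1))
```

The second line of output gives the two ratios behind the "about 40 times" and "about 10 times" comparisons.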

Summary

  • This PR definitely makes things a lot better on esp32, but it doesn't completely fix the problem.
  • Probably this PR would approach the PYBD-SF6 result if MEMP_NUM_TCP_PCB were reduced to 5 to match PYBD-SF6.
  • I do not know if the ConnectTimeout errors correspond to port reuse, I only tracked the error raised by requests on the client.
  • There are always going to be errors with networking, so clients and server must both be written to be as robust as possible.

It would be interesting to see if implementing "accept an out-of-window SYN on a TIME-WAIT socket" in lwIP would really reduce the failure rate to zero. But that's a lot of work and probably impossible for us to integrate into MicroPython without forking lwIP and ESP-IDF.

Client test code

Modified version of @projectgus client code above to track errors:

import sys
import time
import requests

IP = sys.argv[1]

start = time.time()
errors = []

try:
    for i in range(1000_000):
        try:
            r = requests.get(f'http://{IP}/*JOY;{i};0;0;0;0;0', timeout=20)
            r.close()
        except Exception as er:
            print("Exception:", repr(er))
            errors.append((i, time.time() - start, er))
        print(i, time.time() - start, len(errors))
        time.sleep(0.15)
except KeyboardInterrupt:
    for er in errors:
        print(er)

@dpgeorge dpgeorge added this to the release-1.24.0 milestone Oct 7, 2024
@projectgus
Contributor Author

Tried the C3 image on two different ESP32 C3 boards, both of them got stuck in bootloop:

Sorry @TRadigk for the missing instruction. The .bin files attached above contain only the app, whereas the .bin files published on the website are the full flash contents, which combine the app with some other binary files.

To test: First flash a "full" MicroPython firmware .bin from the website, and then do esptool.py -p PORT -b 460800 write_flash 0x10000 micropython.bin to flash one of the app binaries attached above. (The difference is the starting address.)

@projectgus
Contributor Author

projectgus commented Oct 7, 2024

Very thorough testing, @dpgeorge! Nice.

I do not know if the ConnectTimeout errors correspond to port reuse, I only tracked the error raised by requests on the client.

FWIW I think most of them probably do; the ones I tested with a running packet capture did 100% of the time (on an otherwise quiet Wi-Fi network with good signal strength). Port reuse was sometimes really rapid (within a few subsequent connections). I don't fully understand why, but I guess I have a lot of browser tabs open! However, I fully agree with your other point that you can't ever assume a robust network and code should be prepared for some failures.

Probably this PR would approach the PYBD-SF6 result if MEMP_NUM_TCP_PCB were reduced to 5 to match PYBD-SF6.

I think so too. Just want to make the point for anyone reading along that, similar to reducing MSL, reducing MEMP_NUM_TCP_PCB (even to 12) to address this is really only a workaround. We're trading off reducing one kind of network robustness (correctly handling lost or delayed packets in the TIME-WAIT state) to increase another kind of network robustness (dealing with port reuse on repeated connections).

@TRadigk

TRadigk commented Oct 8, 2024

Tried the C3 image on two different ESP32 C3 boards, both of them got stuck in bootloop:

Sorry @TRadigk for the missing instruction. These .bin files attached above are only the app, and the .bin files published on the website are the full flash contents which incorporates some other binary files into one.

To test: First flash a "full" MicroPython firmware .bin from the website, and then do esptool.py -p PORT -b 460800 write_flash 0x10000 micropython.bin to flash one of the app binaries attached above. (The difference is the starting address.)

Thank you so much, @projectgus, for clearing this up. I was now able to fully test and compare ESP32_GENERIC_C3-20241003-v1.24.0-preview.378.gca6723b14 and the provided "app". The result is pretty clear to me. On the previous version I could execute (at best) 73 consecutive requests until Wi-Fi broke down.
With the new app I could run 1000 requests successfully, and only sometimes was my resilience handling triggered (unexpectedly closed connection encountered), in which case I could recover within 20 ms (not tuned down to the very last millisecond, but it "felt" like this should be enough to create a new HTTP client and reopen the connection).
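A client-side recovery loop of the kind described here could look roughly like the sketch below. It is not TRadigk's code: `call_with_retry`, `make_client` and `do_request` are hypothetical names, and the 20 ms delay and three attempts are assumed values. It catches `OSError` because the built-in `ConnectionError` and `TimeoutError`, as well as the exceptions raised by `requests`, all derive from it.

```python
import time


def call_with_retry(make_client, do_request, attempts=3, delay_s=0.02,
                    sleep=time.sleep):
    """Run do_request(client) with a fresh client per attempt.

    On a network error, close the client, wait delay_s and try again with a
    new one. Re-raise the last error once all attempts are used up.
    """
    last_error = None
    for _ in range(attempts):
        client = make_client()
        try:
            return do_request(client)
        except OSError as er:  # covers ConnectionError, TimeoutError, requests errors
            last_error = er
            sleep(delay_s)
        finally:
            close = getattr(client, "close", None)
            if close is not None:
                close()
    raise last_error
```

With `requests`, `make_client` could be `requests.Session` and `do_request` a function that calls `client.get(url, timeout=20)`. Creating a new session on each attempt means the retry opens a new TCP connection instead of reusing a pooled one that may already be broken.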

This is a workaround for a bug in ESP-IDF where the configuration setting
for maximum active TCP sockets (PCBs) is not applied.

Fixes cases where a lot of short-lived TCP connections can cause:

- Excessive memory usage (unbounded number of sockets in TIME-WAIT).
- Much higher risk of stalled connections due to repeated port numbers. The
  maximum number of active TCP PCBs is reduced from 16 to 12 to further
  reduce this risk (trade-off against possibility of TIME-WAIT
  Assassination as described in RFC1337).

This is not a watertight fix for the second point: a peer can still reuse a
port number while a previous socket is in TIME-WAIT, and LWIP will reject
that connection (in an RFC compliant way) causing the peer to stall.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@dpgeorge dpgeorge force-pushed the bugfix/lwip_tcp_limit branch from 5a19a8d to 82e69df Compare October 10, 2024 06:57
@dpgeorge dpgeorge merged commit 82e69df into micropython:master Oct 10, 2024
8 checks passed
@dpgeorge
Member

Although not a complete fix, this PR improves things dramatically.

Merged.
