Skip to content

[Windows] test_httpservers hangs since Python 3.5 #69282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
terryjreedy opened this issue Sep 14, 2015 · 29 comments
Open

[Windows] test_httpservers hangs since Python 3.5 #69282

terryjreedy opened this issue Sep 14, 2015 · 29 comments
Labels
OS-windows tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@terryjreedy
Copy link
Member

BPO 25095
Nosy @terryjreedy, @pfmoore, @tjguk, @berkerpeksag, @vadmium, @zware, @zooba, @WildCard65
PRs
  • bpo-25095: test_httpservers hangs since Python 3.5 #5101
  • gh-69282: test_httpservers hangs since Python 3.5 #9564
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-09-14.03:56:33.216>
    labels = ['3.8', '3.7', 'tests', 'OS-windows', 'performance']
    title = 'test_httpservers hangs since Python 3.5'
    updated_at = <Date 2020-05-23.22:05:28.645>
    user = 'https://github.com/terryjreedy'

    bugs.python.org fields:

    activity = <Date 2020-05-23.22:05:28.645>
    actor = 'WildCard65'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests', 'Windows']
    creation = <Date 2015-09-14.03:56:33.216>
    creator = 'terry.reedy'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 25095
    keywords = ['patch', '3.5regression']
    message_count = 25.0
    messages = ['250606', '250721', '250725', '250739', '250751', '250845', '250846', '309467', '309470', '309473', '309480', '309498', '309499', '309515', '309516', '309522', '309528', '309530', '310867', '310885', '322839', '322856', '326626', '369666', '369751']
    nosy_count = 8.0
    nosy_names = ['terry.reedy', 'paul.moore', 'tim.golden', 'berker.peksag', 'martin.panter', 'zach.ware', 'steve.dower', 'WildCard65']
    pr_nums = ['5101', '9564']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue25095'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @terryjreedy
    Copy link
    Member Author

    test_httpservers ok on 3.4.3, hangs indefinitely (over an hour) on 3.5.0, win 7.

    @terryjreedy terryjreedy added stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump tests Tests in the Lib/test dir OS-windows labels Sep 14, 2015
    @vadmium
    Copy link
    Member

    vadmium commented Sep 15, 2015

    Can you identify which test case hangs? E.g. on Linux I can run the following to see each test case run:

    ./python -m unittest -v test.test_httpservers

    @terryjreedy
    Copy link
    Member Author

    test_get (test.test_httpservers.RequestHandlerLoggingTestCase) ...
    same on repeat

    @vstinner
    Copy link
    Member

    Where does it hang? For example, try to run the test using:

    python.exe -m test -v --timeout=10 test_httpservers

    Can it be a firewall or antivirus issue?

    Try to add some print() in TestServerThread.run() to check if the HTTP server is running or not.

    @terryjreedy
    Copy link
    Member Author

    test_get (test.test_httpservers.RequestHandlerLoggingTestCase) ... Timeout (0:00:10)!
    Thread 0x00001654 (most recent call first):
    File "C:\Programs\Python35\lib\socket.py", line 571 in readinto
    File "C:\Programs\Python35\lib\http\server.py", line 383 in handle_one_request
    File "C:\Programs\Python35\lib\http\server.py", line 417 in handle
    File "C:\Programs\Python35\lib\socketserver.py", line 684 in __init__
    File "C:\Programs\Python35\lib\socketserver.py", line 357 in finish_request
    File "C:\Programs\Python35\lib\socketserver.py", line 344 in process_request
    File "C:\Programs\Python35\lib\socketserver.py", line 318 in _handle_request_noblock
    File "C:\Programs\Python35\lib\socketserver.py", line 239 in serve_forever
    File "C:\Programs\Python35\lib\test\test_httpservers.py", line 47 in run
    File "C:\Programs\Python35\lib\threading.py", line 923 in _bootstrap_inner
    File "C:\Programs\Python35\lib\threading.py", line 891 in _bootstrap

    Thread 0x000017f8 (most recent call first):
    File "C:\Programs\Python35\lib\socket.py", line 571 in readinto
    File "C:\Programs\Python35\lib\http\client.py", line 243 in _read_status
    File "C:\Programs\Python35\lib\http\client.py", line 282 in begin
    File "C:\Programs\Python35\lib\http\client.py", line 1174 in getresponse
    File "C:\Programs\Python35\lib\test\test_httpservers.py", line 257 in test_get
    File "C:\Programs\Python35\lib\unittest\case.py", line 597 in run
    File "C:\Programs\Python35\lib\unittest\case.py", line 645 in __call__
    File "C:\Programs\Python35\lib\unittest\suite.py", line 122 in run
    File "C:\Programs\Python35\lib\unittest\suite.py", line 84 in __call__
    File "C:\Programs\Python35\lib\unittest\suite.py", line 122 in run
    File "C:\Programs\Python35\lib\unittest\suite.py", line 84 in __call__
    File "C:\Programs\Python35\lib\unittest\runner.py", line 176 in run
    File "C:\Programs\Python35\lib\test\support\init.py", line 1775 in _run_suite
    File "C:\Programs\Python35\lib\test\support\init.py", line 1809 in run_unittest
    File "C:\Programs\Python35\lib\test\test_httpservers.py", line 890 in test_main
    File "C:\Programs\Python35\lib\test\regrtest.py", line 1281 in runtest_inner
    File "C:\Programs\Python35\lib\test\regrtest.py", line 979 in runtest
    File "C:\Programs\Python35\lib\test\regrtest.py", line 763 in main
    File "C:\Programs\Python35\lib\test\regrtest.py", line 1565 in main_in_temp_cwd
    File "C:\Programs\Python35\lib\test\main.py", line 3 in <module>
    File "C:\Programs\Python35\lib\runpy.py", line 85 in _run_code
    File "C:\Programs\Python35\lib\runpy.py", line 170 in _run_module_as_main

    @vadmium
    Copy link
    Member

    vadmium commented Sep 16, 2015

    According to those back traces, the server has apparently already handled one request and is waiting for a second request. However the client is still waiting for a response from its original request.

    Some things I might try would be to disable the server, and make my own dummy server to see if it responds to that. Similarly, disable the client and manually make a request to the server and see what the response is. On Linux the “socat” or “netcat” programs are useful for this stuff, or you can just use the Python interactive interpreter to create a socket and send and receive.

    Client connection is made at <https://hg.python.org/cpython/file/v3.5.0/Lib/test/test_httpservers.py#l252\>. Server is constructed at <https://hg.python.org/cpython/file/v3.5.0/Lib/test/test_httpservers.py#l42\>.

    You could also try adding self.con.set_debuglevel(99) to the test_get() method, though I suspect it will just output the request sent, and not report any reply or headers. In my case the test works, and I see:

    send: b'GET / HTTP/1.1\r\nHost: 127.0.0.1:48059\r\nAccept-Encoding: identity\r\n\r\n'
    reply: 'HTTP/1.1 200 OK\r\n'
    [Unit test output]
    header: Server header: Date $

    @vstinner
    Copy link
    Member

    You may try "hg bisect" to find which revision broke test_httpservers. It can be slow if you have to recompile manually Python :-(

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 4, 2018

    Adding onto this issue, I believe I tracked down the change that caused the issue: c0a23e6

    This issue also affects the test with Python 3.6

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 4, 2018

    I believe I found the issue, might be an issue between sending the request + calling get_response() and the time it takes for server to process response, my print cases:

    test_err (test.test_httpservers.RequestHandlerLoggingTestCase) ... Creating a new connection to 127.0.0.1:50141
    Connecting to server
    Sending ERROR request, captured stderr instance is <_io.StringIO object at 0x05E5EC00>
    HANDLING AN ERROR REQUEST! SENDING RESPONSE!
    Request sent, starting to read response
    <http.client.HTTPResponse object at 0x05E6D7E0>
    RESPONSE SENT!
    Response obtained, err should be set, <_io.StringIO object at 0x05E5EC00>
    Running test's assert
    ok
    test_get (test.test_httpservers.RequestHandlerLoggingTestCase) ... Creating a new connection to 127.0.0.1:50143
    Connecting to server
    Sending GET request, captured stderr instance is <_io.StringIO object at 0x05E5EC00>
    Request sent, starting to read response
    HANDLING A GET REQUEST! SENDING RESPONSE!
    RESPONSE SENT! ENDING HEADERS!
    HEADERS ENDED!

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 4, 2018

    Scratch the previous message about the possible cause, I found the true cause, getresponse() is waiting for a specific header, one that BaseHTTPRequestHandler.send_error sends and BaseHTTPRequestHandler.send_response() doesn't, that header is "Connection" with message "close" (http.server#450), adding that to RequestHandlerLoggingTestCase.request_handler.do_GET before self.end_headers() fixes the deadlock.

    @berkerpeksag
    Copy link
    Member

    Thank you for tracking down the problem, William! I just took a quick look at your patch and it seems reasonable to me.

    Setting protocol_version to 'HTTP/1.1' means setting the close_connection attribute of BaseHTTPRequestHandler to False which is expected and a feature of HTTP/1.1 (persistent connections)

    Quoting RFC 2616:

    HTTP/1.1 defines the "close" connection option for the sender to
    signal that the connection will be closed after completion of the
    response.
    
    HTTP/1.1 applications that do not support persistent connections MUST
    include the "close" connection option in every message.
    

    So I think it's OK to send a "Connection: close" header in the test (it would be nice to add a comment though)

    @berkerpeksag berkerpeksag removed the stdlib Python modules in the Lib dir label Jan 4, 2018
    @WildCard65 WildCard65 mannequin changed the title test_httpservers hangs on 3.5.0, win 7 test_httpservers hangs since Python 3.5 Jan 4, 2018
    @WildCard65 WildCard65 mannequin added performance Performance or resource usage and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 4, 2018
    @vadmium
    Copy link
    Member

    vadmium commented Jan 5, 2018

    In the server, the send_header("Connection", "close") call sets the “close_connection” flag. This shuts down the connection once “do_GET” returns. Without the flag set, the server will wait and read another request.

    If you want the server to shut the connection down, I suggest to be explicit in setting “close_connection”. It should work even if no “Connection: close” appears in the HTTP protocol. The special behaviour of “send_header” I think you are relying on is not documented.

    On my Linux computer with the original code, I think the client shuts the connection down. This causes the server to see an empty “raw_requestline” and return from “handle_one_request”. It returns to “serve_forever” where it polls the “__shutdown_request” flag and sees that it should stop.

    The client shuts down the connection only because of subtleties in how the HTTP client manages the socket and how sockets are garbage collected. The response does not have Content-Length nor Transfer-Encoding fields, and would be terminated by the server shutting the connection down. So the HTTPConnection object cannot reuse the TCP connection and hands ownership to the HTTPResponse object returned by “getresponse”. Since this object is not saved anywhere, it gets garbage collected, which closes the socket and shuts the connection down. But perhaps on Windows the shutdown doesn’t happen, or perhaps the garbage collector is too slow.

    If I am right, closing the HTTPResponse object would also fix the deadlock. It is good practice to close the underlying socket anyway:

    with support.captured_stderr() as err:
        self.con.request('GET', '/')
        res = self.con.getresponse()
        
        # Shut down connection to stop the server reading from it
        res.close()
        self.con.close()

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 5, 2018

    Martin, your suggestion will never work as if you look at the trace back posted terry.reedy and my test print statements, both the client and server get stuck waiting to read data their respective socket, hence the deadlock. Adding the header "Connection" with value of "close" is logic taken from the method "send_error()" which is what test_err executes (same test class), that header along with send_error() adding the Content-Type and Content-Length headers, those headers are what is different between do_ERROR and do_GET, I've already tried locally having do_GET set both Content-Type and Content-Length headers without the Connection header, no difference (deadlock continued).

    @terryjreedy
    Copy link
    Member Author

    Today, at least, python -m test -v test_httpservers does not hang for me on any of 3.5, 3.6, or 3.7, installed or repository debug. I don't know if the offending test was just disabled or somehow fixed.

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 5, 2018

    It hangs for me on Windows 10 Professional running on a MSI gaming laptop for debug and PGO builds (Python 3.6)

    @vadmium
    Copy link
    Member

    vadmium commented Jan 5, 2018

    Sorry William, I forgot the client was waiting to read. But I don’t understand why your Connection field (which comes after the status line) allows the Python client to read the status line. Perhaps there is some malware scanner, firewall, or other proxy that intercepts the HTTP protocol? (I encountered something similar at work recently.)

    You said “my suggestion will never work”, but did you try setting “close_connection” on its own:

    def do_GET(self):
        self.send_response(HTTPStatus.OK)
        self.end_headers()
        self.close_connection = True  # Terminate response body to proxy

    You tried adding Content-Length, but did you try “Content-Length: 0”:

    def do_GET(self):
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-Length", "0")  # Stop proxy reading body
        self.end_headers()

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 5, 2018

    I have tried value 0 for "Content-Length" (along with "text/plain" for "Content-Type"), it was when I said I tried both "Content-Length" and "Content-Type", while I haven't tried directly setting "close_connection" in the handler, my solution is based on how "send_error()" works internally, not only that, but send_header is public API (by convention) and the "Connection" header is part of HTTP 1.1 so it should be documented ( reference: https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html ), so I don't really understand why my initial solution is consider "undocumented logic"

    @vadmium
    Copy link
    Member

    vadmium commented Jan 5, 2018

    Thanks, although the fact that “Content-Length: 0” doesn’t work kills my theory about the proxy.

    The “close_connection” flag is also a documented public API of Python: <https://docs.python.org/3/library/http.server.html#http.server.BaseHTTPRequestHandler.close_connection\>. According to the rules for framing the message body in <https://tools.ietf.org/html/rfc7230#section-3.3.3\>, the presence of “Connection: close” is not important. The server actually shutting down the connection is the key. That is why I prefer to explicitly set the flag (if it works).

    Anyway, your current proposal makes the server send a valid full HTTP response.

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 27, 2018

    Ok, I found another way to apply the solution to this issue, that is by adding the "Connection" header (with value of "close") to the client's request instead of the server's response.

    I'm going to use this other method as the client is expected to expect the server to shutdown the connection (as it asked the server to).

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Jan 27, 2018

    Alright, the PR is ready for review.

    @WildCard65 WildCard65 mannequin added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Jan 30, 2018
    @vadmium
    Copy link
    Member

    vadmium commented Aug 1, 2018

    I reproduced the problem on a Windows computer, and now understand why my "Content-Length: 0" suggestion isn't good enough on its own. It does solve the initial deadlock, but there is a further deadlock. The main thread is waiting for the server to shut down while it is holding the HTTP connection open, and the server is still trying to read a second request on that connection.

    Setting "self.close_connection = True" in the server (or using "Connection: close") solves both deadlocks. But it seems cleaner for the client to close the connection and response objects anyway, before shutting down the server. I would modify the "test_get" method:

    with support.captured_stderr() as err:
        self.con.request('GET', '/')
        res = self.con.getresponse()
        res.close()  # Not needed but cleans up socket if no Content-Length
    self.con.close()  # Needed to shut down connection with Content-Length

    I think my Windows computer has a firewall that holds TCP data if it looks like an unfinished HTTP request or response. I suspect Terry and William had a similar firewall. Here is a demonstration of the socket behaviour it causes:

    >>> from socket import *
    >>> listener = socket()
    >>> listener.bind(("127.1", 0))
    >>> listener.listen()
    >>> outgoing = create_connection(listener.getsockname())
    >>> [incoming, address] = listener.accept()
    >>> outgoing.sendall(b"GET / HTTP/1.1\r\n")  # Unfinished HTTP request
    >>> incoming.settimeout(3)
    >>> incoming.recv(300)  # Partial request should normally be received
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    socket.timeout: timed out
    >>> outgoing.sendall(b"\r\n")  # Complete the request
    >>> incoming.recv(300)  # Only now is the request received
    b'GET / HTTP/1.1\r\n\r\n'
    >>> incoming.sendall(b"HTTP/1.1 200 OK\r\n")  # Unfinished response
    >>> outgoing.settimeout(3)
    >>> outgoing.recv(300)  # Should normally be received
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    socket.timeout: timed out
    >>> incoming.sendall(b"Content-Length: 0\r\n\r\n")  # Complete the response
    >>> outgoing.recv(300)  # Only now received
    b'HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n'

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented Aug 1, 2018

    My computer was running BitDefender Total Security 2018 (At the time, currently running the 2019 edition) and MalwareBytes 3 Premium.

    BitDefender has both a built-in firewall and a web protection module while MalwareBytes has a web protection module.

    @vadmium
    Copy link
    Member

    vadmium commented Sep 28, 2018

    Hi William, when I mentioned “Content-Length”, I meant adding it to the response from the server. See the second version of “do_GET” in my earlier comment <https://bugs.python.org/issue25095#msg309522\>. But that is no good without also adding the “self.con.close()” line to the client I mentioned in two of my other comments.

    In your latest rev. e6b09d4 (and earlier rev. 2022b6f) you send “Content-Length: 0” in a GET _request_, which doesn’t make much sense. GET requests never have bodies of any length. It is the “200 OK” _response_ that should have a body according to HTTP. The server has to shut down the connection or add “Content-Length: 0” (or use chunked encoding) to indicate the end of the body to the proxy.

    So I would prefer to either adjust the “do_GET” server function as well as adding “self.con.close()” to the client, or just go back to rev. c1afa6b (“Connection: close”).

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented May 22, 2020

    I'll get to it Saturday.

    @WildCard65
    Copy link
    Mannequin

    WildCard65 mannequin commented May 23, 2020

    I've made the changes you've requested.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @terryjreedy
    Copy link
    Member Author

    @vadmium Martin, 3-line the patch (ignore new comments) was revised over 2 years ago in response to your comments. It does not seem to be needed, but might be a good idea. Can you give us an opinion? Should we close this and the PR?

    @iritkatriel iritkatriel added pending The issue will be closed if no feedback is provided type-bug An unexpected behavior, bug, or error and removed performance Performance or resource usage labels Aug 17, 2022
    @arhadthedev
    Copy link
    Member

    @terryjreedy vadmium has no GitHub activity since last September. Can the PR be merged then?

    @terryjreedy terryjreedy removed 3.8 (EOL) end of life 3.7 (EOL) end of life pending The issue will be closed if no feedback is provided labels Dec 13, 2023
    @terryjreedy
    Copy link
    Member Author

    I do not know enough about httpserver to have an opinion.

    @vadmium
    Copy link
    Member

    vadmium commented May 19, 2024

    My understanding of how the test runs without interference from a firewall:

    1. Client sends GET request
    2. Server’s GET handler sends the start of a response, and logs the request
    3. (Server does not send Content-Length, Transfer-Encoding or connection shutdown to end the response)
    4. GET handler returns and server waits for second request from the same connection
    5. Client calls getresponse, which reads the response header and determines response should be terminated by connection shutdown by server
    6. Client discards the response object returned by getresponse
    7. Garbage collection of the client response object shuts down the connection
    8. Server connection handler returns and it waits for another connection or server shutdown
    9. (Client leaves its HTTPConnection object open)
    10. Main thread shuts down the server

    The firewall was apparently holding the response header because the body was not terminated (step 3). This deadlocked with the client receiving the header in step 5.

    Previously, I suggested for the server to respond with “Content-Length: 0” in step 3 to allow the firewall to forward the response. However that caused HTTPConnection to keep the connection alive for a second request, and garbage-collecting the response object did not affect the connection in step 7. Since the HTTPConnection was not closed by step 9, the server could not be shut down in step 10. So one solution would be:

    Solution 1a: Send ”Content-Length: 0” in GET handler in step 3
    Solution 1b: Ensure the HTTPConnection is closed (step 9) before shutting the server down in step 10

    But it would be truer to the original test for the server to shut the connection down at the end of the response:

    Solution 2a: Request “Connection: close” in step 1 (William’s second solution; relies on undocumented behaviour IMO, and is a client workaround for server misbehaviour)
    Solution 2b: Respond with send_header in step 3 (William’s initial solution; also relies on undocumented side effect)
    Solution 2c: Respond with close_connection in step 3 (clearer and more robust IMO)

    The bare minimum to fix the firewall issue would be either one of solutions 2a/2b/2c, or both solution 1a and one of 1b/2a/2b/2c. None of these solutions are exclusive. The current pull request implements solutions 1a, 1b, and 2a.

    My current preference is solution 2c, as in my first version of do_GET in #69282 (comment).

    @vstinner vstinner changed the title test_httpservers hangs since Python 3.5 [Windows] test_httpservers hangs since Python 3.5 Feb 27, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    OS-windows tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants