Skip to content

uasyncio: KeyError in remove_reader (MacOS) #248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
belyalov opened this issue Jan 5, 2018 · 17 comments
Open

uasyncio: KeyError in remove_reader (MacOS) #248

belyalov opened this issue Jan 5, 2018 · 17 comments

Comments

@belyalov
Copy link
Contributor

belyalov commented Jan 5, 2018

uasyncio version 1.2.4
I'm using it in a pretty common way:

        loop = asyncio.get_event_loop()
        print("* Starting Web Server at {}:{}".format(host, port))
        loop.create_task(asyncio.start_server(self._handler, host, port, backlog=backlog))
        if loop_forever:
            loop.run_forever()
            loop.close()

Sometimes (probably when connection is closed from client side) I see exception:

Traceback (most recent call last):
  File "./esp_restapi.py", line 53, in <module>
  File "./esp_restapi.py", line 49, in run
  File "/Users/belyalov/.micropython/lib/tinyweb/server.py", line 320, in run
  File "/Users/belyalov/.micropython/lib/uasyncio/core.py", line 136, in run_forever
  File "/Users/belyalov/.micropython/lib/uasyncio/core.py", line 113, in run_forever
  File "/Users/belyalov/.micropython/lib/uasyncio/__init__.py", line 39, in remove_reader
KeyError: 4339811008
@pfalcon
Copy link
Contributor

pfalcon commented Jan 5, 2018

The backtrace you post don't seem to correlate with the code snippet you posted. You would need to review what callers of uasyncio do which causes such behavior. Alternatively, feel free to provide simple and self-contained example which reproduces this problem (directly against uasyncio or against other uasyncio-based reference software, like picoweb).

@belyalov
Copy link
Contributor Author

belyalov commented Jan 6, 2018

Provided snippet is just for example of usage. However, original code is close to snippet.
Ok, I'll try to reproduce the problem in a pretty small program and let you know.

@belyalov
Copy link
Contributor Author

belyalov commented Jan 6, 2018

So I've got stable repro of this problem.
Here is pretty simple example:

#!/usr/bin/env micropython

import uasyncio as asyncio


def _handler(reader, writer):
    print('New connection')
    line = yield from reader.readline()
    print(line)
    yield from writer.awrite('Gotcha!')
    yield from writer.aclose()


def run(host="127.0.0.1", port=8081, loop_forever=True, backlog=16):
    loop = asyncio.get_event_loop()
    print("* Starting Server at {}:{}".format(host, port))
    loop.create_task(asyncio.start_server(_handler, host, port, backlog=backlog))
    if loop_forever:
        loop.run_forever()
        loop.close()


if __name__ == '__main__':
    run()

This program is simply listen for TCP connection and then reads line.
So to reproduce the problem:

  1. Run attached snippet
  2. Open TCP connection e.g. by nc to localhost port 8081 and then close it without sending data (Ctrl+C)
$ nc -4 localhost 8081
^C
  1. Check output of snippet (something like that):
./z.py
* Starting Server at 127.0.0.1:8081
New connection
Traceback (most recent call last):
  File "./z.py", line 24, in <module>
  File "./z.py", line 19, in run
  File "/Users/belyalov/.micropython/lib/uasyncio/core.py", line 136, in run_forever
  File "/Users/belyalov/.micropython/lib/uasyncio/core.py", line 113, in run_forever
  File "/Users/belyalov/.micropython/lib/uasyncio/__init__.py", line 39, in remove_reader
KeyError: 4399426432

uasyncio is from commit d19253a which is last working version before breaking changes mentioned in #244

@belyalov
Copy link
Contributor Author

belyalov commented Jan 6, 2018

Forgot to mention, I'm using micropython for Mac (unix?).

@belyalov
Copy link
Contributor Author

belyalov commented Jan 6, 2018

More investigations:
This is because reader actually gets deleted twice:

  • When socket error happened:
    if ev & (select.POLLHUP | select.POLLERR):
    # These events are returned even if not requested, and
    # are sticky, i.e. will be returned again and again.
    # If the caller doesn't do proper error handling and
    # unregister this sock, we'll busy-loop on it, so we
    # as well can unregister it now "just in case".
    self.remove_reader(sock)
  • And one more time when IOReadDone:
    elif isinstance(ret, IOReadDone):
    self.remove_reader(arg)

So fix could be pretty simple like:

if id(sock) not in self.objmap:
    return

@pfalcon
Copy link
Contributor

pfalcon commented Jan 6, 2018

Thanks for preparing a testcase. But I can't reproduce the behavior you described, instead I get after pressing ^C on running nc:

$ micropython test-remove_reader-crash.py 
* Starting Server at 127.0.0.1:8081
New connection
b''

The script works pretty as expected. There definitely should be b'' printed, because breaking nc closes its side of the connection, and "peer closed connection" condition on a socket is recognized as an EOF (i.e. empty string).

So, the first question you'd need to answer is why you don't get an EOF.

What's a bit worrying is that writing to closed socket should lead to an error, and it doesn't happen. But "should lead" actually applies only to blocking socket and under some additional conditions. It's all more complicated with non-blocking sockets. Well, perhaps that matter worth being investigated (not directly related to this ticket).

@pfalcon
Copy link
Contributor

pfalcon commented Jan 6, 2018

This is because reader actually gets deleted twice:
When socket error happened:
And one more time when IOReadDone:

Yeah, good reasoning. Now you just need to explain how IOReadDone gets yielded. Nothing in the code you posted directly does that (e.g. writer.aclose() yields IOWriteDone). readline() may yield IOReadDone, but then why you don't get b''?

So, please keep investigating. You may need to go as deep as printing the values poll() returns on your system. I'm using Linux and the latest uasyncio for reference (and I recommend using the latter too).

@pfalcon
Copy link
Contributor

pfalcon commented Jan 6, 2018

You may need to go as deep as printing the values poll() returns on your system.

If you do that, please don't forget to post patch with print()s, I'll need to compare it with the values I get on Linux.

@pfalcon pfalcon changed the title uasyncio: KeyError in remove_reader uasyncio: KeyError in remove_reader (MacOS) Jan 6, 2018
@belyalov
Copy link
Contributor Author

belyalov commented Jan 7, 2018

how IOReadDone gets yielded

Whenever you yield reader.aclose()
The different question is should you call it? If yes - how to know if reader is still alive?

P.S. Test case is slightly wrong - looks like I've attached first version of it and forgot to update. So to get it reproduced simply replace writer.close() with reader.close()

@shawwwn
Copy link

shawwwn commented Sep 26, 2018

Hello,
I can confirm this problem does exist.
The culprit is in the following:

def read(self, n=-1):
while True:
yield IORead(self.polls)
res = self.ios.read(n)
if res is not None:
break
# This should not happen for real sockets, but can easily
# happen for stream wrappers (ssl, websockets, etc.)
#log.warn("Empty read")
if not res:
yield IOReadDone(self.polls)
return res

If, for some reasons, a call to ios.read() doesn't return anything(this happens to me on a low memory situation), then you effectively have two calls to IOReadDone() -- one in .read(), the other in .aclose(). The second call will cause a key fault.

An obvious fix is already in @belyalov's reply.

@pfalcon
Copy link
Contributor

pfalcon commented Sep 26, 2018

a call to ios.read() doesn't return anything

In Python, a function always returns something.

then you effectively have two calls to IOReadDone() -- one in .read(), the other in .aclose().

Ok, that seems plausible.

An obvious fix is already in @belyalov's reply.

That's obviously not consistent with how it's implemented for remove_writer().

@shawwwn
Copy link

shawwwn commented Sep 26, 2018

In Python, a function always returns something.

Did you understand my post or did you have to go word-to-word?

I said this happened in "a low memory situation." Socket read returns None because system is out-of-memory. Under such condition, there will be two calls to IOReadDone() in which the latter will cause a key fault.
@belyalov's solution is the quickest and it worked. If you happen to have your own implementation in mind, why don't you fix it?

Next time, maybe try to pay attention to the actual code instead of people's words.

@pfalcon
Copy link
Contributor

pfalcon commented Sep 26, 2018

Did you understand my post or did you have to go word-to-word?

There's such notion as "fail fast": https://en.wikipedia.org/wiki/Fail-fast . If you make mistake with such simple things as Python function returns, just imagine how many mistakes can be done with less obvious things?

Socket read returns None because system is out-of-memory. Under such condition, there will be two calls to IOReadDone() in which the latter will cause a key fault.

Still not a correct analysis, see the code above.

@belyalov's solution is the quickest and it worked.

As a maintainer of many (too many!) projects, I'm not interested in random quick-and-dirty "fixes", but in understanding the problem and finding the best solution, which is sometimes not obvious. I welcome anyone to apply the same approach and put the uasyncio code under scrutiny (but then the entire code, not just 2 random lines, it's a few screenfuls anyway). And if the description of a problem starts with "foo() doesn't return anything", please kindly let me be skeptical and call for more thoroughness on it.

@shawwwn
Copy link

shawwwn commented Sep 26, 2018

Please don't introduce irrelevant topic into this discussion. Thank you!

Conclusion:
The current uasyncio implementation is uncared of socket error and (judging from the conversation) the issue is not likely to get fixed.
So if you are reading this, please be prepared to write up your own select/poll mechanism when dealing with socket.

@belyalov
Copy link
Contributor Author

@shawwwn I'd suggest you not to expect any patches accepted for micropython-lib.
Here is good reading of what's going on.

Personally I decided not to spend any more time to talk to "maintainer" - simply because it is just time spending instead of problem solving.

Fork it (or use existing one), apply patch and use it like I did.

P.S. here is the fix: belyalov@68abd69

@pfalcon
Copy link
Contributor

pfalcon commented Sep 27, 2018

All fixes and further development happens in the upstream repo, https://github.com/pfalcon/micropython-lib . This issue will be fixed too, when someone prepares a proper patch for it.

@jonnor
Copy link

jonnor commented Aug 25, 2024

This remains the upstream library for the MicroPython project.

That said, is this still an issue on recent versions on MicroPython and asyncio versions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants