Skip to content

[RFC] Brainstorming generic async write support for advanced streams #3396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pfalcon opened this issue Oct 29, 2017 · 4 comments
Closed

[RFC] Brainstorming generic async write support for advanced streams #3396

pfalcon opened this issue Oct 29, 2017 · 4 comments

Comments

@pfalcon
Copy link
Contributor

pfalcon commented Oct 29, 2017

When asked why on Earth write operations in asyncio are usual functions instead of coroutines, Guido van Rossum answered something like "There're a big difference between read and write operations". Yeah, something like that. Let's compare for example read on a websock or SSL connection: to decode data, it needs a complete underlying record, and until it has it, there's no other choice as to buffer it internally and just return None (not ready) from .read().

But with .write(), following situation is normal: suppose .write(len(100)) was called, 64 bytes of that input was processed, and turned into 256 bytes of data which needs to be written to the underlying stream. Then 128 bytes of that was written to it before EAGAIN hit. So, what .write() should return? Apparently, no (single) return value can be useful, and there's no easy way for the calling code to continue the operation.

So, one idea is that .write() methods destined for async usage should not try to perform stream write op themselves (that requires a coroutine), but return following context:

[1] number of input bytes processed, pointer to buffer of output data, and its length.

One difference with sync write methods is that async apparently require much heavier buffering, or otherwise internal state machine gets very complex, or it's simply impossible to "chunk" input data for continuous operation. For example, for websock case, initial data to write is a small message header. But there's no way to output just that in [1] scheme, because no input bytes are consumed in this case, so next time client code would restart sending it again (which would lead to sending just message header, in an infinite loop). And note that for websock case, data after header might be sent as is, so artificially chunking it into multiple messages isn't ideal.

So, model [1], while already not exactly easy, might be not adequate/optimal enough.

[2] The ultimate solution would be to make async write methods to be coroutines, but we don't know how to do that for C methods, and devising fully generic scheme may take some effort.

So, some partial "coro-like" scheme may be a compromise, and effectively, [1] can be seen as a first iteration of that idea.

[3] More elaborated idea could be: async writer is multi-stage process, it starts with setting data to write, and then its another sub-operation is called, which repeatedly returns pairs of (ptr, len) chunks to be written. That's already pretty much a coro.

[4] Yet another idea is that ultimately this stuff needs to be written into real stream implementation, and writing single bytes/short records is inefficient, so maybe each async writer should be coupled with a buffer, it should be possible to query remaining space in a buffer, and either write next chunk of data, or request that given amount space should be available (to write fixed-size record in one go). This idea goes in the direction of CPython's asyncio implementation.

pfalcon added a commit to pfalcon/pycopy-lib that referenced this issue Nov 3, 2017
Q #1: Should this be in uasyncio package at all? Upstream doesn't have
this. Pro: will be easier for people do discover (see e.g.
micropython/micropython-lib#148)

Q #2: This provides implements 2 ways to create a WS connections:
1) using start_ws_server(); 2) using wrapping existing StreamReader
and StreamWriter. History: initial prototype of course used 2). But
the idea was "it should be like the official start_server()!!1". But
then I though how to integrate it e.g. with Picoweb, and became clear
that 2) is the most flixble way. So, 1) is intended to be removed.

Q #3: Uses native websocket module for read path, but has own
write path due to micropython/micropython#3396

Q #4: Requires micropython/micropython-lib#227
due to micropython/micropython#3394 .
@pfalcon
Copy link
Contributor Author

pfalcon commented Nov 3, 2017

E.g. adhoc, server-only impl for websockets: micropython/micropython-lib#228

@pfalcon
Copy link
Contributor Author

pfalcon commented Nov 28, 2017

Here's another facet of it: https://sans-io.readthedocs.io/ . And (the generic idea of the above) that's pretty much an immediate plan I'm looking for. For example, to implement DNS support, there would be dns_pkt module, which will write packets (using #3382) into a stream. For sync code, that could be a real socket, but for async, that will be BytesIO. Then content from BytesIO will be sent with the currently available async method(s).

@pfalcon
Copy link
Contributor Author

pfalcon commented Nov 28, 2017

Related slides:

pfalcon added a commit to pfalcon/pycopy-lib that referenced this issue Feb 1, 2018
During development, following questions were posed, and subsequently,
answered:

Q #1: Should this be in uasyncio package at all? Upstream doesn't have
this. Pro: will be easier for people do discover (see e.g.
micropython/micropython-lib#148)

A: uasyncio diverges more and more from asyncio, so if something is
convinient for uasyncio, there's no need to look back at asyncio.

Q #2: This provides implements 2 ways to create a WS connections:
1) using start_ws_server(); 2) using wrapping existing StreamReader
and StreamWriter. History: initial prototype of course used 2). But
the idea was "it should be like the official start_server()!!1". But
then I though how to integrate it e.g. with Picoweb, and became clear
that 2) is the most flixble way. So, 1) is intended to be removed.

A: 1) was removed and is not part of the merged version of the patch.

Q #3: Uses native websocket module for read path, but has own
write path due to micropython/micropython#3396

A: So far, so good.

Q #4: Requires micropython/micropython-lib#227
due to micropython/micropython#3394 .

A: The prerequisite was merged.
pfalcon added a commit to pfalcon/pycopy-lib that referenced this issue May 3, 2020
During development, following questions were posed, and subsequently,
answered:

Q #1: Should this be in uasyncio package at all? Upstream doesn't have
this. Pro: will be easier for people do discover (see e.g.
micropython/micropython-lib#148)

A: uasyncio diverges more and more from asyncio, so if something is
convinient for uasyncio, there's no need to look back at asyncio.

Q #2: This provides implements 2 ways to create a WS connections:
1) using start_ws_server(); 2) using wrapping existing StreamReader
and StreamWriter. History: initial prototype of course used 2). But
the idea was "it should be like the official start_server()!!1". But
then I though how to integrate it e.g. with Picoweb, and became clear
that 2) is the most flixble way. So, 1) is intended to be removed.

A: 1) was removed and is not part of the merged version of the patch.

Q #3: Uses native websocket module for read path, but has own
write path due to micropython/micropython#3396

A: So far, so good.

Q #4: Requires micropython/micropython-lib#227
due to micropython/micropython#3394 .

A: The prerequisite was merged.
pfalcon added a commit to pfalcon/pycopy-lib that referenced this issue May 3, 2020
During development, following questions were posed, and subsequently,
answered:

Q #1: Should this be in uasyncio package at all? Upstream doesn't have
this. Pro: will be easier for people do discover (see e.g.
micropython/micropython-lib#148)

A: uasyncio diverges more and more from asyncio, so if something is
convinient for uasyncio, there's no need to look back at asyncio.

Q #2: This provides implements 2 ways to create a WS connections:
1) using start_ws_server(); 2) using wrapping existing StreamReader
and StreamWriter. History: initial prototype of course used 2). But
the idea was "it should be like the official start_server()!!1". But
then I though how to integrate it e.g. with Picoweb, and became clear
that 2) is the most flixble way. So, 1) is intended to be removed.

A: 1) was removed and is not part of the merged version of the patch.

Q #3: Uses native websocket module for read path, but has own
write path due to micropython/micropython#3396

A: So far, so good.

Q #4: Requires micropython/micropython-lib#227
due to micropython/micropython#3394 .

A: The prerequisite was merged.
@dpgeorge
Copy link
Member

Closing due to inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants