-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
Towards consistent stream polling API across baremetal/unix ports #1550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
? https://github.com/micropython/micropython/blob/master/stmhal/moduselect.c#L250-L251 |
http://docs.micropython.org/en/latest/library/select.html#select.poll.poll
Nowhere it makes clear that it returns list of tuples. I'll take care of updating the docs. Thanks. |
Ok, docs are updated: 98b6d35 . Next issue: stmhal doesn't define symbolic constants for events. And then Linux e.g. has: http://lxr.free-electrons.com/source/include/uapi/asm-generic/poll.h#L5 . So, there's POLLPRI stuck in front of POLLOUT, making it have value of 4. At least, as that file says, values are governed by iBCS2, which is still "intel", but it being in asm-generic gives out good hope... Anyway, what to do about this? |
I'd say add |
Thanks, that's the solution I'd be looking for. Any concerns about handling of integer 2 value? We could check for it, and throw an error with instructions on upgrade. If there're no big concerns, I'd rather skip that, as it's carrying around unneeded luggage. |
@pfalcon regarding 98b6d35 : returning a tuple which is not exactly 2 elements in size might lead to compatibility issues with CPython. Eg code that does
I don't have big concerns, so we can skip providing an upgrade warning. What will happen instead is that polled objects will just never be available for "reading" so that should be a relatively easy error to track down. |
So, the idea behind that stuff is to be able to specify "callback data" in Poll.register() and get that data back in .poll() result. This is required for efficient processing of poll results. Yes, that's extension to CPython, but will let to standardize on simple clean interface Poll provides. The way CPython does it is via "selectors" module. I was ready to implement it for unix port (in micropython-lib), but I think it will be too much bloat for bare-metal. So, given all these reasons, I think it's acceptable if "obj, ev = result[0]" won't work and more specific code is required. Note that all this is tentative yet and will take time to be fully worked out, and there's possibility to handle the case discussed in compatible way (by returning 2-tuple in case non-standard callback data wasn't passed). But I'm actually thinking about optimizations which will put even more constraints on what can be done with result of .poll() - like mentioned above having a singleton return structure and refilling it on each return, instead of allocating new each time. |
I don't have any experience using selectors so would need to see and understand a use case of returning "callback data" before I could form a better opinion on this. I'd really rather not break CPython compatibility like this if we really don't need to. (Although one could look through CPython lib to see if this would indeed break things. It might not.)
Could always implement |
That's one of pretty big advantages of Linux epoll vs standard POSIX poll - almost always you need more state to handle particular connection than just fd. epoll allows to just set arbitrary data structure, thread pointer, coroutine pointer, and get it back directly (and know all this stuff is handled efficiently, in kernel). With poll, you need to map returned fd into your data/control structure yourself. This feature is used by uasyncio for example - it associates a coroutine with waited fd, to resume it directly. |
Yes, but why? The implementation I described would still be compliant with description of .poll() - indeed, it doesn't prohibit such behavior. |
And well, to pronounce it explicitly (though I guessed it was obvious), the drive of this work is to run uasyncio on baremetal ports. |
Ok, with bunch of thinking, and practical hurdles like #583 (comment) , I have following plan for unix implementation of Poll:
|
True, it does return a list. But the tuples in the list would need to be reallocated each time because a tuple is read only you can expect to keep a reference to it and for it not to change. And if there is ever multithreading support in uPy, it'd need a separate poll buffer for each thread. Regarding epoll: it seems that the Python interface to this does not support registering additional data along with the fd.
This is how stmhal (and cc3200?) work, since there is not really any concept of a file descriptor. If you can use non-fd objects then that provides a way to attach data to the polling event: just pass a tuple |
Keeping a reference to tuple returned from poll.poll() doesn't make sense.
Well, all buffers belong to particular poll object instance.
But C does, and micropython-lib supports that, and usasyncio depends on it.
Sure, and I'm talking about integrating of both native POSIX fd's and user-space i/o objects.
Well, tuple definitely doesn't know how to be waited on. And even tuple is pretty expensive object - I'm currently doing optimizations on uasyncio to reduce memory allocation for bookkeeping purposes like argument storage. (You should like this approach of avoiding memory allocation, right? ;-) ) |
Other thoughts of optimizing poll implementation (with uasyncio as client, as usual):
I also consider an optimization, where poll has special knowledge about socket objects, and each socket object stores its last used index in a poll map (which is just an array for POSIX poll()). |
@dpgeorge : Would be nice to hear your opinion on the matters above, if you have interest/time, otherwise my plan is to add optional arg to .register(), and to positional optional args to poll.poll() to cover usecases above. |
I changed my mind and already coded .regmod(), then went to read cpython docs about details of .register() behavior to make it compliant. D'oh:
So, poll.register() already has the desired behavior. Well, depends on the interpretation of "the same effect as registering the descriptor exactly once". This effect, what it is - ignoring any .register() request for existing fd, or just making the last call have effect? Obviously, only the latter behavior is useful. Another CPython miracle is that 4 different implementation of poll objects have 4 different ways how .register() works for each implementation, namely: explicitly undefined, implicitly undefined, allowed (with ambiguity above), raising exception. |
@pfalcon sorry got lost, will think about it and reply. |
I merged new implementation of .register() in 082b121. I tested that following testcase work as expected in both CPython and MicroPython (i.e. last .register() call has effect):
|
And that was last minimal change required to switch uasyncio from FFI-based epoll implementation to uselect.poll . And I did correctness testing to make sure it all works as expected (basic one of course, there should be more advanced tests), and then performance tests. So, for short requests, even with more or less high concurrency (10000 of 100 concurrency), poll() is faster. However, for large requests (>6MB) with moderate concurrency (30), on a overloaded system, poll() is 2 times slower. I rebooted my box to get a clean environment, but even with no user apps running (only X and daemons), I get 100% variation of poll() running times. Switching to epoll(), I get both less variable results, and, taking minimal poll() time I saw, epoll() is still 25% faster. This pretty well matches what I read about poll() vs epoll(), though I didn't think it would be so visible (and I still don't believe it). Anyway, it's like that - switching to portable poll() means losing good deal of large-request performance :-(. I also implemented poll.poll() one-shot mode as described above, but I wasn't able to quantify its result on the tests above - due to high variability and low enough concurrency (30, i.e. poll array size to scan is merely 30). Its effect would be more visible with concurrency of 1000. Anyway, it's obvious CPU cycle-saver, so let's have it. |
I grep'd through the stdlib and site-packages that I have on my machine for use of (e)poll. It's difficult to find (maybe because it's not used often? I find only 12 non-test scripts that import select). The two cases I found use poll in the same way:
and:
Such usage would break if poll() returned a list of tuples that had 3 elements (fd, ev, user_data). At least the breakage will raise an exception and can be fixed manually, instead of failing silently.
I would just use "unused" POLL bits in the constants provided at the Python level. At the C level it can be translated to a separate flag so not to clash with the C bits.
But does that mean that poll.poll(oneshot=True) applies to all fds that have been registered? Or just the ones that fire for this particular call? Ie, does it disable everything after firing, or just those that fire and are returned? I guess the latter is the most sensible thing to do. A shame about the lack of performance of poll vs epoll. Given all this (CPy compatibility, performance) why not create our own portable poll class (ppoll?) that has the semantics we want and works efficiently on all machines? Behind the scenes on Linux it can use epoll if available. Other platforms can fallback to poll (or something else). The API can be close to poll/epoll so that it requires minimal effort to provide a CPy poll class in terms of "ppoll". |
Yes, I guess so. Someone who needs to do something quick, just uses select(). That's why I'm not too concerned to add various optimizations which assume various restrictions on usage.
Ok, then we can pass additional flags to .poll() for extended usage, as I suggested above:
But there's no place to store them on C level.
Of course, one-shot works as: fd is "charged", then if event is triggered on it, it's "discharged", and no longer triggers until explicitly charged again. The behavior is tied to fds which trigger events, and don't affect other fds. So, the whole semantics can be handled purely on .poll() level, not on .register() (though that makes it incompatible with epoll()).
Because it's duplication of functionality and code. Why, if we need only fine adjustments to already existing behavior of Poll?
That's again duplication of code. After "revelation" regarding dynamic loading and FFI usage, I gave a good thought what I want to achieve with all this. And that's being able to use MicroPython everywhere (well, run software on any MicroPython install). And way to achieve that is to work on the most performant of portable APIs (and that's poll()), and work on consistent support for baremetal. Supporting Linux-specific solution just complicates the task. epoll() implementation stays available in micropython-lib, and someone interested in it (maybe me later), perhaps later can reintegrate it into uasyncio. But current aim is to make it usable in real-world, not just on development machines. |
Ok, I switched uasyncio to use uselect.poll(). Next TODO for unix port: accept arbitrary stream objects, not just fd's. But that requires implementing "callback data" support (if we pass in object, we want to get it back, not just its fd, right?) But that's still in backlog while I'm trying to get a sample webapp running on a static unix port, and there're bunch of related issue to solve first for that... |
Can't believe this still wasn't done :-(. I'm adding a stopgap implementation which can automagically (i.e. even without calling .fileno() method) extract fd from file and socket object. This will still return a raw fd from .poll() method, but still useful for number of usecases (like just adding timeout on a socket). |
Ok, time to have a final pass/decision on how to make (or not make) baremetal and unix The main client for these changes is, as before, uasyncio. And uasyncio doesn't care about about getting "file descriptor" or "socket object" from poll.poll() call. What it needs is "user data" (in the discussion above called "callback data") as associated with a file-like object. As neither unix nor baremetal uselect provides such functionality, uasyncio does such mapping itself, using a dictionary. For unix port, this dictionary has also another purpose: to keep references to file-like objects, to preclude them from being GCed while OS' poll works on them. Now, the state of unix uselect module is that it can accept a file/socket in call to poll.register(), just like baremetal, but unlike it, as a result of poll.poll() call, will return an underlying file descriptor, not the original object. With unix port, a file/socket object can also be GCed while it's registered in poll object (because it's actually underlying fd is what's registered). Suppose we want to make unix implementation be compliant with baremetal one, and return an original object passed to poll.register(). That will take storing this object in the internal internal array of poll object, in addition to already stored array of But all in all, if we store a file-like object in poll internal structures, it's just one step ahead to store additional information along with it. This is true for both unix and baremetal uselect's. Now, what we can do, specifically:
The neutral point is apparently p.3, and it also lies on the criticial path to p.4. But @dpgeorge, I wonder if you're ready for p.4, after a year in discussion. |
Implemented in 093a8f5.
Before I go and patch all files and sockets to implement MP_UNARY_OP_HASH, please confirm that you don't want to got for solution 4. It would be much better solution. For extmod/moduselect, We can stop caching ->ioctl and use it for user data, while staying within 16 bytes. One caveat is that it's useful to have 2 words of user data, e.g. a callbacks and arguments (single one or tuple of many). But we can scope-creep it step by step ;-). |
@dpgeorge: Ping ^^^ |
Point 4 is good (it's efficient) but the question for me is about the user-facing API. I don't want to have poll.poll() return a 3-tuple and breaking compatibility with CPython. The "selectors" module is really what we need here so why not just implement it? It can be done efficiently (in C) using pretty much all the existing code in extmod/moduselect. Just make existing poll_register and poll_modify take an extra optional "data" arg, and set up some aliases DefaultSelector=PollSelector=select.poll. If that's too much, then just do the modifications to add "data" args to register and modify, and have poll.poll() return a 2-tuple but where the first arg is a tuple of the file-like-object and the user data. |
My question is mostly about desire to store additional information in poll objects. API-wise, we can do .poll_into(), or maybe even .poll_iter() which would return an iterator.
That was my original idea, and how it evolves after that is I'm trying to add selectors-like functionality to existing select, to not bloat with adding a separate module (especially that for efficiency, we need another "into"-style API anyway). |
Ok, I'd like to move further with this. The key point is whether we can constrain number of returned events for different underlying polling implementations to 1. Answer: we can for baremetal poll (we can do anything), we can for Linux epoll (epoll_wait takes maxevents param). But we can't for standard POSIX poll(). So, then: Have an poll.ipoll() return an iterator (allocate-once apparently). Then its So, the use would be:
Btw, using internal pre-allocated list and sequence unpacking may make a more natural Python syntactic pattern after all, then external pre-allocated list, passed as an argument. Not as explicit about the fact that the list may change between the invocations, though. |
Sounds ok, hopefully it indeed doesn't allocate RAM in practice.
If code is going to be written like that then we can't change the number of elements returned by ipoll's next. So should make a decision now that |
How we can't, of course we can. I don't see any issue even with adding extra param to .poll() - we have bigger differences anyway. And of course we can state what's implied explicitly in the docs - "this function is provisional for next 5 (or 10) years". But the caveat is that there's no docs for .ipoll(), and before they can be written, docs for other things having existed for some time, need to be written first. Btw, with the plan like above I keep thinking whether returning a "caller-owned" tuple is better after all (than a list). With a tuple there's more or less big issue - that it would be changing tuple, but at least it's the only one. And it would be hard to abuse it for bad. So, we just document it as a special kind of tuple, voila, problem solved. With a list, there're lot of "smaller", but more severe issues, like users being able to change size of this list, etc. So, a reliable solution will be inventing a special type of list whose size can't be changed and some other operations are disabled. Well, it's better to invent a special kind of tuple, whose values can change by external force. At least that requires 0 lines of code. |
But then we break our own code as well as other people's... not a fun situation. The other way of doing it is
I think at this stage the best thing to do is make a proof of concept patch based on the discussion so far, then evaluate the code. |
Initial ipoll() impl posted: #2841 |
I'm curious about this:
Has there been any progress on this front? I naively tried to set up a dictionary of |
Per https://github.com/micropython/micropython/wiki/ContributorGuidelines#django-criteria , that's the correct approach. Anything you can do, please do the way you can. The focus of adding new features is features which can't, or very cumbersome, to do otherwise. |
Careful inspection shows that we still have consistency issues:
So, we need to decide what to do about unregister(). Either we:
It's worth thinking how to do that flag right, as it's probably not the last case when it's useful to select between raising exceptions and not. I believe we already had a discussion that usage of not negated words is preferred, e.g. And still, "nothrow" is a term on its own, e.g. https://en.cppreference.com/w/cpp/memory/new/nothrow (damn, they now have noexcept too: https://en.cppreference.com/w/cpp/language/noexcept_spec). But if this flag will ever be a keyword param (not expected to happen), just be able to reuse "throw" qstr calls for it to be named "throw", not "nothrow". |
Step 2: Move The Docs
This was fixed in b9bad7f
This is tracked by a TODO in the code. I'd say this issue can be closed. The |
What we have now is:
Proposed:
The text was updated successfully, but these errors were encountered: