Skip to content

Switch unix port to common select module, and add POSIX optimisations to that module #12138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 7, 2023

Conversation

dpgeorge
Copy link
Member

@dpgeorge dpgeorge commented Aug 1, 2023

The unix port has a custom select module which only works with objects that have a file descriptor, eg files and sockets. On the other hand, bare metal ports use the common extmod/modselect.c implementation of the select module that supports polling of arbitrary objects, as long as those objects provide a MP_STREAM_POLL in their ioctl implementation (which can be done in Python).

This PR removes the unix-specific select module implementation and makes unix use the common one. And then it also adds an optimisation so that the system poll function is used when polling objects that do have a file descriptor. If you register both file-descriptor-based objects, and non-file-descriptor-based objects with select.poll() then the following occurs:

  • the system poll is called for all file-descriptor-based objects with a timeout of 1ms
  • then the bare-metal polling implementation is used for remaining objects, which calls into their ioctl method (which can be in C or Python)

In the case where all objects have file descriptors, the system poll is called with the full timeout requested by the caller. That makes it as efficient as possible in the case everything has a file descriptor.

Benefits of this approach:

  • all ports use the same select module implementation
  • the unix port now supports polling of all objects and matches bare metal implementations
  • still efficient for existing cases where only files and sockets are polled (on unix)
  • bare metal implementation does not change
  • polling of SSL objects will now work on unix by calling in to the ioctl method on SSL objects; this is required for extmod/asyncio: Add ssl support with SSLContext. #11897

An additional follow-up optimisation would be to add a special case to handle polling of SSL sockets so it can poll the underlying file descriptor of the socket. This requires some tricks to swap read/write polling depending on the state of the SSL transport.

@dpgeorge dpgeorge added extmod Relates to extmod/ directory in source port-unix labels Aug 1, 2023
@github-actions
Copy link

github-actions bot commented Aug 1, 2023

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:  +400 +0.050% standard[incl -32(data)]
      stm32:   -48 -0.012% PYBV10
     mimxrt:   -48 -0.013% TEENSY40
        rp2:   -72 -0.022% PICO
       samd:   -72 -0.028% ADAFRUIT_ITSYBITSY_M4_EXPRESS

@andrewleech
Copy link
Contributor

I don't have anything particularly constructive to add from a review other than it looks really good thanks! This will be great for the unix port going forward.

Is this something that would also help with integrating #6125 into all ports in the future? Or does an event system sit alongside select?

@dpgeorge
Copy link
Member Author

dpgeorge commented Aug 1, 2023

There are some very minor differences than the unix port will see with this PR:

  • poll.register() now returns None (which is CPython compatible) instead of True/False (which was a MicroPython extension)
  • select.poll() no longer accepts an optional alloc argument (again, this was a MicroPython extension, mainly for old uasyncio)
  • poll.poll() now returns [] instead of () when there are no objects ready (on should prefer using ipoll() anyway for efficiency)

@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Merging #12138 (218242d) into master (b714f41) will increase coverage by 0.00%.
The diff coverage is 99.08%.

@@           Coverage Diff            @@
##           master   #12138    +/-   ##
========================================
  Coverage   98.38%   98.38%            
========================================
  Files         157      158     +1     
  Lines       20714    20892   +178     
========================================
+ Hits        20379    20555   +176     
- Misses        335      337     +2     
Files Changed Coverage Δ
extmod/vfs_posix_file.c 91.95% <ø> (ø)
extmod/modselect.c 98.87% <99.08%> (ø)

Copy link
Member

@jimmo jimmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a great improvement! Very helpful for running aioble on unix.

py/stream.h Outdated
@@ -44,12 +44,22 @@
#define MP_STREAM_SET_DATA_OPTS (9) // Set data/message options
#define MP_STREAM_GET_FILENO (10) // Get fileno of underlying file

#if MICROPY_PY_SELECT_POSIX_OPTIMISATIONS
// With this optimization enabled, use system-provided poll ioctl values
#include <poll.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to ever have port-specific constant values that can "leak" to Python, for the same reason as confusion around errno values. Instead we should map our values on the way in/out of the relevant modules.

Maybe not the best example, but e.g. https://github.com/micropython/micropython/blob/master/examples/bluetooth/ble_uart_repl.py#L15

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Now fixed. The implementation simply requires the system constant to match the MP ones.

} else {
((poll_obj_t *)MP_OBJ_TO_PTR(elem->value))->flags = flags;
poll_obj_set_flags(poll_obj, flags);
}
}
}
}

// poll each object in the map
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you know (from knowing about this PR) what's going on here, the concept of a "single round" might not be that obvious. The commit description doesn't explain it either.

I'd probably suggest renaming poll_group_poll_all to poll_group_poll_until_ready_or_timeout and then perhaps this function should be poll_group_poll_once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed as suggested.

typedef struct _poll_obj_t {
mp_obj_t obj;
mp_uint_t (*ioctl)(mp_obj_t obj, mp_uint_t request, uintptr_t arg, int *errcode);
#if MICROPY_PY_SELECT_POSIX_OPTIMISATIONS
struct pollfd *pollfd;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment to explain that pollfd is mutually exclusive with flags and flags_ret. (I think that's what you're getting at with the nonsys_ prefix, but it would be clearer to just spell it out and remove the prefix... although I guess the prefix stops you using the non-posix-opt code accidentally?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think it would be good to be explicit that the pollfd member here points to the element in poll_group_t::pollfds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments added, and nonsys_ prefix changed to nonfd_.

Also, flags renamed to events and flags_ret renamed to revents to make it more consistent and clearer (a flags argument is used elsewhere for completely unrelated things).

MP_THREAD_GIL_EXIT();

// Compute the timeout.
int t = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 1 is potentially an important tuning parameter and should be a defined constant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been made a macro (private to this file, but in the future could be made public and configurable).

// Compute the timeout.
int t = 1;
if (poll_group_all_are_fds(poll_group)) {
if (timeout == (mp_uint_t)-1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// All our pollables are file descriptors, so we can use blocking poll and let it handle the timeout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

return n_ready;
}

mp_handle_pending(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest that we should also do an explicit sleep here, and also why not use MICROPY_EVENT_POLL_HOOK. But that's actually what MICROPY_EVENT_POLL_HOOK does, so perhaps we should use it.

Perhaps a smarter option would be to only do the sleep if none of the pollables are file descriptors (i.e. we didn't get the sleep from poll).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added telling why MICROPY_EVENT_POLL_HOOK is not used.

} poll_obj_t;

STATIC void poll_map_add(mp_map_t *poll_map, const mp_obj_t *obj, mp_uint_t obj_len, mp_uint_t flags, bool or_flags) {
// A group of pollable objects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I was confused by what a "poll group" was, I think I expected maybe you'd group-by the poll-style (i.e. fd or no-fd). But there's never more than one group, rather it's a convenient way of bundling the map with the list of pollfds (and metadata for that list).

(I think "list" or "set" is a better term than "group", because "group" implies they're grouped by something).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed poll_group_t to poll_set_t.


if (n_ready == -1) {
int err = errno;
if (err != EINTR) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this also be EAGAIN ?

Also, the old code used MP_HAL_RETRY_SYSCALL ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added about PEP 475 and MP_HAL_RETRY_SYSCALL.

@dpgeorge dpgeorge force-pushed the unix-use-extmod-select branch 4 times, most recently from bb7e436 to 7e0bb18 Compare August 3, 2023 06:31
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge force-pushed the unix-use-extmod-select branch 3 times, most recently from fd54dde to 0c009d4 Compare August 6, 2023 14:05
@dpgeorge
Copy link
Member Author

dpgeorge commented Aug 6, 2023

Enough tests have now been added so the coverage CI now passes.

@dpgeorge dpgeorge force-pushed the unix-use-extmod-select branch from 0c009d4 to a6109f3 Compare August 7, 2023 01:45
The unix port has a custom select module which only works with objects that
have a file descriptor, eg files and sockets.  On the other hand, bare
metal ports use the common extmod/modselect.c implementation of the select
module that supports polling of arbitrary objects, as long as those objects
provide a MP_STREAM_POLL in their ioctl implementation (which can be done
in C or Python).

This commit removes the unix-specific code and makes unix use the common
one provided by extmod/modselect.c instead.  All objects with file
descriptors implement MP_STREAM_POLL so they continue to work.

Signed-off-by: Damien George <damien@micropython.org>
To make it easier to extend and modify this polling implementation.

Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
A previous commit removed the unix-specific select module implementation
and made unix use the common one.

This commit adds an optimisation so that the system poll function is used
when polling objects that have a file descriptor.  With this optimisation
enabled, if code registers both file-descriptor-based objects, and non-
file-descriptor-based objects with select.poll() then the following occurs:

- the system poll is called for all file-descriptor-based objects with a
  timeout of 1ms

- then the bare-metal polling implementation is used for remaining objects,
  which calls into their ioctl method (which can be in C or Python)

In the case where all objects have file descriptors, the system poll is
called with the full timeout requested by the caller.  That makes it as
efficient as possible in the case everything has a file descriptor.

Benefits of this approach:

- all ports use the same select module implementation

- the unix port now supports polling of all objects and matches bare metal
  implementations

- it's still efficient for existing cases where only files and sockets are
  polled (on unix)

- the bare metal implementation does not change

- polling of SSL objects will now work on unix by calling in to the ioctl
  method on SSL objects (this is required for asyncio ssl support)

Note that extmod/vfs_posix_file.c has poll disable when the optimisation is
enabled, because the code is not reachable when the optimisation is used.

Signed-off-by: Damien George <damien@micropython.org>
@jimmo
Copy link
Member

jimmo commented Aug 7, 2023

Looks good! Nice work with the coverage tests.

The signature of this method was poller.poll(timeout=-1, flags=0, /) but
the flags argument was not documented and is not CPython compatible.  So
it's removed in this commit.

(The optional flags remains for the ipoll() method, which is documented.)

Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Some targets (eg PYBV10) have the socket module but are unable to create
UDP sockets without a registered NIC.  So skip UDP tests on these targets.

Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge force-pushed the unix-use-extmod-select branch from a6109f3 to 218242d Compare August 7, 2023 02:42
@dpgeorge
Copy link
Member Author

dpgeorge commented Aug 7, 2023

New tests have been tested on bare-metal (stm32) and adjusted so they skip or pass there as well.

All comments above are now addressed, it was purely renaming of variables/functions and adding comments.

@dpgeorge dpgeorge merged commit 218242d into micropython:master Aug 7, 2023
@dpgeorge dpgeorge deleted the unix-use-extmod-select branch August 7, 2023 03:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extmod Relates to extmod/ directory in source port-unix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants