M:N Threads, now w/ macOS support (kqueue) #9178
```diff
@@ -3081,7 +3094,7 @@ rb_thread_create_timer_thread(void)

        CLOSE_INVALIDATE_PAIR(timer_th.comm_fds);
#if HAVE_SYS_EPOLL_H && USE_MN_THREADS
```
I originally made this `#if (HAVE_SYS_EPOLL_H || HAVE_SYS_EVENT_H) && USE_MN_THREADS`, but several specs failed with `EBADF` errors. In the docs (https://man.openbsd.org/kqueue.2) it says "The queue is not inherited by a child created with [fork(2)](https://man.openbsd.org/fork.2). Similarly, kqueues cannot be passed across UNIX-domain sockets." Possibly that is the reason it's invalid at this point?
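For anyone who wants to see that behavior in isolation, here is a small standalone sketch of my own (not from the PR; macOS/BSD only) that registers an event on a kqueue in the parent and in a forked child. In the child the call typically fails with `EBADF`, since the queue is not inherited across `fork(2)`.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/wait.h>

int
main(void)
{
    int kq = kqueue();
    if (kq == -1) { perror("kqueue"); return 1; }

    struct kevent ke;
    pid_t pid = fork();

    if (pid == 0) {
        // Child: the kqueue is not inherited across fork(2), so this
        // registration is expected to fail (typically with EBADF).
        EV_SET(&ke, STDIN_FILENO, EVFILT_READ, EV_ADD, 0, 0, NULL);
        if (kevent(kq, &ke, 1, NULL, 0, NULL) == -1) {
            printf("child: kevent failed: %s\n", strerror(errno));
        }
        _exit(0);
    }

    // Parent: the same registration on the same descriptor works fine.
    EV_SET(&ke, STDIN_FILENO, EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent(kq, &ke, 1, NULL, 0, NULL) == -1) {
        perror("parent kevent");
    }
    else {
        printf("parent: registration ok\n");
    }

    waitpid(pid, NULL, 0);
    return 0;
}
```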
```diff
@@ -575,7 +585,12 @@ timer_thread_register_waiting(rb_thread_t *th, int fd, enum thread_sched_waiting
     }
     else {
         VM_ASSERT(fd >= 0);
+#if HAVE_SYS_EVENT_H
+        EV_SET(&ke[num_events], fd, EVFILT_READ, EV_ADD, 0, 0, (void *)th);
```
Depending on different usages, you often see `EV_ADD` included with `EV_CLEAR` or `EV_ONESHOT`. Since the MN code automatically handles unregistering events, these possibly are not necessary here.
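For reference, a small hypothetical helper (not part of this PR; the name `register_read` and its parameters are made up) showing the flag combinations being discussed:

```c
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

// Hypothetical helper illustrating the kevent flag combinations mentioned
// above; only the plain EV_ADD form matches what this PR registers.
int
register_read(int kq_fd, int fd, void *udata)
{
    struct kevent ke;

    // EV_ADD alone: level-triggered, stays registered until EV_DELETE.
    EV_SET(&ke, fd, EVFILT_READ, EV_ADD, 0, 0, udata);

    // Other combinations sometimes seen elsewhere:
    //   EV_ADD | EV_CLEAR   - edge-triggered; internal state resets after delivery.
    //   EV_ADD | EV_ONESHOT - delivered once, then removed automatically, which
    //                         is redundant if the caller unregisters the fd itself.

    return kevent(kq_fd, &ke, 1, NULL, 0, NULL);
}
```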
```c
}
break;

default:
    RUBY_DEBUG_LOG("%d event(s)", r);

#if HAVE_SYS_EVENT_H
```
This block of code is very similar to the epoll code, with small `kevent`-specific changes. Maybe they should be consolidated more?
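As one possible shape for that consolidation, here is a rough sketch (my own, not the PR's code; `register_fd_readable` and `q_fd` are hypothetical names, and `HAVE_SYS_EVENT_H` is assumed to come from the configure check) where the `#if` lives inside a single shared helper instead of at every call site:

```c
#if HAVE_SYS_EVENT_H
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#else
#include <sys/epoll.h>
#endif

// Hypothetical shared helper: callers never touch kqueue/epoll directly.
int
register_fd_readable(int q_fd, int fd, void *udata)
{
#if HAVE_SYS_EVENT_H
    struct kevent ke;
    EV_SET(&ke, fd, EVFILT_READ, EV_ADD, 0, 0, udata);
    return kevent(q_fd, &ke, 1, NULL, 0, NULL);
#else
    struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = udata } };
    return epoll_ctl(q_fd, EPOLL_CTL_ADD, fd, &ev);
#endif
}
```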
```c
th->sched.waiting_reason.flags = thread_sched_waiting_none;
th->sched.waiting_reason.data.fd = -1;
th->sched.waiting_reason.data.result = filter;
```
We could receive multiple filters back, so setting this over and over again and ending on the last value probably doesn't make sense.
It's actually fine for now. Because the flags get set back to `thread_sched_waiting_none`, any subsequent event in the loop pointing at this thread will act like a no-op and hit the `// already released` `else` clause. It is not granular about which event fired, just that the thread can be active again.
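To make that concrete, here is a tiny self-contained simulation of the pattern being described (stubbed types, not the actual CRuby structures): only the first event for a waiter does anything, while later events for the same waiter fall into the "already released" branch.

```c
#include <stdio.h>

// Stub stand-ins for the real thread_sched_waiting flags and waiter struct.
enum waiting_flag { WAITING_NONE = 0, WAITING_IO_READ = 1, WAITING_IO_WRITE = 2 };

struct waiter {
    int flags;
    int fd;
    int result;
};

void
handle_event(struct waiter *w, int filter)
{
    if (w->flags != WAITING_NONE) {
        // First event for this waiter: record the filter and release it.
        w->flags = WAITING_NONE;
        w->fd = -1;
        w->result = filter;
        printf("released (filter=%d)\n", filter);
    }
    else {
        // Already released by an earlier event in this batch: no-op.
        printf("already released, ignoring filter=%d\n", filter);
    }
}

int
main(void)
{
    struct waiter w = { WAITING_IO_READ | WAITING_IO_WRITE, 7, 0 };

    handle_event(&w, -1);  // e.g. EVFILT_READ fired
    handle_event(&w, -2);  // e.g. EVFILT_WRITE fired for the same waiter: no-op
    return 0;
}
```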
Got some Mac test failures I'll look into

Do you mind me asking, if two threads wait for the same IO to become readable (or writable) or both, how is that handled?
thread_pthread_mn.c (Outdated)
```diff
  // Linux 2.6.9 or later is needed to pass NULL as data.
- if (epoll_ctl(timer_th.epoll_fd, EPOLL_CTL_DEL, fd, NULL) == -1) {
+ if (epoll_ctl(timer_th.kq_fd, EPOLL_CTL_DEL, fd, NULL) == -1) {
```
Is it wrong to use `epoll_ctl` with `kq_fd`?
I was somewhat worried this may bring confusion 😕

I renamed the existing `epoll_fd` to `kq_fd`. Since both epoll and kqueue are kernel event queues, it felt reasonable as a name. But because it sounds so much like `kqueue`, maybe it's confusing. Since they're both `int` descriptors, it allowed me to avoid some `#if` checks by having a shared name. So I could:

- Change the name to something less overloaded sounding? `event_fd`?
- Keep both and just have one or two additional checks.
- Leave it?
I changed it to `event_fd`
Definitely the type of question I want, thanks! Also, I'm not sure I have an exact answer for it but I'll share some thoughts.
I'm not sure if it'd be expected to handle this scenario in this code or not, but I'm open to it if there is an appropriate solution or behavior for MN threads. @ioquatix using the fiber scheduler with io-event on Mac (i.e. kqueue), what would be the behavior if multiple fibers tried to wait for the same IO to be readable or writable?

I am curious to see the behavior - so I will give it a try
Hmmm, I do see now that epoll can return `EEXIST`, which means another thread has already registered a fd, and it has special handling in the current code - might need additional handling here
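A hedged sketch of what that extra handling could look like on the epoll side (my own illustration, not the PR's code; `register_or_update` is a hypothetical name): if `EPOLL_CTL_ADD` fails with `EEXIST` because another thread already registered the fd, retry with `EPOLL_CTL_MOD`.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>

// Hypothetical helper: add the fd, or update the existing registration if
// some other thread got there first (EPOLL_CTL_ADD fails with EEXIST).
int
register_or_update(int epoll_fd, int fd, uint32_t events, void *udata)
{
    struct epoll_event ev = { .events = events, .data = { .ptr = udata } };

    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev) == 0) return 0;
    if (errno != EEXIST) return -1;

    // Already registered: replace the interest set for that fd instead of failing.
    return epoll_ctl(epoll_fd, EPOLL_CTL_MOD, fd, &ev);
}
```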
I think it would be better to split out the implementations into different files, i.e. I'm against the complexity of the code when 2+ different OS implementations are in the same file. In addition, it feels like we are solving all the problems that are already solved by

Totally open to changing the preprocessor approach and making more split-out functions. The implementation is very small, so having a dedicated file seems like overkill. Considering how flat the CRuby codebase is, I did not want to add an extra file without some @ko1 feedback on the approach. But I could at least get rid of all the preprocessor chunks if requested.

As a production user of

The existing fiber scheduler hooks are a well-defined interface and it'd be great to utilize them. I would love to see these two approaches come together. But it's also a large, fundamental change. I haven't seen activity or discussion around bringing the two together, and I wanted to see support for macOS for M:N threads. That's why I've gone with this current approach.

I'm not sure I can even come up with a valid use case of this, do you have a suggestion? When would it be correct for multiple threads to listen to the same IO? It'd still be good to be able to handle it, but it seems like that would be a bug to begin with?

It's fairly normal to have one thread reading and several threads writing to the same network socket, e.g. HTTP/2. Right now, it's absolutely possible to have several threads writing to the same IO.
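One kqueue-specific detail that matters for this question (a standalone macOS/BSD illustration I put together, not from the PR): kqueue identifies a registration by the (ident, filter) pair, so a second `EV_ADD` for the same fd and filter silently replaces the first registration's `udata` rather than adding a second waiter.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

// Two "waiters" register EVFILT_READ on the same fd; the second EV_ADD
// overwrites the first one's udata, so only one event is delivered and it
// carries the later waiter's pointer.
int
main(void)
{
    int fds[2];
    if (pipe(fds) == -1) { perror("pipe"); return 1; }

    int kq = kqueue();
    int waiter_a = 1, waiter_b = 2;
    struct kevent ke;

    EV_SET(&ke, fds[0], EVFILT_READ, EV_ADD, 0, 0, &waiter_a);
    kevent(kq, &ke, 1, NULL, 0, NULL);

    // "Second thread" registering interest in the same fd/filter.
    EV_SET(&ke, fds[0], EVFILT_READ, EV_ADD, 0, 0, &waiter_b);
    kevent(kq, &ke, 1, NULL, 0, NULL);

    write(fds[1], "x", 1);  // make the read end readable

    struct kevent out;
    if (kevent(kq, NULL, 0, &out, 1, NULL) == 1) {
        printf("one event, udata points at waiter %d\n", *(int *)out.udata);
    }
    return 0;
}
```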
@ioquatix thanks for the reply! Fundamentally I agree that multiple registrations on the same IO is a scenario that should be accounted for - I'm just not sure how to create a valid scenario to test with.

Considering you've implemented an HTTP/2 server, you definitely have a lot more experience here than me! But in my understanding, while multiple threads/fibers can share the same socket in HTTP/2, they don't read/write to that socket simultaneously. There would have to be coordinated access to the shared socket to make sure valid chunks were read, and valid chunks were written.

It's tough in the Ruby world to find a concrete example of this since afaik Falcon is the only HTTP/2-enabled Ruby server. Is there a place I can look in the Falcon code for how multiple reads/writes are registering on the same socket, as an example to mimic?
This is also a tough example, since:

```c
// https://github.com/socketry/io-event/blob/main/ext/io/event/selector/kqueue.c#L697C2-L709C38
rb_io_buffer_get_bytes_for_reading(arguments->buffer, &base, &size);

size_t length = arguments->length;
size_t offset = arguments->offset;
size_t total = 0;

size_t maximum_size = size - offset;
```

But in MN threaded code I have not been able to get it to register even with multiple threads
Yea - even license aside, it does not seem well maintained, and would require the Ruby team to agree to bringing in that dependency, and would still likely be a total rewrite of how MN threads currently works.

It'd be nice to find an example that doesn't represent buggy behavior that I could try to actually test this scenario with, from Ruby code.
Reads and writes don't need to be coordinated. However, usually there is one thread doing the reading and several threads doing the writing (usually with a mutex). So yes, potentially.

Don't mistake extremely stable for "not well maintained".

Yes, I agree. I tried to create one but I'm unsure if it's running on the MN model. Is there any way to tell? e.g.
A Falcon example of this would be great 👍🏼
Fair enough - but it's a big lift, and if there was a big lift a better one would be to use
To enable it you use
It's not 100% clear right now which paths will lead to MN kernel event queue code, and which paths won't. I've found that using
Falcon uses async and supports HTTP/2. Every HTTP/2 connection has a background reader, while individual HTTP/2 streams will write packets to the socket. I'm not sure this example will help you with the MN scheduler though.
Makes sense.
Thanks that's helpful, yes, I can see the output, but I can't see any debug logs from the MN scheduler... so I don't think it's working?

I think the result you'd expect to see here is a deadlock or errno/exception if it wasn't working correctly.
It looks to me like the epoll support is very optimistic. It will try to register the fd here: Line 1689 in b7e89d4

Yeah, the primary place I see MN actually take effect is in
This has been a good discussion @ioquatix, because it's made me dig deeper into a few different things and review a lot of the MN code and codepaths. I know you know this part in particular, but I'd like to lay it out: the fiber scheduler has a very well-defined interface for hooking into blocking operations and handling them efficiently using kernel-specific technologies (epoll, kqueue, io_uring, etc). At just about every possible blocking operation, a fiber scheduler implementation can take over. It's pretty exhaustive 👏!

The MN implementation is two parts: (1) seems like it was the biggest lift. It's tied into ractors and pthreads pretty deeply.
And even in those places, again unlike the fiber scheduler, it does not hijack the process. It uses epoll/kqueue to wait for readiness, then falls back to the rest of the normal code. And when you hit scenarios that it is not in control of, the MN runtime falls back to creating actual native threads:

```c
static int
native_thread_check_and_create_shared(rb_vm_t *vm)
{
bool need_to_make = false;
rb_native_mutex_lock(&vm->ractor.sched.lock);
{
unsigned int snt_cnt = vm->ractor.sched.snt_cnt;
if (!vm->ractor.main_ractor->threads.sched.enable_mn_threads) snt_cnt++; // do not need snt for main ractor
if (((int)snt_cnt < MINIMUM_SNT) ||
(snt_cnt < vm->ractor.cnt &&
snt_cnt < vm->ractor.sched.max_cpu)) {
RUBY_DEBUG_LOG("added snt:%u dnt:%u ractor_cnt:%u grq_cnt:%u",
vm->ractor.sched.snt_cnt,
vm->ractor.sched.dnt_cnt,
vm->ractor.cnt,
vm->ractor.sched.grq_cnt);
vm->ractor.sched.snt_cnt++;
need_to_make = true;
}
else {
RUBY_DEBUG_LOG("snt:%d ractor_cnt:%d", (int)vm->ractor.sched.snt_cnt, (int)vm->ractor.cnt);
}
}
rb_native_mutex_unlock(&vm->ractor.sched.lock);
if (need_to_make) {
struct rb_native_thread *nt = native_thread_alloc();
nt->vm = vm;
return native_thread_create0(nt);
}
else {
return 0;
}
}
```

If you run Ruby code using MN and Ruby debug logging, you'll see a lot of either "added snt" or "snt:". If you see a lot of "added snt", the runtime is just creating native threads (this can happen a little during normal MN processing too, since the shared pool does not have to be 1, but it shouldn't end up as 1:1).

I'll provide two examples - one that properly uses shared MN threads, and one that does not:

```ruby
10.times.map do |i|
Thread.new do
# I've got a local server which just sleeps for 15 seconds
Net::HTTP.get(URI("http://localhost:8080/15000ms"))
end
end.each(&:join)
```

In this example you'd see debug logs showing it repeatedly sharing the same thread (if you monitor activity, it only ever creates one additional native thread).
This example will instead fall back to native threads - I'm assuming it must not hit the limited set of paths that lead to the MN event queue code:

```ruby
hostname = 'localhost'
port = 8080
path = '/15000ms'
10.times.map do
Thread.new do
socket = TCPSocket.open(hostname, port)
request = "GET #{path} HTTP/1.1\r\nHost: #{hostname}\r\n\r\n"
socket.puts(request)
response = socket.readpartial(1024)
socket.close
end
end.each(&:join)
```

Here you'd see debug logs showing it repeatedly creating new threads (if you monitor activity, it is 1:1 native thread to Ruby thread).
Lots of room for growth and improvement, as future initiatives past this PR 👍🏼
thread_pthread_mn.c (Outdated)
```diff
@@ -753,14 +883,51 @@ timer_thread_polling(rb_vm_t *vm)
          // simply retry
          break;
        default:
-         perror("epoll_wait");
-         rb_bug("epoll_wait errno:%d", errno);
+         perror("kq");
```
kq specific?
Earlier naming, sorry - I changed it to `event_wait` to match the name at the top of the function.
```c
    }
}

rb_native_mutex_lock(&timer_th.waiting_lock);
{
#if HAVE_SYS_EVENT_H
    if (num_events > 0) {
        if (kevent(timer_th.event_fd, ke, num_events, NULL, 0, NULL) == -1) {
```
I'm not familiar with the `kevent` system call, but is there any failure we can ignore?
In other words, should `kevent` succeed every time? (Otherwise, is it a BUG?)
Here's a list of the possible errors: https://man.freebsd.org/cgi/man.cgi?query=kevent&apropos=0&sektion=0&manpath=FreeBSD+9.0-RELEASE&arch=default&format=html#end

I mostly followed the lead of the `epoll_ctl` call, in particular for the `EBADF`. I'm not really sure if ignoring them is safe/a good idea.
```ruby
io1 = open("/tmp/x", 'w')
io2 = IO.for_fd(io1.fileno)
p io1.close #=> nil
p io2.close #=> Bad file descriptor (Errno::EBADF)
```

I think `EBADF` is possible.
thread.c (Outdated)
```c
#ifdef RUBY_THREAD_PTHREAD_H

static bool
thread_sched_wait_events_timeval(int fd, int events, struct timeval *timeout)
```
let's receive `th` as a function parameter.
Also, `if (!th_has_dedicated_nt(th))` can be located at first:

```c
thread_sched_wait_events_timeval(th, ...)
{
#ifdef RUBY_THREAD_PTHREAD_H
    if (!th->nt->dedicated) {
        ...
    }
#endif // RUBY_THREAD_PTHREAD_H
    return 0;
}
```

and the caller can be simpler, like:

```c
if (thread_sched_wait_events_timeval()) { return ...; // timeout }
```
Great suggestions! Much cleaner now - you can review the specific change made here: 19fd62d
I didn't use macOS so I couldn't check it. I'm okay to merge it if it has good quality, but I can't judge it. BTW I'd planned to restructure the code when merging kqueue support, but the restructuring can be applied later, after Ruby 3.3.

For me (if my kqueue knowledge is correct) it seems to be no problem.

Do you know which method generates more native threads?
@ko1 The relevant code:

```c
VALUE
rb_thread_io_blocking_region(rb_blocking_function_t *func, void *data1, int fd)
{
    return rb_thread_io_blocking_call(func, data1, fd, 0);
}

VALUE
rb_thread_io_blocking_call(rb_blocking_function_t *func, void *data1, int fd, int events)
{
    // ...
    // is always false because `events` is always 0
    if (events && !th_has_dedicated_nt(th)) {
        // ...
        thread_sched_wait_events(TH_SCHED(th), th, fd, waitfd_to_waiting_flag(events), NULL);
    }
#endif
```
* Allows macOS users to use M:N threads (and technically FreeBSD, though it has not been verified on FreeBSD)
* Include sys/event.h header check for macros, and include sys/event.h when present
* Rename epoll_fd to the more generic kq_fd (Kernel event Queue) for use by both epoll and kqueue
* MAP_STACK is not available on macOS so conditionally apply it to mmap flags
* Set fd to close on exec
* Log debug messages specific to kqueue and epoll on creation
* close_invalidate raises an error for the kqueue fd on child process fork. It's unclear rn if that's a bug, or if it's kqueue-specific behavior

Use kq with rb_thread_wait_for_single_fd

* Only platforms with `USE_POLL` (linux) had changes applied to take advantage of kernel event queues. It needed to be applied to the `select` path so that kqueue could be properly applied
* Clean up kqueue-specific code and make sure only flags that were actually set are removed (or an error is raised)
* Also handle kevent-specific errnos, since most don't apply from epoll to kqueue
* Use the more platform-standard close-on-exec approach of `fcntl` and `FD_CLOEXEC`. The io-event gem uses `ioctl`, but fcntl seems to be the recommended choice. It is also what Go, Bun, and Libuv use
* We're making changes in this file anyways - may as well fix a couple spelling mistakes while here

Make sure FD_CLOEXEC carries over in dup

* Otherwise the kqueue descriptor should have FD_CLOEXEC, but doesn't, and fails in assert_close_on_exec
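For readers unfamiliar with the `fcntl`/`FD_CLOEXEC` approach mentioned above, a minimal sketch of my own (not the PR's exact code) of what setting close-on-exec on an event queue descriptor looks like:

```c
#include <fcntl.h>

// Mark a descriptor (e.g. a kqueue or epoll fd) close-on-exec so it is not
// leaked into exec'd child processes.
int
set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```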
* When we have the thread already, it saves a lookup
* `event_wait`, not `kq`

Clean up the `thread_sched_wait_events_timeval` calls

* By handling the PTHREAD check inside the function, all the other code can become much simpler and just call the function directly without additional checks
use `rb_thread_io_blocking_call()` instead of `rb_thread_io_blocking_region()` more. See ruby#9178 (comment)
use `rb_thread_io_blocking_call()` instead of `rb_thread_io_blocking_region()` more. See #9178 (comment)
I got kevent fails:

Sounds good. Let's try it!

🎉

I checked and OpenBSD can also use M:N threads with this. Haven't done any testing with it yet, though.

@jeremyevans yes, it should! I would have given it a try there too, but I am not sure of a way to test it without having OpenBSD installed on a machine. Is there an easy way to give it a try?

Setting up OpenBSD in a virtual machine is probably the best way to give it a try.
Adds macOS support to M:N threads by adding in
kqueue
/kevent
calls when present. Technically this would open up support for FreeBSD as well, but I don't have a way of testing that so i'm not sure how well it does or doesn't work there.I wanted to get support going for macOS so more devs can try out M:N threads and test it. I do think there should be a larger discussion around the potential relationship between this and some of the awesome fiber scheduler work over the last few years (and the potential of utilizing anything from io-event?). But that can be a topic for another day - let's get macOS support in the meantime!
Disclaimer: C isn't my day-to-day language, so I could definitely use feedback there. I'm also more of a kernel event queue (kqueue, epoll, io_uring) enthusiast, but
kqueue
isn't something I have specific experience writing with - just lots of reading code and small toy code up until this point.https://bugs.ruby-lang.org/issues/20053
cc @ko1 @byroot @ioquatix