Integrate Ractor#join and Ractor#value with the fiber scheduler #13517

luke-gruber · 2025-06-04T15:51:46Z

Allow ractors to be used within a fiber scheduler context. Currently,
only Ractor.new { ... }.value or Ractor.new { ... }.join are supported
and tested.

When Ractor#join or Ractor#value is called within a non-blocking fiber
scheduler context, the calling thread is not blocked. Instead, the fiber
scheduler's block method is called, which transfers to another fiber if
the scheduler is implemented correctly. When the ractor has terminated and
is ready to give its value to the calling thread, it sends an interrupt to
that thread telling it to switch back to its original fiber. If the thread
is blocked on IO (because it's in the fiber scheduler close hook), the
ractor unblocks the thread by calling the thread's unblock function. When
the thread receives the interrupt, it calls the fiber scheduler's unblock
method, which either transfers to the original fiber or puts it on a ready
list, again depending on how the scheduler is implemented.

Ex:

scheduler = Scheduler.new # your fiber scheduler
Fiber.set_scheduler scheduler
class << scheduler
  attr_reader :test_blockers
  def block(blocker, timeout=nil)
    (@test_blockers ||= []) << [blocker, timeout]
    super
  end
end
ordering = []
blocked_thread = nil
# in f1
Fiber.schedule do
  # in f2
  r = Ractor.new do
    # in f3
    sleep 0.5
  end
  ordering << "f2 before join"
  # Calling `r.join` should schedule us away from f2 back to f1. In f1, we end the script
  # and then Scheduler#close is called, which can block on IO.select or similar. When the ractor
  # is finished, it resumes fiber f2.
  blocked_thread = Thread.current
  r.join
  ordering << "f2 after join"
end
ordering << "f1 thread finish"
expected_ordering = ["f2 before join", "f1 thread finish", "f2 after join"]

assert_equal expected_ordering, ordering
assert_equal 1, scheduler.test_blockers.size
assert scheduler.test_blockers.first[0].is_a?(Thread) # the blocked thread that called join
assert_equal blocked_thread, scheduler.test_blockers.first[0]
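
The example above assumes a working `Scheduler` class. As a rough illustration of the `block`/`unblock` protocol it relies on, here is a minimal sketch. `TinyScheduler` is a hypothetical name, timeouts are ignored, and `kernel_sleep`/`io_wait` are stubs. It joins a `Thread` rather than a ractor, because `Thread#join` already goes through the same scheduler hooks on released Ruby versions; whether `Ractor#join` behaves identically depends on this PR.

```ruby
# Minimal sketch of a fiber scheduler (TinyScheduler is a hypothetical name).
class TinyScheduler
  attr_reader :blockers

  def initialize
    @ready = Thread::Queue.new # thread-safe ready list; unblock may run on another thread
    @blocked = 0               # number of fibers currently parked in #block
    @blockers = []
  end

  # Called by Fiber.schedule: create a non-blocking fiber and start it.
  def fiber(&block)
    Fiber.new(blocking: false, &block).tap(&:resume)
  end

  # Called instead of blocking the thread. The timeout is ignored in this sketch.
  def block(blocker, timeout = nil)
    @blockers << blocker
    @blocked += 1
    Fiber.yield # hand control back to whoever resumed this fiber
  ensure
    @blocked -= 1
  end

  # May be called from another thread: only record that the fiber is ready.
  def unblock(blocker, fiber)
    @ready << fiber
  end

  # Required by the scheduler interface, but not used in this sketch.
  def kernel_sleep(duration = nil)
    raise NotImplementedError
  end

  def io_wait(io, events, timeout)
    raise NotImplementedError
  end

  # Resume fibers as they become ready until none are parked.
  def run
    @ready.pop.resume while @blocked > 0
  end

  def close
    run
  end
end

ordering = []
scheduler = TinyScheduler.new
Fiber.set_scheduler(scheduler)

Fiber.schedule do
  worker = Thread.new { sleep 0.1 }
  ordering << "before join"
  worker.join # calls scheduler.block instead of blocking the thread
  ordering << "after join"
end
ordering << "main continues"

Fiber.set_scheduler(nil) # runs #close, which waits for the parked fiber to be unblocked
```

Using a `Thread::Queue` as the ready list means a wakeup recorded before the run loop starts waiting is not lost. A production scheduler would typically also wake an IO selector here.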

samuel-williams-shopify · 2025-06-04T20:23:00Z

This looks good to me.

Let's split out the cosmetic changes so we can merge them separately.
Then let's rebase this PR so we can focus on the review cycle with @ko1.

launchable-app · 2025-06-04T21:20:00Z

❌ Tests Failed

✖️ no tests failed ✔️ 61914 tests passed (3 flakes)

luke-gruber · 2025-06-09T21:06:21Z

The PR is rebased and ready for review by @ko1. Thanks!

ko1 · 2025-06-09T22:07:43Z

ractor_sync.c

+    rb_fiber_t *fiber = th->ec->fiber_ptr;
+    if (scheduler != Qnil && fiber && !rb_fiberptr_blocking(fiber)) {
+        waiter.fiber = fiber;
+        RACTOR_UNLOCK(cr);


What happens when another Ractor wakes up this Ractor here?

luke-gruber · 2025-06-10

Good point. Looking at the code, I would say it might try to "unblock" the fiber before the fiber gets "blocked" by the Ruby interrupt. I'll try to reproduce this behavior and come back with a solution for this racy case. Thank you 😄

luke-gruber · 2025-06-10 (edited)

I checked the situation by adding a sleep call after the RACTOR_UNLOCK, and for the use case of this PR (only Ractor#join and Ractor#value can be called in a fiber scheduler context), it is not racy. The only possible wakeup is the end of the ractor during ractor_notify_exit, and that uses an interrupt to signal this thread. Because interrupts are not checked between the unlock and the call to rb_fiber_scheduler_block, unblock cannot happen before block, and the normal order of operations is preserved.
Edit: I also added a new commit that should fix some edge-case behavior.

ko1 · 2025-06-11

Ractor::Port#send also wakes up the target ractor.

ko1 · 2025-06-09T22:07:58Z

common.mk

@@ -14113,6 +14114,26 @@ ractor.$(OBJEXT): $(top_srcdir)/internal/thread.h
 ractor.$(OBJEXT): $(top_srcdir)/internal/variable.h
 ractor.$(OBJEXT): $(top_srcdir)/internal/vm.h
 ractor.$(OBJEXT): $(top_srcdir)/internal/warnings.h
+ractor.$(OBJEXT): $(top_srcdir)/prism/defines.h


Why is prism code added here?

luke-gruber · 2025-06-10

I don't know, but I could look into it. I first opened the PR without these lines, but the build failed saying these header dependencies were missing, and tool/update-deps says that they're needed.

ko1 · 2025-06-09T22:10:42Z

Since Ractor::Port was merged only recently and is not yet mature, I don't want to introduce more complexity that would make it harder to debug.
However, Ractor::Port is dramatically simpler than the previous take/yield pair, so the situation has improved.

samuel-williams-shopify · 2025-06-09T23:17:26Z

> I don't want to introduce more complexity that would make it harder to debug.

Understood. Given that the functionality is only active when a fiber scheduler is set, and that not blocking the scheduler is a requirement, are you okay with us continuing to move forward with this?

We need to always clear the `waiter` off the current ractor's waiters list even if we get an uncaught error during the execution of the transferred fiber. Therefore, we need to be able to delete the `struct ractor_waiter` off the list multiple times if necessary: once when woken up (if it was woken up) and also before raising in the waiting thread. We use `ccan_list_del_init` to clear the node's `next` and `prev` pointers for safe deletion multiple times. Also, we now RACTOR_LOCK around this list deletion in the case of an error.
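
To illustrate why re-initializing the node makes repeated deletion safe, here is a small Ruby model of a circular doubly linked list. It is only a sketch: `ListNode`, `add_tail`, `del_init` and `labels` are hypothetical names that mirror the idea behind `ccan_list_del_init`, not the C implementation in the PR, and the locking described above is left out.

```ruby
# Hypothetical Ruby model of a circular, doubly linked list node.
class ListNode
  attr_reader :label
  attr_accessor :prv, :nxt

  def initialize(label)
    @label = label
    @prv = @nxt = self # a detached node points at itself
  end

  # Link this node in just before `head`, i.e. at the tail of the list.
  def add_tail(head)
    @prv = head.prv
    @nxt = head
    head.prv.nxt = self
    head.prv = self
  end

  # Unlink the node, then point it back at itself so that a second call
  # only touches the node's own pointers.
  def del_init
    @prv.nxt = @nxt
    @nxt.prv = @prv
    @prv = @nxt = self
  end
end

# Collect the labels of all nodes linked after `head`.
def labels(head)
  out = []
  node = head.nxt
  until node.equal?(head)
    out << node.label
    node = node.nxt
  end
  out
end

head = ListNode.new(:head)
a = ListNode.new(:a)
b = ListNode.new(:b)
a.add_tail(head)
b.add_tail(head)

a.del_init # first deletion, e.g. when the waiter is woken up
a.del_init # second deletion, e.g. on the error path; leaves the list unchanged
remaining = labels(head)
```

Without the re-initialization step, a second deletion would write through stale `prv`/`nxt` pointers and could relink neighbors that have changed in the meantime.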

luke-gruber force-pushed the ractor_ports_with_fiber_scheduler branch from 544cad7 to c7c78b5 on June 4, 2025 18:36

luke-gruber force-pushed the ractor_ports_with_fiber_scheduler branch from c7c78b5 to 5fb6b50 on June 4, 2025 21:03

ko1 reviewed Jun 9, 2025

luke-gruber force-pushed the ractor_ports_with_fiber_scheduler branch from 59e4b5d to 459c9ff on June 10, 2025 18:54
