-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Improve IO#close
performance and semantics.
#5384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5268b41
to
9fd8abb
Compare
The field vm->waiting_fds` can be removed now, right? |
io.c
Outdated
// Ensure waiting_fd users do not hit EBADF. | ||
if (busy) { | ||
// Wait for them to exit before we call close(). | ||
do rb_thread_schedule(); while (!list_empty(busy)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this no longer needed?
Do callers all make sure the waiting list for that IO is empty before calling fptr_finalize_flush
?
And also that no other thread can start a blocking operation between being notified and reaching here?
fd = -1
is only set just above but so after notification, hence it seems possible for other threads to restart their blocking operation (or a new thread to start a blocking operation on this IO), unless we have some flag to prevent that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably going to introduce a flag because the fiber scheduler can defer io_close
operation using io_uring
for better performance, but we can't set fd = -1
to mean "we are closing" because then fd
has no meaning after that and can't be used by io_close
hook.
8e74578
to
4134546
Compare
8c5ce88
to
bb5fdce
Compare
#ifdef RUBY_VERSION_IS_3_0 | ||
VALUE timeout = Qnil; | ||
if (!NIL_P(secs)) { | ||
timeout = rb_float_new((double)FIX2INT(secs) + (0.000001f * FIX2INT(usecs))); | ||
} | ||
VALUE result = rb_io_wait(io, events, timeout); | ||
if (result == Qfalse) return INT2FIX(0); | ||
else return result; | ||
#else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not ideal, because it means we won't test rb_wait_for_single_fd
anymore on Ruby >= 3.
So it'd be better to define a separate function/method and test that separately.
For the other changes above it seems fine because they are all trivial renames or helpers, but it's not that straightforward here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rb_wait_for_single_fd
should be deprecated and removed. How should we handle it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth the cost to deprecate it, what's the issue with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time you call it, the fiber scheduler has to make a temporary IO object, because the design choice was made that it should be IO-instance centric not int fd
centric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think deprecating this would need feedback from other core devs or matz.
Practically, you'd use one of the DEPRECATED
macros, and the specs would stop testing that method when it's removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it needs to be clear to everyone that both IO#close
and struct rb_io is made internal
are done by this PR/Redmine ticket (struct rb_io is made internal
is not an implementation detail change, it will affect gems).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct rb_io is made internal is not an implementation detail change, it will affect gems.
Do you know any examples of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usages of RB_IO_POINTER
or GetOpenFile
seem likely problematic, if they want to access any field of the returned rb_io_t*
.
GetOpenFile
is used in a bunch of C extension gems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ gem-codesearch 'GetOpenFile\b' | wc -l
974
Not all might access fields of it, but I guess many do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add deprecation warnings and alternatives. Most of them are probably just accessing the descriptor.
ext/io/console/console.c
Outdated
GetOpenFile(wio, ofptr); | ||
return ofptr->fd; | ||
return rb_ioptr_descriptor(ofptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem with such a change is io-console is a gem and supports older Ruby versions, which don't have any rb_ioptr_*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create compatibility macros for this case, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this needs to be easy then, if rb_ioptr_descriptor
is a function then one needs have_func + their own macro, it's annoying already.
Could we keep exposing some fields of struct rb_io
public (like fd
), and the rest internal? Then there would be much less need to migrate code, which is a significant overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should expose any parts of struct rb_io
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright.
Maybe we could always define macros like HAVE_RB_IO_DESCRIPTOR
which means rb_ioptr_descriptor
is available without needing to have_func('rb_ioptr_descriptor')
which is quite slow (one C compilation for each function checked).
Or something more direct like RB_IO_DESCRIPTOR(io)
, and so users can just define it as the old way if not already defined.
I'm not a big fan of the rb_ioptr_
naming. I think rb_io_
is better. That might mean passing the VALUE instead of the struct rb_io
, that seems fine and much clearer that structs are not parts of the API (even opaque structs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with this but I'm between a rock and a hard place, I didn't create the interfaces that take the "fptr" argument and I'm all for defining more rb_io_...(VALUE)
functions. I agree the less of this we expose the better.
One way we can gracefully implement hiding |
Why not? Because it causes conflicts? I think currently it is part of the public interface (might be unintentional but still). |
Because it's implementation detail.
Where?
Great, so we agree :) |
3fb1441
to
b98507b
Compare
e9d18b1
to
ace3f18
Compare
This PR is too big to merge in one go, so I'm going to break out some of the smaller changes. |
See https://bugs.ruby-lang.org/issues/18455 for more context.
The first half of this PR (hiding
rb_io_t
) was done in #6511.