Skip to content

Improve IO#close performance and semantics. #5384

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ioquatix
Copy link
Member

@ioquatix ioquatix commented Jan 1, 2022

See https://bugs.ruby-lang.org/issues/18455 for more context.

The first half of this PR (hiding rb_io_t) was done in #6511.

@ioquatix ioquatix force-pushed the io-close-consistency branch from 5268b41 to 9fd8abb Compare January 1, 2022 07:32
@eregon
Copy link
Member

eregon commented Jan 1, 2022

The field vm->waiting_fds` can be removed now, right?

io.c Outdated
// Ensure waiting_fd users do not hit EBADF.
if (busy) {
// Wait for them to exit before we call close().
do rb_thread_schedule(); while (!list_empty(busy));
Copy link
Member

@eregon eregon Jan 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this no longer needed?
Do callers all make sure the waiting list for that IO is empty before calling fptr_finalize_flush?
And also that no other thread can start a blocking operation between being notified and reaching here?
fd = -1 is only set just above but so after notification, hence it seems possible for other threads to restart their blocking operation (or a new thread to start a blocking operation on this IO), unless we have some flag to prevent that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably going to introduce a flag because the fiber scheduler can defer io_close operation using io_uring for better performance, but we can't set fd = -1 to mean "we are closing" because then fd has no meaning after that and can't be used by io_close hook.

@ioquatix ioquatix force-pushed the io-close-consistency branch 11 times, most recently from 8e74578 to 4134546 Compare March 18, 2022 05:17
@ioquatix ioquatix force-pushed the io-close-consistency branch 4 times, most recently from 8c5ce88 to bb5fdce Compare April 1, 2022 03:15
Comment on lines +183 to +245
#ifdef RUBY_VERSION_IS_3_0
VALUE timeout = Qnil;
if (!NIL_P(secs)) {
timeout = rb_float_new((double)FIX2INT(secs) + (0.000001f * FIX2INT(usecs)));
}
VALUE result = rb_io_wait(io, events, timeout);
if (result == Qfalse) return INT2FIX(0);
else return result;
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not ideal, because it means we won't test rb_wait_for_single_fd anymore on Ruby >= 3.
So it'd be better to define a separate function/method and test that separately.

For the other changes above it seems fine because they are all trivial renames or helpers, but it's not that straightforward here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rb_wait_for_single_fd should be deprecated and removed. How should we handle it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth the cost to deprecate it, what's the issue with it?

Copy link
Member Author

@ioquatix ioquatix Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time you call it, the fiber scheduler has to make a temporary IO object, because the design choice was made that it should be IO-instance centric not int fd centric.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think deprecating this would need feedback from other core devs or matz.
Practically, you'd use one of the DEPRECATED macros, and the specs would stop testing that method when it's removed.

Copy link
Member

@eregon eregon Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it needs to be clear to everyone that both IO#close and struct rb_io is made internal are done by this PR/Redmine ticket (struct rb_io is made internal is not an implementation detail change, it will affect gems).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct rb_io is made internal is not an implementation detail change, it will affect gems.

Do you know any examples of this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usages of RB_IO_POINTER or GetOpenFile seem likely problematic, if they want to access any field of the returned rb_io_t*.
GetOpenFile is used in a bunch of C extension gems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ gem-codesearch 'GetOpenFile\b' | wc -l
974

Not all might access fields of it, but I guess many do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add deprecation warnings and alternatives. Most of them are probably just accessing the descriptor.

GetOpenFile(wio, ofptr);
return ofptr->fd;
return rb_ioptr_descriptor(ofptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem with such a change is io-console is a gem and supports older Ruby versions, which don't have any rb_ioptr_*.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can create compatibility macros for this case, wdyt?

Copy link
Member

@eregon eregon May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but this needs to be easy then, if rb_ioptr_descriptor is a function then one needs have_func + their own macro, it's annoying already.

Could we keep exposing some fields of struct rb_io public (like fd), and the rest internal? Then there would be much less need to migrate code, which is a significant overhead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should expose any parts of struct rb_io.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright.
Maybe we could always define macros like HAVE_RB_IO_DESCRIPTOR which means rb_ioptr_descriptor is available without needing to have_func('rb_ioptr_descriptor') which is quite slow (one C compilation for each function checked).
Or something more direct like RB_IO_DESCRIPTOR(io), and so users can just define it as the old way if not already defined.

I'm not a big fan of the rb_ioptr_ naming. I think rb_io_ is better. That might mean passing the VALUE instead of the struct rb_io, that seems fine and much clearer that structs are not parts of the API (even opaque structs).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with this but I'm between a rock and a hard place, I didn't create the interfaces that take the "fptr" argument and I'm all for defining more rb_io_...(VALUE) functions. I agree the less of this we expose the better.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 6, 2022

One way we can gracefully implement hiding struct rb_io is by leaving it public for one minor release, as well as adding accessors (like rb_ioptr_descriptor) and then migrating all code, at some point we hide the implementation. But it will slow down the change here. We cannot expose ccan_list as part of a public interface.

@eregon
Copy link
Member

eregon commented May 17, 2022

We cannot expose ccan_list as part of a public interface.

Why not? Because it causes conflicts? I think currently it is part of the public interface (might be unintentional but still).
Agreed it shouldn't be part of it though.

@ioquatix
Copy link
Member Author

Why not? Because it causes conflicts?

Because it's implementation detail.

I think currently it is part of the public interface (might be unintentional but still).

Where?

Agreed it shouldn't be part of it though.

Great, so we agree :)

@ioquatix ioquatix force-pushed the io-close-consistency branch 3 times, most recently from 3fb1441 to b98507b Compare May 30, 2023 10:13
@ioquatix ioquatix force-pushed the io-close-consistency branch from e9d18b1 to ace3f18 Compare May 31, 2023 02:23
@ioquatix
Copy link
Member Author

This PR is too big to merge in one go, so I'm going to break out some of the smaller changes.

@ioquatix ioquatix closed this Apr 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants