Skip to content

YJIT: add codegen for String#setbyte #9767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

maximecb
Copy link
Contributor

Before:

--------  -----------  ----------  ---------  ----------  ------------  -----------
bench     interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
ruby-xor  987.2        0.1         109.3      0.3         8.38          9.04       
--------  -----------  ----------  ---------  ----------  ------------  -----------

After:

--------  -----------  ----------  ---------  ----------  ------------  -----------
bench     interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
ruby-xor  991.8        0.6         99.6       0.4         9.25          9.96       
--------  -----------  ----------  ---------  ----------  ------------  -----------

@maximecb maximecb self-assigned this Jan 30, 2024
@matzbot matzbot requested a review from a team January 30, 2024 20:43
fn rb_str_setbyte(str: VALUE, index: VALUE, value: VALUE) -> VALUE;
}

// Raises when index is out of range
Copy link
Member

@k0kubun k0kubun Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we okay with breaking the backtrace when YJIT is enabled? We discussed this exact problem on Thursday and you suggested to put a guard on the index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the flip side, you could probably remove jit_prepare_routine_call if you guard it, so it would allow you to keep local types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd also need to guard for frozen strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method can still raise though? You're saying it won't report the method raising correctly? Why is that?

Copy link
Member

@k0kubun k0kubun Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't say it can't raise. I only talked about the backtrace.

master:

$ ruby -v --yjit-call-threshold=1 -e '"".setbyte(1, 0)'
ruby 3.4.0dev (2024-01-30T19:59:53Z master c1f8d974a8) +YJIT [x86_64-linux]
-e:1:in `setbyte': index 1 out of string (IndexError)
        from -e:1:in `<main>'

this branch:

$ ruby -v --yjit-call-threshold=1 -e '"".setbyte(1, 0)'
ruby 3.4.0dev (2024-01-30T20:39:42Z yjit_setbyte 247ee663df) +YJIT [x86_64-linux]
-e:1:in `<main>': index 1 out of string (IndexError)

With this change, you can't know which method is raising the exception. If a line has a chain of method calls, it'd be harder to tell whether it's from setbyte or not. And this problem will happen only after YJIT compiles it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd preserve the backtrace, but at the same time, this can really get in the way of performance optimizations. How do you guys feel about it?

In theory, it may be possible to add JIT support for recovering a correct backtrace based on the PC (we do set up the PC in jit_prepare_routine_call) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel uneasy about messing with the backtrace since it's user facing. More people should probably be brought into the discussion if we really want to do it.

With stuff like send and alias methods, it's hard to recover the callee from the PC alone, since that just points you to the callsite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok hmmm.

I feel like if we have to guard that the index is fixnum and in range, and that the string is also frozen, it really nerfs this PR because we're adding a fair bit of extra work on top of a C call that will duplicate the guards we just did.

So this opens up at least 2 discussion threads:

  1. Can we find a way to recover the backtrace for C calls based on the PC only, without needing to push a Ruby frame?
  2. How easy it it to guard that a string is not frozen, and that the string has binary encoding?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For (2), if we're going to guard, it's probably faster and cleaner to write a version of the C function that returns a bool instead of raising, that way we can just call it without preparation and we don't repeat any guards. With the guards, it's really close to actually setting the byte.

Copy link
Member

@k0kubun k0kubun Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR, Alan's idea on (2) seems reasonable to me.

  1. Can we find a way to recover the backtrace for C calls based on the PC only, without needing to push a Ruby frame?

For a long term, we should aim for having this capability. We have so many almost-leaf methods that are not leaf only because they raise. Many C methods need to be inlining-safe when we start inlining non-trivial ISEQs.

We could have a YJIT-global hash map { PC => C method }, make sure it's currently the only callee from the PC on YJIT (otherwise invalidate the existing block), and resurrect the frame on rb_raise if the top-most frame's PC is in the hash map.

If the same ISEQ is called from the interpreter for any reason, another frame (setbyte or another method, not the caller we're compiling) would be the top-most frame, so it wouldn't doubly push the frame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants