YJIT: Lazily push a frame for specialized C funcs #10080
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Looking at this PR 👀 Nice results on the |
This PR is surprisingly simple and the results look good so far. Well done 👍
Do you think it would make sense to try to apply this optimization to more cfuncs before merging?
Also, can this eventually be applied to all calls to cfuncs in `gen_send_cfunc`? That would make a big performance difference if possible (doesn't need to be done in this PR).
I can apply this to
Theoretically, it's possible. However, given that frame outlining ended up slowing down the interpreter, I'd like to minimize the number of hooks where we push a lazy frame. Right now, So the benefit of the current design is that it has zero overhead to the C code, which likely doesn't hold when you apply this to every single C func. We currently apply the hook to But I think this is enough. On chunky-png, |
Well done 👌
This PR adds the capability to lazily push a frame for specialized C methods. It saves the PC before a C call, remembers `{ PC => cme }`, and lazily pushes a frame using it when it calls non-leaf functions like `rb_raise` or `rb_to_int`.

To demonstrate the capability, this PR also merges `String#setbyte` specialization #9767. While the JIT code never pushes a frame for `String#setbyte`, it gets a `String#setbyte` frame when a backtrace is obtained inside the C call.