-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Add ability to trace exit locations in yjit #5970
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
11283ae
to
b08eab4
Compare
Some comments based on the description:
Maybe
Could make sense to add a I wish you'd posted the output on railsbench or a larger example so we can see the full potential. |
test/ruby/test_yjit.rb
Outdated
def test_compile_putobject | ||
assert_compiles('true', insns: %i[putobject], result: true) | ||
assert_compiles('123', insns: %i[putobject], result: 123) | ||
assert_compiles(':foo', insns: %i[putobject], result: :foo) | ||
assert_compiles('true', insns: %i[putobject], result: true, trace_exits: true) | ||
assert_compiles('123', insns: %i[putobject], result: 123, trace_exits: true) | ||
assert_compiles(':foo', insns: %i[putobject], result: :foo, trace_exits: true) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit confused as to why you're setting trace_exits: true
for some tests but not others. What are you testing? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the tests create so much data that tracing exits slowed them down, so instead I chose a bunch that I know produced exits and tested that the hash has the keys I expect (and aren't empty).
yjit.c
Outdated
// Parses the yjit_raw_samples and yjit_line_samples data into three | ||
// hashes (raw, lines, and frames) to be used by RubyVM::YJIT.exit_locations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are raw samples and line samples?
yjit/src/codegen.rs
Outdated
// If --yjit-trace-exits option is enabled, record the exit stack | ||
// while recording the side exits. | ||
if get_option!(gen_trace_exits) { | ||
mov(cb, RDI, const_ptr_opnd(exit_pc as *const u8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use C_ARG_REGS[0]
instead of directly using RDI
yjit/src/codegen.rs
Outdated
/// For implementing exit location tracking | ||
yjit_raw_samples: Vec<VALUE>, | ||
yjit_line_samples: Vec<i32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state for this should be stored in stats.rs
, not in codegen.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a struct just for these values, but @noahgibbs recommended moving it into CodegenGlobals but I don't remember the exact reason why. Unless I totally misunderstood what he meant when we paired on this, @noahgibbs do you remember?
Previously the way I was accessing my struct (needing an instance of it) required me to wrap this in unsafe, but that might have been my implementation. I don't know enough Rust to be confident in that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked about moving it into CodegenGlobals to avoid having the additional unsafe dance for a new structure. It sounds like @maximecb is recommending an additional unsafe dance for a new structure -- I don't think she wants it in the stats counters, so it would need its own singleton setup and accessors (including unsafe.)
If it was an additional structure in stats.rb, not in counters, you would indeed need unsafe for it. It's basically a singleton or global at that point, and Rust demands unsafe for situations with possible data races like that :-/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrapped the counter increments in a macro to avoid having to expose the internals of how its accessed.
It just doesn't semantically belong in CodegenGlobals 😅
yjit/src/stats.rs
Outdated
while idx < yjit_raw_samples.len() as size_t { | ||
let num = yjit_raw_samples[idx as usize]; | ||
let mut i = 0; | ||
|
||
idx += 1; | ||
|
||
while i < i32::from(num) { | ||
unsafe { rb_gc_mark(yjit_raw_samples[idx as usize]); } | ||
i += 1; | ||
idx += 1; | ||
} | ||
|
||
idx += 1; | ||
idx += 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs commenting.
What are raw samples? What kind of values do they contain that needs to be GC'd?
Why does idx +=1
appear twice in a row?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maximecb Each frame is wrapped by a GC allocated object that is returned via rb_profile_frames
.
@eileencodes correct me if I'm wrong, but IIRC the layout of raw_samples
looks something like this:
[ len, frame1, frame2, frameN, exit_insn, count, len, frame1, frame2, frameN, exit_insn, count, ... ]
^------------- First Sample ----------------^ ^--------------- Second Sample -------------^ ^--- etc ---^
The array isn't a homogenous list of VALUE pointers, it also contains the exit instruction (which isn't a VALUE) and also book keeping (the number of times we've seen that particular stack in a row and the actual length of the sample).
The two idx bumps skip those two "non GC allocated" objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should explain that quickly in the comments, maybe mention which source file rb_profile_frames
is defined in.
yjit/src/stats.rs
Outdated
const BUFF_LEN: usize = 2048; | ||
let mut frames_buffer = [VALUE(0 as usize); BUFF_LEN]; | ||
let mut lines_buffer = [0; BUFF_LEN]; | ||
let num = unsafe { rb_profile_frames(0, BUFF_LEN as i32, frames_buffer.as_mut_ptr(), lines_buffer.as_mut_ptr()) }; | ||
let mut i = num - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does rb_profile_frames do, and where does the 2048 come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rb_profile_frames
is the CRuby API for getting a "profiling" stack trace. 2048 came from the original stackprof implementation, but I don't think there's any real reason for it. rb_profile_frames
doesn't allocate anything, only write in to a buffer you provide. 2048 seemed to be working well for collecting stats on stackprof, so it seemed natural to use here.
7fd1331
to
82be00d
Compare
class TestYJITExitLocations < Test::Unit::TestCase | ||
def test_yjit_trace_exits_and_v_no_error | ||
_stdout, stderr, _status = EnvUtil.invoke_ruby(%w(-v --yjit-trace-exits), '', true, true) | ||
refute_includes(stderr, "NoMethodError") | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea moving this into its own test file 👍
When running with `--yjit-stats` turned on, yjit can inform the user what the most common exits are. While this is useful information it doesn't tell you the source location of the code that exited or what the code that exited looks like. This change intends to fix that. To use the feature, run yjit with the `--yjit-trace-exits` option, which will record the backtrace for every exit that occurs. This functionality requires the stats feature to be turned on. Calling `--yjit-trace-exits` will automatically set the `--yjit-stats` option. Users must call `RubyVM::YJIT.dump_exit_locations(filename)` which will Marshal dump the contents of `RubyVM::YJIT.exit_locations` into a file based on the passed filename. *Example usage:* Given the following script, we write to a file called `concat_array.dump` the results of `RubyVM::YJIT.exit_locations`. ```ruby def concat_array ["t", "r", *x = "u", "e"].join end 1000.times do concat_array end RubyVM::YJIT.dump_exit_locations("concat_array.dump") ``` When we run the file with this branch and the appropriate flags the stacktrace will be recorded. Note Stackprof needs to be installed or you need to point to the library directly. ``` ./ruby --yjit --yjit-call-threshold=1 --yjit-trace-exits -I/Users/eileencodes/open_source/stackprof/lib test.rb ``` We can then read the dump file with Stackprof: ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump ``` Results will look similar to the following: ``` ================================== Mode: () Samples: 1817 (0.00% miss rate) GC: 0 (0.00%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 1001 (55.1%) 1001 (55.1%) concatarray 335 (18.4%) 335 (18.4%) invokeblock 178 (9.8%) 178 (9.8%) send 140 (7.7%) 140 (7.7%) opt_getinlinecache ...etc... ``` Simply inspecting the `concatarray` method will give `SOURCE UNAVAILABLE` because the source is insns.def. ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method concatarray ``` Result: ``` concatarray (nonexistent.def:1) samples: 1001 self (55.1%) / 1001 total (55.1%) callers: 1000 ( 99.9%) Object#concat_array 1 ( 0.1%) Gem.suffixes callees (0 total): code: SOURCE UNAVAILABLE ``` However if we go deeper to the callee we can see the exact source of the `concatarray` exit. ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method Object#concat_array ``` ``` Object#concat_array (/Users/eileencodes/open_source/rust_ruby/test.rb:1) samples: 0 self (0.0%) / 1000 total (55.0%) callers: 1000 ( 100.0%) block in <main> callees (1000 total): 1000 ( 100.0%) concatarray code: | 1 | def concat_array 1000 (55.0%) | 2 | ["t", "r", *x = "u", "e"].join | 3 | end ``` The `--walk` option is recommended for this feature as it make it easier to traverse the tree of exits. *Goals of this feature:* This feature is meant to give more information when working on YJIT. The idea is that if we know what code is exiting we can decide what areas to prioritize when fixing exits. In some cases this means adding prioritizing avoiding certain exits in yjit. In more complex cases it might mean changing the Ruby code to be more performant when run with yjit. Ultimately the more information we have about what code is exiting AND why, the better we can make yjit. *Known limitations:* * Due to tracing exits, running this on large codebases like Rails can be quite slow. * On complex methods it can still be difficult to pinpoint the exact cause of an exit. * Stackprof is a requirement to to view the backtrace information from the dump file. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
82be00d
to
03ff674
Compare
Ok I think this is ready. I did the following (which it looks like you already saw @maximecb)
In addition here is an example of using this feature on Rails. Rails exampleRun a Rails test with yjit and trace exits turned on:
Stackprof output:
Let's look at the
Our first page of output will look like this which shows us that there are exits in each location.:
If we pick the first one we can see the exact code that's causing this exit to happen in
|
Well done Eileen! Now that this is complete, I think it would make sense to use the new tool that you've built to pinpoint improvements that we could make. Either improve YJIT to better support rails, or change rails to make it more easily optimizable by YJIT. Something to think about for next steps :) |
I originally added the check for RubyVM::YJIT.trace_exit_locations_enabled? to fix errors when these tests run without the stats feature enabled. However I forgot that this will never be true when this test is booting, so nothing was running when the stats feature is turned on. Instead I've decided to make a new hash in the dump file and check if exit locations are enabled there. If they aren't enabled we return early to avoid checking for keys that won't exit in the dumped exit locations. I chose to add this additional enabled check because empty exit locations might not indicate that stats isn't enabled, it could mean the feature is entirely broken. I do want these tests to fail if stats are on and nothing was collected. Followup to ruby#5970
I originally added the check for RubyVM::YJIT.trace_exit_locations_enabled? to fix errors when these tests run without the stats feature enabled. However I forgot that this will never be true when this test is booting, so nothing was running when the stats feature is turned on. Instead I've decided to make a new hash in the dump file and check if exit locations are enabled there. If they aren't enabled we return early to avoid checking for keys that won't exit in the dumped exit locations. I chose to add this additional enabled check because empty exit locations might not indicate that stats isn't enabled, it could mean the feature is entirely broken. I do want these tests to fail if stats are on and nothing was collected. Followup to #5970
When running with
--yjit-stats
turned on Ruby code can inform the userwhat the most common exits are. While this is useful information it
doesn't tell you the source location of the code that exited or what the
code that exited looks like. This change intends to fix that.
To use the feature, run yjit with
--yjit-stats
and--yjit-trace-exits
,which will record the backtrace for every exit that occurs. Users must save
the output of
RubyVM::YJIT.exit_locations
to a dump file. That filecan then be read by StackProf to see the code that exited and the
reason.
Example usage:
Given the following script, we write to a file called
concat_array.dump
the results ofRubyVM::YJIT.exit_locations
.When we run the file with this branch and the appropriate flags the
stacktrace will be recorded. Note Stackprof needs to be installed or you
need to point to the library directly.
We can then read the dump file with Stackprof:
Results will look similar to the following:
Simply inspecting the
concatarray
method will giveSOURCE UNAVAILABLE
because the source is insns.def.Result:
However if we go deeper to the callee we can see the exact
source of the
concatarray
exit.The
--walk
option is recommended for this feature as it make iteasier to traverse the tree of exits.
Goals of this feature:
This feature is meant to give more information when working on YJIT.
The idea is that if we know what code is exiting we can decide what
areas to prioritize when fixing exits. In some cases this means adding
prioritizing avoiding certain exits in yjit. In more complex cases it
might mean changing the Ruby code to be more performant when run with
yjit. Ultimately the more information we have about what code is exiting
AND why, the better we can make yjit.
Known limitations:
can be quite slow.
an exit.
the dump file
Co-authored-by: Aaron Patterson tenderlove@ruby-lang.org
Ruby Issue: https://bugs.ruby-lang.org/issues/18812