-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Optimize callcache invalidation for refinements #13077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
122dce4
to
c418bea
Compare
vm.c
Outdated
@@ -4443,6 +4448,7 @@ Init_vm_objects(void) | |||
vm->loading_table = st_init_strtable(); | |||
vm->ci_table = st_init_table(&vm_ci_hashtype); | |||
vm->frozen_strings = st_init_table_with_size(&rb_fstring_hash_type, 10000); | |||
vm->cc_refinement_table = st_init_table(&vm_cc_refinement_hashtype); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vm->cc_refinement_table = st_init_table(&vm_cc_refinement_hashtype); | |
vm->cc_refinement_table = st_init_numtable(); |
I think you can just use a standard numtable
, and get rid of your hash
and cmp
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see! I force-pushed the fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does numtable compute hash values from address?
If so, it might be necessary to rebuild and recalculate the hash values after GC compaction.
Alternatively, using a unique object_id as in the original implementation could result in stable hash values.
Which is better: rebuilding or the original approach?
If rebuilding, where should it happen? I’ll investigate tomorrow.
However, this is my first time writing C code, so I might be saying something incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does numtable compute hash values from address?
Yes it does.
If so, it might be necessary to rebuild and recalculate the hash values after GC compaction.
Indeed, I thought it being declared as a weak table would be enough, but looking closer at the code, it update the value in place, ignoring the hash.
So we indeed either need an address independent hash, or to use a more complex way to update that table after compaction.
Alternatively, using a unique object_id as in the original implementation could result in stable hash values.
object_id
insert that object in two other tables (obj_to_id
and id_to_obj
), so it's not ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or to use a more complex way to update that table after compaction.
One simple way could just be to rebuild it entirely after compaction. Iterate on the old one and insert in a new one, then swap them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code to perform rehashing (deletion and reinsertion) within the foreach loop, following the approach used in vm_weak_table_gen_ivar_foreach
.
Also, value
was added for compatibility with vm_weak_table_foreach_update_weak_key, but it's no longer needed after switching to foreach. So now I'm passing 1
instead of cc
, as the value is unused.
Hope #13074 gets merged soon 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered the copy-then-swap approach, but if no objects move, it wastes memory by duplicating data unnecessarily.
I'm going with a naive implementation for now, though I'm not confident it's the right choice.
If GC compaction moves most cc, I'd prefer copy-then-swap.
I'd appreciate any insights on how GC compaction behaves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy-then-swap seems better after all. It's fast when there are few entries, and if there are many, ST_REPLACE will likely happen anyway.
@byroot Updated the code, would appreciate a review 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would appreciate a review 😄
I can't see anything wrong with that PR, but I'm not that knowledgeable about GC and Ractors, so that doesn't say much.
I'll leave it to @peterzhu2118 to say if there is an issue with the GC code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to everyone who's already reviewed this-no issues have been spotted so far. @peterzhu2118 , when you have a moment, could you take a look at the GC-related changes as well?
I'm not sure about compaction, but is it safe for |
Right, it's missing some code in |
Nevermind, It's already handled because it's declared as a weak table. |
The reference updating code looks correct to me |
571398d
to
6a20b7c
Compare
CI failed due to a network error, so I rebased and force-pushed to trigger it again. |
a2fe156
to
a959b6f
Compare
2a2198f
to
02b2e62
Compare
Follow the document https://smarthr-inc.docbase.io/posts/3759282 and pull request ruby#13077
Follow the document https://smarthr-inc.docbase.io/posts/3759282 and pull request ruby#13077
|
1c6afb9
to
811d670
Compare
348bcef
to
15321e3
Compare
Fixes [Bug #21201] This change addresses a performance regression where defining methods inside `refine` blocks caused severe slowdowns. The issue was due to `rb_clear_all_refinement_method_cache()` triggering a full object space scan via `rb_objspace_each_objects` to find and invalidate affected callcaches, which is very inefficient. To fix this, I introduce `vm->cc_refinement_table` to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement.
Fixes [Bug #21201]
This change addresses a performance regression where defining methods inside
refine
blocks caused severe slowdowns. The issue was due torb_clear_all_refinement_method_cache()
triggering a full object space scan viarb_objspace_each_objects
to find and invalidate affected callcaches, which is very inefficient.To fix this, I introduce
vm->cc_refinement_table
to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement.benchmark