
Optimize callcache invalidation for refinements #13077


Merged
2 commits merged into ruby:master on Jun 9, 2025
Conversation


@alpaca-tc alpaca-tc commented Apr 6, 2025

Fixes [Bug #21201]

This change addresses a performance regression where defining methods inside refine blocks caused severe slowdowns. The issue was due to rb_clear_all_refinement_method_cache() triggering a full object space scan via rb_objspace_each_objects to find and invalidate affected callcaches, which is very inefficient.

To fix this, I introduce vm->cc_refinement_table to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement.
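Roughly, the idea looks like the following sketch (not this PR's exact code: the helper names cc_refinement_register, invalidate_cc_i, and cc_refinement_invalidate_all are made up for illustration, and vm_cc_invalidate stands in for whatever invalidation routine the VM provides). The table's keys are the refinement-related callcaches and the values are unused, so invalidation only has to walk this table:

static st_table *cc_refinement_table; /* the PR keeps this on vm->cc_refinement_table */

static void
cc_refinement_register(const struct rb_callcache *cc)
{
    if (!cc_refinement_table) cc_refinement_table = st_init_numtable();
    /* key: the callcache pointer; value: unused, so just 1 */
    st_insert(cc_refinement_table, (st_data_t)cc, (st_data_t)1);
}

static int
invalidate_cc_i(st_data_t key, st_data_t value, st_data_t arg)
{
    vm_cc_invalidate((const struct rb_callcache *)key); /* mark this callcache stale */
    return ST_DELETE; /* drop the entry once it has been invalidated */
}

static void
cc_refinement_invalidate_all(void)
{
    if (cc_refinement_table) {
        /* walk only the refinement callcaches instead of scanning the whole heap */
        st_foreach(cc_refinement_table, invalidate_cc_i, 0);
    }
}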

Benchmark

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  gem "benchmark-ips"
end

mod = Module.new
klass = Class.new

Benchmark.ips do |x|
  x.report(RUBY_VERSION) do
    mod.send(:refine, klass) do
      def call_1 = nil
      def call_2 = nil
      def call_3 = nil
    end
  end

  x.save! "/tmp/performance_regression_refine.bench"
  x.compare!
end
ruby 3.5.0dev (2025-06-05T14:33:43Z bugs/21201 e7833e82f4) +PRISM [arm64-darwin24]
Warming up --------------------------------------
         3.5.0_patch    50.158k i/100ms
Calculating -------------------------------------
         3.5.0_patch    520.371k (± 7.2%) i/s    (1.92 μs/i) -      2.608M in   5.038852s

Comparison:
               3.2.8:  1597482.0 i/s
         3.5.0_patch:   520371.1 i/s - 3.07x  slower (this PR)
               3.3.8:      213.0 i/s - 7499.45x  slower
               3.4.4:      138.2 i/s - 11561.20x  slower
3.5.0_master_0e0008da0f19d:      137.8 i/s - 11593.43x  slower
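(For reference, and as I understand benchmark-ips: x.save! appends each run's results to the named file and x.compare! folds the previously saved entries into the report, which is how a single comparison can mix runs from 3.2.8 through this patched 3.5.0dev build.)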


@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 4 times, most recently from 122dce4 to c418bea on April 7, 2025 at 14:08
@alpaca-tc alpaca-tc marked this pull request as ready for review April 7, 2025 14:10
@byroot byroot requested a review from ko1 April 7, 2025 14:38
vm.c (outdated)
@@ -4443,6 +4448,7 @@ Init_vm_objects(void)
     vm->loading_table = st_init_strtable();
     vm->ci_table = st_init_table(&vm_ci_hashtype);
     vm->frozen_strings = st_init_table_with_size(&rb_fstring_hash_type, 10000);
+    vm->cc_refinement_table = st_init_table(&vm_cc_refinement_hashtype);
Member

Suggested change:
-    vm->cc_refinement_table = st_init_table(&vm_cc_refinement_hashtype);
+    vm->cc_refinement_table = st_init_numtable();

I think you can just use a standard numtable, and get rid of your hash and cmp functions.
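For context, a minimal sketch of what that suggestion boils down to (cc_refinement_cmp and cc_refinement_hash stand in for the custom callbacks the PR originally defined; the field order follows struct st_hash_type in st.h):

/* Custom table type: needs explicit compare and hash callbacks. */
static const struct st_hash_type vm_cc_refinement_hashtype = {
    cc_refinement_cmp,  /* int (*compare)(st_data_t, st_data_t) */
    cc_refinement_hash, /* st_index_t (*hash)(st_data_t) */
};
vm->cc_refinement_table = st_init_table(&vm_cc_refinement_hashtype);

/* Plain numtable: keys are hashed and compared as integers (here, the cc pointer),
 * so no custom callbacks are needed at all. */
vm->cc_refinement_table = st_init_numtable();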

Contributor Author

Ah, I see! I force-pushed the fix.

Contributor Author

Does numtable compute hash values from the address?
If so, it might be necessary to rebuild and recalculate the hash values after GC compaction.
Alternatively, using a unique object_id as in the original implementation could result in stable hash values.

Which is better: rebuilding or the original approach?
If rebuilding, where should it happen? I’ll investigate tomorrow.

However, this is my first time writing C code, so I might be saying something incorrect.

Member

Does numtable compute hash values from the address?

Yes it does.

If so, it might be necessary to rebuild and recalculate the hash values after GC compaction.

Indeed, I thought declaring it as a weak table would be enough, but looking closer at the code, it updates the value in place, ignoring the hash.

So we indeed either need an address-independent hash, or a more complex way to update that table after compaction.

Alternatively, using a unique object_id as in the original implementation could result in stable hash values.

object_id inserts that object into two other tables (obj_to_id and id_to_obj), so it's not ideal.

Member

or a more complex way to update that table after compaction.

One simple way could just be to rebuild it entirely after compaction. Iterate on the old one and insert in a new one, then swap them.
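A hedged sketch of that rebuild, assuming it runs once after compaction has finished moving objects (the function names here are made up; rb_gc_location returns an object's post-compaction address):

static int
move_cc_entry_i(st_data_t key, st_data_t value, st_data_t arg)
{
    st_table *new_table = (st_table *)arg;
    /* re-insert under the callcache's possibly-new address so it hashes correctly */
    st_insert(new_table, (st_data_t)rb_gc_location((VALUE)key), value);
    return ST_CONTINUE;
}

static void
rebuild_cc_refinement_table(rb_vm_t *vm)
{
    st_table *new_table = st_init_numtable();
    st_foreach(vm->cc_refinement_table, move_cc_entry_i, (st_data_t)new_table);
    st_free_table(vm->cc_refinement_table);
    vm->cc_refinement_table = new_table; /* swap the old table for the rebuilt one */
}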

Contributor Author

I updated the code to perform rehashing (deletion and reinsertion) within the foreach loop, following the approach used in vm_weak_table_gen_ivar_foreach.

Also, the value was originally added for compatibility with vm_weak_table_foreach_update_weak_key, but it's no longer needed after switching to foreach. So now I'm passing 1 instead of cc, since the value is unused.
Hope #13074 gets merged soon 👍

Contributor Author

I considered the copy-then-swap approach, but if no objects move, it wastes memory by duplicating data unnecessarily.
I'm going with a naive implementation for now, though I'm not confident it's the right choice.
If GC compaction moves most of the callcaches, I'd prefer copy-then-swap.
I'd appreciate any insights on how GC compaction behaves.

Contributor Author

copy-then-swap seems better after all. It's fast when there are few entries, and if there are many, ST_REPLACE will likely happen anyway.
@byroot Updated the code, would appreciate a review 😄

Member

would appreciate a review 😄

I can't see anything wrong with that PR, but I'm not that knowledgeable about GC and Ractors, so that doesn't say much.

I'll leave it to @peterzhu2118 to say if there is an issue with the GC code.

Contributor Author

Thanks to everyone who's already reviewed this; no issues have been spotted so far. @peterzhu2118, when you have a moment, could you take a look at the GC-related changes as well?

@ko1
Contributor

ko1 commented Apr 7, 2025

I'm not sure about compaction, but is it safe for GC.compact?
cc/ @tenderlove

@byroot
Member

byroot commented Apr 7, 2025

I'm not sure about compaction, but is it safe for GC.compact?

Right, it's missing some code in gc/default.c -> gc_update_references to update the call cache entries in that weak map.

@byroot
Member

byroot commented Apr 7, 2025

Never mind, it's already handled because it's declared as a weak table.

@tenderlove
Member

I'm not sure about compaction, but is it safe for GC.compact?

The reference-updating code looks correct to me.

@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 3 times, most recently from 571398d to 6a20b7c on April 8, 2025 at 15:47
@alpaca-tc
Contributor Author

alpaca-tc commented Apr 8, 2025

CI failed due to a network error, so I rebased and force-pushed to trigger it again.

@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 3 times, most recently from a2fe156 to a959b6f on April 10, 2025 at 06:08
@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 3 times, most recently from 2a2198f to 02b2e62 on April 10, 2025 at 13:35
kinoppyd added a commit to kufu/ruby that referenced this pull request Apr 24, 2025
exSOUL added a commit to kufu/ruby that referenced this pull request Apr 24, 2025
@alpaca-tc alpaca-tc marked this pull request as draft May 8, 2025 10:59
@alpaca-tc
Contributor Author

alpaca-tc commented May 8, 2025

I'm fixing the CI errors after rebasing.

Done.

@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 2 times, most recently from 1c6afb9 to 811d670 on May 9, 2025 at 04:15
@alpaca-tc alpaca-tc marked this pull request as ready for review May 9, 2025 11:43
@alpaca-tc alpaca-tc force-pushed the bugs/21201 branch 2 times, most recently from 348bcef to 15321e3 on May 28, 2025 at 13:07
@ko1 ko1 merged commit c8ddc0a into ruby:master Jun 9, 2025
82 checks passed
@alpaca-tc alpaca-tc deleted the bugs/21201 branch June 9, 2025 03:54