Implement object shapes for T_CLASS and T_MODULE #6637
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Object Shapes were introduced in #6386 for T_OBJECT and "extended ivars" with the plan to later implement it for Classes and Modules, which previously were using an
st_table
. This is that change 😁This brings a number of advantages over the previous approach, where each class/module had its own st_table: First it requires less storage, as we now store the values of the ivars in a flat array. This works particularly well with shapes because the shape can be reused between classes with the same ivar keys, which is very common (many classes only use the hidden
__classpath__
and__autoload__
keys). Next, this unifies behaviour between all types of object which can hold ivars as everything now uses shapes (though there are small differences). Finally, this change allows using inline caches to fetch class instance variables.One challenge while implementing this is that previously the same
st_table *iv_tbl
was shared between T_MODULE and the T_ICLASS representing that modules inheritance. This was convenient for class-variables (N.B. not class-instance-variables), but was incompatible with having IVs in an array (it previously relied on two levels of indirection because we mayrealloc
storage when introducing new IVs). The first few commits of this PR remove this technique, and instead in the cases we needed the behaviour we look though the original module. (T_ICLASS never supported proper instance variables, only explicit access via the st_table).RActor notes
One quirk of IVar access for RActors is that when on a non-main RActor we must guard that the result we receive is shareable. For this reason (and due to locking requirements described below) we don't attempt to use inline caches on other RActors, and simply fall back to the generic safe/locking implementation (which should already have similar or better performance to today).
Previously thread safety was provided using
lock_st_lookup
,lock_st_is_member
,lock_st_insert
,lock_st_delete
, which hold the VM lock when performing their operation. Similarly most operations now useRB_VM_LOCK_ENTER
/RB_VM_LOCK_LEAVE
around their access of the iv storage. With two exceptions:rb_ivar_defined
does not need to perform any locking as all information is encoded in the immutable shapeIt feels like in the future even more of these accesses could be made in a more lock-free way. Since shapes themselves are immutable, and we never shrink the IV storage in most cases it should be safe to access storage without locking (other than maybe removed ivars, which could be worked around), and likely could also be safe to set variables when it didn't require a shape transition (which would require some form of locking, but possibly much lighter or more granular), so long as
free
(specifically the free fromrealloc
) was deferred until GC or some other safepoint. However none of this was explored or attempted because RActor's current use and the relative rarity of class instance variable usage didn't justify the complexity or risk of this approach.Because of the locking requirement on class instance variable sets, this doesn't attempt use inline caches in that case (the inline cache is only used for read). If this ends up being common we could revisit this.
Benchmark
As expected we see a significant ~2x speedup from using inline caching.
cc @jemmaissroff @tenderlove