-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Save one VALUE per embedded RTypedData #13190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Save one VALUE per embedded RTypedData #13190
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's clever!
Looks good too to me, but I'd wait for Peter's expertise here.
test/ruby/test_gc.rb
Outdated
@@ -411,6 +411,8 @@ def test_latest_gc_info_weak_references_count | |||
before_weak_references_count = GC.latest_gc_info(:weak_references_count) | |||
before_retained_weak_references_count = GC.latest_gc_info(:retained_weak_references_count) | |||
|
|||
# Clear ary, so if ary itself is somewhere on the stack, it won't hold all references | |||
ary.clear |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably merge that fix now, Koichi reported it just this morning.
This halves the amount of memory used for embedded RTypedData if they are one VALUE (8 bytes on 64-bit platforms) over the slot size limit. For Set, on 64-bit it uses an embedded 56-byte struct. With the previous implementation, the embedded structs starts at offset 32, resulting in a total size of 88. Since that is over the 80 byte limit, it goes to the next highest bucket, 160 bytes, wasting 72 bytes. This allows it to fit in a 80 byte bucket, which reduces the total size for small sets of from 224 bytes (160 bytes embedded, 64 bytes malloc, 72 bytes wasted in embedding) to 144 bytes (80 bytes embedded, 64 bytes malloc, 0 bytes wasted in embedding). Any other embedded RTypedData will see similar advantages if they are currently one VALUE over the limit. To implement this, remove the typed_flag from struct RTypedData. Embed the typed_flag information in the type member, which is now a tagged pointer using VALUE type, using the bottom low 2 bits as flags (1 bit for typed flag, the other for the embedded flag). To get the actual pointer, RTYPEDDATA_TYPE masks out the low 2 bits and then casts. That moves the RTypedData data pointer from offset 32 to offset 24 (on 64-bit). Vast amount of code in the internals (and probably external C extensions) expects the following code to work for both RData and non-embedded RTypedData: ```c DATA_PTR(obj) = some_pointer; ``` Allow this to work by moving the data pointer in RData between the dmark and dfree pointers, so it is at the same offset (24 on 64-bit). Other than these changes to the include files, the only changes needed were to gc.c, to account for the new struct layouts, handle setting the low bits in the type member, and to use RTYPEDDATA_TYPE(obj) instead of RTYPEDDATA(obj)->type.
5b9dcf9
to
a9eefae
Compare
Rebased against master as the initial first commit was already merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work. However, I'm a bit worried about users of TypedData_Make_Struct
that wrap a non-allocated piece of memory or char *
that is in the middle of a string that does not have the bottom address bit as zero.
This doesn't tag the data pointer, it tags the type pointer. So a call such as this would not be affected: // ruby_type_struct is a const rb_data_type_t
TypedData_Make_Struct(klass, type, &ruby_type_struct, pointer_to_garbage); This type of call could potentially be affected: TypedData_Make_Struct(klass, type, pointer_to_garbage, valid_pointer); However, I believe such a broken call would result in the same crash, because Does that alleviate your concern, @peterzhu2118 ? @nobu Do you see any issues with this, or is it OK to merge? |
This halves the amount of memory used for embedded RTypedData if they
are one VALUE (8 bytes on 64-bit platforms) over the slot size limit.
For Set, on 64-bit it uses an embedded 56-byte struct. With the
previous implementation, the embedded structs starts at offset 32,
resulting in a total size of 88. Since that is over the 80 byte
limit, it goes to the next highest bucket, 160 bytes, wasting 72
bytes. This allows it to fit in a 80 byte bucket, which reduces
the total size for small sets of from 224 bytes (160 bytes
embedded, 64 bytes malloc, 72 bytes wasted in embedding) to 144
bytes (80 bytes embedded, 64 bytes malloc, 0 bytes wasted in
embedding).
Any other embedded RTypedData will see similar advantages if they
are currently one VALUE over the limit.
To implement this, remove the typed_flag from struct RTypedData.
Embedded the typed_flag information in the type member, which is
now a tagged pointer using VALUE type, using the bottom low 2 bits
as flags (1 bit for typed flag, the other for the embedded flag).
To get the actual pointer, RTYPEDDATA_TYPE masks out
the low 2 bits and then casts. That moves the RTypedData data
pointer from offset 32 to offset 24 (on 64-bit).
Vast amount of code in the internals (and probably external C
extensions) expects the following code to work for both RData and
non-embedded RTypedData:
Allow this to work by moving the data pointer in RData between
the dmark and dfree pointers, so it is at the same offset (24
on 64-bit).
Other than these changes to the include files, the only changes
needed were to gc.c, to account for the new struct layouts,
handle setting the low bits in the type member, and to use
RTYPEDDATA_TYPE(obj) instead of RTYPEDDATA(obj)->type.
The first commit fixes a non-deterministic failure in the gc tests
that was exposed by these changes.