Skip to content

Save one VALUE per embedded RTypedData #13190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jeremyevans
Copy link
Contributor

This halves the amount of memory used for embedded RTypedData if they
are one VALUE (8 bytes on 64-bit platforms) over the slot size limit.

For Set, on 64-bit it uses an embedded 56-byte struct. With the
previous implementation, the embedded structs starts at offset 32,
resulting in a total size of 88. Since that is over the 80 byte
limit, it goes to the next highest bucket, 160 bytes, wasting 72
bytes. This allows it to fit in a 80 byte bucket, which reduces
the total size for small sets of from 224 bytes (160 bytes
embedded, 64 bytes malloc, 72 bytes wasted in embedding) to 144
bytes (80 bytes embedded, 64 bytes malloc, 0 bytes wasted in
embedding).

Any other embedded RTypedData will see similar advantages if they
are currently one VALUE over the limit.

To implement this, remove the typed_flag from struct RTypedData.
Embedded the typed_flag information in the type member, which is
now a tagged pointer using VALUE type, using the bottom low 2 bits
as flags (1 bit for typed flag, the other for the embedded flag).
To get the actual pointer, RTYPEDDATA_TYPE masks out
the low 2 bits and then casts. That moves the RTypedData data
pointer from offset 32 to offset 24 (on 64-bit).

Vast amount of code in the internals (and probably external C
extensions) expects the following code to work for both RData and
non-embedded RTypedData:

DATA_PTR(obj) = some_pointer;

Allow this to work by moving the data pointer in RData between
the dmark and dfree pointers, so it is at the same offset (24
on 64-bit).

Other than these changes to the include files, the only changes
needed were to gc.c, to account for the new struct layouts,
handle setting the low bits in the type member, and to use
RTYPEDDATA_TYPE(obj) instead of RTYPEDDATA(obj)->type.

The first commit fixes a non-deterministic failure in the gc tests
that was exposed by these changes.

Copy link
Member

@byroot byroot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's clever!

Looks good too to me, but I'd wait for Peter's expertise here.

@@ -411,6 +411,8 @@ def test_latest_gc_info_weak_references_count
before_weak_references_count = GC.latest_gc_info(:weak_references_count)
before_retained_weak_references_count = GC.latest_gc_info(:retained_weak_references_count)

# Clear ary, so if ary itself is somewhere on the stack, it won't hold all references
ary.clear
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably merge that fix now, Koichi reported it just this morning.

This halves the amount of memory used for embedded RTypedData if they
are one VALUE (8 bytes on 64-bit platforms) over the slot size limit.

For Set, on 64-bit it uses an embedded 56-byte struct.  With the
previous implementation, the embedded structs starts at offset 32,
resulting in a total size of 88.  Since that is over the 80 byte
limit, it goes to the next highest bucket, 160 bytes, wasting 72
bytes.  This allows it to fit in a 80 byte bucket, which reduces
the total size for small sets of from 224 bytes (160 bytes
embedded, 64 bytes malloc, 72 bytes wasted in embedding) to 144
bytes (80 bytes embedded, 64 bytes malloc, 0 bytes wasted in
embedding).

Any other embedded RTypedData will see similar advantages if they
are currently one VALUE over the limit.

To implement this, remove the typed_flag from struct RTypedData.
Embed the typed_flag information in the type member, which is
now a tagged pointer using VALUE type, using the bottom low 2 bits
as flags (1 bit for typed flag, the other for the embedded flag).
To get the actual pointer, RTYPEDDATA_TYPE masks out
the low 2 bits and then casts.  That moves the RTypedData data
pointer from offset 32 to offset 24 (on 64-bit).

Vast amount of code in the internals (and probably external C
extensions) expects the following code to work for both RData and
non-embedded RTypedData:

```c
DATA_PTR(obj) = some_pointer;
```

Allow this to work by moving the data pointer in RData between
the dmark and dfree pointers, so it is at the same offset (24
on 64-bit).

Other than these changes to the include files, the only changes
needed were to gc.c, to account for the new struct layouts,
handle setting the low bits in the type member, and to use
RTYPEDDATA_TYPE(obj) instead of RTYPEDDATA(obj)->type.
@jeremyevans jeremyevans force-pushed the rtypeddata-reduce-embedded-memory branch from 5b9dcf9 to a9eefae Compare April 28, 2025 15:43
@jeremyevans
Copy link
Contributor Author

Rebased against master as the initial first commit was already merged.

Copy link
Member

@peterzhu2118 peterzhu2118 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should work. However, I'm a bit worried about users of TypedData_Make_Struct that wrap a non-allocated piece of memory or char * that is in the middle of a string that does not have the bottom address bit as zero.

@jeremyevans
Copy link
Contributor Author

I think this should work. However, I'm a bit worried about users of TypedData_Make_Struct that wrap a non-allocated piece of memory or char * that is in the middle of a string that does not have the bottom address bit as zero.

This doesn't tag the data pointer, it tags the type pointer. So a call such as this would not be affected:

// ruby_type_struct is a const rb_data_type_t 
TypedData_Make_Struct(klass, type, &ruby_type_struct, pointer_to_garbage);

This type of call could potentially be affected:

TypedData_Make_Struct(klass, type, pointer_to_garbage, valid_pointer);

However, I believe such a broken call would result in the same crash, because typed_data_alloc still dereferences the pointer before tagging it: https://github.com/ruby/ruby/pull/13190/files#diff-d1cee85c3b0e24a64519c11150abe26fd0b5d8628a23d356dd0b535ac4595d49R1083

Does that alleviate your concern, @peterzhu2118 ?

@nobu Do you see any issues with this, or is it OK to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants