Skip to content

Ractor: Fix moving embedded objects #13008

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 31, 2025
Merged

Conversation

casperisfine
Copy link
Contributor

@casperisfine casperisfine commented Mar 28, 2025

[Bug #20271]
[Bug #20267]
[Bug #20255]

rb_obj_alloc(RBASIC_CLASS(obj)) will always allocate from the basic 40B pool, so if obj is larger than 40B, we'll create a corrupted object when we later copy the shape_id.

Instead we can use the same logic than ractor copy, which is to use rb_obj_clone, and later ask the GC to free the original object.

We then must turn it into a T_OBJECT, because otherwise just changing its class to RactorMoved leaves a lot of ways to keep using the object, e.g.:

a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but wasn't.

If it turns out that rb_obj_clone isn't performant enough for some uses, we can always have carefully crafted specialized paths for the types that would benefit from it.

This comment has been minimized.

@casperisfine casperisfine force-pushed the move-embeded-3 branch 5 times, most recently from 3b86dfb to cb3fc8f Compare March 29, 2025 14:18
gc.c Outdated
Comment on lines 2668 to 2670
if (UNLIKELY(FL_TEST_RAW(src, FL_SEEN_OBJ_ID))) {
rb_gc_impl_object_id_move(objspace, dest, src);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the big argument for not transferring the object_id is that we'd save on locking the VM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ID should be inherited (moved object should be different object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it because to me it seemed that semantically, move: true means we're just changing which ractor owns the object, and not allocating a new object.

But if you think it shouldn't be preserved, it's fine by me, I don't think it's very importan.

Removed.

@byroot byroot requested a review from peterzhu2118 March 30, 2025 10:27
Copy link
Member

@peterzhu2118 peterzhu2118 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

byroot added 2 commits March 31, 2025 11:29
[Bug #20271]
[Bug #20267]
[Bug #20255]

`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.

Instead we can use the same logic than ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.

We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:

```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but wasn't.
```

If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
That seemed like the logical thing to do to me, but ko1 disagree.
@byroot byroot enabled auto-merge (rebase) March 31, 2025 09:43
@byroot byroot merged commit 7db0e07 into ruby:master Mar 31, 2025
75 checks passed
eileencodes added a commit to eileencodes/ruby that referenced this pull request Apr 2, 2025
In ruby#13008 `RVALUE` was removed without replacement. This means
the lldb scripts that relied on `RVALUE` stopped working.

I updated the ones that were using it just for the bytesize to use
`slot_size` and then round to the nearest power of 40. We can't use
`slot_size` directly because in debug mode it's `48` but `RVALUE` is
`40` bytes.

For the `as_type` method, I updated it to check the type. It's only used
for `bignum` and `array` so that's a simple change.

Lastly, for the `dump_page` method I replaced it with `struct free_slot`
since that's looking at the freelist.

`struct RVALUE` has been removed from all the scripts and I verified
that `rp` is fixed. I'm not confident the `dump_page` method is fixed,
the freelist looks off, but for now this gets us closer.
eileencodes added a commit to eileencodes/ruby that referenced this pull request Apr 8, 2025
In ruby#13008 `RVALUE` was removed without replacement. This means
the lldb scripts that relied on `RVALUE` stopped working.

I updated the ones that were using it just for the bytesize to use
`slot_size` and then round to the nearest power of 40. We can't use
`slot_size` directly because in debug mode it's `48` but `RVALUE` is
`40` bytes.

For the `as_type` method, I updated it to check the type. It's only used
for `bignum` and `array` so that's a simple change.

Lastly, for the `dump_page` method I replaced it with `struct free_slot`
since that's looking at the freelist.

`struct RVALUE` has been removed from all the scripts and I verified
that `rp` is fixed. I'm not confident the `dump_page` method is fixed,
the freelist looks off, but for now this gets us closer.
eileencodes added a commit that referenced this pull request Apr 8, 2025
In #13008 `RVALUE` was removed without replacement. This means
the lldb scripts that relied on `RVALUE` stopped working.

I updated the ones that were using it just for the bytesize to use
`slot_size` and then round to the nearest power of 40. We can't use
`slot_size` directly because in debug mode it's `48` but `RVALUE` is
`40` bytes.

For the `as_type` method, I updated it to check the type. It's only used
for `bignum` and `array` so that's a simple change.

Lastly, for the `dump_page` method I replaced it with `struct free_slot`
since that's looking at the freelist.

`struct RVALUE` has been removed from all the scripts and I verified
that `rp` is fixed. I'm not confident the `dump_page` method is fixed,
the freelist looks off, but for now this gets us closer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants