Skip to content

Add red-black tree implementation backed by the C version. #400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 30, 2021

Conversation

wedsonaf
Copy link

Signed-off-by: Wedson Almeida Filho wedsonaf@google.com

@ksquirrel

This comment has been minimized.

Copy link
Member

@ojeda ojeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very nice.

A couple nits that can be skipped if needed.

@ksquirrel

This comment has been minimized.

@wedsonaf
Copy link
Author

v1 -> v2

  • Updated samples to assert the contents of the tree.
  • Updated references to None in documentation.
  • Fixed typo in postorder comment.
  • Fixed node pointer invalidation.

unsafe { rust_helper_rb_link_node(&mut node.links, parent, new_link) };

// SAFETY: All pointers are valid. `node` has just been inserted into the tree.
unsafe { bindings::rb_insert_color(&mut node.links, &mut self.root) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutable reference to node.links is invalidated after this function call, but the function stores it in self.root, right? I think you should use core::ptr::addr_of_mut!(node.links) instead. Note that the let node = unsafe { &mut *Box::into_raw(node) }; is also invalidated once this function returns. So maybe let node = Box::into_raw(node); and then addr_of_mut!((*node).links)? The &mut self.root is fine if there are no pointers to self.root only pointers stored in self.root.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I need to update my mental model here.

What is invalidating node.links? How would using addr_of_mut!(node.links) avoid this invalidation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking a bit more about this, this &mut node.links doesn't get invalidated, but the previous &mut node.links does due to this &mut node.links. addr_of_mut!(node.links) avoids this problem by directly creating a raw pointer without an intermediate mutable reference. It is valid to have multiple raw pointers to the same value, but creating one mutable reference invalidates all other references and pointers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What model are you using to determine what gets invalidated?

I quickly read the model in the stacked borrows paper and my understanding is that the code is ok now. The reason is that the borrow stack for each byte that makes up node contains: [..., SharedRW, Unique(n), ...], where SharedRW refers to the raw pointer returned by into_raw() and Unique(n) refers to the mutable reference created on line 248. So any new raw references that I create in the function can still be used after the function returns because they will be resolved by the SharedRW entry (until I call from_raw(), which will pop that SharedRW entry) because raw pointers are not "tagged".

IOW, I don't agree with your claim that "creating one mutable reference invalidates all other pointers" -- my understanding is that it invalidates all other pointers on top of it in the borrow stack. Pointers below it continue to be valid until something else pops them off.

I am new to stacked borrows though, so my understanding may be wrong. Where does it not match yours?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly read the model in the stacked borrows paper and my understanding is that the code is ok now. The reason is that the borrow stack for each byte that makes up node contains: [..., SharedRW, Unique(n), ...], where SharedRW refers to the raw pointer returned by into_raw() and Unique(n) refers to the mutable reference created on line 248. So any new raw references that I create in the function can still be used after the function returns because they will be resolved by the SharedRW entry (until I call from_raw(), which will pop that SharedRW entry) because raw pointers are not "tagged".

You will end up with SharedRW(k), Unique(l), SharedRW(m) in this case. It is fine with the default options of miri as the raw pointer can resolve to either SharedRW, but it isn't when -Zmiri-track-raw-pointers as it will only resolve to SharedRW(m) which is popped of the borrow stack at the next &mut item.links in addition SharedRW(m) only covers links and not the whole item, so accessing the rest of the item through it is not allowed.

IOW, I don't agree with your claim that "creating one mutable reference invalidates all other pointers" -- my understanding is that it invalidates all other pointers on top of it in the borrow stack. Pointers below it continue to be valid until something else pops them off.

Yeah, I was simplifying a bit. Maybe a bit too much.

Copy link
Author

@wedsonaf wedsonaf Jun 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will end up with SharedRW(k), Unique(l), SharedRW(m) in this case. It is fine with the default options of miri as the raw pointer can resolve to either SharedRW, but it isn't when -Zmiri-track-raw-pointers as it will only resolve to SharedRW(m) which is popped of the borrow stack at the next &mut item.links in addition SharedRW(m) only covers links and not the whole item, so accessing the rest of the item through it is not allowed.

From miri's readme about -Zmiri-track-raw-pointers: This can make valid code fail to pass the checks. So I'm not so concerned about it, I'm concerned about the code correctness wrt the model. Do we agree that it is correct (wrt the model)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the reason I declared a &mut Node was to avoid having to add more SAFETY justifications in each of its uses (4 other places), but now I realise that 3 of them are &mut node.links so I can avoid repeating the justification by storing a raw pointer in a variable and reusing it.

I'll make this change and upload it shortly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new version is UB free I think.

@ksquirrel

This comment has been minimized.

@wedsonaf
Copy link
Author

wedsonaf commented Jun 30, 2021

v2 -> v3

  • Added keys, values, and values_mut per @nbdd0121's suggestion.
  • Updated sample to use values_mut.
  • Updated samples to assert the end of iteration (to possibly catch bugs when an extra node is left over).

Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
@ksquirrel
Copy link
Member

Review of 3444d9b72b1b:

  • ⚠️ This PR adds/deletes more than 500 lines.
  • ✔️ Commit 3444d9b: Looks fine!

@wedsonaf
Copy link
Author

v3 -> v4

  • Rebased
  • Derive raw pointers from the one returned by Box::into_raw

@ojeda ojeda merged commit 2b9530d into Rust-for-Linux:rust Jun 30, 2021
@wedsonaf wedsonaf deleted the rbtree branch June 30, 2021 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants