Skip to content

[RFC] Primary Design of the GC #2908

Closed
Closed
@frank-king

Description

@frank-king

Summary

This RFC is going to setup a minimal facilities for GC supports.

Here is the goal of this RFC:

  • define the data structures and main algorithms of GC,
  • give an implementation of GC for "header-only" objects (with dummy payload), which is easy to verify and able to be assembled into the VM.

Detailed Explanation

Background

As the design doc of CPython's GC (https://devguide.python.org/garbage_collector/), it is implemented by two pointers
https://github.com/python/cpython/blob/main/Include/internal/pycore_gc.h#L11-L20

/* GC information is stored BEFORE the object structure. */
typedef struct {
    // Pointer to next object in the list.
    // 0 means the object is not tracked
    uintptr_t _gc_next;

    // Pointer to previous object in the list.
    // Lowest two bits are used for flags documented later.
    uintptr_t _gc_prev;
} PyGC_Head;

The two pointers are not only dedicated pointers, but also coupled with flags in its lowest one or two bytes (in a 32bit machine, pointer addresses are typically aligned to 4 bytes, where the lowest two bits is 00).

  • The _gc_prev field is normally used as the “previous” pointer to maintain the doubly linked list but its lowest two bits are used to keep the flags PREV_MASK_COLLECTING and _PyGC_PREV_MASK_FINALIZED. Between collections, the only flag that can be present is _PyGC_PREV_MASK_FINALIZED that indicates if an object has been already finalized. During collections _gc_prev is temporarily used for storing a copy of the reference count (gc_refs), in addition to two flags, and the GC linked list becomes a singly linked list until _gc_prev is restored.
  • The _gc_next field is used as the “next” pointer to maintain the doubly linked list but during collection its lowest bit is used to keep the NEXT_MASK_UNREACHABLE flag that indicates if an object is tentatively unreachable during the cycle detection algorithm. This is a drawback to using only doubly linked lists to implement partitions: while most needed operations are constant-time, there is no efficient way to determine which partition an object is currently in. Instead, when that’s needed, ad hoc tricks (like the NEXT_MASK_UNREACHABLE flag) are employed.

Data Structures

The GC headers contains two pointers, so the first scaffolding might be

struct PyGcObject<T> {
    header: PyGcHeader,
    obj: PyObject<T>,
}
struct PyGcHeader<T> {
    next: Option<NonNull<PyGcObject<T>>>,
    prev: Option<NonNull<PyGcObject<T>>>,
}

Let's talk about next and prev one by one.

The next pointer

For the _gc_next pointer, it is mainly used as a single linked-list (for generations lists and unreachable sets when doing collecting).

However, the next pointer contains a NEXT_MASK_UNREACHABLE mark on its lowest bit. So we might need a type like TaggedPtr:

struct TaggedPtr<T, TAG: Tag> {
    ptr: usize,
    _marker: PhantomData<*const T>,
    _tag: PhantomData<TAG>,
}

trait Tagged<T, TAG: Tag>: Sized {
    fn new(ptr: Option<NonNull<T>>, tags: TAG) -> Self;
    fn get_ptr(&self) -> Option<NonNull<T>>; 
    fn set_ptr(&mut self, ptr: NonNull<T>); 
    fn unset_ptr(&mut self);                
    fn set_tag(&mut self, tag: TAG);        // self.ptr |= tag;
    fn unset_tag(&mut self, tag: TAG);      // self.ptr &= !tag;
    fn clear_tags(&mut self);               // self.ptr &= PTR_MASK;
    fn test_tag(&self, tag: TAG) -> bool;   // (self.ptr & tag) as bool
    fn get_tags(&self) -> TAG;              // self.ptr & TAG_MASK
    fn get(&self) -> (Option<NonNull<T>>, TAG) {
        (self.get_ptr(), self.get_tags())
    }
}

trait Tag: From<usize> + Into<usize> + BitOr + PartialEq + Eq {
    fn none() -> Self {
        0.into()
    }
}

impl<T, TAG: Tag> Tagged<T, TAG> for TaggedPtr<T, TAG> {/* ... */}

struct TagNext {
    const NEXT_MASK_UNREACHABLE = 0b01;
}

The prev pointer

The _gc_prev pointer is more complicated. Like the next pointer, it is a pointer with 2 kinds of tags. The tag class can be defined as:

struct TagPrev {
   const PREV_MASK_COLLECTING = 0b01;
   const PREV_MASK_FINALIZED = 0b10;
}

However, the prev pointer is also used as a GC reference counter during some stages of the GC algorithm. We can define a helper trait and impl for the prev pointer:

trait Counter {
    fn get(&self) -> usize;
    fn set(&mut self, cnt: usize);
    fn inc(&mut self);
    fn dec(&mut self); 
}
impl Counter for ...

The only problem is that when the pointer is used as a counter, we must forbid it being used as a real pointer. Otherwise it might crash if we dereference the ill-formed pointer (which is exactly a pointer). It is one of the unresolved questions and will be talked later.

I have setup a simple playground for the pointers' data structures.

Algorithms

Having the data structures of the pointers, we can decribe the algorithms.

Refer to CPython's implementation, the main GC procedure can be described as

/// find all objects that is unreachable from outside the `head`, and move than into the `unreachable` list.
fn deduce_unreachable(PyGcObject<T> head, PyGcObject<T> unreachable) {
    // initialize the GC reference counts of all the object in list `head` as their object reference counts.
    init_gc_refs(head);
    // for each object in list `head`, visit each of its referent (not recursively) and decrement its GC reference count. 
    substract_gc_refs(head);
    // move all of the objects in list `head` whose GC reference count is 0, to list `unreachable`.
    // but if encountered an object that has non-zero GC reference count, move all of its referent back to list `head`.
    move_unreachable(head, unreachable);
    //
}

/// main logics of GC collecting. finalizations and wake reference processing is omitted here.
fn collect() {
    // find all unreachable objects and move to the list `unreachable`.
    deduce_unreachable(head, unreachable);
    // drop all of the objects in list `unreachable`
    drop_all_in(unreachable);
}

Drawbacks, Rationale, and Alternatives

The CPython's GC algorithm is straight-forward and not difficult to implement, thought there are a few works to do because the global allocator and deallocator might be revised to fit the GC.

However, there are some drawbacks:

  • it is single-threaded, and stop-the-world,
  • mixing RC and GC might be inefficient. Especially for multithreading, there are lots of atomic inc/dec in RC.

Alternative GC algorithms are:

  1. The tricolor mark-and-sweep algorithm,
  2. The copy-and-compaction algorithm, (2 times memory consumption)
  3. GC algorithms in JVM, like CMS, PS, G1, ZGC, etc., (too enormous in engineering)

Maybe combining 1 and 2 in different generations is more reality because 3 requires too much in engineering and almost reproducing a yet-another JVM.

Anyway, the current GC algorithm CPython is a good way to start.

Unresolved Questions

There are still some unresolved questions.

1. Where should the GC header be put?

Currently, CPython puts the two pointers in front of the object header, and conversions between GC headers and object pointers are done by addition/subtraction of pointers. But in Rust, directly adding/subtracting pointers is not rather a good way.

One choice is to put the header in front of the object header, just like what CPython do. They are wrapped as:

struct PyGcObject<T> {
    header: PyGcHeader<T>,
    obj: PyObject<T>,
}

Next, make PyGcObject<T> implement Deref and DerefMut with taget = PyObject<T>. Then the wrapped PyGcObject<T> can be used like the PyObject<T> before. But we need to redefine the type PyObjectRef = PyRef<PyObject<Erased>> to type PyGcObjectRef = PyRef<PyGcObject<Erased>>. Also use the Deref and DerefMut trait to make it easier to use.

In alternative, the GC header can be also put inside the PyObject<T>, like this:

struct PyObject<T> {
    gc: PyGcHeader<T>,
    inner: ManuallyDrop<PyInner<T>>,
}

This needs less revision of the existing codes.

Another way is to put GC header at the beginning of the object payload. This needs additional pointer adding/subtracting when traversing on the GC list.

2. How to obtain the reference count of the object?

RustPython use Rust's Rc or Arc as its reference counting, which is not visible outside. Unless we pack the header together with a reference like

struct PyGcObject<T> {
    header: PyGcHeader<T>,
    obj: PyRef<T>,
}

we can not easily get the reference count. However, packing the header with a PyRef can bring another problem: since the obj is referred by the PyGcObject<T>, the reference count will be implicitly added one.

3. How to define the boundary of the safe and unsafe?

As the double-linked list is introduced, and the prev pointer is reused as GC reference count, it can be kind of unsafe if a counter is misused as a pointer and dereferencing might crash.

I designed a counter based on mutable borrowing like this. If a pointer is borrowed (mutably) as a counter, it cannot be read or write as pointer, until the counter is dropped.

type PyGcPrev<T> = TaggedPtr<T, TagPrev>;

impl<T> PyGcPrev<T> {
    fn get_counter<'a>(&'a mut self) -> impl Counter + 'a {
        GcCounter(&mut self.ptr)
    }
}

impl Counter for GcCounter<'_> {/* ... */}

Mutably borrowing the pointer as a counter works fine when the pointers are used in a local scope:

{
    let mut counter = tagged_ptr.get_counter();
    counter.set(5);
    assert_eq!(counter.get(), 5);
    counter.inc();
    // assert_ne!(tagged_ptr.get_ptr(), ptr1); // should not compile
    assert_eq!(counter.get(), 6);
    counter.dec();
    assert_eq!(counter.get(), 5);
}

But it seemed hard to work if there are many pointers that will be used as a counter in the same time. It is impossible to borrow each of them as a separate counter.


So much for this RFC. I will append something if more come into my mind. Any comments or suggestion is welcome!

Metadata

Metadata

Assignees

No one assigned

    Labels

    RFCRequest for comments

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions