
[RFC] GC: Both Stop-the-World and On-the-fly #4158

Open
@discord9

Description

Draft PR: #4180

Summary

I propose we implement a cycle collector for RustPython's reference-counting garbage collection system, based on Bacon et al.'s work in these two papers:

  • Concurrent Cycle Collection in Reference Counted Systems: this paper presents a synchronous (stop-the-world) cycle collector and a concurrent version of it. I suggest implementing only the synchronous version, since the concurrent version is harder to implement and its performance still falls short of the approach in the second paper.
  • An Efficient On-the-Fly Cycle Collection: this paper builds on the synchronous cycle collector of the previous paper, adding a "sliding view" algorithm, which works somewhat like copy-on-write for objects: during tracing the GC sees a fixed version of the object reference graph, while mutators can still freely mutate objects however they want.

Detailed Explanation

The synchronous cycle collector is easier to implement, so I suggest we first implement a synchronous version of the cycle collector as a proof of concept, then modify it into an on-the-fly version.

Synchronous Cycle Collection

The algorithm in "Concurrent Cycle Collection in Reference Counted Systems" takes $O(N + E)$ worst-case time per collection, where $N$ and $E$ are the numbers of nodes and edges in the traced subgraph.

For an object T, its header needs fields denoted buffered(T), color(T), and RC(T), where:

  • buffered: a buffer (Roots) holds potential roots of cyclic garbage; buffered(T) denotes whether the object is currently in that buffer.
  • color is explained in this table (Red and Orange are only used in the concurrent version):
Color    Meaning
Black    In use or free
Gray     Possible member of cycle
White    Member of garbage cycle
Purple   Possible root of cycle
Green    Acyclic
Red      Candidate cycle undergoing $\Sigma$-computation
Orange   Candidate cycle awaiting epoch boundary
  • RC is just a reference count (which doesn't even need to be atomic in the synchronous version).
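
As a rough sketch (not the actual header layout, which is discussed in the Data Structures section below), these per-object fields could look like the following; how to pack them compactly is left open:

#[derive(Clone, Copy, PartialEq, Eq)]
enum Color {
    Black,  // in use or free
    Gray,   // possible member of a cycle
    White,  // member of a garbage cycle
    Purple, // possible root of a cycle
    Green,  // acyclic
    // Red and Orange are only needed by the concurrent collector.
}

struct GcHeader {
    rc: u32,        // plain reference count, non-atomic in the sync version
    color: Color,
    buffered: bool, // already placed in the Roots buffer this collection?
}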

Pseudocode and Explanation

In this section I will just quote the paper, because it explains things much better than I could. The basic idea is to first scan for possible roots of cycles, then check which of them belong to actual garbage cycles:

Our synchronous algorithm is similar to Lins’ algorithm: when reference counts are decremented, we place potential roots of cyclic garbage into a buffer called Roots. Periodically, we process this buffer and look for cycles by subtracting internal reference counts.

There are two major changes that make the algorithm linear time: first of all, we add a buffered flag to every object, which is used to prevent the same object being added to the root buffer more than once per cycle collection. This in turn places a linear bound on the size of the buffer.

Secondly, we analyze the entire transitive closure of Roots as a single graph, rather than as a set of graphs. This means that the complexity of the algorithm is limited by the size of that transitive closure, which in turn is limited by $N + E$ (since Roots is bounded by $N$ by the use of the buffered flag). Of course, in practice we hope that the transitive closure will be significantly smaller.

In practice we found that the first change (the use of the buffered flag) made almost no difference in the running time of the algorithm; however, the second change (analyzing the entire graph at once) made an enormous difference in run-time. When we applied Lins’ algorithm unmodified to large programs, garbage collection delays extended into minutes.
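
To make the phases concrete, here is a minimal sketch of the synchronous collection (the paper's MarkRoots / ScanRoots / CollectRoots phases), written against a toy arena of nodes indexed by usize and reusing the GcHeader and Color sketch above. Real object pointers, interior mutability, and the Green (acyclic) shortcut are left out; this is only to show the shape of the algorithm, not the final implementation:

struct Node {
    header: GcHeader,
    children: Vec<usize>, // outgoing references, by index
    freed: bool,
}

struct Heap {
    nodes: Vec<Node>,
    roots: Vec<usize>, // the Roots buffer of possible cycle roots
}

impl Heap {
    fn collect_cycles(&mut self) {
        self.mark_roots();    // paint candidate subgraphs gray, subtract internal counts
        self.scan_roots();    // re-blacken externally reachable nodes, whiten garbage
        self.collect_roots(); // free everything that stayed white
    }

    fn mark_roots(&mut self) {
        for s in std::mem::take(&mut self.roots) {
            if self.nodes[s].header.color == Color::Purple && self.nodes[s].header.rc > 0 {
                self.mark_gray(s);
                self.roots.push(s);
            } else {
                self.nodes[s].header.buffered = false;
                if self.nodes[s].header.color == Color::Black && self.nodes[s].header.rc == 0 {
                    self.free(s);
                }
            }
        }
    }

    fn mark_gray(&mut self, s: usize) {
        if self.nodes[s].header.color != Color::Gray {
            self.nodes[s].header.color = Color::Gray;
            for t in self.nodes[s].children.clone() {
                self.nodes[t].header.rc -= 1; // subtract the internal edge
                self.mark_gray(t);
            }
        }
    }

    fn scan_roots(&mut self) {
        for s in self.roots.clone() { self.scan(s); }
    }

    fn scan(&mut self, s: usize) {
        if self.nodes[s].header.color == Color::Gray {
            if self.nodes[s].header.rc > 0 {
                self.scan_black(s); // externally reachable: restore counts
            } else {
                self.nodes[s].header.color = Color::White; // member of a garbage cycle
                for t in self.nodes[s].children.clone() { self.scan(t); }
            }
        }
    }

    fn scan_black(&mut self, s: usize) {
        self.nodes[s].header.color = Color::Black;
        for t in self.nodes[s].children.clone() {
            self.nodes[t].header.rc += 1; // restore the internal edge
            if self.nodes[t].header.color != Color::Black { self.scan_black(t); }
        }
    }

    fn collect_roots(&mut self) {
        for s in std::mem::take(&mut self.roots) {
            self.nodes[s].header.buffered = false;
            self.collect_white(s);
        }
    }

    fn collect_white(&mut self, s: usize) {
        if self.nodes[s].header.color == Color::White && !self.nodes[s].header.buffered {
            self.nodes[s].header.color = Color::Black;
            for t in self.nodes[s].children.clone() { self.collect_white(t); }
            self.free(s);
        }
    }

    fn free(&mut self, s: usize) {
        self.nodes[s].freed = true; // stand-in for actual deallocation
    }
}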

The paper also presents a concurrent cycle collector, but that is not the focus here, because the on-the-fly version is better in both performance and ease of implementation.

Efficient On-the-Fly Cycle Collection

The basic idea is to use a "sliding view" method (which is a bit like copy-on-write): a "LogPointer" in each object's header records the object's original reference fields, creating a virtual snapshot of the heap that the cycle collector can trace while mutators keep running.

A crucial problem with repeated scanning arises when concurrent program threads modify the objects graph during the scan. This means that the collector cannot trust a scan to repeat the very same structure that a previous scan has traversed. Furthermore, as modifications occur concurrently with the scan, each specific scan cannot be guaranteed to view a consistent snapshot of the objects graph at any specific point in time. Such problems are the source of the two drawbacks of Bacon and Rajan’s on-the-fly cycle collector: a practical drawback and a theoretical one.
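
As a toy illustration of the sliding-view idea (independent of the actual LogPointer design below, and using made-up names), a mutator-side write barrier can log an object's original outgoing references on the first mutation within a collection cycle, and the collector then traces the logged values instead of the live, changing ones:

struct SnapshotNode {
    children: Vec<usize>,    // live outgoing references, by index
    log: Option<Vec<usize>>, // children as they were when the sliding view began
}

impl SnapshotNode {
    // Mutator-side write barrier: before the first pointer update in a
    // collection cycle, remember the original outgoing references.
    fn set_child(&mut self, slot: usize, new_child: usize) {
        if self.log.is_none() {
            self.log = Some(self.children.clone());
        }
        self.children[slot] = new_child;
    }

    // Collector side: trace the snapshot if one was taken, otherwise the
    // current children (which then have not changed since the snapshot).
    fn snapshot_children(&self) -> &[usize] {
        self.log.as_deref().unwrap_or(&self.children)
    }
}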

Data Structures

We can reuse the object-layout work in #2908 and add two new Cargo features, "gc" and "on_the_fly_cc", where "on_the_fly_cc" depends on "gc":

#![cfg(feature = "gc")]

struct PyGcObject {
    header: PyGcHeader,
    obj: PyObject, // or directly use PyInner<Erased>
}

struct PyGcHeader {
    ref_cnt: RefCount,
    color: Color,
    #[cfg(feature = "on_the_fly_cc")]
    log_ptr: Option<LogPointer>, // dirty bit and snapshot pointer for the on-the-fly collector
}

We also add a GcTrace trait that should be implemented on every type requiring GC.
Edit: in #4180 a field is added to PyInner instead, which is clearer and easier to implement.

pub trait GcTrace {
    /// Call `tracer_fn` for every GC object owned by this object.
    /// # API Contract
    /// Implementations must make sure `tracer_fn` is called on every owned object,
    /// or garbage collection won't work correctly.
    fn trace(&self, tracer_fn: &mut TracerFn);
}

/// A `TracerFn` is a callback function that is invoked for each `PyGcObjectRef` owned
/// by an instance of a traced type.
pub type TracerFn<'a> = dyn FnMut(&mut PyGcObjectRef) + 'a;
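
For illustration, an implementation on a hypothetical container payload could look like the sketch below. MyListPayload and its Mutex-guarded element vector are made up for the example; only GcTrace, TracerFn, and PyGcObjectRef come from the RFC:

use std::sync::Mutex;

// Hypothetical container payload, purely for illustration.
struct MyListPayload {
    elements: Mutex<Vec<PyGcObjectRef>>,
}

impl GcTrace for MyListPayload {
    fn trace(&self, tracer_fn: &mut TracerFn) {
        // Per the API contract, every owned PyGcObjectRef must be handed to
        // tracer_fn, otherwise the collector computes wrong internal counts.
        for element in self.elements.lock().unwrap().iter_mut() {
            tracer_fn(element);
        }
    }
}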

Drawbacks, Rationale, and Alternatives

Alternative: implement the CPython-style GC proposed in #2908.
