Hone logo
Hone
Problems

Implementing a Simple Mark-and-Sweep Garbage Collector in Rust

Rust is known for its memory safety guarantees without a traditional garbage collector (GC). However, understanding GC principles is valuable for appreciating Rust's ownership system and for scenarios where manual memory management or custom GC might be necessary. This challenge will guide you in implementing a basic mark-and-sweep garbage collector for a simplified heap.

Problem Description

Your task is to implement a rudimentary mark-and-sweep garbage collector in Rust. You will need to manage a small, custom heap and simulate the process of identifying and reclaiming unreachable objects. This will involve defining a data structure to represent objects on the heap, a mechanism to track references between objects, and the core mark-and-sweep algorithm. This exercise helps in understanding memory management concepts beyond Rust's default.

Problem Description Details

  1. Heap Structure:

    • You will create a Heap struct that manages a collection of objects.
    • Each object on the heap should have a unique identifier and a way to store references to other objects on the heap. For simplicity, assume objects store references as indices into the heap.
    • The heap will be represented as a Vec or similar contiguous storage for your objects.
  2. Object Representation:

    • Define an Object struct that holds data (e.g., an integer value) and a Vec<usize> to store the indices of other objects it references.
  3. Garbage Collection Algorithm:

    • Implement the mark-and-sweep algorithm.
    • Mark Phase:
      • You'll need a roots set, which are initially considered reachable. These could be global variables or explicitly designated "live" objects.
      • The GC should traverse from the roots, marking all reachable objects. A marked flag on each Object can be used.
      • This traversal should be recursive or iterative, following the references within objects.
    • Sweep Phase:
      • After marking, iterate through the entire heap.
      • Any object that is not marked is considered garbage and should be reclaimed.
      • Reclaiming an object might involve removing it from the heap and adjusting references to it (though for this simplified version, we can just mark it as "free" or logically remove it, and a more robust implementation would compact).
  4. Key Requirements:

    • A Heap struct with methods to:
      • alloc(value: i32, references: Vec<usize>) -> usize: Allocate a new object on the heap and return its index.
      • collect_garbage(roots: Vec<usize>): Trigger the mark-and-sweep process.
      • get_object(index: usize) -> Option<&Object>: Retrieve an object by its index.
    • The Object struct with a marked: bool field.
    • The collect_garbage method must correctly identify and "reclaim" (logically, by marking as free/removed) unreachable objects.
  5. Expected Behavior:

    • Objects initially pointed to by roots, and any objects reachable from them through chains of references, should remain on the heap after garbage collection.
    • Objects that are not reachable from any root should be reclaimed.
  6. Edge Cases:

    • Circular references: The mark-and-sweep algorithm should handle circular references correctly (i.e., not get stuck in an infinite loop and correctly identify objects in cycles as reachable if at least one object in the cycle is reachable from a root).
    • Empty heap.
    • No roots provided.
    • Objects referencing themselves.

Examples

Example 1:

Input:
heap = Heap::new()
root1 = heap.alloc(10, vec![1, 2]) // Object 0 references Object 1 and 2
obj1_idx = 1
obj2_idx = 2
obj3_idx = heap.alloc(20, vec![]) // Object 3 is unreferenced initially
heap.collect_garbage(vec![0]) // Root is Object 0

Output:
Heap contains Object 0 (value: 10, marked: true)
Heap contains Object 1 (value: 20, marked: true)
Heap contains Object 2 (value: 30, marked: true)
Heap does NOT contain Object 3 (value: 20, marked: false) - It would be collected.

Explanation:
Object 0 is a root. It references objects 1 and 2. So, 0, 1, and 2 are marked as reachable.
Object 3 was never referenced by any reachable object, so it's considered garbage and would be collected.

Example 2:

Input:
heap = Heap::new()
obj0_idx = heap.alloc(10, vec![1]) // Object 0 references Object 1
obj1_idx = 1
obj2_idx = heap.alloc(20, vec![0]) // Object 2 references Object 0 (circular)
heap.collect_garbage(vec![2]) // Root is Object 2

Output:
Heap contains Object 0 (value: 10, marked: true)
Heap contains Object 1 (value: 20, marked: true)
Heap contains Object 2 (value: 30, marked: true)

Explanation:
Object 2 is the root. It references Object 0.
Object 0 references Object 1.
Object 1 is not explicitly referenced by Object 0, but it's reachable through the chain.
Object 1 references nothing.
The cycle between Object 0 and Object 2 is correctly handled; all are marked as reachable.

Example 3: (Unreachable object despite being allocated)

Input:
heap = Heap::new()
obj0_idx = heap.alloc(10, vec![]) // Object 0 is allocated but has no outgoing references.
heap.collect_garbage(vec![]) // No roots provided.

Output:
Heap does NOT contain Object 0.

Explanation:
When no roots are provided, and there are no initially marked objects, the GC will sweep the entire heap, collecting all objects, including Object 0, as they are all considered unreachable.

Constraints

  • The heap will contain a maximum of 1000 objects.
  • Object values are i32.
  • Object references are usize indices into the heap.
  • The collect_garbage method will be called after allocations.
  • Performance is secondary to correctness for this challenge. You do not need to implement compaction, just logical reclamation.

Notes

  • You can simulate "reclaiming" an object by setting a flag (e.g., is_free: bool) on the Object or by using Option<Object> in your heap vector and setting slots to None. For simplicity, you might also just ignore unmarked objects during traversal after the sweep phase.
  • Consider how you will manage the marked flag. It should be reset before each garbage collection cycle.
  • Think about the order of operations: marking all reachable objects first, then sweeping.
  • For the Heap structure, consider using a Vec<Option<Object>> to easily represent freed slots.
  • A good approach for traversal during the mark phase is to use a stack for iterative depth-first search or recursion.
Loading editor...
rust