Hone logo
Hone
Problems

Building a Mark-and-Sweep Garbage Collector in Go

Understanding memory management is crucial for writing efficient and robust Go programs. While Go has an excellent built-in garbage collector, implementing one yourself provides invaluable insight into how memory is automatically reclaimed. This challenge asks you to build a simplified mark-and-sweep garbage collector for a custom memory model.

Problem Description

Your task is to implement a basic mark-and-sweep garbage collector for a simulated heap in Go. You will manage a fixed-size array representing the heap and a set of "roots" from which reachable objects can be traced. The garbage collector should identify and reclaim objects that are no longer reachable from these roots.

Key Requirements:

  • Simulated Heap: Represent the heap as a []byte slice of a fixed size.
  • Object Representation: Define a simple structure to represent objects on the heap. Each object should have a unique identifier, a size, and pointers to other objects (simulated by their IDs).
  • Root Set: Maintain a list of object IDs that are considered roots (e.g., global variables, stack variables).
  • Allocation: Implement a function to allocate memory for new objects on the heap. This function should handle fragmentation and return an error if there isn't enough contiguous space.
  • Mark Phase: Implement the "mark" phase of the mark-and-sweep algorithm. This involves traversing the object graph starting from the roots and marking all reachable objects.
  • Sweep Phase: Implement the "sweep" phase. This involves iterating through the heap, freeing marked (unreachable) objects and coalescing free space.
  • Object Tracking: Keep track of allocated objects, their locations on the heap, and their marked status.

Expected Behavior:

  1. Allocate several objects.
  2. Some objects should point to others, forming a directed graph.
  3. Trigger the garbage collection process.
  4. Verify that only reachable objects remain allocated on the heap.
  5. Free space should be consolidated.

Edge Cases:

  • Allocating an object that is larger than the remaining free space.
  • Circular references between objects.
  • Allocating objects after garbage collection.
  • The heap being completely full.

Examples

Example 1:

Input:

  • Heap Size: 100 bytes
  • Roots: [object_id="root1"]
  • Allocations:
    • obj1: size 10, points to obj2
    • obj2: size 5
    • obj3: size 8, points to obj1
    • obj4: size 12 (unreachable)

Initial State:

(Imagine a visualization of the heap with objects obj1, obj2, obj3 allocated, and obj4 also allocated but not referenced by any root or reachable object.)

Garbage Collection Triggered.

Mark Phase:

  • Start from "root1" (assume it points to obj1).
  • Mark obj1.
  • obj1 points to obj2. Mark obj2.
  • obj3 points to obj1, but obj1 is already marked.
  • obj4 is not reachable.

Sweep Phase:

  • Iterate through heap.
  • obj1, obj2, obj3 are marked, so they are kept.
  • obj4 is not marked, so it's freed.

Output:

  • Heap: 100 bytes, with obj1, obj2, obj3 allocated and contiguous (or with free space appropriately managed). obj4 is gone.
  • Allocated Objects: obj1, obj2, obj3.
  • Free Space: Consolidated after obj3.

Explanation: obj4 was not reachable from the root set, so it was reclaimed by the garbage collector.

Example 2:

Input:

  • Heap Size: 50 bytes
  • Roots: [object_id="root1"]
  • Allocations:
    • a: size 15, points to b
    • b: size 10, points to c
    • c: size 10
    • d: size 20 (unreachable)
    • Attempt to allocate e: size 10

Initial State: a (15) -> b (10) -> c (10) are allocated. d (20) is also allocated but unreachable. Total allocated: 15 + 10 + 10 + 20 = 55 bytes. This exceeds the heap size, implying some initial allocation might have failed or previous GC happened. Let's assume a, b, c are allocated and d is also in the heap.

Garbage Collection Triggered.

Mark Phase:

  • Assume "root1" points to a.
  • Mark a.
  • a points to b. Mark b.
  • b points to c. Mark c.
  • d is not reachable.

Sweep Phase:

  • a, b, c are marked.
  • d is not marked, it's freed. The space occupied by d (20 bytes) becomes free.

Attempt to allocate e (size 10):

  • After sweeping d, there is now 20 bytes of free space.
  • e can be successfully allocated in this free space.

Output:

  • Heap: a, b, c and e are allocated. Free space is managed.
  • Allocated Objects: a, b, c, e.
  • Free Space: What remains after e is allocated.

Explanation: d was collected, freeing up space for e.

Constraints

  • Heap Size: The simulated heap will be a []byte slice with a maximum size of 1024 bytes.
  • Object Size: Individual object sizes will not exceed 100 bytes.
  • Number of Objects: The total number of objects managed at any time will not exceed 100.
  • Object IDs: Object IDs will be unique strings.
  • Pointers: Objects can point to at most 5 other objects.
  • Performance: While this is a simplified implementation, the mark and sweep phases should ideally complete within a reasonable number of operations relative to the heap size and number of objects. Avoid extremely inefficient algorithms (e.g., O(N^2) searches within the sweep phase for each object).

Notes

  • You will need to define your own data structures for representing objects on the heap, tracking free blocks, and managing the root set.
  • Consider how you will represent pointers to other objects. Using object IDs as references is a common simplification for simulation.
  • Think about how to efficiently find free memory during allocation and how to coalesce free blocks during sweeping.
  • You can simulate triggering the garbage collector manually for testing purposes. In a real GC, this is often done when memory allocation fails or at regular intervals.
  • This challenge focuses on the core logic of mark-and-sweep. You don't need to worry about concurrent GC, stack scanning, or complex heap layouts.
Loading editor...
go