Building a Mark-and-Sweep Garbage Collector in Go
Understanding memory management is crucial for writing efficient and robust Go programs. While Go has an excellent built-in garbage collector, implementing one yourself provides invaluable insight into how memory is automatically reclaimed. This challenge asks you to build a simplified mark-and-sweep garbage collector for a custom memory model.
Problem Description
Your task is to implement a basic mark-and-sweep garbage collector for a simulated heap in Go. You will manage a fixed-size array representing the heap and a set of "roots" from which reachable objects can be traced. The garbage collector should identify and reclaim objects that are no longer reachable from these roots.
Key Requirements:
- Simulated Heap: Represent the heap as a
[]byteslice of a fixed size. - Object Representation: Define a simple structure to represent objects on the heap. Each object should have a unique identifier, a size, and pointers to other objects (simulated by their IDs).
- Root Set: Maintain a list of object IDs that are considered roots (e.g., global variables, stack variables).
- Allocation: Implement a function to allocate memory for new objects on the heap. This function should handle fragmentation and return an error if there isn't enough contiguous space.
- Mark Phase: Implement the "mark" phase of the mark-and-sweep algorithm. This involves traversing the object graph starting from the roots and marking all reachable objects.
- Sweep Phase: Implement the "sweep" phase. This involves iterating through the heap, freeing marked (unreachable) objects and coalescing free space.
- Object Tracking: Keep track of allocated objects, their locations on the heap, and their marked status.
Expected Behavior:
- Allocate several objects.
- Some objects should point to others, forming a directed graph.
- Trigger the garbage collection process.
- Verify that only reachable objects remain allocated on the heap.
- Free space should be consolidated.
Edge Cases:
- Allocating an object that is larger than the remaining free space.
- Circular references between objects.
- Allocating objects after garbage collection.
- The heap being completely full.
Examples
Example 1:
Input:
- Heap Size: 100 bytes
- Roots: [object_id="root1"]
- Allocations:
obj1: size 10, points toobj2obj2: size 5obj3: size 8, points toobj1obj4: size 12 (unreachable)
Initial State:
(Imagine a visualization of the heap with objects obj1, obj2, obj3 allocated, and obj4 also allocated but not referenced by any root or reachable object.)
Garbage Collection Triggered.
Mark Phase:
- Start from "root1" (assume it points to
obj1). - Mark
obj1. obj1points toobj2. Markobj2.obj3points toobj1, butobj1is already marked.obj4is not reachable.
Sweep Phase:
- Iterate through heap.
obj1,obj2,obj3are marked, so they are kept.obj4is not marked, so it's freed.
Output:
- Heap: 100 bytes, with
obj1,obj2,obj3allocated and contiguous (or with free space appropriately managed).obj4is gone. - Allocated Objects:
obj1,obj2,obj3. - Free Space: Consolidated after
obj3.
Explanation: obj4 was not reachable from the root set, so it was reclaimed by the garbage collector.
Example 2:
Input:
- Heap Size: 50 bytes
- Roots: [object_id="root1"]
- Allocations:
a: size 15, points tobb: size 10, points tocc: size 10d: size 20 (unreachable)- Attempt to allocate
e: size 10
Initial State:
a (15) -> b (10) -> c (10) are allocated. d (20) is also allocated but unreachable. Total allocated: 15 + 10 + 10 + 20 = 55 bytes. This exceeds the heap size, implying some initial allocation might have failed or previous GC happened. Let's assume a, b, c are allocated and d is also in the heap.
Garbage Collection Triggered.
Mark Phase:
- Assume "root1" points to
a. - Mark
a. apoints tob. Markb.bpoints toc. Markc.dis not reachable.
Sweep Phase:
a,b,care marked.dis not marked, it's freed. The space occupied byd(20 bytes) becomes free.
Attempt to allocate e (size 10):
- After sweeping
d, there is now 20 bytes of free space. ecan be successfully allocated in this free space.
Output:
- Heap:
a,b,candeare allocated. Free space is managed. - Allocated Objects:
a,b,c,e. - Free Space: What remains after
eis allocated.
Explanation: d was collected, freeing up space for e.
Constraints
- Heap Size: The simulated heap will be a
[]byteslice with a maximum size of 1024 bytes. - Object Size: Individual object sizes will not exceed 100 bytes.
- Number of Objects: The total number of objects managed at any time will not exceed 100.
- Object IDs: Object IDs will be unique strings.
- Pointers: Objects can point to at most 5 other objects.
- Performance: While this is a simplified implementation, the mark and sweep phases should ideally complete within a reasonable number of operations relative to the heap size and number of objects. Avoid extremely inefficient algorithms (e.g., O(N^2) searches within the sweep phase for each object).
Notes
- You will need to define your own data structures for representing objects on the heap, tracking free blocks, and managing the root set.
- Consider how you will represent pointers to other objects. Using object IDs as references is a common simplification for simulation.
- Think about how to efficiently find free memory during allocation and how to coalesce free blocks during sweeping.
- You can simulate triggering the garbage collector manually for testing purposes. In a real GC, this is often done when memory allocation fails or at regular intervals.
- This challenge focuses on the core logic of mark-and-sweep. You don't need to worry about concurrent GC, stack scanning, or complex heap layouts.