Incremental Garbage Collection Simulation in Rust
This challenge asks you to simulate a simplified incremental garbage collector (GC) in Rust. While Rust's ownership system largely eliminates the need for a traditional GC, understanding GC concepts is valuable for systems programming and working with languages that rely on them. This simulation will focus on marking and sweeping phases, providing a foundational understanding of how GCs operate.
Problem Description
You are tasked with creating a simplified incremental garbage collector simulation. The core functionality should involve tracking allocated objects, marking reachable objects, and sweeping (reclaiming) unreachable objects. The simulation will operate on a vector of usize representing memory addresses. Each usize represents a memory location that may contain an object. You'll be given a set of root pointers (starting addresses of reachable objects) and a list of allocations. The GC should then identify and reclaim memory that is no longer reachable from the root pointers.
What needs to be achieved:
- Allocation Tracking: Maintain a data structure to track allocated memory addresses.
- Marking Phase: Starting from the root pointers, traverse the object graph and mark all reachable objects. Assume that each object at a given address points to another object at an address stored within that memory location (represented as a
usizewithin theusizeat the current address). For example, ifmemory[address] = other_address, then the object atother_addressis reachable from the object ataddress. - Sweeping Phase: Iterate through the allocated memory addresses. If an address is not marked, it is considered unreachable and should be removed from the allocation tracking.
Key Requirements:
- The GC should be able to handle cycles in the object graph.
- The GC should correctly identify and reclaim unreachable memory.
- The simulation should be efficient enough to handle a reasonable number of objects (see Constraints).
Expected Behavior:
The function should take the following inputs:
memory: AVec<usize>representing the memory space. Each element is ausizethat may contain anotherusizerepresenting a pointer to another memory location.roots: AVec<usize>representing the root pointers (starting addresses of reachable objects).allocations: AVec<usize>representing the addresses of allocated objects.
The function should return a Vec<usize> containing the addresses of the unreachable objects (those that were allocated but not reachable from the roots after sweeping).
Edge Cases to Consider:
- Empty
memoryorrootsvectors. rootscontaining addresses outside the bounds ofmemory.- Allocations outside the bounds of
memory. - Cycles in the object graph (e.g.,
memory[1] = 2andmemory[2] = 1). - Multiple allocations at the same address (should be treated as a single allocation).
Examples
Example 1:
Input: memory = [1, 2, 3, 4, 5], roots = [0], allocations = [0, 1, 2]
Output: [3, 4, 5]
Explanation: memory[0] = 1, memory[1] = 2, memory[2] = 3, memory[3] = 4, memory[4] = 5. Starting from root 0, we can reach 1 and 2. Since memory[1] = 2 and memory[2] = 3, we can also reach 3. Addresses 4 and 5 are not reachable.
Example 2:
Input: memory = [1, 2, 3, 4, 5], roots = [0], allocations = [0, 1, 2, 3]
Output: [4, 5]
Explanation: memory[0] = 1, memory[1] = 2, memory[2] = 1, memory[3] = 4, memory[4] = 5. Starting from root 0, we can reach 1 and 2. Since memory[1] = 2 and memory[2] = 1, we form a cycle. Address 3 is not reachable. Addresses 4 and 5 are not reachable.
Example 3: (Edge Case)
Input: memory = [1, 2, 3], roots = [5], allocations = [0, 1, 2]
Output: [0, 1, 2]
Explanation: Root 5 is out of bounds. No objects are reachable, so all allocated objects are unreachable.
Constraints
memory.len()will be between 10 and 1000.roots.len()will be between 0 and 10.allocations.len()will be between 0 andmemory.len().- All
usizevalues inmemory,roots, andallocationswill be within the bounds ofmemory.len(). - The solution should complete within 1 second for the given constraints.
Notes
- This is a simplified simulation. Real-world GCs are significantly more complex.
- Focus on correctness and clarity of code.
- You can use a
HashSetor similar data structure to efficiently track allocated and marked objects. - Consider using a recursive or iterative approach for the marking phase. Be mindful of potential stack overflow issues with deep recursion.
- The order of the unreachable objects in the output vector does not matter.