Implementing Remembered Sets in Go
This challenge focuses on implementing a "remembered set" data structure in Go. A remembered set is a collection that stores elements and allows for efficient checking of whether an element has been previously added. This is a fundamental concept used in various algorithms, such as graph traversal (e.g., Dijkstra's, BFS) and memoization, to avoid redundant processing of already visited nodes or computed values.
Problem Description
Your task is to create a RememberedSet type in Go that can store and efficiently check for the presence of elements. The set should support two primary operations:
Add(element): Adds an element to the set. If the element is already present, the set remains unchanged.Contains(element): Returnstrueif the element is present in the set, andfalseotherwise.
The RememberedSet should be generic, meaning it can store elements of any comparable type (types that support == and != operators).
Key Requirements:
- The
RememberedSetshould be thread-safe for concurrent access. Multiple goroutines might attempt toAddorContainselements simultaneously. - The implementation should aim for efficient time complexity for both
AddandContainsoperations.
Expected Behavior:
- Calling
Addmultiple times with the same element should only result in the element being stored once. - Calling
Containson an element that has been added should returntrue. - Calling
Containson an element that has not been added should returnfalse.
Edge Cases:
- Adding and checking for
nilvalues (if applicable to the generic type). - Concurrent access from multiple goroutines.
- Adding a large number of unique elements.
Examples
Example 1:
Input:
set := NewRememberedSet[int]()
set.Add(10)
set.Add(20)
set.Add(10) // Adding a duplicate
// Checking presence
contains10 := set.Contains(10)
contains20 := set.Contains(20)
contains30 := set.Contains(30)
Output:
contains10: true
contains20: true
contains30: false
Explanation: The set stores 10 and 20. Adding 10 again has no effect. Contains correctly identifies existing and non-existing elements.
Example 2:
Input:
type Person struct {
ID int
Name string
}
p1 := Person{ID: 1, Name: "Alice"}
p2 := Person{ID: 2, Name: "Bob"}
p3 := Person{ID: 1, Name: "Alice"} // Structurally same as p1
set := NewRememberedSet[Person]()
set.Add(p1)
set.Add(p2)
// Checking presence
containsP1 := set.Contains(p1)
containsP3 := set.Contains(p3) // p3 is equal to p1
containsNonExistent := set.Contains(Person{ID: 3, Name: "Charlie"})
Output:
containsP1: true
containsP3: true
containsNonExistent: false
Explanation: The set stores p1 and p2. Since `Person` structs are comparable by value, `p3` is considered equal to `p1` and `Contains` returns true.
Example 3: Concurrent Access
Input:
set := NewRememberedSet[string]()
var wg sync.WaitGroup
numGoroutines := 1000
elementsPerGoroutine := 10
for i := 0; i < numGoroutines; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
for j := 0; j < elementsPerGoroutine; j++ {
element := fmt.Sprintf("goroutine-%d-item-%d", id, j)
set.Add(element)
}
}(i)
}
wg.Wait()
// After all goroutines finish, check if a specific element exists
// For instance, an element from the first goroutine's additions
exists := set.Contains("goroutine-0-item-5")
Output:
exists: true
Explanation: Even with many goroutines concurrently adding elements, the set correctly stores all unique elements. The `Contains` check should return true for an element that was added by one of the goroutines.
Constraints
- The
RememberedSetcan store up to $2^{32}$ elements without significant performance degradation. - The elements stored must be of a type that implements the
comparableconstraint in Go (e.g., primitive types, structs with comparable fields, pointers, channels, interfaces, maps, slices). AddandContainsoperations should ideally have an average time complexity of O(1).
Notes
- Consider using Go's built-in
mapfor efficient key-value storage. - For thread safety, you'll need to employ synchronization primitives like
sync.Mutexorsync.RWMutex. - The generic nature of the set means you'll be using Go's generic type parameters.
- Think about how you will handle the underlying storage and synchronization to meet the performance and concurrency requirements.