Implement Work Stealing in Go
This challenge asks you to implement a work-stealing scheduler in Go. Work stealing is a dynamic load balancing technique that allows idle workers to "steal" tasks from busy workers, improving overall parallelism and resource utilization. This is particularly useful in scenarios where tasks have varying execution times.
Problem Description
Your goal is to create a Go program that simulates a work-stealing scheduler. You will have a pool of worker goroutines, and a set of tasks to be executed. Each worker should maintain its own local queue of tasks. When a worker finishes its local tasks, it should attempt to steal a task from the queue of another randomly chosen worker. If successful, it executes the stolen task. If unsuccessful (all other workers' queues are empty), it can either wait or terminate.
Key Requirements:
- Task Representation: Define a
Tasktype. For simplicity, aTaskcan be represented by a function that takes no arguments and returns no values (func()). - Worker Pool: Create a fixed number of worker goroutines.
- Task Queues: Each worker must have its own independent task queue. A
sync.Mutexorsync.RWMutexshould be used to protect access to each worker's queue. - Task Assignment: Initially, tasks should be distributed among the worker queues.
- Work Stealing Logic:
- When a worker exhausts its local queue, it should try to steal a task.
- The stealing mechanism should randomly select another worker.
- If the selected worker has tasks, the stealing worker takes one (preferably from the "tail" or "opposite end" of the busy worker's queue to minimize contention).
- If the selected worker's queue is empty, the stealing worker should try another random worker.
- A worker that cannot find any tasks after a reasonable number of attempts should potentially go idle or terminate.
- Synchronization: Ensure proper synchronization to prevent race conditions when accessing task queues and managing worker states.
- Termination: The scheduler should terminate gracefully when all tasks are completed and all workers are idle.
Expected Behavior:
- Tasks are distributed among workers.
- Busy workers process their local tasks.
- Idle workers actively seek and steal tasks from busy workers.
- The system should eventually complete all tasks.
- The program should terminate without deadlocks or panics.
Edge Cases:
- Zero tasks provided.
- More workers than tasks.
- All tasks having extremely different execution times, leading to some workers being much busier than others.
- A worker attempting to steal from itself (should be avoided).
Examples
Example 1: Basic Task Distribution and Stealing
Input:
Number of Workers: 4
Number of Tasks: 10
Tasks: [Task A, Task B, Task C, Task D, Task E, Task F, Task G, Task H, Task I, Task J]
Each task represented by a unique identifier and a small delay (e.g., 10ms).
Output:
Indication that tasks are being processed by different workers, with evidence of stealing (e.g., worker X processing a task originally assigned to worker Y).
Final confirmation of all tasks completion.
Explanation: Initially, tasks A-J are distributed among the 4 workers. For instance, worker 0 gets A, B; worker 1 gets C, D, E; worker 2 gets F, G; worker 3 gets H, I, J. If worker 0 finishes A and B quickly and worker 1 is still busy with C, D, E, worker 0 might steal task F from worker 2. The output will show the interleaving of task execution and potentially log messages indicating successful steals.
Example 2: Many Tasks, Few Workers
Input:
Number of Workers: 2
Number of Tasks: 100
Tasks: [Task 0, Task 1, ..., Task 99]
Each task represented by a unique identifier and a moderate delay (e.g., 50ms).
Output:
A clear demonstration of the two workers efficiently balancing the load, with one worker potentially stealing from the other when its local queue empties.
Explanation: With fewer workers and many tasks, load imbalance is more likely. The work-stealing mechanism is crucial here to ensure both workers remain busy as much as possible, processing tasks from each other's queues.
Example 3: Edge Case - Zero Tasks
Input:
Number of Workers: 3
Number of Tasks: 0
Tasks: []
Output:
The scheduler should initialize, find no tasks, and terminate gracefully without any errors.
Explanation: The scheduler should handle the scenario where there are no tasks to process. All workers should ideally not start processing any tasks and the program should exit quickly.
Constraints
- The number of workers will be between 1 and 16.
- The total number of tasks will be between 0 and 1000.
- Task execution times can vary, but for simulation purposes, assume each task takes between 10ms and 100ms to complete.
- The solution should be implemented using standard Go concurrency primitives.
- The program must not deadlock.
- The program must terminate after all tasks are completed.
Notes
- Consider how to represent the task queues for each worker. A slice or a linked list might be suitable.
- Think about the strategy for stealing: from the head or tail of the busy worker's queue? Which offers better performance characteristics in terms of contention?
- Implementing a mechanism to signal when all tasks are done and workers can exit is important for graceful termination.
- For debugging and understanding, consider adding logging to show when a worker starts/finishes a task, when it attempts to steal, and when it successfully steals.
- A
context.Contextcan be useful for managing cancellation and timeouts, especially for the stealing attempts.