Go Worker Pool System
This challenge requires you to build a robust worker pool system in Go. A worker pool is a crucial pattern for managing concurrent tasks, preventing resource exhaustion, and improving application performance by limiting the number of goroutines executing simultaneously.
Problem Description
Your task is to implement a generic worker pool system in Go. This system should be capable of accepting a collection of tasks (represented as functions or closures) and distributing them among a fixed number of worker goroutines. The system needs to handle task submission, execution, and provide a mechanism to know when all submitted tasks have been completed.
Key Requirements:
- Worker Pool Creation: Implement a function or struct that allows for the creation of a worker pool with a specified number of workers.
- Task Submission: Provide a method to submit tasks to the worker pool. Tasks should be of a type that can be executed (e.g., func()).
- Concurrent Execution: The worker pool should distribute submitted tasks among its workers, executing them concurrently.
- Worker Management: The number of concurrently running worker goroutines should be capped by the pool's size.
- Completion Notification: Implement a way to signal when all submitted tasks have been processed and completed.
- Error Handling: Tasks might encounter errors during execution. The system should have a mechanism to report these errors, though the exact reporting strategy is up to you (e.g., collecting errors, stopping on first error, etc.). For this challenge, collecting and returning a list of errors is preferred.
- Graceful Shutdown: The worker pool should be able to shut down cleanly, ensuring all in-progress tasks are completed before returning.
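One possible shape for a pool that covers these requirements is sketched below. It is a minimal illustration, not a reference solution: the names Pool, NewPool, Submit, Shutdown, and ErrPoolClosed are assumptions made for this example, and tasks are assumed to have the signature func() error so that failures can be collected.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Task is the unit of work; returning an error is an assumption of this sketch.
type Task func() error

// ErrPoolClosed (hypothetical name) is returned by Submit after Shutdown.
var ErrPoolClosed = errors.New("pool: closed")

// Pool runs tasks on a fixed number of worker goroutines.
type Pool struct {
	tasks chan Task
	wg    sync.WaitGroup // tracks worker goroutines

	closeMu sync.RWMutex // guards closed and sends on tasks
	closed  bool

	errMu sync.Mutex // guards errs
	errs  []error
}

// NewPool starts `workers` goroutines that wait for tasks.
func NewPool(workers int) (*Pool, error) {
	if workers <= 0 {
		return nil, fmt.Errorf("pool: size must be positive, got %d", workers)
	}
	p := &Pool{tasks: make(chan Task)}
	p.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go p.worker()
	}
	return p, nil
}

// worker pulls tasks until the channel is closed and records any errors.
func (p *Pool) worker() {
	defer p.wg.Done()
	for task := range p.tasks {
		if err := task(); err != nil {
			p.errMu.Lock()
			p.errs = append(p.errs, err)
			p.errMu.Unlock()
		}
	}
}

// Submit hands a task to a worker, blocking until one is free.
func (p *Pool) Submit(t Task) error {
	p.closeMu.RLock()
	defer p.closeMu.RUnlock()
	if p.closed {
		return ErrPoolClosed
	}
	p.tasks <- t
	return nil
}

// Shutdown stops accepting tasks, waits for in-progress tasks to finish,
// and returns a copy of the collected errors. It is safe to call twice.
func (p *Pool) Shutdown() []error {
	p.closeMu.Lock()
	if !p.closed {
		p.closed = true
		close(p.tasks)
	}
	p.closeMu.Unlock()

	p.wg.Wait()
	p.errMu.Lock()
	defer p.errMu.Unlock()
	return append([]error(nil), p.errs...)
}

func main() {
	pool, err := NewPool(3)
	if err != nil {
		panic(err)
	}
	for i := 1; i <= 5; i++ {
		i := i // per-iteration copy (needed before Go 1.22)
		_ = pool.Submit(func() error {
			if i%2 == 0 {
				return fmt.Errorf("task %d failed", i)
			}
			return nil
		})
	}
	errs := pool.Shutdown()
	fmt.Println(len(errs), "errors") // prints "2 errors"
}
```

In this sketch, completion notification and graceful shutdown are the same call: Shutdown returns only after every accepted task has finished. Two separate locks are used so that a worker recording an error never waits on a submitter that is itself waiting for a free worker.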
Expected Behavior:
When tasks are submitted, workers will pick them up and execute them. If there are more tasks than available workers, tasks will queue up internally until a worker becomes free. The call that waits for completion should block until all submitted tasks are done, and an explicit shutdown should return only after in-progress tasks have finished.
Edge Cases:
- Submitting no tasks.
- Submitting a large number of tasks.
- Tasks that take a significant amount of time to complete.
- Tasks that panic.
- Submitting tasks after the pool has been shut down.
Examples
Example 1: Simple Task Execution
Input:
- Pool Size: 3
- Tasks:
- func() { fmt.Println("Task 1") }
- func() { fmt.Println("Task 2") }
- func() { fmt.Println("Task 3") }
- func() { fmt.Println("Task 4") }
Output (order may vary due to concurrency):
Task 1
Task 2
Task 3
Task 4
Explanation: With a pool size of 3, the first three tasks will start executing immediately. Task 4 will wait until one of the first three tasks finishes, and then it will be picked up by an available worker.
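The behavior described in Example 1 can be reproduced with a short program. The sketch below assumes a batch-style helper, runAll, and a task constructor, printTask; both names are hypothetical. runAll also records the peak number of tasks running at once, purely to make the concurrency cap observable.

```go
package main

import (
	"fmt"
	"sync"
)

// printTask builds one of the tasks from Example 1.
func printTask(n int) func() {
	return func() { fmt.Println("Task", n) }
}

// runAll executes tasks on `size` workers and returns the highest number
// of tasks that were running at the same time.
func runAll(size int, tasks []func()) int {
	queue := make(chan func()) // unbuffered: a send waits for a free worker
	var (
		wg            sync.WaitGroup
		mu            sync.Mutex // guards running and peak
		running, peak int
	)

	for w := 0; w < size; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range queue {
				mu.Lock()
				running++
				if running > peak {
					peak = running
				}
				mu.Unlock()

				task()

				mu.Lock()
				running--
				mu.Unlock()
			}
		}()
	}

	for _, t := range tasks {
		queue <- t // Task 4 waits here until a worker is free
	}
	close(queue)
	wg.Wait()
	return peak
}

func main() {
	tasks := []func(){printTask(1), printTask(2), printTask(3), printTask(4)}
	peak := runAll(3, tasks)
	fmt.Println("peak concurrency:", peak) // never more than 3
}
```

The order of the four "Task" lines can differ between runs, but the reported peak cannot exceed the pool size because only three goroutines ever receive from the queue.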
Example 2: Tasks with Return Values and Errors
Let's define a task as a function that returns a result and an error: func() (interface{}, error).
Input:
- Pool Size: 2
- Tasks:
- func() (interface{}, error) { time.Sleep(100 * time.Millisecond); return "result1", nil }
- func() (interface{}, error) { time.Sleep(50 * time.Millisecond); return nil, fmt.Errorf("task2 failed") }
- func() (interface{}, error) { time.Sleep(75 * time.Millisecond); return "result3", nil }
Output:
- A list of results: ["result1", "result3"] (order may vary)
- A list of errors: [fmt.Errorf("task2 failed")]
Explanation: The pool with 2 workers processes tasks concurrently. The first two tasks start immediately. The second task finishes first (after 50 ms) with an error, which is collected. The third task then runs on the freed worker and returns a successful result, as does the first task once its 100 ms sleep ends.
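One way to collect both results and errors is to send each task's outcome over a single channel and split them on the receiving side. The following is a sketch under that assumption; ResultTask, outcome, runCollect, after, and errTask2 are names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ResultTask matches the task signature used in Example 2.
type ResultTask func() (interface{}, error)

// outcome pairs a task's return values so they can travel over one channel.
type outcome struct {
	value interface{}
	err   error
}

var errTask2 = fmt.Errorf("task2 failed")

// after builds a task that sleeps for d and then returns v and err.
func after(d time.Duration, v interface{}, err error) ResultTask {
	return func() (interface{}, error) {
		time.Sleep(d)
		return v, err
	}
}

// runCollect runs tasks on `size` workers and separates results from errors.
func runCollect(size int, tasks []ResultTask) (results []interface{}, errs []error) {
	queue := make(chan ResultTask)
	out := make(chan outcome)
	var wg sync.WaitGroup

	for w := 0; w < size; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range queue {
				v, err := task()
				out <- outcome{v, err}
			}
		}()
	}

	// Feed tasks from a separate goroutine so the loop below can drain
	// `out` at the same time; close `out` once every worker has exited.
	go func() {
		for _, t := range tasks {
			queue <- t
		}
		close(queue)
		wg.Wait()
		close(out)
	}()

	for o := range out {
		if o.err != nil {
			errs = append(errs, o.err)
		} else {
			results = append(results, o.value)
		}
	}
	return results, errs
}

func main() {
	results, errs := runCollect(2, []ResultTask{
		after(100*time.Millisecond, "result1", nil),
		after(50*time.Millisecond, nil, errTask2),
		after(75*time.Millisecond, "result3", nil),
	})
	fmt.Printf("%d results, %d errors\n", len(results), len(errs)) // prints "2 results, 1 errors"
}
```

Because only the collecting goroutine appends to the two slices, no mutex is needed here; the channel does the synchronization.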
Example 3: Handling Panics
Input:
- Pool Size: 1
- Tasks:
- func() { fmt.Println("Task 1 starting") }
- func() { panic("oops! task 2 panicked") }
- func() { fmt.Println("Task 3 starting") }
Output:
Task 1 starting
Task 3 starting
// A panic error should be captured by the system.
Explanation: The first task runs. When the second task panics, the worker should ideally recover from the panic, log or report it, and continue processing subsequent tasks. Task 3 should still be executed.
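The recovery step can be isolated in a small wrapper that turns a panic into an ordinary error, so the worker loop keeps running. This is a sketch only; safeRun and runWithRecovery are hypothetical names, and reporting the panic as an error value is one choice among several (logging would also satisfy the description above).

```go
package main

import (
	"fmt"
	"sync"
)

// safeRun executes task and converts a panic into an error value.
func safeRun(task func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("task panicked: %v", r)
		}
	}()
	task()
	return nil
}

// runWithRecovery runs tasks on `size` workers and returns one error
// for every task that panicked.
func runWithRecovery(size int, tasks []func()) []error {
	queue := make(chan func())
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex // guards errs
		errs []error
	)

	for w := 0; w < size; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range queue {
				if err := safeRun(task); err != nil {
					mu.Lock()
					errs = append(errs, err)
					mu.Unlock()
				}
			}
		}()
	}

	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait()
	return errs
}

func main() {
	errs := runWithRecovery(1, []func(){
		func() { fmt.Println("Task 1 starting") },
		func() { panic("oops! task 2 panicked") },
		func() { fmt.Println("Task 3 starting") },
	})
	for _, err := range errs {
		fmt.Println("captured:", err)
	}
	// Output:
	// Task 1 starting
	// Task 3 starting
	// captured: task panicked: oops! task 2 panicked
}
```

The deferred function must live in the same goroutine as the panicking call, which is why the recovery sits inside safeRun rather than in the code that submits tasks.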
Constraints
- The pool size must be a positive integer.
- Tasks submitted to the pool must be executable functions (e.g., func()). If you are handling tasks with return values, define a clear interface for them.
- The system should be designed to handle a large number of tasks efficiently, ideally without excessive memory usage for task queues.
- The implementation should use Go's concurrency primitives (goroutines, channels, waitgroups) effectively.
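One way to keep queue memory bounded is a buffered channel with a fixed capacity: once it is full, submitters either wait or are told to back off. The sketch below assumes that design; BoundedPool, NewBoundedPool, Submit, TrySubmit, and Close are hypothetical names, and the challenge does not require a non-blocking submit.

```go
package main

import (
	"fmt"
	"sync"
)

// BoundedPool holds at most queueCap pending tasks in memory.
type BoundedPool struct {
	queue chan func()
	wg    sync.WaitGroup
}

// NewBoundedPool starts `workers` goroutines reading from a buffered queue.
func NewBoundedPool(workers, queueCap int) (*BoundedPool, error) {
	if workers <= 0 || queueCap < 0 {
		return nil, fmt.Errorf("invalid sizes: workers=%d queueCap=%d", workers, queueCap)
	}
	p := &BoundedPool{queue: make(chan func(), queueCap)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for task := range p.queue {
				task()
			}
		}()
	}
	return p, nil
}

// Submit blocks while the queue is full, which applies backpressure.
func (p *BoundedPool) Submit(task func()) { p.queue <- task }

// TrySubmit reports false instead of blocking when the queue is full.
func (p *BoundedPool) TrySubmit(task func()) bool {
	select {
	case p.queue <- task:
		return true
	default:
		return false
	}
}

// Close stops intake and waits for queued and running tasks to finish.
func (p *BoundedPool) Close() {
	close(p.queue)
	p.wg.Wait()
}

func main() {
	p, err := NewBoundedPool(1, 2)
	if err != nil {
		panic(err)
	}
	started, release := make(chan struct{}), make(chan struct{})
	p.Submit(func() { close(started); <-release }) // occupies the only worker
	<-started

	fmt.Println(p.TrySubmit(func() {})) // true: first queue slot
	fmt.Println(p.TrySubmit(func() {})) // true: second queue slot
	fmt.Println(p.TrySubmit(func() {})) // false: queue is full

	close(release)
	p.Close()
}
```

With this layout the memory held by waiting tasks is capped at queueCap closures, regardless of how many tasks callers attempt to submit.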
Notes
- Consider using sync.WaitGroup to manage task completion.
- Channels will be invaluable for communication between the submitter, workers, and for signaling completion.
- Think about how you will manage the queue of tasks waiting to be processed.
- For tasks that return values and errors, you'll need a way to collect and return these aggregated results and errors.
- A common approach for panic recovery in Go is using defer with recover().
- The context package can be useful for managing deadlines and cancellations, though it's not strictly required for this core implementation.
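As an illustration of the last note, the sketch below stops handing out work once a context is done while still letting already-started tasks return. It is optional for the challenge, and every name in it (CtxTask, sleepTask, runWithContext, runWithTimeout) is an assumption made for this example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// CtxTask is a task that can observe cancellation.
type CtxTask func(ctx context.Context)

// sleepTask simulates work lasting d that returns early if ctx is done.
func sleepTask(d time.Duration) CtxTask {
	return func(ctx context.Context) {
		select {
		case <-time.After(d):
		case <-ctx.Done():
		}
	}
}

// runWithContext dispatches tasks to `size` workers until all tasks have
// been handed out or ctx is done. It returns how many tasks were started.
func runWithContext(ctx context.Context, size int, tasks []CtxTask) (int, error) {
	queue := make(chan CtxTask)
	var wg sync.WaitGroup
	for w := 0; w < size; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range queue {
				task(ctx)
			}
		}()
	}

	started := 0
feed:
	for _, t := range tasks {
		if ctx.Err() != nil {
			break feed // context already done: hand out no more work
		}
		select {
		case queue <- t:
			started++
		case <-ctx.Done():
			break feed
		}
	}
	close(queue)
	wg.Wait() // tasks that were already started are allowed to return

	if started < len(tasks) {
		return started, fmt.Errorf("started %d of %d tasks: %w", started, len(tasks), ctx.Err())
	}
	return started, nil
}

// runWithTimeout derives the context from a timeout for convenience.
func runWithTimeout(timeout time.Duration, size int, tasks []CtxTask) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return runWithContext(ctx, size, tasks)
}

func main() {
	tasks := make([]CtxTask, 10)
	for i := range tasks {
		tasks[i] = sleepTask(30 * time.Millisecond)
	}
	// With 2 workers and a 50 ms deadline, only some of the 10 tasks start;
	// the exact count depends on scheduling.
	started, err := runWithTimeout(50*time.Millisecond, 2, tasks)
	fmt.Println(started, err)
}
```

Tasks have to cooperate for cancellation to take effect: a task that never checks ctx.Done() will run to completion regardless of the deadline.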