Robust Service Protection: Implementing a Circuit Breaker in Go
Circuit breakers are a crucial pattern for building resilient distributed systems. They prevent cascading failures by temporarily stopping requests to a failing service, allowing it time to recover and preventing resource exhaustion. This challenge asks you to implement a basic circuit breaker in Go to protect a simulated downstream service.
Problem Description
You are tasked with creating a CircuitBreaker struct in Go that monitors the success and failure rates of calls to a downstream service. The circuit breaker should have three states: Closed, Open, and Half-Open.
- Closed: Requests are allowed to pass through to the downstream service. The circuit breaker tracks success and failure counts. If the failure count exceeds a defined threshold within a sliding window, the circuit breaker transitions to the Open state.
- Open: Requests are immediately rejected without calling the downstream service. A timer is started, and after a specified timeout, the circuit breaker transitions to the Half-Open state.
- Half-Open: A limited number of test requests are allowed to pass through to the downstream service. If these requests succeed, the circuit breaker transitions back to the Closed state. If they fail, it returns to the Open state.
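One possible way to represent the three states is a small integer type with named constants. The sketch below is only an illustration: the type name State, the constant HalfOpen, and the String method are assumptions, not names the challenge prescribes.

```go
package main

// State is a hypothetical name for the breaker's mode; the challenge
// does not prescribe a particular representation.
type State int

const (
	Closed   State = iota // calls pass through and failures are counted
	Open                  // calls are rejected without reaching the service
	HalfOpen              // a limited number of test requests pass through
)

// String returns the state name as written in the problem description.
func (s State) String() string {
	switch s {
	case Closed:
		return "Closed"
	case Open:
		return "Open"
	case HalfOpen:
		return "Half-Open"
	default:
		return "Unknown"
	}
}
```

Making Closed the zero value means a freshly declared breaker starts out letting requests through, which matches the behavior described above.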
The CircuitBreaker should provide a Call method that attempts to execute a provided function (representing a call to the downstream service). The Call method should handle the circuit breaker's state and return an error if the circuit is open.
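A caller usually needs to tell a rejection by the breaker apart from a failure of the downstream call itself. The following sketch shows one way to do that with a sentinel error. ErrCircuitOpen, guard, and rejected are hypothetical names, and the boolean open parameter stands in for the real state check.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCircuitOpen is a hypothetical sentinel error returned when the
// breaker rejects a call without running it.
var ErrCircuitOpen = errors.New("circuit breaker is open")

// guard runs fn unless open is true. The open flag is a stand-in for
// the breaker's real state check (an assumption made for brevity).
func guard(open bool, fn func() error) error {
	if open {
		return ErrCircuitOpen // fn is never invoked
	}
	if err := fn(); err != nil {
		// Wrap with %w so callers can still inspect the original error.
		return fmt.Errorf("downstream call failed: %w", err)
	}
	return nil
}

// rejected reports whether err came from the breaker rather than from
// the downstream service.
func rejected(err error) bool {
	return errors.Is(err, ErrCircuitOpen)
}
```

With this shape, a caller can check rejected(err) to decide whether to fall back immediately or to treat the error as a genuine downstream failure.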
Key Requirements:
- Implement the CircuitBreaker struct with Closed, Open, and Half-Open states.
- Implement a sliding window failure rate calculation.
- Implement timeouts for transitioning from Open to Half-Open.
- Implement limited test requests in the Half-Open state.
- Provide a Call method that respects the circuit breaker's state.
Expected Behavior:
- When the circuit is Closed and calls succeed, the failure count should not increase.
- When the circuit is Closed and calls fail, the failure count should increase. Once the failure threshold is reached, the circuit should transition to Open.
- When the circuit is Open, all calls should immediately return an error without attempting to call the downstream service.
- After the timeout period in the Open state, the circuit should transition to Half-Open.
- In the Half-Open state, a limited number of test requests should be allowed. Success transitions to Closed, failure transitions back to Open.
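The behavior listed above can be sketched in a single-goroutine form as follows. This is one possible reading, not a reference solution. It makes several assumptions: the circuit opens as soon as the failure count reaches the threshold, it closes again only after the configured number of consecutive successful test requests, a single failed test request reopens it, and the Open timeout is checked lazily on the next call instead of with a running timer. All identifiers other than CircuitBreaker and Call are hypothetical.

```go
package main

import (
	"errors"
	"time"
)

type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// ErrCircuitOpen is a hypothetical sentinel returned for rejected calls.
var ErrCircuitOpen = errors.New("circuit breaker is open")

// CircuitBreaker is not safe for concurrent use; field names are illustrative.
type CircuitBreaker struct {
	state     State
	window    []bool // ring buffer of recent outcomes; true marks a failure
	next      int    // index of the slot to overwrite next
	failures  int    // number of failures currently in window
	threshold int
	maxProbes int
	probes    int // successful test requests seen while Half-Open
	timeout   time.Duration
	openedAt  time.Time
}

func NewCircuitBreaker(windowSize, threshold, maxProbes int, timeout time.Duration) *CircuitBreaker {
	return &CircuitBreaker{
		window:    make([]bool, windowSize),
		threshold: threshold,
		maxProbes: maxProbes,
		timeout:   timeout,
	}
}

// Call runs fn according to the current state. It returns fn's error,
// or ErrCircuitOpen if the call was rejected without running fn.
func (cb *CircuitBreaker) Call(fn func() error) error {
	if cb.state == Open {
		if time.Since(cb.openedAt) < cb.timeout {
			return ErrCircuitOpen
		}
		cb.state, cb.probes = HalfOpen, 0 // timeout elapsed: start probing
	}
	err := fn()
	switch {
	case cb.state == HalfOpen && err != nil:
		cb.trip() // a failed test request reopens the circuit
	case cb.state == HalfOpen:
		cb.probes++
		if cb.probes >= cb.maxProbes {
			cb.reset() // enough successful test requests: close again
		}
	default: // Closed
		cb.record(err != nil)
		if cb.failures >= cb.threshold {
			cb.trip()
		}
	}
	return err
}

func (cb *CircuitBreaker) trip() {
	cb.state = Open
	cb.openedAt = time.Now()
}

func (cb *CircuitBreaker) reset() {
	cb.state, cb.failures, cb.next = Closed, 0, 0
	for i := range cb.window {
		cb.window[i] = false
	}
}

func (cb *CircuitBreaker) record(failed bool) {
	if cb.window[cb.next] {
		cb.failures-- // the outcome being overwritten was a failure
	}
	cb.window[cb.next] = failed
	if failed {
		cb.failures++
	}
	cb.next = (cb.next + 1) % len(cb.window)
}
```

Checking the timeout lazily inside Call avoids a background goroutine or timer, at the cost that the state only changes to Half-Open when the next request arrives.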
Edge Cases to Consider:
- What happens if the downstream service is consistently failing?
- What happens if the downstream service recovers quickly?
- How does the circuit breaker handle concurrent calls? (Concurrency safety is not explicitly required for this basic implementation, but consider it.)
- What happens if the timeout period is very short or very long?
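For the concurrency question, one common approach is to guard the shared counters with a mutex. The sketch below uses sync.Mutex, which is a different technique from the channel-based synchronization mentioned in the Notes. The failureCounter type and both functions are hypothetical and stand in for the breaker's shared state.

```go
package main

import "sync"

// failureCounter is a hypothetical stand-in for the state that
// concurrent calls to the breaker would share.
type failureCounter struct {
	mu       sync.Mutex
	failures int
}

// RecordFailure increments the count under the lock and returns the new value.
func (c *failureCounter) RecordFailure() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failures++
	return c.failures
}

// recordConcurrently records n failures from n goroutines and returns
// the final count once all of them have finished.
func recordConcurrently(c *failureCounter, n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.RecordFailure()
		}()
	}
	wg.Wait()
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.failures
}
```

Without the lock, concurrent increments could be lost and the breaker might trip late or not at all. In a full implementation the same lock would also need to cover the state transitions.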
Examples
Example 1:
Input: Repeated failures to a downstream service, exceeding the failure threshold.
Output: CircuitBreaker state transitions from Closed -> Open. Subsequent calls return an error immediately.
Explanation: The circuit breaker detects the high failure rate and opens the circuit to prevent further load on the failing service.
Example 2:
Input: After a timeout period in the Open state, a few successful test requests are made in the Half-Open state.
Output: CircuitBreaker state transitions from Open -> Half-Open -> Closed. Subsequent calls are allowed to pass through.
Explanation: The circuit breaker determines that the downstream service has recovered and allows normal operation to resume.
Example 3:
Input: After a timeout period in the Open state, test requests made in the Half-Open state fail.
Output: CircuitBreaker state transitions from Open -> Half-Open -> Open. Subsequent calls return an error immediately.
Explanation: The circuit breaker determines that the downstream service is still failing and returns to the Open state.
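The transitions in these examples can be summarized as a small state machine. The sketch below models only the transitions, not the counting or timing behind them. The Event type, its constants, and the Next and Run functions are hypothetical names introduced for illustration.

```go
package main

type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// Event is a hypothetical summary of what the breaker observed.
type Event int

const (
	ThresholdReached Event = iota // failures in the window hit the threshold
	TimeoutElapsed                // the Open timeout expired
	ProbesSucceeded               // the Half-Open test requests succeeded
	ProbeFailed                   // a Half-Open test request failed
)

// Next returns the state that follows s when e occurs. Pairs that are
// not listed leave the state unchanged.
func Next(s State, e Event) State {
	switch {
	case s == Closed && e == ThresholdReached:
		return Open
	case s == Open && e == TimeoutElapsed:
		return HalfOpen
	case s == HalfOpen && e == ProbesSucceeded:
		return Closed
	case s == HalfOpen && e == ProbeFailed:
		return Open
	}
	return s
}

// Run applies the events in order, starting from s.
func Run(s State, events ...Event) State {
	for _, e := range events {
		s = Next(s, e)
	}
	return s
}
```

Under these assumptions, Example 1 corresponds to Run(Closed, ThresholdReached), Example 2 to Run(Open, TimeoutElapsed, ProbesSucceeded), and Example 3 to Run(Open, TimeoutElapsed, ProbeFailed).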
Constraints
- Sliding Window Size: The sliding window for failure rate calculation should be 10 calls.
- Failure Threshold: The failure threshold should be 5 failures within the sliding window.
- Timeout Duration: The timeout duration for transitioning from Open to Half-Open should be 5 seconds.
- Half-Open Test Requests: Allow a maximum of 3 test requests in the Half-Open state.
- Error Type: The Call method should return an error of type error.
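These constraints could be captured as named constants so that the numbers appear in one place. The constant names and the shouldTrip helper below are assumptions. The helper treats reaching the threshold as enough to open the circuit, following the wording under Expected Behavior.

```go
package main

import "time"

// Values taken from the Constraints section; the names are hypothetical.
const (
	WindowSize          = 10              // calls tracked by the sliding window
	FailureThreshold    = 5               // failures within the window that open the circuit
	OpenTimeout         = 5 * time.Second // wait before moving from Open to Half-Open
	HalfOpenMaxRequests = 3               // test requests allowed while Half-Open
)

// shouldTrip reports whether the given number of failures within the
// window is enough to open the circuit.
func shouldTrip(failuresInWindow int) bool {
	return failuresInWindow >= FailureThreshold
}
```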
Notes
- You can use the time package for timeouts.
- Consider using channels for synchronization if you want to explore concurrency safety (though not strictly required).
- Focus on the core logic of the circuit breaker. Error handling for the downstream service itself is not required.
- The sliding window can be implemented using a circular buffer or a similar data structure.
- This is a simplified implementation. Real-world circuit breakers often include more sophisticated features like metrics collection, configurable timeouts, and fallback mechanisms.
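As a sketch of the circular buffer mentioned above, the following type keeps the outcomes of the most recent calls and maintains a running failure count, so no rescan is needed on each call. The slidingWindow type and its methods are hypothetical names. The failure rate is computed over the calls recorded so far, which is one of several reasonable choices while the window is still filling.

```go
package main

// slidingWindow is a hypothetical fixed-size ring buffer of call outcomes.
type slidingWindow struct {
	outcomes []bool // true marks a failed call
	next     int    // slot that the next outcome will overwrite
	count    int    // outcomes recorded so far, capped at len(outcomes)
	failures int    // failed calls currently inside the window
}

func newSlidingWindow(size int) *slidingWindow {
	return &slidingWindow{outcomes: make([]bool, size)}
}

// Record stores the latest outcome, evicting the oldest one once the
// buffer is full.
func (w *slidingWindow) Record(failed bool) {
	if w.count < len(w.outcomes) {
		w.count++
	} else if w.outcomes[w.next] {
		w.failures-- // the evicted outcome was a failure
	}
	w.outcomes[w.next] = failed
	if failed {
		w.failures++
	}
	w.next = (w.next + 1) % len(w.outcomes)
}

// Failures returns the number of failed calls in the window.
func (w *slidingWindow) Failures() int { return w.failures }

// FailureRate returns failures divided by recorded calls, or 0 if
// nothing has been recorded yet.
func (w *slidingWindow) FailureRate() float64 {
	if w.count == 0 {
		return 0
	}
	return float64(w.failures) / float64(w.count)
}
```

With a window of 10, five failures followed by five successes leave Failures at 5. Five more successes then overwrite the old failures and bring the count back to 0.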