Implementing a Circuit Breaker Pattern in Python
The circuit breaker pattern is a crucial resilience mechanism in distributed systems. It prevents an application from repeatedly trying to execute an operation that is likely to fail, thus protecting resources and preventing cascading failures. Your task is to implement a Python class that embodies this pattern, allowing you to control access to a potentially unreliable service.
Problem Description
You need to create a CircuitBreaker class in Python that manages access to a function (representing a remote service call or any operation that might fail). The circuit breaker should transition between three states:
- Closed: Requests are allowed to pass through to the protected function. If the function succeeds, the state remains Closed and the consecutive-failure counter is reset. If it fails, the counter is incremented.
- Open: If the failure counter reaches a predefined threshold, the circuit breaker transitions to Open. In this state, all subsequent requests are rejected immediately by raising an error, without calling the protected function.
- Half-Open: After a specified timeout period in the Open state, the circuit breaker transitions to Half-Open. In this state, a limited number of requests are allowed to pass through. If these requests succeed, the circuit breaker resets to Closed. If they fail, it immediately returns to Open.
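The three states and the moves between them can be sketched as a small lookup table. This is only an illustration: the `State` enum, the event labels (`"threshold_reached"`, `"timeout_elapsed"`, `"success"`, `"failure"`), and the `next_state` helper are hypothetical names chosen here, not part of the required interface.

```python
from enum import Enum


class State(Enum):
    """Hypothetical names for the three breaker states."""
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


# Which event moves the breaker from one state to another.
# The event names are illustrative labels only.
TRANSITIONS = {
    (State.CLOSED, "threshold_reached"): State.OPEN,
    (State.OPEN, "timeout_elapsed"): State.HALF_OPEN,
    (State.HALF_OPEN, "success"): State.CLOSED,
    (State.HALF_OPEN, "failure"): State.OPEN,
}


def next_state(state: State, event: str) -> State:
    """Return the state after `event`; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Any pair not listed in the table, such as a success while Closed, keeps the current state.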
Key Requirements:
- The CircuitBreaker class should be initialized with parameters defining its behavior:
  - failure_threshold: The number of consecutive failures allowed before tripping the circuit breaker to Open.
  - recovery_timeout: The duration (in seconds) the circuit breaker stays in the Open state before transitioning to Half-Open.
  - exceptions: A tuple of exception types that should be considered failures.
- A method (e.g., call) should be provided to wrap the protected function. This method will:
  - Check the current state of the circuit breaker.
  - Execute the protected function based on the state.
  - Update the state and failure count accordingly.
  - Raise exceptions for failed calls or if the circuit is open.
- The circuit breaker should maintain an internal state and a failure counter.
- It should track the time elapsed since the last failure to manage the recovery_timeout.
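One possible way to track the elapsed time is sketched below. The `RecoveryTimer` class and its `clock` parameter are hypothetical helpers introduced for illustration. The sketch assumes `time.monotonic` is a suitable clock, since it is unaffected by system clock adjustments, and makes the clock injectable so that tests do not have to sleep.

```python
import time


class RecoveryTimer:
    """Hypothetical helper that remembers when the breaker opened."""

    def __init__(self, recovery_timeout, clock=time.monotonic):
        self.recovery_timeout = recovery_timeout
        self._clock = clock        # assumption: injectable so tests can fake time
        self._opened_at = None     # None means the timer was never started

    def start(self):
        """Record the moment the breaker entered the Open state."""
        self._opened_at = self._clock()

    def expired(self):
        """True once recovery_timeout seconds have passed since start()."""
        if self._opened_at is None:
            return False
        return self._clock() - self._opened_at >= self.recovery_timeout
```

With a recovery_timeout of 0, `expired()` is true as soon as `start()` has been called, so the breaker would allow a trial call on the very next request.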
Expected Behavior:
- When in the CLOSED state, calls to the protected function are executed. If an exception in exceptions occurs, the failure count increases. If failure_threshold is reached, transition to OPEN.
- When in the OPEN state, calls to the protected function are immediately rejected with a specific exception (e.g., CircuitBreakerOpenError) without executing the function. After recovery_timeout has passed since entering the OPEN state, transition to HALF_OPEN.
- When in the HALF_OPEN state, a limited number of calls (e.g., one or a small configurable number) are allowed to pass.
  - If a call in HALF_OPEN succeeds, transition back to CLOSED and reset the failure count.
  - If a call in HALF_OPEN fails (with an exception in exceptions), transition back to OPEN immediately.
- If the protected function raises an exception not in the exceptions tuple, it should be re-raised immediately without affecting the circuit breaker's state.
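The behavior listed above could be put together roughly as follows. Treat this as a minimal, single-threaded sketch rather than a reference solution. It assumes a single trial call in HALF_OPEN, the `clock` parameter and the `State` enum are additions made here for testability and readability, and the error message text is taken from the examples.

```python
import time
from enum import Enum


class CircuitBreakerOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    def __init__(self, failure_threshold, recovery_timeout,
                 exceptions=(Exception,), clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.exceptions = exceptions
        self._clock = clock          # assumption: injectable clock for tests
        self.state = State.CLOSED
        self.failure_count = 0
        self._opened_at = None

    def _trip(self):
        """Enter OPEN and remember when that happened."""
        self.state = State.OPEN
        self._opened_at = self._clock()

    def call(self, func, *args, **kwargs):
        if self.state is State.OPEN:
            if self._clock() - self._opened_at >= self.recovery_timeout:
                self.state = State.HALF_OPEN   # let one trial call through
            else:
                raise CircuitBreakerOpenError("Circuit breaker is open.")
        try:
            result = func(*args, **kwargs)
        except self.exceptions:
            # Only the configured exception types count as failures;
            # anything else propagates without touching the state.
            self.failure_count += 1
            if (self.state is State.HALF_OPEN
                    or self.failure_count >= self.failure_threshold):
                self._trip()
            raise
        # Success: close the circuit and clear the failure streak.
        self.state = State.CLOSED
        self.failure_count = 0
        return result
```

The sketch does no locking, so concurrent callers could race on `state` and `failure_count`. It also leaves the breaker in HALF_OPEN if the trial call raises an exception outside `exceptions`, which follows the rule that such exceptions do not affect the state.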
Edge Cases to Consider:
- What happens if recovery_timeout is 0?
- What happens if failure_threshold is 0 or 1?
- Concurrent access to the circuit breaker: While not strictly required for this basic implementation, consider how race conditions might arise in a multi-threaded environment (though this challenge focuses on the core logic).
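To make the concurrency concern concrete: incrementing a shared counter is a read-modify-write sequence, so two threads can read the same value and one increment can be lost. A lock around the update is one way to avoid that. The `FailureCounter` class below is a hypothetical helper written for this illustration, not a required part of the solution.

```python
import threading


class FailureCounter:
    """Hypothetical thread-safe consecutive-failure counter."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._count = 0
        self._lock = threading.Lock()

    def record_failure(self):
        """Increment the counter and report whether the threshold is reached."""
        with self._lock:
            self._count += 1
            return self._count >= self.threshold

    def reset(self):
        """Clear the failure streak, e.g. after a successful call."""
        with self._lock:
            self._count = 0
```

With a threshold of 1, the first call to `record_failure()` already returns True, so a breaker built on this counter would trip on its first failure.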
Examples
Example 1: Successful Calls and then Failure
Let's assume:
failure_threshold = 3
recovery_timeout = 5 seconds
exceptions = (ValueError,)

import time

# CircuitBreaker is the class you are asked to implement.

class CircuitBreakerOpenError(Exception):
    pass

class MockService:
    def __init__(self):
        self.call_count = 0

    def potentially_failing_operation(self, should_fail_after_calls=4):
        # Fails on every call once call_count reaches should_fail_after_calls.
        self.call_count += 1
        if self.call_count >= should_fail_after_calls:
            raise ValueError("Operation failed due to simulated error")
        return f"Operation successful (call #{self.call_count})"

# --- Circuit Breaker Usage ---
mock_service = MockService()
cb = CircuitBreaker(failure_threshold=3, recovery_timeout=5, exceptions=(ValueError,))

# Calls 1-3: Success
for _ in range(3):
    print(cb.call(mock_service.potentially_failing_operation))

# Calls 4-6: Three consecutive failures (the third one trips the breaker)
for _ in range(3):
    try:
        print(cb.call(mock_service.potentially_failing_operation))
    except ValueError as e:
        print(f"Caught expected error: {e}")

# Call 7: Breaker is OPEN, so the service is not invoked
try:
    print(cb.call(mock_service.potentially_failing_operation))
except CircuitBreakerOpenError as e:
    print(f"Caught circuit breaker error: {e}")

# Wait for recovery timeout
time.sleep(6)

# Call 8: Breaker is HALF-OPEN, success should reset it
# A high should_fail_after_calls makes the mock service succeed again.
print(cb.call(mock_service.potentially_failing_operation, should_fail_after_calls=100))

# Call 9: Breaker should now be CLOSED again
print(cb.call(mock_service.potentially_failing_operation, should_fail_after_calls=100))

Expected Output for Example 1:
Operation successful (call #1)
Operation successful (call #2)
Operation successful (call #3)
Caught expected error: Operation failed due to simulated error
Caught expected error: Operation failed due to simulated error
Caught expected error: Operation failed due to simulated error
Caught circuit breaker error: Circuit breaker is open.
Operation successful (call #7)
Operation successful (call #8)

Explanation:
The first three calls succeed. Calls four through six each raise a ValueError. The third consecutive failure brings the failure count to 3, which trips the breaker to OPEN. The seventh call is immediately rejected with CircuitBreakerOpenError, so the mock service is not invoked and its call count stays at 6. After a 6-second wait (longer than recovery_timeout), the breaker becomes HALF-OPEN. The eighth call is allowed through and succeeds, resetting the breaker to CLOSED. The ninth call also succeeds.
Example 2: Open State Timeout and Subsequent Failure in Half-Open
This example uses failure_threshold = 2, recovery_timeout = 3 seconds, and exceptions = (ConnectionError,).

import time

# CircuitBreaker is the class you are asked to implement.

class CircuitBreakerOpenError(Exception):
    pass

class MockService:
    def __init__(self):
        self.call_count = 0

    def get_status(self, fail_on_attempt=None):
        # Fails only when the current call number equals fail_on_attempt.
        self.call_count += 1
        if fail_on_attempt is not None and self.call_count == fail_on_attempt:
            raise ConnectionError("Simulated network issue")
        return "Service is healthy"

# --- Circuit Breaker Usage ---
mock_service = MockService()
cb = CircuitBreaker(failure_threshold=2, recovery_timeout=3, exceptions=(ConnectionError,))

# Two failures to open the circuit
for _ in range(2):
    mock_service.call_count = 0  # Reset so that fail_on_attempt=1 triggers
    try:
        print(cb.call(mock_service.get_status, fail_on_attempt=1))
    except ConnectionError as e:
        print(f"Caught expected error: {e}")

print("Circuit tripped. Waiting for recovery timeout...")
time.sleep(4)  # Wait for recovery timeout

# First call in HALF-OPEN state, but it fails
mock_service.call_count = 0  # Reset for next call
try:
    print(cb.call(mock_service.get_status, fail_on_attempt=1))  # This call should fail
except ConnectionError as e:
    print(f"Caught expected error in HALF-OPEN: {e}")

# Circuit should now be OPEN again
print("Circuit is OPEN again. Trying a call...")
try:
    print(cb.call(mock_service.get_status))
except CircuitBreakerOpenError as e:
    print(f"Caught circuit breaker error: {e}")

Expected Output for Example 2:
Caught expected error: Simulated network issue
Caught expected error: Simulated network issue
Circuit tripped. Waiting for recovery timeout...
Caught expected error in HALF-OPEN: Simulated network issue
Circuit is OPEN again. Trying a call...
Caught circuit breaker error: Circuit breaker is open.

Explanation:
Two calls raise ConnectionError (simulated by fail_on_attempt=1 each time after resetting call_count), tripping the circuit to OPEN. After waiting 4 seconds, the circuit becomes HALF-OPEN. The first call in this state is configured to fail, causing the circuit breaker to immediately revert to the OPEN state. A subsequent call is then rejected with CircuitBreakerOpenError without reaching the service.
Constraints
- failure_threshold: Must be an integer greater than or equal to 1.
- recovery_timeout: Must be a non-negative float or integer representing seconds.
- exceptions: Must be a tuple of valid Python exception classes.
- The call method should accept the function to be executed as its first argument and any subsequent positional or keyword arguments that should be passed to that function.
- The circuit breaker should be thread-safe for basic state transitions (though full concurrency handling for complex scenarios is out of scope for this core implementation).
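The parameter constraints could be checked up front, for instance in the constructor. The `validate_config` function below is a hypothetical helper. The choice of TypeError versus ValueError and the rejection of booleans are assumptions made for this sketch, not requirements of the challenge.

```python
def validate_config(failure_threshold, recovery_timeout, exceptions):
    """Raise TypeError or ValueError if a parameter violates the constraints."""
    # bool is a subclass of int, so it is rejected explicitly (an assumption).
    if isinstance(failure_threshold, bool) or not isinstance(failure_threshold, int):
        raise TypeError("failure_threshold must be an integer")
    if failure_threshold < 1:
        raise ValueError("failure_threshold must be >= 1")

    if isinstance(recovery_timeout, bool) or not isinstance(recovery_timeout, (int, float)):
        raise TypeError("recovery_timeout must be an int or float")
    if recovery_timeout < 0:
        raise ValueError("recovery_timeout must be non-negative")

    if not isinstance(exceptions, tuple):
        raise TypeError("exceptions must be a tuple")
    for exc in exceptions:
        # Each entry must be an exception class, not an instance.
        if not (isinstance(exc, type) and issubclass(exc, BaseException)):
            raise TypeError(f"{exc!r} is not an exception class")
```

Under these checks a failure_threshold of 0 is rejected with ValueError, while a recovery_timeout of 0 is accepted.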
Notes
- You will need to implement your own CircuitBreakerOpenError exception.
- The time module will be essential for tracking the recovery_timeout.
- Consider how you will manage the state transitions and the failure counter efficiently.
- For the HALF_OPEN state, a common approach is to allow a single "test" call: if it succeeds, reset to CLOSED; if it fails, revert to OPEN.
- Think about how to pass arbitrary arguments (*args, **kwargs) to the protected function.
- You may want to use an enum or constants for the circuit breaker states (CLOSED, OPEN, HALF_OPEN).
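As an illustration of forwarding *args and **kwargs, the sketch below wraps a function so that every invocation goes through a breaker's call method. The `protected_by` decorator is a hypothetical convenience that the challenge does not ask for, and `PassThroughBreaker` is only a stand-in with the same call signature so that the snippet runs on its own.

```python
import functools


def protected_by(breaker):
    """Hypothetical decorator: route every call through breaker.call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Positional and keyword arguments are forwarded unchanged.
            return breaker.call(func, *args, **kwargs)
        return wrapper
    return decorator


class PassThroughBreaker:
    """Stand-in with the same call signature, used only for this demo."""

    def call(self, func, *args, **kwargs):
        return func(*args, **kwargs)


@protected_by(PassThroughBreaker())
def greet(name, punctuation="!"):
    return f"Hello, {name}{punctuation}"
```

Calling `greet("Ada", punctuation="?")` passes both arguments through the breaker to the wrapped function; a real CircuitBreaker instance could be substituted for the stand-in.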