Robust Service Protection: Implementing a Circuit Breaker in Go

Circuit breakers are a crucial pattern for building resilient distributed systems. They prevent cascading failures by temporarily stopping requests to a failing service, allowing it time to recover and preventing resource exhaustion. This challenge asks you to implement a basic circuit breaker in Go to protect a simulated downstream service.

Problem Description

You are tasked with creating a CircuitBreaker struct in Go that monitors the success and failure rates of calls to a downstream service. The circuit breaker should have three states: Closed, Open, and Half-Open.

  • Closed: Requests are allowed to pass through to the downstream service. The circuit breaker tracks success and failure counts. If the failure count reaches a defined threshold within a sliding window, the circuit breaker transitions to the Open state.
  • Open: Requests are immediately rejected without calling the downstream service. A timer is started, and after a specified timeout, the circuit breaker transitions to the Half-Open state.
  • Half-Open: A limited number of test requests are allowed to pass through to the downstream service. If these requests succeed, the circuit breaker transitions back to the Closed state. If they fail, it returns to the Open state.
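
The three states can be modelled as a small enumerated type. The following is only a sketch: the type name State and the constant names are assumptions, since the challenge does not prescribe them.

```go
package main

// State is a hypothetical name for the breaker's current mode.
type State int

const (
	Closed   State = iota // calls pass through and outcomes are tracked
	Open                  // calls are rejected without reaching the service
	HalfOpen              // a limited number of probe calls are allowed
)

// String makes states readable in logs and test failures.
func (s State) String() string {
	switch s {
	case Closed:
		return "Closed"
	case Open:
		return "Open"
	case HalfOpen:
		return "Half-Open"
	default:
		return "Unknown"
	}
}
```

Making Closed the zero value means a freshly declared breaker starts in the state that lets traffic through.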

The CircuitBreaker should provide a Call method that attempts to execute a provided function (representing a call to the downstream service). The Call method should handle the circuit breaker's state and return an error if the circuit is open.

Key Requirements:

  • Implement the CircuitBreaker struct with Closed, Open, and Half-Open states.
  • Track failures over a sliding window of the most recent calls.
  • Implement timeouts for transitioning from Open to Half-Open.
  • Implement limited test requests in the Half-Open state.
  • Provide a Call method that respects the circuit breaker's state.

Expected Behavior:

  • When the circuit is Closed and calls succeed, the failure count should not increase.
  • When the circuit is Closed and calls fail, the failure count should increase. Once the failure threshold is reached, the circuit should transition to Open.
  • When the circuit is Open, all calls should immediately return an error without attempting to call the downstream service.
  • After the timeout period in the Open state, the circuit should transition to Half-Open.
  • In the Half-Open state, a limited number of test requests should be allowed. Success transitions to Closed, failure transitions back to Open.
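
One possible shape for the behavior listed above is sketched below. It is not a reference solution. The names ErrCircuitOpen, errDownstream, NewCircuitBreaker, and manualClock are hypothetical, the clock is injected so the timeout can be exercised without sleeping, and the sketch assumes that all three Half-Open probes must succeed before the circuit closes while a single failed probe reopens it. The window is a plain slice here for brevity.

```go
package main

import (
	"errors"
	"time"
)

var (
	ErrCircuitOpen = errors.New("circuit breaker is open") // hypothetical sentinel
	errDownstream  = errors.New("downstream failed")       // stand-in failure for demos
)

type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// CircuitBreaker is a single-goroutine sketch; it does no locking.
type CircuitBreaker struct {
	state      State
	outcomes   []bool // most recent Closed-state calls, true = failure
	windowSize int
	threshold  int
	timeout    time.Duration
	maxProbes  int
	probes     int // successful Half-Open probes so far
	openedAt   time.Time
	now        func() time.Time // injectable clock (an assumption, eases testing)
}

func NewCircuitBreaker(now func() time.Time) *CircuitBreaker {
	return &CircuitBreaker{
		windowSize: 10, threshold: 5, timeout: 5 * time.Second, maxProbes: 3, now: now,
	}
}

func (cb *CircuitBreaker) State() State { return cb.state }

// Call runs fn unless the circuit is Open and the timeout has not yet elapsed.
func (cb *CircuitBreaker) Call(fn func() error) error {
	if cb.state == Open {
		if cb.now().Sub(cb.openedAt) < cb.timeout {
			return ErrCircuitOpen // fail fast; fn is never invoked
		}
		cb.state, cb.probes = HalfOpen, 0 // timeout elapsed: begin probing
	}
	err := fn()
	switch {
	case cb.state == HalfOpen && err != nil:
		cb.trip() // a failed probe reopens the circuit
	case cb.state == HalfOpen:
		cb.probes++
		if cb.probes >= cb.maxProbes {
			cb.state, cb.outcomes = Closed, nil // recovered: start a fresh window
		}
	default: // Closed
		cb.record(err != nil)
	}
	return err
}

// record appends an outcome, keeps only the last windowSize, and trips at the threshold.
func (cb *CircuitBreaker) record(failed bool) {
	cb.outcomes = append(cb.outcomes, failed)
	if len(cb.outcomes) > cb.windowSize {
		cb.outcomes = cb.outcomes[1:] // drop the oldest outcome
	}
	failures := 0
	for _, f := range cb.outcomes {
		if f {
			failures++
		}
	}
	if failures >= cb.threshold {
		cb.trip()
	}
}

func (cb *CircuitBreaker) trip() {
	cb.state, cb.openedAt, cb.outcomes = Open, cb.now(), nil
}

// manualClock is a hypothetical helper whose time moves only when Advance is called.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }
```

In this sketch the Open to Half-Open transition is evaluated lazily on the next Call rather than by a background timer, which avoids a goroutine but means the state only changes when a caller arrives. Passing time.Now to NewCircuitBreaker gives real-time behavior, while passing a manualClock's Now method lets a test step past the 5-second timeout instantly.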

Edge Cases to Consider:

  • What happens if the downstream service is consistently failing?
  • What happens if the downstream service recovers quickly?
  • How does the circuit breaker handle concurrent calls? (Concurrency safety is not explicitly required for this basic implementation, but consider it.)
  • What happens if the timeout period is very short or very long?
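
For the concurrency question, one common approach is to guard the breaker's fields with a sync.Mutex and to release the lock while the downstream function runs. The sketch below is deliberately reduced: SafeBreaker, IsOpen, hammer, and the error variables are hypothetical names, and it counts total failures instead of using a sliding window so that the locking pattern stays visible.

```go
package main

import (
	"errors"
	"sync"
)

var (
	ErrCircuitOpen = errors.New("circuit breaker is open") // hypothetical sentinel
	errDownstream  = errors.New("downstream failed")       // stand-in failure for demos
)

// SafeBreaker only has Closed and Open behavior; it exists to show the locking.
type SafeBreaker struct {
	mu        sync.Mutex
	failures  int
	threshold int
	open      bool
}

func (b *SafeBreaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.open {
		b.mu.Unlock()
		return ErrCircuitOpen
	}
	b.mu.Unlock() // release before the (possibly slow) downstream call

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.threshold {
			b.open = true
		}
	}
	return err
}

func (b *SafeBreaker) IsOpen() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.open
}

// hammer issues n concurrent calls and waits for all of them to finish.
func hammer(b *SafeBreaker, n int, fn func() error) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = b.Call(fn)
		}()
	}
	wg.Wait()
}
```

Because the lock is not held during fn, calls already in flight when the threshold is reached still complete, so more than threshold failures may be observed before rejection starts. Running such code under go test -race is a reasonable way to check that no field is touched without the lock.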

Examples

Example 1:

Input: Repeated failures to a downstream service, exceeding the failure threshold.
Output: CircuitBreaker state transitions from Closed -> Open. Subsequent calls return an error immediately.
Explanation: The circuit breaker detects the high failure rate and opens the circuit to prevent further load on the failing service.

Example 2:

Input: After a timeout period in the Open state, a few successful test requests are made in the Half-Open state.
Output: CircuitBreaker state transitions from Open -> Half-Open -> Closed. Subsequent calls are allowed to pass through.
Explanation: The circuit breaker determines that the downstream service has recovered and allows normal operation to resume.

Example 3:

Input: After a timeout period in the Open state, several failed test requests are made in the Half-Open state.
Output: CircuitBreaker state transitions from Open -> Half-Open -> Open. Subsequent calls return an error immediately.
Explanation: The circuit breaker determines that the downstream service is still failing and returns to the Open state, restarting the timeout.

Constraints

  • Sliding Window Size: The sliding window for failure rate calculation should be 10 calls.
  • Failure Threshold: The failure threshold should be 5 failures within the sliding window.
  • Timeout Duration: The timeout duration for transitioning from Open to Half-Open should be 5 seconds.
  • Half-Open Test Requests: Allow a maximum of 3 test requests in the Half-Open state.
  • Error Type: The Call method should return a value of Go's built-in error interface type.
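
The numeric constraints above could be collected in one place rather than scattered as literals. This is a minimal sketch; the Config type, its field names, and DefaultConfig are assumptions and not part of the challenge.

```go
package main

import "time"

// Config gathers the breaker's tunables; all names here are hypothetical.
type Config struct {
	WindowSize       int           // number of recent calls considered
	FailureThreshold int           // failures within the window that open the circuit
	OpenTimeout      time.Duration // time spent Open before probing
	HalfOpenMaxCalls int           // probe calls allowed while Half-Open
}

// DefaultConfig returns the values stated in the constraints.
func DefaultConfig() Config {
	return Config{
		WindowSize:       10,
		FailureThreshold: 5,
		OpenTimeout:      5 * time.Second,
		HalfOpenMaxCalls: 3,
	}
}
```

Keeping the values in a struct also makes it easy for tests to substitute a much shorter timeout.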

Notes

  • You can use the time package for timeouts.
  • Consider using a sync.Mutex or channels for synchronization if you want to explore concurrency safety (though not strictly required).
  • Focus on the core logic of the circuit breaker. Error handling for the downstream service itself is not required.
  • The sliding window can be implemented using a circular buffer or a similar data structure.
  • This is a simplified implementation. Real-world circuit breakers often include more sophisticated features like metrics collection, configurable timeouts, and fallback mechanisms.
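
The circular buffer mentioned in the notes might look like the sketch below. The names failureWindow, newFailureWindow, and Record are hypothetical, and the size is assumed to be greater than zero. A running failure count is kept so that each call costs constant time.

```go
package main

// failureWindow remembers the outcome of the last len(buf) calls in a ring buffer.
type failureWindow struct {
	buf      []bool // true marks a failed call
	next     int    // index the next outcome will overwrite
	failures int    // running count of true entries in buf
}

// newFailureWindow assumes size > 0.
func newFailureWindow(size int) *failureWindow {
	return &failureWindow{buf: make([]bool, size)}
}

// Record stores one outcome, evicting the oldest one once the buffer has
// wrapped, and returns the number of failures currently in the window.
func (w *failureWindow) Record(failed bool) int {
	if w.buf[w.next] {
		w.failures-- // the slot being overwritten held a failure
	}
	w.buf[w.next] = failed
	if failed {
		w.failures++
	}
	w.next = (w.next + 1) % len(w.buf)
	return w.failures
}
```

With a window of 10, a breaker would open when Record returns 5 or more. Unused slots start as false, so a partially filled window needs no special handling.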