Hone logo
Hone
Problems

Distributed Leader Election in Go

Leader election is a fundamental problem in distributed systems. It allows a group of nodes to automatically choose one node as the leader, ensuring that only one node is responsible for coordinating tasks and making decisions. This challenge asks you to implement a basic leader election algorithm in Go using a simple heartbeat mechanism.

Problem Description

You are tasked with implementing a leader election algorithm for a cluster of Go nodes. Each node periodically sends a "heartbeat" signal. The node that sends the most recent heartbeat is considered the leader. If a node fails to send a heartbeat within a specified timeout period, the remaining nodes will initiate a new election.

What needs to be achieved:

  • Implement a Node struct representing a node in the cluster. Each node should have a unique ID, a channel for receiving heartbeats from other nodes, and a channel to send its own heartbeats.
  • Implement a LeaderElection function that takes a slice of Node structs as input and simulates the leader election process.
  • The function should return the ID of the elected leader.
  • Nodes should periodically send heartbeats to each other.
  • If a node fails to receive a heartbeat from the current leader within a timeout period, it should initiate a new election.

Key Requirements:

  • Heartbeat Mechanism: Nodes should periodically send heartbeats to all other nodes in the cluster.
  • Timeout: A timeout period should be defined. If a node doesn't receive a heartbeat within this period, it assumes the leader has failed.
  • Election Trigger: When a leader failure is detected, a new election should be triggered.
  • Unique Leader: Only one node should be elected as the leader at any given time.
  • Concurrency: The solution should be concurrent to simulate a distributed environment.

Expected Behavior:

  • Initially, all nodes are candidates for the leader.
  • The node that sends the first heartbeat becomes the initial leader.
  • Subsequent heartbeats from the leader maintain its leadership.
  • If the leader fails (doesn't send a heartbeat within the timeout), a new election is triggered, and a new leader is chosen.
  • The LeaderElection function should return the ID of the current leader.

Edge Cases to Consider:

  • Empty Cluster: What happens if the input slice of nodes is empty?
  • Single Node Cluster: What happens if there's only one node in the cluster?
  • Node Failure During Election: What happens if a node fails during the election process?
  • Network Partitioning: (This is a more advanced consideration, but think about how your solution might behave if nodes become isolated from each other.)

Examples

Example 1:

Input: [{ID: "A", HeartbeatChan: make(chan string), SendChan: make(chan string)}, {ID: "B", HeartbeatChan: make(chan string), SendChan: make(chan string)}, {ID: "C", HeartbeatChan: make(chan string), SendChan: make(chan string)}]
Output: "A" (or "B" or "C" - the first node to send a heartbeat)
Explanation: All nodes start sending heartbeats. The first node to send its heartbeat is elected as the leader.

Example 2:

Input: [{ID: "A", HeartbeatChan: make(chan string), SendChan: make(chan string)}, {ID: "B", HeartbeatChan: make(chan string), SendChan: make(chan string)}]
Output: "A" (or "B" - the first node to send a heartbeat)
Explanation: Similar to Example 1, but with only two nodes.

Example 3: (Simulating Leader Failure)

Input: [{ID: "A", HeartbeatChan: make(chan string), SendChan: make(chan string)}, {ID: "B", HeartbeatChan: make(chan string), SendChan: make(chan string)}]
Output: "B" (after node "A" fails to send a heartbeat within the timeout)
Explanation: Node "A" is initially elected as the leader.  If "A" stops sending heartbeats, "B" detects the failure and becomes the new leader.

Constraints

  • Number of Nodes: The input slice of nodes can contain between 1 and 10 nodes (inclusive).
  • Node ID: Node IDs are strings and are unique within the cluster.
  • Heartbeat Interval: The heartbeat interval should be between 100ms and 500ms (inclusive).
  • Timeout Period: The timeout period should be between 1 second and 3 seconds (inclusive).
  • Concurrency: The solution must be concurrent to simulate a distributed environment. Use goroutines and channels effectively.
  • No External Dependencies: Do not use external libraries for leader election. Implement the core logic yourself.

Notes

  • Consider using channels to communicate between nodes.
  • Think about how to handle node failures and trigger new elections.
  • The heartbeat interval and timeout period are crucial parameters that affect the stability and responsiveness of the leader election algorithm.
  • This is a simplified implementation. Real-world leader election algorithms are more complex and robust, handling issues like network partitions and message loss.
  • Focus on the core concepts of leader election: heartbeat mechanism, timeout, and election trigger.
  • Error handling is not required for this challenge, but consider how you might handle errors in a production environment.
Loading editor...
go