Byzantine Fault Tolerance Consensus in Go
Distributed consensus is a fundamental problem in distributed systems, ensuring that a group of machines agree on a single value even in the presence of failures. This challenge asks you to implement a simplified version of a Byzantine Fault Tolerance (BFT) consensus algorithm in Go, specifically focusing on the Practical Byzantine Fault Tolerance (PBFT) approach. This is useful for building reliable and fault-tolerant systems like blockchains, distributed databases, and replicated state machines.
Problem Description
You are tasked with building a basic PBFT consensus system among a set of nodes. The system should be able to agree on a proposed value, even if some nodes are faulty (Byzantine) and may send incorrect or malicious messages. The system will operate in rounds. Each round consists of a proposer, a set of acceptors, and a set of learners. The proposer proposes a value, the acceptors vote on the value, and the learners receive the agreed-upon value.
What needs to be achieved:
- Implement a simplified PBFT consensus protocol.
- Simulate a network of nodes, some of which may be faulty.
- Ensure that the system reaches consensus on a proposed value, even with faulty nodes.
Key Requirements:
- Node Structure: Each node should have an ID, a state (e.g., proposer, acceptor, learner), and a message queue.
- Proposer: The proposer proposes a value to all acceptors.
- Acceptors: Acceptors receive proposals, vote on them (accept or reject), and broadcast their votes to other acceptors and learners.
- Learners: Learners collect votes from acceptors and determine the consensus value.
- Fault Tolerance: The system should tolerate f faulty nodes, where f is less than (n/3), where n is the total number of nodes. This means at most (n-1)/3 nodes can be faulty.
- Round Management: Implement a simple round-based system. Each round has a single proposer. The proposer rotates among the nodes.
- Message Passing: Simulate message passing between nodes. Faulty nodes can send incorrect or malicious messages.
Expected Behavior:
- Given a proposed value, the system should eventually reach consensus on that value if the number of faulty nodes is less than (n/3).
- The system should handle faulty nodes gracefully, preventing them from disrupting the consensus process.
- The system should rotate the proposer role among the nodes in each round.
Edge Cases to Consider:
- All nodes are faulty: The system should not reach consensus.
- Network delays: Simulate network delays to make the system more realistic.
- Duplicate messages: Handle duplicate messages from faulty nodes.
- Proposer failure: What happens if the proposer fails before sending the proposal? (Consider a timeout mechanism).
Examples
Example 1:
Input: Nodes = [Node{ID: 1, Role: "Proposer"}, Node{ID: 2, Role: "Acceptor"}, Node{ID: 3, Role: "Acceptor"}], Proposed Value = "A", Faulty Nodes = []
Output: Consensus Value = "A"
Explanation: With no faulty nodes, the proposer (Node 1) proposes "A". Both acceptors (Node 2 and Node 3) accept the proposal, and the learners determine the consensus value to be "A".
Example 2:
Input: Nodes = [Node{ID: 1, Role: "Proposer"}, Node{ID: 2, Role: "Acceptor"}, Node{ID: 3, Role: "Acceptor"}], Proposed Value = "B", Faulty Nodes = [Node{ID: 2}]
Output: Consensus Value = "B"
Explanation: Node 1 proposes "B". Node 3 accepts the proposal. Node 2 (faulty) might send a conflicting message, but since only one node is faulty and 2 > (3-1)/3, the system still reaches consensus on "B" based on the majority vote from Node 3.
Example 3: (Edge Case)
Input: Nodes = [Node{ID: 1, Role: "Proposer"}, Node{ID: 2, Role: "Acceptor"}, Node{ID: 3, Role: "Acceptor"}], Proposed Value = "C", Faulty Nodes = [Node{ID: 2}, Node{ID: 3}]
Output: No Consensus Reached
Explanation: With two faulty nodes (Node 2 and Node 3), the system cannot guarantee consensus because 2 >= (3-1)/3. The acceptors might send conflicting votes, preventing the learners from determining a single consensus value.
Constraints
- Number of Nodes (n): 3 <= n <= 7
- Number of Faulty Nodes (f): 0 <= f < (n/3)
- Message Delay: Simulate message delays up to 100ms.
- Round Limit: Limit the number of rounds to 10 to prevent infinite loops.
- Data Types: Use strings for the proposed value and node IDs.
Notes
- This is a simplified implementation of PBFT. Real-world PBFT implementations are significantly more complex.
- Focus on the core concepts of proposal, voting, and consensus.
- You can use goroutines and channels to simulate concurrent message passing between nodes.
- Consider using a simple timeout mechanism to handle proposer failures.
- The "faulty" behavior of nodes can be simulated by randomly sending incorrect votes or delaying messages.
- Error handling is important, but for simplicity, you can focus on the core consensus logic.
- A
Nodestruct should be defined with at leastID(string) andRole(string - "Proposer", "Acceptor", "Learner") fields. You may add more fields as needed.