# Lecture 021 - Byzantine

The Problem: Several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. After observing the enemy, they must decide upon a common plan of action. Some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement.

commander: sends a command

lieutenants: listen to the command and act on it truthfully

Goal: All loyal generals decide upon the same plan of action, and a small number of traitors cannot cause the loyal generals to adopt a bad plan. (Each nonfaulty process learns the true values sent by every other nonfaulty process.)

Example, Paxos under Byzantine faults: with 3 servers, a malicious server can always respond with ACCEPT, causing the other servers to commit different results.

Example, Quorums under Byzantine faults:

• with fail-stop faults, we only need $2f+1$ nodes (i.e. can tolerate about half failing), since Paxos only relies on any two quorums overlapping in at least one node

• if the intersection happens to be occupied entirely by Byzantine nodes, the two quorums cannot be reconciled, so the overlap must contain at least $f+1$ nodes (guaranteeing at least one honest node in it)

• for liveness, the quorum size is at most $N - f$, since the $f$ Byzantine nodes may never respond, and we cannot let them block a quorum from forming

• so with quorum size $N-f$, the overlap of two quorums is at least $(N-f) + (N-f) - N = N - 2f$. Requiring $N - 2f \geq f + 1$ as stated above gives $N \geq 3f + 1$
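The quorum arithmetic above can be checked mechanically. A minimal sketch (function names are illustrative, not from the lecture): with quorum size $N-f$, two quorums overlap in at least $2(N-f) - N$ nodes, and safety requires that overlap to contain at least $f+1$ nodes.

```python
# Quorum-size arithmetic sketch: for N nodes and f Byzantine faults,
# liveness caps the quorum size at N - f, and any two quorums of that
# size overlap in at least 2(N - f) - N = N - 2f nodes. Safety needs
# the overlap to contain >= f + 1 nodes, which forces N >= 3f + 1.

def min_quorum_overlap(n: int, quorum: int) -> int:
    """Smallest possible intersection of two quorums of the given size."""
    return max(0, 2 * quorum - n)

def tolerates(n: int, f: int) -> bool:
    """True if quorums of size n - f still overlap in >= f + 1 nodes."""
    return min_quorum_overlap(n, n - f) >= f + 1

assert not tolerates(3, 1)   # 3 nodes cannot tolerate 1 Byzantine fault
assert tolerates(4, 1)       # 3f + 1 = 4 nodes can
assert all(tolerates(3 * f + 1, f) for f in range(1, 10))
```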

Impossibility: No solution with fewer than $3f + 1$ generals can cope with $f$ traitors.

## Byzantine Fault Tolerance (Lamport)

Byzantine Agreement Assumption:

• ordered messages

• with bounded communication delay

• no lost messages

• synchronous

• unicast

• known sender

Problem: Byzantine voting problem.

Intuition: if process $A$ lies about its vote, then we can define process $A$'s vote as the majority of what the other processes $B, C, D$ hear from $A$.

Algorithm:

1. every process broadcasts its vote
2. every process broadcasts what it heard from every other process
3. a first majority vote decides what each process actually voted for
4. a second majority vote decides the overall voting result

Assume P3 is malicious. P1 can build the following table by collecting information from P2, P3, P4. An x entry is malicious data (where the original data is 1), and - represents N/A since P1 does not hear reports about itself from itself:

| Hear From \ Vote | P1 | P2 | P3 | P4 |
| --- | --- | --- | --- | --- |
| P1 | - | - | - | - |
| P2 | 1 | 1 | x | 1 |
| P3 | x | x | x | x |
| P4 | 1 | 1 | x | 1 |

Taking the majority along each column, P1 concludes that P1, P2, P4 all voted 1 and P3 voted x. From these per-process votes, a second majority vote gives 1 as the result.
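The two majority votes above can be sketched as follows (the `reports` structure and names are illustrative assumptions, not from the lecture). P3 is Byzantine and reports garbage `x`; the honest processes voted 1:

```python
# Sketch of the two-level majority vote: P1 reconstructs each process's
# "effective vote" as the majority of what P2, P3, P4 claim to have
# heard from it, then takes a majority over those effective votes.
from collections import Counter

# reports[reporter][subject] = vote that `reporter` claims to have
# heard from `subject`; P1 does not report to itself.
reports = {
    "P2": {"P1": 1, "P2": 1, "P3": "x", "P4": 1},
    "P3": {"P1": "x", "P2": "x", "P3": "x", "P4": "x"},  # malicious data
    "P4": {"P1": 1, "P2": 1, "P3": "x", "P4": 1},
}

def majority(values):
    return Counter(values).most_common(1)[0][0]

# First majority: per-column vote deciding what each process "really" voted.
effective = {p: majority([reports[r][p] for r in reports])
             for p in ["P1", "P2", "P3", "P4"]}
# Second majority: the overall decision.
decision = majority(effective.values())

print(effective)  # P1, P2, P4 resolve to 1; P3 resolves to 'x'
print(decision)   # 1
```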

## Async. Practical Byzantine Fault Tolerance (Liskov)

Problem: correctly replicate an opcode in a system of Replicated State Machines (RSM). An example is Ethereum Classic.

[CS198.2x Week 1] Practical Byzantine Fault Tolerance

Practical Byzantine Fault Assumption:

• assume relatively small message delay

• only a small fraction of nodes are Byzantine

• Static configuration ($3f+1$ nodes would not leave or join)

• Primary-Backup Replication + Quorums

• 3-phase protocol to agree on a sequence number (to deal with a malicious primary)

• big quorum size of $2f+1$ out of $3f+1$ nodes (to deal with loss of agreement)

• authenticate communications (public key signatures, MACs)

Replica Stores:

• replicaID

• viewNumber (Primary = viewNumber % N)

• []log of <opcode, sequenceNumber, {PRE-PREPARED, PREPARED, COMMITTED}>

Algorithm: using a Replicated State Machine (RSM) with $3f+1$ replicas

1. client sends opcode to the Primary
2. Primary broadcasts Pre-prepare<viewNumber, sequenceNumber, opcode> (and puts it in its log)
3. Replicas receive the broadcast and determine if the Pre-prepare<> is valid; if so, they broadcast Prepare<replicaID, viewNumber, sequenceNumber, opcode> (and put it in the log), then wait until $2f+1$ Prepare<> messages arrive from other replicas doing the same thing. Validity checks:
   1. the crypto signature is valid
   2. the viewNumber in the message matches the stored viewNumber
   3. has not accepted another Pre-prepare<> with the same viewNumber
   4. has not accepted the same sequenceNumber
   5. the above ensure that if <opcode1, viewNumber, sequenceNumber, replicaID1> is in []log, then there is no <opcode2, viewNumber, sequenceNumber, replicaID2> in []log
   6. the above ensure: all honest nodes that are prepared have the same opcode
   7. the above ensure: at least $f+1$ honest nodes have sent Prepare<> and Pre-prepare<>
4. Replicas that received $2f+1$ Prepare<> send Commit<replicaID, viewNumber, sequenceNumber, opcode> and wait until $2f+1$ Commit<> messages arrive from other replicas doing the same thing
5. Replicas that received $2f+1$ Commit<> (if we assume viewNumber can change, then we only need $f+1$) put it in the log and send the result to the client
6. Client waits for $f+1$ matching replies before committing
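The per-entry state transitions above can be sketched as follows. This is a minimal sketch of the quorum-certificate counting only (the `LogEntry` structure is an assumption for illustration, not Liskov's implementation): an entry advances from PRE-PREPARED to PREPARED after $2f+1$ matching Prepare<> messages, and to COMMITTED after $2f+1$ matching Commit<> messages.

```python
# Sketch of PBFT's quorum counting for one log entry, with f = 1
# (so N = 3f + 1 = 4 replicas and quorum size 2f + 1 = 3).
from dataclasses import dataclass, field

F = 1                 # tolerated Byzantine faults
QUORUM = 2 * F + 1    # quorum size out of N = 3f + 1 replicas

@dataclass
class LogEntry:
    opcode: str
    view: int
    seq: int
    status: str = "PRE-PREPARED"
    prepares: set = field(default_factory=set)
    commits: set = field(default_factory=set)

    def on_prepare(self, replica_id: int) -> None:
        """Count a matching Prepare<>; advance once a quorum is reached."""
        self.prepares.add(replica_id)
        if self.status == "PRE-PREPARED" and len(self.prepares) >= QUORUM:
            self.status = "PREPARED"

    def on_commit(self, replica_id: int) -> None:
        """Count a matching Commit<>; advance once a quorum is reached."""
        self.commits.add(replica_id)
        if self.status == "PREPARED" and len(self.commits) >= QUORUM:
            self.status = "COMMITTED"

entry = LogEntry(opcode="set x=1", view=0, seq=7)
for rid in [1, 2, 3]:          # 2f + 1 matching Prepare<> messages
    entry.on_prepare(rid)
for rid in [1, 2, 3]:          # then 2f + 1 matching Commit<> messages
    entry.on_commit(rid)
print(entry.status)            # COMMITTED
```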
