Lecture 011

Distributed Replication

To deal with failure, we sometimes replicate services for efficiency and reliability.

Read-Only: availability boost and performance boost (e.g. CDNs, spreading server load)

Causal Consistency: positive example

Causal Consistency: negative example

Read-only replication is easy, but read-write replication is tricky (consistency is harder to achieve).

What to replicate:

When to replicate: Push vs Pull

Primary-backup Replication Model

We assume:
- there is a manager that allows replica nodes to join and leave
- a fail-stop (not Byzantine) failure model
- servers don't lie, e.g. by claiming to have completed an operation when they have not
- messages may be delayed or lost
- failures can be detected: we have a failure detector, but it has latency
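The last assumption (a failure detector with latency) can be illustrated with a timeout-based detector. This is only a minimal sketch: the `HeartbeatDetector` class, the node names, and the 3-second timeout are hypothetical, and the lecture does not say which detection mechanism is used.

```python
class HeartbeatDetector:
    """Timeout-based failure detector (hypothetical sketch, not from the lecture)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a node is suspected
        self.last_seen = {}      # node id -> time of its last heartbeat

    def heartbeat(self, node, now):
        # Record that `node` was alive at time `now`.
        self.last_seen[node] = now

    def suspected(self, now):
        # A node is suspected only after it has been silent longer than the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


detector = HeartbeatDetector(timeout=3.0)
detector.heartbeat("backup-1", now=0.0)
detector.heartbeat("backup-2", now=0.0)
detector.heartbeat("backup-2", now=2.0)   # backup-1 crashed right after t = 0

early = detector.suspected(now=2.5)   # the crash is not visible yet
late = detector.suspected(now=3.5)    # backup-1 has now been silent for over 3 s
```

Between the crash and the timeout expiring, the rest of the system still believes `backup-1` is alive; that window is the detection latency.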

Remote Write Protocol:

Failure Handling:

Asynchronous Replication: If you don't care about maintaining sequential consistency, you can reply to client before reaching agreement with backups (sometimes called "asynchronous replication").
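The difference between replying after and before the backups are updated can be sketched as follows. The `Primary` and `Backup` classes, the `flush` method, and the in-memory dictionaries are assumptions made for illustration; a real system would send messages over a network.

```python
class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value


class Primary:
    """Hypothetical primary that replicates either synchronously or asynchronously."""

    def __init__(self, backups, synchronous):
        self.store = {}
        self.backups = backups
        self.synchronous = synchronous
        self.pending = []   # updates acknowledged to the client but not yet on backups

    def write(self, key, value):
        self.store[key] = value
        if self.synchronous:
            # Update every backup before replying to the client.
            for backup in self.backups:
                backup.apply(key, value)
        else:
            # Reply immediately; backups catch up later.
            self.pending.append((key, value))
        return "ok"

    def flush(self):
        # Ship queued updates to the backups (asynchronous mode only).
        for key, value in self.pending:
            for backup in self.backups:
                backup.apply(key, value)
        self.pending.clear()


backup = Backup()
primary = Primary([backup], synchronous=False)
reply = primary.write("x", 1)          # client already has its answer
stale_read = backup.store.get("x")     # backup has not seen the write yet
primary.flush()
fresh_read = backup.store.get("x")     # backup has caught up
```

In the asynchronous case, a read served by the backup between `write` and `flush` returns stale data, and a primary crash in that window loses an acknowledged write.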

Consensus Replication Model

Advantages:
- fast response time even under failures
- no master; the system operates as long as a majority of machines is still alive
- to tolerate f failures, we must have 2f + 1 replicas
- replicated-write: a write goes to all replicas, not just one

Paxos, by Leslie Lamport, is a famous consensus protocol.
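The 2f + 1 rule is majority arithmetic, as this small sketch shows. The function names are hypothetical helpers, not part of any protocol.

```python
def replicas_needed(f):
    # Replicas required so that a majority survives f failures.
    return 2 * f + 1


def majority(n):
    # Smallest group size that is more than half of n replicas.
    return n // 2 + 1


def tolerated_failures(n):
    # Largest f such that n >= 2f + 1.
    return (n - 1) // 2


n = replicas_needed(2)          # 5 replicas to survive 2 failures
quorum = majority(n)            # 3 replicas form a majority
survivors = n - 2               # replicas left after 2 failures
```

With 5 replicas, 2 failures still leave 3 machines, which is exactly a majority. Any two majorities of the same group share at least one member, which is why a majority is enough to carry a decision forward. Adding a fourth replica to a group of three does not help: 4 replicas still tolerate only 1 failure.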

Fischer-Lynch-Paterson Impossibility Result: No deterministic algorithm (under asynchronous communication, where even one process may crash) exists that guarantees reaching consensus in a bounded amount of time on every run. (In practice, network delay is random.)

Paxos Consensus Algorithm

To create a replicated state machine, we only need a consistent replicated log (the same commands, in the same order, on every replica). It is the job of the consensus algorithm to keep the replicated log consistent.
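A minimal sketch of why the same log in the same order is enough: a deterministic state machine fed identical logs ends in identical states, while a reordered log can diverge. The `set`/`add` command format and the `apply_log` function are hypothetical, chosen only for illustration.

```python
def apply_log(log):
    """Replay a log of (op, key, value) commands on an empty key-value state."""
    state = {}
    for op, key, value in log:
        if op == "set":
            state[key] = value
        elif op == "add":
            state[key] = state.get(key, 0) + value
    return state


log = [("set", "x", 1), ("add", "x", 4), ("set", "y", 7)]

replica_a = apply_log(log)
replica_b = apply_log(log)                          # same log, same order
reordered = apply_log([log[1], log[0], log[2]])     # first two commands swapped
```

Here `replica_a` and `replica_b` agree, but the reordered replica ends with a different value for `x`, which is the inconsistency the consensus algorithm has to prevent.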

Basic Paxos

Problem: pick a single value once out of all proposed values.

Requirements:

Each machine consists of two parts

Terminology:

Bad Approaches to Problems:

ProposalSN: a total order on proposals
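One common way to get a total order on proposals is to pair a round counter with a unique server id and compare the pairs lexicographically. The sketch below assumes that construction; the field names and the `next_proposal` helper are hypothetical, and the lecture may define ProposalSN differently.

```python
from typing import NamedTuple


class ProposalSN(NamedTuple):
    """Proposal number as (round_no, server_id); tuples compare lexicographically."""
    round_no: int
    server_id: int   # unique per server, so no two servers produce equal numbers


def next_proposal(highest_seen, server_id):
    # Pick a number strictly larger than any proposal number seen so far.
    return ProposalSN(highest_seen.round_no + 1, server_id)


a = ProposalSN(1, 2)
b = ProposalSN(2, 1)
c = ProposalSN(2, 3)
newer = next_proposal(c, server_id=1)   # round 3 from server 1
```

The round number dominates the comparison, and the server id breaks ties between servers that picked the same round, so any two distinct proposals are always ordered.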

Phases:

Paxos is not live

Notice:

Multi-Paxos

Problem: combine several instances of basic Paxos to agree on a series of values, creating the log
