// QUESTION: "so the layers that don't need a protocol field are transport & up." is this sentence correct? https://edstem.org/us/courses/26903/discussion/1719927
// QUESTION: "There is only one data-link protocol" is this correct?
// QUESTION: do we consider the step where the host determines whether to use TCP vs UDP as part of the transport layer or the network layer?
// QUESTION: distinguish between server and client
// QUESTION: is that all in the slide for all checkpoint in P1? is the assignment fully established with all info, tests, specification avaliable? Only P1-PartA, and P1-Checkpoint released? What is PartB, any related?
Protocol Data
Message Type: connect
, data
, ack
(received previous message), cAck
(received all message up until this point)
Connection ID: uniquely identify each client-server connection
Sequence Number (sn
): increments with each data
message. // QUESTION: right? only for data message
isp
): randomly generated //QUESTION: Why track isn separately (make sense to not track, but just copy whatever for Ack, if not arrive in order) // TODO: understand (why random, overflow?, no nounce)Payload size (verification):
Checksum (verification): if doesn't match, drop message (check after truncate payload size). Check with CalculateChecksum()
.
Payload: data
Protocol Behavior
Message size: limited to single UDP (1000 bytes)
Ack:// QUESTION: right?
data
messageSliding Window Protocol: the window size determines the queue size for receiving message in order (by counting sequence number // QUESTION: right) // QUESTION: will older message be in window?
Max Un-ACKed Messages: 2 (must be smaller than window size). Track using sequence number // QUESTION: both max un-acked and sliding window is for sequence number, don't count duplicate?
Epoch Event: to deal with dropped packet, and check if connection is live (Server and client tracks and fires their epoch events independently.)
data
and connect
message) // QUESTION: only for data
connect
right?// TODO: last slide
Transmission Establishment
Connect: send until epochLimit
.
epochLimit
Disconnect: if no data
message is sent in the past epochLimit
epoch.
Heartbeat Message: a Ack message with sequence number 0
Useful Tools:
Marshal: a library to convert go datastructure to byte array for transmission
time
: go's package to keep track of time for epoch
Read()
, Write()
, Close()
, CloseConn()
: library functions // QUESTION: is this API or something we need to write
Proof of work: given M, and unsigned integer \max, find 0 \leq n \leq N such that hash(M, n) is minimum. Goal: build your own server cluster for mining pool.
Details:
When server lost contact to minor: re-assign assigned job to other minor
When the server loses contact with a request client: drop the client's job (wait for minor to complete and ignore result)
load balance: design by yourself (should clearly document it)
there is hiddent tests
you should: 360 lines of code, 700 lines of test
In this project you will implement Raft
Checkpoint: election, heartbeat
leader election: raft state machine for election
RequestVote RPC
: requesting leadership votes
Heatbeats: AppendEntries RPC
, timeout for detect leader failure to trigger re-election
Final: log replication
Raft:
numberOfNodes int
(must be odd// QUESTION)me ID
log [](index int, term int, payload []byte)
(on disk)maxTerm int
(sync with disk)maxIndex int
(sync with disk)votedFor int
(persist on disk)Request(payload []byte)
AppendLog(id ID, term int, prevTerm int, prevIndex int, payload []byte)
AppendAccept(id ID)
AppendReject(id ID, needIndex int)
RequestVote(id ID, term int, index int)
AcceptVote(id ID)
BecomeLeader
:numberOfAccept int
requestQueue payload[]byte
ReadRequest(payload []byte)
:log.append(payload)
notifyHeartbeatCh(1)
notifyHeartbeatCh
numberOfAccept
AppendAccept(_ ID)
numberOfAccept
numberOfAccept > numberOfNodes / 2
: finish WaitAppendReject(_ ID, needIndex int)
needIndex >= log.size()-1
: panic("index out of bound")
AppendLog(id=me ID, term=maxTerm int, prevTerm=log[needIndex-1].term int, prevIndex=needIndex-1 int, payload=log[needIndex].payload []byte)
Request(payload []byte)
:requestQueue.append(payload)
AppendLog(id ID, term int, _ int, _ int)
:term == maxTerm && id != me
: panic("two leader in one term")
term > maxTerm
, set term
, set votedFor=-1
, BecomeFollower
, re-process AppendLog(_ ID, term int, _ int, _ int)
AppendAccept(_ ID)
// TODOAppendReject(_ ID, _ int)
// TODORequestVote(id ID, term int, _ int)
:term > maxTerm
: update maxTerm
to term
, set votedFor=-1
, BecomeFollower
, respond AcceptVote(id=me ID)
AcceptVote(id ID)
// TODOHeartbeatTimeout
:<- notifyHeartbeatCh
has message, send AppendLog(id=me ID, term=maxTerm int, prevTerm=log[log.size()-1-1].term int, prevIndex=log.size()-1-1 int, payload=log[log.size()-1].payload []byte)
BecomeFollower
StartElectionTimeout
:BecomeCandidate
RequestVote(id ID, term int, index int)
:term > maxTerm
: update maxTerm
to term
, set votedFor=-1
votedFor == -1 || votedFor == id
and maxTerm > term || (maxTerm == term && log.size()-1 > index)
: set votedFor=id
, respond with AcceptVote(id=me ID)
and reset StartElectionTimeout
, else no respondAppendLog(_ ID, _ int, prevTerm, prevIndex int)
:term > maxTerm
: update maxTerm
to term
, set votedFor=-1
, BecomeFollower
// QUESTIONappendStack
StartElectionTimeout
index <= <index, _, _>+1
, write to log, respond AppendAccept(id=me ID)
. If log already written://TODO: choose the newer term? do a running checksum?index
does not match last index+1
, respond AppendReject(id=me ID,
BecomeCandidate
maxTerm++
, votedFor=-1
RequestVote(id=me ID, term=maxTerm int, index=log.size()-1 int)
to all nodes (including itself)AcceptVote
or NewElectionTimeout
(// QUESTION: assuming odd?)AcceptVote
, BecomeLeader
RequestVote(_ ID, _ int, _ int)
:NewElectionTimeout
BecomeCandidate
AppendLog(id ID, term int, _ int)
:term >= maxTerm
: update maxTerm
to term
, set votedFor=-1
, BecomeFollower
, re-process AppendLog(id ID, term int, _ int)
AppendReject(id=me ID, needIndex=-1 int)
Table of Content