Lecture 016 - CDN

Internet Content Delivery

How to map human-readable names (URLs) to server locations (IPs)? How to deliver content quickly & reliably?

DNS Architecture

DNS Tree

Routers do not route on domain names; they route on IP addresses.

Challenges/Goals:

DNS trades off consistency for all these goals

The DNS Hierarchy: a tree

DNS Queries

The DNS protocol: RPC queries and responses

Client Side

TTL Common Practice

Clients learn the local DNS server's address via the Dynamic Host Configuration Protocol (DHCP).

Recursive DNS Query

Recursive DNS Query: the queried DNS server takes responsibility for resolving the name completely. It sends queries from itself to other, lower-level DNS servers and returns the final address to the client.

Iterative DNS Query

Iterative DNS Query: the DNS server may respond without an answer, saying it does not know the address, and leave the client to query a lower-level DNS server.

Recursive and Iterative Combined

In practice, no single strategy is mandated. Generally, root servers use the iterative (lazy) strategy and leaf servers use the recursive (helpful) strategy.
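The combination of the two strategies can be sketched with a toy model. This is an illustration only, not real DNS code: the zone names, the `ZONES` table, and the functions `query` and `resolve_recursively` are all hypothetical, and the address comes from a documentation-only range. Upper-level servers answer iteratively (with a referral), while a helpful resolver chases the referrals on the client's behalf.

```python
# Toy name hierarchy: every zone and record here is made up for illustration.
ZONES = {
    "root": {"referrals": {"edu": "edu-tld"}},
    "edu-tld": {"referrals": {"example.edu": "example-auth"}},
    "example-auth": {"records": {"www.example.edu": "192.0.2.10"}},
}


def query(server, name):
    """Iterative behaviour: answer if known, otherwise refer to a lower server."""
    zone = ZONES[server]
    if name in zone.get("records", {}):
        return ("answer", zone["records"][name])
    for suffix, next_server in zone.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("unknown", None)


def resolve_recursively(name, log=None):
    """Recursive behaviour: keep following referrals until an answer is found."""
    server = "root"
    while True:
        if log is not None:
            log.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value
        if kind == "unknown":
            return None
        server = value  # follow the referral one level down the tree


path = []
print(resolve_recursively("www.example.edu", path))  # 192.0.2.10
print(path)  # ['root', 'edu-tld', 'example-auth']
```

Each entry in `path` is one round trip that the resolver made so that the client only had to ask once.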

Root Servers: there are 13 root name servers (each replicated internally and geographically), currently {a-m}.root-servers.net. // QUESTION: what is the point of replicating the root servers? Will the ISP redirect your DNS request if one goes down? Do we have to trust that our ISP does not maliciously point us to a bad DNS server? What about malicious DNS attacks that forge packets?

Content Distribution Network (CDNs)

Web pages typically contain multiple small "objects" (jpg, mp3, ...), and file sizes are heavy-tailed.

Each object needs:

  - a TCP 3-way handshake
  - TLS encryption setup
  - Solution: HTTP/2 and HTTP/3 allow requests to be sent in parallel // QUESTION: what do they solve, and how?
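A back-of-envelope model may help show why the per-object setup cost matters. This is a rough sketch under loud assumptions: one round trip for the TCP handshake, one for TLS setup, one for the request itself, no bandwidth limits, and no connection reuse in the sequential case. The function names and the numbers (50 ms, 20 objects) are hypothetical.

```python
def sequential_rtts(num_objects, tcp_rtts=1, tls_rtts=1, request_rtts=1):
    """Each object opens its own connection, one after another."""
    return num_objects * (tcp_rtts + tls_rtts + request_rtts)


def multiplexed_rtts(tcp_rtts=1, tls_rtts=1, request_rtts=1):
    """One shared connection; all requests go out together after a single setup."""
    return tcp_rtts + tls_rtts + request_rtts


rtt_ms = 50   # assumed round-trip time
objects = 20  # assumed number of objects on the page
print(sequential_rtts(objects) * rtt_ms)  # 3000
print(multiplexed_rtts() * rtt_ms)        # 150
```

Under these assumptions the setup cost is paid once instead of once per object, which is the part of the problem that parallel requests over a single connection address.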

Content Delivery Network (CDNs):

Questions:

pull-based cache

  1. the client checks the cache on its local machine
  2. on a cache miss, it pulls from the CDN
  3. the CDN server checks its own local cache
  4. on a cache miss, it pulls from the content provider

push-based cache: content provider can push to CDN
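The pull and push variants above can be sketched as a chain of caches. This is a minimal illustration, not how any particular CDN is implemented: the classes `Origin` and `Cache`, the URLs, and the payloads are all hypothetical, and eviction and expiry are left out.

```python
class Origin:
    """Stands in for the content provider; counts how often it is contacted."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.fetches = 0

    def fetch(self, url):
        self.fetches += 1
        return self.objects[url]


class Cache:
    """A cache that pulls from an upstream source on a miss."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def fetch(self, url):
        if url not in self.store:                       # cache miss
            self.store[url] = self.upstream.fetch(url)  # pull from the next tier
        return self.store[url]

    def push(self, url, body):
        """Push-based variant: content is placed here ahead of any request."""
        self.store[url] = body


origin = Origin({"/logo.jpg": b"jpg-bytes", "/song.mp3": b"mp3-bytes"})
cdn = Cache(upstream=origin)
browser = Cache(upstream=cdn)

browser.fetch("/logo.jpg")           # misses locally and at the CDN, reaches the origin
browser.fetch("/logo.jpg")           # served from the local cache
cdn.push("/song.mp3", b"mp3-bytes")  # provider pushes before anyone asks
browser.fetch("/song.mp3")           # local miss, CDN hit, origin not contacted
print(origin.fetches)  # 1
```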

DNS-based Routing

Load-balancer:

Consistent Hashing: normal operation

Consistent Hashing: adding a node

Consistent Hashing: deleting a node

Consistent Hashing: virtual node

Consistent Hashing

Typically, when there are multiple load balancers, all packets of one connection must be hashed to the same server, so the IP addresses and port numbers are usually used as the hash input. This ensures that packets of one connection always go to the same server, even when they pass through different Maglev balancers. Relative to a naive hash-based load balancer, consistent hashing reduces the chance that a connection is sent to a different web server after a single server failure.
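The effect of a single server failure can be illustrated with a small hash ring with virtual nodes. This is a sketch of the generic ring construction, not Maglev's own table-based hashing scheme; the server names, the connection strings, the `Ring` class, and the choice of SHA-256 and 100 virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def h(value):
    """Map a string to a point on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, servers, vnodes=100):
        # Each server appears at `vnodes` points on the ring (virtual nodes).
        self.points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
        self.keys = [p for p, _ in self.points]

    def lookup(self, conn):
        # First ring point clockwise from the connection's hash, wrapping around.
        i = bisect.bisect_right(self.keys, h(conn)) % len(self.points)
        return self.points[i][1]


servers = ["web-a", "web-b", "web-c", "web-d"]
survivors = [s for s in servers if s != "web-c"]  # web-c fails
# Hash input: source IP and port plus destination IP and port.
conns = [f"10.0.0.{i % 250}:{40000 + i}->203.0.113.5:443" for i in range(2000)]

before, after = Ring(servers), Ring(survivors)
moved = [c for c in conns if before.lookup(c) != after.lookup(c)]

# Only connections that were on the failed server are remapped.
print(all(before.lookup(c) == "web-c" for c in moved))  # True

# Naive alternative: hash modulo the number of servers.
naive_moved = sum(
    servers[h(c) % len(servers)] != survivors[h(c) % len(survivors)] for c in conns
)
print(len(moved), naive_moved)
```

Because every balancer computes the same ring from the same server list, each of them sends a given connection to the same server without coordinating. The second line of output compares how many connections change server under each scheme.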

CDN Update Propagation:

Akamai: a leading content delivery network (CDN) services provider for media and software delivery and for cloud security solutions. It evolved out of MIT research on consistent hashing. It serves 15-30% of all Internet traffic, with 170K servers worldwide.

Other solutions: CloudFront, CloudFlare, Fastly, ChinaNet, Edgecast, Limelight, Lvl3, GCD

Current developments: as of 2022

Takeaway: caching is the only way to improve latency
