Lecture 010

Set, Line, and Block

Reading Policy

Reading from Address:

  1. use the set index bits of the address to select the set
  2. for each line in that set, check whether the valid bit is set and the tag bits match
  3. on a hit, read the data starting at the block offset
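
The three steps above can be sketched as follows. This is a minimal illustration, not a model of any particular processor: the names `Line`, `split_address`, and `read_byte` are hypothetical, and the cache is assumed to have 2**s sets and 2**b-byte blocks.

```python
from dataclasses import dataclass


@dataclass
class Line:
    # Hypothetical cache line: valid bit, tag, and the cached block of bytes.
    valid: bool = False
    tag: int = 0
    block: bytes = b""


def split_address(addr, s, b):
    """Split addr into (tag, set_index, block_offset) for 2**s sets and 2**b-byte blocks."""
    block_offset = addr & ((1 << b) - 1)      # low b bits
    set_index = (addr >> b) & ((1 << s) - 1)  # next s bits
    tag = addr >> (s + b)                     # remaining high bits
    return tag, set_index, block_offset


def read_byte(cache, addr, s, b):
    """Return the cached byte at addr, or None on a miss. cache is a list of sets."""
    tag, set_index, block_offset = split_address(addr, s, b)
    for line in cache[set_index]:              # step 1: the index selects one set
        if line.valid and line.tag == tag:     # step 2: compare tags within that set
            return line.block[block_offset]    # step 3: read at the block offset
    return None
```

For example, with s = 2 and b = 3, the address 0b110110011 splits into tag 0b1101, set index 0b10, and block offset 0b011.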

(A primitive data type should not span multiple cache lines, although this can happen, for example when the data is misaligned.)
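
Whether a given access spans two lines can be checked from the address and size alone. A small sketch, assuming 64-byte cache lines (an assumption, not stated in these notes) and a hypothetical helper name:

```python
def spans_cache_lines(addr, size, block_size=64):
    """True if the bytes [addr, addr + size) touch more than one cache line."""
    first_line = addr // block_size
    last_line = (addr + size - 1) // block_size
    return first_line != last_line
```

Under that assumption, an 8-byte value at address 56 fits in one line, while the same value at address 60 straddles two.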

Policy for Direct Mapped

Direct Mapped:

Policy for 2-Way Set Associative Cache

Occupy Until Bucket Size (number of lines) Filled
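
A small simulator can make the difference between the two organizations concrete. The sketch below is an illustration only: the class name is hypothetical, and the least-recently-used (LRU) replacement policy is an assumption, since these notes do not say which line is evicted once a set is full. With one line per set it behaves as a direct-mapped cache.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Counts hits and misses; LRU replacement within each set (assumed policy)."""

    def __init__(self, num_sets, lines_per_set, block_size):
        self.num_sets = num_sets
        self.lines_per_set = lines_per_set
        self.block_size = block_size
        # One OrderedDict per set: keys are tags, ordered least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Simulate one access; return True on a hit and False on a miss."""
        block_number = addr // self.block_size
        set_index = block_number % self.num_sets
        tag = block_number // self.num_sets
        lines = self.sets[set_index]
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.lines_per_set:
            lines.popitem(last=False)   # set is full: evict the least recently used line
        lines[tag] = True
        return False
```

With 8-byte blocks, addresses 0 and 32 conflict in a direct-mapped cache of 4 sets, so alternating between them misses every time. A 2-way cache of the same capacity (2 sets of 2 lines) holds both blocks in one set, and only the first access to each block misses.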

Writing Policy

Background: copies of the same data may exist in L1, L2, L3, main memory, and disk

Write-hit Policy

Write-miss Policy

Typical Combination:

  1. Write-back + Write-allocate (Mostly Used)

    Summary
  2. Write-through + No-write-allocate
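
The two combinations can be compared by counting how many writes reach the next lower level. In the sketch below, write-back defers the write until the dirty line is evicted or flushed, write-through forwards every write immediately, write-allocate loads the block into the cache on a write miss, and no-write-allocate sends the write straight to the lower level. The class is a hypothetical, direct-mapped model that tracks only tags and dirty bits, not data.

```python
class WriteCache:
    """Direct-mapped cache model that counts traffic to the next lower level."""

    def __init__(self, num_sets, block_size, write_back, write_allocate):
        self.num_sets = num_sets
        self.block_size = block_size
        self.write_back = write_back
        self.write_allocate = write_allocate
        self.tags = [None] * num_sets
        self.dirty = [False] * num_sets
        self.memory_writes = 0  # writes sent to the lower level
        self.memory_reads = 0   # blocks fetched from the lower level

    def _locate(self, addr):
        block = addr // self.block_size
        return block % self.num_sets, block // self.num_sets

    def _fill(self, index, tag):
        if self.dirty[index]:
            self.memory_writes += 1  # write the evicted dirty block back first
        self.memory_reads += 1
        self.tags[index] = tag
        self.dirty[index] = False

    def write(self, addr):
        index, tag = self._locate(addr)
        hit = self.tags[index] == tag
        if not hit and self.write_allocate:
            self._fill(index, tag)   # write miss: bring the block into the cache
            hit = True
        if hit and self.write_back:
            self.dirty[index] = True  # defer the write until eviction or flush
        else:
            self.memory_writes += 1   # write-through, or a no-write-allocate miss

    def flush(self):
        """Write back every dirty line."""
        for index in range(self.num_sets):
            if self.dirty[index]:
                self.memory_writes += 1
                self.dirty[index] = False
```

For 100 consecutive writes to the same address, the write-back + write-allocate configuration fetches the block once and sends a single write to the lower level when the line is flushed, while write-through + no-write-allocate sends all 100 writes.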

Memory Metrics and Facts

i7 Cache Hierarchy

Metrics

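Miss Rate: fraction of memory accesses that miss in the cache (misses / accesses)

Hit Time: time to check the cache and deliver a line to the CPU (typically 4 cycles for L1, 10 cycles for L2)

Miss Penalty: additional time required because of a miss (typically 50-200 cycles for main memory)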

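(Because a miss costs far more than a hit, the average access time is dominated by the miss rate: a 99% hit rate can be about twice as good as 97%. This is why we talk about miss rate rather than hit rate.)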
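
The comparison follows from the usual formula: average access time = hit time + miss rate × miss penalty. The numbers below are illustrative assumptions (a 1-cycle hit time and a 100-cycle miss penalty), chosen to make the arithmetic easy rather than taken from the i7 figures above.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average cycles per access: every access pays hit_time, misses also pay miss_penalty."""
    return hit_time + miss_rate * miss_penalty


# Assumed values: 1-cycle hit time, 100-cycle miss penalty.
time_97 = average_access_time(1, 0.03, 100)  # 97% hits: 1 + 0.03 * 100 cycles
time_99 = average_access_time(1, 0.01, 100)  # 99% hits: 1 + 0.01 * 100 cycles
```

Under these assumptions the average is about 4 cycles at 97% hits and about 2 cycles at 99% hits, which is the factor of two.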

The Memory Mountain

Read throughput: read bandwidth

Memory Mountain: measured read throughput as a function of spatial and temporal locality
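
The spatial-locality axis of the mountain is usually varied by changing the stride of a read loop. As a rough model (not a measurement), the fraction of reads that miss can be estimated from the stride alone, assuming each element is read once, the array is far larger than the cache, elements are 8 bytes, and blocks are 64 bytes. All of these are assumptions for illustration, as is the function name.

```python
def strided_miss_rate(stride, elem_size=8, block_size=64):
    """Estimated fraction of reads that miss when every stride-th element is read once."""
    # Each block holds block_size / elem_size elements; a larger stride uses fewer
    # of them per block, until every read lands in a different block.
    return min(1.0, stride * elem_size / block_size)
```

With these assumptions a stride of 1 misses on one read in eight, a stride of 4 on every second read, and any stride of 8 or more on every read, so throughput falls as the stride grows and then levels off.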

Slope

Matrix Multiplication

Spatial Locality

We only need to analyze the innermost loop, because it accounts for almost all of the memory accesses

Bad Example

Good Example

Worst Example

Example Analysis
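
The code for the three examples is not reproduced in these notes. The sketch below assumes they are three orderings of the usual triple loop (ijk, kij, and jki) and counts misses per innermost-loop iteration with a small simulated cache. The setup is hypothetical: row-major n x n matrices of 8-byte elements, 32-byte blocks, and a fully associative LRU cache far too small to hold a whole row. A value that stays fixed during the innermost loop is assumed to live in a register and is touched once per loop.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with LRU replacement that only counts misses."""

    def __init__(self, num_lines, block_size):
        self.num_lines = num_lines
        self.block_size = block_size
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, addr):
        block = addr // self.block_size
        if block in self.lines:
            self.lines.move_to_end(block)
            return
        self.misses += 1
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used block
        self.lines[block] = True


def misses_per_iteration(order, n=32, elem_size=8, num_lines=8, block_size=32):
    """Simulated cache misses per innermost-loop iteration for one loop order."""
    cache = LRUCache(num_lines, block_size)
    base_a, base_b, base_c = 0, n * n * elem_size, 2 * n * n * elem_size

    def touch(base, row, col):
        cache.access(base + (row * n + col) * elem_size)  # row-major layout

    for x in range(n):
        for y in range(n):
            if order == "ijk":
                i, j = x, y
                for k in range(n):
                    touch(base_a, i, k)  # A walked along a row
                    touch(base_b, k, j)  # B walked down a column
                touch(base_c, i, j)      # sum kept in a register, stored once
            elif order == "kij":
                k, i = x, y
                touch(base_a, i, k)      # A[i][k] kept in a register
                for j in range(n):
                    touch(base_b, k, j)  # B walked along a row
                    touch(base_c, i, j)  # C walked along a row
            elif order == "jki":
                j, k = x, y
                touch(base_b, k, j)      # B[k][j] kept in a register
                for i in range(n):
                    touch(base_a, i, k)  # A walked down a column
                    touch(base_c, i, j)  # C walked down a column
            else:
                raise ValueError(order)
    return cache.misses / n ** 3
```

Calling `misses_per_iteration` for each order should rank them kij (fewest misses), then ijk, then jki. A row-wise walk misses once per block (one access in four here) and a column-wise walk misses on every access, so the per-iteration totals land near 0.5, 1.25, and 2.0 under these assumptions.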

Example Performance

Blocking

The Problem

Optimized
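
The optimized version is not shown in these notes. A minimal sketch of the general blocking idea, under the assumption that it follows the usual tiled triple loop: the matrices are processed one block x block tile at a time, so the tiles in use can stay in the cache while they are reused. The function name and the list-of-lists representation are illustrative choices. Python lists are not laid out contiguously, so this shows the loop structure only, not the cache behavior a C array would have.

```python
def matmul_blocked(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) one block x block tile at a time."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, n, block):
            for k0 in range(0, n, block):
                # Multiply the tile of A at (i0, k0) by the tile of B at (k0, j0)
                # and accumulate the result into the tile of C at (i0, j0).
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, n)):
                        total = c[i][j]
                        for k in range(k0, min(k0 + block, n)):
                            total += a[i][k] * b[k][j]
                        c[i][j] = total
    return c
```

The `min(...)` bounds let the function handle an n that is not a multiple of the block size. The result is the same as the plain triple loop; only the order in which elements are visited changes.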
