# Written Assignment 4

## Question 1

Because this question is ambiguous in the language description, I will answer it using different interpretation of the word "represent". (The following ways of interpreting the question are asked on Piazza but no instructors has answered it as for June.28 21:35pm.)

We first calculate the following:

• Since there are $8$ sets, a set can store $1024 / 8 = 128$ bytes.

• Since a block can store $32$ bytes, there are $128 / 32 = 4$ blocks in a set, and therefore, there are $4$ lines per set.

### 1-A

#### Interpretation 1

If you are asking: "how many bits should be used to identify a unique block and a unique set in the cache" (ie. how many bits I should tell you so that you know which specific set or block I am pointing to)

To identify a unique set, we need the number of bits for set index. The number of bits for index $s = log_2(\text{S}) = log_2(8) = 3$ where $S$ is the number of sets. We need $3$ bits to identify a unique set. To identify a unique block, we need the number of bits for set index and the number of bits for tag. We know that the number of bits to represent block offset $b = log_2(B) = log_2(32) = 5$ where $B$ is block size. Therefore, the sum of number of tag bits and number of set index bits is $t + s = (t + s + b) - b = 16 - 5 = 11$ bits. Therefore we need $11$ bits to identify a unique block.

#### Interpretation 2

If you are asking: "how many bits are sufficient to identify each sub-element stored in a block and a set" (ie. the block offset and the tag)

Since number of block offset bits is $5$, 5 bits are sufficient to identify each sub-element stored in a block.

Since number of tag bits is $t = (t + s + b) - s - b = 16 - 3 - 5 = 8$, 8 bits are sufficient to identify each sub-element stored in a set.

#### Interpretation 3

If you are asking: "how many bits are sufficient to identify each byte stored in a block and a set" (ie. the block offset and the tag plus block offset)

Since number of block offset bits is $5$, 5 bits are sufficient to identify each byte stored in a block.

Since number of tag + block offsetbits is $t + b = 8 + 5 = 13$, 13 bits are sufficient to identify each byte stored in a set.

### 1-B

As calculated above, there are $4$ lines per set.

### 1-C

As calculated above, there are $8$ tag bits, $3$ set index bits, and $5$ block offset bits.

tab bits: 00011010 (in binary) set bits: 001 (in binary) block bits: 00011 (in binary)

## Question 2

We roughly calculate the average penalty per access for L1, L2, and L3 cache.

• L3: $25 + (100\% - 98%) * 100 = 27$ cycles

• L2: $10 + (100\% - 97%) * 27 = 10.81$ cycles

• L1: $5 + (100\% - 96%) * 10.81 = 5.4324$ cycles

Observe that the average cycle of L1 cache is nearly independent of the performance of other cache layers. Therefore, I would improve L1 cache because with memory hierarchy (and relative high hit rate of L1), the performance of the machine is largely depend on the performance of L1 cache.

If improve L1: We roughly calculate the average penalty per access for L1, L2, and L3 cache.

• L3: $25 + (100\% - 98%) * 100 = 27$ cycles

• L2: $10 + (100\% - 97%) * 27 = 10.81$ cycles

• L1: $5 + (100\% - 97%) * 10.81 = 5.3243$ cycles

If improve L2: We roughly calculate the average penalty per access for L1, L2, and L3 cache.

• L3: $25 + (100\% - 98%) * 100 = 27$ cycles

• L2: $10 + (100\% - 98%) * 27 = 10.54$ cycles

• L1: $5 + (100\% - 96%) * 10.54 = 5.4216$ cycles

If improve L3: We roughly calculate the average penalty per access for L1, L2, and L3 cache.

• L3: $25 + (100\% - 99%) * 100 = 26$ cycles

• L2: $10 + (100\% - 97%) * 26 = 10.78$ cycles

• L1: $5 + (100\% - 96%) * 10.78 = 5.4312$ cycles

This result ($5.3243$ being the lowest cycle) further justified my choice to improve L1.

Table of Content