Because this question is ambiguous in the language description, I will answer it using different interpretation of the word "represent". (The following ways of interpreting the question are asked on Piazza but no instructors has answered it as for June.28 21:35pm.)
We first calculate the following:
Since there are 8 sets, a set can store 1024 / 8 = 128 bytes.
Since a block can store 32 bytes, there are 128 / 32 = 4 blocks in a set, and therefore, there are 4 lines per set.
If you are asking: "how many bits should be used to identify a unique block and a unique set in the cache" (ie. how many bits I should tell you so that you know which specific set or block I am pointing to)
To identify a unique set, we need the number of bits for set index
. The number of bits for index s = log_2(\text{S}) = log_2(8) = 3 where S is the number of sets. We need 3 bits to identify a unique set. To identify a unique block, we need the number of bits for set index
and the number of bits for tag
. We know that the number of bits to represent block offset b = log_2(B) = log_2(32) = 5 where B is block size. Therefore, the sum of number of tag
bits and number of set index
bits is t + s = (t + s + b) - b = 16 - 5 = 11 bits. Therefore we need 11 bits to identify a unique block.
If you are asking: "how many bits are sufficient to identify each sub-element stored in a block and a set" (ie. the block offset and the tag)
Since number of block offset
bits is 5, 5 bits are sufficient to identify each sub-element stored in a block.
Since number of tag
bits is t = (t + s + b) - s - b = 16 - 3 - 5 = 8, 8 bits are sufficient to identify each sub-element stored in a set.
If you are asking: "how many bits are sufficient to identify each byte stored in a block and a set" (ie. the block offset and the tag plus block offset)
Since number of block offset
bits is 5, 5 bits are sufficient to identify each byte stored in a block.
Since number of tag
+ block offset
bits is t + b = 8 + 5 = 13, 13 bits are sufficient to identify each byte stored in a set.
As calculated above, there are 4 lines per set.
As calculated above, there are 8 tag bits, 3 set index bits, and 5 block offset bits.
tab bits: 00011010
(in binary)
set bits: 001
(in binary)
block bits: 00011
(in binary)
We roughly calculate the average penalty per access for L1, L2, and L3 cache.
L3: 25 + (100\% - 98%) * 100 = 27 cycles
L2: 10 + (100\% - 97%) * 27 = 10.81 cycles
L1: 5 + (100\% - 96%) * 10.81 = 5.4324 cycles
Observe that the average cycle of L1 cache is nearly independent of the performance of other cache layers. Therefore, I would improve L1 cache because with memory hierarchy (and relative high hit rate of L1), the performance of the machine is largely depend on the performance of L1 cache.
If improve L1: We roughly calculate the average penalty per access for L1, L2, and L3 cache.
L3: 25 + (100\% - 98%) * 100 = 27 cycles
L2: 10 + (100\% - 97%) * 27 = 10.81 cycles
L1: 5 + (100\% - 97%) * 10.81 = 5.3243 cycles
If improve L2: We roughly calculate the average penalty per access for L1, L2, and L3 cache.
L3: 25 + (100\% - 98%) * 100 = 27 cycles
L2: 10 + (100\% - 98%) * 27 = 10.54 cycles
L1: 5 + (100\% - 96%) * 10.54 = 5.4216 cycles
If improve L3: We roughly calculate the average penalty per access for L1, L2, and L3 cache.
L3: 25 + (100\% - 99%) * 100 = 26 cycles
L2: 10 + (100\% - 97%) * 26 = 10.78 cycles
L1: 5 + (100\% - 96%) * 10.78 = 5.4312 cycles
This result (5.3243 being the lowest cycle) further justified my choice to improve L1.
Table of Content