CL10 MemoryMgmt
Hardware Implementation
Q. More than one word is put in one cache block to
(a) exploit the temporal locality of reference in a program
(b) exploit the spatial locality of reference in a program
(c) reduce the miss penalty
(d) none of the above
Answer: (B)
Cache Performance Metrics
Miss Rate
Fraction of memory references not found in cache (misses / accesses)
= 1 − hit rate
Typical numbers (in percentages):
3-10% for L1
can be quite small (e.g., < 1%) for L2, depending on size, etc.
Hit Time
Time to deliver a line in the cache to the processor
includes time to determine whether the line is in the cache
Typical numbers:
1-2 clock cycles for L1
Miss Penalty
Additional time required because of a miss
typically 50-200 cycles for main memory (Trend: increasing!)
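These three metrics combine into the average memory access time (AMAT). A minimal sketch; the example numbers are illustrative picks from the ranges above, not values from the slides:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers: 2-cycle L1 hit time, 5% L1 miss rate,
# 100-cycle miss penalty to main memory.
print(amat(2, 0.05, 100))  # 7.0 cycles
```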
15-213, S’08
Cache Misses
• Compulsory (cold start or process migration, first reference): first
access to a block
– “Cold” fact of life: not a whole lot you can do about it
– Note: If you are going to run “billions” of instruction,
Compulsory Misses are insignificant
• Capacity:
– Cache cannot contain all blocks access by the program
– Solution: increase cache size
• Conflict (collision):
– Multiple memory locations mapped
to the same cache location
– Solution 1: increase cache size
– Solution 2: increase associativity
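The three categories can be told apart by comparing a direct-mapped cache against a fully associative LRU cache of the same capacity; a minimal simulator sketch (the function name and trace are illustrative, not from the slides):

```python
from collections import OrderedDict

def classify_misses(trace, num_lines):
    """Classify misses of a direct-mapped cache into the 3 Cs.

    Compulsory: first access to a block.
    Capacity:   a fully associative LRU cache of the same size also misses.
    Conflict:   direct-mapped misses, but fully associative would have hit.
    """
    direct = {}              # line index -> block currently cached there
    full = OrderedDict()     # fully associative LRU cache, oldest first
    seen = set()
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}
    for block in trace:
        idx = block % num_lines
        dm_hit = direct.get(idx) == block
        fa_hit = block in full
        if fa_hit:
            full.move_to_end(block)          # mark most recently used
        else:
            if len(full) >= num_lines:
                full.popitem(last=False)     # evict LRU block
            full[block] = True
        if not dm_hit:
            if block not in seen:
                counts["compulsory"] += 1
            elif fa_hit:
                counts["conflict"] += 1
            else:
                counts["capacity"] += 1
            direct[idx] = block
        seen.add(block)
    return counts

# Blocks 0 and 4 collide in a 4-line direct-mapped cache: after the two
# compulsory misses, every further access is a conflict miss.
print(classify_misses([0, 4, 0, 4, 0, 4], num_lines=4))
```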
[Figure: direct mapping example — memory blocks 48 (110000), 49 (110001), 50 (110010), and 51 (110011) and the cache lines they map to; e.g., memory block #12]
Problem: Assuming that memory is byte addressable, fill in the blanks:

MM Size   Cache Size   Block Size   TAG bits   TAG Directory Size
128 KB    16 KB        256 B        3          3 x 2^6 = 192 bits
32 GB     32 KB        1 KB         -          -
-         512 KB       1 KB         7          -
16 GB     -            4 KB         10         -
64 MB     -            -            10         -
-         512 KB       -            7          -
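The filled-in row can be checked programmatically; a minimal sketch for a byte-addressable, direct-mapped cache (function name is my own; valid/dirty bits are not counted):

```python
import math

def direct_mapped_params(mm_bytes, cache_bytes, block_bytes):
    """Return (tag bits, tag directory size in bits) for a direct-mapped cache."""
    addr_bits = int(math.log2(mm_bytes))
    lines = cache_bytes // block_bytes
    offset_bits = int(math.log2(block_bytes))
    index_bits = int(math.log2(lines))
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, tag_bits * lines

# First row of the table: 128 KB memory, 16 KB cache, 256 B blocks.
# Address = 17 bits, 64 lines, offset = 8, index = 6, tag = 3, dir = 192 bits.
print(direct_mapped_params(128 * 1024, 16 * 1024, 256))  # (3, 192)
```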
Direct Mapping pros & cons
• Simple & Fast
• Inexpensive
• Fixed location for a given block
—If a program repeatedly accesses 2 blocks that map to the same line, the miss rate is very high
[Figure: memory block numbers 0-31 and their cache mapping]
Disadvantages of Set Associative Cache
• N-way Set Associative Cache versus Direct Mapped Cache:
– N comparators vs. 1
– Extra MUX delay for the data
– Data comes AFTER Hit/Miss decision and set selection
• In a direct mapped cache, Cache Block is available BEFORE
Hit/Miss:
[Figure: 2-way set-associative cache — the Cache Index selects a set; the address tag (Adr Tag) is compared against both stored Cache Tags in parallel (Compare), Sel1/Sel0 drive a mux that picks the matching Cache Data block, and the OR of the comparator outputs produces Hit]
Replacement Algorithms
Direct mapping
• No choice
• Each block only maps to one line
• Replace that line
Associative & Set Associative
• Least recently used (LRU)
• First-in first-out (FIFO)
• Least frequently used (LFU)
• Random
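A common choice for associative caches is least recently used (LRU). A minimal per-set LRU sketch (class and method names are my own, not from the slides):

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with least-recently-used replacement."""
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()   # block tag -> True, oldest first

    def access(self, tag):
        """Return True on hit; on a miss, insert tag, evicting the LRU block."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)     # mark most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)  # evict least recently used
        self.blocks[tag] = True
        return False

s = LRUSet(ways=2)
print([s.access(t) for t in (0, 128, 0, 256, 128)])
# [False, False, True, False, False] -- 256 evicts 128, so 128 misses again
```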
Answer: (C)
Que-2: Consider the data given in previous question. The size of the cache tag directory is
(A) 160 Kbits (B) 136 bits (C) 40 Kbits (D) 32 bits
Answer: (A)
Que-3: An 8KB direct-mapped write-back cache is organized as multiple blocks, each of size 32-bytes. The
processor generates 32-bit addresses. The cache controller maintains the tag information for each cache block
comprising the following:
• 1 valid bit
• 1 modified bit
• as many bits as the minimum needed to identify the memory block mapped in the cache
What is the total size of memory needed at the cache controller to store meta-data (tags) for the cache?
(A) 4864 bits (B) 6144 bits (C) 6656 bits (D) 5376 bits
Answer: (D)
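The answer can be verified with the arithmetic implied by the question:

```python
cache_bytes = 8 * 1024          # 8 KB cache
block_bytes = 32                # 32-byte blocks
addr_bits   = 32                # 32-bit addresses

blocks      = cache_bytes // block_bytes            # 256 blocks
offset_bits = block_bytes.bit_length() - 1          # log2(32)  = 5
index_bits  = blocks.bit_length() - 1               # log2(256) = 8
tag_bits    = addr_bits - index_bits - offset_bits  # 32 - 8 - 5 = 19
per_block   = tag_bits + 1 + 1                      # tag + valid + modified

print(blocks * per_block)  # 5376 bits -> option (D)
```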
A cache memory unit with capacity of N words and block size of B words is to be designed. If it is designed as
direct mapped cache, the length of the TAG field is 10 bits. If the cache unit is now designed as a 16-way set-
associative cache, the length of the TAG field is ______ bits.
(A) 12
(B) 14
(C) 16
(D) 18
Answer: (B)
Explanation: For the direct-mapped design, 10 tag bits are required. When the cache is redesigned as a 16-way set-associative cache, the new tag field size = 10 + log₂16 = 14 bits, because a k-way set-associative design needs log₂k additional tag bits compared with the direct-mapped design (the index field shrinks by log₂k bits, and those bits move into the tag).
The size of the physical address space of a processor is 2^P bytes. The word length is 2^W bytes. The capacity of cache memory is 2^N bytes. The size of each cache block is 2^M words. For a K-way set-associative cache memory, the length (in number of bits) of the tag field is
Ans (B)
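Whatever the option labels, the tag length works out to P − N + log₂K: offset = M + W bits, sets = 2^N / (2^(M+W) · K), so index = N − M − W − log₂K bits, and tag = P − index − offset. A numeric check with sample values of my own choosing:

```python
import math

# Sample values (illustrative only): 32-bit physical address space,
# 4-byte words, 32 KB cache, 8-word blocks, 4-way set associative.
P, W, N, M, K = 32, 2, 15, 3, 4

offset_bits = M + W                           # byte offset within a block
sets = (2 ** N) // (2 ** (M + W) * K)         # cache blocks / associativity
index_bits = int(math.log2(sets))
tag_bits = P - index_bits - offset_bits

print(tag_bits, P - N + int(math.log2(K)))    # both 19: tag = P - N + log2(K)
```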
Q. Consider a 2-way set associative cache with 256 blocks and uses LRU replacement, Initially the cache is
empty. Conflict misses are those misses which occur due to contention of multiple blocks for the same cache set.
Compulsory misses occur due to first-time access to the block. The following sequence of accesses to memory
blocks
(0, 128, 256, 128, 0, 128, 256, 128, 1, 129, 257, 129, 1, 129, 257, 129)
is repeated 10 times. The number of conflict misses experienced by the cache is ____________ .
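The blank can be filled by simulation. A sketch (256 blocks, 2-way → 128 sets; the helper name is my own; since only six distinct blocks are touched, the cache is never full, so every repeat miss is a conflict miss):

```python
from collections import OrderedDict

def conflict_misses(trace, sets, ways):
    """Count conflict misses of a set-associative LRU cache, starting empty."""
    cache = [OrderedDict() for _ in range(sets)]  # per-set LRU order
    seen = set()
    conflicts = 0
    for block in trace:
        s = cache[block % sets]
        if block in s:
            s.move_to_end(block)          # hit: mark most recently used
            continue
        if block in seen:                 # repeat access that missed:
            conflicts += 1                # conflict (cache never fills up here)
        seen.add(block)
        if len(s) >= ways:
            s.popitem(last=False)         # evict the set's LRU block
        s[block] = True
    return conflicts

trace = (0, 128, 256, 128, 0, 128, 256, 128,
         1, 129, 257, 129, 1, 129, 257, 129) * 10
print(conflict_misses(trace, sets=128, ways=2))  # 76
```

Per set, the first iteration gives 3 compulsory misses plus 2 conflict misses; each of the remaining 9 iterations gives 4 conflict misses, so (2 + 9·4) · 2 sets = 76.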
Q. If a 16-way set-associative cache is made up of 64-bit words, 16 words per line, and 8192 sets, how big is the cache in megabytes?
Ans: 16MB
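The given answer checks out as a straight multiplication:

```python
word_bytes     = 64 // 8   # 64-bit words
words_per_line = 16
ways           = 16
sets           = 8192

size_bytes = sets * ways * words_per_line * word_bytes
print(size_bytes // 2**20)  # 16 (MB)
```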
Q. In a two-level cache system, the access times of L1 and L2 caches are 1 and 8 clock cycles, respectively. The
miss penalty from the L2 cache to main memory is 18 clock cycles. The miss rate of L1 cache is twice that of L2.
The average memory access time (AMAT) of this cache system is 2 cycles. The miss rate
of L1 and L2 respectively are:
Ans:
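A sketch of the computation, assuming the standard two-level model AMAT = t_L1 + m1·(t_L2 + m2·penalty) with m1 = 2·m2: setting AMAT = 2 gives 1 = 2·m2·(8 + 18·m2), i.e. 36·m2² + 16·m2 − 1 = 0.

```python
import math

# Solve 36*m2**2 + 16*m2 - 1 = 0 for the positive root:
m2 = (-16 + math.sqrt(16**2 + 4 * 36)) / (2 * 36)   # = 4/72 = 1/18
m1 = 2 * m2                                          # = 1/9

print(round(m1, 3), round(m2, 3))  # 0.111 0.056
```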
Q. Do you get any difference between the two programs? Assume the cache has a block size of 4 words, each word being 4 bytes.
Program 1 Program 2