
ROHINI COLLEGE OF ENGINEERING & TECHNOLOGY

1.4 MEMORY HIERARCHY


 Memory has three key characteristics: capacity, access time, and cost.
 There is a trade-off among these three characteristics:
o Cost
o Capacity
o Access time
 A memory hierarchy is employed to balance this trade-off. It is the hierarchy of memory and storage devices found in a computer system, ranging from the slowest but highest-capacity auxiliary memory to the fastest but lowest-capacity cache memory.
 A typical hierarchy is illustrated in the figure below. As one goes down the hierarchy, the following occur:
a. Decreasing cost per bit
b. Increasing capacity
c. Increasing access time
d. Decreasing frequency of access to the memory by the processor
 Thus, smaller, more expensive, faster memories are supplemented by larger, cheaper,
slower memories.
 The key to the success of this organization is the decreasing frequency of access at
lower levels.
 Suppose that the processor has access to two levels of memory. Level 1 contains 1,000
bytes and has an access time of 0.1 μs; level 2 contains 100,000 bytes and has an access
time of 1 μs.
 Assume that if a byte to be accessed is in level 1, then the processor accesses it directly.
If it is in level 2, then the byte is first transferred to level 1 and then accessed by the
processor.


Fig: Memory Hierarchy


 The following figure shows the general shape of the curve that models this situation. It shows the average access time to a two-level memory as a function of the hit ratio H, where H is defined as the fraction of all memory accesses that are found in the faster memory (e.g., the cache), T1 is the access time to level 1, and T2 is the access time to level 2.


Fig: Performance of Two Level Memory

 As can be seen, for high percentages of level 1 accesses, the average total access time is much closer to that of level 1 than that of level 2. In our example, suppose 95% of the memory accesses are found in the cache (H = 0.95). Then the average time to access a byte can be expressed as
(0.95)(0.1 μs) + (0.05)(0.1 μs + 1 μs) = 0.095 + 0.055 = 0.15 μs
 The result is close to the access time of the faster memory. So the strategy of using
two memory levels works in principle.
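
The calculation above is easy to check with a short script. This is a minimal sketch in Python; the function name average_access_time is illustrative, and the miss-path cost of T1 + T2 follows the assumption stated earlier that a missed byte is first transferred to level 1 and then accessed.

```python
def average_access_time(hit_ratio, t1, t2):
    """Mean access time (in microseconds) for a two-level memory.

    A hit costs t1; a miss costs t1 + t2, because the byte is first
    transferred to level 1 and then accessed by the processor.
    """
    return hit_ratio * t1 + (1 - hit_ratio) * (t1 + t2)


# Values from the example: H = 0.95, T1 = 0.1 us, T2 = 1 us.
print(f"{average_access_time(0.95, 0.1, 1.0):.3f} us")   # prints 0.150 us
```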
 The basis for the validity of condition (d) above (decreasing frequency of access to the memory by the processor) is a principle known as locality of reference.
 Locality of reference, also known as the principle of locality, is the tendency of a
processor to access the same set of memory locations repetitively over a short period
of time.
 Accordingly, it is possible to organize data across the hierarchy such that the
percentage of accesses to each successively lower level is substantially less than that
of the level above.
 The fastest, smallest, and most expensive type of memory consists of the registers
internal to the processor.


 Skipping down two levels, main memory is the principal internal memory system of
the computer. Each location in main memory has a unique address.
 Main memory is usually extended with a higher-speed, smaller cache. The cache is not
usually visible to the programmer or, indeed, to the processor. It is a device for staging
the movement of data between main memory and processor registers to improve
performance.
 The three forms of memory just described are, typically, volatile and employ
semiconductor technology.
 Data are stored more permanently on external mass storage devices, of which the
most common are hard disk and removable media, such as removable disk, tape, and
optical storage.
 External, nonvolatile memory is also referred to as secondary memory or auxiliary
memory. These are used to store program and data files.
 A hard disk is also used to provide an extension to main memory known as virtual
memory.
 Additional levels can be effectively added to the hierarchy in software. For example, a
portion of main memory can be used as a buffer to temporarily hold data that are to
be read out to disk. Such a technique, sometimes referred to as a disk cache, improves
performance in two ways:
 Disk writes are clustered. Instead of many small transfers of data, we have a few large
transfers of data. This improves disk performance and minimizes processor
involvement.
 Some data destined for write-out may be referenced by a program before the next
dump to disk. In that case, the data are retrieved rapidly from the software cache
rather than slowly from the disk.
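
The disk-cache idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name DiskCache, the flush_threshold parameter, and the dictionary standing in for the disk are all invented for the example.

```python
class DiskCache:
    """Software write buffer: clusters writes and serves re-reads from memory."""

    def __init__(self, disk, flush_threshold=8):
        self.disk = disk                  # dict standing in for the (slow) disk
        self.pending = {}                 # block number -> data awaiting write-out
        self.flush_threshold = flush_threshold

    def write(self, block_no, data):
        self.pending[block_no] = data     # small writes are clustered in memory
        if len(self.pending) >= self.flush_threshold:
            self.flush()                  # one large transfer instead of many small ones

    def read(self, block_no):
        # Data referenced before the next dump to disk is retrieved rapidly
        # from the software cache rather than slowly from the disk.
        if block_no in self.pending:
            return self.pending[block_no]
        return self.disk.get(block_no)

    def flush(self):
        self.disk.update(self.pending)    # write all pending blocks out at once
        self.pending.clear()


disk = {}
buffer = DiskCache(disk)
buffer.write(5, "data")
print(buffer.read(5))   # served from the software cache, no disk access
buffer.flush()
print(disk[5])          # now written out to the "disk"
```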


1.5 CACHE MEMORY


Cache memory is a small, fast memory placed between the processor and main memory. The basic idea of using a cache is simple: when the CPU needs a word, it first looks in the cache. Only if the word is not there does it go to main memory.

Motivation
 The rate at which the processor can execute instructions is clearly limited by the
memory cycle time (the time it takes to read one word from or write one word to
memory).

Fig: Single Cache


 This limitation has been a significant problem because of the persistent mismatch
between processor and main memory speeds:
 Over the years, processor speed has consistently increased more rapidly than memory
access speed.
 We are faced with a trade-off among speed, cost, and size.
 Ideally, main memory should be built with the same technology as that of the
processor registers, giving memory cycle times comparable to processor cycle times.
 This has always been too expensive a strategy. The solution is to exploit the principle
of locality by providing a small, fast memory between the processor and main
memory, namely the cache.

Cache Principles
 The cache contains a copy of a portion of main memory.
 When the processor attempts to read a byte or word of memory, a check is made to
determine if the byte or word is in the cache.
 If so, the byte or word is delivered to the processor (cache hit). If not, a block of main memory, consisting of some fixed number of bytes, is read into the cache and then the byte or word is delivered to the processor (cache miss).
 Multiple levels of cache may be used: the L2 cache is slower and typically larger than the L1 cache, and the L3 cache is slower and typically larger than the L2 cache.
The following figure depicts the structure of a cache/main memory system.

 Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address.
 This memory is considered to consist of a number of fixed-length blocks of K words each. That is, there are M = 2^n / K blocks.
 The cache consists of C slots (also referred to as lines) of K words each, and the number of slots is considerably less than the number of main memory blocks (C << M, where << means "much less than"). A small worked example of these quantities is given after the figure.


Fig : Cache - Main Memory Structure
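
A small worked example of the quantities defined above, as a Python sketch. The concrete values of n, K, and C are illustrative; the text fixes only the relationships among them.

```python
n = 16                # address width in bits (illustrative)
K = 4                 # words per block (illustrative)
C = 128               # cache slots, i.e. lines (illustrative)

words = 2 ** n        # 65,536 addressable words
M = words // K        # M = 2^n / K = 16,384 main-memory blocks
assert C < M          # C << M: far fewer cache slots than memory blocks

print(f"{words} words, {M} blocks of {K} words each, {C} cache slots")
```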

 If a word in a block of memory that is not in the cache is read, that block is transferred
to one of the slots of the cache.
 Each slot includes a tag that identifies which particular block is currently being stored.
 The tag is usually some number of higher-order bits of the address and refers to all
addresses that begin with that sequence of bits.
 As a simple example, suppose that we have a 6-bit address and a 2-bit tag. The tag 01
refers to the block of locations with the following addresses: 010000, 010001, 010010,
010011, 010100, 010101, 010110, 010111, 011000, 011001, 011010, 011011, 011100,
011101, 011110, 011111.
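
The tag example can be verified with a few lines of Python. The helper tag_of is illustrative; it simply keeps the high-order tag bits of an address.

```python
ADDRESS_BITS = 6
TAG_BITS = 2

def tag_of(address):
    # The tag is the high-order bits of the address.
    return address >> (ADDRESS_BITS - TAG_BITS)

# All 6-bit addresses whose 2-bit tag is 01: 010000 through 011111.
matching = [f"{a:06b}" for a in range(2 ** ADDRESS_BITS) if tag_of(a) == 0b01]
print(len(matching), matching[0], matching[-1])   # 16 010000 011111
```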
Cache Read Operation
The processor generates the address, RA, of a word to be read. If the word is contained
in the cache, it is delivered to the processor. Otherwise, the block containing that word is
loaded into the cache and the word is delivered to the processor.
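
The read operation just described can be summarized in a short Python sketch. The block-keyed dictionary is a simplification that ignores the mapping and replacement issues discussed below; the names read and K are assumptions made for the example.

```python
K = 4                                        # words per block (illustrative)

def read(ra, cache, main_memory):
    """Deliver the word at address RA, loading its block on a cache miss."""
    block_no = ra // K                       # block containing the requested word
    if block_no not in cache:                # cache miss
        start = block_no * K
        cache[block_no] = main_memory[start:start + K]   # load the whole block
    return cache[block_no][ra % K]           # deliver the word (hit path)


main_memory = list(range(64))                # 64 words of fake main memory
cache = {}
print(read(10, cache, main_memory))          # miss: loads block 2, prints 10
print(read(11, cache, main_memory))          # hit: block 2 is already cached
```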


CACHE DESIGN
The key elements in cache design are:
 Cache size
 Block size
 Mapping function
 Replacement algorithm
 Write policy
 Number of cache levels
Cache size: It turns out that reasonably small caches can have a significant impact on
performance.

Block size: The unit of data exchanged between cache and main memory. As the block size increases from very small to larger sizes, the hit ratio will at first increase because of the principle of locality: data in the neighborhood of a referenced word are likely to be referenced in the near future.


Mapping Function: When a new block of data is read into the cache, the mapping function
determines which cache location the block will occupy. Two constraints affect the design of
the mapping function.
First, when one block is read in, another may have to be replaced.
The more flexible the mapping function, the more scope we have to design a replacement
algorithm to maximize the hit ratio.
Second, the more flexible the mapping function, the more complex is the circuitry required to
search the cache to determine if a given block is in the cache.
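
The text does not commit to a particular mapping function. As an illustration only, the least flexible common choice is direct mapping, in which main-memory block j may occupy only cache slot j mod C, leaving the replacement algorithm no freedom at all.

```python
C = 8                                       # number of cache slots (illustrative)

def slot_for(block_no):
    return block_no % C                     # the single slot this block may occupy

# Blocks 3, 11, and 19 all map to slot 3, so loading one of them may force
# another out; a more flexible mapping would allow a smarter replacement choice.
print([slot_for(b) for b in (3, 11, 19)])   # [3, 3, 3]
```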

The replacement algorithm: An effective strategy is to replace the block that has been in the cache longest with no reference to it. This policy is referred to as the least-recently-used (LRU) algorithm.
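
A minimal sketch of LRU replacement for a small, fully associative cache. The class name LRUCache, the load_block callback, and the two-slot example are assumptions made for illustration; an OrderedDict keeps blocks in recency order.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, slots):
        self.slots = slots
        self.blocks = OrderedDict()            # block number -> data, oldest first

    def access(self, block_no, load_block):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # referenced: now most recently used
            return self.blocks[block_no]
        if len(self.blocks) >= self.slots:
            self.blocks.popitem(last=False)    # evict the least-recently-used block
        self.blocks[block_no] = load_block(block_no)   # fetch from memory on a miss
        return self.blocks[block_no]


cache = LRUCache(slots=2)
for b in (1, 2, 1, 3):                         # block 2 goes longest without a reference
    cache.access(b, load_block=lambda n: f"block {n}")
print(list(cache.blocks))                      # [1, 3]: block 2 was replaced
```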

Write policy: If the contents of a block in the cache are altered, then it is necessary to write it
back to main memory before replacing it. The write policy dictates when the memory write
operation takes place.
At one extreme, the writing can occur every time the block is updated (write-through). At the other extreme, the writing occurs only when the block is replaced (write-back).
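
A minimal sketch contrasting the two policies named above. The dictionaries and function names are illustrative: write-through updates main memory on every write, while write-back sets a dirty flag and defers the update until the block is replaced.

```python
def write_through(cache_block, main_memory, block_no, data):
    cache_block["data"] = data
    main_memory[block_no] = data                      # memory updated on every write

def write_back(cache_block, data):
    cache_block["data"] = data
    cache_block["dirty"] = True                       # main memory is now stale

def replace(cache_block, main_memory, block_no):
    if cache_block.get("dirty"):
        main_memory[block_no] = cache_block["data"]   # written back only now
    cache_block.clear()


memory = {0: "old"}
block = {"data": "old"}
write_back(block, "new")
print(memory[0])          # still "old": the write was deferred
replace(block, memory, 0)
print(memory[0])          # "new": written back when the block is replaced
```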
