Unit 4
MEMORY MANAGEMENT
Main Memory: Background – Swapping - Contiguous Memory Allocation – Segmentation – Paging - Structure
of the Page Table – Virtual Memory: Background - Demand Paging - Copy-on-Write - Page Replacement -
Allocation of Frames – Thrashing - Memory-Mapped Files - Allocating Kernel Memory.
Introduction to Memory
Background: the role of the OS in memory management
Role of Memory
• A typical instruction-execution cycle, for example, first fetches an
instruction from memory.
• The instruction is then decoded and may cause operands to be fetched
from memory.
• After the instruction has been executed on the operands, results are stored
back in memory.
• Main memory and the registers built into the processor itself are the only
general-purpose storage that the CPU can access directly.
• Registers that are built into the CPU are generally accessible within one
cycle of the CPU clock.
Process and Address
• Each process has a separate memory space.
• Separate per-process memory space protects the processes from each other
and is fundamental to having multiple processes loaded in memory for
concurrent execution.
• To separate memory spaces, we need the ability to determine the range of
legal addresses that the process may access and to ensure that the process
can access only these legal addresses.
A base and a limit register define a logical
address space.
• If the base register holds 300040 and the limit register holds 120900, then
the program can legally access all addresses from 300040 through 420939
(inclusive).
• Any attempt by a program executing in user mode to
access operating-system memory or users’ memory
results in a trap to the operating system, which treats
the attempt as a fatal error.
• This scheme prevents a user program from
(accidentally or deliberately) modifying the code or
data structures of either the operating system or other
users.
Base and limit register
• The base and limit registers can be loaded only by the operating system,
which uses a special privileged instruction.
• Since privileged instructions can be executed only in kernel mode, and
since only the operating system executes in kernel mode, only the
operating system can load the base and limit registers.
• This scheme allows the operating system to change the value of the
registers but prevents user programs from changing the registers’ contents.
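A minimal sketch of this protection check (an illustrative model with assumed names, not any specific CPU's interface): every address generated in user mode must fall within [base, base + limit), or the hardware traps to the operating system.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of the hardware check performed on every
 * user-mode memory reference. */
bool address_is_legal(uint32_t addr, uint32_t base, uint32_t limit)
{
    /* Legal iff base <= addr < base + limit; otherwise trap. */
    return addr >= base && addr < base + limit;
}

With base = 300040 and limit = 120900, this accepts exactly the addresses 300040 through 420939; any other reference would trap as described above.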
Address Binding
• A program resides on a disk as a binary executable file.
• To be executed, the program must be brought into memory and placed
within a process.
• Depending on the memory management in use, the process may be moved
between disk and memory during its execution.
• The processes on the disk that are waiting to be brought into memory for
execution form the input queue.
• The normal single-tasking procedure is to select one of the processes in the
input queue and to load that process into memory.
Address Binding (Cont..)
• Most systems allow a user process to reside in any part of the physical memory.
• Thus, although the address space of the computer may start at 00000, the first
address of the user process need not be 00000.
• Addresses may be represented in different ways during these steps.
• Addresses in the source program are generally symbolic (such as the variable count).
• A compiler typically binds these symbolic addresses to relocatable addresses (such
as “14 bytes from the beginning of this module”).
• The linkage editor or loader in turn binds the relocatable addresses to absolute
addresses (such as 74014).
• Each binding is a mapping from one address space to another.
• The binding of instructions and data to memory addresses can be done at any
of these steps: compile time, load time, or execution time.
Dynamic Linking and Shared Libraries
• Some operating systems support only static linking, in which system
libraries are treated like any other object module and are combined by the
loader into the binary program image.
• Dynamic linking, in contrast, is similar to dynamic loading. Here, though,
linking, rather than loading, is postponed until execution time.
• With dynamic linking, a stub is included in the image for each library
routine reference.
• The stub is a small piece of code that indicates how to locate the
appropriate memory-resident library routine or how to load the library if
the routine is not already present.
Types of Loading
Dynamic Loading
• Since the entire program and data of a process must be in physical memory for
the process to execute, the size of a process is limited to the size of
physical memory. To obtain better memory-space utilization, we can use
dynamic loading.
• With dynamic loading, a routine is not loaded until it is called. All routines are
kept on disk in a relocatable load format.
• The advantage of dynamic loading is that a routine is loaded only when it is
needed.
• This method is particularly useful when large amounts of code are needed to
handle infrequently occurring cases, such as error routines.
• It is the responsibility of the users to design their programs to take advantage of
such a method. Operating systems may help the programmer, however, by
providing library routines to implement dynamic loading.
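On POSIX systems, the dlopen/dlsym library routines provide exactly this kind of support. A minimal sketch, assuming a hypothetical library ./liberrors.so that exports a routine named error_handler:

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Load the library only when the infrequently used routine is needed. */
    void *handle = dlopen("./liberrors.so", RTLD_LAZY);  /* hypothetical library */
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Bind the routine's address at run time, by name. */
    void (*error_handler)(int) = (void (*)(int))dlsym(handle, "error_handler");
    if (error_handler != NULL)
        error_handler(42);  /* hypothetical error code */

    dlclose(handle);
    return 0;
}

Compiling typically requires linking with -ldl; the library and symbol names here are assumptions for illustration only.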
Logical Versus Physical Address Space
• An address generated by the CPU is commonly referred to as a logical
address (virtual address), whereas an address seen by the memory unit
(that is, the one loaded into the memory-address register of the memory) is
commonly referred to as a physical address.
• The set of all logical addresses generated by a program is a logical address
space.
• The set of all physical addresses corresponding to these logical addresses
is a physical address space.
• The compile-time and load-time address-binding methods generate
identical logical and physical addresses.
• With execution-time binding, however, the logical (virtual) and physical
addresses differ.
[Figure: run-time mapping of virtual addresses to physical addresses by the MMU]
Memory Management Unit (MMU)
• The run-time mapping from virtual to physical addresses is done by a
hardware device called the memory-management unit (MMU).
• The base register is now called a relocation register.
• The value in the relocation register is added to every address generated by
a user process at the time the address is sent to memory.
• The user program never sees the real physical addresses.
• We now have two different types of addresses: logical addresses (in the
range 0 to max) and physical addresses (in the range R + 0 to R + max for a
base value R).
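A minimal sketch of this relocation scheme (an illustrative model, not a real MMU interface): the relocation register R is added to every logical address, so logical addresses 0 to max map to physical addresses R + 0 to R + max.

#include <stdint.h>
#include <stdio.h>

/* Illustrative MMU model: relocate every logical address by R. */
uint32_t mmu_translate(uint32_t logical, uint32_t relocation)
{
    return relocation + logical;
}

int main(void)
{
    uint32_t r = 14000;  /* assumed relocation value R */
    /* Logical address 346 maps to physical address 14346. */
    printf("%u\n", mmu_translate(346, r));
    return 0;
}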
Swapping
• A process must be in memory to be executed.
• A process, however, can be swapped temporarily out of memory to a
backing store and then brought back into memory for continued
execution.
• Swapping makes it possible for the total physical address space of all
processes to exceed the real physical memory of the system, thus
increasing the degree of multiprogramming in a system.
• Standard swapping involves moving processes between main memory
and a backing store.
[Figure: swapping of two processes using a disk as a backing store]
What is a backing store ?
• The backing store is commonly a fast disk.
• It must be large enough to accommodate copies of all memory images for all
users, and it must provide direct access to these memory images.
• The system maintains a ready queue consisting of all processes
whose memory images are on the backing store or in memory and are
ready to run.
• Whenever the CPU scheduler decides to execute a process, it calls the
dispatcher.
• The dispatcher checks to see whether the next process in the queue is
in memory.
Standard swapping
• The major part of the swap time is transfer time.
• The total transfer time is directly proportional to the amount of
memory swapped.
• If we want to swap a process, we must be sure that it is
completely idle.
• Never swap a process with pending I/O, or execute I/O operations
only into operating-system buffers.
• Standard swapping requires too much swapping time and provides too little
execution time to be a reasonable memory-management solution.
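As an illustrative calculation with assumed figures: swapping out a 100-MB process to a backing store with a transfer rate of 50 MB per second takes 100 / 50 = 2 seconds, and swapping another 100-MB process in takes 2 seconds more, so one swap-based context switch costs on the order of 4 seconds of pure transfer time.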
Memory Allocation
• The main memory must accommodate both the operating system
and the various user processes.
• We can place the operating system in either low memory or high
memory.
• Since the interrupt vector is often in low memory, programmers
usually place the operating system in low memory as well.
• In contiguous memory allocation, each process is contained in a
single section of memory that is contiguous to the section
containing the next process.
Memory Allocation (Cont..)
• One of the simplest methods for allocating memory is to divide memory
into several fixed-sized partitions.
• In this multiple-partition method, when a partition is free, a process is
selected from the input queue and is loaded into the free partition.
• When the process terminates, the partition becomes available for another
process.
• In the variable-partition scheme, the operating system keeps a table
indicating which parts of memory are available and which are occupied.
• Initially, all memory is available for user processes and is considered one
large block of available memory, a hole.
Memory Allocation (Cont..)
• In general, as mentioned, the memory blocks available comprise a set of
holes of various sizes scattered throughout memory.
• When a process arrives and needs memory, the system searches the set for
a hole that is large enough for this process.
• If the hole is too large, it is split into two parts.
• One part is allocated to the arriving process; the other is returned to the set
of holes.
• If the new hole is adjacent to other holes, these adjacent holes are merged
to form one larger hole.
• System may need to check whether there are processes waiting for
memory and whether this newly freed and recombined memory could
satisfy the demands of any of these waiting processes.
How to satisfy a request of size n from a list of free holes?
• First fit. Allocate the first hole that is big enough. Searching can start
either at the beginning of the set of holes or at the location where the
previous first-fit search ended. We can stop searching as soon as we find a
free hole that is large enough.
• Best fit. Allocate the smallest hole that is big enough. We must search the
entire list, unless the list is ordered by size. This strategy produces the
smallest leftover hole.
• Worst fit. Allocate the largest hole. Again, we must search the entire list,
unless it is sorted by size. This strategy produces the largest leftover hole,
which may be more useful than the smaller leftover hole from a best-fit
approach.
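A minimal first-fit sketch over a singly linked list of holes (the structure and names here are illustrative, not taken from any particular allocator):

#include <stddef.h>
#include <stdint.h>

/* Illustrative node in the set of free holes. */
struct hole {
    uint32_t     start;  /* starting address of the hole */
    uint32_t     size;   /* size of the hole in bytes    */
    struct hole *next;
};

/* First fit: allocate from the first hole that is big enough, splitting
 * it in two; the remainder stays in the set of holes.  Returns the
 * allocated start address, or UINT32_MAX if no hole is large enough. */
uint32_t first_fit(struct hole *list, uint32_t n)
{
    for (struct hole *h = list; h != NULL; h = h->next) {
        if (h->size >= n) {
            uint32_t addr = h->start;
            h->start += n;   /* one part goes to the arriving process   */
            h->size  -= n;   /* the other part remains a (smaller) hole */
            return addr;
        }
    }
    return UINT32_MAX;       /* request cannot be satisfied contiguously */
}

Best fit would instead scan the whole list for the smallest sufficient hole, and worst fit for the largest; only the search policy changes, not the split.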
External Fragmentation
• Both the first-fit and best-fit strategies for memory allocation suffer from
external fragmentation.
• As processes are loaded and removed from memory, the free memory
space is broken into little pieces.
• External fragmentation exists when there is enough total memory space to
satisfy a request but the available spaces are not contiguous: storage is
fragmented into a large number of small holes.
• In the worst case, we could have a block of free (or wasted) memory
between every two processes. If all these small pieces of memory were in
one big free block instead, we might be able to run several more processes.
The 50-Percent Rule and Internal Fragmentation
• Statistical analysis of first fit, for instance, reveals that, even with some
optimization, given N allocated blocks, another 0.5 N blocks will be lost to
fragmentation. That is, one-third of memory may be unusable! This
property is known as the 50-percent rule.
• Memory fragmentation can be internal as well as external.
• Consider a multiple-partition allocation scheme with a hole of 18,464
bytes.
• Suppose that the next process requests 18,462 bytes.
• If we allocate exactly the requested block, we are left with a hole of 2 bytes.
The overhead to keep track of this hole will be substantially larger than the
hole itself.
Internal Fragmentation
• The general approach to avoiding this problem is to break the physical
memory into fixed-sized blocks and allocate memory in units based on
block size.
• With this approach, the memory allocated to a process may be slightly
larger than the requested memory. The difference between these two
numbers is internal fragmentation—unused memory that is internal to a
partition.
• One solution to the problem of external fragmentation
is compaction
Compaction
• The goal is to shuffle the memory contents to place all free memory
together in one large block.
• Compaction is not always possible, however.
• If relocation is static and is done at assembly or load time, compaction
cannot be done.
• Compaction is possible only if relocation is dynamic and is done at
execution time.
• If addresses are relocated dynamically, relocation requires only moving
the program and data and then changing the base register to reflect the
new base address.
Compaction (Cont..)
• The simplest compaction algorithm is to move all processes toward one
end of memory; all holes move in the other direction, producing one large
hole of available memory. This scheme can be expensive.
Solution to External Fragmentation
• Another possible solution to the external-fragmentation
problem is to permit the logical address space of the processes
to be noncontiguous, thus allowing a process to be allocated
physical memory wherever such memory is available.
Two complementary techniques achieve this solution:
1. Segmentation
2. Paging
Segmentation – Programmers view of memory
• User’s view of memory is not the same as the actual physical memory.
• Segmentation is a memory-management scheme that supports this
programmer view of memory.
Segmentation (cont..)
• Each segment has a name and a length. The addresses specify both the
segment name and the offset within the segment.
• The programmer therefore specifies each address by two quantities: a
segment name and an offset.
• For simplicity of implementation, segments are numbered and are referred
to by a segment number, rather than by a segment name.
• Thus, a logical address consists of a two-tuple:
<segment-number, offset>.
Segments
A C compiler might create separate segments for the following:
1. The code
2. Global variables
3. The heap, from which memory is allocated
4. The stacks used by each thread
5. The standard C library
Segmentation Hardware
• Although the programmer can now refer to objects in the program by a
two-dimensional address, the actual physical memory is still a one
dimensional sequence of bytes.
• Thus, we must define an implementation to map two-dimensional user-
defined addresses into one-dimensional physical addresses.
• This mapping is effected by a segment table.
• Each entry in the segment table has a segment base and a segment limit.
• The segment base contains the starting physical address where the
segment resides in memory, and the segment limit specifies the length of
the segment.
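A minimal sketch of this lookup (illustrative structures and names; a real segment table is consulted by hardware):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative segment-table entry. */
struct segment {
    uint32_t base;   /* starting physical address of the segment */
    uint32_t limit;  /* length of the segment                    */
};

/* Map the two-tuple <s, d> to a one-dimensional physical address.
 * An offset beyond the segment limit traps (modeled here as exit). */
uint32_t seg_translate(const struct segment *table, uint32_t s, uint32_t d)
{
    if (d >= table[s].limit) {
        fprintf(stderr, "trap: offset %u exceeds limit of segment %u\n", d, s);
        exit(EXIT_FAILURE);
    }
    return table[s].base + d;
}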
[Figure: segmentation hardware]
[Figure: segmentation example]
Paging
• Paging avoids external fragmentation and the need for compaction, whereas
segmentation does not.
• The problem arises because, when code fragments or data residing in main
memory need to be swapped out, space must be found on the backing
store.
• Paging is implemented through cooperation between the operating system
and the computer hardware.
Basic Method of Paging Implementation
• The basic method for implementing paging involves breaking physical
memory into fixed-sized blocks called frames and breaking logical
memory into blocks of the same size called pages.
• The backing store is divided into fixed-sized blocks that are the same size
as the memory frames or clusters of multiple frames.
• Every address generated by the CPU is divided into two parts: a page
number (p) and a page offset (d).
• The page number is used as an index into a page table.
• The page table contains the base address of each page in physical memory.
• This base address is combined with the page offset to define the physical
memory address that is sent to the memory unit.
[Figure: paging hardware]
[Figure: paging model of logical and physical memory]
Paging
• The page size (like the frame size) is defined by the hardware.
• If the size of the logical address space is 2^m bytes and the page size is
2^n bytes, then the high-order m − n bits of a logical address designate the
page number, and the n low-order bits designate the page offset.
• p is an index into the page table and d is the displacement within the page
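A minimal sketch with assumed sizes, m = 32 and n = 12 (4-KB pages); the page table is modeled as a plain array of frame numbers, purely for illustration:

#include <stdint.h>

#define OFFSET_BITS 12                        /* n = 12: 4-KB pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1)

/* The high-order m - n = 20 bits index the page table; the low-order
 * n bits pass through unchanged as the offset within the frame. */
uint32_t page_translate(const uint32_t *page_table, uint32_t logical)
{
    uint32_t p     = logical >> OFFSET_BITS;  /* page number p    */
    uint32_t d     = logical &  OFFSET_MASK;  /* page offset d    */
    uint32_t frame = page_table[p];           /* table lookup     */
    return (frame << OFFSET_BITS) | d;        /* physical address */
}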
Multi-level Paging
• Most modern computer systems
support a large logical address
space. In such an environment, the
page table itself becomes excessively
large.
• One way is to use a two-level paging
algorithm, in which the page table
itself is also paged.
[Figure: address translation for a two-level paging architecture]
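A minimal two-level sketch under the same assumed 32-bit, 4-KB layout: the 20-bit page number is split into a 10-bit index p1 into the outer page table and a 10-bit index p2 into the inner page table it selects (all structures here are illustrative):

#include <stdint.h>

#define LEVEL_BITS  10
#define OFFSET_BITS 12
#define MASK(bits)  ((1u << (bits)) - 1)

/* outer[p1] points to an inner page table; inner[p2] holds the frame. */
uint32_t two_level_translate(uint32_t **outer, uint32_t logical)
{
    uint32_t p1 = logical >> (LEVEL_BITS + OFFSET_BITS);        /* top 10 bits    */
    uint32_t p2 = (logical >> OFFSET_BITS) & MASK(LEVEL_BITS);  /* middle 10 bits */
    uint32_t d  = logical & MASK(OFFSET_BITS);                  /* low 12 bits    */
    uint32_t *inner = outer[p1];
    return (inner[p2] << OFFSET_BITS) | d;
}

Only the outer table and the inner tables actually in use need to be resident, which is what makes paging the page table itself worthwhile.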
Implementation of Paging
• The standard solution is to use a special, small, fast lookup hardware cache
called a translation look-aside buffer (TLB).
• The TLB is associative, high-speed memory.
• When the associative memory is presented with an item, the item is compared
with all keys simultaneously.
• If the item is found, the corresponding value field is returned.
• The search is fast; a TLB lookup in modern hardware is part of the instruction
pipeline, essentially adding no performance penalty.
• To be able to execute the search within a pipeline step, however, the TLB must
be kept small. It is typically between 32 and 1,024 entries in size.
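In hardware, the page number is compared against all TLB entries simultaneously; software can only approximate that with a scan. A minimal illustrative model (entry layout and size are assumptions):

#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64  /* assumed size, within the typical 32-1,024 range */

struct tlb_entry {
    bool     valid;
    uint32_t page;   /* key: page number    */
    uint32_t frame;  /* value: frame number */
};

/* Sequential model of the parallel associative match. */
bool tlb_lookup(const struct tlb_entry tlb[TLB_ENTRIES],
                uint32_t page, uint32_t *frame)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].page == page) {
            *frame = tlb[i].frame;  /* TLB hit */
            return true;
        }
    }
    return false;                   /* TLB miss: consult the page table */
}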
Translation Lookaside Buffer (TLB)
• The TLB contains only a few of the page-table entries.
• When a logical address is generated by the CPU, its page number is presented to the TLB.
Effective Memory-Access Time (EMAT)
• The percentage of times that the page number of interest is found in the
TLB is called the hit ratio.
• With t the TLB lookup time, m the memory-access time, and h the hit ratio:
EMAT = h(t + m) + (1 − h)(t + 2m)
EMAT Calculation Example (TLB access time = 20 ns)
• An 80-percent hit ratio, for example, means that we find the desired page
number in the TLB 80 percent of the time. If the TLB lookup takes 20
nanoseconds and a memory access takes 100 nanoseconds, then a mapped-memory
access takes 20 + 100 = 120 nanoseconds when the page number is in the TLB;
on a TLB miss, it takes 20 + 100 + 100 = 220 nanoseconds (one extra memory
access to read the page table).
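Plugging these numbers into the EMAT formula above:
EMAT = 0.80 × (20 + 100) + 0.20 × (20 + 100 + 100)
     = 0.80 × 120 + 0.20 × 220
     = 96 + 44 = 140 nanoseconds,
a 40-percent slowdown relative to the bare 100-nanosecond memory access.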
Allocating Kernel Memory through Slab Allocation
• When the kernel requests memory for an object, the slab allocator
returns the exact amount of memory required to represent the object.
Slab Allocation (Cont..)
The slab allocator provides two main benefits:
1. No memory is wasted due to fragmentation, because each cache is divided
into chunks the size of the objects it represents.
2. Memory requests can be satisfied quickly.
• The slab allocation scheme is thus particularly effective for managing
memory when objects are frequently allocated and deallocated, as is often
the case with requests from the kernel.
• Objects are created in advance and thus can be quickly allocated from the
cache.
• When the kernel has finished with an object and releases it, it is marked as
free and returned to its cache, thus making it immediately available for
subsequent requests from the kernel.
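In the Linux kernel, this interface is the kmem_cache API. A minimal sketch of the cycle just described, inside a hypothetical kernel-module context (struct my_object and the cache name are assumptions for illustration):

#include <linux/slab.h>

/* Hypothetical kernel object with its own slab cache. */
struct my_object {
    int  id;
    char name[32];
};

static struct kmem_cache *my_cache;

static int example(void)
{
    struct my_object *obj;

    /* One cache per object type; slabs hold preconstructed slots. */
    my_cache = kmem_cache_create("my_object", sizeof(struct my_object),
                                 0, SLAB_HWCACHE_ALIGN, NULL);
    if (!my_cache)
        return -ENOMEM;

    /* Allocation hands back exactly one object-sized slot, quickly. */
    obj = kmem_cache_alloc(my_cache, GFP_KERNEL);
    if (obj)
        kmem_cache_free(my_cache, obj);  /* marked free, kept in the cache */

    kmem_cache_destroy(my_cache);
    return 0;
}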