CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Dr Babajide Afeni
Operating Systems
CPU Virtualization
Overview
• RECAP
• What are the resources in a system
• What is a process
• What are the key functions of an OS with respect to processes
• Now
• process lifecycle
• context switch
Processes (RECAP)
• One of the most fundamental
abstractions that an OS can
provide
• A process is just a running
program
• OS should allow you to create,
run, suspend, terminate
processes
• Each process needs access to
resources
Virtualization
• All of the preceding “resources” may be used by a process
• Every process must be given the illusion that these resources are
present only for its own use, and these resources are secure (other
processes cannot view or change information one process put into
these resources).
• This is achieved through “virtualization”
• virtualization must be efficient – the O/S should use very little of the system’s
resources
• We will cover CPU virtualization, then memory virtualization
Process Creation
• A program is stored in executable format on a disk
• When needed (e.g. if you double-click an icon or run “py
mycode.py”) the OS copies the program to memory
• Programs MUST run from memory
• Modern OS only load enough of the application to get it running – if the
application needs some additional libraries they are loaded later
• It allocates memory for the program to store its data
• on heap or stack (we’ll come to that later)
• it finds an unused piece of memory for your program (we’ll come to that later)
• It then tells the CPU to start executing your code
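To make this concrete, here is a minimal Python sketch (assuming a Unix-like system where the sleep utility is on the PATH) of asking the OS to create, run and wait for a process:

import subprocess

# The OS locates the executable, loads enough of it into memory,
# allocates stack/heap space, and tells the CPU to start executing it.
child = subprocess.Popen(["sleep", "2"])
print("created process with PID", child.pid)

child.wait()                      # block until the child terminates
print("child exit code:", child.returncode)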
That’s it?
• No
• At any time MANY processes running on a system
• The OS needs to stop and start processes
• For example, a process may hang
• In general OS needs to ensure the resources are used efficiently
• The OS needs to ensure security
• e.g. processes don’t “trash” the system
• e.g. processes can’t read each others’ data
• Not in today’s lecture
Process Lifecycle [OSTEP]
• Running:
• The process is currently using the CPU to
execute instructions.
• For simplicity, we assume there’s only one
CPU/core.
• Ready:
• The process is waiting in a queue to use the
CPU.
• The OS decides when to switch processes
between the ready and running states.
• Blocked:
• The process is waiting for some I/O
operation (like reading from disk, or user
input).
• It can't continue until this operation is
finished.
Pause
(Open Top Hat)
Speeds
• Latency
• how long it takes between a request being sent and the first data received in
response
• What is a typical SSD latency? 100 microseconds
• How fast is a fast network ping? 10 milliseconds
• https://www.virginmedia.com/blog/gaming/what-is-a-good-ping
• How fast does a human being react? 250 milliseconds
• https://humanbenchmark.com/tests/reactiontime
• If a process accesses disk, network or waits for human user input, a
CPU is idle for a long time
CPU virtualization
• Why is a process blocked?
• It needs to open a file, access the internet, wait for user input etc.
• This takes an EON. An eternity. In CPU speed terms.
• If we let the process stay on the CPU it’s not very efficient
• A process gets un-blocked when its I/O is complete
• An un-blocked process does not get to run, it becomes “ready”
• Why is a process de-scheduled?
• Some other process needs to run (including the O/S)
• When you change the running process, this is called a context switch
• If the process initiated an I/O this is called a voluntary CS
• If the O/S forced the change this is called an involuntary CS
Pause
(Open Top Hat)
Context Switch
• If a process is scheduled or de-scheduled, this is a context switch
• The O/S must be able to RAPIDLY perform context switches
• If a process stops running (de-scheduled / blocked) and goes to ready and
comes back (scheduled) IT MUST BE AS THOUGH IT NEVER STOPPED
RUNNING
• Therefore the O/S must SAVE the state of the process when it de-schedules
a process or the process blocks
• What is the “state of a process”?
• Instruction counter
• Contents of CPU registers
• Contents of memory
Context Switch
• It can be too slow to save the contents of memory. You may have an
array of 2 billion entries in memory
• Ah but won’t a process’s memory get overwritten when it’s suspended?
• Will discuss more during memory virtualization
• No need to save cache data – cache is there only for speeding things
up, not for correctness.
• O/S therefore only saves the contents of CPU registers during context
switch.
• CPUs have hardware support for saving the contents of registers
Other Process States
• Initial (the process has just started and is being loaded)
• Final state (the process is finished but not yet completely deleted)
• This is also called a “zombie” state
• Often processes “spawn” other processes – e.g. a game may spawn a process
to get user input while it draws a screen
• When a spawned process ends, the parent process needs to check whether
the process completed successfully or (say) crashed or encountered an error
• That’s one reason for zombie processes
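A toy Unix-only Python sketch of the zombie state: the child exits, but stays a zombie (shown as “defunct” by ps) until the parent reaps it with waitpid and checks its exit status. The exit code 42 is arbitrary.

import os, time

pid = os.fork()                        # create a child process
if pid == 0:
    os._exit(42)                       # child: terminate immediately
else:
    time.sleep(1)                      # child is now a zombie
    _, status = os.waitpid(pid, 0)     # parent reaps the zombie...
    print("child exit code:", os.WEXITSTATUS(status))   # ...and checks how it ended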
Summary
• What are the resources in a system (RECAP)
• What is a process
• What are the key functions of an OS with respect to processes
• What is a process lifecycle
• What happens during a context switch
CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Dr Babajide Afeni
Motivation
• Module covers Operating systems, Networks AND Concurrency
• Essential for building modern computer systems, working with BIG
data, basis of decentralized ledger technologies (crypto) etc.
• Leads to important career opportunities such as system
administration, cloud architect, or network analyst
• “Computer Systems” theme in the BSc which starts with Computer
Architecture, and continues in 3rd year with Networking and Cloud
computing
• Also relevant to 3rd year advanced C++ programming
Overview
• Module will cover Operating Systems, Networks, Concurrency
• EACH of these topics could be two 15-credit modules, so we cover
“highlights”
• Approximate split:
• O/S 4.5 weeks
• Concurrency 1.5 weeks
• Networks 3.5 weeks
• Taught in a similar way to last year
People
• Convenor 1: Professor Rajeev Raman ( 瑞吉福 )
• UG from Indian Institute of Technology, Delhi
• MSc/PhD University of Rochester, NY, USA
• Worked at: Max-Planck Institute, University of Maryland and King’s College
London before UoL
• Head of Department from 2003-06
• Taught CO2101 last two years, Networks at MSc level (UoL) and O/S (not UoL).
Wrote an O/S kernel for a module during MSc.
• Free time: gym, languages, scuba diving
People
• Convenor 2: Dr Babajide Afeni
• Support staff:
• Dr Anand Sengodan
• TAs: Ben Cham and Adam Machowczyk (GTAs), Ansel, Charles.
Contact
• In addition to teaching CO2101 I have other responsibilities:
• Acting Director of Research for the School
• Director of PhD programme for the School
• Responsible for MSc Exams process for the School
• Supervise BSc and MSc final projects, and MSc group work
• Commitments outside the University
• Lots of deadlines – so PLEASE REMIND me if you don’t hear from me.
Schedule
• Every week has 4 taught slots:
• 2 Lectures, 1 Tutorial and 1 Seminar.
• There are 2 hours of computer classes on roughly alternate weeks (5
in all)
• Tutorial/seminar
• Tutorials will typically be practical demonstrations. Tutorials may be short
• Seminars will typically go over answers to exercises
Resources
• Blackboard site
• Slides, recordings, lab worksheets etc.
• O/S textbook:
• OSTEP: Operating Systems: Three Easy (!) Pieces
• OSTEP is a free textbook, available online
• Reference for O/S and concurrency part
• Networking reference book:
• Tanenbaum, Feamster, Wetherall, Computer Networks, 6th ed, 2021.
Assessment
• All assessment will be Blackboard based
• Three types of questions
• MCQ / short answer questions (40%)
• Main resource: slides
• problem-solving questions (40%)
• Main resource: worksheets we will cover
• longer answers (20%)
• Main resource: textbooks, slides, lecture videos etc.
• We have two in-lab closed-book tests which contain only MCQ / SA questions:
• Knowledge Test 1 (08/11) (20%)
• Knowledge Test 2 (13/12) (20%)
• Final Exam (WB 6/1) contains last two types (60%): open-book, “take from
anywhere”
• You will have plenty of guidance on what will be asked. “No surprises”
Lab Work
• All lab exercises are formative
• There will be no programming coursework
• Attending labs is recommended
• A significant number of MCQ and essay-type questions will be about python
code fragments and other things covered in the lab exercises
• Lab worksheets will be introduced in tutorials and the code in the lab
will be discussed in the seminars
Operating Systems
CPU Virtualization
Pause
(Open Top Hat)
Operating System
• An operating system (OS) is system software that manages computer
hardware and software resources, and provides common services for
computer programs. [Wikipedia]
• OS is software that acts as an intermediary between users and the
computer hardware.
• It manages hardware resources and provides a platform for applications
to run.
• The OS ensures that the hardware is used efficiently and allows users to
interact with the computer easily.
Main Flavours of Operating Systems
• There are two main flavours of OS
• Microsoft Windows:
• One of the most widely used, especially in personal computers.
• Known for its user-friendly interface, extensive software support, and dominance
in desktops and laptops.
• Windows 10, Windows 11
• UNIX-based Systems:
• Family of OS that are stable and secure
• Commonly used in mobile devices (iOS, Android) and technical environments
(servers).
• Linux (Ubuntu, Fedora), macOS (Apple computers), iOS (iPhones), Android
(smartphones).
• Most of what we teach is relevant to both flavours, but we’ll be basing
things like labs on Linux.
Logos
Processes
• One of the most fundamental abstractions that an OS can provide
• A process is just a running program
• OS must provide ways to create, run, temporarily suspend and
terminate processes
• Running processes need hardware and software resources.
• What are hardware resources?
Computer Systems [OSTEP]
[Figure: a computer system annotated with – GPU: renders images for display; CPU: the “brain” of the computer; RAM: a kind of memory; also shown: a list of currently active processes and their command-line arguments]
Join Code
074942
Direct execution
• Gives control to user process
• User process may perform undesirable actions
• delete other users’ files etc.
• User process can run forever
• Since user process is running, OS can’t take back control
• Some of you felt one process running too long is a problem
• Training neural net versus short programs?
• CPU is at 100% utilization so what’s the problem?
Protecting the system
• All CPUs can run in kernel mode (privileged) and user mode (normal)
• In user mode, the CPU cannot access or modify memory locations that contain the OS code or data,
and cannot directly talk to storage or network devices
• To do this, the CPU must switch to kernel mode
• Any time the user process calls a function that has to do with the OS, e.g. f =
open(“myfile”, “rw”), this is a system call (syscall)
• The first instruction in a syscall is a trap CPU instruction
• pauses the user code, saves register state carefully, changes the hardware status to kernel mode, and
starts running the OS code (similar to context switch)
• When the OS finishes a syscall, it returns to the user program via the return-from-trap CPU
instruction
• reduces privilege (going from kernel to user mode) and returns control to the user program at the
instruction after the syscall.
• “Limited direct execution”
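As an illustration (a sketch, assuming a file named myfile exists in the current directory): in Python, the os module functions below are thin wrappers around syscalls, and running the script under strace on Linux shows the corresponding traps into the kernel.

import os

fd = os.open("myfile", os.O_RDONLY)   # open syscall - CPU traps to kernel mode
data = os.read(fd, 100)               # read syscall - kernel talks to the device
os.close(fd)                          # close syscall
print(data)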
Limited Direct Execution [OSTEP]
• Last instruction done by O/S
before starting a process
• return-from-trap
• First instruction done in a syscall
• trap
• Last instruction done in a syscall
• return-from-trap
• Finally, trap executed when
process calls exit() – or ends
Protecting the system
• Kernel and user mode have to be built into CPU hardware – can’t be done using
software only
• User process can’t perform “illegal” actions (e.g. modify files without
permission)
• User has to make a syscall, OS starts running and checks if the user has the permissions
• However, the syscall itself can have vulnerabilities
• The parameter of a syscall (e.g. f = open(“myfile”, “rw”)) is a potential security
problem
• E.g. a user asks to open file that unintentionally gives information on other users’ processes
• Bugs in the code for parameter handling: opening a file with the string $MFT in its name used to
cause Win 8.1 to crash
• Many exploits target the “syscall/trap handler” code (google syscall vulnerabilities)
Taking back control
• We need to share the CPU between different users’ processes
• “Cooperative” scheduling – user A’s code runs until it makes a syscall and hands over
control to the OS, OS then decides whether to let A’s code to restart or give to user
B (“voluntary” context switch). Used in early Mac OS
• What if a user A’s code has an infinite loop? [Answer: “reboot”]
• Much more typical is “non-cooperative” scheduling
• As we will see, in order to share the CPU among many users, the OS must stop user
A’s code and run user B’s – even if A’s code isn’t faulty
• The CPU supports a hardware timer
• When the time is up, the CPU interrupts whatever is running and calls a special OS timer
interrupt handler
• This allows the OS to regain control
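A user-level analogue in Python (a sketch of the idea, not the real mechanism, and Unix-only): setitimer asks the kernel to interrupt this process periodically, just as the hardware timer interrupts whatever is running so the OS regains control.

import signal, time

def on_timer(signum, frame):
    print("timer fired - the OS scheduler would pick the next process here")

signal.signal(signal.SIGALRM, on_timer)           # install the handler
signal.setitimer(signal.ITIMER_REAL, 0.5, 0.5)    # fire every 500ms

t_end = time.time() + 2
while time.time() < t_end:                        # busy loop playing "user code"
    pass
signal.setitimer(signal.ITIMER_REAL, 0)           # cancel the timer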
Timer Interrupts [OSTEP]
Summary
• At any time, only the user’s code or the OS code are running
• When the user’s code is running, the OS code is NOT running
• The CPU needs to provide hardware support for the OS to keep
control and maintain system integrity
• hardware support for fast context switch
• hardware support for at least two modes of execution: a restricted user mode
and a privileged (non-restricted) kernel mode
• Should support a trap instruction to switch from user to kernel mode, and a
return-from-trap instruction for the other direction
• Should support hardware timers and timer-based interrupts of current code
CPU virtualization
• We’ve discussed how the OS shares the CPU with user processes
securely and can interrupt processes to take back control
• Virtualization should be secure, efficient and fair.
• For efficiency and fairness, the OS must decide which processes
should run when – this is CPU scheduling
• Ideal objective:
• Effective virtualization – as far as possible jobs get the impression the CPU is
only for them
• This includes “responsiveness” which is a kind of “fairness”
• OSTEP 7-10
CPU Scheduling
• The policy used by the OS to decide which job to run next
• There are 1000s of academic papers on this subject
• Many algorithms possible depending on
• The context
• What is known
• What is desired
• Our context: scheduling for a smallish multi-user system e.g. Xanthus
• Scheduling in Amazon’s cloud (for example) is different
Scheduling (Assumptions)
• Processes are sometimes called jobs here
• Various assumptions can be made
• Processes have a defined arrival time – when the process is started by a user
or another process
• Processes have a defined amount of CPU time they need
• Sometimes assume that we know this amount (job arrives saying “I need 10s
CPU time”)
• This is an unrealistic assumption
Two options
• Example (setup implied by the numbers below): job A arrives at t = 0 and
needs 10s of CPU; job B arrives at t = 2 and needs 1s
• Letting A finish first
• A finishes after 10 seconds – A feels it had the CPU all to itself
• B finishes after 11 seconds – B feels the CPU is very slow – it arrived 9
seconds ago, and it’s only just finished.
• Interrupting A
• A finishes after 11 seconds – A needed 10 seconds but finished in 11, CPU just
a little slow.
• B finishes within 1 second of arrival – B feels it has the CPU all to itself.
• In each case the CPU is kept 100% busy
• Arguably interrupting is better
Metrics
• The objective of scheduling is “efficiency” “fairness” and “responsiveness”
• These have to be measured, so we have scheduling “metrics”
• First metric is turnaround time (efficiency)
• For job A, if t_arrival and t_completion are the times it arrives and finishes respectively, then
• t_turnaround = t_completion − t_arrival
• Objective is to minimize the average turnaround time for a collection of jobs
• Second metric is response time (responsiveness)
• For job A, if t_start is the time it first starts running, then
• t_response = t_start − t_arrival
• Objective is to minimize the average response time for a collection of jobs
• Fairness: no obvious metric
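The two metrics as short Python functions, applied to three hypothetical jobs given as (arrival, first start, completion) times in seconds:

def turnaround(arrival, start, completion):
    return completion - arrival

def response(arrival, start, completion):
    return start - arrival

jobs = [(0, 0, 10), (10, 10, 20), (10, 20, 30)]   # hypothetical jobs
print("ATAT:", sum(turnaround(*j) for j in jobs) / len(jobs))   # 13.33
print("ART:",  sum(response(*j) for j in jobs) / len(jobs))     # 3.33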
Summary
• Scheduling is the policy the OS uses to select the next job to run on
the CPU
• Scheduling should ensure that the system appears to the processes to
be efficient, fair and responsive
• “Efficiency” is measured by turnaround time
• “Responsiveness” is measured by response time
CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Operating Systems
CPU Virtualization
Outline
• Recap of CPU virtualization
• Scheduling metrics
• Scheduling algorithms:
• FIFO, STCF, RR
• Advantages and disadvantages of these
Recap: Process State
• Schedule/deschedule – done by
the OS (“involuntary CS”)
• Discussed previously how the OS
gains control, using traps and
timers
• Now, out of the many “ready”
processes, which to “schedule”?
Summary
• Scheduling is the policy the OS uses to select the next job to run on
the CPU
• Scheduling should ensure that the system appears to the processes to
be efficient, fair and responsive
• “Efficiency” is measured by turnaround time
• “Responsiveness” is measured by response time
CPU Scheduling (Us)
• Processes are called jobs here
• We assume the following
• Jobs have a defined arrival time – when the process is started by a user or
another process
• Jobs have a defined amount of CPU time they need
• Sometimes we assume that we know this amount (job arrives saying “I need
10s CPU time”)
• Objective:
• Effective virtualization – as far as possible, jobs get the impression the CPU is
only for them
Metrics
• The objective of scheduling is “efficiency”, “fairness”, “responsiveness”
• These have to be measured, so we have “scheduling metrics”
• First metric is turnaround time
• For job A, if t_arrival and t_completion are the times it arrives and finishes respectively, then
• t_turnaround = t_completion − t_arrival
• The closer the turnaround time is to the CPU usage the better…
• Aim to minimize the average turnaround time for a collection of jobs
• Second metric is response time
• For job A, if t_start is the time it first starts running, then
• t_response = t_start − t_arrival
• Aim to minimize the average response time for a collection of jobs
• Fairness: no obvious metric
Algorithms
• We consider three simple scheduling algorithms
• FIFO (first-in-first-out) aka FCFS (first-come-first-served)
• Round-robin (RR)
• STCF (shortest time to completion first) aka SRTF (shortest remaining time
first)
FIFO (First-in First-out)
• Also called First-come first-served (FCFS)
• Place all jobs in ready queue in the order they arrive
• Always pick first job on the queue
FIFO (example)
• A, B, C all need 10s CPU time
• A arrives at t = 0, B, C arrive at t
= 10s
• A completes at t = 10s
• B completes at t = 20s
• C completes at t = 30s
• Average TAT = 13.33
FIFO (example)
• A needs 100s and B and C 10s CPU
time
• A arrives at t = 0, B, C arrive at t = 10s
• A completes at t = 100s
• B completes at t = 110s
• C completes at t = 120s
• Average TAT = 103.33
Called “convoy” effect – ATAT is poor
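A short Python sketch of FIFO that reproduces both averages (13.33s from the first example, 103.33s from the convoy example):

def fifo_atat(jobs):
    # jobs: (name, arrival_time, cpu_time) tuples, in arrival order
    t, total_tat = 0, 0
    for name, arrival, need in jobs:
        t = max(t, arrival)        # CPU may sit idle until the job arrives
        t += need                  # FIFO runs each job to completion
        total_tat += t - arrival
    return total_tat / len(jobs)

print(fifo_atat([("A", 0, 10), ("B", 10, 10), ("C", 10, 10)]))    # 13.33
print(fifo_atat([("A", 0, 100), ("B", 10, 10), ("C", 10, 10)]))   # 103.33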
Better Schedule (example)
• A needs 100s and B, C 10s CPU time
• A arrives at t = 0, B, C arrive at t =
10s
• Interrupt A when B, C arrive, restart
A when they finish
• B completes at t = 20s
• C completes at t = 30s
• Average TAT = 50
Half the average TAT of FIFO
STCF
• Shortest time to completion first
• In the queue, for each job, keep track of how much remaining CPU time it has to
complete
• Remaining time = time to completion, so STCF = SRTF
• Schedule the job with the smallest remaining time
• In the example:
• A ran for 10s and had 90s remaining time when B, C arrived
• B, C need only 10s so they get priority over A
• STCF is optimal, which means it reduces the average TAT to the smallest
amount possible
• Previous slide is STCF example – we’ll do a more complicated example in a
tutorial
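A sketch of preemptive STCF in Python, stepping one second at a time; it reproduces the ATAT of 50s from the “Better Schedule” example (A: 100s at t = 0; B and C: 10s each at t = 10):

def stcf_atat(jobs):
    # jobs: (name, arrival_time, cpu_time) tuples
    arrival = {name: arr for name, arr, _ in jobs}
    remaining = {name: need for name, _, need in jobs}
    t, total_tat, done = 0, 0, 0
    while done < len(jobs):
        ready = [n for n in remaining if arrival[n] <= t and remaining[n] > 0]
        if not ready:
            t += 1                 # nothing has arrived yet - idle
            continue
        job = min(ready, key=lambda n: remaining[n])   # smallest remaining time
        remaining[job] -= 1        # run it for one second
        t += 1
        if remaining[job] == 0:    # job finished: record its turnaround
            total_tat += t - arrival[job]
            done += 1
    return total_tat / len(jobs)

print(stcf_atat([("A", 0, 100), ("B", 10, 10), ("C", 10, 10)]))   # 50.0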
Responsiveness
• Turnaround time is not the only measure
• Job A needs 100s but it first asks for user input (e.g. the name of a
file) from the console
• Job B needs 50s and has no user input
• Both jobs arrive at the same time
• Job B is scheduled first (good for TAT)
• Job A waits 50s before the user can even input the name of the file
• Response time of a job: how soon the job starts after it arrives
• Objective is to minimize average response time
Round-robin
• RR picks the first job in its ready queue and runs it for a fixed amount
of time
• This amount of time is called the time slice or scheduling quantum
• Once a job’s time slice is over, it is interrupted and added to the back
of the queue
• Does a newly created job come to the front or the back of the queue
(front is better for responsiveness)
RR vs STCF
• 3 jobs A, B, C, all arrive at t=0
and need 5s.
• STCF schedules A, B, C
• Average TAT = (5 + 10 + 15)/3 = 10s
• Average RT = (0 + 5 + 10)/3 = 5s
• RR time-slice 1s
• Average TAT = (13 + 14 + 15)/3 = 14s
• Average RT = (0 + 1 + 2)/3 = 1s
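A round-robin sketch in Python with a 1s time slice, reproducing the numbers above for three 5s jobs that all arrive at t = 0:

from collections import deque

def rr(jobs, quantum=1):
    # jobs: (name, cpu_time) tuples; all arrive at t = 0 in queue order
    remaining = dict(jobs)
    queue = deque(remaining)
    t, first_run, completion = 0, {}, {}
    while queue:
        job = queue.popleft()
        first_run.setdefault(job, t)       # first time on the CPU
        run = min(quantum, remaining[job])
        t += run
        remaining[job] -= run
        if remaining[job] > 0:
            queue.append(job)              # time slice over: back of the queue
        else:
            completion[job] = t
    n = len(jobs)
    print("ATAT:", sum(completion.values()) / n)   # 14.0
    print("ART:", sum(first_run.values()) / n)     # 1.0

rr([("A", 5), ("B", 5), ("C", 5)])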
Trade-off
• Any policy (such as RR) that is fair, i.e., that evenly divides the CPU among active
processes on a small time scale, will perform poorly on metrics such as
turnaround time
• This is an inherent trade-off:
• if you are willing to be unfair, you can run shorter jobs to completion, but at the cost of
response time
• If you value fairness, then you will be more responsive but at the cost of turnaround time
• No “cakeism”
• Real-world schedulers
• Must try and balance the two requirements
• Must do so without knowing CPU requirements in advance
• Subject of next lecture
Summary
• Scheduling metrics
• Scheduling algorithms:
• FIFO, STCF, RR
• Advantages and disadvantages of these (“trade-offs”).
CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Operating Systems
CPU Virtualization
Recap: Scheduling
• OSTEP chapters 7-8
• Basic issue:
• in order to share the CPU among many users, the OS must stop user A’s code
and run user B’s – even if A’s code isn’t faulty
• Usually not just “A” and “B” but hundreds of processes running at any given
time
• Scheduling is about how to choose which process to run next
• Aim: “good” virtualization, jobs feel CPU is responsive to “their needs”
Recap: Scheduling
• The OS needs to balance
• Efficiency (one measure of which is “turnaround time”)
• Responsiveness (one measure of which is “response time”)
• Fairness (which doesn’t have a measure as such)
• Doing well on these three should lead to “good virtualization”
• Each job will feel the CPU is responsive to its needs
• Three algorithms so far
• FIFO – not good for anything really
• Shortest Time to Completion First (STCF)
• Guarantees minimum average turnaround time
• Unrealistically assumes we know the CPU time needed by each job
• Can be poor for response time
• Round-robin (RR)
• Good for response time
• Poor for turnaround time
Today’s Algorithm
• Multi-level Feedback Queue (Chapter 8, OSTEP)
• Invented in base form by Fernando Corbato in 1962 in an OS called CTSS
• Corbato led the development of MULTICS in 1964-67, a joint project between
MIT and Bell Labs
• MULTICS was the inspiration for UNIX
• Corbato won the Turing Award for this work
• Used in some form or the other in many Unix systems and Windows
NT. Linux switched from MLFQ a few years ago.
MLFQ
• Jobs can be roughly categorized into two kinds:
• CPU-intensive jobs – these often need a lot of CPU time
• de-prioritized for turnaround time, like STCF
• can’t use STCF in practice since we don’t know a job’s CPU usage in advance
• example – key generation from lab
• I/O intensive or interactive jobs – these usually need less CPU time
• typical execution pattern: short CPU bursts followed by I/O
• such jobs release CPU frequently by voluntary context switches
• prioritized to reduce response time (for interactivity) and also for turnaround time
• example – search for pattern in files from lab
• Not all jobs can be categorized this way
• Jobs can be CPU intensive for a bit then have an I/O intensive or interactive phase
• Which job is which kind?
• MLFQ observes the behaviour and tries to respond appropriately
MLFQ
• Puts jobs into several queues
• For concreteness, let’s assume there are three queues Q2, Q1, Q0.
• In real implementations, could be more
• Different queues have different priorities (Q2 highest, Q1 next
highest, Q0 lowest)
• All jobs in the same queue have the same priority
• Every queue has a fixed CPU time quantum
• It’s possible different queues have different quanta
• E.g. Q2 has quantum 10ms, Q1 has quantum 20ms etc.
MLFQ Rules
• Jobs are scheduled according to 5 rules
• Rules
1. always schedule next a job from the highest priority queue that isn’t empty
2. all the jobs in the same queue are scheduled round-robin
3. when a job arrives, it is placed in the highest priority queue
• jobs that return from I/O are also placed in the highest priority queue
4. once a job uses up its CPU quantum in its current queue it moves down one
queue
• Rule 5 later.
Rules 1-4
1. always schedule next a job from the highest priority queue that isn’t empty
2. all the jobs in the same queue are scheduled round-robin
3. when a job arrives, it is placed in the highest priority queue
• jobs that return from I/O are also placed in the highest priority queue
4. once a job uses up its CPU quantum in its current queue it moves down one
queue (its priority is reduced)
• Rules 1-3 are to ensure low response time for interactive jobs
• Rule 4 is aimed at identifying jobs with potential high CPU usage
• To minimize ATAT, jobs with lower CPU usage should be done first
• If a job keeps using up its CPU quantum without blocking, it keeps losing priority
Example situation
• Assume CPU quantum in a queue is 10ms. Job A arrives at 0 and
needs 180ms, Job B arrives at 100 and needs 20 ms
1. Job A (black) starts in Q2 (highest priority) – RULE 3
2. Run A for 10ms (no other jobs) – RULE 1
3. After 10ms A is put into Q1 (lower priority) – RULE 4
4. Run A for 10ms (no other jobs) – RULE 1
5. After 10ms A is put into Q0 (lowest priority) – RULE 4
6. Run A for 10ms (no other jobs) – RULE 1 (8 more times)
7. Job B (grey) arrives and starts in Q2 (highest priority) – RULE 3
8. After 10ms B is put into Q1 (lower priority) – RULE 4
9. Run B for 10ms, Job B finishes
• Job B is in Q1, Job A is in Q0, so pick Job B (RULE 1)
• RULE 2 (Round-robin) is not used because A and B are in different queues
• If B performs an I/O it will return after being blocked to Q2
• MLFQ behaves just like STCF in this case – without knowing in
advance that B needs 20ms or A needs 180ms
• In general MLFQ tries to be a compromise between STCF and RR
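A Python sketch of Rules 1-4 (no I/O and no priority boost yet) that replays this example – A needs 180ms from t = 0, B needs 20ms from t = 100, with a 10ms quantum in all three queues:

from collections import deque

QUANTUM, TOP = 10, 2                      # 10ms slices; Q2 is the top queue
jobs = {"A": {"arrival": 0, "left": 180}, "B": {"arrival": 100, "left": 20}}
queues = {2: deque(), 1: deque(), 0: deque()}

t, finished = 0, 0
while finished < len(jobs):
    for name, j in jobs.items():          # Rule 3: arrivals enter the top queue
        if j["arrival"] == t:
            queues[TOP].append(name)
    level = next((q for q in (2, 1, 0) if queues[q]), None)   # Rule 1
    if level is None:
        t += QUANTUM
        continue
    job = queues[level].popleft()         # Rule 2: FIFO within a queue
    jobs[job]["left"] -= QUANTUM
    print(f"t={t:3}ms: run {job} from Q{level}")
    t += QUANTUM
    if jobs[job]["left"] <= 0:
        finished += 1
    else:
        queues[max(level - 1, 0)].append(job)   # Rule 4: drop a level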
“Priority Boost”
• There is a fifth rule:
5. After some time period S, move all the jobs in the system to the highest-priority queue
• Reasons:
• If lots of interactive short-running jobs keep (re-)entering the system, they will
get priority over a long-running job which won’t get any CPU time
• E.g. C arrives while B is running, then D arrives while C is running etc. A will not make
any progress at all
• This optimizes average turnaround time but is poor from a “fairness” perspective
• A job can switch between being CPU-bound and interactive
• If it starts out as CPU-bound it will get low priority and the OS will make it seem
unresponsive when it becomes interactive
Adjusting MLFQ parameters
• Each implementation of MLFQ has many possible parameters
• How many queues?
• How much should the allotment of CPU time at each level be?
• Many MLFQ implementations make the allotment different for different levels e.g. if three priority levels
• Time-slice 10ms in Q2 (high priority), 20ms in Q1 and 40ms in Q0
• Reasoning: Q2 should have interactive or I/O bound jobs which won’t take up much CPU time before getting
blocked. Q0 jobs have large CPU requirements so give them a longer slice when they do get scheduled
• How often should you priority boost?
• Many implementations also have different versions of rule 4 (“decay-usage”) to
decide whether a job is interactive or CPU-bound
• Many implementations allow the user to give advice to the OS about their job.
E.g. nice command in many Un*x systems says this job is CPU bound and will
do OK with lower priority
CPU Virtualization (Summary)
• We’ve covered the following topics
• Processes (OSTEP 4)
• Process lifecycle, including how to create and manage processes (OSTEP 5)
• “Direct execution”, or how the OS boots up and maintains secure control of the
system through trap instructions, kernel/user mode, timer interrupts etc (OSTEP 6)
• CPU scheduling, or how the OS manages to share the CPU among many processes
while being “fair”, “responsive” and efficient (OSTEP 7 and 8).
• We’ve not covered the following advanced topics in OSTEP:
• “Lottery” scheduling (OSTEP 9)
• Multi-CPU scheduling (OSTEP 10)
• Expect: problem-solving, multiple-choice/short answer questions.
Memory Virtualization
• We now move to the next topic:
memory virtualization
• Here the focus is on RAM
• RAM is fast
• RAM is relatively limited in availability
• Most of a program’s data and code must
be in RAM for the CPU to execute it
• Lots of chapters in OSTEP (13-23)
• Some content will make a bit more sense
once you study C++
• Completed in just over 2 lectures here
Memory Virtualization (Taster)
• RAM must be shared among
processes
• When a process has CPU, it has
effectively 100% control of the CPU
• When a process has RAM it DOES
NOT have 100% of RAM
• We can kick a process off CPU
quickly
• context switch, microseconds
• Not possible for RAM
CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Operating Systems
Memory Virtualization
Overview
• We’ve finished CPU virtualization
• Next topic: memory virtualization, two lectures
• After this, file systems (1 week)
• Then concurrency (2 weeks)
• Then, networking (3 weeks)
Memory Virtualization
• We now move to the next topic:
memory virtualization
• Here the focus is on RAM
• RAM is fast
• RAM is relatively limited in availability
• A program’s data and code must
“mostly” be in RAM for the CPU to
execute it
• Lots of chapters in OSTEP (13-23)
• Some content will make more sense
once you learn C++
• Completed in just over 2 lectures here
Memory Virtualization
• RAM must be shared among
processes
• When a process has CPU, it has
effectively 100% control of the
CPU
• When a process has RAM it
DOES NOT have 100% of RAM
• We can kick a process off CPU
quickly
• Not possible for RAM
CPU’s view of RAM (recap)
• RAM is “randomly accessible”
• This means the CPU gives a
memory address, puts it on the
“address bus”
• An “address” typically specified by
64 bits
• The RAM chip returns the
contents of that memory
address on the “data bus”
• [Image: Wikipedia]
Process’s view of Memory [OSTEP
13-14]
• Each process has its own “virtual address space”
• In principle each process has memory addresses from 0 to 2^64 – 1 (for 64-
bit address spaces). TOP HAT please.
• 2^64 – 1 = 18,446,744,073,709,551,615 ~ 18 EB
• Computer systems have a few GB to 1TB of RAM
• Also, it’s hard work for the OS if the OS assumes every process can use any
part of its virtual address space
• So all processes interact with the OS to “allocate” and “free” memory,
within their own virtual address space
• These are indications as to which parts of the virtual address space are being used
by the process
Process Address Space
• Address space divided into three parts: code,
stack, heap
• In C, C++ the heap and stack look like this
(assuming 16KB address space)
• Heap and stack have slightly different
purposes
• Stack is used for variables and method calls in
Python (stack frames are created whenever a
method is called) – hence the name “stack”
• Stack is used in C/C++ for statically allocated
variables and method calls
• Heap is used for objects and instance
variables in Python
• Heap is used in C/C++ for “dynamically”
allocated variables
User code’s interaction with OS for
memory
• [Memory API chapter 14 OSTEP]
• A process is typically written in a high-level language such as Python, Java,
C++, C#
• Languages such as Java, Python, C# don’t let the user interact with the OS
for memory
• Java/Python/C# interact with the OS on the user’s behalf
• E.g. a = [1, 2, 3, 4] may require Python to tell the OS that it needs a bit
more heap space than before
• Languages such as C/C++ have calls such as malloc() (and the new operator) that allow
the user program to directly allocate chunks of the address space
• More efficient, much harder to debug and make secure
OS memory manager
• The process requires chunks of contiguous memory from its address space
• Reason: accessing elements in arrays must be super-fast
A = list(range(1, 1000))   # [1, 2, 3, ..., 999]
for i in range(len(A)):
    A[i] = A[i] * A[i]
print(A[233])
• A[233] is located 233 “units” from the start of A
• CPU loads address of start of A
• CPU adds 233 “units” to this
• CPU gets the address where A[233] is located
• ONLY WORKS IF A IS STORED CONTIGUOUSLY
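A quick check of this in Python (an illustrative sketch): Python’s array module stores raw values contiguously (unlike a plain list, which stores references to objects), so element i really does live at base + i × itemsize:

import array, ctypes

A = array.array("q", range(1000))        # 1000 contiguous 64-bit integers
base, length = A.buffer_info()           # address of element 0
addr = base + 233 * A.itemsize           # the "base + offset" computation

print(ctypes.c_int64.from_address(addr).value, A[233])   # both print 233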
Memory fragmentation [Chapter 17]
• We must make sure that the memory locations used by a process in an
address space are contiguous
• Although there are 20 bytes free, they are not contiguous. This free memory
is fragmented.
• This kind of fragmentation is called “external fragmentation”
• Another kind of fragmentation is if you create a data structure but don’t fill all of it
with data (“internal fragmentation”)
• If fragmentation gets out of control, then a process may only be using 100MB
of real memory, but say 20GB of its virtual address space. This is not good.
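A toy free list in Python showing external fragmentation: 20 bytes are free in total, but in four separate 5-byte runs, so a 10-byte allocation fails even though “enough” memory is free. The addresses are made up for illustration:

free_runs = [(0, 5), (12, 5), (25, 5), (38, 5)]    # (start address, length)

def allocate(size):
    for i, (start, length) in enumerate(free_runs):
        if length >= size:                 # first run big enough wins
            free_runs[i] = (start + size, length - size)
            return start
    return None                            # fragmented: no single run fits

print("total free bytes:", sum(length for _, length in free_runs))  # 20
print("allocate(10):", allocate(10))       # None - no contiguous 10 bytes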
Reducing External Fragmentation
• The OS memory manager tries to avoid fragmentation by
understanding typical memory request patterns
• There are a huge number of memory allocation algorithms, such as the
“buddy” allocator
• Languages such as Python, C#, Java can also “garbage collect” to
reduce their address space usage
• E.g. if array B is stored in the middle 10 bytes, copy it to the first 10 bytes,
now there are 20 contiguous free bytes
• Garbage collection is time-consuming and must be done infrequently
Summary so Far
• RAM is very large compared to CPU registers – can’t kick a process out of RAM when we
context switch
• Each process has a “virtual” address space (AS) of GINORMOUS size (on 64-bit machines --
18 EB)
• This contains its code, stack (static variables, procedure call frames) and heap (all other
variables, class instances etc)
• It’s essential that the OS only has to manage the parts of the AS of each process that is
“actually in use”
• The process interacts with the OS memory manager to allocate and free space within its
virtual address space
• In C, C++, user programs do this directly
• Fragmentation can cause the “actually in use” memory to be larger than it needs to be.
• the OS memory manager tries to minimize fragmentation
Sharing RAM [OSTEP 18 19 20]
• Peek ahead at next lecture
• The standard method for virtualization is “Paged Memory”
• Simple memory management, avoids fragmentation of physical memory
• Flexible – doesn’t need to worry about how process uses its AS
• Divide the AS of each process into equal-sized pages
• A page is just a fixed-size piece of contiguous addresses in an AS
• Divide the physical memory into page frames
• Each page frame is the same size as a page and will typically hold the contents of a page from the AS of
a process
• OS keeps track of all pages “in use” from a process’s AS, still a huge amount of information
• Hence imperative to keep fragmentation low
• To share RAM, only some of a process’ “in use” pages are kept in memory (“resident” pages)
• Next lecture: how paged memory works.
CO2101 Operating
Systems and
Networking
Professor Rajeev Raman
Operating Systems
Memory Virtualization
Memory Virtualization (Recap)
• How to share RAM
• RAM is very fast compared to disk
• RAM is very large compared to CPU registers
• we can copy and save CPU registers when we context switch
• we cannot copy and save a process’ RAM when we context switch
• Each process has a “virtual” address space (AS) of GINORMOUS size (on 64-
bit machines -- 18 billion GB)
• The process’ code executes instructions such as
movl <virtual address>, %eax
• The virtual address refers to an address in the process’ AS
• It is not a physical RAM address
• Improves security
Address Space (Recap)
• This contains its code, stack (static
variables, procedure call frames) and
heap (all other variables, class
instances etc)
• The process interacts with the OS
memory manager to allocate and
free space within its virtual address
space
• In C, C++, user programs do this directly
• The OS memory manager tries to
minimize fragmentation
• Not too much “wasted” space in white
areas in figure
Sharing RAM [OSTEP 18 19 20]
• The standard method is “Paged Memory”
• Simple memory management, avoids fragmentation of physical memory
• Flexible – doesn’t need to worry about how process uses its AS
• Divide the AS of each process into pages
• A page is just a fixed-size piece of contiguous addresses in an AS
• Divide the physical memory (RAM) into page frames
• Each page frame is the same size as a page
• A page frame holds the content of a page of the AS of some process
Example AS
• Assume processes have an AS of
only 64 bytes
• These bytes are at addresses 0
to 63
• The page size is 16 bytes
• The AS has 4 pages
• Bytes 0 – 15 are page 0
• Bytes 16 – 31 are page 1
• Bytes 32 – 47 are page 2
• Bytes 48 – 63 are page 3
Example Physical Memory
• Assume computer has only 128 bytes
• These bytes are at addresses 0 to 127
• The page frame size is 16 bytes
• The memory has 8 page frames
• Bytes 0 – 15 are page frame 0
• Bytes 16 – 31 are page frame 1
…
• A virtual address is viewed as a virtual page number of 2 bits (there are 4 pages
in the AS) and an offset of 4 bits (page size 2^4 = 16 bytes)
Address Translation (Example)
• movl 21, %eax
• 21 = 010101 in binary
• This is viewed as VPN = 01 (page 1) and offset = 0101 (byte 5 within the page) – see the sketch below
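A sketch of that translation in Python, using the toy 64-byte address space with 16-byte pages; the page table contents here are made up for illustration:

PAGE_SIZE = 16                            # 2^4 bytes, so a 4-bit offset
page_table = {0: 3, 1: 7, 2: 5, 3: 2}     # hypothetical VPN -> page frame

def translate(vaddr):
    vpn = vaddr // PAGE_SIZE              # top 2 bits of the 6-bit address
    offset = vaddr % PAGE_SIZE            # bottom 4 bits
    return page_table[vpn] * PAGE_SIZE + offset

print(translate(21))                      # VPN 1 -> frame 7: 7*16 + 5 = 117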
• Block 0 is the Master Boot Record (MBR), contains system boot code and is used to
record the partitions (block/sector range, which one to boot)
• Each partition has a boot and a super block, containing info about that file system
Organization of File System
• What we have to build a file system with is just the storage blocks in
the HDD/SSD
• Example (OSTEP Ch 40 P2)
• HDD partition has size 256KB, divided into 64 blocks of 4KB each.
• We want most of this to contain user data, but FS will need to use
some of the HDD partition to do the “book-keeping”
Organization of File System
• Example (OSTEP P2)
• HDD (partition) has size 256KB, divided into 64 blocks of 4KB each
• First 8 blocks for FS “metadata” to manage the files
• The remaining 56 blocks hold user data
• Blocks 3-7 hold an array of inodes (index nodes)
• An inode is the starting point for creating files – 1 inode per file
• Our example inode will be 128 bytes, so this FS can have about 160 files (5*4K/128)
Organization of File System
• HDD (partition) has size 256KB, divided into 64 blocks of 4KB each
• Block 0 is “superblock”, has “magic number” indicating FS type (FAT, vfs etc.), #inodes, #blocks
inode contents
• Example ext2 FS inode
• Each file has one inode
• Disk pointers point to the blocks
containing this file’s data
• In this example, 15 disk pointers
• a file can have only 15 data blocks
• 15 * 4KB = 60KB max file size
• Fact: most files are small, so this works
for most files
• Most != All
Multilevel pointers
• Unix ext2
• 12 direct pointers
• used for small files
• 12 pointers to DBs
• One indirect pointer
• points to a block that contains pointers to DBs
• One double indirect pointer
• points to a block that contains pointers to blocks
that contain pointers to DBs
• One triple indirect pointer
• Max file size depends on block size
• BS = 4KB, 4-byte pointers, 1000 pointers
• Max file size ~ 1000*1000*1000*4KB = 4TB
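The arithmetic as a short Python check (4KB blocks and 4-byte pointers, so 1024 pointers per block – the slide rounds this to 1000):

BLOCK = 4 * 1024
PTRS = BLOCK // 4                          # pointers per indirect block: 1024

blocks = 12 + PTRS + PTRS**2 + PTRS**3     # direct + single + double + triple
print(f"max file size ~ {blocks * BLOCK / 1024**4:.1f} TiB")   # ~4.0 TiB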
Implementing Directories
• The path name is used to locate a directory entry
• In Un*x, directory entries are inodes (so basically files)
• A directory entry contains all the information needed
• For an inode implementation, it contains the inode number
• A directory entry may also contain some attributes of the files (or a pointer to
them)
• The directory entry provides a mapping from a filename to the inode
corresponding to that file
Locating a Un*x file
• Given an absolute path name /usr/ast/mbox
• The system locates the root directory inode (always inode 2)
• Then look up the first path component (usr) in the root directory, to
find its inode number
• Use the inode number for /usr, to access its inode data; then locate
the inode number for /usr/ast, etc.
• This process is repeated until the actual file is located
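A toy Python model of this walk (the inode numbers other than the root’s are made up; a real file system reads directory blocks from disk at each step):

inodes = {
    2:  {"usr": 5},                    # root directory is always inode 2
    5:  {"ast": 9},                    # /usr
    9:  {"mbox": 14},                  # /usr/ast
    14: "mail data...",                # /usr/ast/mbox - an ordinary file
}

def resolve(path):
    ino = 2                            # start at the root inode
    for part in path.strip("/").split("/"):
        ino = inodes[ino][part]        # one directory lookup per component
    return ino

print(resolve("/usr/ast/mbox"))        # 14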
Summary
• File systems organize a collection of blocks on HDD/SSD to provide a
nice user interface
• File systems are usually implemented using “inodes”
Module Status
• In weeks 1-5 we have covered:
• CPU virtualization
• Memory virtualization
• File systems
• This concludes the “Operating Systems” part of the module
• The interim test will be on what we have already covered
• Remaining topics:
• Concurrency (1.5 weeks)
• Networks (~ 3.5 weeks)