Operating System Notes - Compressed
System structure: Operating System > intermediary between user and hardware
> A program that manages the computer hardware: convenient (mobile computers), efficient (mainframe computers)
> Responsible for: 1) system startup, 2) I/O, 3) storage
Multiprocessor Systems
Two or more processors in close communication, sharing the computer bus, clock, memory, and devices.
1. Increased throughput: faster execution, but not 100% linear speedup.
2. Economy of scale: peripherals, disks, and memory are shared among processors.
3. Increased reliability: failure of a CPU slows the system, but doesn't crash it. Redundant processing provides a system of checks and balances. (e.g. NASA)

Blade Servers
Multiple processor boards, I/O boards, and networking boards placed in the same chassis.
Each blade-processor board boots independently and runs its own operating system.
Boards may be multiprocessor (multiple independent multiprocessor systems).
Clustered Systems
Independent systems, with shared common storage and connected by a high-speed
LAN, working together.
Special considerations for access to shared storage are required, ( Distributed lock
management ), as are collaboration protocols.
Operating-System Structure A timesharing (multiuser multitasking) OS requires:
Memory management
Process management
Job scheduling
Resource allocation strategies
Swap space / virtual memory in physical memory
Interrupt handling
File system management
Protection and security
Inter-process communications
In many cases a smaller, higher-speed storage space serves as a cache -- temporary storage for frequently needed portions of larger, slower storage areas.
Hierarchy of memory storage: from CPU registers down to hard drives and external storage.
OS is responsible for determining what information to store in what level of cache, and when to transfer data from one level to another.
Proper choice of cache management can have a large impact on system performance.
Data read from disk follows a migration path from hard drive to main memory, to CPU cache, to registers before it is used, while written data follows the reverse path. Each step (other than the registers) fetches more data than is needed and caches the excess, so future requests are served faster.
When writing, small amounts of data are frequently buffered until there is enough to fill an entire "block" on the next output device in the chain.
This becomes complicated when multiple processes access common data: it is important to ensure that every access reaches the most up-to-date copy of the cached data (amongst several copies in different cache levels).
I/O Systems
Tree: nodes linked through a parent-child relationship (a binary tree has parents with at most two children, a left child and a right child).
Bitmaps
4) Client-Server Computing
Server provides services (HW or SW) to other systems as clients. (Clients and servers are processes, not HW, and may coexist on the same computer.)
A process may act as both client and server, of either the same or different resources.
Served resources include disk space, CPU cycles, time of day, IP name information, graphical displays (X Servers), and others.
5) Peer-to-Peer Computing
Any computer or process on network provide services to any other which requests it. No clear "leader" or overall organization.
May employ central "directory" server looking up location of resources, or may use peer-to-peer searching find resources.
E.g. Skype uses central server locate desired peer, further communication is peer to peer.
6) Virtualization
Allows one / more "guest" OS run on virtual machines hosted single physical machine and virtual machine manager.
Useful for cross-platform development and support.
Example, UNIX on virtual machine, hosted by virtual machine manager on Windows computer. Full root access to virtual
machine, if crashed, underlying Windows machine unaffected.
System calls caught by VMM and translated into (different) system calls made to underlying OS.
Virtualization slow down program run through VMM, can also speed up things if virtual hardware accessed through cache
instead of physical device.
Programs can also run simultaneously on native OS, bypassing virtual machines.
7) Cloud Computing
Deliver storage, applications as service over a network.
Types of cloud computing:
> Public cloud = anyone willing pay for service.
> Private cloud = Run company for internal use only.
> Hybrid cloud = Cloud with both public and private components.
> Software as a Service – SaaS = Applications (word processors) available via Internet
> Platform as a Service – PaaS = Software stack available for application use, (database
server)
> Infrastructure as a Service – IaaS = Servers or storage available on Internet, backup
servers, photo storage, file storage.
> Service providers may provide more than one type of service
Clouds contain thousands physical computers, millions virtual ones, petabytes total storage.
Web hosting services offer (one or more) virtual machine(s) to each client.
Linux
Developed by Linus Torvalds in Finland in 1991; the Linux kernel combined with the GNU tools forms a complete free operating system.
Different distributions of Linux evolved from Linus's original kernel, including RedHat, SUSE, Fedora, Debian, Slackware, and Ubuntu, each geared toward a different group of end-users and operating environments.
Linux on Windows system using VMware, steps:
1. Download free "VMware Player" tool from http://www.vmware.com/download/player
2. Choose Linux version virtual machine images at http://www.vmware.com/appliances
3. Boot the virtual machine within VMware Player.
BSD UNIX
UNIX was originally developed at AT&T Bell Labs, with source code made available to computer science students.
UCB students developed UNIX further and released the product as BSD UNIX, in both binary and source-code format.
BSD UNIX was not open-source, however; a license was still needed from AT&T.
Several versions of BSD UNIX, including Free BSD, NetBSD, OpenBSD, and DragonflyBSD
Source code located in /usr/src.
Core of Mac operating system is Darwin, derived from BSD UNIX, available at
http://developer.apple.com/opensource/index.html
Solaris
UNIX operating system computers from Sun Microsystems.
Based on BSD UNIX, migrated to ATT SystemV as basis.
Parts of Solaris opensource, some are not because covered by ATT copyrights.
Can change open-source components of Solaris, recompile, and link in with binary libraries of copyrighted
portions of Solaris.
Open Solaris from http://www.opensolaris.org/os/
Solaris allows viewing of source code online, without download and unpack entire package.
Utility
Free software movement gaining popularity, thousands ongoing projects involving untold numbers programmers.
Sites such as http://freshmeat.net/ and http://distrowatch.com/ provide portals to many of these projects.
#2 System
Structures
OS provides environment execution of programs
Provides certain services to programs and users
Services differ between OS, can identify common
classes
Services make programming task easier
User Interfaces: users issue commands to system. May be command-line interface ( e.g. sh, csh, ksh, tcsh, etc. ), a GUI interface ( e.g.
Windows, XWindows, KDE, Gnome, etc. ), or batch command systems.
Program Execution: OS must be load program into RAM, run program, and terminate program, normally or abnormally.
I/O Operations: OS responsible transferring data to and from I/O devices
File-System Manipulation: Addition raw data storage, OS responsible maintaining directory and subdirectory structures, mapping file
names to specific blocks of data storage, providing tools for navigating and utilizing file system.
Communications: Inter-process communications, IPC, between processes running on same processor, or between processes running
on separate processors or separate machines. May be implemented as shared memory or message passing, (or some systems may
offer both.)
Error Detection: Hardware and software errors detected and handled appropriately, minimum harmful repercussions. May include
complex error avoidance or recovery systems, including backups, RAID drives, redundant systems. Debugging and diagnostic tools aid
users and administrators in tracing down the cause of problems.
Choice of interface
Modern systems allow individual users select desired interface, customize operation, switch between different interfaces as needed.
System administrators determine interface user starts with when they first log in.
GUI provide option terminal emulator window for entering command-line commands.
Command-line commands also entered shell scripts, run like programs.
Use "strace" to see examples system calls invoked by
single simple command. (strace mkdir temp, strace cd
System Calls System calls provide user or application programs call services of operating system. temp, strace date > t.t, strace cp t.t t.2, etc. )
C or C++, some written assembly for optimal performance. Most programmers not use low-level system calls
Figure illustrates the sequence of system calls required to copy a file: directly, use "Application Programming Interface", API
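A rough C sketch of that copy sequence using the POSIX API (the file names are placeholders; real code would handle more error cases):

    /* Sketch: copying a file with raw POSIX system calls. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        ssize_t n;

        int in  = open("in.txt", O_RDONLY);                            /* open input file */
        int out = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644); /* create output file */
        if (in < 0 || out < 0) { perror("open"); exit(1); }

        while ((n = read(in, buf, sizeof buf)) > 0)                    /* read a chunk ... */
            if (write(out, buf, n) != n) { perror("write"); exit(1); } /* ... and write it out */

        close(in);                                                     /* release both files */
        close(out);
        return 0;                                                      /* terminate normally */
    }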
Operating-System Design and Implementation
1) Design Goals
Requirements define the properties the finished system must have, a necessary first step in designing any large complex system.
> User requirements: features users care about and understand. Do not include implementation details.
> System requirements: written for developers; include details about implementation specifics, performance requirements, compatibility constraints, standards compliance, etc. Serve as a "contract" between customer and developers.
Requirements for operating systems can vary depending on the planned scope and usage of the system. (Single user / multiuser, specialized system / general purpose, high/low security, performance needs, operating environment, etc.)
3) Implementation
Traditionally OSes written assembly language. Direct control hardware-related issues, inextricably tied OS to particular HW platform.
Advances compiler efficiencies modern OSes written in C, or C++. Critical sections code still written assembly language
Operating systems developed using emulators of target hardware, if real hardware unavailable, or not suitable platform development
Operating-System Structure
Efficient performance implementation OS partitioned into separate subsystems, each defined tasks, inputs, outputs, and
performance characteristics. Subsystems arranged various architectural configurations:
1) Simple Structure
DOS written developers no idea how big and important become. Written by few programmers relatively short time, without
benefit of modern software engineering techniques, gradually exceed original expectations. Does not break system into
subsystems, has no distinction between user and kernel modes, allowing programs direct access to underlying hardware.
2) Layered Approach
Break OS into number smaller layers, each rests on layer below it, relies solely on services provided by next lower
layer.
Approach allows each layer developed and debugged independently, assumption all lower layers debugged and
trusted deliver proper services.
Problem deciding order place layers, no layer can call services of higher layer
Less efficient, request service higher layer filter all lower layers before reaches HW, significant processing each step.
original UNIX OS used simple layered approach,
almost all OS one big layer, not really breaking OS
down into layered subsystems:
3) Microkernels
Remove nonessential services from kernel, implement as system applications, making kernel small and efficient as possible.
Basic process and memory management, message passing between services.
Security and protection enhanced, services performed in user mode, not kernel mode.
System expansion easier, only adding more system applications, not rebuilding new kernel.
Mach first and most widely known microkernel, major component of Mac OSX.
Windows NT originally microkernel, performance problems relative to Windows 95. NT 4.0
improved performance moving more services into kernel, XP more monolithic.
QNX, a real-time OS for embedded systems.
4) Modules
Modern OS development is object-oriented, small core kernel and modules linked dynamically.
Modules similar to layers, each subsystem clearly defined tasks and interfaces, module free contact any other module, eliminating the
problems of going through multiple intermediary layers
The kernel relatively small architecture, similar microkernels, kernel does not have to implement message passing since modules are free to
contact each other directly.
5) Hybrid Systems
Most OSes today do not strictly adhere to one architecture, but are hybrids of several.
Mac OS X
Architecture relies on Mach microkernel for basic system management services, BSD kernel for
additional services. Application services and dynamically loadable modules ( kernel extensions )
provide the rest of the OS functionality
iOS
Developed by Apple for iPhones and iPads. Has lower memory and computing power needs than Mac OS X; supports a touchscreen interface and graphics for small screens.
Android
Developed for Android smartphones and tablets by Open Handset Alliance, primarily Google.
Open-source OS, lead to its popularity.
Includes versions of Linux and Java virtual machine both optimized for small platforms.
Developed using special Java-for-Android development environment.
Operating-System Debugging
Debugging includes both error discovery and elimination and performance tuning.
Kernighan's Law: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
1) Failure Analysis
Debuggers allow processes executed stepwise, provide examination of variables and expressions execution progresses.
Can document program execution, produce statistics how much time spent on sections or even lines of code.
Ordinary process crashes, memory dump of state of process's memory at time of crash saved for later analysis.
> Program specially compiled to include debugging information, slow down performance.
Approaches don't work well for OS code:
> Performance hit caused adding debugging code unacceptable.
> Parts of OS run kernel mode, direct access to hardware.
> Error during kernel's file-access or direct disk-access routines, not practical write crash dump into file.
# Instead the kernel crash dump saved to unallocated portion of disk reserved for that purpose.
2) Performance Tuning
Requires monitoring system performance.
System record important events into log files, analyzed by tools. Traces evaluate proposed system perform under same workload.
Provide utilities report system status upon demand, unix "top" command. ( w, uptime, ps, etc. )
System utilities may provide monitoring support.
3) DTrace
Special facility for tracing running OS, developed for Solaris 10.
Adds "probes" directly into OS code, queried by "probe consumers".
Probes removed when not use, DTrace facility zero impact on system when not used, proportional impact in use.
Consider, for example, the trace of an ioctl system call
Probe code restricted to be "safe", (no loops allowed), use minimum system resources.
When probe fires, enabling control blocks, ECBs, performed, each having structure of if-then block
When terminates, ECBs associated with consumer removed. When no more ECBs remain interested in particular probe, probe also removed.
Example: the following D code monitors the CPU time of each process running with user ID 101.

sched:::on-cpu
/uid == 101/
{
    self->ts = timestamp;
}

sched:::off-cpu
/self->ts/
{
    @time[execname] = sum(timestamp - self->ts);
    self->ts = 0;
}
DTrace is restricted, due to direct access to (and ability to change) critical kernel data structures.
DTrace is opensource, being adopted by several UNIX distributions.
Operating-System Generation
OSes may be designed and built for a specific HW configuration at a specific site, but are more commonly designed with a number of variable parameters and components, which are then configured for a particular operating environment.
Systems sometimes need to be reconfigured after the initial installation, to add additional resources or capabilities, or to tune performance, logging, or security.
Information needed configure OS include:
> What CPU(s) installed on system, optional characteristics does each have?
> RAM installed? (determined automatically, either at install or boot time.)
> Devices present? OS determine which device drivers include, as well as some device-specific characteristics and parameters.
> OS options desired, values set for particular OS parameters. Include size of open file table, number buffers use, process scheduling (priority), disk scheduling
algorithms, number of slots in process table, etc.
OS source code can be edited, recompiled, and linked into new kernel.
Configuration tables determine which modules link into new kernel, what values to set for some key important parameters. Require configuration of complicated make-
files, done automatically or through interactive configuration programs; Make used to generate new kernel specified by new parameters.
System configuration may defined by table data, "rebuilding" of system requires editing data tables.
Once system regenerated, required to reboot system to activate new kernel. Possibilities errors, systems provide mechanism for booting to older or alternate kernels.
System Boot
When the system powers up, an interrupt is generated which loads a memory address into the program counter, and the system begins executing the instructions found at that address. This address points to the "bootstrap" program located in ROM chips (or EPROM chips) on the motherboard.
The ROM bootstrap program first runs hardware checks, determining what physical resources are present and doing power-on self tests (POST) of all HW.
> Some devices, such as controller cards may have own onboard diagnostics, called by the ROM bootstrap program.
User has option pressing special key during POST process, launch ROM BIOS configuration utility. Allows specify and configure certain hardware parameters as
where to look for OS and whether or not restrict access to utility with password.
> Some hardware provide access to additional configuration setup programs, RAID disk controller or some special graphics or networking cards.
Assuming the configuration utility was not invoked, the bootstrap program then looks for a non-volatile storage device containing an OS. It may look for a floppy drive, CD-ROM, or primary or secondary hard drives.
Find first sector on hard drive and load up fdisk table, contains information about how physical drive divided logical partitions, where each partition starts and ends,
which partition "active" partition used for booting system.
Also small amount system code in portion of first disk block not occupied by fdisk table. Bootstrap code is first step not built into hardware, i.e. first part which might be in
any way OS-specific.
Generally code knows just enough access hard drive, load and execute a (slightly) larger boot program.
For a single-boot system, boot program loaded off of hard disk proceed locate kernel on hard drive, load into memory, transfer control over to kernel. Opportunity to
specify particular kernel to be loaded, may be useful if new kernel generated and doesn't work, or multiple kernels available different configurations.
For dual-boot or multiple-boot systems, boot program give user opportunity specify particular OS to load, default choice if user not pick within given time. Boot program
finds boot loader for chosen single-boot OS, runs program as described in previous bullet point.
Once kernel running, may give user opportunity enter into single-user mode (maintenance mode). Launches very few services, does not enable logins other than primary
log in on console. This mode used primarily for system maintenance and diagnostics.
When system enters full multiuser multitasking mode, examines configuration files determine system services to start and launches each. Spawns login programs ( gettys )
on each login devices configured to enable user logins.
#3 Process Concept
The Process
A process is an instance of a program in execution.
Batch systems work in terms of "jobs". Many modern process concepts are still expressed in terms of jobs (e.g. job scheduling), and the two terms are often used interchangeably.
Process memory is divided into four sections:
> Text section: comprises the compiled program code, read in from nonvolatile storage when the program is launched.
> Data section: stores global and static variables, allocated and initialized prior to executing main.
> Heap: used for dynamic memory allocation, managed via calls to new, delete, malloc, free, etc.
> Stack: used for local variables. Space on the stack is reserved for local variables when they are declared (at function entrance or elsewhere, depending on the language), and the space is freed up when the variables go out of scope. The stack is also used for function return values; the mechanisms of stack management are language specific.
Note that the stack and the heap start at opposite ends of the process's free space and grow towards each other. If they should ever meet, either a stack overflow error occurs or a call to new or malloc fails due to insufficient memory.
When processes are swapped out of memory and later restored, additional information must also be stored and restored -- in particular the program counter and the values of all program registers.

Process Control Block
Each process has a Process Control Block, PCB, which stores the following process-specific information:
Process State: Running, waiting, etc., as discussed below.
Process ID: and parent process ID.
CPU registers and Program Counter: need to be saved and restored when swapping processes in and out of the CPU.
CPU-Scheduling information: priority information and pointers to scheduling queues.
Memory-Management information: page tables or segment tables.
Accounting information: user and kernel CPU time consumed, account numbers, limits, etc.
I/O Status information: devices allocated, open file tables, etc.
Process State
1) New: The process is in the stage of being created.
2) Ready: The process has all the resources it needs to run, but the CPU is not currently working on this process's instructions.
3) Running: The CPU is working on this process's instructions.
4) Waiting: The process cannot run at the moment; it is waiting for a resource to become available or for some event to occur.
5) Terminated: The process has completed.

Threads
Modern systems allow a single process to have multiple threads of execution, which execute concurrently.
Child process may receive shared resources with parent. May or may not limited subset resources
originally allocated to parent, preventing runaway children consuming system resource.
Options for parent process after creating child:
1. Wait for child process terminate before proceeding. Parent makes wait( ) system call, either
specific child or any child, causes parent process block until wait( ) returns. UNIX shells wait for
children to complete before issuing new prompt.
2. Run concurrently with child, continuing process without waiting. Operation seen when UNIX shell
runs process background task. Possible for parent run, then wait for child later, might occur parallel
processing operation. (parent fork off number children without waiting for any of them, then do a
little work of its own, and then wait for the children.)
Two possibilities address space of child relative to parent:
1. The child may be exact duplicate of parent, sharing same program and data segments in memory.
Each have their own PCB, program counter, registers, and PID. Behavior of fork system call UNIX.
2. Child process new program loaded into its address space, all new code and data segments.
Behavior spawn system calls in Windows. UNIX implement as second step, using exec system call.
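A minimal UNIX sketch of both options above: the parent forks a child, the child replaces its image with exec, and the parent blocks in wait(). The program run by the child (/bin/ls) is only an illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();                  /* duplicate the calling process */
        if (pid < 0) { perror("fork"); exit(1); }

        if (pid == 0) {                      /* child: load a new program image */
            execlp("/bin/ls", "ls", "-l", (char *)NULL);
            perror("exec");                  /* reached only if exec fails */
            exit(1);
        }

        int status;
        waitpid(pid, &status, 0);            /* parent: block until the child exits */
        printf("child %d finished\n", (int)pid);
        return 0;
    }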
Process Termination
Process executes its last statement and then asks the operating system to delete it using the exit() system call.
> Returns status data from child to parent (via wait())
> Process resources are deallocated by the operating system
Parent may terminate the execution of children processes using the abort() system call. Some reasons for doing so:
> Child has exceeded allocated resources
> Task assigned to child is no longer required
> Parent is exiting, and the operating system does not allow a child to continue if its parent terminates
Some OSes don't allow a child to exist if its parent has terminated
> cascading termination - if a process terminates, all its children, grandchildren, etc. are also terminated
> Termination is initiated by the operating system
Parent process may wait for termination of a child process by using the wait() system call.
> Call returns status information and the pid of the terminated process: pid = wait(&status);
If no parent is waiting (did not invoke wait()) the process is a zombie
> All its resources are deallocated, but the exit status is kept
If the parent terminated without invoking wait(), the process is an orphan
> UNIX: assigns the init process as the parent
> init calls wait periodically

Cooperating Processes
Processes within a system may be independent or cooperating
When processes execute they produce some computational results
> Independent process cannot affect (or be affected by) such results of another process
> Cooperating process can affect (or be affected by) such results of another process
Advantages of process cooperation:
> Information sharing
> Computation speed-up
> Modularity
> Convenience

Interprocess Communication
For fast exchange of information, cooperating processes need some interprocess communication (IPC) mechanisms
> Two models of IPC
> Shared memory
> Message passing

Interprocess Communication – Shared Memory
> An area of memory is shared among the processes that wish to communicate
> The communication is under the control of the users' processes, not the OS.
> Major issue is to provide a mechanism that will allow the user processes to synchronize their actions when they access shared memory.

Interprocess Communication – Message Passing
Mechanism for processes to communicate and to synchronize their actions
> Message system – processes communicate with each other without resorting to shared variables
> IPC facility provides two operations: send(message), receive(message)
> Message size is either fixed or variable
If processes P and Q wish to communicate, they need to:
> Establish a communication link between them
> Exchange messages via send/receive
Implementation issues:
> How are links established?
> Can a link be associated with more than two processes?
> How many links can there be between every pair of communicating processes?
> What is the capacity (buffer size) of a link?
> Is the size of a message that the link can accommodate fixed or variable?
> Is a link unidirectional or bi-directional?
Logical implementation of communication link:
> Direct or indirect
> Synchronous or asynchronous
> Automatic or explicit buffering

Producer-Consumer Problem
Common paradigm for cooperating processes
> Used to exemplify one common generic way/scenario of cooperation among processes
Producer process produces some information, incrementally
Consumer process consumes this information, as it becomes available
Challenge:
> Producer and consumer should run concurrently and efficiently
> Producer and consumer must be synchronized
> Consumer cannot consume an item before it is produced
Shared-memory solution to producer-consumer uses a buffer in shared memory to exchange information (a sketch follows below):
> unbounded-buffer: assumes no practical limit on the buffer size
> bounded-buffer: assumes a fixed buffer size
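A sketch of the classic single-producer / single-consumer bounded buffer (buffer size and item type are arbitrary). As in the textbook version, one slot is left empty to distinguish "full" from "empty"; a real implementation would place these variables in an actual shared-memory segment and add proper memory synchronization:

    #define BUFFER_SIZE 10

    typedef struct { int value; } item;

    /* Shared state (would live in a shared-memory segment). */
    item buffer[BUFFER_SIZE];
    int in  = 0;   /* next free slot (advanced by producer only) */
    int out = 0;   /* next full slot (advanced by consumer only) */

    /* Producer: busy-waits while the buffer is full. */
    void produce(item next_produced)
    {
        while (((in + 1) % BUFFER_SIZE) == out)
            ;                                   /* buffer full -- do nothing */
        buffer[in] = next_produced;
        in = (in + 1) % BUFFER_SIZE;
    }

    /* Consumer: busy-waits while the buffer is empty. */
    item consume(void)
    {
        while (in == out)
            ;                                   /* buffer empty -- do nothing */
        item next_consumed = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        return next_consumed;
    }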
Direct Communication
Processes must name each other explicitly:
> send(P, message) – send a message to process P
> receive(Q, message) – receive a message from process Q
Properties of a direct communication link:
> Links are established automatically
> A link is associated with exactly one pair of communicating processes
> Between each pair there exists exactly one link
> The link may be unidirectional, but is usually bi-directional

Indirect Communication
Messages are directed to and received from mailboxes (also referred to as ports)
> Each mailbox has a unique id
> Processes can communicate only if they share a mailbox
Properties of an indirect communication link:
> Link established only if processes share a common mailbox
> A link may be associated with many processes
> Each pair of processes may share several communication links
> Link may be unidirectional or bi-directional
Operations:
> create a new mailbox (port)
> send and receive messages through the mailbox
> destroy a mailbox
Sockets are considered a low-level communications channel, and processes may often choose to use something at a higher level, such as those covered in the next two sections.
Remote Procedure Calls, RPC
> Concept of RPC is to make procedure calls similarly to calling ordinary local procedures, except
procedure being called lies on a remote machine.
> Implementation involves stubs on either end of the connection.
- Local process calls on the stub, much as it would call upon a local procedure.
- RPC system packages up ( marshals ) the parameters to the procedure call, and transmits
them to the remote system.
- On the remote side, the RPC daemon accepts the parameters and calls upon the
appropriate remote procedure to perform the requested work.
- Any results to be returned are then packaged up and sent back by the RPC system to the
local system, which then unpackages them and returns the results to the local calling
procedure.
Another issue identifying which procedure on remote system particular RPC is destined for.
> Remote procedures identified by ports, not same ports as socket ports.
> One solution is for calling procedure to know port number to communicate with on
remote system.
> Problematic, port number would be compiled into code, break down if remote
system changes port numbers.
> Matchmaker process employed, acts like telephone directory service. Local process
must first contact matchmaker on remote system, looks up desired port number and
returns it. Local process can then use that information to contact desired remote
procedure. Involves an extra step, but is more flexible.
One common example of a system based on RPC calls is a networked file system. Messages are
passed to read, write, delete, rename, or check status, as might be made for ordinary local disk
access requests.
Pipes
Pipes are one of the earliest and simplest channels of communications between ( UNIX ) processes.
Key considerations in implementing pipes:
1. Unidirectional or Bidirectional communication?
2. Is bidirectional communication halfduplex or fullduplex?
3. Must a relationship such as parentchild exist between the processes?
4. Can pipes communicate over a network, or only on the same machine?
Ordinary Pipes
Ordinary pipes are unidirectional, reading end and writing end. ( If bidirectional communications needed, a second pipe is required. )
UNIX ordinary pipes are created with the system call "int pipe( int fd[2] )".
> Return value is 0 on success, -1 if an error occurs.
> The int array must be allocated before call, values filled in by pipe system call:
- fd[ 0 ] is filled in with a file descriptor for the reading end of the pipe
- fd[ 1 ] is filled in with a file descriptor for the writing end of the pipe
> UNIX pipes accessible as files, using standard read( ) and write( ) system calls.
> Ordinary pipes are only accessible within the process that created them.
- Typically a parent creates the pipe before forking off a child.
- When the child inherits open files from its parent, including the pipe file(s), a channel of communication is
established.
- Each process (parent and child) should first close the ends of the pipe it is not using. For example, if the parent is writing to the pipe and the child is reading, the parent should close the reading end of its pipe after the fork, and the child should close the writing end; see the sketch below.
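A small C sketch of the parent-writes / child-reads pattern just described (the message text is arbitrary):

    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        char buf[64];

        if (pipe(fd) == -1) { perror("pipe"); return 1; }  /* fd[0] = read end, fd[1] = write end */

        if (fork() == 0) {                    /* child: reader */
            close(fd[1]);                     /* close the unused write end */
            ssize_t n = read(fd[0], buf, sizeof buf - 1);
            buf[n] = '\0';
            printf("child read: %s\n", buf);
            close(fd[0]);
            return 0;
        }

        close(fd[0]);                         /* parent: writer; close the unused read end */
        const char *msg = "hello through the pipe";
        write(fd[1], msg, strlen(msg));
        close(fd[1]);
        wait(NULL);                           /* reap the child */
        return 0;
    }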
Ordinary pipes in Windows are very similar
> Windows terms them anonymous pipes
> They are still limited to parent-child relationships.
> They are read from and written to as files.
> They are created with the CreatePipe( ) function, which takes additional arguments.
> In Windows it is necessary to specify what resources a child inherits, such as pipes.
Named Pipes
Support bidirectional communication, communication between non parent-child related processes, and persistence after the process which created them exits.
Multiple processes can also share a named pipe, typically one reader and multiple writers.
In UNIX, named pipes are termed fifos, and appear as ordinary files in the file system.
> (Recognizable by a "p" as the first character of a long listing, e.g. /dev/initctl)
> Created with mkfifo( ) and manipulated with read( ), write( ), open( ), close( ), etc.
> UNIX named pipes are bidirectional, but half-duplex, so two pipes are still typically used for bidirectional communications.
> UNIX named pipes still require that all processes be running on the same machine. Otherwise sockets are used.
Windows named pipes provide richer communications. Full-duplex is supported.
Processes may reside on the same or different machines
Created and manipulated using CreateNamedPipe( ), ConnectNamedPipe( ),
ReadFile( ),and WriteFile( ).
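A minimal UNIX FIFO sketch (the path /tmp/demo_fifo is made up; a reader in another shell, e.g. "cat /tmp/demo_fifo", would be the other end of the pipe):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/tmp/demo_fifo";

        mkfifo(path, 0666);                        /* create the named pipe (ignore EEXIST) */

        int fd = open(path, O_WRONLY);             /* blocks until a reader opens the FIFO */
        if (fd < 0) { perror("open"); return 1; }

        const char *msg = "hello via named pipe\n";
        write(fd, msg, strlen(msg));               /* the reader sees this as ordinary file data */
        close(fd);
        return 0;
    }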
#4 Multithreaded
Programming
Thread is basic unit of CPU utilization, consisting of program counter, stack,
and set of registers, a thread ID.
Traditional ( heavyweight ) processes have single thread control There is one
program counter, and one sequence of instructions that can be carried out at
any given time.
As shown in Figure, multithreaded applications have multiple threads within a
single process, each having their own program counter, stack and set of
registers, but sharing common code, data, and certain structures such as open
files
Motivation
Most modern applications are multithreaded
Threads run within application
Multiple tasks with the application can be implemented by separate threads
Update display
Fetch data
Spell checking
Answer a network request
Process creation is heavy-weight while thread creation is light-weight
Can simplify code, increase efficiency
Kernels are generally multithreaded
Benefits
1) Responsiveness
> may allow continued execution if part of process is blocked
> especially important for user interfaces
2) Resource Sharing
> threads share resources of process: easier than shared
memory or message passing
3) Economy
> Thread creation is faster than process creation
- Less new resources needed vs a new process
- Solaris: 30x faster
> Thread switching lower overhead than context switching
- 5x faster
4) Scalability
> Threads can run in parallel on many cores
Multicore or multiprocessor systems
> Putting pressure on programmers
> How to load all of them for efficiency
Challenges include:
> Dividing activities
> Balance
> Data splitting
> Data dependency
> Testing and debugging

Concurrency vs. Parallelism
Parallelism implies a system can perform more than one task simultaneously
Concurrency supports more than one task making progress
> True parallelism or an illusion of parallelism
> Single processor / core, scheduler providing concurrency

Types of Parallelism
1) Data parallelism – distributes subsets of the same data across multiple cores, same operation/task on each
   Example: the task of incrementing the elements of an array by one can be split into two: incrementing its elements in the 1st and 2nd halves
2) Task parallelism – distributing threads across cores, each thread performing a unique operation
In practice, people often follow a hybrid of the two

User Threads and Kernel Threads
User threads
> Support provided at the user level
> Managed above the kernel, without kernel support
> Management is done by a thread library
> Three primary thread libraries: POSIX Pthreads, Windows threads, Java threads
Kernel threads
> Supported and managed by the OS
> Virtually all modern general-purpose operating systems support them

Multithreading Models
Relationship between user threads and kernel threads
Three common ways of establishing this relationship:
1) Many-to-One model
2) One-to-One model
3) Many-to-Many model

Many-to-One
Many user-level threads mapped to a single kernel thread
Advantage:
> Thread management in user space; hence, efficient
Disadvantages:
> One thread blocking causes all to block
> Multiple threads may not run in parallel on a multicore system, since only 1 may be in the kernel at a time
> So, few systems currently use this model

Many-to-Many Model
> Allows many user-level threads to be mapped to a (smaller or equal number of) kernel threads
> Allows the OS to create a sufficient number of kernel threads
> The number is dependent on the specific machine or application and can be adjusted dynamically

Comparison:
> Many-to-one: Any number of threads is allowed, but low concurrency
> One-to-one: Great concurrency, but the number of threads is limited
> Many-to-many: Gets rid of the shortcomings of the previous two
One-to-One
> Each user-level thread maps to a kernel thread
> Creating a user-level thread creates a kernel thread
Advantages:
> More concurrency than many-to-one
Disadvantages:
> High overhead of creating kernel threads
> Hence, number of threads per process sometimes restricted
Examples: Windows, Linux, Solaris 9 and later

Two-level Model
> Similar to Many-to-Many
> Allows a user thread to be bound to a kernel thread
Examples: IRIX, HP-UX, Tru64 UNIX, Solaris 8 and earlier
Thread Libraries
Provides the programmer with an API for creating and managing threads
Primary ways of implementing:
1) Library entirely in user space
2) Kernel-level library supported by the OS
Examples of thread libraries: Pthreads, Java Threads, Windows threads

Threading Issues
Semantics of fork() and exec() system calls
Signal handling
> Synchronous and asynchronous
Thread cancellation of target thread
> Asynchronous or deferred
Thread-local storage
Scheduler Activations
Lightweight process
Many other issues

Semantics of fork() and exec()
Does fork() duplicate only the calling thread or all threads?
> Some UNIXes have two versions of fork
exec() usually works as normal – replaces the running process including all threads

Lightweight Process (LWP)
> Intermediate data structure between user and kernel threads
> To the user-level thread library, it appears as a virtual processor on which the process can schedule a user thread to run
> Each LWP is attached to a kernel thread
> LWPs are used, for example, to implement Many-to-many and two-level models

Pthreads
The POSIX standard (IEEE 1003.1c) defines the specification for Pthreads, not the implementation.
Pthreads are available on Solaris, Linux, Mac OS X, Tru64, and via public domain shareware for Windows.
Global variables are shared amongst all threads.
One thread can wait for the others to rejoin before continuing.
Pthreads begin execution in a specified function.
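A minimal Pthreads sketch matching the points above: the thread begins in a specified function, a global is visible to both threads, and pthread_join lets one thread wait for another to rejoin (compile with "cc -pthread"):

    #include <pthread.h>
    #include <stdio.h>

    int sum = 0;                         /* global: shared by all threads */

    /* Thread entry point: Pthreads begin execution in a specified function. */
    void *runner(void *param)
    {
        int upper = *(int *)param;
        for (int i = 1; i <= upper; i++)
            sum += i;
        pthread_exit(NULL);
    }

    int main(void)
    {
        pthread_t tid;
        int upper = 10;

        pthread_create(&tid, NULL, runner, &upper);  /* spawn the worker thread */
        pthread_join(tid, NULL);                     /* wait for it to rejoin */
        printf("sum = %d\n", sum);                   /* prints 55 */
        return 0;
    }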
Java Threads
> ALL Java programs use Threads, even "common" single-threaded ones.
> Creation of new Threads requires Objects that implement the Runnable Interface, which means they contain a method "public void run( )". Any descendant of the Thread class will naturally contain such a method. (In practice the run( ) method must be overridden / provided for the thread to have any practical functionality.)
> Creating a Thread Object does not start the thread running. To do that the program must call the Thread's "start( )" method. start( ) allocates and initializes memory for the Thread, and then calls the run( ) method. (Programmers do not call run( ) directly.)
> Because Java does not support global variables, Threads must be passed a reference to a shared Object in order to share data, in this example the "Sum" Object.
> Note that the JVM runs on top of a native OS, and that the JVM specification does not specify what model to use for mapping Java threads to kernel threads. This decision is JVM implementation dependent, and may be one-to-one, many-to-many, or many-to-one. (On a UNIX system the JVM normally uses Pthreads and on a Windows system it normally uses Windows threads.)

Linux Threads
Linux refers to them as tasks rather than threads
Thread creation is done through the clone() system call
clone() allows a child task to share the address space of the parent task (process)
> Flags control behavior
> struct task_struct points to process data structures (shared or unique)
#5 Process Scheduling
Maximum CPU utilization is obtained with multiprogramming
> waiting for I/O is wasteful
> 1 thread will utilize only 1 core
As a process executes, it changes state
> new: The process is being created
> ready: The process is waiting to be assigned to a processor
> running: Instructions are being executed
> waiting: The process is waiting for some event to occur
> terminated: The process has finished execution
CPU–I/O Burst Cycle
Process execution consists of:
> a cycle of CPU execution
> and I/O wait
CPU burst followed by I/O burst
CPU burst distribution is of main concern

Levels of Scheduling
High-Level Scheduling
> See Long-term scheduler or Job Scheduling from Chapter 3
> Selects jobs allowed to compete for the CPU and other system resources.
Intermediate-Level Scheduling
> See Medium-Term Scheduling from Chapter 3
> Selects which jobs to temporarily suspend/resume to smooth fluctuations in system load.
Low-Level (CPU) Scheduling or Dispatching
> Selects the ready process that will be assigned the CPU.
Dispatcher
A module that gives control of the CPU to the process selected by the short-term scheduler; this involves:
> switching context
> switching to user mode
> jumping to the proper location in the user program to restart that program
Dispatch latency
> Time it takes for the dispatcher to stop one process and start another running
> This time should be as small as possible

CPU Scheduler
Short-term scheduler
> Selects 1 process from the ready queue
@ then allocates the CPU to it
> Ready Queue contains the PCBs of processes
> Queue may be ordered in various ways
CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
Scheduling under 1 and 4 is called nonpreemptive (= cooperative)
All other scheduling is called preemptive
> Process can be interrupted and must release the CPU
> Special care should be taken to prevent problems that can arise
@ Access to shared data – a race condition can happen, if not handled
@ Etc.

Scheduling Criteria
How do we decide which scheduling algorithm is good?
Many criteria for judging this have been suggested
Which characteristics are considered can change significantly which algorithm is considered the best
> CPU utilization – keep the CPU as busy as possible
> Throughput – # of processes that complete their execution per time unit
> Turnaround time – amount of time to execute a particular process
> Waiting time – amount of time a process has been waiting in the ready queue
> Response time – amount of time it takes to start responding
- Used for interactive systems
- Time from when a request was submitted until the first response is produced

Scheduling Algorithm Optimization Criteria
Max CPU utilization
Max throughput
Min turnaround time
Min waiting time
Min response time
First-Come, First-Served (FCFS) Scheduling
Race Condition
> counter++ could be implemented as
register1 = counter
register1 = register1 + 1
counter = register1
> counter-- could be implemented as
register2 = counter
register2 = register2 – 1
counter = register2
Synchronization Hardware
> Many systems provide hardware support for implementing the critical section code.
>All solutions below based on idea of locking
- Protecting critical regions via locks
>Uniprocessors – could disable interrupts
-Currently running code would execute without preemption
That is, without being interrupted
-Generally too inefficient on multiprocessor systems
OSes using this are not broadly scalable
> Modern machines provide special atomic hardware instructions
- Atomic = non-interruptible
- test_and_set instruction: test a memory word and set its value
- compare_and_swap instruction: swap the contents of two memory words
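A sketch of how a test-and-set style instruction yields a simple spinlock; here C11's atomic_flag stands in for the hardware instruction:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear = unlocked */

    void acquire(void)
    {
        /* test_and_set atomically returns the old value and sets the flag;
           spin while someone else already holds the lock. */
        while (atomic_flag_test_and_set(&lock))
            ;                              /* busy-wait */
    }

    void release(void)
    {
        atomic_flag_clear(&lock);          /* open the lock for the next waiter */
    }

    /* Usage in each thread:
           acquire();
           ... critical section ...
           release();                      */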
Deadlock and Starvation
> Synchronization problems
> Deadlock – two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes
> Let S and Q be two semaphores initialized to 1
P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
... ...
signal(S); signal(Q);
signal(Q); signal(S);
> Starvation (= indefinite blocking)
Related to deadlocks (but different)
Occurs when a process waits indefinitely in the semaphore queue
For example, assume a LIFO semaphore queue
- Processes pushed into it first might not get a chance to execute
> Priority Inversion
A situation when a higher-priority process needs to wait for a lower-priority process that holds a lock
Safe State
The deadlock avoidance algorithm relies on the notion of a safe state
A state is safe if the system can allocate resources to each process (up to its maximum) in some order and still avoid a deadlock.
System is in a safe state only if
> There exists a safe sequence (of processes) -- explained shortly
> That is, it is possible to order all processes to form a safe sequence

Safe Sequence
System is in a safe state only if there exists a safe sequence
Safe sequence - a sequence (ordering) <P1, P2, ..., Pn> of all the processes in the system, such that:
for each Pi -- the resources that Pi can still request can be satisfied by:
> currently available resources, plus
> resources held by all the Pj, with j < i (that is, by P1, P2, ..., Pi-1)
Intuition behind a safe sequence:
if Pi's resource needs are not immediately available
then Pi can wait until P1, P2, ..., Pi-1 have finished
- at that stage Pi will be guaranteed to obtain the needed resources
so Pi can execute, return its allocated resources, and terminate

Deadlock Detection
If no deadlock-prevention or deadlock-avoidance algorithm is used
then the system can enter a deadlock state
In this environment, the system may provide
> Deadlock detection algorithm
> Deadlock recovery algorithm

Example of Detection Algorithm
> Five processes P0 through P4; three resource types A (7 instances), B (2 instances), and C (6 instances)
> Snapshot at time T0:

         Allocation   Request   Available
         A B C        A B C     A B C
   P0    0 1 0        0 0 0     0 0 0
   P1    2 0 0        2 0 2
   P2    3 0 3        0 0 0
   P3    2 1 1        1 0 0
   P4    0 0 2        0 0 2

Sequence <P0, P2, P3, P1, P4> will result in Finish[i] = true for all i

P2 requests an additional instance of type C:

         Request
         A B C
   P0    0 0 0
   P1    2 0 2
   P2    0 0 1
   P3    1 0 0
   P4    0 0 2

State of system?
> Can reclaim resources held by process P0, but insufficient resources to fulfill other processes' requests
> Deadlock exists, consisting of processes P1, P2, P3, and P4
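A compact C sketch of the detection algorithm, with the matrices hard-coded from the second snapshot above (after P2's extra request for C); Work and Finish follow the standard algorithm, and the program prints P1 through P4 as deadlocked:

    #include <stdbool.h>
    #include <stdio.h>

    #define P 5   /* processes P0..P4 */
    #define R 3   /* resource types A, B, C */

    int alloc[P][R]   = {{0,1,0},{2,0,0},{3,0,3},{2,1,1},{0,0,2}};
    int request[P][R] = {{0,0,0},{2,0,2},{0,0,1},{1,0,0},{0,0,2}};
    int avail[R]      = {0,0,0};

    int main(void)
    {
        int  work[R];
        bool finish[P];

        for (int j = 0; j < R; j++) work[j] = avail[j];        /* Work = Available */
        for (int i = 0; i < P; i++) finish[i] = false;

        bool progress = true;
        while (progress) {                                     /* repeat until no process can proceed */
            progress = false;
            for (int i = 0; i < P; i++) {
                if (finish[i]) continue;
                bool can_run = true;
                for (int j = 0; j < R; j++)
                    if (request[i][j] > work[j]) { can_run = false; break; }
                if (can_run) {                                 /* Request_i <= Work: pretend Pi finishes */
                    for (int j = 0; j < R; j++) work[j] += alloc[i][j];
                    finish[i] = true;
                    progress  = true;
                }
            }
        }

        for (int i = 0; i < P; i++)                            /* any Finish[i] == false is deadlocked */
            if (!finish[i]) printf("P%d is deadlocked\n", i);
        return 0;
    }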
Recovery from Deadlock: Process Termination
> Abort all deadlocked processes
Cons: Very costly – processes could have been running for long time.
They will need to be restarted.
> Abort one process at a time until the deadlock cycle is eliminated
Cons: High overhead since the deadlock detection algorithm is called each time one
process is terminated
> In which order should we choose to abort?
1. Priority of the process
2. How long process has computed, and how much longer to completion
3. Resources the process has used
4. Resources process needs to complete
5. How many processes will need to be terminated
6. Is process interactive or batch?
Internal Fragmentation
Example:
> Consider a multiple-partition allocation scheme with a hole of 10,002 bytes
> Suppose that the next process requests 10,000 bytes.
> If we allocate exactly the requested block, we are left with a hole of 2 bytes.
> Problem: The overhead to keep track of this hole will be substantially larger than the hole itself
Solution:
> Break the physical memory into fixed-sized blocks
> Allocate memory in units based on block size.
Issue: With this approach, the memory allocated to a process may be slightly larger than the requested memory.
> The difference between these two numbers is internal fragmentation— unused memory that is internal to a partition.
> Example:
- Block size is 4K
- Process request 16K + 2Bytes space
- 5 blocks will be allocated, (4K – 2) bytes are wasted in the last block
Segmentation: a Memory-Management Scheme
Motivation: Do programmers think of memory as a linear array of bytes, some containing instructions and others containing data
> Most programmers would say no.
> Rather, they prefer to view memory as a collection of variable-sized segments
- with no necessary ordering among the segments
> The programmer talks about the stack, the math library, and the main program without caring what addresses in memory these elements occupy
> The programmer is not concerned with whether the stack is stored before or after the sqrt() function
> What if the hardware could provide a memory mechanism that mapped the programmer's view to the actual physical memory?
Segmentation
> Solution: Segmentation – a memory-management scheme that supports programmer/user view of memory
> A logical address space is a collection of segments.
> A segment is a logical unit, such as:
- main program, procedure, function, method
- object, local variables, global variables, common block,
- stack, symbol table, array
> Each segment has a name and a length
> The addresses specify both
- the segment name, and
- the offset within the segment
> For simplicity of implementation, segments are numbered and are referred to by a segment number, rather than by a segment name.
Segmentation Architecture
> Logical address consists of a two tuple:
<segment-number, offset>
> How to map this 2D user-defined address into 1D physical address?
> Segment table – each table entry has:
base – contains the starting physical address where the segments reside in memory
limit– specifies the length of the segment
> Segment-table base register (STBR)
Points to the segment table's location in memory
> Segment-table length register (STLR)
Indicates number of segments used by a program;
Segment number s is legal if s < STLR
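A small C sketch of the check the segment table implies: the segment number must be below STLR, the offset must be below the segment's limit, and the physical address is base + offset. The table contents below are made-up illustrative numbers:

    #include <stdio.h>
    #include <stdlib.h>

    #define STLR 3                                   /* number of segments in use */

    struct segment { unsigned base, limit; };

    /* Hypothetical segment table for one process. */
    struct segment seg_table[STLR] = {
        {1400, 1000},   /* segment 0 */
        {6300,  400},   /* segment 1 */
        {4300, 1100},   /* segment 2 */
    };

    unsigned translate(unsigned s, unsigned offset)
    {
        if (s >= STLR || offset >= seg_table[s].limit) {   /* illegal segment or offset */
            fprintf(stderr, "trap: segmentation violation\n");
            exit(1);
        }
        return seg_table[s].base + offset;                 /* physical address */
    }

    int main(void)
    {
        printf("%u\n", translate(2, 53));            /* 4300 + 53 = 4353 */
        printf("%u\n", translate(0, 1222));          /* traps: 1222 >= limit 1000 */
        return 0;
    }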
Paging
Segmentation:
> Pros: permits the physical address space of a process to be noncontiguous
> Cons: can suffer from external fragmentation and needs compaction
> Any alternatives to it? Paging -- another memory-management scheme

Main Idea of Paging
> Divide physical memory into fixed-sized blocks called frames
- Size is a power of 2
- Between 512 bytes and 16 Mbytes
> Divide logical memory into blocks of the same size called pages
> Keep track of all free frames
> To run a program of size N pages, need to find N free frames and load the program
> Set up a page table to translate logical to physical addresses
- One table per process
> Still have internal fragmentation

Benefits of Paging
> Physical address space of a process can be noncontiguous
> Process is allocated physical memory whenever the latter is available
> Avoids external fragmentation
> Avoids the problem of varying sized memory chunks
Memory Protection
> Memory protection in paged environment accomplished by protection bits associated with each frame
> For example, protection bit to indicate if
1) read-only or
2) read-write access is allowed
Can also add more bits to indicate page execute-only, and so on
> Normally, these bits are kept in the page table.
Valid (v) or Invalid (i) Bit In A Page Table
> Valid-invalid bit attached to each entry in the page table:
- valid indicates that associated page is in process logical address space, and is thus a legal page
- invalid indicates that the page is not in the process logical address space
> Some system have page-table length register (PTLR)
- Can also be used to check if address is valid
> Any violations result in a trap to the kernel
Shared Pages
> An advantage of paging is the possibility of sharing data and common code
> Shared code
- Particularly important in a time-sharing environment
: Ex: A system that supports 40 users, each executes a text editor
- A single copy of read-only (reentrant) code shared among processes
: For example, text editors, compilers, window systems
: Reentrant code is non-self-modifying code: never changes during exec.
- This is similar to multiple threads sharing the same process space
- Each process has its own copy of registers and data storage to hold the data for the process's execution
- The data for two different processes will, of course, be different.
> Shared data
- Some OSes implement shared memory using shared pages.
> Private code and data
- Each process keeps a separate copy of the code and data
- The pages for the private code and data can appear anywhere in the logical address space
Two-Level Paging
> The top-level page table, with 1024 entries, corresponding to the 10-bit PT1 field.
> When a virtual address is presented to the MMU, it first extracts the PT1 field and uses this value as an index
into the top-level page table
> Each of these 1024 entries in the top-level page table represents 4 MB
# 4 GB (i.e., 32-bit) virtual address space has been chopped into 1024 chunks
# 4 GB / 1024 = 4 MB
> The entry located by indexing into the top-level page table yields the address or the page frame number of a second-level page table.
> Entry 0 of the top-level page table points to the page table for the program text
> Entry 1 points to the page table for the data
> Entry 1023 points to the page table for the stack
> The other (shaded) entries are not used
@ No need to generate page tables for them
@ Saving lots of space!
> The PT2 field is now used as an index into the selected second-level page table to find the page frame number
for the page itself.
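A tiny C sketch of how the MMU splits a 32-bit virtual address under the scheme described above (10-bit PT1, 10-bit PT2, and, assuming 4 KB pages, a 12-bit offset; the example address is arbitrary):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t vaddr = 0x00403004;               /* arbitrary example address */

        uint32_t pt1    = (vaddr >> 22) & 0x3FF;   /* top 10 bits: index into top-level table */
        uint32_t pt2    = (vaddr >> 12) & 0x3FF;   /* next 10 bits: index into 2nd-level table */
        uint32_t offset =  vaddr        & 0xFFF;   /* low 12 bits: offset within the 4 KB page */

        printf("PT1=%u PT2=%u offset=0x%x\n", pt1, pt2, offset);   /* PT1=1 PT2=3 offset=0x4 */
        return 0;
    }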
Hashed Page Tables
> Common in address spaces > 32 bits
> The virtual page number is hashed into a page table
- This page table contains a chain of elements hashing to the same location
> Each element contains
(1) the virtual page number
(2) the value of the mapped page frame
(3) a pointer to the next element
> Virtual page numbers are compared in this chain searching for a match
- If a match is found, the corresponding physical frame is extracted

Inverted Page Table
> Motivation: Rather than each process having a page table and keeping track of all possible logical pages, track all physical pages
> One entry for each real page of memory
- The entry keeps track of which (process, virtual page) is located in the page frame
> Pros: tends to save lots of space
> Cons: virtual-to-physical translation becomes much harder
> When process n references virtual page p, the hardware can no longer find the physical page by using p
as an index into the page table
> Instead, it must search the entire inverted page table for an entry (n, p).
> Furthermore, this search must be done on every memory reference, not just on page faults
> Searching a 256K table on every memory reference is slow
> Other considerations:
- TLB and hash table (key: virtual address) is used to speed up accesses
- Issues implementing shared memory when using inverted page table
#9 Virtual-Memory
Management
Code needs to be in memory to execute, but entire program rarely used
> Example: Error code, unusual routines, large data structures
> Entire program code is not needed at the same time
Demand Paging
> Bring a page into memory only when it is needed.
1. Less I/O needed
2. Less memory needed
3. Faster response
4. More users
> Demand paging is a kind of lazy swapping
- But deals with pages instead of programs; swapping deals with entire programs, not pages
- Lazy swapper – never swaps a page into memory unless the page will be needed
- Pager -- a swapper that deals with pages

Basic Concepts
When a process is to be swapped in, the pager guesses which pages will be used before the process is swapped out again.
Instead of swapping in a whole process, the pager brings only those pages into memory.
Thus, it avoids reading into memory pages that will not be used anyway, decreasing the swap time and the amount of physical memory needed.
How to determine that set of pages?
> Need new MMU functionality to implement demand paging
> The first reference to a page will trap to the OS with a page fault
> The OS looks at another table to decide:
- Invalid reference – abort
- Just not in memory – bring it into memory
If the pages needed are already memory resident:
> No difference from non demand-paging
If a page is needed and is not memory resident:
> Need to detect it and load the page into memory from storage
* Without changing program behavior
* Without the programmer needing to change code
Valid-Invalid Bit
> With each page table entry a valid–invalid bit is associated:
- v in-memory – memory resident
- i not-in-memory
> Initially, valid–invalid bit is set to i on all entries
> Example of a page table snapshot:
> During MMU address translation, if the valid-invalid bit = i, a page fault results
Page Fault
> If the process tries to access a page that was not brought into memory,
> Or tries to access any invalid page:
- That will trap to OS, causing a page fault
- Such as when the first reference to a page is made
FIFO Algorithm
> How to track ages of pages?
- Just use a FIFO queue
> Easy to implement
> Often, not the best performing
> 15 page faults for the example reference string

Optimal Algorithm
> Called OPT or MIN
> Replace the page that will not be used for the longest period of time
> 9 page faults is optimal for the example
> How do you know this?
- You don't; you can't read the future
> Used for measuring how well other algorithms perform – against the theoretical optimal solution
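A tiny C sketch that reproduces the FIFO count above, assuming the notes refer to the classic reference string 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 with 3 frames (FIFO gives 15 faults; OPT gives 9 for the same string). Replacing the "evict the oldest page" rule with "evict the page used farthest in the future" would give OPT:

    #include <stdio.h>

    #define FRAMES 3

    int main(void)
    {
        int ref[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof ref / sizeof ref[0];

        int frame[FRAMES];
        int next = 0, used = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frame[j] == ref[i]) { hit = 1; break; }
            if (!hit) {                          /* page fault */
                faults++;
                if (used < FRAMES)
                    frame[used++] = ref[i];      /* a free frame is still available */
                else {
                    frame[next] = ref[i];        /* evict the oldest page (FIFO) */
                    next = (next + 1) % FRAMES;
                }
            }
        }
        printf("FIFO page faults: %d\n", faults);   /* 15 for this string */
        return 0;
    }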
Directory Implementation
> The choice of the directory implementation is crucial for the efficiency, performance, and reliability of the file system
> Linear list of file names with pointer to the data blocks
Pros: Simple to program
Cons: Time-consuming to execute -- Linear search time
Solutions:
Keep sorted + binary search
Use indexes for search, such as B+ tree
> Hash Table – linear list with hash data structure
Hash on file name
Decreases directory search time
Collisions– situations where two file names hash to the same location
Each hash entry can be a linked list - resolve collisions by adding new entry to linked list
> Advantages (of FAT allocation)
- Advantages of the Linked File System
- FAT can be cached in memory
- Searchable at CPU speeds, pseudo-random access
> Disadvantages
- Limited size, not suitable for very large disks
- FAT cache describes the entire disk, not just open files!
- Not fast enough for large databases
> Used in MS-DOS, early Windows systems
Indexed Allocation (Cont.)
> Dynamic access without external fragmentation, but with the overhead of the index block
> Mapping from logical to physical
- in a file of maximum size of 256K bytes and
- block size of 512 bytes
- we need only 1 block for the index table
> Two-level index (generalizes to multi-level)
- 4K blocks could store 1,024 four-byte pointers in the outer index –
- 1,048,576 data blocks and a file size of up to 4GB
One common attack is masquerading, in which the attacker pretends to be a trusted third party. A variation of this is the man-in-the-middle attack, in which the attacker masquerades as both ends of the conversation to two targets.
A replay attack involves repeating a valid transmission. Sometimes this can be the entire attack, ( such as repeating a
request for a money transfer ), or other times the content of the original message is replaced with malicious content.
Spyware is version of Trojan Horse included in "free" software downloaded off Internet. Generate popup browser windows, accumulate information
about user and deliver it to some central site.
2) Trap Door
> Programmer deliberately inserts security hole use later to access system.
> Once system in untrustworthy state, system can never be trusted again. Backup tapes may contain copy of some cleverly hidden back door.
> Could be inserted into compiler, any programs compiled would contain security hole. Inspection of code being compiled would not reveal any problems
3) Logic Bomb
> Code not designed to cause havoc all the time, only when certain circumstances occur, such as a particular date or time or noticeable event.
> Example: a dead-man switch, designed to check that a certain person logs in every day; if they don't log in for a long time (presumably fired), then the logic bomb goes off and opens security holes or causes problems.
Viruses
> Fragment code embedded in legitimate program, designed to replicate itself, wreaking havoc.
> Viruses are more likely to infect PCs than UNIX systems, because on the latter users have limited authority to modify programs or access critical system structures.
> Viruses delivered to systems in a virus dropper, some form of Trojan Horse, via email or unsafe downloads.
> Take many forms
Forms of viruses (program threats)
1) File attaches itself to executable file, run virus code first then jump to start of original program. Termed parasitic, do not leave new files on system, original program is still fully functional.
2) Boot occupies boot sector, runs before OS loaded. Known as memory viruses, operation reside in memory, do not appear in file system.
3) Macro exist as script run automatically by certain macro-capable programs (MS Word or Excel). Can exist in word processing documents or spreadsheet files.
4) Source code look for source code and infect it in order to spread.
5) Polymorphic change every time they spread - not underlying functionality, just their signature, by which virus checkers recognize them.
6) Encrypted travel in encrypted form to escape detection. Self-decrypting, allows to infect other files.
7) Stealth avoid detection modifying parts of system that could be used to detect it.
8) Tunneling attempt avoid detection inserting themselves into interrupt handler chain, or into device drivers.
9) Multipartite attack multiple parts of system, such as files, boot sector, and memory.
10) Armored are coded to make hard for antivirus researchers to decode and understand. Files associated with viruses are hidden, protected, given innocuous looking names such as "...".