William Stallings, Computer Organization and Architecture, 10th Edition
© 2016 Pearson Education, Inc., Hoboken,
NJ. All rights reserved.
Chapter 17
Parallel Processing
Multiple Processor Organization
[Figure: taxonomy of processor organizations. Recoverable labels: Uniprocessor, Symmetric Multiprocessor (SMP), Nonuniform Memory Access (NUMA), Clusters]
[Figure: processor organizations, including (a) SISD. Recoverable labels: CU (control unit), PU (processing unit), LM (local memory), IS (instruction stream), DS (data stream), Shared Memory, Interconnection Network]
[Figure: timelines for Process 1, Process 2, and Process 3, showing Blocked and Running states]
[Figure: multiprocessor block diagram. Recoverable labels: I/O, Interconnection Network, Main Memory]
[Figure: shared-bus organization. Recoverable labels: shared bus, Main Memory, I/O Subsystem, I/O Adapter]
◼ Simplicity
◼ Simplest approach to multiprocessor organization
◼ Flexibility
◼ Generally easy to expand the system by attaching more
processors to the bus
◼ Reliability
◼ The bus is essentially a passive medium and the failure of any
attached device should not cause failure of the whole system
◼ Scheduling
◼ Any processor may perform scheduling so conflicts must be avoided
◼ Scheduler must assign ready processes to available processors
◼ Synchronization
◼ With multiple active processes having potential access to shared address spaces or I/O resources, care must be
taken to provide effective synchronization
◼ Synchronization is a facility that enforces mutual exclusion and event ordering
◼ Memory management
◼ In addition to dealing with all of the issues found on uniprocessor machines, the OS needs to exploit the available
hardware parallelism to achieve the best performance
◼ Paging mechanisms on different processors must be coordinated to enforce consistency when several processors
share a page or segment and to decide on page replacement
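The two roles named under Synchronization, mutual exclusion and event ordering, can be sketched with Python's standard `threading` module. This is only an illustration: the threads stand in for processors, and `run_demo` and its parameter names are hypothetical, not from the text.

```python
import threading


def run_demo(num_workers=4, increments=10_000):
    """Return (final counter value, log of ordered events)."""
    counter = 0
    lock = threading.Lock()          # enforces mutual exclusion
    data_ready = threading.Event()   # enforces event ordering
    log = []

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:               # only one thread updates at a time
                counter += 1

    def producer():
        log.append("produced")
        data_ready.set()             # signal that the data now exists

    def consumer():
        data_ready.wait()            # block until the producer has signalled
        log.append("consumed")

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    # The consumer is started before the producer to show that the order
    # comes from the event, not from the order in which threads start.
    threads.append(threading.Thread(target=consumer))
    threads.append(threading.Thread(target=producer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, log
```

With the lock in place the final count is always `num_workers * increments`, and the log always records "produced" before "consumed".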
◼ Modified
◼ The line in the cache has been modified and is available only in
this cache
◼ Exclusive
◼ The line in the cache is the same as that in main memory and is
not present in any other cache
◼ Shared
◼ The line in the cache is the same as that in main memory and may
be present in another cache
◼ Invalid
◼ The line in the cache does not contain valid data
                                M             E             S                       I
                                Modified      Exclusive     Shared                  Invalid
This cache line valid?          Yes           Yes           Yes                     No
The memory copy is…             out of date   valid         valid                   n/a
Copies exist in other caches?   No            No            Maybe                   Maybe
A write to this line…           does not go   does not go   goes to bus and         goes directly
                                to bus        to bus        updates cache           to bus
[Figure: MESI state transition diagram. (a) Line in cache at initiating processor; (b) Line in snooping cache]
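The four MESI states can be exercised with a small model of one memory line tracked across several caches. This is a simplified sketch, not the full protocol: it ignores data values, line replacement, and real bus signals, and the class `MesiLine` and its counters are hypothetical names chosen for illustration.

```python
M, E, S, I = "M", "E", "S", "I"


class MesiLine:
    """State of one memory line in each of several caches (toy model)."""

    def __init__(self, num_caches):
        self.state = [I] * num_caches   # every cache starts without the line
        self.bus_transactions = 0       # reads/writes that had to use the bus
        self.write_backs = 0            # modified lines copied back to memory

    def _others(self, cache):
        return [c for c in range(len(self.state)) if c != cache]

    def read(self, cache):
        if self.state[cache] != I:
            return                      # read hit: no bus use, no state change
        self.bus_transactions += 1      # read miss goes to the bus
        shared = False
        for other in self._others(cache):
            if self.state[other] == M:
                self.write_backs += 1   # memory copy was out of date
            if self.state[other] != I:
                self.state[other] = S   # snooping cache now shares the line
                shared = True
        self.state[cache] = S if shared else E

    def write(self, cache):
        if self.state[cache] in (M, E):
            self.state[cache] = M       # write hit: does not go to the bus
            return
        self.bus_transactions += 1      # Shared or Invalid: write goes to bus
        for other in self._others(cache):
            if self.state[other] == M:
                self.write_backs += 1
            self.state[other] = I       # all other copies are invalidated
        self.state[cache] = M
```

For example, with two caches, `read(0)` leaves the states as `['E', 'I']`, a following `read(1)` gives `['S', 'S']`, and `write(0)` then gives `['M', 'I']`.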
◼ Defined as:
◼ A group of interconnected whole computers working
together as a unified computing resource that can
create the illusion of being one machine
◼ (The term whole computer means a system that can run
on its own, apart from the cluster)
[Figure residue: the only recoverable label is RAID]
◼ Two approaches:
◼ Highly available clusters
◼ Fault tolerant clusters
◼ Failover
◼ The function of switching applications and data resources over from a failed system
to an alternative system in the cluster
◼ Failback
◼ Restoration of applications and data resources to the original system once it
has been fixed
◼ Load balancing
◼ Incremental scalability
◼ Automatically include new computers in scheduling
◼ Middleware needs to recognize that processes may switch between machines
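Failover, failback, load balancing, and incremental scalability can be sketched together in a toy cluster model. Everything here is an assumption made for illustration: the class `Cluster`, its method names, and the "fewest running applications" balancing rule are not taken from the text, and real cluster middleware must also move data and handle the case where no node is available.

```python
class Cluster:
    """Toy model of which node runs which application."""

    def __init__(self, nodes):
        self.up = set(nodes)     # nodes currently able to run work
        self.placement = {}      # application -> node running it now
        self.home = {}           # application -> node it was first placed on

    def load(self, node):
        return sum(1 for n in self.placement.values() if n == node)

    def least_loaded(self):
        # Load balancing: choose the available node running the fewest
        # applications. sorted() makes tie-breaking deterministic.
        return min(sorted(self.up), key=self.load)

    def start(self, app):
        node = self.least_loaded()
        self.placement[app] = node
        self.home[app] = node
        return node

    def add_node(self, node):
        # Incremental scalability: a new node is included in scheduling
        # as soon as it joins.
        self.up.add(node)

    def fail(self, node):
        # Failover: switch every application on the failed node to an
        # alternative node (raises ValueError if no node is left).
        self.up.discard(node)
        for app, where in list(self.placement.items()):
            if where == node:
                self.placement[app] = self.least_loaded()

    def restore(self, node):
        # Failback: return applications to their original node once it
        # has been fixed.
        self.up.add(node)
        for app, origin in self.home.items():
            if origin == node:
                self.placement[app] = node
```

A short usage example: after `c = Cluster(["a", "b"])`, starting "web", "db", and "cache" places them on a, b, and a. Calling `c.fail("a")` moves "web" and "cache" to b, and `c.restore("a")` moves them back.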
[Figure: cluster computer architecture. Recoverable labels: Cluster Middleware (Single System Image and Availability Infrastructure); Net. Interface HW on each node]
[Figure: Ethernet configuration. Recoverable labels: Eth Switch; 10GbE, 40GbE, 100GbE]
SMP
◼ Easier to manage and configure
◼ Much closer to the original single processor model for which nearly all applications are written
◼ Less physical space and lower power consumption
Clustering
◼ Far superior in terms of incremental and absolute scalability
◼ Superior in terms of availability
◼ All components of the system can readily be made highly redundant
[Figure: NUMA organization. Recoverable labels: Processor 2-1 through 2-m and Processor N-1 through N-m, each with L1 Cache and L2 Cache; Directory; Main Memory 1, Main Memory 2, Main Memory N; I/O; Interconnect Network]