CSC334 Parallel & Distributed Computing
Lecture # 06
Parallel Programming Models
Suliman Khan
Department of Computer Science
University of Lahore, Sargodha Campus
Parallel Computers
• Programming model types
– Shared Memory
– Distributed Memory
– Hybrid Model
Parallel Programming Models
• Parallel programming models exist as an abstraction above hardware and memory
architectures
• These models are NOT specific to a particular type of machine or memory
architecture
• These models can (theoretically) be implemented on any underlying hardware
• Examples from the past:
• SHARED memory model on a DISTRIBUTED memory machine: the Kendall Square Research (KSR)
ALLCACHE approach, “virtual shared memory”
• DISTRIBUTED memory model on a SHARED memory machine: Message Passing Interface (MPI)
on the SGI Origin 2000, which employed the CC-NUMA type of shared memory architecture,
although MPI is more commonly run over a network of distributed memory machines
• Which model to use?
• Combination of what is available and personal choice
Shared Memory
• Architecture
Processors have direct access to global memory and
I/O through a bus or a fast switching network
• Cache Coherency Protocol guarantees
consistency of memory and I/O accesses
• Each processor also has its own memory (cache)
• Data structures are shared in global address space
• Concurrent access to shared memory must be coordinated
• Programming Models
– Multithreading (Thread Libraries)
– OpenMP

[Figure: processors P0, P1, …, Pn, each with a private cache, connected by a shared bus to global shared memory]
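The coordination requirement above can be sketched in Python, used here as a stand-in for the C-based thread libraries the slides name: several threads update one variable in the shared address space, and a lock coordinates the concurrent access.

```python
import threading

counter = 0              # data structure in the shared address space
lock = threading.Lock()  # coordinates concurrent access to shared memory

def worker(n):
    global counter
    for _ in range(n):
        with lock:       # without the lock, interleaved updates could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: every increment survives because access was coordinated
```

Removing the lock turns the read-modify-write of `counter` into a race, which is exactly the hazard the slide's "must be coordinated" point refers to.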
Threads Model
• Threads implementations commonly comprise:
• A library of subroutines that are called from within parallel source code
• A set of compiler directives embedded in either serial or parallel source code
• Historically, hardware vendors have implemented their own proprietary
versions of threads, making it difficult for programmers to develop
portable threaded applications
• Standardization efforts: POSIX Threads (IEEE POSIX 1003.1c) and
OpenMP (Industry standard)
• POSIX Part of Unix/Linux, Library based
• OpenMP Compiler directive based, Portable / multi-platform
• Other implementations: Microsoft threads, Java threads, Python threads, CUDA threads for GPUs
OpenMP
• OpenMP: portable shared memory parallelism
• Higher-level API for writing portable
multithreaded applications
• Provides a set of compiler directives and library routines
for parallel application programmers
• API bindings for Fortran, C, and C++
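OpenMP itself is expressed as compiler directives in C, C++, and Fortran (e.g. a `#pragma omp parallel for` with a reduction). As a rough analogy only, the same fork-join "parallel loop with a reduction" idea can be sketched with Python's standard thread pool:

```python
from concurrent.futures import ThreadPoolExecutor

# Analogy for: #pragma omp parallel for reduction(+:total)
def f(x):
    return x * x

# The pool forks a team of workers, the loop iterations are divided
# among them, and the partial results are combined (joined) at the end.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(f, range(10)))

print(total)  # 285 = 0^2 + 1^2 + ... + 9^2
```

The real OpenMP directive does this work-sharing and reduction inside the compiler and runtime; the Python sketch only mirrors the structure of the idea.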
Distributed Memory
Architecture
• Each Processor has direct access only to its local memory
• Processors are connected via high-speed interconnect
• Data structures must be distributed
• Data exchange is done via explicit processor-to-processor communication: send/receive messages
• Programming Models
– Widely used standard: MPI
– Others: PVM, Express, P4, Chameleon, PARMACS, ...
[Figure: processors P0, P1, …, Pn, each with its own local memory, connected by a communication interconnect]
Message Passing Interface
MPI provides:
• Point-to-point communication
• Collective operations
– Barrier synchronization
– Gather/scatter operations
– Broadcast, reductions
• Different communication modes
– Synchronous/asynchronous
– Blocking/non-blocking
– Buffered/unbuffered
• Predefined and derived datatypes
• Virtual topologies
• Parallel I/O (MPI 2)
• C/C++ and Fortran bindings
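The point-to-point and reduction patterns above can be sketched without a real MPI installation. In the hypothetical sketch below, Python threads stand in for MPI ranks and per-rank queues stand in for the network; the actual MPI calls would be `MPI_Send`, `MPI_Recv`, and `MPI_Reduce` (with rank 0 as the root).

```python
import queue
import threading

NPROCS = 4
results = []
# One inbox per "rank"; sending a message = putting it in the receiver's queue.
inbox = [queue.Queue() for _ in range(NPROCS)]
chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]  # each rank owns a different chunk

def rank(r, data):
    if r != 0:
        inbox[0].put((r, sum(data)))      # point-to-point send to rank 0
    else:
        partial = sum(data)
        for _ in range(NPROCS - 1):       # rank 0 receives and reduces
            _, value = inbox[0].get()     # blocking receive
            partial += value
        results.append(partial)

workers = [threading.Thread(target=rank, args=(r, chunks[r])) for r in range(NPROCS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(results[0])  # 36 = sum of 1..8, reduced at rank 0
```

In real MPI the ranks are separate processes on different nodes with no shared memory at all, which is why the send/receive messages are the only way data moves.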
Hybrid Model
• A hybrid model combines more than one of the
previously described programming models.
• A common example of a hybrid model is the
combination of the message passing model
(MPI) with the threads model (OpenMP).
• Threads perform computationally intensive
kernels using local, on-node data
• Communications between processes on
different nodes occurs over the network using
MPI
Hybrid Model
• Another similar and increasingly popular example
of a hybrid model is using MPI with CPU-GPU
(Graphics Processing Unit) programming.
• MPI tasks run on CPUs using local memory and communicate with each other over a network.
• Computationally intensive kernels are off-loaded to on-node GPUs.
• Data exchange between node-local memory and the GPUs uses CUDA (or an equivalent interface).
High level programming model
• Single Program Multiple Data (SPMD)
• Multiple Program Multiple Data (MPMD)
SPMD
• Built upon any combination of the previously mentioned parallel
programming models
• SINGLE PROGRAM: All tasks execute their copy of the same program
simultaneously. This program can be threads, message passing, data
parallel or hybrid.
• MULTIPLE DATA: All tasks may use different data
– tasks do not necessarily have to execute the entire program
– perhaps only a portion of it
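Both SPMD properties can be shown in one small Python sketch (names like `program` and `rank` are illustrative, not part of any standard API): every task runs the same function, each task gets its own data, and the rank lets a task execute only a portion of the program.

```python
import threading

def program(rank, data, out):
    # SINGLE PROGRAM: every task executes this same function...
    if rank == 0:
        out[rank] = ("coordinator", len(data))  # ...but rank 0 takes a different branch
    else:
        out[rank] = ("worker", sum(data))       # MULTIPLE DATA: each rank sums its own chunk

chunks = [[0], [1, 2], [3, 4], [5, 6]]
out = [None] * 4
tasks = [threading.Thread(target=program, args=(r, chunks[r], out)) for r in range(4)]
for t in tasks:
    t.start()
for t in tasks:
    t.join()

print(out)  # [('coordinator', 1), ('worker', 3), ('worker', 7), ('worker', 11)]
```

This is the structure of a typical MPI program: one executable, launched as many tasks, branching on its rank.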
MPMD
• Built upon any combination of the previously mentioned parallel programming models
• MULTIPLE PROGRAM: Tasks may execute different programs
simultaneously.
• The programs can be threads, message passing, data parallel or
hybrid.
• MULTIPLE DATA: All tasks may use different data
MPMD applications are not as common
as SPMD applications
Parallel and Distributed
Programming Models
• OPENMP
• MPI
• For message passing systems
• MapReduce and BigTable
• For internet clouds and data centers
• Service clouds require extending Hadoop, EC2, and S3 to facilitate distributed computing over a distributed storage system
• CUDA
• For NVIDIA GPUs
• Open Grid Service Architecture (OGSA)
• For grid application development
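The MapReduce model listed above can be illustrated with a toy in-process word count, a sketch only: each input shard is mapped independently to partial counts, then the partials are merged in a reduce phase. A real Hadoop job distributes those map and reduce tasks across a cluster.

```python
from collections import Counter
from functools import reduce

docs = ["to be or not to be", "to code or not to code"]  # two input shards

# Map phase: each shard -> local word counts (independent, so it parallelizes).
mapped = [Counter(d.split()) for d in docs]

# Reduce phase: merge the partial counts into the final result.
total = reduce(lambda a, b: a + b, mapped, Counter())

print(total["to"])  # 4: "to" appears twice in each shard
```

The key property is that the map step has no cross-shard dependencies, which is what lets internet clouds and data centers scale it out.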
Performance, Security and Energy
Efficiency
• Performance Metrics
• CPU speed, FLOPS, job response time, network latency, system throughput, network bandwidth, system overhead (OS boot time, compile time, etc.)
• Scalability
• Machine (size), software,
application, and technology
scalability
• Amdahl’s law
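Amdahl's law bounds the speedup of a program whose parallel fraction is p when run on n processors: S(n) = 1 / ((1 - p) + p/n), so even with unlimited processors the speedup is capped at 1/(1 - p). A minimal calculation:

```python
def amdahl_speedup(p, n):
    """Speedup with parallel fraction p on n processors: 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - p) + p / n)

# With 90% of the work parallelizable:
print(round(amdahl_speedup(0.9, 10), 2))    # 5.26 on 10 processors
print(round(amdahl_speedup(0.9, 1000), 2))  # 9.91 on 1000 -- near the 1/(1-p) = 10 cap
```

This is why the serial fraction, not the processor count, often dominates scalability in practice.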
Performance, Security and Energy
Efficiency
• Security
• Threats to system and network
• Confidentiality, integrity, and availability
• Copyright protection
• System Defense technologies
• Data protection infrastructures (IDS)
• Energy efficiency
• Distributed power management
• Unused servers’ energy consumption
• Reducing energy in active servers
That’s all for today!!