DST4030A Lecture Notes Week 4

DST4030A: Parallel Computing

Parallel Programming Models

Dr Mike Asiyo

School of Science & Technology



Contents

1 Course Learning Objectives

2 Parallel Programming Models

3 OpenMP



Course Learning Objectives


Course Learning Objectives

Lecture Learning Objectives

At the end of this lecture, students should be able to:


1 Appreciate parallel machine architectures from a software perspective
2 Understand parallel programming models/architectures



Parallel Programming Models


Parallel Programming Models Introduction

Parallel Programming Models exist as an abstraction above hardware and memory architectures:
Shared Memory (without threads)
Shared Threads Models (Pthreads, OpenMP)
Distributed Memory / Message Passing (MPI)
Data Parallel
Hybrid
Single Program Multiple Data (SPMD)
Multiple Program Multiple Data (MPMD)



Parallel Programming Models Shared Memory Model (without threads)

In this programming model, processes/tasks share a common address space, which they read and write to asynchronously.

Various mechanisms such as locks and semaphores are used to control access to the shared memory, resolve contention, and prevent race conditions and deadlocks.

This is perhaps the simplest parallel programming model.

An advantage of this model from the programmer's point of view is that the notion of data "ownership" is lacking, so there is no need to specify explicitly the communication of data between tasks.



Parallel Programming Models Shared Memory Model (without threads)

All processes see and have equal access to shared memory. Program
development can often be simplified.

An important disadvantage in terms of performance is that it becomes more difficult to understand and manage data locality:
Keeping data local to the process that works on it conserves memory accesses, cache refreshes and bus traffic that occur when multiple processes use the same data.
Unfortunately, controlling data locality is hard to understand and may be beyond the control of the average user.



Parallel Programming Models Shared Memory Model (without threads)

Figure 1: Shared Memory Model



Parallel Programming Models Shared Memory Model (without threads)

Implementations
On stand-alone shared memory machines, native operating systems, compilers and/or hardware provide support for shared memory programming. For example, the POSIX standard provides an API for using shared memory, and UNIX provides shared memory segments (shmget, shmat, shmctl, etc.); a minimal sketch follows this list.
On distributed memory machines, memory is physically distributed across a network of machines, but made global through specialized hardware and software.
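
To make the System V shared memory calls named above concrete, here is a minimal sketch (my own illustration, not from the original notes; error handling omitted) in which a forked child writes into a segment created with shmget and the parent reads it back. In a real program, concurrent access would additionally be guarded by a lock or semaphore.

#include <iostream>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    // Create a private shared memory segment large enough for one int.
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    int* shared = static_cast<int*>(shmat(shmid, nullptr, 0)); // attach it
    *shared = 0;

    if (fork() == 0) {            // child process writes into the segment
        *shared = 42;
        shmdt(shared);
        return 0;
    }
    wait(nullptr);                // parent waits, then reads the shared address space
    std::cout << "child wrote: " << *shared << std::endl;

    shmdt(shared);                // detach and mark the segment for removal
    shmctl(shmid, IPC_RMID, nullptr);
    return 0;
}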



Parallel Programming Models Threads Model

This programming model is a type of shared memory programming.


In the threads model of parallel programming, a single ”heavy
weight” process can have multiple ”light weight”, concurrent
execution paths.



Parallel Programming Models Threads Model

Figure 2: Threads Model


Parallel Programming Models Threads Model

For example:
The main program a.out is scheduled to run by the native operating system.
a.out loads and acquires all of the necessary system and user resources to run.
This is the ”heavy weight” process.
a.out performs some serial work, and then creates a number of tasks (threads)
that can be scheduled and run by the operating system concurrently.
Each thread has local data, but also, shares the entire resources of a.out. This
saves the overhead associated with replicating a program’s resources for each
thread (”light weight”). Each thread also benefits from a global memory view
because it shares the memory space of a.out.



Parallel Programming Models Threads Model

A thread’s work may best be described as a subroutine within the main program.
Any thread can execute any subroutine at the same time as other threads.
Threads communicate with each other through global memory (updating
address locations). This requires synchronization constructs to ensure that more
than one thread is not updating the same global address at any time.
Threads can come and go, but a.out remains present to provide the necessary
shared resources until the application has completed.



Parallel Programming Models Threads Model

Implementations
From a programming perspective, threads implementations
commonly comprise:
A library of subroutines that are called from within parallel source
code
A set of compiler directives imbedded in either serial or parallel
source code



Parallel Programming Models Threads Model

In both cases, the programmer is responsible for determining the parallelism (although compilers can sometimes help).
Historically, threads implementations differed substantially from each other, making it difficult for programmers to develop portable threaded applications.
Unrelated standardization efforts have resulted in two very different
implementations of threads: POSIX Threads and OpenMP.
POSIX Threads tutorial:
hpc.llnl.gov/sites/default/files/2019.08.21.TAU_.pdf
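
To illustrate the library-call style that POSIX Threads uses, here is a minimal sketch (an assumed example, not part of the lecture) that creates a small team of threads and joins them:

#include <cstdio>
#include <pthread.h>

// Each thread runs this subroutine; the argument carries its id.
void* worker(void* arg)
{
    long id = reinterpret_cast<long>(arg);
    std::printf("hello from thread %ld\n", id);
    return nullptr;
}

int main()
{
    const int num_threads = 4;
    pthread_t threads[num_threads];

    for (long i = 0; i < num_threads; ++i)   // create the team of threads
        pthread_create(&threads[i], nullptr, worker, reinterpret_cast<void*>(i));

    for (int i = 0; i < num_threads; ++i)    // wait for all of them to finish
        pthread_join(threads[i], nullptr);

    return 0;
}

Compile with, for example, g++ threads.cpp -lpthread (the file name is arbitrary).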



Parallel Programming Models OpenMP

OpenMP
Industry standard, jointly defined and endorsed by a group of
major computer hardware and software vendors, organizations and
individuals.
Compiler directive based
Portable / multi-platform, including Unix and Windows platforms
Available in C/C++ and Fortran implementations
Can be very easy and simple to use - provides for ”incremental
parallelism”. Can begin with serial code.



Parallel Programming Models Distributed Memory / Message Passing Model

This model demonstrates the following characteristics:
A set of tasks that use their own local memory during computation. Multiple tasks can reside on the same physical machine and/or across an arbitrary number of machines.
Tasks exchange data through communications by sending and receiving messages.
Data transfer usually requires cooperative operations to be performed by each process. For example, a send operation must have a matching receive operation (a minimal sketch follows this list).
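
As a sketch of the matching send/receive idea (an assumed example, not from the slides), the following MPI program has rank 1 send one integer to rank 0:

#include <iostream>
#include <mpi.h>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {                       // sender
        int value = 42;
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {                // matching receiver
        int value = 0;
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "rank 0 received " << value << std::endl;
    }

    MPI_Finalize();
    return 0;
}

Compile with mpicxx and run with at least two processes, e.g. mpirun -np 2 ./a.out.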



Parallel Programming Models Distributed Memory / Message Passing Model

Figure 3: Distributed Memory Model



Parallel Programming Models Distributed Memory / Message Passing Model

Implementations:

From a programming perspective, message passing implementations usually comprise a library of subroutines. Calls to these subroutines are imbedded in source code. The programmer is responsible for determining all parallelism.

Historically, a variety of message passing libraries have been available since the 1980s. These implementations differed substantially from each other, making it difficult for programmers to develop portable applications.



Parallel Programming Models Distributed Memory / Message Passing Model

In 1992, the MPI Forum was formed with the primary goal of establishing a
standard interface for message passing implementations.
Part 1 of the Message Passing Interface (MPI) was released in 1994. Part 2
(MPI-2) was released in 1996 and MPI-3 in 2012. All MPI specifications are
available on the web at http://www.mpi-forum.org/docs/.
MPI is the ”de facto” industry standard for message passing, replacing virtually
all other message passing implementations used for production work. MPI
implementations exist for virtually all popular parallel computing platforms.
Not all implementations include everything in MPI-1, MPI-2 or MPI-3.



Parallel Programming Models Data Parallel Model

May also be referred to as the Partitioned Global Address Space (PGAS) model.

The data parallel model demonstrates the following characteristics:
Address space is treated globally.
Most of the parallel work focuses on performing operations on a data set.
A set of tasks work collectively on the same data structure; however, each task works on a different partition of the same data structure.
Tasks perform the same operation on their partition of work, for example, "add 4 to every array element" (see the sketch after this list).
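
A minimal OpenMP sketch of that example (assumed, not from the slides; the array size and values are arbitrary), where each thread applies the same operation to its own partition of the array:

#include <iostream>
#include <vector>
#include <omp.h>

int main()
{
    std::vector<int> data(1000000, 1);

    // Each thread gets its own chunk of iterations, i.e. its own
    // partition of the array, and applies the same operation to it.
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(data.size()); ++i)
        data[i] += 4;

    std::cout << "data[0] = " << data[0] << std::endl;   // prints 5
    return 0;
}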



Parallel Programming Models Data Parallel Model

Figure 4: Data Parallel Model



Parallel Programming Models Data Parallel Model

On shared memory architectures, all tasks may have access to the data structure
through global memory.
On distributed memory architectures, the global data structure can be split up
logically and/or physically across tasks.
Implementations:
Currently, there are several parallel programming implementations in various stages of development, based on the Data Parallel / PGAS model.
Coarray Fortran: a small set of extensions to Fortran 95 for SPMD parallel
programming. Compiler dependent. More information:
https://en.wikipedia.org/wiki/Coarray_Fortran



OpenMP


OpenMP

OpenMP is:
An Application Program Interface (API) that may be used to explicitly direct
multi-threaded, shared memory parallelism

Comprised of three primary API components:


1 Compiler Directives
2 Runtime Library Routines
3 Environment Variables
An abbreviation for:

Short version: Open Multi-Processing


Long version: Open specifications for Multi-Processing via
collaborative work between interested parties from the hardware and
software industry, government and academia.



OpenMP

OpenMP is not:
Necessarily implemented identically by all vendors
Guaranteed to make the most efficient use of shared memory
Required to check for data dependencies, data conflicts, race
conditions, or deadlocks
Required to check for code sequences that cause a program to be
classified as non-conforming
Designed to guarantee that input or output to the same file is
synchronous when executed in parallel. The programmer is
responsible for synchronizing input and output.



OpenMP

Goals of OpenMP:
1 Standardization:
Provide a standard among a variety of shared memory
architectures/platforms
Jointly defined and endorsed by a group of major computer hardware
and software vendors
2 Lean and Mean:
Establish a simple and limited set of directives for programming
shared memory machines.
Significant parallelism can be implemented by using just 3 or 4
directives.



OpenMP

3 Ease of Use:
Provide capability to incrementally parallelize a serial program,
unlike message-passing libraries which typically require an all or
nothing approach
Provide the capability to implement both coarse-grain and fine-grain
parallelism
4 Portability:
The API is specified for C/C++ and Fortran
Public forum for API and membership
Implementations exist for most major platforms, including Unix/Linux and Windows



OpenMP OpenMP Programming Model

OpenMP has a memory model and an execution model:

1 Shared Memory Model:
OpenMP is designed for multi-processor/core, shared memory machines. The underlying architecture can be shared memory UMA or NUMA.

Figure 5: (a) UMA, (b) NUMA



OpenMP OpenMP Programming Model

2 Openmp Execution Model:


a. Thread Based Parallelism:
OpenMP programs accomplish parallelism exclusively through the
use of threads.
A thread of execution is the smallest unit of processing that can be
scheduled by an operating system. The idea of a subroutine that can
be scheduled to run autonomously might help explain what a thread
is.
Threads exist within the resources of a single process. Without the
process, they cease to exist.
Typically, the number of threads matches the number of machine processors/cores. However, the actual use of threads is up to the application.



OpenMP OpenMP Programming Model

c. Explicit Parallelism:
OpenMP is an explicit (not automatic) programming model, offering
the programmer full control over parallelization.
Parallelization can be as simple as taking a serial program and inserting compiler directives...
Or as complex as inserting subroutines to set multiple levels of parallelism, locks and even nested locks (a minimal lock sketch follows).
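
As a minimal sketch of the explicit lock routines mentioned above (an assumed example; variable names are illustrative), here is a shared counter protected by an omp_lock_t:

#include <iostream>
#include <omp.h>

int main()
{
    omp_lock_t lock;
    omp_init_lock(&lock);          // create the lock

    long counter = 0;

    #pragma omp parallel
    {
        for (int i = 0; i < 1000; ++i) {
            omp_set_lock(&lock);   // only one thread at a time past this point
            ++counter;
            omp_unset_lock(&lock); // release so other threads can enter
        }
    }

    omp_destroy_lock(&lock);
    std::cout << "counter = " << counter << std::endl;
    return 0;
}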



OpenMP Fork - Join Model:

OpenMP uses the fork-join model of parallel execution:

Figure 6: Fork - Join Model

All OpenMP programs begin as a single process: the master thread.
The master thread executes sequentially until the first parallel region construct is encountered.



OpenMP Fork - Join Model:

FORK: the master thread then creates a team of parallel threads. The statements in the program that are enclosed by the parallel region construct are then executed in parallel among the various team threads.
JOIN: when the team threads complete the statements in the parallel region construct, they synchronize and terminate, leaving only the master thread.
The number of parallel regions and the threads that comprise them are arbitrary.
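
A minimal sketch (assumed, not from the slides) that makes the fork and join visible by printing which threads execute inside and outside a parallel region:

#include <cstdio>
#include <omp.h>

int main()
{
    std::printf("before the region: only the master thread (id %d)\n",
                omp_get_thread_num());

    // FORK: a team of threads executes this block.
    #pragma omp parallel
    {
        std::printf("inside the region: thread %d of %d\n",
                    omp_get_thread_num(), omp_get_num_threads());
    }
    // JOIN: the team has synchronized and terminated.

    std::printf("after the region: only the master thread again\n");
    return 0;
}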



OpenMP Fork - Join Model:

The Components of the OpenMP API

The OpenMP API has three primary components:


Compiler directives: These are preprocessor directives that can be used by programmers to define and control parallel regions of code.
Runtime library routines: These are functions from the OpenMP library that can be called by the program; the library is linked into the program.
Environment variables: These can be used to control the behaviour of OpenMP programs.

It is possible to parallelize many sequential programs without using most of the API.



OpenMP Fork - Join Model:

OpenMP - Compiler Directives

We will focus here on C/C++ syntax

#pragma omp <directive name> [<clauses>]

Used for (a short illustration follows this list):
Defining parallel regions / spawning threads
Distributing loop iterations or sections of code between threads
Serializing sections of code (e.g. for access to I/O or shared
variables)
Synchronizing threads
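
A short illustration (assumed, not from the slides) of a directive combined with common clauses; num_threads, schedule and reduction are standard OpenMP clauses, while the variable names are arbitrary:

#include <iostream>
#include <omp.h>

int main()
{
    const int n = 1000;
    double sum = 0.0;

    // One directive: spawn a team of 4 threads, distribute the iterations
    // in chunks of 100, and combine the per-thread partial sums.
    #pragma omp parallel for num_threads(4) schedule(static, 100) reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += i;

    std::cout << "sum = " << sum << std::endl;   // 499500
    return 0;
}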



OpenMP Fork - Join Model:

OpenMP - Runtime Library Routines

These routines, provided by the OpenMP library, are used to configure and monitor multithreading during execution, e.g.:
omp_get_num_threads() returns the number of threads in the current team
omp_in_parallel() checks whether execution is inside a parallel region
omp_set_schedule() modifies the scheduler policy
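
A minimal sketch (assumed, not from the slides) that calls these three routines:

#include <cstdio>
#include <omp.h>

int main()
{
    // Request a dynamic schedule with chunk size 4 for schedule(runtime) loops.
    omp_set_schedule(omp_sched_dynamic, 4);

    std::printf("outside: in parallel? %d\n", omp_in_parallel());   // prints 0

    #pragma omp parallel
    {
        #pragma omp single
        std::printf("inside: in parallel? %d, team size %d\n",
                    omp_in_parallel(), omp_get_num_threads());
    }
    return 0;
}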



OpenMP Fork - Join Model:

OpenMP - Environment variables

Environment variables are used to store configuration needed for running the program. In OpenMP, they are used for setting, e.g., the number of threads per team (OMP_NUM_THREADS), the maximum number of threads (OMP_THREAD_LIMIT) or the scheduler policy (OMP_SCHEDULE).
While most of these settings can also be made using clauses in the compiler directives or runtime library routines, environment variables give the user an easy way to change these crucial settings without an additional config file (parsed by your program) or rewriting/recompiling the OpenMP-enhanced program.



OpenMP Fork - Join Model:

Listing 1: Helloworld.cpp

#include <iostream>
#include <omp.h>
using namespace std;

int main()
{
    #pragma omp parallel
    {
        cout << "Hello World" << endl;
    }
    return 0;
}



OpenMP Fork - Join Model:

Listing 2: arrayex.cpp

#include <iostream>
#include <algorithm>
#include <omp.h>

#define ARRAY_SIZE 100000000
#define ARRAY_VALUE 1231

int main()
{
    omp_set_num_threads(4);
    int* arr = new int[ARRAY_SIZE];
    std::fill_n(arr, ARRAY_SIZE, ARRAY_VALUE);



OpenMP Fork - Join Model:

Listing 3: arrayex.cpp (continued)

    #pragma omp parallel for
    for (int i = 0; i < ARRAY_SIZE; i++)
    {
        arr[i] = arr[i] / arr[i] + arr[i];
    }
    return 0;
}



Thank You!

