Lecture 06 - OpenMP


Parallel and Distributed Computing

OUTLINE

Introduction to OpenMP
Introduction to OpenMP
• OpenMP stands for Open Multi-Processing
• "Open" means it is freely available, i.e. an open standard
• It is not hardware, not software, and not a standalone tool,
but an API, i.e. a library that is used with certain
programming languages
• OpenMP is available for the C, C++ and Fortran
languages
• OpenMP is for the shared-memory model: it creates multiple
threads that share the memory of the main process.
Introduction to OpenMP
It is managed by a consortium, the OpenMP Architecture
Review Board (OpenMP ARB), which is formed by
several companies such as AMD, Intel, IBM, HP, Nvidia,
and Oracle.
OpenMP
• Shared memory, thread-based parallelism
– A shared-memory process consists of multiple threads
• Explicit Programming
– Programmer has full control over parallelization. OpenMP is
not an automatic parallel programming model.
• Compiler Directive based
– Parallelism is specified through the use of compiler
directives which are embedded in the source code.
OpenMP
• Each process starts with one main thread; in OpenMP this
thread is called the master thread.
• For a particular block of code, we create multiple
threads along with this master thread; these extra
threads, other than the master, are called slave threads.
OpenMP
It is also called the fork-join model, because all slave
threads join back to the master thread after execution,
i.e. the process starts with a single master thread and ends
with a single master thread.
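A minimal sketch of this fork-join behaviour (the printed
messages are illustrative, not from the slides):

#include <omp.h>
#include <stdio.h>

int main() {
    printf("Before the parallel region: master thread only\n");

    #pragma omp parallel   /* fork: slave threads are created here */
    {
        printf("Inside: thread %d is running\n", omp_get_thread_num());
    }                      /* join: slave threads end here */

    printf("After the parallel region: master thread only\n");
    return 0;
}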
Parallel Memory Architecture
• Shared Memory
• Distributed Memory
• Hybrid
Header file used in the C/C++ languages
For C and C++, we need to include “omp.h” as the header file.
Simple Example to Run a Program in Parallel
#include<omp.h>
#include<stdio.h>
int main(){
#pragma omp parallel
{
int id = omp_get_thread_num();
printf("hello %d\n",id);
}
}
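To build such a program, OpenMP support must be enabled in
the compiler; with GCC, for example (the file name hello.c is
illustrative):

gcc -fopenmp hello.c -o hello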
OpenMP has directives that allow the programmer to:
• specify the parallel region
• specify whether the variables in the parallel section
are private or shared
• specify how/if the threads are synchronized
• specify how to parallelize loops
• specify how the work is divided between threads
(scheduling)
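A sketch of how several of these directives and clauses can be
combined (the variable names sum and sq are illustrative):

#include <omp.h>
#include <stdio.h>

int main() {
    int sum = 0;   /* shared between all threads */
    int sq;        /* scratch variable, made private below */
    /* parallel loop: sum is shared, sq is private to each thread,
       and iterations are divided in chunks (static scheduling) */
    #pragma omp parallel for shared(sum) private(sq) schedule(static)
    for (int i = 0; i < 8; i++) {
        sq = i * i;
        #pragma omp critical   /* synchronize the updates of sum */
        sum += sq;
    }
    printf("sum = %d\n", sum);
}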
Simple Example to create Threads
#include<omp.h>
#include<stdio.h>
int main(){
#pragma omp parallel
{
int id = omp_get_thread_num();
printf("Hello! This Process is executed by %d\n",id);
}
}
• By default, the number of threads created is equal to the
number of processor cores
Creating required number of threads
#include<omp.h>
#include<stdio.h>
int main(){
#pragma omp parallel num_threads(7)
{
int id = omp_get_thread_num();
printf("hello %d\n",id);
}
}
Creating required number of threads
• If you want to create a specific number of threads,
use the num_threads() clause; the number of threads
to be created is passed as an argument to
num_threads().
• In the previous example, seven threads will be created,
each one responsible for printing the
required output.
Creating multiple threads using for loop
#include<omp.h>
#include<stdio.h>
int main(){
#pragma omp parallel
{
int id = omp_get_thread_num();
for(int i=0;i<3;i++)
{
printf("Hello World from thread %d\n",id);
}
}
}
Creating multiple threads using for loop
In the previous snippet, since we are not mentioning the
number of threads, the number of threads will be equal to the
number of cores.

Each thread executes all three iterations of this for loop,
so the message is printed three times by every thread.
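To divide the iterations among the threads instead of
replicating them, the loop can be marked as a work-sharing
loop; a minimal sketch:

#include <omp.h>
#include <stdio.h>

int main() {
    /* the iterations are split among the threads,
       so each iteration is printed exactly once */
    #pragma omp parallel for
    for (int i = 0; i < 3; i++) {
        printf("iteration %d done by thread %d\n", i, omp_get_thread_num());
    }
}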
Allocating different work to different threads

In OpenMP, we can allocate different work to different
threads by using SECTIONS, e.g.
Allocating different work to different threads
#include<omp.h>
#include<stdio.h>
int main(){
#pragma omp parallel sections num_threads(3)
{
#pragma omp section
{ printf("hello world one\n"); }
#pragma omp section
{ printf("hello world two\n"); }
#pragma omp section
{ printf("hello world three\n"); }
}
}
In this example, we have created three threads by mentioning num_threads(3), and each thread
prints a different line.
How do threads interact? - Synchronization
• OpenMP is a multithreading, shared-address model,
and threads communicate by sharing variables
• Unintended sharing of data may cause race conditions
(see the sketch below)
• So to control race conditions we use synchronization to
protect against data conflicts
• However, synchronization is expensive, so we may
need to change how data is accessed to minimize the
need for synchronization
• Synchronization is the process of ensuring that threads
in a parallel program execute in a coordinated manner
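A minimal sketch of such a race condition (the variable name
counter and the loop bounds are illustrative): two threads may
read the same old value and both write back old+1, losing an
increment.

#include <omp.h>
#include <stdio.h>

int main() {
    int counter = 0;            /* shared variable */
    #pragma omp parallel num_threads(4)
    {
        for (int i = 0; i < 100000; i++)
            counter++;          /* unsynchronized read-modify-write: a race */
    }
    /* often prints less than 400000 because increments are lost */
    printf("counter = %d\n", counter);
}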
Types of Synchronization
There are two types of synchronization:
Explicit: achieved using OpenMP directives and clauses,
e.g. barrier, critical, atomic.
Implicit: occurs automatically at certain points in the
execution of an OpenMP program.
Uses of Synchronization
Ensuring Correctness: synchronization ensures the correctness
of a program; without it, race conditions
and other errors may occur.
Improving Performance: it can also improve performance, e.g.
using a barrier at the end of a parallel loop can ensure
that all threads have finished executing the loop before
moving on to the next step.
Protecting Shared Data: it is also important for protecting
shared data; otherwise it may be possible for multiple
threads to write to the same variable at the same time, which
can corrupt the data.
Synchronization
• Critical: the enclosed code block will be executed by only
one thread at a time, not simultaneously. Often used to
protect shared data from race conditions.
• Atomic: the memory update (write, read) in the next instruction
will be performed atomically. It doesn't make the entire
statement atomic; only the memory update is atomic.
Better performance than critical.
• Ordered: the structured block is executed in the order in which
iterations would be executed in a sequential loop
Synchronization
• Barrier: each thread waits until all other threads of the
team have reached this point. A work-sharing construct
has an implicit barrier synchronization at the end.
• nowait: specifies that threads completing their assigned work
can proceed without waiting for all threads in the team to
finish. In the absence of this clause, threads encounter a
barrier synchronization at the end of the work-sharing
construct.
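A minimal sketch of nowait on a work-sharing loop (the arrays
a and b are illustrative, and the two loops are independent,
so skipping the barrier is safe):

#include <omp.h>
#include <stdio.h>

int main() {
    int a[100], b[100];
    #pragma omp parallel
    {
        /* no barrier at the end of this loop: threads that finish
           early move straight on to the second loop */
        #pragma omp for nowait
        for (int i = 0; i < 100; i++) a[i] = i;

        #pragma omp for   /* implicit barrier at the end of this one */
        for (int i = 0; i < 100; i++) b[i] = 2 * i;
    }
    printf("a[99]=%d b[99]=%d\n", a[99], b[99]);
}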
Synchronization - Critical
• Synchronization is used to impose order constraints and
to protect access to shared data
• #pragma omp critical: this directive identifies a section of
code that must be executed by a single thread at a time
• A critical section prevents multiple threads from accessing
a variable at the same time
• A critical region may be given a name; at the start of the
region, a thread waits until no other thread in the program is
executing a critical region with the same name.
Synchronization - Critical
Syntax:
#pragma omp critical
{
code
}
Synchronize Threads in OpenMP
In OpenMP we can avoid race conditions among threads by
using the preprocessor directive “#pragma omp critical”
int x = 0;
int main()
{ #pragma omp parallel num_threads(5)
{ #pragma omp critical
{ x = x + 1; }}}
We have created 5 threads; only one thread will
increment the value of x at a time.
Synchronize Threads in OpenMP-ATOMIC
Atomic for synchronization: works almost the same as
critical.
The main difference is that when atomic is used, the memory
update in the next instruction is performed atomically,
i.e. only one thread updates the memory location at a time.
A compiler might use special hardware instructions for
better performance than when using critical.
Synchronization - Atomic
Syntax:
#pragma omp atomic
expression statement;
(atomic applies to the single statement that follows it; it
does not take a braced block)
Synchronization - Atomic
Clauses used with ATOMIC
• update
• read
• write
Synchronization - Atomic update
Guarantees that only one thread updates the value at a
time, which avoids errors from simultaneous writes to the
same variable
syntax:
#pragma omp atomic update
expression statement;
Synchronization - Atomic Read
Reads the value atomically; the value of a shared variable can
be read safely, avoiding the danger of multiple threads
accessing the same variable at once
syntax:
#pragma omp atomic read
expression statement;
Synchronization - Atomic Write
Writes the value atomically; this prevents the shared variable
from being written simultaneously by multiple threads.
syntax:
#pragma omp atomic write
expression statement;
Note: if there is no clause with atomic, it is considered as
update
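A minimal sketch combining the three clauses (the variable
names shared_val and local are illustrative):

#include <omp.h>
#include <stdio.h>

int main() {
    int shared_val = 0;
    #pragma omp parallel num_threads(4)
    {
        int local;

        #pragma omp atomic update      /* safe increment */
        shared_val++;

        #pragma omp atomic read        /* safe snapshot of the value */
        local = shared_val;

        #pragma omp atomic write       /* safe overwrite */
        shared_val = local;
    }
    printf("shared_val = %d\n", shared_val);
}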
Synchronize Threads in OpenMP
Barrier for synchronization: it is a point in the execution of the
program where threads wait for each other. No thread is
allowed to continue until all threads in the team reach the barrier.
It may have a downside: a processor may stay idle at a barrier,
costing us wasted processor time cycles, so barriers must
be used very carefully.
Barriers are fundamental for ensuring that parallel threads
synchronize and coordinate their execution, especially when
they need to complete their tasks collectively before proceeding.
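A minimal sketch of an explicit barrier (four threads and the
phase messages are illustrative): no thread prints its phase-2
line until every thread has printed phase 1.

#include <omp.h>
#include <stdio.h>

int main() {
    #pragma omp parallel num_threads(4)
    {
        int id = omp_get_thread_num();
        printf("thread %d: phase 1\n", id);

        #pragma omp barrier   /* wait until all threads arrive */

        printf("thread %d: phase 2\n", id);
    }
}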
Synchronize Threads in OpenMP
Ordered for synchronization: the ordered construct forces the
enclosed block of a parallel loop to be executed in the same
order in which the iterations would run in the sequential
loop.
It may have a downside: threads may stay idle waiting for
their turn, costing processor time, so ordered should be used
only when the sequential ordering is genuinely required.
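A minimal sketch of ordered (note that the loop itself must
carry the ordered clause):

#include <omp.h>
#include <stdio.h>

int main() {
    /* the ordered clause on the loop enables the ordered construct */
    #pragma omp parallel for ordered
    for (int i = 0; i < 5; i++) {
        #pragma omp ordered   /* this block runs in iteration order: 0,1,2,3,4 */
        printf("iteration %d\n", i);
    }
}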
List of some functions of the OpenMP Library

• omp_set_num_threads(4): requests a certain number of threads
• omp_get_thread_num(): returns the thread ID
• #pragma omp parallel num_threads(3): clause to run in parallel
with a specific number of threads
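A minimal sketch combining these library calls:

#include <omp.h>
#include <stdio.h>

int main() {
    omp_set_num_threads(4);   /* request four threads for later regions */

    #pragma omp parallel
    {
        /* each thread reports its own ID */
        printf("thread %d\n", omp_get_thread_num());
    }
}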
