100% found this document useful (1 vote)

460 views40 pages

Message Passing Interface (MPI) : EC3500: Introduction To Parallel Computing

The document discusses Message Passing Interface (MPI), which is a standard library specification for writing message-passing programs between multiple processes. MPI aims to provide a portable, efficient, and flexible standard for message passing. The document provides background on MPI's development, goals, programming model, common routines for environment management and communication, and getting started with MPI programming.

Uploaded by

Maria Isaura Lopez

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

460 views40 pages

Message Passing Interface (MPI) : EC3500: Introduction To Parallel Computing

Uploaded by

Maria Isaura Lopez

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 40

Message Passing Interface (MPI)

Abstract

The Message Passing Interface Standard (MPI) is a message passing library standard based on the consensus of the MPI Forum, which
has over 40 participating organizations, including vendors, researchers, software library developers, and users. The goal of the
Message Passing Interface is to establish a portable, efficient, and flexible standard for message passing that will be widely used for
writing message passing programs. As such, MPI is the first standardized, vendor independent, message passing library. The
advantages of developing message passing software using MPI closely match the design goals of portability, efficiency, and
flexibility. MPI is not an IEEE or ISO standard, but has in fact, become the "industry standard" for writing message passing programs
on HPC platforms.

The goal of this tutorial is to teach those unfamiliar with MPI how to develop and run parallel programs according to the MPI
standard. The primary topics that are presented focus on those which are the most useful for new MPI programmers. The tutorial
begins with an introduction, background, and basic information for getting started with MPI. This is followed by a detailed look at the
MPI routines that are most useful for new MPI programmers, including MPI Environment Management, Point-to-Point
Communications, and Collective Communications routines. Numerous examples in both C and Fortran are provided, as well as a lab
exercise.

The tutorial materials also include more advanced topics such as Derived Data Types, Group and Communicator Management
Routines, and Virtual Topologies. However, these are not actually presented during the lecture, but are meant to serve as "further
reading" for those who are interested.

Level/Prerequisites: Ideal for those who are new to parallel programming with MPI. A basic understanding of parallel programming in
C or Fortran is assumed. For those who are unfamiliar with Parallel Programming in general, the material covered in EC3500:
Introduction To Parallel Computing would be helpful.

What is MPI?

An Interface Specification:

 M P I = Message Passing Interface

 MPI is a specification for the developers and users of message passing libraries. By itself, it is NOT a library - but rather the
specification of what such a library should be.
 Simply stated, the goal of the Message Passing Interface is to provide a widely used standard for writing message passing
programs. The interface attempts to be
o practical
o portable
o efficient
o flexible
 Interface specifications have been defined for C/C++ and Fortran programs.

History and Evolution:

 MPI resulted from the efforts of numerous individuals and groups over the course of a 2 year period between 1992 and 1994.
Some history:
 1980s - early 1990s: Distributed memory, parallel computing develops, as do a number of incompatible software tools for
writing such programs - usually with tradeoffs between portability, performance, functionality and price. Recognition of the
need for a standard arose.
 April, 1992: Workshop on Standards for Message Passing in a Distributed Memory Environment, sponsored by the Center
for Research on Parallel Computing, Williamsburg, Virginia. The basic features essential to a standard message passing
interface were discussed, and a working group established to continue the standardization process. Preliminary draft proposal
developed subsequently.
 November 1992: - Working group meets in Minneapolis. MPI draft proposal (MPI1) from ORNL presented. Group adopts
procedures and organization to form the MPI Forum. MPIF eventually comprised of about 175 individuals from 40
organizations including parallel computer vendors, software writers, academia and application scientists.
 November 1993: Supercomputing 93 conference - draft MPI standard presented.
 Final version of draft released in May, 1994 - available on the at: http://www-unix.mcs.anl.gov/mpi.
 MPI-2 picked up where the first MPI specification left off, and addressed topics which go beyond the first MPI specification.
The original MPI then became known as MPI-1. MPI-2 is briefly covered later. Was finalized in 1996.
 Today, MPI implementations are a combination of MPI-1 and MPI-2. A few implementations include the full functionality of
both.

Reasons for Using MPI:

 Standardization - MPI is the only message passing library which can be considered a standard. It is supported on virtually
all HPC platforms. Practically, it has replaced all previous message passing libraries.
 Portability - There is no need to modify your source code when you port your application to a different platform that
supports (and is compliant with) the MPI standard.
 Performance Opportunities - Vendor implementations should be able to exploit native hardware features to optimize
performance. For more information about MPI performance see the MPI Performance Topics tutorial.
 Functionality - Over 115 routines are defined in MPI-1 alone.
 Availability - A variety of implementations are available, both vendor and public domain.

Programming Model:

 MPI lends itself to virtually any distributed memory parallel programming model. In addition, MPI is commonly used to
implement (behind the scenes) some shared memory models, such as Data Parallel, on distributed memory architectures.
 Hardware platforms:
o Distributed Memory: Originally, MPI was targeted for distributed memory systems.
o Shared Memory: As shared memory systems became more popular, particularly SMP / NUMA architectures, MPI
implementations for these platforms appeared.
o Hybrid: MPI is now used on just about any common parallel architecture including massively parallel machines,
SMP clusters, workstation clusters and heterogeneous networks.
 All parallelism is explicit: the programmer is responsible for correctly identifying parallelism and implementing parallel
algorithms using MPI constructs.
 The number of tasks dedicated to run a parallel program is static. New tasks can not be dynamically spawned during run
time. (MPI-2 addresses this issue).
Getting Started

Header File:

 Required for all programs/routines which make MPI library calls.

C include file Fortran include file

#include "mpi.h" include 'mpif.h'

Format of MPI Calls:

C Binding

Format: rc = MPI_Xxxxx(parameter, ... )

Example: rc = MPI_Bsend(&buf,count,type,dest,tag,comm)

Error code: Returned as "rc". MPI_SUCCESS if successful

Fortran Binding
CALL MPI_XXXXX(parameter,..., ierr)
Format: call mpi_xxxxx(parameter,..., ierr)

Example: CALL MPI_BSEND(buf,count,type,dest,tag,comm,ierr)

Error code: Returned as "ierr" parameter. MPI_SUCCESS if successful

 C names are case sensitive; Fortran names are not.

General MPI Program Structure:

Communicators and Groups:

 MPI uses objects called communicators and groups to define which collection of processes may communicate with each
other. Most MPI routines require you to specify a communicator as an argument.
 Communicators and groups will be covered in more detail later. For now, simply use MPI_COMM_WORLD whenever a
communicator is required - it is the predefined communicator that includes all of your MPI processes.

Rank:

 Within a communicator, every process has its own unique, integer identifier assigned by the system when the process
initializes. A rank is sometimes also called a "task ID". Ranks are contiguous and begin at zero.
 Used by the programmer to specify the source and destination of messages. Often used conditionally by the application to
control program execution (if rank=0 do this / if rank=1 do that).

Environment Management Routines

MPI environment management routines are used for an assortment of purposes, such as initializing and terminating the MPI
environment, querying the environment and identity, etc. Most of the commonly used ones are described below.

MPI_Init
Initializes the MPI execution environment. This function must be called in every MPI program, must be called before any
other MPI functions and must be called only once in an MPI program. For C programs, MPI_Init may be used to pass the
command line arguments to all processes, although this is not required by the standard and is implementation dependent.

MPI_Init (&argc,&argv)
MPI_INIT (ierr)

MPI_Comm_size
Determines the number of processes in the group associated with a communicator. Generally used within the communicator
MPI_COMM_WORLD to determine the number of processes being used by your application.

MPI_Comm_size (comm,&size)
MPI_COMM_SIZE (comm,size,ierr)

MPI_Comm_rank
Determines the rank of the calling process within the communicator. Initially, each process will be assigned a unique integer
rank between 0 and number of processors - 1 within the communicator MPI_COMM_WORLD. This rank is often referred to
as a task ID. If a process becomes associated with other communicators, it will have a unique rank within each of these as
well.
MPI_Comm_rank (comm,&rank)
MPI_COMM_RANK (comm,rank,ierr)

MPI_Abort
Terminates all MPI processes associated with the communicator. In most MPI implementations it terminates ALL processes
regardless of the communicator specified.

MPI_Abort (comm,errorcode)
MPI_ABORT (comm,errorcode,ierr)

MPI_Get_processor_name
Returns the processor name. Also returns the length of the name. The buffer for "name" must be at least
MPI_MAX_PROCESSOR_NAME characters in size. What is returned into "name" is implementation dependent - may not
be the same as the output of the "hostname" or "host" shell commands.

MPI_Get_processor_name (&name,&resultlength)
MPI_GET_PROCESSOR_NAME (name,resultlength,ierr)

MPI_Initialized
Indicates whether MPI_Init has been called - returns flag as either logical true (1) or false(0). MPI requires that MPI_Init be
called once and only once by each process. This may pose a problem for modules that want to use MPI and are prepared to
call MPI_Init if necessary. MPI_Initialized solves this problem.

MPI_Initialized (&flag)
MPI_INITIALIZED (flag,ierr)

MPI_Wtime
Returns an elapsed wall clock time in seconds (double precision) on the calling processor.

MPI_Wtime ()
MPI_WTIME ()

MPI_Wtick
Returns the resolution in seconds (double precision) of MPI_Wtime.

MPI_Wtick ()
MPI_WTICK ()

MPI_Finalize
Terminates the MPI execution environment. This function should be the last MPI routine called in every MPI program - no
other MPI routines may be called after it.

MPI_Finalize ()
MPI_FINALIZE (ierr)

Examples: Environment Management Routines

C Language - Environment Management Routines Example

#include "mpi.h"
#include <stdio.h>

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, rc;

rc = MPI_Init(&argc,&argv);
if (rc != MPI_SUCCESS) {
printf ("Error starting MPI program. Terminating.\n");
MPI_Abort(MPI_COMM_WORLD, rc);
}

MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
printf ("Number of tasks= %d My rank= %d\n", numtasks,rank);

/* do some work */

MPI_Finalize();
}

Fortran - Environment Management Routines Example

program simple
include 'mpif.h'

integer numtasks, rank, ierr, rc

call MPI_INIT(ierr)
if (ierr .ne. MPI_SUCCESS) then
print *,'Error starting MPI program. Terminating.'
call MPI_ABORT(MPI_COMM_WORLD, rc, ierr)
end if

call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)
print *, 'Number of tasks=',numtasks,' My rank=',rank

C do some work

call MPI_FINALIZE(ierr)

end

Point to Point Communication Routines

General Concepts

Types of Point-to-Point Operations:

 MPI point-to-point operations typically involve message passing between two, and only two, different MPI tasks. One task is
performing a send operation and the other task is performing a matching receive operation.
 There are different types of send and receive routines used for different purposes. For example:
o Synchronous send
o Blocking send / blocking receive
o Non-blocking send / non-blocking receive
o Buffered send
o Combined send/receive
o "Ready" send
 Any type of send routine can be paired with any type of receive routine.
 MPI also provides several routines associated with send - receive operations, such as those used to wait for a message's
arrival or probe to find out if a message has arrived.

Buffering:

 In a perfect world, every send operation would be perfectly synchronized with its matching receive. This is rarely the case.
Somehow or other, the MPI implementation must be able to deal with storing data when the two tasks are out of sync.
 Consider the following two cases:
o A send operation occurs 5 seconds before the receive is ready - where is the message while the receive is pending?
o Multiple sends arrive at the same receiving task which can only accept one send at a time - what happens to the
messages that are "backing up"?
 The MPI implementation (not the MPI standard) decides what happens to data in these types of cases. Typically, a system
buffer area is reserved to hold data in transit. For example:

 System buffer space is:

o Opaque to the programmer and managed entirely by the MPI library
o A finite resource that can be easy to exhaust
o Often mysterious and not well documented
o Able to exist on the sending side, the receiving side, or both
o Something that may improve program performance because it allows send - receive operations to be asynchronous.
 User managed address space (i.e. your program variables) is called the application buffer. MPI also provides for a user
managed send buffer.

Blocking vs. Non-blocking:

 Most of the MPI point-to-point routines can be used in either blocking or non-blocking mode.
 Blocking:
o A blocking send routine will only "return" after it is safe to modify the application buffer (your send data) for reuse.
Safe means that modifications will not affect the data intended for the receive task. Safe does not imply that the data
was actually received - it may very well be sitting in a system buffer.
o A blocking send can be synchronous which means there is handshaking occurring with the receive task to confirm a
safe send.
o A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receive.
o A blocking receive only "returns" after the data has arrived and is ready for use by the program.
 Non-blocking:
o Non-blocking send and receive routines behave similarly - they will return almost immediately. They do not wait for
any communication events to complete, such as message copying from user memory to system buffer space or the
actual arrival of message.
o Non-blocking operations simply "request" the MPI library to perform the operation when it is able. The user can not
predict when that will happen.
o It is unsafe to modify the application buffer (your variable space) until you know for a fact the requested non-
blocking operation was actually performed by the library. There are "wait" routines used to do this.
o Non-blocking communications are primarily used to overlap computation with communication and exploit possible
performance gains.

Order and Fairness:

 Order:
o MPI guarantees that messages will not overtake each other.
o If a sender sends two messages (Message 1 and Message 2) in succession to the same destination, and both match
the same receive, the receive operation will receive Message 1 before Message 2.
o If a receiver posts two receives (Receive 1 and Receive 2), in succession, and both are looking for the same
message, Receive 1 will receive the message before Receive 2.
o Order rules do not apply if there are multiple threads participating in the communication operations.
 Fairness:
o MPI does not guarantee fairness - it's up to the programmer to prevent "operation starvation".
o Example: task 0 sends a message to task 2. However, task 1 sends a competing message that matches task 2's
receive. Only one of the sends will complete.

Point to Point Communication Routines

MPI Message Passing Routine Arguments

MPI point-to-point communication routines generally have an argument list that takes one of the following formats:
Blocking sends MPI_Send(buffer,count,type,dest,tag,comm)

Non-blocking sends MPI_Isend(buffer,count,type,dest,tag,comm,request)

Blocking receive MPI_Recv(buffer,count,type,source,tag,comm,status)

Non-blocking receive MPI_Irecv(buffer,count,type,source,tag,comm,request)

Buffer
Program (application) address space that references the data that is to be sent or received. In most cases, this is simply the
variable name that is be sent/received. For C programs, this argument is passed by reference and usually must be prepended
with an ampersand: &var1
Data Count
Indicates the number of data elements of a particular type to be sent.
Data Type
For reasons of portability, MPI predefines its elementary data types. The table below lists those required by the standard.
C Data Types Fortran Data Types
MPI_CHAR signed char MPI_CHARACTER character(1)
MPI_SHORT signed short int
MPI_INT signed int MPI_INTEGER integer
MPI_LONG signed long int
MPI_UNSIGNED_CHAR unsigned char
MPI_UNSIGNED_SHOR
T unsigned short int

MPI_UNSIGNED unsigned int

MPI_UNSIGNED_LONG unsigned long int
MPI_FLOAT float MPI_REAL real
MPI_DOUBLE_PRECISIO
MPI_DOUBLE double N double precision

MPI_LONG_DOUBLE long double

MPI_COMPLEX complex

MPI_DOUBLE_COMPLEX double complex

MPI_LOGICAL logical
MPI_BYTE 8 binary digits MPI_BYTE 8 binary digits

MPI_PACKED
data packed or unpacked with MPI_PACKED
data packed or unpacked with
MPI_Pack()/ MPI_Unpack MPI_Pack()/ MPI_Unpack

Notes:

 Programmers may also create their own data types (see Derived Data Types).
 MPI_BYTE and MPI_PACKED do not correspond to standard C or Fortran types.
 The MPI standard includes the following optional data types:
o C: MPI_LONG_LONG_INT
o Fortran: MPI_INTEGER1, MPI_INTEGER2, MPI_INTEGER4, MPI_REAL2, MPI_REAL4,
MPI_REAL8
 Some implementations may include additional elementary data types (MPI_LOGICAL2, MPI_COMPLEX32, etc.).
Check the MPI header file.

Destination
An argument to send routines that indicates the process where a message should be delivered. Specified as the rank of the
receiving process.
Source
An argument to receive routines that indicates the originating process of the message. Specified as the rank of the sending
process. This may be set to the wild card MPI_ANY_SOURCE to receive a message from any task.
Tag
Arbitrary non-negative integer assigned by the programmer to uniquely identify a message. Send and receive operations
should match message tags. For a receive operation, the wild card MPI_ANY_TAG can be used to receive any message
regardless of its tag. The MPI standard guarantees that integers 0-32767 can be used as tags, but most implementations allow
a much larger range than this.
Communicator
Indicates the communication context, or set of processes for which the source or destination fields are valid. Unless the
programmer is explicitly creating new communicators, the predefined communicator MPI_COMM_WORLD is usually used.
Status
For a receive operation, indicates the source of the message and the tag of the message. In C, this argument is a pointer to a
predefined structure MPI_Status (ex. stat.MPI_SOURCE stat.MPI_TAG). In Fortran, it is an integer array of size
MPI_STATUS_SIZE (ex. stat(MPI_SOURCE) stat(MPI_TAG)). Additionally, the actual number of bytes received are
obtainable from Status via the MPI_Get_count routine.
Request
Used by non-blocking send and receive operations. Since non-blocking operations may return before the requested system
buffer space is obtained, the system issues a unique "request number". The programmer uses this system assigned "handle"
later (in a WAIT type routine) to determine completion of the non-blocking operation. In C, this argument is a pointer to a
predefined structure MPI_Request. In Fortran, it is an integer.

Point to Point Communication Routines

Blocking Message Passing Routines

The more commonly used MPI blocking message passing routines are described below.
MPI_Send
Basic blocking send operation. Routine returns only after the application buffer in the sending task is free for reuse. Note that
this routine may be implemented differently on different systems. The MPI standard permits the use of a system buffer but
does not require it. Some implementations may actually use a synchronous send (discussed below) to implement the basic
blocking send.

MPI_Send (&buf,count,datatype,dest,tag,comm)
MPI_SEND (buf,count,datatype,dest,tag,comm,ierr)

MPI_Recv
Receive a message and block until the requested data is available in the application buffer in the receiving task.

MPI_Recv (&buf,count,datatype,source,tag,comm,&status)
MPI_RECV (buf,count,datatype,source,tag,comm,status,ierr)

MPI_Ssend
Synchronous blocking send: Send a message and block until the application buffer in the sending task is free for reuse and
the destination process has started to receive the message.

MPI_Ssend (&buf,count,datatype,dest,tag,comm)
MPI_SSEND (buf,count,datatype,dest,tag,comm,ierr)

MPI_Bsend
Buffered blocking send: permits the programmer to allocate the required amount of buffer space into which data can be
copied until it is delivered. Insulates against the problems associated with insufficient system buffer space. Routine returns
after the data has been copied from application buffer space to the allocated send buffer. Must be used with the
MPI_Buffer_attach routine.

MPI_Bsend (&buf,count,datatype,dest,tag,comm)
MPI_BSEND (buf,count,datatype,dest,tag,comm,ierr)

MPI_Buffer_attach
MPI_Buffer_detach
Used by programmer to allocate/deallocate message buffer space to be used by the MPI_Bsend routine. The size argument is
specified in actual data bytes - not a count of data elements. Only one buffer can be attached to a process at a time. Note that
the IBM implementation uses MPI_BSEND_OVERHEAD bytes of the allocated buffer for overhead.

MPI_Buffer_attach (&buffer,size)
MPI_Buffer_detach (&buffer,size)
MPI_BUFFER_ATTACH (buffer,size,ierr)
MPI_BUFFER_DETACH (buffer,size,ierr)

MPI_Rsend
Blocking ready send. Should only be used if the programmer is certain that the matching receive has already been posted.

MPI_Rsend (&buf,count,datatype,dest,tag,comm)
MPI_RSEND (buf,count,datatype,dest,tag,comm,ierr)

MPI_Sendrecv
Send a message and post a receive before blocking. Will block until the sending application buffer is free for reuse and until
the receiving application buffer contains the received message.

MPI_Sendrecv (&sendbuf,sendcount,sendtype,dest,sendtag,
...... &recvbuf,recvcount,recvtype,source,recvtag,
...... comm,&status)
MPI_SENDRECV (sendbuf,sendcount,sendtype,dest,sendtag,
...... recvbuf,recvcount,recvtype,source,recvtag,
...... comm,status,ierr)

MPI_Wait
MPI_Waitany
MPI_Waitall
MPI_Waitsome
MPI_Wait blocks until a specified non-blocking send or receive operation has completed. For multiple non-blocking
operations, the programmer can specify any, all or some completions.

MPI_Wait (&request,&status)
MPI_Waitany (count,&array_of_requests,&index,&status)
MPI_Waitall (count,&array_of_requests,&array_of_statuses)
MPI_Waitsome (incount,&array_of_requests,&outcount,
...... &array_of_offsets, &array_of_statuses)
MPI_WAIT (request,status,ierr)
MPI_WAITANY (count,array_of_requests,index,status,ierr)
MPI_WAITALL (count,array_of_requests,array_of_statuses,
...... ierr)
MPI_WAITSOME (incount,array_of_requests,outcount,
...... array_of_offsets, array_of_statuses,ierr)

MPI_Probe
Performs a blocking test for a message. The "wildcards" MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test for
a message from any source or with any tag. For the C routine, the actual source and tag will be returned in the status structure
as status.MPI_SOURCE and status.MPI_TAG. For the Fortran routine, they will be returned in the integer array
status(MPI_SOURCE) and status(MPI_TAG).

MPI_Probe (source,tag,comm,&status)
MPI_PROBE (source,tag,comm,status,ierr)

Examples: Blocking Message Passing Routines

Task 0 pings task 1 and awaits return ping

C Language - Blocking Message Passing Routines Example

#include "mpi.h"
#include <stdio.h>

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, dest, source, rc, count, tag=1;
char inmsg, outmsg='x';
MPI_Status Stat;

MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

if (rank == 0) {
dest = 1;
source = 1;
rc = MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
rc = MPI_Recv(&inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
}

else if (rank == 1) {
dest = 0;
source = 0;
rc = MPI_Recv(&inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
rc = MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
}

rc = MPI_Get_count(&Stat, MPI_CHAR, &count);

printf("Task %d: Received %d char(s) from task %d with tag %d \n",
rank, count, Stat.MPI_SOURCE, Stat.MPI_TAG);

MPI_Finalize();
}

Fortran - Blocking Message Passing Routines Example

program ping
include 'mpif.h'

integer numtasks, rank, dest, source, count, tag, ierr

integer stat(MPI_STATUS_SIZE)
character inmsg, outmsg
outmsg = 'x'
tag = 1

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

if (rank .eq. 0) then

dest = 1
source = 1
call MPI_SEND(outmsg, 1, MPI_CHARACTER, dest, tag,
& MPI_COMM_WORLD, ierr)
call MPI_RECV(inmsg, 1, MPI_CHARACTER, source, tag,
& MPI_COMM_WORLD, stat, ierr)

else if (rank .eq. 1) then

dest = 0
source = 0
call MPI_RECV(inmsg, 1, MPI_CHARACTER, source, tag,
& MPI_COMM_WORLD, stat, err)
call MPI_SEND(outmsg, 1, MPI_CHARACTER, dest, tag,
& MPI_COMM_WORLD, err)
endif

call MPI_GET_COUNT(stat, MPI_CHARACTER, count, ierr)

print *, 'Task ',rank,': Received', count, 'char(s) from task',
& stat(MPI_SOURCE), 'with tag',stat(MPI_TAG)

call MPI_FINALIZE(ierr)

end

Point to Point Communication Routines

Non-Blocking Message Passing Routines

The more commonly used MPI non-blocking message passing routines are described below.
MPI_Isend
Identifies an area in memory to serve as a send buffer. Processing continues immediately without waiting for the message to
be copied out from the application buffer. A communication request handle is returned for handling the pending message
status. The program should not modify the application buffer until subsequent calls to MPI_Wait or MPI_Test indicate that
the non-blocking send has completed.

MPI_Isend (&buf,count,datatype,dest,tag,comm,&request)
MPI_ISEND (buf,count,datatype,dest,tag,comm,request,ierr)

MPI_Irecv
Identifies an area in memory to serve as a receive buffer. Processing continues immediately without actually waiting for the
message to be received and copied into the the application buffer. A communication request handle is returned for handling
the pending message status. The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking
receive operation completes and the requested message is available in the application buffer.

MPI_Irecv (&buf,count,datatype,source,tag,comm,&request)
MPI_IRECV (buf,count,datatype,source,tag,comm,request,ierr)

MPI_Issend
Non-blocking synchronous send. Similar to MPI_Isend(), except MPI_Wait() or MPI_Test() indicates when the destination
process has received the message.

MPI_Issend (&buf,count,datatype,dest,tag,comm,&request)
MPI_ISSEND (buf,count,datatype,dest,tag,comm,request,ierr)

MPI_Ibsend
Non-blocking buffered send. Similar to MPI_Bsend() except MPI_Wait() or MPI_Test() indicates when the destination
process has received the message. Must be used with the MPI_Buffer_attach routine.

MPI_Ibsend (&buf,count,datatype,dest,tag,comm,&request)
MPI_IBSEND (buf,count,datatype,dest,tag,comm,request,ierr)

MPI_Irsend
Non-blocking ready send. Similar to MPI_Rsend() except MPI_Wait() or MPI_Test() indicates when the destination process
has received the message. Should only be used if the programmer is certain that the matching receive has already been
posted.

MPI_Irsend (&buf,count,datatype,dest,tag,comm,&request)
MPI_IRSEND (buf,count,datatype,dest,tag,comm,request,ierr)

MPI_Test
MPI_Testany
MPI_Testall
MPI_Testsome
MPI_Test checks the status of a specified non-blocking send or receive operation. The "flag" parameter is returned logical
true (1) if the operation has completed, and logical false (0) if not. For multiple non-blocking operations, the programmer can
specify any, all or some completions.

MPI_Test (&request,&flag,&status)
MPI_Testany (count,&array_of_requests,&index,&flag,&status)
MPI_Testall (count,&array_of_requests,&flag,&array_of_statuses)
MPI_Testsome (incount,&array_of_requests,&outcount,
...... &array_of_offsets, &array_of_statuses)
MPI_TEST (request,flag,status,ierr)
MPI_TESTANY (count,array_of_requests,index,flag,status,ierr)
MPI_TESTALL (count,array_of_requests,flag,array_of_statuses,ierr)
MPI_TESTSOME (incount,array_of_requests,outcount,
...... array_of_offsets, array_of_statuses,ierr)

MPI_Iprobe
Performs a non-blocking test for a message. The "wildcards" MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test
for a message from any source or with any tag. The integer "flag" parameter is returned logical true (1) if a message has
arrived, and logical false (0) if not. For the C routine, the actual source and tag will be returned in the status structure as
status.MPI_SOURCE and status.MPI_TAG. For the Fortran routine, they will be returned in the integer array
status(MPI_SOURCE) and status(MPI_TAG).

MPI_Iprobe (source,tag,comm,&flag,&status)
MPI_IPROBE (source,tag,comm,flag,status,ierr)

Examples: Non-Blocking Message Passing Routines

Nearest neighbor exchange in ring topology

C Language - Non-Blocking Message Passing Routines Example

#include "mpi.h"
#include <stdio.h>
int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, next, prev, buf[2], tag1=1, tag2=2;
MPI_Request reqs[4];
MPI_Status stats[4];

MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

prev = rank-1;
next = rank+1;
if (rank == 0) prev = numtasks - 1;
if (rank == (numtasks - 1)) next = 0;

MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);

MPI_Irecv(&buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1]);

MPI_Isend(&rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);

MPI_Isend(&rank, 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3]);

{ do some work }

MPI_Waitall(4, reqs, stats);

MPI_Finalize();
}

Fortran - Non-Blocking Message Passing Routines Example

program ringtopo
include 'mpif.h'

integer numtasks, rank, next, prev, buf(2), tag1, tag2, ierr

integer stats(MPI_STATUS_SIZE,4), reqs(4)
tag1 = 1
tag2 = 2

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

prev = rank - 1
next = rank + 1
if (rank .eq. 0) then
prev = numtasks - 1
endif
if (rank .eq. numtasks - 1) then
next = 0
endif

call MPI_IRECV(buf(1), 1, MPI_INTEGER, prev, tag1,

& MPI_COMM_WORLD, reqs(1), ierr)
call MPI_IRECV(buf(2), 1, MPI_INTEGER, next, tag2,
& MPI_COMM_WORLD, reqs(2), ierr)

call MPI_ISEND(rank, 1, MPI_INTEGER, prev, tag2,

& MPI_COMM_WORLD, reqs(3), ierr)
call MPI_ISEND(rank, 1, MPI_INTEGER, next, tag1,
& MPI_COMM_WORLD, reqs(4), ierr)

C do some work

call MPI_WAITALL(4, reqs, stats, ierr);

call MPI_FINALIZE(ierr)

end

Collective Communication Routines

All or None:

 Collective communication must involve all processes in the scope of a communicator. All processes are by default, members
in the communicator MPI_COMM_WORLD.
 It is the programmer's responsibility to insure that all processes within a communicator participate in any collective
operations.

Types of Collective Operations:

 Synchronization - processes wait until all members of the group have reached the synchronization point.
 Data Movement - broadcast, scatter/gather, all to all.
 Collective Computation (reductions) - one member of the group collects data from the other members and performs an
operation (min, max, add, multiply, etc.) on that data.

Programming Considerations and Restrictions:

 Collective operations are blocking.

 Collective communication routines do not take message tag arguments.
 Collective operations within subsets of processes are accomplished by first partitioning the subsets into new groups and then
attaching the new groups to new communicators (discussed in the Group and Communicator Management Routines section).
 Can only be used with MPI predefined datatypes - not with MPI Derived Data Types.

Collective Communication Routines

MPI_Barrier
Creates a barrier synchronization in a group. Each task, when reaching the MPI_Barrier call, blocks until all tasks in the
group reach the same MPI_Barrier call.

MPI_Barrier (comm)
MPI_BARRIER (comm,ierr)

MPI_Bcast
Broadcasts (sends) a message from the process with rank "root" to all other processes in the group.

MPI_Bcast (&buffer,count,datatype,root,comm)
MPI_BCAST (buffer,count,datatype,root,comm,ierr)

MPI_Scatter
Distributes distinct messages from a single source task to each task in the group.

MPI_Scatter (&sendbuf,sendcnt,sendtype,&recvbuf,
...... recvcnt,recvtype,root,comm)
MPI_SCATTER (sendbuf,sendcnt,sendtype,recvbuf,
...... recvcnt,recvtype,root,comm,ierr)

MPI_Gather
Gathers distinct messages from each task in the group to a single destination task. This routine is the reverse operation of
MPI_Scatter.
MPI_Gather (&sendbuf,sendcnt,sendtype,&recvbuf,
...... recvcount,recvtype,root,comm)
MPI_GATHER (sendbuf,sendcnt,sendtype,recvbuf,
...... recvcount,recvtype,root,comm,ierr)

MPI_Allgather
Concatenation of data to all tasks in a group. Each task in the group, in effect, performs a one-to-all broadcasting operation
within the group.

MPI_Allgather (&sendbuf,sendcount,sendtype,&recvbuf,
...... recvcount,recvtype,comm)
MPI_ALLGATHER (sendbuf,sendcount,sendtype,recvbuf,
...... recvcount,recvtype,comm,info)

MPI_Reduce
Applies a reduction operation on all tasks in the group and places the result in one task.

MPI_Reduce (&sendbuf,&recvbuf,count,datatype,op,root,comm)
MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)

The predefined MPI reduction operations appear below. Users can also define their own reduction functions by using the
MPI_Op_create routine.

MPI Reduction Operation C Data Types Fortran Data Type

MPI_MAX maximum integer, float integer, real, complex
MPI_MIN minimum integer, float integer, real, complex
MPI_SUM sum integer, float integer, real, complex
MPI_PROD product integer, float integer, real, complex
MPI_LAND logical AND integer logical
MPI_BAND bit-wise AND integer, MPI_BYTE integer, MPI_BYTE
MPI_LOR logical OR integer logical
MPI_BOR bit-wise OR integer, MPI_BYTE integer, MPI_BYTE
MPI_LXOR logical XOR integer logical
MPI_BXOR bit-wise XOR integer, MPI_BYTE integer, MPI_BYTE
MPI_MAXLOC max value and location float, double and long double real, complex,double precision
MPI_MINLOC min value and location float, double and long double real, complex, double precision
MPI_Allreduce
Applies a reduction operation and places the result in all tasks in the group. This is equivalent to an MPI_Reduce followed by
an MPI_Bcast.

MPI_Allreduce (&sendbuf,&recvbuf,count,datatype,op,comm)
MPI_ALLREDUCE (sendbuf,recvbuf,count,datatype,op,comm,ierr)

MPI_Reduce_scatter
First does an element-wise reduction on a vector across all tasks in the group. Next, the result vector is split into disjoint
segments and distributed across the tasks. This is equivalent to an MPI_Reduce followed by an MPI_Scatter operation.
MPI_Reduce_scatter (&sendbuf,&recvbuf,recvcount,datatype,
...... op,comm)
MPI_REDUCE_SCATTER (sendbuf,recvbuf,recvcount,datatype,
...... op,comm,ierr)

MPI_Alltoall
Each task in a group performs a scatter operation, sending a distinct message to all the tasks in the group in order by index.

MPI_Alltoall (&sendbuf,sendcount,sendtype,&recvbuf,
...... recvcnt,recvtype,comm)
MPI_ALLTOALL (sendbuf,sendcount,sendtype,recvbuf,
...... recvcnt,recvtype,comm,ierr)

MPI_Scan
Performs a scan operation with respect to a reduction operation across a task group.
MPI_Scan (&sendbuf,&recvbuf,count,datatype,op,comm)
MPI_SCAN (sendbuf,recvbuf,count,datatype,op,comm,ierr)

Examples: Collective Communications

Perform a scatter operation on the rows of an array

C Language - Collective Communications Example

#include "mpi.h"
#include <stdio.h>
#define SIZE 4

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, sendcount, recvcount, source;
float sendbuf[SIZE][SIZE] = {
{1.0, 2.0, 3.0, 4.0},
{5.0, 6.0, 7.0, 8.0},
{9.0, 10.0, 11.0, 12.0},
{13.0, 14.0, 15.0, 16.0} };
float recvbuf[SIZE];

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

if (numtasks == SIZE) {
source = 1;
sendcount = SIZE;
recvcount = SIZE;
MPI_Scatter(sendbuf,sendcount,MPI_FLOAT,recvbuf,recvcount,
MPI_FLOAT,source,MPI_COMM_WORLD);

printf("rank= %d Results: %f %f %f %f\n",rank,recvbuf[0],

recvbuf[1],recvbuf[2],recvbuf[3]);
}
else
printf("Must specify %d processors. Terminating.\n",SIZE);

MPI_Finalize();
}
Fortran - Collective Communications Example

program scatter
include 'mpif.h'

integer SIZE
parameter(SIZE=4)
integer numtasks, rank, sendcount, recvcount, source, ierr
real*4 sendbuf(SIZE,SIZE), recvbuf(SIZE)

C Fortran stores this array in column major order, so the

C scatter will actually scatter columns, not rows.
data sendbuf /1.0, 2.0, 3.0, 4.0,
& 5.0, 6.0, 7.0, 8.0,
& 9.0, 10.0, 11.0, 12.0,
& 13.0, 14.0, 15.0, 16.0 /

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

if (numtasks .eq. SIZE) then

source = 1
sendcount = SIZE
recvcount = SIZE
call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf,
& recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
print *, 'rank= ',rank,' Results: ',recvbuf
else
print *, 'Must specify',SIZE,' processors. Terminating.'
endif

call MPI_FINALIZE(ierr)

end

Sample program output:

rank= 0 Results: 1.000000 2.000000 3.000000 4.000000

rank= 1 Results: 5.000000 6.000000 7.000000 8.000000
rank= 2 Results: 9.000000 10.000000 11.000000 12.000000
rank= 3 Results: 13.000000 14.000000 15.000000 16.000000

Derived Data Types

 As previously mentioned, MPI predefines its primitive data types:

C Data Types Fortran Data Types

MPI_CHAR MPI_CHARACTER
MPI_SHORT MPI_INTEGER
MPI_INT MPI_REAL
MPI_LONG MPI_DOUBLE_PRECISION
MPI_UNSIGNED_CHAR MPI_COMPLEX
MPI_UNSIGNED_SHORT MPI_DOUBLE_COMPLEX
MPI_UNSIGNED_LONG MPI_LOGICAL
MPI_UNSIGNED MPI_BYTE
MPI_FLOAT MPI_PACKED
MPI_DOUBLE
MPI_LONG_DOUBLE
MPI_BYTE
MPI_PACKED
 MPI also provides facilities for you to define your own data structures based upon sequences of the MPI primitive data types.
Such user defined structures are called derived data types.
 Primitive data types are contiguous. Derived data types allow you to specify non-contiguous data in a convenient manner and
to treat it as though it was contiguous.
 MPI provides several methods for constructing derived data types:
o Contiguous
o Vector
o Indexed
o Struct

Derived Data Type Routines

MPI_Type_contiguous
The simplest constructor. Produces a new data type by making count copies of an existing data type.

MPI_Type_contiguous (count,oldtype,&newtype)
MPI_TYPE_CONTIGUOUS (count,oldtype,newtype,ierr)

MPI_Type_vector
MPI_Type_hvector
Similar to contiguous, but allows for regular gaps (stride) in the displacements. MPI_Type_hvector is identical to
MPI_Type_vector except that stride is specified in bytes.

MPI_Type_vector (count,blocklength,stride,oldtype,&newtype)
MPI_TYPE_VECTOR (count,blocklength,stride,oldtype,newtype,ierr)

MPI_Type_indexed
MPI_Type_hindexed
An array of displacements of the input data type is provided as the map for the new data type. MPI_Type_hindexed is
identical to MPI_Type_indexed except that offsets are specified in bytes.

MPI_Type_indexed (count,blocklens[],offsets[],old_type,&newtype)
MPI_TYPE_INDEXED (count,blocklens(),offsets(),old_type,newtype,ierr)

MPI_Type_struct
The new data type is formed according to completely defined map of the component data types.

MPI_Type_struct (count,blocklens[],offsets[],old_types,&newtype)
MPI_TYPE_STRUCT (count,blocklens(),offsets(),old_types,newtype,ierr)

MPI_Type_extent
Returns the size in bytes of the specified data type. Useful for the MPI subroutines that require specification of offsets in
bytes.

MPI_Type_extent (datatype,&extent)
MPI_TYPE_EXTENT (datatype,extent,ierr)

MPI_Type_commit
Commits new datatype to the system. Required for all user constructed (derived) datatypes.

MPI_Type_commit (&datatype)
MPI_TYPE_COMMIT (datatype,ierr)

MPI_Type_free
Deallocates the specified datatype object. Use of this routine is especially important to prevent memory exhaustion if many
datatype objects are created, as in a loop.
MPI_Type_free (&datatype)
MPI_TYPE_FREE (datatype,ierr)

Examples: Contiguous Derived Data Type

Create a data type representing a row of an array and distribute a different row to all processes.

C Language - Contiguous Derived Data Type Example

#include "mpi.h"
#include <stdio.h>
#define SIZE 4

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, source=0, dest, tag=1, i;
float a[SIZE][SIZE] =
{1.0, 2.0, 3.0, 4.0,
5.0, 6.0, 7.0, 8.0,
9.0, 10.0, 11.0, 12.0,
13.0, 14.0, 15.0, 16.0};
float b[SIZE];

MPI_Status stat;
MPI_Datatype rowtype;

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

MPI_Type_contiguous(SIZE, MPI_FLOAT, &rowtype);

MPI_Type_commit(&rowtype);

if (numtasks == SIZE) {
if (rank == 0) {
for (i=0; i<numtasks; i++)
MPI_Send(&a[i][0], 1, rowtype, i, tag, MPI_COMM_WORLD);
}

MPI_Recv(b, SIZE, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &stat);

printf("rank= %d b= %3.1f %3.1f %3.1f %3.1f\n",
rank,b[0],b[1],b[2],b[3]);
}
else
printf("Must specify %d processors. Terminating.\n",SIZE);

MPI_Type_free(&rowtype);
MPI_Finalize();
}

Fortran - Contiguous Derived Data Type Example

program contiguous
include 'mpif.h'

integer SIZE
parameter(SIZE=4)
integer numtasks, rank, source, dest, tag, i, ierr
real*4 a(0:SIZE-1,0:SIZE-1), b(0:SIZE-1)
integer stat(MPI_STATUS_SIZE), columntype

C Fortran stores this array in column major order

data a /1.0, 2.0, 3.0, 4.0,
& 5.0, 6.0, 7.0, 8.0,
& 9.0, 10.0, 11.0, 12.0,
& 13.0, 14.0, 15.0, 16.0 /

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

call MPI_TYPE_CONTIGUOUS(SIZE, MPI_REAL, columntype, ierr)

call MPI_TYPE_COMMIT(columntype, ierr)

tag = 1
if (numtasks .eq. SIZE) then
if (rank .eq. 0) then
do 10 i=0, numtasks-1
call MPI_SEND(a(0,i), 1, columntype, i, tag,
& MPI_COMM_WORLD,ierr)
10 continue
endif

source = 0
call MPI_RECV(b, SIZE, MPI_REAL, source, tag,
& MPI_COMM_WORLD, stat, ierr)
print *, 'rank= ',rank,' b= ',b

else
print *, 'Must specify',SIZE,' processors. Terminating.'
endif

call MPI_TYPE_FREE(columntype, ierr)

call MPI_FINALIZE(ierr)

end

Sample program output:

rank= 0 b= 1.0 2.0 3.0 4.0

rank= 1 b= 5.0 6.0 7.0 8.0
rank= 2 b= 9.0 10.0 11.0 12.0
rank= 3 b= 13.0 14.0 15.0 16.0

Examples: Vector Derived Data Type

Create a data type representing a column of an array and distribute different columns to all processes.

C Language - Vector Derived Data Type Example

#include "mpi.h"
#include <stdio.h>
#define SIZE 4

MPI_Status stat;
MPI_Datatype columntype;

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

MPI_Type_vector(SIZE, 1, SIZE, MPI_FLOAT, &columntype);

MPI_Type_commit(&columntype);

if (numtasks == SIZE) {
if (rank == 0) {
for (i=0; i<numtasks; i++)
MPI_Send(&a[0][i], 1, columntype, i, tag, MPI_COMM_WORLD);
}

MPI_Recv(b, SIZE, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &stat);

printf("rank= %d b= %3.1f %3.1f %3.1f %3.1f\n",
rank,b[0],b[1],b[2],b[3]);
}
else
printf("Must specify %d processors. Terminating.\n",SIZE);

MPI_Type_free(&columntype);
MPI_Finalize();
}

Fortran - Vector Derived Data Type Example

program vector
include 'mpif.h'

integer SIZE
parameter(SIZE=4)
integer numtasks, rank, source, dest, tag, i, ierr
real*4 a(0:SIZE-1,0:SIZE-1), b(0:SIZE-1)
integer stat(MPI_STATUS_SIZE), rowtype

C Fortran stores this array in column major order

data a /1.0, 2.0, 3.0, 4.0,
& 5.0, 6.0, 7.0, 8.0,
& 9.0, 10.0, 11.0, 12.0,
& 13.0, 14.0, 15.0, 16.0 /

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

call MPI_TYPE_VECTOR(SIZE, 1, SIZE, MPI_REAL, rowtype, ierr)

call MPI_TYPE_COMMIT(rowtype, ierr)

tag = 1
if (numtasks .eq. SIZE) then
if (rank .eq. 0) then
do 10 i=0, numtasks-1
call MPI_SEND(a(i,0), 1, rowtype, i, tag,
& MPI_COMM_WORLD, ierr)
10 continue
endif

source = 0
call MPI_RECV(b, SIZE, MPI_REAL, source, tag,
& MPI_COMM_WORLD, stat, ierr)
print *, 'rank= ',rank,' b= ',b

else
print *, 'Must specify',SIZE,' processors. Terminating.'
endif

call MPI_TYPE_FREE(rowtype, ierr)

call MPI_FINALIZE(ierr)

end

Sample program output:

rank= 0 b= 1.0 5.0 9.0 13.0

rank= 1 b= 2.0 6.0 10.0 14.0
rank= 2 b= 3.0 7.0 11.0 15.0
rank= 3 b= 4.0 8.0 12.0 16.0

Examples: Indexed Derived Data Type

Create a datatype by extracting variable portions of an array and distribute to all tasks.
C Language - Indexed Derived Data Type Example

#include "mpi.h"
#include <stdio.h>
#define NELEMENTS 6

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, source=0, dest, tag=1, i;
int blocklengths[2], displacements[2];
float a[16] =
{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0,
9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0};
float b[NELEMENTS];

MPI_Status stat;
MPI_Datatype indextype;

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

blocklengths[0] = 4;
blocklengths[1] = 2;
displacements[0] = 5;
displacements[1] = 12;

MPI_Type_indexed(2, blocklengths, displacements, MPI_FLOAT, &indextype);

MPI_Type_commit(&indextype);

if (rank == 0) {
for (i=0; i<numtasks; i++)
MPI_Send(a, 1, indextype, i, tag, MPI_COMM_WORLD);
}

MPI_Recv(b, NELEMENTS, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &stat);

printf("rank= %d b= %3.1f %3.1f %3.1f %3.1f %3.1f %3.1f\n",
rank,b[0],b[1],b[2],b[3],b[4],b[5]);

MPI_Type_free(&indextype);
MPI_Finalize();
}

Fortran - Indexed Derived Data Type Example

program indexed
include 'mpif.h'

integer NELEMENTS
parameter(NELEMENTS=6)
integer numtasks, rank, source, dest, tag, i, ierr
integer blocklengths(0:1), displacements(0:1)
real*4 a(0:15), b(0:NELEMENTS-1)
integer stat(MPI_STATUS_SIZE), indextype

data a /1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0,

& 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0 /

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

blocklengths(0) = 4
blocklengths(1) = 2
displacements(0) = 5
displacements(1) = 12

call MPI_TYPE_INDEXED(2, blocklengths, displacements, MPI_REAL,

& indextype, ierr)
call MPI_TYPE_COMMIT(indextype, ierr)

tag = 1
if (rank .eq. 0) then
do 10 i=0, numtasks-1
call MPI_SEND(a, 1, indextype, i, tag, MPI_COMM_WORLD, ierr)
10 continue
endif

source = 0
call MPI_RECV(b, NELEMENTS, MPI_REAL, source, tag, MPI_COMM_WORLD,
& stat, ierr)
print *, 'rank= ',rank,' b= ',b

call MPI_TYPE_FREE(indextype, ierr)

call MPI_FINALIZE(ierr)

end

Sample program output:

rank= 0 b= 6.0 7.0 8.0 9.0 13.0 14.0

rank= 1 b= 6.0 7.0 8.0 9.0 13.0 14.0
rank= 2 b= 6.0 7.0 8.0 9.0 13.0 14.0
rank= 3 b= 6.0 7.0 8.0 9.0 13.0 14.0

Examples: Struct Derived Data Type

Create a data type that represents a particle and distribute an array of such particles to all processes.
C Language - Struct Derived Data Type Example

#include "mpi.h"
#include <stdio.h>
#define NELEM 25

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, source=0, dest, tag=1, i;

typedef struct {
float x, y, z;
float velocity;
int n, type;
} Particle;
Particle p[NELEM], particles[NELEM];
MPI_Datatype particletype, oldtypes[2];
int blockcounts[2];

/* MPI_Aint type used to be consistent with syntax of */

/* MPI_Type_extent routine */
MPI_Aint offsets[2], extent;

MPI_Status stat;

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

/* Setup description of the 4 MPI_FLOAT fields x, y, z, velocity */

offsets[0] = 0;
oldtypes[0] = MPI_FLOAT;
blockcounts[0] = 4;

/* Setup description of the 2 MPI_INT fields n, type */

/* Need to first figure offset by getting size of MPI_FLOAT */
MPI_Type_extent(MPI_FLOAT, &extent);
offsets[1] = 4 * extent;
oldtypes[1] = MPI_INT;
blockcounts[1] = 2;

/* Now define structured type and commit it */

MPI_Type_struct(2, blockcounts, offsets, oldtypes, &particletype);
MPI_Type_commit(&particletype);

/* Initialize the particle array and then send it to each task */

if (rank == 0) {
for (i=0; i<NELEM; i++) {
particles[i].x = i * 1.0;
particles[i].y = i * -1.0;
particles[i].z = i * 1.0;
particles[i].velocity = 0.25;
particles[i].n = i;
particles[i].type = i % 2;
}
for (i=0; i<numtasks; i++)
MPI_Send(particles, NELEM, particletype, i, tag, MPI_COMM_WORLD);
}

MPI_Recv(p, NELEM, particletype, source, tag, MPI_COMM_WORLD, &stat);

/* Print a sample of what was received */

printf("rank= %d %3.2f %3.2f %3.2f %3.2f %d %d\n", rank,p[3].x,
p[3].y,p[3].z,p[3].velocity,p[3].n,p[3].type);

MPI_Type_free(&particletype);
MPI_Finalize();
}

Fortran - Struct Derived Data Type Example

program struct
include 'mpif.h'

integer NELEM
parameter(NELEM=25)
integer numtasks, rank, source, dest, tag, i, ierr
integer stat(MPI_STATUS_SIZE)

type Particle
sequence
real*4 x, y, z, velocity
integer n, type
end type Particle

type (Particle) p(NELEM), particles(NELEM)

integer particletype, oldtypes(0:1), blockcounts(0:1),
& offsets(0:1), extent

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

C Setup description of the 4 MPI_REAL fields x, y, z, velocity

offsets(0) = 0
oldtypes(0) = MPI_REAL
blockcounts(0) = 4

C Setup description of the 2 MPI_INTEGER fields n, type

C Need to first figure offset by getting size of MPI_REAL
call MPI_TYPE_EXTENT(MPI_REAL, extent, ierr)
offsets(1) = 4 * extent
oldtypes(1) = MPI_INTEGER
blockcounts(1) = 2

C Now define structured type and commit it

call MPI_TYPE_STRUCT(2, blockcounts, offsets, oldtypes,
& particletype, ierr)
call MPI_TYPE_COMMIT(particletype, ierr)

C Initialize the particle array and then send it to each task

tag = 1
if (rank .eq. 0) then
do 10 i=0, NELEM-1
particles(i) = Particle ( 1.0*i, -1.0*i, 1.0*i,
& 0.25, i, mod(i,2) )
10 continue
do 20 i=0, numtasks-1
call MPI_SEND(particles, NELEM, particletype, i, tag,
& MPI_COMM_WORLD, ierr)
20 continue
endif

source = 0
call MPI_RECV(p, NELEM, particletype, source, tag,
& MPI_COMM_WORLD, stat, ierr)

print *, 'rank= ',rank,' p(3)= ',p(3)

call MPI_TYPE_FREE(particletype, ierr)
call MPI_FINALIZE(ierr)
end

Sample program output:

rank= 0 3.00 -3.00 3.00 0.25 3 1

rank= 2 3.00 -3.00 3.00 0.25 3 1
rank= 1 3.00 -3.00 3.00 0.25 3 1
rank= 3 3.00 -3.00 3.00 0.25 3 1

Group and Communicator Management Routines

Groups vs. Communicators:

 A group is an ordered set of processes. Each process in a group is associated with a unique integer rank. Rank values start at
zero and go to N-1, where N is the number of processes in the group. In MPI, a group is represented within system memory
as an object. It is accessible to the programmer only by a "handle". A group is always associated with a communicator object.
 A communicator encompasses a group of processes that may communicate with each other. All MPI messages must specify a
communicator. In the simplest sense, the communicator is an extra "tag" that must be included with MPI calls. Like groups,
communicators are represented within system memory as objects and are accessible to the programmer only by "handles".
For example, the handle for the communicator that comprises all tasks is MPI_COMM_WORLD.
 From the programmer's perspective, a group and a communicator are one. The group routines are primarily used to specify
which processes should be used to construct a communicator.

Primary Purposes of Group and Communicator Objects:

1. Allow you to organize tasks, based upon function, into task groups.
2. Enable Collective Communications operations across a subset of related tasks.
3. Provide basis for implementing user defined virtual topologies
4. Provide for safe communications

Programming Considerations and Restrictions:

 Groups/communicators are dynamic - they can be created and destroyed during program execution.
 Processes may be in more than one group/communicator. They will have a unique rank within each group/communicator.
 MPI provides over 40 routines related to groups, communicators, and virtual topologies.
 Typical usage:
1. Extract handle of global group from MPI_COMM_WORLD using MPI_Comm_group
2. Form new group as a subset of global group using MPI_Group_incl
3. Create new communicator for new group using MPI_Comm_create
4. Determine new rank in new communicator using MPI_Comm_rank
5. Conduct communications using any MPI message passing routine
6. When finished, free up new communicator and group (optional) using MPI_Comm_free and MPI_Group_free
Group and Communicator Management Routines
Create two different process groups for separate collective communications exchange. Requires creating new communicators
also.

C Language - Group and Communicator Routines Example

#include "mpi.h"
#include <stdio.h>
#define NPROCS 8

int main(argc,argv)
int argc;
char *argv[]; {
int rank, new_rank, sendbuf, recvbuf, numtasks,
ranks1[4]={0,1,2,3}, ranks2[4]={4,5,6,7};
MPI_Group orig_group, new_group;
MPI_Comm new_comm;

MPI_Init(&argc,&argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

if (numtasks != NPROCS) {
printf("Must specify MP_PROCS= %d. Terminating.\n",NPROCS);
MPI_Finalize();
exit(0);
}

sendbuf = rank;

/* Extract the original group handle */

MPI_Comm_group(MPI_COMM_WORLD, &orig_group);

/* Divide tasks into two distinct groups based upon rank */

if (rank < NPROCS/2) {
MPI_Group_incl(orig_group, NPROCS/2, ranks1, &new_group);
}
else {
MPI_Group_incl(orig_group, NPROCS/2, ranks2, &new_group);
}

/* Create new new communicator and then perform collective communications */

MPI_Comm_create(MPI_COMM_WORLD, new_group, &new_comm);
MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);

MPI_Group_rank (new_group, &new_rank);

printf("rank= %d newrank= %d recvbuf= %d\n",rank,new_rank,recvbuf);

MPI_Finalize();
}

Fortran - Group and Communicator Routines Example

program group
include 'mpif.h'

integer NPROCS
parameter(NPROCS=8)
integer rank, new_rank, sendbuf, recvbuf, numtasks
integer ranks1(4), ranks2(4), ierr
integer orig_group, new_group, new_comm
data ranks1 /0, 1, 2, 3/, ranks2 /4, 5, 6, 7/

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

if (numtasks .ne. NPROCS) then

print *, 'Must specify MPROCS= ',NPROCS,' Terminating.'
call MPI_FINALIZE(ierr)
stop
endif

sendbuf = rank

C Extract the original group handle

call MPI_COMM_GROUP(MPI_COMM_WORLD, orig_group, ierr)

C Divide tasks into two distinct groups based upon rank

if (rank .lt. NPROCS/2) then
call MPI_GROUP_INCL(orig_group, NPROCS/2, ranks1,
& new_group, ierr)
else
call MPI_GROUP_INCL(orig_group, NPROCS/2, ranks2,
& new_group, ierr)
endif

call MPI_COMM_CREATE(MPI_COMM_WORLD, new_group,

& new_comm, ierr)
call MPI_ALLREDUCE(sendbuf, recvbuf, 1, MPI_INTEGER,
& MPI_SUM, new_comm, ierr)

call MPI_GROUP_RANK(new_group, new_rank, ierr)

print *, 'rank= ',rank,' newrank= ',new_rank,' recvbuf= ',
& recvbuf

call MPI_FINALIZE(ierr)
end

Sample program output:

rank= 7 newrank= 3 recvbuf= 22

rank= 0 newrank= 0 recvbuf= 6
rank= 1 newrank= 1 recvbuf= 6
rank= 2 newrank= 2 recvbuf= 6
rank= 6 newrank= 2 recvbuf= 22
rank= 3 newrank= 3 recvbuf= 6
rank= 4 newrank= 0 recvbuf= 22
rank= 5 newrank= 1 recvbuf= 22

Virtual Topologies
What Are They?

 In terms of MPI, a virtual topology describes a mapping/ordering of MPI processes into a geometric "shape".
 The two main types of topologies supported by MPI are Cartesian (grid) and Graph.
 MPI topologies are virtual - there may be no relation between the physical structure of the parallel machine and the process
topology.
 Virtual topologies are built upon MPI communicators and groups.
 Must be "programmed" by the application developer.

Why Use Them?

 Convenience
o Virtual topologies may be useful for applications with specific communication patterns - patterns that match an MPI
topology structure.
o For example, a Cartesian topology might prove convenient for an application that requires 4-way nearest neighbor
communications for grid based data.
 Communication Efficiency
o Some hardware architectures may impose penalties for communications between successively distant "nodes".
o A particular implementation may optimize process mapping based upon the physical characteristics of a given
parallel machine.
o The mapping of processes into an MPI virtual topology is dependent upon the MPI implementation, and may be
totally ignored.

Example:

A simplified mapping of processes into a Cartesian virtual topology appears below:

Virtual Topology Routines

Create a 4 x 4 Cartesian topology from 16 processors and have each process exchange its rank with four neighbors.

C Language - Cartesian Virtual Topology Example

#include "mpi.h"
#include <stdio.h>
#define SIZE 16
#define UP 0
#define DOWN 1
#define LEFT 2
#define RIGHT 3

int main(argc,argv)
int argc;
char *argv[]; {
int numtasks, rank, source, dest, outbuf, i, tag=1,
inbuf[4]={MPI_PROC_NULL,MPI_PROC_NULL,MPI_PROC_NULL,MPI_PROC_NULL,},
nbrs[4], dims[2]={4,4},
periods[2]={0,0}, reorder=0, coords[2];

MPI_Request reqs[8];
MPI_Status stats[8];
MPI_Comm cartcomm;

MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

if (numtasks == SIZE) {
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, reorder, &cartcomm);
MPI_Comm_rank(cartcomm, &rank);
MPI_Cart_coords(cartcomm, rank, 2, coords);
MPI_Cart_shift(cartcomm, 0, 1, &nbrs[UP], &nbrs[DOWN]);
MPI_Cart_shift(cartcomm, 1, 1, &nbrs[LEFT], &nbrs[RIGHT]);

outbuf = rank;

for (i=0; i<4; i++) {

dest = nbrs[i];
source = nbrs[i];
MPI_Isend(&outbuf, 1, MPI_INT, dest, tag,
MPI_COMM_WORLD, &reqs[i]);
MPI_Irecv(&inbuf[i], 1, MPI_INT, source, tag,
MPI_COMM_WORLD, &reqs[i+4]);
}

MPI_Waitall(8, reqs, stats);

printf("rank= %d coords= %d %d neighbors(u,d,l,r)= %d %d %d %d\n",

rank,coords[0],coords[1],nbrs[UP],nbrs[DOWN],nbrs[LEFT],
nbrs[RIGHT]);
printf("rank= %d inbuf(u,d,l,r)= %d %d %d %d\n",
rank,inbuf[UP],inbuf[DOWN],inbuf[LEFT],inbuf[RIGHT]);
}
else
printf("Must specify %d processors. Terminating.\n",SIZE);

MPI_Finalize();
}

Fortran - Cartesian Virtual Topology Example

program cartesian
include 'mpif.h'

integer SIZE, UP, DOWN, LEFT, RIGHT

parameter(SIZE=16)
parameter(UP=1)
parameter(DOWN=2)
parameter(LEFT=3)
parameter(RIGHT=4)
integer numtasks, rank, source, dest, outbuf, i, tag, ierr,
& inbuf(4), nbrs(4), dims(2), coords(2),
& stats(MPI_STATUS_SIZE, 8), reqs(8), cartcomm,
& periods(2), reorder
data inbuf /MPI_PROC_NULL,MPI_PROC_NULL,MPI_PROC_NULL,
& MPI_PROC_NULL/, dims /4,4/, tag /1/,
& periods /0,0/, reorder /0/

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

if (numtasks .eq. SIZE) then

call MPI_CART_CREATE(MPI_COMM_WORLD, 2, dims, periods, reorder,
& cartcomm, ierr)
call MPI_COMM_RANK(cartcomm, rank, ierr)
call MPI_CART_COORDS(cartcomm, rank, 2, coords, ierr)
print *,'rank= ',rank,'coords= ',coords
call MPI_CART_SHIFT(cartcomm, 0, 1, nbrs(UP), nbrs(DOWN), ierr)
call MPI_CART_SHIFT(cartcomm, 1, 1, nbrs(LEFT), nbrs(RIGHT),
& ierr)

outbuf = rank
do i=1,4
dest = nbrs(i)
source = nbrs(i)
call MPI_ISEND(outbuf, 1, MPI_INTEGER, dest, tag,
& MPI_COMM_WORLD, reqs(i), ierr)
call MPI_IRECV(inbuf(i), 1, MPI_INTEGER, source, tag,
& MPI_COMM_WORLD, reqs(i+4), ierr)
enddo

call MPI_WAITALL(8, reqs, stats, ierr)

print *,'rank= ',rank,' coords= ',coords,

& ' neighbors(u,d,l,r)= ',nbrs
print *,'rank= ',rank,' ',
& ' inbuf(u,d,l,r)= ',inbuf

else
print *, 'Must specify',SIZE,' processors. Terminating.'
endif
call MPI_FINALIZE(ierr)
end

Sample program output: (partial)

rank= 0 coords= 0 0 neighbors(u,d,l,r)= -3 4 -3 1

rank= 0 inbuf(u,d,l,r)= -3 4 -3 1
rank= 1 coords= 0 1 neighbors(u,d,l,r)= -3 5 0 2
rank= 1 inbuf(u,d,l,r)= -3 5 0 2
rank= 2 coords= 0 2 neighbors(u,d,l,r)= -3 6 1 3
rank= 2 inbuf(u,d,l,r)= -3 6 1 3
. . . . .

rank= 14 coords= 3 2 neighbors(u,d,l,r)= 10 -3 13 15

rank= 14 inbuf(u,d,l,r)= 10 -3 13 15
rank= 15 coords= 3 3 neighbors(u,d,l,r)= 11 -3 14 -3
rank= 15 inbuf(u,d,l,r)= 11 -3 14 -3

A Brief Word on MPI-2

History:

 Intentionally, the MPI specification did not address several "difficult" issues. For reasons of expediency, these issues were
deferred to a second specification, called MPI-2.
 In March 1995, following the release of the initial MPI specification, the MPI Forum began discussing enhancements to the
MPI standard. Following this:
o December 1995: Supercomputing '95 conference - Birds of a Feather meeting to discuss proposed extensions to
MPI.
o November 1996: Supercomputing '96 conference - MPI-2 draft made available. Public comments solicited. Meeting
to discuss MPI-2 extensions.
o The draft presented at Supercomputing '96 shortly thereafter became the MPI-2 standard.
 Not all MPI libraries provide a full implementation of MPI-2.

Key Areas of New Functionality:

 Dynamic Processes - extensions that remove the static process model of MPI. Provides routines to create new processes.
 One-Sided Communications - provides routines for one directional communications. Include shared memory operations
(put/get) and remote accumulate operations.
 Extended Collective Operations - allows for non-blocking collective operations and application of collective operations to
inter-communicators
 External Interfaces - defines routines that allow developers to layer on top of MPI, such as for debuggers and profilers.
 Additional Language Bindings - describes C++ bindings and discusses Fortran-90 issues.
 Parallel I/O - describes MPI support for parallel I/O.

More Information on MPI-2:

 The Argonne National Lab MPI web pages have MPI-2 information. See the References section for links.

LLNL Specific Information and Recommendations

Although the MPI programming interface has been standardized, implementations will differ, as will the way MPI programs are
compiled and run on different platforms. A summary of LC's MPI environment is provided here, however users will definitely want to
consult the tutorials mentioned below for all of the details.

IBM AIX Clusters:

 IBM's MPI library is the only supported library on these platforms

 Full MPI-2 except for Dynamic Processes
 Thread-safe
 C, C++, Fortran77/90/95 are supported
 Compiling and running MPI programs, see: IBM POWER Systems Overview

Opteron Linux Clusters:

 The MVAPICH MPI library is the only supported library on these platforms. Open MPI and generic MPICH may also be
available, if really needed.
 This is an MPI-1 implementation, not MPI-2, but does include MPI-I/O support
 Not thread-safe
 C, C++, Fortran77/90/95 are supported
 Compiling and running MPI programs, see: Linux Clusters Overview

IBM BG/L Clusters:

 The IBM BG/L MPI library is the only supported library on these platforms.
 This is an IBM implementation based on MPICH2. Includes MPI-2 functionality minus Dynamic Processes.
 Thread-safe
 C, C++, Fortran77/90/95 are supported
 Compiling and running MPI programs, see: asc.llnl.gov/computing_resources/bluegenel/basics/

This completes the tutorial.

Please complete the online evaluation form - unless you are doing the exercise, in which case please
complete it at the end of the exercise.

Where would you like to go now?

 Exercise
 Agenda
 Back to the top

References and More Information

 Author: Blaise Barney, Livermore Computing.

 MPI web pages at Argonne National Laboratory
http://www-unix.mcs.anl.gov/mpi
 "Using MPI", Gropp, Lusk and Skjellum. MIT Press, 1994.
 Livermore Computing specific information:
o MPI at LLNL
computing.llnl.gov/mpi
o IBM POWER Systems Overview tutorial
computing.llnl.gov/tutorials/ibm_sp
o Linux Clusters Overview tutorial
computing.llnl.gov/tutorials/linux_clusters
 IBM Parallel Environment Manuals
http://www-1.ibm.com/servers/eserver/pseries/library/sp_books
 IBM Compiler Documentation:
Fortran: www-4.ibm.com/software/ad/fortran
C/C++: www-4.ibm.com/software/ad/caix
 "RS/6000 SP: Practical MPI Programming", Yukiya Aoyama and Jun Nakano, RS/6000 Technical Support Center, IBM
Japan. Available from IBM's Redbooks server at http://www.redbooks.ibm.com.
 "A User's Guide to MPI", Peter S. Pacheco. Department of Mathematics, University of San Francisco.

Appendix A: MPI-1 Routine Index

These man pages were derived from an IBM implementation of MPI and may differ from the man pages of other implementations.
Environment Management Routines
MPI_Abort MPI_Errhandler_create MPI_Errhandler_free
MPI_Errhandler_get MPI_Errhandler_set MPI_Error_class
MPI_Error_string MPI_Finalize MPI_Get_processor_name
MPI_Init MPI_Initialized MPI_Wtick
MPI_Wtime
Point-to-Point Communication Routines
MPI_Bsend MPI_Bsend_init MPI_Buffer_attach
MPI_Buffer_detach MPI_Cancel MPI_Get_count
MPI_Get_elements MPI_Ibsend MPI_Iprobe
MPI_Irecv MPI_Irsend MPI_Isend
MPI_Issend MPI_Probe MPI_Recv
MPI_Recv_init MPI_Request_free MPI_Rsend
MPI_Rsend_init MPI_Send MPI_Send_init
MPI_Sendrecv MPI_Sendrecv_replace MPI_Ssend
MPI_Ssend_init MPI_Start MPI_Startall
MPI_Test MPI_Test_cancelled MPI_Testall
MPI_Testany MPI_Testsome MPI_Wait
MPI_Waitall MPI_Waitany MPI_Waitsome
Collective Communication Routines
MPI_Allgather MPI_Allgatherv MPI_Allreduce
MPI_Alltoall MPI_Alltoallv MPI_Barrier
MPI_Bcast MPI_Gather MPI_Gatherv
MPI_Op_create MPI_Op_free MPI_Reduce
MPI_Reduce_scatter MPI_Scan MPI_Scatter
MPI_Scatterv
Process Group Routines
MPI_Group_compare MPI_Group_difference MPI_Group_excl
MPI_Group_free MPI_Group_incl MPI_Group_intersection
MPI_Group_range_excl MPI_Group_range_incl MPI_Group_rank
MPI_Group_size MPI_Group_translate_ranks MPI_Group_union
Communicators Routines
MPI_Comm_compare MPI_Comm_create MPI_Comm_dup
MPI_Comm_free MPI_Comm_group MPI_Comm_rank
MPI_Comm_remote_group MPI_Comm_remote_size MPI_Comm_size
MPI_Comm_split MPI_Comm_test_inter MPI_Intercomm_create
MPI_Intercomm_merge
Derived Types Routines
MPI_Type_commit MPI_Type_contiguous MPI_Type_count
MPI_Type_extent MPI_Type_free MPI_Type_hindexed
MPI_Type_hvector MPI_Type_indexed MPI_Type_lb
MPI_Type_size MPI_Type_struct MPI_Type_ub
MPI_Type_vector
Virtual Topology Routines
MPI_Cart_coords MPI_Cart_create MPI_Cart_get
MPI_Cart_map MPI_Cart_rank MPI_Cart_shift
MPI_Cart_sub MPI_Cartdim_get MPI_Dims_create
MPI_Graph_create MPI_Graph_get MPI_Graph_map
MPI_Graph_neighbors MPI_Graph_neighbors_count MPI_Graphdims_get
MPI_Topo_test
Miscellaneous Routines
MPI_Address MPI_Attr_delete MPI_Attr_get
MPI_Attr_put MPI_DUP_FN MPI_Keyval_create
MPI_Keyval_free MPI_NULL_COPY_FN MPI_NULL_DELETE_FN
MPI_Pack MPI_Pack_size MPI_Pcontrol
MPI_Unpack

Acura Integra (98-01) Electrical Wiring Diagram PDF
100% (5)
Acura Integra (98-01) Electrical Wiring Diagram PDF
621 pages
Flat Belly Diet Plan For Women
100% (5)
Flat Belly Diet Plan For Women
7 pages
How To Build and Operate A Modern Security Operations Center
No ratings yet
How To Build and Operate A Modern Security Operations Center
39 pages
A Case Study On Critical Thinking, Creativity, Innovation and Collaboration Practices in Classroom Using Digital Technology
No ratings yet
A Case Study On Critical Thinking, Creativity, Innovation and Collaboration Practices in Classroom Using Digital Technology
7 pages
Message Passing Interface (MPI)
No ratings yet
Message Passing Interface (MPI)
14 pages
C Programming For Beginners: The Simple Guide to Learning C Programming Language Fast!
From Everand
C Programming For Beginners: The Simple Guide to Learning C Programming Language Fast!
Tim Warren
5/5 (1)
Computing LLNL Gov
No ratings yet
Computing LLNL Gov
42 pages
Message Passing Interface (MPI) : Author: Blaise Barney, Lawrence Livermore National Laboratory
No ratings yet
Message Passing Interface (MPI) : Author: Blaise Barney, Lawrence Livermore National Laboratory
41 pages
Mpi
No ratings yet
Mpi
17 pages
An Introduction To MPI: Parallel Programming With The Message Passing Interface
No ratings yet
An Introduction To MPI: Parallel Programming With The Message Passing Interface
48 pages
Message Passing Interface (MPI)
No ratings yet
Message Passing Interface (MPI)
4 pages
Introduction To MPI Ranger Lonestar
No ratings yet
Introduction To MPI Ranger Lonestar
67 pages
Message Passing Interface (MPI)
No ratings yet
Message Passing Interface (MPI)
22 pages
NGK Mpi
No ratings yet
NGK Mpi
74 pages
Clase 4 - Tutorial de MPI
No ratings yet
Clase 4 - Tutorial de MPI
35 pages
Lec 9 DR Marwa Abbas
No ratings yet
Lec 9 DR Marwa Abbas
64 pages
5 MPIprogramming
No ratings yet
5 MPIprogramming
43 pages
CS-3006 - 5 - MPI Basics
No ratings yet
CS-3006 - 5 - MPI Basics
53 pages
Cs-3006 6 Mpi Basics 2
No ratings yet
Cs-3006 6 Mpi Basics 2
52 pages
Using MPI
No ratings yet
Using MPI
385 pages
2013 02 24 Ppopp Mpi Basic
No ratings yet
2013 02 24 Ppopp Mpi Basic
102 pages
Asg 03 - MPI
No ratings yet
Asg 03 - MPI
8 pages
Class03 - MPI, Part 1, Intermediate PDF
No ratings yet
Class03 - MPI, Part 1, Intermediate PDF
83 pages
6.3 Mpi: The Message Passing Interface: (Team Lib)
No ratings yet
6.3 Mpi: The Message Passing Interface: (Team Lib)
5 pages
Lecture 1
No ratings yet
Lecture 1
23 pages
MiniTool Partition Wizard Crack 12 Key Download Free 2025
No ratings yet
MiniTool Partition Wizard Crack 12 Key Download Free 2025
29 pages
Mpi Lecture
No ratings yet
Mpi Lecture
129 pages
Chapter 4 - Message-Passing Programming, MPI
No ratings yet
Chapter 4 - Message-Passing Programming, MPI
79 pages
03 MPIProgramStructure
No ratings yet
03 MPIProgramStructure
42 pages
The Message Passing Interface (MPI)
No ratings yet
The Message Passing Interface (MPI)
18 pages
Lecture 10-Introduction To MPI
No ratings yet
Lecture 10-Introduction To MPI
51 pages
Message Passing Interface (MPI) Programming
No ratings yet
Message Passing Interface (MPI) Programming
11 pages
BIg Data Anslysi
No ratings yet
BIg Data Anslysi
57 pages
02 Mpi 0
No ratings yet
02 Mpi 0
19 pages
02 Message Passing Interface Tutorial
No ratings yet
02 Message Passing Interface Tutorial
34 pages
Message Passing Interface: Parallel Processing Course University of Tehran
No ratings yet
Message Passing Interface: Parallel Processing Course University of Tehran
49 pages
In3200 Chap09
No ratings yet
In3200 Chap09
56 pages
Lecture 11 Distributed Memory Programming
No ratings yet
Lecture 11 Distributed Memory Programming
28 pages
Introduction To MPI Basics
No ratings yet
Introduction To MPI Basics
8 pages
MPI Tutorial Fall Break 2022
No ratings yet
MPI Tutorial Fall Break 2022
60 pages
Week 10
No ratings yet
Week 10
52 pages
Week09 L2
No ratings yet
Week09 L2
13 pages
Lec5 MPI
No ratings yet
Lec5 MPI
28 pages
Mpi Openmp Handouts
No ratings yet
Mpi Openmp Handouts
67 pages
Parallel and Distributed Computing Lab Digital Assignment - 5
No ratings yet
Parallel and Distributed Computing Lab Digital Assignment - 5
7 pages
Parallel Programming Using Basic MPI Presented by Timothy H. Kaiser, Ph.D. San Diego Supercomputer Center
No ratings yet
Parallel Programming Using Basic MPI Presented by Timothy H. Kaiser, Ph.D. San Diego Supercomputer Center
19 pages
Mpi Half Day Public
No ratings yet
Mpi Half Day Public
140 pages
High Performance Computing WS2022 Slides 3 Mpi
No ratings yet
High Performance Computing WS2022 Slides 3 Mpi
48 pages
SERC IntroMPI 2019-09-14 v0
No ratings yet
SERC IntroMPI 2019-09-14 v0
43 pages
Message Passing Interface (MPI) Programming
No ratings yet
Message Passing Interface (MPI) Programming
11 pages
Ms. V. Uma Maheswari, Assistant Lecturer, Department of Information Technology, National Institute of Technology, Surathkal
No ratings yet
Ms. V. Uma Maheswari, Assistant Lecturer, Department of Information Technology, National Institute of Technology, Surathkal
91 pages
Intro To MPI: Hpc-Support@duke - Edu
No ratings yet
Intro To MPI: Hpc-Support@duke - Edu
56 pages
Key Concepts in MPI Programming: Processes
No ratings yet
Key Concepts in MPI Programming: Processes
6 pages
Intro MPI
No ratings yet
Intro MPI
60 pages
MPI Tutorial: MPI (Message Passing Interface)
No ratings yet
MPI Tutorial: MPI (Message Passing Interface)
29 pages
MPI Tutorial: MPI (Message Passing Interface)
No ratings yet
MPI Tutorial: MPI (Message Passing Interface)
29 pages
Fortran Mpi Tutorial
No ratings yet
Fortran Mpi Tutorial
29 pages
Message Passing Interface (MPI) : Steve Lantz Center For Advanced Computing Cornell University
No ratings yet
Message Passing Interface (MPI) : Steve Lantz Center For Advanced Computing Cornell University
53 pages
ATPESC 2019 Track-2 1-7-30 830am Guo-Raffenetti-Thakur-MPI For Scalable Computing
No ratings yet
ATPESC 2019 Track-2 1-7-30 830am Guo-Raffenetti-Thakur-MPI For Scalable Computing
199 pages
HPC Lecture40
No ratings yet
HPC Lecture40
25 pages
High Performance Computing: Matthew Jacob Indian Institute of Science
No ratings yet
High Performance Computing: Matthew Jacob Indian Institute of Science
25 pages
The 1 Page Python Book
From Everand
The 1 Page Python Book
Barani Kumar
2/5 (1)
Parallel Programming with MPI: Definitive Reference for Developers and Engineers
From Everand
Parallel Programming with MPI: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
C Language for Beginners with Easy Tips of C Basic Programming
From Everand
C Language for Beginners with Easy Tips of C Basic Programming
Publicancy Ltd
No ratings yet
July 15th 2024 - Threat Hunt Updates
No ratings yet
July 15th 2024 - Threat Hunt Updates
6 pages
Higher Education Cyber Threat Forecast - November 2024
No ratings yet
Higher Education Cyber Threat Forecast - November 2024
13 pages
Converge Threat Intel Report 2024 DEC
No ratings yet
Converge Threat Intel Report 2024 DEC
16 pages
Saint Dymphna Prayers
100% (1)
Saint Dymphna Prayers
3 pages
Converge Threat Intel Report 2025 JANUARY
No ratings yet
Converge Threat Intel Report 2025 JANUARY
16 pages
Task 1: Modify The /etc/hosts File On The Client Machine: Dns Lab
No ratings yet
Task 1: Modify The /etc/hosts File On The Client Machine: Dns Lab
14 pages
Chapter 1 - Questions
No ratings yet
Chapter 1 - Questions
5 pages
GRE Playbook
100% (2)
GRE Playbook
102 pages
Introduction To Cad/Cam Using Mastercam
No ratings yet
Introduction To Cad/Cam Using Mastercam
29 pages
Sms Code 1872b
No ratings yet
Sms Code 1872b
4 pages
BI644913-01-EN Frenos Tralasdo PDF
No ratings yet
BI644913-01-EN Frenos Tralasdo PDF
5 pages
Communiques - DP - DP 302 Submission of Annual System Audit Report
No ratings yet
Communiques - DP - DP 302 Submission of Annual System Audit Report
13 pages
Critical Review Exercise
No ratings yet
Critical Review Exercise
9 pages
Instruction Manual: Downloaded From Manuals Search Engine
No ratings yet
Instruction Manual: Downloaded From Manuals Search Engine
104 pages
Aluminium and Related Products
No ratings yet
Aluminium and Related Products
3 pages
Scope of SW Metrics - II
No ratings yet
Scope of SW Metrics - II
11 pages
Bio-Cor-10 Manual 2021-04
100% (1)
Bio-Cor-10 Manual 2021-04
11 pages
Caeassistant Com Blog A Fast Guide To The Most Useful Tasks in Abaqus Visualizat
No ratings yet
Caeassistant Com Blog A Fast Guide To The Most Useful Tasks in Abaqus Visualizat
9 pages
SH5.0 - 6.0 - 8.0 - 10RT User Manual
No ratings yet
SH5.0 - 6.0 - 8.0 - 10RT User Manual
137 pages
Brain-Computer Interface (Bci) : Shaik - Khaja Mohiuddin 10MSE0050
No ratings yet
Brain-Computer Interface (Bci) : Shaik - Khaja Mohiuddin 10MSE0050
13 pages
The 1hdfte Control Fault - Land Cruiser Club
No ratings yet
The 1hdfte Control Fault - Land Cruiser Club
1 page
Quality Control Thermal Spray
No ratings yet
Quality Control Thermal Spray
33 pages
Diesel Engines For Alamarin-Jet 180 - 185 Rev3-6
No ratings yet
Diesel Engines For Alamarin-Jet 180 - 185 Rev3-6
4 pages
JFQ4A
No ratings yet
JFQ4A
4 pages
Fusion Company Profile
No ratings yet
Fusion Company Profile
8 pages
VCB Frekvenciavalto Katalogus
No ratings yet
VCB Frekvenciavalto Katalogus
16 pages
MB 0049
No ratings yet
MB 0049
8 pages
Enterprise Architect User Guide
No ratings yet
Enterprise Architect User Guide
2,023 pages
Radar Simrad R-5000 (Operator Manual)
No ratings yet
Radar Simrad R-5000 (Operator Manual)
82 pages
parle G Presentation
No ratings yet
parle G Presentation
23 pages
Instruction Manual Ac/Dc Digital Clamp Meter: 1. Features
No ratings yet
Instruction Manual Ac/Dc Digital Clamp Meter: 1. Features
1 page
How To Install A Satellite Dish For DSTV, MyTV, Etc. Yourself
No ratings yet
How To Install A Satellite Dish For DSTV, MyTV, Etc. Yourself
40 pages
Computer Ethics: A Necessarily Brief Introduction
No ratings yet
Computer Ethics: A Necessarily Brief Introduction
13 pages
Jossa Counciling
No ratings yet
Jossa Counciling
11 pages
Why Python Rocks For Research: Programming
No ratings yet
Why Python Rocks For Research: Programming
5 pages

Message Passing Interface (MPI) : EC3500: Introduction To Parallel Computing

Uploaded by

Message Passing Interface (MPI) : EC3500: Introduction To Parallel Computing

Uploaded by

Message Passing Interface (MPI)

 M P I = Message Passing Interface

History and Evolution:

Reasons for Using MPI:

 Required for all programs/routines which make MPI library calls.

C include file Fortran include file

Format of MPI Calls:

Format: rc = MPI_Xxxxx(parameter, ... )

Error code: Returned as "rc". MPI_SUCCESS if successful

Example: CALL MPI_BSEND(buf,count,type,dest,tag,comm,ierr)

Error code: Returned as "ierr" parameter. MPI_SUCCESS if successful

 C names are case sensitive; Fortran names are not.

General MPI Program Structure:

Environment Management Routines

Examples: Environment Management Routines

C Language - Environment Management Routines Example

/******* do some work *******/

Fortran - Environment Management Routines Example

integer numtasks, rank, ierr, rc

call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

C ****** do some work ******

Point to Point Communication Routines

Types of Point-to-Point Operations:

 System buffer space is:

Blocking vs. Non-blocking:

Order and Fairness:

Point to Point Communication Routines

MPI Message Passing Routine Arguments

Non-blocking sends MPI_Isend(buffer,count,type,dest,tag,comm,request)

Blocking receive MPI_Recv(buffer,count,type,source,tag,comm,status)

Non-blocking receive MPI_Irecv(buffer,count,type,source,tag,comm,request)

MPI_UNSIGNED unsigned int

MPI_LONG_DOUBLE long double

MPI_DOUBLE_COMPLEX double complex

Point to Point Communication Routines

Blocking Message Passing Routines

Examples: Blocking Message Passing Routines

C Language - Blocking Message Passing Routines Example

rc = MPI_Get_count(&Stat, MPI_CHAR, &count);

Fortran - Blocking Message Passing Routines Example

integer numtasks, rank, dest, source, count, tag, ierr

if (rank .eq. 0) then

else if (rank .eq. 1) then

call MPI_GET_COUNT(stat, MPI_CHARACTER, count, ierr)

Point to Point Communication Routines

Non-Blocking Message Passing Routines

Examples: Non-Blocking Message Passing Routines

C Language - Non-Blocking Message Passing Routines Example

MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);

MPI_Isend(&rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);

MPI_Waitall(4, reqs, stats);

Fortran - Non-Blocking Message Passing Routines Example

integer numtasks, rank, next, prev, buf(2), tag1, tag2, ierr

call MPI_IRECV(buf(1), 1, MPI_INTEGER, prev, tag1,

call MPI_ISEND(rank, 1, MPI_INTEGER, prev, tag2,

call MPI_WAITALL(4, reqs, stats, ierr);

Collective Communication Routines

Types of Collective Operations:

Programming Considerations and Restrictions:

 Collective operations are blocking.

Collective Communication Routines

MPI Reduction Operation C Data Types Fortran Data Type

Examples: Collective Communications

C Language - Collective Communications Example

printf("rank= %d Results: %f %f %f %f\n",rank,recvbuf[0],

C Fortran stores this array in column major order, so the

if (numtasks .eq. SIZE) then

Sample program output:

rank= 0 Results: 1.000000 2.000000 3.000000 4.000000

Derived Data Types

 As previously mentioned, MPI predefines its primitive data types:

C Data Types Fortran Data Types

Derived Data Type Routines

Examples: Contiguous Derived Data Type

C Language - Contiguous Derived Data Type Example

MPI_Type_contiguous(SIZE, MPI_FLOAT, &rowtype);

/* do some work */

C do some work