

CS424 Parallel Computing Lab #5

1 MPI Collective Communications


MPI's collective communication functions allow all processes in a communicator group to perform an operation
together. These functions offer efficient communication patterns for parallel programming. The following are some
common collective functions; a short example using two of them is sketched after the list.

1. MPI_Bcast: One process (usually the root) has a piece of data. This data is sent to all other processes in
the communicator. All processes will have the same data after the broadcast.
2. MPI_Reduce: Each process has a local value (e.g., individual score). A mathematical operation (like sum,
max, min) is applied to all the values sent by each process. The result of the operation is stored on the
designated process (usually the root).
3. MPI_Allreduce: Similar to MPI_Reduce, but it distributes the reduced result to all processes.
4. MPI_Scatter: One process (usually the root) has a large dataset. The data is divided (scattered) into
smaller chunks and sent to all processes. Each process receives a portion of the original data.
5. MPI_Gather: Each process has a piece of data. All processes send their data to a designated process
(usually the root). The root process gathers all the data into a single collection.
6. MPI_Allgather: Gathers data from all processes and distributes the combined data to all processes.
7. MPI_Barrier: All processes in the communicator must reach this point before any can proceed. Useful
for synchronization, ensuring all processes have finished a specific task before moving on.
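A minimal sketch of two of these calls (the names and values below are illustrative and not part of the lab code): rank 0 broadcasts an integer with MPI_Bcast, and every process contributes a value that is summed on the root with MPI_Reduce.

/* sketch: broadcast a value from rank 0, then reduce one value per process */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) value = 100;                        /* only the root has the data    */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* now every rank holds 100      */

    int contribution = rank + 1;                       /* each process supplies a value */
    int sum = 0;
    MPI_Reduce(&contribution, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("broadcast value = %d, sum of contributions = %d\n", value, sum);

    MPI_Finalize();
    return 0;
}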

Benefits of Collective Communication:

• Simplifies code compared to sending and receiving messages individually between processes.
• Ensures efficient data exchange and synchronization within the communicator group.

Additional Notes:

• Not all collective functions require data exchange (e.g., MPI_Barrier).


• The designated "root" process can vary depending on the function.
• Unlike point-to-point sends and receives, collective functions do not use message tags; every process in the communicator must make the same collective call.
• MPI_Allreduce gathers data from all processes, applies the chosen operation, and then distributes the result,
while MPI_Allgather simply gathers the data and distributes it to all processes.

To optimize code and potentially enhance performance, it is worth exploring which MPI functions can be
substituted for others while preserving the core functionality of the program. One such substitution is sketched below.
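As a sketch of one possible substitution (illustrative code, not part of the lab files): MPI_Allreduce can be replaced by MPI_Reduce followed by MPI_Bcast, and both options below leave the same total on every process.

/* sketch: MPI_Allreduce versus MPI_Reduce + MPI_Bcast */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int local = rank + 1;

    /* Option A: one call that reduces and distributes the result */
    int total_a = 0;
    MPI_Allreduce(&local, &total_a, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    /* Option B: reduce to the root, then broadcast the result to everyone */
    int total_b = 0;
    MPI_Reduce(&local, &total_b, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Bcast(&total_b, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d: Allreduce = %d, Reduce+Bcast = %d\n", rank, total_a, total_b);

    MPI_Finalize();
    return 0;
}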

2 Examples
Code 1 distributes array values among processes using MPI_Scatter. Compile and run the program and study the
output of several runs.

Explanation:

1. Include headers: Necessary header files for MPI communication (mpi.h), standard input/output (stdio.h),
and memory allocation (stdlib.h).
2. MPI Initialization: MPI_Init initializes the MPI environment and prepares it for communication.
3. Get Process Information:
o MPI_Comm_rank: Retrieves the rank (unique identifier) of the current process within the
communicator group.
o MPI_Comm_size: Retrieves the total number of processes participating in the communicator group.
4. Global Array (Root Process Only):
o Allocates memory for the entire array data on the root process (rank == 0) using malloc.
o You can modify the loop to initialize the data array with your desired values.

5. Scattering Data:
o send_count: the number of elements to send to each process, computed by dividing the array size by
the number of processes.
o remainder: any elements left over when the division is uneven. A plain MPI_Scatter sends the same
count to every process, so leftover elements must be handled separately (for example with
MPI_Scatterv).
o MPI_Scatter: distributes elements of the data array on the root process to all processes, including
the root itself. Each process receives send_count elements into local_data, which is allocated to
store the received portion.
6. Printing Received Data: Each process prints the elements it received using a loop.
7. Memory Deallocation: Frees the allocated memory for local_data on all processes and data on the root
process.
8. Finalize MPI: MPI_Finalize cleans up the MPI environment and releases resources.
Running MPI programs on a single-core system
• The --oversubscribe parameter in mpiexec (or mpirun) allows you to launch more MPI processes on a
single node than the number of available cores or hardware threads.
• The behavior of --oversubscribe might vary depending on the specific MPI implementation and system
configuration.
• Some MPI libraries might offer alternative options for process placement that provide more control over
oversubscription.
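For example, with Open MPI, four processes can be launched on a machine with fewer than four cores (code1 here is a hypothetical executable name):

    mpiexec --oversubscribe -n 4 ./code1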

Code 1
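The original listing for Code 1 is not reproduced in this text, so the following is a minimal sketch reconstructed from the explanation above; the array size, initial values, and printed messages are assumptions, while the names data, local_data, send_count, and remainder follow the explanation.

/* sketch of Code 1: scatter an array from the root to all processes */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define ARRAY_SIZE 16                          /* assumed size; adjust as needed   */

int main(int argc, char *argv[]) {
    int rank, size;
    int *data = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* The global array exists only on the root process */
    if (rank == 0) {
        data = (int *)malloc(ARRAY_SIZE * sizeof(int));
        for (int i = 0; i < ARRAY_SIZE; i++)
            data[i] = i;                       /* sample values; modify as desired */
    }

    int send_count = ARRAY_SIZE / size;        /* elements sent to each process    */
    int remainder  = ARRAY_SIZE % size;        /* leftovers when division is uneven;
                                                  MPI_Scatterv would be needed to
                                                  distribute these as well         */
    (void)remainder;

    int *local_data = (int *)malloc(send_count * sizeof(int));
    MPI_Scatter(data, send_count, MPI_INT,
                local_data, send_count, MPI_INT,
                0, MPI_COMM_WORLD);

    /* Each process prints the chunk it received */
    printf("Rank %d received:", rank);
    for (int i = 0; i < send_count; i++)
        printf(" %d", local_data[i]);
    printf("\n");

    free(local_data);
    if (rank == 0)
        free(data);

    MPI_Finalize();
    return 0;
}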

3 Practice
1. In MPI, achieving a desired outcome often involves multiple communication pathways. List the potential
substitutions between MPI functions that can achieve similar communication patterns.

2. Recall the version of the code you wrote for mpiSumArray.c (Code folder; a sample run is shown below).
Answer the following questions.

a) Rewrite the program using MPI_Reduce.


b) Show, in four ways (four programs), how to calculate the total sum and send each process a copy of the result. Each
process must display the result it has received.

3. Make the necessary changes to the MPI program in Code 1 to perform the following.
a) Instead of printing the values of local_data, multiply the values of local_data by rank+1.
b) Gather the values of the arrays after multiplication and send them to process 1. Show the result.
c) Use MPI_Reduce to calculate the minimum value of the array (after gathering) and let process 2 display the
result.
