Lec5 Pramalgs 1

The document discusses different subclasses of PRAM models for parallel computing based on how concurrent access to shared memory is handled. It then presents algorithms for computing the parallel sum of elements stored in shared memory on different PRAM models and analyzes their time and processor complexities.

Lecture 5 PRAM Algorithms

Parallel Computing
Spring 2021

1
Four Subclasses of PRAM
• PRAM variants differ in how concurrent access to a single cell of the shared memory is resolved.
  - ER (Exclusive Read) and EW (Exclusive Write) PRAMs do not allow concurrent access to a shared-memory cell.
  - CR (Concurrent Read) and CW (Concurrent Write) PRAMs do allow it.
• Combining the rules for read and write access gives four PRAM variants:
  - EREW
    - Access to a memory location is exclusive. No concurrent read or write operations are allowed.
    - Weakest PRAM model
  - CREW
    - Multiple read accesses to a memory location are allowed. Multiple write accesses to a memory location are serialized.
  - ERCW
    - Multiple write accesses to a memory location are allowed. Multiple read accesses to a memory location are serialized.
    - Can simulate an EREW PRAM
  - CRCW
    - Allows multiple read and write accesses to a common memory location.
    - Most powerful PRAM model
    - Can simulate both EREW PRAM and CREW PRAM

2
Resolving concurrent write access
• (1) In the arbitrary PRAM, if multiple processors write into a single shared-memory cell, an arbitrary one of them succeeds in writing into this cell.
• (2) In the common PRAM, all processors writing into the same shared-memory cell must write the same value.
• (3) In the priority PRAM, the processor with the highest priority (the smallest- or largest-indexed processor, depending on the convention) succeeds in writing.
• (4) In the combining PRAM, if more than one processor writes into the same memory cell, the value stored is the result of applying a combining operator to the written values. With the sum operator the sum of the values is written; with the maximum operator the maximum is written.

Note: An algorithm designed for the common PRAM can be executed on a priority or arbitrary PRAM with similar complexity. The same holds for an arbitrary PRAM algorithm when run on a priority PRAM.
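
The four write-resolution rules can be illustrated with a small simulation of a single concurrent write to one cell. This is only a sketch: the function name `resolve_writes`, its parameters, and the choice that the smallest processor index has the highest priority are assumptions made for illustration, not part of the lecture.

```python
import random
from functools import reduce


def resolve_writes(writes, rule, combine=None, rng=None):
    """Return the value stored in one cell after a concurrent write.

    writes  : list of (processor_id, value) pairs aimed at the same cell
    rule    : "arbitrary", "common", "priority" or "combining"
    combine : binary operator used by the combining rule (hypothetical parameter)
    """
    if not writes:
        raise ValueError("no processor wrote to the cell")
    values = [value for _, value in writes]
    if rule == "arbitrary":
        # Any one of the competing writers may succeed.
        rng = rng or random.Random()
        return rng.choice(values)
    if rule == "common":
        # All writers must agree on the value.
        if any(value != values[0] for value in values):
            raise ValueError("common PRAM: all written values must agree")
        return values[0]
    if rule == "priority":
        # Assumption: the smallest processor index has the highest priority.
        return min(writes, key=lambda w: w[0])[1]
    if rule == "combining":
        # The stored value is the combination of all written values.
        return reduce(combine, values)
    raise ValueError(f"unknown rule: {rule}")


writes = [(2, 5), (0, 7), (1, 5)]
print(resolve_writes(writes, "priority"))                            # processor 0 wins
print(resolve_writes(writes, "combining", combine=lambda a, b: a + b))
```

With the sum operator, the combining rule is what makes the constant-time CRCW sum possible: all n processors write to the same cell in one step.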

3
PRAM Parallel Algorithm Assumptions
• Convention: In this subject we number processors either 0, 1, . . . , p − 1 or 1, 2, . . . , p, whichever is more convenient.
• The input to a particular problem resides in the cells of the shared memory. To simplify the exposition of our algorithms, we assume that a cell is wide enough (in bits or bytes) to accommodate a single instance of the input (e.g. a key or a floating-point number). If the input is of size n, the first n cells, numbered 0, . . . , n − 1, store the input.
• We assume that the number of processors of the PRAM is n or a polynomial function of the size n of the input. Processor indices are 0, 1, . . . , n − 1.

4
Parallel Sum: EREW PRAM solution 1
(Compute x0 + x1 + . . . + x(n−1))
Algorithm Parallel Sum.
t    M[0]       M[1]  M[2]   M[3]  M[4]       M[5]  M[6]   M[7]
t=0  x0         x1    x2     x3    x4         x5    x6     x7
t=1  x0+x1            x2+x3        x4+x5            x6+x7
t=2  x0+...+x3                     x4+...+x7
t=3  x0+...+x7
(Rows t ≥ 1 show only the cells written in step t.)

• This EREW PRAM algorithm consists of lg n steps. In step i, every processor j whose index is divisible by 2^i reads shared-memory cells j and j + 2^(i−1), sums the two values and stores the result in memory cell j. After lg n steps the sum resides in cell 0. Algorithm Parallel Sum has T = O(lg n), P = n, W = O(n lg n) and W2 = O(n).
Processors active in each step:
t=1  P0, P2, P4, P6
t=2  P0, P4
t=3  P0

5
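Parallel Sum: EREW PRAM solution 1
(Compute x0 + x1 + . . . + x(n−1))
// pid() returns the id of the processor issuing the call.
// C is the shared-memory array holding the input; n is a power of two.
begin Parallel Sum (n)
  i = 1; j = pid();
  // Processor j stays active while step i exists and 2^i divides j.
  // The bound i <= lg n stops processor 0, whose index is divisible by every 2^i.
  while (i <= lg n and j mod 2^i == 0)
    a = C[j];
    b = C[j + 2^(i-1)];
    C[j] = a + b;
    i = i + 1;
  end
end Parallel Sum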
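
The lockstep behaviour of solution 1 can be checked with a sequential simulation. This is a minimal sketch, assuming n is a power of two. The name `parallel_sum_v1` is hypothetical, and splitting each step into a read phase and a write phase is one way to model the synchronous PRAM step, not something the slides prescribe.

```python
def parallel_sum_v1(values):
    """Simulate EREW solution 1 and return (sum, active processors per step)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("n must be a power of two")
    cells = list(values)            # shared memory, one input value per cell
    steps = n.bit_length() - 1      # lg n
    trace = []
    for i in range(1, steps + 1):
        stride = 2 ** i
        # Processor j is active in step i when 2^i divides j.
        active = [j for j in range(n) if j % stride == 0]
        trace.append(active)
        # Read phase: each active processor reads cells j and j + 2^(i-1).
        reads = {j: (cells[j], cells[j + stride // 2]) for j in active}
        # Write phase: each active processor writes only its own cell j.
        for j, (a, b) in reads.items():
            cells[j] = a + b
    return cells[0], trace


total, trace = parallel_sum_v1([1, 2, 3, 4, 5, 6, 7, 8])
print(total)   # sum of the eight inputs
print(trace)   # processors active in steps 1, 2, 3
```

No two active processors touch the same cell in the same phase, which is why the exclusive-read, exclusive-write restriction is respected.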

6
Parallel Sum: PRAM solution
(Compute x0 + x1 + . . . + x(n−1))
• A sequential algorithm that solves this problem requires n − 1 additions.
• For a PRAM implementation, value xi is initially stored in shared-memory cell i. The sum x0 + x1 + . . . + x(n−1) is to be computed in T = lg n parallel steps. Without loss of generality, let n be a power of two.
• If a combining CRCW PRAM with the sum operator as its combining rule is used to solve this problem, the resulting algorithm is quite simple. In the first step processor i reads memory cell i, which stores xi. In the following step processor i writes the value it read into an agreed cell, say cell 0, and the concurrent writes are combined into the sum. The time is T = O(1), and the number of processors is P = O(n).
• A more interesting algorithm is the one presented below for the EREW PRAM. The algorithm consists of lg n steps. In step i, every processor j < n / 2^i reads shared-memory cells 2j and 2j + 1, sums the two values and stores the result in memory cell j. After lg n steps the sum resides in cell 0. Algorithm Parallel Sum has T = O(lg n), P = n, W = O(n lg n) and W2 = O(n).
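
The gap between W = O(n lg n) and W2 = O(n) can be checked by counting. The sketch below assumes that W is the processor-time product P · T and that W2 is the number of additions actually performed; the slides do not define the two symbols at this point, so this reading is an assumption. The function name `work_counts` is hypothetical.

```python
def work_counts(n):
    """Return (T, P*T, additions) for the tree-based sum on n = 2^k inputs."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    steps = n.bit_length() - 1      # T = lg n
    # Step i uses n / 2^i processors, each performing one addition:
    # n/2 + n/4 + ... + 1 = n - 1 additions in total.
    additions = sum(n // 2 ** i for i in range(1, steps + 1))
    return steps, n * steps, additions


print(work_counts(8))      # (3, 24, 7)
print(work_counts(1024))   # (10, 10240, 1023)
```

The number of additions equals the n − 1 of the sequential algorithm, while the processor-time product is larger by a factor of about lg n because most processors are idle in the later steps.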

7
Parallel Sum: EREW PRAM solution 2
An example

Algorithm Parallel Sum.

t    M[0]       M[1]       M[2]   M[3]   M[4]  M[5]  M[6]  M[7]
t=0  x0         x1         x2     x3     x4    x5    x6    x7
t=1  x0+x1      x2+x3      x4+x5  x6+x7
t=2  x0+...+x3  x4+...+x7
t=3  x0+...+x7
(Rows t ≥ 1 show only the cells written in step t.)

8
Parallel Sum: EREW PRAM solution 2
(Compute x0 + x1 + . . . + x(n−1))

// pid() returns the id of the processor issuing the call.
// C is the shared-memory array holding the input; n is a power of two.

begin Parallel Sum (n)
  i = 1; j = pid();
  // Processor j is active in step i only while j < n / 2^i.
  while (j < n / 2^i)
    a = C[2*j];
    b = C[2*j + 1];
    C[j] = a + b;
    i = i + 1;
  end
end Parallel Sum
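
Solution 2 can be simulated in the same way. This is a sketch under the same assumptions as before (n is a power of two, and each step is a read phase followed by a write phase); the name `parallel_sum_v2` is hypothetical. The snapshots it returns correspond to the rows t = 1, 2, 3 of the example.

```python
def parallel_sum_v2(values):
    """Simulate EREW solution 2 and return (sum, cells written after each step)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("n must be a power of two")
    cells = list(values)            # shared memory, one input value per cell
    snapshots = []
    i = 1
    while n // 2 ** i >= 1:
        active = range(n // 2 ** i)     # processors j < n / 2^i
        # Read phase: processor j reads cells 2j and 2j + 1.
        reads = [(cells[2 * j], cells[2 * j + 1]) for j in active]
        # Write phase: processor j writes cell j. Doing all reads first keeps
        # a write to cell j from disturbing another processor's read.
        for j, (a, b) in zip(active, reads):
            cells[j] = a + b
        snapshots.append(cells[: n // 2 ** i])
        i += 1
    return cells[0], snapshots


total, snapshots = parallel_sum_v2([1, 2, 3, 4, 5, 6, 7, 8])
print(total)       # sum of the eight inputs
print(snapshots)   # partial sums packed into the low-numbered cells
```

Unlike solution 1, the partial sums are compacted into cells 0 .. n/2^i − 1 after step i, so the active processors are always the lowest-numbered ones.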

9
Parallel Sum: MPI solution
Algorithm Parallel Sum.
Step 1:
P0        P1    P2        P3    P4        P5    P6        P7
x0    <=  x1    x2    <=  x3    x4    <=  x5    x6    <=  x7
x0+x1     x1    x2+x3     x3    x4+x5     x5    x6+x7     x7

Step 2:
P0             P2      P4             P6
x0+x1      <=  x2+x3   x4+x5      <=  x6+x7
x0+...+x3      x2+x3   x4+...+x7      x6+x7

Step 3:
P0             P4
x0+...+x3  <=  x4+...+x7
x0+...+x7
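
The communication pattern in the three steps above can be written down as a schedule of (receiver, sender) pairs, reading "a <= b" as "the process holding a receives b from its partner". The sketch below is a plain Python simulation, not MPI code: the names `reduction_schedule` and `simulate_reduction` are hypothetical, the number of processes is assumed to be a power of two, and in a real MPI program each pair would correspond to a matching send and receive.

```python
def reduction_schedule(p):
    """Return, for each step, the list of (receiver, sender) rank pairs."""
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")
    schedule = []
    stride = 1
    while stride < p:
        # In this step rank r receives from rank r + stride.
        schedule.append([(r, r + stride) for r in range(0, p, 2 * stride)])
        stride *= 2
    return schedule


def simulate_reduction(values):
    """Apply the schedule to one local value per process; rank 0 ends with the sum."""
    local = list(values)
    for step in reduction_schedule(len(local)):
        for receiver, sender in step:
            # The receiver adds the partial sum it gets from the sender.
            local[receiver] += local[sender]
    return local[0]


print(reduction_schedule(8))
print(simulate_reduction([1, 2, 3, 4, 5, 6, 7, 8]))
```

Senders keep their old partial sums, as in the diagram, and only rank 0 holds the full result after the last step.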
11
Parallel Sum
• Algorithm Parallel Sum can easily be extended to the case where n is not a power of two. Parallel Sum is our first example of a problem that has a trivial sequential solution but a more complex parallel one. Instead of the operator Sum, other operators such as Multiply, Maximum, Minimum or, in general, any associative operator could have been used. An associative operator ⊗ is one for which (a ⊗ b) ⊗ c = a ⊗ (b ⊗ c).
• Exercise 1: Can you improve Parallel Sum so that T remains the same, P = O(n / lg n), and W = O(n)? Explain.
• Exercise 2: What if we have p processors, where p < n? (You may assume that n is a multiple of p.)
• Exercise 3: Generalize the Parallel Sum algorithm to any associative operator.

12
End

Thank you!
