PRAM
(Parallel Random Access Machine)
Overview
⚫ What is a machine model?
⚫ Why do we need a model?
⚫ RAM
⚫ PRAM
◦ Steps in computation
◦ Write conflict
◦ Examples
A parallel Machine Model
⚫ What is a machine model?
◦ Describes a “machine”
◦ Puts a value to the operations on the machine
⚫ Why do we need a model?
◦ Makes it easy to reason algorithms
◦ Achieve complexity bounds
◦ Analyzes maximum parallelism
RAM (Random Access Machine)
⚫ Unbounded number of local memory cells
⚫ Each memory cell can hold an integer of
unbounded size
⚫ Instruction set included –simple operations,
data operations, comparator, branches
⚫ All operations take unit time
⚫ Time complexity = number of
instructions executed
⚫ Space complexity = number of memory
cells used
PRAM (Parallel Random Access Machine)
⚫ Definition:
◦ Is an abstract machine for designing the
algorithms applicable to parallel computers
◦ M’ is a system <M, X, Y, A> of infinitely many
● RAM’s M1, M2, …, each Mi is called a processor of
M’. All the processors are assumed to be identical.
Each has ability to recognize its own index i
● Input cells X(1), X(2),…,
● Output cells Y(1), Y(2),…,
● Shared memory cells A(1), A(2),…,
PRAM (Parallel RAM)
⚫ Unbounded collection of RAM
processors P0, P1, …,
⚫ Processors don’t have tape
⚫ Each processor has unbounded registers
⚫ Unbounded collection of share memory
cells
⚫ All processors can access all memory
cells in unit time
⚫ All communication via shared memory
PRAM (step in a computation)
⚫ Consist of 5 phases (carried in parallel by all
the processors) each processor:
◦ Reads a value from one of the cells x(1),…, x(N)
◦ Reads one of the shared memory cells A(1),
A(2),…
◦ Performs some internal computation
◦ May write into one of the output cells y(1), y(2),…
◦ May write into one of the shared memory cells
A(1), A(2),…
e.g. for all i, do A[i] = A[i-1] + 1;
Read A[i-1] , compute add 1, write A[i]
happened synchronously
PRAM (Parallel RAM)
⚫ Some subset of the processors can
remain idle
P0 P1 P2 PN
Shared Memory Cells
⚫ Two or more processors may read
simultaneously from the same cell
⚫ A write conflict occurs when two or
more processors try to write
simultaneously into the same cell
Share Memory Access Conflicts
⚫ PRAM are classified based on their
Read/Write abilities (realistic and useful)
◦ Exclusive Read(ER) : all processors can
simultaneously read from distinct memory
locations
◦ Exclusive Write(EW) : all processors can
simultaneously write to distinct memory
locations
◦ Concurrent Read(CR) : all processors can
simultaneously read from any memory location
◦ Concurrent Write(CW) : all processors can
write to any memory location
◦ EREW, CREW, CRCW
Concurrent Write (CW)
⚫ What value gets written finally?
◦ Priority CW: processors have priority based
on which value is decided, the highest priority
is allowed to complete WRITE
◦ Common CW: all processors are allowed to
complete WRITE iff all the values to be
written are equal.
◦ Arbitrary/Random CW: one randomly chosen
processor is allowed to complete WRITE
Strengths of PRAM
⚫ PRAM is attractive and important model for
designers of parallel algorithms Why?
◦ It is natural: the number of operations executed
per one cycle on p processors is at most p
◦ It is strong: any processor can read/write any
shared memory cell in unit time
◦ It is simple: it abstracts from any communication
or synchronization overhead, which makes the
complexity and correctness of PRAM algorithm
easier
◦ It can be used as a benchmark: If a problem has
no feasible/efficient solution on PRAM, it has no
feasible/efficient solution for any parallel machine
EXAMPLE - CW
⚫ P1(6) P2(2) P3(9) P4(3) P5(7)
M3 M7
6 7
EXAMPLE – CW ON EW MODEL
⚫ P1(6) P2(2) P3(9) P4(3) P5(7)
(3,1) (3,2) (7,3) (3,4) (7,5)
(3,1) (3,2) (3,4) (7,3) (7,5)
1 0 1 0 0
⚫ P1(6) P2(2) P3(9) P4(3) P5(7)
M3 M7
6 7
Parallel algorithms
1.Parallel reduction
2.Prefix sums
3.List ranking
4.Preorder traversal
5.Merging two sorted lists
6.Graph coloring
An initial example
⚫ How do you add N numbers residing in
memory location M[0, 1, …, N]
⚫ Serial Algorithm = O(N)
⚫ PRAM Algorithm using N processors P0,
P1, P2, …, PN ?
PRAM Algorithm
ParallelReduction(Parallel Addition)
P0 + P1 + P2 + P3 + Step 1
P0 + P2 + Step 2
P0 + Step 3
Parallel reduction
⚫ Begin
⚫ for all Pi where 0<=i<=n/2 - 1 do
⚫ For j=0 to log n -1 do
⚫ If i mod 2j =0 and 2i +2j <n then
⚫ A[2i]=A[2i]+A[2i +2j ]
⚫ End if
⚫ End for
⚫ End for
⚫ end
Example – PARALLEL REDUCTION
⚫ 4 3 8 2 9 1 0 5 6 3
⚫ 7 10 10 5 9
⚫ 17 15 9
⚫ 32 9
⚫ 41
PRAM Algorithm (Parallel Addition)
⚫ Log (n) steps = time needed
⚫ n / 2 processors needed
⚫ Applicable for other operations
◦ +, *, <, >, etc.
2.Prefix sums
⚫ Also called as parallel prefixes and scans
⚫ Packing elements-applications
Prefix sums algorithm
⚫ For all Pi where 1<=i<=n-1 do
⚫ For j=0 to log n -1 do
⚫ if i-2j >=0 then
A[i]=A[i]+A[i-2j ]
end if
end for
end for
end
⚫
Example:Prefix sums
⚫ 4 3 8 2 9 1 0 5 6 3
⚫ 4 7 11 10 11 10 1 5 11 9
⚫ 4 7 15 17 22 20 12 15 12 14
⚫ 4 7 15 17 26 27 27 32 34 34
⚫ 4 7 15 17 26 27 27 32 38 41
Application-Prefix sums
Packing characters
⚫ AbCDeFghI
APPLY PREFIX
⚫ 1 0 1 1 0 10 0 1 SUM
⚫1 1 2 3 3444 5
⚫A b CDeFgh I
Write the capital letter based on
position
⚫ ACDFI