Lect8 Parallel System

Sorted data is easier to manipulate than randomly ordered data. There are two main types of sorting algorithms: internal sorting, which uses main memory, and external sorting, which uses auxiliary storage such as hard disks. Comparison-based sorting algorithms repeatedly compare and exchange elements, with a sequential lower bound of Θ(n log n) operations. Non-comparison-based sorting uses known properties of the elements, such as their binary representation, with a sequential lower bound of Θ(n) operations. Parallelizing sorting involves distributing the elements among processors, which raises issues around where the input and output sequences reside and how comparisons are performed between elements on different processors.

Uploaded by

sama akram

sorting

Sorted data are easier to manipulate than randomly ordered data.

Sorting Algorithm
1- Internal
The number of elements is small enough to fit into the processor's main memory.
2- External
Uses auxiliary storage (such as hard disks) because the number of elements to be sorted is too large to fit into memory.
Sorting Algorithm
1- Comparison based
Repeatedly compares pairs of elements and, if they are out of order, exchanges them. This operation is called compare-exchange.
The lower bound is Θ(n log n) on a sequential computer.

2- Non-comparison based
Uses certain known properties of the elements (such as their binary representation).
The lower bound is Θ(n) on a sequential computer.
sorting
*Parallelizing a sequential sorting algorithm involves distributing the elements to be sorted among the available processors. This process raises a number of issues that we must address in order to make the presentation of parallel sorting algorithms clearer.

*In sequential sorting algorithms, the input and the sorted sequences are stored in the processor's memory.

*In parallel sorting, there are two places where these sequences can reside. They may be stored on only one of the processors, or they may be distributed among the processors.

*A general method of distributing the sorted sequence is to enumerate the processors and use this enumeration to specify a global ordering for the sorted sequence.
sorting
Ex
If Pi comes before Pj in the enumeration, all elements stored in Pi will be smaller than those stored in Pj.

How comparisons are performed:

A sequential sorting algorithm can easily perform a compare-exchange on two elements because they are stored locally in the processor's memory. In parallel sorting algorithms this step is not so easy, as the elements reside on different processors.
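sorting
One element per processor
[Figure: a parallel compare-exchange between processors Pi and Pj. Step 1: Pi holds ai and Pj holds aj. Step 2: the processors exchange elements, so each holds both ai and aj. Step 3: Pi keeps min(ai, aj) and Pj keeps max(ai, aj).]
Each compare-exchange operation requires one comparison step and one communication step.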
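The compare-exchange between two processors can be sketched as a small sequential simulation. This is only an illustration: the function name compare_exchange is hypothetical, and the message exchange is modeled by plain assignments rather than real communication.

```python
def compare_exchange(a_i, a_j):
    """Simulate a compare-exchange between processor Pi (holding a_i)
    and processor Pj (holding a_j). Returns (new value at Pi, new value at Pj)."""
    # Communication step: each processor sends its element to the other,
    # so both processors now hold the pair (a_i, a_j).
    held_by_pi = (a_i, a_j)
    held_by_pj = (a_i, a_j)
    # Comparison step: Pi keeps the smaller element, Pj keeps the larger one.
    return min(held_by_pi), max(held_by_pj)


print(compare_exchange(7, 3))  # (3, 7)
```

Pi is assumed to come before Pj in the enumeration, which is why Pi keeps the minimum.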
sorting
More than one element per processor

We refer to this operation of comparing and splitting two sorted blocks as compare-split.
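A minimal sketch of a compare-split, assuming both blocks are already sorted and that the first processor keeps as many elements as it started with. The function name compare_split is hypothetical; the merge uses the standard library function heapq.merge.

```python
from heapq import merge


def compare_split(block_i, block_j):
    """Merge two sorted blocks and split the result.
    Pi keeps the smaller elements, Pj keeps the larger ones."""
    merged = list(merge(block_i, block_j))  # linear-time merge of sorted inputs
    k = len(block_i)
    return merged[:k], merged[k:]


low, high = compare_split([1, 6, 8, 11, 13], [2, 7, 9, 10, 12])
print(low)   # [1, 2, 6, 7, 8]
print(high)  # [9, 10, 11, 12, 13]
```

After the operation both blocks are still sorted, and every element at Pi is no larger than every element at Pj.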
sorting Network
The key component of these networks is a comparator. A
comparator is a device with two inputs x and y and two outputs x'
and y'. For an increasing comparator, x' = min{x, y} and y' = max{x,
y}; for a decreasing comparator x' = max{x, y} and y' = min{x, y}.
As the two elements enter the input wires of the comparator, they
are compared and, if necessary, exchanged before they go to the
output wires. A sorting network is usually made up of a series of
columns, and each column contains a number of comparators
connected in parallel. Each column of comparators performs a
permutation, and the output obtained from the final column is
sorted in increasing or decreasing order. The depth of a network is
the number of columns it contains. Since the speed of a
comparator is a technology-dependent constant, the speed of the
network is proportional to its depth.
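The comparator and the column structure can be sketched in a few lines. The names below are hypothetical, and the 4-input network is a small well-known example chosen for illustration, not a network taken from these slides.

```python
def increasing_comparator(x, y):
    """x' = min{x, y}, y' = max{x, y}."""
    return min(x, y), max(x, y)


def decreasing_comparator(x, y):
    """x' = max{x, y}, y' = min{x, y}."""
    return max(x, y), min(x, y)


def run_network(columns, wires):
    """Apply the columns in order. Each column is a list of
    (top wire, bottom wire, comparator) entries that touch disjoint wires,
    so the comparators of one column could operate in parallel."""
    wires = list(wires)
    for column in columns:
        for top, bottom, comparator in column:
            wires[top], wires[bottom] = comparator(wires[top], wires[bottom])
    return wires


inc = increasing_comparator
# Illustrative 4-input sorting network with depth 3 (three columns).
FOUR_INPUT_NETWORK = [
    [(0, 1, inc), (2, 3, inc)],
    [(0, 2, inc), (1, 3, inc)],
    [(1, 2, inc)],
]

print(run_network(FOUR_INPUT_NETWORK, [3, 1, 2, 0]))  # [0, 1, 2, 3]
print(len(FOUR_INPUT_NETWORK))                        # depth: 3
```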
Bitonic sorting
The key operations of a bitonic sorting network are:
1- converting the input sequence into a bitonic sequence.
2- rearranging a bitonic sequence into a sorted sequence.
Bitonic sorting
A bitonic sequence is a sequence of elements <a0, a1, a2, … , an-1> with the property that either
1- there exists an index i, 0 ≤ i ≤ n-1, such that <a0, a1, … , ai> is monotonically increasing and <ai+1, … , an-1> is monotonically decreasing, or
2- there exists a cyclic shift of indices so that (1) is satisfied.
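The definition can be checked directly by trying every cyclic shift. This is a simple quadratic-time sketch with hypothetical function names; it assumes that "monotonically" allows equal neighbors.

```python
def rises_then_falls(seq):
    """True if seq is non-decreasing up to some index and non-increasing after it."""
    n = len(seq)
    i = 0
    while i + 1 < n and seq[i] <= seq[i + 1]:
        i += 1
    while i + 1 < n and seq[i] >= seq[i + 1]:
        i += 1
    return i >= n - 1


def is_bitonic(seq):
    """True if some cyclic shift of seq rises and then falls."""
    seq = list(seq)
    shifts = range(max(1, len(seq)))
    return any(rises_then_falls(seq[k:] + seq[:k]) for k in shifts)


print(is_bitonic([1, 2, 4, 7, 6, 0]))  # True: rises, then falls
print(is_bitonic([8, 9, 2, 1, 0, 4]))  # True: a cyclic shift of 0, 4, 8, 9, 2, 1
print(is_bitonic([1, 3, 2, 4]))        # False
```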
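Bitonic sorting
-Let s = <a0, a1, … , an-1> be a bitonic sequence and form the two subsequences
s1 = <min{a0, an/2}, min{a1, an/2+1}, … , min{an/2-1, an-1}>
s2 = <max{a0, an/2}, max{a1, an/2+1}, … , max{an/2-1, an-1}>
-Every element of the first sequence is smaller than or equal to every element of the second sequence, and each of them is a bitonic sequence.
-Thus we reduce the problem of rearranging a bitonic sequence of size n to that of rearranging two smaller bitonic sequences and concatenating the results.
-This is known as a bitonic split.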
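A bitonic split can be sketched as follows, using the usual definition that pairs element i with element i + n/2 and sends the minimum to the first subsequence and the maximum to the second. The function name and the 16-element input are illustrative, and n is assumed to be even.

```python
def bitonic_split(seq):
    """Split a bitonic sequence of even length n into two bitonic
    sequences of length n/2 by pairing element i with element i + n/2."""
    half = len(seq) // 2
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    return s1, s2


s = [3, 5, 8, 9, 10, 12, 14, 20, 95, 90, 60, 40, 35, 23, 18, 0]
s1, s2 = bitonic_split(s)
print(s1)  # [3, 5, 8, 9, 10, 12, 14, 0]
print(s2)  # [95, 90, 60, 40, 35, 23, 18, 20]
```

In this example the largest element of s1 is 14 and the smallest element of s2 is 18, so the two halves can be rearranged independently.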
Bitonic sorting
-We can recursively obtain shorter bitonic sequences by applying a bitonic split to each bitonic subsequence until we obtain subsequences of size one.
-At that point the output is sorted in monotonically increasing order.
-The number of split steps required to rearrange a bitonic sequence of size n into a sorted sequence is log n.
-The network that uses this technique is called a Bitonic Merging Network.
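The recursive splitting can be sketched as a short sequential function. This is an illustration under two assumptions: the input is bitonic and its length is a power of two. The function name is hypothetical.

```python
def bitonic_merge(seq):
    """Rearrange a bitonic sequence (length a power of two) into
    increasing order by repeated bitonic splits."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # One bitonic split: minima go to the first half, maxima to the second.
    low = [min(seq[i], seq[i + half]) for i in range(half)]
    high = [max(seq[i], seq[i + half]) for i in range(half)]
    # Both halves are bitonic again, so they are merged recursively.
    return bitonic_merge(low) + bitonic_merge(high)


s = [3, 5, 8, 9, 10, 12, 14, 20, 95, 90, 60, 40, 35, 23, 18, 0]
print(bitonic_merge(s))
```

The recursion depth is log n, which matches the number of split steps stated above.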
Bitonic sorting

[Figure: merging a 16-element bitonic sequence through a series of log 16 bitonic splits.]
- A sequential program that implements this algorithm takes n log n time units.
Bitonic sorting
-A bitonic merging network contains log n columns. Each column contains n/2 comparators and performs one step of the bitonic merge.
-This network takes a bitonic sequence as input and outputs the sequence in sorted order.
-A bitonic merging network with n inputs is denoted by
⊕BM[n] if the output is sorted in increasing order
⊖BM[n] if the output is sorted in decreasing order
Bitonic sorting
The depth, d(n), of the network

The algorithm performs a total number of steps

d(n) = (1 + log n) (log n) / 2

= Θ(log² n)
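The step count can be checked with a small recursive sketch of bitonic sort that also counts comparator columns. It assumes the usual construction: sort the first half in increasing order and the second half in decreasing order, then merge the resulting bitonic sequence, so the depth follows d(n) = d(n/2) + log n. The function names are hypothetical, and the input length must be a power of two.

```python
def bitonic_merge(seq, ascending):
    """Merge a bitonic sequence; returns (sorted list, number of columns used)."""
    n = len(seq)
    if n <= 1:
        return list(seq), 0
    half = n // 2
    low = [min(seq[i], seq[i + half]) for i in range(half)]
    high = [max(seq[i], seq[i + half]) for i in range(half)]
    first, second = (low, high) if ascending else (high, low)
    left, depth = bitonic_merge(first, ascending)
    right, _ = bitonic_merge(second, ascending)  # same depth, runs in parallel
    return left + right, depth + 1


def bitonic_sort(seq, ascending=True):
    """Sort a sequence whose length is a power of two; returns (list, depth)."""
    n = len(seq)
    if n <= 1:
        return list(seq), 0
    half = n // 2
    left, depth = bitonic_sort(seq[:half], True)   # increasing half
    right, _ = bitonic_sort(seq[half:], False)     # decreasing half
    merged, merge_depth = bitonic_merge(left + right, ascending)
    return merged, depth + merge_depth


data = [10, 20, 5, 9, 3, 8, 12, 14, 90, 0, 60, 40, 23, 35, 95, 18]
result, depth = bitonic_sort(data)
print(result == sorted(data))  # True
print(depth)                   # 10, which equals (1 + log 16)(log 16)/2
```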
Mapping the bitonic sort algorithm onto hypercube and mesh networks

The bitonic sort network for sorting n elements contains log n stages, and stage i consists of i columns of n/2 comparators.
Each column of comparators performs compare-exchange operations on n wires.
On a parallel computer, the compare-exchange function is performed by a pair of processors.
1- one element per processor
Each of the n processors contains one element of the input sequence.
Each wire of the bitonic sorting network represents a distinct processor.
During each step of the algorithm, the compare-exchange operations performed by a column of comparators are performed by n/2 pairs of processors.
To obtain a good mapping, we must investigate the way that input wires are paired during each stage of bitonic sort.
In any step, the compare-exchange operation is performed between two wires only if their labels differ in exactly one bit.
In a network with four stages (for example, n = 16), wires whose labels differ in the least significant bit perform a compare-exchange in the last step of each of the four stages.
During the last three stages, wires whose labels differ in the second least significant bit perform a compare-exchange in the second-to-last step of each stage.
In general, wires whose labels differ in the ith least significant bit perform a compare-exchange (log n - i + 1) times.
This observation helps us efficiently map wires onto processors by mapping wires that perform compare-exchange operations more frequently to processors that are close to each other.
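The count can be reproduced by walking through the stages. The sketch below assumes the usual structure of the network, in which stage s has s steps that pair wires on bit s, then bit s-1, and so on down to bit 1 (bits numbered from the least significant, starting at 1). The function name is hypothetical.

```python
def compare_exchange_counts(log_n):
    """Count how often wires differing in the ith least significant bit
    are paired during a bitonic sort with log_n stages."""
    counts = {i: 0 for i in range(1, log_n + 1)}
    for stage in range(1, log_n + 1):
        # Stage s pairs wires on bit s first and on bit 1 last.
        for i in range(stage, 0, -1):
            counts[i] += 1
    return counts


print(compare_exchange_counts(4))  # {1: 4, 2: 3, 3: 2, 4: 1}
```

For log n = 4 each count equals log n - i + 1, so the least significant bit is used most often.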
* Hypercube
Mapping wires onto the processors of a hypercube-connected parallel computer is straightforward.
Compare-exchange operations take place between wires whose labels differ in only one bit.
In a hypercube, processors whose labels differ in only one bit are neighbors.
Thus an optimal mapping of input wires to hypercube processors is the one that maps an input wire with label l to the processor with label l, where l = 0, 1, 2, … , n-1.

Consider a hypercube with d dimensions (that is, p = 2^d processors).
In the final stage of bitonic sort, the input has been converted into a bitonic sequence.
During the first step of this stage, processors that differ only in the dth bit of the binary representation of their labels (the most significant bit) compare-exchange their elements.
Thus the compare-exchange operation takes place between processors along the dth dimension.
During the second step of this stage, the compare-exchange operation takes place among processors along the (d-1)st dimension.
In general, during the ith step of the final stage, processors communicate along the (d - (i-1))st dimension.
During each step of the algorithm, every processor performs a compare-exchange operation.
The algorithm performs a total of (1 + log n) (log n) / 2 such steps.
The parallel run time

= Θ(log² n)
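The hypercube formulation with one element per processor can be simulated sequentially. In this sketch the list index plays the role of the processor label, the partner of a processor is found by flipping one bit of its label, and the direction rule (the bit just above the current stage selects increasing or decreasing order) is the usual convention, assumed here rather than taken from the slides. The function name is hypothetical.

```python
def hypercube_bitonic_sort(values):
    """Simulate bitonic sort on a hypercube with one element per processor.
    Returns (sorted list, number of parallel compare-exchange steps)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("the number of elements must be a power of two")
    a = list(values)
    d = n.bit_length() - 1  # hypercube dimension, n = 2^d
    steps = 0
    for stage in range(1, d + 1):
        # Within a stage, communicate along dimensions stage, stage-1, ..., 1
        # (bit positions stage-1 down to 0).
        for bit in range(stage - 1, -1, -1):
            for label in range(n):
                partner = label ^ (1 << bit)  # neighbor along this dimension
                if partner > label:
                    ascending = ((label >> stage) & 1) == 0
                    # Swap when the pair is out of order for its direction.
                    if (a[label] > a[partner]) == ascending:
                        a[label], a[partner] = a[partner], a[label]
            steps += 1  # all pairs of one column work in parallel
    return a, steps


result, steps = hypercube_bitonic_sort([7, 3, 6, 0, 5, 1, 4, 2])
print(result)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(steps)   # 6 = (1 + log 8)(log 8)/2
```

Every exchange happens between labels that differ in exactly one bit, that is, between hypercube neighbors.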
* Mesh
The connectivity of a mesh is lower than that of a hypercube, so it is impossible to map wires to processors such that each compare-exchange operation occurs only between neighboring processors.
Instead, we map wires such that the most frequent compare-exchange operations occur between neighboring processors.
There are several ways to map the input wires onto the mesh processors.
Each processor is labeled by the wire that is mapped onto it.
In the row-major shuffled mapping, wires that differ in the ith least significant bit are mapped onto processors that are 2^⌊(i-1)/2⌋ communication links away.
* Mesh – row major shuffled mapping
The advantage of the row-major shuffled mapping is that processors that perform compare-exchange operations reside on square subsections of the mesh.
Processors that perform a compare-exchange during every stage of bitonic sort (those corresponding to wires that differ in the least significant bit) are neighbors.
Total amount of communication performed by each processor ≈ 7√n, which is Θ(√n).
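The distances in the row-major shuffled mapping can be checked with a short sketch. It assumes that the mapping interleaves the bits of the wire label (even bit positions give the column, odd bit positions give the row) on a √n x √n mesh with log n even. This layout is an assumption made for illustration, and all function names are hypothetical.

```python
def shuffled_position(label, log_n):
    """(row, column) of a wire under the assumed bit-interleaved layout."""
    row = col = 0
    for bit in range(log_n):
        value = (label >> bit) & 1
        if bit % 2 == 0:
            col |= value << (bit // 2)  # even bit positions build the column
        else:
            row |= value << (bit // 2)  # odd bit positions build the row
    return row, col


def link_distance(label, i, log_n):
    """Mesh links between a wire and the wire differing in the ith
    least significant bit (i starts at 1)."""
    partner = label ^ (1 << (i - 1))
    r1, c1 = shuffled_position(label, log_n)
    r2, c2 = shuffled_position(partner, log_n)
    return abs(r1 - r2) + abs(c1 - c2)


def total_communication(log_n):
    """Sum of the link distances over all steps of all stages."""
    return sum(2 ** ((j - 1) // 2)
               for i in range(1, log_n + 1)
               for j in range(1, i + 1))


print([link_distance(0, i, 4) for i in range(1, 5)])  # [1, 1, 2, 2]
print(total_communication(4))                         # 13
```

For large n the total grows roughly like 7√n, so each processor's communication cost is Θ(√n).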
A block of elements per processor

Let p be the number of processors and
n be the number of elements, n > p.
Each processor is assigned a block of n/p elements to be sorted.
The main difference between this formulation and the one that uses virtual processors is that the n/p elements assigned to each processor are initially sorted locally, using a fast sequential sorting algorithm.
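The block formulation can be simulated by replacing each compare-exchange of the bitonic sort with a compare-split on whole blocks. This is a sequential sketch under stated assumptions: p is a power of two, p divides n, and Python's built-in sorted stands in for both the local sort and the merge. The function name is hypothetical.

```python
def block_bitonic_sort(values, p):
    """Simulate bitonic sort with n/p elements on each of p processors."""
    n = len(values)
    if p <= 0 or p & (p - 1) or n % p:
        raise ValueError("p must be a power of two that divides n")
    size = n // p
    # Each processor first sorts its own block locally.
    blocks = [sorted(values[k * size:(k + 1) * size]) for k in range(p)]
    d = p.bit_length() - 1
    for stage in range(1, d + 1):
        for bit in range(stage - 1, -1, -1):
            for label in range(p):
                partner = label ^ (1 << bit)
                if partner > label:
                    # Compare-split: merge both blocks, then split in half.
                    merged = sorted(blocks[label] + blocks[partner])
                    low, high = merged[:size], merged[size:]
                    if ((label >> stage) & 1) == 0:
                        blocks[label], blocks[partner] = low, high
                    else:
                        blocks[label], blocks[partner] = high, low
    return [x for block in blocks for x in block]


data = [15, 3, 9, 0, 12, 7, 1, 14, 6, 11, 2, 13, 5, 8, 10, 4]
print(block_bitonic_sort(data, 4) == sorted(data))  # True
```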
A block of elements per processor
Hypercube

Mesh
Bubble sort and its variants
The sequential bubble sort algorithm compares and exchanges adjacent elements in the sequence to be sorted.

Bubble sort is difficult to parallelize.
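A minimal sketch of sequential bubble sort, for reference. The function name is illustrative.

```python
def bubble_sort(seq):
    """Sequential bubble sort: repeatedly compare-exchange adjacent elements."""
    a = list(seq)
    n = len(a)
    for end in range(n - 1, 0, -1):
        # After this pass the largest remaining element sits at index `end`.
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

Within one pass, each compare-exchange works on the result of the previous one, which is one way to see why the algorithm in this form offers little concurrency.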
