Design and Analysis of Algorithms
Chapter I
Introduction and Analysis of Algorithms
Aim
The aim of this chapter is to:
• define algorithm
Objectives
Learning outcome
At the end of this chapter, you will be able to:
An algorithmic problem is specified by describing the set of instances it must work on and the desired properties
that the output must have. All algorithms must satisfy the following criteria
• Input: Zero or more quantities that are externally supplied.
• Output: At least one quantity is produced.
• Definiteness: Each instruction is clear and unambiguous.
• Finiteness: If we trace out the instructions of an algorithm, then for all cases, the algorithm terminates after a
finite number of steps.
• Effectiveness: Every instruction must be very basic, so that it can be carried out, in principle, by a person using only pencil and paper.
1.2 The Classification of Algorithms
An algorithm is an effective method for solving a problem expressed as a finite sequence of steps. While there is
no universally accepted breakdown for the various types of algorithms, there are common classes that algorithms
are frequently agreed to belong to. Among these are:
• Dynamic programming algorithms: This class remembers older results and attempts to use these to speed the
process of finding new results.
• Greedy algorithms: Greedy algorithms attempt not only to find a solution but also find the ideal solution to any
given problem.
• Brute force algorithms: The brute force approach starts at some random point and iterates through every
possibility until it finds the solution.
• Randomised algorithms: This class includes any algorithm that uses a random number at any point during its
process.
• Branch and bound algorithms: Branch and bound algorithms form a tree of sub problems to the primary
problem, following each branch until it is either solved or lumped in with another branch.
• Simple recursive algorithms: This type of algorithm goes for a direct solution immediately and then backtracks
to find a simpler solution.
• Backtracking algorithms: Backtracking algorithms test for a solution; if a solution is found the algorithm has solved the problem, and if not it recurses once and tests again, continuing until a solution is found.
• Divide and conquer algorithms: Divide and conquer algorithm is similar to a branch and bound algorithm,
except that it uses the backtracking method of recurring in tandem with dividing a problem into sub
problems.
Mostly, people prefer to describe the ideas of an algorithm in English, moving to pseudo code to clarify sufficiently
tricky details of the algorithm.
Name of algorithm
Every algorithm is given an identifying name written in capital letters.
Introductory comment
The algorithm name is followed by a brief description of the tasks the algorithm performs and any assumptions
that have been made. The description gives the names and types of the variables used in the algorithm.
Steps
The actual algorithm is made up of a sequence of numbered steps, each beginning with a phrase enclosed in
square brackets which gives an abbreviated description of that step. Following this phrase is an ordered sequence
of statements which describe actions to be executed or tasks to be performed. The statements in each step are
executed in a left-to-right order.
Comments
An algorithm step may terminate with a comment enclosed in round parentheses intended to help the reader better
understand that step. Comments specify no action and are included only for clarity.
i. The overall structure of a procedure is written as:
Procedure Name
.........................
.........................
.........................
End Name
ii. Assignment is denoted as:
variable ← expression (← is the assignment operator)
iii. The symbol indicates comment.
iv. Arrow in both directions is used to show exchange.
For example, x ↔ y
v. If condition is shown as:
If condition
Then
statement(s);
Else
statement(s);
End if
vi. Do-loop is shown as:
For variable ← start to finish Do
.............................
.............................
End for
vii. Comment is shown as:
/* statement */
viii. While construct is used as:
While condition Do
..........................
..........................
..........................
End while
ix. Case construct is used as:
Case
......................
......................
......................
End case
x. Length [A] shows length of array.
xi. A [i...j] shows array from i to j.
xii. Arithmetic operators +, -, *, / and % are used.
xiii. Returning results is shown as:
Return (value(s))
Example:
To understand this method of expression using pseudo code, let us consider an algorithm which sums n numbers.
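The original worked example is not reproduced in this extract. As a minimal sketch of such a summation algorithm, the following Python function (the names sum_numbers, numbers and total are illustrative, not from the original) adds n numbers supplied in a list:

def sum_numbers(numbers):
    # Return the sum of the n values in the list 'numbers'.
    total = 0                 # running sum, initially zero
    for value in numbers:     # visit each of the n inputs once
        total = total + value
    return total

print(sum_numbers([3, 1, 4, 1, 5]))   # prints 14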
Operator precedence (highest to lowest):
Precedence   Operator
1            Parentheses
2            Arithmetic
3            Relational
4            Logical
Logical operators are written as:
Operator      Notation
negation      not
logical and   and
logical or    or
[Figure: The RAM model — an input tape, an accumulator, a program with a program counter, memory registers and an output tape]
The input tape consists of a sequence of squares, each of which can store an integer. Whenever one square is read from the tape, the tape head moves one square to the right.
[Figure: The input tape with its R/W head]
The output tape is also a sequence of squares, and an integer can be written in each square. Initially, the tape is empty. Output is written on the square under the tape head and, after writing, the tape head moves one square to the right. Overwriting the same square is not permitted.
[Figure: The output tape (o1, o2, ..., on) with its R/W head]
The memory consists of a sequence of registers, each of which is capable of holding an integer of arbitrary size.
The program for RAM is not stored in the memory and so is not subject to modification.
A program for RAM contains a sequence of labelled instructions resembling those found in assembly language
programs. All computations take place in the first register called accumulator. The program counter determines
the next instruction to be executed.
A RAM program defines a mapping from the input tape to the output tape. Since the program may not halt on all input tapes, the mapping may be undefined for certain inputs.
Fig. 1.5 Mapping by RAM program from input tape to the output tape
Fig. 1.6 Graphical representation of Worst Case, Average Case and Best Case complexity
Incremental Approach
Divide-and-conquer:
Example (counting inversions in a list):
• Divide: separate the list into two halves.
• Conquer: recursively count the inversions in each half.
• Combine: count the inversions where ai and aj lie in different halves.
(Top-down approach)
Solved examples 1
Design an algorithm for finding square of any number.
Solution:
SQUARE (n)
1. if n = 0 then return 0
2. else
3. return 2n + SQUARE (n − 1) − 1
For instance,
Square (0) = 0
Square (1) = 2 × 1 + Square (1 − 1) − 1 = 2 + Square (0) − 1 = 2 + 0 − 1 = 1
Square (2) = 2 × 2 + Square (2 − 1) − 1 = 4 + Square (1) − 1 = 4 + 1 − 1 = 4
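The recurrence above translates directly into code. A minimal Python sketch (the function name square is illustrative) is:

def square(n):
    # Uses the identity n^2 = 2n + (n - 1)^2 - 1, with square(0) = 0.
    if n == 0:
        return 0
    return 2 * n + square(n - 1) - 1

print(square(2))   # 4
print(square(5))   # 25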
Solved example 2:
Write an algorithm using Pseudo code for finding GCD of two positive integers A and B.
Solution:
GCD (A, B)
1. Input A and B
2. If (A = B) then
3. return "either is the GCD"
4. If A > B then
5. replace A by the difference of A and B
6. else
7. replace B by the difference of B and A
8. go to step 2
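A Python sketch of the same subtraction-based GCD procedure (variable names are illustrative):

def gcd(a, b):
    # Repeatedly replace the larger number by the difference of the two
    # until they are equal; that common value is the GCD.
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a

print(gcd(48, 18))   # 6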
Exercises 1
1. An algorithm is any well-defined _______ procedure that takes some value or set of values as input, and produces some value or set of values as output.
a. numerical
b. computational
c. randomised
d. bottom up
4. _______ attempts not only to find a solution, but to find the ideal solution to any given problem.
a. Branch and bound algorithms
b. Randomised algorithms
c. Dynamic programming algorithms
d. Greedy algorithms
6. An algorithm step may terminate with a comment enclosed in _______ parentheses intended to help the reader better understand that step.
a. round
b. square
c. circular
d. bracket
7. The _______ hides the implementation details and thus one can solely focus on the computational aspects of an algorithm.
a. algorithm
b. data structures
c. comments
d. pseudo code
10. The _______ consists of a sequence of squares, each of which can store an integer.
a. output tape
b. input tape
c. magnetic tape
d. video tape
Chapter II
Elementary Data Structures
Aim
The aim of this chapter is to:
• explain stacks with the help of its representation, standard operations, algorithms,
illustrations and applications
Objectives
The objectives of this chapter are to:
• explain the basic terminologies of stacks, queues, linked lists, trees and graphs
• explicate linked lists, singly linked lists, doubly linked lists with their representation and illustrations
• enlist various types of graphs with their representation and basic operations performed on graphs
Learning outcome
At the end of this chapter, you will be able to:
• explain the meaning of trees with their definition, representation, common operations and basic terminologies used
2.2.1 Arrays
An array is a consecutive group of memory locations, all of which have the same name and are of identical type.
In computer science, an array data structure or simply array is a data structure consisting of a collection of
elements (values or variables), each identified by at least one index. An array is stored, so that the position of
each element can be computed from its index type by a mathematical formula. Arrays are among the oldest and
most important data structures and are used by almost every program and are used to implement many other data
structures, such as lists and strings. They effectively exploit the addressing logic of computers. In most modern
computers and many external storage devices, the memory is a one-dimensional array of words, whose indices
are their addresses.
• Array declaration
Example:
float Grades[ ] = new float[7];
• Grades is an array of type float with size 7.
• Grades[0], Grades[1], …, Grades[6] are the elements of the Grades, each is of type float.
• 0, 1, 2,…,6 are the indices of the array which are also called as subscripts. (Note that the indices start at 0 and
not at 1)
• During array declaration we may also put the brackets before the variable name:
• i.e., float [ ]Grades = new float[7];
Initialising arrays
Arrays may be initialised in the following ways:
int n[ ] = { 2, 4, 6, 8, 10 };
which creates and initialises a five element array with specified values.
Note:
• You cannot assign data to arrays like:
List = {1, 2, 3, 4, 5};
• Array elements are indexed between zero and the size of the array minus one.
• Arrays can have any type.
Inserting elements
Adding an element to an array/list at an arbitrary position without overwriting the previous values requires that you move all elements "below" that position.
[Illustration: inserting the value 12 at position 4 of the array (1: 3, 2: 7, 3: 9, 4: 13, 5: 22). The elements 13 and 22 are first shifted down to positions 5 and 6, and 12 is then written into the freed position 4, giving (1: 3, 2: 7, 3: 9, 4: 12, 5: 13, 6: 22).]
Sample Pseudo-Code
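The sample pseudo-code itself does not appear in this extract. A minimal Python sketch of the insertion just described (the names insert_at, arr, pos and value are illustrative) is:

def insert_at(arr, pos, value):
    # Grow the list by one slot, then shift every element at index >= pos
    # one place to the right, working from the end so nothing is overwritten.
    arr.append(None)
    for i in range(len(arr) - 1, pos, -1):
        arr[i] = arr[i - 1]
    arr[pos] = value

data = [3, 7, 9, 13, 22]
insert_at(data, 3, 12)
print(data)   # [3, 7, 9, 12, 13, 22]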
Applications
• Arrays are used to implement mathematical vectors and matrices as well as other kinds of rectangular tables.
Many databases, small and large consist of (or include) one-dimensional arrays whose elements are records.
• Arrays are used to implement other data structures such as heaps, hash tables, deques, queues, stack, strings
etc.
• One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly
memory pool allocation. Historically, this has sometimes been the only way to allocate "dynamic memory"
portably.
• Arrays can be used to determine partial or complete control flow in programs, as a compact alternative to
(otherwise repetitive) multiple IF statements. They are known in this context as control tables and are used in
conjunction with a purpose built interpreter whose control flow is altered according to values contained in the
array. The array may contain subroutine pointers (or relative subroutine numbers that can be acted upon by
SWITCH statements) - that direct the path of the execution.
2.2.2 Stacks
In the most general form of a linear list, we are allowed to delete an element from and insert an element to any
position in the list. An important subclass of lists permits the insertion or deletion of an element to occur only at
one end. A linear list belonging to this subclass is called a stack.
Each node contains Data and a Link; the data field contains an item of the stack, and the link points to the node holding the next item.
Example:
[Figure: A stack holding the items A, B, C, D, E, with E at the stack top]
The mathematical model of a stack is LIFO (Last In First Out). Data placed in the stack is accessed through one
path. The next available data is the last data to be placed in the stack. In other words, the "newest" data is
withdrawn.
Standard operations
The standard operations performed on a stack are as follows:
Operation      Description
EMPTY (S)      Returns boolean value 'True' if stack 'S' is empty; returns 'False' otherwise.
TOP (S)        Views the next available data on stack 'S'. This operation is redundant since one can simply POP(S), view the data, then PUSH(S, Data).
Algorithm
• STACK-EMPTY (S)
If top [S] = 0
then return TRUE
else return FALSE
• POP (S)
If STACK-EMPTY (S)
then error "underflow"
else top [S] ← top [S] − 1
return S [top [S] + 1]
• PUSH (S, x)
top [S] ← top [S] + 1
S [top [S]] ← x
Illustration of PUSH
Before: S = 15, 6, 2, 9 with top [S] = 4.
After PUSH (S, 17): S = 15, 6, 2, 9, 17 with top [S] = 5.
Illustration of POP
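The POP illustration is not reproduced in this extract. As a substitute, the following Python sketch mirrors the array-based PUSH, POP and STACK-EMPTY operations given above, using a 0-based Python list in place of the 1-based array S (class and method names are illustrative):

class Stack:
    def __init__(self):
        self.items = []           # models S[1..top]

    def is_empty(self):           # STACK-EMPTY(S)
        return len(self.items) == 0

    def push(self, x):            # PUSH(S, x): top increases by one
        self.items.append(x)

    def pop(self):                # POP(S): signal underflow, else return the newest item
        if self.is_empty():
            raise IndexError("underflow")
        return self.items.pop()

s = Stack()
for key in [15, 6, 2, 9, 17]:
    s.push(key)
print(s.pop())   # 17 -- last in, first out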
Applications
Direct applications
• page-visited history in a web browser
• undo sequence in a text editor
• saving local variables when one function calls another, and this one calls another, and so on
Indirect applications
• auxiliary data structure for algorithms
• component of other data structures
2.2.3 Queues
A queue is a particular kind of collection in which the entities in the collection are kept in order and the principal
(or only) operations on the collection are the addition of entities to the rear terminal position and removal of
entities from the front terminal position. This makes the queue a First-In-First-Out (FIFO) data structure. In a
FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the
requirement that once an element is added, all elements that were added before it have to be removed before the new element can be removed. A queue is an example of a linear data structure.
[Figure: A queue — deletion (DEQUEUE) occurs at the front, insertion (ENQUEUE) at the rear]
Standard operations
The standard operations performed on a queue are as follows:
Operation Description
ENQUEUE (Q, Data) Put 'Data' in the rear path of queue 'Q'
DEQUEUE (Q) Withdraw next available data from front path of queue 'Q'
• DEQUEUE (Q)
x ← Q [head [Q]]
if head [Q] = length [Q]
then head [Q] ← 1
else head [Q] ← head [Q] + 1
return x
Illustration of a queue
(a) The array Q [1...12] holds the keys 15, 6, 9, 8, 4 in positions 7–11.
(b) After ENQUEUE (Q, 17), ENQUEUE (Q, 3) and ENQUEUE (Q, 5): positions 7–12 hold 15, 6, 9, 8, 4, 17 and positions 1–2 hold 3, 5.
(c) After DEQUEUE (Q), which returns 15: the head moves to position 8 (key 6).
In the above illustration, a queue is implemented using an array Q [1...12]. Queue elements appear only in the
lightly shaded positions.
• The queue has 5 elements, in locations Q [7...11].
• The configuration of the queue after the calls ENQUEUE (Q, 17), ENQUEUE (Q, 3) and ENQUEUE (Q, 5).
• The configuration of the queue after the call DEQUEUE (Q) returns the key value 15 formerly at the head of
the queue. The new head has key 6.
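A Python sketch of the circular-array queue described above follows. The ENQUEUE procedure is not listed in this extract, so the version below is simply the natural counterpart of the DEQUEUE pseudo code; for brevity it does not check for overflow or underflow, and all names are illustrative.

class CircularQueue:
    def __init__(self, capacity):
        self.q = [None] * capacity     # Q[1..length] modelled with 0-based indices
        self.head = 0                  # position of the next element to dequeue
        self.tail = 0                  # position of the next free slot

    def enqueue(self, x):              # insert at the rear, wrapping around
        self.q[self.tail] = x
        self.tail = (self.tail + 1) % len(self.q)

    def dequeue(self):                 # remove from the front, wrapping around
        x = self.q[self.head]
        self.head = (self.head + 1) % len(self.q)
        return x

q = CircularQueue(12)
for key in [15, 6, 9, 8, 4, 17]:
    q.enqueue(key)
print(q.dequeue())   # 15 -- first in, first out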
Applications
Direct applications
• waiting lines
• access to shared resources (e.g., printer)
• multiprogramming
Indirect applications
• auxiliary data structure for algorithms
• component of other data structures
2.2.4 Linked Lists
Linked lists are among the simplest and most common data structures; they provide an easy implementation for
several important abstract data structures including stacks and queues.
The principal benefit of a linked list over a conventional array is that the order of the linked items may be
different from the order that the data items are stored in memory or on disk. For that reason, linked lists allow
insertion and removal of nodes at any point in the list, with a constant number of operations.
Each record of a linked list is often called an element or node. The field of each node that contains the address of
the next node is usually called the next link or next pointer. The remaining fields are known as the data, information,
value etc. The head of a list is its first node and the tail is the list minus that node (or a pointer).
In the last node of a list, the link field often contains a null reference, a special value that is interpreted by
programs meaning "there is no such node". A less common convention is to make it point to the first node of the
list; in that case the list is said to be circular or circularly linked; otherwise it is said to be open or linear.
[Figure: A singly linked list with elements A, B, C, D joined by forward (next) links from first to last, and a doubly linked list over the same elements with both forward (next) and backward (previous) links; head [L] points to the first element.]
Algorithm
Searching a linked list
LIST-SEARCH (L, k): finds the first element with key k in list L by a simple linear search, returning a pointer to
this element. If no object with key k appears in the list, then NIL is returned. The algorithm is given as:
LIST-SEARCH (L, k)
x ← head [L]
while x ≠ NIL and key [x] ≠ k
do x ← next [x]
return x
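A Python sketch of LIST-SEARCH on a singly linked list (the Node class and field names are illustrative; NIL is represented by None):

class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next

def list_search(head, k):
    # Follow the next pointers until the key is found or the list ends.
    x = head
    while x is not None and x.key != k:
        x = x.next
    return x              # a node with key k, or None if there is no such node

head = Node(9, Node(16, Node(4, Node(1))))   # the list 9 -> 16 -> 4 -> 1
hit = list_search(head, 4)
print(hit.key if hit is not None else "not found")   # 4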
2.2.5 Trees
A tree is a set of related interconnected nodes in a hierarchical structure. It is a non-empty collection of vertices and
edges that satisfies certain requirements. The structure resembles branches of a “tree”, hence the name.
Tree Terminology
• A vertex (or node) is a simple object that can have a name and can carry other associated information.
• The first or top node in a tree is called the root node.
• An edge is a connection between two vertices
• A path in a tree is a list of distinct vertices in which successive vertices are connected by edges in the tree.
• Example: {a, b, d, i} is a path.
• The defining property of a tree is that there is precisely one path connecting any two nodes.
• A disjoint set of trees is called a forest.
• Nodes with no children are leaves or terminal nodes.
Fig. 2.9 Tree terminology
Root
This is the unique node in the tree to which further sub trees are attached.
Leaves
These are the terminal nodes of the tree. The nodes with degree 0 are always the leaves. Here, the leaves are nodes e, f, g, h and i.
Internal Nodes
The nodes other than the root node and the leaves are called the internal nodes. Here, b, c and d are internal nodes.
Parent Node
The node which has further sub-branches is called the parent node of those sub-branches. In fig. 2.9, node b is the parent node of d, e and f, and c is the parent node of g and h, whereas d, e and f are children of b, and g and h are children of c.
Rooted tree
• A Rooted tree is one where we designate one node as the root (i.e., the tree examples we have been looking
at so far are all rooted trees).
• In computer science, we normally reserve the term tree to refer to rooted trees. The more general structure is
a free tree.
• In a rooted tree, any node is the root of a sub tree consisting of it and the nodes below it.
• There is exactly one path between the root and each of the other nodes in the tree.
• Each node except the root has exactly one node above it in the tree (i.e., its parent), and we extend the family analogy by talking of children, siblings, or grandparents.
Example of trees
Types of Traversal
The following are the types of traversal:
• Preorder Traversal (visits and processes each node in a tree BEFORE visiting and processing its children)
visit the root node first and process
do pre-order traversal of left sub tree
do pre-order traversal of right sub tree
• Postorder Traversal (visits and processes each node in the tree AFTER visiting and processing its children)
do post-order traversal of left sub tree
do post-order traversal of right sub tree
visit the root node last and process
• Inorder Traversal (processes nodes in the tree in an ascending sorted order)
do in-order traversal of left sub tree
visit the root node and process
do in-order traversal of right sub tree
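A Python sketch of the three traversal orders on a binary tree (class and function names are illustrative):

class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(node, visit):
    if node is None:
        return
    visit(node.key)                 # process the root before its children
    preorder(node.left, visit)
    preorder(node.right, visit)

def inorder(node, visit):
    if node is None:
        return
    inorder(node.left, visit)
    visit(node.key)                 # process the root between its children
    inorder(node.right, visit)

def postorder(node, visit):
    if node is None:
        return
    postorder(node.left, visit)
    postorder(node.right, visit)
    visit(node.key)                 # process the root after its children

root = TreeNode('B', TreeNode('A'), TreeNode('C'))
inorder(root, print)                # prints A, B, C -- ascending order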
2.2.6 Graphs
A simple graph G consists of a set of vertices V and a set of edges E. The elements of E are defined as pairs of
elements of V, ek = (u,v) such that u is not equal to v and (u,v) an element of E implies that (v,u) is also an
element of E. (In other words (u,v) and (v,u) represent the same edge).
Graphs can be represented pictorially by nodes and lines as shown below:
Fig. 2.13 Types of graphs
• Multigraphs allow multiple edges between the same pair of vertices as well as edges from and to the same vertex.
• The edges of a directed graph are called arcs and have a direction as indicated by an arrow. Unlike graphs,
an arc (u,v) in a directed graph does not imply that the arc (v,u) is also in the directed graph.
• An acyclic graph is a graph with no cycles. That is, there is no path along edges in the graph (or along arcs
in a directed graph) that leads from a vertex back to the same vertex.
• Two vertices u, v in a graph are said to be adjacent if there is an edge e (or arc) connecting u to v. The vertices
u and v are called the endpoints of e.
• The degree of a vertex v is given as deg (v) and is the number of edges incident with v. That is, the number
of edges for which v is an endpoint.
• A complete graph is one in which there is an edge between every pair of vertices.
In a graph G = (V, E), an edge which is directed from one node to another is called a directed edge, while an edge
which has no specific direction is called an undirected edge. A graph in which every edge is directed is called a
directed graph or a digraph. A graph in which every edge is undirected is called an undirected graph. If some of
the edges are directed and some are undirected in a graph, then the graph is a mixed graph. All of these graphs
are as shown in fig. 2.13.
In addition to a parent pointer, each node also has a rank that for the time being, should be interpreted as the
height of the sub tree hanging from that node.
procedure makeset (x)
π(x) = x
rank (x) = 0
function find (x)
while x ≠ π(x): x = π(x)
return x
As can be expected, makeset is a constant-time operation. On the other hand, find follows parent pointers to the
root of the tree and therefore, takes time proportional to the height of the tree. The tree actually gets built via the
third operation union, and so we must make sure that this procedure keeps trees shallow.
Merging two sets is easy: make the root of one point to the root of the other. But we have a choice here. If the
representatives (roots) of the sets are rx and ry, do we make rx point to ry or the other way around? Since tree
height
is the main impediment to computational efficiency, a good strategy is to make the root of the shorter tree point
to the root of the taller tree. This way, the overall height increases only if the two trees which are being merged
are equally tall. Instead of explicitly computing heights of trees, we will use the rank numbers of their root nodes,
which is why this scheme is called union by rank.
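The union procedure is not listed in this extract; the following Python sketch of makeset, find and union by rank follows the description above (the dictionaries parent and rank play the roles of π and rank):

parent = {}
rank = {}

def makeset(x):
    parent[x] = x          # x starts as the root of its own one-node tree
    rank[x] = 0

def find(x):
    while x != parent[x]:  # follow parent pointers up to the root
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if rank[rx] > rank[ry]:        # root of the shorter tree points to the taller
        parent[ry] = rx
    elif rank[rx] < rank[ry]:
        parent[rx] = ry
    else:                          # equal ranks: height grows by one
        parent[ry] = rx
        rank[rx] += 1

for v in "abcd":
    makeset(v)
union("a", "b"); union("c", "d"); union("a", "c")
print(find("d") == find("b"))      # True -- all four elements are now in one set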
By design, the rank of a node is exactly the height of the subtree rooted at that node. This means, for instance,
that as you move up a path toward a root node, the rank values along the way are strictly increasing.
A root node with rank k is created by the merger of two trees with roots of rank k − 1.
Property 2: Any root node of rank k has at least 2^k nodes in its tree.
This extends to internal (nonroot) nodes as well: a node of rank k has at least 2^k descendants. After all, any internal node was once a root, and neither its rank nor its set of descendants has changed since then. Moreover, different rank-k nodes cannot have common descendants, since by Property 1 any element has at most one ancestor of rank k.
Property 3: If there are n elements overall, there can be at most n/2^k nodes of rank k.
This last observation implies, crucially, that the maximum rank is log n. Therefore, all the trees have height at most log n, and this is an upper bound on the running time of find and union.
We represent heaps in level order, going from left to right. The array corresponding to the heap above is [25, 13,
17, 5, 8, 3].
The root of the tree is A[1] and, given the index i of a node, the indices of its parent, left child and right child can be computed as:
PARENT (i)
return floor (i/2)
LEFT (i)
return 2i
RIGHT (i)
return 2i + 1
Let's try these out on a heap to make sure we believe they are correct. Take this heap,
Heap property
In a heap, for every node i other than the root, the value of a node is at most the value of its parent:
A [PARENT (i)] ≥ A[i]
Thus, the largest element in a heap is stored at the root.
Following is an example of Heap:
By the definition of a heap, all the tree levels are completely filled except possibly the lowest level, which is filled from the left up to a point. Clearly, a heap of height h has the minimum number of elements when it has just one node at the lowest level. The levels above the lowest level form a complete binary tree of height h − 1 with 2^h − 1 nodes. Hence the minimum number of nodes possible in a heap of height h is 2^h. Clearly, a heap of height h has the maximum number of elements when its lowest level is completely filled, in which case the heap is a complete binary tree of height h and hence has 2^(h+1) − 1 nodes.
The following is not a heap because, although it has the heap property, it is not a complete binary tree. Recall that to be complete, a binary tree has to fill up all of its levels, with the possible exception of the last one, which must be filled in from the left side.
Height of a node
We define the height of a node in a tree to be the number of edges on the longest simple downward path from that node to a leaf.
Height of a tree
It is defined as the number of edges on the longest simple downward path from the root to a leaf. Note that the height of a tree with n nodes is ⌊log n⌋, which is Θ(log n). This implies that an n-element heap has height ⌊log n⌋.
In order to show this, let the height of the n-element heap be h. From the bounds obtained on the maximum and minimum number of elements in a heap, we get
2^h ≤ n ≤ 2^(h+1) − 1 < 2^(h+1)
where n is the number of elements in the heap. Taking logarithms gives h ≤ log n < h + 1, and hence h = ⌊log n⌋.
Let's suppose we want to add a node with key 15 to the heap. First, we add the node to the tree at the next spot
available at the lowest level of the tree. This is to ensure that the tree remains complete.
Now we do the same thing again, comparing the new node to its parent. Since 14 < 15, we have to do another
swap:
• Outline of procedure heapify
Heapify picks the larger child key and compares it with the parent key. If the parent key is larger, then heapify quits; otherwise it swaps the parent key with the larger child key, so that the parent is now larger than its children. It is important to note that the swap may destroy the heap property of the subtree rooted at the larger child node. If this is the case, heapify calls itself again using the larger child node as the new root.
Heapify (A, i)
l ← left [i]
r ← right [i]
if l ≤ heap-size [A] and A[l] > A[i]
then largest ← l
else largest ← i
if r ≤ heap-size [A] and A[r] > A[largest]
then largest ← r
if largest ≠ i
then exchange A[i] ↔ A[largest]
Heapify (A, largest)
Analysis
If we put a value at the root that is less than every value in the left and right subtrees, then 'Heapify' will be called recursively until a leaf is reached. To make the recursive calls traverse the longest path to a leaf, choose values that make 'Heapify' always recurse on the left child. It follows the left branch when the left child is greater than or equal to the right child, so putting 0 at the root and 1 at all other nodes, for example, will accomplish this task. With such values 'Heapify' will be called h times, where h is the heap height, so its running time will be Θ(h) (since each call does Θ(1) work), which is Θ(log n). Since we have a case in which Heapify's running time is Θ(log n), its worst-case running time is Ω(log n).
Example of heapify
Suppose we have a complete binary tree somewhere whose sub trees are heaps. In the following complete binary
tree, the sub trees of 6 are heaps:
The Heapify procedure alters the heap, so that the tree rooted at 6's position is a heap. Here's how it works. First,
we look at the root of our tree and its two children.
We then determine which of the three nodes is the greatest. If it is the root, we are done, because we have a heap.
If not, we exchange the appropriate child with the root, and continue recursively down the tree. In this case, we
exchange 6 and 8, and continue.
Building a Heap
We can use the procedure 'Heapify' in a bottom-up fashion to convert an array A[1 . . n] into a heap. Since the elements in the subarray A[⌊n/2⌋ + 1 . . n] are all leaves, the procedure BUILD_HEAP goes through the remaining nodes of the tree and runs 'Heapify' on each one. The bottom-up order of processing nodes guarantees that the subtrees rooted at a node's children are heaps before 'Heapify' is run at that node.
BUILD_HEAP (A)
heap-size (A) ← length [A]
For i ← floor(length[A]/2) down to 1 do
Heapify (A, i)
We can build a heap from an unordered array in linear time.
HEAPSORT (A)
BUILD_HEAP (A)
for i ← length (A) down to 2 do
exchange A[1] ↔ A[i]
heap-size [A] ← heap-size [A] - 1
Heapify (A, 1)
The HEAPSORT procedure takes time O(n log n), since the call to BUILD_HEAP takes time O(n) and each of the n − 1 calls to Heapify takes time O(log n).
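A Python sketch of the three procedures above; it uses 0-based indexing, so the children of node i are 2i + 1 and 2i + 2 rather than 2i and 2i + 1:

def heapify(a, i, heap_size):
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < heap_size and a[l] > a[i] else i
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]    # float the larger key upwards
        heapify(a, largest, heap_size)

def build_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):   # internal nodes, bottom up
        heapify(a, i, len(a))

def heapsort(a):
    build_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]            # move the current maximum to the end
        heapify(a, 0, end)                     # restore the heap on the remaining prefix

data = [5, 13, 2, 25, 7, 17, 20, 8, 4]
heapsort(data)
print(data)   # [2, 4, 5, 7, 8, 13, 17, 20, 25]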
Exercises 2
1. The mathematical model of a stack is _______.
a. FIFO
b. FILO
c. LIFO
d. LILO
3. In a _______, each node contains, besides the next-node link, a second link field pointing to the previous node in the sequence.
a. singly linked list
b. triple linked list
c. linear linked list
d. doubly-linked list
5. In a graph G = (V, E), if some of the edges are directed and some are undirected, then the graph is a _______.
a. multigraph
b. directed graph
c. mixed graph
d. complete graph
6. A _______ is a tree where each node has exactly zero, one or two children.
a. ordered tree
b. binary tree
c. rooted tree
d. free tree
7. _______ is the unique node in the tree to which further sub trees are attached.
a. Root
b. Leaves
c. Parent node
d. Edge
8. Multiprogramming is one of the direct applications of _______.
a. stack
b. array
c. linked list
d. queue
Chapter III
Divide–And–Conquer Algorithms
Aim
Objectives
The objectives of this chapter are to:
Learning outcome
At the end of this chapter, you will be able to:
• differentiate between binary search, finding minimum and maximum, merge sort, quick sort, selection sort and Strassen's matrix multiplication
3.1 Introduction to Divide-and-Conquer Method
In computer science, divide and conquer is an important algorithm design paradigm. It works by recursively
breaking down a problem into two or more sub-problems of the same type, until these become simple enough to
be solved directly. The solutions to the sub-problems are then combined to give a solution to the original
problem. A divide and conquer algorithm is closely tied to a type of recurrence relation between functions of the
data in question, data is "divided" into smaller portions and the result calculated hence. The divide and conquer
algorithms consist of three steps:
• Break the problem into several sub-problems that are similar to the original problem but smaller in size,
• Solve the sub-problems recursively (successively and independently), and then
• Combine these solutions to the sub-problems to create a solution to the original problem.
The technique is named "divide and conquer" because a problem is conquered by dividing it into several smaller
problems. This technique yields elegant, simple and quite often very efficient algorithms. This technique is the basis
of efficient algorithms for all kinds of problems such as sorting (quick sort, merge sort), binary search, powering
a number, Fibonacci numbers, matrix multiplication and so on.
3.2 Binary Search
The most straightforward implementation is recursion, which recursively searches the sub-array dictated by the comparison, as given in the algorithm below:
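The recursive listing referred to here is not reproduced in this extract. A minimal Python sketch of a recursive binary search (0-based indices; the function name is illustrative) is:

def binary_search(a, x, low, high):
    if low > high:
        return -1                                  # not found
    mid = (low + high) // 2
    if a[mid] > x:
        return binary_search(a, x, low, mid - 1)   # search the left half
    elif a[mid] < x:
        return binary_search(a, x, mid + 1, high)  # search the right half
    return mid                                     # found

a = [10, 15, 40, 50, 55, 65, 75, 90, 95]
print(binary_search(a, 65, 0, len(a) - 1))         # 5 (0-based; the sixth element)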
The algorithm is invoked with initial low and high values of 1 and n. We can eliminate the recursion above and
convert this to an iterative implementation as given in algorithm below:
Algorithm: BinarySearch (a[1...n], n, x)
1. low = 1
2. high = n
3. while (low ≤ high)
4. mid = (low + high) / 2
5. if (a[mid] > x)
6. high = mid – 1
7. else if (a[mid] < x)
8. low = mid + 1
9. else
10. return mid // found
11. endif
12. endwhile
13. return – 1 // not found
Fig. 3.1 illustrates a trace of the algorithm finding the target value 65.
Binary search is a logarithmic algorithm and runs in O(log2 n) time. Specifically, at most 1 + log2 n iterations are needed to return an answer. In most cases it is considerably faster than a linear search. It can be implemented using recursion or iteration, as shown above.
Step 1: low = 1, high = 9. The array a[1...9] is 10 15 40 50 55 65 75 90 95.
mid = (low + high) / 2 = (1 + 9) / 2 = 5
a[mid] < x (55 < 65), so low = mid + 1 = 5 + 1 = 6, and high remains 9.
Step 2: low = 6, high = 9.
mid = (low + high) / 2 = (6 + 9) / 2 = 7
a[7] > x (75 > 65), so high = mid – 1 = 7 – 1 = 6.
Step 3: low = 6, high = 6, so mid = 6; a[6] = 65 = x and the search returns index 6.
3.3 Minimum and Maximum
Finding the minimum and maximum is an example of the divide and conquer algorithmic paradigm. To illustrate
this approach, consider the problem of finding both the minimum and maximum in an array of integers A [1... n] and
assume for simplicity that n is a power of 2. A straightforward algorithm might look like the one below. It returns
a pair (x, y) where x is the minimum and y is the maximum.
Clearly, the number of element comparisons performed by this method is 2n-2. However, using the divide and
conquer strategy, we can find both the minimum and maximum in only (3n/2) – 2 element comparisons. The idea
is very simple:
Divide the input array into two halves A[1..n/2] and A[(n/2) + 1..n], find the minimum and maximum in each
half and return the minimum of the two minima and the maximum of the two maxima. The divide-and-conquer
algorithm is given in Algorithm MINMAX as given below:
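The MINMAX listing itself is not included in this extract; a Python sketch of the divide-and-conquer procedure it describes is given below (the function name is illustrative):

def minmax(a, low, high):
    # Return (minimum, maximum) of a[low..high].
    if low == high:                               # one element: it is both min and max
        return a[low], a[low]
    if high == low + 1:                           # two elements: a single comparison
        return (a[low], a[high]) if a[low] <= a[high] else (a[high], a[low])
    mid = (low + high) // 2
    min1, max1 = minmax(a, low, mid)              # solve the left half
    min2, max2 = minmax(a, mid + 1, high)         # solve the right half
    return min(min1, min2), max(max1, max2)       # combine with two more comparisons

print(minmax([8, 3, 11, 6, 9, 14, 2, 10], 0, 7))  # (2, 14)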
Algorithm: To sort the entire sequence A [1 ... n], make the initial call to the procedure MERGE-SORT (A, 1, n).
Fig. 3.3 shows the bottom-up view of the above procedure for n = 8.
46
The pseudo code of the MERGE procedure is as follows:
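The MERGE-SORT and MERGE listings are not reproduced in this extract. A Python sketch of the procedure they describe is given below; it merges into temporary lists rather than using sentinels, and the names are illustrative:

def merge_sort(a, p, r):
    if p < r:
        q = (p + r) // 2          # divide around the midpoint
        merge_sort(a, p, q)       # conquer the left half
        merge_sort(a, q + 1, r)   # conquer the right half
        merge(a, p, q, r)         # combine the two sorted halves

def merge(a, p, q, r):
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):     # repeatedly take the smaller front element
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)   # [1, 2, 2, 3, 4, 5, 6, 7]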
The basic version of quick sort algorithm was invented by C. A. R. Hoare in 1960. It is used on the principle of
divide-and-conquer. Quick sort is an algorithm of choice in many situations because it is not difficult to
implement, it is a good "general purpose" sort and it consumes relatively fewer resources during execution.
• It is in-place since it uses only a small auxiliary stack.
• It requires only O(n log n) time, on average, to sort n items.
• It has an extremely short inner loop.
• This algorithm has been subjected to a thorough mathematical analysis; a very precise statement can be made about performance issues.
Quick sort works by partitioning a given array A[p .. r] into two non-empty subarrays A[p .. q] and A[q+1 .. r] such that every key in A[p .. q] is less than or equal to every key in A[q+1 .. r]. Then the two subarrays are sorted by recursive calls to Quick sort. The exact position of the partition depends on the given array, and the index q is computed as a part of the partitioning procedure.
QUICKSORT (A, p, r)
1. If p < r then
2. q ← Partition (A, p, r)
3. Recursive call to Quick Sort (A, p, q)
4. Recursive call to Quick Sort (A, q + 1, r)
Note that to sort the entire array, the initial call Quick Sort (A, 1, length[A]) is made. As a first step, Quick Sort
chooses as pivot one of the items in the array to be sorted. Then the array is partitioned on either side of the pivot.
Elements that are less than or equal to pivot will move toward the left and elements that are greater than or equal
to pivot will move toward the right.
Partitioning the array
Partitioning procedure rearranges the subarrays in-place as given below:
PARTITION (A, p, r)
1. x ← A[p]
2. i ← p-1
3. j ← r+1
4. while TRUE do
5. Repeat j ← j-1
6. until A[j] ≤ x
7. Repeat i ← i+1
8. until A[i] ≥ x
9. if i < j
10. then exchange A[i] ↔ A[j]
11. else return j
Partition selects the first key, A[p], as a pivot key about which the array will be partitioned:
Keys ≤ A[p] will be moved towards the left.
Keys ≥ A[p] will be moved towards the right.
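A Python sketch of quick sort with the PARTITION procedure above (0-based indices; the pivot is the first key of the subarray, and names are illustrative):

def partition(a, p, r):
    x = a[p]                       # pivot: first key of the subarray
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:            # scan from the right for a key <= pivot
            j -= 1
        i += 1
        while a[i] < x:            # scan from the left for a key >= pivot
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j               # split point: a[p..j] <= a[j+1..r]

def quicksort(a, p, r):
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)

data = [13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21]
quicksort(data, 0, len(data) - 1)
print(data)   # [2, 4, 5, 6, 7, 8, 9, 11, 12, 13, 19, 21]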
Selection sort
SELECTION_SORT (A)
for i ← 1 to n-1 do
min j ← i;
min x ← A[i]
for j ← i + 1 to n do
If A[j] < min x then
min j ← j
min x ← A[j]
A[min j] ← A [i]
A[i] ← min x
Selection sort is among the simplest of sorting techniques and it works very well for small files. Furthermore, despite its evidently naïve approach, selection sort has a quite important application: because each item is actually moved at most once, selection sort is a method of choice for sorting files with very large objects (records) and small keys.
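A Python sketch of SELECTION_SORT as listed above (0-based indexing; names illustrative):

def selection_sort(a):
    n = len(a)
    for i in range(n - 1):
        min_j = i                          # index of the smallest key seen so far
        for j in range(i + 1, n):
            if a[j] < a[min_j]:
                min_j = j
        a[i], a[min_j] = a[min_j], a[i]    # one swap per pass: at most n - 1 moves in total

data = [29, 10, 14, 37, 13]
selection_sort(data)
print(data)   # [10, 13, 14, 29, 37]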
Strassen's matrix multiplication is based on a recursive divide and conquer scheme. Given n × n matrices A and B, we wish to calculate C = AB. To see how this algorithm works, we first divide each of the matrices into four blocks (decomposing C = AB into four blocks). If the matrices A and B are not of type 2^n × 2^n, we fill the missing rows and columns with zeroes. The blocks of C are then given by:
C11 = A11 B11+ A12 B21
C12 = A11 B12+ A12 B22
C21 = A21 B11+ A22 B21
C22 = A21 B12+ A22 B22
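The four formulas above are the ordinary block multiplication, which still uses eight block products; Strassen's contribution is to compute C from only seven products. Those seven products are not shown in this extract, so the following Python sketch of the scheme on plain 2 × 2 matrices is included for reference (for larger matrices the same seven products are formed recursively on the blocks A11, ..., B22):

def strassen_2x2(A, B):
    # A and B are 2 x 2 matrices given as [[a11, a12], [a21, a22]].
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Seven multiplications instead of eight; these combinations reproduce
    # the blocks C11, C12, C21, C22 given by the formulas above.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]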
Exercises 3
1. Divide-and-conquer algorithms consist of _______ steps.
a. two
b. three
c. four
d. one
5. _______ is among the simplest of sorting techniques and it works very well for small files.
a. Merge sort
b. Quick sort
c. Selection sort
d. Bubble sort
7. The time complexity of calculating the matrix product C = AB, where A, B and C are n × n matrices, using traditional matrix multiplication is _______.
a. O (n2)
b. O (n3)
c. O (n log n3)
d. O (n log n)
8. _______ works by recursively breaking down a problem into two or more sub-problems of the same type, until these become simple enough to be solved directly.
a. Dynamic programming
b. Greedy algorithms
c. Backtracking algorithms
d. Divide-and-conquer technique
10. Finding the minimum and maximum is an example of the _______ algorithmic paradigm.
a. greedy
b. dynamic programming
c. divide and conquer
d. backtracking
Chapter IV
Greedy Algorithms
Aim
The aim of this chapter is to:
• define feasibility
Objectives
The objectives of this chapter are to:
• explain the minimum spanning trees with the help of Prim's algorithm and Kruskal's algorithm
Learning outcome
At the end of this chapter, you will be able to:
• understand the minimum spanning trees and derive single source shortest path
4.1 Introduction
Greedy algorithms are simple and straightforward. They are short-sighted in their approach in the sense that they
take decisions on the basis of information at hand without worrying about the effect these decisions may have in
the future. They are easy to invent, easy to implement and most of the time quite efficient. Many problems cannot
be solved correctly by greedy approach. Greedy algorithms are used to solve optimisation problems.
Greedy algorithm is a technique for solving problems with the following properties:
• The problem is an optimisation problem; to find the solutions that minimises or maximises some value (cost/
profit).
• The solution can be constructed in a sequence of steps/choices.
• For each choice point:
the choice must be feasible
the choice looks as good as or better than alternatives
the choice cannot be revoked
"The one with maximum benefit from multiple choices is selected" is the basic idea of greedy method. A greedy
method arrives at a solution by making a sequence of choices, each of which simply looks the best at the
moment. The below given algorithm describes the details of greedy method:
For an optimisation problem, what remains is called a sub-problem after making one or several steps of greedy
choice. For problem or sub-problem P, let S be the partial solution and L be the list of candidate choices at the
current moment. To order or prioritise the choices, some evaluation criteria are used to express the feasible value.
According to the above algorithm, the candidate choice with the highest feasible value is selected and the partial
solution is updated accordingly. This procedure is repeated step by step until a resulting complete solution is
obtained.
The representation of the above algorithm can be illustrated by a search tree as shown in the figure below. Each
node in the search tree corresponds to a partial solution and a line between two nodes represents the decision to
add a candidate choice to the existing partial solution. Consequently, leaf nodes at the end of tree correspond to
complete solutions. The black circle at level 1 denotes an initial partial solution. At level 2, there are five candidate
choices for the current partial solution, which are denoted by five nodes. In order to select the best node, the promise of each node should be determined. After some evaluation function has been employed, the second node, with the highest benefit (the circle in grey at level 2), is selected. Then, the partial solution and sub-problem are updated
accordingly.
Definitions of feasibility
• A feasible set (of candidates) is promising if it can be extended to produce not merely a solution, but an
optimal solution to the problem. In particular, the empty set is always promising because an optimal solution
always exists.
• A greedy strategy usually progresses in a top-down fashion, making one greedy choice after another,
reducing each problem to a smaller one.
• The "greedy-choice property" and "optimal substructure" are two ingredients in the problem that lend to a
greedy strategy.
• It says that a globally optimal solution can be arrived at by making a locally optimal choice.
4.2 Fractional Knapsack Problem
In this version of the problem, the items can be broken into smaller pieces, so that we may pack only a fraction xi of item i, where 0 ≤ xi ≤ 1. Item i then contributes xi wi to the total weight in the knapsack and xi pi to the total profit. The knapsack capacity is c.
The fractional knapsack problem can be stated as follows:
Maximise the sum of xi pi over all items i = 1, ..., n
subject to the constraint that the sum of xi wi over all items is at most c.
It is clear that an optimal solution must fill the knapsack exactly, for otherwise we could add a fraction of one of the remaining items and increase the total profit. Thus, in an optimal solution the sum of xi wi equals c.
The fractional knapsack problem has the greedy-choice property and its algorithm is as given below. One way to
prove the correctness of the below algorithm is to prove the greedy choice property and optimal substructure
property. It consists of two steps. First, prove that there exists an optimal solution that begins with the greedy
choice. Secondly, prove that if A is an optimal solution to the original problem S, then A-a is also an optimal
solution to the problem S-s where a is the item included as in the greedy choice and S-s is the sub-problem after
the first greedy choice has been made. The second part is easy to prove: removing the greedy choice leaves a smaller instance of the same problem, and the greedy choice itself cannot be improved upon, since the item with the maximal ratio p′/w′ yields at least as much profit per unit of weight as any other item.
Let the ratio p′/w′ be maximal. This supposition implies that p′/w′ ≥ p/w for any pair (p, w), so p′ w/w′ ≥ p for any (p, w). Now, suppose a solution does not contain the full w′ weight of the best-ratio item. Then replacing an amount of any other item w with more of w′ will improve the value of the solution.
Time complexity: If the items are already sorted into decreasing order of pi/wi, then the while-loop takes a time in
O(n). Therefore, the total time including the sort is in O(n log n).
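The greedy algorithm itself is not listed in this extract. A Python sketch of the standard greedy procedure it describes (take items in non-increasing profit/weight ratio, splitting the last item that no longer fits; names are illustrative) is:

def fractional_knapsack(profits, weights, capacity):
    # Consider item indices in order of decreasing profit/weight ratio.
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)                 # fraction of each item taken
    remaining = capacity
    for i in order:
        if weights[i] <= remaining:          # the whole item fits
            x[i] = 1.0
            remaining -= weights[i]
        else:                                # take a fraction and stop
            x[i] = remaining / weights[i]
            break
    total_profit = sum(x[i] * profits[i] for i in range(len(profits)))
    return x, total_profit

x, best = fractional_knapsack([60, 100, 120], [10, 20, 30], 50)
print(x, best)   # [1.0, 1.0, 0.666...] and a total profit of 240.0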
Problem formulation
Let G = (V, E, W) be a weighted connected undirected graph. Find a tree T that contains all the vertices in G and minimises the sum of the weights of its edges (u, v), that is, w(T) = Σ (u, v) ∈ T w(u, v).
• Tree that contains every vertex of a connected graph is called spanning tree. The problem of constructing a
minimum spanning tree is computing a spanning tree T with smallest total weight.
• A tree is a connected graph with no cycles. A spanning tree is a subgraph of G which has the same set of
vertices of G and is a tree.
• A minimum spanning tree of a weighted graph G is the spanning tree of G whose edges sum to minimum
weight.
• A graph may have many spanning trees. For instance, the complete graph on four vertices has sixteen spanning trees, as shown in the figure below:
Here, we discuss Kruskal's algorithm and Prim's algorithm which are classic applications of the greedy strategy.
Kruskal's algorithm grows the MST in clusters by considering (greedily) edges in order of their weights and
Prim's algorithm on the other hand, grows the MST from single vertex (root).
Example:
Consider the graph shown in fig.4.3 (a). The vertices to the left of the dashed line belong to X and those to its
right belong to Y. First, as shown in fig. 4.3(a), X = {1} and Y = {2, 3... 6}. In fig. 4.3(b), vertex 2 is moved from Y to
X since edge (1, 2) has the least cost among all the edges incident to vertex 1. This is indicated by moving the
dashed line so that 1 and 2 are now to its left. As shown in fig. 4.3(b), the candidate vertices to be moved from Y
to X are 3 and 4. Since edge (1, 3) is of least cost among all edges with one end in X and one end in Y, 3 is
moved from Y to X. Next, from the two candidate vertices 4 and 5 in fig. 4.3(c), 4 is moved since the edge (3, 4)
has the least cost. Finally, vertices 6 and then 5 are moved from Y to X as shown in fig. 4.3(e). Each time a
vertex y is moved from Y to X, its corresponding edge is included in T, the set of edges of the minimum spanning
tree. The resulting minimum spanning tree is shown in fig. 4.3(f).
Example:
Fig. 4.4 An example of Kruskal's algorithm
Consider the weighted graph shown in fig. 4.4(a). As shown in fig. 4.4(b), the first edge that is added is (1, 2) since
it is of minimum cost. Next, as shown in fig. 4.4 (c)-(e), edges(1, 3), (4, 6) and then (5, 6) are included in T in
this order. Next, as shown in fig. 4.4(f), the edge (2, 3) creates a cycle and hence is discarded. For the same
reason, as shown in fig. 4.4(g), edge (4, 5) is also discarded. Finally, edge (3, 4) is included, which results in the
minimum spanning tree (V, T) as shown in fig. 4.4(h).
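A Python sketch of Kruskal's algorithm, reusing the union-find idea from Chapter II (the edge-list format of (weight, u, v) tuples and all names are assumptions made for this sketch):

def kruskal(vertices, edges):
    # edges: a list of (weight, u, v) tuples; returns the edges of an MST.
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):            # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                         # accept an edge only if it joins two trees
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

edges = [(1, 'a', 'b'), (4, 'a', 'c'), (3, 'b', 'c'), (2, 'c', 'd'), (5, 'b', 'd')]
print(kruskal(['a', 'b', 'c', 'd'], edges))
# [('a', 'b', 1), ('c', 'd', 2), ('b', 'c', 3)] -- total weight 6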
A path P from a vertex u to a vertex v is a sequence of vertices v1, v2, ..., vn in which each consecutive pair is joined by an edge, where u = v1 and v = vn.
The cost (or length or weight) of the path P is the sum of the weights of the edges in the sequence. The shortest-path weight from a vertex u ∈ V to a vertex v ∈ V in the weighted graph is the minimum cost over all paths from u to v. If there exists no such path from vertex u to vertex v, then the weight of the shortest path is ∞. We can also define this as:
In the above algorithm, we will assume that the input graph is represented by adjacency lists and the length of edge (x, y) is stored in the vertex for y in the adjacency list for x. For instance, the directed graph is represented as shown in fig. 4.5. We will also assume that the length of each edge in E is nonnegative. The two sets X and Y will be implemented as boolean vectors X[1..n] and Y[1..n]. Initially, X[1] = 1 and Y[1] = 0, and for all i, 2 ≤ i ≤ n, X[i] = 0 and Y[i] = 1. Thus, the operation X ← X ∪ {y} is implemented by setting X[y] to 1 and the operation Y ← Y − {y} is implemented by setting Y[y] to 0.
Fig. 4.5 Directed graph representation for the shortest path algorithm
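The algorithm listing referred to above is not reproduced in this extract. A Python sketch of Dijkstra's procedure using the same label-updating idea follows; adjacency is given as a dictionary {vertex: {neighbour: edge length}}, INF plays the role of ∞, and the small example graph is illustrative (it is not the graph of fig. 4.6):

INF = float('inf')

def dijkstra(adj, source):
    # adj: {u: {v: length of edge (u, v)}}; all edge lengths assumed nonnegative.
    dist = {v: INF for v in adj}
    dist[source] = 0
    finished = {v: False for v in adj}       # vertices moved into the set X
    for _ in range(len(adj)):
        # Pick the unfinished vertex with the smallest label (a linear scan).
        y = min((v for v in adj if not finished[v]), key=lambda v: dist[v])
        finished[y] = True
        for v, length in adj[y].items():     # update the labels of y's neighbours
            if dist[y] + length < dist[v]:
                dist[v] = dist[y] + length
    return dist

graph = {1: {2: 1, 3: 12}, 2: {3: 9, 4: 3}, 3: {5: 2},
         4: {3: 4, 5: 13, 6: 15}, 5: {6: 4}, 6: {}}
print(dijkstra(graph, 1))   # {1: 0, 2: 1, 3: 8, 4: 4, 5: 10, 6: 14}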
Example:
To see how the algorithm works, consider the directed graph shown in fig. 4.6(a). The first step is to label each vertex v with λ[v] = length[1, v]. As shown in the figure, vertex 1 is labelled with 0 and vertices 2 and 3 are labelled with 1 and 12, since length[1, 2] = 1 and length[1, 3] = 12. All other vertices are labelled with ∞ since there are no edges from the source vertex to these vertices. Initially X = {1} and Y = {2, 3, 4, 5, 6}. In the figure, those vertices to the left of the dashed line belong to X and the others belong to Y. In fig. 4.6(a), we note that λ[2] is the smallest
among all the labels of vertices in Y and hence it is moved to X, indicating that the distance to vertex 2 has been found. To finish processing vertex 2, the labels of its neighbours 3 and 4 are inspected to see if there are paths that pass through 2 and are shorter than their old paths. In this case, we say that we update the labels of the vertices adjacent to 2. As shown in the figure, the path from 1 to 2 to 3 is shorter than the path from 1 to 3 and thus λ[3] is changed to 10, which is the length of the path that passes through 2. Similarly, λ[4] is changed to 4 since now there is a finite path of length 4 from 1 to 4 that passes through vertex 2. These updates are shown in fig. 4.6(b). The next step is to move the vertex with minimum label, namely 4, to X and update the labels of its neighbours in Y as shown in fig. 4.6(c). In this figure, we notice that the labels of vertices 5 and 6 become finite and λ[3] is lowered to 8. Now, vertex 3 has the minimum label, so it is moved to X and λ[5] is updated accordingly as shown in fig. 4.6(d). Continuing in this way, the distance to vertex 5 is found and thus it is moved to X as shown in fig. 4.6(e). As shown in fig. 4.6(f), vertex 6 is the only vertex remaining in Y and hence its label coincides with the length of its distance from 1. In fig. 4.6(f), the label of each vertex represents its distance from the source vertex.
Assumption: Whenever a program is to be retrieved from the tape, the tape is positioned at the front. Now, if the programs are stored in the order I = i1, i2, i3, i4, ..., in, then the time tj needed to retrieve program ij is proportional to the sum of the lengths of the first j programs, li1 + li2 + ... + lij. So the problem is to store the programs on the tape in an order such that the M.R.T. (mean retrieval time) is minimised.
A greedy approach to building such an algorithm chooses the next program on the basis of some optimisation measure. Here we take the optimisation measure to be d, the current length of tape that has already been written with programs. So, every time we write a program onto the tape, we keep in mind that the increment in the length d should be minimum.
This greedy method therefore requires us simply to sort the programs by length and write them onto the tape in non-decreasing order of length. Sorting the program lengths takes O(n log n) time. Here is a sample pseudo code for the case where we have multiple tapes on which to write the data.
Pseudo code
tapes (n, m) // n is the number of programs (in non-decreasing order of length) and m is the number of tapes
j = 0
for i = 1 to n do
    write program i onto tape j
    j = (j + 1) mod m
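A Python sketch of the same procedure (program representation and names are illustrative): the program lengths are sorted into non-decreasing order and the programs are then dealt out to the m tapes in round-robin fashion.

def assign_to_tapes(lengths, m):
    tapes = [[] for _ in range(m)]
    j = 0
    # Visit program indices in non-decreasing order of length.
    for i in sorted(range(len(lengths)), key=lambda k: lengths[k]):
        tapes[j].append(i)          # write program i onto tape j
        j = (j + 1) % m
    return tapes

print(assign_to_tapes([12, 5, 8, 32, 7, 5, 18, 26, 4, 3, 11, 10, 6], 3))
# each inner list holds the programs (by index) assigned to one tape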
Exercises 4
1. Greedy algorithms are used to solve _______ problems.
a. optimisation
b. divide and conquer
c. theoretical
d. implementation
2. A _______ set (of candidates) is promising if it can be extended to produce not merely a solution, but an optimal solution to the problem.
a. optimal
b. empty
c. feasible
d. greedy solution
3. The _______ problem is to find a free tree T of a given graph G that contains all the vertices of G and has the minimum total weight of the edges of G over all such trees.
a. optimal
b. greedy
c. feasible
d. MST
6. _______ algorithm repeatedly chooses the smallest-weight edge that does not form a cycle.
a. Kruskal's
b. Prim's
c. Bellman-Ford's
d. Dijkstra's
7. If the input of Prim's algorithm is a weighted connected graph G = (V, E), then its output is given by
a. set of vertices comprising a MST
b. set of edges comprising a binary tree
c. set of edges comprising a MST
d. set of edges comprising a connected graph
9. If there exists no path from vertex u to vertex v, then the weight of the shortest path is _______.
a. ∞
b. 1
c. 0
d. -1
10. Dijkstra's algorithm solves the single source shortest path problem when all edges have ___________
weights.
a. zero
b. non negative
c. infinite
d. more than one
Chapter V
Graph Algorithms
Aim
Objectives
The objectives of this chapter are to:
Learning outcome
At the end of this chapter, you will be able to:
While traversing, if a particular sub tree is empty (i.e., a node has no left or right descendant), the traversal is
performed by doing nothing. In other words, a null sub tree is considered to be fully traversed when it is
encountered.
Preorder traversal
The preorder traversal of a binary tree is defined as follows:
process the root node
traverse the left sub tree in preorder
traverse the right sub tree in preorder
Inorder traversal
The inorder traversal of a binary tree is given by the following steps:
traverse the left sub tree in inorder
process the root node
traverse the right sub tree in inorder
Postorder traversal
The postorder traversal of a binary tree is defined as follows:
traverse the left sub tree in postorder
traverse the right sub tree in postorder
process the root node
Example:
The preorder traversal of the tree as given above gives the following processing order:
ABDEHCFGI
The inorder traversal of the tree as given above gives the following processing order:
DBHEAFCGI
The postorder traversal of the tree as given above gives the following processing order:
DHEBFIGCA
Process
The first task is to research code optimisation and get a feel for the general consensus on optimisation do’s and
don’ts. Optimisation techniques have been around for years, and not all of the long held beliefs are still
applicable today. Because of the boom in hardware speeds and the drop in hardware prices, many developers
have let code optimisation slip to the back of their minds. As a result, the rigorously developed techniques from
years ago have not been updated to account for modern compiler optimisation or hardware features. Synthesising
a sizable volume of data from opposing viewpoints led to the development of a general outline to follow when
optimising code. This is the process that was followed during the optimisation of the Reed-Solomon code.
Before beginning any project, it is imperative to choose a programming language. Many programmers will
choose a language they are familiar with, even if it is not the best language for the project. Speed, flexibility, and
ease of coding are a few of the major factors in deciding which language to use. Whatever the language, knowing it well will help reduce the number of awkward constructs and also improve the quality of the code. When
inheriting code, the programming language is already chosen, but it may be more efficient to rewrite the code in
another language than try to optimise the existing code. With all of the choices available, choosing the right
language can be difficult yet is an important part of the optimisation process. The Reed-Solomon implementation
was written in C. This was a good choice because the GNU compiler collection (GCC) has a very good
optimising C compiler, allowing the compiler to do much of work. The availability of a good optimising
compiler (or fast command interpreter for an interpreted language) should always be considered when deciding
which language to use.
Assuming the programming language is already chosen and there is little need to choose a new one, the first step
towards optimisation is to look at algorithm choice. Choosing a good algorithm can be a difficult task and there is no
definitive process for choosing the right one. The gains of a good algorithm, however, include a major speed
increase and improved code readability and stability. Reducing an O(N2) algorithm to an O(N) algorithm will
speed up the code (especially for large amounts of data) and enable later optimisation to work with more efficient
code to produce even greater returns. Choosing an algorithm early in the process prevents optimisations from being performed, and then being performed again after an algorithm change. Any changes to the algorithm made after other optimisations are in place will force that work to be repeated.
After settling on an algorithm, the compiler's optimisation options should be enabled. This provides an idea about the final speed and size of the code. The compiler will also perform many optimisations faster and just as well as, if not better than, the human programmer will. Optimisations like moving constant expressions outside of loops,
storing variables in registers, moving functions inline, and unrolling loops should be performed by the compiler
in most cases. Occasionally the programmer can perform an optimisation better than the compiler because he has
additional knowledge about the code that the compiler lacks. For most code, however, the compiler has enough
information to make good decisions and perform the proper optimisation. There are some cases where certain
optimisation will hinder performance or unacceptably enlarge the code. To prevent that hindrance the
programmer can specify which optimisation to include or omit by using compiler flags. There is little point in
performing an optimisation by hand when a compiler can perform the same optimisation faster and more
accurately.
If the code is still not fast enough after the compiler optimisations, there are a number of hand optimisations that
can be performed. Before optimising all the code, it is a good idea to profile the code and get a sense of where the
bottlenecks are. In general, most of the code in a program will only run once, and most of the processing time is
spent in an inner loop. Optimising just that loop will reap the greatest benefits, as a single optimisation will save
on each run through the loop. Any good optimisation book will outline basic optimisation techniques, but it is
good to keep in mind the capabilities of the compiler. The programmer knows many aspects of the code better
than the compiler and can therefore perform some optimisations the compiler cannot. Like any other tool, compilers are not perfect, so it is important to understand the specific compiler being used. As good as the
compiler may be, it is foolish to rely on it to do all of the work. When done properly, utilising a compiler’s
features is quicker, easier, and more effective than doing all the work by hand.
For code that needs to be extremely streamlined, assembly language is a good choice. Some programming
languages, like C, allow assembly statements to be inserted directly into the code. It is also possible to write an
entire section of code in assembly. For many programmers, modifying compiler-generated assembly will produce
the best results in the least amount of time. Skilled assembly programmers, however, may be able to write entire
blocks of assembly that will outperform compiler generated assembly. Even so, using the compiler-generated
assembly is a good way to start out and it is always possible to write the assembly from scratch if modifying the
compiler-generated assembly does not yield great enough gains. It is not always a good idea to write an entire
program in assembly. For code that only runs once, or for which the compiler produces good assembly, it will
often be faster to use a high-level language than to hand-code assembly. The loss in code performance may not
offset the time saved by using a high-level language.
If all optimisations fail to make the code run fast enough, it may be necessary to explore hardware options. Implementing the code in hardware allows faster processing than is attainable in software. Because there is a minimum number of cycles required to perform any given task, it may be necessary to use faster hardware.
Ultimately there will be some project constraint imposing a limit on the speed of the code and the solution may
be difficult to find or accept.
Fig. 5.1 The first two ply of the game tree for tic-tac-toe
The above figure shows the first two levels, or ply, in the game tree for tic-tac-toe. We consider all the rotations and
reflections of positions as being equivalent, so the first player has three choices of move: in the centre, at the
edge, or in the corner. The second player has two choices for the reply if the first player played in the centre,
otherwise five choices and so on.
The number of leaf nodes in the complete game tree is the number of possible different ways the game can be
played. For example, the game tree for tic-tac-toe has 26,830 leaf nodes.
The game tree for tic-tac-toe is easily searchable, but the complete game trees for larger games like chess are
much too large to search. Instead, a chess-playing program searches a partial game tree: typically as many ply
from the current position as it can search in the time available. Two-person games can also be represented as
AND-OR trees. For the first player to win a game, there must exist a winning move for all moves of the second
player. This is represented in the AND-OR tree by using disjunction to represent the first player's alternative
moves and using conjunction to represent all of the second player's moves.
Breadth-first search starts at a given vertex s, which is at level 0. In the first stage, we visit all the vertices that are at a distance of one edge away. As we visit them, we mark the vertices adjacent to the start vertex s as "visited"; these vertices are placed into level 1. In the second stage, we visit all the new vertices we can reach at a distance of two edges away from the source vertex s. These new vertices, which are adjacent to level 1 vertices and not previously assigned to a level, are placed into level 2, and so on. The BFS traversal terminates when every vertex has been visited.
To keep track of progress, breadth-first-search colours each vertex. Each vertex of the graph is in one of three
states:
• undiscovered
• discovered but not fully explored and
• fully explored
The BFS(G, s) algorithm develops a breadth-first search tree with the source vertex s, as its root. The parent or
predecessor of any other vertex in the tree is the vertex from which it was first discovered. For each vertex, v, the
parent of v is placed in the variable π[v]. Another variable, d[v], computed by BFS contains the number of tree
edges on the path from s to v. The breadth-first search uses a FIFO queue, Q, to store gray vertices.
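As an illustration (not part of the original text), a minimal Python sketch of this procedure might look as follows; the adjacency-list dictionary representation and the names bfs, d and parent are assumptions made here, mirroring d[v] and π[v] above.

    from collections import deque

    def bfs(G, s):
        # G: adjacency lists, e.g. {'s': ['r', 'w'], ...}; s: source vertex.
        d = {v: float('inf') for v in G}     # number of tree edges from s to v
        parent = {v: None for v in G}        # predecessor (pi) in the BFS tree
        d[s] = 0
        Q = deque([s])                       # FIFO queue of discovered (gray) vertices
        while Q:
            u = Q.popleft()
            for v in G[u]:
                if d[v] == float('inf'):     # v is still undiscovered (white)
                    d[v] = d[u] + 1
                    parent[v] = u
                    Q.append(v)
        return d, parent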
Example:
The following diagram illustrates the progress of breadth-first search on the undirected sample graph.
After initialisation (paint every vertex white, set d[u] to infinity for each vertex u and set the parent of every vertex to be NIL), the source vertex s is discovered and Q is initialised to contain just s.
After initialisation: d[s] = 0, every other d-value is ∞, and Q = (s).
The algorithm discovers all vertices 1 edge from s, i.e., it discovers all vertices (w and r) at level 1.
• Dequeue s: w and r are discovered, d[w] = d[r] = 1, and Q = (w, r).
• Dequeue w: t and x are discovered, d[t] = d[x] = 2, and Q = (r, t, x).
The algorithm discovers all vertices 2 edges from s, i.e., it discovers all vertices (t, x and v) at level 2.
• Dequeue r: v is discovered, d[v] = 2, and Q = (t, x, v).
• Dequeue t: u is discovered, d[u] = 3, and Q = (x, v, u).
• Dequeue x: y is discovered, d[y] = 3, and Q = (v, u, y).
The algorithm discovers all vertices 3 edges from s, i.e., it discovers all vertices (u and y) at level 3.
• Dequeue v: no new vertices are discovered, and Q = (u, y).
• Dequeue u: Q = (y).
• Dequeue y: Q is empty.
The algorithm terminates when every vertex has been fully explored. The final distances are d[s] = 0, d[r] = d[w] = 1, d[t] = d[v] = d[x] = 2 and d[u] = d[y] = 3.
The while-loop in breadth-first search is executed at most V times, because every vertex is enqueued at most once. So, we have O(V).
The for-loop inside the while-loop is executed at most E times if G is a directed graph, or 2E times if G is undirected, because every vertex is dequeued at most once and we examine the edge (u, v) only when u is dequeued. Therefore, every edge is examined at most once if the graph is directed and at most twice if it is undirected. So, we have O(E).
Therefore, the total running time for breadth-first search traversal is O(V + E).
Applications of BFS
BFS is used to solve the following problems:
• Testing whether graph is connected.
• Computing a spanning forest of graph.
• Computing a cycle in graph or reporting that no such cycle exists.
• Computing, for every vertex in graph, a path with the minimum number of edges between start vertex and current vertex, or reporting that no such path exists.
Depth-first search selects a source vertex s in the graph and paints it as "visited." Now the vertex s becomes our current vertex. Then, we traverse the graph by considering an arbitrary edge (u, v) from the current vertex u. If
the edge (u, v) takes us to a painted vertex v, then we back down to the vertex u. On the other hand, if edge (u, v)
takes us to an unpainted vertex, then we paint the vertex v and make it our current vertex, and repeat the above
computation. Sooner or later, we will get to a “dead end,” meaning all the edges from our current vertex u take us to painted vertices. To get out of this dead end, we back down along the edge that brought us here to
vertex u and go back to a previously painted vertex v. We again make the vertex v our current vertex and start
repeating the above computation for any edge that we missed earlier. If all of v's edges take us to painted vertices,
then we again back down to the vertex we came from to get to vertex v, and repeat the computation at that vertex.
Thus, we continue to back down the path that we have traced so far until we find a vertex that has yet unexplored
edges, at which point we take one such edge and continue the traversal. When the depth-first search has
backtracked all the way back to the original source vertex, s, it has built a DFS tree of all vertices reachable from
that source. If there are still undiscovered vertices in the graph, then it selects one of them as the source for another DFS tree. The result is a forest of DFS-trees. Note that the edges that lead to new vertices are called discovery or tree edges, and the edges that lead to already visited (painted) vertices are called back edges.
Like BFS, to keep track of progress depth-first-search colours each vertex. Each vertex of the graph is in one of
three states:
• undiscovered
• discovered but not finished (not done exploring from it)
• finished (have found everything reachable from it) i.e., fully explored
Like BFS, depth-first search uses π[v] to record the parent of vertex v. We have π[v] = NIL if and only if vertex v is
the root of a depth-first tree.
• DFS time-stamps each vertex when its colour is changed.
• When vertex v is changed from white to gray the time is recorded in d[v].
• When vertex v is changed from gray to black the time is recorded in f[v].
The discovery and the finish times are unique integers, where for each vertex the finish time is always after the discovery time. That is, each time-stamp is a unique integer in the range 1 to 2|V| and, for each vertex v, d[v] < f[v]. In other words, the inequalities 1 ≤ d[v] < f[v] ≤ 2|V| hold.
The algorithm for such a traversal can be written using recursion as given in the algorithm below:
The algorithm starts by marking all vertices unvisited. The algorithm then calls Procedure dfs for each unvisited
vertex in V. This is because not all the vertices may be reachable from the start vertex. Starting from some vertex v ∈ V, Procedure dfs performs the search on G by visiting v, marking v visited and then recursively visiting its
adjacent vertices. When the search is complete, if all vertices are reachable from the start vertex, a spanning tree
called the depth-first search spanning tree is constructed whose edges are those inspected in the forward direction
i.e. when exploring unvisited vertices. In other words, let (v, w) be an edge such that w is marked unvisited and
suppose the procedure was invoked by the call dfs(v). Then, in this case, that edge will be part of the depth-first
search spanning tree.
If not all the vertices are reachable from the start vertex then the search results in a forest of spanning trees
instead. After the search is complete, each vertex is labelled with predfn and postdfn numbers. These two labels
impose preorder and postorder numbering on the vertices in the spanning tree (or forest) generated by the depth-first
search traversal. They give the order in which visiting a vertex starts and ends. In the following, we say that edge
(v, w) is being explored to mean that within the call dfs(v), the procedure is inspecting the edge (v, w) to test
whether w has been visited before or not. The edges of the graph are classified differently according to whether
the graph is directed or undirected.
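The recursive procedure referred to above is not reproduced in the text; the following Python sketch is one possible reading of it (the adjacency-list dictionary and the names predfn, postdfn and tree_edges are assumptions). It computes the predfn and postdfn labels and the tree edges of the resulting forest.

    def depth_first_search(G):
        visited = {v: False for v in G}
        predfn, postdfn, tree_edges = {}, {}, []
        counter = [1, 1]                    # next predfn label, next postdfn label

        def dfs(v):
            visited[v] = True
            predfn[v] = counter[0]; counter[0] += 1
            for w in G[v]:                  # explore edge (v, w)
                if not visited[w]:
                    tree_edges.append((v, w))
                    dfs(w)
            postdfn[v] = counter[1]; counter[1] += 1

        for v in G:                         # not every vertex may be reachable
            if not visited[v]:
                dfs(v)                      # each call grows one tree of the forest
        return predfn, postdfn, tree_edges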
Fig. 5.2 An example of depth-first search traversal of an undirected graph
Fig. 5.2 (b) illustrates the action of depth-first search traversal on the undirected graph shown in fig 5.2(a).
Vertex a has been selected as the start vertex. The depth-first search tree is shown in fig 5.2 (b) with solid lines.
Dotted lines represent back edges. Each vertex in the depth-first search tree is labelled with two numbers: predfn
and postdfn. Note that since vertex e has postdfn = 1, it is the first vertex whose depth-first search is complete.
Note also that since the graph is connected, the start vertex is labelled with predfn = 1 and postdfn = 10, the
number of vertices in the graph.
Tree edges: Edges in the depth-first search tree. An edge (v, w) is a tree edge if w was first visited when exploring
the edge (v, w).
Back edges: Edges of the form (v, w) such that w is an ancestor of v in the depth-first search tree (constructed so
far) and vertex w was marked visited when (v, w) was explored.
Forward edges: Edges of the form (v, w) such that w is a descendant of v in the depth-first search tree (constructed
so far) and vertex w was marked visited when (v, w) was explored.
Applications of DFS
DFS is used to solve the following problems:
• Testing whether graph is connected.
• Computing a spanning forest of G.
• Computing the connected components of G.
• Computing a path between two vertices of G or reporting that no such path exists.
• Computing a cycle in G or reporting that no such cycle exists.
Articulation point: A vertex v in a connected graph G is an articulation point if the deletion of v from G, along
with the deletion of all edges incident to v, disconnects the graph into two or more nonempty components.
Biconnected component: G' = (V', E') is a maximal biconnected subgraph of G if and only if G has no biconnected subgraph G'' = (V'', E'') such that V' ⊆ V'', E' ⊆ E'' and G'' ≠ G'. A maximal biconnected subgraph is a biconnected component.
• Two biconnected components can have at most one vertex in common and this vertex is an articulation point.
• No edge can be in two different biconnected components; this would require two common vertices.
• Graph G can be transformed into a biconnected graph by using the edge addition scheme in the following
algorithm:
• Biconnected components of the above given graph are: {1, 2, 3, 4}, {2, 5, 7, 8}, {5, 6}, {3, 10}, {3, 9}
Add edge (4, 10) and (10, 9) corresponding to articulation point 3, edge (1, 5) corresponding to
articulation point 2, and edge (6, 7) corresponding to articulation point 5.
In the algorithm mk_biconnected, once the edge (vi, vi+1) is added, vertex a is no longer an articulation
point.
If G has p articulation points and b biconnected components, the above algorithm introduces exactly b − p
new edges into G.
5.6 Backtracking
A backtracking algorithm tries to build a solution to a computational problem incrementally. Whenever the
algorithm needs to decide between two alternatives to the next component of the solution, it simply tries both
options recursively.
For many real-world problems, the solution process consists of working your way through a sequence of decision
points in which each choice leads you further along some path. If you make the correct set of choices, you end
up at the solution. On the other hand, if you reach a dead end or otherwise discover that you have made an
incorrect choice somewhere along the way, you have to backtrack to a previous decision point and try a different
path. Algorithms that use this approach are called backtracking algorithms.
If you think about a backtracking algorithm as the process of repeatedly exploring paths until you encounter the
solution, the process appears to have an iterative character. As it happens, however, most problems of this form
are easier to solve recursively. The fundamental recursive insight is simply this: a backtracking problem has a
solution if and only if at least one of the smaller backtracking problems that results from making each possible
initial choice has a solution.
Backtracking is a systematic way to go through all the possible configurations of a search space. In the general case, we assume our solution is a vector v = (a1, a2, ..., an) where each element ai is selected from a finite ordered set Si. We build from a partial solution of length k, v = (a1, a2, ..., ak), and try to extend it by adding another element. After extending it, we test whether what we have so far is still a possible partial solution. If not, we delete ak and try the next element from Sk:
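As a sketch of this general scheme (not the book's own listing), the following Python outline treats is_a_solution, candidates and process_solution as problem-specific placeholders; the names are assumptions made for this illustration.

    def backtrack(a, k, data):
        # a[0..k-1] is the partial solution (a1, ..., ak) built so far.
        if is_a_solution(a, k, data):
            process_solution(a, k, data)
            return
        for c in candidates(a, k, data):    # the ordered set S_(k+1) of legal extensions
            a.append(c)                     # extend the partial solution by a_(k+1) = c
            backtrack(a, k + 1, data)
            a.pop()                         # delete a_(k+1) and try the next element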
Recursive Backtracking
• Recursion can be used for elegant and easy implementation of backtracking.
• Backtracking can easily be used to iterate through all subsets or permutations of a set.
• Backtracking ensures correctness by enumerating all possibilities.
• For backtracking to be efficient, we must prune the search space.
In the sum of subsets problem, we are given a set X of positive integers and a target sum S, and we must find the subsets of X whose elements add up to S. For example, if X = {3, 5, 6, 7, 8, 9, 10} and S = 15, there is more than one subset whose sum is 15. The subsets are {7, 8}, {3, 5, 7}, {6, 9} and {5, 10}. On the other hand, if X = {1, 5, 6, 7, 11, 12, 13} and S = 15, there will be no subset that adds up to 15.
We will assume a binary state space tree. The nodes at depth 1 are for including (yes = 1, no = 0) item 1, the
nodes at depth 2 are for item 2, etc. The left branch includes xi and the right branch excludes xi. The nodes contain
the sum of the numbers included so far.
Backtracking consists of doing a DFS of the state space tree, checking whether each node is promising and if the
node is non-promising backtracking to the node's parent. We call a node non-promising if it cannot lead to a feasible
(or optimal) solution, otherwise it is promising. The state space tree consisting of expanded nodes only is called
the pruned state space tree. The below example will show the pruned state space tree for the sum of subsets
problem. Consider a node at depth i,
sumSoFar = sum of node – that is, sum of numbers included in partial solution node.
sumLeft = sum of the remaining items i + 1 to n (for a node at depth i).
To be able to use this "promising function", the xi must be sorted in increasing order. include[1...n] is a boolean array; if include[i] is true, then item xi is included in the partial solution, else it is not.
The below algorithm sumOfSubsets(i, sumSoFar, sumLeft) is invoked with:
i = 0, sumSoFar = 0 and sumLeft = x1 + x2 + ... + xn. The algorithm is called with sumOfSubsets (0, 0, 21) for the
below example.
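The listing itself is not reproduced here; the following Python sketch is written to match the description above (the function name sum_of_subsets and the way parameters are passed are assumptions made for this illustration).

    def sum_of_subsets(i, sum_so_far, sum_left, x, S, include):
        # Promising test: the remaining items can still reach S, and either we
        # already hit S or including the next (sorted) item does not overshoot it.
        promising = (sum_so_far + sum_left >= S and
                     (sum_so_far == S or (i < len(x) and sum_so_far + x[i] <= S)))
        if not promising:
            return
        if sum_so_far == S:
            print([x[j] for j in range(i) if include[j]])   # report one solution
            return
        include[i] = True                                    # left branch: include x[i]
        sum_of_subsets(i + 1, sum_so_far + x[i], sum_left - x[i], x, S, include)
        include[i] = False                                   # right branch: exclude x[i]
        sum_of_subsets(i + 1, sum_so_far, sum_left - x[i], x, S, include)

    x = [3, 5, 6, 7]
    sum_of_subsets(0, 0, sum(x), x, 15, [False] * len(x))    # i.e. sumOfSubsets(0, 0, 21)

Running this on the example below prints the single solution [3, 5, 7].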
Example:
X = {3, 5, 6, 7} and S = 15. There are only 15 nodes in the pruned state space tree as shown in the figure below. The full state space tree has 31 nodes (2^(4+1) − 1). A trace of the above algorithm is given in Table 5.1 below. Nodes of the tree are shown by circled numbers. Notice that x1 is included at level 1 of the tree, x2 at level 2 and so on.
We have only one solution with subset {3, 5, 7}. This occurs at node 6 in the below figure.
i node sumSoFar sumLeft Comment
0 1 0 21 Start
1 2 3 18 include 3
2 3 8 13 include 5
3 4 14 7 include 6
3 5 8 7 exclude 6
4 6 15 0 include 7 ({3, 5, 7} ← solution)
4 7 8 0 exclude 7
2 8 3 13 exclude 5
3 9 9 7 include 6
3 10 3 7 exclude 6
1 11 0 18 exclude 3
2 12 5 13 include 5
3 13 11 7 include 6
3 14 5 7 exclude 6
2 15 0 13 exclude 5
Table 5.1 Trace of the sum of subsets algorithm for X = {3, 5, 6, 7} and S = 15
Graph colouring is defined as colouring the nodes of a graph with the minimum number of colours without any two adjacent nodes having the same colour. For example, the linked list needs two colours and so does the binary search tree. Graph colouring has wide applications such as estimation of sparse Jacobians, scheduling and register allocation.
A colouring of a graph G = (V, E) is a mapping c: V → S, where S is a finite set of colours, such that if (v, w) ∈ E then c(v) ≠ c(w). In other words, adjacent vertices are not assigned the same colour. The problem that arises is the colouring of a graph provided that no adjacent vertices have the same colour. The chromatic number χ(G) is the minimum number of colours needed for a colouring of G. A graph G is k-chromatic if χ(G) = k, and G is k-colourable if χ(G) ≤ k.
Graph colouring is one of the most useful models in graph theory. It has been used to solve problems in school
timetabling, computer register allocation, electronic bandwidth allocation, and many other applications.
a. First fit: The First Fit algorithm is the easiest and fastest technique of all greedy colouring heuristics. The algorithm sequentially assigns each vertex the lowest legal colour. This algorithm has the advantage of being very simple and fast, and can be implemented to run in O(n).
b. Degree based ordering: It provides a better strategy for colouring a graph. It uses a certain selection criterion for choosing the vertex to be coloured. This strategy is better than First Fit, which simply picks a vertex from an arbitrary order. Some strategies for selecting the next vertex to be coloured have been proposed, such as:
Largest degree ordering (LDO): It chooses a vertex with the highest number of neighbours. Intuitively, LDO provides a better colouring than First Fit. This heuristic can be implemented to run in O(n²).
Saturation degree ordering (SDO): The saturation degree of a vertex is defined as the number of its adjacent differently coloured vertices. Intuitively, this heuristic provides a better colouring than LDO; it can be implemented to run in O(n³).
Incidence degree ordering (IDO): A modification of the SDO heuristic is the incidence degree ordering. The incidence degree of a vertex is defined as the number of its adjacent coloured vertices. This heuristic can be implemented to run in O(n²).
The algorithm for graph colouring is given as follows:
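The original listing is not reproduced in the text; the sketch below shows the First Fit greedy heuristic described above in Python (the adjacency-list representation, the name greedy_colouring and the optional order parameter are assumptions made for this illustration).

    def greedy_colouring(G, order=None):
        # G: adjacency-list dictionary; order: the sequence in which vertices are coloured.
        if order is None:
            order = list(G)                  # First Fit: an arbitrary fixed order
        colour = {}
        for v in order:
            used = {colour[w] for w in G[v] if w in colour}   # colours of coloured neighbours
            c = 0
            while c in used:                 # lowest legal colour
                c += 1
            colour[v] = c
        return colour

Passing order=sorted(G, key=lambda v: -len(G[v])) would give the largest degree ordering (LDO) variant.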
In the mathematical field of graph theory, a Hamiltonian cycle (or Hamiltonian circuit) is a cycle in an undirected
graph which visits each vertex exactly once and also returns to the starting vertex. A Hamiltonian path (or
traceable path) is a path in an undirected graph which visits each vertex exactly once.
A Hamiltonian cycle, Hamiltonian circuit, vertex tour or graph cycle is a cycle that visits each vertex exactly once
(except the vertex which is both the start and end, and so is visited twice). A graph that contains a Hamiltonian
cycle is called a Hamiltonian graph.
A Hamiltonian path or traceable path is a path that visits each vertex exactly once. A graph that contains a
Hamiltonian path is called a traceable graph. A graph is Hamilton-connected if for every pair of vertices there is a
Hamiltonian path between the two vertices.
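As a hedged illustration (not part of the original text), a Hamiltonian cycle can be searched for with the backtracking scheme of this section; the sketch below assumes the graph is an adjacency-list dictionary and returns one cycle, or None if none exists.

    def hamiltonian_cycle(G):
        # G: adjacency-list dictionary, e.g. {1: [2, 3], 2: [1, 3], 3: [1, 2]}.
        vertices = list(G)
        start = vertices[0]
        path, visited = [start], {start}

        def extend():
            if len(path) == len(vertices):
                return start in G[path[-1]]          # can the cycle be closed?
            for w in G[path[-1]]:
                if w not in visited:
                    path.append(w); visited.add(w)
                    if extend():
                        return True
                    path.pop(); visited.remove(w)    # backtrack
            return False

        return path + [start] if extend() else None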
(a) Input
(b) Output
5.6.4 The Eight Queens Problem
The problem of the eight queens is a well-known example of the use of trial-and-error methods and of
backtracking algorithms. It was investigated by C. F. Gauss in 1850, but he did not completely solve it. The
characteristic property of these problems is that they defy analytic solution. Instead, they require large amounts of
exacting labour, patience, and accuracy. Such algorithms have therefore gained relevance almost exclusively
through the automatic computer, which possesses these properties to a much higher degree than people, and even
geniuses, do.
Eight queens are to be placed on a chess board in such a way that no queen checks against any other queen. There
remains the question of representing the eight queens on the board. An obvious choice would again be a square
matrix to represent the board, but a little inspection reveals that such a representation would lead to fairly
cumbersome operations for checking the availability of positions. This is highly undesirable since it is the most
frequently executed operation. We should therefore choose a data representation which makes checking as simple
as possible. The best recipe is to represent as directly as possible that information which is truly relevant and
most often used.
• The problem is to place eight queens on an 8 x 8 chess board so that no two queens attack each other, i.e. no two of them are on the same row, column or diagonal.
• Strategy: The rows and columns are numbered 1 through 8.
• The queens are also numbered 1 through 8.
• Since each queen is to be on a different row without loss of generality, we assume queen i is to be placed on
row i.
• The solution is an 8-tuple (x1, x2, . . ., x8) where xi is the column on which queen i is placed.
• The explicit constraints are: Si = {1, 2, 3, 4, 5, 6, 7, 8} for 1 ≤ i ≤ n, i.e. 1 ≤ xi ≤ 8, where i = 1, . . ., 8.
• The solution space consists of 8^8 8-tuples.
• The implicit constraints are:
  no two xi can be the same, that is, all queens must be on different columns
  no two queens can be on the same diagonal
  These constraints reduce the size of the solution space from 8^8 to 8! 8-tuples.
Two solutions are (4,6,8,2,7,1,3,5) and (3,8,4,7,1,6,2,5)
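A compact backtracking sketch for the problem (a Python illustration, not the book's own algorithm; rows and columns are 0-based internally and the reported tuples are 1-based, matching the solutions quoted above):

    def queens(n=8):
        # x[i] is the column of the queen on row i; two queens attack each other
        # if they share a column or a diagonal, i.e. |x[i] - x[j]| == |i - j|.
        solutions, x = [], []

        def place(row):
            if row == n:
                solutions.append(tuple(c + 1 for c in x))   # e.g. (4, 6, 8, 2, 7, 1, 3, 5)
                return
            for col in range(n):
                if all(x[r] != col and abs(x[r] - col) != row - r for r in range(row)):
                    x.append(col)
                    place(row + 1)
                    x.pop()                                  # backtrack
        place(0)
        return solutions

For n = 8 this enumerates all 92 solutions, including the two 8-tuples listed above.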
Exercises 5
1. For the below given graph, determine the preorder traversal.
4. __________ is a procedure by which each node in the tree is processed exactly once in a systematic manner.
a. Backtracking
b. Recursion
c. Iteration
d. Traversal
9. In AND/OR graphs, the __________ successors are sub goals that must all be achieved to satisfy the parent goal.
a. OR
b. AND
c. vertex
d. edge
10. A game tree is a __________ graph whose nodes are positions in a game and whose edges are moves.
a. undirected
b. mixed
c. directed
d. multi
Chapter VI
Complexity
Aim
Objectives
The objectives of this chapter are to:
Learning outcome
At the end of this chapter, you will be able to:
6.1 Introduction
The need to be able to measure the complexity of a problem, algorithm or structure and to obtain bounds and
quantitative relations for complexity arises in more and more sciences: besides computer science, the traditional
branches of mathematics, statistical physics, biology, medicine, social sciences and engineering are also
confronted more and more frequently with this problem. In the approach taken by computer science, complexity
is measured by the quantity of computational resources (time, storage, program, communication) used up by a
particular task. These notes deal with the foundations of this theory.
Computation theory can basically be divided into three parts of different character. First, the exact notions of
algorithm, time, storage capacity, etc. must be introduced. For this, different mathematical machine models must
be defined, and the time and storage needs of the computations performed on these need to be clarified (this is
generally measured as a function of the size of input). By limiting the available resources, the range of solvable
problems gets narrower; this is how we arrive at different complexity classes. The most fundamental complexity
classes provide an important classification of problems arising in practice, but (perhaps more surprisingly) even
for those arising in classical areas of mathematics; this classification reflects the practical and theoretical
difficulty of problems quite well. The relationship between different machine models also belongs to this first
part of computation theory.
Second, one must determine the resource need of the most important algorithms in various areas of mathematics,
and give efficient algorithms to prove that certain important problems belong to certain complexity classes. In
these notes, we do not strive for completeness in the investigation of concrete algorithms and problems; this is
the task of the corresponding fields of mathematics (combinatorics, operations research, numerical analysis,
number theory). Nevertheless, a large number of concrete algorithms will be described and analyzed to illustrate
certain notions and methods, and to establish the complexity of certain problems.
Third, one must find methods to prove “negative results”, i.e. to prove that some problems are actually unsolvable under certain resource restrictions. Often, these questions can be formulated by asking whether certain complexity classes are different or empty. This problem area includes the question whether a problem is algorithmically solvable at all; this question can today be considered classical, and there are many important results concerning it; in particular, the decidability or undecidability of most concrete problems of interest is known.
The majority of algorithmic problems occurring in practice is, however, such that algorithmic solvability itself is
not in question, the question is only what resources must be used for the solution. Such investigations, addressed
to lower bounds, are very difficult and are still in their infancy. In these notes, we can only give a taste of this
sort of results. In particular, we discuss complexity notions like communication complexity or decision tree
complexity, where by focusing only on one type of rather special resource, we can give a more complete analysis
of basic complexity classes.
It is, finally, worth noting that if a problem turns out to be difficult to solve, this is not necessarily a negative
result. More and more areas (random number generation, communication protocols, cryptography and data
protection) need problems and structures that are guaranteed to be complex. These are important areas for the
application of complexity theory; from among them, we will deal with random number generation and
cryptography, the theory of secret communication.
Let us define some orderings of the set of words. Suppose that an ordering of the elements of Σ is given. In the lexicographic ordering of the elements of Σ*, a word α precedes a word β if either α is a prefix (beginning segment) of β, or the first letter which is different in the two words is smaller in α. (E.g., 35244 precedes 35344, which precedes 353447.) The lexicographic ordering does not order all words in a single sequence: for example, every word
beginning with 0 precedes the word 1 over the alphabet {0, 1}. The increasing order is therefore often preferred:
here, shorter words precede longer ones and words of the same length are ordered lexicographically. This is the
ordering of {0, 1}* we get when we write up the natural numbers in the binary number system.
The set of real numbers will be denoted by R, the set of integers by Z and the set of rational numbers (fractions) by
Q. The sign of the set of non-negative real (integer, rational) numbers is R+ (Z+, Q+). When the base of a
logarithm will not be indicated it will be understood to be 2.
Let f and g be two real (or even complex) functions defined over the natural numbers. We write
f = O(g)
if there is a constant c > 0 such that for all n large enough we have |f(n)| ≤ c|g(n)|. We write
f = o(g)
if g is 0 only at a finite number of places and f(n)/g(n) → 0 as n → ∞. We will also sometimes use an inverse of the big-O notation: we write
f = Ω(g)
if g = O(f). The notation
f = Θ(g)
means that both f = O(g) and g = O(f) hold, i.e. there are constants c1, c2 > 0 such that for all n large enough we have c1·g(n) ≤ f(n) ≤ c2·g(n). We will also use this notation within formulas. Thus,
(n + 1)² = n² + O(n)
means that (n + 1)² can be written in the form n² + R(n) where R(n) = O(n). Keep in mind that in this kind of formula, the equality sign is not symmetrical. Thus, O(n) = O(n²) but O(n²) ≠ O(n). When such formulas become too complex it is better to go back to some more explicit notation.
We define these notions in terms of the Turing machine model of computation. This definition is suitable for
theoretical study; in describing algorithms, using the RAM is more convenient, and it also approximates reality
better. It follows, however, from our constructions, that from the point of view of the most important type of
resource restrictions (e.g. polynomial time and space) it does not matter, which machine model is used in the
definition.
This leads to the definition of various complexity classes: classes of problems solvable within given time bounds,
depending on the size of the input. Every positive function of the input size defines such a class, but some of
them are particularly important. The most central complexity class is polynomial time. Many algorithms
important in practice run in polynomial time (in short, are polynomial). Polynomial algorithms are often very
interesting mathematically, since they are built on deeper insight into the mathematical structure of the problems,
and often use strong mathematical tools.
We restrict the computational tasks to yes-or-no problems; this is not too much of a restriction, and pays off in the simplicity of presentation we gain. Note that the task of computing any output can be broken down to
computing its bits in any reasonable binary representation.
Most of this chapter is spent on illustrating how certain computational tasks can be solved within given resource
contraints. We start with the most important case, and show that most of the basic everyday computational tasks
can be solved in polynomial time. These basic tasks include tasks in number theory (arithmetic operations,
greatest common divisor and modular arithmetic) linear algebra (Gaussian elimination) and graph theory. (We
cannot in any sense survey all the basic algorithms, especially in graph theory; we’ll restrict ourselves to a few
that will be needed later.)
Polynomial space is a much more general class than polynomial time (i.e., a much less restrictive resource
constraint). The most important computational problems solvable in polynomial space (but most probably not in
polynomial time) are games like chess or GO. We give a detailed description of this connection.
The time demand of a Turing machine T is a function time T (n) defined as the maximum of the number of steps
taken by T over all possible inputs of length n. We assume time T (n) ≥ n (the machine must read the input; this
is not necessarily so but we exclude only trivial cases with this assumption). It may happen that time T (n) = ∞.
Similarly, the function space T (n) is defined as the maximum number, over all inputs of length n, of all cells on
all tapes to which the machine writes. (This way, the cells occupied by the input are not counted in the space
requirement.) We assume that space T (n) ≥ 1 (the machine has some output).
A Turing machine T is called polynomial if there is a polynomial f(n) such that time T(n) = O(f(n)). This is equivalent to saying that there is a constant c such that the time demand of T is O(n^c). We can define exponential Turing machines similarly (for which the time demand is O(2^(n^c)) for some c > 0), and also Turing machines working in polynomial and exponential space.
Now we consider a yes-or-no problem. This can be formalised as the task of deciding whether the input word x belongs to a fixed language L ⊆ Σ∗.
We say that a language L ⊆ Σ∗ has time complexity at most f (n), if it can be decided by a Turing machine with
time demand at most f (n). We denote by DTIME(f (n)) the class of languages whose time complexity is at most f
(n). (The letter “D” indicates that we consider here only deterministic algorithms; later, we will also consider
algorithms that are “nondeterministic” or use randomness). We denote by PTIME, or simply by P, the class of all
languages decidable by a polynomial Turing machine. We define similarly when a language has space
complexity at most f (n), and also the language classes DSPACE(f (n)) and PSPACE (polynomial space).
Remark 6.4.1 It would be tempting to define the time complexity of a language L as the optimum time of a
Turing machine that decides the language. Note that we were more careful above, and only defined when the time
complexity is at most f (n). The reason is that there may not be a best algorithm (Turing machine) solving a given
problem: some algorithms may work better for smaller instances, some others on larger, some others on even
larger etc. Section
3.3 contains a theorem that provides such examples.
Remark 6.4.2 When we say that the multiplication of two numbers of size n can be performed in time n² then we actually find an upper bound on the complexity of a function (multiplication of two numbers represented by
the input strings) rather than a language. The classes DTIME(f (n)), DSPACE(f (n)), etc. are defined as classes of
languages; corresponding classes of functions can also be defined.
Sometimes, it is easy to give a trivial lower bound on the complexity of a function. Consider e.g. the function x · y where x and y are numbers in binary notation. Its computation requires at least |x| + |y| steps, since this is the
length of the output. Lower bounds on the complexity of languages are never this trivial, since the output of the
computation deciding the language is a single bit.
How to define time on the RAM machine? The number of steps of the Random Access Machine is not the best
measure of the “time it takes to work”. One could (mis)use the fact that the instructions operate on natural
numbers of arbitrary size, and develop computational tricks that work in this model but use such huge integers
that to turn them into practical computations would be impossible. For example, we can simulate vector addition
by the addition of two very large natural numbers.
Therefore, we prefer to characterise the running time of RAM algorithms by two numbers, and say that “the
machine makes at most n steps on numbers with at most k bits”. Similarly, the space requirement is best
characterised by saying that “the machine stores at most n numbers with at most k bits”.
If we want a single number to characterise the running time of a RAM computation, we can count as the time of a step not one unit but the number of bits of the integers occurring in it (both register addresses and their contents). Since the number of bits of an integer is essentially the base-two logarithm of its absolute value, it is also usual to call this model the logarithmic cost RAM.
In arithmetical and algebraic algorithms, it is sometimes convenient to count the arithmetical operations; on a Random Access Machine, this corresponds to extending the set of basic operations of the programming language to include the subtraction, multiplication, division (with remainder) and comparison of integers, and counting the number of steps instead of the running time. If we perform only a polynomial number of operations (in terms of
the length of the input) on numbers with at most a polynomial number of digits, then our algorithm will be
polynomial in the logarithmic cost model.
A less trivial polynomial time arithmetic algorithm is the Euclidean algorithm, which computes the greatest common divisor of two numbers. Given 0 < a ≤ b, divide b by a with remainder: b = ha + r, where 0 ≤ r < a. Then gcd(a, b) = gcd(a, r), and it is enough therefore to determine the greatest common divisor of a and r. Since r < a, this recurrence will terminate in a finite number of iterations and we get the greatest common divisor of a and b.
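A small Python sketch of the iterative form (the function name and the example call are illustrative assumptions, not part of the original text):

    def gcd(a, b):
        # Repeatedly replace (a, b) by (r, a), where r = b mod a.
        while a > 0:
            a, b = b % a, a
        return b

    print(gcd(24, 60))   # prints 12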
Notice that strictly speaking, the algorithm given above is not a program for the Random Access Machine. It is a
recursive program, and even as such it is given somewhat informally. But we know that such an informal
program can be translated into a formal one, and a recursive program can be translated into a machine-language
program (most compilers can do that).
Lemma 6.5.1 The Euclidean algorithm takes polynomial time. More exactly, it carries out O(log a + log b) arithmetical operations on input (a, b).
Proof: Since 0 ≤ r < a ≤ b, the Euclidean algorithm will terminate sooner or later. Let us see that it terminates in polynomial time. Notice that b ≥ a + r > 2r and thus r < b/2. Hence ar < ab/2. Therefore after ⌈log(ab)⌉ iterations, the product of the two numbers will be smaller than 1, hence one of them will be 0, i.e. the algorithm terminates. Each iteration consists of elementary arithmetic operations, and can be carried out in polynomial time. □
It is an important feature of the Euclidean algorithm that it not only gives the value of the greatest common divisor, but also delivers integers p, q such that gcd(a, b) = pa + qb. For this, we simply maintain such a form for all numbers computed during the algorithm.
If a' = p1 a + q1 b and b' = p2 a + q2 b and we divide, say, b' by a' with remainder: b' = ha' + r', then
r' = (p2 − hp1)a + (q2 − hq1)b,
and thus we obtain the representation of the new number r' in the form p'a + q'b.
Remark 6.5.1 The Euclidean algorithm is sometimes given by the following iteration: if a = 0 then we are done. If a
> b then let us switch the numbers. If 0 < a ≤ b then let b := b − a. Though mathematically essentially the same thing happens (Euclid's original algorithm was closer to this), this algorithm is not polynomial: even the computation
of gcd(1, b) requires b iterations, which is exponentially large in terms of the number log b + O(1) of digits of the
input.
The operations of addition, subtraction, multiplication can be carried out in polynomial time also in the ring of
remainder classes modulo an integer m. We represent the remainder classes by the smallest nonnegative remainder.
We carry out the operation on these as on integers; at the end, another division by m, with remainder, is
necessary.
If m is a prime number then we can also carry out the division in the field of the residue classes modulo m, in
polynomial time. This is different from division with remainder! It means that given integers a, b and m, where 0
≤ a, b ≤ m − 1 and b = 0, we can compute an integer x with 0 ≤ x < m such that
bx ≡ a (mod m).
The solution is to apply the Euclidean algorithm to compute the greatest common divisor of the numbers b, m. Of
course, we know in advance that the result is 1. But as remarked, we also obtain integers p and q such that bp +
mq = 1. In other words, bp ≡ 1 (mod m), and thus b(ap) ≡ a (mod m). So the quotient x we are looking for is the
remainder of the product ap after dividing by m.
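A Python sketch of this modular division via the extended Euclidean algorithm (the function names extended_gcd and mod_div are assumptions made for this illustration):

    def extended_gcd(a, b):
        # Returns (g, p, q) with g = gcd(a, b) = p*a + q*b.
        if b == 0:
            return a, 1, 0
        g, p, q = extended_gcd(b, a % b)
        return g, q, p - (a // b) * q

    def mod_div(a, b, m):
        # Solve b*x ≡ a (mod m), assuming gcd(b, m) = 1 (e.g. m prime, b ≠ 0).
        g, p, _ = extended_gcd(b, m)      # b*p + m*q = 1, so p is b's inverse mod m
        assert g == 1
        return (a * p) % m

    print(mod_div(3, 4, 7))               # 6, since 4*6 = 24 ≡ 3 (mod 7)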
We mention yet another application of the Euclidean algorithm. Suppose that a certain integer x is unknown to us
but we know its remainders x1 , . . . , xk with respect to the moduli m1 , . . . , mk which are all relatively prime to
each other. The Chinese Remainder Theorem says that these remainders uniquely determine the remainder of x
modulo the product m = m1 · · · mk. But how can we compute this remainder?
It suffices to deal with the case k = 2 since for general k, the algorithm follows from this by mathematical
induction. We are looking for an integer x such that x ≡ x 1 (mod m1 ) and x ≡ x2 (mod m2 ) (we also want that 0 ≤
x ≤ m1 m2 − 1, but this we can achieve by dividing with remainder at the end).
In other words, we are looking for integers x, q1 and q2 such that x = x1 + q1 m1 and x = x2 + q2 m2. Subtracting, we
get x2 −x1 = q1 m1 −q2 m2 . This equation does not determine the numbers q1 and q2 uniquely, but this is not
important.
We can find, using the Euclidean algorithm, numbers q1 and q2 such that
x2 − x1 = q1 m1 − q2 m2 ,
and compute x = x1 + q1 m1 = x2 + q2 m2 . Then x ≡ x1 (mod m1 ) and x ≡ x2 (mod m2 ), as desired.
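Assuming the extended_gcd helper from the previous sketch and coprime moduli, the k = 2 case described above might be coded as follows:

    def crt(x1, m1, x2, m2):
        # Find x with x ≡ x1 (mod m1) and x ≡ x2 (mod m2), for coprime m1, m2.
        g, p, q = extended_gcd(m1, m2)     # p*m1 + q*m2 = 1
        assert g == 1
        q1 = (x2 - x1) * p                 # then x2 - x1 = q1*m1 - q2*m2 with q2 = -(x2 - x1)*q
        x = x1 + q1 * m1
        return x % (m1 * m2)

    print(crt(2, 3, 3, 5))                 # 8: indeed 8 ≡ 2 (mod 3) and 8 ≡ 3 (mod 5)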
Next, we discuss the operation of exponentiation. Since even to write down the number 2^n we need an exponential number of digits (in terms of the length of the input, which is the number of binary digits of n), this number is of course not computable in polynomial time. The situation changes, however, if we want to carry out the exponentiation modulo m: then a^b is also a residue class modulo m, and hence it can be represented by log m + O(1) bits. We will show that it can be not only represented polynomially but also computed in polynomial time.
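The standard way to do this is repeated squaring; the following Python sketch (an illustration, not the original listing) uses O(log b) multiplications of residues modulo m and behaves like Python's built-in pow(a, b, m):

    def mod_pow(a, b, m):
        # Repeated squaring: process the exponent's binary digits from the right.
        result = 1
        a %= m
        while b > 0:
            if b & 1:                     # current binary digit of the exponent is 1
                result = (result * a) % m
            a = (a * a) % m
            b >>= 1
        return result

    print(mod_pow(3, 200, 7))             # same value as pow(3, 200, 7)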
Let us verify, first of all, that the polynomial computation of det(A) is not inherently impossible, in the sense that
the result can be written down with polynomially many bits.
Let K = max |aij|; then to write down the matrix A we obviously need at least L = n² + log K bits. On the other hand, the definition of determinants gives
|det(A)| ≤ n!K^n,
so det(A) can also be written down with O(n log n + n log K) = O(nL) bits. Linear algebra gives a formula for each element of A^(−1) as the quotient of two subdeterminants of A. This shows that A^(−1) can also be written down with polynomially many bits.
The usual procedure to compute the determinant is Gaussian elimination. We can view this as the transformation
of the matrix into a lower triangular matrix with column operations. These transformations do not change the
determinant, and in the final triangular matrix, the computation of the determinant is trivial: we just multiply the
diagonal elements to obtain it. It is also easy to obtain the inverse matrix from this form; we will not deal with
this issue separately.
Gaussian elimination: Suppose that for all i such that 1 ≤ i ≤ t, we have already achieved that in the i'th row, only the first i entries hold a nonzero element. Pick a nonzero element from the last n − t rows and columns (stop if there is no such element). Call this element the pivot element of this stage. Rearrange the rows and columns so that this element gets into position (t + 1, t + 1). Subtract column t + 1, multiplied by a_(t+1,i)/a_(t+1,t+1), from column i, for all i = t + 2, . . . , n, in order to get 0's in the elements (t + 1, t + 2), . . . , (t + 1, n). These subtractions do not change the value of the determinant, and the rearrangement changes at most the sign, which is easy to keep track of.
Since one iteration of the Gaussian elimination uses O(n²) arithmetic operations and n iterations must be performed, this procedure uses O(n³) arithmetic operations. But the problem is that we must also divide, and not with
remainder. This does not cause a problem over a finite field, but it does in the case of the rational field. We
assumed that the elements of the original matrix are integers; but during the run of the algorithm, matrices also
occur that consist of rational numbers. In what form should these matrix elements be stored? The natural answer
is to store them as pairs of integers (whose quotient is the rational number).
Do we require that the fractions be in simplified form, i.e., that their numerator and denominator be relatively
prime to each other? We could do so; then we have to simplify each matrix element after each iteration, for which
we would have to perform the Euclidean algorithm. This can be performed in polynomial time, but it is a lot of
extra work, and it is desirable to avoid it. (Of course, we also have to show that in the simplified form, the
occurring numerators
and denominators have only polynomially many digits. This will follow from the discussions below.)
We could also choose not to require that the matrix elements be in simplified form. Then we define the sum and
product of two rational numbers a/b and c/d by the following formulas: (ad + bc)/(bd) and (ac)/(bd). With this
convention, the problem is that the numerators and denominators occurring in the course of the algorithm can
become very large (have a non-polynomial number of digits)!
Fortunately, we can give a procedure that stores the fractions in partially simplified form, and avoids both the
simplification and the excessive growth of the number of digits. For this, let us analyze a little the matrices occurring
during Gaussian elimination. We can assume that the pivot elements are, as they come, in positions (1, 1), . . . ,
(n, n), i.e., we do not have to permute the rows and columns. Let (a_ij^(k)) (1 ≤ i, j ≤ n) be the matrix obtained after k iterations. Let us denote the elements in the main diagonal of the final matrix, for simplicity, by d_1, . . . , d_n (thus, d_i = a_ii^(n)). Let D^(k) denote the submatrix determined by the first k rows and columns of matrix A, and let D_ij^(k), for k + 1 ≤ i, j ≤ n, denote the submatrix determined by the first k rows and the ith row and the first k columns and the jth column. Let d_ij^(k) = det(D_ij^(k)). Obviously, det(D^(k)) = d_kk^(k−1).
We say that an integer y is a square root of x modulo p if y² ≡ x (mod p).
This implies that not every integer has a square root modulo p: squaring maps the non-zero residues onto a subset of size (p − 1)/2, and the other (p − 1)/2 residues have no square root. The following lemma provides an easy way to decide if a residue has a square root: a residue x ≢ 0 is a quadratic residue modulo p if and only if x^((p−1)/2) ≡ 1 (mod p). Indeed, if x ≡ y² then x^((p−1)/2) ≡ y^(p−1) ≡ 1 (mod p) by Fermat's "Little" Theorem. Conversely, the polynomial x^((p−1)/2) − 1 has degree (p − 1)/2, and hence it has at most (p − 1)/2 "roots" modulo p (this can be proved just like the well-known theorem that a polynomial of degree n has at most n real roots). Since all quadratic residues are roots of x^((p−1)/2) − 1, none of the quadratic non-residues can be. □
It can be expected that for the price of further complicating the machine, the time demands can be decreased. The next theorem shows that the machine can indeed be accelerated by an arbitrary constant factor, at least if its time demand is large enough (the time spent on reading the input cannot be saved).
Theorem 6.6.1 (Linear Speedup Theorem) For every Turing machine T and c > 0 there is a Turing machine S over the same alphabet which decides the same language and for which time S(n) ≤ c · time T(n) + n.
Proof. For simplicity, let us also assume that T has a single work tape (the proof would be similar for k tapes).
We can assume that c = 1/p where p is an integer.
Let the Turing machine S have an input-tape, 2p − 1 “starting” tapes and 2p − 1 further work tapes. Let us
number these each from 1 − p to p − 1. Let the index of cell j of (start- or work) tape i be the number j(2p − 1) +
i. The start- or work cell with index t will correspond to cell t on the input resp. work tape of machine T. Let S
also have an output tape.
Machine S begins its work by copying every letter of input x from its input tape to the cell with the
corresponding index on its starting tapes, then moves every head back to cell 0. From then on, it ignores the
“real” input tape.
Every further step of machine S will correspond to p consecutive steps of machine T. After pk steps of machine T, let the scanning head of the input tape and the work tape rest on cells t and s respectively. We will plan machine S in such a way that in this case, each cell of each start- resp. work tape of S holds the same symbol as the corresponding cell of the corresponding tape of T, and the heads rest on the starting-tape cells with indices t − p
+ 1, . . . , t + p − 1 and the work-tape cells with indices s − p + 1, . . . , s + p − 1. We assume that the control unit
of machine S “knows” also which head scans the cell corresponding to t resp. s. It knows further what is the state
of the control unit of T.
Since the control unit of S sees not only what is read by T ’s control unit at the present moment on its input- and
work tape but also the cells at a distance at most p −1 from these, it can compute where T ’s heads will step and
what they will write in the next p steps. Say, after p steps, the heads of T will be in positions t + i and s + j
(where, say, i, j > 0). Obviously, i, j < p. Notice that in the meanwhile, the “work head” could change the
symbols written on the work tape only in the interval [s − p + 1, s + p − 1].
Let now the control unit of S do the following: compute and remember what will be the state of T’s control unit p
steps later. Remember which heads rest on the cells corresponding to the positions (t + i) and (s + j). Let it
rewrite the symbols on the work tape according to the configuration p steps later (this is possible since there is a
head on each work cell with indices in the interval [s − p + 1, s + p − 1]). Finally, move the start heads with
indices in the interval [t − p + 1, t − p + i] and the work heads with indices in the interval [s − p + 1, s − p + j] one
step right; in this way, the indices occupied by them will fill the interval [t + p, t + p + i − 1] resp. [s + p, s + p + i
− 1] which, together with the heads that stayed in their place, gives interval [t + i − p + 1, t + i + p − 1] resp. [s + j
− p + 1, s + j + p − 1].
If during the p steps under consideration, T writes on the output tape (either 0 or 1) and stops, then let S do this,
too. Thus, we constructed a machine S that (apart from the initial copying) makes only a pth of the number of
steps of T and decides the same language.
Theorem 6.6.2 For every recursive function f (n) there is a recursive language L that is not an element of
DTIME(f (n)).
Proof: The proof is similar to the proof of the fact that the halting problem is undecidable. We can assume f(n) > n. Let T be the 2-tape universal Turing machine constructed in Chapter 1, and let L consist of all words x for which it is true that having x as input on both of its tapes, T halts in at most f(|x|)^4 steps. L is obviously recursive.
Let us now assume that L ∈ DTIME(f(n)). Then there is a Turing machine (with some k > 0 tapes) deciding L in time f(n). From this we can construct a 1-tape Turing machine deciding L in time c·f(n)² (e.g. in such a way that it stops and writes 0 or 1 as its decision on a certain cell). Since for large enough n we have c·f(n)² < f(n)³, and the words shorter than this can be recognised by the control unit directly, we can also make a 1-tape Turing machine that always stops in time f(n)³. Let us modify this machine in such a way that if a word x is in L then it runs forever, while if x ∈ Σ∗ \ L then it stops. This machine S can be simulated on T by some program p in such a way that T halts with input (x, p) if and only if S halts with input x; moreover, it halts in these cases within
|p|·f(|x|)³ steps.
There are two cases. If p ∈ L then—according to the definition of L—starting with input p on both tapes, machine T will stop. Since the program simulates S, it follows that S halts with input p. This is, however, impossible, since S does not halt at all for inputs from L. On the other hand, if p ∉ L then—according to the construction of S—starting with p on its first tape, this machine halts in time |p|·f(|p|)³ < f(|p|)^4. Thus, T also halts in time f(|p|)^4. But then p ∈ L by the definition of the language L, a contradiction.
Obviously, every fully time-constructible function is recursive. On the other hand, it is easy to see that n², 2^n, n! and every “reasonable” function is fully time-constructible. The lemma below guarantees the existence of many fully time-constructible functions.
Lemma 6.6.1
• To every well-computable function f (n), there is a fully time-constructible function g(n) such that f (n) ≤ g(n)
≤ const f (n).
• For every fully time-constructible function g(n) there is a well-computable function f (n) with g(n) ≤ f (n) ≤
const • g(n).
• For every recursive function f there is a fully time-constructible function g with f ≤ g.
This lemma allows us to use, in most cases, fully time-constructible and well-computable functions
interchangeably. Following the custom, we will use the former.
There is a variety of natural and interesting questions about the trade-off between storage and time. Let us first mention the well-known practical problem that the work of most computers can be speeded up significantly by adding
memory. The relation here is not really between the storage and time complexity of computations, only between
slower and faster memory. Possibly, between random-access memory versus the memory on disks, which is
closer to the serial-access model of Turing machines.
There are some examples of real storage-time trade-off in practice. Suppose that during a computation, the values
of a small but complex Boolean function will be used repeatedly. Then, on a random-access machine, it is worth
computing these values once for all inputs and use table look-up later. Similarly, if a certain field of our records
in a data base is often used for lookup then it is worth computing a table facilitating this kind of search
(inverting). All these examples fall into the following category. We know some problem P and an algorithm A
that solves it. Another algorithm A′ is also known that solves P in less time and more storage than A. But generally, we have no proof that solving P with the smaller amount of time really requires more storage.
Moreover, when a lower bound is known on the time complexity of some function, we have generally no better
estimate of the storage complexity than the trivial one mentioned above (and vice versa).
Exercises 6
1. A finite sequence formed from some elements of an alphabet is called a word.
a.
b.
c. Θ
d. ∞
6. A Turing machine T is called ______ if there is a polynomial f(n) such that timeT(n) = O(f(n)).
a. equivalent
b. monomial
c. polynomial
d. exponential
7. If there is a function x · y, where x and y are numbers in binary notation, then how many steps are required for
its computation?
a. |x| + |y|
b. x*y
c. x2 + y2
d. log2 n + O(1)
8. To every well-computable function f(n), there is a fully time-constructible function g(n) such that ______.
a. g(n) ≤ f (n) ≤ const • g(n)
b. f (n) ≤ g(n) ≤ const f (n)
c. f ≤ g
d. f : Z+ → Z+
9. For every fully time-constructible function g(n) there is a well-computable function f(n) with ______.
a. g(n) ≤ f (n) ≤ const • g(n)
b. f (n) ≤ g(n) ≤ const f (n)
c. f ≤ g
d. f : Z+ → Z+
Chapter VII
Graph Theory
Aim
The aim of this chapter is to:
Objectives
The objectives of this chapter are to:
Learning outcome
At the end of this chapter, you will be able to:
7.1 Introduction
Conceptually, a graph is formed by vertices and edges connecting the vertices.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges, formed by pairs
of vertices. E is a multiset, in other words, its elements can occur more than once so that every element has a
multiplicity. Often, we label the vertices with letters (for example: a, b, c, . . . or v1, v2, . . . ) or numbers 1, 2, . . .
Throughout this chapter, we will label the elements of V in this way.
Example: we have V = {v1, . . . , v5} for the vertices and E = {(v1, v2), (v2, v5), (v5, v5), (v5, v4), (v5, v4)} for the edges.
Similarly, we often label the edges with letters (for example: a, b, c, . . . or e1, e2, . . . ) or numbers 1, 2, . . . for
simplicity.
Remark: The two edges (u, v) and (v, u) are the same; in other words, the pair is not ordered. Example: labelling
the five edges of the graph above as e1, . . . , e5, we get E = {e1, . . . , e5}.
We have the following terminologies:
• The two vertices u and v are end vertices of the edge (u, v).
• Edges that have the same end vertices are parallel.
• An edge of the form (v, v) is a loop.
• A graph is simple if it has no parallel edges or loops.
• A graph with no edges (i.e. E is empty) is empty.
• A graph with no vertices (i.e. V and E are empty) is a null graph.
• A graph with only one vertex is trivial.
• Edges are adjacent if they share a common end vertex.
• Two vertices u and v are adjacent if they are connected by an edge, in other words, (u, v) is an edge.
• The degree of the vertex v, written as d(v), is the number of edges with v as an end vertex.
• By convention, we count a loop twice and parallel edges contribute separately.
• A pendant vertex is a vertex whose degree is 1.
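These notions can be illustrated with a minimal Python sketch (the text itself uses no code; the edge multiset below is the example E given above, and a loop is counted twice in the degree):

from collections import Counter

V = ["v1", "v2", "v3", "v4", "v5"]
E = [("v1", "v2"), ("v2", "v5"), ("v5", "v5"), ("v5", "v4"), ("v5", "v4")]

degree = Counter()
for u, v in E:
    degree[u] += 1
    degree[v] += 1            # a loop (u == v) is counted twice this way

for v in V:
    print(v, degree[v])       # v1: 1 (pendant), v3: 0 (isolated), v5: 5

# The degrees always sum to twice the number of edges (Theorem 1.1 below).
assert sum(degree[v] for v in V) == 2 * len(E)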
Example:
• v4 and v5 are end vertices of e5.
• e4 and e5 are parallel.
• e3 is a loop.
• The graph is not simple.
• e1 and e2 are adjacent.
• v1 and v2 are adjacent.
• The degree of v1 is 1 so it is a pendant vertex.
• e1 is a pendant edge.
• The degree of v5 is 5.
• The degree of v4 is 2.
• The degree of v3 is 0 so it is an isolated vertex.
The minimum degree of the vertices in a graph G is denoted δ(G) (= 0 if there is an isolated vertex in G).
Similarly, we write ∆(G) as the maximum degree of vertices in G.
Theorem 1.1. The graph G = (V, E), where V = {v1, . . . , vn} and E = {e1, . . . , em}, satisfies
d(v1) + d(v2) + · · · + d(vn) = 2m,
because every edge contributes two to the degree sum (a loop counting twice). Consequently, every graph has an even number of vertices of odd degree.
Proof. If the vertices v1, . . . , vk have odd degrees and the vertices vk+1, . . . , vn have even degrees, then (Theorem 1.1)
d(v1) + · · · + d(vk) = 2m − d(vk+1) − · · · − d(vn)
is even. Since each of the terms d(v1), . . . , d(vk) is odd, their number k must be even.
A simple graph that contains every possible edge between all the vertices is called a complete graph. A complete
graph with n vertices is denoted as Kn. The first four complete graphs are given as examples:
7.2 Walks, Trails, Paths, Circuits, Connectivity, Components
Remark. There are many different variations of the following terminologies. We will adhere to the definitions given
here.
A walk in the graph G is a sequence
vi0 , ej1 , vi1 , ej2 , . . . , ejk , vik
which consists of alternating vertices and edges of G. The walk starts at a vertex. The vertices vit−1 and vit are the end
vertices of ejt (t = 1, . . . , k). vi0 is the initial vertex and vik is the terminal vertex. k is the length of the walk. A
zero-length walk is just a single vertex vi0. It is allowed to visit a vertex or go through an edge more than once. A
walk is open if vi0 ≠ vik. Otherwise it is closed.
The walk v2, e7, v5, e8, v1, e8, v5, e6, v4, e5, v4, e5, v4 is open. On the other hand, the walk v4, e5, v4, e3, v3, e2, v2,
e7, v5, e6, v4 is closed.
A walk is a trail if any edge is traversed at most once. Then, the number of times that the vertex pair u, v can
appear as consecutive vertices in a trail is at most the number of parallel edges connecting u and v.
A trail is a path if any vertex is visited at most once except possibly the initial and terminal vertices when they
are the same. A closed path is a circuit. For simplicity, we will assume in the future that a circuit is not empty, i.e.
its length ≥ 1. We identify the paths and circuits with the subgraphs induced by their edges.
The walk starting at u and ending at v is called a u–v walk. u and v are connected if there is a u–v walk in the graph
(then there is also a u–v path!). If u and v are connected and v and w are connected, then u and w are also connected,
i.e. if there is a u–v walk and a v–w walk, then there is also a u–w walk. A graph is connected if all the vertices
are connected to each other.
(A trivial graph is connected by convention.)
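A compact Python sketch of this connectivity test (breadth-first search from an arbitrary vertex; the adjacency-list graph at the bottom is a hypothetical example, not one from the text):

from collections import deque

def is_connected(adj):
    # adj maps each vertex to the list of its neighbours
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)      # every vertex reached from start?

# Hypothetical example: a path v1-v2-v3 plus an isolated vertex v4.
print(is_connected({"v1": ["v2"], "v2": ["v1", "v3"], "v3": ["v2"], "v4": []}))  # False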
Theorem 1.2. If the graph G has a vertex v that is connected to a vertex of the component G1 of G, then v is also a
vertex of G1.
Proof. Since v is connected to some vertex v′ of G1, there is a walk
v = vi0 , ej1 , vi1 , . . . , ejk , vik = v′.
Since v′ is a vertex of G1, the edge ejk is an edge of G1 and vik−1 is a vertex of G1 (by the definition of a component). We continue
this process and see that v is a vertex of G1.
Example. The complement of the complete graph Kn is the empty graph with n vertices.
Obviously, the complement of the complement of G is G itself. If the graphs G = (V, E) and G′ = (V′, E′) are simple and V′ ⊆ V, then the difference graph is
G − G′ = (V, E′′), where E′′ contains those edges of G that are not in G′ (it is a simple graph).
7.4 Cuts
A vertex v of a graph G is a cut vertex or an articulation vertex of G if the graph G−v consists of a greater number
of components than G.
(Note: the only vertex of a trivial graph is not a cut vertex; neither is an isolated vertex.)
A graph is separable if it is not connected or if there exists at least one cut vertex in the graph. Otherwise, the
graph is nonseparable.
Theorem 1.3. The vertex v is a cut vertex of the connected graph G if and only if there exist two vertices u and w
in the graph G such that:
(i) v ≠ u, v ≠ w and u ≠ w, but
(ii) v is on every u–w path.
Proof: First, let us consider the case that v is a cut vertex of G. Then, G − v is not connected and there are at
least two components G1 = (V1, E1) and G2 = (V2, E2). We choose u ∈ V1 and w ∈ V2. There is a u–w path in G because
G is connected. If v were not on this path, then the path would also be in G − v and u and w would be connected in
G − v, a contradiction. The same reasoning applies to every u–w path in G, so v is on every u–w path.
Conversely, if v is on every u–w path, then the vertices u and w are not connected in G − v, so G − v is not connected and v is a cut vertex.
Theorem 1.8. A nontrivial simple graph has at least two vertices which are not cut vertices.
Theorem 1.9. If F is a cut set of the connected graph G, then G − F has two components.
In a weighted graph, the weight of a path is the sum of the weights of the edges traversed. The labeling of the
vertices (respectively edges) is injective if distinct vertices (respectively edges) have distinct labels. An injective
labeling is bijective if there are as many labels in A (respectively in B) as the number of vertices (respectively
edges).
Example. If A = {0, 1} and B = ℝ, then in the graph, the labeling of the edges (weights) is injective but not the
labeling of the vertices.
The two graphs G1 = (V1, E1) and G2 = (V2, E2) are isomorphic if labeling the vertices of G1 bijectively with the
elements of V2 gives G2. (Note: we have to maintain the multiplicity of the edges.)
A spanning tree of a connected graph is a subtree that includes all the vertices of that graph. If T is a spanning
tree of the graph G, then the corresponding cospanning tree is
G − T =def T∗.
The edges of a spanning tree are called branches and the edges of the corresponding cospanning tree are called
links or chords.
Theorem 1.4. If the graph G has n vertices and m edges, then the following statements are equivalent:
(i) G is a tree.
(ii) There is exactly one path between any two vertices in G and G has no loops.
(iii) G is connected and m = n − 1.
(iv) G is circuitless and m = n − 1.
(v) G is circuitless and if we add any new edge to G, then we will get one and only one circuit.
Proof. (i)⇒(ii): If G is a tree, then it is connected and circuitless. Thus, there are no loops in G. There exists a
path between any two vertices of G. By Theorem 1.6, we know that there is only one such path.
(iii)⇒(iv): Consider the counter hypothesis: there is a circuit in G. Let e be some edge in that circuit. Removing e does not
disconnect G, so there would be a connected graph with n vertices and only n − 2 edges, which is impossible because a connected graph with n vertices has at least n − 1 edges.
(iv)⇒(v): If G is circuitless, then there is at most one path between any two vertices (Theorem 1.6). If G has more
than one component, then we will not get a circuit when we draw an edge between two different components. By
adding edges, we can connect the components without creating circuits. If k ≥ 1 such edges were needed, the resulting
graph would be a connected circuitless graph with
m + k = n − 1
edges, which is impossible because m = n − 1 already. So G is connected. When we add an edge between vertices that are not adjacent, we get one and only one circuit;
otherwise, we could remove an edge from one circuit so that other circuits would not be affected and the graph would stay connected,
in contradiction to (iii)⇒(iv). Similarly, if we add a parallel edge or a loop, we get exactly one circuit.
(v)⇒(i): Consider the counter hypothesis: G is not a tree, i.e. (being circuitless) it is not connected. When we
add an edge between two different components as we did previously, we do not create any circuits, which contradicts (v). Hence G is connected and circuitless, i.e. a tree.
Since spanning trees are trees, Theorem 2.1 is also true for spanning trees.
Let b1, . . . , bn−1 be the branches of the spanning tree T of G. The graph T − bi has two components, T1 and T2, with
corresponding vertex sets V1 and V2. Then ⟨V1, V2⟩ is a cut of G. It is also a cut set of G if we treat it as an edge set, because
G − ⟨V1, V2⟩ has two components. Thus, every branch bi of T has a corresponding cut set Ii. The cut sets I1, . . . , In−1 are also known as fundamental
cut sets and they form a fundamental set of cut sets. Every fundamental cut set includes exactly one branch of T
and every branch of T belongs to exactly one fundamental cut set. Therefore, every spanning tree defines a unique
fundamental set of cut sets for G.
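A short Python sketch of how the edges split into branches and chords while a spanning tree is grown (a union-find construction on a hypothetical small graph; an illustration, not a procedure from the text):

def spanning_tree(vertices, edges):
    # grow a spanning tree greedily; return (branches, chords)
    parent = {v: v for v in vertices}

    def find(v):                      # root of v's current component
        while parent[v] != v:
            v = parent[v]
        return v

    branches, chords = [], []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                  # edge joins two components: a branch
            parent[ru] = rv
            branches.append((u, v))
        else:                         # edge would close a circuit: a chord
            chords.append((u, v))
    return branches, chords

# Hypothetical connected graph on 4 vertices with 5 edges.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
print(spanning_tree(V, E))            # 3 branches (= n - 1) and 2 chords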
Formally, a digraph is a pair (V,E), where V is the vertex set and E is the set of vertex pairs as in ”usual” graphs.
The difference is that now the elements of E are ordered pairs: the arc from vertex u to vertex v is written as (u,
v) and the other pair (v, u) is the opposite direction arc. We also have to keep track of the multiplicity of the arc
(direction of a loop is irrelevant). We can pretty much use the same notions and results for digraphs. However:
• Vertex u is the initial vertex and vertex v is the terminal vertex of the arc (u, v). We also say that the arc is
incident out of u and incident into v.
• The out-degree of the vertex v is the number of arcs out of it (denoted d+(v)) and the in-degree of v is the number
of arcs going into it (denoted d−(v)).
• In the directed walk (trail, path or circuit)
vi0 , ej1 , vi1 , ej2 , . . . , ejk , vik
the vertex viℓ−1 is the initial vertex and viℓ is the terminal vertex of the arc ejℓ.
• When we treat the graph (V,E) as a usual undirected graph, it is the underlying undirected graph of the
digraph G = (V,E), denoted Gu.
• Digraph G is connected if Gu is connected. The components of G are the directed subgraphs of G that
correspond to the components of Gu. The vertices of G are connected if they are connected in Gu. Other
notions for undirected graphs can be used for digraphs as well by dealing with the underlying undirected
graph.
• Vertices u and v are strongly connected if there is a directed u–v path and also a directed v–u path in G.
• Digraph G is strongly connected if every pair of vertices is strongly connected. By convention, the trivial
graph is strongly connected.
• A strongly connected component H of the digraph G is a directed subgraph of G (not a null graph) such that
H is strongly connected, but if we add any vertices or arcs to it, then it is not strongly connected anymore.
Every vertex of the digraph G belongs to one strongly connected component of G. However, an arc does not
necessarily belong to any strongly connected component of G.
A digraph G is quasi-strongly connected if for every pair of vertices u and v there is a vertex w (possibly u or v itself)
from which there is a directed w–u path and a directed w–v path. Quasi-strongly connected digraphs are connected but not necessarily strongly connected. The vertex v of the
digraph G is a root if there is a directed path from v to every other vertex of G.
Theorem 3.1. A digraph has at least one root if and only if it is quasi-strongly connected.
Proof. If there is a root in the digraph, it follows from the definition that the digraph is quasi-strongly connected.
Let us consider a quasi-strongly connected digraph G and show that it must have at least one root. If G is trivial,
then it is obvious. Otherwise, consider the vertex set V = {v1, . . . , vn} of G where n ≥ 2. The following process
shows that there must be a root:
1. Set P ← V.
2. If there is a directed u–v path between two distinct vertices u and v in P, then remove v from P, i.e. set
P ← P − {v}. Repeat this step as many times as possible.
3. If only one vertex is left in P, then it is a root. Otherwise, there are at least two distinct vertices u and v in P
with no directed path between them in either direction. Since G is quasi-strongly connected, there is a vertex w
from which there is a directed w–u path as well as a directed w–v path. Since u is in P, w cannot be in P. Remove u and v
from P and add w, i.e. set P ← (P − {u, v}) ∪ {w}. Go back to step 2 and repeat as many times as possible.
Every time we do this, there are fewer and fewer vertices in P. Eventually, we will get a root because there is a
directed path from some vertex in P to every vertex we removed from P.
The digraph G is a tree if Gu is a tree. It is a directed tree if Gu is a tree and G is quasi-strongly connected, i.e. it has a root. A leaf of a directed tree is a
vertex whose out-degree is zero.
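As a complement to the elimination process above, a root can also be found directly, since a vertex is a root exactly when a depth-first search from it reaches every vertex. A minimal Python sketch (the digraph at the bottom is a hypothetical example):

def reaches_all(adj, start):
    # True if every vertex is reachable from start along the arcs
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

def find_root(adj):
    # return some root of the digraph, or None if there is none
    for v in adj:
        if reaches_all(adj, v):
            return v
    return None

# Hypothetical quasi-strongly connected digraph: arcs 1->2, 1->3, 3->4.
print(find_root({1: [2, 3], 2: [], 3: [4], 4: []}))   # 1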
Theorem 1.5 In an acyclic digraph, there exist at least one source (a vertex whose in-degree is zero) and at least
one sink (a vertex whose out-degree is zero).
Proof. Let G be an acyclic digraph. If G has no arcs, then the claim is obvious. Otherwise, let us consider a directed path
vi0 , ej1 , vi1 , ej2 , . . . , ejk , vik
of maximum length k. Since G is acyclic, vi0 ≠ vik. If (v, vi0) were an arc, then one of the following would be true:
• v = vit for some value of t. We choose the smallest such t. Then t > 0 because there are no loops in G, and
vi0 , ej1 , vi1 , ej2 , . . . , ejt , vit , (v, vi0), vi0
would be a directed circuit, which is impossible in an acyclic digraph.
• v ≠ vit for every t. Then the directed path v, (v, vi0), vi0 , ej1 , . . . , ejk , vik would be longer than the maximum
length k, a contradiction.
Hence, d−(vi0) = 0, i.e. vi0 is a source. Using a similar technique, we can show that d+(vik) = 0, i.e. vik is a sink.
If G = (V, E) is a digraph with n vertices, then a labeling of the vertices with an injective function α : V → {1, . . . ,
n} which satisfies the condition α(u) < α(v) whenever (u, v) is an arc in G is known as a topological sorting.
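A compact Python sketch of computing such a sorting for an acyclic digraph (it uses the standard device of repeatedly removing sources, i.e. Kahn's algorithm, which the text does not spell out; the digraph at the bottom is a hypothetical example):

from collections import deque

def topological_sort(adj):
    # return the vertices so that every arc (u, v) goes forward in the order
    indeg = {v: 0 for v in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    queue = deque(v for v in adj if indeg[v] == 0)   # current sources
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(adj):
        raise ValueError("the digraph contains a directed circuit")
    return order

# Hypothetical acyclic digraph with arcs a->b, a->c, b->d, c->d.
print(topological_sort({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}))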
Exercises 7
1. A graph is a pair of sets (V, E), where V is the set of ______ and E is the set of ______, formed by
pairs of vertices.
a. variants, end-points
b. vertices, edges
c. edges, multiset
d. labels, numbers
2. A simple graph that contains every possible edge between all the vertices is called a ______.
a. sub-graph
b. isolated graph
c. complete graph
d. pendant graph
7. A complete graph with n vertices is denoted as ______.
a. G = (V, E)
b. Kn
c. ∆(G)
d. Gn
8. A trail is a ______ if any vertex is visited at most once except possibly the initial and terminal vertices
when they are the same.
a. walk
b. path
c. edge
d. circuit
10. A vertex v of a graph G is a ______ or an articulation vertex of G if the graph G − v consists of a greater
number of components than G.
a. connected vertex
b. equivalent vertex
c. cut vertex
d. labeled vertex
Chapter VIII
Brute Force
Aim
Objectives
The objectives of this chapter are to:
Learning outcome
At the end of this chapter, you will be able to:
8.1 Introduction
Brute force is a straightforward approach to problem solving, usually directly based on the problem’s statement
and definitions of the concepts involved. Though rarely a source of clever or efficient algorithms, the brute-force
approach should not be overlooked as an important algorithm design strategy. Unlike some of the other
strategies, brute force is applicable to a very wide variety of problems.
For some important problems (e.g., sorting, searching, string matching), the brute-force approach yields
reasonable algorithms of at least some practical value with no limitation on instance size. Even if too inefficient in
general, a brute-force algorithm can still be useful for solving small-size instances of a problem, and it can serve
an important theoretical or educational purpose.
Sorting problem (brute-force approach): given a list of n orderable items (e.g., numbers, characters from some
alphabet, character strings), rearrange them in non-decreasing order.
Selection Sort
ALGORITHM SelectionSort(A[0..n − 1])
//Sorts a given array by selection sort
//Input: an array A[0..n − 1] of orderable elements
//Output: array A[0..n − 1] sorted in non-decreasing order
for i ← 0 to n − 2 do
    min ← i
    for j ← i + 1 to n − 1 do
        if A[j] < A[min] min ← j
    swap A[i] and A[min]
Example:
| 89 45 68 90 29 34 17
17 | 45 68 90 29 34 89
17 29 | 68 90 45 34 89
17 29 34 | 90 45 68 89
17 29 34 45 | 90 68 89
17 29 34 45 68 | 90 89
17 29 34 45 68 89 | 90
Selection sort operation on the list 89, 45, 68, 90, 29, 34, 17. Each line corresponds to one iteration of the
algorithm, i.e., a pass through the list tail to the right of the vertical bar; an element in bold indicates the smallest
element found. Elements to the left of the vertical bar are in their final positions and are not considered in this or
subsequent iterations.
Performance analysis of the selection sort algorithm: the input's size is given by the number of elements n.
The algorithm's basic operation is the key comparison A[j] < A[min]. The number of times it is executed depends
only on the array's size and is given by
C(n) = Σ(i = 0..n−2) Σ(j = i+1..n−1) 1 = Σ(i = 0..n−2) (n − 1 − i) = n(n − 1)/2.
Thus, selection sort is an O(n²) algorithm on all inputs. The number of key swaps is only O(n) or, more precisely,
n − 1 (one for each repetition of the i loop). This property distinguishes selection sort positively from many other
sorting algorithms.
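The pseudocode above translates directly into Python; the following is a small sketch (not part of the original text) that reproduces the trace shown earlier:

def selection_sort(a):
    # repeatedly select the smallest element of the unsorted tail
    n = len(a)
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):
            if a[j] < a[min_idx]:
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]   # one swap per pass, n - 1 in total
    return a

print(selection_sort([89, 45, 68, 90, 29, 34, 17]))
# [17, 29, 34, 45, 68, 89, 90]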
ALGORITHM BubbleSort(A[0..n − 1])
for i ← 0 to n − 2 do
    for j ← 0 to n − 2 − i do
        if A[j + 1] < A[j] swap A[j] and A[j + 1]
89 45 68 90 29 34 17
45 89 68 90 29 34 17
45 68 89 90 29 34 17
45 68 89 29 90 34 17
45 68 89 29 34 90 17
45 68 89 29 34 17 90
45 68 89 29 34 17 | 90
45 68 29 89 34 17 | 90
45 68 29 34 89 17 | 90
45 68 29 34 17 | 89 90
etc.
The first 2 passes of bubble sort on the list 89, 45, 68, 90, 29, 34, 17. A new line is shown after a swap of two
elements is done. The elements to the right of the vertical bar are in their final positions and are not considered in
subsequent iterations of the algorithm.
8.3 Bubble Sort – The Analysis
Clearly, the outer loop runs n times. The only complexity in this analysis is the inner loop. If we think about a
single time the inner loop runs, we can get a simple bound by noting that it can never loop more than n times.
Since the outer loop makes the inner loop run n times, the comparison cannot happen more than O(n²)
times.
The number of key comparisons for the bubble sort version given above is the same for all arrays of size n.
The number of key swaps depends on the input. For the worst case of decreasing arrays, it is the same as the
number of key comparisons.
Observation: if a pass through the list makes no exchanges, the list has been sorted and we can stop the algorithm.
Though this improved version runs faster on some inputs, it is still O(n²) in the worst and average cases. Bubble sort
is not very good for big sets of input. However, bubble sort is very simple to code.
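A small Python sketch (not from the text) of the improved version implied by the observation, stopping as soon as a pass makes no exchanges:

def bubble_sort(a):
    # compare adjacent elements and exchange them if they are out of order
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j + 1] < a[j]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:               # a full pass with no exchange: sorted
            break
    return a

print(bubble_sort([89, 45, 68, 90, 29, 34, 17]))
# [17, 29, 34, 45, 68, 89, 90]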
Sequential Search
ALGORITHM SequentialSearch2(A[0..n], K)
//Searches for K in A[0..n − 1], using the extra cell A[n] as a sentinel
A[n] ← K
i ← 0
while A[i] ≠ K do
    i ← i + 1
if i < n return i
else return −1
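The same sentinel idea in Python (a sketch; the key is appended to a copy of the list so the loop needs no explicit bounds check):

def sequential_search_sentinel(a, key):
    # return the index of key in a, or -1 if it is absent
    a = a + [key]                 # the appended key acts as the sentinel
    i = 0
    while a[i] != key:
        i += 1
    return i if i < len(a) - 1 else -1

print(sequential_search_sentinel([9, 4, 7, 1], 7))   # 2
print(sequential_search_sentinel([9, 4, 7, 1], 5))   # -1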
[Figure: the pattern P = p0 . . . pj . . . pm−1 is aligned against successive positions of the text and compared character by character.]
1. Pattern: 001011
Text: 10010101101001100101111010
2. Pattern: happy
Text: It is never too late to have a happy Childhood
The algorithm shifts the pattern almost always after a single character comparison. In the worst case, however, the
algorithm may have to make all m comparisons before shifting the pattern, and this can happen for each of the n −
m + 1 tries. Thus, in the worst case, the algorithm is in Θ(nm).
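A minimal Python sketch of this brute-force matcher (shift the pattern by one position after every mismatch), tried on the first example above:

def brute_force_match(text, pattern):
    # return the index of the first occurrence of pattern in text, or -1
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):        # n - m + 1 possible alignments
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:                    # all m characters matched
            return i
    return -1

print(brute_force_match("10010101101001100101111010", "001011"))   # 15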
//Brute-force closest pair: find the two closest of n points (x1, y1), . . . , (xn, yn)
dmin ← ∞
for i ← 1 to n − 1 do
    for j ← i + 1 to n do
        d ← sqrt((xi − xj)² + (yi − yj)²)
        if d < dmin
            dmin ← d; index1 ← i; index2 ← j
return index1, index2
Exercises 8
1. The number of key swaps depends on the .
a. input
b. output
c. string
d. pattern
2. Which of the following compares the adjacent elements of the list and exchange them if they are out of
order?
a. Brute force algorithm
b. Quick sort
c. Bubble sort
d. Knapsack algorithm
6. Brute force is a ______ approach to problem solving, usually directly based on the problem's statement
and definitions of the concepts involved.
a. Dynamic
b. Static
c. Straightforward
d. Recursive
7. Which of the following sorting algorithms is not very good for a big set of input?
a. Quick sort
b. Selection sort
c. Merge sort
d. Bubble sort
8. Which of the following is the only complexity in the bubble sort analysis?
a. Inner loop
b. Outer loop
c. Number of elements in array
d. None of the above
Application I
Using the Quicksort algorithm, sort the given numbers step by step:
3 1 4 5 9 2 6 8 7
Solution:
The basic concept is to pick one of the elements in the array as a pivot value around which the other elements
will be rearranged. Everything less than the pivot is moved left of the pivot (into the left partition). Similarly,
everything greater than the pivot goes into the right partition. At this point, each partition is recursively quick
sorted.
The Quicksort algorithm is fastest when the median of the array is chosen as the pivot value. That is because the
resulting partitions are of very similar size. Each partition splits itself in two and thus the base case is reached
very quickly.
In practice, the Quicksort algorithm becomes very slow when the array passed to it is already close to being
sorted. Because there is no efficient way for the computer to find the median element to use as the pivot, the first
element of the array is often used as the pivot. So when the array is almost sorted, Quicksort doesn't partition it equally;
instead, the partitions are lopsided. This means that one of the recursion branches is
much deeper than the other, which causes execution time to go up. Thus, it is said that the more random the
arrangement of the array, the faster the Quicksort algorithm finishes.
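A compact Python sketch of the scheme just described, with the first element used as the pivot (a simplified, non-in-place version written only for illustration), applied to the given numbers:

def quicksort(a):
    # partition around the first element, then sort both partitions recursively
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]      # everything less than the pivot
    right = [x for x in rest if x >= pivot]    # everything greater or equal
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([3, 1, 4, 5, 9, 2, 6, 8, 7]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]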
Application II
For the given graph, the source vertex is 1 and the destination vertex is 7. Find the shortest path for this graph
from 1 to 7 and also determine the path length using Dijkstra's algorithm.
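Since the graph of this exercise appears only in a figure that is not reproduced here, the sketch below shows Dijkstra's algorithm itself in Python on a hypothetical weighted digraph with vertices 1–7; the exercise's own edge weights would have to be substituted:

import heapq

def dijkstra(adj, source):
    # return the shortest-path distance from source to every vertex
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                  # stale heap entry, skip it
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical weighted graph on vertices 1..7: lists of (neighbour, weight).
adj = {1: [(2, 2), (3, 4)], 2: [(4, 7), (3, 1)], 3: [(5, 3)],
       4: [(7, 1)], 5: [(4, 2), (6, 5)], 6: [(7, 2)], 7: []}
print(dijkstra(adj, 1))   # here dist[7] == 9, along the path 1-2-3-5-4-7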
Application III
Traverse the graph shown below using DFS, starting from the vertex numbered 1.
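Here too the graph exists only in the omitted figure, so the following Python sketch shows a depth-first traversal from vertex 1 on a hypothetical adjacency list:

def dfs(adj, start):
    # return the vertices in the order they are first visited
    order, visited = [], set()

    def visit(u):
        visited.add(u)
        order.append(u)
        for v in adj[u]:              # neighbours are tried in the listed order
            if v not in visited:
                visit(v)

    visit(start)
    return order

# Hypothetical graph on vertices 1..6.
adj = {1: [2, 4], 2: [1, 3], 3: [2], 4: [1, 5, 6], 5: [4], 6: [4]}
print(dfs(adj, 1))   # [1, 2, 3, 4, 5, 6]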