Csitnepal: UNIT 1: Principles of Analyzing Algorithms and Problems

1) An algorithm is a set of computational instructions that takes input values and produces output values in finite time. Algorithms are independent of programming languages and machines.
2) Examples of algorithmic problems include finding the greatest common divisor of two numbers and sorting an array of numbers. Understanding algorithms is important for computer science concepts and for analyzing the input-output relationship of problems.
3) Algorithm analysis considers properties like correctness, running time, and space usage. Running time is typically analyzed using asymptotic notation, such as Big O notation, which provides upper bounds, and Omega notation, which provides lower bounds.

Uploaded by Kiran Khanal

DAA New Summit College (B.Sc. CSIT)

UNIT: 1
Principles of Analyzing Algorithms and Problems
An algorithm is a finite set of computational instructions, each executable in finite
time, that performs computation or problem solving by taking some value, or set of values, as input
and producing some value, or set of values, as output. Algorithms are not dependent on a particular
machine, programming language, or compiler, i.e. an algorithm runs in the same manner everywhere. So
an algorithm is a mathematical object, where algorithms are assumed to run on a machine
with unlimited capacity.

Examples of problems
• You are given two numbers; how do you find their greatest common divisor?
• Given an array of numbers, how do you sort them?
We need algorithms to understand the basic concepts of computer science and programming,
where the computations are done. To understand the input-output relation of a problem, we
must be able to understand the steps involved in getting the output(s) from the given input(s).

You need design concepts for algorithms because if you only study existing algorithms, then you
are bound to those algorithms and to selection among the available algorithms. However, if you have
knowledge about design, then you can attempt to improve performance using different design
principles.

The analysis of algorithms gives good insight into the algorithms under study. Analysis of
algorithms tries to answer a few questions, like: is the algorithm correct? i.e., does the
algorithm generate the required result? Does the algorithm terminate for all the inputs
in the problem domain? Other issues of analysis are efficiency, optimality, etc. By knowing the
different aspects of different algorithms on a similar problem domain, we can choose the better
algorithm for our needs. This is done by knowing the resources needed by an algorithm for its
execution. The two most important resources are time and space. Both resources are measured
in terms of complexity; for time, instead of absolute time, we consider the growth of running time with input size.

Algorithm Properties
Input(s)/output(s): There must be some inputs from the standard set of inputs, and an
algorithm's execution must produce output(s).
Definiteness: Each step must be clear and unambiguous.
Finiteness: An algorithm must terminate after a finite number of steps.
Correctness: A correct set of output values must be produced from each set of inputs.
Effectiveness: Each step must be carried out in finite time.
Here we deal with correctness and finiteness.

Random Access Machine Model


This RAM model is the base model for our study of design and analysis of algorithms, allowing
design and analysis in a machine-independent scenario. In this model each basic operation (+, -)
takes one step, loops and subroutines are not basic operations, and each memory reference takes
one step. We measure the run time of an algorithm by counting the steps.


By Bhupendra Saud
Source: www.csitnepal.com
Best, Worst and Average case
Best-case complexity gives a lower bound on the running time of the algorithm for any
instance of input(s). This indicates that the algorithm can never have a lower running time
than the best case for a particular class of problems.

Worst-case complexity gives an upper bound on the running time of the algorithm for all the
instances of the input(s). This ensures that no input can exceed the running time limit
posed by the worst-case complexity.

Average-case complexity gives the average number of steps required over the instances of the
input(s).

In our study we concentrate on worst case complexity only.

Example 1: Fibonacci Numbers


Input: n
Output: nth Fibonacci number.
Algorithm: assume a as first(previous) and b as second(current) numbers
fib(n)
{
a = 0, b= 1, f=1 ;
for(i = 2 ; i <=n ; i++)
{
f = a+b ;
a=b ;
b=f ;
}
return f ;
}

Efficiency
Time Complexity: The algorithm above iterates up to n-2 times, so time complexity is
O(n).

Space Complexity: The space complexity is constant i.e. O(1).

Mathematical Foundation
Since mathematics can provide a clear view of an algorithm, understanding mathematical
concepts aids in the design and analysis of good algorithms. Here we present some of the
mathematical concepts that are helpful in our study.
Exponents
Some of the formulas that are helpful are:

x^a · x^b = x^(a+b)
x^a / x^b = x^(a-b)
(x^a)^b = x^(ab)
x^n + x^n = 2x^n
2^n + 2^n = 2^(n+1)
Logarithms
Some of the formulas that are helpful are:
1. log_a b = log_c b / log_c a ; c > 0
2. log(ab) = log a + log b
3. log(a/b) = log a - log b
4. log(a^b) = b log a
5. log x < x for all x > 0
6. log 1 = 0, log 2 = 1, log 1024 = 10 (base 2 throughout)
7. a^(log_b n) = n^(log_b a)
Series

Asymptotic Notation
Exact complexity analysis of an algorithm is very hard. We know that the complexity
(worst, best, or average) of an algorithm is a mathematical function of the size of the
input. So if we analyze the algorithm in terms of bounds (upper and lower), it becomes easier.
For this purpose we need the concept of asymptotic notations. The figure below gives the upper-
and lower-bound concept.
Big Oh (O) notation
When we have only an asymptotic upper bound, we use O notation. A function f(x) = O(g(x))
(read as f(x) is big-oh of g(x)) iff there exist two positive constants c and x0 such that for all x >=
x0, 0 <= f(x) <= c·g(x).
The above relation says that g(x) is an upper bound of f(x).
Some properties:
Transitivity: f(x) = O(g(x)) and g(x) = O(h(x)) ⇒ f(x) = O(h(x))
Reflexivity: f(x) = O(f(x))


O(1) is used to denote constants.

For all values of n >= n0, the plot shows clearly that f(n) lies below or on the curve of c·g(n).
Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = O(g(n)).
Proof: let us choose c and n0 values as 14 and 1, respectively; then we have
f(n) <= c·g(n), n >= n0, as
3n^2 + 4n + 7 <= 14n^2 for all n >= 1.
The above inequality is trivially true.
Hence f(n) = O(g(n)).
2. Prove that n log(n^3) is O(√(n^3)).
Proof: we have n log(n^3) = 3n log n.
Again, √(n^3) = n√n.
If we can prove log n = O(√n) then the problem is solved,
because then 3n log n = n · O(√n) = O(n√n).
We can use the fact that log^a n is O(n^b) for all a, b > 0.
In our problem a = 1 and b = 1/2,
hence log n = O(√n).
So by knowing log n = O(√n) we have proved that
n log(n^3) = O(√(n^3)).

Big Omega (Ω) notation

Big omega notation gives an asymptotic lower bound. A function f(x) = Ω(g(x)) (read as f(x) is big
omega of g(x)) iff there exist two positive constants c and x0 such that for all x >= x0, 0 <= c·g(x)
<= f(x).
The above relation says that g(x) is a lower bound of f(x).
Some properties:
Transitivity: f(x) = Ω(g(x)) and g(x) = Ω(h(x)) ⇒ f(x) = Ω(h(x))
Reflexivity: f(x) = Ω(f(x))

Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Ω(g(n)).
Proof: let us choose c and n0 values as 1 and 1, respectively; then we have
f(n) >= c·g(n), n >= n0, as
3n^2 + 4n + 7 >= 1·n^2 for all n >= 1.
The above inequality is trivially true.
Hence f(n) = Ω(g(n)).

Big Theta (Θ) notation

When we need an asymptotically tight bound, we use Θ notation. A function f(x) = Θ(g(x)) (read as
f(x) is big theta of g(x)) iff there exist three positive constants c1, c2 and x0 such that for all x >=
x0, c1·g(x) <= f(x) <= c2·g(x).
The above relation says that f(x) is of the order of g(x).
Some properties:
Transitivity: f(x) = Θ(g(x)) and g(x) = Θ(h(x)) ⇒ f(x) = Θ(h(x))
Reflexivity: f(x) = Θ(f(x))
Symmetry: f(x) = Θ(g(x)) iff g(x) = Θ(f(x))

Example
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Θ(g(n)).
Proof: let us choose c1, c2 and n0 values as 14, 1 and 1, respectively; then we have
f(n) <= c1·g(n), n >= n0, as 3n^2 + 4n + 7 <= 14n^2, and
f(n) >= c2·g(n), n >= n0, as 3n^2 + 4n + 7 >= 1·n^2,
for all n >= 1 (in both cases).
So c2·g(n) <= f(n) <= c1·g(n) is trivial.
Hence f(n) = Θ(g(n)).

Recurrences
• Recursive algorithms are described by using recurrence relations.
• A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
For Example:
Recursive algorithm for finding factorial
T(n)=1 when n =1
T(n)=T(n-1) + O(1) when n>1
Recursive algorithm for finding Nth Fibonacci number
T(1)=1 when n=1
T(2)=1 when n=2
T(n)=T(n-1) + T(n-2) +O(1) when n>2
Recursive algorithm for binary search
T(1)=1 when n=1
T(n)=T(n/2) + O(1) when n>1

Techniques for Solving Recurrences


We’ll use five techniques:
• Iteration method
• Recursion Tree
• Substitution
• Master Method – for divide & conquer
• Characteristic Equation – for linear

Iteration method
• Expand the relation until a summation independent of n is obtained.
• Bound the summation.
e.g.
T(n)= 2T(n/2) +1 when n>1
T(n)= 1 when n=1
T(n) = 2T(n/2) + 1
     = 2{ 2T(n/4) + 1 } + 1
     = 4T(n/4) + 2 + 1
     = 4{ T(n/8) + 1 } + 2 + 1
     = 8T(n/8) + 4 + 2 + 1
………………………
     = 2^k T(n/2^k) + 2^(k-1) + ………………… + 4 + 2 + 1
For simplicity assume:
n = 2^k
 k = logn
 T(n) = 2^k + 2^(k-1) + ……………………….. + 2^2 + 2^1 + 2^0
 T(n) = (2^(k+1) - 1)/(2 - 1)
 T(n) = 2^(k+1) - 1
 T(n) = 2·2^k - 1
 T(n) = 2n - 1
 T(n) = O(n)

Second Example:
T(n) = T(n/3) + O(n) when n>1
T(n) = 1 when n=1

T(n) = T(n/3) + O(n)

 T(n) = T(n/3^2) + O(n/3) + O(n)
 T(n) = T(n/3^3) + O(n/3^2) + O(n/3) + O(n)
 T(n) = T(n/3^4) + O(n/3^3) + O(n/3^2) + O(n/3) + O(n)
 T(n) = T(n/3^k) + O(n/3^(k-1)) + ……………… + O(n/3) + O(n)
For simplicity assume
n = 3^k
 k = log3 n
 T(n) <= T(1) + c·n/3^(k-1) + ……………… + c·n/3^2 + c·n/3 + c·n
 T(n) <= 1 + c·n{ 1/(1 - 1/3) }      (bounding by the infinite geometric series)
 T(n) <= 1 + (3/2)·c·n
 T(n) = O(n)

Substitution Method
Takes two steps:
1. Guess the form of the solution, using unknown constants.
2. Use induction to find the constants & verify the solution.

Completely dependent on making reasonable guesses


Consider the example:
T(n) = 1                n = 1
T(n) = 4T(n/2) + n      n > 1
Guess: T(n) = O(n^3).
More specifically:
T(n) ≤ cn^3, for all large enough n.

Prove by strong induction on n.

Assume: T(k) ≤ ck^3 for ∀k < n.
Show: T(n) ≤ cn^3 for ∀n > n0.

Base case, n = 1:
T(n) = 1                        Definition.
1 ≤ c                           Choose large enough c for conclusion.
Inductive case, n > 1:
T(n) = 4T(n/2) + n              Definition.
     ≤ 4c·(n/2)^3 + n           Induction.
     = c/2 · n^3 + n            Algebra.

While this is O(n^3), we're not done.

Need to show c/2 · n^3 + n ≤ c·n^3.
Fortunately, the constant factor is shrinking, not growing.
T(n) ≤ c/2 · n^3 + n            From before.
     = cn^3 - (c/2 · n^3 - n)   Algebra.
     ≤ cn^3                     Since n > 0, if c ≥ 2.
Proved:
T(n) ≤ 2n^3 for ∀n > 0
Thus, T(n) = O(n^3).

Second Example
T(n) = 1                n = 1
T(n) = 4T(n/2) + n      n > 1
Guess: T(n) = O(n^2).
Same recurrence, but now try a tighter bound.
More specifically:
T(n) ≤ cn^2 for ∀n > n0.

Assume T(k) ≤ ck^2, for ∀k < n.

T(n) = 4T(n/2) + n
     ≤ 4c·(n/2)^2 + n
     = cn^2 + n

Not ≤ cn^2!
The problem is that the constant isn't shrinking.
Solution: use a tighter guess and inductive hypothesis.
Subtract a lower-order term – a common technique.
Guess:
T(n) ≤ cn^2 - dn for ∀n > 0
Assume T(k) ≤ ck^2 - dk, for ∀k < n. Show T(n) ≤ cn^2 - dn.
Base case, n=1
T(n) = 1 Definition.
1 ≤ c - d       Choosing c, d appropriately.

Inductive case, n>1:


T(n) = 4T(n/2) + n                  Definition.
     ≤ 4(c(n/2)^2 - d(n/2)) + n     Induction.
     = cn^2 - 2dn + n               Algebra.
     = cn^2 - dn - (dn - n)         Algebra.
     ≤ cn^2 - dn                    Choosing d ≥ 1.
T(n) ≤ 2n^2 - dn for ∀n > 0
Thus, T(n) = O(n^2).
Ability to guess effectively comes with experience.

Changing Variables:
Sometimes a little algebraic manipulation can make an unknown recurrence similar to one we have
seen.
Consider the example
T(n) = 2T(n^(1/2)) + logn
Looks difficult; rearrange it:
Let m = logn => n = 2^m
Thus,
T(2^m) = 2T(2^(m/2)) + m
Again let
S(m) = T(2^m), so that S(m) = 2S(m/2) + m
We can show that
S(m) = O(mlogm)
 T(n) = T(2^m) = S(m) = O(mlogm) = O(logn · loglogn)

Recursion Tree
Just a simplification of the iteration method:
Consider the recurrence
T(1)=1 when n=1
T(n)= T(n/2)+ 1 when n>1
[Recursion tree: a single chain T(n) → T(n/2) → T(n/4) → … → T(n/2^k), with cost 1 at each level.]

Cost at each level = 1
For simplicity assume that n = 2^k
 k = logn
Summing the cost at each level,
Total cost = 1 + 1 + 1 + …………………. up to logn terms


 complexity = O(logn)
Second Example

T(n) = 1                n = 1
T(n) = 4T(n/2) + n      n > 1

[Recursion tree: the root costs n and has 4 children of size n/2; level i has 4^i nodes, each
costing n/2^i, so the cost at level i is 4^i · n/2^i = 2^i·n; the bottom level at depth k
contributes 2^k·n.]

Assume: n = 2^k
 k = logn
T(n) = n + 2n + 4n + … + 2^(k-1)·n + 2^k·n
     = n(1 + 2 + 4 + … + 2^(k-1) + 2^k)
     = n(2^(k+1) - 1)/(2 - 1)
     = n(2^(k+1) - 1)
     ≤ n·2^(k+1)
     = 2n·2^k
     = 2n·n
     = O(n^2)
Third Example
Solve T(n) = T(n/4) + T(n/2) + n^2

[Recursion tree: the root costs n^2; its children cost n^2/16 and n^2/4; the next level costs
n^2/16^2, n^2/8^2, n^2/8^2 and n^2/4^2; and so on down to cost-1 leaves. Each level sums to at
most 5/16 of the level above.]

Total cost ≤ n^2 + (5/16)n^2 + (5/16)^2 n^2 + (5/16)^3 n^2 + …………………… + (5/16)^k n^2

{ Why ≤ and not =? Because the T(n/4) subtree bottoms out sooner than the T(n/2) subtree, the
lower levels of the tree are missing some nodes, so the geometric level sums are only an upper
bound. }

= n^2 (1 + 5/16 + 5^2/16^2 + 5^3/16^3 + …………………… + 5^k/16^k)
≤ n^2 · 1/(1 - 5/16)
= n^2 · constant
= O(n^2)

Master Method
Cookbook solution for some recurrences of the form
T(n) = a·T(n/b) + f(n)
where a ≥ 1, b > 1, and f(n) is asymptotically positive.
Its three cases are described below.

Master Method Case 1

T(n) = a·T(n/b) + f(n)

f(n) = O(n^(logb a - ε)) for some ε > 0  →  T(n) = Θ(n^(logb a))

Example: T(n) = 7T(n/2) + cn^2, so a = 7, b = 2.

Here f(n) = cn^2 and n^(logb a) = n^(log2 7) ≈ n^2.8.
 cn^2 = O(n^(logb a - ε)), for any ε ≤ 0.8.

T(n) = Θ(n^(log2 7)) = Θ(n^2.8)

Master Method Case 2

T(n) = a·T(n/b) + f(n)

f(n) = Θ(n^(logb a))  →  T(n) = Θ(n^(logb a) · lg n)
Example: T(n) = 2T(n/2) + cn, so a = 2, b = 2.
Here f(n) = cn and n^(logb a) = n.
 f(n) = Θ(n^(logb a))
T(n) = Θ(n lg n)

Master Method Case 3

T(n) = a·T(n/b) + f(n)

f(n) = Ω(n^(logb a + ε)) for some ε > 0, and
a·f(n/b) ≤ c·f(n) for some c < 1 and all large enough n  →  T(n) = Θ(f(n))

Example: T(n) = 4T(n/2) + n^3, so a = 4, b = 2.

n^3 = Ω(n^(logb a + ε)) = Ω(n^(log2 4 + ε)) = Ω(n^(2 + ε)) for any ε ≤ 1.

Again, 4·(n/2)^3 = (1/2)·n^3 ≤ c·n^3, for any c with 1/2 ≤ c < 1.

T(n) = Θ(n^3)

Master Method Case 4 (when the master method does not apply)

T(n) = a·T(n/b) + f(n)
If none of the previous cases apply, the master method doesn't help.
Example: T(n) = 4T(n/2) + n^2/lg n, so a = 4, b = 2.

Case 1? Is n^2/lg n = O(n^(logb a - ε)) = O(n^(log2 4 - ε)) = O(n^(2 - ε)) = O(n^2/n^ε)?
No, since lg n is asymptotically less than n^ε;
thus, n^2/lg n is asymptotically greater than n^2/n^ε.
Case 2? Is n^2/lg n = Θ(n^(logb a)) = Θ(n^(log2 4)) = Θ(n^2)?
No.
Case 3? Is n^2/lg n = Ω(n^(logb a + ε)) = Ω(n^(log2 4 + ε)) = Ω(n^(2 + ε))?
No, since 1/lg n is asymptotically less than n^ε.



Exercises
 Show that the solution of T(n) = 2T(n/2) + n is Ω(nlogn). Conclude that the solution is Θ(nlogn).
 Show that the solution to T(n) = 2T(n/2 + 17) + n is O(nlogn).
 Write a recursive Fibonacci number algorithm, derive the recurrence relation for it, and solve it by the
substitution method. {Guess 2^n}
 Argue that the solution to the recurrence T(n) = T(n/3) + T(2n/3) + n is Ω(nlogn) by
appealing to a recursion tree.
 Use iteration to solve the recurrence T(n) = T(n-a) + T(a) + n, where a >= 1 is a constant.
 The running time of an algorithm A is described by the recurrence T(n) = 7T(n/2) + n^2. A
competing algorithm A’ has a running time of T’(n) = aT’(n/4) + n^2. What is the largest
integer value for ‘a’ such that A’ is asymptotically faster than A?

Review of Data Structures


This part introduces some data structures; for a rigorous study, you can consult a
book on data structures.
Simple Data Structures
The basic structures to represent unit value types are bits, integers, floating-point numbers, etc. A
collection of values of basic types can be represented by arrays, structures, etc. Access to the
values is done in constant time for these kinds of data structures.
Linear Data Structures
Linear data structures are widely used; we quickly go through the following linear
data structures.
Lists
A list is the simplest general-purpose data structure. Lists come in different varieties. The most
fundamental representation of a list is an array; the other representations are linked
lists, which themselves come in varieties such as singly linked, doubly
linked, circular, etc. There is a mechanism (a pointer) to point to the first element, and to
traverse there is a mechanism of pointing to the next (also the previous, in a doubly linked list) element. Lists
require linear space to collect and store the elements, where linearity is proportional to the number
of items. For example, to store n items in an array, nd space is required, where d is the size of a data item. A singly
linked list takes n(d + p) space, where p is the size of a pointer. Similarly, for a doubly linked list the space
requirement is n(d + 2p).
Array representation
 Operations have simple implementations.
 Insert, delete, and search require linear time; search can take O(logn) if
binary search is used. To use binary search, the array must be sorted.
 Inefficient use of space.
Singly linked representation (unordered)
1. Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise O(n)
time.
2. Search and traversing can be done in O(n) time
3. Memory overhead, but allocated only to entries that are present.
Doubly linked representation
4. Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise O(n)
time.
5. Search and traversing can be done in O(n) time
6. Memory overhead, but allocated only to entries that are present, search becomes easy.
• boolean isEmpty ();
Return true if and only if this list is empty.
• int size ();
Return this list’s length.
• int get (int i);
Return the element with index i in this list.
• boolean equals (List a, List b);
Return true if and only if the two lists have the same length and each corresponding pair of elements is equal.
• void clear ();
Make this list empty.
• void set (int i, int elem);
Replace by elem the element at index i in this list.
• void add (int i, int elem);
Add elem as the element with index i in this list.
• void add (int elem);
Add elem after the last element of this list.
• void addAll (List a, List b);
Add all the elements of list b after the last element of list a.
• int remove (int i);
Remove and return the element with index i in this list.
• void visit (List a);
Print all elements of the list.

Operation        Array representation    SLL representation
get              O(1)                    O(n)
set              O(1)                    O(n)
add(int,data)    O(n)                    O(n)
add(data)        O(1)                    O(1)
remove           O(n)                    O(n)
equals           O(n^2)                  O(n^2)
addAll           O(n^2)                  O(n^2)

Stacks and Queues


These data structures are special cases of lists. A stack is also called a LIFO (Last In First Out)
list. In this structure items can be added or removed from only one end. Stacks are generally
represented either in an array or in a singly linked list; in both cases insertion/deletion time is O(1),
but search time is O(n).
Operations on stacks
 boolean isEmpty ();
Return true if and only if this stack is empty. Complexity is O(1).
 int getLast ();
Return the element at the top of this stack. Complexity is O(1).
 void clear ();
Make this stack empty. Complexity is O(1).
 void push (int elem);
Add elem as the top element of this stack. Complexity is O(1).
 int pop ();
Remove and return the element at the top of this stack. Complexity is O(1).

Queues are also like stacks, but they implement a FIFO (First In First Out) policy: one end is for
insertion and the other is for deletion. They are represented mostly circularly in an array for O(1)
insertion/deletion time. A circular singly linked representation takes O(1) insertion time and O(1)
deletion time. Similarly, representing queues in a doubly linked list gives O(1) insertion and deletion
time.
Operations on queues
3. boolean isEmpty ();
Return true if and only if this queue is empty. Complexity is O(1).
4. int size ();
Return this queue’s length. Complexity is O(n).
5. int getFirst ();
Return the element at the front of this queue. Complexity is O(1).
6. void clear ();
Make this queue empty. Complexity is O(1).
7. void insert (int elem);
Add elem as the rear element of this queue. Complexity is O(1).
8. int delete ();
Remove and return the front element of this queue. Complexity is O(1).

Tree Data Structures


A tree is a collection of nodes. If the collection is empty, the tree is empty; otherwise it contains a
distinct node called the root (r) and zero or more sub-trees whose roots are directly connected to the
node r by edges. The root of each sub-tree is called a child of r, and r is the parent. Any node without a
child is called a leaf. We can also describe a tree as a connected graph without a cycle, so there is a
path from one node to any other node in the tree. The main appeal of this data structure is that
the running time of most of the operations is O(logn). We can represent a tree as an array or a
linked list.
Some definitions
• Level h of a full tree of degree d has d^(h-1) nodes.
• The first h levels of a full tree have
1 + d + d^2 + …………………….. + d^(h-1) = (d^h - 1)/(d - 1) nodes.

Binary Search Trees


A BST has at most two children for each parent. In a BST, the key at each vertex must be greater than all
the keys held by its left descendants and smaller than or equal to all the keys held by its right
descendants. Searching and insertion both take O(h) worst-case time, where h is the height of the tree, and
the relation between the height and the number of nodes n is given by log n < h + 1 <= n. For example, the height of a
binary tree with 16 nodes may be anywhere between 4 and 15.
When is the height 4, and when is it 15?
So if we are sure that the tree is height balanced then we can say that search and insertion has
O(log n) run time otherwise we have to content with O(n).

Operation    Algorithm        Time complexity
search       BST search       O(log n) best, O(n) worst
add          BST insertion    O(log n) best, O(n) worst
remove       BST deletion     O(log n) best, O(n) worst

AVL Trees
Balanced trees named after Adelson-Velskii and Landis. AVL trees are a special case in
which the sub-trees of each node differ by at most 1 in their height. Due to insertion and deletion the
tree may become unbalanced, so rebalancing must be done by using a left rotation, a right rotation, or a
double rotation.
Operation    Algorithm        Time complexity
search       AVL search       O(log n) best, worst
add          AVL insertion    O(log n) best, worst
remove       AVL deletion     O(log n) best, worst

Priority Queues
A priority queue is a queue in which the elements are prioritized. The least element in the priority
queue is always removed first. Priority queues are used in many computing applications. For
example, many operating systems use a scheduling algorithm where the next process executed is
the one with the shortest execution time or the highest priority. Priority queues can be
implemented by using arrays, linked lists, or a special kind of tree (i.e. a heap).
• boolean isEmpty ();
Return true if and only if this priority queue is empty.
• int size ();
Return the length of this priority queue.
• int getLeast ();
Return the least element of this priority queue. If there are several least elements, return
any of them.
• void clear ();
Make this priority queue empty.
• void add (int elem);
Add elem to this priority queue.
• int delete ();
Remove and return the least element from this priority queue. (If there are
several least elements, remove the same element that would be returned by getLeast.)
Operation     Sorted SLL   Unsorted SLL   Sorted Array   Unsorted Array
add           O(n)         O(1)           O(n)           O(1)
removeLeast   O(1)         O(n)           O(1)           O(n)
getLeast      O(1)         O(n)           O(1)           O(n)

Heap
A heap is a complete tree with an ordering relation R holding between each node and its
descendants. Note that complete here means the tree can miss only the rightmost part of the bottom
level. R can be smaller-than or bigger-than.
E.g. a heap with degree 2 where R is “bigger than”:

[Figure: two complete binary trees of degree 2, each with root 8 and children 5 and 6.
In the left tree the leaves are 2, 3, 4, 1: every node is bigger than its descendants, so it is a heap.
In the right tree the leaves are 7, 3, 4, 1: the leaf 7 is bigger than its parent 5, so it is not a heap.]

Heap Sort: build a heap from the given set (O(n) time), then repeatedly remove the elements from
the heap (O(nlogn) total).
Implementation
Heaps are implemented by using arrays. Insertion and deletion of an element take O(log n) time.
More on this later.

Operation    Algorithm              Time complexity
add          insertion              O(log n)
delete       deletion               O(log n)
getLeast     access root element    O(1)


UNIT: 2
Divide and Conquer Algorithms
(Sorting, Searching and Selection)
Chapter 1:
Sorting
Sorting is among the most basic problems in algorithm design. We are given a sequence of items,
each associated with a given key value. The problem is to permute the items so that they are in
increasing (or decreasing) order by key. Sorting is important because it is often the first step in
more complex algorithms. Sorting algorithms are usually divided into two classes: internal sorting
algorithms, which assume that data is stored in an array in main memory, and external sorting
algorithms, which assume that data is stored on disk or some other device that is best accessed
sequentially. We will only consider internal sorting. Sorting algorithms often have additional
properties that are of interest, depending on the application. Here are two important properties.
In-place: The algorithm uses no additional array storage, and hence (other than perhaps the
system’s recursion stack) it is possible to sort very large lists without the need to allocate
additional working storage.
Stable: A sorting algorithm is stable if two elements that are equal remain in the same relative
position after sorting is completed. This is of interest, since in some sorting applications you sort
first on one key and then on another. It is nice to know that two items that are equal on the second
key, remain sorted on the first key.

Merge Sort
To sort an array A[l . . r]:
• Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2 elements
each
• Conquer
– Sort the subsequences recursively using merge sort. When the size of the sequences
is 1 there is nothing more to do
• Combine
– Merge the two sorted subsequences
[Figure: the divide step recursively splits the array into halves.]

Merging:

Algorithm:
MergeSort(A, l, r)
{
    if (l < r)
    {                              //check for base case
        m = (l + r)/2              //Divide
        MergeSort(A, l, m)         //Conquer
        MergeSort(A, m + 1, r)     //Conquer
        Merge(A, l, m, r)          //Combine
    }
}
Merge(A, l, m, r)      //merges A[l..m] and A[m+1..r] using an auxiliary array B
{
    x = l, y = m + 1;
    k = l;
    while (x <= m && y <= r)
    {
        if (A[x] < A[y])
        {
            B[k] = A[x];
            k++; x++;
        }
        else
        {
            B[k] = A[y];
            k++; y++;
        }
    }
    while (x <= m)
    {
        B[k] = A[x];
        k++; x++;
    }
    while (y <= r)
    {
        B[k] = A[y];
        k++; y++;
    }
    for (i = l; i <= r; i++)
    {
        A[i] = B[i];
    }
}

Time Complexity:
Recurrence Relation for Merge sort:
T(n) = 1 if n=1
T(n) = 2 T(n/2) + O(n) if n>1
Solving this recurrence we get
Time Complexity = O(nlogn)

Space Complexity:
It uses one extra array and some extra variables during sorting, therefore
Space Complexity= 2n + c = O(n)

Quick Sort
• Divide
Partition the array A[l…r] into two subarrays A[l..p-1] and A[p+1..r], such that each element
of A[l..p-1] is smaller than or equal to the pivot A[p], which in turn is smaller than or equal to
each element in A[p+1..r]. Need to find the index p of the pivot that partitions the array.
• Conquer
Recursively sort A[l..p-1] and A[p+1..r] using Quicksort.
• Combine
Trivial: the arrays are sorted in place. No additional work is required to combine them.

5 3 2 6 4 1 3 7     pivot p = 5; x scans right, y scans left
      x     y       x stops at 6, y stops at the last 3: swap A[x] and A[y]

5 3 2 3 4 1 6 7
          y x       x and y have crossed: swap the pivot with A[y]

1 3 2 3 4 5 6 7
          p         the pivot is now in its final position
Algorithm:
QuickSort(A,l,r)
{
if(l<r)
{
p = Partition(A,l,r);
QuickSort(A,l,p-1);
QuickSort(A,p+1,r);
}
}
Partition(A,l,r)
{
    x = l; y = r + 1; p = A[l];      //pivot is the first element
    while(x < y)
    {
        do {
            x++;
        } while(x <= r && A[x] <= p);
        do {
            y--;
        } while(A[y] > p);
        if(x < y)
            swap(A[x], A[y]);
    }
    A[l] = A[y]; A[y] = p;
    return y;      //return position of pivot
}
Time Complexity:
We can notice that the complexity of partitioning is O(n), because across all iterations of the
loops the pointers x and y together traverse at most n elements.
Thus the recurrence relation for quick sort is:
T(n) = T(k) + T(n-k-1) + O(n), where k is the number of elements left of the pivot.
Best Case:
Partitioning divides the array into two partitions of equal size, therefore
T(n) = 2T(n/2) + O(n). Solving this recurrence we get,
 Time Complexity = O(nlogn)

Worst Case:
When the array is already sorted or sorted in reverse order, one partition contains n-1 items and
the other contains zero items, therefore
T(n) = T(n-1) + O(n). Solving this recurrence we get
 Time Complexity = O(n^2)

Case between worst and best:

9-to-1 partition split:
T(n) = T(9n/10) + T(n/10) + O(n). Solving this recurrence we get
Time Complexity = O(nlogn)

Average case:
All permutations of the input numbers are equally likely. On a random input array, we will
have a mix of well-balanced and unbalanced splits. Good and bad splits are randomly
distributed throughout the tree.

Suppose the splits alternate: balanced, unbalanced, balanced, ….

B(n) = 2UB(n/2) + Θ(n)      Balanced
UB(n) = B(n - 1) + Θ(n)     Unbalanced
Solving:
B(n) = 2(B(n/2 - 1) + Θ(n/2)) + Θ(n)
     = 2B(n/2 - 1) + Θ(n)
     = Θ(nlogn)

Randomized Quick Sort:


An algorithm is called randomized if its behavior depends not only on the input but also on
values produced by a random-number generator. The beauty of a randomized algorithm is that
no particular input can consistently produce worst-case behavior. IDEA: partition around a
random element. The running time is then independent of the input order, no assumptions
need to be made about the input distribution, and no specific input elicits the worst-case
behavior: the worst case is determined only by the output of the random-number generator.
Randomization cannot eliminate the worst case, but it can make it less likely!
Algorithm:
RandQuickSort(A,l,r)
{
if(l<r)
{
m = RandPartition(A,l,r);
RandQuickSort(A,l,m-1);
RandQuickSort(A,m+1,r);
}
}

RandPartition(A,l,r)
{
k = random(l,r); //generates a random number between l and r, including both.
swap(A[l],A[k]);
return Partition(A,l,r);
}
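A runnable sketch of the randomized variant (Python, function names mine); `random.randint(l, r)` plays the role of `random(l,r)`, returning a pivot index with both endpoints included:

```python
import random

def rand_partition(a, l, r):
    k = random.randint(l, r)          # pick a pivot index uniformly from [l, r]
    a[l], a[k] = a[k], a[l]           # move it to the front, then partition as usual
    p, x, y = a[l], l + 1, r
    while True:
        while x <= r and a[x] <= p:
            x += 1
        while a[y] > p:
            y -= 1
        if x >= y:
            break
        a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], a[l]
    return y

def rand_quicksort(a, l, r):
    if l < r:
        m = rand_partition(a, l, r)
        rand_quicksort(a, l, m - 1)
        rand_quicksort(a, m + 1, r)
```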

Partition(A,l,r)
{
x =l; y =r ; p = A[l];
while(x<y)
{
do {
x++;
}while(A[x] <= p);
do {
y--;
} while(A[y] >=p);
if(x<y)
swap(A[x],A[y]);
}
A[l] = A[y]; A[y] = p;
return y; //return position of pivot
}
Time Complexity:
Worst Case:
T(n) = worst-case running time
T(n) = max{1 ≤ q ≤ n-1} (T(q) + T(n-q)) + Θ(n)
We use the substitution method to show that the running time of Quicksort is O(n²).
Guess T(n) = O(n²)
– Induction goal: T(n) ≤ cn²
– Induction hypothesis: T(k) ≤ ck² for any k < n
Proof of induction goal:
T(n) ≤ max{1 ≤ q ≤ n-1} (cq² + c(n-q)²) + Θ(n)
     = c · max{1 ≤ q ≤ n-1} (q² + (n-q)²) + Θ(n)

The expression q² + (n-q)² achieves its maximum over the range 1 ≤ q ≤ n-1 at one of the
endpoints:
max{1 ≤ q ≤ n-1} (q² + (n-q)²) = 1² + (n-1)² = n² – 2(n – 1)
T(n) ≤ cn² – 2c(n – 1) + Θ(n)
     ≤ cn²     {for c large enough that 2c(n – 1) dominates the Θ(n) term}
Average Case:
To analyze the average case, assume for simplicity that all the input elements are
distinct. (If duplicate elements are to be handled as well, the complexity bound is the
same, but the analysis is more intricate.) Assume each of the n elements is equally likely
to be chosen as the pivot, i.e. the probability of choosing any pivot is 1/n.
The recurrence relation for the algorithm is then

        T(n) = 1/n ∑k=1..n-1 (T(k) + T(n−k)) + O(n)

For k = 1, 2, …, n-1, each T(k) appears twice in the sum (once as T(k) and once as
T(n−k)), so

        T(n) = 2/n ∑k=1..n-1 T(k) + O(n)

Multiplying both sides by n:

        nT(n) = 2 ∑k=1..n-1 T(k) + O(n²)

Similarly,

        (n−1)T(n−1) = 2 ∑k=1..n-2 T(k) + O((n−1)²)

Subtracting the second equation from the first:

        nT(n) − (n−1)T(n−1) = 2T(n−1) + 2n − 1
        nT(n) − (n+1)T(n−1) = 2n − 1

Dividing both sides by n(n+1):

        T(n)/(n+1) = T(n−1)/n + (2n−1)/(n(n+1))

Let An = T(n)/(n+1). Then

        An = An−1 + (2n−1)/(n(n+1))
        An = ∑i=1..n (2i − 1)/(i(i + 1))
        An ≈ ∑i=1..n 2i/(i(i + 1))
        An = 2 ∑i=1..n 1/(i + 1)
        An ≈ 2logn

Since An = T(n)/(n+1),
        T(n) = O(nlogn)

Heap Sort
A heap is a nearly complete binary tree with the following two properties:
– Structural property: all levels are full, except possibly the last one, which is filled
from left to right
– Order (heap) property: for any node x, Parent(x) ≥ x

Array Representation of Heaps


A heap can be stored as an array A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[ i/2 ]
– Heapsize[A] ≤ length[A]
The elements in the subarray A[(n/2+1) .. n] are leaves

Max-heaps (largest element at root), have the max-heap property:


– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
Min-heaps (smallest element at root), have the min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]
Adding/Deleting Nodes
New nodes are always inserted at the bottom level (left to right) and nodes are removed from the
bottom level (right to left).

Operations on Heaps
• Maintain/restore the max-heap property – MAX-HEAPIFY
• Create a max-heap from an unordered array – BUILD-MAX-HEAP
• Sort an array in place – HEAPSORT
• Implement priority queues
Heapify Property
Suppose node i is smaller than one of its children, while the left and right subtrees of i
are max-heaps. To eliminate the violation:
– Exchange the node with its larger child
– Move down the tree
– Continue until the node is not smaller than its children

Algorithm:
Max-Heapify(A, i, n)
{
l = Left(i)
r = Right(i)
largest=i;
if l ≤ n and A[l] > A[largest]
largest = l
if r ≤ n and A[r] > A[largest]
largest = r
if largest ≠ i
exchange (A[i] , A[largest])
Max-Heapify(A, largest, n)
}
Analysis:
In the worst case Max-Heapify is called recursively h times, where h is the height of the
heap, and each call does a constant amount of work besides the recursive call, so
Time complexity = O(h) = O(logn)
Building a Heap
Convert an array A[1 … n] into a max-heap (n = length[A]). The elements in the sub-array
A[(n/2+1) .. n] are leaves. Apply MAX-HEAPIFY on elements between 1 and n/2.

Algorithm:
Build-Max-Heap(A)
n = length[A]
for i ← n/2 down to 1
do MAX-HEAPIFY(A, i, n)

Time Complexity:
Running time: the loop executes O(n) times and the complexity of Heapify is O(logn),
therefore the complexity of Build-Max-Heap is O(nlogn).
This bound, however, is not asymptotically tight: Heapify takes O(h) time, so the cost of
Heapify on a node i is proportional to the height of node i in the tree.
Let ni be the number of nodes at level i (counting from the root, level 0) and hi the
height of a node at level i, so ni = 2^i and hi = h − i, where h is the height of the heap.

        T(n) = ∑i=0..h ni·hi = ∑i=0..h 2^i (h − i)
             = ∑i=0..h 2^h (h − i) / 2^(h−i)

Let k = h − i:

        T(n) = 2^h ∑k=0..h k/2^k ≤ n ∑k=0..∞ k/2^k

We know that ∑k=0..∞ x^k = 1/(1 − x) for |x| < 1.
Differentiating both sides:  ∑k=0..∞ k·x^(k−1) = 1/(1 − x)²
Multiplying by x:            ∑k=0..∞ k·x^k = x/(1 − x)²
Putting x = 1/2:             ∑k=0..∞ k/2^k = (1/2)/(1/4) = 2

        ⇒ T(n) ≤ 2n = O(n)

Heapsort
– Build a max-heap from the array
– Swap the root (the maximum element) with the last element in the array
– “Discard” this last node by decreasing the heap size
– Call Max-Heapify on the new root
– Repeat this process until only one node remains

[Figure: worked heapsort trace. The root (the maximum) is repeatedly swapped with the
last element of the heap, the heap size is decreased, and Heapify(A,1) restores the
max-heap property on the remaining elements.]

Algorithm:
HeapSort(A)
{
BuildHeap(A); //build a max-heap from A
n = length[A];
for(i = n; i >= 2; i--)
{
swap(A[1], A[n]);
n = n - 1;
Heapify(A, 1, n); //restore the heap property on the reduced heap
}
}
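The three routines can be sketched together as runnable Python (names mine). The storage is 0-indexed, so the children of index i are 2i+1 and 2i+2 rather than 2i and 2i+1:

```python
def max_heapify(a, i, n):
    """Sift a[i] down until the max-heap property holds in a[0..n-1]."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < n and a[l] > a[largest]:
        largest = l
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build a max-heap in O(n)
        max_heapify(a, i, n)
    for end in range(n - 1, 0, -1):       # repeatedly move the max to the back
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)            # heap now occupies a[0..end-1]
```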

Sorting comparisons:

Algorithm        Best        Average     Worst       In place?
Bubble sort      O(n)        O(n²)       O(n²)       Yes
Insertion sort   O(n)        O(n²)       O(n²)       Yes
Merge sort       O(nlogn)    O(nlogn)    O(nlogn)    No
Quick sort       O(nlogn)    O(nlogn)    O(n²)       Yes
Heap sort        O(nlogn)    O(nlogn)    O(nlogn)    Yes
Searching
Sequential Search
Simply search for the given element left to right and return the index of the element, if found.
Otherwise return “Not Found”.

Algorithm:
LinearSearch(A, n, key)
{
for(i=0; i<n; i++)
{
if(A[i] == key)
return i;
}
return -1; //-1 indicates unsuccessful search
}

Analysis:
Time complexity = O(n)

Binary Search:
To find a key in a large file containing keys z[0], z[1], …, z[n-1] in sorted order, we
first compare key with z[n/2], and depending on the result we recurse either on the first
half of the file, z[0], …, z[n/2 - 1], or on the second half, z[n/2], …, z[n-1].

Take input array a[] = {2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104}

Algorithm
BinarySearch(A, l, r, key)
{
if(l == r)
{
if(key == A[l])
return l+1; //position of key; index starts from 0
else
return 0;
}
else
{
m = (l + r) / 2; //integer division
if(key == A[m])
return m+1;
else if(key < A[m])
return BinarySearch(A, l, m-1, key);
else
return BinarySearch(A, m+1, r, key);
}
}
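A minimal iterative sketch in Python, assuming the same sorted input. Unlike the recursive pseudocode above, it returns the 0-based index of the key, or -1 on an unsuccessful search (the function name is mine):

```python
def binary_search(a, key):
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2      # middle index, integer division
        if a[m] == key:
            return m          # successful search: index of key
        elif key < a[m]:
            r = m - 1         # recurse on the first half
        else:
            l = m + 1         # recurse on the second half
    return -1                 # unsuccessful search
```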
Analysis:
From the above algorithm we can say that the running time of the algorithm is
T(n) = T(n/2) + Θ(1) = O(logn).
In the best case the output is obtained in one comparison, i.e. O(1) time, when the key is
at the middle. In the worst case the key is found last (or not at all), so the running time
is O(logn). In the average case the running time is also O(logn). For an unsuccessful
search the best, worst and average case time complexity are all O(logn).

Selection
The ith order statistic of a set of elements is the ith smallest (or largest) element. Here
we take the ith order statistic to mean the ith smallest: the minimum is then the first
order statistic and the maximum is the last. A median is given by the ith order statistic
with i = (n+1)/2 for odd n, and i = n/2 or n/2 + 1 (lower or upper median) for even n.
This kind of problem is commonly called the selection problem.

This problem can be solved in O(nlogn) time in a very straightforward way: first sort the
elements in Θ(nlogn) time and then pick the ith item from the array in constant time.
What about a linear time algorithm for this problem? The following sections answer this.

Nonlinear general selection algorithm

We can construct a simple, but inefficient, general algorithm for finding the kth smallest
(or kth largest) item in a list, requiring O(kn) time, which is effective when k is small.
To accomplish this, we repeatedly find the most extreme remaining value and move it to the
beginning until we reach the desired index.
Select(A, n, k)
{
for(i=0; i<k; i++)
{
minindex = i;
minvalue = A[i];
for(j=i+1; j<n; j++)
{
if(A[j] < minvalue)
{
minindex = j;
minvalue = A[j];
}
}
swap(A[i], A[minindex]); //swap after the scan completes
}
return A[k-1]; //kth smallest, k counted from 1
}
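A runnable Python sketch of the same idea (names mine; k is 1-based), with the swap performed only after each inner scan completes:

```python
def select_kth(a, k):
    """k-th smallest (1-based) by k passes of selection sort: O(k*n)."""
    a = list(a)                           # work on a copy
    n = len(a)
    for i in range(k):                    # settle positions 0 .. k-1
        m = i
        for j in range(i + 1, n):         # scan for the minimum of a[i..n-1]
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]           # swap AFTER the scan completes
    return a[k - 1]
```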
Analysis:
When i=0, the inner loop executes n-1 times
When i=1, the inner loop executes n-2 times
When i=2, the inner loop executes n-3 times
…………………………………………...
When i=k-1, the inner loop executes n-k times
Thus, Time Complexity = (n-1) + (n-2) + ……… + (n-k)
= O(kn), which is O(n²) when k is proportional to n.

Selection in expected linear time

This problem is solved by using the “divide and conquer” method. The main idea is to
partition the element set as in Quick Sort, where the partition is a randomized one.
Algorithm:
RandSelect(A,l,r,i)
{
if(l == r)
return A[l];
p = RandPartition(A,l,r);
k = p – l + 1; //rank of the pivot within A[l..r]
if(i == k)
return A[p];
else if(i < k)
return RandSelect(A,l,p-1,i);
else
return RandSelect(A,p+1,r,i - k);
}
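A Python sketch of RandSelect (names mine). It partitions around a uniformly chosen pivot, computes the pivot's rank, and recurses into one side only, handling the i == rank case explicitly:

```python
import random

def rand_select(a, i):
    """i-th smallest (1-based), expected O(n); a is modified in place."""
    def go(l, r, i):
        if l == r:
            return a[l]
        k = random.randint(l, r)          # random pivot index in [l, r]
        a[l], a[k] = a[k], a[l]
        p, x, y = a[l], l + 1, r          # Hoare-style partition around a[l]
        while True:
            while x <= r and a[x] <= p:
                x += 1
            while a[y] > p:
                y -= 1
            if x >= y:
                break
            a[x], a[y] = a[y], a[x]
        a[l], a[y] = a[y], a[l]
        rank = y - l + 1                  # pivot is the rank-th smallest in a[l..r]
        if i == rank:
            return a[y]
        if i < rank:
            return go(l, y - 1, i)
        return go(y + 1, r, i - rank)
    return go(0, len(a) - 1, i)
```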

Analysis:
Since the algorithm is randomized, no particular input is responsible for the worst case;
however, the worst-case running time of this algorithm is O(n²). This happens if,
unluckily, the pivot chosen is always the largest element (when we are finding the minimum
element). Assume that each element is equally likely to be selected as the pivot, i.e.
with probability 1/n. Then we have the recurrence relation

        T(n) = 1/n ( ∑j=1..n-1 T(max(j, n − j)) ) + O(n)

where max(j, n-j) = j if j ≥ ⌈n/2⌉, and max(j, n-j) = n-j otherwise.
Observe that every T(j) or T(n − j) is repeated at most twice for both odd and even values
of n (one may not be repeated): once for j from 1 to ⌈n/2⌉ and once for j from ⌈n/2⌉ to
n-1. So we can write

        T(n) ≤ 2/n ( ∑j=⌈n/2⌉..n-1 T(j) ) + O(n)
Using the substitution method:
Guess T(n) = O(n)
To show: T(n) ≤ cn
Assume T(j) ≤ cj for all j < n
Substituting into the relation:

        T(n) ≤ 2/n ∑j=⌈n/2⌉..n-1 cj + O(n)
             = 2c/n { ∑j=1..n-1 j − ∑j=1..n/2-1 j } + O(n)
             = 2c/n { n(n-1)/2 − ((n/2 − 1)(n/2))/2 } + O(n)
        T(n) ≤ c(n-1) − c(n/2 − 1)/2 + O(n)
        T(n) ≤ cn − c − cn/4 + c/2 + O(n)
             = cn − cn/4 − c/2 + O(n)
             ≤ cn      {choose the value of c such that −cn/4 − c/2 + O(n) ≤ 0}
         T(n) = O(n)

Selection in worst-case linear time (Note: see also the class notes for this algorithm.)
Divide the n elements into groups of 5. Find the median of each 5-element group.
Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.

Algorithm
Divide the n elements into groups of 5
Find the median of each group
Use Select() recursively to find the median x of the ⌊n/5⌋ medians
Partition the n elements around x. Let k = rank(x) //index of x
if (i == k) then return x
if (i < k) then use Select() recursively to find the ith smallest element in the first partition
else (i > k) use Select() recursively to find the (i-k)th smallest element in the last partition

At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group
medians. Therefore, at least 3⌊n/10⌋ elements are ≤ x. Similarly, at least 3⌊n/10⌋
elements are ≥ x. For n ≥ 50 we have 3⌊n/10⌋ ≥ n/4, therefore for n ≥ 50 the recursive
call to SELECT in the last step is executed on at most 3n/4 elements in the worst case.
Thus the recurrence for the running time can assume that this step takes time T(3n/4) in
the worst case.

Now we can write the recurrence relation for the above algorithm as
DAA New Summit College(B.Sc.CSIT)
T(n) = T(n/5) + T(3n/4) + Θ(n)
Guess T(n) = O(n)
To show: T(n) ≤ cn
Assume that our guess is true for all k < n
Now,
T(n) ≤ cn/5 + 3cn/4 + O(n)
     = 19cn/20 + O(n)
     = cn − cn/20 + O(n)
     ≤ cn      {choose the value of c such that O(n) − cn/20 ≤ 0}
 T(n) = O(n)
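A compact Python sketch of the median-of-medians idea (names mine). It recurses on value lists rather than partitioning in place, which keeps the sketch short at the cost of extra copying; the three-way split around x also handles duplicate elements:

```python
def select(a, i):
    """i-th smallest (1-based) in worst-case O(n) via median of medians."""
    a = list(a)
    if len(a) <= 5:                       # small base case: sort directly
        return sorted(a)[i - 1]
    # median of each group of 5 (last group may be smaller)
    medians = [sorted(a[j:j + 5])[len(a[j:j + 5]) // 2]
               for j in range(0, len(a), 5)]
    x = select(medians, (len(medians) + 1) // 2)   # recursive pivot choice
    lo = [v for v in a if v < x]
    eq = [v for v in a if v == x]
    hi = [v for v in a if v > x]
    if i <= len(lo):
        return select(lo, i)
    if i <= len(lo) + len(eq):
        return x
    return select(hi, i - len(lo) - len(eq))
```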

Max and Min Finding

Here our problem is to find both the minimum and the maximum item in a set of n elements.

Divide and Conquer Algorithm for finding min-max:

The main idea behind the algorithm is: if the number of elements is 1 or 2 then the max
and min are obtained trivially. Otherwise, split the problem into two approximately equal
parts and solve them recursively.

MinMax(l,r)
{
if(l = = r)
max = min = A[l];
else if(l = r-1)
{
if(A[l] < A[r])
{
max = A[r]; min = A[l];
}
else
{
max = A[l]; min = A[r];
}
}
else
{
//Divide the problems
mid = (l + r)/2; //integer division
//solve the subproblems
{min, max} = MinMax(l, mid);
{min1, max1} = MinMax(mid+1, r);
//Combine the solutions
if(max1 > max) max = max1;
if(min1 < min) min = min1;
}
}
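A Python sketch of the divide-and-conquer min-max (names mine), returning the pair (min, max) instead of using shared variables:

```python
def min_max(a, l, r):
    """Return (min, max) of a[l..r] using about 3n/2 comparisons."""
    if l == r:                            # one element: trivial
        return a[l], a[l]
    if l == r - 1:                        # two elements: one comparison
        return (a[l], a[r]) if a[l] < a[r] else (a[r], a[l])
    mid = (l + r) // 2                    # divide
    lo1, hi1 = min_max(a, l, mid)         # solve the two subproblems
    lo2, hi2 = min_max(a, mid + 1, r)
    return min(lo1, lo2), max(hi1, hi2)   # combine with two comparisons
```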
Analysis:
We can give the following recurrence relation for the MinMax algorithm in terms of the
number of comparisons:
T(n) = 2T(n/2) + 2, if n > 2
T(n) = 1, if n = 2, and T(n) = 0, if n = 1
Solving the recurrence (e.g. by the master method) the complexity is O(n); more precisely,
about 3n/2 − 2 comparisons.

Matrix Multiplication
Given two n-by-n matrices A and B, our aim is to find the product of A and B as C, which
is also an n-by-n matrix. We can find this by using the relation

        C(i,j) = ∑k=1..n A(i,k)·B(k,j)
MatrixMultiply(A,B)
{
for(i=0; i<n; i++)
{
for(j=0; j<n; j++)
{
C[i][j] = 0;
for(k=0; k<n; k++)
{
C[i][j] = C[i][j] + A[i][k]*B[k][j];
}
}
}
}
Analysis:
Using the above formula we need O(n) time to compute each C(i,j). There are n² elements
in C, hence the time required for matrix multiplication is O(n³). We can improve on this
complexity by using a divide and conquer strategy.

Divide and Conquer Algorithm for Matrix Multiplication

Divide the n x n matrix into four matrices of size n/2 x n/2. The basic calculation is
done for a matrix of size 2 x 2.

    | C11 C12 |   | A11 A12 |   | B11 B12 |
    |         | = |         | x |         |
    | C21 C22 |   | A21 A22 |   | B21 B22 |

Where
C11 = A11 x B11 + A12 x B21
C12 = A11 x B12 + A12 x B22
C21 = A21 x B11 + A22 x B21
C22 = A21 x B12 + A22 x B22

Now we can write the recurrence relation for this as
T(n) = b                if n ≤ 2
T(n) = 8T(n/2) + cn²    if n > 2
Solving this we get T(n) = O(n³)

Strassen’s Matrix Multiplication

Strassen showed that the 2x2 block matrix product can be accomplished in 7 multiplications
and 18 additions or subtractions. The basic calculation is done for a matrix of size 2 x 2.

    | C11 C12 |   | A11 A12 |   | B11 B12 |
    |         | = |         | x |         |
    | C21 C22 |   | A21 A22 |   | B21 B22 |

Where:
P1 = (A11 + A22) * (B11 + B22)
P2 = (A21 + A22) * B11
P3 = A11 * (B12 - B22)
P4 = A22 * (B21 - B11)
P5 = (A11 + A12) * B22
P6 = (A21 - A11) * (B11 + B12)
P7 = (A12 - A22) * (B21 + B22)

C11 = P1 + P4 - P5 + P7
C12 = P3 + P5
C21 = P2 + P4
C22 = P1 + P3 - P2 + P6

Now we can write the recurrence relation for this as
T(n) = b                if n ≤ 2
T(n) = 7T(n/2) + cn²    if n > 2
Solving this we get, T(n) = O(n^log2 7) = O(n^2.81)
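A Python sketch of Strassen's scheme under the assumption that n is a power of 2 (function names are mine); it applies the seven products P1..P7 and the four combinations exactly as listed above:

```python
def strassen(A, B):
    """Multiply two n x n matrices (n a power of 2) with 7 recursive products."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M):                          # split M into four h x h blocks
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A)
    B11, B12, B21, B22 = quad(B)
    P1 = strassen(add(A11, A22), add(B11, B22))
    P2 = strassen(add(A21, A22), B11)
    P3 = strassen(A11, sub(B12, B22))
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A12), B22)
    P6 = strassen(sub(A21, A11), add(B11, B12))
    P7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(P1, P4), P5), P7)
    C12 = add(P3, P5)
    C21 = add(P2, P4)
    C22 = add(sub(add(P1, P3), P2), P6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]   # reassemble the blocks
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot
```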
Unit 2
Chapter:2
Dynamic Programming
Dynamic Programming: technique is among the most powerful for designing algorithms for
optimization problems. Dynamic programming problems are typically optimization problems (find
the minimum or maximum cost solution, subject to various constraints). The technique is related to
divide-and-conquer, in the sense that it breaks problems down into smaller problems that it solves
recursively. However, because of the somewhat different nature of dynamic programming
problems, standard divide-and-conquer solutions are not usually efficient. The basic elements that
characterize a dynamic programming algorithm are:
Substructure: Decompose your problem into smaller (and hopefully simpler)
subproblems. Express the solution of the original problem in terms of solutions for smaller
problems.
Table-structure: Store the answers to the sub-problems in a table. This is done because
subproblem solutions are reused many times.
Bottom-up computation: Combine solutions of smaller subproblems to solve larger
subproblems. (There is also a top-down alternative, called memoization.)

The most important question in designing a DP solution to a problem is how to set up the
subproblem structure. This is called the formulation of the problem. Dynamic programming is not
applicable to all optimization problems. There are two important elements that a problem must
have in order for DP to be applicable.
Optimal substructure: (Sometimes called the principle of optimality.) It states that for the
global problem to be solved optimally, each subproblem should be solved optimally. (Not
all optimization problems satisfy this. Sometimes it is better to lose a little on one
subproblem in order to make a big gain on another.)
Polynomially many subproblems: An important aspect to the efficiency of DP is that the
total number of subproblems to be solved should be at most a polynomial number.

Fibonacci numbers
Recursive Fibonacci revisited: In the recursive version of the algorithm for finding a
Fibonacci number, each calculation of the Fibonacci number of a larger value requires
calculating the Fibonacci numbers of the two previous values, regardless of whether those
have already been computed. So there is a great deal of redundancy in calculating the
Fibonacci number of a particular value. Let us try to calculate the Fibonacci number of 4;
the tree below shows the repetition in the calculation.
DAA New Summit College(B.Sc.CSIT)

Fib(4)
├── Fib(3)
│   ├── Fib(2)
│   │   ├── Fib(1)
│   │   └── Fib(0)
│   └── Fib(1)
└── Fib(2)
    ├── Fib(1)
    └── Fib(0)

In the above tree we see that fib(0) is calculated 2 times, fib(1) 3 times, fib(2) 2
times, and so on. So if we somehow eliminate those repetitions we will save running time.

Algorithm:
DynaFibo(n)
{
A[0] = 0, A[1]= 1;
for(i = 2 ; i <=n ; i++)
A[i] = A[i-2] +A[i-1] ;
return A[n] ;
}
Analysis
Analyzing the above algorithm we find that there is no repeated calculation of already
solved sub-problems, and the running time decreases from O(2^(n/2)) (a lower bound on the
naive recursion) to O(n). This reduction was possible by remembering each solved
sub-problem and reusing it to solve the problems of larger size.

0/1 Knapsack Problem

Statement: A thief has a bag or knapsack that can carry a maximum weight W of his loot.
There are n items; the weight of the ith item is wi and it is worth vi. Each item is
either taken whole or left behind, i.e. xi is 0 or 1. The objective is to collect the
items that maximize the total profit earned.
We can formally state this problem as: maximize ∑i=1..n xi·vi subject to the constraint
∑i=1..n xi·wi ≤ W.
The algorithm takes as input maximum weight W, the number of items n, two arrays v[] for values
of items and w[] for weight of items. Let us assume that the table c[i,w] is the value of solution for
items 1 to i and maximum weight w. Then we can define the recurrence relation for the 0/1
knapsack problem as

C[i,w] = 0                                  if i = 0 or w = 0
C[i,w] = C[i-1,w]                           if wi > w
C[i,w] = max{vi + C[i-1,w-wi], C[i-1,w]}    if i > 0 and w ≥ wi

Algorithm:

DynaKnapsack(W,n,v,w)
{
for(w=0; w<=W; w++)
C[0,w] = 0;
for(i=1; i<=n; i++)
C[i,0] = 0;
for(i=1; i<=n; i++)
{
for(w=1; w<=W;w++)
{
if(w[i] <= w)
{
if(v[i] + C[i-1,w-w[i]] > C[i-1,w])
{
C[i,w] = v[i] +C[i-1,w-w[i]];
}
else
{
C[i,w] = C[i-1,w];
}
}
else
{
C[i,w] = C[i-1,w];
}
}
}
}
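A Python sketch of the same table computation (function name mine, 0-based arrays), using w[i] <= cap so that an item whose weight exactly equals the remaining capacity can be taken:

```python
def knapsack(W, w, v):
    """0/1 knapsack: max profit with capacity W; w and v are parallel lists."""
    n = len(w)
    C = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 / column 0 stay zero
    for i in range(1, n + 1):
        for cap in range(1, W + 1):
            if w[i - 1] <= cap:                 # item i fits: take the better option
                C[i][cap] = max(v[i - 1] + C[i - 1][cap - w[i - 1]],
                                C[i - 1][cap])
            else:                               # item i does not fit
                C[i][cap] = C[i - 1][cap]
    return C[n][W]
```

Running it on the example below reproduces the table entry C[7][9] = 15.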

Analysis
For run time analysis examining the above algorithm the overall run time of the algorithm is
O(nW).
Example
Let the problem instance be with 7 items where v[] = {2,3,3,4,4,5,7}and w[] = {3,5,7,4,3,9,2}and
W = 9.
w 0 1 2 3 4 5 6 7 8 9
i
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 2 2 2 2 2 2 2
2 0 0 0 2 2 3 3 3 5 5
3 0 0 0 2 2 3 3 3 5 5
4 0 0 0 2 4 4 4 6 6 7
5 0 0 0 4 4 4 6 8 8 8
6 0 0 0 4 4 4 6 8 8 8
7 0 0 7 7 7 11 11 11 13 15
Profit= C[7][9]=15

Matrix Chain Multiplication


Chain Matrix Multiplication Problem: Given a sequence of matrices A1;A2; : : : ; An and
dimensions p0; p1; : : : ; pn, where Ai is of dimension pi−1 x pi, determine the order of multiplication
that minimizes the number of operations.
Important Note: This algorithm does not perform the multiplications, it just determines the best
order in which to perform the multiplications.

Although any legal parenthesization will lead to a valid result, not all involve the same number of
operations. Consider the case of 3 matrices: A1 be 5 x 4, A2 be 4 x 6 and A3 be 6 x 2.
multCost[((A1A2)A3)] = (5 . 4 . 6) + (5 . 6 . 2) = 180
multCost[(A1(A2A3))] = (4 . 6 . 2) + (5 . 4 . 2) = 88
Even for this small example, considerable savings can be achieved by reordering the evaluation
sequence.
Let Ai….j denote the result of multiplying matrices i through j. It is easy to see that Ai…j is a pi−1 x
pj matrix. So for some k total cost is sum of cost of computing Ai…k, cost of computing Ak+1…j, and
cost of multiplying Ai…k and Ak+1…j.

Recursive definition of the optimal solution: let m[i,j] denote the minimum number of
scalar multiplications needed to compute Ai…j.

m[i,j] = 0                                                  if i = j
m[i,j] = min{i ≤ k < j} { m[i,k] + m[k+1,j] + pi-1·pk·pj }  if i < j

Algorithm:
Matrix-Chain-Multiplication(p)
{
n = length[p] - 1    //p holds the n+1 dimensions of n matrices
for(i=1; i<=n; i++)
{
m[i, i] = 0
}
for(l=2; l<=n; l++)     //l is the chain length
{
for(i=1; i<=n-l+1; i++)
{
j = i + l - 1
m[i, j] = ∞
for(k=i; k<=j-1; k++)
{
c = m[i, k] + m[k+1, j] + p[i-1] * p[k] * p[j]
if(c < m[i, j])
{
m[i, j] = c
s[i, j] = k
}
}
}
}
return m and s
}
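A Python sketch of the same routine (names mine) using 1-based m and s tables; p holds the n+1 dimensions for n matrices:

```python
import math

def matrix_chain(p):
    """m[i][j] = min scalar mults to compute Ai..Aj, where Ai is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # chain length l
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):             # split point
                c = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if c < m[i][j]:
                    m[i][j], s[i][j] = c, k
    return m, s
```

On the example below (dimensions p = [3, 4, 5, 2, 3]) it yields m[1][4] = 82 with split point s[1][4] = 3.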

Analysis
The above algorithm can be easily analyzed for running time as O(n3), due to three nested loops.
The space complexity is O(n2) .

Example:
Consider matrices A1, A2, A3 and A4 of order 3x4, 4x5, 5x2 and 2x3.

M table (cost of multiplication):        S table (points of parenthesization):

  j→    1    2    3    4                   j→    2    3    4
i=1     0   60   64   82                 i=1     1    1    3
i=2          0   40   64                 i=2          2    3
i=3               0   30                 i=3               3
i=4                    0

Constructing the optimal solution:
(A1A2A3A4) => ((A1A2A3)(A4)) => (((A1)(A2A3))(A4))
Longest Common Subsequence Problem
Given two sequences X = (x1, x2, …………,xm) and Z = (z1; z2; ……… ; zk), we say that Z is a
subsequence of X if there is a strictly increasing sequence of k indices (i1, i2, ……… ik) (1 ≤ i1 < i2 <
……….. < ik) such that Z = (Xi1,Xi2 ………. Xik).

For example, let X = (ABRACADABRA) and let Z = (AADAA), then Z is a subsequence of X.


Given two strings X and Y , the longest common subsequence of X and Y is a longest sequence Z
that is a subsequence of both X and Y . For example, let X = (ABRACADABRA) and let Y =
(YABBADABBAD). Then the longest common subsequence is Z = (ABADABA)

The Longest Common Subsequence Problem (LCS) is the following. Given two sequences X =
(x1; : : : ; xm) and Y = (y, ……., yn) determine a longest common subsequence.

DP Formulation for LCS:

Given a sequence X = <x1, x2, …, xm>, Xi = <x1, x2, …, xi> is called the ith prefix of X;
X0 is the empty sequence. Now consider the prefixes Xi and Yj:

If xi = yj (i.e. the last characters match), we claim that the LCS must also contain the
character xi = yj.

If xi ≠ yj (i.e. the last characters do not match), then xi and yj cannot both be in the
LCS (since they would both have to be the last character of the LCS). Thus either xi is
not part of the LCS, or yj is not part of the LCS (and possibly both). Let L[i,j]
represent the length of the LCS of Xi and Yj.

L[i,j] = 0                        if i = 0 or j = 0
L[i,j] = L[i-1,j-1] + 1           if xi = yj
L[i,j] = max{L[i-1,j], L[i,j-1]}  if xi ≠ yj

Algorithm:

LCS(X,Y)
{
m = length[X];
n = length[Y];
for(i=0; i<=m; i++)
c[i,0] = 0;
for(j=0; j<=n; j++)
c[0,j] = 0;
for(i=1; i<=m; i++)
for(j=1; j<=n; j++)
{
if(X[i] == Y[j])
{
c[i][j] = c[i-1][j-1] + 1; b[i][j] = “upleft”;
}
else if(c[i-1][j] >= c[i][j-1])
{
c[i][j] = c[i-1][j]; b[i][j] = “up”;
}
else
{
c[i][j] = c[i][j-1]; b[i][j] = “left”;
}
}
return b and c;
}
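A Python sketch (function name mine) that fills the same L table and then walks back through it to recover one LCS, instead of keeping a separate direction table b:

```python
def lcs(X, Y):
    """Return one longest common subsequence of strings X and Y."""
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:            # last characters match
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # walk back through the table to recover the subsequence
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

On the example below it recovers an LCS of length 5 for X = abbabba and Y = aaabba.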
Analysis:
The above algorithm can be easily analyzed for running time as O(mn), due to two nested loops.
The space complexity is O(mn).

Example:
Consider the character Sequences X=abbabba Y=aaabba

Y Φ a a a b b a
X
Φ 0 0 0 0 0 0 0
a 0 1 upleft 1upleft 1upleft 1 left 1 left 1 left
b 0 1 up 1 up 1 up 2 upleft 2 upleft 2 left
b 0 1 up 1 up 1 up 2 upleft 3 upleft 3 left
a 0 1 upleft 2 upleft 2 upleft 2 up 3 up 4 upleft
b 0 1 up 2 up 2 up 3 upleft 3 upleft 4 up
b 0 1 up 2 up 2 up 3 upleft 4 upleft 4 up
a 0 1 upleft 2 upleft 3 upleft 3 up 4 up 5 upleft

LCS = aabba

Chapter:3
Greedy Paradigm
Greedy method is a simple, straightforward way of algorithm design. The general class of
problems solved by the greedy approach is optimization problems. In this approach the
input elements are subject to some constraints; a solution satisfying the constraints is
called a feasible solution, and a feasible solution that best meets some objective
function is called an optimal solution. Greedy algorithms always make the locally optimal
choice in the hope of generating a globally optimal solution; however, it is not
guaranteed that every greedy algorithm yields an optimal solution. We generally cannot
tell whether a given optimization problem is solvable by the greedy method, but most
problems that can be solved by the greedy approach exhibit two properties:
Greedy choice property
A globally optimal solution can be obtained by making a locally optimal choice; the
choice made at present does not depend on possible future choices.

Optimal substructure
Optimal substructure is exhibited by a problem if an optimal solution to the problem contains op-
timal solutions to the sub-problems within it.
To prove that a greedy algorithm is optimal we must show that these two properties hold.
For this purpose, first take a globally optimal solution; then show that the greedy
choice at the first step yields the same but smaller problem (the greedy choice must be
made first and must be part of an optimal solution); finally, use induction to show that
the greedy choice is best at each step. This is the optimal substructure.

Fractional Knapsack Problem


Statement: A thief has a bag or knapsack that can contain maximum weight W of his loot. There
are n items and the weight of ith item is wi and it worth vi. Any amount of item can be put into the
bag i.e. xi fraction of item can be collected, where 0<=xi<=1. Here the objective is to collect the
items that maximize the total profit earned.
Algorithm:
Take as much of the item with the highest value per unit weight (vi/wi) as you can. When
that item is exhausted, move on to the item with the next highest vi/wi; continue until
the knapsack is full.
v[1 … n] and w[1 … n] contain the values and weights respectively of the n objects,
sorted in non-increasing order of v[i]/w[i]. W is the capacity of the knapsack, x[1 … n]
is the solution vector that holds the fractional amount of each item taken, and n is the
number of items.

GreedyFracKnapsack(W,n)
{
for(i=1; i<=n; i++)
x[i] = 0.0;
tempW = W;
for(i=1; i<=n; i++)
{
if(w[i] > tempW) break;
x[i] = 1.0;
tempW -= w[i];
}
if(i<=n)
x[i] = tempW/w[i];
}
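A Python sketch (function name mine) that sorts by value density itself rather than assuming pre-sorted input, and returns the total value collected:

```python
def frac_knapsack(W, items):
    """items: list of (value, weight) pairs; returns max value, taking fractions."""
    items = sorted(items, key=lambda t: t[0] / t[1], reverse=True)
    total, cap = 0.0, W
    for v, w in items:
        if w <= cap:               # take the whole item
            total += v
            cap -= w
        else:                      # take the fraction that fits, then stop
            total += v * cap / w
            break
    return total
```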

Analysis:
The above algorithm contains only simple (non-nested) loops, so its running time is O(n).
However, it requires v[1 … n] and w[1 … n] to be sorted by v[i]/w[i], which takes
O(nlogn) time, so the overall complexity including the sorting becomes O(nlogn).
Huffman Codes:
Huffman codes are used to compress data by representing each alphabet symbol by a unique
binary code in an optimal way. As an example, consider a file of 100,000 characters with
the following frequency distribution, assuming there are only 7 distinct characters:
f(a) = 40,000, f(b) = 20,000, f(c) = 15,000, f(d) = 12,000, f(e) = 8,000, f(f) = 3,000,
f(g) = 2,000.
With a fixed-length code for 7 characters we need 3 bits to represent every character:
a = 000, b = 001, c = 010, d = 011, e = 100, f = 101, g = 110.
The total number of bits required by the fixed-length code is 300,000.
Now consider a variable-length code in which characters with higher frequency are given
shorter codes:
C = {a, b, c, d, e, f, g}; f(C) = {40, 20, 15, 12, 8, 3, 2} (in thousands); n = 7
Initial priority queue is

[Figure: the initial priority queue ordered by frequency, and the successive greedy merges
of the two least-frequent trees that build the Huffman tree.]
a = 0, b = 10, c = 110, d = 1110, e = 11111, f = 111101, g = 111100

The total number of bits required by this variable-length code is
40,000*1 + 20,000*2 + 15,000*3 + 12,000*4 + 8,000*5 + 3,000*6 + 2,000*6
i.e. 243,000 bits.
Here we saved approximately 19% of the space.
Analysis
We can use BuildHeap(C) (see the notes on sorting) to create a priority queue in O(n)
time. Inside the for loop the expensive heap operations each take O(logn) time. Since the
loop body executes n-1 times, the total running time of the Huffman algorithm is O(nlogn).

Job Sequencing with Deadlines:

We are given a set of n jobs. Associated with each job i is an integer deadline di ≥ 0
and a profit pi ≥ 0. For any job i, the profit is earned if and only if the job is
completed by its deadline. Completing a job requires processing it on a machine for one
unit of time. Our aim is to find a feasible subset of jobs such that the total profit is
maximum.

Example
n = 4, (p1,p2,p3,p4) = (100,10,15,27), (d1,d2,d3,d4) = (2,1,2,1)

   Feasible      Processing
   solution      sequence        Value
1. (1, 2)        2, 1            110
2. (1, 3)        1, 3 or 3, 1    115
3. (1, 4)        4, 1            127
4. (2, 3)        2, 3            25
5. (3, 4)        4, 3            42
6. (1)           1               100
7. (2)           2               10
8. (3)           3               15
9. (4)           4               27

We have to try all the possibilities; the complexity is O(n!).


Greedy strategy using total profit as optimization function to above example.
Begin with J=φ
– Job 1 considered, and added to J  J={1}
– Job 4 considered, and added to J  J={1,4}
– Job 3 considered, but discarded because not feasible  J={1,4}
– Job 2 considered, but discarded because not feasible  J={1,4}
Final solution is J={1,4} with total profit 127 and it is optimal

Algorithm:
Assume the jobs are ordered such that p[1] ≥ p[2] ≥ … ≥ p[n], and that d[i] ≥ 1,
1 ≤ i ≤ n, are the deadlines, n ≥ 1. J[i] is the ith job in the optimal solution,
1 ≤ i ≤ k; at termination d[J[i]] ≤ d[J[i+1]], 1 ≤ i < k.

JobSequencing(int d[], int J[], int n)
{
d[0] = J[0] = 0; // Initialize.
J[1] = 1; // Include job 1.
int k=1;
for (int i=2; i<=n; i++)
{//Consider jobs in nonincreasing order of p[i]. Find position for i and check
feasibility of insertion.
int r = k;
while ((d[J[r]] > d[i]) && (d[J[r]] != r))
r--;
if ((d[J[r]] <= d[i]) && (d[i] > r))
{
// Insert i into J[].
for (int q=k; q>=(r+1); q--)
J[q+1] = J[q];
J[r+1] = i; k++;
}
}
return (k);
}

Analysis
The for loop executes O(n) times. The while loop inside the for loop executes at most k times, and if the condition inside the if statement is true the inner for loop executes O(k-r) times. Hence the total time for each iteration of the outer for loop is O(k). Thus the time complexity is O(n²).
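As a runnable counterpart to the pseudocode (a sketch, not the notes' exact routine), the same greedy idea can be written with 0-indexed jobs and an explicit array of unit-time slots; `job_sequencing` and the slot array are illustrative names:

```python
def job_sequencing(profits, deadlines):
    """Greedy job sequencing: take jobs in nonincreasing profit order and
    place each one in the latest free slot not after its deadline."""
    jobs = sorted(range(len(profits)), key=lambda i: -profits[i])
    slots = [None] * max(deadlines)   # one unit-time slot per time position
    total = 0
    for i in jobs:
        # Scan backwards from the job's deadline for a free slot.
        for t in range(min(deadlines[i], len(slots)) - 1, -1, -1):
            if slots[t] is None:
                slots[t] = i
                total += profits[i]
                break
    return total, [i for i in slots if i is not None]
```

On the example above, with profits (100, 10, 15, 27) and deadlines (2, 1, 2, 1), this selects jobs 1 and 4 (0-indexed: 0 and 3) for a total profit of 127.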

Unit 3:-
Graph Algorithms
Graph is a collection of vertices or nodes, connected by a collection of edges. Graphs are
extremely important because they are a very flexible mathematical model for many application
problems. Basically, any time you have a set of objects, and there is some “connection” or
“relationship” or “interaction” between pairs of objects, a graph is a good way to model this.
Examples of graphs in application include communication and transportation networks, VLSI
and other sorts of logic circuits, surface meshes used for shape description in computer-aided
design and geographic information systems, precedence constraints in scheduling systems etc.

A directed graph (or digraph) G = (V,E) consists of a finite set V , called the vertices or nodes,
and E, a set of ordered pairs, called the edges of G.
An undirected graph (or graph) G = (V,E) consists of a finite set V of vertices, and a set E of
unordered pairs of distinct vertices, called the edges.
We say that vertex v is adjacent to vertex u if there is an edge (u, v). In a directed graph, given the edge e = (u, v), we say that u is the origin of e and v is the destination of e. In undirected graphs u and v are the endpoints of the edge. The edge e is incident on (meaning that it touches) both u and v.

In a digraph, the number of edges coming out of a vertex is called the out-degree of that vertex,
and the number of edges coming in is called the in-degree. In an undirected graph we just talk
about the degree of a vertex as the number of incident edges. By the degree of a graph, we
usually mean the maximum degree of its vertices.

Notice that generally the number of edges in a graph may be as large as quadratic in the
number of vertices. However, the large graphs that arise in practice typically have much fewer
edges. A graph is said to be sparse if E = Θ(V ), and dense, otherwise. When giving the
running times of algorithms, we will usually express it as a function of both V and E, so that
the performance on sparse and dense graphs will be apparent.

Representation of Graph
Generally a graph can be represented in two ways, namely adjacency lists (linked-list representation) and adjacency matrices.

Adjacency List:
This type of representation is suitable for the undirected graphs without multiple edges,
and directed graphs. This representation looks as in the tables below.

If we try to apply graph algorithms using the representation of graphs by lists of edges, or adjacency lists, it can be tedious and time-consuming when the number of edges is high. For the sake of computation, graphs with many edges can be represented in another way. Here we discuss representing graphs in the form of a matrix.
Adjacency Matrix:
Given a simple graph G = (V, E) with |V| = n, assume that the vertices of the graph are listed in some arbitrary order, say v1, v2, …, vn. The adjacency matrix A of G, with respect to this order of the vertices, is the n-by-n zero-one matrix A = [aij] with the condition aij = 1 if {vi, vj} is an edge of G, and aij = 0 otherwise.
Since there are n vertices and we may order vertices in any order there are n! possible order of
the vertices. The adjacency matrix depends on the order of the vertices, hence there are n!
possible adjacency matrices for a graph with n vertices.
In the case of a directed graph we can extend the same concept as in an undirected graph, dictated by the relation aij = 1 if (vi, vj) is an edge, and aij = 0 otherwise.

If the number of edges is small then the adjacency matrix becomes sparse. In such a condition it may be beneficial to represent the graph with an adjacency list.
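Both representations can be sketched in Python (illustrative helper names; the notes themselves only describe the idea, using 0-indexed vertices here):

```python
def to_adjacency_list(n, edges, directed=False):
    """Adjacency-list (linked-list style) representation: vertex -> neighbours."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)          # undirected: record the edge both ways
    return adj

def to_adjacency_matrix(n, edges, directed=False):
    """n-by-n zero-one adjacency matrix: A[u][v] = 1 iff (u, v) is an edge."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        if not directed:
            A[v][u] = 1               # undirected matrix is symmetric
    return A
```

For a path 0–1–2, the list form is {0: [1], 1: [0, 2], 2: [1]} and the matrix is symmetric with ones on the corresponding entries.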
Solution: Let the order of the vertices be a, b, c, d, e, f. (graph and matrix figures omitted)

Let us take a directed graph.

Solution: Let the order of the vertices be a, b, c, d, e, f, g. (graph and matrix figures omitted)

Graph Traversals
There are a number of approaches used for solving problems on graphs. One of the most important is based on the notion of systematically visiting all the vertices and edges of a graph. The reason for this is that these traversals impose a type of tree structure (or generally a forest) on the graph, and trees are usually much easier to reason about than general graphs.

Breadth-first search
This is one of the simplest methods of graph searching. Choose some vertex arbitrarily as a root. Add all the vertices and edges that are incident on the root. The newly added vertices become the vertices at level 1 of the BFS tree. From the set of added vertices at level 1, find the other vertices that are connected by edges to the level-1 vertices. Follow the above step until all the vertices are added.

Algorithm:
BFS(G,s) //s is start vertex
{
    T = {s};
    L = Φ; //an empty queue
    enqueue(L,s);
    while (L != Φ)
    {
        v = dequeue(L);
        for each neighbor w of v
            if (w ∉ L and w ∉ T)
            {
                enqueue(L,w);
                T = T ∪ {w}; //put edge {v,w} also
            }
    }
}
Example:
Use breadth first search to find a BFS tree of the following graph.

Solution:

Analysis
From the algorithm above, every vertex is put into the queue once and then accessed. For each vertex dequeued, its adjacent vertices are examined, which can take O(n) time (in the worst case the graph is complete). This computation over all the n vertices that may pass through the queue gives a complexity of O(n²). Also, from aggregate analysis we can write the complexity as O(E+V), because the inner loop executes E times in total.
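A runnable Python sketch of the pseudocode above (illustrative names; a `deque` serves as the queue L, and the O(V+E) bound follows because each vertex is enqueued once and each adjacency list is scanned once):

```python
from collections import deque

def bfs_tree(adj, s):
    """Breadth-first search from source s.

    adj: dict vertex -> list of neighbours.
    Returns (visit order, BFS tree edges).
    """
    visited = {s}          # plays the role of T (plus queue membership)
    order = [s]
    tree = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                order.append(w)
                tree.append((v, w))   # edge {v, w} joins the BFS tree
                queue.append(w)
    return order, tree
```

On a 4-cycle with a chord the tree edges come out level by level, as in the hand example.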

Depth First Search


This is another technique that can be used to search the graph. Choose a vertex as a root and form a path by starting at the root vertex and successively adding vertices and edges. This process is continued until the path can no longer be extended. If the path contains all the vertices then the tree consisting of this path is the DFS tree. Otherwise, we must add other edges and vertices: move back from the last vertex met on the previous path and check whether a new path can be started from the vertex just reached. If there is such a path, continue the process above. If not, move back to another vertex and repeat the process. The whole process is continued until all the vertices are met. This method of search is also called backtracking.
Example:
Use depth first search to find a spanning tree of the following graph. (figure omitted)

Algorithm:
DFS(G,s)
{
T = {s};
Traverse(s);
}
Traverse(v)
{
for each w adjacent to v and not yet in T
{
T = T U {w}; //put edge {v,w} also
Traverse (w);
}
}

Analysis:
The complexity of the algorithm is greatly affected by the Traverse function. We can write its running time in terms of the recurrence T(n) = T(n-1) + O(n); here O(n) arises because for each vertex at most all the vertices are checked (the for loop). At each recursive call the number of remaining vertices decreases by one. Solving this, we find that the complexity of the algorithm is O(n²).

Also, from aggregate analysis we can write the complexity as O(E+V), because the Traverse function is invoked at most V times and the for loop executes O(E) times in total.
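A Python sketch mirroring Traverse (illustrative names; the recursion visits each vertex once and each adjacency list once, matching the O(V+E) aggregate bound):

```python
def dfs_tree(adj, s):
    """Recursive depth-first search from s; returns the DFS tree edges.

    adj: dict vertex -> list of neighbours.
    """
    tree = []
    in_tree = {s}          # T in the pseudocode

    def traverse(v):
        for w in adj[v]:
            if w not in in_tree:
                in_tree.add(w)
                tree.append((v, w))   # edge {v, w} joins the DFS tree
                traverse(w)

    traverse(s)
    return tree
```

On a triangle 0–1–2 starting from 0, the search goes deep first: it takes edge (0, 1), then (1, 2), and backtracks.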

Minimum Spanning Tree


Given an undirected graph G = (V,E), a subgraph T = (V,E′) of G is a spanning tree if and only if T is a tree. The MST is a spanning tree of a connected weighted graph such that the total sum of the weights of all edges e ∈ E′ is minimum amongst all the sums of edges that would give a spanning tree.

Kruskal’s Algorithm:
The problem of finding an MST can be solved by using Kruskal's algorithm. The idea behind this algorithm is to put the set of edges from the given graph G = (V,E) in nondecreasing order of their weights. Selecting each edge in that sequence then guarantees that the total cost of the tree that would form will be the minimum. Note that we have G as a graph, V as a set of n vertices and E as a set of edges of graph G.

Algorithm:
KruskalMST(G)
{
    T = {V} // forest of n nodes
    S = set of edges sorted in nondecreasing order of weight
    while(|T| < n-1 and S != ∅)
    {
        Select (u,v) from S in order
        Remove (u,v) from S
        if((u,v) does not create a cycle in T)
            T = T ∪ {(u,v)}
    }
}

Analysis:
In the above algorithm the n-tree forest at the beginning takes O(V) time, the creation of set S takes O(ElogE) time, the while loop executes at most O(E) times, and the steps inside the loop take almost linear time (see disjoint set operations: find and union). So the total time taken is O(ElogE), or asymptotically equivalently O(ElogV).
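A Python sketch with a disjoint-set forest (illustrative names; `find` uses path compression, and the cycle test is simply "do u and v already have the same root?"):

```python
def kruskal_mst(n, edges):
    """Kruskal's MST. edges: list of (weight, u, v) with vertices 0..n-1.

    Returns (total weight, list of MST edges).
    """
    parent = list(range(n))             # each vertex starts as its own tree

    def find(x):                        # root of x's tree, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):       # nondecreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # adding (u, v) creates no cycle
            parent[ru] = rv             # union the two trees
            mst.append((u, v))
            total += w
            if len(mst) == n - 1:       # spanning tree complete
                break
    return total, mst
```

Sorting dominates, giving the O(ElogE) bound from the analysis above.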

Prim’s Algorithm
This is another algorithm for finding MST. The idea behind this algorithm is just take any arbitrary
vertex and choose the edge with minimum weight incident on the chosen vertex. Add the vertex
and continue the above process taking all the vertices added. Remember the cycle must be avoided.


Algorithm:
PrimMST(G)
{
T = ∅; // T is a set of edges of MST
S = {s} ; //s is randomly chosen vertex and S is set of vertices
while(S != V)
{
e = (u,v) an edge of minimum weight incident to vertices in T and not forming a
simple circuit in T if added to T i.e. u ∈ S and v∈ V-S
T = T ∪ {(u,v)};
S = S ∪ {v};
}
}

Analysis:
In the above algorithm the while loop executes O(V) times. The edge of minimum weight incident on a vertex can be found in O(E) time, so the total time is O(EV). We can improve the performance of the above algorithm by choosing better data structures such as a priority queue, and it can then be shown that the running time of Prim's algorithm is O(ElogV).
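A heap-based sketch of the improved version in Python (illustrative; instead of a decrease-key operation it uses lazy deletion, skipping stale heap entries, which still gives O(ElogV) with a binary heap):

```python
import heapq

def prim_mst(adj, s=0):
    """Prim's MST total weight. adj: dict u -> list of (v, weight)."""
    in_tree = set()
    total = 0
    pq = [(0, s)]                        # (edge weight reaching vertex, vertex)
    while pq:
        w, u = heapq.heappop(pq)
        if u in in_tree:
            continue                     # stale entry: u was already added
        in_tree.add(u)
        total += w
        for v, wt in adj[u]:
            if v not in in_tree:         # candidate edge crossing the cut
                heapq.heappush(pq, (wt, v))
    return total
```

On the same 4-vertex example used for Kruskal it produces the same total weight, as any MST algorithm must.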
Shortest Path Problem
Given a weighted graph G = (V,E), every path p = <v0, v1, …, vk> has weight w(p) = w(v0,v1) + w(v1,v2) + … + w(vk-1,vk). A shortest path from u to v is the path from u to v with minimum weight, denoted by d(u,v). It is important to remember that a shortest path may or may not exist in a graph: if there is a negative weight cycle then there is no shortest path. For example, the graph below has no shortest path from a to c; you can notice the negative weight cycle on the path from a to b.

As a matter of fact, even a positive weight cycle never occurs on a shortest path, but in that case shortest paths still exist. Some of the variations of the shortest path problem include:

Single Source: This type of problem asks us to find the shortest path from the given vertex
(source) to all other vertices in a connected graph

Single Destination: This type of problem asks us to find the shortest path to the given vertex (des-
tination) from all other vertices in a connected graph.

Single Pair: This type of problem asks us to find the shortest path from the given vertex (source)
to another given vertex (destination).

All Pairs: This type of problem asks us to find the shortest path from the all vertices to all other
vertices in a connected graph

Single Source Problem


Relaxation: Relaxation of an edge (u,v) is the process of testing the total weight of the shortest path to v by going through u and, if we get a weight less than the previous one, replacing the record of the previous shortest path with the new one.

Directed Acyclic Graphs (Single Source Shortest paths)


Recall the definition of DAG, DAG is a directed graph G = (V,E) without a cycle. The algorithm
that finds the shortest paths in a DAG starts by topologically sorting the DAG for getting the linear
ordering of the vertices. The next step is to relax the edges as usual.

Example:
Find the shortest path from the vertex c to all other vertices in the following DAG.
(figure and solution omitted)
Algorithm:
DagSP(G,w,s)
{
Topologically Sort the vertices of G
for each vertex v belongs to V
do d[v] = ∞
d[s] = 0
for each vertex u, taken in topologically sorted order
do for each vertex v adjacent to u
do if d[v] > d[u] + w(u,v)
then d[v] = d[u] + w(u,v)
}
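The two steps of DagSP (topological sort, then relaxation in sorted order) can be sketched in Python; names are illustrative, and the topological sort is done with DFS finish times:

```python
def dag_shortest_paths(adj, s):
    """Single-source shortest paths in a DAG. adj: dict u -> list of (v, w)."""
    order, seen = [], set()

    def visit(u):                        # DFS; append u after all descendants
        seen.add(u)
        for v, _ in adj[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in adj:
        if u not in seen:
            visit(u)
    order.reverse()                      # reverse finish times = topological order

    d = {u: float('inf') for u in adj}
    d[s] = 0
    for u in order:                      # relax out-edges in topological order
        if d[u] != float('inf'):
            for v, w in adj[u]:
                if d[v] > d[u] + w:      # the relaxation step
                    d[v] = d[u] + w
    return d
```

Each edge is relaxed exactly once, so the whole routine runs in O(V+E) after the sort.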

Dijkstra’s Algorithm
This is another approach of getting single source shortest paths. In this algorithm it is assumed that
there is no negative weight edge. Dijkstra’s algorithm works using greedy approach, as we will see
later.
Example:
Find the shortest paths from the source g to all other vertices using Dijkstra’s algorithm.

Solution:


Algorithm:
Dijkstra(G,w,s)
{
    for each vertex v ∈ V
        do d[v] = ∞
    d[s] = 0
    S = ∅
    Q = V
    while(Q != ∅)
    {
        u = Take minimum from Q and delete
        S = S ∪ {u}
        for each vertex v adjacent to u
            do if d[v] > d[u] + w(u,v)
                then d[v] = d[u] + w(u,v)
    }
}

Analysis:
In the above algorithm, the first for loop takes O(V) time. Initialization of the priority queue Q takes O(V) time. The while loop executes O(V) times, and for each execution the block inside the loop takes O(V) time. Hence the total running time is O(V²).
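With a binary heap as the priority queue the bound improves to O(ElogV); a Python sketch (illustrative names, lazy deletion of stale heap entries instead of a decrease-key operation, nonnegative weights assumed):

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths. adj: dict u -> list of (v, w), w >= 0."""
    d = {u: float('inf') for u in adj}
    d[s] = 0
    pq = [(0, s)]
    done = set()                         # S: vertices with final distance
    while pq:
        du, u = heapq.heappop(pq)        # take minimum from Q
        if u in done:
            continue                     # stale entry
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:            # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    return d
```

Each edge can push at most one heap entry, giving O(ElogV) heap work overall.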

All Pairs Problem


As defined in above sections, we can apply single source shortest path algorithms |V| times to
solve all pair shortest paths problem.
Floyd-Warshall Algorithm
The algorithm being discussed uses the dynamic programming approach. The algorithm presented here works even if some of the edges have negative weights (but no negative cycles). Consider a weighted graph G = (V,E) and denote the weight of the edge connecting vertices i and j by wij. Let W be the adjacency matrix for the given graph G. Let Dk denote an n×n matrix such that Dk(i,j) is defined as the weight of the shortest path from vertex i to vertex j using only vertices from 1,2,…,k as intermediate vertices in the path. If we consider shortest paths with intermediate vertices as above then computing the path contains two cases: either Dk(i,j) does not contain k as an intermediate vertex, or Dk(i,j) contains k as an intermediate vertex. Then we have the following relations:
Dk(i,j) = Dk-1(i,j), when k is not an intermediate vertex, and
Dk(i,j) = Dk-1(i,k) + Dk-1(k,j), when k is an intermediate vertex.
Combining the two cases, Dk(i,j) = min(Dk-1(i,j), Dk-1(i,k) + Dk-1(k,j)), with D0 = W.

Analysis:
Clearly the above algorithm's running time is O(n³), where n is the cardinality of the set V of vertices.
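The recurrence translates directly into three nested loops; a Python sketch (D updated in place, `float('inf')` marking missing edges, names illustrative):

```python
def floyd_warshall(W):
    """All-pairs shortest paths.

    W: n x n matrix with W[i][i] = 0, W[i][j] = edge weight or inf.
    """
    n = len(W)
    D = [row[:] for row in W]            # D0 = W
    for k in range(n):                   # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                # Either avoid k, or route i -> k -> j.
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```

The triple loop makes the O(n³) running time immediate.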

Chapter 2:
[Geometric Algorithms]
Computational Geometry
The field of computational geometry deals with the study of geometric problems. Here we present a few geometric problems, e.g. detecting the intersection between line segments, and try to solve them using known algorithms. In this lecture, we discuss and present algorithms in the context of 2-D.

Application Domains
- Computer graphics
- Robotics
- GIS
- CAD/CAM – IC design, automobiles, buildings
- Molecular modeling
- Pattern recognition

Some Definitions
Point:
A point is a pair of numbers. The numbers are real numbers, but in our usual calculation we
concentrate on integers. For e.g. p1(x1,y1) and p2(x2,y2) are two points as shown

Line segment:
A line segment is a pair of points p1 and p2, where two points are end points of the segment.
For e.g. S(p1,p2) is shown below.

Ray: -
A ray is an infinite one-dimensional subset of a line determined by two points, say P0 and P1, where one point is designated as the endpoint.
Thus, a ray has a bounded endpoint and extends to infinity along the line in one direction.
Line: - A line is represented by a pair of points, say P0 and P1, and extends to infinity in both directions along the segment represented by the pair of points P0 and P1.

Polygon:
Simply put, a polygon is a homeomorphic image of a circle, i.e. it is a certain deformation of a circle.

Simple Polygon:
A simple polygon is a region of plane bounded by a finite collection of line segments to form a simple
closed curve. Mathematically, let V0, V1, V2, ---------, Vn-1 are n ordered vertices in the plane,
then the line segments e0 (V0, V1), e1 (V1, V2), ……., en-1 (Vn-1, V0) form a simple polygon if and only
if;
� the intersection of each pair of segments adjacent in cyclic ordering is a simple single point
shared by them;
ei ∩ ei+1 = Vi+1 &
� non-adjacent segments do not intersect;
ei ∩ e j = Φ
Thus, a polygon is simple if there are no intersection points between non-consecutive line
segments, i.e. vertices are the only intersection points. Vertices of a simple polygon
are assumed to be ordered in the counterclockwise direction.

Non-Simple polygon (Self Intersecting)


A polygon is non-simple if there is no single interior region, i.e. non-adjacent edges intersect
each other.
Convex Polygon:
A simple polygon P is convex if and only if for any pair of points x, y in P the line segment between x and y lies entirely in P. We can notice that if every interior angle is less than 180°, then the simple polygon is convex.

Diagonal of a simple polygon:


A diagonal of a simple polygon is a line segment connecting two non-adjacent vertices and lying completely inside the polygon.

Here (V2, V8), (V3, V7), (V4, V6) and (V10, V12) are all diagonals of the polygon, but (V9, V11) is not a diagonal.

Ear of Polygon:
Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form an ear if (Vi, Vi+2) is a diagonal; Vi+1 is the tip of the ear.

Mouth:
Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form a mouth if (Vi, Vi+2) is an external diagonal. In the figure above, (V0, V1, V2) & (V2, V3, V4) are mouths of the polygon.

One-Mouth Theorem
Except for convex polygons, every simple polygon has at least one mouth.

Two-Ears Theorem
Every polygon of n ≥ 4 vertices has at least two non-overlapping ears.

Notion of Left Turn & Right Turns:


Left Turn:
For three points P0, P1, P2 in a plane, P0, P1, P2 is said to be a left turn if line segment (P1, P2) lies to the left of line segment (P0, P1).

Right Turn:
If line segment (P1, P2) lies to the right of (P0, P1), then P0, P1, P2 is a right turn.

Computing point of intersection between two line segments


We can apply our coordinate geometry method for finding the point of intersection between
two line segments. Let S1 and S2 be any two line segments. The following steps are used to
calculate point of intersection between two line segments. We are not considering parallel line
segments here in this discussion.
• Determine the equations of the lines through the line segments S1 and S2. Say the equations are L1: y = m1x + c1 and L2: y = m2x + c2 respectively. We can find the equation of line L1 using the slope formula m1 = (y2-y1)/(x2-x1), where (x1,y1) and (x2,y2) are the two given end points of segment S1. Similarly we can find m2 for L2. The values of the ci's can be obtained by substituting an end point of the respective segment into its equation once the slope is known.
• Solve the two equations of lines L1 and L2; let the value obtained by solving be p = (xi, yi). Here we confront two cases. The first case: if p is the intersection of the two line segments, then p lies on both S1 and S2. The second case: if p is not an intersection point, then p does not lie on at least one of the line segments S1 and S2.
The figure below shows both the cases.
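The two steps can be turned into a short routine (an illustrative sketch; it uses the parametric form rather than y = mx + c so that vertical segments need no special case, and parallel lines are simply reported as non-intersecting):

```python
def segment_intersection(s1, s2):
    """Point where segments s1 and s2 intersect, or None.

    Each segment is ((x1, y1), (x2, y2)). Collinear overlaps are not handled.
    """
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    # Solve for p = p1 + t*(p2 - p1) lying on the line through p3, p4.
    denom = (x4 - x3) * (y2 - y1) - (y4 - y3) * (x2 - x1)
    if denom == 0:
        return None                       # parallel lines
    t = ((x4 - x3) * (y3 - y1) - (y4 - y3) * (x3 - x1)) / denom
    px, py = x1 + t * (x2 - x1), y1 + t * (y2 - y1)

    def on(seg, p):
        # The two cases from the text: p must lie on the segment, not just the line.
        (ax, ay), (bx, by) = seg
        return (min(ax, bx) <= p[0] <= max(ax, bx)
                and min(ay, by) <= p[1] <= max(ay, by))

    return (px, py) if on(s1, (px, py)) and on(s2, (px, py)) else None
```

For the crossing diagonals of a unit square scaled by two, the lines meet at (1, 1) and the point lies on both segments, so it is returned; shortening one segment makes the same line intersection fall outside it and None is returned instead.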

Detecting point of intersection


In a straightforward manner we can compute the point of intersection p between the lines passing through S1 and S2 and check whether the line segments intersect, as done in the above discussion. However, the above method uses division in the computation, and division is a costly operation. Here we try to detect the intersection without using division.
Left and Right Turn: Given points p0(x0,y0), p1(x1,y1), and p2(x2,y2), to find whether the path p0p1p2 makes a left or right turn, we check whether the vector p0p1 is clockwise or counterclockwise with respect to the vector p0p2. We compute the cross product of the vectors given by the two line segments as
(p1 - p0) × (p2 - p0) = (x1 - x0, y1 - y0) × (x2 - x0, y2 - y0) = (x1 - x0)(y2 - y0) - (y1 - y0)(x2 - x0),
which can also be written as a 2×2 determinant.

See figure below to have idea on left and right turn as well as direction of points. The
cross product’s geometric interpretation is also shown below.

Using the concept of left and right turns we can detect the intersection between two line segments in a very efficient manner.
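A division-free sketch in Python (illustrative; it detects proper crossings only, leaving out touching and collinear boundary cases):

```python
def cross(p0, p1, p2):
    """Cross product (p1 - p0) x (p2 - p0): > 0 left turn, < 0 right, 0 collinear."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])

def segments_intersect(p1, p2, p3, p4):
    """True iff segment p1p2 properly crosses segment p3p4.

    Each segment must straddle the line through the other: the two
    endpoints must give cross products of opposite sign.
    """
    d1 = cross(p3, p4, p1)
    d2 = cross(p3, p4, p2)
    d3 = cross(p1, p2, p3)
    d4 = cross(p1, p2, p4)
    return ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0))
```

Only multiplications and comparisons are used, which is exactly what makes this test cheaper than solving the line equations.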

Convex hull:
Definition:
- The convex hull of a finite set of points S in the plane is the smallest convex polygon P that encloses S (smallest area).
- The convex hull of a set of points S in the plane is the union of all the triangles determined by points in S.
- The convex hull of a finite set of points S is the intersection of all the convex polygons (sets) that contain S.

There is a wide range of application areas where convex hulls are used, such as:
- In pattern recognition, an unknown shape may be represented by its convex hull, which is then matched against a database of known shapes.
- In motion planning, if the robot is approximated by its convex hull, then it is easier to plan a collision-free path through the landscape of obstacles.
- Smallest enclosing box, fitting ranges, and so on.

Graham’s Scan Algorithm:


This algorithm computes the convex hull of the points by maintaining the feasible candidate points on a stack. If a candidate point is not extreme, it is removed from the stack. When all points have been examined, only the extreme points remain on the stack, and they form the final hull.
Input ⇒ P = {p0, p1, …, pn-1}, a set of n points.
Output ⇒ Convex hull of P
Algorithm:
- Find a point pi with the lowest y-coordinate; let it be q0.
- Sort the input points angularly about q0; let the sorted list be {q0, q1, …, qn-1}.
- Push q0 onto stack S and push q1 onto stack S.
- Initialize i = 2
  while (i < n)
      if LeftTurn(next-to-top(S), top(S), qi) is true
          push qi onto stack S
          i++
      else
          pop the stack
      end if
  end while
- Each point popped from the stack is not a vertex of the convex hull.
- Finally, when all elements have been processed, the points that remain on the stack are the vertices of the convex hull.

Complexity Analysis:
- Finding the minimum y-coordinate point takes O(n) time.
- Sorting angularly about the point takes O(nlogn) time.
- Pushing & popping take constant time.
- The while loop runs O(n) times.
Hence, the complexity = O(n) + O(nlogn) + O(1) + O(n)
= O(nlogn).

Another way of constructing this algorithm:


GrahamScan(P) // P = {p1, p2,…, pn}
{
    p0 = point with lowest y-coordinate value.
    Angularly sort the other points with respect to p0. Let q = {q1, q2,…, qm} be the sorted points.
    Push(S, p0); // S is a stack
    Push(S, q1);
    Push(S, q2);
    for(i = 3; i <= m; i++)
    {
        while (NextToTop(S), Top(S), qi make a non-left turn)
            Pop(S);
        Push(S, qi);
    }
    return S;
}

Exercises:
1. Write down the algorithm to find the point of intersection of two line segments, if it exists.
2. Use Graham’s scan algorithm to find convex hull of the set of points below.

Example 2:
Find the convex hull of the set of points given below using Graham's scan algorithm.

Solution:

Chapter 3
[NP-Complete Problems & Approximation Algorithms]
Up to now we have been considering problems that can be solved by algorithms in worst-case polynomial time. There are many problems, and not all of them have an apparent efficient solution. In practice, the computer can solve: some problems in limited time, e.g. sorting; some problems that require an unmanageable amount of time, e.g. Hamiltonian cycles; and some problems that cannot be solved at all, e.g. the Halting Problem. In this section we concentrate on the specific class of problems called NP-complete problems (defined later).

Tractable and Intractable Problems:


We call problems tractable or easy if they can be solved by polynomial time algorithms. Problems that cannot be solved in polynomial time but require superpolynomial time algorithms are called intractable or hard. There are many problems for which no algorithm with running time better than exponential is known; some of them are the traveling salesman problem, Hamiltonian cycles, and circuit satisfiability.
P and NP classes and NP-completeness:
The set of problems that can be solved using polynomial time algorithms is regarded as the class P. The problems that are verifiable in polynomial time constitute the class NP. The class of NP-complete problems consists of those problems that are in NP and are as hard as any problem in NP (more on this later). The main concern of studying NP-completeness is to understand how hard a problem is. So if we find that a problem is NP-complete, we try to solve it using methods like approximation, rather than searching for a faster algorithm that solves it exactly.

Problems:
Abstract Problems:
An abstract problem A is a binary relation on a set I of problem instances and a set S of problem solutions. For e.g. the minimum spanning tree of a graph G can be viewed as a pair of the given graph G and the MST T.

Decision Problems:
Decision problem D is a problem that has an answer as either “true”, “yes”, “1” or “false”,
”no”, “0”. For e.g. if we have the abstract shortest path with instances of the problem and the
solution set as {0,1}, then we can transform that abstract problem by reformulating the problem as
“Is there a path from u to v with at most k edges”. In this situation the answer is either yes or no.

Optimization Problems:
We encounter many problems where there are many feasible solutions and our aim is to find the feasible solution with the best value. This kind of problem is called an optimization problem. For e.g. given the graph G and the vertices u and v, find the shortest path from u to v with the minimum number of edges. NP-completeness does not directly deal with optimization problems; however, we can translate an optimization problem into a decision problem.
Encoding:
Encoding of a set S is a function e from S to the set of binary strings. With the help of encoding,
we define concrete problem as a problem with problem instances as the set of binary strings i.e. if
we encode the abstract problem, then the resulting encoded problem is concrete problem. So,
encoding as a concrete problem assures that every encoded problem can be regarded as a language
i.e. subset of {0,1}*.

Complexity Class P:
Complexity class P is the set of concrete decision problems that are polynomial time solvable by deterministic algorithms. If we have an abstract decision problem A with instance set I mapping to the set {0,1}, an encoding e: I→{0,1}* is used to denote the concrete decision problem e(A). We have the solutions to both the abstract problem instance i ∈ I and the concrete problem instance e(i) ∈ {0,1}* as A(i) ∈ {0,1}. It is important to understand that the encoding mechanism can greatly vary the running time of the algorithm. For e.g. take some algorithm that runs in O(n) time, where n is the size of the input. If the input is just a natural number k, then its unary encoding makes the size of the input k bits (k 1's), so the O(k) running time is linear in the input size. If instead we encode the natural number k in binary, then we can represent k with just logk bits (try representing it with 0s and 1s), and with input size n = logk the same O(k) running time becomes O(2^n). However, in our discussion we discard encodings like unary, so that the choice of encoding makes little difference in complexity.
We define a polynomial time computable function f: {0,1}*→{0,1}* with respect to some polynomial time algorithm PA such that, given any input x ∈ {0,1}*, it results in output f(x). For some set I of problem instances, two encodings e1 and e2 are polynomially related if there are two polynomial time computable functions f and g such that for any i ∈ I, both f(e1(i)) = e2(i) and g(e2(i)) = e1(i) are true, i.e. each encoding can be computed from the other in polynomial time by some algorithm.

Polynomial time reduction:


Given two decision problems A and B, a polynomial time reduction from A to B is a polynomial time computable function f that transforms instances of A into instances of B such that the output of the algorithm for problem A on input instance x is the same as the output of the algorithm for problem B on input instance f(x), as shown in the figure below. If there is a polynomial time computable function f such that A can be reduced to B in this way, we write A ≤p B. The function f described above is called the reduction function, and the algorithm for computing f is called the reduction algorithm.
Complexity Class NP:
NP is the set of decision problems solvable by nondeterministic algorithms in polynomial
time. For many problems it is much easier to verify that a given value is a solution to the
problem than to compute a solution from scratch. Using this idea, we say a problem is in class
NP (nondeterministic polynomial time) if there is an algorithm that verifies candidate solutions
to the problem in polynomial time. V is a verification algorithm for the decision problem D if V
takes as input a string x, an instance of the problem D, together with another binary string y, the
certificate, whose size is at most polynomial in the size of x. The algorithm V verifies an
input x if there is a certificate y such that the answer of V on the input x with certificate y is yes.
For example, the circuit satisfiability problem (SAT) asks: "Given a Boolean combinational
circuit, is it satisfiable? That is, does the circuit have an assignment of truth values to its inputs
that produces the output of the circuit as 1?" Given a circuit x and a certificate y consisting of a
set of input values claimed to produce output 1, we can verify in polynomial time whether the
certificate satisfies the circuit. So we can say that the circuit satisfiability problem is in NP. We
always have P ⊆ NP, since if we have a polynomial-time algorithm that solves
(decides: notice the difference between decide and accept) a problem, then we can always obtain a
verification algorithm that ignores the certificate and simply accepts the output of the polynomial-time
algorithm. From the above fact we are clear that P ⊆ NP, but the question whether P = NP remains
unsolved and is still the big open question in theoretical computer science. Most computer
scientists, however, believe that P ≠ NP.
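The verification idea can be made concrete with a small sketch (an assumed helper, not from the notes): here the instance x is a Boolean formula in CNF, given as a list of clauses of signed variable indices, and the certificate y is a truth assignment. The verifier runs in time linear in the size of the formula:

```python
# Sketch of a polynomial-time verification algorithm V for satisfiability.
# The instance is a CNF formula as a list of clauses; each clause is a list
# of literals (a positive index i means x_i, a negative index -i means
# NOT x_i). The certificate is a dict mapping variable index -> bool.

def verify_sat(clauses, assignment):
    """Return True iff the certificate satisfies every clause."""
    def literal_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value

    # Each clause must contain at least one true literal.
    return all(any(literal_true(lit) for lit in clause) for clause in clauses)

# (x1 OR NOT x2) AND (x2 OR x3)
clauses = [[1, -2], [2, 3]]
print(verify_sat(clauses, {1: True, 2: True, 3: False}))   # True
print(verify_sat(clauses, {1: False, 2: True, 3: False}))  # False
```

The verifier itself is fast; the hardness of SAT lies in finding a satisfying certificate, not in checking one.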

NP-Completeness:
NP-complete problems are the hardest problems in class NP. We say a problem A is
NP-complete if
1. A ∈ NP, and
2. B ≤p A, for every B ∈ NP.
A problem (or language) A satisfying property 2 alone is called NP-hard.

Cook’s Theorem:
“SAT is NP-hard”
Proof: (This is not the actual proof as given by Cook; it is just a sketch.)
Take any problem V ∈ NP, and let A be the algorithm that verifies V in polynomial time (such an
algorithm must exist since V ∈ NP). We can implement A on a computer, and therefore there exists
a (huge) logical circuit whose input wires correspond to the bits of the inputs x and y of A and
which outputs 1 precisely when A(x, y) returns yes.
For any instance x of V, let Ax be the circuit obtained from A by setting the x-input wire values
according to the specific string x. The construction of Ax from x is our reduction function. If x is a
yes instance of V, then the certificate y for x gives a satisfying assignment for Ax. Conversely, if Ax
outputs 1 for some assignment to its input wires, that assignment translates into a certificate for x.

Theorem 2: (Cook’s Theorem)


“SAT is NP-complete”
Proof:
To show that SAT is NP-complete we have to show the two properties given by the definition of
NP-complete problems. The first property, that SAT is in NP, was shown above in the discussion
of class NP, so it is sufficient to show that the second property holds for SAT. The second
property, that SAT is NP-hard, is exactly the statement “SAT is NP-hard” proved above. This
completes the proof.

Approximation Algorithms:
An approximation algorithm is a way of dealing with NP-completeness for optimization
problems. This technique does not guarantee the best solution. The goal of an approximation
algorithm is to come as close as possible to the optimal value in a reasonable amount of time,
which is at most polynomial time. If we are dealing with an optimization problem (maximization
or minimization) whose feasible solutions have positive cost, then it is worthwhile to look at
approximation algorithms for near-optimal solutions.
An algorithm has an approximation ratio of ρ(n) if, for any problem instance of input size n, the
cost C of the solution produced by the algorithm and the cost C* of an optimal solution satisfy
max(C/C*, C*/C) ≤ ρ(n). Such an algorithm is called a ρ(n)-approximation algorithm.
The relation applies to both maximization (0 < C ≤ C*) and minimization (0 < C* ≤ C) problems.
ρ(n) is always greater than or equal to 1. If the solution produced by the approximation algorithm
is a true optimal solution then clearly we have ρ(n) = 1.
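A small sketch (an assumed helper, not from the notes) of the ratio just defined; the single formula max(C/C*, C*/C) covers minimization (where C ≥ C*) and maximization (where C ≤ C*) alike:

```python
# Sketch: compute the approximation ratio max(C/C*, C*/C) of a solution
# of cost c against an optimal cost c_star. Both costs must be positive.

def approximation_ratio(c, c_star):
    """Return max(C/C*, C*/C); always >= 1, and == 1 iff c == c_star."""
    assert c > 0 and c_star > 0
    return max(c / c_star, c_star / c)

# Minimization: the algorithm finds cost 15, the optimum is 10 -> ratio 1.5
print(approximation_ratio(15, 10))  # 1.5
# Maximization: the algorithm achieves value 8, the optimum is 16 -> ratio 2.0
print(approximation_ratio(8, 16))   # 2.0
```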

Vertex Cover Problem:
A vertex cover of an undirected graph G = (V, E) is a subset C ⊆ V such that every edge of G
has at least one endpoint in C. Finding a minimum vertex cover is NP-complete, but the simple
algorithm below is a 2-approximation.

Algorithm:
ApproxVertexCover (G)
{
C = {}
E' = E
while E' is not empty
do Let (u, v) be an arbitrary edge of E'
C = C ∪ {u, v}
Remove from E' every edge incident on either u or v
return C
}
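The pseudocode above can be sketched in Python as follows (a possible translation, not code from the notes): repeatedly pick an arbitrary remaining edge, add both of its endpoints to the cover, and discard every edge touching either endpoint.

```python
# Runnable sketch of ApproxVertexCover. The returned cover is valid and
# at most twice the size of an optimal vertex cover, because the picked
# edges form a matching and any cover must use one endpoint of each.

def approx_vertex_cover(edges):
    """Return a vertex cover of the graph given as a list of edges."""
    cover = set()
    remaining = list(edges)          # E'
    while remaining:
        u, v = remaining[0]          # an arbitrary edge of E'
        cover |= {u, v}              # C = C U {u, v}
        # Remove from E' every edge incident on either u or v.
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
cover = approx_vertex_cover(edges)
print(all(u in cover or v in cover for (u, v) in edges))  # True
```

On this graph an optimal cover is {0, 3} (size 2), so the algorithm's cover has size at most 4, consistent with the factor-2 guarantee.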

Example: (vertex cover running example for graph below)



Solution:

Analysis:
If E' is represented using adjacency lists, the above algorithm takes O(V + E) time, since
each edge is processed only once and each vertex is processed only once throughout the whole
operation.

Tribhuvan University
Bachelor of Science in Computer Science
And Information Technology
Examination, 2069
New Summit College
(Old Baneshowr , Kathmandu)

Subject: “Design and Analysis of Algorithms (DAA)”                    FM: 80



Time: 3 hrs.                                                          PM: 32


Attempt all the questions. {10 × 8 = 80}
1. Why are asymptotic notations important in algorithm analysis? Describe big-O, big-Ω and big-
Θ notations with suitable examples. {2+2+2+2}

2. What is a recurrence relation? Prove that the complexity of the recurrence relation “T(n) =
8T(n/2) + n^2” is O(n^3) by using the substitution method. {1+7}

3. Given the following block of code, write a recurrence relation for it and also find an asymptotic
upper bound. (Assume that all dotted code takes constant time.) {4+4}
Fun(int n)
{
…………….
if(condition1)
x=Fun(n/2)
else if(condition2)
x=Fun(2n/3)
else
x= Fun(n/4)
……………
}

4. What is the concept behind randomized quick sort? Write down its algorithm and give its
average-case analysis. {1+3+4}

5. What is meant by median order statistics? Write the algorithm for expected linear-time selection
and analyze it. {1+3+4}

6. Devise a divide and conquer algorithm for finding minimum and maximum element among a
set of given elements. Write recurrence relation for your algorithm and give its big-O estimate.
{5+ 3}

7. What are the characteristics of problems that can be solved by using a dynamic programming
algorithm? Give the recursive definition for solving the 0/1 knapsack problem. Trace the algorithm
for w = {3,4,2,2,3}, v = {12,14,6,5,6} and a knapsack of capacity 12. {2+1+5}

8. Write the recurrence relation for the Longest Common Subsequence (LCS) problem. Trace the
algorithm to find an LCS of X = {a,b,c,b,d,a,b} and Y = {b,d,c,a,b,a}. {2+6}

9. Use the master method to find the big-O estimates of the recurrences: {4+4}
a. T(n) = 3T(n/2) + n
b. T(n) = 4T(n/2) + n^2

10 Show all the steps required for sorting an array of size 10 by using Heap sort.
a[10]={5, 3, 2, 4, 7, 8, 1, 11, 9, 15}. (8)
Hint: At first construct a heap and then sort by using Heap sort properties.
A Complete Note in Design And Analysis of Algorithms
By Bhupendra S. Saud

Email: Saud.bhupendra427@gmail.com

The End

