UNIT: 1
Principles of Analyzing Algorithms and Problems
An algorithm is a finite set of computational instructions, each executable in finite time, that performs a computation or solves a problem by taking some value or set of values as input and producing some value or set of values as output. Algorithms do not depend on a particular machine, programming language or compiler, i.e. an algorithm runs in the same manner everywhere. An algorithm is thus a mathematical object, assumed to run on a machine with unlimited capacity.
Examples of problems
• Given two numbers, how do you find their Greatest Common Divisor?
• Given an array of numbers, how do you sort them?
We need algorithms to understand the basic concepts of computer science and programming. To understand the input-output relation of a problem, we must be able to understand the steps involved in getting the output(s) from the given input(s).
You also need the design concepts behind algorithms: if you only study existing algorithms, you are bound to those algorithms and to selection among them. If you have knowledge of design, however, you can attempt to improve performance using different design principles.
The analysis of algorithms gives good insight into the algorithms under study. Analysis tries to answer questions like: Is the algorithm correct, i.e. does it generate the required result? Does the algorithm terminate for all inputs in the problem domain? Other issues of analysis are efficiency, optimality, etc. By knowing these aspects of different algorithms for the same problem domain, we can choose the better algorithm for our need. This is done by knowing the resources an algorithm needs for its execution. The two most important resources are time and space. Both resources are measured in terms of complexity; for time, instead of absolute time we consider the growth of running time as a function of input size.
Algorithm Properties
Input(s)/output(s): There must be some inputs from the standard set of inputs, and an algorithm's execution must produce output(s).
Definiteness: Each step must be clear and unambiguous.
Finiteness: Algorithms must terminate after a finite number of steps.
Correctness: A correct set of output values must be produced from each set of inputs.
Effectiveness: Each step must be carried out in finite time.
Here we deal with correctness and finiteness.
By Bhupendra Saud 1
Source: www.csitnepal.com
DAA New Summit College(B.Sc.CSIT)
Best, Worst and Average case
Best case complexity gives a lower bound on the running time of the algorithm for any instance of input(s). The algorithm can never have a lower running time than the best case for the particular class of problems.
Worst case complexity gives an upper bound on the running time of the algorithm for all instances of the input(s). This ensures that no input can exceed the running time limit posed by the worst case complexity.
Average case complexity gives the average number of steps required on any instance of the input(s).
Efficiency
Time Complexity: If an algorithm's loop iterates up to n-2 times, its time complexity is O(n), since constant factors and lower-order terms are dropped.
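For instance, a trial-division primality test (an illustrative sketch, not an algorithm given in these notes) has a loop of exactly this shape, running at most n-2 times:

```c
/* Trial-division primality test. The loop tries every candidate divisor
   in 2 .. n-1, so its body runs at most n-2 times: O(n). */
int is_prime(int n)
{
    if (n < 2)
        return 0;
    for (int d = 2; d < n; d++)  /* up to n-2 iterations */
        if (n % d == 0)
            return 0;            /* found a divisor: not prime */
    return 1;
}
```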
Mathematical Foundation
Mathematics can provide a clear view of an algorithm, and understanding mathematical concepts aids the design and analysis of good algorithms. Here we present some of the mathematical concepts that are helpful in our study.
Exponents
Some of the formulas that are helpful are:
x^a · x^b = x^(a+b)
x^a / x^b = x^(a-b)
(x^a)^b = x^(ab)
x^n + x^n = 2x^n
2^n + 2^n = 2^(n+1)
Logarithms
Some of the formulas that are helpful are:
1. log_a b = log_c b / log_c a ; c > 0, c ≠ 1
2. log(ab) = log a + log b
3. log(a/b) = log a - log b
4. log(a^b) = b log a
5. log x < x for all x > 0
6. log 1 = 0, log 2 = 1, log 1024 = 10 (base 2)
7. a^(log_b n) = n^(log_b a)
Series
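The sums used most often in the analyses that follow are:
1 + 2 + 3 + … + n = n(n+1)/2
1 + x + x^2 + … + x^k = (x^(k+1) - 1)/(x - 1) for x ≠ 1; in particular 1 + 2 + 4 + … + 2^k = 2^(k+1) - 1
1 + 1/2 + 1/3 + … + 1/n ≈ ln n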
Asymptotic Notation
Complexity analysis of an algorithm is very hard if we try to analyze it exactly. We know that the complexity (worst, best, or average) of an algorithm is a mathematical function of the size of the input, so if we analyze the algorithm in terms of bounds (upper and lower) then the analysis becomes easier. For this purpose we need the concept of asymptotic notations. The figure below gives the upper and lower bound concept.
Big Oh (O) notation
When we have only an asymptotic upper bound, we use O notation. A function f(x) = O(g(x)) (read as f(x) is big oh of g(x)) iff there exist two positive constants c and x0 such that for all x >= x0, 0 <= f(x) <= c*g(x).
The above relation says that c*g(x) is an upper bound of f(x).
Some properties:
Transitivity: f(x) = O(g(x)) & g(x) = O(h(x)) ⇒ f(x) = O(h(x))
O(1) is used to denote constants.
For all values of n >= n0, the plot shows clearly that f(n) lies below or on the curve of c*g(n).
Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = O(g(n)).
Proof: let us choose c and n0 values as 14 and 1, respectively; then we have
f(n) <= c*g(n) for all n >= n0, as
3n^2 + 4n + 7 <= 14*n^2 for all n >= 1.
The above inequality is trivially true.
Hence f(n) = O(g(n)).
2. Prove that n log(n^3) is O(√(n^3)).
Proof: we have n log(n^3) = 3n log n.
Again, √(n^3) = n√n.
If we can prove log n = O(√n) then the problem is solved, since then 3n log n = n·O(√n) = O(n√n) = O(√(n^3)).
We can use the fact that log^a n is O(n^b) for all a, b > 0.
In our problem a = 1 and b = 1/2, hence log n = O(√n).
So by knowing log n = O(√n) we have proved that n log(n^3) = O(√(n^3)).
Big Omega (Ω) notation
When we have only an asymptotic lower bound, we use Ω notation. A function f(n) = Ω(g(n)) iff there exist two positive constants c and n0 such that for all n >= n0, 0 <= c*g(n) <= f(n).
Examples
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Ω(g(n)).
Proof: let us choose c and n0 values as 1 and 1, respectively; then we have
f(n) >= c*g(n) for all n >= n0, as
3n^2 + 4n + 7 >= 1*n^2 for all n >= 1.
The above inequality is trivially true.
Hence f(n) = Ω(g(n)).
Big Theta (Θ) notation
When we have an asymptotically tight bound, we use Θ notation. A function f(n) = Θ(g(n)) iff there exist three positive constants c1, c2 and n0 such that for all n >= n0, c2*g(n) <= f(n) <= c1*g(n).
Example
1. f(n) = 3n^2 + 4n + 7, g(n) = n^2; prove that f(n) = Θ(g(n)).
Proof: let us choose c1, c2 and n0 values as 14, 1 and 1, respectively; then we have
f(n) <= c1*g(n) as 3n^2 + 4n + 7 <= 14*n^2, and
f(n) >= c2*g(n) as 3n^2 + 4n + 7 >= 1*n^2,
for all n >= 1 (in both cases).
So c2*g(n) <= f(n) <= c1*g(n) holds trivially.
Hence f(n) = Θ(g(n)).
Recurrences
• Recursive algorithms are described by using recurrence relations.
• A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
For example:
Recursive algorithm for finding factorial:
T(n) = 1 when n = 1
T(n) = T(n-1) + O(1) when n > 1
Recursive algorithm for finding the nth Fibonacci number:
T(n) = 1 when n = 1
T(n) = 1 when n = 2
T(n) = T(n-1) + T(n-2) + O(1) when n > 2
Recursive algorithm for binary search:
T(n) = 1 when n = 1
T(n) = T(n/2) + O(1) when n > 1
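The factorial recurrence can be read off directly from the recursive routine (a sketch; the notes give only the recurrence): each call does O(1) work plus one call on n-1, so T(n) = T(n-1) + O(1) = O(n).

```c
/* Recursive factorial: one multiplication per call plus a recursive call
   on n-1, giving the recurrence T(n) = T(n-1) + O(1), i.e. O(n). */
long factorial(int n)
{
    if (n <= 1)      /* base case corresponds to T(1) = 1 */
        return 1;
    return n * factorial(n - 1);
}
```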
Iteration method
• Expand the relation until a summation free of recursive terms is obtained.
• Bound the summation.
e.g.
T(n) = 2T(n/2) + 1 when n > 1
T(n) = 1 when n = 1
T(n) = 2T(n/2) + 1
     = 2{2T(n/4) + 1} + 1
     = 4T(n/4) + 2 + 1
     = 4{T(n/8) + 1} + 2 + 1
     = 8T(n/8) + 4 + 2 + 1
     ………………………
     = 2^k T(n/2^k) + 2^(k-1) + … + 4 + 2 + 1
For simplicity assume:
n = 2^k, i.e. k = logn, so that T(n/2^k) = T(1) = 1
T(n) = 2^k + 2^(k-1) + … + 2^2 + 2^1 + 2^0
T(n) = (2^(k+1) - 1)/(2 - 1)
T(n) = 2^(k+1) - 1
T(n) = 2·2^k - 1
T(n) = 2n - 1
T(n) = O(n)
Second Example:
T(n) = T(n/3) + O(n) when n>1
T(n) = 1 when n=1
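Expanding this relation in the same way (taking the O(n) term to be n for concreteness):
T(n) = n + n/3 + n/9 + … + n/3^k + T(1), where n = 3^k
     ≤ n(1 + 1/3 + 1/9 + …) + 1
     = n · 1/(1 - 1/3) + 1
     = (3/2)n + 1
T(n) = O(n)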
Substitution Method
Takes two steps:
1. Guess the form of the solution, using unknown constants.
2. Use induction to find the constants & verify the solution.
First Example:
T(n) = 1 when n = 1
T(n) = 4T(n/2) + n when n > 1
Guess: T(n) = O(n^3).
Show: T(n) ≤ cn^3 for all n > n0.
Base case, n = 1:
T(n) = 1              Definition.
1 ≤ c                 Choose c large enough for the conclusion.
Inductive case, n > 1:
T(n) = 4T(n/2) + n    Definition.
     ≤ 4c(n/2)^3 + n  Induction.
     = (c/2)n^3 + n   Algebra.
     ≤ cn^3           For c ≥ 2, since then n ≤ (c/2)n^3 whenever n ≥ 1.
Second Example
T(n) = 1 when n = 1
T(n) = 4T(n/2) + n when n > 1
Guess: T(n) = O(n^2).
Same recurrence, but now try a tighter bound. More specifically, show:
T(n) ≤ cn^2 for all n > n0.
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)^2 + n
     = cn^2 + n
Not ≤ cn^2 !
Problem is that the constant isn’t shrinking.
Solution: Use a tighter guess & inductive hypothesis.
Subtract a lower-order term – a common technique.
Guess:
T(n) ≤ cn^2 - dn for all n > 0
Assume T(k) ≤ ck^2 - dk for all k < n. Show T(n) ≤ cn^2 - dn.
Base case, n = 1:
T(n) = 1 ≤ c·1 - d·1      Definition; holds choosing c ≥ d + 1.
Inductive case, n > 1:
T(n) = 4T(n/2) + n               Definition.
     ≤ 4(c(n/2)^2 - d(n/2)) + n  Induction.
     = cn^2 - 2dn + n            Algebra.
     = cn^2 - dn - (dn - n)      Algebra.
     ≤ cn^2 - dn                 Choosing d ≥ 1.
So T(n) ≤ cn^2 - dn for all n > 0.
Thus, T(n) = O(n^2).
Ability to guess effectively comes with experience.
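The guess can also be checked numerically. The sketch below (not part of the original notes) evaluates the recurrence directly for powers of two; in fact T(n) = 2n^2 - n exactly, which matches the cn^2 - dn form with c = 2, d = 1.

```c
/* Evaluate T(n) = 4T(n/2) + n with T(1) = 1 directly,
   for n a power of two. The closed form is T(n) = 2n^2 - n. */
long T(long n)
{
    if (n == 1)
        return 1;
    return 4 * T(n / 2) + n;
}
```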
Changing Variables:
Sometimes a little algebraic manipulation can make an unknown recurrence similar to one we have seen.
Consider the example
T(n) = 2T(n^(1/2)) + logn
Looks difficult. Rearrange as follows:
Let m = logn => n = 2^m
Thus,
T(2^m) = 2T(2^(m/2)) + m
Again let S(m) = T(2^m), so that
S(m) = 2S(m/2) + m
We can show that S(m) = O(mlogm), so
T(n) = T(2^m) = S(m) = O(mlogm) = O(logn · loglogn)
Recursion Tree
Just a simplification of the iteration method:
Consider the recurrence
T(1) = 1 when n = 1
T(n) = T(n/2) + 1 when n > 1
The tree is a single chain of subproblems T(n), T(n/2), T(n/4), …, T(n/2^k), and the cost at each level is 1.
For simplicity assume that n = 2^k, i.e. k = logn.
Summing the cost over all k + 1 levels,
complexity = O(logn)
Second Example
T(n) = 1 when n = 1
T(n) = 4T(n/2) + n when n > 1
The root of the tree costs n. At depth 1 there are 4 subproblems of size n/2, costing 2n in total; at depth 2 there are 16 subproblems of size n/4, costing 4n in total; in general the cost at depth i is 2^i·n, down to the leaves at depth k.
Assume n = 2^k, i.e. k = logn.
T(n) = n + 2n + 4n + … + 2^(k-1)·n + 2^k·n
     = n(1 + 2 + 4 + … + 2^(k-1) + 2^k)
     = n(2^(k+1) - 1)/(2 - 1)
     = n(2^(k+1) - 1)
     ≤ n·2^(k+1)
     = 2n·2^k
     = 2n·n
     = O(n^2)
Third Example
Solve T(n) = T(n/4) + T(n/2) + n^2.
The root of the recursion tree costs n^2; its two children cost (n/4)^2 + (n/2)^2 = (5/16)n^2, and every level costs 5/16 of the level above it. The total is therefore at most n^2(1 + 5/16 + (5/16)^2 + …) = n^2 · 1/(1 - 5/16) = (16/11)n^2, so T(n) = O(n^2).
Master Method
Cookbook solution for some recurrences of the form
T(n) = a·T(n/b) + f(n)
where a ≥ 1, b > 1, f(n) asymptotically positive.
Compare f(n) with n^(log_b a); the three cases are:
Case 1: if f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: if f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · lgn).
Case 3: if f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Example: T(n) = 2T(n/2) + cn, so a = 2, b = 2.
Here f(n) = cn and n^(log_b a) = n, so
f(n) = Θ(n^(log_b a)) and case 2 gives
T(n) = Θ(n lg n)
Exercises
Show that the solution of T(n) = 2T(n/2) + n is Ω(nlogn). Conclude that the solution is Θ(nlogn).
Show that the solution to T(n) = 2T(n/2 + 17) + n is O(nlogn).
Write a recursive Fibonacci number algorithm, derive the recurrence relation for it and solve it by the substitution method. {Guess 2^n}
Argue that the solution to the recurrence T(n) = T(n/3) + T(2n/3) + n is Ω(nlogn) by appealing to a recursion tree.
Use iteration to solve the recurrence T(n) = T(n-a) + T(a) + n, where a >= 1 is a constant.
The running time of an algorithm A is described by the recurrence T(n) = 7T(n/2) + n^2. A competing algorithm A' has a running time of T'(n) = aT'(n/4) + n^2. What is the largest integer value for a such that A' is asymptotically faster than A?
4. Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise O(n) time.
5. Search and traversal can be done in O(n) time.
6. There is memory overhead, but memory is allocated only to entries that are present, and search becomes easy.
• boolean isEmpty ();
Return true if and only if this list is empty.
• int size ();
Return this list's length.
• int get (int i);
Return the element with index i in this list.
• boolean equals (List a, List b);
Return true if and only if the two lists have the same length and the corresponding elements of the lists are equal.
• void clear ();
Make this list empty.
• void set (int i, int elem);
Replace by elem the element at index i in this list.
• void add (int i, int elem);
Add elem as the element with index i in this list.
• void add (int elem);
Add elem after the last element of this list.
• void addAll (List a, List b);
Add all the elements of list b after the last element of list a.
• int remove (int i);
Remove and return the element with index i in this list.
• void visit (List a);
Print all elements of the list.
void clear ();
Make this stack empty. Complexity is O(1).
void push (int elem);
Add elem as the top element of this stack. Complexity is O(1).
int pop ();
Remove and return the element at the top of this stack. Complexity is O(1).
Queues are also like stacks, but they implement the FIFO (First In First Out) policy. One end is for insertion and the other is for deletion. They are mostly represented circularly in an array for O(1) insertion/deletion time. A circular singly linked representation also takes O(1) insertion time and O(1) deletion time, and representing queues with a doubly linked list again gives O(1) insertion and deletion time.
Operations on queues
1. boolean isEmpty ();
Return true if and only if this queue is empty. Complexity is O(1).
2. int size ();
Return this queue's length. Complexity is O(n).
3. int getFirst ();
Return the element at the front of this queue. Complexity is O(1).
4. void clear ();
Make this queue empty. Complexity is O(1).
5. void insert (int elem);
Add elem as the rear element of this queue. Complexity is O(1).
6. int delete ();
Remove and return the front element of this queue. Complexity is O(1).
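The circular-array representation behind these O(1) bounds can be sketched as follows (the struct layout and the fixed capacity are illustrative assumptions, not from the notes); insert and delete each just move an index, wrapping around via the modulus:

```c
#define CAPACITY 8                 /* fixed capacity for this sketch */

typedef struct {
    int data[CAPACITY];
    int front;                     /* index of the front element */
    int count;                     /* number of stored elements */
} Queue;

void clearQueue(Queue *q) { q->front = 0; q->count = 0; }   /* O(1) */
int  isEmptyQueue(Queue *q) { return q->count == 0; }       /* O(1) */

/* Add elem at the rear of the queue: O(1). */
void insertQueue(Queue *q, int elem)
{
    q->data[(q->front + q->count) % CAPACITY] = elem;       /* wrap around */
    q->count++;
}

/* Remove and return the front element: O(1). */
int deleteQueue(Queue *q)
{
    int elem = q->data[q->front];
    q->front = (q->front + 1) % CAPACITY;                   /* wrap around */
    q->count--;
    return elem;
}
```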
So if we are sure that the tree is height balanced, then we can say that search and insertion have O(log n) run time; otherwise we have to be content with O(n).
AVL Trees
A balanced tree named after Adelson-Velskii and Landis. AVL trees are a special case in which the sub-trees of each node differ by at most 1 in their height. Insertion and deletion may make the tree unbalanced, so rebalancing must be done by using a left rotation, right rotation or double rotation.
Operation Algorithm Time complexity
Search AVL search O(log n) best, worst
Add AVL insertion O(log n) best, worst
Remove AVL deletion O(log n) best, worst
Priority Queues
A priority queue is a queue in which the elements are prioritized. The least element in the priority queue is always removed first. Priority queues are used in many computing applications. For example, many operating systems use a scheduling algorithm where the next process executed is the one with the shortest execution time or the highest priority. Priority queues can be implemented by using arrays, linked lists or a special kind of tree (i.e. a heap).
• boolean isEmpty ();
Return true if and only if this priority queue is empty.
• int size ();
Return the length of this priority queue.
• int getLeast ();
Return the least element of this priority queue. If there are several least elements, return
any of them.
• void clear ();
Make this priority queue empty.
• void add (int elem);
Add elem to this priority queue.
• int delete ();
Remove and return the least element from this priority queue. (If there are several least elements, remove the same element that would be returned by getLeast.)
Operation Sorted SLL Unsorted SLL Sorted Array Unsorted Array
add O(n) O(1) O(n) O(1)
removeLeast O(1) O(n) O(1) O(n)
Heap
A heap is a complete tree with an ordering relation R holding between each node and its descendants. Note that a complete tree here means the tree can miss only the rightmost part of the bottom level. R can be smaller-than or bigger-than.
E.g. a heap with degree 2 and R "bigger than":
[Figure: two trees over the elements 8, 5, 6, …: in the first, every node is bigger than its descendants (a heap); in the second, a larger element appears below a smaller one (not a heap).]
Heap Sort: Build a heap from the given set in O(n) time, then repeatedly remove the elements from the heap, in O(n log n) total time.
Implementation
Heaps are implemented by using arrays. Insertion and deletion of an element takes O(log n) time.
More on this later
UNIT:-2
Divide and Conquer Algorithms
(Sorting Searching and Selection)
Chapter:1
Sorting
Sorting is among the most basic problems in algorithm design. We are given a sequence of items,
each associated with a given key value. The problem is to permute the items so that they are in
increasing (or decreasing) order by key. Sorting is important because it is often the first step in
more complex algorithms. Sorting algorithms are usually divided into two classes: internal sorting algorithms, which assume that data is stored in an array in main memory, and external sorting algorithms, which assume that data is stored on disk or some other device that is best accessed sequentially. We will only consider internal sorting. Sorting algorithms often have additional properties that are of interest, depending on the application. Here are two important properties.
In-place: The algorithm uses no additional array storage, and hence (other than perhaps the
system’s recursion stack) it is possible to sort very large lists without the need to allocate
additional working storage.
Stable: A sorting algorithm is stable if two elements that are equal remain in the same relative
position after sorting is completed. This is of interest, since in some sorting applications you sort
first on one key and then on another. It is nice to know that two items that are equal on the second
key, remain sorted on the first key.
Merge Sort
To sort an array A[l . . r]:
• Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2 elements
each
• Conquer
– Sort the subsequences recursively using merge sort. When the size of the sequences
is 1 there is nothing more to do
• Combine
– Merge the two sorted subsequences
Merging:
Algorithm:
MergeSort(A, l, r)
{
If ( l < r)
{ //Check for base case
m = (l + r)/2 //Divide
MergeSort(A, l, m) //Conquer
MergeSort(A, m + 1, r) //Conquer
Merge(A, l, m+1, r) //Combine
}
}
Merge(A, l, m, r) //merge the sorted runs A[l..m-1] and A[m..r]; B is an auxiliary array
{
x = l; y = m;
k = l;
while(x < m && y <= r)
{
if(A[x] < A[y])
{
B[k] = A[x];
k++; x++;
}
else
{
B[k] = A[y];
k++; y++;
}
}
while(x < m) //copy the rest of the first run
{
B[k] = A[x];
k++; x++;
}
while(y <= r) //copy the rest of the second run
{
B[k] = A[y];
k++; y++;
}
for(i = l; i <= r; i++) //copy the merged result back
{
A[i] = B[i];
}
}
Time Complexity:
Recurrence Relation for Merge sort:
T(n) = 1 if n=1
T(n) = 2 T(n/2) + O(n) if n>1
Solving this recurrence we get
Time Complexity = O(nlogn)
Space Complexity:
It uses one extra array and some extra variables during sorting, therefore
Space Complexity= 2n + c = O(n)
Quick Sort
• Divide
Partition the array A[l..r] into two subarrays A[l..p-1] and A[p+1..r] such that each element of A[l..p-1] is smaller than or equal to A[p], which in turn is smaller than or equal to each element of A[p+1..r]. Need to find the index p of the pivot to partition the array.
• Conquer
Recursively sort A[l..p-1] and A[p+1..r] using Quicksort.
• Combine
Trivial: the arrays are sorted in place. No additional work is required to combine them.
An example run of the partition below, with pivot p = 5 (x scans right for an element greater than the pivot, y scans left for one smaller):
5 3 2 6 4 1 3 7   x stops at 6, y stops at the second 3: swap them
5 3 2 3 4 1 6 7   x and y have crossed: swap the pivot with A[y] = 1
1 3 2 3 4 5 6 7   the pivot 5 is now in its final position p
Algorithm:
QuickSort(A,l,r)
{
if(l<r)
{
p = Partition(A,l,r);
QuickSort(A,l,p-1);
QuickSort(A,p+1,r);
}
}
Partition(A,l,r)
{
x = l; y = r+1; p = A[l];
while(x < y)
{
do {
x++;
} while(A[x] <= p && x < r); //stop at the first element greater than the pivot
do {
y--;
} while(A[y] >= p && y > l); //stop at the first element smaller than the pivot
if(x < y)
swap(A[x], A[y]);
}
A[l] = A[y]; A[y] = p; //put the pivot into its final position
return y; //return position of pivot
}
Time Complexity:
We can notice that complexity of partitioning is O(n) because outer while loop executes cn
times.
Thus recurrence relation for quick sort is:
T(n) = T(k) + T(n-k-1) + O(n)
Best Case:
The partitions are balanced so that each recursive call receives about n/2 elements, therefore
T(n) = 2T(n/2) + O(n). Solving this recurrence we get
Time Complexity = O(nlogn)
Worst Case:
When the array is already sorted or sorted in reverse order, one partition contains n-1 items and the other contains zero items, therefore
T(n) = T(n-1) + O(n). Solving this recurrence we get
Time Complexity = O(n^2)
Average case:
All permutations of the input numbers are equally likely. On a random input array, we will have a mix of well balanced and unbalanced splits. Good and bad splits are randomly distributed throughout the tree.
Suppose, for the analysis, that balanced and unbalanced splits alternate, with B(n) the cost at a balanced split and UB(n) the cost at an unbalanced one:
B(n) = 2UB(n/2) + Θ(n)    Balanced
UB(n) = B(n-1) + Θ(n)     Unbalanced
Solving:
B(n) = 2(B(n/2 - 1) + Θ(n/2)) + Θ(n)
     = 2B(n/2 - 1) + Θ(n)
     = Θ(nlogn)
Randomized Quick Sort:
The same as QuickSort, except that the pivot is chosen at random by RandPartition:
RandPartition(A,l,r)
{
k = random(l,r); //generates a random number between l and r including both
swap(A[l],A[k]);
return Partition(A,l,r);
}
Time Complexity:
Worst Case:
T(n) = worst-case running time
T(n) = max_{1 ≤ q ≤ n-1} (T(q) + T(n-q)) + Θ(n)
Use the substitution method to show that the worst-case running time of Quicksort is O(n^2).
Guess T(n) = O(n^2)
– Induction goal: T(n) ≤ cn^2
– Induction hypothesis: T(k) ≤ ck^2 for any k < n
Proof of induction goal:
T(n) ≤ max_{1 ≤ q ≤ n-1} (cq^2 + c(n-q)^2) + Θ(n)
     = c · max_{1 ≤ q ≤ n-1} (q^2 + (n-q)^2) + Θ(n)
The expression q^2 + (n-q)^2 achieves its maximum over the range 1 ≤ q ≤ n-1 at one of the endpoints:
max_{1 ≤ q ≤ n-1} (q^2 + (n-q)^2) = 1^2 + (n-1)^2 = n^2 - 2(n-1)
T(n) ≤ cn^2 - 2c(n-1) + Θ(n)
     ≤ cn^2   (choosing c large enough that 2c(n-1) dominates the Θ(n) term)
Average Case:
To analyze the average case, assume that all the input elements are distinct, for simplicity. If we take care of duplicate elements also, the complexity bound is the same but the analysis is more intricate. Consider the probability of choosing each of the n elements as the pivot to be equally likely, i.e. 1/n.
Now the recurrence relation for the algorithm is
T(n) = (1/n) Σ_{k=1}^{n-1} (T(k) + T(n-k)) + O(n)
For k = 1, 2, …, n-1, each term T(k) appears twice in the sum, so
T(n) = (2/n) Σ_{k=1}^{n-1} T(k) + O(n)
nT(n) = 2 Σ_{k=1}^{n-1} T(k) + O(n^2)
Similarly
(n-1)T(n-1) = 2 Σ_{k=1}^{n-2} T(k) + O((n-1)^2)
Subtracting the two equations,
nT(n) - (n-1)T(n-1) = 2T(n-1) + 2n - 1
nT(n) = (n+1)T(n-1) + 2n - 1
T(n)/(n+1) = T(n-1)/n + (2n-1)/(n(n+1))
Let An = T(n)/(n+1). Then
An = An-1 + (2n-1)/(n(n+1))
An = Σ_{i=1}^{n} (2i-1)/(i(i+1))
An ≈ Σ_{i=1}^{n} 2i/(i(i+1))
An ≈ 2 Σ_{i=1}^{n} 1/(i+1)
An ≈ 2logn
Hence T(n) = (n+1)·An = O(nlogn).
Heap Sort
A heap is a nearly complete binary tree with the following two properties:
– Structural property: all levels are full, except possibly the last one, which is filled
from left to right
– Order (heap) property: for any node x, Parent(x) ≥ x
Operations on Heaps
– MAX-HEAPIFY: maintain/restore the max-heap property
– BUILD-MAX-HEAP: create a max-heap from an unordered array
Heaps are used, for example, to implement priority queues.
Heapify Property
Suppose a node i is smaller than one of its children, while the left and right subtrees of i are max-heaps. To eliminate the violation:
– Exchange the node with its larger child
– Move down the tree
– Continue until the node is not smaller than its children
Algorithm:
Max-Heapify(A, i, n)
{
l = Left(i)
r = Right(i)
largest=i;
if l ≤ n and A[l] > A[largest]
largest = l
if r ≤ n and A[r] > A[largest]
largest = r
if largest ≠ i
exchange (A[i] , A[largest])
Max-Heapify(A, largest, n)
}
Analysis:
In the worst case Max-Heapify is called recursively h times, where h is the height of the heap. Since h = O(lgn), the running time is O(lgn).
Building a Heap
Convert an array A[1 … n] into a max-heap (n = length[A]). The elements in the sub-array
A[(n/2+1) .. n] are leaves. Apply MAX-HEAPIFY on elements between 1 and n/2.
Algorithm:
Build-Max-Heap(A)
n = length[A]
for i ← n/2 down to 1
do MAX-HEAPIFY(A, i, n)
Time Complexity:
Running time: the loop executes O(n) times and the complexity of Heapify is O(lgn), therefore T(n) = O(nlgn). This bound is correct but not tight; counting the work level by level gives a better bound:
T(n) = Σ_{i=0}^{h} n_i·h_i
where h_i = h - i is the height of the subtrees rooted at level i, and n_i = 2^i is the number of nodes at level i.
T(n) = Σ_{i=0}^{h} 2^i(h - i)
T(n) = Σ_{i=0}^{h} 2^h(h - i)/2^(h-i)
Let k = h - i:
T(n) = 2^h Σ_{k=0}^{h} k/2^k
T(n) ≤ n Σ_{k=0}^{∞} k/2^k
We know that Σ_{k=0}^{∞} x^k = 1/(1-x) for x < 1.
Differentiating both sides we get
Σ_{k=0}^{∞} k·x^(k-1) = 1/(1-x)^2
Σ_{k=0}^{∞} k·x^k = x/(1-x)^2
Putting x = 1/2,
Σ_{k=0}^{∞} k/2^k = (1/2)/(1 - 1/2)^2 = 2
⇒ T(n) ≤ 2n = O(n)
Heapsort
– Build a max-heap from the array
– Swap the root (the maximum element) with the last element in the array
– “Discard” this last node by decreasing the heap size
– Call Max-Heapify on the new root
– Repeat this process until only one node remains
[Figure: trace of heapsort on the max-heap with elements 7, 4, 3, 2, 1 — at each step the root is swapped with the last element, the heap size shrinks by one, and Heapify(A,1) restores the heap property.]
Algorithm:
HeapSort(A)
{
BuildHeap(A); //build a max heap
n = length[A];
for(i = n; i >= 2; i--)
{
swap(A[1],A[n]);
n = n - 1;
Heapify(A,1,n); //restore the heap property on the reduced heap
}
}
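A runnable version of the same procedure, rewritten with 0-based array indices (a sketch following the pseudocode above, not any particular library):

```c
/* Sift-down for a max-heap stored in a[0..n-1]; children of i are 2i+1, 2i+2. */
void heapify(int a[], int n, int i)
{
    int largest = i, l = 2 * i + 1, r = 2 * i + 2, tmp;
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
        heapify(a, n, largest);       /* continue down the tree */
    }
}

void heapSort(int a[], int n)
{
    int i, tmp;
    for (i = n / 2 - 1; i >= 0; i--) /* build a max-heap: O(n) */
        heapify(a, n, i);
    for (i = n - 1; i >= 1; i--) {   /* repeatedly extract the maximum */
        tmp = a[0]; a[0] = a[i]; a[i] = tmp;
        heapify(a, i, 0);            /* restore the heap on the shrunk prefix */
    }
}
```

Building the heap costs O(n); each of the n-1 extractions costs O(lg n), so the total is O(n lg n).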
Sorting comparisons:
Algorithm     Best         Worst        Average      Extra space   Stable
Merge sort    O(nlogn)     O(nlogn)     O(nlogn)     O(n)          Yes
Quick sort    O(nlogn)     O(n^2)       O(nlogn)     O(logn)       No
Heap sort     O(nlogn)     O(nlogn)     O(nlogn)     O(1)          No
Searching
Sequential Search
Simply search for the given element from left to right and return the index of the element if found; otherwise return "Not Found".
Algorithm:
LinearSearch(A, n, key)
{
for(i=0; i<n; i++)
{
if(A[i] == key)
return i;
}
return -1; //not found
}
Analysis:
Time complexity = O(n)
Binary Search:
To find a key in a large file containing keys z[0..n-1] in sorted order, we first compare key with z[n/2], and depending on the result we recurse either on the first half of the file, z[0..n/2-1], or on the second half, z[n/2..n-1].
Take input array a[] = {2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104}
Algorithm
BinarySearch(A, l, r, key)
{
if(l > r)
return 0; //not found
if(l == r)
{
if(key == A[l])
return l+1; //index starts from 0
else
return 0; //not found
}
else
{
m = (l + r)/2; //integer division
if(key == A[m])
return m+1;
else if(key < A[m])
return BinarySearch(A, l, m-1, key);
else
return BinarySearch(A, m+1, r, key);
}
}
Analysis:
From the above algorithm we can say that the running time is
T(n) = T(n/2) + Θ(1)
     = O(logn).
In the best case the output is obtained in one step, i.e. O(1) time, if the key is at the middle. In the worst case the key is found only at the end of the halving, so the running time is O(logn). In the average case the running time is also O(logn). For an unsuccessful search the best, worst and average time complexity is O(logn).
Selection
The ith order statistic of a set of elements is the ith smallest (or largest) element. In general, let us take the ith order statistic to be the ith smallest; then the minimum is the first order statistic and the maximum is the last order statistic. Similarly, a median is given by the ith order statistic where i = (n+1)/2 for odd n, and i = n/2 or n/2 + 1 for even n. This kind of problem is commonly called the selection problem.
This problem can be solved in O(nlogn) time in a very straightforward way: first sort the elements in O(nlogn) time and then pick up the ith item from the array in constant time. What about a linear time algorithm for this problem? The next section answers this.
When i = k-1 the inner loop executes n-(k+1) times.
Thus, Time Complexity = (n-1) + (n-2) + ……………….. + (n-k-1)
                      = O(kn) ≈ O(n^2)
Analysis:
Since our algorithm is a randomized algorithm, no particular input is responsible for the worst case; however the worst case running time of this algorithm is O(n^2). This happens if, unfortunately, the pivot chosen is always the largest one (if we are finding the minimum element).
Assume that the probability of selecting each element as the pivot is equal, i.e. 1/n. Then we have the recurrence relation
T(n) = (2/n) Σ_{j=n/2}^{n-1} T(j) + O(n)
Using the substitution method:
Guess T(n) = O(n)
To show T(n) <= cn
Assume T(j) <= cj
Substituting in the relation,
T(n) = (2/n) { Σ_{j=1}^{n-1} cj - Σ_{j=1}^{n/2-1} cj } + O(n)
T(n) = (2/n) { c·n(n-1)/2 - c·(n/2-1)(n/2)/2 } + O(n)
T(n) ≤ c(n-1) - c(n/2-1)/2 + O(n)
T(n) ≤ cn - c - cn/4 + c/2 + O(n)
     = cn - cn/4 - c/2 + O(n)
     ≤ cn   {choose the value of c such that -cn/4 - c/2 + O(n) ≤ 0}
T(n) = O(n)
Selection in worst case linear time [Note: this algorithm should also be studied from the class notes]
Divide the n elements into groups of 5. Find the median of each 5-element group. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
Algorithm
Divide the n elements into groups of 5
Find the median of each group
Use Select() recursively to find the median x of the ⌊n/5⌋ medians
Partition the n elements around x. Let k = rank(x) //index of x
if (i == k) then return x
if (i < k) then use Select() recursively to find the ith smallest element in the first partition
else (i > k) use Select() recursively to find the (i-k)th smallest element in the last partition
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians. Therefore, at least 3⌊n/10⌋ elements are ≤ x.
Similarly, at least 3⌊n/10⌋ elements are ≥ x.
For n ≥ 50, we have 3⌊n/10⌋ ≥ n/4.
Therefore, for n ≥ 50 the final recursive call to SELECT is executed on ≤ 3n/4 elements in the worst case.
Thus, the recurrence for the running time can assume that this call takes time T(3n/4) in the worst case.
T(n) = T(n/5) + T(3n/4) + Θ(n)
Guess T(n) = O(n)
To show T(n) ≤ cn
Assume that our guess is true for all k < n.
Now,
T(n) ≤ cn/5 + 3cn/4 + O(n)
     = 19cn/20 + O(n)
     = cn - cn/20 + O(n)
     ≤ cn   {choose the value of c such that cn/20 - O(n) ≥ 0}
T(n) = O(n)
Finding Minimum and Maximum:
MinMax(l,r) //computes the minimum and maximum of A[l..r]
{
if(l == r)
max = min = A[l];
else if(l == r-1)
{
if(A[l] < A[r])
{
max = A[r]; min = A[l];
}
else
{
max = A[l]; min = A[r];
}
}
else
{
//Divide the problems
mid = (l + r)/2; //integer division
//Conquer the subproblems
MinMax(l, mid); //sets max and min
MinMax(mid+1, r); //sets max1 and min1
//Combine the solutions
if(max1 > max) max = max1;
if(min1 < min) min = min1;
}
}
Analysis:
We can give the following recurrence relation for the MinMax algorithm in terms of the number of comparisons:
T(n) = 2T(n/2) + 1, if n > 2
T(n) = 1, if n ≤ 2
Solving the recurrence by using the master method (case 1), the complexity is O(n).
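In runnable form (a sketch: the pseudocode above communicates results through shared variables; here the two results are returned through pointers instead):

```c
/* Divide-and-conquer minimum and maximum of a[l..r] (inclusive bounds),
   returning both results through the min and max pointers. */
void minMax(const int a[], int l, int r, int *min, int *max)
{
    int min1, max1, mid;
    if (l == r) {                            /* one element */
        *min = *max = a[l];
    } else if (l == r - 1) {                 /* two elements: one comparison */
        if (a[l] < a[r]) { *min = a[l]; *max = a[r]; }
        else             { *min = a[r]; *max = a[l]; }
    } else {
        mid = (l + r) / 2;                   /* divide */
        minMax(a, l, mid, min, max);         /* conquer the left half */
        minMax(a, mid + 1, r, &min1, &max1); /* conquer the right half */
        if (max1 > *max) *max = max1;        /* combine */
        if (min1 < *min) *min = min1;
    }
}
```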
Matrix Multiplication
Given two n-by-n matrices A and B, our aim is to find the product of A and B as C, which is also an n-by-n matrix. We can find this by using the relation
C(i,j) = Σ_{k=1}^{n} A(i,k)·B(k,j)
MatrixMultiply(A,B)
{
for(i=0; i<n; i++)
{
for(j=0; j<n; j++)
{
C[i][j] = 0; //initialize the accumulator
for(k=0; k<n; k++)
{
C[i][j] = C[i][j] + A[i][k]*B[k][j];
}
}
}
}
Analysis:
Using the above formula we need O(n) time to get each C(i,j). There are n^2 elements in C, hence the time required for matrix multiplication is O(n^3). We can improve this complexity by using the divide and conquer strategy.
Dividing each matrix into four n/2-by-n/2 submatrices, we have
C11 = A11 × B11 + A12 × B21
C12 = A11 × B12 + A12 × B22
C21 = A21 × B11 + A22 × B21
C22 = A21 × B12 + A22 × B22
Strassen's algorithm instead computes C using only seven products:
P1 = (A11+ A22)(B11+B22)
P2 = (A21 + A22) * B11
P3 = A11 * (B12 - B22)
P4 = A22 * (B21 - B11)
P5 = (A11 + A12) * B22
P6 = (A21 - A11) * (B11 + B12)
P7 = (A12 - A22) * (B21 + B22)
C11 = P1 + P4 - P5 + P7
C12 = P3 + P5
C21 = P2 + P4
C22 = P1 + P3 - P2 + P6
Now, we can write the recurrence relation for this as
T(n) = b if n ≤ 2
T(n) = 7T(n/2) + cn^2 if n > 2
Solving this we get T(n) = O(n^2.81)
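The seven-product formulas can be verified on a single 2-by-2 multiplication; the sketch below (illustrative, not from the notes) builds C from P1..P7 exactly as above:

```c
/* Strassen's seven products for one 2x2 multiplication:
   C[0][0] = P1 + P4 - P5 + P7, C[0][1] = P3 + P5,
   C[1][0] = P2 + P4,           C[1][1] = P1 + P3 - P2 + P6. */
void strassen2x2(int A[2][2], int B[2][2], int C[2][2])
{
    int P1 = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1]);
    int P2 = (A[1][0] + A[1][1]) * B[0][0];
    int P3 = A[0][0] * (B[0][1] - B[1][1]);
    int P4 = A[1][1] * (B[1][0] - B[0][0]);
    int P5 = (A[0][0] + A[0][1]) * B[1][1];
    int P6 = (A[1][0] - A[0][0]) * (B[0][0] + B[0][1]);
    int P7 = (A[0][1] - A[1][1]) * (B[1][0] + B[1][1]);
    C[0][0] = P1 + P4 - P5 + P7;
    C[0][1] = P3 + P5;
    C[1][0] = P2 + P4;
    C[1][1] = P1 + P3 - P2 + P6;
}
```

The result agrees with the classical definition while using 7 multiplications instead of 8, which is where the O(n^2.81) bound comes from.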
Unit 2
Chapter:2
Dynamic Programming
Dynamic Programming: this technique is among the most powerful for designing algorithms for
optimization problems. Dynamic programming problems are typically optimization problems (find
the minimum or maximum cost solution, subject to various constraints). The technique is related to
divide-and-conquer, in the sense that it breaks problems down into smaller problems that it solves
recursively. However, because of the somewhat different nature of dynamic programming
problems, standard divide-and-conquer solutions are not usually efficient. The basic elements that
characterize a dynamic programming algorithm are:
Substructure: Decompose your problem into smaller (and hopefully simpler)
subproblems. Express the solution of the original problem in terms of solutions for smaller
problems.
Table-structure: Store the answers to the sub-problems in a table. This is done because
subproblem solutions are reused many times.
Bottom-up computation: Combine solutions on smaller subproblems to solve larger
subproblems. (We also discuss a top-down alternative, called memoization.)
The most important question in designing a DP solution to a problem is how to set up the
subproblem structure. This is called the formulation of the problem. Dynamic programming is not
applicable to all optimization problems. There are two important elements that a problem must
have in order for DP to be applicable.
Optimal substructure: (Sometimes called the principle of optimality.) It states that for the
global problem to be solved optimally, each subproblem should be solved optimally. (Not
all optimization problems satisfy this. Sometimes it is better to lose a little on one
subproblem in order to make a big gain on another.)
Polynomially many subproblems: An important aspect to the efficiency of DP is that the
total number of subproblems to be solved should be at most a polynomial number.
Fibonacci numbers
Recursive Fibonacci revisited: in the recursive version of the algorithm for finding a Fibonacci
number, we can notice that to calculate the Fibonacci number of a larger number we have to
calculate the Fibonacci numbers of the two previous numbers, regardless of whether those
computations have already been done. So there are many redundancies in calculating the
Fibonacci number of a particular number. Let’s try to calculate the Fibonacci number of 4. The
representation shown below shows the repetition in the calculation.
Fib(4)
├── Fib(3)
│   ├── Fib(2)
│   │   ├── Fib(1)
│   │   └── Fib(0)
│   └── Fib(1)
└── Fib(2)
    ├── Fib(1)
    └── Fib(0)
In the above tree we see that the calculation of Fib(0) is done 2 times, Fib(1) is done 3 times,
Fib(2) is done 2 times, and so on. So if we somehow eliminate those repetitions we will save
running time.
Algorithm:
DynaFibo(n)
{
A[0] = 0, A[1]= 1;
for(i = 2 ; i <=n ; i++)
A[i] = A[i-2] +A[i-1] ;
return A[n] ;
}
Analysis
Analyzing the above algorithm we find that there is no repetition of calculations for sub-problems
already solved, and the running time decreased from O(2^(n/2)) to O(n). This reduction was
possible by remembering the sub-problems already solved while solving problems of larger size.
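DynaFibo translates directly into Python:

```python
def dyna_fibo(n):
    """Bottom-up Fibonacci: each value computed exactly once, O(n) time."""
    if n == 0:
        return 0
    A = [0] * (n + 1)
    A[1] = 1
    for i in range(2, n + 1):
        A[i] = A[i - 2] + A[i - 1]
    return A[n]

print([dyna_fibo(i) for i in range(8)])   # [0, 1, 1, 2, 3, 5, 8, 13]
```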
0/1 Knapsack Problem
The algorithm takes as input a maximum weight W, the number of items n, and two arrays: v[]
for the values of items and w[] for the weights of items. Let us assume that the table C[i,w] is the
value of the solution for items 1 to i and maximum weight w. Then we can define the recurrence
relation for the 0/1 knapsack problem as
C[i,w] = 0 if i = 0 or w = 0
C[i,w] = C[i-1,w] if wi > w
C[i,w] = max{vi + C[i-1,w-wi], C[i-1,w]} if i > 0 and w ≥ wi
Algorithm:
DynaKnapsack(W,n,v,w)
{
for(w=0; w<=W; w++)
C[0,w] = 0;
for(i=1; i<=n; i++)
C[i,0] = 0;
for(i=1; i<=n; i++)
{
for(w=1; w<=W;w++)
{
if(w[i] <= w)
{
if(v[i] + C[i-1,w-w[i]] > C[i-1,w])
{
C[i,w] = v[i] +C[i-1,w-w[i]];
}
else
{
C[i,w] = C[i-1,w];
}
}
else
{
C[i,w] = C[i-1,w];
}
}
}
}
Analysis
For run time analysis examining the above algorithm the overall run time of the algorithm is
O(nW).
Example
Let the problem instance be with 7 items, where v[] = {2,3,3,4,4,5,7}, w[] = {3,5,7,4,3,9,2} and
W = 9.
w 0 1 2 3 4 5 6 7 8 9
i
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 2 2 2 2 2 2 2
2 0 0 0 2 2 3 3 3 5 5
3 0 0 0 2 2 3 3 3 5 5
4 0 0 0 2 4 4 4 6 6 7
5 0 0 0 4 4 4 6 8 8 8
6 0 0 0 4 4 4 6 8 8 8
7 0 0 7 7 7 11 11 11 13 15
Profit= C[7][9]=15
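A Python transcription of DynaKnapsack that reproduces the table’s final entry (note the fits test is w[i] ≤ w):

```python
def dyna_knapsack(W, v, w):
    """0/1 knapsack by dynamic programming; v and w are 1-indexed via padding."""
    n = len(v)
    v = [0] + v
    w = [0] + w
    C = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for cap in range(1, W + 1):
            if w[i] <= cap:          # item i fits in capacity cap
                C[i][cap] = max(v[i] + C[i - 1][cap - w[i]], C[i - 1][cap])
            else:
                C[i][cap] = C[i - 1][cap]
    return C[n][W]

# The 7-item instance from the example above.
print(dyna_knapsack(9, [2, 3, 3, 4, 4, 5, 7], [3, 5, 7, 4, 3, 9, 2]))  # 15
```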
Matrix Chain Multiplication
Given a chain of matrices to multiply, although any legal parenthesization will lead to a valid
result, not all involve the same number of operations. Consider the case of 3 matrices: let A1 be
5 x 4, A2 be 4 x 6 and A3 be 6 x 2.
multCost[((A1A2)A3)] = (5 . 4 . 6) + (5 . 6 . 2) = 180
multCost[(A1(A2A3))] = (4 . 6 . 2) + (5 . 4 . 2) = 88
Even for this small example, considerable savings can be achieved by reordering the evaluation
sequence.
Let Ai…j denote the result of multiplying matrices i through j. It is easy to see that Ai…j is a
pi-1 x pj matrix. So for some k, the total cost is the sum of the cost of computing Ai…k, the cost of
computing Ak+1…j, and the cost of multiplying Ai…k and Ak+1…j.
Recursive definition of the optimal solution: let m[i,j] denote the minimum number of scalar
multiplications needed to compute Ai…j.
m[i,j] = 0 if i = j
m[i,j] = min i≤k<j {m[i,k] + m[k+1,j] + pi-1·pk·pj} if i < j
Algorithm:
Matrix-Chain-Multiplication(p)
{
n = length[p] - 1 //p[0…n] holds the n+1 dimensions of the chain
for(i=1; i<=n; i++)
{
m[i, i] = 0
}
for(l=2; l<=n; l++) //l is the chain length
{
for(i=1; i<=n-l+1; i++)
{
j = i+l-1
m[i, j] = ∞
for(k=i; k<=j-1; k++)
{
c = m[i, k] + m[k+1, j] + p[i-1]*p[k]*p[j]
if c < m[i, j]
{
m[i, j] = c
s[i, j] = k
}
}
}
}
return m and s
}
Analysis
The above algorithm can be easily analyzed for running time as O(n³), due to the three nested loops.
The space complexity is O(n²).
Example:
Consider matrices A1, A2, A3 And A4 of order 3x4, 4x5, 5x2 and 2x3.
m table (minimum costs):
 i\j    1     2     3     4
 1      0    60    64    82
 2            0    40    64
 3                  0    30
 4                        0

s table (optimal split k):
 i\j    2     3     4
 1      1     1     3
 2            2     3
 3                  3
Constructing optimal solution
(A1A2A3A4) => ((A1A2A3)(A4)) => (((A1)(A2A3))(A4))
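A Python sketch of Matrix-Chain-Multiplication, checked against the example above (p = (3, 4, 5, 2, 3) for the orders 3x4, 4x5, 5x2, 2x3):

```python
import math

def matrix_chain(p):
    """m[i][j] = min scalar multiplications for A_i..A_j; p holds n+1 dimensions."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):              # chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):          # try every split point
                c = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if c < m[i][j]:
                    m[i][j] = c
                    s[i][j] = k
    return m, s

m, s = matrix_chain([3, 4, 5, 2, 3])       # orders 3x4, 4x5, 5x2, 2x3
print(m[1][4], s[1][4])                    # 82 3
```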
Longest Common Subsequence Problem
Given two sequences X = (x1, x2, …………, xm) and Z = (z1, z2, ………, zk), we say that Z is a
subsequence of X if there is a strictly increasing sequence of k indices (i1, i2, ………, ik) (1 ≤ i1 < i2 <
……….. < ik) such that Z = (xi1, xi2, ………., xik).
The Longest Common Subsequence Problem (LCS) is the following. Given two sequences X =
(x1, ……., xm) and Y = (y1, ……., yn), determine a longest common subsequence.
If xi = yj (i.e. the last characters match), we claim that the LCS must also contain the character xi = yj.
If xi ≠ yj (i.e. the last characters do not match), then xi and yj cannot both be in the LCS (since
each would have to be the last character of the LCS). Thus either xi is not part of the LCS, or yj is
not part of the LCS (and possibly both are not part of the LCS). Let L[i,j] represent the length of
the LCS of the prefixes Xi and Yj.
L[i,j] = 0 if i = 0 or j = 0
L[i,j] = L[i-1,j-1] + 1 if xi = yj
L[i,j] = max{L[i-1,j], L[i,j-1]} if i, j > 0 and xi ≠ yj
Algorithm:
LCS(X,Y)
{
m = length[X];
n = length[Y];
for(i=1;i<=m;i++)
c[i,0] = 0;
for(j=0;j<=n;j++)
c[0,j] = 0;
for(i = 1;i<=m;i++)
for(j=1;j<=n;j++)
{
if(X[i]==Y[j])
{
c[i][j] = c[i-1][j-1]+1; b[i][j] = “upleft”;
}
else if(c[i-1][j] >= c[i][j-1])
{
c[i][j] = c[i-1][j]; b[i][j] = “up”;
}
else
{
c[i][j] = c[i][j-1]; b[i][j] = “left”;
}
}
return b and c;
}
Analysis:
The above algorithm can be easily analyzed for running time as O(mn), due to two nested loops.
The space complexity is O(mn).
Example:
Consider the character Sequences X=abbabba Y=aaabba
Y Φ a a a b b a
X
Φ 0 0 0 0 0 0 0
a 0 1 upleft 1upleft 1upleft 1 left 1 left 1 left
b 0 1 up 1 up 1 up 2 upleft 2 upleft 2 left
b 0 1 up 1 up 1 up 2 upleft 3 upleft 3 left
a 0 1 upleft 2 upleft 2 upleft 2 up 3 up 4 upleft
b 0 1 up 2 up 2 up 3 upleft 3 upleft 4 up
b 0 1 up 2 up 2 up 3 upleft 4 upleft 4 up
a 0 1 upleft 2 upleft 3 upleft 3 up 4 up 5 upleft
LCS = aabba
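A Python sketch that fills the c table and then walks back through it (equivalent to following the b arrows) to recover the subsequence:

```python
def lcs(X, Y):
    """Return one longest common subsequence of strings X and Y."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Walk back through the table to recover the subsequence.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:   # prefer "up", as in the table above
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("abbabba", "aaabba"))   # aabba
```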
Chapter:3
Greedy Paradigm
Greedy method is a simple, straightforward way of algorithm design. The general class
of problems solved by the greedy approach is optimization problems. In this approach the input
elements are subject to some constraints; a solution that satisfies the constraints is a feasible
solution, and the feasible solution that best meets some objective function among all the solutions
is called an optimal solution. A greedy algorithm always makes a locally optimal choice in the
hope of generating a globally optimal solution; however, it is not guaranteed that every greedy
algorithm yields an optimal solution. We generally cannot tell whether a given optimization
problem can be solved using the greedy method or not, but most of the problems that can be
solved using the greedy approach have two parts:
Greedy choice property: a globally optimal solution can be arrived at by making a locally optimal
(greedy) choice.
Optimal substructure is exhibited by a problem if an optimal solution to the problem contains
optimal solutions to the sub-problems within it.
To prove that a greedy algorithm is optimal we must show that the above two parts are exhibited.
For this purpose, first take a globally optimal solution; then show that the greedy choice at the
first step generates the same but smaller problem, where the greedy choice must be made first and
must be part of an optimal solution; at last, use induction to prove that the greedy choice is best
at each step; this is optimal substructure.
GreedyFracKnapsack(W,n)
{
for(i=1; i<=n; i++)
x[i] = 0.0;
tempW = W;
for(i=1; i<=n; i++)
{
if(w[i] > tempW) then break;
x[i] = 1.0;
tempW -= w[i];
}
if(i<=n)
x[i] = tempW/w[i];
}
Analysis:
We can see that the above algorithm contains just a single loop, i.e. no nested loops, so the running
time of the above algorithm is O(n). However, our requirement is that v[1 … n] and w[1 … n] are
sorted (by value-to-weight ratio), so we can use a sorting method to sort them in O(nlogn) time,
such that the complexity of the algorithm above including sorting becomes O(nlogn).
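A Python sketch of the fractional knapsack greedy, including the sort by value-to-weight ratio that the pseudocode assumes has already been done (the instance used below is illustrative, not from the notes):

```python
def greedy_frac_knapsack(W, items):
    """items: list of (value, weight). Return the maximum total value when
    fractions of items are allowed."""
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)  # O(n log n)
    total, remaining = 0.0, W
    for v, w in items:
        if w <= remaining:           # take the whole item
            total += v
            remaining -= w
        else:                        # take the fraction that fits, then stop
            total += v * remaining / w
            break
    return total

print(greedy_frac_knapsack(50, [(60, 10), (100, 20), (120, 30)]))   # 240.0
```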
Huffman Codes:
Huffman codes are used to compress data by representing each character of the alphabet by a
unique binary code in an optimal way. As an example consider a file of 100,000 characters with
the following frequency distribution, assuming that there are only 7 characters:
f(a) = 40,000 , f(b) = 20,000 , f(c) = 15,000 , f(d) = 12,000 , f(e) = 8,000 , f(f) = 3,000 ,
f(g) = 2,000.
With a fixed-length code for the 7 characters we need 3 bits to represent each character, like
a = 000 , b = 001 , c = 010 , d = 011 , e = 100 , f = 101 , g = 110.
The total number of bits required by the fixed-length code is 300,000.
Now consider a variable-length code so that characters with higher frequencies are given shorter
codes, with
C = {a, b, c, d, e, f, g}; frequencies (in thousands) = 40, 20, 15, 12, 8, 3, 2; n = 7.
The initial priority queue is built from these frequencies.
Analysis
We can use BuildHeap(C) {see notes on sorting} to create a priority queue, which takes O(n) time.
Inside the for loop the expensive operations can be done in O(logn) time. Since the operations
inside the for loop execute n-1 times, the total running time of HuffmanAlgo is O(nlogn).
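The merge loop described in this analysis can be sketched with Python's heapq (an assumed reconstruction of HuffmanAlgo, computing only the total encoded length rather than the codes themselves):

```python
import heapq

def huffman_cost(freq):
    """Total encoded length: sum over characters of frequency * code length."""
    heap = list(freq)           # BuildHeap: O(n)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:        # n-1 merges, each O(log n)
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b          # every merge adds one bit to each leaf below it
        heapq.heappush(heap, a + b)
    return total

# Frequencies from the 100,000-character example.
freq = [40000, 20000, 15000, 12000, 8000, 3000, 2000]
print(huffman_cost(freq))   # 238000, versus 300,000 bits for the fixed-length code
```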
Job Sequencing with Deadlines
Example
n=4, (p1,p2,p3,p4)=(100,10,15,27), (d1,d2,d3,d4)=(2,1,2,1)
----------------------------------------------------
Feasible solution    Processing sequence    Value
1. (1, 2) 2, 1 110
2. (1, 3) 1, 3 or 3, 1 115
3. (1, 4) 4, 1 127
4. (2, 3) 2, 3 25
5. (3, 4) 4, 3 42
6. (1) 1 100
7. (2) 2 10
8. (3) 3 15
9. (4) 4 27
Algorithm:
Assume the jobs are ordered such that p[1] ≥ p[2] ≥ … ≥ p[n]; d[i] ≥ 1, 1 ≤ i ≤ n, are the
deadlines; and n ≥ 1. J[i] is the ith job in the optimal solution, 1 ≤ i ≤ k. Also, at termination
d[J[i]] ≤ d[J[i+1]], 1 ≤ i < k.
{
d[0] = J[0] = 0; // Initialize.
int k=1;
for (int i=2; i<=n; i++)
{//Consider jobs in nonincreasing order of p[i]. Find position for i and check
feasibility of insertion.
int r = k;
while ((d[J[r]] > d[i]) && (d[J[r]] != r))
r--;
if ((d[J[r]] <= d[i]) && (d[i] > r))
{
// Insert i into J[].
for (int q=k; q>=(r+1); q--)
J[q+1] = J[q];
J[r+1] = i; k++;
}
}
return (k);
}
Analysis
The for loop executes O(n) times. The while loop inside the for loop executes at most k times, and
if the condition given inside the if statement is true, the inner for loop executes O(k-r) times.
Hence the total time for each iteration of the outer for loop is O(k). Thus the time complexity is O(n²).
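A compact Python sketch of the same greedy idea (consider jobs in nonincreasing profit order and place each in the latest free slot at or before its deadline); the slot-array layout is an implementation choice, not from the notes:

```python
def job_sequencing(profits, deadlines):
    """Return the maximum total profit for unit-time jobs with deadlines."""
    jobs = sorted(zip(profits, deadlines), reverse=True)  # by profit, descending
    max_slot = max(deadlines)
    slot = [None] * (max_slot + 1)      # slot[t] = profit of job done at time t
    total = 0
    for p, d in jobs:
        t = min(d, max_slot)
        while t > 0 and slot[t] is not None:   # latest free slot <= deadline
            t -= 1
        if t > 0:
            slot[t] = p
            total += p
    return total

# Instance from the example: profits (100,10,15,27), deadlines (2,1,2,1).
print(job_sequencing([100, 10, 15, 27], [2, 1, 2, 1]))   # 127
```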
Unit 3:-
Graph Algorithms
Graph is a collection of vertices or nodes, connected by a collection of edges. Graphs are
extremely important because they are a very flexible mathematical model for many application
problems. Basically, any time you have a set of objects, and there is some “connection” or
“relationship” or “interaction” between pairs of objects, a graph is a good way to model this.
Examples of graphs in application include communication and transportation networks, VLSI
and other sorts of logic circuits, surface meshes used for shape description in computer-aided
design and geographic information systems, precedence constraints in scheduling systems etc.
A directed graph (or digraph) G = (V,E) consists of a finite set V , called the vertices or nodes,
and E, a set of ordered pairs, called the edges of G.
An undirected graph (or graph) G = (V,E) consists of a finite set V of vertices, and a set E of
unordered pairs of distinct vertices, called the edges.
We say that vertex v is adjacent to vertex u if there is an edge (u, v). In a directed graph, given
the edge e = (u, v), we say that u is the origin of e and v is the destination of e. In undirected
graphs u and v are the endpoints of the edge. The edge e is incident on (meaning that it touches)
both u and v.
In a digraph, the number of edges coming out of a vertex is called the out-degree of that vertex,
and the number of edges coming in is called the in-degree. In an undirected graph we just talk
about the degree of a vertex as the number of incident edges. By the degree of a graph, we
usually mean the maximum degree of its vertices.
Notice that generally the number of edges in a graph may be as large as quadratic in the
number of vertices. However, the large graphs that arise in practice typically have much fewer
edges. A graph is said to be sparse if E = Θ(V ), and dense, otherwise. When giving the
running times of algorithms, we will usually express it as a function of both V and E, so that
the performance on sparse and dense graphs will be apparent.
Representation of Graph
Generally graph can be represented in two ways namely adjacency lists(Linked list
representation) and adjacency matrix(matrix).
Adjacency List:
This type of representation is suitable for the undirected graphs without multiple edges,
and directed graphs. This representation looks as in the tables below.
If we try to apply graph algorithms using the representation of graphs by lists of edges
or adjacency lists, it can be tedious and time-consuming if there is a high number of edges. For the
sake of computation, graphs with many edges can be represented in other ways. In this
class we discuss ways of representing graphs in the form of a matrix.
Adjacency Matrix:
Given a simple graph G = (V, E) with |V| = n, assume that the vertices of the graph
are listed in some arbitrary order like v1, v2, …, vn. The adjacency matrix A of G, with respect
to this order of the vertices, is the n-by-n zero-one matrix A = [aij] with the condition
aij = 1 if {vi, vj} is an edge of G, and aij = 0 otherwise.
Since there are n vertices and we may order the vertices in any way, there are n! possible orders of
the vertices. The adjacency matrix depends on the order of the vertices; hence there are n!
possible adjacency matrices for a graph with n vertices.
In the case of a directed graph we can extend the same concept as in the undirected graph, as
dictated by the relation
aij = 1 if (vi, vj) is an edge of G (now an ordered pair), and aij = 0 otherwise.
If the number of edges is few then the adjacency matrix becomes sparse. In such a condition it is
sometimes beneficial to represent the graph with an adjacency list.
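Both representations for a small example graph (the graph itself is illustrative):

```python
# A small undirected graph with vertices 0..3 and the edges below.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency list: one list of neighbours per vertex.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

# Adjacency matrix: a[i][j] = 1 iff {i, j} is an edge.
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = adj_matrix[v][u] = 1

print(adj_list)       # [[1, 2], [0, 2], [0, 1, 3], [2]]
print(adj_matrix[0])  # [0, 1, 1, 0]
```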
Graph Traversals
There are a number of approaches used for solving problems on graphs. One of the most
important approaches is based on the notion of systematically visiting all the vertices and edge
of a graph. The reason for this is that these traversals impose a type of tree structure (or
generally a forest) on the graph, and trees are usually much easier to reason about than general
graphs.
Breadth-first search
This is one of the simplest methods of graph searching. Choose some vertex arbitrarily as a
root. Add all the vertices and edges that are incident on the root. The newly added vertices
become the vertices at level 1 of the BFS tree. From the set of added vertices of level 1,
find the other vertices that are connected by edges to the level 1 vertices. Follow the above
step until all the vertices are added.
Algorithm:
BFS(G,s) //s is start vertex
{
T = {s};
L =Φ; //an empty queue
Enqueue(L,s);
while (L != Φ )
{
v = dequeue(L);
for each neighbor w to v
if ( w∉ L and w ∉ T )
{
enqueue( L,w);
T = T U {w}; //put edge {v,w} also
}
}
}
Example:
Use breadth first search to find a BFS tree of the following graph.
Solution:
Analysis
From the algorithm above, all the vertices are put once in the queue and they are accessed. For
each accessed vertex from the queue its adjacent vertices are looked for, and this can be done
in O(n) time (in the worst case the graph is complete). This computation for all the possible
vertices that may be in the queue, i.e. n, produces the complexity of the algorithm as O(n²). Also,
from aggregate analysis we can write the complexity as O(E+V) because the inner loop executes E
times in total.
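A Python sketch of BFS following the pseudocode above, returning the tree edges:

```python
from collections import deque

def bfs_tree(adj, s):
    """Return the tree edges of a BFS from s; adj maps vertex -> neighbours."""
    visited = {s}
    tree = []
    Q = deque([s])
    while Q:
        v = Q.popleft()
        for w in adj[v]:
            if w not in visited:        # w is neither in the queue nor the tree
                visited.add(w)
                tree.append((v, w))     # put edge {v, w} into the tree
                Q.append(w)
    return tree

adj = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a', 'd'], 'd': ['b', 'c']}
print(bfs_tree(adj, 'a'))   # [('a', 'b'), ('a', 'c'), ('b', 'd')]
```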
Example:
Use depth first search to find a spanning tree of the following graph.
Algorithm:
DFS(G,s)
{
T = {s};
Traverse(s);
}
Traverse(v)
{
for each w adjacent to v and not yet in T
{
T = T U {w}; //put edge {v,w} also
Traverse (w);
}
}
Analysis:
The complexity of the algorithm is greatly affected by the Traverse function. We can write its
running time in terms of the relation T(n) = T(n-1) + O(n); the O(n) term arises because for each
vertex at most all the vertices are checked (the for loop). With each recursive call the number of
remaining vertices decreases by one. Solving this we find that the complexity of the algorithm is O(n²).
Also, from aggregate analysis we can write the complexity as O(E+V) because the Traverse
function is invoked at most V times and the for loop executes O(E) times in total.
Kruskal’s Algorithm:
The problem of finding an MST can be solved by using Kruskal’s algorithm. The idea behind this
algorithm is that you put the set of edges from the given graph G = (V,E) in nondecreasing order of
their weights. The selection of each edge in sequence then guarantees that the total cost formed
will be the minimum. Note that we have G as a graph, V as a set of n vertices and E as the set of
edges of graph G.
Algorithm:
KruskalMST(G)
{
T = {V} // forest of n nodes
S = set of edges sorted in nondecreasing order of weight
while(|T| < n-1 and S != ∅)
{
remove an edge (u,v) of minimum weight from S
if (u,v) does not create a cycle in T
T = T ∪ {(u,v)}
}
}
Analysis:
In the above algorithm, creating the n-tree forest at the beginning takes O(V) time, the creation of
the sorted set S takes O(ElogE) time, the while loop executes at most O(E) times, and the steps
inside the loop take almost linear time (see disjoint set operations: find and union). So the total
time taken is O(ElogE), or asymptotically equivalently O(ElogV).
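A Python sketch of Kruskal's algorithm; the union-find structure here uses simple path compression (an implementation choice):

```python
def kruskal_mst(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1. Return MST weight."""
    parent = list(range(n))
    def find(x):                        # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total, count = 0, 0
    for w, u, v in sorted(edges):       # nondecreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # edge joins two trees: no cycle
            parent[ru] = rv
            total += w
            count += 1
            if count == n - 1:
                break
    return total

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal_mst(4, edges))   # 6 (edges of weight 1, 2 and 3)
```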
Prim’s Algorithm
This is another algorithm for finding the MST. The idea behind this algorithm is to take any
arbitrary vertex and choose the edge with minimum weight incident on it. Add the new vertex and
continue the above process, at each step considering all the vertices added so far. Remember that
cycles must be avoided.
Algorithm:
PrimMST(G)
{
T = ∅; // T is a set of edges of MST
S = {s} ; //s is randomly chosen vertex and S is set of vertices
while(S != V)
{
e = (u,v) an edge of minimum weight incident to vertices in T and not forming a
simple circuit in T if added to T i.e. u ∈ S and v∈ V-S
T = T ∪ {(u,v)};
S = S ∪ {v};
}
}
Analysis:
In the above algorithm the while loop executes O(V) times. The edge of minimum weight incident
on the chosen vertices can be found in O(E) time, so the total time is O(EV). We can improve the
performance of the above algorithm by choosing a better data structure, namely a priority queue;
it will then be seen that the running time of Prim’s algorithm is O(ElogV).
Shortest Path Problem
Given a weighted graph G = (V,E), every path p = <v0, v1, …, vk> has weight w(p) =
w(v0,v1) + w(v1,v2) + … + w(vk-1,vk). A shortest path from u to v is a path from u to v with
minimum weight; the shortest-path weight from u to v is denoted by d(u,v). It is important to
remember that a shortest path may or may not exist in a graph: if there is a negative-weight cycle
then there is no shortest path. For e.g. the graph below has no shortest path from a to c; you can
notice the negative-weight cycle on the path from a to b.
As a matter of fact a shortest path cannot contain even a positive-weight cycle, since removing the
cycle would yield a shorter path; shortest paths do still exist in graphs with positive-weight cycles.
Some of the variations of the shortest path problem include:
Single Source: This type of problem asks us to find the shortest path from the given vertex
(source) to all other vertices in a connected graph
Single Destination: This type of problem asks us to find the shortest path to the given vertex (des-
tination) from all other vertices in a connected graph.
Single Pair: This type of problem asks us to find the shortest path from the given vertex (source)
to another given vertex (destination).
All Pairs: This type of problem asks us to find the shortest path from the all vertices to all other
vertices in a connected graph
Example:
Find the shortest path from the vertex c to all other vertices in the following DAG.
Algorithm:
DagSP(G,w,s)
{
Topologically Sort the vertices of G
for each vertex v belongs to V
do d[v] = ∞
d[s] = 0
for each vertex u, taken in topologically sorted order
do for each vertex v adjacent to u
do if d[v] > d[u] + w(u,v)
then d[v] = d[u] + w(u,v)
}
Dijkstra’s Algorithm
This is another approach of getting single source shortest paths. In this algorithm it is assumed that
there is no negative weight edge. Dijkstra’s algorithm works using greedy approach, as we will see
later.
Example:
Find the shortest paths from the source g to all other vertices using Dijkstra’s algorithm.
Solution:
Algorithm:
Dijkstra(G,w,s)
{
for each vertex v∈ V
do d[v] = ∞
d[s] = 0
S=∅
Q=V
While(Q != ∅)
{
u = ExtractMin(Q) //remove the vertex u with minimum d value from Q
S = S ∪ {u}
for each vertex v adjacent to u
do if d[v] > d[u] + w(u,v)
then d[v] = d[u] + w(u,v)
}
}
Analysis:
In the above algorithm, the first for loop block takes O(V) time. Initialization of the priority queue Q
takes O(V) time. The while loop executes O(V) times, where for each execution the block inside the
loop takes O(V) time. Hence the total running time is O(V²).
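A Python sketch of Dijkstra's algorithm using a binary heap as the priority queue mentioned above; stale heap entries are skipped rather than decreased in place (an implementation choice):

```python
import heapq

def dijkstra(adj, s):
    """adj: vertex -> list of (neighbour, weight), nonnegative weights.
    Return a dict of shortest distances from s."""
    d = {v: float('inf') for v in adj}
    d[s] = 0
    pq = [(0, s)]
    done = set()
    while pq:
        du, u = heapq.heappop(pq)        # u = vertex in Q with minimum d value
        if u in done:
            continue                     # stale entry: u already finalized
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:            # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    return d

adj = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2), ('d', 6)],
       'c': [('d', 3)], 'd': []}
print(dijkstra(adj, 'a'))   # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
```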
Analysis:
Clearly the above algorithm’s running time is O(n³), where n is the cardinality of the set V of vertices.
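The algorithm this analysis refers to is missing from the extracted notes; given the O(n³) all-pairs context it is presumably the Floyd–Warshall algorithm, sketched here:

```python
def floyd_warshall(W):
    """W: n x n matrix of edge weights, with float('inf') where there is no edge
    and 0 on the diagonal. Return the matrix of all-pairs shortest distances."""
    n = len(W)
    d = [row[:] for row in W]
    for k in range(n):                  # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = float('inf')
W = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
print(floyd_warshall(W)[0])   # [0, 3, 5, 6]
```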
Chapter 2:
[Geometric Algorithms]
Computational Geometry
The field of computational geometry deals with the study of geometric problems. In our
class we present few geometric problems for e.g. detecting the intersection between line segments,
and try to solve them by using known algorithms. In this lecture, we discuss and present
algorithms on context of 2-D.
Application Domains
• Computer graphics
• Robotics
• GIS
• CAD/CAM – IC design, automobiles, buildings
• Molecular modeling
• Pattern recognition
Some Definitions
Point:
A point is a pair of numbers. The numbers are real numbers, but in our usual calculation we
concentrate on integers. For e.g. p1(x1,y1) and p2(x2,y2) are two points as shown
Line segment:
A line segment is a pair of points p1 and p2, where two points are end points of the segment.
For e.g. S(p1,p2) is shown below.
Ray: -
A ray is an infinite one-dimensional subset of a line determined by two points, say P0 and P1,
where one of the points is designated as the endpoint.
Thus, a ray consists of a bounded endpoint and extends to infinity along the line.
Line: - A line is represented by a pair of points, P0 and P1 say, and is extended both ways to infinity
along the segment represented by the pair of points P0 & P1.
Polygon:
A polygon is a homeomorphic image of a circle, i.e. it is a certain deformation of a circle.
Simple Polygon:
A simple polygon is a region of the plane bounded by a finite collection of line segments forming a
simple closed curve. Mathematically, let V0, V1, V2, ---------, Vn-1 be n ordered vertices in the plane;
then the line segments e0 (V0, V1), e1 (V1, V2), ……., en-1 (Vn-1, V0) form a simple polygon if and only
if:
� the intersection of each pair of segments adjacent in cyclic ordering is a simple single point
shared by them;
ei ∩ ei+1 = Vi+1 &
� non-adjacent segments do not intersect;
ei ∩ e j = Φ
Thus, a polygon is simple if there are no intersection points between non-consecutive line segments,
i.e. vertices are the only intersection points. Vertices of a simple polygon
are assumed to be ordered in the counterclockwise direction.
Convex Polygon:
A simple polygon P is convex if and only if for any pair of points x, y in P the line segment
between x and y lies entirely in P. We can notice that if all the interior angles are less than 180°,
then the simple polygon is a convex polygon.
Diagonal:
A diagonal of a polygon is a line segment joining two non-adjacent vertices that lies entirely inside
the polygon. Here all of (V2, V8), (V3, V7), (V4, V6) & (V10, V12) are diagonals of the polygon, but
(V9, V11) is not a diagonal.
Ear of Polygon:
Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form an ear if (Vi, Vi+2) is a diagonal, Vi+1 is the
tip of the ear.
Mouth:
Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form a mouth if (Vi, Vi+2) is an external
diagonal. In the figure above, (V0, V1, V2) & (V2, V3, V4) are mouths of the polygon.
One-Mouth Theorem
Except for convex polygons, every simple polygon has at least one mouth.
Two-Ears Theorem
Every polygon of n ≥ 4 vertices has at least two non-overlapping ears.
Right Turn:
If the line segment (P1, P2) lies to the right of (P0, P1), then P0, P1, P2 is a right turn.
using the point of the line segment on the obtained equation after getting the slopes of the respective lines.
• Solve the two equations of lines L1 and L2; let the value obtained by solving be p = (xi, yi). Here we
confront two cases. The first case is: if p is the intersection of the two line segments then p lies on
both S1 and S2. The second case is: if p is not an intersection point then p does not lie on at least
one of the line segments S1 and S2.
The figure below shows both the cases.
See figure below to have idea on left and right turn as well as direction of points. The
cross product’s geometric interpretation is also shown below.
Using the concept of left and right turn we can detect the intersection between the two line
segments in very efficient manner.
Convex hull:
Definition:
- The convex hull of a finite set of points, S in plane is the smallest convex polygon P, that
encloses S. (Smallest area)
- The convex hull of a set of points, S in the plane is the union of all the triangles determined by
points in S.
- The convex hull of a finite set of points, S, is the intersection of all the convex polygons
(sets) that contain S.
There is a wide range of application areas where convex hulls come into use, such as:
- In pattern recognition, an unknown shape may be represented by its convex hull, which is then
matched to a database of known shapes.
- In motion planning, if the robot is approximated by its convex hull, then it is easier to plan
collision free path on the landscape of obstacles.
− Smallest box, fitting ranges & so on.
Algorithm (Graham’s Scan):
- Find a point with the lowest y-coordinate; let it be q0.
- Sort the input points angularly about q0; let the sorted list now be {q0, q1, -------, qn-1}.
- Push q0 into stack S and push q1 into the stack S.
- Initialize i = 2
while (i < n)
if LeftTurn (next top (S), top (S), qi) is true
push qi into stack S.
i++
else
pop the stack
end if
end while
- Each point popped from the stack is not a vertex of the convex hull.
- Finally, when all elements have been processed, the points that remain on the stack are the
vertices of the convex hull.
Complexity Analysis:
- Finding the minimum y-coordinate point takes O(n) time.
- Sorting angularly about the point takes O(nlogn) time.
- Pushing & popping take constant time.
- The while loop runs O(n) times.
Hence, the complexity = O(n) + O(nlogn) + O(1) + O(n)
= O(nlogn).
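A Python sketch of the scan; the angular sort uses atan2 about q0, and ties between collinear points are left to the left-turn test to clean up (a simplification):

```python
import math

def cross(o, a, b):
    """> 0 if o -> a -> b is a left turn (counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def graham_scan(points):
    """Return the convex hull vertices in counterclockwise order."""
    q0 = min(points, key=lambda p: (p[1], p[0]))          # lowest y (then x)
    rest = sorted((p for p in points if p != q0),
                  key=lambda p: math.atan2(p[1] - q0[1], p[0] - q0[0]))
    S = [q0, rest[0]]                                     # push q0 and q1
    for p in rest[1:]:
        while len(S) > 1 and cross(S[-2], S[-1], p) <= 0:  # not a left turn: pop
            S.pop()
        S.append(p)
    return S

pts = [(0, 0), (2, 0), (1, 1), (2, 2), (0, 2), (1, 3)]
print(graham_scan(pts))   # [(0, 0), (2, 0), (2, 2), (1, 3), (0, 2)]
```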
Exercises:
1. Write down the algorithm to find the point of intersection of two line segments, if it exists.
2. Use Graham’s scan algorithm to find convex hull of the set of points below.
Example 2:
Find the convex hull of the set of points given below using Graham’s scan algorithm.
Solution:
Chapter 3
[NP Complete Problems&Approximation Algorithms]
Up to now we have been considering problems that can be solved by algorithms in worst-case
polynomial time. There are many problems, and it is not necessary that all problems have an
apparent solution. This concept, somehow, can be applied to solving problems using computers.
The computer can solve: some problems in limited time, e.g. sorting; some problems that require
an unmanageable amount of time, e.g. Hamiltonian cycles; and some problems that cannot be
solved at all, e.g. the Halting Problem. In this section we concentrate on the specific class of
problems called NP-complete problems (to be defined later).
Problems:
Abstract Problems:
        An abstract problem A is a binary relation on a set I of problem instances and a set S of
problem solutions. For example, the minimum spanning tree problem can be viewed as a relation
pairing each instance, a given graph G, with a solution, an MST T of G.
Decision Problems:
        A decision problem D is a problem whose answer is either "true", "yes", "1" or "false",
"no", "0". For example, if we have the abstract shortest-path problem with its instance set and the
solution set {0,1}, then we can transform that abstract problem into a decision problem by
reformulating it as "Is there a path from u to v with at most k edges?". In this situation the
answer is either yes or no.
Optimization Problems:
        We encounter many problems where there are many feasible solutions and our aim is to
find the feasible solution with the best value. This kind of problem is called an optimization
problem. For example, given a graph G and vertices u and v, find the shortest path from u to v,
i.e. a path with the minimum number of edges. NP-completeness does not deal directly with
optimization problems; however, we can translate an optimization problem into a decision problem.
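The translation for the shortest-path example can be sketched concretely: the decision version "is there a path from u to v with at most k edges?" is answered in polynomial time by breadth-first search. This is an illustrative sketch; the adjacency-dictionary representation and function name are choices made here, not part of the notes.

```python
from collections import deque

def path_within_k_edges(graph, u, v, k):
    """Decide (True/False) whether some path u -> v uses at most k edges."""
    # BFS computes the minimum number of edges from u to every reachable vertex.
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in graph.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return v in dist and dist[v] <= k
```

Answering this decision problem for k = 0, 1, 2, ... also solves the optimization problem, which is why hardness of the decision version carries over to the optimization version.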
Encoding:
An encoding of a set S is a function e from S to the set of binary strings. With the help of an
encoding, we define a concrete problem as a problem whose instances are binary strings, i.e. if we
encode an abstract problem, then the resulting encoded problem is a concrete problem. Encoding a
problem as a concrete problem thus ensures that every encoded problem can be regarded as a
language, i.e. a subset of {0,1}*.
Complexity Class P:
        Complexity class P is the set of concrete decision problems that are polynomial-time
solvable by a deterministic algorithm. If we have an abstract decision problem A with instance set
I mapping to the set {0,1}, an encoding e: I→{0,1}* is used to denote the concrete decision problem
e(A). We have the solutions to both the abstract problem instance i∈I and the concrete problem
instance e(i)∈{0,1}* as A(i)∈{0,1}. It is important to understand that the encoding mechanism can
greatly affect the running time of the algorithm. For example, take an algorithm that runs in O(n)
time, where n is the size of the input. If the input is just a natural number k, then its unary
encoding (k written as k 1's) has size k bits, so the algorithm's running time is O(k). If instead
we encode the natural number k in binary, we can represent k with only about logk bits (try
representing it with 0s and 1s), and the same algorithm runs in O(logk) time. Notice that if
n = logk, then the O(k) of the unary encoding becomes O(2^n) relative to the binary input size.
However, in our discussion we discard encodings like unary, so that the choice of encoding makes no
great difference in complexity.
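The gap between the two encodings can be checked directly. A small sketch (the helper names here are our own):

```python
# Input size of the same number k under the two encodings discussed above.

def unary_size(k):
    """Unary encoding writes k as k ones (111...1), so its size is k."""
    return k

def binary_size(k):
    """Binary encoding needs about log2(k) + 1 bits."""
    return k.bit_length()

k = 1024
print(unary_size(k), binary_size(k))   # 1024 vs 11
```

An algorithm that is "linear in the input size" therefore runs in time proportional to k under unary encoding but only to logk under binary encoding, which is exactly the O(k) versus O(2^n) contrast above.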
        We define a polynomial-time computable function f:{0,1}*→{0,1}* with respect to some
polynomial-time algorithm PA such that, given any input x∈{0,1}*, it produces the output f(x). For
some set I of problem instances, two encodings e1 and e2 are polynomially related if there are two
polynomial-time computable functions f and g such that for any i∈I, both f(e1(i)) = e2(i) and
g(e2(i)) = e1(i) hold, i.e. each encoding can be computed from the other in polynomial time by some
algorithm.
Complexity Class NP:
        NP is the set of decision problems solvable by nondeterministic algorithms in polynomial
time. When we have a problem, it is generally much easier to verify that a given value is a
solution to the problem than to calculate the solution of the problem. Using this idea, we say a
problem is in class NP (nondeterministic polynomial time) if there is an algorithm for the problem
that verifies it in polynomial time. V is a verification algorithm for the decision problem D if V
takes as input a string x, an instance of problem D, and another binary string y, the certificate,
whose size is no more than a polynomial in the size of x. The algorithm V verifies an input x if
there is a certificate y such that the answer of D to the input x with certificate y is yes. For
example, the circuit satisfiability problem (SAT) is the question "Given a Boolean combinational
circuit, is it satisfiable? i.e. does the circuit have an assignment of truth values that produces
1 as the output of the circuit?" For the circuit satisfiability problem, take a circuit x and a
certificate y consisting of a set of values that produce output 1; we can verify in polynomial time
whether the given certificate satisfies the circuit. So we can say that the circuit satisfiability
problem is in NP. We can always say P ⊆ NP, since if we have a problem for which a polynomial-time
algorithm exists to solve (decide: notice the difference between decide and accept) the problem,
then we can always construct a verification algorithm that ignores the certificate and accepts the
output of the polynomial-time algorithm. From the above fact it is clear that P ⊆ NP, but whether
P = NP remains unsolved and is still the big question in theoretical computer science. Most
computer scientists, however, believe that P ≠ NP.
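The idea of a verification algorithm V can be made concrete for formula satisfiability. In this sketch, a formula is given in CNF as a list of clauses, each clause a list of integers where i means variable i and -i its negation (a common convention, e.g. in DIMACS files); the certificate y is a truth assignment. Names and representation are choices made here, not the notes' own.

```python
def verify_sat(clauses, assignment):
    """Return True iff the certificate `assignment` satisfies every clause.

    Runs in time polynomial (in fact linear) in the size of the formula,
    which is exactly what membership in NP requires of a verifier.
    """
    def literal_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value

    # A CNF formula is satisfied when every clause has some true literal.
    return all(any(literal_true(lit) for lit in clause) for clause in clauses)

# (x1 OR NOT x2) AND (x2 OR x3)
formula = [[1, -2], [2, 3]]
certificate = {1: True, 2: False, 3: True}
print(verify_sat(formula, certificate))   # True
```

Note the asymmetry the text describes: checking a proposed assignment is easy, while finding one may require trying exponentially many candidates.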
NP-Completeness:
NP-complete problems are the hardest problems in class NP. We say a problem A is NP-complete if
1. A ∈ NP, and
2. B ≤p A, for every B ∈ NP.
A problem (or language) A satisfying property 2 alone is called NP-hard.
Cook’s Theorem:
“SAT is NP-hard”
Proof: (This is not the actual proof as given by Cook; it is just a sketch.)
Take a problem V ∈ NP, and let A be the algorithm that verifies V in polynomial time (such an A
must exist since V ∈ NP). We can program A on a computer, and therefore there exists a (huge)
logical circuit whose input wires correspond to the bits of the inputs x and y of A and which
outputs 1 precisely when A(x,y) returns yes.
For any instance x of V, let Ax be the circuit obtained from A by setting the x-input wire values
according to the specific string x. The construction of Ax from x is our reduction function. If x
is a yes instance of V, then the certificate y for x gives a satisfying assignment for Ax.
Conversely, if Ax outputs 1 for some assignment to its input wires, that assignment translates into
a certificate for x.
To show that SAT is NP-complete we have to show the two properties given by the definition of
NP-complete problems. The first property, i.e. that SAT is in NP, we showed above (see pg 5,
italicized part), so it is sufficient to show that the second property holds for SAT. The proof of
the second property, i.e. that SAT is NP-hard, is from lemma 3. This completes the proof.
Approximation Algorithms:
        An approximation algorithm is a way of dealing with NP-completeness for optimization
problems. This technique does not guarantee the best solution. The goal of an approximation
algorithm is to come as close as possible to the optimum value in a reasonable amount of time,
which is at most polynomial time. If we are dealing with an optimization problem (maximization or
minimization) whose feasible solutions have positive cost, then it is worthwhile to look at an
approximation algorithm for a near-optimal solution.
An algorithm has an approximation ratio of ρ(n) if, for any problem of input size n, the cost C of
the solution produced by the algorithm and the cost C* of the optimal solution satisfy
max(C/C*, C*/C) ≤ ρ(n). Such an algorithm is called a ρ(n)-approximation algorithm.
The relation applies to both maximization (0 < C ≤ C*) and minimization (0 < C* ≤ C) problems.
ρ(n) is always greater than or equal to 1. If the solution produced by the approximation algorithm
is a true optimal solution then clearly ρ(n) = 1.
Algorithm (approximate vertex cover):
ApproxVertexCover (G)
{
    C = {}
    E' = E
    while E' is not empty
        do Let (u, v) be an arbitrary edge of E'
           C = C ∪ {u, v}
           Remove from E' every edge incident on either u or v
    return C
}
Analysis:
If E' is represented using adjacency lists, the above algorithm takes O(V+E) time, since each
edge is processed only once and each vertex is processed only once throughout the whole
operation.
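The pseudocode above can be sketched directly in Python. For brevity this sketch represents E' as an edge list rather than adjacency lists, so it runs in O(V·E) rather than the O(V+E) of the analysis; the structure of the algorithm is unchanged.

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: take both endpoints of an
    arbitrary uncovered edge, then discard all edges they cover."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]                 # an arbitrary edge of E'
        cover.update((u, v))                # C = C ∪ {u, v}
        # remove from E' every edge incident on either u or v
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover
```

The chosen edges share no endpoints (they form a matching), and any optimal cover must contain at least one endpoint of each, so |C| ≤ 2|C*|, giving the approximation ratio ρ(n) = 2.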
Tribhuvan University
Bachelor of Science in Computer Science
And Information Technology
Examination, 2069
New Summit College
(Old Baneshowr , Kathmandu)
1. Why are asymptotic notations important in algorithm analysis? Describe big-O, big-Ω and big-
theta notations with suitable examples. {2+2+2+2}
2. What is a recurrence relation? Prove that the complexity of the recurrence relation "T(n) =
8T(n/2) + n^2" is O(n^3) by using the substitution method. {1+7}
3. Given the following block of code, write a recurrence relation for it and also find an asymptotic
upper bound (assume that all dotted code takes constant time). {4+4}
Fun(int n)
{
…………….
if(condition1)
x=Fun(n/2)
else if(condition2)
x=Fun(2n/3)
else
x= Fun(n/4)
……………
}
4. What is the concept behind randomized quick sort? Write down its algorithm and give its
average-case analysis. {1+3+4}
5. What is meant by median order statistics? Write the algorithm for expected linear time selection
and analyze it. {1+3+4}
6. Devise a divide and conquer algorithm for finding the minimum and maximum elements among a
set of given elements. Write the recurrence relation for your algorithm and give its big-O estimate.
{5+3}
7. What are the characteristics of a problem that can be solved by using a dynamic programming
algorithm? Give the recursive definition for solving the 0/1 knapsack problem. Trace the algorithm
for w={3,4,2,2,3}, v={12,14,6,5,6} and a knapsack of capacity 12. {2+1+5}
8. Write the recurrence relation for the Longest Common Subsequence (LCS) problem. Trace the
algorithm to find the LCS of X={a,b,c,b,d,a,b} and Y={b,d,c,a,b,a}. {2+6}
9. Use the master method to find the big-O estimates of the recurrences: {4+4}
a. T(n) = 3T(n/2) + n
b. T(n) = 4T(n/2) + n^2
10. Show all the steps required for sorting an array of size 10 by using Heap sort:
a[10] = {5, 3, 2, 4, 7, 8, 1, 11, 9, 15}. {8}
Hint: First construct a heap and then sort by using Heap sort properties.
A Complete Note in Design And Analysis of Algorithms
By Bhupendra S. Saud
Email: Saud.bhupendra427@gmail.com
The End