
CS3401-ALGORITHMS

UNIT 1

Time and Space Complexity
Time complexity is a measure of how long an algorithm takes to run as a function of the size of the
input. It is typically expressed using big O notation, which describes an upper bound on the growth
of the time required by the algorithm. For example, an algorithm with a time complexity of O(n)
has a running time that grows at most in proportion to the input size n.

There are different types of time complexities:

• O(1) or constant time: the algorithm takes the same amount of time to run regardless of the
size of the input.

• O(log n) or logarithmic time: the algorithm's running time increases logarithmically with the
size of the input.

• O(n) or linear time: the algorithm's running time increases linearly with the size of the input.

• O(n log n) or linearithmic time: the algorithm's running time grows in proportion to n log n,
slightly faster than linear but much slower than quadratic.

• O(n^2) or quadratic time: the algorithm's running time increases quadratically with the size
of the input.

• O(2^n) or exponential time: the algorithm's running time increases exponentially with the
size of the input.

Space complexity, on the other hand, is a measure of how much memory an algorithm uses as a
function of the size of the input. Like time complexity, it is typically expressed using big O notation.
For example, an algorithm with a space complexity of O(n) uses more memory as the input size (n)
increases. Space complexities are generally categorized as:

• O(1) or constant space: the algorithm uses the same amount of memory regardless of the
size of the input.

• O(n) or linear space: the algorithm's memory usage increases linearly with the size of the
input.

• O(n^2) or quadratic space: the algorithm's memory usage increases quadratically with the
size of the input.

• O(2^n) or exponential space: the algorithm's memory usage increases exponentially with the
size of the input.

• Big O notation (O(f(n))) provides an upper bound on the growth of a function. It is most often
used to state the worst-case time or space complexity of an algorithm. For example, an
algorithm with a time complexity of O(n^2) has a running time of at most c*n^2 for some
constant c and all sufficiently large n, where n is the size of the input.

• Big Ω notation (Ω(f(n))) provides a lower bound on the growth of a function. It is often used
to state the best-case time or space complexity of an algorithm. For example, an
algorithm with a space complexity of Ω(n) uses at least c*n memory for some constant c
and all sufficiently large n, where n is the size of the input.

• Big Θ notation (Θ(f(n))) provides a tight bound on the growth of a function: the function is
bounded both above and below by f(n), up to constant factors. For example, an
algorithm with a time complexity of Θ(n log n) has a running time that
is both O(n log n) and Ω(n log n), where n is the size of the input.

It's important to note that asymptotic notation only describes the behavior of the function for
large values of n, and does not provide information about the exact behavior of the function for
small values of n. Also, when the best, worst and average cases of an algorithm grow at the same
rate f(n), the running time is both O(f(n)) and Ω(f(n)), and it is written simply as Θ(f(n)).
Additionally, these notations can be used to compare the efficiency of different algorithms, where a
lower order of growth is considered more efficient for large inputs. For example, an algorithm with a
time complexity of O(n) is more efficient than an algorithm with a time complexity of O(n^2).

It's also worth mentioning that asymptotic notation is not only limited to time and space complexity
but can be used to express the behavior of any function, not just algorithms.

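There are three asymptotic notations that are used to represent the time complexity of an algorithm:
Θ notation (theta), Ω notation (omega), and Big O notation.

Before defining them, consider a linear search as a running example:

• Input: Here our input is an integer array of size "n" and we have one integer "k" that we need to
search for in that array.

• Output: If the element "k" is found in the array, then we return 1, otherwise we return 0.

// search for "k" in arr[0..n-1]; the function name is illustrative
int search(int arr[], int n, int k)
{
    // for-loop to iterate with each element in the array
    for (int i = 0; i < n; ++i)
    {
        // check if ith element is equal to "k" or not
        if (arr[i] == k)
            return 1; // return 1, if you find "k"
    }
    return 0; // return 0, if you didn't find "k"
}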

• If the input array is [1, 2, 3, 4, 5] and you want to find if "1" is present in the array or not,
then the if-condition of the code will be executed 1 time and it will find that the element 1 is
there in the array. So, if each comparison takes one unit of time, the if-condition takes 1 unit here.

• If the input array is [1, 2, 3, 4, 5] and you want to find if "3" is present in the array or not,
then the if-condition of the code will be executed 3 times and it will find that the element 3 is
there in the array. So, the if-condition takes 3 units of time here.

• If the input array is [1, 2, 3, 4, 5] and you want to find if "6" is present in the array or not,
then the if-condition of the code will be executed 5 times and it will find that the element 6 is
not there in the array, and the algorithm will return 0 in this case. So, the if-condition
takes 5 units of time here.

As we can see, for the same input array we get different running times for different values of "k".
So, this can be divided into three cases:

• Best case: This is the lower bound on running time of an algorithm. We must know the case
that causes the minimum number of operations to be executed. In the above example, our
array was [1, 2, 3, 4, 5] and we are finding if "1" is present in the array or not. So here, after
only one comparison, we will find that the element is present in the array. So, this is the best
case of our algorithm.

• Average case: We calculate the running time for all possible inputs, sum all the calculated
values and divide the sum by the total number of inputs. We must know (or predict) the
distribution of cases.

• Worst case: This is the upper bound on running time of an algorithm. We must know the
case that causes the maximum number of operations to be executed. In our example, the
worst case can be if the given array is [1, 2, 3, 4, 5] and we try to find if element "6" is
present in the array or not. Here, the if-condition of our loop will be executed 5 times and
then the algorithm will give "0" as output.
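
The three cases can be observed directly by counting how often the if-condition runs. The following is a minimal sketch; the helper name count_comparisons is an assumption made for this illustration and is not part of the original example.

```c
/* Hypothetical helper: returns how many times the comparison arr[i] == k
 * is evaluated while linearly searching arr[0..n-1] for k. */
int count_comparisons(const int arr[], int n, int k)
{
    int comparisons = 0;
    for (int i = 0; i < n; ++i)
    {
        ++comparisons;      /* one evaluation of the if-condition */
        if (arr[i] == k)
            break;          /* found: stop early */
    }
    return comparisons;
}
```

For the array [1, 2, 3, 4, 5], this returns 1 for k = 1 (best case), 3 for k = 3, and 5 for k = 6 (worst case).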

So, we learned about the best, average, and worst case of an algorithm. Now, let's get back to
asymptotic notation, where we use three notations to represent the
complexity of an algorithm, i.e. Θ Notation (theta), Ω Notation, Big O Notation.

NOTE: In asymptotic analysis, we generally deal with large input sizes.

Θ Notation (theta)

The Θ Notation is used to find the tight bound of an algorithm, i.e. it defines an upper bound and a
lower bound, and your algorithm will lie in between these levels. So, if a function is g(n), then the
theta representation is shown as Θ(g(n)) and the relation is shown as:

Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such

that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0 }

Ω Notation

The Ω notation denotes the lower bound of an algorithm, i.e. the time taken by the algorithm can't be
lower than this. It is commonly used to describe the time taken by the algorithm when provided with
its best-case input, the fastest time in which the algorithm can return a result. So, if a function is g(n),
then the omega representation is shown as Ω(g(n)) and the relation is shown as:

Ω(g(n)) = { f(n): there exist positive constants c and n0 such

that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0 }

The above expression can be read as: omega of g(n) is defined as the set of all the functions f(n) for
which there exist some constants c and n0 such that c*g(n) is less than or equal to f(n), for all n
greater than or equal to n0.

if f(n) = 2n² + 3n + 1 and

g(n) = n²

then for c = 2 and n0 = 1, we can say that f(n) = Ω(n²), because 2n² ≤ 2n² + 3n + 1 for all n ≥ 1.

Big O Notation

The Big O notation defines the upper bound of any algorithm, i.e. your algorithm can't take more time
than this. In other words, the big O notation is typically used to denote the maximum time taken
by an algorithm, or the worst-case time complexity of an algorithm. For this reason, big O notation is
the most used notation for the time complexity of an algorithm. So, if a function is g(n), then the big O
representation of g(n) is shown as O(g(n)) and the relation is shown as:

O(g(n)) = { f(n): there exist positive constants c and n0 such

that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0 }

The above expression can be read as: Big O of g(n) is defined as the set of functions f(n) for which there
exist some constants c and n0 such that f(n) is greater than or equal to 0 and f(n) is smaller than or
equal to c*g(n) for all n greater than or equal to n0.

if f(n) = 2n² + 3n + 1 and
g(n) = n²

then for c = 6 and n0 = 1, we can say that f(n) = O(n²), because 3n + 1 ≤ 4n² for all n ≥ 1.
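
The constants quoted in the two worked examples (c = 2 for the lower bound, c = 6 for the upper bound, n0 = 1) can be sanity-checked numerically. The sketch below is only an illustration: the function name bounds_hold is an assumption, and a finite check is not a proof, though it catches a badly chosen constant quickly.

```c
/* Hypothetical check: returns 1 if 2*n^2 <= f(n) <= 6*n^2 holds for every
 * n in [1, n_max], where f(n) = 2n^2 + 3n + 1; returns 0 otherwise. */
int bounds_hold(long long n_max)
{
    for (long long n = 1; n <= n_max; ++n)
    {
        long long f = 2 * n * n + 3 * n + 1;   /* f(n) from the example */
        long long g = n * n;                   /* g(n) = n^2 */
        if (!(2 * g <= f && f <= 6 * g))
            return 0;                          /* a bound failed at this n */
    }
    return 1;
}
```

Since both inequalities hold from n = 1 onward, f(n) is both Ω(n²) and O(n²), and therefore Θ(n²).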


Big O notation example of Algorithms

Big O notation is the most used notation to express the time complexity of an algorithm. In this
section, we will find the big O notation of various algorithms.

Example 1: Finding the sum of the first n numbers.

In this example, we have to find the sum of the first n numbers. For example, if n = 4, then our output
should be 1 + 2 + 3 + 4 = 10. If n = 5, then the output should be 1 + 2 + 3 + 4 + 5 = 15. Let's try various
solutions to this problem and compare them.

O(1) solution

// function taking input "n"
int findSum(int n)
{
    return n * (n + 1) / 2; // this will take some constant time c1
}

In the above code, there is only one statement and we know that a statement takes constant time
for its execution. The basic idea is that if the statement is taking constant time, then it will take the
same amount of time for all input sizes, and we denote this as O(1).

O(n) solution

In this solution, we will run a loop from 1 to n and we will add these values to a variable named
"sum".

// function taking input "n"
int findSum(int n)
{
    int sum = 0; // -----------------> it takes some constant time "c1"

    // the comparison and increment take place n times (c2*n);
    // the creation of i takes some constant time
    for (int i = 1; i <= n; ++i)
        sum = sum + i; // -----------> this statement will be executed n times i.e. c3*n

    return sum; // ------------------> it takes some constant time "c4"
}

/*
 * Total time taken = time taken by all the statements to execute
 * here in our example we have 3 constant-time statements i.e. "sum = 0", "i = 1", and "return sum",
 * so we can add all the constants and replace them with some new constant "c"
 * apart from this, we have two statements running n times i.e. "i <= n" (in reality n+1 times) and
 * "sum = sum + i" i.e. c2*n + c3*n = c0*n
 * Total time taken = c0*n + c
 */

The big O notation of the above code is O(c0*n) + O(c), where c and c0 are constants. So, the overall
time complexity can be written as O(n).

O(n²) solution

In this solution, we will increment the value of the sum variable "i" times, i.e. for i = 1, the sum variable
will be incremented once, i.e. sum = 1. For i = 2, the sum variable will be incremented twice. So, let's
see the solution.

// function taking input "n"
int findSum(int n)
{
    int sum = 0; // -----------------> constant time

    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= i; ++j)
            sum++; // ---------------> it will run n*(n+1)/2 times

    return sum; // ------------------> constant time
}

/*
 * Total time taken = time taken by all the statements to execute
 * the statement that is being executed most of the time is "sum++" i.e. n*(n+1)/2 times
 * So, total complexity will be: c1*n² + c2*n + c3 [c1 is for the constant terms of n², c2 is for the
 * constant terms of n, and c3 is for rest of the constant time]
 */

The big O notation of the above algorithm is O(c1*n²) + O(c2*n) + O(c3). Since we keep only the highest
order of growth in big O, our expression reduces to O(n²).
So far, we have seen 3 solutions for the same problem. Now, which algorithm will you prefer to use
when you are finding the sum of the first "n" numbers? We would prefer the O(1) solution because the
time taken by the algorithm will be constant irrespective of the input size.
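
To make the difference in growth concrete, the loop bodies of the O(n) and O(n²) solutions can be counted rather than timed. This is a small sketch under the assumption that counting executions of the innermost statement is a fair proxy for running time; the names steps_linear and steps_quadratic are invented for this illustration.

```c
/* Number of times "sum = sum + i" runs in the O(n) solution. */
long long steps_linear(int n)
{
    long long steps = 0;
    for (int i = 1; i <= n; ++i)
        ++steps;
    return steps;
}

/* Number of times "sum++" runs in the O(n^2) solution. */
long long steps_quadratic(int n)
{
    long long steps = 0;
    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= i; ++j)
            ++steps;
    return steps;
}
```

For n = 1000, steps_linear gives 1000 while steps_quadratic gives 500500, whereas the O(1) formula performs the same few arithmetic operations for every n.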

Recurrence Relation

A recurrence relation is a mathematical equation that describes the relation between the input size
and the running time of a recursive algorithm. It expresses the running time of a problem in terms of
the running time of smaller instances of the same problem.

For divide-and-conquer algorithms, a recurrence relation typically has the form T(n) = aT(n/b) + f(n) where:

• T(n) is the running time of the algorithm on an input of size n

• a is the number of recursive calls made by the algorithm

• b is the factor by which the input size shrinks, so n/b is the size of the input passed to each recursive call

• f(n) is the time required to perform any non-recursive operations

The recurrence relation can be used to determine the time complexity of the algorithm using
techniques such as the Master Theorem or Substitution Method.

For example, let's consider the problem of computing the nth Fibonacci number. A simple recursive
algorithm for solving this problem is as follows:

Fibonacci(n)
    if n <= 1
        return n
    else
        return Fibonacci(n-1) + Fibonacci(n-2)

The recurrence relation for this algorithm is T(n) = T(n-1) + T(n-2) + O(1), which describes the running
time of the algorithm in terms of the running time of the two smaller instances of the problem with input
sizes n-1 and n-2. The Master Theorem does not apply to this recurrence, because the subproblems shrink
by subtraction rather than by a constant factor. Using the substitution method or a recursion tree, it can
be shown that the time complexity of this algorithm is O(2^n), which is very inefficient for large input sizes.
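
The exponential growth can be seen by counting the calls the recursive algorithm makes. The sketch below is a direct C transcription of the pseudocode with a call counter added; the counter variable and the wrapper count_fib_calls are assumptions introduced for this illustration.

```c
/* Counter incremented on every call to fib (illustrative instrumentation). */
static long long call_count = 0;

/* Naive recursive Fibonacci, as in the pseudocode. */
long long fib(int n)
{
    ++call_count;
    if (n <= 1)
        return n;
    return fib(n - 1) + fib(n - 2);
}

/* Hypothetical wrapper: returns the number of calls made while computing fib(n). */
long long count_fib_calls(int n)
{
    call_count = 0;
    fib(n);
    return call_count;
}
```

Computing fib(10) takes 177 calls and fib(20) takes 21891 calls, so adding 10 to n multiplies the work by more than 100.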

Searching
Searching is the process of fetching a specific element in a collection of elements. The collection can
be an array or a linked list. If you find the element in the list, the process is considered successful,
and it returns the location of that element.
Three prominent search strategies are extensively used to find a specific item on a list. However, the
algorithm chosen is determined by the list's organization.
1. Linear Search
2. Binary Search
3. Interpolation Search

Linear Search
Linear search, often known as sequential search, is the most basic search technique. In this type of
search, we go through the entire list and try to fetch a match for a single element. If we find a match,
then the address of the matching target element is returned.
On the other hand, if the element is not found, then it returns a NULL value.
Following is a step-by-step approach employed to perform the Linear Search Algorithm.

The procedures for implementing linear search are as follows:
Step 1: First, read the search element (Target element) to be found in the array.
Step 2: In the second step, compare the search element with the first element in the array.
Step 3: If both are matched, display "Target element is found" and terminate the Linear Search function.
Step 4: If both are not matched, compare the search element with the next element in the array.
Step 5: In this step, repeat steps 3 and 4 until the search (Target) element is compared with the last
element of the array.
Step 6: If the last element in the list does not match, the Linear Search function will be terminated,
and the message "Element is not found" will be displayed.

Algorithm and Pseudocode of Linear Search Algorithm
Algorithm of the Linear Search Algorithm

LinearSearch(Array Arr, Value a) // Arr is the name of the array, and a is the searched element.
Step 1: Set i to 0 // i is the index of an array which starts from 0
Step 2: if i >= n then go to step 7 // n is the number of elements in array
Step 3: if Arr[i] = a then go to step 6
Step 4: Set i to i + 1
Step 5: Go to step 2
Step 6: Print element a found at index i and go to step 8
Step 7: Print element not found
Step 8: Exit

Pseudocode of Linear Search Algorithm

Start
linear_search(Array, value)
   For each element in the array
      If (searched element == value)
         Return the searched element's location
      end if
   end for
end

Example of Linear Search Algorithm
Consider an array of size 7 with elements 13, 9, 21, 15, 39, 19, and 27, whose indices start at 0 and end
at size minus one, 6.
Search element = 39

Step 1: The searched element 39 is compared to the first element of the array, which is 13.

The match is not found, so you move on to the next element and try the comparison again.

Step 2: Now, search element 39 is compared to the second element of the array, 9.

Step 3: Now, search element 39 is compared with the third element, which is 21.

Again, the elements do not match, so you move on to the next element.

Step 4: Next, search element 39 is compared with the fourth element, which is 15.

Step 5: Next, search element 39 is compared with the fifth element, 39.

A perfect match is found; display the element found at location 4.

The Complexity of Linear Search Algorithm
Three different complexities arise while performing the Linear Search Algorithm. They are as
follows.
1. Best Case
2. Worst Case
3. Average Case
Best Case Complexity
• The element being searched could be found in the first position.
• In this case, the search ends with a single successful comparison.
• Thus, in the best-case scenario, the linear search algorithm performs O(1) operations.
Worst Case Complexity
• The element being searched may be at the last position in the array or not at all.
• In the first case, the search succeeds in 'n' comparisons.
• In the next case, the search fails after 'n' comparisons.
• Thus, in the worst-case scenario, the linear search algorithm performs O(n) operations.
Average Case Complexity
On average, the element to be searched is found around the middle of the array, after about n/2
comparisons, so the average case of the Linear Search Algorithm is O(n).
Space Complexity of Linear Search Algorithm
The linear search algorithm takes up no extra space beyond the input array of n elements; its auxiliary
space complexity is O(1).
Application of Linear Search Algorithm
The linear search algorithm has the following applications:
• Linear search can be applied to both single-dimensional and multi-dimensional arrays.
• Linear search is easy to implement and effective when the array contains only a few elements.
• Linear search is also efficient when a single search is performed on an unordered list.
Code Implementation of Linear Search Algorithm

#include <stdio.h>

int main()
{
    int array[50], i, target, num;
    printf("How many elements do you want in the array (at most 50): ");
    scanf("%d", &num);
    if (num < 0 || num > 50)
        return 1; // the array holds at most 50 elements
    printf("Enter array elements: ");
    for (i = 0; i < num; ++i)
        scanf("%d", &array[i]);
    printf("Enter element to search: ");
    scanf("%d", &target);
    // scan from the first element until a match is found
    for (i = 0; i < num; ++i)
        if (array[i] == target)
            break;
    if (i < num)
        printf("Target element found at location %d", i);
    else
        printf("Target element not found in an array");
    return 0;
}

Binary Search

Binary search is the search technique that works efficiently on sorted lists. Hence, to search for an
element in some list using the binary search technique, we must ensure that the list is sorted.
Binary search follows the divide and conquer approach in which the list is divided into two halves,
and the item is compared with the middle element of the list. If a match is found, then
the location of the middle element is returned. Otherwise, we search in one of the two halves, depending
upon the result of the comparison.
NOTE: Binary search can be implemented on sorted array elements. If the list elements are not
arranged in a sorted manner, we have first to sort them.

Algorithm
Binary_Search(a, lower_bound, upper_bound, val) // 'a' is the given array, 'lower_bound' is the index
of the first array element, 'upper_bound' is the index of the last array element, 'val' is
the value to search
Step 1: set beg = lower_bound, end = upper_bound, pos = -1
Step 2: repeat steps 3 and 4 while beg <= end
Step 3: set mid = (beg + end)/2
Step 4: if a[mid] = val
           set pos = mid
           print pos
           go to step 6
        else if a[mid] > val
           set end = mid - 1
        else
           set beg = mid + 1
        [end of if]
[end of loop]
Step 5: if pos = -1
           print "value is not present in the array"
        [end of if]
Step 6: exit

Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 1
   Set upperBound = n

   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist.

      set midPoint = lowerBound + (upperBound - lowerBound) / 2

      if A[midPoint] < x
         set lowerBound = midPoint + 1

      if A[midPoint] > x
         set upperBound = midPoint - 1

      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure
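
The procedure above can be sketched iteratively in C. Note the assumptions: the pseudocode uses 1-based bounds, while this sketch uses C's 0-based indexing and returns -1 instead of exiting; the function name binary_search_iterative is chosen for this illustration.

```c
/* Iterative binary search on a sorted array a[0..n-1].
 * Returns the 0-based index of x, or -1 if x is not present. */
int binary_search_iterative(const int a[], int n, int x)
{
    int lower = 0;
    int upper = n - 1;
    while (lower <= upper)
    {
        /* written this way so that lower + upper cannot overflow */
        int mid = lower + (upper - lower) / 2;
        if (a[mid] < x)
            lower = mid + 1;    /* x can only be in the right half */
        else if (a[mid] > x)
            upper = mid - 1;    /* x can only be in the left half */
        else
            return mid;         /* match */
    }
    return -1;                  /* search space is empty: not found */
}
```

Because it uses a loop rather than recursion, this version needs only a constant amount of extra memory.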

Working of Binary search
To understand the working of the Binary search algorithm, let's take a sorted array. It will be easy to
understand the working of Binary search with an example.
There are two methods to implement the binary search algorithm -
o Iterative method
o Recursive method
The recursive method of binary search follows the divide and conquer approach.
Let the elements of the array be -

Let the element to search be K = 56.
We have to use the formula below to calculate the mid of the array -
mid = (beg + end)/2
So, in the given array -

beg = 0
end = 8
mid = (0 + 8)/2 = 4. So, 4 is the mid of the array.
The comparison is then repeated on the half that can contain K, recomputing mid each time, until the
element to search is found. The algorithm then returns the index of the matched element.

Binary Search complexity
Now, let's see the time complexity of Binary search in the best case, average case, and worst
case. We will also see the space complexity of Binary search.
1. Time Complexity

Case            Time Complexity
Best Case       O(1)
Average Case    O(log n)
Worst Case      O(log n)

o Best Case Complexity - In Binary search, best case occurs when the element to search is
found in first comparison, i.e., when the first middle element itself is the element to be
searched. The best-case time complexity of Binary search is O(1).
o Average Case Complexity - The average case time complexity of Binary search is O(log n).
o Worst Case Complexity - In Binary search, the worst case occurs when we have to keep
reducing the search space till it has only one element. The worst-case time complexity of
Binary search is O(log n).

2. Space Complexity

Space Complexity    O(1)

o The space complexity of binary search is O(1) for the iterative method; the recursive method
additionally uses O(log n) stack space for its nested calls.
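
The logarithmic worst case can be checked by counting loop iterations. The sketch below assumes a conceptual sorted array holding the values 0, 1, ..., n-1 and searches for n, which is larger than every element, so the search always moves right until the range is empty. The helper name worst_case_iterations is an assumption for this illustration.

```c
/* Hypothetical helper: counts loop iterations of a binary search over the
 * sorted values 0, 1, ..., n-1 when the target is n (absent, larger than all). */
int worst_case_iterations(int n)
{
    int lower = 0;
    int upper = n - 1;
    int iterations = 0;
    while (lower <= upper)
    {
        int mid = lower + (upper - lower) / 2;
        ++iterations;
        /* the value at index mid is mid itself, which is always smaller
         * than the target n, so the search continues in the right half */
        lower = mid + 1;
    }
    return iterations;
}
```

An array of 1000 elements needs 10 iterations and an array of 1000000 elements needs only 20, which matches floor(log2 n) + 1.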

Implementation of Binary Search
Program: Write a program to implement Binary search in C language.
#include <stdio.h>
int binarySearch(int a[], int beg, int end, int val)
{
    int mid;
    if (end >= beg)
    {
        mid = beg + (end - beg) / 2; /* same as (beg + end)/2, but cannot overflow */
        /* if the item to be searched is present at middle */
        if (a[mid] == val)
        {
            return mid + 1; /* 1-based position */
        }
        /* if the item to be searched is greater than middle, then it can only be in right subarray */
        else if (a[mid] < val)
        {
            return binarySearch(a, mid + 1, end, val);
        }
        /* if the item to be searched is smaller than middle, then it can only be in left subarray */
        else
        {
            return binarySearch(a, beg, mid - 1, val);
        }
    }
    return -1;
}
int main() {
    int a[] = {11, 14, 25, 30, 40, 41, 52, 57, 70}; // given array
    int val = 40; // value to be searched
    int n = sizeof(a) / sizeof(a[0]); // size of array
    int res = binarySearch(a, 0, n - 1, val); // Store result
    printf("The elements of the array are - ");
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\nElement to be searched is - %d", val);
    if (res == -1)
        printf("\nElement is not present in the array");
    else
        printf("\nElement is present at %d position of array", res);
    return 0;
}
Output
The elements of the array are - 11 14 25 30 40 41 52 57 70
Element to be searched is - 40
Element is present at 5 position of array

Interpolation Search
Interpolation search is an improved variant of binary search. This search algorithm works on the
probing position of the required value. For this algorithm to work properly, the data collection should
be in a sorted form and uniformly distributed.
Binary search has a huge advantage of time complexity over linear search. Linear search has worst-
case complexity of O(n) whereas binary search has O(log n).
There are cases where the location of target data may be known in advance. For example, in case of
a telephone directory, if we want to search the telephone number of Morphius. Here, linear search
and even binary search will seem slow as we can directly jump to memory space where the names
starting with 'M' are stored.
Position Probing in Interpolation Search
Interpolation search finds a particular item by computing the probe position. Initially, the probe
position is the position of the middle most item of the collection.

If a match occurs, then the index of the item is returned. To split the list into two parts, we use the
following method −
mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
X = Target value being searched for
A[n] = Value stored at index n in the list
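
The probing formula can be evaluated on its own to see where the first probe lands. The sketch below is an illustration under stated assumptions: the helper name probe_position is invented here, the caller guarantees A[Lo] != A[Hi] and A[Lo] <= X <= A[Hi], and the multiplication is done before the division because, with plain integer arithmetic, (Hi - Lo) / (A[Hi] - A[Lo]) would usually truncate to 0.

```c
/* Hypothetical helper: probe position for target x in the sorted range a[lo..hi].
 * Assumes a[lo] != a[hi] and a[lo] <= x <= a[hi]. */
int probe_position(const int a[], int lo, int hi, int x)
{
    /* multiply first, then divide, so the ratio is not truncated to 0 */
    long long offset = ((long long)(hi - lo) * (x - a[lo])) / (a[hi] - a[lo]);
    return lo + (int)offset;
}
```

For the list 10, 14, 19, 26, 27, 31, 33, 35, 42, 44 and target 33, the probe position is 0 + (9 * 23) / 34 = 6, and the item at index 6 is indeed 33.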

If the item at the probe position is less than the target, then the probe position is again calculated in
the sub-array to the right of it. Otherwise, the target is searched in the sub-array to the left of
it. This process continues on the sub-array as well until the size of the sub-array reduces to
zero.
Runtime complexity of the interpolation search algorithm is O(log (log n)) as compared to O(log n) of
binary search in favorable situations.
Algorithm
As it is an improvement on the existing binary search algorithm, we mention the steps to search for the
index of the 'target' data value, using position probing −
Step 1 − Start searching data from middle of the list.
Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe position.
Step 4 − Divide the list using probing formula and find the new middle.
Step 5 − If data is greater than middle, search in higher sub-list.
Step 6 − If data is smaller than middle, search in lower sub-list.
Step 7 − Repeat until match.

Pseudocode
A → Array list
N → Size of A
X → Target Value

Procedure Interpolation_Search()

   Set Lo → 0
   Set Mid → -1
   Set Hi → N-1

   While X does not match

      if Lo equals to Hi OR A[Lo] equals to A[Hi]
         EXIT: Failure, Target not found
      end if

      Set Mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])

      if A[Mid] = X
         EXIT: Success, Target found at Mid
      else
         if A[Mid] < X
            Set Lo to Mid+1
         else if A[Mid] > X
            Set Hi to Mid-1
         end if
      end if
   End While

End Procedure

Implementation of interpolation in C

#include <stdio.h>
#define MAX 10

// array of items on which interpolation search will be conducted.
int list[MAX] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44 };

int find(int data) {
   int lo = 0;
   int hi = MAX - 1;
   int mid = -1;
   int comparisons = 1;
   int index = -1;

   // the range check keeps the probe position inside [lo, hi]
   while (lo <= hi && data >= list[lo] && data <= list[hi]) {
      printf("\nComparison %d\n", comparisons);
      printf("lo : %d, list[%d] = %d\n", lo, lo, list[lo]);
      printf("hi : %d, list[%d] = %d\n", hi, hi, list[hi]);

      comparisons++;

      // probe the mid point; avoid dividing by zero when both ends are equal
      if (list[hi] == list[lo])
         mid = lo;
      else
         mid = lo + (((double)(hi - lo) / (list[hi] - list[lo])) * (data - list[lo]));
      printf("mid = %d\n", mid);

      // data found
      if (list[mid] == data) {
         index = mid;
         break;
      } else {
         if (list[mid] < data) {
            // if data is larger, data is in upper half
            lo = mid + 1;
         } else {
            // if data is smaller, data is in lower half
            hi = mid - 1;
         }
      }
   }

   printf("\nTotal comparisons made: %d", --comparisons);
   return index;
}

int main() {
   // find location of 33
   int location = find(33);

   // if element was found
   if (location != -1)
      printf("\nElement found at location: %d", (location + 1));
   else
      printf("Element not found.");
   return 0;
}
If we compile and run the above program, it will produce the following result −
Output

Comparison 1
lo : 0, list[0] = 10
hi : 9, list[9] = 44
mid = 6

Total comparisons made: 1
Element found at location: 7

Time Complexity
• Best case - O(1)
The best case occurs when the target is found exactly at the first expected position
computed using the formula. As we only perform one comparison, the time complexity is
O(1).

• Worst case - O(n)
The worst case occurs when the given data set is exponentially distributed.

• Average case - O(log(log(n)))
If the data set is sorted and uniformly distributed, then it takes O(log(log(n))) time as on
average log(log(n)) comparisons are made.
Space Complexity
O(1) as no extra space is required.

Pattern Search
Pattern searching algorithms are used to find a pattern or substring within another, bigger string. There
are different algorithms. The main goal in designing these types of algorithms is to reduce the time
complexity. The traditional approach may take lots of time to complete the pattern searching task for
a longer text.
Here we will see different algorithms to get a better performance of pattern matching. In this
section we are going to cover:

• Aho-Corasick Algorithm
• Anagram Pattern Search
• Bad Character Heuristic
• Boyer Moore Algorithm
• Efficient Construction of Finite Automata
• Kasai's Algorithm
• Knuth-Morris-Pratt Algorithm
• Manacher's Algorithm
• Naive Pattern Searching
• Rabin-Karp Algorithm
• Suffix Array
• Trie of all Suffixes
• Z Algorithm

Naïve pattern searching is the simplest method among the pattern searching algorithms. It compares
the pattern against the main string at every possible starting position, character by character. This
algorithm is helpful for smaller texts. It does not need any pre-processing phase: all matches are found
in a single left-to-right scan of the string. It also does not occupy extra space to perform the operation.
The time complexity of the Naïve Pattern Search method is O(m*n), where m is the size of the pattern
and n is the size of the main string.

Input and Output
Input:
Main String: "ABAAABCDBBABCDDEBCABC", pattern: "ABC"
Output:
Pattern found at position: 4
Pattern found at position: 10
Pattern found at position: 18
Algorithm
naive_algorithm(pattern, text)
Input − The text and the pattern
Output − locations, where the pattern is present in the text
Start
   pat_len := pattern Size
   str_len := string size
   for i := 0 to (str_len - pat_len), do
      for j := 0 to pat_len - 1, do
         if text[i+j] ≠ pattern[j], then
            break
      if j == pat_len, then
         display the position i, as there pattern found
End

Implementation in C
#include <stdio.h>
#include <string.h>
int main() {
   char txt[] = "tutorialsPointisthebestplatformforprogrammers";
   char pat[] = "a";
   int M = (int)strlen(pat);
   int N = (int)strlen(txt);
   // try every starting position i in the text
   for (int i = 0; i <= N - M; i++) {
      int j;
      for (j = 0; j < M; j++)
         if (txt[i + j] != pat[j])
            break;
      // j reaches M only when every character of the pattern matched
      if (j == M)
         printf("Pattern matches at index %d\n", i);
   }
   return 0;
}
Output
Pattern matches at index 6
Pattern matches at index 25
Pattern matches at index 39

Rabin-Karp matching pattern

Rabin-Karp is another pattern searching algorithm. It is the string matching algorithm that was
proposed by Rabin and Karp to find the pattern in a more efficient way. Like the Naive Algorithm, it
also checks the pattern by moving the window one by one, but instead of checking all characters in
every case, it compares hash values. Only when the hash value of the window matches that of the
pattern does it proceed to check each character. In this way, most text windows need only one hash
comparison, making it a more efficient algorithm for pattern searching.
Preprocessing time - O(m)
The average-case time complexity of the Rabin-Karp Algorithm is O(m+n), but for the worst case, it is O(mn).
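
What keeps the window hash cheap is that it can be updated in constant time when the window slides by one character, instead of being recomputed from scratch. The sketch below illustrates this with a polynomial hash of base d and modulus q. The function names are assumptions made for this illustration, and q is assumed small enough that the intermediate products fit in an int.

```c
/* Hash of the m characters starting at s, computed from scratch
 * with base d and modulus q. */
int window_hash(const char *s, int m, int d, int q)
{
    int hash = 0;
    for (int i = 0; i < m; ++i)
        hash = (d * hash + (unsigned char)s[i]) % q;
    return hash;
}

/* d^(m-1) mod q: the weight of the leading character of a window. */
int leading_weight(int m, int d, int q)
{
    int h = 1;
    for (int i = 0; i < m - 1; ++i)
        h = (h * d) % q;
    return h;
}

/* Rolling update: given the hash of a window, returns the hash of the window
 * shifted right by one, after removing `leaving` and appending `entering`.
 * h must equal leading_weight(m, d, q). */
int roll_hash(int hash, char leaving, char entering, int h, int d, int q)
{
    hash = (d * (hash - (unsigned char)leaving * h) + (unsigned char)entering) % q;
    if (hash < 0)
        hash += q;      /* C's % can yield a negative remainder */
    return hash;
}
```

For any text, rolling the hash of one window forward gives the same value as hashing the next window directly, which is why each shift costs O(1) rather than O(m).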
Algorithm
rabinkarp_algo(text, pattern, prime)
Input − The main text and the pattern. Another prime number to find the hash location
Output − locations, where the pattern is found
Start
   pat_len := pattern Length
   str_len := string Length
   patHash := 0 and strHash := 0, h := 1
   maxChar := total number of characters in character set
   for i := 1 to pat_len - 1, do
      h := (h*maxChar) mod prime
   for all character index i of pattern, do
      patHash := (maxChar*patHash + pattern[i]) mod prime
      strHash := (maxChar*strHash + text[i]) mod prime
   for i := 0 to (str_len - pat_len), do
      if patHash = strHash, then
         for charIndex := 0 to pat_len - 1, do
            if text[i+charIndex] ≠ pattern[charIndex], then
               break
         if charIndex = pat_len, then
            print the location i as pattern found at i position.
      if i < (str_len - pat_len), then
         strHash := (maxChar*(strHash - text[i]*h) + text[i+pat_len]) mod prime
         if strHash < 0, then
            strHash := strHash + prime
End

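Implementation In C

#include <stdio.h>
#include <string.h>
int main() {
   char txt[80], pat[80];
   int q;
   printf("Enter the container string\n");
   scanf("%79s", txt);
   printf("Enter the pattern to be searched\n");
   scanf("%79s", pat);
   int d = 256; // number of characters in the input alphabet
   printf("Enter a prime number\n");
   scanf("%d", &q);
   int M = (int)strlen(pat);
   int N = (int)strlen(txt);
   int i, j;
   int p = 0; // hash value of the pattern
   int t = 0; // hash value of the current text window
   int h = 1;
   // h = d^(M-1) mod q, the weight of the leading character
   for (i = 0; i < M - 1; i++)
      h = (h * d) % q;
   // hash of the pattern and of the first window of the text
   for (i = 0; i < M; i++) {
      p = (d * p + pat[i]) % q;
      t = (d * t + txt[i]) % q;
   }
   for (i = 0; i <= N - M; i++) {
      // compare characters only when the hash values match
      if (p == t) {
         for (j = 0; j < M; j++) {
            if (txt[i + j] != pat[j])
               break;
         }
         if (j == M)
            printf("Pattern found at index %d\n", i);
      }
      // roll the hash forward to the next window
      if (i < N - M) {
         t = (d * (t - txt[i] * h) + txt[i + M]) % q;
         if (t < 0)
            t = (t + q);
      }
   }
   return 0;
}
Output
Enter the container string
tutorialspointisthebestprogrammingwebsite
Enter the pattern to be searched
p
Enter a prime number
3
Pattern found at index 9
Pattern found at index 23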

In this problem, we are given two strings, a text and a pattern. Our task is to create a program for the
KMP algorithm for pattern search; it will find all the occurrences of the pattern in the text string.
Let's take an example to understand the problem,
Input
text = "xyztrwqxyzfg" pattern = "xyz"
Output
Found at index 0
Found at index 7
Here, we will discuss the solution to the problem using the KMP (Knuth Morris Pratt) pattern searching
algorithm. It preprocesses the pattern into an array that is then used while matching in the text.
This helps in the case where some matching characters are followed by a character of the text that does
not match the pattern: the search can continue without re-examining the characters that already matched.
We will preprocess the pattern to create an array that stores, for each position, the length of the longest
proper prefix of the pattern that is also a suffix ending at that position. This array tells the search how
far to shift the pattern after a mismatch.
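
The preprocessing step can be sketched on its own to see what the array contains for a given pattern. The function name build_prefix_table and the array name lps are assumptions made for this illustration.

```c
/* Fills lps[i] with the length of the longest proper prefix of pat[0..i]
 * that is also a suffix of pat[0..i]. m is the pattern length (m >= 1). */
void build_prefix_table(const char *pat, int m, int lps[])
{
    int length = 0;     /* length of the current matching prefix */
    int i = 1;
    lps[0] = 0;         /* a single character has no proper prefix */
    while (i < m)
    {
        if (pat[i] == pat[length])
        {
            lps[i++] = ++length;        /* extend the current prefix */
        }
        else if (length != 0)
        {
            length = lps[length - 1];   /* fall back to a shorter prefix */
        }
        else
        {
            lps[i++] = 0;               /* no prefix ends at position i */
        }
    }
}
```

For the pattern "xyz" the table is 0, 0, 0 because no prefix repeats, while for "AAACAAAA" it is 0, 1, 2, 0, 1, 2, 3, 3.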
Program for KMP Algorithm for Pattern Searching
Example
// C program for the KMP algorithm for pattern searching
#include <stdio.h>
#include <string.h>

// Fills pps[i] with the length of the longest proper prefix of
// pat[0..i] that is also a suffix of pat[0..i].
void prefixSuffixArray(char *pat, int M, int *pps) {
    int length = 0;
    pps[0] = 0;
    int i = 1;
    while (i < M) {
        if (pat[i] == pat[length]) {
            length++;
            pps[i] = length;
            i++;
        } else {
            if (length != 0)
                length = pps[length - 1];
            else {
                pps[i] = 0;
                i++;
            }
        }
    }
}

void KMPAlgorithm(char *text, char *pattern) {
    int M = strlen(pattern);
    int N = strlen(text);
    int pps[M];
    prefixSuffixArray(pattern, M, pps);
    int i = 0; // index into text
    int j = 0; // index into pattern
    while (i < N) {
        if (pattern[j] == text[i]) {
            j++;
            i++;
        }
        if (j == M) {
            printf("Found pattern at index %d\n", i - j);
            j = pps[j - 1];
        } else if (i < N && pattern[j] != text[i]) {
            // Mismatch: fall back in the pattern without moving back in the text
            if (j != 0)
                j = pps[j - 1];
            else
                i = i + 1;
        }
    }
}

int main() {
    char text[] = "xyztrwqxyzfg";
    char pattern[] = "xyz";
    printf("The pattern is found in the text at the following index:\n");
    KMPAlgorithm(text, pattern);
    return 0;
}
Output
The pattern is found in the text at the following index:
Found pattern at index 0
Found pattern at index 7

Sorting: Insertion sort

Insertion sort works like sorting playing cards in your hand. The first card is assumed to be already sorted, and then we select an unsorted card. If the selected card is greater than the first card, it is placed to its right; otherwise, it is placed to its left. In the same way, every remaining unsorted card is taken and put in its proper place.

The same approach is applied in insertion sort. The idea is to take one element at a time and move it backwards through the sorted part of the array until it reaches its correct position. Although it is simple to use, it is not appropriate for large data sets, because the time complexity of insertion sort in the average case and the worst case is O(n^2), where n is the number of items. Insertion sort is less efficient than other sorting algorithms such as heap sort, quick sort, and merge sort.

Algorithm
The simple steps of insertion sort are as follows:
Step 1 - If the element is the first element, assume that it is already sorted.
Step 2 - Pick the next element and store it separately in a key.
Step 3 - Compare the key with the elements in the sorted part of the array, moving from right to left.
Step 4 - If the element in the sorted part is smaller than or equal to the key, stop: the key belongs just after it. Otherwise, shift the greater element one position to the right and continue.
Step 5 - Insert the key at the vacated position.
Step 6 - Repeat until the array is sorted.

Working of Insertion sort Algorithm
Now, let's see the working of the insertion sort algorithm. It is easier to understand insertion sort via an example, so let's take an unsorted array.
Let the elements of the array be: 12, 31, 25, 8, 32, 17

Initially, the first two elements are compared in insertion sort.

Here, 31 is greater than 12. That means both elements are already in ascending order. So, for now, 12 is stored in a sorted sub-array.
Now, move to the next two elements and compare them.

Here, 25 is smaller than 31. So, 31 is not at the correct position. Now, swap 31 with 25. Along with swapping, insertion sort will also check it against all elements in the sorted array.
For now, the sorted array has only one element, i.e. 12. So, 25 is greater than 12. Hence, the sorted array remains sorted after swapping.

Now, the two elements in the sorted array are 12 and 25. Move forward to the next elements, which are 31 and 8.

Both 31 and 8 are not sorted. So, swap them.

After swapping, elements 25 and 8 are unsorted.

So, swap them.

Now, elements 12 and 8 are unsorted.

So, swap them too.

Now, the sorted array has three items: 8, 12 and 25. Move to the next items, which are 31 and 32.

They are already sorted. Now, the sorted array includes 8, 12, 25 and 31.

Move to the next elements, which are 32 and 17.
17 is smaller than 32. So, swap them.

Swapping makes 31 and 17 unsorted. So, swap them too.

Now, swapping makes 25 and 17 unsorted. So, perform swapping again.

Now, the array is completely sorted.

Insertion sort complexity

1. Time Complexity
Case            Time Complexity
Best Case       O(n)
Average Case    O(n^2)
Worst Case      O(n^2)
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of insertion sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of insertion sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. For example, you have to sort the array in ascending order, but its elements are in descending order. The worst-case time complexity of insertion sort is O(n^2).
2. Space Complexity
Space Complexity    O(1)
Stable              YES
o The space complexity of insertion sort is O(1), because only one extra variable is required to hold the element being inserted.
Implementation of insertion sort
Program: Write a program to implement insertion sort in C language.
#include <stdio.h>

void insert(int a[], int n) /* function to sort an array with insertion sort */
{
    int i, j, temp;
    for (i = 1; i < n; i++) {
        temp = a[i];
        j = i - 1;

        /* Move the elements greater than temp one position ahead of their
           current position. The strict comparison keeps equal elements in
           their original order, which is what makes the sort stable. */
        while (j >= 0 && temp < a[j])
        {
            a[j + 1] = a[j];
            j = j - 1;
        }
        a[j + 1] = temp;
    }
}

void printArr(int a[], int n) /* function to print the array */
{
    int i;
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);
}

int main()
{
    int a[] = { 12, 31, 25, 8, 32, 17 };
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are - \n");
    printArr(a, n);
    insert(a, n);
    printf("\nAfter sorting array elements are - \n");
    printArr(a, n);

    return 0;
}
Output:
Before sorting array elements are -
12 31 25 8 32 17
After sorting array elements are -
8 12 17 25 31 32

Heap Sort

Heap Sort Algorithm

Heap sort processes the elements by building a min-heap or max-heap from the elements of the given array. A min-heap or max-heap is an ordering of the array in which the root element is the minimum or maximum element of the array.
Heap sort performs two main operations:

o Build a heap H using the elements of the array.

o Repeatedly delete the root element of the heap formed in the first phase.

A heap is a complete binary tree, and a binary tree is a tree in which each node can have at most two children. A complete binary tree is a binary tree in which all levels except possibly the last are completely filled, and all nodes in the last level are left-justified.

Heap sort is a popular and efficient sorting algorithm. The idea is to remove the elements one by one from the heap part of the list and insert them into the sorted part of the list.

Algorithm

HeapSort(arr)
    BuildMaxHeap(arr)
    for i = length(arr) down to 2
        swap arr[1] with arr[i]
        heap_size[arr] = heap_size[arr] - 1
        MaxHeapify(arr, 1)
End

BuildMaxHeap(arr)
    heap_size[arr] = length(arr)
    for i = length(arr)/2 down to 1
        MaxHeapify(arr, i)
End

MaxHeapify(arr, i)
    L = left(i)
    R = right(i)
    if L ≤ heap_size[arr] and arr[L] > arr[i]
        largest = L
    else
        largest = i
    if R ≤ heap_size[arr] and arr[R] > arr[largest]
        largest = R
    if largest != i
        swap arr[i] with arr[largest]
        MaxHeapify(arr, largest)
End

Working of Heap sort Algorithm

In heap sort, there are two phases involved in sorting the elements:

o The first phase creates a heap by rearranging the elements of the array.

o The second phase repeatedly removes the root element of the heap by moving it to the end of the array, and then restores the heap structure among the remaining elements.

First, we have to construct a heap from the given array and convert it into a max heap.

After converting the given heap into a max heap, the array elements are -

Next, we have to delete the root element (89) from the max heap. To delete this node, we have to swap it with the last node, i.e. (11). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 89 with 11, and converting the heap into a max heap, the elements of the array are -

In the next step, again, we have to delete the root element (81) from the max heap. To delete this node, we have to swap it with the last node, i.e. (54). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 81 with 54 and converting the heap into a max heap, the elements of the array are -

In the next step, we have to delete the root element (76) from the max heap again. To delete this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 76 with 9 and converting the heap into a max heap, the elements of the array are -

In the next step, again we have to delete the root element (54) from the max heap. To delete this node, we have to swap it with the last node, i.e. (14). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 54 with 14 and converting the heap into a max heap, the elements of the array are -

In the next step, again we have to delete the root element (22) from the max heap. To delete this node, we have to swap it with the last node, i.e. (11). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 22 with 11 and converting the heap into a max heap, the elements of the array are -

In the next step, again we have to delete the root element (14) from the max heap. To delete this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 14 with 9 and converting the heap into a max heap, the elements of the array are -

In the next step, again we have to delete the root element (11) from the max heap. To delete this node, we have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to convert it into a max heap.

After swapping the array element 11 with 9, the elements of the array are -

Now, the heap has only one element left. After deleting it, the heap will be empty.

After completion of sorting, the array elements are -

Time complexity of Heap sort in the best case, average case, and worst case

1. Time Complexity

Case            Time Complexity
Best Case       O(n log n)
Average Case    O(n log n)
Worst Case      O(n log n)

o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of heap sort is O(n log n).

o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of heap sort is O(n log n).

o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. For example, you have to sort the array in ascending order, but its elements are in descending order. The worst-case time complexity of heap sort is O(n log n).

The time complexity of heap sort is O(n log n) in all three cases (best case, average case, and worst case). The height of a complete binary tree having n elements is log n.

2. Space Complexity

Space Complexity    O(1)
Stable              NO

o The space complexity of Heap sort is O(1).

Implementation of Heapsort

Program: Write a program to implement heap sort in C language.

#include <stdio.h>

/* Function to heapify a subtree. Here 'i' is the
   index of the root node in array a[], and 'n' is the size of the heap. */
void heapify(int a[], int n, int i)
{
    int largest = i;       // Initialize largest as root
    int left = 2 * i + 1;  // left child
    int right = 2 * i + 2; // right child
    // If left child is larger than root
    if (left < n && a[left] > a[largest])
        largest = left;
    // If right child is larger than the largest so far
    if (right < n && a[right] > a[largest])
        largest = right;
    // If root is not largest
    if (largest != i) {
        // swap a[i] with a[largest]
        int temp = a[i];
        a[i] = a[largest];
        a[largest] = temp;
        heapify(a, n, largest);
    }
}

/* Function to implement heap sort */
void heapSort(int a[], int n)
{
    // Build a max heap from the array
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(a, n, i);
    // One by one extract an element from the heap
    for (int i = n - 1; i >= 0; i--) {
        /* Move current root element to end */
        // swap a[0] with a[i]
        int temp = a[0];
        a[0] = a[i];
        a[i] = temp;

        heapify(a, i, 0);
    }
}

/* Function to print the array elements */
void printArr(int arr[], int n)
{
    for (int i = 0; i < n; ++i)
    {
        printf("%d", arr[i]);
        printf(" ");
    }
}

int main()
{
    int a[] = { 48, 10, 23, 43, 28, 26, 1 };
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are - \n");
    printArr(a, n);
    heapSort(a, n);
    printf("\nAfter sorting array elements are - \n");
    printArr(a, n);
    return 0;
}

Output
Before sorting array elements are -
48 10 23 43 28 26 1
After sorting array elements are -
1 10 23 26 28 43 48

UNIT 2 - GRAPHS: basics, representation, traversals, and applications

Basic concepts

Definition

A graph G(V, E) is a non-linear data structure that consists of a set of nodes (vertices) V and a set of edges E, where each edge links a pair of nodes.

There are 2 types of graphs:

• Directed
• Undirected

Directed graph

A graph with only directed edges is said to be a directed graph.

Example

The following directed graph has 5 vertices and 8 edges. This graph G can be defined as G = (V, E), where V = {A, B, C, D, E} and E = {(A,B), (A,C), (B,E), (B,D), (D,A), (D,E), (C,D), (D,D)}.

Directed Graph

Undirected graph

A graph with only undirected edges is said to be an undirected graph.

Example

The following is an undirected graph.

Undirected Graph

Representation of Graphs
A graph data structure is represented using the following representations.

1. Adjacency Matrix
2. Adjacency List

Adjacency Matrix

• In this representation, the graph is represented using a matrix of size n x n, where n is the number of vertices.
• This matrix is filled with either 1's or 0's.
• Here, 1 represents that there is an edge from the row vertex to the column vertex, and 0 represents that there is no edge from the row vertex to the column vertex.
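
As a rough illustration of the matrix representation, the following C sketch builds the adjacency matrix of the directed example graph G = (V, E) defined earlier. The mapping of vertices A to E onto indices 0 to 4 and the function names build_example_matrix and out_degree are assumptions made for this example only.

```c
#include <assert.h>
#include <string.h>

#define NUM_VERTICES 5 /* vertices A..E mapped to indices 0..4 (assumed mapping) */

/* Fill adj so that adj[u][v] == 1 exactly when there is an edge u -> v. */
void build_example_matrix(int adj[NUM_VERTICES][NUM_VERTICES]) {
    /* Edge list of the directed example graph, as (from, to) pairs. */
    static const char edges[][2] = {
        {'A', 'B'}, {'A', 'C'}, {'B', 'E'}, {'B', 'D'},
        {'D', 'A'}, {'D', 'E'}, {'C', 'D'}, {'D', 'D'}
    };
    int count = (int)(sizeof(edges) / sizeof(edges[0]));

    memset(adj, 0, sizeof(int) * NUM_VERTICES * NUM_VERTICES);
    for (int k = 0; k < count; k++)
        adj[edges[k][0] - 'A'][edges[k][1] - 'A'] = 1;
}

/* The out-degree of u is the number of 1's in row u. */
int out_degree(int adj[NUM_VERTICES][NUM_VERTICES], int u) {
    assert(u >= 0 && u < NUM_VERTICES);
    int degree = 0;
    for (int v = 0; v < NUM_VERTICES; v++)
        degree += adj[u][v];
    return degree;
}
```

Checking whether an edge exists is a single array lookup, but the matrix always needs n x n cells, even when the graph has few edges.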

Directed graph representation

Adjacency list

• In this representation, every vertex of the graph contains a list of its adjacent vertices.
• If the graph is not dense, i.e., the number of edges is small, then it is efficient to represent the graph through the adjacency list.
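
One possible way to store an adjacency list in C is an array of singly linked lists, one list per vertex. The sketch below is only an illustration: the type names AdjNode and Graph and the graph_* helper functions are hypothetical and do not come from the notes.

```c
#include <assert.h>
#include <stdlib.h>

/* One entry in a vertex's list of neighbours. */
struct AdjNode {
    int vertex;
    struct AdjNode *next;
};

/* Graph with n vertices; heads[u] points to the first neighbour of u. */
struct Graph {
    int n;
    struct AdjNode **heads;
};

/* Create a graph with n vertices and no edges; returns NULL on failure. */
struct Graph *graph_create(int n) {
    assert(n > 0);
    struct Graph *g = malloc(sizeof *g);
    if (g == NULL) return NULL;
    g->n = n;
    g->heads = calloc((size_t)n, sizeof *g->heads);
    if (g->heads == NULL) { free(g); return NULL; }
    return g;
}

/* Add a directed edge u -> v at the front of u's list; 0 on success. */
int graph_add_edge(struct Graph *g, int u, int v) {
    struct AdjNode *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->vertex = v;
    node->next = g->heads[u];
    g->heads[u] = node;
    return 0;
}

/* Return 1 if the edge u -> v is present, 0 otherwise. */
int graph_has_edge(const struct Graph *g, int u, int v) {
    for (const struct AdjNode *p = g->heads[u]; p != NULL; p = p->next)
        if (p->vertex == v) return 1;
    return 0;
}

/* Release every list node, then the graph itself. */
void graph_free(struct Graph *g) {
    for (int u = 0; u < g->n; u++) {
        struct AdjNode *p = g->heads[u];
        while (p != NULL) {
            struct AdjNode *next = p->next;
            free(p);
            p = next;
        }
    }
    free(g->heads);
    free(g);
}
```

The memory used grows with the number of vertices plus the number of edges, which is why this layout suits sparse graphs; the price is that an edge lookup has to walk a list.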

Adjacency List

Graph traversals

• Graph traversal is a technique used to search for a vertex in a graph. It is also used to decide the order in which vertices are visited in the search process.
• A graph traversal finds the edges to be used in the search process without creating loops. This means that, with graph traversal, we can visit all the vertices of the graph without getting into a looping path. There are two graph traversal techniques:

1. DFS (Depth First Search)
2. BFS (Breadth-First Search)

Applications of graphs
1. Social network graphs: To tweet or not to tweet. Graphs that represent who knows whom, who communicates with whom, who influences whom, or other relationships in social structures. An example is the Twitter graph of who follows whom.
2. Graphs in epidemiology: Vertices represent individuals and directed edges represent the transfer of an infectious disease from one individual to another. Analyzing such graphs has become an important component in understanding and controlling the spread of diseases.
3. Protein-protein interaction graphs: Vertices represent proteins and edges represent interactions between them that carry out some biological function in the cell. These graphs can be used, for example, to study molecular pathways (chains of molecular interactions in a cellular process).
4. Network packet traffic graphs: Vertices are IP (Internet protocol) addresses and edges are the packets that flow between them. Such graphs are used for analyzing network security, studying the spread of worms, and tracking criminal or non-criminal activity.
5. Neural networks: Vertices represent neurons and edges are the synapses between them. Neural networks are used to understand how our brain works and how connections change when we learn. The human brain has about 10^11 neurons and close to 10^15 synapses.

DFS - Depth First Search
The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to remember which vertex to resume the search from when a dead end occurs in any iteration.

As in the example given above, the DFS algorithm traverses from S to A to D to G to E to B first, then to F, and lastly to C. It employs the following rules.

• Rule 1 - Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a stack.

• Rule 2 - If no adjacent vertex is found, pop a vertex from the stack. (It will pop all the vertices from the stack which do not have adjacent vertices.)

• Rule 3 - Repeat Rule 1 and Rule 2 until the stack is empty.

Step    Description
1       Initialize the stack.
2       Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S. We have three nodes and we can pick any of them. For this example, we shall take the nodes in alphabetical order.
3       Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A. Both S and D are adjacent to A, but we are concerned with unvisited nodes only.
4       Visit D, mark it as visited, and put it onto the stack. Here, we have B and C nodes, which are adjacent to D and both are unvisited. However, we shall again choose in alphabetical order.
5       We choose B, mark it as visited, and put it onto the stack. Here B does not have any unvisited adjacent node. So, we pop B from the stack.
6       We check the top of the stack to return to the previous node and check if it has any unvisited nodes. Here, we find D to be on the top of the stack.
7       The only unvisited node adjacent to D is now C. So we visit C, mark it as visited, and put it onto the stack.

DFS(G, u)
    u.visited = true
    for each v ∈ G.Adj[u]
        if v.visited == false
            DFS(G, v)

init() {
    For each u ∈ G
        u.visited = false
    For each u ∈ G
        if u.visited == false
            DFS(G, u)
}
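
The recursive pseudocode above can be sketched in C roughly as follows. The adjacency-matrix input, the MAX_V limit, and the names dfs_visit and dfs_all are assumptions for illustration; the recursion uses the call stack in place of the explicit stack described in the rules.

```c
#include <assert.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

/* DFS(G, u): mark u, record it, then recurse into unvisited neighbours. */
void dfs_visit(int n, int adj[MAX_V][MAX_V], int u,
               int visited[MAX_V], int order[MAX_V], int *count) {
    visited[u] = 1;
    order[(*count)++] = u;
    for (int v = 0; v < n; v++)
        if (adj[u][v] && !visited[v])
            dfs_visit(n, adj, v, visited, order, count);
}

/* init(): clear all marks, then start a DFS from every unvisited vertex.
   Writes the visit order into order[] and returns how many were visited. */
int dfs_all(int n, int adj[MAX_V][MAX_V], int order[MAX_V]) {
    assert(n >= 0 && n <= MAX_V);
    int visited[MAX_V] = {0};
    int count = 0;
    for (int u = 0; u < n; u++)
        if (!visited[u])
            dfs_visit(n, adj, u, visited, order, &count);
    return count;
}
```

Because neighbours are scanned in index order, ties are broken in favour of the lowest-numbered vertex, much like the alphabetical choice in the walk-through.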

Applications of DFS Algorithm
1. For finding a path

2. To test if the graph is bipartite

3. For finding the strongly connected components of a graph

4. For detecting cycles in a graph

Breadth First Search
The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to remember which vertex to continue the search from when a dead end occurs in any iteration.

As in the example given above, the BFS algorithm traverses from A to B to E to F first, then to C and G, and lastly to D. It employs the following rules.

• Rule 1 - Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.

• Rule 2 - If no adjacent vertex is found, remove the first vertex from the queue.

• Rule 3 - Repeat Rule 1 and Rule 2 until the queue is empty.

Step    Description
1       Initialize the queue.
2       We start by visiting S (the starting node) and mark it as visited.
3       We then see an unvisited adjacent node from S. In this example, we have three nodes, but alphabetically we choose A, mark it as visited, and enqueue it.
4       Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.
5       Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.
6       Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.
7       From A we have D as an unvisited adjacent node. We mark it as visited and enqueue it.

BFS pseudocode
create a queue Q
mark v as visited and put v into Q
while Q is non-empty
    remove the head u of Q
    mark and enqueue all (unvisited) neighbours of u

BFS Algorithm Complexity
The time complexity of the BFS algorithm is O(V + E), where V is the number of nodes and E is the number of edges.

The space complexity of the algorithm is O(V).
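
A minimal C sketch of the BFS pseudocode is given below, assuming an adjacency-matrix input and a fixed-size array used as the queue; MAX_V and the function name bfs are illustrative choices. Note that with a matrix the scan of each row makes this version O(V^2); the O(V + E) bound stated above assumes adjacency lists.

```c
#include <assert.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

/* BFS from start. Writes the visit order into order[] and returns the
   number of vertices reached. */
int bfs(int n, int adj[MAX_V][MAX_V], int start, int order[MAX_V]) {
    assert(n > 0 && n <= MAX_V && start >= 0 && start < n);
    int visited[MAX_V] = {0};
    int queue[MAX_V]; /* each vertex is enqueued at most once */
    int head = 0, tail = 0;
    int count = 0;

    visited[start] = 1;
    queue[tail++] = start;
    while (head < tail) {
        int u = queue[head++]; /* remove the head u of Q */
        order[count++] = u;
        for (int v = 0; v < n; v++) {
            if (adj[u][v] && !visited[v]) {
                visited[v] = 1; /* mark when enqueuing, not when dequeuing */
                queue[tail++] = v;
            }
        }
    }
    return count;
}
```

Marking a vertex at the moment it is enqueued is what keeps it from entering the queue twice.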

BFS Algorithm Applications
1. To build an index by search index
2. For GPS navigation
3. Path finding algorithms
4. In the Ford-Fulkerson algorithm to find the maximum flow in a network
5. Cycle detection in an undirected graph
6. In minimum spanning tree

Connected graph, Strongly connected and Bi-Connectivity

Connected Graph Component

A connected component, or simply component, of an undirected graph is a subgraph in which each pair of nodes is connected by a path.
Strongly Connected Graph
Kosaraju's algorithm is a DFS-based algorithm used to find the Strongly Connected Components (SCCs) of a directed graph. It is based on the idea that if one can reach a vertex v starting from vertex u, and can also reach vertex u starting from vertex v, then vertices u and v are strongly connected: they are in the same strongly connected subgraph.

stack STACK
void DFS(int source) {
    visited[source] = true
    for all neighbours X of source that are not visited:
        DFS(X)
    STACK.push(source)
}

CLEAR ADJACENCY_LIST
for all edges e:
    first = one end point of e
    second = other end point of e
    ADJACENCY_LIST[second].push(first)

while STACK is not empty:
    source = STACK.top()
    STACK.pop()
    if source is visited:
        continue
    else:
        DFS(source)

Bi-Connectivity Graph
An undirected graph is said to be biconnected if there are two vertex-disjoint paths between any two vertices. In other words, there is a cycle through any two vertices.

We can say that a graph G is biconnected if it is connected and it has no articulation point (cut vertex).
To solve this problem, we use the DFS traversal. Using DFS, we try to find whether any articulation point is present. We also check whether all vertices are visited by the DFS; if not, the graph is not connected.

Pseudocode for Bi-connectivity
isArticulation(start, visited, disc, low, parent)
Begin
    time := 0 //time is static: it is initialized once and keeps its value across calls
    dfsChild := 0
    mark start as visited
    set disc[start] := time + 1 and low[start] := time + 1
    time := time + 1
    for all vertex v in the graph G, do
        if there is an edge between (start, v), then
            if v is not visited, then
                increase dfsChild
                parent[v] := start
                if isArticulation(v, visited, disc, low, parent) is true, then
                    return true
                low[start] := minimum of low[start] and low[v]
                if parent[start] is φ AND dfsChild > 1, then
                    return true
                if parent[start] is not φ AND low[v] >= disc[start], then
                    return true
            else if v is not the parent of start, then
                low[start] := minimum of low[start] and disc[v]
    done
    return false
End
isBiconnected(graph)
Begin
    initially set all vertices as unvisited and the parent of each vertex as φ
    if isArticulation(0, visited, disc, low, parent) = true, then
        return false
    for each node i of the graph, do
        if i is not visited, then
            return false
    done
    return true
End

Minimum Spanning Tree
A spanning tree of a connected graph with V vertices is a subgraph that is a tree containing all V vertices and exactly V-1 edges. All nodes in a spanning tree are reachable from each other.
A Minimum Spanning Tree (MST), or minimum weight spanning tree, of a weighted, connected, undirected graph is a spanning tree whose weight is less than or equal to the weight of every other possible spanning tree. The weight of a spanning tree is the sum of the weights of its edges. In short, out of all spanning trees of a given graph, a spanning tree having the minimum weight is an MST.

Algorithms for finding a Minimum Spanning Tree (MST):
1. Prim's Algorithm
2. Kruskal's Algorithm
Prim's Algorithm
Prim's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the subset of the edges of that graph which
• forms a tree that includes every vertex
• has the minimum sum of weights among all the trees that can be formed from the graph

How Prim's algorithm works
It falls under a class of algorithms called greedy algorithms, which find the local optimum in the hope of finding a global optimum.
We start from one vertex and keep adding edges with the lowest weight until we reach our goal. The steps for implementing Prim's algorithm are as follows:
1. Initialize the minimum spanning tree with a vertex chosen at random.
2. Find all the edges that connect the tree to new vertices, find the minimum, and add it to the tree.
3. Keep repeating step 2 until we get a minimum spanning tree.

Example of Prim's algorithm

Start with a weighted graph

Choose a vertex

Choose the shortest edge from this vertex and add it

Choose the nearest vertex not yet in the solution

Choose the nearest edge not yet in the solution; if there are multiple choices, choose one at random
Prim's Algorithm pseudocode
The pseudocode for Prim's algorithm shows how we create two sets of vertices, U and V-U. U contains the list of vertices that have been visited and V-U the list of vertices that haven't. One by one, we move vertices from set V-U to set U by connecting the least weight edge.
T = ∅;
U = { 1 };
while (U ≠ V)
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U;
    T = T ∪ {(u, v)}
    U = U ∪ {v}

Prim's Algorithm Complexity
The time complexity of Prim's algorithm is O(E log V) when a binary heap is used as the priority queue.
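
As a sketch of how the sets U and V - U can be tracked in C, the version below uses an adjacency matrix and a linear scan for the cheapest edge. This is the simpler O(V^2) array-based variant, not the O(E log V) heap-based one; the encoding "0 means no edge", the MAX_V limit, and the name prim_mst are assumptions for this example.

```c
#include <assert.h>
#include <limits.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

/* Prim's algorithm on an undirected graph given as a weight matrix,
   where w[u][v] == 0 means "no edge" (assumed encoding).
   Fills parent[v] with the tree neighbour of v (parent[0] is -1) and
   returns the total MST weight, or -1 if the graph is not connected. */
int prim_mst(int n, int w[MAX_V][MAX_V], int parent[MAX_V]) {
    assert(n >= 1 && n <= MAX_V);
    int in_tree[MAX_V] = {0}; /* membership in the set U */
    int key[MAX_V];           /* cheapest known edge linking v to U */
    int total = 0;

    for (int v = 0; v < n; v++) {
        key[v] = INT_MAX;
        parent[v] = -1;
    }
    key[0] = 0; /* start with U = {vertex 0} */

    for (int step = 0; step < n; step++) {
        /* Pick the vertex of V - U with the lowest-cost edge into U. */
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!in_tree[v] && (u == -1 || key[v] < key[u]))
                u = v;
        if (key[u] == INT_MAX)
            return -1; /* no edge reaches the remaining vertices */
        in_tree[u] = 1;
        total += key[u];

        /* Moving u into U may offer cheaper edges to its neighbours. */
        for (int v = 0; v < n; v++) {
            if (w[u][v] != 0 && !in_tree[v] && w[u][v] < key[v]) {
                key[v] = w[u][v];
                parent[v] = u;
            }
        }
    }
    return total;
}
```

For a dense graph the O(V^2) scan is competitive with the heap-based version; for sparse graphs a priority queue is the usual choice.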

Kruskal Algorithm

Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the subset of the edges of that graph which
• forms a tree that includes every vertex
• has the minimum sum of weights among all the trees that can be formed from the graph
How Kruskal's algorithm works
It falls under a class of algorithms called greedy algorithms, which find the local optimum in the hope of finding a global optimum.
We start from the edges with the lowest weight and keep adding edges until we reach our goal. The steps for implementing Kruskal's algorithm are as follows:
1. Sort all the edges from low weight to high.
2. Take the edge with the lowest weight and add it to the spanning tree. If adding the edge creates a cycle, reject this edge.
3. Keep adding edges until we reach all vertices.

Example of Kruskal's algorithm

Start with a weighted graph

Choose the edge with the least weight; if there is more than one, choose any one
Choose the next shortest edge and add it

Choose the next shortest edge that doesn't create a cycle and add it

Choose the next shortest edge that doesn't create a cycle and add it

Repeat until you have a spanning tree

Kruskal Algorithm Pseudocode
KRUSKAL(G):
    A = ∅
    For each vertex v ∈ G.V:
        MAKE-SET(v)
    For each edge (u, v) ∈ G.E ordered by increasing weight(u, v):
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
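
The MAKE-SET, FIND-SET, and UNION operations in the pseudocode are usually provided by a disjoint-set (union-find) structure. The C sketch below is one simple way to do that; the Edge struct, the global set_parent array, the MAX_V limit, and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

struct Edge { int u, v, weight; };

static int set_parent[MAX_V]; /* disjoint-set forest */

static void make_set(int v) { set_parent[v] = v; }

/* Find the representative of v's set, shortening the path on the way. */
static int find_set(int v) {
    while (set_parent[v] != v) {
        set_parent[v] = set_parent[set_parent[v]];
        v = set_parent[v];
    }
    return v;
}

static void union_sets(int a, int b) {
    int root_a = find_set(a);
    int root_b = find_set(b);
    set_parent[root_a] = root_b;
}

/* qsort comparator: increasing order of weight. */
static int by_weight(const void *a, const void *b) {
    const struct Edge *x = a;
    const struct Edge *y = b;
    return (x->weight > y->weight) - (x->weight < y->weight);
}

/* Sorts edges[0..m-1] in place, copies the chosen edges into mst[] and
   returns how many were chosen (n - 1 when the graph is connected). */
int kruskal(int n, struct Edge *edges, int m, struct Edge *mst) {
    assert(n >= 1 && n <= MAX_V);
    int chosen = 0;
    for (int v = 0; v < n; v++)
        make_set(v);
    qsort(edges, (size_t)m, sizeof edges[0], by_weight);
    for (int i = 0; i < m && chosen < n - 1; i++) {
        /* Endpoints in different sets: the edge cannot close a cycle. */
        if (find_set(edges[i].u) != find_set(edges[i].v)) {
            mst[chosen++] = edges[i];
            union_sets(edges[i].u, edges[i].v);
        }
    }
    return chosen;
}
```

The cycle test in step 2 of the algorithm is exactly the FIND-SET comparison: two endpoints already in the same set are already connected by tree edges.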

Shortest Path Algorithm

The shortest path problem is about finding a path between vertices in a graph such that the total sum of the edge weights is minimum.

Algorithms for Shortest Path
1. Bellman-Ford Algorithm
2. Dijkstra Algorithm
3. Floyd-Warshall Algorithm

Bellman-Ford Algorithm
The Bellman-Ford algorithm helps us find the shortest path from a vertex to all other vertices of a weighted graph.
It is similar to Dijkstra's algorithm, but it can work with graphs in which edges can have negative weights.
How Bellman Ford's algorithm works

The Bellman-Ford algorithm works by overestimating the length of the path from the starting vertex to all other vertices. Then it iteratively relaxes those estimates by finding new paths that are shorter than the previously overestimated paths.
By doing this repeatedly for all vertices, we can guarantee that the result is optimized.

Step-1 for Bellman Ford's algorithm

Step-2 for Bellman Ford's algorithm
Step-4 for Bellman Ford's algorithm

Step-5 for Bellman Ford's algorithm

Step-6 for Bellman Ford's algorithm
Bellman Ford Pseudocode
We need to maintain the path distance of every vertex. We can store that in an array of size v, where v is the number of vertices.
We also want to be able to get the shortest path, not only know the length of the shortest path. For this, we map each vertex to the vertex that last updated its path length.
Once the algorithm is over, we can backtrack from the destination vertex to the source vertex to find the path.
function bellmanFord(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
    distance[S] <- 0

    for each vertex V in G
        for each edge (U, V) in G
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U

    for each edge (U, V) in G
        If distance[U] + edge_weight(U, V) < distance[V]
            Error: Negative Cycle Exists

    return distance[], previous[]

Bellman Ford's Complexity

Time Complexity

Best Case Complexity       O(E)
Average Case Complexity    O(VE)
Worst Case Complexity      O(VE)
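
The following C sketch shows one way the relaxation passes and the negative-cycle check might be written over an edge list. The Edge struct, the MAX_V limit, the use of INT_MAX for "infinite", and the early exit when a pass changes nothing (which is what gives the O(E) best case) are choices made for this example.

```c
#include <assert.h>
#include <limits.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

struct Edge { int u, v, weight; }; /* directed edge u -> v */

/* Fills distance[] and previous[] for shortest paths from source.
   Returns 1 on success, 0 if a negative cycle is reachable from source. */
int bellman_ford(int n, const struct Edge *edges, int m, int source,
                 int distance[MAX_V], int previous[MAX_V]) {
    assert(n >= 1 && n <= MAX_V && source >= 0 && source < n);
    for (int v = 0; v < n; v++) {
        distance[v] = INT_MAX; /* stands in for "infinite" */
        previous[v] = -1;      /* stands in for NULL */
    }
    distance[source] = 0;

    /* At most n - 1 passes are needed when there is no negative cycle. */
    for (int pass = 0; pass < n - 1; pass++) {
        int changed = 0;
        for (int i = 0; i < m; i++) {
            int u = edges[i].u, v = edges[i].v;
            if (distance[u] != INT_MAX &&
                distance[u] + edges[i].weight < distance[v]) {
                distance[v] = distance[u] + edges[i].weight;
                previous[v] = u;
                changed = 1;
            }
        }
        if (!changed)
            break; /* estimates are final: stop early */
    }

    /* Any edge that can still be relaxed lies on a negative cycle. */
    for (int i = 0; i < m; i++) {
        int u = edges[i].u, v = edges[i].v;
        if (distance[u] != INT_MAX &&
            distance[u] + edges[i].weight < distance[v])
            return 0;
    }
    return 1;
}
```

The check against INT_MAX avoids integer overflow when relaxing an edge that leaves a vertex not yet reached.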

Dijkstra Algorithm

Dijkstra's algorithm allows us to find the shortest path between any two vertices of a graph with non-negative edge weights.
It differs from the minimum spanning tree because the shortest path between two vertices might not include all the vertices of the graph.

How Dijkstra's Algorithm works
Dijkstra's Algorithm works on the basis that any subpath B -> D of the shortest path A -> D between vertices A and D is also the shortest path between vertices B and D.
Each subpath is the shortest path

Dijkstra used this property in the opposite direction, i.e. we overestimate the distance of each vertex from the starting vertex. Then we visit each node and its neighbors to find the shortest subpath to those neighbors.
The algorithm uses a greedy approach in the sense that we find the next best solution, hoping that the end result is the best solution for the whole problem.

Example of Dijkstra's algorithm
It is easier to start with an example and then think about the algorithm.

Start with a weighted graph

Choose a starting vertex and assign infinity path values to all other vertices

Go to each vertex and update its path length
If the path length of the adjacent vertex is less than the new path length, don't update it

Avoid updating path lengths of already visited vertices

After each iteration, we pick the unvisited vertex with the least path length. So we choose 5 before 7
Notice how the rightmost vertex has its path length updated twice

Repeat until all the vertices have been visited

Dijkstra's algorithm pseudocode
We need to maintain the path distance of every vertex. We can store that in an array of size v, where v is the number of vertices.
We also want to be able to get the shortest path, not only know the length of the shortest path. For this, we map each vertex to the vertex that last updated its path length.
Once the algorithm is over, we can backtrack from the destination vertex to the source vertex to find the path.
A minimum priority queue can be used to efficiently retrieve the vertex with the least path distance.
function dijkstra(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
        add V to Priority Queue Q
    distance[S] <- 0

    while Q IS NOT EMPTY
        U <- Extract MIN from Q
        for each unvisited neighbour V of U
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U
    return distance[], previous[]
Dijkstra's Algorithm Complexity
Time Complexity: O(E Log V)
where E is the number of edges and V is the number of vertices.
Space Complexity: O(V)
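
For small graphs, the "Extract MIN" step can be done with a plain linear scan instead of a priority queue. The C sketch below takes that simpler route, so it runs in O(V^2) rather than O(E log V); the matrix encoding "0 means no edge", the MAX_V limit, and the function name dijkstra are assumptions for this illustration.

```c
#include <assert.h>
#include <limits.h>

#define MAX_V 16 /* assumed upper bound on the number of vertices */

/* Shortest paths from source on a graph with non-negative weights,
   given as a matrix where w[u][v] == 0 means "no edge" (assumed
   encoding). Unreachable vertices keep distance INT_MAX. */
void dijkstra(int n, int w[MAX_V][MAX_V], int source,
              int distance[MAX_V], int previous[MAX_V]) {
    assert(n >= 1 && n <= MAX_V && source >= 0 && source < n);
    int visited[MAX_V] = {0};
    for (int v = 0; v < n; v++) {
        distance[v] = INT_MAX; /* stands in for "infinite" */
        previous[v] = -1;      /* stands in for NULL */
    }
    distance[source] = 0;

    for (int step = 0; step < n; step++) {
        /* Extract MIN: the unvisited vertex with the least distance. */
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!visited[v] && (u == -1 || distance[v] < distance[u]))
                u = v;
        if (distance[u] == INT_MAX)
            break; /* everything left is unreachable */
        visited[u] = 1;

        /* Relax the edges from u to its unvisited neighbours. */
        for (int v = 0; v < n; v++) {
            if (w[u][v] != 0 && !visited[v] &&
                distance[u] + w[u][v] < distance[v]) {
                distance[v] = distance[u] + w[u][v];
                previous[v] = u;
            }
        }
    }
}
```

To recover a path, follow previous[] backwards from the destination until it reaches the source, as described in the text above.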

Floyd-Warshall Algorithm

The Floyd-Warshall algorithm finds the shortest paths between all pairs of vertices in a weighted graph. It works for both directed and undirected weighted graphs, but it does not work for graphs with negative cycles (where the sum of the edge weights in a cycle is negative).
A weighted graph is a graph in which each edge has a numerical value associated with it.
The Floyd-Warshall algorithm is also called Floyd's algorithm, the Roy-Floyd algorithm, the Roy-Warshall algorithm, or the WFI algorithm.
This algorithm follows the dynamic programming approach to find the shortest paths.

How Floyd-Warshall Algorithm Works?
Let the given graph be:

Initial graph
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A^0 of dimension n*n, where n is the number of vertices. The rows and columns are indexed by i and j respectively, where i and j are vertices of the graph.
Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there is no direct edge from the ith vertex to the jth vertex, the cell is left as infinity.

Fill each cell with the distance between the ith and jth vertex

2. Now, create a matrix A^1 using matrix A^0. The elements in the first column and the first row are left as they are. The remaining cells are filled in the following way.
Let k be the intermediate vertex in the shortest path from source to destination. In this step, k is the first vertex (vertex 1). A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).
That is, if the direct distance from the source to the destination is greater than the path through the vertex k, then the cell is filled with A[i][k] + A[k][j].

Calculate the distance from the source vertex to the destination vertex through this vertex k

For example: for A^1[2, 4], the direct distance from vertex 2 to 4 is 4, and the sum of the distances from vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from vertex 1 to 4) is 7. Since 4 < 7, A^1[2, 4] is filled with 4.
3. Similarly, A^2 is created using A^1. The elements in the second column and the second row are left as they are.
In this step, k is the second vertex (i.e. vertex 2). The remaining steps are the same as in step 2.

Calculate the distance from the source vertex to the destination vertex through this vertex 2

4. Similarly, A^3 and A^4 are also created.

Calculate the distance from the source vertex to the destination vertex through this vertex 3

Calculate the distance from the source vertex to the destination vertex through this vertex 4
5. A^4 gives the shortest path between each pair of vertices.

Floyd-Warshall Algorithm
n = no of vertices
A = matrix of dimension n*n
for k = 1 to n
    for i = 1 to n
        for j = 1 to n
            A^k[i, j] = min(A^(k-1)[i, j], A^(k-1)[i, k] + A^(k-1)[k, j])
return A

Time Complexity
There are three nested loops, each running n times, and the work inside the innermost loop takes constant time. So, the time complexity of the Floyd-Warshall algorithm is O(n^3).
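
In code, the sequence of matrices is usually collapsed into a single matrix that is updated in place, which is safe because row k and column k do not change during step k. The C sketch below does this; the INF marker, the MAX_V limit, and the function name floyd_warshall are assumptions for illustration.

```c
#include <assert.h>

#define MAX_V 8
#define INF 1000000 /* "no path" marker; assumed larger than any real path length */

/* All-pairs shortest paths, updating dist in place. On entry, dist[i][j]
   is the edge weight from i to j, 0 on the diagonal, INF if no edge. */
void floyd_warshall(int n, int dist[MAX_V][MAX_V]) {
    assert(n >= 1 && n <= MAX_V);
    for (int k = 0; k < n; k++) {         /* intermediate vertex */
        for (int i = 0; i < n; i++) {     /* source */
            for (int j = 0; j < n; j++) { /* destination */
                /* Skip combinations that involve a missing path. */
                if (dist[i][k] < INF && dist[k][j] < INF &&
                    dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
            }
        }
    }
}
```

Vertices are numbered from 0 here, whereas the walk-through above numbers them from 1.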

Network Flow
A flow network is a directed graph used for modeling the flow of material. It has two distinguished vertices: a source, which produces material at some steady rate, and a sink, which consumes material at the same rate. The flow of the material at any point in the system is the rate at which the material moves.
Some real-life problems, such as the flow of liquids through pipes, current through wires, and the delivery of goods, can be modelled using flow networks.
Definition: A flow network is a directed graph G = (V, E) such that
1. For each edge (u, v) ∈ E, we associate a nonnegative capacity c(u, v) ≥ 0. If (u, v) ∉ E, we assume that c(u, v) = 0.
2. There are two distinguished vertices, the source s and the sink t.
3. For every vertex v ∈ V, there is a path from s to t containing v.
Let G = (V, E) be a flow network. Let s be the source of the network, and let t be the sink. A flow in G is a real-valued function f: V x V → R such that the following properties hold:
o Capacity Constraint: For all u, v ∈ V, we need f(u, v) ≤ c(u, v).
o Skew Symmetry: For all u, v ∈ V, we need f(u, v) = -f(v, u).
o Flow Conservation: For all u ∈ V - {s, t}, we need Σ_{v ∈ V} f(u, v) = 0, i.e. the net flow out of every vertex other than the source and the sink is zero.

The quantity f(u, v), which can be positive or negative, is known as the net flow from vertex u to vertex v. In the maximum-flow problem, we are given a flow network G with source s and sink t, and we wish to find a flow of maximum value from s to t.
Ford-Fulkerson Algorithm

Initially, the value of the flow is 0. Find some augmenting path p and increase the flow f on each edge of p by the residual
capacity cf(p). When no augmenting path exists, the flow f is a maximum flow.
FORD-FULKERSON-METHOD(G, s, t)
    Initialize flow f to 0
    while there exists an augmenting path p
        do augment flow f along p
    return f

FORD-FULKERSON(G, s, t)
    for each edge (u, v) ∈ E[G]
        do f[u, v] ← 0
           f[v, u] ← 0
    while there exists a path p from s to t in the residual network Gf
        do cf(p) ← min {cf(u, v) : (u, v) is on p}
           for each edge (u, v) in p
               do f[u, v] ← f[u, v] + cf(p)
                  f[v, u] ← -f[u, v]
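
A minimal Python sketch of the method is given below. It assumes the augmenting path is found with breadth-first search (the Edmonds-Karp variant), which is one possible choice and is not prescribed by the notes. The function name max_flow, the capacity-matrix representation and the six-vertex network are assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson method with BFS path search (Edmonds-Karp variant).

    capacity is an n x n matrix; capacity[u][v] == 0 means no edge.
    Returns the value of a maximum flow from s to t.
    """
    n = len(capacity)
    residual = [row[:] for row in capacity]   # residual capacities cf(u, v)
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:        # no augmenting path: the flow is maximum
            return total
        # Bottleneck cf(p) = minimum residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment: reduce forward residuals, increase the reverse ones.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

# Hypothetical network: vertex 0 is the source s, vertex 5 is the sink t.
capacity = [
    [0, 16, 13, 0,  0,  0],
    [0, 0,  10, 12, 0,  0],
    [0, 4,  0,  0,  14, 0],
    [0, 0,  9,  0,  0,  20],
    [0, 0,  0,  7,  0,  4],
    [0, 0,  0,  0,  0,  0],
]
value = max_flow(capacity, 0, 5)
print(value)
```

Increasing the reverse residual capacity plays the same role as the skew-symmetric update f[v, u] ← -f[u, v] in the pseudocode: it lets a later path cancel flow that was pushed earlier.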

Example: Each Directed Edge is labeled with capacity. Use the Ford-Fulkerson algorithm to find the
maximum flow.

Solution: The left side of each part shows the residual network Gf with a shaded augmenting
path p, and the right side of each part shows the net flow f.
Maximum Bipartite Matching
A bipartite matching is a set of edges in a bipartite graph chosen in such a way that no two edges in the
set share an endpoint. A maximum matching is a matching with the maximum number of edges.

When a maximum matching is found, we cannot add another edge. If one edge is added to the
maximum matching, it is no longer a matching. For a bipartite graph, more than
one maximum matching may be possible.

Algorithm

bipartiteMatch(u, visited, assign)
Input: Starting node u, the visited list to keep track of visited nodes, and the assign list that records which node each node is assigned to.
Output: Returns true when a matching for vertex u is possible.
Begin
    for all vertices v which are adjacent to u, do
        if v is not visited, then
            mark v as visited
            if v is not assigned, or bipartiteMatch(assign[v], visited, assign) is true, then
                assign[v] := u
                return true
    done
    return false
End

maxMatch(graph)
Input: The given graph.
Output: The maximum number of matches.
Begin
    initially no vertex is assigned
    count := 0
    for all applicants u in M, do
        mark all nodes as unvisited
        if bipartiteMatch(u, visited, assign), then
            increase count by 1
    done
    return count
End
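
The two procedures above can be sketched in Python as below. The adjacency-list representation, the function names and the small three-applicant example are assumptions made for illustration only.

```python
def max_bipartite_matching(adj, n_right):
    """adj[u] lists the right-side vertices adjacent to left vertex u.

    Returns (count, assign), where assign[v] is the left vertex matched to
    right vertex v, or -1 if v is unmatched.
    """
    assign = [-1] * n_right

    def try_match(u, visited):
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                # v is free, or its current partner can be moved elsewhere.
                if assign[v] == -1 or try_match(assign[v], visited):
                    assign[v] = u
                    return True
        return False

    count = 0
    for u in range(len(adj)):
        # A fresh visited list per left vertex, as in maxMatch above.
        if try_match(u, [False] * n_right):
            count += 1
    return count, assign

# Hypothetical instance: 3 applicants (left side) and 2 jobs (right side).
applicants = [[0, 1], [0], [1]]
count, assign = max_bipartite_matching(applicants, 2)
print(count, assign)
```

In this instance applicant 1 can only take job 0, so the search moves applicant 0 from job 0 to job 1; applicant 2 is then left unmatched.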
UNIT 3
Divide and Conquer Algorithm
A divide and conquer algorithm is a strategy of solving a large problem by
1. breaking the problem into smaller sub-problems
2. solving the sub-problems, and
3. combining them to get the desired output.
To use the divide and conquer algorithm, recursion is used.

How Divide and Conquer Algorithms Work?
Here are the steps involved:
1. Divide: Divide the given problem into sub-problems using recursion.
2. Conquer: Solve the smaller sub-problems recursively. If the subproblem is small
enough, then solve it directly.
3. Combine: Combine the solutions of the sub-problems that are part of the recursive
process to solve the actual problem.

Finding Maximum and Minimum
To find the maximum and minimum numbers in a given array numbers[] of size n, the
following algorithm can be used. First we present the naive method and then we
present the divide and conquer approach.
Naïve Method
The naïve method is a basic method to solve any problem. In this method, the maximum and
minimum numbers are found separately, using the
following straightforward algorithm.
Algorithm: Max-Min-Element(numbers[])
max := numbers[1]
min := numbers[1]
for i = 2 to n do
    if numbers[i] > max then
        max := numbers[i]
    if numbers[i] < min then
        min := numbers[i]
return (max, min)

Analysis

The number of comparisons in the naive method is 2n - 2.
The number of comparisons can be reduced using the divide and conquer approach. Following is
the technique.
Divide and Conquer Approach

In this approach, the array is divided into two halves. Then, using a recursive approach, the
maximum and minimum numbers in each half are found. Finally, return the maximum of the
two maxima of the halves and the minimum of the two minima of the halves.
In this problem, the number of elements in the array is y − x + 1, where y is greater than or equal
to x.
Max-Min(x, y) will return the maximum and minimum values of the array numbers[x...y].
Algorithm: Max-Min(x, y)

if y – x ≤ 1 then
    return (max(numbers[x], numbers[y]), min(numbers[x], numbers[y]))
else
    (max1, min1) := Max-Min(x, ⌊(x + y)/2⌋)
    (max2, min2) := Max-Min(⌊(x + y)/2⌋ + 1, y)
    return (max(max1, max2), min(min1, min2))
Analysis
Let T(n) be the number of comparisons made by Max-Min(x, y), where the number of
elements n = y − x + 1.
Then the recurrence relation can be represented as

\[ T(n) = \begin{cases} T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 2 & \text{for } n > 2 \\ 1 & \text{for } n = 2 \\ 0 & \text{for } n = 1 \end{cases} \]

Let us assume that n is a power of 2. Hence, n = 2^k, where k is the height of the recursion tree.

So,

\[ T(n) = 2T(n/2) + 2 = 4T(n/4) + 4 + 2 = \dots = 2^{k-1} T(2) + (2^{k-1} + \dots + 4 + 2) = \frac{n}{2} + (n - 2) = \frac{3n}{2} - 2 \]

Compared to the naïve method, the divide and conquer approach makes fewer comparisons (3n/2 − 2 instead of 2n − 2).
However, in asymptotic notation both of the approaches are represented
by O(n).
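
The comparison count can be checked with a short Python sketch. It follows the Max-Min(x, y) scheme, but counts comparisons explicitly and treats a two-element range as a single comparison, which is the convention the 3n/2 − 2 figure relies on. The function name max_min and the sample array are assumptions made for the example.

```python
def max_min(numbers, x, y):
    """Return (maximum, minimum, comparisons) for numbers[x..y], inclusive."""
    if x == y:                       # one element: no comparison needed
        return numbers[x], numbers[x], 0
    if y - x == 1:                   # two elements: a single comparison
        if numbers[x] > numbers[y]:
            return numbers[x], numbers[y], 1
        return numbers[y], numbers[x], 1
    mid = (x + y) // 2
    max1, min1, c1 = max_min(numbers, x, mid)
    max2, min2, c2 = max_min(numbers, mid + 1, y)
    # Two more comparisons combine the results of the halves.
    return max(max1, max2), min(min1, min2), c1 + c2 + 2

data = [22, 13, -5, -8, 15, 60, 17, 31]      # hypothetical input, n = 8
largest, smallest, comparisons = max_min(data, 0, len(data) - 1)
print(largest, smallest, comparisons)
```

For n = 8 the sketch reports 10 comparisons, which matches 3n/2 − 2, while the naive method would use 2n − 2 = 14.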
Merge Sort
Merge Sort is one of the most popular sorting algorithms that is based on the principle of the
Divide and Conquer Algorithm.
Here, a problem is divided into multiple sub-problems. Each sub-problem is solved individually.
Finally, the solutions of the sub-problems are combined to form the final solution.

Figure: Merge Sort example

Divide and Conquer Strategy
Using the Divide and Conquer technique, we divide a problem into subproblems. When the
solution to each subproblem is ready, we 'combine' the results from the subproblems to
solve the main problem.
Suppose we had to sort an array A. A subproblem would be to sort a sub-section of this array
starting at index p and ending at index r, denoted as A[p..r].
Divide

If q is the half-way point between p and r, then we can split the subarray A[p..r] into two arrays
A[p..q] and A[q+1..r].
Conquer

In the conquer step, we try to sort both the subarrays A[p..q] and A[q+1..r]. If we haven't yet reached
the base case, we again divide both these subarrays and try to sort them.
Combine
When the conquer step reaches the base step and we get two sorted
subarrays A[p..q] and A[q+1..r] for array A[p..r], we combine the results by creating a sorted array
A[p..r] from the two sorted subarrays A[p..q] and A[q+1..r].

Merge Sort Algorithm
The MergeSort function repeatedly divides the array into two halves until we reach a stage where
we try to perform MergeSort on a subarray of size 1, i.e. p == r.
After that, the merge function comes into play and combines the sorted arrays into larger arrays
until the whole array is merged.
MergeSort(A, p, r):
    if p >= r
        return
    q = (p+r)/2
    mergeSort(A, p, q)
    mergeSort(A, q+1, r)
    merge(A, p, q, r)

void merge(int arr[], int p, int q, int r)
{
    // Create L ← A[p..q] and M ← A[q+1..r]
    int n1 = q - p + 1;
    int n2 = r - q;
    int L[n1], M[n2];
    for (int i = 0; i < n1; i++)
        L[i] = arr[p + i];
    for (int j = 0; j < n2; j++)
        M[j] = arr[q + 1 + j];

    // Maintain current index of sub-arrays and main array
    int i, j, k;
    i = 0;
    j = 0;
    k = p;

    // Until we reach the end of either L or M, pick the smaller among
    // the current elements of L and M and place it in the correct position at A[p..r]
    while (i < n1 && j < n2)
    {
        if (L[i] <= M[j])
        {
            arr[k] = L[i];
            i++;
        }
        else
        {
            arr[k] = M[j];
            j++;
        }
        k++;
    }

    // When we run out of elements in either L or M,
    // pick up the remaining elements and put them in A[p..r]
    while (i < n1)
    {
        arr[k] = L[i];
        i++;
        k++;
    }

    while (j < n2)
    {
        arr[k] = M[j];
        j++;
        k++;
    }
}

Time Complexity

Best Case Complexity: O(n*log n)


Worst Case Complexity: O(n*log n)

Average Case Complexity: O(n*log n)

Dynamic Programming
Matrix Chain Multiplication
Dynamic programming is a method for solving optimization problems.
It is an algorithmic technique for solving complex problems with overlapping sub-problems: compute the
solutions to the sub-problems once and store the solutions in a table, so that they can be
reused (repeatedly) later.
For such problems, dynamic programming is more efficient than plain recursion or divide and conquer,
which solve the same sub-problems again and again, and unlike the greedy method it compares all the choices before committing to one.
Many real-world problems cannot be solved efficiently with simple, traditional approaches,
for example the coin change problem, the knapsack problem, generating the Fibonacci sequence and
matrix chain multiplication. Solving them with a direct recursive formula repeats the same work
again and again, which is very time consuming. When a problem
has to be divided into sub-problems whose results are needed repeatedly,
dynamic programming is needed to obtain an optimal and efficient solution.
Basic Features of Dynamic Programming:
• Get all the possible solutions and pick the best and optimal solution.

• Works on the principle of optimality.
• Define sub-parts and solve them recursively.
• Less time complexity but more space complexity, because the solutions of the sub-problems are stored.
• Dynamic programming saves us from having to recompute previously calculated sub-solutions.
• Difficult to understand.
Dynamic programming covers many real-world problems: making
coin change, robotics, aircraft, and mathematical problems like the Fibonacci
sequence and the multiplication of more than two matrices, where there are many possible
multiplication orders and we want the best and optimal one. Now we look
at one such problem, the MATRIX CHAIN MULTIPLICATION PROBLEM.
Suppose we are given a sequence (chain) (A1, A2, …, An) of n matrices to be multiplied, and we
wish to compute the product (A1A2…An). We can evaluate this expression using the
standard algorithm for multiplying pairs of matrices as a subroutine once we have
parenthesized it to resolve all ambiguities in how the matrices are multiplied together.
Matrix multiplication is associative, and so all parenthesizations yield the same product. For
example, if the chain of matrices is (A1, A2, A3, A4) then we can fully parenthesize the
product (A1A2A3A4) in five distinct ways:
1:-(A1(A2(A3A4))),
2:-(A1((A2A3)A4)),
3:-((A1A2)(A3A4)),
4:-((A1(A2A3))A4),
5:-(((A1A2)A3)A4).
We can multiply two matrices A and B only if they are compatible: the number of columns of A must
equal the number of rows of B. If A is a p x q matrix and B is a q x r matrix, the resulting
matrix C is a p x r matrix. The time to compute C is dominated by the number of scalar
multiplications, which is pqr. We shall express costs in terms of the number of scalar
multiplications. For example, suppose we have three matrices (A1, A2, A3) with dimensions
(10x100), (100x5), (5x500) respectively. For ((A1A2)A3), computing A1A2 costs
10*100*5 = 5000 and multiplying the result by A3 costs 10*5*500 = 25000, a total of 30000.
For (A1(A2A3)), computing A2A3 costs 100*5*500 = 250000 and multiplying A1 by the result costs
10*100*500 = 500000, a total of 750000. Note that in the matrix-chain multiplication problem, we are not actually
multiplying matrices. Our goal is only to determine an order for multiplying matrices that has the
lowest cost; here the minimum cost is 30000 for the above example. Trying every parenthesization
repeats the same cost calculations many times,
so this general method is very time consuming and tedious. So we can apply
dynamic programming to solve this kind of problem.
When we use the dynamic programming technique we follow these steps.
1. Characterize the structure of an optimal solution.

2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution.
4. Construct an optimal solution from computed information.

We have matrices of any order. Our goal is to find the optimal cost of multiplying the matrices. When we
solve this kind of problem using DP, step 2 gives
m[i, j] = 0 if i = j, and m[i, j] = min {m[i, k] + m[k+1, j] + p_{i-1} * p_k * p_j} over i ≤ k < j if i < j, where matrix Ai has dimension p_{i-1} x p_i.
The basic algorithm of matrix chain multiplication:

// Matrix A[i] has dimension dims[i-1] x dims[i] for i = 1..n
MatrixChainMultiplication(int dims[])
{
    // length[dims] = n + 1
    n = dims.length - 1;
    // m[i, j] = Minimum number of scalar multiplications (i.e., cost)
    // needed to compute the matrix A[i]A[i+1]...A[j] = A[i..j]
    // The cost is zero when multiplying one matrix
    for (i = 1; i <= n; i++)
        m[i, i] = 0;

    for (len = 2; len <= n; len++) {
        // Subsequence lengths
        for (i = 1; i <= n - len + 1; i++) {
            j = i + len - 1;
            m[i, j] = MAXINT;
            for (k = i; k <= j - 1; k++) {
                cost = m[i, k] + m[k+1, j] + dims[i-1]*dims[k]*dims[j];
                if (cost < m[i, j]) {
                    m[i, j] = cost;
                    // Index of the subsequence split that achieved minimal cost
                    s[i, j] = k;
                }
            }
        }
    }
}
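
A runnable Python sketch of the same table-filling scheme is shown below. The function names matrix_chain_order and parenthesize are assumptions made for illustration; the dimension sequence 4, 10, 3, 12, 20, 7 is the one used in this section's worked example.

```python
def matrix_chain_order(dims):
    """Matrix A[i] has dimension dims[i-1] x dims[i] for i = 1..n.

    Returns (m, s): m[i][j] is the minimum number of scalar multiplications
    for A[i..j] and s[i][j] is the split index k that achieves it.
    """
    n = len(dims) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                cost = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m, s

def parenthesize(s, i, j):
    """Rebuild the optimal parenthesization of A[i..j] from the split table."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)}{parenthesize(s, k + 1, j)})"

m, s = matrix_chain_order([4, 10, 3, 12, 20, 7])
print(m[1][5], parenthesize(s, 1, 5))
```

The three nested loops give O(n^3) time and the two tables take O(n^2) space.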
Example of Matrix Chain Multiplication
Example: We are given the dimension sequence {4, 10, 3, 12, 20, 7}. The matrices have sizes 4 x 10,
10 x 3, 3 x 12, 12 x 20, 20 x 7. We need to compute M[i, j], 1 ≤ i ≤ j ≤ 5. We know M[i, i] = 0 for all i.

Let us proceed by working away from the diagonal. We compute the optimal solution for the
product of 2 matrices.

In dynamic programming the table is initialized with 0, so we initialize the diagonal with 0. The table is then
filled in diagonal by diagonal.
We have to work out all the combinations, but only the combination with the minimum cost is taken into consideration.
Calculation of Product of 2 matrices:
1. m(1, 2) = m1 x m2
= (4 x 10) x (10 x 3)
= 4 x 10 x 3 = 120

2. m(2, 3) = m2 x m3
= (10 x 3) x (3 x 12)
= 10 x 3 x 12 = 360

3. m(3, 4) = m3 x m4
= (3 x 12) x (12 x 20)
= 3 x 12 x 20 = 720

4. m(4, 5) = m4 x m5
= (12 x 20) x (20 x 7)
= 12 x 20 x 7 = 1680

• We initialize the diagonal elements (those with equal i and j values) with 0.

• After that the second diagonal is worked out and we get all the values corresponding to it. Now
the third diagonal will be solved in the same way.

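Now product of 3 matrices:
M[1, 3] = M1 M2 M3
1. There are two cases by which we can solve this multiplication: (M1 x M2) x M3 and M1 x (M2 x M3).
2. After solving both cases we choose the case with the minimum cost.
(M1 x M2) x M3: M[1, 2] + M[3, 3] + 4 x 3 x 12 = 120 + 0 + 144 = 264
M1 x (M2 x M3): M[1, 1] + M[2, 3] + 4 x 10 x 12 = 0 + 360 + 480 = 840

M[1, 3] = 264

Comparing both outputs, 264 is the minimum, so we insert 264 in the table and the combination (M1 x M2) x M3 is chosen.

M[2, 4] = M2 M3 M4
1. There are two cases by which we can solve this multiplication: (M2 x M3) x M4 and M2 x (M3 x M4).
2. After solving both cases we choose the case with the minimum cost.
(M2 x M3) x M4: M[2, 3] + M[4, 4] + 10 x 12 x 20 = 360 + 0 + 2400 = 2760
M2 x (M3 x M4): M[2, 2] + M[3, 4] + 10 x 3 x 20 = 0 + 720 + 600 = 1320

M[2, 4] = 1320
Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table and the combination M2 x (M3 x M4) is chosen.
M[3, 5] = M3 M4 M5
1. There are two cases by which we can solve this multiplication: (M3 x M4) x M5 and M3 x (M4 x M5).
2. After solving both cases we choose the case with the minimum cost.
(M3 x M4) x M5: M[3, 4] + M[5, 5] + 3 x 20 x 7 = 720 + 0 + 420 = 1140
M3 x (M4 x M5): M[3, 3] + M[4, 5] + 3 x 12 x 7 = 0 + 1680 + 252 = 1932

M[3, 5] = 1140

Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table and the combination (M3 x M4) x M5 is chosen.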

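Now Product of 4 matrices:
M[1, 4] = M1 M2 M3 M4
There are three cases by which we can solve this multiplication:
1. (M1 x M2 x M3) x M4: M[1, 3] + M[4, 4] + 4 x 12 x 20 = 264 + 0 + 960 = 1224
2. M1 x (M2 x M3 x M4): M[1, 1] + M[2, 4] + 4 x 10 x 20 = 0 + 1320 + 800 = 2120
3. (M1 x M2) x (M3 x M4): M[1, 2] + M[3, 4] + 4 x 3 x 20 = 120 + 720 + 240 = 1080
After solving these cases we choose the case with the minimum cost.

M[1, 4] = 1080

Comparing the outputs of the different cases, 1080 is the minimum, so we insert
1080 in the table and the combination (M1 x M2) x (M3 x M4) is chosen.
M[2, 5] = M2 M3 M4 M5
There are three cases by which we can solve this multiplication:
1. (M2 x M3 x M4) x M5: M[2, 4] + M[5, 5] + 10 x 20 x 7 = 1320 + 0 + 1400 = 2720
2. M2 x (M3 x M4 x M5): M[2, 2] + M[3, 5] + 10 x 3 x 7 = 0 + 1140 + 210 = 1350
3. (M2 x M3) x (M4 x M5): M[2, 3] + M[4, 5] + 10 x 12 x 7 = 360 + 1680 + 840 = 2880
After solving these cases we choose the case with the minimum cost.

M[2, 5] = 1350

Comparing the outputs of the different cases, 1350 is the minimum, so we insert 1350 in
the table and the combination M2 x (M3 x M4 x M5) is chosen.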

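Now Product of 5 matrices:
M[1, 5] = M1 M2 M3 M4 M5
There are four cases by which we can solve this multiplication:
1. (M1 x M2 x M3 x M4) x M5: M[1, 4] + M[5, 5] + 4 x 20 x 7 = 1080 + 0 + 560 = 1640
2. M1 x (M2 x M3 x M4 x M5): M[1, 1] + M[2, 5] + 4 x 10 x 7 = 0 + 1350 + 280 = 1630
3. (M1 x M2 x M3) x (M4 x M5): M[1, 3] + M[4, 5] + 4 x 12 x 7 = 264 + 1680 + 336 = 2280
4. (M1 x M2) x (M3 x M4 x M5): M[1, 2] + M[3, 5] + 4 x 3 x 7 = 120 + 1140 + 84 = 1344
After solving these cases we choose the case with the minimum cost.

M[1, 5] = 1344
Comparing the outputs of the different cases, 1344 is the minimum, so we insert
1344 in the table and the combination (M1 x M2) x (M3 x M4 x M5) is chosen.
Final output: M[1, 5] = 1344 with the parenthesization (M1 x M2) x ((M3 x M4) x M5).
So we get the optimal solution of the matrix chain multiplication.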

Multistage Graph
The multistage graph problem is defined as follows:
• A multistage graph G = (V, E, W) is a weighted directed graph in which the vertices are
partitioned into k ≥ 2 disjoint subsets V = {V1, V2, …, Vk} such that if edge (u, v) is
present in E then u ∈ Vi and v ∈ Vi+1, 1 ≤ i < k. The goal of the multistage graph problem is to find
the minimum cost path from the source to the destination vertex.
• The input to the algorithm is a k-stage graph whose n vertices are indexed in increasing order
of stages.
• The algorithm operates in the backward direction, i.e. it starts from the last vertex of the
graph and proceeds in a backward direction to find the minimum cost path.
• The minimum cost of vertex j ∈ Vi through a vertex r ∈ Vi+1 is defined as,
Cost[j] = min{ c[j, r] + cost[r] }
where c[j, r] is the weight of edge <j, r> and cost[r] is the cost of moving from vertex r to the end
vertex.
• The algorithm for the multistage graph is described below:
Algorithm for Multistage Graph

Algorithm MULTI_STAGE(G, k, n, p)
// Description: Solve multi-stage problem using dynamic programming

// Input:
// k: Number of stages in graph G = (V, E)
// c[i, j]: Cost of edge (i, j)

// Output: p[1:k]: Minimum cost path

cost[n] ← 0
for j ← n – 1 to 1 do
    // Let r be a vertex such that (j, r) in E and c[j, r] + cost[r] is minimum
    cost[j] ← c[j, r] + cost[r]
    π[j] ← r
end

// Find minimum cost path
p[1] ← 1
p[k] ← n

for j ← 2 to k - 1 do
    p[j] ← π[p[j - 1]]
end
Complexity Analysis of Multistage Graph
If graph G has |E| edges, then the cost computation time would be O(n + |E|). The complexity of
tracing the minimum cost path would be O(k), k < n. Thus the total time complexity of the
multistage graph problem using dynamic programming would be O(n + |E|).
Example
Example: Find the minimum path cost between vertex s and t for the following multistage graph using
dynamic programming.

Solution:
The solution to the multistage graph using dynamic programming is constructed as,
Cost[j] = min{c[j, r] + cost[r]}

Here, number of stages k = 5, number of vertices n = 12, source s = 1 and target t = 12.
Initialization:
Cost[n] = 0 ⇒ Cost[12] = 0.
p[1] = s ⇒ p[1] = 1
p[k] = t ⇒ p[5] = 12.
r = t = 12.
Stage 4:
Vertices 9, 10 and 11 are connected directly to the target vertex 12. The later stages use
Cost[9] = 4, Cost[10] = 2 and Cost[11] = 5, with p[9] = p[10] = p[11] = 12.

Stage 3:
Vertex 6 is connected to vertices 9 and 10:
Cost[6] = min{c[6, 10] + Cost[10], c[6, 9] + Cost[9]}
= min{5 + 2, 6 + 4} = min{7, 10} = 7
p[6] = 10
Vertex 7 is connected to vertices 9 and 10:
Cost[7] = min{c[7, 10] + Cost[10], c[7, 9] + Cost[9]}
= min{3 + 2, 4 + 4} = min{5, 8} = 5
p[7] = 10
Vertex 8 is connected to vertices 10 and 11:
Cost[8] = min{c[8, 11] + Cost[11], c[8, 10] + Cost[10]}
= min{6 + 5, 5 + 2} = min{11, 7} = 7
p[8] = 10

Stage 2:
Vertex 2 is connected to vertices 6, 7 and 8:
Cost[2] = min{c[2, 6] + Cost[6], c[2, 7] + Cost[7], c[2, 8] + Cost[8]}
= min{4 + 7, 2 + 5, 1 + 7} = min{11, 7, 8} = 7
p[2] = 7
Vertex 3 is connected to vertices 6 and 7:
Cost[3] = min{c[3, 6] + Cost[6], c[3, 7] + Cost[7]}
= min{2 + 7, 7 + 5} = min{9, 12} = 9
p[3] = 6
Vertex 4 is connected to vertex 8:
Cost[4] = c[4, 8] + Cost[8] = 11 + 7 = 18
p[4] = 8

Vertex 5 is connected to vertices 7 and 8:
Cost[5] = min{c[5, 7] + Cost[7], c[5, 8] + Cost[8]}
= min{11 + 5, 8 + 7} = min{16, 15} = 15
p[5] = 8

Stage 1:
Vertex 1 is connected to vertices 2, 3, 4 and 5:
Cost[1] = min{c[1, 2] + Cost[2], c[1, 3] + Cost[3], c[1, 4] + Cost[4], c[1, 5] + Cost[5]}
= min{9 + 7, 7 + 9, 3 + 18, 2 + 15}
= min{16, 16, 21, 17} = 16
p[1] = 2
Trace the solution:
p[1] = 2
p[2] = 7
p[7] = 10
p[10] = 12
Minimum cost path is: 1 – 2 – 7 – 10 – 12
Cost of the path is: 9 + 2 + 3 + 2 = 16
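
The worked example can be checked with a short Python sketch of the backward computation. The edge weights below are read off the calculations in the example; the three edges into vertex 12 are inferred from the values Cost[9] = 4, Cost[10] = 2 and Cost[11] = 5 used there, since the figure itself is not reproduced. The function name and the dictionary representation are assumptions made for illustration.

```python
def multistage_shortest_path(n, edges):
    """edges maps (j, r) to c[j, r]; vertices 1..n are indexed by stage.

    Returns (minimum cost from vertex 1 to vertex n, list of path vertices).
    """
    adj = {j: [] for j in range(1, n + 1)}
    for (j, r), c in edges.items():
        adj[j].append((r, c))
    cost = [float("inf")] * (n + 1)
    nxt = [0] * (n + 1)              # nxt[j] plays the role of π[j]
    cost[n] = 0
    for j in range(n - 1, 0, -1):    # backward, from vertex n-1 down to 1
        for r, c in sorted(adj[j]):  # ties keep the lowest-numbered r
            if c + cost[r] < cost[j]:
                cost[j] = c + cost[r]
                nxt[j] = r
    if cost[1] == float("inf"):      # the sink is not reachable
        return cost[1], []
    path = [1]
    while path[-1] != n:
        path.append(nxt[path[-1]])
    return cost[1], path

edges = {
    (1, 2): 9, (1, 3): 7, (1, 4): 3, (1, 5): 2,
    (2, 6): 4, (2, 7): 2, (2, 8): 1, (3, 6): 2, (3, 7): 7,
    (4, 8): 11, (5, 7): 11, (5, 8): 8,
    (6, 9): 6, (6, 10): 5, (7, 9): 4, (7, 10): 3, (8, 10): 5, (8, 11): 6,
    (9, 12): 4, (10, 12): 2, (11, 12): 5,   # inferred from Cost[9..11]
}
best_cost, best_path = multistage_shortest_path(12, edges)
print(best_cost, best_path)
```

Vertex 1 has two equally cheap successors (2 and 3, both giving cost 16); the sketch keeps the first one found, which matches the path traced in the example.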

Optimal Binary Search Tree
• The Optimal Binary Search Tree extends the concept of the binary search tree. A Binary Search
Tree (BST) is a nonlinear data structure which is used in many scientific applications for
reducing the search time. In a BST, the left child is smaller than the root and the right child is
greater than the root. This arrangement simplifies the search procedure.
• The Optimal Binary Search Tree (OBST) is very useful in dictionary search. The probability
of searching is different for different words. OBST has great application in translation.
If we translate a book from English to German, equivalent words are searched
in an English to German dictionary and replaced in the translation. Words are searched in the same
order as in a binary search tree.
• A binary search tree simply arranges the words in lexicographical order. Words like
‘the’, ‘is’, ‘there’ are very frequent words, whereas words
like ‘xylophone’, ‘anthropology’ etc. appear rarely.

• It is not a wise idea to keep less frequent words near the root of a binary search tree. Instead
of fixing the shape of the tree by lexicographical order alone, we choose the shape
according to the search probabilities, while keeping the search tree order. This arrangement needs fewer
comparisons for frequent words as they would be near the root. Such a tree is
called an Optimal Binary Search Tree.
• Consider the sequence of n keys K = <k1, k2, k3, …, kn>, each with its own search probability, in sorted order
such that
k1 < k2 < … < kn. Words that fall between a pair of keys lead to an unsuccessful search, so for n keys the
binary search tree contains n + 1 dummy keys di, representing unsuccessful searches.
• Two different representations of a BST with the same five keys {k1, k2, k3, k4, k5} and probabilities are
shown in the following figure.
• With n nodes, there exist (2n)!/((n + 1)! * n!) different binary search trees. An
exhaustive search for the optimal binary search tree takes a huge amount of time.
• The goal is to construct a tree which minimizes the total search cost. Such a tree is
called an optimal binary search tree. An OBST does not claim minimum height. It is also not
necessary that the parent of a subtree has higher priority than its child.
• Dynamic programming can help us to find such an optimal tree.
Figure: Binary search trees with 5 keys
Mathematical formulation
• We formulate the OBST with the following observations.
• Any subtree in an OBST contains keys in sorted order ki…kj, where 1 ≤ i ≤ j ≤ n.
• A subtree containing keys ki…kj has leaves with dummy keys di-1…dj.
• Suppose kr is the root of the subtree containing keys ki…kj. So, the left subtree of root kr
contains keys
ki…kr-1 and the right subtree contains keys kr+1 to kj. Recursively, optimal subtrees are
constructed from the left and right subtrees of kr.
• Let e[i, j] represent the expected cost of searching the OBST that contains keys ki…kj. With n keys, our aim is to find
and minimize e[1, n].
• The base case occurs when j = i – 1, because we just have the dummy key di-1 for this case.
The expected search cost for this case would be e[i, j] = e[i, i – 1] = qi-1.
• For the case j ≥ i, we have to select a key kr from ki…kj as the root of the tree.
• With kr as the root key and subtree ki…kj, the sum of probabilities is defined as
(Actual keys start at index 1 and dummy keys start at index 0)

\[ w(i, j) = \sum_{m=i}^{j} p_m + \sum_{m=i-1}^{j} q_m \]

Thus, a recursive formula for forming the OBST is stated below:

\[ e[i, j] = \begin{cases} q_{i-1} & \text{if } j = i - 1 \\ \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \} & \text{if } i \le j \end{cases} \]

e[i, j] gives the expected cost in the optimal binary search tree.
Algorithm for Optimal Binary Search Tree
The algorithm for the optimal binary search tree is specified below:

Algorithm OBST(p, q, n)
// e[1…n+1, 0…n]: Expected cost of the optimal subtree
// w[1…n+1, 0…n]: Sum of probabilities
// root[1…n, 1…n]: Used to construct the OBST

for i ← 1 to n + 1 do
    e[i, i – 1] ← qi–1
    w[i, i – 1] ← qi–1
end

for m ← 1 to n do
    for i ← 1 to n – m + 1 do
        j ← i + m – 1
        e[i, j] ← ∞
        w[i, j] ← w[i, j – 1] + pj + qj
        for r ← i to j do
            t ← e[i, r – 1] + e[r + 1, j] + w[i, j]
            if t < e[i, j] then
                e[i, j] ← t
                root[i, j] ← r
            end
        end
    end
end
return (e, root)

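Complexity Analysis of Optimal Binary Search Tree
It is very simple to derive the complexity of this approach from the above algorithm. It uses
three nested loops. Statements in the innermost loop run in Θ(1) time. The running time of the
algorithm is computed as

\[ T(n) = \sum_{m=1}^{n} \sum_{i=1}^{n-m+1} \sum_{r=i}^{i+m-1} \Theta(1) = \Theta(n^3) \]

Thus, the OBST algorithm runs in cubic time.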
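Example
Problem: Let p(1:3) = (0.5, 0.1, 0.05) and q(0:3) = (0.15, 0.1, 0.05, 0.05). Compute and
construct the OBST for the above values using the dynamic approach.
Solution:
Here, given that

i     0      1     2      3
pi           0.5   0.1    0.05
qi    0.15   0.1   0.05   0.05

The recursive formula to solve the OBST problem is

\[ e[i, j] = \begin{cases} q_{i-1} & \text{if } j = i - 1 \\ \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \} & \text{if } i \le j \end{cases} \]

Where,

\[ w(i, j) = \sum_{m=i}^{j} p_m + \sum_{m=i-1}^{j} q_m \]

so that w(1, 1) = 0.75, w(2, 2) = 0.25, w(3, 3) = 0.15, w(1, 2) = 0.9, w(2, 3) = 0.35 and w(1, 3) = 1.0.

Now, we will compute e[i, j].

Initially,

e[1, 0] = q0 = 0.15 (∵ j = i – 1)
e[2, 1] = q1 = 0.1 (∵ j = i – 1)
e[3, 2] = q2 = 0.05 (∵ j = i – 1)
e[4, 3] = q3 = 0.05 (∵ j = i – 1)
e[1, 1] = min{e[1, 0] + e[2, 1] + w(1, 1)}
= min{0.15 + 0.1 + 0.75} = 1.0
e[2, 2] = min{e[2, 1] + e[3, 2] + w(2, 2)}
= min{0.1 + 0.05 + 0.25} = 0.4
e[3, 3] = min{e[3, 2] + e[4, 3] + w(3, 3)}
= min{0.05 + 0.05 + 0.15} = 0.25
e[1, 2] = min{e[1, 0] + e[2, 2] + w(1, 2), e[1, 1] + e[3, 2] + w(1, 2)}
= min{0.15 + 0.4 + 0.9, 1.0 + 0.05 + 0.9} = min{1.45, 1.95} = 1.45
e[2, 3] = min{e[2, 1] + e[3, 3] + w(2, 3), e[2, 2] + e[4, 3] + w(2, 3)}
= min{0.1 + 0.25 + 0.35, 0.4 + 0.05 + 0.35} = min{0.7, 0.8} = 0.7
e[1, 3] = min{e[1, 0] + e[2, 3] + w(1, 3), e[1, 1] + e[3, 3] + w(1, 3), e[1, 2] + e[4, 3] + w(1, 3)}
= min{0.15 + 0.7 + 1.0, 1.0 + 0.25 + 1.0, 1.45 + 0.05 + 1.0} = min{1.85, 2.25, 2.5} = 1.85
e[1, 3] is minimum for r = 1, so r[1, 3] = 1
e[2, 3] is minimum for r = 2, so r[2, 3] = 2
e[1, 2] is minimum for r = 1, so r[1, 2] = 1
e[3, 3] is minimum for r = 3, so r[3, 3] = 3
e[2, 2] is minimum for r = 2, so r[2, 2] = 2
e[1, 1] is minimum for r = 1, so r[1, 1] = 1

Let us now construct the OBST for the given data.
r[1, 3] = 1, so k1 will be at the root.
k2…k3 are on the right side of k1.

r[2, 3] = 2, so k2 will be the root of this subtree. k3 will
be on the right of k2.
Thus, finally, we get the tree with k1 at the root, k2 as the right child of k1 and k3 as the right child of k2.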
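
The table computation for this example can be reproduced with a small Python sketch of the OBST(p, q, n) procedure. The function name optimal_bst and the padding of p with an unused p[0] entry (so that indices match the pseudocode) are assumptions made for illustration.

```python
def optimal_bst(p, q, n):
    """p[1..n]: key probabilities (p[0] is unused); q[0..n]: dummy-key
    probabilities. Returns (e, root), indexed as in the pseudocode."""
    e = [[0.0] * (n + 1) for _ in range(n + 2)]     # e[1..n+1][0..n]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]     # w[1..n+1][0..n]
    root = [[0] * (n + 1) for _ in range(n + 1)]    # root[1..n][1..n]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]
    for m in range(1, n + 1):                       # number of keys in subtree
        for i in range(1, n - m + 2):
            j = i + m - 1
            e[i][j] = float("inf")
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):               # try each key as the root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

p = [0.0, 0.5, 0.1, 0.05]        # p[0] is padding
q = [0.15, 0.1, 0.05, 0.05]
e, root = optimal_bst(p, q, 3)
print(round(e[1][3], 2), root[1][3], root[2][3])
```

Reading root[1][3] and then root[2][3] gives the same construction as above: k1 at the root and k2 as the root of the right subtree.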
Greedy Technique
Activity Selection Problem
The Activity Selection problem is the problem of selecting non-conflicting tasks based on their start and
end times, and it can be solved in O(N log N) time using a simple greedy approach. Modifications of this
problem are complex and interesting which we will explore as well. Surprisingly, if we use a
Dynamic Programming approach, the time complexity will be O(N^3), which is lower
performance.
The problem statement for Activity Selection is that "Given a set of n activities with their
start and finish times, we need to select the maximum number of non-conflicting activities that
can be performed by a single person, given that the person can handle only one activity at a
time." The Activity Selection problem follows the greedy approach, i.e. at every step, we
make the choice that looks best at the moment to get the optimal solution of the complete
problem.
Our objective is to complete the maximum number of activities. So, choosing the activity which
is going to finish first will leave us the maximum time to fit in the later activities. This is the
intuition that greedily choosing the activity with the earliest finish time will give us an optimal
solution. By induction on the number of choices made, making the greedy choice at every
step produces an optimal solution, so we choose the activity which finishes first. If we sort
the activities by their starting time instead, the activity with the least starting time could take the
maximum duration for completion, and therefore we would not be able to maximise the number of
activities.
Algorithm
The algorithm of Activity Selection is as follows:
Activity-Selection(Activity, start, finish)
    Sort Activity by finish times stored in finish
    Selected = {Activity[1]}

    n = Activity.length
    j = 1

    for i = 2 to n:
        if start[i] ≥ finish[j]:
            Selected = Selected U {Activity[i]}
            j = i

    return Selected
Complexity
Time Complexity:
When activities are sorted by their finish time: O(N)
When activities are not sorted by their finish time, the time complexity is O(N log N) due to the
complexity of sorting.

In this example, we take the start and finish times of the activities as follows:
start = [1, 3, 2, 0, 5, 8, 11]
finish = [3, 4, 5, 7, 9, 10, 12]
The activities are sorted by their finish time, so activity 0 gets selected. As activity 1 has a starting time
which is equal to the finish time of activity 0, it gets selected. Activities 2 and 3 have a smaller starting
time than the finish time of activity 1, so they get rejected. Based on similar comparisons,
activities 4 and 6 also get selected, whereas activity 5 gets rejected. In this example, in all
the activities 0, 1, 4 and 6 get selected, while others get rejected.
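
The greedy selection can be sketched in Python as below, using the start and finish arrays of this example. The function name select_activities and the 0-based activity numbers are assumptions made for illustration; the sketch sorts by finish time itself, so the input does not need to be pre-sorted.

```python
def select_activities(start, finish):
    """Return the indices of a maximum set of non-conflicting activities."""
    order = sorted(range(len(start)), key=lambda a: finish[a])
    selected = []
    last_finish = float("-inf")      # finish time of the last chosen activity
    for a in order:
        if start[a] >= last_finish:  # compatible with everything chosen so far
            selected.append(a)
            last_finish = finish[a]
    return selected

start = [1, 3, 2, 0, 5, 8, 11]
finish = [3, 4, 5, 7, 9, 10, 12]
chosen = select_activities(start, finish)
print(chosen)
```

The sort costs O(N log N) and the single scan afterwards costs O(N), matching the complexity stated above.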
Optimal Merge Pattern
Merge a set of sorted files of different lengths into a single sorted file. We need to find an optimal
solution, where the resultant file will be generated in minimum time.
If a number of sorted files are given, there are many ways to merge them into a single sorted
file. This merge can be performed pairwise. Hence, this type of merging is called a 2-way
merge pattern.
As different pairings require different amounts of time, in this strategy we want to
determine an optimal way of merging many files together. At each step, the two shortest sequences
are merged.
To merge a p-record file and a q-record file requires possibly p + q record moves, the obvious
choice being to merge the two smallest files together at each step.
Two-way merge patterns can be represented by binary merge trees. Let us consider a set
of n sorted files {f1, f2, f3, …, fn}. Initially, each element of this set is considered as a single-node binary
tree. To find this optimal solution, the following algorithm is used.

Algorithm: TREE(n)
for i := 1 to n – 1 do
    declare new node
    node.leftchild := least(list)
    node.rightchild := least(list)
    node.weight := (node.leftchild).weight + (node.rightchild).weight
    insert(list, node);
return least(list);
At the end of this algorithm, the weight of the root node represents the optimal cost.
Example
Let us consider the given files f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 elements
respectively.

If merge operations are performed according to the provided sequence, then
M1 = merge f1 and f2 => 20 + 30 = 50
M2 = merge M1 and f3 => 50 + 10 = 60
M3 = merge M2 and f4 => 60 + 5 = 65
M4 = merge M3 and f5 => 65 + 30 = 95
Hence, the total number of operations is
50 + 60 + 65 + 95 = 270
Now, the question arises: is there any better solution?
Sorting the numbers according to their size in ascending order, we get the following sequence:
f4, f3, f1, f2, f5

Hence, merge operations can be performed on this sequence
M1 = merge f4 and f3 => 5 + 10 = 15
M2 = merge M1 and f1 => 15 + 20 = 35
M3 = merge M2 and f2 => 35 + 30 = 65
M4 = merge M3 and f5 => 65 + 30 = 95

Therefore, the total number of operations is
15 + 35 + 65 + 95 = 210

Obviously, this is better than the previous one.
In this context, we are now going to solve the problem using this algorithm.
Initial Set: 5, 10, 20, 30, 30
Step 1: merge 5 and 10 => 15
Step 2: merge 15 and 20 => 35
Step 3: merge 30 and 30 => 60
Step 4: merge 35 and 60 => 95

Hence, the solution takes 15 + 35 + 60 + 95 = 205 comparisons.
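
The total of 205 can be checked with a short Python sketch that uses a min-heap as the list from which the two least elements are taken. The function name optimal_merge_cost is an assumption made for illustration; heapq is the Python standard-library heap module.

```python
import heapq

def optimal_merge_cost(sizes):
    """Return the minimum total number of record moves needed to merge
    files of the given sizes pairwise into one file."""
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)      # the two smallest files
        b = heapq.heappop(heap)
        total += a + b               # cost of this merge
        heapq.heappush(heap, a + b)  # the merged file goes back in the list
    return total

files = [20, 30, 10, 5, 30]          # f1 ... f5 from the example
cost = optimal_merge_cost(files)
print(cost)
```

With a heap, each of the n – 1 merges costs O(log n), so the whole pattern is found in O(n log n) time.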


Huffman Tree
Huffman coding provides codes to characters such that the length of the code depends on
the relative frequency or weight of the corresponding character. Huffman codes are of
variable length and prefix-free (that means no code is a prefix of any other). Any
prefix-free binary code can be displayed or visualized as a binary tree with the encoded
characters stored at the leaves.
A Huffman tree or Huffman coding tree is defined as a full binary tree in which each leaf of the
tree corresponds to a letter in the given alphabet.
The Huffman tree is the binary tree with the minimum external path
weight, that is, the one with the minimum sum of weighted path lengths for
the given set of leaves. So the goal is to construct a tree with the minimum external path
weight.

An example is given below:
Letter frequency table

Letter      z   k   m    c    u    d    l    e
Frequency   2   7   24   32   37   42   42   120

Huffman code

Letter   Freq   Code     Bits
e        120    0        1
d        42     101      3
l        42     110      3
u        37     100      3
c        32     1110     4
m        24     11111    5
k        7      111101   6
z        2      111100   6

Figure: The Huffman tree for the above example
Algorithm Huffman(c)
{
    n = |c|
    Q = c
    for i <- 1 to n-1
    do
    {
        temp <- get_node()
        left[temp] <- Get_min(Q)
        right[temp] <- Get_min(Q)
        a = left[temp]
        b = right[temp]
        f[temp] <- f[a] + f[b]
        insert(Q, temp)
    }
    return Get_min(Q)
}
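
A minimal Python sketch of the construction is given below, using the letter frequency table of this section. The function name huffman_codes, the tuple representation of internal nodes and the tie-breaking counter are assumptions made for illustration. Because several letters tie on frequency, the individual bit patterns may differ from the table, but the code lengths and the total weighted length are the same.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Build a Huffman tree for the {symbol: frequency} map and return
    a {symbol: bit string} map."""
    tie = count()                    # tie-breaker so trees are never compared
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fa, _, a = heapq.heappop(heap)     # two least-frequent nodes
        fb, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (fa + fb, next(tie), (a, b)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                              # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes

freq = {"z": 2, "k": 7, "m": 24, "c": 32, "u": 37, "d": 42, "l": 42, "e": 120}
codes = huffman_codes(freq)
total_bits = sum(freq[s] * len(codes[s]) for s in freq)
print(codes, total_bits)
```

The total weighted path length is 785 bits for these frequencies, which agrees with the Freq and Bits columns of the table (120·1 + 42·3 + 42·3 + 37·3 + 32·4 + 24·5 + 7·6 + 2·6).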
UNIT 4
Backtracking
N-Queens Problem
The N-Queens problem is to place n queens on an n x n chessboard in such a manner that no two queens attack each other by
being in the same row, column or diagonal.

It can be seen that for n = 1 the problem has a trivial solution, and no solution exists for n = 2 and n = 3. So first we will
consider the 4-queens problem and then generalize it to the n-queens problem.

Given a 4 x 4 chessboard, number the rows and columns of the chessboard 1 through 4.

We have to place 4 queens q1, q2, q3 and q4 on the chessboard such that no two queens attack
each other. Under this condition each queen must be placed on a different row, i.e., we put queen "i" on row "i".

Now, we place queen q1 in the very first acceptable position (1, 1). Next, we put queen q2 so that the two
queens do not attack each other. We find that if we place q2 in column 1 or 2, then a dead end is
encountered. Thus the first acceptable position for q2 is column 3, i.e. (2, 3), but then no position is left for
placing queen q3 safely. So we backtrack one step and place queen q2 in (2, 4), the next best possible
solution. Then we obtain the position for placing q3, which is (3, 2). But later this position also leads to a dead
end, and no place is found where q4 can be placed safely. Then we have to backtrack to q1 and place it at
(1, 2), and then all other queens are placed safely by moving q2 to (2, 4), q3 to (3, 1) and q4 to (4, 3). That is,
we get the solution (2, 4, 1, 3). This is one possible solution for the 4-queens problem. For other possible
solutions, the whole method is repeated for all partial solutions. The other solution for the 4-queens problem
is (3, 1, 4, 2).
The implicit tree for the 4-queens problem for the solution (2, 4, 1, 3) is as follows:

The figure shows the complete state space for the 4-queens problem. But we can use the backtracking method to generate only the
necessary nodes and stop if the next node violates the rule, i.e., if two queens are attacking.
Figure: 4-Queens solution space with nodes numbered in DFS order

It can be seen that all the solutions to the 4-queens problem can be represented as 4-tuples (x1, x2, x3, x4), where xi
represents the column in which queen "qi" is placed.

One possible solution for the 8-queens problem is shown in the figure:

1. Thus, the solution for the 8-queens problem is (4, 6, 8, 2, 7, 1, 3, 5).

2. Suppose two queens are placed at positions (i, j) and (k, l).

3. Then they are on the same diagonal only if i - j = k - l or i + j = k + l.

4. The first equation implies that j - l = i - k.

5. The second equation implies that j - l = k - i.

6. Therefore, two queens lie on the same diagonal if and only if |j - l| = |i - k|.

Place(k, i) returns a Boolean value that is true if the kth queen can be placed in column i. It tests both
whether i is distinct from all previous values x1, x2, ..., xk-1 and whether there is no other queen on the same
diagonal.

Using Place, we give a precise solution to the n-queens problem.
Place(k, i)
{
    For j ← 1 to k - 1
        do if (x[j] = i)
            or (Abs(x[j] - i) = Abs(j - k))
        then return false;
    return true;
}
Place(k, i) returns true if a queen can be placed in the kth row and ith column; otherwise it returns false. x[] is a
global array whose first k - 1 values have been set. Abs(r) returns the absolute value of r.
N-Queens(k, n)
{
    for i ← 1 to n
        do if Place(k, i) then
        {
            x[k] ← i;
            if (k == n) then
                write(x[1 .. n]);
            else
                N-Queens(k + 1, n);
        }
}
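
The two routines above translate almost line for line into Python. The sketch below is only an illustration: the names `place` and `n_queens` and the choice to collect solutions in a list (instead of writing them out) are assumptions, and x[0] is left unused so that the indices match the 1-based pseudocode.

```python
def place(x, k, i):
    """True if the queen of row k can go in column i, given rows 1..k-1 in x."""
    for j in range(1, k):
        # same column, or same diagonal
        if x[j] == i or abs(x[j] - i) == abs(j - k):
            return False
    return True


def n_queens(n):
    """Return every solution as a list of columns, one entry per row."""
    x = [0] * (n + 1)   # x[0] is unused so indices match the pseudocode
    solutions = []

    def solve(k):
        for i in range(1, n + 1):
            if place(x, k, i):
                x[k] = i
                if k == n:
                    solutions.append(x[1:])
                else:
                    solve(k + 1)

    solve(1)
    return solutions


print(n_queens(4))   # [[2, 4, 1, 3], [3, 1, 4, 2]]
```

For n = 4 this reports exactly the two tuples discussed earlier, (2, 4, 1, 3) and (3, 1, 4, 2).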

Hamiltonian Circuit
The Hamiltonian cycle is the cycle in the graph which visits all the vertices in the graph exactly once and terminates at the
starting node. It may not include all the edges.

 The Hamiltonian cycle problem is the problem of finding a Hamiltonian cycle in a graph if there exists any
such cycle.

 The input to the problem is an undirected, connected graph. For the graph shown in Figure (a), the
path A–B–E–D–C–A forms a Hamiltonian cycle. It visits all the vertices exactly once, but does not use
the edge <B, D>.

 The Hamiltonian cycle problem can be posed both as a decision problem and as a search problem. The
decision problem is stated as, “Given graph G, does it contain a Hamiltonian cycle?”

 The search problem is stated as, “Given graph G, find a Hamiltonian cycle of the graph.”

 We can define the constraints for the Hamiltonian cycle problem as follows:

 In any path, vertices i and (i + 1) must be adjacent.

 The first and the last vertex of the path must be adjacent (the cycle closes by returning to the initial vertex).

 Vertex i must not appear in the first (i – 1) vertices of any path.

 With the adjacency matrix representation of the graph, the adjacency of two vertices can be verified in
constant time.

Algorithm
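HAMILTONIAN(i)
// Description: Solve the Hamiltonian cycle problem using backtracking.
// Input: Undirected, connected graph G = <V, E> with vertices 1 … n; positions V[0 … i – 1] of the
//        path are already chosen, V[0] holds the initial vertex 1, and the first call is HAMILTONIAN(1)
// Output: Hamiltonian cycle
j ← 2
while (j ≤ n) do
    V[i] ← j
    if FEASIBLE(i) then
        if (i == n – 1) then
            Print V[0 … n – 1]
        else
            HAMILTONIAN(i + 1)
        end
    end
    j ← j + 1
end

function FEASIBLE(i)
flag ← 1
for j ← 0 to i – 1 do
    if V[i] == V[j] then                              // vertex is already on the path
        flag ← 0
    end
end
if not Adjacent(V[i], V[i – 1]) then
    flag ← 0
end
if (i == n – 1) and not Adjacent(V[i], V[0]) then     // the cycle must close
    flag ← 0
end
return flag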
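
As a rough illustration of the same backtracking idea, here is a minimal Python sketch. The adjacency matrix is an assumption: Figure (a) is not reproduced here, so the graph below simply contains the edges A–B, B–E, E–D, D–C, C–A and B–D mentioned in the text, with A..E numbered 0..4. The function name `hamiltonian_cycle` is likewise illustrative.

```python
def hamiltonian_cycle(adj, start=0):
    """Return one Hamiltonian cycle as a list of vertices, or None if there is none."""
    n = len(adj)
    path = [start]
    used = [False] * n
    used[start] = True

    def extend():
        if len(path) == n:
            # every vertex is on the path; the cycle must close
            return adj[path[-1]][start] == 1
        for v in range(n):
            # v must be new and adjacent to the last vertex on the path
            if not used[v] and adj[path[-1]][v] == 1:
                used[v] = True
                path.append(v)
                if extend():
                    return True
                path.pop()          # dead end: backtrack
                used[v] = False
        return False

    return path + [start] if extend() else None


# Assumed graph: A=0, B=1, C=2, D=3, E=4
graph = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0],
]
print(hamiltonian_cycle(graph))   # [0, 1, 4, 3, 2, 0], i.e. A-B-E-D-C-A
```

The search first tries A–B–D and backtracks twice before it finds A–B–E–D–C–A, which is the cycle named in the text.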

Complexity Analysis
Looking at the state space tree, in the worst case, the total number of nodes in the tree would be,
T(n) = 1 + (n – 1) + (n – 1)^2 + (n – 1)^3 + … + (n – 1)^(n – 1)
     = ((n – 1)^n – 1) / (n – 2)
T(n) = O(n^n). Thus, the Hamiltonian cycle algorithm runs in exponential time.

Example: Find the Hamiltonian cycle by using the backtracking approach for a given graph.

The backtracking approach uses a state-space tree to check if there exists a Hamiltonian cycle in the graph.
Figure (g) shows the simulation of the Hamiltonian cycle algorithm. For simplicity, we have not explored all
possible paths; the concept is self-explanatory. It is not possible to include all the paths in the figure, so only a few
of the successful and unsuccessful paths are traced. Black nodes indicate the Hamiltonian cycle.

Subset Sum Problem

Sum of Subsets Problem: Given a set of positive integers, find the combination of numbers that sum to a given value M.

The sum of subsets problem is analogous to the knapsack problem. The knapsack problem tries to fill the knapsack
using a given set of items to maximize the profit. Items are selected in such a way that the total weight in
the knapsack does not exceed the capacity of the knapsack. The inequality condition in the knapsack
problem is replaced by equality in the sum of subsets problem.
Given the set of n positive integers, W = {w1, w2, …, wn}, and given a positive integer M, the sum of
subsets problem can be formulated as follows (where wi and M correspond to item weights and knapsack capacity in
the knapsack problem):

    w1·x1 + w2·x2 + … + wn·xn = M

where each xi is either 0 or 1.

Numbers are sorted in ascending order, such that w1 < w2 < w3 < … < wn. The solution is often represented
using the solution vector X. If the ith item is included, set xi to 1, else set it to 0. In each iteration, one item is
tested. If the inclusion of an item does not violate the constraint of the problem, add it. Otherwise, backtrack,
remove the previously added item, and continue the same procedure for all remaining items. The solution is easily
described by the state space tree. Each left edge denotes the inclusion of wi and the right edge denotes the
exclusion of wi. Any path from the root to a leaf forms a subset. A state-space tree for n = 3 is demonstrated
in Fig. (a).

Fig. (a): State space tree for n = 3
Algorithm for Sum of Subsets
The algorithm for solving the sum of subsets problem using recursion is stated below:
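
A minimal recursive sketch in Python, assuming distinct positive weights as described above. The function name `sum_of_subsets`, the parameter `target` (standing for M) and the sample weights are illustrative assumptions, not taken from the original algorithm. A branch is cut when the remaining weights cannot reach the target, or when even the next (smallest remaining) weight overshoots it.

```python
def sum_of_subsets(weights, target):
    """Return every 0/1 solution vector X whose chosen weights add up to target."""
    weights = sorted(weights)           # ascending order, as the text assumes
    n = len(weights)
    x = [0] * n
    solutions = []

    def explore(i, current, remaining):
        # current: sum of the chosen weights; remaining: sum of weights[i:]
        if current == target:
            solutions.append(x[:])
            return
        if i == n:
            return
        # prune: target unreachable, or the smallest remaining weight overshoots
        if current + remaining < target or current + weights[i] > target:
            return
        x[i] = 1                         # left edge: include weights[i]
        explore(i + 1, current + weights[i], remaining - weights[i])
        x[i] = 0                         # right edge: exclude weights[i]
        explore(i + 1, current, remaining - weights[i])

    explore(0, 0, sum(weights))
    return solutions


print(sum_of_subsets([5, 10, 12, 13, 15, 18], 30))
# [[1, 1, 0, 0, 1, 0], [1, 0, 1, 1, 0, 0], [0, 0, 1, 0, 0, 1]]
```

The three vectors correspond to the subsets {5, 10, 15}, {5, 12, 13} and {12, 18}.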

Examples

Graph Colouring

In this problem, an undirected graph is given, along with m colors. The problem is to find whether it is
possible to color the nodes using at most m different colors, such that no two adjacent vertices of the graph have the same
color. If a solution exists, then display which color is assigned to which vertex.
Starting from vertex 0, we will try to assign colors one by one to different nodes. But before assigning, we
have to check whether the color is safe or not. A color is not safe if an adjacent vertex already has the same
color.
Input and Output
Input:
The adjacency matrix of a graph G(V, E) and an integer m, which indicates the maximum number of colors that can be
used.

Let the maximum color m = 3.
Output:
This algorithm will return which node will be assigned which color. If a solution is not possible, it will return false.
For this input the assigned colors are:
Node 0 -> color 1
Node 1 -> color 2
Node 2 -> color 3
Node 3 -> color 2

Algorithm
isValid(vertex, colorList, col)
Input − Vertex, colorList to check, and the color which we are trying to assign.
Output − True if the color assignment is valid, otherwise false.
Begin
   for all vertices v of the graph, do
      if there is an edge between v and vertex, and col = colorList[v], then
         return false
   done
   return true
End

graphColoring(colors, colorList, vertex)
Input − Most possible colors, the list recording which vertices are colored with which color, and the starting vertex.
Output − True, when colors are assigned, otherwise false.
Begin
   if all vertices are checked, then
      return true
   for all colors col from available colors, do
      if isValid(vertex, colorList, col), then
         add col to the colorList for vertex
         if graphColoring(colors, colorList, vertex + 1) = true, then
            return true
         remove color for vertex
   done
   return false
End
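
The pseudocode above can be sketched in Python as follows. The adjacency matrix is an assumption, since the input figure is not reproduced here; it was picked so that m = 3 yields the colors 1, 2, 3, 2 listed in the sample output. The function and variable names are illustrative.

```python
def graph_coloring(adj, m):
    """Return a list of colors (1..m), one per vertex, or None if no coloring exists."""
    n = len(adj)
    colors = [0] * n                     # 0 means "not colored yet"

    def is_valid(vertex, col):
        # col is unsafe if any neighbour of vertex already carries it
        return all(not (adj[vertex][v] and colors[v] == col) for v in range(n))

    def assign(vertex):
        if vertex == n:                  # all vertices are checked
            return True
        for col in range(1, m + 1):
            if is_valid(vertex, col):
                colors[vertex] = col
                if assign(vertex + 1):
                    return True
                colors[vertex] = 0       # remove the color and try the next one
        return False

    return colors if assign(0) else None


# Assumed 4-vertex graph
graph = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
print(graph_coloring(graph, 3))   # [1, 2, 3, 2]
print(graph_coloring(graph, 2))   # None: vertices 0, 1 and 2 form a triangle
```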

Branch and Bound
Solving 15 Puzzle Problem (LCBB)
The problem consists of 15 numbered (1-15) tiles on a square box with 16 cells (one cell is blank or empty).
The objective of this problem is to change the arrangement of the initial node to the goal node by using a series of legal
moves.
The initial and goal node arrangements are shown in the following figure.

Initial Arrangement        Final Arrangement
 1   2   4  15              1   2   3   4
 3   _   5  12              5   6   7   8
 7   6  11  14              9  10  11  12
 8   9  10  13             13  14  15   _

In the initial node four moves are possible. The user can move any one of the tiles 2, 3, 5 or 6 to the empty cell. From
this we have four possibilities to move from the initial node.
The legal moves shift a tile adjacent to the empty cell left, right, up or down, one at a time.
Each and every move creates a new arrangement, and this arrangement is called a state of the puzzle problem.
By using different states, a state space tree diagram is created, in which edges are labeled according to the direction
in which the empty space moves.

The state space tree is very large because there can be 16! different arrangements.
In the state space tree, nodes are numbered as per the level. In each level we must calculate the value or cost of
each node by using the given formula:
C(x) = f(x) + g(x),
f(x) is the length of the path from the root or initial node to node x,
g(x) is the estimated length of the path from x downward to the goal node, taken as the number of nonblank tiles not in their
correct position.
C(x) < Infinity (initially set bound).
Each time, the node with the smallest cost is selected for further expansion towards the goal node. This node becomes
the E-node.

The state space tree with node costs is shown in the diagram.
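
To make the cost formula concrete, the following Python sketch computes C(x) = f(x) + g(x) for the children of a start state. The board used here is an illustrative assumption (it is not the arrangement from the figure), 0 stands for the blank cell, and the helper names `misplaced`, `children` and `cost` are made up for this example.

```python
GOAL = tuple(range(1, 16)) + (0,)        # 1..15 in order, blank (0) in the last cell


def misplaced(board):
    """g(x): number of nonblank tiles that are not in their goal position."""
    return sum(1 for tile, goal in zip(board, GOAL) if tile != 0 and tile != goal)


def children(board):
    """Yield (direction in which the blank moves, resulting board) for each legal move."""
    blank = board.index(0)
    row, col = divmod(blank, 4)
    moves = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    for name, (dr, dc) in moves.items():
        r, c = row + dr, col + dc
        if 0 <= r < 4 and 0 <= c < 4:
            nxt = list(board)
            target = r * 4 + c
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            yield name, tuple(nxt)


def cost(board, depth):
    """C(x) = f(x) + g(x), with f(x) the depth of the node in the tree."""
    return depth + misplaced(board)


# Assumed start state, written row by row
start = (1, 2, 3, 4,
         5, 6, 0, 8,
         9, 10, 7, 11,
         13, 14, 15, 12)

child_costs = {name: cost(board, 1) for name, board in children(start)}
print(misplaced(start), child_costs)
# 3 {'up': 5, 'down': 3, 'left': 5, 'right': 5}
```

The child reached by moving the blank down has the smallest cost, so it would become the next E-node.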

Assignment Problem
Problem Statement
Let’s first define a job assignment problem. In a standard version of a job assignment problem, there can be
several jobs and workers. To keep it simple, we’re taking a small number of jobs and workers in our example:

We can assign any of the available jobs to any worker with the condition that if a job is assigned to a worker,
the other workers can’t take that particular job. We should also notice that each job has some cost
associated with it, and it differs from one worker to another.
Here the main aim is to complete all the jobs by assigning one job to each worker in such a way that the sum
of the cost of all the jobs is minimized.
Branch and Bound Algorithm Pseudocode
Now let’s discuss how to solve the job assignment problem using a branch and bound algorithm. Let’s see
the pseudocode first:

Here, the input is a cost matrix that contains information like the number of available jobs, a list of
available workers, and the associated cost for each job. The function MinCost() maintains a list of
active nodes. The function Leastcost() calculates the minimum cost of the active node at each level of
the tree. After finding the node with minimum cost, we remove the node from the list of active
nodes and return it.
We’re using the add() function in the pseudocode, which calculates the cost of a particular node and
adds it to the list of active nodes.
In the search space tree, each node contains some information, such as cost, a total number of jobs,
as well as a total number of workers.
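
A compact Python sketch of this scheme is given below. It is only one possible reading of the description: the heap `active` plays the role of the list of active nodes, popping from it corresponds to picking the least-cost node, and pushing corresponds to add(). The lower bound used (cost so far plus the cheapest free job for every unassigned worker) and the 4 x 4 cost matrix are assumptions made for the illustration.

```python
import heapq


def assign_jobs(cost):
    """Best-first branch and bound; cost[w][j] is the cost of worker w doing job j."""
    n = len(cost)

    def bound(worker, used, so_far):
        # so_far plus, for every unassigned worker, the cheapest job that is still free
        total = so_far
        for w in range(worker, n):
            total += min(cost[w][j] for j in range(n) if j not in used)
        return total

    # a node is (lower bound, cost so far, next worker to assign, jobs chosen so far)
    active = [(bound(0, (), 0), 0, 0, ())]
    while active:
        lower, so_far, worker, jobs = heapq.heappop(active)   # least-cost active node
        if worker == n:
            return so_far, list(jobs)     # first complete node popped is optimal
        for j in range(n):
            if j not in jobs:
                new_cost = so_far + cost[worker][j]
                new_jobs = jobs + (j,)
                heapq.heappush(active, (bound(worker + 1, new_jobs, new_cost),
                                        new_cost, worker + 1, new_jobs))
    return None


# Assumed cost matrix: rows are workers, columns are jobs
costs = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]
print(assign_jobs(costs))   # (13, [1, 0, 2, 3])
```

Here worker 0 takes job 1, worker 1 takes job 0, worker 2 takes job 2 and worker 3 takes job 3, for a total cost of 2 + 6 + 1 + 4 = 13.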
Now let’s run the algorithm on the sample example we’ve created:

Advantages
In a branch and bound algorithm, we don’t explore all the nodes in the tree. That’s why the running time
of a branch and bound algorithm is usually lower than that of an exhaustive search.
If the problem is not large and if we can do the branching in a reasonable amount of time, it finds an optimal
solution for a given problem.
The branch and bound algorithm finds a minimal path to reach the optimal solution for a given problem. It
doesn’t repeat nodes while exploring the tree.
Disadvantages
The branch and bound algorithm is time-consuming. Depending on the size of the given problem, the
number of nodes in the tree can be too large in the worst case.

Knapsack Problem using branch and bound
Problem Statement
We are given a set of n objects, each of which has a value vi and a weight wi. The objective of
the 0/1 Knapsack problem is to find a subset of objects such that the total value is maximized, and
the sum of weights of the objects does not exceed a given threshold W. An important condition here is that
one can either take the entire object or leave it. It is not possible to take a fraction of the object.

Consider an example where n = 4, and the values are given by {10, 10, 12, 18} and the weights given by {2, 4,
6, 9}. The maximum weight is given by W = 15. Here, the solution to the problem will be including
the first, second and the fourth objects.

Here, the procedure to solve the problem is as follows:
 Calculate the cost function and the upper bound for the two children of each node. Here, the (i +
1)th level indicates whether the ith object is to be included or not.
 If the cost function for a given node is greater than the upper bound, then the node need not
be explored further. Hence, we can kill this node. Otherwise, calculate the upper bound
for this node. If this value is less than U, then replace the value of U with this value. Then, kill all
unexplored nodes which have cost function greater than this value.
 The next node to be checked after reaching all nodes in a particular level will be the one with the least
cost function value among the unexplored nodes.
 While including an object, one needs to check whether adding the object crosses the
threshold. If it does, one has reached the terminal point in that branch, and all the
succeeding objects will not be included.

Time and Space Complexity
Even though this method is more efficient than the other solutions to this problem, its worst case
time complexity is still given by O(2^n), in cases where the entire tree has to be explored. However, in its best
case, only one path through the tree will have to be explored, and hence its best case time complexity
is given by O(n). Since this method requires the creation of the state space tree, its space complexity will
also be exponential.

Solving an Example
Consider the problem with n = 4, V = {10, 10, 12, 18}, w = {2, 4, 6, 9} and W = 15. Here, we calculate the initial
upper bound to be U = 10 + 10 + 12 = 32. Note that the 4th object cannot be included here, since
that would exceed W. For the cost, we add 3/9 th of the final value (18 × 3/9 = 6, because only 3 units of capacity remain), and hence the cost function is
38. Remember to negate the values after calculation before comparison.
After calculating the cost at each node, kill nodes that do not need exploring. Hence, the final state space
tree will be as follows (here, the number of the node denotes the order in which the state space
tree was explored):


Note here that node 3 and node 5 have been killed after updating U at node 7. Also, node 6 is not
explored further, since adding any more weight exceeds the threshold. At the end, only nodes 6 and
8 remain. Since the value of U is less for node 8, we select this node. Hence the solution is {1, 1, 0, 1}, and we
can see here that the total weight is exactly equal to the threshold value in this case.
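
The bounds used in this example can be reproduced with a short Python sketch. It assumes the objects are already sorted by value/weight ratio (true for this data), stores both bounds as negated profits as the text suggests, and picks the live node with the least cost first. The names `bounds` and `knapsack` are illustrative, and the order in which nodes are visited may differ from the numbering in the figure.

```python
import heapq


def bounds(values, weights, capacity, taken):
    """Return (cost, upper) as negated profits for a node with the first
    len(taken) decisions fixed, or None if those choices exceed the capacity."""
    weight = sum(w for w, t in zip(weights, taken) if t)
    if weight > capacity:
        return None
    profit = sum(v for v, t in zip(values, taken) if t)
    rest = range(len(taken), len(values))

    upper, room = profit, capacity - weight
    for i in rest:                        # whole objects only
        if weights[i] <= room:
            room -= weights[i]
            upper += values[i]

    cost, room = profit, capacity - weight
    for i in rest:                        # a fraction of the first object that does not fit
        if weights[i] <= room:
            room -= weights[i]
            cost += values[i]
        else:
            cost += values[i] * room / weights[i]
            break
    return -cost, -upper


def knapsack(values, weights, capacity):
    """Least-cost branch and bound; returns (best value, solution vector)."""
    n = len(values)
    cost, U = bounds(values, weights, capacity, ())
    live = [(cost, ())]
    while live:
        cost, taken = heapq.heappop(live)
        if cost > U:
            continue                      # killed after U was updated
        if len(taken) == n:
            return -cost, list(taken)     # least-cost leaf is optimal
        for choice in (1, 0):
            result = bounds(values, weights, capacity, taken + (choice,))
            if result is None:
                continue                  # adding the object crosses the threshold
            c, u = result
            U = min(U, u)
            if c <= U:
                heapq.heappush(live, (c, taken + (choice,)))
    return None


V, w, W = [10, 10, 12, 18], [2, 4, 6, 9], 15
print(bounds(V, w, W, ()))   # (-38.0, -32)
print(knapsack(V, w, W))     # (38, [1, 1, 0, 1])
```

The root bounds match the values 38 and 32 computed above, and the search ends with the solution {1, 1, 0, 1} of total weight 15.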

Travelling Salesman Problem
 The Travelling Salesman Problem (TSP) is an interesting problem. The problem is defined as “given n cities
and the distance between each pair of cities, find the path which visits each city
exactly once and comes back to the starting city, with the constraint of minimizing the travelling
distance.”
 TSP has many practical applications. It is used in network design and transportation route
design. The objective is to minimize the distance. We can start the tour from any random city
and visit the other cities in any order. With n cities, n! different permutations are possible.
Exploring all paths by brute force is not practical in real life applications.
LCBB using Static State Space Tree for Travelling Salesman Problem
 Branch and bound is an effective way to find a better, if not the best, solution in quick time by pruning
some of the unnecessary branches of the search tree.
 It works as follows:
Consider a directed weighted graph G = (V, E, W), where nodes represent cities and weighted directed
edges represent the direction and distance between two cities.
1. Initially, the graph is represented by the cost matrix C, where
Cij = cost of edge, if there is a direct path from city i to city j
Cij = ∞, if there is no direct path from city i to city j.
2. Convert the cost matrix to a reduced matrix by subtracting minimum values from appropriate rows and
columns, such that each row and column contains at least one zero entry.
3. Find the cost of the reduced matrix. The cost is given by the summation of the amounts subtracted from the cost matrix
to convert it into the reduced matrix.
4. Prepare the state space tree for the reduced matrix.
5. Find the least cost valued node A (i.e. E-node), by computing the reduced cost node matrix with every
remaining node.
6. If edge <i, j> is to be included, then do the following:
(a) Set all values in row i and all values in column j of A to ∞
(b) Set A[j, 1] = ∞
(c) Reduce A again, except rows and columns having all ∞ entries.
7. Compute the cost of the newly created reduced matrix as,
Cost = L + Cost(i, j) + r
where L is the cost of the parent node, Cost(i, j) is the entry A[i, j] of the parent's reduced matrix, and r is the total amount subtracted in step 6(c).
8. If all nodes are not visited then go to step 4.
The reduction procedure is described below:
Row Reduction:
Matrix M is called a reduced matrix if each of its rows and columns has at least one zero entry, or the entire row or
entire column has ∞ value. Let M represent the distance matrix of 5 cities. M can be reduced as
follows:
MRowRed = {Mij – min{Mij | 1 ≤ j ≤ n, and Mij < ∞}}
Consider the following distance matrix:

Find the minimum element from each row and subtract it from each cell of that row.

The reduced matrix would be:

Row reduction cost is the summation of all the values subtracted from each row:
Row reduction cost (M) = 10 + 2 + 2 + 3 + 4 = 21
Column reduction:
Matrix MRowRed is row reduced but not column reduced. A matrix is called column reduced if each of its
columns has at least one zero entry or all ∞ entries.

MColRed = {Mji – min{Mji | 1 ≤ j ≤ n, and Mji < ∞}}
To reduce the above matrix, we will find the minimum element from each column and subtract it from each
cell of that column.

The column reduced matrix MColRed would be:

Each row and column of MColRed has at least one zero entry, so this matrix is a reduced matrix.
Column reduction cost (M) = 1 + 0 + 3 + 0 + 0 = 4
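
A minimal Python sketch of the reduction and of the node cost from step 7 follows. The 5 x 5 matrix `M` is an assumption: the matrix figure is not reproduced here, so the entries were chosen such that the row and column reductions give the amounts 10 + 2 + 2 + 3 + 4 and 1 + 0 + 3 + 0 + 0 quoted above. The function names `reduce_matrix` and `child` are illustrative, and cities are numbered from 0 in the code.

```python
import math

INF = math.inf


def reduce_matrix(m):
    """Row-reduce, then column-reduce a copy of m; return (matrix, amount subtracted)."""
    m = [row[:] for row in m]
    n = len(m)
    total = 0
    for i in range(n):
        low = min(m[i])
        if 0 < low < INF:               # skip all-infinity rows and rows with a zero
            total += low
            m[i] = [v - low for v in m[i]]
    for j in range(n):
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            total += low
            for i in range(n):
                m[i][j] -= low
    return m, total


def child(parent, parent_cost, i, j, start=0):
    """Matrix and cost of the node reached by adding edge (i, j)."""
    m = [row[:] for row in parent]
    edge = m[i][j]
    for k in range(len(m)):
        m[i][k] = INF                   # row i
        m[k][j] = INF                   # column j
    m[j][start] = INF                   # no early return to the starting city
    m, r = reduce_matrix(m)
    return m, parent_cost + r + edge


# Assumed distance matrix for 5 cities
M = [
    [INF, 20, 30, 10, 11],
    [15, INF, 16, 4, 2],
    [3, 5, INF, 2, 4],
    [19, 6, 18, INF, 3],
    [16, 4, 7, 16, INF],
]

M1, root_cost = reduce_matrix(M)
level1 = [child(M1, root_cost, 0, j)[1] for j in range(1, 5)]
print(root_cost, level1)   # 25 [35, 53, 25, 31]
```

With this matrix the root cost is 21 + 4 = 25, and the four children of the root (edges from city 1 to cities 2, 3, 4 and 5) cost 35, 53, 25 and 31.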
The state space tree for the 5 city problem is depicted in Fig. 6.6.1. The number within a circle indicates the order in which
the node is generated, and the number on an edge indicates the city being visited.

Example
Example: Find the solution of the following travelling salesman problem using the branch and bound method.

Solution:
 The procedure for dynamic reduction is as follows:
 Draw the state space tree with the optimal reduction cost at the root node.
 Derive the cost of the path from node i to j by setting all entries in the ith row and jth column to ∞. Set M[j][1]
= ∞
 The cost of the corresponding node N for path i to j is the summation of optimal cost + reduction cost + M[i][j]
 After exploring all nodes at level i, set the node with minimum cost as the E-node and repeat the
procedure until all nodes are visited.
 The given matrix is not reduced. In order to find its reduced matrix, we will first find the row
reduced matrix followed by the column reduced matrix if needed. We can find the row reduced
matrix by subtracting the minimum element of each row from each element of the corresponding row.
The procedure is described below:
 Reduce the above cost matrix by subtracting the minimum value from each row and column.

M‘1

is not a reduced matrix. Reduce it by subtracting the minimum value from the corresponding column. Doing this we
get,

Cost of M1 = C(1)
= Row reduction cost + Column reduction cost
= (10 + 2 + 2 + 3 + 4) + (1 + 3) = 25
This means all tours in the graph have length at least 25. This is the lower bound on the cost of the path at the root node.
State space tree

Let us find the cost of the edges from node 1 to 2, 3, 4, 5.
Select edge 1-2:
Set M1[1][ ] = M1[ ][2] = ∞
Set M1[2][1] = ∞
Reduce the resultant matrix if required.

M2 is already reduced.
Cost of node 2:
C(2) = C(1) + Reduction cost + M1[1][2]
= 25 + 0 + 10 = 35
Select edge 1-3:
Set M1[1][ ] = M1[ ][3] = ∞
Set M1[3][1] = ∞
Reduce the resultant matrix if required.

Cost of node 3:
C(3) = C(1) + Reduction cost + M1[1][3]
= 25 + 11 + 17 = 53

Select edge 1-4:
Set M1[1][ ] = M1[ ][4] = ∞
Set M1[4][1] = ∞
Reduce the resultant matrix if required.

Matrix M4 is already reduced.
Cost of node 4:
C(4) = C(1) + Reduction cost + M1[1][4]
= 25 + 0 + 0 = 25
Select edge 1-5:
Set M1[1][ ] = M1[ ][5] = ∞
Set M1[5][1] = ∞
Reduce the resultant matrix if required.

Cost of node 5:
C(5) = C(1) + Reduction cost + M1[1][5]
= 25 + 5 + 1 = 31
State space diagram:

Node 4 has the minimum cost, for path 1-4. We can go to vertex 2, 3 or 5. Let’s explore all three nodes.

Select path 1-4-2: (Add edge 4-2)
Set M4[1][ ] = M4[4][ ] = M4[ ][2] = ∞
Set M4[2][1] = ∞
Reduce the resultant matrix if required.

Matrix M6 is already reduced.
Cost of node 6:
C(6) = C(4) + Reduction cost + M4[4][2]
= 25 + 0 + 3 = 28
Select edge 4-3 (Path 1-4-3):
Set M4[1][ ] = M4[4][ ] = M4[ ][3] = ∞
Set M4[3][1] = ∞
Reduce the resultant matrix if required.

M‘7

is not reduced. Reduce it by subtracting 11 from column 1.

Cost of node 7:
C(7) = C(4) + Reduction cost + M4[4][3]
= 25 + 2 + 11 + 12 = 50
Select edge 4-5 (Path 1-4-5):

Matrix M8 is reduced.
Cost of node 8:
C(8) = C(4) + Reduction cost + M4[4][5]
= 25 + 11 + 0 = 36
State space tree

Path 1-4-2 leads to the minimum cost. Let’s find the cost for the two possible paths.

Add edge 2-3 (Path 1-4-2-3):
Set M6[1][ ] = M6[4][ ] = M6[2][ ] = M6[ ][3] = ∞
Set M6[3][1] = ∞
Reduce the resultant matrix if required.

Cost of node 9:
C(9) = C(6) + Reduction cost + M6[2][3]
= 28 + 11 + 2 + 11 = 52
Add edge 2-5 (Path 1-4-2-5):
Set M6[1][ ] = M6[4][ ] = M6[2][ ] = M6[ ][5] = ∞
Set M6[5][1] = ∞
Reduce the resultant matrix if required.

Cost of node 10:
C(10) = C(6) + Reduction cost + M6[2][5]
= 28 + 0 + 0 = 28
State space tree

Add edge 5-3 (Path 1-4-2-5-3):

Cost of node 11:
C(11) = C(10) + Reduction cost + M10[5][3]
= 28 + 0 + 0 = 28
State space tree:

So we can select any of the edges. Thus the final path includes the edges <3, 1>, <5, 3>, <1, 4>, <4, 2>,
<2, 5>, which form the path 1 – 4 – 2 – 5 – 3 – 1. This path has a cost of 28.
UNIT 5
Tractable and Intractable Problems
Tractable problems refer to computational problems that can be solved efficiently using algorithms that
can scale with the input size of the problem. In other words, the time required to solve a tractable
problem increases at most polynomially with the input size.

On the other hand, intractable problems are computational problems for which no known algorithm can
solve them efficiently in the worst-case scenario. This means that the time required by the best known algorithms for an
intractable problem grows exponentially or even faster with the input size.

One example of a tractable problem is computing the sum of a list of n numbers. The time required to
solve this problem scales linearly with the input size, as each number can be added to a running
total in constant time. Another example is computing the shortest path between two nodes in a
graph with non-negative edge weights, which can be solved efficiently using algorithms like Dijkstra's algorithm or the A* algorithm.

In contrast, some well-known intractable problems include the traveling salesman problem, the
knapsack problem, and the Boolean satisfiability problem. These problems are NP-hard, meaning
that any problem in NP (the set of problems that can be solved in polynomial time using a
non-deterministic Turing machine) can be reduced to them in polynomial time. While it is possible to find
approximate solutions to some of these problems, there is no known algorithm that can solve them exactly in
polynomial time.

In summary, tractable problems are those that can be solved efficiently with algorithms that scale
well with the input size, while intractable problems are those that cannot be solved efficiently in the
worst-case scenario.

Examples of Tractable problems

1. Sorting: Given a list of n items, the task is to sort them in ascending or descending order.
Algorithms like MergeSort run in O(n log n) time, and QuickSort runs in O(n log n) time
on average.

2. Matrix multiplication: Given two matrices A and B, the task is to find their product C = AB. The
best-known algorithm for matrix multiplication runs in O(n^2.37) time complexity, and the
standard O(n^3) algorithm is already tractable for practical applications.

3. Shortest path in a graph: Given a graph G with non-negative edge weights and two nodes s and t, the task is to find the
shortest path between s and t. Dijkstra's algorithm can
solve this problem in O(m + n log n) time complexity, where m is the number of edges and n is the
number of nodes in the graph.

4. Linear programming: Given a system of linear constraints and a linear objective function, the task is
to find the values of the variables that optimize the objective function subject to the
constraints. Interior-point and ellipsoid methods solve this problem in polynomial time; the simplex
method is usually fast in practice but takes exponential time in the worst case.

5. Greedy graph coloring: Given an undirected graph G, the task is to assign a color to each node such
that no two adjacent nodes have the same color. The greedy
algorithm produces such a coloring in O(n^2) time complexity, where n is the number of nodes in the
graph, although it does not always use the fewest possible colors (finding the minimum number of colors is NP-hard).
These problems are considered tractable because algorithms exist that can solve them in polynomial time
complexity, which means that the time required to solve them grows no faster than a polynomial
function of the input size.

Examples of intractable problems

1. Traveling salesman problem (TSP): Given a set of cities and the distances between them, the task is
to find the shortest possible route that visits each city exactly once and returns to the starting city.
The best-known algorithms for solving the TSP have an exponential worst-case time
complexity, which makes it intractable for large instances of the problem.

2. Knapsack problem: Given a set of items with weights and values, and a knapsack that can
carry a maximum weight, the task is to find the most valuable subset of items that can be
carried by the knapsack. The knapsack problem is also NP-hard and is intractable for large
instances of the problem.

3. Boolean satisfiability problem (SAT): Given a boolean formula in conjunctive normal form
(CNF), the task is to determine if there exists an assignment of truth values to the variables
that makes the formula true. The SAT problem is one of the most well-known NP-complete
problems, which means that any NP problem can be reduced to SAT in polynomial time.

4. Subset sum problem: Given a set of integers and a target sum, the task is to find a subset of the
integers that sums up to the target sum. Like the knapsack problem, the subset sum
problem is also intractable for large instances of the problem.

5. Graph isomorphism problem: Given two graphs G1 and G2, the task is to determine if there
exists a one-to-one mapping between their vertices that preserves adjacency.

1. Linear search: Given a list of n items, the task is to find a specific item in the list. The time
complexity of linear search is O(n), which is a polynomial function of the input size.
2. Bubble sort: Given a list of n items, the task is to sort them in ascending or descending order. The time
complexity of bubble sort is O(n^2), which is also a polynomial function of the input size.

3. Shortest path in a graph: Given a graph G with non-negative edge weights and two nodes s and t, the task is to find the
shortest path between s and t. Dijkstra's algorithm can solve
this problem in O(m + n log n) time complexity, which is a polynomial function of the input
size.

4. Maximum flow in a network: Given a network with a source node and a sink node, and
capacities on the edges, the task is to find the maximum flow from the source to the sink.
The Ford-Fulkerson algorithm solves this problem in O(mf) time for integer capacities, where m is the number of
edges in the network and f is the value of the maximum flow; its Edmonds-Karp variant runs in O(nm^2) time, which is a polynomial function of the input
size.

5. Linear programming: Given a system of linear constraints and a linear objective function, the task is
to find the values of the variables that optimize the objective function subject to the
constraints. Interior-point and ellipsoid methods solve this problem in polynomial time.

P (Polynomial) problems

P problems refer to problems where an algorithm would take a polynomial amount of time
to solve, or where Big-O is a polynomial (i.e. O(1), O(n), O(n²), etc). These are problems that would
be considered ‘easy’ to solve, and thus do not generally have immense run times.

NP (Non-deterministic Polynomial) Problems

NP problems are those for which a proposed solution can be verified in polynomial time, even if no
polynomial-time algorithm for finding a solution is known. Every P problem is also in NP. For the hardest NP
problems, the best known exact algorithms take exponential or factorial time, such as O(2^n) or O(n!).

NP-Hard Problems

A problem is classified as NP-Hard when an algorithm for solving it can be translated to
solve any NP problem. Then we can say, this problem is at least as hard as any NP problem, but it
could be much harder or more complex.

NP-Complete Problems

NP-Complete problems are problems that live in both the NP and NP-Hard classes. This means
that NP-Complete problems can be verified in polynomial time and that any NP
problem can be reduced to this problem in polynomial time.
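
The gap between verifying and solving can be illustrated with the subset sum problem. In the hedged Python sketch below, `verify` checks a proposed answer (a list of indices, often called a certificate) in polynomial time, while `solve` falls back on trying all 2^n subsets. Both function names and the sample numbers are assumptions made for this illustration.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Check a proposed subset (given as a list of indices) in polynomial time."""
    return (len(set(certificate)) == len(certificate)            # no index used twice
            and all(0 <= i < len(numbers) for i in certificate)  # indices are valid
            and sum(numbers[i] for i in certificate) == target)


def solve(numbers, target):
    """Exhaustive search over all 2^n subsets; returns indices or None."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return list(indices)
    return None


numbers = [3, 34, 4, 12, 5, 2]
print(solve(numbers, 9))            # [2, 4], i.e. 4 + 5
print(verify(numbers, 9, [2, 4]))   # True
print(verify(numbers, 9, [0, 1]))   # False
```

Checking an answer takes time proportional to its length, whereas the search loop may have to visit every subset before it can report that no answer exists.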
Bin Packing Problem
The bin packing problem involves assigning n items of different weights to bins, each of capacity c,
such that the total number of bins used is minimized. It may be assumed that all items have
weights smaller than the bin capacity.

The following online algorithms depend on the order of their inputs. They pack the item given first and
then move on to the next input or next item.

1) Next Fit algorithm

The simplest approximate approach to the bin packing problem is the Next-Fit (NF)
algorithm. The first item is assigned to bin 1. Items
2, ..., n are then considered by increasing indices: each item is assigned to the current bin, if it fits;
otherwise, it is assigned to a new bin, which becomes the current one.

Visual Representation

Let us consider an example with bins of size 1.

Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.

The minimum number of bins required would be at least Ceil((Total Weight) / (Bin Capacity)) = Ceil(3.7/1)
= 4 bins.

The Next fit solution (NF(I)) for this instance I would be-

Considering the 0.5 sized item first, we can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new bin.

Moving on to the 0.5 sized item, we cannot place it in the current bin. Hence we place it in a new bin.

Moving on to the 0.2 sized item, we can place it in the current (third) bin.

Similarly, placing all the other items following the Next-Fit algorithm we get-

Thus we need 6 bins as opposed to the 4 bins of the optimal solution. Thus we can see that this
algorithm is not very efficient.

Analyzing the approximation ratio of Next-Fit algorithm

The time complexity of the algorithm is clearly O(n). It is easy to prove that, for any instance I of
the bin packing problem (BPP), the solution value NF(I) provided by the algorithm satisfies the bound

NF(I) < 2 z(I)

where z(I) denotes the optimal solution value. Furthermore, there exist instances for which the
ratio NF(I)/z(I) is arbitrarily close to 2, i.e. the worst-case approximation ratio of NF is r(NF)
= 2.

Pseudocode

NEXTFIT(size[], n, c)
// size[] is the array containing the sizes of the items, n is the number of items and c is the capacity of the
// bin
{
    // Initialize result (count of bins) and remaining capacity in the current bin.
    // No bin is open yet, so the first item always opens bin 1.
    res = 0
    bin_rem = 0
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // If this item can't fit in the current bin
        if (size[i] > bin_rem) {
            // Use a new bin
            res++
            bin_rem = c - size[i]
        }
        else
            bin_rem -= size[i];
    }
    return res;
}
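
A small Python sketch of Next-Fit, run on the example instance. To avoid floating-point rounding, the sizes are written in tenths (0.5 becomes 5) and the bin capacity is 10; this scaling and the name `next_fit` are choices made for the illustration.

```python
def next_fit(sizes, capacity):
    """Pack items in the given order; return the contents of each bin."""
    bins = []            # each bin is the list of item sizes placed in it
    remaining = 0        # no bin is open yet
    for size in sizes:
        if size > remaining:
            bins.append([])          # open a new bin, which becomes the current one
            remaining = capacity
        bins[-1].append(size)
        remaining -= size
    return bins


items = [5, 7, 5, 2, 4, 2, 5, 1, 6]   # sizes in tenths
print(next_fit(items, 10))
# [[5], [7], [5, 2], [4, 2], [5, 1], [6]]
```

The six bins match the count obtained in the walk-through above.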
2) First Fit algorithm

A better algorithm, First-Fit (FF), considers the items according to increasing
indices and assigns each item to the lowest indexed initialized bin into which it fits; only
when the current item cannot fit into any initialized bin is a new bin introduced.

Visual Representation

Let us consider the same example as used above and bins of size 1.

Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.

The minimum number of bins required would be at least Ceil((Total Weight) / (Bin Capacity)) = Ceil(3.7/1)
= 4 bins.

The First fit solution (FF(I)) for this instance I would be-

Considering the 0.5 sized item first, we can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new bin.

Moving on to the 0.5 sized item, we can place it in the first bin.
Moving on to the 0.2 sized item, we cannot place it in the first bin; we check the second bin and
we can place it there.

Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a new bin.

Similarly, placing all the other items following the First-Fit algorithm we get-

Thus we need 5 bins as opposed to the 4 bins of the optimal solution, but this is more efficient
than the Next-Fit algorithm.

Analyzing the approximation ratio of First-Fit algorithm

If FF(I) is the First-Fit solution for instance I and z(I) is the optimal solution, then:

It can be seen that First Fit never uses more than 1.7 * z(I) bins. So First-Fit is better than Next Fit
in terms of upper bound on number of bins.

Pseudocode

FIRSTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the number of items and c is the capacity of the
    // bin

    // Initialize result (count of bins)
    res = 0;
    // Create an array to store remaining space in bins; there can be at most n bins
    bin_rem[n];

    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the first bin that can accommodate size[i]
        int j;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i]) {
                bin_rem[j] = bin_rem[j] - size[i];
                break;
            }
        }

        // If no bin could accommodate size[i]
        if (j == res) {
            bin_rem[res] = c - size[i];
            res++;
        }
    }
    return res;
}

3) Best Fit Algorithm

The next algorithm, Best-Fit (BF), is obtained from FF by assigning the current
item to the feasible bin (if any) having the smallest residual capacity (breaking ties in
favor of the lowest indexed bin).

Simply put, the idea is to place the next item in the tightest spot. That is, put it in the bin so that the
smallest empty space is left.

Visual Representation

Let us consider the same example as used above and bins of size 1.

Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.

The minimum number of bins required would be at least Ceil((Total Weight) / (Bin Capacity)) = Ceil(3.7/1)
= 4 bins.

The Best fit solution (BF(I)) for this instance I would be-
Considering the 0.5 sized item first, we can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new bin.

Moving on to the 0.5 sized item, we can place it in the first bin tightly.

Moving on to the 0.2 sized item, we cannot place it in the first bin but we can place it in the second bin
tightly.

Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a new bin.

Similarly, placing all the other items following the Best-Fit algorithm we get-

Thus we need 5 bins as opposed to the 4 bins of the optimal solution, but this is more efficient
than the Next-Fit algorithm.

Analyzing the approximation ratio of Best-Fit algorithm
It can be noted that Best-Fit (BF) is obtained from FF by assigning the current item to the feasible
bin (if any) having the smallest residual capacity (breaking ties in favour of the lowest
indexed bin). BF satisfies the same worst-case bounds as FF.

Analysis of upper bound of Best-Fit algorithm

If z(I) is the optimal number of bins, then Best Fit never uses more than 1.7 * z(I) bins. So Best Fit is the
same as First Fit, and better than Next Fit, in terms of upper bound on number of bins.

Pseudocode

BESTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the number of items
    // and c is the capacity of the bin

    // Initialize result (count of bins)
    res = 0;

    // Create an array to store remaining space in bins; there can be at most n bins
    bin_rem[n];

    // Place items one by one
    for (int i = 0; i < n; i++) {

        // Find the best bin that can accommodate size[i]
        int j;

        // Initialize minimum space left and index of best bin
        int min = c + 1, bi = 0;

        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i] && bin_rem[j] - size[i] < min) {
                bi = j;
                min = bin_rem[j] - size[i];
            }
        }

        // If no bin could accommodate size[i], create a new bin
        if (min == c + 1) {
            bin_rem[res] = c - size[i];
            res++;
        }
        else
            // Assign the item to the best bin
            bin_rem[bi] -= size[i];
    }
    return res;
}
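
The pseudocode above can be tried out with a short Python sketch. The function name best_fit and the idea of scaling the example sizes by 10 (so that all arithmetic is exact integer arithmetic) are choices made here for illustration, and every item is assumed to be no larger than the bin capacity.

```python
def best_fit(sizes, capacity):
    """Return the number of bins Best Fit uses for items taken in the given order."""
    bin_rem = []  # remaining capacity of each open bin
    for size in sizes:
        best = -1  # index of the tightest feasible bin found so far
        for j, rem in enumerate(bin_rem):
            # A bin is feasible if the item fits; prefer the smallest remaining space.
            # The strict '<' keeps the lowest-indexed bin when there is a tie.
            if rem >= size and (best == -1 or rem < bin_rem[best]):
                best = j
        if best == -1:
            bin_rem.append(capacity - size)  # no bin fits: open a new one
        else:
            bin_rem[best] -= size
    return len(bin_rem)


# Sizes from the example, scaled by 10 (bin capacity 1 becomes 10).
items = [5, 7, 5, 2, 4, 2, 5, 1, 6]
print(best_fit(items, 10))  # 5
```

The result agrees with the 5 bins found in the walkthrough of the example.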

In the offline version, we have all items at our disposal from the start of the execution. The
natural solution is to sort the array from largest to smallest, and then apply the algorithms
discussed above.

NOTE: In the online programs we have given the inputs upfront for simplicity, but they can also
work interactively.

Let us look at the various offline algorithms.

1) First Fit Decreasing

We first sort the array of items in decreasing order of size and apply the First-Fit algorithm as
discussed above.

Algorithm

 Read the inputs of items

 Sort the array of items in decreasing order by their sizes

 Apply First-Fit algorithm

Visual Representation

Let us consider the same example as used above and bins of size 1.

Assume the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.

Sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}.

The First-Fit Decreasing solution would be -

We start with 0.7 and place it in the first bin.

We then select the 0.6 sized item. We cannot place it in bin 1. So, we place it in bin 2.

We then select the 0.5 sized item. We cannot place it in any existing bin. So, we place it in bin 3.

We then select the next 0.5 sized item. We can place it in bin 3.

Doing the same for all items, we get -

Thus only 4 bins are required, which is the same as the optimal solution.
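
A minimal Python sketch of First Fit Decreasing follows. The function names and the scaling of the example sizes by 10 (to keep the arithmetic exact) are assumptions made here, and items are assumed to fit in an empty bin.

```python
def first_fit(sizes, capacity):
    """Place each item into the lowest-indexed bin with enough room."""
    bin_rem = []  # remaining capacity of each open bin
    for size in sizes:
        for j, rem in enumerate(bin_rem):
            if rem >= size:
                bin_rem[j] -= size
                break
        else:
            # The loop ended without a break: no open bin fits, so open a new one.
            bin_rem.append(capacity - size)
    return len(bin_rem)


def first_fit_decreasing(sizes, capacity):
    """Sort from largest to smallest, then apply First Fit."""
    return first_fit(sorted(sizes, reverse=True), capacity)


items = [5, 7, 5, 2, 4, 2, 5, 1, 6]  # example sizes scaled by 10
print(first_fit(items, 10))             # 5
print(first_fit_decreasing(items, 10))  # 4
```

On this instance, sorting first saves one bin compared with applying First Fit to the items in their original order.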

2) Best Fit Decreasing

We first sort the array of items in decreasing order of size and apply the Best-Fit algorithm as
discussed above.

Algorithm

 Read the inputs of items

 Sort the array of items in decreasing order by their sizes

 Apply Best-Fit algorithm

Visual Representation

Let us consider the same example as used above and bins of size 1.

Assume the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.

Sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}.

The Best-Fit Decreasing solution would be -

We start with 0.7 and place it in the first bin.

We then select the 0.6 sized item. We cannot place it in bin 1. So, we place it in bin 2.

We then select the 0.5 sized item. We cannot place it in any existing bin. So, we place it in bin 3.

We then select the next 0.5 sized item. We can place it in bin 3.

Doing the same for all items, we get -

Thus only 4 bins are required, which is the same as the optimal solution.
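
Best Fit Decreasing can be sketched in the same way. This compact version returns the remaining capacity of every bin, so the final packing can be inspected; the function name and the scaled integer sizes are assumptions made for illustration.

```python
def best_fit_decreasing(sizes, capacity):
    """Sort items from largest to smallest, then place each one with Best Fit."""
    bin_rem = []  # remaining capacity of each open bin
    for size in sorted(sizes, reverse=True):
        feasible = [j for j, rem in enumerate(bin_rem) if rem >= size]
        if feasible:
            # min() returns the first minimum, i.e. the lowest index on ties.
            tightest = min(feasible, key=lambda j: bin_rem[j])
            bin_rem[tightest] -= size
        else:
            bin_rem.append(capacity - size)  # no bin fits: open a new one
    return bin_rem


remaining = best_fit_decreasing([5, 7, 5, 2, 4, 2, 5, 1, 6], 10)
print(len(remaining), remaining)  # 4 [0, 0, 0, 3]
```

Three of the four bins end up completely full, which is why no packing of this instance can use fewer bins.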

Approximation Algorithms for the Traveling Salesman Problem

We solved the traveling salesman problem by exhaustive search in Section 3.4, mentioned its
decision version as one of the most well-known NP-complete problems in Section 11.3, and
saw how its instances can be solved by a branch-and-bound algorithm in Section 12.2. Here,
we consider several approximation algorithms, a small sample of dozens of such algorithms
suggested over the years for this famous problem.

But first let us answer the question of whether we should hope to find a polynomial-time
approximation algorithm with a finite performance ratio on all instances of the traveling
salesman problem. As the following theorem [Sah76] shows, the answer turns out to be no, unless
P = NP.

THEOREM 1 If P != NP, there exists no c-approximation algorithm for the traveling salesman
problem, i.e., there exists no polynomial-time approximation algorithm for this problem so
that for all instances

f(sa) <= c f(s*)

for some constant c.

Nearest-neighbor algorithm

The following well-known greedy algorithm is based on the nearest-neighbor heuristic: always
go next to the nearest unvisited city.

Step 1 Choose an arbitrary city as the start.

Step 2 Repeat the following operation until all the cities have been visited: go to the unvisited
city nearest the one visited last (ties can be broken arbitrarily).

Step 3 Return to the starting city.
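
The three steps can be sketched in Python as follows. The four-city distance table is a hypothetical instance made up for illustration, and breaking ties by city name is one arbitrary choice among many.

```python
def nearest_neighbor_tour(dist, start):
    """dist[u][v] is the symmetric distance between cities u and v."""
    tour = [start]                    # Step 1: the chosen starting city
    unvisited = set(dist) - {start}
    while unvisited:
        last = tour[-1]
        # Step 2: go to the unvisited city nearest the one visited last.
        # Ties are broken by city name so the result is deterministic.
        nxt = min(unvisited, key=lambda city: (dist[last][city], city))
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)                # Step 3: return to the starting city
    return tour


def tour_length(dist, tour):
    return sum(dist[u][v] for u, v in zip(tour, tour[1:]))


# Hypothetical instance with four cities.
dist = {
    "a": {"b": 1, "c": 3, "d": 6},
    "b": {"a": 1, "c": 2, "d": 3},
    "c": {"a": 3, "b": 2, "d": 1},
    "d": {"a": 6, "b": 3, "c": 1},
}
tour = nearest_neighbor_tour(dist, "a")
print(tour, tour_length(dist, tour))  # ['a', 'b', 'c', 'd', 'a'] 10
```

Note that the greedy choices are cheap (1, 2, 1) but the forced return edge costs 6, which is the weakness of this heuristic.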

EXAMPLE 1 For the instance represented by the graph in Figure 12.10, with a as the starting
vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian
circuit) sa: a − b − c − d − a of length 10.

The optimal solution, as can be easily checked by exhaustive search, is the
tour s*: a − b − d − c − a of length 8. Thus, the accuracy ratio of this approximation is

f(sa)/f(s*) = 10/8 = 1.25

(i.e., tour sa is 25% longer than the optimal tour s*).

Unfortunately, except for its simplicity, not many good things can be said about the nearest-
neighbor algorithm. In particular, nothing can be said in general about the accuracy of
solutions obtained by this algorithm because it can force us to traverse a very long edge on
the last leg of the tour. Indeed, if we change the weight of edge (a, d) from 6 to an arbitrary
large number w ≥ 6 in Example 1, the algorithm will still yield the tour a − b − c − d − a of
length 4 + w, and the optimal solution will still be a − b − d − c − a of length 8. Hence,

f(sa)/f(s*) = (4 + w)/8,

which can be made as large as we wish by choosing an appropriately large value of w. Hence,
R_A = ∞ for this algorithm (as it should be according to Theorem 1).
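
The growth of the ratio can be checked numerically. Figure 12.10 is not reproduced in these notes, so the edge weights below are assumptions chosen only to agree with the tour lengths quoted in Example 1 (10 and 8 for w = 6); the optimal tour is found by brute force over all orderings.

```python
from itertools import permutations


def example_weights(w):
    """Assumed edge weights, consistent with the lengths quoted in Example 1."""
    return {("a", "b"): 1, ("a", "c"): 3, ("a", "d"): w,
            ("b", "c"): 2, ("b", "d"): 3, ("c", "d"): 1}


def length(weights, tour):
    # Each edge is stored once, keyed by its endpoints in alphabetical order.
    return sum(weights[tuple(sorted(edge))] for edge in zip(tour, tour[1:]))


def optimal_length(weights):
    """Exhaustive search over all tours that start and end at a."""
    return min(length(weights, ("a",) + p + ("a",)) for p in permutations("bcd"))


for w in (6, 60, 600):
    weights = example_weights(w)
    nn = length(weights, "abcda")   # the nearest-neighbor tour from the text
    opt = optimal_length(weights)
    print(w, nn, opt, nn / opt)
# 6 10 8 1.25
# 60 64 8 8.0
# 600 604 8 75.5
```

The optimal length stays at 8 because the optimal tour avoids edge (a, d), while the nearest-neighbor tour is forced to use it.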

Twice-around-the-tree algorithm

Step 1 Construct a minimum spanning tree of the graph corresponding to a given instance of the
traveling salesman problem.

Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning tree
recording all the vertices passed by. (This can be done by a DFS traversal.)

Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated occurrences of
the same vertex except the starting one at the end of the list. (This step is equivalent to making
shortcuts in the walk.) The vertices remaining on the list will form a Hamiltonian circuit,
which is the output of the algorithm.
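
A possible Python sketch of the three steps, for cities given as points in the plane, is shown below. Using Prim's algorithm for Step 1 is a choice made here (any minimum spanning tree algorithm would do), and Steps 2 and 3 are combined by listing vertices in DFS preorder, which is the walk around the tree with repeated vertices already dropped. The function names and the sample points are hypothetical.

```python
import math


def twice_around_the_tree(points, start=0):
    """points: list of (x, y) pairs. Returns (tour, mst_weight)."""
    n = len(points)

    def dist(u, v):
        return math.dist(points[u], points[v])

    # Step 1: minimum spanning tree by Prim's algorithm, stored as child lists.
    children = {v: [] for v in range(n)}
    best = {v: (dist(start, v), start) for v in range(n) if v != start}
    mst_weight = 0.0
    while best:
        v = min(best, key=lambda u: best[u][0])  # cheapest vertex to attach
        weight, parent = best.pop(v)
        children[parent].append(v)
        mst_weight += weight
        for u in best:
            d = dist(v, u)
            if d < best[u][0]:
                best[u] = (d, v)

    # Steps 2 and 3: DFS preorder records every vertex at its first visit.
    tour, stack = [], [start]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    tour.append(start)  # close the Hamiltonian circuit
    return tour, mst_weight


def tour_length(points, tour):
    return sum(math.dist(points[u], points[v]) for u, v in zip(tour, tour[1:]))


points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 3)]  # hypothetical cities
tour, mst_weight = twice_around_the_tree(points)
print(tour, tour_length(points, tour), 2 * mst_weight)
```

For points in the plane the printed tour length never exceeds twice the weight of the minimum spanning tree, since shortcuts cannot lengthen the walk when the triangle inequality holds.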

EXAMPLE 2 Let us apply this algorithm to the graph in Figure 12.11a. The minimum
spanning tree of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e). A twice-
around-the-tree walk that starts and ends at a is

a, b, c, b, d, e, d, b, a.

Eliminating the second b (a shortcut from c to d), the second d, and the third b (a shortcut
from e to a) yields the Hamiltonian circuit

a, b, c, d, e, a

of length 39.
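
The shortcut step of Example 2 can be reproduced with a few lines of Python. The helper name shortcut is made up here; it keeps the first occurrence of every vertex in the walk and then returns to the start.

```python
def shortcut(walk):
    """Keep the first occurrence of every vertex, then close the circuit."""
    seen, circuit = set(), []
    for v in walk:
        if v not in seen:
            seen.add(v)
            circuit.append(v)
    circuit.append(walk[0])  # return to the starting vertex
    return circuit


print(shortcut(list("abcbdedba")))  # ['a', 'b', 'c', 'd', 'e', 'a']
```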

The tour obtained in Example 2 is not optimal. Although that instance is small enough to find an
optimal solution by either exhaustive search or branch-and-bound, we refrained from doing
so to reiterate a general point. As a rule, we do not know what the length of an
optimal tour actually is, and therefore we cannot compute the accuracy ratio f(sa)/f(s*). For the
twice-around-the-tree algorithm, we can at least estimate it above, provided the graph is
Euclidean.

Fermat's Little Theorem:

If n is a prime number, then for every a, 1 <= a < n,

a^(n-1) ≡ 1 (mod n) OR

a^(n-1) % n = 1

Example: Since 5 is prime, 2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1],

3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5)

Since 7 is prime, 2^6 ≡ 1 (mod 7),

3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7)

5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7)
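
These congruences are easy to check with Python's built-in three-argument pow, which performs modular exponentiation. The helper name fermat_residues is made up for this illustration, and the composite number 6 is included only for contrast.

```python
def fermat_residues(n):
    """Return a^(n-1) mod n for every a with 1 <= a < n."""
    return [pow(a, n - 1, n) for a in range(1, n)]


print(fermat_residues(5))  # [1, 1, 1, 1]
print(fermat_residues(7))  # [1, 1, 1, 1, 1, 1]
print(fermat_residues(6))  # [1, 2, 3, 4, 5]
```

For the primes 5 and 7 every residue is 1, as the theorem states, while for the composite 6 most residues differ from 1.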

Algorithm

1) Repeat the following k times:

a) Pick a randomly in the range [2, n-2]

b) If gcd(a, n) ≠ 1, then return false

c) If a^(n-1) ≢ 1 (mod n), then return false

2) Return true [probably prime].
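
A minimal Python sketch of this test is given below. The function name is_probably_prime and the default k = 5 are assumptions, and the handling of n < 4 is an addition made here because the range [2, n-2] is empty for such n.

```python
import math
import random


def is_probably_prime(n, k=5):
    """Fermat test: False means composite, True means probably prime."""
    if n < 4:
        return n in (2, 3)              # corner cases, not part of the steps above
    for _ in range(k):
        a = random.randint(2, n - 2)    # step (a): both endpoints are included
        if math.gcd(a, n) != 1:         # step (b): a shares a factor with n
            return False
        if pow(a, n - 1, n) != 1:       # step (c): Fermat's congruence fails
            return False
    return True


print(is_probably_prime(97))  # True
print(is_probably_prime(9))   # False
```

A prime is never rejected, because the congruence holds for every a. A composite number can occasionally pass all k rounds, which is why the answer True only means "probably prime".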

Unlike merge sort, we don't need to merge the two sorted arrays. Thus Quicksort requires less
auxiliary space than Merge Sort, which is why it is often preferred to Merge Sort.
Choosing the pivot at random does not remove the O(n^2) worst case, but it makes that case
unlikely for every input, so the expected running time of QuickSort is O(n log n).

Algorithm for random pivoting

partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo // place for swapping
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = Random Number from lo to hi
    Swap arr[r] and arr[hi]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p-1)
        quicksort(arr, p+1, hi)
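
The pseudocode translates almost line by line into Python. The default arguments of quicksort are a convenience added here so that the function can be called with just a list.

```python
import random


def partition(arr, lo, hi):
    pivot = arr[hi]
    i = lo  # next position for an element <= pivot
    for j in range(lo, hi):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]  # put the pivot in its final place
    return i


def partition_r(arr, lo, hi):
    r = random.randint(lo, hi)  # random index, both endpoints included
    arr[r], arr[hi] = arr[hi], arr[r]
    return partition(arr, lo, hi)


def quicksort(arr, lo=0, hi=None):
    """Sort arr[lo..hi] in place using a random pivot."""
    if hi is None:
        hi = len(arr) - 1
    if lo < hi:
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)


data = [10, 3, 6, 9, 2, 4, 15, 23]
quicksort(data)
print(data)  # [2, 3, 4, 6, 9, 10, 15, 23]
```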

Finding kth smallest element

Problem Description: Given an array A[] of n elements and a positive integer K, find the Kth
smallest element in the array. It is given that all array elements are distinct.

For Example:

Input: A[] = {10, 3, 6, 9, 2, 4, 15, 23}, K = 4

Output: 6

Input: A[] = {5, -8, 10, 37, 101, 2, 9}, K = 6

Output: 37

Quick-Select: Approach similar to quicksort

This approach is similar to the quicksort algorithm where we use the partition on the input array
recursively. But unlike quicksort, which processes both sides of the array recursively, this
algorithm works on only one side of the partition. We recur for either the left or right side
according to the position of pivot.

Solution Steps

1. Partition the array A[left .. right] into two subarrays A[left .. pos] and A[pos + 1 .. right]
such that each element of A[left .. pos] is less than each element of A[pos + 1 .. right].

2. Compute the number of elements in the subarray A[left .. pos], i.e. count = pos - left + 1.

3. If (count == K), then A[pos] is the Kth smallest element.

4. Otherwise determine in which of the two subarrays A[left .. pos - 1] and A[pos + 1 .. right]
the Kth smallest element lies.

 If (count > K), then the desired element lies on the left side of the partition.
 If (count < K), then the desired element lies on the right side of the partition. Since we
already know count values that are smaller than the Kth smallest element of A[left .. right], the
desired element is the (K - count)th smallest element of A[pos + 1 .. right].

 Base case is the scenario of a single-element array, i.e. left == right. Return A[left] or A[right].

Pseudo-Code
// Original value for left = 0 and right = n-1
int kthSmallest(int A[], int left, int right, int K)
{
    if (left == right)
        return A[left]
    int pos = partition(A, left, right)
    int count = pos - left + 1
    if (count == K)
        return A[pos]
    else if (count > K)
        return kthSmallest(A, left, pos-1, K)
    else
        return kthSmallest(A, pos+1, right, K-count)
}

int partition(int A[], int l, int r)
{
    int x = A[r]
    int i = l-1
    for (j = l to r-1)
    {
        if (A[j] <= x)
        {
            i = i+1
            swap(A[i], A[j])
        }
    }
    swap(A[i+1], A[r])
    return i+1
}
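
A direct Python rendering of this pseudo-code, run on the two sample inputs from the problem description, might look like this. The snake_case names are a stylistic choice made here, and K is assumed to satisfy 1 <= K <= right - left + 1.

```python
def partition(a, left, right):
    """Partition around a[right]; return the pivot's final index."""
    x = a[right]
    i = left - 1
    for j in range(left, right):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[right] = a[right], a[i + 1]
    return i + 1


def kth_smallest(a, left, right, k):
    """Return the k-th smallest (1-based) element of a[left..right]; reorders a."""
    if left == right:
        return a[left]
    pos = partition(a, left, right)
    count = pos - left + 1  # number of elements in a[left..pos]
    if count == k:
        return a[pos]
    if count > k:
        return kth_smallest(a, left, pos - 1, k)
    return kth_smallest(a, pos + 1, right, k - count)


print(kth_smallest([10, 3, 6, 9, 2, 4, 15, 23], 0, 7, 4))  # 6
print(kth_smallest([5, -8, 10, 37, 101, 2, 9], 0, 6, 6))   # 37
```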

Complexity Analysis

Time Complexity: The worst-case time complexity for this algorithm is O(n^2), but this case
becomes unlikely if we choose the pivot element randomly. If we randomly select the pivot, the
expected time complexity would be linear, O(n).
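
One way to obtain the randomized variant is sketched below. For brevity this version is iterative and builds new lists instead of partitioning in place, which costs extra memory; that design choice and the function name quickselect are assumptions made here, not part of the pseudo-code above.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest (1-based) value, in expected O(n) time."""
    items = list(values)  # work on a copy so the input is not modified
    while True:
        pivot = random.choice(items)
        smaller = [v for v in items if v < pivot]
        larger = [v for v in items if v > pivot]
        equal = len(items) - len(smaller) - len(larger)  # copies of the pivot
        if k <= len(smaller):
            items = smaller                  # answer lies among the smaller values
        elif k <= len(smaller) + equal:
            return pivot                     # the pivot itself is the answer
        else:
            k -= len(smaller) + equal        # discard the smaller values and the pivot
            items = larger


print(quickselect([10, 3, 6, 9, 2, 4, 15, 23], 4))  # 6
print(quickselect([5, -8, 10, 37, 101, 2, 9], 6))   # 37
```

Only one side is kept after each round, so with a random pivot the amount of work shrinks geometrically on average.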
