DAA - Module I
MODULE I
INTRODUCTION
Contents
1. INTRODUCTION
What is an Algorithm?
Informal Definition:
An Algorithm is any well-defined computational procedure that takes some value or set of values
as input and produces some value or set of values as output. Thus, an algorithm is a sequence of
computational steps that transforms the input into the output.
Formal Definition:
An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining
a required output for any legitimate input in a finite amount of time.
This definition can be illustrated by a simple diagram:
• The reference to instructions in the definition implies that there is something or someone
capable of understanding and following the instructions given.
• Nowadays that role is mostly played by computers — electronic devices performing numeric
calculations — but it can equally be played by human beings.
• However, although the majority of algorithms are indeed intended for eventual
computer implementation, the notion of an algorithm does not depend on such an assumption.
PROPERTIES OF ALGORITHM.
All algorithms should satisfy the following criteria or properties.
1. INPUT → Zero or more quantities are externally supplied.
2. OUTPUT → At least one quantity is produced.
3. DEFINITENESS → Each instruction is clear and unambiguous.
4. FINITENESS → If we trace out the instructions of an algorithm, then for all cases, the
algorithm terminates after a finite number of steps.
5. EFFECTIVENESS → Every instruction must be very basic so that it can be carried out, in
principle, by a person using only pencil & paper.
• Algorithms are procedural solutions to problems. These solutions are not answers but
specific instructions for getting answers.
• The following diagram briefly illustrates the sequence of steps one typically goes
through in designing and analyzing an algorithm.
• It includes the following sequence of steps:
✓ It is important to specify exactly the range of instances the algorithm needs to handle;
otherwise the algorithm may work correctly for the majority of inputs but crash on some boundary value.
✓ A correct algorithm is not one that works most of the time, but one that works
correctly for all legitimate inputs.
• Ascertaining the capabilities of a computational device.
✓ Simplicity: Simpler algorithms are easier to understand and to program; sometimes simpler
algorithms are also more efficient than more complicated alternatives. Unfortunately,
this is not always true, in which case a judicious compromise needs to be made.
✓ Generality: Generality of the problem the algorithm solves and the set of inputs it accepts.
✓ If the algorithm is not satisfactory with respect to these properties, it is necessary to redesign it.
✓ Selection of programming language should support the features mentioned in the design
phase.
✓ Program testing: If the inputs to an algorithm are guaranteed to belong to the specified sets,
no verification is required. But when implementing algorithms as programs to be used in actual
applications, such verifications must be provided.
Algorithm Specification
1. Natural language like English: When this way is chosen, care should be taken to
ensure that each and every statement is definite.
2. Graphic representation called a flowchart: This method works well when the
algorithm is small & simple.
7. The following looping statements are employed: for, while and repeat-until.
While Loop:
While < condition > do
{
<statement-1>
...
<statement-n>
}
For Loop:
for variable: = value-1 to value-2 step step do
{
<statement-1>
...
<statement-n>
}
repeat-until:
repeat
<statement-1>
...
<statement-n>
until<condition>
8. A conditional statement has the following forms.
✓ If <condition> then <statement>
✓ If <condition> then <statement-1> else <statement-2>
✓ Case statement:
case
{
: <condition-1> : <statement-1>
...
: <condition-n> : <statement-n>
: else : <statement-n+1>
}
9. Input and output are done using the instructions read & write.
10. There is only one type of procedure:
✓ Algorithm, the heading takes the form, Algorithm Name (Parameter lists)
✓ As an example, the following algorithm finds & returns the maximum of n given numbers:
Algorithm Max(A,n)
// A is an array of size n
{
Result := A[1];
for i:= 2 to n do
if A[i] > Result then
Result :=A[i];
return Result;
}
In this algorithm (named Max), A & n are procedure parameters;
Result & i are local variables.
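The pseudocode above translates almost line for line into a runnable sketch (Python is used here purely for illustration; switching to 0-based indexing is the only change from the pseudocode):

```python
def max_element(A):
    # Mirrors Algorithm Max(A, n): scan once, keeping the largest value seen.
    result = A[0]                  # pseudocode's Result := A[1]
    for i in range(1, len(A)):     # pseudocode's for i := 2 to n
        if A[i] > result:
            result = A[i]
    return result

print(max_element([3, 9, 2, 7]))   # 9
```

As in the pseudocode, result and i are local to the procedure, while A (and implicitly its length n) are the parameters.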
Analysis Framework
♦ An algorithm's efficiency is investigated as a function of some parameter n indicating the size of
the algorithm's input. For example, it will be the size of the list for problems of sorting, searching,
finding the list's smallest element, and most other problems dealing with lists.
♦ For the problem of evaluating a polynomial p(x) of degree n, it will be the polynomial's
degree or the number of its coefficients, which is larger by one than its degree.
Dr. Abhijith H V, Dept. of ISE, JSSATEB
DESIGN AND ANALYSIS OF ALGORITHMS (18CS42) MODULE I-INTRODUCTION
♦ There are situations, of course, where the choice of a parameter indicating an input size
does matter.
♦ Example - computing the product of two n-by-n matrices.There are two natural measures of
size for this problem.
♦ Since there is a simple formula relating these two measures, we can easily switch from one
to the other, but the answer about an algorithm's efficiency will be qualitatively different
depending on which of the two measures we use.
♦ The choice of an appropriate size metric can be influenced by operations of the algorithm in
question. For example, how should we measure an input's size for a spell-checking algorithm?
If the algorithm examines individual characters of its input, then we should measure the size
by the number of characters; if it works by processing words, we should count their
number in the input.
♦ We should make a special note about measuring size of inputs for algorithms involving
properties of numbers (e.g., checking whether a given integer n is prime).
Step 4: Let cop be the execution time of an algorithm's basic operation on a particular computer, and let
C(n) be the number of times this operation needs to be executed for this algorithm. Then we can estimate
the running time T(n) of a program implementing this algorithm on that computer by the formula
T(n) ≈ cop × C(n).
Total number of steps for basic operation execution, C(n) = n
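As a quick sanity check of T(n) ≈ cop × C(n): with C(n) = n, doubling the input size should double the estimated running time. The sketch below assumes a hypothetical 10 ns cost per basic operation (an illustrative figure, not a measurement):

```python
c_op = 1e-8                # assumed time of one basic operation (10 ns)

def estimated_time(n):
    # T(n) ≈ c_op * C(n), with C(n) = n as stated above.
    return c_op * n

ratio = estimated_time(2_000_000) / estimated_time(1_000_000)
print(ratio)  # 2.0
```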
✓ Orders Of Growth
♦A difference in running times on small inputs is not what really distinguishes efficient algorithms
from inefficient ones.
♦ When we have to compute, for example, the greatest common divisor of two small numbers, it
is not immediately clear how much more efficient Euclid's algorithm is compared to the other two
algorithms discussed in the previous section, or even why we should care which of them is faster and
by how much. It is only when we have to find the greatest common divisor of two large
numbers that the difference in algorithm efficiencies becomes both clear and important.
♦ For large values of n, it is the function's order of growth that counts: look at the table below, which
contains values of a few functions particularly important for the analysis of algorithms.
Table: Values (some approximate) of several functions important for analysis of algorithms
♦ Here is the algorithm's pseudo code, in which, for simplicity, a list is implemented as an array. (It
also assumes that the second condition will not be checked if the first one, which checks that the
array's index does not exceed its upper bound, fails.)
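The pseudocode itself does not appear in these notes; a minimal sketch of the sequential search being described (Python, 0-based indexing assumed) might look like this. Python's short-circuit `and` plays the role of the parenthetical remark: A[i] is never read once i reaches the end of the array.

```python
def sequential_search(A, K):
    # Compare K with successive elements until a match or the end of the list.
    i = 0
    while i < len(A) and A[i] != K:   # second test skipped if the first fails
        i += 1
    return i if i < len(A) else -1    # index of K, or -1 if K is absent

print(sequential_search([4, 8, 15, 16], 15))  # 2
print(sequential_search([4, 8, 15, 16], 99))  # -1
```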
♦ Clearly, the running time of this algorithm can be quite different for the same list size n.
Worst case efficiency
♦ The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n,
which is an input (or inputs) of size n for which the algorithm runs the longest among all possible
inputs of that size.
♦ In the worst case, the input might be such that there are no matching elements or the first
matching element happens to be the last one on the list; in this case the algorithm makes the largest
number of key comparisons among all possible inputs of size n:
Cworst(n) = n.
♦ The way to determine the worst-case efficiency is quite straightforward:
♦ Analyze the algorithm to see what kind of inputs yield the largest value of the basic
operation's count C(n) among all possible inputs of size n, and then compute this worst-case value
Cworst(n).
♦ The worst-case analysis provides very important information about an algorithm's efficiency by
bounding its running time from above. In other words, it guarantees that for any instance of size n,
the running time will not exceed Cworst(n), its running time on the worst-case inputs.
Best case Efficiency
♦ The best-case efficiency of an algorithm is its efficiency for the best-case input of size n, which
is an input (or inputs) of size n for which the algorithm runs the fastest among all possible inputs of
that size.
♦ We can analyze the best case efficiency as follows.
♦ First, determine the kind of inputs for which the count C (n) will be the smallest among all
possible inputs of size n. (Note that the best case does not mean the smallest input; it means the
input of size n for which the algorithm runs the fastest.)
♦ Then ascertain the value of C (n) on these most convenient inputs.
♦ Example- for sequential search, best-case inputs will be lists of size n with their first element equal
to a search key; accordingly, Cbest(n) = 1.
♦ The analysis of the best-case efficiency is not nearly as important as that of the worst-case
efficiency.
♦ But it is not completely useless. For example, there is a sorting algorithm (insertion sort) for which
the best-case inputs are already sorted arrays on which the algorithm works very fast.
♦ Thus, such an algorithm might well be the method of choice for applications dealing with almost
sorted arrays. And, of course, if the best-case efficiency of an algorithm is unsatisfactory, we can
immediately discard it without further analysis.
Average case efficiency
♦ It yields information about an algorithm's behaviour on a "typical" or "random" input.
♦To analyze the algorithm's average-case efficiency, we must make some assumptions about possible
inputs of size n.
♦The investigation of the average case efficiency is considerably more difficult than
investigation of the worst case and best case efficiency.
♦ It involves dividing all instances of size n into several classes so that for each instance of a
class the number of times the algorithm's basic operation is executed is the same.
♦Then a probability distribution of inputs needs to be obtained or assumed so that the expected
value of the basic operation's count can then be derived. The average number of key comparisons Cavg(n)
can be computed as follows,
♦Let us consider again sequential search. The standard assumptions are,
♦Let p be the probability of successful search.
♦ In the case of a successful search, the probability of the first match occurring in the ith position
of the list is p/n for every i, where 1<=i<=n and the number of comparisons made by the algorithm
in such a situation is obviously i.
♦ In the case of an unsuccessful search, the number of comparisons is n, with the probability of
such a search being (1 − p). Therefore,
Cavg(n) = [1·p/n + 2·p/n + … + n·p/n] + n·(1 − p) = (p/n)·n(n + 1)/2 + n(1 − p) = p(n + 1)/2 + n(1 − p).
♦ Example, if p = 1 (i.e., the search must be successful), the average number of key comparisons
made by sequential search is (n + 1) /2.
♦ If p = 0 (i.e., the search must be unsuccessful), the average number of key comparisons will be
n because the algorithm will inspect all n elements on all such inputs.
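These special cases can be checked numerically. The sketch below (Python, illustrative) computes the expected comparison count directly from the assumed probabilities and reproduces Cavg(n) = (n + 1)/2 for p = 1 and Cavg(n) = n for p = 0:

```python
def avg_comparisons(n, p):
    # Successful search: first match at position i (1..n) with probability p/n,
    # costing i comparisons. Unsuccessful search: probability 1 - p, cost n.
    successful = sum(i * (p / n) for i in range(1, n + 1))
    unsuccessful = n * (1 - p)
    return successful + unsuccessful

n = 10
print(avg_comparisons(n, 1.0))  # ~5.5  -> (n + 1) / 2
print(avg_comparisons(n, 0.0))  # 10.0  -> n
```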
Recapitulation of the Analysis Framework
1. Both time and space efficiencies are measured as functions of the algorithm's input size.
2. Time efficiency is measured by counting the number of times the algorithm's basic operation
is executed. Space efficiency is measured by counting the number of extra memory units
consumed by the algorithm.
3. The efficiencies of some algorithms may differ significantly for inputs of the same size. For
such algorithms, we need to distinguish between the worst-case, average-case, and best-case
efficiencies.
4. The framework's primary interest lies in the order of growth of the algorithm's running time
(and extra memory units consumed) as its input size goes to infinity.
2. PERFORMANCE ANALYSIS
Space complexity
• The total amount of computer memory required by an algorithm to complete its execution is called
the space complexity of that algorithm. It has two components:
✓ A fixed part that is independent of the input and output. This includes memory space for
code, variables, constants and so on.
✓ A variable part that depends on the input, output and recursion stack. (We call these
parameters the instance characteristics.)
• Space requirement S(P) of an algorithm P:
S(P) = c + Sp(instance characteristics)
• For example, for an algorithm that simply computes a + b + c, the fixed component depends on the
sizes of a, b and c, and the instance characteristics give Sp = 0.
Time Complexity
• The time T(P) taken by a program P is the sum of the compile time and the run
time (execution time).
✓ The compile time does not depend on the instance characteristics. Also, we may assume that
a compiled program will be run several times without recompilation.
• Usually, the execution time or run-time of the program is referred to as its time complexity,
denoted by tp (instance characteristics). This is the sum of the time taken to execute all instructions
in the program.
✓ Exact estimation runtime is a complex task, as the number of instruction executed is
dependent on the input data. Also different instructions will take different time to execute. So
for the estimation of the time complexity we count only the number of program steps.
✓ A program step is loosely defined as a syntactically or semantically meaningful segment of the
program that has an execution time independent of the instance characteristics. For
example, a comment counts as zero steps; an assignment statement counts as one step; and in an
iterative statement such as the for, while & repeat-until statements, the step count is counted
only for the control part of the statement.
• We can determine the steps needed by a program to solve a particular problem instance in
two ways
✓ In the first method we introduce a new variable, count, into the program, initialized
to zero. We also introduce statements into the program to increment count by the appropriate
amount. So each time the original program executes a statement, count is incremented by that
statement's step count.
✓ Example-1: Consider the algorithm Sum( ). After the introduction of the count the program
will be as follows.
From the above we can estimate that an invocation of Sum( ) executes a total of
2n + 3 steps.
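The instrumented version of Sum( ) is not reproduced in these notes. A Python sketch of what it looks like is below; the step placement follows the usual convention (one step for the initialization s := 0, one per loop-control test, one per addition, and one for the return), which is what yields the 2n + 3 total:

```python
def sum_with_count(a):
    # Instrumented Sum(): `count` is incremented wherever the original
    # algorithm would execute one program step.
    count = 0
    s = 0
    count += 1          # s := 0
    for x in a:
        count += 1      # loop-control test
        s += x
        count += 1      # s := s + a[i]
    count += 1          # final (failing) loop test
    count += 1          # return statement
    return s, count

s, steps = sum_with_count([1, 2, 3, 4, 5])
print(s, steps)  # 15 13, i.e. steps = 2n + 3 for n = 5
```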
▪ When analyzing a recursive program for its step count, we often obtain a recursive
formula for the step count, for example
▪ These recursive formulas are referred to as recurrence relations. One way of solving
them is to make repeated substitutions for each occurrence of the function tRSum on the right-
hand side until all such occurrences disappear.
3. ASYMPTOTIC NOTATIONS
The efficiency analysis framework concentrates on the order of growth of an algorithm's basic
operation count as the principal indicator of the algorithm's efficiency. To compare and rank such
orders of growth, computer scientists use three notations: O (big oh), Ω (big omega), and
Θ (big theta).
Informal Introduction
Informally, O(g(n)) is the set of all functions with a lower or same order of growth as
g(n) (to within a constant multiple, as n goes to infinity). Thus, to give a few examples, the following
assertions are all true:
n ∈ O(n^2), 100n + 5 ∈ O(n^2), (1/2)n(n − 1) ∈ O(n^2).
Indeed, the first two functions are linear and hence have a lower order of growth than g(n) = n^2,
while the last one is quadratic and hence has the same order of growth as n^2. On the other hand,
n^3 ∉ O(n^2), 0.00001n^3 ∉ O(n^2), n^4 + n + 1 ∉ O(n^2).
Indeed, the functions n^3 and 0.00001n^3 are both cubic and hence have a higher order of
growth than n^2, and so has the fourth-degree polynomial n^4 + n + 1.
The second notation, Ω(g(n)), stands for the set of all functions with a higher or same order of
growth as g(n) (to within a constant multiple, as n goes to infinity).
For example,
n^3 ∈ Ω(n^2), (1/2)n(n − 1) ∈ Ω(n^2), but 100n + 5 ∉ Ω(n^2).
Finally, Θ(g(n)) is the set of all functions that have the same order of growth as g(n). Thus, every
quadratic function an^2 + bn + c with a > 0 is in Θ(n^2).
Analysis (of the algorithm for finding the largest element, given earlier as Max):
1. Input size: the number of elements = n (size of the array)
2. Two operations can be considered as the basic operation, i.e.,
a) Comparison: A[i] > maxval
b) Assignment: maxval ← A[i]
Here the comparison statement is considered to be the basic operation of the algorithm.
3. No best, worst, or average cases, because the number of comparisons is the same for all arrays
of size n and does not depend on the type of input.
4. Let C(n) denote the number of comparisons: the algorithm makes one comparison on each execution
of the loop, which is repeated for each value of the loop's variable i within the bounds 1
and n − 1.
This is an easy sum to compute because it is nothing other than 1 repeated n − 1 times. Thus,
C(n) = Σ(i = 1 to n − 1) 1 = n − 1 ∈ Θ(n).
Example: Element uniqueness problem-Checks whether the elements in the array are distinct or
not.
Analysis
1. Input size: number of elements = n (size of the array)
2. Basic operation: Comparison
3. Best, worst and average cases exist.
Worst-case input is an array yielding the largest number of comparisons:
• an array with no equal elements, or
• an array whose last two elements are the only pair of equal elements.
4. Let C(n) denote the number of comparisons in the worst case: the algorithm makes one comparison for
each repetition of the innermost loop, i.e., for each value of the loop's variable j between its limits
i + 1 and n − 1; and this is repeated for each value of the outer loop, i.e., for each value of the loop's
variable i between its limits 0 and n − 2. Hence
Cworst(n) = Σ(i = 0 to n − 2) Σ(j = i + 1 to n − 1) 1 = Σ(i = 0 to n − 2) (n − 1 − i) = n(n − 1)/2 ∈ Θ(n^2).
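A runnable sketch of the element-uniqueness algorithm (Python, illustrative) with a comparison counter; on a worst-case input — an array with no equal elements — the count comes out to n(n − 1)/2:

```python
def unique_elements(A):
    # Definition-based check: compare every pair A[i], A[j] with i < j,
    # counting each key comparison (the basic operation).
    n = len(A)
    comparisons = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1
            if A[i] == A[j]:
                return False, comparisons   # early exit on a duplicate
    return True, comparisons

distinct, c = unique_elements([10, 20, 30, 40, 50, 60])
print(distinct, c)  # True 15, i.e. n(n - 1)/2 for n = 6
```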
Example 3: Given two n × n matrices A and B, find the time efficiency of the definition-based
algorithm for computing their product C = AB. By definition, C is an n × n matrix whose elements
are computed as the scalar (dot) products of the rows of matrix A and the columns of matrix B:
Analysis: The input size is the matrix order n, and the basic operation is the multiplication executed in
the innermost loop; it runs once for every triple (i, j, k), so M(n) = Σi Σj Σk 1 = n^3 ∈ Θ(n^3).
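A definition-based sketch (Python, illustrative) that counts the multiplications confirms that exactly n^3 of them are performed for n × n matrices:

```python
def mat_mul(A, B):
    # C[i][j] is the dot product of row i of A and column j of B.
    n = len(A)
    mults = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1            # the basic operation
    return C, mults

C, m = mat_mul([[1, 2], [3, 4]], [[1, 0], [0, 1]])
print(C, m)  # [[1, 2], [3, 4]] 8, i.e. n^3 = 8 multiplications for n = 2
```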
Example 4: The following algorithm finds the number of binary digits in the binary
representation of a positive decimal integer.
✓ First, notice that the most frequently executed operation here is not inside the while loop but
rather the comparison n > 1 that determines whether the loop's body will be executed.
✓ The number of times this comparison is executed is larger than the number of
repetitions of the loop's body by exactly 1.
Step 3: Check whether the number of times the basic operation is executed depends only on the input
size n. If it also depends on the type of input then investigate worst, average, and best case efficiency
separately.
✓ We can observe that the comparison count depends only on the input size, not on the type of input,
so we do not have to investigate the worst-case, average-case, and best-case efficiencies separately.
Step 4: Set up a summation for C(n) reflecting the number of times the algorithm's basic operation is
executed.
✓ A more significant feature of this example is the fact that the loop variable takes on only a few
values between its lower and upper limits;
✓ Therefore, we have to use an alternative way of computing the number of times the body
of the loop is executed.
✓ Since the value of n is about halved on each repetition of the loop, the answer should be
about log2 n
Step 5: Simplify the summation using standard formulas and express C(n) using order of growth.
✓ The body of the loop is executed ⌊log₂ n⌋ times.
✓ Therefore the comparison n > 1 is executed one more time than the body of the loop:
C(n) = ⌊log₂ n⌋ + 1 ∈ Θ(log n).
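A runnable sketch of this bit-counting loop (Python, illustrative), which also reports how many times the comparison n > 1 is evaluated:

```python
def binary_digits(n):
    # Counts the binary digits of a positive integer n.
    # `body` counts loop-body executions; the comparison n > 1 is
    # evaluated body + 1 times (including the final, failing test).
    count = 1
    body = 0
    while n > 1:
        body += 1
        count += 1
        n //= 2
    return count, body + 1

bits, comps = binary_digits(100)
print(bits, comps)  # 7 7, i.e. floor(log2(100)) + 1 = 7
```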
Algorithm Factorial(n)
//Purpose – Computes n! factorial recursively
//Input – A non negative integer n
//Output – The value of n!
{
if (n==0) // base case
return 1;
else
return Factorial(n-1)*n;
}
Analysis:
Let M(n) denote the number of multiplications. From the recursive call, M(n) = M(n − 1) + 1 for
n > 0, with initial condition M(0) = 0. Solving by backward substitution:
M(n) = M(n − 1) + 1 substitute M(n − 1) = M(n − 2) + 1
= (M(n − 2) + 1) + 1
= M(n − 2) + 2 substitute M(n − 2) = M(n − 3) + 1
= (M(n-3) + 1) + 2
= M(n-3) + 3
…
The general formula for the above pattern for some ‘i’ is as follows:
= M(n-i) + i
By taking the advantage of the initial condition given i.e., M(0)=0, we now equate n-i=0 and obtain
i=n.
Substitute i=n in the patterns formula to get the ultimate result of the backward substitutions.
=M(n-n)+n
= M(0) + n
M(n) = n
The number of multiplications needed to compute the factorial of n is n, so the time complexity is
linear.
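The result M(n) = n can be checked with a version that returns the multiplication count alongside n! (Python sketch, illustrative):

```python
def factorial(n):
    # Returns (n!, M(n)) where M(n) is the number of multiplications used.
    if n == 0:
        return 1, 0                  # base case: 0! = 1, no multiplications
    value, m = factorial(n - 1)
    return value * n, m + 1          # one extra multiplication per level

value, mults = factorial(5)
print(value, mults)  # 120 5, confirming M(n) = n
```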
In this puzzle, we have n disks of different sizes that can slide onto any of three pegs. Initially, all
the disks are on the first peg in order of size, the largest on the bottom and the smallest on top.
The goal is to move all the disks to the third peg, using the second one as an auxiliary, if necessary.
We can move only one disk at a time, and it is forbidden to place a larger disk on top of a smaller
one.
Algorithm TowerHanoi(n,src,aux,dest)
//Purpose: to Move n disks from source peg to destination peg recursively
//Input: ‘n’ disks on source peg in order of size, the largest on the bottom and the smallest on
top
//Output: ‘n’ disks on destination peg in order of size, the largest on the bottom and the
smallest on top
{
if (n = = 1) then
move disk from src to dest
else
TowerHanoi(n- 1, src, dest, aux) // Step 1
move disk from source to dest // Step 2
TowerHanoi(n - 1, aux, src,dest) // Step 3
}
Analysis:
Let us apply the general plan outlined above to the Tower of Hanoi problem.
1. The number of disks n is the obvious choice for the input's size indicator.
2. Moving one disk is the algorithm's basic operation.
3. Clearly, the number of moves M(n) depends on n only, and we get the following recurrence
equation for it:
M(n) = 2M(n − 1) + 1
Substitute M(n − 1) = 2M(n − 2) + 1 in the above equation:
M(n) = 2(2M(n − 2) + 1) + 1
M(n) = 2^2·M(n − 2) + 2^1 + 2^0
The general formula for the above pattern, for some i, is as follows:
M(n) = 2^i·M(n − i) + 2^(i−1) + … + 2^2 + 2^1 + 2^0
Taking advantage of the initial condition M(1) = 1, we equate n − i = 1 and substitute
i = n − 1 in the above equation:
M(n) = 2^(n−1)·M(n − (n − 1)) + 2^((n−1)−1) + … + 2^2 + 2^1 + 2^0
M(n) = 2^(n−1)·M(1) + 2^(n−2) + … + 2^2 + 2^1 + 2^0
M(n) = 2^(n−1) + 2^(n−2) + … + 2^2 + 2^1 + 2^0
M(n) = 2^0 + 2^1 + 2^2 + … + 2^(n−2) + 2^(n−1)
This is a geometric series with first term a = 2^0 = 1, common ratio r = 2 and n terms, so
M(n) = (2^n − 1)/(2 − 1) = 2^n − 1.
Thus, we have an exponential algorithm, which will run for an unimaginably long time even for
moderate values of n .This is not due to the fact that this particular algorithm is poor; in fact, it is not
difficult to prove that this is the most efficient algorithm possible for this problem. It is the
problem‘s intrinsic difficulty that makes it so computationally hard.
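The exponential move count M(n) = 2^n − 1 can be confirmed by running the algorithm and recording every single-disk move (Python sketch, illustrative):

```python
def tower_of_hanoi(n, src, aux, dest, moves):
    # Same structure as Algorithm TowerHanoi; each append is one disk move.
    if n == 1:
        moves.append((src, dest))
    else:
        tower_of_hanoi(n - 1, src, dest, aux, moves)  # Step 1
        moves.append((src, dest))                     # Step 2
        tower_of_hanoi(n - 1, aux, src, dest, moves)  # Step 3
    return moves

for n in (1, 2, 3, 4, 5):
    print(n, len(tower_of_hanoi(n, 'A', 'B', 'C', [])))  # M(n) = 2^n - 1
```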
EXAMPLE 3: A recursive version of the algorithm that computes the number of binary digits in n's
binary representation.
✓ Let us set up a recurrence and an initial condition for the number of additions A(n) made by the
algorithm.
✓ The recurrence is A(n) = A(⌊n/2⌋) + 1 for n > 1.
✓ Since the recursive calls end when n is equal to 1 and no additions are made at that point, the
initial condition is
A(1) = 0.
✓ The presence of ⌊n/2⌋ in the function's argument makes the method of backward substitutions
stumble on values of n that are not powers of 2.
✓ Therefore, the standard approach to solving such a recurrence is to solve it only for n = 2^k and
then take advantage of the theorem called the smoothness rule, which claims that under very
broad assumptions the order of growth observed for n = 2^k gives a correct answer about the order
of growth for all values of n. (Alternatively, after getting a solution for powers of 2, we can
sometimes fine-tune this solution to get a formula valid for an arbitrary n.)
✓ Applying this to our recurrence for n = 2^k, it takes the form A(2^k) = A(2^(k−1)) + 1 with
A(2^0) = 0, whose solution by backward substitution is A(2^k) = k; hence A(n) = log₂ n ∈ Θ(log n).
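A runnable sketch of the recursive bit-count algorithm (Python, illustrative) confirms that the number of additions is A(n) = ⌊log₂ n⌋:

```python
def bin_rec(n):
    # Returns (number of binary digits of n, number of additions A(n)).
    if n == 1:
        return 1, 0                  # base case: one digit, no additions
    digits, adds = bin_rec(n // 2)   # n // 2 is floor(n/2)
    return digits + 1, adds + 1      # one addition per recursive level

for n in (1, 2, 8, 100):
    digits, adds = bin_rec(n)
    print(n, digits, adds)           # adds = floor(log2(n))
```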
Brute force is a straightforward approach to solving a problem, usually directly based on the problem
statement and definitions of the concepts involved.
Selection Sort
Selection sort starts by scanning the entire given list to find its smallest element and exchange it with the
first element, putting the smallest element in its final position in the sorted list. Then scan the list, starting
with the second element, to find the smallest among the last n − 1 elements and exchange it with the second
element, putting the second smallest element in its final position. Generally, on the ith pass through the list,
which we number from 0 to n − 2, the algorithm searches for the smallest item among the last n − i elements
and swaps it with A[i]:
Example:
89, 45, 68, 90, 29, 34, 17
The analysis of selection sort is straightforward. The input size is given by the number of elements n; the
basic operation is the key comparison A[j] < A[min]. The number of times it is executed depends only on
the array size and is given by the following sum:
C(n) = Σ(i = 0 to n − 2) Σ(j = i + 1 to n − 1) 1 = n(n − 1)/2 ∈ Θ(n^2).
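A Python sketch of selection sort (illustrative), run on the example list from these notes; the counter confirms n(n − 1)/2 = 21 key comparisons for n = 7:

```python
def selection_sort(A):
    # On pass i, find the smallest element of A[i..n-1] and swap it into A[i].
    n = len(A)
    comparisons = 0
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):
            comparisons += 1                  # the key comparison A[j] < A[min]
            if A[j] < A[min_idx]:
                min_idx = j
        A[i], A[min_idx] = A[min_idx], A[i]   # put the minimum in position i
    return A, comparisons

data = [89, 45, 68, 90, 29, 34, 17]
print(selection_sort(data))  # ([17, 29, 34, 45, 68, 89, 90], 21)
```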
Sequential Search
Input size: n
Basic operation: comparison
Best, worst and average cases exist, since the basic-operation count varies with the type of input.
Best case: Cbest(n) = 1 ∈ Ω(1)
Worst case: Cworst(n) = n ∈ O(n)
Average Case:
✓ Let p be the probability of successful search.
✓ In the case of a successful search, the probability of the first match occurring in the ith position
of the list is p/n for every i, where 1<=i<=n
✓ The number of comparisons made by the algorithm in such a situation is obviously i.
✓ In the case of an unsuccessful search, the number of comparisons is n with the probability of
such a search being (1 - p).
✓ Therefore Cavg(n) = (expected comparisons for a successful search) + (expected comparisons for an unsuccessful search):
Cavg(n) = [1·p/n + 2·p/n + 3·p/n + … + i·p/n + … + n·p/n] + n·(1 − p)
= (p/n)·[1 + 2 + 3 + … + i + … + n] + n(1 − p)
= (p/n)·n(n + 1)/2 + n(1 − p)
= p(n + 1)/2 + n(1 − p)
Example: