Concept of Algorithm
A common man's belief is that a computer can do anything and everything that he imagines. It is very difficult to make people realize that it is not really the computer but the man behind the computer who does everything.
In the modern internet world, a man feels that just by entering what he wants to search into the computer he can get the information he desires, and he believes that this is done by the computer. A common man seldom understands that a man-made procedure called search has done the entire job, and that the only support provided by the computer is executional speed and organized storage of information.
In the above instance, a designer of the information system should know what one frequently searches for. He should make a structured organization of all those details to store in the memory of the computer. Based on the requirement, the right information is brought out. This is accomplished through a set of instructions created by the designer of the information system to search for the right information matching the requirement of the user. This set of instructions is termed a program. It should be evident by now that it is not the computer that generates the program automatically; it is the designer of the information system who has created it.
Thus, the program is the one which, through the medium of the computer, executes to perform all the activities desired by a user. This implies that, while solving a problem using a computer, programming the computer is more important than the computer itself, and this part of programming has to be done by the man behind the computer. Even at this stage, one should not quickly jump to the conclusion that coding is programming. Coding is perhaps the last stage in the process of programming. Programming involves various activities from the stage of conceiving the problem up to the stage of creating a model to solve the problem. The formal representation of this model as a sequence of instructions is called an algorithm, and a coded algorithm in a specific computer language is called a program.
One can now see that the focus has shifted from the computer to computer programming and then to creating an algorithm. This is algorithm design, the heart of problem solving.
Characteristics Of Algorithm
Let us try to present the scenario of a man brushing his own teeth (natural denture) as an algorithm as follows.

Step 1. Take the brush
Step 2. Apply the paste
Step 3. Start brushing
Step 4. Rinse
Step 5. Wash
Step 6. Stop
If one goes through these 6 steps without being aware of the statement of the problem, he could possibly feel that this is the algorithm for cleaning a toilet. This is because of several ambiguities in comprehending each step. Step 1 may imply a tooth brush, a paint brush, a toilet brush, etc. Such an ambiguity does not make an instruction an algorithmic step. Thus every step should be made unambiguous. An unambiguous step is called a definite instruction. Even if step 2 is rewritten as "apply the tooth paste" to eliminate ambiguities, conflicts such as where to apply the tooth paste and where the source of the tooth paste is still need to be resolved. Hence, the act of applying the toothpaste, as stated, cannot be carried out. Although unambiguous, such unrealizable steps cannot be included as algorithmic instructions, as they are not effective. The definiteness and effectiveness of an instruction imply the successful termination of that instruction. However, these two alone may not be sufficient to guarantee the termination of the algorithm. Therefore, while designing an algorithm, care should be taken to provide a proper termination for the algorithm.
Thus, every algorithm should have the following five characteristic features:
Input
Output
Definiteness
Effectiveness
Termination
Therefore, an algorithm can be defined as a sequence of definite and effective instructions which terminates with the production of correct output from the given input. In other words, viewed a little more formally, an algorithm is a step-by-step formalization of a mapping function to map an input set onto an output set. The problem of writing down the correct algorithm for the above problem of brushing the teeth is left to the reader. For the purpose of clarity in understanding, let us consider the following examples.

Example 1:
Problem : finding the largest value among n>=1 numbers.
Input : the value of n and n numbers
Output : the largest value
Steps :
1. Let the value of the first number be the largest value, denoted by BIG.
2. Let R denote the number of remaining numbers. R = n-1.
3. If R != 0, the list is still not exhausted. Therefore look at the next number, called NEW.
4. Now R becomes R-1.
5. If NEW is greater than BIG, then replace BIG by the value of NEW.
6. Repeat steps 3 to 5 until R becomes zero.
7. Print BIG.
8. Stop.
End of algorithm.

Example 2: quadratic equation.
Example 3: listing all prime numbers between two limits n1 and n2.

1.2.1 Algorithmic Notations

In this section we present the pseudocode that we use throughout the book to describe algorithms. The pseudocode used resembles PASCAL and C language control structures. Hence, it is expected that the reader be aware of PASCAL/C. Even otherwise, at least now it is required that the reader should know, preferably, C to practically test the algorithms in this course work. However, for the sake of completeness we present the commonly employed control constructs present in the algorithms.

A conditional statement has the following form:

If <condition> then
  Block 1
Else
  Block 2
If end.

This pseudocode executes Block 1 if the condition is true, otherwise Block 2 is executed. The two types of loop structures are counter based and condition based, and they are as follows:

For variable = value1 to value2 do
  Block
For end

Here the block is executed for all the values of the variable from value1 to value2.

There are two types of conditional looping, while type and repeat type.

While (condition) do
  Block
While end.

Here the block gets executed as long as the condition is true.

Repeat
  Block
Until <condition>

Here the block is executed as long as the condition is false. It may be observed that the block is executed at least once in the repeat type.

Exercise 1:
Devise algorithms for the following and verify whether they satisfy all the features.
1. An algorithm that inputs three numbers and outputs them in ascending order.
2. To test whether three numbers represent the sides of a right angle triangle.
3. To test whether a given point p(x,y) lies on the x-axis or y-axis or in the I/II/III/IV quadrant.
4. To compute the area of a circle of a given circumference.
5. To locate a specific word in a dictionary.
Numerical Algorithm
If there is more than one possible way of solving a problem, then one may think of more than one algorithm for the same problem. Hence, it is necessary to know in what domains these algorithms are applicable. The data domain is an important aspect to be known in the field of algorithms. Once we have more than one algorithm for a given problem, how do we choose the best among them? The solution is to devise some data sets and determine a performance profile for each of the algorithms.
A best case data set can be obtained by having all distinct data in the set. But it is always complex to determine a data set which exhibits some average behavior. The following sections give a brief idea of the well-known accepted algorithms.
2.1 Numerical Algorithms
Numerical analysis is the theory of constructive methods in mathematical analysis. A constructive method is a procedure used to obtain the solution of a mathematical problem in a finite number of steps and to some desired accuracy.
2.1.1 Numerical Iterative Algorithm
An iterative process can be illustrated with the flow chart given in fig 2.1. There are four main
blocks in the process viz., initialization, decision, computation, and update. The functions of these
four blocks are as follows:
1. Initialization: all parameters are set to their initial values.
2. Decision: decision parameter is used to determine when to exit from the loop.
3. Computation: required computation is performed.
4. Update: decision parameter is updated and is transformed for next iteration.
Many problems in engineering or science need the solution of simultaneous linear algebraic equations. Every iterative algorithm is, in principle, an infinite-step algorithm. One of the iterative algorithms to solve a system of simultaneous equations is Gauss-Seidel. This iteration method generally requires only a few iterations. Iterative techniques have less round-off error. For a large system of equations, the number of iterations required may be quite large. But, there is a guarantee of getting the convergent result.
For example: consider the following set of equations,
10x1+2x2+x3= 9
2x1+20x2-2x3= -44
-2x1+3x2+10x3= 22.
To solve the above set of equations using the Gauss-Seidel iteration scheme, start with (x1(1), x2(1), x3(1)) = (0,0,0) as the initial values and compute the values of x1, x2, x3 using the equations given below:

x1(k+1) = (b1 - a12 x2(k)   - a13 x3(k))   / a11
x2(k+1) = (b2 - a21 x1(k+1) - a23 x3(k))   / a22
x3(k+1) = (b3 - a31 x1(k+1) - a32 x2(k+1)) / a33

for k = 1, 2, 3, …

Note that each newly computed value is used immediately in the equations that follow it within the same iteration.
This process is continued up to some desired accuracy. Numerical iterative methods are also applicable for obtaining the roots of an equation of the form f(x)=0. The various iterative methods used for this purpose are,
1. Bisection method: x(i+2) = (x(i) + x(i+1)) / 2
2. Regula-Falsi method: x2 = (x0 f(x1) - x1 f(x0)) / (f(x1) - f(x0))
2.2 Searching

Let us assume that we have a sequential file and we wish to retrieve an element matching a key 'k'. Then, we have to search the entire file from the beginning till the end to check whether the element matching k is present in the file or not.
There are a number of complex searching algorithms to serve the purpose of searching. The linear search and binary search methods are relatively straightforward methods of searching.
2.2.1 Sequential search
In this method, we start the search from the beginning of the list and examine each element till the end of the list. If the desired element is found, we stop the search and return the index of that element. If the item is not found and the list is exhausted, the search returns a zero value.
In the worst case the item is not found, or the search item is the last (nth) element. In both situations we must examine all n elements of the array, so the order of magnitude, or complexity, of the sequential search is n, i.e., O(n). The execution time for this algorithm is proportional to n; that is, the algorithm executes in linear time.
The algorithm for sequential search is as follows,
Algorithm : sequential search
Input : A, vector of n elements
K, search element
Output : j –index of k
Method :
i = 1
While (i <= n) do
  If (A[i] = k) then
    write("search successful")
    write("k is at location i")
    exit
  Else
    i = i + 1
  If end
While end
write("search unsuccessful")
Algorithm ends.
Sorting
One of the major applications in computer science is the sorting of information in a table. Sorting
algorithms arrange items in a set according to a predefined ordering relation. The most common types
of data are string information and numerical information. The ordering relation for numeric data simply involves arranging items in sequence from smallest to largest or from largest to smallest, which is called ascending and descending order respectively.
The items of a set arranged in non-decreasing order are of the form {7,11,13,16,16,19,23}. The items of a set arranged in descending order are of the form {23,19,16,16,13,11,7}.
Similarly, for string information, {a, abacus, above, be, become, beyond} is in ascending order and {beyond, become, be, above, abacus, a} is in descending order.
There are numerous methods available for sorting information. But, not even one of them is best for all
applications. Performance of the methods depends on parameters like, size of the data set, degree of
relative order already present in the data etc.
Consider the unsorted array,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 20    35    18    8     14    41    3     39

The resulting sorted array should be,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 3     8     14    18    20    35    39    41
One way to sort the unsorted array would be to perform the following steps:
• Find the smallest element in the unsorted array
• Place the smallest element in position of a[1]
i.e., the smallest element in the unsorted array is 3, so exchange the values of a[1] and a[7]. The array now becomes,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 3     35    18    8     14    41    20    39

Now find the smallest from a[2] to a[8], i.e., 8, so exchange the values of a[2] and a[4], which results in the array shown below,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 3     8     18    35    14    41    20    39

Repeat this process until the entire array is sorted. The changes undergone by the array are shown in fig 2.2. The number of moves with this technique is always of the order O(n).
Consider the same array,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 20    35    18    8     14    41    3     39

Initially the whole array is unordered. So select the minimum and put it in place of a[1] to act as a sentinel. Now the array is of the form,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 3     35    18    8     14    41    20    39
Now we have one element in the sorted list and the remaining elements are in the unordered set. Select
the next element to be inserted. If the selected element is less than the preceding element move the
preceding element by one position and insert the smaller element.
In the above array the next element to be inserted is x=35, but the preceding element, 3, is less than x, so 35 stays where it is. Hence, take the next element for insertion, i.e., 18. 18 is less than 35, so move 35 one position ahead and place 18 in its place. The resulting array will be,

a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  a[8]
 3     18    35    8     14    41    20    39

Now the element to be inserted is 8. 8 is less than 35 and also less than 18, so move 35 and 18 one position right and place 8 at a[2]. This process is carried on till the sorted array is obtained.
Recursion
Recursion may have the following definitions:
- The nested repetition of an identical algorithm is recursion.
- It is a technique of defining an object/process by itself.
- Recursion is a process by which a function calls itself repeatedly until some specified condition has been satisfied.
2.4.1 When to use recursion
Recursion can be used for repetitive computations in which each action is stated in terms of a previous result. There are two conditions that must be satisfied by any recursive procedure:
1. Each time a function calls itself it should get nearer to the solution.
2. There must be a decision criterion for stopping the process.
In making the decision about whether to write an algorithm in recursive or non-recursive form, it is
always advisable to consider a tree structure for the problem. If the structure is simple then use non-
recursive form. If the tree appears quite bushy, with little duplication of tasks, then recursion is
suitable.
The recursion algorithm for finding the factorial of a number is given below,
Algorithm : factorial-recursion
Input : n, the number whose factorial is to be found.
Output : f, the factorial of n
Method :
If (n = 0) then
  f = 1
Else
  f = factorial(n-1) * n
If end
Algorithm ends.
The general procedure for any recursive algorithm is as follows,
1. Save the parameters, local variables and return address.
2. If the termination criterion is reached, perform the final computation and go to step 3; otherwise, perform the partial computation and go to step 1 (initiate a recursive call).
3. Restore the most recently saved parameters, local variables and return address, and go to the latest return address.
2.4.2 Iteration v/s Recursion
Demerits of recursive algorithms
1. Many programming languages do not support recursion; hence, recursive mathematical functions are implemented using iterative methods.
2. Even though mathematical functions can be easily implemented using recursion, it is always at the cost of execution time and memory space. For example, the recursion tree for generating 6 numbers of a Fibonacci series is given in fig 2.5. A Fibonacci series is of the form 0,1,1,2,3,5,8,13,… etc., where each number from the third onwards is the sum of the preceding two numbers. It can be noticed from fig 2.5 that f(n-2) is computed twice, f(n-3) is computed thrice, and f(n-4) is computed 5 times.
3. A recursive procedure can be called from within or outside itself, and to ensure its proper functioning it has to save, in some order, the return addresses so that a return to the proper location will result when the return to a calling statement is made.
4. Recursive programs need considerably more storage and take more time.
Demerits of iterative methods
1. Mathematical functions such as factorial and Fibonacci series generation are easier to implement using recursion than iteration.
2. In iterative techniques, looping of statements is very much necessary.
Recursion is a top down approach to problem solving. It divides the problem into pieces or selects out
one key step, postponing the rest.
Iteration is more of a bottom-up approach. It begins with what is known and from this constructs the solution step by step. For Fibonacci, the iterative function obviously uses time that is O(n), whereas the naive recursive function has exponential time complexity.
It is always true that recursion can be replaced by iteration and stacks. It is also true that a stack can be replaced by a recursive program with no stack.
2.5 Hashing
Hashing is a practical technique of maintaining a symbol table. A symbol table is a data structure which allows one to easily determine whether an arbitrary element is present or not.
Consider a sequential memory shown in fig 2.6. In hashing technique the address X of a variable x is
obtained by computing an arithmetic function (hashing function) f(x). Thus f(x) points to the address
where x should be placed in the table. This address is known as the hash address.
The memory used to store the variable using hashing technique is assumed to be sequential. The
memory is known as hash table. The hash table is partitioned into several storing spaces called buckets
and each bucket is divided into slots (fig 2.6).
If there are b buckets in the table, each bucket is capable of holding s variables, where each variable
occupies one slot. The function f(x) maps the possible variable onto the integers 0 through b-1. The
size of the space from where the variables are drawn is called the identifier space. Let T be the
identifier space, n be the number of variables/identifiers in the hash table. Then, the ratio n/T is called
the identifier density and a = n/sb is the loading density or loading factor.
If f(x1) = f(x2), where x1 and x2 are any two variables, then x1 and x2 are called synonyms. Synonyms are mapped onto the same bucket. If a new identifier is hashed into an already full bucket, a collision occurs.
Consider a hash table with 26 buckets of a single slot each. The identifiers to be stored are GA, D, A, G, L, A2, A1, A3, A4, Z, ZA, E. Let f(x) be the function which maps an identifier onto the address equal to the position of its first character in the English alphabet. The hash table generated is as shown in fig 2.7.
Time taken to retrieve the identifiers is as follows,
Search element (x)    Search time (t)
GA                     1
D                      1
A                      1
G                      2
L                      1
A2                     2
A1                     3
A3                     5
A4                     6
Z                      1
ZA                    10
E                      6
                 ∑t = 39
Average retrieval time =(∑t)/n.
The average retrieval time entirely depends on the hashing function.
Exercise 2:
1. What are the serious shortcomings of the binary search method and the sequential search method?
2. Know more searching techniques involving hashing functions.
3. Implement the algorithms for searching and calculate the complexities.
4. Write an algorithm for the above method of selection sort and implement the same.
5. Write the algorithm for the merge sort method.
6. Take 5 data sets of length 10 and hand simulate each method given above.
7. Try to know more sorting techniques and make a comparative study of them.
8. Write an iterative algorithm to find the factorial of a number.
9. Write recursive and iterative programs for reversing a number.
10. Write recursive and iterative programs to find the maximum and minimum in a list of numbers.
11. Write an algorithm to implement the hashing technique and implement the same.
12. Hand simulate all algorithms for 5 data sets.
Introduction To Graph Theory
3.1.1 What is a graph?
A graph G = (V, E) consists of a set of objects V = {v1, v2, …} called vertices, and another set E
= {e1, e2, …} whose elements are called edges. Each edge ek in E is identified with an unordered
pair (vi, vj) of vertices. The vertices vi, vj associated with edge ek are called the end vertices of
ek.
The most common representation of graph is by means of a diagram, in which the vertices are
represented as points and each edge as a line segment joining its end vertices. Often this diagram
itself is referred to as a graph.
Fig 3-1.
In the Fig. 3-1 edge e1 having same vertex as both its end vertices is called a self-loop. There
may be more than one edge associated with a given pair of vertices, for example e4 and e5 in Fig.
3-1. Such edges are referred to as parallel edges.
A graph that has neither self-loops nor parallel edges is called a simple graph; otherwise it is called a general graph. It should also be noted that, in drawing a graph, it is immaterial whether the lines are drawn straight or curved, long or short: what is important is the incidence between the edges and vertices.
A graph is also called a linear complex, a 1-complex, or a one-dimensional complex. A vertex is also referred to as a node, a junction, a point, a 0-cell, or a 0-simplex. Other terms used for an edge are a branch, a line, an element, a 1-cell, an arc, and a 1-simplex.
Because of its inherent simplicity, graph theory has a very wide range of applications in
engineering, physical, social, and biological sciences, linguistics, and in numerous other areas. A
graph can be used to represent almost any physical situation involving discrete objects and a
relationship among them.
3.1.2 Finite and Infinite Graphs
Although in the definition of a graph neither the vertex set V nor the edge set E need be finite, in
most of the theory and almost all applications these sets are finite. A graph with a finite number
of vertices as well as a finite number of edges is called a finite graph; otherwise, it is an infinite
graph.
3.1.3 Incidence and Degree
When a vertex vi is an end vertex of some edge ej, vi and ej are said to be incident with (on or to)
each other. In Fig. 3-1, for example, edges e2, e6, and e7 are incident with vertex v4. Two
nonparallel edges are said to be adjacent if they are incident on a common vertex. For example,
e2 and e7 in Fig. 3-1 are adjacent. Similarly, two vertices are said to be adjacent if they are the
end vertices of the same edge. In Fig. 3-1, v4 and v5 are adjacent, but v1 and v4 are not.
The number of edges incident on a vertex vi, with self-loops counted twice is called the degree,
d(vi), of vertex vi. In Fig. 3-1, for example, d(v1) = d(v3) = d(v4) = 3, d(v2) = 4, and d(v5) = 1.
The degree of a vertex is sometimes also referred to as its valency. Since each edge contributes
two degrees, the sum of the degrees of all vertices in G is twice the number of edges in G.
3.1.4 Isolated vertex, Pendent vertex, and Null graph
A vertex having no incident edge is called an isolated vertex. In other words, isolated vertices are
vertices with zero degree. Vertex v4 and v7 in Fig. 3-2, for example, are isolated vertices. A
vertex of degree one is called a pendent vertex or an end vertex. Vertex v3 in Fig. 3-2 is a
pendant vertex. Two adjacent edges are said to be in series if their common vertex is of degree
two. In Fig. 3-2, the two edges incident on v1 are in series.
Fig. 3-2 Graph containing isolated vertices, series edges and a pendant vertex.
In the definition of a graph G = (V, E), it is possible for the edge set E to be empty. Such a graph,
without any edges, is called a null graph. In other words, every vertex in a null graph is an
isolated vertex. A null graph of six vertices is shown in Fig. 3-3. Although the edge set E may be
empty, the vertex set V must not be empty; otherwise, there is no graph. In other words, by
definition, a graph must have at least one vertex.
Matrix Representation Of Graph
Although a pictorial representation of a graph is very convenient for a visual study, other
representations are better for computer processing. A matrix is a convenient and useful way of
representing a graph to a computer.
Matrices lend themselves easily to mechanical manipulations. Besides, many known results of matrix algebra can be readily applied to study the structural properties of graphs from an algebraic point of view. In many applications of graph theory, such as in electrical network analysis and operations research, matrices also turn out to be the natural way of expressing the problem.
3.2.1 Incidence Matrix

Let G be a graph with n vertices, e edges, and no self-loops. Define an n by e matrix A = [aij], whose n rows correspond to the n vertices and whose e columns correspond to the e edges, as follows:

The matrix element
aij = 1, if the jth edge ej is incident on the ith vertex vi, and
    = 0, otherwise.

      a  b  c  d  e  f  g  h
v1    0  0  0  1  0  1  0  0
v2    0  0  0  0  1  1  1  1
v3    0  0  0  0  0  0  0  1
v4    1  1  1  0  1  0  0  0
v5    0  0  1  1  0  0  1  0
v6    1  1  0  0  0  0  0  0

Fig. 3-4 Graph and its incidence matrix.
Such a matrix A is called the vertex-edge incidence matrix, or simply incidence matrix. Matrix A for a
graph G is sometimes also written as A(G). A graph and its incidence matrix are shown in Fig. 3-4. The
incidence matrix contains only two elements, 0 and 1. Such a matrix is called a binary matrix or a (0,
1)-matrix.
The following observations about the incidence matrix A can readily be made:
1. Since every edge is incident on exactly two vertices, each column of A has exactly two
1’s.
2. The number of 1’s in each row equals the degree of the corresponding vertex.
3. A row with all 0’s, therefore, represents an isolated vertex.
4. Parallel edges in a graph produce identical columns in its incidence matrix, for example,
columns 1 and 2 in Fig. 3-4.
Concept Of Trees
The concept of a tree is probably the most important in graph theory, especially for those interested in
applications of graphs.
A tree is a connected graph without any circuits. The graph in Fig. 3-5, for instance, is a tree. It follows immediately from the definition that a tree has to be a simple graph, that is, one having neither a self-loop nor parallel edges (because either of these forms a circuit).
Divide And Conquer

There are a number of general and powerful computational strategies that are repeatedly used in computer science. It is often possible to phrase any problem in terms of these general strategies. These general strategies are Divide and Conquer and Dynamic Programming. The techniques of Greedy Search, Backtracking and Branch and Bound evaluation are variations of the dynamic programming idea. All these strategies and techniques are discussed in the subsequent chapters.
The most widely known and often used of these is the divide and conquer strategy.
The basic idea of divide and conquer is to divide the original problem into two or more sub-problems which can be solved by the same technique. If it is possible to split the problem further into smaller and smaller sub-problems, a stage is reached where the sub-problems are small enough to be solved without further splitting. Combining the solutions of the individual sub-problems gives the final solution; this is the conquering step. Combining need not mean simply the union of individual solutions.
Divide and Conquer involves four steps
1. Divide
2. Conquer [Initial Conquer occurred due to solving]
3. Combine
4. Conquer [Final Conquer].
In precise terms, the forward journey is divide and the backward journey is conquer. A general binary divide and conquer algorithm is:

Procedure D&C (P, Q)   // the data size is from P to Q
{
  If size(P, Q) is small Then
    Solve(P, Q)
  Else
    M ← divide(P, Q)
    Combine(D&C(P, M), D&C(M+1, Q))
}
Sometimes, this type of algorithm is known as a control abstraction algorithm, as it gives an abstract flow. This way of breaking down the problem has found wide application in sorting, selection and searching algorithms.
4.1 Binary Search:
Algorithm:
While (p ≤ q) do
  m ← (p+q)/2
  If (A(m) = key) Then 'successful'; stop
  Else
    If (A(m) > key) Then
      q ← m-1
    Else
      p ← m+1
  If end
While end
'unsuccessful'
End Algorithm.
Illustration :
Consider the data set with elements {12,18,22,32,46,52,59,62,68}. First let us consider the
simulation for successful cases.
Successful cases:
Key=12   P  Q  m  Search
         1  9  5  x
         1  4  2  x
         1  1  1  successful
To search 12, 3 units of time is required

Key=18   P  Q  m  Search
         1  9  5  x
         1  4  2  successful
To search 18, 2 units of time is required

Key=22   P  Q  m  Search
         1  9  5  x
         1  4  2  x
         3  4  3  successful
To search 22, 3 units of time is required

Key=32   P  Q  m  Search
         1  9  5  x
         1  4  2  x
         3  4  3  x
         4  4  4  successful
To search 32, 4 units of time is required

Key=46   P  Q  m  Search
         1  9  5  successful
To search 46, 1 unit of time is required

Key=52   P  Q  m  Search
         1  9  5  x
         6  9  7  x
         6  6  6  successful
To search 52, 3 units of time is required

Key=59   P  Q  m  Search
         1  9  5  x
         6  9  7  successful
To search 59, 2 units of time is required

Key=62   P  Q  m  Search
         1  9  5  x
         6  9  7  x
         8  9  8  successful
To search 62, 3 units of time is required

Key=68   P  Q  m  Search
         1  9  5  x
         6  9  7  x
         8  9  8  x
         9  9  9  successful
To search 68, 4 units of time is required

Successful average search time = (3+2+3+4+1+3+2+3+4)/9 = 25/9

Unsuccessful cases:

Key=25   P  Q  m  Search
         1  9  5  x
         1  4  2  x
         3  4  3  x
         4  4  4  x
To search 25, 4 units of time is required

Key=65   P  Q  m  Search
         1  9  5  x
         6  9  7  x
         8  9  8  x
         9  9  9  x
To search 65, 4 units of time is required

Unsuccessful search time = (4+4)/2

Average search time = (sum of unsuccessful search times + sum of successful search times)/(n+(n+1))
Max-Min Search
Max-Min search problem aims at finding the smallest as well as the biggest element in a vector A
of n elements.
Following the steps of Divide and Conquer, the vector can be divided into sub-problems as shown below.
The search is eventually reduced to the comparison of 2 numbers. The time is spent in conquering and comparing, which is the major step in the algorithm.
Consider a data set with elements {82,36,49,91,12,14,06,76,92}. Initially the max and min variables have null values. In the first call, the list is broken into two equal halves. The list is again broken down into two. This process is continued till the length of the list is either two or one. Then the maximum and minimum values are chosen from the smallest list and these values are returned to the preceding step, where the length of the list is slightly bigger. This process is continued till the entire list is searched. The detailed description is shown in fig 4.1.
Integer Multiplication
There are various methods of obtaining the product of two numbers. The repeated addition
method is left as an assignment for the reader. The reader is expected to find the product of some
bigger numbers using the repeated addition method.
Another way of finding the product is the one we generally use i.e., the left shift method.
4.3.1 Left shift method

981 * 1234

   3924
  2943*
 1962**
 981***
-------
1210554
In this method, a = 981 is the multiplicand and b = 1234 is the multiplier. a is multiplied by every digit of b, starting from right to left. On each multiplication the subsequent products are shifted one place left. Finally the products obtained by multiplying a by each digit of b are summed up to obtain the final product.
The above product can also be obtained by a right shift method, which can be illustrated as
follows,
4.3.2 Right Shift Method
  981 * 1234
  ----------
  981
  1962
  *2943
  **3924
  1210554
In the above method, a is multiplied by each digit of b from the leftmost digit to the rightmost. On
every multiplication the product is shifted one place to the right, and finally all the products
obtained by multiplying 'a' by each digit of 'b' are added to obtain the final result.
The product of two numbers can also be obtained by repeatedly halving 'a' and doubling 'b'
until a becomes 1.
4.3.3 Halving and Doubling Method
Let a = 981 and b = 1234.
The steps to be followed are
1. If a is odd, store b.
2. a = a/2 (discarding the remainder) and b = b*2.
3. Repeat step 2 and step 1 till a <= 1.
   a       b      result
  981    1234      1234
  490    2468    --------
  245    4936      4936
  122    9872    --------
   61   19744     19744
   30   39488    --------
   15   78976     78976
    7  157952    157952
    3  315904    315904
    1  631808    631808
                 --------
           Sum = 1210554
The above method is called the halving and doubling method.
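The three steps translate directly into code. Here is a minimal sketch, in which integer division stands in for "halving":

```python
def halve_and_double(a, b):
    """Product by repeated halving of a and doubling of b (Russian-peasant style)."""
    total = 0
    while a >= 1:
        if a % 2 == 1:    # step 1: if a is odd, store (accumulate) b
            total += b
        a //= 2           # step 2: halve a ...
        b *= 2            # ... and double b
    return total

print(halve_and_double(981, 1234))   # 1210554
```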
4.3.4 Speed-up Algorithm
In this method we split the numbers until they are easy to multiply: 0981 is split into 09 and 81,
and 1234 into 12 and 34. Each part of the first number is multiplied by each part of the second,
and each product is shifted 'n' places left before adding. The number of shifts 'n' is decided as
follows.
Multiplication   Shifts   Sequence
   09 * 12         4      108****
   09 * 34         2      306**
   81 * 12         2      972**
   81 * 34         0      2754
                   Sum = 1210554
For 0981*1234, the multiplication of 81 by 34 takes zero shifts, 09 by 34 takes two shifts, 81 by
12 takes two shifts, and so on.
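The splitting scheme can be sketched with `divmod`, which separates each number into its high and low halves (the generalization parameter d is an assumption for illustration):

```python
def split_multiply(a, b, d=2):
    """Multiply by splitting each number into high/low d-digit halves."""
    ah, al = divmod(a, 10 ** d)          # 0981 -> 09 and 81
    bh, bl = divmod(b, 10 ** d)          # 1234 -> 12 and 34
    return (ah * bh * 10 ** (2 * d)      # 09*12, shifted 4 places
            + (ah * bl + al * bh) * 10 ** d   # 09*34 and 81*12, shifted 2 places
            + al * bl)                   # 81*34, no shift

print(split_multiply(981, 1234))         # 1210554
```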
Exercise 4
1. Write the algorithm to find the product of two numbers for all the methods explained.
2. Hand simulate the algorithms for at least 10 different pairs of numbers.
3. Implement the same for verification.
4. Write a program to find the maximum and minimum of a list of n elements, with and
without using recursion.
Greedy Method
The greedy method chooses a subset of the dataset as the solution set, so as to yield some profit.
Consider a problem having n inputs; we are required to obtain a solution which is a series of
subsets that satisfy some constraints or conditions. Any subset which satisfies these constraints
is called a feasible solution. It is required to obtain the feasible solution that maximizes or
minimizes the objective function; this feasible solution is called the optimal solution.
One can devise an algorithm that works in stages, considering one input at a time: at each stage,
a decision is taken on whether the chosen data can lead to an optimal solution. If the inclusion of
a particular item can still lead to an optimal solution, the item is added into the partial solution
set. On the other hand, if the inclusion of that item results in an infeasible solution, the item is
eliminated from the solution set.
The general algorithm for the greedy method is
1. Choose an element e belonging to the dataset D.
2. Check whether e can be included in the solution set S; if yes, S ← S ∪ {e}.
3. Continue until S is filled up or D is exhausted, whichever is earlier.
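The three steps above can be sketched generically. The `select` and `feasible` callbacks are placeholders supplied per problem (an assumption for illustration, not code from the text):

```python
def greedy(data, select, feasible):
    """Generic greedy skeleton: examine one element per stage, keep the feasible ones."""
    solution = []                     # the partial solution set S
    remaining = list(data)            # the dataset D
    while remaining:
        e = select(remaining)         # step 1: choose the most promising element
        remaining.remove(e)
        if feasible(solution + [e]):  # step 2: include e only if S stays feasible
            solution.append(e)
    return solution                   # step 3: stop when D is exhausted

# Hypothetical use: pack program lengths onto a tape of capacity 10, shortest first.
print(greedy([7, 3, 5, 2], select=min, feasible=lambda s: sum(s) <= 10))  # [2, 3, 5]
```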
5.1 Cassette Filling
Consider n programs that are to be stored on a tape of length L. Each program i is of length li,
where i lies between 1 and n. All programs can be stored on the tape iff the sum of the lengths of
the programs is at most L. It is assumed that, whenever a program is to be retrieved, the tape is
initially positioned at the start end.
Suppose the programs are stored in the order i1, i2, …, in. Since the tape is always read from the
start, the time tj required to retrieve program ij is
tj = li1 + li2 + … + lij
The average of these retrieval times over all n programs is called the mean retrieval time (MRT).
Now the problem is to store the programs on the tape so that the MRT is minimized. From the above
discussion one can observe that the MRT is minimized if the programs are stored in increasing
order of length, i.e., l1 ≤ l2 ≤ … ≤ ln.
Hence the ordering defined minimizes the retrieval time. The solution set obtained need not be a
subset of data but may be the data set itself in a different sequence.
Illustration
Assume that 3 sorted files are given. Let the length of files A, B and C be 7, 3 and 5 units
respectively. All these three files are to be stored onto a tape S in some sequence that reduces
the average retrieval time. The table shows the retrieval time for all possible orders.
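The illustration can be checked by brute force: compute the MRT of every ordering of the three files and pick the smallest (the dictionary layout below is an assumption for illustration):

```python
from itertools import permutations

lengths = {'A': 7, 'B': 3, 'C': 5}   # the three files of the illustration

def mrt(order):
    """Mean retrieval time: average of the prefix sums of the stored lengths."""
    total, prefix = 0, 0
    for name in order:
        prefix += lengths[name]      # tape is read from the start up to this file
        total += prefix              # retrieval time t_j for this file
    return total / len(order)

best = min(permutations(lengths), key=mrt)
print(best, mrt(best))               # ('B', 'C', 'A'): shortest file first
```

As the text predicts, the minimizing order stores the files in increasing length (3, 5, 7).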
The knapsack problem is to fill the available items into the knapsack so that the knapsack gets
filled up and yields a maximum profit. If a fraction xi of object i is placed into the knapsack, then
a profit pi·xi is earned. The constraint is that the total weight of the chosen objects must not
exceed M.
Illustration
Consider a knapsack problem of finding the optimal solution where, M=15, (p1,p2,p3…p7) = (10,
5, 15, 7, 6, 18, 3) and (w1, w2, …., w7) = (2, 3, 5, 7, 1, 4, 1).
In order to find the solution, one can follow three different strategies.
Strategy 1: non-increasing profit values
Let (a,b,c,d,e,f,g) represent the items with profits (10,5,15,7,6,18,3); then the sequence of
objects in non-increasing order of profit is (f,c,a,d,e,b,g).
The solution set is (1,0,1,4/7,0,1,0).
Profit = 47 units
Strategy 2: non-decreasing weights
The solution set is (1,1,4/5,0,1,1,1).
Profit = 54 units
Strategy 3: maximum profit per unit of capacity used
(This means that the objects are considered in decreasing order of the ratio pi/wi.)
e: p5/w5 = 6/1 = 6; a: p1/w1 = 10/2 = 5; f: p6/w6 = 18/4 = 4.5; c: p3/w3 = 15/5 = 3;
g: p7/w7 = 3/1 = 3; b: p2/w2 = 5/3; d: p4/w4 = 7/7 = 1
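The ratio strategy can be sketched as a small fractional-knapsack routine, assuming the data of the illustration:

```python
def fractional_knapsack(profits, weights, capacity):
    """Greedy by profit/weight ratio; a fraction of the last object may be taken."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)             # solution vector (x1, ..., xn)
    profit, room = 0.0, float(capacity)
    for i in order:
        if room <= 0:
            break
        take = min(1.0, room / weights[i])   # whole object if it fits, else a fraction
        x[i] = take
        profit += take * profits[i]
        room -= take * weights[i]
    return x, profit

p, w = (10, 5, 15, 7, 6, 18, 3), (2, 3, 5, 7, 1, 4, 1)
print(fractional_knapsack(p, w, 15))
```

For this data the sketch reports a profit of about 55.33 units, higher than the 47 and 54 units obtained by the first two strategies.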
In the optimal storage on tape problem, we are required to find a permutation of the n programs
so that, when they are stored on the tape, the MRT is minimized.
Let n=3 and (l1,l2,l3)=(5,10,3); there are n!=6 possible orderings. These orderings and their
respective MRTs are given in fig 6.1. Hence, the best order of recording is 3,1,2.
Consider the problem of 4 queens; a backtracking solution for this is shown in fig 6.3. The
figure shows a partial backtracking tree. Completion of the tree is left as an assignment for the
reader.
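A complete backtracking search for the 4-queens problem can be sketched as follows; the tuple-of-columns representation is an assumption (fig 6.3 shows only a partial tree):

```python
def queens(n, board=()):
    """Backtracking: place one queen per row, pruning attacked columns and diagonals."""
    if len(board) == n:
        return [board]                    # a complete placement
    solutions, row = [], len(board)
    for col in range(n):
        if all(col != c and abs(col - c) != row - r   # no shared column or diagonal
               for r, c in enumerate(board)):
            solutions.extend(queens(n, board + (col,)))
    return solutions

print(queens(4))   # the two solutions: queens in columns (1,3,0,2) and (2,0,3,1)
```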
Concept Of Branch And Bound
The term branch and bound refers to all state space search methods in which all possible branches
of the current node are derived before any other node can become the E-node. In other words, the
exploration of a new node cannot begin until the current node is completely explored.
6.2.1 Tape filling:
The branch and bound tree for the records of length (5,10,3) is as shown in fig 6.4
Example:
Consider the digraph of fig 7-1. Let the numbers on the edges be the costs of
travelling along that route. If a person wants to travel from v1 to v2, he
encounters many paths. Some of them are
1. v1 → v2 = 50 units
2. v1 → v3 → v4 → v2 = 10+15+20 = 45 units
3. v1 → v5 → v4 → v2 = 45+30+20 = 95 units
4. v1 → v3 → v4 → v5 → v4 → v2 = 10+15+35+30+20 = 110 units
The cheapest path among these is v1 → v3 → v4 → v2, with a cost of
10+15+20 = 45 units. Even though there are three edges on this path, it is
cheaper than travelling along the direct path v1 → v2, which costs 50 units.
One can also notice that it is not possible to travel to v6 from any other node.
To formulate a greedy-based algorithm to generate the cheapest paths, we must
conceive a multistage solution to the problem and also an optimization measure.
One possibility is to build the shortest paths one by one. As an optimization
measure we can use the sum of the lengths of all paths generated so far. For this
measure to be minimized, each individual path must be of minimum length. If we
have already constructed i shortest paths, then using this optimization measure the
next path to be constructed should be the next shortest path. The greedy way is
therefore to generate these paths in non-decreasing order of path length: first, a
shortest path to the nearest vertex is generated, then a shortest path to the second
nearest vertex, and so on.
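The path-by-path greedy strategy just described is, in essence, Dijkstra's algorithm. A sketch follows, with the digraph of fig 7.1 reconstructed as an assumption from the edge costs quoted in the example:

```python
import heapq

# Digraph of fig 7.1; only the edges mentioned in the example are included.
graph = {'v1': {'v2': 50, 'v3': 10, 'v5': 45},
         'v2': {},
         'v3': {'v4': 15},
         'v4': {'v2': 20, 'v5': 35},
         'v5': {'v4': 30},
         'v6': {}}

def shortest_paths(source):
    """Greedy single-source shortest paths: settle the nearest vertex first."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                     # stale queue entry, already settled
        for v, cost in graph[u].items():
            if d + cost < dist[v]:
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

print(shortest_paths('v1'))   # v2 costs 45 via v1 -> v3 -> v4 -> v2; v6 stays Inf
```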
A much simpler method is to solve it using a matrix representation. The steps
to be followed are:
Step 1: find the adjacency matrix for the given graph. The adjacency matrix for fig
7.1 is given below

      V1    V2    V3    V4    V5    V6
V1     -    50    10   Inf    45   Inf
V2   Inf     -   Inf   Inf   Inf   Inf
V3   Inf   Inf     -    15   Inf   Inf
V4   Inf    20   Inf     -    35   Inf
V5   Inf   Inf   Inf    30     -   Inf
V6   Inf   Inf   Inf   Inf   Inf     -
Step 2: consider v1 to be the source and choose the minimum entry in row v1.
In the above table the minimum in row v1 is 10.
Step 3: find the column in which this minimum is present; for the above example
it is column v3. Hence v3 is the node to be visited next.
Step 4: compute a new matrix by eliminating the v1 and v3 columns. Retain row v1; the
second row is computed by adding 10 to all values of row v3.
The resulting matrix is

           V2    V4    V5    V6
Minimum    50    25    45   Inf
Step 5: find the minimum in each column, then select the smallest entry of the resulting row. In
the above example the minimum is 25. Repeat step 3 followed by step 4 till all vertices are covered
or a single column is left.
The solution for fig 7.1 can be continued as follows

               V2    V5    V6
V1 → Vw        50    45   Inf
Minimum        45    45   Inf

               V5    V6
V1 → Vw        45   Inf
Minimum        45   Inf

                                  V6
V1 → Vw                          Inf
V1 → V3 → V4 → V2 → V5 → Vw   45+Inf
Minimum                          Inf

Finally, the cheapest paths from v1 to the other vertices are given by V1 → V3 → V4 → V2 → V5.
The figure above shows the complete graph on four nodes together with three of its spanning trees.
Spanning trees have many applications. For example, they can be used to obtain an independent
set of circuit equations for an electric network. First, a spanning tree for the electric network is
obtained. Let B be the set of network edges not in the spanning tree. Adding an edge from B to
the spanning tree creates a cycle. Kirchhoff's second law is used on each cycle to obtain a circuit
equation.
Another application of spanning trees arises from the property that a spanning tree is a minimal
sub-graph G' of G such that V(G') = V(G) and G' is connected. A minimal sub-graph with n
vertices must have at least n-1 edges, and all connected graphs with n-1 edges are trees. If the
nodes of G represent cities and the edges represent possible communication links connecting two
cities, then the minimum number of links needed to connect the n cities is n-1. The spanning trees
of G represent all feasible choices.
In practical situations, the edges have weights assigned to them. These weights may represent the
cost of construction, the length of the link, and so on. Given such a weighted graph, one would
then wish to select a set of links that has minimum total cost or minimum total length. In either
case the links selected have to form a tree; if this is not so, then the selection contains a cycle,
and the removal of any one link on this cycle yields a link selection of lower cost that still connects
all cities. We are therefore interested in finding a spanning tree of G with minimum cost. Since the
identification of a minimum-cost spanning tree involves the selection of a subset of the edges, this
problem fits the subset paradigm.
7.2.1 Prim’s Algorithm
A greedy method to obtain a minimum-cost spanning tree builds this tree edge by edge. The next
edge to include is chosen according to some optimization criterion. The simplest such criterion is
to choose an edge that results in a minimum increase in the sum of the costs of the edges so far
included. There are two possible ways to interpret this criterion. In the first, the set of edges so
far selected form a tree. Thus, if A is the set of edges selected so far, then A forms a tree. The
next edge(u,v) to be included in A is a minimum-cost edge not in A with the property that A U
{(u,v)} is also a tree. The corresponding algorithm is known as Prim's algorithm.
For Prim's algorithm, draw n isolated vertices and label them v1, v2, …, vn. Tabulate the given
weights of the edges of G in an n by n table, setting the non-existent edges to a very large value.
Start from vertex v1 and connect it to its nearest neighbour (i.e., to the vertex which has the
smallest entry in row 1 of the table), say vk. Now consider v1 and vk as one subgraph and connect
this subgraph to its closest neighbour. Let this new vertex be vi. Next regard the tree with v1, vk
and vi as one subgraph and continue the process until all n vertices have been connected by n-1
edges.
Consider the graph shown in fig 7.3. There are 6 vertices and 12 edges. The weights are tabulated
in table given below.
      V1     V2     V3     V4     V5     V6
V1     -     10     16     11     10     17
V2    10      -    9.5    Inf    Inf   19.5
V3    16    9.5      -      7    Inf     12
V4    11    Inf      7      -      8      7
V5    10    Inf    Inf      8      -      9
V6    17   19.5     12      7      9      -
Start with v1 and pick the smallest entry in row1, which is either (v1,v2) or (v1,v5). Let us pick
(v1, v5). The closest neighbor of the subgraph (v1,v5) is v4 as it is the smallest in the rows v1
and v5. The three remaining edges selected following the above procedure turn out to be (v4,v6)
(v4,v3) and (v3, v2) in that sequence. The resulting shortest spanning tree is shown in fig 7.4.
The weight of this tree is 41.5.
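The table-driven procedure above can be sketched as follows. Note that the sketch may break the tie between the equal-cost edges (v1,v2) and (v1,v5) differently from the text and so pick a different tree; either way the total weight comes out to 41.5.

```python
INF = float('inf')
# Weight table of fig 7.3; INF marks the non-existent edges, and the v2 row,
# missing from the printed table, is restored by symmetry (an assumption).
W = [[INF, 10,   16,  11,  10,  17],    # v1
     [10,  INF,  9.5, INF, INF, 19.5],  # v2
     [16,  9.5,  INF, 7,   INF, 12],    # v3
     [11,  INF,  7,   INF, 8,   7],     # v4
     [10,  INF,  INF, 8,   INF, 9],     # v5
     [17,  19.5, 12,  7,   9,   INF]]   # v6

def prim(w):
    """Grow a tree from v1, always adding the cheapest edge leaving the tree."""
    n = len(w)
    in_tree, edges, total = {0}, [], 0.0
    while len(in_tree) < n:
        u, v = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: w[e[0]][e[1]])
        in_tree.add(v)
        edges.append((u + 1, v + 1))    # report 1-based vertex labels
        total += w[u][v]
    return edges, total

print(prim(W))    # five edges of total weight 41.5
```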