Computer Algorithms: Muralikrishna S.N
A way to create a computer world with simplicity
Chapter 1
Computer Algorithms
Contents
1 INTRODUCTION
1.1 WHAT IS AN ALGORITHM?
1.2 Analysis of Algorithms
How to find out the time complexity
What cases to Consider?
Best Case
Worst Case
Average Case
1 INTRODUCTION
1.1 WHAT IS AN ALGORITHM?
An algorithm can be thought of as a white box containing a certain set of instructions. When we consider a problem, carrying out the instructions in the white box takes us to the desired solution.
Figure 1: Algorithm as a white box (Input → Algorithm → Desired Output)
Here we refer to an algorithm as a white box because each step to be carried out is definite, finite and effective. Definiteness means that each step in the algorithm is unambiguous. Finiteness refers to the termination of the algorithm: it should terminate after a finite amount of time. The algorithm should also be traceable. There may be no input for certain algorithms; consider, for example, the generation of a random number, which does not require any external input. Every algorithm has at least one output, which is the desired solution. There can be multiple algorithms to solve the same problem. For example, consider finding the sum of the first N natural numbers: the first solution is simply to add up all the numbers one by one; the second is to use the formula N(N+1)/2. A few questions arise. Which of the two is better? Obviously the second; however, it is not always possible to tell which algorithm is better in all cases. Does it make sense to talk about the efficiency of an algorithm without knowing the configuration of the computer on which it runs? Can we find the efficiency of an algorithm without considering the machine on which it will be executed? Before executing the algorithm on a machine, can we predict the expected time to produce the output? For what inputs do we get an output? To answer these questions it is necessary to know more about the algorithm and its behaviour.
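The two solutions mentioned above can be sketched as follows (a minimal illustration; the function names are ours, not from the notes). The loop does N additions, while the formula does a constant amount of work regardless of N:

```python
# Two ways to compute the sum of the first n natural numbers.

def sum_by_loop(n):
    total = 0
    for i in range(1, n + 1):   # n iterations, one addition each
        total += i
    return total

def sum_by_formula(n):
    return n * (n + 1) // 2     # Gauss's formula: constant time

print(sum_by_loop(100))     # 5050
print(sum_by_formula(100))  # 5050
```

Both produce the same answer; the difference lies only in how the amount of work grows with n.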
Analysing an algorithm determines the amount of time that the algorithm takes to execute. This is not really a number of seconds or any other clock measurement, but rather an approximation of the number of operations that the algorithm performs. The number of operations is related to the execution time, so we will sometimes use the word time to describe an algorithm's computational complexity. The actual number of seconds it takes an algorithm to execute on a computer is not useful in our analysis, because we are concerned with the relative efficiency of algorithms that solve a particular problem. You should also see that actual execution time is not a good measure of algorithm efficiency, because an algorithm does not get better just because we move it to a faster computer, or worse because we move it to a slower one. The actual number of operations done for some specific size of input data set is not very interesting either, nor does it tell us very much. Instead, our analysis will determine an equation that relates the number of operations a particular algorithm does to the size of the input. We can then compare two algorithms by comparing the rate at which their equations grow. The growth rate is critical because there are instances where algorithm A may take fewer operations than algorithm B when the input size is small, but many more when the input size gets large.

So, the time complexity of an algorithm is the amount of time for running the algorithm, counting the operations in the algorithm that contribute to time, represented as a function of the algorithm's input size. Time complexity is independent of machine speed. Similarly, space complexity is the amount of memory required for running the algorithm, again represented as a function of the algorithm's input size; it, too, is independent of the machine. Looking at software that is on the market today, it is easy to see that space analysis is not being done. Programs, even simple ones, regularly quote space needs in a number of megabytes. Software companies seem to feel that making their software space efficient is not a consideration, because customers who don't have enough computer memory can just go out and buy another 32 megabytes (or more) of memory to run the program, or a bigger hard disk to store it. This attitude drives computers into obsolescence long before they really are obsolete.

What is Input Size?

Input size is a parameter for describing the behaviour of the algorithm.
Examples of input size for different problems are given below.

Sorting - the number of items to be sorted (i.e., the array size)
Graphs - the number of vertices and/or edges
Numerical - the number of bits needed to represent a number
Factorial - the n for which the factorial is to be found
Matrix operations - the dimensions of the matrix
How to find out the time complexity:

We can measure the performance of an algorithm either before its implementation on a machine or after. If the performance is analysed through manual paper work before actual implementation, we call it Performance Analysis or a-priori analysis. If the efficiency is measured after the implementation, it is called Performance Measurement or profiling. Profiling can be done by introducing a counter variable into the program; the variable is incremented for each significant operation, i.e., an operation that is integral to the algorithm rather than overhead. Another way is to use the system clock: the algorithm is repeatedly executed for a large number of trials, noting the elapsed time between the start time and end time. This is not an exact measure, because in a multiprogramming environment the noted time may not be accurate; however, if we take more trials, the average time will reveal the behaviour when time is plotted as a function of input size.
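Both techniques can be sketched together (a minimal example; the routine and its names are ours, not from the notes). A counter tracks the significant operation, while the system clock times repeated trials:

```python
import time

def sum_array(a):
    ops = 0                      # counter for the significant operation
    total = 0
    for x in a:
        total += x
        ops += 1                 # one addition per element
    return total, ops

data = list(range(10_000))
start = time.perf_counter()      # system-clock measurement
for _ in range(100):             # repeat trials to smooth out noise
    total, ops = sum_array(data)
elapsed = time.perf_counter() - start
print(ops)                       # 10000 additions, independent of machine
print(elapsed > 0)               # wall-clock time varies from run to run
```

Note that the operation count is the same on every machine, while the elapsed time is not; this is exactly why the count is the better measure.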
The performance analysis is done using one of two methods: step count or operation count.

Step Count Method: In the step count method each instruction in the algorithm is considered.

1. First, we find out the amount of time taken for executing that instruction. This is a rough estimation as a numeric value, counting each operation such as a comparison or an arithmetic operation. For example, a = b[i] + c may be counted as an array access, an addition and an assignment, so 3 steps in total. This is not strictly true in reality, because different operations take different amounts of time; multiplication, for example, takes more time than addition, yet the step count is computed as equal for both operations.

2. Second, we find out the frequency of execution of that instruction. For example, a statement in the body of a loop may be executed n times; this n is its frequency of execution.

Once this is done for all the instructions, the total time is represented as a function of the input size n.
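The step-count convention can be embedded directly in code (a sketch of ours, not from the notes; it mirrors the SumArray example worked out below, which predicts 5N + 3 steps):

```python
# Step-count check for summing an array: one step for Sum = 0, one per
# loop-condition check (N + 1 of them), three per loop body (array
# access + addition + assignment), one per increment, one for Print.
def sum_array_steps(a):
    n = len(a)
    steps = 0
    total = 0; steps += 1             # Sum = 0
    i = 1
    while True:
        steps += 1                    # loop-condition check
        if i > n:
            break
        total += a[i - 1]; steps += 3 # access + addition + assignment
        i += 1;            steps += 1 # post-increment of I
    steps += 1                        # Print sum
    return total, steps

print(sum_array_steps([2, 4, 6, 8]))  # (20, 23) and 5*4 + 3 = 23
```

Running it for several sizes confirms that the count grows as 5N + 3, a linear function of the input size.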
Operation Count Method: In the operation count method we consider only the significant operations, i.e., those that are integral to the algorithm, and ignore the overhead. When an algorithm has a large number of steps to be carried out, the step count is a difficult measure, so the operation count is used instead. For example, in a sorting algorithm we may consider the number of swaps and comparisons; for finding matrix products we may consider the major operations, namely the multiplications.

In general, deciding what to count involves two steps. The first is choosing the significant operation or operations, and the second is deciding which of those operations are integral to the algorithm and which are overhead or bookkeeping. There are two classes of operations that are typically chosen as the significant operation: comparisons and arithmetic. The comparison operators are all considered equivalent and are counted in algorithms such as searching and sorting. In these algorithms, the important task being done is the comparison of two values to determine, when searching, whether the value is the one we are looking for or, when sorting, whether the values are out of order. Comparison operations include equal, not equal, less than, greater than, less than or equal, and greater than or equal.

We will count arithmetic operators in two groups: additive and multiplicative. Additive operators (usually called additions for short) include addition, subtraction, increment, and decrement. Multiplicative operators (usually called multiplications for short) include multiplication, division, and modulus. These two groups are counted separately because multiplications are considered to take longer than additions. In fact, some algorithms are viewed more favourably if they reduce the number of multiplications, even if that means a similar increase in the number of additions.
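For a sorting algorithm, the operation count means counting comparisons and swaps while ignoring the loop bookkeeping. A sketch (bubble sort is chosen only as a simple illustration; the function is ours, not from the notes):

```python
# Operation-count instrumentation: comparisons and swaps are the
# significant operations; the loop indices are overhead and not counted.
def bubble_sort_counted(a):
    a = list(a)
    comparisons = swaps = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return a, comparisons, swaps

sorted_a, c, s = bubble_sort_counted([5, 1, 4, 2, 8])
print(sorted_a, c, s)   # [1, 2, 4, 5, 8] 10 4
```

For 5 elements the comparison count is always 4 + 3 + 2 + 1 = 10, while the swap count depends on the input, which is exactly why best, worst and average cases must be considered.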
In algorithms beyond the scope of this book, logarithms and geometric functions used in algorithms would form yet another group, even more time consuming than multiplications, because these are frequently calculated by a computer through a power series. A special case is integer multiplication or division by a power of 2. This operation can be reduced to a shift operation, which is considered as fast as an addition. There will, however, be very few cases where this is significant, because multiplication or division by 2 is commonly found in divide and conquer algorithms, which frequently have comparison as their significant operation.
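The power-of-2 special case is easy to verify directly (a two-line sketch of ours):

```python
# Multiplication or division by a power of 2 reduces to a shift,
# which costs about the same as an addition on most machines.
x = 44
print(x * 2 == x << 1)    # True: multiply by 2 is a left shift by 1
print(x // 4 == x >> 2)   # True: divide by 4 is a right shift by 2
```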
Example of step count method:

1. Analysis of the algorithm for finding the sum of an array

Statement                                        Steps/exec   Frequency   Total
Algorithm SumArray(a[1..N]) {                        0            -         0
  Sum = 0                                            1            1         1
  For I = 1 To N  // init + condition checks,        1           N+1       N+1
                  // including the failing check
    Sum = Sum + a[I]  // access + add + assign       3            N         3N
  // post-increment of I                             1            N         N
  Print Sum                                          1            1         1
}                                                    0            -         0
--------------------------------------------------------------------------------
Total                                                                     5N + 3

So time as a function of N can be written as f(N) = 5N + 3 [linear time].

What cases to Consider?

Choosing what input to consider when analysing an algorithm can have a significant impact on how the algorithm will perform. If the input list is already sorted, some sorting algorithms will perform very well, but other sorting algorithms may perform very poorly. The opposite may be true if the list is randomly arranged instead of sorted. Because of this, we will not consider just one input set when we analyse an algorithm. In fact, we will look for those input sets that allow an algorithm to perform most quickly and most slowly. We will also consider the overall average performance of the algorithm.

Best Case

As its name indicates, the best case for an algorithm is the input that requires the algorithm to take the shortest time. This input is the combination of values that causes the algorithm to do the least amount of work. If we are looking at a searching algorithm, the best case would be if the value we are searching for (commonly called the target or key) was the value stored in the first location that
the search algorithm would check. This would then require only one comparison, no matter how complex the algorithm is. Notice that for searching through a list of values, no matter how large, the best case results in a constant time of 1. Because the best case for an algorithm will usually be a very small and frequently constant value, we will not do a best-case analysis very often.

Worst Case

Worst-case analysis is important because it gives us an idea of the most time an algorithm will ever take. It requires that we identify the input values that cause the algorithm to do the most work. For searching algorithms, the worst case is one where the value is in the last place we check or is not in the list at all. This could involve comparing the key to each list value, for a total of N comparisons. The worst case gives us an upper bound on how slowly parts of our programs may run based on our algorithm choices.

Average Case

Average-case analysis is the toughest to do because there are a lot of details involved. The basic process begins by determining the number of different groups into which all possible input sets can be divided. The second step is to determine the probability that the input will come from each of these groups. The third step is to determine how long the algorithm will run for each of these groups. All of the inputs in each group should take the same amount of time; if they do not, the group must be split into two separate groups.
Best case of sequential search:

2. Sequential search, best-case analysis (key found at the first location)

Statement                                        Steps/exec   Frequency   Total
Algorithm SeqSearch(a[1..N], Key) {                  0            -         0
  For I = 1 To N  // init + one condition check      1            1         1
    If (a[I] = key)                                  2            1         2
      Return 1                                       1            1         1
}                                                    0            -         0
--------------------------------------------------------------------------------
Total                                                                        4

f(n) = 4 [constant time]
Worst case of sequential search (unsuccessful):

3. Sequential search, worst-case analysis (key not in the array)

Statement                                        Steps/exec   Frequency   Total
Algorithm SeqSearch(a[1..N], Key) {                  0            -         0
  // Initialization of I                             1            1         1
  For I = 1 To N  // condition check                 1           N+1       N+1
    If (a[I] = key)                                  2            N         2N
      Return 1                                       1            0         0
  // post-increment of I                             1            N         N
  Return 0                                           1            1         1
}                                                    0            -         0
--------------------------------------------------------------------------------
Total                                                                     4N + 3

f(n) = 4N + 3 [linear time]
Average case of sequential search:

Average-case analysis is done using a probabilistic method. In the average case the element may be present in any location from 1 to N, or the search may be unsuccessful. Suppose we take only the successful cases; the number of comparisons can then be used to find the average-case behaviour. If the element is present at position 1, the number of comparisons is 1; if the element is found at location 2, the number of comparisons is 2; and similarly for the other locations. There are N places where the target can be located. We will assume that all of these possibilities are equally likely, giving a probability of 1/N for each potential location. Take a moment to answer the following questions before reading on: How many comparisons are done if the match is in the first location? What about the second location? The third? The last, or Nth, location? If you looked at the algorithm carefully, you should have determined that the answers are 1, 2, 3, and N, respectively. This means that for each of our N cases, the number of comparisons is the same as the location where the match occurs. So the equation for the average case is

(1/N)(1 + 2 + 3 + ... + N) = (1/N) · N(N+1)/2 = (N+1)/2
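All three cases of sequential search can be checked empirically (a sketch; the helper is ours, not from the notes):

```python
# Comparison counts for sequential search: best case 1, worst case N,
# average over all successful positions (N + 1) / 2.
def seq_search_comparisons(a, key):
    comparisons = 0
    for x in a:
        comparisons += 1
        if x == key:
            return comparisons
    return comparisons           # unsuccessful search: N comparisons

a = list(range(1, 11))           # N = 10
print(seq_search_comparisons(a, 1))     # best case: 1
print(seq_search_comparisons(a, 10))    # worst successful case: 10
avg = sum(seq_search_comparisons(a, k) for k in a) / len(a)
print(avg)                       # (N + 1) / 2 = 5.5
```

The measured average, 5.5 for N = 10, agrees with the derived (N+1)/2.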
4. Analysis of the factorial algorithm

Statement                                        Steps/exec   Frequency   Total
  FACT = 1                                           1            1         1
  For I = 1 To N  // init + condition checks         1           N+1       N+1
    FACT = FACT * I  // multiply + assign            2            N         2N
  // post-increment of I                             1            N         N
  Return FACT                                        1            1         1
--------------------------------------------------------------------------------
Total                                                                     4N + 3

f(N) = 4N + 3 [linear time]
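The factorial step count can be verified the same way (a sketch of ours, following the counting convention of the table above):

```python
# Step-count check for the factorial algorithm: predicted 4N + 3 steps.
def factorial_steps(n):
    steps = 0
    fact = 1;  steps += 1            # FACT = 1
    i = 1
    while True:
        steps += 1                   # loop-condition check (N+1 times)
        if i > n:
            break
        fact *= i; steps += 2        # multiplication + assignment
        i += 1;    steps += 1        # post-increment of I
    steps += 1                       # Return FACT
    return fact, steps

print(factorial_steps(5))    # (120, 23), and 4*5 + 3 = 23
```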
Analysis of Matrix Transpose Algorithm (Square Matrix):

Algorithm MATTRANSPOSE(a[1..N][1..N])
{
  For i = 1 To N
  {
    For j = 1 To i
    {
      Swap(a[i][j], a[j][i])
    } // post-increment of j
  } // post-increment of i
} // end of algorithm

Statement                                  Steps/exec   Frequency          Total
For i = 1 To N  // init + condition checks     1          N+1              N+1
For j = 1 To i  // init, once per i            1           N                N
inner-loop condition checks                    1      2+3+...+(N+1)    N(N+1)/2 + N
Swap(a[i][j], a[j][i])  // 3 steps, roughly    3      1+2+...+N        3N(N+1)/2
post-increment of j                            1      1+2+...+N        N(N+1)/2
post-increment of i                            1           N                N
--------------------------------------------------------------------------------
Total                                                            5N(N+1)/2 + 4N + 1

f(N) = 5N(N+1)/2 + 4N + 1, a quadratic function of N.
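The dominant term comes from the swap, executed 1 + 2 + ... + N = N(N+1)/2 times. A sketch verifying this (the function is ours, not from the notes; note the pseudocode's j = 1..i range includes the trivial i = j swaps):

```python
# In-place transpose of a square matrix; the swap count is N(N+1)/2,
# including the i == j self-swaps the pseudocode performs.
def transpose_counted(a):
    n = len(a)
    swaps = 0
    for i in range(n):
        for j in range(i + 1):       # j = 1..i in the pseudocode
            a[i][j], a[j][i] = a[j][i], a[i][j]
            swaps += 1
    return swaps

m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(transpose_counted(m))   # 3*4/2 = 6 swaps
print(m)                      # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```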
In the analysis of algorithms, it is not important to know exactly how many operations an algorithm does. Of greater concern is the rate of increase in operations as the size of the problem increases. This is referred to as the rate of growth of the algorithm. What happens with small sets of input data is not as interesting as what happens when the data set gets large. Because we are interested in general behaviour, we just look at the overall growth rate of algorithms, not at the details. If we look closely at the graph in Fig. 1.1, we will see some trends. The function based on n^2 increases slowly at first, but as the problem size gets larger, it begins to grow at a more rapid rate than the function based on n. And if we compare n^2 with the function 2^n, then 2^n grows much faster still. Putting all of this together means that, as we analyse algorithms, we will be interested in which rate-of-growth class an algorithm falls into, rather than trying to find out exactly how many of each operation it does. When we consider the relative size of a function, we will do so for large values of n, not small ones.
The following table shows the different classes of algorithms and the values of time computed as a function of input size.
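Such a table can be regenerated with a short script (ours, not from the notes) that tabulates a few common growth classes for doubling input sizes:

```python
import math

# Tabulate common growth-rate classes for a few input sizes.
sizes = [1, 2, 4, 8, 16, 32]
print(f"{'n':>4} {'log n':>7} {'n log n':>8} {'n^2':>6} {'2^n':>12}")
for n in sizes:
    lg = math.log2(n)
    print(f"{n:>4} {lg:>7.0f} {n * lg:>8.0f} {n**2:>6} {2**n:>12}")
```

Even at n = 32, the 2^n column has already reached 4294967296, illustrating why the exponential class is treated so differently from the polynomial ones.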
Classification of growth and Asymptotic Notations: Because the rate of growth of an algorithm is important, and we have seen that the rate of growth is dominated by the largest term in an equation, we will discard the terms that grow more slowly. When we strip all of these things away, we are left with what we call the order of the function or of the related algorithm. We can then group algorithms together based on their order. We group them in three categories: those that grow at least as fast as some function, those that grow at the same rate, and those that grow no faster.
Big Oh Notation (O): Let n be the input size, and let f(n) and g(n) be positive functions of n. We say that f(n) is O(g(n)) if and only if there exist a real positive constant C and a positive integer n0 such that f(n) ≤ C·g(n) for all n ≥ n0.
Big Omega Notation (Ω): Let n be the input size, and let f(n) and g(n) be positive functions of n. We say that f(n) is Ω(g(n)) if and only if there exist a real positive constant C and a positive integer n0 such that f(n) ≥ C·g(n) for all n ≥ n0.
Theta Notation (Θ): Let n be the input size, and let f(n) and g(n) be positive functions of n. We say that f(n) is Θ(g(n)) if and only if there exist real positive constants C1 and C2 and a positive integer n0 such that C1·g(n) ≤ f(n) ≤ C2·g(n) for all n ≥ n0.
If an algorithm's analysed time is f(n) = 2n^2 in the best case and f(n) = 4n^2 in the worst and average cases, then f(n) = Θ(n^2) is true.

Exercises:

1. Express the function n^3/1000 - 100n^2 - 100n + 3 in terms of Θ-notation.

2. Consider sorting n numbers stored in array A by first finding the smallest element of A and exchanging it with the element in A[1]. Then find the second smallest element of A, and exchange it with A[2]. Continue in this manner for the first n - 1 elements of A. Write
pseudo-code for this algorithm, which is known as selection sort. What loop invariant does this algorithm maintain? Why does it need to run for only the first (n - 1) elements, rather than for all n elements? Give the best-case and worst-case running times of selection sort in Θ-notation.
(These notes have not yet been reviewed, so if you have any doubts please bring them to my notice.)
Topics not yet covered in these notes:
* Horner's Rule
* Recurrence Relations
* Recursive Algorithms: Fibonacci, Tower of Hanoi, Factorial using recursion
* Divide and Conquer