CS350-CH03 - Algorithm Analysis


CS360-Data Structures

Text by Clifford Shaffer


Modified by Dr. Martins

Chapter 03 Algorithm
Analysis
Chaminade University of
Honolulu
Department of Computer
Science
1

Goal

To introduce the basic motivation and fundamental techniques of algorithm analysis.
Algorithm analysis is a methodology for
estimating the resource consumption of
an algorithm.
It allows us to compare the relative
costs of two or more algorithms for
solving the same problem.
2

How to Measure Efficiency?


1. Empirical comparison (run programs)
2. Asymptotic algorithm analysis

Critical resources: time and space

Factors affecting running time: for most algorithms, running
time depends on the size of the input.

Running time is expressed as T(n) for some function T on input size n.
3

Critical Resources

The critical resource for a program is most often its running time.
Other factors include the memory
space required to run the program.
Typically, we will analyze the time
required for an algorithm, and the
space required for a data structure.
4

Two approaches

Two approaches to obtaining running time:

Measuring under standard benchmark conditions.
Estimating the algorithm's performance.

Estimation

Estimation is based on:

The size of the input
The number of basic operations

A basic operation is one whose time to complete does not depend on the
value of its operands.

Example 3.1
// Return position of largest value in array
int largest(int array[], int n) {
  int currlarge = 0;               // Position of largest value seen
  for (int i=1; i<n; i++)          // For each value
    if (array[currlarge] < array[i])
      currlarge = i;               // Remember position
  return currlarge;                // Return position of largest
}

Example 3.1

The basic operation is the comparison.
It takes a fixed amount of time to do one comparison, regardless of the value of the two integers or their positions in the array.
The size of the problem is n.
The running time is T(n) = cn.
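One way to see the linear cost is to count the comparisons directly. The sketch below repeats the scan from Example 3.1 and adds a counter; the name largest_counted and the comparisons parameter are illustrative additions, not part of the text. The scan makes n - 1 comparisons, which grows linearly with n.

```cpp
// Same scan as Example 3.1, but also counts the basic operation.
// `comparisons` is a hypothetical out-parameter added for illustration.
int largest_counted(const int array[], int n, long& comparisons) {
    comparisons = 0;
    int currlarge = 0;                 // Position of largest value seen so far
    for (int i = 1; i < n; i++) {
        comparisons++;                 // One comparison per loop iteration
        if (array[currlarge] < array[i])
            currlarge = i;             // Remember position
    }
    return currlarge;
}
```

Doubling n roughly doubles the count, regardless of the values stored in the array.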
8

Example 3.2
int a = array[0];

The running time is T(n) = c1.

This is called constant running time.

Example 3.3
sum = 0;
for (i=1; i<=n; i++)
  for (j=1; j<=n; j++)
    sum++;

What is the running time for this code?
The basic operation is sum++.
The cost of the sum operation can be bundled into time c2.
The running time is T(n) = c2 n^2.
10

The Growth Rate

The growth rate for an algorithm is the rate at which the cost of the algorithm grows as the size of the input grows.

Linear growth: T(n) = n
Quadratic growth: T(n) = n^2
Exponential growth: T(n) = 2^n

11

Growth Rate Graph

12

Best, Worst, Average Cases

Not all inputs of a given size take the same time to run.

Sequential search for K in an array of n integers:

Begin at the first element in the array and look at each element in turn until K is found.

Best case: The first position of the array has K (1 element examined).
Worst case: The last position in the array has K (n elements examined).
Average case: About n/2 elements are examined.
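The three cases can be checked by counting how many elements a sequential search examines. This is a minimal sketch; the function names and the examined counter are assumptions made for illustration. It assumes the array holds distinct values and that K is equally likely to be at each position, in which case the exact average is (n + 1)/2, or about n/2.

```cpp
// Sequential search; returns the position of K, or n if K is absent.
// `examined` (hypothetical) reports how many elements were looked at.
int sequential_counted(const int array[], int n, int K, int& examined) {
    examined = 0;
    for (int i = 0; i < n; i++) {
        examined++;
        if (array[i] == K) return i;
    }
    return n;
}

// Average number of elements examined when K is equally likely to be
// at each of the n positions (assumes the array holds distinct values).
double average_examined(const int array[], int n) {
    long total = 0;
    for (int p = 0; p < n; p++) {
        int examined = 0;
        sequential_counted(array, n, array[p], examined);
        total += examined;
    }
    return static_cast<double>(total) / n;
}
```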

13

Which Analysis to Use?

The Best Case. Normally, we are not interested in the best case, because:

It is too optimistic.
It is not a fair characterization of the algorithm's running time.
It is useful only in rare cases where the best case has a high probability of occurring.

14

Which Analysis to Use?

The Worst Case. Useful in many real-time applications.

The advantage is:

Predictability: you know for certain that the algorithm must perform at least that well.

The disadvantage is:

It might not be a representative measure of the behavior of the algorithm on inputs of size n.
15

Which Analysis to Use?

The average case. Often we prefer to know the average-case running time.

The average case reveals the typical behavior of the algorithm on inputs of size n.
Average-case estimation is not always possible.
For the sequential search example, it assumes that the integer value of K is equally likely to appear in any position of the array. This assumption is not always correct.

16

The moral of the story

If we know enough about the distribution of our input, we prefer the average-case analysis.
If we do not know the distribution, then we must resort to worst-case analysis.
For real-time applications, the worst-case analysis is the preferred method.
17

A Faster Computer?

How much larger a problem can be solved in a given amount of time by a faster computer?
Assume that you buy a new machine that is 10 times faster than the old one.

18

Linear Algorithms
T(n)    n        n'        Change      n'/n
10n     1,000    10,000    n' = 10n    10

n:  number of inputs processed by the old computer (in 1 hour)
n': number of inputs processed by the new computer (in 1 hour)
Note: The new computer is 10 times faster than your old computer.

19

Quadratic Algorithms
T(n)    n     n'     Change             n'/n
2n^2    70    223    n' = sqrt(10) n    3.16

n:  number of inputs processed by the old computer (in 1 hour)
n': number of inputs processed by the new computer (in 1 hour)
Note: The new computer is 10 times faster than your old computer.

20

Exponential Algorithms
T(n)    n     n'    Change        n'/n
2^n     13    16    n' = n + 3    ---

n:  number of inputs processed by the old computer (in 1 hour)
n': number of inputs processed by the new computer (in 1 hour)
Note: The new computer is 10 times faster than your old computer.

21

Faster Computer or Algorithm?


What happens when we buy a computer 10 times faster?

T(n)        n        n'        Change                   n'/n
10n         1,000    10,000    n' = 10n                 10
20n         500      5,000     n' = 10n                 10
5n log n    250      1,842     sqrt(10) n < n' < 10n    7.37
2n^2        70       223       n' = sqrt(10) n          3.16
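The integer entries in these tables can be reproduced by asking: what is the largest n whose cost fits in a fixed budget of basic operations? The sketch below assumes the old machine performs 10,000 operations in the hour and the new one 100,000; those budgets are inferred from the table (T(n) = 10n gives n = 1,000), and the function names are illustrative.

```cpp
#include <cstdint>

// Cost models from the tables (operation counts as exact integers).
std::int64_t cost_10n(std::int64_t n) { return 10 * n; }
std::int64_t cost_20n(std::int64_t n) { return 20 * n; }
std::int64_t cost_2n2(std::int64_t n) { return 2 * n * n; }
std::int64_t cost_2pn(std::int64_t n) { return std::int64_t{1} << n; }  // 2^n, valid for n < 63

// Largest n with cost(n) <= budget, found by a simple upward scan.
// Assumes cost is non-decreasing and the budget is small enough to avoid overflow.
std::int64_t largest_n(std::int64_t (*cost)(std::int64_t), std::int64_t budget) {
    std::int64_t n = 0;
    while (cost(n + 1) <= budget) n++;
    return n;
}
```

With a tenfold budget, the linear costs gain a factor of 10 in n, the quadratic cost gains about 3.16, and the exponential cost gains only 3 more inputs.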
22

Conclusions

An algorithm with time equation T(n) = 2n^2 does not receive nearly as great an improvement from the faster machine as an algorithm with linear growth rate.
Instead of an improvement by a factor of ten, the improvement is only the square root of 10 (about 3.16).
Instead of buying a faster computer, consider what happens if you replace an algorithm with quadratic running time with a new algorithm with n log n running time.
23

Exercises (work in groups)

Read items 3.1 and 3.2 from chapter 3, on algorithm analysis.
What is the growth rate for an algorithm?
How do you define exponential growth rate?

24

Exercises (work in groups)

What do you understand by:

Sequential search
Best case, worst case, and average case

When are we interested in:

Best case
Average case
Worst case
25

Asymptotic Analysis

Big-Oh
Big Omega
Big Theta

26

Asymptotic Analysis: Big-oh


Definition: For T(n) a non-negatively valued function, T(n) is in the set O(f(n)) if there exist two positive constants c and n0 such that T(n) <= c f(n) for all n > n0.
Usage: The algorithm is in O(n^2) in [best, average, worst] case.
Meaning: For all data sets big enough (i.e., n > n0), the algorithm always executes in at most c f(n) steps in [best, average, worst] case.
27

Big-oh Notation (cont)

Big-oh notation indicates an upper bound.

It describes how bad things can get; perhaps things are not nearly that bad.
We want the lowest possible upper bound.
Example: linear search: n^2 is an upper bound, but n is the tightest upper bound.

Example: If T(n) = 3n^2 then T(n) is in O(n^2).


28

Big-Oh Examples
Example 1: Finding value X in an array.
T(n) = cs n/2.
For all values of n > 1, cs n/2 <= cs n.
Therefore, by the definition, T(n) is in O(n) for n0 = 1 and c = cs.

29

Big-Oh Examples
Example 2: T(n) = c1 n^2 + c2 n in average case.
c1 n^2 + c2 n <= c1 n^2 + c2 n^2 <= (c1 + c2) n^2 for all n > 1.
T(n) <= c n^2 for c = c1 + c2 and n0 = 1.
Therefore, T(n) is in O(n^2) by the definition.
Example 3: T(n) = c. We say this is in O(1).
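The witnesses c and n0 from Example 2 can be spot-checked numerically. The sketch below picks c1 = 3 and c2 = 5 as arbitrary constants (an assumption, not values from the text) and tests the inequality over a finite range. A finite check is not a proof; it can only fail to find a counterexample.

```cpp
// Checks T(n) <= c * f(n) for every n with n0 < n <= limit.
// A finite check cannot prove the bound; it only looks for a counterexample.
bool holds_up_to(long (*T)(long), long (*f)(long), long c, long n0, long limit) {
    for (long n = n0 + 1; n <= limit; n++)
        if (T(n) > c * f(n)) return false;
    return true;
}

// Example 2 with the arbitrary (assumed) constants c1 = 3 and c2 = 5.
long T_example(long n) { return 3 * n * n + 5 * n; }
long n_squared(long n) { return n * n; }
long linear(long n)    { return n; }
```

With c = c1 + c2 = 8 and n0 = 1 the bound against n^2 holds over the tested range, while the same constants against n fail already at n = 2.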

30

A Common Misunderstanding
"The best case for my algorithm is n = 1 because that is the fastest." WRONG!
Big-oh refers to a growth rate as n grows to infinity.
Best case is defined as which input of size n is cheapest among all inputs of size n.
31

Big-Omega
Definition: For T(n) a non-negatively valued function, T(n) is in the set Ω(g(n)) if there exist two positive constants c and n0 such that T(n) >= c g(n) for all n > n0.
Meaning: For all data sets big enough (i.e., n > n0), the algorithm always executes in at least c g(n) steps.
Ω indicates a lower bound.
32

Big-Omega Example
T(n) = c1 n^2 + c2 n.
c1 n^2 + c2 n >= c1 n^2 for all n > 1.
T(n) >= c n^2 for c = c1 and n0 = 1.
Therefore, T(n) is in Ω(n^2) by the definition.
We want the greatest lower bound.
33

Theta Notation
When big-Oh and Ω meet, we indicate this by using Θ (big-Theta) notation.
Definition: An algorithm is said to be Θ(h(n)) if it is in O(h(n)) and it is in Ω(h(n)).

34

A Common Misunderstanding
Confusing worst case with upper bound.

Worst case refers to the worst input from among the choices for possible inputs of a given size.

Upper bound refers to a growth rate.

35

Simplifying Rules
1. If f(n) is in O(g(n)) and g(n) is in O(h(n)), then f(n) is in O(h(n)).
2. If f(n) is in O(k g(n)) for any constant k > 0, then f(n) is in O(g(n)).
3. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then (f1 + f2)(n) is in O(max(g1(n), g2(n))).
4. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then f1(n) f2(n) is in O(g1(n) g2(n)).
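As a worked illustration of how the rules combine, consider a made-up cost function (the T(n) below is an assumption chosen for the example, not taken from the text):

```latex
\begin{align*}
T(n) &= 3n^2 + 5n\log n + 7 \\
3n^2 &\in O(n^2), \quad 5n\log n \in O(n\log n), \quad 7 \in O(1)
  && \text{(rule 2: drop constant factors)} \\
T(n) &\in O(\max(n^2,\; n\log n,\; 1)) = O(n^2)
  && \text{(rule 3: keep the fastest-growing term)}
\end{align*}
```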
36

Running Time Examples (1)


Example 1: a = b;
This assignment takes constant time, so it is Θ(1).
Example 2:
sum = 0;
for (i=1; i<=n; i++)
  sum += n;
37

Running Time Examples (2)


Example 3:
sum = 0;
for (j=1; j<=n; j++)
  for (i=1; i<=j; i++)
    sum++;
for (k=0; k<n; k++)
  A[k] = k;

38

Running Time Examples (3)


Example 4:
sum1 = 0;
for (i=1; i<=n; i++)
  for (j=1; j<=n; j++)
    sum1++;
sum2 = 0;
for (i=1; i<=n; i++)
  for (j=1; j<=i; j++)
    sum2++;

39

Running Time Examples (4)


Example 5:
sum1 = 0;
for (k=1; k<=n; k*=2)
  for (j=1; j<=n; j++)
    sum1++;
sum2 = 0;
for (k=1; k<=n; k*=2)
  for (j=1; j<=k; j++)
    sum2++;

40

Binary Search

How many elements are examined in the worst case?
41

Binary Search
// Return position of element in sorted
// array of size n with value K.
int binary(int array[], int n, int K) {
  int l = -1;
  int r = n;                     // l, r are beyond array bounds
  while (l+1 != r) {             // Stop when l, r meet
    int i = (l+r)/2;             // Check middle
    if (K < array[i]) r = i;     // Left half
    if (K == array[i]) return i; // Found it
    if (K > array[i]) l = i;     // Right half
  }
  return n;                      // Search value not in array
}
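To answer the worst-case question empirically, the search can be instrumented to count the elements it examines. This sketch follows the same l and r scheme as the code above; the name binary_counted and the examined counter are illustrative additions. For a sorted array of 16 elements it never examines more than 5, which is log2(16) + 1.

```cpp
// Binary search as above, with a hypothetical `examined` counter
// that records how many array elements were inspected.
int binary_counted(const int array[], int n, int K, int& examined) {
    examined = 0;
    int l = -1;
    int r = n;                          // l and r are beyond the array bounds
    while (l + 1 != r) {                // Stop when l and r meet
        int i = (l + r) / 2;            // Check middle of remaining subarray
        examined++;
        if (K == array[i]) return i;    // Found it
        if (K < array[i]) r = i;        // Continue in left half
        else l = i;                     // Continue in right half
    }
    return n;                           // Search value not in array
}
```

Each iteration halves the remaining range, so the count grows with log n instead of n.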
42

Other Control Statements


while loop: Analyze like a for loop.
if statement: Take greater complexity of
then/else clauses.
switch statement: Take complexity of most
expensive case.
Subroutine call: Complexity of the subroutine.

43

Analyzing Problems
Upper bound: Upper bound of best known
algorithm.
Lower bound: Lower bound for every
possible algorithm.

44

Analyzing Problems: Example


Common misunderstanding: No distinction
between upper/lower bound when you know the
exact running time.

Example of imperfect knowledge: Sorting

45

Analyzing Problems: Example


1. Cost of I/O: Ω(n). This is a lower bound for every possible algorithm.
2. Bubble or insertion sort: O(n^2).
3. A better sort (Quicksort, Mergesort, Heapsort, etc.): O(n log n). This is an upper bound for the best known algorithm.
4. We prove later that sorting is in Ω(n log n).
46

Multiple Parameters
Compute the rank ordering for all C pixel values in a picture of P pixels.

for (i=0; i<C; i++)    // Initialize count
  count[i] = 0;
for (i=0; i<P; i++)    // Look at all pixels
  count[value(i)]++;   // Increment count
sort(count);           // Sort pixel counts

If we use P as the measure, then time is Θ(P log P).
More accurate is Θ(P + C log C).
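A runnable version of the pixel-count fragment makes the two parameters visible. This is a sketch under assumptions: the pixels vector stands in for value(i), every pixel value is assumed to lie in the range 0 to C - 1, and std::sort is used as the sorting step.

```cpp
#include <algorithm>
#include <vector>

// Counts how often each of C possible pixel values occurs among the P pixels,
// then sorts the counts. `pixels` (hypothetical) stands in for value(i).
std::vector<int> sorted_counts(const std::vector<int>& pixels, int C) {
    std::vector<int> count(C, 0);            // Initialize count: C steps
    for (int v : pixels)                     // Look at all pixels: P steps
        count[v]++;                          // Increment count
    std::sort(count.begin(), count.end());   // Sort pixel counts: O(C log C)
    return count;
}
```

Only the C counts are sorted, never the P pixels, which is why the cost depends on C log C and not on P log P.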
47

Space Bounds
Space bounds can also be analyzed with
asymptotic complexity analysis.
Time: Algorithm
Space: Data Structure

48

Space/Time Tradeoff Principle


One can often reduce time if one is willing to sacrifice space, or vice versa. Examples:

1. Encoding or packing information
2. Storing Boolean flags
3. Calculating function values (table lookup)

49

The space/time tradeoff example 1

Packing Information

Reduce storage requirements by packing or encoding information.
Unpacking or decoding information requires additional time.

50

The space/time tradeoff example 2

Storing 32 Boolean Flags

Type of storage                 32 integers     32 chars       32 bit fields
Space required (bytes)          32 x 4 = 128    32 x 1 = 32    32 / 8 = 4
Time required to set a value    T               T              5T

Note: T is a relative approximation and is machine dependent.
51
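The two ends of this tradeoff can be sketched as follows. The struct names are illustrative, and the relative timing is not measured here; the point is only that the packed version needs a shift and a mask for every access, while the byte version is a direct store.

```cpp
#include <cstdint>

// 32 flags packed into one 32-bit word (4 bytes).
struct PackedFlags {
    std::uint32_t bits = 0;
    void set(int i, bool on) {
        std::uint32_t mask = std::uint32_t{1} << i;   // Extra shift and mask work
        if (on) bits |= mask; else bits &= ~mask;
    }
    bool get(int i) const { return ((bits >> i) & 1u) != 0; }
};

// 32 flags stored one per byte (32 bytes).
struct ByteFlags {
    unsigned char flag[32] = {};
    void set(int i, bool on) { flag[i] = on ? 1 : 0; }  // Direct store
    bool get(int i) const { return flag[i] != 0; }
};
```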

The space/time tradeoff example 3

Calculating Functions

A table lookup pre-stores the value of a function that would otherwise be computed each time it is needed:

Factorials: e.g. 12!

Calculate 12 x 11 x 10 x ... x 1 (using no space)
Pre-store 479001600 (using 4 bytes)

Sine, cosine
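For the factorial case, the two options might look like this. The names are illustrative; the table stops at 12! because that is the largest factorial that fits in an unsigned 32-bit integer.

```cpp
#include <cstdint>

// Computes n! with a loop: no table space, n - 1 multiplications per call.
std::uint32_t factorial_computed(int n) {
    std::uint32_t result = 1;
    for (int i = 2; i <= n; i++) result *= i;
    return result;
}

// Pre-stored values of 0! through 12!.
const std::uint32_t FACTORIAL_TABLE[13] = {
    1u, 1u, 2u, 6u, 24u, 120u, 720u, 5040u, 40320u, 362880u,
    3628800u, 39916800u, 479001600u
};

// Table lookup: constant time, at the cost of 13 x 4 = 52 bytes of storage.
// Assumes 0 <= n <= 12.
std::uint32_t factorial_lookup(int n) { return FACTORIAL_TABLE[n]; }
```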

52

Space/Time Tradeoff Principle


Disk-based Space/Time Tradeoff :
The smaller you make the disk storage
requirements, the faster your program
will run.

53

Practical Considerations

There is no great difference in running time between Θ1(n) and Θ2(n log n).

Example:
Θ1(10,000) = 10,000
Θ2(10,000) = 10,000 log10 10,000 = 40,000

54

Practical Considerations

There is an enormous difference between Θ1(n^2) and Θ2(n log n).

Θ1(10,000) = 100,000,000
Θ2(10,000) = 10,000 log10 10,000 = 40,000

55

Practical Considerations

Many problems whose obvious solution requires Θ(n^2) time also have a solution that requires Θ(n log n) time. Examples:

Sorting
Searching

56

Code tuning

Practical Considerations

Code tuning can also lead to dramatic improvements in running time.
Code tuning is the art of hand-optimizing a program to run faster or require less storage.
For many programs, code tuning can reduce running time by a factor of ten.
57

Code tuning

Remarks

Most statements in a program do not have much effect on the running time of that program.
There is little point in cutting in half the running time of a subroutine that accounts for only 1% of the total.
Focus your attention on the parts of the program that have the most impact.
58

Code tuning

Remarks

When tuning code, it is important to gather good timing statistics.
Be careful not to use tricks that make the program unreadable.
Make use of compiler optimizations.
Check that your optimizations really improve the program.
59

Remarks

Comparative timing of programs is a difficult business:

Experimental errors from uncontrolled factors (system load, language, compiler, etc.)
Bias towards a program
Unequal code tuning

60

Remarks

The greatest time and space improvements come from a better data structure or algorithm.

FIRST TUNE THE ALGORITHM, THEN TUNE THE CODE

61

Appendix A - Notes
Algorithm Analysis

62

Computational complexity
theory

Complexity theory is part of the theory of computation dealing with the resources required during computation to solve a given problem.
The most common resources are time (how many steps does it take to solve a problem) and space (how much memory does it take to solve a problem).

63

Computational complexity
theory

Other resources can also be considered, such as how many parallel processors are needed to solve a problem in parallel.
Complexity theory differs from computability theory, which deals with whether a problem can be solved at all, regardless of the resources required.
64

Computational complexity
theory

If a problem has time complexity O(n) on one typical computer, then it will also have complexity O(n) on most other computers.
This notation allows us to generalize away from the details of a particular computer.
65

Big Oh

The Big Oh is the upper bound of a function.
In the case of algorithm analysis, we use it to bound the worst-case running time, or the longest running time possible for any input of size n.
We can say that the maximum running time of the algorithm is in the order of Big Oh.
66

Big Omega

Ω is also an order of growth, but it is the opposite of the Big Oh: it is the lower bound of a function.
We can say that the minimum running time of the algorithm is in the order of Ω.

67

Upper / Lower bounds

68

Remarks

We only care what happens for large n (n >= n0).
O, Ω, and Θ "hide" the constant C from us.
If a_n is the worst-case time of an algorithm, then O(.) gives worst-case guarantees, while Ω gives a lower bound.
69

Theta Notation
[Figure: for all n >= n0, f(n) lies between C1 g(n) and C2 g(n), so f(n) = Θ(g(n)).]

70

Appendix B - Exercises
Algorithm Analysis

71

Write True or False


1. The equation T(n) = 3n + 2 is an example of a linear growth rate.
2. If Algorithm A has a faster growth rate than Algorithm B in the average case, that means Algorithm A is more efficient than Algorithm B on average.
3. When performing asymptotic analysis, we can ignore constants and low-order terms.
72

Write True or False


4. The best case for an algorithm occurs when the input size is as small as possible.
5. Asymptotic algorithm analysis is most useful when the input size is small.
6. When performing algorithm analysis, we measure the cost of programs in terms of basic operations. Each operation should require constant time.
73

Write True or False

If a program has a growth rate proportional to n^2 for an input size n, then a computer that runs twice as fast will be able to run in one hour an input that is twice as large as that which can be run in one hour on the slower computer.

74

Write True or False

The concepts of asymptotic analysis apply equally well to space costs as they do to time costs.
The most reliable method for comparing two approaches to solving a problem is simply to write two programs and compare their running time.
We can often make a program faster if we are willing to use more space, and conversely, we can often make a program require less space if we are willing to take more running time.
75

Exercise

Suppose that a particular algorithm has time complexity T(n) = n^2, and that executing an implementation of it on a particular machine takes T seconds for N inputs. Now suppose that we are presented with a machine that is 64 times as fast. How many inputs could we process on the new machine in T seconds?
76

CS350-Data Structures
Created by David Schneider
Modified by Prof. Martins

Appendix C
Calculating the Running Time
of a Program
Chaminade University of
Honolulu
Department of Computer
Science
77

Analysis of the first double loop example 3.10

Running Time Examples (2)


Example 3.10 (page 65):
sum = 0;
for (j=1; j<=n; j++)
  for (i=1; i<=j; i++)
    sum++;
for (k=0; k<n; k++)
  A[k] = k;

78

Analysis of the first loop example 3.10

Number of executions

Outer loop j:    1    2      3        ...    n
Inner loop i:    1    1,2    1,2,3    ...    1,2,...,n
# runs:          1 + 2 + 3 + 4 + 5 + ... + n
79

Analysis of the first loop example 3.10

Number of executions

# runs = 1 + 2 + 3 + 4 + ... + n = \sum_{j=1}^{n} j

\sum_{j=1}^{n} j = n(n+1)/2

Closed form solution: see page 31, equation 2.1.

Which is Θ(n^2).
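The closed form can be confirmed by running the double loop of Example 3.10 and comparing the final value of sum with n(n+1)/2. The function name below is illustrative.

```cpp
// Runs the double loop of Example 3.10 and returns the final value of sum.
long count_triangular(long n) {
    long sum = 0;
    for (long j = 1; j <= n; j++)
        for (long i = 1; i <= j; i++)
            sum++;
    return sum;
}
```

For n = 100 the loop body runs 5,050 times, matching 100 x 101 / 2.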
80

Analysis of first double loop example 3.12

Running Time Examples (4)


Example 3.12:
sum1 = 0;
for (k=1; k<=n; k*=2)
  for (j=1; j<=n; j++)
    sum1++;
sum2 = 0;
for (k=1; k<=n; k*=2)
  for (j=1; j<=k; j++)
    sum2++;

81

Analysis of the first double loop example 3.12

Outer loop k:    1            2            4            ...    n
Inner loop j:    1,2,...,n    1,2,...,n    1,2,...,n    ...    1,2,...,n
# runs:          n            n            n            ...    n

The outer loop executes log n times, so the total is n x log n.

82

Analysis of the first double loop example 3.12

Number of executions

# runs = n + n + ... + n (log n terms) = \sum_{j=1}^{\log n} n = n log n

Closed form solution: see page 31, equation 2.3.

Which is Θ(n log n).
83

Analysis of first double loop example 3.12

Running Time Example


Outer loop
for (k=1; k<=n; k*=2)
This loop is executed log n times.
Inner loop
for (j=1; j<=n; j++)
This loop is executed n times for each iteration of the outer loop.
Therefore the running time is Θ(n log n).

84

Remarks

In this course nearly all logarithms used have a base of two. If any base for a logarithm other than two is intended, then the base is shown explicitly.

85

Analysis of the second double loop example 3.12

Running Time
sum2 = 0;
for (k=1; k<=n; k*=2)
  for (j=1; j<=k; j++)
    sum2++;

The outer loop always executes log n times, as we have seen.
The inner loop executes according to the table in the next slide.
86

Analysis of the second double loop example 3.12

Outer loop k:    1    2      4          8            ...    n
Inner loop j:    1    1,2    1,2,3,4    1,2,...,8    ...    1,2,...,n
# runs:          1 + 2 + 4 + 8 + 16 + ... + n

The outer loop executes log n times.

87

Analysis of the second double loop example 3.12

Number of executions

# runs = 1 + 2 + 4 + 8 + 16 + ... + n
       = 1 + 2^1 + 2^2 + 2^3 + 2^4 + ... + 2^{log n}

\sum_{i=0}^{\log n} 2^i = 2n - 1

Closed form solution: see page 32, equation 2.9.

Which is simply Θ(n).
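Both double loops of Example 3.12 can be run and counted to compare against the closed forms. The function names are illustrative, and the exact counts quoted here assume n is a power of two. In that case the outer loop body runs log2(n) + 1 times (k = 1, 2, 4, ..., n), so the first loop counts n(log2(n) + 1), which is still Θ(n log n), and the second counts 2n - 1.

```cpp
// First double loop of Example 3.12: the inner loop always runs n times.
long count_first(long n) {
    long sum1 = 0;
    for (long k = 1; k <= n; k *= 2)
        for (long j = 1; j <= n; j++)
            sum1++;
    return sum1;
}

// Second double loop of Example 3.12: the inner loop runs k times.
long count_second(long n) {
    long sum2 = 0;
    for (long k = 1; k <= n; k *= 2)
        for (long j = 1; j <= k; j++)
            sum2++;
    return sum2;
}
```

For n = 16 the counts are 80 and 31; the two loops share the same outer loop yet differ in growth rate because of the inner bound.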


88
