CS2040 Data Structures and Algorithms
Lecture Note #3
Analysis of Algorithms
Objectives
• To introduce the theoretical basis for measuring the efficiency of algorithms
• To learn how to use such measures to compare the efficiency of different algorithms
Outline
1. What is an Algorithm?
2. What do we mean by Analysis of
Algorithms?
3. Big-O notation – Upper Bound
4. How to find the complexity of a program?
1 What is an algorithm?
1.1 Algorithm
◼ A step-by-step procedure for solving a problem.
◼ Properties of an algorithm:
❑ Each step of an algorithm must be exact.
❑ An algorithm must terminate.
❑ An algorithm must be effective.
❑ *An algorithm should be general.
2 What do we mean by
Analysis of Algorithms?
2.1 What is Analysis of Algorithms?
◼ Analysis of algorithms
❑ Provides tools for comparing the efficiency of
different methods of solution (rather than programs)
❑ Efficiency = Complexity of algorithms
◼ A comparison of algorithms
❑ Should focus on significant differences in the
efficiency of the algorithms
❑ Should not consider reductions in computing costs
due to clever coding tricks. Tricks may reduce the
readability of an algorithm.
2.2 Determining the Efficiency/Complexity
of Algorithms
◼ To rigorously evaluate the resources (time and space) needed by an algorithm, and to represent the result of the analysis with a formula
◼ We will emphasize the time requirement rather than the space requirement here
◼ The time requirement of an algorithm is also called its time complexity
2.3 By measuring the run time?
TimeTest.java
public class TimeTest {
    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        long total = 0;
        for (int i = 0; i < 10000000; i++) {
            total += i;
        }
        long stopTime = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;
        System.out.println(elapsedTime);
    }
}
Note: The run time depends on the compiler, the computer
used, and the current work load of the computer.
2.4 Exact run time is not always needed
❑ Using exact run time is not meaningful
when we want to compare two algorithms
▪ coded in different languages,
▪ running on different computers or
▪ using different data sets
2.5 Determining the Efficiency of
Algorithms
◼ Algorithm analysis should be independent of
❑ Specific implementations
❑ Compilers and their optimizers
❑ Computers
❑ Data
2.6 Run Time of Algorithms
◼ Instead of working out the exact timing, we count
the number of some or all of the primitive
operations (e.g. +, -, *, /, assignment, …)
needed.
◼ Counting an algorithm's primitive operations is a
way to assess its efficiency
❑ An algorithm’s run time is proportional to the number
of primitive operations it executes.
2.7 Counting the number of statements
◼ To simplify the counting further, we can ignore
❑ the different types of primitive operations, and
❑ different number of primitive operations in a statement,
and simply count the number of statements
executed.
◼ Caveat
❑ The number of primitive operations per statement is
bounded and thus can be taken to be some constant
amount
❑ A function call is not considered a primitive operation; we need to work out the time complexity of the called function separately, as the sketch below illustrates.
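For instance, here is a minimal sketch (the names and values are illustrative, not from the slides) of why function-call costs must be expanded: the loop body below is a single statement, but each call to sumTo is itself O(n), so the fragment is O(n) × O(n) = O(n²) overall, not O(n).
CallCostDemo.java
public class CallCostDemo {
    // O(n): performs n additions
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
    public static void main(String[] args) {
        int n = 1000;
        long grand = 0;
        for (int i = 0; i < n; i++) {   // the loop runs n times...
            grand += sumTo(n);          // ...and each call costs O(n)
        }
        System.out.println(grand);      // overall: O(n^2)
    }
}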
2.8 Computation cost of an algorithm
◼ How many operations are required?
for (int i=1; i<=n; i++) {
    perform 100 operations;       // A
    for (int j=1; j<=n; j++) {
        perform 2 operations;     // B
    }
}
Total Ops = A + B = Σ_{i=1..n} 100 + Σ_{i=1..n} Σ_{j=1..n} 2
          = 100n + Σ_{i=1..n} 2n = 100n + 2n² = 2n² + 100n
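As a sanity check, a small program (illustrative only; OpCountDemo is a made-up name, not part of the slides) can count the modeled operations directly and compare the total against the formula 2n² + 100n:
OpCountDemo.java
public class OpCountDemo {
    public static void main(String[] args) {
        int n = 50;
        long ops = 0;
        for (int i = 1; i <= n; i++) {
            ops += 100;                      // A: 100 operations
            for (int j = 1; j <= n; j++) {
                ops += 2;                    // B: 2 operations
            }
        }
        // Both printed numbers should equal 2n^2 + 100n
        System.out.println(ops + " vs " + (2L * n * n + 100L * n));
    }
}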
2.9 Approximation of analysis results
◼ Very often, we are interested only in using a simple
term to indicate how efficient an algorithm is. The
exact formula of an algorithm’s performance is not
really needed.
◼ A good simple term to “characterize” an algorithm’s
efficiency is one describing the growth rate of the
algorithm (how the number of operations executed
grows as n increases in size). This is also known as
the growth term or order of magnitude.
2.10 Example of Algorithm Growth Rates
Figure - Time requirements as a function of the problem size n
◼ Algorithm efficiency is typically a concern for large
problems only. Why?
2.11 Algorithm Growth Rates (1/2)
◼ Given the exact function for the number of operations executed by an algorithm, how do we get the growth term?
◼ For example: 2n² + 100n
❑ Of the two terms 2n² and 100n, the dominating (leading) term 2n² will be used as the growth term
◼ This kind of analysis of the efficiency of an algorithm
is called asymptotic analysis of the algorithm
2.12 Asymptotic analysis
◼ Asymptotic analysis is an analysis of algorithms
that focuses on
❑ analyzing the problems of large input size,
❑ considering only the leading term of the formula, and
❑ ignoring the coefficient of the leading term
◼ Some notations are needed in asymptotic
analysis
3 Big O notation
3.1 Definition – Big O notation
◼ Given a function f(n), we say g(n) is an (asymptotic) upper bound of f(n), denoted as f(n) = O(g(n)), if there exist a constant c > 0 and a positive integer n₀ such that f(n) ≤ c·g(n) for all n ≥ n₀.
Figure - The curve f(n) stays below the curve c·g(n) for all n ≥ n₀
❑ f(n) is said to be bounded from above by g(n).
❑ O() is called the “big O” notation.
❑ Another way of saying it is that O(g(n)) is the set of all functions f(n) that are asymptotically upper bounded by g(n).
3.2 Ignore the coefficients of all terms
◼ Based on the definition, 2n² and 30n² have the same upper bound n², i.e.,
❑ 2n² = O(n²)   Why?
   Let f₁(n) = 2n²; g(n) = n².
   Let c = 3 and n₀ = 1; since 2n² ≤ cn² for all n ≥ n₀,
   f₁(n) = O(g(n)).
❑ 30n² = O(n²)
   Let f₂(n) = 30n²; g(n) = n².
   Let c = 31 and n₀ = 1; since 30n² ≤ cn² for all n ≥ n₀,
   f₂(n) = O(g(n)).
They differ only in the choice of c.
◼ Therefore, in big O notation, we can omit the coefficients of all terms in a formula:
❑ Example: f(n) = 2n² + 100n = O(n²) + O(n)
3.3 Finding the constants c and n0
◼ Given f(n) = 2n² + 100n, prove that f(n) = O(n²).
   Observe that 2n² + 100n ≤ 2n² + n² = 3n² whenever n ≥ 100.
   → Set the constants to be c = 3 and n₀ = 100.
   By definition, we have f(n) = O(n²).
Notes:
1. n² ≤ 2n² + 100n for all n, i.e., g(n) ≤ f(n), and yet g(n) is an asymptotic upper bound of f(n).
2. c and n₀ are not unique.
   For example, we can choose c = 2 + 100 = 102 and n₀ = 1 (because f(n) ≤ 102n² for all n ≥ 1).
Q: Can we write f(n) = O(n³)? Yes
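The chosen constants can also be checked numerically. A minimal sketch, assuming the c = 3 and n₀ = 100 from above (BigOCheck is a made-up name; the proof above is what actually establishes the bound):
BigOCheck.java
public class BigOCheck {
    public static void main(String[] args) {
        long c = 3, n0 = 100;
        boolean holds = true;
        for (long n = n0; n <= 1_000_000; n++) {
            if (2 * n * n + 100 * n > c * n * n) {  // violates f(n) <= c*g(n)?
                holds = false;
                break;
            }
        }
        System.out.println(holds);  // prints true for all sampled n >= n0
    }
}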
3.4 Is the bound tight?
◼ The complexity of an algorithm can be bounded
by many functions.
◼ Example:
❑ Let f(n) = 2n² + 100n.
❑ f(n) is bounded by n², n³, n⁴ and many others according to the definition of big O notation.
❑ Hence, the following are all correct:
   ◼ f(n) = O(n²); f(n) = O(n³); f(n) = O(n⁴)
◼ However, we are more interested in the tightest bound, which is n² in this case.
3.5 Growth Terms: Order-of-Magnitude
◼ In asymptotic analysis, a formula can be simplified
to a single term with coefficient 1
◼ Such a term is called a growth term (rate of growth,
order of growth, order-of-magnitude)
◼ The most common growth terms can be ordered as
follows: (note: many others are not shown)
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!)
(“fastest” on the left, “slowest” on the right)
Note:
▪ “log” = log base 2, or log₂; “log₁₀” = log base 10; “ln” = log base e. In big O, all these log functions are the same. (Why?)
3.6 Some more properties of Big O
1. if f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n)) then
   f₁(n) + f₂(n) = O(g₁(n) + g₂(n)) = O(max(g₁(n), g₂(n)))
2. if f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n)) then
   f₁(n) · f₂(n) = O(g₁(n) · g₂(n))
3. if f₁(n), f₂(n) = O(g(n)) then
   f₁(n) + f₂(n) = O(g(n))
4. if g₁(n) = O(g₂(n)) then
   g₁(n) + g₂(n) = O(g₂(n))
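Property 1 is what makes sequential code easy to analyze. A sketch (illustrative, not from the slides): two phases run one after the other, so their costs add and the larger growth term dominates.
SumRuleDemo.java
public class SumRuleDemo {
    public static void main(String[] args) {
        int n = 1000;
        long work = 0;
        for (int i = 0; i < n; i++) {        // phase 1: O(n)
            work++;
        }
        for (int i = 0; i < n; i++) {        // phase 2: O(n^2)
            for (int j = 0; j < n; j++) {
                work++;
            }
        }
        System.out.println(work);            // total: O(n) + O(n^2) = O(n^2)
    }
}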
3.7 Examples on Big O notation
◼ f₁(n) = ½n + 4 = O(n)
◼ f₂(n) = 240n + 0.001n² = O(n²)
◼ f₃(n) = n log n + log n + n log(log n) = O(n log n)
3.8 Order-of-Magnitude Analysis and
Big O Notation (1/2)
Figure - Comparison of growth-rate functions in tabular form
3.8 Order-of-Magnitude Analysis and
Big O Notation (2/2)
Figure - Comparison of growth-rate functions in graphical form
3.9 Summary: Order-of-Magnitude
Analysis and Big O Notation
◼ Order of growth of some common functions:
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!)
◼ Properties of growth-rate functions/Big O notation
❑ You can ignore low-order terms
❑ You can ignore a multiplicative constant in the high-order
term
❑ Other properties are listed in Section 3.6
Common Questions
1. When we want to prove that a function f(n) = O(g(n)), do we have to show the working, i.e., pick a c and n₀ so that c·g(n) ≥ f(n) for all n ≥ n₀?
❑ No, just use the rule of thumb: identify the dominating term in f(n) and use it as g(n).
2. How do we know if we have the tightest upper bound for a function f(n)?
❑ If f(n) is accurately computed, the dominating term will be the tightest bound.
❑ Difficulties:
   ◼ Computing the accurate f(n) (we will work through some examples in this lecture)
   ◼ Identifying the dominating term (sometimes terms in f(n) need not be among the common growth terms listed in these lecture notes)
4 How to find the complexity
of a program?
4.1 Examples on finding complexity (1/3)
◼ What is the complexity of the following code fragment?
int sum = 0;
for (int i=1; i<n; i=i*2) {
    sum++;
}
◼ It is clear that sum is incremented only when
   i = 1, 2, 4, 8, …, 2ᵏ where k = log₂ n.
   There are k+1 iterations, so the complexity is O(k), or O(log n).
Note:
▪ In Computer Science, log n means log₂ n.
▪ When 2 is replaced by 10 in the ‘for’ loop, the complexity is O(log₁₀ n), which is the same as O(log₂ n). (Why?)
▪ log₁₀ n = log₂ n / log₂ 10
4.1 Examples on finding complexity (2/3)
◼ What is the complexity of the following code fragment?
int sum = 0;
for (int i=1; i<n; i=i*2) {
    for (int j=n; j>1; j=j/2)
        sum++;
}
◼ The outer loop runs O(log n) times, and the inner loop runs O(log n) times for each outer iteration, so the complexity is O(log² n).
4.1 Examples on finding complexity (3/3)
◼ What is the complexity of the following code fragment?
(For simplicity, let’s assume that n is some power of 3.)
int sum = 0;
for (int i=1; i<n; i=i*3) {
    for (int j=1; j<=i; j++) {
        sum++;
    }
}
◼ T(n) = 1 + 3 + 9 + 27 + … + 3^(log₃ n)
        = 1 + 3 + … + n/9 + n/3 + n
        = n + n/3 + n/9 + … + 3 + 1   (reversing the terms in the previous step)
        = n · (1 + 1/3 + 1/9 + …)
        ≤ n · (3/2)
        = 3n/2
        = O(n)
   Why is (1 + 1/3 + 1/9 + …) ≤ 3/2? It is the sum of an infinite geometric series; see the file “analysis_of_algorithms_useful_equalities.pdf”.
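The bound can also be checked empirically. A small sketch (the value of n is an arbitrary power of 3; GeometricDemo is a made-up name):
GeometricDemo.java
public class GeometricDemo {
    public static void main(String[] args) {
        int n = 59049;                       // 3^10
        long sum = 0;
        for (int i = 1; i < n; i = i * 3) {
            for (int j = 1; j <= i; j++) {
                sum++;
            }
        }
        // sum stays below 3n/2, consistent with O(n)
        System.out.println(sum + " <= " + (3L * n / 2));
    }
}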
4.2 Non-recursive Binary Search Algorithm (1)
◼ Requires array to be sorted in ascending order
◼ Maintain subarray where x (the search key) might
be located
◼ Repeatedly compare x with m, the middle element
of current subarray
❑ If x = m, found it!
❑ If x > m, continue search in subarray after m
❑ If x < m, continue search in subarray before m
4.2 Non-recursive Binary Search Algorithm (2)
◼ Data in the array a[] are sorted in ascending order
public static int binSearch(int[] a, int len, int x) {
    int mid, low = 0;
    int high = len - 1;
    while (low <= high) {                    // search space: a[low..high]
        mid = (low + high) / 2;
        if (x == a[mid]) return mid;         // found
        else if (x > a[mid]) low = mid + 1;  // continue in right half
        else high = mid - 1;                 // continue in left half
    }
    return -1;                               // not found
}
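A possible usage sketch (the array values are made up; this fragment would live in the same class as binSearch):
int[] a = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};   // sorted ascending
System.out.println(binSearch(a, a.length, 23));    // prints 5 (index of 23)
System.out.println(binSearch(a, a.length, 40));    // prints -1 (not found)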
4.2 Non-recursive Binary Search Algorithm (3)
◼ At any point during binary search, part of the array is “alive” (might contain the search key x)
◼ Each iteration of the loop eliminates at least half of the previously “alive” elements
◼ At the beginning, all n elements are “alive”, and
❑ After 1 iteration, at most n/2 elements are left, or alive
❑ After 2 iterations, at most (n/2)/2 = n/4 = n/2² are left
❑ After 3 iterations, at most (n/4)/2 = n/8 = n/2³ are left
   :
❑ After i iterations, at most n/2ⁱ are left
❑ At the final iteration, at most 1 element is left
4.2 Non-recursive Binary Search Algorithm (4)
In the worst case, we have to search all the way up to the last
iteration k with only one element left.
We have:
n/2k = 1
2k = n
k = log n
Hence, the binary search algorithm takes O(f(n)) , or O(log n)
times
4.3 Time complexity of recursion: An example (1/2)
◼ What is the complexity of the following recursive function?
// Precond: n >= 0
public static int f(int n) {
    if (n == 0)
        return 1;
    else {
        int sum = 0;
        for (int i=0; i<n; i++)
            sum++;
        return f(n-1) + sum;
    }
}
Recurrence for the recursive function f:
f(n) = 1            for n == 0
     = f(n-1) + n   for n > 0
4.3 Time complexity of recursion: An example (2/2)
◼ The function for the total operations of a recursive function is similar in form to the recursive case of the recurrence:
   T(n) = T(n-1) + c·n
   where T(n) and T(n-1) are the numbers of operations to compute f(n) and f(n-1) respectively, and c·n is the number of operations executed by the for loop in the else clause.
   T(n) = T(n-1) + c·n
        = [T(n-2) + c(n-1)] + c·n       (expand T(n-1))
        = T(n-2) + c(n-1) + c·n         (Iterative Method)
        = T(n-3) + c(n-2) + c(n-1) + c·n
        …
        = T(0) + c(1 + 2 + 3 + … + n)   (must stop when n = 0)
        = 1 + c·n(n+1)/2
        = O(n²)
4.3 Time complexity of recursion: Fibonacci numbers
◼ Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21, …
❑ The first two Fibonacci numbers are both 1 (arbitrary numbers)
❑ The rest are obtained by adding the previous two together.
◼ Calculating the nth Fibonacci number recursively:
// Precond: n > 0
public static int fib(int n) {
    if (n <= 2)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}
Recurrence for fib:
fib(n) = 1                      for n = 1, 2
       = fib(n-1) + fib(n-2)    for n > 2
4.3 Time complexity of recursion: Fibonacci numbers
◼ Again, the function for the total number of operations is similar to the recursive case of Fibonacci:
❑ T(n) = T(n-1) + T(n-2) + 4   (1 comparison, 1 addition, 2 subtractions)
   This is a 2nd-order linear recurrence → Hard!
   T(n) < 2T(n-1) + c   (a 1st-order linear recurrence → Easier!)
        < 2(2T(n-2) + c) + c                     (expand T(n-1) to 2T(n-2) + c)
        = 4T(n-2) + 2c + c
        < 8T(n-3) + 4c + 2c + c                  (continue the expansion)
        …                                        (the expansion stops when n is 1)
        < 2ⁿ⁻¹·T(1) + 2ⁿ⁻²·c + 2ⁿ⁻³·c + … + c    (T(1) is c)
        < 2ⁿ·(1/2 + 1/4 + 1/8 + …)·c
        = O(2ⁿ)
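For contrast (this is not part of the slides’ analysis): the exponential cost comes from recomputing the same subproblems over and over, not from the problem itself. A simple loop computes the same value in O(n) time:
// Contrast sketch: iterative Fibonacci, T(n) = O(n)
// Precond: n > 0
public static int fibIter(int n) {
    int prev = 1, curr = 1;              // fib(1), fib(2)
    for (int i = 3; i <= n; i++) {
        int next = prev + curr;          // fib(i) = fib(i-1) + fib(i-2)
        prev = curr;
        curr = next;
    }
    return curr;
}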
4.4 Some rules of thumb and examples
◼ Basically just count the number of statements executed.
◼ If there are only a small number of simple statements in a
program – O(1)
◼ If there is a ‘for’ loop dictated by a loop index that goes up
to n – O(n)
◼ If there is a nested ‘for’ loop with outer one controlled by n
and the inner one controlled by m – O(n*m)
◼ For a loop with a range of n values, where each iteration reduces the range by a fixed constant fraction (e.g., ½) – O(log n)
◼ For a recursive method, each call is usually O(1). So
❑ if n calls are made – O(n)
❑ if n log n calls are made – O(n log n)
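The loop-based rules above can be seen side by side in one sketch (illustrative values; RuleOfThumbDemo is a made-up name):
RuleOfThumbDemo.java
public class RuleOfThumbDemo {
    public static void main(String[] args) {
        int n = 64, m = 32;
        long count = 0;
        for (int i = 0; i < n; i++)          // single loop: O(n)
            count++;
        for (int i = 0; i < n; i++)          // nested loops: O(n*m)
            for (int j = 0; j < m; j++)
                count++;
        for (int i = n; i > 1; i = i / 2)    // range halves each time: O(log n)
            count++;
        System.out.println(count);
    }
}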
4.5 Analysis of Different Cases (1)
Worst-Case Analysis
❑ Interested in the worst-case behaviour.
❑ A determination of the maximum amount of time that an algorithm requires to solve a problem/input of size n
❑ This is the one we will be mostly looking at for the rest of the course
Best-Case Analysis
❑ Interested in the best-case behaviour
❑ Not very useful
Average-Case Analysis
❑ A determination of the amount of time that an algorithm requires to
solve an “average” input of size n
❑ We have to know the probability distribution of the inputs to determine what an “average” input is
❑ The actual analysis is not covered in this module; we will only state the average-case time complexity when needed
4.5 Analysis of Different Cases (2)
Expected-Case Analysis
❑ Analysis performed on randomized algorithms, i.e., algorithms that employ randomness in their logic
❑ Often confused with average-case analysis (although they are related)
❑ The actual analysis is not covered in this module; again, we will only state the expected-case time complexity when needed
Amortized Analysis
❑ Sometimes the worst-case behaviour cannot occur on every run of the algorithm: across a sequence of runs, only some runs can induce the worst-case behaviour while the others cannot
❑ Amortized analysis determines the total time complexity required for a sequence of runs, and thus the “amortized” cost per run
❑ It needs more advanced techniques, which are covered in CS3230, so we will only see some very simple examples in CS2040, such as the sketch below
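One classic simple example (a sketch under the usual textbook setup, not taken from the slides) is appending to an array that doubles its capacity when full: a single append can cost O(n) when it triggers a copy, but n appends cost O(n) in total, so the amortized cost per append is O(1).
DoublingArray.java
public class DoublingArray {
    private int[] data = new int[1];
    private int size = 0;

    public void append(int x) {
        if (size == data.length) {                   // full: occasional O(n) copy
            int[] bigger = new int[2 * data.length];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = x;                            // usual case: O(1) write
    }
}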
4.6 The Efficiency of Searching Algorithms
◼ Example: Efficiency of Sequential Search (data not sorted)
❑ Worst case: O(n)   (Which case?)
❑ Average case: O(n)
❑ Best case: O(1)   (Why? Which case?)
❑ Unsuccessful search?
◼ Q: What is the best-case complexity of Binary Search (data sorted)?
❑ Best-case complexity is not interesting. Why?
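For reference, a minimal sequential search sketch (illustrative; seqSearch is a made-up name) that matches the cases above:
// Data need not be sorted
public static int seqSearch(int[] a, int x) {
    for (int i = 0; i < a.length; i++) {
        if (a[i] == x) return i;   // best case: match at i == 0, O(1)
    }
    return -1;                     // worst case / unsuccessful: n comparisons, O(n)
}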
4.7 Keeping Your Perspective
◼ If the problem size is always small, you can
probably ignore an algorithm’s efficiency
◼ Weigh the trade-offs between an algorithm’s
time requirements and its memory requirements
◼ Order-of-magnitude analysis focuses on large
problems
◼ There are other measures, such as big Omega (Ω), big Theta (Θ), little oh (o), and little omega (ω). These may be covered in more advanced modules.
End of file
Question Time:
◼ What is the complexity of the following code fragment?
String someStr = "A";
for (int i=1; i<n; i++) {
    someStr = someStr + "A";
}
1. O(n)
2. O(n log n)
3. O(log n)
4. O(n²)