Program Cost and Asymptotic Analysis

Download as pdf or txt
Download as pdf or txt
You are on page 1of 39

Design and Analysis of Algorithms

Part 1
Program Cost and Asymptotic Notation

Elias Koutsoupias
with thanks to Giulio Chiribella

Hilary Term 2021

DAA 2020-21 1. Program cost and asymptotic notation – 1 / 33


Fast computers vs efficient algorithms [CLRS 1]

Many recent innovations rely on

 fast computers
 efficient algorithms.

Which is more important?

DAA 2020-21 1. Program cost and asymptotic notation – 2 / 33


The importance of efficient algorithms

The cost of an algorithm can be quantified by the number of steps T (n) in


which the algorithm solves a problem of size n.
Imagine that a certain problem can be solved by four different algorithms,
with T (n) = n, n2 , n3 , and 2n , respectively.
Question: what is the maximum problem size that the algorithm can solve
in a given time?
Assume that a computer is capable of 1010 steps per second.

Cost T (n) Maximum problem size solvable in


(Complexity) 1 second 1 hour 1 year
n 1010 3.6 × 1013 3 × 1017
n2 105 6 × 106 5 × 108
n3 2154 33000 680000
2n 33 45 58

DAA 2020-21 1. Program cost and asymptotic notation – 3 / 33


Faster computers vs more efficient algorithms

Suppose a faster computer is capable of 1016 steps per second.

Cost T (n) Max. size before Max. size now


n s1 106 × s1
n2 s2 1000 × s2
n3 s3 100 × s3
2n s4 s4 + 20

A 106 × increase in speed results in only a factor-of-100 improvement if


cost is n3 , and only an additive increase of 20 if cost is 2n .
Conclusions As computer speeds increase ...
1. ... it is algorithmic efficiency that really determines the increase in
problem size that can be achieved.
2. ... so does the size of problems we wish to solve.
Thus, designing efficient algorithms becomes even more important!
DAA 2020-21 1. Program cost and asymptotic notation – 4 / 33
From Algorism to Algorithms

Invented in India around AD 600, the decimal system was a revolution in


quantitative reasoning. Arabic mathematicians helped develop arithmetic
methods using the Indian decimals.
A 9th-century Arabic textbook by the Persian Al Khwarizmi was the key
to the spread of the Indian-Arabic decimal arithmetic. He gave methods
for basic arithmetic (adding, multiplying and dividing numbers), even the
calculation of square roots and digits of π.
Derived from ‘Al Khwarizmi’, algorism means rules for performing
arithmetic computations using the Indian-Arabic decimal system.
The word “algorism” devolved into algorithm, with a generalisation of the
meaning to
Algorithm: a finite set of well-defined instructions for accomplishing
some task.

DAA 2020-21 1. Program cost and asymptotic notation – 5 / 33


Evaluating algorithms

Two questions we ask about an algorithm


1. Is it correct?
2. Is it efficient?
Correctness is of utmost importance.
It is easy to design a highly efficient but incorrect algorithm.
Efficiency with respect to:
 Running time
 Space (amount of memory used)
 Network traffic
 Other features (e.g. number of times secondary storage is accessed)
Proving correctness and analysing the efficiency of programs are difficult
problems, in general. Take for example the Collatz program: starting from a
positive integer x repeat “if x is even then x = x/2, else x = (3x + 1)/2” until
x = 1. We don’t know how many steps this program takes.
DAA 2020-21 1. Program cost and asymptotic notation – 6 / 33
Measuring running time

On an actual computer, the running time of a program depends on many


factors:
1. The running time of the algorithm.
2. The input of the program.
3. The quality of the implementation
(e.g. quality of the code generated by the compiler).
4. The machine running the program.
We are concerned with 1.

DAA 2020-21 1. Program cost and asymptotic notation – 7 / 33


Sorting [CLRS 2.1]

The Sorting Problem


Input: A sequence of n integers a1 , · · · , an .
Output: A permutation aσ(1) , · · · , aσ(n) of the input such that

aσ(1) ≤ aσ(2) ≤ · · · ≤ aσ(n) .

The sequences are typically stored in arrays.

DAA 2020-21 1. Program cost and asymptotic notation – 8 / 33


Insertion sort: Informal description

 The input is an integer array A[1 . . n],


with A[1] = a1 , A[2] = a2 , . . . , A[n] = an .
 The algorithm consists of n − 1 iterations.
At the j-th iteration,
the first j + 1 entries of the array A[1 . . n] are arranged in sorted order.
To do this, the entry A[j + 1] is compared with the entry A[i], i ≤ j,
starting from i = j.
– If A[i] > A[j + 1], the value A[i] is moved from the i-th position to
the (i + 1)-th position, and the counter i is decremented to i − 1.
– If A[i] ≤ A[j + 1], the value A[j + 1] is put into the (i + 1)-th
position of the array and the iteration terminates.

DAA 2020-21 1. Program cost and asymptotic notation – 9 / 33


Example

j j
1 2 3 4 5 6 1 2 3 4 5 6
5 2 4 6 1 3 2 4 5 6 1 3

j j
1 2 3 4 5 6 1 2 3 4 5 6
2 5 4 6 1 3 1 2 4 5 6 3

j j
1 2 3 4 5 6 1 2 3 4 5 6 7
2 4 5 6 1 3 1 2 3 4 5 6

Observation. At the start of the j-th iteration, the subarray A[1 . . j]


consists of the elements originally in A[1 . . j] but in sorted order.

DAA 2020-21 1. Program cost and asymptotic notation – 10 / 33


Pseudocode (CRLS-style)

I NSERTION -S ORT(A)

Input: An integer array A


Output: Array A sorted in non-decreasing order
1 for j = 1 to A.length − 1
2 key = A[j + 1]
3 // Insert A[j + 1] into the sorted sequence A[1 . . j].
4 i=j
5 while i > 0 and A[i] > key
6 A[i + 1] = A[i] // moves the value A[i] one place to the right
7 i = i−1
8 A[i + 1] = key

DAA 2020-21 1. Program cost and asymptotic notation – 11 / 33


Characteristics of the CLRS pseudocode

 Similar to Pascal, C and Java.


 Pseudocode is for communicating algorithms to humans:
many programming issues (e.g. data abstraction, modularity, error
handling, etc.) are ignored.
 English statements are sometimes used.
 “// ” indicates that the reminder of the line is a comment.
(In 2nd edition “✄” is used.)
 Variables are local to the block, unless otherwise specified.
 Block structure is indicated by indentation.
 Assignment “x = y” makes x reference the same object as y.
(In 2nd edition “←” is used.)
 Boolean operators “and” and “or” are “short-circuiting”.

DAA 2020-21 1. Program cost and asymptotic notation – 12 / 33


Loop-invariant approach to correctness proof

Three key components of a loop-invariant argument:


1. Initialization: Prove that invariant (I) holds prior to first iteration.
2. Maintenance: Prove that if (I) holds just before an iteration, then it
holds just before the next iteration.
3. Termination: Prove that when the loop terminates, the invariant (I),
and the reason that the loop terminates, imply the correctness of the
loop-construct.
The approach is reminiscent of mathematical induction:
1. Initialization corresponds to establishing the base case.
2. Maintenance corresponds to establishing the inductive case.
3. The difference is that we expect to exit the loop,
whereas mathematical induction establishes a result for all natural
numbers.

DAA 2020-21 1. Program cost and asymptotic notation – 13 / 33


Correctness of I NSERTION -S ORT
Invariant of outer loop: At the start of the j-th iteration, the subarray
A[1 . . j] consists of the elements originally in A[1 . . j] but in sorted order.

DAA 2020-21 1. Program cost and asymptotic notation – 14 / 33


Correctness of I NSERTION -S ORT
Invariant of outer loop: At the start of the j-th iteration, the subarray
A[1 . . j] consists of the elements originally in A[1 . . j] but in sorted order.
Initialization. When j = 1, the subarray A[1 . . j] is a singleton and
trivially sorted.
Termination. The outer for loop terminates when j = n := A.length.
With j = n, the invariant reads: A[1 . . n] consists of the elements
originally in A[1 . . n] but in sorted order.

DAA 2020-21 1. Program cost and asymptotic notation – 14 / 33


Correctness of I NSERTION -S ORT
Invariant of outer loop: At the start of the j-th iteration, the subarray
A[1 . . j] consists of the elements originally in A[1 . . j] but in sorted order.
Initialization. When j = 1, the subarray A[1 . . j] is a singleton and
trivially sorted.
Termination. The outer for loop terminates when j = n := A.length.
With j = n, the invariant reads: A[1 . . n] consists of the elements
originally in A[1 . . n] but in sorted order.
Maintenance. Suppose input is sequence a1 , · · · , an .
We need to prove the following:
If at the start of the j-th iteration A[1 . . j] consists of a1 , · · · , aj in sorted
order, then at the start of the (j + 1)-th iteration A[1 . . j + 1] consists of
a1 , · · · , aj+1 in sorted order.
The proof requires us to examine the behaviour of the inner while loop,
under the promise that the subarray A[1 . . j] consists of a1 , · · · , aj in
sorted order.

DAA 2020-21 1. Program cost and asymptotic notation – 14 / 33


Correctness of I NSERTION -S ORT, continued

The inner while loop has the following property:


Property of the while loop: if A[1 . . j] consists of a1 , · · · , aj in sorted
order, then, at termination of the while loop, the sequence
A[1 . . i] key A[i+2 . . j+1] consists of a1 , · · · , aj+1 in sorted order.
This property implies the maintenance of the loop invariant of the for
loop, because the array A[1 . . j + 1] after the j-th iteration of the for loop
is exactly the sequence A[1 . . i] key A[i+2 . . j+1].

DAA 2020-21 1. Program cost and asymptotic notation – 15 / 33


Correctness of I NSERTION -S ORT, continued

The inner while loop has the following property:


Property of the while loop: if A[1 . . j] consists of a1 , · · · , aj in sorted
order, then, at termination of the while loop, the sequence
A[1 . . i] key A[i+2 . . j+1] consists of a1 , · · · , aj+1 in sorted order.
This property implies the maintenance of the loop invariant of the for
loop, because the array A[1 . . j + 1] after the j-th iteration of the for loop
is exactly the sequence A[1 . . i] key A[i+2 . . j+1].

Hence, it only remains to prove the validity of the above property.


The proof, provided in the next slide, uses a loop invariant for the while
loop.

DAA 2020-21 1. Program cost and asymptotic notation – 15 / 33


Correctness of I NSERTION -S ORT, concluded
Invariant of while loop:
(I1) A[1 . . i] A[i+2 . . j+1] is a1 , · · · aj in sorted order
(I2) all elements in A[i+2 . . j+1] are strictly greater than key.
Initialization. For i = j, (I1) is true because A[1, . . j] was promised to be
a1 , · · · , aj in sorted order, and (I2) is trivially true because the array
A[j + 2 . . j + 1] is empty.
Termination. Termination occurs if either i = 0 or A[i] ≤ key. In both
cases, (I1) and (I2) guarantee that the sequence A[1 . . i] key A[i+2 . . j+1]
contains the same elements as a1 , · · · aj+1 in sorted order.
Maintenance. For a given i, the body of the loop is executed only if
A[i] > key. In that case, A[i + 1] gets the value A[i]. After this change,
the sequence A[1 . . i − 1] A[i+1 . . j + 1] is a1 , · · · aj in sorted order, and
all elements in A[i+1 . . j+1] are strictly greater than key.
Hence, (I1) and (I2) still hold when i is decremented to i − 1.

DAA 2020-21 1. Program cost and asymptotic notation – 16 / 33


Running time analysis [CLRS 2.2]

Running time is
X
(cost of statement) · (number of time statement executed)
all statements

CLRS assume the following model: for a given pseudocode


 Line i takes constant time ci .
 When a for or while loop exits normally,
the test is executed one more time than the loop body.
This model is well justified when each line of the pseudocode contains:
 a constant number of basic arithmetic operations
(add, subtract, multiply, divide, remainder, floor, ceiling)
 a constant number of data movement instructions (load, store, copy)
 a constant number of control instructions
(conditional and unconditional branch, subroutine call and return)

DAA 2020-21 1. Program cost and asymptotic notation – 17 / 33


The running time of I NSERTION -S ORT

Recall the pseudocode:

1 for j = 1 to A.length − 1
2 key = A[j + 1]
3 // Insert A[j + 1] into the sorted sequence A[1 . . j].
4 i = j
5 while i > 0 and A[i] > key
6 A[i + 1] = A[i]
7 i = i−1
8 A[i + 1] = key

Setting n := A.length, the running time is


Pn−1
T (n) = c1 n + c2 (n − 1) + c3 (n − 1) + c4 (n − 1) + c5 j=1 tj
Pn−1 Pn−1
+c6 j=1 (tj − 1) + c7 j=1 (tj − 1) + c8 (n − 1) ,

where tj is the number of times the test of the while loop is executed for a
given value of j (note that tj may also depend on the input).
DAA 2020-21 1. Program cost and asymptotic notation – 18 / 33
Worst-case analysis

 The input array contains distinct elements in reverse sorted order


i.e. is strictly decreasing.
 Why? Because we have to compare key = aj+1 with every element to
left of the (j + 1)-th element, and so, compare with j elements in total.
Pn−1 Pn−1 n(n+1)
 Thus tj = j + 1. We have t
j=1 j = j=1 (j + 1) = 2
− 1, and
so,

T (n) = c1 n + c2 (n − 1) + c4 (n − 1) + c5 ( n(n+1)
2
− 1)
+c6 ( n(n−1)
2
) + c 7
n(n−1)
2
+ c8 (n − 1)
= an2 + bn + c

for appropriate a, b and c.


Hence T (n) is a quadratic function of n.

DAA 2020-21 1. Program cost and asymptotic notation – 19 / 33


Best-case analysis

 The array is already sorted.


 Always find A[i] ≤ key upon the first iteration of the while loop
(when i = j).
 Thus tj = 1.

T (n) = c1 n + c2 (n − 1) + c4 (n − 1) + c5 (n − 1) + c8 (n − 1)
= (c1 + c2 + c4 + c5 + c8 )n − (c2 + c4 + c5 + c8 )

I.e. T (n) is a linear function of n.

DAA 2020-21 1. Program cost and asymptotic notation – 20 / 33


Average-case analysis (informal)

 Randomly choose n numbers as input.


 On average, the key in A[j] is less than half the elements in A[1 . . j]
and greater than the other half, and so, on average the while loop has
to look halfway through the sorted subarray A[1 . . j] to decide where
to drop the key.
 Hence tj = j/2.
 Although average-case running time is approximately half that of the
worst-case, it is still quadratic.
Moral. Average-case complexity can be asymptotically as bad as the
worst-case.
Average-case analysis is not straightforward:
 What is meant by “average input” depends on the application.
 The mathematics can be difficult.

DAA 2020-21 1. Program cost and asymptotic notation – 21 / 33


Features of insertion sort: a summary

 Worst-case quadratic time.


 Linear-time on already sorted (or nearly sorted) inputs.

 Stable: Relative order of elements with equal keys is maintained.


 In-place: Only a constant amount of extra memory space (other than
that which holds the input) is required, regardless of the size of the
input.
 Online: it can sort a list as it is received.

DAA 2020-21 1. Program cost and asymptotic notation – 22 / 33


The Big-O notation [CLRS 3]

Let f, g : N −→ R+ be functions. Define the set

O(g(n)) := { f : N −→ R+ : ∃n0 ∈ N+ . ∃c ∈ R+ . ∀n .
n ≥ n0 → f (n) ≤ c · g(n) }

In words, f ∈ O(g) if there exist a positive integer n0 and a positive real c


such that f (n) ≤ c · g(n) for all n ≥ n0 .
Informally O(g) is the set of functions that are bounded above by g,
ignoring constant factors, and ignoring a finite number of exceptions.
If f ∈ O(g), then we say that “g is an asymptotic upper bound for f ”

DAA 2020-21 1. Program cost and asymptotic notation – 23 / 33


Examples

O(g(n)) := { f : N −→ R+ : ∃n0 ∈ N+ . ∃c ∈ R+ . ∀n .
n ≥ n0 → f (n) ≤ c · g(n) }

1. 398 ∈ O(1) [regarding 398 and 1 as (constant) functions of n].


Take n0 = 1 and c = 398 .
2. 5n2 + 9 ∈ O(n2 ).
Take n0 = 3 and c = 6. Then for for all n ≥ n0 , we have 9 ≤ n2 , and
so 5n2 + 9 ≤ 5n2 + n2 = 6n2 = cn2 .
3. Take g(n) = n2 and f (n) = 7n2 + 3n + 11. Then f ∈ O(g).
4. Some more functions in O(n2 ):
1000n2 , n, n1.9999 , n2 / lg lg lg n and 6.

DAA 2020-21 1. Program cost and asymptotic notation – 24 / 33


Properties of Big-O

Lemma 1. Let f, g, h : N −→ R+ . Then:


1. For every constant c > 0, if f ∈ O(g) then c f ∈ O(g).
2. For every constant c > 0, if f ∈ O(g) then f ∈ O(c g).
3. If f1 ∈ O(g1 ) and f2 ∈ O(g2 ) then f1 + f2 ∈ O(g1 + g2 ).
4. If f1 ∈ O(g1 ) and f2 ∈ O(g2 ) then f1 + f2 ∈ O(max(g1 , g2 )).
5. If f1 ∈ O(g1 ) and f2 ∈ O(g2 ) then f1 · f2 ∈ O(g1 · g2 ).
6. If f ∈ O(g) and g ∈ O(h) then f ∈ O(h).
7. Every polynomial of degree l ≥ 0 is in O(nl ).
8. For any c > 0 in R, we have lg(nc ) ∈ O(lg(n)).
9. For every constant c, d > 0, we have lgc (n) ∈ O(nd ).
10. For every constant c > 0 and d > 1, we have nc ∈ O(dn ).
11. For every constant 0 ≤ c ≤ d, we have nc ∈ O(nd ).

DAA 2020-21 1. Program cost and asymptotic notation – 25 / 33


Example

Example. Show that

57n3 + 4n2 · lg5 (n) + 17n + 498 ∈ O(n3 )

by appealing to Lemma 1.

lg5 (n) ∈ O(n) ∵9


4n2 · lg5 (n) ∈ O(4n3 ) ∵5
57n3 + 4n2 · lg5 (n) + 17n + 498 ∈ O(57n3 + 4n3 + 17n + 498) ∵ 3
57n3 + 4n3 + 17n + 498 ∈ O(n3 ) ∵7
57n3 + 4n2 · lg5 (n) + 17n + 498 ∈ O(n3 ) ∵6

DAA 2020-21 1. Program cost and asymptotic notation – 26 / 33


A shorthand for Big-O

Instead of writing f ∈ O(g) we often write

f (n) = O(g(n))

(read “f is Big-O of g”).

DAA 2020-21 1. Program cost and asymptotic notation – 27 / 33


A shorthand for Big-O

Instead of writing f ∈ O(g) we often write

f (n) = O(g(n))

(read “f is Big-O of g”).

It is also convenient to write

f1 (n) = f2 (n) + O(g(n))

meaning that
f1 (n) = f2 (n) + h(n) ,
where h(n) is a generic function in O(g(n)).

DAA 2020-21 1. Program cost and asymptotic notation – 27 / 33


Pitfalls of the shorthand

When writing
f (n) = O(g(n))
we must bear in mind that it is a shorthand for f (n) ∈ O(g(n)).
Here “=” is not an equality between two objects.
In particular, it does not have the transitive property:
f (n) = O(g(n)) and h(n) = O(g(n)) does not imply f (n) = h(n)!
Two elements of the same set are not necessarily the same element!

Example
 n = O(n3 ) and n2 = O(n3 ) but n 6= n2 .

DAA 2020-21 1. Program cost and asymptotic notation – 28 / 33


So why use the shorthand?

 It is convenient to write equations like f (n) = g(n) + O(nd ), or


f (n) = g(n) (1 + O(1/n))
 The Big-O shorthand is a very common mathematical notation,
in use since more than 100 years ago.
 We already abuse the “=” symbol in computer science:
think of the pseudocode instruction i = i − 1.

DAA 2020-21 1. Program cost and asymptotic notation – 29 / 33


Big-Ω

The Big-O notation is useful for upper bounds. There is an analogous


notation for lower bounds, called the Big-Omega. We write

f (n) = Ω(g(n))

to mean “there exist a positive integer n0 and a positive real c such that for
all n ≥ n0 , we have f (n) ≥ cg(n).”
If f ∈ Ω(g), then we say that “g is an asymptotic lower bound for f ”.
Example
1. nn = Ω(n!)
2. 2n = Ω(n10 ).

Exercise
Prove that f (n) = Ω(g(n)) iff g(n) = O(f (n)).

DAA 2020-21 1. Program cost and asymptotic notation – 30 / 33


Big-Θ

When the function g(n) is both an asymptotic upper bound and an


asymptotic lower bound for f (n), we say that “g is an asymptotic tight
bound for f ”, and we write

f (n) = Θ(g(n)) .

Equivalently, f = Θ(g) means that there exist positive reals c1 and c2 and
a positive integer n0 such that for all n ≥ n0 , we have

c1 g(n) ≤ f (n) ≤ c2 g(n).

We can think of f and g as having the “same order of magnitude”.

DAA 2020-21 1. Program cost and asymptotic notation – 31 / 33


Examples

1. 5n3 + 88n = Θ(n3 )


2. 2 + sin(lg n) = Θ(1)
3. n! = Θ(nn+1/2 e−n ).
Consequence of Stirling’s Approximation.
4. For all a, b > 1, loga n = Θ(logb n).
Consequence of the relation logb a = log ca
logc b
.
From now on, we will be “neutral” and write Θ(log n),
without specifying the base of the logarithm.

DAA 2020-21 1. Program cost and asymptotic notation – 32 / 33


Examples

1. 5n3 + 88n = Θ(n3 )


2. 2 + sin(lg n) = Θ(1)
3. n! = Θ(nn+1/2 e−n ).
Consequence of Stirling’s Approximation.
4. For all a, b > 1, loga n = Θ(logb n).
Consequence of the relation logb a = log ca
logc b
.
From now on, we will be “neutral” and write Θ(log n),
without specifying the base of the logarithm.

Question. OK, we have seen that loga n is a Big-Theta of logb n.


But is 2loga n a Big-Theta of 2logb n ?

DAA 2020-21 1. Program cost and asymptotic notation – 32 / 33


Examples

1. 5n3 + 88n = Θ(n3 )


2. 2 + sin(lg n) = Θ(1)
3. n! = Θ(nn+1/2 e−n ).
Consequence of Stirling’s Approximation.
4. For all a, b > 1, loga n = Θ(logb n).
Consequence of the relation logb a = log ca
logc b
.
From now on, we will be “neutral” and write Θ(log n),
without specifying the base of the logarithm.

Question. OK, we have seen that loga n is a Big-Theta of logb n.


But is 2loga n a Big-Theta of 2logb n ?
No! Recall that alogb c = clogb a for all a, b, c > 0.
Using this relation, we have 2loga n = nloga 2 and 2logb n = nlogb 2 .
If a 6= b, we have two different powers of n, but nc = Θ(nd ) only if c = d.

DAA 2020-21 1. Program cost and asymptotic notation – 32 / 33


Revision
Logarithms.
log2 n is sometimes written lg n, and loge n is sometimes written ln n.
Recall the following useful facts. Let a, b, c > 0.

a = blogb a
logc a
logb a = logc b
1
logb a = loga b

alogb c = clogb a

A form of Stirling’s approximation.


√  n n   
1
n! = 2πn 1+Θ
e n

DAA 2020-21 1. Program cost and asymptotic notation – 33 / 33

You might also like