CMSC 351: Algorithms

Download as pdf or txt
Download as pdf or txt
You are on page 1of 46

CMSC 351

Algorithms

Alex Reustle
Dr. Clyde Kruskal • Fall 2017 • University of Maryland
http://cs.umd.edu/class/fall2017/cmsc351/

Last Revision: January 6, 2018

Table of Contents
1 Maximum Subarray Problem 5
Brute Force solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Analysis of the Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Methods to improve integer list summation algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 Timing analysis 6
Classes of algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Θ(n2 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Θ(n log(n)) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

3 Bubble Sort Analysis 7


Average Case Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

4 Insertion Sort with Sentinel 7


Analysing Number of Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Best Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Worst Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Average Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Analyzing Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Best Case Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Worst Case Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Average Case Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

5 Insertion Sort without Sentinel 9


Best Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Worst Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Average Case Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

6 Selection Sort 11
Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1
7 Merge Sort 11
Merge Comparison analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Equal length arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Different length arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Mergesort Comparison Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Merge Sort Recurrence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

8 Integer Arithmetic 13
Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Problem size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Recursive Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Better Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

9 Recursion Master Formula 15


Applying to known algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Better Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Mergesort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Implementation details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

10 Heapsort 17
Create Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Heap Creation Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Finish . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

11 Finding Bounds to summation functions 21


Mathematical Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Geometric Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Gauss’s Sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Harmonic Sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Using continuous math to solve discrete math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Gauss’s Sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Harmonic Sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

12 Order Notation 24

13 Quicksort 26
Comparison Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Quicksort Worst case comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Quicksort Best case comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Proof by Constructive Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Quicksort Worst Case from Recurrence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Quicksort Best Case from Recurrence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Average case analysis of quicksort from approximate recurrence. . . . . . . . . . . . . . . . . . . . . . . 28
Average case analysis of quicksort from exact recurrence. . . . . . . . . . . . . . . . . . . . . . . . . . . 29
New Book method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Is Quicksort In place? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Randomized pivot selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Median Pivot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Small Size Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

14 Algorithm Strengths and Weaknesses 31


Memory Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2
15 Linear Sorting Algorithms 31
Re-evaluating Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Find biggest element in array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Find second largest element in array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Find k largest elements in array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Natural approach: Hold a Tournament . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Finding Best and Worst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

16 Counting Sort 32

17 Radix Sort 33
Analysis of running time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

18 Bucket Sort 34

19 Selection 34
Analyzing the recurrence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Lower Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Constrained case analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Average case Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

20 Finding Median explicitly during selection 36


Worst Case analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

21 Graph Algorithms 37
Memory Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Edge Lookup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
List all Edges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Take aways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Changing the number of nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Adding Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Removing Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Depth First Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Breadth First Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Identifying Connected Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Bridges of Köningsberg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

22 Trees 39
Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Minimum Spanning Tree Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Kruskal’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Prim’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Proof for Kruskal’s and Prim’s Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

23 Shortest Path Algorithms 41


Dijkstra’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Correctness Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Implementation details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

24 NP-Completeness 42
Decision Problem Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
The Class of NP problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Traveling Salesman Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Problems between P and NP-Complete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3
Exam Materials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Formula SAT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Graph Coloring Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

25 Appendices 45
A. Transpositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
B. Summation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4
1 Maximum Subarray Problem
Given a sequence of integers, find the largest collection which is a sum of adjacent values. Sequence is in an array
with indices from 1 to n.
3 − 5 4 3 − 6 4 8 −5 3 − 7 2 1
| {z }
Sum=13

Brute Force solution


Brute force solution: try every possible sum. (I’d guess it’s n3 ) Sum variable M set to zero. We will allow empty
sums (no elements, sum = 0).
1 M ← 0;
2 for i=1 to n do
3 for j=i to n do
4 S ← 0;
5 for k=i to j do
6 S = S + A[k];
7 M = max(M, S);
8 end
9 end
10 end
Algorithm 1: Matrix Subarray Brute Force
Pn
Insight: Loops in programs are like summation in Mathematics.(for i=1) to n == i=1
Pn Pn Pj
Nested loops are nested summations. i=1 j=i k=i 1 Pn Pn
Simplify summation using summation algebra from the inside out. i=1 j=i j − i + 1
Tangent: since j is the variable, i and 1 are constants. So it can be separated into two sums.

Analysis of the Algorithm


Actual progress: Change a variable to manage sum. Like integrals in calculus. Every technique in calculus for
integrations are matched by similar techniques in summations. Change of variable by example. See appendix for
examples of summation properties.
n X
X j
n X n X
X n
1= j−i+1 By adding 1 j − i times, and fenceposting
i=1 j=i k=i i=1 j=i
n n−i+1
X X
= j By change of variable Method
i=1 j=1
n
X (n − i + 1)(n − i + 2)
= Gausses sum
i=1
2
n
1 X
= × (n − i + 1)(n − i + 2) Expanding is error prone. Change variable instead
2 i=1
n
1 X
= × i(i + 1) substitute i=n-i+1
2 n−i+1=1
n
1 X
= × i(i + 1) simplify bounds
2 i=1
1 n(n + 1)(n + 2)
= × magic.
2 3
n(n + 1)(n + 2)
=
6

This first method is Θ n3 .




5
Methods to improve integer list summation algorithm.
Remember previous sums, do not recalculate known data.
1 M ← 0;
2 for i=1 to n do
3 S ← 0;
4 for j = i to n do
5 S = S + A[j];
6 M = max(M, S)
7 end
8 end
Algorithm 2: n2 Subarray
n X n n n
X X X n(n + 1)
1= n−i+1= i=
i=1 j=i i=1 i=1
2

This new method is Θ n2 .




Loosely speaking the maximum sum is just the previous maximum sum, plus the current value.
1 M ← 0, S ← 0;
2 for i = 1 to n do
3 S ← max(S + A[i], 0);
4 M = max(M, S);
5 end
Algorithm 3: Linear Subarray
The third method is Θ(n).
Correctness of the algorithm stems from proof by induction on S = max(S + A[i], 0). This is the Loop Invariant.
n
X
1=n
i=1

2 Timing analysis
Lower Bound
No algorithm is Faster than this boundary condition, the lower bound time.
Best Case scenario, very hard to prove in the general case.
Upper Bound
Some Algorithm can obtain upper bound time.
Worst case Scenario.

Classes of algorithms
Θ(n2 )
Bubble sort
Insertion Sort
Selection Sort

Θ(n log(n))
Merge Sort
Heap Sort
Quick Sort

Others
Radix Sort

6
3 Bubble Sort Analysis

1 for i = n downto 2 do
2 for j = 1 to i-1 do
3 if A[j] > A[j + 1] then
4 swap (A[j],A[j+1]);
5 end
6 end
7 end
Algorithm 4: Bubble Sort

Major Growth Elements of Bubble sort will be Comparisons and Exchanges (swaps).

Best case number of comparisons require you to view every element unconditionally.

n X
X i−1 n
X
1= i−1
i=2 j=1 i=2
n−1
X
= (i + 1) − 1
i=1
n−1
X
= i
i=1
 
(n − 1)n n
= = Equivalent to n choose 2
2 2

Comparisons will always occur, regardless of state of list.

Best Case Exchanges occur when the list is sorted. Zero exchanges.
Worst Case Exchanges occur when the list is reverse sorted. Same number of exchanges as comparisons.

Definition 3.1 (Random). Each permutation (ordering) of the list is equally likely.
Rotations of list permutations are equally likely to be in any position, but are not random because they ignore other
possible permutations.

Average Case Analysis


Count Transpositions. Two elements that are out of order related to each other. See appendix for details about
transpositions.

In the best case there are no transpositions.


In the worst case there are n choose 2 transpositions. Every element is out of order with the number of elements
below it. So the number of transpositions is the number of unique pairs. Which is n choose 2 pairs.
In a randomized sample the permutations are equally likely to be out of order greater or lesser, so the number of
average case exchanges will be half the number of comparisons. n(n−1)
4 .

4 Insertion Sort with Sentinel


Every 0 to [i-1] index during the loop will be sorted.
ith term is stored in tmp. Values in the sorted subarray which are larger than the tmp value are brought forward
individually until tmp > A[j]. A sentinel −∞ is used at A[0], so values don’t drop off array.

7
1 A[0] ← −∞;
2 for i = 2 to n do
3 t ← A[i];
4 j ← i − 1;
5 while A[j] ≥ t do
6 A[j + 1] ← A[j];
7 j ← j − 1;
8 end
9 A[j + 1] ← t;
10 end
Algorithm 5: Insertion Sort with Sentinel

Analysing Number of Comparisons


Best Case Comparisons
In the case where the array is already correctly sorted, there will be only 1 comparison for each iteration of the other
for loop. It will always evaluate false. So the number of best case comparisons is just the number of for loop iterations.
n
X
1 = (n − 2) + 1 Already sorted
i=2
=n−1

Worst Case Comparisons


In the worst case the array is reverse sorted, and every element must be moved. The while loop always decrements j
to zero to compare against the sentinel value.
n n
!
X X
i= i −1 Reverse Sorted
i=2 i=1
n(n + 1)
= −1 must remove i=1
2
(n + 2)(n − 1)
=
2

Average Case Comparisons


In the average case we must P determine the probability that a given element will move. So for evaluation we want the
expected value over the set x∈X P (x) · V (x).
Starting at location i, go until reach location j-1. Probability (P) that 1 element (A[i]) will end up in any location in
the subarray is 1i . Value (V) is number of moves which is (i-j+1).

n X
i n i
X 1 X 1X
· (i − j + 1) = (i − j + 1) Pull out 1/i
i=2 j=1
i i=2
i j=1
n i
X 1X
= j
i=2
i j=1
n
X 1 i(i + 1)
= ·
i=2
i 2
n
X (i + 1)
=
i=2
2
" n #
1 X
= i+1
2 i=2

8
" n n
#
1 X X
= i+ 1
2 i=2 i=2
" n #
1 X
= i − 1 + (n − 1)
2 i=1
 
1 (n)(n + 1) 2(n − 1)
= −1+
2 2 2
 2 
1 (n + n − 2) 2n − 2
= +
2 2 2
2
(n + 3n − 4)
=
4
(n + 4)(n − 1)
=
4

Analyzing Exchanges
Best Case Exchanges
Best case number of moves = 2n − 1. There is 1 move in in the −∞ assignment at the top, then in the for loop there
are 2 moves per iteration, moving A[i] into t, and moving it back into the same spot. 1 + 2(n − 1) = 2n − 1.

Worst Case Exchanges


Worst case number of moves is 2 for each iteration of outer loop plus worst case number of comparisons, minus the time
when the comparison is false at the end, of the inner loop, plus 1 for the top assignment. 2(n−1)+COM P −(n−1)+1.
Note the short cut, at each iteration we’re doing one more move than comparison. So just easily take the value
already derived for Worst case comparisons, and add the number of loop iterations, plus 1 for the sentinel assignment.

(n + 2)(n − 1)
+n
2

Average Case Exchanges


In the average case, whenever there’s a comparison, there will always be a move, except the 1 time per iteration it
evaluates false. So the analysis method is the same as for the worse case. Since we know the number.

(n + 4)(n − 1)
+n
4
Remark 4.1. An important trick is to recognize that the moves are related to the comparisons 1 to 1 and that there
will be 1 time when the comparison evaluates false each iteration, so you subtract that from the final analysis.

5 Insertion Sort without Sentinel


Add another comparison in the algorithm so the A[0] term is unnecessary, by checking that you haven’t gone past i=1.
1 for i=2 to n do
2 t ← A[i];
3 j ← i − 1;
4 while j > 0 ∧ A[j] > t do
5 A[j + 1] ← A[j];
6 j ← j − 1;
7 end
8 A[j + 1] ← t;
9 end
Algorithm 6: Insertion Sort w/o Sentinel
A note about comparisons. We don’t count index value comparisons, because index values can easily be stored in
registers and won’t have the same cost to check as array value comparisons.

9
Best Case Comparisons
In
Pnthe best case, the array is already sorted, so there will just be 1 comparison per iteration of the outer loop.
i=2 1 = (n − 2) + 1 = n − 1

Worst Case Comparisons


In the worst case, the array is reverse sorted, and every ith iteration of the loop must compare against all previous
(i − 1) elements.
n X
X i−1 n
X
1= (i − 1)
i=2 j=1 i=2
Xn n
X
= i− 1
i=2 i=2
n(n + 1)
= − 1 − (n − 1)
2
n(n + 1)
= −n
2
n2 + n 2n
= −
2 2
2
n −n
=
2
n(n − 1)
=
2

Average Case Comparisons


Remark 5.1. HARMONIC SERIES and brick problem. How far out can you stack bricks on top of one another?
Arbitrarily far out (but not infinitely far out)
n
X 1 1 1 1 1
Hn = =1+ + + + + · · · ≈ ln n + 1
i=1
i 2 3 4 5

n
X 1 1 1 1 1
Hn − 1 = = + + + + · · · ≈ ln n
i=2
i 2 3 4 5

Harmonic Series come up again and again because of the idea that something has a 1
i chance of happening in an
iteration.
What are we saving when we don’t have a sentinel? In the event that the ith element is the smallest element in the
A[1..i-1] sub-array, we will descend all the way to the 1st index of the list. If that happens, we won’t have the 1 extra
step of comparing against the sentinel. So only when the ith element is the smallest seen thus far do we save 1 step.
Probability that current ith element is smallest in sub array is 1i .PThe value when that occurs is 1. Thus over all n
n
elements of the array, on average the sentinel would have cost us i=2 1i = Hn − 1 comparisons.
So average case comparisons without sentinel is average case with sentinel minus average cost of sentinel. (n−1)(n+4)
4 −
(Hn − 1) ≈ (n−1)(n+4)
4 − ln n.

Exchanges
Removing the sentinel from insertion sort adds no new exchanges, nor does it alter when a change might have been
done already. Therefore, the best, worst and average cases are all the same as insertion sort with sentinel.

10
6 Selection Sort
Summary: Find biggest element, put at bottom of list, recurse for sub array.
1 for i=n downto 2 do
2 k ← 1;
3 for j=2 to i do
4 if A[j] > A[k] then
5 k ← j;
6 end
7 end
8 swap (A[k], A[i]);
9 end
Algorithm 7: Selection Sort

Comparisons
The comparison is always run, so the comparison value should be the same for all cases.
n X
X i n
X
1= (i − 1)
i=2 j=2 i=2
(n − 1)n
=
2

Exchanges
Exchanges don’t occur inside the conditional, so exchanges number is simply n-1. P
How many times do we exchange a number with itself? How many times does i=k? 1/i = Hn

7 Merge Sort
Divide and Conquer. Recursively split array in half down to 1 element, then merge sorted sublists. First call of
Mergesort is (A,1,n)

1 MERGESORT (A,p,r);
2 //p and r are indices in array defining sub array.;
3 if p<r then
4 q ← b p+r
2 c;
5 MERGESORT (A,p,q);
6 MERGESORT (A,q+1,r);
7 MERGE (A, (p,q), (q+1,r));
8 end
Algorithm 8: Merge Sort

Merge Comparison analysis


Equal length arrays
Merge algorithm is linear over 2 sorted lists of same size (n). Total number of comparisons in the worst case during
merge is 2n − 1, best case comparisons is n, where 1 array is strictly greater than the other.
Can’t do better than 2n − 1 in worst case, because worst case situation is 2 lists with interleaved values. In this case
the granularity of difference is on the scale of 1 value. So you must check every value to ensure against that. The last
remaining value of 2 lists is not compared, which leads to 2n-1.
This gives us very tight bounds and lets us know everything there is to know about merging.

11
Different length arrays
Arrays of size m and n where m ≤ n Worst case comparisons = m + n − 1
When m << n you can get close to log n running time using binary search. But when m is close to n you are much
closer to the m + n − 1 worst case.

Mergesort Comparison Analysis


Exact Number of Comparisons During a merge

(q − p + 1) + (r − (q + 1) + 1) − 1 = (r − p) comparisons

starting at index p and ending at index q. So for the full list, where p=1 and q=n a MERGESORT will start with
(n-1) comparisons and have the following behavior.
n−1 20 (n − 1)

+
n n
+ n

2 −1 2 −1 21 2 −1

+
n n n n
+ + + n

22 −1 22 −1 22 −1 22 −1 22 22 −1

+
.. .. .. .. .. .. .. .. ..
. . . . . . . . .
+
n
− 1 + 2nk − 1 + n
− 1 + 2nk − 1 n
    
2k
··· ··· 2k
2k 2k
−1

0 + 00 + 0 ··· + ··· 0 + 0 = 0n
=
Plg n
i=0 2i ( 2ni − 1)
Because this gives lg n terms, we can make it a little easier by dropping the bottom row, which sums to 0 anyway,
because it’s 0n.
lgX
n−1 lgXn−1
n
2i ( − 1) = (n − 2i )
i=0
2i i=0
lgX
n−1 lgX
n−1
= n− 2i
i=0 i=0
= (n lg n) − (2 lg n−1+1
− 1) geometric series
= (n lg n) − (n − 1)
= n lg n − n + 1

Merge Sort Recurrence


When expressed as a recurrence relation, the mergesort algorithm given previously behaves like
n n
S(n) = S +S +n−1
2 2

12
n
=2·S +n−1
2
Which gives the following recurrence relation

S(0) = S(1) = 0 Base case of 0 comparisons with 0 or 1 element


 n   n 
S(n) = S b c + S d e
2 2

8 Integer Arithmetic
Addition
Just like in elementary addition, the time to add 2 n digit numbers is proportional to the number of digits. Θ(n).
Since each digit must be used to compute the sum this is the right order for best case time.

1 1
1234
+
5678
6912

We assume that each digital addition takes constant time (with carry also, so ignore that little trouble spot) and label
this constant α. The time to add 2 n digit numbers is then α(n).

Problem size
If we were attempting to determine if a number was prime we would divide it by consecutive primes up to the square
root of that number. In Computer Science we want to approach all problems in terms of the problem size. So we
should look at both in terms
√ of the NUMBER OF DIGITS, and not the size of the number. So while the prime
identification problem is Θ N where N is the number, expressed in binary as 2n , n being the number of binary digits.
n
Θ(2 2 ). Giving an exponential time algorithm.

Multiplication
Elementary school approach to multiplication runs in n2 time because every digit on top is multiplied by every digit
on the bottom. There are 2n(n − 1) atomic additions in this elementary approach as described in class.

1 2 34
×
5 6 78
9 8 72
8 6 3 8
74 0 4
617 0
700 6 652

Recursive Multiplication
By memorizing the numbers in large bases, say base 100, 2 digit decimal numbers can be treated as 1 digit numbers
in base 100. That makes the multiplication easier, because there are fewer steps.
n
To generalize this for n digit numbers, treat them as base 10 2 and cut each in half. Each has n2 digits, and we know
the base. Then, recurse through this method until we reach a base we really have memorized the table for. Then
multiply them as 2 2 digit numbers, and pull out the answer.
n n
n 2 2
cd c d ac bd
× n × n
2
n
2 × ab × a b ↓ + ↓
ad+bc
bd
bc }αn time αn time Total add time = 2αn
ad
+ ac

13
ac is an n digit number, as are ad, bc, bd. When adding, bd is not shifted, but ad, and bc are shifted by n2 digits, and
ac is shifted by n digits. Since there’s no overlap for ac and bd, you can add them simply by concatenation. ad + bc
takes αn time. That value plus the center (overlapping) segment of ac + bd is also αn.

So without writing out the algorithm: M (n) = 4M n2 + 2αn which gives us a nice recursion to work on. Let’s assume


the time to multiply in the base case is µ. The base case is when both of our numbers are 1 digit long M (1) = µ.
2αn 40 (2αn)

··· ··· ··· 2α n2 41 (2α n2 )

··· ··· ··· ··· ..


.

··· ··· ··· 2α 2nk 4k (2α 2nk )

µ µ µ µ 4lg(n) µ
Which is fun, I’m having fun.

lg(n)−1
X n
atomic actions = (4i (2α )) + 4lg(n) µ
i=0
2i
lgX
n−1
4i
= 2αn ···
i=0
2i
lgX
n−1
= 2αn 2i · · ·
i=0
lg n−1+1
geometric series

= 2αn 2 − 1 ···
lg(n)
= 2αn (n − 1) + 4 µ
lg(4)
= 2αn (n − 1) + n µ 4lg(n) µ = nlg 4 µ
= 2αn (n − 1) + n2 µ nlg 4 µ = n2 µ
= 2αn (n − 1) + n2 µ

We’re doing n2 multiplications and n2 additions, so our application of recursion didn’t improve the running time of
our algorithm. There’s a better way!

Better Multiplication
With the distributive law (a + b)(c + d) = ac + ad + bc + bd, we can use this property to improve the fast multiplication
algorithm. This will let us replace 1 recursive multiplication with 1 new addition and 1 subtraction, as we shall see.
w = (a + b)(c + d) 2 · n2 additions, 1n multiplication u v
= ac + ad + bc + bd + z
u = ac 1n mult.
v = bd 1n mult.
z = w − (u + v) 1n addition, and 1n subtraction.
By the same logic as the earlier multiplication algorithm, ac and bd are shifted by n bits, and can be concatenated.
Let w = (a + b)(c + d), and z = w − (ac + bd) = (ad + bc) so to calculate z takes 3 Multiplications, 3 additions and
1 subtraction. Let’s treat the subtraction as an addition, 4 additions. As such, the recursion for this algorithm is
T (n) = 3T n2 + 4αn, T (1) = µ.

14
lgX
n−1  n X 3i
3i 4α i + 3lg n µ = 4αn +nlg 3 µ
i=0
2 2i
3 lg n−1+1 ..

−1
= 4αn · 2
3 .
2 −1
3 lg n ..

−1
= 4αn · 2
1 .
2
h i ..
= 8αn nlg( 2 ) − 1
3
.
..
= 8αn nlg 3−lg 2 − 1 .
 

..
 lg 3 
n
= 8αn lg 2 − 1 .
n
..
 lg 3 
n
= 8αn −1 .
n
= 8α n − n + nlg 3 µ
 lg 3 

≈ nlg 3 [8α + µ]
≈ n1.57 [8α + µ]

This is O(n1.57 ), a better result than the previous algorithm.

9 Recursion Master Formula


n
T (n) = aT ( ) + cnd , T (1) = f
b
Kruskal claims this is a better formula than the master theorem in the book, easier to use and more applicable. Let’s
solve using tree method.
num rows sum

cnd 1 cnd

··· ··· ··· c nb d a ac( nb )d

··· ··· ··· c bn2 d a2 a2 c( bn2 )d

..
··· ··· ··· ··· .
n
··· ··· ··· n d
c bn−1 an−1 an−1 c( bn−1 )d

f f f f a
n
f alogb n

You will get logb n levels, the branching factor is a.

logb n−1 logb n−1


X n d X ai
a c( i ) = cnd
i
d
+f nlogb a
i=0
b i=0 bi
logb n−1
ai .
+..
X
= cnd i
i=0 bd

15
logb n−1 
d
X a i
= cn
i=0
bd
logb n
!
d ( bad ) −1
= cn a
bd
−1
a
!
nlogb ( bd ) − 1
= cnd a
bd
−1
d
!
d nlogb a−logb b − 1
= cn a
bd
−1
 log a−d 
n b −1
= cnd a
bd
−1
cn logb a
− cnd ..
= a +.
bd
−1
cnlogb a cnd
= a − a + f nlogb a
b d − 1 b d − 1
cnd
 
c
= a + f · nlogb a − a
bd
−1 bd
−1

This formula can’t apply to the situation when the value of a geometric sum would be 1. The geometric sum isn’t
applicable in that case. SO this equation isn’t applicable when a = bd , we need a special case. When the r term = 1, a
geometric series becomes a simple summation of 1, thus in this case if a = bd the formula becomes cnd logb n + f nlogb a .
In the special case of merge sort, we drop the constant -1 from the non recursion part of the term, and apply the
special case, then we repeat the process with the n dropped, and add the two results.

cnd
 
c
a 6= bd a +f · nlogb a − a
bd
−1 bd
−1

a = bd cnd logb n + f nlogb a

The relative sizes of these constant factors will overwhelm in differing circumstances.
 
c
a > bd →T (n) ≈ a + f · nlogb a = Θ(nlogb a )
bd − 1

cnd
a < bd →T (n) ≈ = Θ(nd )
1 − bad

a = bd →T (n) ≈ cnd logb n because d = logb a = Θ(nd (logb n))

Applying to known algorithms


Multiplication

n
T (n) = 4T ( ) + 2αn, T (1) = µ
2
a = 4, b = 2, c = 2α, d = 1, f = µ

16
 
2α 2αn
+ µ nlg 4 −
2−1 2−1
= (2α + µ) n2 − 2αn ≈ Θ(n2 )
= 2α n2 − n + µn2


≈ Θ(n2 )

Better Multiplication

n
T (n) = 3T ( ) + 4αn, T (1) = µ
2
a = 3, b = 2, c = 4α, d = 1, f = µ
 
4α 4αn
3 + µ nlg 3 − 3
2 − 1 2 −1
= (8α + µ) n1.57 − 8αn ≈ Θ(n1.57 )

Mergesort
When we get
T (n) = 2T ( n2 ) + n − 1, T (1) = 0

n term: a = 2, b = 2, c = 1, d = 1 -1 term: a = 2, b = 2, c = −1, d = 0

a = bd a 6= bd

 
c cnd
cnd logb n a
−1 + f · nlogb a − a
−1
bd bd

−1 −1·n0
1 · n1 log2 n = n lg n 2
−1
nlg 2 − 2
−1
20 −1 20

= −n + 1

Implementation details
This gives us the incredible ability to use the constants of the recurrence into a running time and determine if it
will be greater than another algorithm in constant time. There’s also the Schonhage Strassen algorithm, which is
Θ(n log n log log n).
In real life, the size of T (1) is the word size of your machine, because the atomic multiplication is implemented at
that level. So our atomic base is 264 for most machines.

10 Heapsort
Created by J.W.J. Williams Improved by Robert Floyd

• Create Heap

• Finish

17
Definition 10.1 (Heap). A binary tree where every value is larger than its children. Equivalently its descendants.
In this class we require that all binary trees are full binary trees.
100

90 70

80 50 60 10

20 30 40

Create Heap
Turn a tree into a heap.
The traditional way to create a heap is to insert at the end of the array and sift up. Robert Floyd created a better
way to create the heap. Treat the tree as a recursive heap: each parent is the parent to 2 heaps, and sift it from the
bottom up. Create heap in left, create heap on right, then sift root down, and move up a level.

Heap Creation Analysis


In a binary tree, most nodes are near bottom, so when doing bottom up technique, most work is done near bottom,
and because you’re near the bottom you don’t have far to go down. So we take advantage of that fact: Most elements
are near the bottom and don’t have far to go.
There are n2 leaves. Each doing 0 comparisons.
There are n4 parents of leaves. Each doing 2 comparisons.
There are n8 grandparents of leaves. Each doing 4 comparisons (2 per level).
There are 16
n
greatgrandparents of leaves. Each doing 6 comparisons (2 per level).

n n n n n n
·0+ ·2+ ·4+ ·6+ ·8+ · 10 + · · ·
2 4 8  16 32 64 
1 2 3 4 5
=n· + + + + + ··· This sum is:
2 4 6 16 32
1 1 1 1 1
= + + + + + ··· = 1
2 4 8 16 32
1 1 1 1 1
+ + + + + ··· =
4 8 16 32 2
1 1 1 1
+ + + + ··· =
8 16 32 4
1 1 1
+ + + ··· =
16 32 8
..
+.
= 2n

So Heap creation does Θ(2n) comparisons

18
Finish
• Put root node at the bottom of the array, as it must be the largest element, so it will go at the end of the sorted
array.
• Then take the bottom right hand leaf and move to a tmp space.
• Then sift, reordering the tree and put the tmp leaf in its proper spot.
• Repeat.

Sift comparisons: each level has 2 comparisons, child and tmp. There are ≈ lg n levels. Total for a sift ≈ 2 lg n This
must be done for each element in the tree, so our heap has an upper bound of ≈ 2n lg n.
But the heap shrinks, each iteration removes an element. So let’s sum 0 to n-1 (because we remove an element first
before sifting).

n−1
X n
X
2 lg(i + 1) ≈ 2 lg i
i=0 i=1
= 2 [lg 1 + lg 2 + lg 3 + · · · + lg n]
= 2 lg(1 · 2 · 3 · · · · n)
 n n √
= 2 lg(n!) Stirling’s Formula n! ≈ · 2πn
h n n √ i e
= 2 lg · 2πn
e
h n 1
i
≈ 2 n lg( ) + lg((2πn) 2 )
 e 
1
= 2 n [lg n − lg e] + lg(2πn)
2
= 2n lg n − 2n lg e + lg n + lg(2π)
= 2n lg n + O(n)

This isn’t much better, it makes sense that it’s not much better than our conservative guess, because in a full binary
tree, half the elements are leaves. So while the tree shrinks, it doesn’t shrink very fast, asymptotically you’re still
doing Θ(2n lg n) comparisons, plus some linear term.

Implementation Details
Use array to implement tree structure.
60 1 60 Node has index i
tmp
2 30
Left child: 2i
3 100
30 100 4 50 Right child: 2i + 1
5 10
Parent: b 2i c
6 70
50 10 70 90 7 90
8 20
9 80
20 80 40 10 40
CREATE The First parent is at index b n2 c. Start there and sift down during heap creation. Siblings can be reached
by adding or subtracting 1. Result is a created heap.
FINISH Pop bottom into tmp first, then move heap root into bottom spot.

19
Optimization
Why is this result still worse than merge sort? Θ(2n lg n) vs Θ(n lg n). We’re comparing tmp against both children
and this doubles our number of comparisons. Instead let’s sift the hole left by the root down to the bottom in lg n
comparisons (1 per level) then put tmp in the hole an sift it back up into position.
How far will tmp need to go to sift back up and re-form the heap? Not far, since 1/2 the nodes are in the bottom
layer, and 1/4 of the nodes are in the 2nd, etc. It takes 1 comparison to confirm tmp belongs in the bottom layer, 2
to check the 2nd layer up. . .
1 1 1 1
·1+ ·2+ ·3+ · 4 + ··· = 2
2 4 8 16
Giving 2n comparisons on average. We can further optimize by binary searching up. This gives Heapsort n lg n +
n lg lg n = Θ(n lg n) performance.

Pseudocode
1 Function Heap_Sort(A,n)is
2 // Create Heap
3 for r = b n2 c to 1 do
4 Sift(r,n,A[r])
5 end
6 // Finish Sort
7 for m = n To 2 do
8 s ← A[m]
9 A[m] ← A[1]
10 Sift(1,m-1,s)
11 end
12 end
13 Function Sift(r,n,s)is
14 // r:root index, n:size index, s:sift value, p:parent index, c:child index
15 p←r
16 while 2p ≤ n do
17 if 2p < n then
18 if A[2p] ≥ A[2p + 1] then
19 c ← 2p
20 else
21 c ← 2p + 1
22 end
23 else
24 c ← 2p
25 end
26 if A[c] > s then
27 A[p] ← A[c]
28 p←c
29 else
30 Break
31 end
32 end
33 A[p] ← s
34 end
Algorithm 9: Heap Sort

20
11 Finding Bounds to summation functions
Mathematical Preliminaries
Geometric Series
For |r| ≤ 1 recall:


X 1
ri = Infinite Geometric Series
i=0
1−r
n−1
X 1 − rn rn − 1
ri = = Finite Geometric Series
i=0
1−r r−1

How do we solve this? Or at least get a reasonable estimate?



X ∞
X
i · ri = i · ri
i=0 i=1

X
=r· i · ri−1
i=1
X∞ Z 0
=r· ir i−1
sum of derivatives is the derivative of the sum
i=1

X 0
=r· ri
i=1

!0
X
i
=r r
i=1
 0
1
=r −1
1−r
!
1
=r 2
(1 − r)
r
= 2
(1 − r)

How can we check to confirm? We’ve already calculated the r = 1/2 case, where the infinite sum is 2.
1
2
=2
1 2

1− 2

Now for the finite series.


n−1
X n−1
X
i · ri = i · ri
i=0 i=1
n−1
X
=r· i · ri−1
i=1
n−1
X Z 0
=r· ir i−1
sum of derivatives is the derivative of the sum
i=1
n−1
X 0
=r· ri
i=1

21
n−1
!0
X
i
=r r
i=1
" 0 #
rn − 1
=r −1
r−1
" #
nrn−1 (r − 1) − (rn − 1)
=r 2
(r − 1)
" #
nrn − nrn−1 − rn + 1
=r 2
(r − 1)
" #
(n − 1)rn − nrn−1 + 1
=r 2
(r − 1)
(n − 1)rn+1 − nrn + r
= 2
(r − 1)

Gauss’s Sum
n
X n(n + 1)
i=
i=1
2
Without knowing the answer already, we can see an easy upper bound. If every term is bound by it’s largest element
it can’t be larger than n2 . We can also get a lower bound by flooring everything to the lowest value of 1, since there
are n of them we know that the result must be larger than n. n ≤ sum ≤ n2 .
Next we can try to split the sum in half and floor both to the lowest term in that series.
n n
SU M = 1 + 2 + 3 + 4 + · · · + + + 1 + ··· + n
2 2
n n
≥ 1 + 1 + 1 + 1 + ··· + 1 + + 1 + ··· + + 1
2 2
n n n 
≥ + +1
2 2 2 i
nh n
≥ 1+ +1
2 2
n n+4

2 2
2
n
≥ +n
4

Harmonic Sum

n
X 1
Hn = = 1 + 1/2 + 1/3 + 1/4 + 1/5 + · · · + 1/n
i=1
i
≈ ln n
We can approximate it using the same
technique we applied to the Gaussian Sum
≤ 1 + [1/2 + 1/2] + [1/4 + 1/4 + 1/4 + 1/4] + · · ·
= 1 + 1 + 1 + ··· + 1 num ones = num groups
k−1
X
= 2i = n
i=0
k is the number of groups. The number of

22
items per group, summed, which must sum to n
= 2k − 1 = n
= 2k = n + 1
= k = lg(n + 1)
Hn ≤ lg(n + 1)

Using continuous math to solve discrete math


Assume a function that’s bounded from m to n. The Riemann sum of areas in discrete terms is bounded by the
integral of some function.
Z n n
X Z n+1
f (x)dx ≤ f (i) ≤ f (x)dx
m−1 i=m m

Provided f (x) is increasing.


Z n+1 n
X Z n
f (x)dx ≤ f (i) ≤ f (x)dx
m i=m m−1

Provided f (x) is decreasing.

Gauss’s Sum

Z n n
X Z n+1
xdx ≤ i≤ xdx
m−1 i=m m

x2 n n2 n+1
≤ ≤
2 0 2 i

2
n2 02 (n + 1) 12
− ≤ ≤ −
2 2 2 2
n2 n2 + 2n + 1 1
−0≤ ≤ −
2 2 2
n2 n2 + 2n
≤ ≤
2 2
n2 n(n + 2)
≤ ≤
2 2

Harmonic Sum

Z n+1 n Z n
1 X 1 1
dx ≤ ≤ dx
m x i=m
i m−1 x
n+1 n
ln x ≤ ≤ ln(x)

1 0
ln(n + 1) − ln(1) ≤ ≤ ln(n) − ln 0
ln(n + 1) − 0 ≤ ≤ ln(n) − (−∞)
ln(n + 1) ≤ ≤ ln n + ∞
≤ +∞

Less than infinity isn’t helpful. Let’s avoid it by removing the 1 term from the sum, so we don’t integrate 0 to 1 for ln
x

Z n
≤1+ f (x)dx
1

23
n
≤ 1 + ln(x)

1
≤ 1 + ln n − ln 1
≤ 1 + ln n − 0
≤ ln(n) + 1

12 Order Notation
Did you know that some algorithms can be faster than others? It’s true!
https://en.wikipedia.org/wiki/Time_complexity
Name Running time T (n) Examples of running times
Constant O(1) 10

Iterated logarithmic O(log n)
Log-logarithmic O(log log n)
Logarithmic O(log n) log n, log(n2 )
2
Polylogarithmic poly(log n) (log n)
Fractional power O(nc ) where 0 < c < 1 n1/2 , n2/3
Linear O(n) n

N log star n O(n log n)
Linearithmic O(n log n) n log n, log n!
Quadratic O(n )2
n2
Cubic O(n3 ) n3
Polynomial 2O(log n) = poly(n) n, n log n, n10
Quasi-polynomial 2poly(log n) nlog log n , nlog n
Exponential (with Linear exponent) 2O(n) 1.1n , 10n
2
Exponential 2poly(n) 2n , 2n
Factorial O(n!) n!
poly(n) n
Double exponential 22 22

Example: 17n3 + 24n2 − 6n + 8


=17n3 + O(n2 )
=17n3 + o(n3 )
≈17n3
=Θn3

Order notation can hide simple truths like a large constant in a lower order term that can seriously affect the running
time for small n. Little o means the function is always less than what’s in the bounds. o(n3 ) could be O(n2.9998 )
which could throw off all of our calculations.

24
Equivalence Function Order Set
= Θ
≤ O
≥ Ω
< o
> ω

Informally f (n) = Θ(g(n))


f (n)
lim =c>0
n→∞ g(n)
7n2 + 3n − 8 7
lim =
n→∞ 4n2 + 20n + 4 4

f (n)
lim = c ≥ 0 −→ f (n) ∈ O(g(n))
n→∞ g(n)
g(n)
lim = c ≥ 0 −→ f (n) ∈ Ω(g(n))
n→∞ f (n)

f (n)
lim = 0 −→ f (n) ∈ o(g(n))
n→∞ g(n)

g(n)
lim = 0 −→ f (n) ∈ ω(g(n))
n→∞ f (n)

What about non-polynomial functions? Like f (n) = n2 · (10 + sin(n)) The book has a non-limiting approach which
gives us a nice way of addressing oscillating functions.
2 3 2 3
To a first approximation you can do plain algebra using Theta notation. 2Θn · 2Θn = 2Θn +Θn .
n
What’s faster? nlg n or(lg n)
Exponentials grow much faster than polynomials. How do we state that formally?

na = o(bn ) for a ≥ 0, b > 1


a
(log n) = o(nb ) b > 0

a b
(log log n) = o((log n) )

n
nlg n ? (lg n)
nlg n
lim n
n→∞ (lg n)

2lg n·lg n
= lim lg (lg n)n
n→∞ 2
2
2lg n
= lim n lg(lg n)
n→∞ 2
=0
n
∴ nlg n = o((lg n) )

Polylogs grow slower than polynomials, and Quasi-polynomials grow slower than exponentials with variable bases.

25
13 Quicksort
An efficient sorting algorithm developed by Tony Hoare in 1959 while trying to translate Russian. Recursively
partition the array, put small numbers to the left and large numbers to the right.

1 Function QUICKSORT(A,p,r)is
2 if p < r then
3 q ←PARTITION(A,p,r)
4 QUICKSORT(A,p,q-1)
5 QUICKSORT(A,q+1,r)
6 end
7 end
8 Function PARTITION(A,p,r)is
9 x ← A[r]
10 i←p−1
11 for j = p to r − 1 do
12 if A[j] ≤ x then
13 i←i+1
14 ExchangeA[i], A[j]
15 end
16 end
17 i←i+1
18 ExchangeA[i], A[r]
19 return i
20 end

Comparison Analysis
First note that PARTITION is always linear in comparisons on n = r −p+1 terms. The FOR loop runs n−1 comparisons
regardless of the result of the IF condition.

Quicksort Worst case comparisons


The worst case output for PARTITION must occur when the pivot doesn’t move, and the recursive call to QUICKSORT is
only 1 smaller than the previous call. Thus, partitioning on n elements gets n-1 comparisons. Worst Case Comparisons
Occurs when pivot is at far end of array:

T (n) = T (n − 1) + n − 1 , T (1) = 0
= (n − 1) + (n − 2) + (n − 3) + · · · + 3 + 2 + 1 pivot = 2
n−1
X
= i
i=1
(n − 1) · n
=
2

Quicksort Best case comparisons


The best case occurs when PARTITION moves the pivot to the middle of the array, so that the recursive calls to
QUICKSORT are n−1
2 smaller than the last call. Best Case Comparisons Occurs when pivot is right in the middle of
array:
n−1
T (n) = 2 · T ( )+n−1 , T (1) = 0
2
n
≤ 2 · T( ) + n − 1
2
= n · lg(n) − n + 1

26
= O(n · lg n)

Proof by Constructive Induction


Mathematical Induction is great at proving an answer is true if you already know it. But if you don’t know that what
the answer is for sure, you can make an educated guess, and use induction to derive the right answer. We know the
answer to the worst case comparisons of quicksort is quadratic, but we don’t know the coefficients.
For example, let’s prove 1 + 2 + 3 + · · · + n is quadratic.
n
X n
X
i Guess i = an2 + bn + c
i=1 i=1

P1 2
Base case i = 1. i=1 i = a(1) + b(1) + c = a + b + c = 1
Pn−1 2
Inductive Hypothesis: i=1 i = a(n − 1) + b(n − 1) + c
n
X n−1
X
i=n+ i
i=1 i=1
2
= n + a(n − 1) + b(n − 1) + c
= n + a(n2 − 2n + 1) + b(n − 1) + c
= an2 + (−2a + b + 1)n + a − b + c

If our assumption was true then we should get a system of simultaneous equations matching the quadratic. Then we
could solve the system for the coefficients of the result.

a=a
b = −2a + b + 1
c=a−b+c
1
a = , a = b, c = 0
2
n
X 1 2 1 n(n + 1)
i= n + n=
i=1
2 2 2

Quicksort Worst Case from Recurrence


T (n) = T (n − 1) + n − 1, T (1) = 0. Let’s guess that the result is quadratic: an2 + bn + c
2
Base: n = 1, a(1) + b(1) + c = 0
2
Inductive Hypothesis: Assume T (n − 1) = a(n − 1) + b(n − 1) + c

T (n) = T (n − 1) + n − 1
2
= a(n − 1) + b(n − 1) + c + n − 1
= a(n2 − 2n + 1) + b(n − 1) + c + n − 1
= an2 (−2a + b + 1)n + c − b − 1 + a
Find coefficients
a+b+c=0
a=a
−2a + b + 1 = b
c−1+a−b=c
1
a=
2
1
b=−
2
c=0

27
1 2 1
T (n) = n − n
2 2
n(n − 1)
=
2

Quicksort Best Case from Recurrence


T (n) = 2T (n/2) + n − 1, T (1) = 0. Guess upper bound: T (n) = an lg n
Base: n = 1, a(1) lg(1) = 0
Inductive Hypothesis: By strong induction, assume T (k) = ak lg k for all k < n

T (n) = 2T (n/2) + n − 1
h n ni
≤ 2 a · · lg +n−1
2 2
= an[lg n − lg 2] + n − 1
= an[lg n − 1] + n − 1
= an lg n − an + n − 1
= an lg n + (−a + 1)n − 1
To be true, must be ≤ an lg n
drop -1 because <
(−a + 1)must be ≤ 0, → a ≥ 1

So a constant a greater than or equal to 1 solves the recurrence, but the best case value occurs when a is 1

T (n) ≤ 1 · n lg n
= n lg n

Average case analysis of quicksort from approximate recurrence.


Best case occurs when pivot is in the middle, while worst case occurs when pivot is at the end. Let’s approximate the
average case happening when the pivot is at the one and three quarter marks.
3n n
T (n) = T ( ) + T ( ) + n − 1 , T (1) = 0
4 4
Let’s guess optimistically that our upper bound is T (n) ≤ an lg n
Base case n=1: a(1) lg(1) = 0 ≤ 0X
Inductive Hypothesis: By strong induction, assume true for k < n T (k) ≤ ak lg k

3n n
T (n) = T ( ) + T( ) + n − 1
4 4  
n n 3n 3n
≤ a lg +a lg +n−1
4 4 4 4
n 3an
= a (lg n − lg 4) + (lg n − lg 4 + lg 3) + n − 1
4 4
n an 3an 3an 3an
= a lg n − + lg n − + lg 3 · +n−1
4 2 4 2  4
a 3a a lg 3
= an lg n + − − + +1 n−1
2 2 4
 a 
= an lg n + −2a + lg 3 + 1 n − 1
4
To be true, must be ≤ an lg n
a
− 2a + lg 3 + 1 ≤ 0
4

28
lg 3
( − 2)a ≥ 1
4
1
a≥
2 − 3 lg
4
3
!
1
T (n) ≥ n lg n
2 − 3 lg
4
3

T (n) ≈ 1.23 · n lg n

Average case analysis of quicksort from exact recurrence.


When we choose a pivot element, the pivot will end up at some index q. We don’t know where q is yet. Since we will
be iterating over the whole array, the probability of choosing a particular index is 1/n. Then you have to do the group
on the left and the group on the right. And do it for all n indices. Solving this will require everything we know to do.
n
X 1
T (n) = (T (q − 1) + T (n − q)) + n − 1
q=1
n
n
1X
= (T (q − 1) + T (n − q)) + n − 1
n q=1
n n
1X 1X
= (T (q − 1)) + ( T (n − q)) + n − 1
n q=1 n q=1

first is sum of Ts from 0 to n-1 and the other is the downward sum from n-1 to 0. So we get two sums of T

n−1
2X
= T (q) + n − 1
n q=0

now apply strong constructive induction, guess an lg n

Base n =1a(1) lg(1) = 0X


I.H. For 1 ≤ k < n T (k) ≤ ak lg k
n−1
2X
T (n) = T (q) + n − 1
n q=1
n−1
2X
≤ aq lg q + n − 1
n q=1
n−1
2a X
= q lg q + n − 1
n q=1
2a n
Z
≈ x lg xdx + n − 1
n 1
2a x2 lg x x2 lg e
 
n
= − +c +n−1
n 2 4 1
 2 2
12 lg 1 12 lg e

2a n lg n n lg e
= − +c−( − + c) + n − 1
n 2 4 2 4
 2 2

2a n lg n n lg e lg e
= − + +n−1
n 2 4 4
an lg e a lg e
= an lg n − + +n−1
2 2n

29
a lg e a lg e
= an lg n + ( + 1)n − 1 +
2 2n
a lg e
need − +1≤0
2
2
a≥ = 2 ln 2 ≈ 1.39
lg e
choose a = 1.39 and the last term becomes a number always less than or equal to 1, which is dominated by minus 1

and the second highest term becomes 1 minus 1=0


T (n) ≤ 1.39n lg n
2
= n lg n
lg e
= 2n ln n

New Book method


Given What is the probability that the smallest and largest elements are compared? (x1 , xn )? n2 If you pick the
largest or smallest as the pivot, then it will compare against every other element including the other extreme. But if
you don’t pick an extreme the two will be partitioned away and never be compared. So on average the number of
comparisons you get from comparing the smallest and largest is also n2 .
What is the probability that xi , xj : (i < j) will be compared? If you choose a pivot less than i, the two will end up
on the same pivot together on the large side. If you choose a pivot greater than j, the two will end up on the small
side together. The only way not to compare the two is to choose a pivot in between i and j. That probability is j−i+1
2

If you have 2 elements next to one another in the sorted array, they will always be compared. Otherwise how will you
know? (i+1)−i+1
2
=1

n
X n
X
T (n) =
i=1 j=i+1
n n
X X 2
=
i=1 j=i+1
j − i+1
n n
X X 1
=2
i=1 j=i+1
j − i+1
n
X
=2 (Hn−i+1 − 1)
i=1
Xn
=2 (Hi − 1)
i=1
Xn n
X
=2 Hi − 2 1
i=1 i=1

Big important insight, that you can calculate the probability that any 2 arbitrary elements can be compared.

Is Quicksort In place?
In place algorithms use a constant amount of extra space, it isn’t a function of the number of elements n. Because
Quick sort is recursive it can add more variables to the stack (multiple versions of q, and r). So to be in place you use
extra memory Θ(1) for algorithm variables + Θ(log n) stack average. There isn’t really a formal definition of In place.
Informally it’s that an algorithm doesn’t use a lot of extra space.

30
One way to avoid using much extra space is to check which side of a partition is smaller and do that side of the
partition first.

Randomized pivot selection


You can use a random pivot element as seen in the book.

3 Median Pivot
You can make sure that you’ve got a good pivot by picking 3 elements randomly and using their median as the pivot.
This gives you a much better chance of pivoting near the middle.

Small Size Optimization


In practice when qucksort gets down to 10 to 20 elements many implementations use a more efficient algorithm like
insertion sort to sort the small groups of elements.

14 Algorithm Strengths and Weaknesses


Algorithm Comparisons Moves Spatial Locality In Place Good Worst Case
Merge Sort n lg n unknown X X X
Heap Sort n lg n n lg n X X X
Quick Sort 1.39n lg n 1
2 n lg n X X X

Memory Hierarchy
Memory Hierarchy allows further, machine specific optimizations. Doing a lot of work in fast memory is preferable to
waiting to grab data from slower memories Merge sort and quick sort both take advantage of this spacial locality of
fast memory. It sub sorts the array in pieces that fit into caches. Heap sort doesn’t as it’s sorting mechanism jumps
all over the tree. It lacks spatial locality. Merge Sort also has the problem that it isn’t In-place, so it uses a lot of
extra memory.

15 Linear Sorting Algorithms


Re-evaluating Sort
All of our algorithms have thus far been strongly dependent upon comparisons. Let’s take a closer look at that.

Find biggest element in array


Run one instance of selection sort, find biggest element. n − 1 comparisons.

Find second largest element in array


First idea: form max heap and pull two off the top. Create Heap + sift: ≈ n + lg n

Find k largest elements in array


Sift again: n + (k − 1) lg n

31
Natural approach: Hold a Tournament
Comparisons are like playing games in a sportsball match. You need to eliminate n − 1 teams to find the best, so you
must at least run n − 1 comparisons. This gives us a lower bound. In a double elimination tournament when one of
the playing groups loses it must lose again to be eliminated. The superior team will never lose. All other teams must
lose at most twice. If you lose to someone and they lose to someone else, you’re no longer second best. So you don’t
need to incur an extra comparison. If you keep track of who lost to whom you’ll conserve comparisons and be bound
by the height of the tournament bracket, lg n − 1. When you run the tournament you incur n − 1 to find the best
player. Then another lg n − 1 to order everyone else, giving n + lg n comparisons.

Finding Best and Worst


Run tournament to find best, then take all teams who lost both matches (n/2), and run again in reverse. There are
2 − 1 following comparisons to find the worst team. n − 1 + 2 − 1 = n + 2 − 2 = 2 n − 2.
n n n 3

16 Counting Sort
Say you have a list of duplicate integers in a range and you want to sort them. n integers in the range 0, . . . , k − 1.
Sort A into B. So simply count the number of integers of each size, and dole them out in order.

1 for i = 0 to k − 1 do
2 C[i] ← 0;
3 end
4 for j = 1 to n do
5 C[A[j]] ← C[A[j]] + 1;
6 end
7 t ← 0;
8 for i = 0 to k − 1 do
9 for j = 0 to C[i] do
10 t ← t + 1;
11 B[t] ← i;
12 end
13 end
Algorithm 10: Counting Sort

The running time is the number of numbers plus the range of the numbers. Θ(n + k). If you’re sorting a lot of
numbers in a small range, it’s the numbers that will dominate. If it’s a large range with sparse numbers, the range
will dominate.
An alternative method is to form the partial sums of C. Where each value in C is the number of elements less than or
equal to the index. Then iterate backwards through the original array. When you encounter an element in A, lookup
that index in C, then assign an element into B, then decrement the value in C. This has the added value of treating
each element as a unique entity, instead of as simply an integer. So you could do this on comparable objects.
1 for i = 0 to k − 1 do
2 C[i] ← 0;
3 end
4 for j = 1 to n do
5 C[A[j]] ← C[A[j]] + 1;
6 end
7 for i = 1 to k − 1 do
8 C[i] ← C[A[i]] + 1;
9 end
10 for j = n down to 1 do
11 B[C[A[j]]] ← A[j];
12 C[A[j]] ← C[A[j]] − 1;
13 end
Algorithm 11: Partial Sum Counting Sort

32
Running time Θ(n + k).
Now the C array represents the number of numbers less than the value of the index.
Why do we go through A backwards? To maintain stability.
We can use counting sort when k < nlogn and mergesort when k > nlogn

17 Radix Sort
Sort on the least significant digit, then the 2nd least significant digit, etcetera until the largest digit arrives. This only
works if a stable sort is used for each intermittent pass over a digit.
Proved by example in lecture. Proved by induction on the intermittent sorts.
Attribute Value Variable
Size 10 n
Radix 9 r
Digits 3 d
Running Time: Θ(d(n + r))

Analysis of running time


First let’s decide which radix is best for generic input. Let’s introduce a new parameter: Size of Values: s. This is the
total range of numbers. s, d, r are all related by s = rd . If given 6 digit numbers in base 10, s = 106 . Taking logs we
get logr s = d This gives us the new Running time of Θ((logr s)(n + r)). Taking out the r is trickier. Let’s think
about optimizing this function. We can optimize by trying to find a minimum with respect to r. We can do that by
taking the derivative and solving for 0.

0
ln r − 1r (n + r)

ln s
(n + r) = (ln s) 2
ln r (ln r)
ln r − nr
= (ln s) 2 =0
(ln r)
n
0 = ln r −
r
n
r=
ln r
n
r=
ln( lnnn )
n
=
ln n − ln ln n
n

ln n

So surprisingly the running time of radix sort can depend entirely on the input size, and not on the range of values.
Θ( ln(ln ns ) (n + lnnn ))
ln n
Θ( nlnlnns )
But that’s not a great number, so in real life pick the closest power of 2. It’s nicest for the computer, and a digit
looks like a set of bits. Take every group of r bits and compare based on those bits and return them. So we want to
use Radix sort when it’s running time is less than Quicksort.

 
n log s
< Θ log2 n log s < Θ (log n) log n = log nΘ(log n) s < nΘ(log n)

Θ < Θ (n log n) Θ (log s)
log n

So Radix sort is theoretically better when the range is relatively small. But for 1 million numbers in binary, that
gives us a very large range: 1, 000, 00020
It’s space-wise inefficient, but has spatial locality.

33
18 Bucket Sort
Assume you have a uniformly distributed array of real numbers between 0 and 1. Create n buckets.

1 Clear B;
2 for i = 1 to n do
3 Put A[i] into bucket B [bn · A[i]c];
4 end
5 Sort each bucket;
6 Concatenate each bucket;

On average there will be 1 element per bucket. If you can rely on the buckets in the worst case not being very large
you can use a quadratic sort within the bucket and still get a linear running time.
You can even apply this to non uniform distributions. Assume a Normal distribution. Simply change the size of the
buckets relative to the mean, with buckets closer to the mean having a smaller range and ones farther from the range
on either side being larger.
Also space wise inefficient, but has spatial locality.

19 Selection
How do we select the kth smallest element from a group of numbers? How do we select the median from a group of
numbers? Median = n + n2 lg n

1 Function SELECT(A,k,p,r)is
2 Find Approximate median for ‘qth smallest’;
3 Partition based on ‘qth smallest’;
4 if k < q − p + 1 then
5 SELECT(A,k,p,q-1);
6 end
7 else if k>q-p+1 then
8 SELECT(A,k-q-p+1,q+1,r);
9 end
10 else
11 k=q-p+1
12 end
13 returnq,A[q];
14 end

Take the list and find some approximate median, partition with this approximate median (q), then look on the left or
right side depending on how k compares to our partition.

Analyzing the recurrence


It sounds like selection should be a very important problem, but in real life it doesn’t come up very much.

Lower Bound
Let’s assume we know the true median, or that our approximate median is good enough.

 
n−1
T (n) = n − 1 + T d e
2
n
≈ T( ) + n − 1
2
≈ 2n − 1 − lg n
≈ 2n

34
So if we got really lucky and got the true median we could find the kth element in linear time. As such we have a
reasonable assumption for our lower bound. You’ll never do better than 2n comparisons in a real case when you don’t
have the perfect median.

Constrained case analysis


Let’s assume that we get a q at the 1
4 mark, and assume that our k is always on the larger side.

3n
T (n) = n − 1 + T ( )
4
3n
= T( )+n−1
4
3 32
≈ (n − 1) + ( n − 1) + ( n − 1) + . . .
4 4
1
= n − c log n
1 − 3/4
≈ 4n

For the general fractional partition.

T (n) = n − 1 + T ((1 − r)n)


= T ((1 − r)n) + n − 1
2
≈ (n − 1) + ((1 − r)n − 1) + ((1 − r) n − 1) + . . .
1
= n − c log n
1−1+r
1
≈ n
r
Absolute worst case q.

T (n) = n − 1 + T (n − 1)
= (n − 1) + (n − 2) + (n − 3) + . . .
(n − 1)n
=
2
n2

2

Average case Analysis


Sum over all possible pivots and let’s assume you always end up on the bigger side, for the pessimistic view. So we
get the probabilistic analysis.

n
X 1
T (n) = n − 1 + · T (max(q − 1, n − q))
q=1
n
n−1
2X
= T (q)
n n
2

Constructive Induction, guess the algorithm is linear


Guess T (n) ≤ an

35
n−1
2X
=n−1+ aq
n n
2

n−1
2a X
=n−1+ q
n n
2
 n

n−1 2 −1
2a  X X
= q− q + n − 1
n q=1 q=1
n n
( − 1)
 
2a n(n − 1)
= − 2 2 +n−1
n 2 2
2a n2 n n2
 
n
= − − + +n−1
n 2 2 8 4
an a
= an − a − + +n−1
4 2
3 a
= an − + n − 1
4 2
3 a
= ( a + 1)n − − 1 for induction to work ≤ an ∴ a ≥ 4
4 2
T (n) ≈ 4n

20 Finding Median explicitly during selection


Take the numbers and put them into a 2 dimensional grid with 5 rows and n/5 columns.
Computer Memory is a 1 dimensional array, meaning that representing a 2 dimensional array requires a clever trick
to access. A[i,j] Represented differently in row major order vs column major order. Column major order A[5(i-1) + j]
Without using very specific pseudocode, let’s look theoretically at how something like this algorithm might work at a
very high level.

1 Put items into 5x n5 grid


2 Bubble Sort
3 // Find median of each column. 10*n/5 = 2n comparisons.
4 Find the median of these medians recursively. // smaller subset of elements it should take T (n/5)
comparisons.
5 Move columns with small medians left and columns with large medians right. // This puts the median of
medians in the middle element.
6 Partition using the median of medians. // Partitioning takes n − 1 comparisons.
7 Recursively select on correct side, knowing what we know about median of medians. // Takes T ( 7n
10 )
comparisons.

            
            
      M      
            
            

Worst Case analysis


About 25% of elements are guaranteed to be smaller, and about 25% are guaranteed to be bigger than the median of
medians. The exact value is 10
3
n This gives us an upper bound on the size of a recursive call to partition at 10
7
n,
which is the remaining number of elements to recurse over.

36
3
10 n

7
10 n → remaining number of elements to recurse over

n 7
T (n) ≤ 2n + T ( ) + T ( n) + n − 1
5 10
n 7
= T ( ) + T ( n) + 3n − 1 Constructive induction, guess T (n) ≤ an
5 10
n 7
≤ a + an + 3n − 1
5 10

9
= +3 n−1
10
9
need an =⇒ need a + 3 ≤ a ∴ a ≥ 30
10
∴ T (n) ≤ 30n

Median is doable in linear time, but 30 is a very large coefficient. Compare it to quick sort at n lg n,n must be ∼ 230
for our linear median selection to be better.

Theory
Say the upper bound, as some researchers claim to have found, that the upper bound on median selection is 3n.

2n < T (n) < n3n


(2 + ε)n (3 − δ)n
5
T (n) ≈ n
2

21 Graph Algorithms
Graph description. Can be represented as an adjacency list or adjacency matrix.
Undirected graph adjacency list representation is a bit awkward because it double counts shared edges. Altering an
edge on one vertex requires altering the edge on its partner too. Requiring that you traverse both lists. One awkward
way to alleviate this is to keep extra pointers between the list elements in shared edges.

Memory Usage
Matrix, where n = number of vertices, Memory usage = Θ(n2 )
List, where m = number of edges, Memory usage = Θ(m + n)
How big can m be? M can be as large as n2 , but will probably be smaller.
n2
You could also use a bit matrix of size word size .

Edge Lookup
List: Θ(n) Matrix: Θ(1)

37
List all Edges
List, Go down each index, then traverse each list. Amortized analysis is to visit n elements. Θ(m + n).
Matrix, Go across each row, looking at all the 0s, even when you don’t care about them. Θ(n2 ).

Take aways
Typically we want to use the Adjacency list representation, because it’s linear for in the average case.

Changing the number of nodes


Adding Nodes
Add dummy values that you can fill later to the next power of 2. If you have 11 nodes build a matrix with 16 columns
and just read the first 11, when a new node is added just use the dummy space, when you reach 17 you need to copy
the old matrix into the new one of size 32.

Removing Nodes
Shrink the array in a way similar to adding a node.

Depth First Search

1 Mark v at visited;
2 for all v such that (u,v) is an edge and v has not been visited ;
3 DFS v;

Adjacency Matrix: O(n2 )


Adjacency List: O(n + m)

Applications
Identify Connected Components.

Breadth First Search


While depth first search uses a recursive call, or a stack (which are naturally similar in modern machines) BFS uses a
queue, which while similar in order analysis for most things is not naturally represented in a stack based machine.
Does have some nice applications, like shortest distance algorithms.

Identifying Connected Components


A graph is connected if there is a path from every vertex to every other vertex in the graph.

38
A connected component is a connected subgraph, or a connection of such components.
1 Function Component(G)is
2 for all x ∈ V [G] do
3 visited[x] ← F alse;
4 t ← 0;
5 end
6 for all x ∈ V [G] do
7 if not visited[x] then
8 t ← t + 1;
9 Compvisit [x];
10 end
11 end
12 end
13 Function Compvisit(x)is
14 visited[x] ← T rue;
15 CompN um[x] ← t;
16 for all y ∈ adj[x] do
17 if not visited[y] then
18 Compvisit[y];
19 end
20 end
21 end
Algorithm 12: Connected Component
Adjacency Matrix: O(n ) 2

Adjacency List: O(n + m)

Bridges of Köningsberg
Theorem 21.1 (Eulerian Cycle 1736). An undirected, (connected) graph G = (v, e) has an Eulerian Cycle if and
only if every vertex is connected and every vertex has an even degree (number of incoming and outgoing edges, with
loops counted twice).

22 Trees
Created by S. Jaladanki

Definitions
Definition 22.1 (Tree). An undirected and connected acyclic graph.

• If an edge is added between two vertices in a tree, a cycle is created

• If any edge is removed from this cycle, we recreate a tree

Source: https://en.wikipedia.org/wiki/Minimum_spanning_tree

Definition 22.2 (Spanning Tree). A subset of edges that form a tree and contains every vertex in the graph. It does
not have cycles.
Definition 22.3 (Minimum Spanning Tree (MST)). A spanning tree that has the lowest possible sum of edge weights

• m = n − 1 There are m edges in a spanning tree in a graph with n vertices


– For example, a graph with 2 vertices will have a spanning tree consisting of one edge which connects the
two vertices

39
Minimum Spanning Tree Algorithms
There are different approaches to form the Minimum Spanning Tree of a graph, and two important algorithms are
Kruskal’s Algorithm and Prim’s Algorithm.

Kruskal’s Algorithm
In this algorithm, we go through the graph edges one at a time, from smallest edge weight to largest edge weight, and
add it to a Minimum Spanning Tree (MST) as long as a cycle is not created. If the edge to add would create a cycle
in the MST, it is not added and the next edge is examined. We can stop when we have n-1 edges in the MST.

1 Sort edges by weight so that |e1 | ≤ |e2 | ≤ |em | ;


2 // Assign empty tree
3 T ← 0;
4 // m is the edges in a graph
5 for i = 1 to m do
6 Put edge i into T if cycle is not made;
7 end
Algorithm 13: Kruskal’s Algorithm

Prim’s Algorithm
In this algorithm, we can start at any vertex and grow the Minimum Spanning Tree (MST). We take the cheapest
(smallest) edge coming off the growing tree that leads to a new vertex and add it to the MST. This process is repeated
until a MST is made.
1 for all x ∈ V [G] do
2 // Distances to all vertices initialized to ∞
3 D[x] ← ∞;
4 // Predecessors (Π) to all vertices initialized to NIL
5 Π[x] ← N IL;
6 end
7 Q ← V [G] // Q will hold the vertices which haven’t been checked off yet or "outside" vertices
8 // Distance to start vertex (a) assigned 0 so it is first vertex popped from Q
9 D[a] ← 0;
10 while Q 6= ∅ do
11 x ←Pop-minimum[Q];
12 for all y adjacent to x in Q do
13 if y ∈ Q and W [x, y] < D[y] then
14 // W[x,y] is the weight of edge x,y
15 // Distance and predecessor to new vertex updated
16 D[y] ← W [x, y];
17 Π[y] ← x;
18 end
19 end
20 end
Algorithm 14: Prim’s Algorithm
This algorithm is nearly identical to Dijkstra’s Algorithm. One important difference between these two is that Prim’s
adds the vertex closest to the current tree and Dijkstra’s adds the vertex closest to the source vertex.

The implementation details are similar to Dijkstra’s Algorithm.

Proof for Kruskal’s and Prim’s Algorithms


The proof for Kruskal’s and Prim’s Algorithms, which are greedy algorithms, is a proof by contradiction. Let’s imagine
that we have a partially formed MST and that this MST has been made by adding cheapest edges to the tree. On the
other side of an imaginary boundary line, there are the remaining vertices which need to be added to the MST.

40
Statement: The cheapest edge to cross the boundary is in the MST.
Let’s say we just added the cheapest edge (1) to cross the boundary to the MST. If this edge (1) were then removed
and then replaced with another edge (2) that also crosses the boundary, then we have a contradiction because the new
edge (2) is not the cheapest edge to cross the boundary because the old edge (1) was already the cheapest. Therefore,
we would no longer have a MST with the new edge (2).

23 Shortest Path Algorithms


Types
• Single source, Single sink.
– Ex. What is the best route from site A to site B?
• All Pairs.
– Ex. What is the shortest route between every pair of cities?
• Single source.
– Ex. What is the shortest distance from A to other locations?
It’s very hard to solve the Single source, single sink problem without also solving the single source problem for every
node along the way. Same is true for All pairs.

Dijkstra’s Algorithm
1 for all x ∈ V [G] do
2 // Θ(n)
3 D[x] ← ∞;
4 Π[x] ← N IL;
5 end
6 Q ← V [G] // Q will hold the vertices which haven’t been checked off yet
7 D[a] ← 0;
8 while Q 6= ∅ do
9 // Θ(n)
10 x ←Pop-minimum[Q];
11 for all y adjacent to x in Q do
12 // Depends on implementation. Can be adjacency matrix, checkable in Θ(n)
13 if D[x] + W [x, y] < D[y] then
14 // W[x,y] is the weight of edge x,y
15 D[y] ← D[x] + W [x, y];
16 Π[y] ← x;
17 end
18 end
19 end

This bears a similarity to Breadth First Search. From the source, you reach the vertices separated by 1 edge first,
then those separated by 2 edges, then 3, etc. This forms an expanding shell of visited nodes with known shortest
distance. We have connections to edges outside the shell, with known edge weights. Since we are picking the smallest
such edge, the BFS-like process guarantees that you determine the shortest path to that vertex.
This only works when you have non-negative weight edges.

Correctness Proof
Base case: Nothing is in the circle, except the source vertex with distance 0.
Inductive Step: Remove smallest edge weight from consideration and update connecting vertices if smaller.
Since you know the shortest path to everything in the circle, the available path to the next shortest edge is guaranteed
to be the shortest path to that vertex.

41
Implementation details
Create Q as array. Let Q hold 1’s for vertices that haven’t been eliminated yet, and 0’s otherwise. Naive approach is
to travel Q looking for 1, lookup weight in D and track smallest value. If implemented with Adjacency Matrix, the
whole thing done in Θ(n2 ).

The alternative is to use a priority queue / heap for Q, removing the smallest value. This necessitates some
extra heap functions to grab an arbitrary node in the heap and change its weight. That lets you pop from the heap in
logarithmic time. There’s an additional requirement to visit |adj(x)| additional vertices, and update them with our
heap manipulation function to update weight, which is an additional Θ(lg n). Since Dijkstra’s does this for every
element the running time is Θ(m · lg n).

So if your graph is dense it’s better to use the Adjacency Matrix algorithm, which is quadratic in the number
of vertices. If your graph is sparse it’s better to use the m log n algorithm with a priority queue.
Of course in reality the quadratic, Adj. Matrix algorithm is usually better at the low level because of spatial locality.

24 NP-Completeness
Goal is to separate problems easy to solve from those hard to solve. Loosely speaking easy means polynomial time
and hard means exponential time.
These are not perfect definitions, in spite of that fact it’s still okay.

• Nature is kind to us. Natural problems that are polynomial time tend to have low degree. Not a theorem, just
an empirical statement.
• We don’t really know what a computer is. Real world computers have memory hierarchy and variable running
times, there’s concurrency as well. All models are equivalent up to polynomial time. One exception is quantum
computers, but not really proven in reality.
• Polynomials are closed under addition, multiplication, and composition.

Decision Problem Theory


Decision problems are yes/no questions.
Does a graph have an Eulerian cycle? But not What is the Eulerian Cycle.

Definition 24.1 (P). Decision problems solvable in polynomial time.


Definition 24.2 (NP). Decision problems verifiable in polynomial time with a certificate.
Problems in the class NP have the dichotomy that if the answer is yes you can demonstrate it in polynomial time, but
many of them are not able to be disproved in polynomial time. When the answer is yes it’s easy to prove, otherwise
it’s very hard.
Certificate is usually the solution to the problem. Example would be the coloring problem from the homework.
Definition 24.3 (SAT). Satisfiability Problem Given a Boolean formula, is there a way of setting the variables to
true and false such that the formula is true?

If it is then it can be evaluated in polynomial time, the certificate is the assignment. If it isn’t satisfiable the only way
to show that is to try every possibility of assignments, which takes exponential time.
Theorem 24.1 (Cook-Levin 1972). Satisfiability is NP-Complete.

• The Cook-Levin Theorem implies that if we can solve a satisfiability problem in polynomial time, we can solve
any problem in the class NP in polynomial time when the answer is "yes".

42
The Class of NP problems
Is it the case that if something can be solved in polynomial time it can be verified in polynomial time? Does P = N P ?
Or more directly, are P and NP-Complete classes distinct? No one knows.
NP-Complete: The class of problems that are all equivalent to one another under reduction. Formula-SAT, circuit-SAT
and coloring are all the same problem, really. They can be reduced to one another.
Most things in real life are in the NP class. Empirical Statement: For problems occurring in nature, Once in the class
NP, it’s either P or NP-Complete. There’s a theorem that says that if P = 6 N P − Complete then there are an infinite
class of problem between them.

Reduction
Having proved that a problem is NP complete, how do we prove other problems are also NP complete?

Theorem 24.2. Circuit SAT is NP-complete.


Proof. - Show in NP - Show hard for NP - Show that Circuit SAT is at least as hard as some other problem which we
know is NP-complete.
Formula SAT ≤p Circuit SAT
Circuit SAT ≤p Formula SAT

Converting from circuit to formula is harder than the other way. But the two are still verifiable in the same class of
running time.

Traveling Salesman Problem


Eulerian Cycles require that you hit every path once, while Hamiltonian Cycles require that you hit every vertex once.
This is the crux of the traveling salesman problem. Hamiltonian Cycles are NP-Complete.
The Traveling Salesman problem is this problem of finding the shortest path that hits all the nodes in a graph.
Because the traveling salesman problem isn’t a yes no question the technically correct way of formulating it is to ask
if there is a path with length less than some target length.
This is a generalization of the Hamiltonian Cycle problem, which is NP complete, therefore the traveling salesman
problem is NP complete.

Theory
There were generalizable characteristics of the Eulerian cycle problem that allowed Mathematicians to determine
proofs around them, but the Hamiltonian Cycle problem doesn’t have any generalizable characteristics like that and
it wasn’t until the advent of the NP completeness theorem that this was understood.

Problems between P and NP-Complete


• Factoring

• Discrete Logarithm
• Graph Isomorphism

Their exact location isn’t known.

Exam Materials
Must prove a problem is in class NP by stating what the cert is and showing how to use the cert to verify.
No Reductions.
Show that you can use the fact of the class of the decision problem form of the question to derive the class of the
optimization problem form. If you can determine the yes/no condition of whether a formula is satisfiable, you can
determine the certificate for that factor in Polynomial time.

43
Formula SAT
Assume Decision version of Formula SAT is in P. We want to find the assignment.
Given Formula A variables x1 , x2 , . . . , xn

1 if not SAT-decidability(A) then


2 return False
3 end
4 for i = 1 to n do
5 B ← A;
6 Assign True to xi in A // Every time you see xi in the formula substitute with true
7 Simplify to remove True;
8 if not SAT-decidability(A) then
9 A ← B;
10 Assign False to xi and simplify;
11 end
12 end
13 return X;

Graph Coloring Problem


Given an integer k, is a graph k-colorable?
That’s the decision problem. You can use that to solve the real question of what’s the smallest number k such that a
graph is k-colorable.
1 c ← 1;
2 while not Colorable(G,c) do
3 c ← c + 1;
4 end
5 for i=1 to n do
6 for d =1 to c do
7 Set vertex i to color d;
8 if Colorable(G’,d) then
9 exit;
10 end
11 end
12 end

Assign arbitrary colors to the graph nodes one at a time and then ask the routine if the graph is still c-colorable. But
assignment of colors isn’t allowed in our routine. So we need to use a trick.
Make a complete graph of c vertexes. Then connect our goal vertex in the original graph G to every vertex except the
color we want. This forces the color determiner to require a color at that vertex.
Size of G0 = |G| + c2 + (c − 1)n

44
25 Appendices
Created by S. Jaladanki
Here, additional information is given about concepts described in the above text.

A. Transpositions
A transposition involves a pair of numbers where the bigger number is above the smaller number.

For example, let’s look at an example array with the values at indices 1 to 5 listed from top to bottom. We’re trying
to sort this list to have lower numbers on top (or the left) and larger numbers on the bottom (the right).

3
5
2
1
4

In the above array, there are 2 transpositions involving 3, since both 1 and 2 are smaller than 3 but located farther to
the bottom (or the right) than 3. There are 3 transpositions involving 5 since 2, 1, and 4 are smaller than 5 but
located closer to the bottom of the array. Similarly, there is 1 transposition involving 2 (because of 1), 0 transpositions
involving 1, and 0 transpositions involving 4. From the above array, there is a total of 6 transpositions.

The number of transpositions is equal to the number of exchanges in bubble sort. For instance, in an already
sorted list, there would be 0 exchanges needed since there are 0 transpositions.

B. Summation Properties
Manipulating summations is an important part of the course, and examples of important properties of summations
are provided.

j
X
1=j−i+1 Upper limit - lower limit + 1
k=i
n
X n(n + 1)
i= Gauss’ sum formula, notice that lower limit is 1
i=1
2
n
X n−1
X n−1
X
(i − 1) = (i − 1) + 1 = i Subtract 1 from both limits and add 1 to formula
i=2 i=1 i=1
Xn Xn
i=( i) − 1 Since added 1 to summation, subtract 1 outside of it
i=2 i=1
n−1
X n−1
X
2=2 1 Can pull out coefficients if not related to index (which is i here)
i=1 i=1
n n n
1 X 1X 2 1X
2
(i + i) = i + i Can treat each term in parantheses as separate summation
2 i=1
2 i=1 2 i=1
n X
i n i
X 1 X 1X 1
(i − j + 1) = (i − j + 1) Can pull out from inner summation because j is index
i=2 j=1
i i=2
i j=1 i
lgn−1
X 1 − 2(lgn−1)+1 1 − 2lgn
2i = = = −1 + nlg2 = n − 1 Use of geometric series formula
i=0
1−2 −1

45
Following is an example of the Change of Variable Method.

n
X
(n − i + 1)
i=1
Writing out summation terms with substitution gives
= (n − 1 + 1) + (n − 2 + 1) + (n − 3 + 1) +... + (n − (n − 2) + 1) + (n − (n − 1) + 1) + (n − n + 1)
| {z } | {z } | {z } | {z } | {z } | {z }
n n−1 n−2 3 2 1

Based on the pattern and range of the terms (1,2,3...n-2,n-1,n) we can rewrite the summation as
n
X n(n + 1)
= (i) =
i=1
2

46

You might also like