CMSC 351: Algorithms
Alex Reustle
Dr. Clyde Kruskal • Fall 2017 • University of Maryland
http://cs.umd.edu/class/fall2017/cmsc351/
Table of Contents

1 Maximum Subarray Problem
   Brute Force solution
   Analysis of the Algorithm
   Methods to improve integer list summation algorithm
2 Timing analysis
   Classes of algorithms
      Θ(n²)
      Θ(n log n)
      Others
6 Selection Sort
   Comparisons
   Exchanges
7 Merge Sort
   Merge Comparison analysis
      Equal length arrays
      Different length arrays
   Mergesort Comparison Analysis
   Merge Sort Recurrence
8 Integer Arithmetic
   Addition
   Problem size
   Multiplication
      Recursive Multiplication
   Better Multiplication
10 Heapsort
   Create Heap
      Heap Creation Analysis
   Finish
   Implementation Details
   Optimization
   Pseudocode
12 Order Notation
13 Quicksort
   Comparison Analysis
      Quicksort Worst case comparisons
      Quicksort Best case comparisons
   Proof by Constructive Induction
      Quicksort Worst Case from Recurrence
      Quicksort Best Case from Recurrence
      Average case analysis of quicksort from approximate recurrence
      Average case analysis of quicksort from exact recurrence
      New Book method
   Is Quicksort In place?
   Randomized pivot selection
   3-Median Pivot
   Small Size Optimization
15 Linear Sorting Algorithms
   Re-evaluating Sort
      Find biggest element in array
      Find second largest element in array
      Find k largest elements in array
      Natural approach: Hold a Tournament
      Finding Best and Worst
16 Counting Sort
17 Radix Sort
   Analysis of running time
18 Bucket Sort
19 Selection
   Analyzing the recurrence
      Lower Bound
   Constrained case analysis
   Average case Analysis
21 Graph Algorithms
   Memory Usage
   Edge Lookup
   List all Edges
   Takeaways
   Changing the number of nodes
      Adding Nodes
      Removing Nodes
   Depth First Search
      Applications
   Breadth First Search
   Identifying Connected Components
   Bridges of Königsberg
22 Trees
   Definitions
   Minimum Spanning Tree Algorithms
      Kruskal's Algorithm
      Prim's Algorithm
      Proof for Kruskal's and Prim's Algorithms
24 NP-Completeness
   Decision Problem Theory
   The Class of NP problems
   Reduction
   Traveling Salesman Problem
      Theory
   Problems between P and NP-Complete
   Exam Materials
      Formula SAT
      Graph Coloring Problem
25 Appendices
   A. Transpositions
   B. Summation Properties
1 Maximum Subarray Problem
Given a sequence of integers, find the contiguous run of adjacent values with the largest sum. The sequence is stored in an array with indices from 1 to n.
$$3 \quad -5 \quad \underbrace{4 \quad 3 \quad -6 \quad 4 \quad 8}_{\text{Sum}=13} \quad -5 \quad 3 \quad -7 \quad 2 \quad 1$$
Methods to improve integer list summation algorithm.
Remember previous sums, do not recalculate known data.
1 M ← 0;
2 for i = 1 to n do
3 S ← 0;
4 for j = i to n do
5 S ← S + A[j];
6 M ← max(M, S);
7 end
8 end
Algorithm 2: n² Subarray
$$\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \sum_{i=1}^{n} (n - i + 1) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$$
Loosely speaking, the maximum sum ending at the current position is just the previous such maximum sum plus the current value, reset to 0 whenever it goes negative.
1 M ← 0, S ← 0;
2 for i = 1 to n do
3 S ← max(S + A[i], 0);
4 M ← max(M, S);
5 end
Algorithm 3: Linear Subarray
The third method is Θ(n).
Correctness of the algorithm follows by induction on S = max(S + A[i], 0). The loop invariant: after iteration i, S is the largest sum of a subarray ending at index i (or 0 if all such sums are negative), and M is the largest subarray sum seen so far.
Each element is visited once:
$$\sum_{i=1}^{n} 1 = n$$
2 Timing analysis
Lower Bound
No algorithm for the problem can be faster than the lower bound; it bounds the best conceivable algorithm (a best-case scenario) and is generally very hard to prove.
Upper Bound
Some known algorithm achieves the upper bound time; analyzing its worst-case scenario shows the problem can be solved at least that fast.
Classes of algorithms
Θ(n²)
Bubble sort
Insertion Sort
Selection Sort
Θ(n log(n))
Merge Sort
Heap Sort
Quick Sort
Others
Radix Sort
3 Bubble Sort Analysis
1 for i = n downto 2 do
2 for j = 1 to i-1 do
3 if A[j] > A[j + 1] then
4 swap (A[j],A[j+1]);
5 end
6 end
7 end
Algorithm 4: Bubble Sort
The major cost measures for bubble sort are comparisons and exchanges (swaps).
The comparison on line 3 runs unconditionally, so even in the best case every pair is examined:
$$\sum_{i=2}^{n} \sum_{j=1}^{i-1} 1 = \sum_{i=2}^{n} (i - 1) = \sum_{i=1}^{n-1} \big((i+1) - 1\big) = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = \binom{n}{2} \qquad \text{equivalent to } n \text{ choose } 2$$
Best Case Exchanges occur when the list is sorted. Zero exchanges.
Worst Case Exchanges occur when the list is reverse sorted. Same number of exchanges as comparisons.
Definition 3.1 (Random). Each permutation (ordering) of the list is equally likely.
A distribution that produces only the rotations of one fixed list puts each value in every position with equal probability, but it is not random in this sense, because it assigns zero probability to all other permutations.
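As a sanity check on the analysis, here is a Python sketch of Algorithm 4 instrumented to count both quantities (0-indexed; the function name is mine):

```python
def bubble_sort_counts(A):
    """Bubble sort (Algorithm 4), counting comparisons and exchanges."""
    comparisons = exchanges = 0
    n = len(A)
    for i in range(n - 1, 0, -1):        # i = n downto 2, shifted to 0-indexing
        for j in range(i):               # j = 1 to i-1 in the 1-indexed notes
            comparisons += 1
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                exchanges += 1
    return comparisons, exchanges

# Reverse-sorted worst case: both counts are n(n-1)/2 = 10 for n = 5.
print(bubble_sort_counts([5, 4, 3, 2, 1]))  # (10, 10)
```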
4 Insertion Sort

1 A[0] ← −∞;
2 for i = 2 to n do
3 t ← A[i];
4 j ← i − 1;
5 while A[j] ≥ t do
6 A[j + 1] ← A[j];
7 j ← j − 1;
8 end
9 A[j + 1] ← t;
10 end
Algorithm 5: Insertion Sort with Sentinel
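A direct Python transcription of Algorithm 5 (a sketch; index 0 is reserved for the sentinel so the 1-indexed pseudocode carries over):

```python
def insertion_sort_sentinel(A):
    """Insertion sort with a sentinel (Algorithm 5); data live in A[1..n]."""
    A[0] = float("-inf")    # sentinel: the while loop never falls off the left end
    for i in range(2, len(A)):
        t = A[i]
        j = i - 1
        while A[j] >= t:
            A[j + 1] = A[j]  # shift larger elements right
            j -= 1
        A[j + 1] = t
    return A

data = [None, 5, 2, 4, 6, 1, 3]           # slot 0 will hold the sentinel
print(insertion_sort_sentinel(data)[1:])  # [1, 2, 3, 4, 5, 6]
```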
For the average case, the inserted element is equally likely to land in any of the i candidate positions, and landing in position j costs i − j + 1 comparisons:

$$\sum_{i=2}^{n} \sum_{j=1}^{i} \frac{1}{i}(i - j + 1) = \sum_{i=2}^{n} \frac{1}{i} \sum_{j=1}^{i} (i - j + 1) \qquad \text{pull out } 1/i$$
$$= \sum_{i=2}^{n} \frac{1}{i} \sum_{j=1}^{i} j = \sum_{i=2}^{n} \frac{1}{i} \cdot \frac{i(i+1)}{2} = \sum_{i=2}^{n} \frac{i+1}{2} = \frac{1}{2}\left[\sum_{i=2}^{n} i + \sum_{i=2}^{n} 1\right]$$
$$= \frac{1}{2}\left[\left(\sum_{i=1}^{n} i\right) - 1 + (n - 1)\right] = \frac{1}{2}\left[\frac{n(n+1)}{2} - 1 + \frac{2(n-1)}{2}\right] = \frac{n^2 + 3n - 4}{4} = \frac{(n+4)(n-1)}{4}$$
Analyzing Exchanges
Best Case Exchanges
Best case number of moves = 2n − 1. There is 1 move in the −∞ assignment at the top; then in the for loop there are 2 moves per iteration, moving A[i] into t and moving it back into the same spot: 1 + 2(n − 1) = 2n − 1.
Worst case moves:
$$\frac{(n+2)(n-1)}{2} + n$$
Average case moves:
$$\frac{(n+4)(n-1)}{4} + n$$
Remark 4.1. An important trick is to recognize that moves correspond to comparisons one-to-one, except that each iteration has exactly one comparison that evaluates false, which must be subtracted in the final analysis.
Best Case Comparisons
In the best case, the array is already sorted, so there is just 1 comparison per iteration of the outer loop:
$$\sum_{i=2}^{n} 1 = (n - 2) + 1 = n - 1$$
$$H_n - 1 = \sum_{i=2}^{n} \frac{1}{i} = \frac12 + \frac13 + \frac14 + \frac15 + \cdots + \frac1n \approx \ln n$$
Harmonic series come up again and again, because of situations where something has a 1/i chance of happening in iteration i.
What are we saving when we don’t have a sentinel? In the event that the ith element is the smallest element in the
A[1..i-1] sub-array, we will descend all the way to the 1st index of the list. If that happens, we won’t have the 1 extra
step of comparing against the sentinel. So only when the ith element is the smallest seen thus far do we save 1 step.
The probability that the current ith element is the smallest in the subarray is 1/i, and the saving when that occurs is 1 comparison. Thus over all n elements of the array, on average the sentinel would have cost us
$$\sum_{i=2}^{n} \frac{1}{i} = H_n - 1$$
comparisons. So the average case comparisons without the sentinel are the average case with the sentinel minus the average cost of the sentinel:
$$\frac{(n-1)(n+4)}{4} - (H_n - 1) \approx \frac{(n-1)(n+4)}{4} - \ln n$$
Exchanges
Removing the sentinel from insertion sort adds no new exchanges, nor does it change when any exchange occurs. Therefore the best, worst, and average cases are all the same as for insertion sort with a sentinel.
6 Selection Sort
Summary: Find biggest element, put at bottom of list, recurse for sub array.
1 for i=n downto 2 do
2 k ← 1;
3 for j=2 to i do
4 if A[j] > A[k] then
5 k ← j;
6 end
7 end
8 swap (A[k], A[i]);
9 end
Algorithm 7: Selection Sort
Comparisons
The comparison is always run, so the comparison value should be the same for all cases.
$$\sum_{i=2}^{n} \sum_{j=2}^{i} 1 = \sum_{i=2}^{n} (i - 1) = \frac{(n-1)n}{2}$$
Exchanges
The exchange is outside the conditional, so the number of exchanges is simply n − 1. How many times do we exchange a number with itself, that is, how often does k = i? The largest of the first i elements is at position i with probability 1/i, so the expected count is $\sum_i \frac{1}{i} \approx H_n$.
7 Merge Sort
Divide and Conquer. Recursively split array in half down to 1 element, then merge sorted sublists. First call of
Mergesort is (A,1,n)
1 MERGESORT (A,p,r);
2 //p and r are indices in array defining sub array.;
3 if p<r then
4 q ← ⌊(p + r)/2⌋;
5 MERGESORT (A,p,q);
6 MERGESORT (A,q+1,r);
7 MERGE (A, (p,q), (q+1,r));
8 end
Algorithm 8: Merge Sort
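The MERGE routine is not written out in these notes; the following Python sketch (my helper names, 0-indexed, not in-place) shows the whole procedure with a comparison counter matching the analysis below:

```python
def merge(L, R):
    """Merge two sorted lists; worst case uses len(L) + len(R) - 1 comparisons."""
    out, i, j, comps = [], 0, 0, 0
    while i < len(L) and j < len(R):
        comps += 1
        if L[i] <= R[j]:
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    out.extend(L[i:]); out.extend(R[j:])
    return out, comps

def merge_sort(A):
    """Mergesort returning (sorted copy, total comparisons)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    left, cl = merge_sort(A[:mid])
    right, cr = merge_sort(A[mid:])
    merged, cm = merge(left, right)
    return merged, cl + cr + cm

# ([1, 2, ..., 8], 16); the n lg n - n + 1 worst-case bound for n = 8 is 17.
print(merge_sort([5, 1, 6, 2, 7, 3, 8, 4]))
```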
Different length arrays
For arrays of size m and n where m ≤ n, the worst case comparisons = m + n − 1.
When m ≪ n you can get close to log n running time using binary search, but when m is close to n you are much closer to the m + n − 1 worst case.
Merging A[p..q] with A[q+1..r] costs $(q - p + 1) + (r - (q+1) + 1) - 1 = r - p$ comparisons for a subarray starting at index p and ending at index r.

Mergesort Comparison Analysis

So for the full list, where p = 1 and r = n, a MERGESORT will start with (n − 1) comparisons and have the following level-by-level behavior.
Level 0: $\;n - 1 = 2^0\left(\frac{n}{2^0} - 1\right)$
Level 1: $\;\left(\frac{n}{2} - 1\right) + \left(\frac{n}{2} - 1\right) = 2^1\left(\frac{n}{2^1} - 1\right)$
Level 2: $\;4\cdot\left(\frac{n}{2^2} - 1\right) = 2^2\left(\frac{n}{2^2} - 1\right)$
$\quad\vdots$
Level k: $\;2^k\left(\frac{n}{2^k} - 1\right)$
Bottom level: $\;0 + 0 + \cdots + 0 = 0 \cdot n$

Summing over all levels:
$$\sum_{i=0}^{\lg n} 2^i\left(\frac{n}{2^i} - 1\right)$$
This sum has lg n + 1 terms; we can make it a little easier by dropping the bottom row, which contributes 0 anyway, because it is 0 · n.
$$\sum_{i=0}^{\lg n - 1} 2^i\left(\frac{n}{2^i} - 1\right) = \sum_{i=0}^{\lg n - 1}\left(n - 2^i\right) = \sum_{i=0}^{\lg n - 1} n - \sum_{i=0}^{\lg n - 1} 2^i$$
$$= n\lg n - \left(2^{(\lg n - 1) + 1} - 1\right) \qquad \text{geometric series}$$
$$= n\lg n - (n - 1) = n\lg n - n + 1$$
Merge Sort Recurrence

Equivalently, a sort of n elements is two half-size sorts plus a merge:
$$S(n) = 2\,S\!\left(\frac{n}{2}\right) + n - 1$$
which gives the recurrence relation whose solution we just computed.
8 Integer Arithmetic
Addition
Just like in elementary addition, the time to add 2 n digit numbers is proportional to the number of digits. Θ(n).
Since each digit must be used to compute the sum this is the right order for best case time.
    1 1
    1 2 3 4
  + 5 6 7 8
  ---------
    6 9 1 2

We assume that each single-digit addition takes constant time (carries included, so we can ignore that little trouble spot) and label this constant α. The time to add two n-digit numbers is then αn.
Problem size
If we were attempting to determine whether a number is prime, we would divide it by consecutive primes up to the square root of that number. In computer science we want to approach all problems in terms of the problem size, so we should measure in terms of the NUMBER OF DIGITS, not the magnitude of the number. While the prime identification problem is $\Theta(\sqrt{N})$ in the number N itself, a number with n binary digits has $N \approx 2^n$, so the running time is $\Theta(2^{n/2})$: an exponential time algorithm.
Multiplication
The elementary school approach to multiplication runs in n² time, because every digit on top is multiplied by every digit on the bottom. There are 2n(n − 1) atomic additions in this elementary approach as described in class.
          1 2 3 4
        × 5 6 7 8
      -----------
          9 8 7 2
        8 6 3 8
      7 4 0 4
    6 1 7 0
  ---------------
    7 0 0 6 6 5 2
Recursive Multiplication
By memorizing the numbers in large bases, say base 100, 2 digit decimal numbers can be treated as 1 digit numbers
in base 100. That makes the multiplication easier, because there are fewer steps.
To generalize this for n-digit numbers, treat them as base $10^{n/2}$ and cut each number in half; each half has n/2 digits, and we know the base. Then recurse through this method until we reach a base whose table we really have memorized, multiply as two 2-digit numbers, and pull out the answer.
Split each factor into halves, writing the first number as $a\,b$ and the second as $c\,d$ (each of a, b, c, d has n/2 digits). The four sub-products combine as
$$ab \times cd = ac\cdot 10^{n} + (ad + bc)\cdot 10^{n/2} + bd$$
Forming ad + bc takes αn time, and adding that middle piece into the rest takes another αn, for a total add time of 2αn.
ac is an n-digit number, as are ad, bc, and bd. When adding, bd is not shifted, ad and bc are shifted by n/2 digits, and ac is shifted by n digits. Since there is no overlap between ac and bd, you can add them simply by concatenation. Computing ad + bc takes αn time; adding that value into the center (overlapping) segment of the concatenated ac and bd is also αn. So without writing out the algorithm: $M(n) = 4M\!\left(\frac{n}{2}\right) + 2\alpha n$, which gives us a nice recursion to work on. Let's assume the time to multiply in the base case is µ; the base case is when both of our numbers are 1 digit long, $M(1) = \mu$.
The recursion tree has $4^i$ subproblems at level i, each contributing $2\alpha n / 2^i$ of addition work (level 0 contributes $4^0(2\alpha n)$), and $4^{\lg n}$ leaves each costing µ.
Which is fun, I’m having fun.
$$\text{atomic actions} = \sum_{i=0}^{\lg n - 1} 4^i\left(2\alpha\frac{n}{2^i}\right) + 4^{\lg n}\mu$$
$$= 2\alpha n\sum_{i=0}^{\lg n - 1}\frac{4^i}{2^i} + 4^{\lg n}\mu = 2\alpha n\sum_{i=0}^{\lg n - 1} 2^i + 4^{\lg n}\mu$$
$$= 2\alpha n\left(2^{(\lg n - 1) + 1} - 1\right) + 4^{\lg n}\mu \qquad \text{geometric series}$$
$$= 2\alpha n(n - 1) + n^{\lg 4}\mu \qquad 4^{\lg n}\mu = n^{\lg 4}\mu$$
$$= 2\alpha n(n - 1) + n^2\mu \qquad n^{\lg 4}\mu = n^2\mu$$
We're doing n² multiplications and n² additions, so our application of recursion didn't improve the running time of our algorithm. There's a better way!
Better Multiplication
The distributive law gives (a + b)(c + d) = ac + ad + bc + bd; we can use this to improve the recursive multiplication algorithm, replacing one recursive multiplication with one new addition and one subtraction, as we shall see.
$$w = (a + b)(c + d) = ac + ad + bc + bd \qquad 2\cdot\tfrac{n}{2}\ \text{additions, 1 multiplication}$$
$$u = ac \qquad \text{1 multiplication}$$
$$v = bd \qquad \text{1 multiplication}$$
$$z = w - (u + v) \qquad \text{1 addition and 1 subtraction}$$
By the same logic as the earlier multiplication algorithm, ac is shifted by n digits relative to bd, so the two can be concatenated. Let w = (a + b)(c + d) and z = w − (ac + bd) = ad + bc; calculating z takes 3 multiplications, 3 additions, and 1 subtraction. Treating the subtraction as an addition, that is 4 additions. As such, the recursion for this algorithm is $T(n) = 3T\!\left(\frac{n}{2}\right) + 4\alpha n$, $T(1) = \mu$.
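A hedged Python sketch of this three-multiplication scheme (commonly known as Karatsuba's algorithm); the function name and decimal splitting are my choices:

```python
def karatsuba(x, y):
    """Multiply non-negative ints with 3 recursive multiplications, not 4."""
    if x < 10 or y < 10:                 # one-digit base case: M(1) = mu
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    shift = 10 ** half
    a, b = divmod(x, shift)              # x = a*shift + b
    c, d = divmod(y, shift)              # y = c*shift + d
    u = karatsuba(a, c)                  # ac
    v = karatsuba(b, d)                  # bd
    w = karatsuba(a + b, c + d)          # (a+b)(c+d)
    z = w - u - v                        # ad + bc
    return u * shift * shift + z * shift + v

print(karatsuba(1234, 5678))  # 7006652, matching the long multiplication above
```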
$$\sum_{i=0}^{\lg n - 1} 3^i\left(4\alpha\frac{n}{2^i}\right) + 3^{\lg n}\mu = 4\alpha n\sum_{i=0}^{\lg n - 1}\frac{3^i}{2^i} + n^{\lg 3}\mu$$
$$= 4\alpha n\cdot\frac{\left(\frac32\right)^{\lg n} - 1}{\frac32 - 1} + n^{\lg 3}\mu \qquad \text{geometric series}$$
$$= 8\alpha n\left[n^{\lg\frac32} - 1\right] + n^{\lg 3}\mu = 8\alpha n\left[n^{\lg 3 - \lg 2} - 1\right] + n^{\lg 3}\mu = 8\alpha n\left[\frac{n^{\lg 3}}{n} - 1\right] + n^{\lg 3}\mu$$
$$= 8\alpha\left(n^{\lg 3} - n\right) + n^{\lg 3}\mu \approx n^{\lg 3}\left[8\alpha + \mu\right] \approx n^{1.585}\left[8\alpha + \mu\right]$$
(Note $\lg 3 \approx 1.585$.)
The same picture generalizes. For a recurrence $T(n) = a\,T\!\left(\frac{n}{b}\right) + cn^d$ with $T(1) = f$, level i of the recursion tree has $a^i$ subproblems, each costing $c\left(\frac{n}{b^i}\right)^d$, and the $a^{\log_b n}$ leaves cost f each, contributing $f\,a^{\log_b n}$ in total.
$$T(n) = cn^d\sum_{i=0}^{\log_b n - 1}\left(\frac{a}{b^d}\right)^i + f\,a^{\log_b n}$$
$$= cn^d\left(\frac{\left(\frac{a}{b^d}\right)^{\log_b n} - 1}{\frac{a}{b^d} - 1}\right) + f\,a^{\log_b n}$$
$$= cn^d\left(\frac{n^{\log_b\frac{a}{b^d}} - 1}{\frac{a}{b^d} - 1}\right) + f\,n^{\log_b a} = cn^d\left(\frac{n^{\log_b a - \log_b b^d} - 1}{\frac{a}{b^d} - 1}\right) + f\,n^{\log_b a} = cn^d\left(\frac{n^{\log_b a - d} - 1}{\frac{a}{b^d} - 1}\right) + f\,n^{\log_b a}$$
$$= \frac{cn^{\log_b a} - cn^d}{\frac{a}{b^d} - 1} + f\,n^{\log_b a} = \left(\frac{c}{\frac{a}{b^d} - 1} + f\right)n^{\log_b a} - \frac{cn^d}{\frac{a}{b^d} - 1}$$
This formula cannot apply when the ratio of the geometric sum would be 1; the geometric series formula is not applicable in that case. So this equation is not applicable when $a = b^d$, and we need a special case. When the ratio r equals 1, a geometric series becomes a simple summation of 1's, so in this case if $a = b^d$ the formula becomes $cn^d\log_b n + f\,n^{\log_b a}$. In the special case of merge sort, we drop the constant −1 from the non-recursive part of the term and apply the special case to the n part, then repeat the process with the n dropped (just the −1 part) and add the two results.
$$a \neq b^d:\qquad T(n) = \left(\frac{c}{\frac{a}{b^d} - 1} + f\right)n^{\log_b a} - \frac{cn^d}{\frac{a}{b^d} - 1}$$
Depending on the relative sizes of a and $b^d$, one of the two terms overwhelms the other:
$$a > b^d \longrightarrow T(n) \approx \left(\frac{c}{\frac{a}{b^d} - 1} + f\right)n^{\log_b a} = \Theta\!\left(n^{\log_b a}\right)$$
$$a < b^d \longrightarrow T(n) \approx \frac{cn^d}{1 - \frac{a}{b^d}} = \Theta\!\left(n^d\right)$$
Applying the formula to recursive multiplication:
$$T(n) = 4T\!\left(\frac n2\right) + 2\alpha n,\quad T(1) = \mu \qquad a = 4,\ b = 2,\ c = 2\alpha,\ d = 1,\ f = \mu$$
$$\left(\frac{2\alpha}{2 - 1} + \mu\right)n^{\lg 4} - \frac{2\alpha n}{2 - 1} = (2\alpha + \mu)n^2 - 2\alpha n = 2\alpha\left(n^2 - n\right) + \mu n^2 \approx \Theta(n^2)$$
Better Multiplication
$$T(n) = 3T\!\left(\frac n2\right) + 4\alpha n,\quad T(1) = \mu \qquad a = 3,\ b = 2,\ c = 4\alpha,\ d = 1,\ f = \mu$$
$$\left(\frac{4\alpha}{\frac32 - 1} + \mu\right)n^{\lg 3} - \frac{4\alpha n}{\frac32 - 1} = (8\alpha + \mu)n^{\lg 3} - 8\alpha n \approx \Theta\!\left(n^{\lg 3}\right) = \Theta\!\left(n^{1.585}\right)$$
Mergesort
When we get
$$T(n) = 2T\!\left(\frac n2\right) + n - 1,\quad T(1) = 0$$
we treat the two parts of n − 1 separately.
The n part has $a = b^d$ ($a = 2$, $b = 2$, $c = 1$, $d = 1$), contributing $cn^d\log_b n = 1\cdot n^1\log_2 n = n\lg n$.
The −1 part has $a \neq b^d$ ($a = 2$, $b = 2$, $c = -1$, $d = 0$, $f = 0$), contributing
$$\frac{c}{\frac{a}{b^d} - 1}n^{\log_b a} - \frac{cn^d}{\frac{a}{b^d} - 1} = \frac{-1}{2 - 1}n^{\lg 2} - \frac{-1\cdot n^0}{2 - 1} = -n + 1$$
Adding the two parts gives $n\lg n - n + 1$, as before.
Implementation details
This gives us the remarkable ability to plug the constants of a recurrence into a closed-form running time and determine whether it beats another algorithm, in constant time. There is also the Schönhage-Strassen multiplication algorithm, which is Θ(n log n log log n).
In real life, the size of T(1) is the word size of your machine, because the atomic multiplication is implemented at that level. So our atomic base is 2^64 for most machines.
10 Heapsort
Created by J. W. J. Williams; improved by Robert Floyd.
• Create Heap
• Finish
Definition 10.1 (Heap). A binary tree where every value is larger than its children (equivalently, larger than all of its descendants). In this class we require that all binary trees are full binary trees.
Example heap, level by level: 100; then 90, 70; then 80, 50, 60, 10; then 20, 30, 40.
Create Heap
Turn a tree into a heap.
The traditional way to create a heap is to insert at the end of the array and sift up. Robert Floyd created a better way: treat the tree as recursively built from heaps, where each parent sits above 2 heaps, and sift from the bottom up. Create a heap on the left, create a heap on the right, then sift the root down; move up a level and repeat.
Heap Creation Analysis

Half the nodes are leaves and sift 0 levels; a quarter sift at most 1 level (2 comparisons); and so on:
$$\frac n2\cdot 0 + \frac n4\cdot 2 + \frac n8\cdot 4 + \frac n{16}\cdot 6 + \frac n{32}\cdot 8 + \frac n{64}\cdot 10 + \cdots = n\left(\frac12 + \frac24 + \frac38 + \frac4{16} + \frac5{32} + \cdots\right)$$
This sum is:
$$\begin{aligned}
\tfrac12 + \tfrac14 + \tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots &= 1\\
\tfrac14 + \tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots &= \tfrac12\\
\tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots &= \tfrac14\\
\tfrac1{16} + \tfrac1{32} + \cdots &= \tfrac18\\
&\;\;\vdots\\
\text{total} &= 2
\end{aligned}$$
so heap creation takes at most 2n comparisons.
Finish
• Put root node at the bottom of the array, as it must be the largest element, so it will go at the end of the sorted
array.
• Then take the bottom right hand leaf and move to a tmp space.
• Then sift, reordering the tree and put the tmp leaf in its proper spot.
• Repeat.
Sift comparisons: each level costs 2 comparisons (pick the larger child, then compare it with tmp). There are ≈ lg n levels, so the total for one sift is ≈ 2 lg n. This must be done for each element in the tree, so the finish phase has an upper bound of ≈ 2n lg n.
But the heap shrinks, each iteration removes an element. So let’s sum 0 to n-1 (because we remove an element first
before sifting).
$$\sum_{i=0}^{n-1} 2\lg(i + 1) = \sum_{i=1}^{n} 2\lg i = 2\left[\lg 1 + \lg 2 + \lg 3 + \cdots + \lg n\right] = 2\lg(1\cdot 2\cdot 3\cdots n) = 2\lg(n!)$$
Stirling's formula: $n! \approx \left(\frac ne\right)^n\sqrt{2\pi n}$
$$= 2\lg\left[\left(\frac ne\right)^n\sqrt{2\pi n}\right] \approx 2\left[n\lg\frac ne + \lg\left((2\pi n)^{\frac12}\right)\right] = 2\left[n(\lg n - \lg e) + \frac12\lg(2\pi n)\right]$$
$$= 2n\lg n - 2n\lg e + \lg n + \lg(2\pi) = 2n\lg n + O(n)$$
This isn't much better. It makes sense that it is not much better than our conservative guess: in a full binary tree half the elements are leaves, so while the tree shrinks, it doesn't shrink very fast. Asymptotically you are still doing Θ(2n lg n) comparisons, plus some linear term.
Implementation Details
Use an array to implement the tree structure. A node at index i has its left child at 2i, its right child at 2i + 1, and its parent at ⌊i/2⌋; siblings are reached by adding or subtracting 1. Example (the tree stored level by level, plus a separate tmp cell): A[1..10] = 60, 30, 100, 50, 10, 70, 90, 20, 80, 40.
CREATE: the first parent (the last internal node) is at index ⌊n/2⌋. Start there and sift down during heap creation; the result is a created heap.
FINISH: pop the bottom element into tmp first, then move the heap root into the bottom spot.
Optimization
Why is this result still worse than merge sort, Θ(2n lg n) vs Θ(n lg n)? We compare tmp against both children, and this doubles our number of comparisons. Instead, let's sift the hole left by the root down to the bottom in lg n comparisons (1 per level), then put tmp in the hole and sift it back up into position.
How far will tmp need to go to sift back up and re-form the heap? Not far: half the nodes are in the bottom layer, a quarter in the second layer, and so on. It takes 1 comparison to confirm tmp belongs in the bottom layer, 2 to check the second layer up, etc.:
$$\frac12\cdot 1 + \frac14\cdot 2 + \frac18\cdot 3 + \frac1{16}\cdot 4 + \cdots = 2$$
giving 2n comparisons on average for all the sift-ups. We can further optimize by binary searching up the path; this gives heapsort $n\lg n + n\lg\lg n = \Theta(n\lg n)$ performance.
Pseudocode
1 Function Heap_Sort(A, n) is
2 // Create Heap
3 for r = b n2 c to 1 do
4 Sift(r,n,A[r])
5 end
6 // Finish Sort
7 for m = n To 2 do
8 s ← A[m]
9 A[m] ← A[1]
10 Sift(1,m-1,s)
11 end
12 end
13 Function Sift(r, n, s) is
14 // r:root index, n:size index, s:sift value, p:parent index, c:child index
15 p←r
16 while 2p ≤ n do
17 if 2p < n then
18 if A[2p] ≥ A[2p + 1] then
19 c ← 2p
20 else
21 c ← 2p + 1
22 end
23 else
24 c ← 2p
25 end
26 if A[c] > s then
27 A[p] ← A[c]
28 p←c
29 else
30 Break
31 end
32 end
33 A[p] ← s
34 end
Algorithm 9: Heap Sort
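A direct Python transcription of Algorithm 9 (a sketch; A[0] is left unused so the 1-indexed arithmetic carries over unchanged):

```python
def sift(A, r, n, s):
    """Sift value s down from root index r in heap A[1..n] (1-indexed)."""
    p = r
    while 2 * p <= n:
        c = 2 * p
        if c < n and A[c + 1] > A[c]:   # pick the larger child
            c += 1
        if A[c] > s:
            A[p] = A[c]                 # move the child up into the hole
            p = c
        else:
            break
    A[p] = s

def heap_sort(A):
    """Heapsort (Algorithm 9); slot 0 is unused padding."""
    n = len(A) - 1
    for r in range(n // 2, 0, -1):      # Floyd's bottom-up heap creation
        sift(A, r, n, A[r])
    for m in range(n, 1, -1):           # repeatedly move the max to the end
        s = A[m]
        A[m] = A[1]
        sift(A, 1, m - 1, s)
    return A

print(heap_sort([None, 3, 1, 4, 1, 5, 9, 2, 6])[1:])  # [1, 1, 2, 3, 4, 5, 6, 9]
```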
11 Finding Bounds to summation functions
Mathematical Preliminaries
Geometric Series
For |r| < 1 recall:
$$\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r} \qquad \text{Infinite Geometric Series}$$
and for any r ≠ 1:
$$\sum_{i=0}^{n-1} r^i = \frac{1 - r^n}{1 - r} = \frac{r^n - 1}{r - 1} \qquad \text{Finite Geometric Series}$$
How can we check to confirm? We’ve already calculated the r = 1/2 case, where the infinite sum is 2.
$$\frac{1}{1 - \frac12} = 2$$
A related sum follows by differentiating the finite geometric series:
$$\sum_{i=1}^{n-1} i\,r^i = r\sum_{i=1}^{n-1} i\,r^{i-1} = r\left(\sum_{i=0}^{n-1} r^i\right)' = r\left[\left(\frac{r^n - 1}{r - 1}\right)'\right]$$
$$= r\left[\frac{nr^{n-1}(r - 1) - (r^n - 1)}{(r - 1)^2}\right] = r\left[\frac{nr^n - nr^{n-1} - r^n + 1}{(r - 1)^2}\right] = r\left[\frac{(n - 1)r^n - nr^{n-1} + 1}{(r - 1)^2}\right]$$
$$= \frac{(n - 1)r^{n+1} - nr^n + r}{(r - 1)^2}$$
Gauss's Sum
$$\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$$
Without knowing the answer already, we can see an easy upper bound: if every term is bounded by the largest term n, the sum can't be larger than n · n = n². We can also get a lower bound by flooring every term to the lowest value 1; since there are n of them, the result must be at least n. So n ≤ sum ≤ n².
Next we can try to split the sum in half and floor each half to its lowest term:
$$SUM = 1 + 2 + 3 + \cdots + \frac n2 + \left(\frac n2 + 1\right) + \cdots + n$$
$$\ge \underbrace{1 + 1 + \cdots + 1}_{n/2\ \text{terms}} + \underbrace{\left(\frac n2 + 1\right) + \cdots + \left(\frac n2 + 1\right)}_{n/2\ \text{terms}} = \frac n2 + \frac n2\left(\frac n2 + 1\right) = \frac n2\left[1 + \frac n2 + 1\right] = \frac n2\cdot\frac{n + 4}{2} = \frac{n^2}{4} + n$$
Harmonic Sum
$$H_n = \sum_{i=1}^{n}\frac1i = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \cdots + \frac1n \approx \ln n$$
We can bound it using the same grouping technique we applied to the Gaussian sum, rounding each group of terms up to the group's largest value:
$$H_n \le 1 + \left[\frac12 + \frac12\right] + \left[\frac14 + \frac14 + \frac14 + \frac14\right] + \cdots = 1 + 1 + \cdots + 1 \qquad \text{number of ones = number of groups}$$
Here k is the number of groups; the group sizes, summed, must account for all n terms:
$$\sum_{i=0}^{k-1} 2^i = 2^k - 1 = n \implies 2^k = n + 1 \implies k = \lg(n + 1)$$
$$H_n \le \lg(n + 1)$$
Gauss's Sum by Integration
$$\int_{m-1}^{n} x\,dx \;\le\; \sum_{i=m}^{n} i \;\le\; \int_{m}^{n+1} x\,dx$$
With m = 1:
$$\frac{x^2}{2}\bigg|_0^n \le \sum_{i=1}^{n} i \le \frac{x^2}{2}\bigg|_1^{n+1}$$
$$\frac{n^2}{2} - \frac{0^2}{2} \le \sum_{i=1}^{n} i \le \frac{(n + 1)^2}{2} - \frac{1^2}{2}$$
$$\frac{n^2}{2} \le \sum_{i=1}^{n} i \le \frac{n^2 + 2n}{2} = \frac{n(n + 2)}{2}$$
Harmonic Sum by Integration
$$\int_{m}^{n+1}\frac1x\,dx \;\le\; \sum_{i=m}^{n}\frac1i \;\le\; \int_{m-1}^{n}\frac1x\,dx$$
With m = 1:
$$\ln(n + 1) - \ln 1 \le H_n \le \ln n - \ln 0$$
$$\ln(n + 1) \le H_n \le \ln n + \infty$$
Less than infinity isn't helpful. Let's avoid it by removing the first term from the sum, so we never integrate 1/x from 0 to 1:
$$H_n \le 1 + \int_1^n \frac{dx}{x} = 1 + \ln x\Big|_1^n = 1 + \ln n - \ln 1 = \ln n + 1$$
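A quick numerical sanity check of both bounds (a throwaway sketch, not from the notes):

```python
import math

def harmonic(n):
    """H_n computed directly, to sanity-check the integral bounds."""
    return sum(1.0 / i for i in range(1, n + 1))

n = 1000
print(math.log(n + 1), harmonic(n), 1 + math.log(n))
# 6.9088  7.4855  7.9078  ->  ln(n+1) <= H_n <= 1 + ln n
```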
12 Order Notation
Did you know that some algorithms can be faster than others? It’s true!
https://en.wikipedia.org/wiki/Time_complexity
Name                              Running time T(n)        Examples of running times
Constant                          O(1)                     10
Iterated logarithmic              O(log* n)
Log-logarithmic                   O(log log n)
Logarithmic                       O(log n)                 log n, log(n²)
Polylogarithmic                   poly(log n)              (log n)²
Fractional power                  O(n^c), 0 < c < 1        n^(1/2), n^(2/3)
Linear                            O(n)                     n
N log star n                      O(n log* n)
Linearithmic                      O(n log n)               n log n, log n!
Quadratic                         O(n²)                    n²
Cubic                             O(n³)                    n³
Polynomial                        2^O(log n) = poly(n)     n, n log n, n^10
Quasi-polynomial                  2^poly(log n)            n^(log log n), n^(log n)
Exponential (linear exponent)     2^O(n)                   1.1^n, 10^n
Exponential                       2^poly(n)                2^n, 2^(n²)
Factorial                         O(n!)                    n!
Double exponential                2^(2^poly(n))            2^(2^n)
Order notation can hide simple truths, like a large constant in a lower-order term that seriously affects the running time for small n. Little-o means the function grows strictly slower than the bound: an o(n³) term could still be Θ(n^2.9998), which could throw off all of our calculations.
Equivalence     Function Order Set
=               Θ
≤               O
≥               Ω
<               o
>               ω
$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = c \ge 0 \longrightarrow f(n)\in O(g(n))$$
$$\lim_{n\to\infty}\frac{g(n)}{f(n)} = c \ge 0 \longrightarrow f(n)\in \Omega(g(n))$$
$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = 0 \longrightarrow f(n)\in o(g(n))$$
$$\lim_{n\to\infty}\frac{g(n)}{f(n)} = 0 \longrightarrow f(n)\in \omega(g(n))$$
What about non-polynomial functions, like $f(n) = n^2\,(10 + \sin n)$? The book has a non-limit-based approach which gives us a nice way of addressing oscillating functions.
To a first approximation you can do plain algebra using Theta notation: $2^{\Theta(n^2)}\cdot 2^{\Theta(n^3)} = 2^{\Theta(n^2) + \Theta(n^3)}$.
What's faster, $n^{\lg n}$ or $(\lg n)^n$? Exponentials grow much faster than polynomials; how do we state that formally? (Similarly, $(\log\log n)^a = o\!\left((\log n)^b\right)$.)
$$\lim_{n\to\infty}\frac{n^{\lg n}}{(\lg n)^n} = \lim_{n\to\infty}\frac{2^{\lg n\cdot\lg n}}{2^{\lg(\lg n)\cdot n}} = \lim_{n\to\infty}\frac{2^{\lg^2 n}}{2^{n\lg\lg n}} = 0$$
$$\therefore\; n^{\lg n} = o\!\left((\lg n)^n\right)$$
Polylogs grow slower than polynomials, and Quasi-polynomials grow slower than exponentials with variable bases.
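A quick numeric illustration of that limit (a throwaway sketch; the helper name is mine), computed in log space to avoid overflow:

```python
import math

def ratio(n):
    """n^(lg n) / (lg n)^n, evaluated via exponents base 2."""
    lg = math.log2
    return 2 ** (lg(n) * lg(n) - n * lg(lg(n)))

for n in [2**4, 2**6, 2**8]:
    print(n, ratio(n))   # tends to 0: n^(lg n) = o((lg n)^n)
```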
13 Quicksort
An efficient sorting algorithm developed by Tony Hoare in 1959 while trying to translate Russian. Recursively
partition the array, put small numbers to the left and large numbers to the right.
1 Function QUICKSORT(A,p,r)is
2 if p < r then
3 q ←PARTITION(A,p,r)
4 QUICKSORT(A,p,q-1)
5 QUICKSORT(A,q+1,r)
6 end
7 end
8 Function PARTITION(A,p,r)is
9 x ← A[r]
10 i←p−1
11 for j = p to r − 1 do
12 if A[j] ≤ x then
13 i←i+1
14 Exchange A[i], A[j]
15 end
16 end
17 i←i+1
18 Exchange A[i], A[r]
19 return i
20 end
Comparison Analysis
First note that PARTITION is always linear in comparisons on n = r −p+1 terms. The FOR loop runs n−1 comparisons
regardless of the result of the IF condition.
Worst case comparisons occur when the pivot always lands at one end (for example, an already-sorted array):
$$T(n) = T(n - 1) + n - 1,\quad T(1) = 0$$
$$= (n - 1) + (n - 2) + (n - 3) + \cdots + 3 + 2 + 1 = \sum_{i=1}^{n-1} i = \frac{(n - 1)\cdot n}{2}$$
Best case comparisons occur when the pivot always splits the array evenly, giving $T(n) = 2T(n/2) + n - 1 = O(n\lg n)$, as derived from the recurrence below.
Proof by Constructive Induction

Guess that $\sum_{i=1}^{n} i = an^2 + bn + c$ for some constants a, b, c.
Base case n = 1: $\sum_{i=1}^{1} i = a(1)^2 + b(1) + c = a + b + c = 1$.
Inductive hypothesis: $\sum_{i=1}^{n-1} i = a(n - 1)^2 + b(n - 1) + c$. Then
$$\sum_{i=1}^{n} i = n + \sum_{i=1}^{n-1} i = n + a(n - 1)^2 + b(n - 1) + c = n + a(n^2 - 2n + 1) + b(n - 1) + c = an^2 + (-2a + b + 1)n + a - b + c$$
If our assumption was true, then matching coefficients gives a system of simultaneous equations, which we can solve for the coefficients:
$$a = a,\qquad b = -2a + b + 1,\qquad c = a - b + c$$
$$a = \frac12,\qquad b = a = \frac12,\qquad c = 0$$
$$\sum_{i=1}^{n} i = \frac12 n^2 + \frac12 n = \frac{n(n + 1)}{2}$$
Quicksort Worst Case from Recurrence

Guess $T(n) = an^2 + bn + c$:
$$T(n) = T(n - 1) + n - 1 = a(n - 1)^2 + b(n - 1) + c + n - 1 = a(n^2 - 2n + 1) + b(n - 1) + c + n - 1$$
$$= an^2 + (-2a + b + 1)n + (c - b - 1 + a)$$
Find the coefficients, using the base case $T(1) = a + b + c = 0$:
$$a = a,\qquad -2a + b + 1 = b,\qquad c - 1 + a - b = c$$
$$a = \frac12,\qquad b = -\frac12,\qquad c = 0$$
$$T(n) = \frac12 n^2 - \frac12 n = \frac{n(n - 1)}{2}$$
Quicksort Best Case from Recurrence

Guess $T(n) \le a\,n\lg n$:
$$T(n) = 2T\!\left(\frac n2\right) + n - 1 \le 2\left[a\cdot\frac n2\lg\frac n2\right] + n - 1 = an[\lg n - \lg 2] + n - 1$$
$$= an\lg n - an + n - 1 = an\lg n + (-a + 1)n - 1$$
For this to hold, the right side must be $\le an\lg n$. Drop the $-1$ (it only helps); then $(-a + 1)$ must be $\le 0$, so $a \ge 1$. Any constant $a \ge 1$ solves the recurrence, and the tightest value is a = 1:
$$T(n) \le 1\cdot n\lg n = n\lg n$$
For the average case via the approximate recurrence, suppose each partition splits 3/4 to 1/4. Guess $T(n) \le a\,n\lg n$:
$$T(n) = T\!\left(\frac{3n}4\right) + T\!\left(\frac n4\right) + n - 1 \le a\frac n4\lg\frac n4 + a\frac{3n}4\lg\frac{3n}4 + n - 1$$
$$= a\frac n4(\lg n - \lg 4) + \frac{3an}4(\lg n - \lg 4 + \lg 3) + n - 1$$
$$= a\frac n4\lg n - \frac{an}2 + \frac{3an}4\lg n - \frac{3an}2 + \lg 3\cdot\frac{3an}4 + n - 1$$
$$= an\lg n + \left(-\frac a2 - \frac{3a}2 + \frac{3a\lg 3}4 + 1\right)n - 1 = an\lg n + \left(-2a + \frac{3\lg 3}{4}a + 1\right)n - 1$$
For this to be $\le an\lg n$:
$$-2a + \frac{3\lg 3}{4}a + 1 \le 0 \implies \left(2 - \frac34\lg 3\right)a \ge 1 \implies a \ge \frac{1}{2 - \frac34\lg 3}$$
$$T(n) \le \left(\frac{1}{2 - \frac34\lg 3}\right)n\lg n \approx 1.23\cdot n\lg n$$
For the exact recurrence, average over all pivot positions:
$$T(n) = n - 1 + \frac1n\sum_{q=1}^{n}\left[T(q - 1) + T(n - q)\right]$$
The first is a sum of T's from 0 to n−1 and the other is the downward sum from n−1 to 0, so we get two copies of the same sum of T:
$$T(n) = \frac2n\sum_{q=0}^{n-1} T(q) + n - 1$$
Substituting the guess $T(q) \le aq\lg q$ and bounding the resulting sum by an integral yields
$$T(n) \le an\lg n + \left(1 - \frac{a\lg e}{2}\right)n - 1 + \frac{a\lg e}{2n}$$
$$\text{need } 1 - \frac{a\lg e}{2} \le 0 \implies a \ge \frac{2}{\lg e} = 2\ln 2 \approx 1.39$$
Choose a = 1.39, and the last term becomes a number always at most 1, which is dominated by the minus 1.
New Book method

If two elements end up next to one another in the sorted array, they will always be compared; otherwise how would you know their order? Indeed, for adjacent ranks i and i + 1 the probability is
$$\frac{2}{(i + 1) - i + 1} = 1$$
In general the probability that the elements of rank i and rank j are ever compared is $\frac{2}{j - i + 1}$, so the expected comparison count is
$$T(n) = \sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{2}{j - i + 1} = 2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{1}{j - i + 1} = 2\sum_{i=1}^{n}\left(H_{n-i+1} - 1\right) = 2\sum_{i=1}^{n}\left(H_i - 1\right) = 2\sum_{i=1}^{n} H_i - 2\sum_{i=1}^{n} 1$$
The big insight is that you can calculate the probability that any two arbitrary elements are ever compared.
Is Quicksort In place?
In-place algorithms use a constant amount of extra space; the extra space is not a function of the number of elements n. Because quicksort is recursive, it can add more variables to the stack (multiple copies of p, q, and r), so it uses Θ(1) extra memory for algorithm variables plus Θ(log n) stack space on average. There isn't really a formal definition of in-place; informally it means an algorithm doesn't use a lot of extra space.
One way to avoid using much extra space is to check which side of a partition is smaller and do that side of the
partition first.
3-Median Pivot
You can make sure that you’ve got a good pivot by picking 3 elements randomly and using their median as the pivot.
This gives you a much better chance of pivoting near the middle.
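One hedged way to realize this in Python (the notes don't pin down the exact procedure; the helper name and random index choice are mine):

```python
import random

def median_of_three_index(A, p, r):
    """Return the index of the median of 3 randomly chosen elements of A[p..r]."""
    candidates = [random.randint(p, r) for _ in range(3)]
    return sorted(candidates, key=lambda idx: A[idx])[1]
```

Swapping A at that index with A[r] before calling PARTITION lets the existing quicksort code be reused unchanged.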
Memory Hierarchy
The memory hierarchy allows further, machine-specific optimizations: doing a lot of work in fast memory is preferable to waiting to grab data from slower memories. Merge sort and quicksort both take advantage of this spatial locality, since they sub-sort the array in pieces that fit into caches. Heapsort doesn't: its sifting mechanism jumps all over the tree, so it lacks spatial locality. Merge sort also has the problem that it isn't in-place, so it uses a lot of extra memory.
Natural approach: Hold a Tournament
Comparisons are like playing games in a sportsball match. You need to eliminate n − 1 teams to find the best, so you must run at least n − 1 comparisons; this gives us a lower bound. For the second best, think of double elimination: the superior team never loses, and if you lose to someone who then loses to someone else, you are no longer a candidate for second best, so no extra comparison is needed for you. If you keep track of who lost to whom, only the teams that lost directly to the champion remain candidates, and there are at most lg n of them: the height of the tournament bracket. Running the tournament costs n − 1 comparisons to find the best player, then another lg n − 1 among the champion's opponents to find the second best, giving about n + lg n comparisons.
16 Counting Sort
Say you have a list of duplicate integers in a range and you want to sort them. n integers in the range 0, . . . , k − 1.
Sort A into B. So simply count the number of integers of each size, and dole them out in order.
1 for i = 0 to k − 1 do
2 C[i] ← 0;
3 end
4 for j = 1 to n do
5 C[A[j]] ← C[A[j]] + 1;
6 end
7 t ← 0;
8 for i = 0 to k − 1 do
9 for j = 1 to C[i] do
10 t ← t + 1;
11 B[t] ← i;
12 end
13 end
Algorithm 10: Counting Sort
The running time is the number of numbers plus the range of the numbers. Θ(n + k). If you’re sorting a lot of
numbers in a small range, it’s the numbers that will dominate. If it’s a large range with sparse numbers, the range
will dominate.
An alternative method is to form the partial sums of C, so that each value C[i] becomes the number of elements less than or equal to i. Then iterate backwards through the original array: when you encounter an element of A, look up its count in C, place the element at that position in B, and decrement the count in C. This has the added value of treating each element as a unique entity, instead of as simply an integer, so you could do this on arbitrary comparable records with integer keys.
1 for i = 0 to k − 1 do
2 C[i] ← 0;
3 end
4 for j = 1 to n do
5 C[A[j]] ← C[A[j]] + 1;
6 end
7 for i = 1 to k − 1 do
8 C[i] ← C[i] + C[i − 1];
9 end
10 for j = n down to 1 do
11 B[C[A[j]]] ← A[j];
12 C[A[j]] ← C[A[j]] − 1;
13 end
Algorithm 11: Partial Sum Counting Sort
Running time: Θ(n + k).
Now the C array represents the number of elements less than or equal to each index value.
Why do we go through A backwards? To maintain stability.
We can use counting sort when k < n log n and mergesort when k > n log n.
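A Python sketch of the partial-sum version (0-indexed output array, so each count is decremented before placing; the function name is mine):

```python
def counting_sort(A, k):
    """Stable counting sort of integers in range 0..k-1 (Algorithm 11 style)."""
    C = [0] * k
    for x in A:                 # count occurrences
        C[x] += 1
    for i in range(1, k):       # prefix sums: C[i] = # of elements <= i
        C[i] += C[i - 1]
    B = [None] * len(A)
    for x in reversed(A):       # backwards pass keeps equal keys stable
        C[x] -= 1
        B[C[x]] = x
    return B

print(counting_sort([3, 1, 3, 0, 2, 1], 4))  # [0, 1, 1, 2, 3, 3]
```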
17 Radix Sort
Sort on the least significant digit, then the second least significant digit, and so on up to the most significant digit. This only works if a stable sort is used for each intermediate pass over a digit.
Proved by example in lecture; proved by induction on the intermediate sorts.
Attribute   Value   Variable
Size        10      n
Radix       9       r
Digits      3       d
Running Time: Θ(d(n + r))
To pick the best radix, note that sorting n numbers in the range 0..s−1 with radix r takes $d = \log_r s = \frac{\ln s}{\ln r}$ passes, for a running time proportional to $\frac{\ln s}{\ln r}(n + r)$. Setting the derivative with respect to r to zero:
$$\frac{d}{dr}\left[(\ln s)\,\frac{n + r}{\ln r}\right] = (\ln s)\,\frac{\ln r - \frac1r(n + r)}{(\ln r)^2} = 0$$
$$0 = \ln r - \frac nr \qquad \text{dropping the small } \tfrac rr \text{ term}$$
$$r = \frac{n}{\ln r} \implies r = \frac{n}{\ln\frac{n}{\ln n}} = \frac{n}{\ln n - \ln\ln n} \approx \frac{n}{\ln n}$$
So, surprisingly, the best radix for radix sort depends entirely on the input size and not on the range of values, and the running time becomes
$$\Theta\!\left(\frac{\ln s}{\ln\frac{n}{\ln n}}\left(n + \frac{n}{\ln n}\right)\right) = \Theta\!\left(\frac{n\ln s}{\ln n}\right)$$
But that's not a convenient number, so in real life pick the closest power of 2. It's nicest for the computer, since a digit then looks like a group of bits: take each group of bits and sort based on those bits. We want to use radix sort when its running time is less than quicksort's:
$$\Theta\!\left(\frac{n\log s}{\log n}\right) < \Theta(n\log n) \implies \Theta(\log s) < \Theta\!\left((\log n)^2\right) \implies \log s < \Theta(\log n)\log n \implies s < n^{\Theta(\log n)}$$
So radix sort is theoretically better when the range is relatively small. But even for 1 million numbers that is a very large range: $n^{\lg n} \approx 1{,}000{,}000^{20}$.
Radix sort is space-wise inefficient, but it has spatial locality.
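A Python sketch of LSD radix sort built on a stable counting pass per digit (the base choice of 256 is my assumption, standing in for "the closest power of 2"):

```python
def radix_sort(A, base=256):
    """LSD radix sort of non-negative ints; one stable counting pass per digit."""
    if not A:
        return A
    digits, biggest = 1, max(A)
    while base ** digits <= biggest:
        digits += 1
    for d in range(digits):          # least significant digit first
        shift = base ** d
        C = [0] * base
        for x in A:
            C[(x // shift) % base] += 1
        for i in range(1, base):     # prefix sums
            C[i] += C[i - 1]
        B = [None] * len(A)
        for x in reversed(A):        # stability is what makes LSD work
            dig = (x // shift) % base
            C[dig] -= 1
            B[C[dig]] = x
        A = B
    return A

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```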
18 Bucket Sort
Assume you have a uniformly distributed array of real numbers between 0 and 1. Create n buckets.
1 Clear B;
2 for i = 1 to n do
3 Put A[i] into bucket B [bn · A[i]c];
4 end
5 Sort each bucket;
6 Concatenate each bucket;
On average there will be 1 element per bucket. If you can rely on the buckets in the worst case not being very large
you can use a quadratic sort within the bucket and still get a linear running time.
You can even apply this to non-uniform distributions. Assume a normal distribution: simply vary the bucket widths relative to the mean, with buckets near the mean covering a smaller range and buckets farther from the mean on either side covering a larger one.
Also space wise inefficient, but has spatial locality.
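A Python sketch of the pseudocode above for uniform reals in [0, 1) (Python's sorted() stands in for "sort each bucket"):

```python
import math

def bucket_sort(A):
    """Bucket sort for uniform reals in [0, 1): expected linear time."""
    n = len(A)
    B = [[] for _ in range(n)]
    for x in A:
        B[math.floor(n * x)].append(x)   # bucket index floor(n * A[i])
    out = []
    for bucket in B:
        out.extend(sorted(bucket))       # each bucket has O(1) expected size
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12]))
```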
19 Selection
How do we select the kth smallest element from a group of numbers? How do we select the median from a group of numbers? A baseline via a heap: Median = $n + \frac n2\lg n$ (build the heap, then pop n/2 times).
1 Function SELECT(A, k, p, r) is
2 Find approximate median, the 'qth smallest';
3 Partition based on the 'qth smallest';
4 if k < q − p + 1 then
5 SELECT(A, k, p, q−1);
6 end
7 else if k > q − p + 1 then
8 SELECT(A, k − (q − p + 1), q+1, r);
9 end
10 else
11 // k = q − p + 1: found it
12 end
13 return q, A[q];
14 end
Take the list and find some approximate median, partition with this approximate median (q), then look on the left or
right side depending on how k compares to our partition.
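A Python sketch of SELECT with a random pivot standing in for the notes' approximate median (not in-place, unlike the pseudocode):

```python
import random

def select(A, k):
    """Return the k-th smallest element of A (k >= 1) by partitioning."""
    pivot = random.choice(A)
    smaller = [x for x in A if x < pivot]
    equal = [x for x in A if x == pivot]
    if k <= len(smaller):
        return select(smaller, k)                 # answer is on the left
    if k <= len(smaller) + len(equal):
        return pivot                              # answer is the pivot itself
    return select([x for x in A if x > pivot],    # answer is on the right
                  k - len(smaller) - len(equal))

print(select([7, 1, 5, 9, 3, 8, 2], 4))  # 5 (the median)
```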
Lower Bound
Let’s assume we know the true median, or that our approximate median is good enough.
$$T(n) = n - 1 + T\!\left(\left\lceil\frac{n - 1}{2}\right\rceil\right) \approx T\!\left(\frac n2\right) + n - 1 \approx 2n - 1 - \lg n \approx 2n$$
So if we got really lucky and got the true median we could find the kth element in linear time. As such we have a
reasonable assumption for our lower bound. You’ll never do better than 2n comparisons in a real case when you don’t
have the perfect median.
Constrained case: suppose the approximate median always leaves 3/4 of the elements:
$$T(n) = n - 1 + T\!\left(\frac{3n}4\right) = (n - 1) + \left(\frac34 n - 1\right) + \left(\frac{3^2}{4^2}n - 1\right) + \cdots \approx \frac{1}{1 - \frac34}n - c\log n \approx 4n$$
With the worst possible pivot:
$$T(n) = n - 1 + T(n - 1) = (n - 1) + (n - 2) + (n - 3) + \cdots = \frac{(n - 1)n}{2} \approx \frac{n^2}{2}$$
Average case:
$$T(n) = n - 1 + \frac1n\sum_{q=1}^{n} T(\max(q - 1,\ n - q)) \approx n - 1 + \frac2n\sum_{q=n/2}^{n-1} T(q)$$
Guess $T(q) \le aq$:
$$= n - 1 + \frac2n\sum_{q=n/2}^{n-1} aq = n - 1 + \frac{2a}{n}\sum_{q=n/2}^{n-1} q = \frac{2a}{n}\left[\sum_{q=1}^{n-1} q - \sum_{q=1}^{\frac n2 - 1} q\right] + n - 1$$
$$= \frac{2a}{n}\left[\frac{n(n - 1)}{2} - \frac{\frac n2\left(\frac n2 - 1\right)}{2}\right] + n - 1 = \frac{2a}{n}\left[\frac{n^2}{2} - \frac n2 - \frac{n^2}{8} + \frac n4\right] + n - 1$$
$$= an - a - \frac{an}4 + \frac a2 + n - 1 = \left(\frac34 a + 1\right)n - \frac a2 - 1$$
For the induction to work this must be $\le an$, so $\frac34 a + 1 \le a$, hence $a \ge 4$:
$$T(n) \approx 4n$$
Median of medians: group the elements by 5, take the median of each group, and use the median M of those medians as the pivot. At least $\frac{3}{10}n$ elements are guaranteed to fall on each side of M, so at most $\frac{7}{10}n$ elements remain to recurse over.
$$T(n) \le 2n + T\!\left(\frac n5\right) + T\!\left(\frac{7}{10}n\right) + n - 1 = T\!\left(\frac n5\right) + T\!\left(\frac{7}{10}n\right) + 3n - 1$$
Constructive induction, guess $T(n) \le an$:
$$\le a\frac n5 + a\frac{7}{10}n + 3n - 1 = \left(\frac{9}{10}a + 3\right)n - 1$$
$$\text{need} \le an \implies \frac{9}{10}a + 3 \le a \implies a \ge 30$$
$$\therefore T(n) \le 30n$$
The median is doable in linear time, but 30 is a very large coefficient. Compare it to quicksort at $n\lg n$: n must be around $2^{30}$ before our linear median selection is better.
Theory
Some researchers claim to have shown that the upper bound on median selection is about 3n.
21 Graph Algorithms
A graph can be represented as an adjacency list or an adjacency matrix.
The adjacency-list representation of an undirected graph is a bit awkward because it double-counts shared edges: altering an edge at one vertex requires altering the edge at its partner too, which means traversing both lists. One awkward way to alleviate this is to keep extra pointers between the list elements of shared edges.
Memory Usage
Matrix, where n = number of vertices: memory usage = Θ(n²)
List, where m = number of edges: memory usage = Θ(m + n)
How big can m be? m can be as large as n², but will probably be smaller.
You could also use a bit matrix of size n²/wordsize.
Edge Lookup
List: Θ(n) Matrix: Θ(1)
List all Edges
List: go down each index, then traverse each list; amortized, you visit the n list heads plus the edge entries, Θ(m + n).
Matrix: go across each row, looking at all the 0s even when you don't care about them, Θ(n²).
Takeaways
Typically we want to use the adjacency-list representation, because it's linear in the average case.
Removing Nodes
Shrink the array in a way similar to adding a node.
Depth First Search

1 Function DFS(u) is
2 Mark u as visited;
3 for all v such that (u,v) is an edge and v has not been visited do
4 DFS(v);
5 end
6 end
Applications
Identify Connected Components.
A connected component is a maximal connected subgraph.
1 Function Component(G)is
2 for all x ∈ V[G] do
3 visited[x] ← False;
4 end
5 t ← 0;
6 for all x ∈ V [G] do
7 if not visited[x] then
8 t ← t + 1;
9 Compvisit [x];
10 end
11 end
12 end
13 Function Compvisit(x)is
14 visited[x] ← T rue;
15 CompN um[x] ← t;
16 for all y ∈ adj[x] do
17 if not visited[y] then
18 Compvisit[y];
19 end
20 end
21 end
Algorithm 12: Connected Component
Adjacency Matrix: O(n²)
Bridges of Königsberg
Theorem 21.1 (Eulerian Cycle, 1736). An undirected, connected graph G = (V, E) has an Eulerian cycle if and only if the graph is connected and every vertex has an even degree (number of incident edges, with loops counted twice).
22 Trees
Created by S. Jaladanki
Definitions
Definition 22.1 (Tree). An undirected and connected acyclic graph.
Source: https://en.wikipedia.org/wiki/Minimum_spanning_tree
Definition 22.2 (Spanning Tree). A subset of edges that form a tree and contains every vertex in the graph. It does
not have cycles.
Definition 22.3 (Minimum Spanning Tree (MST)). A spanning tree that has the lowest possible sum of edge weights
Minimum Spanning Tree Algorithms
There are different approaches to form the Minimum Spanning Tree of a graph, and two important algorithms are
Kruskal’s Algorithm and Prim’s Algorithm.
Kruskal’s Algorithm
In this algorithm, we go through the graph's edges one at a time, from smallest edge weight to largest, adding each to the Minimum Spanning Tree (MST) as long as doing so does not create a cycle. If an edge would create a cycle in the MST, it is not added and the next edge is examined. We can stop when we have n − 1 edges in the MST. A hedged Python sketch follows (the union-find cycle test and function names are my additions; the notes don't specify a data structure).
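```python
def kruskal(n, edges):
    """Kruskal's MST on vertices 0..n-1; edges are (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):                      # union-find root with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # smallest weight first
        ru, rv = find(u), find(v)
        if ru != rv:                  # endpoints not yet connected: no cycle
            parent[ru] = rv
            mst.append((u, v, w))
        if len(mst) == n - 1:         # spanning tree complete
            break
    return mst

print(kruskal(4, [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]))
```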
Prim’s Algorithm
In this algorithm, we can start at any vertex and grow the Minimum Spanning Tree (MST). We take the cheapest
(smallest) edge coming off the growing tree that leads to a new vertex and add it to the MST. This process is repeated
until a MST is made.
1 for all x ∈ V [G] do
2 // Distances to all vertices initialized to ∞
3 D[x] ← ∞;
4 // Predecessors (Π) to all vertices initialized to NIL
5 Π[x] ← N IL;
6 end
7 Q ← V [G] // Q will hold the vertices which haven’t been checked off yet or "outside" vertices
8 // Distance to start vertex (a) assigned 0 so it is first vertex popped from Q
9 D[a] ← 0;
10 while Q 6= ∅ do
11 x ←Pop-minimum[Q];
12 for all y adjacent to x in Q do
13 if y ∈ Q and W [x, y] < D[y] then
14 // W[x,y] is the weight of edge x,y
15 // Distance and predecessor to new vertex updated
16 D[y] ← W [x, y];
17 Π[y] ← x;
18 end
19 end
20 end
Algorithm 14: Prim’s Algorithm
This algorithm is nearly identical to Dijkstra’s Algorithm. One important difference between these two is that Prim’s
adds the vertex closest to the current tree and Dijkstra’s adds the vertex closest to the source vertex.
Proof for Kruskal's and Prim's Algorithms

Statement: the cheapest edge crossing the boundary (between tree vertices and the rest) is in the MST.
Suppose we just added the cheapest boundary-crossing edge (1) to the MST. If edge (1) were removed and replaced with any other edge (2) that also crosses the boundary, the total weight could only increase, because edge (1) was already the cheapest crossing edge. Therefore a tree containing edge (2) instead of edge (1) would no longer be a minimum spanning tree, a contradiction.
Dijkstra’s Algorithm
1 for all x ∈ V [G] do
2 // Θ(n)
3 D[x] ← ∞;
4 Π[x] ← N IL;
5 end
6 Q ← V [G] // Q will hold the vertices which haven’t been checked off yet
7 D[a] ← 0;
8 while Q 6= ∅ do
9 // Θ(n)
10 x ←Pop-minimum[Q];
11 for all y adjacent to x in Q do
12 // Depends on implementation. Can be adjacency matrix, checkable in Θ(n)
13 if D[x] + W [x, y] < D[y] then
14 // W[x,y] is the weight of edge x,y
15 D[y] ← D[x] + W [x, y];
16 Π[y] ← x;
17 end
18 end
19 end
This bears a similarity to Breadth First Search. From the source, you reach the vertices separated by 1 edge first,
then those separated by 2 edges, then 3, etc. This forms an expanding shell of visited nodes with known shortest
distance. We have connections to edges outside the shell, with known edge weights. Since we are picking the smallest
such edge, the BFS-like process guarantees that you determine the shortest path to that vertex.
This only works when you have non-negative weight edges.
Correctness Proof
Base case: Nothing is in the circle, except the source vertex with distance 0.
Inductive Step: Remove smallest edge weight from consideration and update connecting vertices if smaller.
Since you know the shortest path to everything in the circle, the available path to the next shortest edge is guaranteed
to be the shortest path to that vertex.
Implementation details
Create Q as array. Let Q hold 1’s for vertices that haven’t been eliminated yet, and 0’s otherwise. Naive approach is
to travel Q looking for 1, lookup weight in D and track smallest value. If implemented with Adjacency Matrix, the
whole thing done in Θ(n2 ).
The alternative is to use a priority queue / heap for Q, removing the smallest value. This necessitates some
extra heap functions to grab an arbitrary node in the heap and change its weight. That lets you pop from the heap in
logarithmic time. There’s an additional requirement to visit |adj(x)| additional vertices, and update them with our
heap manipulation function to update weight, which is an additional Θ(lg n). Since Dijkstra’s does this for every
element the running time is Θ(m · lg n).
So if your graph is dense it’s better to use the Adjacency Matrix algorithm, which is quadratic in the number
of vertices. If your graph is sparse it’s better to use the m log n algorithm with a priority queue.
Of course in reality the quadratic, Adj. Matrix algorithm is usually better at the low level because of spatial locality.
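The notes' heap-based variant needs a decrease-key operation; Python's heapq lacks one, so the sketch below (my function names) substitutes the common lazy-deletion workaround, pushing duplicates and skipping stale entries, which keeps the Θ(m lg n) flavor:

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source; adj[u] = list of (v, weight) pairs."""
    dist = {source: 0}
    pq = [(0, source)]
    done = set()
    while pq:
        d, x = heapq.heappop(pq)
        if x in done:
            continue                      # stale (lazily deleted) entry
        done.add(x)
        for y, w in adj[x]:
            if y not in done and d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                heapq.heappush(pq, (d + w, y))
    return dist

adj = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}
print(dijkstra(adj, 0))  # {0: 0, 2: 1, 1: 3, 3: 4}
```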
24 NP-Completeness
Goal is to separate problems easy to solve from those hard to solve. Loosely speaking easy means polynomial time
and hard means exponential time.
These are not perfect definitions; in spite of that fact, the distinction still works well.
• Nature is kind to us. Natural problems that are polynomial time tend to have low degree. Not a theorem, just
an empirical statement.
• We don't really know what a computer is. Real-world computers have memory hierarchies and variable running times, and there's concurrency as well; but all models are equivalent up to polynomial time. One possible exception is quantum computers, though that separation isn't really proven in reality.
• Polynomials are closed under addition, multiplication, and composition.
If a formula is satisfiable, then that can be verified in polynomial time: the certificate is the satisfying assignment. If it isn't satisfiable, the only known way to show that is to try every possible assignment, which takes exponential time.
Theorem 24.1 (Cook-Levin, 1971). Satisfiability is NP-Complete.
• The Cook-Levin Theorem implies that if we can solve a satisfiability problem in polynomial time, we can solve any problem in the class NP in polynomial time.
The Class of NP problems
Is it the case that if something can be solved in polynomial time it can be verified in polynomial time? Does P = N P ?
Or more directly, are P and NP-Complete classes distinct? No one knows.
NP-Complete: The class of problems that are all equivalent to one another under reduction. Formula-SAT, circuit-SAT
and coloring are all the same problem, really. They can be reduced to one another.
Most things in real life are in the class NP. Empirical statement: for problems occurring in nature, once in the class NP, a problem tends to be either in P or NP-Complete. There's a theorem that says that if P ≠ NP then there is an infinite class of problems strictly between them.
Reduction
Having proved that a problem is NP complete, how do we prove other problems are also NP complete?
Converting from circuit to formula is harder than the other way. But the two are still verifiable in the same class of
running time.
Theory
There were generalizable characteristics of the Eulerian cycle problem that allowed mathematicians to find proofs about it, but the Hamiltonian cycle problem has no generalizable characteristics like that, and it wasn't until the advent of NP-completeness theory that this difference was understood.
Candidate problems believed to sit between P and NP-Complete:
• Discrete Logarithm
• Graph Isomorphism
Exam Materials
You must prove a problem is in the class NP by stating what the certificate is and showing how to use the certificate to verify a "yes" instance. No reductions.
Show that you can use the class of the decision-problem form of a question to derive the class of the optimization-problem form: if you can determine the yes/no answer of whether a formula is satisfiable, you can determine the certificate itself in polynomial time.
Formula SAT
Assume the decision version of Formula SAT is in P; we want to find the assignment. Given formula A over variables x₁, x₂, …, xₙ, fix one variable at a time, ask the polynomial-time routine whether the restricted formula is still satisfiable, keep the setting that works, and repeat for the next variable.
Graph Coloring Problem

We would like to assign arbitrary colors to the graph's nodes one at a time and then ask the routine whether the graph is still c-colorable; but assigning colors isn't allowed in our routine, so we need a trick. Make a complete graph on c vertices. Then connect our goal vertex in the original graph G to every vertex of the complete graph except the one representing the color we want; this forces the c-colorability oracle to require that color at that vertex.
Size of G′ = |G| + c² + (c − 1)n
25 Appendices
Created by S. Jaladanki
Here, additional information is given about concepts described in the above text.
A. Transpositions
A transposition involves a pair of numbers where the bigger number is above the smaller number.
For example, let’s look at an example array with the values at indices 1 to 5 listed from top to bottom. We’re trying
to sort this list to have lower numbers on top (or the left) and larger numbers on the bottom (the right).
3
5
2
1
4
In the above array, there are 2 transpositions involving 3, since both 1 and 2 are smaller than 3 but located farther to
the bottom (or the right) than 3. There are 3 transpositions involving 5 since 2, 1, and 4 are smaller than 5 but
located closer to the bottom of the array. Similarly, there is 1 transposition involving 2 (because of 1), 0 transpositions
involving 1, and 0 transpositions involving 4. From the above array, there is a total of 6 transpositions.
The number of transpositions is equal to the number of exchanges in bubble sort. For instance, in an already
sorted list, there would be 0 exchanges needed since there are 0 transpositions.
B. Summation Properties
Manipulating summations is an important part of the course, and examples of important properties of summations
are provided.
$$\sum_{k=i}^{j} 1 = j - i + 1 \qquad \text{Upper limit − lower limit + 1}$$
$$\sum_{i=1}^{n} i = \frac{n(n + 1)}{2} \qquad \text{Gauss's sum formula; notice that the lower limit is 1}$$
$$\sum_{i=2}^{n}(i - 1) = \sum_{i=1}^{n-1}\big((i + 1) - 1\big) = \sum_{i=1}^{n-1} i \qquad \text{Subtract 1 from both limits and add 1 inside}$$
$$\sum_{i=2}^{n} i = \left(\sum_{i=1}^{n} i\right) - 1 \qquad \text{Since a term of 1 was added to the summation, subtract 1 outside it}$$
$$\sum_{i=1}^{n-1} 2 = 2\sum_{i=1}^{n-1} 1 \qquad \text{Can pull out coefficients unrelated to the index (here } i\text{)}$$
$$\frac12\sum_{i=1}^{n}\left(i^2 + i\right) = \frac12\sum_{i=1}^{n} i^2 + \frac12\sum_{i=1}^{n} i \qquad \text{Each term in parentheses can be treated as a separate summation}$$
$$\sum_{i=2}^{n}\sum_{j=1}^{i}\frac1i(i - j + 1) = \sum_{i=2}^{n}\frac1i\sum_{j=1}^{i}(i - j + 1) \qquad \text{Can pull } \tfrac1i \text{ out of the inner summation because } j \text{ is its index}$$
$$\sum_{i=0}^{\lg n - 1} 2^i = \frac{1 - 2^{(\lg n - 1) + 1}}{1 - 2} = \frac{1 - 2^{\lg n}}{-1} = n^{\lg 2} - 1 = n - 1 \qquad \text{Use of the geometric series formula}$$
Following is an example of the Change of Variable Method.
$$\sum_{i=1}^{n}(n - i + 1)$$
Writing out the summation terms gives
$$\underbrace{(n - 1 + 1)}_{n} + \underbrace{(n - 2 + 1)}_{n - 1} + \underbrace{(n - 3 + 1)}_{n - 2} + \cdots + \underbrace{(n - (n - 2) + 1)}_{3} + \underbrace{(n - (n - 1) + 1)}_{2} + \underbrace{(n - n + 1)}_{1}$$
Based on the pattern and range of the terms (1, 2, 3, …, n − 2, n − 1, n) we can rewrite the summation as
$$= \sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$$