Introduction to Randomized Algorithms
Randomized algorithms use randomness as part of their logic. They can be especially
effective for problems that are hard to solve deterministically. Randomization can
simplify an algorithm, speed it up, or provide probabilistic guarantees on its
correctness or performance.
A Min-Cut Algorithm (Karger’s Algorithm)
Karger’s algorithm is used to compute a minimum cut of a connected graph. A
minimum cut is the smallest set of edges that, if removed, would separate the graph
into two disjoint subgraphs.
Pseudocode:
function KargerMinCut(graph G)
    while the number of vertices in G > 2
        choose a random edge (u, v) in G
        merge (or "contract") u and v into a single vertex
        remove self-loops
    end while
    return the cut represented by the remaining 2 vertices
end function
A single run of the contraction procedure finds a particular minimum cut with
probability at least ( \frac{2}{n(n-1)} ). To find the minimum cut with high
probability, the algorithm should therefore be run ( O(n^2 \log n) ) times, where
( n ) is the number of vertices in the graph.
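A minimal Python sketch of the contraction procedure with repetition (the edge-list
input format and the union-find bookkeeping are choices of this sketch, not part of
the pseudocode above):

import random

def karger_min_cut(num_vertices, edges):
    # edges: list of (u, v) pairs over vertices 0..num_vertices-1;
    # the graph is assumed connected. Union-find tracks merged vertices.
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # Path halving
            x = parent[x]
        return x

    remaining = num_vertices
    while remaining > 2:
        u, v = random.choice(edges)  # Uniform over surviving multi-edges
        ru, rv = find(u), find(v)
        if ru != rv:                 # Ignore self-loops
            parent[ru] = rv          # Contract the edge
            remaining -= 1
    # The cut consists of the edges crossing the two super-vertices
    return sum(1 for u, v in edges if find(u) != find(v))

def repeated_karger(num_vertices, edges, trials):
    # Repeat independent trials to boost the success probability
    return min(karger_min_cut(num_vertices, edges) for _ in range(trials))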
Las Vegas and Monte Carlo Algorithms
Las Vegas algorithms always produce a correct result (or report failure); their
running time, however, is a random variable. Monte Carlo algorithms, on the other
hand, have a bounded running time but produce a correct result only with a certain
probability.
Example (Las Vegas Algorithm – Randomized Quick Sort):
function RandomizedQuickSort(A, low, high)
    if low < high
        pivot_index <- RandomizedPartition(A, low, high)
        RandomizedQuickSort(A, low, pivot_index - 1)
        RandomizedQuickSort(A, pivot_index + 1, high)
    end if
end function

function RandomizedPartition(A, low, high)
    pivot_index <- choose a random index from low to high
    swap A[pivot_index] and A[high] // Move pivot to end
    return Partition(A, low, high)
end function
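Since Partition is referenced above but not defined, the following is a runnable
Python version; the Lomuto partition scheme used here is an assumption of this
sketch:

import random

def randomized_quicksort(A, low, high):
    # Sorts A[low..high] in place with a uniformly random pivot
    if low < high:
        p = randomized_partition(A, low, high)
        randomized_quicksort(A, low, p - 1)
        randomized_quicksort(A, p + 1, high)

def randomized_partition(A, low, high):
    pivot_index = random.randint(low, high)            # Uniform pivot choice
    A[pivot_index], A[high] = A[high], A[pivot_index]  # Move pivot to end
    pivot = A[high]
    i = low - 1
    for j in range(low, high):                         # Lomuto partition
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[high] = A[high], A[i + 1]
    return i + 1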
Binary Planar Partitions
Binary planar partitions are used in computational geometry and computer graphics to
recursively divide the plane with line segments; the resulting partition is
represented as a binary tree (a BSP tree).
Example (Binary Space Partitioning Tree):
function BuildBSPTree(segments)
    if segments is empty
        return null
    end if
    choose a segment S randomly from segments
    create a node N with S
    N.left <- BuildBSPTree(segments on the left side of S)
    N.right <- BuildBSPTree(segments on the right side of S)
    return N
end function
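A hedged Python sketch of the same construction for 2D segments: the point
representation, the cross-product side test, and the splitting of straddling
segments are choices of this sketch, and degenerate collinear cases are ignored:

import random

def side(p, a, b):
    # Cross product sign: > 0 if p is left of the directed line a -> b
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def split(seg, a, b):
    # Split seg at its intersection with the infinite line through a, b
    p, q = seg
    sp, sq = side(p, a, b), side(q, a, b)
    t = sp / (sp - sq)  # sp and sq have opposite signs here
    x = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
    return (p, x), (x, q)

class BSPNode:
    def __init__(self, splitter, left, right):
        self.splitter, self.left, self.right = splitter, left, right

def build_bsp(segments):
    if not segments:
        return None
    splitter = random.choice(segments)  # Random splitter, as in the pseudocode
    a, b = splitter
    left, right = [], []
    for seg in segments:
        if seg is splitter:
            continue
        sp, sq = side(seg[0], a, b), side(seg[1], a, b)
        if sp >= 0 and sq >= 0:
            left.append(seg)
        elif sp <= 0 and sq <= 0:
            right.append(seg)
        else:  # Segment straddles the splitting line: cut it in two
            s1, s2 = split(seg, a, b)
            (left if sp > 0 else right).append(s1)
            (left if sq > 0 else right).append(s2)
    return BSPNode(splitter, build_bsp(left), build_bsp(right))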
Probabilistic Recurrence
This involves using probability theory to analyze the behavior of recursive algorithms,
especially those that involve randomization.
Example (Recurrence for Randomized Quick Sort):
The running time ( T(n) ) satisfies the recurrence:
[ T(n) = T(k) + T(n - k - 1) + \Theta(n) ]
where ( k ) is the number of elements smaller than the randomly chosen pivot. Since
the pivot is uniform, ( k ) is uniform on ( \{0, 1, \ldots, n-1\} ); taking
expectations gives
[ E[T(n)] = \frac{2}{n} \sum_{k=0}^{n-1} E[T(k)] + \Theta(n), ]
which solves to ( E[T(n)] = O(n \log n) ).
Game-Theoretic Techniques: Game Tree Evaluation
Game tree evaluation involves algorithms that look ahead many moves in a game and
evaluate the possibilities. This often uses the minimax principle but can be enhanced
with randomization to handle the exponential growth of possible moves in complex
games.
Pseudocode (Minimax with Alpha-Beta Pruning):
function Minimax(node, depth, α, β, maximizingPlayer)
    if depth == 0 or node is a terminal node
        return the heuristic value of node
    if maximizingPlayer
        value <- -∞
        for each child of node
            value <- max(value, Minimax(child, depth - 1, α, β, false))
            α <- max(α, value)
            if α ≥ β
                break // β cut-off
        end for
        return value
    else
        value <- +∞
        for each child of node
            value <- min(value, Minimax(child, depth - 1, α, β, true))
            β <- min(β, value)
            if β ≤ α
                break // α cut-off
        end for
        return value
end function
The Minimax Principle
The minimax principle is a decision rule for minimizing the possible loss in a
worst-case scenario. It is widely used in game theory to determine the best move for
a player assuming the opponent plays optimally. In the analysis of randomized
algorithms, Yao's minimax principle applies the same idea to prove lower bounds: the
worst-case expected cost of any randomized algorithm is at least the expected cost
of the best deterministic algorithm under any chosen input distribution.
Moments and Deviations: Occupancy Problems
Occupancy problems, also known as balls-into-bins problems, deal with distributing
( n ) balls into ( m ) bins and analyzing the distribution properties.
Key Concepts:
Expected Number of Balls per Bin (First Moment): ( E[X] = \frac{n}{m} )
Variance and Standard Deviation (Second Moment): Used to describe the
spread of balls among bins.
Probability Bounds: Using Markov and Chebyshev inequalities to bound the
probability of deviating from the expected value.
Pseudocode for Simulation:
import random

def occupancy_simulation(balls, bins):
    bin_counts = [0] * bins
    for _ in range(balls):
        b = random.randint(0, bins - 1)  # Throw a ball into a random bin
        bin_counts[b] += 1
    return bin_counts

# Example usage:
balls = 100
bins = 10
result = occupancy_simulation(balls, bins)
print(result)  # Outputs the distribution of balls in bins
The Markov and Chebyshev Inequalities
The Markov inequality provides an upper bound on the probability that a non-negative
random variable is at least a certain value. Chebyshev’s inequality gives a bound on
the probability that a random variable deviates from its mean by more than a certain
number of standard deviations.
Markov Inequality (Statement):
[ P(X \geq a) \leq \frac{E[X]}{a} ]
Chebyshev Inequality (Statement):
[ P(|X - E[X]| \geq k\sigma) \leq \frac{1}{k^2} ]
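For example, in the occupancy setting above with ( n = 100 ) balls and ( m = 10 )
bins, each bin receives ( E[X] = 10 ) balls in expectation, so Markov's inequality
bounds the probability that a given bin receives 30 or more balls by
( P(X \geq 30) \leq \frac{10}{30} = \frac{1}{3} ).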
Randomized Selection
Randomized selection algorithms find the ( i^{th} ) smallest element in an unsorted
list in expected linear time.
Pseudocode (Randomized Select Algorithm):
def randomized_select(arr, left, right, i):
    # Returns the i-th smallest element (1-based) of arr[left..right];
    # randomized_partition is the routine from the quicksort section.
    if left == right:
        return arr[left]
    pivot_index = randomized_partition(arr, left, right)
    k = pivot_index - left + 1  # Rank of the pivot within the subarray
    if i == k:
        return arr[pivot_index]
    elif i < k:
        return randomized_select(arr, left, pivot_index - 1, i)
    else:
        return randomized_select(arr, pivot_index + 1, right, i - k)
Markov Chains and Random Walks: A 2-SAT Example
A Markov chain is a stochastic model describing a sequence of possible events where
the probability of each event depends only on the state attained in the previous event.
Random walks on graphs are a form of Markov chain.
Key Concepts for 2-SAT Algorithm:
Represent the 2-SAT problem as an implication graph.
A random walk can find a satisfying assignment by randomly flipping the
values of variables when a clause is unsatisfied.
Pseudocode (Random Walk for 2-SAT):
def random_walk_2SAT(clauses, max_steps):
    # The helper routines (random_initial_assignment, find_unsatisfied_clause,
    # choose_random_variable_from_clause) are left abstract in this sketch.
    assignment = random_initial_assignment()
    for _ in range(max_steps):
        unsatisfied = find_unsatisfied_clause(clauses, assignment)
        if not unsatisfied:
            return assignment  # All clauses satisfied
        # Flip one of the (at most two) variables in the unsatisfied clause
        variable_to_flip = choose_random_variable_from_clause(unsatisfied)
        assignment[variable_to_flip] = not assignment[variable_to_flip]
    # For a satisfiable formula on n variables, max_steps = O(n^2) finds a
    # satisfying assignment with constant probability (Papadimitriou).
    return None  # No satisfying assignment found within max_steps
Markov Chains, Random Walks on Graphs, Graph Connectivity
Graph connectivity can be analyzed using random walks. Two quantities are of
particular interest, defined below: the hitting time and the mixing time.
Key Concepts:
Hitting Time: Expected number of steps to reach a node.
Mixing Time: Time it takes to get close to the stationary distribution.
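As an illustrative sketch (the graph representation and step budget are assumptions
of this sketch): since the cover time of a connected undirected graph with ( n )
vertices and ( m ) edges is ( O(nm) ), a random walk of that length starting at
( s ) visits ( t ) with constant probability whenever the two are connected.

import random

def random_walk_reaches(adj, s, t, steps):
    # adj: adjacency list (dict or list of lists); walk from s for
    # `steps` steps and report whether t was visited.
    current = s
    for _ in range(steps):
        if current == t:
            return True
        current = random.choice(adj[current])  # Move to a uniform neighbor
    return current == t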
Algebraic Techniques: Fingerprinting
Fingerprinting is an algebraic technique for quickly comparing large sets of data or
objects: a hash-like function maps each object to a much smaller representation, its
fingerprint, such that equal objects always have equal fingerprints and distinct
objects collide only with small probability.
Application (String Matching):
Rabin-Karp Algorithm: A fingerprinting technique for string matching that
uses a rolling hash to find all occurrences of a pattern string within a text string.
Pseudocode (Rabin-Karp for String Matching):
def rabin_karp(text, pattern, base=256, mod=10**9 + 7):
    p_len, t_len = len(pattern), len(text)
    if p_len > t_len:
        return -1
    high = pow(base, p_len - 1, mod)  # For removing the leading character
    p_hash = t_hash = 0
    for i in range(p_len):  # Hash the pattern and the first text window
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(t_len - p_len + 1):
        # Verify on a hash match to rule out collisions
        if p_hash == t_hash and text[i:i + p_len] == pattern:
            return i  # Match found at index i
        if i < t_len - p_len:
            # Roll the hash: drop text[i], append text[i + p_len]
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + p_len])) % mod
    return -1  # No match found
Freivalds’ Technique
Freivalds' algorithm is a probabilistic method to verify matrix multiplication.
Given ( n \times n ) matrices ( A ), ( B ), and ( C ), it checks whether ( AB = C )
using only ( O(n^2) ) work per trial (matrix-vector products) instead of recomputing
the ( O(n^3) )-time product; ( k ) independent trials reduce the error probability
to at most ( 2^{-k} ).
Pseudocode (Freivalds’ Algorithm):
import random

def mat_vec(M, v):
    # Multiply matrix M by column vector v
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def freivalds(A, B, C, iterations):
    n = len(A)
    for _ in range(iterations):
        # Create a random 0/1 vector r
        r = [random.randint(0, 1) for _ in range(n)]
        # Check if A * (B * r) equals C * r; each side costs O(n^2)
        if mat_vec(A, mat_vec(B, r)) != mat_vec(C, r):
            return False  # AB definitely does not equal C
    return True  # AB = C with probability at least 1 - 2^(-iterations)
Verifying Polynomial Identities
A polynomial identity is an equation between polynomials that holds for all values
of the variables.
Schwartz–Zippel Lemma:
The Schwartz–Zippel lemma states that a non-zero polynomial of total degree ( d ),
evaluated at a point whose coordinates are drawn independently and uniformly from a
finite set ( S ), is zero with probability at most ( \frac{d}{|S|} ). Hence a random
evaluation detects a non-zero polynomial with high probability when ( |S| ) is
large; to test an identity ( P = Q ), apply the lemma to ( P - Q ).
Pseudocode (Schwartz-Zippel Lemma):
import random

def schwartz_zippel(polynomial, domain, k):
    # Tests whether `polynomial` is identically zero using k random
    # evaluations; polynomial.variables() and polynomial.evaluate()
    # are assumed interface methods of the polynomial object.
    for _ in range(k):
        # Choose a random point from the domain, one value per variable
        point = {var: random.choice(domain) for var in polynomial.variables()}
        if polynomial.evaluate(point) != 0:
            return True   # The polynomial is certainly non-zero
    return False  # Probably identically zero
Perfect Matching in Graphs
A perfect matching in a graph is a set of edges such that each vertex in the graph is
incident to exactly one edge of the set.
Algorithm (Blossom Algorithm for Perfect Matching):
The blossom algorithm (Edmonds) finds a maximum matching in a general graph in
polynomial time; a perfect matching exists exactly when the maximum matching covers
every vertex.
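A randomized alternative, in the spirit of the algebraic techniques above (Lovász):
a graph has a perfect matching if and only if the determinant of its Tutte matrix is
not identically zero as a polynomial, which can be tested by substituting random
values and computing a single numeric determinant. The following sketch works modulo
a large prime; the representation and helper are choices of this sketch.

import random

def det_mod_p(M, p):
    # Determinant by Gaussian elimination over the field Z_p
    n = len(M)
    M = [row[:] for row in M]
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % p != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # A row swap flips the sign
        det = (det * M[col][col]) % p
        inv = pow(M[col][col], p - 2, p)  # Inverse via Fermat's little theorem
        for r in range(col + 1, n):
            factor = (M[r][col] * inv) % p
            for c in range(col, n):
                M[r][c] = (M[r][c] - factor * M[col][c]) % p
    return det % p

def has_perfect_matching(n, edges, p=10**9 + 7):
    # Tutte matrix with random entries mod p: a nonzero determinant proves a
    # perfect matching exists; a zero determinant is wrong with probability
    # at most n/p by the Schwartz-Zippel lemma.
    T = [[0] * n for _ in range(n)]
    for u, v in edges:
        x = random.randrange(1, p)
        T[u][v], T[v][u] = x, p - x  # Random value for x_uv, and -x_uv
    return det_mod_p(T, p) != 0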
Verifying Equality of Strings
This problem involves checking whether two strings are equal, typically when the
strings are long or stored at different sites, so that transmitting and comparing
full copies is expensive.
Algorithm (Direct Comparison):
def are_strings_equal(str1, str2):
    if len(str1) != len(str2):
        return False
    for i in range(len(str1)):
        if str1[i] != str2[i]:
            return False
    return True
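Direct comparison reads both strings in full. A fingerprinting alternative (a
sketch; the polynomial hash and modulus are choices of this sketch) compares short
random fingerprints instead, which is useful when only the fingerprints can be
exchanged cheaply:

import random

def strings_probably_equal(s1, s2, p=10**9 + 7):
    # Treat each string as a polynomial evaluated at a random point x mod p.
    # Equal strings always agree; distinct strings of length n collide with
    # probability at most (n - 1) / p (Schwartz-Zippel in one variable).
    if len(s1) != len(s2):
        return False
    x = random.randrange(1, p)
    f1 = f2 = 0
    for c1, c2 in zip(s1, s2):
        f1 = (f1 * x + ord(c1)) % p
        f2 = (f2 * x + ord(c2)) % p
    return f1 == f2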
Comparison of Fingerprinting Techniques
This would involve comparing different fingerprinting techniques, such as Rabin-Karp
rolling hashes, polynomial fingerprints of the Schwartz-Zippel type, and Freivalds'
matrix fingerprints, based on their efficiency, probability of collision, and use
cases.
Pattern Matching
Pattern matching is the act of checking a given sequence of tokens for the presence of
the constituents of some pattern.
Algorithm (Knuth-Morris-Pratt (KMP)):
The KMP algorithm searches for occurrences of a “word” ( W ) within a main “text
string” ( S ) by employing the observation that when a mismatch occurs, the word
itself embodies sufficient information to determine where the next match could begin.
Pseudocode (KMP for Pattern Matching):
def KMP_table(pattern):
    # Build the longest proper prefix which is also a suffix (LPS) array
    lps = [0] * len(pattern)
    length = 0
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length != 0:
            length = lps[length - 1]  # Fall back to a shorter border
        else:
            lps[i] = 0
            i += 1
    return lps

def KMP_search(text, pattern):
    lps = KMP_table(pattern)
    i = j = 0  # i indexes text, j indexes pattern
    while i < len(text):
        if pattern[j] == text[i]:
            i += 1
            j += 1
            if j == len(pattern):
                return i - j  # Pattern found at index i - j
        elif j != 0:
            j = lps[j - 1]  # Shift the pattern without moving back in text
        else:
            i += 1
    return -1  # Pattern not found
The Fundamental Data-structuring Problem
Data structuring involves organizing and storing data in a computer so that it can
be accessed and modified efficiently. The fundamental data-structuring problem is to
maintain a dynamic set of keys under operations such as search, insert, and delete
with the lowest possible time complexity.
Random Treaps
A treap is a data structure that combines binary search trees and heaps (tree + heap
= treap): each key is assigned a random priority, and the tree is simultaneously a
binary search tree on the keys and a max-heap on the priorities. It maintains a
dynamic set of ordered keys and supports binary search tree operations in expected
( O(\log n) ) time.
Key Operations:
Insert: Add a new key to the treap.
Search: Find a key in the treap.
Delete: Remove a key from the treap.
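A hedged sketch of treap insertion (the node layout and helper names are choices of
this sketch): each key receives a random priority, and rotations restore the heap
property after a standard BST insert, which keeps the tree balanced in expectation.

import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # Random heap priority
        self.left = self.right = None

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def treap_insert(root, key):
    # BST insert, then rotate to restore the max-heap property on priorities
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = treap_insert(root.left, key)
        if root.left.priority > root.priority:
            root = rotate_right(root)
    else:
        root.right = treap_insert(root.right, key)
        if root.right.priority > root.priority:
            root = rotate_left(root)
    return root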
Skip Lists
Skip lists are data structures that allow ( O(\log n) ) search complexity as well as fast
operations on a sorted sequence of elements. They achieve this by maintaining
multiple linked lists that skip over a number of elements, creating a hierarchy of
linked lists that quickly narrows down the search area.
Pseudocode (Search in Skip List):
def search_skip_list(skip_list, value):
    current = skip_list.head
    while current:
        # Move to the right in the current level
        while current.right and current.right.value < value:
            current = current.right
        # If you can go down, go down
        if current.down:
            current = current.down
        else:
            break
    return current.right if current.right and current.right.value == value else None
Hashtables with ( O(1) ) Search Time
Hashtables, or hashmaps, provide ( O(1) ) average-case time complexity for search,
insert, and delete operations by using a hash function to index into an array in
which the keys are stored. For a static set of keys, two-level perfect hashing (FKS
hashing) even achieves ( O(1) ) worst-case search time with ( O(n) ) space.
Key Operations:
Insert: Add a key-value pair to the hashtable.
Search: Retrieve the value associated with a given key.
Delete: Remove a key-value pair from the hashtable.
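A minimal sketch of the chaining approach (the class layout and bucket count are
choices of this sketch; Python's built-in hash stands in for a random hash
function):

class ChainedHashTable:
    # Hashing with chaining: operations take expected O(1) time when the
    # load factor (keys per bucket) stays constant.
    def __init__(self, num_buckets=16):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # Overwrite an existing key
                return
        bucket.append((key, value))

    def search(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None  # Key not present

    def delete(self, key):
        bucket = self._bucket(key)
        bucket[:] = [(k, v) for (k, v) in bucket if k != key]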
Graph Algorithms: All Pairs Shortest Paths
The all-pairs shortest path problem is to find the shortest paths between every pair of
vertices in a weighted graph.
Algorithm (Floyd-Warshall Algorithm):
The Floyd-Warshall algorithm is a dynamic programming solution that computes the
shortest paths between all pairs of vertices in ( O(n^3) ) time, where ( n ) is the
number of vertices.
Pseudocode (Floyd-Warshall):
def floyd_warshall(weights):
    # weights[i][j]: edge weight from i to j (use float('inf') for no edge)
    n = len(weights)
    dist = [row[:] for row in weights]  # Copy the weight matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Allow k as an intermediate vertex on the i -> j path
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist
The Min-Cut Problem
The min-cut problem in a graph is to find the minimum set of edges that, if removed,
would disconnect the graph.
Algorithm (Karger’s Algorithm): (Already described in detail in UNIT - I)
Minimum Spanning Trees
A minimum spanning tree (MST) is a subset of the edges of a connected, edge-
weighted undirected graph that connects all the vertices together, without any cycles
and with the minimum possible total edge weight.
Algorithm (Kruskal’s Algorithm):
Kruskal’s algorithm is a greedy algorithm that finds an MST for a connected weighted
graph.
Pseudocode (Kruskal’s Algorithm):
def find(parent, i):
    # Find the root of i's component (with path compression)
    if parent[i] != i:
        parent[i] = find(parent, parent[i])
    return parent[i]

def kruskal(num_vertices, edges):
    # edges: list of (u, v, weight) tuples
    result = []
    edges = sorted(edges, key=lambda item: item[2])  # Sort by weight
    parent = list(range(num_vertices))  # One set per vertex, not per edge
    for u, v, w in edges:
        x = find(parent, u)
        y = find(parent, v)
        if x != y:  # Adding (u, v) does not create a cycle
            result.append((u, v, w))
            parent[x] = y
        if len(result) == num_vertices - 1:
            break  # Spanning tree complete
    return result
Randomized Incremental Construction
Randomized incremental construction is a technique used in computational geometry
for various algorithms, including Delaunay triangulation and convex hulls: the input
objects are inserted one at a time in uniformly random order. For planar convex
hulls this gives expected ( O(n \log n) ) running time.
Algorithm (Randomized Incremental Construction for Convex Hulls):
1. Randomly permute the input points.
2. Initialize the convex hull with the first three points.
3. Iteratively add the next point and update the convex hull.
Pseudocode:
import random

def incremental_convex_hull(points):
    # create_initial_hull, is_inside and update_hull are left abstract here.
    random.shuffle(points)                   # Step 1: random insertion order
    hull = create_initial_hull(points[0:3])  # Step 2: hull of first 3 points
    for point in points[3:]:                 # Step 3: insert remaining points
        if not is_inside(hull, point):       # Only outside points change the hull
            hull = update_hull(hull, point)
    return hull
Convex Hulls in the Plane
The convex hull of a set of points is the smallest convex polygon that contains all the
points.
Algorithm (Graham’s Scan for Convex Hull):
1. Find the point with the lowest y-coordinate, break ties by x-coordinate (this is
the pivot).
2. Sort the points by the polar angle with the pivot.
3. Traverse the sorted array and for each point, determine whether moving from
the two previously considered points to this point is a “left turn” or a “right
turn”. If a “right turn”, the second-to-last point is not part of the convex hull
and should be removed.
Pseudocode:
def graham_scan(points):
    # polar_angle and right_turn are assumed geometric helpers.
    pivot = min(points, key=lambda p: (p.y, p.x))
    rest = sorted((p for p in points if p != pivot),
                  key=lambda p: polar_angle(p, pivot))
    hull = [pivot]
    for point in rest:
        # Pop points that would make a right (clockwise) turn
        while len(hull) > 1 and right_turn(hull[-2], hull[-1], point):
            hull.pop()
        hull.append(point)
    return hull
Delaunay Triangulations
Delaunay triangulation for a set of points in the plane is a triangulation such that no
point is inside the circumcircle of any triangle.
Algorithm (Incremental Construction for Delaunay Triangulations):
Similar to the convex hull, but maintains a Delaunay triangulation throughout.
Trapezoidal Decompositions
Trapezoidal decomposition is a method for partitioning the plane into trapezoids,
often used as a preprocessing step for point location or motion planning. When the
segments are inserted in random order (randomized incremental construction), the
decomposition is built in expected ( O(n \log n) ) time and supports point-location
queries in expected ( O(\log n) ) time.
Parallel and Distributed Algorithms: The PRAM Model
PRAM (Parallel Random Access Machine) is an idealized model for designing parallel
algorithms: several processors operate synchronously on a shared random-access
memory. It allows the analysis of algorithmic complexity in a parallel computing
environment.
Example (Parallel Summation):
def parallel_sum(array, num_processors):
    # Simulates a PRAM-style reduction: each "processor" sums one chunk,
    # then the partial sums are reduced recursively with fewer processors.
    n = len(array)
    if n == 1:
        return array[0]
    if num_processors == 1:
        return sum(array)
    chunk = (n + num_processors - 1) // num_processors  # Ceiling division
    # Parallel step: one chunk per processor
    sums = [sum(array[i * chunk:(i + 1) * chunk]) for i in range(num_processors)]
    # Recursive step: halve the number of processors each round
    return parallel_sum(sums, max(1, num_processors // 2))
Sorting on a PRAM
Sorting can be efficiently performed on a PRAM using algorithms like bitonic sort or
parallel merge sort.
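As a hedged illustration, here is a sequential simulation of the bitonic sorting
network (assuming the input length is a power of two); on a PRAM, each round of
compare-exchanges runs in constant parallel time, giving ( O(\log^2 n) ) parallel
steps overall:

def bitonic_sort(a):
    # Sequential simulation of the bitonic sorting network; requires
    # len(a) to be a power of two.
    n = len(a)
    k = 2
    while k <= n:
        j = k // 2
        while j > 0:
            for i in range(n):  # On a PRAM, this loop runs in parallel
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a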
Maximal Independent Sets
A maximal independent set (MIS) is a set of vertices in a graph, no two of which are
adjacent, that cannot be extended: adding any other vertex would place two adjacent
vertices in the set.
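A classic randomized parallel approach is Luby's algorithm; the following is a
sequential simulation of one round-based variant (the adjacency-set input format is
an assumption of this sketch). On a PRAM each round takes constant time, and
( O(\log n) ) rounds suffice in expectation:

import random

def luby_mis(adj):
    # adj: dict mapping each vertex to the set of its neighbors.
    live = set(adj)
    mis = set()
    while live:
        # Each live vertex draws a random value, in parallel
        r = {v: random.random() for v in live}
        # Local minima among live neighbors join the independent set
        winners = {v for v in live if all(r[v] < r[u] for u in adj[v] & live)}
        mis |= winners
        # Remove the winners and all of their neighbors
        for v in winners:
            live -= adj[v]
        live -= winners
    return mis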
Perfect Matchings
A perfect matching in a graph is a set of edges such that every vertex is incident to
exactly one edge in the matching.
Algorithm (Hungarian Algorithm for Perfect Matching in Bipartite Graphs):
The Hungarian algorithm finds a maximum matching or maximum-weight matching in a
weighted bipartite graph in polynomial time. In the parallel setting, randomization
is key: the isolation lemma of Mulmuley, Vazirani, and Vazirani yields an RNC
algorithm that finds a perfect matching using Tutte-matrix determinants.