0% found this document useful (0 votes)
30 views

04 Randomized Algorithms

Uploaded by

bshayer150
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views

04 Randomized Algorithms

Uploaded by

bshayer150
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 25

CS2315

Algorithm Fundamentals

Randomized
Algorithms
Lectures prepared by: Dr. Manal Alharbi, and Dr. Areej Alsini
Deterministic algorithms

• A deterministic algorithm is an algorithm that, given the same input and


starting conditions, will always produce the same output and follow the same
sequence of steps. In other words, the behaviour of a deterministic algorithm
is completely predictable and reproducible.

• Key characteristics of deterministic algorithms include:


1.Deterministic decision-making: At each step of the algorithm, the decision-
making process is based solely on the input and the current state of the algorithm.
There is no randomness or uncertainty involved in the decision-making process.

2.Consistent output: Given the same input, a deterministic algorithm will always
produce the same output. This property is crucial for ensuring the reliability and
predictability of an algorithm.

3.Sequential execution: A deterministic algorithm follows a specific sequence of


steps or instructions in a predetermined order. The execution of each step depends
on the successful completion of the previous steps, ensuring a consistent and
predictable flow.
Randomized Algorithms

• A randomized algorithm is an algorithm whose behavior


depends, in part, on the outcomes of random choices or the
values of random bits.

• In addition, there are some problems that need randomization


for them to work effectively.

• For instance, consider the problem common in computer


games involving playing cards—that of randomly shuffling a
deck of cards so that all possible orderings are equally likely.
Randomized Algorithms

▪ For many algorithms (such as quicksort), the average run time is much
better than the worst-case run time. Average run times are computed
assuming a specific (typically uniform) distribution on the input space.
However, this assumption may not hold in practice.

▪ Is it possible to develop algorithms whose run times will be as good as


the average run times but where no assumptions are made on the input
space? The answer is yes, and such algorithms employ coin flips.

▪ A randomized algorithm is one where certain decisions are made based


on outcomes of coin flips made in the algorithm.
Randomized Algorithms

• In addition to the inputs, the algorithm takes a source of random numbers


and makes random choices during execution.
• Behavior of the algorithm can vary in different runs even on a fixed input
(based on the random choices).
• These probabilities expectations are controlled by the random choices made
by the algorithm and are independent of the inputs.
Randomized Algorithms

• The analysis of a randomized algorithm is done without assuming anything


about the input distribution. The analysis is done in the space of all possible
outcomes for coin flips made in the algorithm (rather than in all possible
inputs).

• A randomized algorithm uses a randomizer – a random number generator.


Some of the algorithm's decisions depend on the randomizer's output.

• Since the output of any randomizer might differ unpredictably from run to
run, the output of a randomized algorithm could also vary from run to run for
the same input.
• The execution time of a randomized algorithm could also vary from run to run for the
same input.
Randomized Algorithms-Challenge

There are some challenges that come with randomized


algorithms:

• Design the algorithm.

• Come up with an analysis to show that this behavior is likely


to be good on every input.
Types of randomized algorithms

Input: Given an array with n elements (n even). A[1 … n]. Half of the
array contains 0s, the other half contains 1s.
Output: An index that contains a 1.
Types of randomized algorithms

• There are two kinds of randomized algorithms:


• Monte Carlo and
• Las Vegas.
Las Vegas algorithm

• A Las Vegas algorithm always outputs the correct answer for


the same input and its run time is a random variable.
• Ideally, we would like to prove that the run time of a Las
Vegas algorithm is small" with a very high probability. Such
run time bounds proven are referred to as high probability
bounds".
• The execution time of a Las Vegas algorithm depends on the
output of the randomizer. If we are lucky, the algorithm might
terminate fast, and if not, it might run for a longer period of
time.
• In general, the execution time of a Las Vegas algorithm is
characterized as a random variable
Las Vegas algorithm
Monte Carlo algorithm
• A Monte Carlo algorithm typically pertains to decision
problems (i.e., problems for which the answer is either \yes" or
\no").
• Its outputs might differ from run to run (for the same input).

• If a Monte Carlo algorithm is employed to solve such a problem, then the


algorithm might give incorrect answers depending on the output of the
randomizer.
• We require that the probability of an incorrect answer from a Monte Carlo
algorithm be low (the output of a Monte Carlo algorithm is correct with a very
high probability).

• Typically, for a fixed input, a Monte Carlo algorithm does not display
much variation in execution time between runs, whereas in the case of a
Las Vegas algorithm, this variation is significant.
Monte Carlo algorithm
Randomized Algorithms
• Randomized algorithms offer simplicity and better performance relative to
their deterministic counterparts. For a number of fundamental problems in
computing (such as sorting, selection, convex hull, etc.).

• The best-known algorithms happen to be randomized. These algorithms


have the best asymptotic run times and perform the best in practice.

• By a high (or very high) probability, we mean:


• a probability of at least (1-n-α), where n is the input size, and α is a
probability parameter (assumed to be a constant >= 1).
Randomized Algorithms
Identification of the Repeated Element:
• Consider the following problem:
𝑛
Input: 𝑋 = 𝑘1 , 𝑘2 , … . , 𝑘𝑛 . It is known that 𝑋 has copies of one element, and the
2
other elements are distinct (i.e., they occur exactly once each).
Output: the repeated element.

• We can solve the above problem deterministically in several ways:


1. We can sort the sequence X and scan through the sorted list. Copies of the repeated
element will be found in successive positions. The run time of this algorithm will be
Θ(nlogn) (if we use merge sort, for example);
2. We can use a median finding algorithm to identify the repeated element. There exist
Θ(n) time algorithms for median finding;
3. There is a relatively simple linear time (i.e., Θ(n) time) algorithm for finding the
repeated element: Partition X into groups of size 3 each and look for the repeated
element in the individual groups. By pigeon-hole principle, it follows that at least one
of the groups will have at least two copies of the repeated element.
Randomized Algorithms
Identification of the Repeated Element:
• Consider the following problem:
𝑛
Input: 𝑋 = 𝑘1 , 𝑘2 , … . , 𝑘𝑛 . It is known that 𝑋 has 2 copies of one element, and the
other elements are distinct (i.e., they occur exactly once each).
Output: the repeated element.

• We can argue that any deterministic algorithm for solving the above problem
𝑛
needs 2 + 2 time as follows: Consider an adversary who has perfect knowledge
about the algorithm to be used and who is picking the input. Specifically, the
adversary knows the order in which the sequence elements are accessed by the
algorithm. In this case the adversary can make sure that the first + 1 elements
accessed by the algorithm are distinct, forcing the algorithm to access at least
𝑛
one more element ➔ 2 + 2 .
▪ Example: n=8, A= {1, 2, 7, 8, 3, 3, 3, 3}
Identification of the Repeated Element: Las
Vegas algorithm
• We can develop a Las Vegas algorithm that takes only Õ(log 𝑛) time with a
high probability.
Basic step
• repeat
1. Flip an 𝑛-sided coin to get 𝑖;
2. Flip an 𝑛-sided coin to get 𝑗;
3. if 𝑖 ≠ 𝑗 and 𝑘𝑖 = 𝑘𝑗 then output 𝑘𝑖 and quit;
• forever
1 2 3 4 5 6 7 8

𝒓𝒆𝒑𝒆𝒂𝒕𝒆𝒅_𝑬𝒍𝒆𝒎𝒆𝒏𝒕𝒔 (𝑨, 𝒏): 1 2 7 3 7 4 7 7

𝑊ℎ𝑖𝑙𝑒 (𝑡𝑟𝑢𝑒) 𝑑𝑜: // i and j are random numbers in the range [1,n]
𝑖 = 𝑅𝑎𝑛𝑑𝑜𝑚 () 𝑚𝑜𝑑(𝑛 + 1); //to make sure the range of the value is between 1 to n
𝑗 = 𝑅𝑎𝑛𝑑𝑜𝑚 () 𝑚𝑜𝑑(𝑛 + 1); //to make sure the range of the value is between 1 to n

𝑖𝑓 (𝑖 ≠ 𝑗) && 𝐴[𝑖] == 𝐴[𝑗] 𝑡ℎ𝑒𝑛 𝑟𝑒𝑡𝑢𝑟𝑛 𝑖


Identification of the Repeated Element: Las
Vegas algorithm
• It randomly picks two array elements and checks whether they come from two
different cells and have the same value. If they do, the repeated element has
been found. If not, this basic step of sampling is repeated as many times as it
takes to identify the repeated element.

• In this algorithm, the sampling performed is with repetitions; that is, the first
and second elements are randomly picked from out of the n elements (each
element being equally likely to be picked).
1
• Thus, there is a probability (equal to ) that the same array element is picked
𝑛
each time. If we just check for the equality of the two elements picked, our
answer might be incorrect (in case the algorithm picked the same array index
each time).
• Therefore, it is essential to make sure that the two array indices picked are
different and the two array cells contain the same value.
Identification of the Repeated Element: Las
Vegas algorithm
• Definition: The run time of a Las Vegas algorithm is Õ(𝑓(𝑛)) if− the run time is no more
than 𝑐𝛼𝑓(𝑛) for all 𝑛 ≥ 𝑛0 , with a probability 𝑜𝑓 ≥ (1 − 𝑛 𝛼 ), where 𝑐 and 𝑛0 are
some constants.

• Claim: The run time of the above algorithm is Õ(log 𝑛).

• Proof: Call the sequence of three statements within the repeat loop as a basic step.
• Any iteration of the while loop will be successful in identifying the repeated number if i is any
𝑛 𝑛
one the array indices corresponding to the repeated element and j is any one of the same
2 2
indices other than i.
𝒏
• When we have 𝒏 numbers where are repeated, what is the probability of choosing
𝟐
𝒊, 𝒋 from that repeated part?
𝒏 𝒏
−𝟏
• The probability of choosing 𝒊 = 𝟐
, the probability of choosing 𝒋 = 𝟐
𝒏 𝒏

• In other words, the probability that the algorithm quits in any given iteration of the
𝑛 𝑛 𝑛 𝑛
−1 −1
while loop is 2 2
= 2 2
𝑛 𝑛 𝑛2
Identification of the Repeated Element: Las
Vegas algorithm
• Proof: Call the sequence of three statements within the repeat loop as a basic step.
𝑛 𝑛−1

• Probability that we get success in one run of the basic step is 2 2


𝑛2
1
• This probability is at least for all n ≥10.
5
4
• Thus, the probability of failure in one basic step is ≤ .
5
4 𝑘
• This means that the probability that the first k basic steps are all unsuccessful is ≤
5
4 10
• Therefore, the probability that the algorithm does not quit in 10 iterations < < 0.1074 . So,
5
Algorithm. will terminate in 10 iterations or less with probability ≥ 1 - 0.1074= 0.8926.
4 100
➔ The probability that the algorithm does not terminate in 100 iterations is < < 2.04 ×
5
10−10 . That is, almost certainly the algorithm will quit in 100 iterations or less.

• If n = 2 * 106, for example, any deterministic algorithm will have to spend at least one million-time
steps, as opposed to the 100 iterations of Algorithm.
• We want this probability to be very small, specifically no more than n−α.
4 𝑘 4 𝛼𝑙𝑜𝑔𝑛
➔ = 𝑛−𝛼 ➔ 𝑘 𝑙𝑜𝑔 ≤ − 𝛼𝑙𝑜𝑔𝑛 ➔ 𝑘 ≥ 5
5 5 log(4)
𝛼𝑙𝑜𝑔𝑛
•We have shown that the probability that the above algorithm takes more than 5 basic steps is ≤ n−α.
log(4)
•Note that each basic step takes a constant time. Therefore, the run time of the algorithm is ≤ cαlogn with a
probability of ≥(1- n−α), for some constant c. Thus, it follows that the run time of the algorithm is Õ(logn).
Searching repeated elements- analysis
Finding an Element as Large as the Median:
Monte Carlo algorithm
• Consider the following problem.
• Input: X = k1 ,k2,...,kn;
• Output: An element of X that is at least as large as the median of X.
• For this problem also, several deterministic algorithms can be developed:
1. Sort X and pick an element at the middle or to its right. This will take Θ(nlogn)
time;
2. Find and output the maximum of X. This takes Θ(n) time;
3. Find and output the maximum of k1, k2,...,kn/2. This also takes Θ(n) time.

• One could argue that any deterministic algorithm will need Ω(n) time to solve
this problem as follows:
• If an adversary chooses the input and if this adversary has perfect knowledge
𝑛
about the algorithm, (s)he can force the algorithm to look at ≥ elements of X.
2
𝑛
• If the algorithm is ready to give an answer looking at less than elements, the
2
adversary can choose the other elements of X in such a manner to make the output
of the algorithm incorrect.
Finding an Element as Large as the Median:
Monte Carlo algorithm
• We can devise a Monte Carlo algorithm that takes only Õ(logn) time and
whose output is correct with a high probability as follows.
• Pick a random sample S of αlogn elements from X;

• Find and output the largest element of S;

• The above algorithm will give an incorrect answer only if all the elements in S
are less than the median.
• Probability that a randomly picked element of X is less than the median is
1
≤2
1 𝑘
• The probability that all k elements are incorrect is 2

• If we equate this to 𝑛−𝛼 we get 𝑘 = 𝛼𝑙𝑜𝑔𝑛


• Then, probability that all the elements in S are less than the median of 𝑋 is
1 𝛼𝑙𝑜𝑔𝑛
≤ = 𝑛−𝛼
2
Finding an Element as Large as the Median:
Monte Carlo algorithm
• In other words, the output of this algorithm is correct with a high
probability. Clearly, the run time of the algorithm is O(logn).
• We defined a high probability to be a probability that is ≥ (1−n−α).
However, all the analyses can be done even if we have a different notion
of a high probability.

• Let p be the high probability of interest. For the algorithm that we


discussed for the problem of finding an element ≥ the median, we picked
k elements randomly, found and output the maximum of these k elements.
1
Probability that a randomly picked element is < the median is ≤ Thus,
2
1 𝑘
the probability of an incorrect answer from the algorithm is ≤ We
2
1
want this probability to be ≤ (1−p). This will happen when 𝑘 ≥ log
1−𝑝
References

•Disclaimer: Original course by Dr. Rajasekaran -3500 Algorithms and


Complexity and other references.

•https://youtu.be/bsZXgXdSomc?si=MMhhEyGWdxA4NOmM

You might also like