Probability and Randomized Algorithms
Prof. Zeph Grunschlag
Discrete Random Variable
DEF: A discrete random variable is a set X together with an assignment of a nonnegative probability Pr[X=x] that X takes value x; furthermore, the sum over all possible x ∈ X of the probability that X takes value x must equal 1.
If X is clearly fixed from context, we may abbreviate Pr[X=x] to Pr[x] or p_x.
Joint and Conditional Probability
Let X, Y be random variables over the resp. sets X, Y. (Note: X and Y may or may not be the same set.)
DEF: The joint probability Pr[x,y] is the probability that (X,Y) = (x,y). (Probability of both occurring simultaneously.)
DEF: The conditional probability is defined by Pr[x|y] = Pr[x,y] / Pr[y], assuming that Pr[y] > 0.
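As a quick illustration (mine, not from the slides), here is a minimal Python sketch of these definitions over a small made-up joint distribution:

# Hypothetical joint distribution over (X, Y) pairs; values are made up.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def pr_y(y):
    # Marginal Pr[y]: sum Pr[x, y] over all x.
    return sum(p for (_, y2), p in joint.items() if y2 == y)

def pr_x_given_y(x, y):
    # Conditional Pr[x|y] = Pr[x, y] / Pr[y], assuming Pr[y] > 0.
    return joint[(x, y)] / pr_y(y)

print(pr_x_given_y(0, 1))  # 0.3 / 0.7 ≈ 0.43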
Independent Variables
Random variables are independent if their probabilities don't depend on each other's values.
DEF: X and Y are independent if Pr[x,y] = Pr[x]Pr[y] for all x, y.
LEMMA: Equivalently, X and Y are independent if (excluding 0-probability y)
∀x ∈ X, ∀y ∈ Y: Pr[x|y] = Pr[x]
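A small sketch (my addition) that tests the DEF directly on a finite joint distribution; the product distribution and the 1e-9 tolerance are arbitrary choices:

# Product distribution Pr[x,y] = Pr[x]Pr[y]; should test as independent.
joint = {(0, 0): 0.35, (0, 1): 0.35, (1, 0): 0.15, (1, 1): 0.15}

def independent(joint):
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    pr_x = {x: sum(p for (x2, _), p in joint.items() if x2 == x) for x in xs}
    pr_y = {y: sum(p for (_, y2), p in joint.items() if y2 == y) for y in ys}
    # Check Pr[x,y] = Pr[x]Pr[y] for all pairs (up to rounding).
    return all(abs(joint.get((x, y), 0.0) - pr_x[x] * pr_y[y]) < 1e-9
               for x in xs for y in ys)

print(independent(joint))  # True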
Bayes' Theorem
THM: If Pr[y] > 0 then Pr[x|y] = Pr[y|x] Pr[x] / Pr[y]
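A one-line numeric check of the theorem (all three probabilities are hypothetical):

pr_x = 0.01          # prior Pr[x], made up
pr_y_given_x = 0.99  # Pr[y|x], made up
pr_y = 0.05          # Pr[y], made up
print(pr_y_given_x * pr_x / pr_y)  # Pr[x|y] = 0.198 by Bayes' theorem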
Binomial Rand. Var.
DEF: The product of random variables X, Y is the random variable XY defined on X × Y with distribution Pr[(x,y)] = Pr[x]Pr[y].
Assume X is a random variable on {0,1} and let p = Pr[X=1], q = Pr[X=0].
Repeat the experiment n times, i.e., take n independent copies: X_1, X_2, ..., X_n.
The result is called a Binomial random variable.
Bernoulli's Thm:
Pr[X_1 + X_2 + ... + X_n = k] = C(n,k) p^k q^(n-k)
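The pmf above translates directly to Python; a minimal sketch using only the standard library:

from math import comb

def binom_pmf(n, k, p):
    # Bernoulli's Thm: Pr[X_1 + ... + X_n = k] = C(n,k) p^k q^(n-k)
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

print(binom_pmf(10, 3, 0.5))  # 120 / 1024 ≈ 0.117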
Expectation
The average value taken on by a function f on a probability distribution X.
DEF: The expectation of f is defined by:
E(f) = Σ_{x ∈ X} f(x) p_x
THM:
E(f + g) = E(f) + E(g)
COR: For n repetitions of a Binomial random variable X, consider the sum S which counts the number of outcomes = 1. Then E(S) = np.
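A short sketch (my addition) of E(f) = Σ f(x) p_x and the corollary E(S) = np; the p = 0.3 value is arbitrary:

def expectation(f, dist):
    # E(f) = sum over x of f(x) * p_x
    return sum(f(x) * px for x, px in dist.items())

bernoulli = {1: 0.3, 0: 0.7}              # p = 0.3, arbitrary
e_x = expectation(lambda x: x, bernoulli)
print(e_x)        # 0.3
print(100 * e_x)  # by linearity, E(S) = np = 30.0 for n = 100 copies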
Chernoff Bound
Estimates the probability that the sum of a Binomial experiment deviates from the expected sum np.
THM:
Pr[S ≥ (1 + ε)pn] ≤ e^(−ε² pn / 3)
Note: the probability that the sum is too big falls off exponentially with n.
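A simulation sketch (my addition; n, p, ε, and the trial count are arbitrary) comparing the empirical upper tail with the bound:

import random
from math import exp

n, p, eps, trials = 200, 0.5, 0.2, 10_000
threshold = (1 + eps) * p * n   # here 120
hits = sum(sum(random.random() < p for _ in range(n)) >= threshold
           for _ in range(trials))
print(hits / trials)             # empirical Pr[S >= (1+eps)pn], tiny
print(exp(-eps**2 * p * n / 3))  # Chernoff bound ≈ 0.264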
Randomized Algorithms
Equivalent formulations:
- Turing machine with coin flips at every step of computation
- Non-deterministic Turing machine with a probability distribution over computation branches
Nomenclature (varies from author to author):
- Monte Carlo: colloquially, any randomized algorithm; in complexity theory, NO answers are always right
- Las Vegas: always correct, but may fail
- BPP: answers correct most of the time
Monte Carlo Algorithm
False negatives allowed, but no false positives
DEF: A poly-time Monte Carlo algorithm for the decision problem P is a poly-time nondeterministic Turing machine (NDTM) s.t.
Pr[x is accepted] ≥ 1/2 if x ∈ P
Pr[x is accepted] = 0 if x ∉ P
Probability is measured over the coin-flips in the TM, or equivalently, by taking the ratio of accepting branches in the NDTM to the total number.
Defines the complexity class RP (Rand-Poly).
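Because the error is one-sided, repetition amplifies the success probability. A toy sketch (monte_carlo_step is a hypothetical stand-in, with x a boolean encoding membership in P):

import random

def monte_carlo_step(x):
    # Hypothetical one-sided test: accepts with probability 1/2 when
    # x is in P, and never accepts when x is not in P.
    return x and random.random() < 0.5

def amplified(x, k=20):
    # Any accepting run proves x is in P (no false positives), so k
    # repetitions push the false-negative probability below (1/2)^k.
    return any(monte_carlo_step(x) for _ in range(k))

print(amplified(True), amplified(False))  # almost surely: True False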
Las Vegas Algorithm
Symmetric version of Monte Carlo: no false negatives nor false positives, but the algorithm can fail.
DEF: A poly-time Las Vegas algorithm is a poly-time NDTM with a constant ε > 0 for which Pr[fail] ≤ 1 − ε for all inputs.
Repeat the algorithm to make the failure probability arbitrarily small.
Gives the class ZPP (Zero-Prob-of-error-Poly): ZPP = RP ∩ co-RP.
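A toy sketch of the repeat-until-success idea (las_vegas_step and its 0.25 success rate are hypothetical):

import random

def las_vegas_step(x):
    # Hypothetical Las Vegas trial: with probability eps = 0.25 it
    # returns the (always correct) answer, otherwise it fails (None).
    return x if random.random() < 0.25 else None

def las_vegas(x):
    # Repeating drives Pr[all trials fail] to (1 - eps)^k -> 0;
    # the expected number of trials is 1/eps = 4.
    while True:
        result = las_vegas_step(x)
        if result is not None:
            return result

print(las_vegas("answer"))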
Class BPP
BPP = Bounded-Prob-of-error-Poly. The most general class: allows both false negatives and false positives. Compensate by insisting the answer be correct significantly more than half the time.
DEF: A poly-time randomized algorithm for the decision problem P is a poly-time NDTM with a constant ε > 0 for which
Pr[x is accepted] ≥ (1 + ε)/2 if x ∈ P
Pr[x is accepted] ≤ (1 − ε)/2 if x ∉ P
The Chernoff bound implies we may assume ε = 0.25.
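A sketch (my addition) of the majority-vote amplification that the Chernoff bound justifies; bpp_step is a hypothetical two-sided test with ε = 0.25:

import random

def bpp_step(x):
    # Hypothetical two-sided test: returns the correct answer with
    # probability (1 + eps)/2 = 0.625 for eps = 0.25.
    return x if random.random() < 0.625 else (not x)

def majority_vote(x, k=101):
    # Majority of k independent runs; by the Chernoff bound the error
    # probability falls off exponentially in k.
    yes = sum(bpp_step(x) for _ in range(k))
    return yes > k / 2

print(majority_vote(True), majority_vote(False))  # almost surely: True False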
Pseudo Random Sequence
DEF: A pseudo-random sequence is the output of a deterministic algorithm from finite bitstrings to infinite bitstrings whose outputs cannot be distinguished from truly random strings by any BPP algorithm.
ε-bias Detector
Given: A black box f which is known a priori to have some built-in bias ε in an unknown direction.
Decide: Which direction the bias is in.
Algorithm:
x = output of length n from f
c = number of 1s in x
return (c > n/2) // YES if 1-bias, NO if 0-bias
By the Chernoff bound, Pr[output is correct] > 3/4 for large enough n; therefore this problem is in BPP, so ε-bias sequences are not pseudorandom.
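A runnable sketch of the detector (the simulated source and the n ≈ 10/ε² sample length are my assumptions, not from the slides):

import random

def biased_source(eps, toward_one, n):
    # Stand-in for the black box f: emits 1 with probability
    # 1/2 + eps or 1/2 - eps, depending on the bias direction.
    p = 0.5 + eps if toward_one else 0.5 - eps
    return [int(random.random() < p) for _ in range(n)]

def detect_bias(x):
    # The slide's algorithm: c = number of 1s; return (c > n/2).
    return sum(x) > len(x) / 2

eps = 0.05
n = int(10 / eps**2)  # sample length; the 10/eps^2 scaling is assumed
print(detect_bias(biased_source(eps, True, n)))   # usually True (1-bias)
print(detect_bias(biased_source(eps, False, n)))  # usually False (0-bias)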