
UC Berkeley, CS 174: Combinatorics and Discrete Probability (Fall 2010)

Solutions to Problem Set 2

1. (MU 2.4; Jensen's Inequality) Prove that E[X^k] ≥ (E[X])^k for any even integer k ≥ 1.

By Jensen's inequality, E[f(X)] ≥ f(E[X]) for any convex function f. If f is twice differentiable and its second derivative is non-negative, then f is convex. For f(x) = x^k with k even, the second derivative is f''(x) = k(k−1)x^{k−2}, which is non-negative for all x because k − 2 is even. Hence E[X^k] = E[f(X)] ≥ f(E[X]) = (E[X])^k.
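This is not part of the original solution, but the inequality is easy to sanity-check numerically; the distribution used for X below (uniform on [−1, 2]) and the sample size are arbitrary illustrative choices.

```python
import random

def jensen_check(k=4, n=200_000, seed=0):
    """Estimate E[X^k] and (E[X])^k for an even k by Monte Carlo."""
    rng = random.Random(seed)
    # X uniform on [-1, 2]: an arbitrary distribution that takes negative
    # values, to exercise the "k even" requirement.
    xs = [rng.uniform(-1.0, 2.0) for _ in range(n)]
    mean = sum(xs) / n
    kth_moment = sum(x ** k for x in xs) / n
    return kth_moment, mean ** k

if __name__ == "__main__":
    lhs, rhs = jensen_check()
    print(f"E[X^k] ~ {lhs:.4f}  >=  (E[X])^k ~ {rhs:.4f}")
```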

2. (MU 2.7) Let X and Y be independent geometric random variables, where X has parameter p
and Y has parameter q.

(a) What is the probability that X = Y ?


    P[X = Y] = Σ_{x≥1} (1−p)^{x−1} p · (1−q)^{x−1} q
             = Σ_{x≥1} [(1−p)(1−q)]^{x−1} pq

Recall that from page 31, for geometric random variables, we have the identity

    P[X ≥ i] = Σ_{n≥i} (1−p)^{n−1} p = (1−p)^{i−1}.        (1)

So, since (1−p)(1−q) = 1 − (p + q − pq), summing the geometric series gives

    P[X = Y] = pq / (p + q − pq).
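As a quick check of this formula (our addition, not part of the original solution), one can simulate the two geometric variables directly; the helper geometric_rv and the parameter values p = 0.3, q = 0.5 are arbitrary choices for illustration.

```python
import random

def geometric_rv(rng, p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials

def estimate_p_equal(p=0.3, q=0.5, n=200_000, seed=1):
    rng = random.Random(seed)
    hits = sum(geometric_rv(rng, p) == geometric_rv(rng, q) for _ in range(n))
    return hits / n

if __name__ == "__main__":
    p, q = 0.3, 0.5
    print("simulated :", estimate_p_equal(p, q))
    print("formula   :", p * q / (p + q - p * q))
```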
(b) What is E[max(X, Y)]?

We know from problem MU 2.9 that E[max(X, Y)] = E[X] + E[Y] − E[min(X, Y)]. From part (c) below, min(X, Y) is a geometric random variable with parameter p + q − pq. Therefore E[min(X, Y)] = 1/(p + q − pq), and we get

    E[max(X, Y)] = 1/p + 1/q − 1/(p + q − pq).
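A rough cross-check of the closed form (again not in the original solution): truncate the double sum over the joint pmf at an arbitrary cutoff large enough that the neglected tail is negligible for the example parameters.

```python
def exact_e_max(p=0.3, q=0.5, cutoff=500):
    """Approximate E[max(X, Y)] from the joint pmf by truncating the sums."""
    def pmf(r, k):
        # P[geometric(r) = k]
        return (1 - r) ** (k - 1) * r

    return sum(max(x, y) * pmf(p, x) * pmf(q, y)
               for x in range(1, cutoff + 1)
               for y in range(1, cutoff + 1))

if __name__ == "__main__":
    p, q = 0.3, 0.5
    print("truncated sum:", exact_e_max(p, q))
    print("closed form  :", 1/p + 1/q - 1/(p + q - p*q))
```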

(c) What is P[min(X, Y) = k]?

We split this event into two disjoint events:

    P[min(X, Y) = k] = P[X = k, Y ≥ k] + P[X > k, Y = k]
                     = P[X = k] P[Y ≥ k] + P[X > k] P[Y = k].

Recall the identity in Eqn (1). It gives P[X > k] = P[X ≥ k] − P[X = k] = (1−p)^{k−1} − (1−p)^{k−1} p = (1−p)^{k−1}(1−p). Finally, we get

    P[min(X, Y) = k] = (1−p)^{k−1} p (1−q)^{k−1} + (1−p)^{k−1}(1−p) (1−q)^{k−1} q
                     = [(1−p)(1−q)]^{k−1} (p + (1−p)q)
                     = [(1−p)(1−q)]^{k−1} (p + q − pq),

so min(X, Y) is itself geometric with parameter p + q − pq.
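To make the geometric identification concrete (our addition), the sketch below compares P[min(X, Y) = k], computed here via the tail identity P[min(X, Y) ≥ k] = P[X ≥ k] P[Y ≥ k] rather than the case split above, against the geometric pmf with parameter p + q − pq.

```python
def min_pmf(p, q, k):
    """P[min(X, Y) = k] via P[min >= k] - P[min >= k+1]."""
    tail = lambda r, j: (1 - r) ** (j - 1)          # P[geometric(r) >= j]
    return tail(p, k) * tail(q, k) - tail(p, k + 1) * tail(q, k + 1)

def geom_pmf(r, k):
    return (1 - r) ** (k - 1) * r

if __name__ == "__main__":
    p, q = 0.3, 0.5
    r = p + q - p * q
    for k in range(1, 6):
        print(k, round(min_pmf(p, q, k), 6), round(geom_pmf(r, k), 6))
```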

(d) What is E[X | X ≤ Y]?

    E[X | X ≤ Y] = Σ_{x≥1} x P[X = x | X ≤ Y]
                 = Σ_{x≥1} x · P[X = x ∩ X ≤ Y] / P[X ≤ Y]

First, let's consider the denominator:

    P[X ≤ Y] = Σ_{z≥1} P[X = z ∩ z ≤ Y]
             = Σ_{z≥1} P[X = z] P[Y ≥ z]
             = Σ_{z≥1} (1−p)^{z−1} p (1−q)^{z−1}
             = p Σ_{z≥1} [(1−p)(1−q)]^{z−1}
             = p Σ_{z≥1} (1 − p − q + pq)^{z−1}
             = p / (p + q − pq).

The last step above is again by the identity in Eqn (1). Now we can compute the whole expression:

    E[X | X ≤ Y] = (p + q − pq)/p · Σ_{x≥1} x P[X = x] P[Y ≥ x]
                 = (p + q − pq)/p · Σ_{x≥1} x (1−p)^{x−1} p (1−q)^{x−1}
                 = (p + q − pq) Σ_{x≥1} x (1 − p − q + pq)^{x−1}.

The last expression is exactly the expectation of a geometric random variable with parameter p + q − pq. Therefore

    E[X | X ≤ Y] = 1 / (p + q − pq).
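A simulation-based check of this conditional expectation (our addition; the parameter values are arbitrary): average X over the sampled pairs with X ≤ Y.

```python
import random

def geometric_rv(rng, p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials

def estimate_conditional_mean(p=0.3, q=0.5, n=300_000, seed=3):
    """Estimate E[X | X <= Y] by averaging X over samples with X <= Y."""
    rng = random.Random(seed)
    total, count = 0, 0
    for _ in range(n):
        x, y = geometric_rv(rng, p), geometric_rv(rng, q)
        if x <= y:
            total += x
            count += 1
    return total / count

if __name__ == "__main__":
    p, q = 0.3, 0.5
    print("simulated:", estimate_conditional_mean(p, q))
    print("formula 1/(p+q-pq):", 1 / (p + q - p * q))
```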

3. (MU 2.9; Linearity of expectation)

(a) Suppose that we roll a fair k-sided die twice, with the numbers 1 through k on the die's faces, obtaining values X1 and X2. What is E[max(X1, X2)]? What is E[min(X1, X2)]?

    E[max(X1, X2)] = Σ_{x1} Σ_{x2} max(x1, x2) · (1/k)(1/k)
                   = (1/k²) Σ_{x1} ( Σ_{x2 ≤ x1} x1 + Σ_{x2 > x1} x2 )

    E[min(X1, X2)] = Σ_{x1} Σ_{x2} min(x1, x2) · (1/k)(1/k)
                   = (1/k²) Σ_{x1} ( Σ_{x2 ≤ x1} x2 + Σ_{x2 > x1} x1 )
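Since k is finite, both expectations can be evaluated by direct enumeration; the sketch below (not part of the original solution, with k = 6 as an arbitrary example) also confirms the identity derived in part (b).

```python
def die_extremes(k=6):
    """E[max(X1, X2)] and E[min(X1, X2)] for two rolls of a fair k-sided die."""
    pairs = [(x1, x2) for x1 in range(1, k + 1) for x2 in range(1, k + 1)]
    e_max = sum(max(x1, x2) for x1, x2 in pairs) / k ** 2
    e_min = sum(min(x1, x2) for x1, x2 in pairs) / k ** 2
    return e_max, e_min

if __name__ == "__main__":
    k = 6
    e_max, e_min = die_extremes(k)
    print("E[max] =", e_max, " E[min] =", e_min)
    # Part (b): E[max] + E[min] should equal E[X1] + E[X2] = k + 1.
    print("sum =", e_max + e_min, " E[X1] + E[X2] =", k + 1)
```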

(b) Show from your calculation in part (a) that

E[max(X1 , X2 )] + E[min(X1 , X2 )] = E[X1 ] + E[X2 ]. (2)

    E[max(X1, X2)] + E[min(X1, X2)]
        = (1/k²) Σ_{x1} ( Σ_{x2 ≤ x1} x1 + Σ_{x2 > x1} x2 ) + (1/k²) Σ_{x1} ( Σ_{x2 ≤ x1} x2 + Σ_{x2 > x1} x1 )
        = (1/k²) Σ_{x1} ( Σ_{x2} x2 + Σ_{x2} x1 )
        = (1/k²) Σ_{x1} Σ_{x2} (x2 + x1)
        = E[X1] + E[X2]

(c) Explain why Eqn. (2) must be true by using the linearity of expectations instead of a direct
computation.
By using linearity of expectation twice, we get

    E[max(X1, X2)] + E[min(X1, X2)] = E[max(X1, X2) + min(X1, X2)]
                                    = E[X1 + X2]
                                    = E[X1] + E[X2].

4. (MU 2.15; Coupon Collector) For a coin that comes up heads independently with probability p on each flip, what is the expected number of flips until the kth head?

Let X be the number of flips until the kth head. Let Xi be the number of flips after the (i−1)st head up to and including the ith head.

    E[X] = E[ Σ_{i=1}^{k} Xi ]
         = Σ_{i=1}^{k} E[Xi]        (by linearity of expectation)
         = Σ_{i=1}^{k} 1/p          (since Xi ~ geom(p))
         = k/p
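A quick simulation check of the k/p answer (our addition; p and k below are arbitrary example values):

```python
import random

def flips_until_kth_head(rng, p, k):
    """Flip a biased coin until the kth head appears; return the flip count."""
    flips, heads = 0, 0
    while heads < k:
        flips += 1
        if rng.random() < p:
            heads += 1
    return flips

if __name__ == "__main__":
    p, k, n = 0.3, 5, 100_000
    rng = random.Random(4)
    avg = sum(flips_until_kth_head(rng, p, k) for _ in range(n)) / n
    print("simulated:", avg, " k/p =", k / p)
```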

5. (MU 2.18; Induction) The following approach is often called reservoir sampling. Suppose we
have a sequence of items passing by one at a time. We want to maintain a sample of one item with
the property that it is uniformly distributed over all the items that we have seen at each step.
Moreover, we want to accomplish this without knowing the total number of items in advance or
storing all of the items that we see.
Consider the following algorithm, which stores just one item in memory at all times. When the
first item appears, it is stored in the memory. When the kth item appears, it replaces the item
in memory with probability 1/k. Explain why this algorithm solves the problem.
Let b1, b2, ..., bn be the values of the items, with bt the item observed at time t. We prove the claim by induction. Let Mt be a random variable that takes the value of the item in memory at time t. We need to show that at time t, P[Mt = bi] = 1/t for all 1 ≤ i ≤ t.
The base case is t = 1, which holds trivially since M1 = b1 with probability 1. Assume that at time t, P[Mt = bi] = 1/t for all 1 ≤ i ≤ t. Now we prove that this property holds at time t + 1.
At time t + 1, we set Mt+1 = bt+1 with probability 1/(t + 1), so P[Mt+1 = bt+1] = 1/(t + 1). For 1 ≤ i ≤ t, since the swap decision at time t + 1 is independent of the contents of memory,

    P[Mt+1 = bi] = P[no swap at time t + 1 and Mt = bi]
                 = P[no swap at time t + 1] · P[Mt = bi]
                 = (t/(t + 1)) · (1/t)
                 = 1/(t + 1).
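The algorithm itself is short enough to state as code. The following is a minimal sketch, not code from the text; the function name and the stream interface are our own choices.

```python
import random

def reservoir_sample_one(stream, rng=None):
    """Return one item chosen uniformly at random from an iterable of unknown length.

    Keeps a single item in memory: the kth item replaces it with probability 1/k.
    """
    rng = rng or random.Random()
    sample = None
    for k, item in enumerate(stream, start=1):
        if rng.random() < 1.0 / k:   # with probability 1/k, swap the new item in
            sample = item
    return sample

if __name__ == "__main__":
    # Empirically, each of the 5 items should be returned about 20% of the time.
    rng = random.Random(5)
    counts = {i: 0 for i in range(5)}
    for _ in range(50_000):
        counts[reservoir_sample_one(range(5), rng)] += 1
    print(counts)
```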

6. (MU 2.24) We roll a standard die over and over. What is the expected number of rolls until
the first pair of consecutive sixes appears?
For a given sequence of die rolls ending in 66, we can write it as ∗⋯∗6 ∗⋯∗6 ⋯ ∗⋯∗66, where ∗ denotes any roll that is not a 6. Let Xi be the number of rolls in the ith such block up to and including its 6; each Xi is geometric with parameter 1/6. Let N be the number of 6s observed up to and including the first 6 that is immediately followed by another 6. Since each 6 is followed by another 6 with probability 1/6, independently of everything else, N is also geometric with parameter 1/6. Now we can use conditional expectation to write

    E[X] = E[ E[X | N] ]
         = E[ E[ Σ_{i=1}^{N} (Xi + 1) | N ] ]
         = E[ Σ_{i=1}^{N} (E[Xi] + 1) ]
         = E[ Σ_{i=1}^{N} 7 ]
         = 7 E[N]
         = 7 · 6
         = 42
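A simulation check that the answer 42 is right (our addition):

```python
import random

def rolls_until_double_six(rng):
    """Roll a fair die until two consecutive sixes appear; return the roll count."""
    rolls, prev = 0, None
    while True:
        rolls += 1
        roll = rng.randint(1, 6)
        if roll == 6 and prev == 6:
            return rolls
        prev = roll

if __name__ == "__main__":
    rng = random.Random(6)
    n = 100_000
    avg = sum(rolls_until_double_six(rng) for _ in range(n)) / n
    print("simulated:", avg, " expected: 42")
```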

7. (MU 2.27) Consider the following distribution on the integers x ≥ 1: P[X = x] = 6/(π² x²). This is a valid distribution, since Σ_{k=1}^{∞} 1/k² = π²/6. What is its expectation?

We begin by simply writing out the definition of expectation:

    E[X] = Σ_{x=1}^{∞} x P[X = x]
         = Σ_{x=1}^{∞} (6/π²) · (1/x).

This sum diverges, since Σ_x 1/x diverges, so the expectation is not finite.
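Numerically, the truncated expectation keeps growing, roughly like (6/π²) ln N, which illustrates the divergence (a quick sketch, not part of the original solution):

```python
import math

def truncated_expectation(n):
    """Sum of x * P[X = x] = 6/(pi^2 * x) for x = 1..n."""
    return sum(6.0 / (math.pi ** 2 * x) for x in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 10**3, 10**5, 10**6):
        print(f"N = {n:>9,}: partial expectation = {truncated_expectation(n):.3f}")
```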

8. (MU 2.32) You need a new staff assistant, and you have n people to interview. You want to hire
the best candidate for the position. When you interview a candidate, you can give them a score,
with the highest score being the best and no ties being possible. You interview the candidates
one by one. Because of your company's hiring practices, after you interview the kth candidate,
you must either offer the candidate the job before the next interview or forever lose the chance
to hire that candidate. We suppose the candidates are interviewed in a random order, chosen
uniformly at random from all n! possible orderings.
We consider the following strategy. First, interview m candidates but reject them all; these
candidates give you an idea of how strong the field is. After the mth candidate, hire the first
candidate you interview who is better than all of the previous candidates you interviewed.

(a) Let E be the event that we hire the best assistant, and let Ei be the event that the ith candidate is the best and we hire them. Determine P[Ei], and show that

    P[E] = (m/n) Σ_{j=m+1}^{n} 1/(j − 1).

Notice that the Ei are disjoint events, so P[E] = Σ_{i=1}^{n} P[Ei]. For i ≤ m, P[Ei] = 0, since none of the first m candidates is hired. For i > m, two independent events make up Ei:

    P[Ei] = P[ith candidate is the best] · P[the ith candidate is chosen]
          = (1/n) · P[best of the first i − 1 candidates is among the first m candidates]
          = (1/n) · m/(i − 1).

Now, putting this all together, we get

    P[E] = Σ_{i=m+1}^{n} P[Ei] = (m/n) Σ_{i=m+1}^{n} 1/(i − 1).
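The sum can be checked against a direct simulation of the interviewing strategy (our addition; n = 30 and m = 11 ≈ n/e are arbitrary example values, and the function names are our own).

```python
import random

def exact_success_probability(n, m):
    """P[E] = (m/n) * sum_{j=m+1}^{n} 1/(j-1)."""
    return m / n * sum(1.0 / (j - 1) for j in range(m + 1, n + 1))

def simulate_strategy(n, m, trials=100_000, seed=7):
    """Fraction of random orderings in which the 'skip m, then hire the first
    record-setter' strategy hires the best candidate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))       # n - 1 is the best score
        rng.shuffle(scores)
        best_seen = max(scores[:m])
        hired = None
        for s in scores[m:]:
            if s > best_seen:
                hired = s
                break
        if hired == n - 1:
            wins += 1
    return wins / trials

if __name__ == "__main__":
    n, m = 30, 11
    print("exact    :", exact_success_probability(n, m))
    print("simulated:", simulate_strategy(n, m))
```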

(b) Bound Σ_{j=m+1}^{n} 1/(j − 1) to obtain

    (m/n)(ln n − ln m) ≤ P[E] ≤ (m/n)(ln(n − 1) − ln(m − 1)).

Using Lemma 2.10 from the book, we get

    P[E] ≥ (m/n) ∫_{m+1}^{n+1} 1/(x − 1) dx = (m/n) [ln(x − 1)]_{m+1}^{n+1} = (m/n)(ln n − ln m)

and

    P[E] ≤ (m/n) ∫_{m}^{n} 1/(x − 1) dx = (m/n) [ln(x − 1)]_{m}^{n} = (m/n)(ln(n − 1) − ln(m − 1)).
(c) Show that m(ln n − ln m)/n is maximized when m = n/e, and explain why this means P[E] ≥ 1/e for this choice of m.

How should we find the best m? Since the lower bound from part (b) is concave in m, we can take the derivative, set it equal to zero, and solve for m; this yields the m that maximizes the bound. We have

    d/dm [ (m/n)(ln n − ln m) ] = (ln n − ln m)/n − 1/n = 0.

Then ln m = ln n − 1, which gives

    m = e^{ln n − 1} = e^{ln n} e^{−1} = n e^{−1} = n/e.

Substituting this m back into the lower bound from part (b), we get

    P[E] ≥ (1/e)(ln n − ln(n/e)) = 1/e.
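One can also scan every choice of m for a fixed n, confirming that the maximizing m is close to n/e and that the resulting success probability exceeds 1/e (our addition; n = 50 is arbitrary).

```python
import math

def success_probability(n, m):
    """P[E] = (m/n) * sum_{j=m+1}^{n} 1/(j-1), from part (a)."""
    return m / n * sum(1.0 / (j - 1) for j in range(m + 1, n + 1))

if __name__ == "__main__":
    n = 50
    best_m = max(range(1, n), key=lambda m: success_probability(n, m))
    print("best m        :", best_m, " n/e ~", round(n / math.e, 2))
    print("P[E] at best m:", round(success_probability(n, best_m), 4),
          " 1/e ~", round(1 / math.e, 4))
```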
