Class 26: Review for final exam, solutions, Spring 2014
Problem 1. (Counting)
(a) There are several ways to think about this. Here is one.
The 11 letters are p, r, o, b, b, a, i, i, l, t, y. We use the following steps to create a sequence
of these letters.
Step 1: Choose a position for the letter p: 11 ways to do this.
Step 2: Choose a position for the letter r: 10 ways to do this.
Step 3: Choose a position for the letter o: 9 ways to do this.
Step 4: Choose two positions for the two b’s: 8 choose 2 ways to do this.
Step 5: Choose a position for the letter a: 6 ways to do this.
Step 6: Choose two positions for the two i’s: 5 choose 2 ways to do this.
Step 7: Choose a position for the letter l: 3 ways to do this.
Step 8: Choose a position for the letter t: 2 ways to do this.
Step 9: Choose a position for the letter y: 1 way to do this.
Multiplying these all together we get:

11 · 10 · 9 · (8 choose 2) · 6 · (5 choose 2) · 3 · 2 · 1 = 11!/(2! · 2!)
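As a quick numerical check (not part of the original solution), both sides of this identity can be evaluated in R:

    11 * 10 * 9 * choose(8, 2) * 6 * choose(5, 2) * 3 * 2 * 1   # step-by-step product: 9979200
    factorial(11) / (factorial(2) * factorial(2))               # 11!/(2!*2!): also 9979200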
Problem 2. (Probability)
We are given P (E ∪ F ) = 3/4.
Eᶜ ∩ Fᶜ = (E ∪ F)ᶜ ⇒ P(Eᶜ ∩ Fᶜ) = 1 − P(E ∪ F) = 1/4.
Problem 3. (Counting)
Let Hi be the event that the ith hand has one king. We have the conditional probabilities
P(H1) = (4 choose 1)(48 choose 12) / (52 choose 13);  P(H2 | H1) = (3 choose 1)(36 choose 12) / (39 choose 13);
P(H3 | H1 ∩ H2) = (2 choose 1)(24 choose 12) / (26 choose 13);  P(H4 | H1 ∩ H2 ∩ H3) = 1.

By the multiplication rule, P(every hand has one king) = P(H1) P(H2 | H1) P(H3 | H1 ∩ H2) P(H4 | H1 ∩ H2 ∩ H3).
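A short R sketch (not in the original solution) that evaluates this product numerically:

    p1 <- choose(4, 1) * choose(48, 12) / choose(52, 13)
    p2 <- choose(3, 1) * choose(36, 12) / choose(39, 13)
    p3 <- choose(2, 1) * choose(24, 12) / choose(26, 13)
    p1 * p2 * p3   # P(every hand has exactly one king), approximately 0.105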
(a) Sample space = Ω = {(1, 1), (1, 2), (1, 3), . . . , (6, 6) } = {(i, j) | i, j = 1, 2, 3, 4, 5, 6 }.
(Each outcome is equally likely, with probability 1/36.)
A = {(1, 3), (2, 2), (3, 1)},
B = {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (1, 3), (2, 3), (4, 3), (5, 3), (6, 3) }
P(A|B) = P(A ∩ B)/P(B) = (2/36)/(11/36) = 2/11.

(b) P(A) = 3/36 ≠ P(A|B), so they are not independent.
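A small enumeration sketch in R; from the listed outcomes we are assuming A is the event "the sum of the dice is 4" and B is the event "at least one die shows a 3":

    rolls <- expand.grid(i = 1:6, j = 1:6)   # the 36 equally likely outcomes
    A <- rolls$i + rolls$j == 4              # assumed: sum is 4, i.e. {(1,3), (2,2), (3,1)}
    B <- rolls$i == 3 | rolls$j == 3         # assumed: at least one 3 (11 outcomes)
    mean(A & B) / mean(B)                    # P(A|B) = 2/11
    mean(A)                                  # P(A) = 3/36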
Here is the game tree; R1 means red on the first draw, etc.
[Game tree: the first draw is R1 with probability 7/10 or B1 with probability 3/10. After R1, the second draw is R2 with probability 6/9 or B2 with 3/9; after B1 it is R2 with 7/10 or B2 with 3/10. The third-level branches R3/B3 carry probabilities 5/8 and 3/8, 6/9 and 3/9, 6/9 and 3/9, and 7/10 and 3/10, respectively.]
X:     -2     -1     0      1      2
p(X):  1/15   2/15   3/15   4/15   5/15

We compute

E[X] = −2 · (1/15) + (−1) · (2/15) + 0 · (3/15) + 1 · (4/15) + 2 · (5/15) = 10/15 = 2/3.

Thus

Var(X) = E((X − 2/3)²) = 14/9.
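A quick check of these two values from the table (not part of the original solution):

    x <- c(-2, -1, 0, 1, 2)
    p <- c(1, 2, 3, 4, 5) / 15
    EX <- sum(x * p); EX    # 2/3
    sum((x - EX)^2 * p)     # 14/9, approximately 1.556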
Make a table:

X:     0        1
prob:  1 − p    p
X²:    0        1
From the table, E(X) = 0 · (1 − p) + 1 · p = p.
Since X and X 2 have the same table E(X 2 ) = E(X) = p.
Therefore, Var(X) = p − p2 = p(1 − p).
(c) E(X + Y ) = E(X) + E(Y ) = 150/6 = 25, and E(2X) = 2E(X) = 150/6 = 25.
Var(X + Y ) = Var(X) + Var(Y ) = 250/4. Var(2X) = 4Var(X) = 500/4.
The means of X + Y and 2X are the same, but Var(2X) > Var(X + Y ).
This makes sense because in X + Y, sometimes X and Y will be on opposite sides of the mean, so
the distances to the mean will tend to cancel. However, in 2X the distance to the mean
is always doubled.
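A simulation sketch of this point. The distribution of X is not reproduced in this excerpt, so the normal draws below (with the mean 12.5 and variance 125/4 implied by the numbers above) are only a stand-in; the comparison Var(2X) = 4 Var(X) versus Var(X + Y) = 2 Var(X) holds for any i.i.d. X and Y:

    set.seed(1)
    X <- rnorm(1e5, mean = 12.5, sd = sqrt(125/4))   # stand-in distribution, not from the problem
    Y <- rnorm(1e5, mean = 12.5, sd = sqrt(125/4))
    c(mean(X + Y), mean(2 * X))   # both approximately 25
    c(var(X + Y), var(2 * X))     # approximately 62.5 and 125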
P(.5 < X < 1) = F_X(1) − F_X(.5) = 1 − (.5² + .5³)/2 = 13/16.
(a) We compute
P(X ≥ 5) = 1 − P(X < 5) = 1 − ∫_0^5 λe^(−λx) dx = 1 − (1 − e^(−5λ)) = e^(−5λ).
(b) We want P (X ≥ 15|X ≥ 10). First observe that P (X ≥ 15, X ≥ 10) = P (X ≥ 15).
From similar computations to those in (a), we know P(X ≥ 10) = e^(−10λ) and P(X ≥ 15) = e^(−15λ). Therefore

P(X ≥ 15 | X ≥ 10) = P(X ≥ 15, X ≥ 10)/P(X ≥ 10) = e^(−15λ)/e^(−10λ) = e^(−5λ).
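This is the memoryless property of the exponential distribution: the answer is the same as P(X ≥ 5) from part (a). A quick numerical illustration in R (the rate λ = 0.3 is arbitrary, chosen only for the check):

    lambda <- 0.3
    pexp(15, lambda, lower.tail = FALSE) / pexp(10, lambda, lower.tail = FALSE)   # P(X >= 15 | X >= 10)
    exp(-5 * lambda)                                                              # e^(-5*lambda), the same value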
f_Y(a) = d/da F_Y(a) = d/da Φ(ln(a)) = (1/a) φ(ln(a)) = (1/(a√(2π))) e^(−(ln(a))²/2).
(b) (i) We want to find q.33 such that P(Z ≤ q.33) = 0.33. That is, we want Φ(q.33) = 0.33, so q.33 = Φ⁻¹(0.33) ≈ −0.44 (in R, qnorm(0.33)).
Problem 16. (a) We did this in class. Let φ(z) and Φ(z) be the PDF and CDF of Z.
FY (y) = P (Y ≤ y) = P (aZ + b ≤ y) = P (Z ≤ (y − b)/a) = Φ((y − b)/a).
Differentiating:
f_Y(y) = d/dy F_Y(y) = d/dy Φ((y − b)/a) = (1/a) φ((y − b)/a) = (1/(a√(2π))) e^(−(y−b)²/(2a²)).
Since this is the density for N(b, a2 ) we have shown Y ∼ N(b, a2 ).
(b) By part (a), Y ∼ N(µ, σ 2 ) ⇒ Y = σZ + µ.
But this implies (Y − µ)/σ = Z ∼ N(0, 1). QED
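A numeric check of the density identity from part (a), using illustrative values a = 2 and b = 1 that are not from the problem:

    a <- 2; b <- 1
    y <- seq(-5, 7, by = 0.5)
    max(abs(dnorm((y - b) / a) / a - dnorm(y, mean = b, sd = a)))   # essentially 0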
Cov(X, Y ) = Cov(X1 + X2 + X3 , X3 + X4 + X5 )
= Cov(X1 , X3 ) + Cov(X1 , X4 ) + Cov(X1 , X5 )
+ Cov(X2 , X3 ) + Cov(X2 , X4 ) + Cov(X2 , X5 )
+ Cov(X3 , X3 ) + Cov(X3 , X4 ) + Cov(X3 , X5 )
Because the Xi are independent, the only nonzero term in the above sum is Cov(X3, X3) = Var(X3) = 1/4.

Therefore, Cov(X, Y) = 1/4.
We get the correlation by dividing by the standard deviations. Since X is the sum of 3
independent Bernoulli(0.5) random variables, we have σ_X = √(3/4); likewise σ_Y = √(3/4). So

Cor(X, Y) = Cov(X, Y)/(σ_X σ_Y) = (1/4)/(3/4) = 1/3.
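A simulation sketch (not in the original solution) confirming these values for X = X1 + X2 + X3 and Y = X3 + X4 + X5 with independent Bernoulli(0.5) components:

    set.seed(1)
    B <- matrix(rbinom(5e5, 1, 0.5), ncol = 5)   # columns play the role of X1, ..., X5
    X <- rowSums(B[, 1:3]); Y <- rowSums(B[, 3:5])
    c(cov(X, Y), cor(X, Y))                      # approximately 1/4 and 1/3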
(a) Here we have two continuous random variables X and Y with joint probability density
function

f(x, y) = (12/5) xy(1 + y) for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1,

and f(x, y) = 0 otherwise. So
P(1/4 ≤ X ≤ 1/2, 1/3 ≤ Y ≤ 2/3) = ∫_{1/4}^{1/2} ∫_{1/3}^{2/3} f(x, y) dy dx = 41/720.
(b) F(a, b) = ∫_0^a ∫_0^b f(x, y) dy dx = (3/5)a²b² + (2/5)a²b³ for 0 ≤ a ≤ 1 and 0 ≤ b ≤ 1.
(c) We find the marginal cdf FX (a) by setting b in F (a, b) to the top of its range, i.e. b = 1.
So FX (a) = F (a, 1) = a2 .
(d) For 0 ≤ x ≤ 1, we have

f_X(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_0^1 f(x, y) dy = 2x.
This is consistent with (c) because d/dx(x²) = 2x.
(e) We first compute f_Y(y) for 0 ≤ y ≤ 1 as

f_Y(y) = ∫_0^1 f(x, y) dx = (6/5) y(y + 1).
Since f (x, y) = fX (x)fY (y), we conclude that X and Y are independent.
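A numerical check of these integrals (a sketch using base R's integrate; 0.7 is just an arbitrary test point for the marginals):

    f <- function(x, y) 12/5 * x * y * (1 + y)
    inner <- function(x) sapply(x, function(xx) integrate(function(y) f(xx, y), 1/3, 2/3)$value)
    integrate(inner, 1/4, 1/2)$value              # 41/720, approximately 0.0569
    integrate(function(y) f(0.7, y), 0, 1)$value  # f_X(0.7) = 2(0.7) = 1.4
    integrate(function(x) f(x, 0.7), 0, 1)$value  # f_Y(0.7) = (6/5)(0.7)(1.7) = 1.428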
Problem 21.
Standardize:

P(Σᵢ Xᵢ < 30) = P( ((1/n)Σᵢ Xᵢ − µ)/(σ/√n) < (30/n − µ)/(σ/√n) )
             ≈ P( Z < (30/100 − 1/5)/(1/30) )   (by the central limit theorem)
             = P(Z < 3)
             = 0.9987   (from the table of normal probabilities)
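The same computation in R (pnorm plays the role of the normal table):

    z <- (30/100 - 1/5) / (1/30)   # standardized value: 3
    pnorm(z)                       # 0.99865..., i.e. 0.9987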
Then, for example, the expected number of re-offenders in the control group is 200 · θ̂ = 50.
The other expected counts are computed in the same way.
The chi-square test statistic is

X² = Σ (observed − expected)²/expected = 20²/50 + 20²/150 + 20²/50 + 20²/150 ≈ 8 + 2.67 + 8 + 2.67 ≈ 21.33.
Finally, we need the degrees of freedom: df = 1 because this is a two-by-two table and
(2 − 1) · (2 − 1) = 1. (Equivalently, only one cell count can be chosen freely; the marginal
counts 200, 200, 100, 300 (total 400) used to compute the expected counts then determine
the rest.)
From the χ² table: p = P(X² > 21.33 | df = 1) < 0.01.
Conclusion: we reject H0 in favor of HA . The experimental intervention appears to be
effective.
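A sketch of this test in R. The observed counts are not reproduced in this excerpt; the 70/130 and 30/170 split below is an assumption consistent with the expected counts 50/150 and the cell differences of 20 used above:

    observed <- matrix(c(70, 130,    # control: re-offend, no re-offense (assumed counts)
                         30, 170),   # treated: re-offend, no re-offense (assumed counts)
                       nrow = 2, byrow = TRUE)
    expected <- matrix(c(50, 150, 50, 150), nrow = 2, byrow = TRUE)
    X2 <- sum((observed - expected)^2 / expected)   # approximately 21.33
    pchisq(X2, df = 1, lower.tail = FALSE)          # p-value, far below 0.01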
(n − 1)s²/σ² ∼ χ²_{n−1}
We used R to find the critical values. (Or use the χ2 table.)
c025 = qchisq(0.975,19) = 32.852
c975 = qchisq(0.025,19) = 8.907
The 95% confidence interval for σ² is

[ (n − 1)s²/c0.025 , (n − 1)s²/c0.975 ] = [ 19 · 4.06²/32.852 , 19 · 4.06²/8.907 ] = [9.53, 35.16]
We can take square roots to find the 95% confidence interval for σ
[3.09, 5.93]
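The whole interval computation as an R sketch, taking n = 20 and s = 4.06 as read off from the numbers above:

    n <- 20; s <- 4.06
    ci_var <- c((n - 1) * s^2 / qchisq(0.975, n - 1),
                (n - 1) * s^2 / qchisq(0.025, n - 1))
    ci_var        # approximately [9.53, 35.16], the CI for sigma^2
    sqrt(ci_var)  # approximately [3.09, 5.93], the CI for sigma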
(b) It’s tricky to keep the sides straight here. We work slowly and carefully:
The 5th and 95th percentiles for x̄∗ are the 10th and 190th entries
2.89, 3.72
(Here again there is some ambiguity on which entries to use. We will accept using the 11th
or the 191st entries or some interpolation between these entries.)
So the 5th and 95th percentiles for p∗ are
1/3.72 = 0.26882, 1/2.89 = 0.34602
So the 5th and 95th percentiles for δ∗ = p∗ − p̂ are
−0.034213, 0.042990
These are also the 0.95 and 0.05 critical values.
So the 90% CI for p is
[0.30303 − 0.042990, 0.30303 + 0.034213] = [0.26004, 0.33724]
Problem 30. (a) The steps are the same as in the previous problem except the bootstrap
samples are generated in different ways.
Step 1. We have the point estimate q0.5 ≈ q̂0.5 = 3.3.
Step 2. Use the computer to generate many (say 10000) size 100 resamples of the original
data.
Step 3. For each sample compute the median q∗0.5 and δ∗ = q∗0.5 − q̂0.5.
Step 4. Sort the δ ∗ and find the critical values δ0.95 and δ0.05 . (Remember δ0.95 is the 5th
percentile etc.)
Step 5. The 90% bootstrap confidence interval for q0.5 is

[ q̂0.5 − δ0.05 , q̂0.5 − δ0.95 ]

(An R sketch of these steps is given below.)
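A minimal R sketch of steps 1-5. The original sample is not reproduced here, so the rexp() line below is only a placeholder standing in for the real 100 data points:

    set.seed(1)
    x <- rexp(100, rate = 1/3)                  # placeholder data, NOT the original sample
    qhat <- median(x)                           # step 1: point estimate of the median
    deltas <- replicate(10000, median(sample(x, replace = TRUE)) - qhat)   # steps 2-3
    d <- quantile(deltas, c(0.05, 0.95))        # step 4: delta_0.95 and delta_0.05 critical values
    c(qhat - d[[2]], qhat - d[[1]])             # step 5: 90% bootstrap CI for the median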
(b) This is very similar to the previous problem. We proceed slowly and carefully to get
terms on the correct side of the inequalities.
The 5th and 95th percentiles for q∗0.5 are
2.89, 3.72
So the 5th and 95th percentiles for δ∗ = q∗0.5 − q̂0.5 are
[2.89 − 3.3, 3.72 − 3.3] = [−0.41, 0.42]
These are also the 0.95 and 0.05 critical values.
So the 90% CI for q0.5 is

[3.3 − 0.42, 3.3 + 0.41] = [2.88, 3.71]
Problem 31. The model is yi = a + bxi + εi , where εi is random error. We assume the
errors are independent with mean 0 and the same variance for each i (homoscedastic).
The total error squared is

E² = Σ (yᵢ − a − bxᵢ)² = (1 − a − b)² + (1 − a − 2b)² + (3 − a − 3b)².
The least squares fit is given by the values of a and b that minimize E². We solve for
them by setting the partial derivatives of E² with respect to a and b to 0; solving the
resulting equations (or running the fit in R) gives a = −1/3, b = 1.
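A sketch of the corresponding R computation, taking the data points (1, 1), (2, 1), (3, 3) as they appear in the expression for E² above:

    x <- c(1, 2, 3)
    y <- c(1, 1, 3)
    fit <- lm(y ~ x)   # least squares minimizes the same E^2
    coef(fit)          # intercept a = -1/3, slope b = 1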
MIT OpenCourseWare
https://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.