Statistics Coursework Homework Help
Statistics Coursework Homework Help
Statistics Coursework Homework Help
We have shown that P(n) ⇒ P(n + 1). Thus, P(n) is true for all n ∈ N
which for sufficiently large n, is clearly less than any positive constant,
and thus is o(1).
Problem 3. A person is passing time by advancing a token on the set
of natural numbers. In the beginning, a token is placed on 0.
1. First the person tosses a fair coin (with heads/tails equally likely).
2
For each n ∈ N, let En be the event ”There was a move on which the
token landed on n”. Let pn = Pr[En].
Find a recurrence relation for pn. You do not need to solve the
recurrence, but you should specify the boundary conditions that would
be necessary to find a solution to the recurrence.
Problem 4. Exactly 1/5th of the people in a town have Beaver
Fever©.
There are two tests for Beaver Fever, TEST1 and TEST2. When a
person goes to a doctor to test for Beaver Fever, with probability 2/3
the doctor conducts TEST1 on him and with probability 1/3 the doctor
conducts TEST2 on him.
• If the person has the disease, the result is positive with probability
3/4.
• If the person does not have the disease, the result is positive with
probability 1/4.
• If the person has the disease, the result is positive with probability
1.
• If the person does not have the disease, the result is positive with
probability 1/2.
A person is picked uniformly at random from the town and is sent to a
doctor to test for Beaver Fever. The result comes out positive. What is the
probability that the person has the disease?
Solution. Let B be the event that the person has BLAH. Let T1 be the
event that the person is tested with test1. Let T2 be the event that the
person is tested with test2. Let P be the event that the test comes out
positive.
There are 104 cards. There are 5 cards in a hand. Order does not matter.
The total number of possible hands is:
Problem 6. [28 points] Scores for a final exam are given by picking an
integer uniformly at random from the set {50, 51, . . . , 97, 98}. The
scores of all 128 students in the class are assigned in this manner. For
parts (a), (b), (c) and (d) you may NOT assume that these scores are
assigned independently. For parts (e), (f), (g) and (h) you MAY assume
that these scores are assigned independently.
(a) For i ∈ {1, . . . , 128}, what is E[Si] ?
(d) Improve your previous bound by using the fact that the minimum
possible score is 50. Prove that
(e) For the remaining problems, assume that all the scores are
assigned mutually independently. Use Problem 1 of this final to find
V ar[Si].
(f) What is V ar[S]?
Solution.
(a) We simply take the average of the numbers from 50 to 98. Thus,
74.
E[Si] =
(b) By linearity of expectation,
Problem 7. 1000 files F1, F2, . . . , F1000 have just reached a disk
manager for writing onto disk. Each file’s size is between 0MB and
1MB. The sum of all files’ sizes is 400MB.
The disk manager has 4 disks under its control. For each file Fi, the
disk manager chooses a disk uniformly at random from amongst the
4 disks, and Fi is written to that disk. The choices of disk for the
different files are mutually independent.
(a) What is the expected number of files that will be written to the first
disk?
We can use indicator variables. For each file, Pi = 1 if Fi is written to
the first disk. The chance of an individual file being written to the
first disk is 1/4. By linearity of expectation, the expected number of
files written to the first disk is the sum of the expected values of Pi’s.
The expected value of each indicator variable is 1/4, and
so the expected number of files to be written to the first disk
(b) What is the expected number of bytes written on the first disk?
We can say that each file Fi has bit size Si. Each file has a 1/4 chance
of being written do the first disk. Therefore, by linearity of expectation,
the expected number of bytes written to the first disk is the sum of the
expected number of bytes per file written to the first disk, which is:
(c) Find the best upper bound you can on the probability that 200MB or
more are written on the first disk?
For this we can use the first Chernoff bound, which is:
The Chernoff bound only works if X is the sum of random variables that
each take on a value between 0 and 1. The file size of each file in the
first disk is between 0 and 1Mb . So we can define X to be the total
number of bytes in disk 1. The expected value of X is 100, so we take c
to be 2. We get:
(d) Find the best upper bound you can on the probability that there is some
disk with 200MB or more written on it?
For this we can use the Union Bound along with our result from above.
The probability of this event happening in one or more disks is upper
bounded by the sum of the probabilities of the event happening in each
disk. This gives us an upper bound of