Interval Estimation: Part 2
(Module 4)
Statistics (MAST20005) & Elements of Statistics (MAST90058)
Semester 2, 2018
Contents
1 Confidence intervals
  1.1 Less common scenarios
  1.2 General techniques
  1.3 Properties
  1.4 Choice of confidence level
  1.5 Interpretation
  1.6 Summary
2 Prediction intervals
3 Sample size determination
1 Confidence intervals
We can construct one-sided confidence intervals, e.g. just an upper or lower bound.
For example, if we sample from N(µ, σ²) with known σ:

    Pr( (X̄ − µ) / (σ/√n) < c ) = 1 − α

and therefore a one-sided 100 · (1 − α)% confidence interval for µ is

    ( x̄ − c σ/√n , ∞ ).
Remarks
• The main thing to remember is to start with a one-sided probability statement about the pivot.
• In this example, we obtained a lower bound.
• To get an upper bound, start with an inequality in the other direction.
• Other scenarios are analogous. For example, if σ is unknown then replace σ with s and let c be a quantile from the t distribution with n − 1 degrees of freedom rather than from N(0, 1).
• Since we only need one tail probability, we don’t need to separate α into two parts. That’s why we use the 1 − α
quantile here rather than the 1 − α/2 quantile.
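As a quick numerical check of the known-σ case, the lower bound can be computed directly. The numbers here (n = 25, x̄ = 10.4, σ = 2) are made up for illustration, using Python's scipy rather than the notes' R:

```python
import math
from scipy.stats import norm

# Hypothetical numbers: n = 25 observations from N(mu, sigma^2), known sigma = 2
n, xbar, sigma = 25, 10.4, 2.0
alpha = 0.05

c = norm.ppf(1 - alpha)  # one-sided: use the 1 - alpha quantile (~1.645), not 1 - alpha/2
lower = xbar - c * sigma / math.sqrt(n)
# one-sided 95% CI for mu: (lower, infinity)
print(round(lower, 3))
```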
A winemaker requires a minimum concentration of 10 g/L of sugar in the grapes used to make a certain wine. In a sample of 30 units she finds an average concentration of 11.9 g/L and a standard deviation of 0.96 g/L. Is that high enough?
She calculates a 95% lower bound (one-sided CI) as follows:
    x̄ − c s/√n = 11.9 − 1.699 × 0.96/√30 = 11.60

where c = 1.699 is the 0.95 quantile of the t distribution with 29 degrees of freedom.
On that basis, she is confident that the average sugar content is adequately high.
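Her arithmetic can be reproduced in a few lines (shown here with Python's scipy; any statistics package gives the same quantile):

```python
import math
from scipy.stats import t

n, xbar, s = 30, 11.9, 0.96  # the winemaker's sample summaries
c = t.ppf(0.95, df=n - 1)    # 0.95 quantile of the t distribution with 29 df
lower = xbar - c * s / math.sqrt(n)
print(round(c, 3), round(lower, 2))  # 1.699 11.6
```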
Example using R
Recall the butterfat example from the previous module. We now redo it using a one-sided CI. . .
> t.test(butterfat,
+ conf.level = 0.90,
+ alternative = "less")
...
...
⇒ Inversion is messy.
Usually we aim for something close, with ‘at least’ the stated probability. For example, take the interval (a(θ), b(θ)), where:
• a(θ) is the largest value of x such that Pr(x ≤ T | θ) ≥ 0.975
• b(θ) is the smallest value of x such that Pr(T ≤ x | θ) ≥ 0.975
so that Pr(a(θ) ≤ T ≤ b(θ)) ≥ 0.95.
How do we invert these?
For an observed value tobs (of T ), we have:
• c is such that Pr(tobs ≤ T | θ = c) = 0.025
• d is such that Pr(T ≤ tobs | θ = d) = 0.025
Then, the ‘at least’ 95% confidence interval is (c, d).
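For a concrete case, take T ~ Binomial(n, θ) with n = 30 and a hypothetical observed value tobs = 7 (these numbers are an illustration, not from the notes). The two endpoint equations can be solved numerically:

```python
from scipy.optimize import brentq
from scipy.stats import binom

n, t_obs = 30, 7  # hypothetical: 7 successes observed in 30 trials

# c solves Pr(t_obs <= T | theta = c) = 0.025 (upper-tail probability at t_obs)
c = brentq(lambda p: binom.sf(t_obs - 1, n, p) - 0.025, 1e-9, 1 - 1e-9)

# d solves Pr(T <= t_obs | theta = d) = 0.025 (lower-tail probability at t_obs)
d = brentq(lambda p: binom.cdf(t_obs, n, p) - 0.025, 1e-9, 1 - 1e-9)

# (c, d) is an 'at least' 95% CI for theta; it brackets the point estimate 7/30
```

This is the same construction as the classical Clopper–Pearson interval for a proportion.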
Maximum likelihood estimators have many convenient properties. We will cover some of the theory later in the
semester. For now, it is useful to know the following. . .
Let

    V(θ) = − ∂² ln L / ∂θ²

This is known as the observed information function. It can be used to estimate the standard deviation of the MLE:

    se(θ̂) = 1 / √V(θ̂)
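As an illustration (not from the notes), take an exponential sample with rate θ, where ln L(θ) = n ln θ − θ Σxᵢ, so V(θ) = n/θ². A numerical second derivative confirms the analytic form:

```python
import numpy as np

# Hypothetical data, treated as exponential with rate theta; the MLE is 1/xbar
x = np.array([0.8, 1.3, 0.4, 2.1, 0.9, 1.7, 0.6, 1.2])
n = len(x)
theta_hat = 1 / x.mean()

def loglik(theta):
    # ln L(theta) = n * ln(theta) - theta * sum(x)
    return n * np.log(theta) - theta * x.sum()

# Observed information: analytic V(theta) = n / theta^2 ...
V_analytic = n / theta_hat**2
# ... checked against a central-difference second derivative
h = 1e-4
V_numeric = -(loglik(theta_hat + h) - 2 * loglik(theta_hat) + loglik(theta_hat - h)) / h**2

se = 1 / np.sqrt(V_analytic)  # estimated standard error of the MLE (= theta_hat / sqrt(n))
```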
Review of general methods for constructing CIs
Methods:
• Invert a probability interval based on a known sampling distribution (use a pivot)
• Use the asymptotic MLE result
Common approximations:
• Normality (based on the CLT or the asymptotic MLE)
• Substitute parameter estimates into the expression for the standard deviation of the estimator
1.3 Properties
Coverage
The coverage or coverage probability of a confidence interval (estimator) is the probability it contains the true value of
the parameter,
C = Pr(L < θ < U )
Usually this is equal to the confidence level, which is also known as the nominal coverage probability.
However, due to various approximations we use, the actual coverage achieved may vary from the confidence level.
[Figure: actual coverage probability plotted against the parameter value; the achieved coverage dips to around 0.85–0.90, below the nominal level.]
More detail about the quadratic approximation will be shown in the tutorials and lab classes.
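A small simulation makes the gap visible. Here (an illustration, not from the notes) we estimate the actual coverage of the normal-approximation ('Wald') 95% CI for a binomial proportion with n = 30 and true p = 0.1:

```python
import numpy as np

rng = np.random.default_rng(2018)
n, p, reps = 30, 0.1, 20000
z = 1.96  # nominal 95%, two-sided

x = rng.binomial(n, p, size=reps)   # simulate many samples
phat = x / n
se = np.sqrt(phat * (1 - phat) / n)  # estimated standard error
covered = (phat - z * se < p) & (p < phat + z * se)
coverage = covered.mean()            # well below the nominal 0.95 in this setting
```

For these values the actual coverage is only about 0.81, despite the nominal 95% level.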
1.5 Interpretation
Explaining CIs
The probability associated with a CI (i.e. the confidence level) relates to the sampling procedure. In particular, it
refers to hypothetical repeated samples.
Once a specific sample is observed and a CI is calculated, the confidence level cannot be interpreted probabilistically
in the context of the specific data at hand.
It is incorrect to say things like:
• This CI has a 95% chance of including the true value
• We can be ‘95% confident’ that this CI includes the true value
Don’t do it!
The probability only has a meaning when considering potential replications of the whole sampling and estimation
procedure.
We can only say something like:
• If we were to repeat this experiment, then 95% of the time the CI we calculate will cover the true value.
(This is a bit of a mouthful...)
In practice:
• If you are reporting results to people who know what they are, you can just state that the “95% confidence
interval is. . . ”
• If people want to know what this means, use an intuitive notion like, “it is the set of plausible values of the
parameter that are consistent with the data”. (Note: this is not actually true in general, but will be accurate
enough for all of the examples we cover this semester.)
• If you need to actually explain what a CI is precisely, you need to explain it in terms of repeated sampling. (No
shortcuts!)
1.6 Summary
2 Prediction intervals
Prediction intervals
Suppose we want to estimate the value of a future observation, rather than a parameter of the distribution. We usually
call this prediction rather than ‘estimation’.
We have available data that arose from the same probability distribution. Can we use this to come up with an interval
estimate?
Yes. Easiest to see with an example. . .
Remarks
• The prediction interval for X is much wider than the confidence interval for µ.
• As n → ∞, the width of the confidence interval shrinks to zero, but the width of the prediction interval tends
to the width of the corresponding population probability interval (µ ± 1.96).
• This makes sense: we get complete certainty about µ, but each observation on X has inherent variability (in
this case, a variance of 1).
• In the prediction interval estimator, all quantities are random variables.
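A sketch of the contrast, using a made-up sample from N(µ, 1) with σ = 1 known (matching the remarks above); a new observation X satisfies X − X̄ ~ N(0, 1 + 1/n), which gives the prediction interval its extra width:

```python
import math
import numpy as np

rng = np.random.default_rng(7)
n = 20
x = rng.normal(5.0, 1.0, size=n)  # hypothetical sample from N(mu, 1), sigma = 1 known
xbar = x.mean()
z = 1.96

# 95% confidence interval for mu: width shrinks like 1/sqrt(n)
ci = (xbar - z / math.sqrt(n), xbar + z / math.sqrt(n))

# 95% prediction interval for a new X: width tends to 2 * 1.96 as n grows
pi = (xbar - z * math.sqrt(1 + 1/n), xbar + z * math.sqrt(1 + 1/n))
```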
3 Sample size determination
Sample size determination: overview
A researcher plans to select a sample of first-grade girls in order to estimate their mean height µ. The sample is
required to be large enough to get an estimate to within 0.5 cm. From previous studies we know σ ≈ 2.8 cm.
    n = ( cσ / ε )² = ( 1.96 × 2.8 / 0.5 )² = 120.47

where c = 1.96 is the 0.975 quantile of N(0, 1) and ε = 0.5 cm is the required precision.
The researcher selects 121 girls.
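The same calculation in code (shown in Python; c comes from N(0, 1) for a 95% interval, and ε = 0.5 cm is the target half-width):

```python
import math
from scipy.stats import norm

sigma, eps = 2.8, 0.5
c = norm.ppf(0.975)             # ~1.96 for a 95% CI
n_exact = (c * sigma / eps)**2  # ~120.47
n = math.ceil(n_exact)          # always round up: 121 girls
```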
Example: Sample size for proportions
The unemployment rate has been 8% for a while. A researcher wishes to take a new sample to estimate it and wants to be ‘very certain’, by using a 99% CI, that the new estimate is within 0.001 of the true proportion.
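The notes break off here, but the standard calculation would proceed with n = z² p(1 − p) / ε², using the previous rate p = 0.08 as a planning value. This sketch fills in that arithmetic and may differ from how the original example continued:

```python
import math
from scipy.stats import norm

p, eps = 0.08, 0.001
z = norm.ppf(0.995)                 # ~2.576 for a 99% CI
n_exact = z**2 * p * (1 - p) / eps**2
n = math.ceil(n_exact)              # an extremely large sample at this precision
```

The tiny margin of error (0.001) drives the required sample size into the hundreds of thousands.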