Chapter Two - Estimation (STA408)
▪ Introduction to Estimation
▪ Point Estimate
▪ Interval Estimate
➢ Confidence Interval for a Population Mean
➢ One Population Mean
➢ Two Population Means
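Mean
Population: $\mu = \frac{\sum X}{N}$
Sample: $\bar{x} = \frac{\sum x}{n}$
Standard Deviation
Population: $\sigma = \sqrt{\frac{1}{N}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]}$
Sample: $s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$
Variance
Population: $\sigma^2 = \frac{1}{N}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]$
Sample: $s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$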
__________________________________Statistics For Science and Engineering (STA408)
✓ These statistics and other numerical descriptive measures computed from the
samples can be used not only to describe the sample but also to make
inferences about the population parameters in the form of estimates and
hypothesis tests.
Population → Sample → Compute statistic → Make inferences/conclusion (Estimation, Hypothesis testing)
✓ The Central Limit Theorem (CLT) states that the sampling distribution of the
sample mean will be normal or approximately normal if the sample size is large,
whatever the shape of the population distribution.
Rule of thumb: $n \ge 30$ (considered large)
1. The mean of the sample means will be the same as the population
mean, $\mu_{\bar{x}} = \mu$.
2. The standard deviation of the sample means will be smaller than the
standard deviation of the population, and it will be equal to the population
standard deviation divided by the square root of the sample size, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
3. When the samples are selected from a finite population (sampling without
replacement) of size N and $\frac{n}{N} \ge 0.05$, then the standard deviation of the
distribution of the sample mean is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$.
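The three properties above can be sketched as a small helper. This is only an illustration: the function name `standard_error` and the population size N = 200 are assumptions chosen for the demonstration, not values from the notes.

```python
import math
from typing import Optional


def standard_error(sigma: float, n: int, N: Optional[int] = None) -> float:
    """Standard deviation of the sample mean (sigma / sqrt(n)).

    If a finite population size N is given and n/N >= 0.05, the finite
    population correction sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se


# Large (effectively infinite) population: no correction.
se_infinite = standard_error(sigma=40, n=16)

# Hypothetical finite population of N = 200, so n/N = 0.08 >= 0.05.
se_finite = standard_error(sigma=40, n=16, N=200)

print(se_infinite, se_finite)
```

The corrected value is always smaller than $\sigma/\sqrt{n}$, because sampling without replacement from a small population leaves less room for variation.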
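Sampling distribution of the Sample Mean, $\bar{X}$ (CLT)
Notation
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Formula
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Where:
$\bar{X}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size
Mean, $\mu_{\bar{x}} = \mu$
Variance, $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$
Standard Deviation (standard error of the mean), $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$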
Example 2.1: A production firm manufactures light bulbs that have a length of
life that is approximately normally distributed with a mean of 800 hours and a
standard deviation of 40 hours.
a) Write the probability distribution of the life of the light bulbs.
b) Write the sampling distribution of the mean life of the light bulbs.
c) Find the probability that the life of the light bulbs is more than 850 hours.
(Ans: 0.1056)
d) Find the probability that a random sample of 16 bulbs will have an average
life of less than 775 hours. (Ans: 0.00621)
Solution:
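One way to check the stated answers numerically is sketched below, using `statistics.NormalDist` from the Python standard library. The variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma, n = 800, 40, 16

# a) Life of a single bulb: X ~ N(800, 40^2)
life = NormalDist(mu, sigma)

# b) Mean life of n = 16 bulbs: X-bar ~ N(800, 40^2 / 16)
mean_life = NormalDist(mu, sigma / n ** 0.5)

# c) P(X > 850) = P(Z > 1.25)
p_c = 1 - life.cdf(850)

# d) P(X-bar < 775) = P(Z < -2.5)
p_d = mean_life.cdf(775)

print(round(p_c, 4), round(p_d, 5))
```

Both values agree with the answers quoted in parts (c) and (d).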
Example 2.2: A manager observes that his income per day averages RM 1000
with standard deviation of RM 200. He selected a random sample of 30 days.
a) Describe the distribution of the sample mean.
b) What is the probability that the mean income for the sample of 30 days
exceeds RM 1050? (Ans: 0.0853)
c) What is the probability that the sample mean will be within RM 100 of the
population mean? (Ans: 0.9938)
Solution:
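A minimal numerical check of parts (b) and (c) follows. It assumes, as the CLT rule of thumb suggests for n = 30, that the sample mean is approximately normal; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 1000, 200, 30

# a) By the CLT (n >= 30), X-bar is approximately N(1000, 200^2 / 30).
xbar = NormalDist(mu, sigma / n ** 0.5)

# b) P(X-bar > 1050)
p_b = 1 - xbar.cdf(1050)

# c) P(900 < X-bar < 1100), i.e. within RM 100 of the population mean
p_c = xbar.cdf(mu + 100) - xbar.cdf(mu - 100)

print(p_b, p_c)
```

The result for (b) differs from the quoted answer only in the fourth decimal place, because a table lookup rounds the z-value to two decimals first.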
Types of estimation
Estimator: The quantity calculated from the sample data to estimate a population
parameter.
If we have two or more estimators for the same parameter, we need to compare them and
choose the best estimator.
Properties of Best Estimator
Point Estimate
➢ A single number calculated from the sample to estimate population
parameter
➢ The best point estimate of the population mean, $\mu$, is the sample mean, $\bar{x}$.
Example 2.4: A sample of 8 career women is selected and the total time each spends
on exercise in a week is recorded. The resulting observations are
10.2 9.3 11.9 9.2 8.3 11.2 10.4 9.5.
What are the point estimates of the mean and standard deviation of exercise time?
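The point estimates can be computed as sketched below. The sample standard deviation is calculated twice: once with the standard library and once with the shortcut formula $s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$, to show that the two agree.

```python
from statistics import mean, stdev

times = [10.2, 9.3, 11.9, 9.2, 8.3, 11.2, 10.4, 9.5]
n = len(times)

# Point estimate of the population mean: the sample mean.
x_bar = mean(times)

# Point estimate of the population standard deviation: the sample
# standard deviation (divisor n - 1).
s = stdev(times)

# Same quantity from the shortcut formula.
sum_x = sum(times)
sum_x2 = sum(x * x for x in times)
s_shortcut = ((sum_x2 - sum_x ** 2 / n) / (n - 1)) ** 0.5

print(round(x_bar, 2), round(s, 4))
```

For this sample the estimates are a mean of about 10.0 and a standard deviation of about 1.166.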
Interval Estimate
➢ Two numbers calculated from the sample to form an interval within
which the parameter is expected to lie with a specified level of
confidence.
$P(a < \theta < b) = 1 - \alpha$
Where,
$\theta$ = the population parameter being estimated
a = the lower limit of the interval
b = the upper limit of the interval
1 − 𝛼 = the confidence coefficient
(1 − 𝛼)100% = the confidence level
➢ The confidence level is the long-run proportion of intervals, constructed in
this way from repeated samples, that contain the parameter being estimated.
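The meaning of the confidence level can be illustrated with a small simulation. The sketch below repeatedly draws samples from a normal population and counts how often the known-variance interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ covers the true mean. The population values (800, 40), the sample size, the number of trials and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(408)  # arbitrary seed so the run is reproducible

mu, sigma, n, conf = 800, 40, 16, 0.95  # illustrative values
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # Z_{alpha/2}
half_width = z * sigma / n ** 0.5

trials = 5000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - half_width < mu < x_bar + half_width:
        hits += 1

coverage = hits / trials
print(coverage)
```

The observed coverage should be close to 0.95. Any single interval either contains $\mu$ or it does not; the 95% describes the procedure over many samples.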
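Confidence Interval (CI): Population Mean
▪ One Population
➢ $\sigma^2$ known: $\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
➢ $\sigma^2$ unknown: $\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
▪ Two Populations, Independent Samples
➢ $\sigma_1^2$ and $\sigma_2^2$ known: $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
➢ $\sigma_1^2 = \sigma_2^2$ (unknown): $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, with $df = n_1 + n_2 - 2$
➢ $\sigma_1^2 \ne \sigma_2^2$ (unknown): $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, with $df = n_1 - 1$ or $n_2 - 1$ (the smaller of the two)
▪ Two Populations, Dependent Samples
➢ $\bar{d} \pm t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}}$, where $\bar{d} = \frac{\sum d}{n_d}$ and $s_d = \sqrt{\frac{1}{n_d - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_d}\right]}$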
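Confidence Interval (CI): Population Variance
▪ $\sigma^2$: $\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
▪ $\frac{\sigma_1^2}{\sigma_2^2}$: $\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\,v_1,\,v_2}},\ \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\,v_2,\,v_1}\right)$
▪ $\sigma$: $\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}$
▪ $\frac{\sigma_1}{\sigma_2}$: $\left(\sqrt{\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\,v_1,\,v_2}}},\ \sqrt{\frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\,v_2,\,v_1}}\right)$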
One Population, known $\sigma^2$
Assumption
✓ Random sample
✓ Small or large sample
✓ $\sigma^2$ is known
Formula
$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
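The known-variance interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ can be sketched as a short function. The name `z_interval` and the input values in the usage lines are hypothetical, chosen only to show the mechanics.

```python
from statistics import NormalDist


def z_interval(x_bar: float, sigma: float, n: int, conf: float = 0.95):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    return x_bar - half_width, x_bar + half_width


# Hypothetical inputs: sample mean 780, known sigma 40, n = 30.
lo95, hi95 = z_interval(x_bar=780, sigma=40, n=30, conf=0.95)
lo90, hi90 = z_interval(x_bar=780, sigma=40, n=30, conf=0.90)

print((round(lo95, 2), round(hi95, 2)))
print((round(lo90, 2), round(hi90, 2)))
```

Lowering the confidence level from 95% to 90% shrinks $Z_{\alpha/2}$ from about 1.96 to about 1.645, so the interval becomes narrower.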
Example 2.6: Now, repeat the same problem by finding the 90%
confidence interval for the average lifetime.
One Population, unknown $\sigma^2$
Assumption
✓ Random sample
✓ The population is normally distributed
✓ $\sigma^2$ is unknown
Formula
$\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
Example 2.9: A sample of 15 bulbs was tested and the lengths of life (in hours)
are as follows:
4300, 4302, 4415, 4483, 4301, 4446, 4478, 4319
3985, 4483, 4377, 4401, 4346, 4261, 4353
Assume the data follow a normal distribution.
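The sketch below shows how the interval $\bar{X} \pm t_{\alpha/2,\,n-1}\,s/\sqrt{n}$ could be applied to these data. The question for this example is not shown here, so the 95% confidence level is an assumption. The critical value $t_{0.025,\,14} = 2.145$ is read from a t-table and passed in by hand, because the Python standard library has no t-distribution; the function name `t_interval` is illustrative.

```python
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Confidence interval for a mean with unknown sigma.

    t_crit is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_crit * s / n ** 0.5
    return x_bar - half_width, x_bar + half_width


life = [4300, 4302, 4415, 4483, 4301, 4446, 4478, 4319,
        3985, 4483, 4377, 4401, 4346, 4261, 4353]

T_CRIT = 2.145  # t_{0.025, 14}; assumes a 95% confidence level
lo, hi = t_interval(life, T_CRIT)

print(mean(life), round(lo, 1), round(hi, 1))
```

For a different confidence level, only `T_CRIT` changes; the degrees of freedom stay at n - 1 = 14.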
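Two Population Means, Independent Samples, known $\sigma_1^2$ and $\sigma_2^2$
Formula
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$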
Two Population Means, Independent Samples, $\sigma_1^2 = \sigma_2^2$ and unknown
Assumption
✓ The populations are normally distributed
✓ $\sigma_1^2$ and $\sigma_2^2$ are unknown
✓ Variances are assumed to be equal
Formula
The (1-α)100% confidence interval for the difference between two
population means, $\mu_1 - \mu_2$, is
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Where
$df = n_1 + n_2 - 2$
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
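A minimal sketch of the pooled-variance interval follows. The function name `pooled_interval` and all sample summaries in the usage lines are hypothetical; the critical value $t_{0.025,\,18} = 2.101$ is taken from a t-table.

```python
def pooled_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 with equal but unknown variances.

    t_crit is the table value t_{alpha/2, df} with df = n1 + n2 - 2.
    Returns the pooled standard deviation and the interval.
    """
    df = n1 + n2 - 2
    sp = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df) ** 0.5
    half_width = t_crit * sp * (1 / n1 + 1 / n2) ** 0.5
    diff = x1 - x2
    return sp, (diff - half_width, diff + half_width)


# Hypothetical summaries: two samples of size 10, df = 18.
sp, (lo, hi) = pooled_interval(x1=50, s1=3, n1=10,
                               x2=45, s2=4, n2=10,
                               t_crit=2.101)

print(round(sp, 4), round(lo, 3), round(hi, 3))
```

The pooled standard deviation always lies between $s_1$ and $s_2$, weighted by the degrees of freedom of each sample.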
Construct a 98% confidence interval for the difference between the mean speeds
of cars driven by all men and all women on this highway. (Ans: 𝑠𝑝 = 2.317,
(2.2162,5.7838))
Construct a 95% confidence interval on the differences between the average lifetimes of
the two brands. Can the supervising inspector of incoming quality conclude that the
average lifetimes of the two brands are equal? (Ans: 5.682, 19.518)
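Two Population Means, Independent Samples, $\sigma_1^2 \ne \sigma_2^2$ and unknown
Formula
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$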
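The unequal-variance interval can be sketched as below. Two choices of degrees of freedom are shown: the conservative rule, the smaller of $n_1 - 1$ and $n_2 - 1$, and the Welch-Satterthwaite approximation that statistical software commonly reports. The latter is included here as an assumption about what such software does, not as a formula taken from these notes. All function names and sample summaries are hypothetical.

```python
def unequal_var_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 with unequal, unknown variances.

    t_crit is the table value t_{alpha/2, df} for the chosen df.
    """
    se = (s1 ** 2 / n1 + s2 ** 2 / n2) ** 0.5
    diff = x1 - x2
    return diff - t_crit * se, diff + t_crit * se


def conservative_df(n1, n2):
    """Smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (usually rounded down)."""
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical summaries: two samples of size 6, 99% confidence.
df_small = conservative_df(6, 6)            # 5
df_welch = welch_df(s1=3, n1=6, s2=6, n2=6)

T_CRIT = 4.032  # t_{0.005, 5} from a t-table
lo, hi = unequal_var_interval(60, 3, 6, 40, 6, 6, T_CRIT)

print(df_small, round(df_welch, 3), round(lo, 3), round(hi, 3))
```

The conservative rule gives fewer degrees of freedom, hence a larger critical value and a wider interval than the Welch-Satterthwaite choice.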
Example 2.14: A set of facilitation tools to help with data analysis for problem solving
is being developed by a group of statisticians at UiTM. To test the effectiveness of
these tools, a group of research officers were asked to analyze and produce a built-in
report for a set of data on the computer. Twelve equally capable research officers were
randomly selected and six were randomly assigned a standard procedure to complete
the task. The other six were asked to do the task using the developed facilitation tools.
The response measured was the time to completion (in minutes). The data collected are
shown below:
Assume that the population variances of the two procedures are different. Construct a
99% confidence interval to estimate the difference between the average completion
times for the two procedures. Can you conclude that the facilitation tools increase the
speed with which the task is completed by more than 20 minutes? (Ans: 19.18, 38.82 or
17.1987, 40.8013)
Example 2.15: The following Minitab Output was obtained from two independent
samples selected from two normally distributed populations with unknown and unequal
standard deviations.
Two Population Means, Dependent Samples
✓ Matched or paired samples involve a procedure whereby pairs of
observations are matched as closely as possible according to certain
relevant characteristics.
Formula
The (1-α)100% confidence interval for the mean difference between two
observations from matched samples, $\mu_d$, is
$\bar{d} \pm t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}}$
where $n = n_d$ is the number of pairs,
$\bar{d} = \frac{\sum d}{n_d}$
$s_d = \sqrt{\frac{1}{n_d - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_d}\right]}$
Example 2.16: The manufacturer of a gasoline additive claimed that the use of this
additive increases gasoline mileage. A random sample of six cars was selected and
these cars were driven for one week without the gasoline additive and then for one
week with the gasoline additive. The following table gives the miles per gallon for
these cars without and with the gasoline additive.
Without 24.6 28.3 18.9 23.7 15.4 29.5
Construct a 95% confidence interval for the difference in mean mileage per gallon
for cars without and with the gasoline additive. (Ans: -3.2150,-0.2184)
Example 2.17: Many engineering students have problems with data analysis using
statistical software. A professor who teaches a statistics for engineering course offered a
two-day workshop on this topic. The following table gives the test scores of seven engineering
students before and after they attended the workshop.
Before 56 69 48 74 65 71 58
After 62 73 44 85 71 70 69
a) Show that the 95% confidence interval for the difference in mean test scores before and after
attending the workshop is between -9.94 and 0.51.
b) Can we conclude whether attending the workshop increases the test score?
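Part (a) can be checked with the paired-sample formula $\bar{d} \pm t_{\alpha/2,\,n-1}\,s_d/\sqrt{n}$, taking each difference as before minus after. The critical value $t_{0.025,\,6} = 2.447$ is read from a t-table; the variable names are illustrative.

```python
from statistics import mean, stdev

before = [56, 69, 48, 74, 65, 71, 58]
after = [62, 73, 44, 85, 71, 70, 69]

# Paired differences, d = before - after.
d = [b - a for b, a in zip(before, after)]
n = len(d)

d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation of the differences

T_CRIT = 2.447  # t_{0.025, 6} from a t-table
half_width = T_CRIT * s_d / n ** 0.5
lo, hi = d_bar - half_width, d_bar + half_width

print(round(lo, 2), round(hi, 2))
```

The interval runs from about -9.94 to 0.51, as stated. Because it contains 0, these data do not allow us to conclude at the 95% level that the workshop changes the mean test score, which bears on part (b).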
Chi-Square Distribution and F-Distribution
Characteristics
Assumption
Formula
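Confidence interval for the population variance:
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
Confidence interval for the population standard deviation:
$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}$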
Example 2.18: Find the 95% confidence interval for the variance and standard
deviation of the nicotine content of cigarettes manufactured if a random sample
of 20 cigarettes has a standard deviation of 1.6 milligrams. Assume that the
variable is normally distributed. (Ans: $1.5 < \sigma^2 < 5.5$, $1.2 < \sigma < 2.3$)
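The stated answer can be reproduced with the chi-square interval for a variance. The critical values $\chi^2_{0.025,\,19} = 32.852$ and $\chi^2_{0.975,\,19} = 8.907$ are read from a chi-square table and entered by hand, since the Python standard library has no chi-square quantile function; the variable names are illustrative.

```python
n, s = 20, 1.6

# Table values for 19 degrees of freedom and 95% confidence.
CHI2_UPPER = 32.852  # chi^2_{alpha/2, n-1}     = chi^2_{0.025, 19}
CHI2_LOWER = 8.907   # chi^2_{1 - alpha/2, n-1} = chi^2_{0.975, 19}

# Interval for the variance: the larger critical value gives the
# lower limit, the smaller critical value gives the upper limit.
var_lo = (n - 1) * s ** 2 / CHI2_UPPER
var_hi = (n - 1) * s ** 2 / CHI2_LOWER

# Interval for the standard deviation: square roots of the limits.
sd_lo, sd_hi = var_lo ** 0.5, var_hi ** 0.5

print(round(var_lo, 1), round(var_hi, 1))
print(round(sd_lo, 1), round(sd_hi, 1))
```

Unlike the intervals for a mean, this interval is not symmetric about $s^2$, because the chi-square distribution is skewed.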
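Assumption
Formula
The (1-α)100% confidence interval for the ratio of two population variances, $\frac{\sigma_1^2}{\sigma_2^2}$, is
$\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\,v_1,\,v_2}},\ \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\,v_2,\,v_1}\right)$
The (1-α)100% confidence interval for the ratio of two population standard
deviations, $\frac{\sigma_1}{\sigma_2}$, is
$\left(\sqrt{\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\,v_1,\,v_2}}},\ \sqrt{\frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\,v_2,\,v_1}}\right)$
where $v_1 = n_1 - 1$, $v_2 = n_2 - 1$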
Brand 1 43 48 38 41 51
Brand 2 30 26 37 31 34
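As an illustration of the F-based interval for $\sigma_1^2/\sigma_2^2$, the sketch below uses the two brand samples above. The confidence level is not shown with these data, so 95% is an assumption. The critical value $F_{0.025,\,4,\,4} = 9.60$ is read from an F-table; because $v_1 = v_2 = 4$ here, the same value serves for both limits.

```python
from statistics import variance

brand1 = [43, 48, 38, 41, 51]
brand2 = [30, 26, 37, 31, 34]

# Ratio of sample variances, s1^2 / s2^2.
ratio = variance(brand1) / variance(brand2)

# Table value; assumes a 95% confidence level, v1 = v2 = 4.
F_CRIT = 9.60  # F_{0.025, 4, 4}

# Interval for the ratio of variances.
var_lo, var_hi = ratio / F_CRIT, ratio * F_CRIT

# Interval for the ratio of standard deviations: square roots.
sd_lo, sd_hi = var_lo ** 0.5, var_hi ** 0.5

print(round(ratio, 4), round(var_lo, 4), round(var_hi, 3))
```

With unequal sample sizes the two limits need two different table values, $F_{\alpha/2,\,v_1,\,v_2}$ for the lower limit and $F_{\alpha/2,\,v_2,\,v_1}$ for the upper.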
Example 2.21: The following Minitab output was obtained from two independent
samples selected from two normally distributed populations with unknown and
unequal variances. Show that the lower limit for the 95% confidence interval of
the ratio of variances and standard deviations for the two populations are as given
in the output.