Ch08 3


CHAPTER 8

ESTIMATION OF THE MEAN AND PROPORTION IN ONE POPULATION
8.1 ESTIMATION: AN INTRODUCTION

Definition
The assignment of value(s) to a population
parameter based on a value of the
corresponding sample statistic is called
estimation.
Definition
The value(s) assigned to a population
parameter based on the value of a sample
statistic is called an estimate.
The sample statistic used to estimate a
population parameter is called an
estimator.
The estimation procedure involves the
following steps.
◼ Draw a sample.
◼ Collect the required information from the
members of the sample.
◼ Calculate the value of the sample statistic.
◼ Assign value(s) to the corresponding
population parameter.
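The four steps above can be sketched in Python. This is only an illustration: the population is simulated and entirely hypothetical (the names `population`, `observations`, and `mu_estimate`, and the numbers used to generate the data, are assumptions, not data from this chapter). In practice the population cannot be observed in full, which is why it is estimated.

```python
import random
import statistics

random.seed(8)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated monthly expenditures (dollars).
population = [random.gauss(3000, 900) for _ in range(100_000)]

# Step 1: draw a simple random sample (without replacement).
sample = random.sample(population, 1_000)

# Step 2: collect the required information from the sample members.
# Here each sampled element already is the measurement of interest.
observations = list(sample)

# Step 3: calculate the value of the sample statistic (the sample mean).
x_bar = statistics.fmean(observations)

# Step 4: assign that value to the population parameter as its estimate.
mu_estimate = x_bar

print(f"point estimate of mu: {mu_estimate:.2f}")
print(f"simulated population mean: {statistics.fmean(population):.2f}")
```

The two printed numbers are close but not equal; the gap is the sampling error that the rest of the chapter quantifies.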
A Point Estimate

Definition
The value of a sample statistic that is used
to estimate a population parameter is
called a point estimate.
Example

To find the mean housing expenditure μ per month for all
households in the U.S.:
➢ take a sample of 10,000 households
➢ determine the mean housing expenditure per month for
this sample: x̄ = $2970
➢ use $2970 to approximate μ: μ ≈ $2970

$2970 is a point estimate. X̄ is called the estimator.


Interval Estimation

In interval estimation, an interval is
constructed around the point estimate,
and it is stated that this interval is likely to
contain the corresponding population
parameter.
Confidence Interval

Definition
◼ Each interval is constructed with regard to a given
confidence level and is called a confidence
interval. The confidence interval is given as
Point estimate ± Margin of error
◼ The confidence level associated with a confidence
interval states how much confidence we have that
this interval contains the true population parameter.
◼ The confidence level is denoted by (1 – α)100%.
◼ α is called the significance level.
Example

For the mean housing expenditure μ per month for all
households in the U.S., instead of saying that μ ≈ $2970, we
may obtain an interval
$2970 ± $340 = ($2630, $3310)
and state that the interval contains μ with 95% confidence.

The above procedure is called interval estimation.
($2630, $3310) is called a 95% confidence interval.
$2630 is called the lower limit and $3310 the upper limit.
$340 is called the margin of error.
Comment: Here 1 − α = 0.95, so the significance level is α = 0.05.
8.2 ESTIMATION OF A POPULATION
MEAN: σ KNOWN
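Three Possible Cases
Case I: The sample size is small (n < 30), but the population is (approximately) normally distributed.
Case II: The sample size is large (n ≥ 30).
Case III: The sample size is small (n < 30) and the population is not normally distributed; the normal-based interval does not apply.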
Confidence Interval for μ

The (1 – α)100% confidence interval for μ
under Cases I and II is

\[ \bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}, \quad \text{where } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]

The value of \( z_{\alpha/2} \) used here is obtained from the
standard normal distribution table (Table IV of
Appendix C) with upper tail area α/2 (or lower tail
area 1 – α/2).
Definition
The margin of error for the estimate
of μ, denoted by E, is the quantity that is
subtracted from and added to the value of
x̄ to obtain a confidence interval for μ.
Thus,
\[ E = z_{\alpha/2}\,\sigma_{\bar{X}} \]
\( z_{\alpha/2} \) for a (1 − α)100% confidence level
Finding \( z_{0.025} \) for a 95% confidence level.
Table 8.1 z Values for Commonly Used
Confidence Levels
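As a rough cross-check on table lookups, the z value for a given confidence level can be computed with the inverse normal CDF from Python's standard library. The helper name `z_for_confidence` is an assumption made for this sketch. The computed values are about 1.645, 1.960, and 2.576 for the 90%, 95%, and 99% levels; the worked examples in this chapter use the table-rounded values 1.65 and 2.58.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Return z_{alpha/2}: the z value with upper-tail area alpha/2."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.3f}")
```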
Example 8-1
A publishing company has just published a new
college textbook. Before the company decides the
price at which to sell this textbook, it wants to
know the average price of all such textbooks in the
market. The research department at the company
took a sample of 25 comparable textbooks and
collected information on their prices. This
information produces a mean price of $145 for this
sample. It is known that the standard deviation of
the prices of all such textbooks is $35 and the
population of such prices is normal (needed
because the sample size, 25, is small).
(a) What is the point estimate of the mean
price of all such textbooks?
(b) Construct a 90% confidence interval for
the mean price of all such college
textbooks.
Example 8-1: Solution

a)
n = 25, x̄ = $145, and σ = $35

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{35}{\sqrt{25}} = \$7.00 \]

Point estimate of μ = x̄ = $145
b) The confidence level is 90%, or .90. Here, the
area in each tail of the normal distribution
curve is α/2 = (1 − .90)/2 = .05. Hence, z =
1.65.

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 145 \pm 1.65(7.00) = 145 \pm 11.55 \]

= (145 − 11.55) to (145 + 11.55)
= $133.45 to $156.55
We can say that we are 90% confident that
the mean price of all such college textbooks
is between $133.45 and $156.55.
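The arithmetic of Example 8-1 can be reproduced with a small helper. This is a minimal sketch; the function name `z_interval` and its return layout are assumptions for illustration. The z value is passed in as read from the table (1.65).

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z: float) -> tuple[float, float, float]:
    """Return (lower limit, upper limit, margin of error) for mu when sigma is known."""
    std_error = sigma / math.sqrt(n)   # sigma_xbar = sigma / sqrt(n)
    margin = z * std_error             # E = z * sigma_xbar
    return x_bar - margin, x_bar + margin, margin


# Example 8-1: n = 25, x_bar = 145, sigma = 35, z = 1.65 (90% level)
lower, upper, margin = z_interval(145, 35, 25, 1.65)
print(f"E = {margin:.2f}; interval = {lower:.2f} to {upper:.2f}")
```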
Interpretation of the CI
If we constructed many other CIs from
repeated samples using exactly the same
method, about 90% of those CIs would cover
the true mean price of such textbooks.
Remarks on CI
◼ Confidence level means the coverage rate
under repeated sampling.
◼ We cannot say that the probability is .90
that this interval contains μ: once the interval
is computed, it either contains μ or it does not.
◼ A confidence interval is not a prediction interval
for a single data point.
◼ The CI for a mean remains (approximately) valid
even when the population distribution is not
normal, as long as the sample size is
sufficiently large.
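The "coverage rate under repeated sampling" reading can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the true mean of 150 is hypothetical, the population is taken to be normal, and `coverage_rate` is an illustrative name. σ = 35, n = 25, and z = 1.65 are borrowed from Example 8-1.

```python
import math
import random


def coverage_rate(mu: float, sigma: float, n: int, z: float,
                  repetitions: int, seed: int = 0) -> float:
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw a fresh sample of size n and compute its mean.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does the interval x_bar +/- E built from this sample cover mu?
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / repetitions


# Hypothetical true mean of 150; sigma, n, and z as in Example 8-1.
rate = coverage_rate(mu=150, sigma=35, n=25, z=1.65, repetitions=5000)
print(f"share of intervals covering mu: {rate:.3f}")
```

The printed share should land near .90. Each individual interval either covers μ or misses it; the 90% describes the method, not any single interval.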
Example 8-2

According to a 2013 study by Moebs Services Inc., an
individual checking account at major U.S. banks costs
the banks more than $380 per year. A recent random
sample of 600 such checking accounts produced a
mean annual cost of $500 to major U.S. banks. Assume
that the standard deviation of annual costs to major
U.S. banks of all such checking accounts is $40. Make
a 99% confidence interval for the current mean annual
cost to major U.S. banks of such checking accounts.
Example 8-2: Solution

Given information:
n = 600, x̄ = $500, σ = $40
1 − α = 0.99
Solution:
The Central Limit Theorem applies because n = 600 > 30.
α = 0.01 ⇒ α/2 = 0.005 ⇒ \( z_{0.005} = 2.58 \)
Margin of error:
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{600}} = 1.63299316 \]
\[ E = z_{0.005} \times \sigma_{\bar{X}} = 2.58 \times 1.63299316 = 4.21 \]
99% CI:
x̄ ± E = 500 ± 4.21 = $495.79 to $504.21
Control width of CI

The width of a confidence interval
depends on the size of the margin of
error, \( z_{\alpha/2}\,\sigma_{\bar{X}} \). Hence, the width of a
confidence interval can be controlled using
1. The value of z, which depends on the
confidence level (a higher level gives a larger z and a wider interval)
2. The sample size, n (a larger n gives a narrower interval)
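Both levers can be seen numerically. The sketch below reuses σ = 40 from Example 8-2 and the table z values 1.65, 1.96, and 2.58; the function name `ci_width` and the sample sizes 150 and 2400 are assumptions chosen for illustration.

```python
import math


def ci_width(z: float, sigma: float, n: int) -> float:
    """Width of the interval x_bar +/- z*sigma/sqrt(n), i.e. twice the margin of error."""
    return 2 * z * sigma / math.sqrt(n)


sigma = 40  # population standard deviation from Example 8-2

# Effect of the confidence level (through z) at a fixed n = 600.
for level, z in ((90, 1.65), (95, 1.96), (99, 2.58)):
    print(f"{level}% level, n = 600: width = {ci_width(z, sigma, 600):.2f}")

# Effect of the sample size at a fixed 99% level (z = 2.58).
for n in (150, 600, 2400):
    print(f"99% level, n = {n}: width = {ci_width(2.58, sigma, n):.2f}")
```

Because n enters through its square root, quadrupling the sample size only halves the width.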
Determining n given width of CI

Given the confidence level and the standard
deviation of the population, the sample size that
will produce a predetermined margin of error E of
the confidence interval estimate of μ is
\[ n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}} \]
Example 8-3
An alumni association wants to estimate
the mean debt of this year’s college
graduates. It is known that the population
standard deviation of the debts of this
year’s college graduates is $11,800. How
large a sample should be selected so that
the estimate with a 99% confidence level is
within $800 of the population mean?
Example 8-3: Solution
◼ The maximum size of the margin of error of
estimate is to be $800; that is, E = $800.
◼ The value of z for a 99% confidence level is z =
2.58.
◼ The value of σ is $11,800.

\[ n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}} = \frac{(2.58)^{2} \times (11{,}800)^{2}}{(800)^{2}} = 1448.18 \approx 1449 \]

◼ Thus, the required sample size is 1449 (the result is rounded up).
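The sample-size calculation of Example 8-3 can be sketched as follows. The function name `sample_size_for_mean` is an assumption for this illustration. The result is always rounded up, since rounding down would leave the margin of error slightly above the target.

```python
import math


def sample_size_for_mean(z: float, sigma: float, margin: float) -> int:
    """Smallest n with z*sigma/sqrt(n) <= margin: n = z^2 * sigma^2 / E^2, rounded up."""
    raw = (z * sigma / margin) ** 2
    # round() first to shed floating-point noise, then always round up.
    return math.ceil(round(raw, 8))


# Example 8-3: z = 2.58 (99% level), sigma = 11,800, E = 800
print(sample_size_for_mean(2.58, 11_800, 800))  # prints 1449
```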


8.3 ESTIMATION OF A POPULATION
MEAN: 𝜎 NOT KNOWN
The t Distribution
The t distribution is a specific type of bell-shaped
distribution with a lower height and a wider
spread than the standard normal distribution. As
the sample size becomes larger, the t distribution
approaches the standard normal distribution. The
t distribution has only one parameter, called the
degrees of freedom (df). The mean of the t
distribution is equal to 0 and its standard
deviation is \( \sqrt{df/(df - 2)} \) (defined for df > 2).
Figure 8.5 The t distribution for df = 9 and the
standard normal distribution.
Example 8-4

Find the value of t for 16 degrees of
freedom and .05 area in the right tail of a
t distribution curve.
Table 8.2 Determining t for 16 df and .05 Area in
the Right Tail
Figure 8.6 The value of t for 16 df and .05 area in
the right tail.
Figure 8.7 The value of t for 16 df and .05 area in
the left tail.
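Where the table is not at hand, the value asked for in Example 8-4 can be approximated numerically. The sketch below is not a library routine: it integrates the t density with Simpson's rule and inverts the result by bisection, using only the standard library. The function names are assumptions for illustration, and `math.gamma` overflows once df exceeds roughly 340, so this is suitable only for moderate df.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_right_tail(area: float, df: int) -> float:
    """Value of t with the given area to its right, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > area:
            lo = mid   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Example 8-4: 16 df and .05 area in the right tail
print(round(t_right_tail(0.05, 16), 3))  # prints 1.746
```

By symmetry, the value with .05 area in the left tail is the negative of the right-tail value.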
Confidence Interval for μ Using the t Distribution
The (1 – α)100% confidence interval for μ is

\[ \bar{x} \pm t\,s_{\bar{x}}, \quad \text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

The value of t is obtained from the t distribution
table for df = n – 1 degrees of freedom and the
given confidence level. Here \( t\,s_{\bar{x}} \) is the margin
of error of the estimate.
Example 8-5
Dr. Moore wanted to estimate the mean
cholesterol level for all adult men living in
Hartford. He took a sample of 25 adult men from
Hartford and found that the mean cholesterol level
for this sample is 186 mg/dL with a standard
deviation of 12 mg/dL. Assume that the
cholesterol levels for all adult men in Hartford are
(approximately) normally distributed. Construct a
95% confidence interval for the population mean
μ.
Example 8-5: Solution
◼ σ is not known, n < 30, and the
population is normally distributed (Case I)
◼ Use the t distribution to make a
confidence interval for μ
◼ n = 25, x̄ = 186, s = 12, and confidence level = 95%

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{12}{\sqrt{25}} = 2.40 \]
◼ df = n – 1 = 25 – 1 = 24
◼ Area in each tail = (1 – 0.95)/2 = .025
◼ The value of t in the right tail is 2.064

\[ \bar{x} \pm t\,s_{\bar{x}} = 186 \pm 2.064(2.40) = 186 \pm 4.95 \]

= 181.05 to 190.95
◼ Thus, we can state with 95% confidence
that the mean cholesterol level for all
adult men living in Hartford lies between
181.05 and 190.95 mg/dL.
Figure 8.8 The value of t.
Example 8-6
Sixty-four randomly selected adults who buy
books for general reading were asked how much
they usually spend on books per year. The
sample produced a mean of $1450 and a
standard deviation of $300 for such annual
expenses. Determine a 99% confidence interval
for the corresponding population mean.
Example 8-6: Solution
◼ σ is not known, n > 30 (Case II)
◼ Use the t distribution to make a
confidence interval for μ
◼ n = 64, x̄ = $1450, s = $300, and confidence level = 99%

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{300}{\sqrt{64}} = \$37.50 \]
◼ df = n – 1 = 64 – 1 = 63
◼ Area in each tail = (1 – 0.99)/2 = .005
◼ The value of t in the right tail is 2.656

\[ \bar{x} \pm t\,s_{\bar{x}} = \$1450 \pm 2.656(37.50) = \$1450 \pm \$99.60 \]

= $1350.40 to $1549.60
◼ Thus, we can state with 99% confidence
that, based on this sample, the mean
annual expenditure on books by all
adults who buy books for general
reading is between $1350.40 and
$1549.60.
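Examples 8-5 and 8-6 follow the same recipe, which can be sketched as one helper. The name `t_interval` is an assumption for this illustration, and the t values (2.064 and 2.656) are passed in as read from the t table rather than computed.

```python
import math


def t_interval(x_bar: float, s: float, n: int, t: float) -> tuple[float, float, float]:
    """Return (lower limit, upper limit, margin of error) for mu when sigma is unknown."""
    std_error = s / math.sqrt(n)   # s_xbar = s / sqrt(n)
    margin = t * std_error         # E = t * s_xbar
    return x_bar - margin, x_bar + margin, margin


# Example 8-5: n = 25, x_bar = 186, s = 12, t = 2.064 (df = 24, 95% level)
low5, high5, e5 = t_interval(186, 12, 25, 2.064)
print(f"Example 8-5: E = {e5:.2f}; {low5:.2f} to {high5:.2f} mg/dL")

# Example 8-6: n = 64, x_bar = 1450, s = 300, t = 2.656 (df = 63, 99% level)
low6, high6, e6 = t_interval(1450, 300, 64, 2.656)
print(f"Example 8-6: E = {e6:.2f}; {low6:.2f} to {high6:.2f} dollars")
```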
What If the Sample Size Is Too Large?

When n is so large that df = n – 1 lies beyond the
rows of the t distribution table (Table V), either:
1. Use the t value from the last row (the
row of ∞) in Table V.
2. Use the normal distribution as an
approximation to the t distribution.
8.4 ESTIMATION OF A POPULATION
PROPORTION: LARGE SAMPLES
Properties of the sampling distribution of p̂:
◼ Approximately normal for a large sample
◼ \( \mu_{\hat{p}} = p \)
◼ \( \sigma_{\hat{p}} = \sqrt{pq/n} \), where q = 1 − p, given that n/N ≤ 0.05

Note: The sample is considered to be large if
np > 5 and nq > 5. If p and q are not known,
then np̂ > 5 and nq̂ > 5 must be true for the
sample to be large.
Estimator of the Standard Deviation of p̂

The value of \( s_{\hat{p}} \), which gives a point
estimate of \( \sigma_{\hat{p}} \), is calculated as follows.
Here, \( s_{\hat{p}} \) is an estimator of \( \sigma_{\hat{p}} \).

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} \]

Note: n/N ≤ 0.05 must hold true to use this formula.
Confidence Interval for the Population
Proportion, p

The (1 – α)100% confidence interval for the
population proportion, p, is

\[ \hat{p} \pm z\,s_{\hat{p}} \]

The value of z used here is obtained from the
standard normal distribution table for the given
confidence level, and \( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \). The term \( z\,s_{\hat{p}} \)
is called the margin of error, E.
Example 8-7

According to a survey by Pew Research
Center in June 2009, 44% of people aged 18
to 29 years said that religion is very
important to them. Suppose this result is
based on a sample of 1000 people aged 18
to 29 years.
a) What is the point estimate of the
population proportion?
b) Find, with a 99% confidence level, the
percentage of all people aged 18 to 29
years who will say that religion is very
important to them. What is the margin
of error of this estimate?
Example 8-7: Solution

◼ n = 1000, p̂ = .44, and q̂ = .56

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(.44)(.56)}{1000}} = .01569713 \]

◼ Note that np̂ and nq̂ are both greater
than 5.
a)
Point estimate of p = p̂ = .44
b)
The confidence level is 99%, or .99. z = 2.58.

\[ \hat{p} \pm z\,s_{\hat{p}} = .44 \pm 2.58(.01569713) = .44 \pm .04 \]

= .40 to .48 or 40% to 48%

Margin of error = \( \pm 2.58\,s_{\hat{p}} \)
= ± 2.58(.01569713)
= ± .04 or ± 4%
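The steps of Example 8-7 (large-sample check, standard error, margin of error, interval) can be collected in one sketch. The function name `proportion_interval` and its return layout are assumptions for illustration; z = 2.58 is the table value used above.

```python
import math


def proportion_interval(p_hat: float, n: int, z: float) -> tuple[float, float, float, float]:
    """Large-sample CI for p: p_hat +/- z * sqrt(p_hat * q_hat / n).

    Returns (lower limit, upper limit, margin of error, standard error).
    """
    q_hat = 1 - p_hat
    # Large-sample condition from this section: n*p_hat > 5 and n*q_hat > 5.
    if not (n * p_hat > 5 and n * q_hat > 5):
        raise ValueError("sample is not large enough for the normal approximation")
    std_error = math.sqrt(p_hat * q_hat / n)   # s_p_hat
    margin = z * std_error                     # E = z * s_p_hat
    return p_hat - margin, p_hat + margin, margin, std_error


# Example 8-7: n = 1000, p_hat = .44, z = 2.58 (99% level)
lower, upper, margin, std_error = proportion_interval(0.44, 1000, 2.58)
print(f"s_p_hat = {std_error:.8f}")
print(f"E = {margin:.2f}; interval = {lower:.2f} to {upper:.2f}")
```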
DETERMINING THE SAMPLE SIZE FOR
THE ESTIMATION OF PROPORTION
Given the confidence level and the values
of p̂ and q̂, the sample size that will
produce a predetermined margin of
error E of the confidence interval
estimate of p is

\[ n = \frac{z^{2}\,\hat{p}\hat{q}}{E^{2}} \]
In case the values of p̂ and q̂ are not known:
1. We make the most conservative estimate
of the sample size n by using p̂ = .5 and
q̂ = .5. This is conservative because the product
p̂q̂ reaches its maximum, .25, at p̂ = .5.
2. We take a preliminary sample (of
arbitrarily determined size) and calculate
p̂ and q̂ from this sample. Then we use
these values to find n.
Example 8-9

Lombard Electronics Company has just installed a
new machine that makes a part that is used in
clocks. The company wants to estimate the
proportion of these parts produced by this
machine that are defective. The company
manager wants this estimate to be within .02 of
the population proportion for a 95% confidence
level. What is the most conservative estimate of
the sample size that will limit the maximum error
to within .02 of the population proportion?
Example 8-9: Solution
◼ The value of z for a 95% confidence level
is 1.96.
◼ p̂ = .50 and q̂ = .50

\[ n = \frac{z^{2}\,\hat{p}\hat{q}}{E^{2}} = \frac{(1.96)^{2}(.50)(.50)}{(.02)^{2}} = 2401 \]

◼ Thus, if the company takes a sample of
2401 parts, there is a 95% chance that
the estimate of p will be within .02 of the
population proportion.
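The calculation in Example 8-9 can be sketched as below. The function name `sample_size_for_proportion` and the default `p_hat = 0.5` (the conservative choice) are assumptions for this illustration. The second call uses a hypothetical preliminary estimate of .3 purely to show that any other value of p̂ gives a smaller n.

```python
import math


def sample_size_for_proportion(z: float, margin: float, p_hat: float = 0.5) -> int:
    """n = z^2 * p_hat * q_hat / E^2, rounded up; p_hat = 0.5 is the conservative default."""
    raw = z ** 2 * p_hat * (1 - p_hat) / margin ** 2
    # round() first to shed floating-point noise, then always round up.
    return math.ceil(round(raw, 8))


# Example 8-9: z = 1.96 (95% level), E = .02, conservative p_hat = .5
print(sample_size_for_proportion(1.96, 0.02))        # prints 2401

# Hypothetical preliminary estimate p_hat = .3 needs a smaller sample.
print(sample_size_for_proportion(1.96, 0.02, 0.3))   # prints 2017
```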
