Credit Sessions 5 & 6
CONFIDENCE INTERVAL ESTIMATION
Point Estimator
Interval Estimator
Confidence Interval
• Consider the following statements:
• x̄ = 550
A single-valued estimate that conveys little information
about the actual value of the population mean.
We are 99% confident that μ is in the interval
[449,551]
An interval estimate which locates the population mean
within a narrow interval, with a high level of
confidence.
We are 90% confident that μ is in the interval
[400,700]
An interval estimate which locates the population mean
within a broader interval, with a lower level of
confidence.
Point and Interval Estimates
A point estimate is a single-valued estimate, a single
element chosen from a sampling distribution.
Doesn’t reflect the effects of larger sample sizes
Conveys little information about actual value of the
population parameter, about accuracy of estimate.
A confidence interval provides additional information
about the variability of the estimate
Provides amount of uncertainty associated with a point
estimate of a population parameter
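[Figure: an interval running from the Lower Confidence Limit
through the Point Estimate to the Upper Confidence Limit; the
distance between the two limits is the width of the
confidence interval]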
Confidence Interval
A confidence interval or interval estimate is a range or
interval of numbers believed to include an unknown
population parameter.
Associated with the interval is a measure of the
confidence we have that the interval does indeed
contain the parameter of interest.
A confidence interval or interval estimate has two
components:
A range or interval of values
An associated level of confidence
Confidence interval is constructed as:
Point estimate ± Margin of error.
Margin of error accounts for the variability of the
estimator and the desired confidence level of the interval
Confidence Interval Example
In practice we only take one sample of size n
Confidence Intervals
  Population Mean
    σ Known
    σ Unknown
  Population Proportion
Confidence Interval for Population
Mean µ when Population S.D. σ Is Known
If the population distribution is normal, the sampling
distribution of the mean is normal (Normal Theorem)
X̄ ~ N(µ, σ²/n)
Z = (X̄ - µ) / (σ/√n) ~ N(0, 1)
A (1 - α)100% Confidence Interval for μ
z_{α/2} is defined as the z value that cuts off a right-tail
area of α/2 under the standard normal curve. (1 - α) is called
the confidence coefficient, α is called the error probability,
and (1 - α)100% is called the confidence level.
P(z > z_{α/2}) = α/2
P(z < -z_{α/2}) = α/2
P(-z_{α/2} < z < z_{α/2}) = (1 - α)
(1 - α)100% Confidence Interval: x̄ ± z_{α/2} σ/√n
[Figure: Standard Normal Distribution, f(z) against z, with
central area (1 - α) between -z_{α/2} and z_{α/2} and area α/2
in each tail]
Confidence Interval for μ (σ Known)
Assumptions
Population standard deviation σ is known
With α/2 = 0.025 in each tail, P(-1.96 < Z < 1.96) = 0.95.
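95% Intervals around Sample Mean
Approximately 95% of the intervals x̄ ± 1.96 σ/√n around the
sample mean can be expected to include the actual value of the
population mean, µ. (When the sample mean falls within the 95%
interval around the population mean.)
*5% of such intervals around the sample mean can be expected
not to include the actual value of the population mean. (When
the sample mean falls outside the 95% interval around the
population mean.)
[Figure: sampling distribution of x̄ with 2.5% in each tail
beyond µ ± 1.96 σ/√n; sample means are marked x, and those
whose intervals miss µ are marked *]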
Critical Values of z and Levels of Confidence
(1 - α)    α/2      z_{α/2}
0.99       0.005    2.576
0.95       0.025    1.960
0.80       0.100    1.282
[Figure: Standard Normal Distribution with central area
(1 - α) between -z_{α/2} and z_{α/2}]
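The critical values listed above can be reproduced numerically. The following is a minimal sketch using Python's standard library; the function name z_critical is an illustrative choice, not something defined in these slides.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2}: the z value with right-tail area alpha/2."""
    alpha = 1.0 - confidence_level
    # inv_cdf takes the area to the LEFT of the point, hence 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Print (1 - alpha), alpha/2 and z_{alpha/2} for the levels in the table.
for level in (0.99, 0.95, 0.80):
    print(f"{level:.2f}  {(1 - level) / 2:.3f}  {z_critical(level):.3f}")
```

Any other confidence level can be passed in the same way, which avoids interpolating in a printed table.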
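Confidence Interval of the Population
Mean When σ Is Known
Interpreting a Confidence Interval
Interpreting a confidence interval requires care.
Incorrect: The probability that µ falls in the
interval is 0.95.
Correct: If many samples of the same size were drawn and an
interval were computed from each, about 95% of those
intervals would contain µ.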
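The long-run reading of a confidence level can be checked with a small simulation. This is only a sketch under assumed values: the population mean of 100, standard deviation of 15, sample size of 25, trial count and seed are all hypothetical choices made for illustration.

```python
import random
from math import sqrt


def coverage_rate(mu, sigma, n, z=1.96, trials=4000, seed=0):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    half_width = z * sigma / sqrt(n)     # margin of error with sigma known
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 100, sigma = 15, samples of size 25.
rate = coverage_rate(mu=100.0, sigma=15.0, n=25)
print(f"Fraction of intervals containing mu: {rate:.3f}")
```

Here µ stays fixed while the interval changes from sample to sample, which is why the 95% describes the procedure rather than any single computed interval.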
Example …
Need to estimate the mean demand over lead time with 95%
confidence in order to set inventory levels…
The parameter to be estimated is the population mean µ.
Confidence interval estimator will be: x̄ ± z_{α/2} σ/√n
Given: z_{α/2} = 1.96, σ = 75, n = 25
Confidence Interval of the Population
Mean When σ Is Known - Example 1
A sample of 25 cereal boxes yields a mean
weight of 1.02 kgs of cereal per box.
x̄ ± 1.96 σ/√n = 1.02 ± 1.96 (0.03/√25) = 1.02 ± 0.012
[Figure: standard normal curve with central area 0.95 between
-1.96 and 1.96 (0.475 on each side of 0) and 0.025 in each
tail]
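The cereal-box calculation can be sketched in a few lines of Python. The function name z_interval is hypothetical, and σ = 0.03 is taken from the arithmetic shown in the example.

```python
from math import sqrt


def z_interval(sample_mean, sigma, n, z=1.96):
    """Confidence interval for the mean when sigma is known: x_bar +/- z*sigma/sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Cereal example: n = 25, x_bar = 1.02 kg, sigma = 0.03 kg, 95% confidence.
lower, upper = z_interval(sample_mean=1.02, sigma=0.03, n=25)
print(f"{lower:.3f} kg to {upper:.3f} kg")
```

The default z = 1.96 corresponds to 95% confidence; a different level only requires passing a different critical value.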
SOLUTION TO EXAMPLE 2
95% confidence interval for population mean:
X̄ - (1.96)(σ/√n) ≤ μ ≤ X̄ + (1.96)(σ/√n)
Problem 2:
For a fixed sample size, what is the value of the true
population proportion P that maximizes the variance of the
sample proportion p?
Problem 3:
The width of a 95% confidence interval for population
mean µ is 10 units. If everything else stays the same, how
wide would a 90% confidence interval be for µ?
Problem 4:
Suppose you have a confidence interval based on a
sample of size n. Using the same level of confidence, how
Do You Ever Truly Know σ?
Probably not!
The variance of t is greater than 1, but approaches 1 as the
number of degrees of freedom increases.
t-distributions are bell-shaped and symmetric, but have
‘fatter’ tails than the normal.
[Figure: density curves centered at 0 for the Standard Normal
(t with df = ∞), t (df = 13), and t (df = 5)]
Student’s t Table
Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z (∞ d.f.)
Note: t → Z as n increases
The t Distribution
df    t0.100   t0.050   t0.025   t0.010   t0.005
---   ------   ------   ------   ------   ------
1     3.078    6.314    12.706   31.821   63.657
2     1.886    2.920    4.303    6.965    9.925
3     1.638    2.353    3.182    4.541    5.841
4     1.533    2.132    2.776    3.747    4.604
5     1.476    2.015    2.571    3.365    4.032
6     1.440    1.943    2.447    3.143    3.707
7     1.415    1.895    2.365    2.998    3.499
8     1.397    1.860    2.306    2.896    3.355
9     1.383    1.833    2.262    2.821    3.250
10    1.372    1.812    2.228    2.764    3.169
11    1.363    1.796    2.201    2.718    3.106
12    1.356    1.782    2.179    2.681    3.055
13    1.350    1.771    2.160    2.650    3.012
14    1.345    1.761    2.145    2.624    2.977
15    1.341    1.753    2.131    2.602    2.947
16    1.337    1.746    2.120    2.583    2.921
17    1.333    1.740    2.110    2.567    2.898
18    1.330    1.734    2.101    2.552    2.878
19    1.328    1.729    2.093    2.539    2.861
20    1.325    1.725    2.086    2.528    2.845
21    1.323    1.721    2.080    2.518    2.831
22    1.321    1.717    2.074    2.508    2.819
23    1.319    1.714    2.069    2.500    2.807
24    1.318    1.711    2.064    2.492    2.797
25    1.316    1.708    2.060    2.485    2.787
26    1.315    1.706    2.056    2.479    2.779
27    1.314    1.703    2.052    2.473    2.771
28    1.313    1.701    2.048    2.467    2.763
29    1.311    1.699    2.045    2.462    2.756
30    1.310    1.697    2.042    2.457    2.750
40    1.303    1.684    2.021    2.423    2.704
60    1.296    1.671    2.000    2.390    2.660
120   1.289    1.658    1.980    2.358    2.617
∞     1.282    1.645    1.960    2.326    2.576
[Figure: t Distribution with df = 10, f(t) against t; area
0.10 in each tail beyond ±1.372 and area 0.025 in each tail
beyond ±2.228]
Whenever σ is not known (and the population is assumed
normal), the correct distribution to use is the t
distribution with n-1 degrees of freedom. For large degrees
of freedom, the t distribution is approximated well by the Z
distribution.
Confidence Interval for μ
(σ Unknown)
Assumptions
Population standard deviation is unknown
Population is normally distributed
If population is not normal, use large sample
Use Student’s t Distribution
Confidence Interval Estimate:
X̄ ± t_{α/2} S/√n
(where t_{α/2} is the critical value of the t distribution
with n - 1 degrees of freedom that cuts off an area of α/2
in each tail)
Confidence Interval of the Population
Mean µ When σ Is Unknown - Example
Need to estimate mean mpg of all ultra-green cars.
Use the sample information to construct a 90%
confidence interval of the population mean.
Assume that mpg follows a normal distribution.
Solution: Since the population SD is not known,
the sample SD has to be computed from the
sample.
Sample Mean:96.52, Sample SD: 10.70, Sample
size: 25
As a result, the 90% confidence interval is
x̄ ± t_{α/2,df} s/√n = 96.52 ± 1.711 (10.70/√25) = 96.52 ± 3.66
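The mpg interval can be sketched as follows. The function name t_interval is hypothetical, and the critical value is passed in by hand (1.711 is the t0.050 entry for 24 degrees of freedom in the t table) because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt


def t_interval(sample_mean, sample_sd, n, t_crit):
    """Confidence interval for the mean when sigma is unknown: x_bar +/- t*s/sqrt(n).

    t_crit must be looked up for n - 1 degrees of freedom and the
    desired tail area alpha/2.
    """
    margin = t_crit * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# mpg example: x_bar = 96.52, s = 10.70, n = 25, 90% confidence, df = 24.
lower, upper = t_interval(96.52, 10.70, 25, t_crit=1.711)
print(f"{lower:.2f} to {upper:.2f}")
```

The same function covers any other t-based interval once the matching critical value has been read from the table.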
Example 3
A stock market analyst wants to estimate the average return on
a certain stock. A random sample of 15 days yields an average
(annualized) return of x̄ = 10.37% and a standard deviation of
s = 3.5%. Assuming a normal population of returns, give a 95%
confidence interval for the average return on this stock.
The critical value of t for df = (n - 1) = (15 - 1) = 14 and a
right-tail area of 0.025 is: t_{0.025} = 2.145
df    t0.100   t0.050   t0.025   t0.010   t0.005
---   ------   ------   ------   ------   ------
1     3.078    6.314    12.706   31.821   63.657
.     .        .        .        .        .
13    1.350    1.771    2.160    2.650    3.012
14    1.345    1.761    2.145    2.624    2.977
15    1.341    1.753    2.131    2.602    2.947
.     .        .        .        .        .
The corresponding confidence interval or interval estimate is:
x̄ ± t_{0.025} s/√n = 10.37 ± 2.145 (3.5/√15)
= 10.37 ± 1.94
= [8.43, 12.31]
Confidence Intervals for the
Population Proportion, P
P = proportion of successes in the population,
where success is defined by a particular outcome.
p is the point estimator of population proportion P
By the central limit theorem, p can be approximated by
a normal distribution for large samples (i.e., nP > 5
and n(1 - P) > 5) with mean µ_p = P
and SD σ_p = √(P(1 - P)/n)
Confidence interval estimate:
p ± Z_{α/2} √(p(1 - p)/n)
where
Z_{α/2} is the standard normal value for the
level of confidence desired
p is the sample proportion
n is the sample size
Note: must have nP > 5 and n(1 - P) > 5
Confidence Interval of the Population
Proportion- Example:
Need to estimate the proportion of all ultra-green
cars that obtain over 100 mpg.
Use the sample information to construct a 90%
confidence interval of the population proportion.
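Solution: Given n = 25, p = 7/25 = 0.28, and z_{0.05} = 1.645
for 90% confidence:
p ± z_{α/2} √(p(1 - p)/n) = 0.28 ± 1.645 √((0.28)(0.72)/25)
= 0.28 ± 0.148
= [0.132, 0.428]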
For a sample of n = 100 with sample proportion p = 0.34, the
95% confidence interval is:
p ± z_{α/2} √(pq/n) = 0.34 ± 1.96 √((0.34)(0.66)/100)
= 0.34 ± (1.96)(0.04737)
= 0.34 ± 0.0928
= [0.2472, 0.4328]
The firm may be 95% confident that foreign manufacturers
control anywhere from 24.72% to 43.28% of the market.
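The market-share interval can be reproduced with a short sketch. The function name proportion_interval is hypothetical, and the large-sample check uses the sample proportion as a stand-in for the unknown P, which is an assumption made here for illustration.

```python
from math import sqrt


def proportion_interval(p_hat, n, z=1.96):
    """Large-sample confidence interval for a proportion: p +/- z*sqrt(p(1-p)/n)."""
    # Rule of thumb from the slides, applied with p_hat in place of P.
    if n * p_hat <= 5 or n * (1 - p_hat) <= 5:
        raise ValueError("sample too small for the normal approximation")
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Market-share example: p = 0.34, n = 100, 95% confidence.
lower, upper = proportion_interval(0.34, 100)
print(f"{lower:.4f} to {upper:.4f}")
```

Raising an error when the rule of thumb fails is one possible design; a caller might instead prefer a warning, or an exact method for small samples.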
Selecting a Useful Sample Size
Precision in interval estimates is implied
by a low margin of error.
[Figure: confidence interval with its bound, E, marked]
Sample Size and Standard Error
The sample size determines the bound of a
statistic, since the standard error of a statistic
shrinks as the sample size increases:
[Figure: sampling distributions of a statistic for sample
size n and sample size 2n; the standard error of the
statistic is smaller for sample size 2n]
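How much the standard error shrinks when the sample size doubles can be shown with a quick calculation for the sample mean. The values σ = 10 and n = 50 below are hypothetical and chosen only for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


sigma = 10.0   # hypothetical population standard deviation
n = 50         # hypothetical sample size

se_n = standard_error(sigma, n)
se_2n = standard_error(sigma, 2 * n)
ratio = se_2n / se_n   # equals 1/sqrt(2) whatever sigma and n are

print(f"SE at n: {se_n:.3f}, SE at 2n: {se_2n:.3f}, ratio: {ratio:.3f}")
```

Doubling the sample cuts the standard error by a factor of √2, not by half, so halving the bound requires four times the sample size.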
Sampling Error
The required sample size can be found to
reach a desired margin of error (E) with a
specified level of confidence (1 - α)
For the Mean:
Confidence interval: X̄ ± Z_{α/2} σ/√n
Sampling error (margin of error): E = Z_{α/2} σ/√n
Determining Sample Size
For the Mean:
E = Z_{α/2} σ/√n
Now solve for n to get
n = (Z_{α/2})² σ² / E²
Determining Sample Size
To determine the required sample size for
the mean, we must know:
Solution:
Given σ = 500, E = ±100
Z = 2.58 for a 99% confidence level
n = (Z_{α/2})² σ² / E²
n = (2.58)² (500)² / (100)² = 166.41, rounded up to 167 so that
the margin of error is not exceeded
Sample-Size Determination:
Example-6
A marketing research firm wants to conduct a survey to
estimate the average amount spent on entertainment by each
person visiting a popular resort. The people who plan the
survey would like to determine the average amount spent by
all people visiting the resort to within $120, with 95%
confidence. From past operation of the resort, an estimate of
the population standard deviation is $400. What is the
minimum required sample size?
n = (z_{α/2})² σ² / E²
= (1.96)² (400)² / 120²
= 42.684 ≈ 43
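The sample-size formula for the mean can be sketched as a small helper. The function name is hypothetical; the result is rounded up because any fractional requirement needs one more whole observation.

```python
from math import ceil


def sample_size_for_mean(z, sigma, margin):
    """Smallest n with z*sigma/sqrt(n) <= margin, i.e. n >= (z*sigma/margin)**2."""
    return ceil((z * sigma / margin) ** 2)


# Resort example: within $120 at 95% confidence, sigma estimated at $400.
n_required = sample_size_for_mean(z=1.96, sigma=400, margin=120)
print(n_required)
```

Because σ here is itself an estimate from past operations, the computed n is best read as a planning figure rather than an exact requirement.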
Determining Sample Size
For the Proportion:
E = Z_{α/2} √(P(1 - P)/n)
Now solve for n to get
n = (Z_{α/2})² P(1 - P) / E²
Determining Sample Size
To determine the required sample size for the
proportion, we must know:
than 0.10.
How large a sample do we need for his analysis
of the population proportion?
n = (z_{α/2})² P(1 - P) / E²
= 124.42 ≈ 125
EXAMPLE-7
A company manufacturing sports goods wants
to estimate the proportion of cricket players
among high school students in India.
The company wants the estimate to be within
±0.03 with a confidence level of 99%.
A pilot study done earlier reveals that out of 80
school students, 36 students play cricket.
What should be the sample size for this study?
Solution:
Given P = 36/80 = 0.45, E = ±0.03, and
Z = 2.58 for 99% confidence level
Using n = P(1 - P)(Z²/E²),
we get n = (0.45)(0.55)(2.58)² / (0.03)²
n = 1830.51 ≈ 1831
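The same calculation can be sketched in Python. The function name is hypothetical, and the pilot-study proportion 36/80 is used as the planning value for P, as in the solution above.

```python
from math import ceil


def sample_size_for_proportion(z, p, margin):
    """Smallest n with z*sqrt(p(1-p)/n) <= margin, i.e. n >= z^2 * p(1-p) / margin^2."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Cricket example: pilot estimate P = 36/80, E = 0.03, Z = 2.58 (99% confidence).
n_required = sample_size_for_proportion(z=2.58, p=36 / 80, margin=0.03)
print(n_required)
```

The required n depends on the planning value of P through P(1 - P), so a pilot estimate closer to 0.5 would call for a larger sample.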
Problems
1. The director of a market research agency
wishes to study the reach of a particular
advertising campaign.
He is concerned with the percentage of the
target market that has seen at least a portion of
the campaign.
The director does not think that the figure will
exceed 25%.
What should be the sample size for this study if
the director wishes the estimate to be within
three percentage points of the true value and
95% confidence level is specified?
2. The average travel time to reach the office, based on a
random sample of 15 people working in a company, is 45
minutes with an S.D. of 9 minutes. Assuming normality,
establish the 95% confidence interval for the mean travel
time of everyone in the office.
28% exported
Market share:
HUL: 19%
Tata Tea: 18%
Sales Figure Tata Tea
Year   Sales (in Million rupees)
1995   3993.2
1996   5196.9
1997   6921.9
1998   8719
1999   8762
2000   9136.5
2001   8244.4
2002   7628.2
2003   7484.3
2004   7775.3
2005   8932.7
2006   9710.1
2007   10563.8
CAGR = [(Sales2007 / Sales1995)^(1/Yrs) - 1] * 100
= 0.084447 * 100
≈ 8.44%
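The CAGR arithmetic can be checked with a short sketch. The function name cagr_percent is hypothetical, and the number of years is taken as 2007 - 1995 = 12, which is an assumption about what "Yrs" means in the formula.

```python
def cagr_percent(first_value, last_value, years):
    """Compound annual growth rate in percent: ((last/first)**(1/years) - 1) * 100."""
    return ((last_value / first_value) ** (1 / years) - 1) * 100


# Tata Tea sales in million rupees, 1995 and 2007, from the table above.
growth = cagr_percent(3993.2, 10563.8, years=2007 - 1995)
print(f"CAGR: {growth:.2f}%")
```

CAGR uses only the first and last values, so the dip in sales between 2001 and 2003 does not affect the result.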
Production Graph
[Figure: Sales in Million Rupees (axis 0 to 12000) by year,
1995 to 2007]
Production Graph 2
[Figure: Sales in Million Rupees (axis 0 to 12000) by year,
1995 to 2007]
• Suppose Tata Tea decides to assess its employees’ job
satisfaction on the basis of a brief survey that includes five
questions.
• Response of 150 employees randomly surveyed is as shown
in Table.
• Use α = 0.05, to estimate the population’s positive
response to these questions on the basis of the sample
responses.
The following data are given:
• The research team has randomly selected 150 employees,
thus sample size n = 150
• Here the sample statistic is the sample proportion
• α = 0.05
• Sample Proportion: 0 ≤ p ≤ 1
p ± Z_{α/2} √(p(1 - p)/n)
• where
– Z_{α/2} is the standard normal value for the level of
confidence desired
– p is the sample proportion
– n is the sample size
– α = 0.05 is the level of significance
SL No  Statement                                                    Yes   No    Sample proportion of positive responses
1      I am proud to work for my company                            110   40    110/150 = 0.733
2      HR Policies for promotion are fair                           120   30    120/150 = 0.8
3      Seniors are co-operative and helpful                         105   45    105/150 = 0.7
4      I will leave my company in case a better opportunity arises  25    125   25/150 = 0.167
5      My company follows a fair compensation structure             40    110   40/150 = 0.267
Question No  Sample Proportion (p)  Computation                          Confidence Limits for Population Proportion (P)
1            110/150                0.733 ± 1.96 √((0.733)(0.267)/150)   0.663 ≤ P ≤ 0.804
2            120/150                0.8 ± 1.96 √((0.8)(0.2)/150)         0.736 ≤ P ≤ 0.864
3            105/150                0.7 ± 1.96 √((0.7)(0.3)/150)         0.627 ≤ P ≤ 0.773
4            25/150                 0.167 ± 1.96 √((0.167)(0.833)/150)   0.108 ≤ P ≤ 0.226
5            40/150                 0.267 ± 1.96 √((0.267)(0.733)/150)   0.197 ≤ P ≤ 0.337
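The five survey intervals can be recomputed in one pass from the Yes counts. This is a sketch with hypothetical variable and function names; it uses unrounded proportions, so the last decimal may differ slightly from the tabulated limits.

```python
from math import sqrt

# Yes counts per question, from the survey table; n = 150 respondents.
yes_counts = {1: 110, 2: 120, 3: 105, 4: 25, 5: 40}
n = 150
z = 1.96  # critical value for alpha = 0.05


def proportion_limits(yes, n, z):
    """Return (lower, upper) for p +/- z*sqrt(p(1-p)/n) with p = yes/n."""
    p = yes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


limits = {q: proportion_limits(yes, n, z) for q, yes in yes_counts.items()}
for q, (low, high) in limits.items():
    print(f"Q{q}: {low:.3f} <= P <= {high:.3f}")
```

Working from counts rather than pre-rounded proportions keeps rounding error out of the margin of error.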
Analysis: Based on the survey conducted amongst 150
employees of Tata Tea, we can say with 95% level of
confidence that the percentage of positive response for
all the employees of Tata Tea to the below questions
would lie between the following limits:
Question                                                     Response  Percentage of Total Population
I am proud to work for my company                            Yes       66.3% to 80.4%
HR Policies for promotion are fair                           Yes       73.6% to 86.4%
Seniors are co-operative and helpful                         Yes       62.7% to 77.3%
I will leave my company in case a better opportunity arises  Yes       10.8% to 22.6%
(1 - α)100% CI for population responses:
X̄ ± t_{α/2} S/√n
(where t_{α/2} is the critical value of the t distribution with n - 1
degrees of freedom that cuts off an area of α/2 in each tail)
Sl. No  Statement                                                   Mean (X̄)  SD    X̄ - t_{α/2}(S/√n)               X̄ + t_{α/2}(S/√n)
1       There are good opportunities for growth in my organization  3.45      0.90  3.45 - 2.145(0.90/√15) = 2.95   3.45 + 2.145(0.90/√15) = 3.95
2       My job matches my qualification and experience              4.10      0.80  4.10 - 2.145(0.80/√15) = 3.66   4.10 + 2.145(0.80/√15) = 4.54
Analysis of the above data:
[Figure: 95% confidence intervals for the mean responses to
Q1 to Q5 on the 1 to 5 response scale; for example, Q1 spans
2.95 to 3.95 and Q2 spans 3.66 to 4.54]
1 = Strongly disagree  2 = Disagree  3 = Neutral  4 = Agree
5 = Strongly Agree
Analysis: Based on the survey conducted amongst 15 managers of
Tata Tea, we can say with 95% level of confidence about job
satisfaction levels of managers who have joined the organization in
the last five years, that average population’s response to the below
questions would lie between