Estimation and Confidence Intervals
Introduction
This chapter considers several important aspects of sampling. We begin by
studying point estimates. A point estimate is a single value (point) derived
from a sample and used to estimate a population value.
The sample mean, X̅, is not the only point estimate of a population parameter. For
example, p, a sample proportion, is a point estimate of π, the population proportion; and
s, the sample standard deviation, is a point estimate of σ, the population standard
deviation.
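The three point estimates named above can be computed directly from a sample. The sketch below uses only the Python standard library; the data values and the 15-hour cutoff used for the proportion are hypothetical and are not taken from the text.

```python
from statistics import mean, stdev

# Hypothetical sample: weekly hours worked by eight students (illustration only).
hours = [12, 20, 15, 0, 25, 18, 10, 20]

x_bar = mean(hours)   # sample mean, a point estimate of the population mean
s = stdev(hours)      # sample standard deviation (n - 1 divisor), estimates sigma

# Sample proportion: share of students who worked at least 15 hours (assumed trait).
p = sum(1 for h in hours if h >= 15) / len(hours)   # estimates pi

print(f"x_bar = {x_bar}, s = {s:.2f}, p = {p}")
```

Each result is a single number, which is why it is called a point estimate; it says nothing yet about how close it is likely to be to the population value.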
Confidence Intervals for a Population Mean
For example, we estimate the mean yearly income for construction workers in the New
York–New Jersey area is $85,000. The range of this estimate might be from $81,000 to
$89,000. We can describe how confident we are that the population parameter is in the
interval by making a probability statement. We might say, for instance, that we are 90
percent sure that the mean yearly income of construction workers in the New York–
New Jersey area is between $81,000 and $89,000.
Population Standard Deviation Known σ
A confidence interval is computed using two statistics: the sample mean, X̅, and the
standard deviation. The sample mean is the center of the interval, and the standard
deviation is used to compute the range of the interval.
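As a rough illustration, the interval for a known σ can be sketched as below. This assumes the usual form X̅ ± zσ/√n, which the text refers to but does not show. The function name z_interval and the input values (σ = 12,000 and n = 25) are hypothetical choices for the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: sample mean 85,000, known sigma 12,000, n = 25.
low, high = z_interval(85_000, 12_000, 25, confidence=0.90)
print(f"90% interval: {low:,.0f} to {high:,.0f}")
```

The sample mean sets the center of the interval, while σ, n, and the confidence level together set its width.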
Although it may sometimes be reasonable to assume the standard deviation of the
population is available, in most sampling situations the population standard
deviation (σ) is not known.
Here are some examples where we wish to estimate the population means and it is
unlikely we would know the population standard deviations. Suppose each of these
studies involves students at Feni University.
The Dean of Business Administration wants to estimate the mean number of
hours full-time students work at paying jobs each week. He selects a sample of 30
students, contacts each student, and asks how many hours they worked last
week. From the sample information, he can calculate the sample mean, but it is not
likely he would know or be able to find the population standard deviation (σ)
required in the formula. He could calculate the standard deviation of the sample and
use that as an estimate, but he would not likely know the population standard deviation.
The Dean of Faculty of Business wants to estimate the distance the typical
commuter student travels to class. He selects a sample of 40 commuter students,
contacts each, and determines the one-way distance from each student’s home to the
campus. From the sample data, he calculates the mean travel distance, that is, X̅. It
is unlikely the standard deviation of the population would be known or available,
again making the formula unusable.
The Director of Student Loans wants to know the mean amount owed on student
loans at the time of graduation. The director selects a sample of 20
graduating students and contacts each to find the information. From the sample
information, the director can estimate the mean amount. However, to develop a
confidence interval using the formula, the population standard deviation is
necessary. It is not likely this information is available.
When population standard deviation is not known, we can use the sample standard
deviation to estimate the population standard deviation. That is, we use s, the
sample standard deviation, to estimate σ, the population standard deviation. But in
doing so, we cannot use the formula we used for a known population standard deviation.
Because we do not know σ, we cannot use the z distribution. However, there is a
remedy. We use the sample standard deviation and replace the z distribution with
the t distribution.
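A minimal sketch of the t-based interval follows, assuming the usual form X̅ ± t·s/√n. The Python standard library has no t table, so the critical value is passed in by the caller. The function name t_interval and the data are hypothetical; 2.776 is the two-sided 95 percent t value for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """Confidence interval for a mean when sigma is unknown.

    t_crit is the t-table value for n - 1 degrees of freedom at the
    chosen confidence level; it is supplied by the caller.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)              # sample standard deviation estimates sigma
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical data: hours worked last week by five students.
hours = [10, 14, 8, 12, 16]
# 2.776 is the two-sided 95 percent t value for 4 degrees of freedom.
low, high = t_interval(hours, t_crit=2.776)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

For the same confidence level the t value is larger than the corresponding z value (2.776 versus 1.96 here), so the interval is wider. That extra width reflects the added uncertainty of estimating σ with s.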
The t distribution depends on the number of degrees of freedom. To see what that
means, consider a data sample consisting of, for the sake of simplicity, five positive
integers. The values could be any numbers, with no known relationship between them.
This data sample would, theoretically, have five degrees of freedom.
Four of the numbers in the sample are 3, 8, 5, and 4, and the mean of the entire
data sample is revealed to be 6.
This must mean that the fifth number is 10. It can be nothing else. It does not
have the freedom to vary. Once the mean is known, only four of the five values are
free to vary, so the sample has n − 1 = 4 degrees of freedom.
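The arithmetic behind this example can be checked in a few lines. The variable names are illustrative; the numbers are the ones given in the text.

```python
known = [3, 8, 5, 4]   # the four values given in the text
n = 5                  # total number of values in the sample
sample_mean = 6        # the stated mean of all five values

# The mean fixes the total (6 * 5 = 30), so the last value is fully determined.
fifth = sample_mean * n - sum(known)
degrees_of_freedom = n - 1

print(fifth, degrees_of_freedom)   # prints: 10 4
```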
PROPORTION: The fraction, ratio, or percent indicating the part of the sample
or the population having a particular trait of interest.
When working with confidence intervals, one important variable is sample size.
However, in practice, sample size is not a variable. It is a decision we make so
that our estimate of a population parameter is a good one. Our decision is based
on three variables:
1. The margin of error the researcher will tolerate.
2. The level of confidence desired, for example, 95 percent.
3. The variation or dispersion of the population being studied.
Margin of Error
The first variable is the margin of error. It is designated as E and is the amount that is
added to and subtracted from the sample mean (or sample proportion) to determine the
endpoints of the confidence interval. For example, in a study of wages, we may decide
that we want to estimate the population average wage with a margin of error of plus or
minus $1000.
The margin of error is the amount of error we are willing to tolerate in estimating
a population parameter. You may wonder why we do not choose small margins of
error. There is a trade-off between the margin of error and sample size. A small margin
of error will require a larger sample and more money and time to collect the sample. A
larger margin of error will permit a smaller sample and a wider confidence interval.
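The trade-off can be seen numerically. The sketch below assumes the margin of error for a mean has the usual form E = zσ/√n. The values z = 1.96 and σ = $1,000 are chosen only for illustration, and margin_of_error is a hypothetical helper name.

```python
from math import sqrt

def margin_of_error(z, sigma, n):
    """Margin of error E = z * sigma / sqrt(n) for estimating a mean."""
    return z * sigma / sqrt(n)

# Assumed values for illustration: z = 1.96 (95 percent), sigma = $1,000.
for n in (25, 100, 400):
    print(n, round(margin_of_error(1.96, 1000, n), 2))
```

Because n appears under a square root, cutting the margin of error in half requires four times as many observations.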
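Level of Confidence
The second variable is the level of confidence. Higher levels of confidence, such as
99 percent instead of 95 percent, require larger samples.
Population Standard Deviation
The third variable is the variation in the population, measured by the population
standard deviation. When it is not known, it can be estimated in one of the following ways.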
Conduct a pilot study: This is the most common method. Suppose we want an
estimate of the number of hours per week worked by students enrolled in the
College of Business at the University of Texas. To test the validity of our
questionnaire, we use it on a small sample of students. From this small sample, we
compute the standard deviation of the number of hours worked and use this value
as the population standard deviation.
Use a comparable study: Use this approach when there is an estimate of the
standard deviation from another study. Suppose we want to estimate the number of
hours worked per week by refuse workers. Information from certain state or federal
agencies that regularly study the workforce may provide a reliable value to use for
the population standard deviation.
To estimate a population mean, we can express the interaction among these three
factors and the sample size in the following formula. Notice that this formula is the
margin of error used to calculate the endpoints of confidence intervals to estimate a
population mean:
E = zσ/√n
Solving this equation for n gives the required sample size:
n = (zσ/E)²
The maximum allowable error, E, is $100. The value of z for a 95 percent level of
confidence is 1.96, and the value of the standard deviation is $1,000. Substituting
these values into the formula gives the required sample size as:
n = (zσ/E)² = ((1.96)($1,000)/$100)² = (19.6)² = 384.16
The computed value of 384.16 is rounded up to 385. A sample of 385 is required to
meet the specifications. If the student wants to increase the level of confidence, for
example to 99 percent, this will require a larger sample. The z value corresponding
to the 99 percent level of confidence is 2.58, so the required sample size is:
n = ((2.58)($1,000)/$100)² = (25.8)² = 665.64
We recommend a sample of 666. Observe how much the change in the confidence
level changed the size of the sample. An increase from the 95 percent to the 99
percent level of confidence resulted in an increase of 281 observations, or 73
percent [(281/385)*100]. This would greatly increase the cost of the study, both in
terms of time and money. Hence, the level of confidence should be considered
carefully.
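The two sample sizes in this example can be reproduced with a short sketch, assuming the formula n = (zσ/E)² with the result rounded up. The function name sample_size_for_mean is hypothetical. The z values are the table values quoted in the text (1.96 and 2.58), not values computed to more decimal places.

```python
from math import ceil

def sample_size_for_mean(z, sigma, margin):
    """Required sample size n = (z * sigma / E)^2, rounded up to a whole number."""
    return ceil((z * sigma / margin) ** 2)

n_95 = sample_size_for_mean(1.96, 1000, 100)   # 95 percent confidence
n_99 = sample_size_for_mean(2.58, 1000, 100)   # 99 percent confidence

extra = n_99 - n_95                 # additional observations needed
pct_increase = 100 * extra / n_95   # increase relative to the 95 percent sample

print(n_95, n_99, extra, round(pct_increase))   # prints: 385 666 281 73
```

Rounding is always upward, because rounding down would leave the sample slightly too small to meet the stated margin of error.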
Sample Size to Estimate a Population Proportion
To determine the sample size for a proportion, the same three variables need to be
specified:
1. The margin of error.
2. The desired level of confidence.
3. The variation or dispersion of the population being studied.
The estimate of the population proportion is to be within .10, so E = .10. The desired
level of confidence is .90, which corresponds to a z value of 1.65. Because no
estimate of the population proportion is available, we use .50. The suggested number
of observations is
n = π(1 − π)(z/E)² = (.50)(1 − .50)(1.65/.10)² = 68.0625
which is rounded up to 69.
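A sketch of this calculation, assuming the standard formula n = π(1 − π)(z/E)² with the result rounded up, is shown below. The function name and the parameter name p_guess are hypothetical.

```python
from math import ceil

def sample_size_for_proportion(z, margin, p_guess=0.5):
    """Required sample size n = p(1 - p)(z / E)^2, rounded up.

    p_guess is the planning value for the population proportion;
    0.5 is used when no estimate is available.
    """
    return ceil(p_guess * (1 - p_guess) * (z / margin) ** 2)

n = sample_size_for_proportion(z=1.65, margin=0.10)
print(n)   # prints: 69

# A planning value other than 0.5 never requires a larger sample.
print(sample_size_for_proportion(z=1.65, margin=0.10, p_guess=0.3))
```

Using .50 is the conservative choice: the product π(1 − π) is largest at .50, so the resulting sample size is large enough whatever the true proportion turns out to be.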