Chapter 9


Slides by:

Andrew Stephenson
Georgia Gwinnett College
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted
in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use.
Chapter 9
9.2
Sampling Distributions

• Sampling Distribution of the Mean

• Sampling Distribution of a Proportion

• Sampling Distribution of the Difference between Two Means

• From Here to Inference

Sampling Distributions…
9.3
A sampling distribution is created by, as the name
suggests, sampling.

The method we will employ relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution.

For example, consider the roll of one and two dice…

Sampling Distribution of the Mean…
9.4
A fair die is thrown infinitely many times, with the
random variable X = # of spots on any throw.

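The probability distribution of X is:

x      1    2    3    4    5    6
P(x)  1/6  1/6  1/6  1/6  1/6  1/6

…and the mean, variance, and standard deviation are calculated as well:

$$\mu = \sum x\,P(x) = 1(1/6) + 2(1/6) + \cdots + 6(1/6) = 3.5$$
$$\sigma^2 = \sum (x - \mu)^2 P(x) = 2.92$$
$$\sigma = \sqrt{2.92} = 1.71$$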
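The mean and variance of a single die follow directly from its probability table. A minimal sketch, using exact fractions so no rounding creeps in (the variable names are illustrative only):

```python
from fractions import Fraction

# Probability distribution of one fair die: each face has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

mu = sum(x * p for x in faces)                # population mean, sum of x * P(x)
var = sum((x - mu) ** 2 * p for x in faces)   # population variance
sigma = float(var) ** 0.5                     # population standard deviation

print(mu, var, round(sigma, 2))
```

The exact variance is 35/12, which rounds to the 2.92 used in hand calculations.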

Sampling Distribution of Two Dice
9.5
A sampling distribution is created by looking at
all samples of size n=2 (i.e. two dice) and their means…

While there are 36 possible samples of size 2, there are only 11 values for $\bar{x}$, and some (e.g. $\bar{x}$ = 3.5) occur more frequently than others (e.g. $\bar{x}$ = 1).

Sampling Distribution of Two Dice…
9.6
The sampling distribution of $\bar{x}$ is shown below:

$\bar{x}$    P($\bar{x}$)
1.0   1/36
1.5   2/36
2.0   3/36
2.5   4/36
3.0   5/36
3.5   6/36
4.0   5/36
4.5   4/36
5.0   3/36
5.5   2/36
6.0   1/36

[Figure: bar chart of P($\bar{x}$) against $\bar{x}$ = 1.0, 1.5, …, 6.0, rising from 1/36 to a peak of 6/36 at $\bar{x}$ = 3.5 and falling back to 1/36]
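The sampling distribution for two dice can be rebuilt by brute force: list all 36 equally likely samples, take the mean of each, and count how often each mean appears. A short sketch (the names `counts` and `dist` are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely samples of size 2 and tally their means.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

# Convert counts to probabilities, ordered by the value of the sample mean.
dist = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

for mean in dist:
    print(float(mean), f"{counts[mean]}/36")
```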

Compare…
9.7
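Compare the distribution of X with the sampling distribution of $\bar{x}$.

[Figure: histogram of X over the values 1 to 6, beside a histogram of $\bar{x}$ over the values 1.0 to 6.0]

As well, note that:

$$\mu_{\bar{x}} = \mu = 3.5$$
$$\sigma^2_{\bar{x}} = \sigma^2 / 2 = 1.46$$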

Generalize…
9.8
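We can generalize the mean and variance of the sampling distribution of two dice:

$$\mu_{\bar{x}} = \mu \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{2}$$

…to n dice:

$$\mu_{\bar{x}} = \mu \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$

The standard deviation of the sampling distribution is called the standard error:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$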
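The generalization to n dice can be checked exactly for small n by enumerating every possible sample. The helper below is a hypothetical illustration, not part of the slides; it confirms that the mean of the sample mean stays at 3.5 while its variance is the population variance divided by n.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance_of_sample_mean(n):
    """Exact mean and variance of the average of n fair dice, by enumeration."""
    means = [Fraction(sum(s), n) for s in product(range(1, 7), repeat=n)]
    mu = sum(means) / len(means)
    var = sum((m - mu) ** 2 for m in means) / len(means)
    return mu, var

pop_var = Fraction(35, 12)  # variance of a single fair die

for n in (1, 2, 3, 4):
    mu, var = mean_and_variance_of_sample_mean(n)
    # The last column is the standard error, the square root of the variance.
    print(n, mu, var, round(float(var) ** 0.5, 3))
```

Enumeration grows as 6 to the power n, so this is only practical for a handful of dice.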

Central Limit Theorem…
9.9
The sampling distribution of the mean of a random
sample drawn from any population is approximately
normal for a sufficiently large sample size.

The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

Central Limit Theorem…
9.10
If the population is normal, then $\bar{X}$ is normally distributed for all values of n.

If the population is non-normal, then $\bar{X}$ is approximately normal only for larger values of n.

In most practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of $\bar{X}$.
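A small simulation can make the theorem concrete. The sketch below assumes a strongly skewed population (exponential with rate 1, chosen only for illustration) and samples of size 30. If the normal approximation is reasonable, the sample means should center on μ, have a spread close to σ/√n, and fall within 1.96 standard errors of μ about 95% of the time.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n, trials = 30, 10_000
mu = sigma = 1.0  # an exponential population with rate 1 has mean 1 and sd 1

# Draw many samples of size n from the skewed population and keep each mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

se = sigma / math.sqrt(n)                  # theoretical standard error
center = statistics.fmean(sample_means)    # should be close to mu
spread = statistics.stdev(sample_means)    # should be close to se
within = sum(abs(m - mu) < 1.96 * se for m in sample_means) / trials

print(round(center, 3), round(spread, 3), round(se, 3), round(within, 3))
```

Rerunning with a smaller n (say 3) shows the coverage drifting away from .95, which is the sense in which non-normal populations need larger samples.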

Sampling Distribution of the Sample Mean
9.11
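1. $\mu_{\bar{x}} = \mu$

2. $\sigma^2_{\bar{x}} = \sigma^2 / n$ and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

3. If X is normal, $\bar{X}$ is normal. If X is nonnormal, $\bar{X}$ is approximately normal for sufficiently large sample sizes.
Note: the definition of “sufficiently large” depends on the extent of nonnormality of X (e.g. heavily skewed; multimodal).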

Sampling Distribution of the Sample Mean
9.12
We can express the sampling distribution of the mean simply as

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Sampling Distribution of the Sample Mean
9.13
The summaries above assume that the population is infinitely large. However, if the population is finite, the standard error is

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

where N is the population size and

$$\sqrt{\frac{N - n}{N - 1}}$$

is the finite population correction factor.

Sampling Distribution of the Sample Mean
9.14
If the population size is large relative to the sample
size the finite population correction factor is close to 1
and can be ignored.

We will treat any population that is at least 20 times


larger than the sample size as large.

In practice most applications involve populations that


qualify as large.

As a consequence the finite population correction


factor is usually omitted.
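To see how quickly the correction factor approaches 1, here is a brief sketch. The function `standard_error` and the values σ = 100 and n = 25 are assumptions made for illustration; N = 500 is the case where the population is exactly 20 times the sample size.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population
    correction factor when a population size N is supplied."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

sigma, n = 100, 25  # illustrative values

for N in (50, 500, 5000):
    fpc = math.sqrt((N - n) / (N - 1))
    print(N, round(fpc, 4), round(standard_error(sigma, n, N), 2))
```

At N = 20n the factor is already within about 2.5% of 1, which is why omitting it rarely changes a conclusion.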

Example 9.1(a)…
9.15
The foreman of a bottling plant has observed that the
amount of soda in each “32-ounce” bottle is actually a
normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.

If a customer buys one bottle, what is the probability that


the bottle will contain more than 32 ounces?

Example 9.1(a)…
9.16
We want to find P(X > 32), where X is normally distributed with µ = 32.2 and σ = .3

$$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{.3}\right) = P(Z > -.67) = 1 - .2514 = .7486$$

“there is about a 75% chance that a single bottle


of soda contains more than 32oz.”
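The same probability can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library; because it does not round z to two decimals the way a printed table does, it returns roughly .7475 rather than .7486.

```python
from statistics import NormalDist

# Amount of soda in one bottle: normal with mean 32.2 oz and sd 0.3 oz.
bottle = NormalDist(mu=32.2, sigma=0.3)

# P(X > 32) is the complement of the cumulative probability at 32.
p_single = 1 - bottle.cdf(32)

print(round(p_single, 4))
```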

Example 9.1(b)…
9.17
The foreman of a bottling plant has observed that the
amount of soda in each “32-ounce” bottle is actually a
normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.

If a customer buys a carton of four bottles, what is the


probability that the mean amount of the four bottles
will be greater than 32 ounces?

Example 9.1(b)…
9.18
We want to find P($\bar{X}$ > 32), where X is normally distributed with µ = 32.2 and σ = .3

Things we know:
1) X is normally distributed, therefore so is $\bar{X}$.

2) $\mu_{\bar{x}} = \mu$ = 32.2 oz.

3) $\sigma_{\bar{x}} = \sigma / \sqrt{n} = .3 / \sqrt{4} = .15$

Example 9.1(b)…
9.19
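If a customer buys a carton of four bottles, what is the probability that the mean amount of the four bottles will be greater than 32 ounces?

$$P(\bar{X} > 32) = P\left(\frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.15}\right) = P(Z > -1.33) = 1 - .0918 = .9082$$

“There is about a 91% chance the mean of the four bottles will exceed 32oz.”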
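For the carton, the only change from the single-bottle case is that the standard deviation is replaced by the standard error σ/√n. A minimal numerical check, again using the standard library's `NormalDist` (variable names are illustrative):

```python
import math
from statistics import NormalDist

mu, sigma, n = 32.2, 0.3, 4

# Sampling distribution of the mean of four bottles: same mean, sd = sigma / sqrt(n).
xbar = NormalDist(mu, sigma / math.sqrt(n))

p_carton = 1 - xbar.cdf(32)

print(round(xbar.stdev, 2), round(p_carton, 4))
```

The result is close to .91, noticeably higher than for one bottle, because averaging four bottles narrows the distribution around 32.2.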

Graphically Speaking…
9.20
[Figure: two normal curves, both centered at mean = 32.2. Left panel: what is the probability that one bottle will contain more than 32 ounces? Right panel: what is the probability that the mean of four bottles will exceed 32 oz?]

Chapter-Opening Example
9.21
Salaries of a Business School’s Graduates
In the advertisements for a large university, the dean of
the School of Business claims that the average salary of
the school’s graduates one year after graduation is $800
per week with a standard deviation of $100.

A second-year student in the business school who has


just completed his statistics course would like to check
whether the claim about the mean is correct.

Chapter-Opening Example
9.22
Salaries of a Business School’s Graduates
He does a survey of 25 people who graduated one year ago
and determines their weekly salary.

He discovers the sample mean to be $750.

To interpret his finding he needs to calculate the probability


that a sample of 25 graduates would have a mean of $750
or less when the population mean is $800 and the standard
deviation is $100.

After calculating the probability, he needs to draw some


conclusions.

Chapter-Opening Example
9.23
We want to find the probability that the sample mean is less than $750. Thus, we seek
P($\bar{X}$ < 750)

The distribution of X, the weekly income, is likely to be positively skewed, but not sufficiently so to make the distribution of $\bar{X}$ nonnormal. As a result, we may assume that $\bar{X}$ is normal with mean

$$\mu_{\bar{x}} = \mu = 800$$

and standard deviation

$$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 100 / \sqrt{25} = 20$$
Chapter-Opening Example
9.24
Thus,

$$P(\bar{X} < 750) = P\left(\frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} < \frac{750 - 800}{20}\right) = P(Z < -2.5) = .5 - .4938 = .0062$$

The probability of observing a sample mean as low as $750


when the population mean is $800 is extremely small.
Because this event is quite unlikely, we would have to
conclude that the dean's claim is not justified.
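The calculation for the dean's claim can be reproduced in a few lines. This is a sketch assuming, as the example does, that the sample mean is approximately normal:

```python
import math
from statistics import NormalDist

mu, sigma, n = 800, 100, 25   # claimed mean, claimed sd, sample size
observed_mean = 750

# Sampling distribution of the mean under the dean's claim.
xbar = NormalDist(mu, sigma / math.sqrt(n))

# Probability of a sample mean of $750 or less if the claim were true.
p_low = xbar.cdf(observed_mean)

print(round(p_low, 4))
```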
Using the Sampling Distribution for Inference
9.25
Here’s another way of expressing the probability calculated
from a sampling distribution.
P(-1.96 < Z < 1.96) = .95
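Substituting the formula for the sampling distribution

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = .95$$

With a little algebra

$$P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95$$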

Using the Sampling Distribution for Inference
9.26
Returning to the chapter-opening example where µ = 800, σ = 100, and n = 25, we compute

$$P\left(800 - 1.96 \frac{100}{\sqrt{25}} < \bar{X} < 800 + 1.96 \frac{100}{\sqrt{25}}\right) = .95$$

or

$$P(760.8 < \bar{X} < 839.2) = .95$$

This tells us that there is a 95% probability that a sample


mean will fall between 760.8 and 839.2. Because the sample
mean was computed to be $750, we would have to conclude
that the dean's claim is not supported by the statistic.
Using the Sampling Distribution for Inference
9.27
Changing the probability from .95 to .90 changes the probability statement to

$$P\left(\mu - 1.645 \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.645 \frac{\sigma}{\sqrt{n}}\right) = .90$$

Using the Sampling Distribution for Inference
9.28
We can also produce a general form of this statement

$$P\left(\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

In this formula α (Greek letter alpha) is the probability that $\bar{X}$ does not fall into the interval.

To apply this formula all we need do is substitute the


values for µ, σ, n, and α.

Using the Sampling Distribution for Inference
9.29
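For example, with µ = 800, σ = 100, n = 25 and α = .01, we produce

$$P\left(\mu - z_{.005} \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{.005} \frac{\sigma}{\sqrt{n}}\right) = 1 - .01$$

$$P\left(800 - 2.575 \frac{100}{\sqrt{25}} < \bar{X} < 800 + 2.575 \frac{100}{\sqrt{25}}\right) = .99$$

$$P(748.5 < \bar{X} < 851.5) = .99$$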
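The general probability statement lends itself to a small helper that takes µ, σ, n, and α and returns the interval for the sample mean. The function name `sample_mean_interval` is hypothetical; the critical value comes from `NormalDist().inv_cdf`, which computes it directly instead of reading a table.

```python
import math
from statistics import NormalDist

def sample_mean_interval(mu, sigma, n, alpha):
    """Interval that contains the sample mean with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width

# Chapter-opening example: mu = 800, sigma = 100, n = 25.
for alpha in (0.10, 0.05, 0.01):
    low, high = sample_mean_interval(800, 100, 25, alpha)
    print(alpha, round(low, 1), round(high, 1))
```

Smaller values of α widen the interval, so the observed mean of $750 falls outside the .95 interval but inside the .99 one.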

Sampling Distribution of a Proportion…
9.30
The estimator of a population proportion of successes is the sample proportion. That is, we count the number of successes in a sample and compute:

$$\hat{P} = \frac{X}{n}$$

(read this as “p-hat”).

X is the number of successes, n is the sample size.

Normal Approximation to Binomial…
9.31
Binomial distribution with n = 20 and p = .5 with a normal approximation superimposed (μ = 10 and σ = 2.24)

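Normal Approximation to Binomial…
9.32
Binomial distribution with n = 20 and p = .5 with a normal approximation superimposed (μ = 10 and σ = 2.24)

Where did these values come from?

From Section 7.4d we saw that:

$$\mu = np \qquad \sigma^2 = np(1 - p)$$

Hence:

$$\mu = 20(.5) = 10$$

and

$$\sigma^2 = 20(.5)(1 - .5) = 5, \qquad \sigma = \sqrt{5} = 2.24$$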
Normal Approximation to Binomial…
9.33
Normal approximation to the binomial works best when
the number of experiments, n, (sample size) is large, and
the probability of success, p, is close to 0.5

For the approximation to provide good results two


conditions should be met:
1) np ≥ 5
2) n(1–p) ≥ 5

Normal Approximation to Binomial…
9.34
To calculate P(X=10) using the normal distribution, we can find
the area under the normal curve between 9.5 & 10.5

P(X = 10) ≈ P(9.5 < Y < 10.5)


where Y is a normal random variable approximating
the binomial random variable X
Normal Approximation to Binomial…
9.35
In fact:
P(X = 10) = .176
while
P(9.5 < Y < 10.5) = .1742
the approximation is quite good.

P(X = 10) ≈ P(9.5 < Y < 10.5)

where Y is a normal random variable approximating


the binomial random variable X
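The comparison between the exact binomial probability and its normal approximation can be sketched as follows. The normal figure computed here is slightly higher than .1742 because the code does not round z to two decimals as a table lookup does.

```python
import math
from statistics import NormalDist

n, p = 20, 0.5

# Exact binomial probability P(X = 10).
exact = math.comb(n, 10) * p**10 * (1 - p) ** (n - 10)

# Normal approximation Y with the same mean and standard deviation,
# using the continuity correction: P(9.5 < Y < 10.5).
y = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
approx = y.cdf(10.5) - y.cdf(9.5)

print(round(exact, 4), round(approx, 4))
```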

Sampling Distribution of a Sample Proportion…
9.36
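Using the laws of expected value and variance, we can determine the mean, variance, and standard deviation of $\hat{P}$.

$$E(\hat{P}) = p$$
$$V(\hat{P}) = \sigma^2_{\hat{p}} = \frac{p(1 - p)}{n}$$
$$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$$

(The standard deviation of $\hat{P}$ is called the standard error of the proportion.)

Sample proportions can be standardized to a standard normal distribution using this formulation:

$$Z = \frac{\hat{P} - p}{\sqrt{p(1 - p)/n}}$$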

Example 9.2
9.37
In the last election a state representative received 52% of
the votes cast.

One year after the election the representative organized a


survey that asked a random sample of 300 people
whether they would vote for him in the next election.

If we assume that his popularity has not changed what is


the probability that more than half of the sample would
vote for him?

Example 9.2
9.38
The number of respondents who would vote for the
representative is a binomial random variable with n = 300
and p = .52.

We want to determine the probability that the sample proportion is greater than 50%. That is, we want to find
P($\hat{P}$ > .50)

We now know that the sample proportion is approximately normally distributed with mean p = .52 and standard deviation

$$\sqrt{p(1 - p)/n} = \sqrt{(.52)(1 - .52)/300} = .0288$$

Example 9.2
9.39
Thus, we calculate

$$P(\hat{P} > .50) = P\left(\frac{\hat{P} - p}{\sqrt{p(1 - p)/n}} > \frac{.50 - .52}{.0288}\right) = P(Z > -.69) = .7549$$

If we assume that the level of support remains at 52%, the probability


that more than half the sample of 300 people would vote for the
representative is 75.49%.
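A numerical version of this example, as a sketch: it applies the normal approximation to the sample proportion and returns a value near .756, a little above the table-based .7549 because z is not rounded to −.69.

```python
import math
from statistics import NormalDist

p, n = 0.52, 300

# Standard error of the sample proportion.
se = math.sqrt(p * (1 - p) / n)

# Approximate sampling distribution of p-hat, and P(p-hat > .50).
p_hat = NormalDist(p, se)
p_majority = 1 - p_hat.cdf(0.50)

print(round(se, 4), round(p_majority, 4))
```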

Sampling Distribution: Difference of two means
9.40
The final sampling distribution introduced is that of the difference between two sample means. This requires:

• independent random samples be drawn from each of two normal populations

If this condition is met, then the sampling distribution of the difference between the two sample means, i.e. $\bar{X}_1 - \bar{X}_2$, will be normally distributed.
(Note: if the two populations are not both normally distributed, but the sample sizes are “large” (>30), the distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal.)

Sampling Distribution: Difference of two means
9.41
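The expected value and standard deviation of the sampling distribution of $\bar{X}_1 - \bar{X}_2$ are given by:

mean:

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$

standard deviation:

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

(also called the standard error of the difference between two means)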

Example 9.3…
9.42
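Since the distribution of $\bar{X}_1 - \bar{X}_2$ is normal and has a mean of

$$\mu_1 - \mu_2$$

and a standard deviation of

$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

we can compute Z (standard normal random variable) in this way:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$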

Example 9.3…
9.43
Starting salaries for MBA grads at two universities are normally distributed with the following means and standard deviations. Samples from each school are taken…

                   University 1    University 2
Mean (μ)           62,000 $/yr     60,000 $/yr
Std. Dev. (σ)      14,500 $/yr     18,300 $/yr
Sample size (n)    50              60

What is the probability that the sample mean starting salary of University #1 graduates will exceed that of the #2 grads?

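Example 9.3…
9.44
“What is the probability that the sample mean starting salary of University #1 graduates will exceed that of the #2 grads?”

We are interested in determining P($\bar{X}_1 > \bar{X}_2$). Converting this to a difference of means, what is P($\bar{X}_1 - \bar{X}_2 > 0$)?

$$P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(Z > \frac{0 - (62{,}000 - 60{,}000)}{\sqrt{\dfrac{14{,}500^2}{50} + \dfrac{18{,}300^2}{60}}}\right) = P\left(Z > \frac{-2{,}000}{3{,}128}\right) = P(Z > -.64) = .7389$$

“There is about a 74% chance that the sample mean starting salary of U. #1 grads will exceed that of U. #2.”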
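The difference-of-means calculation can be checked with a short script. It is a sketch under the example's assumptions (independent samples, normal populations), and the variable names are illustrative:

```python
import math
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50   # University 1
mu2, sigma2, n2 = 60_000, 18_300, 60   # University 2

# Mean and standard error of the difference between the two sample means.
diff_mean = mu1 - mu2
diff_se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# P(xbar1 - xbar2 > 0)
p_exceeds = 1 - NormalDist(diff_mean, diff_se).cdf(0)

print(round(diff_se, 1), round(p_exceeds, 4))
```

Even though University 1 has the higher population mean, there is still roughly a one-in-four chance that its sample mean comes out lower, because both samples are noisy.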
From Here to Inference
9.45
In Chapters 7 and 8 we introduced probability
distributions, which allowed us to make probability
statements about values of the random variable.

A prerequisite of this calculation is knowledge of the


distribution and the relevant parameters.

From Here to Inference
9.46
In Example 7.9, we needed to know that the probability
that Pat Statsdud guesses the correct answer is 20% (p =
.2) and that the number of correct answers (successes) in
10 questions (trials) is a binomial random variable.

We then could compute the probability of any number of


successes.

From Here to Inference
9.47
In Example 8.2, we needed to know that the return on
investment is normally distributed with a mean of 10%
and a standard deviation of 5%.

These three bits of information allowed us to calculate


the probability of various values of the random variable.

From Here to Inference
9.48
The figure below symbolically represents the use of
probability distributions.

Simply put, knowledge of the population and its


parameter(s) allows us to use the probability distribution
to make probability statements about individual members
of the population.

[Figure: Probability Distribution → Individual]

From Here to Inference
9.49
In this chapter we developed the sampling distribution,
wherein knowledge of the parameter(s) and some
information about the distribution allow us to make
probability statements about a sample statistic.

[Figure: Population & Parameters → Sampling Distribution → Sample Statistics]

From Here to Inference
9.50
Statistical inference works by reversing the direction of the flow of knowledge in the previous figure. The next figure displays the character of statistical inference.

Starting in Chapter 10, we will assume that most


population parameters are unknown. The statistics
practitioner will sample from the population and compute
the required statistic. The sampling distribution of that
statistic will enable us to draw inferences about the
parameter.

From Here to Inference
9.51

[Figure: Sample Statistics → Sampling Distribution → Population Parameter]

