Chap 12 - Analysis of Variance
Chapter 12
(STBE)
Contents
◼ F distribution and F table.
◼ Hypothesis testing to determine whether the
variances of two populations are equal.
◼ ANOVA approach for testing difference in
sample means.
◼ ANOVA tables for analysis.
◼ Hypothesis testing among three or more
treatment means
◼ Develop confidence intervals for the difference
in treatment means
◼ Two-Way Analysis of Variance (Interaction)
The F Distribution
▪ Uses of the F Distribution
❑ To test whether two samples are from populations
having equal variances.
❑ To compare several population means
simultaneously.
◼ Simultaneous comparison of several population
means is called analysis of variance (ANOVA).
◼ Assumptions:
❑ The populations must follow a normal distribution.
❑ The data must be at least interval-scale.
Characteristics of the F Distribution
1. There is a “family” of F Distributions.
A particular member of the family is determined by
two parameters:
◼ Degrees of freedom in the numerator
◼ Degrees of freedom in the denominator.
2. F distribution is continuous.
3. F cannot be negative.
4. It is asymptotic.
As F → ∞, the curve approaches the X-axis but never
touches it.
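The F table lookup can also be reproduced numerically. The sketch below is one possible pure-Python approach, not a method taken from the chapter: it evaluates the F cumulative distribution through the regularized incomplete beta function (continued-fraction evaluation) and inverts it by bisection. The names `regularized_beta`, `f_cdf`, and `f_critical` are illustrative choices.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction used by the incomplete beta function."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            if abs(d) < tiny:
                d = tiny
            c = 1.0 + aa / c
            if abs(c) < tiny:
                c = tiny
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(x, df_num, df_den):
    """P(F <= x) for an F distribution with (df_num, df_den) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # F cannot be negative
    z = df_num * x / (df_num * x + df_den)
    return regularized_beta(df_num / 2.0, df_den / 2.0, z)


def f_critical(alpha, df_num, df_den):
    """Value with upper-tail area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - f_cdf(hi, df_num, df_den) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - f_cdf(mid, df_num, df_den) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(f_critical(0.05, 6, 7), 2))
print(round(f_critical(0.01, 3, 18), 2))
```

The two printed values should agree with the table values used in the examples of this chapter (3.87 and 5.09). Each pair of degrees of freedom selects a different member of the F family.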
Comparing Two Population Variances
The F distribution is used to test the hypothesis that the
variance of one normal population equals the variance
of another normal population.
Examples:
◼ The mean rate of return on two types of common stock may
be the same, but there may be more variation in the rate of
return in one than the other. A sample of 10 technology and
10 utility stocks shows the same mean rate of return, but
there is likely more variation in the technology stocks.
◼ A study by the marketing department for a large newspaper
found that men and women spent about the same amount of
time per day reading the paper. However, the same report
indicated there was nearly twice as much variation in time
spent per day among the men than the women.
Test for Equal Variances - Example
Lammers Limos offers limousine service from the city hall in Toledo,
Ohio, to Metro Airport in Detroit. Sean Lammers, president of the
company, is considering two routes. One is via U.S. 25 and the
other via I-75. He wants to study the time it takes to drive to the
airport using each route and then compare the results. He
collected the following sample data, which is reported in minutes.
Using the .10 significance level, is there a difference in the
variation in the driving times for the two routes?
Step 1: The hypotheses are:
H0: σ₁² = σ₂²
H1: σ₁² ≠ σ₂²
Step 2: The significance level is .10.
Step 3: The test statistic follows the F distribution.
Step 4: State the decision rule.
Reject H0 if F > F_{α/2, ν1, ν2}
F > F_{.10/2, 7−1, 8−1}
F > F_{.05, 6, 7}
F > 3.87
Step 5: Compute the value of F and make a decision
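The test statistic for comparing two variances is the ratio of the sample variances, F = s₁²/s₂², conventionally with the larger variance in the numerator so that F is at least 1. The following is a minimal sketch of that computation. The driving times are hypothetical (they are NOT the Lammers Limos sample) and the function name `variance_ratio` is an illustrative choice.

```python
from statistics import variance


def variance_ratio(sample_1, sample_2):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    s1_sq = variance(sample_1)  # sample variance, n - 1 in the denominator
    s2_sq = variance(sample_2)
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, len(sample_1) - 1, len(sample_2) - 1
    return s2_sq / s1_sq, len(sample_2) - 1, len(sample_1) - 1


# Hypothetical driving times in minutes, for illustration only.
route_a = [50, 66, 57, 46, 71, 53, 63]       # 7 observations
route_b = [58, 61, 60, 52, 55, 62, 57, 64]   # 8 observations

f_stat, df_num, df_den = variance_ratio(route_a, route_b)

# 3.87 is the table value for (6, 7) degrees of freedom at alpha/2 = .05;
# it only applies when the numerator sample has 7 observations.
critical = 3.87
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("Reject H0" if f_stat > critical else "Do not reject H0")
```

Putting the larger variance in the numerator means only the upper tail of the F distribution is needed, which is why the two-tailed test looks up α/2 in the table.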
Assumptions of ANOVA:
The sampled populations follow the normal
distribution.
The populations have equal standard deviations.
The samples are randomly selected and are
independent.
Comparing Means of Three or More
Populations (ANOVA approach)
The Null Hypothesis is that the population means are
the same.
The Alternative Hypothesis is that at least one of the
means is different.
H0: µ1 = µ2 = … = µk
H1: The means are not all equal.
Reject H0 if F > F_{α, k−1, n−k}
Comparing Means of Three or More
Populations – Example
Recently a group of four major carriers joined in hiring
Brunner Marketing Research, Inc., to survey recent
passengers regarding their level of satisfaction with a
recent flight. The survey included questions on
ticketing, boarding, in-flight service, baggage
handling, pilot communication, and so forth.
Twenty-five questions offered a range of possible
answers: excellent, good, fair, or poor. A response of
excellent was given a score of 4, good a 3, fair a 2, and
poor a 1.
These responses were then totaled, so the total score was
an indication of the satisfaction with the flight. Brunner
Marketing Research, Inc., randomly selected and
surveyed passengers from the four airlines.
Is there a difference in the mean satisfaction level among
the four airlines? Use the .01 significance level.
Step 1: State the null and alternate hypotheses.
H0: µ1 = µ2 = µ3 = µ4
H1: The means are not all equal.
Step 2: State the level of significance.
The .01 significance level
Step 3: Find the appropriate test statistic.
Use the F statistic
Step 4: State the decision rule.
Reject H0 if F > F_{α, k−1, n−k}
F > F_{.01, 4−1, 22−4}
F > F_{.01, 3, 18}
F > 5.09
Step 5: Compute the value of F and make a decision
RANDOM (ERROR) VARIATION The sum of the
squared differences between each observation and
its treatment mean.
Calculation of SS TOTAL
Calculation of SSE
Calculation of SST
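The three sums of squares can be computed directly from the raw scores: SS total uses deviations from the overall mean, SSE uses deviations from each treatment mean, and SST = SS total − SSE. The test statistic is then F = MST/MSE with MST = SST/(k − 1) and MSE = SSE/(n − k). A minimal sketch follows. The satisfaction scores are hypothetical (the survey data are not reproduced here) and the function name `one_way_anova` is an illustrative choice.

```python
def one_way_anova(groups):
    """Return the ANOVA quantities for a list of treatment samples."""
    all_values = [x for group in groups for x in group]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n

    # Total variation: each observation versus the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # Random (error) variation: each observation versus its treatment mean.
    sse = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        sse += sum((x - group_mean) ** 2 for x in group)

    # Treatment variation is what remains.
    sst = ss_total - sse
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {"SS total": ss_total, "SST": sst, "SSE": sse,
            "df treatments": k - 1, "df error": n - k,
            "MST": mst, "MSE": mse, "F": mst / mse}


# Hypothetical total scores for four airlines, for illustration only.
scores = [
    [92, 88, 84, 81],
    [74, 70, 79, 82, 86],
    [71, 72, 77, 79, 81, 69, 66],
    [67, 71, 73, 66, 75, 64],
]

for name, value in one_way_anova(scores).items():
    print(f"{name:>14}: {value:.2f}")
```

The computed F is compared with the critical value from the decision rule; H0 is rejected when the computed value is larger.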
Inferences about Pairs of Treatment
Means
• The ANOVA procedure is used to decide whether to reject
the null hypothesis.
• Conclusion: Not all the treatment means are the same.
• Are you satisfied with this conclusion?
• Which treatment means differ?
• Previous EXAMPLE:
• If the passenger ratings do differ, the question is:
Between which groups do treatment means differ?
• Procedure:
• Confidence Interval (Chap 09)
• t distribution (Chap 10 & 11)
• One of the assumptions of ANOVA is that the population
variances are the same for all treatments.
• This common population variance is estimated by the mean
square error, or MSE, which is determined by MSE = SSE / (n − k).
Decision Rule:
• If the confidence interval includes zero, there is not a
difference between the treatment means.
• If the left endpoint of the confidence interval has a
negative sign and the right endpoint has a positive sign,
the interval includes zero and the two means do not
differ.
Using the previous airline example, let us compute the
confidence interval for the difference between the mean
scores of passengers on Northern and Branson. With a
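95 percent level of confidence, the endpoints of the
confidence interval are 10.46 and 26.04. Both endpoints are
positive, so the interval does not include zero and we
conclude that these two treatment means differ.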
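The confidence interval for the difference between the two
treatment means is computed as:
$$(\bar{X}_N - \bar{X}_B) \pm t \sqrt{MSE\left(\frac{1}{n_N} + \frac{1}{n_B}\right)}$$
where t has n − k degrees of freedom.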
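The interval for a pair of treatment means is the difference in sample means plus or minus t times the square root of MSE(1/n₁ + 1/n₂). A short sketch of that arithmetic follows. The sample means (87.25 and 69.00), the MSE (33.0), and the sample sizes (4 and 6) are assumptions that do not appear in the text above; they were picked only because they are consistent with the reported endpoints. The t value 2.101 is the 95 percent table value for 22 − 4 = 18 degrees of freedom. The function name `treatment_mean_ci` is illustrative.

```python
from math import sqrt


def treatment_mean_ci(mean_1, mean_2, mse, n_1, n_2, t_crit):
    """Confidence interval for the difference between two treatment means."""
    margin = t_crit * sqrt(mse * (1 / n_1 + 1 / n_2))
    diff = mean_1 - mean_2
    return diff - margin, diff + margin


# ASSUMED inputs (not given in the text), chosen to match the stated endpoints.
low, high = treatment_mean_ci(87.25, 69.00, mse=33.0, n_1=4, n_2=6, t_crit=2.101)
print(f"({low:.2f}, {high:.2f})")

# Decision rule: the means differ if zero lies outside the interval.
print("Means differ" if low > 0 or high < 0 else "No difference detected")
```

With these assumed inputs the interval rounds to the endpoints quoted in the example, and because zero lies outside it the two means are judged to differ.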
Two-Way Analysis of Variance
◼ In one-way ANOVA, the total variation was divided into two categories:
❑ The variation between the treatments, and
❑ The variation within the treatments (the error or random variation).
◼ To put it another way, we considered only two sources of
variation, that due to the treatments and the random
differences.
◼ In the airline passenger ratings example, there
may be other causes of variation.
◼ These factors might include, for example, the season of
the year, the particular airport, or the number of
passengers on the flight.
EXAMPLE
WARTA, the Warren Area Regional Transit Authority, is
expanding bus service from the suburb of Starbrick into the
central business district of Warren.
There are four routes being considered from Starbrick to
downtown Warren: (1) via U.S. 6, (2) via the West End, (3) via
the Hickory Street Bridge, and (4) via Route 59.
WARTA conducted several tests to determine whether there
was a difference in the mean travel times along the four
routes. Because there will be many different drivers, the test
was set up so that each driver drove along each of the four
routes. Next slide shows the travel time, in minutes, for each
driver-route combination.
At the .05 significance level, is there a difference in the
mean travel time along the four routes?
If we remove the effect of the drivers, is there a difference
in the mean travel time?
◼ In the above example, we considered the variation due to
the treatments (routes) and took all the remaining
variation to be random.
◼ If we could consider the effect of the several drivers, this
would allow us to reduce the SSE term, which would lead
to a larger value of F.
◼ The second treatment variable, the drivers in this case, is
referred to as a blocking variable.
◼ BLOCKING VARIABLE A second treatment variable that
when included in the ANOVA analysis will have the effect
of reducing the SSE term.
12-38
Two-Way Analysis of Variance
◼ In this case, we let the drivers be the blocking variable,
and removing the effect of the drivers from the SSE
term will change the F ratio for the treatment variable.
First, we need to determine the sum of squares due to
the blocks.
◼ In a two-way ANOVA, the sum of squares due to blocks
is found by the following formula.
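In the usual notation, with k treatments, b blocks, block means x̄_b and overall mean x̄_G, the block sum of squares is SSB = k Σ (x̄_b − x̄_G)², and the error term becomes SSE = SS total − SST − SSB with (b − 1)(k − 1) degrees of freedom. A minimal sketch of these calculations follows. The travel times are hypothetical (they are NOT the WARTA data) and the function name `blocked_anova` is an illustrative choice.

```python
def blocked_anova(table):
    """table[i][j] = observation for block i (row) and treatment j (column)."""
    b, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (b * k)
    block_means = [sum(row) / k for row in table]
    treat_means = [sum(row[j] for row in table) / b for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    sst = b * sum((m - grand) ** 2 for m in treat_means)   # treatments
    ssb = k * sum((m - grand) ** 2 for m in block_means)   # blocks
    sse = ss_total - sst - ssb                             # what is left over

    mse = sse / ((b - 1) * (k - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse,
            "F treatments": (sst / (k - 1)) / mse,
            "F blocks": (ssb / (b - 1)) / mse}


# Hypothetical travel times in minutes: 5 drivers (rows) by 4 routes (columns).
times = [
    [20, 19, 24, 23],
    [18, 22, 25, 21],
    [22, 23, 27, 24],
    [24, 21, 28, 26],
    [26, 25, 30, 27],
]

for name, value in blocked_anova(times).items():
    print(f"{name:>12}: {value:.2f}")
```

Moving the block variation out of SSE shrinks the denominator of the F ratio for treatments, which is the reason for including a blocking variable.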
Step 1: State the null and alternate hypotheses:
Step 4: State the decision rule.
Test (1), treatments (routes):
Reject H0 if F > F_{.05, k−1, (b−1)(k−1)}
F > F_{.05, 4−1, (5−1)(4−1)}
F > F_{.05, 3, 12}
F > 3.49
The computed F is 7.93, which exceeds 3.49. The null hypothesis is rejected.
We conclude that the mean travel time is not the same for all routes.
WARTA will want to conduct some tests to determine which
treatment means differ.
Test (2), blocks (drivers):
Reject H0 if F > F_{.05, b−1, (b−1)(k−1)}
F > F_{.05, 5−1, (5−1)(4−1)}
F > F_{.05, 4, 12}
F > 3.26
The computed F is 9.78, which exceeds 3.26. The null hypothesis is rejected.
The mean time is not the same for the various drivers.
Thus, WARTA management can conclude, based on the sample
results, that there is a difference in the routes and in the drivers.
Two-way ANOVA with Interaction
◼ In the previous section, we studied the separate or
independent effects of two variables, routes into the
city and drivers, on mean travel time.
◼ There is another effect that may influence travel time.
This is called an interaction effect between route and
driver on travel time. For example, is it possible that one
of the drivers is especially good driving one or more of
the routes?
◼ The combined effect of driver and route may also
explain differences in mean travel time.
◼ To measure interaction effects, it is necessary to have
at least two observations in each cell.
◼ When we use a two-way ANOVA to study interaction,
we call the two variables factors instead of
blocks.
◼ Interaction occurs if the combination of two factors has
some effect on the variable under study, in addition to
each factor alone.
◼ The variable being studied is referred to as the response
variable.
◼ INTERACTION The effect of one factor on a response
variable differs depending on the value of another
factor.
◼ One way to study interaction is by plotting factor means
in a graph called an interaction plot.
• Is there really an interaction between
routes and drivers?
• Are the travel times for the drivers the
same?
• Are the travel times for the routes the
same?
• Out of the three questions, we are most
interested in the test for interactions.
• To put it another way, does a particular
route/driver combination result in
significantly faster (or slower) driving
times?
• Also, the results of the hypothesis test
for interaction affect the way we analyze
the route and driver questions.
Example – ANOVA with Interaction
Suppose the WARTA blocking experiment discussed earlier is
repeated by measuring two more travel times for each driver
and route combination, with the data shown in the Excel
worksheet.
The ANOVA now has three sets of hypotheses to test:
Sum of squares due to possible interaction is:
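For a balanced design with r observations in every cell, one common way to write the interaction sum of squares is SSI = r ΣΣ (x̄_ij − x̄_i. − x̄_.j + x̄_G)², where x̄_ij is a cell mean, x̄_i. and x̄_.j are the factor-level means, and x̄_G is the overall mean. The sketch below assumes that balanced layout. The travel times are hypothetical (they are NOT the WARTA worksheet data) and the function name `two_way_with_interaction` is an illustrative choice.

```python
def two_way_with_interaction(cells):
    """cells[i][j] = list of replicates for level i of factor A, level j of factor B.

    Assumes a balanced design: every cell holds the same number of observations.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    cell_means = [[sum(obs) / r for obs in row] for row in cells]
    a_means = [sum(row) / b for row in cell_means]
    b_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(a_means) / a

    ss_a = r * b * sum((m - grand) ** 2 for m in a_means)
    ss_b = r * a * sum((m - grand) ** 2 for m in b_means)
    ss_i = r * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                   for i in range(a) for j in range(b))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])

    mse = sse / (a * b * (r - 1))
    return {"SSA": ss_a, "SSB": ss_b, "SSI": ss_i, "SSE": sse,
            "F A": (ss_a / (a - 1)) / mse,
            "F B": (ss_b / (b - 1)) / mse,
            "F interaction": (ss_i / ((a - 1) * (b - 1))) / mse}


# Hypothetical travel times: 2 drivers (factor A) by 2 routes (factor B),
# three observations per driver-route combination.
times = [
    [[18, 20, 19], [24, 23, 25]],
    [[22, 21, 23], [22, 24, 23]],
]

for name, value in two_way_with_interaction(times).items():
    print(f"{name:>14}: {value:.2f}")
```

At least two observations per cell are needed here: with a single observation r − 1 is zero, so the error term has no degrees of freedom and interaction cannot be separated from random variation.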
Two-way ANOVA Table with
Interaction
One-way ANOVA for Each Driver
H0: Route travel times are equal.
Conclusion:
Learning Objectives
LO1 List the characteristics of the F distribution and locate
values in an F table.
LO2 Perform a test of hypothesis to determine whether
the variances of two populations are equal.
LO3 Describe the ANOVA approach for testing difference in
sample means.
LO4 Organize data into ANOVA tables for analysis.
LO5 Conduct a test of hypothesis among three or more
treatment means and describe the results.
LO6 Develop confidence intervals for the difference in
treatment means and interpret the results.
LO7 Carry out a test of hypothesis among treatment means
using a blocking variable and understand the results.
LO8 Perform a two-way ANOVA with interaction and
describe the results.