Nonparametric Methods: Goodness-of-Fit Tests
Chapter 17
McGraw-Hill/Irwin Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.
Learning Objectives
LO1 List and explain the characteristics of the chi-square
distribution.
LO2 Conduct a test of hypothesis comparing an observed set
of frequencies to an expected distribution.
LO3 Conduct a goodness-of-fit test for unequal expected
frequencies.
LO4 Conduct a test of hypothesis to verify that data grouped
into a frequency distribution is a sample from a normal
distribution.
LO5 Use graphical methods to determine if a set of sample
data is from a normal distribution.
LO6 Conduct a test of hypothesis to determine whether two
classification criteria are related.
17-2
LO1 List and explain the characteristics of
the chi-square distribution.
Characteristics of the Chi-Square Distribution
The major characteristics of the chi-square distribution:
• It is positively skewed.
• It is non-negative.
• It is based on degrees of freedom.
• When the degrees of freedom change, a new distribution is created.
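These characteristics can be checked numerically. The sketch below relies on standard properties of the chi-square distribution that are not stated on the slide: with df degrees of freedom the mean is df, the variance is 2 × df, and the skewness is the square root of 8/df. The function name `chi_square_shape` is an illustrative choice.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution with df degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    mean = df
    variance = 2 * df
    skewness = math.sqrt(8 / df)
    return mean, variance, skewness

# Each value of df gives a different distribution; the skewness stays positive
# but shrinks as df grows.
for df in (1, 3, 5, 10):
    mean, variance, skewness = chi_square_shape(df)
    print(f"df={df:2d}  mean={mean:2d}  variance={variance:2d}  skewness={skewness:.3f}")
```

The shrinking skewness is why chi-square curves with many degrees of freedom look closer to symmetric than those with only a few.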
17-3
LO2 Conduct a test of hypothesis comparing an observed
set of frequencies to an expected distribution.
Goodness-of-Fit Test: Comparing an Observed Set of Frequencies to an Expected Distribution
• Let fo and fe be the observed and expected frequencies, respectively.
• Hypotheses
  • H0: There is no difference between the observed and expected frequencies.
  • H1: There is a difference between the observed and the expected frequencies.
17-4
LO2
Goodness-of-Fit Test: Comparing an Observed Set of Frequencies to an Expected Distribution
The test statistic is:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$$
The critical value is a chi-square value with (k - 1) degrees of freedom, where k is the number of categories.
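As a minimal sketch, the statistic and its degrees of freedom can be computed as follows. The function names and the three-category counts are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

def degrees_of_freedom(k):
    """Degrees of freedom for a goodness-of-fit test with k categories."""
    return k - 1

# Hypothetical counts for three categories (not from the text).
observed = [18, 22, 20]
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
df = degrees_of_freedom(len(observed))
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
```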
17-5
LO2
Goodness-of-Fit Example
Bubba’s Fish and Pasta is a chain of restaurants located along the
Gulf Coast of Florida. Bubba, the owner, is considering adding steak to
his menu. Before doing so, he decides to hire Magnolia Research,
LLC to conduct a survey of adults as to their favorite meal when eating
out. Magnolia selected a sample of 120 adults and asked each to
indicate their favorite meal when dining out. The results are reported
below.
Is it reasonable to conclude there is no preference among the four entrees?
17-6
LO2
Step 1: State the null hypothesis and the alternate hypothesis.
H0: there is no difference between fo and fe
H1: there is a difference between fo and fe
Step 2: Select the level of significance.

α = 0.05 as stated in the problem
Step 3: Select the test statistic.

The test statistic follows the chi-square distribution,
designated as χ2:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$$
17-7
LO2
Step 4: Formulate the decision rule.
Reject H0 if the computed χ2 exceeds the critical value with α = .05 and k - 1 = 4 - 1 = 3 degrees of freedom:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,4-1} = \chi^2_{.05,\,3} = 7.815$$
17-8
LO2
[Figure: chi-square distribution with 3 degrees of freedom; critical value 7.815]
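The critical value 7.815 is read from a chi-square table. As a rough check, the sketch below computes the area to the right of a value under the chi-square curve using a series expansion of the incomplete gamma function. This numerical approach and the function name `chi_square_upper_tail` are assumptions for illustration, not part of the text. The area beyond 7.815 with 3 degrees of freedom should come out close to .05.

```python
import math

def chi_square_upper_tail(x, df):
    """Area to the right of x under a chi-square curve with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function,
    which is adequate for the moderate values of x used here.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

alpha = 0.05
k = 4                      # number of categories
critical_value = 7.815     # table value for alpha = .05 and k - 1 = 3 df

tail = chi_square_upper_tail(critical_value, k - 1)
print(f"area beyond {critical_value} with {k - 1} df: {tail:.4f}")
```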
17-9
LO2
Step 5: Compute the value of the chi-square statistic and make a decision.
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$$
17-10
LO2
[Figure: chi-square distribution with critical value 7.815 and computed value 2.20]
The computed χ2 of 2.20 is less than the critical value of 7.815, so it falls in the “fail to reject H0” region. The
decision, therefore, is to fail to reject H0 at the .05 level.
Conclusion: The difference between the observed and the expected frequencies can be attributed to
chance. There appears to be no difference in the preference among the four entrees.
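The five steps can be sketched in a few lines. The table of survey counts is not reproduced in the text, so the sketch uses only the figures that are stated: the sample of 120 adults, the four entrees, the critical value 7.815, and the computed χ2 of 2.20. Variable names are illustrative.

```python
n = 120                # adults surveyed
k = 4                  # entrees (categories)

# Under H0 (no preference) every entree has the same expected frequency.
expected_per_entree = n / k
df = k - 1

critical_value = 7.815     # chi-square table, alpha = .05, 3 df
computed_chi2 = 2.20       # value reported for the survey

reject_h0 = computed_chi2 > critical_value
print(f"fe = {expected_per_entree:.0f} per entree, df = {df}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```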
17-11
LO2
Chi-square - MegaStat
17-12
LO3 Conduct a goodness-of-fit test for
unequal expected frequencies.
Goodness-of-Fit Test: Unequal Expected Frequencies
• Let fo and fe be the observed and expected frequencies, respectively.
• Hypotheses
  • H0: There is no difference between the observed and expected frequencies.
  • H1: There is a difference between the observed and the expected frequencies.
17-13
LO3
Goodness-of-Fit Test: Unequal Expected
Frequencies - Example
The American Hospital Administrators Association (AHAA) reports the
following information concerning the number of times senior citizens
are admitted to a hospital during a one-year period. Forty percent are
not admitted; 30 percent are admitted once; 20 percent are admitted
twice; and the remaining 10 percent are admitted three or more times.
A survey of 150 residents of Bartow Estates, a community devoted to
active seniors located in central Florida, revealed 55 residents were
not admitted during the last year, 50 were admitted to a hospital once,
32 were admitted twice, and the rest of those in the survey were
admitted three or more times.
Can we conclude the survey at Bartow Estates is consistent with the
information suggested by the AHAA? Use the .05 significance level.
17-14
LO3
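Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no difference between local and national
experience for hospital admissions.
H1: There is a difference between local and national
experience for hospital admissions.
Step 2: Select the level of significance.
α = 0.05 as stated in the problem
Step 3: Select the test statistic.
The test statistic follows the chi-square distribution,
designated as χ2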
17-15
LO3

Step 4: Formulate the decision rule.
Reject H0 if the computed χ2 exceeds the critical value with α = .05 and k - 1 = 4 - 1 = 3 degrees of freedom:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,4-1} = \chi^2_{.05,\,3} = 7.815$$
17-16
LO3
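The table compares the distribution stated in the problem, the frequencies observed in a sample of 150 Bartow residents (fo), and the expected frequencies of the sample if the distribution stated in the null hypothesis is correct (fe).

Admissions    Percent    fo     fe
0             40         55     60
1             30         50     45
2             20         32     30
3 or more     10         13     15
Total         100        150    150

Computation of fe
0.40 × 150 = 60
0.30 × 150 = 45
0.20 × 150 = 30
0.10 × 150 = 15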
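The expected frequencies follow directly from the AHAA percentages and the sample size. A short sketch, with illustrative names:

```python
# Percent of senior citizens by number of hospital admissions, as reported by the AHAA.
ahaa_percent = {"0": 40, "1": 30, "2": 20, "3 or more": 10}
n = 150  # residents surveyed at Bartow Estates

# Expected frequency = stated proportion x sample size.
expected = {category: pct * n / 100 for category, pct in ahaa_percent.items()}

for category, fe in expected.items():
    print(f"{category:>9} admissions: fe = {fe:.0f}")

# The expected frequencies must add up to the sample size.
assert abs(sum(expected.values()) - n) < 1e-9
```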
17-17
LO3
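Step 5: Compute the value of the chi-square statistic and make a decision.
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] = \frac{(55-60)^2}{60} + \frac{(50-45)^2}{45} + \frac{(32-30)^2}{30} + \frac{(13-15)^2}{15}$$
Computed χ2 = 0.4167 + 0.5556 + 0.1333 + 0.2667 = 1.3723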
17-18
LO3
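[Figure: chi-square distribution with critical value 7.815 and computed value 1.3723]
The computed χ2 of 1.3723 is less than the critical value of 7.815, so it is in the “Do not reject H0” region. The difference between the
observed and the expected frequencies can be attributed to chance.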
We conclude that there is no evidence of a difference between the local and national experience
for hospital admissions.
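The computation and decision can be reproduced as follows. The count of 13 for three or more admissions is derived as 150 - 55 - 50 - 32. Names are illustrative.

```python
observed = [55, 50, 32, 150 - 55 - 50 - 32]   # 0, 1, 2, 3+ admissions; last count is 13
expected = [60, 45, 30, 15]                   # AHAA proportions x 150

# Contribution of each category: (fo - fe)^2 / fe
contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi2 = sum(contributions)

critical_value = 7.815    # chi-square table, alpha = .05, 4 - 1 = 3 df
reject_h0 = chi2 > critical_value

# The unrounded sum is about 1.3722; adding the terms after rounding each
# to four decimals gives the 1.3723 reported in the text.
print(f"chi-square = {chi2:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```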
17-19
LO4 Conduct a test of hypothesis to verify that data grouped into
a frequency distribution is a sample from a normal distribution.
Testing the Hypothesis That a Distribution of Data Is From a Normal Population
Recall the frequency distribution of Applewood’s profits from
the sale of 180 vehicles. The frequency distribution is repeated
below.
Is it reasonable to conclude that the profit data is a sample
obtained from a normal population?
17-20
LO4
Testing the Hypothesis That a Distribution of Data Is From a Normal Population
Step 1: Calculate the probabilities for each class. Convert each class limit into a z-score using the mean of $1,843.17 and the standard deviation of $643.63, then find the probability.
17-21
LO4
Step 2: Use these probabilities to compute the expected frequencies for each class. For example, 0.0214 × 180 = 3.852.
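Steps 1 and 2 can be sketched with the standard normal cumulative distribution. The mean, standard deviation, and sample size come from the text; the class limits of $200 and $600 are an assumption for illustration, since the frequency table is not reproduced here, and the function names are hypothetical. With these limits the probability comes out close to the 0.0214 used above.

```python
import math

MEAN = 1843.17   # mean profit, from the text
SD = 643.63      # standard deviation of profit, from the text
N = 180          # vehicles sold

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def class_probability(lower, upper, mean=MEAN, sd=SD):
    """Probability that a normal value falls between two class limits."""
    z_lower = (lower - mean) / sd
    z_upper = (upper - mean) / sd
    return normal_cdf(z_upper) - normal_cdf(z_lower)

# Assumed class limits (not taken from the text).
lower, upper = 200, 600

p = class_probability(lower, upper)
expected_frequency = p * N
print(f"P({lower} <= profit < {upper}) = {p:.4f}")
print(f"expected frequency = {expected_frequency:.3f}")
```

Because the code does not round the z-scores to two decimals the way a printed table does, the expected frequency differs slightly from 3.852.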
17-22
LO4
Step 3: Compute the chi-square statistic using:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$$
17-23
LO4
Step 4: Compare the computed statistic to the critical value and make a statistical conclusion.
17-24
LO5 Use graphical methods to determine if a set
of sample data is from a normal distribution.
Graphical Approach to Confirm Normality: Anderson-Darling Test
Step 1: Create 2 cumulative distributions
a. Cumulative distribution of the raw data
b. Cumulative normal distribution
Step 2: Compare the 2 cumulative distributions
a. Search for the largest absolute numerical difference between the 2 distributions
b. Using a statistical test, if the difference is large, then we reject the null hypothesis that the data is normally distributed.
In the following graph the red dots represent the profit of each of the 180 vehicles from the Applewood Auto Group, and the blue
line, which is mostly covered by the red dots, represents a normal cumulative distribution. The graph shows that the profit
data closely follows the blue line and that the distribution of profits follows a normal distribution rather closely.
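Steps 1 and 2 can be sketched as follows. The profit values below are hypothetical, since the 180 Applewood profits are not listed here, and the function names are illustrative. The sketch only finds the largest absolute gap between the two cumulative distributions, as the steps describe. The Anderson-Darling statistic reported by statistical software is a weighted measure of the squared gaps, so this is not a substitute for that test.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative normal probability P(X <= x)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def largest_cdf_gap(data, mean, sd):
    """Largest absolute difference between the data's cumulative
    distribution and the normal cumulative distribution."""
    values = sorted(data)
    n = len(values)
    gap = 0.0
    for i, x in enumerate(values):
        f = normal_cdf(x, mean, sd)
        # The empirical distribution steps from i/n to (i + 1)/n at x,
        # so check the gap on both sides of the step.
        gap = max(gap, abs((i + 1) / n - f), abs(i / n - f))
    return gap

# Hypothetical profits (not the Applewood data).
profits = [1210, 1480, 1650, 1790, 1850, 1930, 2060, 2240, 2510]
mean = sum(profits) / len(profits)
sd = math.sqrt(sum((x - mean) ** 2 for x in profits) / (len(profits) - 1))

print(f"largest gap = {largest_cdf_gap(profits, mean, sd):.3f}")
```

A small gap means the two cumulative curves nearly coincide, which is what the graph shows for the Applewood profits.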
17-25
LO6 Conduct a test of hypothesis to determine
whether two classification criteria are related.
Contingency Table Analysis
A contingency table is used to investigate whether two
traits or characteristics are related. Each observation is
classified according to two criteria. We use the usual
hypothesis testing procedure.
• The degrees of freedom are equal to:
(number of rows - 1)(number of columns - 1).
• The expected frequency is computed as:
fe = (row total)(column total) / (grand total)
17-26
LO6
Contingency Analysis
We can use the chi-square statistic to formally test for a relationship between two
nominal-scaled variables. To put it another way, is one variable independent of the
other?
• Ford Motor Company operates an assembly plant in Dearborn, Michigan. The plant
operates three shifts per day, 5 days a week. The quality control manager wishes to
compare the quality level on the three shifts. Vehicles are classified by quality level
(acceptable, unacceptable) and shift (day, afternoon, night). Is there a difference in the
quality level on the three shifts? That is, is the quality of the product related to the shift
when it was manufactured? Or is the quality of the product independent of the shift on
which it was manufactured?
• A sample of 100 drivers who were stopped for speeding violations was classified by
gender and whether or not they were wearing a seat belt. For this sample, is wearing a
seat belt related to gender?
• Does a male released from federal prison make a different adjustment to civilian life if
he returns to his hometown or if he goes elsewhere to live? The two variables are
adjustment to civilian life and place of residence. Note that both variables are
measured on the nominal scale.
17-27
LO6
Contingency Analysis - Example

The Federal Correction Agency is investigating the
question, “Does a male released from federal prison
make a different adjustment to civilian life if he returns to
his hometown or if he goes elsewhere to live?” To put it
another way, is there a relationship between adjustment
to civilian life and place of residence after release from
prison? Use the .01 significance level.
17-28
LO6
The agency’s psychologists interviewed 200 randomly selected former prisoners. Using a series of questions, the psychologists classified the adjustment of each individual to civilian life as outstanding, good, fair, or unsatisfactory.
The classifications for the 200 former prisoners were tallied as follows. Joseph Camden, for example, returned to his hometown and has shown outstanding adjustment to civilian life. His case is one of the 27 tallies in the upper left box (circled).
17-29
LO6

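Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no relationship between adjustment to civilian life
and where the individual lives after being released from prison.
H1: There is a relationship between adjustment to civilian life
and where the individual lives after being released from prison.
Step 2: Select the level of significance.
α = 0.01 as stated in the problem
Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as χ2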
17-30
LO6
Step 4: Formulate the decision rule.
Reject H0 if the computed χ2 exceeds the critical value with α = .01 and (r - 1)(c - 1) = (2 - 1)(4 - 1) = 3 degrees of freedom:
$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,(r-1)(c-1)} = \chi^2_{.01,\,(2-1)(4-1)} = \chi^2_{.01,\,(1)(3)} = \chi^2_{.01,\,3} = 11.345$$
17-31
LO6
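Computing Expected Frequencies (fe)
For example, for a cell with row total 120 and column total 50 in the table of 200 former prisoners:
$$f_e = \frac{(120)(50)}{200} = 30$$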
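A minimal sketch of the expected-frequency and degrees-of-freedom calculations, using the totals shown above (row total 120, column total 50, grand total 200) and the 2 × 4 layout of the table. Function names are illustrative.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell frequency if the two classification criteria are unrelated."""
    return row_total * column_total / grand_total

def contingency_df(rows, columns):
    """Degrees of freedom for a contingency table."""
    return (rows - 1) * (columns - 1)

fe = expected_frequency(120, 50, 200)
df = contingency_df(2, 4)   # 2 residence categories, 4 adjustment ratings

print(f"fe = {fe:.0f}, df = {df}")
```

The same function would be applied to every cell of the table, each with its own row and column totals.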
17-32
LO6
Computing the Chi-square Statistic
17-33
LO6
Conclusion
[Figure: chi-square distribution with critical value 11.345 and computed value 5.729]
The computed χ2 of 5.729 is in the “Do not reject H0” region. The null hypothesis is not
rejected at the .01 significance level.
We conclude there is no evidence of a relationship between adjustment to civilian life and where
the prisoner resides after being released from prison. For the Federal Correction Agency’s
advisement program, adjustment to civilian life is not related to where the ex-prisoner lives.
17-34
LO6
Contingency Analysis - Minitab
17-35
