Chapter 15: Chi-Square Test: Student
Student: ___________________________________________________________________________
1. The chi-square test is a nonparametric test designed to analyze relationships between variables using what
type of information?
A. mean
B. frequencies
C. standard deviation
D. mode
2. A researcher was interested in examining the relationship between gender (male and female) and car
preference (SUV, sedan, or sports car). What type of table would be most appropriate for this data?
A. a two-way contingency table
B. a two-way table
C. a 2 X 3 table
D. all of these
3. If the researcher in the study above (examining gender and car preference) specifically designed the study so
that 50 men and 50 women would be selected for the study, then in the data table, the marginal frequencies for the
car choice variable would be _____, but the marginal frequencies for gender would be _____.
A. fixed; random
B. expected; determined
C. random; fixed
D. determined; independent
5. Conceptually, what are the “expected frequencies” telling us?
A. The mean of the participants' scores on the variable of interest.
B. The frequency that would be observed in each “cell” of the design if the null hypothesis were true.
C. The frequency that would be observed in each “cell” of the design if the null hypothesis were false.
D. the frequency of scores in the data collected.
6. As the degrees of freedom for the sampling distribution of the chi-square statistic decrease, the sampling
distribution becomes
A. more platykurtic.
B. more skewed.
C. more leptokurtic.
D. more normal in shape.
7. As N increases, results of the chi-square test become more similar to which of the following alternatives?
A. Fisher’s exact test
B. Yates’ correction for continuity
C. the phi coefficient
D. Cramer’s statistic
8. To assess the strength of the relationship, Cramer’s statistic is used when _____, and the fourfold point
correlation coefficient is used when _____.
A. sample sizes are small; sample sizes are large
B. both variables have two levels; one or both variables have more than two levels
C. one or both variables have more than two levels; both variables have two levels
D. sample sizes are large; sample sizes are small
9. What is the possible range of values for the fourfold point correlation coefficient/Cramer’s statistic?
A. -1.00 to +1.00
B. 0 to 1.00
C. -1.00 to 0
D. 0 to infinity
10. The interpretation of the obtained value for the fourfold point correlation coefficient/Cramer’s statistic is
similar to _____.
A. a Pearson correlation coefficient
B. eta-squared
C. omega-squared
D. Tukey HSD
11. Which of the following is an appropriate and viable approach to statistically analyzing the nature of the
relationship for a statistically significant chi-square test?
A. Goodman’s procedure
B. Fisher’s exact test
C. modified Bonferroni procedure
D. Cramer’s statistic
12. A researcher is interested in investigating 3 types of interventions designed to increase physical activity. The
independent variable is the type of intervention and the dependent variable is amount of physical activity,
measured in minutes per week. If this researcher wanted to use chi-square to analyze the data, what
modification would need to be made?
A. Formulas utilizing degrees of freedom need to be modified to N-3.
B. Subtract .5 from each of the differences between the observed and expected values.
C. At least double the amount of data should be collected.
D. The dependent variable needs to be collapsed into categories.
13. For the example in question 12, which is the more appropriate statistical test to analyze the data?
A. independent groups t test.
B. one-way between subjects analysis of variance
C. Pearson correlation coefficient
D. one-way within subjects analysis of variance
14. Which of the following is the appropriate way to report a chi-square statistic in a research report?
A. χ² (2, N = 200) = 15.71, p < .01.
B. χ² (2, 200) = 15.71, p = .01.
C. χ² = 15.71, p < .01, (2, N = 200).
D. χ² = 15.71, p < .01.
15. χ² (1, N = 200) = 12.89, p < .05.
What is the value of the chi-square statistic?
A. 200
B. 12.89
C. 1
D. .05
16. χ² (1, N = 200) = 12.89, p < .05. What is the value for the degrees of freedom?
A. 4
B. 3
C. 2
D. 1
18. One of the assumptions of the chi-square test is that the expected frequencies be no lower than
approximately 5. Which of the following is a strategy to ensure that this assumption is met?
A. conduct a goodness-of-fit test
B. add a third group to at least one of the variables
C. increase sample size
D. ensure that eta-squared is at least .15.
19. As the population value of Cramer’s statistic increases, what happens to the sample size needed to achieve a
desired power level?
A. The sample size requirement increases.
B. The sample size requirement decreases.
C. The sample size requirement is unchanged.
D. Cannot be determined without looking at the table.
20. The chi-square goodness of fit test is a _____ test.
A. descriptive
B. biased
C. bivariate
D. univariate
21. The chi-square test is typically used to analyze the relationship between two variables when
A. both variables are qualitative in nature.
B. the two variables have been measured on different individuals.
C. the observations on each variable are within-subjects in nature.
D. none of these
22. When the marginal frequencies for both variables under study are random, the chi-square test of _____ is
used, and when the marginal frequencies are random for one variable and fixed for the other, the chi-square test
of _____ is used.
A. independence; heterogeneity
B. heterogeneity; dependence
C. independence; homogeneity
D. homogeneity; independence
23. Yates' correction for continuity should not be used because it tends to reduce statistical power while adding
little control over Type I errors.
A. true
B. false
24. A chi-square test can only be used to analyze the relationship between qualitative variables.
A. true
B. false
25. If the researcher suspects that the strength of the relationship between two variables in the population is .30
as indexed by Cramer's statistic, what sample size should she use in a study involving a 2 x 4 contingency table
and an alpha level of .05 to achieve a power of .80?
A. 58
B. 76
C. 121
D. 172
26. If the researcher suspects that the strength of the relationship between two variables in the population is .50
as indexed by Cramer's statistic, what sample size should she use in a study involving a 3 x 3 contingency table
and an alpha level of .05 to achieve a power of .70?
A. 7
B. 19
C. 22
D. 28
27. A researcher is interested in whether or not domestic violence occurs in the same proportion in families of
alcoholics as it does in the population in general. If the relative frequency of domestic violence in the
population is .05 and the relative frequency of no domestic violence in the population is .95, what is the chi-
square goodness-of-fit value for the following data?
28. A researcher is interested in whether frequencies of ice cream preference are evenly distributed across the
three categories of vanilla, chocolate, and strawberry. What is the chi-square goodness-of-fit value of the
following data?
30. Which level of measurement is appropriate for the independent and dependent variables when using a chi-
square analysis?
A. nominal
B. ordinal
C. interval
D. ratio
31. Which of the following conditions would necessitate choosing a test other than chi-square test of
independence?
A. observations are within subjects.
B. the independent variable is qualitative.
C. the dependent variable is qualitative.
D. the observations on each variable are between subjects in nature.
32. A chi-square test may be used to analyze the relationship between two variables when ______.
A. both variables are qualitative in nature
B. the two variables have been measured on the same subjects
C. the observations on each variable are between-subjects in nature
D. all of these
33. The chi-square test is typically used to analyze the relationship between two variables when:
A. both variables are quantitative in nature and are measured on a level that at least approximates interval
characteristics
B. there is an extremely small sample size
C. the observations on each variable are within-subjects in nature
D. none of these
34. The use of frequency information is predicated on the fact that the chi-square test is designed for use with
_____ variables, and it is not appropriate to compute _____ for variables of this type.
A. quantitative; marginal frequencies
B. quantitative; contingency tables
C. qualitative; means
D. qualitative; degrees of freedom
35. The basis of analysis for the chi-square test is a(n) _____ table (also called a frequency or crosstabulation
table).
A. expected frequency
B. cellular
C. summary
D. contingency
37. The entries within the cells of a contingency table represent the number of individuals in the sample who are
characterized by the corresponding levels of the _____ and are referred to as _____.
A. marginal frequencies; observed frequencies
B. contingency table; unexpected frequencies
C. variables; expected frequencies
D. variables; observed frequencies
38. The sums of the frequencies in the corresponding rows or columns are referred to as:
A. observed frequencies
B. marginal frequencies
C. expected frequencies
D. total frequencies
39. If subjects in a gender and political party identification study were selected for participation without regard
to gender or political party identification, the marginal frequencies for both of these variables would be:
A. fixed
B. constrained
C. random
D. unidentified
40. When the marginal frequencies of both variables under study are random, the test is known as the chi-square
test of:
A. heterogeneity
B. dependence
C. independence
D. homogeneity
41. When the marginal frequencies are random for one variable and fixed for the other, the test is referred to as
the chi-square test of:
A. heterogeneity
B. dependence
C. independence
D. homogeneity
42. Which of the following situations refers to a chi-square test of independence?
A. when the marginal frequencies of both variables are random
B. when one of the marginal frequencies for a variable is fixed
C. when both of the variables have marginal frequencies that are fixed
D. all of these
44. As the observed and expected frequencies become more dissimilar, then we are more likely to ______.
A. accept the null
B. reject the null
C. increase the sample size
D. reject H1
48. As the discrepancy between the observed and expected frequencies becomes larger, the magnitude of the
chi-square statistic ______.
A. decreases
B. does not change
C. increases
D. increases by a factor of the square of the difference
50. Application of the chi-square test requires computation of a(n) _____ for each cell under the assumption of
no relationship between the two variables.
A. expected frequency
B. observed frequency
C. mean
D. standard error
51. The chi-square statistic is an index that reflects the overall difference between the _____ and the _____
frequencies.
A. observed; total
B. observed; expected
C. expected; marginal
D. expected; total
52. If two variables are unrelated in the population, the _____ value of chi-square will equal _____.
A. sample; 0
B. population; 1.0
C. sample; 1.0
D. population; 0
53. Because of sampling error, a chi-square computed from sample data might be _____ even when the null
hypothesis is true.
A. less than 0
B. uninterpretable
C. greater than 0
54. There are different chi-square distributions depending on the _____ associated with them.
A. standard errors
B. marginal frequencies
C. number of cells
D. degrees of freedom
55. One of the assumptions of the chi-square test is that the expected frequency for each cell is:
A. zero
B. nonzero
C. independent
D. random
56. Although the issue is controversial, statisticians generally recommend that the lowest _____ one should
have in order to use the chi-square statistic is somewhere around _____.
A. expected frequency; 5
B. observed frequency; 20
C. number of cells; 5
D. degrees of freedom; 10
57. As long as the _____ assumption is met, observed frequencies can be as low as _____.
A. homogeneity of variance; 10
B. normality; 1.0
C. observed frequency; 0
D. expected frequency; 0
58. When both of the variables under study have only two levels, the sampling distribution of the chi-square
statistic corresponds _____ to a chi-square distribution than when one or both variables have more than two
levels.
A. more precisely
B. exactly
C. more closely
D. less closely
59. Recent studies have shown that Yates’ correction for continuity should not be used, as it tends to reduce
_____ with little gain over _____ control.
A. cell frequencies; standard error
B. alpha levels; statistical
C. statistical power; Type I error
D. sample size; random error
60. Research has generally found _____ to be preferable to the chi-square test in small sample situations.
A. Yates’ Correction for Continuity
B. Fisher’s Exact Test
C. Fisher’s alpha test
D. None of these
61. As the degrees of freedom increase, the shape of the sampling distribution of the chi-square statistic will
become more _____.
A. positively skewed
B. negatively skewed
C. normal in shape
D. peaked in shape
62. When the degrees of freedom of a chi-square statistic are small (for example, 2), the shape of the sampling
distribution will be _______.
A. extremely positively skewed
B. highly negatively skewed
C. very normal
D. unknown
63. The degrees of freedom for the chi-square statistic for the test of independence depends on the _____.
A. number of levels of both variables
B. sample size
C. number of levels of the independent variable
D. number of levels of the dependent variable
64. Using Table J, what decision would you make if the observed chi-square statistic was 36.08 and your
sample size was 20?
A. p<.05, reject H0
B. p<.02, reject H0
C. p>.01, fail to reject H0
D. need more information
66. Which of the following is the recommended lowest expected frequency per cell one should have in order to
use the chi-square test?
A. 0
B. 5
C. .5
D. none of these
67. When analyzing data from a 2 X 2 contingency table, a correction factor has been suggested so that _______.
A. the sampling distribution better approximates the chi-square distribution
B. the power to reject the null increases
C. the degrees of freedom are correct
D. all of these
68. Yates' correction for continuity involves which of the following procedures?
A. subtracting 5 from the absolute value of |Oj - Ej|
B. subtracting .5 from the absolute value of |Oj - Ej|
C. subtracting .5 from the absolute value of |Oj - Ej|²
D. subtracting 5 from the absolute value of |Oj - Ej|²
70. Which of the following indices measure the strength of the relationship between two variables in a
contingency table when both variables have three levels?
A. the phi coefficient
B. Pearson correlation
C. Cramer's statistic
D. the fourfold point correlation coefficient
72. If a variable has been measured on a multi-valued quantitative scale, then creating categories and using a
chi-square procedure will usually _____.
A. violate the assumptions of the chi-square test
B. result in using a less powerful statistical test than alternatives
C. increase the sensitivity of the test
D. none of these
73. When trying to decide the sample size necessary to achieve a particular level of statistical power, what
information is needed?
A. size of the contingency table
B. population value of Cramer's statistic
C. the alpha level
D. all of these
74. The most common index of the strength of the relationship between two variables in a contingency table is
the:
A. fourfold point correlation coefficient
B. Cramér’s statistic
C. a and b
D. neither a nor b
75. The test of the chi-square statistic applies to the data as a whole and provides _____ as to which cells are
causing rejection of H0.
A. no information
B. partial information
C. complete information
D. none of these
76. The power of the chi-square test is further reduced when quantitative variables are collapsed into categories
because _____ by placing individuals with different scores into the same group.
A. considerable information is likely to be lost
B. more Type I errors are likely to be made
C. sampling difficulties may arise
D. all of these
77. The question addressed by the _____ test is whether the frequencies across categories for a
variable in a population are distributed in a specified manner.
A. nonparametric
B. goodness-of-fit
C. F
D. distribution-specific
78. For the chi-square goodness-of-fit test, statisticians recommend that the lowest _____ frequency of any cell
be between 5 and 10 in a 2X2 table and 5 in larger tables.
A. unexpected
B. total
C. observed
D. expected
79. The analytical procedures for the chi-square test of independence and the chi-square test of homogeneity are
identical.
True False
80. A correct example of a null hypothesis is stated: H0: Gender and political party identification are unrelated
in the sample.
True False
81. When scores on quantitative variables are collapsed into intervals, the power of the chi-square test is
reduced relative to that of a parametric test.
True False
82. A 2x5 contingency table indicates that the column variable has two levels and the row variable has 5 levels.
True False
83. The assignment of one variable as the row variable and the other variable as the column variable in a
contingency table is arbitrary.
True False
84. A modified Bonferroni procedure can be applied to pairwise comparisons between the proportions of
observations in various cells of the contingency table to analyze the nature of the relationship between two
qualitative variables.
True False
85. The chi-square test is typically used to analyze the relationship between two variables when both variables
are qualitative in nature (that is, measured on a nominal level).
True False
86. Unlike the previous tests we have considered, the chi-square test analyzes the relationship between variables
using frequency information.
True False
87. The use of frequency information is predicated on the fact that the chi-square test is designed for use with
quantitative variables, and it is not appropriate to compute means for variables of this type.
True False
90. The entries within cells represent the number of individuals in the sample who are characterized by the
corresponding levels of the variables and are referred to as marginal frequencies.
True False
91. Cell frequencies are the sum of the frequencies in the corresponding row or column.
True False
92. If people in a study on gender and political party identification were selected for participation without
regard to gender or political party identification, the marginal frequencies for both of these variables would be
free to vary, or random.
True False
93. If people in a gender and political party identification study were selected so that it consisted of a specified
number of individuals of each gender (for instance, 100 males and 100 females) who were then classified
according to political party identification, the marginal frequencies for the party identification variable would
still be random, but the marginal frequencies for the gender variable would be fixed.
True False
94. When the marginal frequencies of both variables under study are random, the chi-square test is known as the
chi-square test of dependence.
True False
95. When the marginal frequencies are random for one variable and fixed for the other, the analytical
procedures are identical but the test is referred to as the chi-square test of heterogeneity.
True False
96. The logic underlying the chi-square test focuses on the concept of expected frequencies and how they
compare to observed frequencies.
True False
97. The null hypothesis for a chi-square test states that the variables of interest are unrelated
in the population, while the alternative hypothesis states that there is a relationship between the two variables in
the population.
True False
98. Because the relationship between two variables can take a number of different forms, the alternative
hypothesis for the chi-square test, as with the F test in analysis of variance, is directional in nature.
True False
99. Application of the chi-square test requires computation of an expected frequency for each cell under the
assumption of no relationship between the two variables.
True False
100. In order to calculate an expected frequency for a given cell, you divide the total of the column (the column
marginal frequency) in which it appears by the total number of observations and then multiply this by the total
of the row (the row marginal frequency) in which the cell appears.
True False
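The expected-frequency calculation that item 100 refers to can be sketched in Python. This is a minimal illustration and not part of the original item set; the function name `expected_frequencies` and the 2 X 2 table of counts are hypothetical.

```python
def expected_frequencies(observed):
    """Expected cell frequencies under the null hypothesis of no relationship.

    observed: list of rows, each row a list of cell counts.
    Each expected value is (row total * column total) / N.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 X 2 table of observed counts (illustration only).
table = [[30, 20],
         [20, 30]]

expected = expected_frequencies(table)
print(expected)  # [[25.0, 25.0], [25.0, 25.0]]
```

Multiplying the row and column marginal frequencies and dividing by N gives the same result as dividing one marginal by N first and then multiplying by the other.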
101. The chi-square statistic is an index that reflects the overall difference between the observed and the
expected frequencies.
True False
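As a rough sketch of how the statistic in item 101 combines observed and expected frequencies, the following Python function sums the squared discrepancies, each divided by its expected frequency. The function name `chi_square_statistic` and the counts are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of a contingency table.

    observed, expected: lists of rows with matching shapes.
    """
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical observed counts and their expected counts under independence.
observed = [[30, 20], [20, 30]]
expected = [[25.0, 25.0], [25.0, 25.0]]

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 4.0
```

Because every term is a squared quantity divided by a positive expected frequency, the statistic cannot be negative, and it grows as observed and expected frequencies diverge.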
102. If two variables are unrelated in the population, the population value of chi-square will equal 1.0.
True False
103. Because of sampling error, a chi-square computed from sample data might be less than 0 even when the
null hypothesis is true.
True False
104. There are different chi-square distributions depending on the degrees of freedom associated with them.
True False
105. Since all discrepancies from the expected frequencies are reflected in the upper tail of the chi-square
distribution (as defined by the critical value), the chi-square test is nondirectional.
True False
106. The chi-square test is based on the assumption that the observations are independently sampled from the
population of all possible observations.
True False
107. The chi-square test is based on the assumption that the expected frequency for each cell is nonzero.
True False
108. Although the issue is controversial, statisticians generally recommend the use of the correction for
continuity.
True False
109. When both of the variables under study have only two levels, the sampling distribution of the chi-square
statistic corresponds more closely to a chi-square distribution than when one or both variables have more than
two levels.
True False
110. Yates’ correction for continuity involves subtracting .5 from the absolute value of Oj - Ej before these
quantities are squared, divided by Ej, and summed across cells.
True False
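The arithmetic described in item 110 can be sketched as below. This only illustrates the computation; it is not a recommendation to apply the correction, and the function name `yates_corrected_chi_square` and the counts are hypothetical.

```python
def yates_corrected_chi_square(observed, expected):
    """Chi-square with .5 subtracted from each |O - E| before squaring.

    observed, expected: flat lists of cell frequencies for a 2 X 2 table.
    """
    return sum(
        (abs(o - e) - 0.5) ** 2 / e
        for o, e in zip(observed, expected)
    )


# Hypothetical 2 X 2 cell counts, listed cell by cell.
observed = [30, 20, 20, 30]
expected = [25.0, 25.0, 25.0, 25.0]

corrected = yates_corrected_chi_square(observed, expected)
print(round(corrected, 2))  # 3.24, compared with 4.0 without the correction
```

The corrected value is smaller than the uncorrected one for the same data, which is consistent with the loss of statistical power that other items in this chapter mention.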
111. Research has found that Fisher’s Exact Test, an alternative method for testing the relationship between two
variables in 2 X 2 tables, is less powerful than the chi-square test.
True False
112. Probably the most common index of the strength of the relationship between two variables in a
contingency table is a measure known as the fourfold point correlation coefficient (as it is called when applied
to the relationship between variables with two levels each) or Cramér’s statistic (as it is called when one or both
variables have more than two levels).
True False
113. The fourfold point correlation coefficient and Cramér’s statistic can range from -1.00 to 1.00.
True False
114. Conceptually, a large value of V indicates a tendency for particular categories of one variable to be
associated with particular categories of the other variable.
True False
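For reference, V can be computed from a chi-square value with the standard formula V = sqrt(chi-square / (N(L - 1))), where L is the smaller of the number of rows and columns. The formula is stated here as an assumption, since the items themselves do not give it, and the function name `cramers_v` and the input values are hypothetical.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramér's statistic: sqrt(chi_square / (N * (L - 1))).

    L is the smaller of the number of rows and columns. For a 2 X 2
    table, L - 1 equals 1 and the result is the fourfold point
    correlation coefficient.
    """
    smaller_dimension = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (smaller_dimension - 1)))


# Hypothetical values: chi-square of 4.0 from a 2 X 2 table with N = 100.
v = cramers_v(4.0, 100, 2, 2)
print(round(v, 2))  # 0.2
```

Because the square root is taken of a nonnegative quantity, the index cannot fall below 0.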
115. The test of the chi-square statistic applies to the sample as a whole and provides useful information as to
which cells are responsible for rejecting the null hypothesis.
True False
116. Because the chi-square test is typically used to analyze the relationship between two qualitative variables,
it cannot be used when one or both variables are quantitative.
True False
117. The question addressed by the goodness-of-fit test is whether the frequencies across
categories for a variable in a population are distributed in a specified manner.
True False
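A goodness-of-fit calculation for a single variable can be sketched as follows. The function name `goodness_of_fit` and the counts for three categories are hypothetical; the specified manner here is an even distribution across categories.

```python
def goodness_of_fit(observed, proportions):
    """Chi-square goodness-of-fit value for one categorical variable.

    observed: list of category counts.
    proportions: hypothesized population proportions, summing to 1.
    """
    n = sum(observed)
    expected = [p * n for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories, tested against equal proportions.
counts = [30, 40, 50]
equal = [1 / 3, 1 / 3, 1 / 3]

value = goodness_of_fit(counts, equal)
print(round(value, 2))  # 5.0
```

With N = 120 and equal proportions, each expected frequency is 40, so the statistic is (100 + 0 + 100) / 40.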
118. For the chi-square goodness-of-fit test, statisticians generally recommend that the lowest expected
frequency of any category be somewhere between 5 and 10 if only two categories are involved and about 5 if
more than two categories are involved.
True False
119. Marginal frequencies are the sums of the frequencies in the corresponding ____________________ or
columns.
________________________________________
120. When a sample is selected so that it consists of a specified number for one of the two variables, the
marginal frequency for that variable is ____________________.
________________________________________
121. When the marginal frequencies are random for one variable and fixed for the other, the analytical
procedure is referred to as the ____________________.
________________________________________
122. Application of the chi-square test requires the computation of an ____________________ frequency for
each cell of a contingency table under the assumption that the null hypothesis is true.
________________________________________
123. In chi-square analysis, the discrepancy scores are combined to create an index known as the
____________________ statistic.
________________________________________
124. The chi-square statistic reflects the overall difference between the ____________________ and the
expected frequencies.
________________________________________
125. If computed for all possible random samples of a given size, a sampling distribution of the chi-square
statistic will closely approximate the theoretical distribution known as the ____________________
distribution.
________________________________________
126. The shape of the chi-square distribution depends on how many ____________________ are associated
with it.
________________________________________
127. ____________________ is a popular alternative to the chi-square test for testing the relationship between
two variables in a 2x2 contingency table, when sample sizes are small.
________________________________________
128. Power tables can be used to determine the power associated with the chi-square test given the dimensions
of the contingency table, the alpha level, the sample size, and the value of ____________________ statistic in
the population.
________________________________________
129. A common use for the ____________________ test is to test whether frequencies in the population are
evenly distributed across the categories under study.
________________________________________
130. When is Pearson correlation typically used to analyze the relationship between two variables?
131. How is the chi-square test different from previously considered statistical tests?
132. Distinguish between the chi-square test of independence and the chi-square test of homogeneity.
133. Discuss the concept of expected frequencies.
135. Discuss the sampling distribution of the chi-square statistic and the chi-square distribution.
137. What is Fisher’s Exact Test?
138. How is the strength of the relationship between two variables in a contingency table evaluated?
139. Why is it necessary to conduct follow-up tests in a chi-square analysis in order to determine the nature of
the relationship?
140. What are the advantages and disadvantages of applying the chi-square test to quantitative variables?
141. What is the goodness-of-fit test?
142. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
Observed Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily    Totals
Female    10       63         57       130
Male      8        39         23       70
Totals    18       102        80       200
Expected Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily
Female    11.7     66.3       ____
Male      ____     35.7       28
143. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
Expected Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily
Female    11.7     66.3       ____
Male      ____     35.7       28
144. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
Observed Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily    Totals
Female    10       63         57       130
Male      8        39         23       70
Totals    18       102        80       200
Expected Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily
Female    11.7     66.3       ____
Male      ____     35.7       28
145. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
Expected Perceptions of Typical Alcohol Use
Gender    Never    1+ Days    Daily
Female    11.7     66.3       ____
Male      ____     35.7       28
Chapter 15: Chi-Square Test Key
1. The chi-square test is a nonparametric test designed to analyze relationships between variables using what
type of information?
A. mean
B. frequencies
C. standard deviation
D. mode
2. A researcher was interested in examining the relationship between gender (male and female) and car
preference (SUV, sedan, or sports car). What type of table would be most appropriate for this data?
A. a two-way contingency table
B. a two-way table
C. a 2 X 3 table
D. all of these
3. If the researcher in the study above (examining gender and car preference) specifically designed the study so
that 50 men and 50 women would be selected for the study, then in the data table, the marginal frequencies for the
car choice variable would be _____, but the marginal frequencies for gender would be _____.
A. fixed; random
B. expected; determined
C. random; fixed
D. determined; independent
5. Conceptually, what are the “expected frequencies” telling us?
A. The mean of the participants' scores on the variable of interest.
B. The frequency that would be observed in each “cell” of the design if the null hypothesis were true.
C. The frequency that would be observed in each “cell” of the design if the null hypothesis were false.
D. the frequency of scores in the data collected.
6. As the degrees of freedom for the sampling distribution of the chi-square statistic decrease, the sampling
distribution becomes
A. more platykurtic.
B. more skewed.
C. more leptokurtic.
D. more normal in shape.
7. As N increases, results of the chi-square test become more similar to which of the following alternatives?
A. Fisher’s exact test
B. Yates’ correction for continuity
C. the phi coefficient
D. Cramer’s statistic
8. To assess the strength of the relationship, Cramer’s statistic is used when _____, and the fourfold point
correlation coefficient is used when _____.
A. sample sizes are small; sample sizes are large
B. both variables have two levels; one or both variables have more than two levels
C. one or both variables have more than two levels; both variables have two levels
D. sample sizes are large; sample sizes are small
9. What is the possible range of values for the fourfold point correlation coefficient/Cramer’s statistic?
A. -1.00 to +1.00
B. 0 to 1.00
C. -1.00 to 0
D. 0 to infinity
10. The interpretation of the obtained value for the fourfold point correlation coefficient/Cramer’s statistic is
similar to _____.
A. a Pearson correlation coefficient
B. eta-squared
C. omega-squared
D. Tukey HSD
11. Which of the following is an appropriate and viable approach to statistically analyzing the nature of the
relationship for a statistically significant chi-square test?
A. Goodman’s procedure
B. Fisher’s exact test
C. modified Bonferroni procedure
D. Cramer’s statistic
12. A researcher is interested in investigating 3 types of interventions designed to increase physical activity. The
independent variable is the type of intervention and the dependent variable is amount of physical activity,
measured in minutes per week. If this researcher wanted to use chi-square to analyze the data, what
modification would need to be made?
A. Formulas utilizing degrees of freedom need to be modified to N-3.
B. Subtract .5 from each of the differences between the observed and expected values.
C. At least double the amount of data should be collected.
D. The dependent variable needs to be collapsed into categories.
13. For the example in question 12, which is the more appropriate statistical test to analyze the data?
A. independent groups t test.
B. one-way between subjects analysis of variance
C. Pearson correlation coefficient
D. one-way within subjects analysis of variance
14. Which of the following is the appropriate way to report a chi-square statistic in a research report?
A. χ² (2, N = 200) = 15.71, p < .01.
B. χ² (2, 200) = 15.71, p = .01.
C. χ² = 15.71, p < .01, (2, N = 200).
D. χ² = 15.71, p < .01.
15. χ² (1, N = 200) = 12.89, p < .05.
What is the value of the chi-square statistic?
A. 200
B. 12.89
C. 1
D. .05
16. χ² (1, N = 200) = 12.89, p < .05. What is the value for the degrees of freedom?
A. 4
B. 3
C. 2
D. 1
18. One of the assumptions of the chi-square test is that the expected frequencies be no lower than
approximately 5. Which of the following is a strategy to ensure that this assumption is met?
A. conduct a goodness-of-fit test
B. add a third group to at least one of the variables
C. increase sample size
D. ensure that eta-squared is at least .15.
19. As the population value of Cramer’s statistic increases, what happens to the sample size needed to achieve a
desired power level?
A. The sample size requirement increases.
B. The sample size requirement decreases.
C. The sample size requirement is unchanged.
D. Cannot be determined without looking at the table.
20. The chi-square goodness of fit test is a _____ test.
A. descriptive
B. biased
C. bivariate
D. univariate
21. The chi-square test is typically used to analyze the relationship between two variables when
A. both variables are qualitative in nature.
B. the two variables have been measured on different individuals.
C. the observations on each variable are within-subjects in nature.
D. none of these
22. When the marginal frequencies for both variables under study are random, the chi-square test of _____ is
used, and when the marginal frequencies are random for one variable and fixed for the other, the chi-square test
of _____ is used.
A. independence; heterogeneity
B. heterogeneity; dependence
C. independence; homogeneity
D. homogeneity; independence
23. Yates' correction for continuity should not be used because it tends to reduce statistical power while adding
little control over Type I errors.
A. true
B. false
24. A chi-square test can only be used to analyze the relationship between qualitative variables.
A. true
B. false
25. If the researcher suspects that the strength of the relationship between two variables in the population is .30
as indexed by Cramer's statistic, what sample size should she use in a study involving a 2 x 4 contingency table
and an alpha level of .05 to achieve a power of .80?
A. 58
B. 76
C. 121
D. 172
26. If the researcher suspects that the strength of the relationship between two variables in the population is .50
as indexed by Cramer's statistic, what sample size should she use in a study involving a 3 x 3 contingency table
and an alpha level of .05 to achieve a power of .70?
A. 7
B. 19
C. 22
D. 28
27. A researcher is interested in whether or not domestic violence occurs in the same proportion in families of
alcoholics as it does in the population in general. If the relative frequency of domestic violence in the
population is .05 and the relative frequency of no domestic violence in the population is .95, what is the chi-
square goodness-of-fit value for the following data?
28. A researcher is interested in whether frequencies of ice cream preference are evenly distributed across the
three categories of vanilla, chocolate, and strawberry. What is the chi-square goodness-of-fit value of the
following data?
30. Which level of measurement is appropriate for the independent and dependent variables when using a chi-
square analysis?
A. nominal
B. ordinal
C. interval
D. ratio
31. Which of the following conditions would necessitate choosing a test other than chi-square test of
independence?
A. observations are within subjects.
B. the independent variable is qualitative.
C. the dependent variable is qualitative.
D. the observations on each variable are between subjects in nature.
32. A chi-square test may be used to analyze the relationship between two variables when ______.
A. both variables are qualitative in nature
B. the two variables have been measured on the same subjects
C. the observations on each variable are between-subjects in nature
D. all of these
33. The chi-square test is typically used to analyze the relationship between two variables when:
A. both variables are quantitative in nature and are measured on a level that at least approximates interval
characteristics
B. there is an extremely small sample size
C. the observations on each variable are within-subjects in nature
D. none of these
34. The use of frequency information is predicated on the fact that the chi-square test is designed for use with
_____ variables, and it is not appropriate to compute _____ for variables of this type.
A. quantitative; marginal frequencies
B. quantitative; contingency tables
C. qualitative; means
D. qualitative; degrees of freedom
35. The basis of analysis for the chi-square test is a(n) _____ table (also called a frequency or crosstabulation
table).
A. expected frequency
B. cellular
C. summary
D. contingency
37. The entries within the cells of a contingency table represent the number of individuals in the sample who are
characterized by the corresponding levels of the _____ and are referred to as _____.
A. marginal frequencies; observed frequencies
B. contingency table; unexpected frequencies
C. variables; expected frequencies
D. variables; observed frequencies
38. The sums of the frequencies in the corresponding row or column are referred to as:
A. observed frequencies
B. marginal frequencies
C. expected frequencies
D. total frequencies
39. If subjects in a gender and political party identification study were selected for participation without regard
to gender or political party identification, the marginal frequencies for both of these variables would be:
A. fixed
B. constrained
C. random
D. unidentified
40. When the marginal frequencies of both variables under study are random, the test is known as the chi-square
test of:
A. heterogeneity
B. dependence
C. independence
D. homogeneity
41. When the marginal frequencies are random for one variable and fixed for the other, the test is referred to as
the chi-square test of:
A. heterogeneity
B. dependence
C. independence
D. homogeneity
42. Which of the following situations refers to a chi-square test of independence?
A. when the marginal frequencies of both variables are random
B. when one of the marginal frequencies for a variable is fixed
C. when both of the variables have marginal frequencies that are fixed
D. all of these
44. As the observed and expected frequencies become more dissimilar, then we are more likely to ______.
A. accept the null
B. reject the null
C. increase the sample size
D. reject H1
48. As the discrepancy between the observed and expected frequencies becomes larger, the magnitude of the
chi-square statistic ______.
A. decreases
B. does not change
C. increases
D. increases by a factor of the square of the difference
50. Application of the chi-square test requires computation of a(n) _____ for each cell under the assumption of
no relationship between the two variables.
A. expected frequency
B. observed frequency
C. mean
D. standard error
51. The chi-square statistic is an index that reflects the overall difference between the _____ and the _____
frequencies.
A. observed; total
B. observed; expected
C. expected; marginal
D. expected; total
52. If two variables are unrelated in the population, the _____ value of chi-square will equal _____.
A. sample; 0
B. population; 1.0
C. sample; 1.0
D. population; 0
53. Because of sampling error, a chi-square computed from sample data might be _____ even when the null
hypothesis is true.
A. less than 0
B. uninterpretable
C. greater than 0
54. There are different chi-square distributions depending on the _____ associated with them.
A. standard errors
B. marginal frequencies
C. number of cells
D. degrees of freedom
55. One of the assumptions of the chi-square test is that the expected frequency for each cell is:
A. zero
B. nonzero
C. independent
D. random
56. Although the issue is controversial, statisticians generally recommend that the lowest _____ one should
have in order to use the chi-square statistic is somewhere around _____.
A. expected frequency; 5
B. observed frequency; 20
C. number of cells; 5
D. degrees of freedom; 10
57. As long as the _____ assumption is met, observed frequencies can be as low as _____.
A. homogeneity of variance; 10
B. normality; 1.0
C. observed frequency; 0
D. expected frequency; 0
58. When both of the variables under study have only two levels, the sampling distribution of the chi-square
statistic corresponds _____ to a chi-square distribution than when one or both variables have more than two
levels.
A. more precisely
B. exactly
C. more closely
D. less closely
59. Recent studies have shown that Yates’ correction of continuity should not be used, as it tends to reduce
_____ with little gain over _____ control.
A. cell frequencies; standard error
B. alpha levels; statistical
C. statistical power; Type I error
D. sample size; random error
60. Research has generally found _____ to be preferable to the chi-square test in small sample situations.
A. Yates’ Correction for Continuity
B. Fisher’s Exact Test
C. Fisher’s alpha test
D. None of these
61. As the degrees of freedom increase, the shape of the sampling distribution of the chi-square statistic will
become more _____.
A. positively skewed
B. negatively skewed
C. normal in shape
D. peaked in shape
62. When the degrees of freedom of a chi-square statistic are small (for example, 2), the shape of the sampling
distribution will be _______.
A. extremely positively skewed
B. highly negatively skewed
C. very normal
D. unknown
63. The degrees of freedom for the chi-square statistic for the test of independence depends on the _____.
A. number of levels of both variables
B. sample size
C. number of levels of the independent variable
D. number of levels of the dependent variable
64. Using Table J, what decision would you make if your observed chi-square statistic was 36.08 and your
sample size was 20?
A. p<.05, reject H0
B. p<.02, reject H0
C. p>.01, fail to reject H0
D. need more information
66. Which of the following is the recommended lowest expected frequency per cell one should have in order to
use the chi-square test?
A. 0
B. 5
C. .5
D. none of these
67. When analyzing data from a 2 X 2 contingency table, a correction factor has been suggested so that _______.
A. the sampling distribution better approximates the chi-square distribution
B. the power to reject the null increases
C. the degrees of freedom are correct
D. all of these
68. Yates' correction for continuity involves which of the following procedures?
A. subtracting 5 from the absolute value of |Oj - Ej|
B. subtracting .5 from the absolute value of |Oj - Ej|
C. subtracting .5 from the absolute value of |Oj - Ej|²
D. subtracting 5 from the absolute value of |Oj - Ej|²
70. Which of the following indices measure the strength of the relationship between two variables in a
contingency table when both variables have three levels?
A. the phi coefficient
B. Pearson correlation
C. Cramer's statistic
D. the fourfold point correlation coefficient
72. If a variable has been measured on a multi-valued quantitative scale, then creating categories and using a
chi-square procedure will usually _____.
A. violate the assumptions of the chi square test
B. result in using a less powerful statistical test than alternatives
C. increase the sensitivity of the test
D. none of these
73. When trying to decide the sample size necessary to achieve a particular level of statistical power, what
information is needed?
A. size of the contingency table
B. population value of Cramer's statistic
C. the alpha level
D. all of these
74. The most common index of the strength of the relationship between two variables in a contingency table is
the:
A. fourfold point correlation coefficient
B. Cramér’s statistic
C. a and b
D. neither a nor b
75. The test of the chi-square statistic applies to the data as a whole and provides _____ as to which cells are
causing rejection of H0.
A. no information
B. partial information
C. complete information
D. none of these
76. The power of the chi-square test is further reduced when quantitative variables are collapsed into categories
because _____ by placing individuals with different scores into the same group.
A. considerable information is likely to be lost
B. more Type I errors are likely to be made
C. sampling difficulties may arise
D. all of these
77. The question addressed by the _____ test is whether a distribution of frequencies across categories for a
variable in a population are distributed in a specified manner.
A. nonparametric
B. goodness-of-fit
C. F
D. distribution-specific
78. For the chi-square goodness-of-fit test, statisticians recommend that the lowest _____ frequency of any cell
be between 5 and 10 in a 2X2 table and 5 in larger tables.
A. unexpected
B. total
C. observed
D. expected
79. The analytical procedures for the chi-square test of independence and the chi-square test of homogeneity are
identical.
TRUE
80. A correct example of a null hypothesis is stated: H0: Gender and political party identification are unrelated
in the sample.
FALSE
81. When scores on quantitative variables are collapsed into intervals, the power of the chi-square test is
reduced relative to that of a parametric test.
TRUE
82. A 2x5 contingency table indicates that the column variable has two levels and the row variable has 5 levels.
FALSE
83. The assignment of one variable as the row variable and the other variable as the column variable in a
contingency table is arbitrary.
TRUE
84. A modified Bonferroni procedure can be applied to pairwise comparisons between the proportions of
observations in various cells of the contingency table to analyze the nature of the relationship between two
qualitative variables.
TRUE
85. The chi-square test is typically used to analyze the relationship between two variables when both variables
are qualitative in nature (that is, measured on a nominal level).
TRUE
86. Unlike the previous tests we have considered, the chi-square test analyzes the relationship between variables
using frequency information.
TRUE
87. The use of frequency information is predicated on the fact that the chi-square test is designed for use with
quantitative variables, and it is not appropriate to compute means for variables of this type.
FALSE
90. The entries within cells represent the number of individuals in the sample who are characterized by the
corresponding levels of the variables and are referred to as marginal frequencies.
FALSE
91. Cell frequencies are the sum of the frequencies in the corresponding row or column.
FALSE
92. If people in a study on gender and political party identification were selected for participation without
regard to gender or political party identification, the marginal frequencies for both of these variables would be
free to vary, or random.
TRUE
93. If people in a gender and political party identification study were selected so that it consisted of a specified
number of individuals of each gender (for instance, 100 males and 100 females) who were then classified
according to political party identification, the marginal frequencies for the party identification variable would
still be random, but the marginal frequencies for the gender variable would be fixed.
TRUE
94. When the marginal frequencies of both variables under study are random, the chi-square test is known as the
chi-square test of dependence.
FALSE
95. When the marginal frequencies are random for one variable and fixed for the other, the analytical
procedures are identical but the test is referred to as the chi-square test of heterogeneity.
FALSE
96. The logic underlying the chi-square test focuses on the concept of expected frequencies and how they
compare to observed frequencies.
TRUE
97. The null hypothesis for a chi-square test states that the variables of interest are unrelated
in the population, while the alternative hypothesis states that there is a relationship between the two variables in
the population.
TRUE
98. Because the relationship between two variables can take a number of different forms, the alternative
hypothesis for the chi-square test, as with the F test in analysis of variance, is directional in nature.
FALSE
99. Application of the chi-square test requires computation of an expected frequency for each cell under the
assumption of no relationship between the two variables.
TRUE
100. In order to calculate an expected frequency for a given cell, you divide the total of the column (the column
marginal frequency) in which it appears by the total number of observations and then multiply this by the total
of the row (the row marginal frequency) in which the cell appears.
TRUE
101. The chi-square statistic is an index that reflects the overall difference between the observed and the
expected frequencies.
TRUE
102. If two variables are unrelated in the population, the population value of chi-square will equal 1.0.
FALSE
103. Because of sampling error, a chi-square computed from sample data might be less than 0 even when the
null hypothesis is true.
FALSE
104. There are different chi-square distributions depending on the degrees of freedom associated with them.
TRUE
105. Since all discrepancies from the expected frequencies are reflected in the upper tail of the chi-square
distribution (as defined by the critical value), the chi-square test is nondirectional.
TRUE
106. The chi-square test is based on the assumption that the observations are independently sampled from the
population of all possible observations.
TRUE
107. The chi-square test is based on the assumption that the expected frequency for each cell is nonzero.
FALSE
108. Although the issue is controversial, statisticians generally recommend the use of the correction for
continuity.
FALSE
109. When both of the variables under study have only two levels, the sampling distribution of the chi-square
statistic corresponds more closely to a chi-square distribution than when one or both variables have more than
two levels.
FALSE
110. Yates’ correction for continuity involves subtracting .5 from the absolute value of Oj - Ej before these
quantities are squared, divided by Ej, and summed across cells.
TRUE
111. Research has found that Fisher’s Exact Test, an alternative method for testing the relationship between two
variables in 2 X 2 tables, is less powerful than the chi-square test.
FALSE
112. Probably the most common index of the strength of the relationship between two variables in a
contingency table is a measure known as the fourfold point correlation coefficient (as it is called when applied
to the relationship between variables with two levels each) or Cramér’s statistic (as it is called when one or both
variables have more than two levels).
TRUE
113. The fourfold point correlation coefficient and Cramér’s statistic can range from -1.00 to 1.00.
FALSE
114. Conceptually, a large value of V indicates a tendency for particular categories of one variable to be
associated with particular categories of the other variable.
TRUE
115. The test of the chi-square statistic applies to the sample as a whole and provides useful information as to
which cells are responsible for rejecting the null hypothesis.
FALSE
116. Because the chi-square test is typically used to analyze the relationship between two qualitative variables,
it cannot be used when one or both variables are quantitative.
FALSE
117. The question addressed by the goodness-of-fit test is whether a distribution of frequencies across
categories for a variable in a population are distributed in a specified manner.
TRUE
118. For the chi-square goodness-of-fit test, statisticians generally recommend that the lowest expected
frequency of any category be somewhere between 5 and 10 if only two categories are involved and about 5 if
more than two categories are involved.
TRUE
119. Marginal frequencies are the sums of the frequencies in the corresponding ____________________ or
columns.
rows
120. When a sample is selected so that it consists of a specified number for one of the two variables, the
marginal frequency for that variable is ____________________.
fixed
121. When the marginal frequencies are random for one variable and fixed for the other, the analytical
procedure is referred to as the ____________________.
chi-square test of homogeneity
122. Application of the chi-square test requires the computation of an ____________________ frequency for
each cell of a contingency table under the assumption that the null hypothesis is true.
expected
123. In chi-square analysis, the discrepancy scores are combined to create an index known as the
____________________ statistic.
chi-square
124. The chi-square statistic reflects the overall difference between the ____________________ and the
expected frequencies.
observed
125. If computed for all possible random samples of a given size, a sampling distribution of the chi-square
statistic will closely approximate the theoretical distribution known as the ____________________
distribution.
chi-square
126. The shape of the chi-square distribution depends on how many ____________________ are associated
with it.
degrees of freedom
127. ____________________ is a popular alternative to the chi-square test for testing the relationship between
two variables in a 2x2 contingency table, when sample sizes are small.
Fisher's exact test
128. Power tables can be used to determine the power associated with the chi-square test given the dimensions
of the contingency table, the alpha level, the sample size, and the value of ____________________ statistic in
the population.
Cramer's
129. A common use for the ____________________ test is to test whether frequencies in the population are
evenly distributed across the categories under study.
goodness-of-fit
130. When is the chi-square test typically used to analyze the relationship between two variables?
The chi-square test is typically used to analyze the relationship between two variables when: (1) both variables
are qualitative in nature (that is, measured on a nominal level); (2) the two variables have been measured on the
same individuals, and (3) the observations on each variable are between-subjects in nature.
131. How is the chi-square test different from previously considered statistical tests?
Unlike the previous statistical tests that we have considered, the chi-square test analyzes relationships between
variables using frequency information. The use of frequency information is predicated on the fact that the chi-
square test is designed for use with qualitative variables, and it is not appropriate to compute means for
variables of this type.
132. Distinguish between the chi-square test of independence and the chi-square test of homogeneity.
When the marginal frequencies of both variables under study are random, or free to vary, the test is referred to
as the chi-square test of independence. When the marginal frequencies are random for one variable and fixed for
the other, the analytical procedures are identical but the test is referred to as the chi-square test of homogeneity.
The logic underlying the chi-square test focuses on the concept of expected frequencies. An expected frequency
is the number of people you would expect to observe in a given cell, assuming the null hypothesis of no
relationship is true. If expected frequencies deviate substantially from observed frequencies, then the null
hypothesis is called into question.
134. Summarize the steps involved in computing expected frequencies.
The steps involved in the calculation of expected frequencies can be summarized as follows, based on the
frequencies observed in a contingency table: (1) for the cell in question, divide the total of the column (the
column marginal frequency) in which it appears by the total number of observations; (2) multiply this value by
the total of the row (the row marginal frequency) in which the cell appears.
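The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original answer key: the function name expected_frequencies and the 2 X 3 table of counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def expected_frequencies(observed):
    """Return expected frequencies for a contingency table.

    `observed` is a list of rows; each row is a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]        # row marginal frequencies
    col_totals = [sum(col) for col in zip(*observed)]  # column marginal frequencies
    n = sum(row_totals)                                # total number of observations
    # Step 1: column total / N.  Step 2: multiply by the row total.
    return [[(col_total / n) * row_total for col_total in col_totals]
            for row_total in row_totals]


# Hypothetical 2 X 3 table: rows = gender, columns = car preference.
observed = [[10, 20, 20],
            [20, 20, 10]]
expected = expected_frequencies(observed)
print(expected)
```

With these made-up counts, each row total is 50 and the column totals are 30, 40, and 30, so every row has expected frequencies of about 15, 20, and 15.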
135. Discuss the sampling distribution of the chi-square statistic and the chi-square distribution.
A sampling distribution of the chi-square statistic can be formed by computing sample chi-square statistics for
all possible random samples of a given size. This distribution closely approximates, under certain conditions, a
theoretical distribution called the chi-square distribution. There are different chi-square distributions depending
on the degrees of freedom associated with them. For the chi-square test that we are considering, df = (r - 1)(c -
1), where r is the number of levels of the row variable and c is the number of levels of the column variable.
Analogous to the previous sampling distributions we have considered, probability statements can be made with
respect to scores in the chi-square distribution. It is therefore possible to set an alpha level and define a critical
value such that if the null hypothesis is true, the probability of obtaining a chi-square value larger than that
critical value is less than alpha. Since all discrepancies from the expected frequencies are reflected in the upper
tail of the chi-square distribution (as defined by the critical value), the chi-square test is, by its nature,
nondirectional.
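As a rough sketch of the ideas above, the following Python code computes the chi-square statistic and df = (r - 1)(c - 1) for a table of counts. The function names and the example counts are hypothetical. The upper-tail probability is included only for df = 1 and df = 2, where simple closed forms exist; for other df a table of critical values (or a statistics library) would be needed.

```python
import math


def chi_square_independence(observed):
    """Return (chi_square, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency under H0
            chi_square += (o - e) ** 2 / e         # squared discrepancy divided by E
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df


def upper_tail_p(chi_square, df):
    """Upper-tail probability of a chi-square value (this sketch covers df 1 and 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(chi_square / 2))
    if df == 2:
        return math.exp(-chi_square / 2)
    raise ValueError("this sketch only covers df = 1 or df = 2")


# Hypothetical 2 X 3 table of observed frequencies.
table = [[10, 20, 20],
         [20, 20, 10]]
chi_sq, df = chi_square_independence(table)
p = upper_tail_p(chi_sq, df)
print(chi_sq, df, p)
```

For these made-up counts the statistic is about 6.67 with df = 2 and the upper-tail probability is about .036, so the null hypothesis would be rejected at an alpha level of .05. Only the upper tail is examined, which reflects the nondirectional nature of the test.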
The chi-square test is based on several assumptions. These ensure that the sampling distribution of the chi-
square statistic approximates a chi-square distribution. Specifically, the following are assumed: (1) the
observations are independently and randomly sampled from the population of all possible observations; and (2) the
expected frequency for each cell is nonzero.
Methods other than the chi-square statistic have been suggested for testing the relationship between variables in
2 X 2 tables. One of the more popular alternatives is Fisher’s Exact Test. Research has generally found the
Fisher exact method to be preferable to the chi-square test (although they converge to the same result as N
increases).
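For illustration, Fisher's exact test for a 2 X 2 table can be sketched with the standard library alone. The function name and the example counts are hypothetical. The two-sided rule used here (summing the probabilities of all tables that are no more likely than the observed one, with the margins held fixed) is one common convention and is an assumption, since the text does not specify one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 X 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest possible top-left count
    highest = min(row1, col1)      # largest possible top-left count
    # Sum probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical small-sample table with N = 8.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)
```

With only 8 observations, every expected frequency in this table is 2, well below the recommended minimum of about 5, which is the kind of small-sample situation where the exact method is preferred.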
138. How is the strength of the relationship between two variables in a contingency table evaluated?
Probably the most common index of the strength of association is a measure known as the fourfold point
correlation coefficient (as it is called when applied to the relationship between variables with two levels each) or
Cramér’s statistic (as it is called when one or both variables have more than two levels). The fourfold point
correlation/Cramér’s statistic can range from 0 to 1.00, where a value of 0 indicates no relationship and a value
of 1.00 indicates a perfect relationship.
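The answer above does not give a formula. A commonly used one, assumed here, is V = sqrt(chi-square / (N(L - 1))), where L is the smaller of the number of rows and the number of columns; for a 2 X 2 table this reduces to sqrt(chi-square / N). A minimal Python sketch, with a hypothetical function name:

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Fourfold point correlation (2 X 2 tables) or Cramer's statistic (larger tables).

    Uses the common formula sqrt(chi_square / (N * (L - 1))), where L is the
    smaller of the number of rows and columns (an assumption, not from the text).
    """
    smaller = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (smaller - 1)))


# Hypothetical use: a 2 X 2 table with chi-square = 12.89 and N = 200.
v = cramers_v(12.89, 200, 2, 2)
print(round(v, 2))
```

For these values the index is about .25, which falls inside the 0 to 1.00 range described above.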
139. Why is it necessary to conduct follow-up tests in a chi square analysis in order to determine the nature of
the relationship?
If the null hypothesis of no relationship is rejected, then additional steps are required to discern more fully the
nature of the relationship. The test of the chi-square statistic applies to the data taken as a whole and provides no
useful information as to which cells are responsible for rejecting the null hypothesis. Just as the Tukey HSD test
can be applied to break down the overall relationship following a statistically significant analysis of variance,
comparable tests can be applied following a statistically significant chi-square test.
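One way such follow-up comparisons might be controlled is sketched below. It assumes that the "modified Bonferroni procedure" mentioned elsewhere in this chapter refers to a Holm-style sequential procedure, in which the smallest p value is tested against alpha/m, the next against alpha/(m - 1), and so on, stopping at the first nonsignificant result. That reading, the function name, and the p values are all assumptions made for illustration.

```python
def holm_rejections(p_values, alpha=0.05):
    """Return a list of True/False rejection decisions, one per comparison.

    Holm-style sequential procedure (assumed here to be the "modified
    Bonferroni" approach): compare ordered p values to alpha / (m - rank).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p values also fail
    return reject


# Hypothetical p values from three pairwise follow-up comparisons.
decisions = holm_rejections([0.010, 0.040, 0.030])
print(decisions)
```

Here only the first comparison is declared significant: .010 is below .05/3, but .030 exceeds .05/2, so testing stops.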
140. What are the advantages and disadvantages of applying the chi-square test to quantitative variables?
While the chi-square test is typically used to analyze the relationship between two qualitative variables, it can
also be applied when one or both variables are quantitative. A common procedure involves classifying scores on
a quantitative variable into a small number of groups before applying the chi-square test. When possible,
however, it is usually preferable to analyze quantitative variables with the parametric tests discussed in prior
chapters than with the chi-square test. Parametric tests are usually preferred over the chi-square test because
they tend to be more powerful. The power of the chi-square test is further reduced when quantitative variables
are collapsed into categories because considerable information is likely to be lost by placing individuals with
different scores into the same group. While this approach implicitly assumes that all individuals assigned to a
given category are equivalent on the underlying dimension, this may not in fact be the case. An advantage of the
chi-square approach, however, is that a quantitative variable need be measured on only an ordinal level as
opposed to the approximately interval level required for parametric tests.
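The collapsing step described above can be sketched as follows. The cut points, category labels, and data are hypothetical; they simply show how scores measured in minutes per week might be grouped before a contingency table is built.

```python
from collections import Counter


def categorize(minutes, cut_points=(60, 150)):
    """Collapse minutes per week into 'low', 'moderate', or 'high' (hypothetical cut points)."""
    low_cut, high_cut = cut_points
    if minutes < low_cut:
        return "low"
    if minutes < high_cut:
        return "moderate"
    return "high"


# Hypothetical (intervention, minutes per week) records.
records = [("A", 30), ("A", 45), ("A", 160),
           ("B", 90), ("B", 155), ("B", 200)]

# Tally observed frequencies for each (intervention, category) cell.
counts = Counter((group, categorize(minutes)) for group, minutes in records)
print(counts)
```

Scores of 30 and 45 both land in the "low" category, and 155 and 200 both land in "high". This is the loss of information that reduces power: the chi-square test treats individuals within a category as equivalent.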
The question addressed by the goodness-of-fit test is whether a distribution of frequencies across categories for
a variable in a population are distributed in a specified manner. A common use of the goodness-of-fit test is to
test whether frequencies in the population are evenly distributed across the categories under study.
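A minimal sketch of the goodness-of-fit computation is shown below. The function name and the observed counts are hypothetical. When no proportions are supplied, the expected frequencies are spread evenly across categories; the degrees of freedom are taken as the number of categories minus 1, which is the usual rule for this test but is not stated in the text above.

```python
def goodness_of_fit(observed, proportions=None):
    """Return (chi_square, df) for a one-variable goodness-of-fit test.

    `observed` is a list of category counts; `proportions` gives the
    hypothesized population proportions (equal proportions if omitted).
    """
    n = sum(observed)
    k = len(observed)
    if proportions is None:
        proportions = [1 / k] * k                # evenly distributed under H0
    expected = [p * n for p in proportions]      # expected frequency per category
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, k - 1


# Hypothetical ice cream preferences: vanilla, chocolate, strawberry.
gof_chi_sq, gof_df = goodness_of_fit([40, 35, 25])
print(gof_chi_sq, gof_df)
```

For these made-up counts the statistic is about 3.5 with df = 2. Unequal hypothesized proportions, such as .05 and .95, can be passed through the proportions argument.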
142. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
What is the expected frequency for males who believed the typical student at their school never used alcohol in the last 30 days?
143. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
What is the expected frequency for females who believe the typical student at their school used alcohol daily in the last 30 days?
E = (CMF/N)(RMF) = (80/200)(130) = 52
144. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
Complete the table above for the computation of the chi-square statistic and calculate the value of chi-square.
145. The following tables summarize responses and partial chi-square test results from a survey asking college
students to respond to the following question: “Within the last 30 days, how often do you think the typical
student at your school used alcohol?” Both men and women responded to this question.
= = 0.11