Chi Square Test


CHI-SQUARE TESTS

16.1 TEST FOR INDEPENDENCE (Categorical Data)
The chi-square test of independence is a nonparametric statistical test used to determine whether two or more classifications of the samples are independent. It uses a contingency table (sometimes referred to as a cross-classification table) to examine the nature of the relationship between these variables.

By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable will not help us predict the value of the column variable, and likewise, knowing the value of the column variable will not help us predict the value of the row variable).

Contingency Table
                              Variable 1
Variable 2    1           2           …      c           Total
1             $O_{11}$    $O_{12}$    …      $O_{1c}$    $R_1$
2             $O_{21}$    $O_{22}$    …      $O_{2c}$    $R_2$
...           ...         ...         ...    ...         ...
r             $O_{r1}$    $O_{r2}$    …      $O_{rc}$    $R_r$
Total         $C_1$       $C_2$       …      $C_c$       $n$

Data Consideration
a) Use ordered or unordered numeric categorical variables (ordinal or nominal levels of measurement).
b) The data are assumed to be a random sample. The expected frequency for each category should be at least 1, and no more than 20% of the categories should have expected frequencies of less than 5. If these conditions are not met, use Fisher's exact test or another test.

Steps in testing independence between two variables:
1) Formulate the null and alternative hypotheses.
   $H_0: O_{ij} = E_{ij}$
   There is no relationship between the two variables (or the two variables are independent).
   $H_a: O_{ij} \neq E_{ij}$
   There is some relationship between the variables (or the two variables are dependent).
2) Determine the significance level (α).
3) Decision rule: Reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\nu}$, where $\nu = (r-1)(c-1)$.
4) Calculate the chi-square test statistic.
   $$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \quad \text{or} \quad \chi^2 = \sum_{k} \frac{(O_k - E_k)^2}{E_k}$$
   where:
   $\chi^2$ = the test statistic, which asymptotically approaches a chi-square distribution
   $O_{ij}$ = the observed frequency of the ith row and jth column
   $E_{ij}$ = the expected (theoretical) frequency of the ith row and jth column
   $$E_{ij} = \frac{R_i C_j}{n}$$
   $R_i$ = total of the ith row
   $C_j$ = total of the jth column
   $n$ = grand total of the table
   In the single-sum form, k runs over all cells of the table.
5) Decision
6) Conclusion

16.2 MEASURES OF ASSOCIATION
The chi-square test of independence indicates whether the association between two qualitative variables A and B can be regarded as statistically significant. The degree of association can be evaluated directly using measures of association, which are based on the computed chi-square value ($\chi^2$). The nearer the value of a measure of association is to 0, the weaker the association between the two variables (that is, the closer they are to independence). Here are some measures of association:
a) The phi coefficient is used for 2 by 2 tables.
b) The contingency coefficient C (Pearson's C) is used only for 5 by 5 tables or larger.
c) Cramer's V is the most popular measure of association, regardless of table size.

16.3 TESTING FOR SEVERAL PROPORTIONS
The steps are similar to those of the test for independence; however, the null hypothesis is that the several population proportions are all equal.
$H_0: \pi_1 = \pi_2 = \pi_3 = \dots = \pi_k$
$H_a$: Not all of $\pi_1, \pi_2, \pi_3, \dots, \pi_k$ are equal

EXERCISES:
1. Grades in a statistics course and in mathematical analysis for business, taken simultaneously, were as follows for a group of students.

                    Mathematical Analysis for Business Grade
Statistics Grade    A      B      C      Others
A                   25     6      17     13
B                   17     16     15     6
C                   18     4      18     10
Others              10     8      11     20

Are the grades in statistics and mathematical analysis for business related? Use α = 0.01 in reaching your conclusion.

2. A random sample of students is asked their opinions on a proposed core curriculum change. The results are as follows.
                   Opinion
Class         Favoring    Opposing
Freshman      120         80
Sophomore     70          130
Junior        60          70
Senior        40          60

Test the hypothesis that the proportions of opinions on the change are the same for all year levels. Use α = 0.05.

3. A company has to choose among three pension plans. Management wishes to know whether the preference for plans is independent of job classification and wants to use α = 0.05. The opinions of a random sample of 500 employees are shown below:

                        Pension Plan
Job Classification      1       2       3
Salaried workers        160     140     40
Hourly workers          40      60      60

4. A survey sampling example showing a cross-classification of gender by social class is given below. Use the chi-square test of independence to determine whether the gender and social class of the respondent are independent of each other. Use the 0.05 level of significance.

                   Gender
Social Class       Male    Female
Upper Middle       33      29
Middle             153     181
Working            103     81
Lower              16      14

5. A survey of a sample of adults in X city was conducted to examine public attitudes toward government cuts in social spending. Concerning these data, the researcher comments, “Respondents who knew someone on social assistance were more likely to feel that welfare rates were too low...”

                     Knows someone on social assistance
Welfare Spending     Yes     No
Too little           40      6
About right          16      13
Too much             9       7

6. In a survey, students were asked their views concerning whether statistics is an easier subject than algebra. The table below gives a cross-classification of respondent's year level by respondent's opinion concerning statistics. The question asked was, “Do you strongly agree, somewhat agree, somewhat disagree, or strongly disagree that statistics is easier than algebra?” Test the independence between year level and opinion concerning statistics. Use α = 0.05.

               Opinion
Year Level     Strongly Agree    Somewhat Agree    Somewhat Disagree    Strongly Disagree
I              45                33                12                   16
II             39                44                33                   15
III            88                82                80                   34
IV             84                76                52                   20

7. In order to investigate the relationship between employment status at the time a loan was arranged and whether or not the loan is now in default, a loan company manager randomly chooses 100 accounts, with the results indicated below.

                          Employment status at time of loan
Present status of loan    Employed    Unemployed    Total
In default                10          8             18
Not in default            60          22            82
Total                     70          30            100

a) Test the null hypothesis that employment status and status of the loan are independent variables, using the 5 percent level of significance for the test.
b) Test that the proportion of employed in default is equal to the proportion of unemployed in default. Use α = 0.05.

8. A survey was done by a notebook manufacturer concerning a particular make and model. A group of 500 potential customers were asked whether they purchased their current notebook because of its appearance, its performance rating, or its fixed price (no negotiating). The results, broken down by gender, are given below.

Observed frequencies
Gender     Appearance    Performance    Cost
Male       100           50             35
Female     80            170            65

Do females feel differently from males about the three different criteria used in choosing a notebook, or do they feel basically the same?
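The steps of Section 16.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original handout: the function names are made up for this example, the counts are the 2 by 2 table of Exercise 7, and the critical value 3.841 is the chi-square table value for α = 0.05 with ν = 1, entered by hand rather than computed.

```python
from typing import List, Tuple

Table = List[List[float]]


def expected_frequencies(observed: Table) -> Table:
    """E_ij = R_i * C_j / n for every cell of the contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed: Table) -> Tuple[float, int]:
    """Return the chi-square statistic and the degrees of freedom (r-1)(c-1)."""
    expected = expected_frequencies(observed)
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


def expected_counts_ok(observed: Table) -> bool:
    """Data Consideration rule: every E_ij >= 1 and at most 20% of cells have E_ij < 5."""
    cells = [e for row in expected_frequencies(observed) for e in row]
    small = sum(1 for e in cells if e < 5)
    return min(cells) >= 1 and small <= 0.2 * len(cells)


# Exercise 7 counts: rows = in default / not in default,
# columns = employed / unemployed at the time of the loan.
loans = [[10, 8], [60, 22]]

chi2, df = chi_square_statistic(loans)
CRITICAL_05_DF1 = 3.841  # assumed table value for alpha = 0.05, nu = 1
reject_h0 = chi2 > CRITICAL_05_DF1  # step 3: reject H0 if chi2 exceeds the critical value

print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
# prints: chi2 = 2.181, df = 1, reject H0: False
```

For larger tables the same functions apply unchanged; only the critical value has to be looked up again for the new α and ν.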

Page 2 of 2 |kkb2014
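Section 16.2 names three measures of association but does not state their formulas. The sketch below assumes the usual textbook definitions: phi = sqrt(χ²/n), Pearson's C = sqrt(χ²/(χ² + n)), and Cramer's V = sqrt(χ²/(n · min(r − 1, c − 1))). The function names and the input values (χ² = 4.0, n = 100, a 2 by 2 table) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def phi_coefficient(chi2: float, n: int) -> float:
    """Phi = sqrt(chi2 / n); intended for 2 by 2 tables."""
    return math.sqrt(chi2 / n)


def contingency_coefficient(chi2: float, n: int) -> float:
    """Pearson's C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramer's V = sqrt(chi2 / (n * min(r - 1, c - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Hypothetical inputs, not taken from any exercise in the handout.
chi2_value, sample_size = 4.0, 100

phi = phi_coefficient(chi2_value, sample_size)
pearson_c = contingency_coefficient(chi2_value, sample_size)
v = cramers_v(chi2_value, sample_size, n_rows=2, n_cols=2)

print(f"phi = {phi:.3f}, C = {pearson_c:.3f}, V = {v:.3f}")
```

For a 2 by 2 table min(r − 1, c − 1) is 1, so Cramer's V reduces to the phi coefficient; all three values move toward 0 as the chi-square value shrinks relative to n.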
