Chi - Square


CHI-SQUARE

Chi-square (χ²) is a statistical measure used to assess the significance of the difference between the observed
frequencies and the expected frequencies obtained from some hypothetical universe. Observed frequencies are those
obtained by conducting observations or an experiment. Expected frequencies are generated from the hypothesis being
tested.

Chi-Square Formula

$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e} \quad \text{or} \quad \chi^2 = \sum \frac{(o - e)^2}{e}$$
Where:
Fo = Observed frequency (o)
Fe = Expected frequency (e)
If the two distributions (observed and expected) are exactly alike, then χ² = 0.
If χ² > 0, there is a discrepancy between them. The larger the value of χ², the greater the discrepancy.
A level of significance is also needed to interpret the calculated chi-square. The most commonly used levels are
5% (0.05) and 1% (0.01).
If the calculated χ² is greater than the table (critical) value, the difference is significant.
If the calculated χ² is less than the table value, the difference is insignificant; it could have arisen by chance and
can be ignored.
Writing the table value as (E):
If χ² > (E), the difference is significant, hence reject the Null Hypothesis (Ho).
If χ² < (E), the difference is insignificant, hence accept the Null Hypothesis (Ho).
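
The formula and decision rule above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function names `chi_square_statistic` and `is_significant` are hypothetical, and the table value still has to be read from a chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (o - e)^2 / e over paired observed and expected frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def is_significant(statistic, table_value):
    """Decision rule from the notes: reject Ho when the calculated value exceeds the table value."""
    return statistic > table_value


# Identical observed and expected distributions give a statistic of 0.
print(chi_square_statistic([20, 30, 50], [20, 30, 50]))  # 0.0
```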

CONDITIONS FOR APPLICATION OF THE χ² TEST

• All the observations (items) in the sample must be independent.
• The sample size must be reasonably large for the differences between the observed and expected frequencies to be
approximately normally distributed. A sample size of at least 50 is recommended.
• No expected frequency should be small. An expected frequency of five or more is recommended; a category whose
expected frequency is less than five should be merged with another.
• The constraints must be linear.

Degrees of freedom: K = n - 1, where n is the number of categories
Or, for a table with r rows and c columns:
Degrees of freedom = (r - 1)(c - 1)
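
As a small sketch of the two degrees-of-freedom rules, the helpers below (hypothetical names, introduced here only for illustration) return the value to look up in the chi-square table.

```python
def dof_categories(n):
    """Degrees of freedom for n categories: K = n - 1."""
    return n - 1


def dof_table(rows, columns):
    """Degrees of freedom for a table with r rows and c columns: (r - 1)(c - 1)."""
    return (rows - 1) * (columns - 1)


print(dof_categories(5))  # 4
print(dof_table(3, 3))    # 4
```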

Example 1: The marketing research department of a company manufacturing coloured TV sets has chosen five towns, and it
is believed that each town has the same sales potential. The actual number of coloured TV sets sold by the company in
each town in a six-month period is given in the table below.
Required: Test the hypothesis that the five towns have equal sales potential, using a level of significance of 0.05.

Town     No. of sets sold
A        150
B        180
C        250
D        230
E        190
Total    1000

Soln
Expected sales: $E = \frac{\sum O}{n} = \frac{1000}{5} = 200$ per town

Town     Observed (o)   Expected (e)   (o - e)   (o - e)²/e
A        150            200            -50       12.5
B        180            200            -20       2
C        250            200            50        12.5
D        230            200            30        4.5
E        190            200            -10       0.5
Total    1000           1000                     32

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 32$$

K = n - 1 = 5 - 1 = 4
Given level of significance: 0.05
1 - 0.05 = 0.95
From the chi-square table, the value for 0.95 with 4 degrees of freedom is 9.49.
The table (critical) value is therefore 9.49.
The calculated χ² exceeds the table value, that is to say 32 > 9.49.
The Null Hypothesis is rejected: the difference is significant and cannot be attributed to chance.
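
The arithmetic of Example 1 can be checked with a short Python sketch. The variable names are illustrative, and the table value 9.49 is simply copied from the text above rather than computed.

```python
observed = [150, 180, 250, 230, 190]  # sets sold in towns A to E

# Under the null hypothesis every town has the same expected sales.
expected_per_town = sum(observed) / len(observed)

chi_square = sum((o - expected_per_town) ** 2 / expected_per_town for o in observed)
degrees_of_freedom = len(observed) - 1

table_value = 9.49  # taken from the text: 0.05 level, 4 degrees of freedom
reject_null = chi_square > table_value

print(expected_per_town, chi_square, degrees_of_freedom, reject_null)
# 200.0 32.0 4 True
```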

Example 2: The daily demand for loaves of bread in a given city is shown in the table below. Test the hypothesis that
the number of loaves of bread sold does not depend on the day of the week. Use a level of significance of 0.01.

Day of the week   Number of loaves sold
Mon               3100
Tue               3500
Wed               3300
Thurs             4800
Fri               4300
Sat               5000
Total             24000

Soln
Expected demand: $E = \frac{\sum O}{n} = \frac{24000}{6} = 4000$ per day
Day      Observed (o)   Expected (e)   (o - e)   (o - e)²/e
Mon      3100           4000           -900      202.5
Tue      3500           4000           -500      62.5
Wed      3300           4000           -700      122.5
Thurs    4800           4000           800       160
Fri      4300           4000           300       22.5
Sat      5000           4000           1000      250
Total    24000          24000                    820

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 820$$

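K = n - 1 = 6 - 1 = 5
Given level of significance: 0.01
1 - 0.01 = 0.99
From the chi-square table, the value for 0.99 with 5 degrees of freedom is 15.08, or 15.1 rounded.
The table (critical) value is therefore 15.1.
The calculated χ² exceeds the table value, that is to say 820 > 15.1.
The Null Hypothesis is therefore rejected: the number of loaves sold depends on the day of the week.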
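
A similar sketch for Example 2 keeps the contribution of each day separate, which makes it easy to see which days drive the large statistic. The dictionary layout and names are assumptions made for illustration; the table value 15.1 is copied from the text.

```python
loaves_sold = {
    "Mon": 3100, "Tue": 3500, "Wed": 3300,
    "Thurs": 4800, "Fri": 4300, "Sat": 5000,
}

# Equal demand on every day is the null hypothesis.
expected = sum(loaves_sold.values()) / len(loaves_sold)

# (o - e)^2 / e for each day, then the total.
contributions = {day: (o - expected) ** 2 / expected for day, o in loaves_sold.items()}
chi_square = sum(contributions.values())

table_value = 15.1  # taken from the text: 0.01 level, 5 degrees of freedom

for day, value in contributions.items():
    print(f"{day:<6}{value:>8.1f}")
print("chi-square:", chi_square, "reject Ho:", chi_square > table_value)
```

Saturday and Monday contribute the most (250 and 202.5), matching the last column of the table.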

Example 3: A random sample of 200 married men, all self-employed, were classified according to education and number
of children.

Education      Number of children
               0-3    4-7    Over 7   Total
Elementary     14     37     32       83
Secondary      19     42     17       78
University     12     17     10       39
Total          45     96     59       200

Using a 0.01 level of significance, test whether the size of a family is independent of the level of education attained
by the father.
Ho is that the father's education and the size of the family are statistically independent.

Soln.

Classification        Observed (o)   Expected (e)   (o - e)   (o - e)²/e
Elementary, 0-3       14             18.675         -4.675    1.170
Elementary, 4-7       37             39.840         -2.840    0.202
Elementary, Over 7    32             24.485         7.515     2.307
Secondary, 0-3        19             17.550         1.450     0.120
Secondary, 4-7        42             37.440         4.560     0.555
Secondary, Over 7     17             23.010         -6.010    1.570
University, 0-3       12             8.775          3.225     1.185
University, 4-7       17             18.720         -1.720    0.158
University, Over 7    10             11.505         -1.505    0.197
Total                 200            200                      7.464

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 7.464$$

K = (r - 1)(c - 1) = (3 - 1)(3 - 1) = 4
Given level of significance: 0.01
1 - 0.01 = 0.99
From the chi-square table, the value for 0.99 with 4 degrees of freedom is 13.277.
The table (critical) value is therefore 13.277.
The calculated χ² is below the table value, that is to say 7.464 < 13.277.
The Null Hypothesis is therefore accepted: the sample gives no evidence of a relationship between educational
attainment and number of children, so the two are treated as independent.

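Workings:
Expected frequency: $E = \frac{RT \times CT}{GT}$, where RT is the row total, CT the column total and GT the grand total.

1) Elementary: 83*45/200 = 18.675
               83*96/200 = 39.84
               83*59/200 = 24.485

2) Secondary:  78*45/200 = 17.55
               78*96/200 = 37.44
               78*59/200 = 23.01

3) University: 39*45/200 = 8.775
               39*96/200 = 18.72
               39*59/200 = 11.505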
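
The workings for Example 3 can be reproduced with a short Python sketch that derives every expected frequency from the row, column and grand totals and then accumulates the statistic. The data layout and variable names are assumptions made for illustration; the counts are those of the education by number-of-children table.

```python
# Observed counts: columns are 0-3, 4-7 and over 7 children.
table = {
    "Elementary": [14, 37, 32],
    "Secondary": [19, 42, 17],
    "University": [12, 17, 10],
}

row_totals = {level: sum(counts) for level, counts in table.items()}
column_totals = [sum(column) for column in zip(*table.values())]
grand_total = sum(row_totals.values())

chi_square = 0.0
for level, counts in table.items():
    for observed, column_total in zip(counts, column_totals):
        # E = row total * column total / grand total
        expected = row_totals[level] * column_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

degrees_of_freedom = (len(table) - 1) * (len(column_totals) - 1)

print(column_totals, grand_total)                # [45, 96, 59] 200
print(round(chi_square, 2), degrees_of_freedom)  # 7.46 4
```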
