9 - Correlation 1 v2
BBD 30402
By:
Definition
Pearson’s correlation coefficient measures the strength or the
degree of the linear relationship between two variables.
• It is assumed that both variables (often called X and Y) are of
interval or ratio scale.
• The data are approximately normally distributed.
Cont…
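$$ r = \frac{\sum XY - N\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - N\bar{X}^2\right)\left(\sum Y^2 - N\bar{Y}^2\right)}} $$

$$ r = \frac{SP}{\sqrt{SS_X \, SS_Y}} $$

$$ r_p = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left(N\sum X^2 - \left(\sum X\right)^2\right)\left(N\sum Y^2 - \left(\sum Y\right)^2\right)}} $$

Where:
$\bar{X}$ = mean of X
$\bar{Y}$ = mean of Y
N = number of observations in the sample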
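The deviation form and the raw-sums form above give the same value. The following is a minimal Python sketch that illustrates this, assuming plain lists of numbers; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_deviation_form(x, y):
    """r = SP / sqrt(SS_X * SS_Y), using deviations from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def pearson_raw_sums_form(x, y):
    """r_p computed from N, sum(XY), sum(X), sum(Y), sum(X^2), sum(Y^2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_deviation_form(x, y), 4))  # 0.7746
print(round(pearson_raw_sums_form(x, y), 4))   # 0.7746
```

Both functions fail with a division by zero if either variable is constant, because its sum of squares is then 0.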
How to choose the correlation?
Start
1. Is the data interval/ratio? No: use Spearman rank. Yes: go to step 2.
2. Is the data normally distributed? No: use Spearman rank. Yes: use Pearson correlation.
End
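The decision steps above can be written as a small function. This is only a sketch: the function name and its two boolean parameters are hypothetical, and deciding whether data are "normally distributed" is left to the analyst.

```python
def choose_correlation(is_interval_or_ratio: bool, is_normally_distributed: bool) -> str:
    """Return the correlation method suggested by the flowchart."""
    if not is_interval_or_ratio:
        # Ordinal (ranked) data: use Spearman rank.
        return "Spearman rank"
    if not is_normally_distributed:
        # Interval/ratio data that are not normal: use Spearman rank.
        return "Spearman rank"
    # Interval/ratio data that are approximately normal: use Pearson.
    return "Pearson"

print(choose_correlation(True, True))    # Pearson
print(choose_correlation(True, False))   # Spearman rank
```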
Example 1
A high school guidance counselor is interested in the relationship between
proximity to school and participation in extracurricular activities.
He collects data on the distance from home to school (in
miles) and the number of clubs joined for a sample of 10 juniors.
Using the following data, compute Pearson’s correlation coefficient and
determine whether it is significant.
Student     Distance to school (in miles), X    Number of clubs joined, Y
Lee         4                                   3
Rhonda      2                                   1
Jess        7                                   5
Evelyn      1                                   2
Mohammad    4                                   1
Steve       6                                   1
George      9                                   9
Juan        7                                   6
Chi         7                                   5
David       10                                  8
Solution
Step 1
Step 2
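$$ r = \frac{\sum XY - N\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - N\bar{X}^2\right)\left(\sum Y^2 - N\bar{Y}^2\right)}} $$

$$ r_p = \frac{299 - (10)(5.7)(4.1)}{\sqrt{\left(401 - (10)(5.7^2)\right)\left(247 - (10)(4.1^2)\right)}} $$

$$ r_p = \frac{65.3}{\sqrt{76.1\,(78.9)}} $$

$$ r_p = 0.84 $$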
Interpretation
Pearson's correlation coefficient was +0.84, indicating a strong
positive linear relationship between distance from home to school
and the number of clubs joined.
Pearson’s Correlation Coefficient
$H_0: \rho = 0$   Null hypothesis
$H_a: \rho \neq 0$ (or $\rho > 0$, or $\rho < 0$)   Alternative hypothesis
Degrees of freedom, df = n − 2
Test statistic:
$$ T = r_p \sqrt{\frac{n-2}{1 - r_p^2}} $$
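As an illustration, the test statistic can be evaluated for Example 1 (r_p = 0.84, n = 10). This is a minimal sketch; the function name is hypothetical, and the 5% significance level and the critical value 2.306 (two-tailed, df = 8, from a standard t table) are assumptions added here, not taken from the slides.

```python
from math import sqrt

def pearson_t_statistic(r, n):
    """Return (T, df) for testing H0: rho = 0, where T = r * sqrt((n - 2) / (1 - r^2))."""
    df = n - 2
    t = r * sqrt(df / (1 - r ** 2))
    return t, df

# Values from Example 1: r_p = 0.84 with n = 10 students.
t, df = pearson_t_statistic(0.84, 10)
print(df, round(t, 2))  # 8 4.38

# Assumed two-tailed critical value at alpha = 0.05 for df = 8.
T_CRITICAL = 2.306
print(abs(t) > T_CRITICAL)  # True: reject H0 at the assumed 5% level
```

The formula is undefined when r is exactly +1 or −1, since the denominator becomes zero.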
Example 2
Time (hours per week), X    4   10  14  12  4   5   8   11  13  15
Examination marks, Y        26  17  7   12  30  40  20  15  10  5

X     Y     XY     X²     Y²
4     26    104    16     676
10    17    170    100    289
14    7     98     196    49
12    12    144    144    144
4     30    120    16     900
5     40    200    25     1600
8     20    160    64     400
11    15    165    121    225
13    10    130    169    100
15    5     75     225    25
Σ 96  182   1366   1076   4408

$$ r_p = \frac{10(1366) - (96)(182)}{\sqrt{\left(10(1076) - 96^2\right)\left(10(4408) - 182^2\right)}} = \frac{-3812}{\sqrt{(1544)(10956)}} $$

$$ r_p = -0.927 $$
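The column totals and the coefficient for Example 2 can be recomputed directly from the raw data. The sketch below does this with the raw-sums formula; the variable names are chosen here for illustration.

```python
from math import sqrt

# Example 2 data: study time (hours per week) and examination marks.
hours = [4, 10, 14, 12, 4, 5, 8, 11, 13, 15]
marks = [26, 17, 7, 12, 30, 40, 20, 15, 10, 5]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(marks)
sum_xy = sum(x * y for x, y in zip(hours, marks))
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in marks)

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r_p = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 96 182 1366 1076 4408
print(round(r_p, 3))                         # -0.927
```

The negative sign indicates that examination marks fall as the recorded weekly hours rise in this data set.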
Example 3
$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} $$
• Where,
d = u − v (difference between each pair of ranks)
n = number of pairs
Example 1
Candidate      1    2    3    4    5    6    7    8
English (x)    50   58   35   86   76   43   40   60
Maths (y)      65   72   54   82   32   74   40   53
• Rank the results and hence find Spearman’s rank
correlation coefficient between the two sets of
marks. Comment on the value obtained.
Solution
Eng (x)   Rank (u)   Maths (y)   Rank (v)   d = u − v   d²
50        4          65          5          −1          1
58        5          72          6          −1          1
35        1          54          4          −3          9
86        8          82          8          0           0
76        7          32          1          6           36
43        3          74          7          −4          16
40        2          40          2          0           0
60        6          53          3          3           9
                                            ∑d² = 72
Solution
• Calculate the value of the test statistic, $r_s$:

$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}, \quad \text{where } d = u - v $$

$$ r_s = 1 - \frac{6(72)}{8(8^2 - 1)} $$

$$ r_s = 1 - \frac{432}{504} $$

$$ r_s = 0.143 $$

Interpretation:
There is a very weak positive correlation between English and
Mathematics ranking.
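The ranking and d² steps of this solution can be sketched in Python. The helper names are hypothetical, and the ranking helper assumes all values within a variable are distinct, as they are in this example.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n (largest); assumes no repeated values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(x, y):
    """Return (r_s, sum of d^2) using r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    u = ranks_no_ties(x)
    v = ranks_no_ties(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

english = [50, 58, 35, 86, 76, 43, 40, 60]
maths = [65, 72, 54, 82, 32, 74, 40, 53]

rs, sum_d2 = spearman_rs(english, maths)
print(ranks_no_ties(english))  # [4, 5, 1, 8, 7, 3, 2, 6]
print(ranks_no_ties(maths))    # [5, 6, 4, 8, 1, 7, 2, 3]
print(sum_d2, round(rs, 3))    # 72 0.143
```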
Exercise 2
• Early in the first semester, 10 students were
asked to sit a test to determine their
Mathematics ability. At the end of the first
semester they sat their Mathematics
examination. The data are not normally
distributed. Calculate the Spearman rank
correlation coefficient for the two sets of
marks and interpret the result.
Student   Pre-test   Examination marks
1         45         92
2         23         86
3         50         97
4         46         95
5         33         87
6         21         76
7         13         72
8         30         84
9         34         85
10        50         98
Solution
• Spearman's rho, $r_s = 0.94$
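The stated value can be checked with a short Python sketch. Two students share the pre-test mark 50, so the sketch assumes the usual convention of giving tied values the average of the ranks they occupy (9.5 each here); the slides do not state how ties are handled. With ties, the d² formula is an approximation. The helper name is hypothetical.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their positions."""
    order = sorted(values)
    ranks = []
    for v in values:
        first = order.index(v) + 1   # lowest position occupied by v
        count = order.count(v)       # number of values tied with v
        ranks.append(first + (count - 1) / 2)
    return ranks

pre_test = [45, 23, 50, 46, 33, 21, 13, 30, 34, 50]
exam = [92, 86, 97, 95, 87, 76, 72, 84, 85, 98]

u = average_ranks(pre_test)
v = average_ranks(exam)
n = len(pre_test)
sum_d2 = sum((a - b) ** 2 for a, b in zip(u, v))
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2)        # 10.5
print(round(rs, 2))  # 0.94
```

A value this close to +1 indicates a very strong positive association between pre-test and examination rankings.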