Spearman Rho Rank


WEST VISAYAS STATE UNIVERSITY

COLLEGE OF EDUCATION
GRADUATE SCHOOL
LA PAZ, ILOILO CITY

Professor: Elvira L. Arellano, Ph.D.
Course: FND 503 - Statistics

Discussants:
Aguacito,
Montialbucio, Jane Bryl
Tambolero, Christine Mae

Topic: The Spearman Rho Rank Correlation Coefficient

- In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles
Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a nonparametric measure of statistical
dependence between two variables. It assesses how well the relationship between two variables can be
described using a monotonic function. If there are no repeated data values, a perfect Spearman correlation of
+1 or −1 occurs when each of the variables is a perfect monotone function of the other.
Spearman's coefficient, like any correlation calculation, is appropriate for both continuous and discrete
variables, including ordinal variables.
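To make the idea of a monotonic relationship concrete, here is a minimal sketch with made-up numbers (the data and the helper name `ranks` are illustrative assumptions, not part of the handout). A cubic relationship is not linear, yet it preserves the ordering of the values, so the ranks of x and y coincide (or are exactly reversed for a decreasing function).

```python
def ranks(values):
    """Rank distinct values from smallest (1) to largest (n)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

# Hypothetical data chosen only for illustration.
x = [-2.0, -0.5, 1.0, 3.0, 4.5]
y_up = [v ** 3 for v in x]       # increasing, nonlinear function of x
y_down = [-(v ** 3) for v in x]  # decreasing, nonlinear function of x

print(ranks(x))       # [1, 2, 3, 4, 5]
print(ranks(y_up))    # [1, 2, 3, 4, 5]  same ranks as x
print(ranks(y_down))  # [5, 4, 3, 2, 1]  ranks reversed
```

Identical ranks correspond to a Spearman correlation of +1 and fully reversed ranks to −1, even though neither relationship is a straight line.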
- The Spearman rho rank correlation coefficient is denoted by r_s for sample data and by ρ_s for population
data. This correlation coefficient is simply the linear correlation coefficient between the ranks of the data. To
calculate the value of r_s, we rank the data for each variable, x and y, separately and denote those ranks by u
and v, respectively. Then we take the difference between each pair of ranks and denote it by d. Thus,
Difference between each pair of ranks = d = u - v
Next, we square each difference d and add these squared differences to find ∑d². Finally, we calculate the value of
r_s using the formula:

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where n is the number of paired observations.
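The formula can be sketched in a few lines of Python. This is only an illustration: the function name `spearman_from_ranks` is an assumption, and it expects the ranks u and v to have been assigned already.

```python
def spearman_from_ranks(u, v):
    """Return r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for two lists of ranks."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    n = len(u)
    if n < 2:
        raise ValueError("need at least two pairs of ranks")
    # d = u - v for each pair; square and add the differences.
    sum_d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(spearman_from_ranks([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(spearman_from_ranks([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```

The two calls show the extreme cases: identical ranks give ∑d² = 0 and r_s = 1, while fully reversed ranks give r_s = -1.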

In a test of hypothesis about the Spearman rho rank correlation coefficient ρ_s, the test statistic is r_s and its
observed value is calculated by using the above formula.
Example 14-12 shows how to calculate the Spearman rho rank correlation coefficient and how to perform a
test of hypothesis about ρ_s.
Example 14-12
Suppose we want to investigate the relationship between the per capita income (in thousands of dollars) and the infant
mortality rate (in percent) for different states. The following table gives data on these two variables for a random sample
of eight states.

Per capita income (x)   29.85   19.0   19.18   31.78   25.22   16.68   23.98   26.33
Infant mortality (y)     8.3    10.1   10.3     7.1     9.9    11.5     8.7     9.8

Based on these data, can you conclude that there is no significant (linear) correlation between the per capita
incomes and the infant mortality rates for all states? Use α = .05.
Solution. We perform the five steps to test the null hypothesis that there is no correlation between the two
variables against the alternative hypothesis that there is a significant correlation.
Step 1. State the null and alternative hypotheses.
The null and alternative hypotheses are as follows:
H0 : There is no correlation between per capita incomes and infant mortality rates in all states
H1 : There is a correlation between per capita incomes and infant mortality rates in all states
If we denote the Spearman correlation coefficient by ρ_s, the null hypothesis and the alternative hypothesis can
be written as
H0 : ρ_s = 0
H1 : ρ_s ≠ 0

Note that this is a two-tailed test.


Step 2. Select the distribution to use.
Because the sample is taken from a small population and the variables do not follow a normal
distribution, we use the Spearman rho rank correlation coefficient test procedure to make this test.
Step 3. Determine the rejection and nonrejection regions.
The test statistic that is used to make this test is r_s, and its critical values are given in Table XIV in
Appendix C. Note that, for this example,
n = 8 and α = .05
To read the critical value of r_s from Table XIV in Appendix C, we locate 8 in the column labeled n
and .05 in the top row of the table for a two-tailed test. The critical values of r_s are ±.738, or +.738 and -.738.
Thus, we will reject the null hypothesis if the observed value of r_s is either -.738 or less, or +.738 or greater.
The rejection and nonrejection regions for this example are shown in Figure 14.12.
Figure 14.12 Rejection and nonrejection regions. Reject H0 if r_s ≤ -.738 or r_s ≥ +.738; do not reject H0
if -.738 < r_s < +.738.

Critical Value of r_s. The critical value of r_s is obtained from Table XIV in Appendix C for the given sample
size and significance level. If the test is two-tailed, we use two critical values, one negative and one positive.
However, we use only the negative value of r_s if the test is left-tailed and only the positive value of r_s if the
test is right-tailed.
Step 4. Calculate the value of the test statistic.
In the Spearman rho rank correlation coefficient test, the test statistic is denoted by r_s, which is simply
the linear correlation coefficient between the ranks of the data. As explained in the beginning of this section, to
calculate the observed value of r_s we use the formula:

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where d = u – v, and u and v are the ranks of variables x and y, respectively.

Table 14.8
u      7     2     3     8     5     1     4     6
v      2     6     7     1     5     8     3     4
d      5    -4    -4     7     0    -7     1     2
d²    25    16    16    49     0    49     1     4      ∑d² = 160

Table 14.8 shows the ranks for x and y, which are denoted by u and v, respectively. The table also lists the
values of d, d², and ∑d². Note that if two or more values are equal, we use the average of their ranks for all of
them. Hence, the observed value of r_s is

$$r_s = 1 - \frac{6(160)}{8(64 - 1)} = 1 - \frac{960}{504} = -.905$$

Note that Spearman’s rho rank correlation coefficient has the same properties as the linear correlation
coefficient. Thus, -1 ≤ r_s ≤ 1 or -1 ≤ ρ_s ≤ 1, depending on whether sample or population data are used to
calculate the Spearman rho rank correlation coefficient. If ρ_s = 0, there is no relationship between the x and y
data. If 0 < ρ_s ≤ 1, on average a larger value of x is associated with a larger value of y. Similarly, if -1 ≤ ρ_s < 0,
on average a larger value of x is associated with a smaller value of y.
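The Step 4 calculation can be checked with a short script. This is a sketch under stated assumptions: the helper name `average_ranks` is hypothetical, the data are the eight pairs from Example 14-12, and tied values (none occur here) would receive the average of their ranks as described above. Note that the shortcut formula is exact only when there are no ties.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Data from Example 14-12.
income = [29.85, 19.0, 19.18, 31.78, 25.22, 16.68, 23.98, 26.33]  # x
mortality = [8.3, 10.1, 10.3, 7.1, 9.9, 11.5, 8.7, 9.8]           # y

u = average_ranks(income)
v = average_ranks(mortality)
n = len(u)
sum_d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2)         # 160.0
print(round(r_s, 3))  # -0.905
```

The ranks produced for x and y match the u and v rows of Table 14.8, and the result agrees with the hand calculation.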

Step 5. Make a decision.


Because r_s = -.905 is less than -.738, it falls in the rejection region. We therefore reject H0 and conclude that
there is a correlation between the per capita incomes and the infant mortality rates in all states. Because the
value of r_s from the sample is negative, we can also state that as per capita income increases, infant mortality
tends to decrease.
Decision Rule for the Spearman Rho Rank Correlation Coefficient.
The null hypothesis is always H0 : ρ_s = 0. The observed value of the test statistic is always the value of r_s
computed from the sample data. Let α denote the significance level and -c and +c be the critical values for the
Spearman rho rank correlation coefficient test obtained from Table XIV.
1. For a two-tailed test, the alternative hypothesis is H1 : ρ_s ≠ 0. If ±c are the critical values corresponding to
sample size n and two-tailed α, we reject H0 if either r_s ≤ -c or r_s ≥ +c; that is, reject H0 if r_s is “too small” or
“too large”.
2. For a right-tailed test, the alternative hypothesis is H1 : ρ_s > 0. If +c is the critical value corresponding to
sample size n and one-sided α, we reject H0 if r_s ≥ +c; that is, reject H0 if r_s is “too large”.
3. For a left-tailed test, the alternative hypothesis is H1 : ρ_s < 0. If -c is the critical value corresponding to
sample size n and one-sided α, we reject H0 if r_s ≤ -c; that is, reject H0 if r_s is “too small”.
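The three cases of the decision rule can be written as one small function. This is an illustrative sketch: the name `reject_h0` and the `tail` labels are assumptions, and the critical value c must still be looked up in Table XIV for the given n, α, and type of test, since the table is not reproduced here.

```python
def reject_h0(r_s, c, tail="two"):
    """Apply the decision rule; c is the positive critical value from the table."""
    if c <= 0:
        raise ValueError("c must be the positive critical value")
    if tail == "two":
        return r_s <= -c or r_s >= c   # "too small" or "too large"
    if tail == "right":
        return r_s >= c                # "too large"
    if tail == "left":
        return r_s <= -c               # "too small"
    raise ValueError("tail must be 'two', 'right', or 'left'")

# Example 14-12: n = 8, two-tailed alpha = .05, so c = .738 and r_s = -.905.
print(reject_h0(-0.905, 0.738, tail="two"))  # True
```

Passing c in as an argument keeps the sketch independent of any particular table of critical values.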

EXERCISES
CONCEPTS AND PROCEDURES
14.60 Two sets of paired data on two variables, x and y, have been ranked. In each case, the ranks for x and y
are denoted by u and v, respectively, and are shown in the tables. Calculate the Spearman rho rank correlation
coefficient for each case.
a.
u   2   1   3   4   6   5   7   8
v   8   6   7   4   5   2   1   3

b.
u   1   2   3   4   5   6   7
v   4   2   1   5   3   7   6

14.61 Calculate the Spearman rho rank correlation coefficient for each of the following data sets.
a.
x    5   10   15   20   25   30
y   17   15   12   14   10    9

b.
x   27   15   32   21   16   40    8
y   95   81  102   88   75  120   62
