

Journal of Educational Statistics

Fall 1989, Vol 14, No. 3, pp. 245-253

Critical Values for Spearman's Rank Order Correlation
Philip H. Ramsey
Queens College of CUNY

Key words: Edgeworth series, Pearson Type II curves, exact probabilities

Introductory statistics texts have been noted to have inaccuracies in the tables
of critical values for Spearman's correlation. Even the best texts currently
available use critical values from the exact distribution only for N ≤ 11.
Zar's table gives critical values for N ≤ 100 but does not use the most
accurate approximation procedure available. This paper provides a table of
critical values based on the exact distribution for 3 ≤ N ≤ 18 and very
accurate critical values for 19 ≤ N ≤ 100 estimated using the Edgeworth
approximation.

Tables of the exact probability of Spearman's rank correlation coefficient,
rs, are available for 2 ≤ N ≤ 18 (De Jonge & Van Montfort, 1972; Franklin,
1987a, 1987b, 1988a, 1988b; Lehmann, 1975; Otten, 1973; Owen, 1962).
Nijsse (1988) notes numerous differences among many introductory statistics
textbooks in the tables of critical values for rs. The most accurate table
is given in the textbook by Zar (1984), based on research by the same
author (Zar, 1972). Nijsse gives critical values for rs based on the exact
distribution for α = 0.01 and 0.05, directional and nondirectional tests, and
4 ≤ N ≤ 16. For larger values of N, Nijsse recommends using the table
provided by Zar (1972).
For N > 11, Zar's (1972) table is based on a Pearson Type II curve
approximation suggested by Olds (1938). Franklin (1987a, 1987b, 1988a)
reports the determination of the exact distribution of the rank correlation
for 12 ≤ N ≤ 18. Franklin (1988b) evaluates seven approximations and finds
the Pearson Type II curve clearly better than the other six; he recommends
the use of Zar's (1972) table for N > 18. However, Franklin did not include
the Edgeworth series approximation among his seven procedures. Best and
Roberts (1975) showed that the exact distribution of rs is approximated
more accurately by the Edgeworth series (David, Kendall, & Stuart, 1951)
than by the Pearson Type II curve used by Zar (1972). In particular, Best
and Roberts showed that the maximum absolute error, as compared to the exact
probability, is lower for the Edgeworth series for 7 ≤ N ≤ 13.

Downloaded from http://jebs.aera.net at PENNSYLVANIA STATE UNIV on May 12, 2016

Evaluation of Statistical Tests


The value of rs is usually expressed as

$$ r_s = 1 - \frac{6\sum d^2}{N^3 - N}, \tag{1} $$

where d is the difference between corresponding ranks of each variate, and
N is the sample size. For upper tail probabilities (of positive correlations)
the continuity corrected version is

$$ r'_s = 1 - \frac{6\left(\sum d^2 + 1\right)}{N^3 - N}. \tag{2} $$

For lower tail probabilities (of negative correlations) the value of Σd² is
reduced by 1. For conventional alpha levels, the continuity corrected
version should always provide a more conservative test than the uncorrected
version.
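
As an illustration of Equations 1 and 2, the following minimal Python sketch computes rs and its continuity corrected form from two rank vectors. It is not part of the original paper: the function name `spearman_rs` and its `correction` argument are hypothetical, untied ranks are assumed, and Equation 2 is read here as adding 1 to Σd² for the upper tail (and subtracting 1 for the lower tail).

```python
def spearman_rs(x_ranks, y_ranks, correction=0):
    """Spearman's rs from two untied rank vectors (Equation 1).

    correction=+1 gives the continuity corrected value for an upper tail
    test (Equation 2, as read here); correction=-1 gives the lower tail
    version. Both conventions are assumptions of this sketch.
    """
    n = len(x_ranks)
    if n != len(y_ranks) or n < 2:
        raise ValueError("rank vectors must have equal length >= 2")
    # Sum of squared differences between corresponding ranks.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * (sum_d2 + correction) / (n**3 - n)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]            # sum of d^2 is 4
print(spearman_rs(x, y))       # uncorrected rs
print(spearman_rs(x, y, +1))   # continuity corrected, upper tail
```

With these ranks the uncorrected value is 1 - 24/120 = 0.8 and the corrected value is 1 - 30/120 = 0.75, so the corrected statistic is the smaller (more conservative) of the two for an upper tail test.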
One of the simplest ways of testing rs for significance is by direct
comparison with a table of critical values. To construct a table using the
Edgeworth series we use the cumulants κr. The needed ratios are

[Equations 3 to 5, which express the cumulant ratios γ4, γ6, and γ8 as functions of N, are not recoverable from this copy.]

We also use the Tchebycheff-Hermite polynomials defined by

$$
\begin{aligned}
H_3 &= x^3 - 3x,\\
H_5 &= x^5 - 10x^3 + 15x,\\
H_7 &= x^7 - 21x^5 + 105x^3 - 105x,\\
H_9 &= x^9 - 36x^7 + 378x^5 - 1260x^3 + 945x,\\
H_{11} &= x^{11} - 55x^9 + 990x^7 - 6930x^5 + 17325x^3 - 10395x.
\end{aligned} \tag{6}
$$
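
The polynomials in Equation 6 follow the standard three-term recurrence H(n+1) = x·H(n) - n·H(n-1) with H0 = 1 and H1 = x. The short Python sketch below (an illustration, not from the paper; the function name `hermite_coeffs` is hypothetical) regenerates the coefficients so that the printed forms can be checked.

```python
def hermite_coeffs(order):
    """Coefficients of the Tchebycheff-Hermite polynomial H_order,
    listed from the constant term up to x**order."""
    prev, cur = [1], [0, 1]          # H_0 = 1, H_1 = x
    if order == 0:
        return prev
    for n in range(1, order):
        shifted = [0] + cur                              # x * H_n
        padded = prev + [0] * (len(shifted) - len(prev)) # H_{n-1}, same length
        prev, cur = cur, [a - n * b for a, b in zip(shifted, padded)]
    return cur


for k in (3, 5, 7, 9, 11):
    print(k, hermite_coeffs(k))
```

For example, `hermite_coeffs(5)` returns `[0, 15, 0, -10, 0, 1]`, that is, x⁵ - 10x³ + 15x.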
The standard deviation of rs is 1/√(N - 1). Using Equation 2 multiplied by
√(N - 1) we have the standardized version of the continuity corrected rs.
Applying Equations 3 to 6 and α(x) = exp(-x²/2)/√(2π), we have from
David, Kendall, and Stuart (1951) the complement of the distribution
function of the standardized version of rs,

$$
1 - F(x) = \int_x^{\infty} \alpha(t)\,dt
+ \alpha(x)\left\{ \frac{\gamma_4 H_3}{24} + \frac{\gamma_6 H_5}{720}
+ \frac{\gamma_4^2 H_7}{1152} + \frac{\gamma_8 H_7}{40320}
+ \frac{\gamma_6 \gamma_4 H_9}{17280} + \frac{\gamma_4^3 H_{11}}{82944} \right\}. \tag{7}
$$

A computer program using this method is available (Best & Roberts, 1975).
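
The structure of the Edgeworth upper tail formula (Equation 7) can be sketched in Python as follows. This is only an illustration and is not the Best and Roberts (1975) program: the function names are hypothetical, and the cumulant ratios γ4, γ6, and γ8 are taken as caller-supplied arguments rather than computed from N, so no values for them are assumed here.

```python
import math


def he(order, x):
    """Tchebycheff-Hermite polynomial H_order evaluated at x
    via H_{n+1} = x*H_n - n*H_{n-1}."""
    prev, cur = 1.0, x
    if order == 0:
        return prev
    for n in range(1, order):
        prev, cur = cur, x * cur - n * prev
    return cur


def edgeworth_upper_tail(x, g4, g6, g8):
    """1 - F(x): normal upper tail plus the Edgeworth correction terms.

    g4, g6, g8 are the cumulant ratios; they must be supplied by the caller.
    """
    normal_tail = 0.5 * math.erfc(x / math.sqrt(2))            # integral of alpha from x to infinity
    density = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)    # alpha(x)
    series = (g4 * he(3, x) / 24
              + g6 * he(5, x) / 720
              + g4**2 * he(7, x) / 1152
              + g8 * he(7, x) / 40320
              + g6 * g4 * he(9, x) / 17280
              + g4**3 * he(11, x) / 82944)
    return normal_tail + density * series


# With all ratios set to zero the series vanishes and the normal tail remains.
print(edgeworth_upper_tail(1.96, 0.0, 0.0, 0.0))
```

Setting the three ratios to zero recovers the standard normal upper tail, which is a convenient sanity check before substituting actual cumulant ratios.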
Two tests are commonly given in introductory textbooks for testing the
significance of rs. The t test is based on the statistic

$$ t = r_s \sqrt{\frac{N - 2}{1 - r_s^2}} \tag{8} $$

and Student's t distribution with df = N - 2. The t test will give exact results
when applied to Pearson's product moment correlation with an appropriate
normality assumption. It is only approximately correct with rs but is
recommended provided N ≥ 10 (Hays, 1988, p. 836). A continuity corrected
version, t', can be obtained by using r's of Equation 2 in Equation 8.
Another method for testing rs is a Z test based on the statistic

$$ Z = r_s \sqrt{N - 1} \tag{9} $$

and the standard normal distribution. The Z test has been advocated
provided N ≥ 30 (Berenson, Levine, & Rindskopf, 1988, p. 455; Marascuilo &
Serlin, 1988, p. 300).
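
The two statistics can be written directly from Equations 8 and 9. The Python sketch below is illustrative only; the function names are hypothetical, and the t statistic is taken in its usual form t = rs·√((N - 2)/(1 - rs²)), which is undefined at rs = ±1.

```python
import math


def t_statistic(rs, n):
    """Equation 8: t = rs * sqrt((N - 2) / (1 - rs^2)), with df = N - 2."""
    if abs(rs) >= 1:
        raise ValueError("t is undefined for |rs| = 1")
    return rs * math.sqrt((n - 2) / (1 - rs**2))


def z_statistic(rs, n):
    """Equation 9: Z = rs * sqrt(N - 1)."""
    return rs * math.sqrt(n - 1)


print(round(z_statistic(0.5944, 12), 3))
print(round(t_statistic(0.5944, 12), 3))
```

Passing the continuity corrected r's of Equation 2 to `t_statistic` gives the t' version described above.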
To evaluate the accuracy of the five tests of rs, the exact probability of
rejecting a true null was determined for a variety of α levels and values of
N. Table 1 gives the results for 12 ≤ N ≤ 18. For each value of N and α and
for each test, the minimum value of rs necessary for significance was
determined. The corresponding probability of that value of rs was determined
from the exact tables (Franklin, 1987a, 1987b, 1988a). For example, the
critical value, Z0.975 = 1.96, is used for directional Z tests at α = 0.025 or
nondirectional Z tests at α = 0.05. For N = 12, a value of rs = 0.5944 results
in Z = 1.971 from Equation 9. The next possible lower value of rs = 0.5874
results in Z = 1.948. Therefore, the rs value of 0.5944 is the smallest rs value
whose Z exceeds 1.96. From Franklin (1987b, 1988a) the exact probability
is 0.0229 as shown for Z in Table 1. The exact probability of rs = 0.5874
from Franklin (1987b, 1988a) is 0.0244. Because 0.0244 < 0.025, we would
reject the null hypothesis by the exact test but not by Z. The Z test is
unnecessarily conservative because its rejection probability of 0.0229 is less
than the exact probability 0.0244. The results in Table 1 give the exact
probability of rejection when each of the five tests is run at a specific α level
and sample size.
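
The search for the smallest rs that the Z test declares significant can be sketched in Python. This is an illustration rather than the procedure actually used for Table 1: the function names are hypothetical, and the sketch assumes that every even value of Σd² between 0 and (N³ - N)/3 is attainable, which is not true for very small N (for N = 3, Σd² = 4 cannot occur).

```python
import math


def attainable_rs(n):
    """Candidate rs values, largest first, assuming every even sum of d^2
    from 0 to (n^3 - n)/3 can occur (an assumption of this sketch)."""
    denom = n**3 - n
    max_sum_d2 = denom // 3
    return [1 - 6 * s / denom for s in range(0, max_sum_d2 + 1, 2)]


def smallest_rs_exceeding(n, z_crit):
    """Smallest candidate rs whose Z = rs * sqrt(n - 1) exceeds z_crit."""
    root = math.sqrt(n - 1)
    candidates = [r for r in attainable_rs(n) if r * root > z_crit]
    return min(candidates)


rs_min = smallest_rs_exceeding(12, 1.96)
print(round(rs_min, 4), round(rs_min * math.sqrt(11), 3))
```

For N = 12 this returns rs ≈ 0.5944 (Σd² = 116), and the next candidate below it is rs ≈ 0.5874 (Σd² = 118), in agreement with the worked example.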
We can define a conservative test as one in which the true probability of
a Type I error is always less than or equal to the nominal α level. A
nonconservative test would be one in which the true probability sometimes
exceeds the nominal level. It should be noted that a test in which the true
probability of a Type I error is always identical to the nominal α level would
be conservative by the present definition but would, of course, not be
nonconservative. Clearly, the t and t' tests are nonconservative with actual
Type I error rates somewhat higher than nominal levels.
With accurate tables of critical values available for rs, a stringent criterion
for use of an approximate test would seem justified. Bradley's (1978)
negligible criterion for nonrobustness calls for an upper limit of 1.1α as the
maximum true Type I error rate for any test at level α. Applying that
criterion to the results of t in Table 1 indicates that a 0.025 level, directional
t test would have a true rate less than 0.0275 for N ≥ 12. Although not
shown in Table 1, the criterion was also met for N ≥ 10. This agrees exactly
with the recommendation by Hays (1988) that t tests be used with N ≥ 10
(provided we limit α to 0.05 and 0.025). However, a directional test at
α = 0.01 would require N ≥ 16 to limit the true error rate to 0.011 for the
t test. At α = 0.005 even an N of 18 would not be adequate to limit the true
rate to 0.0055. Also in Table 1, we see that t' applied at the 0.01 level would
require N ≥ 14 to limit the true Type I error rate to 0.011. However, t'
shows no advantage over t for α ≤ 0.005 in meeting the 1.1α criterion.

TABLE 1
Exact probability of rejecting the hypothesis ρs = 0 based on five approximations

             Nominal, directional alpha levels
N    Test    0.05     0.025    0.01     0.005    0.001
12   Z       0.0521   0.0229   0.0065   0.0019   0.00001
     t       -        0.0260   0.0110   0.0059   0.00146
     t'      -        0.0260   0.0110   0.0059   0.00126
     Pr      -        -        0.0101   -        0.00080
     Ed      -        -        -        0.0043   0.00080
     Ex      0.0495   0.0244   0.0093   0.0048   0.00093
13   Z       0.0507   0.0237   0.0068   0.0023   0.00004
     t       0.0507   0.0263   0.0111   0.0059   0.00139
     t'      -        -        0.0111   0.0059   0.00139
     Pr      -        -        -        -        0.00078
     Ed      -        -        -        -        0.00088
     Ex      0.0485   0.0249   0.0097   0.0047   0.00099
14   Z       0.0504   0.0229   0.0072   0.0025   0.00009
     t       0.0504   0.0261   0.0112   0.0060   0.00133
     t'      -        -        0.0106   0.0057   0.00133
     Pr      -        -        0.0101   0.0050   0.00085
     Ed      -        -        -        -        -
     Ex      0.0486   0.0250   0.0095   0.0047   0.00093
15   Z       0.0501   0.0235   0.0074   0.0029   0.00013
     t       0.0501   0.0262   0.0111   0.0058   0.00134
     t'      0.0501   0.0253   0.0106   0.0055   0.00134
     Pr      -        -        -        -        0.00087
     Ed      -        -        -        0.0047   -
     Ex      0.0486   0.0244   0.0097   0.0050   0.00094
16   Z       0.0505   0.0232   0.0076   0.0029   0.00017
     t       0.0505   0.0254   0.0108   0.0058   0.00135
     t'      -        0.0254   0.0108   0.0055   0.00128
     Pr      -        -        -        -        0.00090
     Ed      -        -        0.0096   -        -
     Ex      0.0493   0.0247   0.0100   0.0049   0.00095
17   Z       0.0509   0.0239   0.0078   0.0032   0.00022
     t       -        0.0258   0.0108   0.0056   0.00128
     t'      -        0.0252   0.0105   0.0056   0.00128
     Pr      -        0.0252   -        0.0050   0.00091
     Ed      -        -        -        -        -
     Ex      0.0498   0.0245   0.0098   0.0048   0.00096
18   Z       0.0509   0.0239   0.0081   0.0032   0.00026
     t       -        0.0255   0.0107   0.0056   0.00130
     t'      -        0.0255   0.0107   0.0056   0.00125
     Pr      -        -        -        -        0.00094
     Ed      -        -        -        -        -
     Ex      0.0499   0.0250   0.0099   0.0049   0.00098

Note. Z, Z test; t, t test; t', continuity corrected t test; Pr, Pearson approximation;
Ed, Edgeworth series; Ex, exact distribution. Entries identical to the exact
distribution are omitted (marked -).

Table 1 shows Z to be a conservative test for N ≤ 18 provided α ≤ 0.025.
However, Z is so conservative there is likely to be a large power loss in
comparison to the exact test or one of the better approximate tests. At
α = 0.05 Table 1 shows Z to be a nonconservative test. For N ≥ 30 the
critical values from Equation 9 show fairly good agreement with the values
obtained from either the Pearson Type II curve or the Edgeworth series
approximation. However, the maximum absolute error in critical values
for directional and nondirectional tests at the 0.05 or 0.01 levels would be
0.011. This is just slightly above the limit used by Nijsse (1988) to represent
nonnegligible differences in critical values. Requiring N ≥ 33 would
correspond exactly to the requirement that the critical value not be in error by
as much as 0.01. This is fairly good agreement with the N ≥ 30 suggestion
by Berenson et al. (1988) and Marascuilo and Serlin (1988).
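
Bradley's 1.1α criterion, as applied above to the t and t' tests at α = 0.01, can be checked mechanically. The Python sketch below is illustrative only: the function name is hypothetical, and the rates are the 0.01-level entries of Table 1 for N = 12 to 18, with column positions for rows that have omitted entries inferred from the magnitudes of the values.

```python
# True Type I error rates at nominal directional alpha = 0.01 (Table 1).
# Column assignment for rows with omitted entries is an inference.
T_RATES = {12: 0.0110, 13: 0.0111, 14: 0.0112, 15: 0.0111,
           16: 0.0108, 17: 0.0108, 18: 0.0107}
T_PRIME_RATES = {12: 0.0110, 13: 0.0111, 14: 0.0106, 15: 0.0106,
                 16: 0.0108, 17: 0.0105, 18: 0.0107}


def smallest_n_meeting_criterion(rates, alpha, factor=1.1):
    """Smallest tabled N such that the true rate stays at or below
    factor * alpha for that N and every larger tabled N."""
    limit = factor * alpha + 1e-12   # small tolerance for float rounding
    ok_from = None
    for n in sorted(rates, reverse=True):
        if rates[n] <= limit:
            ok_from = n
        else:
            break
    return ok_from


print(smallest_n_meeting_criterion(T_RATES, 0.01))        # t test
print(smallest_n_meeting_criterion(T_PRIME_RATES, 0.01))  # t' test
```

The function scans downward from the largest N so that an isolated small N that happens to meet the limit (such as N = 12 for t) does not count unless all larger N also meet it.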
The results in Table 1 show that both the Pearson and the Edgeworth
approximations are quite accurate. There are 12 cases, however, in which
the Pearson result differs from the exact distribution and 5 cases of
differences between the Edgeworth and exact results. Different critical values
also would result from the Pearson and Edgeworth approximations.

TABLE 2
Accuracy of estimating critical values (CVs) for the Edgeworth and Pearson
approximations at nine alpha levels

      Number of absolute errors > 0.001    Maximum absolute error in CV
N     Edgeworth    Pearson                 Edgeworth    Pearson
12    3            5                       0.007        0.014
13    1            2                       0.006        0.011
14    0            4                       0.000        0.009
15    1            3                       0.003        0.007
16    2            2                       0.003        0.003
17    1            5                       0.003        0.005
18    1            2                       0.002        0.003

Note. Nine alpha levels are 0.25, 0.10, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.001, and
0.0005.

Table 2 summarizes the difference between critical values of rs from the
exact distribution and the Edgeworth and Pearson approximations based
on nine alpha levels. The comparisons include the number of errors and the
maximum absolute error in the critical values. These results clearly support
the findings of Best and Roberts (1975) that the Edgeworth approximation
is more accurate than is the Pearson. For a given N the smallest difference
that can be found between two values of rs is 6/(N³ - N), which is 0.00103
for N = 18. The results in Table 2 include all absolute errors > 0.001, which
means that all possible errors for these α and N values have been included.
Nijsse (1988) used the criterion that absolute errors of at least 0.01 in
critical values should be considered nonnegligible. With this criterion, the
tables of Nijsse and those of Zar (1972) would be adequate for α levels
equal to or larger than 0.01, but there would still be some small errors that
could be avoided by using more accurate tables.
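
The step-size argument above is a one-line computation. The Python sketch below (illustrative; the function name `rs_step` is hypothetical) evaluates 6/(N³ - N) over the range of N in Table 2 and confirms that the step is smallest at N = 18 and still exceeds 0.001 there.

```python
def rs_step(n):
    """Step between neighbouring rs values as stated in the text: 6 / (N^3 - N)."""
    return 6 / (n**3 - n)


steps = {n: rs_step(n) for n in range(12, 19)}
for n, step in steps.items():
    print(n, round(step, 5))

# Any nonzero error in a critical value is at least one step,
# so a threshold of 0.001 misses no errors for N = 12..18.
print(min(steps.values()) > 0.001)
```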
An Accurate Table of Critical Values
Table 3 gives exact critical values for 3 ≤ N ≤ 18 from the previously cited
literature and approximate values for 19 ≤ N ≤ 100 based on the Edgeworth
approximation. The results in Tables 1 and 2 indicate that these
critical values for N ≥ 19 should be accurate; no error in the critical values
should be greater than 0.002 for these N. As suggested by Nijsse and by
Zar, the t test can be used for testing significance for N > 100.
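
For very small N, exact critical values of this kind can be reproduced by brute-force enumeration of all N! rank permutations. The Python sketch below is only an illustration under that approach and is not a reconstruction of how the cited exact tables were computed; the function names are hypothetical, and the enumeration is practical only for small N because N! grows quickly.

```python
from fractions import Fraction
from itertools import permutations


def exact_cumulative(n):
    """Map each attainable sum of d^2 to P(sum d^2 <= that value) under
    the null hypothesis, by enumerating all n! permutations."""
    counts = {}
    total = 0
    for perm in permutations(range(n)):
        s = sum((i - p) ** 2 for i, p in enumerate(perm))
        counts[s] = counts.get(s, 0) + 1
        total += 1
    cumulative, running = {}, 0
    for s in sorted(counts):
        running += counts[s]
        cumulative[s] = Fraction(running, total)
    return cumulative


def critical_rs(n, alpha):
    """Smallest attainable rs whose exact upper tail probability is <= alpha,
    or None if even rs = 1 is not significant at that level."""
    best = None
    for s, p in exact_cumulative(n).items():   # increasing sum d^2 = decreasing rs
        if p <= alpha:
            best = s
    if best is None:
        return None
    return 1 - Fraction(6 * best, n**3 - n)


print(critical_rs(5, 0.05), float(critical_rs(6, 0.05)))
```

Exact fractions are used so that comparisons with α are not affected by rounding; a `None` result corresponds to a level at which no critical value exists for that N.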

TABLE 3
Exact critical values of rs for 3 ≤ N ≤ 18 and Edgeworth approximations for
N ≥ 19

Quantiles:                   .75 .90 .95 .975 .99 .995 .9975 .999 .9995
Directional alpha levels:    .25 .10 .05 .025 .01 .005 .0025 .001 .0005
Nondirectional alpha levels: .50 .20 .10 .05 .02 .01 .005 .002 .001

N
3 1.000
4 0.600 1.000 1.000
5 0.500 0.800 0.900 1.000 1.000
6 0.371 0.657 0.829 0.886 0.943 1.000 1.000
7 0.321 0.571 0.714 0.786 0.893 0.929 0.964 1.000 1.000
8 0.310 0.524 0.643 0.738 0.833 0.881 0.905 0.952 0.976
9 0.267 0.483 0.600 0.700 0.783 0.833 0.867 0.917 0.933
10 0.248 0.455 0.564 0.648 0.745 0.794 0.830 0.879 0.903
11 0.236 0.427 0.536 0.618 0.709 0.755 0.800 0.845 0.873
12 0.217 0.406 0.503 0.587 0.678 0.727 0.769 0.818 0.846
13 0.209 0.385 0.484 0.560 0.648 0.703 0.747 0.791 0.824
14 0.200 0.367 0.464 0.538 0.626 0.679 0.723 0.771 0.802
15 0.189 0.354 0.446 0.521 0.604 0.654 0.700 0.750 0.779
16 0.182 0.341 0.429 0.503 0.582 0.635 0.679 0.729 0.762
17 0.176 0.328 0.414 0.488 0.566 0.618 0.659 0.711 0.743
18 0.170 0.317 0.401 0.472 0.550 0.600 0.643 0.692 0.725
19 0.165 0.309 0.391 0.460 0.535 0.584 0.628 0.675 0.709
20 0.161 0.299 0.380 0.447 0.522 0.570 0.612 0.662 0.693
21 0.156 0.292 0.370 0.436 0.509 0.556 0.599 0.647 0.678
22 0.152 0.284 0.361 0.425 0.497 0.544 0.586 0.633 0.665
23 0.148 0.278 0.353 0.416 0.486 0.532 0.573 0.621 0.652
24 0.144 0.271 0.344 0.407 0.476 0.521 0.562 0.609 0.640
25 0.142 0.265 0.337 0.398 0.466 0.511 0.551 0.597 0.628
26 0.138 0.259 0.331 0.390 0.457 0.501 0.541 0.586 0.618
27 0.136 0.255 0.324 0.383 0.449 0.492 0.531 0.576 0.607
28 0.133 0.250 0.318 0.375 0.441 0.483 0.522 0.567 0.597
29 0.130 0.245 0.312 0.368 0.433 0.475 0.513 0.558 0.588
30 0.128 0.240 0.306 0.362 0.425 0.467 0.504 0.549 0.579
31 0.125 0.236 0.301 0.356 0.419 0.459 0.496 0.540 0.570
32 0.124 0.232 0.296 0.350 0.412 0.452 0.489 0.532 0.562
33 0.121 0.229 0.291 0.345 0.405 0.446 0.482 0.525 0.554
34 0.119 0.225 0.287 0.340 0.400 0.439 0.475 0.517 0.546
35 0.118 0.222 0.283 0.335 0.394 0.433 0.468 0.510 0.539
36 0.116 0.219 0.279 0.330 0.388 0.427 0.462 0.503 0.532
37 0.114 0.215 0.275 0.325 0.383 0.421 0.456 0.497 0.525
38 0.113 0.212 0.271 0.321 0.378 0.415 0.450 0.491 0.519
39 0.111 0.210 0.267 0.317 0.373 0.410 0.444 0.485 0.512
40 0.110 0.207 0.264 0.313 0.368 0.405 0.439 0.479 0.506
41 0.108 0.204 0.261 0.309 0.364 0.400 0.433 0.473 0.501
42 0.107 0.202 0.257 0.305 0.359 0.396 0.428 0.468 0.495
43 0.105 0.199 0.254 0.301 0.355 0.391 0.423 0.462 0.489
44 0.104 0.197 0.251 0.298 0.351 0.386 0.419 0.457 0.484
45 0.103 0.194 0.248 0.294 0.347 0.382 0.414 0.452 0.479
46 0.102 0.192 0.246 0.291 0.343 0.378 0.410 0.448 0.474
47 0.101 0.190 0.243 0.288 0.340 0.374 0.405 0.443 0.469
48 0.100 0.188 0.240 0.285 0.336 0.370 0.401 0.439 0.465
49 0.098 0.186 0.238 0.282 0.333 0.366 0.397 0.434 0.460
50 0.097 0.184 0.235 0.279 0.329 0.363 0.393 0.430 0.456
52 0.095 0.180 0.231 0.274 0.323 0.356 0.386 0.422 0.447
54 0.094 0.177 0.226 0.268 0.317 0.349 0.379 0.414 0.439
56 0.092 0.174 0.222 0.264 0.311 0.343 0.372 0.407 0.432
58 0.090 0.171 0.218 0.259 0.306 0.337 0.366 0.400 0.424
60 0.089 0.168 0.214 0.255 0.301 0.331 0.360 0.394 0.417
62 0.087 0.165 0.211 0.250 0.296 0.326 0.354 0.387 0.411
64 0.086 0.162 0.207 0.246 0.291 0.321 0.348 0.382 0.405
66 0.084 0.160 0.204 0.243 0.287 0.316 0.343 0.376 0.399
68 0.083 0.157 0.201 0.239 0.282 0.311 0.338 0.370 0.393
70 0.082 0.155 0.198 0.235 0.278 0.307 0.333 0.365 0.387
72 0.081 0.153 0.195 0.232 0.274 0.303 0.329 0.360 0.382
74 0.080 0.151 0.193 0.229 0.271 0.299 0.324 0.355 0.377
76 0.078 0.149 0.190 0.226 0.267 0.295 0.320 0.351 0.372
78 0.077 0.147 0.188 0.223 0.264 0.291 0.316 0.346 0.368
80 0.076 0.145 0.185 0.220 0.260 0.287 0.312 0.342 0.363
82 0.075 0.143 0.183 0.217 0.257 0.284 0.308 0.338 0.359
84 0.074 0.141 0.181 0.215 0.254 0.280 0.305 0.334 0.355
86 0.074 0.139 0.179 0.212 0.251 0.277 0.301 0.330 0.351
88 0.073 0.138 0.176 0.210 0.248 0.274 0.298 0.327 0.347
90 0.072 0.136 0.174 0.207 0.245 0.271 0.294 0.323 0.343
92 0.071 0.135 0.173 0.205 0.243 0.268 0.291 0.319 0.339
94 0.070 0.133 0.171 0.203 0.240 0.265 0.288 0.316 0.336
96 0.070 0.132 0.169 0.201 0.238 0.262 0.285 0.313 0.332
98 0.069 0.130 0.167 0.199 0.235 0.260 0.282 0.310 0.329
100 0.068 0.129 0.165 0.197 0.233 0.257 0.279 0.307 0.326
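
A table such as Table 3 is typically used by looking up the row for N and the column for the chosen α. The Python sketch below illustrates one way to do this; it is not part of the paper. Only three rows (N = 10, 20, 30) are copied from Table 3 to keep the example short, the function names are hypothetical, and treating rs ≥ critical value as significant is an assumption based on the critical value being the minimum rs necessary for significance.

```python
import math

# Directional alpha levels for the nine columns of Table 3.
DIRECTIONAL_ALPHAS = (0.25, 0.10, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.001, 0.0005)

# Three rows copied from Table 3; a full implementation would include every row.
TABLE3_ROWS = {
    10: (0.248, 0.455, 0.564, 0.648, 0.745, 0.794, 0.830, 0.879, 0.903),
    20: (0.161, 0.299, 0.380, 0.447, 0.522, 0.570, 0.612, 0.662, 0.693),
    30: (0.128, 0.240, 0.306, 0.362, 0.425, 0.467, 0.504, 0.549, 0.579),
}


def critical_value(n, alpha, directional=True):
    """Critical rs for sample size n; a nondirectional alpha uses the
    column for the directional level alpha / 2."""
    level = alpha if directional else alpha / 2
    for column_alpha, cv in zip(DIRECTIONAL_ALPHAS, TABLE3_ROWS[n]):
        if math.isclose(column_alpha, level):
            return cv
    raise KeyError(f"alpha level {alpha} is not tabled")


def is_significant(rs, n, alpha, directional=True):
    """Upper tail test if directional, otherwise a test on |rs|."""
    observed = rs if directional else abs(rs)
    return observed >= critical_value(n, alpha, directional)


print(critical_value(20, 0.05))                     # directional
print(critical_value(20, 0.05, directional=False))  # nondirectional
print(is_significant(-0.45, 20, 0.05, directional=False))
```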


References
Best, D. J., & Roberts, D. E. (1975). Algorithm AS 89: The upper tail probabilities of Spearman's rho. Applied Statistics, 24, 377-379.
Berenson, M. L., Levine, D. M., & Rindskopf, D. (1988). Applied statistics: A first course. Englewood Cliffs, NJ: Prentice Hall.
Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144-152.
David, S. T., Kendall, M. G., & Stuart, A. (1951). Some questions of distribution in the theory of rank correlation. Biometrika, 38, 131-140.
De Jonge, C., & Van Montfort, M. A. J. (1972). The null distribution of Spearman's S when n = 12. Statistica Neerlandica, 26, 15-17.
Franklin, L. A. (1987a, August). Approximations, convergence and exact tables for Spearman's rank correlation coefficient. Proceedings of the Statistical Computing Section of the American Statistical Association Convention (pp. 244-247).
Franklin, L. A. (1987b, March). The complete exact null distribution of Spearman's rho for n = 12(1)16. Proceedings of the 19th Symposium on the Interface between Computer Science and Statistics (pp. 337-342). American Statistical Association.
Franklin, L. A. (1988a). The complete exact null distribution of Spearman's rho for n = 12(1)18. Journal of Statistical Computation and Simulation, 29, 255-269.
Franklin, L. A. (1988b). A note on approximations and convergence in distribution for Spearman's rank correlation coefficient. Communications in Statistics - Theory and Methods, 17, 55-59.
Hays, W. L. (1988). Statistics (4th ed.). New York: Holt, Rinehart and Winston.
Lehmann, E. L. (1975). Nonparametric Statistics: Statistical methods based on ranks. San Francisco: Holden-Day.
Marascuilo, L. A., & Serlin, R. C. (1988). Statistical method for the social and behavioral sciences. New York: W. H. Freeman.
Nijsse, M. (1988). Testing the significance of Kendall's τ and Spearman's rs. Psychological Bulletin, 103, 235-237.
Olds, E. G. (1938). Distributions of sums of squares of rank differences for small numbers of individuals. Annals of Mathematical Statistics, 9, 133-148.
Otten, A. (1973). The null distribution of Spearman's S when n = 13(1)16. Statistica Neerlandica, 27, 19-20.
Owen, D. B. (1962). Handbook of statistical tables. Reading, MA: Addison-Wesley.
Zar, J. H. (1972). Significance testing of Spearman rank correlation coefficient. Journal of the American Statistical Association, 67, 578-580.
Zar, J. H. (1984). Biostatistical analysis (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Author
PHILIP H. RAMSEY, Associate Professor, Queens College of CUNY, Flushing,
NY 11367. Specializations: applied statistics, measurement, and computer
simulation.
