T Test As A Parametric Statistic: Tae Kyun Kim
Statistical Round
pISSN 2005-6419 • eISSN 2005-7563
In statistical tests, the probability distribution of the test statistic is important. When samples are drawn from a population N(μ, σ²) with a sample size of n, the distribution of the sample mean X̄ should be a normal distribution N(μ, σ²/n). Under the null hypothesis μ = μ0, the distribution of the statistic z = (X̄ − μ0)/(σ/√n) should be the standard normal distribution. When the variance of the population is not known, replacement with the sample variance s² is possible. In this case, the statistic (X̄ − μ0)/(s/√n) follows a t distribution with n − 1 degrees of freedom. An independent-group t test can be carried out for a comparison of means between two independent groups, with a paired t test for paired data. As the t test is a parametric test, samples should meet certain preconditions, such as normality, equal variances and independence.
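The two statistics above can be written directly in code. A minimal standard-library Python sketch (the function names are ours; the sample values in the usage note are made up for illustration):

```python
import math
import statistics

def z_statistic(sample, mu0, sigma):
    """z = (sample mean - mu0) / (sigma / sqrt(n)), with sigma known."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))

def t_statistic(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)); s is the sample standard
    deviation, computed with n - 1 degrees of freedom."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))
```

Under the null hypothesis, the first statistic follows N(0, 1) and the second a t distribution with n − 1 degrees of freedom; for example, `t_statistic([1, 2, 3, 4, 5], 3)` is exactly 0 because the sample mean equals μ0.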
Copyright ⓒ the Korean Society of Anesthesiologists, 2015 Online access in http://ekja.org
Table 1. Example data for the independent t test

ID  preA  postA  ΔA     ID  preB  postB  ΔB
 1   63    77    14     11   81   101    20
 2   69    88    19     12   87   103    16
 3   76    90    14     13   77   107    30
 4   78    95    17     14   80   114    34
 5   80    96    16     15   76   116    40
 6   89    96     7     16   86   116    30
 7   90   102    12     17   98   116    18
 8   92   104    12     18   87   120    33
 9  103   110     7     19  105   120    15
10  112   115     3     20   69   127    58

ID: individual identification, preA, preB: before the treatment A or B, postA, postB: after the treatment A or B, ΔA, ΔB: difference between before and after the treatment A or B.

Table 2. Example data for the paired t test (treatments A and B were separated by a washout period)

ID  preA  postA  ΔA     ID  preB  postB  ΔB
 1   63    77    14      1   73   103    30
 2   69    88    19      2   74   104    30
 3   76    90    14      3   76   107    31
 4   78    95    17      4   84   108    24
 5   80    96    16      5   84   110    26
 6   89    96     7      6   86   110    24
 7   90   102    12      7   92   113    21
 8   92   104    12      8   95   114    19
 9  103   110     7      9  103   118    15
10  112   115     3     10  115   120     5

ID: individual identification, preA, preB: before the treatment A or B, postA, postB: after the treatment A or B, ΔA, ΔB: difference between before and after the treatment A or B.
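For a quick numeric feel for the two groups in Table 1, the Δ columns can be summarized with the standard library (a sketch; the numbers are transcribed from the table above):

```python
import statistics

# Δ columns of Table 1: change after treatment A (IDs 1-10) and B (IDs 11-20)
delta_a = [14, 19, 14, 17, 16, 7, 12, 12, 7, 3]
delta_b = [20, 16, 30, 34, 40, 30, 18, 33, 15, 58]

mean_a, sd_a = statistics.fmean(delta_a), statistics.stdev(delta_a)
mean_b, sd_b = statistics.fmean(delta_b), statistics.stdev(delta_b)
```

The group means (12.1 vs 29.4) clearly differ; the independent t test discussed below asks whether a difference of this size is plausible if both groups were drawn from a single population.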
fact that the patient may or may not actually have the disease. This conclusion is based on the statistical concept which holds that it is more statistically valid to conclude that the patient has the disease than to declare that the patient is a rare case among people without the disease, because such test results are statistically rare in normal people.

Fig. 1. The determination of whether the laboratory finding is abnormal is done according to the probability that the laboratory finding occurs in the distribution of the population.
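The decision rule illustrated in Fig. 1 can be sketched with the standard normal tail probability. A minimal Python example (the population values μ = 150 and σ = 10 are illustrative assumptions, not values from the article):

```python
import math

def two_sided_p(x, mu, sigma):
    # probability of observing a value at least this far from the mean (both tails)
    z = (x - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def is_rare(x, mu, sigma, alpha=0.05):
    # "abnormal" if the value falls within the marginal alpha of the distribution
    return two_sided_p(x, mu, sigma) < alpha
```

With μ = 150 and σ = 10, a result of 175 lies beyond the 5% margins while 160 does not; the z score of 1.96 marks the two-sided 5% boundary.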
The test results and the probability distribution of the results must be known in order for the results to be determined as statistically rare. The criteria for clinical indicators have been established based on data collected from an entire population or at least from a large number of people. Here, we examine a case in which a clinical indicator exhibits a normal distribution with a mean of μ and a variance of σ². If a patient's test result is x, is this statistically rare against the criteria (e.g., 5 or 1%)? Probability is represented as the surface area in a probability distribution, and the z score that represents either 5 or 1%, near the margins of the distribution, becomes the reference value. The test result x can be determined to be statistically rare compared to the reference probability if it lies in a more marginal area than the z score, that is, if the value of x is located in the marginal ends of the distribution (Fig. 1).

This is done to compare one individual's clinical indicator value. This, however, raises the question of how we would compare the mean of a sample group (consisting of more than one individual) against the population mean. Again, it is meaningless to compare each individual separately; we must compare the means of the two groups. Thus, do we make a statistical inference using only the distribution of the clinical indicators of the entire population and the mean of the sample? No. In order to infer a statistical possibility, we must know the indicator of interest and its probability distribution. In other words, we must know the mean of the sample and the distribution of the mean. We can then determine how far the sample mean varies from the population mean by knowing the sampling distribution of the means.

Sampling Distribution (Sample Mean Distribution)

The sample mean we can get from a study is one of the means of
Fig. 2. Simulation of sampling distribution. (A) A histogram of the sample mean distribution which results from 1,000 samples from a population N(150, 5²) with a sample size of 10. The simulated density line shows a distribution similar to the theoretical sampling distribution N(150, 5²/10). (B) Comparison of the shapes between the population and the sampling distribution.
all possible samples which could be drawn from a population. This sample mean from a study was already acquired from a real experiment; however, how could we know the distribution of the means of all possible samples, including the studied sample? Do we need to repeat the experiment over and over again? A simulation in which samples are drawn repeatedly from a population is shown in Fig. 2. If samples are drawn with a sample size of n from a population with a normal distribution N(μ, σ²), the sampling distribution shows a normal distribution with a mean of μ and a variance of σ²/n. The sample size affects the shape of the sampling distribution: the distribution curve becomes a narrower bell curve with a smaller variance as n increases, because the variance of the sampling distribution is σ²/n. The formation of a sampling distribution is well explained in Lee et al. [2] in the form of a figure.

T Distribution

Now that the sampling distribution of the means is known, we can locate the position of the mean of a specific sample against the distribution data. However, one problem remains. As we noted earlier, the sampling distribution exhibits a normal distribution with a variance of σ²/n, but in reality we do not know σ², the variance of the population. Therefore, we use the sample variance instead of the population variance to determine the sampling distribution of the mean. The sample variance is defined as follows:

s² = Σ(xᵢ − x̄)² / (n − 1)

In such cases, in which the sample variance is used, the sampling distribution follows a t distribution that depends on the degrees of freedom of each sample, rather than a normal distribution (Fig. 3).

Fig. 3. Comparison between a normal distribution and a t distribution. (A) The point t (−2.26, df = 9) corresponding to a probability of 0.025 for a t distribution is located more toward the tail than the corresponding point z (−1.96) for a normal distribution. (B–D) As the degrees of freedom of the t distribution increase (df = 3, 10, 30), the t distribution becomes closer to a normal distribution.

Independent T test

A t test is also known as Student's t test. It is a statistical analysis technique that was developed by William Sealy Gosset in 1908 as a means to control the quality of dark beers. A t test used to test whether there is a difference between two independent sample means is not different from a t test used when there is only one sample (as mentioned earlier). However, if there is no difference in the two sample means, the difference will be close to zero. Therefore, in such cases, an additional statistical test should be performed to verify whether the difference could be said to be equal to zero.

Let's extract two independent samples from a population that displays a normal distribution and compute the difference between the means of the two samples. The difference between the sample means will not always be zero, even if the samples are extracted from the same population, because the sampling process is randomized, which results in samples with a variety of combinations of subjects. We extracted two samples with a size of 6 from a population N(150, 5²) and found the difference in the means. If this process is repeated 1,000 times, the sampling distribution exhibits the shape illustrated in Fig. 4. When the distribution is displayed in terms of a histogram and a density line, it is almost identical to the theoretical sampling distribution N(0, 2 × 5²/6) (Fig. 4).

[Fig. 4: histogram and density line of the simulated differences in sample means ("Sample difference density"), overlaid with the theoretic mean difference distribution.]

However, it is difficult to define the distribution of the difference in the two sample means because the variance of the population is unknown. If we use the variance of the sample instead, the distribution of the difference of the sample means would follow a t distribution. It should be noted, however, that the two samples display a normal distribution and have an equal variance because they were independently extracted from an identical population that has a normal distribution.

Under the assumption that the two samples display a normal distribution and have an equal variance, the t statistic is as follows:

t = (X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)), where sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) and df = n1 + n2 − 2.

If an equal variance cannot be assumed, the t statistic becomes

t = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2)

and the degree of freedom is calculated based on the Welch-Satterthwaite equation [3,4].

It is apparent that if n1 and n2 are sufficiently large, the t statistic resembles a normal distribution (Fig. 3).

A statistical test is performed to verify the position of the difference in the sample means in the sampling distribution of the mean (Fig. 4). It is statistically very rare for the difference in two sample means to lie on the margins of the distribution. Therefore, if the difference does lie on the margins, it is statistically significant to conclude that the samples were extracted from two different populations, even if they were actually extracted from the same population.

Paired T test

For paired data, the t statistic is the mean of the per-subject differences divided by its standard error, t = d̄ / (sd/√n), where the variance of the paired differences is sd² = s1² + s2² − 2rs1s2 and r is the correlation coefficient between the paired measurements. In this equation, the t statistic is increased if the correlation coefficient is greater than 0 because the denominator becomes smaller, which increases the statistical power of the paired t test compared to that of an independent t test. On the other hand, if the correlation coefficient is less than 0, the statistical power is decreased and becomes lower than that of an independent t test. It is important to note that if one misunderstands this characteristic and uses an independent t test when the correlation coefficient is less than 0, the generated results would be incorrect, as the process ignores the paired experimental design.

Assumptions

Some researchers often regard these statistical assumptions as inconvenient and neglect them. Even some statisticians argue, on the basis of the central limit theorem, that sampling distributions display a normal distribution regardless of whether the population distribution follows a normal distribution, and that t tests have sufficient statistical power even if they do not satisfy the condition of normality [5]. Moreover, they contend that the condition of equal variance is not so strict, because even if there is a nine-fold difference in the variance, the α level merely changes from 0.05 to 0.06 [6]. However, the arguments regarding the condition of normality and the limit to which the condition of equal variance may be violated are still bones of contention. Therefore, researchers who unquestioningly accept these arguments and neglect the basic assumptions of a t test when submitting papers will face critical comments from editors. Moreover, it will be difficult to persuade the editors to neglect the basic assumptions regardless of how solid the evidence in the paper is. Hence, researchers should sufficiently test the basic statistical assumptions and employ methods that are widely accepted so as to draw valid statistical conclusions.

Conclusion

Owing to user-friendly statistics software programs, the rich pool of statistics information on the Internet, and expert advice from statistics professionals at every hospital, using and processing statistics data is no longer an intractable task. However, it remains the researchers' responsibility to design experiments that fulfill all of the conditions of their statistical methods of choice and to ensure that their statistical assumptions are appropriate. In particular, parametric statistical methods confer reasonable statistical conclusions only when the statistical assumptions are fully met.
References
1. Yim KH, Nahm FS, Han KA, Park SY. Analysis of statistical methods and errors in the articles published in the Korean Journal of Pain. Korean J Pain 2010; 23: 35-41.
2. Lee DK, In J, Lee S. Standard deviation and standard error of the mean. Korean J Anesthesiol 2015; 68: 220-3.
3. Welch BL. The generalization of 'Student's' problem when several different population variances are involved. Biometrika 1947; 34: 28-35.
4. Satterthwaite FE. An approximate distribution of estimates of variance components. Biometrics 1946; 2: 110-4.
5. Lumley T, Diehr P, Emerson S, Chen L. The importance of the normality assumption in large public health data sets. Annu Rev Public
Health 2002; 23: 151-69.
6. Box GEP. Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the
one-way classification. Ann Math Statist 1954; 25: 290-302.
Appendix
The results of independent and paired t tests of the examples are illustrated in Tables 1 and 2. The tests were conducted using the
SPSS Statistics Package (IBM® SPSS® Statistics 21, SPSS Inc., Chicago, IL, USA).
Independent t test

[SPSS output for the independent t test: Levene's test for equality of variances (F, Sig.) and the t test (t, df, two-tailed P).]
First, we need to examine the degree of normality by confirming the Kolmogorov-Smirnov or Shapiro-Wilk test in the second
table. We can determine that the samples satisfy the condition of normality because the P value is greater than 0.05. Next, we check the
results of Levene’s test to examine the equality of variance. The P value is again greater than 0.05; hence, the condition of equal vari-
ance is also met. Finally, we read the significance probability for the “equal variance assumed” line. If the condition of equal variance is
not met (i.e., if the P value is less than 0.05 for Levene’s test), we reach a conclusion by referring to the significance probability for the
“equal variance not assumed” line, or we perform a nonparametric test.
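The same decision path (equal-variance t or Welch t) can be reproduced outside SPSS. Below is a standard-library Python sketch applied to the Δ columns of Table 1; the final step of converting t to a P value is omitted, since it requires a t-distribution CDF that the standard library does not provide:

```python
import math
import statistics

def pooled_t(x1, x2):
    # independent t test assuming equal variances; df = n1 + n2 - 2
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    t = (statistics.fmean(x1) - statistics.fmean(x2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(x1, x2):
    # unequal-variance form; df from the Welch-Satterthwaite equation
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    se2 = v1 / n1 + v2 / n2
    t = (statistics.fmean(x1) - statistics.fmean(x2)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

delta_a = [14, 19, 14, 17, 16, 7, 12, 12, 7, 3]     # ΔA, Table 1
delta_b = [20, 16, 30, 34, 40, 30, 18, 33, 15, 58]  # ΔB, Table 1
t_eq, df_eq = pooled_t(delta_a, delta_b)
t_w, df_w = welch_t(delta_a, delta_b)
```

With equal sample sizes the two t values coincide, but the Welch degrees of freedom are smaller than n1 + n2 − 2, which yields a more conservative P value when the variances differ.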
Paired t test

[SPSS output for the paired t test: paired differences (mean, standard deviation, standard error of the mean, 95% confidence interval of the difference: lower, upper), t, df, two-tailed P.]
A paired t test is identical to a single-sample t test. Therefore, we test the normality of the difference in the amount of change for
treatment A and treatment B (∆A-∆B). The normality is verified based on the results of Kolmogorov-Smirnov and Shapiro-Wilk tests,
as shown in the second table. In conclusion, there is a significant difference between the two treatments (i.e., the P value is less than
0.001).
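For comparison, the paired analysis can also be reproduced with the standard library. The sketch below runs the one-sample t test on the per-subject differences ΔA − ΔB from Table 2 (again leaving out the final P-value step):

```python
import math
import statistics

def paired_t(x, y):
    # paired t test = one-sample t test on the per-subject differences
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

delta_a = [14, 19, 14, 17, 16, 7, 12, 12, 7, 3]    # ΔA, Table 2
delta_b = [30, 30, 31, 24, 26, 24, 21, 19, 15, 5]  # ΔB, Table 2
t_paired, df_paired = paired_t(delta_a, delta_b)
```

The large absolute t value on 9 degrees of freedom is consistent with the reported P value of less than 0.001.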