Pediatric Reference Intervals for Serum Thyroxine, Triiodothyronine, Thyrotropin, and Free Thyroxine, David Zurakowski,1* James Di Canzio,1 and Joseph A. Majzoub2 (1 Departments of Biostatistics and Medicine and 2 Division of Endocrinology, Department of Medicine, Children's Hospital, Harvard Medical School, Boston, MA 02115; * address correspondence to this author at: Department of Biostatistics, Children's Hospital, 300 Longwood Ave., Boston, MA 02115; fax 617-278-9770, e-mail zurakowski@a1.tch.harvard.edu)

Thyroid function tests provide information about hormone metabolism and thyroid dysfunction. Reference intervals enable clinicians to evaluate thyroid function. Several pediatric reference intervals for thyroid function tests have been published (1–4). Laboratory tests and their nomenclature have been published (5), and the American Thyroid Association has classified thyrotropin (TSH) as the best single measurement of thyroid status because of its high sensitivity (6). However, reference intervals that are derived from small numbers of patients are not reliable for accurately evaluating test results that are dependent on covariates such as age and sex. The IFCC has recommended a minimum of 120 subjects for nonparametric methods in which subgrouping of data is performed (7). Virtanen et al. (8) have proposed that a smaller sample size may be sufficient for regression-based reference intervals.

Recently, there has been much interest in using hospital databases to extract large volumes of patient data for clinical research (9–12). Hospital databases provide a sufficient number of subjects for evaluating age and sex differences and for establishing age- and sex-based reference intervals. Test results can be affected by medications or treatment received by patients for thyroid disease that alter the physiologic features of the thyroid hormone concentrations during the neonatal period (13, 14). Therefore, neonates and patients who have thyroid disease or demonstrate abnormal test results should be excluded from the analysis used to establish the reference intervals.

Here we report health-related reference intervals for serum thyroxine (T4), triiodothyronine (T3), TSH, and free T4 to be used as clinical guidelines for screening patients with suspected thyroid dysfunction. These pediatric norms are more accurate than those previously published because they are age- and sex-specific and were derived from a large population of children and adolescents.

Between January 1993 and August 1996, test results for T4, T3, TSH, and free T4 were obtained retrospectively from outpatient records at Children's Hospital in Boston for patients 1 month through 20 years of age. Anonymity of patient data was maintained. Medical record number, date of birth, gender, test code, date of test, test result, and International Classification of Diseases (ICD-9) codes were downloaded from the Oracle database on the hospital mainframe computer so that queries could be performed using Paradox (Ver. 5.0; Borland International). ICD-9 codes were used to select patients for the reference group. Excluded diagnoses were hypothyroidism (con- [...] iatrogenic pituitary disorders, syndromes of diencephalohypophyseal origin, and dyspituitarism). Infants less than 1 month of age and premature infants were also excluded from analysis. Results from patients who had multiple measurements on the same thyroid function test (<8% for each analyte) were averaged to obtain a single value for each patient in the study. The final sample size consisted of 13 145 test results on 5817 patients (3221 females and 2596 males) after outliers were removed. The total number of patients for each thyroid function test was as follows: T4 (n = 4551), T3 (n = 2683), TSH (n = 5558), and free T4 (n = 353).

Measurements of T4, T3, and TSH were obtained with the DELFIA immunofluorometric system according to the manufacturer's instructions (Wallac Oy). The assay is a solid-phase fluoroimmunoassay that provides a quantitative determination of hormone concentrations in human serum (15). The interassay CV in the euthyroid ranges was <10% for T3 and TSH (16) and 11% for T4 and free T4 (17).

A variety of statistical techniques have been proposed to obtain reference limits (7, 8, 11, 18). Some authors have recommended a nonparametric approach for removing possible abnormal laboratory test results by discarding the top and bottom 10% of the results before calculating 2.5% and 97.5% limits on the remaining 80% of the data (19). However, this method can distort the shape of the underlying distribution by cutting off the tails. If outliers that should be removed are not removed, the resulting intervals are too wide and can lead to failure to detect patients with thyroid dysfunction. In this study, we applied a parametric approach to preserve the shape of the distribution. For all four analytes, least-squares regression was performed using age as the predictor. Outliers were detected by the Tukey interquartile range (IQR) procedure (20) and removed from analysis. Briefly, the IQR is calculated as the difference between the 75th and 25th percentiles. Lower and upper limits are calculated as the 25th percentile − (1.5 × IQR) and the 75th percentile + (1.5 × IQR), respectively. This procedure led to the removal of 320 outliers (2.4%) before reference intervals were determined. The Kolmogorov–Smirnov test (21) was used to assess normality of the data for each thyroid test, and scatter plots were inspected to evaluate the degree of skewness. Because of the extreme right-tailed skew of the TSH and free T4 distributions, natural log transformations were applied before analysis (22). Analysis of covariance was used to compare slopes and y-intercepts between females and males. Because of significant gender differences, separate regression equations were calculated for
females and males. Residual diagnostics included visual checks of normal probability plots and Kolmogorov–Smirnov tests.

Linear equations were used to predict mean values, using the midpoint for each of five age groups: 1–11 months, 1–5 years, 6–10 years, 11–15 years, and 16–20 years. Reference intervals were defined as the 95% prediction intervals around the estimated value corresponding to each midpoint. Intervals for TSH and free T4 were converted back to the original units by calculating the antilog of the logarithmically transformed values. Although there is a theoretical distinction between prediction and reference intervals, we used prediction intervals derived from least-squares regression. This method takes into account both the residual standard error and the uncertainty of the slope estimate of the regression line. For large sample sizes, using prediction intervals as reference intervals is equivalent to the method recommended by Virtanen et al. (23). In our study, we have results on 5817 children and adolescents. Two-tailed values of P <0.05 were considered statistically significant. SAS, Ver. 6.12 (SAS Institute), and SPSS, Ver. 8.0 (SPSS), were used for statistical analysis.

Table 1 presents the pediatric reference intervals, the mean thyroid hormone concentrations, and the number of children analyzed for each age group and sex. Similar numbers of females and males were analyzed at all ages except the 16–20 year category, in which females outnumbered males. The Kolmogorov–Smirnov test revealed no significant departures from a gaussian distribution for T4 or T3. Serum T4 concentrations decreased significantly with chronological age in both females and males (P <0.0001). There was no significant difference in slope between the sexes for T4 (P = 0.18), indicating that females and males shared a common rate of decline. Euthyroid mean values for T4 were estimated to be higher for females than males in the same age group (P <0.0001). Earlier studies (1, 24) found a similar inverse relationship between T4 and age; however, no sex differences were reported.

In this investigation, euthyroid mean values for T3, TSH, and free T4 were also inversely correlated with age (P <0.0001). These results are consistent with those of other investigators who have reported age-related changes for T3 (25) and TSH (26). In addition, we found significant slope differences between females and males

Table 1. Pediatric reference intervals for T4, T3, TSH, and free T4.

                                  Females                         Males
Analyte          Age           Mean  Reference interval  n     Mean  Reference interval  n
T4, nmol/L (a)   1–11 months   122   82–162              116   120   79–161              135
                 1–5 years     120   79–160              471   116   75–158              589
                 6–10 years    115   75–154              462   111   69–152              600
                 11–15 years   109   69–149              799   106   63–147              614
                 16–20 years   104   64–144              565    99   58–142              200
                 Total                                   2413                            2138
TSH, mIU/L       1–11 months   2.2   0.8–6.3             131   2.2   0.8–6.3             158
                 1–5 years     2.0   0.7–5.9             523   2.1   0.7–6.0             659
                 6–10 years    1.8   0.6–5.1             562   1.9   0.7–5.4             698
                 11–15 years   1.5   0.5–4.4             1057  1.7   0.6–4.9             738
                 16–20 years   1.3   0.5–3.9             809   1.6   0.5–4.4             223
                 Total                                   3082                            2476

                                  Females and Males
Analyte          Age           Mean  Reference interval  n
Free T4, pmol/L  1–11 months   19.5  9.5–39.5            47
                 1–5 years     18.4  9.0–37.2            91
                 6–10 years    16.9  8.3–34.1            57
                 11–15 years   15.5  7.6–31.5            88
                 16–20 years   14.1  7.0–28.7            70
                 Total                                   353

(a) To convert nmol/L to µg/dL, divide by 12.87.
To convert nmol/L to ng/dL, multiply by 65.1.
To convert pmol/L to ng/dL, divide by 12.87.

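As a worked illustration of the conversion factors in the table footnotes, the short Python sketch below converts SI values to conventional units. The function names are hypothetical, and the assignment of each factor to an analyte (12.87 for T4 and free T4, 65.1 for T3) is an assumption inferred from the units, since the footnotes themselves do not name the analytes.

```python
# Conversion factors copied from the Table 1 footnotes.
# Which analyte each factor applies to is an assumption inferred from the units.

def t4_nmol_l_to_ug_dl(nmol_l):
    """Total T4: nmol/L -> ug/dL (divide by 12.87)."""
    return nmol_l / 12.87


def t3_nmol_l_to_ng_dl(nmol_l):
    """T3: nmol/L -> ng/dL (multiply by 65.1)."""
    return nmol_l * 65.1


def free_t4_pmol_l_to_ng_dl(pmol_l):
    """Free T4: pmol/L -> ng/dL (divide by 12.87)."""
    return pmol_l / 12.87


if __name__ == "__main__":
    # Mean T4 for females aged 1-11 months in Table 1 is 122 nmol/L.
    print(round(t4_nmol_l_to_ug_dl(122), 1), "ug/dL")
    # Mean free T4 for ages 1-11 months in Table 1 is 19.5 pmol/L.
    print(round(free_t4_pmol_l_to_ng_dl(19.5), 2), "ng/dL")
```

Keeping one function per analyte, rather than a single generic converter, avoids applying the T4 factor to a T3 result by mistake.
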
Fig. 1. T4, T3, and TSH data for females and males.
Solid lines represent regression-based upper and lower reference limits.
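
The outlier rule and the regression-based limits described in the methods can be sketched in Python as follows. This is a minimal illustration and not the authors' SAS or SPSS code. All function names are hypothetical. The sketch assumes that the Tukey fences are applied to the raw analyte values, which the text does not specify, and it uses a normal quantile in place of the t quantile, which is a large-sample approximation.

```python
import math
from statistics import NormalDist


def percentile(sorted_vals, p):
    """Percentile by linear interpolation, for p in [0, 1]."""
    k = (len(sorted_vals) - 1) * p
    lo = math.floor(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def tukey_fences(values, k=1.5):
    """Lower and upper limits: Q1 - k*IQR and Q3 + k*IQR."""
    s = sorted(values)
    q1, q3 = percentile(s, 0.25), percentile(s, 0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def prediction_interval(ages, values, age0, level=0.95):
    """Least-squares fit of value on age; returns (estimate, lower, upper) at age0.

    Assumption: a normal quantile replaces the t quantile (large n).
    """
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ages, values))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    # The 1/n and (age0 - mean_x)^2 / sxx terms carry the uncertainty of the fitted line.
    se_pred = s * math.sqrt(1 + 1 / n + (age0 - mean_x) ** 2 / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    center = intercept + slope * age0
    return center, center - z * se_pred, center + z * se_pred


def log_scale_interval(ages, values, age0, level=0.95):
    """Fit on natural-log values, then take antilogs (as described for TSH and free T4)."""
    logs = [math.log(v) for v in values]
    center, lo, hi = prediction_interval(ages, logs, age0, level)
    return math.exp(center), math.exp(lo), math.exp(hi)
```

In use, values outside `tukey_fences` would be dropped first, and `prediction_interval` (or `log_scale_interval` for skewed analytes) would then be evaluated at the midpoint of each age group, separately for each sex where slopes or intercepts differ.
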

for T3 (P = 0.022) and TSH (P = 0.003). Although T3 and TSH decreased with age for both sexes, this decline was significantly faster for females. Scatter plots showing the empirical data for T4, T3, and TSH with superimposed upper and lower reference limits are provided in Fig. 1. No gender differences were detected for free T4 (P = 0.57), and both sexes demonstrated similar rates of decline with age (P = 0.88). Our reference intervals for free T4 are broader than those reported by others (4).

Acquisition of clinical data to establish pediatric reference intervals for thyroid function testing is challenging because of the difficulty in obtaining blood samples from a large number of healthy children. Therefore, the number of tests used to define pediatric norms is limited. Hospital databases contain large stores of clinical data that can be used to establish reference intervals if the appropriate selection criteria are applied (9, 10, 26).

The large number of patients in this study permitted evaluation of age and sex as possible factors influencing the concentrations of T4, T3, TSH, and free T4. In our view, the determination of age and sex effects is essential for establishing reliable age- and sex-based pediatric reference intervals. Significant age-related declines were found for all three thyroid function tests, suggesting that during childhood, the requirements for these thyroid hormones decrease. Among postpubertal adolescents, males exhibited lower serum T4 concentrations and higher T3 concentrations compared with their female counterparts. Differences in T4 are consistent with stimulatory and suppressive effects of estrogen and androgen [...] producing slightly lower T4 values, especially in males (29). Ideally, reference intervals should be determined using healthy subjects. Because endocrine function tests involve drawing blood, practical and ethical considerations require that we rely on the hospital database to ensure a sufficient number of subjects. We addressed this concern by excluding all tests from patients whose ICD-9 codes indicated a condition that may have affected their thyroid hormone concentrations. Another limitation of this study is that immunofluorometric systems can yield slightly different results than other laboratory assay techniques (15). However, interassay differences of this magnitude should not limit the applicability of our reference intervals for most children. We recommend that patients with borderline values be re-tested and that those with values falling outside the reference intervals be further evaluated using ultrasensitive TSH measurements.

Our pediatric reference intervals for serum T4, T3, TSH, and free T4 are more accurate than those reported previously because they take into account the significant effects of both age and sex and are based on a very large number of children. The other advantage of these intervals is that they are easy to use and provide an improved clinical tool for assessing thyroid function in children and adolescents.

We thank Nader Rifai for valuable comments on this manuscript and Shawn F. O'Brien for technical expertise in database management.

Serum Potassium Is Unreliable as an Estimate of in Vivo Plasma Potassium, Andrew J. Hartland* and Richard H. Neary (Department of Clinical Biochemistry, Central Pathology Laboratory, North Staffordshire Hospital NHS Trust, Hartshill Rd., Stoke-on-Trent ST4 7PA, UK; * author for correspondence: fax 44 1782 554646)

The in vitro release of potassium from cells and platelets during blood clotting [particularly in patients with blood dyscrasias (1, 2)] increases serum potassium, on average, by 0.4 mmol/L (3). This difference is considered to be

[...] and plasma at baseline and at (time + 3) are shown in Table 1. The mean difference between the serum and plasma potassium concentrations at baseline was similar for the different tube types (0.324–0.379 mmol/L). However, the variation between patients in the amount of potassium released during clotting was marked, as shown by the wide confidence intervals of the plasma-serum difference, which spanned 1.08 mmol/L for one type of sample tube (tube E). The 3-h delay in separation of serum from clot increased this difference, with the 95% confidence intervals spanning up to 1.5 mmol/L (tube C), although in one type of tube, the mean increased by a nonsignificant 0.079 mmol/L (tube D). The serum-plasma and the baseline to (time + 3) differences were not related