Reference range

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Mikael Häggström (talk | contribs) at 04:15, 3 November 2011 (Confidence interval of limit: specified). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In health-related fields, a reference range or reference interval usually describes the variations of a measurement or value in healthy individuals. It is a basis for a physician or other health professional to interpret a set of results for a particular patient.

The standard definition of a reference range (usually the one meant if not otherwise specified) is based on what is most prevalent in a reference group taken from the population. However, there are also optimal health ranges, which are those that appear to have the best health impact on people.

Standard definition

The standard reference range for a particular measurement is defined as the prediction interval within which 95% of the values of a reference group fall, in such a way that 2.5% of the time a sample value will be less than the lower limit of this interval, and 2.5% of the time it will be larger than the upper limit, whatever the distribution of these values.[1]

Reference ranges that are given by this definition are sometimes referred to as standard ranges.

Regarding the target population, if not otherwise specified, a standard reference range generally denotes the one in healthy individuals, or in those without any known condition that directly affects the ranges being established. These are likewise established using reference groups from the healthy population, and are sometimes termed normal ranges or normal values (and sometimes "usual" ranges/values). However, the term normal may not be appropriate, as not everyone outside the interval is abnormal, and people who have a particular condition may still fall within this interval.

However, reference ranges may also be established by taking samples from the whole population, with or without diseases and conditions. In some cases, diseased individuals are taken as the population, establishing reference ranges among those having a disease or condition. Preferably, there should be specific reference ranges for each subgroup of the population that has any factor that affects the measurement, such as specific ranges for each sex, age group, race or any other general determinant.

Establishment methods

Normal distribution

When assuming a normal distribution, the reference range is obtained by measuring the values in a reference group and taking two standard deviations either side of the mean.

The 95% prediction interval is often estimated by assuming a normal distribution of the measured parameter, in which case it can alternatively be defined as the interval bounded by 1.96[2] (often rounded up to 2) standard deviations on either side of the arithmetic mean (usually simply called the "mean").

This method is often acceptably accurate if the standard deviation is not very large compared to the mean.

The following example of this method is based on values of fasting plasma glucose taken from a reference group of 12 subjects:[3]

Fasting plasma glucose (FPG) in mmol/L

Subject       FPG (mmol/L)   Deviation from mean m
Subject 1     5.5            0.17
Subject 2     5.2            0.13
Subject 3     5.2            0.13
Subject 4     5.8            0.47
Subject 5     5.6            0.27
Subject 6     4.6            0.73
Subject 7     5.6            0.27
Subject 8     5.9            0.57
Subject 9     4.7            0.63
Subject 10    5.0            0.33
Subject 11    5.7            0.37
Subject 12    5.2            0.13
Mean          5.33 (m)       0.35 = standard deviation (s.d.)

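Subsequently, the lower and upper limits of the standard reference range are calculated as:

$$\text{Lower limit} = m - 1.96 \times \text{s.d.} = 5.33 - 1.96 \times 0.35 \approx 4.6$$

$$\text{Upper limit} = m + 1.96 \times \text{s.d.} = 5.33 + 1.96 \times 0.35 \approx 6.0$$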

Thus, the standard reference range for this example is estimated to be 4.6 to 6.0 mmol/L.
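
The same arithmetic can be sketched in Python. This is only an illustration: the function name `normal_reference_range` and its `z` parameter are assumptions made for this example, and the mean and standard deviation are the ones given in the table above.

```python
def normal_reference_range(mean, sd, z=1.96):
    """Return (lower, upper) limits of a reference range,
    assuming the measurements are normally distributed."""
    return mean - z * sd, mean + z * sd


# Mean and standard deviation of the fasting plasma glucose example (mmol/L).
lower, upper = normal_reference_range(5.33, 0.35)
print(f"{lower:.1f} to {upper:.1f} mmol/L")  # 4.6 to 6.0 mmol/L
```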

Confidence interval of limit

The confidence interval of a standard reference range limit, as estimated assuming a normal distribution, can be calculated from the standard deviation of a standard reference range limit (SDSRRL), which can be estimated from a diagram such as the one shown at right.

Taking the example from the previous section, the sample size is 12, corresponding to an SDSRRL of approximately 0.57 standard deviations of the primary value, that is, 0.35 mmol/L × 0.57 = 0.1995 mmol/L (approximately 0.2 mmol/L in this case). Thus, the 95% confidence interval of each limit is:

$$\text{limit} \pm 1.96 \times \text{SDSRRL} \approx \text{limit} \pm 0.4 \text{ mmol/L}$$

Thus, the lower limit of the reference range can be written as 4.6 (CI 4.2-5.0) mmol/L

Likewise, the upper limit of the reference range can be written as 6.0 (CI 5.6-6.4) mmol/L
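
A minimal sketch of this calculation in Python follows. The function name and the `sdsrrl_factor` parameter are hypothetical; the factor 0.57 is the value quoted in the text for a sample size of 12 and is not computed here.

```python
def limit_confidence_interval(limit, sd, sdsrrl_factor, z=1.96):
    """Return the (low, high) confidence interval of a reference range limit.

    sdsrrl_factor is the standard deviation of the limit expressed in units
    of the sample standard deviation (0.57 for n = 12 in the text).
    """
    half_width = z * sdsrrl_factor * sd
    return limit - half_width, limit + half_width


# Limits and standard deviation from the fasting plasma glucose example.
for limit in (4.6, 6.0):
    low, high = limit_confidence_interval(limit, sd=0.35, sdsrrl_factor=0.57)
    print(f"{limit} (CI {low:.1f}-{high:.1f}) mmol/L")
```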

Log-normal distribution

Some functions of the log-normal distribution (here shown with the measurements non-logarithmized), with the same means - μ (as calculated after logarithmizing) but different standard deviations - σ (after logarithmizing).

In reality, biological parameters tend to have a log-normal distribution,[4] rather than the arithmetical normal distribution (which is generally referred to simply as the normal distribution).

One explanation for this log-normal distribution of biological parameters is that a sample having half the value of the mean or median tends to be almost as probable as a sample having twice that value. Also, only a log-normal distribution can account for the inability of almost all biological parameters to be negative (at least when measured on absolute scales): there is no definite limit to the size of outliers (extreme values) on the high side, but values can never be less than zero, which results in positive skewness.

As shown in the diagram at right, this phenomenon has a relatively small effect if the standard deviation (as compared to the mean) is relatively small, as this makes the log-normal distribution appear similar to an arithmetical normal distribution. Thus, for convenience, the arithmetical normal distribution may be more appropriate to use with small standard deviations, and the log-normal distribution with large standard deviations.

In a log-normal distribution, the geometric standard deviation and geometric mean estimate the 95% prediction interval more accurately than their arithmetic counterparts.

Necessity

The necessity to establish a reference range by log-normal distribution rather than arithmetic normal distribution can be regarded as depending on how much difference it would make not to do so, which can be described as the ratio:

$$\text{Difference ratio} = \frac{\left|\text{Limit}_{\text{log-normal}} - \text{Limit}_{\text{normal}}\right|}{\text{Limit}_{\text{log-normal}}}$$

where:

  • $\text{Limit}_{\text{log-normal}}$ is the (lower or upper) limit as estimated by assuming log-normal distribution
  • $\text{Limit}_{\text{normal}}$ is the (lower or upper) limit as estimated by assuming arithmetically normal distribution.

This difference can be put solely in relation to the coefficient of variation, as in the diagram at right, where:

$$\text{Coefficient of variation} = \frac{\text{s.d.}}{m}$$

where:

  • s.d. is the arithmetic standard deviation
  • m is the arithmetic mean

In practice, it can be regarded as necessary to use the establishment methods of a log-normal distribution if the difference ratio exceeds 0.1, meaning that a (lower or upper) limit estimated from an assumed arithmetically normal distribution would differ by more than 10% from the corresponding limit estimated from a (more accurate) log-normal distribution. As seen in the diagram, a difference ratio of 0.1 is reached for the lower limit at a coefficient of variation of 0.213 (21.3%), and for the upper limit at a coefficient of variation of 0.413 (41.3%). The lower limit is more affected by an increasing coefficient of variation, and its "critical" coefficient of variation of 0.213 corresponds to a ratio of (upper limit)/(lower limit) of 2.43. As a rule of thumb, if the upper limit is more than 2.43 times the lower limit when estimated by assuming an arithmetically normal distribution, redoing the calculations with a log-normal distribution should be considered.

Taking the example from the previous section, the arithmetic standard deviation (s.d.) is estimated at 0.35 and the arithmetic mean (m) at 5.33. Thus the coefficient of variation is 0.066. This is less than both 0.213 and 0.413, and thus both the lower and upper limits of fasting blood glucose can most likely be estimated by assuming an arithmetically normal distribution. More specifically, the coefficient of variation of 0.066 corresponds to a difference ratio of 0.0072 (0.7%) for the lower limit and 0.0054 (0.5%) for the upper limit.
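
The dependence of the difference ratio on the coefficient of variation can be explored with a short Python sketch. It assumes that the log-normal limits are derived from the arithmetic mean and standard deviation through the usual moment relations of the log-normal distribution, σlog² = ln(1 + CV²) and μlog = ln(m) − σlog²/2; the function name `difference_ratios` is hypothetical. Because both limits scale with the mean, the mean is set to 1.

```python
import math


def difference_ratios(cv, z=1.96):
    """Return (lower, upper) difference ratios for a coefficient of variation.

    Limits are expressed in units of the arithmetic mean, which cancels out.
    """
    sigma_log = math.sqrt(math.log(1 + cv ** 2))
    mu_log = -0.5 * sigma_log ** 2  # log-scale mean when the arithmetic mean is 1
    ratios = []
    for sign in (-1, 1):  # -1: lower limit, +1: upper limit
        limit_normal = 1 + sign * z * cv
        limit_lognormal = math.exp(mu_log + sign * z * sigma_log)
        ratios.append(abs(limit_lognormal - limit_normal) / limit_lognormal)
    return tuple(ratios)


# Coefficient of variation of the fasting plasma glucose example.
lower_ratio, upper_ratio = difference_ratios(0.066)
print(f"lower: {lower_ratio:.4f}, upper: {upper_ratio:.4f}")
```

Under these assumptions, the sketch gives ratios close to 0.1 for the lower limit at a coefficient of variation of 0.213 and for the upper limit at 0.413, in line with the thresholds quoted above.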

By logarithmized sample values

One method of estimating the reference range for a parameter with a log-normal distribution is to logarithmize all the measurements with an arbitrary base (for example e), derive the mean and standard deviation of these logarithms, and determine the logarithms located (for a 95% prediction interval) 1.96 standard deviations below and above that mean. These two logarithms are then used as exponents of the same base that was used in logarithmizing, and the two resulting values are the lower and upper limits of the 95% prediction interval.

The following example of this method is based on the same values of fasting plasma glucose as used in the previous section, using e as a base:[3]

Fasting plasma glucose (FPG) in mmol/L

Subject       FPG (mmol/L)   loge(FPG)      loge(FPG) deviation from mean μlog
Subject 1     5.5            1.70           0.029
Subject 2     5.2            1.65           0.021
Subject 3     5.2            1.65           0.021
Subject 4     5.8            1.76           0.089
Subject 5     5.6            1.72           0.049
Subject 6     4.6            1.53           0.141
Subject 7     5.6            1.72           0.049
Subject 8     5.9            1.77           0.099
Subject 9     4.7            1.55           0.121
Subject 10    5.0            1.61           0.061
Subject 11    5.7            1.74           0.069
Subject 12    5.2            1.65           0.021
Mean          5.33 (m)       1.67 (μlog)    0.064 = standard deviation of loge(FPG) (σlog)

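Subsequently, the still logarithmized lower and upper limits of the reference range are calculated as:

$$\ln(\text{lower limit}) = \mu_{\log} - 1.96 \times \sigma_{\log} = 1.67 - 1.96 \times 0.064 \approx 1.545$$

$$\ln(\text{upper limit}) = \mu_{\log} + 1.96 \times \sigma_{\log} = 1.67 + 1.96 \times 0.064 \approx 1.795$$

Conversion back to non-logarithmized values is subsequently performed as:

$$\text{Lower limit} = e^{1.545} \approx 4.7$$

$$\text{Upper limit} = e^{1.795} \approx 6.0$$

Thus, the standard reference range for this example is estimated to be 4.7 to 6.0 mmol/L.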
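
As an illustration, the last two steps of this method can be written in Python. The function name `lognormal_reference_range` is an assumption made for this sketch, and the inputs are the mean and standard deviation of the logarithms given in the table above.

```python
import math


def lognormal_reference_range(mu_log, sigma_log, z=1.96):
    """Return (lower, upper) limits from the mean and standard deviation
    of the natural logarithms of the measurements."""
    log_lower = mu_log - z * sigma_log
    log_upper = mu_log + z * sigma_log
    # Exponentiate with the same base (e) that was used for the logarithms.
    return math.exp(log_lower), math.exp(log_upper)


lower, upper = lognormal_reference_range(1.67, 0.064)
print(f"{lower:.1f} to {upper:.1f} mmol/L")  # 4.7 to 6.0 mmol/L
```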

From arithmetic mean and variance

An alternative method of establishing a reference range with the assumption of log-normal distribution is to use the arithmetic mean and the arithmetic standard deviation. This is somewhat more tedious to perform, but may be useful, for example, in cases where a study that establishes a reference range presents only the arithmetic mean and standard deviation, leaving out the source data. If the original assumption of arithmetically normal distribution is shown to be less appropriate than the log-normal one, then the arithmetic mean and standard deviation may be the only parameters available to correct the reference range.

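By assuming that the expected value can represent the arithmetic mean in this case, the parameters μlog and σlog can be estimated from the arithmetic mean (m) and standard deviation (s.d.) as:

$$\mu_{\log} = \ln(m) - \frac{1}{2}\ln\left(1 + \left(\frac{\text{s.d.}}{m}\right)^2\right)$$

$$\sigma_{\log} = \sqrt{\ln\left(1 + \left(\frac{\text{s.d.}}{m}\right)^2\right)}$$

Following the example reference group from the previous section:

$$\mu_{\log} = \ln(5.33) - \frac{1}{2}\ln\left(1 + \left(\frac{0.35}{5.33}\right)^2\right) \approx 1.67$$

$$\sigma_{\log} = \sqrt{\ln\left(1 + \left(\frac{0.35}{5.33}\right)^2\right)} \approx 0.066$$

Subsequently, the logarithmized, and later non-logarithmized, lower and upper limits are calculated just as by logarithmized sample values.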
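
A short Python sketch of this conversion follows. It uses the standard moment relations of the log-normal distribution, σlog² = ln(1 + (s.d./m)²) and μlog = ln(m) − σlog²/2; the function name `lognormal_params` is hypothetical.

```python
import math


def lognormal_params(mean, sd):
    """Estimate (mu_log, sigma_log) from the arithmetic mean and
    standard deviation, assuming a log-normal distribution."""
    sigma_log = math.sqrt(math.log(1 + (sd / mean) ** 2))
    mu_log = math.log(mean) - 0.5 * sigma_log ** 2
    return mu_log, sigma_log


# Arithmetic mean and standard deviation of the example reference group.
mu_log, sigma_log = lognormal_params(5.33, 0.35)
lower = math.exp(mu_log - 1.96 * sigma_log)
upper = math.exp(mu_log + 1.96 * sigma_log)
print(f"mu_log = {mu_log:.2f}, sigma_log = {sigma_log:.3f}")
print(f"{lower:.1f} to {upper:.1f} mmol/L")
```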

Directly from percentages of interest

Reference ranges can also be established directly from the 2.5th and 97.5th percentiles of the measurements in the reference group. For example, if the reference group consists of 200 people, then, counting from the measurement with the lowest value to the highest, the lower limit of the reference range would correspond to the 5th measurement and the upper limit to the 195th measurement.

This method can be used even when measurement values do not appear to conform conveniently to any form of normal distribution or other function.

However, reference range limits estimated in this way have higher variance, and therefore less reliability, than those estimated by an arithmetic or log-normal distribution (when such is applicable), because the latter acquire statistical power from the measurements of the whole reference group rather than just the measurements at the 2.5th and 97.5th percentiles. Still, this variance decreases with increasing size of the reference group, and therefore this method may be optimal where a large reference group can easily be gathered and the distribution mode of the measurements is uncertain.
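
The counting procedure described above can be sketched in Python as follows. The function name is hypothetical, and the nearest-rank rule used here is one of several percentile conventions, chosen because it reproduces the 5th and 195th measurements for a group of 200.

```python
def percentile_reference_range(values):
    """Return (lower, upper) limits as the measurements at the
    2.5th and 97.5th percentiles, using the nearest-rank rule."""
    ordered = sorted(values)
    n = len(ordered)
    # 1-based ranks ceil(0.025 * n) and ceil(0.975 * n), computed with
    # integer arithmetic to avoid floating-point rounding surprises.
    lower_rank = max(1, (25 * n + 999) // 1000)
    upper_rank = (975 * n + 999) // 1000
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# With 200 measurements numbered 1..200, the limits are the 5th and 195th.
print(percentile_reference_range(range(1, 201)))  # (5, 195)
```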

Bimodal distribution

In the case of a bimodal distribution (seen at right), it is useful to find out why this is the case. Two reference ranges can be established for the two different groups of people, making it possible to assume a normal distribution for each group. This bimodal pattern is commonly seen in tests that differ between men and women, such as prostate-specific antigen.

Interpretation of standard ranges in medical tests

For medical tests whose results are continuous values, reference ranges can be used in the interpretation of an individual test result. This is primarily used for diagnostic tests and screening tests, while monitoring tests may optimally be interpreted from previous tests of the same individual instead.

Reference ranges aid in the evaluation of whether a test result's deviation from the mean is a result of random variability or of an underlying disease or condition. If the reference group used to establish the reference range can be assumed to be representative of the individual person in a healthy state, then a test result from that individual that turns out to be lower or higher than the reference range can be interpreted as having less than a 2.5% probability of having occurred by random variability, which in turn strongly indicates that an underlying disease or condition should be considered as a cause.

Such further consideration can be performed by finding relevant differential diagnoses that may explain the finding, independently calculating their probabilities of being present in the individual at hand, and comparing them to the probability that the result could have occurred by random variability.

Example

An individual takes a test that measures the ionized calcium in the blood, resulting in a value of 1.30 mmol/L, and a reference group that appropriately represents the individual has established a reference range of 1.05 to 1.25 mmol/L. The individual's value is higher than the upper limit of the reference range, and therefore has less than 2.5% probability of being a result of random variability, constituting a strong indication to make a differential diagnosis of possible causative conditions.

Probability of differential diagnoses
A simplified differential diagnosis workup for this case may proceed by first estimating the probability of each condition independently of the others, as follows:

Hypercalcemia (usually defined as a calcium level above the reference range) is mostly caused by either primary hyperparathyroidism or malignancy,[5] and therefore, it is reasonable to include these in the differential diagnosis.

The prevalence of primary hyperparathyroidism in the general population, estimated to be 3 in 1000,[6] corresponds to a probability of 0.3% per individual, which may in this case be used as the pre-test probability for various diagnostic tests. These tests may also include the serum calcium test itself, but specifically considering its effects on the probability of having primary hyperparathyroidism. The desired result of all such tests is an estimation of a final post-test probability of having hyperparathyroidism.

A post-test probability of having a malignancy can be estimated similarly.

The consideration of what is the proper diagnosis can be based on comparing the independently estimated probabilities of each condition of interest, in this case mainly "no disease", primary hyperparathyroidism and malignancy. In this comparison, a probability parameter that is more clinically relevant than the independently estimated post-test probability is the relative probability, calculated as:

$$RP_x = \frac{ICP_x}{ICP_{all}}$$

where:

  • $RP_x$ is the relative probability of condition X
  • $ICP_x$ is the independently calculated probability of condition X
  • $ICP_{all}$ is the sum of all independently calculated probabilities of all relevant differential diagnoses

Suppose that the independently calculated post-test probabilities for primary hyperparathyroidism and malignancy end up as 2.0% and 0.1%, respectively. With the probability of "no disease", with random variability as the cause of the apparent hypercalcemia, still at less than 2.5%, the sum of all independently calculated probabilities becomes 4.6%. This is far less than 100%, indicating that further causes of hypercalcemia need to be considered and tested for. However, it is realistic for the sum of all independently calculated probabilities to remain far less than 100% even when every possible disease and condition is included in the differential diagnosis; the low sum reflects a low probability that the hypercalcemia would have happened in the first place, but in reality it has happened, and therefore the relative probabilities are more clinically relevant than the independently calculated ones. Suppose that all other possible differential diagnoses for hypercalcemia were considered as well, but that their independently calculated probabilities, taken together, total only 0.4%. Hence, the sum of all independently calculated probabilities becomes 5.0%, and the relative probabilities of the conditions become:

  • Primary hyperparathyroidism: 2%/5% = 0.4 = 40%
  • Malignancy: 0.1%/5% = 0.02 = 2%
  • All other conditions: 0.4%/5% = 0.08 = 8%
  • No disease with the apparent hypercalcemia being a result of random variability: Less than 2.5%/5% = Less than 0.5 = Less than 50%
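
The normalization in this list can be sketched in Python. The function and dictionary names are hypothetical, and the 2.5% entered for "no disease" is the upper bound from the text, so its relative probability is likewise an upper bound.

```python
def relative_probabilities(icp):
    """Divide each independently calculated probability (ICP)
    by the sum of all ICPs to obtain relative probabilities."""
    total = sum(icp.values())
    return {condition: p / total for condition, p in icp.items()}


# Independently calculated probabilities from the example.
icp = {
    "primary hyperparathyroidism": 0.020,
    "malignancy": 0.001,
    "all other conditions": 0.004,
    "no disease (random variability)": 0.025,  # upper bound, "less than 2.5%"
}

for condition, rp in relative_probabilities(icp).items():
    print(f"{condition}: {rp:.0%}")
```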

Thus, if all other explanations for an abnormal value are found to be of low probability, then there can still be a possibility worth further consideration that a value outside the reference range is caused by random variability. Such further consideration can include further specification of the probability of random variability:

Further specification of probability of random variability
In order to make a proper comparison to the possibility of presence of a disease or condition, the probability that the result would be an effect of random variability can be further specified as follows:

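Following the same example of an individual's calcium value of 1.30 mmol/L, the value is assumed to conform acceptably to a normal distribution, so the mean can be assumed to be 1.15 mmol/L in the reference group (the midpoint of the reference range of 1.05 to 1.25 mmol/L). The standard deviation, if not given already, can be inversely calculated by knowing that the absolute value of the difference between the mean and, for example, the upper limit of the reference range, is approximately 2 standard deviations (more accurately 1.96), and thus:

$$\text{s.d.} \approx \frac{|\text{upper limit} - \text{mean}|}{2} = \frac{1.25 - 1.15}{2} = 0.05 \text{ mmol/L}$$

The standard score for the individual's test is subsequently calculated as:

$$\text{Standard score} = \frac{\text{value} - \text{mean}}{\text{s.d.}} = \frac{1.30 - 1.15}{0.05} = 3$$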

A value that lies above the mean by a standard score of 3 or more has a probability of approximately 0.14% (given by (100%-99.7%)/2, with 99.7% taken from the 68-95-99.7 rule). Taking these calculations into account, it is more likely that the individual in this example has primary hyperparathyroidism than that the value was merely a result of random variability.
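
For illustration, the standard score and the corresponding upper-tail probability can be computed in Python with the complementary error function from the standard library. The function name and the `z_limit` parameter are assumptions made for this sketch; `z_limit=2.0` mirrors the rounding used in the text.

```python
import math


def standard_score_and_tail(value, lower_limit, upper_limit, z_limit=2.0):
    """Return (standard score, upper-tail probability) of a test value,
    assuming a normal distribution centred in the reference range."""
    mean = (lower_limit + upper_limit) / 2
    sd = (upper_limit - mean) / z_limit
    z = (value - mean) / sd
    # P(Z > z) for a standard normal variable.
    tail = 0.5 * math.erfc(z / math.sqrt(2))
    return z, tail


z, tail = standard_score_and_tail(1.30, 1.05, 1.25)
print(f"standard score = {z:.2f}, probability = {tail:.2%}")
```

The exact tail probability obtained this way is slightly below the figure derived from the 68-95-99.7 rule, which is itself a rounded approximation.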

Optimal health range

Optimal (health) range or therapeutic target (not to be confused with biological target) is a reference range or limit that is based on concentrations or levels that are associated with optimal health or minimal risk of related complications and diseases, rather than the standard range based on normal distribution in the population.

It may be more appropriate to use for, e.g., folate, since approximately 90 percent of North Americans may actually suffer more or less from folate deficiency,[7] but only the 2.5 percent that have the lowest levels will fall below the standard reference range. In this case, the actual folate ranges for optimal health are substantially higher than the standard reference ranges. Vitamin D has a similar tendency. In contrast, for, e.g., uric acid, having a level not exceeding the standard reference range still does not exclude the risk of getting gout or kidney stones.

A problem with the optimal health range is the lack of a standard method of estimating the ranges. The limits may be defined as those where the health risks exceed a certain threshold, but with various risk profiles between different measurements (such as folate and vitamin D), and even different risk aspects for one and the same measurement (such as both deficiency and toxicity of vitamin A), it is difficult to standardize. Consequently, optimal health ranges, when given by various sources, have an additional variability caused by various definitions of the parameter. Also, as with standard reference ranges, there should be specific ranges for different determinants that affect the values, such as sex and age. Ideally, there should rather be an estimation of what is the optimal value for every individual, when taking all significant factors of that individual into account - a task that may be hard to achieve by studies, but long clinical experience by a physician may make this method preferable to using reference ranges.

One-sided cut-off values

In many cases, only one side of the range is usually of interest, such as with markers of pathology including cancer antigen 19-9, where it is generally without any clinical significance to have a value below what is usual in the population. Therefore, such targets are often given with only one limit of the reference range given, and, strictly, such values are rather cut-off values or threshold values.

They may represent both standard ranges and optimal health ranges. Also, they may represent an appropriate value to distinguish healthy persons from those with a specific disease, although this gives additional variability because different diseases are being distinguished. For example, for NT-proBNP, a lower cut-off value is used in distinguishing healthy babies from those with acyanotic heart disease, compared to the cut-off value used in distinguishing healthy babies from those with congenital nonspherocytic anemia.[8]

General drawbacks

For standard as well as optimal health ranges, and cut-offs, sources of inaccuracy and imprecision include:

  • Instruments and lab techniques used, or how the measurements are interpreted by observers. These may apply both to the instruments, etc. used to establish the reference ranges and to the instruments, etc. used to acquire the value for the individual to whom these ranges are applied. To compensate, individual laboratories should have their own lab ranges to account for the instruments used in the laboratory.
  • Determinants such as age, diet, etc. that are not compensated for. Optimally, there should be reference ranges from a reference group that is as similar as possible to each individual they are applied to, but it is practically impossible to compensate for every single determinant, often not even when the reference ranges are established from multiple measurements of the same individual they are applied to, because of test-retest variability.

Also, reference ranges tend to give the impression of definite thresholds that clearly separate "good" or "bad" values, while in reality there are generally continuously increasing risks with increased distance from usual or optimal values.

With this and uncompensated factors in mind, the ideal interpretation of a test result would consist of a comparison with what would be expected or optimal in the individual when taking all factors and conditions of that individual into account, rather than strictly classifying the values as "good" or "bad" by using reference ranges from other people.

Examples

Reference ranges for blood tests, sorted by mass and molar concentration.

References

  1. ^ Page 19 in: Bangert, Stephen K.; Marshall, William J.; Marshall, William Leonard (2008). Clinical biochemistry: metabolic and clinical aspects. Philadelphia: Churchill Livingstone/Elsevier. ISBN 0-443-10186-8.
  2. ^ Page 48 in: Sterne, Jonathan; Kirkwood, Betty R. (2003). Essential medical statistics. Oxford: Blackwell Science. ISBN 0-86542-871-9.
  3. ^ a b Table 1. Subject characteristics in: PMID 9665434.
  4. ^ Huxley, Julian S. (1932). Problems of relative growth. London. ISBN 0486611140. OCLC 476909537.
  5. ^ Table 20-4 in: Mitchell, Richard Sheppard; Kumar, Vinay; Abbas, Abul K.; Fausto, Nelson. Robbins Basic Pathology. 8th edition. Philadelphia: Saunders. ISBN 1-4160-2973-7.
  6. ^ doi:10.1210/jc.2004-1891.
  7. ^ Folic Acid: Don't Be Without It! by Hans R. Larsen, MSc ChE, retrieved on July 7, 2009. In turn citing:
    • Boushey, Carol J., et al. A quantitative assessment of plasma homocysteine as a risk factor for vascular disease. Journal of the American Medical Association, Vol. 274, October 4, 1995, pp. 1049-57
    • Morrison, Howard I., et al. Serum folate and risk of fatal coronary heart disease. Journal of the American Medical Association, Vol. 275, June 26, 1996, pp. 1893-96
  8. ^ Screening for Congenital Heart Disease with NT-proBNP: Results. By Emmanuel Jairaj Moses, Sharifah A.I. Mokhtar, Amir Hamzah, Basir Selvam Abdullah, and Narazah Mohd Yusoff. Laboratory Medicine. 2011;42(2):75-80.

Further reading

The procedures and vocabulary referring to reference intervals: CLSI (Clinical and Laboratory Standards Institute) and IFCC (International Federation of Clinical Chemistry). CLSI - Defining, Establishing, and Verifying Reference Intervals in the Laboratory; Approved Guideline - Third Edition. Document C28-A3 (ISBN 1-56238-682-4). Wayne, PA, USA, 2008.