
Pharmacoepidemiology and Drug Safety 2005; 14: 601–609

Published online 13 June 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/pds.1107

ORIGINAL REPORT

Comparing data mining methods on the VAERS database†

David Banks PhD‡, Emily Jane Woo MD, MPH*, Dale R. Burwen MD, MPH, Phil Perucci BS,
M. Miles Braun MD, MPH and Robert Ball MD, MPH, ScM
The Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, Food and Drug Administration,
Rockville, MD, USA

SUMMARY
Purpose Data mining may enhance traditional surveillance of vaccine adverse events by identifying events that are reported more commonly after administering one vaccine than other vaccines. Data mining methods find signals as the proportion of times a condition or group of conditions is reported soon after the administration of a vaccine; thus it is a relative proportion compared across vaccines, and not an absolute rate for the condition. The Vaccine Adverse Event Reporting System (VAERS) contains approximately 150 000 reports of adverse events that are possibly associated with vaccine administration.
Methods We studied four data mining techniques: empirical Bayes geometric mean (EBGM), lower-bound of the EBGM’s 90% confidence interval (EB05), proportional reporting ratio (PRR), and screened PRR (SPRR). We applied these to the VAERS database and compared the agreement among methods and other performance properties, particularly focusing on the vaccine–event combinations with the highest numerical scores in the various methods.
Results The vaccine–event combinations with the highest numerical scores varied substantially among the methods. Not all combinations representing known associations appeared in the top 100 vaccine–event pairs for all methods.
Conclusions The four methods differ in their ranking of vaccine–COSTART pairs. A given method may be superior in certain situations but inferior in others. This paper examines the statistical relationships among the four estimators. Determining which method is best for public health will require additional analysis that focuses on the true alarm and false alarm rates using known vaccine–event associations. Evaluating the properties of these data mining methods will help determine the value of such methods in vaccine safety surveillance. Copyright © 2005 John Wiley & Sons, Ltd.

key words — adverse event; empirical Bayes; proportional reporting ratio; vaccine

* Correspondence to: Dr Emily Jane Woo, HFM-222, Food and Drug Administration, 1401 Rockville Pike, Rockville, MD 20852. E-mail: wooj@cber.fda.gov
† No conflict of interest was declared.
‡ Present address: The Institute of Statistics and Decision Sciences, Duke University, Durham, North Carolina, USA.

Received 1 March 2004; Revised 7 March 2005; Accepted 9 March 2005

INTRODUCTION

The Vaccine Adverse Event Reporting System (VAERS) is a passive surveillance system to monitor vaccine safety and is co-managed by the Food and Drug Administration (FDA) and the Centers for Disease Control and Prevention (CDC).1 VAERS receives more than 14 000 reports each year from vaccine manufacturers, healthcare professionals, and the general public. Each report describes one or more adverse events that, at least on temporal grounds, appear to be associated with the administration of a vaccine. Some of these associations are surely coincidental, in some cases the relationship is unclear, and in some cases (e.g., injection site reactions) the relationship is likely causal.

At the time of this analysis, the VAERS database included information on about 70 vaccines and 989 adverse event coding terms. The coding terms for the
adverse events are known as Coding Symbols for a Thesaurus of Adverse Reaction Terms (COSTARTs), and describe signs, symptoms, and diagnoses, such as headache, swelling at the vaccination site, arthritis, gastroenteritis, and so forth. A single report may list more than one vaccine and may generate several COSTARTs. Since COSTARTs may overlap (e.g., apathy and depression), the same condition may be coded in different ways, according to the judgment of the person assigning the COSTART codes.

Analysis of VAERS data focuses on describing clinical and demographic characteristics of reports and looking for patterns to detect ‘signals’ of adverse events plausibly linked to a vaccine. While pharmacoepidemiologists do not universally agree upon what constitutes a signal,2 a signal can be generally defined as evidence that suggests an adverse event might be caused by vaccination and warrants further investigation or action. Evidence of a signal in case reports and case series of spontaneous reports may include the number of reports and any unexpected patterns in clinical conditions by such factors as age, gender, time to onset, and dose.

Limitations of spontaneous reporting systems such as VAERS include lack of verification of reported diagnoses, lack of consistent diagnostic criteria for all cases with a given diagnosis, wide range in data quality, underreporting, inadequate denominator data (doses administered or patients vaccinated), and absence of an unvaccinated control group. Signals detected through analysis of VAERS data almost always require confirmation through a controlled study. Data mining methods cannot address biases in reporting and should be used in conjunction with medical judgment. The lack of denominator data limits the use of VAERS to discover unforeseen safety problems that may be associated with particular vaccines. The analyst must rely on vaccine distribution data to estimate how many people receive a given vaccine, and does not know the demographic or clinical characteristics of the recipients. Thus, the ability to apply traditional methods of risk analysis, which depend upon estimation of the baseline incidence rates, is limited. Calculation of reporting rates (number of adverse events reported/number of doses of vaccine distributed) and reporting rate ratios that compare vaccines has been used to generate signals.3 Biases in reporting, inadequate denominator data, and lack of background rates for some conditions often limit the utility of the reporting rate approach. For spontaneous reporting systems such as VAERS, it is natural to worry about the effects of systematic underreporting or overreporting. Reporting rates vary from vaccine to vaccine, from adverse event to adverse event, and from one segment of the population to another. Many factors may stimulate reporting, especially media reporting of suspected side effects, but also FDA and CDC communications. Moreover, the seriousness of an event is known to influence reporting: only a small minority of rashes after MMR vaccine are reported to VAERS, for example, but the majority of cases of paralytic polio after OPV are reported to VAERS.4

To address some of these limitations, various data mining techniques have been developed to help uncover potential signals in the data.5–7 The methods permit rapid analysis of large volumes of data that humans cannot possibly evaluate in detail. Although a spontaneous reporting system lacks a true control (i.e., people are not randomized to receive a placebo), data mining techniques permit analysis of a vaccine of interest, with all other vaccines as a quasi-control group for comparison. Data mining cannot eliminate reporting bias, but it does account for different reporting proportions for each vaccine. Specifically, the methods can identify conditions that comprise a larger proportion of reported events for a given vaccine, compared to other vaccines, but an absolute rate is not calculated. Moreover, data mining might identify rare conditions which may not appear during premarketing trials. We propose to apply a variety of methods to help shed light on the potential strengths and weaknesses of the methods with regard to vaccine adverse event data.

There are no ‘gold standards’ for the detection of vaccine–COSTART associations. The ability to confirm retrospectively a known connection between vaccination and a particular event (e.g., rotavirus vaccine and intussusception8) helps to validate data mining methods. Other known associations come from the Vaccine Injury Table, a list of vaccine–adverse event associations which the Institute of Medicine has determined are causal.9 Agreement among data mining methods, i.e., two or more methods signal a given vaccine–COSTART association, may also be helpful. Our objective is to extend such empirical work through an examination of the statistical properties of the data mining methods that have been proposed.

METHODS

The VAERS database may be viewed as a contingency table with 70 rows (the vaccines) and 989 columns (the COSTARTs). Each cell in the table contains a value $n_{ij}$ that gives the number of reports for the ith vaccine and the jth COSTART.

The usual multinomial model assumes that separate events are classified independently, and this is probably
approximately correct. A small deviation from this model is that sometimes a single report will generate a handful of different COSTARTs (e.g., both nausea and headache), but the effect of this is likely to be small. A second deviation is that sometimes a news story or popular television show can trigger a burst of reports, and these reports are not independent. But again, the overall magnitude of these effects is probably small, and the authenticity of signals generated in this way could be evaluated through examination of longitudinal trends.

One aspect of the VAERS data that has not yet played a substantial role in signal detection research is the association among COSTART terms. One could potentially ‘borrow strength’ by pooling signals from similar COSTART terms. For example, reports of dizziness and vertigo might be usefully combined to improve the power of the signal detection algorithm. However, we do not address this extension.

Instead, this paper focuses upon a statistical comparison of four signal detection methods that have been discussed in the literature. We call these methods proportional reporting ratio (PRR), screened PRR (SPRR), empirical Bayes geometric mean (EBGM), and lower-bound of the EBGM’s 90% confidence interval (EB05). We do not address the relative risk,10 nor do we consider a conditional probability measure developed by Friedman et al.11 and critiqued by DuMouchel et al.12

There are other methods that can be used for signal detection in large contingency tables without true measures of exposure. For example, the U.S. Census Bureau and the Consumer Product Safety Commission have explored the use of ‘raking’ to detect interactions in large tables (cf. Little and Wu13). Bate et al.14 propose using a Bayesian Confidence Propagation Neural Network for adverse event detection in the WHO database (but DuMouchel15 argues that this method is an approximation to EBGM based on beta-binomial Bayesian estimates). Hauben and Zhou7 review much of this literature.

Although these and other methods could be considered, this research has focused upon the four main techniques that have been piloted within the FDA to date; this paper is not intended to be a comprehensive overview of all currently available methods. A key concern is that methods used for official purposes ideally should be transparent and sufficiently interpretable that expert knowledge can guide the evaluation of new signals. Also, it is highly desirable that the signal detection system used in VAERS not be radically different from systems already in place.

Analysis

One objective of this comparison is to determine whether all four methods agree with each other, as shown by scatterplots and as measured by rank correlation. Comparison of the methods’ sensitivity and specificity is desirable, but the paucity of gold standards for vaccine–event causality limits the ability to estimate these properties. The theoretical properties of the procedures are also an important consideration. This paper addresses all three bases of comparison; we measure the agreement between methods, we discuss performance with respect to a handful of known adverse effects, and evaluate both kinds of information on the basis of the performance differences expected from theory.

The Vaccine Injury Table is a list of vaccine–adverse event associations that the Institute of Medicine has determined are causal.9 By operationalizing these associations as 32 vaccine–COSTART pairs (Table 1), we compare the ability of the methods to signal those pairs. Such operationalization is imperfect, since COSTARTs are applied without standardized definitions or diagnostic confirmation. For example, ARTHRITIS may refer to acute or chronic inflammation of joints. We then evaluate the efficiency of the methods by comparing the number of vaccine–COSTART pairs signaled by each method.

Table 1. Vaccine–event associations from vaccine injury table, operationalized as vaccine–COSTART pairs

Association from vaccine injury table | COSTART | Vaccine code(s)
Anaphylaxis or anaphylactic shock after any of the following: tetanus toxoid-containing vaccines; pertussis antigen-containing vaccines; measles, mumps and rubella virus-containing vaccines in any combination; polio inactivated-virus containing vaccines; hepatitis B antigen-containing vaccines | Anaphylaxis | DT, DTAP, DTAPH*, DTP, HEP, IPV, MMR, TD
Chronic arthritis after rubella virus-containing vaccines | Arthritis | MMR, MR, MUR, R
Brachial neuritis after tetanus toxoid-containing vaccines | Brachial neuritis | DTAP*, DTAPH*, DT*, DTP*, TD*, TTOX
Encephalitis after any of the following: pertussis antigen-containing vaccines; measles, mumps and rubella virus-containing vaccines in any combination | Encephalitis | DTAP, DTAPH*, DTP, MMR
Encephalopathy after any of the following: pertussis antigen-containing vaccines; measles, mumps and rubella virus-containing vaccines in any combination | Encephalopathy | DTAP, DTAPH*, DTP, MMR
Intussusception after rotavirus vaccine | Intussusception | RV
Paralytic polio after polio live virus-containing vaccines | Poliomyelitis | OPV
Thrombocytopenia purpura after measles virus-containing vaccines | Thrombocytopenic purpura | M, MM*, MMR, MR

*VAERS did not contain any reports of these vaccine–COSTART pairs.

Injection site reactions are accepted as being caused by injectable vaccines. We also look at the methods’ ability to signal injection site reactions, represented by COSTART codes ABSCESS INJECT SITE, ATROPHY INJECT SITE, CYST INJECT SITE, EDEMA INJECT SITE, GRANULOMA INJECT SITE, HEM INJECT SITE, HYSN INJECT SITE, INFLAM INJECT SITE, INJECT SITE REACT, MASS INJECT SITE, NECRO INJECT SITE, and PAIN INJECT SITE. This comparison allows us to evaluate the methods’ abilities to detect an adverse effect which is known to be caused by many vaccines.

A given method may be superior in some situations but inferior in others. There are six possible pairwise comparisons among the four data mining methods. Since our primary interest is to determine whether any method is the most effective for discovering adverse event risks, we focus on four comparisons that seem most informative in terms of identifying plausible vaccine–event pairs.

Data mining methods assessed

Proportional reporting ratio (PRR). The PRR approach was first described by Finney16 and further developed recently by Evans, Waller, and
Davis.17 To describe the method, suppose we are interested in developing a measure for the strength of the association between vaccine i and COSTART j. Let

a. denote $n_{ij}$, the number of reports for a given vaccine–COSTART combination;
b. denote the number of times that any other COSTART is reported for vaccine i;
c. denote the number of times that COSTART j is reported for all other vaccines;
d. denote the number of reports for any other vaccine–COSTART combination.

These values may be depicted in a contingency table:

                   | COSTART j | All other COSTARTs
Vaccine i          | a         | b
All other vaccines | c         | d

In this contingency table notation, the PRR signal for vaccine i and COSTART j is

$$\mathrm{PRR}_{ij} = \frac{a/(a+b)}{c/(c+d)}$$

This fraction is the proportion of COSTART j reports for vaccine i divided by the proportion of COSTART j reports for all other vaccines. A large PRR for a specific vaccine–COSTART pair indicates that the COSTART has been disproportionately reported for that vaccine, compared with all the other vaccines in VAERS database.

There are several problems with PRR as a metric. First, it does not account for the number of cases $n_{ij}$. If this value is small, then the generated signal can have large variance. Second, an association might not be statistically significant, but the raw PRR does not reveal this fact since it lacks a well-defined null distribution. Third, the PRR is subject to major distortion due to artifacts in the reporting process. Nonetheless, PRR is an intuitive measure in the absence of exposure data.

Screened proportional reporting ratio (SPRR). Evans, Waller, and Davis17 proposed screening criteria to define SPRR: $n_{ij} \geq 3$, PRR ≥ 2, and Yates-corrected chi-square ≥ 4. (The Yates correction is a
continuity adjustment to improve the accuracy of the chi-squared approximation to the distribution of Pearson’s test for independence in a contingency table18). These requirements help to address the first two of the three concerns about use of the raw PRR score. The formula is

$$\text{Yates-corrected } X^2 = \sum_{r,s} \frac{(|O_{rs} - E_{rs}| - 0.5)^2}{E_{rs}}$$

Here $O_{rs}$ is the observed number in cell (r,s) for r = 1,2 and s = 1,2 and thus takes the values a, b, c, and d, as in the contingency table in the PRR section. The $E_{rs}$ are the numbers expected in those cells under the assumption that the adverse events are independent of the vaccine, and this is given by the row sum times the column sum divided by the total, so

$$E_{11} = (a+b)(a+c)/(a+b+c+d)$$
$$E_{12} = (a+b)(b+d)/(a+b+c+d)$$
$$E_{21} = (a+c)(c+d)/(a+b+c+d)$$
$$E_{22} = (b+d)(c+d)/(a+b+c+d)$$

Under the null hypothesis of no relationship between vaccine and COSTART, the Yates-corrected $X^2$-statistic follows a chi-squared distribution with one degree of freedom. A value of 3.84 would be significant at the 0.05 level, which agrees closely with the screening criterion of Yates-corrected $X^2 \geq 4$.

Empirical Bayes geometric mean (EBGM). DuMouchel15 developed the empirical Bayes approach to analysis of spontaneous reporting systems such as VAERS. The empirical Bayes model assumes that the counts $n_{ij}$ in each cell are random variables from Poisson distributions with unknown means $\mu_{ij}$, where the $\mu_{ij}$ are themselves random variables with a common distribution. Usually this common distribution is taken to be a mixture of two gamma distributions, one of which is centered at the null value corresponding to a coincidental adverse event, and the other of which is more dispersed and centered at a value corresponding to a true causal relationship between the vaccine and the adverse event. There are many alternative models that lead to similar results; this is a simple mixture model with two gamma components, one of which is highly dispersed and the other of which is concentrated near 1. Simple alternative models assume a mixture of different distributions and use the observed counts $n_{ij}$ to estimate the parameters, but one could also consider nonparametric techniques that allow the data to determine the shapes of the mixture components.

This kind of framework, called a hierarchical model, is widely used in Bayesian practice (see Carlin and Louis19 for details). It allows one to exploit a simple Bayesian computational structure for inference while avoiding the need to choose a subjective prior for the unknown distribution of $\mu_{ij}$. Formally, the measure corresponding to vaccine i and COSTART j is given by

$$\log_2 \mathrm{EBGM}_{ij} = E[\log_2(\mu_{ij}/E_{ij}) \mid n_{ij}]$$

where $E[\cdot]$ on the right-hand side of the equation denotes the expectation operator and $E_{ij}$ is the value (a + b)(a + c)/(a + b + c + d) (in the notation of the SPRR section; for the vaccine–COSTART pair of interest, i and j correspond to cell $E_{11}$). This expression calculates the expected value of the base 2 logarithm of the ratio between the estimated reporting ratio and that under the assumption of no causal relationship, given the observed count of the spontaneous reports for that vaccine and that COSTART. Large values suggest that vaccine i might provoke the adverse event described by COSTART j.

The practical effect of this hierarchical model framework is that it ‘shrinks’ the estimates of the reporting ratio parameters in the Poisson distributions towards each other, thereby reducing the effect of sampling variation in the data. The shrinkage is greatest when $E_{ij}$ is small and/or $n_{ij}/E_{ij}$ is small, which typically occurs when a or b is small. Another advantage is that the model preserves the interpretability of the parameters and their estimates. The main drawback of this approach is that it is computationally intensive, taking several minutes to run and requiring investment in well-tested, special-purpose code. The computational burden depends upon the number of rows and columns in the matrix, not the number of reports, so from the standpoint of scaling concerns, this performance is adequate for all foreseeable VAERS applications.

Lower-bound of EBGM’s 90% confidence interval (EB05). The EB05 is the lower-bound of the 90% confidence interval of EBGM. DuMouchel and Pregibon20 recommend that one use the 5th percentile point of the posterior distribution of the ratio as the metric. If the 5th percentile is large, then the association is unlikely to be due to chance alone and warrants further exploration. The rationale for selecting the 5th percentile point is based upon a loose analogy with frequentist inference, in which one wants to indicate associations that are
significant at the 0.05 level. The EB05 signal is conservative and this quality should minimize false positives, but because it represents the lower bound of the confidence interval, it is theoretically less sensitive than EBGM.

A small modification of the EBGM method takes better account of the uncertainty in the posterior distribution of the ratio $\mu_{ij}/E_{ij}$. As part of the EBGM computation, one finds the distribution of this ratio. This distribution can be asymmetric and highly dispersed, in which case use of the expected value could overemphasize the apparent relationship between the vaccine and the COSTART.

RESULTS

Of 69 230 theoretical vaccine–COSTART pairs, 14 800 actually occurred in VAERS at the time of this analysis. Figure 1 illustrates the number of vaccine–COSTART pairs versus the total number of occurrences in VAERS, for these 14 800 pairs. The point at (1, 4857) indicates that 4857 vaccine–COSTART pairs each occurred only once in VAERS. The pairs that occurred most frequently, at the far right of Figure 1, correspond to pairs in which the COSTART is a common and expected event (such as fever) that occurs after many vaccines. Many vaccine–COSTART pairs occurred rarely and some pairs occurred at high frequency, but overall the curve is very smooth.

Figure 1. Frequency of occurrence of vaccine–COSTART pairs. This scatterplot illustrates the number of vaccine–COSTART pairs versus the total number of occurrences of a particular vaccine–COSTART pair, for the 14 800 pairs that occurred at least once in VAERS. For each of 4857 pairs, only one occurrence has been reported to VAERS (far left of graph). The pairs that occurred most frequently (far right of graph) correspond to pairs in which the COSTART is a common and expected event (such as fever) that occurs after many vaccines.

Figure 2. Scatterplot of ln EBGM vs ln PRR. This plot demonstrates a filament (arrow) that consists of vaccine–COSTART pairs for which only one report was received. For these singleton reports, the range of the PRR scores is large compared to that of the EBGM scores, suggesting that PRR gives undue weight to singleton reports relative to EBGM. EBGM: Empirical Bayesian Geometric Mean. PRR: Proportional Reporting Ratio.

Comparison of EBGM and PRR

Figure 2 displays the natural logarithm of the EBGM signal versus the natural logarithm of the PRR signal (175 points for which PRR is infinite are omitted from
the graph). The logarithmic plot demonstrates a strong filamentary structure. The lowest filament consists entirely of cases for which $n_{ij} = 1$, which accounts for most of the largest (rightmost) values of the PRR scores. In fact, of the 175 vaccine–COSTART pairs with infinite values of PRR, 146 have $n_{ij} = 1$, confirming that PRR gives undue weight to singleton reports, and thus is highly susceptible to sampling variation. Only two of these 175 vaccine–COSTART pairs are also in the top 100 EBGM. The known association of rotavirus vaccine and intussusception is not in the top 100 PRR scores, because the value, while very large, is finite.

Comparison of EBGM and SPRR

The screened version of PRR is intended to repair the deficiencies noted in the previous comparison. The SPRR drops cells for which $n_{ij} < 3$ and additionally requires both statistical significance and a raw PRR ≥ 2; there are 1596 vaccine–COSTART pairs for which SPRR is defined. Figure 3 shows a plot of the natural logarithm of the EBGM score against the natural logarithm of the SPRR score, for the cells for which both SPRR and EBGM are defined (nine points for which SPRR is infinite are omitted from the graph). In comparison with Figure 2, the figure is left-truncated at 0.693, the natural logarithm of 2, because SPRR does not generate a score for cells with PRR less than 2. The lower filament, which consisted of cells for which $n_{ij} = 1$, has disappeared. Several of the upper filaments are also gone, since they corresponded to cells in which Yates-corrected $X^2$ was not statistically significant. Note that because of the large number of points, there is considerable overplotting.

Among the top 100 vaccine–COSTART pairs from EBGM and SPRR, 54 appear in both, including nine pairs for which SPRR is infinite. Of those cells flagged by both methods, the Pearson correlation coefficient for the signal ranks is 0.543 (p < 0.0001). Among the top 100 EBGM scores (EBGM ≥ 7.16, ln EBGM ≥ 1.97), there are nine cases in which the SPRR method does not signal because $n_{ij} < 3$. There are 37 cases in which all criteria are met, but the rank is simply greater than 100. Among the top 100 SPRR scores (SPRR ≥ 20.08, ln SPRR ≥ 3.0), there are 46 cases in which the EBGM score is not in the top 100 of scores by that method. The top 100 EBGM scores include the rotavirus–intussusception and rubella–arthritis associations, whereas the top 100 SPRR scores include the rotavirus–intussusception and oral polio vaccine–poliomyelitis associations. The top 100 SPRR scores include three injection site reaction COSTARTs, two of which are also among the top 100 EBGM scores.

Figure 3. Scatterplot of ln EBGM vs ln SPRR. SPRR eliminates the overweighting of singleton reports, as evidenced by the absence of the lower filament seen in Figure 2. Lines indicate the cutoff for the top 100 pairs in each method (EBGM ≥ 7.16, ln EBGM ≥ 1.97; SPRR ≥ 20.08, ln SPRR ≥ 3.0). Fifty-four vaccine–COSTART pairs appear in the top 100 of both methods (upper right quadrant). EBGM: Empirical Bayesian Geometric Mean. SPRR: Screened Proportional Reporting Ratio.

Figure 4. Scatterplot of ln EBGM vs ln EB05. Lines indicate the cutoff for the top 100 pairs in each method (EBGM ≥ 7.16, ln EBGM ≥ 1.97; EB05 ≥ 3.98, ln EB05 ≥ 1.38). There is greater agreement between EBGM and EB05 (67 vaccine–COSTART pairs appear in top 100 of both) than between any of the other methods compared. The natural logs of the two scores generally have a linear relationship, but there is divergence from this linear relationship for some vaccine–COSTART pairs (upper left quadrant), indicating that these methods are not interchangeable. EBGM: Empirical Bayesian Geometric Mean. EB05: Lower-Bound of the 90% Confidence Interval of the Empirical Bayesian Geometric Mean.
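
The PRR definition, the SPRR screening rule, and the top-100 overlap counts used in these comparisons can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used for the analysis: the function names (`prr`, `yates_chi_square`, `sprr`, `top_k_overlap`) and the example counts are hypothetical, returning an infinite PRR when c = 0 is an assumption chosen to match the infinite PRR values described in the text, and ties in the ranking are broken arbitrarily.

```python
import math
from typing import Dict, Optional


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio [a/(a+b)] / [c/(c+d)] for one 2x2 table."""
    if c == 0:
        # Assumption: the COSTART was never reported for any other vaccine,
        # so the ratio is treated as infinite.
        return math.inf
    return (a / (a + b)) / (c / (c + d))


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Yates-corrected chi-square: sum over cells of (|O - E| - 0.5)^2 / E."""
    total = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / total,  # E11
        (a + b) * (b + d) / total,  # E12
        (a + c) * (c + d) / total,  # E21
        (b + d) * (c + d) / total,  # E22
    ]
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


def sprr(a: int, b: int, c: int, d: int) -> Optional[float]:
    """Return the PRR if n_ij >= 3, PRR >= 2 and chi-square >= 4, else None."""
    value = prr(a, b, c, d)
    if a >= 3 and value >= 2 and yates_chi_square(a, b, c, d) >= 4:
        return value
    return None


def top_k_overlap(scores_a: Dict[str, float], scores_b: Dict[str, float], k: int) -> int:
    """Number of pairs that rank in the top k under both scoring methods."""
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b)


if __name__ == "__main__":
    # Hypothetical counts: a well-supported cell passes the screen.
    print(sprr(10, 90, 20, 880))
    # Hypothetical singleton report: PRR is infinite but the screen rejects it.
    print(prr(1, 99, 0, 900), sprr(1, 99, 0, 900))
```

The singleton case shows why the screen matters: a single report with no matching reports elsewhere yields an unbounded PRR, yet it is dropped by the $n_{ij} \geq 3$ criterion before the chi-square test is even evaluated.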

purely to differences in score ranks (top 100 or not) and
not due to restrictions on count, Yates-corrected chi-
squared, or SPRR value. Of those cells flagged by both
methods, the Pearson correlation coefficient for the
signal ranks is 0.416 ( p < 0.0001). The top 100 SPRR
scores (SPRR  20.08, ln SPRR  3.0) include the
rotavirus–intussusception and oral live polio vaccine–
poliomyelitis associations; the top 100 EB05 scores
(EB05  3.98, ln EB05  1.38) include these two, as
well as the rubella–arthritis association. The top 100
SPRR scores include three injection site reaction
COSTARTs, one of which appears among the top
100 EB05 scores.

DISCUSSION
Figure 5. Scatterplot of ln EB05 vs ln SPRR. Lines indicate the
cutoff for the top 100 pairs in each method (EB05  3.98, ln
Data mining methods have been proposed as screen-
EB05  1.38; SPRR  20.08, ln SPRR  3.0). Forty-two vaccine– ing tools for improving the efficiency of adverse event
COSTART pairs appear in the top 100 of both methods (upper right reports. This is the first analysis comparing several
quadrant). EB05: lower-bound of the 90% confidence interval of the empirical Bayesian geometric mean. SPRR: screened proportional reporting ratio

Comparison of EBGM and EB05
As shown in Figure 4, the natural logs of EBGM and EB05 generally have a linear relationship, as is expected since the posterior distributions are reasonably symmetrical. Sixty-seven vaccine–COSTART pairs appear in the top 100 scores of both EBGM and EB05. The top 100 EBGM scores (EBGM ≥ 7.16, ln EBGM ≥ 1.97) include the rotavirus–intussusception and rubella–arthritis associations; the top 100 EB05 scores (EB05 ≥ 3.98, ln EB05 ≥ 1.38) include these two, as well as the oral live polio vaccine–poliomyelitis association. The top 100 EBGM scores include two injection site reaction COSTARTs, one of which appears among the top 100 EB05 scores.

Comparison of EB05 and SPRR
Figure 5 plots the natural log of EB05 against the natural log of SPRR for the cells for which both SPRR and EB05 are defined (nine points for which SPRR is infinite are omitted from the graph). From the plot of the natural logs of the scores, it is clear that many of the top-ranked signals from one method are not the same as the top-ranked signals from the other method. We have examined the top 100 vaccine–COSTART pairs flagged by the SPRR method and the EB05 method. Among these, 42 are in common, including one infinite value of SPRR. The discrepancies are due

proposed methods using the VAERS database. Several data mining methods exist, and our purpose is to compare four approaches that have been piloted within the FDA. The qualitative features of the comparisons are as follows. The PRR signal appears less useful for postmarketing safety surveillance than SPRR, EBGM, and EB05. The large number of PRR signals for singleton reports could result in many false alarms and divert resources from more consequential relationships. Because of these limitations, PRR was removed from further consideration in the analysis.

Even the best method for detecting clinically important signals among spontaneous report data is subject to limitations. First, if nearly all vaccines are associated with the same adverse event, such as injection site reactions, then automatic signal detection systems are unlikely to discover this association from VAERS data. No single vaccine would likely emerge as markedly different from others, with regard to this event, even if the event were extremely common. Some vaccines are commonly administered simultaneously, e.g., Hemophilus influenzae type B vaccine, inactivated polio vaccine, pneumococcal conjugate vaccine, and diphtheria and tetanus toxoids with acellular pertussis vaccine in children. Determining whether a given adverse event results from one of several simultaneously administered vaccines (thereby exonerating the ‘innocent bystanders’), from the simple additive effects of multiple vaccines, or from the synergistic effect of multiple vaccines, is a topic for further research.

We found that the SPRR method was generally competitive with the EBGM method. In comparing EBGM versus SPRR, one should consider the

bias-variance tradeoff.²¹ SPRR estimates have large variance; EBGM estimates are shrunk towards a common mean, which reduces variance at the expense of a small bias. From a public health standpoint, good methods will agree on the strongest signals; close correlation among the other signals is not as helpful. EB05 is designed with statistical principles in mind and takes explicit account of the asymmetry in the distribution of signals. However, these properties may not ensure superior performance.

We have evaluated the ability of the different methods to detect some well-known adverse effects. The causal relationship of the vast majority of vaccine–event pairs is unknown, making estimates of sensitivity and specificity unreliable. This paper brings together the comparative information that is currently available, relying on both theory and some empirical work. The number of vaccine–COSTART pairs that ranked in the top 100 by each of two methods (EBGM, EB05, or SPRR) ranged from 42 to 67. Few known associations were in the top 100 scores of any of the methods that we studied, but the known associations that were signaled overlapped and were more similar than different. Under the limitations described above, our research finds that each method has strengths and limitations, and knowledge of these differences has practical value.

ACKNOWLEDGEMENTS

The study was part of routine activities by the Office of Biostatistics and Epidemiology/Food and Drug Administration and therefore did not require supplemental funding. The authors thank Dr. Susan Ellenberg for helpful critique and Dr. Vitali Pool for assistance with data mining method definitions. We also greatly appreciate the efforts of the VAERS Working Group for their dedication to the maintenance of VAERS. The members of the VAERS Working Group include: Marthe Bryant-Genevier, Soju Chang, Hector Izurieta, Ann W. McMahon, Lise Stevens, Frederick Varricchio, and Robert Wise (Food and Drug Administration); Scott Campbell, Robert Chen, Penina Haber, John Iskander, Alena Khromova, Elaine Miller, Gina T. Mootrey, Vitali Pool, and Sean Shadomy (Centers for Disease Control and Prevention).

REFERENCES

1. Chen RT, Rastogi SC, Mullen JR, et al. The vaccine adverse event reporting system (VAERS). Vaccine 1994; 12: 542–550.
2. Meyboom RH, Egberts AC, Edwards IR, Hekster YA, de Koning FH, Gribnau FW. Principles of signal detection in pharmacovigilance. Drug Saf 1997; 16: 355–365.
3. Martin M, Weld LH, Tsai TF, et al. Advanced age a risk factor for illness temporally associated with yellow fever vaccination. Emerg Infect Dis 2001; 7: 945–951.
4. Rosenthal S, Chen R. The reporting sensitivities of two passive surveillance systems for vaccine adverse events. Am J Public Health 1995; 85: 1706–1709.
5. Szarfman A, Machado SG, O’Neill RT. Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA’s spontaneous reports database. Drug Saf 2002; 25: 381–392.
6. Hauben M. A brief primer on automated signal detection. Ann Pharmacother 2003; 37: 1117–1123.
7. Hauben M, Zhou X. Quantitative methods in pharmacovigilance: focus on signal detection. Drug Saf 2003; 26: 159–186.
8. Niu MT, Erwin DE, Braun MM. Data mining in the US Vaccine Adverse Event Reporting System (VAERS): early detection of intussusception and other events after rotavirus vaccination. Vaccine 2001; 19: 4627–4634.
9. Vaccine Safety Committee, Institute of Medicine. Adverse Events Associated with Childhood Vaccines: Evidence Bearing on Causality, Stratton KR, Howe CJ, Johnston RB (eds). National Academy Press: Washington, DC, 1994. See also http://www.hrsa.gov/osp/vicp/table.htm.
10. Church KW, Hanks P. Word association norms, mutual information, and lexicography. Computational Linguistics 1990; 16: 22–29.
11. Friedman C, Hripcsak G, DuMouchel W, Johnson SB, Clayton PD. Natural language processing in an operational clinical information system. Natural Lang Eng 1995; 2(1): 83–108.
12. DuMouchel W, Friedman C, Hripcsak G, Johnson SB, Clayton PD. Two applications of statistical modeling to natural language processing. In AI and Statistics V, Fisher D, Lenz H (eds). Springer-Verlag: New York, 1996; 413–422.
13. Little RJA, Wu M-M. Models for contingency tables with known margins when target and sampled populations differ. J Am Stat Assoc 1991; 86: 87–95.
14. Bate A, Lindquist M, Orre R, Edwards IR, Meyboom RHB. Data-mining analyses of pharmacovigilance signals in relation to relevant comparison drugs. Eur J Clin Pharmacol 2002; 58: 483–490.
15. DuMouchel W. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. Am Statistician 1999; 53: 177–190.
16. Finney DJ. Systematic signalling of adverse reactions to drugs. Methods Inf Med 1974; 13: 1–10.
17. Evans SJ, Waller PC, Davis S. Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol Drug Saf 2001; 10: 483–486.
18. Yates F. Contingency tables involving small numbers and the chi-square test. Suppl J Roy Stat Soc 1934; 1: 217–235.
19. Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC: Boca Raton, 2000.
20. DuMouchel W, Pregibon D. Empirical Bayes screening for multi-item associations. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001, pp. 67–76.
21. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Springer-Verlag: New York, 2001 (see chapter 7).
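
The agreement counts quoted in the text (for example, 67 pairs shared by the top 100 EBGM and top 100 EB05 scores, and 42 shared by EB05 and SPRR) amount to the size of the intersection of two top-k sets. The following is a minimal sketch of that calculation, not the authors' code: the function names `top_k` and `top_k_overlap`, the tie-breaking rule, and all scores shown are hypothetical and chosen only for illustration. An infinite SPRR value is treated as the highest possible rank, which is one plausible reading of how such cells were counted. The last two lines only check that the reported cutoffs agree with their natural logs.

```python
import math
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (vaccine, adverse event term); labels are hypothetical


def top_k(scores: Dict[Pair, float], k: int) -> Set[Pair]:
    """Return the k pairs with the highest scores.

    Ties are broken alphabetically by pair (an assumption, not from the paper).
    float('inf') sorts first, so infinite scores rank at the top.
    """
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    return set(ranked[:k])


def top_k_overlap(a: Dict[Pair, float], b: Dict[Pair, float], k: int) -> int:
    """Count the pairs that appear in the top k of both score tables."""
    return len(top_k(a, k) & top_k(b, k))


# Made-up scores for four vaccine-event cells; these are NOT VAERS results.
ebgm = {("A", "x"): 9.1, ("A", "y"): 7.5, ("B", "x"): 3.0, ("B", "z"): 1.2}
eb05 = {("A", "x"): 4.2, ("A", "y"): 1.1, ("B", "x"): 2.9, ("B", "z"): 0.8}
sprr = {("A", "x"): math.inf, ("A", "y"): 2.0, ("B", "x"): 5.0, ("B", "z"): 0.5}

ebgm_eb05_shared = top_k_overlap(ebgm, eb05, k=2)
eb05_sprr_shared = top_k_overlap(eb05, sprr, k=2)

# Consistency check of the cutoffs reported in the text.
ln_ebgm_cutoff = round(math.log(7.16), 2)
ln_eb05_cutoff = round(math.log(3.98), 2)
```

With real data, `k` would be 100 and each dictionary would hold one score per vaccine–COSTART cell for which the method is defined.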
