Meta-Analysis in Evidence-based Medicine
Stefan Sauerland1 and Christoph M. Seiler2
(1) Biochemical and Experimental Division, Medical Faculty, University of Cologne,
Ostmerheimer Strasse 200, D-51109 Köln, Germany
(2) Clinical Study Center of the German Surgical Society (SDGC), Department of Surgery,
University of Heidelberg, Im Neuenheimer Feld 110, D-69120, Heidelberg, Germany
Stefan Sauerland
Email: S.Sauerland@uni-koeln.de
Published online: 14 April 2005
Abstract The overwhelming increase in the quantity of clinical evidence has led to a
detachment of evidence from practice, because new evidence can be integrated into clinical
practice only after it has been critically appraised and synthesized with the existing
evidence. Because many clinicians lack the skills and the time for such information
processing, systematic reviews and their quantitative counterparts, meta-analyses, play an
important role in health care. Well-performed systematic reviews provide clinically relevant
information for surgeons, abrogating the need to identify, read, and evaluate many individual
studies. This article reviews the basic principles of meta-analysis, discusses its potential
weaknesses such as heterogeneity and publication bias, and highlights special situations when
dealing with surgical trials.
The practice of evidence-based surgery consists of three essential parts: the preferences,
concerns, and expectations of each patient; the clinical expertise including skills, past
experience, and knowledge of the surgeon; and the best research evidence that is relevant for
clinical practice [1]. The challenge for a surgeon today is the last part of this definition. The
traditional acquisition of knowledge from inter- and intraspecialty consultation during work or
at a congress is limited in many ways (e.g., incomplete or biased knowledge of colleagues,
uncontrolled statements without quantification, conflicts of interest). The most important
source for external evidence is therefore the medical literature. In the medical literature,
randomized controlled trials (RCTs) are considered a reference standard of clinical research
methods owing to a number of unique characteristics that try to avoid possible biases.
Surgeons, however, who try to upgrade their knowledge by reading journal articles face an
enormous task. There are about 200 major surgical journals, and each journal publishes about
250 articles per year. Thus, any surgeon who wants to keep abreast of new knowledge must
browse through 50,000 articles per year, which means 137 articles per day. This figure may
be smaller for a highly specialized surgeon, who can carefully select his or her reading, but it
still requires much more time than a busy surgeon can spare [2].
In addition to the huge volume of literature, its scattered nature poses further problems. Every
time a new article appears, readers must compare this new piece of evidence with the existing
external evidence to come to an overall clinical conclusion. This process of collecting and
summarizing scientific information over many years requires extraordinary memory
capacities that most human brains simply lack.
A third problem of the surgical literature lies in the small proportion of high-quality evidence
in most journals [3]. The number of surgical randomized controlled trials is still small, and
case reports and series seem to be the predominant publication type. Reading surgical articles
can also be trying owing to differences in study quality, such as insufficient sample size,
unclear methodology, and nonclinical outcome parameters [4].
It is therefore not surprising that many surgeons subscribe to only a few selected journals. In
consequence, clinicians are often unaware of important surgical innovations. On the other
hand, researchers are sometimes disappointed by the small overall impact of the publication
of a major trial. Here, systematic reviews are an important tool for finding important and valid
studies while filtering out the large number of seriously flawed and irrelevant articles. By
condensing the results of many trials, systematic reviews allow readers to obtain a valid
overview on a topic with substantially less reading time involved.
Figure 1 Seven steps when performing a meta-analysis. (Modified from Neugebauer et al.
[10], with permission.)
Searching the literature for potentially relevant articles [11] is the most time-consuming part
of a systematic review. Nevertheless, systematic reviews based on careless and superficial
literature searches still appear. It has been demonstrated
that restricting a literature search to English-language or MEDLINE-indexed articles
carries the risk of an inflated treatment estimate [12, 13]. Any valid
search strategy should include non-MEDLINE databases such as Embase and Cochrane.
Language restrictions should be avoided.
Two reviewers should independently screen and select potentially relevant abstracts. All retrieved
studies are then assessed, again independently, against the inclusion and exclusion criteria. The exclusion of
studies should be documented so readers can follow the selection process [14].
Once a set of studies has been selected, these studies must be critically appraised in detail
[15]. The use of checklists may be helpful for this assessment. However, the summary score
determined by the checklists should not be used as a measure of overall study quality [16].
Important clinical and methodologic study characteristics should then be tabulated. At this
stage, it might be necessary to contact the authors of primary studies to obtain important
information missing in the publication. Contact with the authors is also essential when a study
has been published as an abstract only.
The fourth step in a systematic review is the extraction of data from primary studies. In
systematic reviews, the data summarized in the original publications are analyzed, but again it can
be necessary to contact the authors if an article does not contain the relevant data in the right
format. For a meta-analysis, original data should be given as event rates (for binary outcomes) or as
mean values with standard deviations (or any other measure of spread) for both treatment and
control groups. If results are reported only as being "significant," it is nearly impossible to
use this information. Another problem arises when data are skewed, so that a normal
distribution cannot be assumed. Hospital stay, for example, is often reported as a median with
interquartile range. Given the distribution of such data, this way of reporting is correct and even
desirable, but standard meta-analytic methods cannot work with nonnormally distributed data. Similar but
less severe problems occur when survival data must be analyzed.
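When only a median and interquartile range are reported, a rough normal-theory conversion is sometimes applied in practice. The sketch below is my own illustration under an assumption of approximate symmetry, not a method proposed in this article:

```python
# Rough conversion of a reported median/IQR to mean/SD for meta-analysis.
# Assumption (mine, not the article's): the data are roughly symmetric,
# so median ~ mean, and for a normal distribution IQR = 1.349 * sigma.

def approx_mean_sd(median, q1, q3):
    """Approximate mean and SD from a reported median and quartiles."""
    mean = median                  # symmetric-distribution assumption
    sd = (q3 - q1) / 1.349         # IQR of a normal is 1.349 * sigma
    return mean, sd

# Example: hospital stay reported as median 5 days, IQR 3-8 days
mean, sd = approx_mean_sd(5.0, 3.0, 8.0)
print(round(mean, 2), round(sd, 2))  # prints: 5.0 3.71
```

For strongly skewed outcomes such as hospital stay this conversion can be badly off, which is exactly why the text treats such data as problematic.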
Statistical Principles
Step five of a systematic review consists of the statistical meta-analysis. Although it is
frequently assumed that pooling data from various trials is equivalent to a simple addition of
trial results, this assumption is not correct. In fact, each study must be treated as a single
entity. Because of differences in baseline risk, concomitant therapy, and outcome definitions,
the raw data of studies should not be combined directly. Consequently, treatment estimates, such as
relative risks, must be calculated separately for each trial. Multiplying each trial's estimate by a
weighting factor that depends on its sample size ensures comparability of qualitatively and
quantitatively diverse studies. This pooling method is valid because it does not use data from
different studies as if they came from one large single study. Nowadays, freely available
statistical software packages allow a quick meta-analytic calculation of different effect
measures with 95% confidence intervals (CIs) (Fig. 2). The choice of the effect measure (e.g.,
relative risk, odds ratio, or risk difference) depends on the effect sizes and their relation to
baseline risk [17].
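The difference among these effect measures can be made concrete on a single 2x2 table; the numbers below are invented purely for illustration:

```python
# One invented 2x2 table: 12/50 events under treatment vs 20/50 under control.
a, n1 = 12, 50   # events, total in treatment group
c, n2 = 20, 50   # events, total in control group

risk_t, risk_c = a / n1, c / n2
rr = risk_t / risk_c                 # relative risk
rd = risk_t - risk_c                 # risk difference
odds_t = a / (n1 - a)
odds_c = c / (n2 - c)
odds_ratio = odds_t / odds_c         # odds ratio

print(f"RR = {rr:.2f}, RD = {rd:.2f}, OR = {odds_ratio:.2f}")
# prints: RR = 0.60, RD = -0.16, OR = 0.47
```

Note that the odds ratio (0.47) is more extreme than the relative risk (0.60) at this fairly high baseline risk, which is one reason the choice of measure matters [17].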
Figure 2 Example of a meta-analysis without major apparent problems. All studies vary
around the summary estimate of 0.81. Although none of the 10 primary studies is significant,
the pooled estimate becomes significant because of the increased power (p = 0.04). Larger
studies can be recognized by having smaller confidence intervals and larger boxes.
(Hypothetical data for this forest plot were simulated assuming an effect size of 0.80 and an
average study size of 50 patients.)
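The weighting scheme described above can be sketched as fixed-effect, inverse-variance pooling of log relative risks. The trial data below are invented for illustration, and the function name is my own:

```python
import math

# Fixed-effect, inverse-variance pooling: each trial contributes its log
# relative risk, weighted by the inverse of its variance. Invented data:
# (events_treat, n_treat, events_ctrl, n_ctrl) per trial.
trials = [(8, 50, 12, 50), (5, 40, 9, 40), (15, 100, 20, 100)]

def log_rr_and_var(a, n1, c, n2):
    """Log relative risk and its large-sample variance for one 2x2 table."""
    rr = (a / n1) / (c / n2)
    var = 1/a - 1/n1 + 1/c - 1/n2   # standard approximation
    return math.log(rr), var

weights, weighted_sum = [], 0.0
for a, n1, c, n2 in trials:
    log_rr, var = log_rr_and_var(a, n1, c, n2)
    w = 1.0 / var                   # inverse-variance weight
    weights.append(w)
    weighted_sum += w * log_rr

pooled_log_rr = weighted_sum / sum(weights)
se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled estimate
lo = math.exp(pooled_log_rr - 1.96 * se)
hi = math.exp(pooled_log_rr + 1.96 * se)
print(f"pooled RR = {math.exp(pooled_log_rr):.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

Because each trial is weighted by its precision rather than naively added, the pooled estimate respects the trials' separate identities, as the text demands.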
The main argument against meta-analysis claims that highly unequal studies are forced into a
common treatment estimate. This "mixture of apples and oranges" may represent an
unjustified simplification when in truth a broad diversity of results exists. Therefore, it is
generally agreed that any meta-analysis should include a formal examination of
heterogeneity. It can (and should) be tested statistically whether the results of the studies in a
meta-analysis differ from each other more than would be expected from
chance alone. For this purpose, a new quantity, I², was recently proposed [18]. It allows
measurement of heterogeneity as a percentage, with values over 75% indicating high
heterogeneity. Alternatively, heterogeneity can also be examined graphically, as shown in
Figure 3.
Figure 3 Left. Example of a meta-analysis with evidence of strong heterogeneity. Note that
the confidence intervals of many primary studies are not overlapping each other. The pooled
estimate gives a misleading impression of the primary trials. Right. Heterogeneity can be
resolved by subdividing the studies into two groups according to clinical or methodologic
characteristics. Two different effects of 0.52 and 1.18 appear in the two subgroups.
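The I² measure is derived from Cochran's Q statistic, the weighted sum of squared deviations of each trial's effect from the pooled effect. The sketch below uses invented log relative risks and variances, chosen so that heterogeneity is clearly present:

```python
# I^2 inconsistency measure [18]: I^2 = max(0, (Q - df) / Q), expressing the
# share of between-trial variability beyond chance. Invented inputs.
log_rrs = [-0.7, -0.6, 0.3, 0.4, -0.1]
variances = [0.02, 0.03, 0.02, 0.04, 0.03]

weights = [1.0 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, log_rrs)) / sum(weights)

q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, log_rrs))
df = len(log_rrs) - 1
i_squared = max(0.0, (q - df) / q) * 100  # as a percentage

print(f"Q = {q:.2f}, I^2 = {i_squared:.0f}%")
```

With these invented inputs I² comes out around 90%, well above the 75% threshold for high heterogeneity cited in the text, so pooling such trials into a single estimate would be misleading.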
Publication Bias
Studies reporting significant treatment effects are more likely to be published [23], published
in English [13], and published more quickly [24] than nonsignificant studies. This problem,
known as publication bias, affects the validity of the medical literature as a whole because the
obtained results may be misleading. Although exhaustive literature searches can partly
compensate for the problem, unpublished studies cannot be traced even through the best
literature search. Therefore, any withholding of study results should be banned as scientific
and ethical misconduct. In addition, prospective registration of planned or ongoing trials may
reduce the proportion of unpublished data.
Until these efforts are realized, other measures are necessary to examine how seriously the
results of a meta-analysis are influenced by publication bias. In this regard, the funnel plot is a
useful graphic tool, which Egger et al. have complemented with a statistical test [25]. The
principle of the funnel plot is quite simple (Fig. 4). It rests on the assumption that larger
studies are likely to be published regardless of their results, whereas smaller trials may be published only if
they report a significant result. Thus, any difference between the results of large versus small
studies should cast suspicion on the overall validity of the published evidence because this
finding indicates that some small studies have not been published owing to their
nonsignificance. Today, the funnel plot has become a standard procedure in meta-analysis,
although sometimes constructing a funnel plot is impossible because of the low number of
available studies. Also noteworthy is the fact that publication bias does not necessarily lead to
heterogeneity.
Figure 4 Left. Example of a meta-analysis with evidence of publication bias. Note that only
two small studies (E and H) reach borderline significance; the larger studies are closer to the
line of equivalence (RR = 1). Right. Publication bias can be detected by arranging studies
according to sample size (or statistical weight), as shown in the funnel plot. Apparently, some
smaller, nonsignificant studies went unpublished. The gray fields indicate the 95% confidence
limits for study results.
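The statistical test of Egger et al. [25] regresses each trial's standardized effect on its precision; an intercept far from zero suggests funnel-plot asymmetry. This is a minimal sketch with invented data in which small trials show exaggerated effects:

```python
# Egger regression test for funnel-plot asymmetry [25], via plain
# least squares. Effects and standard errors below are invented:
# the smallest trials (largest SE) report the largest effects.
effects = [-0.80, -0.65, -0.30, -0.15, -0.05]   # log relative risks
ses = [0.45, 0.40, 0.25, 0.15, 0.10]            # small trials = large SE

y = [e / s for e, s in zip(effects, ses)]   # standardized effects
x = [1.0 / s for s in ses]                  # precisions

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum(xi * yi for xi, yi in zip(x, y)) - n * mean_x * mean_y) / \
        (sum(xi * xi for xi in x) - n * mean_x * mean_x)
intercept = mean_y - slope * mean_x         # Egger's bias indicator

print(f"Egger intercept = {intercept:.2f}")
```

Here the intercept comes out near -2, clearly away from zero, which in the full test would be checked for statistical significance before concluding that bias is present.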
Once the funnel plot shows clear asymmetry, the interpretation of the meta-analysis becomes
complicated, as there is no way to locate the apparently missing, unpublished
"phantom" studies. In consequence, for evidence-based decision-making, one rigorously
conducted study of 1000 patients is a better information source than 10 studies of 100
patients each [26].
In the seventh and final step of a systematic review, the results are summarized and published.
Publication of systematic reviews should comply with the QUOROM standards [14], which
recommend a clear description of all critical steps. The conclusions of a systematic review
should offer clinical as well as scientific recommendations. Especially in cases where not a
single study was found addressing an important question, systematic reviews are important for
guiding future research funding. For further details on systematic review methodology,
readers are referred to recent comprehensive textbooks [27, 28].
Surgical Meta-analyses
Most of the differences between meta-analyses in surgery and those in other fields originate
from the differences between, for example, a surgical procedure and a pharmaceutical drug.
While a tablet acts more or less uniformly, the success of an operation depends on the
expertise of the surgeon. Furthermore, most surgeons do not perform their operations in a
fully standardized and reproducible manner. Slight modifications occur over time that may
lead to variable treatment effects. Although a few surgical RCTs have prescribed the
commitment of all trial surgeons to a standard technique, this is still not the case in most
surgical studies.
Although there is a large body of evidence showing superior treatment results from
experienced versus inexperienced surgeons, this association has not yet been shown in
surgical meta-analyses. A comparison of operating times and wound infection rates after
laparoscopic and open appendectomy failed to find a difference between experienced and
inexperienced trial surgeons [34, 35]. In primary studies, however, reporting on surgical
expertise is often vague or completely missing. These and other problems may lead to more
heterogeneity among surgical trials than among medical trials, thus threatening the validity of a
meta-analysis based on these data. Those who use systematic reviews for clinical decision-making
and evidence-based medicine must take into account the possible shortcomings of surgical
meta-analyses.
Cochrane Collaboration
Given the complex and difficult methodology of systematic reviews, it is understandable that
some reviews are still published despite major flaws. Meta-analysts therefore have decided to
collaborate on an international level to raise the quality standards of systematic reviews.
Based on preliminary experience in the United Kingdom, the Cochrane Collaboration was
founded a decade ago. Its aim is to prepare, maintain, and disseminate systematic reviews of
the effects of health care [36]. The organizational structure of the Collaboration is based on
the scientific enthusiasm of its members, and financial issues are kept in the background. The
revenues from selling the Cochrane Library are used mostly for burning and distributing CDs
and for software development.
Cochrane reviews, now numbering over 1700, have been found to be of higher quality than
paper-based reviews [37]. This finding can be explained by the close collaboration of
researchers and clinicians. Furthermore, manual searching of non-MEDLINE journals and
abstract volumes for RCT reports is a cornerstone of the Cochrane Collaboration. Probably
the biggest advantage of Cochrane reviews is their periodic updating. Each time a major new
study appears (or at least every second year), a review must be updated or it is excluded.
For surgeons, the Cochrane Library also contains valuable information, although clearly more
reviews have been published on nonsurgical topics [38]. The interested reader should note
that many surgical topics are as yet unreviewed, although conducting a review would earn the
reviewer a free copy of the Cochrane Library.
Conclusions
The art of writing an overview has developed from the classic, unsystematic narrative review
into the systematic, often quantitative review. For surgeons such articles provide the opportunity to
obtain evidence-based summaries more quickly than from primary studies. Because
heterogeneity, publication bias, and learning curve effects can seriously compromise
systematic reviews, surgeons should have a basic understanding of their methodology. Not every
meta-analysis represents level I evidence.
References
1. Sackett, DL, Rosenberg, WM. (1995) "The need for evidence-based medicine" J. R. Soc.
Med. 88: 620-624
5. Barnes, DE, Bero, LA. (1998) "Why review articles on the health effects of passive
smoking reach different conclusions" J.A.M.A. 279: 1566-1570
6. Antman, EM, Lau, J, Kupelnick, B et al. (1992) "A comparison of results of meta-
analyses of randomized control trials and recommendations of clinical experts: treatments
for myocardial infarction" J.A.M.A. 268: 240-248
7. L'Abbé, KA, Detsky, AS, O'Rourke, K. (1987) "Meta-analysis in clinical research" Ann.
Intern. Med. 107: 224-233
11. Allen, IE, Olkin, I. (1999) "Estimating time to conduct a meta-analysis from number of
citations retrieved" J.A.M.A. 282: 634-635
12. McAuley, L, Pham, B, Tugwell, P et al. (2000) "Does the inclusion of grey literature
influence estimates of intervention effectiveness reported in meta-analyses?" Lancet 356:
1228-1231
14. Moher, D, Cook, DJ, Eastwood, S et al. (1999) "Improving the quality of reports of
meta-analyses of randomised controlled trials: the QUOROM statement" Lancet 354:
1896-1900
15. Jüni, P, Altman, DG, Egger, M. (2001) "Assessing the quality of controlled clinical trials"
B.M.J. 323: 42-46
16. Jüni, P, Witschi, A, Bloch, R et al. (1999) "The hazards of scoring the quality of clinical
trials for meta-analysis" J.A.M.A. 282: 1054-1060
17. Deeks, JJ. (2002) "Issues in the selection of a summary statistic for meta-analysis of
clinical trials with binary outcomes" Stat. Med. 21: 1575-1600
18. Higgins, JPT, Thompson, SG, Deeks, JJ et al. (2003) "Measuring inconsistency in meta-
analyses" B.M.J. 327: 557-560
20. Glasziou, PP, Sanders, SL. (2002) "Investigating causes of heterogeneity in systematic
reviews" Stat. Med. 21: 1503-1511
21. Moher, D, Pham, B, Jones, A et al. (1998) "Does quality of reports of randomised trials
affect estimates of intervention efficacy reported in meta-analyses?" Lancet 352: 609-613
22. Lau, J, Ioannidis, JP, Schmid, CH. (1998) "Summing up evidence: one answer is not
always enough" Lancet 351: 123-127
23. Dickersin, K. (1990) "The existence of publication bias and risk factors for its occurrence"
J.A.M.A. 263: 1385-1389
24. Krzyzanowska, MK, Pintilie, M, Tannock, IF. (2003) "Factors associated with failure to
publish large randomized trials presented at an oncology meeting" J.A.M.A. 290: 495-501
25. Egger, M, Davey Smith, G, Schneider, M et al. (1997) "Bias in meta-analysis detected by
a simple, graphical test" B.M.J. 315: 629-634
27. Egger, M, Davey Smith, G, Altman, DG (2001) Systematic reviews in health care, BMJ
Publishing, London
28. Khan, K, Kunz, R, Kleijnen, J, et al. ( 2003) Systematic reviews to support evidence-
based medicine, Royal Society of Medicine Press, London
29. Stewart, LA, Parmar, MKB. (1993) "Meta-analysis of the literature or of individual
patient data: is there a difference?" Lancet 341: 418-422
30. McCormack, K, Scott, NW, Go, PM et al. (2003) "Laparoscopic techniques versus open
techniques for inguinal hernia repair" Cochrane Database Syst. Rev. CD001785
31. Song, F, Altman, DG, Glenny, AM et al. (2003) "Validity of indirect comparisons for
estimating efficacy of competing interventions: empirical evidence from published meta-
analyses" B.M.J. 326: 472-475
32. Lijmer, JG, Mol, BW, Heisterkamp, S et al. (1999) "Empirical evidence of design-related
bias in studies of diagnostic tests" J.A.M.A. 282: 1061-1066
34. Sauerland, S, Lefering, R, Neugebauer, EAM. (2002) "Laparoscopic versus open surgery
for suspected appendicitis" Cochrane Database Syst. Rev. CD001546
37. Jadad, AR, Cook, DJ, Jones, A et al. (1998) "Methodology and reports of systematic
reviews and meta-analyses: a comparison of Cochrane reviews with articles published in
paper-based journals" J.A.M.A. 280: 278-280
38. Emond, SD, Wyer, PC, Brown, MD et al. (2002) "How relevant are the systematic
reviews in the Cochrane library to emergency medical practice?" Ann Emerg. Med. 39:
153-158