Amstar Shea 2007
Address: 1 EMGO Institute, VU University Medical Center, Amsterdam, the Netherlands, 2 Institute of Population Health, Ottawa, Ontario, Canada,
3 Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, Ontario, Canada, 4 Department of Clinical Epidemiology
and Biostatistics, VU University Medical Center, Amsterdam, the Netherlands, 5 Community Information and Epidemiological Technologies
(CIETcanada), Ottawa, Ontario, Canada and 6 University of Ottawa, Ottawa, Ontario, Canada
Email: Beverley J Shea* - bshea@ciet.org; Jeremy M Grimshaw - jgrimshaw@ohri.ca; George A Wells - gawells@ottawaheart.ca;
Maarten Boers - mboers@vumc.nl; Neil Andersson - neil@ciet.org; Candyce Hamel - chamel@uottawa.ca;
Ashley C Porter - ashleyclaireporter@hotmail.com; Peter Tugwell - ptugwell@uottawa.ca; David Moher - dmoher@uottawa.ca;
Lex M Bouter - lm.bouter@dienst.vu.nl
* Corresponding author †Equal contributors
Abstract
Background: Our objective was to develop an instrument to assess the methodological quality of
systematic reviews, building upon previous tools, empirical evidence and expert consensus.
Methods: A 37-item assessment tool was formed by combining 1) the enhanced Overview Quality
Assessment Questionnaire (OQAQ), 2) a checklist created by Sacks, and 3) three additional items
recently judged to be of methodological importance. This tool was applied to 99 paper-based and
52 electronic systematic reviews. Exploratory factor analysis was used to identify underlying
components. The results were considered by methodological experts using a nominal group
technique aimed at item reduction and design of an assessment tool with face and content validity.
Results: The factor analysis identified 11 components. From each component, one item was
selected by the nominal group. The resulting instrument was judged to have face and content
validity.
Conclusion: A measurement tool for the 'assessment of multiple systematic reviews' (AMSTAR)
was developed. The tool consists of 11 items and has good face and content validity for measuring
the methodological quality of systematic reviews. Additional studies are needed with a focus on the
reproducibility and construct validity of AMSTAR, before strong recommendations can be made
on its use.
BMC Medical Research Methodology 2007, 7:10 http://www.biomedcentral.com/1471-2288/7/10
conducted systematic review addresses a carefully formulated question by analyzing all available evidence. It employs an objective search of the literature, applying predetermined inclusion and exclusion criteria to the literature, critically appraising what is found to be relevant. It then extracts and synthesizes data from the available evidence base to formulate findings [3].

However, in spite of the care with which they are conducted, systematic reviews may differ in quality, and yield different answers to the same question [4]. As a result, users of systematic reviews should be critical and look carefully at the methodological quality of the available reviews [5].

A decade has passed since the initial development of tools to assess the quality of systematic reviews, such as those created by Oxman and Guyatt [6] and Sacks [7]. There are now more than 24 instruments available to assess the quality of systematic reviews [8]. Nevertheless, the majority of the available instruments are not widely used. Several are lengthy and include complicated instructions for their use. Furthermore, since their development, considerable empirical research has accumulated about potential sources of bias in systematic reviews. For example, recent methodological research has highlighted the potential importance of publication language and publication bias in systematic reviews [9-11].

Therefore, our goal was to develop a new instrument for assessing the methodological quality of systematic reviews by building upon empirical data collected with previously developed tools and utilizing expert opinion.

This goal was pursued by two study objectives. Our first objective was to assess a large sample of systematic reviews using an item pool drawn from two available instruments used to assess methodological quality, supplemented by additional items judged to be needed on the basis of recent publications. We used exploratory factor analysis to identify the underlying component structure. Our second objective was to build on the results of this factor analysis, by using experts in a nominal group technique (NGT) to reduce the item pool and to decide on a new assessment tool with face and content validity.

Methods
We designed a 37-item assessment tool that we developed by combining items from two available instruments: the enhanced Overview Quality Assessment Questionnaire (OQAQ) [8] containing 10 items and a checklist created by Sacks [7] containing 24 items. We supplemented this with three additional items based upon methodological advances in the field since the development of the original two instruments: 1) Language restriction: Language restriction in systematic reviews remains controversial. Some studies have suggested that systematic reviews that include only English language publications tend to overestimate effect sizes [10], whereas other studies suggest that such language restrictions may not do so [11]. An item was added to determine whether a language restriction was applied in selecting studies for the systematic review. 2) Publication bias: Publication bias refers to the tendency for research with negative findings to get published less frequently, less prominently, or more slowly, and the tendency for research with positive findings to get published more than once. Publication bias has been identified as a major threat to the validity of systematic reviews. Empirical research suggests that publication bias is widespread, and that a variety of methods are now available to assess publication bias [12-19]. An item was added to determine whether the authors assessed the likelihood of publication bias. 3) Publication status: Research on the publication status of studies suggests that published trials are generally larger and may show an overall greater treatment effect than studies published in the 'grey' literature [20]. The importance of including grey literature in all systematic reviews has been discussed [21]. The assessment of the inclusion of grey literature considers whether or not the authors reported searching for grey literature.

Objective 1
The 37-item assessment tool was used to appraise 99 paper-based reviews identified from a database of reviews and meta-analyses [22] and 52 Cochrane systematic reviews from the Cochrane Database of Systematic Reviews [9]. After the list of selected systematic reviews was generated, full copies of these were retrieved, copied, and masked to conceal author, institution, and journal. Reviews in languages other than English (i.e., French, German, and Portuguese) were translated into English with the assistance of colleagues before masking [23].

For each included systematic review, two reviewers independently assessed the methodological quality with the 37 items (CH, BS).

Statistical analyses and graphs displaying the results obtained were produced using SPSS version 13.0 for Windows. The 37 items were subjected to principal components analysis, and Varimax rotations were used to rotate the components. Items with low factor loadings of < 0.50 were removed.

Objective 2
We convened an international panel of eleven experts in the fields of methodological quality assessment and systematic reviews. The group was selected from three organizations involved both in the conduct of systematic reviews and in the assessment of methodological quality.
The group was made up of clinicians, methodologists and epidemiologists, and reviewers who were new to the field. Some individuals were previously involved in the Cochrane Collaboration, while a number were not. By examining the results of the factor analysis, they reflected critically on the components identified and decided on the items to be included in the new instrument. The nominal group process took place in San Francisco during a one day session.

We conducted the following NGT in order to achieve agreement. After delivery of an overview of the project and the planned process for the day, the panel reviewed the results of the factor analysis. The aim of the NGT was to structure interaction within the group. Firstly, each participant was asked to record his or her ideas independently and privately. The ideas were then listed in a round-robin format. One idea was collected from each individual in turn and listed in front of the group by the facilitator, and the process was continued until all ideas had been listed. Individuals then privately recorded their judgements. Subsequent discussions took place. The individual judgements were aggregated statistically to derive the group judgements. The nominal group was also asked to agree on a final label for each of the 11 components. A description was formulated for each of the items and a next-to-final instrument was assembled. This was circulated electronically to the group for a final round of fine tuning.

Results
Objective 1
The items were subjected to factor analysis, and only those items that loaded highly on one component (> 0.50) were retained. The described factor analysis made it possible to reduce the 37-item instrument to a shorter (29-item) instrument that measured 11 components (Table 1).

Objective 2
The nominal group discussed all 11 components (Table 1). The items most appropriate for the components (Table 2), were included in the draft instrument [also see Additional file 1]. The instrument is an 11-item questionnaire that asks reviewers to answer yes, no, can't answer or not applicable. A separate question on language was identified in the factor analysis as a significant issue, but the nominal group felt that the contradictory evidence in the literature warranted removing this item from the shortened item list and capturing it under the question on publication status.

the methodological quality of systematic reviews, by building upon empirical data on previously developed tools, empirical evidence and utilizing expert opinion.

Because we had already created a dataset of 151 systematic reviews assessed using 37 completed items for each review, we were able to conduct a factor analysis as the first step in the creation of the new tool. A more commonly used approach would have been to harvest appropriate items from existing questionnaires. This method has been used extensively in the development of instruments for assessing the quality of both randomized and non-randomized studies of health care interventions [24-26]. The disadvantage of harvesting appropriate items from existing questionnaires is that it relies heavily on the validation of the source questionnaires [27]. Conducting a factor analysis made it possible to determine whether the measured dimensions could in principle be assessed using a smaller number of items.

Traditionally, factor analysis is divided into two types of analyses: exploratory and confirmatory. As its name indicates, exploratory factor analysis aims to discover the main constructs or dimensions of a concept by conducting a preliminary investigation of the correlations between all the identified variables. This process is also known as Principal Components Analysis (PCA). PCA has been recommended for use in test construction by Kline, as a means of condensing the correlation matrix, rather than as an aid to the interpretation of the factor-structure of a questionnaire [28]. Items with low factor loadings tend to be weakly correlated with other items, and therefore were removed. Various rotational strategies have also been proposed. The goal of all of them is to obtain a clear pattern of loadings, that is, factors that are somehow clearly marked by high loadings for some variables and low loadings for others [29,30]. We used this approach because it is useful when a body of theory or principles has been established, but has not yet been operationalised into an evaluative framework [31].

The structured-discussion format employed in this project enabled all participants to contribute to the refining of the assessment tool. The nominal technique followed involved experts, discussion, and a consensus that was qualitative in nature. Consequently, it complemented the quantitative nature of factor analysis, and as a result the final tool had face and content validity as judged by the nominal consensus panel.
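
The item-reduction step reported above (principal components analysis, Varimax rotation, removal of items loading below 0.50) can be illustrated with a minimal sketch. It is not the SPSS procedure used in the study: the data are simulated, the number of components is passed in by hand, and the function names (pca_loadings, varimax, retained_items) are hypothetical names chosen for this illustration.

```python
import numpy as np


def pca_loadings(data, n_components):
    """Unrotated principal-component loadings from the item correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    return eigvecs[:, top] * np.sqrt(eigvals[top])


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of an (items x components) loading matrix."""
    n_items, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD step of the usual iterative Varimax algorithm.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


def retained_items(rotated, threshold=0.50):
    """Indices of items whose largest absolute loading reaches the threshold."""
    return np.flatnonzero(np.abs(rotated).max(axis=1) >= threshold)


# Simulated data (an assumption, not the study data): 151 reviews scored on
# 10 items, 9 of them driven by 3 latent dimensions and 1 pure-noise item.
rng = np.random.default_rng(0)
latent = rng.normal(size=(151, 3))
pattern = np.kron(np.eye(3), np.ones((1, 3)))        # 3 x 9 block structure
structured = latent @ pattern + 0.5 * rng.normal(size=(151, 9))
noise_item = rng.normal(size=(151, 1))
data = np.hstack([structured, noise_item])

raw = pca_loadings(data, n_components=3)
rotated = varimax(raw)
kept = retained_items(rotated)
print("items retained:", kept.tolist())
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is unchanged; only the distribution of the loadings across components changes, which is what makes the 0.50 cut-off easier to apply after rotation.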
Table 2: AMSTAR is a measurement tool created to assess the methodological quality of systematic reviews.

4. Was the status of publication (i.e. grey literature) used as an inclusion criterion?
The authors should state that they searched for reports regardless of their publication type. The authors should state whether or not they excluded any reports (from the systematic review), based on their publication status, language etc.
□ Yes  □ No  □ Can't answer  □ Not applicable

7. Was the scientific quality of the included studies assessed and documented?
'A priori' methods of assessment should be provided (e.g., for effectiveness studies if the author(s) chose to include only randomized, double-blind, placebo controlled studies, or allocation concealment as inclusion criteria); for other types of studies alternative items will be relevant.
□ Yes  □ No  □ Can't answer  □ Not applicable

8. Was the scientific quality of the included studies used appropriately in formulating conclusions?
The results of the methodological rigor and scientific quality should be considered in the analysis and the conclusions of the review, and explicitly stated in formulating recommendations.
□ Yes  □ No  □ Can't answer  □ Not applicable

9. Were the methods used to combine the findings of studies appropriate?
For the pooled results, a test should be done to ensure the studies were combinable, to assess their homogeneity (i.e. Chi-squared test for homogeneity, I2). If heterogeneity exists a random effects model should be used and/or the clinical appropriateness of combining should be taken into consideration (i.e. is it sensible to combine?).
□ Yes  □ No  □ Can't answer  □ Not applicable
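
Item 9 refers to the Chi-squared test for homogeneity and to I2. As a hedged illustration of what a review author would compute, the sketch below derives Cochran's Q from inverse-variance weights and I2 as max(0, (Q - df) / Q) x 100%. The effect estimates and variances are invented for the example, and the function name heterogeneity is hypothetical; the p-value for Q (a chi-squared distribution on k - 1 degrees of freedom) is left out to keep the sketch free of external dependencies.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    # Inverse-variance weights, as in a fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of variation beyond chance, floored at zero.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared


# Invented example values (not taken from any review): three study effects
# with equal variances.
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]
q, df, i_squared = heterogeneity(effects, variances)
print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```

For these invented inputs the pooled estimate is 0.3, Q is 8 on 2 degrees of freedom, and I2 is 75%, a level at which the item's advice about a random effects model or about the clinical sense of pooling would come into play.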
Publication bias remains an area of contention amongst those who assess the quality of systematic reviews. It remains a research priority because it is unclear what the impact of publication bias is on making decisions in health care. We are aware of the 20 years of work that has gone into this area of research. This has given us some clear answers as to the effect publication bias may have on the overall results of estimating the impact of interventions.

AMSTAR will remain a living document and advances in empirical methodological research will be reflected in further improvements to the instrument.

Conclusion
A measurement tool for assessment of multiple systematic reviews (AMSTAR) was developed. The tool consists of 11 items and has good face and content validity for measuring the methodological quality of systematic reviews. Additional studies are needed with a focus on the reproducibility and construct validity of AMSTAR, before strong recommendations can be made on its use.

Competing interests
The author(s) declare that they have no competing interests.

Authors' contributions
BS designed and conducted the study and wrote the manuscript. MB, JG and LB participated in the design and coordination and assisted with writing the manuscript. CH and AP carried out the quality assessments and assisted with the writing. NA assisted in the conduct of the study and commented on earlier drafts. GW assisted in the design of the study and commented on the statistical analysis. PT helped with the design of the study and assisted with the nominal group process. All authors read and approved the final manuscript.

The authors also thank Drs. Andy Oxman, Gordon Guyatt and Henry Sacks for their methodological instruments and for providing additional information.

References
1. Davidoff F, Haynes B, Sackett D, Smith R: Evidence-based medicine: a new journal to help doctors identify the information they need. BMJ 1995, 310:1085-6.
2. Lau J, Ioannidis JPA, Schmid CH: Summing up evidence: one answer is not always enough. Lancet 1998, 351:123-127.
3. Systematic Review definition [http://www.nlm.nih.gov/nichsr/hta101/ta101014.html]
4. Moher D, Soeken K, Sampson M, Campbell K, Ben Perot L, Berman B: Assessing the quality of reports of systematic reviews in pediatric complementary and alternative medicine. BMC Pediatr 2002, 2(2):.
5. Jadad A, Moher M, Browman G, Booker L, Sigouin C, Fuentes M: Systematic reviews and meta-analyses on treatment of asthma: critical evaluation. BMJ 2000, 320:537-540.
6. Oxman AD: Checklists for review articles. BMJ 1994, 309:648-651.
7. Sacks H, Berrier J, Reitman D, Ancona-Berk VA, Chalmers TC: Meta-analyses of randomized controlled trials. N Engl J Med 1987, 316(8):450-455.
8. Shea B, Dubé C, Moher D: Assessing the quality of reports of systematic reviews: the QUOROM statement compared to other tools. In Systematic Reviews in Health Care: Meta-analysis in context Edited by: Egger M, Smith GD, Altman DG. London: BMJ books; 2001:122-139.
9. The Cochrane Library. Volume 3. Chichester, UK: John Wiley & Sons Ltd; 2004.
10. Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G: Language bias in randomised controlled trials published in English and German. Lancet 1997, 350:326-329.
11. Moher D, Pham B, Klassen T, Schulz K, Berlin J, Jadad A, Liberati A: What contributions do languages other than English make to the results of meta-analyses? Journal of Clinical Epidemiology 2000, 53:964-972.
12. Pai M, McCulloch M, Colford J Jr, Bero LA: Assessment of Publication Bias in Systematic Reviews on HIV/AIDS. [http://www.igh.org/Cochrane/pdfs/MSRI_workshop_talk_abstract.pdf].
13. Dickersin K: The existence of publication bias and risk factors for its occurrence. JAMA 1990, 263:1385-1389.
14. Dickersin K: How important is publication bias? A synthesis of available data. AIDS Educ Prev 1997, 9:15-21.
15. Phillips C: Publication bias in situ. BMC Med Res Method 2004, 4:20.
16. Pham B, Platt R, McAuley L, Sampson M, Klassen T, Moher D: Detecting and minimizing publication bias. A systematic