COSMOS-E- Guidance on conducting
† Deceased.
* o.m.dekkers@lumc.nl
perform or report their review. Also, this paper will not settle some ongoing debates and con-
troversies [18] around the performance of reviews of observational studies on etiology, but
rather will give the different viewpoints and possibilities. The writing group included researchers
experienced in meta-analyses and observational studies of etiology. No external advice was
sought; standard peer review was performed.
The protocol
Every systematic review should be planned in a detailed protocol. The key issues that need to
be addressed are listed in Box 2. It is not always possible to specify fully all review methods
beforehand; the writing of the study protocol will often be an iterative process, informed by
scoping the literature and piloting procedures. Reviewers should take care not to change the
protocol based on study results, but the protocol may be adapted, for example, based on the
Study selection
The search produces bibliographic references with information on authors, titles, journals, etc.
However, the unit of interest is the study and not the publication—the same study might have
been reported more than once [33], and a single publication can report on multiple studies.
First, all identified reports are screened based on title and abstract to remove duplicate publica-
tions and articles that are clearly not relevant. This leads to a set of studies for which the full
texts are required to determine eligibility and potential overlaps in study populations. Even
with clearly defined eligibility criteria, not all decisions will be straightforward. For example, if
researchers want to perform a review restricted to children, some studies may have included
young adults without providing data for children only. In this case, reviewers may have to
decide what proportion of adults is acceptable for a study to be included, or they may attempt
to obtain the data on children from the authors.
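The distinction between reports and studies can be illustrated with a minimal sketch. It assumes that reviewers have already labelled each screened report with the study or studies it describes; the field names `report_id` and `study_ids` and the example records are hypothetical, not part of any particular screening tool.

```python
from collections import defaultdict

def group_reports_by_study(reports):
    """Map each study label to the list of report ids that describe it."""
    studies = defaultdict(list)
    for report in reports:
        # One report may describe several studies, so study_ids is a list.
        for study_id in report["study_ids"]:
            studies[study_id].append(report["report_id"])
    return dict(studies)

# Hypothetical screening output: cohort_A is reported twice,
# and report R3 describes two different studies.
reports = [
    {"report_id": "R1", "study_ids": ["cohort_A"]},
    {"report_id": "R2", "study_ids": ["cohort_A"]},
    {"report_id": "R3", "study_ids": ["cohort_B", "cohort_C"]},
]
studies = group_reports_by_study(reports)
print(len(reports), "reports describe", len(studies), "studies")
```

Counting keys of `studies` rather than entries of `reports` keeps the study, not the publication, as the unit of analysis.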
Data extraction
Article screening, data extraction, and assessments of risk of bias should preferably be done
independently by two reviewers to reduce errors and to detect any differences in interpretation
between extractors [35,36]. Discrepancies can then be discussed and resolved [37]. Standard-
ised data extraction sheets should be developed for each review, piloted with a few typical stud-
ies, refined, and then implemented in generic (for example, EpiData) or preferably dedicated
software (for example, Covidence; see http://systematicreviewtools.com). For all included
studies, the following core data should be extracted:
1. Bibliographic information
2. Study design
3. Risk of bias assessment
4. Exposure(s) and outcomes, including definitions
5. Characteristics of study participants
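As an illustration of how independent double extraction can surface disagreements for discussion, the following sketch compares two reviewers' extraction records field by field. The record layout, field names, and example values are assumptions made for this example; dedicated software will have its own data model.

```python
def find_discrepancies(extraction_a, extraction_b):
    """Return {study: {field: (value_a, value_b)}} for every disagreement.

    Each argument maps a study label to a dict of extracted fields.
    A field or study missing from one extraction is reported as None.
    """
    discrepancies = {}
    for study in sorted(set(extraction_a) | set(extraction_b)):
        rec_a = extraction_a.get(study, {})
        rec_b = extraction_b.get(study, {})
        diffs = {
            field: (rec_a.get(field), rec_b.get(field))
            for field in sorted(set(rec_a) | set(rec_b))
            if rec_a.get(field) != rec_b.get(field)
        }
        if diffs:
            discrepancies[study] = diffs
    return discrepancies

# Hypothetical extractions by two reviewers for the same study.
reviewer_a = {"study_1": {"design": "cohort", "rob_confounding": "low"}}
reviewer_b = {"study_1": {"design": "cohort", "rob_confounding": "moderate"}}

to_discuss = find_discrepancies(reviewer_a, reviewer_b)
print(to_discuss)
```

The function only lists differences; resolving them remains a matter for discussion between the reviewers.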
selection bias refer to biases that are internal to the study (‘internal validity’) and not to issues
of generalisability or applicability (‘external validity’) [20]. How should the risk of bias in
observational studies best be assessed? A review identified more than 80 tools for assessing risk
of bias in nonrandomised studies [45]. The reviewers concluded that there is no ‘single obvious
candidate tool for assessing quality of observational epidemiological studies’. This is not sur-
prising considering the large heterogeneity in study designs, contexts, and research questions
in observational research. We believe that the quest for a ‘one size fits all’ approach is mis-
guided; rather, a set of criteria should be developed for each observational systematic review
and meta-analysis, guided by the general principles outlined below.
General principles
When assessing the risk of bias, seven general principles are relevant, based on theoretical con-
siderations, empirical work, and the experience with assessing risk of bias in RCTs and other
studies [55–57].
1. The relevant domains of bias should be defined separately for each review question
and for different study designs. Relevant domains of risk of bias that should be considered
include (i) bias due to (time-dependent) confounding, (ii) bias in selection of participants into
the study (selection bias), (iii) bias in measurement of exposures or outcomes (information
bias), (iv) bias due to missing data (selection bias), and (v) bias in selection of studies or
reported outcomes (selection bias) [56]. The risk of bias should be assessed for each domain
and, if required, for different outcomes. The focus should be on bias. For example, whether or
not a sample size calculation was performed or ethical approval was obtained does not affect
the risk of bias.
2. The risk of bias should be assessed qualitatively. For each study and bias domain, the
risk of bias should be assessed in qualitative categories, for example, as ‘low risk’, ‘moderate
risk’, or ‘high risk’. These categories and the criteria used to define them should be described
in the paper. Quantitative assessments by assigning points should be avoided (see also point
6).
3. Signalling questions may be useful. Within each bias domain, simple signalling ques-
tions may be useful to facilitate judgments about the risk of bias (Table 1). A comprehensive
list of signalling questions has recently been compiled by the developers of the Cochrane risk
of bias assessment tool for nonrandomised studies of interventions (ROBINS-I) [56], and a
similar tool is in development for nonrandomised studies of exposures [58]. These lists and
tools will be useful, but reviewers should think about further questions that may be relevant in
the context of their review. Cooper and colleagues compiled a list of questions relating to study
sensitivity [23].
4. Separate assessments may have to be made for different outcomes. The risk of bias
will often differ across different outcomes. For example, bias in the ascertainment of death
from all causes is much less likely than for a subjective outcome, such as quality of life or pain,
or for an outcome that relies on clinical judgment, such as pneumonia.
5. Assessments should be documented. It is good practice to copy and archive the text
from the article on which an assessment regarding the risk of bias is based. Such
Statistical analysis
Fixed-effect versus random-effects models in the context of observational studies
Once the decision has been made that (some) studies can be combined in a meta-analysis,
reviewers need to decide whether to use a fixed-effect or a random-effects model or both.
These models have been described extensively [70]. In short, fixed-effect analyses assume that
Metaregression
Metaregression is used to investigate whether study characteristics are associated with the
magnitude of effects and whether specific study characteristics can explain (some of) the
observed statistical heterogeneity. The presence of heterogeneity motivates metaregression
analyses, and random-effects metaregression should therefore always be used. The use of
fixed-effect metaregression is conceptually nonsensical and yields a high rate of false-positive
results [79].
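One common way to write a random-effects metaregression with a single study-level covariate is given below. The notation is assumed for illustration and is not taken from the paper: y_i is the effect estimate of study i (for example, a log relative risk), v_i its within-study variance, x_i the covariate, and tau^2 the residual between-study variance.

```latex
\[
y_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i,
\qquad u_i \sim N(0, \tau^2),
\qquad \varepsilon_i \sim N(0, v_i),
\qquad w_i = \frac{1}{v_i + \tau^2}.
\]
```

A fixed-effect metaregression corresponds to setting tau^2 = 0, that is, to assuming that the covariate explains all heterogeneity. If residual heterogeneity remains, the weights w_i = 1/v_i overstate the precision of the regression coefficients.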
Variables included in a metaregression model may be study features, such as study design,
year of publication, or risk of bias; or characteristics of the people included in the different
studies, such as age, sex, or disease stage. These variables are potential effect modifiers. For
example, the risks of smoking decrease with advanced age. Only a few variables should be
included in a metaregression analysis (about one variable per 10 studies), and they should be
prespecified to minimise the risk of false-positive results [80]. In multivariable metaregression,
the model presents mutually adjusted estimates, and permutation tests to adjust for multiple
testing can be considered [80]. When including characteristics of study participants, note that
associations observed at the study level may not reflect those at the individual level—the so-
called ecological fallacy [81]. This phenomenon is illustrated in Fig 3 for trends in the CD4
positive lymphocyte count in HIV-positive patients starting antiretroviral therapy (ART): in
five of the six studies, the CD4 cell count at the start of ART increased over time, which was
not shown in metaregression analysis at the study level. A graphical display of the metaregres-
sion is informative [82]. Such a graph shows for each study the outcome (e.g., a relative risk or
a risk difference) on the y-axis, the explanatory variable on the x-axis, and the regression line
that shows the association between two variables. In a metaregression graph, the weight of the
studies is preferably shown by ‘bubbles’ around the effect estimates, with larger bubbles relat-
ing to studies with more weight in the analysis (Fig 3 provides a schematic example).
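A random-effects metaregression with one covariate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any specific package: the residual between-study variance is estimated with a method-of-moments (DerSimonian-Laird type) estimator, the function name and the example data are hypothetical, and standard errors and tests are omitted.

```python
def random_effects_metaregression(y, v, x):
    """One-covariate random-effects metaregression.

    y: study effect estimates (e.g. log relative risks)
    v: within-study variances of y
    x: one study-level covariate
    Returns a dict with intercept, slope, tau2 and the final study weights.
    """
    k = len(y)

    def sums(w):
        # Weighted sums needed for the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        return sw, swx, swxx

    def fit(w):
        # Weighted least squares for y = intercept + slope * x.
        sw, swx, swxx = sums(w)
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
        det = sw * swxx - swx ** 2
        slope = (sw * swxy - swx * swy) / det
        intercept = (swy - slope * swx) / sw
        return intercept, slope

    # Step 1: fixed-effect fit, used only to estimate tau^2.
    w_fe = [1.0 / vi for vi in v]
    b0, b1 = fit(w_fe)
    q_resid = sum(wi * (yi - b0 - b1 * xi) ** 2
                  for wi, yi, xi in zip(w_fe, y, x))
    sw, swx, swxx = sums(w_fe)
    s2, s2x, s2xx = sums([wi * wi for wi in w_fe])
    det = sw * swxx - swx ** 2
    # trace(P), with P = W - W X (X'WX)^-1 X'W
    trace_p = sw - (swxx * s2 - 2.0 * swx * s2x + sw * s2xx) / det
    tau2 = max(0.0, (q_resid - (k - 2)) / trace_p)

    # Step 2: refit with random-effects weights 1 / (v_i + tau^2).
    w_re = [1.0 / (vi + tau2) for vi in v]
    intercept, slope = fit(w_re)
    return {"intercept": intercept, "slope": slope,
            "tau2": tau2, "weights": w_re}

# Hypothetical data: 10 studies, log relative risks and mean age per study.
log_rr = [0.62, 0.55, 0.48, 0.51, 0.35, 0.40, 0.22, 0.30, 0.15, 0.18]
variance = [0.02, 0.03, 0.01, 0.04, 0.02, 0.05, 0.03, 0.02, 0.04, 0.03]
mean_age = [35, 38, 42, 45, 50, 52, 58, 60, 66, 70]

result = random_effects_metaregression(log_rr, variance, mean_age)
print(result["slope"], result["tau2"])
```

The returned `weights` could be used to scale the bubbles in a metaregression graph. The slope describes an association between study-level mean age and study-level effect, so it remains subject to the ecological fallacy.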
Dose-response meta-analysis
In many epidemiologic studies, several levels of exposure are compared. For example, the
effect of blood glucose on cardiovascular outcomes can be studied across several groups of glu-
cose levels, using one category as reference. However, different studies may report different
categories of the exposure variable (tertiles, quartiles, or quintiles). One approach is to meta-
analyse the estimates by comparing the lowest and highest category. This is not recommended
because the meaning of lowest versus highest differs across studies. A more sophisticated
approach is to model the association between the exposure and outcome to estimate the
increase (or decrease) in risk associated with a one-unit (or other meaningful) increase
in exposure. See references [90,91] for technical details. For example, a meta-analysis
of the association between Homeostasis Model Assessment Insulin Resistance (HOMA-IR)
and cardiovascular events used dose-response modelling to estimate that the cardiovascular
risk increased by 46% per one standard deviation increase in HOMA-IR [24].
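To show how a per-increment estimate can be read, the worked arithmetic below assumes a log-linear dose-response relation. This functional form is an assumption made here for illustration; the model actually used in the cited meta-analysis is not described in this section.

```latex
\[
\ln \mathrm{RR}(x) = \beta x,
\qquad
\mathrm{RR}_{\text{per 1 SD}} = e^{\beta} = 1.46
\;\Rightarrow\;
\beta = \ln 1.46 \approx 0.378,
\]
\[
\mathrm{RR}_{\text{per 2 SD}} = e^{2\beta} = 1.46^{2} \approx 2.13 .
\]
```

Under this assumption, a 46% increase in risk per standard deviation corresponds to roughly a doubling of risk for a difference of two standard deviations. Unlike a comparison of lowest versus highest category, this quantity has the same meaning across studies.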
Supporting information
S1 Box.
(DOCX)
S2 Box.
(DOCX)
Acknowledgments
Douglas Altman died on June 3, 2018. We dedicate this paper to his memory.
References
1. Page MJ, Shamseer L, Altman DG, Tetzlaff J, Sampson M, Tricco AC, et al. Epidemiology and Report-
ing Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS
Med. 2016; 13(5):e1002028. https://doi.org/10.1371/journal.pmed.1002028 PMID: 27218655
2. Mansournia MA, Higgins JP, Sterne JA, Hernan MA. Biases in Randomized Trials: A Conversation
Between Trialists and Epidemiologists. Epidemiology. 2017; 28(1):54–9. https://doi.org/10.1097/EDE.
0000000000000564 PMID: 27748683
3. Vandenbroucke JP, von Elm E, Altman DG, Gotzsche PC, Mulrow CD, Pocock SJ, et al. Strengthen-
ing the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration.
PLoS Med. 2007; 4(10):e297. https://doi.org/10.1371/journal.pmed.0040297 PMID: 17941715
4. Dekkers OM, Horvath-Puho E, Jorgensen JO, Cannegieter SC, Ehrenstein V, Vandenbroucke JP,
et al. Multisystem morbidity and mortality in Cushing’s syndrome: a cohort study. J Clin Endocrinol
Metab. 2013; 98(6):2277–84. https://doi.org/10.1210/jc.2012-3582 PMID: 23533241
5. Hernan MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology.
2006; 17(4):360–72. https://doi.org/10.1097/01.ede.0000222409.00878.37 PMID: 16755261
6. Rassen JA, Brookhart MA, Glynn RJ, Mittleman MA, Schneeweiss S. Instrumental variables I: instru-
mental variables exploit natural variation in nonexperimental data to estimate causal relationships. J
Clin Epidemiol. 2009; 62(12):1226–32. https://doi.org/10.1016/j.jclinepi.2008.12.005 PMID: 19356901
7. Mountjoy E, Davies NM, Plotnikov D, Smith GD, Rodriguez S, Williams CE, et al. Education and myo-
pia: assessing the direction of causality by mendelian randomisation. BMJ. 2018; 361:k2022. https://
doi.org/10.1136/bmj.k2022 PMID: 29875094
8. Petersen I, Douglas I, Whitaker H. Self controlled case series methods: an alternative to standard epi-
demiological study designs. BMJ. 2016; 354:i4515. https://doi.org/10.1136/bmj.i4515 PMID:
27618829
9. Ponjoan A, Blanch J, Alves-Cabratosa L, Marti-Lluch R, Comas-Cufi M, Parramon D, et al. Effects of
extreme temperatures on cardiovascular emergency hospitalizations in a Mediterranean region: a
self-controlled case series study. Environ Health. 2017; 16(1):32. https://doi.org/10.1186/s12940-017-
0238-0 PMID: 28376798
10. Coureau G, Bouvier G, Lebailly P, Fabbro-Peray P, Gruber A, Leffondre K, et al. Mobile phone use
and brain tumours in the CERENAT case-control study. Occup Environ Med. 2014; 71(7):514–22.
https://doi.org/10.1136/oemed-2013-101754 PMID: 24816517
11. Mason KE, Pearce N, Cummins S. Associations between fast food and physical activity environments
and adiposity in mid-life: cross-sectional, observational evidence from UK Biobank. Lancet Public
Health. 2018; 3(1):e24–e33. https://doi.org/10.1016/S2468-2667(17)30212-8 PMID: 29307385
12. Moses S, Bradley JE, Nagelkerke NJ, Ronald AR, Ndinya-Achola JO, Plummer FA. Geographical pat-
terns of male circumcision practices in Africa: association with HIV seroprevalence. Int J Epidemiol.
1990; 19(3):693–7. PMID: 2262266
13. Siegfried N, Muller M, Deeks JJ, Volmink J. Male circumcision for prevention of heterosexual acquisi-
tion of HIV in men. Cochrane Database Syst Rev. 2009;(2):CD003362. https://doi.org/10.1002/
14651858.CD003362.pub2 PMID: 19370585
14. Higgins JPT, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions: online
version (5.1.0, March 2011). The Cochrane Collaboration; 2011. Available from:
www.handbook-5-1.cochrane.org
15. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA statement
for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions:
explanation and elaboration. PLoS Med. 2009; 6(7):e1000100. https://doi.org/10.1371/journal.pmed.
1000100 PMID: 19621070
16. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observa-
tional studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epi-
demiology (MOOSE) group. JAMA. 2000; 283(15):2008–12. PMID: 10789670