Common Mistakes in Clinical Research
Review
Fifteen common mistakes encountered in clinical research
Glenn T. Clark DDS, MS a,1,*, Roseann Mulligan DDS, MS b,1
a Orofacial Pain and Oral Medicine Center, Herman Ostrow School of Dentistry, University of Southern California, Los Angeles, CA 90089-0641, USA
b Community Dentistry Programs and Hospital Affairs, Herman Ostrow School of Dentistry, University of Southern California, Los Angeles, CA, USA
Received 9 August 2010; accepted 23 August 2010
Available online 20 November 2010
Abstract
The baseline standards for minimally acceptable science are improving as the understanding of the scientific method improves. Journals
publishing research papers are becoming more and more rigorous. For example, in 2001 a group of authors evaluated the quality of clinical trials in
anesthesia published over a 20-year period [Pua et al., Anesthesiology 2001;95:1068–73]. The authors divided the time into 3 subgroups and
analyzed and compared the quality assessment score from research papers in each group. The authors reported that the scientific quality scores
increased significantly in this time, showing more randomization, sample size calculation and blinding of studies. Because every journal strives to
have a high scientific impact factor, research quality is critical to this goal. This means novice researchers must study, understand and rigorously
avoid the common mistakes described in this review. Failure to do so means the hundreds and hundreds of hours of effort it takes to conduct and
write up a clinical trial will be for naught, in that the manuscript will be rejected or, worse yet, ignored. All scientists have a responsibility to
understand research methods, conduct the best research they can and publish honest and unbiased results.
© 2010 Japan Prosthodontic Society. Published by Elsevier Ireland. Open access under CC BY-NC-ND license.
Contents
1. Failure to carefully examine the literature for similar, prior research
2. Failure to critically assess the prior literature
3. Failure to specify the inclusion and exclusion criteria for your subjects
4. Failure to determine and report the error of your measurement methods
5. Failure to specify the exact statistical assumptions made in the analysis
6. Failure to perform sample size analysis before the study begins
7. Failure to implement adequate bias control measures
8. Failure to write and stick to a detailed time line
9. Failure to vigorously recruit and retain subjects
10. Failure to have a detailed, written and vetted protocol
11. Failure to examine for normality of the data
12. Failure to report missing data, dropped subjects and use of an intention to treat analysis
13. Failure to perform and report power calculations
14. Failure to point out the weaknesses of your own study
15. Failure to understand and use correct scientific language
References
preparation phases. In addition, hints on how to improve a research project and publication are suggested.

1. Failure to carefully examine the literature for similar, prior research
All research begins with an idea or question. What young or novice researchers often fail to appreciate is that the questions they take an interest in are likely not to be new, but are actually questions that others have thought of and frequently have made attempts to investigate in the past. The way to avoid this mistake is to assume that the question of interest has already been studied and that the first job in the research design process is to exhaustively pursue, find and then catalog what has been published. Of course, the novice researcher may have a new variation of the question, or they may be using a new methodology or examining a new population of patients, but it should always be assumed that the core question in some form is likely to have been addressed previously. It now becomes the novice investigator’s job to find that information, and consider the positive and negative outcomes of the prior studies in the new research design development.
HINT 1: When selecting and refining the exact focus of a question it is critically important for the novice to read in detail the discussion section of similar articles, for in that portion of the paper, most researchers speculate on what needs to be accomplished next in that topical area to advance the science.

2. Failure to critically assess the prior literature
Once a wise novice researcher has systematically accumulated and categorized the literature concerning the question of interest, the next step is to carefully examine the research papers related to the question of interest to find out what prior researchers felt could have been improved. One strategy to achieve this is to put together a team or group of research colleagues and select the 10–15 most important articles on a topic for the team to review. Ask each member of the team to present a critical analysis of the literature assigned, presenting both the good and bad points. Developing an individual’s critical analysis skills will aid novices greatly in designing studies that minimize error. Not only is it necessary to critically analyze the literature before designing a new research project, but it is necessary to include these critical remarks in the introductory section of the resulting final manuscript in order to justify why the study was needed and what you as a researcher did better than previous researchers.
HINT 2: There is an old adage that says: “those who forget history are doomed to repeat it” and it is applicable to research as well. Investigators who repeat work previously done and do not recognize and build on prior efforts are likely to find their work unpublishable.

3. Failure to specify the inclusion and exclusion criteria for your subjects
A common omission from many research papers is the lack of research subject specifications, namely the inclusion and exclusion criteria. Listing these criteria helps other researchers understand why current results might differ from other published studies. For example, your patient population might be younger, or your patient population might be from a different racial group or have a different ratio of males to females than were used in other research studies. In any case, it is necessary to specify as best you can the make-up of your subjects. This includes specific criteria for exclusion if you have any. Once you have the inclusion and exclusion criteria, be sure that you actually follow these criteria in selecting subjects for your study.
HINT 3: If a novice researcher is not sure how to develop a list of inclusion and exclusion criteria for a specific research question, look at prior research and use criteria that other researchers have specified.

4. Failure to determine and report the error of your measurement methods
Very few research reports actually provide more than a single sentence saying their examiners were calibrated. They rarely specify the method of training, the standards of performance and the frequency of re-assessment of their putatively calibrated examiners. All methods need replication and every researcher who is attempting the research project needs to be able to answer the question, “What is the error of your measurement method?” Some researchers refer to prior publications when answering this question but a good researcher knows the exact error of his/her own measurement methods and the inter-examiner variation. Finding this error value involves conducting a small test-retest experiment. If, however, a researcher is using multiple examiners to help collect data, these examiners need to be calibrated to a known standard before being given the go-ahead to begin making measurements. If the research project is a long-term project, i.e. lasting for many months or years, it is critical to have examiners who are calibrated and re-calibrated periodically to an accepted standard of performance. Often extensive, complex and difficult studies fail because of a lack of attention to this small detail. A 2001 article examined the effects of measurement error on therapeutic equivalence trials and reported that measurement errors inappropriately favor the goal of showing treatment equivalence [1]. Essentially, this article reported on how imprecise data makes it difficult to tell if there are any real differences between two methods or two treatments. Such imprecision is a disadvantage if your goal is to show that a new method of treatment is better than the old method; however, if you want to show that the new method or treatment is equivalent to or as good as the old treatment then imprecise data benefits this goal of showing equivalence or non-superiority. Another study in 2008 examined the frequency and characteristics of data entry errors in large clinical databases [2]. These authors reported that error rates ranged from 2.3 to 26.9%, with the errors being not just mistakes in data entry but many non-random clusters that could potentially affect the study outcome.
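The small test-retest experiment described above can be summarized in a few lines of code. The sketch below is not part of the original review: it assumes that each subject was measured twice by the same examiner and uses Dahlberg's formula, sqrt(sum(d^2) / (2n)), which is one common way (among several) to express method error in dental research. The function names and the sample values are hypothetical.

```python
import math


def _check_pairs(first, second):
    """Validate that the two measurement series can be paired."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two equal-length, non-empty series")


def dahlberg_error(first, second):
    """Method error from paired repeat measurements: sqrt(sum(d^2) / (2n))."""
    _check_pairs(first, second)
    squared = [(a - b) ** 2 for a, b in zip(first, second)]
    return math.sqrt(sum(squared) / (2 * len(squared)))


def mean_difference(first, second):
    """Average signed difference; a value far from zero hints at systematic drift."""
    _check_pairs(first, second)
    return sum(a - b for a, b in zip(first, second)) / len(first)


# Hypothetical data: the same 5 subjects measured on two occasions (units arbitrary).
session_1 = [42.0, 38.5, 45.0, 40.0, 36.5]
session_2 = [41.0, 39.5, 44.0, 41.0, 36.5]

print("method error:", dahlberg_error(session_1, session_2))
print("mean difference:", mean_difference(session_1, session_2))
```

Reporting both numbers separates random error (the method error) from a systematic shift between sessions (the mean difference). With several examiners, the same comparison could be run for each examiner against the reference standard.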
G.T. Clark, R. Mulligan / Journal of Prosthodontic Research 55 (2011) 1–6 3
HINT 4: A good researcher might even make the calibration process an independent research endeavor that could result in a publication of the process in a scientific journal.

5. Failure to specify the exact statistical assumptions made in the analysis
Since most studies will include statistical analysis of the data, specifying the level of significance (called the alpha level) that is acceptable and the exact statistical test methods used is commonplace. However, rarely do you see the authors stating what they used as their beta value, which indicates their chance of a type II error (usually beta is 0.2 or less). The complement of beta (1 minus beta) is then converted to a percent and reported as the power of a study (usually 80%). Novice researchers often do not state the directionality of the testing that they perform, namely whether they are using a one-tailed or two-tailed analysis. In 2007, an excellent review of the literature was published which cataloged and described 47 specific statistical mistakes that are commonly made in the medical literature [3]. These authors strongly suggested involving a statistical consultant early in a study as a way to prevent some of these common mistakes.
HINT 5: Providing statistical test assumption details gives the reader/reviewer the sense that the authors are attentive to detail and honest in describing the research process, and the lack of such detail implies the opposite.

6. Failure to perform sample size analysis before the study begins
Most clinical trials that claim two methods are equivalent (or non-superior) are underpowered, which means they have too few subjects. To avoid this mistake, prior to initiation of a research project, it is important to know how many subjects are needed to achieve the minimum power level desired. There are multiple online and commercial computer-based programs that will, with minimum information, provide the user with both the power and the estimated group sample size. To perform a sample size analysis it is necessary to understand the nature of the data that is to be collected, i.e. is the data linear or non-linear. It is also necessary to have a reasonable estimate of what effect the intervention will have, called the effect size. Finally, it is essential to understand the variability of the data collected. Without knowing the variability of the data, the effect size, and the power that is expected, it is impossible to estimate sample size, but with these data sample size estimation can easily be achieved. In a 2001 paper, the topic of equivalency testing and sample size in dental clinical trials was examined [4]. Specifically, these researchers examined studies that compared the efficacy of dentures supported by 2 implants versus dentures supported by 4 implants. Such a study design is called an equivalency study. If the 2 methods are found to be equivalent, then one would logically recommend the use of the simpler and less expensive method. The authors found that underpowering a study makes it easier to find equivalency.
HINT 6: For linear data, if the standard deviation is quite a bit larger (e.g. 2–3 times larger) than the difference between the 2 treatment groups, the sample size required to show significance goes up substantially.

7. Failure to implement adequate bias control measures
The single most important mistake that clinical researchers make is the failure to implement adequate bias control measures. Bias control is what distinguishes good from bad research and measures to control for bias include: randomization of subjects to the areas, interventions and control conditions; measurement and analysis of subjects with the investigators blind to the subject status; and having a credible control condition and verifying at the onset and along the way that the subject is truly blind to the group to which they were assigned. This process is called a blinding status check. Double-blinding of researchers and subjects is desirable in a clinical trial to decrease bias. When blinding is not used or when the subject group status is easily detected, subjects will generally try to fulfill the perceived expectations of the researcher. The issue of expectation fulfillment was first pointed out in a study at the Hawthorne Works, a Western Electric plant in Cicero, Illinois [5]. The experimenters varied the intensity of electrical lighting available in the plant to see if there was a cause and effect relationship between work productivity and light intensity. Fortunately they varied the electric lighting in both directions, increasing the intensity and decreasing the intensity. What they discovered is that whenever an experiment was being conducted, work productivity increased; thus the phrase “the Hawthorne Effect” entered our scientific lexicon. This term means that any subject is likely to perform to the investigator’s expectations if they are not blind to their status. In 2001 a study examined the influence of study size on study outcome [6]. Specifically, a meta-analysis of 190 randomized trials involving 8 different therapeutic interventions divided the studies into those with more than 1000 participants and those with fewer than 1000 participants. The results of this analysis were that the smaller studies had more positive therapeutic effects than the larger studies. These researchers also reported that the larger studies were systematically less likely to report a positive effect, suggesting that bias arises more easily and has more impact in smaller studies. These researchers also looked at other bias control measures such as randomization and blinding and concluded that inadequate randomization and blinding lead to exaggerated estimates of the intervention’s benefit.
HINT 7: Patients are remarkably able to detect to which group they have been assigned even though the blinding measures have been implemented; therefore good studies always perform periodic blinding checks.

8. Failure to write and stick to a detailed time line
A detailed timeline or Gantt chart is an essential feature to include in a protocol of a clinical trial. These charts can be created using a Microsoft Office Excel spreadsheet and every step of the trial should be noted in the timeline. The problem often seen with novice researchers is that they lack experience
and cannot estimate realistically the time needed to achieve a specific task. Nevertheless, a timeline is a critical and important overall feature in clinical studies, and failure to create and follow the timeline is a common mistake that is frequently made in clinical research.
HINT 8: Good researchers make a timeline plan that includes critical benchmarks along the way, they post it on the wall for everyone to see and they stick to it!

9. Failure to vigorously recruit and retain subjects
Clinical research implies that human subjects will be involved in the study. Subjects must be identified and recruited and a plan for this recruitment process needs to be developed and written down. A 2009 study actually compared 3 methods of subject recruitment and reported that direct telephone calls to the patient by the investigator were the most effective method [7]. Failure to have a specific recruitment plan and a method for retaining subjects in the study is a common mistake. Moreover, since subject recruitment is often a major issue in research studies, there should be more than one plan for subject recruitment.
HINT 9: Well designed research often fails because of poor subject recruitment and retention procedures, so make this a priority.

10. Failure to have a detailed, written and vetted protocol
Before you begin any research project, especially clinical research, a fully developed protocol is critical. Novice researchers often begin research without completing the protocol. Moreover, in addition to writing the protocol, the researcher needs to present the protocol to a peer group, hopefully a peer group with moderate research experience, with the request that the group provide critical comments and suggestions for improvement. There is an old saying, “luck favors the well prepared”. In the field of research, being well prepared means a well thought out, detailed written protocol is available and consulted frequently during the conduct of the clinical research project. Once the second phase of the research project starts, the data analysis phase, it is critical that an appropriate statistical methodology be selected and implemented to effectively analyze the data. Typically an experienced clinical researcher will consult a statistician for advice both before beginning the research and after the data has been collected. In the research phase a statistician is critical in helping to conceptualize the analytical methodology that should be used. Ideally the consultation with the statistician needs to continue as the data is being collected and prior to final analysis of the data. In many ways, the statistician serves as an outside auditor attesting to the diligence and honesty of the research process and analysis. It is not uncommon for the data that was planned to be collected to change for pragmatic and unexpected reasons. This means the analytical plan may need to be adjusted. Although statistical software programs have improved immensely in the last 10 years, no software program can make up for inappropriate or inexact design of a research project, so consultation with an experienced statistician is almost always a necessity. In 2001, a review paper was written which discussed the topic of optimal clinical research design for chronic pain drug efficacy studies [8]. The authors made a list of suggestions that researchers should consider when they design and conduct such studies, but in their conclusions, they strongly suggested that a biostatistician consultant be used throughout all phases of the clinical trial.
HINT 10: The adage that is applicable here is: “the devil is in the details!” This saying refers to the fact that getting a general understanding and agreement that a project will be conducted is not enough. A researcher must also achieve a thorough understanding and agreement on the specifics of the project, which must be adequately documented or it can easily fail.

11. Failure to examine for normality of the data
In the analytic phase, it is important to examine the data that has been collected to see if it is normally distributed. Normality is a concept that applies to continuous linear data and is not applicable to categorical or non-linear dichotomous data. There are statistical programs that will take a data set and examine whether it meets the standards of normality. Data that is unevenly distributed about the mean can sometimes be transformed into more equally distributed data by using a log or log–log transformation. The advantage of transforming the data is that it allows you to continue using parametric statistical methods, as opposed to using non-parametric statistical analysis methods. In general, parametric statistical analysis is a more sensitive method (i.e. has more statistical power) and is preferred over that used to analyze non-parametric data.
HINT 11: A researcher should always look at the raw data obtained from the study displayed graphically, since this demonstrates areas where there are problems with the data. The goal is to see if a histogram of the data demonstrates a bell-shaped curve or some other figure.

12. Failure to report missing data, dropped subjects and use of an intention to treat analysis
Statistical consultants will most likely recommend analytical methods that are consistent with an intention to treat methodology. This methodology deals with dropouts. Often novice researchers exclude dropouts from the analysis, and this can alter the conclusions of the study. Regardless of the method of analysis used, it is critical to report all dropped data, missing data, and subject dropouts in a careful and honest fashion. How the project dealt with lost or dropped data must be included in the methods section of the research report. Clinical trials that involve complicated, difficult or prolonged protocols often suffer from subject dropout. Many researchers will implement inclusion and exclusion criteria that reasonably eliminate the non-compliant patient. For example exclusion criteria might
specify that: “subjects that did not complete the health history questionnaire will be excluded from this study” or “subjects that failed to appear for more than one follow-up visit will be excluded”. Sometimes researchers will see the potential clinical subjects more than once during the pre-enrollment phase to determine their eligibility. This pre-enrollment phase frequently is referred to as the run-in phase. A run-in phase in a clinical study is an advantage in that it is easier to identify subjects who are likely to be non-compliant with the protocol and would best be excluded before enrollment. Clearly such a strategy would result in fewer dropouts, which is highly desirable. Unfortunately, run-in designs with many exclusions make the results less generalizable to the real world population of subjects. Often such trade-offs are made between practicality and idealism in design. In 1998 a small study was published describing the advantages and disadvantages of a run-in phase to a research protocol [9]. The authors concluded that run-in clinical trials overestimate the benefits and underestimate the risks of treatment.
HINT 12: If you have to choose between excluding subjects and having many drop-outs, always choose excluding.

13. Failure to perform and report power calculations
Novice researchers often fail to perform a power calculation on their study. Such a calculation is critical in studies of equivalency. Small studies with low power often find no significant differences between the treatment interventions; however, if the study was inadequately powered then a type II error is more likely. A type II error is the failure to reject a null hypothesis that is actually false (a false negative). There are in fact multiple software programs that allow researchers to determine the power of their results. In 2001 an article examined how often underpowered reports of equivalency occurred in the surgical literature [10]. Specifically, these authors looked at randomized controlled trials where the control treatment was an active intervention, usually the standard treatment of the day. In these studies a new treatment was compared to the standard treatment and considered to be equal to the standard treatment if the results were equivalent. These researchers looked at 90 randomized controlled trials in the surgical literature and found that 39% of these reports met the standards for equivalency. The other 61% of the reports were typically underpowered and thus subject to a type II error. In 2001 another paper examined type II error rates in the orthopedic trauma literature [11]. Similar to the results published in the prior study, 90% of this literature was underpowered, with the overall power calculated for the 117 papers reviewed being 25%. The standard acceptable power in a study is 80% and therefore the authors concluded that many type II errors were likely to continue to occur in the orthopedic literature, thereby affecting critical future research. Type II errors occur because there are too few subjects. The opposite problem, a spurious positive finding (type I error), arises when too many measurements are made on too few subjects. If you measure two groups of subjects twice, it is likely that some of the measurements taken on the second occasion will be different. It is also possible to show that the differences are statistically significant if no downward adjustments are made to the level of significance to compensate for the fact that there were multiple measurements.
One example of spurious associations being made is in the field of genetic polymorphisms. In 2007 one researcher examined why so many statistically significant associations between diseases and genetic polymorphisms are not replicated in future studies [12]. Specifically, this paper looked at 10 single nucleotide polymorphisms or SNPs of the COMT gene that have been associated with various specific diseases. The authors concluded that false positive findings are commonplace and initial associations between genetic SNPs and diseases must be interpreted with high caution, since they are frequently not replicated. In 2006, a group of researchers conducted a meta-analysis on the topic of false positive gene associations, specifically those associated with human lymphocyte disease [13]. These researchers suggested that a median sample size of over 3500 subjects was necessary to avoid false positive results. They went on to state that collaborative studies seem like a logical approach for collecting large data sets like this, since individual researchers often do not have the resources to gather such a large data set themselves. A 2010 paper suggested a statistical standard be developed before initial results are accepted [14]. This paper suggested that a true report probability (TRP) score be developed based on data from multiple studies. The authors proposed that the TRP formula would be straightforward and appropriate and help distinguish spurious results from true results.
HINT 13: Remember that “associations never prove causality.” This is certainly appropriate when trying to link genetic polymorphisms and disease, so replicate, replicate, and replicate.

14. Failure to point out the weaknesses of your own study
In the last phase of a clinical trial, the results are written in a manuscript form and submitted for review. Many novice researchers fail to point out the weaknesses of their own study in the discussion section of their manuscript. This is often a reason for rejection of the manuscript.
HINT 14: In general, hiding your mistakes or obfuscating them with the hope that no one will notice is not a good policy. Keep in mind that “honesty is the best policy” holds here as well.

15. Failure to understand and use correct scientific language
Finally, all researchers, experienced and novice alike, must use the correct scientific language when describing their results. Specifically, a single study never proves that a hypothesis is true; it can only reject the null hypothesis. While most people are not comfortable using such cautionary language, this is the correct scientific language. This understanding begins with studying a good statistical textbook which focuses on clinical research design [15]. Actually very few research manuscripts formally state the null hypothesis in the method section, and then
formally reject or fail to reject the null hypothesis in the discussion section, but when this is done it shows a true understanding of scientific research and the limitations of the scientific method.
HINT 15: If you want to be a good researcher, you must study and understand the nuances of the language associated with the scientific process and only by doing this will you also understand the limitations of this process.

References
[1] Kim MY, Goldberg JD. The effects of outcome misclassification and measurement error on the design and analysis of therapeutic equivalence trials. Stat Med 2001;20:2065–78.
[2] Goldberg SI, Niemierko A, Turchin A. Analysis of data errors in clinical research databases. AMIA Annu Symp Proc 2008;6:242–6.
[3] Strasak AM, Zaman Q, Pfeiffer KP, Göbel G, Ulmer H. Statistical errors in medical research – a review of common pitfalls. Swiss Med Wkly 2007;137:44–9.
[4] Burns DR, Elswick Jr RK. Equivalence testing with dental clinical trials. J Dent Res 2001;80:1513–7.
[5] Gale EAM. The Hawthorne studies – a fable for our times? Q J Med 2004;97:439–49.
[6] Gluud LL, Thorlund K, Gluud C, Woods L, Harris R, Sterne JA. Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Ann Intern Med 2001;135:982–9.
[7] Schroy 3rd PC, Glick JT, Robinson P, Lydotes MA, Heeren TC, Prout M, et al. A cost-effectiveness analysis of subject recruitment strategies in the HIPAA era: results from a colorectal cancer screening adherence trial. Clin Trials 2009;6:597–609.
[8] Harden RN, Bruehl S. Conducting clinical trials to establish drug efficacy in chronic pain. Am J Phys Med Rehabil 2001;80:547–57.
[9] Pablos-Méndez A, Barr RG, Shea S. Run-in periods in randomized trials: implications for the application of results in clinical practice. JAMA 1998;279:222–5.
[10] Dimick JB, Diener-West M, Lipsett PA. Negative results of randomized clinical trials published in the surgical literature: equivalency or error? Arch Surg 2001;136:796–800.
[11] Lochner HV, Bhandari M, Tornetta 3rd P. Type-II error rates (beta errors) of randomized trials in orthopaedic trauma. J Bone Joint Surg Am 2001;83:1650–5.
[12] Sullivan PF. Spurious genetic associations. Biol Psychiatry 2007;61:1121–6.
[13] Ioannidis JP, Trikalinos TA, Khoury MJ. Implications of small effect sizes of individual genetic variants on the design and interpretation of genetic association studies of complex diseases. Am J Epidemiol 2006;164:609–14.
[14] Weitkunat R, Kaelin E, Vuillaume G, Kallischnigg G. Effectiveness of strategies to increase the validity of findings from association studies: size vs. replication. BMC Med Res Methodol 2010;10:47.
[15] Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB. Designing clinical research, 3rd ed., Philadelphia: Lippincott, Williams and Wilkins; 2007. p. 51–63.