Statistics
Please grace us with your good prayers
Youssef Maayouf
What is the Null Hypothesis?
• A null hypothesis states that two treatments are equally effective (and is hence
negatively phrased). In simple words, it means that the effect of the new
treatment/condition is zero ("null") and that it adds nothing for the patient
• A significance test uses the sample data to assess how compatible the observed
results are with the null hypothesis.
What is the P value?
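• The P value is the probability of obtaining a result at least as extreme as the
one actually observed, assuming that the null hypothesis is true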
What are the types of errors that may be
encountered when testing the Null Hypothesis?
• Two types of error may occur when testing the null hypothesis: Type 1 and
Type 2
What is Type 1 Error (one of the two types that
are encountered when testing the Null
Hypothesis)?
• Type I: the null hypothesis is rejected when it is true (you say that there is a
significant effect of the treatment or the added condition when there actually is
none), i.e. showing a difference between two groups when it doesn't exist. The
probability of a Type I error is the significance level
What is Type 2 Error (one of the two types that
are encountered when testing the Null
Hypothesis)?
• Type II: the null hypothesis is accepted when it is false (you say there is no
significant effect of the treatment or the added condition when there is one), i.e.
failing to spot a difference when one really exists
What is the power of a study?
• The power of a study is the probability of (correctly) rejecting the null hypothesis
when it is false (you can confidently say that there is an effect when there is one).
Power = 1 - probability of a Type II error
• The type of significance test used depends on whether the data is parametric
(something which can be measured and is usually normally distributed) or non-parametric
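The two error types and the power described above can be put into numbers with a small sketch. It assumes a one-sided z-test with a known standard deviation, which the notes do not specify; `z_test_power` and the example effect size, SD and sample size are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Probability of rejecting H0 in a one-sided z-test (hypothetical helper)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection threshold under H0
    shift = effect * sqrt(n) / sd             # how far the true effect moves the test statistic
    return 1 - NormalDist().cdf(z_crit - shift)

# H0 true (no effect): any rejection is a Type I error, so the rate equals alpha
type_1_rate = z_test_power(effect=0.0, sd=1.0, n=50)

# H0 false (assumed true effect of 0.5 SD): rejection is the correct decision
power = z_test_power(effect=0.5, sd=1.0, n=50)
type_2_rate = 1 - power  # chance of missing the real effect

print(f"Type I error rate: {type_1_rate:.3f}")
print(f"Power: {power:.3f}, Type II error rate: {type_2_rate:.3f}")
```

Under these assumptions, raising `n` or the effect size increases the power, while the Type I error rate stays fixed at the chosen significance level.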
What are the Parametric tests?
• Paired data refers to data obtained from a single group of patients, e.g.
measurement before and after an intervention.
• Unpaired data comes from two different groups of patients, e.g. Comparing
response to different interventions in two groups
What is Meta-Analysis?
• Funnel plots are usually drawn with treatment effects on the horizontal axis and
study size on the vertical axis.
What are the different interpretations of Funnel
Plot?
• Central Limit Theorem (CLT): the random sampling distribution of the mean
tends to be normal as the sample size grows, irrespective of the distribution of the
population from which the samples were drawn.
• The mean of the random sampling distribution of means is equal to the mean of
the original population
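The two CLT statements above can be checked with a short simulation. This is only a sketch: the exponential population, the sample size of 30, the 2,000 repetitions and the fixed seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

POPULATION_MEAN = 1.0  # an exponential population with rate 1 has mean 1 and SD 1
SAMPLE_SIZE = 30

def draw_sample(n):
    """Draw n values from a skewed, clearly non-normal population."""
    return [random.expovariate(1.0) for _ in range(n)]

# Sampling distribution of the mean: one mean per repeated sample
sample_means = [mean(draw_sample(SAMPLE_SIZE)) for _ in range(2000)]
mean_of_means = mean(sample_means)

# If the sample means are roughly normal, about 95% of them should fall
# within 1.96 standard errors of the population mean
se = 1.0 / sqrt(SAMPLE_SIZE)
within = sum(abs(m - POPULATION_MEAN) <= 1.96 * se for m in sample_means)
fraction_within = within / len(sample_means)

print(f"Mean of sample means: {mean_of_means:.3f}")
print(f"Fraction within 1.96 SE: {fraction_within:.3f}")
```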
What is Confidence Interval?
• Confidence Interval (CI): describes the range of values around an estimate (for
example a mean or an odds ratio) within which the true population value is expected to lie.
• For a 95% CI there is a 5% chance that the true mean value for the variable lies outside
the range. CI = mean ± 2 x SE (Standard Error), or more precisely mean ± 1.96 x SE
• In a normal distribution, the range from the mean - (1.96 * SD) to the mean + (1.96 * SD)
contains 95% of individual observations, i.e. if 100 observations are taken from the same
group, 95 of them would be expected to lie in that range. This describes the spread of
the observations; the 95% CI of the mean uses the SE in place of the SD
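A small sketch of the calculation, assuming a handful of made-up measurements. It computes the SE-based 95% interval for the mean alongside the wider SD-based range for individual observations, so the two are not confused.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up measurements, for illustration only
values = [4.8, 5.1, 5.0, 4.7, 5.4, 5.2, 4.9, 5.3, 5.0]

n = len(values)
m = mean(values)
sd = stdev(values)   # sample standard deviation
se = sd / sqrt(n)    # standard error of the mean

# 95% confidence interval for the mean: mean ± 1.96 x SE
ci_low, ci_high = m - 1.96 * se, m + 1.96 * se

# Range expected to contain about 95% of individual observations: mean ± 1.96 x SD
ref_low, ref_high = m - 1.96 * sd, m + 1.96 * sd

print(f"mean={m:.2f}, SD={sd:.2f}, SE={se:.2f}")
print(f"95% CI of the mean: {ci_low:.2f} to {ci_high:.2f}")
print(f"95% range of observations: {ref_low:.2f} to {ref_high:.2f}")
```

With so few observations a t-distribution multiplier would normally replace 1.96; the fixed 1.96 is kept here only to mirror the formula in the notes.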
In normal distribution Mean = Median = Mode,
T/F?
• True
What is the standard deviation?
• The standard deviation (SD) represents the average distance of each observation
in a sample from the sample mean
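• In skewed distributions: positive skew gives mean > median > mode, and negative
skew gives mean < median < mode
• To remember the above, note how they are in alphabetical order; think positive going
forward with '>', whilst negative going backwards with '<'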
Standard error of the mean = standard
deviation / square root (number of patients),
T/F?
• True
What is the Standard Error of the Mean?
• Relative Risk (RR) is the ratio of risk in the experimental group (experimental
event rate, EER) to risk in the control group (control event rate, CER)
• Simply:
• EER = rate at which events occur in the experimental group
• CER = rate at which events occur in the control group
• RR = EER / CER
What is Absolute Risk Reduction?
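• Absolute risk reduction (ARR) is the difference between the control event rate
and the experimental event rate (CER - EER)
• Relative risk reduction (RRR) is calculated by dividing the absolute risk reduction
by the control event rate
• The Hazard Ratio (HR) is similar to relative risk but is used when risk is not
constant with respect to time. It is typically used when analysing survival over time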
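The risk measures above can be worked through in a short sketch. The trial counts are made up, `risk_measures` is a hypothetical helper, and ARR is taken as CER - EER, the usual definition.

```python
def risk_measures(events_exp, n_exp, events_ctrl, n_ctrl):
    """Return EER, CER, RR, ARR and RRR for a two-arm trial (hypothetical helper)."""
    eer = events_exp / n_exp    # experimental event rate
    cer = events_ctrl / n_ctrl  # control event rate
    rr = eer / cer              # relative risk
    arr = cer - eer             # absolute risk reduction
    rrr = arr / cer             # relative risk reduction
    return {"EER": eer, "CER": cer, "RR": rr, "ARR": arr, "RRR": rrr}

# Made-up trial: 10 of 100 treated patients and 20 of 100 controls have the event
result = risk_measures(events_exp=10, n_exp=100, events_ctrl=20, n_ctrl=100)

for name, value in result.items():
    print(f"{name}: {value:.2f}")
```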
What is the equation for calculating Number
needed to treat (NNT)?
• Number needed to treat (NNT) is a measure that indicates how many patients
would require an intervention to decrease the expected number of outcomes by
1.
• NNT = 1 / absolute risk reduction
• Odds Ratio may be defined as the ratio of the odds of a particular outcome with
experimental treatment to the odds with control
• Pre-test probability: the proportion of people with the target disorder in the
population at risk at a specific time (point prevalence) or time interval (period prevalence)
• Post-test probability: the proportion of patients with that particular test result
who have the target disorder
• Pre-test odds: the odds that the patient has the target disorder before the test is carried out
• Post-test odds: the odds that the patient has the target disorder after the test is carried out.
Post-test odds = pre-test odds x likelihood ratio
• Where the likelihood ratio for a positive test result = sensitivity / (1 - specificity)
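The quantities in this list can be sketched as small functions. Two formulas are assumptions, as they are standard results rather than anything spelled out in the notes: NNT = 1 / (CER - EER), and post-test odds = pre-test odds x likelihood ratio. All function names and example numbers are hypothetical.

```python
def number_needed_to_treat(cer, eer):
    """NNT as the reciprocal of the absolute risk reduction (assumed formula)."""
    return 1 / (cer - eer)

def odds_ratio(events_exp, n_exp, events_ctrl, n_ctrl):
    """Odds of the outcome with treatment divided by the odds with control."""
    odds_exp = events_exp / (n_exp - events_exp)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return odds_exp / odds_ctrl

def positive_likelihood_ratio(sensitivity, specificity):
    """Likelihood ratio for a positive test result."""
    return sensitivity / (1 - specificity)

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio  # assumed relationship
    return post_odds / (1 + post_odds)

# Made-up numbers for illustration
nnt = number_needed_to_treat(cer=0.2, eer=0.1)
ratio = odds_ratio(events_exp=10, n_exp=100, events_ctrl=20, n_ctrl=100)
lr_pos = positive_likelihood_ratio(sensitivity=0.9, specificity=0.8)
post_prob = post_test_probability(pre_test_probability=0.1, likelihood_ratio=lr_pos)

print(f"NNT={nnt:.1f}, OR={ratio:.3f}, LR+={lr_pos:.2f}, post-test probability={post_prob:.3f}")
```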
What is Incidence and Prevalence?
• The incidence is the number of new cases per population in a given time period
• For example, if condition X has caused 40 new cases per 1,000 of the population
over the past 12 months, the annual incidence is 0.04 or 4%.
What is the Prevalence?
• The prevalence is the total number of cases per population at a particular point in
time
• For example, imagine a questionnaire is sent to 2,500 adults asking them how
much they weigh.
• If 500 of the adults in this sample population were obese, then the
prevalence of obesity would be 0.2 or 20%.
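The two worked examples above reduce to the same arithmetic on different counts. A minimal sketch, with hypothetical function names:

```python
def incidence(new_cases, population):
    """New cases per population over a given time period."""
    return new_cases / population

def prevalence(total_cases, population):
    """All existing cases per population at one point in time."""
    return total_cases / population

# Condition X: 40 new cases per 1,000 people over 12 months
annual_incidence = incidence(new_cases=40, population=1000)

# Obesity questionnaire: 500 obese adults among 2,500 respondents
obesity_prevalence = prevalence(total_cases=500, population=2500)

print(f"Annual incidence: {annual_incidence:.0%}")
print(f"Prevalence: {obesity_prevalence:.0%}")
```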
What is the relationship between incidence and
prevalence?
• For conditions such as the common cold the incidence may be greater than the
prevalence
Can you give a story to elaborate true positive,
negative and false positive, negative?
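• Imagine a scenario where people are tested for a disease. The test outcome can be positive (sick)
or negative (healthy), while the actual health status of the persons may be different. In that
setting:
• True positive: sick people correctly identified as sick
• False positive: healthy people incorrectly identified as sick
• True negative: healthy people correctly identified as healthy
• False negative: sick people incorrectly identified as healthy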
• A sensitivity of 100% means that the test recognizes all sick people as such. Thus
in a high sensitivity test, a negative result is used to rule out the disease.
If sensitivity is used to evaluate the effectiveness
of the test to diagnose positive cases, what is the
parameter used to diagnose negative cases?
• Sensitivity alone does not tell us how well the test predicts other classes (that is,
about the negative cases).
• Sensitivity is not the same as the positive predictive value (ratio of true positives
to combined true and false positives), which is as much a statement about the
proportion of actual positives in the population being tested as it is about the
test.
When you’re doing a test over a number of samples to
determine the sensitivity of the test, what would you do about
samples that give indeterminate results (neither positive nor
negative)?
• The calculation of sensitivity does not take into account indeterminate test
results.
• A specificity of 100% means that the test recognizes all healthy people as healthy.
Thus a positive result in a high specificity test is used to confirm the disease.
• The maximum specificity (100%) is trivially achieved by a test that claims everybody
is healthy regardless of the true condition.
• Therefore, the specificity alone does not tell us how well the test recognizes
positive cases.
• We also need to know the sensitivity of the test to the class, or equivalently, the
specificities to the other classes.
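The points above can be checked on a made-up cohort. The sketch uses the standard definitions sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the counts and names are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of sick people the test recognizes as sick."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of healthy people the test recognizes as healthy."""
    return tn / (tn + fp)

# Hypothetical cohort: 100 sick and 900 healthy people
reasonable_test = {"tp": 90, "fn": 10, "tn": 855, "fp": 45}

# A "test" that declares everybody healthy, whatever their true condition
always_negative = {"tp": 0, "fn": 100, "tn": 900, "fp": 0}

for name, c in [("reasonable", reasonable_test), ("always negative", always_negative)]:
    sens = sensitivity(c["tp"], c["fn"])
    spec = specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The always-negative test reaches 100% specificity while detecting no sick people, which is why specificity has to be read together with sensitivity.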
A test with high specificity has a low Type I error
rate, T/F?
• True
What is the difference between specificity and
Precision?
• The distinction is critical when the classes are different sizes. A test with very high
specificity can have very low precision if there are far more true negatives than
true positives, and vice versa.
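A numeric illustration of this distinction, assuming a made-up population in which the disease is rare (10 sick people among 9,910). Precision is taken as true positives / (true positives + false positives), i.e. the positive predictive value.

```python
# Hypothetical counts: 10 sick people, 9,900 healthy people
tp, fn = 9, 1      # the test finds 9 of the 10 sick people
tn, fp = 9801, 99  # 1% of the healthy people test positive

specificity = tn / (tn + fp)  # how well healthy people are recognized
precision = tp / (tp + fp)    # how many positive results are truly sick

print(f"Specificity: {specificity:.2%}")
print(f"Precision:   {precision:.2%}")
```

Even with 99% specificity, most positive results here are false positives, because healthy people vastly outnumber sick ones.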
Increasing the cut-off of a positive test result
will decrease the number of false positives and
hence increase the specificity, T/F?
• True
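The effect of moving the cut-off can be seen on a toy data set. The scores are made up, a result counts as positive when the score is at or above the cut-off, and `rates_at_cutoff` is a hypothetical helper.

```python
# Made-up test scores; higher scores suggest disease
healthy_scores = [1, 2, 3, 4, 5, 6]
sick_scores = [4, 5, 6, 7, 8, 9]

def rates_at_cutoff(cutoff):
    """Count results when scores >= cutoff are called positive."""
    fp = sum(score >= cutoff for score in healthy_scores)
    tn = len(healthy_scores) - fp
    tp = sum(score >= cutoff for score in sick_scores)
    fn = len(sick_scores) - tp
    return {
        "false_positives": fp,
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
    }

low = rates_at_cutoff(4)
high = rates_at_cutoff(6)

print("cut-off 4:", low)
print("cut-off 6:", high)
```

In this toy example the higher cut-off produces fewer false positives and a higher specificity, at the price of a lower sensitivity.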
What is Sensitivity?
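• Sensitivity is the proportion of people with the disease who test positive:
true positives / (true positives + false negatives)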
• Whilst correlation coefficients give information about how one variable may
increase or decrease as another variable increases, they do not give information
about how much the variable will change.
• They also do not provide information on cause and effect
• Alternatively, subgroups within the cohort may be compared with each other
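The point about correlation coefficients not showing how much a variable changes can be illustrated with two made-up data sets that share the same correlation but have very different slopes. `pearson_r` and `slope` are hypothetical helpers written out by hand; the slope is the least-squares regression slope, which the notes do not discuss.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope: change in y per unit change in x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return cov / sxx

x = [1, 2, 3, 4, 5]
gentle = [2 * v for v in x]   # y rises by 2 per unit of x
steep = [10 * v for v in x]   # y rises by 10 per unit of x

print(f"gentle: r={pearson_r(x, gentle):.2f}, slope={slope(x, gentle):.1f}")
print(f"steep:  r={pearson_r(x, steep):.2f}, slope={slope(x, steep):.1f}")
```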
What is a Case-Control study?
• Case-control studies are used to identify factors that may contribute to a medical
condition by comparing subjects who have that condition (the 'cases') with
patients who do not have the condition but are otherwise similar (the 'controls')
What are Cross-Sectional Studies?
• IIa - evidence from at least one well designed controlled trial which is not
randomised
What are IIb, III, IV evidences of a study?
• Superiority, equivalence and non-inferiority are about the study design; a superiority
trial needs a large number of patients
• Equivalence: the confidence interval must lie between -delta and +delta
• Non-inferiority: the confidence interval must lie in an area not less than -delta
• Drug companies often aim for non-inferiority and then compete on price
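The margin rules above can be written as simple checks on the confidence interval of the treatment difference (new minus standard, with higher meaning better). This is a sketch under assumptions: the function names are hypothetical, the example interval and delta are made up, and the superiority rule (whole interval above zero) is a common convention that the notes do not state.

```python
def is_non_inferior(ci_low, delta):
    """The interval must not extend below -delta."""
    return ci_low > -delta

def is_equivalent(ci_low, ci_high, delta):
    """The whole interval must lie between -delta and +delta."""
    return -delta < ci_low and ci_high < delta

def is_superior(ci_low):
    """Assumed convention: the whole interval lies above zero."""
    return ci_low > 0

# Made-up result: 95% CI of the difference is -0.02 to 0.06, with delta = 0.05
ci_low, ci_high, delta = -0.02, 0.06, 0.05

print("non-inferior:", is_non_inferior(ci_low, delta))
print("equivalent:  ", is_equivalent(ci_low, ci_high, delta))
print("superior:    ", is_superior(ci_low))
```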