BIOS O6S A4


BIOSTATISTICS 06

INFERENCES & HYPOTHESIS TESTING
Prof Francis Martinson
MAIN USES OF STATISTICS
 Making inferences
  Tests of significance/hypothesis testing
 Confidence interval

 Estimation
 Regression
 Linear regression
 Logistic regression

 Survival analysis

 Correlation
OUTLINE

 Hypothesis testing/Tests of significance
 Numerical variables
 Difference between means
 Paired data
 Proportions
AN OBSERVED DIFFERENCE IN A PARAMETER BETWEEN TWO GROUPS
 E.g. treated vs. control, may be a result of:
 Sampling variation or inherent differences between the two groups
 Differences in the handling and end-point evaluation (ascertainment) of the two groups during the course of the investigation
 Chance
 The true effect of the new procedure/drug
HOW DO WE DECIDE IF THE EFFECT IS TRUE OR DUE TO CHANCE?

 Statistical Inference
 Two main techniques are used:
1. Significance testing & Test of Association (the p-value)
 Null hypothesis vs. Alternate hypothesis

2. Confidence intervals
TEST OF SIGNIFICANCE & P-VALUE
 Test of significance
 A test of significance is a method for assessing whether chance can be ruled out as an explanation of the observed difference.
 Used more for numerical data

 P-value
 The probability that random sampling from the population would produce a sample statistic (e.g. a mean) as deviant as, or more deviant than, the statistic observed.
 That is, the probability of a result at least this extreme arising by chance alone when the null hypothesis is true
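A minimal sketch of how a p-value can be read from the standard normal distribution, assuming the test statistic is a z score. The function names are illustrative and do not come from the slides.

```python
import math


def normal_tail_area(z):
    """Upper-tail area P(Z >= z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_sided_p_value(z):
    """Probability of a statistic at least as deviant as z in either direction."""
    return 2 * normal_tail_area(abs(z))


# Area beyond 1.96 in one tail, and beyond -1.96 and +1.96 together
print(round(normal_tail_area(1.96), 4))
print(round(two_sided_p_value(1.96), 4))
```

The one-tail area is what a printed z table usually gives; doubling it gives the two-sided p-value.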


THE NORMAL DISTRIBUTION &
STANDARD NORMAL DISTRIBUTION

CHARACTERISTICS OF STANDARD DEVIATIONS & VARIANCES
Standard deviation formula
 Discrete or continuous variables
 $sd = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
 Proportion
 $sd = \sqrt{\frac{pq}{n}}$, where $q = (1 - p)$
 Note: Standard deviation is the square root of the variance
Variance of 2 variables/proportions when combined
 Computed as the addition of the two variances of the variables of interest
 Standard deviation of 2 discrete or continuous variables together
 $sd = \sqrt{s_1^2 + s_2^2}$
 Standard deviation of 2 proportions together
 $sd = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
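The rule that variances add, while standard deviations do not, can be illustrated with a short sketch. The helper names are hypothetical, and the proportion formula sqrt(pq/n) is an assumption taken from the usual textbook form.

```python
import math


def sd_proportion(p, n):
    """Standard deviation of a sample proportion, assumed to be sqrt(p*q/n)."""
    q = 1 - p
    return math.sqrt(p * q / n)


def combined_sd(sd1, sd2):
    """Add the two variances, then take the square root."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2)


# Variances 9 and 16 add to 25, so the combined sd is 5.0, not 3 + 4 = 7
print(combined_sd(3, 4))

# Two proportions together: combine their individual standard deviations
print(combined_sd(sd_proportion(0.5, 100), sd_proportion(0.5, 100)))
```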
PARTS OF A STATISTICAL TEST OF SIGNIFICANCE
1. State your hypotheses
 Null hypothesis, denoted by Ho
 Research hypothesis (also called the alternate hypothesis), denoted by Ha
2. Choose the test statistic
3. State the significance level/rejection region, e.g. 0.05
4. Do your calculation
5. Check the p-value from the appropriate distribution table
6. Make a decision to either “Reject Ho” or “Not reject Ho”
7. Conclusion
2 TYPES OF ERROR FROM STATISTICAL TESTING
 Type I error
 This is committed if we reject the null hypothesis when it is true.
 The probability of type I error is denoted by the symbol α

 Type II error
 This is committed if we fail to reject the null hypothesis when it is false and the alternate hypothesis is true.
 The probability of type II error is denoted by the symbol β

Decision            Ho true             Ho false
Reject Ho           Type I error (α)    Correct
Do not reject Ho    Correct             Type II error (β)

 It is not possible to control both errors at the same time in real life
 Typically one (β) is compromised for the other (α)
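The two error rates can be estimated by simulation. The sketch below is only an illustration under stated assumptions: a two-sided z-test with known standard deviation at the 0.05 level, samples of 25 from a normal population, and a true mean of 0.5 when Ho is false. None of these settings come from the slides.

```python
import math
import random


def z_test_rejects(sample, mu0, sigma, z_crit=1.96):
    """Two-sided z-test of Ho: mean = mu0 with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return abs(z) > z_crit


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25, trials=4000, seed=1):
    """Fraction of simulated samples in which Ho is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if z_test_rejects(sample, mu0, sigma):
            rejections += 1
    return rejections / trials


# Ho true: every rejection is a Type I error, so the rate estimates alpha
type_i_rate = rejection_rate(true_mean=0.0)

# Ho false: every failure to reject is a Type II error, so the rate estimates beta
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(type_i_rate, type_ii_rate)
```

Lowering `z_crit` in this sketch raises the Type I rate and lowers the Type II rate, which is the trade-off described above.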
Common test statistic | Situation | Example
Student’s t-test | For group sample data | Mean haemoglobin level in mothers who attended antenatal clinic versus non-attendants
Student’s t-test | For paired data | Change in haemoglobin levels after iron supplementation
Z-score test | For proportions | Proportion of children with parasitaemia in urban population and rural population
Chi-square test | Categorical variables | Association of antenatal attendance and outcome of labour
F-test for analysis of variance (ANOVA) | Comparison of means in more than two groups | Mean haemoglobin levels in urban, peri-urban and rural mothers
Correlation coefficient | Strength of association | Blood sugar levels and systolic blood pressure level
Regression analysis | To determine mathematical relationship between two variables | Relationship between height and weight of children under the age of five years
Percentage Points of Student’s t Distribution
This table gives percentage points of the t-distribution on v d.f. These are the values of t for which a given percentage, P, of the t-distribution lies outside the range -t to +t (P/2 in each tail). As the number of degrees of freedom increases, the distribution becomes closer to the standard normal distribution.

P      50     20     10     5      2      1      0.2    0.1
v=1    1.00   3.08   6.31   12.7   31.8   63.7   318    637
  2    0.82   1.89   2.92   4.30   6.96   9.92   22.3   31.6
  3    0.76   1.64   2.35   3.18   4.54   5.84   10.2   12.9
  4    0.74   1.53   2.13   2.78   3.75   4.60   7.17   8.61
  5    0.73   1.48   2.02   2.57   3.36   4.03   5.89   6.87
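Reading the table can be sketched as a simple lookup. The code below only encodes the rows shown above (v = 1 to 5); the names `critical_t` and `is_significant` are illustrative.

```python
# Column headings: percentage P of the distribution lying outside -t to +t
P_COLUMNS = [50, 20, 10, 5, 2, 1, 0.2, 0.1]

# Rows of the table, keyed by degrees of freedom v
T_TABLE = {
    1: [1.00, 3.08, 6.31, 12.7, 31.8, 63.7, 318, 637],
    2: [0.82, 1.89, 2.92, 4.30, 6.96, 9.92, 22.3, 31.6],
    3: [0.76, 1.64, 2.35, 3.18, 4.54, 5.84, 10.2, 12.9],
    4: [0.74, 1.53, 2.13, 2.78, 3.75, 4.60, 7.17, 8.61],
    5: [0.73, 1.48, 2.02, 2.57, 3.36, 4.03, 5.89, 6.87],
}


def critical_t(df, p_percent=5):
    """Tabulated t for the given degrees of freedom and two-sided percentage P."""
    return T_TABLE[df][P_COLUMNS.index(p_percent)]


def is_significant(t, df, p_percent=5):
    """True if |t| lies beyond the tabulated value, i.e. p < P%."""
    return abs(t) > critical_t(df, p_percent)


print(critical_t(4, 5))         # 2.78
print(is_significant(3.0, 4))   # True
```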


T DISTRIBUTION TABLE
TEST OF SIGNIFICANCE BETWEEN TWO SAMPLE MEANS

 Two independent groups.

 The test statistic is the ratio of the difference between the two means to the standard deviation (standard error) of the difference between the means

 $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

 Degrees of freedom are given by

 the sum of the two sample sizes minus two, i.e. $(n_1 + n_2 - 2)$

 Ho: $(\mu_1 - \mu_2) = 0$ and Ha: $(\mu_1 - \mu_2) \neq 0$


EXAMPLE:
 Comparison of mean heights of male and female preschool children.
                  M      F
 mean height      79     76
 standard dev.    6.2    8.2
 sample size      40     75
 Specify the null hypothesis and test it.
 Ho: $(\mu_1 - \mu_2) = 0$ and Ha: $(\mu_1 - \mu_2) \neq 0$
 df = (40 + 75 - 2) = 113 (2-sided test)
 $t = \frac{79 - 76}{\sqrt{\frac{6.2^2}{40} + \frac{8.2^2}{75}}} = \frac{3}{1.363} = 2.201$
 p-value = 0.0139, which is less than 0.025 (2-sided test)
 Reject Ho, i.e. there is a difference in the heights of males and females
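The height example can be checked with a few lines of code. This is a sketch that assumes the standard error of the difference is sqrt(s1²/n1 + s2²/n2), which reproduces the t value quoted in the example; the function name is illustrative.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and degrees of freedom for two independent sample means."""
    # Variances of the two sample means add; the square root is the
    # standard deviation (standard error) of the difference
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / se_diff
    df = n1 + n2 - 2
    return t, df


# Males: mean 79, sd 6.2, n 40; females: mean 76, sd 8.2, n 75
t, df = two_sample_t(79, 6.2, 40, 76, 8.2, 75)
print(f"t = {t:.3f}, df = {df}")
```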
TEST OF SIGNIFICANCE FOR PAIRED DATA
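 For a paired data set, the test is conducted on the sample of the differences between the before and after measurements.
 The test statistic has the same form as for the difference between 2 population means

 $t = \frac{\bar{d}}{s_d / \sqrt{n}}$, where $\bar{d}$ is the mean of the differences and $s_d$ is their standard deviation

 the df is $(n - 1)$

 Remember Ho: $(x - y) = 0$ and Ha: $(x - y) \neq 0$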


 EXAMPLE
 Below are the before- and after-medication systolic BP of 11 patients. Did the medication make a significant difference in their wellbeing?

Pat. ID   Sys.BP BEFORE   Sys.BP AFTER   DIFFERENCE in Sys.BP (d)
1         211             181            30
2         210             172            38
3         210             196            14
4         203             191            12
5         196             167            29
6         190             161            29
7         191             178            13
8         177             160            17
9         173             149            24
10        170             119            51
11        163             156            7
TOT                                      264

 Ho: µd = 0 and Ha: µd ≠ 0
 mean of the differences: $\bar{d}$ = 264/11 = 24
 variance of the differences = 171.4, so sd(d) = √171.4 = 13.09
 $t = \frac{24}{13.09 / \sqrt{11}} = \frac{24}{3.947} = 6.08$
 p-value < .025
 Reject Ho, i.e. reject the claim that there is no difference
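The paired calculation can be reproduced directly from the 11 before and after readings. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import math

before = [211, 210, 210, 203, 196, 190, 191, 177, 173, 170, 163]
after = [181, 172, 196, 191, 167, 161, 178, 160, 149, 119, 156]

# Work on the differences, one per patient
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

mean_d = sum(diffs) / n
# Sample variance of the differences (divisor n - 1)
var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
sd_d = math.sqrt(var_d)

# Mean difference divided by its standard error, with n - 1 degrees of freedom
t = mean_d / (sd_d / math.sqrt(n))
df = n - 1

print(f"mean = {mean_d:.0f}, variance = {var_d:.1f}, t = {t:.2f}, df = {df}")
```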
DIFFERENCE IN TWO PROPORTIONS
 Test of significance on proportions
 The test statistic used is the z‑test.

 Hypothesis
 Ho: (p1-p2) = 0 and Ha: (p1-p2) ≠ 0

 Test statistic
 $z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$, where $q = (1 - p)$

Example 1: The attack rate of para‑influenza virus type II in nurses is known to be 40%. In a sample of 60 nurses an attack rate of 56% was detected. Is this attack rate out of the ordinary?

Example 2: Survival rates in the 67 patients of the treated group and the 24 of the control group were 84.4% and 63% respectively. Is the survival rate in the treated group better than in the control group?
EXAMPLE 1
 The attack rate of para‑influenza virus type II in nurses is known to be 40%. In a sample of 60 nurses an attack rate of 56% was detected. Is this attack rate out of the ordinary?
 p1 = .56, p2 = .40
 Hypothesis
 Ho: (p1-p2) = 0 and Ha: (p1-p2) ≠ 0 (2-tail test, i.e. p < 0.025)
 Test statistic
 z = 1.985
 p-value = 0.0239, which is less than 0.025
 Meaning: Reject Ho, implying the attack rate is out of the ordinary
EXAMPLE 2
 Survival rates in the 67 patients of the treated group and the 24 of the control group were 84.4% and 63% respectively. Is the survival rate in the treated group better than in the control group? Use α = .05
 p1 = .844, p2 = .63
 Hypothesis
 Ho: (p1-p2) = 0 and Ha: (p1-p2) > 0 (1-tail test, i.e. p < 0.05)
 Test statistic
 $z = \frac{0.844 - 0.63}{\sqrt{\frac{0.844 \times 0.156}{67} + \frac{0.63 \times 0.37}{24}}} = \frac{0.214}{0.108} = 1.981$
 p-value = 0.0239
 Meaning: Reject Ho, implying the survival rate in the treated group is better than that in the control group
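Example 2 can be checked numerically. The sketch assumes the unpooled standard error sqrt(p1q1/n1 + p2q2/n2), which gives about 1.980, close to the 1.981 quoted once the denominator is rounded to 0.108. The function names are illustrative.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    # Assumed unpooled standard error: the two variances p*q/n are added
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se_diff


def upper_tail_area(z):
    """One-tail area P(Z >= z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Treated: 84.4% of 67 patients; control: 63% of 24 patients
z = two_proportion_z(0.844, 67, 0.63, 24)
p_one_tail = upper_tail_area(z)

print(f"z = {z:.3f}, one-tail p = {p_one_tail:.4f}")
print("Reject Ho" if p_one_tail < 0.05 else "Do not reject Ho")
```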
Z DISTRIBUTION TABLE
QUESTIONS?
