C 3 Inferential Statistics
INFERENTIAL STATISTICS
The purpose of inferential statistics is to determine how likely it is that a given
finding is simply the result of chance. Inferential statistics would not be
necessary if investigators studied all members of a population. However,
because we can rarely observe and study entire populations, we select
samples that are representative of the population so that we can
generalize the results from the sample to the population.
Here, we will mention some basic concepts related to the subject of hypothesis
testing.
Research typically involves measuring one or more variables for a sample and
computing descriptive statistics for that sample. In general, however, the
researcher's goal is not to draw conclusions about that sample but to draw
conclusions about the population that the sample was selected from. Thus,
researchers must use sample statistics to draw conclusions about the
corresponding values in the population. These population values, which
correspond to the variables measured in a study, are called parameters.
Example 3.1:
The mean, median, and variance are parameters of a population,
and they are denoted by 𝜇, 𝑀, and 𝜎², respectively.
The corresponding values for the sample are called the sample mean, sample
median, and sample variance, and they are denoted by X̄, 𝑚, and 𝑆²,
respectively.
There are two types of statistical hypotheses: the null hypothesis, which is
denoted by 𝐻0, and the alternative hypothesis, which is denoted by 𝐻1.
• The null hypothesis is the statement that the value of the parameter is
equal to the claimed value. We assume that the null hypothesis is true
unless the data provide sufficient evidence to reject it.
Example 3.2:
Suppose the data represent the heart rate values (in beats per minute, or bpm) for a
sample of 100 patients.
Let 𝜇 denote the mean heart rate (in bpm) for the population of all
patients. The following are examples of statistical hypotheses:
1. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 ≠ 65 (Two-sided)
2. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 > 65 (One-sided)
3. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 < 65 (One-sided)
3.3 TYPES OF ERRORS
Two types of errors can occur when we decide whether or not to reject the
null hypothesis.
1. Type I error
This error happens if we reject 𝐻0 when it is true.
2. Type II error
This error happens if we fail to reject (accept) 𝐻0 when it is false.
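The meaning of a Type I error can be illustrated with a small simulation. The sketch below assumes a two-sided z-test of 𝐻0 : 𝜇 = 65 at significance level 𝛼 = 0.05. The population standard deviation (10), the sample size (30), and the function name are hypothetical values chosen only for illustration. Because every sample is drawn from a population in which 𝐻0 is true, each rejection is a Type I error, and the rejection rate should be close to 𝛼.

```python
import random
from statistics import NormalDist


def type_i_error_rate(mu0=65.0, sigma=10.0, n=30, alpha=0.05,
                      trials=4000, seed=0):
    """Estimate how often a two-sided z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds (mean = mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:                        # rejecting a true H0
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

A Type II error rate could be estimated the same way by drawing the samples from a population whose mean differs from 65 and counting how often the test fails to reject 𝐻0.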
3.4 THE PROBABILITY VALUE
The probability value (p-value) is the smallest significance level 𝛼 at which we can
reject the null hypothesis 𝐻0.
The test statistic is used to calculate the p-value of the results, which helps to
decide whether to reject the null hypothesis.
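As an illustration of how a test statistic leads to a p-value, the sketch below uses a large-sample z-test for the three pairs of hypotheses about 𝜇 = 65 listed earlier. The sample mean (67 bpm) and standard deviation (10 bpm) are hypothetical numbers, not data from the text, and the function name is only illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_values(sample_mean, mu0, sd, n):
    """Return the z statistic and the p-values for the three alternatives."""
    z = (sample_mean - mu0) / (sd / sqrt(n))       # test statistic
    std_normal = NormalDist()                      # standard normal distribution
    p_greater = 1 - std_normal.cdf(z)              # H1: mu > mu0
    p_less = std_normal.cdf(z)                     # H1: mu < mu0
    p_two_sided = 2 * min(p_greater, p_less)       # H1: mu != mu0
    return z, p_two_sided, p_greater, p_less


# Hypothetical summary statistics for a sample of 100 patients.
z, p_two, p_gt, p_lt = z_test_p_values(sample_mean=67.0, mu0=65.0, sd=10.0, n=100)
print(f"z = {z:.2f}")
print(f"two-sided p = {p_two:.4f}, upper-tail p = {p_gt:.4f}, lower-tail p = {p_lt:.4f}")
```

𝐻0 is rejected whenever the p-value for the chosen alternative is smaller than the significance level 𝛼.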
3.5 STATISTICAL POWER
Statistical power
iv. There are a number of ways to increase statistical power. The most
common is to increase the sample size.
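The effect of sample size on power can be checked numerically. Power is the probability of rejecting 𝐻0 when it is false, that is, one minus the probability of a Type II error. The sketch below assumes a one-sided z-test of 𝐻0 : 𝜇 = 65 against 𝐻1 : 𝜇 > 65 with a known standard deviation. The true mean (68), the standard deviation (10), and the function name are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the upper-tailed z-test when the true mean is mu1 > mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)         # rejection cutoff under H0
    shift = (mu1 - mu0) * sqrt(n) / sigma          # standardized true effect
    return 1 - std_normal.cdf(z_crit - shift)      # P(reject H0 | mu = mu1)


# Power grows as the sample size grows (all other inputs held fixed).
for n in (10, 25, 50, 100):
    print(n, round(power_one_sided_z(mu0=65, mu1=68, sigma=10, n=n), 3))
```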
Example 3.3:
Answer: D
Example 3.5:
Since 95% 𝐶𝐼 does not include 1, then alcohol consumption plays a role in
rejected.
Answer: A
Example 3.6:
Answer: A
3.6 CONFIDENCE INTERVAL FOR THE MEAN
Confidence Intervals
Practice Questions
Answer: As a rule of thumb, when comparing two groups, any overlap of confidence
intervals is taken to mean that the groups are not significantly different.
Therefore, if the graph represents 95% confidence intervals, Drugs B and C are
no different in their effects; Drug B is no different from Drug A; Drug A has a
better effect than Drug C.
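$$\text{Mean} \pm z_{\alpha/2}\,\frac{SD}{\sqrt{n}}$$
SD = standard deviation
n = sample size
A 90% confidence interval is $\text{Mean} \pm 1.65\,\frac{SD}{\sqrt{n}}$
A 95% confidence interval is $\text{Mean} \pm 1.96\,\frac{SD}{\sqrt{n}}$
A 99% confidence interval is $\text{Mean} \pm 2.58\,\frac{SD}{\sqrt{n}}$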
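The interval Mean ± z × SD/√n can be computed directly from summary statistics. The sketch below is a minimal illustration. The mean (70), SD (10), and sample size (100) are hypothetical values, and the function name is not from the text. The multiplier is taken from the standard normal distribution rather than from the rounded table values, so it comes out as roughly 1.645, 1.960, and 2.576 for the 90%, 95%, and 99% levels.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level=0.95):
    """Return (lower, upper) limits of Mean +/- z * SD / sqrt(n)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z multiplier for this level
    margin = z * sd / sqrt(n)                      # half-width of the interval
    return mean - margin, mean + margin


# Hypothetical summary statistics: mean 70, SD 10, n = 100.
for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(mean=70.0, sd=10.0, n=100, level=level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})")
```

A higher confidence level gives a wider interval, and a larger sample size gives a narrower one.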
Example 3.7:
Answer: B.
Example 3.8:
Nonnormal Populations
Example 3.9:
Remark
Frequently, when the sample is large enough for the application of the central
limit theorem, the population variance is unknown. In that case we use the
sample variance as a replacement for the unknown population variance in the
formula for constructing a confidence interval for the population mean.
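The remark can be sketched in code: with a large sample from a nonnormal population, the sample variance stands in for the unknown population variance. The data below are simulated from a skewed (exponential) distribution purely for illustration. The sample size of 200, the seed, and the function name are assumptions, not values from the text.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def large_sample_ci(data, level=0.95):
    """Confidence interval for the mean, using the sample variance."""
    n = len(data)
    x_bar = mean(data)                             # sample mean
    s2 = variance(data)                            # sample variance (n - 1 divisor)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z multiplier for this level
    margin = z * sqrt(s2 / n)                      # z * S / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical skewed data: 200 draws from an exponential population with mean 5.
rng = random.Random(1)
data = [rng.expovariate(1 / 5) for _ in range(200)]
low, high = large_sample_ci(data)
print(f"95% CI for the population mean: ({low:.2f}, {high:.2f})")
```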
Confidence intervals for relative risk and odds ratios
Example 3.8:
1. If RR > 1.0, then subtract 1.0 and read as percent increase. So, 1.77
means one group has 77% more cases than the other.
2. If RR < 1.0, then subtract from 1.0 and read as reduction in risk. So, 0.78