
CHAPTER 3

INFERENTIAL STATISTICS

3.1 Hypothesis testing
3.2 Types of statistical hypotheses
3.3 Types of error
3.4 The probability value (p-value)
3.5 Statistical power
3.6 Confidence interval for the mean

The purpose of inferential statistics is to determine how likely it is that a given
finding is simply the result of chance. Inferential statistics would not be
necessary if investigators could study all members of a population. However,
because we can rarely observe and study entire populations, we try to select
samples that are representative of the entire population so that we can
generalize the results from the sample to the population.

3.1 HYPOTHESIS TESTING

Here we introduce some basic concepts related to hypothesis testing.

The Purpose of Hypothesis Testing

Research typically involves measuring one or more variables for a sample and
computing descriptive statistics for that sample. In general, however, the
researcher’s goal is not to draw conclusions about that sample but to draw
conclusions about the population that the sample was selected from. Thus,
researchers must use sample statistics to draw conclusions about the
corresponding values in the population. These population values that
correspond to the variables measured in a study are called parameters.

For example, a researcher measures the number of depressive symptoms
exhibited by each of 50 clinically depressed adults and computes the mean
number of symptoms. The researcher probably wants to use this sample
statistic (the mean number of symptoms for the sample) to draw conclusions
about the corresponding population parameter (the mean number of symptoms
for clinically depressed adults).

Example 3.1:

For example, the mean, median, and variance are parameters of a population,
and they are denoted by 𝜇, M, and 𝜎² respectively.

The corresponding values for the sample are called the sample mean, sample
median, and sample variance, and they are denoted by 𝑋̅, m, and S²
respectively.

3.2 TYPES OF STATISTICAL HYPOTHESES

There are two types of statistical hypotheses: the null hypothesis, which is
denoted by 𝐻0 , and the alternative hypothesis, which is denoted by 𝐻1 .

• The null hypothesis is the statement that the value of the parameter is
equal to the claimed value. We assume that the null hypothesis is true
until the data provide evidence that it is not.

• The alternative hypothesis is the statement that the value of the
parameter differs in some way from the value claimed in the null hypothesis.

Example 3.2:

Suppose the data represent heart rate values (in beats per minute, bpm) for a
sample of 100 patients.
Let 𝜇 denote the mean of heart rate beats per minute for the population of all
patients. The following are examples of statistical hypotheses

1. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 ≠ 65 (Two-sided)

2. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 > 65 (One-sided)

3. 𝐻0 : 𝜇 = 65
𝐻1 : 𝜇 < 65 (One-sided)

Hypothesis testing is an approach to inferential statistics. It involves using
sampling distributions and the laws of probability to make an objective
decision about whether to reject or fail to reject the null hypothesis.
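As an illustration (not part of the original notes), the hypotheses in Example 3.2
could be tested in Python with a one-sample t-test; the heart-rate data below are
simulated, so the sample, the random seed, and the resulting p-values are purely
hypothetical.

# A minimal sketch, assuming simulated heart-rate data for 100 patients.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
heart_rate = rng.normal(loc=68, scale=10, size=100)  # hypothetical sample, not real patients

# 1. Two-sided test of H0: mu = 65 versus H1: mu != 65
t_stat, p_two_sided = stats.ttest_1samp(heart_rate, popmean=65)

# 2. One-sided alternative H1: mu > 65
_, p_greater = stats.ttest_1samp(heart_rate, popmean=65, alternative='greater')

# 3. One-sided alternative H1: mu < 65
_, p_less = stats.ttest_1samp(heart_rate, popmean=65, alternative='less')

print(t_stat, p_two_sided, p_greater, p_less)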

3.3 TYPES OF ERRORS

There are two types of errors that can occur when we decide whether or not to
reject the null hypothesis.

1. Type I error
This error happens if we reject 𝐻0 when it is true.

2. Type II error
This error happens if we fail to reject (accept) 𝐻0 when it is false.

The level of significance is the probability of a Type I error, and it is denoted
by 𝛼. Usually, the value of 𝛼 is small; most researchers take 𝛼 equal to one of
the following values: 0.01, 0.05, or 0.10.
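To make the meaning of 𝛼 concrete, the following sketch (not from the original
notes; the data are simulated under an assumed normal model) repeatedly tests a
null hypothesis that is actually true and records how often it is rejected; in the
long run, the rejection rate, i.e., the Type I error rate, is close to 𝛼.

# A minimal simulation sketch: H0 is true in every trial, so each rejection is a Type I error.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_trials = 10_000
rejections = 0

t_crit = stats.t.ppf(1 - alpha / 2, df=29)  # two-sided critical value for n = 30

for _ in range(n_trials):
    sample = rng.normal(loc=65, scale=10, size=30)  # true mean is 65, so H0: mu = 65 holds
    t_stat, _ = stats.ttest_1samp(sample, popmean=65)
    if abs(t_stat) >= t_crit:
        rejections += 1  # a Type I error

print(rejections / n_trials)  # approximately 0.05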

3.4 THE PROBABILITY VALUE (P-VALUE)

The probability value (p-value) is the smallest value of 𝛼 for which we can
reject the null hypothesis 𝐻0 .

Using the p-value as a decision tool

1. If the p-value is less than or equal to α (p-value ≤ α), we reject the null
hypothesis.
2. If the p-value is greater than α (p-value > α), we do not reject the null
hypothesis.

A test statistic is a number calculated from a statistical test of a hypothesis.
It shows how closely your observed data match the distribution expected
under the null hypothesis of that statistical test.

The test statistic is used to calculate the p-value of your results, helping to
decide whether to reject your null hypothesis.
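A minimal sketch of this workflow, assuming a one-sample z-test with a known
population standard deviation; the numbers below are hypothetical and only
illustrate how a test statistic is turned into a p-value and compared with 𝛼.

import math
from scipy import stats

# Hypothetical inputs: sample mean 68, hypothesized mean 65,
# known population SD 10, sample size 100, significance level 0.05
x_bar, mu_0, sigma, n, alpha = 68.0, 65.0, 10.0, 100, 0.05

# z test statistic for H0: mu = 65
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Two-sided p-value from the standard normal distribution
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

print(z, p_value)
print("Reject H0" if p_value <= alpha else "Do not reject H0")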

Types of statistical tests


There are two types of statistical tests used in hypothesis testing:

1. Parametric tests: In parametric tests, the probability distribution of the
population must be known.

2. Non-parametric tests: In non-parametric tests, the probability
distribution of the population does not need to be known.

3.5 STATISTICAL POWER

Statistical power

i. In statistics, power is the probability of detecting a difference when one
truly exists.

ii. Increasing statistical power makes it more likely that a real effect in the
data will be detected.

iii. Power is directly related to the Type II error rate 𝛽: Power = 1 − 𝛽.

iv. There are a number of ways to increase statistical power. The most
common is to increase the sample size, as the sketch below illustrates.
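As an illustration not taken from the original notes, the sketch below computes
the power of a two-sided one-sample z-test under an assumed true mean and shows
how power grows with sample size; all of the numbers are hypothetical.

import math
from scipy import stats

def z_test_power(mu_0, mu_true, sigma, n, alpha=0.05):
    # Power of a two-sided one-sample z-test when the true mean is mu_true
    z_crit = stats.norm.ppf(1 - alpha / 2)
    shift = (mu_true - mu_0) / (sigma / math.sqrt(n))
    # Probability of landing in the rejection region when the true mean is mu_true
    return (1 - stats.norm.cdf(z_crit - shift)) + stats.norm.cdf(-z_crit - shift)

# Power increases as the sample size grows (hypothetical means and SD)
for n in (10, 30, 100):
    print(n, round(z_test_power(mu_0=65, mu_true=68, sigma=10, n=n), 3))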

Example 3.3:

Answer: E (The power of the test)


Example 3.4:

Type II error 𝛽 = 0.20

The power of the test = 1 − 𝛽 = 0.80

Answer: D

Example 3.5:

Since the 95% CI does not include 1, alcohol consumption plays a role in the
occurrence of breast carcinoma; therefore, the null hypothesis is rejected at
the 5% level. Rejection at the 5% level requires p-value ≤ 0.05, so the
p-value is 0.03.

Answer: A
Example 3.6:

Answer: A

3.6 CONFIDENCE INTERVAL FOR THE MEAN

Confidence Intervals

Confidence intervals are a way of acknowledging that any measurement from a
sample is only an estimate of the population value. Although the estimate from
the sample is likely to be close, the true population value may be above or
below the sample value. A confidence interval specifies a range, from a
possible high to a possible low, within which the population value is likely to
lie relative to the sample-based value. The true value, therefore, is most
likely to be somewhere within the specified range.

Practice Questions

1. Assuming the graph presents 95% confidence intervals, which groups, if
any, are statistically different from each other?

Answer: When comparing two groups, any overlap of confidence intervals
means the groups are not significantly different. Therefore, if the graph
represents 95% confidence intervals, Drugs B and C are no different in their
effects; Drug B is no different from Drug A; Drug A has a better effect than
Drug C.

Confidence interval for the mean of a normally distributed population

A (1 − 𝛼)100% confidence interval (CI) for the unknown mean of a normally
distributed population is given by

Mean ± 𝑧𝛼/2 × SD/√𝑛

where

Mean = 𝑋̅ (the mean of the sample)
SD = standard deviation
𝑛 = sample size
𝑧𝛼/2 = z-score for the chosen confidence level

Confidence level     90%       95%       99%
𝛼                    0.10      0.05      0.01
𝛼/2                  0.05      0.025     0.005
𝑧𝛼/2                 1.645     1.96      2.58

A 90% confidence interval is Mean ± 1.645 × SD/√𝑛

A 95% confidence interval is Mean ± 1.96 × SD/√𝑛

A 99% confidence interval is Mean ± 2.58 × SD/√𝑛

Example 3.7:

Answer: B.

Example 3.8:

A physical therapist wished to estimate, with 99 percent confidence, the mean
maximal strength of a particular muscle in a certain group of individuals. He
is willing to assume that strength scores are approximately normally
distributed with a variance of 144. A sample of 15 subjects who participated
in the experiment yielded a mean of 84.3.

The 99% confidence interval is 84.3 ± 2.58 × 12/√15 = 84.3 ± 8.0.

Answer: 76.3 to 92.3, i.e., (76.3, 92.3)

We say we are 99 percent confident that the population mean is between 76.3
and 92.3 since, in repeated sampling, 99 percent of all intervals that could be
constructed in the manner just described would include the population mean.

Nonnormal Populations

It will not always be possible to assume that the population of interest is
normally distributed. Using the central limit theorem, we have learned that for
large samples (greater than 30), the sampling distribution of 𝑥̅ is
approximately normally distributed regardless of how the population is
distributed.

Example 3.9:

Punctuality of patients in keeping appointments is of interest to a research
team. In a study of patient flow through the offices of general practitioners,
it was found that a sample of 35 patients was, on average, 17.2 minutes late
for appointments. Previous research had shown the standard deviation to be
about 8 minutes. The population distribution was felt to be nonnormal. What
is the 90 percent confidence interval for 𝜇, the true mean amount of time late
for appointments?

The 90% confidence interval is 17.2 ± 1.645 × 8/√35 = 17.2 ± 2.2.

Answer: 15.0 to 19.4 (15.0, 19.4)

Remark

Frequently, when the sample is large enough for the application of the central
limit theorem, the population variance is unknown. In that case we use the
sample variance as a replacement for the unknown population variance in the
formula for constructing a confidence interval for the population mean.

Confidence intervals for relative risk and odds ratios

If the given confidence interval contains 1.0, then there is no statistically
significant effect of exposure.

Example 3.10:

1. If RR > 1.0, then subtract 1.0 and read as a percent increase. So, 1.77
means one group has 77% more cases than the other.

2. If RR < 1.0, then subtract from 1.0 and read as a reduction in risk. So, 0.78
means one group has a 22% reduction in risk.
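As an illustration not taken from the original notes, the sketch below computes
a relative risk from a hypothetical 2×2 table together with an approximate 95%
confidence interval on the log scale; all counts are made up, and the
log-transformation approach is a standard approximation rather than the
specific method of the text.

import math
from scipy import stats

# Hypothetical 2x2 table (all counts are made up for illustration):
#                 Disease   No disease
# Exposed            a=30        b=70
# Unexposed          c=20        d=80
a, b, c, d = 30, 70, 20, 80

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
rr = risk_exposed / risk_unexposed

# Approximate 95% CI for RR using the log transformation (normal approximation)
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
z = stats.norm.ppf(0.975)  # about 1.96
lower = math.exp(math.log(rr) - z * se_log_rr)
upper = math.exp(math.log(rr) + z * se_log_rr)

print(round(rr, 2), (round(lower, 2), round(upper, 2)))
# If the interval contains 1.0, the effect of exposure is not statistically significant.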
