Hypothesis Testing
WHAT IS HYPOTHESIS TESTING?
• Hypothesis testing is a form of statistical inference that uses data from a sample to draw
conclusions about a population parameter or a population probability distribution. First, a
tentative assumption is made about the parameter or distribution. This assumption is called the
null hypothesis and is denoted by H0. An alternative hypothesis (denoted Ha), which is the
opposite of what is stated in the null hypothesis, is then defined. The hypothesis-testing
procedure involves using sample data to determine whether or not H0 can be rejected. If H0 is
rejected, the statistical conclusion is that the alternative hypothesis Ha is true.
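To make the procedure concrete, here is a minimal sketch in Python using SciPy. The scenario (testing whether a coin is fair from 100 hypothetical flips) and all numbers are assumptions for illustration only.

```python
from scipy import stats

# Hypothetical example: test H0 "the coin is fair (p = 0.5)"
# against Ha "the coin is biased (p != 0.5)".
heads, flips = 61, 100  # assumed sample data
result = stats.binomtest(heads, flips, p=0.5, alternative="two-sided")

print(f"p-value = {result.pvalue:.4f}")
# If the p-value falls below the chosen significance level
# (e.g. 0.05), H0 is rejected in favour of Ha.
```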
PARAMETRIC VS NON-PARAMETRIC TESTING
• In statistics, a parametric test is a hypothesis test that makes assumptions about the parameters of the population distribution and draws generalizations about the mean of the original population. The t-test, for example, is carried out using Student's t-statistic. It rests on the underlying hypothesis that the variable is normally distributed; the mean is known, or is considered to be known, and the population variance is estimated from the sample. The variables of concern are assumed to be measured on an interval scale.
• The non-parametric test requires no assumptions about the population distribution or its parameters. It is also a hypothesis test, but one that does not rest on an underlying distributional hypothesis, which is why it is also called a distribution-free test. Instead of the mean, the test is based on differences in the median. The test variables are measured at the nominal or ordinal level, and when the independent variables are non-metric, a non-parametric test is usually performed.
What are examples of parametric tests?
The t-test and z-test are examples of parametric tests.
What are examples of non-parametric tests?
The Kruskal-Wallis and Mann-Whitney tests are examples of non-parametric tests.
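The contrast is easy to see in code. Below is a sketch, assuming two hypothetical independent samples, that runs a parametric t-test and its common non-parametric counterpart, the Mann-Whitney U test, side by side with SciPy; the data are synthetic and for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two hypothetical independent samples (values are illustrative only).
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=55, scale=10, size=30)

# Parametric: independent two-sample t-test (assumes normality).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: Mann-Whitney U test (rank-based, distribution-free).
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       statistic={t_stat:.3f}, p={t_p:.4f}")
print(f"Mann-Whitney: statistic={u_stat:.3f}, p={u_p:.4f}")
```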
Definition of Z-Test
The z-test is a statistical test used to determine whether two population means differ when the variances are known and the sample size is large. It is based on the normal distribution.
The assumptions for Z-test are:
•All observations are independent.
•The size of the sample should be more than 30.
•The Z statistic follows the standard normal distribution, with mean 0 and variance 1.
The test statistic is defined by:
z = (X̄ − μ) / (σ / √n)
where X̄ is the sample mean, μ is the population mean, σ is the population standard deviation, and n is the sample size.
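As a sketch, the statistic and a two-sided p-value can be computed directly; the population parameters and sample values below are assumed for illustration.

```python
import math
from scipy import stats

# Hypothetical values: the z-test assumes the population
# parameters are known.
mu, sigma = 100, 15   # population mean and standard deviation (assumed known)
x_bar, n = 104, 50    # sample mean and sample size (n > 30)

z = (x_bar - mu) / (sigma / math.sqrt(n))
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value from the standard normal

print(f"z = {z:.3f}, p = {p_value:.4f}")
```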
• In statistics, a Type I error is a false positive conclusion, while a Type II error is a false
negative conclusion.
• Making a statistical decision always involves uncertainties, so the risks of making these errors
are unavoidable in hypothesis testing.
• The probability of making a Type I error is the significance level, or alpha (α), while the
probability of making a Type II error is beta (β). These risks can be minimized through careful
planning in your study design.
• Example: Type I vs Type II error
• You decide to get tested for COVID-19 based on mild symptoms. There are two errors that
could potentially occur:
• Type I error (false positive): the test result says you have coronavirus, but you actually don’t.
• Type II error (false negative): the test result says you don’t have coronavirus, but you
actually do.
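A small simulation makes the link between α and the Type I error rate concrete. The sketch below, under assumed settings (α = 0.05, two t-tested samples drawn from the same distribution so that H0 is true by construction), counts how often H0 is wrongly rejected.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, trials = 0.05, 10_000
false_positives = 0

# Both samples come from the SAME distribution, so H0 is true;
# every rejection is therefore a Type I error (false positive).
for _ in range(trials):
    a = rng.normal(0, 1, size=30)
    b = rng.normal(0, 1, size=30)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(f"Observed Type I error rate ≈ {false_positives / trials:.3f}")
# This should come out close to alpha (0.05).
```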
P-VALUE, SIGNIFICANCE LEVEL, AND CONFIDENCE INTERVAL
• The p-value, significance level, and confidence interval are vital concepts in statistical hypothesis
testing. The p-value measures the strength of evidence against a null hypothesis. A small p-value
(typically less than 0.05) suggests strong evidence against the null hypothesis, indicating that the
observed data is unlikely if the null hypothesis is true. The significance level, often denoted as
alpha (α), is predetermined and represents the threshold for determining statistical significance.
It's the probability of rejecting the null hypothesis when it's actually true. Confidence intervals, on
the other hand, provide a range of values within which we are confident that the true population
parameter lies. Typically, a 95% confidence interval is used, implying that if the experiment were
repeated many times, 95% of the intervals would contain the true parameter. These statistical
tools collectively guide researchers in drawing conclusions from data analysis, helping to ensure
the reliability and validity of research findings.
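A short sketch of a 95% confidence interval for a mean, computed with SciPy from a hypothetical sample (all values are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=10, scale=2, size=40)  # hypothetical data

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1,
                                   loc=mean, scale=sem)

print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
# Across many repeated experiments, about 95% of such intervals
# would contain the true population mean.
```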
NORMALITY TESTS
• Normality tests, such as the Shapiro-Wilk and Kolmogorov-Smirnov tests, assess whether a
dataset follows a normal distribution, a fundamental assumption in many statistical analyses.
The Shapiro-Wilk test is sensitive to deviations from normality in smaller sample sizes, while
the Kolmogorov-Smirnov test is useful for larger datasets. These tests help researchers
determine if their data can be appropriately analyzed using methods that assume normality, like
parametric tests. If the data fails the normality test, alternative approaches may be necessary,
ensuring accurate and reliable conclusions from statistical analysis.
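Both tests are available in SciPy. The sketch below runs them on a hypothetical sample; note that fitting the normal parameters from the same data makes the Kolmogorov-Smirnov p-value only approximate (the Lilliefors variant corrects for this).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=0, scale=1, size=100)  # hypothetical sample

# Shapiro-Wilk: well suited to small-to-moderate samples.
sw_stat, sw_p = stats.shapiro(data)

# Kolmogorov-Smirnov against a normal distribution fitted to the data.
ks_stat, ks_p = stats.kstest(data, "norm",
                             args=(data.mean(), data.std(ddof=1)))

print(f"Shapiro-Wilk:       W={sw_stat:.3f}, p={sw_p:.4f}")
print(f"Kolmogorov-Smirnov: D={ks_stat:.3f}, p={ks_p:.4f}")
# A p-value above the significance level (e.g. 0.05) means the data
# show no significant departure from normality.
```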
CENTRAL LIMIT THEOREM
• The Central Limit Theorem (CLT) is a key statistical principle stating that regardless of the
population's distribution, the distribution of sample means tends to approximate a normal
distribution with a large enough sample size. This theorem enables reliable inference about
population parameters from sample data, even when the population distribution is unknown or
non-normal. It forms the foundation for many statistical analyses, allowing us to make accurate
conclusions and decisions based on sample data.
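The CLT can be demonstrated with a short simulation. This sketch draws samples from an exponential population, which is strongly skewed and clearly non-normal, and shows the skewness of the sample means shrinking toward zero (the normal value) as the sample size grows; all settings are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Population: exponential(1), i.e. strongly skewed and non-normal.
for n in (2, 30, 500):
    # 10,000 samples of size n; take the mean of each sample.
    means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(f"n={n:>3}: skewness of sample means = {stats.skew(means):+.3f}")
# The theoretical skewness of the mean of n exponentials is 2/sqrt(n),
# so it shrinks toward 0 as n grows -- the CLT at work.
```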