Module 3-4
Hypothesis
Introduction to Hypothesis
In the context of research, a hypothesis is a testable statement or proposition
that serves as a starting point for investigation. It is a tentative assumption or
prediction about the relationship between two or more variables.
A hypothesis is an assumption that is made based on some evidence. This is
the initial point of any investigation that translates the research questions into
predictions. It includes components like variables, population and the relation
between the variables.
What is a hypothesis?
A hypothesis is a foundational concept in scientific research and inquiry. It
serves as a starting point for investigation and is formulated based on
existing knowledge, observations, and theories.
Types of Hypothesis
• Simple Hypothesis: A simple hypothesis predicts a
relationship between two variables. For example,
"Increasing sunlight exposure leads to higher plant
growth."
• Complex Hypothesis: A complex hypothesis predicts
relationships among multiple variables. For example,
"The interaction between sleep quality, stress levels,
and dietary habits affects academic performance."
• Directional Hypothesis: This type of hypothesis
predicts the direction of the relationship between
variables. For example, "Higher levels of exercise
lead to decreased body weight."
• Non-Directional Hypothesis: Also known as a two-tailed
hypothesis, it predicts a relationship between variables without
specifying the direction. For example, "There is a relationship
between caffeine consumption and sleep quality."
• Associative Hypothesis: This hypothesis predicts a correlation
or association between variables without necessarily implying
a cause-and-effect relationship. For example, "There is an
association between smartphone use and eye strain."
• Causal Hypothesis: A causal hypothesis proposes a cause-and-
effect relationship between variables. For example, "Increased
sugar intake causes higher instances of dental cavities."
Two major types of Hypothesis
• Null Hypothesis (H0): This is a statement of no effect
or no relationship. It suggests that there is no
significant difference or connection between variables.
Researchers often aim to test the null hypothesis to
determine whether there is enough evidence to reject it
in favor of an alternative hypothesis.
• Alternative Hypothesis (H1 or Ha): Also known as the
research hypothesis, this is the statement that suggests
a specific effect or relationship between variables. It is
the hypothesis researchers are trying to support with
their data.
Sources of Hypothesis
• Resemblance between phenomena.
• Observations from past studies, present-day experiences
and from the competitors.
• Scientific theories.
• General patterns that influence the thinking process of
people.
• Personal Experience
• Imagination & Thinking
• Previous Study
• Culture
Steps involved in Testing a Hypothesis
• State the Null Hypothesis and the Alternative Hypothesis
• Set the criteria for the decision (Level of Significance):-
It is the probability of rejecting the null hypothesis when it is true, i.e. the
probability of committing a Type I error
It is set prior to conducting the hypothesis test
It is commonly set at 5% or lower.
For example, a significance level of 5% indicates a 5% risk of concluding that a
difference exists when there is no actual difference
A lower significance level indicates that stronger evidence is required before
rejecting the null hypothesis
It is denoted by alpha.
• Collect the data (Select your sample)
• Decide which test is to be performed
• Compute the test statistic (test value)
• Find the critical value:-
The critical value is the cutoff value that is compared with the test
value to take a decision about the null hypothesis
It divides the distribution of the test statistic into two sections: the rejection area and the acceptance area
If the test value falls into the rejection area, then reject the null hypothesis
It is derived from the level of significance of the test.
It is the table value for the chosen level of significance
• Compare the critical value with the test statistic:-
If the value of the test statistic is greater than the critical value (in absolute
value, for a two-tailed test), "Reject the null hypothesis".
If the value of the test statistic is less than the critical value, "Do not reject the
null hypothesis".
• Make a decision to either reject or not reject the null hypothesis
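The steps above can be sketched in a few lines of Python. This is only an illustration: it assumes a two-tailed one-sample z test, and the function name `one_sample_z_test` and all the numbers are hypothetical, chosen for easy arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z test (H0: mu = mu0, H1: mu != mu0)."""
    # Compute the test statistic
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Find the critical value from the level of significance
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Compare the test statistic with the critical value and decide
    reject = abs(z) > z_crit
    return z, z_crit, reject

# Hypothetical data: H0 says the population mean is 100
z, z_crit, reject = one_sample_z_test(sample_mean=103, mu0=100, sigma=10, n=36)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Here the test value (1.80) is below the critical value (about 1.96), so the null hypothesis is not rejected at the 5% level.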
Errors in Hypothesis Testing
Decision       Accept H0                      Reject H0
H0 is True     Correct decision (no error)    Type I error (alpha error)
               Probability (1 - alpha)        Probability (alpha)
H0 is False    Type II error (beta error)     Correct decision (no error)
               Probability (beta)             Probability (1 - beta)
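To make the four probabilities in the table concrete, the sketch below computes alpha, beta and the power (1 - beta) for an upper-tailed z test. The function name `error_probabilities` and the values of the means, standard deviation and sample size are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def error_probabilities(mu0, mu1, sigma, n, alpha=0.05):
    """Return (alpha, beta, power) for an upper-tailed z test of H0: mu = mu0
    when the true mean is actually mu1 (with mu1 > mu0)."""
    se = sigma / sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: the sample mean stays below the cutoff although H0 is false
    beta = NormalDist(mu1, se).cdf(cutoff)
    return alpha, beta, 1 - beta

# Hypothetical scenario: H0 says mu = 100, but the true mean is 105
alpha, beta, power = error_probabilities(mu0=100, mu1=105, sigma=10, n=25)
print(f"alpha = {alpha:.2f}, beta = {beta:.3f}, power = {power:.3f}")
```

With these assumed numbers beta is a little under 0.20, so the test would detect the difference roughly four times out of five.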
Univariate and Bivariate Data Analysis:
Univariate data:
This type of data involves only one variable. Its analysis is the simplest
form of analysis, since the information deals with only one quantity that changes. It
does not deal with causes or relationships and the main purpose of
the analysis is to describe the data and find patterns that exist
within it.
Bivariate data:
This type of data involves two different variables. The analysis of this
type of data deals with causes and relationships and is
done to find out the relationship between the two variables. An example
of bivariate data is temperature and ice cream sales in the summer
season.
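A short sketch of the difference, using the temperature and ice cream example. The data values are made up for illustration, and `pearson_r` is a hypothetical helper written here rather than a library function.

```python
from math import sqrt
from statistics import mean, median, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical observations for five summer days
temperature_c = [20, 25, 30, 35, 40]
ice_cream_sales = [110, 140, 210, 240, 300]

# Univariate analysis: describe one variable on its own
print("mean:", mean(temperature_c), "median:", median(temperature_c),
      "std dev:", round(stdev(temperature_c), 2))

# Bivariate analysis: measure the relationship between two variables
r = pearson_r(temperature_c, ice_cream_sales)
print("correlation:", round(r, 3))
```

The univariate part only summarises temperature; the bivariate part shows that, in this made-up data, sales rise almost linearly with temperature.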
What is Hypothesis testing?
Hypothesis testing is the theory, methods, and practice of testing a
hypothesis by comparing it with the null
hypothesis. The null hypothesis is rejected only if
the probability of obtaining results at least as extreme as those observed,
assuming the null hypothesis is true (the p-value), falls below a predetermined
significance level, in which case the result is said to be
significant at that level.
Example: - A teacher assumes that 60% of his
college's students come from lower-middle-class
families.
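One possible way to test the teacher's assumption is a one-proportion z test. The sketch below assumes a hypothetical random sample of 150 students, 99 of whom come from lower-middle-class families; these counts and the function name `one_proportion_z_test` are not from the notes.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-tailed z test of H0: population proportion = p0."""
    p_hat = successes / n
    # Standard error is computed under H0, i.e. using p0
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# H0: p = 0.60, H1: p != 0.60, hypothetical sample of 150 students
z, p_value = one_proportion_z_test(successes=99, n=150, p0=0.60)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

With this sample the p-value is about 0.13, which is above 0.05, so the teacher's assumption of 60% would not be rejected at the 5% level.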
Hypothesis Testing
Statistical Tests:-
• Statistical Tests are conducted to test the hypothesis and to find the
inference about the population.
• For that samples are selected and various tests are performed on them to
find the inference about the population under study.
• These are of two types: Parametric Tests and Non Parametric Tests.
Parametric Tests:-
• Parametric tests are applied under the circumstances where the population
is normally distributed or is assumed to be normally distributed.
• Parameters like mean, standard deviation etc are used
• For example:- T – Test, Z – Test, F – Test, ANOVA.
• These are applied where the data is quantitative.
• These are applied where the scale of measurement is either an interval or
ratio scale.
Non – Parametric Tests:-
• Non Parametric tests are applied under the circumstances where the
population is not normally distributed or is not assumed to be normally
distributed.
• Where parametric tests cannot be applied, then non parametric tests come
into play.
• These tests are also called distribution-free tests.
• Parameters like mean, standard deviation etc are not used
• For example, Chi – Square test, U – Test (Mann Whitney Test), Spearman's
Rank Correlation Test.
• These are applied where the data is qualitative
• These are applied where the scale of measurement is either an ordinal or a
nominal scale
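As an illustration of a rank-based test statistic, the sketch below computes Spearman's rank correlation using the formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between paired ranks. It assumes there are no tied values; the helper names and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rank correlation for paired data without ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical paired scores from two judges
judge_a = [10, 20, 30, 40, 50]
judge_b = [1, 3, 2, 5, 4]
rho = spearman_rho(judge_a, judge_b)
print("Spearman's rho:", rho)
```

Only the ranks enter the calculation, not the mean or standard deviation, which is why the test suits ordinal data.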
Difference between Parametric and Non Parametric Test
Parametric Test                                   Non Parametric Test
Assumes the distribution to be normal             Does not assume the distribution to be normal
Makes assumptions about the population            Makes fewer assumptions about the population
Parameters such as mean, standard deviation,      No such parameters are used
etc. are used
Applied in case of quantitative data              Applied in case of qualitative data
Scale of measurement is either interval or        Scale of measurement is either ordinal or
ratio                                             nominal
More powerful (as they have a greater ability     Less powerful than parametric tests
to reject the null hypothesis when it is false)
Types of Hypothesis Testing:
Parametric Test:-
T Test
Z Test
F Test
ANOVA
T Test
• It is a parametric test of hypothesis testing based on Student's t
distribution.
• It was developed by William Sealy Gosset.
• It essentially tests the significance of the difference between mean
values when the sample size is small (i.e. less than 30) and the
population standard deviation is not available.
• It assumes:
Population distribution is normal
Samples are random and independent
Sample size is small
Population standard deviation is not known.
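A minimal sketch of the calculation, assuming the one-sample form of the test (sample mean against a hypothesised mean). The sample values are made up, the function name `one_sample_t` is hypothetical, and the critical value is read from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    # stdev() is the sample standard deviation (n - 1 in the denominator),
    # used because the population standard deviation is not known
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error, n - 1

# Hypothetical small sample (n = 5) and hypothesised mean of 12
sample = [12, 14, 15, 16, 18]
t_stat, df = one_sample_t(sample, mu0=12)

# Two-tailed critical value for alpha = 0.05 and df = 4, from a standard t table
T_CRITICAL = 2.776
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {abs(t_stat) > T_CRITICAL}")
```

Here the t value (about 3.0) exceeds the table value, so the null hypothesis is rejected at the 5% level.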
Z Test:-
It is a parametric test of hypothesis testing.
It is used to determine whether the means are different when the
population variance is known and the sample size is large (i.e. greater than
30)
• It assumes:
Population distribution is normal and
Samples are random and independent
Sample size is large
Population Standard deviation is known.
F Test
It is a parametric test of hypothesis testing based on Snedecor's F
distribution.
The F test is named after its test statistic, F, which was named in honour of
Sir Ronald Fisher.
It is a test for the null hypothesis that two normal populations have the
same variance.
An F test is regarded as a comparison of the equality of sample variances.
The F statistic is simply a ratio of two variances.
By changing the variances in the ratio, the F test becomes a very flexible test. It
can then be used to:
• Test the overall significance for a regression model.
• To compare the fits of different models and
• To test the equality of means.
It assumes
• Population distribution is normal and
• Samples are drawn randomly and independently.
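The ratio of two variances can be sketched as follows. The two samples are hypothetical, the function name `f_statistic` is an assumption, and the critical value is taken from a standard F table for an upper-tailed test (H1: the first variance is larger).

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with F = var(a) / var(b)."""
    return (variance(sample_a) / variance(sample_b),
            len(sample_a) - 1, len(sample_b) - 1)

# Hypothetical samples
sample_a = [10, 12, 14, 16, 18]   # sample variance 10
sample_b = [12, 13, 14, 15, 16]   # sample variance 2.5
f_stat, df1, df2 = f_statistic(sample_a, sample_b)

# Upper-tail critical value for alpha = 0.05 with (4, 4) degrees of freedom,
# from a standard F table
F_CRITICAL = 6.39
print(f"F = {f_stat:.2f}, df = ({df1}, {df2}), reject H0: {f_stat > F_CRITICAL}")
```

The F value of 4.0 is below the table value, so equality of variances is not rejected for these samples.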
ANOVA
• Also called analysis of variance, it is a parametric test of hypothesis
testing.
• It was developed by Ronald Fisher and is also referred to as Fisher's ANOVA.
• It is an extension of the T Test and Z Test
• It is used to test the significance of the differences of the mean values
among more than two sample groups.
• It uses F Test to statistically test the equality of means and the relative
variances between them.
It assumes
Population distribution is normal and
Samples are random and independent
Homogeneity of sample variances.
• One way ANOVA and Two way ANOVA are its types.
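A one-way ANOVA can be sketched by hand as the ratio of the between-group mean square to the within-group mean square. The three groups below are hypothetical, the function name `one_way_anova_f` is an assumption, and the critical value comes from a standard F table.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical scores for three groups
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
f_stat, df_between, df_within = one_way_anova_f(groups)

# Critical value for alpha = 0.05 with (2, 6) degrees of freedom, from a standard F table
F_CRITICAL = 5.14
print(f"F = {f_stat:.1f}, df = ({df_between}, {df_within}), "
      f"reject H0: {f_stat > F_CRITICAL}")
```

The F value of 21 is well above the table value, so the hypothesis of equal group means is rejected for this made-up data.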
Limitations of Hypothesis Test :
• Assumptions
• Sample Size
• Type I and Type II Errors
• Choice of Test
• Multiple Testing
• Data Quality
• Publication Bias
• External Validity
Statistical analysis introduction :
Statistical analysis, or statistics, is the process
of collecting and analysing data to identify patterns and
trends, remove bias and inform decision-making. It's an
aspect of business intelligence that involves the
collection and scrutiny of business data and the
reporting of trends.
The results acquired from a research project are
meaningless raw data unless analysed with statistical
tools. Therefore, applying statistics in research is of
utmost necessity to justify research findings.
Importance of statistical analysis:
Meaning :
Multivariate analysis is a statistical technique used to
analyze data that involves multiple variables simultaneously.
It allows researchers to understand the relationships between
several variables and how they interact with each other.