What Is Statistical Significance?

Statistical significance refers to the claim that a set of observed data are not
the result of chance but can instead be attributed to a specific cause.
Statistical significance is important for academic disciplines or practitioners
that rely heavily on analyzing data and research, such as
economics, finance, investing, medicine, physics, and biology.

Statistical significance can be considered strong or weak. When analyzing a data set and running the tests needed to discern whether one or more variables have an effect on an outcome, strong statistical significance supports the conclusion that the results are real and not caused by luck or chance. Simply stated, the smaller the p-value, the more reliable the result is considered.

KEY TAKEAWAYS

 Statistical significance refers to the claim that a result from data generated by testing or experimentation is likely to be attributable to a specific cause.
 A high degree of statistical significance indicates that an observed
relationship is unlikely to be due to chance.
 The calculation of statistical significance is subject to a certain degree
of error. 
 Statistical significance can be misinterpreted when researchers do not
use language carefully in reporting their results. 
 Several types of significance tests are used depending on the research
being conducted.
Problems arise in tests of statistical significance because researchers are
usually working with samples of larger populations and not the populations
themselves. As a result, the samples must be representative of the
population, so the data contained in the sample must not be biased in any
way. In most sciences, including economics, a result may be considered
statistically significant if it has a confidence level of 95% (or sometimes 99%).

Understanding Statistical Significance


The calculation of statistical significance (significance testing) is subject to a
certain degree of error. Even if data appear to have a strong relationship,
researchers must account for the possibility that an apparent correlation
arose due to random chance or a sampling error.
Sample size is an important component of statistical significance in that larger
samples are less prone to flukes. Only randomly chosen, representative
samples should be used in significance testing. The cutoff that a p-value must fall below for a result to count as statistically significant is known as the significance level.

Researchers use a measurement known as the p-value to determine statistical significance: if the p-value falls below the significance level, then the result is statistically significant. The p-value is a function of the means and standard deviations of the data samples.

The p-value indicates the probability of obtaining a result at least as extreme as the one observed, assuming chance alone is responsible. If this probability is small, the researcher can conclude that some factor other than chance is likely responsible for the observed data.
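As a concrete illustration, the p-value for a simple one-sample z-test can be computed directly from the sample mean, standard deviation, and sample size. The numbers and the helper name `z_test_p_value` below are purely illustrative; this is a minimal sketch assuming a known population standard deviation, not a substitute for a statistics library.

```python
import math

def z_test_p_value(sample_mean, hypothesized_mean, sd, n):
    """Two-sided p-value for a one-sample z-test (sd assumed known)."""
    se = sd / math.sqrt(n)                  # standard error of the mean
    z = (sample_mean - hypothesized_mean) / se
    # Phi is the standard normal CDF, expressed via the error function
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2)))
    return 2.0 * (1.0 - phi)                # probability in both tails

# Illustrative numbers: a sample of 50 with mean 103 vs. a hypothesized mean of 100
p = z_test_p_value(103, 100, sd=10, n=50)
print(round(p, 4))  # about 0.034, below the customary 0.05 significance level
```

A p-value near 0.034 would be declared significant at the 5% level but not at the 1% level, which is why the chosen significance level matters.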

The complement of the significance level, calculated as 1 minus the significance level, is the confidence level. It indicates the degree of confidence that the statistical result did not occur by chance or by sampling error. The customary confidence level in many statistical tests is 95%, corresponding to a customary significance level of 5%.

"P-hacking" is the practice of exhaustively comparing many different sets of


data in search of a statistically significant result. This is subject to
reporting bias because the researchers only report favorable results–not
negative ones.
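The danger of p-hacking is easy to demonstrate by simulation: testing many datasets of pure noise still produces "significant" results at roughly the significance-level rate. Everything below (the sample size, number of simulated studies, and seed) is an illustrative assumption.

```python
import math
import random

def p_value(xs):
    """Two-sided z-test p-value for a true mean of 0, known sd of 1."""
    z = (sum(xs) / len(xs)) * math.sqrt(len(xs))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2))))

random.seed(0)  # illustrative; any seed shows the same effect
# 1,000 simulated "studies", each 30 draws of pure noise -- the null is true every time
hits = sum(
    p_value([random.gauss(0, 1) for _ in range(30)]) < 0.05
    for _ in range(1000)
)
print(hits)  # close to 50: roughly 5% of null-only studies still look "significant"
```

Reporting only those few dozen "hits" while discarding the other several hundred runs is exactly the reporting bias described above.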

Special Considerations
Statistical significance does not always indicate practical significance: a statistically significant result may still be of no use in real-world business situations. In addition, statistical significance can be misinterpreted when researchers do not use language carefully in reporting their results. The fact that a result is statistically significant does not guarantee that it is not the result of chance, only that this is less likely to be the case.

A strong correlation between two data series does not imply causation. For example, the number of movies in which the actor Nicolas Cage stars in a given year is very highly correlated with the number of accidental drownings in swimming pools.1 But this correlation is spurious, since no plausible causal claim connects the two.
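Correlation itself is mechanical to compute, which is exactly why it proves nothing about causation. Below is a minimal Pearson correlation sketch on made-up toy series (not the actual Cage/drowning data); the series names and values are purely illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Two unrelated toy series that happen to move together
films = [2, 3, 4, 1, 5]       # hypothetical yearly counts, for illustration only
drownings = [4, 6, 8, 2, 10]  # perfectly tracks the first series
print(pearson_r(films, drownings))  # ~1.0 -- yet no causal claim follows
```

A coefficient of 1.0 is the strongest possible correlation, but it says nothing about why the series move together.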
Another problem that may arise with statistical significance is that past data,
and the results from that data, whether statistically significant or not, may not
reflect ongoing or future conditions. In investing, this may manifest itself in a
pricing model breaking down during times of financial crisis as correlations
change and variables do not interact as usual. Statistical significance can
also help an investor discern whether one asset pricing model is better than
another.

Types of Statistical Significance Tests


Several types of significance tests are used depending on the research being conducted. For example, tests can be applied to one, two, or more data samples of various sizes, and can compare averages, variances, or proportions, handle paired or unpaired data, or deal with different data distributions.

There are also different approaches to significance testing, depending on the type of data that is available. Ronald Fisher is credited with formulating one of the most flexible approaches, as well as setting the norm for significance at p < 0.05. Because most of the work can be done after the data have already been collected, this method remains popular for short-term or ad-hoc research projects.2

Seeking to build on Fisher's method, Jerzy Neyman and Egon Pearson developed an alternative approach. This method requires more work to be done before the data are collected, but it allows researchers to design their study in a way that controls the probability of reaching false conclusions.2

Null Hypothesis Testing

Statistical significance is used in null hypothesis testing, where researchers attempt to support their theories by rejecting other explanations. Although the method is sometimes misunderstood, it remains the most popular method of data testing in medicine, psychology, and other fields.2

The most common null hypothesis is that the parameter in question is equal
to zero (typically indicating that a variable has zero effect on the outcome of
interest). If researchers reject the null hypothesis with a confidence of 95% or
better, they can claim that an observed relationship is statistically significant.
Null hypotheses can also be tested for the equality of effect for two or more
alternative treatments.
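The null-hypothesis logic above can be sketched with an exact binomial test: assume a fair coin (the null hypothesis), observe a count of heads, and compute the probability of a result at least that extreme under the null. The scenario (20 flips, 15 heads) and the helper name are invented for illustration.

```python
from math import comb

def binom_tail_p(n, k, p_null=0.5):
    """One-sided p-value: P(at least k successes in n trials under the null)."""
    return sum(comb(n, i) * p_null**i * (1 - p_null)**(n - i)
               for i in range(k, n + 1))

# Null hypothesis: the coin is fair. Observation: 15 heads in 20 flips.
p = binom_tail_p(20, 15)
print(round(p, 4))  # about 0.0207 -- below 0.05, so reject the null at the 5% level
```

Because the p-value falls below the customary 5% significance level, a researcher would reject the fair-coin null and report the excess of heads as statistically significant.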
