Central Tendency Vs Dispersion and Parametric and Non


Central Tendency vs Dispersion

In descriptive and inferential statistics, several indices are used to describe a data set
in terms of its central tendency, dispersion, and skewness: the three most important
properties that determine the shape of the distribution of a data set.

What is central tendency?

Central tendency refers to the location of the center of the distribution of values. The
mean, median, and mode are the most commonly used indices for describing the central
tendency of a data set. If the distribution of a data set is symmetric, then the median
and the mean of the data set coincide.

Given a data set, the mean is calculated by taking the sum of all the data values and
then dividing it by the number of data values. For example, the weights of 10 people (in
kilograms) are measured to be 70, 62, 65, 72, 80, 70, 63, 72, 77 and 79. The mean
weight of the ten people (in kilograms) can be calculated as follows. The sum of the
weights is 70 + 62 + 65 + 72 + 80 + 70 + 63 + 72 + 77 + 79 = 710. Mean = (sum) / (number
of data values) = 710 / 10 = 71 (in kilograms). Outliers (data points that deviate
markedly from the rest of the data) tend to pull the mean toward themselves. Thus, in the
presence of outliers, the mean alone will not give a correct picture of the center of the
data set.
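
As a minimal sketch, the calculation above can be reproduced with Python's standard library. The variable names and the extra 150 kg value used as an outlier are illustrative assumptions and are not part of the original data.

```python
from statistics import mean

# Weights (kg) of the ten people from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

total = sum(weights)                 # 710
mean_weight = total / len(weights)   # 71.0
assert mean_weight == mean(weights)  # the library function agrees

# Hypothetical outlier (not in the original data) to show how the mean shifts.
with_outlier = weights + [150]
mean_with_outlier = mean(with_outlier)  # 860 / 11

print(total, mean_weight, round(mean_with_outlier, 2))
```

A single extreme value moves the mean from 71 to roughly 78, which is higher than eight of the ten original weights.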

The median is the data point found at the exact middle of the data set. One way to
compute the median is to order the data points in ascending order, and then locate the
data point in the middle. For example, once ordered, the previous data set is
62, 63, 65, 70, 70, 72, 72, 77, 79, 80. Since there are ten values, the median is the
average of the two middle values: (70 + 72) / 2 = 71. From this, it is seen that the
median need not be in the data set. The median is far less affected by the presence of
outliers than the mean. Hence, the median will serve as a better measure of central
tendency in the presence of outliers.
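
The ordering step can be sketched as follows, using only the standard library. The 150 kg outlier is again a hypothetical value added for illustration.

```python
from statistics import median

# Weights (kg) from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

ordered = sorted(weights)
n = len(ordered)

# With an even number of values, average the two middle ones
# (zero-based positions n/2 - 1 and n/2).
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
median_weight = sum(middle_pair) / 2
assert median_weight == median(weights)

# Hypothetical outlier (not in the original data): the median barely moves.
median_with_outlier = median(weights + [150])

print(middle_pair, median_weight, median_with_outlier)
```

With the outlier added, the median shifts only from 71 to 72, because only the position of the middle value changes, not its magnitude.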

The mode is the most frequently occurring value in the set of data. In the previous
example, the values 70 and 72 both occur twice and thus, both are modes. This shows
that, in some distributions, there is more than one modal value. If there is only one
mode, the data set is said to be unimodal; in this case, the data set is bimodal.
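
A short sketch of finding every modal value by counting occurrences; the helper names are illustrative.

```python
from collections import Counter

# Weights (kg) from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

counts = Counter(weights)
highest_count = max(counts.values())

# Every value that reaches the highest count is a mode.
modes = sorted(value for value, count in counts.items() if count == highest_count)

print(modes, highest_count)
```

Collecting all values that share the highest count, rather than stopping at the first one, is what reveals that this data set is bimodal.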

What is dispersion?

Dispersion is the amount of spread of data about the center of the distribution. Range
and standard deviation are the most commonly used measures of dispersion.

The range is simply the highest value minus the lowest value. In the previous example,
the highest value is 80 and the lowest value is 62, so the range is 80 - 62 = 18. However,
the range depends only on the two extreme values, so on its own it does not provide a
sufficient picture of the dispersion.
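
To illustrate why the range alone is not enough, the sketch below compares the example weights with a second, hypothetical data set (invented here, not taken from the text) that has the same extremes but is otherwise tightly clustered.

```python
from statistics import pstdev

# Weights (kg) from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]
# Hypothetical data set (not from the text): same extremes, all other values at 71.
clustered = [62, 71, 71, 71, 71, 71, 71, 71, 71, 80]


def value_range(data):
    """Highest value minus lowest value."""
    return max(data) - min(data)


range_weights = value_range(weights)      # 80 - 62
range_clustered = value_range(clustered)  # 80 - 62

# The population standard deviation tells the two data sets apart.
spread_weights = pstdev(weights)
spread_clustered = pstdev(clustered)

print(range_weights, range_clustered)
print(round(spread_weights, 2), round(spread_clustered, 2))
```

Both data sets have a range of 18, yet the standard deviation of the clustered set is clearly smaller.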

To calculate the standard deviation, first the deviations of the data values from the mean
are calculated. The root mean square of the deviations is called the standard deviation.
In the previous example, the respective deviations from the mean are (70 − 71) = −1,
(62 − 71) = −9, (65 − 71) = −6, (72 − 71) = 1, (80 − 71) = 9, (70 − 71) = −1,
(63 − 71) = −8, (72 − 71) = 1, (77 − 71) = 6 and (79 − 71) = 8. The sum of squares of the
deviations is (−1)² + (−9)² + (−6)² + 1² + 9² + (−1)² + (−8)² + 1² + 6² + 8² = 366. The
standard deviation is √(366/10) ≈ 6.05 (in kilograms). Unless the data set is greatly
skewed, from this it can be concluded that the majority of the data is in the interval
71 ± 6.05, and it is indeed so in this particular example: six of the ten values lie
between 64.95 and 77.05.
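
The steps above can be sketched in a few lines. This follows the text in dividing by the number of values (the population form of the standard deviation); the variable names are illustrative.

```python
from math import sqrt

# Weights (kg) from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

mean_weight = sum(weights) / len(weights)

# Deviation of each value from the mean, then the sum of squared deviations.
deviations = [w - mean_weight for w in weights]
sum_of_squares = sum(d * d for d in deviations)

# Root mean square of the deviations (divides by n, as in the text).
std_dev = sqrt(sum_of_squares / len(weights))

# Values lying within one standard deviation of the mean.
lower, upper = mean_weight - std_dev, mean_weight + std_dev
within = [w for w in weights if lower <= w <= upper]

print(sum_of_squares, round(std_dev, 2), len(within))
```

Dividing by n - 1 instead of n would give the sample standard deviation, which is slightly larger for the same data.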

What is the difference between central tendency and dispersion?

Central tendency refers to and locates the center of the distribution of values.

Dispersion is the amount of spread of data about the center of a data set.

Comparison Chart


Basis for comparison          Parametric test                   Nonparametric test

Meaning                       A statistical test in which       A statistical test used in the
                              specific assumptions are made     case of non-metric independent
                              about the population parameter    variables is called a
                              is known as a parametric test.    nonparametric test.

Basis of test statistic       Distribution                      Arbitrary

Measurement level             Interval or ratio                 Nominal or ordinal

Measure of central tendency   Mean                              Median

Information about population  Completely known                  Unavailable

Applicability                 Variables                         Variables and attributes

Correlation test              Pearson                           Spearman

Definition of Parametric Test

The parametric test is the hypothesis test which provides generalisations for making
statements about the mean of the parent population. A t-test based on Student's
t-statistic is often used in this regard. The t-statistic rests on the underlying
assumption that the variable is normally distributed and that the mean is known or
assumed to be known. The population variance is estimated from the sample. It is
assumed that the variables of interest in the population are measured on an interval
scale.
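
As a hedged illustration of a t-statistic, the sketch below computes a one-sample t-statistic for the ten weights used earlier. The hypothesized population mean of 70 kg is an assumption made purely for this example, and the sample standard deviation (dividing by n - 1) is used to estimate the population variance.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (kg) from the earlier example.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]
hypothesized_mean = 70  # assumption for illustration only, not from the text

n = len(weights)
sample_mean = mean(weights)
sample_sd = stdev(weights)            # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / sqrt(n)  # estimated standard error of the mean

# One-sample t-statistic: distance of the sample mean from the hypothesized
# mean, measured in standard errors.
t_statistic = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = n - 1

print(round(t_statistic, 3), degrees_of_freedom)
```

The statistic would then be compared against a t-distribution with n - 1 degrees of freedom; that lookup is omitted here to keep the sketch within the standard library.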

Definition of Nonparametric Test

The nonparametric test is defined as the hypothesis test which does not rest on
assumptions about the form of the population distribution, i.e. it does not require the
population's distribution to be described by specific parameters. Hence, it is also
known as the distribution-free test. The test is mainly based on differences in medians.
The test assumes that the variables are measured on a nominal or ordinal level. It is
used when the independent variables are non-metric.

Key Differences Between Parametric and Nonparametric Tests

The fundamental differences between parametric and nonparametric tests are discussed
in the following points:

1. A statistical test in which specific assumptions are made about the population
parameter is known as a parametric test. A statistical test used in the case of
non-metric independent variables is called a nonparametric test.

2. In the parametric test, the test statistic is based on a distribution. In the
nonparametric test, on the other hand, the test statistic is arbitrary.

3. In the parametric test, the variables of interest are assumed to be measured on an
interval or ratio level. In the nonparametric test, the variables of interest are
measured on a nominal or ordinal scale.

4. In general, the measure of central tendency in the parametric test is the mean, while
in the nonparametric test it is the median.

5. In the parametric test, there is complete information about the population.
Conversely, in the nonparametric test, there is no information about the population.

6. The parametric test applies to variables only, whereas the nonparametric test applies
to both variables and attributes.

7. For measuring the degree of association between two quantitative variables,
Pearson's coefficient of correlation is used in the parametric test, while Spearman's
rank correlation is used in the nonparametric test.
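
The contrast in point 7 can be sketched with a small example. The paired data below are hypothetical (y is simply x squared), and the rank helper assumes there are no tied values; a fuller implementation would assign average ranks to ties.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's coefficient of correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired data (not from the text): monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)

print(round(pearson_r, 3), round(spearman_rho, 3))
```

Because Spearman's coefficient uses only the ordering of the values, it reaches 1 for any strictly increasing relationship, whereas Pearson's coefficient stays below 1 unless the relationship is exactly linear.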

Hypothesis Tests Hierarchy

Equivalent Tests

Independent samples t-test
Paired samples t-test
One-way analysis of variance (ANOVA)
One-way repeated measures analysis of variance
