X X X X X N X N X X, X, X: Mean and Variance
The mean is a measure of central tendency. Measures of central tendency tell us where the observed
data are centred, while measures of dispersion tell us how the data are scattered and distributed.
To study the properties of a given data distribution, we need to know some of its features, such
as the mean, the median and the variance.
The word "mean" has different definitions in different branches of mathematics. Normally, by mean
we denote the average of the discrete data present in a set of numbers. For ungrouped data, the
arithmetic mean is given by
$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + \dots + x_n}{n} \;\Rightarrow\; \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms,
and $n$ is the number of terms.
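As a quick illustration of the ungrouped formula, the following Python sketch computes the
arithmetic mean directly from its definition. The function name `arithmetic_mean` and the sample
values are assumptions chosen for illustration only.

```python
def arithmetic_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample data, used only for illustration.
data = [4, 8, 15, 16, 23, 42]
print(arithmetic_mean(data))  # 108 / 6 = 18.0
```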
Now consider the case where each data value is given together with its frequency.
The formula for the mean in this case (called discrete frequency data) is
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_n x_n}{N} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \qquad N = \sum_{i=1}^{n} f_i$$
where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms,
$f_1, f_2, f_3, \dots, f_n$ denote the frequencies of the respective terms,
$n$ is the number of distinct values, and $N$ is the total frequency.
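The frequency version can be sketched in the same way. Here each value is weighted by its
frequency and the weighted sum is divided by the total frequency, that is, the sum of all the
$f_i$. The function name `frequency_mean` and the sample values are hypothetical.

```python
def frequency_mean(values, freqs):
    """Mean of discrete frequency data: sum(f_i * x_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)  # total frequency
    if total <= 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total


# Hypothetical data: the value 10 occurs twice, 20 three times, 30 five times.
marks = [10, 20, 30]
counts = [2, 3, 5]
print(frequency_mean(marks, counts))  # (20 + 60 + 150) / 10 = 23.0
```

Expanding the frequencies into the full list of ten observations and taking the ordinary
arithmetic mean gives the same result.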
The formula is the same for a sample and for a population, but the notation differs: the sample
mean is denoted by $\bar{x}$, and the population mean is denoted by $\mu$.
By median, we denote the value that occupies the middle position of a set of numbers. To calculate
the median, we first have to arrange the numbers in ascending order.
Let us represent the number of observations as $n$. If $n$ is odd, the median of the data set is
the $\left(\frac{n+1}{2}\right)^{\text{th}}$ term, and if $n$ is even, the median of the data set
is the mean of the $\left(\frac{n}{2}\right)^{\text{th}}$ and
$\left(\frac{n}{2}+1\right)^{\text{th}}$ terms.
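A minimal Python sketch of this rule follows. The function name `median_of` is an assumption
made for illustration. The positions in the text count from 1, while Python lists count from 0,
so the index is shifted by one.

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)  # arrange in ascending order first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th term, which is index n // 2 when counting from 0
        return ordered[mid]
    # n even: mean of the (n / 2)-th and (n / 2 + 1)-th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 3]))     # sorted: 1, 3, 7 -> 3
print(median_of([4, 1, 3, 2]))  # sorted: 1, 2, 3, 4 -> (2 + 3) / 2 = 2.5
```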
Sometimes, the mean and the median are the same, but this depends on the particular data set. It
can happen, for instance, with the runs scored by batsmen for a team. That is why measures of
central tendency alone can be misleading, and it is better to also take measures of dispersion
such as the range, mean deviation, variance and standard deviation of the data.
Variance:
The mean deviation is computed from the absolute values of the deviations of a set of values.
Absolute values are used because otherwise the positive and negative deviations would cancel each
other out.
Another way to remove the sign of the deviations is to square them, which leads to the variance of
the data set. As squares are never negative, the variance is always a non-negative number.
Let us take $n$ observations $a_1, a_2, a_3, \dots, a_n$ and let their mean be $\bar{a}$.
Then the sum of the squared deviations from the mean is
$$(a_1-\bar{a})^2 + (a_2-\bar{a})^2 + (a_3-\bar{a})^2 + \dots + (a_n-\bar{a})^2 = \sum_{i=1}^{n} (a_i-\bar{a})^2.$$
Properties of variance:
● If the variance is zero, then every deviation $(a_i-\bar{a})$ is equal to zero, which means
that each value of the set is equal to the mean value $\bar{a}$.
● If the variance is small, the observations are close to the mean value $\bar{a}$, and if the
variance is large, the observations are far from the mean value $\bar{a}$.
● If each observation is increased by a constant $k$, where $k \in \mathbb{R}$, then the variance
remains unchanged.
● If each observation is multiplied by a constant $k$, where $k \in \mathbb{R}$, then the variance
is multiplied by $k^2$.
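These properties can be checked numerically. The sketch below uses `statistics.pvariance` from
the Python standard library, which computes the mean of the squared deviations from the mean. The
data set and the constant, called `k` here, are hypothetical values chosen for illustration.

```python
from statistics import pvariance

# Hypothetical data and constant, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
k = 3

base = pvariance(data)
shifted = pvariance([x + k for x in data])  # every observation increased by k
scaled = pvariance([x * k for x in data])   # every observation multiplied by k
constant = pvariance([6, 6, 6, 6])          # every observation equals the mean

print(shifted == base)         # True: adding a constant leaves the variance unchanged
print(scaled == k ** 2 * base) # True: multiplying by k multiplies the variance by k squared
print(constant == 0)           # True: identical observations give zero variance
```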
However, the sum $\sum_{i=1}^{n} (a_i-\bar{a})^2$ alone is not a proper measure of dispersion for
every data set: it grows with the number of observations, so a larger data set can give a larger
sum even when its observations are less scattered about the mean. To overcome this difficulty, we
take the mean of the squares of the deviations.
So, the variance is given by:
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (a_i-\bar{a})^2.$$
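As a worked example (the numbers are hypothetical and chosen only for illustration), take the
five observations 3, 5, 7, 9 and 11:

$$\bar{a} = \frac{3 + 5 + 7 + 9 + 11}{5} = \frac{35}{5} = 7$$
$$\sigma^2 = \frac{1}{5}\left[(3-7)^2 + (5-7)^2 + (7-7)^2 + (9-7)^2 + (11-7)^2\right] = \frac{16 + 4 + 0 + 4 + 16}{5} = \frac{40}{5} = 8$$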
As a result of squaring, the unit of the variance is the square of the unit of the data, not the
unit of the data itself.
Standard Deviation:
To obtain a measure of dispersion in the same unit as the data, we calculate the standard
deviation by taking the positive square root of the variance. Because the deviations are squared
before they are averaged, deviations above the mean cannot cancel those below it, so the measure
is zero only when every observation equals the mean. If the variance is large, the standard
deviation is also large, and if the variance is small, the standard deviation is small.
The formula of standard deviation is given by:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (a_i-\bar{a})^2}$$
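The formula translates almost line by line into code. The following Python sketch is one possible
implementation; the function name `population_std` and the sample data are assumptions made for
illustration.

```python
import math


def population_std(values):
    """Standard deviation: square root of the mean of the squared deviations."""
    n = len(values)
    if n == 0:
        raise ValueError("the standard deviation of an empty data set is undefined")
    mean = sum(values) / n
    variance = sum((a - mean) ** 2 for a in values) / n
    return math.sqrt(variance)


# Hypothetical sample data, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))  # mean 5, squared deviations sum to 32, 32 / 8 = 4, sqrt(4) = 2.0
```

The standard library function `statistics.pstdev` computes the same quantity and can be used as a
cross-check.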
Standard Deviation of distribution with discrete frequency:
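It is given by:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i (a_i-\bar{a})^2}$$
where $N = \sum_{i=1}^{n} f_i$ is the total frequency. An equivalent form is
$$\sigma = \frac{\sqrt{\left(\sum_{i=1}^{n} f_i\right) \times \sum_{i=1}^{n} f_i a_i^2 - \left(\sum_{i=1}^{n} f_i a_i\right)^2}}{\sum_{i=1}^{n} f_i}$$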
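The two expressions for the standard deviation of a discrete frequency distribution can be
compared numerically. The sketch below implements both: the first works from the deviations about
the mean, the second uses only the sums of $f_i$, $f_i a_i$ and $f_i a_i^2$. The function names
and the sample data are hypothetical.

```python
import math


def frequency_std(values, freqs):
    """Standard deviation from the frequency-weighted squared deviations."""
    total = sum(freqs)  # total frequency N
    mean = sum(f * a for a, f in zip(values, freqs)) / total
    return math.sqrt(sum(f * (a - mean) ** 2 for a, f in zip(values, freqs)) / total)


def frequency_std_shortcut(values, freqs):
    """Equivalent form: sqrt(N * sum(f a^2) - (sum(f a))^2) / N."""
    total = sum(freqs)
    weighted_sum = sum(f * a for a, f in zip(values, freqs))
    weighted_squares = sum(f * a * a for a, f in zip(values, freqs))
    return math.sqrt(total * weighted_squares - weighted_sum ** 2) / total


# Hypothetical data: the value 10 occurs twice, 20 three times, 30 five times.
values = [10, 20, 30]
freqs = [2, 3, 5]
print(frequency_std(values, freqs))           # sqrt(61), about 7.81
print(frequency_std_shortcut(values, freqs))  # sqrt(6100) / 10, the same value
```

The second form avoids computing the mean first, which is convenient for hand calculation.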