Final Module 15 Measures of Variability


BULACAN STATE UNIVERSITY

COLLEGE OF SCIENCE

MMW 101
MATHEMATICS IN THE MODERN WORLD

Module 15
Measures of Variability
“Statistics: Our Life Saver
and Influencer”

Measures of Variability
Objectives of the Module
At the end of the module, you should be able to:
1. identify the different measures of variability,
2. solve the range, the mean deviation, the quartile deviation, the variance, and the
standard deviation of given data sets, and
3. interpret the computed measures of variability.

The information on the measures of central tendency and other measures of location
alone may not be adequate in comparing two or more sets of data. Some measures are
usually needed to supplement these measures in describing and comparing the sets of
data.

Look at these two sets of scores in a Mathematics Quiz. How will you
compare the scores in Set A and set B?
Set A Set B
12 6
14 15
15 15
15 19
19 20

Computing the mean, the median, and the mode of the two sets of data, we will
get the following values:

Set A: $\bar{X} = \dfrac{\Sigma X}{n} = \dfrac{75}{5} = 15$, md = 15, mo = 15
Set B: $\bar{X} = \dfrac{\Sigma X}{n} = \dfrac{75}{5} = 15$, md = 15, mo = 15

The two sets of scores have the same mean, median, and mode. However, a closer look
at the scores shows that the values in set A are less spread out than the values in
set B. This shows that computing the measures of central
tendency will not give us all the features or characteristics of a given set of data. Other
measures can provide other information about the data, and these are the measures
of variability.
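
The comparison above can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names `set_a` and `set_b` are illustrative, and the `max - min` column is included only as a first, rough look at spread.

```python
from statistics import mean, median, mode

# Quiz scores from the two sets above.
set_a = [12, 14, 15, 15, 19]
set_b = [6, 15, 15, 19, 20]

for name, scores in (("Set A", set_a), ("Set B", set_b)):
    # Mean, median, and mode describe the center of each set.
    center = (mean(scores), median(scores), mode(scores))
    # Distance between the smallest and the largest score.
    spread = max(scores) - min(scores)
    print(name, "mean/median/mode:", center, "max - min:", spread)
```

Both sets report 15 for all three measures of central tendency, yet the distance between the smallest and largest score is 7 for Set A and 14 for Set B.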

The measures of variability tell us how the data are spread out or dispersed
around the center. The values are more clustered around the center if the computed
measure of variability is small. On the other hand, a high measure of variability
indicates that the values fall farther from the center. Measures of variability are also
called measures of variation and measures of dispersion. The most common
measures of variability are the following: 1.) range, 2.) quartile deviation or
semi-interquartile range, 3.) mean or average deviation, 4.) variance, and 5.) standard
deviation.

The range is the difference between the highest and the lowest value. It is the
simplest measure of variability but also the most unstable, because its value changes
whenever the lowest or the highest value changes. It is easily affected by outliers
(extremely small or extremely large values), and it says nothing about how the values
between the highest and the lowest value are spread. The formula is:

Range = Highest Value - Lowest Value

R = HV - LV          (Formula 1)

A measure of variability that is not influenced by the presence of outliers in the
data is the interquartile range. This measure considers the spread of the middle 50
percent of the data points, which fall between Q1 and Q3. The interquartile range (IQR)
is simply the difference between the third and first quartiles, which correspond to the
75th and 25th percentiles.

The formula is:

Interquartile Range = 3rd Quartile - 1st Quartile

IQR = Q3 - Q1 or IQR = P75 - P25          (Formula 2)

To illustrate, we have:

[Illustration: Q1, Q2, and Q3 marked along a scale, with the IQR spanning the distance
from Q1 to Q3.]

The quartile deviation is also called the semi-interquartile range. It is
computed as one half the difference between the third and first quartiles. In the
formula, we have

Quartile Deviation = (3rd Quartile - 1st Quartile) / 2

$$QD = \frac{Q_3 - Q_1}{2} = \frac{IQR}{2} \quad \text{or} \quad QD = \frac{P_{75} - P_{25}}{2} = \frac{IQR}{2}$$  (Formula 3)

The mean deviation (MD) or the average deviation (AD) is the sum of the
absolute deviations of each value from the mean divided by the total number of
observations in the distribution. This measure shows the spread of the distribution
around the mean.

The variance is a measure that is obtained by getting the average of the
squared deviations from the mean. There are two ways of calculating the variance.
These are the deviation method and the raw score method.

The standard deviation shows the spread or dispersion of the values around
the mean. It is the positive square root of the variance. A low standard deviation
indicates that the values are clustered around the mean. In contrast, a high
standard deviation indicates that the values of the data set are spread out over
a wider range.

Measures of Variability for Ungrouped Data

1. The Range
Example: Consider the following scores of students in a test:
Sets of observation:
A: 10, 8, 6, 5, 12, 11, 13, 7
B: 9, 4, 8, 6, 10, 9, 10, 17
Solving for the range:

FOR SET A: R = HV - LV = 13 - 5 = 8
FOR SET B: R = HV - LV = 17 - 4 = 13

Notice that the range is computed using only the lowest and the highest
values; it does not take the other values into account.

Interpretation: The range of the data in set B is greater than the range of the data in
set A. This shows that the scores in set B are more spread than the
scores in set A.

Note: A measure that takes into account all the values in the distribution is more
reliable.
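
As a minimal sketch of Formula 1, the helper below (the name `value_range` is an illustrative choice, not part of the module) reproduces the two ranges and shows how much a single extreme score influences the result.

```python
def value_range(values):
    """Range = highest value - lowest value (Formula 1)."""
    return max(values) - min(values)

set_a = [10, 8, 6, 5, 12, 11, 13, 7]
set_b = [9, 4, 8, 6, 10, 9, 10, 17]

print("R for Set A:", value_range(set_a))  # 13 - 5 = 8
print("R for Set B:", value_range(set_b))  # 17 - 4 = 13

# Dropping the single largest score of Set B (17) shrinks its range to 10 - 4 = 6,
# which illustrates how strongly one extreme value drives the range.
set_b_without_max = sorted(set_b)[:-1]
print("R for Set B without 17:", value_range(set_b_without_max))
```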

2. The Quartile Deviation


Example:
Using the two sets of observations in the previous example, determine the
following:

1) Interquartile Range (IQR)
2) Quartile Deviation (QD)

A: 10, 8, 6, 5, 12, 11, 13, 7
B: 9, 4, 8, 6, 10, 9, 10, 17

Solution:

For Set A (arranged in ascending order): 5, 6, 7, 8, 10, 11, 12, 13

(Recall the steps on how to solve quantiles of ungrouped data.)

Position:   1st   2nd   3rd   4th   5th   6th   7th   8th
Value:        5     6     7     8    10    11    12    13

Find Q3 = P75 ($P_i$ = 75 and n = 8):

Position: $\dfrac{P_i(n+1)}{100} = \dfrac{75(8+1)}{100} = 6.75$

The location of 6.75 is between the 6th value and the 7th value.

Interpolating:
1. 7th value - 6th value = 12 - 11 = 1
2. 1(0.75) = 0.75
3. 0.75 + 6th value = 0.75 + 11 = 11.75

Q3 = P75 = 11.75

Find Q1 = P25 ($P_i$ = 25 and n = 8):

Position: $\dfrac{P_i(n+1)}{100} = \dfrac{25(8+1)}{100} = 2.25$

The location of 2.25 is between the 2nd value and the 3rd value.

Interpolating:
1. 3rd value - 2nd value = 7 - 6 = 1
2. 1(0.25) = 0.25
3. 0.25 + 2nd value = 0.25 + 6 = 6.25

Q1 = P25 = 6.25

IQR = Q3 - Q1
    = 11.75 - 6.25
IQR = 5.5

$$QD = \frac{Q_3 - Q_1}{2} = \frac{IQR}{2} = \frac{5.5}{2} = 2.75$$

Interpretation: The middle 50% of the values in Set A lie between 6.25 and 11.75,
                a spread of 5.5. Half of the distance between the first quartile and
                the third quartile is 2.75.

For Set B (arranged in ascending order): 4, 6, 8, 9, 9, 10, 10, 17

Position:   1st   2nd   3rd   4th   5th   6th   7th   8th
Value:        4     6     8     9     9    10    10    17

Find Q3 = P75 ($P_i$ = 75 and n = 8):

Position: $\dfrac{P_i(n+1)}{100} = \dfrac{75(8+1)}{100} = 6.75$

The location of 6.75 is between the 6th value and the 7th value. Since both values
are equal to 10, there is no need to interpolate.

Therefore, Q3 = P75 = 10.

Find Q1 = P25 ($P_i$ = 25 and n = 8):

Position: $\dfrac{P_i(n+1)}{100} = \dfrac{25(8+1)}{100} = 2.25$

The location of 2.25 is between the 2nd value and the 3rd value.

Interpolating:
1. 3rd value - 2nd value = 8 - 6 = 2
2. 2(0.25) = 0.5
3. 0.5 + 2nd value = 0.5 + 6 = 6.5

Q1 = P25 = 6.5

IQR = Q3 - Q1
    = 10 - 6.5
IQR = 3.5

$$QD = \frac{Q_3 - Q_1}{2} = \frac{IQR}{2} = \frac{3.5}{2} = 1.75$$

Interpretation: The middle 50% of the values in Set B lie between 6.5 and 10, a
                spread of 3.5. Half of the distance between the first quartile and the
                third quartile is 1.75.

Summary of Values:
Measure of Variation SET A SET B
Interquartile Range (IQR) 5.5 3.5
Quartile Deviation (QD) 2.75 1.75
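
The position-and-interpolation procedure used above can be sketched in Python as follows. The function name `percentile` is an illustrative choice. As a cross-check, the sketch also calls `statistics.quantiles`, whose default `"exclusive"` method is based on the same (n + 1) position rule; treat that comparison as a sanity check rather than as part of the module's method.

```python
from statistics import quantiles

def percentile(values, p):
    """p-th percentile using the position p(n + 1)/100 with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1) / 100
    rank = int(position)           # whole part: rank of the lower neighbor
    fraction = position - rank     # decimal part used for interpolation
    if rank < 1:                   # position falls before the first value
        return ordered[0]
    if rank >= n:                  # position falls after the last value
        return ordered[-1]
    lower, upper = ordered[rank - 1], ordered[rank]
    return lower + fraction * (upper - lower)

set_a = [10, 8, 6, 5, 12, 11, 13, 7]
set_b = [9, 4, 8, 6, 10, 9, 10, 17]

for name, scores in (("Set A", set_a), ("Set B", set_b)):
    q1, q3 = percentile(scores, 25), percentile(scores, 75)
    iqr = q3 - q1
    qd = iqr / 2
    # Cross-check against the standard library (default method="exclusive").
    lib_q1, _, lib_q3 = quantiles(scores, n=4)
    print(name, "Q1:", q1, "Q3:", q3, "IQR:", iqr, "QD:", qd,
          "| library Q1, Q3:", lib_q1, lib_q3)
```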

[Illustration 1 (For SET A): scale of the ordered values from 5 to 13, marking
P25 = 6.25 and P75 = 11.75, with IQR = 5.5 and QD = 2.75.]

[Illustration 2 (For SET B): scale of the ordered values from 4 to 17, marking
P25 = 6.5 and P75 = 10, with IQR = 3.5 and QD = 1.75.]

3. The Mean Deviation or the Average Deviation

$$MD = \frac{\Sigma |X - \bar{X}|}{n}$$  (Formula 4)

where: MD = mean deviation
       $X$ = individual value
       $\bar{X}$ = mean of data
       $n$ = total number of items or observations
       $|X - \bar{X}|$ = absolute deviations from the mean

Example:

Determine the mean deviations of the two sets of data by following the steps below.

Sets of observations:
A: 10, 8, 6, 5, 12, 11, 13, 7
B: 9, 4, 8, 6, 10, 9, 10, 17

Step 1. Arrange the values in ascending or descending order.

Step 2. Find the sum of the values.

Step 3. Compute the value of the mean.

Step 4. Find the deviation of each score from the mean.

Step 5. Find the absolute value of each deviation obtained in Step 4.



Step 6. Find the sum of the absolute deviations in Step 5.

Step 7. Substitute the values in the formula and solve.

FOR SET A:

$X$ (Step 1)      $X - \bar{X}$ (Step 4)      $|X - \bar{X}|$ (Step 5)
5                 5 - 9 = -4                  4
6                 6 - 9 = -3                  3
7                 7 - 9 = -2                  2
8                 8 - 9 = -1                  1
10                10 - 9 = 1                  1
11                11 - 9 = 2                  2
12                12 - 9 = 3                  3
13                13 - 9 = 4                  4
$\Sigma X = 72$ (Step 2)                      $\Sigma |X - \bar{X}| = 20$ (Step 6)

(Step 3) $\bar{X} = \dfrac{\Sigma X}{n} = \dfrac{10+8+6+5+12+11+13+7}{8} = \dfrac{72}{8} = 9$

(Step 7) $MD = \dfrac{\Sigma |X - \bar{X}|}{n} = \dfrac{20}{8} = 2.5$

Interpretation: The scores of the students in set A, on average, deviated from the
                mean by 2.5.

Note: The higher the mean deviation, the more spread out the values are from the
mean.
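
Formula 4 translates almost directly into code. The following is a minimal sketch for Set A; the function name `mean_deviation` is an illustrative assumption, and the same function can be used to check an answer for Set B.

```python
def mean_deviation(values):
    """MD = sum of |X - mean| divided by n (Formula 4)."""
    n = len(values)
    mean = sum(values) / n                                  # Steps 2 and 3
    absolute_deviations = [abs(x - mean) for x in values]   # Steps 4 and 5
    return sum(absolute_deviations) / n                     # Steps 6 and 7

set_a = [10, 8, 6, 5, 12, 11, 13, 7]
print("MD for Set A:", mean_deviation(set_a))  # 20 / 8 = 2.5
```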

Try this!
Solve set B as an exercise.

4. The Variance

By Deviation Method (Long Method):

$$s^2 = \frac{\Sigma (X - \bar{X})^2}{n - 1}$$  (Formula 5a)

where: $s^2$ = variance
       $X$ = individual value
       $\bar{X}$ = mean
       $n$ = total number of items or observations

By Raw Score Method (Short Method):

$$s^2 = \frac{\Sigma X^2 - \dfrac{(\Sigma X)^2}{n}}{n - 1}$$  (Formula 5b)

where: $s^2$ = variance
       $X$ = individual value
       $X^2$ = square of individual value
       $n$ = total number of items or observations

Example: Using the two sets of observations, let us solve the variance using the
following steps:
Sets of observations: (used in the previous example)
A: 10, 8, 6, 5, 12, 11, 13, 7
B: 9, 4, 8, 6, 10, 9, 10, 17

By Deviation Method (Long Method)
Step 1. Arrange the values in ascending or descending order.
Step 2. Find the sum of the values.
Step 3. Compute the value of the mean.
Step 4. Get the individual deviations from the mean.
Step 5. Square each deviation and write the results under column 3.
Step 6. Find the sum of the squared deviations.
Step 7. Substitute the values in the formula and solve.

By Raw Score Method (Short Method)
Step 1. Arrange the values in ascending or descending order.
Step 2. Find the sum of the values.
Step 3. Square each value and write the results under column 2.
Step 4. Get the sum of the squared values in step 3.
Step 5. Substitute the values in the formula and solve.

FOR SET A: (Long Method)

Score $X$ (Step 1)      $X - \bar{X}$ (Step 4)      $(X - \bar{X})^2$ (Step 5)
5                       5 - 9 = -4                  $(-4)^2 = 16$
6                       6 - 9 = -3                  $(-3)^2 = 9$
7                       7 - 9 = -2                  $(-2)^2 = 4$
8                       8 - 9 = -1                  1
10                      10 - 9 = 1                  1
11                      11 - 9 = 2                  4
12                      12 - 9 = 3                  9
13                      13 - 9 = 4                  16
$\Sigma X = 72$ (Step 2)                            $\Sigma (X - \bar{X})^2 = 60$ (Step 6)

(Step 3) $\bar{X} = \dfrac{\Sigma X}{n} = \dfrac{10+8+6+5+12+11+13+7}{8} = \dfrac{72}{8} = 9$

(Step 7) $s^2 = \dfrac{\Sigma (X - \bar{X})^2}{n - 1} = \dfrac{60}{8 - 1} = 8.57$

FOR SET A: (Short Method)

Score $X$ (Step 1)      $X^2$ (Step 3)
5                       $5^2 = 25$
6                       $6^2 = 36$
7                       $7^2 = 49$
8                       64
10                      100
11                      121
12                      144
13                      169
$\Sigma X = 72$ (Step 2)      $\Sigma X^2 = 708$ (Step 4)

(Step 5) $s^2 = \dfrac{\Sigma X^2 - \dfrac{(\Sigma X)^2}{n}}{n - 1} = \dfrac{708 - \dfrac{(72)^2}{8}}{8 - 1} = 8.57$

The variance of a set of data is always expressed in square units since the
deviations from the mean are squared. The square root of the variance will give us a
measure that has the same unit as the data. This is the standard deviation.
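
To see that the deviation method and the raw score method agree, here is a small sketch of Formulas 5a and 5b applied to Set A. The function names are illustrative; `statistics.variance` from the standard library, which also divides by n - 1, is used only as an independent check.

```python
from statistics import variance

def variance_deviation_method(values):
    """Formula 5a: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def variance_raw_score_method(values):
    """Formula 5b: (sum of X^2 - (sum of X)^2 / n) divided by n - 1."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)

set_a = [10, 8, 6, 5, 12, 11, 13, 7]

long_result = variance_deviation_method(set_a)    # 60 / 7
short_result = variance_raw_score_method(set_a)   # (708 - 648) / 7
print(round(long_result, 2), round(short_result, 2), round(variance(set_a), 2))
```

All three values round to 8.57 for Set A.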

Try this!
Solve set B as an exercise.

5. The Standard Deviation

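By Deviation Method (Long Method)

$$s = \sqrt{\frac{\Sigma (X - \bar{X})^2}{n - 1}} \quad \text{or} \quad s = \sqrt{s^2}$$  (Formula 6a)

By Raw Score Method (Short Method)

$$s = \sqrt{\frac{\Sigma X^2 - \dfrac{(\Sigma X)^2}{n}}{n - 1}} \quad \text{or} \quad s = \sqrt{s^2}$$  (Formula 6b)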

Example:

Calculate the standard deviation of the two sets of observations.

Sets of observations:
A: 10, 8, 6, 5, 12, 11, 13, 7
B: 9, 4, 8, 6, 10, 9, 10, 17

To solve the standard deviation, follow the procedures in computing the
variance either by deviation method or by raw score method. Substitute the computed
value of the variance in the standard deviation formula.

                      SET A                                      SET B
Standard Deviation    $s = \sqrt{s^2} = \sqrt{8.57} = 2.93$      $s = \sqrt{s^2} = \sqrt{14.4107} = 3.80$

Interpretation: The computed standard deviations of the two sets of values show
that the values in set A are less dispersed from the mean since its standard
deviation is smaller than the standard deviation of set B. We can also say that the
values in set A are less variable than the values in set B.
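
The two standard deviations can be reproduced with the standard library, as in the sketch below. `statistics.variance` and `statistics.stdev` both use the n - 1 divisor, matching Formulas 5a and 6a; the variable names are illustrative.

```python
from statistics import stdev, variance

set_a = [10, 8, 6, 5, 12, 11, 13, 7]
set_b = [9, 4, 8, 6, 10, 9, 10, 17]

for name, scores in (("Set A", set_a), ("Set B", set_b)):
    s_squared = variance(scores)   # sample variance (divides by n - 1)
    s = stdev(scores)              # positive square root of the variance
    print(name, "variance:", round(s_squared, 4), "standard deviation:", round(s, 2))
```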

Let us have another example.

You were asked to buy brown sugar, and you saw two brands with the same
price that appear to have the same quality. You cannot choose which brand to buy.
How can you decide? Here is what you can do.

Take five one-kilo packs of each brand, weigh them, and list the results. Let us
say that the table below shows the weights of the ten packs of sugar from the two
brands. By just comparing the values, it would be difficult for you to choose a better
brand. But this is where standard deviation comes in!

Brand A    0.87 kg    1.03 kg    1.04 kg    1.01 kg    1.14 kg
Brand B    0.99 kg    0.81 kg    1.07 kg    1.05 kg    1.16 kg

Computations reveal the following:

           Mean        Standard Deviation
Brand A    1.018 kg    s = 0.0968 kg
Brand B    1.016 kg    s = 0.1303 kg

We can see that the two brands of brown sugar have almost the same mean,
but they differ in their standard deviations. Since the standard deviation of brand A is
smaller than that of brand B, the contents of the packs of brand A are more uniform
than those of brand B. Therefore, it is better to buy
brand A!
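
The computations for the two brands can be reproduced with a few lines of Python. This is a sketch using the weights listed in the table; the variable names `brand_a` and `brand_b` are illustrative.

```python
from statistics import mean, stdev

# Weights in kilograms of five packs from each brand.
brand_a = [0.87, 1.03, 1.04, 1.01, 1.14]
brand_b = [0.99, 0.81, 1.07, 1.05, 1.16]

for name, weights in (("Brand A", brand_a), ("Brand B", brand_b)):
    # stdev uses the n - 1 divisor, as in Formula 6a.
    print(name, "mean:", round(mean(weights), 3), "s:", round(stdev(weights), 4))

# The brand with the smaller standard deviation has the more uniform packs.
more_uniform = "Brand A" if stdev(brand_a) < stdev(brand_b) else "Brand B"
print("More uniform:", more_uniform)
```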

Now let us see how to solve the measures of variability for grouped data.

Measures of Variability for Grouped Data

For the computation of the measures of variability for grouped data, let us
consider the frequency distribution of two sections in Math 323 in a 60-item
examination.
            SET A                             SET B

LL - UL      f     <cf           LL - UL      f     <cf
30 - 34      3       3           31 - 35      6       6
35 - 39      9      12           36 - 40      7      13
40 - 44     14      26           41 - 45     12      25
45 - 49     11      37           46 - 50      8      33
50 - 54      6      43           51 - 55      6      39
55 - 59      2      45           56 - 60      1      40
c = 5     n = 45                 c = 5     n = 40

1. The Range

Range = Highest UPPER CLASS BOUNDARY - Lowest LOWER CLASS BOUNDARY


R = HUCB - LLCB

FOR SET A:

HUCB = 59.5
LLCB = 29.5

R = HUCB - LLCB
R = 59.5 - 29.5
R = 30

The range of 30 shows the width of the distribution.
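
The class boundaries used above come from extending each class limit by half a unit. The sketch below assumes whole-number scores (so the half-unit is 0.5); the function name `grouped_range` and the list `set_a_classes` are illustrative.

```python
def grouped_range(class_limits, half_unit=0.5):
    """R = highest upper class boundary - lowest lower class boundary."""
    lowest_lower_limit = min(lower for lower, _ in class_limits)
    highest_upper_limit = max(upper for _, upper in class_limits)
    llcb = lowest_lower_limit - half_unit    # lowest lower class boundary
    hucb = highest_upper_limit + half_unit   # highest upper class boundary
    return hucb - llcb

# Class limits (LL, UL) of Set A.
set_a_classes = [(30, 34), (35, 39), (40, 44), (45, 49), (50, 54), (55, 59)]
print("R for Set A:", grouped_range(set_a_classes))  # 59.5 - 29.5 = 30.0
```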

Solve for the range of set B. Compare the range of Set A to the range of Set B.

2. The Quartile Deviation

$$QD = \frac{Q_3 - Q_1}{2} = \frac{P_{75} - P_{25}}{2} = \frac{IQR}{2}$$

SET A

LL - UL      LB - UB          f      <cf
30 - 34      29.5 - 34.5      3       3     (<cf_bi for P25)
35 - 39      34.5 - 39.5      9      12     (11.25 is found here; l_i = 34.5, f_i = 9)
40 - 44      39.5 - 44.5     14      26     (<cf_bi for P75)
45 - 49      44.5 - 49.5     11      37     (33.75 is found here; l_i = 44.5, f_i = 11)
50 - 54      49.5 - 54.5      6      43
55 - 59      54.5 - 59.5      2      45
c = 5                       n = 45

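Recall the steps in solving quantiles (quartiles) of grouped data.

FOR SET A

Solving for Q3 (Q3 = P75): i = 75 and n = 45

$$\frac{in}{100} = \frac{75(45)}{100} = 33.75$$

$$P_{75} = \ell_i + \left(\frac{\dfrac{in}{100} - {<}cf_{bi}}{f_i}\right)c = 44.5 + \left(\frac{33.75 - 26}{11}\right)5 = 48.02$$

Solving for Q1 (Q1 = P25): i = 25 and n = 45

$$\frac{in}{100} = \frac{25(45)}{100} = 11.25$$

$$P_{25} = \ell_i + \left(\frac{\dfrac{in}{100} - {<}cf_{bi}}{f_i}\right)c = 34.5 + \left(\frac{11.25 - 3}{9}\right)5 = 39.08$$

Here $\ell_i$ is the lower boundary of the class that contains the percentile, $<cf_{bi}$
is the cumulative frequency before that class, $f_i$ is the frequency of that class,
and $c$ is the class size.

$$QD = \frac{Q_3 - Q_1}{2} = \frac{P_{75} - P_{25}}{2} = \frac{IQR}{2} = \frac{48.02 - 39.08}{2} = \frac{8.94}{2} = 4.47$$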

Interpretation: The results show that the middle 50% of the scores of
the forty-five students in Math 323 lie between 39.08 and 48.02. Half
the distance between the first quartile and the third quartile is 4.47.
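
The grouped-data percentile formula can be sketched as follows, using the Set A frequencies from the working table above (3, 9, 14, 11, 6, 2). The function name `grouped_percentile` and the tuple layout `(LL, UL, f)` are illustrative assumptions, as is the half-unit of 0.5 used to turn class limits into class boundaries.

```python
def grouped_percentile(classes, i, width, half_unit=0.5):
    """P_i = l_i + ((i*n/100 - <cf_bi) / f_i) * c for a frequency distribution."""
    n = sum(f for _, _, f in classes)
    target = i * n / 100              # position of the percentile
    cumulative_before = 0             # <cf_bi: cumulative frequency before the class
    for lower_limit, _, f in classes:
        if cumulative_before + f >= target:
            lower_boundary = lower_limit - half_unit   # l_i
            return lower_boundary + (target - cumulative_before) / f * width
        cumulative_before += f
    raise ValueError("percentile position lies beyond the last class")

# (LL, UL, f) for Set A, taken from the working table.
set_a = [(30, 34, 3), (35, 39, 9), (40, 44, 14), (45, 49, 11), (50, 54, 6), (55, 59, 2)]

p75 = grouped_percentile(set_a, 75, width=5)
p25 = grouped_percentile(set_a, 25, width=5)
qd = (p75 - p25) / 2
print(round(p75, 2), round(p25, 2), round(qd, 2))
```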

Try this!
Solve for the QD of set B as an exercise. Interpret the result.

3. The Mean Deviation, Variance, and the Standard Deviation

In computing the mean deviation, follow the steps below.

Step 1. Find the class mark (class midpoint) of each class.
Step 2. Multiply each frequency (f) by its corresponding class mark ($X_i$).
Step 3. Find the sum of the products of the frequency and its class mark ($\Sigma fX_i$).
Step 4. Compute the mean ($\bar{X}$) of the distribution.
Step 5. Subtract the mean ($\bar{X}$) from each of the class marks.
Step 6. Write the absolute values of the results obtained in Step 5 under the
        $|X_i - \bar{X}|$ column.
Step 7. Multiply each frequency by its corresponding absolute deviation from the
        mean.
Step 8. Find the sum of the products under column $f|X_i - \bar{X}|$.
Step 9. Compute the mean deviation.

$$MD = \frac{\Sigma f|X_i - \bar{X}|}{n}$$  (Formula 7)

where: $f$ = frequencies
       $X_i$ = class marks (class midpoints)
       $\bar{X}$ = mean of the distribution
       $n$ = total number of observations

In calculating the variance, follow the steps below.

Step 10. Square each deviation obtained in Step 6 and write the results under column
9.
Step 11. Multiply each squared deviation in Step 10 by its corresponding frequency
and get the sum of the products.
Step 12. Compute the variance using Method 1 (Long Method).

Method 1. (Long Method)

$$s^2 = \frac{\Sigma f(X_i - \bar{X})^2}{n - 1}$$  (Formula 8)

In determining the standard deviation, follow the step below.

Step 13. The standard deviation can be solved using the formula given below or by
simply getting the square root of the variance.

$$s = \sqrt{\frac{\Sigma f(X_i - \bar{X})^2}{n - 1}} \quad \text{or} \quad s = \sqrt{s^2}$$  (Formula 9)

Table 1 shows the additional columns for the computation of the mean deviation, the
variance, and the standard deviation of SET A.
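
Steps 1 to 13 can also be followed in code. The sketch below applies Formulas 7, 8, and 9 to the Set A frequencies used in the quartile computation (3, 9, 14, 11, 6, 2); it is an illustration of the steps under that assumption, and the variable names are hypothetical.

```python
from math import sqrt

# (LL, UL, f) for Set A.
set_a = [(30, 34, 3), (35, 39, 9), (40, 44, 14), (45, 49, 11), (50, 54, 6), (55, 59, 2)]

frequencies = [f for _, _, f in set_a]
class_marks = [(lower + upper) / 2 for lower, upper, _ in set_a]    # Step 1
n = sum(frequencies)

# Steps 2 to 4: mean of the distribution.
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Steps 5 to 9: mean deviation (Formula 7).
mean_deviation = sum(f * abs(x - mean) for f, x in zip(frequencies, class_marks)) / n

# Steps 10 to 12: variance by the long method (Formula 8).
variance = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks)) / (n - 1)

# Step 13: standard deviation (Formula 9).
standard_deviation = sqrt(variance)

print("mean:", round(mean, 2))
print("MD:", round(mean_deviation, 2))
print("variance:", round(variance, 2))
print("s:", round(standard_deviation, 2))
```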

However, the variance and the standard deviation can be computed more quickly by using
Method 2 (Short Method).
In solving the variance of grouped data by using Method 2 (Short Method),
follow the steps enumerated below.
Step 1. Assign values for the class intervals to be written in column d. You may choose
from any of the class intervals and assign 0 to it. Consecutive negative
integers will be assigned to the class intervals before the class interval
assigned with 0 (see the table that follows). Positive numbers will be assigned
to the class intervals after the one assigned with 0.
Step 2. Square each value in the d column.
Step 3. Multiply the frequencies by their corresponding d values.
Step 4. Find the sum of the products obtained in Step 3.
Step 5. Multiply the frequencies by the corresponding squared d values.
Step 6. Find the sum of the products under the column fd2.
Step 7. Substitute the values in the formula and solve.

$$s^2 = c^2\left[\frac{n\,\Sigma fd^2 - (\Sigma fd)^2}{n(n - 1)}\right]$$  (Formula 10)

where:
s2 = variance
c = class size
f = frequencies
d = coded values
n = number of observations

In determining the standard deviation, follow the step below.

Step 8. The standard deviation can be solved using the formula given below or
by simply getting the square root of the variance.

$$s = c\sqrt{\frac{n\,\Sigma fd^2 - (\Sigma fd)^2}{n(n - 1)}} \quad \text{or} \quad s = \sqrt{s^2}$$  (Formula 11)

METHOD 2. SHORT METHOD (FOR SET A)

LL - UL     f     d (Step 1)    $d^2$ (Step 2)    fd (Step 3)       $fd^2$ (Step 5)
30 - 34     3     -2            $(-2)^2 = 4$      3 x -2 = -6       3 x 4 = 12
35 - 39     9     -1            $(-1)^2 = 1$      9 x -1 = -9       9 x 1 = 9
40 - 44    14      0            $0^2 = 0$         14 x 0 = 0        14 x 0 = 0
45 - 49    11      1            1                 11                11
50 - 54     6      2            4                 12                24
55 - 59     2      3            9                 6                 18
c = 5      n = 45               $\Sigma fd = (-15) + 29 = 14$ (Step 4)      $\Sigma fd^2 = 74$ (Step 6)

Note: Σfd can be negative.



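Summary of Values:
c = 5
n = 45
$\Sigma fd^2 = 74$
$(\Sigma fd)^2 = (14)^2$

(Step 7)
$$s^2 = c^2\left[\frac{n\,\Sigma fd^2 - (\Sigma fd)^2}{n(n - 1)}\right] = 5^2\left[\frac{45(74) - (14)^2}{45(45 - 1)}\right] = 39.57$$

(Step 8)
$$s = c\sqrt{\frac{n\,\Sigma fd^2 - (\Sigma fd)^2}{n(n - 1)}} \quad \text{or} \quad s = \sqrt{s^2} = \sqrt{39.57} = 6.29$$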
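
Step 1 says that any class interval may be assigned 0. The sketch below implements Formula 10 for Set A and checks that two different choices of the zero class lead to the same variance. The function name `coded_variance` and the parameter `zero_index` (position of the class assigned d = 0) are illustrative assumptions.

```python
from math import sqrt

def coded_variance(frequencies, zero_index, width):
    """Formula 10: s^2 = c^2 * (n*sum(f*d^2) - (sum(f*d))^2) / (n*(n - 1))."""
    n = sum(frequencies)
    # Step 1: consecutive integers, with 0 assigned to the chosen class.
    d_values = [position - zero_index for position in range(len(frequencies))]
    sum_fd = sum(f * d for f, d in zip(frequencies, d_values))        # Steps 3 and 4
    sum_fd2 = sum(f * d * d for f, d in zip(frequencies, d_values))   # Steps 5 and 6
    return width ** 2 * (n * sum_fd2 - sum_fd ** 2) / (n * (n - 1))   # Step 7

set_a_frequencies = [3, 9, 14, 11, 6, 2]

# Zero assigned to the class 40 - 44, as in the table (d = -2, -1, 0, 1, 2, 3).
s_squared = coded_variance(set_a_frequencies, zero_index=2, width=5)
# Zero assigned to the first class instead (d = 0, 1, 2, 3, 4, 5).
s_squared_alt = coded_variance(set_a_frequencies, zero_index=0, width=5)

print(round(s_squared, 2), round(s_squared_alt, 2), round(sqrt(s_squared), 2))
```

Both choices give a variance of about 39.57 and a standard deviation of about 6.29, so the choice of the zero class affects only the size of the intermediate sums.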

Let us assume that another section, Set C, took the same test. The scores in Set C
have a mean of 43.9 and a standard deviation of 9.32. Which set has more variability
in its scores?

Comparing Set A with Set C:

Set A has a mean of 43.56 and a standard deviation of 6.29. Set C has a mean
of 43.9 and a standard deviation of 9.32. Since the standard deviation of Set C is
higher than the standard deviation of Set A, we can say that the scores in Set C are
more variable than the scores in Set A. This means that the scores in Set C are more
dispersed from the mean.

It’s your turn to construct Table 2 for Set B and solve for its mean deviation, variance,
and standard deviation. Are the scores in Set B more variable than the scores in Set
A?

