Chapter Four: Measures of Variation
Chapter Goals
After completing this chapter, you will be able to:
• Compute and interpret the absolute and relative measures of
variation for a set of data.
Introduction
Dispersion refers to the lack of uniformity in the sizes or
qualities of the items in a dataset.
A measure of central tendency shows only the middle or
the average of a dataset; on its own it says nothing about
how much the observations vary.
Example: Consider the following datasets
A: 1, 2, 5, 6, 6 B: −40, 0, 5, 20, 35
Mean(A) = Mean(B) = 4
However, the data in A seem more consistent (less
variable) than the data in B.
If observations are close to the center, we say that
dispersion is small.
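The comparison of A and B above can be checked with a short Python sketch. It only lists each observation's deviation from the mean; the helper name `deviations` is illustrative and not part of the original slides.

```python
from statistics import mean

# Datasets A and B from the example above.
a = [1, 2, 5, 6, 6]
b = [-40, 0, 5, 20, 35]

def deviations(data):
    """Return each observation's signed distance from the mean."""
    center = mean(data)
    return [x - center for x in data]

mean_a, mean_b = mean(a), mean(b)
dev_a, dev_b = deviations(a), deviations(b)

# Both means are equal, but the deviations in B are far larger.
print("means:", mean_a, mean_b)
print("deviations of A:", dev_a)
print("deviations of B:", dev_b)
print("largest absolute deviation:", max(map(abs, dev_a)), max(map(abs, dev_b)))
```

The two datasets share the same center, yet the deviations in B are an order of magnitude larger, which is the sense in which B is more dispersed.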
4.1 Objectives of Measuring Variation
Measures of variation are classified as absolute or relative.
• Absolute measures of variation are expressed in the same units of
measurement as the original data.
• They are recommended for comparing variation between distributions
whose units (standards) of measurement are the
same.
• A relative measure of variation is the ratio of an
absolute measure of variation to a measure of central tendency.
• Relative measures are unit-free, so they are used to compare the
variation of datasets measured in different units (standards).
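As a rough illustration of the distinction, the sketch below measures the same hypothetical weights in kilograms and in grams. It assumes the standard deviation as the absolute measure and the ratio of standard deviation to mean as the relative measure; the data and variable names are invented for the example.

```python
from math import isclose
from statistics import mean, stdev

# Hypothetical weights, recorded in kilograms and again in grams.
weights_kg = [50, 60, 70]
weights_g = [w * 1000 for w in weights_kg]

# Absolute measure: carries the unit of the data, so it changes with the unit.
sd_kg = stdev(weights_kg)
sd_g = stdev(weights_g)

# Relative measure: absolute measure divided by a measure of central tendency.
rel_kg = sd_kg / mean(weights_kg)
rel_g = sd_g / mean(weights_g)

print("absolute:", sd_kg, "kg vs", sd_g, "g")
print("relative:", rel_kg, "vs", rel_g)
print("relative measures agree:", isclose(rel_kg, rel_g))
```

The absolute measure differs by a factor of 1000 between the two unit systems, while the relative measure is the same in both, which is why relative measures suit comparisons across units.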
Absolute and Relative Measures . . .
4.3 Types of Measures of Variation
4.3.1 Range and Relative Range
Range: the difference between the largest and the smallest
values.
For a grouped frequency distribution:
Range = UCB of the last class – LCB of the first class
(UCB = upper class boundary, LCB = lower class boundary)
• Example: Find the range of the following distributions.
1) 23, 42, 20, 30, 35, 21, 45, 33, 23, 23, 20, 42, 29, 20.
Range: 45 – 20 = 25
2) Class:      2.5 – 10.5   10.5 – 18.5   18.5 – 26.5   26.5 – 34.5
   Frequency:  4            7             6             15
Range: 34.5 – 2.5 = 32
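Both range calculations above can be reproduced in a few lines of Python. This is a minimal sketch; representing each class as a `(lower boundary, upper boundary)` tuple is an assumption made for the example.

```python
# 1) Ungrouped data: largest value minus smallest value.
values = [23, 42, 20, 30, 35, 21, 45, 33, 23, 23, 20, 42, 29, 20]
ungrouped_range = max(values) - min(values)

# 2) Grouped data: each class is stored as (lower boundary, upper boundary).
classes = [(2.5, 10.5), (10.5, 18.5), (18.5, 26.5), (26.5, 34.5)]
first_lcb = classes[0][0]   # LCB of the first class
last_ucb = classes[-1][1]   # UCB of the last class
grouped_range = last_ucb - first_lcb

print("ungrouped range:", ungrouped_range)
print("grouped range:", grouped_range)
```

Note that the class frequencies play no role in the grouped range; only the outermost boundaries matter.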
Range and Relative Range . . .
Range cannot be calculated for open–end distributions.
Disadvantages of the Range
[Figure: two different distributions plotted on the same axis (7 to 12); both have Range = 12 - 7 = 5]
• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
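The effect of a single outlier on the range can be verified with the two lists above. In this sketch the lists are rebuilt from their value counts (eleven 1s, eight 2s, four 3s, one 4, plus the final value); the variable names are illustrative.

```python
# Values shared by both lists: eleven 1s, eight 2s, four 3s, one 4.
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]

without_outlier = common + [5]
with_outlier = common + [120]

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

range_without = data_range(without_outlier)
range_with = data_range(with_outlier)

print("range without outlier:", range_without)
print("range with outlier:", range_with)
```

Changing one observation out of 25 moves the range from 4 to 119, even though the other 24 values are untouched.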
– Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where
μ = population mean
N = population size
x_i = ith value of the variable x
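A direct translation of the population variance formula into Python might look like the sketch below. It treats datasets A and B from the introduction as complete populations, which is an assumption made only for illustration, and cross-checks the result against `statistics.pvariance` from the standard library.

```python
from math import isclose
from statistics import pvariance

def population_variance(values):
    """Mean of the squared deviations from the population mean (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

# Datasets A and B from the introduction, treated here as whole populations.
a = [1, 2, 5, 6, 6]
b = [-40, 0, 5, 20, 35]

var_a = population_variance(a)
var_b = population_variance(b)

print("variance of A:", var_a)
print("variance of B:", var_b)
print("matches statistics.pvariance:",
      isclose(var_a, pvariance(a)) and isclose(var_b, pvariance(b)))
```

Although A and B have the same mean, their variances differ greatly, which quantifies the impression that B is the more variable dataset.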
Sample Variance
• Average (approximately) of squared deviations of values
from the mean.
– Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where
x̄ = arithmetic mean of the sample
n = sample size
x_i = ith value of the variable x
• The sum of squares in this case is divided by n-1 in order
to get an unbiased estimator of the population variance.
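The unbiasedness claim can be illustrated on a tiny population by enumerating every possible sample. The sketch below assumes samples of size 2 drawn with replacement from the population 1, 2, 5, 6, 6 (chosen only for illustration) and uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 5, 6, 6]
sample_size = 2

def sum_of_squares(values):
    """Sum of squared deviations from the mean, computed exactly."""
    m = Fraction(sum(values), len(values))
    return sum((x - m) ** 2 for x in values)

# Population variance: divide the sum of squares by N.
sigma2 = sum_of_squares(population) / len(population)

# Every ordered sample of size 2 drawn with replacement (25 samples).
samples = list(product(population, repeat=sample_size))

# Average the two candidate estimators over all possible samples.
avg_div_n_minus_1 = sum(sum_of_squares(s) / (sample_size - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_of_squares(s) / sample_size for s in samples) / len(samples)

print("population variance:", sigma2)
print("average estimate, divisor n - 1:", avg_div_n_minus_1)
print("average estimate, divisor n:", avg_div_n)
```

Averaged over all samples, the n − 1 version reproduces the population variance exactly, while dividing by n underestimates it.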
Standard Deviation
• Standard deviation is the positive square root of variance.
• To put the variance on the same scale as the original data, we
prefer to work with the standard deviation.
• Standard deviation shows variation about the mean.
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Example: Sample Standard Deviation
Sample
Data (xi) : 10 12 14 15 17 18 18 24
n = 8, Mean = x̄ = 16
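The remaining steps of this example can be worked through in Python. This is a sketch that follows the sample standard deviation formula step by step and then compares the result with `statistics.stdev`; the intermediate variable names are illustrative.

```python
from math import isclose, sqrt
from statistics import stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)                                    # sample size
x_bar = sum(data) / n                            # sample mean
squared_devs = [(x - x_bar) ** 2 for x in data]  # (x_i - x_bar)^2 for each value
sum_sq = sum(squared_devs)                       # sum of squared deviations
s2 = sum_sq / (n - 1)                            # sample variance
s = sqrt(s2)                                     # sample standard deviation

print("n =", n, " mean =", x_bar)
print("squared deviations:", squared_devs)
print("sum of squares =", sum_sq)
print("s^2 =", round(s2, 4), " s =", round(s, 4))
print("agrees with statistics.stdev:", isclose(s, stdev(data)))
```

The sum of squared deviations is 130, so s² = 130/7 ≈ 18.5714 and s ≈ 4.3095.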
Measuring Variation
Comparing Standard Deviations
Which of the following datasets is the most variable?
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
[Figure: dot plots of Data A, B, and C on a common axis from 11 to 21]
Algebraic Properties of Variance and Standard Deviation
• sd(X) ≥ 0
• If a constant K is added to or subtracted from each observation, the
variance and sd remain the same.
• If each observation is multiplied by K, the new variance and
sd will be K² × (previous variance) and |K| × (previous sd),
respectively.
• If each observation is divided by K (K ≠ 0), the new variance and sd
will be (previous variance) / K² and (previous sd) / |K|,
respectively.
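These properties are easy to check numerically. The sketch below reuses the sample from the earlier standard deviation example and an arbitrarily chosen negative constant K = -3, so that the role of the absolute value is visible; both choices are assumptions made for the illustration.

```python
from math import isclose
from statistics import stdev, variance

data = [10, 12, 14, 15, 17, 18, 18, 24]
k = -3  # arbitrary non-zero constant; negative to exercise |K|

shifted = [x + k for x in data]     # add K to each observation
multiplied = [x * k for x in data]  # multiply each observation by K
divided = [x / k for x in data]     # divide each observation by K

var0, sd0 = variance(data), stdev(data)

# Adding a constant leaves variance and sd unchanged.
shift_ok = isclose(variance(shifted), var0) and isclose(stdev(shifted), sd0)

# Multiplying by K scales variance by K^2 and sd by |K|.
mult_ok = (isclose(variance(multiplied), k ** 2 * var0)
           and isclose(stdev(multiplied), abs(k) * sd0))

# Dividing by K scales variance by 1/K^2 and sd by 1/|K|.
div_ok = (isclose(variance(divided), var0 / k ** 2)
          and isclose(stdev(divided), sd0 / abs(k)))

print(shift_ok, mult_ok, div_ok)
```

All three checks hold for this data, and the sd stays non-negative even though K is negative.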
If the data are from a sample and approximately normally
distributed, then
x̄ ± 1s will include approximately 68% of the data.
x̄ ± 2s will include approximately 95% of the data.
x̄ ± 3s will include approximately 99.7% of the data.
The Empirical Rule
[Figure: normal curves centered at μ, showing 68% of the data within x̄ ± 1s, 95% within x̄ ± 2s, and 99.7% within x̄ ± 3s]
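The three percentages of the empirical rule come from the normal distribution itself. As a sketch, the exact coverage of an ideal normal distribution can be computed with `statistics.NormalDist` (Python 3.8+); real data that are only approximately normal will match these figures only roughly.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Probability of falling within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```

The computed values are about 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.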
Advantages and Disadvantages of Standard Deviation
Example
• A student obtained 80 on a civics exam that had a mean of 70
and a standard deviation of 10. The same student obtained 60
on a calculus exam, which had a mean of 51 and a variance of
64. On which exam did the student perform better relative to
other students? Why?
                      Civics                  Calculus
Mean                  70                      51
Standard deviation    10                      √64 = 8
Score                 80                      60
Z                     (80 – 70)/10 = 1.00     (60 – 51)/8 = 1.125
Since 1.125 > 1.00, the calculus score lies further above its class
mean in standard deviation units, so the student performed relatively
better on the calculus exam.
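The same comparison can be scripted. This is a minimal sketch; the function name `z_score` is illustrative, and the calculus standard deviation is obtained as the square root of the stated variance of 64.

```python
from math import sqrt

def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

z_civics = z_score(score=80, mean=70, sd=10)
z_calculus = z_score(score=60, mean=51, sd=sqrt(64))  # variance 64 -> sd 8

better = "calculus" if z_calculus > z_civics else "civics"
print("Z (civics):", z_civics)
print("Z (calculus):", z_calculus)
print("relatively better exam:", better)
```

Standardizing puts both scores on a common, unit-free scale, so exams with different means and spreads can be compared directly.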