Lecture 2: Methods For Describing Data Through Numerical Measures
Outline
Measures of central tendency and dispersion
Characteristics, uses, advantages, and disadvantages of
each measure of location and dispersion
School of Business
Measures of location
Measures of dispersion
1/27/2015
Measures of Location
The average is calculated by summing the values and dividing by the number of values.
North South University
Arithmetic Mean
The Arithmetic Mean is the most widely used measure of location and shows the central value of the data.

Population Mean
$$\mu = \frac{\sum X}{N}$$
where
$\mu$ is the population mean.
$N$ is the total number of observations.
$X$ is a particular value.
$\Sigma$ indicates the operation of adding.
Example 1
The Kiers family owns four cars. The following is the current mileage on each of the four cars: 56,000; 42,000; 23,000; 73,000.
Find the mean mileage for the cars.
$$\mu = \frac{\sum X}{N} = \frac{56{,}000 + 42{,}000 + 23{,}000 + 73{,}000}{4} = 48{,}500$$

Sample Mean
For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values:
$$\bar{X} = \frac{\sum X}{n}$$
Example 2
A sample of five executives received the following bonus last year ($000): 14.0, 15.0, 17.0, 16.0, 15.0
$$\bar{X} = \frac{\sum X}{n} = \frac{14.0 + 15.0 + 17.0 + 16.0 + 15.0}{5} = \frac{77}{5} = 15.4$$
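The two mean calculations so far (the car mileages in Example 1 and the executive bonuses in Example 2) can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the lecture.

```python
# Example 1: mileage on the four cars, treated as a whole population.
mileage = [56_000, 42_000, 23_000, 73_000]
population_mean = sum(mileage) / len(mileage)

# Example 2: bonuses ($000) for a sample of five executives.
bonuses = [14.0, 15.0, 17.0, 16.0, 15.0]
sample_mean = sum(bonuses) / len(bonuses)

print(population_mean)        # mean mileage
print(round(sample_mean, 1))  # mean bonus in $000
```

The arithmetic is identical for a population and a sample; only the notation changes.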
Example 3
Weighted Mean
$$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n}$$
Example 4
$$\bar{X}_w = \frac{\sum wX}{\sum w} = \frac{\sum wX}{50} = \$0.89$$
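A minimal sketch of the weighted mean formula follows. The prices and quantities are hypothetical, since the slide data for this example is not available here; they were picked only so that the weights sum to 50 and the result comes out near $0.89. The function name `weighted_mean` is likewise illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical data, NOT taken from the slides.
prices = [0.50, 0.75, 0.90, 1.15]   # unit prices in dollars
quantities = [5, 15, 15, 15]        # weights; they sum to 50

result = weighted_mean(prices, quantities)
print(round(result, 2))
```

Each price counts in proportion to its quantity, so the large-quantity items pull the mean toward their prices.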
The Median
Example 5
The median is found at the (n+1)/2 = (4+1)/2 = 2.5th data point.
The Mode
Example 6
The exam scores for ten students are: 81, 93, 84,
75, 68, 87, 81, 75, 81, 87. Because the score of
81 occurs the most often, it is the mode.
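The mode in Example 6 can be found by counting how often each score occurs. The sketch below uses `collections.Counter` from the Python standard library and returns every value tied for the highest count, so it would also cope with a bimodal data set.

```python
from collections import Counter

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]

counts = Counter(scores)          # score -> number of occurrences
highest = max(counts.values())    # the largest frequency
modes = sorted(v for v, c in counts.items() if c == highest)

print(modes, highest)
```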
Skewed distribution: One whose shapes on either side of
the center differ; a nonsymmetrical distribution.
Symmetric distribution: Mean = Median = Mode
[Figures: relative positions of the mean, median, and mode in symmetric and skewed distributions]
Negatively Skewed: Mean and Median are to the left of the Mode.

Geometric Mean
The Geometric Mean (GM) of a set of n positive numbers is defined as the nth root of the product of the n numbers. The formula is:
$$GM = \sqrt[n]{(X_1)(X_2)(X_3)\cdots(X_n)}$$
The geometric mean is used to average percents, indexes, and relatives.

Example 7
$$GM = \sqrt[3]{(5)(21)(4)} = 7.49$$

Example 8
Example 9
[Figure: Sales in Millions ($) by Year, 1999 to 2004]
$$GM = \sqrt[8]{\frac{835{,}000}{755{,}000}} - 1 = 1.0127 - 1 = 0.0127$$
The value 0.0127 indicates that the average annual growth over the last 8-year period was 1.27%.
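Both geometric mean calculations in this part of the lecture can be checked with a short sketch. It assumes the Example 7 values are 5, 21, and 4, and that the growth figure compares a starting value of 755,000 with an ending value of 835,000 over 8 periods. The function names are illustrative.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, and all values positive")
    return math.prod(values) ** (1 / len(values))


def average_growth_rate(start, end, periods):
    """Average rate of change per period: (end / start) ** (1 / periods) - 1."""
    return (end / start) ** (1 / periods) - 1


gm_example_7 = geometric_mean([5, 21, 4])
growth = average_growth_rate(755_000, 835_000, 8)

print(round(gm_example_7, 2))
print(round(growth, 4))
```

For positive values that are not all equal, the geometric mean is below the arithmetic mean; here the arithmetic mean of 5, 21, and 4 is 10.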
Measures of Dispersion
Dispersion refers to the spread or variability in the data.

Example 10
-8.1, -5.1, -3.1, -1.4, 1.2,
3.2, 4.1, 4.6, 4.8, 5.7,
5.9, 6.3, 7.9, 7.9, 8.0,
8.1, 9.2, 9.5, 9.7, 10.3,
12.3, 13.3, 14.0, 15.0, 22.1
MAD (mean absolute deviation, also written MD): The arithmetic mean of the absolute values of the deviations from the arithmetic mean.
$$MD = \frac{\sum |X - \bar{X}|}{n}$$

Example 11
$$MD = \frac{\sum |X - \bar{X}|}{n} = \frac{1 + 5 + 1 + 4 + 1}{5} = 2.4$$
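The mean deviation formula can be sketched in Python as below. The five observations are hypothetical, since the original data for Example 11 is not available here; they were chosen so that the mean deviation equals the 2.4 reported in the example. The function name is illustrative.

```python
def mean_absolute_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)


# Hypothetical observations, NOT taken from the slides (mean is 102).
observations = [103, 97, 101, 106, 103]

md = mean_absolute_deviation(observations)
print(round(md, 1))
```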
Population Variance
Variance: the arithmetic mean of the squared deviations from the mean.
Standard deviation: The square root of the variance.
Example 10 (revisited)
In Example 10, the variance and standard deviation are:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = 42.227$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{42.227} = 6.498$$
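As a check on these figures, the sketch below recomputes the population variance and standard deviation with Python's `statistics` module. It assumes the Example 10 data consists of the 25 observations listed under that example, with 7.9 occurring twice.

```python
from statistics import fmean, pstdev, pvariance

# Example 10 observations as read from the slide (assumed to be 25 values).
observations = [
    -8.1, -5.1, -3.1, -1.4, 1.2,
    3.2, 4.1, 4.6, 4.8, 5.7,
    5.9, 6.3, 7.9, 7.9, 8.0,
    8.1, 9.2, 9.5, 9.7, 10.3,
    12.3, 13.3, 14.0, 15.0, 22.1,
]

mu = fmean(observations)          # population mean
sigma2 = pvariance(observations)  # divides by N, not N - 1
sigma = pstdev(observations)      # square root of the population variance

print(round(mu, 2), round(sigma2, 3), round(sigma, 3))
```

`pvariance` and `pstdev` divide by N, which matches the population formulas; the sample versions (`variance`, `stdev`) divide by n - 1.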
Sample Variance
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
$$s = \sqrt{s^2}$$
Example 12
The hourly wages earned by a sample of five students are:
$7, $5, $11, $8, $6.
Find the sample variance and standard deviation.
$$\bar{X} = \frac{\sum X}{n} = \frac{37}{5} = 7.40$$
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30$$
$$s = \sqrt{s^2} = \sqrt{5.30} = 2.30$$
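The Example 12 calculation can be reproduced step by step. The sketch below spells out the sample formulas (dividing by n - 1) rather than calling a library routine, so each intermediate quantity is visible.

```python
import math

wages = [7, 5, 11, 8, 6]  # hourly wages in dollars

n = len(wages)
x_bar = sum(wages) / n                                  # sample mean
sum_sq_dev = sum((x - x_bar) ** 2 for x in wages)       # sum of squared deviations
s2 = sum_sq_dev / (n - 1)                               # sample variance
s = math.sqrt(s2)                                       # sample standard deviation

print(round(x_bar, 2), round(s2, 2), round(s, 2))
```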
Chebyshev's theorem
Chebyshev's theorem: For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least:
$$1 - \frac{1}{k^2}$$
where k is any constant greater than 1.
For example, with k = 3.5:
$$1 - \frac{1}{k^2} = 1 - \frac{1}{(3.5)^2} = 1 - \frac{1}{12.25} = 0.92$$
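Chebyshev's bound is simple enough to evaluate for any k. A minimal sketch (the function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3, 3.5):
    print(k, round(chebyshev_lower_bound(k), 2))
```

The bound holds for any distribution shape, which is why it is weaker than the percentages that apply only to bell-shaped distributions.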
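Empirical Rule: for a symmetrical, bell-shaped distribution,
About 68% of the observations lie within one standard deviation of the mean.
About 95% of the observations lie within two standard deviations of the mean.
Virtually all (99.7%) lie within three standard deviations of the mean.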
The mean of a sample of data organized in a frequency distribution is computed by the following formula:
$$\bar{X} = \frac{\sum Mf}{n}$$
where M is the class midpoint and f is the class frequency.
Example 13
A sample of ten movie theaters in a large metropolitan area tallied the total number of movies showing last week. Compute the mean number of movies showing.

Movies showing   frequency f   class midpoint M   (f)(M)
1 up to 3        1             2                  2
3 up to 5        2             4                  8
5 up to 7        3             6                  18
7 up to 9        1             8                  8
9 up to 11       3             10                 30
Total            10                               66

$$\bar{X} = \frac{\sum Mf}{n} = \frac{66}{10} = 6.6$$
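The grouped mean in Example 13 can be sketched as follows. The class frequencies 1, 2, 3, 1, and 3 are an assumption: they are the values that reproduce the totals n = 10 and sum of fM = 66 shown in the example, and they agree with the deviation figures used later for the standard deviation.

```python
# Each class is (lower limit, upper limit, frequency); frequencies are assumed.
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]

n = sum(f for _, _, f in classes)
# The class midpoint M stands in for every observation in the class.
sum_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
grouped_mean = sum_fm / n

print(n, sum_fm, grouped_mean)
```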
For data organized in a frequency distribution, the median is computed by:
$$Median = L + \frac{\frac{n}{2} - CF}{f}(i)$$
where L is the lower limit of the median class, CF is the cumulative frequency preceding the median class, f is the frequency of the median class, and i is the median class interval.
Example 13 (revisited)
Movies showing   Frequency   Cumulative Frequency
1 up to 3        1           1
3 up to 5        2           3
5 up to 7        3           6
7 up to 9        1           7
9 up to 11       3           10

$$Median = L + \frac{\frac{n}{2} - CF}{f}(i) = 5 + \frac{\frac{10}{2} - 3}{3}(2) = 6.33$$

Example 13 (contd)
Movies showing   frequency f   class midpoint M
1 up to 3        1             2
3 up to 5        2             4
5 up to 7        3             6
7 up to 9        1             8
9 up to 11       3             10

The modes in Example 13 are 6 and 10, so the distribution is bimodal.
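The grouped median formula can be sketched as a small function that walks the classes until the cumulative frequency reaches n/2. As before, the frequencies 1, 2, 3, 1, and 3 are assumed for Example 13, and the function name is illustrative.

```python
def grouped_median(classes):
    """Median for grouped data; classes are (lower, upper, frequency) in order."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= half:
            # L + ((n/2 - CF) / f) * i, with i the class interval
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("median class not found")


classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]
median = grouped_median(classes)
print(round(median, 2))
```

Here the median class is 5 up to 7, because the cumulative frequency first reaches 5 (half of 10) inside it.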
Standard Deviation
$$s = \sqrt{\frac{\sum f(M - \bar{X})^2}{n - 1}}$$
Example 13 (revisited)
A sample of ten movie theaters in a large metropolitan area tallied the total number of movies showing last week. Compute the standard deviation of movies showing.

Movies showing   frequency f   class midpoint M   $(M - \bar{X})$   $f(M - \bar{X})^2$
1 up to 3        1             2                  -4.6              21.16
3 up to 5        2             4                  -2.6              13.52
5 up to 7        3             6                  -0.6              1.08
7 up to 9        1             8                  1.4               1.96
9 up to 11       3             10                 3.4               34.68
Total            10                                                 72.40

$$s = \sqrt{\frac{\sum f(M - \bar{X})^2}{n - 1}} = \sqrt{\frac{72.40}{10 - 1}} = 2.8363$$
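The grouped standard deviation can be checked the same way. The sketch assumes the Example 13 frequencies 1, 2, 3, 1, and 3, which are the values implied by the $f(M - \bar{X})^2$ column.

```python
import math

# (lower limit, upper limit, frequency); frequencies are assumed as stated above.
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]

n = sum(f for _, _, f in classes)
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

x_bar = sum(f * m for f, m in zip(freqs, midpoints)) / n
sum_sq = sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, midpoints))
s = math.sqrt(sum_sq / (n - 1))

print(round(x_bar, 1), round(sum_sq, 2), round(s, 4))
```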
Location of a Percentile
$$L_p = (n + 1)\frac{P}{100}$$
where P is the desired percentile.

Quartiles
First quartile Q1 (25th percentile)
Second quartile Q2, the median (50th percentile)
Third quartile Q3 (75th percentile)
Example 14
Stock prices on twelve consecutive days for a major publicly traded company
[Figure: chart of the twelve stock prices by day]

Example 14 (contd)
Using the twelve stock prices, we can find the median, 25th, and 75th percentiles as follows:
Quartile 3: $L_{75} = (12 + 1)\frac{75}{100} = 9.75$th observation
Median: $L_{50} = (12 + 1)\frac{50}{100} = 6.50$th observation
Quartile 1: $L_{25} = (12 + 1)\frac{25}{100} = 3.25$th observation

Example 14 (contd)
To locate the values, the first step is to organize the data in increasing order.

Rank   Price
12     96
11     92
10     91
9      88
8      86
7      85
6      84
5      83
4      82
3      79
2      78
1      69

75th percentile (Q3): Price at 9.75 observation = 88 + .75(91-88) = 90.25
50th percentile (Median, Q2): Price at 6.50 observation = 84 + .5(85-84) = 84.50
25th percentile (Q1): Price at 3.25 observation = 79 + .25(82-79) = 79.75

Interquartile Range
The interquartile range is the distance between the third quartile Q3 and the first quartile Q1.
Interquartile range = Q3 - Q1
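The percentile location rule and the interpolation step can be sketched as one function, applied here to the twelve Example 14 stock prices. The function name is illustrative. Note that statistical packages often use slightly different percentile conventions, so their answers may not match the (n + 1)P/100 rule exactly.

```python
def percentile(sorted_values, p):
    """Value at location L_p = (n + 1) * p / 100, with linear interpolation."""
    n = len(sorted_values)
    location = (n + 1) * p / 100
    if location < 1 or location > n:
        raise ValueError("percentile location falls outside the data")
    whole = int(location)            # rank of the observation just below
    fraction = location - whole      # how far to move toward the next one
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


prices = sorted([96, 92, 91, 88, 86, 85, 84, 83, 82, 79, 78, 69])

q1 = percentile(prices, 25)
q2 = percentile(prices, 50)
q3 = percentile(prices, 75)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```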
Example 15
For a set of observations the third quartile is 24 and the first quartile is 10. What is the quartile deviation?

Box Plots
Five pieces of data are needed to construct a box plot: the Minimum Value, the First Quartile, the Median, the Third Quartile, and the Maximum Value.
Example 16
[Figure: box plot marking Min, Q1, Median, Q3, and Max on an axis from 12 to 32]

Coefficient of Variation
Relative dispersion
The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:
$$CV = \frac{s}{\bar{X}}(100\%)$$
Skewness
Skewness is the measurement of the lack of symmetry of the distribution.
The coefficient of skewness can range from -3.00 up to 3.00 when using the following formula:
$$sk = \frac{3(\bar{X} - Median)}{s}$$
A value of 0 indicates a symmetric distribution.
Some software packages use a different formula which results in a wider range for the coefficient.

Example 14 revisited
Coefficient of variation:
$$CV = \frac{s}{\bar{X}}(100\%) = 8.5\%$$
Coefficient of skewness:
$$sk = \frac{3(\bar{X} - Median)}{s} = -0.035$$
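Both coefficients for the Example 14 stock prices can be recomputed with the `statistics` module. The sketch assumes s is the sample standard deviation (dividing by n - 1) and that the median is the simple midpoint of the two middle prices, which here coincides with the 6.50th-observation value of 84.50.

```python
from statistics import mean, median, stdev

prices = [96, 92, 91, 88, 86, 85, 84, 83, 82, 79, 78, 69]

x_bar = mean(prices)
med = median(prices)
s = stdev(prices)                  # sample standard deviation

cv = s / x_bar * 100               # coefficient of variation, in percent
sk = 3 * (x_bar - med) / s         # coefficient of skewness

print(round(cv, 1), round(sk, 3))
```

The small negative skewness reflects a mean slightly below the median, pulled down by the lowest price of 69.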
Scatter diagram

Index (000s)   Price
8.0            96
7.5            92
7.5            91
7.3            88
7.2            86
7.2            85
7.1            84
7.1            83
7.0            82
6.2            79
6.2            78
5.1            69
[Figure: scatter diagram for Example 14 revisited, Price plotted against Index (000s)]

Contingency table
A contingency table is a cross tabulation that simultaneously summarizes two variables of interest.
Contingency tables are used when one or both variables are nominally scaled.
Example 17
Weight Loss
45 adults, all 60 pounds overweight, are randomly assigned to three weight loss programs. Twenty weeks into the program, a researcher gathers data on weight loss and divides the loss into three categories: less than 20 pounds, 20 up to 40 pounds, 40 or more pounds. Here are the results.

Example 17 (contd)
Weight Loss
Plan     Less than 20 pounds   20 up to 40 pounds   40 pounds or more
Plan 1
Plan 2
12
Plan 3
12
Practice Problems
Assignment-2
(Problem 13)