Chapter 3

MEASURES OF CENTRAL TENDENCY


3.1. Definition and purpose (objective) of average
When we want to compare groups of numbers, it is useful to have a single value
that is a good representative of each group. This single value is called the
average of the group.
Averages are also called measures of central tendency.
Objectives:
1. To comprehend the data easily
2. To facilitate comparison
3. To make further statistical analysis
A typical average should possess the following properties:
• It should be rigidly defined.
• It should be based on all observations under investigation.
• It should be affected as little as possible by extreme observations.
• It should be capable of further algebraic treatment.
• It should be affected as little as possible by fluctuations of sampling.
• It should be easy to calculate and simple to understand.

3.2. The summation notation


Let X1, X2, X3, …, XN be a set of measurements, where N is the total
number of observations and Xi is the ith observation. A shorthand notation for
the sum X1 + X2 + X3 + ... + XN is $\sum_{i=1}^{N} X_i$, where ∑ is called the
summation notation.

$$\sum_{i=1}^{N} X_i = X_1 + X_2 + X_3 + \dots + X_N, \quad i = 1, 2, \dots, N$$

Properties of Summation

a. $\sum_{i=1}^{n} (X_i \pm Y_i) = \sum_{i=1}^{n} X_i \pm \sum_{i=1}^{n} Y_i$

b. $\sum_{i=1}^{n} k = kn$, where k is a constant value

c. $\sum_{i=1}^{n} kX_i = k \sum_{i=1}^{n} X_i$

d. $\sum_{i=1}^{n} (a + bX_i) = an + b \sum_{i=1}^{n} X_i$

Example: Consider the following data on two variables.

X    Y
2    1
3    2
4    3

Find:
a. $\sum_{i=1}^{3} X_i$ and $\sum_{i=1}^{3} Y_i$

b. $\sum_{i=1}^{3} 3X_i$ and $\sum_{i=1}^{3} 2Y_i$

c. $\sum_{i=1}^{3} (X_i + Y_i)$ and $\sum_{i=1}^{3} (X_i - Y_i)$

d. $\sum_{i=1}^{3} X_i Y_i$ and $\left(\sum_{i=1}^{3} X_i\right)\left(\sum_{i=1}^{3} Y_i\right)$
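
The sums asked for in this example can be checked with a short Python sketch. It assumes the third pair of sums is of (Xi + Yi) and (Xi - Yi); the variable names are illustrative only.

```python
# Data from the example table.
X = [2, 3, 4]
Y = [1, 2, 3]

sum_x = sum(X)                                    # sum of X_i
sum_y = sum(Y)                                    # sum of Y_i
sum_3x = sum(3 * x for x in X)                    # equals 3 * sum_x (property c)
sum_2y = sum(2 * y for y in Y)                    # equals 2 * sum_y (property c)
sum_of_sums = sum(x + y for x, y in zip(X, Y))    # equals sum_x + sum_y (property a)
sum_of_diffs = sum(x - y for x, y in zip(X, Y))   # equals sum_x - sum_y (property a)
sum_of_products = sum(x * y for x, y in zip(X, Y))
product_of_sums = sum_x * sum_y

print(sum_x, sum_y, sum_3x, sum_2y)
print(sum_of_sums, sum_of_diffs)
print(sum_of_products, product_of_sums)
```

Note that the sum of the products (20) is not the same as the product of the sums (54).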

Types of measures of central tendency

The three common measures of central tendency are:

A. The Mean (Arithmetic, Geometric and Harmonic)

B. The Median

C. The mode

A. Arithmetic Mean

The mean is the sum of the values divided by the total number of values.

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

$\bar{X}$ represents the sample mean.

For a population, the Greek letter $\mu$ (mu) is used for the mean.

Arithmetic Mean for a frequency array (ungrouped FD)

$$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_K X_K}{f_1 + f_2 + \dots + f_K}$$

Example:

Arithmetic Mean for Grouped Data

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_{mi}}{\sum_{i=1}^{k} f_i}$$

Where: - Xmi is the class mark of the ith class
         fi is the frequency of the ith class
Example: Calculate the mean for the following age distribution.
Class     Frequency
6-10      35
11-15     23
16-20     15
21-25     12
26-30     9
31-35     6

Solutions:
• First find the class marks
• Find the product of frequency and class marks
• Find mean using the formula.

Class     fi     Xmi    fi·Xmi
6-10      35     8      280
11-15     23     13     299
16-20     15     18     270
21-25     12     23     276
26-30     9      28     252
31-35     6      33     198
Total     100           1575

$$\bar{X} = \frac{\sum_{i=1}^{6} f_i X_{mi}}{\sum_{i=1}^{6} f_i} = \frac{1575}{100} = 15.75$$
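
The three solution steps can be sketched in Python as follows. The function name `grouped_mean` and its arguments are illustrative choices, not part of the text; class marks are taken as the midpoints of the class limits.

```python
def grouped_mean(class_limits, frequencies):
    """Mean of a grouped frequency distribution using class marks."""
    # Step 1: class mark = midpoint of the lower and upper class limits.
    marks = [(low + high) / 2 for low, high in class_limits]
    # Step 2: products of frequency and class mark.
    products = [f * m for f, m in zip(frequencies, marks)]
    # Step 3: divide the sum of products by the total frequency.
    return sum(products) / sum(frequencies)


# Age distribution from the example.
classes = [(6, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, 35)]
freqs = [35, 23, 15, 12, 9, 6]

mean_age = grouped_mean(classes, freqs)
print(mean_age)
```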

Special properties of Arithmetic mean

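1. The sum of the deviations of a set of items from their mean is always zero,
i.e.
$$\sum_{i=1}^{n} (X_i - \bar{X}) = 0$$

2. The sum of the squared deviations of a set of items from their mean is the
minimum, i.e.
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 \le \sum_{i=1}^{n} (X_i - A)^2 \quad \text{for any value } A$$

3. If $\bar{X}_1$ is the mean of $n_1$ observations,
   $\bar{X}_2$ is the mean of $n_2$ observations,
   .
   .
   .
   $\bar{X}_k$ is the mean of $n_k$ observations,
then the mean of all the observations in all groups, often called the combined
mean, is given by:
$$\bar{X}_c = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_k \bar{X}_k}{n_1 + n_2 + \dots + n_k}$$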
Example: In a class there are 30 females and 70 males. If the females averaged
60 in an examination and the males averaged 72, find the mean for the entire
class.

Solution

$$\bar{X}_c = \frac{30(60) + 70(72)}{30 + 70} = \frac{1800 + 5040}{100} = 68.4$$
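
As a check on this example, the combined mean can be computed with a minimal Python sketch. The helper name `combined_mean` is an assumption made here for illustration.

```python
def combined_mean(sizes, means):
    """Combined mean of several groups from their sizes and group means."""
    # Each group mean is weighted by the number of observations in the group.
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# 30 females averaging 60 and 70 males averaging 72.
class_mean = combined_mean([30, 70], [60, 72])
print(class_mean)
```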

4. If a wrong figure has been used when calculating the mean, the correct
mean can be obtained without repeating the whole process using:

$$\bar{X}_{\text{correct}} = \bar{X}_{\text{wrong}} + \frac{\text{correct value} - \text{wrong value}}{n}$$

Example: The average weight of 10 students was calculated to be 65 kg. Later it
was discovered that one weight was misread as 40 kg instead of 80 kg. Calculate
the correct average weight.
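
A small Python sketch of this correction, assuming the usual approach of removing the wrong figure from the total and adding the correct one (the function name `corrected_mean` is illustrative):

```python
def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    """Correct a mean that was computed with one misread observation."""
    wrong_total = wrong_mean * n                           # total actually used
    correct_total = wrong_total - wrong_value + correct_value
    return correct_total / n


# 10 students, reported mean 65, one weight read as 40 instead of 80.
correct_average = corrected_mean(65, 10, wrong_value=40, correct_value=80)
print(correct_average)
```

For the weights in the example this gives a corrected average of 69.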

5. The effect of transforming original series on the mean.


a. If a constant k is added to (or subtracted from) every observation,
then the new mean will be
New mean = Old mean ± k
b. If every observation is multiplied by a constant k, then the new
mean will be
New mean = k × Old mean
Example: The mean of n Tetracycline capsules X1, X2, …, Xn is known to
be 12 gm. A new set of capsules of another drug is obtained by the linear
transformation Yi = 2Xi - 0.5 (i = 1, 2, …, n). What will be the mean of the
new set of capsules?
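
The effect of a linear transformation on the mean can be illustrated numerically. The three capsule weights below are hypothetical values chosen only so that their mean is 12 gm; they are not data from the text.

```python
# Hypothetical capsule weights (gm) whose mean is 12.
x = [10.0, 12.0, 14.0]
y = [2 * xi - 0.5 for xi in x]          # Yi = 2Xi - 0.5

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
transformed_mean = 2 * mean_x - 0.5     # apply the same transformation to the mean

print(mean_x, mean_y, transformed_mean)
```

The mean of the transformed values equals the transformation applied to the old mean, here 2(12) - 0.5 = 23.5.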

Exercise: - The mean of a set of numbers is 500.
a. If 10 is added to each of the numbers in the set, then
what will be the mean of the new set?
b. If each of the numbers in the set is divided by 5, then
what will be the mean of the new set?

Weighted Mean: - The weighted mean is appropriate when different data items are
to be given different importance. Weights are assigned to each item in
proportion to its relative importance.

Let X1, X2, …, Xn be the values of the items of a series and W1, W2, …, Wn their
corresponding weights. Then the weighted mean, denoted $\bar{X}_w$, is defined as:

$$\bar{X}_w = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i} = \frac{W_1 X_1 + W_2 X_2 + \dots + W_n X_n}{W_1 + W_2 + \dots + W_n}$$

Exercise: A teacher assigned weights to assessment methods as 2 for the Quiz,
3 for the Mid-exam and 5 for the Final exam. If a student gets 90, 50 and 60 for
the Quiz, Mid-exam and Final exam respectively, what is his/her average academic
performance?
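
One way to check an answer to this exercise is a short Python sketch of the weighted mean (sum of weight times value, divided by the sum of the weights). The function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of weight * value divided by the sum of weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Scores for Quiz, Mid-exam and Final exam with weights 2, 3 and 5.
performance = weighted_mean([90, 50, 60], [2, 3, 5])
print(performance)
```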

Harmonic Mean

Harmonic Mean is another specialized average which is useful in averaging
variables expressed as a rate per unit of time, such as speed or number of units
produced per day. It is the reciprocal of the arithmetic mean of the reciprocals
of the numbers.

$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$$

For ungrouped FD,

$$HM = \frac{\sum f}{\sum \frac{f}{X}} = \frac{f_1 + f_2 + \dots + f_K}{\frac{f_1}{X_1} + \frac{f_2}{X_2} + \dots + \frac{f_K}{X_K}}$$
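
The following Python sketch shows why the harmonic mean suits rates. The trip (60 km at 30 km/h, then 60 km at 60 km/h) is a hypothetical illustration, not one of the exercises, and the function name is an assumption.

```python
def harmonic_mean(values):
    """n divided by the sum of the reciprocals of the values."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / v for v in values)


# Hypothetical trip: two legs of equal distance at different speeds.
speeds = [30, 60]            # km per hour
leg_distance = 60            # km for each leg

hm_speed = harmonic_mean(speeds)

# Direct check: total distance divided by total time.
total_time = sum(leg_distance / s for s in speeds)
direct_speed = leg_distance * len(speeds) / total_time

print(hm_speed, direct_speed)
```

Both approaches give 40 km/h (up to rounding), whereas the arithmetic mean of the two speeds would be 45 km/h.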

Exercise:

1. Find the harmonic mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great
difference between the HM of A and that of B?

2. Suppose a person drove 100 miles at 40 miles per hour and returned driving
at 50 miles per hour. Find the average speed in miles per hour.

3. A driver traveled 400 km per day for three days at speeds of 60, 50 and 40
kilometers per hour. Find the average speed of the driver.

4. A student reads the first 100 pages of a book at a rate of 5 pages per hour and
the next 100 pages at a rate of 8 pages per hour. What is the student's
average reading speed?

5. A carpenter buys $500 worth of nails at $50 per pound and $500 worth of
nails at $10 per pound. Find the average cost of 1 pound of nails.

• The harmonic mean is affected little by extremely large values, but it cannot
be calculated when one or more observations are zero.

Geometric Mean:

If the variable values are measured as ratios, proportions or percentages, and
some values are large in magnitude while others are small, then the geometric
mean is a better representative of the data than the simple average. In a
"geometric series", the most meaningful average is the geometric mean. The
arithmetic mean is very biased toward the large numbers in the series.

The geometric mean is important in determining the average rate of growth,
percentages, ratios and proportions.

• The disadvantage of GM is that it cannot be calculated if one or more
observations are zero or negative. It is also affected by extreme values,
but not to the extent of AM.

Geometric mean is the nth root of the product of the n values.

$$GM = \sqrt[n]{\prod_{i=1}^{n} X_i} = \sqrt[n]{X_1 X_2 \cdots X_n}$$

This formula is convenient only if n is small. If n is large, it is difficult to
calculate the nth root. Thus, to facilitate the computation, we make use of
logarithms:

$$GM = \text{antilog}\left(\frac{1}{n} \sum \log X\right)$$
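
Both forms of the geometric mean can be compared in a short Python sketch. The data values are hypothetical, base-10 logarithms are an arbitrary choice (any base gives the same result), and the function names are illustrative.

```python
import math


def geometric_mean_root(values):
    """nth root of the product of the n values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.prod(values) ** (1 / len(values))


def geometric_mean_log(values):
    """Antilog of the mean of the logarithms (base 10)."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log


data = [2, 8, 32]                       # hypothetical values; product is 512
gm_root = geometric_mean_root(data)
gm_log = geometric_mean_log(data)
print(gm_root, gm_log)
```

Both functions return approximately 8 for this data; the logarithmic form avoids forming a very large product when n is large.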

Exercise

1. Find the geometric mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great
difference between the GM of A and that of B?

2. The price of a commodity increased by 5% from 1989 to 1990, by 8% from 1990 to
1991 and by 77% from 1991 to 1992. Find the average price increase.

3. A machine depreciated by 10% in each of the first two years and by 40% in the
third year. Find the average rate of depreciation.

4. The growth rates of the Living Life Insurance Corporation for the past 3 years
were 35, 24, and 18%.


B. The Median ($\tilde{X}$)

The median is the value of the variable that divides the data set into two equal
halves; it is the halfway point in a data set. Before you can find this point, the
data must be arranged in order. When the data set is ordered, it is called a
data array. The median either will be a specific value in the data set or will fall
between two values.

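Median for ungrouped data

$$\tilde{X} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value, if } n \text{ is odd}$$

$$\tilde{X} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ value} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ value}}{2}, \text{ if } n \text{ is even}$$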

Example: Find the median of the following numbers.

a. 6, 5, 2, 8, 9, 4.
b. 2, 1, 8, 3, 5.

Solutions:
a) First order the data: 2, 4, 5, 6, 8, 9
Here n = 6, which is even.

$$\tilde{X} = \frac{\left(\frac{6}{2}\right)^{\text{th}} \text{ value} + \left(\frac{6}{2}+1\right)^{\text{th}} \text{ value}}{2} = \frac{3^{\text{rd}} \text{ value} + 4^{\text{th}} \text{ value}}{2} = \frac{5 + 6}{2} = 5.5$$

Interpretation: - 50% of the values are less than 5.5 and the remaining 50% of
the data values are more than 5.5.

b) Order the data: 1, 2, 3, 5, 8
Here n = 5, which is odd.

$$\tilde{X} = \left(\frac{5+1}{2}\right)^{\text{th}} \text{ value} = 3^{\text{rd}} \text{ value} = 3$$
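
The odd and even cases of the median rule can be sketched in Python. The function name `median` is chosen here for illustration; the second call uses the five ordered values 1, 2, 3, 5, 8 from solution (b).

```python
def median(values):
    """Median of ungrouped data using the odd/even position rule."""
    ordered = sorted(values)            # the data must be arranged in order first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # ((n + 1) / 2)th value, which is index n // 2 when counting from 0.
        return ordered[middle]
    # Average of the (n / 2)th and (n / 2 + 1)th values.
    return (ordered[middle - 1] + ordered[middle]) / 2


median_a = median([6, 5, 2, 8, 9, 4])
median_b = median([1, 2, 3, 5, 8])
print(median_a, median_b)
```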

Median for grouped data


For grouped frequency distributions, the median is given by the formula

$$\tilde{X} = LB_{\tilde{X}} + \left(\frac{\frac{n}{2} - Cf_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w$$

Where:-
$LB_{\tilde{X}}$ is the lower class boundary (LCB) of the median class.
$Cf_{\tilde{X}-1}$ is the less than cumulative frequency just before the median class.
$f_{\tilde{X}}$ is the frequency of the median class.
$w$ is the class width.
First obtain the less than cumulative frequencies. From the cumulative
frequencies, select the smallest one that is greater than or equal to n/2. The
median class is the class corresponding to this cumulative frequency.
Example: Find the median of the following distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3

Solutions:

• First find the less than cumulative frequency.
• Identify the median class.
• Find the median using the formula.

Class     Frequency    Cfl
40-44     7            7
45-49     10           17
50-54     22           39
55-59     15           54
60-64     12           66
65-69     6            72
70-74     3            75

n/2 = 75/2 = 37.5
39 is the smallest cumulative frequency greater than or equal to 37.5.
Therefore the median class is 50-54.

$LB_{\tilde{X}}$ = 49.5
$Cf_{\tilde{X}-1}$ = 17
$f_{\tilde{X}}$ = 22
$w$ = 5

$$\tilde{X} = 49.5 + \left(\frac{37.5 - 17}{22}\right) 5 = 54.16$$

Note that the median value is always within the median class.
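
The same steps can be sketched in Python. The function name `grouped_median` and the choice to pass lower class boundaries directly (39.5, 44.5, ...) are assumptions made for this illustration.

```python
from itertools import accumulate


def grouped_median(lower_boundaries, frequencies, width):
    """Median of a grouped frequency distribution."""
    n = sum(frequencies)
    half = n / 2
    # Less than cumulative frequencies.
    cumulative = list(accumulate(frequencies))
    # Median class: first class whose cumulative frequency reaches n / 2.
    k = next(j for j, cf in enumerate(cumulative) if cf >= half)
    cf_before = cumulative[k - 1] if k > 0 else 0
    return lower_boundaries[k] + (half - cf_before) / frequencies[k] * width


# Distribution from the example; boundaries are 0.5 below the lower limits.
boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
freqs = [7, 10, 22, 15, 12, 6, 3]

median_value = grouped_median(boundaries, freqs, width=5)
print(round(median_value, 2))
```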

C. The Mode ( X̂ )

The mode, denoted by $\hat{X}$, is the most frequently occurring value in a set of
observations, that is, the value with the highest frequency.

• Uni-modal: a data set with only one modal value is called uni-modal.

• Bi-modal: a data set with two modal values is called bi-modal.

• Multi-modal: a data set with more than two modal values is called
multi-modal.

Examples:

1. Find the mode of 5, 3, 5, 8, 9.
Mode = 5
2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
It is bimodal data: the modes are 8 and 9.
3. Find the mode of 4, 12, 3, 6, and 7.
There is no mode for this data.
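
These three cases can be reproduced with a small Python sketch. The function name `modes` is illustrative, and treating data in which no value repeats as having no mode is the convention used in example 3.

```python
from collections import Counter


def modes(values):
    """Return the list of modal values (empty if no value repeats)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                       # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5, 3, 5, 8, 9]))           # uni-modal
print(modes([8, 9, 9, 7, 8, 2, 5]))     # bi-modal
print(modes([4, 12, 3, 6, 7]))          # no mode
```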
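Mode for Grouped data
If the data are given in the form of a continuous frequency distribution, the
mode is defined as:

$$\hat{X} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) w$$

Where:
Lmo = lower class limit of the modal class
∆1 = fmo - f1
∆2 = fmo - f2
fmo = frequency of the modal class
f1 = frequency of the class preceding the modal class
f2 = frequency of the class following the modal class
w = class width

Note: The modal class is the class with the highest frequency.

• The mode is the only measure of central tendency that can be used in
finding the most typical case when the data are nominal or categorical.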

Example: Following is the distribution of the size of certain farms selected at
random from a district. Calculate the mode of the distribution.

Size of farms     No. of farms
5-15              8
15-25             12
25-35             17
35-45             29
45-55             31
55-65             5
65-75             3

Solutions:
The modal class is 45-55 because it is the class with the highest
frequency.

Lmo = 45, fmo = 31

f1 = 29, f2 = 5

w = 10

∆1 = 31 - 29 = 2

∆2 = 31 - 5 = 26

$$\hat{X} = 45 + \left(\frac{2}{2 + 26}\right) 10 = 45.71$$
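
The values above can be combined using the usual grouped-mode formula, mode = Lmo + (∆1 / (∆1 + ∆2)) × w. The Python sketch below assumes that formula; the function name `grouped_mode` is illustrative, and a missing neighbouring class is treated as having frequency 0.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of a continuous grouped frequency distribution."""
    # Modal class: the class with the highest frequency (first one if tied).
    k = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f_mo = frequencies[k]
    f_prev = frequencies[k - 1] if k > 0 else 0
    f_next = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    delta1 = f_mo - f_prev
    delta2 = f_mo - f_next
    if delta1 + delta2 == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower_limits[k] + delta1 / (delta1 + delta2) * width


# Farm size distribution from the example.
limits = [5, 15, 25, 35, 45, 55, 65]
farms = [8, 12, 17, 29, 31, 5, 3]

farm_mode = grouped_mode(limits, farms, width=10)
print(round(farm_mode, 2))
```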

MEASURES OF POSITION

In addition to measures of central tendency, there are measures of position or
location. They are used to locate the relative position of a data value in the
data set.

These measures include: -

1. Quartiles
2. Deciles
3. Percentiles

1. Quartiles:
• Quartiles are measures that divide the frequency distribution into four
equal parts.
• The values of the variable corresponding to these divisions are denoted
Q1, Q2 and Q3, often called the first, the second and the third quartile
respectively.
• Q1 is a value which has 25% of the items less than or equal to it.
Similarly, Q2 has 50% of the items with values less than or equal to it and
Q3 has 75% of the items with values less than or equal to it.

Quartile for ungrouped data

Step 1:- Arrange the values in ascending order.
Let Qi be the ith quartile (i = 1, 2, 3), then
$$Q_i = \left(\frac{i(n+1)}{4}\right)^{\text{th}} \text{ value}$$

Example: Find the quartiles (Q1, Q2 and Q3) of the following data.

A. 6, 5, 2, 8, 9, 4
B. 2, 1, 8, 3, 5, 8
Solution:-
A. First order the data: 2, 4, 5, 6, 8, 9
n = 6
$$Q_1 = \left(\frac{1(6+1)}{4}\right)^{\text{th}} \text{ value} = (1.75)^{\text{th}} \text{ value}$$

= x1 + 0.75(x2 - x1)

= 2 + 0.75(4 - 2)

Q1 = 3.5

Interpretation: 25% of the values are less than or equal to 3.5 and the remaining
75% of the data values are more than or equal to 3.5.

Exercise: Find Q2 and Q3. What do you conclude about Q2 and the median?
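
The positional rule with interpolation between neighbouring ordered values can be sketched in Python. The function name `position_quantile` and its `parts` argument are illustrative; with `parts=4` it follows the quartile rule, and the same positional rule with 10 or 100 parts applies to deciles and percentiles. Clamping positions that fall outside 1..n is an assumption of this sketch, not something stated in the text.

```python
import math


def position_quantile(data, i, parts):
    """Value at position i * (n + 1) / parts in the ordered data (1-based)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data set is empty")
    position = i * (n + 1) / parts
    position = min(max(position, 1), n)     # assumption: clamp to 1..n
    lower = math.floor(position)            # whole part of the position
    fraction = position - lower             # fractional part
    if lower >= n:
        return ordered[-1]
    # x_lower + fraction * (x_(lower + 1) - x_lower)
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


q1 = position_quantile([6, 5, 2, 8, 9, 4], 1, 4)
print(q1)
```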

Quartile for grouped data
For grouped data, we have the following formula:

$$Q_i = LB_{Q_i} + \left(\frac{\frac{in}{4} - Cf_{Q_i - 1}}{f_{Q_i}}\right) w, \quad i = 1, 2, 3$$

Where:
LBQi = lower class boundary of the ith quartile class

CfQi-1 = the cumulative frequency before the ith quartile class

fQi = the frequency of the ith quartile class

w = class width

Note: The quartile class (the class containing Qi) is the class with the smallest
less than type cumulative frequency greater than or equal to in/4.

Example: Consider the following age distribution and find Q1 and Q3.

Class     Frequency
3-12      4
13-22     12
23-32     10
33-42     7
43-52     2

Solutions:
• First find the less than cumulative frequency.
• Use the formula to calculate the required quartile.

Class     Frequency    cfl
3-12      4            4
13-22     12           16
23-32     10           26
33-42     7            33
43-52     2            35

• To solve for Q1, find the class that contains the first quartile Q1:
$$\frac{in}{4} = \frac{1(35)}{4} = 8.75$$
The class that contains the (8.75)th value is 13-22.
LBQ1 = 12.5

CfQ1-1 = 4

fQ1 = 12

w = 10

$$Q_1 = 12.5 + \left(\frac{\frac{1 \times 35}{4} - 4}{12}\right) 10 = 12.5 + \left(\frac{4.75}{12}\right) 10 = 16.46$$

• To solve for Q3, find the class that contains the third quartile Q3:
$$\frac{in}{4} = \frac{3(35)}{4} = 26.25$$
The class that contains the (26.25)th value is 33-42.
LBQ3 = 32.5

CfQ3-1 = 26

fQ3 = 7

w = 10

$$Q_3 = 32.5 + \left(\frac{\frac{3 \times 35}{4} - 26}{7}\right) 10 = 32.5 + \left(\frac{0.25}{7}\right) 10 = 32.86$$

Exercise: Find Q2.
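
The grouped-data calculation can be sketched as one Python function. The name `grouped_quantile` and the `parts` argument are illustrative assumptions; `parts=4` corresponds to the quartile formula, and lower class boundaries (2.5, 12.5, ...) are passed in directly.

```python
def grouped_quantile(lower_boundaries, frequencies, width, i, parts):
    """ith quantile of grouped data when the data are split into `parts` parts."""
    n = sum(frequencies)
    target = i * n / parts                  # i * n / 4 for quartiles
    cf_before = 0                           # cumulative frequency before the class
    for lb, f in zip(lower_boundaries, frequencies):
        # Quantile class: first class whose cumulative frequency reaches target.
        if f > 0 and cf_before + f >= target:
            return lb + (target - cf_before) / f * width
        cf_before += f
    raise ValueError("no class contains the requested quantile")


# Age distribution from the example.
boundaries = [2.5, 12.5, 22.5, 32.5, 42.5]
freqs = [4, 12, 10, 7, 2]

q1 = grouped_quantile(boundaries, freqs, width=10, i=1, parts=4)
q3 = grouped_quantile(boundaries, freqs, width=10, i=3, parts=4)
print(round(q1, 2), round(q3, 2))
```

Working directly from the formula, Q1 = 12.5 + (4.75/12)(10), which rounds to 16.46, and Q3 = 32.5 + (0.25/7)(10), which rounds to 32.86.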

2. Deciles:
• Deciles are values that divide the data into ten equal parts.

• These values are denoted by D1, D2, …, D9.

• 10% of the data fall below D1, 20% below D2, …, 90% below D9.

Decile for ungrouped data
Let Di be the ith decile (i = 1, 2, …, 9)
$$D_i = \left(\frac{i(n+1)}{10}\right)^{\text{th}} \text{ value}$$
Example: Find D1 and D5 of the following data.
A. 6, 5, 2, 8, 9, 4, 12, 8, 5, 1
B. 2, 1, 8, 3, 5, 8
Solution:-
A. First order the data: 1, 2, 4, 5, 5, 6, 8, 8, 9, 12
n = 10
$$D_1 = \left(\frac{1(10+1)}{10}\right)^{\text{th}} \text{ value} = (1.1)^{\text{th}} \text{ value}$$
= x1 + 0.1(x2 - x1)
= 1 + 0.1(2 - 1)
D1 = 1.1

Interpretation: 10% of the values are less than or equal to 1.1 and the remaining
90% of the data values are more than or equal to 1.1.

Exercise: Find D2 and D5.

Deciles for grouped data

For grouped data, we have the following formula:

$$D_i = LB_{D_i} + \left(\frac{\frac{in}{10} - Cf_{D_i - 1}}{f_{D_i}}\right) w, \quad i = 1, 2, 3, \dots, 9$$

Where:
LBDi = lower class boundary of the ith decile class

CfDi-1 = the cumulative frequency before the ith decile class

fDi = the frequency of the ith decile class

w = class width

Note: The decile class that contains Di is the class with the smallest less than
type cumulative frequency greater than or equal to in/10.

Example: Consider the following age distribution and find D4 and D8.

Class     Frequency    cfl
3-12      4            4
13-22     12           16
23-32     10           26
33-42     7            33
43-52     2            35

• To solve for D4, find the class that contains the fourth decile D4:
$$\frac{in}{10} = \frac{4(35)}{10} = 14$$

The class that contains the (14)th value is 13-22.

LBD4 = 12.5

CfD4-1 = 4

fD4 = 12

w = 10

$$D_4 = 12.5 + \left(\frac{\frac{4 \times 35}{10} - 4}{12}\right) 10 = 12.5 + \left(\frac{10}{12}\right) 10 = 20.83$$

• To solve for D8, find the class that contains the eighth decile D8:
$$\frac{in}{10} = \frac{8(35)}{10} = 28$$

The class that contains the (28)th value is 33-42.

LBD8 = 32.5

CfD8-1 = 26

fD8 = 7

w = 10

$$D_8 = 32.5 + \left(\frac{\frac{8 \times 35}{10} - 26}{7}\right) 10 = 32.5 + \left(\frac{2}{7}\right) 10 = 35.36$$

Exercise: Find D2, D5 and D9.

3. Percentiles:
• Percentiles are measures that divide the frequency distribution into a
hundred equal parts.
• The values of the variable corresponding to these divisions are denoted
P1, P2, …, P99, often called the first, the second, …, the ninety-ninth
percentile respectively.

Percentile for ungrouped data

Let Pi be the ith percentile (i = 1, 2, …, 99)

$$P_i = \left(\frac{i(n+1)}{100}\right)^{\text{th}} \text{ value}$$

Example: Find P25, P50 and P75 for the natural numbers 1, 2, 3, 4, 5, …, 200.

Solution:-
n = 200
$$P_{25} = \left(\frac{25(200+1)}{100}\right)^{\text{th}} \text{ value} = (50.25)^{\text{th}} \text{ value}$$

= x50 + 0.25(x51 - x50)

= 50 + 0.25(51 - 50)
P25 = 50.25

Interpretation: 25% of the natural numbers are less than or equal to 50.25 and
the remaining 75% of the numbers are more than or equal to 50.25.

Exercise: Find P50 and P75 and interpret them.

Percentile for grouped data

For grouped data, we have the following formula:

$$P_i = LB_{P_i} + \left(\frac{\frac{in}{100} - Cf_{P_i - 1}}{f_{P_i}}\right) w, \quad i = 1, 2, 3, \dots, 99$$

Where:
LBPi = lower class boundary of the ith percentile class

CfPi-1 = the cumulative frequency before the ith percentile class

fPi = the frequency of the ith percentile class

w = class width

Note: The percentile class that contains Pi is the class with the smallest less
than type cumulative frequency greater than or equal to in/100.

Example: Considering the following distribution, find P90.

Values      Frequency    cfl
140-150     17           17
150-160     29           46
160-170     42           88
170-180     72           160
180-190     84           244
190-200     107          351
200-210     49           400
210-220     34           434
220-230     31           465
230-240     16           481
240-250     12           493

Solution: First find the less than type cumulative frequency (the cfl column).

To solve for P90, find the class that contains the ninetieth percentile P90:
$$\frac{in}{100} = \frac{90(493)}{100} = 443.7$$

The class that contains the (443.7)th value is 220-230.

Because these classes are continuous (the upper limit of each class equals the
lower limit of the next), the class limits are also the class boundaries.

LBP90 = 220

CfP90-1 = 434

fP90 = 31

w = 10

$$P_{90} = 220 + \left(\frac{\frac{90 \times 493}{100} - 434}{31}\right) 10 = 220 + \left(\frac{9.7}{31}\right) 10 = 223.13$$
