Chapter 2
Measures of central tendency (averages): a measure of central tendency is a single, central value that describes the entire data set.
Methods:
1. Arithmetic mean (or mean)
2. Median
3. Mode
4. Geometric mean
5. Harmonic mean
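Mean: The mean of n observations is the ratio of the sum of the observations to the total number of observations.
Ungrouped data / raw data:
$$\text{Mean} = \frac{\text{Sum of observations}}{\text{No. of observations}}$$
Grouped data (x is the class mid-value, f the class frequency):
$$\text{Mean} = \frac{\sum f x}{\sum f}$$
Deviation Method:
$$\text{Mean} = A + \frac{\sum f d}{\sum f} \times h, \qquad d = \frac{x - A}{h}$$
where A is the assumed mean and h is the class width.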
1. Child Care Community Nursery is eligible for a county social services grant as long as the average age of its children stays below 9. If these data represent the ages of all the children currently attending Child Care, does it qualify for the grant?
8 5 9 10 9 12 7 12 13 7 8
2. The following data represent the ages of patients admitted to a small hospital.
85 75 66 43 40 88 80 56 56 67 89 83 65 53 75 87 83 52 44 48
a) Construct a frequency distribution with classes 40-49, 50-59, etc.
b) Compute the arithmetic mean (A.M.).
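The grouped-data mean and the deviation method can be sketched on the patient ages above. This is only an illustration: the assumed mean A = 64.5 (the mid-value of the 60-69 class) and the class width h = 10 are choices made here, not values given in the text.

```python
# Patient ages from the exercise above.
ages = [85, 75, 66, 43, 40, 88, 80, 56, 56, 67,
        89, 83, 65, 53, 75, 87, 83, 52, 44, 48]

# Classes 40-49, 50-59, ..., 80-89 stored as (lower limit, upper limit).
classes = [(lo, lo + 9) for lo in range(40, 90, 10)]

# Frequency f of each class and its mid-value x.
freq = [sum(lo <= a <= hi for a in ages) for lo, hi in classes]
mid = [(lo + hi) / 2 for lo, hi in classes]
N = sum(freq)

# Direct method: Mean = sum(f * x) / sum(f)
mean_direct = sum(f * x for f, x in zip(freq, mid)) / N

# Deviation method: Mean = A + (sum(f * d) / sum(f)) * h, with d = (x - A) / h
A = 64.5   # assumed mean (a choice made for this sketch)
h = 10     # class width (distance between successive lower limits)
d = [(x - A) / h for x in mid]
mean_dev = A + sum(f * di for f, di in zip(freq, d)) / N * h

# Mean of the raw (ungrouped) data, for comparison.
mean_raw = sum(ages) / len(ages)

print(freq, mean_direct, mean_dev, mean_raw)
```

Both grouped methods give the same value; it differs slightly from the raw mean because grouping replaces each age by its class mid-value.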
3. National Tire Company holds reserve funds in short-term marketable securities. The ending daily balance (in million $) of the marketable securities account for 2 weeks is shown below:
Week 1: 1.973 1.970 1.972 1.975 1.976
Week 2: 1.969 1.892 1.893 1.887 1.895
What was the average amount invested in marketable securities during
a) the first week?
b) the second week?
c) the 2-week period?
d) An average balance over the 2 weeks of more than $1.970 million would qualify National Tire Company for higher interest rates. Does it qualify?
e) If the answer to part (c) is less than $1.970 million, by how much would the last day's invested amount have to rise to qualify the company for the higher interest rates?
Merits:
1. It is based on each and every observation of the data.
2. It is useful for performing statistical procedures such as comparing the means from several data sets.
3. It is unique.
Demerits:
1. It is highly affected by extreme values.
2. It is not suitable in the case of open-end class intervals.
Ex: Compute the average of 2, 8, 9, 11.
Mean = (2 + 8 + 9 + 11)/4 = 30/4 = 7.5
Mean is not suitable.
Median: The median is the value that separates the ordered data into two equal parts.
Raw data:
1. Arrange the data in ascending or descending order of magnitude.
2. Compute the [(n+1)/2]th term, which gives the position of the median.
Grouped data:
$$\text{Median} = L + \frac{\frac{N}{2} - m}{f} \times h$$
where
L is the lower limit of the median class,
f is the frequency of the median class,
h is the width of the median class interval,
N is the total frequency,
m is the sum of all the class frequencies up to but not including the median class.
The median class is the class whose cumulative frequency is just greater than N/2.
1. Find the median of 5, 7, 4, 9, 5, 6, 2.
Ans: 2, 4, 5, 5, 6, 7, 9; n = 7, i.e. odd.
Position of the median = (7+1)/2 = 4th term, so median = 5.
2. Find the median of 8, 7, 9, 4, 8, 10, 9, 9, 3, 5.
Ans: 3, 4, 5, 7, 8, 8, 9, 9, 9, 10; n = 10, i.e. even.
Position of the median = (10+1)/2 = 5.5th term, i.e. the median is the average of the 5th and 6th terms = (8+8)/2 = 8.
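The two-step rule for raw data can be sketched as a small function. The name `median_raw` is introduced here for illustration only; it follows the [(n+1)/2]th-term rule and averages the two neighbouring terms when that position is fractional.

```python
def median_raw(values):
    """Median of raw data using the [(n+1)/2]th-term rule."""
    data = sorted(values)          # step 1: arrange in ascending order
    n = len(data)
    pos = (n + 1) / 2              # step 2: 1-based position of the median
    k = int(pos)
    if pos == k:                   # n odd: the median is the k-th term
        return data[k - 1]
    # n even: average of the k-th and (k+1)-th terms
    return (data[k - 1] + data[k]) / 2


print(median_raw([5, 7, 4, 9, 5, 6, 2]))
print(median_raw([8, 7, 9, 4, 8, 10, 9, 9, 3, 5]))
```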
f        C.f
2        2
5        7 (m)
6 (f)    13 (median class)
4        17
3        20
N/2 = 20/2 = 10, L = 4000
The following data relates to distribution of total loans/credit among various borrowers according to rate of interest.
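Rate of interest (%)   % of total   C.F.
< 6                    10.2         10.2
6-8                    15.5         25.7
8-10                   26.0         51.7
10-12                  32.6         84.3
> 12                   15.7         100.0
N = 100
N/2 = 50, so the median class is 8-10, with L = 8, m = 25.7, f = 26 and h = 2.
$$\text{Median} = 8 + \frac{50 - 25.7}{26} \times 2 = 8 + \frac{24.3}{13} = 8 + 1.8692 = 9.8692 \approx 9.87$$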
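The grouped-median formula can be checked on the loan distribution above with a short sketch. The function name `grouped_median` and the use of `None` for the limits of the open-ended classes are assumptions made for this illustration.

```python
def grouped_median(lower, width, freq):
    """Return (index of median class, median) using L + ((N/2 - m) / f) * h."""
    N = sum(freq)
    half = N / 2
    m = 0.0                          # cumulative frequency before the current class
    for i, f in enumerate(freq):
        if m + f > half:             # first class whose c.f. exceeds N/2
            return i, lower[i] + (half - m) / f * width[i]
        m += f
    raise ValueError("no median class found")


classes = ["<6", "6-8", "8-10", "10-12", ">12"]
lower = [None, 6, 8, 10, 12]         # open-ended first class has no lower limit
width = [None, 2, 2, 2, None]        # open-ended classes have no width
freq = [10.2, 15.5, 26.0, 32.6, 15.7]

idx, median = grouped_median(lower, width, freq)
print(classes[idx], round(median, 2))
```

Because the median class is a closed interval, the open-ended classes cause no difficulty here, which is one reason the median is preferred over the mean for such data.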
Merits:
1. Extreme values do not affect its value.
2. It is especially useful when the data are skewed.
Demerits:
1. It may not be representative of the data. Ex: 1, 2, 10
2. It does not consider all the observations.
Mode
The mode of a data set is the value that occurs with greatest frequency. The greatest frequency can occur at two or more different values. If the data have exactly two modes, the data are bimodal. If the data have more than two modes, the data are multimodal.
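Grouped data:
$$\text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times h$$
where L is the lower limit of the modal class, f_1 is the frequency of the modal class, f_0 and f_2 are the frequencies of the classes preceding and following it, and h is the class width.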
Value       8   9   10   11   12   13   14   15
Frequency   5   6   8    7    9    8    9    6
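For a discrete frequency table such as the one above, the mode(s) can be read off by finding every value that attains the greatest frequency. A minimal sketch (the variable names are illustrative only):

```python
values = [8, 9, 10, 11, 12, 13, 14, 15]
freq = [5, 6, 8, 7, 9, 8, 9, 6]

# Every value whose frequency equals the greatest frequency is a mode.
top = max(freq)
modes = [v for v, f in zip(values, freq) if f == top]

if len(modes) == 1:
    kind = "unimodal"
elif len(modes) == 2:
    kind = "bimodal"
else:
    kind = "multimodal"

print(modes, kind)
```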
Geometric Mean
Ex: Compute the average growth rate of a savings account from the following data.
Year   Interest rate (%)   Growth factor   Savings at the end of year (Rs)
1      7                   1.07            107.00
2      8                   1.08            115.56
3      10                  1.10            127.12
4      12                  1.12            142.37
5      18                  1.18            168.00
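Growth factor = 1 + (Interest rate/100)
Let the initial deposit be Rs 100.00.
Average growth factor = (1.07 + 1.08 + 1.10 + 1.12 + 1.18)/5 = 1.11, i.e. an average interest rate of 11% per year.
Therefore a Rs 100.00 deposit would grow in five years to (100) x (1.11) x (1.11) x (1.11) x (1.11) x (1.11) = Rs 168.51.
This is more than the actual final balance of Rs 168.00, so the arithmetic mean overstates the average growth; the geometric mean is the appropriate average for growth factors.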
$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$
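The savings example can be used to compare the two averages. The sketch below assumes the five growth factors 1.07, 1.08, 1.10, 1.12 and 1.18 from that example and uses `math.prod` (Python 3.8+).

```python
import math

# Growth factors for the five years of the savings example.
factors = [1.07, 1.08, 1.10, 1.12, 1.18]
n = len(factors)

am = sum(factors) / n                  # arithmetic mean of the growth factors
gm = math.prod(factors) ** (1 / n)     # geometric mean G = (x1 x2 ... xn)^(1/n)

actual = 100 * math.prod(factors)      # what Rs 100 really grows to
final_am = 100 * am ** n               # compounding at the arithmetic mean
final_gm = 100 * gm ** n               # compounding at the geometric mean

print(round(am, 4), round(gm, 4))
print(round(actual, 2), round(final_am, 2), round(final_gm, 2))
```

Compounding at the geometric mean reproduces the actual final balance, whereas compounding at the arithmetic mean overshoots it.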
Ex: The annual rate of growth for a factory for 5 years is 7%, 8%, 4%, 6% and 10% respectively. What is the average rate of growth per annum for this period?
Ex: The price of a commodity increased by 8% from 1993 to 1994, 12% from 1994 to 1995, and 76% from 1995 to 1996. The average price increase from 1993 to 1996 is quoted as 28.64% and not 32%. Explain and verify.
Ex: The following table shows the production of rice in million tonnes along with % coverage under irrigation in India from 1982-83 to 2003-04. Find the average production of rice from 1982-83 to 1992-93 and from 1993-94 to 2003-04. Also compute the average % coverage under irrigation from 1982-83 to 1992-93 and from 1993-94 to 2003-04.
Year        Production of rice (in million tonnes)   % coverage under irrigation
1982-1983   47.12                                    42
1983-1984   60.1                                     42.7
1984-1985   58.34                                    43.7
1985-1986   63.83                                    42.9
1986-1987   60.56                                    44.1
1987-1988   56.86                                    43.6
1988-1989   70.49                                    45.8
1989-1990   73.57                                    46.1
1990-1991   74.29                                    45.5
1991-1992   74.68                                    47.3
1992-1993   72.86                                    48
1993-1994   80.3                                     48.6
1994-1995   81.81                                    49.8
1995-1996   76.98                                    49.9
1996-1997   81.73                                    51
1997-1998   82.54                                    50.8
1998-1999   86.08                                    52.3
1999-2000   89.68                                    53.9
2000-2001   84.98                                    53.6
2001-2002   93.34                                    53.2
2002-2003   71.82                                    50.2
2003-2004   88.53                                    52.6
Measures of Variability
Range, Interquartile Range, Quartile Deviation, Mean Deviation, Standard Deviation, Coefficient of Variation
Range
The range of a data set is the difference between the largest and smallest data values. It is the simplest measure of variability. It is very sensitive to the smallest and largest data values.
Ex: The following data list the average salary (in '000s) offered to nine students during placement interviews. Find the range and its coefficient for the following series.
96 180 98 75 270 80 102 100 94
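A minimal sketch of the calculation for the salary data above. The text does not define the coefficient of range; the sketch assumes the usual definition, (largest - smallest) / (largest + smallest).

```python
salaries = [96, 180, 98, 75, 270, 80, 102, 100, 94]   # in '000s

largest = max(salaries)
smallest = min(salaries)

data_range = largest - smallest
# Assumed definition: coefficient of range = (L - S) / (L + S)
coeff_range = (largest - smallest) / (largest + smallest)

print(data_range, round(coeff_range, 4))
```

The single offer of 270 accounts for most of the range, which illustrates how sensitive the measure is to extreme values.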
Ex: Calculate the range and its coefficient from the following data.
X   10   11   12   13   14   15
f   8    10   16   20   4    2
Interquartile Range
The interquartile range of a data set is the difference between the third quartile and the first quartile. It overcomes the sensitivity to extreme data values.
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
The middle 50% of the apartments have rents between 445 and 525, a spread of 80.
Roll No Marks
1 25
2 55
3 5
4 45
5 15
6 35
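Arranging the marks in ascending order: 5, 15, 25, 35, 45, 55 (n = 6).
$$Q_1 = \text{size of } \left(\frac{n+1}{4}\right)^{\text{th}} \text{ item} = \text{size of } \left(\frac{7}{4}\right)^{\text{th}} \text{ item} = \text{size of 1.75th item}$$
Size of 1.75th item = size of 1st item + 0.75 (size of 2nd item - size of 1st item) = 5 + 0.75 (15 - 5) = 5 + 7.5 = 12.5
$$Q_3 = \text{size of } \left(\frac{3(n+1)}{4}\right)^{\text{th}} \text{ item} = \text{size of } \left(\frac{3 \times 7}{4}\right)^{\text{th}} \text{ item} = \text{size of 5.25th item}$$
Size of 5.25th item = size of 5th item + 0.25 (size of 6th item - size of 5th item) = 45 + 0.25 (55 - 45) = 45 + 2.5 = 47.5
$$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{47.5 - 12.5}{2} = 17.5$$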
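The quartile calculation for the marks can be sketched in code. The helper `item_size` is a name introduced here; it interpolates linearly between neighbouring ordered items, as in the working above, and does no bounds checking.

```python
def item_size(ordered, pos):
    """Size of the pos-th item (1-based, possibly fractional) of ordered data."""
    k = int(pos)
    frac = pos - k
    if frac == 0:
        return ordered[k - 1]
    # Interpolate between the k-th and (k+1)-th items.
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])


marks = sorted([25, 55, 5, 45, 15, 35])   # marks of roll numbers 1 to 6
n = len(marks)

q1 = item_size(marks, (n + 1) / 4)        # 1.75th item
q3 = item_size(marks, 3 * (n + 1) / 4)    # 5.25th item
qd = (q3 - q1) / 2                        # quartile deviation

print(q1, q3, qd)
```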
Mean Deviation
M.D. is the arithmetic mean of the absolute deviations of all items from the average and is given by
$$\text{M.D.} = \frac{1}{n} \sum |x - \text{Mean}|$$
A rainwear manufacturing company wants to launch some new products in a new state. The rainfall in the state (in cm) for the past 10 years is given in the following table. Find the average deviation and its coefficient.
Year   Rainfall (cm)
1995   110
1996   120
1997   130
1998   135
1999   140
2000   150
2001   160
2002   145
2003   130
2004   125
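A minimal sketch of the mean deviation for the rainfall data above. The coefficient is not defined in the text; the sketch assumes the usual definition, mean deviation divided by the mean.

```python
rainfall = [110, 120, 130, 135, 140, 150, 160, 145, 130, 125]   # cm, 1995 to 2004

n = len(rainfall)
mean = sum(rainfall) / n

# M.D. = (1/n) * sum of |x - Mean|
mean_dev = sum(abs(x - mean) for x in rainfall) / n

# Assumed definition: coefficient of mean deviation = M.D. / Mean
coeff_md = mean_dev / mean

print(mean, mean_dev, round(coeff_md, 4))
```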
Ex: The weekly earnings of 187 employees of a company are given in the following table. Find the mean of the weekly earnings, the average deviation, and its coefficient.
Weekly earnings (in Rs)   No. of employees
100                       5
120                       8
140                       12
160                       16
180                       22
200                       44
210                       80
Variance
The variance is a measure of variability that utilizes all the data.
It is based on the difference between the value of each observation (xi) and the mean.
The variance is the average of the squared differences between each data value and the mean and is usually denoted by $\sigma^2$.
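$$\sigma^2 = \frac{1}{n} \sum (x - \bar{x})^2$$
$$\sigma^2 = \frac{1}{N} \sum f (x - \bar{x})^2$$
$$\sigma^2 = \left[ \frac{1}{N} \sum f d^2 - \left( \frac{1}{N} \sum f d \right)^2 \right] \times h^2, \qquad \text{where } d = \frac{x - a}{h}$$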
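The standard deviation of a data set is the positive square root of the variance, denoted by $\sigma$.
$$\sigma = \sqrt{\frac{1}{n} \sum (x - \bar{x})^2}$$
$$\sigma = \sqrt{\frac{1}{N} \sum f (x - \bar{x})^2}$$
$$\sigma = \sqrt{\frac{1}{N} \sum f d^2 - \left( \frac{1}{N} \sum f d \right)^2} \times h, \qquad \text{where } d = \frac{x - a}{h}$$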
Ex: The following table shows the sales (in million rupees) of four leading cement companies (Ambuja, Birla, Madras Cement, ACC) from 1994-1995 to 2006-2007. Find the range, IQR, standard deviation, variance, and coefficient of variation.
Year      Ambuja    Birla    Madras Cement   ACC
1994-95   3209.1    32747    2973.2          20427
1995-96   4292.3    42876    3901.8          23294.6
1996-97   7305.6    53478    4171.4          24510.5
1997-98   9303.5    56914    4886.7          23731.1
1998-99   11457.8   73030    5223.8          25858.3
1999-00   12523.4   74337    5180.9          26792.2
2000-01   13027.8   75550    6192.6          29361.2
2001-02   14473.2   81199    8166.6          32260
2002-03   15826.3   87763    7506.9          33718.8
2003-04   20251     98945    8451.9          39003.7
2004-05   23012.8   133781   8852.8          45498
2005-06   30258.4   150290   11909.7         37235.1
2006-07   70167     179713   18024.8         64680.6
A quality control laboratory received samples of electric bulbs from two suppliers for testing their lives. The results were as follows.
Length of life (in hours)   Company A   Company B
1500-2000                   16          18
2000-2500                   26          22
2500-3000                   8           8
Which company's bulbs have the greater length of life? Which company's bulbs are more uniform with respect to their lives?
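The step-deviation formulas for the mean and standard deviation can be applied to the bulb data as a sketch. Several things here are assumptions: the frequencies 16, 26, 8 are taken to belong to Company A and 18, 22, 8 to Company B; the assumed mean a = 2250 is a convenient mid-value; and the coefficient of variation is taken in its usual form, (standard deviation / mean) x 100, which the text names but does not define.

```python
import math


def grouped_stats(mids, freqs, a, h):
    """Mean, standard deviation and C.V. (%) by the step-deviation method."""
    N = sum(freqs)
    d = [(x - a) / h for x in mids]                    # d = (x - a) / h
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = a + sum_fd / N * h
    variance = (sum_fd2 / N - (sum_fd / N) ** 2) * h ** 2
    sd = math.sqrt(variance)
    cv = sd / mean * 100                               # assumed definition of C.V.
    return mean, sd, cv


mids = [1750, 2250, 2750]          # mid-values of 1500-2000, 2000-2500, 2500-3000
mean_a, sd_a, cv_a = grouped_stats(mids, [16, 26, 8], a=2250, h=500)
mean_b, sd_b, cv_b = grouped_stats(mids, [18, 22, 8], a=2250, h=500)

print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 2))
print(round(mean_b, 2), round(sd_b, 2), round(cv_b, 2))
```

Under these assumptions, the company with the larger mean has the longer-lived bulbs and the company with the smaller coefficient of variation has the more uniform bulbs.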
The Shareholders' Research Centre of India has recently conducted a research study on the price behaviour of three leading industrial shares, A, B, and C, for the period 1979-1985, the results of which are published as follows in its Quarterly Journal:
Share   Average price (Rs)   Standard deviation   Current selling price
A       18.2                 5.4                  36.00
B       22.5                 4.5                  34.75
C       24.0                 6.0                  39.00
i) Which share, in your opinion, appears to be more stable?
ii) If you are the holder of all the three shares, which one would you like to dispose of at present, and why?
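One common way to compare the stability of the three shares is the coefficient of variation, since their average prices differ. The sketch below assumes C.V. = (standard deviation / average price) x 100 and treats the share with the smallest C.V. as the most stable; it does not attempt part (ii), which is a matter of judgement.

```python
# (average price, standard deviation) for each share, from the table above.
shares = {"A": (18.2, 5.4), "B": (22.5, 4.5), "C": (24.0, 6.0)}

# Assumed definition: C.V. = standard deviation / mean * 100
cv = {name: sd / mean * 100 for name, (mean, sd) in shares.items()}

# The share with the smallest relative variation.
most_stable = min(cv, key=cv.get)

for name in sorted(cv):
    print(name, round(cv[name], 2))
print("smallest C.V.:", most_stable)
```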
X 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
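Relative Measures:
Karl Pearson's Coefficient of Skewness is given by
$$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$$
or
$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$$
Bowley's Coefficient of Skewness is given by
$$S_{k_B} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)}$$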
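The three coefficients can be tried on the X/f frequency table given just before this section (X = 0 to 8 with f = 1, 8, 28, 56, 70, 56, 28, 8, 1). The sketch is illustrative: the quartiles are located with the (n+1)-based positions used earlier for the quartile deviation, and `item_size` is a helper name introduced here.

```python
import math

x = list(range(9))                          # X = 0, 1, ..., 8
f = [1, 8, 28, 56, 70, 56, 28, 8, 1]

N = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, x)) / N
sd = math.sqrt(sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / N)
mode = x[f.index(max(f))]                   # value with the greatest frequency

# Expand the table into the ordered list of all N observations.
data = [xi for xi, fi in zip(x, f) for _ in range(fi)]


def item_size(ordered, pos):
    """Size of the pos-th item (1-based, possibly fractional), interpolated."""
    k = int(pos)
    frac = pos - k
    if frac == 0:
        return ordered[k - 1]
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])


q1 = item_size(data, (N + 1) / 4)
q2 = item_size(data, (N + 1) / 2)           # median
q3 = item_size(data, 3 * (N + 1) / 4)

sk_median = 3 * (mean - q2) / sd            # Pearson, median form
sk_mode = (mean - mode) / sd                # Pearson, mode form
sk_bowley = ((q3 - q2) - (q2 - q1)) / ((q3 - q2) + (q2 - q1))

print(mean, q2, mode, sk_median, sk_mode, sk_bowley)
```

The distribution is symmetric about 4, so the mean, median and mode coincide and all three coefficients come out as zero.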
Ex: The data on the profits (in Rs lakh) earned by 60 companies is as follows:
Profits    No. of companies
Below 10   5
10-20      12
20-30      20
30-40      16
40-50      5
Above 50   2