Numerical Descriptive Statistics


Measures of Central Tendency Measures of Dispersion Measures of Linear Relationship between variables Measures of shape

Numerical Descriptive Statistics

Ritesh Pandey

June 21, 2016

Francis Galton

1822-1911

Karl Pearson

1857-1936

Ronald Fisher

1890-1962

Table of Contents

Measures of Central Tendency

Measures of Dispersion

Measures of Linear Relationship between variables

Measures of shape

Measures of Central Tendency

The Arithmetic Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The Weighted Average

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
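
As a minimal sketch of the weighted average, the function below divides the weighted sum by the sum of the weights. The function name and the sample scores and weights are illustrative assumptions, not part of the slides.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical data: three scores with credit weights.
scores = [70, 80, 90]
credits = [1, 1, 2]

print(weighted_mean(scores, credits))    # weighted average
print(weighted_mean(scores, [1, 1, 1]))  # equal weights reduce to the arithmetic mean
```

With equal weights the weighted average reduces to the ordinary arithmetic mean.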

The Arithmetic Mean: when each value occurs with a given frequency

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$

The Arithmetic Mean: when values are grouped into frequency intervals

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i m_i}{\sum_{i=1}^{n} f_i}$$
where $m_i$ is the midpoint of each interval.
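
A short sketch of the grouped calculation, assuming each interval is given as a (lower, upper) pair. The function name and the class intervals are made up for illustration.

```python
def grouped_mean(intervals, frequencies):
    """Mean of grouped data: each interval is represented by its midpoint."""
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    total = sum(frequencies)
    if total == 0:
        raise ValueError("frequencies must not sum to zero")
    return sum(f * m for f, m in zip(frequencies, midpoints)) / total


# Hypothetical frequency table.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

print(grouped_mean(intervals, frequencies))
```

The result is only an approximation of the true mean, since every value in an interval is treated as if it sat at the midpoint.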

The Arithmetic Mean: calculation from deviations from any number

$$\bar{x} = A + \frac{\sum_{i=1}^{n} d_i}{n}$$
where $d_i = (x_i - A)$ is the deviation of each value from any $A$.
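
The sketch below illustrates that the result does not depend on the choice of $A$: two different assumed values give the same mean. The function name and data are illustrative assumptions.

```python
def mean_from_deviations(values, assumed_mean):
    """Mean computed as A + (sum of deviations from A) / n."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Hypothetical data; any A works, a value near the data keeps the deviations small.
values = [102, 98, 105, 95]

print(mean_from_deviations(values, 100))
print(mean_from_deviations(values, 90))
```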

The Geometric Mean

$$\bar{x}_g = \left[ \prod_{i=1}^{n} x_i \right]^{1/n}$$

The Harmonic Mean

$$\bar{x}_h = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
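
A minimal sketch of the geometric and harmonic means, assuming strictly positive values. The geometric mean is computed through logarithms, which is an implementation choice made here to avoid overflow in the product and is not something the slides prescribe.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed as exp(mean(log x))."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    """n divided by the sum of reciprocals."""
    if any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs strictly positive values")
    return len(values) / sum(1 / x for x in values)


# Hypothetical data.
values = [1, 2, 4]

print(geometric_mean(values))
print(harmonic_mean(values))
print(sum(values) / len(values))  # arithmetic mean, for comparison
```

For positive values the three means are ordered: harmonic ≤ geometric ≤ arithmetic.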

The Median: for ungrouped data

First arrange the values in increasing (or decreasing) order.
For odd n:
Median is the value at the $\left(\frac{n+1}{2}\right)$th spot.
For even n:
Median is (conventionally) taken to be the mean of the values at the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th spots.
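
A small sketch of the ungrouped median. For even n it averages the two middle values, which sit at 1-based positions n/2 and n/2 + 1 in the ordered list. The function name and data are illustrative.

```python
def median(values):
    """Median of ungrouped data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: 1-based positions n/2 and n/2 + 1 are 0-based indices mid - 1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # odd n
print(median([7, 1, 5, 3]))  # even n
```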

The Median: when each value occurs with a given frequency

Here, the cumulative frequency table should be drawn up first. Then find $\sum f = n$. Now locate the value $\frac{n}{2}$ in the cumulative frequency column of that table, taking the first row whose cumulative frequency reaches $\frac{n}{2}$. The corresponding x value is the median.

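The Median: when values are grouped into frequency intervals

Again, first draw up the cumulative frequency table and find the median class.
$$\text{Median} = l_0 + \frac{\frac{n}{2} - cf}{f_0} \, w_0$$
Here $l_0$ is the lower limit of the median class, $w_0$ its width, $f_0$ its frequency, and $cf$ the cumulative frequency of the classes before it.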
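
The interpolation can be sketched as follows, assuming intervals are given as (lower, upper) pairs in increasing order and that the median class is the first class whose cumulative frequency reaches n/2. The function name and the frequency table are hypothetical.

```python
def grouped_median(intervals, frequencies):
    """Median of grouped data by linear interpolation inside the median class."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(intervals, frequencies):
        if f > 0 and cumulative + f >= n / 2:
            width = upper - lower
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


# Hypothetical frequency table.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

print(grouped_median(intervals, frequencies))
```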

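The Mode: when values are grouped into frequency intervals

The modal class is the class with the highest frequency.
$$\text{Mode} = l_0 + \frac{\Delta_-}{\Delta_- + \Delta_+} \, w_0$$
Here,
$$\Delta_- = f_0 - f_-$$
and
$$\Delta_+ = f_0 - f_+$$
where $f_0$ is the frequency of the modal class, and $f_-$ and $f_+$ are the frequencies of the classes immediately before and after it.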
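
A sketch of the grouped mode using the conventional textbook interpolation, in which the difference with the preceding class appears in the numerator. Treating a missing neighbouring class as having frequency 0 is an assumption made here for the edge cases, and the function name and data are illustrative.

```python
def grouped_mode(intervals, frequencies):
    """Mode of grouped data by interpolation inside the modal class."""
    i = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f0 = frequencies[i]
    # Assumption: a missing neighbour is treated as an empty class.
    f_before = frequencies[i - 1] if i > 0 else 0
    f_after = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    delta_minus = f0 - f_before
    delta_plus = f0 - f_after
    lower, upper = intervals[i]
    if delta_minus + delta_plus == 0:
        # Flat neighbourhood: fall back to the midpoint of the class.
        return (lower + upper) / 2
    return lower + delta_minus / (delta_minus + delta_plus) * (upper - lower)


# Hypothetical frequency table.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [3, 7, 5]

print(grouped_mode(intervals, frequencies))
```

The mode is pulled towards the neighbouring class with the higher frequency, here the class after the modal class.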

Quartiles

The ordered values of the variable are divided into four equal parts by three locations. The values at these locations are the three quartiles, denoted by $Q_1$, $Q_2$ and $Q_3$. The second quartile divides the data into two equal parts and is therefore the median. The other two, $Q_1$ and $Q_3$, are called the lower and upper quartile respectively.

Quartiles: For ungrouped data

Order the data in increasing order and find the three locations that divide the dataset into four equal parts. For $Q_1$, find $\frac{n+1}{4}$ and round up to the nearest integer. For $Q_3$, find $\frac{3(n+1)}{4}$ and round down to the nearest integer.
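
A minimal sketch of this rounding rule, using 1-based positions in the ordered data. The function names and data are illustrative. Other conventions, such as interpolating between neighbouring values, give slightly different quartiles, so results may differ from those of statistical software.

```python
import math


def quartile_positions(n):
    """1-based positions of Q1 and Q3 under the rounding rule above."""
    q1_position = math.ceil((n + 1) / 4)       # round up for Q1
    q3_position = math.floor(3 * (n + 1) / 4)  # round down for Q3
    return q1_position, q3_position


def quartiles(values):
    """Lower and upper quartile of ungrouped data (needs at least one value)."""
    ordered = sorted(values)
    q1_position, q3_position = quartile_positions(len(ordered))
    return ordered[q1_position - 1], ordered[q3_position - 1]


# Hypothetical data: the integers 1 to 10.
data = list(range(1, 11))

print(quartile_positions(len(data)))
print(quartiles(data))
```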

Quartiles: For grouped data

First find the relevant class from the cumulative frequency table.
$$Q_1 = l_0 + \frac{\frac{n}{4} - cf}{f_0} \, w_0$$
$$Q_2 = l_0 + \frac{\frac{n}{2} - cf}{f_0} \, w_0$$
$$Q_3 = l_0 + \frac{\frac{3n}{4} - cf}{f_0} \, w_0$$

Deciles

These divide the ordered values into 10 equal parts. They are nine in number. The kth decile can be calculated by the usual formula. The kth decile class is located at position $(n + 1)\frac{k}{10}$ in the cumulative frequency table.
$$D_k = l_0 + \frac{\frac{kn}{10} - cf}{f_0} \, w_0$$
for $k = 1, 2, \ldots, 9$.

Percentiles

These divide the ordered values into 100 equal parts. They are ninety-nine in number. The kth percentile can be calculated by the usual formula. The kth percentile class is located at position $(n + 1)\frac{k}{100}$ in the cumulative frequency table.
$$P_k = l_0 + \frac{\frac{kn}{100} - cf}{f_0} \, w_0$$
for $k = 1, 2, \ldots, 99$.
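
Quartiles, deciles and percentiles of grouped data all share one interpolation formula and differ only in the number of parts. The sketch below takes that number as a parameter. It assumes the relevant class is the first one whose cumulative frequency reaches kn/parts, and the function name and frequency table are hypothetical.

```python
def grouped_quantile(intervals, frequencies, k, parts=100):
    """k-th quantile of grouped data; parts=4, 10 or 100 gives quartiles,
    deciles or percentiles."""
    if not 0 < k < parts:
        raise ValueError("k must lie strictly between 0 and parts")
    n = sum(frequencies)
    target = k * n / parts
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(intervals, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("quantile class not found")


# Hypothetical frequency table with n = 20 observations.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 6, 8, 2]

print(grouped_quantile(intervals, frequencies, 25))           # 25th percentile
print(grouped_quantile(intervals, frequencies, 1, parts=4))   # first quartile, same value
print(grouped_quantile(intervals, frequencies, 5, parts=10))  # fifth decile, the median
print(grouped_quantile(intervals, frequencies, 90))           # 90th percentile
```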

Measures of Dispersion

Range

$$\text{Range} = x_{\max} - x_{\min}$$

Interquartile Range

$$\text{Interquartile Range} = Q_3 - Q_1$$

Semi-Interquartile Range

$$\text{Semi-Interquartile Range} = \frac{Q_3 - Q_1}{2}$$

(Coefficient of) Quartile Deviation

$$\text{Quartile Deviation} = \frac{\text{Semi-Interquartile Range}}{\frac{Q_3 + Q_1}{2}}$$
Hence,
$$QD = \frac{\frac{Q_3 - Q_1}{2}}{\frac{Q_3 + Q_1}{2}} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
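
The quartile-based measures of spread can be computed together once $Q_1$ and $Q_3$ are known. This is a small sketch with hypothetical quartile values and an illustrative function name.

```python
def quartile_measures(q1, q3):
    """Interquartile range, semi-interquartile range and coefficient of
    quartile deviation from the lower and upper quartiles."""
    if q3 + q1 == 0:
        raise ValueError("coefficient is undefined when Q3 + Q1 is zero")
    interquartile_range = q3 - q1
    semi_interquartile_range = interquartile_range / 2
    coefficient = (q3 - q1) / (q3 + q1)
    return interquartile_range, semi_interquartile_range, coefficient


# Hypothetical quartiles.
iqr, semi_iqr, qd = quartile_measures(q1=3, q3=8)
print(iqr, semi_iqr, qd)
```

The coefficient is a unitless ratio, which makes it usable for comparing spread across variables measured on different scales.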

The Mean Absolute Deviation

$$\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$$
Here, $\bar{x}$ can be any measure of central tendency.
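
Because the centre can be any measure of central tendency, the sketch below takes it as an argument and evaluates the deviation about both the mean and the median. The function name and data are illustrative.

```python
def mean_absolute_deviation(values, center):
    """Average absolute distance of the values from a chosen centre."""
    return sum(abs(x - center) for x in values) / len(values)


# Hypothetical data with one large value.
values = [2, 4, 4, 6, 14]
mean = sum(values) / len(values)
median = sorted(values)[len(values) // 2]  # n is odd here

print(mean_absolute_deviation(values, mean))
print(mean_absolute_deviation(values, median))
```

The deviation about the median is never larger than the deviation about the mean, since the median minimises the sum of absolute deviations.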

The Variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The Variance: For Grouped Data

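$$s^2 = \frac{\sum_{i=1}^{n} f_i (m_i - \bar{x})^2}{n - 1}$$
Here $m_i$ is the midpoint of the ith interval. Note that the sum runs over the intervals, while the $n$ in the denominator is the total number of observations, $\sum f_i$.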

The Standard Deviation


$$s = \sqrt{s^2}$$

The Coefficient of Variation

$$cv = \frac{s}{\bar{x}}$$
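
The sample variance, the standard deviation and the coefficient of variation build on one another, as the following sketch shows. The function names and data are illustrative. Python's standard `statistics.variance` and `statistics.stdev` use the same n − 1 divisor and can serve as a cross-check.

```python
import math


def sample_variance(values):
    """Sample variance with the n - 1 divisor."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def standard_deviation(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


def coefficient_of_variation(values):
    """Standard deviation relative to the mean (the mean must be non-zero)."""
    mean = sum(values) / len(values)
    return standard_deviation(values) / mean


# Hypothetical data.
values = [2, 4, 4, 6, 14]

print(sample_variance(values))
print(standard_deviation(values))
print(coefficient_of_variation(values))
```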

Measures of Linear Relationship between variables

The Covariance between two variables

$$s_{xy} = \text{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

The (Coefficient of) Correlation between two variables

$$r_{xy} = \text{corr}(x, y) = \frac{s_{xy}}{s_x s_y}$$
with $-1 \le r_{xy} \le 1$.
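
A minimal sketch of the sample covariance and the correlation coefficient. It uses the fact that the covariance of a variable with itself is its variance. The function names and paired data are illustrative assumptions.

```python
import math


def covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    s_x = math.sqrt(covariance(xs, xs))  # cov(x, x) is the variance of x
    s_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(covariance(xs, ys))
print(correlation(xs, ys))
print(correlation(xs, [2 * x + 1 for x in xs]))  # exact linear relationship
```

An exact increasing linear relationship gives a correlation of 1, up to floating-point rounding.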

Measures of shape

The Skewness

$$\text{Skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / (n - 1)}{s^3}$$
Positive skew: right tail longer; typically mean > median
Negative skew: left tail longer; typically mean < median

The Kurtosis

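$$\text{Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4 / (n - 1)}{s^4}$$
A normal distribution gives a value of about 3 under this formula, so the classification below refers to the excess kurtosis, Kurtosis − 3.
Positive: Lepto- (slender) -kurtic
Zero: Meso-kurtic
Negative: Platy- (broad) -kurtic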
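
Skewness and kurtosis share one pattern: an average of powered deviations, divided by the same power of $s$. The sketch below implements that pattern with the n − 1 divisor used on the slides. The function name and data are illustrative, and statistical packages often apply different small-sample corrections, so their output may differ.

```python
def standardized_moment(values, order):
    """Sum of (x - mean)**order over (n - 1), divided by s**order."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    s = (sum((x - mean) ** 2 for x in values) / (n - 1)) ** 0.5
    if s == 0:
        raise ValueError("all values are equal, so s is zero")
    return (sum((x - mean) ** order for x in values) / (n - 1)) / s ** order


# Hypothetical data with a long right tail, and a symmetric sample.
right_tailed = [2, 4, 4, 6, 14]
symmetric = [1, 2, 3, 4, 5]

print(standardized_moment(right_tailed, 3))  # skewness, positive here
print(standardized_moment(symmetric, 3))     # skewness of symmetric data
print(standardized_moment(right_tailed, 4))  # kurtosis
```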

Chebysheff's Theorem

Theorem (Chebysheff)
For any $k > 1$, at least a proportion $1 - \frac{1}{k^2}$ of the total number of observations will be found within $k$ standard deviations of the mean.
$\bar{x} \pm 2s$ will contain at least 75% of the data points.
$\bar{x} \pm 3s$ will contain at least 88.9% of the data points.
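
The bound can be checked against any dataset. The sketch below computes the guaranteed minimum proportion for a given k and compares it with the proportion actually observed. The function names and data are illustrative assumptions.

```python
def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


def proportion_within(values, k):
    """Observed proportion of values within k sample standard deviations."""
    n = len(values)
    mean = sum(values) / n
    s = (sum((x - mean) ** 2 for x in values) / (n - 1)) ** 0.5
    return sum(1 for x in values if abs(x - mean) <= k * s) / n


# Hypothetical data.
values = [2, 4, 4, 6, 14]

for k in (2, 3):
    print(k, chebyshev_bound(k), proportion_within(values, k))
```

The observed proportion is usually far above the bound, because the theorem has to hold for every possible distribution.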

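Chebysheff's Theorem: For a Normal Distribution

For a normal distribution the proportions are much higher than the Chebysheff bounds, and they are approximate values rather than lower bounds (the empirical rule):
$\bar{x} \pm 1s$ will contain approximately 68% of the data points.
$\bar{x} \pm 2s$ will contain approximately 95% of the data points.
$\bar{x} \pm 3s$ will contain approximately 99.7% of the data points.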
