TP02 BasicStatistics p1
Master in Finance
Class #2
Jorge Caiado, PhD
CEMAPRE/ISEG, University of Lisbon
Email: jcaiado@iseg.ulisboa.pt
Web: http://jcaiado100.wixsite.com/jorgecaiado

Luís Silveira Santos, PhD
CEMAPRE/ISEG, University of Lisbon
Email: lsantos@iseg.ulisboa.pt
II. Basic Statistics
Statistical Concepts
Prices and returns
[Figure: two panels for the series POR. Left panel: daily prices for the PSI20 Index over the period Jan 2, 1995 - Dec 31, 2009 (3914 obs.). Right panel: daily returns for the PSI20 Index over the period Jan 2, 1995 - Dec 31, 2009.]
Simple return
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1, \qquad R_t(k) = \frac{P_t - P_{t-k}}{P_{t-k}} = \frac{P_t}{P_{t-k}} - 1$$
Annualized return
$$P_0 (1 + R_t^A)^k = P_n \iff R_t^A = \left(\frac{P_n}{P_0}\right)^{1/k} - 1$$
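The two return definitions can be illustrated with a minimal Python sketch. The function names and the price series are hypothetical, and the sketch assumes that k in the annualized formula counts the number of years between P_0 and P_n.

```python
def simple_return(prices, k=1):
    """k-period simple returns R_t(k) = P_t / P_{t-k} - 1 for every valid t."""
    return [prices[t] / prices[t - k] - 1 for t in range(k, len(prices))]


def annualized_return(p0, pn, k):
    """Rate R solving p0 * (1 + R)**k = pn (k assumed to be in years)."""
    return (pn / p0) ** (1 / k) - 1


# Hypothetical price series, not taken from the PSI20 data.
prices = [100.0, 102.0, 99.96, 104.958]

one_period = simple_return(prices)        # R_t for t = 1, 2, 3
two_period = simple_return(prices, k=2)   # R_t(2) for t = 2, 3
annual = annualized_return(100.0, 121.0, k=2)

print(one_period, two_period, annual)
```

Compounding the annualized rate over k periods recovers the final price, which is a quick way to check the result.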
Fundamental Concepts
- Population: all members of a specified group;
- Population parameter (or simply, parameter): a quantity computed from or used
to describe a population;
- Sample: a subset of a population;
- Sample statistic (or simply, statistic): a quantity computed from or used to
describe a sample.
Measurement scales
- Nominal: categorize data, but do not rank them (ex.: investment strategies);
- Ordinal: categorize data and order them with respect to some characteristic (ex.:
ratings);
- Interval: ordinal scale characteristics + the differences between scale values are
equal, i.e., addition and subtraction are meaningful (ex.: Celsius and Fahrenheit
scales);
- Ratio: interval scale characteristics + there is a true “zero point” defined as the
origin (ex.: money).
Exercise: State the scale of measurement for each of the following (Source:
DeFusco et al., 2015):
1. Credit ratings for bond issues;
2. Cash dividends per share;
3. Hedge fund classification types;
4. Bond maturity in years.
Construction of a frequency distribution
1. Sort the data in ascending order;
2. Calculate the range of the data, defined as Range = Max − Min;
3. Decide on the number of intervals ($k$) in the frequency distribution;
4. Determine the interval width as Range/$k$;
5. Determine the intervals by successively adding the interval width to the
minimum value, to determine the ending points of intervals, stopping after
reaching an interval that includes the maximum value;
6. Count the number of observations falling in each interval;
7. Construct a table of the intervals listed from the smallest to the largest that
shows the number of observations falling in each interval.
Exercise: Construct the frequency table using the following values (Source:
DeFusco et al., 2015):
−4.57, −4.04, −1.64, 0.28, 1.34, 2.35, 2.38, 4.28, 4.42, 4.68, 7.16 and 11.43
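The construction steps can be sketched in Python on the twelve values of the exercise. The choice of k = 4 intervals is an assumption (the exercise does not state it), as is the convention that each interval includes its lower end point and that the last interval also includes the maximum.

```python
data = [-4.57, -4.04, -1.64, 0.28, 1.34, 2.35,
        2.38, 4.28, 4.42, 4.68, 7.16, 11.43]


def frequency_table(values, k):
    """Return (lower, upper, count) for k equal-width intervals."""
    ordered = sorted(values)                  # step 1: sort ascending
    lo, hi = ordered[0], ordered[-1]
    width = (hi - lo) / k                     # steps 2-4: range and width
    counts = [0] * k
    for x in ordered:                         # step 6: count per interval
        # The maximum would land in interval k, so clamp it to the last one.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    # steps 5 and 7: end points listed from smallest to largest
    return [(lo + i * width, lo + (i + 1) * width, counts[i])
            for i in range(k)]


table = frequency_table(data, k=4)  # k = 4 is an assumed choice
for lower, upper, count in table:
    print(f"{lower:6.2f} to {upper:6.2f}: {count}")
```

A different k or a different end-point convention would give a different but equally valid table.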
Measures of Central Tendency
- (Arithmetic) Mean: it is the sum of the observations divided by the number of
observations
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
- Median: it is the value of the middle item of a set of items that has been sorted
into ascending or descending order. In an odd-numbered set of $n$ items, the
median occupies the $(n + 1)/2$ position. In an even-numbered set of $n$ items,
the median is defined as the mean of the values of the items occupying the $n/2$
and $(n + 2)/2$ positions (the two middle items).
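A short Python sketch of both definitions follows. The function names and sample values are illustrative only. Positions in the text are 1-based, so the code subtracts 1 when indexing.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(values) / len(values)


def median(values):
    """Middle item of the sorted data (mean of the two middle items if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]          # position (n + 1)/2
    lower = ordered[n // 2 - 1]                   # position n/2
    upper = ordered[(n + 2) // 2 - 1]             # position (n + 2)/2
    return (lower + upper) / 2


odd_sample = [3, 1, 2]        # hypothetical values
even_sample = [4, 1, 3, 2]    # hypothetical values
print(mean(even_sample), median(odd_sample), median(even_sample))
```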
Other concepts of the mean
- Weighted Mean:
$$\bar{x}_W = \sum_{i=1}^{n} w_i x_i, \quad \text{where } \sum_{i=1}^{n} w_i = 1, \text{ with } w_1, w_2, \ldots, w_n \text{ being the weights}$$
- Geometric Mean:
$$\bar{x}_G = \sqrt[n]{\prod_{i=1}^{n} x_i} = \sqrt[n]{x_1 x_2 \cdots x_n}, \quad \text{with } x_i \geq 0, \text{ for } i = 1, 2, \ldots, n$$
Note: If $x_i < 0$, add a quantity such that the new value becomes positive (or equal
to zero)
- Harmonic Mean:
$$\bar{x}_H = \frac{1}{(1/n) \sum_{i=1}^{n} 1/x_i} = \frac{n}{\sum_{i=1}^{n} 1/x_i}, \quad \text{with } x_i > 0, \text{ for } i = 1, 2, \ldots, n$$
The harmonic mean can be viewed as a special type of weighted mean, in which an
observation’s weight is inversely proportional to its magnitude. This type of mean is
more appropriate for averaging ratios that are repeatedly applied to a fixed
quantity to yield a variable number of units (ex.: cost averaging).
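The three means can be sketched in Python as below. The function names, weights, and prices are hypothetical. The cost-averaging illustration assumes the same amount of money is invested at each price, in which case the average cost per share is the harmonic mean of the prices.

```python
import math


def weighted_mean(values, weights):
    """Sum of w_i * x_i, requiring the weights to sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))


def geometric_mean(values):
    """n-th root of the product of the observations (all x_i >= 0)."""
    if any(x < 0 for x in values):
        raise ValueError("observations must be non-negative")
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    """n divided by the sum of reciprocals (all x_i > 0)."""
    if any(x <= 0 for x in values):
        raise ValueError("observations must be positive")
    return len(values) / sum(1 / x for x in values)


# Hypothetical cost averaging: the same amount is invested at each price.
purchase_prices = [10.0, 20.0]
average_cost = harmonic_mean(purchase_prices)

print(weighted_mean([1.0, 2.0, 3.0], [0.2, 0.3, 0.5]))
print(geometric_mean([2.0, 8.0]))
print(average_cost)
```

With these prices the harmonic mean lies below the arithmetic mean of 15, because more shares are bought at the lower price.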
Measures of location: quantiles
A quantile is a statistical concept used to divide a sorted set into subsets, each
containing an (approximately) equal portion of the data. The most
used quantiles are:
How to compute the percentiles?
1. Sort the set in ascending order;
Measures of Dispersion
- Range: it is the difference between the maximum and the minimum in a set
$$\text{Range} = x_{max} - x_{min}$$
- Mean absolute deviation:
$$MAD = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$$
Note: the bias-corrected variance (which divides by $n - 1$ rather than $n$) has better statistical properties than the variance: it is an unbiased estimator of the population variance.
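The dispersion measures can be sketched in Python as follows. The variance formulas themselves do not appear in this section, so the sketch assumes the usual definitions: the sum of squared deviations divided by n, or by n - 1 for the bias-corrected version. Function names and data are hypothetical.

```python
def data_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)


def mean_absolute_deviation(values):
    """Average absolute distance of the observations from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def variance(values, bias_corrected=True):
    """Assumed definitions: divide by n - 1 if bias_corrected, else by n."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return squared_deviations / (n - 1 if bias_corrected else n)


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5
print(data_range(sample))
print(mean_absolute_deviation(sample))
print(variance(sample, bias_corrected=False), variance(sample))
```

The bias-corrected value is always the larger of the two, and the gap shrinks as n grows.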
Measures of Symmetry and Skewness
- Skewness
$$S_K = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \frac{(x_i - \bar{x})^3}{s^3}$$
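A minimal Python sketch of the skewness formula follows. It assumes that s is the bias-corrected sample standard deviation (the definition of s is not shown in this section). Function names and data are hypothetical.

```python
import math


def sample_std(values):
    """Bias-corrected sample standard deviation (assumed meaning of s)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))


def sample_skewness(values):
    """n / ((n-1)(n-2)) times the sum of cubed standardized deviations."""
    n = len(values)
    if n < 3:
        raise ValueError("at least 3 observations are required")
    m = sum(values) / n
    s = sample_std(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


symmetric = [1, 2, 3, 4, 5]        # hypothetical, symmetric around 3
right_skewed = [1, 1, 1, 10]       # hypothetical, long right tail
print(sample_skewness(symmetric), sample_skewness(right_skewed))
```

A symmetric sample gives a value of zero (up to rounding), and a long right tail gives a positive value.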
Measures of Kurtosis
- Kurtosis
$$K = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \frac{(x_i - \bar{x})^4}{s^4}$$
- Excess Kurtosis
$$K_E = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \frac{(x_i - \bar{x})^4}{s^4} - \frac{3(n-1)^2}{(n-2)(n-3)}$$
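The two kurtosis formulas can be sketched in Python as below. As an assumption, s is taken to be the bias-corrected sample standard deviation. Function names and data are hypothetical.

```python
import math


def sample_kurtosis(values):
    """n(n+1) / ((n-1)(n-2)(n-3)) times the sum of standardized deviations^4."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 observations are required")
    m = sum(values) / n
    # s: bias-corrected sample standard deviation (assumed definition)
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    factor = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return factor * sum(((x - m) / s) ** 4 for x in values)


def excess_kurtosis(values):
    """Sample kurtosis minus the correction term 3(n-1)^2 / ((n-2)(n-3))."""
    n = len(values)
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return sample_kurtosis(values) - correction


flat_sample = [1, 2, 3, 4, 5]  # hypothetical, evenly spread values
print(sample_kurtosis(flat_sample), excess_kurtosis(flat_sample))
```

Evenly spread data such as this sample has negative excess kurtosis, i.e. thinner tails than the normal distribution.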
Exercise: Consider the annual total returns on the MSCI German Index from 1993
to 2002 (Source: DeFusco et al., 2015)