Quantitative Methods 1 (Juice Notes)
r = −1 means perfect −ve correlation
−ve covariance = Variables tend to move in opposite directions
−ve covariance → −ve correlation → −ve slope
Scatter plot: Graph that shows the relationship between values of two variables
LOS c Test of the hypothesis that the population correlation coefficient equals zero
Step 2: Calculate test statistic
$t = \frac{r \times \sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.4 \times \sqrt{62 - 2}}{\sqrt{1 - 0.4^2}} \approx 3.38$
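A minimal Python sketch of the test statistic for the correlation coefficient, using the example figures r = 0.4 and n = 62. The function name `correlation_t_stat` is hypothetical; the critical value still has to be read from a t-table with n − 2 DoF.

```python
import math


def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for H0: population correlation = 0 (n - 2 DoF)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Example figures from the notes: r = 0.4, n = 62
t_stat = correlation_t_stat(0.4, 62)
print(round(t_stat, 2))
```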
Regression line (figure): dependent variable ‘y’ plotted against independent variable ‘x’
Eg. Rp = RFR + β (Rm − RFR)
Rp = dependent variable, RFR = intercept, β = slope, (Rm − RFR) = independent variable
Assumptions of linear regression:
• Independent variable is uncorrelated with the error term
• Expected value of the error term is zero
• Variance of the error term is constant (NOT ZERO). The economic relationship between the variables is intact for the entire time period (violated by, eg., a change in political regime)
• Error term is uncorrelated across observations (violated by, eg., seasonality)
• Error term is normally distributed
Sum of squared errors (SSE): Sum of the squared vertical distances between the estimated
and actual Y-values
Slope coefficient (beta): Describes the change in ‘y’ for a one-unit change in ‘x’
$\text{Slope} = \frac{\text{Cov}(x, y)}{\text{Variance}(x)}$
Eg. ‘x’: 10, 15, 20, 30
Actual ‘y’: 17, 19, 35, 45
SSE = 38.166
$\text{Standard error of estimate (SEE)} = \sqrt{\frac{\text{SSE}}{n - 2}} = \sqrt{\frac{38.166}{2}} \approx 4.37$
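The calculation can be checked with a short Python sketch (standard library only). It fits the line using slope = Cov(x, y) / Variance(x) on the four observations from the example, then computes SSE and SEE. Variable names are illustrative. The computed SSE differs from the notes' 38.166 in the second decimal place, presumably because the notes round the predicted values.

```python
import math

# Observations from the example
x = [10, 15, 20, 30]
y = [17, 19, 35, 45]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance and variance (the n - 1 divisors cancel in the slope)
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)

slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# Sum of squared errors: squared vertical distances between actual and estimated y
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Standard error of estimate with n - 2 degrees of freedom
see = math.sqrt(sse / (n - 2))

print(round(slope, 4), round(intercept, 4), round(sse, 3), round(see, 2))
```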
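Confidence interval for regression coefficient: $\hat{b}_1 \pm (t_c \times SE)$
$\hat{b}_1$ = slope, SE = standard error, $t_c$ = critical value (t-value)
Eg. $\hat{b}_1$ = 0.48, SE = 0.35, n = 42. Calculate the 90% confidence interval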
Step 1: Define hypothesis H0: b1 = 0, Ha: b1 ≠ 0
Step 2: Calculate test statistic $t = \frac{\hat{b}_1 - b_1}{\text{Std. error}} = \frac{0.48 - 0}{0.35} \approx 1.37$
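A hedged Python sketch of the slope example (b1 = 0.48, SE = 0.35, n = 42). The critical value 1.684 is an assumption taken from a standard t-table (40 DoF, 5% in each tail); it is not given in the notes.

```python
# Example figures from the notes
b1_hat = 0.48
std_error = 0.35
n = 42
dof = n - 2  # simple regression: one independent variable

# Assumption: two-tailed 10% critical t-value for 40 DoF, read from a t-table
t_critical = 1.684

# 90% confidence interval: slope +/- (critical value x standard error)
lower = b1_hat - t_critical * std_error
upper = b1_hat + t_critical * std_error

# Test statistic for H0: b1 = 0
t_stat = (b1_hat - 0) / std_error
reject_null = abs(t_stat) > t_critical

print(round(lower, 3), round(upper, 3), round(t_stat, 3), reject_null)
```

Because zero lies inside the interval, the t-test and the confidence interval give the same conclusion: failed to reject the null hypothesis.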
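LOS h & i
Predicted value of dependent variable: $\hat{Y} = \hat{b}_0 + \hat{b}_1 \times X_p$
Confidence interval for the predicted value of dependent variable: $\hat{Y} \pm (t_c \times SE)$
Eg. Forecasted return (x) = 12%, Intercept = −4%, Slope = 0.75, Standard error = 2.68, n = 32. Calculate predicted value (y) and 95% confidence interval
Predicted value: $\hat{Y} = -4 + 0.75 \times 12 = 5\%$
95% confidence interval ($t_c$ = 2.042 for n − 2 = 30 DoF): $5 \pm (2.042 \times 2.68)$ = −0.472 to 10.472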
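The same example as a small Python sketch. The critical value 2.042 is an assumption read from a t-table (30 DoF, 2.5% in each tail); the other figures are those of the example.

```python
# Example figures from the notes (all in percent)
intercept = -4.0
slope = 0.75
forecast_x = 12.0
std_error = 2.68
n = 32
dof = n - 2

# Assumption: two-tailed 5% critical t-value for 30 DoF, read from a t-table
t_critical = 2.042

# Predicted value of the dependent variable
y_hat = intercept + slope * forecast_x

# 95% confidence interval around the predicted value
lower = y_hat - t_critical * std_error
upper = y_hat + t_critical * std_error

print(y_hat, round(lower, 3), round(upper, 3))
```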
https://www.fintreeindia.com/ © 2017 FinTree Education Pvt. Ltd.
SSE (aka sum of squared residuals): Measures unexplained variation, $\sum (Y_i - \hat{Y}_i)^2$
RSS: Measures explained variation, $\sum (\hat{Y}_i - \bar{Y})^2$
SST: Measures total variation, $\sum (Y_i - \bar{Y})^2$
• R² = RSS/SST
ANOVA Table
Source of variation       DoF          Sum of squares   Mean sum of squares
Regression (explained)    k            RSS              MSR = RSS/k
Error (unexplained)       n − k − 1    SSE              MSE = SSE/(n − k − 1)
Y = b0 + b1 x1 + b2 x2 + ε
F-test: Tests all the slope coefficients (b1 and b2) jointly
t-test: Tests each slope coefficient (b1, b2) individually
Public knowledge of regression relationships may make them less useful in the future
If the regression assumptions are violated (heteroscedasticity and autocorrelation), hypothesis tests will not be valid
Multiple Regression And Issues In Regression Analysis
LOS a Multiple regression equation
Y = b0 + b1 X1 + b2 X2 + … + bk Xk + ε
b0 = intercept, X1 … Xk = independent variables
Intercept: Value of the dependent variable when all independent variables are equal to zero
Slope coefficient: Change in the dependent variable when the independent variable changes by one unit, holding other independent variables constant
LOS c & d Hypothesis testing for population value of a regression coefficient
Since the calculated test statistic for b1 lies inside the critical range, the conclusion is ‘Failed to reject the null hypothesis’
The test statistic for b2 lies outside the range, so the conclusion is ‘Reject the null hypothesis’
The slope ‘b1’ is not significantly different from zero, whereas the slope ‘b2’ is significantly different from zero
Solution is to drop the variable with slope ‘b1’
DoF = n − k − 1
P-value
[Figure: decision regions marked ‘Reject’ and ‘FTR’ (failed to reject)]
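LOS e
Confidence interval for regression coefficient: $\hat{b}_1 \pm (t_c \times SE)$, where $\hat{b}_1$ = slope and SE = standard error
Predicted value of dependent variable: $\hat{Y} = \hat{b}_0 + \hat{b}_1 \hat{X}_1 + \hat{b}_2 \hat{X}_2 + \dots + \hat{b}_k \hat{X}_k$, where $\hat{b}_0$ = intercept and $\hat{X}$ = forecasted value (x)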
• Variance of the error term is constant (NOT ZERO). The economic relationship between the variables is intact for the entire time period (violated by, eg., a change in political regime)
• Error term is uncorrelated across observations (violated by, eg., seasonality)
• Error term is normally distributed
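LOS g F-statistic
• F-statistic = MSR/MSE with ‘k’ and ‘n − k − 1’ DoF
Eg. n = 48, k = 6, SST = 430, SSE = 190, Significance level = 2.5% and 5%. Perform an F-test
RSS = SST − SSE = 430 − 190 = 240
$\text{MSR} = \frac{\text{RSS}}{k} = \frac{240}{6} = 40$
$\text{MSE} = \frac{\text{SSE}}{n - k - 1} = \frac{190}{41} = 4.634$
$\text{F-statistic} = \frac{\text{MSR}}{\text{MSE}} = \frac{40}{4.634} = 8.631$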
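A minimal Python sketch of the F-statistic arithmetic for this example (n = 48, k = 6, SST = 430, SSE = 190). It only computes the statistic; the critical values for 2.5% and 5% must still be read from an F-table with ‘k’ and ‘n − k − 1’ DoF.

```python
# Example figures from the notes
n = 48
k = 6
sst = 430.0
sse = 190.0

# Explained variation is total variation minus unexplained variation
rss = sst - sse

dof_regression = k
dof_error = n - k - 1

msr = rss / dof_regression   # mean regression sum of squares
mse = sse / dof_error        # mean squared error
f_stat = msr / mse

print(rss, msr, round(mse, 3), round(f_stat, 3))
```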
LOS h R² and adjusted R²
R²: % of the variation in the dependent variable explained by all the independent variables
R² = RSS/SST
k = 8, n = 30, R² = 75%
Adding two more variables is not justified because adjusted R²₂ < adjusted R²₁
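A sketch of the adjusted R² calculation for the figures quoted above (k = 8, n = 30, R² = 75%). The formula used, adjusted R² = 1 − [(n − 1)/(n − k − 1)] × (1 − R²), is the standard textbook definition and is an assumption here, since it does not appear in the notes. The function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Standard adjusted R-squared: penalises R-squared for the number of regressors k."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)


# Figures from the notes: k = 8, n = 30, R-squared = 75%
adj_r2 = adjusted_r_squared(0.75, n=30, k=8)
print(round(adj_r2, 4))
```

Adjusted R² is always below R² when k ≥ 1 and R² < 1, and it can fall when a new variable adds little explanatory power, which is why it is used to judge whether extra variables are justified.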
Source of variation       DoF          Sum of squares   Mean sum of squares
Regression (explained)    k            RSS              MSR = RSS/k
Error (unexplained)       n − k − 1    SSE              MSE = SSE/(n − k − 1)
Y = b0 + b1 X1 + b2 X2 + … + bk Xk + ε
b0 = intercept, X1 … Xk = independent variables
Dummy variables: Independent variables that are binary in nature (i.e. in the form of yes/no)
Conditional heteroskedasticity: Occurs when heteroskedasticity of the error variance is correlated with the independent variables
Unconditional heteroskedasticity: Occurs when heteroskedasticity of the error variance is not correlated with the independent variables
            Heteroskedasticity        Serial correlation      Serial correlation      Multicollinearity
Meaning     Variance not constant     Errors are correlated   Errors are correlated   Two or more independent variables are correlated
Detection   Examining scatter plots   Durbin-Watson           Durbin-Watson           F - significant
• Breusch-Pagan test: n × R²
• White-corrected standard errors are also known as robust standard errors
• Durbin-Watson test ≈ 2(1 − r)
• Multicollinearity: The question is never a yes or no, it is how much
• None of the assumption violations have any impact on slope coefficients. The impact is on standard errors and therefore on the t-test
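The approximation Durbin-Watson ≈ 2(1 − r) can be illustrated with a short Python sketch. The residual series below is hypothetical (not from the notes), and the exact Durbin-Watson formula used, the sum of squared changes in residuals divided by the sum of squared residuals, is the standard definition and an assumption here. ‘r’ is taken as the lag-1 autocorrelation of the residuals.

```python
# Hypothetical regression residuals (illustrative only); they sum to roughly zero
residuals = [0.5, 0.8, 0.6, 0.9, 0.4, -0.2, -0.6, -0.9, -0.8, -0.6, -0.3, 0.2]

sum_sq = sum(e ** 2 for e in residuals)

# Exact Durbin-Watson statistic: squared changes over squared residuals
dw = sum((curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])) / sum_sq

# Lag-1 autocorrelation of the residuals (residual mean treated as zero)
r = sum(curr * prev for prev, curr in zip(residuals, residuals[1:])) / sum_sq

# Approximation quoted in the notes
approx_dw = 2 * (1 - r)

print(round(dw, 3), round(r, 3), round(approx_dw, 3))
```

A value well below 2 points to positive serial correlation, a value near 2 to none, and a value well above 2 to negative serial correlation. The gap between the exact and approximate figures shrinks as the sample grows.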
LOS m Model specifications and model misspecifications
Model misspecifications might have an impact on both slope coefficients and error terms
LOS n Models with qualitative dependent variables
Linear trend model: Downward-sloping line (−ve trend); Equation: yt = b0 + b1t + εt
Log-linear trend model: Concave curve (−ve trend); Equation: ln yt = b0 + b1t + εt
Limitation of trend models is that they are not useful if the error terms are serially correlated
Eg. Xt = b0 + b1 Xt−1
Xt = 5 + 0.5 Xt−1
Xt−1 = 6      Xt = 8        Xt−1 = 20     Xt = 15
Xt−1 = 8      Xt = 9        Xt−1 = 15     Xt = 12.5
Xt−1 = 9      Xt = 9.5      Xt−1 = 12.5   Xt = 11.25
Xt−1 = 10     Xt = 10
$\text{Mean of the time series} = \frac{b_0}{1 - b_1} = \frac{5}{1 - 0.5} = 10$
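The two paths in this example can be reproduced with a small Python sketch. The function name `ar1_path` and its defaults are hypothetical; the coefficients b0 = 5 and b1 = 0.5 are those of the example.

```python
def ar1_path(x0: float, b0: float = 5.0, b1: float = 0.5, steps: int = 3) -> list:
    """Iterate Xt = b0 + b1 * Xt-1 (no error term) starting from x0."""
    path = [x0]
    for _ in range(steps):
        path.append(b0 + b1 * path[-1])
    return path


# Mean of the time series: b0 / (1 - b1)
mean_level = 5.0 / (1 - 0.5)

print(ar1_path(6))    # starts below the mean and rises toward it
print(ar1_path(20))   # starts above the mean and falls toward it
print(ar1_path(10))   # starts at the mean and stays there
print(mean_level)
```

Whichever side the series starts on, each step halves the distance to 10, which is the mean-reverting behaviour the example describes.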
Most economic and financial time series relationships are not stationary
Test used to know if the autocorrelations are significantly different from zero: t-test
$t\text{ statistic} = \frac{\text{Autocorrelation}}{\text{Standard error}}$
Mean reversion: The tendency of a time series to move toward its mean
$\text{Mean reverting level} = \frac{b_0}{1 - b_1}$
LOS g In-sample and out-of-sample forecasts and RMSE criterion
Shorter sample period → More stability but less statistical reliability
Longer sample period → Less stability but more statistical reliability
Random walk: Xt = Xt−1 + εt
Random walk with a drift: Xt = b0 + Xt−1 + εt
• They are not covariance stationary because they do not have a finite mean reverting level
• To use standard regression analysis, we must convert this data to covariance stationary. This conversion is called ‘first differencing’
Tests for covariance stationarity: Autocorrelation approach, Dickey-Fuller test
First differencing
Eg.
Sales    Lag 1    First difference: ∆ sales (current year)    ∆ sales (previous year)
230      -        -                                            -
270      230      40                                           -
290      270      20                                           40
310      290      20                                           20
340      310      30                                           20
Equation: $\hat{y} = 30 - 0.25x$
$\hat{y} = 30 - 0.25(340) = -55$
Forecasted sales: 340 − 55 = 285
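A minimal Python sketch of the first-differencing table and the forecast arithmetic. It follows the numbers exactly as the example uses them: ‘x’ is the latest sales level (340) and the fitted equation gives the predicted change in sales. The coefficients 30 and −0.25 are taken from the example as given, not estimated here.

```python
# Sales series from the example
sales = [230, 270, 290, 310, 340]

# First difference: change in sales versus the previous year
first_diff = [curr - prev for prev, curr in zip(sales, sales[1:])]

# Lag of the first difference: the previous year's change in sales
lagged_diff = first_diff[:-1]

# Fitted equation from the example: y_hat = 30 - 0.25 * x, with x = latest sales
predicted_change = 30 - 0.25 * sales[-1]

# Forecasted sales = latest sales + predicted change
forecast = sales[-1] + predicted_change

print(first_diff, lagged_diff, predicted_change, forecast)
```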
If time series is a random walk then we must convert this data to covariance stationary.
This conversion is called first differencing
Seasonality is present if the autocorrelation of error term is significantly different from zero
Correction: Adding a lag of dependent variable (corresponding to the same period in previous year)
to the model as another independent variable
Testing: Squared errors from the model are regressed on the first lag of the squared residuals
Equation: $\hat{\varepsilon}_t^2 = a_0 + a_1 \hat{\varepsilon}_{t-1}^2 + \mu_t$
$a_0$ = intercept, $\hat{\varepsilon}_{t-1}^2$ = predicted error term of last period
To test whether the two time series have unit roots, a Dickey-Fuller test is used
Possible scenarios:
1. Both time series are covariance stationary (linear regression can be used)
2. Only the dependent variable time series is covariance stationary (linear regression should not be used)
3. Only the independent variable time series is covariance stationary (linear regression should not be used)
4. Neither time series is covariance stationary and the two series are not cointegrated (linear regression should not be used)
5. Neither time series is covariance stationary and the two series are cointegrated (linear regression can be used)
Cointegration: Long term economic or financial relationship between two time series
• If you have decided to use a time-series model, plot the values to see whether the time series looks covariance stationary
• If you find significant serial correlation in the error terms, use a complex model such as AR model
• If the data has serial correlation, reexamine the data for stationarity before running an AR model
• If you find significant serial correlation in the residuals, use an AR(2) model
Step 1 Determine probabilistic variables: No constraint on the number of input variables that can be allowed to vary. Focus on a few variables that have significant impact on value.
Step 3 Check for correlation across variables: If the correlation is strong, either allow only one of the variables to vary (focus on the variable that has the highest impact on value) or build the correlation into the simulation
Step 4 Run the simulation: It means to draw an outcome from each distribution and compute the value based on these outcomes
Number of probabilistic inputs: Higher the number of probabilistic inputs, greater the number of simulations required.
Types of distributions: Greater the diversity of
Regulatory capital restrictions; Negative equity
Imposed internally: Analyst’s expectations
Imposed externally: Loan covenants
Likelihood of financial distress; Indirect bankruptcy costs
• Garbage in, garbage out: Inputs should be based on analysis and data, rather than guesswork
• Non-stationary distributions: Distributions may change over time due to changes in market structure. There can be a change in the form of the distribution or in the parameters of the distribution
• Dynamic correlations: Correlations across input variables can be modeled into simulations only when they are stable. If they are not, it becomes far more difficult to model them
Risk-adjusted value
Cash flows from simulations are not risk-adjusted and should not be discounted at RFR
• We have already accounted for B’s greater risk by using a higher discount rate