
Block 1

Repetition from BSc courses


LRM estimators & non-linear extensions
Predictions from regression models

Advanced econometrics 1 4EK608


Pokročilá ekonometrie 1 4EK416

Vysoká škola ekonomická v Praze


Outline

1 Estimation methods, predictions from a model


Ordinary least squares
General properties of estimators
Method of moments
Maximum likelihood estimator

2 Non-linear extensions to LRM, quantile regression


Non-linear regression models
Quantile regression

3 Predictions from a regression model


Predictions from a CLRM (repetition from BSc courses)
Predictions: general features, kFCV, Variance vs. Bias
Linear regression model (LRM) and OLS estimation

y = Xβ + ε
LRM assumptions (for OLS estimation):
(Notation follows Greene, Econometric analysis, 7th ed.)

A1 Linearity: yi = β1 + β2 xi2 + · · · + βK xiK + εi


LRM describes linear relationship between yi and xi .
A2 Full rank: Matrix X is an n×K matrix with rank K.
Columns of X are linearly independent and n ≥ K.
A3 Exogeneity of regressors: E[εi |X] = 0 (strict form).
If relaxed to contemporaneous form in TS: E[εt |xt ] = 0.
Law of iterated expectations: E[εi |X] = 0 ⇒ E[ε] = 0.
Linear regression model (LRM) and OLS estimation

y = Xβ + ε
LRM assumptions (continued):
A4 Homoscedastic & nonautocorrelated disturbances:
E[εε′|X] = σ²In
Homoscedasticity: var[εi|X] = σ², ∀ i = 1, . . . , n.
Independent disturbances: cov[εt, εs|X] = 0, ∀ t ≠ s.
GARCH-type models [e.g. ARCH(1): var[εt|εt−1] = σ² + αε²t−1]
do not violate the conditional variance assumption
var[εi|X] = σ². However, var[εt|εt−1] ≠ var[εt], with
conditioning on X omitted from the notation but left implicit.
A5 DGP of X: Variables in X may be fixed or random.
A6 Normal distribution of disturbances:
ε|X ∼ N [0, σ 2 In ].
Ordinary least squares (OLS)
y = Xβ + ε
The least squares estimator is unbiased (given A1 – A3):

β̂ = b = (X′X)⁻¹X′y = β + (X′X)⁻¹X′ε,
take expectations:
E[b|X] = β + E[(X′X)⁻¹X′ε|X] = β, (the second term is zero by A3).
Variance of the least squares estimator (A1 – A4):
var[b|X] = var[(X′X)⁻¹X′ε|X],
because var(β) = 0. Using A3 & A4:
= A σ²In A′, where A = (X′X)⁻¹X′,
the matrix analogue of var(cZ) = c² var(Z),
= σ²(X′X)⁻¹,
because (AB)′ = B′A′ for dimensionally compatible matrices A, B.
Normal distribution of the least squares estimator (A1 – A6):
b|X ∼ N[β, σ²(X′X)⁻¹].
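
A minimal numerical sketch of the matrix formulas above, on simulated data (the DGP, sample size and variable names are illustrative assumptions, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(42)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # intercept + 2 regressors
beta_true = np.array([1.0, 0.5, -2.0])
y = X @ beta_true + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                    # b = (X'X)^{-1} X'y
resid = y - X @ b
s2 = resid @ resid / (n - K)             # unbiased estimate of sigma^2
var_b = s2 * XtX_inv                     # estimated var[b|X] = s^2 (X'X)^{-1}
print(b, np.sqrt(np.diag(var_b)))        # point estimates and standard errors
```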
General properties of estimators

Estimators and estimation methods:

LRM is not the only type of regression model.

OLS is not the only useful estimator.

Let’s approach estimators and their properties more


generally.

(again, notation follows Greene, Econometric analysis.)


Estimators and estimation methods
Notation/definitions:
xj = (x1j, . . . , xnj)′ – a random sample of n observations.
θ – population parameter [unknown parameter(s)]
f(xj, θ): probability distribution function
θ̂ is some estimator of θ
Basic notions:
All estimators have sampling distributions
mean: E(θ̂)
variance: E[(θ̂ − E(θ̂))²], etc.
Estimators vs. estimates
Generally, many estimators exist for a given parameter.
Population mean example:

θ̂1 = x̄ = (1/n) Σ_{i=1}^{n} xi   (the sample mean)

θ̂2 = x̃ = (x_max + x_min)/2   (the midrange)
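
A small simulation sketch comparing the two estimators above (the uniform DGP, seed and sample size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)      # true population mean = 5
theta_1 = x.mean()                   # sample mean
theta_2 = 0.5 * (x.max() + x.min())  # midrange estimator
print(theta_1, theta_2)              # two different estimates of the same parameter
```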
Properties of estimators - classification:

Unbiasedness: can be described as E(θ̂) = θ.

Occasionally useful – in a finite (small) sample context.
Asymptotic unbiasedness (large sample property):
not very useful on its own; the discussion is better directed towards consistency
(which is a far more desirable feature).

Consistency: plim(θ̂) = θ.
As n → ∞, θ̂ becomes an (at least asymptotically) unbiased estimator of θ and
plim(var(θ̂)) = 0 [i.e. var(θ̂) → 0 as n → ∞].
Consistent estimators: unbiased (or at least asymptotically
unbiased) & their variance shrinks to zero as sample size
grows (entire population is used).
Minimal requirement for estimator used in statistics or
econometrics.
If some estimator is not consistent, then it does not provide
relevant estimates of population θ values, even with
unlimited data, i.e. as n → ∞.
Unbiased estimators are not necessarily consistent.
Properties of estimators - classification:

Efficiency: an estimator is efficient if it is unbiased and no


other unbiased estimator has a smaller variance. This is often
difficult to prove, so we usually simplify the concept to
relative efficiency (e.g. efficiency with respect to linear
unbiased estimators, etc.).
Asymptotic efficiency: holds for an estimator that is
asymptotically unbiased and no other asymptotically
unbiased estimator has smaller asymptotic variance.

Normality, asymptotic normality: basis for most


statistical inference performed with common estimators.
Estimators and estimation methods

Extremum estimator: obtained as the optimizer of some


criterion function q(θ|data). Most common estimators:
LS: θ̂LS = argmax [ −(1/n) Σ_{i=1}^{n} (yi − h(xi, θLS))² ],

ML: θ̂ML = argmax [ (1/n) Σ_{i=1}^{n} log f(yi|xi, θML) ],

GMM: θ̂GMM = argmax [ −m(data, θGMM)′ W m(data, θGMM) ],

where h(·) is a function (linear/non-linear → OLS/NLS),


f (·) is a probability density function (pdf),
m denotes sample moments,
W is a convenient positive definite matrix.
LS and ML estimators belong to a class of M estimators
(type of extremum estimators where objective function is a sample average).
Estimators and estimation methods

Assumptions for asymptotic properties of extremum estimators:

1 Parameter space: must be convex and the parameter


vector that is the object of estimation must be a point in its
interior. Gaps and nonconvexities in parameter spaces
would generally collide with estimation algorithms
(settings such as σ 2 ≥ 0 are OK).

2 Criterion function: must be concave in the parameters


(concave in the neighborhood of the true parameter
vector). Criterion functions need not be globally concave.
In such a situation, there may be multiple local optima
(often associated with poor model specification).
Estimators and estimation methods
Assumptions for asymptotic properties of extremum estimators:

3 Identifiability of parameters: has a relatively complex


technical definition (anything like “true parameters θ0 are
identified if...” is problematic - leads to a paradox if condition is
not met). Simple way to secure identification:

LS: for any two different parameter vectors θ
and θ0, a vector of observations xi must exist (for some i)
leading to different conditional mean functions (ŷi).
ML: For any two parameter vectors θ ≠ θ0, a data vector
(yi, xi) must exist which generates different values of the
density function: f(yi|xi, θ) ≠ f(yi|xi, θ0).
Note: identifiability does not rule out the possibility of
f(yi|xi, θ) = f(yℓ|xℓ, θ), where yi = yℓ, xi ≠ xℓ.
GMM: sufficient condition for identification:
E[m(data, θ)] ≠ 0 if θ ≠ θ0.
Estimators and estimation methods

Assumptions for asymptotic properties of extremum estimators:

4 Behavior of the data: Grenander conditions for


well-behaved data:
G1 For each column xk of X and d²nk = xk′xk = Σ_{i=1}^{n} x²ik,
it must hold that lim_{n→∞} d²nk = +∞.
The sum of squares continues to grow with the sample size, i.e. xk
does not degenerate into a sequence of zeros.
G2 lim_{n→∞} x²ik/d²nk = 0 for all i = 1, 2, . . . , n. Single
observations become less important as the sample size grows.
No single observation will dominate xk′xk.
G3 Let Cn be the sample correlation matrix of the columns of X
(excluding the intercept, if present). Then lim_{n→∞} Cn = C,
where C is positive definite. This implies that the full rank
condition for X (A2) is not asymptotically violated.
Estimators and estimation methods

Quick convergence recap (terminology):

Convergence in probability: a sequence of random variables
X1, X2, X3, . . . converges in probability to a random
variable X, denoted as Xn →ᵖ X [or plim(Xn) = X], if:

lim_{n→∞} P(|Xn − X| ≥ ε) = 0, ∀ ε > 0.

Convergence in distribution: a weaker type of convergence.
It states that the CDF of Xn converges to the CDF of X as
n goes to infinity (does not require dependency between
Xn and X). Xn →ᵈ X, if:

lim_{n→∞} F_Xn(x) = F_X(x), at points where F_X(x) is continuous.
Estimators and estimation methods

Theorem: Consistency of M estimators


If:
(a) the parameter space is convex and the true parameter
vector is a point in its interior,
(b) the criterion function is concave,
(c) the parameters are identified by the criterion function,
(d) the data are well behaved,
then the M estimator converges in probability to the true
parameter vector.
Estimators and estimation methods

Theorem: Asymptotic normality of M estimators


If:
(a) θ̂ is a consistent estimator of θ0 where θ0 is a point in the
interior of the parameter space Θ,

(b) q(θ|data) is concave and twice continuously differentiable in θ in
a neighborhood of θ0,

(c) √n [∂q(θ0|data)/∂θ0] →ᵈ N(0, Φ),

(d) lim_{n→∞} Pr[ |∂²q(θ|data)/∂θk∂θm − hkm(θ)| > ε ] = 0 ∀ε > 0 for
any θ in Θ; hkm(θ) is a continuous, finite-valued function of θ,

(e) the matrix of elements H(θ) is nonsingular at θ0,

then √n(θ̂ − θ0) →ᵈ N{0, [H⁻¹(θ0) Φ H⁻¹(θ0)]},

where Φ is a variance-covariance matrix,
and H(θ0) = ∂²q(θ|data)/∂θ∂θ′ is the Hessian (evaluated at θ0).
Method of moments

Method of moments (MM)

Generalized method of moments (GMM)


Method of moments

With the method of moments, we simply estimate


population moments by corresponding sample moments.

Under very general conditions, sample moments are


consistent estimators of the corresponding population
moments, but NOT necessarily unbiased estimators.

Application example 1
Sample covariance is a consistent estimator of population
covariance.

Application example 2
OLS estimators we have used for parameters in the CLRM can
be derived by the method of moments.
Method of moments

Method of moments (MM)


Population moments for a stochastic variable X
E(X r ): rth population moment about zero
E(X): population mean: 1st population moment about zero
E[(X − E(X))2 ]: population variance is the second moment
about the mean

Sample moments for sample observations (x1 , x2 , . . . , xn )


(1/n) Σ_{i=1}^{n} xi^r : rth sample moment about zero
(1/n) Σ_{i=1}^{n} xi = x̄ : sample mean is the first moment about zero
[1/(n−1)] Σ_{i=1}^{n} (xi − x̄)² : sample variance is the second sample moment
about the mean
Method of moments

For MM, the usual linear model assumption (concerning


1st population moment) E[xi εi ] = 0 implies:

E[xi(yi − xi′β)] = 0,

which constitutes a population moment equation:

E[xi(yi − xi′β)] = E[m(β)] = 0,

and the corresponding sample (empirical) moment equation
can be formalized as:

(1/n) Σ_{i=1}^{n} xi(yi − xi′β̂) = m(β̂) = 0.
Method of moments
For a LRM with K regressors, MM sample equations can be
cast as:
(1/n) Σ_{i=1}^{n} (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0
(1/n) Σ_{i=1}^{n} xi2 (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0
...
(1/n) Σ_{i=1}^{n} xiK (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0

Removing the 1/n terms from the equations does not affect the solution.
This is a system of K equations with K unknown parameters βj.
The set of moment equations is equivalent to the 1st order conditions for
the OLS estimator:

min_{β̂} Σ_{i=1}^{n} (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK)²
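
A brief sketch illustrating that the root of the K sample moment equations coincides with the OLS solution of the normal equations (simulated data; all names and parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([2.0, 1.0, -0.5]) + rng.normal(size=n)

b_mm = np.linalg.solve(X.T @ X, X.T @ y)   # solves X'X b = X'y
print(X.T @ (y - X @ b_mm) / n)            # sample moments are (numerically) zero at b_mm
```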
Generalized method of moments

GMM is a very general class of estimators, includes many


other estimators as a special case (IVR, simultaneous
equations, Arellano-Bond estimator for dynamic panels).

For single equation linear models, GMM may be


conveniently described using the instrumental variable case:

For the LRM yi = xi′β + εi,

we abandon the assumption E[xi εi] = 0 and
we replace it by E[zi εi] = 0.
Hence, columns of X (n×K) are potentially endogenous
and Z (n×L) is a matrix of exogenous instruments.
Generalized method of moments

GMM equation (matrix form) can be cast by analogy to


the MM case:

we start by E[zi εi ] = 0, which implies a population


moment equation:

E[zi(yi − xi′β)] = E[m(β)] = 0,

and the corresponding sample (empirical) moment equation:

(1/n) Σ_{i=1}^{n} zi(yi − xi′β̂) = m(β̂) = 0.
Generalized method of moments
The equation form of GMM empirical equations can be
produced as:
(1/n) Σ_{i=1}^{n} (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0
(1/n) Σ_{i=1}^{n} zi2 (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0
...
(1/n) Σ_{i=1}^{n} ziL (yi − β̂1 − β̂2 xi2 − · · · − β̂K xiK) = 0

First column of Z is assumed to be a vector of ones (same as for X).


For Z = X as a special case, the above equations are identical to MM
(shown previously) and the solution is identical to the OLS estimator:
β̂ = (X′X)⁻¹X′y.
For Z ≠ X, where Z is (n×L) and X is (n×K), three identification
possibilities have to be considered.
Generalized method of moments
Identification of GMM equations

1 Underidentified: with L < K, there are fewer moment


equations than unknown parameters (βj ). Without
additional information (parameter restrictions), there is no
solution to the system of GMM equations.

2 Exactly identified: for L = K, single solution exists:


(1/n) Σ_{i=1}^{n} zi(yi − xi′β̂) = m(β̂) = 0,

can be conveniently re-written as:

m(β̂) = (1/n) Z′y − (1/n) Z′X β̂ = 0

and the solution yields the familiar IV estimator:
β̂ = (Z′X)⁻¹Z′y.
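
A sketch of the exactly identified (L = K) case, β̂ = (Z′X)⁻¹Z′y, on a simulated DGP with one endogenous regressor and one instrument (all names and parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)                    # exogenous instrument
v = rng.normal(size=n)
x = 0.7 * z + v                           # regressor, endogenous through v
eps = 0.8 * v + rng.normal(size=n)        # error correlated with x
y = 1.0 + 2.0 * x + eps

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)      # (Z'X)^{-1} Z'y
b_ols = np.linalg.solve(X.T @ X, X.T @ y)     # inconsistent under endogeneity
print(b_iv, b_ols)
```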
Generalized method of moments

Identification of GMM equations (continued)

3 With L > K, there is no unique solution to the equation
system m(β̂) = 0.
One intuitive solution is the “least squares approach”:

min_β [ m(β̂)′ m(β̂) ]

Through the first order conditions, we obtain a GMM
estimator as:

β̂ = [(X′Z)(Z′X)]⁻¹ (X′Z)Z′y.
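
A companion sketch for the overidentified case (L > K) with W = I; the two-instrument simulated DGP below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
z1, z2 = rng.normal(size=n), rng.normal(size=n)
v = rng.normal(size=n)
x = 0.6 * z1 + 0.4 * z2 + v                  # endogenous regressor
y = 1.0 + 2.0 * x + (0.8 * v + rng.normal(size=n))

X = np.column_stack([np.ones(n), x])         # K = 2
Z = np.column_stack([np.ones(n), z1, z2])    # L = 3 > K
XtZ = X.T @ Z
b_gmm = np.linalg.solve(XtZ @ Z.T @ X, XtZ @ Z.T @ y)   # [(X'Z)(Z'X)]^{-1}(X'Z)Z'y
print(b_gmm)
```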

Generalized method of moments
GMM - consistency conditions
Convergence of the moments: Empirical (sample)
moments converge in probability to their population
counterparts. DGP meets the conditions for LLN.
m(β) = (1/n)(Z′y − Z′Xβ) →ᵖ 0.
Identification: For any n ≥ K and β1 ≠ β2 it holds that
m(β1) ≠ m(β2). Three implications:
Order condition: L ≥ K. Number of moment equations
at least as large as number of parameters.
Rank condition: matrix G(β) = ∂m(β)/∂β′ (i.e. (1/n)Z′X)
is an L×K matrix with rank equal to K.
Uniqueness: unique solution/optimizer exists.

Limiting Normal distribution for the sample


moments: Population moments obey central limit
theorem (CLT) or some similar variant.
Generalized method of moments
GMM - final remarks & summary
GMM-based asymptotic covariance matrix of β̂ is discussed
in Greene (Econometric analysis, ch. 13.6) for the classical,
heteroscedastic and generalized case (includes TS-based
estimation).
GMM is robust to differences in “specification” of the data
generating process (DGP). → i.e. sample mean or sample
variance estimate their population counterparts (assuming
they exist) regardless of DGP.
GMM is free from distributional assumptions. “Cost” of
this approach: if we know the specific distribution of a
DGP, GMM does not make use of such information →
inefficient estimates.
Alternative approach: method of maximum likelihood
utilizes distributional information and is more efficient
(provided this information is available & valid).
Maximum likelihood estimator

Maximum likelihood estimator (MLE)

Normal distribution & MLE


Maximum likelihood estimator
Maximum likelihood estimator – single parameter
For a stochastic variable y with a known distribution, described
by a single θ parameter:
f (y|θ) is the pdf of y, conditioned on parameter θ.
For n iid observations, joint density of this process:
f(y1, y2, . . . , yn|θ) = Π_{i=1}^{n} f(yi|θ) = L(θ|y)
is the likelihood function.
We estimate θ by maximizing L(θ|y) with respect to the
parameter (1st order conditions). Solution (MLE) often
denoted as θ̂ML .
For maximization (MLE), it is usually simpler to work with
a log-transformed likelihood function:
log L(θ|y) = Σ_{i=1}^{n} log f(yi|θ).
Maximum likelihood estimator

MLE – Poisson distribution example


Consider 10 iid observations from a Poisson distribution:
y′ = (5, 0, 1, 1, 0, 3, 2, 3, 4, 1).
The pdf: f(yi|λ) = e^(−λ) λ^(yi) / yi!.
Likelihood function: L(λ|y) = Π_{i=1}^{n} e^(−λ) λ^(yi) / yi! = e^(−10λ) λ^(Σ_{i=1}^{10} yi) / Π_{i=1}^{10} yi!.
Log-likelihood: log L(λ|y) = −nλ + log λ · Σ_{i=1}^{n} yi − Σ_{i=1}^{n} log(yi!).
1st order condition: ∂ log L(λ|y)/∂λ = −n + (1/λ) Σ_{i=1}^{n} yi = 0.

From the 1st order condition: λ̂ML = ȳ (the sample mean).

For our empirical example: λ̂ML = 2.
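
A numerical check of the Poisson example: maximizing the log-likelihood above reproduces the analytic solution λ̂ML = ȳ = 2 (using scipy here is an implementation choice, not part of the slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

y = np.array([5, 0, 1, 1, 0, 3, 2, 3, 4, 1])

def neg_loglik(lam):
    # -log L(lambda|y) = n*lambda - log(lambda)*sum(y) + sum(log(y_i!))
    return len(y) * lam - np.log(lam) * y.sum() + gammaln(y + 1).sum()

res = minimize_scalar(neg_loglik, bounds=(1e-6, 20), method="bounded")
print(res.x, y.mean())   # both approximately 2.0
```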
Maximum likelihood estimator

Maximum likelihood estimator – vector of parameters

θ = (θ1 , . . . , θm )0

L = L(θ1 , θ2 , ...θm |y1 , y2 , ..., yn )

We find MLEs of the m parameters by partially


differentiating the likelihood function L (often, log L is
used) with respect to each θ and then setting all the partial
derivatives obtained to zero.
Maximum likelihood estimator

MLE – Normal distribution


L(θ|data) = L(β, σ²|yi, xi) = Π_{i=1}^{n} (2πσ²)^(−1/2) exp[ −(yi − xi′β)² / (2σ²) ]

In matrix form, the log-likelihood function is:
LL(β, σ²|y, X) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) (y − Xβ)′(y − Xβ)

Recall that:
(y − Xβ)′(y − Xβ) = y′y − 2β′X′y + β′X′Xβ
and
∂(y − Xβ)′(y − Xβ)/∂β′ = −2X′y + 2X′Xβ.
Maximum likelihood estimator
MLE – Normal distribution (continued)
LL(β, σ²|y, X) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) (y − Xβ)′(y − Xβ)

1st order conditions:

∂LL/∂β′ = (1/(2σ²)) [2X′y − 2X′Xβ] = 0
is solved by:
β̂ = (X′X)⁻¹X′y

∂LL/∂σ² = −n/(2σ²) + (1/(2σ⁴)) (y − Xβ)′(y − Xβ) = 0
is solved by:
σ̂² = (y − Xβ̂)′(y − Xβ̂)/n = û′û/n = SSR/n.

Note: the MLE estimate σ̂ 2 is biased downwards in small


samples, as the unbiased estimate is equal to SSR/(n − K).
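
A sketch that maximizes the normal log-likelihood numerically and recovers the OLS β̂ together with σ̂² = SSR/n (simulated data; the log-variance reparameterization is an assumption made only to keep σ² positive):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, K = 150, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

def neg_ll(params):
    beta, log_s2 = params[:K], params[K]
    s2 = np.exp(log_s2)                     # sigma^2 > 0 by construction
    u = y - X @ beta
    return 0.5 * (n * np.log(2 * np.pi) + n * log_s2 + u @ u / s2)

res = minimize(neg_ll, x0=np.zeros(K + 1), method="BFGS")
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(res.x[:K], b_ols)                                    # MLE beta ~ OLS beta
print(np.exp(res.x[K]), ((y - X @ b_ols) ** 2).sum() / n)  # sigma2_ML = SSR/n
```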
Maximum likelihood estimator

Basic MLE assumptions

Parameter space: Gaps and nonconvexities in parameter


spaces would generally collide with estimation algorithms.
Identifiability: The parameter vector θ is identified
(estimable) if, for two vectors θ* ≠ θ and for some data
observations x, L(θ*|x) ≠ L(θ|x).
Well-behaved data: Laws of large numbers (LLN) apply.
Some form of CLT can be applied to the gradient (i.e. for
the estimation method).
Regularity conditions: “well behaved” derivatives of
f (yi |θ) with respect to θ (see Greene, chapter 14.4.1).
Maximum likelihood estimator

MLE properties

Consistency: plim(θ̂) = θ0 (θ0 is the true parameter)

Asymptotic normality of θ̂

Asymptotic efficiency: θ̂ is asymptotically efficient and


achieves the Cramér-Rao lower bound for consistent
estimators (see Greene, chapter 14.4.5)

Invariance: the MLE of γ0 = c(θ0) is c(θ̂) if c(θ0)
is a continuous and continuously differentiable function.
(empirical advantages: we can use reparameterization in MLE,
e.g. γj = 1/θj or θ2 = 1/σ².)
Maximum likelihood estimator
MLE - properties of the estimator
(Normal distribution):

Under the above assumptions, the variance-covariance matrix of θ̂ is
the inverse of the Information matrix:

var(θ̂) = I[θ̂]⁻¹ = [ σ²(X′X)⁻¹      0
                          0        2σ⁴/n ],

where I[θ] = −E[H(θ)]. MLE gives the familiar formula for
the variance-covariance matrix of β̂: σ²(X′X)⁻¹, and
a simple expression for the variance of σ̂².
a simple expression for the variance of σ̂ 2 .
The square root of the diagonal elements of I[θ̂]−1 gives
estimates of the standard errors of the parameter estimates.
We can construct simple z-scores to test the null hypothesis
concerning any individual parameter, just as in OLS, but using
the normal instead of the t-distribution.
Maximum likelihood estimator

MLE - inference, three classic tests:


Consider MLE of parameter θ and a test of the hypothesis
H0 : h(θ) = 0. Recall that ML parameter estimates are
asymptotically normally distributed.

1 Likelihood ratio test: If the restriction h(θ) = 0 is valid,


then imposing it should not lead to a large reduction in the
log-likelihood function.

LR = 2(LLU − LLR) ∼ χ²(r) under H0,

where LLU is the log-likelihood of the unconstrained model, LLR denotes
that of the restricted model and r is the number of restrictions
imposed. To do this test you have to estimate two models
(one nested in the other) and get the results of both.
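
A minimal LR-test sketch using statsmodels, assuming normal errors so that OLS log-likelihoods can be compared; the simulated DGP, variable names and the single restriction (dropping x2) are illustrative assumptions:

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(5)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)          # x2 is irrelevant under H0

ll_u = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().llf  # unrestricted
ll_r = sm.OLS(y, sm.add_constant(x1)).fit().llf                         # restricted (H0: beta_x2 = 0)
LR = 2 * (ll_u - ll_r)
print(LR, chi2.sf(LR, df=1))                     # statistic and chi2(1) p-value
```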
Maximum likelihood estimator

MLE - inference, three classic tests:

We have an unrestricted ML estimate θ̂ = (θ̂1, . . . , θ̂m)′,
and test the hypothesis H0: h(θ) = q,
where h(θ) − q is an (r × 1) vector function of θ (linear/non-linear
restrictions; continuous partial derivatives assumed).

2 Wald test: If restriction h(θ) = q is valid, then h(θ̂) − q


should be close to zero since MLE is consistent.

W = [h(θ̂) − q]′ {Asy.Var[h(θ̂) − q]}⁻¹ [h(θ̂) − q] ∼ χ²(r) under H0,

where the estimated
Asy.Var[h(θ̂) − q] = [∂h(θ̂)/∂θ̂′] Asy.Var(θ̂) [∂h(θ̂)/∂θ̂′]′.
Maximum likelihood estimator
MLE - inference, three classic tests:

We have a ML estimate θ̂R – i.e. ML estimation of the


restricted model, under H0 : h(θ) = 0,

3 Lagrange multiplier test: If the restriction is valid, then


the restricted estimator should be near the point that
maximizes the log-likelihood. Therefore, the slope of the
log-likelihood function should be near zero at the restricted
estimator. The test is based on the slope of the
log-likelihood at the point where the function is maximized
subject to the restriction.

LM = [∂ log L(θ̂R)/∂θ̂R]′ I[θ̂R]⁻¹ [∂ log L(θ̂R)/∂θ̂R] ∼ χ²(r) under H0,

where −I[θ̂R] = ∂²LL(θ)/∂θ∂θ′ evaluated at θ = θ̂R.


Maximum likelihood estimator

MLE - inference, three classic tests:

The χ2 distributions of the three test statistics are


asymptotically valid.

The three tests are asymptotically equivalent, but may


differ in small samples:

W ≥ LR ≥ LM.

Hence, in finite samples, LR rejects H0 less often than W


but more often than LM.

The above tests are discussed in ML context, i.e. with a


known distribution of the variable/error term
(ML parameter estimates are asymptotically normally
distributed).
Maximum likelihood estimator

MLE – summary

MLE is only possible if we know the form of the probability


distribution function for the population (Normal, Poisson,
Negative Binomial, etc.).

MLE has the large sample properties of consistency and


asymptotic efficiency. There is no guarantee of desirable
small-sample properties.

Under CLRM assumptions (A1 – A6), ML estimator is


identical to OLS estimator (for β̂).
Non-linear extensions to LRM, quantile regression

Non-linear regression models

Quantile regression
Non-linear regression models

Nonlinear regression model:

yi = h(xi , β) + εi

Linear model is a special case of the nonlinear model.


yi = h(xi, β) + εi = xi′β + εi.
Linear models: linear in parameters. Definition includes
non-linear regressors such as x2i , etc.
Many nonlinear models can be transformed into linear
models (log-transformation)

For nonlinear models that cannot be transformed into


LRM, nonlinear LS (NLS) are available.

∂h(xi , β)/∂x is no longer equal to β


(interpretation based on estimated model . . . )
Nonlinear regression
Assumptions relevant to the nonlinear regression model
1 Functional form: The conditional mean function for yi ,
given xi is:

E[yi |xi ] = h(xi , β) , i = 1, 2, . . . , n

2 Identifiability of model parameters: The parameter


vector in the model is identified (estimable) if there is no
nonzero parameter β0 ≠ β such that h(xi, β0) = h(xi, β)
for all xi .

3 Zero mean of the disturbance: For yi = h(xi , β) + εi ,


we assume

E[εi |h(xi , β)] = 0 , i = 1, 2, . . . , n

i.e. disturbance at observation i is uncorrelated with the


conditional mean function.
Nonlinear regression

Assumptions relevant to the nonlinear regression model

4 Homoscedasticity and non-autocorrelation:


conditional homoscedasticity:

E[ε2i |h(xi , β)] = σ 2 , i = 1, 2, . . . , n

non-autocorrelation:

E[εt εs|h(xt, β), h(xs, β)] = 0, for all t ≠ s


Nonlinear regression

Assumptions relevant to the nonlinear regression model

5 Data generating process: DGP for xi is assumed to be


a well-behaved population such that first and second
sample moments of the data can be assumed to converge to
fixed, finite population counterparts. The crucial
assumption is that the process generating xi is strictly
exogenous to that generating εi.
6 Underlying probability model: There is a well-defined
probability distribution generating εi . At this point, we
assume only that this process produces a sample of
uncorrelated, identically (marginally) distributed random
variables εi with mean zero and variance σ 2 conditioned on
h(xi , β). Hence, our statement of the model is
semi-parametric (i.e. specific distributional assumption
on residuals are replaced by weaker assumptions).
Nonlinear Regression: NLS

NLS: estimator of the nonlinear regression model

NLS: min S(β) = Σ_{i=1}^{n} [yi − h(xi, β)]²

Using the standard procedure, we can get k first order


conditions for the minimization:
∂S(β)/∂β = Σ_{i=1}^{n} [yi − h(xi, β)] · ∂h(xi, β)/∂β = 0

The above first order conditions are also moment conditions


and this defines the NLS estimator as a GMM estimator.
Nonlinear regression: NLS

NLS: estimator of the nonlinear regression model

NLS being a GMM estimator allows us to deduce that the


NLS estimator has good large sample properties:
consistency and asymptotic normality (if assumptions are
fulfilled).

Hypothesis testing: The principal testing procedure is the


Wald test, which relies on the consistency and asymptotic
normality of the estimator. Likelihood ratio and LM tests
can also be constructed.
Nonlinear regression: computing NLS estimates

For nonlinear models, a closed-form solution (NLS estimator)


usually does not exist.

Most of the nonlinear maximization problems are solved by


an iterative algorithm.
The most commonly used of iterative algorithms are
gradient methods.
The template for most gradient methods in common use is
Newton's method.
Check which methods your software package offers for
computing NLS estimates.
Nonlinear regression: examples

LRM on TS with autocorrelation:

yt = xt′β + ut,   ut = ρut−1 + εt,
yt = xt′β + ρut−1 + εt;   note: ut−1 = yt−1 − xt−1′β,
hence:
yt = ρyt−1 + xt′β − ρ(xt−1′β) + εt,
which is non-linear in the parameters (the product ρβ).

Non-linear consumption function example:

consi = β1 + β2 inci^β3 + εi
special case: the model is linear for β3 = 1
(such an assumption can be tested).
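
A hedged NLS sketch for the consumption function above, fitted with scipy's curve_fit on simulated data (the "true" parameter values below only mimic the magnitudes reported in Greene's example and are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(6)
inc = rng.uniform(1_000, 10_000, size=200)
cons = 450 + 0.10 * inc ** 1.25 + rng.normal(scale=50, size=200)

def h(inc, b1, b2, b3):
    return b1 + b2 * inc ** b3               # cons_i = b1 + b2 * inc_i^b3

# Starting values matter; b3 = 1 starts from the linear (OLS) special case.
b_hat, b_cov = curve_fit(h, inc, cons, p0=[0.0, 1.0, 1.0])
print(b_hat, np.sqrt(np.diag(b_cov)))        # NLS estimates and asymptotic std. errors
```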
Nonlinear regression: examples

Examples 7.4 & 7.8 (Greene):


Analysis of a Nonlinear Consumption Function
OLS version: for β3 = 1.

Dependent Variable: REALCONS
Method: Least Squares (Marquardt - EViews legacy)
Date: 09/19/16  Time: 16:31
Sample: 1950Q1 2000Q4
Included observations: 204
REALCONS=C(1)+C(2)*REALDPI

        Coefficient   Std. Error   t-Statistic   Prob.
C(1)    -80.35475     14.30585     -5.616915     0.0000
C(2)    0.921686      0.003872     238.0540      0.0000

R-squared            0.996448   Mean dependent var      2999.436
Adjusted R-squared   0.996431   S.D. dependent var      1459.707
S.E. of regression   87.20983   Akaike info criterion   11.78427
Sum squared resid    1536322    Schwarz criterion       11.81680
Log likelihood       -1199.995  Hannan-Quinn criter.    11.79743
F-statistic          56669.72   Durbin-Watson stat      0.092048
Prob(F-statistic)    0.000000
Nonlinear regression: examples

Examples 7.4 & 7.8 (Greene):


Analysis of a Nonlinear Consumption Function
NLS with starting values equal to 0

Dependent Variable: REALCONS
Method: Least Squares (Marquardt - EViews legacy)
Sample: 1950Q1 2000Q4    Included observations: 204
Convergence achieved after 200 iterations
REALCONS=C(1)+C(2)*REALDPI^C(3)

        Coefficient   Std. Error   t-Statistic   Prob.
C(1)    458.7991      22.50140     20.38980      0.0000
C(2)    0.100852      0.010910     9.243667      0.0000
C(3)    1.244827      0.012055     103.2632      0.0000

R-squared            0.998834   Mean dependent var      2999.436
Adjusted R-squared   0.998822   S.D. dependent var      1459.707
S.E. of regression   50.09460   Akaike info criterion   10.68030
Sum squared resid    504403.2   Schwarz criterion       10.72910
Log likelihood       -1086.391  Hannan-Quinn criter.    10.70004
F-statistic          86081.29   Durbin-Watson stat      0.295995
Prob(F-statistic)    0.000000
Nonlinear regression: examples

Examples 7.4 & 7.8 (Greene):


Analysis of a Nonlinear Consumption Function
NLS with starting values equal to the parameters from the OLS
estimation (c(3) equal to 1)

Dependent Variable: REALCONS
Method: Least Squares (Marquardt - EViews legacy)
Sample: 1950Q1 2000Q4    Included observations: 204
Convergence achieved after 80 iterations
REALCONS=C(1)+C(2)*REALDPI^C(3)

        Coefficient   Std. Error   t-Statistic   Prob.
C(1)    458.7989      22.50149     20.38971      0.0000
C(2)    0.100852      0.010911     9.243447      0.0000
C(3)    1.244827      0.012055     103.2632      0.0000

R-squared            0.998834   Mean dependent var      2999.436
Adjusted R-squared   0.998822   S.D. dependent var      1459.707
S.E. of regression   50.09460   Akaike info criterion   10.68030
Sum squared resid    504403.2   Schwarz criterion       10.72910
Log likelihood       -1086.391  Hannan-Quinn criter.    10.70004
F-statistic          86081.28   Durbin-Watson stat      0.295995
Prob(F-statistic)    0.000000
Quantile regression - LAD

Quantile regression estimates the relationship between


regressors and a specified quantile of dependent variable.
LAD estimator is the QREG for q = 1/2 (median) and the
loss function can be described as (compare to the OLS
objective function):

min_{β̂q} Qn(β̂q) = Σ_{i=1}^{n} |yi − xi′β̂q|

The LAD estimator predates OLS (which is itself more than 200 years old).

Until recently, QREG and LAD have seen little use in
econometrics, as OLS is vastly easier to compute.
Different software packages use a variety of optimization
algorithms for QREG/LAD estimation.
Linear programming can be used for finding QREG
estimates (Koenker and Bassett, 1978).
Quantile regression (QREG)

For LRMs, the q-th quantile QREG estimator βq minimizes:


min_{β̂q} Qn(β̂q) = Σ_{i: ei ≥ 0} q|yi − xi′β̂q| + Σ_{i: ei < 0} (1 − q)|yi − xi′β̂q|,

where ei = (yi − xi′β̂q).


We use the notation β̂q to make clear that different choices
of q lead to different β̂.
Slope of the loss function Qn is asymmetrical
(around ei = 0).
The loss function is not differentiable (at ei = 0)
→ gradient methods are not applicable
(linear programming can be used).
Quantile regression (QREG)

Quantile regression: used to describe relationship between


regressors and a specified quantile of dependent variable.

The (linear) quantile model can be defined as


Q[y|x, q] = x′βq, such that Prob[y ≤ x′βq | x] = q, 0 < q < 1,
where q denotes the q-th quantile of y.

One important special case of quantile regression is the
least absolute deviations (LAD) estimator, which
corresponds to fitting the conditional median of the
response variable (q = 1/2).

QREG (LAD) estimator can be motivated as a robust


alternative to OLS (with respect to outliers).
Quantile regression

QREG coefficient interpretation example:

(1) wagei = β1 + ui
(2) wagei = β1 + β2 femalei + ui
(3) wagei = β1 + β2 femalei + β3 experi + ui

The above equations are estimated by OLS / LAD / QREG:


Coefficient       OLS                          LAD (q = 1/2)                QREG (q = 3/4)
(1) β1            β̂1 = ȳ (sample mean)         β̂1 = ỹ (sample median)       β̂1 = Q3 (sample 3rd quartile)
(2) β1, β1+β2     conditional sample mean      cond. sample median          conditional sample Q3
                  wage: male / female          wage: male / female          wage: male / female
(3) β3            change in expected mean      change in exp. median        change in expected Q3
                  wage for ∆exper = 1          wage for ∆exper = 1          wage for ∆exper = 1
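
A sketch of the wage example with statsmodels' QuantReg, comparing OLS, LAD (q = 1/2) and the 3rd-quartile fit (q = 3/4); the simulated wage/exper DGP is an illustrative assumption:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1000
exper = rng.uniform(0, 30, size=n)
wage = 10 + 0.4 * exper + rng.standard_t(df=3, size=n)   # heavy-tailed errors

X = sm.add_constant(exper)
fit_ols = sm.OLS(wage, X).fit()              # conditional mean
fit_lad = sm.QuantReg(wage, X).fit(q=0.50)   # conditional median (LAD)
fit_q75 = sm.QuantReg(wage, X).fit(q=0.75)   # conditional 3rd quartile
print(fit_ols.params, fit_lad.params, fit_q75.params)
```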
Quantile regression example

Example 7.10 (Greene):


Income Elasticity of Credit Cards Expenditure
OLS & LAD & Income elasticity at different deciles

Dependent Variable: LOGSPEND
Method: Least Squares
Date: 09/15/16  Time: 13:53
Sample (adjusted): 3 13443
Included observations: 10499 after adjustments

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -3.055807     0.239699     -12.74852     0.0000
LOGINC      1.083438      0.032118     33.73296      0.0000
AGE         -0.017364     0.001348     -12.88069     0.0000
ADEPCNT     -0.044610     0.010921     -4.084857     0.0000

R-squared            0.100572   Mean dependent var      4.728778
Adjusted R-squared   0.100315   S.D. dependent var      1.404820
S.E. of regression   1.332496   Akaike info criterion   3.412366
Sum squared resid    18634.35   Schwarz criterion       3.415131
Log likelihood       -17909.21  Hannan-Quinn criter.    3.413300
F-statistic          391.1750   Durbin-Watson stat      1.888912
Prob(F-statistic)    0.000000
Quantile regression example 2

Example 7.10 (Greene):


Income Elasticity of Credit Cards Expenditure (LAD)
Dependent Variable: LOGSPEND    Method: Quantile Regression (Median)
Sample (adjusted): 3 13443    Included observations: 10499 after adjustments
Huber Sandwich Standard Errors & Covariance
Sparsity method: Kernel (Epanechnikov) using residuals
Bandwidth method: Hall-Sheather, bw=0.04437
Estimation successfully identifies unique optimal solution

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -2.803756     0.233534     -12.00577     0.0000
LOGINC      1.074928      0.030923     34.76139      0.0000
AGE         -0.016988     0.001530     -11.10597     0.0000
ADEPCNT     -0.049955     0.011055     -4.518599     0.0000

Pseudo R-squared          0.058243   Mean dependent var    4.728778
Adjusted R-squared        0.057974   S.D. dependent var    1.404820
S.E. of regression        1.346476   Objective             5096.818
Quantile dependent va...  4.941583   Restr. objective      5412.032
Sparsity                  2.659971   Quasi-LR statistic    948.0224
Prob(Quasi-LR stat)       0.000000
Quantile regression example 2

Example 7.10 (Greene):


Income Elasticity of Credit Cards Expenditure
[Figure: QREG coefficient estimates plotted against the quantile q (horizontal axis, approx. 0.2–0.8) in four panels: (Intercept), log(INCOME), AGE, ADEPCNT.]


Predictions from a model

Predictions from a CLRM (repetition from BSc courses)

Predictions: general features, kFCV, Variance vs. Bias


Predictions - basics

CLRM and its estimate:

y = β1 + β2 x2 + β3 x3 + · · · + βK xK + u
ŷ = β̂1 + β̂2 x2 + β̂3 x3 + · · · + β̂K xK

Prediction of expected value:

ŷp = E(y|x1 = 1, x2 = c2 , . . . , xK = cK )
ŷp = β̂1 + β̂2 c2 + β̂3 c3 + · · · + β̂K cK

Rough (underestimated) confidence interval for the


expected value prediction: (95%): ŷp ± 2 × s.e.(ŷp ).
(Rule of thumb)
Predictions - basics

s.e.(ŷp ) can be obtained by reparametrization:

Reparametrized CLRM:

y ∗ = β1∗ + β2∗ (x2 − c2 ) + β3∗ (x3 − c3 ) + · · · + u

The following holds:

ŷp = β̂1*
s.e.(ŷp) = s.e.(β̂1*), i.e.
var(ŷp) = var(β̂1*)
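
A small sketch of the reparametrization trick: regressing y on (x2 − c2) makes the intercept equal to ŷp, so its standard error is s.e.(ŷp) (simulated data; names and the prediction point are illustrative assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 120
x2 = rng.normal(size=n)
y = 3.0 + 1.5 * x2 + rng.normal(size=n)
c2 = 1.0                                     # prediction point (assumed)

fit = sm.OLS(y, sm.add_constant(x2 - c2)).fit()
print(fit.params[0], fit.bse[0])             # y_hat_p and s.e.(y_hat_p)
```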
Predictions - basics

Predicted and actual values of yp :

ŷp = β̂1 + β̂2 c2 + β̂3 c3 + · · · + β̂K cK


yp = β1 + β2 c2 + β3 c3 + · · · + βK cK + up

Prediction error

êp = yp − ŷp = (β1 + β2 c2 + β3 c3 + · · · + βK cK ) + up − ŷp

Prediction error variance

var(êp ) = var(up ) + var(ŷp )

because var(β1 + β2 c2 + β3 c3 + · · · + βK cK ) = 0
Predictions - basics

In CLRM, homoscedasticity holds, σ 2 = var(up ):


var(êp ) = σ 2 + var(ŷp )
We estimate σ 2 from the original CLRM as (SSR/(n − K))
We get var(ŷp ) from the reparametrized LRM

Standard prediction error:


p
s.e.(êp ) = var(êp )

Prediction interval (95%)


ŷp ± t0.025 × s.e.(êp )
Predictions - basics

Prediction with logarithmic dependent variable

log(y) = β1 + β2 x2 + · · · + βK xK + u,
with fitted values log(y)^ = β̂1 + β̂2 x2 + · · · + β̂K xK.

ŷ = e^(log(y)^) systematically underestimates y;
we can use a correction: ŷ = α̂0 · e^(log(y)^),
where α̂0 = n⁻¹ Σ_{i=1}^{n} exp(ûi)
is a consistent (but not unbiased) estimator of E[exp(u)].
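
A sketch of the correction factor α̂0 for level predictions from a log(y) regression (simulated data; the DGP and variable names are assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 400
x = rng.normal(size=n)
y = np.exp(1.0 + 0.5 * x + rng.normal(scale=0.6, size=n))

fit = sm.OLS(np.log(y), sm.add_constant(x)).fit()
alpha0 = np.exp(fit.resid).mean()            # alpha0_hat = n^{-1} sum exp(u_hat_i)
y_hat = alpha0 * np.exp(fit.fittedvalues)    # corrected level predictions
print(alpha0)
```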


Predictions - basics (Matrix form)

Prediction based on estimated model:

ŷp = xp′β̂

Difference between the prediction and the actual yp value:

êp = ŷp − yp = xp′β̂ − xp′β − up = xp′(β̂ − β) − up

If β̂ is an unbiased estimator for β,
ŷp is an unbiased estimator for the yp value:

E(êp) = E(ŷp − yp) = xp′E(β̂ − β) + E(−up) = 0

and the variance of êp can be expressed as:

E(êp²) = var(êp) = xp′ var(β̂) xp + var(up)


Predictions - basics (Matrix form)

Variance of êp (continued):

var(êp) = xp′ var(β̂) xp + var(up)
        = xp′ [σ² (X′X)⁻¹] xp + var(up)
substitute σ², var(up) with σ̂² (homoscedasticity):
        = xp′ [σ̂² (X′X)⁻¹] xp + σ̂²,
where the first term is denoted σ̂p².

With growing sample size (asymptotically),
var(êp) = σ̂p² + σ̂² converges to σ̂²
. . . plim β̂ = β ↔ plim σ̂p² = 0
(Note: recall consistency of the OLS estimator under A1–A5
conditions & for the CLRM model - i.e. under A1–A6.)
Predictions - basics (Matrix form)

Variance of êp (continued):


var(êp) = xp′ [σ̂² (X′X)⁻¹] xp + σ̂²;
after re-arranging, s.e.(êp) may be written as

s.e.(êp) = σ̂ · √(1 + xp′(X′X)⁻¹xp),

which relates to the individual prediction error.

For mean prediction errors (considering σ̂p² only):

s.e.(ẽp) = σ̂ · √(xp′(X′X)⁻¹xp).
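
A sketch that evaluates both standard errors from the formulas above at one prediction point xp (all numbers are simulated and illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
n, K = 80, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, 1.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
sigma2 = ((y - X @ b) ** 2).sum() / (n - K)        # sigma^2_hat = SSR/(n-K)
xp = np.array([1.0, 0.5])                          # prediction point (assumed)

h = xp @ np.linalg.inv(X.T @ X) @ xp               # x_p'(X'X)^{-1}x_p
se_mean = np.sqrt(sigma2 * h)                      # mean-value prediction s.e.
se_indiv = np.sqrt(sigma2 * (1 + h))               # individual prediction s.e.
print(xp @ b, se_mean, se_indiv)
```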
Predictions - basics (Matrix form)

Prediction intervals: individual vs. mean value predictions:

Individual prediction: yp ∈ ŷp ± t*α/2 × s.e.(êp)

Mean value: E(yp|xp) ∈ ŷp ± t*α/2 × s.e.(ẽp)


Predictions – general discussion:

Reliability of predictions:

we work with estimated parameters


(if we generalize from the CLRM paradigm, finite/small
sample properties of estimators may be difficult to
describe),
model parameters can change in time
(discussed separately in next Block – see Chow tests),
predictions include “individual” random errors.

Impacts of random errors on predictions of individual


values are usually much bigger than the impacts of
variance in estimated parameters.
Mean Squared Error of prediction

We can generalize the previous discussion on predictions by


considering both biased and unbiased predictors and by
allowing for different functional forms and complexity levels in
predictive models.

Predictions may be compared/evaluated using:

MSE = E[ (yi − f̂(xi))² ],

where fˆ(xi ) is the prediction that fˆ generates for the i-th


regressor set. Here, fˆ represents a general class of
predictors (linear, non-linear, non-parametric, etc.) and it
may produce either biased or unbiased predictions
Variance vs. Bias trade-off

Example for a “sine-like” function: y = f (x) + u


Train sample & Test sample

Suppose we fit a model f̂(x) to some training data
Tr = {yi, xi}, i = 1, . . . , n, and we wish to see how well it performs.
We could compute the MSE over Tr:

MSETr = (1/n) Σ_{i∈Tr} [yi − f̂(xi)]²

When searching for the “best” model by minimizing MSE, the


above statistic would lead to over-fit models.

Instead, we should (if possible) compute the MSE using
fresh test data Te = {yi, xi}, i = 1, . . . , m:

MSETe = (1/m) Σ_{i∈Te} [yi − f̂(xi)]²
Variance vs. Bias trade-off

Suppose we have a model fˆ(x), fitted to some training data Tr


and let {y0 , x0 } be a test observation drawn from the
population. If the true model is yi = f (xi ) + εi ,
with f (xi ) = E(yi |xi ), then the expected test MSE can be
decomposed into:
E(MSE0) = var(f̂(x0)) + [Bias(f̂(x0))]² + var(ε0),
where
Bias(f̂(x0)) = E[f̂(x0)] − f(x0),
var(ε0) is the irreducible error: E(MSE0) ≥ var(ε0),
all three RHS elements are non-negative,
The above equation refers to the average test MSE that we
would obtain if we repeatedly estimated f (x) using a large
number of training sets and then tested each fˆ(x) at x0 .
Variance vs. Bias trade-off

E(MSE0) = var(f̂(x0)) + [Bias(f̂(x0))]² + var(ε0)

[Figure: illustration of the Variance vs. Bias² trade-off; var(ε0) is not shown explicitly. The lowest expected test MSE lies near the (asymptotic) minima of the Variance and Bias² curves.]
k-Fold Cross Validation

Training error (MSETr ) can be calculated easily.


However, MSETr is not a good approximation of
MSETe (the out-of-sample predictive properties of the model).
Usually, MSETr dramatically underestimates MSETe.

Cross-validation is based on re-sampling (similar to bootstrap).


Repeatedly fit a model of interest to samples formed from the
training set & make “test sample” predictions, in order to
obtain additional information about predictive properties of the
model.
k-Fold Cross Validation

In k-Fold Cross-Validation (kFCV), the original sample is


randomly partitioned into k roughly equal subsamples
(divisibility).
One of the k subsamples is retained as the test sample, and
the remaining (k − 1) subsamples are used as training data.

The cross-validation process is then repeated k times


(the k folds), with each of the k subsamples used exactly
once as the test sample.
The k results from the folds can then be averaged to
produce a single estimate.
k = 5 or k = 10 is commonly used.
k-Fold Cross Validation

kFCV example for CS data & k = 5:


(random sampling, no replacement)

In TS, a similar “Walk forward” test procedure may be applied.


k-Fold Cross Validation

CV(k) = (1/k) Σ_{s=1}^{k} MSEs,

where CV(k) is the cross-validated estimate of MSE,
k is the number of folds used (e.g. 5 or 10),
MSEs = (1/ms) Σ_{i∈Cs} (yi − ŷi)²,
ms is the number of observations in the s-th test sample,
Cs refers to the s-th set of test sample observations.

As we evaluate predictions from two or more models,


we look for the lowest CV(k) .
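
A sketch of CV(k) for k = 5 using scikit-learn's KFold splitter and a simple OLS fit on each training fold (the "sine-like" DGP mirrors the earlier illustration and is an assumption):

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(11)
n = 200
x = rng.uniform(-3, 3, size=n)
y = np.sin(x) + rng.normal(scale=0.3, size=n)
X = np.column_stack([np.ones(n), x, x**2, x**3])     # cubic approximation of f(x)

fold_mse = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    b = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)[0]
    resid = y[test_idx] - X[test_idx] @ b
    fold_mse.append(np.mean(resid ** 2))             # MSE_s on the s-th test fold

print(np.mean(fold_mse))                             # CV_(k): average of the k fold MSEs
```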
