7 Single Index Models
The single index regression model is
$$E(y \mid x) = g(x'\beta). \qquad (1)$$
The single index binary choice model is
$$P(y = 1 \mid x) = E(y \mid x) = g(x'\beta), \qquad (2)$$
where g is an unknown distribution function. We use g (rather than, say, F) to emphasize the connection with the regression model.
In both contexts, the function g includes any location and level shift, so the vector $X_i$ cannot include an intercept. The scale of $\beta$ is not identified, so some normalization of $\beta$ is needed. It is typically easier to impose this on $\beta$ than on g. One approach is to set $\beta'\beta = 1$. A second approach is to set one component of $\beta$ equal to one. (This second approach requires that this variable truly has a non-zero coefficient.)
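As a small illustration of the two normalizations (a minimal sketch in Python; the function names are mine, not from the text):

```python
import numpy as np

def normalize_unit_length(beta):
    # Normalization 1: impose beta'beta = 1.
    return beta / np.linalg.norm(beta)

def normalize_first_coefficient(beta):
    # Normalization 2: set the first coefficient to one.
    # This requires that the first variable truly has a non-zero coefficient.
    return beta / beta[0]

beta = np.array([2.0, -1.0, 0.5])
print(normalize_unit_length(beta))        # approximately [ 0.873, -0.436,  0.218]
print(normalize_first_coefficient(beta))  # [ 1.0, -0.5, 0.25]
```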
The vector $X_i$ must be of dimension 2 or larger. If $X_i$ is one-dimensional, then $\beta$ is simply normalized to one, and the model is the one-dimensional nonparametric regression $E(y \mid x) = g(x)$ with no semiparametric component.
Identification of $\beta$ and g also requires that $X_i$ contains at least one continuously distributed variable, and that this variable has a non-zero coefficient. If not, $X_i'\beta$ would take only a discrete set of values, and it would be impossible to identify a continuous function g on this discrete support.
The single index regression model in error form is
$$y_i = g(X_i'\beta) + e_i$$
$$E(e_i \mid X_i) = 0.$$
This model generalizes the linear regression model (which sets g(z) to be linear), and is a
restriction of the nonparametric regression model.
The gain over full nonparametrics is that there is only one nonparametric dimension, so the
curse of dimensionality is avoided.
Suppose g were known. Then you could estimate $\beta$ by (nonlinear) least squares. The LS criterion would be
$$S_n(\beta; g) = \sum_{i=1}^{n}\left(y_i - g(X_i'\beta)\right)^2.$$
We could think about replacing g with an estimate $\hat g$, but since $g(z)$ is the conditional mean of $y_i$ given $X_i'\beta = z$, g depends on $\beta$, so a two-step estimator is likely to be inefficient.
In his PhD thesis, Ichimura proposed a semiparametric estimator, published later in the Journal
of Econometrics (1993).
Ichimura suggested replacing g with the leave-one-out NW estimator
$$\hat g_{-i}(X_i'\beta) = \frac{\sum_{j \neq i} k\!\left(\frac{(X_j - X_i)'\beta}{h}\right) y_j}{\sum_{j \neq i} k\!\left(\frac{(X_j - X_i)'\beta}{h}\right)}.$$
The leave-one-out version is used since we are estimating the regression at the i'th observation, $X_i$.
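A minimal sketch of this leave-one-out estimator (my own implementation, not Ichimura's; a Gaussian kernel is assumed for concreteness):

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def loo_nw(y, X, beta, h, kernel=gaussian_kernel):
    # Leave-one-out Nadaraya-Watson estimates of g(X_i'beta), one per observation.
    z = X @ beta                       # index values X_i'beta
    u = (z[None, :] - z[:, None]) / h  # u[i, j] = (X_j - X_i)'beta / h
    K = kernel(u)
    np.fill_diagonal(K, 0.0)           # leave out observation i
    return (K @ y) / K.sum(axis=1)
```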
Since the NW estimator only converges uniformly over compact sets, Ichimura introduces trimming into the sum of squared errors. The criterion is then
$$S_n(\beta) = \sum_{i=1}^{n}\left(y_i - \hat g_{-i}(X_i'\beta)\right)^2 1_i(b).$$
He is not too specific about how to pick the trimming function, and it is likely that it is not important in applications.
The estimator of $\beta$ is then
$$\hat\beta = \operatorname{argmin}_{\beta}\, S_n(\beta).$$
The criterion is somewhat similar to cross-validation. Indeed, Hardle, Hall, and Ichimura (Annals of Statistics, 1993) suggest picking $\beta$ and the bandwidth h jointly by minimization of $S_n(\beta)$.
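Putting the pieces together, a sketch of the estimator with the first coefficient normalized to one (reusing loo_nw from the sketch above; the default of no trimming and the use of Nelder-Mead are simplifications of mine, not prescriptions from the papers):

```python
import numpy as np
from scipy.optimize import minimize

def ichimura_criterion(theta, y, X, h, trim):
    beta = np.concatenate(([1.0], theta))  # normalization: first coefficient equals one
    ghat = loo_nw(y, X, beta, h)           # leave-one-out NW fit, defined above
    return np.sum(trim * (y - ghat)**2)    # trimmed sum of squared errors S_n(beta)

def ichimura_estimate(y, X, h, trim=None, theta0=None):
    n, q = X.shape
    trim = np.ones(n) if trim is None else trim
    theta0 = np.zeros(q - 1) if theta0 is None else theta0
    out = minimize(ichimura_criterion, theta0, args=(y, X, h, trim),
                   method="Nelder-Mead")
    return np.concatenate(([1.0], out.x))

# Following Hardle, Hall, and Ichimura (1993), the bandwidth h could instead be
# appended to theta and chosen jointly by minimizing the same criterion.
```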
In his paper, Ichimura claims that $\hat g_{-i}(X_i'\beta)$ could be replaced by any other uniformly consistent estimator and the consistency of $\hat\beta$ would be maintained, but his asymptotic normality result would be lost. In particular, his proof rests on the asymptotic orthogonality of the derivative of $\hat g_{-i}(X_i'\beta)$ with $e_i$, which holds since the former is a leave-one-out estimator, and fails if it is a conventional NW estimator.
The tricky thing is that $\hat g_{-i}(X_i'\beta)$ is not estimating $g(X_i'\beta_0)$; rather, it is estimating
$$G(X_i'\beta) = E\left(y_i \mid X_i'\beta\right) = E\left(g(X_i'\beta_0) \mid X_i'\beta\right).$$
Hardle, Hall, and Ichimura (1993) show that the LS criterion is asymptotically equivalent to replacing $\hat g_{-i}(X_i'\beta)$ with $G(X_i'\beta)$, so
$$S_n(\beta) \simeq S_n^*(\beta) = \sum_{i=1}^{n}\left(y_i - G(X_i'\beta)\right)^2.$$
This approximation is essentially the same as Andrews' MINPIN argument, and relies on the estimator $\hat g_{-i}(X_i'\beta)$ being a leave-one-out estimator, so that it is orthogonal with the error $e_i$.
This means that $\hat\beta$ is asymptotically equivalent to the minimizer of $S_n^*(\beta)$, an NLLS problem.
As we know from Econ 710, the asymptotic distribution of the NLLS estimator is identical to that of least squares on the constructed regressor
$$X_i^* = \frac{\partial}{\partial\beta} G(X_i'\beta)\Big|_{\beta = \beta_0}.$$
This implies
$$\sqrt{n}\left(\hat\beta - \beta_0\right) \to_d N(0, V)$$
$$V = Q^{-1}\Omega Q^{-1}$$
$$Q = E\left(X_i^* X_i^{*\prime}\right)$$
$$\Omega = E\left(X_i^* X_i^{*\prime} e_i^2\right).$$
Let
$$g^{(1)}(z) = \frac{d}{dz} g(z)$$
denote the derivative of g.
Then
$$G(X_i'\beta) = E\left(g(X_i'\beta_0) \mid X_i'\beta\right) \simeq E\left(g(X_i'\beta) + g^{(1)}(X_i'\beta)\, X_i'(\beta_0 - \beta) \mid X_i'\beta\right) = g(X_i'\beta) + g^{(1)}(X_i'\beta)\, E\left(X_i \mid X_i'\beta\right)'(\beta_0 - \beta)$$
since $g(X_i'\beta)$ and $g^{(1)}(X_i'\beta)$ are measurable with respect to $X_i'\beta$. Another Taylor expansion for $g(X_i'\beta)$ yields that this is approximately
$$G(X_i'\beta) \simeq g(X_i'\beta_0) + g^{(1)}(X_i'\beta)\left(X_i - E\left(X_i \mid X_i'\beta\right)\right)'(\beta - \beta_0) \simeq g(X_i'\beta_0) + g^{(1)}(X_i'\beta_0)\left(X_i - E\left(X_i \mid X_i'\beta_0\right)\right)'(\beta - \beta_0),$$
the final approximation holding for $\beta$ in a $n^{-1/2}$ neighborhood of $\beta_0$. (The error is of smaller stochastic order.)
We see that
$$X_i^* = \frac{\partial}{\partial\beta} G(X_i'\beta)\Big|_{\beta = \beta_0} \simeq g^{(1)}(X_i'\beta_0)\left(X_i - E\left(X_i \mid X_i'\beta_0\right)\right).$$
Ichimura rigorously establishes this result.
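A plug-in sketch of the implied variance estimator, computed for the free coefficients under the $\beta_1 = 1$ normalization (dropping the normalized coefficient sidesteps the singularity of Q in the direction of $\beta_0$ created by the scale normalization; this detail, the numerical derivative, and the reuse of gaussian_kernel from the earlier sketch are choices of mine, not steps spelled out in the notes):

```python
import numpy as np

def nw_fit(y, z, z0, h, kernel=gaussian_kernel):
    # Nadaraya-Watson estimate of E(y | index = z0_i) at each evaluation point z0_i.
    K = kernel((z[None, :] - z0[:, None]) / h)
    return (K @ y) / K.sum(axis=1)

def ichimura_avar_free(y, X, beta_hat, h, eps=1e-3):
    # Plug-in sandwich V = Q^{-1} Omega Q^{-1} for all coefficients but the first.
    n = len(y)
    z = X @ beta_hat
    g = nw_fit(y, z, z, h)
    # numerical derivative g^(1) of the estimated link function
    g1 = (nw_fit(y, z, z + eps, h) - nw_fit(y, z, z - eps, h)) / (2 * eps)
    # E(X_i | X_i'beta), estimated column by column
    EX = np.column_stack([nw_fit(X[:, k], z, z, h) for k in range(X.shape[1])])
    Xstar = g1[:, None] * (X - EX)   # X_i* = g^(1)(X_i'b)(X_i - E(X_i | X_i'b))
    Xs = Xstar[:, 1:]                # drop the normalized coefficient
    e = y - g
    Q = Xs.T @ Xs / n
    Omega = Xs.T @ (Xs * (e**2)[:, None]) / n
    Qinv = np.linalg.inv(Q)
    return Qinv @ Omega @ Qinv
```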
This asymptotic distribution is slightly different than the one which would be obtained if the function g were known a priori. In that case, the asymptotic design depends on $X_i$, not on $X_i - E\left(X_i \mid X_i'\beta_0\right)$:
$$Q = E\left(g^{(1)}(X_i'\beta_0)^2\, X_i X_i'\right).$$
Now consider the binary choice model
$$y_i = 1\left(X_i'\beta \geq e_i\right)$$
where $e_i$ is an error. If $e_i$ is independent of $X_i$ and has distribution function g, then the data satisfy the single-index regression
$$E(y \mid x) = g(x'\beta).$$
This is the model studied by Klein and Spady (Econometrica, 1993).
If g were known, $\beta$ could be estimated by maximum likelihood. The log-likelihood would be
$$L_n(\beta; g) = \sum_{i=1}^{n}\left[y_i \ln g(X_i'\beta) + (1 - y_i)\ln\left(1 - g(X_i'\beta)\right)\right].$$
This is analogous to the sum-of-squared-errors function $S_n(\beta; g)$ for the semiparametric regression model.
Similarly to Ichimura, Klein and Spady suggest replacing g with the leave-one-out NW estimator
$$\hat g_{-i}(X_i'\beta) = \frac{\sum_{j \neq i} k\!\left(\frac{(X_j - X_i)'\beta}{h}\right) y_j}{\sum_{j \neq i} k\!\left(\frac{(X_j - X_i)'\beta}{h}\right)}.$$
Making this substitution and adding a trimming function leads to the feasible likelihood criterion
$$L_n(\beta) = \sum_{i=1}^{n}\left[y_i \ln \hat g_{-i}(X_i'\beta) + (1 - y_i)\ln\left(1 - \hat g_{-i}(X_i'\beta)\right)\right] 1_i(b).$$
Klein and Spady emphasize that the trimming indicator should not be a function of $\beta$, but instead of a preliminary estimator. They suggest a trimming indicator of the form
$$1_i(b) = 1\left(\hat f(X_i'\tilde\beta) \geq b\right),$$
where $\tilde\beta$ is a preliminary estimator of $\beta$, and $\hat f$ is an estimate of the density of $X_i'\tilde\beta$. Klein and Spady observe that trimming does not seem to matter in their simulations.
The Klein-Spady estimator for $\beta$ is the value $\hat\beta$ which maximizes $L_n(\beta)$.
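A sketch of the Klein-Spady criterion along the same lines (reusing loo_nw from the sketch above; the clipping of $\hat g$ away from 0 and 1 is a numerical safeguard of mine, not part of the original proposal):

```python
import numpy as np
from scipy.optimize import minimize

def klein_spady_loglik(theta, y, X, h, trim):
    beta = np.concatenate(([1.0], theta))  # first coefficient normalized to one
    p = loo_nw(y, X, beta, h)              # leave-one-out NW estimate of g(X_i'beta)
    p = np.clip(p, 1e-6, 1 - 1e-6)         # keep the log-likelihood finite
    return np.sum(trim * (y * np.log(p) + (1 - y) * np.log(1 - p)))

def klein_spady_estimate(y, X, h, trim=None, theta0=None):
    n, q = X.shape
    trim = np.ones(n) if trim is None else trim
    theta0 = np.zeros(q - 1) if theta0 is None else theta0
    out = minimize(lambda th: -klein_spady_loglik(th, y, X, h, trim),
                   theta0, method="Nelder-Mead")
    return np.concatenate(([1.0], out.x))
```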
In many respects the Ichimura and Klein-Spady estimators are quite similar.
Unlike Ichimura, Klein and Spady impose the assumption that the kernel k must be fourth-order (i.e., bias-reducing). They also impose that the bandwidth h satisfy the rate $n^{-1/6} < h < n^{-1/8}$, which is smaller than the optimal $n^{-1/9}$ rate for a fourth-order kernel. It is unclear to me if these are merely technical sufficient conditions, or if there is a substantive difference with the semiparametric regression case.
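For concreteness, one standard fourth-order (bias-reducing) kernel built from the Gaussian, which would satisfy this requirement (this particular construction is a common textbook choice, not one specified by Klein and Spady):

```python
import numpy as np

def fourth_order_gaussian_kernel(u):
    # Integrates to one, has zero second moment, and a non-zero fourth moment.
    phi = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return 0.5 * (3.0 - u**2) * phi
```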
Klein and Spady also have no discussion of how to select the bandwidth. Following the ideas of Hardle, Hall and Ichimura, it seems sensible that it could be selected jointly with $\beta$ by maximization of $L_n(\beta)$, but this is just a conjecture.
They establish the asymptotic distribution of their estimator. As in Ichimura, letting g denote the distribution of $e_i$, define the function
$$G(X_i'\beta) = E\left(y_i \mid X_i'\beta\right).$$
Then
$$\sqrt{n}\left(\hat\beta - \beta_0\right) \to_d N\left(0, H^{-1}\right)$$
$$H = E\left(\frac{\partial}{\partial\beta} G(X_i'\beta_0)\, \frac{\partial}{\partial\beta'} G(X_i'\beta_0)\, \frac{1}{g(X_i'\beta_0)\left(1 - g(X_i'\beta_0)\right)}\right).$$
They are not specific about the derivative component, but if I understand it correctly it is the same as in Ichimura, so
$$\frac{\partial}{\partial\beta} G(X_i'\beta)\Big|_{\beta = \beta_0} \simeq g^{(1)}(X_i'\beta_0)\left(X_i - E\left(X_i \mid X_i'\beta_0\right)\right).$$
The Klein-Spady estimator achieves the semiparametric efficiency bound for the single-index binary choice model.
Thus in the context of binary choice, it is preferable to use Klein-Spady over Ichimura. Ichimura's LS estimator is inefficient (as the regression model is heteroskedastic), and it is much easier and cleaner to use the Klein-Spady estimator than a two-step weighted LS estimator.
Consider next the nonparametric regression $E(y \mid x) = \mu(x)$ and the weighted average derivative
$$\delta = E\left(\mu^{(1)}(X)\, w(X)\right),$$
where $\mu^{(1)}(x) = \frac{\partial}{\partial x}\mu(x)$ and $w(x)$ is a weight function. It is particularly convenient to set $w(x) = f(x)$, the marginal density of X. Thus Powell, Stock and Stoker (Econometrica, 1989) define this as the (density-weighted) average derivative
$$\delta = E\left(\mu^{(1)}(X) f(X)\right).$$
This is a measure of the average effect of X on y. It is a simple vector, and therefore easier to report than a full nonparametric estimator.
There is a connection with the single index model, where
$$\mu(x) = g(x'\beta),$$
for then
$$\mu^{(1)}(x) = g^{(1)}(x'\beta)\beta$$
and
$$\delta = c\beta,$$
where
$$c = E\left(g^{(1)}(X'\beta) f(X)\right).$$
Since $\beta$ is identified only up to scale, the constant c doesn't matter. That is, a (normalized) estimate of $\delta$ is an estimate of normalized $\beta$.
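For example (with numbers of my own choosing), if $\hat\delta = (0.4, -0.2)'$, dividing through by the first element gives $(1, -0.5)'$ as the estimate of $\beta$ under the $\beta_1 = 1$ normalization, while dividing by $\|\hat\delta\|$ gives the unit-length normalization.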
PSS observe that by integration by parts
$$\delta = E\left(\mu^{(1)}(X) f(X)\right) = \int \mu^{(1)}(x) f(x)^2\, dx = -2\int \mu(x) f(x) f^{(1)}(x)\, dx = -2E\left(y f^{(1)}(X)\right).$$
This suggests the estimator
$$\hat\delta = -\frac{2}{n}\sum_{i=1}^{n} y_i \hat f_{-i}^{(1)}(X_i),$$
where $\hat f_{-i}(X_i)$ is the leave-one-out kernel density estimator and $\hat f_{-i}^{(1)}(X_i)$ is its first derivative.
This is a convenient estimator: there is no denominator to complicate uniform convergence, and only a density estimator is needed, not a conditional mean.
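A sketch of the resulting estimator with a product Gaussian kernel (the implementation details, including the common scalar bandwidth, are choices of mine):

```python
import numpy as np

def pss_average_derivative(y, X, h):
    # delta_hat = -(2/n) * sum_i y_i * fhat'_{-i}(X_i), where fhat_{-i} is a
    # leave-one-out kernel density estimate with a product Gaussian kernel.
    n, q = X.shape
    U = (X[None, :, :] - X[:, None, :]) / h          # U[i, j, :] = (X_j - X_i) / h
    K = np.exp(-0.5 * np.sum(U**2, axis=2)) / (2 * np.pi)**(q / 2)
    idx = np.arange(n)
    K[idx, idx] = 0.0                                # leave out observation i
    # gradient of the leave-one-out density estimate at X_i
    grad_f = (U * K[:, :, None]).sum(axis=1) / ((n - 1) * h**(q + 1))
    return -2.0 * (y[:, None] * grad_f).mean(axis=0)
```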
PSS show that $\hat\delta$ is $\sqrt{n}$-consistent and asymptotically normal, with a convenient covariance matrix. The asymptotic bias is a bit complicated.
Let $q = \dim(X)$. Set $p = (q+4)/2$ if q is even and $p = (q+3)/2$ if q is odd; e.g. $p = 2$ for $q = 1$, $p = 3$ for $q = 2$ or $q = 3$, and $p = 4$ for $q = 4$.
PSS require that the kernel used to estimate f be of order at least p: thus a second-order kernel for $q = 1$, and a fourth-order kernel for $q = 2$, 3, or 4.
PSS then show that the asymptotic bias satisfies
$$n^{1/2} E\left(\hat\delta - \delta\right) = O\left(n^{1/2} h^p\right),$$
which is $o(1)$ if the bandwidth is selected so that $n h^{2p} \to 0$. This is violated (h is too big) if h is selected to be optimal for estimation of $f$ or $f^{(1)}$; the requirement forces the bandwidth to undersmooth in order to reduce the bias. This type of result is commonly seen in semiparametric methods. Unfortunately, it does not lead to a practical rule for bandwidth selection.
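As an illustration (a worked example of mine, not from PSS): with $q = 1$ and $p = 2$, the MSE-optimal bandwidth for estimating $f$ is of order $n^{-1/5}$, which gives $n h^{2p} = n \cdot n^{-4/5} = n^{1/5} \to \infty$, violating the condition, while an undersmoothed choice such as $h \propto n^{-1/3}$ gives $n h^{4} = n^{-1/3} \to 0$.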