Biases in Dynamic Models with Fixed Effects

Author(s): Stephen Nickell


Source: Econometrica , Nov., 1981, Vol. 49, No. 6 (Nov., 1981), pp. 1417-1426
Published by: The Econometric Society

Stable URL: https://www.jstor.org/stable/1911408


This content downloaded from


193.0.101.216 on Wed, 30 Aug 2023 18:52:43 +00:00
All use subject to https://about.jstor.org/terms
Econometrica, Vol. 49, No. 6 (November, 1981)

BIASES IN DYNAMIC MODELS WITH FIXED EFFECTS

BY STEPHEN NICKELL1

It is well known from the Monte-Carlo work of Nerlove that using the standard
within-group estimator for dynamic models with fixed individual effects generates esti-
mates which are inconsistent as the number of "individuals" tends to infinity if the number
of time periods is kept fixed. In this paper we present analytical expressions for these
inconsistencies for the first order autoregressive case.

INTRODUCTION

SINCE THE PIONEERING WORK of Balestra and Nerlove [2] on the demand for
natural gas, economists have made extensive use of panel data in the elucidation
of economic relationships. In this work it has been typically assumed that the
error term corresponding to the ith individual in the tth time period, $u_{it}$, is made
up of three components, one individual specific, one time specific, and a
remainder which is both time and individual specific. Thus we have

(1) $u_{it} = f_i + l_t + \epsilon_{it}$


where the three components are often assumed to be uncorrelated with each
other and, indeed, with the included variables in the equation. The fundamental
question was generally considered to be, in the words of Nerlove, "whether or
not to treat $f_i$ and $l_t$ as parameters or as random variables."2 This is particularly
important in the case of $f_i$ because the typical panel has vastly more individuals
than time periods and treating the $f_i$ as parameters introduces an enormous
number of additional parameters into the model compared with the alternative in
which the $f_i$ are usually considered as being drawn from a distribution with but a
single unknown parameter. The advantages of this latter so-called random effects
model over the alternative fixed effects model are thus manifest, particularly
when it is realized that the fixed effects model implies that one is ruling out of
order all the information that may be gleaned by directly comparing one
individual with another.
However, in recent years, the error components model has been looked at from
a slightly different viewpoint by some researchers who view the individual effects,
$f_i$, as relevant but unobserved characteristics which are highly likely to be
correlated with the observed exogenous variables in the model. Thus, for exam-
ple, in Ashenfelter's study of the effect of training programs on earnings the
individual effects are talked of as capturing "such factors as ability, motivation
or other previous investments in human capital"3 which are clearly thought of as

1I should like to thank Jim Heckman for encouraging me to write this paper and John Ham,
David Hendry, two referees of Econometrica, and members of the econometrics group at the London
School of Economics for their useful comments on an earlier draft. Financial support was provided
by the Industrial Relations Section, Princeton University, and the Social Science Research Council.
2See Nerlove [13, p. 361].
3See Ashenfelter [1, p. 49].

being correlated with the extent to which the individual participates in training
programs. Hausman [4] takes a similar view in his brief discussion of wage
equations in which he finds strong evidence that the individual effects are
correlated with the observed exogenous variables and uncompromisingly rejects
the uncorrelated random effects model. This is a very important point because if
one takes the view that, in any particular model, the individual effects are likely
to be correlated with all the observed exogenous variables, then one is led
inexorably to the fixed effects model.4 This will enable one to obtain coefficients
on the exogenous variables which do not suffer from bias due to the omission of
relevant individual attributes. Indeed, Mundlak [11] and Chamberlain [3] have
shown, in the context of linear regression with strictly exogenous regressors, that
the random effects model leads to the same estimators as the fixed effects model
in situations where the individual effects are correlated with the exogenous
variables and thus, in these hardly unusual circumstances, the fixed effects model
assumes paramount importance.5
Unfortunately, as the Monte-Carlo work of Nerlove [12, 13] makes clear, the
fixed effects model suffers from an important drawback. Standard methods of
estimation are liable to lead to seriously biased coefficients in dynamic models. A
typical set of panel data has a rather large number of individuals and a rather
small number of time periods and it is in just these circumstances that the biases,
which are essentially of the Hurwicz type, are most serious.6 The fact that they
will go to zero when the number of time periods becomes infinite is scant
consolation. It is the purpose of this paper to investigate these biases analytically
for the first-order autoregressive case. Two models will be considered. These are,
omitting the time effects for simplicity of exposition,

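(2) $y_{it} = \delta + \rho y_{it-1} + \sum_j \beta_j x_{ijt} + f_i + \epsilon_{it}$ $(i = 1, \ldots, N;\ t = 1, \ldots, T)$,

and

(3) $y_{it} = \delta + \sum_j \beta_j x_{ijt} + f_i + u_{it}$ $(i = 1, \ldots, N;\ t = 1, \ldots, T)$,

(4) $u_{it} = \rho u_{it-1} + \epsilon_{it}$,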

4A superior alternative to the fixed effects or within-groups estimator is available if one is
prepared to assert, a priori, that some of the included exogenous variables are not correlated with the
individual effects. This is discussed in Hausman and Taylor [5].
5It is, of course, always open to the investigator to take a random effects model and specify a joint
distribution for the random effects and the included variables and then to integrate the former out of
the likelihood function. This general procedure is discussed in Chamberlain [3] and an illustration of
problems it can cause in a particular context is presented in Lancaster and Nickell [8]. The basic
difficulty is, of course, that the estimates obtained often depend crucially on the distributional
assumptions made about the individual effects, something on which economic theory has little to say.
Unfortunately, as soon as one moves outside the framework of linear regression with strictly
exogenous regressors, the fixed effects model (with its distribution-free advantages) generates incon-
sistent estimates for fixed T. Heckman [6] presents some Monte Carlo estimates on the size of these
biases in some simple probit models.
6It is important to recognize that the Hurwicz type bias may be serious in any dynamic model
estimated using a short time series.


$f_i$ are fixed parameters, $\epsilon_{it}$ are $IN(0, \sigma^2)$, a [...]
ergodic. If we let $E_i$ represent the expectation of a random variable taken over
the individuals for a fixed time period, the above ensures that $E_i \epsilon_{it} = 0$ and we
assume that $E_i \epsilon_{it} f_i = 0$.
Rearranging (3) and (4) gives

(5) $y_{it} = \delta(1 - \rho) + \rho y_{it-1} + \sum_j \beta_j (x_{ijt} - \rho x_{ijt-1}) + f_i(1 - \rho) + \epsilon_{it}$.
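A short worked step, using only (3) and (4), shows where (5) comes from: lag (3) one period, multiply it by $\rho$, subtract the result from (3), and then substitute (4) for the remaining error term.

```latex
\begin{aligned}
y_{it} - \rho y_{it-1}
  &= \delta(1-\rho) + \sum_j \beta_j (x_{ijt} - \rho x_{ijt-1}) + f_i(1-\rho) + (u_{it} - \rho u_{it-1}) \\
  &= \delta(1-\rho) + \sum_j \beta_j (x_{ijt} - \rho x_{ijt-1}) + f_i(1-\rho) + \epsilon_{it}.
\end{aligned}
```

Moving $\rho y_{it-1}$ to the right-hand side gives (5).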

Concentrating our exposition on the lagged dependent variable model (2), the
standard estimation procedure is to start by eliminating the fixed effects $f_i$. This
may be done in any number of ways but the standard technique is to subtract the
time mean of (2) from (2) itself to yield

(6) $y_{it} - y_{i.} = \rho(y_{it-1} - y_{i.-1}) + \sum_j \beta_j (x_{ijt} - x_{ij.}) + (\epsilon_{it} - \epsilon_{i.})$

where for any variable $z_{it}$, $z_{i.} = (1/T)\sum_{t=1}^{T} z_{it}$ and $z_{i.-1} = (1/T)\sum_{t=1}^{T} z_{it-1}$. It is
clear that OLS estimates based on (6) will be biased even if N, the number of
individuals, goes to infinity. This arises because in these circumstances the
correlation between $y_{it-1}$ and $\epsilon_{i.}$, for example, does not go to zero. The
remainder of the paper is devoted to an analysis of these biases and is set out as
follows. In the next section we shall compute the bias as $N \to \infty$ in the model
with no exogenous variables and we shall then look at the effect of including
exogenous variables on these results. In subsequent sections we compare our
analytical computations with some of the extensive Monte-Carlo results pre-
sented in Nerlove [12, 13] and Maddala [9].
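The inconsistency of OLS on the within-transformed equation can be illustrated by simulation. The following is a minimal sketch, not the design used in any of the Monte Carlo studies cited: it assumes no exogenous variables, $\delta = 0$, normally distributed fixed effects, and an initial observation drawn from the stationary distribution. The function name and all parameter values are hypothetical choices made for illustration.

```python
import random


def simulate_within_estimate(rho, T, N, sigma=1.0, sigma_f=1.0, seed=0):
    """Within-group OLS estimate of rho from a simulated stationary AR(1) panel."""
    rng = random.Random(seed)
    num = 0.0  # sum of (y_{it-1} - ybar_{i,-1}) * (y_it - ybar_i)
    den = 0.0  # sum of (y_{it-1} - ybar_{i,-1})^2
    for _ in range(N):
        f = rng.gauss(0.0, sigma_f)  # individual effect (assumed normal here)
        # Start y_i0 from the stationary distribution: mean f/(1-rho),
        # variance sigma^2/(1-rho^2).
        y = [f / (1.0 - rho) + rng.gauss(0.0, sigma / (1.0 - rho ** 2) ** 0.5)]
        for _ in range(T):
            y.append(rho * y[-1] + f + rng.gauss(0.0, sigma))
        cur, lag = y[1:], y[:-1]  # y_it and y_{it-1} for t = 1..T
        cur_bar, lag_bar = sum(cur) / T, sum(lag) / T
        for c, l in zip(cur, lag):
            num += (l - lag_bar) * (c - cur_bar)
            den += (l - lag_bar) ** 2
    return num / den


rho_hat = simulate_within_estimate(rho=0.5, T=5, N=20000)
print(f"true rho = 0.5, within-group estimate = {rho_hat:.3f}")
```

With a large N and a small T the estimate stays well below the true value, which is the behaviour the following sections quantify analytically.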

1. BIASES IN AUTOREGRESSIVE MODEL WITH FIXED EFFECTS

If we remove the exogenous variables from (6), we have

(7) $y_{it} - y_{i.} = \rho(y_{it-1} - y_{i.-1}) + (\epsilon_{it} - \epsilon_{i.})$ $(i = 1, \ldots, N;\ t = 1, \ldots, T)$.

Note that this equation follows from both the lagged dependent variable and the
residual autoregression model. In order to estimate $\rho$ we have a number of
options. The standard method is to use OLS on (7) pooling all the cross-sections.
It is perfectly legitimate, however, to use directly only one cross-section in
estimating (7) although the data on all the others have, of course, already been
used to compute the time means. If we use the tth cross-section we may define an
OLS estimate

(8) $\hat\rho_t = \sum_{i=1}^{N} (y_{it-1} - y_{i.-1})(y_{it} - y_{i.}) \Big/ \sum_{i=1}^{N} (y_{it-1} - y_{i.-1})^2$.

We shall now compute the asymptotic bias or inconsistency by taking probability


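limits as $N \to \infty$. Thus we have, using (7),

(9) $\operatorname{plim}_{N \to \infty} \hat\rho_t = \rho + \dfrac{\operatorname{plim}_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (y_{it-1} - y_{i.-1})(\epsilon_{it} - \epsilon_{i.})}{\operatorname{plim}_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (y_{it-1} - y_{i.-1})^2}$

or

(10) $\operatorname{plim}_{N \to \infty} (\hat\rho_t - \rho) = A_t / B_t$, say.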

To avoid continual use of more complicated terminology we shall, in future,
generally refer to anything of the form $\operatorname{plim}_{N \to \infty}(\hat\rho - \rho)$ as a bias. Since the $\epsilon_{it}$ are
random drawings from a normal distribution when t is fixed we can replace7
plims as $N \to \infty$ by expectations across i, $E_i$, and thus obtain

$A_t = E_i (y_{it-1} - y_{i.-1})(\epsilon_{it} - \epsilon_{i.})$

or

(11) $A_t = -E_i y_{it-1}\epsilon_{i.} - E_i y_{i.-1}\epsilon_{it} + E_i y_{i.-1}\epsilon_{i.}$,

noting that $E_i y_{it-1}\epsilon_{it} = 0$. Before proceeding it is worth pointing out that station-
arity implies the following result. Removing the exogenous variables and time
effects, (2) implies

(12) $y_{it} = (\delta + f_i)/(1 - \rho) + \sum_{j=0}^{\infty} \rho^j \epsilon_{it-j}$.

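Then (11) and (12) imply

$A_t = -E_i \Big(\sum_{j=0}^{\infty} \rho^j \epsilon_{it-1-j}\Big)\Big(\frac{1}{T}\sum_{s=1}^{T} \epsilon_{is}\Big) - E_i \Big(\frac{1}{T}\sum_{s=1}^{T}\sum_{j=0}^{\infty} \rho^j \epsilon_{is-1-j}\Big)\epsilon_{it} + E_i \Big(\frac{1}{T}\sum_{s=1}^{T}\sum_{j=0}^{\infty} \rho^j \epsilon_{is-1-j}\Big)\Big(\frac{1}{T}\sum_{r=1}^{T} \epsilon_{ir}\Big)$

where we have used $E_i \epsilon_{it} f_i = 0$. Since the $\epsilon_{it}$ are serially independent with variance $\sigma^2$
independent of t, we have after some manipulation

$A_t = -\frac{\sigma^2}{T}\frac{(1 - \rho^{t-1})}{(1 - \rho)} - \frac{\sigma^2}{T}\frac{(1 - \rho^{T-t})}{(1 - \rho)} + \frac{\sigma^2}{T(1 - \rho)}\Big\{1 - \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\}$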

where the three terms correspond to those in the previous expression. Collecting
terms then yields

(13) $A_t = -\frac{\sigma^2}{T(1 - \rho)}\Big\{1 - \rho^{t-1} - \rho^{T-t} + \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\}$.

7Our analysis of the bias does not depend on the normality of $\epsilon_{it}$. So long as the $\epsilon_{it}$ are identically
and independently distributed for each t, then the results will go through.


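Proceeding to the denominator of the bias, $B_t$, we have

$B_t = E_i \Big(\sum_{j=0}^{\infty} \rho^j \epsilon_{it-1-j} - \frac{1}{T}\sum_{s=1}^{T}\sum_{j=0}^{\infty} \rho^j \epsilon_{is-1-j}\Big)^2$

$= E_i \Big(\sum_{j=0}^{\infty} \rho^j \epsilon_{it-1-j}\Big)^2 - \frac{2}{T} E_i \Big(\sum_{j=0}^{\infty} \rho^j \epsilon_{it-1-j}\Big)\Big(\sum_{s=1}^{T}\sum_{j=0}^{\infty} \rho^j \epsilon_{is-1-j}\Big) + \frac{1}{T^2} E_i \Big(\sum_{s=1}^{T}\sum_{j=0}^{\infty} \rho^j \epsilon_{is-1-j}\Big)^2$

$= \frac{\sigma^2}{1 - \rho^2} - \frac{2\sigma^2}{T(1 - \rho^2)}\Big\{\frac{1 - \rho^t}{1 - \rho} + \rho\,\frac{1 - \rho^{T-t}}{1 - \rho}\Big\} + \frac{\sigma^2}{T(1 - \rho)^2}\Big\{1 - \frac{1}{T}\frac{2\rho(1 - \rho^T)}{(1 - \rho^2)}\Big\}$

where again the three terms in the final expression correspond to those in the
previous line. Collecting terms and using (13) we obtain

(14) $B_t = \frac{\sigma^2}{1 - \rho^2}\Big\{1 - \frac{1}{T} + \frac{2\rho}{\sigma^2} A_t\Big\}$.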
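The closed forms (13) and (14) can be checked against a term-by-term summation of the underlying covariances. The sketch below is only a numerical cross-check under the stationarity assumption behind (12); the function names are hypothetical, and $w_a = \sum_j \rho^j \epsilon_{a-j}$ denotes the stochastic part of $y_{ia}$.

```python
def A_closed(rho, T, t, sigma2=1.0):
    """A_t from equation (13)."""
    h = 1 - rho ** (t - 1) - rho ** (T - t) + (1 - rho ** T) / (T * (1 - rho))
    return -sigma2 / (T * (1 - rho)) * h


def B_closed(rho, T, t, sigma2=1.0):
    """B_t from equation (14), written in terms of A_t."""
    a_t = A_closed(rho, T, t, sigma2)
    return sigma2 / (1 - rho ** 2) * (1 - 1 / T + 2 * rho * a_t / sigma2)


def weights(T, t):
    """Coefficients on z_1..z_T in the deviation z_t - (1/T) * sum_s z_s."""
    return [(1.0 if s == t else 0.0) - 1.0 / T for s in range(1, T + 1)]


def A_direct(rho, T, t, sigma2=1.0):
    """E (w_{t-1} - wbar_{-1}) (eps_t - epsbar), summed covariance by covariance."""
    c = weights(T, t)
    total = 0.0
    for s in range(1, T + 1):      # lagged component w_{s-1}
        for r in range(1, T + 1):  # disturbance eps_r
            if r <= s - 1:         # eps_r enters w_{s-1} only when r <= s - 1
                total += c[s - 1] * c[r - 1] * sigma2 * rho ** (s - 1 - r)
    return total


def B_direct(rho, T, t, sigma2=1.0):
    """E (w_{t-1} - wbar_{-1})^2 from the stationary AR(1) autocovariances."""
    c = weights(T, t)
    gamma0 = sigma2 / (1 - rho ** 2)
    return sum(c[s] * c[r] * gamma0 * rho ** abs(s - r)
               for s in range(T) for r in range(T))


max_gap = 0.0
for rho in (0.1, 0.5, 0.9):
    for T in (2, 5, 10):
        for t in range(1, T + 1):
            max_gap = max(max_gap,
                          abs(A_closed(rho, T, t) - A_direct(rho, T, t)),
                          abs(B_closed(rho, T, t) - B_direct(rho, T, t)))
print(f"largest discrepancy between closed forms and direct sums: {max_gap:.2e}")
```

Any discrepancy reported should be at the level of floating-point rounding.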

The final value for the bias is thus given by

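(15) $\operatorname{plim}_{N \to \infty}(\hat\rho_t - \rho) = -\frac{(1 + \rho)}{T}\Big\{1 - \rho^{t-1} - \rho^{T-t} + \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\} \times \Big\{1 - \frac{1}{T} - \frac{2\rho}{T(1 - \rho)}\Big[1 - \rho^{t-1} - \rho^{T-t} + \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big]\Big\}^{-1}$

or

(16) $\operatorname{plim}_{N \to \infty}(\hat\rho_t - \rho) = -\frac{(1 + \rho)}{T - 1}\Big\{1 - \rho^{t-1} - \rho^{T-t} + \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\} \times \Big\{1 - \frac{2\rho}{(T - 1)(1 - \rho)}\Big[1 - \rho^{t-1} - \rho^{T-t} + \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big]\Big\}^{-1}$.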
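How the single cross-section bias in (16) varies with t can be tabulated directly. This is a minimal sketch for one illustrative parameter pair ($\rho = 0.5$, $T = 10$, chosen here and not taken from the paper); the function names are hypothetical.

```python
def h_term(rho, T, t):
    """The bracketed term shared by (13), (15) and (16)."""
    return 1 - rho ** (t - 1) - rho ** (T - t) + (1 - rho ** T) / (T * (1 - rho))


def bias_t(rho, T, t):
    """Inconsistency of the estimate based on the t-th cross-section, equation (16)."""
    h = h_term(rho, T, t)
    return -(1 + rho) / (T - 1) * h / (1 - 2 * rho * h / ((T - 1) * (1 - rho)))


rho, T = 0.5, 10
biases = [bias_t(rho, T, t) for t in range(1, T + 1)]
for t, b in enumerate(biases, start=1):
    print(f"t = {t:2d}   bias = {b:+.4f}")
```

The printed values are symmetric about the middle of the sample, since t enters only through $\rho^{t-1} + \rho^{T-t}$.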

The first of these expansions is computationally simpler but the second reveals
clearly that the inconsistency is O(1/T). There are a number of interesting
points about this bias. First, if $\rho > 0$, it is invariably negative since $A_t < 0$.
Second, the bias depends on t and hence varies with the cross-section which is
used to generate the estimate. Indeed, (15) clearly reveals that the bias will be
smaller if we use cross-sections at the ends of the sample period and it will be
largest if we use the middle one. More relevant for practical purposes, however, is
the bias if we generate our estimate, $\hat\rho$, using the whole sample. Thus we have

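$\hat\rho = \sum_{t=1}^{T}\sum_{i=1}^{N} (y_{it-1} - y_{i.-1})(y_{it} - y_{i.}) \Big/ \sum_{t=1}^{T}\sum_{i=1}^{N} (y_{it-1} - y_{i.-1})^2$

and in our standard notation the bias is given by

$\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = \sum_{t=1}^{T} A_t \Big/ \sum_{t=1}^{T} B_t$

which yields

(17) $\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = -\frac{(1 + \rho)}{T}\Big\{1 - \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\} \times \Big\{1 - \frac{1}{T} - \frac{2\rho}{T(1 - \rho)}\Big[1 - \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big]\Big\}^{-1}$

or

(18) $\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = -\frac{(1 + \rho)}{T - 1}\Big\{1 - \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\} \times \Big\{1 - \frac{2\rho}{(1 - \rho)(T - 1)}\Big[1 - \frac{(1 - \rho^T)}{T(1 - \rho)}\Big]\Big\}^{-1}$.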
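As a cross-check on the algebra, the pooled bias can be computed in two ways: from the closed form (18), and as the ratio $\sum_t A_t / \sum_t B_t$ built from (13) and (14). The sketch below sets $\sigma^2 = 1$ (it cancels in the ratio) and uses hypothetical function names.

```python
def nickell_bias(rho, T):
    """plim (rho_hat - rho) as N -> infinity for the pooled estimator, equation (18)."""
    g = 1 - (1 - rho ** T) / (T * (1 - rho))
    return -(1 + rho) / (T - 1) * g / (1 - 2 * rho * g / ((1 - rho) * (T - 1)))


def ratio_of_sums(rho, T):
    """The same quantity as sum_t A_t / sum_t B_t, using (13) and (14) with sigma^2 = 1."""
    def A(t):
        h = 1 - rho ** (t - 1) - rho ** (T - t) + (1 - rho ** T) / (T * (1 - rho))
        return -h / (T * (1 - rho))

    def B(t):
        return (1 - 1 / T + 2 * rho * A(t)) / (1 - rho ** 2)

    periods = range(1, T + 1)
    return sum(A(t) for t in periods) / sum(B(t) for t in periods)


for rho in (0.1, 0.5, 0.9):
    for T in (2, 3, 10, 25):
        print(f"rho = {rho}, T = {T:2d}: "
              f"(18) = {nickell_bias(rho, T):+.5f}, "
              f"sum A / sum B = {ratio_of_sums(rho, T):+.5f}")
```

The two columns should agree to rounding error for every parameter pair.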

Furthermore, for reasonably large values of T we have the simple approximation

(19) $\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) \simeq \frac{-(1 + \rho)}{T - 1}$.

On the other hand for small values of T we have

$\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = \frac{-(1 + \rho)}{2}$ for $T = 2$,

$\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = \frac{-(2 + \rho)(1 + \rho)}{2(3 + \rho)}$ for $T = 3$,

with the latter confirming the result in Chamberlain [3, p. 228]. These results are
of considerable interest. Apart from the fact that the bias is always negative if
$\rho > 0$, we can see how large it is if T is small. Even with T = 10, which is the
order of magnitude of most sets of panel data, if $\rho = 0.5$ then the bias is -0.167


which can hardly be ignored. Furthermore, the bias does not go to zero as $\rho$ goes
to zero. However, these biases are not as severe as the standard Hurwicz biases
associated with first-order autoregressive processes with a constant term. In this
case, at least to order 1/T, the bias is $-(1 + 3\rho)/(T - 1)$ which is larger than
those considered above. This approximation and a large number of related
results may be found in Marriott and Pope [10] and Kendall [7]. The standard
Hurwicz bias differs from that given in (17) basically because if we let
$A = \sum_t (y_{it-1} - y_{i.-1})(\epsilon_{it} - \epsilon_{i.})$ and $B = \sum_t (y_{it-1} - y_{i.-1})^2$, the Hurwicz bias in
the standard regression (with one value of i) is given by E(A/B). We, of course,
are computing the bias as $N \to \infty$ and are thus considering E(A)/E(B) where
expectations are all taken across i. These expressions are related in the sense that
the approximation to the standard Hurwicz bias to order $T^{-1}$ is given by

$E(A/B) = \frac{E(A)}{E(B)}\Big\{1 - \frac{\operatorname{cov}(A, B)}{E(A)E(B)} + \frac{\operatorname{var}(B)}{E^2(B)}\Big\}$

which yields the formula cited above. It is, of course, the second and third terms
which make the standard bias bigger. Nevertheless it usually troubles us less
because the typical time series is very much longer than the typical panel. Further-
more, when we introduce exogenous variables the situation gets worse as we shall
see in the next section.
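The magnitudes discussed in this section can be compared side by side. The sketch below evaluates the exact expression (18), the large-T approximation (19), and the order-1/T Hurwicz-type approximation $-(1 + 3\rho)/(T - 1)$ quoted above. Function names are hypothetical, and the formula for (18) is repeated only so that the block stands alone. Direct substitution of T = 2 and T = 3 into (18) simplifies to $-(1 + \rho)/2$ and $-(2 + \rho)(1 + \rho)/(2(3 + \rho))$ respectively.

```python
def nickell_bias(rho, T):
    """Exact asymptotic (N -> infinity) bias of the pooled estimator, equation (18)."""
    g = 1 - (1 - rho ** T) / (T * (1 - rho))
    return -(1 + rho) / (T - 1) * g / (1 - 2 * rho * g / ((1 - rho) * (T - 1)))


def nickell_approx(rho, T):
    """Large-T approximation, equation (19)."""
    return -(1 + rho) / (T - 1)


def hurwicz_approx(rho, T):
    """Order-1/T bias for a single AR(1) series with a constant, as quoted in the text."""
    return -(1 + 3 * rho) / (T - 1)


rho = 0.5
print(" T    exact (18)   approx (19)   Hurwicz-type")
for T in (2, 3, 5, 10, 20, 50):
    print(f"{T:2d}   {nickell_bias(rho, T):+.4f}      "
          f"{nickell_approx(rho, T):+.4f}       {hurwicz_approx(rho, T):+.4f}")
```

For positive $\rho$ the Hurwicz-type column is larger in absolute value than (19) at every T, in line with the comparison made in the text.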

2. THE INCLUSION OF EXOGENOUS VARIABLES

In this section we shall concentrate on model (2) with the lagged endogenous
variable. If we define the following matrices

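$y_t = [y_{it} - y_{i.}]$, N x 1 vector,
$y_{t-1} = [y_{it-1} - y_{i.-1}]$, N x 1 vector,
$X_t = [x_{ijt} - x_{ij.}]$, N x J matrix,
$\epsilon_t = [\epsilon_{it} - \epsilon_{i.}]$, N x 1 vector,
$b = [\beta_j]$, J x 1 vector,

(6) may then be written in deviation form as

(20) $y_t = \rho y_{t-1} + X_t b + \epsilon_t$

where we have introduced J exogenous variables. To obtain the most efficient
estimates we may now stack these equations over the T time periods to obtain

(21) $\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix} = \rho \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{T-1} \end{bmatrix} + \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_T \end{bmatrix} b + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_T \end{bmatrix}$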


or

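(22) $y = \rho y_{-1} + Xb + \epsilon$,

again in obvious notation. If $\hat\rho$, $\hat b$ are the OLS estimates, using standard
procedures we obtain

(23) $\hat\rho - \rho = (y_{-1}' M y_{-1})^{-1} y_{-1}' M \epsilon$,

(24) $\hat b - b = -(X'X)^{-1}X'y_{-1}(\hat\rho - \rho) + (X'X)^{-1}X'\epsilon$,

where $M = I - X(X'X)^{-1}X'$. Taking plims as $N \to \infty$ and noting that

$\operatorname{plim}_{N \to \infty} \frac{1}{NT}\, y_{-1}' M \epsilon = \operatorname{plim}_{N \to \infty} \frac{1}{NT}\, y_{-1}' \epsilon$

since X is exogenous, we have

(25) $\operatorname{plim}_{N \to \infty}(\hat\rho - \rho) = \Big[\operatorname{plim}_{N \to \infty} \frac{1}{NT}\, y_{-1}' M y_{-1}\Big]^{-1} \operatorname{plim}_{N \to \infty} \frac{1}{NT}\, y_{-1}' \epsilon$

and

(26) $\operatorname{plim}_{N \to \infty}(\hat b - b) = -\operatorname{plim}_{N \to \infty}\big[(X'X)^{-1}X'y_{-1}\big]\, \operatorname{plim}_{N \to \infty}(\hat\rho - \rho)$.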

Note first from our previous analysis that we have already calculated
$\operatorname{plim}_{N \to \infty} (1/NT)\, y_{-1}'\epsilon$ and this is given by

(27) $\operatorname{plim}_{N \to \infty} \frac{1}{NT}\, y_{-1}'\epsilon = \frac{1}{T}\sum_{t=1}^{T} A_t = -\frac{\sigma^2}{T(1 - \rho)}\Big\{1 - \frac{1}{T}\frac{(1 - \rho^T)}{(1 - \rho)}\Big\}$.

This result remains unaltered by the introduction of exogenous variables since
their incorporation into equation (12) will have no effect given they are uncorre-
lated with the error term. When $\rho$ is positive we may derive the direction of the
biases. Since $A_t$ is negative, $\operatorname{plim}(\hat\rho - \rho)$ must be negative and will indeed be
larger (in absolute value) than if the exogenous variables are omitted since the
denominator of the expression in (25) is reduced by the inclusion of M. The bias
on b depends on the relationship between the exogenous variables and $y_{-1}$. If an
exogenous variable is positively related (in the regression sense) to $y_{-1}$, then (26)
indicates that its coefficient will be upward biased and vice-versa.
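The partitioned-regression identities (23) and (24) that underlie these results hold exactly in any sample, which can be confirmed numerically. The sketch below is purely algebraic: it uses one exogenous variable (J = 1) and arbitrary simulated vectors standing in for the stacked deviations, so it does not itself reproduce the dynamic-panel bias. All names and numerical values are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(1)
n = 500                # number of stacked observations (illustrative)
rho, b = 0.5, 0.5      # illustrative true coefficients
x = [rng.gauss(0, 1) for _ in range(n)]
y_lag = [0.4 * xi + rng.gauss(0, 1) for xi in x]  # regressor positively related to x
eps = [rng.gauss(0, 1) for _ in range(n)]
y = [rho * l + b * xi + e for l, xi, e in zip(y_lag, x, eps)]

# OLS of y on (y_lag, x) through the 2 x 2 normal equations.
a, c, d = dot(y_lag, y_lag), dot(y_lag, x), dot(x, x)
p, q = dot(y_lag, y), dot(x, y)
det = a * d - c * c
rho_hat = (d * p - c * q) / det
b_hat = (a * q - c * p) / det


def M(z):
    """Apply M = I - x (x'x)^{-1} x' to the vector z."""
    k = dot(x, z) / d
    return [zi - k * xi for zi, xi in zip(z, x)]


rhs_23 = dot(y_lag, M(eps)) / dot(y_lag, M(y_lag))      # right-hand side of (23)
rhs_24 = -(c / d) * (rho_hat - rho) + dot(x, eps) / d   # right-hand side of (24)

print(f"rho_hat - rho = {rho_hat - rho:+.6f},  (23) gives {rhs_23:+.6f}")
print(f"b_hat - b     = {b_hat - b:+.6f},  (24) gives {rhs_24:+.6f}")
print(f"y'My / y'y    = {dot(y_lag, M(y_lag)) / a:.4f}")
```

The last line shows the point made about (25): partialling out X can only shrink the quadratic form in the denominator.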
Having derived all these explicit results for the inconsistencies generated in the
dynamic fixed effects model it is obviously worth comparing them with what we
know from Monte Carlo experiments. Nerlove [12] obtains a number of results
on fixed effect first-order autoregressive models with no exogenous variables. He
concludes on page 58 that the bias in $\hat\rho$ is uniformly negative (he only considers
$\rho > 0$) as we would expect. Furthermore, his actual estimates provide most
compelling confirmation of the usefulness of our results. Using a sample size of
T = 10, N = 25, he computes a large number of estimates of $\rho$ corresponding to
true values of $\rho$ equal to 0.0, 0.1, 0.5, and 0.9. He does this for a large number of


different values of the variance of $f_i$, which does
not influence the asymptotic bias and indeed, as he himself points out, does not
affect his $\hat\rho$ estimates except when $\rho = 0.9$. For $\rho = 0.0, 0.1, 0.5$ the average
Nerlove Monte Carlo estimates of $\rho$ reported in Table C1 are -0.10083,
-0.01115, and 0.33354 respectively.8 The corresponding $\hat\rho$'s given by equation
(17) are -0.10000, -0.01108, and 0.33779. They are thus more or less exact even
though they are only asymptotic in N and Nerlove has N = 25. The results for
$\rho = 0.9$ are not quite so clear cut. Equation (17) gives $\hat\rho = 0.65677$ whereas the
Monte Carlo results only yield this value if the variance of $f_i$ is not large relative
to the total error variance. Thus the average Monte Carlo estimate of $\hat\rho$ is [...]
we only consider those experiments where the variance of $f_i$ is less than on[...]
of the total; otherwise the Monte Carlo estimates become considerably higher
and our asymptotic result is no longer accurate.
Turning to the introduction of exogenous variables we may note an experiment
by Maddala for $\rho = 0.7$ where he again has N = 25 and although he claims to
have T = 10 it appears that he only generates ten values of the dependent
variable for each i, which would imply T = 9 in our notation given the lagged
value on the right-hand side. When there is no exogenous variable he generates
$\hat\rho = 0.475$ (equation (17) yields $\hat\rho = 0.4805$) whereas the introduction of an
exogenous variable with a true coefficient equal to 0.5 reduces $\hat\rho$ to 0.3178 and
generates an estimate of $\beta$ which is strongly upward biased as we might expect
from (26).
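The asymptotic values quoted in this comparison can be recomputed from the closed form for the pooled bias. The sketch below uses the form written as (18), which is algebraically the same as (17); the function name is hypothetical, and the dictionary simply records the figures quoted in the text for T = 10.

```python
def nickell_bias(rho, T):
    """plim (rho_hat - rho) as N -> infinity, equation (18)."""
    g = 1 - (1 - rho ** T) / (T * (1 - rho))
    return -(1 + rho) / (T - 1) * g / (1 - 2 * rho * g / ((1 - rho) * (T - 1)))


# Asymptotic values quoted in the text for the Nerlove design (T = 10).
quoted_nerlove = {0.0: -0.10000, 0.1: -0.01108, 0.5: 0.33779, 0.9: 0.65677}
for rho, quoted in quoted_nerlove.items():
    plim_rho_hat = rho + nickell_bias(rho, 10)
    print(f"rho = {rho}: computed plim = {plim_rho_hat:.6f}, quoted = {quoted:.5f}")

# Maddala design: rho = 0.7 with T = 9 in the notation of the text.
maddala_plim = 0.7 + nickell_bias(0.7, 9)
print(f"rho = 0.7, T = 9: computed plim = {maddala_plim:.6f}, quoted = 0.4805")
```

The computed values agree with the quoted ones to within about one unit in the last printed decimal place.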

SUMMARY

We have presented analytical expressions for the asymptotic biases in first-
order autoregressive models estimated by OLS using panel data and including
individual fixed effects. These asymptotic biases are shown to be both large and
to coincide almost exactly with the estimates provided by the Monte Carlo
studies of Nerlove (1967) and Maddala (1971).

London School of Economics

Manuscript received January, 1980; revision received August, 1980.

8When $\rho = 0.0$, Nerlove reports a sequence of $\hat\rho$'s for various different values of var($f_i$)/(total error
variance). All the numbers in this sequence bar one lie between -0.086 and -0.115. The odd man
out is -0.010 and this has been omitted in computing the average presented in the text on the
grounds that it is probably a typographical error.

REFERENCES

[1] ASHENFELTER, O.: "Estimating the Effect of Training Programs on Earnings," Review of
Economics and Statistics, 60(1978), 47-57.
[2] BALESTRA, P., AND M. NERLOVE: "Pooling Cross Section and Time Series Data in the Estimation
of a Dynamic Model: The Demand for Natural Gas," Econometrica, 34(1966), 585-612.
[3] CHAMBERLAIN, G.: "Analysis of Covariance with Qualitative Data," Review of Economic Studies,
47(1980), 225-238.


[4] HAUSMAN, J. A.: "Specification Tests in Econometrics," Econometrica, 46(1978), 1251-1273.
[5] HAUSMAN, J. A., AND W. E. TAYLOR: "Panel Data and Unobservable Individual Effects,"
forthcoming in Econometrica.
[6] HECKMAN, J. J.: "The Incidental Parameters Problem and the Problem of Initial Conditions in
Estimating a Discrete Time-Discrete Data Stochastic Process and Some Monte Carlo Evi-
dence," in Structural Analysis of Discrete Data, ed. by D. McFadden and C. Manski.
Cambridge, Mass.: MIT Press, 1979.
[7] KENDALL, M. G.: "Note on the Bias in the Estimation of Autocorrelations," Biometrika,
41(1954), 403-404.
[8] LANCASTER, T., AND S. NICKELL: "The Analysis of Re-Employment Probabilities for the
Unemployed," Journal of the Royal Statistical Society, Series A, 143(1980), 141-165.
[9] MADDALA, G. S.: "The Use of Variance Components Models in Pooling Cross Section and Time
Series Data," Econometrica, 39 (1971), 341-358.
[10] MARRIOTT, F. H. C., AND J. A. POPE: "Bias in the Estimation of Autocorrelations," Biometrika,
41(1954), 393-403.
[11] MUNDLAK, Y.: "On the Pooling of Time Series and Cross Section Data," Econometrica,
46(1978), 69-85.
[12] NERLOVE, M.: "Experimental Evidence on the Estimation of Dynamic Economic Relations from
a Time Series of Cross-Sections," Economic Studies Quarterly, 18(1967), 42-74.
[13] NERLOVE, M.: "Further Evidence on the Estimation of Dynamic Economic Relations from a Time
Series of Cross-Sections," Econometrica, 39(1971), 359-387.
