Spatial Econometrics
Luc Anselin
University of Chicago
All content following this page was uploaded by Luc Anselin on 23 March 2021.
Luc Anselin
anselin@uchicago.edu
January 4, 2021
Summary
lagged variable, or spatial lag. The spatial lag can be applied to the de-
the spatial weights matrix has been taken to be given and exogenous,
but recent research has focused on estimating the weights from the
that is concerned with the spatial aspects present in cross-sectional and space-
checking and prediction” (Anselin 2006). The spatial aspects are referred to
with the distinct nature of spatial dependence, specifically, with the implied
Spatial heterogeneity is a special case of structural instability, a familiar
Whereas early on, apart from Anselin (1980, 1988), Cliff and Ord (1981),
and later LeSage and Pace (2009), there was a relative dearth of treatments
longer the case. In recent years, several new texts were published, providing
ample access to the breadth of the field, in terms of theoretical results, new
Rey (2014), Arbia (2014), Dubé and Legros (2014), Elhorst (2014a), Kelejian
In addition, there are several extensive reviews of the state of the art,
include Anselin and Bera (1998), Anselin (2001, 2006, 2021), Anselin et al.
(2008), Lee and Yu (2010, 2011, 2015), Elhorst (2012, 2014b), and Bai et al.
(2016).
specification tests is outside the scope (see the review articles cited above for
details). Instead, in this essay, the focus is on three important aspects that
desirable properties for estimators and specification tests. The focus is solely
observations of variables at other locations on the right hand side of the model
matrix are zero and the rows are standardized such that their sum equals
one (a detailed discussion of spatial weights is given below). The spatially
observations.
Wy, for the explanatory variables, WX, or for the error terms, We. The
ables as well as random effects gives rise to a wide range of model specifica-
tions.
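The construction of a spatial lag can be made concrete with a minimal NumPy sketch. The four-location chain, the data values, and all variable names below are illustrative assumptions and do not come from any application discussed here.

```python
import numpy as np

# Hypothetical binary contiguity for four locations on a line: 1-2-3-4.
W_binary = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-standardize so that each row sums to one (the diagonal stays zero).
W = W_binary / W_binary.sum(axis=1, keepdims=True)

y = np.array([10.0, 20.0, 30.0, 40.0])       # illustrative outcome values
X = np.array([[5.0], [6.0], [7.0], [8.0]])   # one illustrative regressor

Wy = W @ y   # spatial lag of y: average of each location's neighbors
WX = W @ X   # spatial lag of the explanatory variable

print(Wy)    # [20. 20. 30. 30.]
```

With row-standardized weights, each element of Wy is the average of the values observed at the neighbors of the corresponding location.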
Three broad classes of models are reviewed in what follows, i.e., the basic
cross-sectional model, static spatial panels, and dynamic spatial panels. The
for the error term. Specifically, using matrix notation, the unconstrained
spatial Durbin model is expressed as (LeSage and Pace 2009, Elhorst 2010):1
y = αι + ρWy + Xβ + WXγ + e,
specification, the two sets of spatial variables are the spatially lagged depen-
dent variable, Wy, and the spatially lagged explanatory variables, WX (not
order not to complicate the notation (the main effect of including additional
process:
e = λWe + u,
cratic error terms, which could be heteroskedastic.
obtained (Ord 1975, Anselin 1988), arguably the most studied spatial model:2
y = ρWy + Xβ + e.
A spatial lag model is often viewed as expressing the effect of the neigh-
bors on the right hand side of the equation through the spatial lag term Wy.
the values for those neighbors are observed, although this is not realistic in
pattern is determined jointly, since the values for the neighbors in turn de-
pend on the value at i and cannot occur independently. This is clearly seen
from the reduced form for the spatial lag model, obtained by moving the
spatial lag term to the left hand side and “solving” for y by means of the
y = (I − ρW)^{−1} Xβ + (I − ρW)^{−1} e
power expansion can be applied to the inverse term in the reduced form:3
(I − ρW)^{−1} = I + ρW + ρ^2 W^2 + ...,
E[y|X] = Xβ + ρWXβ + ρ^2 W^2 Xβ + ρ^3 W^3 Xβ + ...
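The power expansion can be checked numerically. The sketch below assumes a hypothetical row-standardized chain of four locations and ρ = 0.5; the truncation at 60 terms is arbitrary.

```python
import numpy as np

# Hypothetical row-standardized weights for four locations on a line.
W_binary = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = W_binary / W_binary.sum(axis=1, keepdims=True)

rho = 0.5                     # assumed value with |rho| < 1
n = W.shape[0]

exact = np.linalg.inv(np.eye(n) - rho * W)

# Accumulate I + rho W + rho^2 W^2 + ... term by term.
approx = np.zeros((n, n))
term = np.eye(n)              # rho^0 W^0
for _ in range(60):           # 60 terms: an arbitrary truncation point
    approx += term
    term = rho * (term @ W)   # next term rho^(k+1) W^(k+1)

max_abs_error = np.max(np.abs(exact - approx))
print(max_abs_error)
```

Because the successive terms shrink geometrically for |ρ| < 1 with row-standardized weights, the truncated sum agrees with the exact inverse up to rounding error.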
The reduced form also leads to the notion of a spatial multiplier, and the
interpretation of direct and indirect effects. The latter show the effect of a
includes not only ∂y_i/∂x_{h,i} (for a given explanatory variable x_h at i), but
also ∂y_j/∂x_{h,i}, as well as ∂y_i/∂x_{h,j}. These complex effects follow from the
reduced form and how the locations are interconnected through the elements
of the inverse matrix (I − ρW)^{−1} (see, e.g., LeSage and Pace 2009, for a
detailed discussion).
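A minimal sketch of how such effects can be summarized for the spatial lag model follows, in the spirit of the average summary measures of LeSage and Pace (2009). The function name, the weights, and the parameter values are assumptions made for illustration only.

```python
import numpy as np

def average_impacts(W, rho, beta_h):
    """Average direct, indirect and total effects of one regressor x_h.

    S[i, j] holds the partial derivative of y_i with respect to x_h at j,
    obtained from the reduced form of the spatial lag model.
    """
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta_h
    direct = np.trace(S) / n          # average own-location effect
    total = S.sum() / n               # average row sum of S
    indirect = total - direct         # average spillover effect
    return direct, indirect, total

# Hypothetical row-standardized weights for four locations on a line.
W_binary = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = W_binary / W_binary.sum(axis=1, keepdims=True)

direct, indirect, total = average_impacts(W, rho=0.5, beta_h=2.0)
print(direct, indirect, total)
```

With row-standardized weights the average total effect equals β_h/(1 − ρ), here 4, while the direct effect exceeds β_h because of feedback through the neighbors.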
with a standard non-spatial regression for the substantive model. Using the
3 See Anselin and Bera (1998) for an early example of this principle.
result of the reduced form for the error term yields:
y = Xβ + (I − λW)^{−1} u.
not constant across observations (a typical case in practice), the error term
Other forms of spatial dependence in the error term include a direct para-
metric specification of the error covariance (see, e.g., Dubin et al. 1999, for a
fication (Haining 1978, Fingleton 2008, Baltagi and Liu 2011), and factor
which would suggest a spatial error model, may contradict such a speci-
interpreted with caution as the models are not nested in the usual sense.
specification, but with the constraint γ = −ρβ in place, the spatial Durbin
contradicting λ = 0.
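The equivalence implied by this constraint can be verified numerically. The sketch below generates data from a spatial error model (without intercept) and checks that the same y satisfies a spatial Durbin equation with ρ = λ and γ = −λβ. The weights, sample size, seed, and parameter values are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical row-standardized weights for four locations on a line.
W_binary = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = W_binary / W_binary.sum(axis=1, keepdims=True)
n = W.shape[0]

lam = 0.4                              # assumed spatial error coefficient
beta = np.array([1.0, -2.0])           # assumed slopes
X = rng.normal(size=(n, 2))
u = rng.normal(size=n)

# Spatial error model: y = X beta + (I - lam W)^{-1} u.
y = X @ beta + np.linalg.solve(np.eye(n) - lam * W, u)

# Spatial Durbin form with rho = lam and the constraint gamma = -lam * beta.
gamma = -lam * beta
rhs = lam * (W @ y) + X @ beta + (W @ X) @ gamma + u

common_factor_holds = np.allclose(y, rhs)
print(common_factor_holds)   # True
```

The check amounts to premultiplying the spatial error model by (I − λW) and rearranging terms.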
SLX model (for spatially lagged X), originally suggested as a spatial cross-
has gained greater prominence since its reintroduction by Vega and Elhorst
(2015):
y = Xβ + WXγ + e.
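Since the SLX specification contains no spatially lagged dependent variable, its coefficients can be obtained by ordinary least squares when X is exogenous. The following sketch uses simulated data; the ring-shaped weights, the seed, and the parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Hypothetical ring structure: each location has two neighbors, weight 1/2 each.
W = np.zeros((n, n))
idx = np.arange(n)
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

beta_true, gamma_true = 1.5, 0.8       # assumed parameter values
x = rng.normal(size=(n, 1))
e = 0.1 * rng.normal(size=n)
y = beta_true * x[:, 0] + gamma_true * (W @ x)[:, 0] + e

# OLS on the augmented design [X, WX].
Z = np.column_stack([x, W @ x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat, gamma_hat = coef
print(beta_hat, gamma_hat)
```

The estimate of γ captures only the local spillover from the immediate neighbors, in line with the truncation of higher order effects.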
The SLX model differs from the reduced form of the spatial lag not only
in that higher order (global) effects are truncated, but also by the lack of
The SLX model has received less attention in the theoretical literature
2.2 Static Spatial Panels
When observations are available across space as well as over time, a rich
of spatial and space-time correlation (for extensive reviews, see, e.g. Anselin
et al. 2008, Elhorst 2014a, Lee and Yu 2010, 2015). In static spatial panel
ordered as cross-sections for each time period, and not as (short) time series
y_t = ρWy_t + Xβ + e_t,
subscript t to identify the time period. Stacked over all time periods, the
model becomes:
y = ρ(I_T ⊗ W_n)y + Xβ + e,
the subscripts on I and W indicate their dimensions, respectively T × T
for the time dimension and n × n for the cross-sectional dimension. For
the sake of simplicity, the spatial autoregressive coefficient ρ and the slope
through an unspecified error covariance that contains the serial (time) cor-
relation (see Anselin 1988, Anselin et al. 2008, Baltagi and Pirotte 2011)
vector of time specific effect η, the stacked spatial lag model becomes:
Due to the stacking by cross-section, this form differs from the standard
textbook expression.
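The role of the stacking order can be illustrated with a small sketch. It assumes three locations, two time periods, and made-up data, and it takes the textbook ordering to be one in which each unit's time series is stacked first.

```python
import numpy as np

n, T = 3, 2
# Hypothetical row-standardized weights: every location neighbors the other two.
W_n = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])

Y = np.array([[1.0, 2.0, 3.0],     # cross-section for period 1
              [4.0, 5.0, 6.0]])    # cross-section for period 2

# Stacked by cross-section: all n values of period 1, then period 2.
y = Y.reshape(-1)
lag_stacked = np.kron(np.eye(T), W_n) @ y          # (I_T kron W_n) y

# The same lags computed one cross-section at a time.
lag_by_period = np.concatenate([W_n @ Y[t] for t in range(T)])

# Alternative ordering (each unit's time series stacked): W_n kron I_T.
y_alt = Y.T.reshape(-1)
lag_alt = np.kron(W_n, np.eye(T)) @ y_alt

print(lag_stacked)   # [2.5 2.  1.5 5.5 5.  4.5]
```

Both orderings yield the same spatial lags once the elements are rearranged; only the position of the weights matrix in the Kronecker product changes.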
error correlation.
A first approach was outlined in Anselin (1988) (see also Baltagi et al.
2007), with the spatial autoregressive process applied to the error term in
each cross-section, as
e_t = λW_n e_t + u_t,
time periods for simplicity, but this is easily relaxed), and u_t as an n × 1 vector
error term also includes an n × 1 vector of spatial random effects µ, such that
spatial autoregressive process applied to both the spatial random effects and
It should be noted that the two models imply different spatial spillover ef-
fects. In the first model, these spillovers are inherently time-varying, whereas
specifications as special cases. They refer to the first model as the Anselin
The Baltagi et al. (2013) specification includes two error components for
non-spatial model:
y_t = X_t β + u_1 + u_{2t},
For the first component, this is time-invariant, which gives the fully stacked
specification as:
effects.
u_2 = λ_2 (I_T ⊗ W)u_2 + ν,
with ι_T as a T × 1 vector of ones. As Baltagi et al. (2013) show, setting
expression is found.
and HAC models are other ways to incorporate spatial correlation in the panel
error terms. These error terms can be combined with spatial lag or spatial
Pesaran and Tosetti 2011, Kapetanios et al. 2011, Chudik and Pesaran 2015,
strong or weak (Chudik et al. 2011), and the strength of cross-sectional cor-
dependence (Bailey et al. 2016). However, while such models deal with cross-
sectional correlation, they are essentially aspatial in that they ignore the or-
2.3 Dynamic Spatial Panels
able, y_{t−1}, and/or explanatory variables, X_{t−1}, but also time-lagged versions
of the spatial lags, such as Wy_{t−1} and WX_{t−1}. In addition, several time lags
Clearly, not all parameters in such a model can be identified, but many
temporally lagged variables (see also Anselin et al. 2008, for a discussion of
identification concerns).
The main issues associated with dynamic panel models are determining
The basic cross-sectional model has been extended into a number of dif-
including spatial interactions to a range of models beyond the standard linear
regression.
Spatial lag and error models have been applied to discrete dependent
variable specifications, such as probit and tobit (e.g., Pinkse and Slade 1998,
Fleming 2004, Smirnov 2010, Calabrese and Elkink 2014, Xu and Lee 2015,
Baltagi et al. 2016, Xu and Lee 2018), as well as count models (Lambert
the specification of the weights and the interpretation of direct and indirect
models other than the classic linear regression specification. Examples include
spatial quantile regressions (e.g., Kostov 2013, McMillen 2013), and
spatial stochastic frontier models (e.g., Fusco and Vidoli 2013, Orea and
Álvarez 2019).
3 Spatial Weights
The specification of the spatial weights has been the subject of consid-
erable research and debate in spatial econometrics. Most of the early lit-
erature considered the weights as given and exogenous, but more recently,
nously have received interest. In addition, the extent to which the weights
are stable over time has been an important concern. Each of these topics is
considered in turn.
with pairs of observations are neighbors, i.e., for which pairs i − j, w_{ij} ≠ 0.
Stakhovych and Bijmolt (2009), Harris et al. (2011), and Anselin and Rey
(2014, Chapters 3 and 4), among others.
number of neighbors (typically a small value), and use the matrix in row-standardized
form, such that the elements of each row sum to one. As a
result, the sum of all the elements in the weights matrix (an important ele-
critics, and alternatives have been proposed that scale the full matrix and
retain its symmetry (e.g., Kelejian and Prucha 2010, Plümper and Neumayer
weights, i.e., w_{ij} = 1 for all i ≠ j. While this approach facilitates some esti-
and Prucha 2002, Lee 2002, Martellosio 2011). So-called block weights have
with all observations in the same subset (e.g., counties in a state) consid-
ered to be neighbors (Case 1991, 1992, Kelejian et al. 2006, Anselin and
Arribas-Bel 2013).
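Block weights are straightforward to construct from group membership. The sketch below assumes five hypothetical counties in two states; the labels and variable names are illustrative.

```python
import numpy as np

# Hypothetical state labels for five counties.
groups = np.array(["A", "A", "A", "B", "B"])

# Two observations are neighbors when they share a label (self excluded).
same_group = (groups[:, None] == groups[None, :]).astype(float)
np.fill_diagonal(same_group, 0.0)

# Row-standardize; this assumes every group has at least two members,
# otherwise a row sum would be zero.
W_block = same_group / same_group.sum(axis=1, keepdims=True)

print(W_block)
```

Within a block of m observations, every off-diagonal weight equals 1/(m − 1), so the weights depend on the group size.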
and the corresponding variance-covariance matrix. As shown in the reduced
form for the spatial autoregressive process, the range and strength of interac-
the autoregressive coefficient. In other words, the weights matrix is not the
the literature arguing they are critical to the model specification (e.g., Harris
et al. 2011), with others stressing the interpretation of direct and indirect
effects, in which the actual weights specification plays a less important role
In practice, there are often several candidate weights, without clear the-
model fit criteria, such as AIC (Anselin 1988, Stakhovych and Bijmolt 2009),
area of recent research (Anselin 1986, Kelejian 2008, Kelejian and Piras 2011,
Piras and Lozano-Gracia 2012, Han and Lee 2013, Jin and Lee 2013, Wang
et al. 2013, Delgado and Robinson 2015, Debarsy and Ertur 2019). Spatial
J tests have also been suggested to select the optimal number of neighbors
for k-nearest neighbor weights (Gerkman and Ahlgren 2014).
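For reference, k-nearest neighbor weights can be built from point coordinates as in the following sketch. The helper `knn_weights` and the coordinates are hypothetical, and ties in distance are not treated specially.

```python
import numpy as np

def knn_weights(coords, k):
    """Row-standardized k-nearest neighbor weights from point coordinates."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)                 # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest points
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nearest] = 1.0 / k    # equal weight for each neighbor
    return W

# Four illustrative points on a line.
coords = [(0, 0), (1, 0), (3, 0), (7, 0)]
W = knn_weights(coords, k=2)
print(W)
```

Every observation has exactly k neighbors, but the resulting matrix is in general not symmetric: the isolated point at (7, 0) counts (1, 0) among its neighbors, while the reverse does not hold.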
ing (examples are LeSage and Parent 2007, LeSage and Pace 2009, Cotteleer
et al. 2011, Piribauer and Fischer 2015, Debarsy and LeSage 2018, among
have been applied to select the best (combination) from among a set of
more parameters. Typical functional forms are w_{ij} = e^{−θ d_{ij}} or w_{ij} = d_{ij}^{−θ}
(see, e.g., Bodson and Peeters 1975, Anselin 1980, Benjanuvatra and Bur-
ridge 2015). Care must be taken that the parameter space does not violate
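The two functional forms can be sketched as follows. The function name `distance_weights`, the distance matrix, and the values of θ are assumptions for illustration; no normalization or parameter-space check is applied.

```python
import numpy as np

def distance_weights(dist, theta, kind="exponential"):
    """Parameterized distance-decay weights with a zero diagonal."""
    dist = np.asarray(dist, dtype=float)
    W = np.zeros_like(dist)
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    if kind == "exponential":
        W[off_diag] = np.exp(-theta * dist[off_diag])   # w_ij = exp(-theta d_ij)
    elif kind == "power":
        W[off_diag] = dist[off_diag] ** (-theta)        # w_ij = d_ij^(-theta)
    else:
        raise ValueError("kind must be 'exponential' or 'power'")
    return W

# Illustrative distances between three locations on a line.
dist = np.array([
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
])
W_exp = distance_weights(dist, theta=1.0)
W_pow = distance_weights(dist, theta=2.0, kind="power")
print(W_pow[0, 2])   # 0.25
```

In both cases a larger θ implies a steeper decline of the interaction with distance.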
A second strand of literature focuses on estimating the elements of the
weights matrix from the covariance structure in the data. This is typically
The key aspect of these approaches is the equality between the residual
the n × n matrix:
which follows from the definition of the error SAR process.7 Together with
be extracted.8
6 A cross-sectional application is given in Bhattacharjee et al. (2012), where the covariance between housing submarkets is exploited to provide identification of the weights.
7 There is no separate spatial autoregressive coefficient, since it would not be jointly identified with the elements of the weights matrix.
8 An alternative based on entropy maximization is given in Fernández-Vázquez et al. (2009).
In practice, this approach has only been implemented for rather small n.
jee et al. 2012), to 5 (Bhattacharjee and Holly 2011) and 9 (Meen 1996,
Shrinkage and Selection Operator, Hastie et al. 2009) has recently been ap-
plied to this context as well. In Ahrens and Bhattacharjee (2015), and Lam
and Souza (2016, 2020), a two-step LASSO estimation is applied to both the
elements of the weights matrix and the model parameters by exploiting spar-
the sparsity condition must be such that the number of non-zero elements in
a row of the weights matrix, together with the number of exogenous variables
y = α + W(d)y + Xβ + e,
idea is extended to a functional coefficient spatial autoregressive model with
nonparametric spatial weights in Sun (2016) and Koroglu and Sun (2016),
allowing for both exogenous and endogenous variables to drive the spatial
potential endogeneity, i.e., when the (known) elements of the weights matrix
endogenous weights are the use of economic distance, trade shares, relative
the weights are known (and thus not estimated), their endogeneity should be
Kelejian and Piras 2017, Chapter 13) consists of constructing additional in-
struments for the weights matrix elements. An estimate of the weights ma-
ables. The estimation of the spatial lag specification utilizes the predicted
ity of the spatial weights matrix is formalized by expressing the weights
notation:
Z = X2 Γ + E,
y = ρW(Z)y + X1 β + u,
where the notation W(Z) indicates that the spatial weights are constructed
The source of the endogeneity comes from the joint distribution of the
error terms in E and u, which are zero mean, but have a covariance matrix:
Σ = [ σ_u^2     σ_{ue}′ ]
    [ σ_{ue}    Σ_e     ].
For σ_{ue} ≠ 0, the weights matrix is endogenous. As a result, the spatial lag
endogeneity of the weights matrix. Estimation strategies are outlined in Qu
This general framework can also be exploited to develop tests for the
endogeneity of the weights, as in Qu and Lee (2015), Cheng and Lee (2017),
et al. (2019), Gao and Bradley (2019), and Han et al. (2021).
A final issue, pertinent to a spatial panel data setting, is whether the spa-
tial interaction is constant over time. More specifically, this pertains to the
Two different perspectives can be taken. In one, the weights are fixed,
with time-varying weights in spatial dynamic panel models are Lee and Yu
Tests for structural breaks in the context of parameterized weights have
compare a model with a constant parameter to one with two different values,
4 Asymptotics
on the structure of the spatial weights and the allowed parameter space for
spatial coefficients.
4.1 General Concepts
In space, the mode of convergence, i.e., the manner in which the sample size
size). On the one hand, one can envisage a fixed spatial domain from which
with the field view of spatial data typically adopted in the physical sciences
and the associated methodology of geostatistics (see, e.g., Cressie 1993, Stein
of this sampling design is that the number of neighbors that interact directly
This has major implications for obtaining the desired asymptotic properties.
over to the other type, something noted by Lahiri (1996), among others.
First, unlike what generally holds for time series, the spacing of observa-
tions in two-(or three) dimensional space is uneven. Of course, regular lattice
structures can and have been considered, but these are of limited relevance
uneven spacing implies that many results obtained for dependent stochastic
processes in the time domain do not transfer directly to the spatial domain.
In addition, in a time series context, growing the sample with new obser-
vations does not affect earlier observations. However, this does not hold for
is more complex since the dependence structure impacts (almost) all other
observations as well.
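The contrast with the time series case can be illustrated by simulation. The sketch below assumes a pure spatial autoregressive process on a line with row-standardized weights and compares it with a first-order autoregressive time series driven by the same errors; the helper names, the seed, and ρ = 0.5 are illustrative choices.

```python
import numpy as np

def line_weights(n):
    """Row-standardized contiguity weights for n locations on a line."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def sar_values(e, rho):
    """Solve y = rho W y + e for the sample size implied by e."""
    n = len(e)
    return np.linalg.solve(np.eye(n) - rho * line_weights(n), e)

def ar1_values(e, rho):
    """Time series y_t = rho y_{t-1} + e_t, started from zero."""
    y = np.zeros(len(e))
    for t in range(len(e)):
        y[t] = (rho * y[t - 1] if t > 0 else 0.0) + e[t]
    return y

rng = np.random.default_rng(1)
e = rng.normal(size=6)
rho = 0.5

# Spatial case: adding a sixth location changes the first five values.
spatial_changed = not np.allclose(sar_values(e[:5], rho), sar_values(e, rho)[:5])

# Time series case: adding a sixth period leaves the first five untouched.
time_unchanged = np.allclose(ar1_values(e[:5], rho), ar1_values(e, rho)[:5])

print(spatial_changed, time_unchanged)   # True True
```

In the spatial case the new observation enters both the weights and the simultaneous system, so that the values at the original locations are recomputed.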
y_i = ρ Σ_j w_{ij} y_j + e_i,
with w_{ij} as the elements of the spatial weights matrix and e_i as a random
error term (typically assumed to be i.i.d.). For a sample of size n, the reduced
where the subscript n refers to the sample size. In other words, each y_i is
not only affected by e_i, but also by the error terms at all other locations
in the system.9 In a typical time series context, the addition of s new
error terms e_{n+1}, e_{n+2}, ..., e_{n+s} would simply yield s new values for y_i for
process, the values for y_i for i ≤ n are affected by the introduction of the new
observations as well, since the inverse terms in the reduced form will ensure
that the s additional error terms change all the elements in the vector y. As
conceptualize the sample expansion (see also Robinson 2011, p.6, for further
the sample (as is customary), but also by the sample size, as in y_{i(n)}. In
y_{1(1)}   y_{1(2)}   y_{1(3)}   ...   y_{1(n)}
           y_{2(2)}   y_{2(3)}   ...   y_{2(n)}
                      y_{3(3)}   ...   y_{3(n)}
                                 ...   ...
                                       y_{n(n)},
hence the designation as a triangular array (see Kelejian and Prucha 1998,
The consideration of the data generation process as following a triangular
neighbors is not constant (Anselin 1988, 2002), even when the error terms
are i.i.d. This also follows from the reduced form. Consequently, LLN and
CLT need to account for this additional heterogeneity, even for intrinsically
stationary processes (for the error terms). In addition, more complex forms
the error terms. The framework can be applied to a range of tests for spatial
tests, as well as several estimation methods for the lag and error models,
including maximum likelihood (ML), quasi-maximum likelihood (QMLE), two-stage
least squares (2SLS), generalized method of moments (GMM), and generalized
empirical likelihood (GEL) (see also Kelejian and Prucha 2010, Xu
The key result is that these tests and estimators can be written as special
Q_n = e_n′ A_n e_n + b_n′ e_n.
In these expressions, the subscript n refers to the variability with the sample
size as a result of the triangular array property. The error terms e_{i,n} are
of unknown form. In addition, there are some general limits on the variability
The CLT of Kelejian and Prucha allows for the diagonal elements of
the matrix A to be non-zero, although for most spatial models this is not
sums bounded (i.e., their maximum sum for each n is finite). The symmetry
utilizing A = (1/2)(W + W′).
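That this symmetrization leaves the quadratic form unchanged is easily confirmed. The sketch below uses an arbitrary asymmetric, row-standardized weights matrix and random errors; the seed and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5

# Arbitrary asymmetric weights with zero diagonal, row-standardized.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)

e = rng.normal(size=n)

A = 0.5 * (W + W.T)        # symmetric version of the weights
q_W = e @ W @ e            # quadratic form with the original matrix
q_A = e @ A @ e            # quadratic form with the symmetrized matrix

print(np.isclose(q_W, q_A))   # True
```

The diagonal of A remains zero here, which corresponds to the typical case for spatial models noted above.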
Kelejian and Prucha (2001) invoke a CLT for martingale difference arrays
(MDA) to prove that the standardized random variable (Q_n − µ_{Q_n})/σ_{Q_n}
such as in multivariate and panel data settings, e.g., Kuersteiner and Prucha
(2013), and the review in Xu and Lee (2019). Central limit theorems for
linear spatial models with long range dependence are derived in Lahiri and
Robinson (2016).
CLT boil down to restrictions on the spatial weights. For example, in the
spatial lag model, the weights matrix W should be both row- and column-
straint pertains to the inverse matrix (I − ρW)^{−1}, which should also be row-
and column-bounded. This is satisfied for a parameter space with |ρ| < 1.
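A numerical check of this boundedness is sketched below for row-standardized contiguity weights on a line with ρ = 0.5, both being illustrative assumptions. The maximum absolute row and column sums of the inverse are computed for growing n.

```python
import numpy as np

def line_weights(n):
    """Row-standardized contiguity weights for n locations on a line."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

rho = 0.5
bounds = {}
for n in (5, 50, 200):
    inv = np.linalg.inv(np.eye(n) - rho * line_weights(n))
    max_row_sum = np.abs(inv).sum(axis=1).max()
    max_col_sum = np.abs(inv).sum(axis=0).max()
    bounds[n] = (max_row_sum, max_col_sum)

for n, (r, c) in bounds.items():
    print(n, round(r, 4), round(c, 4))
```

In this example every row sum equals 1/(1 − ρ) = 2 irrespective of n, and the column sums stay below 4, so neither grows with the sample size.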
For nonlinear spatial models, i.e., when the dependent variable is not a linear
function of the error terms, such as spatial probit and tobit, the framework
of MDAs is no longer sufficient to establish asymptotic properties.
Jenish and Prucha (2009) develop the basic theory for asymptotic prop-
CLT and ULLN (uniform LLN) for random spatial fields that satisfy mix-
ing and continuity conditions. The random fields allow for non-stationarity,
models.
The familiar notions of α and φ mixing are broadened to account for the
growing size of the data subsets to which they pertain (more precisely, the
space.
In general terms, the α measure is the difference between the joint prob-
ability and the product of the marginal probabilities, which are equal under
that the mixing coefficient goes to zero as the distance goes to infinity.10 In
However, the mixing conditions as such are not sufficient to deal with
Jenish and Prucha (2012), new CLT and LLN are derived that broaden the
range of models with spatial dynamics that can be accommodated. The key
for ψ(s) such that lim_{s→∞} ψ(s) = 0. In this expression, F_i(s) is a σ-field
distance that defines the neighborhood becomes infinitely large. With the
weak spatial dependence. Formal conditions and the application of these
5 Concluding Remarks
focus has been on “what is special about spatial” in terms of model specifi-
cation. In particular, a range of spatial models has been reviewed and their
Two items received particular attention, since they are essentially absent
in standard econometrics: the use of a spatial weights matrix and the spe-
While these topics provide a taste of the special features of spatial mod-
els, many other econometric aspects could not be considered within the cur-
likelihood, GMM), diagnostic tests for spatial correlation (e.g., Lagrange
heterogeneity. The texts and review articles cited in the introduction provide
References
Angulo, A., Burridge, P., and Mur, J. (2017). Testing for a structural break
in the weight matrix of the spatial error or spatial lag model. Spatial
Angulo, A., Burridge, P., and Mur, J. (2018). Testing for breaks in the
Ithaca, NY.
Anselin, L. (2001). Spatial econometrics. In Baltagi, B., editor, A Companion
Anselin, L. (2002). Under the hood. Issues in the specification and interpre-
Science, 89:2–25.
Anselin, L., Le Gallo, J., and Jayet, H. (2008). Spatial panel econometrics.
Fundamentals and Recent Developments in Theory and Practice (3rd Edi-
31:1–3.
metrics, 31:929–960.
Baltagi, B. H. and Liu, L. (2011). An improved generalized moments estima-
tor for a spatial moving average error model. Economics Letters, 113:282–
284.
Baltagi, B. H., Song, S. H., Jung, B. C., and Koh, W. (2007). Testing for
44:386–397.
Bera, A. K., Doğan, O., and Taşpinar, S. (2018). Simple tests for endogene-
69:130–142.
Bhat, C. R., Paleti, R., and Singh, P. (2014). A spatial multivariate count
Economics, 43:617–634.
54:664–687.
59:953–965.
Case, A. C. (1992). Neighborhood influence and technological change. Re-
Chi, G. and Zhu, J. (2020). Spatial regression models for the social sciences.
Chudik, A. and Pesaran, M. H. (2015). Large panel data models with cross-
UK.
Chudik, A., Pesaran, M. H., and Tosetti, E. (2011). Weak and strong cross
14:C45–C90.
Pion, London.
Cotteleer, G., Stobbe, T., and van Kooten, G. C. (2011). Bayesian model
averaging in the context of spatial hedonic pricing: an application to farm-
techniques for real estate data. Journal of Real Estate Literature, 7:79–95.
Elhorst, J. P. (2014a). Spatial Econometrics, From Cross-Sectional Data to
spatial model with moving average errors, with application to real estate
choice models. In Anselin, L., Florax, R. J., and Rey, S. J., editors, Ad-
berg.
Economics, 27:679–694.
Gao, H. and Bradley, J. R. (2019). Bayesian analysis of areal data with
unknown adjacencies using the stochastic edge mixed effects model. Spatial
Statistics, 31:100357.
283.
410.
Haining, R. (1978). The moving average model for spatial interaction. Trans-
Han, X., Hsieh, C.-S., and Ko, S. I. (2021). Spatial modeling approach for
Han, X. and Lee, L. (2013). Model selection using J-test for the autoregressive
model vs. the matrix exponential spatial model. Regional Science and
Harris, R., Moffat, J., and Kravtsova, V. (2011). In search of ‘W’. Spatial
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statis-
Jenish, N. and Prucha, I. R. (2009). Central limit theorems and uniform laws
150:86–98.
190.
Jin, F. and Lee, L.-F. (2013). Cox-type tests for competing spatial autore-
Kapetanios, G., Pesaran, M. H., and Yamagata, T. (2011). Panels with non-
348.
Kapoor, M., Kelejian, H. H., and Prucha, I. (2007). Panel data models with
spatially correlated error components. Journal of Econometrics, 140:97–
130.
Sciences, 1:3–11.
41:281–292.
London, UK.
17:99–121.
104(2):219–257.
Kelejian, H. H. and Prucha, I. R. (2002). 2SLS and OLS in a spatial au-
toregressive model with equal spatial weights. Regional Science and Urban
Economics, 32(6):691–707.
lems in models with spatial weighting matrices which have blocks of equal
of Econometrics, 174:107–126.
Econometrics, 4:6.
Kostov, P. (2013). Empirical likelihood estimation of the spatial quantile
Lahiri, S. and Robinson, P. M. (2016). Central limit theorems for long range
38:693–710.
for a spatial lag model of counts: theory, small sample performance and
18(2):252–277.
Lee, L.-F. and Yu, J. (2010). Some recent developments in spatial panel data
Lee, L.-F. and Yu, J. (2011). Estimation of spatial panels. Foundations and
Lee, L.-F. and Yu, J. (2012). QML estimation of spatial dynamic panel
data models with time varying spatial weights matrices. Spatial Economic
Analysis, 7:31–74.
Lee, L.-F. and Yu, J. (2015). Spatial panel data models. In Baltagi, B. H.,
editor, The Oxford Handbook of Panel Data, pages 363–401. Oxford Uni-
Martellosio, F. (2011). Nontestability of equal weights spatial dependence.
Verlag, Heidelberg.
Econometrics, 213:556–577.
Springer-Verlag, Heidelberg.
Pinkse, J. and Slade, M. E. (1998). Contracting in space: An application
85:125–154.
Pinkse, J., Slade, M. E., and Brett, C. (2002). Spatial price competition: A
261.
Qu, X., Lee, L., and Yu, J. (2017). QML estimation of spatial dynamic
panel data models with endogenous time varying spatial weights matrices.
Qu, X., Wang, X., and Lee, L. (2016). Instrumental variable estimation of
Robinson, P. M. (2009). Large-sample inference on spatial dependence.
Rodrigues, E., Assunção, R., and Dey, D. K. (2014). A closer look at the
Shi, W. and Lee, L. (2018). A spatial panel data model with time varying
408.
Strauß, M. E., Mezzetti, M., and Leorato, S. (2017). Is a matrix exponential
Science, 55:339–363.
Wall, M. M. (2004). A close look at the spatial structure implied by the CAR
324.
Wang, S., Wang, S., and Smith, P. (2013). Two spatial non-nested tests for
sis, 45:345–358.
449.
Xu, X. and Lee, L. (2019). Theoretical foundations for spatial econometric
Zhang, X. and Yu, J. (2018). Spatial weights matrix selection and model av-
18.