Notes on financial econometrics

2001, Journal of Econometrics

George Tauchen*
Department of Economics, Duke University, PO Box 90097, Social Science Building, Durham, NC 27708-0097, USA
* Tel.: +1-919-660-1812; fax: +1-919-684-8974.
G. Tauchen / Journal of Econometrics 100 (2001) 57–64

Abstract

The first part of the discussion reviews recent successes in modeling of discrete time financial data and argues that a direct approach is better suited than stochastic volatility. The second part reviews recent work on estimating continuous time models with emphasis on simulation-based techniques and joint estimation of the risk neutral and objective probability distributions. © 2001 Elsevier Science S.A. All rights reserved.

1. Discrete time data

1.1. Direct specification

Perhaps the most stunning empirical success is the extent to which we now essentially understand the statistical dynamics of a scalar financial price series. Suppose $P_t$ is the price of a financial asset at time $t$, and we let $y_t = \log(P_t) - \log(P_{t-1})$, or more generally $y_t = \log(P_t + D_t) - \log(P_{t-1})$ if dividends are taken into account. Let $Y_{t-1} = \{y_{t-k}\}_{k=1}^{\infty}$ denote the lag history of $y_t$, and put

$$\mu_t = E(y_t \mid Y_{t-1}), \qquad \sigma_t^2 = \mathrm{Var}(y_t \mid Y_{t-1}), \qquad z_t = \frac{y_t - \mu_t}{\sigma_t}, \tag{1}$$

where for now, we suppose $z_t$ is iid with density $f(z)$. This setup underlies much discrete-time financial modeling such as ARCH/GARCH and its various nonparametric relatives.

We know from over 15 years of work with various series that the conditional mean is nearly constant, $\mu_t \approx$ AR(1) with autoregressive coefficient near zero, that $\sigma_t^2$ is an extremely persistent series that might be best captured by either long memory (Ding et al., 1993; Baillie et al., 1996) or multiple components GARCH models (Engle and Lee, 1999), and that $f(z)$ is highly non-Gaussian with more mass at the origin and in the tails than the Gaussian distribution. If $y_t$ is an interest rate, the basic facts are more complicated.
The conditional mean is nearly that of a random walk and is perhaps nonlinear (Aït-Sahalia, 1996a). The conditional variance displays both GARCH-like behavior and a level effect (Andersen and Lund, 1997), while $f(z)$ is highly non-Gaussian with a shape much like that of an equity return.

These basic facts arise from what I will call a direct approach to specifying models for $y_t$. If we specify functional forms $\mu_t = m(Y_{t-1}, \theta)$, $\sigma_t^2 = V(Y_{t-1}, \theta)$, $f(z) = f(z, \theta)$, then the transition density for $y_t$ is

$$y_t \sim \frac{1}{\sqrt{V(Y_{t-1}, \theta)}}\, f\!\left[\frac{y_t - m(Y_{t-1}, \theta)}{\sqrt{V(Y_{t-1}, \theta)}},\ \theta\right], \tag{2}$$

which provides the basis for maximum-likelihood type estimation of the model from which much can be learned regarding the dynamics of $y_t$.

There are somewhat less parametric approaches to the same modeling task. In the SNP approach of Gallant and Tauchen (1989, 2000a), $f(z)$ is approximated by a modified Hermite series with $f_K(z, \theta_K)$ denoting the $K$th term in the series. In this case, the transition density of $y_t$ is

$$y_t \sim \frac{1}{\sqrt{V(Y_{t-1}, \theta_K)}}\, f_K\!\left[\frac{y_t - m(Y_{t-1}, \theta_K)}{\sqrt{V(Y_{t-1}, \theta_K)}},\ \theta_K\right], \tag{3}$$

where $\theta_K \in \Theta_K$ contains all of the parameters of the expansion, $\Theta_K \subset \Theta_{K+1}$, and the functions $m$ and $V$ retain the same parametric specifications for each $K$ as in the base specification $K = 0$, which is usually a Gaussian model. In the semiparametric GARCH formulation of Engle and Gonzales-Rivera (1991), the transition density is

$$y_t \sim \frac{1}{\sqrt{V(Y_{t-1}, \theta)}}\, \hat{f}\!\left[\frac{y_t - m(Y_{t-1}, \theta)}{\sqrt{V(Y_{t-1}, \theta)}}\right], \tag{4}$$

where $\hat{f}(z)$ is a kernel-based estimate of the transition density of $z_t$.

For very long time series, the presumption of time homogeneity in the error density $f(z)$ becomes untenable (Gallant et al., 1992). The conditional skewness, kurtosis, and other higher order properties become state dependent. The SNP approach accommodates such dependence by making the Hermite coefficients depend upon $Y_{t-1}$, so the error density is $f_K(z \mid Y_{t-1}, \theta_K)$ and the transition density takes a form similar to (3).
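To make the link between the transition density (2) and maximum-likelihood estimation concrete, here is a minimal Python sketch. It assumes the simplest tightly parameterized case, which is an illustrative choice rather than a specification taken from the text: a constant conditional mean, a GARCH(1,1) conditional variance, and a standard Gaussian error density. The function names and parameter values are hypothetical.

```python
import math
import random


def garch11_loglik(y, mu, omega, alpha, beta):
    """Gaussian log-likelihood built from the transition density (2).

    Assumed (illustrative) functional forms:
      m(Y_{t-1}) = mu
      V(Y_{t-1}) = omega + alpha * (y_{t-1} - mu)**2 + beta * V(Y_{t-2})
      f(z)       = standard normal density
    """
    # Start the variance recursion at the unconditional variance
    # (requires alpha + beta < 1).
    var = omega / (1.0 - alpha - beta)
    loglik = 0.0
    for obs in y:
        z = (obs - mu) / math.sqrt(var)
        # log of (1 / sigma_t) * f(z_t), the summand implied by (2)
        loglik += -0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(var) - 0.5 * z * z
        # update the conditional variance for the next observation
        var = omega + alpha * (obs - mu) ** 2 + beta * var
    return loglik


def simulate_garch11(n, mu, omega, alpha, beta, seed=0):
    """Simulate n observations from the same assumed model."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)
    y = []
    for _ in range(n):
        obs = mu + math.sqrt(var) * rng.gauss(0.0, 1.0)
        y.append(obs)
        var = omega + alpha * (obs - mu) ** 2 + beta * var
    return y


# Hypothetical parameter values, chosen only so the process is stationary.
sample = simulate_garch11(2000, 0.0, 0.05, 0.10, 0.85, seed=1)
print(garch11_loglik(sample, 0.0, 0.05, 0.10, 0.85))
```

Maximizing `garch11_loglik` over its parameters with any numerical optimizer would give the maximum-likelihood estimate under these assumptions. Replacing the Gaussian term with a more flexible density is what the SNP and semiparametric variants in (3) and (4) do.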
The SNP approach also has a natural multivariate generalization. Gallant and Tauchen (2000a) discuss computational details and provide computer code and worked examples.

1.2. Stochastic volatility

The general approach above is direct in that the investigator directly specifies the three key pieces: the conditional mean function $m(Y_{t-1}, \theta)$, the conditional variance function $V(Y_{t-1}, \theta)$, and the error density $f(z \mid Y_{t-1}, \theta)$. This specification can be done in either a fairly tightly parameterized manner or a more flexibly parameterized manner with a non-parametric interpretation.

The direct approach stands in contrast to stochastic volatility, for which the basic model is

$$y_t = z_t h_t, \tag{5}$$

where

$$z_t \sim q(z), \qquad h_t \sim g(h \mid H_{t-1}), \qquad H_{t-1} = \{h_{t-k}\}_{k=1}^{\infty}, \tag{6}$$

and for simplicity the conditional mean is ignored here. In the above, $h_t$ is unobserved stochastic volatility and $z_t$ is a return shock. The basic model takes $q(z)$ as the standard Gaussian density and $\log(h_t)$ as a Gaussian AR(1) process with possible correlation between volatility innovations and return shocks; see Ghysels et al. (1995) for a survey.

The appeal of the stochastic volatility model is its simplicity and ease of interpretation. A drawback, however, is that given (6) the conditional density of the observed process given its own past, $f(y_t \mid Y_{t-1})$, is not available in a convenient closed form. This has led to a large number of method-of-moments based approaches and Bayesian-based approaches to estimation of (6); see Andersen and Sorensen (1996), Jacquier et al. (1994), and Kim et al. (1998), among others.

These additional complications in estimation seem a small price to pay for the elegant simplicity of the basic specification. However, Gallant et al. (1997) find that a realistic stochastic volatility model has to be far more complicated if it is to actually fit the data.
The error density $q(z)$ has to be made strongly thick-tailed and left skewed, while the dynamics of $h_t$ have to be very rich with both short-term (Markov) and long-memory components. The entire apparatus becomes so complicated and so difficult to estimate that the appeal of stochastic volatility on grounds of simplicity is lost. The direct approach is better suited to the task than is stochastic volatility.

2. Continuous time estimation

2.1. Estimation of price dynamics

Continuous time estimation has attracted a huge amount of attention in the past five years. Lo (1988) points out what was considered the major obstacle: given a specification of the continuous time dynamics, the conditional density of the discretely sampled price process is not available in closed form. This either precludes or greatly complicates maximum likelihood estimation. Hansen and Scheinkman (1995) and Aït-Sahalia (1996a, b) are among the first works in this area. For reasons of space, I will confine my discussion to simulation-based moments estimators, though there is progress on implementing maximum likelihood (Aït-Sahalia, 1999; Elerian et al., 1999) for scalar observed data. Some advantages of simulation-based procedures are that they can more readily handle multivariate situations with partially observed state vectors and path-dependent observed variables.

Suppose the underlying state vector $u_t$ of the economy evolves as

$$du_t = a(u_t)\,dt + B(u_t)\,dw_t, \tag{7}$$

where $w_t$ is a vector of Brownian motions. Assume a vector of logged financial prices $p_t$ evolves according to

$$dp_t = a_p(u_t)\,dt + B_p(u_t)\,dw_t. \tag{8}$$

Clearly, if one specifies the functional forms

$$a(u) = a(u, \rho), \qquad B(u) = B(u, \rho), \qquad a_p(u) = a_p(u, \rho), \qquad B_p(u) = B_p(u, \rho), \tag{9}$$

where $\rho$ is a parameter vector, then the price data generation process is determined in continuous time.
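Once the functional forms in (9) are fixed, the system (7)–(8) can be simulated on a fine time grid. The Python sketch below uses a plain Euler-Maruyama discretization for a scalar state and a scalar log price. Several things here are assumptions made for illustration and are not taken from the text: the Brownian vector has two independent components, with $B$ loading only on the first and $B_p$ only on the second, and the example drift and diffusion functions (a mean-reverting log-volatility factor) and all parameter values are hypothetical.

```python
import math
import random


def euler_maruyama(a, B, a_p, B_p, u0, p0, n_periods, steps_per_period, seed=0):
    """Euler-Maruyama simulation of a scalar version of (7)-(8).

    Returns the state and the log price sampled at integer times
    0, 1, ..., n_periods. Assumes w = (w_u, w_p) with independent
    components (an illustrative simplification).
    """
    rng = random.Random(seed)
    dt = 1.0 / steps_per_period
    sqrt_dt = math.sqrt(dt)
    u, p = u0, p0
    states, prices = [u0], [p0]
    for _ in range(n_periods):
        for _ in range(steps_per_period):
            dw_u = sqrt_dt * rng.gauss(0.0, 1.0)
            dw_p = sqrt_dt * rng.gauss(0.0, 1.0)
            # both increments use the state at the start of the step
            p_next = p + a_p(u) * dt + B_p(u) * dw_p
            u_next = u + a(u) * dt + B(u) * dw_u
            u, p = u_next, p_next
        states.append(u)
        prices.append(p)
    return states, prices


# Hypothetical specification: u is a mean-reverting log-volatility factor.
kappa, u_bar, xi, mu = 0.1, math.log(0.01), 0.05, 0.0003
u_path, p_path = euler_maruyama(
    a=lambda u: kappa * (u_bar - u),
    B=lambda u: xi,
    a_p=lambda u: mu,
    B_p=lambda u: math.exp(u),
    u0=u_bar,
    p0=0.0,
    n_periods=250,
    steps_per_period=20,
    seed=1,
)
# First differences of the sampled log price give simulated returns.
returns = [p_path[t] - p_path[t - 1] for t in range(1, len(p_path))]
print(len(returns))
```

A finer grid (larger `steps_per_period`) reduces discretization error at the cost of more computation; the appropriate trade-off depends on the model and is not addressed here.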
The econometrician observes functions of the path of the price process at discrete time points:

$$y_t = O\big[\{p_s\}_{s=t-1}^{t}\big], \tag{10}$$

where $\{p_s\}_{s=t-1}^{t}$ means the within-period continuous price path, $y_t$ is the observed process for integer $t$, and $O$ is the observation function. The form of $O$ depends upon the application. For interest rate data, $O$ just selects out the yields implied by bond prices; for equities data, which have a unit root, $O$ selects out first differences of log prices; more generally, $O$ also selects out path-dependent quantities such as the high/low range as in Gallant et al. (1999) or the quadratic variation as in Bollerslev and Zhou (2000).

Given the observed data set $Y_T = \{y_t\}_{t=1}^{T}$, the task is to estimate $\rho$ and test the specification. Although $\mathrm{pdf}(y_t \mid y_{t-1}, \ldots, \rho)$ is not readily available, it is clear that one can easily simulate from the system (7)–(10). For each candidate value of $\rho$ one generates simulated data sets $\hat{Y}_{\mathcal{T}}(\rho) = \{\hat{y}_\tau(\rho)\}_{\tau=1}^{\mathcal{T}}$. Simulated method of moments (SMM) of Duffie and Singleton (1993) is feasible, and there are some good ways to implement SMM.

One approach is the Indirect Inference approach of Gourieroux et al. (1993). Suppose we consider an auxiliary model $f(y_t \mid y_{t-1}, \ldots, \theta)$ for the observed data. Let $\hat{\theta}_T = B_T(Y_T)$ denote the QML estimator of $\theta$ based on $f$ as a function $B_T(\cdot)$ of the observed data set $Y_T$. The Indirect Inference estimator minimizes $[\hat{\theta}_T - M(\rho)]' W [\hat{\theta}_T - M(\rho)]$, where $W$ is a weight matrix and $M(\rho)$ is given by the binding function $M(\rho) = \lim_{\mathcal{T} \to \infty} B_{\mathcal{T}}[\hat{Y}_{\mathcal{T}}(\rho)]$, which is approximated by $B_{\mathcal{T}}[\hat{Y}_{\mathcal{T}}(\rho)]$ for large $\mathcal{T}$. Unless $f(y_t \mid y_{t-1}, \ldots, \theta)$ is linear, or only mildly nonlinear, this approach is very computationally demanding, as one needs to evaluate the binding function $M(\rho)$ for any permissible value of $\rho$.

The estimator of Gallant and Tauchen (1996, 2000b) circumvents the need to evaluate the binding function by using the score vector $(\partial/\partial\theta)\log[f(y_t \mid y_{t-1}, \ldots, \theta)]$ to define the moment conditions.
If the auxiliary model $f(y_t \mid y_{t-1}, \ldots, \theta)$ is chosen flexibly with a suitable nonparametric interpretation, then the estimator achieves the asymptotic efficiency of maximum likelihood and has good power properties for detecting misspecification (Gallant and Long, 1997; Tauchen, 1997), hence the term efficient method of moments (EMM). Some applications of EMM are Andersen and Lund (1997), Dai and Singleton (1999), and Gallant et al. (1999).

2.2. Joint estimation of objective and risk neutral distributions

One of the most interesting and exciting challenges in continuous time analysis is the prospect of joint estimation of the so-called objective and risk-neutral probability distributions. As before, assume the state vector $u_t$ evolves according to

$$du_t = a(u_t)\,dt + B(u_t)\,dw_t. \tag{11}$$

Suppose we have traded security prices $p_{jt}$ with cash flows $c_{jt} = C_j(u_t)$. Internal consistency (no arbitrage) requires that each price be the present value of the expected cash flow

$$p_{jt} = \int_{0}^{\infty} \exp\!\left(-\int_{t}^{t+s} r_v\,dv\right) \hat{E}(c_{j,t+s} \mid u_t)\,ds, \tag{12}$$

where $r_t$ is the instantaneous short rate of interest and $\hat{E}$ denotes the expectation under risk neutral dynamics:

$$du_t = a^*(u_t)\,dt + B(u_t)\,dw_t. \tag{13}$$

Observe that the objective (i.e. actual) dynamics of the state vector $u_t$ in (11) and the risk neutral dynamics (13) in general have different local drift functions $a(u_t)$ and $a^*(u_t)$, but they have the same local volatility structure $B(u_t)$.

At a fixed point in time $t$, one can actually estimate the risk-neutral distribution from a cross section of derivative prices $\{p_{jt}\}_{j=1}^{J}$. Finance economists are quite familiar with the calculation, which is undertaken routinely in industry. A stylized overview follows. One specifies functional forms $a^*(u) = a^*(u, \rho^*)$ and $B(u) = B(u, \rho^*)$ such that the expectation in (12) under the dynamics (13) is relatively easy to compute, where $\rho^*$ means the parameterization under the risk neutral distribution.
The expectation determines functional forms for the prices $p_j(u_t, \rho^*)$. Estimation proceeds via minimization of the pricing errors

$$(\hat{\rho}^*_t, \hat{u}_t) = \operatorname*{argmin}_{\rho^*,\, u_t} \sum_{j} \big[p_{jt} - p_j(u_t, \rho^*)\big]^2, \tag{14}$$

where the unobserved (or partially observed) state $u_t$ is estimated along with $\rho^*$. Given $(\hat{\rho}^*_t, \hat{u}_t)$, the dynamics (13), and the pricing equation (12), one can price any contingent claim (derivative security) as of date $t$. Of course, one can assume that $\rho^*$ is constant across time and add the objective function (14) across days to produce a common estimate of $\rho^*$ and a time series of estimates of the state vector $\hat{u}_t$.

This approach is sensible, but there are immediate econometric questions to raise. The first is the lack of a theory of the pricing error. Why should the error be expected to be serially uncorrelated with constant variance? More to the point, why does the model not fit the cross section exactly, as deviations entail possible arbitrage opportunities? This point at least should be pondered. A related issue is the appropriate econometric theory to apply in the face of as many incidental parameters ($u_t$) as there are data points. Finally, the approach only delivers an estimate of the risk neutral dynamics (13).

A potentially very progressive approach, and one of the most exciting frontiers in financial econometrics, is to exploit the common local volatility structure of (11) and (13) and estimate jointly the objective and risk neutral distributions. Specifically, parameterize

$$du_t = a(u_t, \rho)\,dt + B(u_t, \rho)\,dw_t, \qquad du_t = a^*(u_t, \rho^*)\,dt + B(u_t, \rho)\,dw^*_t, \tag{15}$$

where $w_t$ and $w^*_t$ are independent Brownian motions and the restriction is that the functional form of $B(u, \rho)$ must be the same across the two sets of dynamics. Financial prices are generated via

$$dp_t = a_p(u_t, \rho, \rho^*)\,dt + B_p(u_t, \rho, \rho^*)\,dw_t, \tag{16}$$

where the functional forms of $a_p$ and $B_p$ are determined by computing the expectations in (12) under the risk-neutral dynamics of (15). Given (16), joint estimation via SMM can proceed exactly as outlined in Section 2.1. There is research along these lines. For example, Chernov and Ghysels (2000) have recently undertaken exactly this approach for a multifactor stochastic volatility model, while Pan (1999) undertakes similar estimation via a GMM procedure for an affine jump diffusion model of interest rates.
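The SMM step referred to above can be illustrated on a deliberately small problem. The Python sketch below is not the joint objective and risk-neutral estimation described in this section. It is a toy instance of simulated method of moments under stated assumptions: a scalar Ornstein-Uhlenbeck state $du_t = -\kappa u_t\,dt + \sigma\,dw_t$ with $\sigma$ fixed at one, simulated by an Euler scheme, with two matched moments (variance and first-order autocovariance), an identity weight matrix, a grid search in place of a numerical optimizer, and a fixed simulation seed so that the objective is a smooth function of the parameter. All names and values are hypothetical.

```python
import math
import random


def simulate_ou(kappa, n_obs, steps=5, sigma=1.0, seed=0):
    """Euler scheme for du = -kappa*u dt + sigma dw, sampled at integer times."""
    rng = random.Random(seed)
    dt = 1.0 / steps
    scale = sigma * math.sqrt(dt)
    u, path = 0.0, []
    for _ in range(n_obs):
        for _ in range(steps):
            u += -kappa * u * dt + scale * rng.gauss(0.0, 1.0)
        path.append(u)
    return path


def moments(y):
    """Sample variance and first-order autocovariance (the matched moments)."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    acov = sum((y[i] - mean) * (y[i - 1] - mean) for i in range(1, n)) / n
    return var, acov


def smm_estimate(data, grid, n_sim, sim_seed=1):
    """Grid-search SMM with an identity weight matrix.

    The same simulation seed is reused for every candidate value
    (common random numbers).
    """
    target = moments(data)
    best_kappa, best_obj = None, float("inf")
    for kappa in grid:
        simulated = moments(simulate_ou(kappa, n_sim, seed=sim_seed))
        obj = sum((t - s) ** 2 for t, s in zip(target, simulated))
        if obj < best_obj:
            best_kappa, best_obj = kappa, obj
    return best_kappa


grid = [i / 20 for i in range(2, 31)]       # candidates 0.10, 0.15, ..., 1.50
observed = simulate_ou(0.5, 2000, seed=42)  # stand-in for real data
kappa_hat = smm_estimate(observed, grid, n_sim=4000, sim_seed=7)
print(kappa_hat)
```

In an application along the lines of (15)–(16), the simulator would generate both state and price paths, the parameter would be the pair $(\rho, \rho^*)$, and the moments would typically come from the score of an auxiliary model rather than from two hand-picked statistics.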

3. Conclusion

The preceding remarks all pertain to estimation situations with long time series observations (often thousands) on a relatively modest number of series (often three or four at most). Another huge challenge is dealing with extremely dense data sets comprised of ultra high frequency data on many (possibly hundreds) of series.