This document summarizes a 1967 paper on dynamical equations for optimal nonlinear filtering. The paper proves that the conditional expectation of a stochastic process given observations has a representation as the solution to a stochastic differential equation. This provides a recursive method for computing the conditional expectation over time as new observations become available. The paper establishes conditions on the stochastic processes under which the representation holds and discusses potential applications to practical filtering problems.
JOURNAL OF DIFFERENTIAL EQUATIONS 3, 179-190 (1967)
Dynamical Equations for Optimal Nonlinear Filtering
H. J. KUSHNER*

Center for Dynamical Systems, Division of Applied Mathematics, Brown University, Providence, Rhode Island

1. INTRODUCTION

In this paper, we prove a result in optimal nonlinear filtering (and on the representation of a conditional expectation as the solution to a stochastic differential equation) which we derived formally in [1] and [2]. Some possible computational methods are briefly discussed.

Write the vector stochastic differential (Itô) equations

$$dx = f(x, t)\,dt + V^{1/2}(x, t)\,dz, \tag{1}$$

$$dy = g(x_t, t)\,dt + dw, \tag{2}$$

where $z_t$ and $\tilde z_t$ are independent vector Wiener processes. The matrices $V^{1/2}$ and $\Sigma^{1/2}$ are square roots of the nonnegative-definite $V$ and positive-definite $\Sigma$, respectively. Let $dw = \Sigma^{1/2}\,d\tilde z$, and suppose that $z_t$ is independent¹ of $w_t$. $E^{\mathscr B}$ is the expectation conditioned on a σ-field $\mathscr B$, and $\mathscr B(\cdot)$ is the completion of the minimal σ-field over which the random variables in the parentheses are measurable. The function $y_t$ represents observations on the process $x_t$. (From one engineering point of view, the observation is $\dot y = g(x, t) + \xi$, where $\xi$ is white Gaussian noise.) Write $\mathscr B_t = \mathscr B(y_s, s \le t)$, and suppose that there is a conditional probability density $P^{\mathscr B_t}(x, t)$ of $x_t$ conditioned on $\mathscr B_t$. Then the formal results in [2] are that the conditional density and expectation have the representations

$$dP^{\mathscr B_t}(x, t) = P^{\mathscr B_t}(x, t)\,(dy_t - E^{\mathscr B_t}g(x_t, t)\,dt)'\,\Sigma_t^{-1}\,(g(x, t) - E^{\mathscr B_t}g(x_t, t)) + L^*P^{\mathscr B_t}(x, t)\,dt, \tag{3}$$

$$d(E^{\mathscr B_t}h(x_t)) = (dy_t - E^{\mathscr B_t}g(x_t, t)\,dt)'\,\Sigma_t^{-1}\,(E^{\mathscr B_t}[h(x_t)g(x_t, t)] - E^{\mathscr B_t}g(x_t, t)\,E^{\mathscr B_t}h(x_t)) + E^{\mathscr B_t}Lh(x_t)\,dt. \tag{4}$$

$L^*$ is the formal adjoint of the differential generator

$$L = \sum_i f_i(x, t)\frac{\partial}{\partial x_i} + \frac12\sum_{i,j} V_{ij}(x, t)\frac{\partial^2}{\partial x_i \partial x_j}. \tag{5}$$

If $\Sigma_t^{-1} = 0$, then the observations are valueless, and (3) reduces to the Fokker-Planck equation. The right sides of (3) and (4) are linear in the (incremental) observation $dy_t$.

In this paper, we prove (4) under explicit conditions [(A1)-(A11) below] on the $x_t$ process. Note also that the proof is valid if we suppose that the $f$ and $V$ of (1) are general nonanticipative functions, (A1)-(A11) hold, and (A1) is uniform in $\omega$. (3) has not yet been proved without the implicit assumption that $P^{\mathscr B_t}(x, t)$ is sufficiently differentiable and has suitable properties for large $\|x\|$, properties which we have not been able to verify from conditions on the $x_t$ and $y_t$ processes. If $f(x, t) \equiv 0$ and $V(x, t) \equiv 0$, then $L^* = 0$ and (3) may be proved.

From a Bayesian point of view, (4) is a complete description of the optimum filter. It gives, in principle, a recursive method for computing the conditional moments of $x_t$, given the observations $y_s$, $s \le t$; i.e., a particular sample path of a version of $E^{\mathscr B_t}h(x_t)$ may be obtained as a function of time, as the observations become available. (The estimate is the output of a dynamical system whose input is the observation.) As such, it would be expected to be significant from the point of view of the practical problems of filtering. Recursive methods for computing $E^{\mathscr B_t}x_t$ (a linear differential equation whose forcing term is linear in the observation) for the case of linear $f$, $g$, and $V$ independent of $x$ are available [9] and widely used. The practical usefulness of the result depends on the constructability of physical apparatus which provide useful approximations to the system described by (4).

* Part of this research was done while the author was a consultant to the RAND Corp., Santa Monica, California, and part was supported by the United States Air Force through the Air Force Office of Scientific Research under Grant No. AF-AFOSR-693-64.
¹ A derivation may also be carried out without this assumption.
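For orientation, consider the linear case treated in [9]. (The following specialization is an added illustration, assuming a Gaussian conditional density; the computation is standard rather than quoted from the paper.) With $f(x, t) = A_t x$ and $g(x, t) = C_t x$, and with conditional mean $m_t = E^{\mathscr B_t}x_t$ and conditional covariance $P_t$, taking $h(x) = x_i$ in (4) for each $i$ gives

$$E^{\mathscr B_t}[x_t(C_t x_t)'] - m_t(C_t m_t)' = P_t C_t', \qquad E^{\mathscr B_t}Lx = A_t m_t,$$

so that (4) becomes

$$dm_t = A_t m_t\,dt + P_t C_t'\Sigma_t^{-1}(dy_t - C_t m_t\,dt),$$

the Kalman-Bucy mean equation; repeating the procedure for $h(x) = x_i x_j$ closes the system with the Riccati equation $\dot P_t = A_t P_t + P_t A_t' + V - P_t C_t'\Sigma_t^{-1}C_t P_t$.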
The numerical work will be reported on in detail later. Some of the ideas which we feel to be novel and worthwhile are discussed here. For some of the nonlinear systems [(1) and (2)] studied, our methods yield consistently better results than currently existing methods based on linearization and on the use of methods for the linear problem.

In a recent note, Bucy [3] put the problem of [2] in a form in which a formal application of Itô's lemma yields (3). This work, still formal, is less intuitive, but more satisfying mathematically, than our approach in [2]. Some relevant results for Poisson processes are given by Wonham [4]. Although the work in [1] and [2] was independent, Stratonovich [5] had, from a formal point of view, considered the same problem earlier. These results are not consistent with the Itô interpretation of the stochastic integral. However, Stratonovich has recently described a stochastic calculus [8] (somewhat different from Itô's), with respect to which his earlier results must be interpreted. Then the formal continuous-time results in [2] and [5] appear to be equivalent, at least in the scalar case with independent $x_t$ and $w_t$ processes.

2. ASSUMPTIONS

Functions of $t$ only are written with the argument as a subscript. Otherwise we use whatever form appears most convenient.

(A1) The components of $f(\cdot,\cdot)$ and $V^{1/2}(\cdot,\cdot)$ are Baire functions, satisfy a uniform Lipschitz condition in the variable $x$, and are bounded in absolute value by $K(1 + x'x)^{1/2}$ for some real positive number $K$.

(A2) $E\|x_0\|^2 < \infty$, and $x_0$ is independent of $z_s$, $0 \le s \le T$.

(A3) The components of $g(\cdot,\cdot)$ and the scalar-valued $h(\cdot)$ are Baire functions of all their arguments for $0 \le t \le T$, $\|x\| < \infty$.

(A4) $h(\cdot)$ has continuous second partial derivatives at each finite $x$.

(A5) $\Sigma_t$ is positive-definite and continuous at each $t$ in the finite interval $[0, T]$. $\Sigma$ does not depend on $x$ (see the remark at the end of the proof).

(A6) $E\,\phi(x_t)\exp\big[(1+b)\int_0^T g'(x_s, s)\Sigma_s^{-1}g(x_s, s)\,ds\big] < \infty$ and $E\,g'(x_t, t)\Sigma_t^{-1}g(x_t, t) < \infty$ in $[0, T]$, where $\phi(x_t)$ is either $1$ or $|h(x_t)|^{1+b}$, for some $b > 0$.

(A7) $E|h(x_t)| < \infty$, $t \le T$.

(A8) $E|Lh(x_t)| < \infty$ and $E\|h(x_t)g(x_t, t)\| < \infty$, $t \le T$.

(A9) The $z_t$ process is independent of the $w_t$ process.²

(A10) $\int_0^T E\,q_t^2\,g'(x_t, t)\Sigma_t^{-1}g(x_t, t)\exp\big[3\int_0^T g'(x_s, s)\Sigma_s^{-1}g(x_s, s)\,ds\big]\,dt < \infty$, where $q_t = h(x_t)$ or $1$.

(A11) $\int_0^T E|Lh(x_t)|\exp\big[\int_0^T g'(x_s, s)\Sigma_s^{-1}g(x_s, s)\,ds\big]\,dt < \infty$.

² Introduced only to allow a simpler proof.

3. PROOF OF EQ. (4)

THEOREM. Assume (A1) to (A11). Then a version of $E^{\mathscr B_t}h(x_t)$ satisfies the stochastic differential equation (4).

Proof. (1) Under (A1), the process $x_t$, $t \le T$, is defined and continuous with probability one (w.p.1). Fix $t \le T$ until mentioned otherwise. For each positive integer $k$, define the partition of $[0, t]$: $0 = t_{k0} < t_{k1} < \cdots < t_{k(n_k+1)} = t$, with $I_{ki} = \{s : t_{k,i+1} > s \ge t_{ki}\}$ and

$$S_{ki} = \int_{I_{ki}}\Sigma_s\,ds, \qquad \delta y_{ki} = y_{t_{k,i+1}} - y_{t_{ki}}, \qquad G_{ki} = \int_{I_{ki}} g(x_s, s)\,ds, \qquad \delta w_{ki} = \int_{I_{ki}} dw_s = \int_{I_{ki}}\Sigma_s^{1/2}\,d\tilde z_s.$$

Then $\delta y_{ki} = G_{ki} + \delta w_{ki}$. Write $\mathscr B_k = \mathscr B(\delta y_{k1}, \ldots, \delta y_{kn_k})$, $\mathscr B_t = \mathscr B(y_s, s \le t)$, and $\mathscr G_k = \mathscr B(G_{k1}, \ldots, G_{kn_k})$. The $\delta y_{ki}$, $i \le n_k$, are conditionally independent (with respect to $\mathscr G_k$) normally distributed random variables with means $G_{ki}$ and finite variances:

$$P^{\mathscr G_k}(\delta y^k \in A) = C\int_A N(G^k, S^k, a)\,da,$$

where $C$ is a normalizing constant, $a = (a_1, \ldots, a_{n_k})$, $G^k = \{G_{ki},\, i \le n_k\}$, $\delta y^k = \{\delta y_{ki},\, i \le n_k\}$, and

$$N(G^k, S^k, a) = \exp\Big[-\tfrac12\sum_{i=1}^{n_k}(a_i - G_{ki})'S_{ki}^{-1}(a_i - G_{ki})\Big].$$

Let $\hat z_s$ be a Wiener process independent of $z_s$ and $\tilde z_s$, and let $\tilde x_s$, $s \le T$, correspond to $\hat z_s$ via (1).
Then the processes $\tilde x_s$ and $x_s$ are independent, but have the same distribution. Define $\tilde G_{ki} = \int_{I_{ki}} g(\tilde x_s, s)\,ds$ and $\tilde G^k = \{\tilde G_{ki},\, i \le n_k\}$. Let $P(dG^k)$ and $P(dG^k \times dx_t)$ be the measures on the Euclidean range spaces of $G^k$ and of the pair $(G^k, x_t)$, respectively. Considered as an $\omega$ function (since $\delta y^k$ is an $\omega$ function), $H_t^k$ is obviously a version of $E^{\mathscr B_k}h(x_t)$:

$$H_t^k = \frac{\iint h(x_t)\,N(G^k, S^k, \delta y^k)\,P(dG^k \times dx_t)}{\int N(G^k, S^k, \delta y^k)\,P(dG^k)}. \tag{6}$$
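(Added gloss: (6) is simply Bayes' rule for the discretized observations. Conditioned on the signal path, and hence on $G^k$, the increments $\delta y_{ki}$ are normal with means $G_{ki}$ and covariances $S_{ki}$, so

$$E^{\mathscr B_k}h(x_t) = \frac{E\big[h(x_t)\,N(G^k, S^k, \delta y^k)\big]}{E\big[N(G^k, S^k, \delta y^k)\big]}\bigg|_{\delta y^k\ \text{held fixed}},$$

the normalizing constant $C$ cancelling between numerator and denominator; the independent copy $\tilde x$ introduced above merely carries the integration over the law of the signal while $\delta y^k$ stays fixed.)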
Since $\delta y^k$ is held fixed in the integrations in (6), we may change notation to a more convenient form by substituting $\tilde x_t$ and $\tilde G^k$ for $x_t$ and $G^k$, respectively. Then, w.p.1 (recall that $\tilde x_t$ has the same law as $x_t$, but is independent of $x_s$, $s \le T$),

$$H_t^k = \frac{E^{\mathscr B_k}h(\tilde x_t)N(\tilde G^k, S^k, \delta y^k)}{E^{\mathscr B_k}N(\tilde G^k, S^k, \delta y^k)} = E^{\mathscr B_k}h(x_t). \tag{7}$$

We now multiply both terms of (7) by the $\mathscr B_k$-measurable function $\exp\big[\tfrac12\sum_i \delta y'_{ki}S_{ki}^{-1}\delta y_{ki}\big]$ (which is finite w.p.1), yielding

$$H_t^k = E^{\mathscr B_k}h(x_t) = E^{\mathscr B_k}h(\tilde x_t)\exp R_k \,/\, E^{\mathscr B_k}\exp R_k, \qquad R_k = \sum_i \delta y'_{ki}S_{ki}^{-1}\tilde G_{ki} - \tfrac12\sum_i \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki}. \tag{8}$$

Define

$$H_t = E^{\mathscr B_t}h(\tilde x_t)\exp R_t \,/\, E^{\mathscr B_t}\exp R_t, \qquad R_t = \int_0^t g'(\tilde x_s, s)\Sigma_s^{-1}\,dy_s - \tfrac12\int_0^t g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds. \tag{9}$$

(2) As $k$ increases, let $\mathscr B_k \subset \mathscr B_{k+1}$ and $\max_i(\text{length of }I_{ki}) \to 0$. Owing to the w.p.1 continuity of $y_s$ on $[0, T]$, $\mathscr B_k \uparrow \mathscr B_t = \bigcup_k \mathscr B_k$.³ Next we prove that $H_t^k \to H_t$ (w.p.1) and that $H_t$ is a version of $E^{\mathscr B_t}h(x_t)$; $t$ is still fixed. The sequence of conditional expectations $H_t^k$ is a martingale, and $E|H_t^k| \le E|h(x_t)| < \infty$. By the martingale convergence theorem there is a $\mathscr B_t$-measurable random variable $\eta$, with $E|\eta| \le E|h(x_t)|$, such that $H_t^k \to \eta$ w.p.1 as $k \to \infty$ and $H_t^1, H_t^2, \ldots, \eta$ is a martingale. In fact, we also have $E^{\mathscr B_t}h(x_t) = \eta$.

³ $\bigcup_k \mathscr B_k$ is the completion of the minimal σ-field containing $\mathscr B_1, \ldots$.

(3) By (A3) and (A4), and for small $\delta = t_{k,i+1} - t_{ki}$, we write

$$S_{ki}^{-1} = \frac1\delta\big(\Sigma_{t_{ki}}^{-1/2} + \varepsilon_1(\delta)\big)\big(\Sigma_{t_{ki}}^{-1/2} + \varepsilon_1(\delta)\big),$$

where $\varepsilon_1(\delta)$ is uniformly small in $i$. Then

$$M_{ki} = \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki} = \frac1\delta\int_{I_{ki}} g'(\tilde x_s, s)\big(\Sigma_{t_{ki}}^{-1/2} + \varepsilon_1(\delta)\big)\,ds\int_{I_{ki}}\big(\Sigma_{t_{ki}}^{-1/2} + \varepsilon_1(\delta)\big)g(\tilde x_s, s)\,ds.$$

(A4) and Fubini's theorem imply that, as a function of $s$, $g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)$ is integrable (w.p.1) on $[0, T]$. The Schwarz inequality yields

$$M_{ki} \le \int_{I_{ki}} g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds\,(1 + \varepsilon(\delta)), \tag{10}$$

where $\varepsilon(\delta) \to 0$ as $\delta \to 0$. The sequence of functions $M^k$ with values $M_{ki}/\delta$ on $I_{ki}$ tends to $g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)$ almost everywhere on $[0, T]$ (w.p.1). Then an application of Fatou's lemma yields

$$\liminf_k \sum_i M_{ki} = \liminf_k \int_0^t M_s^k\,ds \ge \int_0^t g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds, \tag{11}$$

which implies equality in the limit in (10) (w.p.1). Similarly it may be shown that the other sums in $R_k$ converge w.p.1 to the corresponding integrals in $R_t$. For small $b$, by the definition of $R_k$,

$$E|h(\tilde x_t)\exp R_k|^{1+b} = E\,|h(\tilde x_t)|^{1+b}\exp\Big[(1+b)\sum_i \delta w'_{ki}S_{ki}^{-1}\tilde G_{ki}\Big]\exp\Big[(1+b)\Big(\sum_i G'_{ki}S_{ki}^{-1}\tilde G_{ki} - \tfrac12\sum_i \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki}\Big)\Big] \le E\,|h(\tilde x_t)|^{1+b}\exp\Big[\tfrac12(1+b)^2\sum_i \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki} + \tfrac12(1+b)\sum_i G'_{ki}S_{ki}^{-1}G_{ki}\Big]. \tag{12}$$

The last step in (12) makes use of the facts that the expectation of $\exp\big[(1+b)\sum_i \delta w'_{ki}S_{ki}^{-1}\tilde G_{ki}\big]$, given the $G_{ki}$ and $\tilde G_{ki}$, is $\exp\big[\tfrac12(1+b)^2\sum_i \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki}\big]$, and also of the inequality $2G'_{ki}S_{ki}^{-1}\tilde G_{ki} \le G'_{ki}S_{ki}^{-1}G_{ki} + \tilde G'_{ki}S_{ki}^{-1}\tilde G_{ki}$. By (A4) and (10), the integrand on the right side of (12) is uniformly bounded by an integrable function, and we may conclude that the $h(\tilde x_t)\exp R_k$ are in $L_r$ for some $r > 1$ and are uniformly integrable. Since, in addition, $\exp R_k \to \exp R_t$ w.p.1, $h(\tilde x_t)\exp R_k \to h(\tilde x_t)\exp R_t$ in $L_r$, $r > 1$. Thus, since $\mathscr B_k \uparrow \mathscr B_t$, $E^{\mathscr B_k}h(\tilde x_t)\exp R_k \to E^{\mathscr B_t}h(\tilde x_t)\exp R_t$ in probability (Loève [12], p. 409, para. 10a). Similarly $E^{\mathscr B_k}\exp R_k \to E^{\mathscr B_t}\exp R_t$ in probability. Thus $H_t^k \to H_t$ in probability. Since limits in probability and w.p.1 limits are the same (w.p.1), $H_t^k \to H_t$ w.p.1.

(4) Now we show that (9) satisfies (4) for each $t$ (w.p.1). We use the martingale definition of the stochastic integral. The maximum values (in $[0, T]$) of the ordinary integrals in $R_t$ are finite w.p.1. Since $\int_0^T E\,g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds < \infty$, the stochastic integral $\int_0^t g'(\tilde x_s, s)\Sigma_s^{-1/2}\,d\tilde z_s$ is continuous on $[0, T]$ w.p.1 and, hence, is bounded there w.p.1. Thus $\infty > R_t > -\infty$ and $\infty > \exp R_t > 0$ for all $t$ in $[0, T]$ w.p.1.
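(A one-line check, added here, of the Itô differential used next: with $dR_s = g'(\tilde x_s, s)\Sigma_s^{-1}dy_s - \tfrac12 g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds$ and $(dy_s)(dy_s)' = \Sigma_s\,ds$,

$$d(\exp R_s) = \exp R_s\big[dR_s + \tfrac12(dR_s)^2\big] = \exp R_s\big[g'\Sigma_s^{-1}dy_s - \tfrac12 g'\Sigma_s^{-1}g\,ds + \tfrac12 g'\Sigma_s^{-1}g\,ds\big] = \exp R_s\,g'(\tilde x_s, s)\Sigma_s^{-1}dy_s;$$

the drift cancels, which is why (13) below contains no $ds$ term.)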
Now, since the function $\exp(u)$ is twice continuously differentiable at each $u \in (-\infty, \infty)$, and $-\infty < R_t < \infty$ w.p.1, Itô's lemma (see [7], Theorem 7.2) implies that $\exp R_t$ is a stochastic integral, with

$$\exp R_t = \exp R_0 + \int_0^t \frac{\partial(\exp R_s)}{\partial R_s}\,dR_s + \frac12\int_0^t \frac{\partial^2(\exp R_s)}{\partial R_s^2}\,g'(\tilde x_s, s)\Sigma_s^{-1}g(\tilde x_s, s)\,ds = \exp R_0 + \int_0^t (\exp R_s)\,g'(\tilde x_s, s)\Sigma_s^{-1}\,dy_s, \tag{13}$$

where $dy_s = \Sigma_s^{1/2}\,d\tilde z_s + g(x_s, s)\,ds$. Since $h(x)$ has continuous second derivatives and, by (A1), $\max_{s \le T}\|\tilde x_s\|^2 < \infty$ w.p.1, $h(\tilde x_t)$ is also a stochastic integral,

$$h(\tilde x_t) - h(\tilde x_0) = \int_0^t Lh(\tilde x_s)\,ds + \int_0^t (\operatorname{grad} h(\tilde x_s))'\,V^{1/2}(\tilde x_s, s)\,d\hat z_s.$$

$(\exp R_t)h(\tilde x_t)$ is also a stochastic integral. Using the independence of $w_s$ and $\tilde x_s$,

$$h(\tilde x_t)\exp R_t - h(\tilde x_0)\exp R_0 = \int_0^t (\exp R_s)h(\tilde x_s)g'(\tilde x_s, s)\Sigma_s^{-1}\,dy_s + \int_0^t (Lh(\tilde x_s))\exp R_s\,ds + \int_0^t (\exp R_s)(\operatorname{grad} h(\tilde x_s))'\,V^{1/2}\,d\hat z_s. \tag{14}$$

By (A10) and the independence of $\tilde x_s$ and $y_s$, $s \le T$, the expectation with respect to $\mathscr B_t$ of the last term in (14) is zero w.p.1. The term will be omitted henceforth. It will be proved in part (5) that for all integrals in (13) and (14) we have

$$E^{\mathscr B_t}\int_0^t k_s\,dy_s = \int_0^t [E^{\mathscr B_s}k_s]\,dy_s \quad\text{and}\quad E^{\mathscr B_t}\int_0^t k_s\,ds = \int_0^t [E^{\mathscr B_s}k_s]\,ds \quad\text{w.p.1},$$

and that the maximum values (in $[0, T]$) of the conditional expectation of each term in (14) with respect to $\mathscr B_t$ are finite w.p.1. We also have $E^{\mathscr B_t}\exp R_t > 0$ w.p.1. With this interchange of the order of the ordinary and stochastic integration with the conditional expectation, $H_t$ is the ratio of stochastic integrals; $H_t = A_t/B_t = E^{\mathscr B_t}h(\tilde x_t)\exp R_t \,/\, E^{\mathscr B_t}\exp R_t$, where

$$dA_s = [E^{\mathscr B_s}(\exp R_s)Lh(\tilde x_s)]\,ds + [E^{\mathscr B_s}(\exp R_s)h(\tilde x_s)g'(\tilde x_s, s)]\Sigma_s^{-1}\,dy_s, \qquad dB_s = [E^{\mathscr B_s}(\exp R_s)g'(\tilde x_s, s)]\Sigma_s^{-1}\,dy_s. \tag{15}$$

Applying Itô's lemma to the ratio $A_t/B_t = H_t$ yields

$$H_t = H_0 + \int_0^t\Big[\frac{\partial H_s}{\partial A_s}\,dA_s + \frac{\partial H_s}{\partial B_s}\,dB_s + \frac{\partial^2 H_s}{\partial A_s\,\partial B_s}(dA_s)(dB_s) + \frac12\frac{\partial^2 H_s}{\partial B_s^2}(dB_s)(dB_s)\Big].$$

Thus

$$\begin{aligned}
dH_t ={}& \big\{[E^{\mathscr B_t}(\exp R_t)h(\tilde x_t)g'(\tilde x_t, t)]\Sigma_t^{-1}dy_t + E^{\mathscr B_t}[(\exp R_t)Lh(\tilde x_t)]\,dt\big\}/B_t \\
&- [E^{\mathscr B_t}h(\tilde x_t)\exp R_t]\,[E^{\mathscr B_t}(\exp R_t)g'(\tilde x_t, t)]\Sigma_t^{-1}dy_t/B_t^2 \\
&- [E^{\mathscr B_t}(\exp R_t)h(\tilde x_t)g'(\tilde x_t, t)\Sigma_t^{-1}]\cdot[E^{\mathscr B_t}g(\tilde x_t, t)\exp R_t]\,dt/B_t^2 \\
&+ [E^{\mathscr B_t}h(\tilde x_t)\exp R_t]\,[E^{\mathscr B_t}(\exp R_t)g'(\tilde x_t, t)]\Sigma_t^{-1}[E^{\mathscr B_t}g(\tilde x_t, t)\exp R_t]\,dt/B_t^3.
\end{aligned}$$

Since $E^{\mathscr B_t}k(\tilde x_t)\exp R_t/B_t = E^{\mathscr B_t}k(x_t)$ whenever $E|k(x_t)| < \infty$, we obtain finally

$$dE^{\mathscr B_t}h(x_t) = (dy_t - E^{\mathscr B_t}g(x_t, t)\,dt)'\Sigma_t^{-1}(E^{\mathscr B_t}[h(x_t)g(x_t, t)] - E^{\mathscr B_t}h(x_t)\,E^{\mathscr B_t}g(x_t, t)) + E^{\mathscr B_t}Lh(x_t)\,dt,$$

which is (4). It only remains to prove the statement in the second paragraph below (14).

(5) Let $D_s$ be a vector-valued measurable $(s, \omega)$ function which is Lebesgue-measurable for almost all fixed $\omega$. Let $D_s$ be independent of $\tilde z_t - \tilde z_s$, all $t > s$. Let $D_s$, $s \le t$, be measurable over the σ-field $\mathscr A_s$, and let

$$E^{\mathscr A_t}D_s = E^{\mathscr A_s}D_s \tag{16}$$

w.p.1, $t > s$. First we show that if

$$\int_0^T E\,D'_s D_s\,ds < \infty, \tag{17}$$

then, w.p.1, for each $t$,

$$E^{\mathscr A_t}\int_0^t D'_s\,d\tilde z_s - \int_0^t [E^{\mathscr A_s}D_s]'\,d\tilde z_s = 0. \tag{18}$$

Under (16), (18) is obviously true if $D_s$ is a step function with fixed points of discontinuity $s = t_1, \ldots$. By (17) we may approximate $D_s$ by a sequence of nonanticipative right-continuous step functions $D_s^n$ satisfying (see Doob [6], IX, pp. 440-441)

$$\int_0^T E\,(D_s - D_s^n)'(D_s - D_s^n)\,ds \le 2^{-n}. \tag{19}$$

Finally, it is straightforward to prove that, w.p.1,

$$E^{\mathscr A_t}\int_0^t D'_s\,d\tilde z_s = \lim_n E^{\mathscr A_t}\int_0^t (D_s^n)'\,d\tilde z_s = \lim_n \int_0^t [E^{\mathscr A_s}D_s^n]'\,d\tilde z_s = \int_0^t [E^{\mathscr A_s}D_s]'\,d\tilde z_s.$$

By a similar argument, if

$$\int_0^T E\|D_s\|\,ds < \infty, \tag{20}$$

then, w.p.1,

$$E^{\mathscr A_t}\int_0^t D_s\,ds = \int_0^t [E^{\mathscr A_s}D_s]\,ds. \tag{21}$$

Let $\mathscr A_t = \mathscr B(y_s, \tilde x_s, s \le t)$. Note that $\tilde z_t - \tilde z_s$ is independent of all random variables which are measurable over $\mathscr A_s$. The integrands of all the integrals of (22) satisfy (16):

$$\begin{aligned}
&\text{(a)}\quad \int_0^t (\exp R_s)\,g'(\tilde x_s, s)\,\Sigma_s^{-1/2}\,d\tilde z_s, \\
&\text{(b)}\quad \int_0^t (\exp R_s)\,h(\tilde x_s)\,g'(\tilde x_s, s)\,\Sigma_s^{-1/2}\,d\tilde z_s, \\
&\text{(c)}\quad \int_0^t (\exp R_s)\,g'(\tilde x_s, s)\,\Sigma_s^{-1}\,g(x_s, s)\,ds, \\
&\text{(d)}\quad \int_0^t (\exp R_s)\,h(\tilde x_s)\,g'(\tilde x_s, s)\,\Sigma_s^{-1}\,g(x_s, s)\,ds, \\
&\text{(e)}\quad \int_0^t (\exp R_s)\,Lh(\tilde x_s)\,ds.
\end{aligned} \tag{22}$$

Under (A8) the integrands in (22a) and (22b) satisfy (17).
Under (A9) and (A11) the integrands in (22c), (22d), and (22e) satisfy (20). Hence, for these integrands the appropriate result, either (18) or (21), is true. The fact that (22a) and (22b) are martingales, together with (A8)-(A11), implies that the maxima, over $t \le T$, of the expectations of all terms in (22), conditional on $\mathscr A_t$, are finite w.p.1. By adding the results for (22a) and (22c) and for (22b) and (22d),

$$\text{(a)}\quad E^{\mathscr A_t}\int_0^t (\exp R_s)g'(\tilde x_s, s)\Sigma_s^{-1}\,dy_s = \int_0^t [E^{\mathscr A_s}(\exp R_s)g'(\tilde x_s, s)]\Sigma_s^{-1}\,dy_s,$$

$$\text{(b)}\quad E^{\mathscr A_t}\int_0^t (\exp R_s)h(\tilde x_s)g'(\tilde x_s, s)\Sigma_s^{-1}\,dy_s = \int_0^t [E^{\mathscr A_s}(\exp R_s)h(\tilde x_s)g'(\tilde x_s, s)]\Sigma_s^{-1}\,dy_s, \tag{23}$$

$$\text{(c)}\quad E^{\mathscr A_t}\int_0^t (\exp R_s)Lh(\tilde x_s)\,ds = \int_0^t E^{\mathscr A_s}[(\exp R_s)Lh(\tilde x_s)]\,ds.$$

Now note that, on the right sides of (23), the expectation $E^{\mathscr A_s}$ is equivalent to the expectation $E^{\mathscr B_s}$, and the Theorem is proved.

Remark. The case where $\Sigma_s$ is a function of $x$ is degenerate. The value $\Sigma(0, x_0)$ at time 0 may be determined by observing $y_s$, $0 \le s \le \tau$, where $\tau$ is arbitrarily small. Divide $[0, \tau_k]$ into $N_k$ units of length $\Delta_k$, $N_k\Delta_k = \tau_k$. Form

$$Q_k = \frac1{N_k}\sum_{i=0}^{N_k-1}\frac{(y_{(i+1)\Delta_k} - y_{i\Delta_k})(y_{(i+1)\Delta_k} - y_{i\Delta_k})'}{\Delta_k}.$$

Under mild additional hypotheses, it can be shown, via the strong law of large numbers, that, as $\tau_k \to 0$, $\Delta_k \to 0$, $N_k \to \infty$, $Q_k \to \Sigma(0, x_0)$ w.p.1. The problem is essentially one of computing the variance of a normally distributed variate when infinitely many independent observations are available. The degeneracy arises owing to the fact that the observation noise is white.

5. REMARKS ON CONSTRUCTION OF PHYSICAL SYSTEMS CORRESPONDING TO EQ. (4)

Let $h(x_t) = x_{it}$, the $i$th component of the vector $x_t$. Then $Lh(x) = f_i(x, t)$ and the equation for the conditional mean $E^{\mathscr B_t}x_{it} = m_{it}$ is easily obtained from (4):

$$dm_{it} = (dy_t - E^{\mathscr B_t}g(x_t, t)\,dt)'\Sigma_t^{-1}(E^{\mathscr B_t}(x_{it}g(x_t, t)) - m_{it}E^{\mathscr B_t}g(x_t, t)) + E^{\mathscr B_t}f_i(x_t, t)\,dt.$$

Similarly, the equations for $c_{ijt} = E^{\mathscr B_t}x_{it}x_{jt}$ can be obtained. Then the equations for the covariances $m_{ijt} = E^{\mathscr B_t}(x_{it} - m_{it})(x_{jt} - m_{jt})$ are obtained from $d[c_{ijt} - m_{it}m_{jt}]$ and Itô's lemma (to obtain the differential of the product of stochastic integrals $m_{it}m_{jt}$). This procedure is valid and can be carried further (higher moments obtained) provided that (A1)-(A11) hold for the necessary $h(x)$ functions.

Let (A1)-(A11), corresponding to the functions $h(x) = \{h_i(x),\, i = 1, \ldots, M\}$, hold. First assume (B1): the right sides of the equations (4) for $d(E^{\mathscr B_t}h_i(x_t))$, $i = 1, \ldots, M$, involve only functions of $E^{\mathscr B_t}h(x_t)$; i.e., $E^{\mathscr B_t}g(x_t, t) = F(E^{\mathscr B_t}h(x_t))$ for some function $F(\cdot)$, etc. Then the system (4) of equations for the $d(E^{\mathscr B_t}h_i(x_t))$ has the usual form of the vector Itô equation. If the uniform Lipschitz and growth conditions are satisfied, the system (4) has a unique solution which is a version of the conditional expectation of the vector $h(x_t)$.

Let samples $\delta y_i = y_{t_{i+1}} - y_{t_i}$ be available for small $\Delta_i = t_{i+1} - t_i$. Then, by writing all differentials in the Itô equation as finite differences, the Itô equation transforms into a difference equation. If a continuous-parameter process is obtained (from the solution of the difference equation) by a suitable interpolation, then, as the difference interval goes to zero, the result of the interpolation converges to the solution of the Itô equation w.p.1, for each $t$ [11]. This suggests that, for sufficiently small $\Delta_i$, the difference scheme, applied to (4), would yield a useful approximation to the conditional expectation (a numerical sketch follows below). Dynamical systems for constructing the sample solutions of the Itô equation do not appear to be available.
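To make the difference scheme concrete, here is a minimal numerical sketch (ours, not the paper's; the model, parameter values, and variable names are illustrative assumptions) for the scalar linear case, where (B1) holds exactly and the moment equations obtained from (4) are those of [9]:

```python
import numpy as np

# Minimal sketch: Euler (finite-difference) form of Eq. (4) for the scalar
# linear model dx = a*x dt + sqrt(V) dz, dy = c*x dt + sqrt(Sigma) dw,
# where the conditional density is Gaussian and (B1) holds exactly:
#   dm = a*m dt + (P*c/Sigma) * (dy - c*m dt)
#   dP/dt = 2*a*P + V - (P*c)**2 / Sigma
rng = np.random.default_rng(0)
a, c, V, Sigma = -1.0, 1.0, 0.5, 0.1   # illustrative model parameters
dt, T = 1e-3, 5.0

x = 1.0          # true state
m, P = 0.0, 1.0  # filter mean and variance (initial guesses)
for _ in range(int(T / dt)):
    # simulate the signal and the incremental observation delta_y
    x += a * x * dt + np.sqrt(V * dt) * rng.standard_normal()
    dy = c * x * dt + np.sqrt(Sigma * dt) * rng.standard_normal()
    # difference-scheme update of the moment equations from (4)
    m += a * m * dt + (P * c / Sigma) * (dy - c * m * dt)
    P += (2 * a * P + V - (P * c) ** 2 / Sigma) * dt

print(f"true x = {x:+.3f}, filter mean m = {m:+.3f}, variance P = {P:.4f}")
```

The update for $m$ is exactly the finite-difference form of (4) with $h(x) = x$; the observation increment $\delta y$ enters linearly, as noted in Section 1.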
Although the introduction of the Wiener process $w_t$ seems necessary for careful theoretical work, the true physical observation may be of the form $g(x_t, t) + \psi_t$, where $\psi_t$ is a well-defined process, unlike $dw_t/dt$. If $\int_0^t \psi_s\,ds$ has a distribution close to that of $w_t$, but $\psi_t$ is still a well-defined process, then, using (4), valid or not, we may divide (4) by $dt$ ($dy/dt = g + \psi$) and obtain a differential equation. Since the right side of (4) contains, by (B1), only functions of $E^{\mathscr B_t}h(x_t)$, a dynamical system corresponding to the resulting equation may now be built. The observation $dy/dt$ occurs as a driving term. One would like to assert that the solution process approximates the process $E^{\mathscr B_t}h(x_t)$. The validity of such an assertion is closely connected to the relation between the solution of Itô's equation using Itô's constructive method, and the solution when the rules of ordinary integration are used. We mention only that a very similar question, on the relation between solutions to equations interpreted in the ordinary and in the Itô sense, has been treated in [10]. The general conclusion is that, if $\int_0^t \psi_s\,ds$ is close to $w_t$ in distribution, then to each equation interpreted in the Itô sense there is a second equation, possibly containing extra terms, to which the application of the ordinary calculus yields a solution with a distribution close to that of the solution to the Itô equation. This question is also related to the difference in results between [2] and [5].

Now drop Assumption (B1). Then the right side of (4) contains terms $E^{\mathscr B_t}Q$ which are not functions of $E^{\mathscr B_t}h(x_t)$. A number of interesting possibilities for approximation of the $E^{\mathscr B_t}Q$ arise. Some of these will hopefully be discussed elsewhere, in connection with results of some current numerical and experimental studies. There are the obvious approaches of either neglecting such terms or using an approximation (e.g., by a truncated Taylor series) of $E^{\mathscr B_t}Q$ in terms of $E^{\mathscr B_t}h(x_t)$. Both involve serious pitfalls where the nonlinearities in $g$, $f$, and $V$ are large. A seemingly promising discrete-parameter type of approximation, which is currently under study, follows; namely, compute the equations for $dm_{it}$, $dm_{ijt}$, and, perhaps, $dm_{ijkt}$. Convert the set to finite-difference form. Arbitrarily assume a multivariate distribution $D_n$, e.g., normal, uniform, etc. Compute the parameters of the distribution from $m_{it_n}$, $m_{ijt_n}$, and perhaps $m_{ijkt_n}$. Then compute the necessary expectations of all terms $Q$ with respect to $D_n$ ($E_{D_n}Q$), and let $E_{D_n}Q$ replace $E^{\mathscr B_{t_n}}Q$. Then compute the conditional moments at $t_{n+1}$, etc. Distributions $D_n$, suitable for the problem, must, of course, be found. (A numerical sketch of this scheme, with a normal $D_n$, follows.)
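As a concrete instance of this scheme (again ours, with an assumed normal $D_n$ and an illustrative scalar model $dx = -x^3\,dt + V^{1/2}\,dz$, $dy = x\,dt + dw$, $\operatorname{Var}(dw) = \Sigma\,dt$):

```python
import numpy as np

# Sketch of the assumed-density scheme: moment equations from Eq. (4) for
# h(x) = x and h(x) = x**2, with the expectations E_Q that are not functions
# of the first two moments evaluated under an assumed normal D_n = N(m, P).
# Illustrative model:  dx = -x**3 dt + sqrt(V) dz,  dy = x dt + sqrt(Sigma) dw.
# Gaussian moments:  E x^3 = m^3 + 3 m P,  E x^4 = m^4 + 6 m^2 P + 3 P^2.
rng = np.random.default_rng(1)
V, Sigma = 0.2, 0.05
dt, T = 1e-3, 4.0

x = 1.5          # true state
m, c = 0.0, 1.0  # conditional mean and second moment E x^2
for _ in range(int(T / dt)):
    x += -x**3 * dt + np.sqrt(V * dt) * rng.standard_normal()
    dy = x * dt + np.sqrt(Sigma * dt) * rng.standard_normal()

    P = max(c - m**2, 1e-12)             # covariance from the moments
    Ex3 = m**3 + 3 * m * P               # E_D x^3 under N(m, P)
    Ex4 = m**4 + 6 * m**2 * P + 3 * P**2  # E_D x^4 under N(m, P)
    innov = dy - m * dt                  # dy - E g dt, with g(x) = x
    # h = x:    Lh = -x^3;          gain term E(x*x) - m*m = P
    dm = -Ex3 * dt + (P / Sigma) * innov
    # h = x^2:  Lh = -2 x^4 + V;    gain term E(x^2 * x) - c*m = Ex3 - c*m
    dc = (V - 2 * Ex4) * dt + ((Ex3 - c * m) / Sigma) * innov
    m, c = m + dm, c + dc

print(f"true x = {x:+.3f}, estimate m = {m:+.3f}, P = {c - m*m:.4f}")
```

Here the terms $E^{\mathscr B_t}x_t^3$ and $E^{\mathscr B_t}x_t^4$, which are not functions of the first two conditional moments, are replaced by their values under $D_n = N(m, P)$, which is exactly the substitution of $E_{D_n}Q$ for $E^{\mathscr B_{t_n}}Q$ described above.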
REFERENCES

1. Kushner, H. J., On the dynamical equations of conditional probability density functions, with applications to optimal stochastic control theory. J. Math. Anal. Appl. 8 (1964), 332-344.
2. Kushner, H. J., On the differential equations satisfied by conditional probability densities of Markov processes. J. SIAM Control, Ser. A 2 (1964), 106-119.
3. Bucy, R. S., Nonlinear filtering theory. IEEE Trans. Automatic Control 10 (1965), 198-199.
4. Wonham, W. M., Some applications of stochastic differential equations to optimal nonlinear filtering. J. SIAM Control, Ser. A 2 (1964), 347-369.
5. Stratonovich, R. L., Conditional Markov processes. Theory Prob. Appl. 5 (1960), 156-178.
6. Doob, J. L., Stochastic Processes. Wiley, New York, 1953.
7. Dynkin, E. B., Markov Processes. Springer-Verlag, Berlin, 1965.
8. Stratonovich, R. L., A new representation for stochastic integrals and equations. Vestn. Moskov. Univ., Ser. Mat. Mekh. 1 (1964), 3-12.
9. Kalman, R. E., and Bucy, R. S., New results in linear filtering and prediction theory. Trans. ASME, J. Basic Engr. 83D (1961), 95-108.
10. Wong, E., and Zakai, M., On the convergence of ordinary integrals to stochastic integrals. Ann. Math. Stat. 36 (1965), 1560-1564.
11. Maruyama, G., Continuous Markov processes and stochastic equations. Rend. Circ. Mat. Palermo 4 (1955), 48-90.
12. Loève, M., Probability Theory, 3rd ed. Van Nostrand, Princeton, New Jersey, 1963.