The Kalman Filter: State-Space Derivation For Mass-Spring-Damper System

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

The Kalman Filter

Jacob Bishop December 10, 2007

This is an introductory paper to the Kalman lter. It is meant to be a summary of the essential information as I would have liked to have found it, but could't. It employs the familiar mass-spring-damper system and is an example driven introduction to the Kalman lter. A model of the system is developed in state-space form. The Kalman lter equations are then introduced, and a simple explanation of the Kalman lter is oered. An algorithm for the implementation of the Kalman lter is provided, along with a sample MATLAB code, allowing the reader to gain an understanding of the Kalman lter through experimentation. Results from dierent input conditions are presented as a reference, and explanations are provided. The reader is encouraged to experimentally duplicate results provided, and then to continue experimenting.

Abstract

State-Space Derivation for Mass-Spring-Damper System:


Consider the system shown in Figure 1 below. As seen, this system consists of a single mass m, with a spring and damper connected in parallel. The spring and damping coecients are represented by k and m respectively. The next task is to develop the equations of motion that describe this system. First, we draw the free-body diagram. This is shown in Figure 2.

Next, we write the equations of motion that govern this system. This is shown in

Equation 1.

F z

= ma

(1)

Figure 1: Mass-Spring Damper System

Figure 2: Free-Body Diagram Then, the equation is rewritten, substituting the specic parameters of our system. This yields
tion 2. Equa-

mz =
Re-arranging, we get

bz kz + u

(2)

z =

k b +u m m

(3)

As seen, this is a second-order ordinary dierential equation. A common technique, which puts the equations in a more convenient form to solve numerically, is to re-write a higher order dierential equation (or system of dierential equations) as a system of rst-order dierential equations in matrix form. This form is called the state-space representation of a system [3] and in general is written as shown in Equation 4 and Equation 5.

x = Ax + Bu y = Cx + Du

(4) (5)

The column vector x represents the states of the system, and the row vector y represents the ouput of the system. Thus, for our system, we let x1 = z and x2 = z . Furthermore, we let the output of the system be z , the position of the mass (which is also what we are able to measure). Using these substitutions, we put the system into state-space form. This is given in Equation 6 and Equation 7.

x 1 x 2 y1 y2

= =

b m 0 0

k m 1 x1 x2

x1 x2 + (0) u

1 m 0

(6) (7)

Explanation of the Kalman lter:


For a real situation, the goal is to know exactly what is going on with the system all of the time. This is equivalent to wanting to know what all of the states (values in the vector x) are all of the time. We would naturally just say, "get a sensor, and measure them." However, we can seldom measure states directly, and even when we can, we can't usually measure all states directly. For our example, suppose that we can measure only y, which in the example presented means that we can measure position,

but not velocity. Assuming we need to know velocity, this presents a problem. To solve this problem, we create what is known as an observer. An observer can be thought of as a black box that takes whatever information about the system is available (i.e. the measurements) and spits out the states. There are many dierent ways to create an observer, but the one we discuss here is known as the Kalman lter. The Kalman lter is of particular interest because it can be shown that, under certain assumptions, the Kalman lter is the optimal observer. In other words, under the conditions of the assumptions, it is the observer that will estimate the state with the least squared error on average. Here, we only present the Kalman lter and try to help the reader develop an understanding of how it behaves, rather than give rigorous derivation of the lter and detailed proof that it is what it claims to be. The Kalman lter equations come in several equivalent forms; the form we include here is the continuous-discrete form [1]. Model:

x = Ax + Bu + w y
Time Update:

(8) (9)

= Cx + Du + v

= x = P
Measurement Update:

Ax + Bu AP + P A + Q
T

(10) (11)

S x P

CP C T + R
T

(12) (13) (14) (15)

L = PC S = x + L (y C x ) = (I LC ) P

First, we examine the model, which states what we are analyzing, and introduces the assumptions we are making. The rst important assumption the Kalman lter makes is linearity. Next, as seen in Equation 8, we have w. This term is present to compensate for our dynamic equation not being perfect. Through it, we assume that any imperfections in it can be accurately modeled as zero-mean white gaussian noise with covariance Q. The noise term on line 2 is similar. It assumes that any imperfections of the sensor can be represented by the zero-mean white gaussian noise term v with covariance R. We have developed the system model, and applied the Kalman lter to it, but an explanation of the lter itself has yet to be oered. As seen from the equations, there are two sections of the Kalman lter: The time update (also called the prediction), and the measurement update. The time update is a propagation of the current state according to the model. In other words, if we never got a sensor reading, our best guess of the states (x ) would be according to the mathematical model only. Note that while there are two lines in the time update section, only the rst (the system model) really eects what the estimate is. P is the covariance of the estimate x . As expected (assuming that initally, P is nonzero), P increases the longer the prediction runs without an update. When a measurement is received, which may or may not happen at a regular intervals, the measurement update loop is entered. In the measurement update, S and L are rst computed. (y C x ) is the residual of the measurement y. In other words, it is the measurement you got minus the measurement you think you should have gotten. S is the covariance associated with this residual. These two terms are also called

Algorithm 1 Kalman Filter Implementation


input: A, B, C, N, x , P, Q, R, y, dt output: x ,P for i = 0 to N

x =x + dt(Ax + Bu(i)) P = P + dt(AP + P A + Q) while(I_Got_a_Measurement==true) L = P C (CP C + R)1 x =x + L(y C x ) P = (I LC )P end while

end for

innovation (y C x ), and innovation covariance (S ) [4]. L is called the Kalman gain, and is the optimal gain (weighting), which can be thought of as a ratio between the covariance of the model, and the covariance of the measurement residual. This Kalman gain is then used in Equation 14 to weight the measurement. If the sensor is very good (the covariance of the sensor noise is small), the measurement is heavily weighted in the estimation of x . If instead the sensor is very noisy (with respect to the model), just the opposite happensthe measurement is essentially ignored [?]. For this reason, the Kalman lter can be "divergent." For example, if a covariance of 0 was input for Q, that would mean the model is perfect. Any sensor measurements would be completely ignored. Thus, the state estimate might diverge from the true state, unless the model really is exactly perfect, which is almost never true in practice. Unless the model were in fact perfect (in which case no sensor would be needed), this would be undesireable behavior. The point here, though, is that the lter is still optimal because it did the best possible with what it was told. In other words, it was lied to, so it gave the wrong answer. The Kalman lter can handle bad information in sensor measurements or even an innacurate model. It won't perform well, however, if the covariances of those values are input incorrectly. (If you're wrong and you know it, that's okay, but if you're wrong, and you insist you're right, that's bad.) From the model, and implementation of the Kalman equations, we develop Algorithm 1. It is important to notice that x and P are known at all times, as this is a recursively formulated algorithm. Now that the model and lter have been introduced, and algorithm developed, the reader is invited to experiment so that a practical understanding may be gained. To begin, we need numbers for our system, so we can run an actual simulation and examine performance and behavior. Let k = 1N/m, b = 1N s/m, and m = 10 kg. Further, we let the initial conditions be z (0) = 0 and z (0) = 0. U is given as a step input at t = 0 of magnitude 10N . Since we're seeing how the lter behaves, an exact solution is obtained by solving the system directly. This is given in general for in the m-le provided in the appendix. We now need to set conditions that provide an interesting simulation. Thus, assume that we don't know the model parameters perfectly, and while we are certain we have the mass and damping coecients right, we estimate the spring coecient poorly. Let this be given by using k = 0.5N/m instead of k = 1N/m. We assume this to be equivalent to adding noise with a magnitude of 4 N. Thus, Q becomes about 2I . We also start with a perfect sensor, so R = 0. We need an initial estimate, so we'll say x = 0. Initially, P represents the covariance of our guess. We know pretty well where we started, so P will start small, say 1. We will run the simulation from t = 0 to t = 100. Let's divide that up into time increments of 0.1 seconds (so, dt = 0.1). With the information provided, it should be possible for the reader to write a code that will simulate output for the system as derived. An

position 18 16 14 position from 0 in m 12 10 8 6 4 2 0 Actual Position Measured Position Estimated Position

10

20

30

40

50 time in sec.

60

70

80

90

100

velocity 3 2.5 2 1.5 velocity in m/s 1 0.5 0 0.5 1 1.5 2 0 10 20 30 40 50 time in sec. 60 70 80 90 100 Actual Velocity Estimated Velocity

Figure 3: Simulation Output example MATLAB code is also provided in the appendix. Given the input conditions as described, the plot shown below in Figure 4 was obtained.

There are several things that should be noticed in Figure 3. First, in examining the position estimate ( x1 ), we see that as time continues, and the system goes without a measurement, the estimate strays further and further from the exact position. This is because during that time, an inexact model is being propagated to obtain the position estimate. If the covariance of the estimate were plotted, we would see that as we go without a measurement, that value also increases in time. However, we see that when we do get a measurement, that measurement is heavily weighted, and the position estimate is corrected according to the measurement. This is just what we expected. Now, in examining the velocity estimate, we see that it does not get corrected as well at measurement time. This is because the velocity is not measured directly, so there is some error in the calculation of what the position should be according to the velocity. The reader is now encouraged to experiment by changing the code, and discovering the eects of several things such as the following: What happens when a poor initial estimate is given? What happens if the model uncertainty is set to 0? How can a velocity measurement be added, and how would that eect things? What if position were unmeasurable, and velocity measurements only were used? What if the sensor is noisy? For further study of the Kalman lter, the reader is referred to [1] and [2]. [2] provides links to many other useful sources on the Kalman lter as well. Of particular interest is the compilation of several articles on the Kalman lter, used by Greg Welch and Gary Bishop in a course on the Kalman lter. This can be found at: http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf.

References
[1] Michael L. Workman Gene F. Franklin, J. David Powell. Addison Wesley Longman, Inc., 3 edition, 1998.
Digital Control of Dynamic Systems.

[2] Gary Bishop Greg Welch. Greg welch and gary bishop's kalman lter page, 2007. [3] Katsuhiko Ogata.
System Dynanmics.

Pearson Prentice Hall, 4 edition, 2004.

[4] Wikipedia. Kalman lter, 2007.

12/10/07 7:11 AM

C:\Documents and Settings\bishopj1\Desktop\Jacob_Bis...\kalman.m

1 of 3

1 %-------------------------------------------------------------% 2 % Jacob Bishop-Demonstration of the KALMAN filter. 3 % BYU November 2007 4 %-------------------------------------------------------------% 5 % Set up the equations of the system (what's the truth?) 6 clear all close all clc 7 m 8 b 9 k 10 x = linspace(0,100,1001) 11 % ics were z(0)=0,Dz(0)=0. The equation was 'm*D2z + b*Dz + k*z = 10' 12 % The following is the solution. 13 z =-5*exp(-1/2*(b-(b^2-4*k*m)^(1/2))/m*x)*(b+(b^2-4*k*m)^(1/2))/(b^2-4*k*m)^(1/2) /k+5*exp(-1/2*(b+(b^2-4*k*m)^(1/2))/m*x)*(b-(b^2-4*k*m)^(1/2))/(b^2-4*k*m)^(1/2)/k+10/k 14 dz =5/2*(b-(b^2-4*k*m)^(1/2))/m*exp(-1/2*(b-(b^2-4*k*m)^(1/2))/m*x)*(b+(b^2-4*k*m)^ (1/2))/(b^2-4*k*m)^(1/2)/k-5/2*(b+(b^2-4*k*m)^(1/2))/m*exp(-1/2*(b+(b^2-4*k*m)^(1/2))/m*x) *(b-(b^2-4*k*m)^(1/2))/(b^2-4*k*m)^(1/2)/k 15 z1 = z 16 z2=dz 17 18 % Plot the actual solution. 19 subplot(2,1,2),plot(x,real(z2),'g') % Exact velocity solution 20 title('velocity') 21 xlabel('time in sec.') 22 ylabel('velocity in m/s') 23 hold on 24 subplot(2,1,1),plot(x,real(z1),'k') % Exact position solution 25 hold on 26 title('position') 27 xlabel('time in sec.') 28 ylabel('position from 0 in m') 29 shg 30 hold on 31 %-------------------------------------------------------------% 32 % Set up simulated measurements. 33 random_vector_1=0*randn(size(z1)) 34 random_vector_2=0*randn(size(z2)) 35 position_sample = random_vector_1+z1 36 velocity_sample = random_vector_2+z2 37 position_samples_taken = [] 38 sample_time = [] 39 %-------------------------------------------------------------% 40 % State-Space representation of the system 41 m 42 b 43 k 44 A = [0 b/m -k/m] 45 B = [ m] 46 C = [1 0] 47 D = 49 %-------------------------------------------------------------%

12/10/07 7:11 AM 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 95 96 97 98 99 100

C:\Documents and Settings\bishopj1\Desktop\Jacob_Bis...\kalman.m

2 of 3

% Kalman filter implementation % Setup definitions dt x_old = [ ] tmin tmax t = tmin:dt:tmax u = 10*ones( size(t) ) %let us have a 10*unit step @0 x_hat = zeros( max(size(A)),1 ) P = 100*eye( size(A) ) % Uncertainty in initial estimate Q = 4*eye( size(A) ) % Uncertainty in model R = % Uncertainty in measurement % Time update for i=1:max( size(u) ) x_hat = x_hat + dt*( A*x_hat+B*u(i) ) P = P+dt*(A*P + P*A' + Q) if(mod(i,20)==0) %-----------------------------------------------------------% % Measurement Update S = inv(C*P*C'+ R) L = P*C'*S mu = (position_sample(i)-C*x_hat) x_hat = x_hat + L*mu P = (eye(2)-L*C)*P position_samples_taken = [position_samples_taken real(position_sample(i))] sample_time = [sample_time t(i)] end all_x(:,i) = x_hat all_p(:,i)=trace(P) all_P(:,:,i)=P end % Plot the filtered solution. hold on subplot(2,1,1),plot(sample_time, position_samples_taken,'g*' ) hold on subplot(2,1,1),plot( t,real( all_x(1,:) ),'r' ) legend('Actual Position','Measured Position','Estimated Position') hold on subplot(2,1,2),plot( t,real( all_x(2,:) ),'b' ) legend('Actual Velocity','Estimated Velocity') hold on

%-------------------------------------------------------------% % If you have maple, you can replace lines 10-16 with these, % and maple will solve the equation instead of using the given % solution, allowing for experimentation changes for ic's, etc. %-------------------------------------------------------------%

12/10/07 7:11 AM

C:\Documents and Settings\bishopj1\Desktop\Jacob_Bis...\kalman.m

3 of 3

103 % z = dsolve(eqn,ics,'x') 104 % dz = diff(z,'x')

108

You might also like