Academia.eduAcademia.edu

Slides-4up

AI-generated Abstract

Slides-4up discusses the methodology of system identification, emphasizing its importance in obtaining mathematical models from experimental data for dynamic systems. The paper outlines the core concepts, such as Bode plots and impulse response, and highlights various identification techniques, including the use of Empirical Transfer Function Estimate (ETFE) and toolboxes for practical implementation. It further addresses challenges in modeling and parameter identification, particularly in the context of closed-loop systems.

System Identification and Today Parameter Estimation PeSi/0/1 Introduction part 1: System Identification • What is system identification? A set of methods to obtain a mathematical model of a dynamic system from experimental data. • What is the goal of system identification? The obtained model should on the one hand be compact and on the other hand be adequate for the intended purpose. Such purpose may be the design of a controller for that system. • How is system identification being carried out? In dedicated experiments measurements are carried out on a system. The obtained system response is analysed, e.g. with M ATLAB’s system identification toolbox ident. Ronald Aarts PeSi/1/2 – Assessment / examination • Systems and signals • Continuous and discrete time systems • Time and frequency domain Ronald Aarts PeSi/1/1 Example: Data from piezo mechanism of “Project F”, input piezo current, output position, Ts = 33 µs. Input and output signals Input and output signals 900 900 y [−] Ronald Aarts http://blackboard.utwente.nl/ – Software: Matlab 800 700 2100 800 700 2100 u [−] BlackBoard: Ronald Aarts Mechanical Automation / Werktuigbouwkundige Automatisering (Wa) Room: Horstring Z 229 Phone: (053) 489 2557 Email: R.G.K.M.Aarts@utwente.nl – Course topics and course material y [−] Lecturer: • Introduction of System identification and Parameter estimation u [−] Course number: 191131700 2050 2000 0 0.2 0.4 0.6 t [s] 0.8 2050 2000 0.5 1 0.501 0.502 0.503 0.504 0.505 t [s] This data will be used more often during the lectures and is available for download. Ronald Aarts PeSi/1/3 • How is this toolbox being used? Often with a GUI: Introduction part 2: Parameter Estimation • What is parameter estimation (in this lecture)? A set of methods to obtain a physical model of a dynamic system from experimental data. • What is the goal of parameter estimation? The obtained parameters should represent certain physical properties of the system being examined. Click-and-play?? • And what is the use of these lectures? An overview of the applied methods, the background of the algorithms, the pitfalls and suggestions for use. Ronald Aarts Physical models PeSi/1/4 Ronald Aarts PeSi/1/5 vs. Mathematical models microscopic “First principles”: conservation laws, physical properties, parameter estimation PDE’s: ∞-dimensional non-linear no general-purpose technique macroscopic Experimental: planned experiments, look for relations in data, system identification ODE’s: finite dimension LTI “standard” techniques LTI: Linear Time Invariant system, e.g. described by means of a transfer function y(t) Y (s) = G(s) of = G(s) U (s) u(t) Where possible, non-linearities are removed from the data by preprocessing prior to system identification Ronald Aarts • How is parameter estimation being carried out? The parameters of the model are fitted to obtain the “best” representation of the experimental data, e.g. with M ATLAB’s optimization toolbox. PeSi/1/6 Approach for (1) system identification and (2) parameter estimation: • • • • • • Input/output selection Experiment design Collection of data Choice of the model structure (set of possible solutions) Estimation of the parameters Validation of the obtained model (preferably with separate “validation” data). 1. The models used for system identification are “mathematical”: The underlying physical behaviour is unknown or incomplete. To the extend possible physical insight is used, e.g. to select the model structure (representation, order). 2. The parameters in the models for parameters estimation have, to a more or lesser extend, a physical meaning. Ronald Aarts PeSi/1/7 Overview topics: 1. System identification (first part of the lectures): Overview topics: 2. Parameter estimation (first and second part of the lectures): • Introduction, discrete systems, basic signal theory • Open-loop LTI SISO systems, time domain, frequency domain • Non-linear model equations • Subspace identification • Identifiability of parameters • Non-parametric identification: correlations and spectral analysis • Linearity in the parameters • Identification with “Prediction Error”-methods: prediction, model structure, approximate models, order selection, validation • Error propagation • Experiment design • Experiment design Ronald Aarts PeSi/1/8 Ronald Aarts PeSi/1/9 Software: M ATLAB (any recent version) + identification toolbox + optimization toolbox: E.g. available from NCS. Overview topics: 3. Miscellaneous (second part of the lectures): • MIMO-systems • Identification in the frequency domain • Identification of closed loop systems • Non-linear optimisation • Cases. Course material: Lecture notes / slides in PDF-format from BlackBoard site. On-line M ATLAB documentation of the toolboxes (selection). Examination (5 EC): Exercises parallel to the lectures, usually to be handed in next week. One exercise includes a lab experiment (during the “break” in March / April). The answers will be graded and may be discussed during an oral exam. Grades will be available on-line∗. ∗ Ronald Aarts PeSi/1/10 Grades: Ronald Aarts Approximately 65% of maximum: 5.5 100% of maximum: 10.0 In between: linear interpolation PeSi/1/11 u System T y Systems and signals A description of the system T should specify how the output signal(s) y depend on the input signal(s) u. The signals depend on time: continuous or at discrete time instants. • The phase shift is found from the phase angle ∠G(iω). Continuous versus discrete time Time domain versus frequency domain The output y(t) for an arbitrary input signal u(t) can be found by considering all frequencies in the input signal: Fourier transform. Examples for LTI systems: • • • • Frequency Response Function (FRF) or Bode plot Impulse response of step response State space model Transfer function Ronald Aarts PeSi/2/1 The input signal can be considered as a sequence of impulses with some (time dependent) amplitude u(t). The outputs due to all impulses are added: Continuous time signals and systems: Convolution integral Z∞ 0 Ronald Aarts ∞ X l=0 g(l)u(k − l) PeSi/2/3 PeSi/2/2 Discrete time dynamic systems Signals are sampled at discrete time instances tk with sample time Ts,: uk−1 uk uk+1 uk+2 = = = = u(tk−1) u(tk ) u(tk+1) u(tk+2) differential operator du su ⇒ Z dt s−1u ⇒ u dt (k = 0, 1, 2, ...) = = = = u(tk − Ts) u(tk ) u(tk + Ts) u(tk + 2Ts ) Transfer function y = G(z)u + H(z)e (in z-domain) Continuous time g(τ )u(t − τ ) dτ Discrete time signals and systems: Summation y(k) = (g ∗ u)(k) = Procedure (in principle): Transform the input into the frequency domain, multiply with the Bode plot or FRF, transform the result back to the time domain. Ronald Aarts The impulse response g(t) of an LTI system can also be used to compute the output y(t) for an arbitrary input signal u(t): y(t) = (g ∗ u)(t) = It specifies for each frequency ω the input-output relation for harmonic signals (after the transient behaviour vanished): • The amplification equals the absolute value |G(iω)|. Several system descriptions are available: • • The Bode plot of a system G(iω) is the amplitude and phase plot of the complex function G depending on the real (angular) frequency ω. vs. Discrete time shift operator (1 − q)uk ⇒ uk − uk+1 (1 − q −1)uk ⇒ uk − uk−1 State space equations: ẋ(t) = Ax(t) + Bu(t) + Kv(t) x(tk+1) = Ax(tk ) + Bu(tk ) + Kv(tk ) y(t) = Cx(t) + Du(t) + v(t) y(tk ) = Cx(tk ) + Du(tk ) + v(tk ) Ronald Aarts PeSi/2/4 Discrete time dynamic systems: stability Continue time Discrete systems and phase lag vs. Discrete time Zero order hold (ZOH) discretisation introduces a phase lag: z transform Transfer function: Imaginary axis: s = iω Stability: Poles inside the unit circle: |z| < 1 Phase lag depends on frequency ω and sample time Ts: −ω · Ts/2 (in rad). 0.5 x [−] Relation between s en z: z = esTs (Franklin (8.19)) Poles in LHP: Re(s) < 0 1 b1z −1 + b2z −2 G(z) = 1 + a1z −1 + a2z −2 n1 s + n0 G(s) = 2 d2s + d1s + d0 0 0 phase [deg] Laplace transform −0.5 −1 0 0.5 Undamped poles: Unit circle: z = eiωTs 1 t [s] 1.5 2 −20 −40 −60 Ts = 0.1 s −80 0 10 1 ω [rad/s] 10 So the phase lag is −90o at the Nyquist frequency ωN = ωs/2 = π/Ts. Ronald Aarts PeSi/2/5 Ronald Aarts PeSi/2/6 Fourier transforms Signal characterisation M ATLAB’s identification toolbox ident works with time domain data. Even then, the frequency domain will appear to be very important. Furthermore, identification can also be applied in the frequency domain. • Frequency content: Fourier transform Continuous-time deterministic signals u(t): Fourier integral: Z ∞ Z ∞ 1 u(t)e−iωt dt u(t) = U (ω) = U (ω)eiωt dω 2π −∞ −∞ For a finite number (N ) of discrete-time samples ud(tk ): Fourier summation: UN (ωl ) = ud(tk ) = −1 1 NX UN (ωl )eiωl tk N l=0 UN (ωl ) with ωl = Nl ωs = Nl 2π Ts , l = 0, ..., N − 1 is the discrete Fourier transform (DFT) of the signal ud(tk ) with tk = kTs, k = 0, ...N − 1. For N equal a power of 2, the Fast Fourier Transform (FFT) algorithm can be applied. Deterministic or stochastic signals? Ronald Aarts ud(tk )e−iωl tk k=0 • Energy • Power NX −1 PeSi/2/7 Ronald Aarts PeSi/2/8 Energy and power (continue time signals) 6 900 10 800 10 Energy spectrum: Ψu(ω) = |U (ω)|2 4 YN [−] y [−] Example: 32768 data samples of the piezo mechanism (left). 700 Energy: 2 Eu = 10 Z∞ u(t)2dt = −∞ 0 600 10 0 0.5 t [s] 1 0 2 10 10 f [Hz] 4 10 Power spectrum: Φu(ω) = lim Power: Pu = lim 1 T →∞ T ZT 1 T →∞ T 1 2π Z∞ Ψu(ω)dω −∞ |UT (ω)|2 u(t)2dt = 0 1 2π Z∞ Φu(ω)dω −∞ DFT (right) in 16384 frequencies computed with M ATLAB’s fft command. Horizontal axis in steps of 1/(total measurement time) = 0.925 Hz. [ UT (ω) is de Fourier transform of a continuous time signal with a finite duration ] Ronald Aarts Ronald Aarts PeSi/2/9 Signal types Energy and power (discrete time signals) • Deterministic with finite energy: R Eu = Ψu(ω)dω bounded. Ψu(ω) = |U (ω)|2 limited. Energy spectrum: Ψu(ω) = |U (ω)|2 1 0 • Deterministic with finite power: R Pu = Φu (ω)dω finite. Φu(ω) = T1 |UT (ω)|2 unbounded. • Stochastic with finite power: R Pu = Φu (ω)dω finite. Φu(ω) = T1 |UT (ω)|2 bounded. Ronald Aarts PeSi/2/10 Energy: −1 1 0 −1 1 0 20 40 60 80 ∞ X ud(k)2 = k=−∞ Φu(ω) = lim Power: Pu = lim (“periodogram”) Z −1 1 NX ud(k)2 = Φu(ω)dω N →∞ N k=0 ωs 100 Ronald Aarts Ψu(ω)dω ωs t [s] PeSi/2/11 Z 1 |UN (ω)|2 N →∞ N Power spectrum: −1 0 Eu = (from the DFT) PeSi/2/12 Convolutions Stochastic signals Continuous time: y(t) = (g ∗ u)(t) = Z∞ g(τ )u(t − τ ) dτ 0 Realisation of a signal x(t) is not only a function of time t, but depends also on the ensemble behaviour. After Fourier transform: Y (ω) = G(ω) · U (ω) Discrete time: y(k) = (g ∗ u)(k) = ∞ X l=0 An important property is the expectation:E f (x(t)) (t = 0, 1, 2, ...) g(l)u(k − l) After Fourier transform: Y (ω) = G(ω) · U (ω) Examples: Mean Power E x(t) E (x(t) − E x(t))2 Cross-covariance: Rxy (τ ) = E[x(t) − Ex(t)][y(t − τ ) − Ey(t − τ )] Autocovariance: Rx(τ ) = E[x(t) − Ex(t)][x(t − τ ) − Ex(t − τ )] Example: u is the input and g(k), k = 0, 1, 2, ... is the impulse response of the system, that is the response for an input signal that equals 1 for t = 0 en equals 0 elsewhere. Then with the expressions above y(k) is the output of the system. White noise: e(t) is not correlated met signals e(t − τ ) for any τ 6= 0. Consequence: Re(τ ) = 0 for τ 6= 0. Ronald Aarts Ronald Aarts PeSi/2/13 Systems and models Power density or Power Spectral Density: Φx(ω) = Z∞ Rx(τ ) e−iωτ dτ PeSi/2/14 ∞ X Φxd (ω) = Rxd (k) e−iωkT k=−∞ −∞ A system is defined by a number of external variables (signals) and the relations that exist between those variables (causal behaviour). With (inverse) Fourier transform: 1 Rx(τ ) = 2π Z∞ Φx(ω) eiωτ dω Rxd (k) = −∞ v T Φxd (ω) eiωkT dω 2π ω Z u s ✲ ❄ ✗✔ ✲ ✖✕ G y ✲ Power: E (x(t) − E x(t))2 = Rx(0) = 1 = 2π Ronald Aarts Z∞ E (xd(t) − E xd(t))2 = Rxd (0) = T Φxd (ω) dω = 2π ω Z Φx(ω) dω −∞ s PeSi/2/15 Signals: • • • Ronald Aarts measurable input signal(s) u measurable output signal(s) y unmeasurable disturbances v (noise, non-linearities, ...) PeSi/2/16 Estimators Non-parametric (system) identification Suppose we would like to determine a vector θ with n real coefficients of which the unknown (true) values equal θ0. An estimator θ̂N has been computed from N measurements. • t-domain: Impulse or step response IDENT: cra ∗ • f -domain: Bode plot • Give models with “many” numbers, so we don’t obtain models with a “small” number of parameters. This estimator is • The results are no “simple” mathematical relations. • unbiased if the estimator E θ̂N = θ0. • consistent, if for N → ∞ the estimator E θ̂N resembles a δ-function, or in other words the certainty of the estimator improves for increasing N . • The results are often used to check the “simple” mathematical relations that are found with (subsequent) parametric identification. • Non-parametric identification is often the fist step. The estimator is consistent if it is unbiased and the asymptotic covariance lim covθ̂N = lim E(θ̂N − E θ̂N )(θ̂N − E θ̂N )T = 0 ∗ Ronald Aarts Ronald Aarts N →∞ The IDENT commands impulse & step use a different approach that is related to the parametric identification to be discussed later. N →∞ PeSi/2/17 Correlation analyse Ident manual: Tutorial pages 3-9,10,15; Function reference cra (4-42,43). ✲ G0 ❄ ✗✔ ✲ ✖✕ y ✲ PeSi/3/1 The Finite Impulse Response (FIR) ĝ(k), k = 0, 1, 2, ..., M , is a model estimator for system G0(z) for sufficiently high order M : v u IDENT: spa & etfe y(t) = G0(z) u(t) + v(t) y(t) ≈ M X k=0 ĝ(k)u(t − k) (t = 0, 1, 2, ...) Note: In an analysis the lower limit of the summation can be taken less than 0 (e.g. −m) to verify the (non-)existence of a non-causal relation between u(t) and y(t). Using the Impulse Response g0(k), k = 0, 1, 2, ... of system G0(z) y(t) = ∞ X k=0 g0(k)u(t − k) + v(t) (t = 0, 1, 2, ...) So the transfer function can be written as: G0(z) = ∞ X g0(k)z −k k=0 ⇒ Impulse response of infinite length. ⇒ Assumption that the “real” system is linear and v is a disturbance (noise, not related to input u). Ronald Aarts PeSi/3/2 How do we compute the estimator ĝ(k)? • • • u(t) en v(t) are uncorrelated (e.g. no feedback from y to u!!!). P Multiply the expression y(t) = g0(k)u(t − k) + v(t) with u(t − τ ) and compute the expectation. This leads to the Wiener-Hopf equation: Ryu(τ ) = ∞ X k=0 Ronald Aarts g0(k)Ru (τ − k) PeSi/3/3 Wiener-Hopf: Ryu(τ ) = ∞ X k=0 g0(k)Ru (τ − k) If u(t) is a white noise signal, then Ru(τ ) = σu2δ(τ ), so g0(τ ) = Ryu (τ ) R̂yu (τ ) and ĝ(τ ) = 2 σu σu2 How do we compute the estimator for the cross covariance R̂yu (τ ) from N measurements? Sample covariance function: N (τ ) = R̂yu Wiener-Hopf: Ryu(τ ) = k=0 g0(k)Ru (τ − k) And what if u(t) is not a white noise signal? Option 1: Estimate the autocovariance e.g. with N 1 X N u(t)u(t − τ ) R̂u (τ ) = N t=τ and solve the linear set of M equations for ĝ(k): N R̂yu (τ ) = N 1 X y(t)u(t − τ ) N t=τ ∞ X M X N ĝ(k)R̂u (τ ) k=1 Not so favorable .... is asymptotically unbiased (so for N → ∞). Ronald Aarts Wiener-Hopf: Ryu(τ ) = PeSi/3/4 ∞ X k=0 Ronald Aarts PeSi/3/5 g0(k)Ru (τ − k) Finding a pre-whitening filter L(z) for uF (k) = L(z)u(k): And what if u(t) is not a white noise signal (continued)? Try to use a linear model of order n (default in ident n = 10) Option 2: Filter input and output with a pre-whitening filter L(z): Suppose we know a filter L(z), such that uF (k) = L(z)u(k) is a white noise signal. Apply this filter to the output as well (yF (k) = L(z)y(k)), then yF (t) = G0(z) uF (t) + L(z)v(t) so the impulse response ĝ(k) can also be estimated from uF (k) and yF (k). L(z) = 1 + a1z −1 + a2z −2 + ... + anz −n and look for a “best fit” of the n parameters such that e.g. minimised. ⇒ Exercise 3. OK, but how do we find such a pre-whitening filter L(z)? Ronald Aarts PeSi/3/6 Ronald Aarts PeSi/3/7 N 1 X u2 (k) is N k=1 F Impulse response of the piezo mechanism. An example (1): The piezo mechanism. [ir,R,cl]=cra(piezod,200,10,2); Warning: All equations starting from y(t) = G0(z) u(t) + v(t) do not account for offsets due to non-zero means in input and/or output. So detrend! Covf for filtered y Covf for prewhitened u 2000 2500 1800 2000 Upper right: u is indeed whitened. 1500 1600 Input and output signals load piezo; Ts = 33e-6; u=piezo(:,2); y=piezo(:,1); piezo = iddata(y,u,Ts); piezod = detrend(piezo); plot(piezod); 150 100 y1 50 0 −50 −100 −150 0 0.2 0.4 0.6 0.8 1 100 u1 50 0 −50 −100 0 0.2 0.4 0.6 0.8 1000 1400 500 1200 0 1000 −200 −100 0 100 200 −500 −50 0 Correlation from u to y (prewh) 50 Impulse response estimate 0.1 2500 0.08 2000 0.06 1500 0.04 The horizontal axes count the time samples, so the values should be scaled with T = 33 µs. 1000 0.02 500 0 0 −0.02 0 50 100 150 Lower right: The impulse response is causal. 200 −500 0 50 100 150 200 1 Time Ronald Aarts PeSi/3/8 Ronald Aarts Second example: Known system PeSi/3/9 Impulse response of known system Ident manual page 2-15: Impulse response 3 yk − 1.5yk−1 + 0.7yk−2 = uk−1 + 0.5uk−2 + ek − ek−1 + 0.2ek−2. G0 (exact) 2.5 2 So and G0(q) = q −1 + 0.5q −2 1 − 1.5q −1 + 0.7q −2 1 − q −1 + 0.2q −2 H0(q) = 1 − 1.5q −1 + 0.7q −2 Simulation: Ronald Aarts N T fs u(t) e(t) = = = or or z + 0.5 G0(z) = 2 z − 1.5z + 0.7 z 2 − z + 0.2 H0(z) = 2 z − 1.5z + 0.7 4096 1s 1 Hz binary signal in frequency band 0..0.3fs “white” noise (random signal) with variance 1 PeSi/3/10 cra with n = 10 1.5 1 cra with n = 1 (not enough pre-whitening) 0.5 0 −0.5 −1 −1.5 0 5 Ronald Aarts 10 15 20 Time [s] 25 30 35 PeSi/3/11 40 Spectral analyse Ident manual: Tutorial pages 3-10,15,16; Function reference etfe (4-53,54), spa (4-193–195). The estimator ĜN (a) (b) is unbiased. has an asymptotic variance 1 |U (ω)|2 unequal 0 ! Φv (ω)/ N N v u ❄ ✗✔ ✲ ✖✕ G0 ✲ y ✲ is asymptotically uncorrelated for different frequencies ω. Difficulty: For N → ∞ there is more data, but there are also estimators at more (=N/2) frequencies, all with a finite variance. Fourier transform (without v): Y (ω) = G0(eiωT )U (ω), so G0(eiωT ) = (c) y(t) = G0(z) u(t) + v(t) Y (ω) . U (ω) Solutions: Estimator for G0(eiωT ) using N measurements: ĜN (eiωT ) = 1. Define a fixed period N0 and consider an increasing number of measurements N = rN0 by r → ∞. Carry out the spectral analysis for each period and compute the average to obtain a “good” estimator in N0/2 frequencies. YN (ω) . UN (ω) V (ω) Effect of v: ĜN (eiωT ) = G0(eiωT ) + N . UN (ω) Ronald Aarts 2. Smoothen the spectrum in the f -domain. PeSi/3/12 Ronald Aarts PeSi/3/13 An example: 32768 data samples of the output signal (position) of the piezo mechanism. An example: 32768 data samples of the output signal (position) of the piezo mechanism. Split the DFT in parts: Filter the DFT with a Hamming∗ filter: 8 10 6 7 10 10 7 10 6 10 5 10 6 10 5 10 5 10 4 10 4 4 10 10 3 10 3 10 3 10 2 10 2 10 2 10 1 10 0 10 1 10 1 10 2 10 3 10 4 10 1 0 1 10 10 2 10 3 10 4 10 10 0 10 1 10 2 10 3 10 4 10 Original DFT and with Hamming∗ filters of width 8, 64 and 512. DFT 16384 frequencies Ronald Aarts and 2048 and 256 frequencies PeSi/3/14 ∗ Signal Processing Toolbox: Filters neighbouring frequencies with cosine function. Ronald Aarts PeSi/3/15 With IDENT (1): periodogram (“data spectra”) of input and output signals Periodogram slope -2 0 Output # 1 10 −5 10 0 10 1 10 2 3 10 4 10 10 The output signal shows a low frequent slope of -2, which is caused by the existence of a pure integrator 1/s in the system (slope -1 in a Bode plot and squared in the periodogram). With IDENT (2): identification ETFE (Empirical Transfer Function Estimate, manual page 4-53): Estimate the transfer function G with Fourier transforms ĜN (eiωT ) = in 128 (default) frequencies. YN (ω) UN (ω) Smoothing is applied and depends on a parameter M , which equals the width of the Hamming window of a filter that is applied to input and output (small M means more smoothing). SPA (SPectral Analysis, manual page 4-193): 0 10 Input # 1 The input signal is “reasonable” white. −5 10 0 10 1 10 2 3 10 Frequency (Hz) 4 10 Ronald Aarts 10 PeSi/3/16 Estimate the transfer function G with Fourier transforms of the covariances ĜN (eiωT ) = Φ̂yu(ω) in 128 (default) frequencies. Φ̂u(ω) Smoothing is also applied and again depends on a parameter M , that sets the width of the Hamming window of the applied filter. Ronald Aarts PeSi/3/17 ETFE of the piezo example Frequency response Choice between ETFE and SPA? 0 • Not always straightforward to predict which method will perform best. Why not try both? Amplitude 10 • ETFE is preferred for systems with clear peaks in the spectrum. −4 10 • SPA estimates also the noise spectrum v(t) = y(t) − G0(z)u(t) according to u u |Φ̂yu(ω)|2 κ̂yu(ω) = t Φ̂y (ω)Φ̂u (ω) 3 4 10 10 window parameter: 0 2 Φ̂u(ω) • Measure of signal to noise ratio with the coherence spectrum v 2 10 −200 Phase (deg) Φ̂v (ω) = Φ̂y (ω) − Φ̂yu(ω) −2 10 −400 −600 −800 2 10 3 4 10 Frequency (Hz) 10 What is the “real” width of the peak near 2 kHz? Ronald Aarts PeSi/3/18 Ronald Aarts PeSi/3/19 M = 15 M = 30 M = 60 * M = 90 M = 120 Second example: Spectral analysis of the known system SPA of the piezo example Frequency response Ident manual page 2-15: 0 yk − 1.5yk−1 + 0.7yk−2 = uk−1 + 0.5uk−2 + ek − ek−1 + 0.2ek−2. −2 10 −4 10 2 3 10 4 10 So G0(q) = and H0(q) = 10 window parameter: 0 Phase (deg) −200 M = 15 M = 30 M = 60 M = 90 * M = 120 −400 −600 −800 2 3 10 4 10 Frequency (Hz) Ronald Aarts 10 PeSi/3/20 Simulation: q −1 + 0.5q −2 1 − 1.5q −1 + 0.7q −2 N T fs u(t) e(t) = = = Spectra: Frequency response 1 Periodic input, r periods 0 10 G0 −1 10 r=1 2 −1 0 10 10 0 −2 10 0 −180 −360 ETFE, M = 30 SPA, M = 30 −90 Filtering, Hamming window of width w −180 −270 −2 −1 10 0 10 10 Frequency (Hz) Ronald Aarts −1 −3 10 w=1 2 −2 −1 10 10 f [Hz] −3 10 w=2 −2 −1 10 10 f [Hz] w=4 0 10 −2 10 0 −180 −360 −3 PeSi/3/22 −2 10 10 f [Hz] 10 10 Ronald Aarts r = 16 10 10 0 r=4 10 −3 −2 10 Phase (deg) PeSi/3/21 Phase [deg] Magnitude [−] Amplitude 10 z 2 − z + 0.2 H0(z) = 2 z − 1.5z + 0.7 or 4096 1s 1 Hz binary signal in frequency band 0..0.3fs “white” noise (random signal) with variance 1 Ronald Aarts Smoothen the ETFE: z + 0.5 G0(z) = 2 z − 1.5z + 0.7 or 1 − q −1 + 0.2q −2 1 − 1.5q −1 + 0.7q −2 Phase [deg] Magnitude [−] Amplitude 10 −2 −1 10 10 f [Hz] PeSi/3/23 −3 10 −2 −1 10 10 f [Hz] −3 10 −2 −1 10 10 f [Hz] Intermezzo: Linear regression and Least squares estimate Going from “many to “just a few” parameters: a first step Idea: Try to recognise “features” in the data. • Immediately in u(k) en y(k) ??? Regression: • Prediction of variable y on the basis of information provided by other measured variables ϕ1, ..., ϕd.   • In the spectral models: Are there “features” e.g. like peaks as are are expected in the Bode plots / FRF (eigenfrequency, ...) of a system with a complex pole pair. ϕ1   • Collect ϕ =  .. . ϕd • In the impulse response (measured or identified): (1) Recognise “features” (settling time, overshoot, ...). (2) Realisation algorithms → to be discussed next. • Problem: find function of the regressors g(ϕ) that minimises the difference y − g(ϕ) in some sense. So ŷ = g(ϕ) should be a good prediction of y. • Example in a stochastic framework: minimise E[y − g(ϕ)]2. Ronald Aarts PeSi/3/24 Ronald Aarts PeSi/4/1 Linear regression — Examples: 15 • Special case: regression function g(ϕ) is linear in the parameters θ. Note that this does not imply any linearity with respect to the variables from ϕ. • Special case: g(ϕ) = θ1ϕ1 + θ2ϕ2 + ... + θdϕd So g(ϕ) = ϕT θ. Ronald Aarts PeSi/4/2 • Linear fit y = ax + b. Then g(ϕ) = ϕT θ with input vector ϕ = and parameter vector θ = " " # x 1 # a . So: g(ϕ) = b • Quadratic function y = c2x2 + c1x + c0. 10 5 0 0 h x 1 i 5 x [−] a . b 15 10 5 0 −5 x2   Then g(ϕ) = ϕT θ with input vector ϕ =  x  0 5 x [−] 1     c2 h i c2     and parameter vector θ =  c1 . So: g(ϕ) = x2 x 1  c1 . c0 c0 Ronald Aarts PeSi/4/3  10 # " y [−] • Regression function g(ϕ) is parameterised. It depends on a set of parameters   θ1   θ =  .. . θd y [−] Linear regression: 10 Linear least-squares estimate (1): Least-squares estimate (LSE): • N measurements y(t), ϕ(t), • Minimise VN (θ) = t = 1, ..., N . N 1 X [y(t) − g(ϕ(t))]2 . N t=1 • So a suitable θ is θ̂N = arg min VN (θ). • Linear case VN (θ) = N 1 X [y(t) − ϕT (t)θ]2. N t=1 Ronald Aarts PeSi/4/4 • A global minimum is found for θ̂N that satisfies a set of linear equations, the normal equations  N 1 X N t=1  ϕ(t)ϕT (t) θ̂N = N 1 X N t=1 ϕ(t)y(t). • If the matrix on the left is invertible, the LSE is  θ̂N =  Ronald Aarts −1 N 1 X ϕ(t)ϕT (t) N t=1 is a quadratic function of θ. N 1 X [y(t) − ϕT (t)θ]2 N t=1 • It can be minimised analytically: All partial derivatives zero in the minimum: N 1 X 2ϕ(t)[y(t) − ϕT (t)θ] = 0 N t=1 ∂VN (θ) have to be ∂θ The solution of this set of equations is the parameter estimate θ̂N . Ronald Aarts PeSi/4/5 Linear least-squares estimate — Matrix formulation: Linear least-squares estimate (2):  • In the linear case the “cost” function VN (θ) = N 1 X ϕ(t)y(t). N t=1 PeSi/4/6  • Collect the output measurements in the vector YN =    and the inputs in the N × d regression matrix ΦN =   h i T • Normal equations: ΦT N ΦN θ̂N = ΦN YN .  y(1)  .. , y(N )  ϕT (1)  .. . T ϕ (N ) † • Estimate θ̂N = ΦN YN † h (Moore-Penrose) pseudoinverse of ΦN : ΦN = ΦT N ΦN † Note: ΦN ΦN = I. Ronald Aarts PeSi/4/7 i−1 ΦT N. Linear least-squares estimate in Matlab: Example 1: The “well-known” linear fit y = ax + b Solution x of overdetermined Ax = b with rectangular matrix A, so more equations than unknowns, or more rows than columns, or A is m-by-n with m > n and full rank n Measurements xi and yi for i = 1, ..., N . 1) “Manual” solution: Then least squares solution x̂ = A†b % Preferred Parameter estimate: Ronald Aarts PeSi/4/8  1 ||Y − Φ θ||2. Cost function (in vector form) VN = N N N 2 h i−1  Ronald Aarts #−1 " P # " P P xi y i xi x2 i P P = P yi xi 1 PeSi/4/9 that filters a given signal u(t) with uF (k) = L(z)u(k), such that VN = N 1 X u2 (k) is minimised. N k=1 F 1) “Manual” solution: ΦT N YN . In Matlab: theta = Phi\Y; # L(z) = 1 + a1z −1 + a2z −2 + ... + anz −n # " x(1) 1 y(1) a     . . . . . .  and θ = 2) Matrix solution: YN =  . , ΦN =  b x(N ) 1 y(N ) † â b̂ Find a linear filter of order n (e.g. n = 10) 1 P(y − ax − b)2 . Cost function VN = N i i Estimate θ̂N = ΦN YN = ΦT N ΦN " Example 2: Pre-whitening filter L(z) for correlation analysis Measurements xi and yi for i = 1, ..., N .  ∂VN ∂VN = 0 and = 0, so ∂a ∂b Ronald Aarts Example 1: The “well-known” linear fit y = ax + b  (yi − axi − b)2. P  P # # " P # " " P P  −xi(yi − axi − b) = 0 xi y i a x2 i P xi P P = ⇔ P  yi b xi 1 −(yi − axi − b) = 0 In Matlab: x = A\b; x = pinv(A)*b; x = inv(A’*A)*A’*b; 1 Cost function VN = N ∂VN = 0 for all i = 1, 2, ..., n. ∂ai 2) Matrix solution: Write cost function in vector form † 1 ||Y − Φ θ||2 with best fit θ̂ = Φ Y . VN = N N N N 2 N N PeSi/4/10 Ronald Aarts PeSi/4/11 For both solutions substitute in VN = N 1 X u2 (k) N k=1 F Measurements uk and yk for k = 1, ..., N . uF (k) = u(k) + a1u(k − 1) + a2u(k − 2) + ... + anu(k − n). Consider A(z) = 1 + a1z −1 and B(z) = b1z −1 + b2z −2, ∂VN 1) Partial derivatives can be computed, etc. ∂ai so (1 + a1z −1)yk = (b1z −1 + b2z −2)uk + ek . 2) Vector form Cost function VN = VN = 1 N      u(n + 1)+ a1u(n) + a2u(n − 1) +...+ anu(1) u(n + 2)+ a1u(n + 1) + a2u(n) +...+ anu(2) .. n(N ) +a1u(N − 1)+a2u(N − 2)+...+anu(N − n) Parameter vector θ = [a1a2...an]T  2     2 (yk + a1yk−1 − b1uk−1 − b2uk−2)2. P h Estimate θ̂N = ΦT N ΦN Ronald Aarts Ronald Aarts † PeSi/4/12 Least-squares estimate (LSE) from normal equations — Summary: 1 ||Y − Φ θ||2. • Write the cost function as VN = N N N 2 h i i−1 ΦT N YN or PHIN\YN. PeSi/4/13 Least-squares estimate (LSE) from normal equations — Pitfalls: h i • Analytical difficulty: What if ΦT N ΦN is singular? h i • Numerical difficulty: What if ΦT N ΦN is near-singular? How to solve the normal equations anyhow? Specific answers can be given for “easy” problems like y = ax + b. † • Solution for the best fit θ̂N = ΦN YN , h with pseudo-inverse ΦN = ΦT N ΦN i−1 Is that all? Often yes, but ... Ronald Aarts  • How about the accuracy of the estimate θ̂N ? T ΦT N ΦN θ̂N = ΦN YN †    y(3) a1     .. Matrix solution: Collect θ =  b1 , YN =   and y(N ) b2   −y(2) u(2) u(1)   .. .. .. ΦN =  . −y(N − 1) u(N − 1) u(N − 2) Recognise YN and ΦN , then compute best fit θ̂N = ΦN YN . • Compute the LSE from the normal equations ARX models will be discussed in one of the next chapter 1 e u + A(z) Example 3: An ARX fit yk = B(z) k A(z) k PeSi/4/14 ΦT N. For a more generic treatment we will look at Singular Value Decomposition (svd). Ronald Aarts PeSi/4/15 Singular Value Decomposition (svd): Svd and (near) singularity:   Φ = U ΣV T or              =       Φ    σ    1 ..  .    T  σd   V      U 0  Condition number of matrix Φ is the ratio of the largest and smallest singular values σ1/σd.   • If all σii > 0 (i = 1, 2, ..., d), then the condition number is finite h ΦT Φ is invertible. Matrix Φ has full (column) rank. • Φ is a N × d matrix with N (rows) > d (columns). • U is an orthogonal N × N matrix. • V is an orthogonal d × d matrix. • Σ is a N × d matrix with the singular values Σii = σi ≥ 0 (i = 1, 2, ..., d) and 0 elsewhere. Usually the singular values are sorted: σ1 ≥ σ2 ≥ ... ≥ σd ≥ 0. Ronald Aarts PeSi/4/16 • hIf σi =i 0 for i = r + 1, r + 2, ..., d, the condition number is infinite ΦT Φ is singular, r is the rank of matrix Φ. h i • “Large” condition numbers indicate near-singularity of ΦT N ΦN and will lead to numerical inaccuracies. Ronald Aarts PeSi/4/17 Svd and condition number (1): Svd and condition number (2): Consider quadratic function y = c2x2 + c1x + c0 Consider linear functions y = ax + b and y = a1x + a2(10 − x) + b x2(1) x(1) 1  .. .. ..  Regression matrix ΦN =   depends only on x(i). x2(N ) x(N ) 1 40 c0 x(1)  .. Regression matrices ΦN =  x(N ) 20 5 x [−] 10 svd(phi) = [160.3; 5.2; 1.2] cond(phi) = 131 phi\y = [0.43; −2.33; −2.25] Exact: [0.40; −2.00; −3.00] Ronald Aarts 10 5 0 0 10 x [−] svd(phi) = [842.5; 9.1; 0.1] cond(phi) = 6364 phi\y = [0.43; −10.9; 63.8] Exact: [0.40; −10.0; 57.0] PeSi/4/18  0 0 0  20 5 x [−] 10 svd(phi) = [19.8; 1.75] cond(phi) = 11.3 phi\y = [1.97; −2.76] svd(phi) = [23.7; 14.8; 0.0] cond(phi) = 5.7 · 1016 phi\y = [1.70; −0.28; 0] Exact: [2.00; −3.00] Exact: [2.00; 0.00; −3.00] Ronald Aarts  1 x(1) 10−x(1) 1  ..  .. .. ..  and Φ =    N 1 x(N ) 10−x(N ) 1 15 c0 y [−] ❍❍ ❍❍ 15 10 5 0 −5   y [−] y [−]  Warning: Rank deficient, rank = 2 tol = 4.8e-14. PeSi/4/19 Step 1 (continued): Then only U1, Σ1 and V1 contribute to Φ, so Svd and (near) singularity (2): Step 1: Split Σ in Σ1 with non-zero singular values and Σ2 ≈ 0: Φ = U ΣV T  or                =       Φ  U1 U2 Σ1            Σ2    V1   0 Φ = U1Σ1V1T T  V2  • Φ is a N × d matrix with N (rows) > d (columns). • U is an orthogonal N × N matrix. U1 are the first r columns. • V is an orthogonal d × d matrix. V1 are the first r columns. • Σ1 is a r × r diagonal matrix with the nonzero singular values σ1 ≥ σ2 ≥ ... ≥ σr > 0. The rest of the matrix is zero. Ronald Aarts PeSi/4/20   x(1) 10−x(1) 1  .. .. ..  Regression matrix ΦN =   x(N ) 10−x(N ) 1 y [−] 15 10 5 0 10 svd(phi) = [23.7; 14.8; 0.0] cond(phi) = 5.7 · 1016 phi\y = [1.70; −0.28; 0] Warning: Rank deficient, rank = 2 tol = 4.8e-14. Exact: [2.00; 0.00; −3.00] Ronald Aarts Φ        =        U1    h   Σ  1     V1T i Step 2: Compute the (pseudo) inverse by only considering this part: T Φ† = V1Σ−1 1 U1 Then the LSE is with Σ−1 1 = diag(1/σ1 , 1/σ2 , ..., 1/σr ). θ̂N = V1Σ−1U1T YN This is the output of the M ATLAB command pinv(phi)*yn Note: If the rank r of Φ is less than d, the solution is not unique. PeSi/4/21 SVD and LSE — summary part 1 Consider again the linear function y = a1x + a2(10 − x) + b 5 x [−] or         Ronald Aarts SVD and LSE — example 0  [u,s,v]=svd(phi); r=1:2; % rank u1=u(:,r); s1=s(r,r); v1=v(:,r); % Compute pinv(phi)*y: v1*diag(1./diag(s1))*u1’ * y = [ 1.68; -0.29; 0.14 ] v2=v(:,3) = [ 0.099; 0.099; -0.99 ] The parameters can be modified with any multiple of V2, e.g. into [1.97; 0; −2.76]. PeSi/4/22 • Singular Value Decomposition (svd) can provide the numerical tools to compute the linear least squares estimate. • It can detect a (near) singular regression matrix. The pseudo inverse of the regression matrix ΦN can always be computed reliably. • r parameters or linear combinations of parameters are obtained. In case of non-uniqueness all solutions can be found found by taking into account matrix V2. What is left? • The accuracy of the estimate θ̂N . This will be discussed after some system identification topics ..... Ronald Aarts PeSi/4/23 Approximate realisations (not part of ident) Given: impulse response g(k) for k = 0, 1, ..., ∞ (Markov parameters) of a finite-dimensional linear time invariant system.  Wanted: state space model of minimal order, like Observations: • G(z) = • g(k) = g(k)z −k k=0   CAk−1B  D Ronald Aarts g(1) g(2) g(3) g(2) g(3) g(4) g(3) g(4) g(5) .. .. .. g(nr ) g(nr + 1) g(nr + 2)    Hnr ,nc =     x(k + 1) = Ax(k) + Bu(k) y(k) = Cx(k) + Du(k) ∞ X Algorithm by Ho & Kalman (1966) to construct an exact model (A, B, C, D) from the impulse response: Compose Hankel matrix H: −1 • G(z) = D + C(zI − A) B k≥1 k=0 PeSi/5/1 To determine the rank n and to compute the decomposition, “singular value decomposition” (SVD) will be applied:     Then Hnr ,nc =  C CA .. n CA r −1 g(nc) g(nc + 1) g(nc + 2) .. · · · g(nr + nc − 1) ··· ··· ···          h i   B AB · · · Anc−1B  so rank(H) is the minimal order n (provided nr and nc are sufficiently large). Ronald Aarts PeSi/5/2 1/2 Then also H = UnΣn 1/2 Σn VnT and compute 1/2 B from the first column of H2 = Σn VnT and H = UnΣnVnT 1/2 C from the first row of H1 = UnΣn . with Un and Vn unitary matrices (UnT Un = VnT Vn = In) and Σn a diagonal matrix with the positive “singular values” σ1 ≥ σ2 ≥ . . . ≥ σn > 0 on the diagonal. Note that zero singular values are removed in the singular value decomposition. Ronald Aarts PeSi/5/3 −1/2 A is Σn ← − −1/2 using the shifted Hankel matrix UnT H VnΣn     ← − H nr ,nc =     Ronald Aarts g(2) g(3) g(4) g(3) g(4) g(5) g(4) g(5) g(6) .. .. .. g(nr + 1) g(nr + 2) g(nr + 3) PeSi/5/4 g(nc + 1) g(nc + 2) g(nc + 3) .. · · · g(nr + nc) ··· ··· ···         What if there is noise and g(k) is only a finite series for k = 0, 1, ..., N ? Example: The estimated impulse response (CRA) of the simulated data from z + 0.5 G0(z) = 2 z − 1.5z + 0.7 8 7 • Construct a Hankel matrix with nc + nr = N . • Apply SVD (H = U ΣV T ) and find out how many n singular values are significantly nonzero. • Next construct an approximating Hankel matrix of rank n according Hn = UnΣnVnT , in which Un and Vn are the first n columns of U and V . Σn is the diagonal matrix with the first n (nonzero) singular values. • Apply the original Ho&Kalman algorithm to Hn to compute B and C. For A ← − the shifted Hankel matrix H with the original series g(k) is used. SVD’s: 7.410 3.602 0.033 0.033 0.023 0.023 0.016 0.015 0.009 6 5 4 3 2 1 0 Ronald Aarts PeSi/5/5 1 Ronald Aarts 2 3 4 5 6 7 8 9 Conclusion: order n = 2. PeSi/5/6 Identified system (with Ho&Kalman algorithm): A= C= " h 0.829 −0.379 0.379 0.673 1.3915 0.9681 # B= i " D= h 1.392 −0.968 0 Subspace identification (E.g. n4sid in ident) # More recent developed realisation method (early 90’s, KU Leuven), using directly the input and output data (so without explicit impulse response). i 0.999z + 0.496 As a transfer function: Ĝ(z) = 2 z − 1.502z + 0.701 Pole-zero-plot: + No explicit model equations needed (specify only the order n, very well suited for MIMO). + Mathematically elegant (robust, reliable) and efficient (optimisation with linear equations). − Not all mathematical issues of the optimisation are settled. + but is well suited as an initial guess to obtain better PE-models (these will be discussed afterwards and may need non-linear iterative optimisation algorithms). Poles (x) and Zeros (o) 0.8 0.6 G0 0.4 0.2 0 Ĝ −0.2 −0.4 −0.6 − The obtained solution is “sub-optimal”, −0.8 −1 Ronald Aarts −0.5 0 0.5 PeSi/5/7 1 Ronald Aarts PeSi/5/8 xk+1 = A xk + B uk + K ek yk = C xk + D uk + ek Subspace algorithm (van Overschee and de Moor (1994,1996); ident manual pages 3-13,17 & 4-140) Solution approach: Typical state space model structure: xk+1 = A xk + B uk y k = C xk + D u k + v k • If the states xk , input uk and output yk were known, then : Compute C and D with linear regression, compute ek , compute A, B and K also with linear regression. Measured input and output uk and yk Noise source vk Characterisation of the noise source according to the innovations form xk+1 = A xk + B uk + K ek yk = C xk + D uk + ek • How do we find the states xk ? Measured input and output uk and yk White noise source ek • Determine the order n of the system: → From a Singular Value Decomposition of the “correct” matrix. Then find an estimator for the states. Compute the state matrices as above. Equivalent to yk = G(z)uk + H(z)ek with G(z) = C(zI − A)−1B + D H(z) = C(zI − A)−1K + I Ronald Aarts This overview is rather “black box” alike. So-called PEM models will be discussed in much more detail later. PeSi/5/9 Ronald Aarts PeSi/5/10 Example subspace identification of simulated data from Example subspace identification of the piezo data z + 0.5 G0(z) = 2 z − 1.5z + 0.7 Order estimation: Pole-zero-plot: Model singular values vs order Poles (x) and Zeros (o) 10 Order estimation: Red: Default Choice Pole-zero-plot: Model singular values vs order 2.5 2 9 Poles (x) and Zeros (o) 3 1.5 Red: Default Choice 0.8 1 8 2 0.6 Log of Singular values 0.5 0.4 1 Log of Singular values 0.2 0 0 0 7 −0.5 −1 6 −1.5 −0.2 -1 −2 5 −0.4 −2.5 -2 −0.6 −3 4 −0.8 -3 −1 −0.5 0 0.5 1 3 0 2 -4 G0 and Ĝn4s2. -5 0 2 Ronald Aarts 4 6 8 Model order 10 12 14 4 6 8 Model order 10 12 Matlab 7.0.4: 4e order? 16 (May be different for other versions) PeSi/5/11 Ronald Aarts 14 16 −2 −1 0 1 2 3 Ĝn4s4 and Ĝn4s5. The poles of Ĝn4s4 are quite unlikely as the main resonance peak is “overlooked”. PeSi/5/12 4 Frequency response Spectral analysis: 0 Amplitude 10 Prediction Error identification Methods: Introduction −2 10 A model with the essential dynamics of a system with a finite (and limited) number of parameters is wanted for −4 10 2 3 10 4 10 10 0 Phase (deg) −180 • Simulation of the process. −360 • Prediction of future behaviour. −540 −720 −900 2 10 3 • Controller design. 4 10 Frequency (Hz) 10 • ... Ĝn4s4 and Ĝn4s5 compared with a non-parametric ETFE100-model. Poles (using M ATLAB’s pole and damp commands after exporting the model from ident’s GUI to the workspace and transforming into a tf model): The applicability of non-parametric models is limited. • 0.9988: “almost” pure integator • 0.96 ± 0.27i: frequency 1.3 kHz and relative damping 0.031 • 0.82 ± 0.56i: frequency 2.9 kHz and relative damping 0.013 Subspace models may be “suboptimal”. Ronald Aarts Ronald Aarts PeSi/5/13 e PEM: system description and disturbances ❄ u with measured input u(t) and output y(t). The non-measurable disturbance v(t) includes: PEM: prediction and prediction error (ident manual page 3-16) H0 LTIFD (Linear Time Invariant Finite Dimensional) system y(t) = G0(z) u(t) + v(t) PeSi/6/1 + ✲ G0 ✲ + v ❄ ✎☞ ✍✌ y ✲ Assume the dynamic system is given as y(t) = G(z)u(t) + H(z)e(t) with observations Yt−1 = {y(s), s ≤ t − 1} and Ut−1 = {u(s), s ≤ t − 1}. What is the best prediction for y(t) ? Situation I: u = 0, so y(t) = v(t) = H(z)e(t) and Vt−1 is known. • measurement noise, • process disturbances, • effects on non-measured inputs. • non-linearities. Noise model: v(t) with power spectrum Φv (ω) that can be written as v(t) = H0(z) e(t) The white noise e(t) has a variance σe2 and the transfer function H0(z) is stable, monic (H0(∞) = 1) and minimum phase (so it has a stable inverse). Ronald Aarts PeSi/6/2 Then v(t) = e(t) + m(t − 1), with m(t − 1) = ∞ P k=1 h(k)e(t − k). As H(z) has a stable inverse, also Et−1 is known, and then also m(t − 1) = [1 − H −1(z)]v(t). The expectation for v(t) is: v̂(t|t − 1) := E{v(t)|Vt−1} = m(t − 1). Ronald Aarts PeSi/6/3 Situation II: u 6= 0, so y(t) = G(z)u(t) + v(t) and Ut−1 and Yt−1 are known. Then Vt−1 is also known. So E{y(t)|Ut−1, Yt−1} = G(z)u(t) + E{v(t)|Vt−1}, Model structures and ŷ(t|t − 1) = G(z)u(t) + [1 − H −1(z)]v(t). Predictor model: Two rational functions (G(z), H(z)). With v(t) = y(t) − G(z)u(t) we find Model candidates from the model set M = {(G(z, θ), H(z, θ)) | θ ∈ Θ ⊂ Rd}. ŷ(t|t − 1) = H −1(z)G(z)u(t) + [1 − H −1(z)]y(t), the predictor and the prediction error ε(t) := y(t) − ŷ(t|t − 1) = H −1(z)[y(t) − G(z)u(t)]. Parameterisation: models are represented by real valued parameters θ from a parameter set Θ. Examples of parameterisations: Predictor models: “Tuning” of the transfer functions G(z) and H(z). When they both are exactly equal to the “real” G0(z) and H0(z), the prediction error ε(t) is a white noise signal. In practice: “tune” the estimates to minimise the error ε(t) with a least squares fit. Ronald Aarts PeSi/6/4 • • • Coefficients of series impulse response models Coefficients in state space matrices subspace identification Coefficients in fractions of polynomials PE methods Ronald Aarts PeSi/6/5 ARX Model structure (ident manual page 3-11) −1 ,θ) 1 , H(z, θ) = A(z −1 , G(z, θ) = B(z A(z −1,θ) ,θ) Model structures equation G(z, θ) H(z, θ) ARX A(z −1) y(t) = B(z −1) u(t) + e(t) B(z −1,θ) A(z −1,θ) 1 A(z −1,θ) ARMAX A(z −1) y(t) = B(z −1) u(t) + C(z −1) e(t) B(z −1,θ) A(z −1,θ) C(z −1,θ) A(z −1,θ) −1 ,θ) 1 y(t) = B(z u(t) + A(z −1 e(t) A(z −1,θ) ,θ) OE −1) y(t) = B(z u(t) + e(t) F (z −1) B(z −1,θ) F (z −1,θ) 1 A(z −1, θ) y(t) = B(z −1, θ) u(t) + e(t) FIR y(t) = B(z −1) u(t) + e(t) B(z −1, θ) 1 BJ −1) C(z −1 ) u(t) + D(z y(t) = B(z −1 ) e(t) F (z −1) B(z −1,θ) F (z −1,θ) C(z −1,θ) D(z −1,θ) with B(z −1, θ) = b1 + b2z −1 + ... + bnb z −(nb−1) A(z −1, θ) = 1 + a1z −1 + ... + ana z −na and θ = [a1 a2 ... ana b1 b2 ... nnb ]T ∈ Rna+nb . Name (AutoRegressive eXogenous) Predictor: ŷ(t|t − 1; θ) = B(z −1, θ) u(t) + [1 − A(z −1, θ)] y(t) is linear in the parameters θ. Ronald Aarts PeSi/6/6 Ronald Aarts PeSi/6/7 Identification criterium Properties of the model structures Ingredients for the identification Linearity-in-the-parameters: if H −1(z)G(z) and [1 − H −1(z)] are polynomials, then the predictor ŷ is a linear function in θ. Result: least squares criterium for ε(t) is quadratic in θ and there is an analytical solution for the best fit θ̂ (fast algorithm, suitable as order estimator). OK: ARX, FIR. Not: ARMAX, OE, BJ. Independent parameterisation of G(z, θ) and H(z, θ): no common parameters in G and H. Result: independent identification of G and H. OK: OE, FIR, BJ. Not: ARX, ARMAX. The structure and the order define the model set M. Ronald Aarts PeSi/6/8 Consistency properties of the least squares criterium If • the system (G0, H0) is in the chosen model set and • Φu(ω) 6= 0 in sufficient frequencies (sufficiently exciting) then G(z, θN ) and H(z, θN ) are consistent estimators. If • the system G0 is in the chosen model set, • G and H are parameterised independently (FIR, OE, BJ) and • Φu(ω) 6= 0 in sufficient frequencies (sufficiently exciting) then G(z, θN ) is a consist estimator. • the experimental data {(y(t), u(t)), t = 1, ..., N }. • the model set M. • the identification criterium. PE-methods: consider the prediction error ε(t, θ) = y(t) − ŷ(t|t − 1, θ) for all t as a function of θ. • • Least squares criterium: minimisation of the scalar function of ε(t, θ): Ronald Aarts PeSi/6/9 Example of the consistency properties Model system from the ident manual page 2-14: yk − 1.5yk−1 + 0.7yk−2 = uk−1 + 0.5uk−2 + ek − ek−1 + 0.2ek−2. Or G0(z) = 1.0 − 1.0z −1 + 0.2z −2 z −1 + 0.5z −2 and H (z) = 0 1.0 − 1.5z −1 + 0.7z −2 1.0 − 1.5z −1 + 0.7z −2   A(z −1) = 1.0 − 1.5z −1 + 0.7z −2   This is an ARMAX structure with B(z −1) = z −1 + 0.5z −2    −1 −1 −2 C(z Simulation (as before): PeSi/6/10 N P 1 or a filtered error: VN = N (L(z)ε(t, θ))2. t=1 Instrumental variable (IV) techniques: try to obtain an uncorrelated signal ε(t, θ). Note: The IV-method combined with an ARX model set can also provide a consistent estimator for G0 even when the noise model is not in the chosen model set. Ronald Aarts N P 1 VN = N ε(t, θ)2 t=1 Ronald Aarts N T fs u(t) e(t) = = = ) = 1.0 − 1.0z + 0.2z 4096 1s 1 Hz binary signal in frequency band 0..0.3fs white noise (random signal) with variance 1 PeSi/6/11 Parametric models: G0, H0 armax2221 na = 2, nb arx221 na = 2, nb oe221 nb = 2, nf arx871 na = 8, nb = 2, nc = 2, nk = 1 = 2, nk = 1 = 2, nk = 1 = 7, nk = 1 Spectral analysis: Parametric models: G0, H0 armax2221 na = 2, nb arx221 na = 2, nb oe221 nb = 2, nf arx871 na = 8, nb model G0 and H0 ∈ M. only G0 ∈ M. H = 1. High order guess. Noise spectrum (|H|): Pole-zero-plot: Frequency response = 2, nc = 2, nk = 1 = 2, nk = 1 = 2, nk = 1 = 7, nk = 1 Step response: Step Response Power Spectrum 2 10 10 Poles (x) and Zeros (o) model G0 and H0 ∈ M. only G0 ∈ M. H = 1. High order guess . 1 1 10 9 Amplitude 0.8 8 0 10 0.6 7 1 10 0.4 −1 10 6 −2 10 −1 0 10 0.2 10 5 0 0 4 −0.2 0 Phase (deg) 10 3 −90 −0.4 2 −0.6 −180 1 −0.8 −270 −2 10 −1 −1 10 0 10 10 Frequency (rad/s) Ronald Aarts −1 −1 −0.5 0 0.5 1 PeSi/6/12 −2 10 −1 0 0 10 10 0 5 10 15 Frequency (rad/s) Ronald Aarts 20 Time 25 30 35 40 PeSi/6/13 Approximate modelling: What happens if the model set does not fit? Asymptotic variance Error: ε(t, θ) = H −1(z, θ)[y(t) − G(z, θ)u(t)] = H −1(z, θ)[(G0(z) − G(z, θ))u(t) + v(t)] For system that fit in the model set, it can be proven for n → ∞, N → ∞ and n/N → 0 (n ≪ N ) that Power spectrum: Φε(ω, θ) = var(ĜN (eiω )) ∼ n Φv (ω) , N Φu(ω) var(ĤN (eiω )) ∼ n Φv (ω) n = |H0(eiω )|2. N σe2 N Φv (ω) = σe2|H0(eiω )|2. |G0(eiω ) − G(eiω , θ)|2 Φu(ω) + Φv (ω) with |H(eiω , θ)|2 N P 1 ε(t, θ)2. Cost function: VN = N t=1 Limit for N → ∞: VN → V̄ (θ) = Ēε(t, θ)2. 1 ∞ Φε(ω, θ) dω 2π −∞ shows the limit θ ∗ to which the estimator θN converges. With Parseval: V̄ (θ) = Ronald Aarts PeSi/6/14 Ronald Aarts Z PeSi/6/15 Minimising Example (Ljung 1999, Example 8.5, page 268) 1 ∞ |G0(eiω ) − G(eiω , θ)|2 Φu (ω) + Φv (ω) dω 2π −∞ |H(eiω , θ)|2 Z y(t) = G0(z)u(t), with white noise input (Φu(ω) ≈ 1) and 4th order G0 Two mechanisms: G0(z) = |G0(eiω ) − G(eiω , θ)|2 Φu (ω) . |H(eiω , θ)|2 • Minimising 0.001z −2(10 + 7.4z −1 + 0.924z −2 + 0.1764z −3) 1 − 2.14z −1 + 1.553z −2 − 0.4387z −3 + 0.042z −4 Approximate models: • Fitting of nominator and denominator. Observation: The way G0 is approximated by G(z, θ̂N ) depends on the noise model H(z, θ) (so in fact filtering occurs). Ronald Aarts PeSi/6/16 • 2nd order OE (oe221): y(t) = • 2nd order ARX (arx221): (1 + a1 z −1 + a2 z −2) y(t) = (b1 z −1 + b2 z −2) u(t) + e(t) Ronald Aarts Amplitude Bode-plot of G0, 2nd order OE (oe221) and 2nd order ARX (arx221) (left). b1 z −1 + b2 z −2 u(t) + e(t) 1 + f1 z −1 + f2 z −2 PeSi/6/17 Approximate modelling: using a fixed noise model That means: H(eiω , θ) = H∗(z) 1 10 0 10 With Parseval we obtain the limit θ ∗ to which the estimator θN converges now by minimising 0 10 1 ∞ |G0(eiω ) − G(eiω , θ)|2 Φu(ω) + Φv (ω) dω, or 2π −∞ |H∗(eiω )|2 |A| Amplitude Bode plot −1 10 Z −2 10 −1 10 −3 10 −2 −2 −1 10 0 10 10 10 −2 10 frequency (rad/s) −1 0 10 10 frequency (rad/s) Background: In the ARX estimation there is an extra filtering with the (a priori unknown) function (right): 1 |H(eiω , θ)|2 Ronald Aarts = |A(eiω , θ)|2 1 ∞ Φu(ω) dω |G0(eiω ) − G(eiω , θ)|2 2π −∞ |H∗(eiω )|2 Z Mechanism: Find the least squares estimate G(eiω , θ) of G0(eiω ) by applying a frequency domain weighting function Φu(ω)/|H∗(eiω )|2. Example: For the OE-model H∗(z) = 1. PeSi/6/18 Ronald Aarts PeSi/6/19 Example (extended with prefilter for identification) Approximate modelling: using a prefilter L(z) New cost function: VN = N N 1 X 1 X εF (t, θ)2 = (L(z) ε(t, θ))2. N t=1 N t=1 y(t) = G0(z)u(t), with PRBS input (Φu(ω) ≈ 1) and 4th order G0 G0(z) = 0.001z −2(10 + 7.4z −1 + 0.924z −2 + 0.1764z −3) 1 − 2.14z −1 + 1.553z −2 − 0.4387z −3 + 0.042z −4 Bode amplitude plots of 2e order (approximate) estimates: Limit for N → ∞: VN → V̄F (θ) = ĒεF (t, θ)2. Frequency response Again with Parseval the limit θ ∗ to which the estimator θN converges: Z Observation: The way G0 is approximated by G(z, θ̂N ) depends on the modified noise model L−1(z)H(z, θ), of which the prefilter L(z) can be tuned by the user. Ronald Aarts PeSi/6/20 Amplitude |L(eiω )|2 1 ∞ dω {|G0(eiω ) − G(eiω , θ)|2 Φu(ω) + Φv (ω)} V̄F (θ) = 2π −∞ |H(eiω , θ)|2 0 10 G0 OE, L(z) = 1 ARX, L(z) = 1 −2 10 −2 −1 10 10 0 10 Frequency (rad/s) OE, L(z) = L1(z) ARX, L(z) = L2(z) Applied 5th order Butterworth prefilters: L1(z): high-pass, cut-off 0.5 rad/s. L2(z): low-pass, cut-off 0.1 rad/s. Ronald Aarts PeSi/6/21 Remarks: • Both the 2nd order OE model without prefilter and the 2nd order ARX model with prefilter L2(z) give a good model estimate at low frequencies. For the obtained ARX-model the Bode amplitude plots of the prefilter |L2(eiω )| (dashed) and the ultimate weighting function |L2(eiω )A∗(eiω )| (solid) are: Frequency response 0 10 −2 Amplitude 10 • The estimated model depends of course on the chosen model structure. In addition the result can be tuned by • the chosen input spectrum Φu(ω), −4 10 −6 10 −8 10 −10 10 −2 10 −1 10 0 Frequency (rad/s) 10 Note that ARX models (with and without a prefilter) can be computed uniquely and quick from a linear optimisation problem, whereas the identification of OE models involves an iterative algorithm for a nonlinear optimisation. Ronald Aarts • In ident’s GUI a Butterworth filter of a specified order can be applied to input and output data. At M ATLAB’s command prompt any prefilter L(z) can be applied by computing uF (t) = L(z) u(t) and yF (t) = L(z) y(t). and next using the filtered data for the identification. See also e.g. help idfilt. PeSi/6/22 • the chosen prefilter L(z) and • (when possible) the chosen noise model H(z) (actually complementary to L(z)). V̄F (θ) = Ronald Aarts 1 2π Z ∞ −∞ {|G0(eiω ) − G(eiω , θ)|2 Φu (ω) + Φv (ω)} PeSi/6/23 |L(eiω )|2 dω |H(eiω , θ)|2 Example 4th order G0: Approximate modelling: time domain analysis The asymptotic estimator θ ∗ of an ARX model (order denominator = na, order nominator = nb − 1, delay nk = 0) with an input signal that satisfies Ru(τ ) = 0 for τ 6= 0 is always: G0(z) = 0.001z −2(10 + 7.4z −1 + 0.924z −2 + 0.1764z −3) 1 − 2.14z −1 + 1.553z −2 − 0.4387z −3 + 0.042z −4 Impulse responses of: G0, arx120, arx230, arx340, oe210 and oe320: Impulse Response Impulse Response 0.06 0.08 0.05 0.07 0.06 0.04 • • a stable G(z, θ ∗), 0.05 0.03 0.04 an exact result for the first nb samples of the impulse response g0(k), k = 0, ..., nb − 1. 0.02 0.03 0.02 0.01 0.01 0 On the contrary, in an OE model the (total) squared difference between the impulse responses of system and model are minimised. 0 −0.01 −0.01 −0.02 0 1 2 Time 3 −0.02 4 0 10 20 30 40 50 Time 60 70 80 90 100 Note: nb ≤ 2 does not give meaningful results. Ronald Aarts PeSi/6/24 Ronald Aarts PeSi/6/25 What is a good model? Choice of the model set Depends on the goal: Is it e.g. good enough to design a controller of does it simulate the most important dynamic behaviour? • The chosen model parameterisation, so writing [G(z, θ) and H(z, θ)] as a polynomial fraction. • The choice of the model structure in the transfer functions [G(z, θ), H(z, θ)]: ARX / ARMAX / OE / BJ: Linearity in the parameters (ARX, FIR) or not. Independent parameterisation of G(z, θ) and H(z, θ) (OE, FIR, BJ) or not. • De choice of the model complexity, so the number of parameters or the order of the polynomials. What are the criteria to make these choices in order to obtain a good model for a fair price? Ronald Aarts PeSi/6/26 More general: small bias and small variance. Conflict: • For a small bias a large model set is necessary (high order, flexible model structure). • A small variance is obtained easier with a small number of parameters. Large model set: Good fit with data (small residue) Large variance Model depends on the specific noise realisation Ronald Aarts Small model set: Bad fit with data (large residue) Small variance Better distinction between stochastic and structured effects PeSi/6/27 What makes up the “price” of the model? • “Price” of the identification: Complexity of the algorithm, amount of work. • “Price” of model usage: Order in the case of controller design or simulations. Methodology for the choice of the model set • • • • A priori considerations Analysis of the data A posteriori comparison of the models Validation of the models Ronald Aarts PeSi/6/28 A priori considerations Physical insight regarding the (minimal) order of the model. Physical insight regarding the nature of the noise disturbance. Relation between the number of data points N and the number of parameters to be estimated Nθ : General: N ≫ Nθ Rule of thumb: N > 10Nθ Note that the required number of points depends strongly on the signal to noise ratio. Ronald Aarts PeSi/6/29 A posteriori comparison of the identified models Analysis of the data For a chose model structure, compare the obtained results for the criterium VN (θ̂N , Z N ) as a function of the parameters sets θ̂N of different model orders. Information regarding the model order from • Non-parametric identification: (anti)resonance peaks, phase behaviour. • Approximate realisation methods by doing an SVD-analysis of the Hankel-matrix or for subspace identification. ARX-example (system with na = 2, nb = 3) in left graph: VN decreases for increasing number of parameters (i.e. model order), so the best fit is obtained with the most complex model! Model order selection: estimation data set Model order selection: validation data set 0.012 0.012 0.0115 0.0115 0.011 0.011 loss function N P 1 ϕn(t)ϕT R̂(n) = N n (t) t=1 with ϕn (t) = [−y(t − 1) ... − y(t − n) u(t) ... u(t − n)]T . For noiseless data matrix R̂(n) becomes singular when the order n is taken “too large” (compare with the ARX order estimator). loss function • Evaluation of the rank of the Toeplitz matrix: 0.0105 0.01 0.01 0.0095 0.0095 0.009 0 5 10 0.009 15 # of parameters Ronald Aarts PeSi/6/30 0.0105 Ronald Aarts 0 5 10 # of parameters PeSi/6/31 15 Solution to avoid this overfit: cross-validation: ARX order selection Split data into two parts: Z N = Z (1)Z (2). Identify θ̂N from Z (1) and evaluate criterium for Z (2): right graph: There is a minimum VN near 5 prameters, but it is not very distinct. Model order selection: validation data set 0.012 0.012 0.0115 0.0115 0.011 0.011 loss function loss function Model order selection: estimation data set 0.0105 0.01 0.0095 0.0095 0.009 0.009 5 10 15 • Akaike’s Information Criterion (AIC) • Akaike’s Final Prediction Error Criterion (FPE) 0.0105 0.01 0 For ARX models automatic (and reliable) selection criteria exist (considering ĒVN (θ̂N , Z N )) (ident manual page 3-70 ff, 4-183 ff): • 0 5 # of parameters 10 AIC = log((1 + 2 n/N ) V ) 1+n/N FPE = 1−n/N V Rissanen’s minimum description length (MDL) MDL = (1 + log N ∗ n/N ) V 15 # of parameters Ronald Aarts PeSi/6/32 Ronald Aarts PeSi/6/33 Comparison between arx432 and 4th order G0 ARX order estimation: 4th order ARX model system with noise Garx432(z) = With validation data: With one data set: Model Fit vs # of par’s 2 1.8 1.8 1.6 1.6 1.4 1.2 1 0.8 0.6 0.4 G0(z) = 1.4 0.001z −2(10 + 7.4z −1 + 0.924z −2 + 0.1764z −3) 1 − 2.14z −1 + 1.553z −2 − 0.4387z −3 + 0.042z −4 Pole/zero: Bode: Frequency response 1.2 Poles (x) and Zeros (o) 0 10 1 0.8 Amplitude % Unexplained of output variance % Unexplained of output variance Model Fit vs # of par’s 2 0.001z −2(10 + 7.3z −1 + 0.8z −2) 1 − 2.147z −1 + 1.557z −2 − 0.4308z −3 + 0.0371z −4 0.8 0.6 −1 10 0.6 −2 10 0.4 0.4 −3 10 0.2 0.2 −2 −1 10 0.2 10 0 0 0 5 10 15 # of par’s 20 25 0 30 0 5 10 15 # of par’s 20 25 0 30 −0.2 Blauw = AIC and MDL criterium, Rood = Best fit Phase (deg) −90 −0.4 −180 −0.6 −270 −360 −450 −0.8 −2 −1 10 10 −1 Frequency (Hz) Ronald Aarts PeSi/6/34 Ronald Aarts PeSi/6/35 −0.5 0 0.5 1 Suggestion: filter with a low-pass filter in the frequency range 0...3788 Hz and resample with a factor 4. ARX order estimation: Piezo data Comparison between arx542 and a (new) etfe200: Frequency response 0 Amplitude 10 Bode: Pole/zero: Poles (x) and Zeros (o) Frequency response 1 10 −2 10 0 0.8 −1 0.6 2 10 3 10 4 10 Amplitude 10 −4 10 10 −2 0.4 10 0 0.2 −3 10 3 10 10 0 −400 0 −600 −0.2 −100 −800 2 10 3 10 Frequency (Hz) 4 10 Phase (deg) Phase (deg) −200 2 −200 −0.4 −300 −400 −0.6 −500 −0.8 −600 −700 2 3 10 10 −1 −0.5 0 0.5 Problem: The suggested arx 8 10 4 puts too much emphasis on high frequencies. The right graph shows a comparison with the non-parametric etfe 60. First eigenfrequency ≈ 1300 Hz (ζ ≈ 0.03). Clearly more emphasis on the low-frequent behaviour (sufficient ?) Ronald Aarts Ronald Aarts PeSi/6/36 Validation of the models The ultimate answer to the question whether the identified model is “good enough”, so also for the choice of the model structure and order. Techniques: • Comparison of model G(z, θ̂N ) with previous results: Non-parametric estimates (spectra, impulse response). • Model reduction: pole-zero-cancellation (within the relevant confidence interval) may indicate too high model order. Frequency (Hz) PeSi/6/38 PeSi/6/37 • Simulation of the model: Consider the simulation error esim(t) = y(t) − ysim(t) = y(t) − G(z, θ̂N ) u(t). “Optimal” case: esim (t) = v(t) = H0(z)e(t). In order to avoid overfit, cross-validation is preferable. • Residual test: For a consistent model the prediction error ε(t) (the residue) converges to a white noise signal. Two tests: 1. Is ε(t, θ̂N ) a realisation of white noise? Evaluate autocovariance RεN (τ ) → δ(τ ). (Test for both Ĝ and Ĥ) 2. Is ε(t, θ̂N ) a realisation of a stochastic process? N (τ ) → 0. Evaluate cross-covariance Rεu (Test for Ĝ) • Large confidence intervals for one or more parameters may indicate too high model order (or too few data). Ronald Aarts 1 Ronald Aarts PeSi/6/39 Method 1: Autocovariance RεN (τ ) = −τ 1 NX ε(t + τ )ε(t) for τ ≥ 0. N t=1 Requirement RεN (τ ) small for τ > 0 and N large: √ |RεN (τ )|/RεN (0) ≤ Nα/ N Example: 4th order G0 with noise (arx442-model). arx432 oe342 arx222 System model OK OK not OK Noise model OK not OK not OK (autocovariance does not fit in the plot) Autocorrelation of residuals for output 1 0.2 with Nα the confidence level for the confidence interval with probability α (E.g. N95% = 1.96, N99% ≈ 3). 0.1 0 −0.1 N Method 2: Cross-covariance Rεu (τ ) = −τ 1 NX N t=1 −0.2 −20 ε(t + τ )u(t) for τ ≥ 0. ∞ X −5 0 5 10 15 20 10 15 20 Cross corr for input 1and output 1 resids 0.05 0 −0.05 N R̂εN (k)R̂u (k) −0.1 −20 k=−∞ Ronald Aarts −10 0.1 √ √ N (τ ) ≤ N Requirement Rεu α P/ N with estimator for P : P̂ N = −15 PeSi/6/40 −15 Ronald Aarts −10 −5 0 Samples 5 PeSi/6/41 Identification design: Introduction Design variables for the estimation of the transfer functions The user can tune the identified model by choosing The asymptotic estimator of the parameters θ ∗ minimises • the model set (model structure and order) • the experimental conditions like the number of data samples N , de sample time Ts, the input spectrum Φu(ω), etc. • a prefilter L(q) Which requirements should be satisfied by the model? • Consistency? No problem with validation and sufficiently exciting input signal. However, is the requirement for a correct model set realistic? • Adequate for the intended purpose: simulation, prediction, diagnostics, controller design. Ronald Aarts PeSi/7/1 V̄F (θ) = 1 ∞ |L(eiω )|2 dω {|G0(eiω ) − G(eiω , θ)|2 Φu(ω) + Φv (ω)} 2π −∞ |H(eiω , θ)|2 Z in which a compromise between two mechanisms plays a role: 1 ∞ |G0(eiω ) − G(eiω , θ)|2 Q(ω, θ ∗) dω, with 2π −∞ Q(ω, θ) = Φu(ω) |L(eiω )|2 / |H(eiω , θ)|2. • Minimisation of Z • Fitting of |H(eiω , θ)|2 with the error spectrum. The limit θ ∗ depends on the model set, the prefilter L(q) and the input spectrum Φu(ω). Ronald Aarts PeSi/7/2 Experiment design Fitting of G0(eiω ): How can we get a fit in terms of the usual Bode plots? • The vertical axis of the Bode amplitude plot shows log |G|, which is approximated better with a small relative error. We are minimising Z 1 ∞ G0(eiω ) − G(eiω , θ) 2 | | |G0(eiω )|2Q(ω, θ ∗) dω, so Q(ω, θ ∗) 2π −∞ G0(eiω ) should be large when G0(eiω ) becomes small. • The horizontal axis shows log ω, which means that high frequencies dominate. This can be compensated by looking at |G0(eiω )|2 Q(ω, θ ∗) dω = ω|G0(eiω )|2 Q(ω, θ ∗) d(log ω). So the fit at low frequencies improves if |G0(eiω )|2 Q(ω, θ ∗) is larger than ω in that frequency region. The factor Q can be manipulated by modification of the design variables: input spectrum Φu(ω), noise model set {H(eiω , θ)}, prefilter L(q). Ronald Aarts PeSi/7/3 First analysis of the process (preparation): • What are the possible input(s) and output(s) of the system? • How are these signals being measured? • What are usual and extreme amplitudes of the signals. • At which frequencies do we expect essential dynamic behaviour? Preliminary experiments (not specifically aiming at identification): • Measurement of output signal (noise) for constant input signal. • Measurement of a step response: - Linearity - Relevant frequencies - Static gain - Selection input signal • Non-parametric identification (correlation and frequency analysis) with harmonic or broadband input signal: - Relevant frequencies - Duration of the experiment - Sample frequency - Selection input signal Ronald Aarts PeSi/7/4 Pseudo-Random, Binary Signal Input signals Quite often a broadband input signal is desirable: sufficiently exciting and the data contains information of the system in a large range of frequencies. Matlab-tool: u = idinput(N,type,band,levels) • Signal that can be generated easily by a computer algorithm using n shift registers and modulo-2 addition ⊕: s(t) = xn(t) xi(t + 1) = xi−1(t) 2≤i≤n x1(t + 1) = a1x1(t) ⊕ a2x2(t) ⊕ ... ⊕ anxn(t) 0⊕0=1⊕1=0 0⊕1=1⊕0=1 Clock Possibilities for (type): ❄ • ’RGS’: Random, Gaussian Signal: discrete white noise with a flat spectrum • ’RBS’: Random, Binary Signal (default). • • ✲ x1 ✲ ❄ State 2 ❄ ❆ ✁ ❆ ✁ ❆✁ ❄ ✛✘ ✛ ✚✙ a1 ⊕ ’PRBS’: Pseudo-Random, Binary Signal. ’SINE’: sum of harmonic signals (sine functions). State 1 x2 ❄ ❆ ✁ ❆ ✁ ❆✁ ❄ ✛✘ ✛ ✚✙ a2 ⊕ xn−1 ✲ ❄ State n ❄ ❆ ✁ ❆ ✁ ❆✁ ❄ ✛✘ ✛ ✚✙ an−1 ⊕ xn=s ✲ ❄ ❆ ✁ ❆ ✁ ❆✁ ❄ ✛✘ an ⊕ ✚✙ Given: initial condition and binary coefficients a1, ..., an. • Output: Deterministic periodic signal. “Pseudo-random”? Ronald Aarts PeSi/7/5 Ronald Aarts PeSi/7/6 • Maximum length PRBS: The period of the PRBS is as large as possible and equals M = 2n − 1 if the coefficients are chosen correctly, for example: n ai = 1 n ai = 1 n ai = 1 n ai = 1 3 1,3 7 1,7 11 9,11 15 14,15 4 1,4 8 1,2,7,8 12 6,8,11,12 16 4,13,15,16 5 2,5 9 4,9 13 9,10,12,13 17 14,17 6 1,6 10 3,10 14 4,8,13,14 18 11,18 • For M → ∞ the autocovariance Ru(τ ) converges to a white noise autocovariance. Ru(τ ) c2(1 − M12 ) c2 (1 + 1 ) −M M • The binary signal s(t) can be transformed into a signal u(t) with amplitude c and mean m with u(t) = m + c(−1 + 2 s(t)): Ru(0) = (1 − M12 )c2 2 Ronald Aarts Ronald Aarts ❊ ❊ ❊ ❊ ✆ ✆ ❊ ❊ ✆ ✆ ✆ ✆❊ ✆❊ ✆ ❊ ✆ ❊ ❊ ❊ M ❊ ❊ ❊ ✲ τ t This is also a binary signal, but it is a stochastic signal (instead of deterministic like the PRBS). PRBS, Nc = 2 ✻ ✲ Two classes: t −c • The actual clock period can also be taken equal to a multiple of the sample frequency: uNc (t) = u(ent(t/Nc)). The signal is stretched. The spectrum is no longer flat and even has some frequencies with Φu(ω) = 0. Φu(ω)/2π 1.6 1. Generated from u(t) = c sign[R(q)u(t − 1) + w(t)], in which w(t) is a stochastic white noise process, and R(q) is a stable linear filter. The power spectral density Φu(ω) depends on R(q). For R(q) ≡ 0 the output spectrum is flat. 2. A random binary signal u(t) that equals Φu (ω)/2π ±c, according to P r(u(t) = u(t − 1)) = p and P r(u(t) = −u(t − 1)) = 1 − p, in which p is the probability that the signal does not p = 0.90 change the next sample time (0 < p < 1). p = 0.75 p = 0.50 The power spectral density depends on p: For p = 1/2 the spectrum is flat. ω 1.5 1.4 Nc = 10 1.2 1 1 0.8 Nc = 5 0.6 Nc = 3 0.4 0.5 Nc = 1 0.2 0 0 0.5 1 1.5 2 2.5 3 3.5 ω PeSi/7/9 PeSi/7/8 Random Binary Signal PRBS, Nc = 1 ✻ −c Ronald Aarts ❊ τ = 1, ..., M − 1 ✲ c ❊ • The binary character is advantageous in the case of (presumed) non-linearities as it has a maximum power for a limited amplitude. PeSi/7/7 c ❊ The power spectrum can be modified by filtering of the signal with a (linear) filter, but then the binary character is no longer guaranteed. c Ēu(t) = m + M c (1 + 1 ) Ru(τ ) = − M M ✻ 0 Ronald Aarts PeSi/7/10 0 0.5 1 1.5 2 2.5 3 3.5 Periodic sum of harmonic signals (sine functions) Sample frequency ωs = 2π/Ts Compute Two different situations: • data acquisition. u(t) = r X • αk sin(ωk t + ϕk ) k=1 with a user defined set of excitation frequencies {ωk }k=1,...,r and associated amplitudes {α}k=1,...,r . Selection of the phases {ϕ}k=1,...,r : • • For a minimal amplitude in the time domain: Schroeder-phased sines Randomly chosen. Ronald Aarts Advice for data acquisition: collect as many data as possible. For a fixed total measurement time TN = N · Ts and unlimited N the sample time Ts is as small as possible. For a fixed (or limited) N the sample time Ts should not be too small in order to capture also the slowest responses of the system. Rule of thumb: • Upper limit Ts, lower limit ωs: Nyquist frequency ωN = ωs/2 > highest relevant frequency. For a first order system with bandwidth ωb: ωs ≥ 10ωb. • Information of the system is obtained in a limited number of frequencies. PeSi/7/11 identification & application of the model. Lower limit TN > 5–10 times the largest relevant time constant. Make sure to apply (analog) anti-aliasing filters when needed. Ronald Aarts PeSi/7/12 The advice for parametric identification & application of the model is: Numerical aspects define an upper limit for ωs: A continuous system with state space matrix Ac and ZOH discretisation has a discrete state space matrix Ad = eAcTs . For Ts → 0 it appears that Ad → I: all poles cluster near z = 1. The physical length of the range of the difference equation becomes smaller. PE-methods also emphasise high frequencies if Ts is too small: The prediction horizon (one step ahead) becomes small. For smaller Ts / Zlarger ωs also higher frequencies are included in the minimisation of Rule of thumb for the upper limit for a first order system : ωs ≤ 30ωb . Combined (for a first order system): 10ωb ≤ ωs ≤ 30ωb. Strategy: • Data acquisition and preliminary experiments at a high ωs. • Where needed reduction of ωs for parametric identification by applying a digital filter. |G0(eiω ) − G(eiω , θ)|2 Q(ω, θ) dω. In e.g. an ARX model it is very likely that the weighting of (too) high frequencies is too large (Q(ω, θ) = Φu (ω) |A(eiω , θ|2). Ronald Aarts PeSi/7/13 Ronald Aarts PeSi/7/14 Parameter identification: Model parameters have physical meaning. Data processing: • If necessary an anti-aliasing and/or noise filter should be used during the measurement: analog low-pass filters. Example: double pendulum with actuators (motors) driving the rotations φ1 and φ2 with torques T1 and T2. • Outliers / spikes: visual inspection and manual removal. • Means and drift: remove of model explicitly. • Scaling of signals: very important for MIMO systems. • Delays: compensate or model explicitly. • Adaptation of the sample frequency for the identification: after digital anti-aliasing filtering (explicit or automatic in a Matlab command). Take notice of this step when the input data is generated: Do not put energy in a frequency band that is later removed (e.g. for a PRBS: set Nc). Non-linear equations of motion (see e.g. Dynamics of Machines (113173)): M̄ " PeSi/7/15 = h m1 J1 φ2 φ1 l1 l2 m3 g m3 l1 l2 (φ̇22 + 2φ̇1 φ̇2 ) sin φ2 + (m2 + m3 )l1 g sin φ1 + m3 l2 g sin(φ1 + φ2 ) −m3 l1 l2 φ̇21 sin φ2 + m3 l2 g sin(φ1 + φ2 ) i + " # T1 T2 with (reduced) mass matrix M̄ = Ronald Aarts # φ̈1 φ̈2 m 2 J2  m2 l12 + m3 (l12 + l22 + 2l1l2 cos φ2) + J1 + J2 m3 (l22 + l1l2 cos φ2 ) + J2 m3(l22 + l1l2 cos φ2) + J2 m3 l22 + J2 Ronald Aarts  PeSi/8/1 Parameter-linear form Parameters: • Link lengths (l1 and l2) are known. • Gravity g is known • Masses and inertias (m1, J1, m2, J2 and m3) are to be estimated. • Measurements of the torques (T1 and T2). • Angles known as function of time (φ1(t) and φ2(t)). Parameter-linear form (2) M̄ " φ̈1 φ̈2 # = with M̄ = ⇔ with • measurements vector τ = [T1, T2]T , • parameter vector p including m1, J1, m2, J2 and m3, • regression matrix Φ(q , q̇ , q̈ ) depending only on known kinematic quantities with q = [φ1, φ2]T . Ronald Aarts PeSi/8/2 "  m2 l12 + m3 (l12 + l22 + 2l1l2 cos φ2 ) + J1 + J2 m3 (l22 + l1l2 cos φ2 ) + J2 m3 (l22 + l1l2 cos φ2 ) + J2 m3 l22 + J2 T1 T2 #     = Φ(q , q̇ , q̈ )     m1 J1 m2 J2 m3 i +         ??? • Mass m1 does not show up in the equations of motion, so it can not be identified! Ronald Aarts T1 T2 m3 l1 l2 (φ̇22 + 2φ̇1 φ̇2 ) sin φ2 + (m2 + m3 )l1 g sin φ1 + m3 l2 g sin(φ1 + φ2 ) −m3 l1 l2 φ̇21 sin φ2 + m3 l2 g sin(φ1 + φ2 ) Can the equations of motion be expressed in a parameter-linear form? τ = Φ(q , q̇ , q̈ )p " h PeSi/8/3 #  Parameter-linear form (3) • The other parameters are collected in p = [J1, J2, m2, m3]T . Then the elements of Φ can be written as Φ11 Φ12 Φ13 Φ14 = = = = Φ21 Φ22 Φ23 Φ24 = = = = φ̈1 φ̈1 + φ̈2 2φ̈ −l1g sin φ1 + l1 1 l1l2((2φ̈1 + φ̈2) cos φ2 − (φ̇2 2 + 2φ̇1 φ̇2 ) sin φ2 ) 2 φ̈ + l2(φ̈ + φ̈ ) − l1g sin φ1 − l2g sin(φ1 + φ2) + l1 1 2 2 1 0 φ̈1 + φ̈2 0 2 l1l2(φ̇2 1 sin φ2 + φ̈1 cos φ2) − l2 g sin(φ1 + φ2) + l2 (φ̈1 + φ̈2) • So matrix Φ indeed depends only on known quantities and not on any of the parameters in p: A parameter linear form can be obtained: Experimental approach • Measure the torques τ i = [T1, T2]T at N time instances ti (i = 1..N ) with known angles, velocities and accelerations q i = [φ1, φ2]T , q̇ i = [φ̇1, φ̇2]T and q̈ i = [φ̈1, φ̈2]T .     • Collect b =  τ1 τ2 .. τN        and A =    Φ1 Φ2 .. ΦN     in  b = Ap. • With the pseudoinverse the LS solution is p = A†b. τ = Φp. Ronald Aarts PeSi/8/4 Ronald Aarts PeSi/8/5 Experimental approach (3) Experimental approach (2) • In order to compute the solution p = A†b, matrix A has to be known with sufficient accuracy, so all angles, velocities and accelerations have to be known with sufficient accuracy. • In order to compute the solution p = A†b, the pseudoinverse of matrix A has to exist. In other words: AT A may not be singular or all columns of A should be linearly independent (full column rank). This is the case when: • All parameters are independent. • The input angles, velocities and accelerations q , q̇ , q̈ are sufficiently exciting. → The rank of matrix A can be checked with an SVD analysis: All singular values should be (sufficiently) unequal zero. Ronald Aarts PeSi/8/6 • An upper bound for the relative error of the estimated parameter vector is ||p − pLS || ||en|| ≤ cond(A) , ||p|| ||b|| where en is the measurement noise and the condition number is cond(A) = σmax(A) , σmin (A) with the largest and smallest singular values σmax(A) and σmin(A), respectively. → The input angles, velocities and accelerations q , q̇ , q̈ can be optimised to minimise the condition number. Ronald Aarts PeSi/8/7 Experimental approach (4) Experimental approach (5) • How can such an optimisation be carried out? • How can velocities and accelerations be computed accurately when only position sensor data is available? Possible solution: Use finite Fourier series: φi(t) = Ni  X ail sin(ωf lt) + bil cos(ωf lt) + φi(0) φ̇i(t) = Ni  X ailωf l cos(ωf lt) − bil ωf l sin(ωf lt) Ni  X −ailωf2l2 sin(ωf lt) − bilωf2l2 cos(ωf lt) l=1 φ̈i(t) = l=1 • The use of harmonic function makes it possible to compute velocities and accelerations analytically. From experimental data the time derivatives can be computed by considering only the relevant frequencies after Fourier transform.  l=1 • For each φi there are 2 × Ni + 1 parameters (ail, bil for i = 1..Ni and φi(0)) that can be optimised for optimal excitation while e.g. motion constraints are satisfied, e.g. with the M ATLAB command fmincon.   • Due to periodicity several periods can be averaged for improved signal to noise ratio. The fundamental pulsation of the Fourier series ωf should match the total measurement time. • The input signal contains only the selected frequencies. Ronald Aarts Ronald Aarts PeSi/8/8 Identifiability PeSi/8/9 Identifiability (2) m 2 J2 • Column rank deficiency of matrix A is not always easy to see beforehand as a linear dependency of parameters can be hidden well. Consider e.g. a horizontal pendulum, so with no gravity. Then with parameter vector p = [J1, J2, m2, m3]T the elements of Φ can be written as Φ11 Φ12 Φ13 Φ14 = = = = Φ21 Φ22 Φ23 Φ24 = = = = Ronald Aarts φ̈1 φ̈1 + φ̈2 2φ̈ l1 1 l1l2((2φ̈1 + φ̈2) cos φ2 − (φ̇2 2 + 2φ̇1 φ̇2 ) sin φ2 ) 2 φ̈ + l2 (φ̈ + φ̈ ) + l1 2 1 2 1 0 φ̈1 + φ̈2 0 2 l1l2(φ̇2 1 sin φ2 + φ̈1 cos φ2) + l2 (φ̈1 + φ̈2) PeSi/8/10 Now the first and third column of Φ are dependent as " Φ13 Φ23 # 2 = l1 " Φ11 Φ21 m1 J1 # φ2 φ1 l1 l2 It implies that p1 = J1 and p3 = m2 can not be estimated separately. Instead 2 p = J + l2 m can be estimated. only the linear combination p1 + l1 3 1 1 2 2m , J , m ]T a matrix Φ By defining a new parameter vector p∗ = [J1 + l1 2 2 3 with full rank can be constructed and the pseudoinverse exists (provided sufficient excitation). Such a parameter vector is denoted a base parameter vector. Ronald Aarts PeSi/8/11 m3 Measurement errors Suppose parameter set p∗ is obtained for measurement set τ ∗ and parameter set p∗ + δp is obtained for measurement set τ ∗ + δτ . How are δp and δτ related? For cost function VN (p, τ ) and the estimated parameters sets: ∂VN ∗ ∂VN ∗ ∗ (p , τ ) = 0 and (p + δp, τ ∗ + δτ ) = 0 ∂p ∂p Measurement errors (2) T For least squares criterion VN (p, τ ) = 1 2 ε(p, τ ) ε(p, τ ):  ∂ 1 ∂VN T ε(p, τ ) = ε(p, τ )T S(p, τ ) with ε(p, τ ) = ∂p ∂p 2 ∂ε(p, τ ) . sensitivity matrix S(p, τ ) = ∂p First derivative: Taylor series for the second expression ∂ 2VN ∗ ∗ ∂ 2VN ∗ ∗ (p , τ )δp + (p , τ )δτ = 0 2 ∂p ∂p∂τ Second derivative: ∂S(p, τ ) ∂ 2VN = ε(p, τ )T + S(p, τ )T S(p, τ ) 2 ∂p ∂p Small residue, so ignore first term and so δp = − " #−1 ∂ 2VN ∗ ∗ (p , τ ) ∂p2 ∂ 2VN ∗ ∗ (p , τ ) δτ ∂p∂τ Ronald Aarts PeSi/8/12 δp = −(S(p∗, τ ∗)T S(p∗, τ ∗))−1S(p∗, τ ∗)δτ = −S(p∗, τ ∗)†δτ Ronald Aarts PeSi/8/13 Non-linear parameter optimisation (as not the whole world is linear) Non-linear least-squares problem → Numerical minimisation of function V by iterative search methods Quadratic criterion: VN (θ) = ε(θ)T ε(θ) θ̂ (i+1) = θ̂ (i) + αf (i) Gradient: VN′ (θ) = S(θ)T ε(θ) with sensitivity matrix S. with search direction f (i) and (positive) step size factor α. 1. using function values only, or 2. using function values and the gradient, or 3. using function values, the gradient and the Hessian (second derivative matrix). Hessian: VN′′ (θ) = S(θ)T S(θ) + ε(θ)T S ′(θ) or VN′′ (θ) ≈ S(θ)T S(θ) Gauss-Newton (µ = 0, λ = 0) / damped Gauss-Newton (λ = 0) / Levenberg-Marquardt: Typical member of group 3: Newton algorithms with search direction i−1 f (i) = − V ′′(θ̂ (i)) h (i+1) θ̂N V ′(θ̂ (i)) Subclass of group 2 are quasi-Newton methods: estimate Hessian. Ronald Aarts PeSi/8/14 (i) (i) = θ̂N − µN  (i) (i) S(θ̂N )T S(θ̂N ) + λI see e.g. M ATLAB’s “Optimization Toolbox”. Ronald Aarts PeSi/8/15 −1 (i) (i) S(θ̂N )T ε(θ̂N ) Outline: Parameter identification: More complex model. Joint 3 • Modelling: Which effects are included in the (dynamic) model? • Masses: acceleration and gravity: Yes. • Friction: Yes. • Gravity compensation spring: Yes. • Driving system: Yes. • Elasticity of the mechanism: No, so only “low” frequencies. Link 3 Link 2 Hinge 3 Beam 3 Slider truss Hinge 4 Joint 4 Link 4 Beam 4 Beam 2 Joint 2 Hinge 5 Joint 5 Joint 6 Link 1 Hinge 2 Link 5 Beam 5 Beam 1 Link 6 Hinge 6 Hinge 1 Joint 1 Beam 6 z • Modelling: Complete the equations of motion and indicate the parameters. Base y Figure 1. Stäubli RX90B six-axis industrial robot. x Courtesy of Stäubli, Faverges, France. Figure 2. Finite element model of the Stäubli RX90B. • Identification: Linear-in the parameters or not? Six axes industrial robot Ronald Aarts PeSi/8/16 Ronald Aarts From Dynamics of Machines (113173): M̄ q̈ + DF (x)T M (D2F (x)q̇ )q̇ − f + DF (e,c)T σ (c) = T , i • q are the six degrees of freedom, i.e. the joint angles. • First two terms are the inertia effects: • Reduced mass matrix M̄ . • Accelerations q̈ . • First order geometric transfer functions of Jacobians DF . • Force f includes gravity. • Total stress in the gravity compensating spring σ (c). • Joint torques T Ronald Aarts PeSi/8/17 m(k) , J(k) Rq ny′ q Modelling (2): Masses and spring parameters Modelling (1): Masses and spring h • Experiment: How is the data collected? (3) • Lumped Masses: Each link is described by a symmetric rotational inertia matrix J (k), a mass m(k) and a vector s(k) defining the center of gravity with respect to the corresponding element node at which the body is lumped. (k) Rp ny′ s z Rq nx′ Rq nz′ p y Rp nz′ Rp nx′ x (Toon Hardeman, 2005) For each link element a lumped parameter vector p(l,k) is defined as p(l,k) = (m, msx′ , msy′ , msz ′ , Jx′x′ , Jx′y′ , Jx′z ′ , Jy′y′ , Jy′z ′ , Jz ′z ′ )(k) containing the components of m(k), s(k) and J (k), respectively. • Total stress in pre-stressed spring: σ (c) = σ (c,0) + ks e(c) , (4) depends on elongation e(c) and parameters: pre-stress σ (c,0) and stiffness ks . PeSi/8/18 Ronald Aarts PeSi/8/19 Modelling (3): Driving system Modelling (4): Friction – first try i5 ij (m) nj Jj k5 (f ) Tj ϕj ϕ5 Tj (m) kj q5 T5 n56 qj (m) T (m) = kj ij . n5 i6 • Part of the torque is needed for the acceleration of the motor inertia (m) Jj ϕ¨j . (f ) T6 q6 (m) k6 Friction torques have been measured in single joint constant velocity experiments. Commonly used model equation: (f ) T5 T6 ϕ6 J6 n6 (b) Joints 5 and 6. (a) Joints 1 to 4. Figure 3. T (f ) /T (max) T (m) − TT J (m)Tq̈ − T (f ) = T , (6) Stribeck (v,0) + Tj q̇j viscous 0.04 0.30 0.20 0.10 0.00 0 1 2 3 4 0.03 Dots (•): experiments. 0.02 Dashed (- - -): estimated in the full velocity range. 0.01 0.00 0.00 5 Angular velocity q̇ [rad/s] with gear ratios T and all motor inertias collected in J (m). 0.05 0.10 0.15 0.20 Angular velocity q̇ [rad/s] (a) Full velocity range. PeSi/8/20 (b) Low velocity range. Ronald Aarts Solid (—): estimated in the range from 0 to 0.5 rad/s. PeSi/8/21 Modelling (6): Friction – conclusion Modelling (5): Friction – improved model After looking carefully at the experimental data and with physical insight:  (a) (s) δj Coulomb   (a) (s) δj (s,0) − q̇j /q̇j + Tj e 0.40 • The input-output relation of the complete driving system (f ) (a,0) − q̇j /q̇j Tj = Tj e (C,0) Tj Layout of the driving system. (f ) • All friction losses are combined into a single term Tj . Ronald Aarts (f ) Tj = T (f ) /T (max) • Generated torques depend on (m) known motor constants kj and currents ij J5   (a) (v) (s) δj (f ) (a,0) − q̇j /q̇j (v,0) (1−δj ) Tj = Tj e + Tj q̇j . (v) (v,0) (1−δj ) + Tj q̇j . 0.40 (7) 0.04 (7) • For each friction torque there are five unknown parameters: (a,0) • the static friction torque Tj , T (f ) /T (max) T (f ) /T (max) (s) 0.30 0.20 0.10 0.00 0 1 2 3 4 0.03 • the Stribeck velocity q̇j , 0.02 • the Stribeck velocity power δj , 0.01 • the viscous friction coefficient Tj (v,0) 0.00 0.00 5 Angular velocity q̇ [rad/s] 0.05 0.10 0.15 0.20 Angular velocity q̇ [rad/s] (a) Full velocity range. (a) (b) Low velocity range. and (v) • the viscous friction power δj . • All parameters can be obtained with a non-linear parameter estimation, provided a reasonable initial guess is available. (a,0) (v,0) This friction model gives an accurate fit in the full velocity range with a minimal parameter set and physically sound model structure (Rob Waiboer, 2005). • Parameters Tj and Tj depend strongly on temperature and have to be identified again in each experiment. Fortunately, these parameters appear linearly in the model! Ronald Aarts Ronald Aarts PeSi/8/22 PeSi/8/23 Modelling (8): Parameter linear form Modelling (7): Putting it all together Φ(q̈ , q̇ , q )p = T (m) All submodels can be combined in a single equation: M̄ (m)q̈ + DF (x)T M (D2F (x)q̇ )q̇ − f + DF (e,c)T σ (c) + T (f ) = T (m).(8) h i • The motor inertias J (m) are included in the new reduced mass matrix M̄ • The joint driving torques T (m) are computed from the measured motor currents. • This acceleration linear form is well suited for simulations. (m) . The parameter linear form is well suited for a linear least squares fit of the model parameters p: • p(l): 60 parameters of the lumped inertia parameters of the six links. • p(s): 2 parameters of the spring (pre-stress σ (c,0) and stiffness ks ). • p(m) : 6 parameters with the motor inertias. • It can be proven that a parameter linear form can be obtained, provided: • the proposed model structure is used to model the mass and inertia properties (Toon Hardeman, 2004) and • the non-linear parameters in the friction model are taken constant. Ronald Aarts (9) • p(f ) : 14 parameters representing the linear part of seven equations for friction torques. So in total 82 parameters! PeSi/8/24 Ronald Aarts PeSi/8/25 Parameter identification: LS solution Identification experiment • Estimate parameter vector p by minimising the residual ρ in the linear system of equations • Move the robot along a trajectory for the joint angles q (t). • Record the joint driving torques T (m) (t) at n time instances. Ap = b + ρ. • The least-squares solution • Compute all matrices Φi (q̈ i, q̇ i, q i) from the (actual) trajectory q (t). p̂ = arg min ||ρ||22 p • Collect all data in the regression matrix A and measurement vector b:   Φ1(q̈ 1, q̇ 1, q 1)  .. A=   Φn(q̈ n, q̇ n, q n) T 1(m)   ..  and b =    (m) Tn   (15) • Identification experiment: Multi-sine (well-defined input in frequency domain). Ronald Aarts PeSi/8/26 (14) (16) can be obtained using the pseudo-inverse A† p̂ = A†b = (AT A)−1AT b, (19) provided the inverse (AT A)−1 exists, so A has full rank. Ronald Aarts PeSi/8/27 Parameter identification (2): SVD • Unfortunately, A is rank deficient Parameter identification (3): SVD result rank(A) = r < m, (20) 105 where m is the length of the parameter vector p (m = 82). 1 • Apply Singular Value Decomposition σ A = U ΣV T , (21) with orthogonal 6n × 6n matrix U , m × m matrix V . Σ= " S 0 • For two different trajectories the rank of the regression matrix appears to be r = 55. 10−5 10−10 10−15 10−20 # 1 10 20 (22) 30 40 50 55 60 70 82 Singular value number • Does that mean that 55 parameters can be estimated meaningfully? contains the nonnegative singular values S = diag(σ1, σ2, ..., σm) in descending magnitude. Ronald Aarts PeSi/8/28 Ronald Aarts PeSi/8/29 Parameter identification (4): essential parameters and null space Parameter identification (5): parameter solution and ambiguity • Split S in nonzero part S 1 (r × r) and zero part S 2 • Transform measurement vector b with the left singular matrix U S= " S1 0 0 S2 # g = U T b, , then the original LS problem can be replaced by and accordingly U= h U1 U2 i and V = h V1 V2 i (24) • Transform and partition parameter vector p with the right singular matrix V : p = V 1α(E) + V 2α(N ) (25) α(E) is the essential parameter vector that contains the parameter combinations that can be estimated independently, α(N ) is associated with the null space: It can have any value! Ronald Aarts (30) (23) PeSi/8/30 α̂ = arg min || α " S 0 # α − g ||22, (31) with solution for the essential parameters (E) α̂i g = i , for i = 1, 2, ...r. σi (32) • An infinite number of parameters is found p̂ = V 1α̂(E) + V 2α̂(N ) (34) as α̂(N ) can take any value. Ronald Aarts PeSi/8/31 Parameter identification (6): residual error and parameter accuracy Parameter identification (7): truncated SVD • An estimate of the variance of the residual is s2 = var(ρ) = 6n X 1 ρi 6n − r i=1 (38) • In other words: Take only r singular values into account where r is smaller than the rank of the regression matrix A. This is the truncated or partial SVD method. • Now the variance of the LS estimate α̂i equals var(α̂i) = s2 . σi2 • To avoid degradation of the parameter accuracy: Do not take into account too small singular values. (39) So for small singular values, the accompanying α̂i can not be estimated accurately. • Such inaccurate estimates will ruin the parameters in p!!! Ronald Aarts PeSi/8/32 • How large should r be? • Large enough to avoid model errors and to obtain a small residue. • Small enough to avoid degradation of the accuracy of the parameter estimate and overfit. Ronald Aarts PeSi/8/33 Intermezzo: Scaling Parameter identification (8): Residual error • Consider the residual error as a function of the number of essential parameter r. 10.0 • Scaling is important for any MISO, SIMO or MIMO identification! • The absolute errors in all torques are weighted equally. kρk22 5.0 2.0 1.5 1.0 1 10 20 30 40 50 55 Number of singular values r Ronald Aarts PeSi/8/34 • About 20 parameters seems to be sufficient. E.g. scaling with the maximum torques for each joint. • How can we determine r more exactly? It depends on the error in the parameter estimate that is accepted. • The resulting errors in the parameters are also absolute errors. E.g. scaling with some nominal parameter set changes this effectively in relative errors of all parameters. Ronald Aarts PeSi/8/35 Parameter identification (9): Applying truncated SVD Simulation results Measured and simulated joint torques (22 essential parameters). 102 1 (a) Joint 1. Dots (•): singular values σi . 10−2 (c) Joint 6. 0.15 0.08 0.3 0.10 10−6 10−8 1 10 20 30 40 50 55 Singular value number Parameter scaling: gi should be near σi, which is true for about the first 30 parameters. 0.2 0.1 0 −0.1 −0.2 0.06 normalised joint torque Solid (—): magnitude of gi. −4 normalised joint torque 10 (b) Joint 3. 0.10 0.4 normalised joint torque σ, |g| and 10% threshold 104 0.04 0.02 0 −0.02 −0.04 −0.06 0.05 0 −0.05 −0.10 −0.3 −0.08 −0.4 −0.10 0 1 2 3 4 5 6 7 8 −0.15 0 From an estimate of the variance s2 and a maximum parameter error of 10%, 1 2 3 4 5 6 7 8 0 1 2 3 time [s] time [s] 4 5 6 7 8 time [s] Figure 8. The simulated and measured joint torques along the trajectory as a function of time. The simulation has been carried out with model M1 . With: the measured torque, the simulated torque and the residual torque. the value of the smallest singular value is computed: The dashed line (- - -). Remaining residual error appears to be mainly unmodelled high-frequent dynamics. So only 22 or 23 parameters can be estimated sufficiently accurate. Ronald Aarts PeSi/8/36 Ronald Aarts PeSi/8/37 e r_1 Closed loop identification Identification of a system G0 being controlled by a control system C can be e 3 desirable / necessary for: r_1 1 Noise H_0 • instability of the system, r_2 u 2 C G_0 • safety requirements, Controller System • impossibility to remove the control system, • controller relevant identification. • Ronald Aarts PeSi/9/1 y G_0 Controller System By defining signal r(t) = r1(t) + C(q)r2(t) the feedback is written as: u(t) = r(t) − C(q)y(t). With the input sensitivity S0(q) = (I + C(q)G0(q))−1 it follows u(t) = S0(q)r(t) − C(q)W0(q)H0 (q)e(t) y(t) = G0(q)S0(q)r(t) + W0(q)H0(q)e(t) = Ronald Aarts PeSi/9/2 H_0 y C (SISO: S0 = W0) Wanted: estimate Ĝ of system G0 from measurements of system output y(t) and furthermore system input u(t) and/or feedforward r1(t) and/or tracking reference r2(t). Problem: input u(t) is correlated with the disturbance v(t) H0(q) e(t) because of the controller C(q). Notations: Noise u and the output sensitivity W0(q) = (I + G0(q)C(q))−1 Situation: • Prerequisites: y(t) = G0(q) u(t) + H0(q) e(t) u(t) = C(q) [r2(t) − y(t)] + r1(t) The control system C(q) is known or not. • r_2 2 1 3 e 3 e r_1 Closed loop identification problems: r_2 2 1 r_1 r_2 2 H_0 Noise y u C G_0 Controller System Solution strategy depends also on starting point: When is it possible • (a) to identify a model [G(q, θ̂N ), H(q, θ̂N )] consistently. (b) to identify a system model G(q, θ̂N ) consistently. (c) to formulate an explicit approximation criterium that characterises the asymptotic estimator G(q, θ ∗) independent of Φv (ω) (d) to set a specific order for G0 in the model set. (e) to identify an unstable process G0. (h) to guarantee that the identified model for G0 will be stabilised by the applied control system C. 1 Noise H_0 y u C G_0 Controller System Measurements of the system output y(t). • Measurements of the system input u(t). • Knowledge regarding presence and properties of signals r1(t) and r2(t). • Measurements of r1(t) and/or r2(t). • 3 Knowledge of controller C(q). For most solution methods it is required that system G0 (and its model) have at least one delay. What can go wrong? Ronald Aarts PeSi/9/3 Ronald Aarts e Example of spectral analysis: SISO closed loop system r_2 2 r_1 1 Noise 3 H_0 y u C G_0 Controller System Suppose that accurate estimators for Φu(ω) and Φyu(ω) exist (N → ∞), then an estimator for G0 is given by: Ĝ(eiω ) = Measurements of system input u(t) and output y(t). Background information: r2(t) ≡ 0, r1(t) and e(t) uncorrelated. Define: z(t) = C(q)H0(q)e(t). Φu(ω) = |S0(eiω )|2 [Φr1 (ω) + Φz (ω)] Φyu (ω) = |S0(eiω )|2 [G0(eiω )Φr1 (ω) − Ronald Aarts PeSi/9/5 Φyu (ω) G (eiω )Φr1 (ω) − Φz (ω)/C(eiω ) = 0 Φu(ω) Φr1 (ω) + Φz (ω) Situation 1: No noise, so z(t) ≡ 0 and Φz (ω) = 0, then: Ĝ(eiω ) = G0(eiω ) Situation 2: No external signal, so r1(t) ≡ 0 and Φr1 (ω) = 0, then: 1 z(t)] Then: y(t) = S0(q) [G0(q)r1(t) + C(q) u(t) = S0(q) [r1(t) − z(t)] So: PeSi/9/4 Ĝ(eiω ) = −1 C(eiω ) in other words the inverse controller is estimated! 1 Φ (ω)] C(eiω ) z In general: weighted combination of both situations. Ronald Aarts PeSi/9/6 Overview techniques: Dual Youla Coprime factorisation Two-steps Tailor-made Joint input/output Indirect Direct IV Example of parametric identification: SISO system with proportional feedback Background information: r1(t) ≡ r2(t) ≡ 0, and assume a system with a proportional controller: u(t) = f y(t) with f ∈ R. Consider a first order ARX model with two parameters θ = (a, b)T : • • ε(t, θ) = y(t) + a y(t − 1) − b u(t − 1) Consistency (Ĝ, Ĥ) Consistency Ĝ + + + - + + + -   + + + + + + With feedback, so: ε(t, θ) = y(t) + (a − bf ) y(t − 1) Tunable bias Fixed model order + Unstable system + (G(q, θ̂N ), C) stable C known n • in this course. +  n + +  j + n + + + + j + + n +  + n + + /+ j In other words all models with the same â − b̂f predict an identical error ε and can not be distinguished. Ronald Aarts PeSi/9/7 Idea: Neglect feedback and estimate model for G0 and/or H0 with common open-loop techniques from measured {u(t), y(t)}. Consistent estimators for G0 and/or H0 are possible if The system (G0, H0) is in the model set. There is at least one sample delay in CG0. There is a sufficiently exciting (measurable or unmeasurable) signal r(t) and/or controller C is sufficiently complex (e.g. nonlinear or order equal to or larger than G0). Conclusion: measurements of {u(t), y(t)} seems sufficient. However: Requirement (a) is not practical. Ronald Aarts PeSi/9/8 Open-loop asymptotic identification criterium Direct closed-loop identification a. b. c. Ronald Aarts PeSi/9/9 1 π |G0(eiω ) − G(eiω , θ)|2 Φu(ω) + |H0(eiω )|2Φe(ω) dω 2π −π |H(eiω , θ)|2 Z is for the closed-loop situation |S0(eiω )|2 1 π Φr (ω) {|G0(eiω ) − G(eiω , θ)|2 2π −π |H(eiω , θ)|2 |H0(eiω )|2|S0(eiω )|2 Φe(ω)} dω + |H(eiω , θ)|2S(eiω , θ)|2 Z with sensitivity S(eiω , θ) = (1 + C(eiω )G(eiω , θ))−1. Independent parameterisation is no longer advantageous, e.g. OE (H = 1): G(eiω , θ) = G0(eiω ) minimises the first term, but G is also in S in the second term. Requirements for a small bias: Accurate (and fixed) noise model H0 and/or a large signal-to-noise ratio at the input u. Ronald Aarts PeSi/9/10 Indirect closed-loop identification Idea: (1) identify closed-loop system with common open-loop techniques and measurements of {r(t), y(t)} and (2) next compute the open-loop system with knowledge of the controller. Properties indirect method: System: y(t) = S0(q)[G0(q) r(t) + H0(q) e(t)]. • • • • Model set: y(t) = [Gc(q, θ) r(t) + Hc(q, θ) e(t)]. Step 1 identify in fact S0G0 and S0H0. Step 2: G(q, ρ̂) = Gc(q, θ̂N ) 1 + G(q, ρ̂)C(q) G(q, ρ̂) = H(q, ρ̂) = Hc(q, θ̂N ) 1 + G(q, ρ̂)C(q) H(q, ρ̂) = Gc(q, θ̂N ) 1 − C(q)Gc(q, θ̂N ) Consistency is comparable with direct method. Order of model can not be chosen beforehand. The order of Ĝ is usually the sum of the orders of Ĝc and C. A consistent estimation of only G0 is possible using independent parameterisation. Hc(q, θ̂N ) 1 − C(q)Gc (q, θ̂N ) with parameter vector ρ for the open-loop system. Ronald Aarts PeSi/9/11 In the case a frequency response is available, e.g. from • • a f -domain measurement, e.g. with a spectrum-analyser: often applied for mechanical systems. a t-domain measurement with a multi-sine excitation (e.g. from idinput), processed with FFT. an estimate from a non-parametric identification. Advantages of identification in the frequency domain: Ğ(ωj ), ωj ∈ Ω Ω = {ω1, ω2, ..., ωN } data reduction simplicity of processing: filtering is a multiplication. evaluation of models in f -domain: aiming at application. choice discrete/continuous models. Ronald Aarts PeSi/10/1 Data is complex (amplitude/phase) Frequency grid can be chosen (design) Suppose: Ğ(ωj ) = G0(eiωj ) + V (ωj ), with the real response G0(eiωj ) and a stochastic noise process V (ωj ). Identification problem: Find θ̂N by minimising N X k=1 • • • • PeSi/9/12 The data: Frequency domain identification • Ronald Aarts |Ğ(ωk ) − G(eiωk , θ)|2|W (ωk )|2 with G(eiωk , θ) = B(eiωk , θ) and W (ωk ) a chosen weighting function. A(eiωk , θ) → Resembles time domain OE-problem (non-linear-in-the-parameters). Ronald Aarts PeSi/10/2 In practice: • • • Accurate initial estimate with linear regression techniques. Subsequent non-linear optimisation (e.g. Sanathanan-Koerner algorithm). Weighting function is important: yes/not emphasis on high-frequent behaviour. Example: the freqid toolbox Starting point are frequency domain data: • Measured FRF. • Time domain data + FFT. “Tricks”: • An ARX-like problem (linear-in-the-parameters) can be obtained by rewriting the optimisation problem (implicit weighting with |A(eiωk , θ)|) • previous iteration. • The GUI has some similarities with the ident GUI. Implementation examples: • • Commercial fdident toolbox from ELEC, Vrije Universiteit Brussel. Freely downloadable freqid toolbox developed by R.A. de Callafon, initially at the TU Delft. Ronald Aarts PeSi/10/3 Example: the freqid toolbox (2) • Models are fitted using a non-linear weighted least squares fit, e.g. using the accompanying lsfits toolbox. Ronald Aarts PeSi/10/4 Example: the freqid toolbox (3) • Often used weighting function: Coherence spectrum, see M ATLAB’s help cohere. Typical weighting functions: • None. • Inverse of data. • User defined, e.g. frequency range. • Supplied weighting function. • Identification can be carried out using both discrete-time and continuous-time models. Ronald Aarts PeSi/10/5 Cyu (ω) = Estimate:   1 for a perfect correlation between input u and output y  0 for no correlation between input u and output y v u u N Ĉyu(ω) = t 2 |Φ̂N yu (ω)| N Φ̂N y (ω)Φ̂u (ω) • Identification is possible for both open-loop and closed-loop systems, but in the latter case the coherence spectrum should be used with care. Ronald Aarts PeSi/10/6 Aspects of MIMO systems • Polynomials become polynomial matrices, e.g. ARX: And the ny × nu matrix B(q −1) = B0 + B1q −1 + ... + Bnbq −nb, or A(q −1)y(t) = B(q −1)u(t) + e(t) with ny × ny matrix A(q −1) = Iny + A1q −1 + ... + Anaq −na, or    A(q −1) =    a11(q −1) a12(q −1) a21(q −1) a22(q −1) .. .. any1(q −1) any2(q −1) · · · a1ny (q −1) · · · a2ny (q −1) .. ... · · · any ny (q −1)        b (q −1) b12(q −1) · · · b1nu(q −1)  11 −1  b21(q ) b22 (q −1 ) · · · b2nu(q −1) B(q −1) =  .. .. .. ...   −1 −1 bny1(q ) bny2(q ) · · · bny nu(q −1)       nb −nkkj with elements bkj (q −1) = b1 + ... + bkj kj q −nkkj −nbkj +1. kj q na −1 + ... + a kj q −nakj . with elements akj (q −1) = δkj + a1 kj q kj Ronald Aarts PeSi/10/7 Usually: choice of a specific model structure, e.g. common denominator polynomial a(z). E.g. the transfer function G(z) = A(z)−1 B(z) of a 2 × 2 system  b11(z)  G(z) =  ba(z) 12(z) a(z)  # " b21(z) b11(z) b21(z) a(z)  −1 b22(z)  = (a(z)I2×2 ) b12(z) b22(z) a(z) Note: this expression has a large number of parameters, especially when the poles differ for each of the elements in the matrix. Then the denominator has to contain all poles and in each element some of them need to be compensated by zeros of bij (z). Ronald Aarts PeSi/10/8 • Available in the ident toolbox: • • • • Models of polynomials with ARMAX-, OE- and BJ-structure for MISO-systems. Models of polynomials with ARX-structure for MIMO-systems. Arbitrary state space models for MIMO-systems with subspace identification. Structured state space models for MIMO-systems with PEMtechniques: OE- and ARMAX-structure. Minimisation of p N X 1 X w i ε2 ip (t, θ). N t=1 i=1 • Simpler representation by means of state space models: MIMO is a straightforward extension of SISO. • Example: Subspace techniques. • Extra requirements regarding fixed parameters, available orders and structures may be necessary, e.g. for the physical model / identifiability / uniqueness. Ronald Aarts Ronald Aarts PeSi/10/9 PeSi/10/10 • Closed-loop MIMO identification: For the direct, indirect and twee-steps methods it depends on the cababilities of the applied open-loop identification. • Frequency domain MIMO identification: Algorithm is also applicable for m inputs and p outputs: Measure all m × p frequency responses Ğjl (ωk ) and minimise p X N m X X j=1 l=1 k=1 |Ğjl (ωk ) − Gjl (eiωk , θ)|2|Wjl (ωk )|2 Also in this case there is a large variation of available parameterisations for MIMO systems, e.g. left MFD (for the number of outputs p ≤ number of inputs m): G(q, θ) = A(q, θ)−1 B(q, θ) with A(q, θ) = I + A1 q −1 + ... + Ana q −na B(q, θ) = B0 + B1 q −1 + ... + Bnb q −nb Ronald Aarts PeSi/10/11