A Wiener Neural Network-Based Identification and Adaptive Generalized Predictive Control For Nonlinear SISO Systems
Ma'moun M. Abu-Ayyad
Ma'moun Abu-Ayyad
Mechanical Engineering Department, Penn State Harrisburg, Middletown, Pennsylvania 17057, United States
ABSTRACT: In this study, a Wiener-type neural network (WNN) is derived for identification and control of single-input and
single-output (SISO) nonlinear systems. The nonlinear system is identified by the WNN, which consists of a linear dynamic block in
cascade with a nonlinear static gain. The Lipschitz criteria for model order determination and back propagation for the adjustment of
weights in the network are presented. Using the parameters of the Wiener model, the analytical expressions used in the generalized
predictive control (GPC) controller are updated at every time step to handle the nonlinear dynamics of the controlled variable.
Finally, the proposed WNN-based GPC algorithm is tested in simulation on several nonlinear plants with different degrees of
nonlinearity. Simulation results show that the WNN identification approach is more accurate than other neural network
identifiers and that the WNN-based GPC gives better control performance than standard GPC.
INTRODUCTION

Wiener and Hammerstein models are the best known and most widely used models for various processes, such as chemical processes,1–3 separation processes,4 hydraulic systems,5 and chaotic systems.6,7 In Wiener modeling, a linear dynamic block precedes a nonlinear steady-state one, while Hammerstein models contain the same elements in the reverse order.8 These types of models are called block-oriented nonlinear models.9 Unlike black-box models, the block-oriented models have a clear physical interpretation: the steady-state part describes the gain of the system.10

Artificial neural network (ANN) models have been successfully applied to the identification and control of a variety of nonlinear dynamical systems and processes. Many researchers have integrated neural networks with Wiener and Hammerstein model structures, to formulate the system static nonlinearities. Al-Duwaish et al.11 proposed an identification model using a hybrid model consisting of a linear autoregressive moving average (ARMA) model in cascade with a multilayer neural network. The multilayer network was used to represent the dynamic linear block and the static nonlinear element of Wiener model, respectively. For identifying a chaotic system, Chen et al.6 used a simple linear model to represent the dynamic part and a neural network to represent the nonlinear static part. Also, the dynamic linear part was replaced by Laguerre filters and the nonlinear static part was described as a neural network.4 Tötterman and Toivonen12 used support vector regression to identify nonlinear Wiener systems; the linear block is expanded in terms of Laguerre or Kautz filters, and the static nonlinear

formulate the dynamic linear part and nonlinear static part of the Wiener or Hammerstein model, using a multilayer neural network. Janczak13 designed a neural network for formulation of the Hammerstein model, which was composed of one hidden layer with nonlinear nodes and one linear output node. Wu et al.14 proposed a Hammerstein neural network compensator to identify the dynamic calibration process for an infrared thermometer sensor. In many instances, most of these investigations assume that the orders of dynamic linear part in the Wiener or Hammerstein model are known a priori.

Model predictive control (MPC) is an optimization-based control methodology that is widely used in industry. The main goal of MPC is to use a system mathematical model to obtain a control signal by minimizing an objective function.15,16 Various MPC algorithms were proposed, such as dynamic matrix control (DMC), generalized predictive control (GPC), and nonlinear MPC13 and others. Ławryńczuk10 proposed a computationally efficient nonlinear MPC algorithm based on neural Wiener models, where nonlinear prediction and linearization are adopted. Arefi et al.2 proposed a nonlinear MPC based on classic optimization methods with nonlinear identification, using a Wiener model for a nonlinear chemical process. The nonlinear static term is a neural network; then, the design of the nonlinear predictive controller is based on the identified Wiener model. These methods achieved satisfactory performance; however, the fixed parameters in these controllers restricted their ability to control highly nonlinear systems. In addition, from the comparison of several predictive controllers by Abu-Ayyad and Dubay,17
determination scheme that used the Lipschitz criterion. Luh and Rizzoni28 extended this scheme to MIMO systems and also improved its performance to a limited extent, using the concept of orthogonal basis functions. Wang and Chen29 extended the approach to develop the order determination algorithm for multiple-input single-output (MISO) systems. Here, the order determination scheme for SISO systems is provided for completeness of the study.

Consider a general nonlinear SISO dynamic system that can be represented as follows:

\[ y(t) = g\big(y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)\big) \tag{5} \]

where $y(t)$ and $u(t)$ are the output and input variables of the dynamic system, and $n_y$ and $n_u$ are the true orders of the output and input, respectively; $g(\cdot)$ is a nonlinear function assumed to be continuous and smooth.

Rewriting eq 5 in a compact form gives

\[ y = g(\varphi_1, \varphi_2, \ldots, \varphi_n) \tag{6} \]

where $n$ is the number of input variables ($n = n_y + n_u$). The next task is to reconstruct the nonlinear function $g(\cdot)$ from the input–output data patterns $[\varphi(k), y(k)]_{k=1}^{N_{set}}$, where $N_{set}$ is the number of datasets used for the model order determination.

Define the Lipschitz quotient $l_{ij}$ as follows:

\[ l_{ij} = \frac{|y(i) - y(j)|}{|\varphi(i) - \varphi(j)|} \qquad (i \neq j) \tag{7} \]

where $|\varphi(i) - \varphi(j)|$ is the distance between two points in the input space and $|y(i) - y(j)|$ is the difference between $g(\varphi(i))$ and $g(\varphi(j))$. For data points with a small distance $|\varphi(i) - \varphi(j)|$ between them, the Lipschitz quotient $l_{ij}^{(n)}$ can be rewritten as

\[ l_{ij}^{(n)} = \frac{|\delta y|}{\sqrt{(\delta\varphi_1)^2 + (\delta\varphi_2)^2 + \cdots + (\delta\varphi_n)^2}} \tag{8} \]

where $\delta y = y(i) - y(j)$ and $\delta\varphi_r = \varphi_r(i) - \varphi_r(j)$ for $r = 1, 2, \ldots, n$. The superscript $n$ in $l_{ij}^{(n)}$ is an index representing the number of input variables in eq 6. From the investigation in He and Asada,25 the values of $l_{ij}^{(n)}$ can be used to indicate when one or more input variables are missing, or when one or more redundant input variables are included. For example, if a variable $\varphi_n$ is missing from the input set, the Lipschitz quotient $l_{ij}^{(n-1)}$ will be considerably larger than $l_{ij}^{(n)}$, or even unbounded. In contrast, when an input variable $\varphi_{n+1}$ is included and the Lipschitz quotients $l_{ij}^{(n)}$ and $l_{ij}^{(n+1)}$ are calculated, if $\varphi_{n+1}$ is a redundant input variable, there will be only a slight difference between $l_{ij}^{(n)}$ and $l_{ij}^{(n+1)}$.

To avoid the effect of measurement noise, the following index25 is used to determine an appropriate order:

\[ l^{(n)} = \left( \prod_{s=1}^{m} \sqrt{n}\, l^{(n)}(s) \right)^{1/m} \tag{9} \]

where $l^{(n)}(s)$ is the $s$th-largest Lipschitz quotient among all $l_{ij}^{(n)}$ with the $n$ input variables $(\varphi_1, \ldots, \varphi_n)$. The parameter $m$ is a positive number usually selected to be $m \in [0.01N_{set}, 0.02N_{set}]$. For testing purposes, the stop criterion can be defined as29

\[ \frac{|l^{(n+1)} - l^{(n)}|}{\max(1, |l^{(n)}|)} < \varepsilon \tag{10} \]

where $\varepsilon > 0$ is a prespecified threshold. From the investigation in the work of Wang and Chen,29 $\varepsilon = 0.1$ is suitable for most cases.

To obtain the number of nodes in the nonlinear layer, $p$ is chosen manually. This is because, from empirical experience, the value of $p$ has less influence on the accuracy of the identification performance than the values of $n_a$ and $n_b$. For most cases, a value of $3 \le p \le 6$ can be chosen.

In the neural network learning procedure, the weights are updated along the negative gradient of a given error function as follows:

\[ \Xi(w, t) = \frac{1}{2}\big[y(t) - \hat{y}(t)\big]^2 = \frac{1}{2}\hat{e}(t)^2 \tag{11} \]

where $\hat{e}(t) = y(t) - \hat{y}(t)$, and $y(t)$ and $\hat{y}(t)$ are the actual output and the neural network output, respectively. Let $\mathbf{w}$ be the adjustable parameter vector, which consists of the weights as shown:

\[ \mathbf{w} = [\hat{a}_1, \ldots, \hat{a}_{n_a}, \hat{b}_1, \ldots, \hat{b}_{n_b}, \hat{c}_1, \ldots, \hat{c}_p]^T \]

Applying a pattern learning version of the steepest descent optimization method, the partial derivatives of the error function (eq 11) with respect to the adjustable parameters $\mathbf{w}$ of the network are

\[ \frac{\partial \Xi}{\partial w} = -\hat{e}(t)\, \frac{\partial \hat{y}(t)}{\partial w} \tag{12} \]

where $w$ represents the elements of $\mathbf{w}$. To overcome the shortcoming of the traditional BP algorithm, a momentum term is
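As an illustration of the order determination procedure in eqs 7–10, the following is a minimal Python sketch, not the authors' implementation. The function names `lipschitz_index` and `order_is_sufficient` are hypothetical, all pairwise quotients are evaluated by brute force, and the geometric mean of eq 9 is computed in the log domain for numerical safety.

```python
import math


def lipschitz_index(inputs, outputs, m):
    """Index of eq 9 for regressor vectors `inputs` and targets `outputs`.

    `inputs` is a list of equal-length tuples (phi_1, ..., phi_n);
    `m` is the number of largest quotients kept (an assumption of this
    sketch: the caller picks it, e.g. 1-2% of the number of samples).
    """
    n = len(inputs[0])  # number of input variables in eq 6
    quotients = []
    for i in range(len(inputs)):
        for j in range(i + 1, len(inputs)):
            dist = math.dist(inputs[i], inputs[j])
            if dist > 0.0:  # eq 7 is undefined for coincident points
                quotients.append(abs(outputs[i] - outputs[j]) / dist)
    largest = sorted(quotients, reverse=True)[:m]
    if not largest or min(largest) == 0.0:
        return 0.0
    # Geometric mean of sqrt(n) * l(s), accumulated as a sum of logs.
    log_sum = sum(math.log(math.sqrt(n) * q) for q in largest)
    return math.exp(log_sum / len(largest))


def order_is_sufficient(l_n, l_next, eps=0.1):
    """Stop criterion of eq 10: adding one more regressor changes little."""
    return abs(l_next - l_n) / max(1.0, abs(l_n)) < eps
```

In use, one would build the regressor set for each candidate order, evaluate `lipschitz_index`, and stop increasing the order once `order_is_sufficient` returns `True` for two consecutive candidates. The brute-force pair loop is quadratic in the number of samples, which is acceptable for a few hundred points but would need subsampling for larger datasets.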
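\[ \frac{\partial \hat{y}(t)}{\partial \hat{c}_k} = \hat{x}^{k}(t) \qquad (k = 1, 2, \ldots, p) \tag{16} \]

\[ \cdots \qquad (i = 1, 2, \ldots, n_a) \tag{20} \]

\[ \frac{\partial \hat{y}(t)}{\partial \hat{b}_j} = \frac{\partial \hat{y}(t)}{\partial \hat{x}(t)} \cdot \frac{\partial \hat{x}(t)}{\partial \hat{b}_j} = \left( \sum_{k=1}^{p} k\, \hat{c}_k\, \hat{x}^{(k-1)}(t) \right) \left( u(t-j) - \sum_{s} \hat{a}_s \frac{\partial \hat{x}(t-s)}{\partial \hat{b}_j} \right) \qquad (j = 1, 2, \ldots, n_b) \tag{21} \]

\[ \hat{b}_j(t) = \hat{b}_j(t-1) + \eta_b\, \hat{e}(t)\, \frac{\partial \hat{y}(t)}{\partial \hat{b}_j} + \gamma\, \Delta \hat{b}_j(t-1) \]

respectively. This is shown in Figure 4.

The linear dynamic equation in the WNN is regarded as the CARIMA model for designing the WGPC. Recall eq 1 with

\[ A(q^{-1})\, x(t) = B(q^{-1})\, u(t-1) + H(q^{-1})\, \frac{\varepsilon(t)}{\Delta} \tag{25} \]

where $u(t)$ and $x(t)$ are the control input and output variables and $\varepsilon(t)$ is a zero mean white noise. $\Delta = 1 - q^{-1}$ denotes the backward-difference operator. $A(q^{-1})$ and $B(q^{-1})$ are the same as in eq 4. For simplicity, the $H(q^{-1})$ polynomial is chosen to be 1.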
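The momentum-based weight update can be illustrated on the static polynomial block alone, where eq 16 gives the required derivative directly. The sketch below is a simplified illustration under stated assumptions, not the full WNN training procedure: the helper names are hypothetical, the intermediate signal $\hat{x}$ is assumed to be available as data, the momentum term is taken as the previous weight increment, and the static block is assumed to be $\hat{y} = \sum_{k=1}^{p} \hat{c}_k \hat{x}^k$. The default learning rate and momentum factor are the values quoted for Example 1.

```python
def momentum_step(weight, error, dy_dw, prev_delta, eta=0.01, gamma=0.1):
    """One steepest-descent step with momentum.

    delta = eta * e(t) * dy/dw + gamma * (previous delta); the plus sign
    follows from eq 12, since the gradient of the error is -e(t) * dy/dw.
    """
    delta = eta * error * dy_dw + gamma * prev_delta
    return weight + delta, delta


def train_static_gain(xs, ys, p, eta=0.01, gamma=0.1, epochs=50):
    """Fit y ~ sum_k c_k * x**k (k = 1..p) by pattern-mode learning."""
    c = [0.0] * p          # polynomial weights c_1 .. c_p
    prev = [0.0] * p       # previous increments, used by the momentum term
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            y_hat = sum(c[k] * x ** (k + 1) for k in range(p))
            err = y - y_hat
            for k in range(p):
                # eq 16: the derivative of y_hat w.r.t. c_k is x**k
                c[k], prev[k] = momentum_step(
                    c[k], err, x ** (k + 1), prev[k], eta, gamma
                )
    return c
```

Training the linear-block weights follows the same pattern, except that the derivative passed to `momentum_step` comes from the recursive expressions of eqs 20 and 21 rather than from eq 16.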
The prediction of the system output $x(t+j)$ can be evaluated using the following optimum $j$th-step-ahead predictor:

\[ \hat{x}(t+j|t) = D_j(q^{-1})\, \Delta u(t+j-1) + F_j(q^{-1})\, x(t) + E_j(q^{-1})\, \varepsilon(t+j) \tag{26} \]

where the polynomials $E_j$, $F_j$, and $D_j$ can be derived from the following Diophantine equation:

\[ 1 = E_j(q^{-1})\, \Delta A(q^{-1}) + q^{-j} F_j(q^{-1}) \tag{27} \]

The polynomial $E_j$ is uniquely defined with a degree of $j-1$. Because the degree of $E_j(q^{-1})$ is $j-1$, the noise terms in eq 26 are all in the future. Therefore, the best prediction of $\hat{x}(t+j)$ is

\[ \hat{x}(t+j|t) = D_j(q^{-1})\, \Delta u(t+j-1) + F_j(q^{-1})\, x(t) \tag{28} \]

The GPC algorithm consists of applying a control sequence that minimizes a cost function of the following form:

\[ J(t) = \sum_{j} \left\| \hat{x}(t+j|t) - x_{sp}(t+j) \right\|^2 + \cdots \]

\[ \hat{\mathbf{x}}(t) = \mathbf{D}(t)\, \Delta\mathbf{u}(t) + \mathbf{x}_0(t) \tag{30} \]

where the first part of the right-hand side of eq 30 depends only on the future control moves, while the second part $\mathbf{x}_0(t)$ is a free response that depends only on the past moves.

At this point, key enhancements to the GPC method are provided in the context of WGPC.

• First, the coefficients of the Diophantine equation are recalculated every time step, as shown in Figure 4, using the WNN. This makes $\hat{x}$ more accurate as the nonlinearities are taken into account locally.
• Second, the dynamic matrix $\mathbf{D}(t)$ is also recalculated, providing a localized open-loop behavior during closed-loop experiences, thereby providing more optimal control actions to the nonlinear dynamic system, in comparison to having a fixed $\mathbf{D}(t)$.

These major differences are key to having improved control when controlling nonlinear systems.

The step response coefficients at each time step $j$ in $D_j$ are no longer static but change every $j$; hence, the dynamic matrix $\mathbf{D}(t)$, which contains a local linear approximation of the nonlinear model, is

\[ \mathbf{D}(t) = \begin{bmatrix} d_1(t) & 0 & \cdots & 0 \\ d_2(t) & d_1(t) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ d_N(t) & d_{N-1}(t) & \cdots & d_{N-N_u+1}(t) \end{bmatrix} \]

By minimizing $J(t)$, the control law of unconstrained WGPC can be given as

\[ \Delta\mathbf{u}(t) = \big(\mathbf{D}^T(t)\mathbf{D}(t) + \lambda \mathbf{I}\big)^{-1} \mathbf{D}^T(t)\big(\mathbf{x}_{sp}(t) - \mathbf{x}_0(t)\big) \tag{33} \]

Note that only the first element of $\Delta\mathbf{u}(t)$ is applied to the process, i.e., $u(t) = \Delta u(t) + u(t-1)$. The prediction is shifted one step forward and the procedure is repeated at the next sampling instant.
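The unconstrained control law of eq 33 can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the linear block is written as x(t) = Σ a_i x(t−i) + Σ b_j u(t−j) (a sign convention chosen here for convenience), the step-response coefficients d_j are obtained by simulating a unit step instead of solving the Diophantine equation, and the free response is supplied by the caller.

```python
def step_response(a, b, horizon):
    """d_1..d_N for x(t) = sum a_i x(t-i) + sum b_j u(t-j), unit step at t=0."""
    x = []  # x[t-1] holds x(t); x(t) = 0 for t <= 0
    for t in range(1, horizon + 1):
        val = sum(a[i] * x[t - 2 - i] for i in range(len(a)) if t - 2 - i >= 0)
        val += sum(b[j] for j in range(len(b)) if t - 1 - j >= 0)
        x.append(val)
    return x


def dynamic_matrix(d, n_u):
    """N x N_u lower-triangular Toeplitz matrix built from d_1..d_N."""
    return [[d[i - j] if i >= j else 0.0 for j in range(n_u)]
            for i in range(len(d))]


def solve_linear(mat, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def wgpc_move(a, b, x_sp, x_free, n_u, lam):
    """Control increments of eq 33, rebuilding D from the current a and b."""
    d_mat = dynamic_matrix(step_response(a, b, len(x_sp)), n_u)
    err = [sp - fr for sp, fr in zip(x_sp, x_free)]
    normal = [[sum(row[i] * row[j] for row in d_mat) + (lam if i == j else 0.0)
               for j in range(n_u)] for i in range(n_u)]
    rhs = [sum(row[i] * e for row, e in zip(d_mat, err)) for i in range(n_u)]
    return solve_linear(normal, rhs)
```

Calling `wgpc_move` at every sampling instant with the latest identified `a` and `b` mirrors the idea of recalculating the dynamic matrix each time step; only the first returned increment would be applied, as u(t) = Δu(t) + u(t−1).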
Figure 5. Values of the order determination based on Lipschitz quotients.

Figure 6. Comparison of the mean square error (MSE) with different orders.

In this section, three examples are considered to illustrate the WNN identifier and the WGPC controller described above. The first example is aimed at demonstrating the WNN identification on a nonlinear dynamic system. The second and third examples are aimed at the control of nonlinear dynamic systems using the WGPC.

Example 1: Nonlinear System Identification. The following process is a nonlinear dynamic process formulated in a discrete form as24

\[ y(t) = f\big(y(t-1), y(t-2), y(t-3), u(t-1), u(t-2)\big) \tag{34} \]

where

\[ f(x_1, x_2, x_3, x_4, x_5) = \frac{x_1 x_2 x_3 x_5 (x_3 - 1) + x_4}{1 + x_3^2 + x_2^2} \tag{35} \]

A total of 1000 data pairs are used to train the network. The first 500 timesteps are an independent, identically distributed (i.i.d.) uniform sequence $u(t)$ within the limits $[-1.0, 1.0]$ and the remaining timesteps are given by a sinusoidal function $u(t) = 1.05 \sin(\pi t/45)$. Also, the first 500 timesteps were used for determination of the system orders.

In the order determination procedure, we use the input $\varphi_1 = y(t-1)$ only to compute the Lipschitz quotient $l^{(1,0)} = +\infty$. Adding further inputs $\varphi_i$ gives the Lipschitz quotients shown in Figure 5. For increasing orders, the corresponding quotients $l^{(2,1)} = 27.65$, $l^{(3,1)} = 18.02$, and $l^{(2,2)} = 4.201$ decrease significantly. In addition, $l^{(3,2)} = 2.525$, $l^{(3,3)} = 2.489$, $l^{(4,2)} = 2.450$, and $l^{(4,3)} = 2.416$ remain nearly constant from $l^{(3,2)}$ onward; therefore, the stop criterion (eq 10) is satisfied. From Figure 5, the best order of the system using the WNN is (3,2) and, from eq 34, the true order of the system is (3,2). Figure 6 compares the mean square error (MSE) of the WNN with different orders.

The number of neurons in the nonlinear static block is chosen as $p = 4$. To train the neural network, the learning rate and the momentum factor are chosen as $\eta = 0.01$ and $\gamma = 0.1$, respectively. The initial parameters in eqs 22–24 are $y(t) = 0$, $x(t) = 0$, $\partial y(t)/\partial \hat{a}_i = 0$, $\partial y(t)/\partial \hat{b}_j = 0$, and $\partial y(t)/\partial \hat{c}_k = 0$ for $t \le 0$. After training, the identified model is obtained as

\[ \begin{cases} \hat{x}(t) = 0.1410\,\hat{x}(t-1) + 0.0249\,\hat{x}(t-2) - 0.0419\,\hat{x}(t-3) - 0.9172\,u(t-1) - 0.1511\,u(t-2) \\ \hat{y}(t) = 0.8300\,\hat{x}(t) - 0.0138\,\hat{x}^2(t) + 0.0242\,\hat{x}^3(t) - 0.2466\,\hat{x}^4(t) \end{cases} \tag{36} \]

The testing input signal used to verify the identification performance of the WNN is

\[ u(t) = \begin{cases} \sin(\pi t/25) & 0 \le t < 250 \\ 1.0 & 250 \le t < 500 \\ -1.0 & 500 \le t < 750 \\ 0.3 \sin(\pi t/\ldots) + 0.1 \sin(\pi t/\ldots) + 0.6 \sin(\pi t/\ldots) & 750 \le t < 1000 \end{cases} \tag{37} \]

The proposed WNN was compared to several identification procedures: the Wiener-type Recurrent Neural Network (WRNN),24 the Controllable-Canonical-Form-Based Recurrent Neurofuzzy Network (CReNN),32 and the Dynamic Fuzzy Neural Network (DFNN).33 The results are quantified in Table 1, illustrating that the proposed WNN has the fewest parameters and the lowest MSE values. Figure 7 shows the output terms of the plant using the true model and the WNN.

Example 2: WGPC for a Nonlinear Plant. The nonlinear plant is given by16,24

\[ y(t) = \alpha\, \frac{y(t-1)\, y(t-2)\, \big(y(t-1) + \beta\big)}{1 + y^2(t-1) + y^2(t-2)} + u(t-1) \tag{38} \]

where the parameters have values of $\alpha = 0.35$ and $\beta = 2.5$. The control procedure contains two phases: an offline training phase and an online control phase. During the offline training phase, the same procedure as that described in Example 1 is used to train the WNN. A total of 1000 data pairs of i.i.d. uniform sequences within the limits $u(t) \in [-1.0, 1.0]$ are used to train the network. From Figure 8, the stop criterion ends at $l^{(3,1)} = 1.895$, showing that the best order of the system is (3,1). The number of neurons in nonlinear
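For readers who want to reproduce the identification data of Example 1, the plant of eqs 34 and 35 and the training input described there can be simulated as in the following sketch. The function names and the fixed random seed are assumptions of this sketch, and zero initial conditions are assumed for y(t) and u(t) at t < 0.

```python
import math
import random


def plant_f(x1, x2, x3, x4, x5):
    """Nonlinear map of eq 35."""
    return (x1 * x2 * x3 * x5 * (x3 - 1.0) + x4) / (1.0 + x3 ** 2 + x2 ** 2)


def simulate_plant(u):
    """Output sequence of eq 34 for an input sequence u, zero initial state."""
    y = [0.0] * len(u)
    for t in range(len(u)):
        # Past samples before t = 0 are taken as zero (assumption).
        y1 = y[t - 1] if t >= 1 else 0.0
        y2 = y[t - 2] if t >= 2 else 0.0
        y3 = y[t - 3] if t >= 3 else 0.0
        u1 = u[t - 1] if t >= 1 else 0.0
        u2 = u[t - 2] if t >= 2 else 0.0
        y[t] = plant_f(y1, y2, y3, u1, u2)
    return y


def training_input(n=1000, seed=0):
    """First 500 steps i.i.d. uniform in [-1, 1], then 1.05*sin(pi*t/45)."""
    rng = random.Random(seed)  # seed is arbitrary, chosen for repeatability
    return [rng.uniform(-1.0, 1.0) if t < 500
            else 1.05 * math.sin(math.pi * t / 45.0)
            for t in range(n)]
```

A typical use would be `u = training_input()` followed by `y = simulate_plant(u)`, after which the pairs (u, y) can be fed to the order determination and training steps.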
Table 2. Settling Time (Ts) and Overshoot (σ%) of GPC and WGPC

controller    Ts (s)    σ% (%)
GPC           0.28      18.92
WGPC          0.16      2.102

Table 3. Model Parameters of CSTR Process

notation    description                          value
Q           volumetric flow rate                 10 L min^-1
V           reactor volume                       10 L
ρ           density of reaction mixture          1000 g L^-1
CAf         feed concentration                   1.0 mol L^-1
Tf          feed temperature                     350 K
Cp          specific heat capacity               1.0 J g^-1 K^-1
ΔH          heat of reaction                     1.0 × 10^5 J mol^-1
k0          Arrhenius pre-exponential constant   5.33685 × 10^7 min^-1
E/R         activation energy/gas law constant   6000 K
UA          heat-transfer term                   5000 J min^-1 K^-1
