The Analysis of Observed Chaotic Data in Physical Systems
Henry D. I. Abarbanel
Department of Physics and Marine Physical Laboratory, Scripps Institution of Oceanography
University of California at San Diego, La Jolla, California 92093-0402
Reggie Brown, John J. Sidorowich, and Lev Sh. Tsimring
Institute for Nonlinear Science, University of California at San Diego, Mail Code 0402,
La Jolla, California 92093-0402
Chaotic time series data are observed routinely in experiments on physical systems and in observations in
the field. The authors review developments in the extraction of information of physical importance from
such measurements. They discuss methods for (1) separating the signal of physical interest from contamination ("noise reduction"), (2) constructing an appropriate state space or phase space for the data in
which the full structure of the strange attractor associated with the chaotic observations is unfolded, (3)
evaluating invariant properties of the dynamics such as dimensions, Lyapunov exponents, and topological
characteristics, and (4) model making, local and global, for prediction and other goals. They briefly touch
on the effects of linearly filtering data before analyzing them as a chaotic time series. Controlling chaotic
physical systems and using them to synchronize and possibly communicate between source and receiver are
considered. Finally, chaos in space-time systems, that is, the dynamics of fields, is briefly considered.
While much is now known about the analysis of observed temporal chaos, spatio-temporal chaotic systems
pose new challenges. The emphasis throughout the review is on the tools one now has for the realistic
study of measured data in laboratory and field settings. It is the goal of this review to bring these tools
into general use among physicists who study classical and semiclassical systems. Much of the progress in
studying chaotic systems has rested on computational tools with some underlying rigorous mathematics.
Heuristic and intuitive analysis tools guided by this mathematics and realizable on existing computers
constitute the core of this review.
CONTENTS
I. Introduction
   A. Observed chaos
   B. Outline of the review
II. Signals, Dynamical Systems, and Chaos
III. Analyzing Measured Signals: Linear and Nonlinear
   A. Signal separation
   B. Finding the space
   C. Classification and identification
   D. Modeling: linear and nonlinear
   E. Signal synthesis
IV. Reconstructing Phase Space or State Space
   A. Choosing time delays
   B. Average mutual information
   C. Choosing the embedding dimension
      1. Singular-value analysis
      2. Saturation of system invariants
      3. False nearest neighbors
      4. True vector fields
   D. T and d_E
V. Invariants of the Dynamics
   A. Density estimation
   B. Dimensions
      1. Generalized dimensions
      2. Numerical estimation
   C. A segue: from geometry to dynamics
   D. Lyapunov spectrum
   E. Global exponents
   F. Ideal case: Known dynamics
   G. Real world: Observations
      1. Analytic approach
      2. Trajectory tracing method
      3. Spurious exponents
   H. Local exponents from known dynamics and from observations
   I. Topological invariants
VI. Nonlinear Model Building: Prediction in Chaos
   A. The dimension of models
   B. Local modeling
   C. Global modeling
   D. In between local and global modeling
VII. Signal Separation: "Noise" Reduction
   A. Knowing the dynamics: manifold decomposition
   B. Knowing a signal: probabilistic cleaning
   C. Knowing very little
VIII. Linearly Filtered Signals
IX. Control and Synchronization of Chaotic Systems
   A. Controlling chaos
      1. Using nonlinear dynamics
      2. More traditional methods
      3. Targeting
   B. Synchronization and chaotic driving
X. Spatio-temporal Chaos
XI. Conclusions
   A. Summary
   B. Cloudy crystal ball gazing
Acknowledgments
References
Additional References
I. INTRODUCTION
Interpretations of measurements on physical systems
often turn on the tools one has to make those interpretations. In this review article we shall describe new tools
for the interpretation of observations of physical systems
where the time trace of the measured quantities is irregular or chaotic. These time series have been considered
bothersome "noise" without the tools we shall describe
and, as such, have traditionally been taken as contamination to be removed from the interesting physical signals
one sought to observe. The characterization of irregular,
broadband signals, generic in nonlinear dynamical systems, and the extraction of physically interesting and useful information from such signals is the set of achievements we review here. The chaotic signals that we shall
address have been discarded in the past as "noise," and
among our goals here is to emphasize the valuable physics that lies in these signals. Further we want to indicate
that one can go about this characterization by exploiting
the structure in these chaotic time series in a systematic,
nearly algorithmic, fashion. In a sense we shall describe
new methods for the analysis of time series, but on another level we shall be providing handles for the investigation and exploitation of aspects of physical processes that
could be simply dismissed as "stochastic" or random
when seen with different tools. Indeed, the view we take
in this review is that chaos is not an aspect of physical
systems that is to be located and discarded, but is an attribute of physical behavior that is quite common and
whose utilization for science and technology is just beginning. The tools we discuss here are likely also to be just
the beginning of what we can hope to bring to bear in the
understanding and use of this remarkable feature of physical dynamics.
The appearance of irregular behavior is restricted to
nonlinear systems. In an autonomous linear system of f degrees of freedom, u(t) = [u_1(t), u_2(t), ..., u_f(t)], one has

    du(t)/dt = A \cdot u(t) ,   (1)

where A is a constant f \times f matrix. The solution of this linear equation results in (1) directions in f-space along which the orbits u(t) shrink to zero, namely, directions along which the real parts of the eigenvalues of A are negative, (2) directions along which the orbits unstably grow to infinity, namely, directions along which the real parts of the eigenvalues of A are positive, and (3) directions where the eigenvalues occur in complex-conjugate pairs with zero or negative real part. In any situation where an eigenvalue has a positive real part, this is a message that the linear dynamics is incorrect, and one must return to the nonlinear evolution equations that govern the process to make a better approximation to the dynamics.
Much of the work in nonlinear dynamics reported in
the past has concentrated on learning how to classify the
nonlinear systems by analyzing the output from known
systems. These efforts have provided, and continue to
provide, significant insights into the kinds of behavior
one might expect from nonlinear dynamical systems as
realized in practice and have led to an ability to evaluate
now familiar quantities such as fractal dimensions,
Lyapunov exponents, and other invariants of the nonlinear system. The capability for doing these calculations
is strongest when we know the differential equations or the mapping that generated the observations.

For a sequence of N measurements x_1, x_2, ..., x_N drawn from a set X of possible values with joint probability P(x_1, x_2, ..., x_N), the entropy of the sequence is

    H_N(X) = -\sum P(x_1, x_2, \ldots, x_N) \log_2 [P(x_1, x_2, \ldots, x_N)] .   (2)

As N becomes large this entropy, divided by N, has a finite limit,

    \lim_{N \to \infty} H_N(X)/N = h(X) ,   (3)

and this is the amount one learns about the source of the symbols by the sequence of measurements on the set X.
Clearly the source is no more than an abstract statement of the system producing our measurements, and the
concept of entropy of a deterministic dynamical system,
where no notion of probability is immediately evident,
seems much the same as in the information-theoretic setting. This idea was realized over thirty years ago by Kolmogorov (1958), who gave a definition of this entropy which, in practice, is precisely the same as h(X) above. Further, this entropy is an invariant of the orbits of the system, that is, independent of the initial conditions or the specific orbits observed. Additional work on this quantity was done by Sinai (1959), and the quantity h(X) has become known as the Kolmogorov-Sinai (or KS) entropy of a dynamical system. Strengthening this connection is a theorem of Pesin (1977) equating the KS entropy to the sum of the positive Lyapunov exponents. If a dynamical system has no positive Lyapunov exponents, it does not possess chaos in the usual sense.
These comments indicate the connection between
dynamical systems and information theory, but do not
explain the interest for someone concerned with the
analysis and modeling of chaotic data. The KS entropy
gives a precise statement of the limits to predictability for
a nonlinear system. Think of a phase space for the system which is reconstructed or, if one actually knows the
differential equations or mapping, precise. Any realizable statement about the system is known only to some
precision in this state space. Within that precision or
resolution we cannot distinguish at a given time t = 0 between two state-space points lying in the resolution cell.
A nonlinear system with positive h(X) is intrinsically unstable, however, and the two points unresolvable at t = 0 will move after some time T to differing parts of the phase space. 2^{h(X)T} is a measure of the number of states occupied by the system after the passage of a time T (Gallager, 1968; Rabinovich, 1978). When this number of states is approximately the total number of states available for the orbits of the system, then all prediction of the future of an orbit becomes untenable. This occurs in approximately T \approx 1/h(X). One still has statistical information about the system that is quite useful and interesting, but knowledge of the evolution of a specific orbit is lost. Thus h(X) becomes an object of considerable interest for the physicist. Since it is the sum of the positive Lyapunov exponents, one can use the time series itself to determine how predictable it is.
A. Observed chaos
Actual observations of irregular evolution that we consider are typically measurements of a single scalar observable at a fixed spatial point: call the measured quantity s(t_0 + n\tau_s) = s(n), where t_0 is some initial time and \tau_s is the sampling time of the instrument used in the experiment. s(n) could be a voltage or current in a nonlinear circuit, or a density or velocity in a fluid dynamic or plasma dynamic experiment, or could be the temperature at some location in a laboratory container or in the atmosphere.
The statistical aspect of chaotic dynamics was discussed and its importance stressed in an earlier article in
this journal by Eckmann and Ruelle (1985). Our review
now is in many ways an update of features of that earlier
article with an emphasis on practical aspects of the
analysis of experimental data and model building for prediction and control. Of course, we also include many developments unanticipated by the earlier review, but we
certainly recommend it as an introduction to many of the
topics here, often from a rather more mathematical
viewpoint than we project.
Many items we shall mention here, especially in the
sections on signal separation, control of chaotic systems,
and synchronization of chaotic systems, have direct practical implications which move beyond the usual preoccu-
1335
...
9Uf(t)]9
^ = G ( u U ) ) ,
(4)
dt
where the vector field G ( u ) is always taken to be continuous in its variables and also taken to be differentiate as
often as needed. When time is discrete, which is the realistic situation when observations are only sampled every
rs, the evolution is given by a map from vectors in R-f to
other vectors in R^, each labeled by discrete time:
u(n) = u(t0-\-nTs),
1336
(5)
u(n+l)=F(u(/i))
The continuous-time and discrete-time views of the dynamics can be connected by thinking of the time derivative as approximated by

    du(t)/dt \approx [u(t_0 + (n+1)\tau_s) - u(t_0 + n\tau_s)] / \tau_s ,   (6)

so that one step of the map is approximately

    F(u(n)) \approx u(n) + \tau_s G(u(n)) .   (7)

Under the discrete-time evolution

    u(n+1) = F(u(n)) ,   (9)

small volumes in phase space evolve according to the Jacobian matrix of the map,

    DF(u) = \partial F(u) / \partial u .   (10)
When the determinant of this Jacobian is unity, which is a very special occurrence and is usually associated with Hamiltonian evolution in a physical setting, phase-space volume remains constant under the evolution of the system. Generally
the Jacobian has a determinant less than unity, which
means volumes in phase space shrink as the system
evolves. The physical origin of this phase-space volume
contraction is dissipation in the dynamics.
As the physical parameters in the vector field are varied, the behavior of u(t) or u(n), considered as an initial-value problem starting from u(t_0) or u(0), will change in detail as t or n becomes large. Some of these changes are smooth and preserve the geometry of the set of points in R^f visited by the dynamics. Some changes reflect a sudden alteration in the qualitative behavior of the system. To have an explicit example to discuss, we consider the three differential equations of Lorenz (1963) [1]:
    dx(t)/dt = \sigma (y(t) - x(t)) ,
    dy(t)/dt = -x(t) z(t) + r x(t) - y(t) ,   (11)
    dz(t)/dt = x(t) y(t) - b z(t) .
These equations are derived from a finite mode truncation of the partial differential equations describing the thermal driving of convection in the lower atmosphere [2]. The ratio of thermal to viscous dissipation is called \sigma, the ratio of the Rayleigh number to that critical Rayleigh number where convection begins is called r, and the dimensionless scale of a convective cell is called b. The parameters \sigma and b are properties of the system and the geometry (planar here), while the parameter r is determined by the external driving of the fluid.

[1] In a later paper (Lorenz, 1984) there is a delightful description of how he came to consider this model. He was concerned with linear prediction of nonlinear models and used this three-degree-of-freedom model as a testing ground for this study. He concluded that the idea of linear prediction for this kind of system would not do.
The simplest time-asymptotic state is the zero of the vector field

    G_1(x,y,z) = \sigma (y - x) ,
    G_2(x,y,z) = -xz + rx - y ,   (12)
    G_3(x,y,z) = -bz + xy ,

at (x,y,z) = (0,0,0). This represents steady thermal conduction. As r increases beyond unity, the state (0,0,0) is linearly unstable, and the vector field has two symmetric linearly stable fixed points at (x,y,z) = (\pm\sqrt{b(r-1)}, \pm\sqrt{b(r-1)}, r-1).
For all r < 1, the geometry of the time-asymptotic state of the system is the same, namely, all initial conditions tend to (0,0,0). For r > 1 and until r reaches r_c = \sigma(\sigma + b + 3)/(\sigma - b - 1), orbits end up at one or the other of these fixed points, depending on the value of (x(0), y(0), z(0)). The geometry of the final state of the orbits is the same, namely, all volumes of initial conditions shrink to zero-dimensional objects. These zero-dimensional limit sets change with parameters, but retain their geometrical nature until r = r_c. When r > r_c, the situation changes dramatically. The two fixed points of the vector field become linearly unstable, and no stable fixed points remain. As r is increased from r_c, the time-asymptotic state emerging from an initial condition will either perform irregular motion characterized by a broad Fourier spectrum or undergo periodic motion.
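To experiment with the behavior just described, the Lorenz equations (11) can be integrated numerically. The sketch below (not part of the original text) uses a fourth-order Runge-Kutta step; the function names are illustrative, and the parameter values \sigma = 16, b = 4, r = 45.92 are one standard chaotic operating point, not a prescription from this review.

```python
import numpy as np

def lorenz_rhs(u, sigma=16.0, r=45.92, b=4.0):
    # Vector field of Eq. (11).
    x, y, z = u
    return np.array([sigma * (y - x), -x * z + r * x - y, x * y - b * z])

def integrate_rk4(u0, dt=0.01, n_steps=20000):
    # Fourth-order Runge-Kutta integration of the Lorenz equations.
    u = np.array(u0, dtype=float)
    orbit = np.empty((n_steps, 3))
    for k in range(n_steps):
        k1 = lorenz_rhs(u)
        k2 = lorenz_rhs(u + 0.5 * dt * k1)
        k3 = lorenz_rhs(u + 0.5 * dt * k2)
        k4 = lorenz_rhs(u + dt * k3)
        u = u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        orbit[k] = u
    return orbit
```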
The analysis of the sequence of allowed states in a
differential equation or a map as parameters of the system are varied is called bifurcation theory. We shall not
dwell on this topic as it is covered in numerous monographs and literature articles (Guckenheimer and
Holmes, 1983; McKay, 1992; Crawford, 1991; Crawford
and Knobloch, 1991). What is important for us here is
that as we change the ratio of forcing to damping,
represented by the parameter r in the Lorenz equations,
the physical system undergoes a sequence of transitions
from regular to more complex temporal evolution. The
physical picture underlying these transitions is the requirement of passing energy through the system from whatever forcing is present to the dissipation mechanism for the particular system. As the amount of energy flux grows, the system typically passes through a sequence of ever more complex motions.
[2] The same equations were derived in the context of laser physics by Orayevsky et al. (1965).
III. ANALYZING MEASURED SIGNALS: LINEAR AND NONLINEAR
A. Signal separation
FIG. 5. Power spectrum of the time series V(t) from the hysteretic circuit.
Since the long-time behavior of linear systems consists of fixed points, or oscillations at a few frequencies, it is not surprising that the use of linear models is not likely to give much insight into the behavior of signals arising from nonlinear systems. Traditional autoregressive/moving-average models (Oppenheim and Schafer, 1989), then, are not going to do very well in the description of chaotic signals. Indeed, if one tries to use linear models to fit data from chaotic sources and chooses to fit the unknown parameters in the model by a least-squares fit of the model to the observations, the rms error in the fit is about the same size as the attractor; that is, nothing has been accomplished. If one tries a local linear fit connecting neighborhoods in the state space of the system (this is certainly one of the simplest nonlinear models), the rms error can essentially be made as small as one likes, depending on the quantity of data and size of the neighborhoods. This is not to suggest that local linear models answer all questions (indeed, we shall see below this is not the case) but that even small excursions from global linear modeling can significantly improve performance in modeling nonlinear signals.
TABLE I. Connection between the requirements in linear and nonlinear signal processing. The goals are much the same in each
case; the techniques are quite different.
Nonlinear signal processing:

Phase-space reconstruction. Using time-lagged variables, form coordinates for the phase space in d dimensions:

    y(n) = [x(n), x(n+T), \ldots, x(n + (d-1)T)] .

Model fitting. With a parametrized model F(y; a_1, \ldots, a_p), find parameters a_j consistent with invariant classifiers (Lyapunov exponents, dimensions). Find the dynamical dimension d_L from the data.
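As an illustration of the phase-space reconstruction entry in Table I, here is a minimal sketch (with the hypothetical function name delay_embed) of forming the time-lagged coordinate vectors y(n) from a scalar series:

```python
import numpy as np

def delay_embed(x, d, T):
    """Form d-dimensional delay vectors y(n) = [x(n), x(n+T), ..., x(n+(d-1)T)]."""
    n_vectors = len(x) - (d - 1) * T
    return np.column_stack([x[m * T : m * T + n_vectors] for m in range(d)])
```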
D. Modeling: linear and nonlinear

A linear model relates the observed signal s(n) to its own past values and to an external input g(n) in autoregressive/moving-average form:

    s(n) = \sum_{k=1}^{M} a_k s(n-k) + \sum_{l=1}^{L} b_l g(n-l) .   (13)

Introducing the z-transform of the signal,

    S(z) = \sum_{n=-\infty}^{\infty} z^n s(n) ,   (14)

which leads to

    S(z) = \frac{\sum_{l=1}^{L} b_l z^l}{1 - \sum_{k=1}^{M} a_k z^k} G(z) ,   (15)

with G(z) the z-transform of g(n).
E. Signal synthesis
The main emphasis of this review and actually of most
work on nonlinear signals has been on their analysis; that
is, given an observed signal, can we tell if it came from a
low-dimensional chaotic attractor, can we characterize
that attractor, can we predict the evolution of state-space
points near the attractor or into the future, etc.
Another direction now being taken concerns the synthesis of new signals using chaotic sources. Can one utilize these signals for communication between distant locations or synchronization of activities at these locations,
etc. This, in our opinion, is likely to be the main direction of engineering applications of the knowledge we
have garnered about nonlinear systems operating in a
chaotic mode. We shall touch upon these ideas as we
proceed here, but their review will have to await a more specialized treatment elsewhere.

IV. RECONSTRUCTING PHASE SPACE OR STATE SPACE

FIG. 7. Lorenz time series x(t) (Fig. 1) embedded in a three-dimensional phase space (x(t), x(t+dT), x(t+2dT)); dT = 0.2.

The time derivative of the observed signal can be approximated from the samples as

    [s(t_0 + (n+1)\tau_s) - s(t_0 + n\tau_s)] / \tau_s ,   (17)

so coordinates built from derivatives and coordinates built from time delays of the measurements carry the same information; for almost any time lag, the delay coordinates provide an embedding.
Time-delay embedding is certainly not the only method for embedding scalar data in a multivariate dynamical environment. Other methods (Landa and Rosenblum, 1989; Mindlin et al., 1991) are available and have their advantages. Time-delay embedding is, perhaps, the only systematic method for going from scalar data to multidimensional phase space, and it has certainly been the most explored in the literature.
A. Choosing time delays
The statement of the embedding theorem that any time
lag will be acceptable is not terribly useful for extracting
physics from the data. If we choose T too small, then the
coordinates s(n + jT) and s(n + (j+1)T) will be so close to each other in numerical value that we cannot distinguish them from each other. From any practical point of view they have not provided us with two independent coordinates. Similarly, if T is too large, then s(n + jT) and s(n + (j+1)T) are completely independent of each
other in a statistical sense, and the projection of an orbit
on the attractor is onto two totally unrelated directions.
The origin of this statistical independence is the ubiquitous instability in chaotic systems, which results in any
small numerical or measurement error's being amplified
exponentially in time. A criterion for an intermediate
choice is called for, and it cannot come from the embedding theorem itself or considerations based on it, since
the theorem works for almost any value of T.
One's first thought might be to consider the values of s(n) as chosen from some unknown distribution. Then computing the linear autocorrelation function

    C_L(\tau) = \frac{(1/N) \sum_{m=1}^{N} [s(m+\tau) - \bar{s}][s(m) - \bar{s}]}{(1/N) \sum_{m=1}^{N} [s(m) - \bar{s}]^2} ,   (19)

where \bar{s} = (1/N) \sum_{m=1}^{N} s(m), and looking for that time lag where C_L(\tau) first passes through zero, would give us a good hint of a choice for T. Indeed, this does give a good hint. It tells us, however, about the independence of the coordinates only in a linear fashion. To see this, recall that if we want to know whether two measurements s(n) and s(n+\tau) depend linearly on each other on the average over the observations, we find that their connection, in a least-squares sense, is through the correlation matrix just given. That is, if we assume that the values of s(n) and s(n+\tau) are connected by

    s(n+\tau) - \bar{s} = C_L(\tau) [s(n) - \bar{s}] ,   (20)

then minimizing

    \sum_{n=1}^{N} \{ s(n+\tau) - \bar{s} - C_L(\tau) [s(n) - \bar{s}] \}^2   (21)

with respect to C_L(\tau) reproduces Eq. (19).

B. Average mutual information

A criterion that respects the nonlinear structure of the data comes from information theory. For two sets of measurements A and B, the mutual information between measurement a_i drawn from A and measurement b_k drawn from B is

    I_{AB}(a_i, b_k) = \log_2 \left[ \frac{P_{AB}(a_i, b_k)}{P_A(a_i) P_B(b_k)} \right] ,   (22)

where P_{AB}(a_i, b_k) is the joint probability of the two measurements and P_A and P_B are the individual probabilities. The average mutual information between the two sets of measurements is then

    I_{AB} = \sum_{a_i, b_k} P_{AB}(a_i, b_k) I_{AB}(a_i, b_k) .   (23)
To place this abstract definition into the context of observations from a physical system s(n), we think of the set of measurements s(n) as the A set and of the measurements s(n+T) as the B set. The average mutual information between observations at time n and at time n+T is then

    I(T) = \sum_{n} P(s(n), s(n+T)) \log_2 \left[ \frac{P(s(n), s(n+T))}{P(s(n)) P(s(n+T))} \right] .   (24)
For time lags T_m at which the measurements are statistically independent, P(s(n), s(n+T_m)) = P(s(n)) P(s(n+T_m)), and I(T_m) vanishes; a useful prescription is to choose T at the first minimum of I(T).
As an example, consider the Hénon map

    x(n+1) = 1 + y(n) - 1.4 x(n)^2 ,
    y(n+1) = 0.3 x(n) .   (25)
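A minimal numerical sketch of Eq. (24), using a two-dimensional histogram to estimate the joint probabilities; the Hénon map of Eq. (25) supplies test data. The bin count and the function name are illustrative assumptions:

```python
import numpy as np

def average_mutual_information(s, T, n_bins=32):
    """Estimate I(T) of Eq. (24) from a histogram of the pairs (s(n), s(n+T))."""
    a, b = s[:-T], s[T:]
    p_ab, _, _ = np.histogram2d(a, b, bins=n_bins)
    p_ab /= p_ab.sum()
    p_a = p_ab.sum(axis=1)     # marginal distribution of s(n)
    p_b = p_ab.sum(axis=0)     # marginal distribution of s(n+T)
    mask = p_ab > 0
    return np.sum(p_ab[mask] * np.log2(p_ab[mask] / np.outer(p_a, p_b)[mask]))

# Generate Henon-map data, Eq. (25), and look for the first minimum of I(T).
x, y, data = 0.1, 0.1, []
for _ in range(5000):
    x, y = 1.0 + y - 1.4 * x**2, 0.3 * x
    data.append(x)
s = np.array(data)
ami = [average_mutual_information(s, T) for T in range(1, 20)]
```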
C. Choosing the embedding dimension

1. Singular-value analysis
If our measurements y(n) are composed of the signal from the dynamical system we wish to study plus some contamination from other systems, then in the absence of specific information about the contamination it is plausible to assume it to be rather high dimensional and to assume that it will fill more or less uniformly any few-dimensional space we choose for our considerations. Let us call the embedding dimension necessary to unfold the dynamics we seek d_N. If we work in d_E > d_N, then in a heuristic sense d_E - d_N dimensions of the space are being populated by contamination alone. If we think of the observations embedded in d_E as composed of a true signal y_T(n) plus some contamination c: y(n) = y_T(n) + c(n), then the d_E \times d_E sample covariance matrix

    COV = \frac{1}{N} \sum_{n=1}^{N} [y(n) - y_{av}][y(n) - y_{av}]^T ,   (26)

with

    y_{av} = \frac{1}{N} \sum_{n=1}^{N} y(n) ,   (27)
will, again in a heuristic sense, have d_N eigenvalues arising from the variation of the (slightly contaminated) real signal about its mean and d_E - d_N eigenvalues which represent the "noise." If the contamination is quite high dimensional, it seems plausible to think of it filling these extra d_E - d_N dimensions in some uniform manner, so perhaps one could expect the unwelcome d_E - d_N eigenvalues, representing the power in the extra dimensions, to be nearly equal. If this were the case, then by looking at the eigenvalues, or equivalently the singular values of COV, we might hope to find a "noise floor" at which the eigenvalue spectrum turned over and became flat (Broomhead and King, 1986). There are d_E eigenvalues, and the one where the floor is reached may be taken as d_N.
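A sketch of this singular-value diagnostic: form delay vectors, compute the sample covariance of Eq. (26), and inspect its ordered eigenvalues for a flat noise floor. Function names are illustrative:

```python
import numpy as np

def covariance_eigenvalues(x, d_E, T=1):
    """Eigenvalues of the sample covariance matrix, Eq. (26), of delay vectors."""
    n = len(x) - (d_E - 1) * T
    Y = np.column_stack([x[m * T : m * T + n] for m in range(d_E)])
    Y = Y - Y.mean(axis=0)
    cov = Y.T @ Y / n
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

# A flat tail ("noise floor") in these ordered eigenvalues suggests d_N,
# the number of directions carrying signal rather than contamination.
```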
This analysis can also be carried out locally (Broomhead and Jones, 1989), which means that the covariance matrix is taken over a neighborhood of the N_B nearest neighbors y^{(r)}(n) of any given data point y(n):

    COV(n) = \frac{1}{N_B} \sum_{r=1}^{N_B} [y^{(r)}(n) - y_{av}(n)][y^{(r)}(n) - y_{av}(n)]^T .   (28)
2. Saturation of system invariants

If the attractor is unfolded by the embedding, then any invariant property of the attractor computed from the data will not change as the embedding dimension is increased further. For example, from the number of points within a radius r of the point x,

    n(r, x) = \frac{1}{N} \sum_{k=1}^{N} \theta(r - |y(k) - x|) ,   (29)

one forms the correlation function

    C_q(r) = \frac{1}{N} \sum_{j=1}^{N} [n(r, y(j))]^{q-1} ,   (30)

and looks for the embedding dimension at which its scaling behavior saturates.
3. False nearest neighbors

The third method for determining d_N comes from asking, directly of the data, the basic question addressed in the embedding theorem: when has one eliminated false crossings of the orbit with itself which arose by virtue of having projected the attractor into a too low dimensional space?

Answers to this question have been discussed in various ways. Each of the ways has addressed the problem of determining when points in dimension d are neighbors of one another by virtue of the projection into too low a dimension. By examining this question in dimension one, then dimension two, etc., until there are no incorrect or false neighbors remaining, one should be able to establish, from geometrical considerations alone, a value for the necessary embedding dimension d_E = d_N.

We describe the implementation of Kennel et al. (1992).
In dimension d each vector

    y(k) = [s(k), s(k+T), \ldots, s(k + (d-1)T)]   (31)

has a nearest neighbor y^{NN}(k) with coordinates [s^{NN}(k), s^{NN}(k+T), \ldots, s^{NN}(k+(d-1)T)]. The square of the distance between them is

    R_d(k)^2 = \sum_{m=0}^{d-1} [s(k+mT) - s^{NN}(k+mT)]^2 .   (32)

In going from dimension d to dimension d+1 a new coordinate s(k+Td) is added to each vector, and the distance becomes

    R_{d+1}(k)^2 = R_d(k)^2 + [s(k+Td) - s^{NN}(k+Td)]^2 .   (33)

If the additional distance is large relative to the distance in dimension d, the neighbors were false, having arrived near each other only through projection; the first criterion is

    \frac{|s(k+Td) - s^{NN}(k+Td)|}{R_d(k)} > R_T ,   (34)

with R_T a threshold of order ten. A more demanding example used below is the Ikeda map of the complex plane,

    z(k+1) = p + B z(k) \exp\left[ i\kappa - \frac{i\alpha}{1 + |z(k)|^2} \right] ,   (35)

with parameters (p, B, \kappa, \alpha) chosen in the chaotic regime.
When the nearest-neighbor distance in dimension d is already comparable to the size of the attractor, the added coordinate cannot grow much further, so a second criterion is required: if the distance in dimension d+1 becomes comparable to the overall extent R_A of the attractor,

    R_{d+1}(k) / R_A \approx 1 ,   (36)

then y(k) and its nearest neighbor are false nearest neighbors. As a measure of R_A one may use the rms value of the observations,

    R_A^2 = \frac{1}{N} \sum_{k=1}^{N} [s(k) - \bar{s}]^2 .   (37)
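The criteria of Eqs. (32)-(34) assemble into a simple brute-force false-nearest-neighbors test; the threshold R_T below is a tunable assumption of the order suggested by Kennel et al. (1992):

```python
import numpy as np

def false_nearest_neighbors(s, d, T, R_T=15.0):
    """Fraction of false nearest neighbors in dimension d, criterion of Eq. (34)."""
    n = len(s) - d * T            # leave room for the (d+1)st coordinate s(k+Td)
    Y = np.column_stack([s[m * T : m * T + n] for m in range(d)])
    false_count = 0
    for k in range(n):
        dist2 = np.sum((Y - Y[k]) ** 2, axis=1)
        dist2[k] = np.inf         # exclude the point itself
        nn = np.argmin(dist2)
        R_d = np.sqrt(dist2[nn])
        extra = abs(s[k + T * d] - s[nn + T * d])
        if R_d > 0 and extra / R_d > R_T:
            false_count += 1
    return false_count / n
```

One raises d until the returned fraction drops to zero; that d is taken as d_N.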
FIG. 16. Time series from the Ikeda map embedded in a two-dimensional space (x(k), x(k+1)).
D. T and d_E
V. INVARIANTS OF THE DYNAMICS

From an observed orbit y(k), k = 1, 2, ..., N, the natural invariant density on the attractor is

    \rho(x) = \frac{1}{N} \sum_{k=1}^{N} \delta^d(x - y(k)) ,   (39)

and the average of a function g(x) with respect to this density is

    \langle g \rangle = \int d^d x \, g(x) \rho(x) = \frac{1}{N} \sum_{k=1}^{N} g(y(k)) .   (40)

Replacing g(x) by g(F(x)), we see that

    \int d^d x \, g(F(x)) \rho(x) = \frac{1}{N} \sum_{k=1}^{N} g(y(k+1)) = \frac{1}{N} [g(y(N+1)) - g(y(1))] + \langle g \rangle ,   (41)
and in the limit N \to \infty, the averages of g(x) and g(F(x)) are the same. Since F(x) moves us along the orbit precisely one step, we see that, for data sets long enough to realize the physicist's meaning of N \to \infty, all \langle g \rangle are unchanged by perturbations within the basin of attraction (all those points in state space which end up on the attractor) of the observations.
Quantities such as \langle g \rangle can now be used to identify the system that produced the observations. The \langle g \rangle are statistical quantities for a deterministic system and are the only quantities that will be the same for any long observation of the same system. Different chaotic orbits will, as we shall make quantitative below, differ essentially everywhere in phase space. The issue, then, is to choose a g(x) that holds some interest for physical questions.
A. Density estimation

Partition the phase space into equal-sized elements x_i. The probability for the orbit to visit element i,

    p_i = \int_{x_i} d^d x \, \rho(x) ,   (42)

becomes

    p_i \approx \frac{n(x_i)}{\sum_i n(x_i)} ,   (43)
with n(x_i) equal to the number of counts in partition element i. These discrete estimates of the partition probabilities in the state space provide our approximation for
the invariant measure. Equation (40) can be interpreted
as the sparsely populated version of Eq. (43), resulting
from taking so fine a partitioning of phase space that
each partition element contains only one point.
This type of partition-based estimator is simply a multidimensional histogram and, as such, does not provide
smooth distributions and can suffer from boundary
effects. One way of improving upon traditional histograms is to use kernel density estimators, which replace
the delta function in Eq. (39) with one's favorite integrable model for a localized, but nonsingular, distribution
function. Tight Gaussians are a popular choice for the
kernel, but scores of alternatives have been suggested on
various grounds of optimality criteria (Silverman, 1986).
Kernel density estimators illuminate an alternative to
the traditional fixed-volume methods for estimating
densities/probabilities. Instead of fixing a uniform grid
to partition the phase space and counting the number of
observations in each equal-sized partition element, it is
possible to base our estimates on the localized properties
of the given data set around each individual point.
Fixed-mass techniques perform a virtual partitioning of
the phase space with overlapping elements centered on
every point in the data set. The radius of each partition
element, r_N(x), is increased until a prespecified number N_B of neighboring points have been enclosed. In this
way we accumulate a large number of overlapping partition elements, all containing the same number of points
but having different spatial sizes. The local density, or
probability, is then estimated as a function of the inverse
of the bin radius:
p(x)*rN
(x)~
dE
(45)
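A direct transcription of the fixed-mass idea of Eq. (45): for each point, grow a ball until it holds N_B neighbors and convert the radius to a density estimate. This O(N^2) sketch, with illustrative names, is for modest data sets only:

```python
import numpy as np

def fixed_mass_density(Y, N_B=10):
    """Fixed-mass estimate, Eq. (45): density ~ r_N(x)**(-d_E), with r_N the
    radius of the ball enclosing N_B neighbors of each point."""
    n, d_E = Y.shape
    rho = np.empty(n)
    for i in range(n):
        dist = np.sqrt(np.sum((Y - Y[i]) ** 2, axis=1))
        r = np.sort(dist)[N_B]    # N_B-th neighbor (index 0 is the point itself)
        rho[i] = r ** (-d_E)
    return rho
```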
B. Dimensions

Attractor dimension has been the most intensely studied invariant quantity for dynamical systems. Much of the interest of the past decade was spawned by the realization that dimensions could be estimated directly from observed data. Covering the set of points with boxes of size \epsilon and counting the number N(\epsilon) of occupied boxes, the dimension D is defined through the scaling

    N(\epsilon) \sim \epsilon^{-D} ,   (46)

that is, by the ratio

    D = \lim_{\epsilon \to 0} \frac{\log N(\epsilon)}{\log(1/\epsilon)} .   (47)

In practice the box counting is carried out for values of \epsilon spanning
a large range, hopefully at least several orders of magnitude. In theory, in the limit as \epsilon \to 0 the ratio in Eq. (47) describes the space-filling properties of the point set being analyzed. It is obvious that the largest value of D calculable with this box-counting algorithm is d_E. If the embedding dimension is not large enough to unfold the attractor's structure fully, we shall only be observing a projection of the structure, and much information will be obscured. Therefore the common practice is to repeat the described box-counting calculations in a number of increasing embedding dimensions. For lower-dimensional embeddings, the attractor's projection is expected to "fill" the space, resulting in an estimated fractal dimension equal to the embedding dimension. As the embedding dimension increases through the minimum required for complete unfolding of the geometric structure, the calculated fractal dimension saturates at the proper value. Figure 19 illustrates this basic technique with data from the Hénon map.

According to our geometric intuition, Eq. (46) takes "the amount" of an object to be equivalent to its volume. But this is not the only measure for how much of something there is. Theiler has nicely phrased Eq. (46) in the more general form

    bulk \propto size^{dimension} .   (48)
1. Generalized dimensions
Let us now return to our discussion of invariant quantities by considering the case in which bulk is measured in terms of some moment of the invariant distribution. In other words, take g(x) = \rho^{q-1}(x) in Eq. (40) and use \langle g \rangle^{1/(q-1)} to measure bulk. Putting everything together we arrive at

    D_q = \lim_{r \to 0} \frac{1}{q-1} \frac{\log \sum_i p_i^q}{\log r} .   (49)

This definition of the generalized dimension D_q provides a whole spectrum of invariant quantities for -\infty < q < \infty.
We are already familiar with D_0 = \lim_{r \to 0} \log N(r) / \log(1/r); this is just the capacity estimated by the box-counting method.
Next we have D_1 to consider. The q \to 1 limit results in an indeterminate form, but one easy way to find the limit is with L'Hôpital's rule. Differentiating the numerator with respect to q,

    \frac{\partial}{\partial q} \log \sum_i p_i^q = \frac{\sum_i p_i^q \log p_i}{\sum_i p_i^q} ,   (50)

so that

    D_1 = \lim_{r \to 0} \frac{\sum_i p_i \log p_i}{\log r} .   (51)

FIG. 19. Box counting for the Hénon attractor in different embedding dimensions d_E = 2, 3 and at differing spatial resolutions.
Note that, when p_i = 1/N for all i, then D_1 = D_0. But when the p_i are not all equal, D_1 < D_0. (In fact, in general D_i \le D_j if i > j.)
The difference between the two measures is explained by how we use our state-space partition. For D_0 the bulk is how many partition elements are nonempty. An element containing one sample contributes as much bulk as an element containing a million samples. D_1 is calculated from a bulk in which each partition element's contribution is proportional to how many samples it contains. So bulk in D_0 is simply the volume, whereas for D_1 bulk is equivalent to the mass.

Because of the functional form of the numerator in Eq. (51), D_1 is commonly called the information dimension. That numerator expresses the amount of information needed to specify a state to precision r, and the ratio defining the dimension tells us how fast this required information grows as r is decreased.
Next we have D_2, the so-called correlation dimension:

    D_2 = \lim_{r \to 0} \frac{\log \sum_i p_i^2}{\log r} .   (52)

We see that the numerator constitutes a two-point correlation function, measuring the probability of finding a pair of random points within a given partition element, just as the numerator in the definition of D_1 measures the probability of finding one point in a given element.
Consequently, we can try to estimate the numerical value of that numerator by counting how many pairs of points have a separation distance less than some value r. In 1983 Grassberger and Procaccia (1983a, 1983b) suggested using

    C_2(r) = \int d^d x \, \rho(x) \, n(r, x) .   (53)
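A minimal estimate of this correlation sum counts the fraction of pairs of reconstructed points closer than r; D_2 is then read from the slope of log C_2(r) versus log r in the scaling region. The quadratic-memory pair computation below is a sketch for modest data sets:

```python
import numpy as np

def correlation_sum(Y, r_values):
    """Grassberger-Procaccia correlation sum: fraction of pairs closer than r."""
    n = len(Y)
    # Pairwise distances (O(n^2) memory; subsample for large data sets).
    diff = Y[:, None, :] - Y[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    pair_dists = dist[np.triu_indices(n, k=1)]
    return np.array([(pair_dists < r).mean() for r in r_values])

# Repeat in increasing embedding dimensions; the slope of log C_2(r) vs log r
# saturates at D_2 once the attractor is unfolded.
```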
C. A segue: from geometry to dynamics

The geometry of the attractor can be connected to the stability properties of the dynamics that generate it. Cover the attractor with N(\epsilon) boxes of size \epsilon, and let the evolution stretch or shrink each box by factors L_1 \ge L_2 \ge \cdots along its principal directions. Each direction in which the box has grown relative to the new box size C\epsilon contributes a factor L_i/C to N:

    N(C\epsilon) = N(\epsilon) \prod_i \max(L_i/C, 1) .   (54)

Recall the basic scaling relation that motivated our box-counting algorithm:

    N(\epsilon) \sim \epsilon^{-D} .   (55)

So

    \frac{N(C\epsilon)}{N(\epsilon)} = \frac{(C\epsilon)^{-D}}{\epsilon^{-D}} = C^{-D} ,   (56)

and comparing Eqs. (54) and (56) yields a dimension estimate

    D_L = \frac{\log \prod_i \max(L_i/C, 1)}{\log(1/C)} ,   (57)

where the \lambda_i, the logarithms of the L_i, are called the Lyapunov exponents.

Which Lyapunov exponent should be picked as the last one retained, \lambda_K? In the spirit of the definition of the capacity D_0 we want to choose the value that minimizes D_L. Note that if all the \lambda_i are less than zero, the attractor is a stable fixed point and has dimension zero. If the only non-negative exponents are zero, the attractor is a limit cycle. Multiple null exponents correspond to the number of incommensurate frequencies in a quasiperiodic system, which is also the system's dimension. When at least one exponent is positive, the last equation is to be minimized with respect to \lambda_K, to provide the most efficient covering of the invariant set.

To find the optimal K, consider the difference in D_L calculated with K or K+1 retained exponents. Assuming \lambda_{K+1} < 0 and |\lambda_{K+1}| < |\lambda_{K+2}|, one finds that this difference is negative, and D_L is decreasing, so long as \sum_{i=1}^{K} \lambda_i > |\lambda_{K+1}|. Therefore we take K to be the largest integer for which \sum_{i=1}^{K} \lambda_i > 0, and

    D_L = K + \frac{\sum_{i=1}^{K} \lambda_i}{|\lambda_{K+1}|} .   (58)
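Equation (58) translates directly into a few lines of code; the example values are those quoted for the Lorenz system later in Table II. The function name is an illustrative choice:

```python
def lyapunov_dimension(exponents):
    """Lyapunov (Kaplan-Yorke) dimension of Eq. (58) from an ordered
    spectrum lambda_1 >= lambda_2 >= ..."""
    total, K = 0.0, 0
    for lam in exponents:
        if total + lam < 0:
            break
        total += lam
        K += 1
    if K == len(exponents):   # all partial sums positive: formula inapplicable
        return float(K)
    return K + total / abs(exponents[K])

# Example with the Lorenz values quoted in Table II:
# lyapunov_dimension([1.50, 0.0, -22.5]) -> 2 + 1.50/22.5 ~ 2.07
```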
D. Lyapunov spectrum
Dimensions characterize the distribution of points in the state space; Lyapunov exponents describe the action of the dynamics defining the evolution of trajectories. Dimensions deal only with the way measure is distributed throughout the space, whereas Lyapunov exponents examine the structure of the time-ordered points making up a trajectory.
To understand the significance of the spectrum of
Lyapunov exponents, consider the effects of the dynamics
on a small spherical fiducial hypervolume in the (possibly
reconstructed) phase space. Arbitrarily complicated dynamics, like those associated with chaotic systems, can
cause the fiducial element to evolve into extremely complex shapes. However, for small enough length scales
and short enough time scales the initial effect of the dynamics will be to distort the evolving spheroid into an ellipsoidal shape, with some directions being stretched and
others contracted. The primary, longest, axis of this ellipsoid will correspond to the most unstable direction of
the flow, and the asymptotic rate of expansion of this axis
is what is measured by the largest Lyapunov exponent.
More precisely, if the infinitesimal radius of the initial fiducial volume is called r(0), and the length of the ith principal axis at time t is called l_i(t), then the ith Lyapunov exponent can be defined as

    \lambda_i = \lim_{t \to \infty} \frac{1}{t} \log \frac{l_i(t)}{r(0)} .   (59)

By convention, the Lyapunov exponents are always ordered so that \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots.
Equivalently, the Lyapunov exponents can be seen to measure the rate of growth of fiducial subspaces in the phase space. \lambda_1 measures how quickly linear distances grow: two points initially separated by an infinitesimal distance will, on average, have their separation grow as e^{\lambda_1 t}. The two largest principal axes define an area element, and the sum \lambda_1 + \lambda_2 determines the rate at which two-dimensional areas grow. In general, the behavior of d-dimensional subspaces is described by the sum of the first d exponents, \sum_{j=1}^{d} \lambda_j.

E. Global exponents
The spectrum of Lyapunov exponents is determined by following the evolution of small perturbations to an orbit by the linearized dynamics of the system. Suppose we perturb the orbit at time k by a small vector w(k); to linear order in the perturbation,

    w(k+1) = F(y(k)) + DF(y(k)) \cdot w(k) - y(k+1) = DF(y(k)) \cdot w(k) .   (60)

After L steps the perturbation has been acted on by the composition of Jacobians along the orbit,

    DF^L(y(k)) = DF(y(k+L-1)) \cdots DF(y(k+1)) \cdot DF(y(k)) ,   (61)

and by the Oseledec multiplicative ergodic theorem the eigenvalues of

    [ DF^L(y) \cdot [DF^L(y)]^T ]^{1/2L}   (62)

converge as L \to \infty to e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_d}, independently of the starting point y; these limits define the global Lyapunov exponents \lambda_a. Their connection to predictability is Pesin's identity for the Kolmogorov-Sinai entropy,

    h(X) = \sum_{\lambda_a > 0} \lambda_a .   (64)
F. Ideal case: Known dynamics

When the equations of motion or the map are known, the Jacobians DF(y(k)) can be computed exactly along the orbit, and the exponents are extracted stably with a recursive QR decomposition: writing

    DF(y(k)) \cdot Q(k-1) = Q(k) \cdot R(k) ,   (65)

with Q(k) orthogonal and R(k) upper triangular with positive diagonal elements, the exponents are accumulated from the diagonals,

    \lambda_a = \lim_{L \to \infty} \frac{1}{L} \sum_{k=1}^{L} \log R_{aa}(k) .   (68)
G. Real world: Observations

The same procedure is adopted when a predictive model is available for supplying values for the Jacobians of the dynamical rules. Lacking such detailed information about the dynamics, one can still try to estimate the larger exponents by tracking the evolution of small sample subspaces through the tangent space.
1. Analytic approach
The problem is to reconstruct the phase space and then estimate numerically the Jacobians DF(y(k)) in the neighborhood of each orbit point. Then the determination of the Lyapunov exponents by use of a recursive QR or singular-value decomposition is much the same as before. To estimate the partial derivatives in phase space we evidently must use the information we have about neighbors of each orbit point on the attractor. The idea (Sano and Sawada, 1985; Eckmann et al., 1986; Brown et al., 1990, 1991) is to make local maps from all points in the neighborhood of the point y(k) to their mapped images in the neighborhood of y(k+1). In the early literature (Sano and Sawada, 1985; Eckmann et al., 1986) this map was taken to be locally linear, but this accurately gives only the largest exponent, since the numerical accuracy of the local Jacobians is not very good, and when one makes mistakes in each of the elements of the composition of Jacobians that one is required to diagonalize to determine the \lambda_a, those errors are exponentially compounded by the ill-conditioned nature of the problem (Chatterjee and Hadi, 1988). The problem is that the burden of making the local map and the burden of being an accurate Jacobian were both put on the d_E \times d_E matrix in the local linear map. The solution is to allow for a larger class of local neighborhood-to-neighborhood maps, and then extract the Jacobian matrix from that map.
This means that locally in state space, that is, near a point y(n) on the attractor, one approximates the dynamics x \to F(x) by

    F_n(x) = \sum_{k} c_n(k) \phi_k(x) ,   (69)
with the \phi_k(x) a set of basis functions chosen for convenience or motivated by the data. The coefficients c_n(k)
are then determined by a least-squares fit minimizing the
residuals for taking a set of neighbors of y( n) to a set of
neighbors of y(n + 1 ) . The numerical approximation to
the local Jacobian then arises by differentiating this approximate local map.
The burden of making the map and that of achieving a numerically accurate Jacobian are separated by this approach.
TABLE II. Global Lyapunov exponents computed from x(t) data for the Lorenz attractor. The order of the neighborhood-to-neighborhood polynomial map is varied. The correct values for these exponents are \lambda_1 = 1.50, \lambda_2 = 0.0, and \lambda_3 = -22.5.

Order of polynomial used in local fit    \lambda_1    \lambda_2      \lambda_3
Linear                                   1.549        -0.094 70      -14.31
Quadratic                                1.519        -0.026 47      -20.26
Cubic                                    1.505        -0.005 695     -22.59
Quartic                                  1.502        -0.002 847     -22.63
Quintic                                  1.502        -0.000 387     -22.40
The three Lyapunov exponents for the Lorenz system have also been evaluated from x(n) data with various levels of uniform random numbers added to the data; the sensitivity of the estimates to such contamination is clearly seen.
2. Trajectory tracing method
As an alternative to finding numerical values for the local Jacobians of the dynamics along the orbit, one can attempt to follow small orbit differences (Wolf et al., 1985) and small areas defined by two orbits, and then small subvolumes defined by three or more orbits, to sequentially extract values for the Lyapunov exponents. These methods, while popular, almost always give accurate values for only the largest Lyapunov exponent \lambda_1, due to inevitable numerical errors. Nonetheless, we describe this method and alert the reader to its sensitivity.

Assume we have a large collection of experimental points from a long time series. What space our data are in is not important, whether it be the natural space of the defining dynamics or an embedding by time delays or whatever. We want to find points in our sample library that have close neighbors from different temporal segments of the library. Call these points x(t) and y(t). We then keep track of how they deviate under dynamical evolution until they become separated by a distance greater than some threshold and/or another trajectory segment passes closer to the presently monitored one. When either of these conditions is met, the ratio
    \frac{1}{n} \frac{\| y(n+t) - x(n+t) \|}{\| y(t) - x(t) \|}   (70)
is calculated and a new pair of neighboring points is observed. The largest Lyapunov exponent is estimated by
the log of the long-term average of the above quantity.
Strictly speaking, we want the asymptotic value as n \to \infty. In practice, we are happy to be able to trace a trajectory in this manner for several dozen characteristic time steps.
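A rough transcription of this trajectory-tracing idea, assuming the data are already embedded as vectors Y; the temporal-decorrelation window and the number of steps to follow before renormalizing are tunable assumptions:

```python
import numpy as np

def largest_exponent_wolf(Y, dt=1.0, min_sep=50, follow=20):
    """Sketch of the trajectory-tracing estimate of lambda_1: follow the
    separation of a reference orbit and its nearest neighbor, then renormalize."""
    n = len(Y)
    log_stretch, steps = 0.0, 0
    i = 0
    while i + follow < n:
        # Nearest neighbor from a different temporal segment of the library.
        dist = np.sqrt(np.sum((Y - Y[i]) ** 2, axis=1))
        dist[max(0, i - min_sep) : i + min_sep] = np.inf
        j = np.argmin(dist[: n - follow])
        d0 = dist[j]
        d1 = np.linalg.norm(Y[i + follow] - Y[j + follow])
        if np.isfinite(d0) and d0 > 0 and d1 > 0:
            log_stretch += np.log(d1 / d0)
            steps += follow
        i += follow
    return log_stretch / (steps * dt) if steps else 0.0
```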
If a large sample of observation points is available, we
may adopt a numerical trajectory-tracing technique
motivated by an alternative, but equivalent, definition of
the largest Lyapunov exponent:
    \lambda_{max} = \lim_{n \to \infty} \frac{1}{n} \log \| DF^n \cdot u \| ,   (71)
where u is a nearly randomly oriented unit vector perturbation in the tangent space. We say nearly because if u
should just happen to fall precisely along the stable manifold, it would remain within that invariant set and not relax onto the most unstable direction.
Our task is to find points in our sample library that are
very close together and watch how trajectories specified
by following the points separate. In locating the initial
neighboring points we must not consider points that are
from the same temporal segment of the library. Let us
denote the starting point of our reference orbit as r(0) = y(i) and call its nearest spatial neighbor v_1(0) = y(j). We require that i and j differ by some minimum decorrelation length.
Now consider following the evolving trajectories emanating from these two points. Initially the points are separated by some distance \|\Delta_0\|, and we want to see how this distance grows under dynamical evolution. Therefore we continue to calculate a series of separation distances

    \|\Delta_k\| = \| r(k) - v_1(k) \| = \| y(i+k) - y(j+k) \|   (72)
until the separation exceeds some threshold at a step k_{max}, at which point the local growth ratio

    \frac{\|\Delta_{k_{max}}\|}{\|\Delta_0\|}   (73)

is recorded, a new neighboring point is selected, and the procedure is repeated. From each such segment we estimate

    \lambda_1 \approx \frac{1}{k_{max}} \log \frac{\|\Delta_{k_{max}}\|}{\|\Delta_0\|} ,   (75)

and average the estimates along the whole orbit.
To determine the second Lyapunov exponent we repeat the above tracing procedure, but monitor the evolution of area elements rather than separation lengths. Consider the same starting points used in the above example, r(0) and v_1(0). We want to augment this pair with another neighboring point w_1(0), which is close to r(0) while being in a direction orthogonal to \delta_0, the separation between r(0) and v_1(0). Call the separation vector between r(0) and w_1(0) \gamma_0. Then \delta_0 and \gamma_0 define a fiducial area element, which we can try to follow under dynamical evolution. We again follow r(k), v_1(k), and now w_1(k) also, at each point calculating the area spanned by \delta_k and \gamma_k.

However, following this two-dimensional element introduces further complications because, in general, \gamma_k will tend to become aligned with \delta_k along the unstable direction. Therefore, in addition to reinitializing v and w when the spanned area grows above a threshold value, we must carefully reorient w to keep it as orthogonal to v as the data resources permit.
Clearly the trajectory-tracing method requires a substantial quantity of data to facilitate the aligned updating procedure. If we are unable to locate nearby update points in the required directions, our exponents will be underestimated, since the maximal expansion rates will not be observed. Although the basic idea generalizes easily to higher-dimensional subspace elements for estimating the rest of the exponents, the precision and quantity of data required, especially in larger dimensions, make this technique impractical for anything beyond the largest couple of exponents.
3. Spurious exponents
One difficulty associated with working in reconstructed phase spaces is that artifacts arising from inefficiencies of representation inherent in the ad hoc embedding can be annoying. A good example of such an annoyance is the presence of spurious Lyapunov exponents when the dimension of the reconstructed space exceeds the dimension of the dynamics' natural space d_N.
Analysis of a dynamical system in a d-dimensional space necessarily produces d Lyapunov exponents. And Takens' theorem tells us that the embedding space may have to be as large as d = 2d_N + 1. So it is possible, even in the ideal noise-free case, to have d_N + 1 spurious exponents in addition to the d_N real ones.
Takens' theorem assures us that dN of the exponents
we calculate in a large enough embedding space will be
the same as those we could calculate from a complete set
of measurements in the system's natural space. But little
was known about the behavior of the spurious exponents
until recent work clarified the robustness of the numerical estimation techniques.
First clues were provided by Brown et al. (Bryant et al., 1990; Brown et al., 1991) when they investigated the use of higher-order polynomial interpolants for modeling the dynamics in the reconstructed phase space. Identifying the directions associated with each computed Lyapunov exponent, they demonstrated that very few sample points could be found along the directions of the spurious exponents. Essentially, the directions of the real exponents were thickly populated, indicating there was actual dynamical structure along these directions. The directions of the artificial exponents showed no structure beyond a thickness attributable to noise and numerical error.
Recently another method has been introduced for discriminating between real and spurious exponents. Based on the idea that most of the deterministic dynamical systems of physical interest are invertible, we first perform the typical analysis on the given time series, and then we reverse the time series, relearn the reversed dynamics, and proceed to calculate exponents for this backward data. Under time reversal, stable directions become unstable, and vice versa, so the real exponents are expected simply to change sign. The spurious exponents, on the other hand, being artifacts of the geometry imposed by our ad hoc embedding, will not, in general, behave so nicely. In fact, experience seems to show that the artificial exponents vary strongly under time reversal, making identification easy.
H. Local exponents from known dynamics
and from observations
Global Lyapunov exponents tell us how sensitive the dynamics is on the average over the attractor. A positive Lyapunov exponent assures us that orbits will be exponentially sensitive to initial conditions, to perturbations, and to roundoff errors. No information is gained from them about the local behavior on an attractor. Just as the inhomogeneity of the attractor leads to interesting characterization of powers of the number density, so it is natural to ask about the local behavior of instabilities in various parts of state space.
The local Lyapunov exponents \lambda_a(y, L) are defined from the Oseledec matrix composed along L steps of the orbit starting at y,

    OSL(y, L) = [ DF^L(y) \cdot [DF^L(y)]^T ]^{1/2L} ;   (76)

the logarithms of its eigenvalues are the \lambda_a(y, L). As the number of composition steps grows,

    \lim_{L \to \infty} \lambda_a(y, L) = \lambda_a ,   (77)

independent of y, and the moments \sigma_a(p, L) of the fluctuations of \lambda_a(y, L) about \lambda_a satisfy \sigma_a(p, L) \to 0 for L \to \infty and p \ge 2, again by the Oseledec multiplicative ergodic theorem.

The behavior of the \lambda_a(y, L) can also be captured in a probability density for values of each exponent, for a fixed L: the probability density for the ath exponent to lie in the interval [u, u + du] after L steps along the orbit. Averaging over the attractor with the invariant density gives the mean local exponent

    \bar{\lambda}_a(L) = \int d^d y \, \rho(y) \, \lambda_a(y, L) ,   (79)

and its variance

    \int d^d y \, \rho(y) [\lambda_a(y, L) - \bar{\lambda}_a(L)]^2 = \frac{1}{N} \sum_{k=1}^{N} [\lambda_a(y(k), L) - \bar{\lambda}_a(L)]^2 .   (80)
FIG. 22. Average local Lyapunov exponents for Lorenz attractor calculated from the Lorenz equations (11) as a function of
the composition length L. Averages are over 10000 initial conditions.
To compute OSL(y, L) in a numerically stable way, write the product as a sequence of QR decompositions:

    DF^L(y) \cdot [DF^L(y)]^T = Q_1(2L) R_1(2L) R_1(2L-1) \cdots R_1(1) .   (84)

If Q_1(2L) were the identity matrix, the eigenvalues of OSL(y, L) would all lie in the "upper triangular" part of the QR decomposition. In general, Q_1(2L) is not the identity. We follow the construction in Stoer and Bulirsch (1980) and make a similarity transformation to

    Q_1^T(2L) Q_1(2L) R_1(2L) R_1(2L-1) \cdots R_1(1) Q_1(2L) = R_1(2L) R_1(2L-1) \cdots R_1(1) Q_1(2L) .   (85)
Repeating the decomposition K times produces a product of upper triangular matrices multiplied by an orthogonal factor Q_K(2L) which rapidly approaches the identity, and the local exponents are read off the diagonals:

    \lambda_a(y, L) = \frac{1}{2L\tau_s} \sum_{j=1}^{2L} \log [R_j(a, a)] .   (88)

In practice only a few iterations, K = 2 or so, are ever needed to assure that Q_K(2L) differs from the identity matrix by a part in 10^{-5} or 10^{-6}.
FIG. 26. Average local Lyapunov exponents for data from the Ikeda map; 60 000 points are used. d_E = 2, d_L = 2; 1500 initial conditions.
In Figs. 25 and 26 we show the local Lyapunov exponents determined from observations of the x(n) data from the Lorenz equations and of Re[z(n)] from the Ikeda map. In the former case we used d_E = 4 and local polynomials of dimension d_L = 3 for the neighborhood-to-neighborhood maps, while in the latter, d_E = 4 and d_L = 2. We shall discuss below how we know d_L = 2 for the Ikeda map.
I. Topological invariants
An ideal strange attractor, which never has a coincidence between stable and unstable directions, has embedded within it a dense set of unstable periodic orbits (Devaney and Nitecki, 1979; Auerbach et al., 1987; Cvitanović, 1988; Melvin and Tufillaro, 1991). Such a hyperbolic attractor is rarely, if ever, found in physical settings, but the approximation of real strange attractors by this idealization can be quite rewarding. Indeed, the Hénon and Ikeda maps used as examples throughout this article seem numerically to possess many of the properties of the "ideal" strange attractor. The topological properties of the linkings among these unstable periodic orbits characterize classes of dynamics, and these topological classifications remain true even when the parameters of the vector field governing the dynamics are changed. So when one goes from stable periodic behavior to quasiperiodic behavior to chaos, the classification by integers or ratios of integers characteristic of topological invariants remains the same. This is in direct contrast to the various invariant properties we have discussed heretofore. It is clear that, if one were able to extract these topological invariants from experimental observations, they could be used to exclude incorrect dynamics and to point to classes of dynamics that could be further investigated, say, by comparison with specific models.
For this kind of analysis the attractor must be presented in three dimensions, and one convenient choice of coordinates is built from the running sum of the signal, the signal itself, and its first difference:

    y_1(n) = \sum_{k=1}^{n} s(k) ,
    y_2(n) = s(n) ,   (89)
    y_3(n) = s(n) - s(n-1) .
This is a perfectly fine embedding, and in the study of experiments on chemical reactions (Mindlin et al., 1991) and laser physics (Papoff et al., 1992) it has served to unfold the unstable periodic orbits in three dimensions. One may not always be able to accomplish this, but when it is possible, tools are then available for further analysis, in particular the analysis of the linking of low-order unstable periodic orbits extracted from the data.
VI. NONLINEAR MODEL BUILDING: PREDICTION IN CHAOS

In building a model y \to F(y, a), with a a set of parameters, we need a measure of the error

    e(k) = y(k+1) - F(y(k), a) .   (90)

The cost function for this error we shall call W(e). If the map F(y, a) we are constructing is local, then for each neighbor of y(k), y^{(r)}(k), r = 1, 2, \ldots, N_B,

    e(r, k) = y(r, k+1) - F(y^{(r)}(k), a) ;   (91)

y(r, k+1) is the point in phase space to which the neighbor y^{(r)}(k) evolves; it is not necessarily the rth nearest neighbor of y(k+1). For a least-squares metric the local neighborhood-to-neighborhood cost function would be

    W(e, k) = \frac{\sum_{r=1}^{N_B} |e(r, k)|^2}{\sum_k [y(k) - \langle y(k) \rangle]^2} ,   (92)

and so forth. An interesting option for modeling is to require the parameters a to be selected by maximizing the average mutual information between the y(r, k+1) over the neighborhood and the F(y^{(r)}(k), a); i.e., maximize

    \sum_{r=1}^{N_B} P[y(r, k+1), F(y^{(r)}(k), a)] \log_2 \frac{P[y(r, k+1), F(y^{(r)}(k), a)]}{P[y(r, k+1)] \, P[F(y^{(r)}(k), a)]} .   (93)

The map itself may be expanded in a set of basis functions,

    F_a(x) = \sum_{a} c(a) \phi_a(x) ,   (95)

with the basis orthonormal with respect to the invariant density on the attractor,

    \int d^d x \, \rho(x) \phi_a(x) \phi_b(x) = \delta_{ab} .   (96)
For a local linear model, F(y) = A \cdot y + b, the cost function is

    W = \sum_{r=1}^{N_B} \| y(r, k+1) - A \, y^{(r)}(k) - b \|^2 .   (97)
FIG. 27. Local Lyapunov exponents calculated from data generated by the Ikeda map, d_E = 4, d_L = 4. Forward exponents are marked by open symbols, minus backward exponents by solid symbols. Clearly, there are two true exponents and two spurious ones.
precision as desired. The advantage of using local Lyapunov exponents is simply that one has more points of direct comparison between forward exponents and negative backward exponents than just the comparison of global exponents. Further, if the data come from a differential equation, local exponents are able to identify true and false versions of the zero exponent, which as a global exponent will not change sign under time reversal.

In Fig. 27 are shown the average local Lyapunov exponents for the Ikeda map with d_E = 4, which is larger than d_L = 2 but is the dimension suggested by the global false-nearest-neighbor algorithm. Forward and minus backward \lambda_a(L) are displayed. Evidently only two exponents satisfy the condition of equality of forward and minus backward average local Lyapunov exponents. Decreasing d_L from four to three and then to two, we observe (but do not display) the same phenomenon: only two exponents satisfy the required time-reversal condition.
B. Local modeling
We now assume that our data are embedded in an appropriate phase space, and we have determined the dimension of the model. The problem now is to reconstruct the deterministic rule underlying the data.

We start our discussion with the simplest and earliest nonlinear method of local forecasting, which was suggested by E. Lorenz (1969). Let us propose to predict the value of y(k+1) knowing a long time series of y(j) for j \le k. In the "method of analogs" we find the nearest neighbor to the current value of y(k), say y(m), and then assume that y(m+1) is the predicted value for y(k+1). This is pure persistence, and is not much of a model. In the classification by Farmer and Sidorowich (1988) it is called a first-order approximation.
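The method of analogs is a one-line lookup once the data are embedded; a sketch, with an exclusion window guarding against temporally adjacent (trivially close) points:

```python
import numpy as np

def predict_analog(Y, k, exclude=10):
    """Lorenz's method of analogs: the forecast for y(k+1) is y(m+1), where
    y(m) is the nearest past neighbor of y(k). Requires k > exclude."""
    dist = np.sqrt(np.sum((Y[: k - exclude] - Y[k]) ** 2, axis=1))
    m = np.argmin(dist)
    return Y[m + 1]
```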
Clearly, the forecast can be improved by averaging the predictions over the images of all N_B nearest neighbors of y(k),

    y(k+1) = \frac{1}{N_B} \sum_{r=1}^{N_B} y(r, k+1) ,   (98)

and, more generally, by expanding the local dynamics in a Taylor series about a point x_n; for a nearby point displaced by \delta,

    F(x_n + \delta) = F(x_n) + DF(x_n) \, \delta + \tfrac{1}{2} D^2 F(x_n) \, \delta^2 + \tfrac{1}{6} D^3 F(x_n) \, \delta^3 + \cdots .   (99)
With N data points distributed over a d_A-dimensional attractor, the typical distance between a point and its near neighbors scales as

    \delta \sim N^{-1/d_A} .   (101)

This expression for the average distance between neighboring points can be used to replace \delta in Eq. (99). Next there are the derivative factors in each term of Eq. (99). As we have seen in Sec. V, the Lyapunov exponents can be viewed as measuring the average derivative of the dynamics over the attractor. Discrepancies between our forecasts and the actual points on the evolving trajectory grow much in the same way that initially nearby points separate under the dynamics, and this expansion rate is measured by the largest Lyapunov exponent:

    \langle DF(x) \rangle \sim \exp(\lambda_1 \tau) .   (102)

Keeping terms through order m in Eq. (99), the error of a direct forecast a time \tau ahead then scales as

    \epsilon \sim N^{-(m+1)/d_A} \exp[(m+1) \lambda_1 \tau] ,   (104)

while iterating a one-step model forward over the same interval scales as

    \epsilon \sim N^{-(m+1)/d_A} \exp[\lambda_1 \tau_{step}] ,   (106)

so iterated forecasts are generally more accurate than direct ones.
FIG. 29. Direct and iterative linear forecast error for hysteretic circuit data, as a function of the forecast interval.
C. Global modeling
The collection of local polynomial (or other basis function) maps form some global model. The shortcomings
of such a global model are its discontinuities and extremely large number of adjustable parameters. The
latter is clearly a penalty for high accuracy. At the same
time, it would be nice to have a relatively simple continuous model describing the whole collection of data. A
number of solely global models has been invented which
present a closed functional representation of the dynamics in the whole phase space (or, at least, on the whole attractor).
Each method uses some expansion of the dynamical vector field F(x) in a set of basis functions on R^d. The first such global method that comes to mind is to use polynomials again. Their advantage in local modeling, where least-squares fitting works well, is now reduced by the extremely large number of data points and the need to use rather high-order polynomials. There is an attractive approach to finding a polynomial representation of a global map: measure-based functional reconstruction (Giona et al., 1991; Brown, 1992), which uses orthogonal polynomials whose weights are determined by the invariant density on the attractor. The method eliminates the problem of multiparameter optimization: finding the coefficients of the polynomials and the coefficients of the function F(y) requires only the computation of moments of the data points in phase space. It works as follows.
We introduce polynomials φ_a(x) on R^d which are orthogonal with respect to the natural invariant density ρ(x) on the attractor,

∫ d^d x ρ(x) φ_a(x) φ_b(x) = δ_ab .  (107)

The global map is then represented as the truncated expansion

F_M(x) = Σ_{a=1}^{M} c(a) φ_a(x) ,  (109)

whose coefficients are simply moments of the data over the attractor,

c(a) = (1/N) Σ_{k=1}^{N} y(k+1) φ_a(y(k)) .  (110)

A related class of global models rests on kernel density estimation. One estimates the density of points on the attractor as

f(y) = Σ_i K(‖y − y_i‖) ,  (111)

and the joint density of successive points as

p_c(y|z) = Σ_i K(‖y − y_{i+1}‖) K(‖z − y_i‖) .  (112)

Kernel density estimation can then be used for conditional forecasting by the rule

y(k+1) = ∫ dx p_c(x|y(k)) x .  (113)
Kernel density estimation usually provides about the same accuracy as first-order local predictors. It has the advantage of being quite stable even with noisy data (Casdagli et al., 1991).
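Discretized with a Gaussian kernel, the rule (113) reduces to a weighted average of the observed successors of points near y(k); a minimal sketch (ours; the bandwidth h is a hypothetical choice one would tune to the data):

    import numpy as np

    def kernel_forecast(Y, query, h=0.1):
        # Y: array of reconstructed state vectors; eq. (113) discretized:
        # average the observed successors Y[i+1], weighted by K(||query - Y[i]||)
        d = np.linalg.norm(Y[:-1] - query, axis=1)
        w = np.exp(-(d / h) ** 2)
        return (w[:, None] * Y[1:]).sum(axis=0) / w.sum()

    # example with two-dimensional Henon states
    a, b = 1.4, 0.3
    Y = np.empty((2000, 2))
    Y[0] = (0.1, 0.1)
    for n in range(1999):
        Y[n + 1] = (1.0 + Y[n, 1] - a * Y[n, 0] ** 2, b * Y[n, 0])

    query, truth = Y[-2], Y[-1]
    print("predicted:", kernel_forecast(Y[:-2], query), "actual:", truth)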
Again, computing the conditional probability distribution, one can impose weights in order to attach more
value to the points close to the starting point of prediction (both in time and in phase space). In fact, this leads
to a class of models that are hybrids of local and global
ones. Moreover, it allows one to construct a model that
not only possesses good predicting properties but also
preserves important invariants of the dynamics. The prediction model by Abarbanel, Brown, and Kadtke (1989)
belongs to this class. It parametrizes the mapping in the
form
F(y; a) = Σ_{k=1}^{N} y(k+1) g(y, y(k); a) ,  (114)

where the function g(y, y(k); a) is the analog of the kernel function; that is, it is near 1 for y ≈ y(k) and vanishes rapidly for growing ‖y − y(k)‖, and a is a set of parameters. The cost function to be minimized is chosen as

C(a) = Σ_{k=1}^{N} ‖y(k+1) − F(y(k); a)‖² / Σ_{k=1}^{N} ‖y(k)‖² ,  (115)

and a convenient sigmoidal building block for g is

g(x) = [1 + tanh(x)]/2 .  (116)
FIG. 30. Schematic of a feedforward neural network: a row of input units feeds hidden units and a single output unit.
Another route to global modeling is provided by neural networks. A feedforward net takes the components of the delay vector as its input units and returns the forecast

y(k+p) = F(y(k), y(k−1), …, y(k−d_L+1); {T_ij}, {θ_i}) ,  (118)

and training the net means minimizing the error over a training set,

E = Σ_{k=1}^{N_s} ‖y_net(k+p) − y(k+p)‖ ,  (119)
where N_s is the number of data in the training set. Depending on which data are used for the training, the neural net can be considered either a local or a global model. In the first case only N_c ≪ N near neighbors are used for feeding the net, while in the second one uses all the data. The fitted parameters are the weights {T_ij} and the thresholds {θ_i}. Unfortunately, this learning is a kind of nonlinear
least-squares problem, which takes much longer than
linear fitting to run on regular computers. Nevertheless,
neural networks present great opportunities for parallel
implementation, and in the future that may become the
most useful technique for forecasting.
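A minimal sketch of such a network, with one hidden layer of tanh units trained by gradient descent on the cost (119) (ours; the layer size, learning rate, and logistic-map training data are illustrative choices, not those of the studies cited):

    import numpy as np

    rng = np.random.default_rng(0)

    # logistic-map data as a stand-in for measured chaos
    y = np.empty(2000)
    y[0] = 0.3
    for i in range(1999):
        y[i + 1] = 3.9 * y[i] * (1.0 - y[i])
    X, t = y[:-1, None], y[1:, None]

    # one hidden layer of tanh units, linear output unit
    H = 16
    W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)
    lr = 0.05
    for epoch in range(2000):
        h = np.tanh(X @ W1 + b1)            # hidden activations
        p = h @ W2 + b2                      # network prediction
        e = p - t                            # prediction error, cf. eq. (119)
        # backpropagate the squared-error cost
        gW2 = h.T @ e / len(X); gb2 = e.mean(0)
        dh = (e @ W2.T) * (1 - h ** 2)
        gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
        W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

    print("rms one-step error:", np.sqrt(np.mean((p - t) ** 2)))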
Neural nets are quite useful when we have simple dynamics and can hope that a net with an a priori architecture will work well. Then the only thing to do is to train this net, i.e., to adjust the weights of connections
and thresholds. Adjustment of the architecture itself is a
secondary (and very often omitted) procedure in standard neural-network modeling. The approach taken by genetic learning algorithms (Goldberg, 1989) represents quite the opposite point of view: it starts by finding an architecture based on the data set. Sorting out different architectures for complex dynamics in a high-dimensional phase space is a very difficult task. To accomplish this, genetic algorithms utilize Darwinian evolution expressed in terms of variables (data), mathematical conditions (architecture), and quality of fit (the criterion
for sorting out the better architecture). Specifically, if we
have a set of pairs (x_p, y_p), where each x_p is mapped into y_p in some d_L-dimensional space, then we try to learn the set of conditions (genes) imposed on {x_p} under which the quality of fit associated with {y_p} is extremal. In particular, we can seek conditions determining the subspaces
of the whole phase space for which the distribution of
{y_p} is sharpest (Packard, 1990). The search for the appropriate set of conditions is performed by a kind of natural selection: we start with a number of random sets of conditions and let them evolve, by selection and mutation, toward the sets that give the best quality of fit.
VII. SIGNAL SEPARATION: "NOISE" REDUCTION

A. Knowing the dynamics: manifold decomposition

Suppose the observations are y(k) = y₁(k) + z(k), where y₁(k) is an orbit of the known dynamics, y₁(k+1) = F₁(y₁(k)) (123), and z(k) is contamination. One seeks an estimate y_E(k) of the true orbit by minimizing a least-squares cost that balances fidelity to the observations against the requirement that the estimate satisfy the dynamics, Σ_{k=1}^{N−1} ‖y_E(k+1) − F₁(y_E(k))‖² [Eq. (122)]. The estimate is improved iteratively: at pass p,

y_E(k, p+1) = y_E(k, p) + Δ(k, p) ,  (124)

and linearizing the known dynamics about the current estimate,

F₁(y_E(k,p) + Δ(k,p)) − y_E(k+1, p) ≈ DF₁(y_E(k,p))·Δ(k,p) + ε_D(k,p) ,

where ε_D(k,p) = F₁(y_E(k,p)) − y_E(k+1,p) is the deterministic error, namely, the amount by which the estimate at the pth step fails to satisfy the dynamics.
Near homoclinic tangencies the linearized update fails to describe the process precisely, and the estimates show brief large excursions before the orbit recovers. What we show in Fig. 31 is the absolute error ‖y_E(k,p) − y₁(k)‖ as a function of k. A plot of the deterministic error D(k) shows no peaks and simply displays noise at the level of machine precision.
The "glitches" caused by the homoclinic tangencies
can be cured in at least two ways (Abarbanel et al.9 1993)
that we are aware of. First, one can go back to the full
least-squares problem stated above and in a region
around the tangency do a full diagonalization of the matrices that enter. One can know where these tangencies
will occur because the Jacobians, which we know analytically since we know F ^ x ) , are evaluated at the present
best estimate yE{k,p).
The alternative solution is to scale the step Δ(k,p) we take in updating the estimated trajectory by a small factor in all regions where a homoclinic
tangency might occur. One can do this automatically by
putting in a scale factor proportional to the angle between stable and unstable directions or by doing it uniformly across the orbit. The penalty one pays is a larger
number of iterations to achieve the same absolute error,
but the glitches are avoided.
In either of these cases one arrives at an estimate of the
true orbit y\(k) which is accurate to the roundoff error of
one's machine. So, if one has an orbit of a nonlinear
dynamical system that in itself is of some importance, but
it is observed to be contaminated by another signal, one
can extract that orbit extremely accurately if the dynamics are known. The limitations on the method come from
the relative amplitudes of the contamination and the
chaotic signal. In practice, when the contamination is
10% or so of the chaos, the method works unreliably.
The other important application of the method is to use the chaotic signal as a mask for some signal of interest which has no relation to the dynamics itself.
B. Knowing a signal: probabilistic cleaning

When a clean reference orbit y_R(j) of the system is available, one can estimate which reference point underlies each contaminated observation y_c(m) by forming kernel weights that compare the present and preceding observations with the reference orbit, built from products of the form K(y_c(m) − y_R(j)) K(y_c(m−1) − y_R(j′)) [Eqs. (126)-(128)].
The conditional probability P({s}|{y_c}) is estimated by assuming that the contamination z is independent of the y_c, so

P(s(j); j = 1, …, m | y_c(k); k = 1, …, m) = Π_{k=1}^{m} P_z(y_c(k) − s(k))
  = P(s(j); j = 1, …, m−1 | y_c(k); k = 1, …, m−1) P_z(y_c(m) − s(m)) ,  (129)

so the probability is built up recursively, one observation at a time.
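In logarithmic form the recursion (129) is a running sum over observations; a minimal sketch (ours, assuming for illustration a Gaussian contamination density P_z of width sigma_z):

    import numpy as np

    def log_prob_update(logP, yc_m, s_m, sigma_z=0.1):
        # eq. (129): multiply the running probability by P_z(yc(m) - s(m));
        # P_z is assumed Gaussian here, and we work with logs for stability
        r = yc_m - s_m
        return logP - 0.5 * (r / sigma_z) ** 2 - np.log(sigma_z * np.sqrt(2 * np.pi))

    logP = 0.0
    yc = np.array([0.31, 0.84, 0.52])   # contaminated observations (illustrative)
    s = np.array([0.30, 0.82, 0.55])    # one candidate clean signal
    for m in range(len(yc)):
        logP = log_prob_update(logP, yc[m], s[m])
    print("log P({s}|{y_c}) =", logP)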
FIG. 32. Data from the Lorenz attractor contaminated with uniform noise in [−12.57, 12.57].
FIG. 33. The cleaned version of the data in Fig. 32. Thirty-five passes of the probabilistic cleaning algorithm are used. The final signal-to-noise ratio, compared with the known clean data, is 12.6 dB.
FIG. 34. A pulse added to data from the Henon map. The signal-to-noise ratio is 20 dB; the pulse is in the neighborhood of time step 75.
When we have no foreknowledge of the signal, the dynamics, or the contaminant, we must proceed with a number of assumptions, explicit or implicit. We cannot expect any method to be perfectly general, of course, but within some broad domain of problems we can expect to separate signals with some success.
In the work that has been done in this area there have been two general strategies. The first is to make local polynomial maps using neighborhood-to-neighborhood information, and then to adjust each point to conform to the map determined by a collection of points, i.e., by the whole neighborhood. In some broad sense a local map is determined with some kind of averaging over domains of state space, and this averaged dynamics is then used to correct the individual observations.
Consider next the effect of passing the observations through a linear filter before analysis. A moving-average or finite-impulse-response (FIR) filter produces

S(n) = Σ_j h(j) s(n−j) .  (132)

If the sum runs over a finite window of the data,

S(n) = Σ_{j=0}^{M} h(j) s(n−j) ,  (133)

then the dimension and other characteristics of the attractor are unchanged. This is more or less obvious, since such a filter is simply a linear change of coordinates from the original s(k) to S(k). By the embedding theorem, invariant properties of the system seen in the reconstructed phase space should be unaltered.
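For instance (our illustration), an FIR filter is a fixed linear combination of a finite window of the data:

    import numpy as np

    # Henon x-coordinate as a stand-in for an observed chaotic signal
    x = np.empty(5000)
    xn, yn = 0.1, 0.1
    for n in range(5000):
        x[n] = xn
        xn, yn = 1.0 + yn - 1.4 * xn * xn, 0.3 * xn

    h = np.array([0.5, 0.3, 0.2])           # hypothetical filter taps h(j)
    S = np.convolve(x, h, mode="valid")     # S(n) = sum_j h(j) x(n-j), eq. (133)
    # S is a smooth, static function of a finite window of x, i.e., a change
    # of coordinates on the reconstructed phase space: invariants survive.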
In the case of an autoregressive or infinite-impulse-response (IIR) filter, the story is different. A model of this kind of filter is

s(n+1) = F(s(n)) ,  (134)

S(n+1) = a S(n) + g(s(n)) ,  (135)

so that the filter output S(n) depends on the entire past of the signal through terms of the form a^k g(s(n−k)). The filter thereby contributes its own exponent, ln a, to the Lyapunov spectrum governing the observed signal, and when this exponent falls within the range of the system's own exponents, dimensions and other invariants evaluated from S(n) can be altered.
We turn now to the control of chaotic motions, in which small changes in an accessible parameter p are used to stabilize one of the unstable periodic orbits embedded in the attractor. On a surface of section the dynamics is v(n+1) = F(v(n); p), and at the operating point p = p* an unstable fixed point v_M(p) satisfies

v_M(p*) = F(v_M(p*); p*) .  (136)

Near this fixed point the linearized map can be decomposed as

DF(v_M) = λ_u u f_u + λ_s s f_s ,  (137)
where λ_s and λ_u are the stable and unstable eigenvalues, and s and u are the stable and unstable eigenvectors. The vectors f_s and f_u are the stable and unstable contravariant eigenvectors, defined by f_u·u = 1, f_u·s = 0, f_s·s = 1, and f_s·u = 0.
Equations (136) and (137) and the vector g = ∂v_M(p)/∂p are all that is necessary to define the control mechanism. The motion along the stable manifold brings the original orbit very rapidly into the vicinity of the unstable periodic orbit v_M. Once the orbit is in this vicinity, small changes in p proportional to v − v_M are made. This linearization defines a new linear map near v_M, and the parameter perturbation is chosen so that the next iterate falls back onto the stable manifold:

δp = λ_u f_u·(v − v_M) / [(λ_u − 1) f_u·g] .  (138)
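A minimal sketch of this kind of parametric control for the Henon map (ours; it uses the variant of Eq. (138) in which g is taken as ∂F/∂p rather than ∂v_M/∂p, and in a real experiment λ_u, f_u, and g would be estimated from data):

    import numpy as np

    a0, b = 1.4, 0.3
    xs = (-(1 - b) + np.sqrt((1 - b) ** 2 + 4 * a0)) / (2 * a0)
    vM = np.array([xs, b * xs])                   # unstable fixed point, eq. (136)

    J = np.array([[-2 * a0 * xs, 1.0], [b, 0.0]]) # DF at the fixed point
    evals, evecs = np.linalg.eig(J)
    iu = int(np.argmax(np.abs(evals)))
    lu = float(evals[iu].real)                    # unstable eigenvalue
    fu = np.linalg.inv(evecs)[iu].real            # contravariant vector, f_u.u = 1
    g = np.array([-xs ** 2, 0.0])                 # dF/da at the fixed point

    v = np.array([0.0, 0.0])
    for n in range(100000):
        dv = v - vM
        da = 0.0
        if np.linalg.norm(dv) < 0.02:
            # pick da so the unstable part of the next deviation vanishes:
            # f_u.(J dv + g da) = lu (f_u.dv) + (f_u.g) da = 0
            da = float(np.clip(-lu * (fu @ dv) / (fu @ g), -0.1, 0.1))
        a = a0 + da
        v = np.array([1.0 + v[1] - a * v[0] ** 2, b * v[0]])

    print("distance from fixed point:", np.linalg.norm(v - vM))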
This type of control mechanism has been verified experimentally on magnetoelastic ribbons (Ditto et al., 1990). The device involves a ribbon whose
Young's modulus changes under the influence of small
magnetic fields. An inverted ribbon is placed in the
Earth's gravitational field so that initially it buckles due
to gravity. By adding a vertical oscillating magnetic
field, one creates a driven chaotic oscillator. For this experiment the force driving the oscillation is gravity, while
the control parameter is the magnetic field strength. The
results of this experiment are shown in Figs. 37 and 38.
In Fig. 37 the iteration number represents passes
through the stroboscopic surface of section, which is
one period of the drive. The gray portion for iterates less
than approximately 2350 represents the chaotic motion
of the ribbon before the control is initiated. For an
iterate near 2350, the motion of the ribbon passes near an
unstable period-one orbit, and the control is initiated.
After initiating control the motion of the ribbon remains
near the periodic orbit. In the figure this is indicated by
the fact that the location of the orbit on the surface of
the section is constant near 3.4 for all subsequent intersections. The driving frequency of the magnetic field is typically 0.85 Hz; thus the period-one orbit has period 1/(0.85 Hz) ≈ 1.2 sec. The experimenters have been able to confine the motion of the ribbon to the unstable orbit for as long as 64 hours. Clearly the control mechanism works.
Figure 38 is an admirable piece of showing off on the
part of the researchers. First they control near a periodone orbit. Then they change their minds and control
near period-two orbits. Then they change their minds
again and control back on the original period-one orbit.
Control has also been demonstrated near period-four orbits.
FIG. 37. Time series of the voltages stroboscopically sampled from the sensor attached to the magnetoelastic ribbon, vs iteration number (Ditto et al., 1990). Control was initiated after iteration 2350.
FIG. 38. Time series of voltages from the magnetoelastic ribbon, vs iteration number. The system is switched from no control to controlling the (unstable) period-1 orbit (fixed point) at iteration 2360, to controlling a period-2 orbit at n = 4800, and back to the fixed point at n = 7100.
Another approach to control modifies the equations of motion directly. If the uncontrolled system is

dy/dt = F[y(t), t] ,  (139)

one adds a forcing term designed to drive the system toward a desired orbit y*(t),

dy/dt = F[y(t), t] + H[y(t) − y*(t), t; p] .  (140)

In the first method, by contrast, the role of external forces is to effect changes in the parameters of the system rather than the form of the system. The main distinctions are the use of the nonlinear dynamics of the system in the first method and the important fact that only infinitesimal external forces are required to change parameters. This latter point comes from the method's use of the exponential approach along stable manifolds.
3. Targeting
FIG. 40. Switching the direction of the flow in a thermoconvection experiment from clockwise to counterclockwise using the controller (Bau and Singer, 1992); the horizontal axes are time in minutes.
FIG. 41. Average time required to reach a typical target neighborhood from a typical initial position as a function of the inverse neighborhood size 1/ε: dashed line, no control; circles and solid line, with control (from Shinbrot et al., 1990).
In targeting one exploits the exponential sensitivity of chaotic orbits to steer the system quickly into a desired neighborhood of phase space using only small parameter perturbations. Writing the surface-of-section dynamics as v(n+1) = F[v(n); p] makes explicit the dependence of F on p. For now we shall use only one parameter, but that is not necessary. The change in one iteration on the surface due to a change in the parameter is

δv(n+1) = v̄(n+1) − v(n+1) ≈ (∂F[v(n)]/∂p) δp .

The chaotic dynamics then amplifies this perturbation at a rate set by the largest Lyapunov exponent, so the number of iterations needed for δv(n+1) to grow to the size ε of the target neighborhood scales only logarithmically with 1/ε; this is the speedup seen in Fig. 41.
Many physical phenomena can be modeled as an underlying dynamical system driven by external forces.
The most often studied types of driving are periodic and
steady forcing. A down-to-earth example is given by a
system that is driven with a period half that of the natural oscillation. This models pumping a playground
swing.
A type of driving that is new to our field of inquiry is
chaotic driving (Kowalski et al., 1990; Pecora and Carroll, 1990, 1991a, 1991b). Under this type of scenario one
has a dynamical system that is driven by the output of
another dynamical system. If the second dynamical system, the one that is providing the drive, is exhibiting
chaotic motion, then we say that the first system is undergoing chaotic driving. In Fig. 42 we schematically illustrate chaotic driving. The first dynamical system, the
one that is doing the driving, we call the driving system.
The second dynamical system, the one that is being driven, we call the response system.
Imagine dividing the two dynamical systems into three parts. The degrees of freedom of the driving system are u(t) and v(t), while those of the response system are called w(t). The vector fields for the drive and response are g, f, and h, and the equations of motion are
du/dt = g[u; v] ,

dv/dt = f[u; v] ,  (141)

dw/dt = h[w; v] .
The first and second equations of motion involve the
driving system. We have split it into two parts. The first
part represents the variables in the driving system that
are not involved in driving the response. The second part
represents the variables that are actually involved in the
driving of the second dynamical system. Finally, the
third equation represents the response system. We shall
assume that the combined system has n equations of
motion, while the partitions g, f, and h have k, m, and l equations of motion, respectively.
A simple familiar example is provided by a damped driven pendulum: θ̈ + γθ̇ + sin(θ) = cos(ωt). This system can be written in the form of Eqs. (141) through the identification of variables θ = w₁, θ̇ = w₂, φ = v, and φ̇ = u, where the drive φ(t) = cos(ωt) is generated by a simple harmonic oscillator. The result is written

ẇ₁ = w₂ ,
ẇ₂ = −γ w₂ − sin(w₁) + v ,
v̇ = u ,
u̇ = −ω² v .

In this example m = 1, k = 1, and l = 2. The driving system is a simple harmonic oscillator and the response is the damped pendulum.
In chaotic driving we replace the simple harmonic oscillator with a chaotic system. When the chaotic system
is identical to the response system the most dramatic
effects occur. Under these conditions the chaotic response
system may synchronize with the chaotic driving system.
Synchronization means here that the two systems follow
the same trajectory in phase space (Afraimovich et al., 1986). This occurs despite the fact that the response system has a different initial condition from the drive system. Under normal chaotic circumstances a different initial condition would imply that the two trajectories
would diverge exponentially on the attractor. Thus the
fact that chaotic driving can cause two systems to follow
the same trajectory for a long time is quite an attention
getter.
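A minimal numerical sketch of such synchronization for the Lorenz system driven through x (ours; crude Euler integration, adequate only for illustration):

    import numpy as np

    sigma, b, r = 16.0, 4.0, 45.92

    def lorenz(s):
        x, y, z = s
        return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

    dt = 0.0001
    d = np.array([1.0, 1.0, 1.0])    # drive system
    yr, zr = 25.0, -5.0              # response (y, z), different initial data
    for n in range(300000):
        d = d + dt * lorenz(d)       # Euler step of the drive
        x = d[0]                     # the transmitted drive signal
        yr = yr + dt * (r * x - yr - x * zr)   # response y, slaved to drive x
        zr = zr + dt * (x * yr - b * zr)       # response z, slaved to drive x

    print("synchronization error:", abs(yr - d[1]) + abs(zr - d[2]))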
When does synchronization occur? That is, when will two nearby initial conditions in w converge onto the same orbit? The condition that determines whether or not a driven dynamical system will synchronize is obtained by examining the values of the conditional Lyapunov exponents. To determine the conditional Lyapunov exponents, as well as to define them, we must simultaneously solve Eqs. (141) and the variational equations of the response system,

d(Δw)/dt = DF_w[v, w]·Δw .  (142)
TABLE III. Conditional Lyapunov exponents for drive/response subsystems of the Lorenz system with σ = 16, b = 4, and r = 45.92.

Drive    Response subsystem    Conditional Lyapunov exponents
x        (y, z)                (−2.50, −2.50)
y        (x, z)                (−2.95, −16.0)
z        (x, y)                (0.00789, −17.0)
The set of l Lyapunov exponents associated with Eqs. (141) and (142) are the conditional Lyapunov exponents. If all l conditional exponents are less than zero, then the subsystem w will synchronize [i.e., w′(t) → w(t) as t → ∞]. If we denote the conditional Lyapunov exponents by 0 > λ₁ ≥ λ₂ ≥ … ≥ λ_l, then the convergence rate is given by exp[λ₁t]. Numerical experiments on the Lorenz system are given in Table III. As the table shows, for the parameter values used one can implement either x or y as the drive. The corresponding subsystems [(y, z) in the case of the x drive and (x, z) in the case of the y drive] have two negative conditional Lyapunov exponents. The third possible subsystem has z as the drive and possesses a small positive conditional Lyapunov exponent, which implies that this type of drive/response system will not synchronize.
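The conditional exponents themselves follow from Eq. (142) by accumulating the linearized response flow with QR re-orthogonalization along a drive trajectory; a minimal sketch for the (y, z) subsystem driven by x (ours; the results should lie near the (−2.50, −2.50) entry of Table III, and their sum must approach the trace −(1 + b) = −5):

    import numpy as np

    sigma, b, r = 16.0, 4.0, 45.92
    dt, steps = 0.0001, 300000

    s = np.array([1.0, 1.0, 1.0])
    Q = np.eye(2)
    logs = np.zeros(2)
    for n in range(steps):
        x, y, z = s
        s = s + dt * np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])
        # Jacobian of the (y, z) response equations at the current drive state
        Dh = np.array([[-1.0, -x], [x, -b]])
        M = np.eye(2) + dt * Dh      # one Euler step of the variational eq. (142)
        Q, R = np.linalg.qr(M @ Q)
        logs += np.log(np.abs(np.diag(R)))

    print("conditional exponents:", logs / (steps * dt))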
One can cascade several dynamical systems together
(Carroll and Pecora, 1992; Oppenheim et al., 1992) and
achieve synchronization in all of them by using only one
drive. An example of this type of system is given in Fig.
43. Here we show a system of three cascaded Lorenz systems. The x coordinate of the first is used to drive the
second. Similarly, the y coordinate of the second is used
to drive the third. One could extend this cascade
indefinitely without changing the results. Moreover,
since the second and third systems respond to the drive
of the first system there is really only one drive, namely,
the x coordinate of the first system.
X. SPATIO-TEMPORAL CHAOS
As we remarked in the Introduction, the study of spatio-temporal chaos is much further from a full understanding than that of solely temporal chaotic behavior. The situation is changing quite quickly, and while we shall
merely touch upon our own expectation of where this
area will go, it seems certain that the developments will
bring the analysis of spatio-temporal observations to the
same stage as that of temporal chaos, reviewed in this article. So what we include now is less a review than, we
hope, a preview to indicate how the subject might develop from the results reviewed throughout this article.
In the last few years increasing attention has naturally
been focused on spatio-temporal chaos in various experimental setups such as Rayleigh-Benard convection
(Manneville, 1990), Taylor-Couette flow (Anderek and
Hayot, 1992), and surface excitations of the Faraday sort
(Gollub and Ramshankar, 1991). On the theoretical side
there have been extensive studies of cellular automata
which provide coarse graining in the dynamical variables
as well as in the independent variables of space and time,
coupled map lattices (Keeler and Farmer, 1986), which
study the interaction between known chaotic maps coupled together on spatial lattices, and model physical
equations such as the Kuramoto-Sivashinsky and various
Ginzburg-Landau equations (Kuramoto, 1984). Discussion of those studies as well as relevant mechanisms of
transition to chaos go far beyond our scope. In that regard, however, we recommend the review by Cross and
Hohenberg (1993), which reports on studies in which the
instabilities of linearized dynamics can be traced as the
origin of observed space-time patterns in a variety of experiments.
Our present aim is to outline the approach to analysis
of data from spatio-temporal systems following the philosophy and methods developed for the pure temporal
cases in previous sections. In other words, we need to develop machinery that allows us to calculate meaningful
characteristics identifying the spatio-temporal behavior
and to build models for prediction of behavior of the
spatio-temporal system, and to do all of this working with data.
The main difference (and difficulty) which comes with
spatial degrees of freedom is the very high (or even, typically, infinite) dimension of the system. Of course, one
may take only one observable from one space location
and try to reconstruct the dynamics in the usual way.
The idea would be that since the system is totally coupled, by the embedding theorem, any measurement
should reveal the full structure of all degrees of freedom
in interaction. This standard approach of treating a
spatio-temporal system as though it were purely temporal, while possible in principle, is practically infeasible
with existing or projected computational capability. It
seems much more promising to account for the spatial
extension explicitly at both the phase-space reconstruction and model-making stages of the problem.
Suppose a field a(i, n) is measured at discrete spatial sites i and times n. The natural generalization of time-delay reconstruction forms, at each site and time, a space-time embedding block combining M_x spatial samples (spacing D) with M_t temporal samples (spacing T):

a(i, n)               …   a(i, n+(M_t−1)T)
a(i+D, n)             …   a(i+D, n+(M_t−1)T)
  ⋮                            ⋮
a(i+(M_x−1)D, n)      …   a(i+(M_x−1)D, n+(M_t−1)T) .
Such blocks, collected over the data, can be assembled into an m×n matrix X and analyzed by the singular-value decomposition

X = U Σ V^T ,  (144)

where the m×n matrix U and the n×n matrix V satisfy

U^T U = V^T V = I ,  (145)

and Σ is the diagonal matrix of singular values

Σ = diag[σ₁, σ₂, …, σ_n] .  (146)

The number of significant singular values gives a working estimate of the number of degrees of freedom excited in the space-time data.
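A minimal sketch of forming such space-time blocks and examining their singular values (ours; the field is a stand-in coupled-map lattice rather than laboratory data):

    import numpy as np

    def spacetime_vectors(a, Mx, Mt, D, T):
        # a[i, n]: site i, time n; stack Mx sites (spacing D) x Mt times (spacing T)
        L, N = a.shape
        vecs = []
        for i in range(L - (Mx - 1) * D):
            for n in range(N - (Mt - 1) * T):
                block = a[i:i + (Mx - 1) * D + 1:D, n:n + (Mt - 1) * T + 1:T]
                vecs.append(block.ravel())
        return np.array(vecs)

    # stand-in field: a logistic coupled-map lattice with diffusive coupling
    L, N, eps = 64, 400, 0.3
    a = np.zeros((L, N))
    a[:, 0] = np.random.default_rng(0).random(L)
    for n in range(N - 1):
        f = 3.9 * a[:, n] * (1.0 - a[:, n])
        a[:, n + 1] = (1 - eps) * f + 0.5 * eps * (np.roll(f, 1) + np.roll(f, -1))

    X = spacetime_vectors(a, Mx=3, Mt=4, D=2, T=2)
    U, sv, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)  # eq. (144)
    print("leading singular values:", sv[:6])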
We have made significant progress for purposes of studying chaotic laboratory experiments and chaotic field observations of interest to physicists. In this review we have tried to emphasize tools that are of direct utility in understanding
the physical mechanisms at work in these measurements.
We have placed less emphasis on mathematical results,
using them as a guide to what one can learn about fundamental physical processes. Numerical results are critical
to the study of nonlinear systems that have chaotic
behavior. Indeed, computation plays a larger role in
such studies than is traditional in many parts of the physics literature. Progress such as that reported throughout
this review rests heavily on the ability to compute accurately and rapidly, and as such would often not have been
possible a decade ago. The subject reviewed here is almost "experimental" in that sense through its reliance on
computers as an instrument for its study. This bodes
well for further analysis of chaos and its physical manifestations, since one can expect even more powerful computing tools to be widely available on a continuing basis.
The point of view taken throughout this review, often
implicitly, is that chaotic behavior is a fundamental and
interesting property of nonlinear physical systems. This
point of view suggests that as a basic property of the nonlinear equations governing real physical systems there is
much new to be learned by the study of chaos. While at
times we have reported on efforts to "tame" that chaos,
as in the section on control of periodic behavior, our
main goal has been to bring to physicists a set of tools for
extracting important information on the underlying
physics of chaotic observations. Absent many of the
tools created by those whose contributions we have reviewed here, much of the physics of chaos has been
passed by as "noise." We expect the development of
these tools to continue and the exploitation of the physics
thus exposed to blossom in applications across the spectrum of phenomena where chaos is found. Indeed, this
spectrum is so vast that our review has been inevitably
unable to cite fully even a fraction of the literature on
laboratory and field observations where chaos is known
to occur. In any case, such a full recitation might be perceived as a litany of all of classical and semiclassical
physics, and thus slightly useless.
The critical aspect of chaotic behavior from the
physicist's view is that it is broadband spectrally, yet
structured in an appropriate phase space, which itself can
be reconstructed from data. This distinguishes it from
the conventional view of "noise" and opens up for review
and reexamination a wide variety of work which
presumes unstructured stochastic behavior to be a driver
or major influence on the physical processes being studied. That reexamination has not been undertaken in this
review, nor is it a finished project to be found in the
literature. Our review has focused on the tools one will
use or the ideas underlying the tools, which will evolve
for the extraction of new physics from observations of
chaotic physics. We have no doubt whatsoever that
where physics has been hidden by the assumption of unstructured noise, these tools will expose it. In practice the limitations come from finite data sets as well as from numerical, experimental, or instrumental uncertainty. It would be extremely important
to extend the full power of the existing topological
analysis (Mindlin et al., 1991) in three dimensions to
higher dimensions. We anticipate extensive work on the
analysis of geometric properties of experimental strange
attractors. These may also be more robust against contamination by unwanted signals than the more familiar
derived or metric quantities such as dimensions and
Lyapunov exponents.
We look forward to the application of chaotic analysis
tools, both as covered here and in their further developments, to the detailed study of numerous physical systems. The few examples we have presented here are
clearly the "tip of the iceberg." It is in the actual making
of these applications where the push for further progress
is sure to come most persuasively. The part of the task
for the physicist which has been little touched on in this
review, since there is little to review on it as a matter of
fact, is the building and verification of models that arise
from considerations of chaotic physical experiments. We
took the liberty to say a few general words about this in
the Introduction, but the "proof of the pudding" is in
performing the task itself. It is clear, and even
mathematically good sense (Rissanen, 1989), that we cannot algorithmically discover fundamental physics. For
the physicist the road ahead with the subject of chaotic
data analysis will differ from signal processing and return
to the analysis of interesting physical systems with the
methods reviewed. The point of view in the creation of
physical models will change, of course. It will not be
necessary, we imagine, always to start with the Navier-Stokes equations to build models of chaos in fluids, though, perhaps, we will learn to project those infinite-dimensional equations down to the few relevant physical degrees of freedom, which will appear in the models.
The discovery of temporal chaos in physical systems is
a "finished" topic of research, we believe. Many physical
systems have been shown to exhibit chaotic orbits, and
we are certain many more will be found. It is no longer enough, however, to have good chaos; one must move on and extract good physics out of it. The methods and
techniques in this review are directed towards that in the
realm of temporal chaos. An important challenge for the
area of research we have reviewed here is to extend both
the existing methods and the classification of widely expected and universal behaviors to spatio-temporal chaos.
This is to say that the study of chaotic fields remains
rather untouched from the point of view taken
throughout this review. We certainly do not mean that
early and productive steps have not been taken; they
have! We simply desire to underline the clear fact that
the developments in the analysis of spatio-temporal
chaotic data and the connection with model building, signal separation, and the other goals we have outlined are
not yet as far along as for temporal chaos. The recent
thorough review of Cross and Hohenberg (1993) makes it
quite clear that the predominant body of work in this area still lies ahead.
REFERENCES
Abarbanel, H. D. I., R. Brown, and J. B. Kadtke, 1989, Phys.
Rev. A 41, 1782.
Abarbanel, H. D. I., R. Brown, and M. B. Kennel, 1991, J.
Nonlinear Sci. 1, 175.
ADDITIONAL REFERENCES
One of the challenges in a review such as this one is making
it finite. In rising to this challenge we have had to make a
selection of topics and to identify the appropriate literature
that bears quite directly on those topics. We have, of course,
done this, and thus risked ignoring vast parts of the vast literature on the analysis of time series from chaotic observations.
The items cited in the main body of the text by no means
represent the same selection many of our colleagues would
have made, but they do represent the literature we consulted
and learned from in putting together this review. Many other
papers and books have contributed to our understanding of the
material we are covering in this review, and we attempt to
identify below the main body of these. As ever, we are pleased
to hear from authors whose work we have inadvertently failed
to cite. We shall be happy to read these papers and we shall
happily cite that work as well.
Abarbanel, H. D. I., R. Brown, and M. B. Kennel, 1991, "Lyapunov Exponents in Chaotic Systems: Their Importance and Their Evaluation using Observed Data," Int. J. Mod. Phys. B 5, 1347.
Abarbanel, H. D. I., S. Koonin, H. Levine, G. J. F. MacDonald, and O. Rothaus, 1990a, "Statistics of Extreme Events with Application to Climate," JASON/MITRE Report JSR-90-305.
Abarbanel, H. D. I., S. Koonin, H. Levine, G. J. F. MacDonald, and O. Rothaus, 1990b, "Issues in Predictability," JASON/MITRE Report JSR-90-320.
Afraimovich, V. S., I. S. Aranson, and M. I. Rabinovich, 1989, "Multidimensional Strange Attractors and Turbulence," Sov. Sci. Rev. C Math. Phys. 8, 3.
Afraimovich, V. S., M. I. Rabinovich, and V. I. Sbitnev, 1985,
"Attractor dimension in a chain of coupled oscillators," Sov.
Tech. Phys. Lett. 11, 139.
Albano, A. M., J. Muench, C. Schwartz, A. Mees, and P. Rapp,
Smith, L. A., 1988, "Intrinsic Limits on Dimension Calculations," Phys. Lett. A 133, 283.
Stoop, R., and J. Parisi, 1991, "Calculation of Lyapunov Exponents Avoiding Spurious Elements," Physica D 50, 89.
Theiler, J., 1986, "Spurious Dimension from Correlation Algorithms Applied to Limited Time Series Data," Phys. Rev. A
34, 2427.
Theiler, J., 1987, "Efficient Algorithm for Estimating the Correlation Dimension from a Set of Discrete Points," Phys. Rev. A
36, 4456.
Theiler, J., 1988, "Lacunarity in a Best Estimator of Fractal Dimension," Phys. Lett. A 133, 195.
Theiler, J., 1990, "Statistical Precision of Dimension Estimators," Phys. Rev. A 41, 3038.
Thompson, J. M. T., and H. B. Stewart, 1986, Nonlinear Dynamics and Chaos (Wiley, New York).