Multi-Agent Coordination: Fluid-Inspired and Optimal Control Approaches
A Thesis
Presented to
The Academic Faculty
by
Peter M. Kingston
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy in the
School of Electrical and Computer Engineering
Thanks must go, first and foremost, to my advisor, Dr. Magnus Egerstedt – for his
support during my Ph.D.; for creating a uniquely positive academic environment and
research group; and for giving me the freedom to do interesting and varied research.
At all stages, Dr. Egerstedt has been an unfailing advocate, and for this I cannot
thank him enough. I would also like to thank my Ph.D. committee, Dr. Santiago
Grijalva, Dr. Panagiotis Tsiotras, Dr. Erik Verriest, and Dr. Anthony Yezzi. I
appreciate tremendously the time and effort that they have taken to review this dis-
sertation. Additionally, this work would not have been possible without the generous
support of the Office of Naval Research and the National Science Foundation.
I would like to thank Dr. Todd Murphy, who was kind enough to host me at Northwestern University.
I would also like to warmly acknowledge Dr. Mike Stilman and Tobias Kunz, with whom it was a pleasure to collaborate.
My present and former colleagues at the Georgia Robotics and Intelligent Systems
(GRITS) Lab have been an unrivaled source of insight and comradeship. GRITS Lab:
For our discussions (technical and otherwise), for your advice, and for your friendship,
I count myself lucky. It has been a singular privilege to work with you.
Finally, thank you to my family – for encouraging me to begin this journey, and for supporting me through to its end.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . iii
SUMMARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
I INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Eulerian Swarms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Lagrangian Swarms . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Subjective Controller Tuning . . . . . . . . . . . . . . . . . . . . . . 4
II PROBLEM BACKGROUND . . . . . . . . . . . . . . . . . . . . . . 6
2.1 Preliminaries: Algebraic Topology . . . . . . . . . . . . . . . . . . . 7
2.1.1 Abstract Simplicial Complexes . . . . . . . . . . . . . . . . . 7
2.2 Index-Free Multiagent Systems . . . . . . . . . . . . . . . . . . . . . 10
2.3 Time and Output Warped Formation Tracking . . . . . . . . . . . . 12
2.3.1 Time Warping . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Output Warping . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Preference Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2.3 Hodge Decomposition on Graphs . . . . . . . . . . . . . . . . 35
3.2.4 Laplacian Operators and Energy Functions . . . . . . . . . . 36
3.2.5 Two-Dimensional Models . . . . . . . . . . . . . . . . . . . . 39
3.2.6 A Combined Algorithm . . . . . . . . . . . . . . . . . . . . . 45
3.2.7 Example: Air Traffic Control via Incompressible Flows . . . . 45
3.2.8 Incompressible Flows: Numerical Example . . . . . . . . . . 47
3.2.9 Incompressible Flows: Implementation . . . . . . . . . . . . . 48
3.2.10 Incompressible Flows: Summary . . . . . . . . . . . . . . . . 50
3.2.11 Harmonic Flows: or, Homological Patrol Strategies . . . . . . 50
3.2.12 Distributed Computation of Homological Streamfunctions . . 53
3.2.13 Infrastructure-Assisted Behavior Generation . . . . . . . . . 59
3.2.14 Distributed-Infrastructure Routing: Conclusions . . . . . . . 61
IV LAGRANGIAN SWARMS . . . . . . . . . . . . . . . . . . . . . . . . 62
4.1 Constraint-Coupled Mechanical Systems . . . . . . . . . . . . . . . . 62
4.1.1 Distance Constraints . . . . . . . . . . . . . . . . . . . . . . 64
4.1.2 From DAEs to ODEs by the Implicit Function Theorem . . . 65
4.1.3 A Physical Analogue: The Marionette . . . . . . . . . . . . . 65
4.2 Optimal Control for Approximate Formation Tracking . . . . . . . . 68
4.2.1 Time Warping . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2.2 Output Warping . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.2.3 Example: The Puppet . . . . . . . . . . . . . . . . . . . . . . 92
V PREFERENCE LEARNING . . . . . . . . . . . . . . . . . . . . . . 98
5.1 Learning Metric Costs . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.1.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . 99
5.1.2 State of the Art: Support Vector Machines . . . . . . . . . . 100
5.2 Metric Preference Learning: A Convex Formulation . . . . . . . . . 104
5.3 The Preference Graph . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.4 Metric Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.4.1 Direct Solution . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4.2 Instance Vector Expansion . . . . . . . . . . . . . . . . . . . 109
5.4.3 QP Form and Relation to SVMs . . . . . . . . . . . . . . . . 113
5.5 An Asymptotic Observer for Metric Cost Models . . . . . . . . . . . 114
5.6 Apples and Oranges . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.7 Amoebas and Humans . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.8 Learning Metric Costs: Contributions . . . . . . . . . . . . . . . . . 122
VI CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
LIST OF FIGURES
14 Depicted are a specified flow (left), and its projections onto the in-
compressible (center) and harmonic (right) subspaces. The harmonic
flow described in this section (right) differs from the incompressible
flow described in Section 3.2.8 (center), in that the former avoids the
local vortices visible in the latter. In both cases, it is a collection of
hybrid, piecewise-linear controllers that realize the flows. These con-
trollers are produced as the Hamiltonian vector field corresponding to
a piecewise-linear streamfunction (color gradients). . . . . . . . . . . 60
15 The marionette . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
16 Human subject . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
17 The marionette is modeled as a system of point masses (m1 , · · · , m10 )
interconnected by massless rods (solid lines) and suspended by strings
(dashed lines) from four kinematically-controllable points (p1 , · · · , p4 ). 67
18 Comparison of three signals: Although subjectively, Signal 1 represents
the “same motion” as the reference signal (it is merely shifted, dilated,
and scaled) Signal 2 will be judged “more similar” to the reference by
either the L2 metric (79) or angle (80). . . . . . . . . . . . . . . . . . 69
19 J(ξ) is plotted against the linear time warping parameter ξ = v(0)
for the problem in which an autonomous nonlinear pendulum approx-
imately tracks a sinusoid. . . . . . . . . . . . . . . . . . . . . . . . . . 79
20 Simplicial tessellations in the plane containing the region outlined in
bold satisfying (left) all rules 1-4; (center) all rules except 3 (the of-
fending vertex is circled); and (right) all rules except 4 (the offending
simplex is circled). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
21 Given the set S of input simplices, the output warping function s
is determined by the positions of the vertices of the corresponding
output simplices R. This example uses a Coxeter-Kuhn-Freudenthal
tessellation of a regular grid of cubes in R2 . . . . . . . . . . . . . . . 90
22 Van der Pol oscillator vs. driven pendulum, before warping. We wish
to scale the time axis of the output (top left) and the output space of
the reference (bottom right) to align the two signals. . . . . . . . . . 93
23 Van der Pol oscillator vs. driven pendulum, after warping. Time warp-
ing matches the first part of the Van der Pol oscillator’s transient to
that of the pendulum (left top, bottom), and output warping rotates
and deforms the reference output space to better match the output
(right top, bottom). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
24 The cost in Example 4.2.5 is reduced (top), eventually reaching a local
extremum as evidenced by the reduction of the norm of the derivative
to zero (bottom). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
25 The gradient descent procedure is characterized by a rapid initial de-
scent followed by slower final descent. Each curve corresponds to a
different initial guess for the control trajectory; each component of the
guess for u takes the form α cos(ωt) + β sin(ωt), with (α, β) chosen
uniformly at random from [−1, 1] × [−1, 1] and ω from [0, ωmax ], where
ωmax is the Nyquist frequency for the sample rate used. The initial
guesses for the output warping are realizations of the random variable I + (1/5)ε, where here I denotes the coefficients corresponding to the
identity affine transformation and ε is drawn uniformly from [−1, 1]^6. . 94
26 Animation frames showing the lowest-cost imitation of the human sub-
ject’s bhangra performance by the puppet; percentages are of the total
playback time elapsed. The puppet begins hanging in an equilibrium
state (at time “0%”). . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
27 The affine transformations arrived at by the optimization algorithm for
different initial guesses (as described in Figure 25) are illustrated as
frames in R2 (solid), along with the identity transformation (dotted).
The lowest-cost transformation is drawn in bold. . . . . . . . . . . . . 96
28 The original preference graph G (left), and the corresponding transitively-
reduced quotient graph, (G/ ∼)t (right). . . . . . . . . . . . . . . . . 106
29 Two examples for X = R2 . Shades of gray indicate the number of
violated constraints (points in darker regions violate more constraints),
and discontinuities in the derivative of the piecewise-linear function
x ↦ max_i (1/‖d_i‖)(⟨d_i, x⟩ − b_i) are indicated by dashed lines. In the first
example (top), P ≠ ∅ (white region), and x̄ is its incenter, the point
maximally far away from the closest of the constraint surfaces (thin,
solid lines) - i.e., it is the center of the largest inscribed sphere (thick,
solid curve). In the second example (bottom), P = ∅, and the resulting
optimum, x̄, is the point whose worst constraint violation is minimal. 108
30 A number of uniformly-randomly selected points in [−1, 1] × [−1, 1] ⊂
R2 are compared according to a point at infinity (i.e., a linear cost
function) (dotted), and both the traditional SVM (dashed) and the
minimax-rate (solid) approaches are used to produce estimates of this
direction from the comparisons. From the difference-classification point
of view (top), one wishes to separate the vectors {d_i}_{i=1}^N (displayed
as “o”s) from the vectors {−d_i}_{i=1}^N (displayed as “*”s). From the
minimax-rate point of view (bottom), one wishes to find the direc-
tion that maximizes the rate of constraint satisfaction (the numbers
of violated constraints are represented by shades of gray; the white
region is feasible). The traditional SVM solution separates the posi-
tive from the negative differences with a larger margin (top), but the
minimax-rate solution stays as far from the edge of the constraint cone
as possible (bottom). . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
31 A series of the observer’s estimates, with αk = 1 ∀k. The initial esti-
mate is x̃0 , and the true ideal is given by x̄. In step 0, the observer
projects x̃0 onto the plane (solid line) corresponding to the measured
output s0 = (x10 , x20 ) to produce x̃1 . In step 1, the observer makes no
changes to its estimate, because x̃1 is on the correct side of the plane
corresponding to s1 ; hence x̃2 = x̃1 . In step 2, the observer projects x̃2
onto the plane corresponding to s2 to create the estimate x̃3 , which is
yet closer to x̄. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
32 Example estimate trajectory for observer (201-202) for αk = α = 1,
with X = R2 . The estimate begins at x̃0 = (−15, 15), and approaches
the ideal x̄ = (17, 0). . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
33 Depicted are the 9 apples used to generate comparisons with the single
orange. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
34 An example of a pairwise comparison between two apples, relative to
the orange. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
35 The preference graph corresponding to the apple experiments. . . . . 119
36 The preference graph corresponding to the amoeba experiments. . . . 121
37 Each question took the form, “Which of the two ‘amoebas’ (bottom)
looks more like the [motion capture data from a] human dancer (top)?” 121
38 If x1 ∈ B1 and x2 ∈ B2 , then ||x̃k+1 − x̄|| < ||x̃k − x̄||. . . . . . . . . . 131
SUMMARY
Multiagent coordination problems arise in a variety of settings, from satellite constellations and formation flight, to air traffic control and unmanned vehicle
teams. We investigate the coordination of mobile agents using two kinds of ap-
proaches. In the first, which takes its inspiration from fluid dynamics and algebraic
topology, control authority is split between mobile agents and a network of static
infrastructure nodes - like wireless base stations or air traffic control towers - and
controllers are developed that distribute their computation throughout this network.
In the second, we view formations as constraint-coupled mechanical systems and develop novel optimal control algorithms, which involve the computation of optimal deformations of time and output spaces, to achieve approximate formation tracking. Finally, in a complementary line of work, we develop methods that learn cost functions consistent with the subjective preferences of humans.
CHAPTER I
INTRODUCTION
Multiagent systems appear in applications including satellite constellations, formation flight, air traffic control, and unmanned vehicle teams. Motivations are multifold: to produce systems that are robust to the loss of individual agents; to allow collections of simple units to solve larger problems than those for which they were originally designed; and to impose order on systems, like electrical grids and transportation networks, that arise
from distributed social structures. Much of the inspiration for multiagent approaches
comes from biology, where flocks, schools, and packs of animals coordinate to solve
problems – like flying with aerodynamic efficiency, foiling predators, or hunting prey
– but physics also has a role to play, where ensembles of particles, like those com-
prising fluids or electrical conductors, exhibit properties that arise only from their
interaction.
Despite the growing adoption of multiagent robotic solutions, and despite the intuitive biological and physical justifications for doing so, there remains a need for useful abstractions for controlling these systems.
The goal of this dissertation is to fill this gap, by presenting novel abstractions for
multiagent systems.
Two kinds of approaches will be taken to the problem of multiagent coordination. To describe them, we will borrow the terms Eulerian and Lagrangian
from fluid dynamics. The Eulerian approach considers points or control volumes that
are fixed in space, and the dynamics one obtains relate values that are assigned to
these points – like density, temperature, and pressure – to flows of mass, energy, and momentum between them. In the Lagrangian approach, by contrast, coordinates are employed that move with the fluid – which has the interpretation of tracking
an individual particle of the fluid. For our purposes, Eulerian control schemes for
multiagent systems will be those that work with fixed control volumes containing
mobile agents, and Lagrangian control schemes will be those that track individual
agents, as particles. The key distinction between the two approaches is that, whereas
the Lagrangian approach is concerned with what state each agent is in, the Eulerian approach is concerned with how many agents are in each state.
For implementation, the Eulerian approach to multiagent control splits control au-
thority between mobile agents and a network of static infrastructure nodes – like
wireless base stations or air traffic control towers – that are responsible for differ-
ent regions of the environment. Algorithms are then developed that distribute their
computation throughout the static network, to produce local controllers that are
broadcast to mobile agents within each region – an approach that we first considered
in [55].
An underlying assumption for the Eulerian approach is that mobile agents are interchangeable; accordingly, the representation discards agent identities a priori by representing the joint state of a swarm as a density function over a state space.
Another motivation is the recognition that real multiagent systems often do use static communications infrastructure, so in these situations, it is natural to ask
whether that infrastructure can be used for control as well. Moreover, in certain ap-
plication domains, like air traffic control, static infrastructure is not only widespread,
but indeed has legal authority. The Eulerian approach is compatible with these existing institutions. It also brings to bear interesting tools from algebraic topology, Hodge Theory, and the theory of discrete harmonic functions – which gives, in addition to the practical reasons just outlined, a mathematical motivation as well.
The Lagrangian approach, for its part, views formations as constraint-coupled mechanical systems, and develops novel optimal, approximate formation tracking algorithms. As elsewhere, graphs are used to represent robot formations; in this case, graph edges represent constraints that
couple agents. The chief source of novelty is the development of a number of new
optimal control problems, which involve additional degrees of freedom, and their ap-
plication to these constraint-coupled swarms. Specifically, we allow for both the time
axis and the output space to be deformed elastically, and optimize over the homeomorphisms that describe these deformations. We first addressed the time- and output-warped tracking problems in [59] and [61].
Deformation of the time axis is motivated by the recognition that the controlled
system either simply may not be able to move as quickly as a human, or may have
natural modes of oscillation (imagine waves propagating through a swarm) that can be
exploited by interpreting time liberally. In other words, although one system may not
be able to execute a motion at the same speed as another, it may nevertheless be able
to perform an “equivalent” motion by moving at a different speed. This approach is best illustrated by an example: a marionette actuated by weak servos whose arm swings with a natural frequency of 1
Hz may be asked to mimic a human waving her hand at 2 Hz. By choosing to interpret
the human’s wave at a slower speed, the oscillatory dynamics of the marionette can be
used to create a more convincing wave. In the same way, we aim to allow controllers
to harness, rather than fight, the natural modes of swarms. This is enabled by a
flexible interpretation of time, which will arise automatically as the solution to one of the optimal control problems we pose. Similar reasoning motivates deformation of the output space as well. Returning to the example of the marionette asked to mimic a waving
human, suppose that the human waves at head level yet the puppet cannot raise its
hand above shoulder level. Using any of the usual metrics, the marionette simply
cannot track this reference signal. However, if the human’s “wave” motion is allowed
to be translated down to shoulder level, it can be tracked and the puppet can “mimic”
the human by executing a motion which is lower but still recognizably the same.
In a third, complementary thread, we ask how one can “learn” controllers, under both frameworks, that optimize humans’ subjective experience in controlling swarms. Our goal in this complementary work, which we explored in [58], [60], and [57], is not to investigate human psychology itself, but rather to build control algorithms that can accommodate it.
Entertainment and artistic applications help to motivate this aspect of the prob-
lem, where robots are asked to generate motions that, rather than achieving a well-defined engineering objective, are subjectively pleasing to human observers. In these situations, the effectiveness of control is ultimately the degree to which it aligns with human preferences. We address techniques both for using empirical measurements to learn cost functions that are consistent with humans’ aesthetic preferences, and for generalizing from finitely-many expressed preferences to entire spaces of alternatives.
Work in this area has the opportunity to impact areas beyond control and robotics,
from the recommendation engines of e-commerce (Amazon, eBay), Internet radio (Pandora, Grooveshark), and advertising (Facebook, Google) websites (see e.g. [44], [1], [74]); to more fundamental study of human behavior, e.g.
in psychometrics (e.g., [97], [17]) and economics (e.g., [2], [103], [84]). Nevertheless,
for the purposes of this dissertation, preference learning plays a secondary, comple-
mentary role to the primary contributions of Eulerian and Lagrangian swarm control
schemes.
CHAPTER II
PROBLEM BACKGROUND
Graph theory and algebraic topology provide a natural mathematical language for
discussing multiagent coordination. At heart, both simply model how things are connected, and they will serve two purposes for us.
The first is to describe interactions between agents. Interactions can take various
forms: Agents may communicate with one another over traditional communications
channels (e.g., radio, optical, or acoustic links); they may react to one another’s
physical presence using sensors (like sonar, laser, or infrared rangefinders); or they
may mechanically alter one another’s states (e.g., by pushing or pulling). The simplest
interactions are pairwise, and are naturally represented as edges of a graph that takes
the agents as its nodes, but more complicated interactions involving three or more
agents can also be modeled using a generalization of graphs, the abstract simplicial
complex. Graphs and simplicial complexes, which are illustrated in Figure 2, will be defined formally in the sections that follow.
The second purpose of graphs and simplicial complexes, for us, is to serve as
abstract output spaces of systems. In this context, the nodes of graphs, and the
triangles, tetrahedra, and higher-order simplices of simplicial complexes, will represent abstract outputs. (In the Eulerian framework introduced in Section 3.2, abstract simplicial complexes will serve both roles simultaneously: their simplices represent both static agents, like routers, and regions of an environment under their control authority.)
A number of tools from algebraic topology and homology theory – most notably the notions of the Hodge decomposition of a vector field (see e.g. [69]), and of a chain on a simplicial complex (as in [40] or [76]) – will play a key
role. The analogy between the discrete and continuous in this context is the subject of
discrete exterior calculus (discussed in [27]), which has found application in a number
of areas including computer graphics (e.g. [98]), image processing and clustering (e.g.
[37]), computational physics [81], statistical ranking [50], and multiagent control,
including [36] where a connection to continuous PDEs is made, [110] which explores ways to probe the homology of the complex, and [92], which additionally gives subgradient algorithms for related problems.
The formalism used in this section closely parallels that of [75] and [92]. Philo-
sophically, however, the goals are very different – in [75] and [92], one seeks to locate
holes in a network; here, we will look to understand human preferences and to direct swarms of mobile agents.
We now introduce abstract simplicial complexes more formally. Our exposition will
be brief, intended mainly to introduce notation and terminology; for more detail, the
[Figure: 0-simplex (vertex), 1-simplex (edge), 2-simplex (triangle), 3-simplex (tetrahedron)]
interested reader may wish to refer to [40] or [76] (although the formal definitions
used in each are slightly different), as well as the introductions to [50] (which uses a
dual formulation) and [92]. The definitions that follow in this section are more-or-less
standard.
An abstract simplicial complex K is a finite collection of simplices that is closed with respect to taking faces; i.e., if ∆ ∈ K and σ is a face of ∆, then σ ∈ K. A simplicial k-complex is a complex in which all simplices whose order is less than k are faces of higher-order simplices. We denote the k-simplices of K by Σ_k(K), as illustrated in Figure 2.
An orientation of a simplex is a total order over its vertices, modulo even permutations, with a formal sign,¹ as is illustrated by Figure 2.1.1. If the set ∆ = {v0, · · · , vk} is a simplex, an oriented simplex over ∆ is written as a bracketed vertex sequence; orientations related by an odd permutation are said to be opposite, and this is written, e.g., [v0, v1] = −[v1, v0].
¹ The formal sign is necessary only to allow 0-simplices to have two orientations.
An orientation of a simplex induces an orientation on its faces; the i-th oriented face of an oriented simplex [v0, · · · , vk] is F_i([v0, · · · , vk]) = (−1)^i [v0, · · · , v_{i−1}, v_{i+1}, · · · , vk]. An oriented simplicial complex is one in which an orientation has been assigned to each of its k-simplices. A simplicial k-complex is consistently oriented if, for every pair of k-simplices sharing a (k−1)-face σ, the two induce opposite orientations on σ.
The space of k-chains C_k(K) consists of formal sums of elements from Σ_k(K) taking coefficients from some commutative ring; we use the real numbers, R. For instance, the formal sum 1.2v0 + 2.6v1 − 0.5v4 is a 0-chain
over an appropriate simplicial complex. Formal sums can be added and multiplied
by scalars in the natural way, so Ck (K) forms a finite-dimensional real vector space.
Boundary operators will be central to this work. The k-th boundary operator
δk (K) : Ck (K) → Ck−1 (K) on the oriented simplicial complex K is defined,
δ_k(K)( Σ_{i=0}^{N} a_i σ_i ) = Σ_{i=0}^{N} a_i Σ_{j=0}^{k} F_j(σ_i) ;  (3)
by convention, δ0 (K) = 0. The null space of δk (K) is called the k-cycles of K and
denoted Zk (K); the image of δk+1 (K) is called the k-boundaries and denoted Bk (K).
The k-th homology group is the quotient space H_k(K) = Z_k(K)/B_k(K); its dimension is called the k-th Betti number.
The k-th combinatorial Laplacian is defined, L_k(K) = δ_k∗(K) δ_k(K) + δ_{k+1}(K) δ_{k+1}∗(K),
where δk∗ (K) denotes the adjoint operator to δk (K), called the k-th coboundary oper-
ator. The matrix representations of δk (K) and δk∗ (K) are transposes of one another.
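To make these operators concrete, here is a minimal numerical sketch (ours, not the dissertation's; the edge list, the use of numpy, and the head-minus-tail sign convention are illustrative assumptions). It builds the matrix of δ₁ for a small complex and forms L₀ = δ₁δ₁∗, which is possible since δ₀ = 0:

```python
import numpy as np

# A small oriented 1-skeleton: a triangle {v0,v1,v2} plus a pendant edge (v2,v3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]   # oriented 1-simplices, (tail, head)
n_vertices = 4

# Matrix of delta_1 : C_1 -> C_0; column j is the boundary of edge j,
# here taken as head minus tail (one sign convention among several).
D1 = np.zeros((n_vertices, len(edges)))
for j, (tail, head) in enumerate(edges):
    D1[head, j] = 1.0
    D1[tail, j] = -1.0

# Zeroth combinatorial Laplacian: since delta_0 = 0, L_0 = delta_1 delta_1^*,
# and the coboundary delta_1^* is represented by the transpose D1.T.
L0 = D1 @ D1.T

# Dimension of the cycle space Z_1 = null(delta_1): the triangle gives one cycle.
print(len(edges) - np.linalg.matrix_rank(D1))  # -> 1
```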
Since simplicial 1-complexes are simply graphs, for 1-complexes we may also use terminology from graph theory [28].² There, C0(K) is called the vertex space, C1(K) is the edge space, and the null space of δ1(K) is the cycle space (and its dimension is the circuit rank). A complex K′ is geometric when its vertex set V(K′) is a finite subset of Rⁿ for some n ∈ N; its Rips Shadow R(K′) ⊂ Rⁿ is then the union of the convex hulls of its simplices.
The systems that we will be controlling are swarms of interchangeable agents. Conse-
quently, we are interested in problem formulations that do not require agent identities.
Traditionally, the agents' states are collected in a single vector x ∈ R^{nN}, and analysis proceeds from this point. Examples
include [78], [107], [32], [94], [11], [12], [85], [104], [30], [31], [39], [71], [79], [95], [96],
[106], [108], [56]. For systems in which there are many homogeneous agents, however,
² In graph theory, it is more common to use the two-element field F₂ = {0, 1} (i.e., XOR serves as the addition operation) instead of R.
it may be argued that this is an unnatural approach because it requires that agents be individually labeled, even though the desired collective behavior is independent of agent labeling. One way of dealing with this is to introduce an equivalence relation, x ∼ y ⟺ ∃P ∈ π(N) : x = (P ⊗ I_n)y, for any x, y ∈ R^{nN}, where π(N) denotes the group of N × N permutation matrices
and ⊗ denotes the Kronecker product. Essentially, one is then looking for properties
of the system that are invariant under permutation of the agent indices or for methods
by which the agents can agree on a permutation; this is the approach taken in several of the works cited above.
Similarly-motivated previous work includes [63] and [62], in which the individual
configurations in R² ≅ C of robots in a formation are represented by the roots of
a complex polynomial whose coefficients constitute the permutation-invariant joint state. That representation, however, applies only in two dimensions, and since the representation hinges on the Fundamental Theorem
of Algebra it is not clear how one would generalize to higher (or lower) dimensions.
The work with the largest technical similarities to ours exists in the mathematical biology literature, where continuum models are used to describe very large swarms; this includes [73], [99], [68], [13], and [19]. The models studied in this area typically also include diffusion and sometimes nonlinear reaction terms, but share the advection term central to our model; indeed, some of the results given in e.g. [99] can be specialized to the case which we will study. Never-
theless, there are a number of essential novelties to our approach which arise precisely
because of the interplay between the indicator distribution and the indexed represen-
tation (the latter of which is not present in a purely continuum model). Among these are the inner product structure that the indexed representation inherits via the kernel trick, and an Eulerian control philosophy which has the
potential to result in dramatically simple controllers, which are the subject of Section
3.1.
In this section, we discuss methods that assert, a priori, that a multiagent formation need only be tracked up to deformations of time and output. The idea is best introduced by an analogy: Suppose one were to draw, using a pencil, the graphs of two signals (representing formation trajectories) on sheets of rubber, and to then stretch those pieces of rubber until the graphs approximately “lined up.” Formally, such a process could be described as an optimization over warping functions – an idea that is investigated in more detail in Section 4.2. In preparation for that
formal problem statement, in this section we review literature that approaches these warping problems from related directions.
For the case when control is not the objective but instead one simply wants to compare
signals, the time warping problem has been studied extensively in the automatic
speech recognition literature. There, variations in speaking rate are accounted for by
“normalizing” the time axes of different signals. That problem is usually solved in a dynamic programming framework, by an algorithm known as Dynamic Time Warping (DTW) (for example, see [86], [87], [20], [105]). The research presented
in Section 4.2 is inspired in large part by that work, but differs crucially in that we are
now interested in the control of systems, and the control problem and the “warping”
problems are inextricably coupled. What this means is that one cannot simply perform dynamic time warping on a reference signal and then control the system to track it; the warping and the control must instead be optimized jointly.
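For concreteness, a minimal sketch of the DTW dynamic program follows (our illustration, not drawn from the cited works; the absolute-difference local cost and the absence of slope constraints are simplifying assumptions):

```python
import numpy as np

def dtw(reference, signal):
    """Minimal dynamic time warping between two 1-D signals.

    Fills the cumulative-cost table of the classic dynamic program and
    returns the optimal alignment cost; a monotone warping path could be
    recovered by backtracking through the same table.
    """
    n, m = len(reference), len(signal)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - signal[j - 1])   # local mismatch
            cost[i, j] = d + min(cost[i - 1, j],        # stretch the signal
                                 cost[i, j - 1],        # stretch the reference
                                 cost[i - 1, j - 1])    # advance both
    return cost[n, m]

t = np.linspace(0.0, 2.0 * np.pi, 100)
# A rate-mismatched pair aligns far better under DTW than pointwise:
print(dtw(np.sin(t), np.sin(1.5 * t)), np.abs(np.sin(t) - np.sin(1.5 * t)).sum())
```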
A number of different but related problems have also been studied in the controls literature, in which tracking problems for systems with bounded control inputs are made feasible by reparametrization. Certain minimum-time optimal control problems are introduced, and necessary conditions for the existence of time warping functions are given for this large class of systems. Exact and asymptotic tracking are considered, but finite-time approximate tracking – which is our concern – is not.
Of the previous work, that most similar in approach to our own appears in [83],
in which the authors discuss a number of time warping problems in which dynamical
systems play a role. There, the goal is to analyze motions for computer vision purposes rather than the control of systems, and this does change the problem considerably. Still, dynamical systems are studied; a combination of dynamic time warping and deconvolution is used; and the resulting formulation resembles an optimal control problem. Key differences from our work include that in [83] time warping is applied to the input to the systems rather than the output as in our case, and that in [83] the control effort required to effect the motion is not penalized in the optimization problem.
A separate literature concerns elastic image deformation, which has applications to image registration (e.g., [54], [67], [70], [82],
[101]), video compression (e.g., [93], [45]), and image processing (e.g., [9]). As we will
be “warping” not images but output trajectories, there are however very few technical
similarities to our work in this literature (besides the use of simplicial complexes to represent deformations). The time- and output-warping problems developed in this dissertation fit naturally into the context of multiagent formations, but can also be applied to fairly
general nonlinear systems, including large classes of mechanical systems – e.g., robots
like those we have studied in other contexts, like [65]. The warped tracking problems themselves are formulated and solved in Section 4.2.
2.4 Preference Learning
The idea of learning cost or rating functions from expressed preferences has been studied in a number of fields. In these problems, one is given a set of instances or
alternatives (usually points in a real vector space)3 together with information about
humans’ preferences among them, and it is sought to learn functions that generalize
these expressed preferences to the entire space in some way. When the preference
data take the form of values from an ordinal scale – e.g., “Good,” “Fair,” “Poor” –
the problem is known as ordinal regression (e.g., [43], [23]). When they take the form of pairwise comparisons (i.e., answers to questions of the form, “Which of these two options is better?”), we will refer to the problem as preference learning
(e.g. [24] [42] [34] [26] [22] [3] [50]). Often, preference learning is done by constructing
a real-valued ranking function over the instances (e.g., [34] is representative), but in
some cases (particularly when one wishes to allow intransitive preferences), one can
seek merely to solve a binary classification problem that determines, for a given pair
³ To the extent that a distinction is made between “instances” and “alternatives,” it is that “instances” are the points that were shown to human judges, whereas “alternatives” may also include other points in the space besides those that were seen.
of instances, which is preferred (as in, e.g., [42]). Applications have included route
selection by automotive GPS systems [34], and food preference studies [10], [29], among others.
It should be noted that pairwise comparison studies have the advantage over rating-scale studies that they are insensitive to a psychological phenomenon in which people’s rankings are only accurate among objects compared at around the same time [29]. Specific experimental protocols include two-
alternative forced choice (2AFC), in which, when comparing two objects A and B,
subjects must either respond “A is better than B” or “B is better than A;” three-
alternative forced choice (3AFC), in which “the two are the same” is also an acceptable
answer; and 4AFC, which also includes “the two are incomparable” as an answer.
The problem of label preference learning (e.g. [48] [7]) introduces an extra level of
complexity over instance preference learning. Here, two sets of objects are considered,
one of instances and the other of labels; then, one is given, for each instance, a partial
order (or label ranking) over labels for that instance. In the prototypical example, the
instances are documents, the labels are document classifications, and the label ranking
for e.g. the book Mathematical Models in the Social Sciences [53] would indicate that
it is more “Mathematics” than “Social Sciences,” and more “Social Sciences” than
“Biology.” Continuing the example, the goal is to learn from these data a function
that, when presented with some unseen book, will return a label ranking for it.
Approaches which ultimately employ Support Vector Machines (SVMs) dominate in the literature; most of these approaches draw heavy inspiration from [43]. However, Bayesian methods have been explored as well.
The least-squares approach of [50] is worth singling out, because a very strong
link to combinatorial Hodge theory exists here. Indeed, [50] represents preferences
as 1-chains on a simplicial 2-complex, and considers projections of these preference-chains onto the subspaces of a combinatorial Hodge decomposition. Our own work addresses instance preference learning for 2- or 3-AFC experiments, but differs from existing approaches in proposing different problem formulations and optimization problems, which allow for efficient solution. Ultimately, our aim is to tune controllers designed using the Eulerian and Lagrangian control methodologies that are the emphasis of our work, so that they align with humans’ subjective preferences.
CHAPTER III
EULERIAN SWARMS
Eulerian approaches to swarm control are those that discard agent identities, and, often, that offload control authority from individual, mobile agents onto static infrastructure. We begin this chapter by discussing the index-free multiagent representation itself (first
introduced in Section 2.2), and demonstrating that, within this framework, many of
the results typically obtained in a more traditional Lagrangian setting can also be
achieved without indexes. With this done, we reach the heart of our work in the next
sections, where networks of static infrastructure nodes synthesize local controllers that they pass to mobile agents, and that, despite being arrived at by local computations, realize global objectives.
Our discussion of the index-free multiagent system representation will proceed as fol-
lows: After introducing the indicator distribution and its dynamics (Section 3.1.1),
we relate this representation to the more standard graph-theoretic one (Section 3.1.2); we then give the space of representations an inner product structure (Section 3.1.3), and finally use a discrete analogue to this representation to give
an example both of the dual relationship between the Eulerian and Lagrangian ap-
proaches, and of the utility for certain problems of selecting the Eulerian approach
For readability, we will identify distributions with functions throughout this section wherever this should not cause undue confusion.
Beginning from the classical notion of an indexed set of agents, we will build an object that carries the same information while stripping out agent identities. The essential idea will be to construct
an object that tells us not what state each agent is in, but rather how many agents
are in each state. The construction is as follows: Starting from an indexed collection of states x1, · · · , xN ∈ Rⁿ, the map Φ : R^{nN} → T(Rⁿ) is defined,
m(x) = Φ(x1, · · · , xN)(x) = Σ_{i=1}^{N} δ(x − xi)  (5)
where δ is the Dirac delta distribution on Rⁿ, and T(Rⁿ) denotes the space of tempered distributions on Rⁿ. We call m the indicator distribution, or agent “mass distribution.” Importantly, notice that although indices were used in its construction, m itself is index-free; normalized by N, it is the density that answers the question, “If an agent is chosen uniformly at random, what is the probability that it lies in a given region?”
More generally, we would like to be able to add and subtract distributions so that
we have a full vector space structure. Hence we will also consider linear combinations of the form,
x ↦ Σ_{i=1}^{K} c_i δ(x − ξ_i)  (6)
Now suppose that, in the classical setting, each agent i ∈ {1, · · · , N } has state
xi ∈ Rn and dynamics
ẋi = vi (7)
where vi is our control input, and that moreover vi is generated by some controller,
identical for all agents, that depends only on m(·, t) and not the indexed set of states.
In other words (and making time dependence explicit), vi (t) = v(xi (t), m(·, t)) ∀i ∈
{1, · · · N }. Then, the equivalent dynamics for our indicator distribution are given by
ṁ = − div(mv) (8)
which holds in a weak sense to be described in the next paragraph. Informally, the
meaning of (8) is that the rate at which m decreases at a given point is precisely the
divergence of the mass flux vector field mv from around that point.
To state exactly what is meant by (8): Formally, the distribution m is defined as a linear functional on a space of smooth test functions f, and the Dirac delta distribution is identified with the evaluation functional. Then, by analogy with integration by parts, one defines ⟨∂m/∂x_i, f⟩ ≜ −⟨m, ∂f/∂x_i⟩; and multiplication by a smooth function g is defined by ⟨gm, f⟩ ≜ ⟨m, gf⟩. With these definitions, (8) means
⟨ṁ, f⟩ = ⟨m, v · grad f⟩  (9)
for every test function f. One sees from (9) that the only classical derivatives actually used in (8) are those of the vector field v and of the test function f. For our purposes, it is sufficient merely to note that, with these definitions, integration by parts and the product rule hold by construction.
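As a numerical sanity check on this reading of (8), the following sketch (ours; the 1-D grid, constant velocity field, and upwind discretization are illustrative assumptions) integrates the continuity equation and verifies that total "agent mass" is conserved, mirroring the conservation result proven below:

```python
import numpy as np

nx, dx, dt = 200, 0.01, 0.001
x = np.arange(nx) * dx
m = np.exp(-((x - 1.0) ** 2) / 0.01)   # a smooth initial "agent density"
v = 0.3 * np.ones(nx)                  # constant rightward velocity field

mass0 = m.sum() * dx
for _ in range(500):
    flux = m * v                       # mass flux m*v at each cell
    # upwind difference of the flux (valid for v > 0), periodic boundary:
    m = m - (dt / dx) * (flux - np.roll(flux, 1))

# The divergence form moves mass around but creates or destroys none of it.
print(np.isclose(m.sum() * dx, mass0))  # True
```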
3.1.2 Weighted Linear Consensus
In the indexed setting, a particular problem that has received a great deal of attention is the consensus problem, in which the agents obey the protocol,
ẋ = −L_w(G(t)) x  (10)
where L_w(G(t)) is the (possibly weighted) graph Laplacian for some undirected interaction graph G(t). The weighted Laplacian factors as,
L_w(G(t)) = D(t) W(t) D(t)ᵀ  (11)
where D(t) is the incidence matrix for any orientation of G(t) and W(t) is a diagonal matrix of edge weights. If an orientation is chosen for the undirected graph G(t), then D(t) is the matrix representation of the boundary operator δ1(G(t)) defined in (3); i.e., if G(t) has n vertices, m edges, and edge set {(e¹₁, e²₁)(t), · · · , (e¹_m, e²_m)(t)}, then D(t) ∈ R^{n×m} is defined by,
D_ij(t) = { 1 if e¹_j(t) = i ;  −1 if e²_j(t) = i ;  0 otherwise } .  (12)
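A minimal simulation of the protocol (10) under these definitions (the path graph, weights, and Euler step size below are our own illustrative choices):

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]          # a path graph on 4 agents
weights = np.array([1.0, 0.5, 2.0])       # one positive weight per edge
n = 4

D = np.zeros((n, len(edges)))             # incidence matrix, as in (12)
for j, (e1, e2) in enumerate(edges):
    D[e1, j], D[e2, j] = 1.0, -1.0

Lw = D @ np.diag(weights) @ D.T           # weighted Laplacian L_w = D W D^T

x = np.array([4.0, -1.0, 2.0, 7.0])
for _ in range(5000):
    x = x - 0.01 * (Lw @ x)               # Euler step of xdot = -L_w x

print(np.round(x, 3))  # every entry approaches the initial average, 3.0
```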
We say that the protocol (10) is permutation invariant if both the presence of an
edge between two agents, and the weight assigned to an edge, are functions only of the agents’ states. In other words, the interaction graph is allowed to depend on the states of the many agents, but not
on their identities.
This occurs in a great many cases of interest, including disk graphs, Gabriel and
Delaunay graphs, nearest-neighbor graphs, and even situations in which edges are determined by visibility (where the line of sight between two agents can be severed by a third agent who gets in the way). Consensus on static
(i.e., constant) interaction graphs, however, is generally not permutation-invariant,
since in this case edges are determined not by agents’ states but by their identities
(i.e., if agent i is to communicate with agent j and not with agent k regardless of
the many agents’ states, then it requires some way to differentiate between agents j
and k). In graph-theoretic terms, what is required for a static interaction graph is invariance under every permutation of the vertices – a property possessed only by the empty graph and the complete graph. Hence, permutation-invariant protocols are instead described by weight functions of the form w : Rⁿ × Rⁿ × T(Rⁿ) → R, with corresponding controller,
v(x, m) = ∫_{ζ∈Rⁿ} m(ζ) w(x, ζ, m)(ζ − x) dζ  (13)
In most cases that arise, w depends only on its first two arguments, since the
intensity with which a pair of agents interacts is usually a function only of their two
states and not on other agents’. For instance, the unit-disk topology1 can be encoded
where, for a given set S, 1S denotes the zero-one indicator function for S, and Br (ζ)
¹ I.e., two agents can interact if and only if they are within one unit distance of one another.
Regardless of the particular form of w, the controller (13) then induces the closed-loop dynamics,
ṁ = −div(m v(·, m))  (16)
We now analyze the closed-loop system (16). Note that the following theorems hold not only when m is
a sum of Dirac deltas as in (5), but also for any positive m which either has compact
support or which more generally vanishes at infinity; naturally, this includes smooth
density functions.
We note here that similar results to Theorems 1 and 2 are also proven for a closely-
related PDE featuring diffusive terms in [99]. We nevertheless include Theorems 1 and
2 both in the interest of completeness, and for their value as instructive specializations
to our case.
Theorem 1 (Conservation). Under the closed-loop dynamics (16), the swarm centroid x̄, defined componentwise by²
x̄^i ≜ ⟨m, x^i⟩  (17)
is constant in time.
Proof: Differentiating,
∂/∂t ⟨m, x^i⟩ = ⟨ṁ, x^i⟩ = ⟨−div(mv), x^i⟩ = −Σ_{j=1}^{n} ⟨∂(mv)^j/∂x^j, x^i⟩ .
² Here, ⟨m, ·⟩ denotes evaluation of m as a linear functional.
Since m vanishes at infinity, integration by parts gives that this equals
⟨(mv)¹, ∂x^i/∂x¹⟩ + · · · + ⟨(mv)ⁿ, ∂x^i/∂xⁿ⟩ = ⟨(mv)¹, δ^{1i}⟩ + · · · + ⟨(mv)ⁿ, δ^{ni}⟩
so that
∂/∂t ⟨m, x^i⟩ = ⟨m v^i, 1⟩
where 1 denotes the constant function that returns 1 ∈ R; this bracket is interpreted as the integral,
∫_{x∈Rⁿ} ∫_{ζ∈Rⁿ} m(x) m(ζ) w(x, ζ, m) (ζ^i − x^i) dζ dx = ∫_{(x,ζ)∈R^{2n}} m(x) m(ζ) w(ζ, x, m) (ζ^i − x^i) d(x, ζ) .
We note that the term being integrated is antisymmetric in x and ζ, and hence that the integral vanishes; the centroid is therefore conserved. □
Theorem 2 (Stability). Let w : Rⁿ × Rⁿ → R take the form (15). Then the closed-loop system (16) is stable, in the sense that the functional (18) below is nonincreasing, so long as m vanishes at infinity. Moreover, in the first case, if f′(x) > 0 ∀x ∈ [0, R), the decrease is strict.
Proof: Consider the Lyapunov functional,
V(m) = ∫_{y∈Rⁿ} ∫_{x∈Rⁿ} m(x) m(y) f( (1/2)‖x − y‖² ) dx dy .  (18)
Differentiating,
V̇ = ∫_{y∈Rⁿ} ∫_{x∈Rⁿ} [ṁ(x) m(y) + m(x) ṁ(y)] f( (1/2)‖x − y‖² ) dx dy
which by symmetry
= 2 ∫_{y∈Rⁿ} ∫_{x∈Rⁿ} ṁ(x) m(y) f( (1/2)‖x − y‖² ) dx dy
Substituting in (13) we arrive at the separable integral
2 Σ_{i=1}^{n} ∫_{x∈Rⁿ} m(x) [ ∫_{y∈Rⁿ} m(y) w(x, y)(x^i − y^i) dy ] [ ∫_{ζ∈Rⁿ} m(ζ) w(x, ζ)(ζ^i − x^i) dζ ] dx
= −2 ∫_{x∈Rⁿ} m(x) ‖v(x)‖² dx
which is never positive so long as f′ ≥ 0 (thus proving the first statement of the theorem), and always strictly negative provided f′ > 0 (which, together with the previous theorem, proves the second statement). The third statement likewise follows. □
In this section we imbue the state space of indicator distributions with an inner product structure; this will enable us to reason geometrically about multiagent control laws. The idea is to smooth indicator distributions by convolution with a Gaussian, producing smooth functions for which the standard L2 inner product is defined; this is illustrated by Figure 4. The reason for choosing Gaussians in particular is that the corresponding smoothing operator is injective, so that an inner product can be constructed as the pullback of the standard L2 inner product under a linear isomorphism. We define the smoothing operator in Section 3.1.3.1 and the inner product in Section 3.1.3.2. Finally, in Section 3.1.3.3, we use this inner product together with the embedding Φ introduced in (5) to compute these inner products in closed form.
[Figure 4: An indicator distribution m (left) and the smoothed indicator distribution A_w(m) (right).]
Formally, for a Gaussian kernel w, define the smoothing operator A_w : T(Rⁿ) → L2(Rⁿ, R) by,
A_w(m) = w ∗ m  (20)
where ∗ denotes convolution.
Lemma 1. The smoothing operator A_w is injective. Indeed, writing m̃ = A_w(m), the heat equation ∂φ/∂t = ∇²φ, with the initial conditions φ(z, 0) = m(Q^{−1/2}z) (where (Q^{1/2})ᵀQ^{1/2} = Q; Q^{1/2} exists since Q ≻ 0, and can be computed by e.g. the Cholesky decomposition), has the solution
φ(z, t) = Σ_{i=1}^{N} (4πt)^{−n/2} exp( −(z − Q^{1/2}x_i)ᵀ(z − Q^{1/2}x_i) / (4t) )  (22)
for t > 0, and hence m̃(x) = π^{n/2} φ(Q^{1/2}x, 1/4).³ In the same way, starting with m̃ and imposing the condition φ(z, 1/4) = π^{−n/2} m̃(Q^{−1/2}z), m can be reconstructed as the initial condition of the corresponding backwards heat equation.
³ The function (22) is known as the heat kernel.
Note that although the backwards heat equation used in Lemma 1 is extremely ill-posed numerically, it is never actually solved here; it serves only to establish that A_w is injective, so that the pullback ⟨m₁, m₂⟩_A ≜ ⟨A(m₁), A(m₂)⟩_{L2(Rⁿ,R)} is well defined and is an inner product.
Proof: Since A(m₁) and A(m₂) are square integrable by Lemma 2, their L2
inner product exists. Since ⟨·, ·⟩_{L2(Rⁿ,R)} is symmetric, so is ⟨·, ·⟩_A; since A is linear and ⟨·, ·⟩_{L2(Rⁿ,R)} is bilinear, ⟨·, ·⟩_A is bilinear; and since ⟨·, ·⟩_{L2(Rⁿ,R)} is positive definite and A is an isomorphism, ⟨·, ·⟩_A is positive definite. Hence ⟨·, ·⟩_A is an inner product. □
How does the inner product of the previous section relate to the classical indexed representation? To answer this question, we consider the kernel function k ≜ ⟨·, ·⟩_A ∘ (Φ × Φ). In other words, it is the map that computes inner products between joint states in R^{nN} directly, without the need to explicitly construct the indicator distribution representation ([8] has more on the kernel trick).
For indicator distributions of the form (5) and the inner product ⟨·, ·⟩_A defined in the previous section,
k(x¹, x²) = ∫_{x∈Rⁿ} [ A_w(δ)(x − x¹₁) + · · · + A_w(δ)(x − x¹_N) ] [ A_w(δ)(x − x²₁) + · · · + A_w(δ)(x − x²_N) ] dx
= Σ_{i,j} ∫_{x∈Rⁿ} A_w(δ)(x − x¹_i) A_w(δ)(x − x²_j) dx .  (24)
The same computation can be repeated for weighted sums of the form (6). In this case we have another embedding Φ₂ which maps (6), thought of as a formal sum, to the distribution encoded by the same, and its corresponding kernel is
k₂ = Σ_{i,j} c¹_i c²_j ∫_{x∈Rⁿ} A_w(δ)(x − ξ¹_i) A_w(δ)(x − ξ²_j) dx
which differs from (26) by the factors c¹_i, c²_j which are now included.
The significance of the kernel function we have obtained is that it gives the original indexed representation a geometry: one can reason about multiagent control laws without necessarily needing to work at the level of distributions, while distances and angles are nevertheless measured in terms of indicator distributions, and this ties the Eulerian and Lagrangian approaches together.
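When the smoothing kernel is Gaussian, each term of (24) is an integral of a product of two Gaussian bumps, which itself is a Gaussian in the distance between their centers. The sketch below (ours; the function name, width parameter, and normalization are our assumptions) evaluates the resulting kernel and checks its permutation invariance:

```python
import numpy as np

def joint_state_kernel(X1, X2, sigma=1.0):
    """<Phi(x1), Phi(x2)>_A for two indexed joint states, up to our
    choice of normalization.

    X1, X2: (N, n) arrays of agent positions.  Each smoothed Dirac delta
    is a Gaussian bump of width sigma, and the integral of a product of
    two bumps is a Gaussian in the distance between their centers.
    """
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)  # pairwise |xi - xj|^2
    n = X1.shape[1]
    return np.exp(-d2 / (4.0 * sigma**2)).sum() / (4.0 * np.pi * sigma**2) ** (n / 2)

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = A[::-1]                                # the same swarm, agents relabeled
# The kernel never sees the labels, so relabeling cannot change it:
print(np.isclose(joint_state_kernel(A, A), joint_state_kernel(A, B)))  # True
```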
One interpretation of the Eulerian view of multiagent systems presented so far is that,
rather than thinking of agents as making decisions about which actions to take, one
can instead view the states themselves as making decisions about how many agents they should contain – that is, the dynamics directly govern the number of agents in each state. We may take this interpretation very literally, and consider situations in which, e.g., rooms of a building decide which robots should enter and leave them to meet an objective.
It is instructive to compare the evolution of the mass distribution m of the agents (the Eulerian setting) to that of their joint state vector x (the Lagrangian setting). We observe that in the Eulerian setting, consensus corresponds to a distribution concentrated at one point, whereas in the Lagrangian setting, consensus corresponds to a “flat” joint state vector with all components equal. We observe, more generally, that each controller in the Lagrangian setting corresponds to a “dual”
controller in the Eulerian setting, and that certain control objectives may be easier
to achieve in one setting or in the other; this is the idea explored in Section 3.1.4.1.
The finite state spaces of these examples – e.g. rooms of a building – also motivate a discrete analogue to the continuum representation we have discussed so far. To this end, we assume the existence of a set of N rooms, together with a physical graph G_p in which edges indicate physical paths – e.g., hallways – by which agents can move between them. We likewise assume that the rooms can communicate via some network graph G_c, and, letting m_i denote the number of agents in room i, define the vector m = (m₁, · · · , m_N) ∈ R^N [The relaxation to allow for a real (rather than only natural) number of agents in each room can be viewed as a limiting case for a very large number of agents]. Assuming either a discrete timestep or Lebesgue sampling, the rooms’ routing decisions u[k] induce the dynamics,
m[k+1] = m[k] + D u[k]
subject to the elementwise state constraint m[k] ≥ 0 for all k ∈ N, where D is the incidence matrix for an orientation of G_p.
As a particular concrete example, suppose that each room of an office building has
been outfitted with a short-range (e.g., Bluetooth) wireless access point, and that
we have a number of vacuum-cleaning robots that we would like the access points to
deterministically direct throughout the building. The goal in this case, in order to
minimize the amount of dirt left in the worst-cleaned room, is to achieve a uniform
distribution of robots over the rooms. The question then becomes how the rooms of
the building, in a distributed way and while respecting state constraints, can choose
to direct robots between themselves so that the robots are eventually distributed
uniformly throughout the building. This is a version of the coverage problem, and it is the objective we pursue in this example.
For the purposes of this example we will assume G_p = G_c = G – i.e., that the physical and network topologies are the same – in which case the controller u[k] = −γDᵀm[k] yields,
m[k+1] = (I − γL)m[k]  (30)
where L is the graph Laplacian for G. Since L is positive semidefinite (see [72]), the
eigenvalues of the closed-loop system matrix lie within the unit circle for sufficiently small γ (for all but the 1 eigenvector, whose eigenvalue is exactly 1), e.g. γ = (1/2)‖L‖₂⁻¹, and so the system is stable. Moreover, it can be seen that the given controller satisfies the state constraint m[k] ≥ 0.
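A small simulation of (30) (ours; the five-room path graph and iteration count are illustrative) shows the robots spreading out to the uniform distribution while their total count is conserved:

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # rooms along a hallway
n = 5
D = np.zeros((n, len(edges)))
for j, (a, b) in enumerate(edges):
    D[a, j], D[b, j] = 1.0, -1.0
L = D @ D.T                                 # graph Laplacian of G

m = np.array([10.0, 0.0, 0.0, 0.0, 0.0])    # all robots start in room 0
gamma = 1.0 / (2.0 * np.linalg.norm(L, 2))  # keeps I - gamma*L stable and nonnegative

for _ in range(2000):
    m = m - gamma * (L @ m)                 # m[k+1] = (I - gamma L) m[k]

print(np.round(m, 3))   # -> [2, 2, 2, 2, 2]: uniform, and sum(m) is still 10
```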
We have stripped agent identities from the multiagent modeling machinery by embedding indexed joint states as indicator distributions, and obtained a partial differential model for multiagent systems which parallels the now-standard graph-theoretic constructions. Along the way, we proved stability and conservation properties, and, using the distributional representation and the so-called kernel trick, endowed the traditional vector state
Figure 5: Multiple mobile robots (red) are directed throughout a triangulated envi-
ronment with the help of wireless base stations (dark gray).
space with a new geometry. We closed with a discrete state-space analogue, which demonstrated that for certain problems a literal interpretation of the Eulerian view can be fruitful. Continuing from the Eulerian perspective, we now aim to solve problems that involve not just
static equilibria but persistent motions in the steady state. The purpose will be to route agents throughout an envi-
ronment, while satisfying various safety and efficiency constraints. Figure 5 shows a
“cartoon” of the envisioned scenario, in which static infrastructure nodes, like wireless base stations, synthesize controllers that are passed to mobile agents, that move within an environment partitioned into triangulated regions.
We will first consider “one-dimensional” models – those that represent the movement
of agents along the edges of a graph – and, in subsequent sections, address the two-dimensional case – in which controllers are synthesized over triangulated environments. Higher-dimensional generalizations – in which tetrahedra, pentachora, etc, take the place of triangles – are then also possible.
A first set of analogies relates to the use of graphs as models for environments.
Here, a vertex is the analogue of a point in a continuum, and upper-adjacency represents topology. A 0-chain is the analogue of a scalar field; its
coefficients are values assigned to the corresponding vertices. A 1-chain is the ana-
logue of a vector field; it can be thought of as assigning a directed flow to each edge.
The coboundary operator δ1∗ : C0 (K) → C1 (K) is the analogue of the gradient opera-
tor, and the boundary operator δ1 : C1 (K) → C0 (K) is the analogue of the divergence
operator. Just as the Laplacian on Rⁿ factors as ∇² = div ∘ grad, so too does the zeroth combinatorial Laplacian: L₀ = δ₁ ∘ δ₁∗.
We will make a dual analogy for simplicial 2-complexes. Here, a triangle is the analogue of a point, and each of its edges the analogue of a unit tangent vector; here, lower-adjacency represents topology. A 2-chain is the analogue of a scalar field, and a 1-chain is the analogue of a vector field, assigning a directed flux across each face. The boundary operator δ₂ : C₂(K) → C₁(K) is the analogue of the gradient operator, and the coboundary operator δ₂∗ : C₁(K) → C₂(K) is the analogue of the divergence operator.
[Figure: Irrotational, incompressible, and harmonic flows.]
In R³, the Hodge decomposition splits a vector field v into three components,
v = v_c + v_r + v_h  (31)
From a functional analysis perspective, the three terms are projections of v onto
three orthogonal linear subspaces of the space of vector fields on R3 . The three terms
are the curl-free, divergence-free, and harmonic components, respectively. The first represents sources and sinks, the second vortices, and the third global flows determined by the topology of the domain. On graphs, we will use the two-term decomposition,
v = v_c + v_r  (32)
with v_c ⊥ v_r under the inner product (2); this is the subject of Section 3.2.3. Note that one can also extract an analogue to the harmonic component of (31); this is the path taken in e.g. [50], but
it comes at the cost of treating edges rather than nodes as the agents that perform
computation.
3.2.3 Hodge Decomposition on Graphs
From Hilbert’s Projection Lemma, we know that orthogonal projections are least-squares optimal; hence the projection of a 1-chain v ∈ C1(G) onto its curl-free component can be found from the least-squares problem,
min_{p∈C0(G)} (1/2) ‖δ₁∗(G)p − v‖²  (33)
This projection will be useful not only in the subsequent sections, but also in preference learning (see e.g. [50]; preference learning is the subject of Chapter V).
What is interesting is that consensus dynamics solve the equation (33), as demonstrated next.⁴
⁴ For the (matrix representation of the) graph Laplacian of a connected graph, this is the inverse restricted to span{1}⊥; i.e., L† = (L + (1/n)11ᵀ)⁻¹ − (1/n)11ᵀ.
Proof: The ODE (37) can be written as
ṗ = −grad_p ( (1/2) ‖δ₁∗(G)p − v‖² )  (38)
which are precisely the gradient descent dynamics needed to solve (33) (Here, the
norm is that induced by the inner product (2)). Since the quadratic form is convex
on the quotient space C0(G)/N(L0(G)), gradient descent converges in that quotient space. □
The important message is that the familiar Laplacian dynamics, when forced,
compute p. The incompressible component is then obtained as,
v_r = v − v_c = v − δ₁∗ p  (39)
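The following centralized sketch (ours) computes the same curl-free/incompressible split that the dynamics (37) compute in a distributed way, replacing the gradient flow with a direct least-squares solve; the example graph and flow are illustrative:

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle plus a pendant edge
n = 4
D = np.zeros((n, len(edges)))              # matrix of delta_1 (head +1, tail -1)
for j, (tail, head) in enumerate(edges):
    D[tail, j], D[head, j] = -1.0, 1.0

v = np.array([1.0, 1.0, 1.0, 0.5])         # circulation on the triangle + a leak

# Potential p solving the least-squares problem (33): min_p ||D^T p - v||^2
p, *_ = np.linalg.lstsq(D.T, v, rcond=None)
v_c = D.T @ p                              # curl-free component
v_r = v - v_c                              # incompressible component, as in (39)

print(np.round(v_r, 3))                    # -> [1, 1, 1, 0]: the pure circulation
print(np.allclose(D @ v_r, 0.0))           # divergence-free: True
```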
Due to the important role of discrete Laplacian operators in this and subsequent sections, we pause to define them in greater generality. Before proceeding, it will also be useful to define the boundary subcomplex B(K) of K, which for a pure simplicial n-complex
K is the n − 1-subcomplex,
B(K) = cl { σ ∈ Σ_{n−1}(K) | σ has fewer than two cofaces }  (40)
where cl denotes simplicial closure.
The (n − 1)-simplices of K are thus split between B(K) and the interior (n − 1)-skeleton of F(K), which are disjoint with one another. Simply put, each (n − 1)-simplex is either a face that can be crossed, or one that cannot be crossed.
With the boundary subcomplex defined, we are now in a position to define the
energy functions:
Definition 3.2.2. For an abstract simplicial complex K, the k-th energy function E_k(K) : C_k(K) → R is defined by,
E_k(K)(x) = (1/2) Σ_{∆∈Σ_{k+1}(K)} ⟨δ_{k+1}(∆), x⟩² + (1/2) Σ_{σ∈Σ_{k−1}(K)} ⟨δ_k∗(σ), x⟩² ,  (41)
with the convention that Σ_i(K) = ∅ for all i < 0, and that summations over the empty set are zero. With this definition, the combinatorial Laplacian L_k(K) : C_k(K) → C_k(K) is the Hessian of the k-th energy function, E_k(K).
It will also be useful to define generalizations of the energy functions that omit
certain terms from the summations of Definition 3.2.2, as well as corresponding Laplacians:
Definition 3.2.4. Let K be an abstract simplicial complex and L ⊂ K be a subcom-
plex of K. Then the k-th restricted energy function Ek (K, L) is defined by,
E_k(K, L)(x) = (1/2) Σ_{∆∈Σ_{k+1}(K)\Σ_{k+1}(L)} ⟨δ_{k+1}(∆), x⟩² + (1/2) Σ_{σ∈Σ_{k−1}(K)\Σ_{k−1}(L)} ⟨δ_k∗(σ), x⟩² + (1/2) Σ_{σ∈Σ_k(L)} ⟨σ, x⟩²  (42)
The restricted Laplacian is then defined in much the same way as before: the k-th restricted Laplacian L_k(K, L) : C_k(K) → C_k(K) is the Hessian of the restricted energy function E_k(K, L). Moreover, we will refer to the special case of L(K, B(K)) as the boundary-restricted Laplacian corresponding to K.
The point of the boundary restriction is to enforce zero flow out of the complex, while still characterizing the homology group of the complex: a chain that is zero on B(K) is in the null space of the boundary-restricted Laplacian L_{k−1}(K, B(K)) if and only if it is in the null space of the standard Laplacian L_{k−1}(K).
Proof. Since x is zero on B(K), all terms of the third summation of (42) are
zero; consequently, we need only consider the first two summations. If x ∈ N L(K),
it then follows that x ∈ N L(K, B(K)), since the terms of the first two summations
in (42) are a subset of those in (41), and each term is positive. To show the converse,
we note that the only terms that appear in (41) but not in the first two summations
of (42) correspond to faces in B(K), and, since x is zero on B(K), these terms are
zero.
Finally, we define a directed zeroeth Laplacian operator:
Definition 3.2.6. Let G = (V, E) be a directed graph. Then the directed graph Laplacian L(G) is defined elementwise by,
[L(G)(x)]_i = Σ_{j | (v_j, v_i) ∈ E} (x_i − x_j) .  (43)
Note that, since G in Definition 3.2.6 is directed, it may be that (v_i, v_j) ∈ E while (v_j, v_i) ∉ E, in which case L(G) is not symmetric.
With the necessary Laplacian operators thus defined, we now consider the com-
putation of face fluxes that serve as representatives of the first homology group.
We now shift our attention from one- to two-dimensional models of the environment; the plan is to compute 0-chains over simplicial 2-complexes, to realize incompressible vector fields in their Rips Shadows as Hamiltonian vector fields, and to distribute the computation among the triangles themselves. In this line of thought, agents are 2-simplexes. For the case of air traffic control,
this represents the idea that each simplex is a region of airspace under the authority
of a particular controller on the ground, and that it is the job of these automated
ground controllers to agree in a distributed way how airplanes should be routed among
themselves.
We will assume that the graph G of the previous sections is the lower-adjacency
graph of the triangles of a pure simplicial 2-complex – i.e., that, given a 2-complex K, the nodes of G are the 2-simplices of K, adjacent whenever they share a face. Our approach will be first to construct building-block vector fields
defined on the individual k-simplices (this is the subject of Section 3.2.5.1), and then to combine them into a single global vector field (Section 3.2.5.2).
In this section we will describe the individual building blocks for our global vector
field. In particular, given a 0-chain over the vertices of a simplex, we will produce an
incompressible flow within the simplex. This is done by using barycentric interpola-
tion to create a streamfunction over the simplex, and defining a Hamiltonian vector field with respect to it. Writing X ∈ R^{2×3} for the matrix whose columns are the vertex positions of a 2-simplex, the barycentric coordinates b ∈ R³ of a point x ∈ R² satisfy,
X b = x  (44)
1ᵀ b = 1 .  (45)
It is also convenient to define the inverse matrices B₁ ∈ R^{3×2} and B₂ ∈ R^{3×1} by⁵
[ B₁  B₂ ] = [ X ; 1ᵀ ]⁻¹  (46)
i.e., the blocks of the inverse of the matrix obtained by stacking X on top of 1ᵀ.
Given a 0-chain c over the vertices of the simplex, barycentric interpolation produces the streamfunction φ(∆)(x) = cᵀb(x), whose Hamiltonian vector field can be written, in Cartesian coordinates, as
ẋ = J grad φ(∆) = J B₁ᵀ c  (48)
or, in barycentric coordinates,
ḃ = B₁ J B₁ᵀ c ≜ A(∆)  (49)
where J ∈ R^{2×2} is the matrix representation of the symplectic form (a, b) ↦ det([a, b]). I.e., J = [0 1; −1 0], so that, for all a, b ∈ R², aᵀJb = det([a, b]) (where [a, b] ∈ R^{2×2} denotes the matrix with columns a and b).
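The per-simplex computations (44)-(49) amount to a handful of small matrix operations. The sketch below (ours; the vertex positions and streamfunction values are illustrative) also verifies that the resulting velocity is orthogonal to grad φ, i.e., that the flow follows level sets of the streamfunction:

```python
import numpy as np

X = np.array([[0.0, 1.0, 0.0],        # triangle vertex positions (columns)
              [0.0, 0.0, 1.0]])
c = np.array([0.0, 1.0, 2.0])         # 0-chain: streamfunction value per vertex

M = np.vstack([X, np.ones((1, 3))])   # the stacked matrix [X; 1^T] of (44)-(45)
Minv = np.linalg.inv(M)
B1, B2 = Minv[:, :2], Minv[:, 2:]     # the inverse blocks of (46)

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])           # symplectic form: a^T J b = det([a, b])

grad_phi = B1.T @ c                   # gradient of the interpolated streamfunction
xdot = J @ grad_phi                   # Cartesian velocity (48)
A_delta = B1 @ xdot                   # barycentric velocity A(Delta), as in (49)

print(xdot, np.dot(xdot, grad_phi))   # the 0 confirms a divergence-free rotation
```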
We will now use these per-simplex building blocks to assemble a single global
vector field on K.
⁵ The inverse has a nice interpretation: b_i is the ratio of the volume of the simplex with x substituted for x_i, to that of the original simplex.
3.2.5.2 A global vector field
Under the assumption that the interiors of the Rips Shadows of all the simplices are disjoint, a global vector field ν on R(K) (in barycentric coordinates) can be defined by,
ν(x) = A(∆) if x ∈ R(∆) ∀∆ ∈ K .  (50)
In the section that follows, we will show that this vector field is globally divergence-free. Before proceeding, however, we would like to point out that, already, (50) by itself
constitutes a single hybrid controller for the vehicles: Each vehicle looks up which
2-simplex ∆ it is in, requests the vector A(∆) from ∆, and then follows that vector
field.
What we would like, then, is a single global streamfunction – one that produces the vector field (50) – for some global 0-chain over K. In the following
sections, we prove that such a 0-chain exists, and give algorithms for computing it.
Definition 3.2.7. Given an oriented simplicial k-complex K, a vector field (in barycen-
tric coordinates) ν : R(K) → R³ agrees with a (k − 1)-chain v if, for each simplex ∆ ∈ Σ_k(K), the flux of ν across each face of ∆ equals the corresponding coefficient of v. We then claim that, for any 1-chain v in the cycle space of G, there exists a 0-chain over the vertices of K that induces a Hamiltonian vector field agreeing with v on the Rips Shadow of K.
Proof: Since the edge flow v is in the cycle space of G, then, by cycle-cut duality, it is in the cut space of G∗, the 1-skeleton of K; i.e., it is the coboundary of some 0-chain over the vertices of K, as desired. □
Method 1 This first method serves to motivate the second. As in Section 3.2.3, we are faced with the problem of computing a 0-chain whose boundary best approximates a given 1-chain; hence the global 0-chain c ∈ C0(K) can be computed using the consensus dynamics (37), where now c is a 0-chain over the vertices of K rather than of G, and the operators are those of K. The difficulty is that computation at the vertices, rather than at the triangles, is required for an actual implementation. The next method avoids this messiness, and is much more compatible with the reality that it is triangles, not vertices, that represent agents.
and 1-chains over ∆ representing streamfunction values and face fluxes, respectively.
43
where δ1 (∆) has the matrix representation
0 1 −1
E3 , −1 0 1 . (54)
1 −1 0
Since the matrix E3T has a 1-dimensional null space spanned by 1, there is a family
of solutions,
where 1 ∈ C0 (∆) is the 0-chain that assigns a 1 to each vertex.6 What this means is
that, if a single agent – a triangle – knows its face fluxes, then it can independently
determine what the 0-chain over its vertices should be, up to a constant. The coor-
dination problem then is only to determine that scalar s for each triangle – i.e., a
What need the values s1 , · · · , sN of the different triangles satisfy? Namely, for
two consistently-oriented simplices indexed i and j, sharing a face that is the k th face
1
si − sj = − [Dk (v̄i ) − Dl (v̄j )] , wij (56)
6
where v̄j ∈ R3 is the vector representation of the restriction of the 1-chain v to the
The skew-symmetric matrix W = [wij ]ij itself encodes a 1-chain over G. The problem
has thus been reduced to computing a 0-chain s ∈ C0 (G) – that with coefficients
6
Note that the matrix representation of the pseudoinverse in (55) is particularly simple: (E3T )† =
1
3 E3 .
44
to be its boundary. Hence, s can be computed asymptotically by the system,
much as before.
The two distributed computations described in the previous sections can be performed
simultaneously within the network, and stability properties are maintained. This is
(where D is the linear operator that produces the 1-chain w following (56)), converges
asymptotically to a vector in C0 (G) × C0 (G) that solves the equations (35) and (56).
triangular, so its eigenvalues are those of its diagonal blocks. Those in turn are graph
Laplacians, which are known to be positive semi-definite (see e.g. [72]). Consequently,
larger than 1 × 1 – a possibility that is ruled out since image(δ1 Dδ1∗ ) ⊥ N (L0 ).
In this section, we explore the use of the the vector fields obtained in the previous
sections to direct air traffic throughout an environment, in order to give a flavor for
how the preceding theory can be applied. The idea will be to project control inputs
onto the divergence-free subspace, using consensus dynamics as in Section 3.2.3; this
45
The aircraft are assumed to inhabit the nodes of the graph G (which correspond
to different regions of airspace), and the edges encode which regions are adjacent.
vertex – i.e., the number of aircraft that can safely share that airspace – and let
at each vertex. Assuming m(0) = M – i.e., that the airspace is initially filled to
capacity – we investigate the general problem of directing the aircraft between them
For this example, we restrict our attention to the case when safety is ensured
ṁ(t) = 0 ∀t, which in turn is satisfied by ensuring that the 1-chain describing the air
traffic is divergence free. Two such problems naturally arise, from different projection
Suppose an operator wishes to command the air traffic system with a particular
reference vector field. One way in which the system can respond to this command is
by providing the vector field that approximates the commanded field optimally in a
the projection problem of Section 3.2.3, so the problem can be directly solved in a
distributed fashion by the algorithm (37). We should note that, in order to do this,
the operator need only communicate with two nodes per nonzero commanded edge
finding the smallest (in an l2 sense) divergence-free flow containing the commanded
flow v̄ as a component; this is the lowest-energy safe holding pattern that guarantees
46
a certain amount of traffic on specified edges. In this case, we seek a solution to the
1
arg min ||v||2 (60)
v∈C1 (G) 2
Theorem 7. Let Pr denote the l2 projection operator for the divergence-free subspace
of C1 (G), which is computed by (37). Then for all v̄ 6= 0, the problem (60-62) either
so in this case the problem is infeasible. Hence, without loss of generality, suppose
/ (ker δ1 (G))⊥ .
v̄ ∈
vc,|| + vc,⊥ , with vc,|| ∈ span Pr v̄ and vc,⊥ ∈ (span Pr v̄)⊥ ; hence a unique decomposition
vc = vc,|| + vc,⊥ + vr exists, and by the Pythagorean Theorem, ||vc ||2 = ||vc,|| ||2 +
||vc,⊥ ||2 + ||vr ||2 . By (61), ||vr ||2 = 0, and by (62), ||vc,|| ||2 = ||v̄||2 . Only ||vc,⊥ ||2
remains free; the quantity ||vc ||2 is minimized when ||vc,⊥ ||2 = 0. To summarize, we
know that v = vc,|| with vc,|| ∈ span Pr v̄, and that ||vc,|| ||2 = ||v̄||2 . Only one element
of the vector space C1 (G) satisfies these properties, and that is (63).
To demonstrate the character of the results obtained with these methods, starting
47
the divergence-free projection of a commanded 1-chain on G with three nonzero ele-
ments, and the corresponding 0-chain on K and streamfunction on the Rips Shadow
of K; this is shown in Figure 8. Note that the large commanded flow across a single
face at the upper right of the complex is propagated through the “jughandle” at the
upper right, and that the commanded flows lower in the complex in less confined
areas result in pairs of vortices that have mostly local effects; nevertheless, small
flows are produced throughout the complex. These qualitative characteristics are
typical of the kinds of flows obtained: Where necessary, flows propagate globally, but
otherwise most effects of a command are manifested locally. It is the pressure field
that propagates this information; essentially, “shocks” are created across the faces
where large flows are commanded, and elsewhere the pressure is smoothed across the
complex by diffusion. The nonzero commanded flow at the upper right demonstrates
this well; it creates a “shock” in the pressure field (black triangle next to white trian-
gle), which diffusion spreads into linearly-decreasing pressure around the upper right
“jughandle.” Where vortices are produced, the streamfunction exhibits a pair of local
one – as can be observed in the left part of the complex. Vehicles then follow level
A demonstration was produced using a collection of ten Khepera III mobile robots,
of the complex; internal edges are shown in purple. The input flows are generated
by a human using a motion capture wand; whenever the wand’s projection onto the
floor crosses an internal edge, an input flow is added in the same direction across that
edge, with magnitude an increasing function of the wand’s speed. Video frames, with
48
Figure 8: Computational results are shown. Given a flow as input (first plot; arrow
sizes indicate flow magnitudes) on G, a circulant flow on G and a streamfunction on
the Rips Shadow of K are produced (second plot). The Lagrange multipliers for the
cycle-space projection (third plot) are a close analogue of pressure in the dynamics
of incompressible fluids. The streamfunction is computed locally at each triangle,
requiring only the addition of a local offset (fourth plot), which is computed in a
distributed fashion.
49
Figure 9: Incompressible-flow human-swarm interaction demo: The scenario.
Given specified input flows, distributed consensus-like algorithms were described that
to two-dimensional streamfunctions that generate vector fields over the entire Rips
of incompressible fluids, and, since vehicles following them will never concentrate in
any region, provide a useful method for coordinating collision-free navigation among
We will now consider more complicated flow constraints. The goal is, as before, for
agents can circulate throughout the environment, as required for robot patrol or
aircraft holding-pattern applications, but we will now require that the flows satisfy
additional conditions.
In particular, we look for any vector field f in this family to satisfy three properties:
P1.1 The first of these, uniform coverage, insists that if the initial probability density
50
Figure 10: Incompressible-flow human-swarm interaction demo: Video frames, over-
laid with flows (gray arrows) and streamfunctions (colored gradients; blues are low
values and reds are high). 51
Figure 11: The vector fields produced avoid the concentration of agents within any
one control volume (dashed circle) (left), paths with local loops (center), and collisions
with the boundary of the complex (right).
environment, that this condition persist; i.e., that m(x, t) = 1/ Area R(K) for
P1.2 The second, no local cycles, encourages that efficient paths without unneces-
sary loops be traced out by the agents following v, and is expressed by the
point.
P1.3 The third, zero boundary flux, requires that agents not leave the environment.
Formally, this means that f must have no component orthogonal to the bound-
This problem will be addressed in three parts. First, it is shown how existing
distributed protocols can be used to produce face fluxes satisfying a discrete version
of most (but not all) of the requirements P1. Next, symplectic vector fields are
generated over the continuous space that are consistent with given flows, also in a
distributed fashion. Finally, a simple, unified algorithm is presented that solves the
continuous and discrete parts of the problem simultaneously, while also satisfying the
52
3.2.12 Distributed Computation of Homological Streamfunctions
In the next subsections, we will describe two classes of distributed methods for com-
puting vector fields satisfying P1. The first, which serves to motivate the second,
employs a method described in [75] to compute face fluxes within K, and then, as
vector fields that are consistent with those fluxes, via streamfunctions. The second,
fied, unified algorithms that compute the discrete flows and continuous controllers
simultaneously, and that also satisfy additional requirements neglected by the first
method.
We will make a departure from the previous section, in which 1-chains represented
flows on edges, and a dual graph was constructed, by using n − 1-chains on K (i.e.,
elements of Cn−1 (K)) to, directly and without constructing a dual graph, represent
flows of mobile agents across faces. The idea is that, if a simplex σ ∈ Σn−1 (K) appears
in a chain with positive coefficient c ∈ R, then agents are flowing across σ from the
Cn−1 (F (K)) that serve as representatives of the homology group Hn−1 (F (K)). Later
in this section, continuous control laws will be produced that achieve these fluxes.
dom elements of the homology group Hk (K) without global knowledge of the graph
topology.
Recalling that Hk (K) = Zk (K)/Bk (K), what we will do is produce unique repre-
sentatives of elements of Hk (K), that have a component in Zk (K) but not in Bk (K).
We are able to do this because a natural isomorphism exists between Hk (K) and
53
the null space N Lk (K) of the k-th combinatorial Laplacian. Since N Lk (K) =
Since the restricted energy function E1 (K, B(K)) is convex, gradient descent from
any point converges asymptotically to N L(K, B(K))k . Indeed, since E1 (K, B(K))
has a quadratic but not a linear term, those gradient dynamics are simply,
which, so long as k is not too large, constitute a distributed method for asymptotically
the sparsity pattern of Lk implies that this process requires (k+1)-hop communication
in each round.
In [75], this property of the dynamics (64) was used to project a random 1-chain
onto the homology group H1 (K) in order to determine, with unit probability, whether
it is trivial (i.e., has dimension zero). In this way, a sensor network could determine
For our purposes, what matters is that this is an algorithm for producing unique
turn our attention to the generation of continuous control laws from these edge flows.
these can be produced via streamfunctions, which we describe in the next subsection.
We will compute streamfunctions over R(K, r) that induce vector fields that agree
with the flows computed in the previous section in the following sense:
V (K) → Rn , a vector field v : R(K) → Rn agrees with a (k − 1)-chain ν if, for each
simplex ∆ ∈ Σk−1 (K), the flux of v across R(∆) equals hν, ∆i.
54
Now that we have both (1) a method for computing 1-chains in H1 (K) and (2)
a method for computing streamfunctions that agree with them, in principle the two
unified manner, while additionally satisfying P1.3, as described in the next section.
The key idea of this section is that the two algorithms just described can be unified
by considering as a whole the properties that the 0-chain defining the streamfunc-
tion must satisfy; in the process, we will enforce the additional constraints imposed
by P1.3. We will first consider a undirected, 2-hop algorithm that follows directly
from our definitions, before introducing a directed, 1-hop algorithm that, remarkably,
converge asymptotically from any initial condition c(0) ∈ C0 (K) to a value c(∞) ∈
1
E(K, B(K))(x) = x∗ L1 (K, B(K))x ; (66)
2
consequently
1
E(K, B(K))(δ1∗ (K)c) = c∗ (t)δ1 (K)L1 (K, B(K))δ1∗ (K)c(t) (67)
2
55
whose gradient-descent dynamics with respect to c are (65).
What this means is that, by running the simple, linear, 2-hop protocol (65), we
vector field satisfying P1. However, in the next section, we will see that this is also
Ultimately, we will produce a 0-chain satisfying the desired properties (i.e., that
graph built from the complex K, which allows information to flow bidirectionally
within the interior of the complex, as well as bidirectionally within the boundary, but
only in a single direction between the two. This directed graph, which we will refer
to as the insulated 1-skeleton, G(K), only allows a 1-way flow of information from
P2.1 For all a, b ∈ Σ0 (K/B(K)), we have (a, b), (b, a) ∈ E if and only if (a, b) ∈
P2.2 For all a, b ∈ Σ0 (B(K)), we have (a, b), (b, a) ∈ E if and only if (a, b) ∈
links the vertices in the interior of the complex; another links those in its boundary;
56
Figure 12: A pure simplicial 2-complex K (left), its boundary subcomplex B(K)
(center), and the corresponding insulated 1-skeleton G(K) (right)
and edges between the two are directed from the boundary to the interior.
Figure 3.2.12.3. The next theorem explains how consensus dynamics on G(K) can
and L the directed Laplacian corresponding to G(K). For any 0-chain c0 ∈ C0 (G(K)),
c(0) = c0
Proof : We must demonstrate that the system is stable, and that {c | δ1∗ c ∈
N L1 (K, B(K))} is its equilibrium set. First, we address stability. Without loss of
generality, the dynamics (68) can be block-decomposed as,
ẋ −(Lx + Dxb ) C x
= (69)
ḃ 0 −Lb b
where Lx is the (undirected) Laplacian for the undirected graph G(K)/B(K), Dxb is
57
connecting vertices in G(K)/B(K) to B(K)), Lb is the (undirected) Laplacian for B,
To demonstrate stability, we must show that none of the Jordan blocks for the
system’s zero eigenvalues are larger than 1x1, and that all eigenvalues are nonpositive.
First, note that, although the lower diagonal block, −Lb , does have a nontrivial
nullspace, it, by the Spectral Theorem, is diagonalizable, so none of its Jordan blocks
are larger than 1x1. Next, consider the upper diagonal block −(Lx + Dxb ): Since
zero-chain in the null space of Lx (i.e., that is constant on each connected component
of G(K)/B(K)) can also be in the null space of Dxb , and consequently −(Lx + Dxb )
is negative definite. Since the upper diagonal block has strictly negative eigenvalues,
and, although the lower block does have zero eigenvalues, they correspond to Jordan
Next, we demonstrate that the equilibrium set is precisely the subspace {c | δ1∗ c ∈
• No circulation: The first sum of (42) is zero if and only if, for each simplex ∆ ∈
Σ2 (K)/Σ2 (B(K)), we have hδ2 (∆), δ1∗ (c∞ )i = 0, or equivalently (by definition
• No divergence in interior: The second sum of (42) is zero if and only if, for
each σ ∈ Σ0 (K)/Σ2 (B(K)), we have hδ1∗ (σ), δ1∗ (c∞ )i = 0, or, equivalently, if
hσ, δ1 δ1∗ c∞ i = hσ, L0 (c∞ )i = 0. This is precisely the condition, specified by the
upper diagonal block of the system matrix, for vertices in σ ∈ Σ0 (K)/Σ2 (B(K))
to be at equilibrium.
• Boundary condition: The third sum of (42) is zero if and only if, for each
58
Input: Oriented simplicial complex K; q ∈ N
Output: c∞,1 , · · · , c∞,q
Algorithm:
• For i = 1 to q
each connected component of B(K). The lower diagonal block of the system
matrix is the standard zeroeth combinatorial Laplacian for B(K), and its null
With stability guaranteed and the equilibrium set characterized, we conclude the
proof.
With the “heavy lifting” done by the distributed projection algorithm, the remainder
resulting family of vector fields induced by the zero-chains c∞,1 ), · · · , c∞,q will span
a conservative upper bound on h, then the set of behaviors obtained will not be
7
i.e., in a 2-complex whose first Betti number is h
59
Input Flow Incompressible Flow Harmonic Flow
Figure 14: Depicted are a specified flow (left), and its projections onto the incom-
pressible (center) and harmonic (right) subspaces. The harmonic flow described in
this section (right) differs from the incompressible flow described in Section 3.2.8
(center), in that the former avoids the local vortices visible in the latter. In both
cases, it is a collection of hybrid, piecewise-linear controllers that realize the flows.
These controllers are produced as the Hamiltonian vector field corresponding to a
piecewise-linear streamfunction (color gradients).
linearly independent; i.e., vector fields will be generated that are redundant in the
sense that they lie in the span of the others. In general, two ways to deal with this
situation exist:
in e.g. [14], [64], [90], [46], which perform distributed Arnoldi-like iterations to com-
is that the computation of inner products inherently requires sums across all of the
agents; hence each outer iteration of these algorithms involves an entire consensus
problem to compute inner products, and the algorithms are consequently (and nec-
haviors satisfying the properties P1, there is no particular reason to believe that a
minimal set of such behaviors is required. Indeed, it is precisely by relaxing the or-
An example showing the output of the algorithm of Figure 13, compared to that
60
3.2.14 Distributed-Infrastructure Routing: Conclusions
static infrastructure nodes can synthesize controllers for mobile agents that cause
in any location, or following paths with contractible loops. This is done first by com-
bining existing algorithms for computing flows and synthesizing controllers that agree
with the flows; then by, in a unified fashion, computing controllers and flows together
remarkably, arises from a directed Laplacian. The result is a family of linear pro-
the environment.
61
CHAPTER IV
LAGRANGIAN SWARMS
that rendered agent identities irrelevant, in this chapter we will address swarm repre-
sentations that explicitly maintain the state of all agents in the system. We will begin
interact via holonomic constraints, and then describe approximate formation track-
ing algorithms that allow a controller, by manipulating just a few “leader” agents, to
For our purposes, each agent will be a Lagrangian mechanical system. We will ini-
framework; however, in later sections, we will, invoking the implicit function theo-
rem, impose local coordinate charts and solve optimal control problems on general
nonlinear systems on Rn .
controlled leader agents that represent the exogeneous input to the system, and a
finite set F of follower agents that interact with each other and with the leaders
62
S
all agents in L F , and Lagrange multipliers λij : R+ → T∗ (Qi ) for all (i, j) ∈ F ×
S
(L F ). The control input signals are, for each i ∈ L, a function ui : R+ → T Qi that
specifies velocities at all times.1 We will additionally require that the potential fields
V , {Vij }ij be continuously-differentiable, and that the constraint functions {hij }ij be
γi0 (0) = γ0
with initial condition γ0 ∈ T Q (note that, since tangent vectors implicitly refer-
ence their base points, γ0 contains not just the initial velocity, but also the initial
configuration).
Here, we also use the slightly nonstandard notation f ∗,i to refer to what we will
call the slot pullback of a function – a partial derivative akin to the slot derivative. Pre-
cisely, if f : S1 ×· · ·×Sn → T , then f ∗,i (s1 , · · · , sn ) = (s 7→ f (s1 , · · · , si−1 , s, si+1 , · · · , sn ))∗ (si ).
I.e., it is the standard pullback of f , considered as a function only of its i-th argument.
slot gradient.
S
On a subset Ui ⊂ Qi of each configuration manifold for i ∈ L F , one may
1
Each ui must also satisfy πT Q ◦ γi ≡ πT Q ◦ ui , where πT Q is the bundle projection map. All this
means is that the only velocities that can be specified while at a point γ(t) are those that actually
exist in T γ(t).
63
the first equation of the DAE (70) can be written,
ni X
X ni
q̈i (t) + Γi,jk (qi (t))q̇i,j (t)q̇i,k (t) =
j=1 k=1
dVi T
− Mi−1 (qi (t)) (qi (t))
dqi
X ∂Vij T (71)
− Mi−1 (qi (t)) (qi (t), qj (t))
j∈Ni
∂q i
" #
X ∂hi,j T
+ Mi−1 (qi (t)) (qi (t), qj (t))λij (t)
j∈N
∂qi
i
matrix), and each Γi,jk is a vector of Christoffel symbols of the second kind. The left
side of the equation represents coordinate-free acceleration, which includes the effects
of Coriolis and centripetal forces; and the right side represents internal, interaction,
A common special case of the dynamics (70) arises when agents are particles – i.e.,
Q1 = · · · = QN = Rn , for some n > 0 (typically 2 or 3), and the constraints {hij }ij
that link agents specify interagent distances. In this case, the (shared) configuration
manifold is not just a Riemannian manifold but also an inner product space, and the
with constants rij = rji > 0 for all i, j. Dropping time arguments, the dynamics (70)
64
4.1.2 From DAEs to ODEs by the Implicit Function Theorem
In subsequent sections, it will be helpful not to work directly with DAEs like (70),
This approach comes with the caveat that coordinate charts will have singularities,
but in practice we have found that, by a reasonable choice of coordinate chart, one
can position singularities away from the regions of the configuration manifold that
are of interest.
A physical analogue exists for swarms that behave according to (73), after only mild
relaxations, in the form of marionettes like that shown in figure 15. Here, one has a
collection of mechanical systems – the various rigid bodies (arms, legs, torso, etc) that
comprise the marionette, that are coupled by joints. We will apply the optimal control
In our case, the marionette is modeled as a collection of ten point masses m1 , · · · , m10
joined by massless rods and suspended from four massless strings in two dimensions,
as illustrated by Figure 17. The free endpoints p1 , · · · , p4 of the strings can be moved
65
Figure 15: The marionette Figure 16: Human subject
kinematically, and the remainder of the model is fully dynamic. The resulting me-
chanical system has 11 dynamic and 8 kinematic degrees of freedom, for a state space
on the distance between the points at which their ends are anchored; in our model
where agents take the role of point masses, interagent distance constraints take the
role of massless rods, and the string model, based on an artificial potential, corre-
sponds exactly to the artificial potentials typically used for connectivity preservation
specifically, q(t) consists of the position in R2 of m1 , together with the nine joint
angles. Additionally denoting the generalized velocity by ν(t) ∈ R11 , the dynamics
q̇ = ν (75)
!
X
ν̇ = M (q)−1 − Γij (q)νi νj − ∇Ugrav (q) − ∇Uspring (q, p) − µν (76)
i,j
ṗ = u (77)
66
p1 p3 p4 p2
m2 m1 m4
m3 m5
m6
m7 m8
m9 m10
Figure 17: The marionette is modeled as a system of point masses (m1 , · · · , m10 )
interconnected by massless rods (solid lines) and suspended by strings (dashed lines)
from four kinematically-controllable points (p1 , · · · , p4 ).
where M (q) is the mechanical system’s mass matrix, each Γij (q) ∈ R11 is a vector of
damping coefficient, and Uspring (q, p) is the sum of spring potentials; specifically,
X
Uspring (q, p) = k1 exp k2 ||ρi (q) − pj ||2 − lij
2
(78)
(i,j)∈Ispring
where Ispring = {(2, 1), (7, 3), (4, 2), (8, 4)} is a set of index pairs summarizing the
spring connections, each ρi : R11 → R2 is the forward kinematic map from the
configuration q to the position of the i-th point, lij is the nominal length of the string
The output map h in this case depends only on the configuration component q
of the state, and returns the positions ρ1 (q), · · · , ρ10 (q) of the ten masses m1 , · · · , m10 .
Our goal is that the output signal y track the corresponding positions r̄ = [ρ̄1 , · · · , ρ̄10 ]T
next sections. More specifically, the reference signal was created by a human dancer
67
motion-capture software,2 and projected onto a coronal (or frontal ) plane.
dressed, we now turn our attention to optimal control algorithms for these systems.
another? We would like to consider this question in a reasonably general way to begin
motion – besides that they can be measured and that they vary in time. Hence,
whether they are described by joint angles of an articulated figure, or as the positions
defined on some time interval [0, T ], T ∈ R+ . Again we ask: How should y1 and y2
(Q being any symmetric positive definite matrix) and consider motions with smaller
distances between them to be “more similar.” Alternatively, we might use the angle
as our measure, and treat motion signals with small angles between them as more
For instance, see Figure 18, and suppose that the given signals are joint angle tra-
2
Vicon VisionIQ was used.
68
that the reference signal represents a “wave” motion. Although Signal 1 also repre-
sents a “wave” motion, simply dilated and shifted, the qualitatively different signal
2 would be judged more similar using either the L2 metric (79) or angle (80) as the
dissimilarity measure.
Reference
Signal 1
Signal 2
A basic problem in both (79) and (80) which is highlighted by Figure 18 is the as-
sumption that the temporal correspondence between different signals is known a priori
– that y1 (t1 ) should be compared to y2 (t1 ) and not y2 (t2 ) for all t1 6= t2 (t1 , t2 ∈ [0, T ]).
Much is assumed about the spatial correspondence as well: (79) presupposes that at
any time t ∈ [0, T ] it is preferable that y1 (t) be equal to y2 (t) than to some other value,
e.g., a scaled or otherwise transformed value of y2 (t). The angle between signals (80)
is somewhat more forgiving – comparing not motions themselves but entire equiva-
lence classes of motions which are the same up to a positive scalar multiple – but
nevertheless it cannot tease out any spatial correspondences more complicated than
this. Hence, in the sections that follow we will propose a measure of the similarity of
69
4.2.1 Time Warping
1. w is continuously-differentiable.
2. w is strictly increasing
3. w(0) = 0 .
of all time warping functions (that is, those functions satisfying 1-3 above) by Ω, and
the set of derivatives of time warping functions by Ω0 ; i.e., Ω0 is the set of continuous
the goal of time warping is to find a function w satisfying the above (i.e., w ∈ Ω) that
τ 2
Example 4.2.1. Suppose T = 2π, r(τ ) = sin( 2π ), and y(t) = sin(t) for all τ ∈ [0, T ]
τ2
w∗ (τ ) = ∀τ ∈ [0, T ]
2π
2
t
since (y ◦ w)(τ ) = sin( 2π ) = r(τ ), so J(w∗ ) = 0 ≤ J(w) ∀w 6= w∗ . Also note that
although in this particular example, exact matching is possible (i.e., there exists a
3
An example of a function that satisfies these requirements and is a homeomorphism but not a
diffeomorphism would be x 7→ x3 , because its inverse is not differentiable at the origin.
70
At this point it may be appropriate to point out that another measure of the
where Λ ⊂ Ω is the set of all strictly increasing functions λ such that λ(0) = 0
and λ(T ) = T . Some salient differences between (82) and (81) include that we use
generalized L2 norms instead of supremum norms; allow a slightly larger class of time
warping functions; and do not directly penalize deviation of the time warping function
from the identity function, but rather deviation of their derivatives. Additionally, the
in (82). The strength of the Skorohod metric, which we do not require, is that it
results in a complete metric space. An advantage of the cost (81) over (82) for our
purposes is that it considers the duration and not just the amplitude of deviations
between signals.
problem as well. We will do this in two ways: First, we will solve the problem when
“any” time warping function is allowed; we call this the nonparametric time warping
problem. We follow this with a look at techniques that fix a particular parametrized
form for the time warping function – we address linear functions and, more generally,
dxt
(t) = f (xt (t), ut (t), t) (83)
dt
yt (t) = h(xt (t))
71
(with xt (t) ∈ Rn , ut (t) ∈ Rm , yt (t) ∈ Rp , and compatible dimensions for the domain
and codomain of f and h)4 we consider, to begin, the optimal control problem,
"Z Z w(T ) #
T
min ||yt (w(τ )) − r(τ )||2Q dτ + ||ut (t)||2Ru dt (84)
u∈U[0,T ] ,w∈Ω 0 0
or equivalently,
Z T
min ||yt (w(τ )) − r(τ )||2Q + w0 (τ )||ut (w(τ ))||2Ru dτ (85)
u∈U[0,T ] ,w∈Ω 0
functions from [0, T ] to Rm .5 In other words, we would like the output y to “track” the
reference signal r as closely as possible. However, note that this differs from the usual
tracking problem by the introduction of the time warping function w : [0, T ] → [0, ∞),
and also in that we integrate tracking error over reference time (“τ time”) instead of
augmenting the state with the time t = w(τ ) we can write the dynamics in reference
time as,
d xt ◦ w v(τ )f ((xt ◦ w)(τ ), (ut ◦ w)(τ ), w(τ ))
(τ ) = (86)
dτ w v(τ )
a function of time (so in fact t = w), we can simplify this notation and combine
the above with (85) to state the following standard (Bolza) optimal control problem:
y(τ ) = h(x(τ ))
4
Note that the subscript t is simply part of the function names xt , yt , etc, and is used to
distinguish these functions from others to be introduced later.
5
In (85) and elsewhere we denote || · ||2M , (·)T M (·) and assume M = M T 0, for whichever
matrix M is used in the subscript.
72
with known initial conditions (x, t)(0) = (x0 , 0), minimize the cost functional
Z T
Jtrack (u, v) = ||y(τ ) − r(τ )||2Q + v(τ )||u(τ )||2Ru dτ (88)
0
over the functions u and v over U[0,T ] and Ω0 , respectively (these functions can be
In fact, however, we will actually use a slightly modified cost that also penalizes
large deviations of w0 (τ ) from one, both to regularize the problem and to capture the
intuition that signals that must be “warped” by a great deal are more dissimilar than
J : U[0,T ] × Ω0 → R
This, together with (87), is the problem with which we will be interested in this
section. Since it is also a Bolza problem, the solution can be found using standard
optimal control theory, which we apply in the next section. First, however, we will
present a brief discussion of the significance of dynamics to the time warping problem
Example 4.2.2. Consider the underdamped simple harmonic oscillator with state
73
where the reference signal to be tracked is the sinusoid,
with ωr ∈ R+ . Moreover for clarity of exposition we will limit our attention to time
for some ξ ∈ R+ (Such parametric time warping functions are discussed in more
Before proceeding, we point out that in this example the control input that globally
minimizes the cost (91) will not be unique. This is, informally, because components
of signals that are bounded and occur for finite time “disappear” in the infinite-time
for any bounded ũ such that limt→∞ ũ(t) = 0. The consequence for this example is
we will assume that they are in fact sinusoids. Also observing that the phase of ut ◦ w
has no effect on the second term of (91), it must be that y ◦ w = a cos(ωr t) for some
a ∈ R+ , and
a ωr
ut (t) = ei ξ t (94)
h(iωr /ξ)
6
These arguments can be made rigorous using Plancherel’s identity for Fourier series and consid-
ering a sequence of values for T that are multiples of ω2πr ; we have omitted this lengthier development
for the purposes of our informal discussion.
74
It follows that the minimization problem (91) then reduces to,
" Z Z ξT 2 #
Q T Ru a ω r
min lim (a − 1)2 cos2 (ωr τ )dτ + ei ξ t dt
a,ξ T →∞ T 0 ξT 0 h(iωr /ξ)
" 2 #
Q R
u a
= min (a − 1)2 + . (95)
a,ξ 2 2 h(iωr /ξ)
For any fixed a, this is minimized with respect to ξ when the magnitude of the transfer
p
function |h(iωr /ξ)| is maximized. This occurs when ξ = ξ ∗ , ωr /(ω0 1 − ζ 2 ) – that
is, when ξ is chosen so that the resonant frequency of the system with system matrix
0 1
ξ (96)
2
−ω0 −2ζω0
warping is to scale the frequency axis (by the Fourier Dilation Theorem) so that
passbands of the system coincide with concentrations of energy in the reference signal.
(In fact, this effect is sufficient to overcome certain performance limitations, caused by
unstable zero dynamics, that are inherent to the unwarped reference tracking problem;
Theorem 10. The first order necessary optimality conditions for the minimization
of (89) are
75
Proof : Taking (x, t) to be the state, (u, v) the control input, and (λ, µ) the
H((x, t), (u, v), (λ, µ), τ ) = L(x, (u, v), τ ) + vλT f (x, u, t) + µv (99)
where
L(x, (u, v), τ ) = ||h(x) − r(τ )||2Q + v(τ )||u||2Ru + Rv (v(τ ) − 1)2 (100)
dλ ∂H T
− (τ ) = ((x, t)(τ ), (u, v)(τ ), (λ, µ)(τ ), τ )
dτ ∂x
∂L T ∂f T
= (x(τ ), (u, v)(τ ), τ ) + v(τ ) (x(τ ), u(τ ), t(τ ))λ(τ )
∂x ∂x
∂f T
= 2h0 (x)T Q(h(x) − r(τ )) + v(τ ) (x(τ ), u(τ ), t(τ ))λ(τ ) (101)
∂x
dµ ∂H
− (τ ) = ((x, t)(τ ), (u, v)(τ ), (λ, µ)(τ ), τ )
dτ ∂t
∂f T
= v(τ ) (x(τ ), u(τ ), t(τ ))λ(τ ) (102)
∂t
∂f
with (λ, µ)(T ) = 0. Note that when f is not time-varying, ∂t
= 0, which gives the
In any case, the first order necessary optimality conditions (FONCs) are,
∂H
((x, t)(τ ), (u, v)(τ ), (λ, µ)(τ ), τ ) =
∂u
2v(τ )uT (τ )Ru + v(τ )λT (τ ) ∂f
∂u
(x(τ ), u(τ ), t(τ )) = 0T
(103)
∂H
((x, t)(τ ), (u, v)(τ ), (λ, µ)(τ ), τ ) =
∂v
||u(τ )||2Ru + 2Rv v(τ − 1) + λT (τ )f (x(τ ), u(τ ), t(τ )) + µ(τ ) = 0
In fact, these equations (103) can be given the stronger interpretation of stating
that the gradient of the functional (89) in the functional space of which (u, v) is an
element must be zero; we will take advantage of this interpretation in the later section
4.2.1.2 which describes an algorithm for computing the optimal (u, v).
76
Example 4.2.3. The preceding, more general formulation can be used to solve the
usual, more specific time-warping problem (81). Given signals y and r to be compared,
H((x, t), v, (λ, µ), τ ) = (x − r(τ ))2 + k(v − 1)2 + λvy 0 (t) + µv, (106)
dλ
− (τ ) = 2(x − r(τ )) (107)
dτ
dµ
− (τ ) = λ(τ )v(τ )y 00 (t(τ )) (108)
dτ
∂H
((x, t)(τ ), v(τ ), (λ, µ)(τ ), τ ) = µ(τ ) + λ(τ )y 0 (t(τ )) = 0 (109)
∂v
which converts the problem into a two point boundary value problem.
particular parametric form. One example is linear time-warping functions, which are
of special interest since they represent a uniform scaling of the time axis. Another
motive for investigating time warping functions with given parametric forms is the
77
returns the derivative of a time warping function. Then, we are in fact considering
the problem,
return linear and polynomial time warping functions (whose structure allow them to
be treated nicely under the Bolza framework), and then give a more general view of
the problem.
w(τ ) = ξτ (111)
d
φv (ξ)(τ ) = (ξτ ) = ξ (112)
dτ
for all ξ ∈ Ξ , R+ , τ ∈ [0, T ]. Letting v = φv (ξ) (so v(τ ) = ξ ∀τ ∈ [0, T ]), we can
address this problem by augmenting the state with v (rather than treating v as a
control input as before) to obtain yet another Bolza problem: Given the system,
x v(τ )(f (x(τ ), u(τ ), t(τ ))
d
t (τ ) = v(τ ) (113)
dτ
v 0
y(τ ) = h(x(τ ))
with partially-known initial conditions (x, t)(0) = (x0 , 0), minimize (89) with respect
to the initial condition v(0) = ξ. Rather than giving a specialized solution to this
problem here, we will instead present the more general polynomial time warping
problem of which this is a special case. First, however, we would like to point out
some features of this minimization problem: Referring to figure 19, we note that this
78
is a highly nonconvex problem, so without an impractical amount of searching we can
at best expect local solutions to the problem. This said, even local solutions offer
improvement over the solution to the standard LQ tracking problem, as the time
warping still represents additional degrees of freedom that can be exploited to find
motions which are “similar” in a sense that is less strict than the usual L2 norm.
60
55
50
45
J(φv(ξ))
40
35
30
25
20
15
0 1 2 3 4 5
ξ
Figure 19: J(ξ) is plotted against the linear time warping parameter ξ = v(0) for
the problem in which an autonomous nonlinear pendulum approximately tracks a
sinusoid.
X
Nv
w(τ ) = φv (ξ)(τ ) = ξi τ i (114)
i=1
for some integer N ≥ 1, and with discrete parameter vector ξ = [ξ1 , . . . , ξN ] ∈ RNv
(Note that the requirement that w(0) = 0 implies that there is no constant term in
the polynomial). As in the linear case in the previous subsection, we will augment the
system to obtain a Bolza problem with partially-free initial conditions. The structure
79
polynomials as the output of an autonomous linear system. Differentiating (114), we
obtain,
X
Nv
w0 (τ ) = iξi τ i−1 . (115)
i=1
y(τ ) = h(x(τ ))
with partially-known initial conditions (x, t)(0) = (x0 , 0), and seek to minimize (89)
with respect to the initial conditions v1 (0), . . . , vN (0). In order for t to be a valid time
warping function, it is also necessary that this minimization be performed such that
80
Theorem 11. The FONCs for the polynomial time warping problem are given by
(122) and
∂J T
(ξ) = diag(1, 12 , 3!1 , ..., N1v ! )ν(0) = 0 (119)
∂ξ
where ν is the solution to
dν ∂H T
− (τ ) = ((x, t, v)(τ ), u(τ ), (λ, µ, ν)(τ ), τ )
dτ ∂v
||u||2Ru T
+ 2Rv (v1 − 1) + λ (τ )f (x(τ ), u(τ ), t(τ )) + µ(τ )
ν1 (τ )
= =0 (120)
...
νNv −1 (τ )
Proof : For the polynomial time warping problem, the Hamiltonian is (here we
identify the costate by (λ, µ, ν), where ν(τ ) ∈ RNv ∀τ ∈ [0, T ]),
H((x, t, v), u, (λ, µ, ν), τ ) = L(x, (u, v1 ), τ ) + λT v1 f (x, u, t) + µv1 + ν T [v2 , . . . , vN , 0]T
= ||h(x) − r(τ )||2Q + v1 ||u||2Ru + Rv (v1 − 1)2 + λT v1 f (x, u, t) + µv1 + ν T [v2 , . . . , vN , 0]T
and we obtain the costate equation for ν, (those for λ and µ are the same as given in
∂J T ∂J T
(ξ) = diag(1, 12 , 3!1 , ..., N1v ! ) (v(0)) = diag(1, 21 , 3!1 , ..., N1v ! )ν(0) . (121)
∂ξ ∂v(0)
In other words the gradient of the cost with respect to the polynomial coefficients is
∂H
((x, t, v)(τ ), u(τ ), (λ, µ, ν)(τ ), τ ) =
∂u
∂f
2v1 (τ )uT (τ )Ru + v1 (τ )λT (τ ) (x(τ ), u(τ ), t(τ )) = 0T (122)
∂u
81
for all τ ∈ [0, T ].
10, it can also be proven as a corollary following the approach of the next section
(4.2.1.2), in which case (119) would be seen as the projection of the second equation
of (103) onto the subspace of polynomial functions. That said, the introduction of
the exosystem (118) is particularly convenient, in that it reduces the problem to one
If the Fréchet derivatives of both the parametrization function φ and the cost J
[defined in (89)] exist, then we may in fact apply the solution given in section 4.2.1
directly to the discretized problem (110) through a simple application of the chain
rule.
Since we will be interested in discretizing not just the time warping function v
but also the control input u, at this point we will introduce a second parametrization
With the interpretation that the partial derivative of the Hamiltonian with respect
to the control input (as a function of time) is the gradient of J with respect to the
∂H T
∇u J(u, v)(τ ) = ((x(τ ), t(τ )), (u(τ ), v(τ )), (λ, µ)(τ ), τ )
∂u
∂H T
∇v J(u, v)(τ ) = ((x(τ ), t(τ )), (u(τ ), v(τ )), (λ, µ)(τ ), τ )
∂v
82
for all τ ∈ [0, T ]. Then, applying the chain rule to (123), the partial gradients of
Jφu ,φσ with respect to σ and ξ are given by the inner products,
h∇σ1 φu (σ), ∇u J(φu (σ), φv (ξ))iL2 ([0,T ],Rm )
..
∇σ Jφu ,φv (σ, φ) =
.
(124)
∇σNu φu (σ), ∇u J(φu (σ), φv (ξ)) L2 ([0,T ],Rm )
h∇ξ1 φv (ξ), ∇v J(φu (σ), φv (ξ))iΩ0
..
∇ξ Jφu ,φv (σ, φ) =
.
(125)
∇ξNv φv (ξ), ∇v J(φu (σ), φv (ξ)) Ω0
for any two functions a, b therein. By solving the state and costate ODEs and evalu-
ating the integrals (126) numerically we thus obtain a principled and general way to
A technical note: The set Ω0 equipped with the inner product (126) is not strictly
an inner product space because it contains neither the zero element nor an additive
inverse, but Ω0 ∪ (−Ω0 ) ∪ {0} is (where here 0 denotes the zero function), and it is
sufficient for our purposes that Ω0 be a subset of an inner product space. Moreover
Ω0 ∪ (−Ω0 ) ∪ {0} is not a complete inner product space (and so not a Hilbert space)
since Ω0 contains points arbitrarily close to functions that are not strictly increasing.
gence to functions that are not strictly-increasing (but the term of (89) that penalizes
b-splines. In other words, the control inputs u and v are represented by uniformly
spaced samples, and linear interpolation is used for the intermediate values. Then,
83
φu (σ) and φv (ξ) can both be expressed as sums of triangular basis functions, as given
below,
XK
T K −1
φu (σ)(τ ) = [σ1+m(i−1) , . . . , σmi ] tri (τ − i − 1) (127)
i=1
T
XK
K −1
φv (ξ)(τ ) = ξi tri (τ − i − 1) (128)
i=1
T
where K ∈ N is the number of samples used. Then, the functional gradients are given
by,
K −1
(∇σi φu (σ, ξ))(τ ) = emod(i−1,m)+1 tri (τ − d mi e − 1) ∀i ∈ {1, . . . , mK}
T
(129)
K −1
(∇ξi φv (σ, ξ))(τ ) = tri (τ − i − 1) ∀i ∈ {1, . . . , K} (130)
T
This gives us the finite-dimensional partial gradient for Jφu ,φv with respect to ξi ,
Z T
∂H K −1
(∇ξ Jφu ,φv (σ, ξ))i = ((x, t, v)(τ ), u(τ ), (λ, µ, ν)(τ ), τ ) tri (τ − i − 1) dτ
0 ∂v T
(132)
for all i ∈ {1, . . . , K}. The expression for ∇σ Jφu ,φv is similar.
In the previous sections, we assumed that the spatial correspondence between val-
ues of the output signal and those of the reference signal was known, and that only
explicitly addresses the fact that it is not just the dynamics of the “mimicking” sys-
tem that may differ from those of the system that generated the reference motion,
84
but also spatial constraints and scales – a problem evident even in the prototypical
To treat the problem of spatial correspondence, we will assume that the reference
signal r that we have been considering so far is in fact the composition of two functions:
With these definitions, we can extend the original cost functional (89) to obtain
J¯ : U[0,T ] × Ω0 × C → R
where J¯outwarp is some cost used to penalize “large” output warpings, regularize the
problem, and in certain cases enforce constraints; its form will be determined by the
Differentiating (133) to find the partial gradient with respect to c, we obtain the
85
FONC,
Z T
∇c J¯ = − (∇c φs (c))(r̄(τ ))T Q [y(τ ) − φs (c)(r̄(τ ))] dτ + ∇c J¯outwarp (c) = 0 (134)
0
which must be satisfied in addition to those given in Section 10. Note that in this
equation we assume for notational simplicity that c is a column vector, but we may
also use other finite-dimensional real vector spaces, like the matrix-vector pairs of the
next section.
where M ∈ Rp×p is an invertible matrix, and z ∈ Rp . Hence, for affine output warping
2. to ensure that s remains a bijection. This means penalizing values of (M, z) for
[tr(M T M )]p 1
J¯outwarp ((M, z)) = α T
+ β tr[(M − I)T (M − I)] + γz T z (136)
det(M M ) p
where α, β, γ ∈ R+ ∪ {0} are scalar coefficients used to weight the relative “impor-
The first term is inversely proportional to the Gram determinant of the columns of
M and so gives a measure of “how singular” M is. The presence of the Frobenius norm
tr(M T M ) in the numerator makes this term somewhat independent of the absolute
86
tr((αM )T (αM ))p tr(M T M )p
scaling of the matrix, since det((αM )T (αM ))
= det(M T M ))
for all α ∈ R; that is, it is
The second term is the squared Frobenius norm of M − I, and gives the expected
value of ||M x − x||2 /||x||2 if x is a random variable drawn from a uniform distribution
on a ball of (any) fixed radius centered at the origin in Rp . Together with the third
term, it penalizes “large” transformations – i.e., those that differ substantially from
Combining (138) with (136) and taking partial gradients with respect to M and
Theorem 12. The additional FONCs for the problem (138) are,
Z T
¯ v, (M, z)) =
∇M J(u, 2Q (M r̄(τ ) + z − y(τ )) rT (τ )dτ
0
2 T p −T T p−1
+α − tr(M M ) M + p tr(M M ) M
det(M T M )
1
+ 2β (M − I)
p
=0 (139)
Z T
¯
∇z J(u, v, (M, z)) = 2Q(M r̄(τ ) + z − y(τ ))dτ + 2γz = 0 ∈ Rp (140)
0
87
4.2.2.1 Piecewise Affine Output Warping
A natural extension of affine output warping is piecewise affine output warping, which,
with a sufficiently fine subdivision of the region R ⊂ Rp of interest, allows for the
The essential idea will be that we divide the space R into some number of p-
simplices, and use an affine warping function (as described in the previous section)
within each of these, chosen in such a way that the resulting piecewise function is
continuous. A new element that this adds to the problem is that, in order to enforce
that s remain a bijection, more is required than that each of the individual affine
warping functions’ “M ” matrices be full rank; we must also ensure that the images
S
1. S⊃R
2. int(S i ) ∩ int(S j ) = ∅ ∀i 6= j
The third condition ensures that there are no “t-junctions” in the mesh and the
fourth requires that each simplex share a face with another simplex; these are illus-
trated in Figure 20. Simplicial tessellations which meet these criteria include the
tessellations, see e.g. [21]). We will represent each simplex S i as the (p + 1)-tuple
88
of its vertices in Rp , which we denote S i = (S1i , . . . , Sp+1
i
) ∀i ∈ {1, . . . , |S|}. Further-
more, let verts(S) = {V1S , . . . , V|Sverts(S)| } be the unique vertices of S, and G S be the
graph whose nodes are the elements of verts(S), and in which an edge exists between
VkS and VlS iff ∃i ∈ {1, . . . , |S|} s.t. {(1 − t)VkS + tVlS : t ∈ [0, 1]} ⊂ S i . Together with
requirement 3 above, this means that the edges of G S , which we denote “edges(G S ),”
Figure 20: Simplicial tessellations in the plane containing the region outlined in
bold satisfying (left) all rules 1-4; (center) all rules except 3 (the offending vertex is
circled); and (right) all rules except 4 (the offending simplex is circled).
simplices) defined by
Rp| verts(R)| . In other words, holding the input simplices fixed, we optimize over the
89
where
−1
Mi = R2i − R1i R3i − R1i ... i
Rp+1 − R1i S2i − S1i S3i − S1i ... i
Sp+1 − S1i
(143)
where πS (r) : R → N is the function that, given a point r ∈ R, returns the index of
Figure 21: Given the set S of input simplices, the output warping function s is
determined by the positions of the vertices of the corresponding output simplices R.
This example uses a Coxeter-Kuhn-Freudenthal tessellation of a regular grid of cubes
in R2 .
With this representation, a cost which tends to maintain the bijectivity of s is
given by,
1 X 2
J¯outwarp (c) = ||ViR − VjR ||K − ||ViS − VjS ||K (146)
2
R(Vi ,VjR ) ∈ edges(G R )
90
The idea here is that G R is a rigid graph, and that by maintaining edge distances we
Admittedly, this cost does leave something to be desired, since simplices can
collapse with finite energy. Nevertheless, we believe it is useful for its simplicity. One
may wish to also apply (136) for each simplex in cases where (146) is not sufficient.
T
Now, define c = (V1R )T . . . (|Vverts(R)|
R
)T . Letting i1 (r), . . . ip+1 (r) be the
letting πS (r) be the simplex in S containing r, and defining the p| verts(S)|×p matrix
Iα1
..
Z(r) = . (148)
Iα| verts(S)|
where αi1 (r) = β1 (r, πS (r)), . . . , αip+1 (r) = βp+1 (r, πS (r)) and αi = 0 ∀i ∈
/ {i1 (r), . . . , ip+1 (r)},
then the the partial gradient of (133) without the last term J¯outwarp is given by,
Z T
∇c (J¯ − J¯outwarp )(R) = −2 Z(r(τ ))Q [y(τ ) − (φs (c) ◦ r)(τ )] dτ . (149)
0
Hence the partial gradient of J¯ with respect to c is simply the sum of (147) and (149).
We apply piecewise affine output warping together with linear time warping in a
Example 4.2.5. Suppose we would like the state of an autonomous Van der Pol
allowing for linear time warping and piecewise affine output warping. That is, the
91
system is given by,
d xt,1 xt,2 (t)
(t) =
dt x ζvp (1 − x2t,1 (t))
t,2
yt (t) = xt (t)
(with in our case ζvp = 0.9) and the reference signal r is the solution to,
d r1 r2 (τ )
(τ ) =
dτ r 2
sin(τ ) − ω0 sin(r1 (τ )) − ζpend r2 (τ )
2
with in our case ω0 = 1, ζpend = 0.5, x(0) = r(0) = [0.1, 0.1]T , and τ ∈ [0, T ] = [0, 10].
We will use essentially the same costs introduced earlier, but with some specially-
where γi,j is the number of simplices containing both ViR and VjR for all i, j ∈
{1, . . . , |V R |}, i 6= j. Then, performing gradient descent using the gradient given
by (147), (149), and (121), we obtain the results shown in Figures 22, 23, and 24.
The optimization takes place with both time- and output- warping. The time
warping is nonparametric, and the output warping optimizes over the six scalar pa-
92
Output
2 2
1
0
y2(τ)
0
−2
−1
−4 −2
0 5 10 −2 −1 0 1 2
y1(τ)
Figure 22: Van der Pol oscillator vs. Figure 23: Van der Pol oscillator vs.
driven pendulum, before warping. We driven pendulum, after warping. Time
wish to scale the time axis of the out- warping matches the first part of the Van
put (top left) and the output space of der Pol oscillator’s transient to that of
the reference (bottom right) to align the the pendulum (left top, bottom), and
two signals. output warping rotates and deforms the
reference output space to better match
the output (right top, bottom).
4
J
0
0 10 20 30 40 50 60
Iteration
2−Norm of Derivative vs. Iteration
300
200
Norm
100
0
0 10 20 30 40 50 60
Iteration
Figure 24: The cost in Example 4.2.5 is reduced (top), eventually reaching a lo-
cal extremum as evidenced by the reduction of the norm of the derivative to zero
(bottom).
the reference points. In other words, for a given warping parameter vector c =
93
Cost vs. Iteration
130
120
110
100
90
80
J
70
60
50
40
30
20
0 10 20 30 40 50 60 70 80
Iteration
Figure 25: The gradient descent procedure is characterized by a rapid initial descent
followed by slower final descent. Each curve corresponds to a different initial guess for
the control trajectory; each component of the guess for u takes the form α cos(ωt) +
β sin(ωt), with (α, β) chosen uniformly at random from [−1, 1] × [−1, 1] and ω from
[0, ωmax ], where ωmax is the Nyquist frequency for the sample rate used. The initial
guesses for the output warping are realizations of the random variable I + 51 , where
here I denotes the coefficients corresponding to the identity affine transformation and
is drawn uniformly from [−1, 1]6 .
The results are shown in Figures 25, 26, and 27, where it can be seen that marked
improvements in similarity are achieved, both in numerical cost (Figure 25) and in
subjective appearance (Figure 26). Both Figure 25 and Figure 27 also illustrate
the point that, since the dynamical constraints are nonconvex, the local optimum
to which the gradient descent procedure converges depends on the initial condition
used to initialize the algorithm. Nevertheless, even for fairly widely-separated initial
94
0% 20% 40%
Figure 26: Animation frames showing the lowest-cost imitation of the human sub-
ject’s bhangra performance by the puppet; percentages are of the total playback time
elapsed. The puppet begins hanging in an equilibrium state (at time “0%”).
In order to allow one system to “mimic” a reference signal, we have introduced several
versions of a modified output tracking problem that also includes time and output
warping functions as decision variables. The basic motivation has been that this
captures a measure of qualitative similarity which the usual error metrics used in
where, rather than just being a method to compare signals, it actually modifies the
dynamics of systems when viewed in “reference time.” We have used this fact to
control, to be solved numerically. The intuition is that we can essentially shift the
95
Output Warping Affine Transformations
0.5
−0.5
−1
−1.5
Figure 27: The affine transformations arrived at by the optimization algorithm for
different initial guesses (as described in Figure 25) are illustrated as frames in R2
(solid), along with the identity transformation (dotted). The lowest-cost transforma-
tion is drawn in bold.
an intuition for the problem, our development has been entirely time-domain, so
that it is applicable to problems with finite time horizons and very general nonlinear
systems.
The time-warping problem was studied both when allowing “arbitrary” time warp-
ing functions, and when restricting the allowed functions to certain parametric forms.
For linear and polynomial time warping functions this allows the problem to be re-
framed as yet another Bolza problem. For time warping functions with different
parametric forms, parametrization functions were introduced; this has the advantage
Likewise, output warping functions having fixed parametric forms were studied;
particular attention was paid to affine output warping functions and to a class of
In all of the above cases, first order necessary optimality conditions – really gra-
dients – were derived, with the underlying motivation being that this information
96
methods - e.g. steepest descent, nonlinear conjugate gradient methods, or quasi-
Newton methods. Numerical examples were given for the problems (1) of aligning
an autonomous Van der Pol oscillator’s output to that of a nonlinear pendulum us-
ing linear time warping and piecewise affine output warping, and (2) of controlling a
marionette to mimic a human dancer using nonparametric time warping and affine
output warping.
and are common to many problems in numerical optimal control: The gradients,
being the solutions to ordinary differential equations, are fairly expensive to compute;
and the nonconvexity of the problem means that only local optima are guaranteed.
Nevertheless, it is possible to compute local optima which do give results that achieve
97
CHAPTER V
PREFERENCE LEARNING
space, we wish to find (1) a real-valued rating function that is consistent with those
preferences, and (2) a global optimum to this function – the best point in the metric
space. By solving these two problems we would have recovered what the underlying
Crucially, we would like to be able to determine both the rating function and the
point that minimizes it entirely by convex optimization – both so that the resulting
problems are computationally efficient, and to ensure that any minima we find are in
fact global optima. Although existing approaches like the Support Vector Machine
(SVM) methodology of [43] and [29], which we will discuss in Section 5.1.2, do find
a rating function as the solution to a convex program, these typically use the so-
called kernel trick, which introduces nonlinearities that prevent the determination of
a minimizer to that function by convex programming. Yet, without the kernel trick,
and using the SVM approach, one arrives at linear cost functions that have no unique
minima at all. Our contribution, beginning in Section 5.2, is instead a set of convex
programs that provide a useful compromise between these extremes, and which only
allow us to entertain the idea of a unique “best” point in the space, and at the same
98
5.1.1 Problem Formulation
to find a real-valued rating function that is consistent with those preferences. Our
motivation is that, given some assumptions about what this function looks like, we
Formally, let (X, h·, ·i) be the Hilbert space, and S , {(x1i , x2i )}N
i=1 ⊂ X × X the
sequence of comparisons; a pair (x1i , x2i ) appears in the sequence S if and only if x1i is
That is, we adopt the convention that lower scores are better; hence we will refer to
f as a cost function.
will make it generalize well to other points in X besides the ones we have seen in S.
The very general requirement, just introduced, that X be a Hilbert space, will be
all that is needed to apply the algorithms that will be developed in the subsequent
sections. Examples of Hilbert spaces to which the preference learning of this chapter
1. The solution to the time- and output- warped tracking problems of Section 4.2
warping parameter vector ζ ∈ C,1 and reference signal r̄. Because the vector
Hilbert space – e.g. L2 – then, since direct products of Hilbert Spaces2 are
1
C is defined in Section 4.2.2.1. For instance, ζ = (α, β) if (135) is used, or ζ = verts(R) if (144)
is used.
2
Here, it is assumed that the inner product for a Hilbert space A × B is defined, unless otherwise
specified, by h(a1 , b1 ), (a2 , b2 )iA×B , ha1 , a2 i + hb1 , b2 i.
99
themselves Hilbert spaces, tuples (Q, Ru , Rv , ζ, r̄) live in a Hilbert space and
space, and the input 1-chain v ∈ C1 (G) can be learned under the preference-
In the following sections, we investigate a few choices for (1) particular parametric
forms for f , and (2) particular smoothness criteria, beginning with the current state
of the art. As we finish summarizing the state of the art, our interest in finding the
best possible point – a global minimizer of $f$ – will begin to lead us into new territory.

5.1.2 Support Vector Machines
The Support Vector Machine (SVM) approach to preference learning, which follows
the foundational work on SVMs of [15] and [102], was developed in [42] and [43], and
has been applied in a number of contexts, including [29]. The following subsections
briefly summarize a particular instantiation of this approach; for more detail see [43].
Initially, we would like to consider linear cost functions. These have the form
\[ f(x) = \langle w, x \rangle \tag{153} \]
for some $w \in X$. This will be less of a restriction than it initially appears, because nonlinear cost functions can later be recovered by the kernel trick of Section 5.1.2.2.
5.1.2.1.1 Constraints
The first observation, from (152) and linearity of the inner product, is that
\[ \langle w, x^2 - x^1 \rangle > 0 \quad \Leftrightarrow \quad (x^1, x^2) \in S . \tag{154} \]
We strengthen the constraints by asserting that, in fact, $\langle w, x^2 - x^1 \rangle > \epsilon$, for some $\epsilon > 0$ and all $(x^1, x^2) \in S$. Since the scale of $f$ is arbitrary (i.e., if $f$ satisfies all pairwise inequality constraints, then so does $cf$ for any $c > 0$), we may without loss of generality rescale to obtain
\[ \langle w, x^2 - x^1 \rangle \geq 1 \quad \forall (x^1, x^2) \in S . \tag{155} \]
This normalization to 1 is conventional in the SVM literature. In the case of classification problems, it corresponds to finding a maximum-margin separating hyperplane. Consider the margin of the least-well-separated pair,
\[ \min_{(x^1, x^2) \in S} \; f(x^2) - f(x^1) . \tag{156} \]
If $w$ satisfies the constraints (155), then this number will be positive. In fact, since scale is arbitrary and we are minimizing $\|w\|^2$, it will be exactly one (if it were some number greater than one, we could divide $f$ by that number to obtain a feasible $w$ of smaller norm). Next consider those pairs $(x^1, x^2) \in S$ satisfying $f(x^2) - f(x^1) = 1$. Such points $x^1$ and $x^2$ lie on level sets of $f$ which are a distance of $1/\|w\|$ apart. Therefore, minimizing $\|w\|^2$ maximizes the distance between the level sets on which the least-well-separated points lie.
Combining the minimization problem described in the previous subsection with the constraints (155), we state the following primal optimization problem, which is the standard unbiased SVM classification problem (unbiased meaning that the separating hyperplane must pass through the origin, as opposed to the biased case, where not just linear but also affine varieties are allowed): We wish to find
\[ w^* = \arg\min_w \|w\|^2 \quad \text{s.t.} \quad \langle w, x^2 - x^1 \rangle \geq 1 \;\; \forall (x^1, x^2) \in S . \tag{157} \]
This is a quadratic programming (QP) problem which, in principle, can be solved as-is. Examining the dual problem – itself a QP – reveals additional insights, however: (1) the optimal $w$ lies in $\mathrm{span}\{x^2 - x^1 : (x^1, x^2) \in S\}$, and so one need only solve for the coefficients of its expansion in terms of this finite basis; and (2) moreover, this expansion is sparse, with only a few terms being nonzero; those elements of $\{x^2 - x^1 \mid (x^1, x^2) \in S\}$ which have nonzero coefficients are called the support vectors. For more see [89]. For our purposes it is enough to note (1) that because $w \in \mathrm{span}\{x^2 - x^1 \mid (x^1, x^2) \in S\}$, (157) can be found as the solution to a finite-dimensional problem even when $X$ is infinite-dimensional, and (2) that the optimization problem depends only on the inner products between the various elements of $\{x^2 - x^1 \mid (x^1, x^2) \in S\}$. This finite-dimensional dual problem can be solved with standard SVM machinery (stated more generally for the more common biased SVM classification problem) by augmenting the problem with additional instances in a symmetric way which ensures that the solution hyperplane nevertheless passes through the origin.
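As a rough illustration – not code from this development – the primal problem (157) in finite dimensions is a small QP; the sketch below uses the cvxpy modeling library, and the synthetic data and all names are illustrative assumptions.

    # Minimal sketch of the unbiased preference SVM (157), assuming X = R^3;
    # the synthetic comparison data below are illustrative only.
    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(3)
    w_true = np.array([1.0, -2.0, 0.5])              # hidden linear cost direction
    X1 = rng.normal(size=(30, 3))                    # preferred points x^1_i
    # Construct x^2_i so that f(x^2_i) - f(x^1_i) > 0 under f = <w_true, .>:
    X2 = X1 + rng.exponential(0.5, size=(30, 1)) * w_true

    D = X2 - X1                                      # rows are d_i = x^2_i - x^1_i
    w = cp.Variable(3)
    prob = cp.Problem(cp.Minimize(cp.sum_squares(w)),
                      [D @ w >= 1])                  # constraints (155)
    prob.solve()
    print(w.value / np.linalg.norm(w.value))         # aligns with w_true's direction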
5.1.2.2 Kernelization
The basic idea behind the so-called kernel trick (see e.g. [15], [8]) is that, rather
than finding a linear classifier for instances directly, one can first map the points to a
higher-dimensional inner product space F (called the feature space) using a nonlinear
embedding $\Phi : X \to F$. Moreover, since only inner products between elements of this space need be computed, the embedding need not be explicitly constructed; rather, the kernel function
\[ \kappa(x^1, x^2) \triangleq \langle \Phi(x^1), \Phi(x^2) \rangle \tag{158} \]
is all that is actually needed. Mercer’s theorem establishes conditions under which a $\Phi$ and $F$ exist such that a given function from $X \times X$ to $\mathbb{R}$ can be written in the form (158); in particular, the kernel matrix $[\kappa(x_i, x_j)]_{i,j} \in \mathbb{R}^{M \times M}$ must be symmetric positive definite and so a Gramian.
Early work in preference learning (e.g. [42]) simply applied the kernel trick to the difference vectors $x^2 - x^1$. Because this implicitly passes differences between points through $\Phi$, and not the points themselves, it does not preserve the structure of an assumed underlying cost function; i.e.,
\[ \langle w, \Phi(x^2 - x^1) \rangle > 0 \quad \not\Leftrightarrow \quad \langle w, \Phi(x^2) - \Phi(x^1) \rangle > 0 , \tag{159} \]
and so it is incorrect using this approach to assume that there exists a cost function of the form
\[ f(x) = \langle w, \Phi(x) \rangle . \tag{160} \]
The work [43] resolves this to recover a cost function which does have the form (160) by classifying not differences between points (elements of $X$) but rather pairs thereof (elements of $X \times X$), using an appropriately modified kernel; this is also the approach used in [29]. In [43] it is shown that a cost function of the form (160) can be recovered from the solution of the resulting classification problem, where $\kappa$ is the kernel function corresponding to the embedding $\Phi$ into the Reproducing Kernel Hilbert Space attached to $\kappa$.
5.1.2.2.1 Limitations

Since we would like to find not only a cost function but also the best of all possible alternatives by convex optimization, we will investigate metric cost models, and find that these cannot be addressed efficiently within the SVM framework. This will motivate the construction of Chebyshev estimators. These also employ convex optimization and the idea of maximizing robustness, but only reduce to SVMs in particular special cases. (A more detailed discussion of the incompatibilities that exist between metric cost models and standard Support Vector Machine approaches is deferred to Appendix A.3.)
The comparisons are conveniently represented by a preference graph $G$: a directed graph whose vertex set $V = \{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\} \subset X$ is the collection of all unique points that have been compared, and whose edge set is $S$. We will index the $M$ unique points as $x_1, \cdots, x_M$.
If (152) is to hold with strict inequality, then we note immediately that the graph $G$ must be acyclic, and thus represent a partial order. When nonstrict inequalities are allowed, however, then we may permit cycles, and moreover $G$ can be replaced by a smaller, equivalent acyclic graph. This has the practical significance of allowing redundant constraints to be discarded, speeding up later optimization steps. This is constructed, following [6], in the following way: Two vertices $v_1$ and $v_2$ belong to the same cell (denoted $v_1 \sim v_2$) if and only if there exist directed paths in $G$ both from $v_1$ to $v_2$ and from $v_2$ to $v_1$. The quotient graph $G/\!\sim$ is then the graph whose vertices are these equivalence classes, and in which the directed edge $(C_1, C_2)$ exists between two cells $C_1$ and $C_2$ whenever there exist vertices $v_1 \in C_1$ and $v_2 \in C_2$ with $(v_1, v_2) \in S$. Since any two vertices in the same cell must by (152) have the same cost, one may optimize using only the constraints represented by the edges of this quotient graph, and discard the rest. Hence without loss of generality we will assume that $G$ is acyclic.
One can further replace $G$ by its transitive reduction $G^t$ – the graph with the fewest edges whose transitive closure equals that of $G$ – since constraints implied by transitivity are redundant. In the case of a directed acyclic graph, the reduction $G^t$ (which is unique) is a subgraph of $G$.

Figure 28: The original preference graph G (left), and the corresponding transitively-reduced quotient graph, (G/∼)^t (right).
Computing the transitive reduction of a directed acyclic graph has the same complexity as transitive closure, and hence matrix multiplication; thus, the transitive reduction can be found in $O(n^{\log_2 7})$ steps using Strassen’s algorithm [91], or, in principle, $O(n^{2.376})$ steps using the Coppersmith–Winograd algorithm [25] (see, e.g., [35], [77]). Moreover, if $G$ contains cycles, then the algorithm given in [6] can be combined with the quotient construction just described. In short, by working with the transitive reduction of the quotient graph, we are able to eliminate redundant comparisons without even knowing the form of the cost function $f$. The reduction is illustrated by Figure 28, and a computational sketch follows below.
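As a concrete sketch of this preprocessing – using the networkx graph library, with a tiny illustrative graph rather than data from the experiments:

    # A minimal sketch of the quotient-and-reduce preprocessing, using the
    # networkx library; the example graph is illustrative only.
    import networkx as nx

    # Preference graph: an edge (a, b) records that a was preferred to b.
    G = nx.DiGraph([("x1", "x2"), ("x2", "x3"), ("x3", "x2"),  # x2 ~ x3: a cycle
                    ("x1", "x4"), ("x2", "x4")])

    # Quotient by strongly connected components: vertices on a common cycle
    # must have equal cost, so they collapse into a single cell.
    Q = nx.condensation(G)                 # a DAG whose nodes are the cells

    # Transitive reduction: drop edges implied by transitivity.
    Qt = nx.transitive_reduction(Q)

    for u, v in Qt.edges():
        print(Q.nodes[u]["members"], "preferred to", Q.nodes[v]["members"])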
5.2 Metric Cost Models

In everyday speech, one often describes an alternative as being “closer to what we would like,” or as being “far from perfect.” Motivated by this everyday use of geometric language, in [58] we considered metric costs, which have the form
\[ f(x) = \|x - \bar{x}\| . \]
In short, it is assumed that there exists some single best point $\bar{x}$ in $X$, and one alternative is preferred to another precisely when it is closer to $\bar{x}$.
What does an individual response $(x^1, x^2)$ tell us about the location of $\bar{x}$? Simply that $\bar{x}$ is at least as close to the preferred point as to the other; that is, the following are equivalent:

1. $(x_i^1, x_i^2) \in S$
2. $f(x_i^1) \leq f(x_i^2)$
3. $\|x_i^1 - \bar{x}\| \leq \|x_i^2 - \bar{x}\|$ .

Defining,
\[ d_i \triangleq x_i^2 - x_i^1 , \qquad \mu_i \triangleq \tfrac{1}{2}(x_i^1 + x_i^2) , \qquad b_i \triangleq \langle d_i, \mu_i \rangle , \]
the totality of what we know, then, about where $\bar{x}$ might lie is summarized by the linear inequalities $\langle d_i, \bar{x} \rangle \leq b_i$, which cut out a polytope $P$. Earlier work [58] considered these inequalities and gave an asymptotic observer that estimates $\bar{x}$ under certain assumptions. Here, we ask another question: Out of all the points in this polytope, which is “best?”
A natural answer is the incenter of the polytope,
\[ \bar{x} = \arg\min_x \max_i \frac{1}{\|d_i\|} \left( \langle d_i, x \rangle - b_i \right) , \tag{168} \]
which is the point that is maximally far away from the closest constraint plane, as illustrated by Figure 29. In other words, when $P$ is nonempty, $\bar{x}$ is the center of the largest sphere inscribed in $P$.
Figure 29: Two examples for $X = \mathbb{R}^2$. Shades of gray indicate the number of violated constraints (points in darker regions violate more constraints), and discontinuities in the derivative of the piecewise-linear function $x \mapsto \max_i \frac{1}{\|d_i\|}(\langle d_i, x \rangle - b_i)$ are indicated by dashed lines. In the first example (top), $P \neq \emptyset$ (white region), and $\bar{x}$ is its incenter, the point maximally far away from the closest of the constraint surfaces (thin, solid lines) – i.e., it is the center of the largest inscribed sphere (thick, solid curve). In the second example (bottom), $P = \emptyset$, and the resulting optimum, $\bar{x}$, is the point whose worst constraint violation is minimal.
Note that with the definition (168), if the constraints are feasible (i.e., if $P \neq \emptyset$), then $\bar{x} \in P$. The minimization can be viewed as minimizing the $\infty$-norm of the vector of (normalized) constraint violations. Additionally, $\bar{x} \in \mathrm{aff}\{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\}$, and hence we need only solve for the coefficients of an expansion in terms of this basis (see Theorem 13). Furthermore, this minimization problem has a sensible solution even when $P$ is empty; it is then the point whose worst constraint violation is minimal. A numerical sketch follows below.
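As a rough numerical illustration – not part of the original development – (168) can be solved as a linear program; the sketch assumes $X = \mathbb{R}^2$ and synthetic comparisons generated from a known ideal, with all names illustrative.

    # Sketch: incenter estimator (168) as a linear program, assuming X = R^2
    # and synthetic comparisons around a known ideal; illustrative only.
    import numpy as np
    from scipy.optimize import linprog

    x_ideal = np.array([1.0, 2.0])                 # hidden ideal (ground truth)
    rng = np.random.default_rng(0)
    pts = rng.uniform(-5, 5, size=(40, 2))
    pairs = []                                     # (preferred, other)
    for p1, p2 in zip(pts[::2], pts[1::2]):
        if np.linalg.norm(p1 - x_ideal) > np.linalg.norm(p2 - x_ideal):
            p1, p2 = p2, p1
        pairs.append((p1, p2))

    # Each comparison (x1, x2) yields <d, x> <= b with d = x2 - x1 and
    # b = <d, (x1 + x2)/2>; minimize the worst normalized violation t.
    D = np.array([x2 - x1 for x1, x2 in pairs])
    b = np.array([d @ (x1 + x2) / 2 for d, (x1, x2) in zip(D, pairs)])
    nrm = np.linalg.norm(D, axis=1)

    # Variables z = (x, t): minimize t s.t. (1/||d_i||)(<d_i, x> - b_i) <= t.
    A_ub = np.hstack([D / nrm[:, None], -np.ones((len(D), 1))])
    res = linprog(c=[0.0, 0.0, 1.0], A_ub=A_ub, b_ub=b / nrm,
                  bounds=[(None, None)] * 3)
    if res.status == 0:
        print("estimate:", res.x[:2])              # near x_ideal when P is bounded
    else:                                          # P unbounded: the LP diverges and
        print("use the minimax-rate estimator")    # the later estimator applies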
Theorem 13. If (168) has a global minimizer, then it has a global minimizer in the affine span $\mathrm{aff}\{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\}$.

Proof: Let $x$ be a global minimizer of (168), and $\bar{x}$ be the projection of $x$ onto the affine subspace $\mathrm{aff}\{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\}$; i.e., $\bar{x} = x + \delta$ with $\delta \perp \mathrm{span}\{d_1, \cdots, d_N\}$. Then for all $i \in \{1, \cdots, N\}$, since $\langle d_i, \delta \rangle = 0$ and by linearity of the inner product, $\frac{1}{\|d_i\|}(\langle d_i, \bar{x} \rangle - b_i) = \frac{1}{\|d_i\|}(\langle d_i, x \rangle - b_i)$, and hence the value of the objective function in (168) is the same at $\bar{x}$ as at $x$. ∎
Since $\bar{x} \in \mathrm{aff}\{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\}$, the optimization problem (168) can be solved as a finite-dimensional problem over expansion coefficients.

Theorem 14. Solutions to (168) can be written in the form
\[ x = \sum_{i=1}^{N} c_i d_i + x^* , \tag{171} \]
where
\[ x^* = \arg\min_x \left\{ \|x\|^2 \;\middle|\; x \in \mathrm{aff}\{x_1^1, x_1^2, \cdots, x_N^1, x_N^2\} \right\} , \tag{172} \]
and $\bar{c}$ is found by solving the linear program (173), whose data involve only the inner products
\[ \beta_i \triangleq \langle d_i, \mu_i \rangle . \tag{174} \]
Proof: Defining $x^*$ by (172), one can write any $x$ in the affine span of the data in the form (171). Substituting the expansion (171) into (169), and noting that, by the definition $d_i = x_i^2 - x_i^1$,
\[ x = \sum_{k=1}^{M} \left( \mathrm{indeg}_c(x_k) - \mathrm{outdeg}_c(x_k) \right) x_k + x^* \tag{176} \]
\[ \phantom{x} \triangleq \sum_{k=1}^{M} \xi_k x_k + x^* \tag{177} \]
by treating $c$ as a vector of edge weights on the preference graph, and denoting the weighted in- and out-degrees of a given node $x_k$ by $\mathrm{indeg}_c(x_k)$ and $\mathrm{outdeg}_c(x_k)$ respectively. Precisely,
\[ \mathrm{indeg}_c(x_k) \triangleq \sum_{i \mid x_i^2 = x_k} c_i \tag{178} \]
\[ \mathrm{outdeg}_c(x_k) \triangleq \sum_{i \mid x_i^1 = x_k} c_i . \tag{179} \]
The result is a linear program in the finite-dimensional variable $c$. ∎
Remark 2. Moreover, $\beta$ can be written directly in terms of the Gram matrices introduced below; in particular, $\beta_i = (G_{\mu d})_{ii}$.

Remark 3. Note that this problem depends only on inner products of the various $d_i$ and $\mu_i$ vectors, and hence the problem can be solved even when $X$ is infinite-dimensional. Precisely, $\frac{N(N+1)}{2} + N^2 \sim O(N^2)$ inner products must be computed to build the matrices $G_{dd}$ and $G_{\mu d}$, where $N$ is the number of comparisons. Alternatively, the relevant matrices can also be produced directly from inner products of elements of $S$, as
\[ G_{dd} = K^{22} - K^{21} - K^{12} + K^{11} \tag{182} \]
\[ G_{\mu d} = \frac{1}{2} \left( K^{22} + K^{21} - K^{12} - K^{11} \right) \tag{183} \]
where each matrix $K^{lm} \in \mathbb{R}^{N \times N}$ is defined by
\[ K^{lm}_{ij} = \langle x_i^l, x_j^m \rangle \tag{184} \]
and can be built by indexing into the single Gramian (or kernel) matrix $K \in \mathbb{R}^{M \times M}$ defined by $K_{ij} = \langle x_i, x_j \rangle$ over the $M$ unique points. The minimum-norm point (172) likewise admits an expansion $x^* = \sum_{k=1}^{M} \lambda_k x_k$ with $\lambda = K^\dagger \mathbf{1} / (\mathbf{1}^T K^\dagger \mathbf{1})$, where $K^\dagger$ denotes the Moore–Penrose pseudoinverse of $K$, and $\mathbf{1} = (1, 1, \cdots, 1) \in \mathbb{R}^M$. A small indexing sketch follows below.
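As an implementation sketch of Remark 3 – with hypothetical index arrays i1, i2 mapping each comparison’s elements to rows of a precomputed Gramian K – the matrices (182)-(183) can be assembled by pure indexing:

    # Build G_dd (182), G_mu_d (183), and beta (174) from a single M x M
    # Gramian K; i1, i2 are assumed index vectors, one entry per comparison.
    import numpy as np

    def gram_blocks(K, i1, i2):
        K11 = K[np.ix_(i1, i1)]   # <x^1_i, x^1_j>
        K12 = K[np.ix_(i1, i2)]   # <x^1_i, x^2_j>
        K21 = K[np.ix_(i2, i1)]   # <x^2_i, x^1_j>
        K22 = K[np.ix_(i2, i2)]   # <x^2_i, x^2_j>
        G_dd = K22 - K21 - K12 + K11
        G_mu_d = 0.5 * (K22 + K21 - K12 - K11)
        beta = np.diag(G_mu_d)    # beta_i = <d_i, mu_i>
        return G_dd, G_mu_d, beta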
When $P$ is unbounded, the incenter (168) need not exist, and we instead ask: what is the “point at infinity,” or direction, that is best? More precisely, what we seek in this case is the direction $v$ along which the constraints become satisfied at the greatest worst-case rate; this is the minimax-rate problem, (190), which can be re-posed as the optimization problem (191).

Theorem 15. Letting $v = \sum_{k=1}^{N} c_k d_k$, the optimization problem (191) is equivalent to a finite-dimensional second-order cone program (SOCP), (192), in the coefficients $c$.

Proof: The proof takes the same form as Theorem 14’s. ∎
The cost function for the unbounded case arises from a similar limit process to (190): with $\|v\| = 1$,
\[ f(x) = \lim_{t \to \infty} \left( \frac{1}{t} \|x - vt\|^2 - t \right) \tag{193} \]
\[ \phantom{f(x)} = \lim_{t \to \infty} \left( \frac{1}{t} \left( \|x\|^2 - 2\langle x, vt \rangle + \|vt\|^2 \right) - t \right) \tag{194} \]
\[ \phantom{f(x)} = -2 \langle x, v \rangle . \tag{195} \]
When $\mathrm{int}\,P$ is nonempty – as is almost always true in the unbounded case – the problem can be re-posed as a quadratic program (QP), which will make the relationship to the usual SVM approach very clear. In fact, (190) is equivalent to a particular SVM classification problem (which differs from the standard preference SVM, as described below). Defining,
\[ w = \frac{1}{p} v \tag{197} \]
and restricting our attention to negative values for $p$ (since, when $\mathrm{int}\,P$ is nonempty, the optimal $p$ is negative),
\[ \arg\min p = \arg\max p^2 = \arg\min \frac{1}{p^2} = \arg\min \|w\|^2 , \tag{198} \]
which results in the standard unbiased SVM problem, (199). (When $\mathrm{int}\,P$ is empty, $\bar{p} = 0$, so $\bar{w}$ from (197) is undefined, but the solution to the SOCP problem (192) nevertheless exists.)
The minimax-rate problem (200) differs from the SVM problem considered in e.g. [29] and [43] by the factor of $\frac{1}{\|d_i\|}$ included in each constraint. The difference is that whereas the standard SVM approach attempts to classify differences using a maximum-margin hyperplane, the minimax-rate problem seeks the direction that maximizes the rate of constraint satisfaction; this is illustrated in Figure 30.
Suppose we have access to a very long (infinite) sequence of comparisons $S = \{(x_k^1, x_k^2)\}_{k=1}^{\infty} = \{s_1, s_2, \cdots\}$, gathered, say, over a long period of time, and we would like to know the features $\bar{x}$ of the ideal alternative. If we are unwilling to store all of this data, can we construct an observer for $\bar{x}$ which can avoid storing all of the very (infinitely) many constraints implied by this sequence? It turns out that the answer is yes, and exactly such an observer is given by
\[ \tilde{x}_{k+1} = \tilde{x}_k + \Delta_k \tag{201} \]
\[ \Delta_k = \begin{cases} \alpha_k \dfrac{b_k - \langle d_k, \tilde{x}_k \rangle}{\langle d_k, d_k \rangle} \, d_k & \text{if } \langle d_k, \tilde{x}_k \rangle - b_k > 0 \\[1ex] 0 & \text{otherwise} \end{cases} \tag{202} \]
with step sizes $\alpha_k \in (0, 2)$.
Figure 30: Comparison of the minimax-rate estimator with the standard SVM on synthetic data: difference classification (left) and constraint satisfaction rate (right), each panel showing the minimax-rate solution, the SVM solution, and the true plane/direction.
Figure 31: A series of the observer’s estimates, with $\alpha_k = 1\ \forall k$. The initial estimate is $\tilde{x}_0$, and the true ideal is given by $\bar{x}$. In step 0, the observer projects $\tilde{x}_0$ onto the plane (solid line) corresponding to the measured output $s_0 = (x_0^1, x_0^2)$ to produce $\tilde{x}_1$. In step 1, the observer makes no changes to its estimate, because $\tilde{x}_1$ is on the correct side of the plane corresponding to $s_1$; hence $\tilde{x}_2 = \tilde{x}_1$. In step 2, the observer projects $\tilde{x}_2$ onto the plane corresponding to $s_2$ to create the estimate $\tilde{x}_3$, which is yet closer to $\bar{x}$.
This observer converges under reasonable assumptions; we will prove this shortly in Theorem 16. Moreover, note that, although (201)-(202) are broken down into separate expressions for clarity of presentation, they are in fact all functions of $\tilde{x}_k$, so this observer can be implemented with only $\dim\{X\}$ numbers of storage. The observer can be viewed as a subgradient algorithm for solving the convex optimization problem; for more on subgradient algorithms, see e.g. [16]. Intuitively, each projection step brings the estimate $\tilde{x}_k$ of the ideal closer to the true ideal, $\bar{x}$. A proof of convergence relies on the following lemma.
Lemma 5. Let $x^1, x^2, \bar{x}$ be any vectors in an inner product space $(X, \langle\cdot,\cdot\rangle)$; let $d \triangleq x^2 - x^1$ and $b \triangleq \langle d, \frac{1}{2}(x^1 + x^2) \rangle$; and let $\bowtie$ be a binary relation from the set $\{=, <, >, \leq, \geq\}$. Then, $\|x^1 - \bar{x}\| \bowtie \|x^2 - \bar{x}\|$ if and only if $\langle d, \bar{x} \rangle \bowtie b$.

Figure 32: Example estimate trajectory for observer (201)-(202) for $\alpha_k = \alpha = 1$, with $X = \mathbb{R}^2$. The estimate begins at $\tilde{x}_0 = (-15, 15)$, and approaches the ideal $\bar{x} = (17, 0)$.
Theorem 16. Let $\bar{x} \in X$ be the ideal alternative, and let $S = \{(x_k^1, x_k^2)\}_{k=1}^{\infty} = \{s_1, s_2, \cdots\}$ be a sequence of comparisons drawn independently according to a probability density function $p$ on $\{(x^1, x^2) \in X \times X \mid \|x^1 - \bar{x}\| < \|x^2 - \bar{x}\|\}$ which is nonzero in an open ball $B(\bar{x}, r) = B_r$ around $\bar{x}$. Then, the asymptotic observer given by (201)-(202) converges to $\bar{x}$ in probability.

Proof: Please see the appendix.
An example trajectory of the observer is given in Figure 32. For this example, $X = \mathbb{R}^2$, and features were drawn from a uniform distribution on the square $[-20, 20] \times [-20, 20]$. The estimate evolves from its initial condition, $\tilde{x}_0 = (-15, 15)$, to near the ideal $\bar{x} = (17, 0)$; a simulation sketch follows below.
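For concreteness, a minimal simulation of the observer (201)-(202) in the setting of Figure 32 might look as follows; the random-pair generation is an illustrative stand-in for real human responses.

    # Sketch: asymptotic observer (201)-(202) with alpha = 1 and X = R^2,
    # mimicking the setting of Figure 32; data generation is illustrative.
    import numpy as np

    rng = np.random.default_rng(1)
    x_bar = np.array([17.0, 0.0])            # true ideal
    x_est = np.array([-15.0, 15.0])          # initial estimate
    alpha = 1.0

    for k in range(2000):
        p1, p2 = rng.uniform(-20, 20, size=(2, 2))
        # Order the pair so the first element is the one closer to the ideal.
        if np.linalg.norm(p1 - x_bar) > np.linalg.norm(p2 - x_bar):
            p1, p2 = p2, p1
        d = p2 - p1
        b = d @ (p1 + p2) / 2
        if d @ x_est - b > 0:                # estimate on the wrong side: project
            x_est = x_est + alpha * (b - d @ x_est) / (d @ d) * d

    print(x_est)                              # approaches (17, 0)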
Figure 33: Depicted are the 9 apples used to generate comparisons with the single
orange.
To demonstrate the application of these ideas, photos of nine apples were shown to a group of people, together with the photo of a single orange, and respondents were asked which of two apples appeared more similar to the orange. Each apple was described by a 15-dimensional feature vector, containing (1-3) the average color in HSB (hue, saturation, brightness) color space, (4-6) the average color in RGB color space, (7) the color variance, (8-10) width, height, and the ratio of the two, (11-12) stem length and angle relative to the apple, and (13-14) dimple angle and depth, among other features.
The partial order over the apples was thus generated by having a number of people make a number of randomly selected, pairwise comparisons (as the one depicted in Figure 34). Represented as a preference graph, the results of these experiments are shown in Figure 35. For these data, the minimization problem (173) is unbounded, and hence we find
Figure 34: An example of a pairwise comparison between two apples, relative to the
orange.
Figure 35: The preference graph over the nine apples ($x_1, \ldots, x_9$) generated from the pairwise comparisons.
an optimal direction via (171). We obtain an optimum which has the interpretation that dimple angle and redness are important orangelike features. To test how these ideas fare on motions rather than static images, another experiment was performed in which an audience of 25 people was asked to compare animated “amoeba” motions to a recording of a human dancer; an example of one such question is illustrated in Figure 37. In this manner, a preference graph was generated as before, with 12 vertices (the amoeba motions) and 20 edges; it is shown in Figure 36. Inner products between the various amoeba motions were computed by rasterizing the motions to binary videos, blurring each frame of the result, and computing the inner products of the resulting signals frame by frame. The very high dimensionality of this representation highlights the advantage of the instance vector expansion described in Section 5.4.2, without which the optimization problem simply could not be realistically solved.
Figure 36: The preference graph over the 12 amoeba motions.
Figure 37: Each question took the form, “Which of the two ‘amoebas’ (bottom)
looks more like the [motion capture data from a] human dancer (top)?”
The minimization problem (173) with the resulting data turns out to be unbounded, and hence we again find an optimal direction via (171). We obtain the optimum
\[ \bar{v} = \sum_{k=1}^{M} \xi_k x_k \tag{203} \]
where
\[ \xi = 10^3 \, ( 1.4918,\; -3.6556,\; -0.1390,\; 0.3113,\; \ldots ) . \]
This has the interpretation that, to be as much like the human dancer as possible, an amoeba should as its first priority aspire to be as much like amoeba 7 ($\xi_7 = 2.6335$) and as dissimilar from amoeba 2 ($\xi_2 = -3.6556$) as possible, and that it should to a lesser extent model itself after amoebas 1 and 9 ($\xi_1 = 1.4918$, $\xi_9 = \ldots$) while avoiding similarity to amoebas 5 and 10 ($\xi_5 = -1.1243$, $\xi_{10} = -1.7319$). Although this does not explain why, psychologically, the audience preferred what it did, it does provide both a quantitative summary of the preference
structure, and an estimate for an amoeba motion that will be preferred to all others
under the assumption of an underlying metric cost model; here, the alternatives being
compared are points in a Hilbert space, and human judges are assumed to prefer one
point to another if and only if it is closer to some fixed but unknown best alternative
that they may not have been shown. This assumption appears to be a good one for
the example considered and the features chosen, in that the feasible set P in this case
is nonempty.
Based on the metric cost assumption, a Chebyshev estimator was given for the
best point for the case when P is bounded, and a natural generalization, the minimax-
rate estimator, was developed for when P is unbounded. In the first case, the solution
was found, with an efficiency rivaling standard quadratic SVMs, as the solution to a
linear program; and in the second case the problem was shown to in fact reduce to a particular SVM classification problem. Additionally, an asymptotic observer for metric cost functions was described, and shown to converge under reasonable assumptions. In order that the estimators for the bounded and unbounded cases be applicable to infinite-dimensional metric spaces – as is the case for motion signals – the optimization problems were re-expressed in terms of inner products of the data alone, yielding problems whose size is proportional not to the dimensionality of the metric space, but to the number of comparisons made. In all cases, optimal cost functions and points/directions were found efficiently
by convex programming. The results are efficient minimax estimators for the best
possible alternative.
CHAPTER VI
CONCLUSIONS
This dissertation has developed families of multiagent controllers from two fluid-inspired points of view, and then produced algorithms that tune the resulting controllers to satisfy the subjective requirements of the humans who use them.
The first, Eulerian, family of controllers uses tools from algebraic topology to create flows over discretized environments. It treats graph-theoretic operators not as communications protocols for the mobile agents themselves, but rather as discrete analogues of continuous differential operators; connections to partial differential equations like the Laplace, Poisson, and Euler equations that occur throughout physics naturally follow. These, in turn, provoke the investigation of other, novel Laplacian-like operators, whose null spaces are modified in various ways, to characterize different spaces of interest. Perhaps most importantly, all of the resulting algorithms are simple, linear, and globally-convergent – properties that are appealing both from an implementation and from an analysis point of view.
The second, Lagrangian, family of controllers, which uses index-free problem formulations and then techniques from optimal control, produces algorithms that can be used to manipulate virtual structures of mobile robots built from pairwise constraints. More broadly, the optimal control algorithms described in this dissertation have a generality that makes them applicable to nonlinear systems, and so have the potential for broad use. The Lagrangian approach shares with the first a geometric-topological inspiration, including the use of simplicial complexes as spatial discretizations. What it does not share is the Eulerian approach's linearity; compared to other approaches to formation control, the Lagrangian approach may, especially in the short term, prove more computationally demanding.
The final portion of this dissertation, which investigates the learning of human preferences, is concerned with rather different problems than are the previous two parts. It does, however, complement the development of Eulerian and Lagrangian control approaches, and helps to answer emerging questions about how humans can interact effectively with multiagent swarms. Additionally, it shares with them a common convex-analytic and geometric toolbox, and so the two, apparently-disjoint areas can and do cross-fertilize one another. These connections suggest a number of directions for future work.
APPENDIX A

PROOFS

A.1 First-Order Necessary Conditions

Theorem. The additional FONCs for the problem (138) are given by (139) and (140).

Throughout, $(x \mapsto [\text{expression}])$ denotes the anonymous function that takes $x$ and returns [expression]; e.g., $(x \mapsto x^2)$ is the function that squares its argument. $D_c(f)$ denotes the derivative of $f$ at $c$. Note that $D_c(f)$ is itself a (linear) function which can be evaluated; we denote its dual vector, the gradient, by $\nabla f(c)$; that is, $D_c(f)(h) = \langle \nabla f(c), h \rangle$ for all variations $h$ of the argument to $f$, where $\langle \cdot, \cdot \rangle$ is an appropriate inner product.
Proof: Considering the first term of (138),
\[ D_A\!\left( M \mapsto \frac{\operatorname{tr}(M^T M)^p}{\det(M^T M)} \right) = \operatorname{tr}(A^T A)^p \, D_A\!\left( M \mapsto \frac{1}{\det(M^T M)} \right) + \frac{1}{\det(A^T A)} \, D_A\!\left( M \mapsto \operatorname{tr}(M^T M)^p \right) \tag{206} \]
where, since
\begin{align*}
D_A\!\left( M \mapsto \det(M^T M) \right) &= \left( dX \mapsto \det(A^T A) \operatorname{tr}\!\left( (A^T A)^{-1} dX \right) \right) \circ \left( dA \mapsto A^T dA + dA^T A \right) \\
&= \left( dA \mapsto \det(A^T A) \operatorname{tr}\!\left( (A^T A)^{-1} (A^T dA + dA^T A) \right) \right) \\
&= \left( dA \mapsto 2 \det(A^T A) \operatorname{tr}(A^{-1} dA) \right) , \tag{207}
\end{align*}
we have
\begin{align*}
D_A\!\left( M \mapsto \frac{1}{\det(M^T M)} \right) &= \left( dx \mapsto \frac{-dx}{\det(A^T A)^2} \right) \circ D_A\!\left( M \mapsto \det(M^T M) \right) \\
&= \left( dA \mapsto \frac{-2 \det(A^T A) \operatorname{tr}(A^{-1} dA)}{\det(A^T A)^2} \right) \\
&= \left( dA \mapsto \frac{-2 \operatorname{tr}(A^{-1} dA)}{\det(A^T A)} \right) \tag{208}
\end{align*}
and
\begin{align*}
D_A\!\left( M \mapsto \operatorname{tr}(M^T M)^p \right) &= \left( dx \mapsto p \operatorname{tr}(A^T A)^{p-1} dx \right) \circ \left( dA \mapsto 2 \operatorname{tr}(A^T dA) \right) \\
&= \left( dA \mapsto 2p \operatorname{tr}(A^T A)^{p-1} \operatorname{tr}(A^T dA) \right) . \tag{209}
\end{align*}
Substituting (208) and (209) into (206),
\begin{align*}
D_A\!\left( M \mapsto \frac{\operatorname{tr}(M^T M)^p}{\det(M^T M)} \right) &= \left( dA \mapsto \operatorname{tr}(A^T A)^p \frac{-2 \operatorname{tr}(A^{-1} dA)}{\det(A^T A)} + \frac{2p \operatorname{tr}(A^T A)^{p-1}}{\det(A^T A)} \operatorname{tr}(A^T dA) \right) \\
&= \left( dA \mapsto \frac{-2}{\det(A^T A)} \operatorname{tr}\!\left( \left( \operatorname{tr}(A^T A)^p A^{-1} - p \operatorname{tr}(A^T A)^{p-1} A^T \right) dA \right) \right) \\
&= \left( dA \mapsto \left\langle \frac{-2}{\det(A^T A)} \left( \operatorname{tr}(A^T A)^p A^{-T} - p \operatorname{tr}(A^T A)^{p-1} A \right) , \, dA \right\rangle \right) \tag{210}
\end{align*}
where the angle brackets denote the inner product $\langle \cdot, \cdot \rangle$ defined for matrices by $\langle X, Y \rangle \triangleq \operatorname{tr}(X^T Y)$ (211). This (210) is the dual to the gradient, which is therefore given by (replacing the symbol $A$ by $M$),
\[ -\frac{2}{\det(M^T M)} \left( \operatorname{tr}(M^T M)^p M^{-T} - p \operatorname{tr}(M^T M)^{p-1} M \right) . \tag{212} \]
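As a sanity check – not part of the original proof – the gradient formula (212) can be verified numerically against finite differences; $p = 2$ and the random matrix below are arbitrary choices.

    # Numerical check of the gradient formula (212), with p = 2 and a random
    # invertible A; finite differences approximate the directional derivative.
    import numpy as np

    p = 2
    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 3))

    def phi(M):
        return np.trace(M.T @ M) ** p / np.linalg.det(M.T @ M)

    def grad(M):
        t = np.trace(M.T @ M)
        return (-2 / np.linalg.det(M.T @ M)
                * (t ** p * np.linalg.inv(M).T - p * t ** (p - 1) * M))

    dA = rng.normal(size=(3, 3))
    eps = 1e-6
    fd = (phi(A + eps * dA) - phi(A - eps * dA)) / (2 * eps)
    print(fd, np.sum(grad(A) * dA))   # the two values should agree closely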
A.2 Proof of Theorem 16

Theorem (Theorem 16). Let $\bar{x} \in X$ be the ideal alternative, and let $S = \{(x_k^1, x_k^2)\}_{k=1}^{\infty}$ be a sequence of comparisons drawn independently according to a probability density function $p$ on $\{(x^1, x^2) \in X \times X \mid \|x^1 - \bar{x}\| < \|x^2 - \bar{x}\|\}$ which is nonzero in an open ball $B(\bar{x}, r) = B_r$ around $\bar{x}$. Then, the asymptotic observer given by (201)-(202) converges to $\bar{x}$ in probability.

Proof:

1. If $\langle d_k, \tilde{x}_k \rangle - b_k > 0$, then $\|\tilde{x}_{k+1} - \bar{x}\| < \|\tilde{x}_k - \bar{x}\|$. The distances $\|\tilde{x}_k - \bar{x}\|$ and $\|\tilde{x}_{k+1} - \bar{x}\|$ are related through the expansion
\[ \|\tilde{x}_{k+1} - \bar{x}\|^2 = \|\tilde{x}_k - \bar{x}\|^2 + 2 \langle \Delta_k, \tilde{x}_k - \bar{x} \rangle + \|\Delta_k\|^2 , \tag{215} \]
so, in order to show that $\|\tilde{x}_{k+1} - \bar{x}\| < \|\tilde{x}_k - \bar{x}\|$, it is sufficient to demonstrate that $2 \langle \Delta_k, \tilde{x}_k - \bar{x} \rangle + \|\Delta_k\|^2 < 0$; so, substituting $\Delta_k$ from (202) into (215) (and dropping the indices $k$), we require
\[ \frac{\alpha^2}{\langle d, d \rangle} (b - \langle d, \tilde{x} \rangle)^2 + 2 \frac{\alpha}{\langle d, d \rangle} (b - \langle d, \tilde{x} \rangle) \langle d, \tilde{x} - \bar{x} \rangle < 0 . \]
Since by assumption $\langle d, \tilde{x} \rangle - b > 0$, this is satisfied iff the second factor is positive; that is,
\[ \alpha (b - \langle d, \tilde{x} \rangle) + 2 \langle d, \tilde{x} - \bar{x} \rangle > 0 . \]
Since $\langle d, \tilde{x} \rangle > b$, and by Lemma 5, $\langle d, \bar{x} \rangle \leq b$, this is satisfied so long as $\alpha \in (0, 2)$, as we require.
2. The sequence $e_k = \|\tilde{x}_k - \bar{x}\|$, $k = 0, 1, 2, \ldots$ is nonincreasing. In the second case of (202), $\tilde{x}_{k+1} = \tilde{x}_k$, so $e_{k+1} = e_k$. In the first case, $\langle d_k, \tilde{x}_k \rangle - b_k > 0$, so by point 1, $e_{k+1} < e_k$.

3. Since $(e_k)$ is nonincreasing and bounded below by zero, it converges; zero is a lower bound. To show that this is the greatest such bound, consider some $\epsilon > 0$ and suppose that, at iteration $m$, $\|\tilde{x}_m - \bar{x}\| = \epsilon$. Now, let $z = \min(r, \epsilon/2)$, and consider the open balls $B_1 = B(c_1, z/4)$, $B_2 = B(c_2, z/4)$, where the center points $c_1, c_2$ are defined,
\[ c_j = \bar{x} + \frac{(2j - 1) z}{4} \, \frac{\tilde{x} - \bar{x}}{\|\tilde{x} - \bar{x}\|} . \]
For any $x^1 \in B_1$ and $x^2 \in B_2$ we have $\|x^1 - \bar{x}\| < z/2 < \|x^2 - \bar{x}\|$ while $\|x^2 - \tilde{x}\| < \|x^1 - \tilde{x}\|$; applying Lemma 5 twice, we can confirm that $\bar{x}$ and $\tilde{x}$ are on opposite sides of the plane corresponding to the comparison $(x^1, x^2)$, so this is indeed the case: any such comparison presents the observer
Figure 38: If x1 ∈ B1 and x2 ∈ B2 , then ||x̃k+1 − x̄|| < ||x̃k − x̄||.
with a plane that separates $\tilde{x}$ from $\bar{x}$ and hence triggers a projection. Since $B_1$ and $B_2$ have nonzero measure, and are subsets of $B_r$ in which $p(\cdot)$ is nonzero, the probabilities for this iteration $P_1 = \mathbb{P}(\text{“a point is selected in } B_1\text{”})$ and $P_2 = \mathbb{P}(\text{“a point is selected in } B_2\text{”})$ are both nonzero; therefore, since the $s_k$ are independent, the probability $P^k_{\text{both}}$ that the comparison at iteration $k$ lands in $B_1 \times B_2$ is nonzero, and the probability that this occurs for at least one iteration $k > m$ is given by $1 - \prod_{k=m}^{\infty} \left( 1 - P^k_{\text{both}} \right) = 1$; in other words, with probability one, there exists a $q > m$ such that $\langle d_q, \tilde{x}_q \rangle - b_q > 0$. Then, by point 1, $\|\tilde{x}_q - \bar{x}\| < \|\tilde{x}_m - \bar{x}\| = \epsilon$; since $\epsilon > 0$ was arbitrary, $e_k \to 0$, and the observer converges to $\bar{x}$ in probability. ∎
A.3 Metric Cost Models and Support Vector Machines

The metric cost models considered in the previous section are a special case of quadratic cost models, subject to additional constraints. We will discuss their relationship to kernelized SVMs here.
A.3.1 Quadratic Cost Models

Quadratic cost functions have the form
\[ f(x) = x^T Q x + b^T x \tag{221} \]
and can be written as inner products by mapping $x$ into pairs $(xx^T, x) \in \mathbb{R}^{\dim(X) \times \dim(X)} \times \mathbb{R}^{\dim(X)}$, as
\[ f(x) = \langle (Q, b), (xx^T, x) \rangle = \langle (Q, b), \Phi(x) \rangle ; \tag{222} \]
i.e., this is the inner product induced on the product space by the Frobenius inner product. The corresponding kernel is
\begin{align*}
\kappa(x, y) = \langle \Phi(x), \Phi(y) \rangle &= \operatorname{tr}(x (x^T y) y^T) + x^T y \\
&= (x^T y) \operatorname{tr}(x y^T) + x^T y \\
&= (x^T y)^2 + x^T y .
\end{align*}
Hence, if no additional constraints are placed on $(Q, b)$, then the quadratic case is amenable to the standard kernelized SVM approach, using this (polynomial) kernel. In the subsections that follow we examine what happens when additional constraints are imposed.
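The kernel identity derived above is easy to verify numerically; the following check, with arbitrary random vectors, is illustrative only.

    # Check: for Phi(x) = (x x^T, x), <Phi(x), Phi(y)> = (x^T y)^2 + x^T y.
    import numpy as np

    rng = np.random.default_rng(2)
    x, y = rng.normal(size=(2, 4))
    lhs = np.trace(np.outer(x, x) @ np.outer(y, y)) + x @ y   # Frobenius + dot
    rhs = (x @ y) ** 2 + x @ y
    print(np.isclose(lhs, rhs))   # True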
A.3.2 Metric Cost Models: Known Metric

One might be tempted to try solving the original optimization problem (157) over $(Q, b)$ as in the previous subsection using this kernel, now subject to the equality constraint $Q = I$, which specializes (221) to a metric cost (up to an additive constant). Unfortunately, adding this equality constraint ruins the important property of the optimization problem that the optimal $(Q, b)$ lies in the span of the (transformed) data, and so that the cost at a point can be reconstructed as a linear combination of the kernel evaluated at support vectors and the point. Although the problem is still convex, it must then be solved over the full, explicit feature space, and this renders the resulting optimization problem impractically large. Hence, standard Support Vector Machines are incompatible with metric cost models, which helps to motivate the estimators developed in Chapter V.
REFERENCES
[1] Adomavicius, G. and Tuzhilin, A., “Toward the next generation of rec-
ommender systems: A survey of the state-of-the-art and possible extensions,”
IEEE Trans. on Knowl. and Data Eng., vol. 17, no. 6, pp. 734–749, 2005.
[2] Afriat, S. N., “The construction of utility functions from expenditure data,”
International Economic Review, vol. 8, pp. 67–77, Feb 1967.
[3] Agarwal, S., “Ranking on graph data,” in ICML, pp. 25–32, ACM Press,
2006.
[6] Aho, A. V., Garey, M. R., and Ullman, J. D., “The transitive reduction
of a directed graph,” SIAM Journal on Computing, vol. 1, pp. 131–137, Jun
1972.
[7] Aiolli, F. and Sperduti, A., “Learning preferences for multiclass problems,”
in Advances in Neural Information Processing Systems 17, pp. 17–24, MIT
Press, 2004.
[8] Aizerman, M., Braverman, E., and Rozonoer, L., “Theoretical founda-
tions of the potential function method in pattern recognition learning,” Au-
tomation and Remote Control, vol. 25, pp. 821–837, 1964.
[10] Bahamonde, A., Diez, J., Quevedo, J., Luances, O., and del Coz, J.,
“How to learn consumer preferences from the analysis of sensory data by means
of support vector machines (svm),” Trends in Food Science and Technology,
vol. 18, pp. 20–28, 2007.
[11] Balch, T. and Arkin, R. C., “Behavior-based formation control for multi-robot teams,” IEEE Transactions on Robotics and Automation,
vol. 14, pp. 926–939, 1999.
[12] Beard, R., Lawton, J., and Hadaegh, F. vol. 6, pp. 4087 – 4091, 2000.
[14] Borges, L. and Oliveira, S., “A parallel davidson-type algorithm for several
eigenvalues,” vol. 144, pp. 727–748, 1998.
[15] Boser, B. E., Guyon, I. M., and Vapnik, V. N., “A training algorithm for
optimal margin classifiers,” in 5th Annual ACM Workshop on COLT (Haus-
sler, D., ed.), pp. 144–152, ACM Press, 1992.
[18] Bullo, F. and Lewis, A. D., Geometric Control of Mechanical Systems: Mod-
eling, Analysis, and Design for Simple Mechanical Control Systems. Springer,
2004.
[19] Burger, M., Capasso, V., and Morale, D., “On an aggregation model
with long and short range interactions,” Nonlinear Analysis: Real World Ap-
plications, vol. 8, no. 3, pp. 939 – 958, 2007.
[21] Carr, H., Möller, T., and Snoeyink, J., “Artifacts caused by simplicial
subdivision,” in IEEE Transactions on Visualization and Computer Graphics,
vol. 12, pp. 231–242, 2006.
[22] Chu, W. and Ghahramani, Z., “Preference learning with gaussian pro-
cesses,” in ICML ’05: Proceedings of the 22nd international conference on Ma-
chine learning, (New York, NY, USA), pp. 137–144, ACM, 2005.
[23] Chu, W. and Keerthi, S. S., “Support vector ordinal regression,” Neural
Computation, vol. 19, p. 2007, 2007.
[24] Cohen, W. W., Schapire, R. E., and Singer, Y., “Learning to order
things,” Journal of Artificial Intelligence Research, vol. 10, pp. 243–270, 1999.
[26] Cortes, C., Mohri, M., and Rastogi, A., “Magnitude-preserving ranking
algorithms,” in Proceedings of the Twenty-fourth International Conference on
Machine Learning (ICML 2007), pp. 169–176, 2007.
[27] Desbrun, M., Hirani, A., Leok, M., and Marsden, J., “Discrete exterior
calculus.” (preprint, http://arxiv.org/abs/math/0508341), 2003.
[29] Dı́ez, J., del Coz, J. J., Luaces, O., and Bahamonde, A., “Clustering
people according to their preference criteria,” Expert Syst. Appl., vol. 34, no. 2,
pp. 1274–1284, 2008.
[30] Do, K. and Pan, J., “Nonlinear formation control of unicycle-type mobile
robots,” Robotics and Autonomous Systems, vol. 55, no. 3, pp. 191 – 204, 2007.
[34] Fiechter, C.-N. and Rogers, S., “Learning subjective functions with large
margins,” in Proceedings of the Seventeenth International Conference on Ma-
chine Learning, pp. 287–294, Morgan Kaufmann, 2000.
[39] Guldner, J., Utkin, V., Hashimoto, H., and Harashima, F., “Tracking
gradients of artificial potential fields with non-holonomic mobile robots,” in
Proc. American Control Conference, vol. 4, pp. 2803–2804, 1995.
[40] Hatcher, A., Algebraic Topology. Cambridge University Press, 2002. http:
//www.math.cornell.edu/~hatcher/AT/ATpage.html.
[41] Hauser, J. and Hindman, R., “Aggressive flight maneuvers,” in Decision and
Control, 1997., Proceedings of the 36th IEEE Conference on, vol. 5, pp. 4186–
4191, 1997.
[43] Herbrich, R., Graepel, T., and Obermayer, K., “Large margin rank
boundaries for ordinal regression,” in Advances in large margin classifiers,
pp. 115–132, MIT Press, 2000.
[44] Herlocker, J. L., Konstan, J. A., Terveen, L. G., and Riedl, J. T.,
“Evaluating collaborative filtering recommender systems,” ACM Trans. Inf.
Syst., vol. 22, no. 1, pp. 5–53, 2004.
[46] Holzrichter, M. and Oliveira, S., “A graph based method for generat-
ing the fiedler vector of irregular problems,” in Lecture Notes in Computer Science, pp. 978–985, 1999.
[47] Hörmander, L., The Analysis of Linear Partial Differential Operators, vol. 1.
Springer-Verlag, 1983.
[48] Hüllermeier, E., Frnkranz, J., Cheng, W., and Brinker, K., “La-
bel ranking by learning pairwise preferences,” Artificial Intelligence, vol. 172,
pp. 1897–1917, 2008.
[49] Ji, M., Azuma, S., and Egerstedt, M., “Role-assignment in multi-agent
coordination,” International Journal of Assistive Robotics and Mechatronics,
vol. 7, pp. 32–40, March 2006.
[50] Jiang, X., Lim, L.-H., Yao, Y., and Ye, Y., “Statistical ranking and com-
binatorial hodge theory,” Mathematical Programming, vol. 127, pp. 203–244,
2011.
[52] Johnson, E. and Murphey, T., “Dynamic modeling and motion planning
for marionettes: Rigid bodies articulated by massless strings,” in Robotics and
Automation, 2007 IEEE International Conference on, pp. 330–335, 2007.
[53] Kemeny, J. G. and Snell, L., “Preference ranking: an axiomatic approach,”
pp. 9–23, 1973.
[56] Kingston, P., Egerstedt, M., and Verriest, E., “Health monitoring of
networked systems,” in Proc. Mathematical Theory of Networks and Systems,
2008.
[57] Kingston, P. and Egerstedt, M., “Metric preference learning with appli-
cations to motion imitation,” SIAM J. Control and Optimization. Submitted.
[58] Kingston, P. and Egerstedt, M., “Comparing apples and oranges through
partial orders: An empirical approach,” in American Control Conference,
pp. 5434–5439, 2009.
[59] Kingston, P. and Egerstedt, M., “Time and output warping of control
systems,” in American Control Conference, pp. 6042–6047, 2010.
[61] Kingston, P. and Egerstedt, M., “Time and output warping of control
systems: Comparing and imitating motions,” Automatica, vol. 47, pp. 1580–
1588, 2011.
[64] Korada, S., Montanari, A., and Oh, S., “Gossip pca,” 2011.
[65] Kunz, T., Kingston, P., Stilman, M., and Egerstedt, M., “Dynamic
chess: Strategic planning for robot motion,” in Proc. Int’l. Conference on
Robotics and Automation (ICRA), pp. 3796–3803, 2011.
[66] Kwakernaak, H. and Sivan, R., “The maximal achievable accuracy of linear optimal regulators and linear optimal filters,” IEEE Trans. on Automat.
Contr., vol. 17, no. 1, pp. 79–86, 1972.
[67] Kybic, J., Elastic Image Registration using Parametric Deformation Models.
PhD thesis in biomedical image processing, Ecole Polytechnique Fédérale de Lausanne,
July 2001.
[68] Laurent, T., “Local and global existence for an aggregation equation,” Com-
munications in Partial Differential Equations, vol. 32, pp. 1941–1964, December
2007.
[69] Lee, J. M., Introduction to Smooth Manifolds. Springer, 2000.
[70] Lei, H. and Govindaraju, V., “Direct image matching by dynamic warping,”
in Computer Vision and Pattern Recognition Workshop (CVPRW’04), p. 76,
2004.
[71] Leonard, N. E. and Fiorelli, E., “Virtual leaders, artificial potentials and
coordinated control of groups,” in Proc. 40th IEEE Conference on Decision and
Control, pp. 2968–2973, December 2001.
[72] Mesbahi, M. and Egerstedt, M., Graph Theoretic Methods in Multiagent
Networks. Princeton University Press, 2010.
[73] Mogilner, A. and Edelstein-Keshet, L., “A non-local model for a
swarm,” Journal of Mathematical Biology, vol. 38, pp. 534–570, 1999.
[74] Montaner, M., López, B., and De La Rosa, J. L., “A taxonomy of recom-
mender agents on the internet,” Artif. Intell. Rev., vol. 19, no. 4, pp. 285–330,
2003.
[75] Muhammad, A. and Jadbabaie, A., “Decentralized computation of homol-
ogy groups in networks by gossip,” in Proc. American Control Conference,
pp. 3438–3443, 2007.
[76] Munkres, J. R., Elements of Algebraic Topology. Addison-Wesley, 1984.
[77] Munro, I., “Efficient determination of the strongly connected components and
the transitive closure of a graph.” Unpublished manuscript, 1971.
[78] Olfati-Saber, R. and Murray, R. M., “Consensus problems in networks
of agents with switching topology and time-delays,” IEEE Transactions on Au-
tomatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.
[79] Olfati-Saber, R. and Murray, R. M., “Distributed cooperative control
of multiple vehicle formations using structural potential functions,” in IFAC
World Congress, pp. 346–352, 2002.
[80] Pappas, G., “Avoiding saturation by trajectory reparameterization,” in Pro-
ceedings of the 35th IEEE Decision and Control, vol. 1, pp. 76–81, Dec 1996.
[81] Perot, J. B. and Subramanian, V., “Discrete calculus methods for diffu-
sion,” J. Comput. Phys., vol. 224, pp. 59–81, May 2007.
[82] Prochazka, Z., Ito, T., and Okamoto, T., “Image correspondences by
warping functions and its applications,” in Systems and Computers in Japan,
vol. 33, pp. 22–30, 2002.
[83] Raptis, M., Bustreo, M., and Soatto, S., “Time warping under dynamic
constraints,” in Eleventh IEEE International Conference on Computer Vision,
Workshop on Dynamical Vision, 2007.
[84] Restificar, A. and Haddawy, P., “Inferring implicit preferences from nego-
tiation actions,” in Proc. Int’l Symposium on Artificial Intelligence and Math-
ematics, Jan 2004.
[85] Reynolds, C. W., “Flocks, herds, and schools: A distributed behavioral
model,” in Computer Graphics, pp. 25–34, 1987.
[86] Sakoe, H. and Chiba, S., “A dynamic programming approach to continuous
speech recognition,” in Proc. Int’l. Cong. Acoust., pp. 65–68, 1971.
[87] Sakoe, H. and Chiba, S., “Dynamic programming algorithm optimization
for spoken word recognition,” in IEEE Transactions on Acoustics, Speech and
Signal Processing, pp. 43–49, Feb 1978.
[88] Skjetne, R., Fossen, T. I., and Kokotovic, P. V., “Adaptive maneu-
vering, with experiments, for a model ship in a marine control laboratory,” in
Automatica, vol. 41, pp. 289–298, 2005.
[89] Smola, A. J. and Schölkopf, B., “A tutorial on support vector regression,”
Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.
[90] Srinivasan, A. and Mascagni, M., Monte Carlo Techniques for Estimating the Fiedler Vector in Graph Applications. 2002.
[91] Strassen, V., “Gaussian elimination is not optimal,” Numer. Math, vol. 13,
no. 3, pp. 354–356, 1969.
[92] Tahbaz-Salehi, A. and Jadbabaie, A., “Distributed coverage verification
algorithms in sensor networks without location information,” IEEE Transac-
tions on Automatic Control, vol. 55, pp. 1837–1849, August 2010.
[93] Takaya, K. and Reinhardt, R. T., “Low bit-rate facial motion picture
coding using image warping,” in WESCANEX 97: Communications, Power
and Computing. Conference Proceedings., IEEE, pp. 138–143, May 1997.
[94] Tan, K.-H. and Lewis, M. A., “Virtual structures for high-precision coop-
erative mobile robotic control,” in Proc. IEEE/RSJ Int Intelligent Robots and
Systems ’96, IROS 96 Conf, vol. 1, pp. 132–139, 1996.
[95] Tanner, H. G., Jadbabaie, A., and Pappas, G. J., “Stable flocking of
mobile agents, part i: Fixed topology,” in IEEE Conference on Decision and
Control, pp. 2010–2015, 2003.
[96] Tanner, H. G., Jadbabaie, A., and Pappas, G. J., “Stable flocking of
mobile agents, part ii: Dynamic topology,” in IEEE Conference on Decision
and Control, pp. 2016–2021, 2003.
[98] Tong, Y., Lombeyda, S., Hirani, A. N., and Desbrun, M., “Discrete
multiscale vector field decomposition,” in Proc. SIGGRAPH, 2003.
[99] Topaz, C. M., Bertozzi, A. L., and Lewis, M. A., “A nonlocal continuum
model for biological aggregation,” Bulletin of Mathematical Biology, vol. 68,
pp. 1601–1623, October 2006.
[102] Vapnik, V. N., The nature of statistical learning theory. Springer-Verlag, 1995.
[104] Vidal, R., Shakernia, O., and Sastry, S., “Formation control of non-
holonomic mobile robots with omnidirectional visual servoing and motion seg-
mentation,” in IEEE International Conference on Robotics and Automation,
pp. 584–589, 2003.
[107] Xiao, L. and Boyd, S., “Fast linear iterations for distributed averaging,” in
Proc. 42nd IEEE Conf. Decision and Control, vol. 5, pp. 4997–5002, 2003.
[110] Zelazo, D. and Mesbahi, M., “Edge agreement: Graph-theoretic perfor-
mance bounds and passivity analysis,” IEEE Transactions on Automatic Con-
trol, vol. 56, no. 3, pp. 544–555, 2011.