Author: Department of Electrical Engineering and Computer Science, May 20, 2011

Certified by: Michel X. Goemans, Professor of Applied Mathematics, Thesis Supervisor

Certified by: Jonathan A. Kelner, Assistant Professor of Applied Mathematics, Thesis Supervisor

Accepted by: Professor Leslie A. Kolodziejski, Chairman, Department Committee on Graduate Students
From Graphs to Matrices, and Back: New Techniques for
Graph Algorithms
by
Aleksander Mądry
Abstract
The growing need to deal efficiently with massive computing tasks prompts us to
consider the following question:
How well can we solve fundamental optimization problems if our algorithms have to
run really quickly?
The motivation for the research presented in this thesis stems from addressing the
above question in the context of algorithmic graph theory.
To pursue this direction, we develop a toolkit that combines a diverse set of
modern algorithmic techniques, including sparsification, low-stretch spanning trees,
the multiplicative-weights-update method, dynamic graph algorithms, fast Laplacian
system solvers, and tools of spectral graph theory.
Using this toolkit, we obtain improved algorithms for several basic graph problems
including:
• The Maximum s-t Flow and Minimum s-t Cut Problems. We develop a new approach to computing (1 − ε)-approximately maximum s-t flows and (1 + ε)-approximately minimum s-t cuts in undirected graphs that gives the fastest known algorithms for these tasks. These algorithms are the first ones to improve the long-standing bound of O(n^{3/2}) running time on sparse graphs;
• Undirected (Multi-)Cut-Based Minimization Problems. We develop a
general framework for designing fast approximation algorithms for (multi-)cut-
based minimization problems in undirected graphs. Applying this framework
leads to the first algorithms for several fundamental graph partitioning primi-
tives, such as the (generalized) sparsest cut problem and the balanced separator
problem, that run in close to linear time while still providing polylogarithmic
approximation guarantees;
Acknowledgments
First and foremost, I would like to thank Michel Goemans and Jonathan Kelner for
the guidance, support, and nurturing that they provided me with during my graduate
studies. Michel taught me how to choose problems to work on and served as a role
model of an upstanding researcher. Without Jon’s mentoring, contagious enthusiasm,
and endless support that was going beyond the call of duty, this thesis would not have
been possible. Having both of them as my co-advisors was a truly unique experience
that I will always remember. I learned a great deal from it.
I also want to thank Andrzej Borowiec, Maciej Liśkiewicz, Krzysztof Loryś, and
Leszek Pacholski for their help, advice, and support at the beginning of my scientific
career. If not for them, I would not be here.
I want to express my gratitude to Costis Daskalakis and Piotr Indyk for serving
on my thesis committee and their frequent advice.
I am grateful to Daniel Spielman and Shang-Hua Teng for their advice, as well as for their beautiful work, without which many of the results in this thesis would have never been conceived.
MIT’s theory group provided a wonderful environment in which to be a graduate student, and I benefited from it greatly. I want to thank Costis Daskalakis, Piotr Indyk,
Ronitt Rubinfeld, and Madhu Sudan, for making me feel welcome here. (Also, thanks
for all the outings, Madhu!) Special thanks to all my MIT friends: Alexandr Andoni,
Arnab Bhattacharyya, Erik and Martin Demaine, Andy Drucker, Bernhard Haeupler,
Ankur Moitra, Jelani Nelson, Krzysztof Onak, Rotem Oshman, Debmalya Panigrahi,
and Mihai Pǎtraşcu for making each day at the office an enjoyable experience. I am
especially grateful to Alex for being my “older brother” during my first years at MIT.
I also want to thank Umesh Vazirani for hosting me during my visits to Berkeley,
and Nikhil Bansal for the invitations to IBM Research. These experiences influenced
me greatly and helped me develop as a researcher.
To my wife Kamila,
and my family
Contents

1 Introduction
    1.1 Results and Structure of This Thesis

2 Background
    2.1 Preliminary Definitions
    2.2 Cuts and Flows
    2.3 Laplacian of a Graph
    2.4 Electrical Flows and Resistor Networks
        2.4.1 Effective s-t Resistance and Effective s-t Conductance
    2.5 Electrical Flows, Random Walks, Random Spanning Trees, and Laplacians
        2.5.1 Spectral Connection
        2.5.2 Electrical Flows, Random Walks, and Random Spanning Trees
    2.6 Approximating Graphs via Simpler Graphs
    2.7 Laplacian System Solvers
    2.8 Packing Flows via the Multiplicative-Weights-Update Method
        2.8.1 Designing a γ-approximate Solver
        2.8.2 Correctness of the Multiplicative-Weights-Update Routine
        2.8.3 Convergence of the Multiplicative-Weights-Update Routine

    3.1 Introduction
        3.1.1 Previous Work on Maximum Flows and Minimum Cuts
        3.1.2 Outline of This Chapter
    3.2 Preliminaries and Notations
    3.3 A Simple Õ(m^{3/2} ε^{−5/2})-Time Flow Algorithm
        3.3.1 From Electrical Flows to Maximum Flows: Oracle Construction
        3.3.2 Computing Electrical Flows
    3.4 An Õ(mn^{1/3} ε^{−11/3}) Algorithm for Approximate Maximum Flow
        3.4.1 The Improved Algorithm
        3.4.2 Analysis of the New Algorithm
        3.4.3 The Proof of Lemma 3.4.1
        3.4.4 Improving the Running Time to Õ(mn^{1/3} ε^{−11/3})
        3.4.5 Approximating the Value of the Maximum s-t Flow in Time Õ(m + n^{4/3} ε^{−8/3})
    3.5 A Dual Algorithm for Finding an Approximately Minimum s-t Cut in Time Õ(m + n^{4/3} ε^{−8/3})
        3.5.1 An Overview of the Analysis
        3.5.2 Cuts, Electrical Potentials, and Effective Resistance
        3.5.3 The Proof that the Dual Algorithm Finds an Approximately Minimum Cut
    3.6 Towards a Maximum Flow Algorithm for Directed Graphs

    4.3 Solving Multicommodity Flow Problems and Dynamic Graph Algorithms
        4.3.1 The (δ, M_max, M_min, Q)-ADSP Data Structure
        4.3.2 Solving the Decremental Dynamic All-Pairs Shortest Paths Problem Using the (δ, M_max, M_min, P̂)-ADSP Data Structure
    4.4 Maximum Multicommodity Flow Problem
        4.4.1 Existence of Short Paths in P̂
        4.4.2 Randomized Cycling Through Commodities
        4.4.3 Our Algorithm
    4.5 Construction of the (δ, M_max, M_min, P̂)-ADSP Data Structure
        4.5.1 Implementation of the (δ, M_max, M_min, P̂)-ADSP with Linear Dependence on M_max/M_min
        4.5.2 Proof of Theorem 4.3.4
    4.6 Maximum Concurrent Flow Problem
        4.6.1 Our Algorithm

        5.4.2 Generalized Sparsest Cut Problem
        5.4.3 Balanced Separator and Sparsest Cut Problem
    5.5 Proof of Theorem 5.3.6
        5.5.1 Graphs H(T, F) and the Family H[j]
        5.5.2 Obtaining an (Õ(log n), H[Õ(m log U / t)])-decomposition of G

        7.3.2 Obtaining a Good (φ, γ)-decomposition Quickly
        7.3.3 Generating a Random Arborescence in Time Õ(m^{3/2})
    7.4 Obtaining an Õ(m√n) Running Time
        7.4.1 Strong (φ, γ)-decompositions
        7.4.2 Computation of Q_v(u)
        7.4.3 Coping with Shortcomings of X̂

8 Conclusions

Bibliography
Chapter 1
Introduction
Optimization is one of the most basic tasks in modern science. An amazingly vast
array of tasks arising in computer science, operations research, economics, biology,
chemistry, and physics can be cast as optimization problems. As a result, the goal
of understanding the nature of optimization from an algorithmic point of view is a
cornerstone of theoretical computer science. In particular, it motivated the notion of
polynomial-time algorithms as a standard of efficiency.
Unfortunately, many optimization problems of practical interest are NP-hard and thus probably cannot be solved exactly in polynomial time. One way of coping with this intractability is to change our paradigm: we still insist that our algorithms produce solutions in polynomial time, but we require only that the quality of these solutions be within some approximation ratio of the optimum. Over the last thirty-
five years, this approach has been extremely successful, giving rise to the field of
approximation algorithms and leading to practical algorithms for a plethora of real-
world optimization problems.
However, recent years have seen the emergence of various extremely large graphs, and this has changed the nature of the optimization tasks that we are facing.
If one considers, for example, the graphs that companies like Google or Facebook
have to deal with, one quickly realizes that the size of the corresponding optimization
problems that one needs to solve on a day-to-day (or even hour-to-hour) basis is much
larger than what one would have encountered just a decade ago. It becomes apparent
that this explosion in the amount of data to be processed cannot be accommodated
by the development of hardware-based solutions alone, especially given the growing
need to perform computations in a cost- and energy-efficient manner. It is therefore
critical to address this issue from an algorithmic point of view as well. As a parallel
to the method employed in the past to cope with NP-hardness, one should focus in
this context on reducing the running time of the algorithms to make them as fast as
possible – ideally, to run in nearly-linear time – even if it comes at a cost of reducing
the quality of the returned solution.
The research presented in this thesis stems from pursuing the above program in the context of graph problems. At its core is the general question:

How well can we solve fundamental graph problems if our algorithms have to run really quickly?

That is, can we obtain algorithms for graph problems that run in essentially the best possible time and that compute solutions whose quality is comparable to the best achievable by polynomial-time algorithms?
Addressing this challenge seems to be beyond the reach of classical, purely combi-
natorial methods. Therefore, we approach it by developing and applying a diverse set
of modern algorithmic techniques, including sparsification, low-stretch spanning trees,
the multiplicative-weights-update method, dynamic graph algorithms, fast Laplacian
system solvers, and tools of spectral graph theory. Using these, we obtain fast approx-
imation algorithms for the single- and multi-commodity flow problems, and a variety
of cut and multi-cut problems such as the minimum s-t cut problem, the (generalized)
sparsest cut problem, and the balanced separator problem.
A recurring theme that we want to highlight here is that most of the results
presented in this thesis are established by connecting combinatorial properties of
graphs to linear-algebraic properties of associated matrices, known as Laplacians.
It is well-known that one can use the eigenvalue spectrum of the Laplacian matrix
to study the underlying graph. For example, one can use this to estimate the mixing
time of the random walks or approximate certain cut properties of the graph. (An
in-depth treatment of such methods can be found in the book by Chung [42].)
In this thesis, however, we show that much more can be obtained by broadening
our investigation to include more general linear-algebraic properties of Laplacians. In
particular, a key ingredient in many of our results is the use of electrical flows and
related objects to probe the combinatorial structure of the graph. It turns out that
this approach allows one to obtain much richer information about the graph than that
conveyed by just looking at the spectrum of the Laplacian. Furthermore, computing
electrical flows corresponds to solving a linear system in the Laplacian matrix (i.e., a Laplacian system). One can approximately solve such a system in nearly-linear time [129, 95]; this gives rise to an extremely efficient primitive that allows us to access the information conveyed by electrical flows in an amount of time comparable to that required by Dijkstra’s algorithm.
The above approach to studying graphs provides us with a powerful new toolkit
for algorithmic graph theory. In fact, as we will see in this thesis, its applicability goes
beyond providing faster algorithms for graph problems. By combining the classical
polyhedral methods with the notion of random spanning trees and its understanding
via electrical flows, we are able to break the long-standing approximation barrier
of Θ(log n) for one of the fundamental problems of combinatorial optimization: the
asymmetric traveling salesman problem.
Finally, we want to note that even though the above-mentioned bridge between
the combinatorial and linear-algebraic worlds is usually used to obtain better under-
standing of combinatorial problems, it is also possible to utilize this connection in the
other direction. Namely, in this thesis, we consider a classical algorithmic problem of spectral graph theory: the generation of a uniformly random spanning tree in an undirected graph, a problem that was traditionally approached with random-walk-based – and thus linear-algebraic in nature – methods. We show how, by integrating these methods with combinatorial graph partitioning techniques and fast Laplacian system solvers, one can speed up the existing algorithms for this problem.
Chapter 3 – The Maximum s-t Flow and Minimum s-t Cut Problems. We develop a new approach to computing (1 − ε)-approximately maximum s-t flows and (1 + ε)-approximately minimum s-t cuts in undirected graphs. Our approach employs electrical flow computations – each of them corresponding to solving a Laplacian system – to probe the global flow structure of the graph. The algorithms that result from doing this are the fastest known ones for these tasks. In particular, they improve the long-standing bound of O(n^{3/2}) running time on sparse graphs. The material in
this chapter is based on joint work with Paul Christiano, Jonathan Kelner, Daniel
Spielman, and Shang-Hua Teng [40].
the ideas from the area of dynamic graph algorithms to improve the performance of the multiplicative-weights-update-based algorithms that are traditionally used in the context of multicommodity flows. As a result, we obtain running time improvements by a factor of roughly Õ(m/n) and, whenever ε is fixed, the obtained running times essentially match a natural barrier for all the existing fast algorithms and are within an Õ(m/n) factor of a trivial lower bound that applies to all the algorithms for these problems.
problems. The material in this chapter is based on [105].
techniques for the problem with combinatorial graph partitioning primitives and fast
Laplacian system solvers. The material in this chapter is based on joint work with
Jonathan Kelner and James Propp [90].
Chapter 2
Background
In this chapter, we provide the basic graph-theoretic definitions and introduce some of the tools and concepts that will be employed throughout this thesis. This includes the concept of the Laplacian matrix of a graph and of electrical flows, as well as tools such as Laplacian system solvers, the multiplicative-weights-update method, random and low-stretch spanning trees, and sparsification.
By an s-t cut, for given vertices s and t, we mean any cut C in G that separates s from t, i.e., such that s ∈ C but t ∉ C.
Now, for a directed graph G = (V, E) and two of its vertices s and t, we define
an s-t flow to be a function that assigns non-negative values to arcs and obeys the
flow-conservation constraints:
Σ_{e∈E^−(v)} f(e) − Σ_{e∈E^+(v)} f(e) = 0   for all v ∈ V \ {s, t},

where E^−(v) (resp. E^+(v)) is the set of arcs with v being their tail (resp. head). The value |f| of the flow is defined to be the net flow out of the source vertex,

|f| := Σ_{e∈E^−(s)} f(e) − Σ_{e∈E^+(s)} f(e).
It follows easily from the flow conservation constraints that the net flow out of s is
equal to the net flow into t, so |f | may be interpreted as the amount of flow that is
sent from the source s to the sink t.
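To make the above definitions concrete, here is a minimal sketch (our own illustration; the graph and flow values are made up) that checks flow conservation and computes the value |f| on a small directed graph:

    # Toy directed graph: arcs are (tail, head) pairs; f assigns a flow to each arc.
    arcs = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "a"), ("b", "t")]
    f = {("s", "a"): 2.0, ("s", "b"): 1.0, ("a", "t"): 2.5,
         ("b", "a"): 0.5, ("b", "t"): 0.5}

    def net_out(v):
        # flow on arcs with tail v (i.e., E^-(v)) minus flow on arcs with head v (E^+(v))
        return (sum(f[e] for e in arcs if e[0] == v)
                - sum(f[e] for e in arcs if e[1] == v))

    assert all(abs(net_out(v)) < 1e-9 for v in ("a", "b"))  # conservation
    assert abs(net_out("s") + net_out("t")) < 1e-9          # net out of s = net into t
    print(net_out("s"))                                      # |f| = 3.0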
To extend the above definition to undirected graphs, one could just model undi-
rected edges as two arcs that are oriented in opposite directions. However, for nota-
tional convenience, we will make our definition of an s-t flow in an undirected graph slightly different. Namely, given an undirected graph G = (V, E), we orient each edge in E
arbitrarily; this divides the edges incident to a vertex v ∈ V into the set E + (v) of
edges oriented towards v and the set E − (v) of edges oriented away from v. These
orientations will be used to interpret the meaning of a positive flow on an edge. If
an edge has positive flow and is in E + (v), then the flow is towards v. Conversely,
if it has negative flow then the flow is away from v. One should keep in mind that
we allow here the flow on an edge to go in either direction, regardless of this edge’s
orientation.
Now, once the sets E + (v) and E − (v) are introduced, the rest of the definition of
an s-t flow is analogous to the one in the directed case. The only difference is that
now f can also assign negative values to edges (as they correspond to flowing in the
direction opposite to the orientation).
achieve this, one can focus on minimizing the ratio of the capacity of the cut to some measure of the size of its smaller side. For example, in the minimum conductance cut problem we are given a capacitated graph G = (V, E, u) and we want to find a cut C that minimizes its conductance

u(C) / min{Vol(C), Vol(C̄)},   (2.1)

where Vol(C) (resp. Vol(C̄)) is the sum of capacities of all the edges with at least one endpoint in C (resp. C̄). Another important problem of this type is the sparsest cut problem, in which we want to find a cut C with minimal sparsity

u(C) / min{|C|, |C̄|}.   (2.2)
An important property of the Laplacian¹ – one that reveals a lot of its structure – is that it can be expressed as a product of very simple matrices. Namely, if we consider the edge-vertex incidence matrix B, which is an n × m matrix with rows indexed by vertices and columns indexed by edges, such that

B_{v,e} = 1 if e ∈ E^−(v),  B_{v,e} = −1 if e ∈ E^+(v),  and B_{v,e} = 0 otherwise,

and we take W to be the m × m diagonal matrix with W_{e,e} = w_e for each edge e, then

L(G) = B W B^T.   (2.3)

In particular, this factorization implies that, for any vector x ∈ IR^n,

x^T L(G) x = Σ_{(u,v)∈E} w_{(u,v)} (x_u − x_v)².   (2.4)

¹ One could also extend the definition of the Laplacian to the case of directed graphs. However, the resulting object loses most of the properties that are useful to us. Therefore, we consider the Laplacian only in the undirected setting.
This quadratic form reveals a lot of basic properties of the Laplacian. For instance,
from it one can immediately see that if λ_1 ≤ λ_2 ≤ … ≤ λ_n are the eigenvalues of the Laplacian L(G) and v_1, v_2, …, v_n are the corresponding eigenvectors, then:
1. All λ_i’s are non-negative, i.e., the Laplacian L(G) is a positive semi-definite matrix;

2. λ_1 is equal to 0 and the corresponding eigenvector v_1 is the normalized all-ones vector 1;

3. If G is connected (which will always be the case in this thesis), λ_2 is strictly positive.
Also, the quadratic form (2.4) highlights the crucial connection between the Laplacian and the cut structure of the underlying graph. Namely, if we consider a cut C in G and its characteristic vector χ_C ∈ {−1, 1}^{|V|} that has the coordinates corresponding to vertices in C equal to 1, and the remaining coordinates equal to −1, then we will have that

χ_C^T L(G) χ_C = 4 w_G(C),   (2.5)

where w_G(C) is the total weight in G of the edges in the cut C. That is, the quadratic form of the Laplacian on such characteristic vectors measures the capacity of the corresponding cuts.
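These identities are easy to verify numerically. The following numpy sketch (an illustration of ours; the triangle graph and its weights are arbitrary) builds L(G) via the factorization (2.3) and checks the quadratic form (2.4) and the cut identity (2.5):

    import numpy as np

    edges = [(0, 1), (1, 2), (0, 2)]          # each edge oriented (tail, head)
    w = np.array([1.0, 2.0, 3.0])
    n, m = 3, len(edges)

    B = np.zeros((n, m))                      # edge-vertex incidence matrix
    for j, (u, v) in enumerate(edges):
        B[u, j], B[v, j] = 1.0, -1.0          # +1 at the tail, -1 at the head
    L = B @ np.diag(w) @ B.T                  # L(G) = B W B^T, cf. (2.3)

    x = np.random.randn(n)
    quad = sum(w[j] * (x[u] - x[v]) ** 2 for j, (u, v) in enumerate(edges))
    assert np.isclose(x @ L @ x, quad)        # quadratic form (2.4)

    chi = np.array([1.0, -1.0, -1.0])         # cut C = {0}: +1 inside, -1 outside
    assert np.isclose(chi @ L @ chi, 4 * (w[0] + w[2]))  # cut identity (2.5)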
Finally, we should observe that the Laplacian matrix is not invertible (as λ_1 is equal to 0). However, it will be useful to employ a notion that allows us to invert a Laplacian in a restricted sense. This notion is the Moore-Penrose pseudoinverse L(G)^† of L(G), which is defined as

L(G)^† := Σ_{i: λ_i ≠ 0} (1/λ_i) v_i v_i^T.   (2.6)

As one can see, L(G)^† acts as an inverse on the range of the matrix L(G) and as a zero matrix on the orthogonal complement of this range.
Given a vector r of resistances (one resistance r_e for each edge e), we define the energy E_r(f) of a flow f as E_r(f) := Σ_{e∈E} r_e f(e)². Now, the electrical s-t flow of value F (with respect to r) is the (unique) s-t flow that minimizes E_r(f) among all s-t flows f of value F.

From a physical point of view, the electrical flow of value F corresponds to the current that is induced in G if we view it as an electrical circuit in which each edge e has resistance r_e, and we send F units of current from s to t, say by attaching s to a current source and t to ground.
It is well-known (see e.g. [103]) that an electrical flow is a potential flow, which means that for any electrical flow f there exists a vector φ ∈ IR^n of vertex potentials that determine the flow f via Ohm’s Law:

f(e) = (φ_v − φ_u) / r_{(u,v)}   for each e = (u, v).   (2.7)
As a result, finding an electrical flow boils down to finding the corresponding vertex
potentials.
Now, the key connection between electrical flows and Laplacian matrices is that
these vertex potentials and, thus, the electrical flow, can be obtained by solving a
Laplacian system, i.e., a linear system of equations where the constraint matrix is the
Laplacian matrix.
To establish this connection formally, let us consider a task of finding an electrical
s-t flow f of value 1. (We assume that the value of the flow is 1 here, but one can
get a flow of arbitrary value F by just multiplying this value-one flow by F .)
Let us view our flow f as a vector f ∈ IRm of m real entries – one entry per
edge – with the orientations of the edges determining the signs of the coordinates. It
is easy to see that in this vector interpretation, the v-th entry of the vector Bf will be the difference between the flow out of and the flow into vertex v. Therefore, the
constraints on f that 1 unit of flow is sent from s to t and that flow is conserved at
all other vertices can be compactly written as
Bf = χ_{s,t},   (2.8)

where χ_{s,t} ∈ IR^n is the vector that has a 1 in the coordinate corresponding to s, a −1 in the coordinate corresponding to t, and zeros elsewhere. Furthermore, by Ohm’s Law (2.7), we have f = C B^T φ, where C is the m × m diagonal matrix with C_{e,e} = c_e = 1/r_e. Plugging this into (2.8) gives

B C B^T φ = χ_{s,t}.

Now, by looking at the factorization of the Laplacian from (2.3) we can conclude that the vertex potentials φ corresponding to the electrical s-t flow f of value 1 are given by a solution to the Laplacian system

L(G) φ = χ_{s,t},

where L(G) is the Laplacian of the underlying graph with the weights w_e equal to the conductances c_e for each e (and thus being the inverses of the resistances r_e). In other words, φ can be obtained by multiplying the pseudoinverse L(G)^† of the corresponding Laplacian by the vector χ_{s,t}:

φ = L(G)^† χ_{s,t}.

This, in turn, allows us to use Ohm’s Law (2.7) again to conclude that the vector representation f of our electrical flow f can be written compactly as

f = C B^T L(G)^† χ_{s,t}.
At this point we also want to note that the above discussion allows us to express the energy E_r(f) of an electrical s-t flow f (of arbitrary value) as

E_r(f) = f^T R f = χ_{s,t}^T L(G)^† B C^T R C B^T L(G)^† χ_{s,t} = χ_{s,t}^T L(G)^† L(G) L(G)^† χ_{s,t} = χ_{s,t}^T L(G)^† χ_{s,t} = φ^T L(G) φ,   (2.12)

where R := C^{−1} is the m × m diagonal matrix with R_{e,e} = r_e.
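The whole chain of reasoning – potentials via the pseudoinverse, the flow via Ohm’s Law, and the energy identity (2.12) – fits in a few lines of numpy. The sketch below uses dense linear algebra purely for intuition (the small graph and resistances are our own example); the thesis, of course, relies on nearly-linear-time Laplacian solvers rather than an explicit pseudoinverse:

    import numpy as np

    edges = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
    r = np.array([1.0, 1.0, 2.0, 1.0, 3.0])  # resistances; conductances c_e = 1/r_e
    n, s, t = 4, 0, 3

    B = np.zeros((n, len(edges)))
    for j, (u, v) in enumerate(edges):
        B[u, j], B[v, j] = 1.0, -1.0
    C = np.diag(1.0 / r)
    L = B @ C @ B.T                           # Laplacian with weights w_e = 1/r_e

    chi = np.zeros(n); chi[s], chi[t] = 1.0, -1.0
    phi = np.linalg.pinv(L) @ chi             # vertex potentials phi = L(G)^† chi_{s,t}
    f = C @ (B.T @ phi)                       # the electrical s-t flow of value 1

    assert np.allclose(B @ f, chi)            # one unit of net flow from s to t
    assert np.isclose(f @ np.diag(r) @ f, chi @ phi)  # E_r(f) = chi^T L^† chi, cf. (2.12)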
2.4.1 Effective s-t Resistance and Effective s-t Conductance
Apart from the definition of electrical flows, there are two other basic quantities from the theory of electrical networks that we will find useful: the effective s-t resistance and the effective s-t conductance.
Let f be the electrical s-t flow of value 1, and let φ be its vector of vertex potentials. The effective s-t resistance of G with respect to the resistances r is given by

R^{s,t}_eff(r) := φ_s − φ_t.   (2.13)

(For notational convenience, we suppress the reference to s and t – and simply write R_eff(r) – whenever these vertices are clear from the context.)
Using our linear-algebraic description of the electrical flow and equation (2.12), we have

R^{s,t}_eff(r) = φ_s − φ_t = χ_{s,t}^T φ = χ_{s,t}^T L(G)^† χ_{s,t} = E_r(f).

This gives us an alternative description of the effective s-t resistance as the energy of the corresponding electrical flow of value 1.
Now, the related notion of the effective s-t conductance of G with respect to the resistances r is defined as

C^{s,t}_eff(r) := 1 / R^{s,t}_eff(r).

We note that C^{s,t}_eff(r) is equal to the value of the electrical flow in which the difference φ_s − φ_t of the corresponding vertex potentials at s and t is 1. (Similarly to the case of effective resistance, we will suppress the reference to s and t in the notation whenever they are clear from the context.)
The following standard fact (see [32] for a proof) will be useful to us.

Fact 2.4.1. For any G = (V, E) and any vector of resistances r,

C^{s,t}_eff(r) = min_{φ : φ_s = 1, φ_t = 0} Σ_{(u,v)∈E} (φ_u − φ_v)² / r_{(u,v)}.

Furthermore, the equality is attained for φ being the vector of vertex potentials corresponding to the electrical s-t flow of G (with respect to r) of value 1/R^{s,t}_eff(r).
This fact will allow for a better understanding of how changes to the resistance of a particular edge e influence the effective resistances of other edges. In particular, we can prove the following simple but fundamental result.
Corollary 2.4.2 (Rayleigh Monotonicity Principle). If r'_e ≥ r_e for all e ∈ E, then R^{s,t}_eff(r') ≥ R^{s,t}_eff(r).

Proof. For any φ,

Σ_{(u,v)∈E} (φ_u − φ_v)² / r'_{(u,v)} ≤ Σ_{(u,v)∈E} (φ_u − φ_v)² / r_{(u,v)},

so the minima of these expressions over possible values of φ obey the same relation, and thus – by Fact 2.4.1 – C^{s,t}_eff(r') ≤ C^{s,t}_eff(r). Inverting both sides of this inequality yields the desired result.
2.5 Electrical Flows, Random Walks, Random Spanning Trees, and Laplacians

As we mentioned, one reason behind the fundamental importance of the Laplacian matrix is that there is a great deal of information about the graph G that one can infer from analyzing the properties of its Laplacian L(G). Here, we survey some of the examples illustrating that. Several of these connections are phrased most naturally in terms of the normalized Laplacian

L̂(G) := D(G)^{−1/2} L(G) D(G)^{−1/2},

where D(G) is the diagonal matrix of the (weighted) vertex degrees of G; we use λ̂_2 to denote the second-smallest eigenvalue of L̂(G), often called the spectral gap.
Mixing time of random walks: The first of them is the connection to the mixing time of random walks. Recall that a random walk in G corresponds to a stochastic process in which we start at some starting vertex v_0 and repeat, for some number of times, a step in which, if we are at a vertex v, we move to one of its neighbors u with probability proportional to the weight w_{(v,u)} of the corresponding edge. Now, let τ(ε) be the number of steps needed for the distribution on the vertices of G induced by performing a lazy random walk² started at an arbitrary vertex to converge point-wise to within ε of the stationary distribution. It turns out that

τ(ε) ≤ log(n/(W ε)) / λ̂_2,   (2.14)

where W is a quantity depending only on the edge weights of G. In other words, the speed at which a random walk in G mixes is essentially

governed by the value of λ̂_2. Motivated by this connection, expander graphs (that
are known for having their mixing time very small) are often defined as graphs with
their spectral gap bounded away (by some constant) from zero.
Cheeger’s inequality: The second connection is to the cut structure of the graph. Namely, it turns out that there is an intimate relation between the spectral gap λ̂_2 and the conductance Φ_G of G, defined as Φ_G := min_{∅≠C⊂V} Φ(C), where the conductance Φ(C) of a cut C is given by expression (2.1) in Section 2.2. This connection is inspired by the so-called Cheeger’s inequality [38] that was discovered in the context of Riemannian manifolds; a discrete version of it (see [98, 9, 6, 55, 136, 124]) asserts that

2Φ_G ≥ λ̂_2 ≥ Φ_G²/2.
Note that since cuts of small conductance are obvious obstacles to the mixing of a random walk, this inequality tells us that such cuts are, in fact, the only obstacle. To see why this is the case, note that if there are no cuts with small conductance, then the above inequality implies that the spectral gap is large and thus, by inequality (2.14), the random walk mixes quickly.
Electrical flows and random walks: There are many connections between the properties of electrical flows with respect to resistances r and of random walks in the graph in which the weight w_e of each edge e is equal to 1/r_e. Here, we mention only one of them – we will need it later – but we refer the reader to the book [103] for a more complete exposition.

Fact 2.5.1 (see [103]). Let p_{s,t}(u) be the probability that a random walk started at vertex u will visit the vertex s before visiting the vertex t, and let φ be the vertex potentials corresponding to the electrical s-t flow of value 1. Then

p_{s,t}(u) = (φ_u − φ_t) / (φ_s − φ_t).   (2.15)
Electrical flows and random spanning trees: Yet another object to which electrical flows (and thus Laplacians) are connected is the notion of random spanning trees. Given a weighted graph G = (V, E, w), let us consider a stochastic procedure that chooses a spanning tree T of G (from the set of all its spanning trees) with probability proportional to ∏_{e∈T} w_e. The following beautiful fact ties the resulting distribution over the edges of the graph to their effective resistances – its proof can be found in Chapter 4 of [103].³

Fact 2.5.2 (see [103]). Consider a vector of resistances r in which the resistance r_{e'} of each edge e' is taken to be 1/w_{e'}. For any edge e = (u, v), the probability that this edge e is included in a random spanning tree is equal to R^{u,v}_eff(r)/r_e.
The second fundamental connection between random spanning trees and Laplacians is captured in the following fact, a weighted version of Kirchhoff’s Matrix-Tree Theorem: if λ_1 ≤ λ_2 ≤ … ≤ λ_n are the eigenvalues of L(G), then

Σ_T ∏_{e∈T} w_e = (1/n) ∏_{i=2}^{n} λ_i,

where the sum ranges over all spanning trees T of G. Note that the right-hand side term is equal to the absolute value of any cofactor of the Laplacian. Furthermore, when all the weights w_e are equal to 1, the left-hand side term simplifies to the number of spanning trees of the underlying graph.
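As a quick numerical illustration (on an example of our choosing), one can check this fact on the complete graph K_4 with unit weights, where Cayley’s formula gives n^{n−2} = 16 spanning trees:

    import numpy as np

    n = 4
    L = n * np.eye(n) - np.ones((n, n))        # Laplacian of K_4 (diag 3, off-diag -1)
    eig = np.sort(np.linalg.eigvalsh(L))       # eigenvalues: 0, 4, 4, 4
    trees_spectral = np.prod(eig[1:]) / n      # (1/n) * prod_{i>=2} lambda_i = 16
    trees_cofactor = np.linalg.det(L[1:, 1:])  # any cofactor of L(G)
    assert np.isclose(trees_spectral, 16.0) and np.isclose(trees_cofactor, 16.0)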
Sparsification: The goal of (spectral) sparsification is to find, for a given dense graph G, another graph G̃ that is sparse and whose spectral picture is approximately the same as that of G, i.e., the quadratic form (2.4) corresponding to the Laplacian L(G) of the graph G is approximately the same as the quadratic form corresponding to the Laplacian L(G̃) of the graph G̃.
³ Lyons and Peres prove this fact only in the case of T being a uniform spanning tree, i.e., when all the w_e’s are equal, but Section 4.1 of [103] contains a justification of why this proof implies the property also in the case of arbitrary w_e’s. Roughly speaking, for rational w_e’s, the main idea is to replace each edge e with Cw_e edges (for an appropriate choice of C) and consider a uniform spanning tree in the corresponding multigraph. The irrational case follows from a limit argument.
It turns out that, irrespective of the density and structure of the graph G, one can always take G̃ to be really sparse while having the spectral picture of G̃ differ only slightly from the original one. More precisely, inspired by the result of Benczúr and Karger [26], Spielman and Teng [129] – and later Spielman and Srivastava [127] – proved the following theorem.
Corollary 2.6.2 (see [26]). Given a capacitated graph G = (V, E, u) and an accuracy parameter δ > 0, there is a Monte Carlo algorithm that finds in Õ(|E|) time a (capacitated) subgraph G̃ = (V, Ẽ, ũ) of G such that:

(i) G̃ has O(δ^{−2} |V| log |V|) edges;

(ii) the capacity of every cut in G̃ is within a factor of (1 ± δ) of its capacity in G.
Low-stretch spanning trees: The notion of low-stretch spanning trees was in-
troduced for the first time by Alon, Karp, Peleg and West [8] in the context of the
k-server problem. Since then, however, these trees have turned out to be useful in many other contexts – one of which will be presented in this thesis – including the preconditioning of Laplacian matrices (see Section 2.7).
To define this notion, let us consider some spanning tree T of a weighted graph G = (V, E, w). For a given edge e, let us define the stretch stretch^w_T(e) of e (with respect to T and w) as

stretch^w_T(e) := (Σ_{e'∈path_T(e)} w_{e'}) / w_e,

where path_T(e) is the unique path in T that connects the endpoints of the edge e. Note that if we interpret the weights as lengths of the edges, then stretch^w_T(e) corresponds to the ratio of the distance between the endpoints of the edge e in T to the length of the edge e.
Now, the natural question to ask in the above context is: can we always find a tree T such that all the edges e of the graph G have small stretch with respect to it? Unfortunately, if one considers G to be an unweighted cycle, then one immediately sees that no matter which spanning tree is chosen, there will always be an edge that suffers a very high stretch of n − 1.
However, somewhat surprisingly, if one relaxes the above requirement and focuses on obtaining low average stretch, then trees that yield such small-on-average stretch always exist. In particular, the following theorem was proved by Abraham, Bartal and Neiman [1] by building on the work of Elkin, Emek, Spielman and Teng [58].⁴ (Also, see [8] for a previous result, and [62] for related work.)

Theorem 2.6.3 ([1]). There is an algorithm working in Õ(|E|) time that for any weighted graph G = (V, E, w) generates a spanning tree T of G such that the average stretch (1/|E|) Σ_{e∈E} stretch^w_T(e) of the edges of G is Õ(log |V|).
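The cycle example above is small enough to verify directly. A short sketch of ours makes the stretch computation explicit for the unweighted cycle C_4 with a path spanning tree:

    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    tree_edges = {(0, 1), (1, 2), (2, 3)}      # spanning path 0-1-2-3

    def stretch(u, v):
        # unit weights: stretch = number of tree edges on the u-v tree path,
        # which on this path tree is just |u - v|
        return abs(u - v)

    for e in cycle:
        kind = "tree edge" if e in tree_edges else "non-tree edge"
        print(e, stretch(*e), kind)            # edge (3, 0) suffers stretch 3 = n - 1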
The classical approach to solving Laplacian systems – and, more generally, linear systems – is based on so-called direct methods. These methods can be viewed as variants of Gaussian elimination and give exact solutions. Unfortunately, due to their construction, they tend to have fairly large running time. The fastest known methods require O(n^ω) running time [35, 34], where ω is the exponent of matrix multiplication and is known to be between 2 and 2.376 [46].

As the sizes of the instances we want to handle are very large, even O(n²) running time is prohibitive from our point of view. This motivates us to turn our attention to a different approach to solving Laplacian systems: one based on iterative (indirect) methods.
These methods work by finding successively better approximations of the solution and tend to be much faster than the direct methods, but they pay the price of providing only approximate answers. The most fundamental method in this family is the Conjugate Gradient method. This method considers a Laplacian system

L(G) x = b,

where b is in the range of L(G) (when the graph G is connected, this means that b is orthogonal to the all-ones vector 1) and, for any ε > 0, provides an ε-approximate solution x̃ to this system after at most O(√(κ_f(L(G))) log 1/ε) iterations. Here, κ_f(L(G)) denotes the finite condition number of the Laplacian L(G), which is equal to the ratio of its largest to smallest non-zero eigenvalue, and x̃ ε-approximates the optimal solution x = L(G)^† b by satisfying the condition

‖x̃ − x‖_{L(G)} ≤ ε ‖x‖_{L(G)},

where ‖y‖_{L(G)} := √(y^T L(G) y).
One way to improve this convergence is preconditioning: the Preconditioned Conjugate Gradient method solves the system with the help of a preconditioner matrix A, and needs a number of iterations proportional to the square root of the finite condition number κ_f(A^† L(G)) of the matrix A^† L(G), while each of these iterations requires solving a linear system in A.

Note that one can view the original Conjugate Gradient method as a special case of the preconditioned version. If A is the identity matrix I, then κ_f(I^† L(G)) = κ_f(L(G)) and solving a linear system in I is trivial. It turns out, however, that using non-trivial preconditioners allows for dramatically faster (approximate) solving of Laplacian systems.
In particular, inspired by the pioneering work of Vaidya [135], Spielman and Teng [128, 129] (see also [130]) designed a Laplacian system solver that employs the Preconditioned Conjugate Gradient method (with an appropriate choice of preconditioners) and works in nearly-linear time, i.e., it provides an ε-approximate solution in time Õ(m log 1/ε). This solver was, in turn, simplified and sped up recently by Koutis, Miller and Peng [95] to provide the following result.
Theorem 2.7.1 ([95]). For any ε > 0 and any Laplacian system L(G)x = b with b being in the range of L(G), one can obtain an ε-approximate solution x̃ to it in time O(m log² n (log log n)² log 1/ε) = Õ(m log 1/ε).
Note that the running time of the above solver is within a small poly-logarithmic factor of the obvious lower bound of Ω(m), and is comparable to the running time of such elementary algorithms as Dijkstra’s algorithm. As we will see later, such an extremely fast solver constitutes a powerful primitive that will enable us to make progress on several basic graph problems.
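To make the iterative viewpoint concrete, here is a toy Laplacian system solved with an off-the-shelf Conjugate Gradient routine (a stand-in for the far more involved nearly-linear-time solvers of [129, 95]; the graph is our own example). Since b lies in the range of L(G), the CG iterates stay in that range and converge:

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.linalg import cg

    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    n = 4
    L = np.zeros((n, n))
    for (u, v) in edges:                  # build the unit-weight Laplacian
        L[u, u] += 1.0; L[v, v] += 1.0
        L[u, v] -= 1.0; L[v, u] -= 1.0

    b = np.array([1.0, 0.0, 0.0, -1.0])   # orthogonal to the all-ones vector
    x, info = cg(csr_matrix(L), b)        # CG iterations; info == 0 means converged
    assert info == 0 and np.allclose(L @ x, b, atol=1e-4)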
The key idea behind the construction of the fast Laplacian system solver from Theorem 2.7.1 is building preconditioners for a given graph Laplacian L(G) out of Laplacians of appropriately chosen (and reweighted) subgraphs of the graph G. This idea was first proposed by Vaidya [135] and was then fully developed by Spielman and Teng in [129]. The graph-based preconditioners that were introduced by Spielman and Teng are called ultrasparsifiers.
and applying the Preconditioned Conjugate Gradient method with the corresponding k-ultrasparsifier G̃ of G, one essentially reduces the task of (ε-approximately) solving the Laplacian system corresponding to L(G) to solving a small number of Laplacian systems in the matrix L(G̃). If the choice of k was large enough, G̃ will be a graph with only n − 1 + o(n) edges, so a significant fraction of its vertices have degree 1 or 2. These vertices can be eliminated (via Cholesky factorization) to make the size of the Laplacian system one needs to solve significantly smaller. By solving this resulting system recursively, one obtains the desired nearly-linear-time Laplacian system solver (see [129] for details).
(Note that, a priori, we do not bound the number of λ_f’s – in our applications, however, only a small number of them will be non-zero.)

To make the edge-constraints of the above system more homogeneous, let us define, for a given flow f and an edge e, the congestion cong_f(e) of e (with respect to f) to be |f(e)|/u_e. Then, we can rewrite (2.17) in the following form:

∀ e ∈ E   Σ_{f∈F} λ_f cong_f(e) ≤ 1
          Σ_{f∈F} λ_f = 1                    (2.18)
∀ f ∈ F   λ_f ≥ 0.
To illustrate why having a solution to the above system might be useful, take F to be the set of all s-t flows in G of some prescribed value F. Now, if {λ*_f}_f were a solution to this system, then we could associate with it a convex combination of flows Σ_f λ*_f f. Note that this convex combination corresponds naturally to a valid s-t flow f̄ with f̄(e) = Σ_f λ*_f f(e) for any e ∈ E. This flow would have a value of F and, by definition, would be feasible, i.e., respect all the capacities of the graph G. So, it would be a witness showing that the maximum value F* attainable by a feasible s-t flow in G is at least F. On the other hand, if the system of constraints had no solution, this would mean that F > F* (otherwise the solution corresponding to just taking the maximum s-t flow f* would satisfy the constraints). We can therefore see that the ability to solve such (F, G)-systems (or to decide that they are infeasible), for various values of F, together with binary search, would allow us to solve the maximum s-t flow problem.
Unfortunately, in practice, it is hard to obtain fast methods for solving such systems exactly. Therefore, in this thesis, we will settle for approximate solutions. Namely, for a given γ ≥ 1, let us define the γ-relaxed version of an (F, G)-system to be

∀ e ∈ E   Σ_{f∈F} λ_f cong_f(e) ≤ γ
          Σ_{f∈F} λ_f = 1                    (2.19)
∀ f ∈ F   λ_f ≥ 0.
Now, our goal will be to solve such (F, G)-systems approximately. Formally, we would like to obtain algorithms of the following type.

Definition 2.8.1. For a given set of flows F, graph G, and γ ≥ 1, we define a γ-approximate solver for the (F, G)-system to be a procedure that:

• if the system is feasible (i.e., all the constraints of (2.18) can be satisfied), finds a feasible solution to its γ-relaxed version (cf. (2.19));

• on the other hand, if the system is infeasible, either finds a feasible solution to the γ-relaxed system (2.19) as above, or returns “fail” to indicate the infeasibility of the system.

To put this definition in context, note that having such a γ-approximate solver for the above-mentioned case of s-t flows would enable us to solve the maximum s-t flow problem γ-approximately.
More precisely, instead of solving the (F, G)-system directly, this oracle is solving
the following, much simpler, task. It is given a set of weights {we }e – one weight
for each edge-constraint – and its task is to output a flow f ∈ F that respects the
edge-constraints only on weighted average, where weighting in this average is given
by the weights {we }e .
To formalize this notion, we introduce the following definition.
Definition 2.8.2. Given an α ≥ 1 and some (F, G)-system, by an α-oracle for this system we mean an algorithm that, given a weight vector w, returns either a flow f ∈ F with

Σ_e w_e cong_f(e) ≤ α |w|_1,

or “fail”, indicating that the (F, G)-system is infeasible.

Remark: An important detail of the above definition is that sometimes the oracle might return the desired flow f even though the (F, G)-system is infeasible. To understand why this might be the case, one should observe that our requirements on f are weaker than the requirements on a feasible solution to the (F, G)-system.
Note that if f* ∈ F is a feasible solution to the (F, G)-system, then we must have Σ_e w_e cong_{f*}(e) ≤ |w|_1. So, whenever the system is feasible, there exists at least one flow that satisfies the condition required of the oracle.
Now, the way we will use an α-oracle to (approximately) solve the corresponding (F, G)-system is by employing the routine presented in Figure 2-1. This routine takes an α-oracle O and an accuracy parameter δ > 0 as input, and presents this oracle with a sequence of queries corresponding to different weight vectors. Finally, it either combines the obtained answers to output a γ-approximate solution to the (F, G)-system, where γ = α(1 + 3δ), or it returns “fail” to indicate that the system is not feasible.
In a bit more detail, such a γ-approximate solution is built by maintaining a weight for each edge-constraint – initially, all the weights are set to 1. In each iteration i, the oracle O is called with these weights. If this call returns a flow f^i ∈ F, the weight of each edge e is multiplied by a factor of (1 + δNλ^i cong_{f^i}(e)), where λ^i is chosen so that Nλ^i is roughly the inverse of the largest congestion ρ^i exerted by f^i in G. This iterative process finishes once the sum of all the λ^i coefficients becomes one.

The key fact underlying the above process is that if in some iteration i the congestion of some edge is poor, say close to ρ^i, then the weight of the corresponding edge-constraint will increase by a factor of roughly (1 + δ). On the other hand, if the congestion of an edge is good, e.g., the flow on that edge is no more than its capacity, then the weight of the corresponding edge-constraint is essentially unchanged. In this way, we make sure that a larger fraction of the total weight is being put on the
violated constraints, so the oracle is being forced to return solutions that come closer to satisfying these constraints (possibly at the expense of some other, previously only slightly violated, constraints).

Figure 2-1: The multiplicative-weights-update routine.

Input: An accuracy parameter 1/2 > δ > 0, a capacitated graph G = (V, E, u), the corresponding (F, G)-system, and an α-oracle O for it
Output: Either a convex combination {λ_f}_{f∈F} of flows from F, or “fail”, indicating that the (F, G)-system is infeasible

Initialize w_e^0 ← 1 for all edges e, i ← 1, and N ← (2 ln m)/δ²
while Σ_{i'=1}^{i−1} λ^{i'} < 1 do
    Query the oracle O with the edge weights given by w^{i−1}
    if O returns “fail” then return “fail”
    else
        Let f^i be the returned flow and let ρ^i ← max_{e∈E} cong_{f^i}(e) be the maximum congestion f^i exerts in G
        Set λ^i ← min{1/(Nρ^i), 1 − Σ_{i'=1}^{i−1} λ^{i'}}
        Set w_e^i ← w_e^{i−1}(1 + δNλ^i cong_{f^i}(e)) for each e ∈ E
        Increment i
    end
end
Let λ_f ← Σ_{i: f^i = f} λ^i, for each f ∈ F
return {λ_f}_f
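For concreteness, the routine of Figure 2-1 translates almost line-for-line into Python. In the sketch below (the data representation is ours), the oracle is an arbitrary callback that returns a flow as a dict over edges, or None in place of “fail”:

    import math

    def mwu_solver(edges, capacity, oracle, delta):
        # Returns a list of (flow, lambda) pairs, or None if the oracle fails.
        N = 2.0 * math.log(len(edges)) / delta ** 2
        w = {e: 1.0 for e in edges}            # w_e^0 = 1 for every edge
        total_lam, combination = 0.0, []
        while total_lam < 1.0:
            f = oracle(w)                      # query the alpha-oracle
            if f is None:                      # oracle reported infeasibility
                return None
            cong = {e: abs(f[e]) / capacity[e] for e in edges}
            rho = max(cong.values())           # maximum congestion of f in G
            lam = min(1.0 / (N * rho), 1.0 - total_lam)
            for e in edges:                    # multiplicative weight update
                w[e] *= 1.0 + delta * N * lam * cong[e]
            combination.append((f, lam))
            total_lam += lam
        return combination                     # the convex combination {lambda_f}_f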
Now, the dynamics that emerges from this process is that, on the one hand, the total weight |w|_1 does not grow too quickly, due to the average (weighted) congestion constraint on the flows returned by the oracle O (cf. Definition 2.8.2). On the other hand, if a particular edge e consistently suffers large congestion in a sequence of flows returned by O, then the weight of the corresponding edge-constraint increases rapidly relative to the total weight, which significantly penalizes any further congestion of this edge in the subsequent flows. So, as in the end we return an appropriately weighted convex combination of all the obtained flows, this ensures that no edge-constraint in this combination is violated by more than a factor of γ.

We proceed to making the above intuition precise by proving that our routine will indeed return the desired γ-approximate solution to the (F, G)-system. After that, in Section 2.8.3, we will analyze how, by looking at some additional properties of the oracle, one can bound the number of oracle calls this routine makes before termination – this will be critical to bounding the total running time of the resulting algorithms.
2.8.2 Correctness of the Multiplicative-Weights-Update Rou-
tine
In this section, we prove the correctness of the routine from Figure 2-1 by establishing
the following theorem.
Theorem 2.8.3. For any 1/2 > δ > 0 and α ≥ 1, if all the oracle calls in the routine from Figure 2-1 terminate without returning “fail”, then the convex combination {λ_f}_f returned is a γ-approximate solution to the corresponding (F, G)-system, with γ = α(1 + 3δ).
Our proof of the above theorem will be based on the potential function µ^i := |w^i|_1. Clearly, µ^0 = m, and this potential only increases during the course of the algorithm. Now, the crucial fact we will be using is that, by Definition 2.8.2, if the oracle O did not return “fail” then

Σ_e w_e^{i−1} cong_{f^i}(e) ≤ α |w^{i−1}|_1,   (2.20)

for all i ≥ 1.
We start the proof by upper bounding the total growth of µ^i throughout the algorithm.

Lemma 2.8.4. For any i ≥ 0,

µ^{i+1} ≤ µ^i exp(αδNλ^{i+1}).

In particular, after the final, T-th iteration, the total weight |w^T|_1 = µ^T is at most m exp(αδN Σ_{i=1}^{T} λ^i) = m exp(αδN) = n^{O(α/δ)}.

Proof. We have

µ^{i+1} = Σ_e w_e^{i+1} = Σ_e w_e^i (1 + δNλ^{i+1} cong_{f^{i+1}}(e))
        = Σ_e w_e^i + δNλ^{i+1} Σ_e w_e^i cong_{f^{i+1}}(e)
        ≤ µ^i + αδNλ^{i+1} |w^i|_1,

where the last inequality follows from (2.20). Thus, we can conclude that

µ^{i+1} ≤ µ^i + αδNλ^{i+1} |w^i|_1 = µ^i (1 + αδNλ^{i+1}) ≤ µ^i exp(αδNλ^{i+1}),

as desired. The lemma follows.

One of the consequences of the above lemma is that whenever we make a call to the oracle, the total weight |w^i|_1 is at most n^{O(α/δ)}. Thus, we have an absolute bound on the value of the weights the oracle is dealing with.
Next, we tie the final weight w_e^T of a particular edge e to the congestion

Σ_{f∈F} λ_f cong_f(e)

that this edge suffers with respect to the convex combination output by the algorithm.

Lemma 2.8.5. For any edge e and any i ≥ 0,

w_e^i ≥ exp((1 − δ)δN Σ_{i'=1}^{i} λ^{i'} cong_{f^{i'}}(e)).

Proof. By the weight-update rule of the routine,

w_e^i = ∏_{i'=1}^{i} (1 + δNλ^{i'} cong_{f^{i'}}(e)) ≥ ∏_{i'=1}^{i} exp((1 − δ)δNλ^{i'} cong_{f^{i'}}(e)),

where we used the fact that, by definition, Nλ^{i'} cong_{f^{i'}}(e) ≤ 1 and that for any 1/2 > δ > 0 and x ∈ [0, 1],

exp((1 − δ)δx) ≤ (1 + δx).

Now, the lemma follows since

∏_{i'=1}^{i} exp((1 − δ)δNλ^{i'} cong_{f^{i'}}(e)) = exp((1 − δ)δN Σ_{i'=1}^{i} λ^{i'} cong_{f^{i'}}(e)),

as desired.

Finally, by Lemmas 2.8.4 and 2.8.5, we conclude that for any edge e,

m exp(αδN) ≥ µ^T = |w^T|_1 ≥ w_e^T ≥ exp((1 − δ)δN Σ_{f∈F} λ_f cong_f(e)).

Taking logarithms of both sides, dividing by (1 − δ)δN, and recalling that N = (2 ln m)/δ², we obtain

Σ_{f∈F} λ_f cong_f(e) ≤ (ln m + αδN) / ((1 − δ)δN) = (δ/2 + α)/(1 − δ) ≤ α(1 + 3δ)

for every edge e, as 1/2 > δ > 0 and α ≥ 1. Theorem 2.8.3 follows.
2.8.3 Convergence of the Multiplicative-Weights-Update Routine

Note that even though we proved above that our multiplicative-weights-update routine from Figure 2-1 will always return the desired solution, it is unclear how many iterations this routine will make before termination. We now proceed to relating this number of iterations to some properties of the oracle that is employed.
Width-based bound: Probably the simplest and most broadly applied way of bounding the number of iterations is based on the concept of the width of an oracle.

Definition 2.8.6. For a given oracle O for some (F, G)-system, we say that O has width ρ if, for any possible weights w, the flow returned by O in response to them exerts a congestion of at most ρ in G.

Clearly, the width of an oracle provides us with a simple way of bounding the number of queries to it, as explained in the following lemma.

Lemma 2.8.7. If the oracle O used by the routine from Figure 2-1 has width ρ, then at most O(ρδ^{−2} log m) iterations are made.

Proof. By Definition 2.8.6, we must have ρ^i ≤ ρ for all i, and thus each λ^i (except possibly λ^T) has to be at least 1/(Nρ^i) ≥ 1/(Nρ). This means that Σ_{i'=1}^{i} λ^{i'} reaches 1 after at most Nρ + 1 iterations. The lemma follows.
Tightness-based bound: Even though the use of width is a very simple way of bounding the running time of the multiplicative-weights-update routine, sometimes one can obtain a better bound. This is possible because the width-based bound is, in some sense, pessimistic, i.e., it does not give a good handle on situations in which only a few of the calls made in the multiplicative-weights-update routine result in flows with ρ^i close to the width ρ.

To get a different type of bound that avoids this problem, let us define ρ_G(f) to be the maximum congestion max_{e∈E} cong_f(e) that the flow f exerts in G. (Note that if f is returned by an oracle with width ρ, then ρ_G(f) has to be at most ρ.) Also, let κ_G(f) := {e ∈ E | cong_f(e) ≥ ρ_G(f)/2} be the set of edges of G whose congestion is within a factor of two of the maximum congestion exerted by f in G.

Definition 2.8.8. For a given oracle O for some (F, G)-system, we say that O has tightness k if, for any possible weights w, the flow f returned by O in response to them is such that |κ_G(f)| ≥ k.
Observe that, trivially, every oracle has tightness at least one.

Now, we prove the following lemma, which tells us that the number of oracle calls in our multiplicative-weights-update routine is inversely proportional to the tightness of the oracle.

Lemma 2.8.9. If the α-oracle O used by the routine from Figure 2-1 has tightness k, then at most 2α(1 + 3δ)Nm/k = O(αmδ^{−2} log m / k) iterations are made.
Proof. Consider the following potential function:

φ^i := Σ_{e∈E} Σ_{i'=1}^{i} λ^{i'} cong_{f^{i'}}(e).

Note that φ^0 = 0, φ^{i+1} ≥ φ^i for each i, and, by Theorem 2.8.3, we know that

Σ_{i'=1}^{T} λ^{i'} cong_{f^{i'}}(e) = Σ_{f∈F} λ_f cong_f(e) ≤ γ = α(1 + 3δ),

for each edge e. Therefore, the final value φ^T of this potential is at most α(1 + 3δ)m.

Now, observe that in each iteration i of the multiplicative-weights-update routine from Figure 2-1, we have that for every e ∈ κ_G(f^i),

λ^i cong_{f^i}(e) ≥ 1/(2N).

As a result, we see that in each iteration the potential φ increases by at least k/(2N). This implies that the total number of iterations is at most 2α(1 + 3δ)mN/k, and the lemma follows.
Note that the bounds offered by Lemma 2.8.7 and Lemma 2.8.9 are, in general, incomparable. Therefore, in this thesis, we will use one or the other depending on the characteristics of the flow packing problem we are dealing with.
Part I
Chapter 3

The Maximum s-t Flow and Minimum s-t Cut Problems
In this chapter, we consider the maximum s-t flow problem and the dual problem of
finding a minimum s-t cut.1 We propose a new approach to solving these problems
approximately in a capacitated, undirected graph. Our approach is based on solving
a sequence of electrical flow problems. Each electrical flow is given by a solution to
a Laplacian system (cf. Section 2.4) and thus may be approximately computed in
nearly-linear time (cf. Section 2.7).
Using this approach, we develop the fastest known algorithm for computing approximately maximum s-t flows. For a graph having n vertices and m edges, our algorithm computes a (1 − ε)-approximately maximum s-t flow in time² Õ(mn^{1/3} ε^{−11/3}). A dual version of our approach gives the fastest known algorithm for computing a (1 + ε)-approximately minimum s-t cut. Its running time is Õ(m + n^{4/3} ε^{−16/3}).
3.1 Introduction
The maximum s-t flow problem and its dual, the minimum s-t cut problem, are two
of the most fundamental and extensively studied problems in Operations Research
and Optimization [120, 2]. They have many applications (see [3]) and are often used
as subroutines in other algorithms (see e.g. [14, 122]). Also, the celebrated Max-Flow
Min-Cut theorem [67, 57] that establishes equality of the value of the maximum s-t
flow and capacity of the minimum s-t cut, is a prototypical example of duality relation
in optimization. Many advances have been made in the development of algorithms
for this problem (see Goldberg and Rao [73] for an overview). However, for the basic
problem of computing or (1−ε)-approximating the maximum flow in undirected, unit-
capacity graphs with m = O(n) edges, the asymptotically fastest known algorithm is
the one developed in 1975 by Even and Tarjan [61], which takes time O(n^{3/2}). Despite
35 years of extensive work on the problem, this bound has not been improved.
¹ This chapter is based on joint work with Paul Christiano, Jonathan Kelner, Daniel Spielman, and Shang-Hua Teng and contains material from [40].
² Recall that Õ(g(m)) denotes O(g(m) log^c g(m)) for some constant c.
In this chapter, we introduce a new approach to computing approximately max-
imum s-t flows and minimum s-t cuts in undirected, capacitated graphs. Using it,
we present the first algorithms that break the O(n^{3/2}) complexity barrier described
above. In addition to being the fastest known algorithms for these problems, they are
simple to describe and introduce techniques that may be applicable to other tasks.
One of the key steps of these algorithms is reducing the problem of computing max-
imum flows subject to capacity constraints to the problem of computing electrical
flows in resistor networks. An approximate solution to each electrical flow problem
can be found in time O(m)
e using recently developed algorithms for solving systems
of linear equations in Laplacian matrices [95, 130] that are discussed in Section 2.7.
There is a simple physical intuition that underlies our approach, which we describe
here in the case of a graph with unit edge capacities. We begin by thinking of each
edge of the input graph as a resistor with resistance one, and we compute the electrical
flow that results when we send current from the source s to the sink t. These currents
obey the flow conservation constraints, but they are ignorant of capacities of the
edges. To remedy this, we increase the resistance of each edge in proportion to the
amount of current flowing through it—thereby penalizing edges that violate their
capacities—and compute the electrical flow with these new resistances.
After repeating this operation Õ(m^{1/3} · poly(1/ε)) times, we will be able to obtain a (1 − ε)-approximately maximum s-t flow by taking a certain average of the electrical flows that we have computed, and we will be able to extract a (1 + ε)-approximately minimum s-t cut from the vertex potentials³. This will give us algorithms for both problems that run in time Õ(m^{4/3} · poly(1/ε)). By combining this with the graph smoothing and sampling techniques of Karger [86], we can get a (1 − ε)-approximately maximum s-t flow in time Õ(mn^{1/3} ε^{−11/3}). Furthermore, by applying the cut algorithm to a sparsifier [26] of the input graph, we can compute a (1 + ε)-approximately minimum s-t cut in time Õ(m + n^{4/3} ε^{−16/3}).
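The following sketch (ours) captures the physical intuition above in caricature form: dense linear algebra stands in for the fast Laplacian solvers, and a fixed iteration count replaces the actual multiplicative-weights bookkeeping of Chapter 2:

    import numpy as np

    def electrical_flow(B, r, chi):
        C = np.diag(1.0 / r)                      # conductances
        phi = np.linalg.pinv(B @ C @ B.T) @ chi   # potentials via the Laplacian
        return C @ (B.T @ phi)

    def approx_max_flow(B, cap, chi, value, iters=50, eps=0.1):
        r = np.ones(B.shape[1])                # start with unit resistances
        flows = []
        for _ in range(iters):
            f = electrical_flow(B, r, value * chi)
            cong = np.abs(f) / cap             # congestion of each edge
            r *= 1.0 + eps * cong              # penalize overloaded edges
            flows.append(f)
        return np.mean(flows, axis=0)          # average of the electrical flows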
We remark that the results presented in this chapter immediately improve the
running time of algorithms that use the computation of an approximately maximum
s-t flow on an undirected, capacitated graph as a subroutine. For example, combining
our work with that of Sherman [122] allows us to achieve the best currently known asymptotic approximation ratio of O(√(log n)) for the sparsest cut problem in time O(m + n^{4/3+δ}) for any constant δ > 0.
assuming that the edge capacities are integers between 1 and U. In undirected graphs, one can find faster approximation algorithms for these problems by using the graph sparsification techniques of Benczúr and Karger [26, 27]. Goldberg and Rao [73] observe that by running their exact maximum flow algorithm on the sparsifiers of Benczúr and Karger [26], one can obtain a (1 + ε)-approximately minimum cut in time Õ(m + n^{3/2} ε^{−3}). This provides an approximation of the value of the maximum flow. To actually obtain a feasible flow whose value approximates the maximum, one can apply the algorithm of Goldberg and Rao in the divide-and-conquer framework of Benczúr and Karger [27]. This provides a (1 − ε)-approximately maximum flow in time Õ(m√n ε^{−1}). We refer the reader to the paper of Goldberg and Rao for a survey of the previous work on algorithms for computing maximum s-t flows.
In more recent work, Daitch and Spielman [51] showed that fast solvers for Lapla-
cian linear systems [130, 95] could be used to make interior-point algorithms for the
maximum flow and minimum-cost flow problems run in time Õ(m^{3/2} log U), and, as
we will see in Chapter 5, one can approximate a wide range of cut problems, including
the minimum s-t cut problem, within a polylogarithmic factor in almost linear time.
a (1+ε)-approximately minimum s-t cut if its capacity is within a (1+ε) factor of the capacity of the minimum cut C*.
To simplify the exposition, we will take ε to be a constant independent of m throughout this chapter, and m will be assumed to be larger than some fixed constant. However, our analysis will go through unchanged as long as ε = Ω̃(m^{−1/3}). In particular, our analysis will apply to all ε for which our given bounds are faster than the O(m^{3/2}) time required by existing exact algorithms.
One can easily reduce the problem of finding a (1 − ε)-approximation to the maxi-
mum flow in an arbitrary undirected graph to that of finding a (1−ε/2)-approximation
in a graph in which the ratio of the largest to smallest capacities is polynomially
bounded. To do this, one should first compute a crude approximation of the max-
imum flow in the original graph. For example, one can compute the s-t path of maximum bottleneck in time O(m + n log n) [120, Section 8.6], where we recall that
the bottleneck of a path is the minimum capacity of an edge on that path. If this
maximum bottleneck of an s-t path is B, then the maximum flow lies between B and
mB. This means that there is a maximum flow in which each edge flows at most
mB, so all capacities can be decreased to be at most mB. On the other hand, if one
removes all edges with capacities less than εB/2m, the maximum flow can decrease
by at most εB/2. So, we can assume that the minimum capacity is at least εB/2m
and the maximum is at most Bm, for a ratio of 2m2 /ε. Thus, by a simple scaling,
we can assume that all edge capacities are integers between 1 and 2m2 /ε.
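The maximum-bottleneck path mentioned above can be found by a widest-path variant of Dijkstra's algorithm; the following sketch (an assumption of this write-up, not code from [120]) computes the bottleneck value B, after which the capacity clamping described above can be applied.

import heapq

def max_bottleneck(n, adj, s, t):
    """Widest-path Dijkstra: adj[u] = list of (v, capacity) pairs, with each
    undirected edge listed in both directions. Returns the largest B such
    that some s-t path uses only edges of capacity >= B."""
    best = [0.0] * n
    best[s] = float('inf')
    heap = [(-best[s], s)]
    while heap:
        b, u = heapq.heappop(heap)
        b = -b
        if b < best[u]:
            continue
        for v, cap in adj[u]:
            w = min(b, cap)              # bottleneck along this extension
            if w > best[v]:
                best[v] = w
                heapq.heappush(heap, (-w, v))
    return best[t]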
design a (1 + ε)-approximate solver for the corresponding (F, G)-system. This will
be done by employing the multiplicative-weights-update routine from Figure 2-1 with
accuracy parameter δ equal to ε. To this end, we will design a simple (1+ε)-oracle for the (F, G)-system that has a width of 3√(m/ε) and runs in Õ(m log ε^{−1}) time. By our discussion above, Theorem 2.8.3 and Lemma 2.8.7, providing such an oracle will give the desired (1 − O(ε))-approximation algorithm for the maximum s-t flow problem running in time Õ(m^{3/2}ε^{−5/2}).
Theorem 3.3.1 (Fast Approximation of Electrical Flows). For any δ > 0, any F > 0, and any vector r of resistances in which the ratio of the largest to the smallest resistance is at most R, we can compute, in time Õ(m log R/δ), a vector of vertex potentials φ̃ and an s-t flow f̃ of value F such that:

a. E_r(f̃) ≤ (1 + δ) E_r(f), where f is the electrical s-t flow of value F;

b. for every edge e, |r_e f̃(e)² − r_e f(e)²| ≤ (δ/(2mR)) E_r(f);

c. φ̃_s − φ̃_t ≥ (1 − δ/(12nmR)) F R_eff^{s,t}(r).
Given the current weights w, we set

r_e := (1/u_e²) (w_e + ε|w|_1/(3m))    (3.1)

for each edge e, and we use the procedure from Theorem 3.3.1 to approximate the electrical flow that sends F units of flow from s to t in a network whose resistances are given by the r_e. The pseudocode for this oracle is shown in Figure 3-1.
It is easy to see that f̃ is an s-t flow of value F and thus f̃ ∈ F. So, to establish the desired properties of our oracle, we need to show that the resulting flow f̃ also meets the conditions required by Definition 2.8.2 (with α = (1+ε)) and Definition 2.8.6. We will do this by using the basic fact that, by their definition, electrical flows
Input: A graph G = (V, E) with capacities {u_e}_e, a target flow value F, and edge weights {w_e}_e
Output: Either a flow f̃, or “fail” indicating that F > F*

r_e ← (1/u_e²)(w_e + ε|w|_1/(3m)) for each e ∈ E
Find an approximate electrical flow f̃ using Theorem 3.3.1 on G with resistances r, target flow value F, and parameter δ = ε/3
if E_r(f̃) > (1 + ε)|w|_1 then return “fail”
return f̃

Figure 3-1: A simple implementation of a (1 + ε)-oracle for the maximum s-t flow problem.
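A direct transcription of Figure 3-1 in Python might look as follows; it reuses the electrical_flow() sketch given earlier in place of the solver of Theorem 3.3.1, so the accuracy of the resulting flow is only indicative.

def flow_oracle(edges, capacities, weights, s, t, F, n, eps):
    """One oracle call: build the resistances of equation (3.1) from the
    current weights, compute an (approximate) electrical flow of value F,
    and report "fail" (None) if its energy exceeds (1 + eps)*|w|_1."""
    m = len(edges)
    w1 = sum(weights)
    resistances = [(w + eps * w1 / (3.0 * m)) / (c * c)
                   for w, c in zip(weights, capacities)]
    flow = electrical_flow(edges, resistances, s, t, F, n)
    energy = sum(r * f * f for r, f in zip(resistances, flow))
    if energy > (1.0 + eps) * w1:
        return None                      # "fail": F exceeds the max flow value
    return flow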
minimize the energy of the flow. Our analysis will then be based on comparing the energy of f̃ with that of an optimal max flow f*. Intuitively, it will turn out that the w_e term in equation (3.1) guarantees the bound on the average congestion from Definition 2.8.2, while the ε|w|_1/(3m) term guarantees the bound on the width, i.e., on the maximum congestion ρ_G(f̃) exerted by f̃.
To prove this formally, suppose f* is a maximum flow. By its feasibility, we have cong_{f*}(e) ≤ 1 for all e, so

E_r(f*) = Σ_e (w_e + ε|w|_1/(3m)) (f*(e)/u_e)²
        = Σ_e (w_e + ε|w|_1/(3m)) cong_{f*}(e)²
        ≤ Σ_e w_e + ε|w|_1/3
        = (1 + ε/3) |w|_1.
Since the electrical flow minimizes the energy, Er (f ∗ ) is an upper bound on the
energy of the electrical flow of value F whenever F ≤ F ∗ . In this case, Theorem 3.3.1
implies that our (ε/3)-approximate electrical flow satisfies
E_r(f̃) ≤ (1 + ε/3) E_r(f*) ≤ (1 + ε/3)² |w|_1 ≤ (1 + ε) |w|_1.    (3.2)
This shows that our oracle will never output “fail” when F ≤ F*. It thus suffices to show that the energy bound E_r(f̃) ≤ (1 + ε)|w|_1, which holds whenever the algorithm does not return “fail”, implies the required bounds on the average and worst-case congestion. To see this, we note that the energy bound implies that

Σ_e w_e cong_f̃(e)² ≤ (1 + ε) |w|_1,    (3.3)
and, for all e ∈ E,

(ε|w|_1/(3m)) cong_f̃(e)² ≤ (1 + ε) |w|_1.    (3.4)
By the Cauchy-Schwarz inequality,

(Σ_e w_e cong_f̃(e))² ≤ |w|_1 (Σ_e w_e cong_f̃(e)²),    (3.5)

which, combined with equation (3.3), yields the required bound on the average congestion. Furthermore, equation (3.4) and the fact that ε < 1/2 imply that

cong_f̃(e) ≤ √(3m(1 + ε)/ε) ≤ 3√(m/ε)

for all e, which establishes the required bound on the maximum congestion. So our algorithm indeed implements a (1 + ε)-oracle with width 3√(m/ε), as desired.
To bound the running time of this oracle, recall that we can assume that all edge capacities lie between 1 and U = m²/ε, and obtain

R = max_{e,e'} r_e/r_{e'} ≤ U² (max_e w_e + (ε/(3m))|w|_1) / ((ε/(3m))|w|_1) ≤ U² (3m + ε)/ε = O((m/ε)^{O(1)}).    (3.7)
This establishes an upper bound on the ratio of the largest resistance to the
smallest resistance. Thus, by Theorem 3.3.1, the running time of this implementation
is Õ(m log R/ε) = Õ(m log 1/ε). Combining this with Theorem 2.8.3 and Lemma 2.8.7, we get
Theorem 3.3.2. For any 0 < ε < 1/2, the maximum flow problem can be (1 − ε)-approximated in Õ(m^{3/2}ε^{−5/2}) time.
F²/m ≤ E_r(f) ≤ F² R m.    (3.8)
Let L be the Laplacian matrix for this electrical flow problem, i.e., one in which
weights of edges are equal to the inverses of their resistances. Note that all off-
diagonal entries of L lie between −1 and −1/R. Recall that the vector of optimal
vertex potentials φ of the electrical flow is the solution to the following linear system:
Lφ = F χs,t
Using Theorem 2.7.1, for any ε > 0, we can compute in time Õ(m log(1/ε)) a set of potentials φ̂ (cf. (2.16)) such that

‖φ̂ − φ‖_L ≤ ε ‖φ‖_L,

where

‖φ‖_L := √(φᵀ L φ).

Given these approximate potentials, we define the flow

f̂ = C Bᵀ φ̂.
Before fixing this problem, we first observe that the energy of f̂ is not much more than the energy of f. We have

‖φ̂‖_L² = E_r(f̂)   and   ‖φ̂‖_L ≤ ‖φ‖_L + ‖φ̂ − φ‖_L ≤ (1 + ε) ‖φ‖_L,

so

E_r(f̂) ≤ (1 + ε)² E_r(f).
To see that the flow fˆ does not flow too much into or out of any vertex other than
s and t, note that the flows into and out of vertices are given by the vector

i_ext := B f̂.

Observe that the sum of the entries in i_ext is zero. Furthermore, we will produce an s-t flow of value F by adding a flow to f̂ to obtain a flow f̃ for which

B f̃ = F χ_{s,t}.

If we let η := ‖i_ext − F χ_{s,t}‖_∞, then we have

η ≤ ‖i_ext − F χ_{s,t}‖_2 = ‖Lφ̂ − Lφ‖_2 ≤ ‖L‖_2 ‖φ̂ − φ‖_2 ≤ 2n ‖φ̂ − φ‖_L ≤ 2nε √(E_r(f)).
It is an easy matter to produce an s-t flow f̃ that differs from the flow represented by f̂ by at most nη on each edge. For example, one can do this by solving a flow problem in a spanning tree of the graph. Let T be any spanning tree of the graph G. We now construct a flow in T with the demands F χ_{s,t}(u) − i_ext(u) at each vertex u. As the sum of the positive demands in this flow is at most nη, one can find such a flow in which every edge carries at most nη units. Moreover, since T is a tree, one can find the flow in linear time. We obtain the flow f̃ by adding this flow to the flow f̂. The resulting flow is an s-t flow of value F. Moreover,

‖f̂ − f̃‖_∞ ≤ nη.
To ensure that we get a good approximation of the electrical flow, we now impose the requirement that

ε ≤ δ/(2 n² m^{1/2} R^{3/2}).
To check that requirement a is satisfied, observe that

E_r(f̃) = Σ_e r_e f̃_e²
       ≤ Σ_e r_e (f̂_e + nη)²
       ≤ E_r(f̂) + 2nη Σ_e r_e f̂_e + n²η² Σ_e r_e
       ≤ E_r(f̂) + (2nηF + n²η²) Σ_e r_e.
We may ensure that this is at most (1 + δ) E_r(f) by setting

ε = δ/(12 n⁴ m R^{3/2}),
and so

Σ_e r_e (f_e − f̂_e)² = ‖φ − φ̂‖_L² ≤ ε² E_r(f).

Hence, by the Cauchy-Schwarz inequality, for each edge e,

|r_e f_e² − r_e f̂_e²| = r_e |f_e − f̂_e| · |f_e + f̂_e| ≤ √(ε² E_r(f)) · √((2 + ε)² E_r(f)),

which implies

|r_e f_e² − r_e f̂_e²| ≤ ε(2 + ε) E_r(f).
Recall that

Lφ = F χ_{s,t}   and that   E_r(f) = F² R_eff(r).
So,

ε² ‖φ‖_L² ≥ ‖φ − φ̃‖_L²
          = (φ − φ̃)ᵀ L (φ − φ̃)
          = ‖φ̃‖_L² + ‖φ‖_L² − 2 φ̃ᵀ L φ
          = ‖φ‖_L² + ‖φ̃‖_L² − 2F χ_{s,t}ᵀ φ̃.

Thus,

2F χ_{s,t}ᵀ φ̃ ≥ ‖φ‖_L² + ‖φ̃‖_L² − ε² ‖φ‖_L²
             ≥ (1 + (1 − ε)² − ε²) ‖φ‖_L²
             = (2 − 2ε) ‖φ‖_L²
             = (2 − 2ε) F² R_eff(r).

By dividing both sides of the obtained inequality by 2F and noting that χ_{s,t}ᵀ φ̃ = φ̃_s − φ̃_t, we obtain part c.
3.4 An Õ(mn^{1/3}ε^{−11/3}) Algorithm for Approximate Maximum Flow
In this section, we modify our algorithm to run in time Õ(m^{4/3}ε^{−3}). We then combine this with the smoothing and sampling techniques of Karger [86] to obtain an Õ(mn^{1/3}ε^{−11/3})-time algorithm.

For fixed ε, the algorithm in the previous section required us to compute Õ(m^{1/2}) electrical flows, each of which took time Õ(m), which, in turn, led to a running time of Õ(m^{3/2}). To reduce this running time to Õ(m^{4/3}), we'll show how to find an approximately maximum s-t flow while computing only Õ(m^{1/3}) electrical flows.
Before we proceed to doing this, let us recall that our analysis of the oracle from
Section 3.3.1 was fairly simplistic, and therefore one might hope to improve the run-
ning time of the algorithm by proving a tighter bound on the width. Unfortunately,
the graph in Figure 3-2 shows that our analysis was essentially tight. The graph
consists of k parallel paths of length k connecting s to t, along with a single edge e
that directly connects s to t. The max flow in this graph is k + 1. In the first call
made to the oracle by the multiplicative weights routine, all of the edges will have the
same resistance. In this case, the electrical flow of value k + 1 will send (k + 1)/2k
units of flow along each of the k paths and (k + 1)/2 units of flow across e. Since the
graph has m = Θ(k²) edges, the width of the oracle in this case is Θ(m^{1/2}).
[Figure: k parallel paths of length k from s to t, together with a single edge e connecting s directly to t.]

Figure 3-2: A graph on which the electrical flow sends approximately √m units of flow across an edge when sending the maximum flow F* from s to t.
Input: A graph G = (V, E) with capacities {u_e}_e, a target flow value F, edge weights {w_e}_e, and a set H of forbidden edges
Output: Either a flow f̃ and a set H of forbidden edges, or “fail” indicating that F > F*

ρ ← 8 m^{1/3} ln^{1/3} m / ε
r_e ← (1/u_e²)(w_e + ε|w|_1/(3m)) for each e ∈ E \ H
Find an approximate electrical flow f̃ using Theorem 3.3.1 on G_H := (V, E \ H) with resistances r, target flow value F, and parameter δ = ε/3
if E_r(f̃) > (1 + ε)|w|_1 or s and t are disconnected in G_H then return “fail”
if there exists e with cong_f̃(e) > ρ then add e to H and start over
return f̃

Figure 3-3: The improved oracle, which maintains a set H of forbidden edges.
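In Python, the restart-with-forbidden-edges logic of this oracle could be sketched as follows; it again leans on the illustrative electrical_flow() helper from earlier, and it omits the disconnection check as well as the exact error control of Theorem 3.3.1.

import math

def improved_oracle(edges, capacities, weights, s, t, F, n, eps):
    """Recompute the electrical flow on G_H = (V, E \\ H), moving any edge
    whose congestion exceeds rho into the forbidden set H and starting
    over; returns (flow, H), or (None, H) on "fail"."""
    m = len(edges)
    rho = 8.0 * (m * math.log(m)) ** (1.0 / 3.0) / eps
    H = set()
    while True:
        w1 = sum(weights)
        live = [i for i in range(m) if i not in H]
        r = [(weights[i] + eps * w1 / (3.0 * m)) / capacities[i] ** 2
             for i in live]
        flow = electrical_flow([edges[i] for i in live], r, s, t, F, n)
        energy = sum(ri * f * f for ri, f in zip(r, flow))
        if energy > (1.0 + eps) * w1:
            return None, H               # "fail"
        over = [i for i, f in zip(live, flow)
                if abs(f) / capacities[i] > rho]
        if not over:
            full = [0.0] * m
            for i, f in zip(live, flow):
                full[i] = f
            return full, H
        H.add(over[0])                   # forbid one offending edge, restart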
and

u(H) ≤ 30 m F ln m / (ε² ρ³).

Now, if we plug in the value ρ = (8 m^{1/3} ln^{1/3} m)/ε used by the algorithm, the above lemma gives us that |H| ≤ (15/32)(m ln m)^{1/3} and u(H) ≤ (15/256) εF < εF/12.
Given these bounds, it is now straightforward to show the following theorem,
which establishes the correctness and bounds the running time of our algorithm.
Theorem 3.4.2. For any 0 < ε < 1/2, if F ≤ F*, our improved algorithm will return a feasible s-t flow f of value |f| = (1 − O(ε))F in time Õ(m^{4/3}ε^{−3}).
Proof. To bound the running time, we note that whenever we invoke the algorithm from Theorem 3.3.1, we either advance the number of iterations of the multiplicative-weights-update routine or increase the cardinality of the set H. As a result, the number of linear systems we solve is at most Õ(ρε^{−2}) + |H| ≤ Õ(ρε^{−2}) + (15/32)(m ln m)^{1/3}. Equation (3.7) implies that the value of R from Theorem 3.3.1 is O((m/ε)^{O(1)}), so solving each linear system takes time at most Õ(m log 1/ε). This gives an overall running time of

Õ((ρε^{−2} + (15/32)(m ln m)^{1/3}) m) = Õ(m^{4/3}ε^{−3}),

as claimed.
It thus remains to prove correctness. For this, we need to show that if F ≤ F ∗ ,
then the oracle does not return “fail”. By definition of the oracle, this can occur only
if we disconnect s from t or if Er (f˜) > (1 + ε)|w |1 . By Lemma 3.4.1 and the comment
following it, we know that throughout the whole algorithm GH has maximum flow
value of at least F ∗ − εF/12 ≥ (1 − ε/12) F and thus, in particular, we will never
disconnect s from t.
Furthermore, this implies that there exists a feasible flow in our graph of value
(1 − ε/12) F , even after we have removed the edges in H. There is thus a flow of
value F in which every edge has congestion at most 1/ (1 − ε/12). We can therefore
use the argument from Section 3.3.1 (equation (3.2) and the lines directly preceding
it) to show that we always have E_r(f̃) ≤ (1 + ε)|w|_1, as required.
The above theorem allows us to apply the binary search strategy that we used in
Section 3.3. This yields the following theorem:
Theorem 3.4.3. For any 0 < ε < 1/2, the maximum flow problem can be (1 − ε)-approximated in Õ(m^{4/3}ε^{−3}) time.
which we compute electrical flows. The key insight is that we only cut an edge when
its flow accounts for a nontrivial fraction of the energy of the electrical flow, and
that cutting such an edge will cause a substantial change in the effective resistance.
Combining this with a bound on how much the effective resistance can change during
the execution of the algorithm will guarantee that we won’t cut too many edges.
Before we do all of that, let us first prove the following lemma which gives a lower
bound on the effect that increasing the resistance of an edge can have on the effective
resistance.
Lemma 3.4.4. Let f be an electrical s-t flow on a graph G with resistances r. Suppose that some edge h = (i, j) accounts for a β fraction of the total energy of f, i.e.,

f(h)² r_h = β E_r(f).

For some γ > 0, define new resistances r' such that r'_h = γ r_h, and r'_e = r_e for all e ≠ h. Then

R_eff(r') ≥ (γ / (β + γ(1 − β))) R_eff(r).

In particular, if the edge h is removed (i.e., if γ = ∞), then

R_eff(r') ≥ R_eff(r) / (1 − β).
Proof. Recall that C_eff(r) := 1/R_eff(r) denotes the effective s-t conductance, and let φ^f be the potentials of f, scaled to have potential drop 1 between s and t, so that, by Fact 2.4.1, C_eff(r) = Σ_{(u,v)∈E} (φ^f_u − φ^f_v)²/r_{(u,v)}. The assumption that h contributes a β fraction of the total energy implies that, in this expression,

(φ^f_i − φ^f_j)² / r_h = β C_eff(r),

and thus

Σ_{(u,v)∈E\{h}} (φ^f_u − φ^f_v)² / r_{(u,v)} = (1 − β) C_eff(r).

We will obtain our bound on C_eff(r') by plugging the original vector of potentials φ^f into the expression in Fact 2.4.1:

C_eff(r') = min_{φ : φ_s = 1, φ_t = 0} Σ_{(u,v)∈E} (φ_u − φ_v)² / r'_{(u,v)}
          ≤ Σ_{(u,v)∈E} (φ^f_u − φ^f_v)² / r'_{(u,v)}
          = (φ^f_i − φ^f_j)² / (γ r_h) + Σ_{(u,v)∈E\{h}} (φ^f_u − φ^f_v)² / r_{(u,v)}
          = (β/γ) C_eff(r) + (1 − β) C_eff(r)
          = C_eff(r) (β + γ(1 − β)) / γ.
Since Reff (r ) = 1/Ceff (r ) and Reff (r 0 ) = 1/Ceff (r 0 ), the desired result follows.
Now, to prove Lemma 3.4.1, let r^j be the resistances used in the j-th electrical flow computed during the execution of the algorithm. If an edge e has been added to H by step j, set r^j_e = ∞. We define the potential function

Φ(j) := R_eff(r^j) = E_{r^j}(f^j),

where f^j is the (exact) electrical flow of value 1 arising from r^j. Lemma 3.4.1 will follow easily from the following statement.

Lemma 3.4.5. Throughout the execution of the algorithm:
(1) the potential Φ(j) is nondecreasing in j;
(2) Φ(1) > 1/(m²F*²);
(3) if an edge is added to H between steps j − 1 and j, then Φ(j) ≥ (1 − ερ²/(5m))^{−1} Φ(j − 1).
Proof of (1)
The only way that the resistance rej of an edge e can change is if the weight we is
increased by the multiplicative-weights-update routine, or if e is added to H so that
rej is set to ∞. As such, the resistances are nondecreasing during the execution of the
algorithm. By Rayleigh Monotonicity (Corollary 2.4.2), this implies that the effective
resistance is nondecreasing as well.
Proof of (2)

Since the initial weights are all 1 (so that |w|_1 = m) and each edge of the minimum s-t cut C* has capacity at most F*, we have

r_e^1 = (1 + ε/3)/u_e² ≥ (1 + ε/3)/F*² > 1/F*²

for all e ∈ E(C*). As f^1 is an electrical s-t flow of value 1, it sends 1 unit of net flow between C* and its complement; so, some edge e_0 ∈ E(C*) must have f^1(e_0) ≥ 1/m. This gives

Φ(1) = E_{r^1}(f^1) = Σ_{e∈E} f^1(e)² r_e^1 ≥ f^1(e_0)² r_{e_0}^1 > 1/(m² F*²).    (3.9)
Proof of (3)
Suppose we add the edge h to H between steps j − 1 and j. We will show that
h accounts for a substantial fraction of the total energy of the electrical flow with
respect to the resistances r j−1 , and use Lemma 3.4.4.
Now, let w be the weights used at step j − 1, and let f̃ be the flow we computed in step j − 1. Because we added h to H, we know that cong_f̃(h) > ρ. Since our algorithm did not return “fail” after computing this f̃, we must have that

E_{r^{j−1}}(f̃) ≤ (1 + ε)|w|_1.    (3.10)
Using the definition of r_h^{j−1}, the fact that cong_f̃(h) > ρ, and equation (3.10), we obtain:

f̃(h)² r_h^{j−1} = f̃(h)² (w_h + ε|w|_1/(3m)) / u_h²
              ≥ f̃(h)² ε|w|_1 / (3m u_h²)
              = (ε/(3m)) (f̃(h)/u_h)² |w|_1
              = (ε/(3(1 + ε)m)) cong_f̃(h)² ((1 + ε)|w|_1)
              > (ερ²/(3(1 + ε)m)) E_{r^{j−1}}(f̃).

The above inequalities establish that edge h accounts for more than an ερ²/(3(1+ε)m) fraction of the total energy E_{r^{j−1}}(f̃) of the flow f̃.
The flow f̃ is the approximate electrical flow computed by our algorithm, but our argument will require that an inequality like this holds for the exact electrical flow f^{j−1}. This is guaranteed by part b of Theorem 3.3.1, which, along with the facts that E_r(f̃) ≥ E_r(f^{j−1}), R ≥ 1, and ε < 1/2, gives us that

f^{j−1}(h)² r_h^{j−1} > f̃(h)² r_h^{j−1} − ((ε/3)/(2mR)) E_r(f^{j−1})
                    > (ερ²/(3(1 + ε)m) − (ε/3)/(2mR)) E_r(f^{j−1})
                    > (ερ²/(5m)) E_r(f^{j−1}).
The result now follows from Lemma 3.4.4.
Proof of Lemma 3.4.1. Let k be the cardinality of the set H at the end of the algorithm. Let f̃ be the flow that was produced by our algorithm just before the k-th edge was added to H, let j be the time when this flow was output, and let w be the corresponding weights.

As the energy required by an s-t flow scales with the square of the value of the flow,

Φ(j) ≤ E_{r^j}(f)/F² ≤ E_{r^j}(f̃)/F².    (3.11)

By the construction of our algorithm, it must have been the case that E_{r^j}(f̃) ≤ (1 + ε)|w|_1. This inequality together with equation (3.11) and part 2 of Lemma 3.4.5 implies that

Φ(j) = E_{r^j}(f^j) ≤ E_{r^j}(f̃)/F² ≤ (1 + ε)|w|_1/F² ≤ (1 + ε)|w|_1 m⁴ Φ(1).
Now, since up to time j we had k − 1 additions of edges to H, parts 1 and 3 of Lemma 3.4.5, and Lemma 2.8.4 imply that

(1 − ερ²/(5m))^{−(k−1)} ≤ Φ(j)/Φ(1) ≤ (1 + ε)|w|_1 m⁴ ≤ (1 + ε) m⁴ · m exp((1 + ε)ε^{−1} ln m) ≤ 2m⁵ exp(3ε^{−1} ln m),

where the last inequality used the fact that ε < 1/2. Rearranging the terms in the above inequality gives us that

|H| = k ≤ 30 m ln m / (ε² ρ²),

where we used the inequalities ε < 1/2 and log(1 − c) < −c for all c ∈ (0, 1). This establishes our bound on the cardinality of the set H.
To bound the value of u(H), let us note that we add an edge e to H only when we
send at least ρue units of flow across it. But since we never flow more than F units
across any single edge, we have that u_e ≤ F/ρ. Therefore, we may conclude that

u(H) ≤ |H| · F/ρ ≤ 30 m F ln m / (ε² ρ³),
as desired.
3.4.4 Improving the Running Time to Õ(mn^{1/3}ε^{−11/3})
We can now combine our algorithm with existing methods to further improve its
running time. In [86] (see also [27]), Karger presented a technique, which he called
“graph smoothing”, that allows one to use random sampling to speed up an exact or
(1 − ε)-approximate flow algorithm. More precisely, his techniques yield the following
theorem, which is implicit in [86] and stated in a form closer to the one below in [27].
Theorem 3.4.6 ([86, 27]). Let T(m, n, ε) be the time needed to find a (1 − ε)-approximately maximum flow in an undirected, capacitated graph with m edges and n vertices. Then one can obtain a (1 − ε)-approximately maximum flow in such a graph in time Õ(ε² m/n · T(Õ(nε^{−2}), n, Ω(ε))).
Theorem 3.4.7. For any 0 < ε < 1/2, the maximum flow problem can be (1 − ε)-approximated in Õ(mn^{1/3}ε^{−11/3}) time.
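As a quick sanity check of this bound (suppressing polylogarithmic factors and treating the Ω(ε) accuracy loss as a constant-factor change in ε), plugging the Õ(m^{4/3}ε^{−3}) algorithm of Theorem 3.4.3 into Theorem 3.4.6 gives

\[
\tilde{O}\left(\varepsilon^{2}\,\frac{m}{n}\cdot\big(n\varepsilon^{-2}\big)^{4/3}\,\varepsilon^{-3}\right)
= \tilde{O}\left(m\,n^{1/3}\,\varepsilon^{\,2-8/3-3}\right)
= \tilde{O}\left(m\,n^{1/3}\,\varepsilon^{-11/3}\right).
\]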
3.5 A Dual Algorithm for Finding an Approximately Minimum s-t Cut in Time Õ(m + n^{4/3}ε^{−8/3})
In this section, we’ll describe our approach from a dual perspective that yields an even
simpler algorithm for computing an approximately minimum s-t cut. Rather than
using electrical flows to obtain a flow, it will use the electrical potentials to obtain a
cut.
The algorithm will eschew the oracle abstraction and multiplicative-weights-update
machinery discussed in Section 2.8. Instead, it will just repeatedly compute an elec-
trical flow, increase the resistances of edges according to the amount flowing over
them, and repeat. It will then use the electrical potentials of the last flow computed
to find a cut by picking a cutoff and splitting the vertices according to whether their
potentials are above or below the cutoff.
The algorithm is shown in Figure 3-4. It finds a (1 + ε)-approximately minimum s-t cut in time Õ(m^{4/3}ε^{−8/3}); applying it to a cut sparsifier will give us

Theorem 3.5.1. For any 0 < ε < 1/7, we can find a (1 + ε)-approximate minimum s-t cut in Õ(m + n^{4/3}ε^{−8/3}) time.
We note that, in this algorithm, there is no need to deal explicitly with edges
flowing more than ρ, maintain a set of forbidden edges, or average the flows from
different steps. We will separately study edges with very large flow in our analysis,
but the algorithm itself avoids the complexities that appeared in the improved flow
algorithm described in Section 3.4.
We further note that the update rule is slightly modified from the one that ap-
peared earlier in this chapter. This is done to guarantee that the effective resistance
increases substantially when some edge flows more than ρ, without having to explic-
itly cut it. Our previous rule allowed the weight (but not resistance) of an edge to
constitute a very small fraction of the total weight; in this case, a significant multi-
plicative increase in the weight of an edge may not produce a substantial change in
the effective resistance of the graph.
Input: A graph G = (V, E) with capacities {u_e}_e, and a target flow value F
Output: A cut C

Initialize w_e⁰ ← 1 for all edges e, ρ ← 3m^{1/3}ε^{−2/3}, N ← 5ε^{−8/3}m^{1/3} ln m, and δ ← ε².
for i := 1, ..., N do
    Find an approximate electrical flow f̃^{i−1} and potentials φ̃ using Theorem 3.3.1 on G with resistances r_e^{i−1} = w_e^{i−1}/u_e², target flow value F, and parameter δ
    µ^{i−1} ← Σ_e w_e^{i−1}
    w_e^i ← w_e^{i−1} + (ε/ρ) cong_{f̃^{i−1}}(e) w_e^{i−1} + (ε²/(mρ)) µ^{i−1} for each e ∈ E
    Scale and translate φ̃ so that φ̃_s = 1 and φ̃_t = 0
    Let C_x = {v ∈ V | φ̃_v > x}
    Set C to be the set C_x of minimal capacity
    If the capacity of C is less than F/(1 − 7ε), return C
end
return “fail”

Figure 3-4: The dual algorithm for finding an approximately minimum s-t cut.
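The last three lines of the loop – picking the best threshold cut of the potentials – can be made concrete with the following sketch; it is a direct O(mn) scan (sorting vertices by potential would make it near-linear), and the names are illustrative.

def best_potential_cut(phi, edges, capacities, s, t):
    """Sweep cut: rescale the potentials so that phi[s] = 1 and phi[t] = 0,
    then among the thresholds x induced by the vertex potentials return
    the cut C_x = {v : phi[v] > x} of minimum capacity."""
    lo, hi = phi[t], phi[s]
    phi = [(p - lo) / (hi - lo) for p in phi]    # assumes phi[s] != phi[t]
    best_cut, best_cap = None, float('inf')
    for x in sorted(set(phi))[:-1]:              # skip the topmost threshold
        side = [p > x for p in phi]
        cap = sum(c for (u, v), c in zip(edges, capacities)
                  if side[u] != side[v])
        if cap < best_cap:
            best_cap = cap
            best_cut = {v for v, inside in enumerate(side) if inside}
    return best_cut, best_cap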
crossing a large cut has non-negligible weight. In order to formalize the above argu-
ment, we will thus need some good way to measure the extent to which the weight is
“concentrated on approximately minimum s-t cuts”.
In Section 3.5.2, we will show how to use effective resistance to formulate such a
notion. In particular, we will show that if we can make the effective resistance large
enough then we can find a cut of small capacity. In Section 3.5.3, we will use an
argument like the one described above to show that the resistances produced by the algorithm in Figure 3-4 converge after T = Õ(m^{1/3}ε^{−8/3}) steps to one that meets such a bound.
Now, suppose that one has a fixed total amount of resistance µ to distribute over the edges of a cut of size F. It is not difficult to see that the maximum possible effective resistance between s and t in such a case is µ/F², and that this is achieved when one puts a resistance of µ/F on each of the edges. This suggests the following lemma, which bounds the quantity in Lemma 3.5.2 in terms of the effective resistance and the total resistance (appropriately weighted when the edges have non-unit capacities):
Lemma 3.5.3. Let µ = Σ_e u_e² r_e, and let the effective s-t resistance of G with edge resistances given by r be R_eff(r). Let φ be the potentials of the electrical s-t flow, scaled to have potential drop 1 between s and t. Then

Σ_{(u,v)∈E} |φ_u − φ_v| u_{(u,v)} ≤ √(µ / R_eff(r)).
Proof. By Fact 2.4.1, the rescaled true electrical potentials correspond to a flow of value 1/R_eff(r) and

Σ_{(u,v)∈E} (φ_u − φ_v)² / r_{(u,v)} = 1/R_eff(r).

By the Cauchy-Schwarz inequality,

Σ_{(u,v)∈E} |φ_u − φ_v| u_{(u,v)} ≤ √(Σ_{(u,v)∈E} u_{(u,v)}² r_{(u,v)}) · √(Σ_{(u,v)∈E} (φ_u − φ_v)²/r_{(u,v)}) = √(µ / R_eff(r)).

By parts a and c of Theorem 3.3.1, after we rescale φ̃ to have potential drop 1 between s and t, it will have energy

Σ_{(u,v)∈E} (φ̃_u − φ̃_v)² / r_{(u,v)} ≤ ((1 + δ)/(1 − δ)) · 1/R_eff(r) ≤ (1 + 3δ) · 1/R_eff(r),

for δ ≤ 1/3. The rest of the analysis follows from another application of Cauchy-Schwarz.
3.5.3 The Proof that the Dual Algorithm Finds an Approximately Minimum Cut

We'll show that if F ≥ F* then within T = 5ε^{−8/3}m^{1/3} ln m iterations, the algorithm in Figure 3-4 will produce a set of resistances r^i such that

R_eff(r^i) ≥ (1 − 7ε) µ^i/F².    (3.13)

Once such a set of resistances has been obtained, Lemmas 3.5.2 and 3.5.3 tell us that the best potential cut of φ̃ will have capacity at most

((1 + 2δ)/√(1 − 7ε)) F ≤ F/(1 − 7ε).
ν^i ≤ µ^i for all i.
Our proof that the effective resistance cannot remain large for too many iterations
will be similar to our analysis of the flow algorithm in Section 3.4. We suppose to the contrary that R_eff(r^i) ≤ (1 − 7ε) µ^i/F² for each 1 ≤ i ≤ T. We will show that, under this assumption:
1. The total weight µi doesn’t get too large over the course of the algorithm
[Lemma 3.5.4].
2. The quantity ν i increases significantly in any iteration in which no edge has
congestion more than ρ [Lemma 3.5.5]. Since µi doesn’t get too large, and
ν i ≤ µi , this will not happen too many times.
3. The effective resistance increases significantly in any iteration in which some
edge has congestion more than ρ [Lemma 3.5.6]. Since µi does not get too
large, and the effective resistance is assumed to be bounded in terms of the total
weight µi , this cannot happen too many times.
The combined bounds from 2 and 3 will be less than T , which will yield a contradic-
tion.
Lemma 3.5.4. For each i ≤ T such that R_eff(r^i) ≤ (1 − 7ε) µ^i/F²,

µ^{i+1} ≤ µ^i exp(ε(1 − 2ε)/ρ).

So, if for all i ≤ T we have R_eff(r^i) ≤ (1 − 7ε) µ^i/F², then

µ^T ≤ µ⁰ exp(ε(1 − 2ε)T/ρ).    (3.14)
Proof. If R_eff(r^i) ≤ (1 − 7ε) µ^i/F², then the electrical flow f^i of value F has energy

F² R_eff(r^i) = Σ_e cong_{f^i}(e)² w_e^i ≤ (1 − 7ε) µ^i.

By Theorem 3.3.1, the approximate electrical flow f̃^i has energy at most (1 + δ) times the energy of f^i, so

Σ_e cong_{f̃^i}(e)² w_e^i ≤ (1 + δ)(1 − 7ε) µ^i ≤ (1 − 6ε) µ^i.

By the Cauchy-Schwarz inequality, Σ_e cong_{f̃^i}(e) w_e^i ≤ √(µ^i · (1 − 6ε)µ^i) ≤ (1 − 3ε) µ^i, and hence the update rule gives

µ^{i+1} = µ^i + (ε/ρ) Σ_e cong_{f̃^i}(e) w_e^i + (ε²/ρ) µ^i ≤ µ^i (1 + (ε/ρ)(1 − 2ε)) ≤ µ^i exp(ε(1 − 2ε)/ρ),

as desired.
Lemma 3.5.5. If cong_{f̃^i}(e) ≤ ρ for every edge e, then

ν^{i+1} ≥ exp(ε(1 − ε)/ρ) ν^i.
Proof. To bound the increase of ν^{i+1} over ν^i, we use the inequality

1 + εx ≥ exp(ε(1 − ε)x),

which holds for ε and x between 0 and 1. We apply this inequality with x = cong_{f̃^i}(e)/ρ. As f̃^i is a flow of value F and C is a cut, Σ_{e∈C} f̃_e^i ≥ F. We now compute

ν^{i+1} = (Π_{e∈C} (w_e^{i+1})^{u_e})^{1/u_C}
        ≥ (Π_{e∈C} (w_e^i (1 + (ε/ρ) cong_{f̃^i}(e)))^{u_e})^{1/u_C}
        = ν^i (Π_{e∈C} (1 + (ε/ρ) cong_{f̃^i}(e))^{u_e})^{1/u_C}
        ≥ ν^i exp((1/u_C) Σ_{e∈C} u_e (ε(1 − ε)/ρ) cong_{f̃^i}(e))
        = ν^i exp((ε(1 − ε)/(ρ u_C)) Σ_{e∈C} f̃_e^i)
        ≥ ν^i exp(ε(1 − ε) F/(ρ u_C))
        ≥ ν^i exp(ε(1 − ε)/ρ).
Lemma 3.5.6. If R_eff(r^i) ≤ (1 − 7ε) µ^i/F² and there exists some edge e such that cong_{f̃^i}(e) > ρ, then

R_eff(r^{i+1}) ≥ exp(ε²ρ²/(4m)) R_eff(r^i).
Proof. We first note that, by induction on i, the additive term in the update rule keeps every weight at an ε/m fraction of the total weight: assuming w_e^i ≥ (ε/m) µ^i, we have

w_e^{i+1} ≥ w_e^i + (ε²/(mρ)) µ^i
         ≥ (ε/m) µ^i + (ε²/(mρ)) µ^i
         = (ε/m)(1 + ε/ρ) µ^i
         ≥ (ε/m) exp(ε(1 − 2ε)/ρ) µ^i
         ≥ (ε/m) µ^{i+1},

by Lemma 3.5.4.
We now show that an edge e for which cong_{f̃^i}(e) ≥ ρ contributes a large fraction of the energy to the true electrical flow of value F. By the assumptions of the lemma, the energy of the true electrical flow of value F is

F² R_eff(r^i) ≤ (1 − 7ε) µ^i.

On the other hand, the energy of the edge e in the approximate electrical flow is

f̃^i(e)² r_e^i = cong_{f̃^i}(e)² w_e^i ≥ ρ² (ε/m) µ^i.

As a fraction of the energy of the true electrical flow, this is at least

(1/(1 − 7ε)) · (ρ²ε/m) = 9/((1 − 7ε) ε^{1/3} m^{1/3}).

By part b of Theorem 3.3.1, the fraction of the energy that e accounts for in the true flow is at least

9/((1 − 7ε) ε^{1/3} m^{1/3}) − ε²/(2mR) ≥ 1/(ε^{1/3} m^{1/3}) = ρ²ε/(9m).
As

w_e^{i+1} ≥ w_e^i (1 + (ε/ρ) cong_{f̃^i}(e)) ≥ (1 + ε) w_e^i,

we have increased the weight of edge e by a factor of at least (1 + ε). So, by Lemma 3.4.4,

R_eff(r^{i+1})/R_eff(r^i) ≥ 1 + ρ²ε²/(2m) ≥ exp(ρ²ε²/(4m)).
Before proving Lemma 3.5.7, we note that combining it with Lemmas 3.5.2 and 3.5.3
immediately implies our main result:
Theorem 3.5.8. On input ε < 1/7, the algorithm in Figure 3-4 runs in time Õ(m^{4/3}ε^{−8/3}). If F ≥ F*, then it returns a cut of capacity at most F/(1 − 7ε), where F* is the minimum capacity of an s-t cut.
To use this algorithm to find a cut of approximately minimum capacity, one should
begin as described in Section 3.3 by crudely approximating the minimum cut, and
then applying the above algorithm in a binary search. Crudely analyzed, this incurs
a multiplicative overhead of O(log m/ε).
Proof of Lemma 3.5.7. Suppose as before that R_eff(r^i) ≤ (1 − 7ε) µ^i/F² for all 1 ≤ i ≤ T. By Lemma 3.5.4 and the fact that µ⁰ = m, the total weight at the end of the algorithm is bounded by

µ^T ≤ m exp(ε(1 − 2ε)T/ρ).    (3.15)
Let A ⊆ {1, ..., T} be the set of i for which cong_{f̃^i}(e) ≤ ρ for all e, and let B ⊆ {1, ..., T} be the set of i for which there exists an edge e with cong_{f̃^i}(e) > ρ. Let a = |A| and b = |B|, and note that a + b = T. We will obtain a contradiction by proving bounds on a and b that add up to something less than T.
We begin by bounding a. By applying Lemma 3.5.5 to all of the steps in A and noting that ν^i never decreases during the other steps, we obtain

ν^T ≥ exp((ε(1 − ε)/ρ) a) ν⁰ = exp((ε(1 − ε)/ρ) a).    (3.16)

Since ν^T ≤ µ^T, combining equation (3.16) with equation (3.15) and taking logarithms yields

a ≤ (1 − ε)T + (7ρ/(6ε)) ln m.    (3.17)

To bound b, we apply Lemma 3.5.6 to all of the steps in B and note that the effective resistance never decreases during the other steps, to
obtain

R_eff(r^T) ≥ exp((ε²ρ²/(4m)) b) R_eff(r⁰) ≥ exp((ε²ρ²/(4m)) b) · (1 + ε)/(m²F²),

where the second inequality applies the bound R_eff(r⁰) ≥ 1/(m²F*²), which follows from an argument like the one used in equation (3.9). Using our assumption that R_eff(r^T) ≤ (1 − 7ε) µ^T/F² and the bound on µ^T from equation (3.15), this gives us

exp((ε²ρ²/(4m)) b) · (1 + ε)/(m²F²) ≤ ((1 − 7ε)/F²) · m exp(ε(1 − 2ε)T/ρ).

Taking logarithms and rearranging, we get

b ≤ (4m(1 − 2ε)/(ερ³)) T + (4m/(ε²ρ²)) · 3 ln m < (4m/(ερ³)) T + (12m/(ε²ρ²)) ln m.    (3.18)
Adding the inequalities in equations (3.17) and (3.18), grouping terms, and plugging in the values of ρ and T, yields

T = a + b < (1 − ε)T + (4m/(ερ³)) T + (7ρ/(6ε)) ln m + (12m/(ε²ρ²)) ln m
          = ((1 − ε) + 4ε/27)(5ε^{−8/3}m^{1/3} ln m) + (7m^{1/3}/(2ε^{5/3})) ln m + (4m^{1/3}/(3ε^{2/3})) ln m
          = (5m^{1/3}/ε^{8/3}) ln m − (41/54 − 4ε/3)(m^{1/3}/ε^{5/3}) ln m
          < (5m^{1/3}/ε^{8/3}) ln m = T.

This is our desired contradiction, which completes the proof.
one can turn an efficient algorithm that computes the maximum s-t flow in undirected
graphs into an efficient algorithm that computes such a flow also in directed graphs.
More precisely, the following theorem holds.
Theorem 3.6.1 (Folklore). Given an algorithm A that computes a maximum s-t flow in any undirected capacitated graph G' = (V', E', u') in time T(|V'|, |E'|, U'), where U' := max_{e,e'} u'_e/u'_{e'} is the capacity ratio of G', one can compute a maximum s-t flow in any directed capacitated graph G = (V, E, u) in time T(O(|V|), O(|E|), |V| · U) + Õ(|E|), where U := max_{e,e'} u_e/u_{e'} is the capacity ratio of G.
Note that in the above theorem we need the algorithm A to compute the exact maximum s-t flow, not just an approximation of it. So, we cannot directly use the algorithm presented in this chapter here. Still, this theorem shows that to make progress on the directed case as well, we can focus on refining our approach to develop an error control mechanism that allows us to perform exact undirected maximum flow computations. We also observe that the maximum s-t flow and minimum s-t cut duality [67, 57] can be used to get an analogous result for the minimum s-t cut problem.
Proof. The way we will find the maximum s-t flow f* in the graph G within the desired time bounds – and thus prove the theorem – is based on constructing an appropriate undirected graph G̃ = (Ṽ, Ẽ, ũ) and extracting the flow f* from the maximum s-t flow f̃* in G̃ (that we will find using the algorithm A).

The graph G̃ is constructed from G as follows. We take Ṽ to be equal to V. Next, for each (directed) arc e = (u, v) of G we add (undirected) edges h(e) := (u, v), h_s(e) := (s, v), and h_t(e) := (u, t) to G̃, with the capacity of each of these edges being u_e. Clearly, the resulting graph G̃ is undirected and has O(|E|) edges and O(|V|) vertices. Also, by just viewing each multi-edge created in G̃ as a single edge with total capacity being the sum of the capacities of all the edges forming it, we see that the capacity ratio of G̃ is at most |V| · U.

Now, we claim that the maximum s-t flow value F̃* of the graph G̃ satisfies

F̃* ≥ 2F* + Σ_{e∈E} u_e,
show that the total capacity u(C) of the arcs leaving the cut C in G has to be at
least F ∗ .
In the light of the above, if we compute the maximum s-t flow f̃* in G̃ using the algorithm A, then the running time of this procedure will be T(O(|V|), O(|E|), |V| · U) + Õ(|E|), and the value |f̃*| of f̃* is at least 2F* + Σ_{e∈E} u_e.

Now, we describe how to extract out of the flow f̃* a feasible flow f* in G of value at least F* in Õ(|E|) time. Clearly, once we obtain this flow f*, it will be the maximum s-t flow in G that we are looking for. The extraction will consist of two steps.
steps.
The first step is based on first obtaining a flow f̃' out of f̃* by subtracting, for each arc e ∈ E, the s-t flow f^e in G̃ that sends u_e units of flow through the arcs h_s(e), h(e), and h_t(e); and then scaling the result down by a factor of 2. In other words, we take

f̃' := (1/2) (f̃* − Σ_{e∈E} f^e).

Note that the value |f̃'| of this flow is at least (1/2)(F̃* − Σ_{e∈E} u_e) ≥ F*. Furthermore, it is easy to see that, by the feasibility of f̃*, f̃' routes over each edge h(e) at most (1/2)(|f̃*(h(e))| + f^e(h(e))) ≤ u_e units of flow, and this flow has to flow in a direction consistent with the orientation of the arc e. So, if f̃' did not route any flow over the edges h_s(e) and h_t(e), i.e., if we had |f̃'(h_s(e))| = |f̃'(h_t(e))| = 0 for each e ∈ E, then f̃' would also be a feasible flow in G and thus we could just take f* to be f̃'.
Of course, we cannot expect the flow f̃' to never route anything over the edges h_s(e) and h_t(e). However, what is still true is that the only direction in which the flow over the edge h_s(e) (resp. the edge h_t(e)) can go is towards s (resp. out of t). This is so since, by feasibility of f̃*, the flow over the edges h_s(e) and h_t(e) in f̃* can be at most u_e, and during the construction of f̃' we subtract the s-t flows f^e that flow exactly u_e units over the edges h_s(e) (resp. h_t(e)) in the direction out of s (resp. towards t).

However, as an acyclic s-t flow does not send any flow towards s or out of t, it must be the case that the flow f̃' can have a non-zero flow on edges h_s(e) and h_t(e) only if it contains some flow cycles. Therefore, if we just apply to f̃' the Õ(m)-time flow-cycle-canceling algorithm that is described in [125], then the acyclic flow f* that we obtain in this way will be the maximum s-t flow in G that we are seeking.
Chapter 4
4.1 Introduction
We will be studying multicommodity flow problems. These problems correspond to the task of finding, in a capacitated and directed graph G = (V, E, u), a set of k flows (f_1, ..., f_k) – each f_i being an s_i-t_i flow of commodity i with a source s_i and a sink t_i – that optimize some objective function while respecting the capacity constraints, i.e., while having the total flow Σ_i f_i(e) of the commodities through any arc e not exceed its capacity u_e. There are two basic variations of this task. The first one is the maximum multicommodity flow problem – in this case the objective is to maximize the sum Σ_i |f_i| of the values of the flows. The second one is the maximum concurrent flow problem. In this problem, we are given a set of k positive demands d_1, ..., d_k and are asked to find a multicommodity flow that is feasible (i.e., obeys arc capacities) and routes θ d_i units of commodity i between each source-sink pair (s_i, t_i). The goal is to maximize the value of the rate θ.
Although the problems defined above can be solved optimally in polynomial time
by formulating them as linear programs, in many applications it is more important
to compute an approximate solution fast than to compute an optimal one. There-
fore, much effort was put into obtaining efficient approximation algorithms for these
problems. In particular, one is interested in getting a fast algorithm that computes a
¹ This chapter contains material from [105].
solution that has objective value within a factor of (1 − ε) of the optimal one, where
ε > 0 is the desired accuracy factor.
be inherent for Dantzig-Wolfe-type algorithms² and all the algorithms mentioned so far are of this type. As it turns out, better dependence on 1/ε can be obtained. Bienstock and Iyengar [29] adapted the technique of Nesterov [107] to give a (1 − ε)-approximation algorithm for the maximum concurrent flow problem with an O((log 1/ε)/ε) dependence on 1/ε. Very recently, Nesterov [108] obtained an algorithm for this problem in which this dependence is just linear. However, the running times of both these algorithms have worse dependence on parameters other than 1/ε compared to the algorithms described above – for example, the approximation scheme due to Nesterov has a running time of Õ(k²m²ε^{−1}).
Problem                          Previous best                 Our result
maximum multicommodity flow      Õ(m²ε^{−2}) [65]              Õ(mnε^{−2})
maximum concurrent flow          Õ((m² + kn)ε^{−2}) [83]       Õ((m + k)nε^{−2} log M)
                                 Õ(k²m²ε^{−1}) [108]

Figure 4-1: Comparison of our results for the multicommodity flow problems with the previous ones.
time would be within our intended bounds, then it seems there is no suitable existing
result that can be used (see section 4.3 for details).
This lack of an existing solution fitting our needs brings us to the second idea. We note that when we employ the multiplicative-weights-update-based framework to solve multicommodity flow problems, it is not necessary to compute the (approximately) shortest path for each shortest-path subproblem. All we really need is that the set of the suitable paths over which we are optimizing the length comes from a set that contains all the flowpaths of some fixed optimal solution to the multicommodity flow instance that we are solving. To exploit this fact, we introduce a random set of paths P̂ (formally defined in Definition 4.3.1) that can be seen as a sparsification of the set of all paths in G, and that with high probability has the above-mentioned containment property. Next, we combine ideas from dynamic graph algorithms to design an efficient data structure that maintains all-pairs shortest path distances with respect to the set P̂. This data structure allows us to modify (in an almost generic manner) the existing multiplicative-weights-update-based algorithms for various multicommodity flow problems and transform them into Monte Carlo algorithms with improved running times.
The summary of our results and their comparison with the previous ones can be found in Figure 4-1. To put these results in context, we note that there are simple examples (see Figure 4-2) that show that no (1 − ε)-approximate algorithm for the maximum multicommodity flow problem (resp. maximum concurrent flow problem) can have running time smaller than O(n²) (resp. O(n² + kn)). On the other hand, in the setting where ε is fixed or moderately small, say 1/log^{O(1)} n, and the size of every number used in the input instance is polynomially bounded, our algorithms for these two problems have running times of Õ(mn) and Õ(mn + kn) respectively. Thus they are within a Õ(m/n) factor of these trivial lower bounds, and the Õ(kn) term in the second running time is essentially optimal.
Furthermore, these Õ(mn) terms almost match a natural barrier for the existing multiplicative-weights-based approaches: the Ω(mn) flow-decomposition barrier corresponding to single-commodity flows. Recall that the flow-decomposition barrier corresponds to the fact that there are instances of the maximum s-t flow problem for which the decomposition of any (even approximately) optimal solution into flowpaths has to have size Ω(mn). As a result, any algorithm that produces such a de-
[Figure: sources s_1, ..., s_{n/4} and sinks t_1, ..., t_{n/4} attached to the two ends of a path on n/2 vertices.]

Figure 4-2: In this example, the arcs leaving vertices {s_i}_i and entering vertices {t_i}_i have capacity 1. The edges on the middle path have capacity n/4. By choosing the set of source-sink pairs to be {(s_i, t_i)}_i we obtain an instance of the maximum multicommodity flow problem for which any (1 − ε)-approximate solution has representation size Ω((1 − ε)n²) = Ω(n²) for ε < 1/2 – each of the n/2 − 1 arcs on the middle path has to have a non-zero flow of at least (1 − ε)n/4 commodities flowing through it. Similarly, by choosing any set of k ≤ n²/16 distinct (s_i, t_j) pairs as source-sink pairs for k commodities and setting all demands to 1, we obtain an instance of the maximum concurrent flow problem for which any (1 − ε)-approximate solution has representation size Ω(kn) = Ω(n² + kn) whenever k ≥ n.
composition has to have a running time of Ω(mn). Therefore, as all the existing multiplicative-weights-based algorithms for the problem are based on iteratively finding shortest paths that are used to route the commodities, this natural barrier transfers over.

Recall that in the case of the maximum s-t flow problem this barrier was overcome – Goldberg and Rao [73] were the first to present an algorithm that improves upon it. This raises an interesting question of whether one can achieve a similar improvement for the maximum multicommodity flow problem.
4.1.4 Outline of This Chapter

We start with section 4.2, where we present the multiplicative-weights-update-based algorithm for the maximum multicommodity flow problem. Next, in section 4.3, we introduce the main ideas and tools behind our results – the connection between fast approximation schemes for multicommodity flow problems and dynamic graph algorithms for maintaining (approximately) shortest paths with respect to the sparsified set of paths. In particular, we formally define the set P̂, and the (δ, M_max, M_min, P̂)-ADSP data structure that we will be using for maintenance of the (approximately) shortest paths that we are interested in. Subsequently, in section 4.4, we show how these concepts can be applied to give a more efficient algorithm for the maximum multicommodity flow problem. Then, in section 4.5, we describe an implementation of the (δ, M_max, M_min, P̂)-ADSP data structure that is efficient enough for our purposes. We conclude in section 4.6 by showing how the ideas developed in this chapter lead to an improved algorithm for the maximum concurrent flow problem.
B ≤ F ∗ ≤ kmB,
where B is the maximum bottleneck of any si -ti path in G. (Recall that a bottleneck
of a path is the minimum capacity of an edge on that path.) Such a bottleneck B
can be computed in O(mn + n² log n) = Õ(mn) time, e.g., using the algorithm of Fredman and Tarjan [68]. Now, we can just follow the reasoning outlined in Section 3.3 for the maximum s-t flow problem to reach the desired conclusion.
Therefore, from now on, we can just focus on a particular value of F and the task of
designing a (1 + O(ε))-approximate solver for the corresponding (F, G)-system. Once
again, similarly to the case of Chapter 3, this solver will be based on the multiplicative-weights-update routine from Figure 2-1 and thus all we need to do is to design an
oracle for it. Our oracle will be very simple. Given a query corresponding to some
weights w , it considers the length vector l with lengths corresponding to the weights
normalized by the capacities, i.e., sets
l_e := w_e/u_e    (4.1)

for each arc e. Then, it finds the shortest s_i-t_i path p with respect to l, i.e., the path p = argmin_{p∈P} l(p). Once such a path is found, the oracle returns as an answer a flow f̃ that sends F units of flow along the path p if l(p) ≤ Σ_e l_e u_e/F; and returns “fail” otherwise.
We claim that this procedure provides a 1-oracle for our (F, G)-system (cf. Definition 2.8.2). To see that this is indeed the case, it suffices to prove the following lemma and note that, by definition, F l(p) = Σ_e w_e cong_f̃(e) and Σ_e l_e u_e = Σ_e w_e.

Lemma 4.2.1. For any length vector l, whenever F ≤ F*, there exists a path p ∈ P with F l(p) ≤ Σ_e l_e u_e.
Proof. Let p_1, ..., p_q be a decomposition of the optimal solution f* into flow-paths. The fact that f* has to obey the capacity constraints implies that

Σ_e l_e u_e ≥ Σ_{j=1}^q l(p_j) f*(p_j),

where f*(p_j) is the amount of flow sent over the path p_j by f*. But

F ≤ F* = Σ_i |f_i*| = Σ_{j=1}^q f*(p_j),

so there must exist a path p_j with F l(p_j) ≤ Σ_e l_e u_e.
Now, applying Lemma 2.8.9 with the trivial tightness bound of 1, we can conclude that the above multiplicative-weights-update routine makes at most Õ(mε^{−2}) queries to the oracle, and thus the following theorem is established.

Theorem 4.2.2 (see also [70]). For any 1/2 > ε > 0, one can compute a (1 − ε)-approximation to the maximum multicommodity flow problem in time Õ(km²ε^{−2}).
Although the above algorithm is conceptually simple, its running time is fairly large – note that k can be Ω(n²), which would result in a prohibitively large Ω(m²n²) running time. However, it is not obvious how to reduce this running time significantly³. After all, there are limits to how fast the underlying multiple source-sink pair shortest path problem can be solved (e.g., there is a trivial lower bound of Ω(n²)), and it is not hard to construct examples on which the bound on the number of iterations is essentially tight.
It turns out, however, that this multiplicative-weights-update-based approach can
still be sped up considerably. The key to achieving this is not focusing on finding a more efficient way of answering each individual oracle query, but trying to come up instead with a more efficient way of solving the whole sequence of these queries. This
insight is pivotal for all the techniques presented in this chapter.
To see an example of how this insight can be utilized, we present a way of improv-
ing the running time of the above algorithm that was originally devised by Fleischer
[65]. This improvement is based on the realization that whenever we need the shortest
path in the above algorithm, it is sufficient to compute a (1 + ε)-approximately short-
est source-sink path instead of the shortest one. The only thing that is changed by
doing so, is that now we are working with a (1+ε)-oracle to the (F, G)-system instead
of the 1-oracle that was used above. However, as we are solving this (F, G)-system
(1 + ε)-approximately anyway, this difference does not change anything.
Now, the fact that providing only (1 + ε)-approximate answers to the oracle queries suffices allows us to modify the way these answers are computed, so as to avoid solving the shortest-path problem for all the source-sink pairs each time the oracle is queried. The way we achieve this is by cycling through the source-sink pairs, repeatedly supplying as the answer the shortest path corresponding to the currently chosen source-sink pair as long as the length of this path is at most (1 + ε)α̂ – where α̂ is a lower-bound estimate, maintained by the algorithm, of the current length of the overall shortest path in P – and moving on to the next source-sink pair once it is not. To start, we set α̂ sufficiently small, and we do not increase its value as long as we manage to find a path in P of small enough length (i.e., at most (1 + ε)α̂). Once we are unable to find such a path, i.e., our cycling through commodities has made a full cycle, we set α̂ ← (1 + ε)α̂ and start cycling again. The resulting algorithm is presented in Figure 4-3. An important implementation detail used there is that, when cycling through source-sink pairs, we group together the pairs that share the same source. This allows us to take advantage of the fact that one execution of Dijkstra's algorithm computes simultaneously the shortest paths for all these pairs.
³ One possible speed-up would come from noting that one can solve the all-pairs shortest path problem in time O(min{mn, n^{2.575}}) [66, 137, 140], which gives a speed-up in the case of k being very large. But, even after this improvement, the running time is still Ω(mn²).
To see why the above modifications reduce the number of shortest path computations needed to execute our multiplicative-weights-update routine, note that each execution of Dijkstra's algorithm either results in providing an answer to an oracle query, or causes us to move on in our cycling to the next group of source-sink pairs. Observe that by Lemma 2.8.4 we know that the length of the shortest path is always n^{O(1/ε)}, and thus our way of updating the value of α̂ ensures that there are at most ⌊log_{1+ε} n^{O(1/ε)}⌋ = O(ε^{−2} log m) full cycles through the pairs. Also, the number of oracle queries is still Õ(mε^{−2}). Therefore, we have at most Õ((m + min{k, n})ε^{−2}) = Õ(mε^{−2}) executions of Dijkstra's algorithm, which leads to the following theorem.
Theorem 4.2.3 (see also [65]). For any 1/2 > ε > 0, the algorithm in Figure 4-3 provides a (1 − O(ε))-approximation to the maximum multicommodity flow problem in time Õ(m²ε^{−2}).
Input: Graph G = (V, E, u), source-sink pairs {(s_i, t_i)}_i, accuracy parameter 1/2 > ε > 0, and a target flow value F
Output: Either a flow f, or “fail” indicating that F > F*

Initialize the multiplicative-weights-update routine D from Figure 2-1 (for F being the set of s_i-t_i flows of value F in G), and let w be the corresponding weight vector
Initialize l_e ← w_e/u_e for each arc e ∈ E (cf. (4.1)), α̂ ← 1/U, and I(s) ← {i | s_i = s} for each s ∈ V    (* grouping source-sink pairs with common sources *)
while true do
    foreach s ∈ V with I(s) ≠ ∅ do    (* cycling through commodities *)
        Use Dijkstra's algorithm to find the shortest path p in ∪_{i∈I(s)} P_i
        while l(p) < (1 + ε)α̂ do
            if l(p) > (1 + ε) Σ_e l_e u_e / F then return “fail”
            else
                Supply to D the flow f̃ sending F units of flow over p as an oracle answer
                Update the weights w as dictated by the routine D
                Update the lengths l to make (4.1) hold with respect to the updated weights
                if D terminates then return the flow f computed by D
            end
            Use Dijkstra's algorithm to find the shortest path p in ∪_{i∈I(s)} P_i
        end
    end
    α̂ ← (1 + ε)α̂    (* increase the value of α̂ *)
end

Figure 4-3: The improved (1 − O(ε))-approximation algorithm for the maximum multicommodity flow problem.
116, 22, 53, 117, 119, 118, 23]). However, if we are interested in solutions whose overall running time is within our intended bounds, the result that is closest to what we need is the one by Roditty and Zwick [117]. They show that if G were undirected and had positive, integer lengths on edges, with maximum length being b, a (1 + δ)-approximate solution to the decremental dynamic all-pairs shortest path problem can be implemented with total maintenance time of Õ(mnb/δ), O(1) time needed to answer any vertex-to-vertex shortest-path distance query, and returning a shortest vertex-to-vertex path in O(n) time.⁴ Unfortunately, in our applications the graph G is not only directed, but the lengths of the arcs can also be of order b = Ω(n^{1/ε}) with ε < 1/2 (cf. Lemma 2.8.4). Therefore, the resulting running time would be prohibitive. Further, the construction of Roditty and Zwick assumes that the sequence of operations to be handled is oblivious to the behavior of the data structure (e.g., to its randomized choices). This feature is another shortcoming from our point of view, since in our setting the way the lengths of the arcs evolve depends directly on which shortest paths the data structure chose to output previously.
To circumvent this lack of suitable existing solutions, we observe that for our purposes it is sufficient to solve a simpler task than the decremental dynamic all-pairs shortest path problem in its full generality (i.e., in the directed setting, allowing large lengths on arcs and adversarial requests). Namely, when applying the multiplicative-weights-update-based approach from the previous section, it is not necessary to compute for each of the resulting subproblems the (approximately) shortest path among all the suitable paths in the set P. As we will prove later, to establish satisfactory bounds on the quality of the final flow, it suffices that whenever we solve some shortest-path subproblem, the set of suitable paths over which we are optimizing the length comes from a set that contains all the flowpaths of some fixed optimal solution to the instance of the multicommodity flow problem that we are solving. To see why this is the case, we just need to note that Lemma 4.2.1 – which we need to establish to obtain the desired bound on the quality of the returned solution – is proved by only considering flowpaths of some optimal solution.

With this observation in mind, we define the following random subset P̂ of paths in P.⁵ One may view P̂ as a sparsification of the set P.
Definition 4.3.1. For j = 1, ..., ⌈log n⌉, let S_j be a random set obtained by sampling each vertex of V with probability p_j = min{(10 ln n)/2^j, 1}. Define P̂ := ∪_{j=1}^{⌈log n⌉} P(S_j, 2^j), where P(S, j'), for a given S ⊆ V and j' > 0, is the set of all paths in P that consist of at most j' arcs and pass through at least one vertex of S.
⁴ Note that there can be as many as mb edge length increases in a sequence; thus this solution is faster than the naïve solution that just computes all-pairs shortest paths after each edge-length increase, or one that just answers each of the queries using Dijkstra's algorithm.
⁵ It is worth noting that a very similar set was used in [117], albeit with a slightly different motivation.
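Sampling the vertex sets of Definition 4.3.1 is straightforward; the following sketch returns the sets S_1, ..., S_⌈log n⌉ (the constant 10 matches the definition above, and everything else is an illustrative assumption).

import math
import random

def sample_hitting_sets(n):
    """S_j keeps each vertex independently with probability
    min(10*ln(n)/2**j, 1), so that, with high probability, every path on
    roughly 2**j arcs contains a vertex of S_j."""
    sets = []
    for j in range(1, math.ceil(math.log2(n)) + 1):
        p = min(10.0 * math.log(n) / 2 ** j, 1.0)
        sets.append({v for v in range(n) if random.random() < p})
    return sets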
Lemma 4.3.2. For any fixed multicommodity flow solution f = (f_1, ..., f_k), with high probability, all the flowpaths of f are contained in P̂.
Definition 4.3.3. For any δ ≥ 0, M_max ≥ 2M_min > 0, and a set of paths Q ⊆ P, let the δ-approximate decremental (M_max, M_min, Q)-shortest path problem ((δ, M_max, M_min, Q)-ADSP, for short) be a problem in which one maintains a directed graph G with length vector l that supports four operations (sometimes we will refer to these operations as requests):

• Distance(u, v, β), for u, v ∈ V and β ∈ [M_min, M_max/2]: let d* be the length of the shortest (with respect to l) u-v path in Q; if d* ≤ 2β then the query returns a value d such that d ≤ d* + δβ. If d* > 2β, the query may return either d as above, or ∞;

• Increase(e, ω), for e ∈ E and ω ≥ 0: increases the length l(e) of the arc e by ω;

• Path(u, v, β), for u, v ∈ V and β ∈ [M_min, M_max/2]: returns a u-v path of length at most Distance(u, v, β), as long as Distance(u, v, β) ≠ ∞;

• SSrcDist(u, β), for u ∈ V and β ∈ [M_min, M_max/2]: returns the answers to the queries Distance(u, v, β) for all v ∈ V.
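Viewed as an abstract data type, the interface can be recorded as the following Python skeleton; the bodies are placeholders (the actual implementation is developed in section 4.5), and the SSrcDist contract is inferred from its use in Theorem 4.3.4 rather than spelled out in this excerpt.

class ADSP:
    """Interface sketch of a (delta, M_max, M_min, Q)-ADSP data structure."""

    def distance(self, u, v, beta):
        """If the shortest u-v path in Q has length d* <= 2*beta, return
        some d <= d* + delta*beta; otherwise may also return float('inf')."""
        raise NotImplementedError

    def increase(self, e, omega):
        """Increase the length of arc e by omega >= 0."""
        raise NotImplementedError

    def path(self, u, v, beta):
        """Return a u-v path of length at most distance(u, v, beta),
        provided that value is finite."""
        raise NotImplementedError

    def ssrc_dist(self, u, beta):
        """Single-source variant: answers of distance(u, v, beta) for all v."""
        raise NotImplementedError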
Intuitively, β is our guess on the interval [β, 2β] in which the length of the shortest path we are interested in lies. We say that β is (u, v)-accurate for a given (δ, M_max, M_min, Q)-ADSP data structure R and u, v ∈ V, if the length d* of the shortest u-v path in Q is at least β and the data structure R returns a finite value in response to the Distance(u, v, β) query. Note that if β is (u, v)-accurate, then the δβ-additive error guarantee on the distance estimation supplied by R in response to the Distance(u, v, β) query implies a (1 + δ) multiplicative error guarantee. Also, as long as d* is at least M_min (this will always be the case in our applications), we can employ binary search to ask O(log log M_max/M_min) Distance(u, v, ·) requests and either realize that d* is bigger than M_max, or find 0 ≤ i ≤ ⌈log M_max/(2M_min)⌉ such that β_i = min{2^i M_min, M_max/2} is (u, v)-accurate.⁶ Finally, it is worth emphasizing that we do not require that the paths returned in response to Path(·, ·, ·) requests are from Q – all we insist on is that the lengths of all the suitable paths from Q are considered when the path to be returned is chosen.
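The binary search just described takes O(log log M_max/M_min) Distance probes; in code (against the hypothetical interface sketched above) it could read:

import math

def find_accurate_beta(ds, u, v, M_min, M_max):
    """Probe beta_i = min(2**i * M_min, M_max/2): if Distance(u, v, beta_i)
    is infinite, drop the left half of the index range (including i),
    otherwise drop the right half. Returns an accurate beta, or None if
    the shortest-path length exceeds M_max."""
    lo, hi = 0, math.ceil(math.log2(M_max / (2.0 * M_min)))
    answer = None
    while lo <= hi:
        i = (lo + hi) // 2
        beta = min(2 ** i * M_min, M_max / 2.0)
        if ds.distance(u, v, beta) == float('inf'):
            lo = i + 1
        else:
            answer = beta
            hi = i - 1
    return answer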
In section 4.5, we will describe how the ideas and tools from dynamic graph algorithms lead to an implementation of the (δ, M_max, M_min, P̂)-ADSP data structure that is tailored to maintaining the shortest paths from the set P̂, and whose performance is described in the following theorem. The theorem is proved in section 4.5.2.

Theorem 4.3.4. For any 1 > δ > 0 and M_max ≥ 2M_min > 0, a (δ, M_max, M_min, P̂)-ADSP data structure can be maintained in total expected time Õ(mn (log M_max/M_min)/δ) plus additional O(1) per Increase(·, ·) request in the processed sequence. Each Distance(·, ·, ·) and Path(·, ·, ·) query can be answered in Õ(n) time, and each SSrcDist(·, ·) query in Õ(m) time.
Theorem 4.3.5. For any 1 > ε > 0 and L ≥ 1, there exists a (1 + ε)-approximate Monte Carlo solution to the oblivious decremental dynamic all-pairs shortest paths problem on directed graphs whose arc lengths are rational numbers between 1 and L, that has a total maintenance cost of Õ(mn·(log L)/ε) plus additional O(1) per increase of the length of any arc, and answers shortest-path queries for any vertex pair in Õ(n·(log log_{(1+ε)} L)·(log log L)) time.
Note that even when we allow lengths to be quite large (e.g. polynomial in n), the maintenance cost of our solution is still similar to the one that Roditty and Zwick achieved in [117] for undirected graphs with small integer lengths. Unfortunately, our distance query time is Õ(n) instead of the O(1) time obtained in [117]. So, the gain that we get over a naïve solution for the problem is that we are able to answer ((1 + ε)-approximate) shortest-path queries for any vertex pair in Õ(n) time, as opposed to the O(m + n log n) time required by Dijkstra's algorithm.

⁶The binary search just chooses i in the middle of the current range of values in which the desired value of i may lie (initially this range is [0, ⌈log Mmax/2Mmin⌉]); if the Distance(u, v, β_i) query returns ∞, then the left half of the range (including the i queried) is dropped, and otherwise the other half is dropped.
Proof of Theorem 4.3.5. Let G = (V, E) be the graph of our interest, and let l^i be the length vector of G after processing i requests from our sequence. Consider a u-v path p, for some u, v ∈ V, that is the shortest u-v path in G with respect to l^i, for some i. Our construction is based on the following simple observation: for any δ > 0 and for all i′ ≥ i, as long as for every arc e of p the length l_e^{i′} is at most (1 + δ)l_e^i, p remains a (1 + δ)-approximate shortest u-v path in G with respect to l^{i′}. Now, note that we have n² different (u, v) pairs, and each of the m arcs can increase its length by a factor of (1 + δ) at most ⌈log_{(1+δ)} L⌉ times. Therefore, this observation implies that for any δ > 0 there exists a set Q(δ) of O(mn²(log L)/δ) paths such that for any (u, v) and i there exists a u-v path p in Q(δ) whose length l^i(p) is within a factor of (1 + δ) of the length l^i(p∗) of the shortest (with respect to l^i) u-v path p∗ in G.
In the light of the above, our solution for the decremental all-pairs shortest path problem is based on maintaining an (ε/3, Ln, 1, Q̂(ε/3))-ADSP data structure R, where Q̂(ε/3) is the set constructed as in Definition 4.3.1 after we make P be the set of all paths in G and change the sampling probabilities p_j to min{10 ln(n⌈log_{(1+ε/3)} L⌉)/2^j, 1} for j = 1, . . . , ⌈log n⌉. Note that by a reasoning completely analogous to the one from the proof of Lemma 4.3.2, we can argue that with high probability Q(ε/3) ⊆ Q̂(ε/3). Also, by a straightforward adjustment of the construction from Theorem 4.3.4, we obtain an implementation of the (ε/3, Ln, 1, Q̂(ε/3))-ADSP data structure that has a total maintenance cost of Õ(mn·(log L)/ε) plus O(1) time per each Increase(·, ·) request, and that answers Path(·, ·, ·) and Distance(·, ·, ·) requests in Õ(n log log_{(1+ε)} L) time. Therefore, if we process the request sequence by just passing arc-length-increase requests to R, and answer each u-v shortest-path query by issuing to R a Path(u, v, β) query – where a (u, v)-accurate β is found through binary search using O(log log L) Distance(u, v, ·) requests (see the discussion after Definition 4.3.3) – then Definition 4.3.3 ensures that we obtain a correct (1 + ε)-approximate solution for the decremental all-pairs shortest path problem whose performance obeys the desired bounds. The theorem follows.
and where P̂ is a sparsification of P, as described in Definition 4.3.1. However, a straightforward implementation of this idea encounters some difficulties.

First, we have to justify the fact that, while answering shortest-path queries, we take into account mainly the paths in P̂, as opposed to the whole set P. Second, an important feature of Dijkstra's algorithm that is exploited in Fleischer's approximation scheme is the fact that whenever one computes the distance between a pair of vertices using this algorithm, it simultaneously computes all single-source shortest-path distances. Unfortunately, in our case we cannot afford to replicate this approach – the SSrcDist(·, ·) query that would be the equivalent of an execution of Dijkstra's algorithm here still takes Õ(m) instead of the desired Õ(n) time – so we need to circumvent this issue in a more careful manner. We address both these problems below.
Lemma 4.4.1. With high probability, for any length vector l, whenever F ≤ F∗, there exists a path p ∈ P̂ with F·l(p) ≤ Σ_e l_e u_e.

Proof. Let f∗ = (f∗_1, f∗_2, . . . , f∗_k) be some optimal multicommodity flow with Σ_i |f∗_i| = F∗. By Lemma 4.3.2 we know that with high probability P̂ contains all the flowpaths p_1, . . . , p_q of f∗. The fact that f∗ has to obey the capacity constraints implies that Σ_e l_e u_e ≥ Σ_{j=1}^q l(p_j) f∗(p_j). But OPT = Σ_i |f∗_i| = Σ_{j=1}^q f∗(p_j); an averaging argument shows that there exists a j∗ such that l(p_{j∗}) ≤ Σ_e l_e u_e / F∗ ≤ Σ_e l_e u_e / F, as desired.
the following natural approach to identifying admissible pairs for a given α̂: we cycle through all the source-sink pairs and query R for the corresponding distance; we stick with one pair as long as it is admissible, and move on once it becomes inadmissible. Clearly, the moment we have cycled through all the pairs, we know that all the pairs are inadmissible for α̂. The problem with this approach is that the resulting number of s-t distance queries is at least k, and thus this would lead to a somewhat prohibitive bottleneck of Ω̃(kn) in the running time (note that k can be Ω(n²)).
To alleviate this problem, we note that a very similar issue was already present in the algorithm from Section 4.2 – the way it was avoided there was by grouping the source-sink pairs according to common sources and exploiting the fact that Dijkstra's algorithm computes all single-source shortest-path distances simultaneously. Therefore, one execution of Dijkstra's algorithm either finds an (analog of an) admissible pair, or deems all the pairs sharing the same source inadmissible. Unfortunately, although our (δ, Mmax, Mmin, P̂)-ADSP data structure allows single-source shortest-path distance queries, these queries require Õ(m) time and we cannot afford to use them to obtain s-t distances in the manner it was done in Section 4.2 – this could cause Ω(m²ε⁻²) worst-case running time. We therefore devise a different method of circumventing this bottleneck. To describe it, let us assume that we are given some vertex s, and a set I(s) of source-sink pairs that have s as their common source and that have not yet been deemed inadmissible for our current value of α̂. Our procedure samples ⌈log n⌉ source-sink pairs from I(s) and checks whether any of them is admissible using the Distance(·, ·, ·) query. If so, then we return the admissible pair found. Otherwise, i.e. if none of them was admissible, we use the SSrcDist(·, ·) query to check which (if any) source-sink pairs in I(s) are inadmissible, remove them from the set I(s), and return an admissible pair (if any was found). We repeat the whole procedure – proceeding to the next possible source s whenever I(s) becomes empty – until we have examined all the source-sink pairs. The algorithm is summarized in Figure 4-5 – for convenience, we substituted for δ, Mmax, and Mmin the actual values used in our algorithm. The intuition behind this procedure is that if all ⌈log n⌉ samples from I(s) turned out to be inadmissible, then with probability at least (1 − 1/n) at least half of the pairs in I(s) are inadmissible, and therefore the SSrcDist(·, ·) query will reduce the size of I(s) by at least a half. This keeps the expected number of SSrcDist(·, ·) requests reasonably small.
Input : Graph G = (V, E, u), source-sink pairs {(s_i, t_i)}_i, accuracy parameter 1/2 > ε > 0, and a target flow value F
Output: Either a flow f, or "fail" indicating that F > F∗

Initialize the multiplicative-weights-update routine D from Figure 2-1 (for F being the set of s_i-t_i flows of value F in G), and let w be the corresponding weights vector
Initialize l_e ← w_e/u_e for each arc e ∈ E (cf. (4.1)), α̂ ← 1/U, and I(s) ← {i | s_i = s} for each s ∈ V   (∗ grouping source-sink pairs with common sources ∗)
Sample sets {S_j}_{j=1,...,⌈log n⌉} to indicate the set P̂ (see Definition 4.3.1)
Initialize the (ε/2, n^{2(1+ε)/ε}, 1/U, P̂)-ADSP data structure R as in Theorem 4.3.4
while True do
    I(s) ← {i | s_i = s} for each s ∈ V   (∗ initializing the pairs to be examined ∗)
    foreach s ∈ V with I(s) ≠ ∅ do   (∗ cycling through the commodities ∗)
        ⟨(s, t), I(s)⟩ ← Find_Admissible_Pair(ε, R, α̂, I(s))
        while I(s) ≠ ∅ do
            p ← R.Path(s, t, α̂)
            if l(p) > (1 + ε) Σ_e l_e u_e / F then return "fail"
            else
                Supply to D the flow f̃ sending F units of flow over p as an oracle answer
                Update the weights w as dictated by the routine D
                Use R.Increase(·, ·) requests to make the lengths l obey (4.1) with respect to the updated weights
                if D terminates then return the flow f computed by D
                ⟨(s, t), I(s)⟩ ← Find_Admissible_Pair(ε, R, α̂, I(s))
            end
        end
    end
    α̂ ← (1 + ε/2)α̂   (∗ increase the value of α̂ ∗)
end

Figure 4-4: Implementation of our algorithm for the maximum multicommodity flow problem.
Input : Accuracy parameter ε > 0, the (ε/2, n^{2(1+ε)/ε}, 1/U, P̂)-ADSP data structure R, a lower bound α̂ on min_{p∈P̂} l(p), and a set I(s) of source-sink pairs that might be admissible for α̂
Output: ⟨(s, t), I′(s)⟩, where I′(s) is a subset of I(s), and (s, t) is an admissible pair for α̂ if I′(s) ≠ ∅, and an arbitrary pair otherwise

for i = 1, . . . , ⌈log n⌉ do
    Sample a random source-sink pair (s, t) from I(s)
    if R.Distance(s, t, α̂) ≤ (1 + ε)α̂ then   (∗ checking admissibility for α̂ ∗)
        return ⟨(s, t), I(s)⟩
    end
end
(∗ No admissible pairs sampled ∗)
Use R.SSrcDist(s, α̂) to check admissibility for α̂ of all the source-sink pairs in I(s)
Let I′(s) be the subset of admissible pairs from I(s)
if I′(s) ≠ ∅ then
    return ⟨(s, t), I′(s)⟩ where (s, t) ∈ I′(s)
else
    return ⟨(s, t), ∅⟩ where (s, t) is an arbitrary pair
end

Figure 4-5: The procedure Find_Admissible_Pair(ε, R, α̂, I(s)).
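For concreteness, the following Python sketch mirrors the Find_Admissible_Pair procedure of Figure 4-5; the methods R.distance and R.ssrc_dist are assumed stand-ins for the Distance(·, ·, ·) and SSrcDist(·, ·) requests, and I_s is modeled as the (non-empty) set of sinks of the pairs sharing the source s.

```python
import math
import random

def find_admissible_pair(R, s, I_s, alpha, eps, n):
    """A sketch of Find_Admissible_Pair: sample ceil(log n) pairs, and fall
    back to one single-source query if none of the samples is admissible."""
    for _ in range(math.ceil(math.log2(n))):
        t = random.choice(tuple(I_s))
        # a pair is admissible for alpha if its distance is <= (1 + eps) * alpha
        if R.distance(s, t, alpha) <= (1 + eps) * alpha:
            return (s, t), I_s
    # none of the samples was admissible: one single-source query filters I(s)
    dist = R.ssrc_dist(s, alpha)          # distances from s to every vertex
    admissible = {t for t in I_s if dist[t] <= (1 + eps) * alpha}
    if admissible:
        return (s, next(iter(admissible))), admissible
    return (s, None), set()               # every pair for s is inadmissible
```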
Theorem 4.4.2. For any 1/2 > ε > 0, with high probability, one can obtain a (1 − O(ε))-approximate solution to the maximum multicommodity flow problem in expected Õ(mnε⁻²) time.
Proof. By our discussion at the beginning of Section 4.2, we just need to prove that, whenever F ≤ F∗, the algorithm presented in Figure 4-4 returns a flow f that, with high probability, exerts a congestion of at most (1 + ε) in G, and that it works in expected Õ(mnε⁻²) time.

To this end, let us first notice that, as we explicitly check whether l(p) ≤ (1 + ε) Σ_e l_e u_e / F, we know that whenever we pass a flow f̃ corresponding to the path p to the routine D, it is a correct answer of a (1 + ε)-oracle. As a result, we can use Lemma 2.8.4 to prove that the weights we are dealing with are never bigger than n^{2(1+ε)/ε}, and thus our value of Mmax is initialized correctly (as is Mmin, since l_e ≥ 1/U = Mmin for each e). Furthermore, by Theorem 2.8.3, we know that all we have to do is to prove that our algorithm never returns "fail" when F ≤ F∗, and that it never takes more than Õ(mnε⁻²) steps to terminate.
In order to establish the first fact, let l^j be the length vector l after the j-th oracle answer has been computed, and let α_j be the length of the shortest path in P̂ with respect to l^j. We want to prove that for every j, the value α̂_j of α̂ immediately before the (j + 1)-th augmentation is at most α_j. Once we do that, the definition of the (ε/2, n^{2(1+ε)/ε}, 1/U, P̂)-ADSP data structure and Lemma 4.4.1 will allow us to conclude that the algorithm never returns "fail" for F ≤ F∗.

To prove that α̂_j ≤ α_j for each j, assume for the sake of contradiction that this is not the case, and let j∗ be the first j for which α̂_{j∗} > α_{j∗}. By definition, this means that there exists an s-t path p ∈ P̂ with l^{j∗}(p) < α̂_{j∗}. Since l^{j∗}(p) ≥ min_e l_e = 1/U = α̂_0, it must have been the case that α̂_{j∗} > α̂_0 and the pair (s, t) was deemed inadmissible for α̂_{j∗}/(1 + ε/2) at some earlier point of the algorithm – otherwise the value of α̂ would never have increased up to α̂_{j∗}. But this is impossible, since the length of p could only increase over time and

    l^{j∗}(p) + (ε/2)·α̂_{j∗}/(1 + ε/2) < α̂_{j∗} + (ε/2)·α̂_{j∗}/(1 + ε/2) = (1 + ε)·α̂_{j∗}/(1 + ε/2).

Thus, by Theorem 4.3.4, we know that the Distance(s, t, α̂_{j∗}/(1 + ε/2)) query must have returned a value of at most (1 + ε) α̂_{j∗}/(1 + ε/2), which would deem the pair (s, t) admissible. This contradiction gives us the desired claim and proves the correctness of our algorithm.
Now, to bound the running time of the algorithm, we note that it is dominated by the total cost of maintaining our (ε/2, n^{2(1+ε)/ε}, 1/U, P̂)-ADSP data structure R and of servicing all the requests issued to it – all other operations performed by the algorithm can be amortized within this total cost.

Note that, by a reasoning analogous to the one we used in the case of the maximum s-t flow problem, we can assume that the largest capacity U (after normalizing the smallest one to be equal to 1) is bounded by a polynomial in m and 1/ε. Now, by Theorem 4.3.4, we obtain that the total expected maintenance cost of our data structure R is at most Õ(mn·(log n^{2(1+ε)/ε} U)/ε) = Õ(mnε⁻²). Also, note that by Lemma 2.8.9 (taking the tightness to be 1), there can be at most Õ(mε⁻²) oracle answers. As a result, the cost of serving all the Path(·, ·, ·) requests is at most Õ(mε⁻²)·Õ(n) = Õ(mnε⁻²), as desired. Furthermore, as each of the paths p returned can be assumed to consist of at most n arcs, and the only lengths that increase after the flow corresponding to p is supplied to the routine D are those of the arcs of p, the total cost of serving the Increase(·, ·) requests is at most Õ(mε⁻²) · n · O(1) = Õ(mnε⁻²) as well.
Now, to bound the time needed to service all the Distance(·, ·, ·) requests, we note that there can be at most Õ(mε⁻²) samplings of source-sink pairs during the Find_Admissible_Pair procedure that yield an admissible pair. This is so because finding an admissible pair results in an oracle answer. Thus, the total cost of the samplings that found some admissible pair is at most ⌈log n⌉ · Õ(mε⁻²) · Õ(n) = Õ(mnε⁻²). On the other hand, in the cases when sampling does not yield an admissible pair, the time needed to serve all the corresponding Distance(·, ·, ·) requests can be amortized within the time needed to serve the SSrcDist(·, ·) request that is always issued afterward.
Therefore, all that is left to do is to bound the expected service cost of the SSrcDist(·, ·) requests. We call a SSrcDist(s, α̂) request successful if it resulted in decreasing the size of I(s) by at least a half. Note that there can be at most ⌈log n⌉ successful SSrcDist(·, ·) requests per source and per value of α̂. As a result, the total time spent on answering successful requests is at most n⌈log n⌉ · log_{(1+ε/2)}(n^{2(1+ε)/ε} U) · Õ(m) = Õ(mnε⁻²), since we have at most log_{(1+ε/2)}(n^{2(1+ε)/ε} U) different values of α̂. On the other hand, to bound the time taken by serving unsuccessful SSrcDist(·, ·) requests, we notice that each (successful or not) SSrcDist(·, ·) query either empties one set I(s) for a given source s and value of α̂, or finds an admissible pair (which results in a flow augmentation); therefore, there can be at most Õ(mε⁻²) SSrcDist(·, ·) requests in total. Moreover, the probability that a particular query is unsuccessful is at most 1/n – this follows from the fact that if I(s) has more than half of its pairs admissible for a given α̂, then the probability that none among ⌈log n⌉ independent samples turns out to be admissible is at most (1/2)^{⌈log n⌉} ≤ 1/n. Therefore, the total expected cost of this type of requests is at most Õ(mε⁻²) · (1/n) · Õ(m) = Õ(m²/(nε²)) = Õ(mnε⁻²) (as m ≤ n²), as desired. The theorem follows.
Lemma 4.5.1. For any ρ > 0, length vector l, and 1 ≤ j ≤ ⌈log n⌉, if there exists a path p ∈ P(S_j, 2^j) of length l(p), then l^{[ρ/2^j]}(p) ≤ l(p) + ρ.
As suggested by the above lemma, the basic idea behind the construction of our (δ, Mmax, Mmin, P̂)-ADSP data structure is to maintain, for each j, exact shortest paths from a set larger than P(S_j, 2^j) (namely, P(S_j)), but with respect to the rounded version l^{[δMmin/2^j]} of the lengths l, and to cap the length of these paths at Mmax + δMmin. Note that we do not require our (δ, Mmax, Mmin, P̂)-ADSP data structure to output paths from P̂; thus, this approach yields a correct solution. Moreover, as we show in section 4.5.1, using existing tools from dynamic graph algorithms we can obtain an implementation of this approach whose performance is close to the one postulated by Theorem 4.3.4, but with a linear (as opposed to logarithmic) dependence of the maintenance cost on the ratio Mmax/Mmin, and a rather high service cost of the Increase(·, ·) requests. We alleviate these shortcomings in section 4.5.2, where we also prove Theorem 4.3.4.
4.5.1 Implementation of the (δ, Mmax, Mmin, P̂)-ADSP with Linear Dependence on Mmax/Mmin
An important feature of the rounded length vector l^{[ρ]}, for any ρ > 0, is that after dividing it by ρ we obtain a length vector that assigns integral lengths to arcs. Therefore, we are able to take advantage of existing tools for solving the decremental shortest path problem in dynamic graphs with integer arc lengths. We start by defining this problem formally.
Definition 4.5.2. For any integer r ≥ 0 and a set of paths Q ⊆ P, let the decremental (r, Q)-shortest path problem ((r, Q)-DSPP, for short) be a problem in which one maintains a directed graph G with positive integral weights on its arcs that supports four operations:

• Distance(u, v), for u, v ∈ V: returns the length of the shortest u-v path in Q if this length is at most r, and ∞ otherwise;

• Increase(e, t), for e ∈ E and integer t ≥ 0: increases the length of the arc e by t;

• Path(u, v), for u, v ∈ V: returns a u-v path of length Distance(u, v), as long as Distance(u, v) ≠ ∞;

• SSrcDist(u), for u ∈ V: returns the answer of the Distance(u, v) query for every v ∈ V.
We first state the following lemma, which is a simple and known extension of the classical construction of Even and Shiloach [60] (see also [119]).

Lemma 4.5.3. For any s ∈ V and positive integer r, an (r, P({s}))-DSPP data structure can be maintained in total time Õ(mr) plus additional O(log n) per each Increase(·, ·) request. Each Distance(·, ·) request can be answered in O(1) time, and each Path(·, ·) and SSrcDist(·) query in O(n) time.
Proof. First, we notice that to prove the lemma it is sufficient to show that we can maintain, within the desired time bounds, the single-source shortest-path tree of all v-s paths that have length at most r, for any v ∈ V. Indeed, once we achieve this, it will also imply that we can keep the single-source shortest-path tree of all s-v paths having length at most r, for any v ∈ V. Now, to implement the (r, P({s}))-DSPP data structure we just maintain both single-source shortest-path trees, and whenever queried with Distance(u, v) we answer Distance(u, s) + Distance(s, v) if this sum does not exceed r and (u, v) is a source-sink pair (i.e. the corresponding u-v path is in P), and ∞ otherwise. Note that our trees allow finding Distance(u, s) and Distance(s, u) in O(1) time. Similarly, we can answer a Path(u, v) request, as long as Distance(u, s) + Distance(s, v) ≤ r, by just concatenating the paths Path(u, s) and Path(s, v), which we can obtain from our trees in O(n) time. Finally, to answer a SSrcDist(u) request we just issue a Distance(u, v) request – which we already know how to handle – for each v ∈ V. It is easy to see that all the running times fit within the desired bounds as long as our maintenance of the single-source shortest-path trees obeys these bounds.
Our way of maintaining such a single-source shortest-path tree of all v-s paths is a simple extension of the classical construction of Even and Shiloach [60] (see also [119]), who showed how to dynamically maintain, in an unweighted directed graph, a decremental single-source shortest-path tree up to depth d in O(md) total running time.

In our data structure, each vertex v keeps a variable d[v] whose value is always equal to v's current distance to s. Moreover, each vertex v keeps a priority queue Q[v] of all its outgoing arcs, where the key of a particular arc (v, u) is equal to the current value of d[u] + l_{(v,u)}. We want the queue to support three operations: Insert(e, t), which adds an arc e to the queue with key equal to t; FindMin, which returns the arc with the smallest key; and SetKey(e, t), which sets the key of the arc e to t. By using e.g. a Fibonacci heap implementation of such a queue, FindMin can be performed in O(1) time, and each of the remaining operations in O(log n) amortized time.
The initialization of the data structure can easily be done by computing the single-source shortest-path tree from s using Dijkstra's algorithm and inserting arcs into the appropriate queues, which takes O(m log n) time. Also, Distance(·, s) requests can easily be answered in O(1) time by just returning d[v]. Finally, the implementations of the Path(·, s) and Increase(·, ·) operations can be found in Figure 4-6. Clearly, answering a Path(·, s) query takes at most O(n) time. Now, the total time needed to serve w Increase(·, ·) requests is at most O(log n) times the total number of Scan(·) calls. But since, for a particular arc e = (u, v), Scan(e) is called only if either Increase(e, ·) was called, or d[v] < ∞ and d[v] increased by at least one, we see that this number is at most m(r + 1) + w, which gives the desired running time.
Procedure Path(v, s):
    if v = s then
        return an empty path ∅
    else
        e = (v, u) ← Q[v].FindMin
        return Path(u, s) ∪ {e}
    end

Procedure Increase(e, t):
    l_e ← l_e + t
    Scan(e)

Procedure Scan(e):   (∗ e = (u, v) ∗)
    Q[u].SetKey(e, d[v] + l_e)
    f = (u, v′) ← Q[u].FindMin
    if d[v′] + l_f > d[u] then
        if d[v′] + l_f > r then d[u] ← ∞ else d[u] ← d[v′] + l_f
        foreach arc f′ incoming to u do Scan(f′)
    end

Figure 4-6: Implementation of the Path(·, s), Increase(·, ·), and Scan(·) operations.
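To illustrate the mechanics of Figure 4-6, here is a simplified, runnable Python sketch of this structure; for brevity the per-vertex priority queues are replaced by linear scans over outgoing arcs, so the sketch does not match the time bounds of Lemma 4.5.3, and all names are illustrative.

```python
import math

class DecrementalSSSP:
    """Maintains, under arc-length increases, the distance d[v] from every
    vertex v to a fixed sink s, declared infinite once it exceeds r."""

    def __init__(self, vertices, arcs, s, r):
        # arcs: dict mapping a directed arc (u, v) to its positive length
        self.arcs = dict(arcs)
        self.s, self.r = s, r
        self.out = {v: [] for v in vertices}
        self.inc = {v: [] for v in vertices}
        for (u, v) in self.arcs:
            self.out[u].append(v)
            self.inc[v].append(u)
        # initialize d[] by Bellman-Ford relaxations towards s
        self.d = {v: math.inf for v in vertices}
        self.d[s] = 0.0
        for _ in range(len(vertices) - 1):
            for (u, v), l in self.arcs.items():
                self.d[u] = min(self.d[u], self.d[v] + l)
        for v in vertices:
            if self.d[v] > r:
                self.d[v] = math.inf

    def distance(self, u):
        """Distance(u, s): the current (capped) distance from u to s."""
        return self.d[u]

    def increase(self, u, v, t):
        """Increase((u, v), t): raise the length of arc (u, v) by t >= 0."""
        self.arcs[(u, v)] += t
        self._scan(u)

    def _scan(self, u):
        # recompute d[u] from u's outgoing arcs; on an increase, propagate
        # the change to all arcs entering u (cf. procedure Scan of Fig. 4-6)
        if u == self.s:
            return
        new = min((self.d[v] + self.arcs[(u, v)] for v in self.out[u]),
                  default=math.inf)
        if new > self.d[u]:
            self.d[u] = new if new <= self.r else math.inf
            for w in self.inc[u]:
                self._scan(w)
```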
distance from u to s with respect to l. Note that the construction of the graph G_{u,S} can be performed in O(m) time – in particular, the length of each arc (u′, s) can be obtained by querying R_s with Distance(u′, s). It is easy to see that if we compute single-source shortest-path distances in G_{u,S} from u′ to all v ∈ V using Dijkstra's algorithm, then we can obtain the value of Distance(u, v) for each v ∈ V by just returning the computed value if it is at most r and (u, v) is a source-sink pair (so the corresponding u-v path is in P), and returning ∞ otherwise. Obviously, the total time required is O(m + n log n), as desired.
where we used the fact that the expected size of S_j is O(n log n/2^j). The lemma follows.
Lemma 4.5.6. For any δ > 0 and Mmax ≥ 2Mmin > 0, a (δ, Mmax, Mmin, P̂)-ADSP data structure can be maintained in total expected time Õ(mn·(log Mmax/Mmin)/δ) plus additional Õ(n log(1/δ)) per each Increase(·, ·) request in the processed sequence. Each Distance(·, ·, ·) and Path(·, ·, ·) query can be answered in Õ(n) time, and each SSrcDist(·, ·) query in Õ(m) time.
exceeds σ^b_{⌈log n⌉}, where ω′ is the total increase of the length of e from the beginning up to the current value of l_e – we call such an event an activation of e in R^b.

In the light of the above, our handling of an Increase(e, ω) request is as follows. Let b₋ be the largest b with 2M^b_max < l_e + ω, and let b₊ be the largest b with l_e + ω ≥ σ^b_{⌈log n⌉}. We start by deactivating e in all R^b with b ≤ b₋ in which e wasn't already deactivated, and activating e (by increasing the length of e to l_e) in all R^b with b₋ < b ≤ b₊ in which it wasn't activated yet. Next, we issue an Increase(e, ω) request to all R^b with b₋ < b ≤ b₊, and we increase l_e by ω. It is not hard to see that, by the above two observations, this procedure does not violate the correctness of our implementation of the (δ, Mmax, Mmin, P̂)-ADSP data structure. Now, to bound the time needed to service an Increase(·, ·) request, we note that each arc e can be activated and deactivated in each R^b at most once, and each such operation takes Õ(n) time. So, the total cost of these operations is at most Õ(mn log Mmax/Mmin), and this cost can be amortized within the total maintenance cost. To bound the time taken by processing the Increase(e, ω) requests passed to all R^b with b₋ < b ≤ b₊, we note that l_e + ω < 2M^{b₋+1}_max ≤ 2^{j+3}σ^{b₋+1}_j/δ for any j; thus b₊ − b₋ − 1 ≤ log(2^{⌈log n⌉+3}/δ), and the total service time is at most Õ(n log(1/δ)), as desired. The lemma follows.
To prove that our construction is a correct implementation of the (δ, Mmax, Mmin, P̂)-ADSP data structure, consider some s-t path p in P̂ whose length l(p) is at most 2β, for some β ∈ [Mmin, Mmax/2]. Now, upon being queried with a Distance(s, t, β) request, our data structure will return a value d′ = d + δβ/2, where d is the value returned by R as an answer to the Distance(s, t, β) request passed on to it. Since l̂(p) ≤ l(p), we have d′ = d + δβ/2 ≤ l̂(p) + δβ/2 + δβ/2 ≤ l(p) + δβ, as desired. Moreover, upon a Path(s, t, β) query we return a path p′ with l̂(p′) ≤ d. This means that

    l(p′) ≤ Σ_{e∈p′} ((1 + δ/8) l̂_e + δMmin/4n) ≤ (1 + δ/8) l̂(p′) + δβ/4,

since d ≤ l̂(p) + δβ/2 ≤ 2β + δβ/2 < 3β for δ < 1. The theorem follows.
desired performance is a bit involved. Therefore, for the sake of exposition, in what follows we stick to the original approach of Garg and Könemann. We want to emphasize that the dynamic-graph-based tools we developed in this chapter could be applied to either of these approaches.
The algorithm of Garg and Könemann that (1 − O(ε))-approximates the maximum concurrent flow problem works as follows. It starts with a zero flow f = ∅ and sets for each arc e an initial length l_e = γ/u_e, where γ = (m/(1 − ε))^{−1/ε}. The algorithm proceeds in phases, each consisting of k iterations. In iteration i, the algorithm tries to route d_i units of flow from s_i to t_i. This is done by repeating the following steps. First, a shortest (with respect to the current lengths l) s_i-t_i path p is found. Next, the bottleneck capacity u – the minimum of the bottleneck capacity of the path p and the remaining demand d̂_i still to be routed – is computed. Then, the flow f is augmented by routing u units of commodity i along the path p. Finally, the length of each arc e of the path p is increased by a factor of (1 + εu/u_e), and d̂_i is decreased by u. The entire procedure stops when D(l) := Σ_e l_e u_e – the volume of G with respect to l – is at least one. A summary of the algorithm can be found in Figure 4-7.
Input : Graph G = (V, E, u), source-sink pairs {(s_i, t_i)}_i with demands {d_i}_i, and an accuracy parameter ε > 0
Output: (Infeasible) flow f

Initialize f ← ∅ and l_e ← γ/u_e for all arcs e ∈ E, where γ = (m/(1 − ε))^{−1/ε}
while D(l) < 1 do
    for i := 1, . . . , k do
        d̂_i ← d_i
        while D(l) < 1 and d̂_i > 0 do
            Find the shortest path p in P_i
            Find the bottleneck capacity u of p   (∗ u ← min{d̂_i, min_{e∈p} u_e} ∗)
            d̂_i ← d̂_i − u
            Augment the flow f by routing u units of flow along the path p
            foreach arc e in p do l_e ← l_e(1 + εu/u_e)
        end
    end
end

Figure 4-7: The algorithm of Garg and Könemann for the maximum concurrent flow problem.
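For concreteness, the following Python sketch implements the scheme of Figure 4-7 directly, using plain Dijkstra for the shortest-path step (i.e. without the ADSP-based speed-up developed in this chapter); the dictionary-based graph representation is an assumption of the example.

```python
import heapq
import math

def garg_konemann(arcs, pairs, demands, eps):
    """Sketch of the Garg-Konemann scheme of Figure 4-7. `arcs` maps a
    directed arc (a, b) to its capacity; `pairs` and `demands` give the
    (s_i, t_i) pairs and d_i (each t_i assumed reachable from s_i).
    Returns the (infeasible) aggregate flow; by Lemma 4.6.2 it must be
    scaled down by log_{1+eps}(1/gamma) to become feasible."""
    m = len(arcs)
    gamma = (m / (1 - eps)) ** (-1 / eps)
    length = {e: gamma / u for e, u in arcs.items()}
    flow = {e: 0.0 for e in arcs}
    adj = {}
    for (a, b) in arcs:
        adj.setdefault(a, []).append(b)

    def shortest_path(s, t):
        # Dijkstra over the current lengths; returns the arcs of an s-t path
        dist, prev, pq = {s: 0.0}, {}, [(0.0, s)]
        while pq:
            d, a = heapq.heappop(pq)
            if d > dist.get(a, math.inf):
                continue
            for b in adj.get(a, []):
                nd = d + length[(a, b)]
                if nd < dist.get(b, math.inf):
                    dist[b], prev[b] = nd, a
                    heapq.heappush(pq, (nd, b))
        path, v = [], t
        while v != s:
            path.append((prev[v], v))
            v = prev[v]
        return path[::-1]

    def D():
        # D(l): the volume of G with respect to the current lengths
        return sum(length[e] * arcs[e] for e in arcs)

    while D() < 1:                                  # phases
        for (s, t), d_i in zip(pairs, demands):
            rem = d_i                               # remaining demand
            while D() < 1 and rem > 0:
                p = shortest_path(s, t)
                u = min(min(arcs[e] for e in p), rem)   # bottleneck
                rem -= u
                for e in p:
                    flow[e] += u
                    length[e] *= 1 + eps * u / arcs[e]
    return flow, gamma
```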
Lemma 4.6.2. After t − 1 phases, the algorithm has routed (t − 1)d_i units of each commodity i. Scaling the final flow by log_{1+ε} 1/γ yields a feasible flow that achieves a rate θ = (t − 1)/log_{1+ε} 1/γ.
Lemma 4.6.3. If θ∗ ≥ 1, then the final flow scaled down by log_{1+ε} 1/γ is feasible and has a value of at least (1 − 3ε)θ∗.

As we ensured that 1 ≤ θ∗ ≤ km, it is easy to see that the above lemmas imply that the algorithm in Figure 4-7, after at most 1 + θ∗ log_{1+ε} 1/γ ≤ 1 + km log_{1+ε} 1/γ phases, returns a (1 − 3ε)-approximation for the maximum concurrent flow. Unfortunately, the bound of 1 + km log_{1+ε} 1/γ on the number of phases is not sufficient to obtain the Õ((m + k)mε⁻²) running time (in fact, it only establishes a time bound of Õ((m + k²m)mε⁻²)). To reduce this dependence of the number of phases on θ∗, Garg and Könemann use a halving technique developed in [112]. They run the algorithm, and if it does not stop after T := 2 log_{1+ε} 1/γ + 1 phases, then, by Lemma 4.6.1, it must be that θ∗ > 2. In this case, they multiply the demands by two, so θ∗ is halved and still at least one. Next, they continue the algorithm and keep doubling the demands again whenever it does not stop after T more phases. Clearly, since initially θ∗ ≤ km, after repeating such doubling at most log km times the algorithm stops, and thus the total number of phases is at most T log km. The number of phases can be reduced further to O(T) by first applying the above approach with a constant ε to get a constant-factor approximation for θ∗ – this takes O(log² km) phases – and then, with at most O(T) more phases, the (1 − 3ε)-approximation is obtained. Having established this bound on the number of phases, the bound of Õ((m + k)mε⁻²) on the running time of the whole algorithm follows easily.
Input : Graph G = (V, E, u), source-sink pairs {(s_i, t_i)}_i with demands {d_i}_i, and an accuracy parameter ε > 0
Output: (Infeasible) flow f

Initialize f ← ∅ and l_e ← γ/u_e for all arcs e ∈ E, where γ = (m/(1 − ν))^{−1/ν} and ν := ε(1 + ε)
Sample sets {S_j}_{j=1,...,⌈log n⌉} to indicate the set P̂ (see Definition 4.3.1)
Initialize the (ε/2, 1, γ/U, P̂)-ADSP data structure R as in Theorem 4.3.4
while D(l) < 1 do
    for i := 1, . . . , k do
        d̂_i ← d_i
        while D(l) < 1 and d̂_i > 0 do
            p ← R.Path(s_i, t_i, β), where β is guessed via binary search
            Find the bottleneck capacity u of p   (∗ u ← min{d̂_i, min_{e∈p} u_e} ∗)
            d̂_i ← d̂_i − u
            Augment the flow f by routing u units of flow along the path p
            foreach arc e in p do R.Increase(e, εu·l_e/u_e); l_e ← l_e(1 + εu/u_e)
        end
    end
end

Figure 4-8: Our implementation of the procedure from Figure 4-7 that uses the (ε/2, 1, γ/U, P̂)-ADSP data structure.
4.6.1 Our Algorithm

We now present a more efficient approximation scheme for the maximum concurrent flow problem. Similarly to the case of the maximum multicommodity flow problem, our improvement is based on making the algorithm from Figure 4-7 find the (approximately) shortest paths using the (ε/2, 1, γ/U, P̂)-ADSP data structure instead of Dijkstra's algorithm. Our new implementation of the procedure from Figure 4-7 is presented in Figure 4-8. Note that whenever the algorithm issues a Path(u, v, ·) request, it uses binary search to find a (u, v)-accurate β that yields a small enough error – see the discussion after Definition 4.3.3 for more details.

Let us define α(l) := Σ_i d_i dist_i(l), where dist_i(l) is the length of the shortest s_i-t_i path in P̂ with respect to the lengths l. As was the case for the maximum multicommodity flow problem, we need to justify the fact that we focus our attention on the shortest paths in P̂ instead of the whole P.
Lemma 4.6.4. With high probability, for any length vector l, α(l) ≤ D(l)/θ∗, where D(l) := Σ_e l_e u_e.

Proof. Similarly to the proof of Lemma 4.4.1, let us fix an optimal solution f∗ = (f∗_1, . . . , f∗_k) that achieves the ratio θ∗. By Lemma 4.3.2, with high probability, each p_j belongs to P̂ ∩ P_{i(j)}, where p_1, . . . , p_q are the flowpaths of f∗ and i(j) is the corresponding commodity. Now, since f∗ obeys the capacity constraints, we have that

    D(l) = Σ_e l_e u_e ≥ Σ_j l(p_j) f∗_{i(j)}(p_j) ≥ θ∗ Σ_{i=1}^k d_i min_{p∈P̂∩P_i} l(p) ≥ θ∗ α(l).
Theorem 4.6.5. For any 0.15 > ε > 0, there exists a Monte Carlo algorithm that finds a (1 − 4ε)-approximate solution for the maximum concurrent flow problem in expected time Õ((m + k)nε⁻² log U).
Proof. As already mentioned, we will be using the algorithm of Garg and Könemann in which the procedure from Figure 4-8 is used instead of the one presented in Figure 4-7. To bound the running time of this new algorithm, we recall that the preprocessing that ensures that 1 ≤ θ∗ ≤ km can be performed in Õ(mn) time. Also, we use the halving technique to ensure that the running time of the rest of the algorithm is bounded by the time needed to execute O(T) = O(log_{1+ε} 1/γ) phases of the procedure in Figure 4-8.

Now, it is easy to see that this running time is dominated by the cost of maintaining the data structure R and by the time needed to answer the Path(·, ·, ·) requests (note that there are at most O(n) Increase(·, ·) requests per each Path(·, ·, ·) one). By Theorem 4.3.4 we know that the maintenance cost is Õ(mn·(log U/γ)/ε) = Õ(mnε⁻¹(ε⁻¹ + log U)) = Õ(mnε⁻² log U). To upper-bound the time needed to answer the Path(·, ·, ·) requests, we note that each such request results in an augmentation of the flow, and each augmentation of the flow either increases the length of at least one arc by a factor of (1 + ε), or is the last augmentation for a given commodity in the current phase. But no arc can have length bigger than (1 + ε)/u_e, because l_e u_e ≤ D(l) and we stop when D(l) ≥ 1; thus, the total number of augmentations of the flow is at most m⌊log_{1+ε}((1 + ε)/γ)⌋ + k·O(T) = Õ(mε⁻² + kT), which results in Õ((m + k)nε⁻² log log(U/γ)) = Õ((m + k)nε⁻² log log U) time needed to answer all the Path(·, ·, ·) requests (including the time needed to perform each binary search for β). As a result, the algorithm works in Õ((m + k)nε⁻² log U) time, as desired.
We proceed now to lower-bounding the value of the final flow f computed by our algorithm after scaling it down by the maximal congestion of the arcs. To this end, for a given phase j and 1 ≤ q ≤ q_j, let f_{j,q} be the flow of commodity i(j, q) that was routed along the s_{i(j,q)}-t_{i(j,q)} path p_{j,q} in the q-th augmentation of the flow f in phase j, where q_j is the total number of flow augmentations during phase j. Let l_{j,q} be the length vector l after routing the flow f_{j,q}. For any j and q ≥ 1, the fact that we always find an (s_{i(j,q)}, t_{i(j,q)})-accurate β for the Path(s_{i(j,q)}, t_{i(j,q)}, β) query implies that l_{j,q−1}(p_{j,q}) ≤ (1 + ε) dist_{i(j,q)}(l_{j,q−1}). Therefore, we have that

    D(j) ≤ D(j − 1)/(1 − ν/θ∗).

But D(0) = Σ_e (γ/u_e) u_e = mγ, so for j ≥ 1

    D(j) ≤ mγ/(1 − ν/θ∗)^j
         = mγ/(1 − ν/θ∗) · (1 + ν/(θ∗ − ν))^{j−1}
         ≤ mγ/(1 − ν/θ∗) · e^{ν(j−1)/(θ∗−ν)}
         ≤ mγ/(1 − ν) · e^{ν(j−1)/(θ∗(1−ν))},
The algorithm stops at the first phase j_f during which D(l) ≥ 1. Thus,

    1 ≤ D(j_f) ≤ mγ/(1 − ν) · e^{ν(j_f−1)/(θ∗(1−ν))},

which gives

    j_f − 1 ≥ θ∗(1 − ν) ln((1 − ν)/(mγ)) / ν.

Since we had j_f − 1 successfully completed phases, the flow produced by our procedure routes at least (j_f − 1)d_i units of each commodity i. Unfortunately, this flow may not be feasible – it may violate some capacity constraints. But in our algorithm the length l_e of any arc e can never become bigger than (1 + ε)/u_e. Thus, the fact that each arc starts with length l_e := γ/u_e, and that each time a full unit of flow is routed through it its length increases by a factor of at least (1 + ε), implies that the congestion incurred at e can be at most ⌊log_{1+ε}((1 + ε)/γ)⌋. Therefore, we see that the rate θ achieved by the final flow after scaling it down is at least

    θ ≥ (j_f − 1)/⌊log_{1+ε}((1 + ε)/γ)⌋ ≥ θ∗(1 − ν) ln((1 − ν)/(mγ)) / (ν log_{1+ε} 1/γ).
Plugging in γ = (m/(1 − ν))^{−1/ν} and unwinding the definition of ν yields
Chapter 5
(Multi-)Cut-Based Minimization
Problems in Undirected Graphs
The focus of this chapter is the class of (multi-)cut-based minimization problems. This class of problems captures a variety of basic graph optimization tasks, including the minimum cut problem, the minimum s-t cut problem, the (generalized) sparsest cut problem, the balanced separator problem, the minimum multi-cut problem, the minimum multi-way-cut problem, and many other graph partitioning primitives. We present a general method of designing fast approximation algorithms for this class of problems in undirected graphs. In particular, we develop a technique that, given any such problem that can be approximated quickly on trees, allows us to approximate it almost as quickly on general graphs while losing only a poly-logarithmic factor in the approximation guarantee.

To illustrate the applicability of this paradigm, we consider the undirected generalized sparsest cut problem and the balanced separator problem. By a simple use of the developed framework, we obtain the first poly-logarithmic approximation algorithms for these problems that run in time close to linear.
The main tool behind the results of this chapter is an efficient procedure that decomposes general graphs into simpler ones while approximately preserving their cut structure. This decomposition is inspired by the graph decomposition technique of Räcke that was developed in the context of oblivious routing schemes; it employs some of the tools described in Section 2.6, and draws upon some high-level ideas underlying the iterative approach of Spielman and Teng to solving Laplacian systems (cf. Section 2.7).
5.1 Introduction
(Multi-)cut-based graph problems are ubiquitous in optimization. They have been extensively studied – both from a theoretical and an applied perspective – in the context of flow problems, the theory of Markov chains, geometric embeddings, clustering, community detection, VLSI layout design, and as a basic primitive for divide-and-conquer approaches (cf. [123]). For the sake of concreteness, we define an optimization problem P to be an (undirected) multi-cut-based (minimization) problem if its every instance P ∈ P can be cast as a task of finding – for a given input undirected capacitated graph G = (V, E, u) – a partition C⃗∗ := {C∗_i}_{i=1}^{k∗} of the vertices of V such that:

    C⃗∗ = argmin_{C⃗ is a partition of V} u(C⃗) f_P(C⃗),    (5.1)
where u(C⃗) is the sum of the capacities of all the edges of G whose endpoints are in different C_i's, and f_P is an arbitrary non-negative function of C⃗ that depends on P, but not on the edge set E (and the capacities u) of the graph G. It is not hard to see that a large number of graph problems fit this definition. For instance, the minimum cut problem (cf. [79, 106, 89, 132, 87]) corresponds to taking f_P(C⃗) to be 1 if C⃗ partitions the graph into exactly two pieces (i.e., it corresponds to a cut), and +∞ otherwise; and the minimum s-t cut problem (that was discussed in Chapter 3) corresponds to further restricting f_P(C⃗) by making it equal to 1 only if C⃗ corresponds to a cut that separates s and t, and +∞ otherwise. Also, for an example of a multi-cut problem captured by this definition, note that the multi-way cut problem, in which we need to find a minimum capacity set of edges that disconnects some set of terminals (cf. [49, 50, 47, 48, 88]), corresponds to taking f_P(C⃗) to be equal to 1 if C⃗ has exactly one of the terminals in each part of the partition, and +∞ otherwise.
For the sake of simplicity, from now on we will focus only on cut-based minimization problems – the problems in which the function f_P(C⃗) is equal to +∞ whenever the partition C⃗ consists of more than two parts. These problems can be viewed as the ones that can be cast as a task of finding a cut C∗ such that:

    C∗ = argmin_C u(C) f_P(C),    (5.2)

where u(C) is the capacity of the cut C in G and f_P, once again, is a non-negative function that depends on P and C, but not on the set E of edges of G and their capacities.

Even though we develop the framework for this special case, everything easily transfers over to the general multi-cut-based setting. The reason behind this is that for any partition C⃗ = {C_i}_{i=1}^k, we can express u(C⃗) as a linear combination of the capacities of the cuts {C_i}_i. Namely, we have

    u(C⃗) = (1/2) Σ_{i=1}^k u(C_i).
In the generalized sparsest cut problem, we are given a graph G = (V, E, u) and a demand graph D = (V, E_D, d). Our task is to find a cut C∗ that minimizes the (generalized) sparsity u(C)/d(C) among all the cuts C of G. Here, d(C) is the total demand Σ_{e∈E_D(C)} d_e of the demand edges E_D(C) of D cut by the cut C. Cast in our terminology, the problem is to find a cut C∗ being argmin_C u(C)f_P(C) with f_P(C) := 1/d(C).

An important problem related to the generalized sparsest cut problem is the sparsest cut problem. In this problem, one aims at finding a cut C∗ that minimizes the sparsity u(C)/min{|C|, |C̄|} among all the cuts C of G. Note that if we consider an instance of the generalized sparsest cut problem corresponding to the demand graph D being complete and each demand being 1/|V| (this special case is sometimes called the uniform sparsest cut problem), then the corresponding generalized sparsity of every cut C is within a factor of two of its sparsity. Therefore, up to this constant factor, one can view the sparsest cut problem as a special case of the generalized sparsest cut problem.

On the other hand, the balanced separator problem corresponds to the task of finding a cut that minimizes the sparsity among all the c-balanced cuts C in G, i.e. among all the cuts C with min{|C|, |C̄|} ≥ c|V|, for some specified constant c > 0 called the balance constant of the instance. In our framework, the function f_P(C) corresponding to this problem is equal to 1/min{|C|, |C̄|} if min{|C|, |C̄|} ≥ c|V|, and equal to +∞ otherwise.
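As a small illustration of these objective functions, the following Python helpers evaluate the cut capacity u(C), the generalized sparsity u(C)/d(C), and the balanced-separator objective u(C)·f_P(C) for an explicit cut; the dict-based graph representation is an assumption made for the example.

```python
import math

def cut_capacity(edges, C):
    """u(C): total capacity of the edges with exactly one endpoint in C.
    `edges` maps an undirected edge (u, v) to its capacity; C is a vertex set."""
    return sum(cap for (u, v), cap in edges.items() if (u in C) != (v in C))

def generalized_sparsity(edges, demand_edges, C):
    """u(C)/d(C), the objective of the generalized sparsest cut problem;
    d(C) is computed from the demand graph exactly like a cut capacity."""
    d = cut_capacity(demand_edges, C)
    return math.inf if d == 0 else cut_capacity(edges, C) / d

def balanced_separator_objective(edges, C, n, c):
    """u(C) * f_P(C) for the balanced separator problem: the sparsity of C
    when C is c-balanced, and +infinity otherwise."""
    small = min(len(C), n - len(C))
    return cut_capacity(edges, C) / small if small >= c * n else math.inf
```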
and Rao [100], who used a linear programming relaxation of the maximum concurrent flow problem to establish the first poly-logarithmic approximation algorithms for many graph partitioning problems – in particular, an O(log n)-approximation for the sparsest cut and the balanced separator problems. Both these algorithms can be implemented to run in Õ(n²) time.

These spectral and flow-based approaches were combined in a seminal result of Arora, Rao, and Vazirani [17] to give an O(√log n)-approximation for the sparsest cut and the balanced separator problems.

In the case of the generalized sparsest cut problem, the results of Linial, London, and Rabinovich [101], and of Aumann and Rabani [19], give O(log r)-approximation algorithms, where r is the number of vertices of the demand graph that are endpoints of some demand edge. Subsequently, Chawla, Gupta, and Räcke [37] extended the techniques from [17] to obtain an O(log^{3/4} r)-approximation. This approximation ratio was later improved by Arora, Lee, and Naor [16] to O(√log r · log log r).
Õ(n² log U) time is still the fastest one known. This folklore result is obtained by combining an efficient implementation [26, 65] of the algorithm of Leighton and Rao [100] with the results of Linial, London, and Rabinovich [101], and of Aumann and Rabani [19].
(Uniform) sparsest cut and balanced separator problem:

  Algorithm                        Approximation ratio             Running time
  Alon-Milman [9]                  O(Φ^{−1/2})                     Õ(m/Φ²)
                                   (Ω(n) worst-case)               (Õ(m) using [130])
  Andersen-Peres [12]              Õ(Φ^{−1/2})                     Õ(m/Φ^{3/2})
                                   (Ω̃(n) worst-case)
  Leighton-Rao [100]               O(log n)                        Õ(n²)
  Arora-Rao-Vazirani [17]          O(√log n)                       polynomial time
  Arora-Hazan-Kale [13]            O(√log n)                       Õ(n²)
  Khandekar-Rao-
  -Vazirani [91]                   O(log² n)                       Õ(m + n^{3/2})
  Arora-Kale [15]                  O(log n)                        Õ(m + n^{3/2})
  Orecchia-Schulman-
  -Vazirani-Vishnoi [109]          O(log n)                        Õ(m + n^{3/2})
  Sherman [122]                    O(√(log n/ε))                   Õ(m + n^{3/2+ε})
                                                                   (Õ(m + n^{4/3+ε}) using the
                                                                   algorithm from Chapter 3)
  this thesis, k = 1               (log^{3/2+o(1)} n)/√ε           Õ(m + n^{8/7+ε})
  this thesis, k = 2               (log^{5/2+o(1)} n)/√ε           Õ(m + n^{16/15+ε})
  this thesis, k ≥ 1               (log^{(1+o(1))(k+1/2)} n)/√ε    Õ(m + 2^k n^{1+1/(4·2^k−1)+ε})

Generalized sparsest cut problem:

  Algorithm                        Approximation ratio             Running time
  Folklore
  ([100, 101, 19, 26, 65])         O(log r)                        Õ(n² log U)
  Chawla-Gupta-Räcke [37]          O(log^{3/4} r)                  polynomial time
  Arora-Lee-Naor [16]              O(√log r · log log r)           polynomial time
  this thesis, k = 2               log^{2+o(1)} n                  Õ(m + |E_D| + n^{4/3} log U)
  this thesis, k = 3               log^{3+o(1)} n                  Õ(m + |E_D| + n^{8/7} log U)
  this thesis, k ≥ 1               log^{(1+o(1))k} n               Õ(m + |E_D| + 2^k n^{1+1/(2^k−1)} log U)

Figure 5-1: Here, n denotes the number of vertices of the input graph G, m the number of its edges, Φ its conductance, U its capacity ratio, and r is the number of vertices of the demand graph D = (V, E_D, d) that are endpoints of some demand edge. The Õ(·) notation suppresses poly-logarithmic factors. The algorithm of Alon and Milman applies only to the sparsest cut problem.
5.1.4 Overview of the Techniques

The approach we take is inspired by the cut-based graph decomposition of Räcke [114], which was developed in the context of oblivious routing schemes (see [113, 21, 28, 80] for some previous work on this subject). This decomposition is a powerful tool in designing approximation algorithms for various undirected cut-based problems.² It is based on finding, for a given graph G with n vertices and m edges, a convex combination {λ_i}_i of decomposition trees {T_i}_i such that G is embeddable into each T_i, and this convex combination can be embedded into G with O(log n) congestion. One employs this decomposition by first approximating the desired cut-based problem on each tree T_i – this usually yields a much better approximation ratio than on general instances – and then extracting from the obtained solutions a solution for the graph G, while incurring only an additional O(log n) factor in the approximation guarantee.

The key question motivating our results is: can the above paradigm be applied to obtain very efficient approximation algorithms for cut-based graph problems? After all, one could envision a generic approach to designing fast approximation algorithms for such problems in which one first decomposes an input graph G into a convex combination of structurally simple graphs G_i (e.g. trees), solves the problem quickly on each of these easy instances, and then combines these solutions to obtain a solution for the graph G, while losing some (e.g. poly-logarithmic) factor in the approximation guarantee as the price of this speed-up. Clearly, the viability of this scenario depends critically on how fast a suitable decomposition can be computed and how 'easy' the graphs G_i are from the point of view of the problems we want to solve.

We start the investigation of this approach by noticing that if one is willing to settle for approximation algorithms of Monte Carlo type then, given a decomposition of the graph G into a convex combination {(λ_i, G_i)}_i, one does not need to compute the solution for each of the G_i's to get a solution for G. It is sufficient to solve the problem on just a small number of G_i's sampled from the distribution described by the λ_i's.

However, even after making this observation, one still needs to compute the decomposition of the graph to be able to sample from it, and, unfortunately, Räcke's decomposition – which was aimed at obtaining algorithms that are merely polynomial-time – has serious limitations when one is interested in time-efficiency. In particular, the running time of his decomposition procedure is dominated by Õ(m) all-pairs shortest path computations and is thus prohibitively large from our point of view – cf. Section 5.3.1 for more details.

To circumvent this problem, we design an alternative and more general graph decomposition (cf. Theorem 5.3.6). Similarly to the case of Räcke's decomposition, our construction is based on combining a multiplicative-weights-update-based routine (cf. Section 2.8) with the technique of embedding graph metrics into tree metrics. However – instead of the embedding result of Fakcharoenphol, Talwar, and Rao [62] that was used by Räcke – we employ the nearly-linear time algorithm of Abraham, Bartal, and Neiman [1] for finding low-average-stretch spanning trees (cf. Theorem 2.6.3). This choice allows for a much more efficient implementation of our decomposition procedure, at a cost of decreasing the quality of the provided approximation from O(log n) to Õ(log n). Also, even more crucially, inspired by the construction of the ultrasparsifiers of Spielman and Teng [129, 130] (see Section 2.7 for an overview of this construction), we allow flexibility in choosing the type of graphs into which our graph is decomposed. In this way, we are able to offer a trade-off between the structural simplicity of these graphs and the time needed to compute the corresponding decomposition.

Finally, by a recursive use of our decomposition – together with the sparsification technique – we are able to leverage the above-mentioned flexibility to design a procedure whose running time can be made arbitrarily close to nearly-linear and that, informally speaking, allows us to sample from a decomposition of our input graph G into graphs whose structure is arbitrarily simple, but at a cost of getting a proportionally worse reflection of the cut structure of G – see Theorem 5.3.7 for more details.

²As explained above, in the rest of the chapter we will focus only on cut-based problems, but all the results we obtain hold also for the multi-cut-based setting.
5.2 Preliminaries

The object of study in this chapter will be an undirected capacitated graph G = (V, E, u). By our convention, we will denote by U the capacity ratio of G, being the maximum possible ratio between the capacities of two edges of G, i.e. U := max_{e,e′∈E} u(e)/u(e′).

We will say, for some V′ ⊆ V, that a graph G′ = (V′, E′, u′) is the subgraph of G induced by V′ if E′ consists of all the edges of G whose both endpoints are in V′ and the capacity vector u′ on these edges is inherited from the capacity vector u of G.
graphs, we adjust its definition to this setting. In the maximum concurrent flow problem, we are given an undirected capacitated graph G = (V, E, u), as well as a set of k source-sink pairs {(s_i, t_i)}_i and corresponding positive demands {d_i}_i. The goal is to find a multicommodity flow f = (f_1, . . . , f_k) that maximizes the rate θ, where each f_i is an s_i-t_i flow of value θd_i and f is feasible, i.e. has |f(e)| ≤ u_e for each edge e. Here, |f(e)| := Σ_i |f_i(e)| is the total flow routed along the edge e by f.

For reasons that will become clear soon, in this chapter we will describe maximum concurrent flow instances in a slightly different – although completely equivalent – way. Namely, we will represent the source-sink pairs {(s_i, t_i)}_i and the demands {d_i}_i via a demand graph D = (V, E_D, d). In this graph, each source-sink pair (s_i, t_i) with demand d_i is represented by an edge (s_i, t_i) ∈ E_D with demand d_{(s_i,t_i)} = d_i. Clearly, this correspondence allows us not only to represent a set of source-sink pairs as a demand graph, but also to view any demand graph as a maximum concurrent flow problem instance.³
5.2.2 Embeddability

A notion we will be dealing with extensively is that of a graph embedding.

Definition 5.2.1. For given graphs G = (V, E, u) and Ḡ = (V, Ē, ū), by an embedding f of Ḡ into G we mean a multicommodity flow f = (f_{ē_1}, . . . , f_{ē_{|Ē|}}) in G with |Ē| commodities indexed by the edges of Ḡ, such that for each 1 ≤ i ≤ |Ē|, the flow f_{ē_i} routes in G ū_{ē_i} units of flow between the endpoints of ē_i.

Definition 5.2.2. For given t ≥ 1 and graphs G = (V, E, u) and Ḡ = (V, Ē, ū), we say that Ḡ is t-embeddable into G if there exists an embedding f of Ḡ into G such that for all e ∈ E, |f(e)| ≤ t·u_e. We say that Ḡ is embeddable into G if Ḡ is 1-embeddable into G.

Intuitively, the fact that Ḡ is t-embeddable into G means that we can fractionally pack all the edges of Ḡ into G once all of G's capacities u are multiplied by t. Also, it is easy to see that for given graphs G and Ḡ, one can always find the smallest possible t such that Ḡ is t-embeddable into G by just solving the maximum concurrent flow problem in G in which we treat Ḡ as a demand graph with the demands given by its capacities.

³The fact that we use the notion of a demand graph in the definition of both the maximum concurrent flow problem and the generalized sparsest cut problem is not a coincidence. These two problems turn out to be closely related – see [101] and [19].
5.3 Graph Decompositions and Fast Approximation Algorithms for Cut-based Problems

The central concept of our approach is the notion of an (α, 𝒢)-decomposition. This notion is a generalization of the cut-based graph decomposition introduced by Räcke [114].

Definition 5.3.1. For any α ≥ 1 and some family of graphs 𝒢, by an (α, 𝒢)-decomposition of a graph G = (V, E, u) we mean a set of pairs {(λ_i, G_i)}_i, where for each i, λ_i > 0 and G_i = (V, E_i, u_i) is a graph in 𝒢, such that:

(a) Σ_i λ_i = 1;

(b) for each i, G is embeddable into G_i;

(c) there exist embeddings f_i of each G_i into G such that for each e ∈ E,

    Σ_i λ_i |f_i(e)| ≤ α u_e.

Intuitively, the above conditions say that G is embeddable into each of the G_i's, and one can pack the convex combination of these G_i's into G while exerting a congestion of at most α in G.

The crucial property of an (α, 𝒢)-decomposition is that it captures the cut structure of the graph G up to a factor of α. We formalize this statement in the following easy-to-prove fact.

Fact 5.3.2. For any α ≥ 1, graph G = (V, E, u), and family of graphs 𝒢, if {(λ_i, G_i)}_i is an (α, 𝒢)-decomposition of G then for any cut C of G:

    u(C) ≤ u_i(C) for each i,   and   Σ_i λ_i u_i(C) ≤ α u(C),

where u_i(C) denotes the capacity of the cut C in G_i.

Note that condition (a) from Definition 5.3.1 implies that {(λ_i, G_i)}_i is a convex combination of the graphs {G_i}_i. Therefore, one of the implications of Fact 5.3.2 is that for any cut C in G, not only is the capacity of C in every G_i lowerbounded by its capacity u(C) in G, but there also always exists a graph G_j in which the capacity of C is at most αu(C). As it turns out, this property alone allows one to reduce the task of αβ-approximating any cut-based minimization problem in G, for any β ≥ 1, to the task of β-approximating this problem in each of the G_i's.
To see why this is the case, consider an instance P of some cut-based minimization problem and let f_P be the function corresponding to this instance (cf. (5.2)). Let C_i, for each i, be some β-optimal solution to the instance P in G_i – we can obtain these solutions by just running the promised β-approximation algorithm on the G_i's. Now, let C := argmin_i u(C_i)f_P(C_i) be the one among the solutions C_i that constitutes the best solution for the original G. We claim that C is an αβ-approximate solution for G. This is so since, if C∗ is the optimal solution for G, then – as we observed above – there exists a j such that u_j(C∗) ≤ αu(C∗). But this means that

    u(C) f_P(C) ≤ u(C_j) f_P(C_j) ≤ u_j(C_j) f_P(C_j) ≤ β u_j(C∗) f_P(C∗) ≤ αβ u(C∗) f_P(C∗),

where the first inequality follows from the definition of C, the second one from the cut-lowerbounding property of the (α, 𝒢)-decomposition, and the third one from the fact that C_j was a β-approximate solution for G_j. So, C is indeed an αβ-approximate solution for G.
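A minimal Python sketch of this reduction (and of the Monte Carlo variant introduced next) may be helpful; the callables approx_solve and cost, and the list-of-pairs representation of the decomposition, are assumptions of the example rather than interfaces from the thesis.

```python
import math
import random

def best_of_decomposition(G, decomposition, approx_solve, cost):
    """Given an (alpha, G)-decomposition [(lambda_i, G_i), ...] of G,
    beta-approximate the cut problem on every G_i with approx_solve and
    return the candidate cut that is best when evaluated back in G, where
    cost(G, C) = u(C) * f_P(C); the result is alpha*beta-approximate."""
    candidates = [approx_solve(G_i) for _, G_i in decomposition]
    return min(candidates, key=lambda C: cost(G, C))

def best_of_sampled(G, decomposition, approx_solve, cost, n):
    """Monte Carlo variant: solve only on O(ln n) graphs G_i sampled
    according to the distribution given by the lambda_i's."""
    lambdas, graphs = zip(*decomposition)
    sampled = random.choices(graphs, weights=lambdas,
                             k=max(1, math.ceil(math.log(n))))
    return min((approx_solve(G_i) for G_i in sampled),
               key=lambda C: cost(G, C))
```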
In the above, we reduced the task of approximating a cut-based minimization problem to the task of approximating this problem in all the G_i's. However, it turns out that if one is willing to settle for Monte Carlo-type approximation guarantees, one just needs to approximately solve this problem in only a small number of the G_i's. To make this statement precise, let us introduce the following definition.

Definition 5.3.3. For a given G = (V, E, u), α ≥ 1, and 1 ≥ p > 0, we say that a collection {G_i}_i of random graphs G_i = (V, E_i, u_i) α-preserves the cuts of G with probability p if for every cut C of G:

(lowerbounding) for all i, the capacity u_i(C) of C in G_i is at least its capacity u(C) in G;
in G, by β-approximating it on a sample of O(ln |V|) of the G_i's chosen according to the distribution described by the λ_i's.
Figure 5-2: An example of a j-tree H, its core H′, and its envelope F consisting of the bold edges.
Theorem 5.3.6. For any graph G = (V, E, u) and t ≥ 1, we can find in time Õ(tm) a t-sparse (Õ(log n), 𝒢_V[Õ(m log U / t)])-decomposition {(λ_i, G_i)}_i of G, where m = |E|, n = |V|, and U is the capacity ratio of G. Moreover, the capacity ratio of each G_i is O(mU).
Therefore, we see that compared to the decomposition result of Räcke (cf. Theorem
5.3.5), we obtain in this case a decomposition of G with more efficient implementation
and into objects that are even simpler than decomposition trees, but at a cost of
slightly worse quality.
However, the aspect of this theorem that we will find most useful is the above-
mentioned trade-off between the simplicity of the j-trees into which the decomposition
decomposes G and the time needed to compute it. This flexibility in the choice of t
can be utilized in various ways.
For instance, one should note that, in some sense, the core of a j-tree H captures all the non-trivial cut structure of H. In particular, it is easy to see that maximum flow computations in H essentially reduce to maximum flow computations in H's core. So, one might hope that for some cut-based problems the complexity of solving them in H is proportional to the complexity of solving them in the – possibly much smaller than H – core of H (this is, for example, indeed the case for the balanced separator and the sparsest cut problems – see section 5.4.3). Thus one could use Theorem 5.3.6 – for an appropriate choice of t – to get a faster algorithm for this kind of problem by first computing the corresponding decomposition and then leveraging the existing algorithms to solve the given problem on a small number of sampled j-trees, while paying an additional Õ(log n) factor in the approximation quality for this speed-up.
Even more importantly, our ability to choose a sufficiently small value of t in Theorem 5.3.6, together with the cut sparsification technique (cf. Theorem 3.5.1) and a recursive application of the theorem to the cores of Õ(m log U / t)-trees sampled from the computed decomposition, allows us to establish the following theorem – its proof appears in section 5.6.
Theorem 5.3.7. For any 1 ≥ l ≥ 0, integral k ≥ 1, and any graph G = (V, E, u), we can find in Õ(m + 2^k n^(1 + (1−l)/(2^k − 1)) log U) time a collection of (2^(k+1) ln n) n^l-trees {G_i}_i that (log^((1+o(1))k) n)-preserve the cuts of G with high probability. Moreover, the capacity ratio of each G_i is n^((2+o(1))k) U, where U is the capacity ratio of G, n = |V|, and m = |E|.
As one can see, the above theorem allows us to obtain a collection of j-trees that α-preserve the cuts of G – for arbitrary j = n^l – in time arbitrarily close to nearly-linear, but at the price of α growing accordingly as these two parameters decrease.
Note that one could get a cut-preserving collection satisfying the requirements of the theorem by just finding an (Õ(log n), n^l)-decomposition of G via Theorem 5.3.6 and sampling – in the spirit of Fact 5.3.4 – O(log n) n^l-trees from it, so as to ensure that each cut is preserved with high probability. Unfortunately, the running time of such a procedure would be too large. Therefore, our approach to establishing Theorem 5.3.7 can be heuristically viewed as an algorithm that, in case the time required by Theorem 5.3.6 to compute a decomposition of G into n^l-trees is not acceptable, does not try to compute and sample from such a decomposition directly. Instead, it performs its sampling by finding a decomposition of G into j-trees, for some value of j bigger than n^l, then samples a j-tree from it and recurses on this sample. Now, the desired collection of n^l-trees is obtained by repeating this sampling procedure enough
times to make sure that the cut-preserving probability is high enough, with the value of j chosen so as to bound the number of recursive calls by k − 1. Since each recursive call introduces an additional Õ(log n) distortion in the faithfulness of the reflection of the cut structure of G, the quality of the cut preservation of the collection generated by such an algorithm will be bounded by log^((1+o(1))k) n.
5.4 Applications
We proceed to demonstrating how the tools and ideas presented in the previous section lead to fast approximation algorithms for cut-based graph problems. In particular, we apply our techniques to the (generalized) sparsest cut and the balanced separator problems.
The following theorem encapsulates our way of employing the framework.

Theorem 5.4.1. Let P be a cut-based minimization problem, let 1 ≥ l ≥ 0, and let k ≥ 1 be integral. For any graph G = (V, E, u), we can find in Õ(m + 2^k n^(1 + (1−l)/(2^k − 1)) log U) time a collection of t = (2^(k+1) ln n) graphs {G_i}_i, each being an n^l-tree, such that, given for each i an α-approximately optimal solution C^i to P in G_i, the best of these solutions is, with high probability, an (α log^((1+o(1))k) n)-approximately optimal solution to P in G.
When we take l equal to zero, the above theorem implies that if we have an α-approximation algorithm for a given cut-based problem that works only on tree instances and runs in T(m, n, U) time then, for any k ≥ 1, we can get an (α log^((1+o(1))k) n)-approximation algorithm for it in general graphs, and the running time of this algorithm will be just Õ(m + k 2^k n^(1 + 1/(2^k − 1)) log U) + (2^(k+1) ln n) T(m, n, U). Note that the computational overhead introduced by our framework, as k grows, quickly approaches nearly-linear. Therefore, if we are interested in designing fast poly-logarithmic approximation algorithms for some cut-based minimization problem, we can just focus our attention on finding a fast approximation algorithm for its tree instances.
Also, an interesting feature of our theorem is that the procedure producing the graphs {G_i}_i is completely oblivious to the cut-based problem we want to solve – the fact that this problem is a cut-based minimization problem is all we need to make our approach work.
Proof of Theorem 5.4.1. We produce the desired collection {G_i}_i by just employing Theorem 5.3.7 with the desired value of k ≥ 1 to find, in time Õ(m + 2^k n^(1 + (1−l)/(2^k − 1)) log U), a set of t = (2^(k+1) ln n) n^l-trees that (log^((1+o(1))k) n)-preserve the cuts of G with high probability, and output them as {G_i}_i.
Now, to prove the theorem, let us define C to be the cut that minimizes the quantity u(C^i) f_P(C^i) among all the t solutions C^1, …, C^t found, where f_P is the function corresponding to the instance P of the problem P we are solving (cf. equation (5.2)). Clearly, by the definition of (log^((1+o(1))k) n)-preservation of the cuts, the capacity of C – and thus the quality of the solution corresponding to it – can only improve in G, i.e.

u(C) f_P(C) ≤ u_j(C) f_P(C),

where G_j is the n^l-tree to which C corresponds and, for any i, u_i denotes the capacity vector of G_i.
Moreover, if we look at an optimum solution C^* to our problem in G – i.e. C^* = argmin_{∅≠C⊂V} u(C) f_P(C) – then, with high probability, in at least one of the G_i's, say in G_{j′}, C^* has capacity at most (log^((1+o(1))k) n) times larger than its original capacity u(C^*) in G. As a result, for the α-optimal solution C^{j′} found by the algorithm in G_{j′} it will be the case that

u(C^{j′}) f_P(C^{j′}) ≤ u_{j′}(C^{j′}) f_P(C^{j′}) ≤ α u_{j′}(C^*) f_P(C^*) ≤ α (log^((1+o(1))k) n) u(C^*) f_P(C^*),

where the first inequality follows from the fact that {G_i}_i (log^((1+o(1))k) n)-preserve the cuts of G.

But, by the definition of C, we have

u(C) f_P(C) ≤ u(C^{j′}) f_P(C^{j′}).
So, by noting that u(C^*) f_P(C^*) is the objective value of an optimal solution to our instance, we get that C is indeed an (α log^((1+o(1))k) n)-optimal solution to P with high probability.
Lemma 5.4.2. For any tree T = (V, E_T, u^T) and demand graph D = (V, E_D, d), we can compute u_T[D](h), for all h ∈ E_T, in Õ(|E_D| + |V|) time.
Proof. To compute all the values u_T[D](h), we adapt the approach to computing the stretch of edges outlined by Spielman and Teng in [130].

For a given tree T′ = (V′, E_{T′}, u^{T′}) and a demand graph D′ = (V′, E_{D′}, d′) corresponding to it, let us define the size S_{D′}(T′) of T′ (with respect to D′) as S_{D′}(T′) := Σ_{v∈V′} (1 + d_{D′}(v)), where d_{D′}(v) is the degree of the vertex v in D′.

Let us define a splitter v*_{D′}(T′) of T′ (with respect to D′) to be a vertex of T′ such that each of the trees T_1, …, T_q into which T′ is split by the removal of v*_{D′}(T′) and its adjacent edges has its size S_{D′}(T_i) being at most one half of the size of T′, i.e. S_{D′}(T_i) ≤ S_{D′}(T′)/2 for each i. It is easy to see that such a splitter v*_{D′}(T′) can be computed in a greedy fashion in O(|V′|) = O(S_{D′}(T′)) time.
Our algorithm for computing the u_T[D]'s works as follows. It starts by finding a splitter v* = v*_D(T) of T. Now, let E_0 be the set of demand edges e ∈ E_D such that path_T(e) contains v*. Also, for 1 ≤ i ≤ q, let E_i ⊆ E_D be the set of edges e ∈ E_D for which path_T(e) is contained entirely within T_i. Note that this partition of edges can be easily computed in O(|E_D| + |V|) = O(S_D(T)) time. Also, let us define D_i to be the demand graph being the subgraph of D spanned by the demand edges from E_i (with demands inherited from D).

As a next step, our algorithm computes, in a simple bottom-up fashion, u_T[D_0](h) for each h ∈ E_T. Then, for 1 ≤ i ≤ q, it computes recursively the capacities u_{T_i}[D_i] of all the edges of T_i – note that, by the definition of D_i, u_{T_i}[D_i](h) = u_T[D_i](h) for any edge h of T_i. Finally, for each h ∈ E_T, we output u_T[D](h) = Σ_i u_T[D_i](h). Note that, since for i ≥ 1, u_T[D_i](h) ≠ 0 only if h is a part of T_i, the sum Σ_i u_T[D_i](h) has at most two non-zero components, and thus all the values u_T[D](h) can be computed in O(|E_D| + |V|) = O(S_D(T)) time. Furthermore, the fact that, by our choice of v*, S_{D_i}(T_i) ≤ S_D(T)/2 implies that the depth of the recursion is at most log S_D(T), and the whole algorithm runs in Õ(S_D(T)) = Õ(|E_D| + |V|) time, as desired.
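For concreteness, here is a short sketch that computes the same quantities u_T[D](h) by a different but standard route: instead of the splitter recursion from the proof, it charges each demand (u, v, d) to its endpoints, discharges 2d at their least common ancestor, and aggregates over subtrees, so that the total collected below a tree edge is exactly the demand routed over it. All names are illustrative, and a naive LCA is used only to keep the sketch short.

def tree_demand_loads(n, tree_edges, demands, root=0):
    """tree_edges: list of (a, b) pairs forming a tree on {0, ..., n-1};
    demands: list of (u, v, d). Returns {(a, b): u_T[D]((a, b))}."""
    adj = [[] for _ in range(n)]
    for a, b in tree_edges:
        adj[a].append(b); adj[b].append(a)
    parent, depth = [-1] * n, [0] * n
    order, seen = [root], [False] * n
    seen[root] = True
    for v in order:                       # BFS: parents precede children
        for w in adj[v]:
            if not seen[w]:
                seen[w] = True; parent[w] = v
                depth[w] = depth[v] + 1; order.append(w)
    def lca(u, v):                        # naive LCA, fine for a sketch
        while u != v:
            if depth[u] < depth[v]: u, v = v, u
            u = parent[u]
        return u
    diff = [0.0] * n
    for u, v, d in demands:               # charge endpoints, discharge LCA
        diff[u] += d; diff[v] += d; diff[lca(u, v)] -= 2 * d
    for v in reversed(order):             # subtree sums, children first
        if parent[v] >= 0:
            diff[parent[v]] += diff[v]
    # for the edge between v and parent(v), the load is the subtree sum at v
    return {(a, b): diff[a if parent[a] == b else b] for a, b in tree_edges}

Given these loads, the corollary below is immediate to evaluate: θ* is the minimum over tree edges h of u^T_h / u_T[D](h), attained at the edge h*.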
Now, the crucial observation to be made is that the best achievable flow rate θ* of the maximum concurrent flow corresponding to the demand graph D = (V, E_D, d) is equal to min_{h∈E_T} u^T_h / u_T[D](h). Therefore, Lemma 5.4.2 implies the following corollary.
Corollary 5.4.3. For any tree T = (V, E_T, u^T) and demand graph D = (V, E_D, d), we can find in Õ(|E_D| + |V|) time the optimal maximum concurrent flow rate θ* and an edge h* ∈ E_T such that θ* = u^T_{h*} / u_T[D](h*).
Note that we only compute the optimal flow rate of the concurrent flow and not the actual flows. In some sense, this is unavoidable – one can easily construct an example of a maximum concurrent flow problem on a tree where the representation of any (even only approximately) optimal flow has size Ω(|E_D||V|) – see Figure 4-2 in Chapter 4.
5.4.2 Generalized Sparsest Cut Problem
We proceed to designing a fast approximation algorithm for the generalized sparsest cut problem. We start by noticing that for a given tree T = (V, E_T, u^T), demand graph D = (V, E_D, d), and an edge h of this tree, the quantity u^T_h / u_T[D](h) is exactly the (generalized) sparsity of the cut that cuts in T only the edge h. Therefore, Corollary 5.4.3, together with the fact that the sparsity of the sparsest cut is always an upper bound on the maximum concurrent flow rate achievable (cf. [101, 19]), gives us the following corollary.
Corollary 5.4.4. For any given tree T = (V, E, u) and demand graph D = (V, E_D, d), an optimal solution to the generalized sparsest cut problem can be computed in Õ(|E_D| + |V|) time.
Now, by applying Theorem 5.4.1 together with a preprocessing step of cut sparsi-
fication of D (cf. Corollary 2.6.2), we are able to obtain a poly-logarithmic approxi-
mation for the generalized sparsest cut problem in close to linear time.
Theorem 5.4.5. For any graph G = (V, E, u), demand graph D = (V, E_D, d), and integral k ≥ 1, there exists a Monte Carlo log^((1+o(1))k) n-approximation algorithm for the generalized sparsest cut problem that runs in time Õ(m + |E_D| + 2^k n^(1 + 1/(2^k − 1)) log U), where n = |V|, m = |E|, and U is the capacity ratio of G.
Proof. We start by employing Corollary 2.6.2 with δ equal to 1 to sparsify both G – obtaining a graph G̃ = (V, Ẽ, ũ) – and the demand graph D – obtaining a demand graph D̃ = (V, Ẽ_D, d̃) – in total time Õ(m + |E_D|). Note that computing the sparsity of a cut with respect to these sparsified versions of G and D leads to a 4-approximate estimate of the real sparsity of that cut. Since this constant-factor approximation is acceptable for our purposes, we can focus on approximating the sparsest cut problem with respect to G̃ and D̃. To this end, we just use Theorem 5.4.1 together with Corollary 5.4.4 and, since both Ẽ and Ẽ_D contain Õ(n) edges, our theorem follows.
certain way from the sparsified version of G. As these computations can be performed using the maximum flow algorithm from Chapter 3, the running time of Õ(m + n^(4/3+ε)) follows.
Unfortunately, Õ(m + n^(4/3+ε)) running time is still not sufficiently fast for our purposes. At this point, however, we recall that a maximum flow computation in a j-tree reduces to the task of finding the maximum flow in its core. So, if we want to perform a maximum flow computation on a j-tree that has its core sparsified (i.e. the core has only Õ(j) edges), the real complexity of this task is proportional to j as opposed to being proportional to n. This motivates us to obtain an implementation of Sherman's algorithm on j-trees that achieves a better running time. This is captured in the following lemma.
Lemma 5.4.6. For any j-tree G = (V, E, u) and ε > 0, one can O(√(log n / ε))-approximate the balanced separator and the sparsest cut problems in Õ(m + n^ε (n + j^(4/3))) time, where m = |E| and n = |V|.
Proof. We start by using Corollary 2.6.2 with δ = 1 to sparsify the core of our j-tree G. This ensures that the resulting j-tree G̃ has a core with Õ(j) edges. Also, it is easy to see that we can focus on approximating our problems in this graph, since any approximation obtained for G̃ leads to an approximation for G that is worse by at most a constant factor.
For a given ε > 0, the algorithm of Sherman reduces the task of O(√(log n / ε))-approximation of the balanced partition and the sparsest cut problems in the graph G̃ to solving a sequence of n^ε instances of the following problem.⁷ We are given a graph Ĝ that is G̃ with a source s and a sink t added to it. We also add auxiliary edges that connect these two vertices with the original vertices corresponding to G̃. The capacities of the auxiliary edges can be arbitrary – in particular, they can be zero, which means that the corresponding edge is not present – but we require that the value of the s-t cut C = {s} is at most n/2 (thus the throughput of the maximum s-t flow is also bounded by this value). The task is to find a constant-factor approximation to the maximum s-t flow in the graph Ĝ.
Our goal is to show that we can solve any instance of the above problem in Õ(n + j^(4/3)) time – this will imply our desired Õ(m + n^ε (n + j^(4/3))) total running time of Sherman's O(√(log n / ε))-approximation algorithm and thus yield the proof of the lemma.

To this end, we design a method of fast compression of the graph Ĝ that makes it consist only of the core of G̃, the source s, the sink t, and the corresponding auxiliary edges (with modified capacities). This method allows efficient – i.e. linear-time – recovery, from any flow in the compressed graph, of a flow in the original graph Ĝ, such that if the flow in the compressed graph was an approximately maximum flow then so is the recovered one (with the same approximation factor). So, by running the maximum flow algorithm from Chapter 3 on this compressed graph and recovering
⁷For our purposes, we use a slight generalization of the problem that is actually considered in [122].
the flow in Ĝ from the computed flow, we will obtain an approximately maximum flow in Ĝ in time Õ(n + j^(4/3)), as desired.
Our compression procedure works in steps – each step is a local transformation that reduces the number of vertices and edges of the graph by at least one and can be implemented in constant time. In each transformation we consider a vertex v in the current version Ĝ′ of Ĝ such that v has exactly one non-auxiliary edge e incident to it, and this edge is a part of the envelope of G̃. As we will see, Ĝ′ will always be a subgraph of the graph Ĝ, and thus one can convince oneself that – as long as Ĝ′ still contains some edges of the envelope of G̃ – we can always find a vertex v as above.
Assume first that there is at most one auxiliary edge incident to v. (For the sake of the argument, let us use here the convention that this single auxiliary edge e′ is always present, but might have a capacity of zero.) In this case, we just contract the edge e and set the new capacity of e′ to be the minimum of its previous capacity and the capacity of e. We claim now that, given any flow f″ in the graph Ĝ″ obtained through this contraction of e, we can extend it in constant time to a flow f′ in the graph Ĝ′ of the same value. To achieve this, we just transplant the flow of |f″(e′)| units that f″ routes over the edge e′ in Ĝ″ to a flow in Ĝ′ routed through the edges e and e′. Note that, by the definition of the capacity of e′ in Ĝ″, the transplanted flow f′ is feasible in Ĝ′ and has the same value as f″ had in Ĝ″. Moreover, one can see that the value of the maximum flow in Ĝ′ can be at most the value of the maximum flow in Ĝ″, since any minimum s-t cut in Ĝ″ can be easily extended to a minimum s-t cut in Ĝ′ that has the same capacity. So, if f″ was an approximately maximum flow in Ĝ″ then so is f′ in Ĝ′.
Now, we deal with the case when there are two auxiliary edges e′, e″ incident to v (wlog, let us assume that the capacity u″ of e″ is not bigger than the capacity u′ of e′). To this end, we note that if we route u″ units of flow from s to t along the path consisting of the edges e′, e″, then there still exists a maximum flow in Ĝ′ such that its flow-path decomposition contains the flow-path corresponding to our pushing of u″ units of flow over the edges e′, e″. This implies that if we reduce the capacities of e′ and e″ by u″ – thus reducing the capacity of e″ to zero and removing it – and find an approximate maximum flow f″ in the resulting graph, we can still recover an approximately maximum flow in Ĝ′ by just adding the above-mentioned flow-path to f″. But since in the resulting graph v has at most one auxiliary edge incident to it – namely, e′ – we can use our transformation from the previous case to compress this graph further. This will reduce the number of vertices and edges of Ĝ′ by at least one, while we are still able to recover from an approximately maximum flow in the compressed graph Ĝ″ an approximately maximum flow in Ĝ′.
By noting that the compression procedure stops only when Ĝ′ contains no more edges of the envelope of G̃, and thus the final compressed graph indeed consists of only the (sparsified) core of G̃, the source s, and the sink t, the lemma follows.
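The following is a condensed sketch of this compression, under simplifying assumptions: the envelope is given by parent pointers env_parent towards the core with capacities env_cap, and the auxiliary edges are stored as per-vertex capacities aux_s, aux_t (zero meaning the edge is absent; parallel auxiliary edges created by contractions are merged by summing their capacities, which changes no flow value). Only the forward pass is shown – recovering a flow replays the recorded operations in reverse, exactly as described above – and all names are illustrative.

def compress(core, env_parent, env_cap, aux_s, aux_t):
    # process envelope vertices deepest-first, so that when v is reached
    # its whole envelope subtree has already been contracted into it
    def depth(v):
        d = 0
        while v not in core:
            v = env_parent[v]; d += 1
        return d
    ops = []                              # replayed backwards to lift a flow
    for v in sorted(env_parent, key=depth, reverse=True):
        w, c = env_parent[v], env_cap[v]
        m = min(aux_s[v], aux_t[v])
        if m > 0:                         # two auxiliary edges: push m units
            ops.append(("path", v, m))    # of flow along s -> v -> t
            aux_s[v] -= m; aux_t[v] -= m
        # at most one auxiliary edge remains; contract the envelope edge
        # (v, w), capping the surviving auxiliary capacity by c
        if aux_s[v] > 0:
            aux_s[w] += min(aux_s[v], c)
        elif aux_t[v] > 0:
            aux_t[w] += min(aux_t[v], c)
        ops.append(("contract", v, w, c))
    return aux_s, aux_t, ops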
Before we proceed further, we note that we can compute very quickly a very
rough, but polynomial in |V |, approximation to the graph partitioning problems of
our interest.
Lemma 5.4.7. For any graph G = (V, E, u), we can find a |V|²-approximation to the sparsest cut and the balanced separator problems in time Õ(|E|).
Proof. Consider first the case when we are interested in the sparsest cut problem (we will present a variation for the balanced separator problem shortly). We sort the edges of G non-increasingly according to their capacities. Let e_1, …, e_{|E|} be the resulting ordering. Let r* be the smallest r such that the set {e_1, …, e_r} contains a spanning tree T of G. It is easy to see that we can find such an r* in Õ(|E|) time using the union-find data structure.

Now, let us look at the cut C in G that cuts only the edge e_{r*} in T. Since no vertex of G can have degree bigger than |V|, the sparsity of C is at most u_{e_{r*}} |V|. On the other hand, any cut in G has to cut at least one edge e_r with r ≤ r*. Therefore, we know that the sparsity of the optimal cut in G has to be at least u_{e_{r*}} / |V|. This implies that C is the desired |V|²-approximation of the sparsest cut of G.
In the case of the balanced separator problem, we define r* to be the smallest r such that the largest connected component of the subgraph E_r spanned by the edges {e_1, …, e_r} has size bigger than (1 − c)|V|, where c is the desired balance constant of our instance of the problem. Once again, one can easily find such an r* in Õ(|E|) time by employing the union-find data structure. Let F be the connected component of E_{r*} containing e_{r*}. Note that, by the minimality of r*, removing e_{r*} from F disconnects it into two connected pieces, and at least one of these pieces, say F′, has to have at least (1 − c)|V|/2 and at most (1 − c)|V| vertices. Let C be the cut corresponding to F′. Clearly, the sparsity of C is at most u_{e_{r*}} |V|. Furthermore, any cut C* with min{|C*|, |C̄*|} ≥ c|V| has to cut some edge e_r with r ≤ r*. Therefore, the optimal solution has to have sparsity at least u_{e_{r*}} / |V|. This implies that C is a c′-balanced separator with c′ = min{(1 − c)/2, c}, and thus constitutes our desired |V|²-(pseudo-)approximation for our instance of the balanced separator problem.
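A sketch of the sparsest cut case of this procedure is below, assuming a connected input graph: it sorts the edges once, detects r* with a union-find structure, and reads the cut off the tree by removing e_{r*}. The names are illustrative.

class DSU:
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]; x = self.p[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb: return False
        self.p[ra] = rb; return True

def rough_sparsest_cut(n, edges):
    """edges: list of (u, v, capacity); returns one side of a cut whose
    sparsity is within a factor of |V|^2 of the optimum."""
    dsu, tree = DSU(n), []
    for u, v, cap in sorted(edges, key=lambda e: -e[2]):
        if dsu.union(u, v):
            tree.append((u, v))
            if len(tree) == n - 1:        # the spanning tree is complete;
                break                     # its last edge is e_{r*}
    adj = {x: [] for x in range(n)}
    for u, v in tree[:-1]:                # the tree minus e_{r*}
        adj[u].append(v); adj[v].append(u)
    side, stack = {tree[-1][0]}, [tree[-1][0]]
    while stack:                          # the component of one endpoint
        for y in adj[stack.pop()]:
            if y not in side:
                side.add(y); stack.append(y)
    return side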
Now, we can use Theorem 5.4.1 and Lemma 5.4.6 – for the right choice of j = n^l – together with a simple preprocessing step making the capacity ratio of the graphs we are dealing with polynomially bounded, to obtain the following result.
Theorem 5.4.8. For any ε > 0, integral k ≥ 1, and graph G = (V, E, u), we can (log^((1+o(1))(k+1/2)) n / √ε)-approximate the sparsest cut and the balanced separator problems in time Õ(m + 2^k n^(1 + 1/(4·2^k − 1) + ε)), where m = |E| and n = |V|.
Proof. Let us assume first that the capacity ratio of G is polynomially bounded, i.e. it is n^O(1). In this case, we just use Theorem 5.4.1 on G with l = 3·2^k / (4·2^k − 1) to obtain a collection of (2^(k+1) ln n) n^l-trees {G_i}_i in time

Õ(m + 2^k n^(1 + (1−l)/(2^k − 1))) = Õ(m + 2^k n^(1 + 1/(4·2^k − 1))).
Next, we compute – using the algorithm from Lemma 5.4.6 – O(√(log n / ε))-approximately optimal solutions to our desired problem – being either the sparsest cut or the balanced separator problem – on each of the G_i's, in total time

Õ(m + (2^(k+1) ln n) n^ε (n + n^(4l/3))) = Õ(m + 2^k n^(1 + 1/(4·2^k − 1) + ε)).
By Theorem 5.4.1, we know that choosing the best one among these solutions will give, with high probability, the (log^((1+o(1))(k+1/2)) n / √ε)-approximately optimal solution that we are seeking.
To ensure that G always has a polynomially bounded capacity ratio, we devise the following preprocessing procedure. First, we use Lemma 5.4.7 to find a cut C that is an n²-approximation to the optimal solution of our desired problem. Let ζ be the sparsity of C. We remove all the edges in G that have capacity smaller than ζ/(mn²) and, for all the edges with capacity bigger than ζn, we trim their capacities to ζn. Clearly, the resulting graph G′ has a polynomially bounded capacity ratio. Also, the capacity of any cut in G can only decrease in G′.

Now, we just run the approximation algorithm described above on G′ instead of G. Also, in the case of approximating the balanced separator problem, we set as our balance constant the value c″ := min{c, c′}, where c′ := min{|C|, |C̄|}/n and c is the balance constant of our input instance. Since the capacity ratio of G′ is polynomially bounded, the running time of this algorithm will be as desired. Moreover, we can assume that the α-approximately optimal cut C′ output will have sparsity at most ζ in G′ – otherwise, we can just output C as our solution. This means that C′ does not cut any edges with trimmed capacity, and thus the capacity of C′ in G (after putting the removed edges back) can only be larger than in G′ by an additive factor of ζ/n². But since ζ/n² is a lower bound on the sparsity of the optimal solution, we see that C′ is our desired (log^((1+o(1))(k+1/2)) n / √ε)-approximately optimal solution.
Recall that our main motivation for establishing Lemma 5.4.6 was our inability to solve the balanced separator problem on trees efficiently. However, the obtained trade-off between the running time of the resulting algorithm and the quality of the approximation provided for both the balanced separator and the sparsest cut problems is much better than the one we got for the generalized sparsest cut problem (cf. Theorem 5.4.5). This shows us that sometimes it is beneficial to take advantage of the flexibility in the choice of l given by Theorem 5.4.1 by combining it with an existing fast approximation algorithm for our cut-based problem that allows us to obtain a faster implementation on j-tree instances.
and G is 9-embeddable into H. This will imply that both these graphs are equivalent (up to a constant) with respect to their cut-flow structure. As a result, if we consider a decomposition of G into a convex combination {(λ_i, G_i)}_i, where each graph G_i is the graph from G_V[O(j)] equivalent – in the above sense – to the graph H_i ∈ H[j], then we will be able to show that this constitutes the t-sparse (Õ(log n), G_V[Õ(m log U / t)])-decomposition of G that we are seeking.
Figure 5-3: a) An example of a graph G (solid edges), its spanning tree T (bold edges), and a subset F of edges of T (dotted edges). b) The graph H(T, F) corresponding to the example from a). The edges of T \ F are bold and the edges of the set E[T](F) are dashed.
To define the family H[j], consider some spanning tree T = (V, E_T) of G. There is a unique way of embedding G into T. Namely, for each edge e = (u, v) of G, we route the corresponding u_e units of flow along the unique u-v path path_T(e) in T. This implies that if we want G to be embeddable into T, then each edge e of the tree T has to have capacity at least u^T(e), where we define

u^T(e) := Σ_{e′∈E : e∈path_T(e′)} u_{e′}.

Finally, let H(T, F) = (V, Ē, ū) be a graph (cf. Figure 5-3) with edge set Ē := E_T ∪ E[T](F) and the capacities ū_e, for e ∈ Ē, being equal to:

ū_e := u_e if e ∈ E[T](F), and ū_e := u^T(e) otherwise.
In other words, H(T, F) is the graph obtained by taking the forest corresponding to the tree T with the edges of F removed, and adding to it all the edges in the set E[T](F), which contains the edges of E whose endpoints are in different components of this forest (note that F ⊆ E[T](F)). The capacity ū_e of an edge e of H(T, F) is equal to the capacity u^T(e) inherited from the tree T if e is from the forest T \ F, and it is just the original capacity u_e of e otherwise. It is easy to see that, by construction, G embeds into H(T, F).

An observation that will turn out to be useful later is that both the capacity vector u^T and the graph H(T, F) can be constructed very efficiently. This is captured in the following lemma.
Lemma 5.5.1. For any spanning tree T = (V, E_T) of G and any F ⊆ E_T, the set E[T](F), the capacity vector u^T, and thus the graph H(T, F), can be constructed in Õ(m) time.

Proof. First, we show how to construct the set E[T](F) in O(m) time. We do this by first associating with each vertex v a label indicating the connected component of the forest T \ F to which v belongs. This can be easily done in O(n) time. Next, we start with E′ := ∅ and, for each edge e of G, we check whether both of its endpoints belong to the same connected component. If this is not the case, we add e to E′. Clearly, this procedure takes O(m) time, and the resulting set E′ is equal to E[T](F). Now, we obtain H(T, F) by just taking the union of the edges from E[T](F) – with capacities given by the capacity vector u – and of the forest T \ F – with capacities given by the capacity vector u^T that, as we will see shortly, can be computed in Õ(m) time.

To compute the capacity vector u^T, we just note that, for any e ∈ E_T, u^T(e) = u_T[G](e), where u_T[G](e) is defined in section 5.4.1. Thus we can use the algorithm from Lemma 5.4.2 to compute u^T in Õ(m) time.
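The component-labeling step of this proof is the only part that needs any care; a minimal sketch of it (with illustrative names) follows.

def crossing_edges(n, tree_edges, F, graph_edges):
    """Returns E[T](F): the edges of G whose endpoints lie in different
    components of the forest T \\ F. F is a set of tree edges."""
    adj = [[] for _ in range(n)]
    for u, v in tree_edges:
        if (u, v) not in F and (v, u) not in F:
            adj[u].append(v); adj[v].append(u)
    label = [-1] * n
    for s in range(n):                    # BFS-label each forest component
        if label[s] >= 0:
            continue
        label[s] = s; queue = [s]
        for x in queue:
            for y in adj[x]:
                if label[y] < 0:
                    label[y] = s; queue.append(y)
    return [(u, v) for u, v in graph_edges if label[u] != label[v]]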
Now, we define H[j] to be the family of all the graphs H(T, F ) arising from all
the possible choices of a spanning tree T of G and of a subset F of edges of T such
that |F | ≤ j.
To achieve this goal, we will design a nearly-linear time algorithm that, for any length vector l on G, finds a graph H_l = (V, E_{H_l}, u^{H_l}) that belongs to the family H[Õ(m log U / t)] and:

(i) Σ_e l_e u^{H_l}_e ≤ α l(G), where l(G) := Σ_e l_e u_e is the volume of G with respect to l;

(ii) |κ(H_l)| ≥ 4αm log m / t, where κ(H_l) is the set of edges e of H_l with u^{H_l}_e / u_e ≥ ρ(H_l)/2 and ρ(H_l) := max_{e′} u^{H_l}_{e′} / u_{e′}.
e
To understand the motivation behind the two above conditions, one needs to note
that if w is some weight vector and one takes the length vector l to be le := wuee for
each e, then the flow fHl corresponding to Hl as above constitutes a valid answer of
an α-oracle for the (F[O(e m log U )], G)-system with respect to the weights w . To see
t
why this is the case, we just need to note that the condition (i) asserts that
X X X X
we congfH (e) = le uH
e ≤ α
l
le ue = α we ,
l
e e∈EHl e∈E e
where we use the fact that uH e = fHl (e). Thus, fHl satisfies the requirements of the
l
oracle definition (cf. Definition 2.8.2). (Interestingly, this oracle never returns “fail”–
even when the underlying (F[O( e m log U )], G)-system is infeasible.)
t
Furthermore, condition (ii) implies that
4αm log m
|κG (fHl )| = |κ(Hl )| ≥ ,
t
where we again used the fact that uH e
l = f
Hl (e) and thus ρG (fHl ) = ρ(Hl ) and
κG (fHl ) = κ(Hl ). This mean that the tightness of the resulting oracle is at least
4αm log m
t
.
Now, by employing the multiplicative-weights-update routine from Section 2.8 with δ = 1 and the above oracle, we can obtain a feasible solution {λ_f}_f to the α-relaxed (F[Õ(m log U / t)], G)-system. Moreover, Lemma 2.8.9 allows us to conclude that this routine will run in total Õ(mt) time and produce a solution that has at most t non-zero λ_f's. So, in the light of the above, we can just focus on designing a nearly-linear time algorithm that finds, for any length function l, a graph H_l that satisfies conditions (i) and (ii) above with α = Õ(log n).
Lemma 5.5.2. We can find in Õ(m) time a spanning tree T_l of G such that, if we impose the capacities u^{T_l} on the edges of T_l, then l(T_l) = Σ_e l_e u^{T_l}_e ≤ 2α l(G), for some α being Õ(log n).
Proof. Consider some spanning tree T = (V, E_T) of G with capacities u^T on its edges. Note that

Σ_f l_f u^T_f = Σ_{f∈E_T} l_f Σ_{e∈E : f∈path_T(e)} u_e = Σ_{e∈E} l(path_T(e)) u_e = Σ_{e∈E} stretch^l_T(e) l_e u_e,

where we recall from Section 2.6 that the stretch stretch^l_T(e) of an edge e is the ratio of the distance l(path_T(e)) between the endpoints of e in T and the length l_e of the edge e.

Therefore, we see that to establish the lemma it is sufficient to find in Õ(m) time a tree T such that

Σ_e stretch^l_T(e) l_e u_e ≤ 2α l(G).
To this end, let us follow a technique used in [8] and define a multigraph Ḡ on the vertex set V that contains d_e := 1 + ⌊l_e u_e |E| / l(G)⌋ copies of each edge e ∈ E. Note that the total number of edges of Ḡ – when counting multiple copies separately – is

Σ_e d_e ≤ |E| + Σ_e l_e u_e |E| / l(G) ≤ 2|E| = 2m.

Now, we use the algorithm⁸ from Theorem 2.6.3 on Ḡ with weights w_e = l_e for each e, and obtain in Õ(Σ_e d_e) = Õ(m) time a spanning tree T_l of Ḡ – which must also be a spanning tree of G – such that:

(Σ_e stretch^l_{T_l}(e) d_e) / (Σ_e d_e) ≤ α,

⁸Technically, the algorithm of [1] is designed for simple graphs, but it is straightforward to adapt it to work on multigraphs with a running time that is nearly-linear in the total number of edges.
with α = Õ(log n), as desired.
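The reduction in this proof amounts to a dozen lines, sketched below; the low-stretch spanning tree routine of Theorem 2.6.3 is abstracted as a parameter, since it is not reproduced here, and the remaining names are illustrative.

def low_volume_tree(edges, length, capacity, low_stretch_tree):
    """edges: list of edge ids; length, capacity: dicts keyed by edge.
    Builds the multigraph with d_e = 1 + floor(l_e u_e |E| / l(G)) copies
    of each edge and hands it, with weights w_e = l_e, to the low-stretch
    tree routine; the returned tree is also a spanning tree of G."""
    vol = sum(length[e] * capacity[e] for e in edges)          # l(G)
    multi = []
    for e in edges:
        d_e = 1 + int(length[e] * capacity[e] * len(edges) / vol)
        multi.extend([e] * d_e)           # at most 2|E| copies in total
    return low_stretch_tree(multi, length)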
So, we see that by taking T_l as in the above lemma, we obtain a graph in H[0] ⊆ H[Õ(m log U / t)] that satisfies condition (i). However, this graph may still violate the remaining condition (ii). To illustrate our way of alleviating this shortcoming of T_l, consider a hypothetical situation in which κ(T_l) = {e} for some edge e, and for all the other edges f ≠ e of T_l we have u^{T_l}_f / u_f = ρ for some ρ ≪ ρ(T_l) = u^{T_l}_e / u_e. Clearly, in this case T_l is a bad candidate for the graph H_l (at least when t is not very large), as it severely violates condition (ii).
However, the key thing to notice in this situation is that if – instead of H_l = T_l – we consider the graph H_l = H(T_l, F) with F = {e}, then for all the edges f of T_l other than e we have u^{H_l}_f / u_f = ρ. Furthermore, for each edge f ∈ E[T_l](F) = E[T_l]({e}) it is the case that u^{H_l}_f / u_f = u_f / u_f = 1 ≤ ρ. This means that ρ(H_l) = ρ, and thus |κ(H_l)| ≥ n − 2, which makes H_l satisfy condition (ii) even for very small values of t.
We see, therefore, that in the above, hypothetical, case, by adding only one bottlenecking edge e to F and considering the graph H_l = H(T_l, F) – instead of H_l = T_l – we managed to make the size of the set κ(H_l) really large. It turns out that we can always make the above approach work. As the following lemma shows, we can always get the size of κ(H_l) to be at least 4(2α + 1)m / t while fixing, in the above manner, only Õ(m log U / t) edges of the tree T_l.
Lemma 5.5.3. We can construct in Õ(m) time a graph H_l = H(T_l, F_l), for some F_l ⊆ E_{T_l} with |F_l| = Õ(m log U / t), that satisfies the conditions (i) and (ii) with α equal to 2α + 1.
Proof. First, we prove that, no matter what our choice of the set F_l is, the graph H_l = H(T_l, F_l) satisfies condition (i) for α = (2α + 1). To this end, note that

l(H_l) = Σ_{e∈E_{T_l}\F_l} l_e u^{T_l}_e + Σ_{e∈E[T_l](F_l)} l_e u_e ≤ l(T_l) + l(G) ≤ (2α + 1) l(G).

We proceed to finding the subset F_l of edges of the tree T_l that will make H_l = H(T_l, F_l) satisfy also the condition (ii). For 0 ≤ j ≤ ⌊log mU⌋, let us define F_j(T_l) to be the set of edges e of T_l such that mU / 2^(j+1) ≤ u^{T_l}_e / u_e < mU / 2^j.

Note that, for any edge e of T_l, 1 ≤ u^{T_l}_e / u_e ≤ mU – the first inequality follows since e itself has to be routed in T_l over e, and the second one since, in the worst case, all m edges will be routed in T_l over e. Therefore, the union ∪_{j=0}^{⌊log mU⌋} F_j(T_l) of all the F_j(T_l)'s partitions the whole set E_{T_l} of the edges of the tree T_l.

Now, let us take j* to be the smallest j such that

Σ_{j′=j}^{⌊log mU⌋} |F_{j′}(T_l)| ≤ 4(2α + 1)m(⌊log mU⌋ + 1) / t.
In other words, j* is the smallest j such that we could afford to take as F_l the union of all the sets F_j(T_l), …, F_{⌊log mU⌋}(T_l) and still have the size of F_l not exceed the desired cardinality bound. Note that, by the minimality of j*,

Σ_{j′=j*−1}^{⌊log mU⌋} |F_{j′}(T_l)| > 4(2α + 1)m(⌊log mU⌋ + 1) / t.

However, since this sum has at most ⌊log mU⌋ + 1 summands, the pigeonhole principle asserts the existence of a j with j* ≤ j ≤ ⌊log mU⌋ such that

|F_{j−1}(T_l)| ≥ 4(2α + 1)m / t.
Now, we define F_l to be the union ∪_{j′=j}^{⌊log mU⌋} F_{j′}(T_l). By the fact that j* ≤ j, we know that the size of such an F_l obeys the desired cardinality bound. Furthermore, we claim that if we take H_l = H(T_l, F_l) with this choice of F_l, then |κ(H_l)| is large enough.

To see this, note that the fact that u^{H_l}_e / u_e = 1 for all edges e in E[T_l](F_l) implies that ρ(H_l) is at most mU / 2^(j−1). But this means that all the edges from F_{j−1}(T_l) are in κ(H_l), and thus

|κ(H_l)| ≥ |F_{j−1}(T_l)| ≥ 4(2α + 1)m / t,

as desired.
Now, to conclude the proof, we notice that Lemma 5.5.2 and Lemma 5.5.1 imply that the graph H_l as above can indeed be constructed in Õ(m) time – to find the set F_l, we just sort the edges of T_l according to u^{T_l}_e / u_e and choose the appropriate value of j.
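The selection of F_l can be made concrete as follows – a sketch with illustrative names, where ratio[e] stands for u^{T_l}_e / u_e and alpha for the Õ(log n) quantity of Lemma 5.5.2.

import math

def choose_F(tree_edges, ratio, m, U, t, alpha):
    J = math.floor(math.log2(m * U))
    buckets = [[] for _ in range(J + 1)]
    for e in tree_edges:                  # mU/2^(j+1) <= ratio[e] < mU/2^j
        j = min(J, math.floor(math.log2(m * U / ratio[e])))
        buckets[j].append(e)
    bound = 4 * (2 * alpha + 1) * m * (J + 1) / t
    suffix, j_star = 0, J + 1
    for j in range(J, -1, -1):            # smallest j with a small suffix
        if suffix + len(buckets[j]) > bound:
            break
        suffix += len(buckets[j]); j_star = j
    # pigeonhole: some j >= j_star has a heavy preceding bucket F_{j-1}
    for j in range(max(j_star, 1), J + 1):
        if len(buckets[j - 1]) >= 4 * (2 * alpha + 1) * m / t:
            return [e for jj in range(j, J + 1) for e in buckets[jj]]
    return [e for jj in range(max(j_star, 0), J + 1) for e in buckets[jj]]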
In the light of our above considerations and Lemma 5.5.3 we obtain the following
corollary.
5.5.3 Obtaining an (Õ(log n), G_V[Õ(m log U / t)])-decomposition of G
Lemma 5.5.5. For any spanning tree T = (V, E_T) of G and F ⊆ E_T, we can find in Õ(m) time an almost-O(|F|)-tree G(T, F) such that the graph H(T, F) is embeddable into G(T, F) and G(T, F) is 3-embeddable into H(T, F). Furthermore, the capacity ratio of G(T, F) is at most mU.
Proof. Consider some edge e = (v, v′) from E[T](F). Let us define v¹(e) (resp. v²(e)) to be the vertex that corresponds to the first (resp. last) moment we encounter an endpoint of an edge from F while moving along the path path_T(e) from v to v′. Also, let us denote by path¹_T(e) (resp. path²_T(e)) the fragment of path_T(e) between v and v¹(e) (resp. between v′ and v²(e)).
Figure 5-4: a) An example of a graph H(T, F) from Figure 5-3. b) The corresponding graph G(T, F) with three projected edges e, f, and h. The set Proj(e) consists of the edges of H(T, F) marked as dashed, the edges of the set Proj(f) are solid, and Proj(h) contains all the edges that are dotted.
We define the graph G(T, F) = (V, E′, u′) as follows. We make the edge set E′ of G(T, F) consist of the edges from E_T \ F and of an edge f = (w, w′) for each pair w, w′ such that (w, w′) = (v¹(e), v²(e)) for some edge e ∈ E[T](F). We will call such an edge f projected, and define Proj(f) to be the set of all e ∈ E[T](F) with (v¹(e), v²(e)) = (w, w′). Now, we set the capacity u′_e in G(T, F) to be:

u′_e := 2u^T_e if e ∈ E_T \ F, and u′_e := Σ_{e′∈Proj(e)} u_{e′} otherwise.

See Figure 5-4 for an example of a graph H(T, F) and the graph G(T, F) corresponding to it.
To see that this construction of G(T, F) can be performed quickly, note that, by Lemma 5.5.1, we can find the set E[T](F) and the capacity vector u^T in Õ(m) time. Moreover, by employing a simple adaptation of the divide-and-conquer approach that was used in Lemma 5.4.2 to compute u_T[D], we can compute all the v¹(e), v²(e) for each e ∈ E[T](F), also in Õ(m) time.
By noting that if a vertex v is equal to v¹(e) or v²(e) for some edge e, then v has to be an endpoint of an edge in F, we conclude that the subgraph of G(T, F) induced by the projected edges is supported on at most 2|F| vertices. This means that G(T, F) – as a connected graph being a union of the forest T \ F and a graph on 2|F| vertices – is an almost-2|F|-tree. Furthermore, note that u^T_e ≤ m max_{e′∈E} u_{e′} for each edge e, and that the capacity u′_f of a projected edge f is upperbounded by Σ_{e′∈E} u_{e′} ≤ m max_{e′∈E} u_{e′}. So, the fact that no edge e in G(T, F) has capacity smaller than u_e implies that the capacity ratio of G(T, F) is at most mU.
To relate the cut-flow structure of H(T, F) and G(T, F), one can embed H(T, F) into G(T, F) by embedding each edge e ∈ E_T \ F of H(T, F) into its counterpart edge in G(T, F), and by embedding each edge e ∈ E[T](F) by routing the corresponding flow of u_e units along the path formed from the paths path¹_T(e) and path²_T(e) connected through the projected edge (v¹(e), v²(e)). It is easy to see that our definition of u′ ensures that this embedding does not overflow any capacities of edges of G(T, F).

On the other hand, to 3-embed G(T, F) into H(T, F), we embed each non-projected edge of G(T, F) into the same edge in H(T, F). Moreover, we embed each projected edge f by splitting the corresponding u′_f = Σ_{e∈Proj(f)} u_e units of flow into |Proj(f)| parts. Namely, for each e ∈ Proj(f), we route u_e units of this flow along the path constituted by the paths path¹_T(e), path²_T(e) and the edge e. Once again, it is not hard to convince oneself that such a flow does not overflow the capacities of edges of H(T, F) by a factor of more than three. The lemma follows.
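Given the projection endpoints, assembling G(T, F) is direct; the following sketch (illustrative names, with proj[e] = (v¹(e), v²(e)) assumed precomputed as noted above) returns the two capacity maps of the definition of u′.

def build_G_T_F(forest_edges, u_T, crossing, proj, u):
    """forest_edges: edges of T \\ F with tree capacities u_T;
    crossing: E[T](F); u: the original capacities of G."""
    tree_cap = {e: 2 * u_T[e] for e in forest_edges}   # u'_e = 2 u^T_e
    proj_cap = {}                          # u'_f = sum of u_e over Proj(f)
    for e in crossing:
        f = tuple(sorted(proj[e]))         # the projected edge (v1, v2)
        proj_cap[f] = proj_cap.get(f, 0) + u[e]
    return tree_cap, proj_cap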
It is easy to see that every j-tree is an almost-j-tree; however, the converse does not necessarily hold – one can consider an almost-2-tree corresponding to a cycle. Fortunately, the following lemma proves that every almost-j-tree G′ is close – with respect to its cut structure – to some O(j)-tree Ḡ, and this Ḡ can be found efficiently.
Lemma 5.5.6. Let G′ = (V′, E′, u′) be an almost-j-tree; we can obtain in Õ(|E′|) time an O(j)-tree Ḡ = (V′, Ē, ū) such that G′ is embeddable into Ḡ and Ḡ is 3-embeddable into G′. Furthermore, the capacity ratio of Ḡ is at most twice the capacity ratio of G′.
Proof. Let us assume first that G′ does not have vertices of degree one – we will deal with this assumption later. Let W ⊂ V′ be the set of all vertices of degree two in G′. It is easy to see that we can find in O(|E′|) time a collection of edge-disjoint paths p_1, …, p_k covering all vertices in W such that, in each p_i, all the internal vertices are from W and the two endpoints v¹(p_i) and v²(p_i) of p_i are not in W, i.e. they have degree at least three in G′.
identity embedding for them as well. Furthermore, to embed the remaining edges {(v¹(p_i), v²(p_i))}_i, we can just route, for each i, the corresponding flow from v¹(p_i) to v²(p_i) along the path p_i. Once again, one can convince oneself that, by the definition of e_min(p_i), the resulting embedding does not overflow any capacities in G′ by a factor of more than three.

To conclude the proof, it remains to explain how to deal with a graph G′ having vertices of degree one. In this case, one may preprocess G′ as follows. We start with an empty forest F′ and, as long as there is a degree-one vertex in G′, we remove it – together with the incident edge – from G′ and add both the vertex and the edge to F′. Once this procedure finishes, the resulting graph G′ will not have vertices of degree one and will still be an almost-j-tree. Thus, we can use our algorithm described above and then just add F′ to the (3j − 2)-tree that the algorithm outputs. It is easy to see that the resulting graph will still be a (3j − 2)-tree – with F′ being part of its envelope – having all the desired properties.
Note that the above lemmas can be combined to produce – in Õ(m) time – for a given graph H(T, F), an O(|F|)-tree that preserves the cut-flow structure of H(T, F) up to a factor of nine. This allows us to prove Theorem 5.3.6.
Proof of Theorem 5.3.6. By Corollary 5.5.4, we can compute in Õ(tm) time a t-sparse (α′, H[Õ(m log U / t)])-decomposition {(λ_i, H(T_i, F_i))}_i of G with α′ being Õ(log n). Now, consider a convex combination {(λ_i, G_i)}_i with each G_i being the Õ(m log U / t)-tree produced by applying Lemma 5.5.5 and then Lemma 5.5.6 to the graph H(T_i, F_i). Clearly, we can obtain this combination in Õ(tm) time.

We claim that this combination is a t-sparse (Õ(log n), G_V[Õ(m log U / t)])-decomposition of G. Obviously, Σ_i λ_i = 1, and G is embeddable into each G_i, since G is embeddable into each H(T_i, F_i) and each H(T_i, F_i) is embeddable into the corresponding G_i. Similarly, by composing the 9-embedding of G_i into H(T_i, F_i) and the identity embedding of H(T_i, F_i) into G (cf. the embeddings {f_j}_j in Definition 5.3.1), we get that {(λ_i, G_i)}_i satisfies all the requirements of a t-sparse (9α′, G_V[Õ(m log U / t)])-decomposition. Furthermore, by Lemma 5.5.5 and Lemma 5.5.6, the capacity ratio of each G_i is at most 2mU. The theorem follows.
in the spirit of Fact 5.3.4 – and if H is already a j-tree, then it is returned. Otherwise, the procedure recursively finds a j-tree Ḡ′ that approximates the cuts of the core of H – thus Ḡ′ is defined only on the vertex set of this core – and returns a j-tree that approximates the cuts of the whole H – and thus of G – by just adding the envelope of H to Ḡ′. An important detail is that, whenever we recurse, we increase the value of the parameter t by squaring it. Intuitively, we may employ this more aggressive setting of the value of t since the size of the problem – i.e. the size of the core of H – has decreased by a factor of at least t.
Lemma 5.6.2. For a given graph G = (V, E, u), j ≥ 1, and t ≥ 2, Procedure Find_Trees(G, t, j) works in Õ(m + tnT² log U) time and returns a j-tree Ḡ that (log^((1+o(1))T) n)-preserves the cuts of G with probability (1/2)^T, where T ≤ max{⌈log(log_t(n/j) + 1)⌉, 1}, n = |V|, m = |E|, and U is the capacity ratio of G. Moreover, the capacity ratio of Ḡ is n^((2+o(1))T) U.
Proof. Let us define n_l to be an upper bound on the number of vertices of the graph that we are dealing with at the beginning of the l-th recursive call of the procedure. Also, let t_l be the value of t, and let U_l be the upper bound on the capacity ratio of this graph, at the same moment. Clearly, n_0 = n, t_0 = t, and U_0 = U. By solving a simple recurrence we see that

t_l = t^(2^l)   (5.3)

and thus

n_{l+1} ≤ n_l / t_l ≤ n / t^(2^(l+1) − 1).   (5.4)

Now, by Theorem 5.3.6 and Corollary 2.6.2, we know that U_{l+1} is O(n_l² U_l). Moreover, by the fact that we sparsify G first, and by Theorem 5.3.6, we know that in the l-th iteration it is sufficient to take t′ = t_l t̂_l, where t̂_l is log^O(1) n log U_l.
Note that every graph on at most j vertices is a j-tree; thus, the procedure stops issuing further recursive calls once the number of vertices of the graph is at most j. So, the number of recursive calls of our procedure is at most T − 1, and the total running time is at most

Õ(m) + O(Σ_{l=0}^{T−1} t̂_l t_l n_l) ≤ Õ(m + Σ_{l=0}^{T−1} n t^(2^l) log U_l / t^(2^l − 1)) ≤ Õ(m + tn Σ_{l=0}^{T−1} log U_l) = Õ(m + tnT² log U),

where Õ(m) is the cost of the initial cut sparsification of G.
To establish the desired bound on the quality of the preservation of the cuts of G by Ḡ, we prove that, for all l ≥ 0, if the execution of the procedure made l recursive calls, then the graph Ḡ (log^((1+o(1))(l+1)) n)-preserves the cuts of G with probability (1/2)^(l+1). Clearly, this claim implies the desired bounds when combined with the fact that l ≤ T − 1.
We prove the claim inductively. For l = 0, it follows from Corollary 2.6.2, Theorem 5.3.6, and Fact 5.3.4. Now, assume that the claim holds for l ≥ 0; we will prove it for l + 1. By Corollary 2.6.2, Theorem 5.3.6, and Fact 5.3.4, we know that H (log^(1+o(1)) n)-preserves the cuts of G with probability 1/2. So, by our inductive assumption, we know that Ḡ′ (log^((1+o(1))(l+1)) n)-preserves the cuts of H′ with probability (1/2)^(l+1). Therefore, by Lemma 5.6.1, we know that Ḡ also (log^((1+o(1))(l+1)) n)-preserves the cuts of H with probability (1/2)^(l+1), which – by the above-mentioned relation of the cut structure of H to the cut structure of G – implies that Ḡ (log^((1+o(1))(l+2)) n)-preserves the cuts of G with probability (1/2)^(l+2). So, our claim holds for l + 1 and thus for all values of l. This concludes the proof of the lemma.
Now, proving Theorem 5.3.7 boils down to running Find_Trees(G, t, j) with the right setting of parameters, and sufficiently many times, so as to boost the probability that the obtained collection of j-trees preserves the cuts of G with high probability.
Proof of Theorem 5.3.7. To obtain the desired collection of n^l-trees, we just invoke the procedure Find_Trees(G, t, j) on G (2^(k+1) ln n) times with t := n^((1−l)/(2^k − 1)). Note that, for this choice of parameters, we get that in the statement of Lemma 5.6.2, T ≤ k. Therefore, this lemma allows us to conclude that each invocation of the procedure takes Õ(n^(1 + (1−l)/(2^k − 1)) log U) time to execute, and thus we obtain the (2^(k+1) ln n) n^l-trees {G_i}_i in total time

Õ(m + 2^k n^(1 + (1−l)/(2^k − 1)) log U).

(We used here the fact that it is sufficient to cut sparsify G only once for all the procedure calls.)
Furthermore, Lemma 5.6.2 implies that the capacity ratio of the obtained trees is at most n^((2+o(1))k) U, and that each particular n^l-tree G_i (log^((1+o(1))k) n)-preserves the cuts of G with probability (1/2)^k. However, it is easy to see that this in turn means that the whole collection {G_i}_i (log^((1+o(1))k) n)-preserves the cuts of G with probability (1 − (1 − (1/2)^k)^(2^(k+1) ln n)) ≥ (1 − 1/n²), as desired. The theorem follows.
Part II
Chapter 6
6.1 Introduction
The traveling salesman problem (TSP) – the task of finding a minimum cost tour
that visits each city in a given set at least once – is one of the most celebrated and
extensively studied problems in combinatorial optimization. The book by Lawler et
al. [97] provides a tour d’horizon of all the concepts and techniques developed in the
context of TSP.
The traveling salesman problem comes in two variants. The symmetric version
(STSP) assumes that the cost c(v, u) of getting from city v to city u is equal to c(u, v),
while the more general asymmetric version (ATSP) that we study here does not make
this assumption. Note that since we can replace every arc (u, v) in the tour with the
shortest path from u to v, we can assume that the cost function c satisfies the triangle
inequality, i.e., we have c(u, w) ≤ c(u, v) + c(v, w) for all u, v, and w.
Despite a lot of interest and work, little progress has been made over the last three
decades on its approximability in the general metric case. For the symmetric variant,
there is a celebrated factor 3/2 approximation algorithm due to Christofides [41].
This algorithm is based on first finding a minimum cost spanning tree T on V , then
¹This chapter is based on joint work with Arash Asadpour, Michel Goemans, Shayan Oveis Gharan, and Amin Saberi, and contains material from [18].
finding the minimum cost Eulerian augmentation of that tree, and finally shortcutting the corresponding Eulerian walk into a tour. No better approximation algorithm has since been found for the general symmetric metric case. For the asymmetric case, no constant-factor approximation algorithm is known. Frieze et al. [69] gave a simple log n-approximation algorithm for ATSP, which was improved slightly by subsequent results down to (2/3) log n [30, 78, 63]. This is in sharp contrast to the best inapproximability result of Papadimitriou and Vempala [111], which shows the nonexistence of a 117/116-approximation algorithm for the ATSP, and of a 220/219-approximation algorithm for the STSP, unless P = NP. Whether ATSP can be approximated within a constant factor is a major open question, and so is whether a c-approximation algorithm for STSP can be obtained for a constant c < 3/2.
In this chapter, we make progress on this question by presenting an O(log n / log log n)-approximation algorithm for the ATSP. This approximation factor finally breaks the long-standing Θ(log n) barrier mentioned above. Our approach to the asymmetric version has similarities with Christofides' algorithm; we first construct a spanning tree with special properties. Then we find a minimum cost Eulerian augmentation of this tree and, finally, shortcut the resulting Eulerian walk. (Recall that for undirected graphs, being Eulerian means being connected and having even degrees, while for directed graphs it means being (strongly) connected and having the indegree of every vertex equal to its outdegree.)
Our way of bounding the cost of the Eulerian augmentation is based on applying a simple flow argument using Hoffman's circulation theorem [120], which shows that if the tree chosen in the first step is "thin", then the cost of the Eulerian augmentation is within a factor of the "thinness" of the (asymmetric) Held-Karp linear programming (LP) relaxation value (OPT_HK) [81]. This flow argument works irrespective of the actual directions of the (directed) arcs corresponding to the (undirected) edges of the tree. Roughly speaking, a thin tree with respect to the optimum solution x* of the Held-Karp relaxation is a spanning tree that, for every cut, contains a small multiple (the thinness) of the corresponding value of x* in this cut when the directions of the arcs are disregarded.
A key step of our algorithm is to find a thin tree of small cost compared to the LP relaxation value OPT_HK. For this purpose, we employ a sampling procedure that produces a random spanning tree (cf. Section 2.5.2) with respect to weights {λ_e}_e that are chosen so as to make the probability of each edge appearing in this tree (approximately) agree with the marginal probabilities obtained from the symmetrized LP solution (scaled by 1 − 1/n). To make this procedure efficient, we develop a simple iterative algorithm for (approximately) computing these λ_e's efficiently.

The main motivation behind choosing such a sampling procedure is that the events corresponding to edges being present in the sampled tree are negatively correlated. This means that the upper tail of the well-known Chernoff bound for the independent setting still holds; see Panconesi and Srinivasan [110]. The proof of the O(log n / log log n)-thinness of the sampled tree is based on combining this tail bound with the "union-bounding over cuts" technique of Karger [85].
The high level description of our algorithm can be found in Figure 6-1. The proof
of our main Theorem 6.6.4 also gives a more formal overview of the algorithm.
INPUT: A set V consisting of n points and a cost function c : V × V → R+ satisfying the triangle inequality.

ALGORITHM:

1. Solve the Held-Karp LP relaxation (6.1) to obtain an optimal solution x*, and derive from it the symmetrized and scaled vector z* as in (6.5).

2. Find weights {λ̃_e}_e that (approximately) preserve the marginals imposed by z* (cf. Theorem 6.4.1), and sample a λ̃-random spanning tree T*.

3. Orient each edge of T* so as to minimize its cost. Find a minimum cost integral circulation that contains the oriented tree T⃗*. Shortcut this multigraph and output the resulting tour.

Figure 6-1: The high-level description of our algorithm for ATSP.
6.2 Notation
Before describing our approximation algorithm for ATSP in detail, we need to introduce some notation. Throughout this chapter, we use a = (u, v) to denote the arc (directed edge) from u to v, and e = {u, v} for an undirected edge. Also, we use A (resp. E) for the set of arcs (resp. edges) in a directed (resp. undirected) graph. For a function x defined on the set of arcs A and a subset S ⊆ A, we write x(S) := Σ_{a∈S} x_a. We use the same notation for a function defined on the edge set E of an undirected
graph. For U ⊆ V, we also define the following sets of arcs:

δ⁺(U) := {a = (u, v) ∈ A : u ∈ U, v ∉ U},
δ⁻(U) := δ⁺(V \ U),
A(U) := {a = (u, v) ∈ A : u ∈ U, v ∈ U}.
Similarly, for an undirected graph G = (V, E), δ(U ) denotes the set of edges with
exactly one endpoint in U , and E(U ) denotes the edges entirely within U , i.e. E(U ) =
{{u, v} ∈ E : u ∈ U, v ∈ U }.
Our algorithm makes use of the following classical LP relaxation of ATSP:

min Σ_{a∈A} c(a) x_a   (6.1)
s.t. x(δ⁺(U)) ≥ 1   for all U ⊂ V with U ≠ ∅,   (6.2)
     x(δ⁺(v)) = x(δ⁻(v)) = 1   for all v ∈ V,   (6.3)
     x_a ≥ 0   for all a ∈ A.

This relaxation is known as the Held-Karp relaxation [81] and its optimum value, which we denote by OPT_HK, can be computed in polynomial time (either by the ellipsoid algorithm or by reformulating it as an LP of polynomially-bounded size). Observe that (6.3) implies that any feasible solution x to the Held-Karp relaxation satisfies

x(δ⁺(U)) = x(δ⁻(U))   (6.4)

for any U ⊂ V.
Let x* denote an optimum solution to this LP (6.1); thus c(x*) = OPT_HK. We can assume that x* is an extreme point of the corresponding polytope. We first make this solution symmetric and slightly scale it down by setting

z*_{u,v} := (n−1)/n (x*_{uv} + x*_{vu}).   (6.5)
Let A denote the support of x*, i.e. A = {(u, v) : x*_{uv} > 0}, and E the support of z*. For every edge e = {u, v} of E, we can define its cost as min{c(a) : a ∈ {(u, v), (v, u)} ∩ A}; at the risk of overloading the notation, we denote this new cost of the edge e by c(e). This implies that c(z*) < c(x*).
The main purpose of the scaling factor in the definition (6.5) is to obtain a vector
z ∗ which belongs to the spanning tree polytope P of the graph (V, E), i.e. z ∗ can be
viewed as a convex combination of incidence vectors of spanning trees, see Lemma
6.3.1. In fact, z ∗ even belongs to the relative interior of P .
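As a quick illustration, (6.5) translates directly into code; x_star below is assumed to be a dictionary over the arcs in the support of x*.

def symmetrize(x_star, n):
    # z*_{u,v} := (n-1)/n * (x*_{uv} + x*_{vu}), per (6.5)
    z = {}
    for (u, v), val in x_star.items():
        e = (min(u, v), max(u, v))
        z[e] = z.get(e, 0.0) + (n - 1) / n * val
    return z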
Lemma 6.3.1. The vector z* defined by (6.5) belongs to the relative interior of the spanning tree polytope P.

Proof. From Edmonds' characterization of the base polytope of a matroid [56], it follows that the spanning tree polytope P is defined by the following inequalities (see [120, Corollary 50.7c]):

P = {z ∈ R^E : z(E) = |V| − 1,   (6.6)
     z(E(U)) ≤ |U| − 1 for all U ⊂ V,   (6.7)
     z_e ≥ 0 for all e ∈ E}.   (6.8)

Since x*(δ⁺(v)) = 1 for every v ∈ V, we have x*(A) = n = |V|, and thus z*(E) = (n−1)/n · x*(A) = n − 1 = |V| − 1.
spanning trees of G that preserves the marginal probabilities imposed by z*, i.e. Pr[e ∈ T] = z*_e for every edge e ∈ E, where G = (V, E) is the graph whose edge set E is the support of z*. There are plenty of such distributions. The distribution we actually choose in our approach corresponds to taking the tree to be a λ-random tree for some vector λ ∈ R^{|E|}, i.e., a spanning tree T chosen from the collection of all possible spanning trees with probability proportional to Π_{e∈T} λ_e – cf. Section 2.5.2.
As we will see later – see Theorem 6.4.1 and the discussion after its statement – such a vector λ always exists, and an approximation λ̃ of it (that suffices for our purposes) can be found efficiently. Furthermore, as we describe in Section 6.4.1, given λ̃, we can sample a λ̃-random tree sufficiently fast too.
The main reason behind the use of such a distribution is that the events corresponding to the edges of G being included in a sampled λ-random tree are negatively correlated – see Section 6.4.2. This enables us to use the upper tail of the Chernoff bound on such events. We use this tail bound – together with the union-bounding technique of Karger [85] – to establish the thinness of a sampled tree. (Roughly speaking, a tree is said to be thin if the number of its edges in each cut is not much higher than its expected value; see Section 6.5 for a formal definition of thinness.) This, in turn, will enable us to obtain from this sampled thin tree a tour in G that has sufficiently small cost – see Section 6.6 for details.

Before we proceed, let us just note that in Section 6.7 we show how to efficiently find λ̃_e's that approximately preserve the marginals dictated by z*. More formally, we prove the following theorem, and in the rest of the chapter we use the vector λ̃ that is produced by it for ε = 0.2. (Recall that for our ATSP application, z_min is 2^(−O(n ln n)) and having ε equal to 0.2 suffices.)
Theorem 6.4.1. Given z in the spanning tree polytope of G = (V, E) and some ε > 0, one can find values λ̃_e, for all e ∈ E, such that if we sample a λ̃-random tree T, then

Pr[e ∈ T] ≤ (1 + ε) z_e,

for each e, i.e. the marginals are (1 + ε)-approximately preserved. Furthermore, the running time of the resulting algorithm is polynomial in n = |V|, −ln z_min, and 1/ε.
The above theorem shows that for any z one can find a vector λ̃ that preserves the marginals up to arbitrarily small precision. This suffices for our purposes. However, a natural question here is whether there always exists a vector λ that preserves the marginals exactly. It turns out (cf. [18]) that such a vector indeed exists, as long as z is in the relative interior of the spanning tree polytope. Furthermore, this requirement on z is necessary (in fact, this has been observed before; see Exercise 4.19 in [103]). To see why this is the case, suppose G is a triangle and z is the vector (1/2, 1/2, 1). In this case, z is in the polytope (but not in its relative interior), and there are no λ_e's that preserve the marginals exactly.
6.4.1 Sampling a λ-Random Tree
There is a host of results (cf. [77, 96, 45, 4, 33, 138, 90] and Chapter 7) on obtaining
polynomial-time algorithms for generating a uniform spanning tree, i.e. a λ-random
tree for the case of all λe s being equal. Almost all of them can be easily modified to
allow arbitrary λ; however, not all of them still guarantee a polynomial running time
for such general λe s. (For instance, the algorithm to be present in Chapter 7 might
be not polynomial-time in this setting.) Therefore, to circumvent this problem we
will use an iterative approach similar to [96] that is still polynomial for general λe s.
This approach is based on ordering the edges e_1, …, e_m of G arbitrarily and processing them one by one, deciding probabilistically whether to add a given edge to the final tree or to discard it. More precisely, when we process the j-th edge e_j, we decide to add it to the final spanning tree T with probability p_j, which is the probability that e_j is in a λ-random tree, conditioned on the decisions that were made for the edges e_1, …, e_{j−1} in earlier iterations. Clearly, this procedure generates a λ-random tree, and its running time is polynomial as long as the computation of the probabilities p_1, …, p_m can be done in polynomial time.
To compute these probabilities efficiently we note that, by Fact 2.5.2, we have
that p1 is equal to λe1 times the effective resistance of the edge e1 = (u1 , v1 ) in the
graph G that has the resistance re of each edge e equal to 1/λe . Furthermore, by
definition of the effective resistance (cf. (2.13)), we can compute it by finding the
vertex potentials corresponding to an electrical u1 -v1 flow of value 1. These, in turn,
can be obtained by solving the Laplacian system (cf. (2.10)). Note that we care only
about having a polynomial running time here, so we can afford to solve this system –
and thus compute p_1 – exactly, e.g., by employing Gaussian elimination.
Now, once the probability p1 has been computed, to compute p2 we note that if
we choose to include e1 in the tree then:
p_2 = Pr[e_2 ∈ T | e_1 ∈ T] = (Σ_{T'∋e_1,e_2} ∏_{e∈T'} λ_e) / (Σ_{T'∋e_1} ∏_{e∈T'} λ_e)
    = (Σ_{T'∋e_1,e_2} ∏_{e∈T'\{e_1}} λ_e) / (Σ_{T'∋e_1} ∏_{e∈T'\{e_1}} λ_e).
As one can see, the probability that e2 ∈ T conditioned on the event that e1 ∈ T is
equal to the probability that e2 is in a λ-random tree of a graph obtained from G
by contracting the edge e_1. (Note that one can sum up the λ_e's of the multi-edges
formed during the contractions and replace them with one single edge.) Similarly,
if we choose to discard e1 , the probability p2 is equal to the probability that e2 is
in a λ-random tree of a graph obtained from G by removing e1 . In general, pj is
equal to the probability that ej is included in a λ-random tree of a graph obtained
from G by contracting all edges that we have already decided to add to the tree, and
deleting all edges that we have already decided to discard. So, this means that we
can compute each p_j in polynomial time in a manner completely analogous to the
way we computed p_1 – we just use an appropriately modified version of the graph G
for each of these computations.
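To make the above procedure concrete, here is a minimal Python sketch of it. It is an illustration under simplifying assumptions (dense linear algebra via the Laplacian pseudoinverse in place of a fast solver, and a graph given as a list of (u, v, λ_e) triples); the helper names are ours, not the thesis's.

import random
import numpy as np

def effective_resistance(n, edges, s, t):
    # Weighted Laplacian with conductance lambda_e (i.e., resistance 1/lambda_e)
    # on each edge; parallel edges simply add up their conductances.
    L = np.zeros((n, n))
    for (a, b, lam) in edges:
        L[a, a] += lam; L[b, b] += lam
        L[a, b] -= lam; L[b, a] -= lam
    chi = np.zeros(n); chi[s], chi[t] = 1.0, -1.0
    return float(chi @ np.linalg.pinv(L) @ chi)

def sample_lambda_random_tree(n, edges):
    """edges: list of (u, v, lambda_e) triples of a connected multigraph."""
    parent = list(range(n))                    # union-find encodes contractions
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]; x = parent[x]
        return x

    tree = []
    for j, (u, v, lam) in enumerate(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                           # e_j would close a cycle: p_j = 0
        # Current graph: accepted edges are contracted, rejected ones deleted;
        # the undecided edges are exactly edges[j:].
        comps = sorted({find(x) for x in range(n)})
        idx = {c: i for i, c in enumerate(comps)}
        cur = [(idx[find(a)], idx[find(b)], w)
               for (a, b, w) in edges[j:] if find(a) != find(b)]
        # Fact 2.5.2: p_j = lambda_{e_j} * R_eff(e_j) with resistances 1/lambda_e.
        p_j = lam * effective_resistance(len(comps), cur, idx[ru], idx[rv])
        if random.random() < p_j:
            tree.append((u, v))
            parent[ru] = rv                    # contract e_j
        # otherwise e_j is deleted: it is never considered again
    return tree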
6.4.2 Negative Correlation and a Concentration Bound
We derive now the following concentration bound. As discussed in the next section,
this bound is instrumental in establishing the thinness of a sampled tree.
Theorem 6.4.2. For each edge e, let X_e be an indicator random variable associated
with the event [e ∈ T], where T is a λ-random tree. Also, for any subset C of the
edges of G, define X(C) = Σ_{e∈C} X_e. Then we have
Pr[X(C) ≥ (1 + δ)E[X(C)]] ≤ ( e^δ / (1 + δ)^{1+δ} )^{E[X(C)]}.
Usually, when we want to obtain such concentration bounds, we prove that the
variables {Xe }e are independent and use the Chernoff bound. Unfortunately, in our
case, the variables {Xe }e are not independent. However, it turns out that they are
still negatively correlated, i.e., for any subset F ⊆ E,
Pr[∀_{e∈F} X_e = 1] ≤ ∏_{e∈F} Pr[X_e = 1].
This fact should not be too surprising as, intuitively, if one considers, e.g., F = {e, e'},
then the presence of the edge e in the tree makes the edge e' "less needed" for making
sure that the resulting tree is connected, and more likely to lead to the creation of a cycle.
Also, as the number of edges in a spanning tree is exactly one fewer than the number of
vertices of the underlying graph, the negative correlation holds “on average”. However,
despite this intuitive feel, no direct proof of this negative correlation is known, and
the only proof crucially relies on the connection between random spanning trees and
electrical flows, i.e., Fact 2.5.2.
Lemma 6.4.3. The random variables {Xe }e are negatively correlated.
Proof. Let us consider some subset F of the edges E and let e1 , . . . , ek be some
arbitrary ordering of the edges in F. Now, as we have
Pr[∀_{e∈F} X_e = 1] = ∏_{i=1}^{k} Pr[e_i ∈ T | X_{e_j} = 1 ∀ j < i],
it suffices to show that each conditional probability Pr[e_i ∈ T | X_{e_j} = 1 ∀ j < i] is
at most Pr[e_i ∈ T].
To this end, we note that by Fact 2.5.2 we have that Pr[ei ∈ T ] is equal to λei
times the effective resistance of ei in a version of the graph G with resistances re
equal to 1/λe for each edge e. Furthermore, by the discussion in Section 6.4.1 (and
Fact 2.5.2), we know that Pr[ei ∈ T |Xej = 1∀j<i ] is equal to λei times the effective
resistance of ei in a version of G corresponding to contracting the edges ej for j < i –
which can be viewed as setting the resistances rej of these edges to zero – and setting
the resistances re of the rest of the edges to 1/λe .
However, by Rayleigh's Monotonicity Principle (cf. Corollary 2.4.2), the effective
resistance of e_i in the first version of the graph G can only be larger than or equal to
the effective resistance of e_i in the second version. Thus, the lemma follows.
Once we have established negative correlation between the X_e's, Theorem 6.4.2 follows
directly from the result of Panconesi and Srinivasan [110] that the upper-tail part of
the Chernoff bound requires only negative correlation (or even a weaker notion, see
[110]), and not the full independence of the random variables.
At this point, we want to note that other ways of producing negatively correlated
probability distributions on trees (or, more generally, matroid bases) satisfying some
given marginals z have been proposed by Chekuri, Vondrák, and Zenklusen [39]. Their
approach can also be used in the framework developed in this chapter.
(Recall from Section 6.5 that a spanning tree T is (α, s)-thin with respect to z* if
|T ∩ δ(U)| ≤ αz*(δ(U)) for every cut δ(U), and c(T) ≤ s·OPT_HK.)
We first prove that if we focus on a particular cut then the "α-thinness" property
holds for it with overwhelming probability, where α ≈ log n / log log n.
Lemma 6.5.2. Let T be a λ̃-random tree. Then, for any U ⊂ V,
Pr[|T ∩ δ(U)| > βz*(δ(U))] ≤ n^{−2.5z*(δ(U))},
where β = 4 ln n / ln ln n.
Proof. Note that by definition, for all edges e ∈ E, Pr[e ∈ T ] ≤ (1 + ε)ze∗ , where
ε = 0.2 is our desired accuracy parameter. Hence,
E[|T ∩ δ(U)|] = Σ_{e∈δ(U)} Pr[e ∈ T] = z̃(δ(U)) ≤ (1 + ε)z*(δ(U)),
where we define z̃(δ(U)) := Σ_{e∈δ(U)} Pr[e ∈ T]. Applying Theorem 6.4.2 with
1 + δ = βz*(δ(U)) / z̃(δ(U)) ≥ β / (1 + ε),
we derive that Pr[|T ∩ δ(U )| > βz ∗ (δ(U ))] can be bounded from above by
Here, high probability means probability at least 1 − 1/n, but probability 1 − 1/n^k
can be achieved by sampling 2k ln n trees.
Proof. We start by showing that for any 1 ≤ j ≤ ⌈2 ln n⌉, T_j is β-thin with high
probability for β = 4 ln n / ln ln n. From Lemma 6.5.2 we know that the probability of
some particular cut δ(U) violating the β-thinness of T_j is at most n^{−2.5z*(δ(U))}. Now,
we use a result of Karger [85] that shows that there are at most n^{2l} cuts of size at most
l times the minimum cut value, for any half-integer l ≥ 1. Since, by the definitions of
the Held–Karp relaxation and of z*, we know that z*(δ(U)) ≥ 2(1 − 1/n), this means
that there are at most n^l cuts δ(U) with z*(δ(U)) ≤ l(1 − 1/n), for any integer l ≥ 2.
Therefore, by applying the union bound (and using n ≥ 5), we derive that the probability
that there exists some cut δ(U) with |T_j ∩ δ(U)| > βz*(δ(U)) is at most
Σ_{i=3}^{∞} n^i n^{−2.5(i−1)(1−1/n)},
where each term is an upper bound on the probability that there exists a violating
cut of size within [(i − 1)(1 − 1/n), i(1 − 1/n)]. For n ≥ 5, this simplifies to:
Σ_{i=3}^{∞} n^i n^{−2.5(i−1)(1−1/n)} ≤ Σ_{i=3}^{∞} n^{−i+2} = 1/(n − 1).
Moreover, since Pr[e ∈ T_j] ≤ (1 + ε)z*_e for each edge e, we have E[c(T_j)] ≤
(1 + ε)OPT_HK. So, by Markov's inequality, for any j, the probability that c(T_j) > 2OPT_HK
is at most (1 + ε)/2. Thus, with probability at most ((1 + ε)/2)^{⌈2 ln n⌉} < 1/n for ε = 0.2, we
have c(T*) > 2OPT_HK. This concludes the proof of the theorem.
Theorem 6.6.1. Assume that we are given an (α, s)-thin spanning tree T* with
respect to the LP relaxation x*. Then we can find a Hamiltonian cycle of cost no
more than (2α + s)c(x*) = (2α + s)OPT_HK in polynomial time.
Before proceeding to the proof of Theorem 6.6.1, we recall some basic network
flow results related to circulations. A function f : A → R is called a circulation if
f(δ^+(v)) = f(δ^−(v)) for each vertex v ∈ V. Hoffman's circulation theorem [120, Theorem 11.2]
gives a necessary and sufficient condition for the existence of a circulation
subject to lower and upper capacities on arcs.
Theorem 6.6.2 (Hoffman’s circulation theorem). Given lower and upper capacities
l, u : A → R, there exists a circulation f satisfying l(a) ≤ f (a) ≤ u(a) for all a ∈ A
if and only if
1. l(a) ≤ u(a) for all a ∈ A, and
2. for all subsets U ⊂ V, we have l(δ^−(U)) ≤ u(δ^+(U)).
Furthermore, if l and u are integer-valued, f can be chosen to be integer-valued.
Proof of Theorem 6.6.1. We first orient each edge {u, v} of T* to arg min{c(a) : a ∈
{(u, v), (v, u)} ∩ A}, and denote the resulting directed tree by T⃗*. Observe that by
definition of our undirected cost function, we have c(T⃗*) = c(T*). We then find
a minimum cost augmentation of T⃗* into an Eulerian directed graph; this can be
formulated as a minimum cost circulation problem with integral lower capacities (and
no or infinite upper capacities). Indeed, set
l(a) = 1 if a ∈ T⃗*, and l(a) = 0 if a ∉ T⃗*;
u(a) = 1 + 2αx*_a if a ∈ T⃗*, and u(a) = 2αx*_a if a ∉ T⃗*.
We claim that there exists a circulation g satisfying l(a) ≤ g(a) ≤ u(a) for every
a ∈ A. To prove this claim, we use Hoffman's circulation theorem (Theorem 6.6.2). Indeed, by
construction, l(a) ≤ u(a) for every a ∈ A; furthermore, Lemma 6.6.3 below shows
that, for every U ⊂ V, we have l(δ^−(U)) ≤ u(δ^+(U)). Thus the existence of the
circulation g is established. Furthermore, since g(a) ≤ u(a) ≤ l(a) + 2αx*_a for every
arc a, the cost of g is at most c(T⃗*) + 2αc(x*) ≤ (2α + s)OPT_HK,
establishing the bound on the cost of f*. This completes the proof of Theorem
6.6.1.
Lemma 6.6.3. For the capacities l and u as constructed in the proof of Theorem
6.6.1, the following holds for any subset U ⊂ V:
l(δ^−(U)) ≤ u(δ^+(U)).
Proof. Irrespective of the orientation of T* into T⃗*, the number of arcs of T⃗* in δ^−(U)
is at most αz*(δ(U)), by the definition of α-thinness. Thus
l(δ^−(U)) ≤ αz*(δ(U)) ≤ αx*(δ(U)) = 2αx*(δ^+(U)) ≤ u(δ^+(U)),
where we have used the fact that x* itself is a circulation (see (6.4)). The lemma
follows.
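To illustrate the construction in this proof, here is a hedged Python sketch (using networkx, which is our choice and not the thesis's) that builds the circulation via the standard reduction of lower bounds to node demands, and then shortcuts the resulting Eulerian circuit. The names eulerian_augmentation, tree_arcs, x_star, and cost are ours; the fractional upper capacities are rounded up so that an integral solver applies, and the final shortcutting step assumes the costs obey the triangle inequality.

import math
import networkx as nx

def eulerian_augmentation(nodes, arcs, cost, tree_arcs, x_star, alpha):
    # Capacities from the proof: l(a) = 1 on arcs of the oriented tree and 0
    # elsewhere; u(a) = l(a) + 2*alpha*x*_a (rounded up to an integer here).
    H = nx.DiGraph()
    H.add_nodes_from(nodes, demand=0)
    for (p, q) in arcs:
        l = 1 if (p, q) in tree_arcs else 0
        u = l + math.ceil(2 * alpha * x_star[(p, q)])
        # Standard reduction of the lower bound l: ship l units outright and
        # adjust the endpoint demands (networkx: inflow - outflow = demand).
        H.nodes[p]['demand'] += l
        H.nodes[q]['demand'] -= l
        H.add_edge(p, q, capacity=u - l, weight=cost[(p, q)])
    # Feasibility is guaranteed by Hoffman's theorem together with Lemma 6.6.3.
    flow = nx.min_cost_flow(H)

    # Recover g(a) = f'(a) + l(a) and lay g(a) parallel copies of each arc.
    M = nx.MultiDiGraph()
    M.add_nodes_from(nodes)
    for (p, q) in arcs:
        g = flow[p][q] + (1 if (p, q) in tree_arcs else 0)
        M.add_edges_from([(p, q)] * g)
    # g is a circulation containing the spanning tree, so M is Eulerian;
    # shortcutting the Eulerian circuit (triangle inequality!) gives a tour.
    visited, tour = set(), []
    for (p, q) in nx.eulerian_circuit(M):
        if p not in visited:
            visited.add(p); tour.append(p)
    return tour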
Theorem 6.6.4. For a suitable choice of parameters, the algorithm given in Figure 6-
1 finds a (2+8 ln n/ ln ln n)-approximate solution to the asymmetric traveling salesman
problem with high probability and in time polynomial in the size of the input.
The proof also shows that the integrality gap of the Held-Karp relaxation for the
asymmetric TSP is bounded above by 2 + 8 ln n/ ln ln n. The best known lower bound
on the integrality gap is only 2, as shown in [36]. Closing this gap is a challenging
open question, and this possibly could be answered using thinner spanning trees. In
particular, we formulate the following conjecture.
Conjecture 6.6.5. There always exists a (C_1, C_2)-thin spanning tree where C_1 and C_2
are constants. Therefore, the integrality gap of the ATSP Held–Karp linear program-
ming relaxation (6.1) is a constant.
Given a vector γ, for each edge e, define q_e(γ) := Σ_{T∋e} exp(γ(T)) / Σ_T exp(γ(T)),
where γ(T) = Σ_{f∈T} γ_f.
1. Set γ = 0⃗.
2. While there exists an edge e with q_e(γ) > (1 + ε)z_e, decrease γ_e (by the amount
δ computed below) so that, for the resulting vector γ', q_e(γ') = (1 + ε/2)z_e.
3. Output γ̃ := γ.
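The following Python sketch mirrors this procedure on a small graph. It relies on the identity (cf. Fact 2.5.2) that q_e(γ) equals λ_e times the effective resistance of e when each edge has conductance λ_e = exp(γ_e); the dense pseudoinverse computation and the function names are our simplifications, not the thesis's implementation.

import numpy as np

def q_values(n, edges, gamma):
    # q_e(gamma) = lambda_e * R_eff(e) in the graph with conductances
    # lambda_e = exp(gamma_e)  (the lambda-random-tree marginals).
    lam = np.exp(gamma)
    L = np.zeros((n, n))
    for k, (u, v) in enumerate(edges):
        L[u, u] += lam[k]; L[v, v] += lam[k]
        L[u, v] -= lam[k]; L[v, u] -= lam[k]
    Lp = np.linalg.pinv(L)
    return np.array([lam[k] * (Lp[u, u] + Lp[v, v] - 2 * Lp[u, v])
                     for k, (u, v) in enumerate(edges)])

def find_gamma(n, edges, z, eps):
    gamma = np.zeros(len(edges))
    while True:
        q = q_values(n, edges, gamma)
        violating = np.where(q > (1 + eps) * np.asarray(z))[0]
        if violating.size == 0:
            return gamma                     # gamma-tilde: all marginals ok
        e = violating[0]
        target = (1 + eps / 2) * z[e]
        # delta chosen so that q_e drops to exactly (1 + eps/2) z_e
        delta = np.log(q[e] * (1 - target) / ((1 - q[e]) * target))
        gamma[e] -= delta

# Example: the triangle with z = (0.9, 0.9, 0.2), a point of the spanning
# tree polytope (the values are our illustrative choice).
print(find_gamma(3, [(0, 1), (1, 2), (0, 2)], [0.9, 0.9, 0.2], 0.2))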
Clearly, if the above procedure terminates then the resulting γ̃ satisfies the
requirement of Theorem 6.4.1. Therefore, what we need to show is that this algorithm
terminates in time polynomial in n, −ln z_min and 1/ε, and that each iteration can be
implemented in polynomial time.
We start by bounding the number of iterations – we will show that it is
O((1/ε)|E|²[|V| ln(|V|) − ln(εz_min)]).
In the next lemma, we derive an equation for δ, and prove that for f ≠ e the
probabilities q_f(·) do not decrease as a result of decreasing γ_e.
In particular, in the main loop of the algorithm, since q_e(γ) > (1 + ε)z_e and we
want q_e(γ') = (1 + ε/2)z_e, we get
δ = ln [ q_e(γ)(1 − (1 + ε/2)z_e) / ((1 − q_e(γ))(1 + ε/2)z_e) ] > ln ((1 + ε)/(1 + ε/2)) > ε/4
for ε ≤ 1 (for larger values of ε, we can simply decrease ε to 1).
Proof. Let us consider some f ∈ E \ {e}. We have
q_f(γ') = Σ_{T∈T: f∈T} e^{γ'(T)} / Σ_{T∈T} e^{γ'(T)}
        = (Σ_{T: e∈T, f∈T} e^{γ'(T)} + Σ_{T: e∉T, f∈T} e^{γ'(T)}) / (Σ_{T∋e} e^{γ'(T)} + Σ_{T: e∉T} e^{γ'(T)})
        = (e^{−δ} Σ_{T: e∈T, f∈T} e^{γ(T)} + Σ_{T: e∉T, f∈T} e^{γ(T)}) / (e^{−δ} Σ_{T∋e} e^{γ(T)} + Σ_{T: e∉T} e^{γ(T)})
        = (e^{−δ}a + b) / (e^{−δ}c + d)
with a, b, c, d appropriately defined. The same expression holds for q_f(γ) with the e^{−δ}
factors removed. But, for general a, b, c, d ≥ 0, if a/c ≤ (a + b)/(c + d) then
(xa + b)/(xc + d) ≥ (a + b)/(c + d) for x ≤ 1. Since
a/c = Σ_{T∈T: e∈T, f∈T} e^{γ(T)} / Σ_{T∈T: e∈T} e^{γ(T)} ≤ q_f(γ) = (a + b)/(c + d)
by negative correlation (since a/c represents the conditional probability that f is
present given that e is present), we get that q_f(γ') ≥ q_f(γ) for δ ≥ 0.
Now, we proceed to deriving the equation for δ. By definition of q_e(γ), we have
1/q_e(γ) − 1 = Σ_{T: e∉T} e^{γ(T)} / Σ_{T∋e} e^{γ(T)}.
Hence,
1/q_e(γ') − 1 = Σ_{T: e∉T} e^{γ'(T)} / Σ_{T∋e} e^{γ'(T)}
             = Σ_{T: e∉T} e^{γ(T)} / (e^{−δ} Σ_{T∋e} e^{γ(T)})
             = e^δ (1/q_e(γ) − 1).
Before bounding the number of iterations, we collect some basic results regarding
spanning trees which we will need for that bound.
2. Any spanning tree T ∈ T= can be generated by taking the union of any spanning
forest F (of cardinality r) of the graph (V, Q) and a spanning tree (of cardinality
n − r − 1) of the graph G/Q in which the edges of Q have been contracted.
3. Let Tmax be a maximum spanning tree of G with respect to the weights γ(·), i.e.
Tmax = arg maxT ∈T γ(T ). Then, for any T ∈ T< , we have γ(T ) < γ(Tmax ) − ∆.
Proof. These properties easily follow from the matroidal properties of spanning trees.
To prove 3., consider any T ∈ T< . Since |T ∩ Q| < r, there exists an edge f ∈
(T_max ∩ Q) \ T such that (T ∩ Q) ∪ {f} is a forest of G. Therefore, the unique circuit
in T ∪ {f} contains an edge e ∉ Q. Thus T' = T ∪ {f} \ {e} is a spanning tree. Our
assumption on Q implies that
Clearly, by construction of Q, property (II) is satisfied. Also, Q is non-empty since
f* ∈ Q_0 ⊆ Q_j = Q. Finally, by the pigeonhole principle, since we have m different
edges, we know that j < m. Thus, for each e ∈ Q we have γ_e > Γ_m = −ετ/(4m). This
means that e* ∉ Q and thus Q has property (I).
Observe that Q satisfies the hypothesis of Lemma 6.7.2 with ∆ = ετ/(4m²).
On the other hand, since z satisfies z(E) = n − 1 and z(Q) ≤ r (by definition of
r, see Lemma 6.7.2, part 1.), we have that z(E \ Q) ≥ n − r − 1. Therefore, (6.10)
implies that there must exist ê ∉ Q such that q̂_ê ≤ z_ê.
Our final step is to show that for any e ∉ Q, q_e < q̂_e + εz_min/2. Note that once we
establish this, we know that q_ê < q̂_ê + εz_min/2 ≤ (1 + ε/2)z_ê, and thus it must be the case
that γ_ê = 0. But this contradicts the fact that ê ∉ Q, as by construction all e with
γ_e = 0 must be in Q. Thus, we obtain a contradiction that concludes the proof of the
lemma.
It remains to prove that for any e ∉ Q, q_e < q̂_e + εz_min/2. We have that
q_e = Σ_{T∈T: e∈T} e^{γ(T)} / Σ_{T∈T} e^{γ(T)}
    = (Σ_{T∈T_=: e∈T} e^{γ(T)} + Σ_{T∈T_<: e∈T} e^{γ(T)}) / Σ_{T∈T} e^{γ(T)}
    ≤ Σ_{T∈T_=: e∈T} e^{γ(T)} / Σ_{T∈T_=} e^{γ(T)} + Σ_{T∈T_<: e∈T} e^{γ(T)} / Σ_{T∈T} e^{γ(T)}
    ≤ Σ_{T∈T_=: e∈T} e^{γ(T)} / Σ_{T∈T_=} e^{γ(T)} + Σ_{T∈T_<: e∈T} e^{γ(T)} / e^{γ(T_max)},    (6.11)
the first inequality following from replacing T with T_= in the first denominator,
and the second inequality following from considering only one term in the second
denominator. Using (6.9) and the fact that the number of spanning trees is at most
n^{n−2}, the second term is bounded by:
Σ_{T∈T_<: e∈T} e^{γ(T)} / e^{γ(T_max)} ≤ n^{n−2} e^{−ετ/4m²}    (6.12)
                                      < (1/2) n^n e^{−ετ/4m²}
                                      = εz_min/2,
by definition of τ. To handle the first term of (6.11), we can use part 2. of Lemma
6.7.2 and factorize:
Σ_{T∈T_=} e^{γ(T)} = Σ_{T̂∈T̂} e^{γ(T̂)} ( Σ_{T'∈F} e^{γ(T')} ),
where F is the set of all spanning forests of (V, Q). Similarly, we can write
Σ_{T∈T_=: e∈T} e^{γ(T)} = Σ_{T̂∈T̂: e∈T̂} e^{γ(T̂)} ( Σ_{T'∈F} e^{γ(T')} ).
Chapter 7
Faster Generation of Random Spanning Trees
In this chapter, we set forth a new algorithm for generating uniformly random span-
ning trees in undirected graphs.¹ Our algorithm produces a sample from the uniform
distribution over all the spanning trees of the underlying graph in expected time
Õ(m√n). This improves the sparse-graph case of the best previously known worst-
case bound of O(min{mn, n^{2.376}}).
To achieve this result, we exploit the connection between random walks and electri-
cal flows (cf. Section 2.5.2) to integrate the traditional random-walk-based techniques
for this problem with combinatorial graph partitioning primitives and fast Laplacian
system solvers discussed in Section 2.7.
7.1 Introduction
The topic of this chapter is generation of uniformly random spanning trees in undi-
rected graphs. Random spanning trees are among the oldest and most extensively
investigated probabilistic objects in graph theory, with their study dating back to
Kirchhoff’s work in the 1840s [92]. However, it is only in the past several decades
that researchers have taken up the question of how best to generate uniformly random
spanning trees algorithmically. This question became an active area of research in
the 1980s and 1990s, during which a long string of papers appeared that provided
successively faster algorithms for this task (e.g.,[77, 96, 44, 45, 33, 4, 82, 138]).
Previous algorithms for this problem broadly fall into two categories: determinant-
based algorithms and random-walk-based algorithms. The starting point for the
determinant-based algorithms was Kirchhoff’s Matrix Tree Theorem, which reduces
counting the number of spanning trees in a graph to the evaluation of a determinant
[92] (see, e.g., Ch. 2, Thm. 8 in [31]). The first such algorithm produced a random
spanning tree in time O(mn³) [77, 96]. After a sequence of improvements ([44, 43]),
this line of research culminated in the algorithm of Colbourn, Myrvold, and Neufeld
¹This chapter is based on joint work with Jonathan Kelner and James Propp, and contains
material from [90].
[45], which runs in the amount of time necessary to multiply two n × n matrices, the
best known bound for which is O(n^{2.376}) [46].
The random-walk-based algorithms began with the following striking theorem due
to Broder [33] and Aldous [4].
Theorem 7.1.1 (Broder, Aldous). Suppose we simulate a random walk in a connected
graph G = (V, E), starting from an arbitrary vertex s and continuing until every vertex
has been visited. For each vertex v ∈ V \ {s}, let e_v be the edge through which v was
visited for the first time. Then the set {e_v | v ∈ V \ {s}} is a uniformly random
spanning tree of G.
This immediately yields an algorithm for generating a random spanning tree whose
running time is proportional to the cover time of G. If G has n vertices and m
edges, the cover time can be Θ(mn) in the worst case, but it is often much smaller.
For sufficiently sparse graphs, this yields a better worst-case running time than the
determinant-based algorithms. Now, since one clearly needs to see every vertex in G,
it would seem unlikely that such methods could run in less than the cover time of the
graph. However, in the last major breakthrough in this line of research, Wilson [138]
showed that, by using a different random process, one could actually generate span-
ning trees in expected time proportional to the mean hitting time of the graph, which
can be much smaller than the cover time (but has the same worst-case asymptotics).
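For concreteness, here is a short Python sketch (ours, not from the thesis) of the random-walk procedure that this theorem yields:

import random

def aldous_broder(adj, s):
    """adj: dict mapping each vertex to a list of its neighbours; s: start vertex."""
    visited = {s}
    tree, current = set(), s
    while len(visited) < len(adj):
        nxt = random.choice(adj[current])
        if nxt not in visited:
            visited.add(nxt)
            tree.add((current, nxt))   # the first-entry edge e_v for v = nxt
        current = nxt
    return tree                        # a uniformly random spanning tree of G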
The worst-case running time bound of O(mn) has stood for twenty years. In
this chapter, our main result is a new algorithm that offers a better worst-case time
bound:
Theorem 7.1.2. There is an algorithm that, given an undirected graph G with n
vertices and m edges, generates a uniformly random spanning tree of G in expected
time Õ(m√n).
Beyond the classical applications of generating random spanning trees that mo-
tivated the original work on the problem, there have been some developments that
further motivate their study. In particular, Goyal, Rademacher, and Vempala [74]
showed how to use random spanning trees to generate efficient sparsifiers of a graph,
and they then explained how this could be used to provide a scalable and robust
routing scheme.
We say that a procedure A generates a conditionally random arborescence if it
outputs a vertex s and an arborescence T rooted at s such that the probability,
conditioned on some s being a root, of outputting a particular arborescence T rooted
at s is 1/|T_s(G)|, where T_s(G) is the set of all arborescences in G rooted at s. Note
that in this definition we do not make any restrictions on the probabilities p_A(s) that
the arborescence output by A is rooted at s. Now, it is easy to see that, once
we fix some s ∈ G, there is a one-to-one correspondence between spanning trees of G
and arborescences rooted at s. Indeed, given any spanning tree, there is a unique way
of directing its edges to make it a valid arborescence rooted at s; conversely, given
any arborescence rooted at s, we can obtain a spanning tree by just disregarding
the direction of the edges. As a result, if we have a procedure A that generates a
random arborescence then, for a given spanning tree T, the probability that it will
be generated is exactly Σ_{s∈G} p_A(s)/|T_s(G)| = Σ_{s∈G} p_A(s)/|T(G)| = 1/|T(G)|. This
means that if we interpret the arborescence returned by A as a spanning tree then
we get in this way a random spanning tree.
unvisited vertex traversed by the walk is to generate the whole trajectory of X inside
the previously visited parts of G step-by-step.
The core of our approach is showing that there indeed exists an efficient way of
shortcutting X. At a high level, the way we obtain it is as follows. We start by
identifying a number of induced subgraphs D1 , . . . , Dk of G such that the cover time
of each D_i is relatively small, and the set C of edges of G that are not inside any of
the D_i's constitutes a small fraction of the edges of G. Now, we shortcut the walk
X in G by removing, for each i, the trajectories of X inside D_i that occur after X
has already explored the whole D_i. Such shortcut transcripts clearly retain all of the
information that we need. Moreover, we show that the fact that the D_i's have small cover
time and C has very small size implies that the expected length of a shortcut transcript
is also small. Finally, by exploiting the connection between random walks and
electrical flows (namely, Fact 2.5.1) together with the linear system solver discussed
in Section 2.7, we provide an efficient procedure to approximately generate such
shortcut transcripts of X. Altogether, this yields the desired procedure for generating
random arborescences.
Definition 7.2.1. A partition (D1 , . . . , Dk , S, C) of G is a (φ, γ)-decomposition of
G if:
1. |C| ≤ φ|E(G)|,
Finally, let Z*_i be the random variable corresponding to the number of times that
some edge from E(D_i) is traversed by our walk X until time τ_i occurs, i.e., until X
explores the whole subgraph D_i. The following lemma holds:
Before we proceed to the proof, it is worth noting that the above lemma does not
directly follow from the result of Aleliunas et al.[5]. The reason is that [5] applies
only to a natural random walk in Di , and the walk induced by X in Di is different.
Fortunately, the fact that |C(Di )| ≤ |E(Di )| allows us to adjust the techniques of [5]
to our situation and prove that a bound similar to the one of Aleliunas et al. still
holds.
Proof. Let us fix D = D_i. For a vertex v ∈ V(D), let d_G(v) be the degree of v in G, and
let d_D(v) be the degree of v in D. Clearly, d_G(v) ≥ d_D(v), and d_C(v) ≡ d_G(v) − d_D(v) is
the number of edges from C incident to v. For u, v ∈ U(D), let p^D_{u,v} be the probability
that a random walk in G that starts at u will reach v through a path that does not
pass through any edge inside D.
Consider a (weighted) graph D', which we obtain from D by adding, for each
u, v ∈ U(D), an edge (u, v) with weight d_G(u)·p^D_{u,v}. (All edges from D have weight 1
in D'. Note that we do not exclude the case u = v, so we may add self-loops.) By the
fact that the Markov chain corresponding to the random walk X in G is reversible,
d_G(u)·p^D_{u,v} = d_G(v)·p^D_{v,u}, so our weights are consistent.
Now, the crucial thing to notice is that if we take our walk X and filter out
the vertices that are not from D, then the resulting "filtered" walk Y_D will just be a
natural random walk in D' (with an appropriate choice of the starting vertex). In
other words, the Markov chain corresponding to the random walk in D' is exactly the
Markov chain induced on V(D) by the Markov chain described by our walk X. As a
result, since Z*_i depends solely on the information that the induced random walk Y_D
retains from X, it is sufficient to bound Z*_i with respect to Y_D. However, it is easy to
see that, in this case, E[Z*_i] can be upper-bounded by the expected time needed by a
random walk in D', started at an arbitrary vertex, to visit all of the vertices in D' and
then reach some vertex in U(D). We thus can conclude that E[Z*_i] is at most twice
the cover time of D'. (More precisely, it is the cover time plus the maximum hitting
time.)
Now, it is well-known (e.g., see [102]) that the cover time of any undirected graph
G' is at most 2 log |V(G')| · H_max(G'), where H_max(G') is the maximal hitting time in G'.
Our aim is therefore to show that H_max(D') = O(|E(D)|γ(D)), where γ(D) is the
diameter of D. We will do this by applying the approach of Aleliunas et al., who
proved in [5] that for an unweighted graph G', H_max(G') ≤ |E(G')|γ(G').
To achieve this goal, let K be some number such that all weights in D0 after
multiplication by K become integral (which we can do since all of the weights in D0
have to be rational). Let K · D0 be the unweighted multigraph that we obtain from
D0 by multiplying all of its weights by K and then interpreting the new weights as
multiplicities of edges. Note that the natural random walk in K · D0 (treated as a
sequence of visited vertices) is exactly the same as in D0 .
Now, we prove that for two vertices v and w of D' such that an edge (v, w) exists
in D, the expected time H_{v,w}(K·D') = H_{v,w}(D') until a random walk in K·D'
that starts at v will reach w is at most 4|E(D)|. To see this, note that the long-run
frequency with which a copy of an edge is taken in a particular direction is 1/(2M),
where M is the total number of edges of K·D' (and we count each copy separately).
Thus one of the K copies of the edge (v, w) is taken in the direction from v to w every
K/(2M)-th step on average. This in turn means that H_{v,w}(D') ≤ 2M/K. Now, to
bound M, we note first that M ≤ K(|E(D)| + Σ_{u,v∈U(D)} d_G(u)p^D_{u,v}). Thus, since
for a given u ∈ U(D) the probability Σ_v p^D_{u,v} of going outside D directly from u
is equal to d_C(u)/d_G(u), we obtain that M ≤ K(|E(D)| + Σ_u d_C(u)) ≤ 2K|E(D)|
by the property of a (φ, γ)-decomposition that requires that |C(D)| ≤ |E(D)|. We
can thus conclude that H_{v,w}(D') ≤ 2M/K ≤ 4|E(D)|, as desired. Having obtained
this result, we can use a simple induction to show that H_{v,w}(D') ≤ 4|E(D)|∆(v, w),
where ∆(v, w) is the distance between v and w in D. From this, we can conclude
that H_max(D') ≤ 4|E(D)|γ(D) and E[Z*_i] ≤ 16 log n |E(D_i)| γ(D_i), as desired.
efficiently – while yielding the same distribution on random arborescences.
Therefore, in light of the above discussion, all we really need to focus on
now is showing that we can simulate a shortcut walk X̃ as above within the desired
time bounds (and that we can indeed find the underlying (φ, γ)-decomposition of G
efficiently).
Proof. Let us fix some D = D_i and an edge e = (u, u') ∈ C(D) with u ∈ U(D).
Consider now a graph D' that we obtain from D as follows. First, we add the vertex
u' and some dummy vertex u* to D, and then, for each (w, w') ∈ C(D) \ {e} with
w ∈ U(D), we add an edge (w, u*). (Note that w' can be equal to u'.) Finally, we
add the edge e = (u, u'). The crucial thing to notice now is that for any given vertex
v ∈ D, P_v(e) is exactly the probability that a random walk in D' started at v will hit
u' before it hits u*. We can compute such probabilities quickly using electrical flows.
More precisely, if we treat D' as an electrical circuit and look at the vertex poten-
tials φ corresponding to the electrical u'-u* flow of value 1 in D', then – by Fact 2.5.1
– the probability P_v(e) will be equal to (φ_v − φ_{u*})/(φ_{u'} − φ_{u*}). Furthermore, by
exploiting the fact that the task of finding such vertex potentials φ corresponds to
solving a Laplacian system (cf. (2.10)), we can compute a δ-approximation of P_v(e),
for each v ∈ V(D), in time Õ(|E(D')| log 1/δ) by applying the fast Laplacian system
solver from Theorem
2.7.1. (Note that the error term ε‖φ‖_{L(D')} = ε√(φ^T L(D')φ) in (2.16) is – by (2.12) –
at most ε times the energy of the electrical u'-u* flow of value 1 in D'. However, as
all the resistances in D' are equal to 1, this energy can only be polynomial in n (in
fact, at most n) and thus it suffices to take ε = δ/n^{O(1)} to get the desired accuracy.)
So, our algorithm performs the above computations for each edge e. (Note that we
might have to do two such computations per edge – one for each endpoint of e – if the
endpoints lie in different D_i's.) From each computation, we store the probabilities P_v(e)
for all vertices v that we are interested in. The running time of this algorithm is bounded
by Õ(|C| Σ_i |E(D_i)| log 1/δ) = Õ(φm² log 1/δ), where we use the fact that, for each D,
|E(D')| = |E(D)| + |C(D)| ≤ 2|E(D)| by the definition of a (φ, γ)-decomposition.
Since the algorithm from the above lemma gives us only δ-approximations to the
values of the P_v(e)'s, we need to show now that this is still sufficient for our purposes.
Namely, in the following lemma we show how to utilize such imperfect values of the
P_v(e)'s to obtain a faithful simulation of the walk X̃.
Lemma 7.3.2. If one can obtain δ-approximate values of the P_v(e)'s in expected time
T(φ, γ, m, n) log 1/δ, then we can simulate the walk X̃ (until it covers the whole graph)
in expected time Õ(m(γ + φn) + T(φ, γ, m, n)).
Note that if we plug in for T(φ, γ, m, n) the bound offered by Lemma 7.3.1, the time
of the simulation of the walk X̃ would be Õ(m(γ + φn) + φm²).
where we amortized the number of shortcutting steps by tying them to the transitions
over an edge in C that each of the shortcutting steps has to be followed or preceded
by.
Now, we note that all we really need to do to simulate the walk X̃ is to be
able to obtain, for each v ∈ ∪_i V(D_i), mn samples from the probability distribution
corresponding to the probabilities {P_v(e)}_e. To see why this is the case, consider an
algorithm that simulates up to the first mn steps of the walk X̃ using these samples and
then simulates any further steps of X̃ by just simulating the original walk X and
shortcutting the blocks appropriately. Clearly, the expected cost of simulating the
steps of X̃ beyond the first mn steps can be bounded by the expected length E[τ]
of the walk X until it covers the whole graph, times the probability that the walk X̃
does not cover the whole graph within its first mn steps. By Fact 7.2.2, Markov's
inequality, and (7.1), we have that this cost is at most:
O(mn) · Õ(m(γ + φn))/mn = Õ(m(γ + φn)),
In the light of the above, we just need to focus now on obtaining the above-
mentioned samples efficiently enough.
To this end, let us first describe how these samples could be obtained if we somehow
knew all the values P_v(e). The way one could proceed then would be as follows.
First, for each i, one would fix some ordering e_1, . . . , e_{|C(D_i)|} of the
edges from C(D_i). Next, for each v ∈ U(D_i), one would create in Õ(|C(D_i)|) time
an array A_v, where A_v(i) := Σ_{1≤j≤i} P_v(e_j).
Clearly, the total time to compute the arrays A_v for all the relevant v's would be
at most
Σ_i Õ(|V(D_i)| · |C(D_i)|) = Õ(n|C|) = Õ(φmn).    (7.2)
i
Now, to obtain the samples, one would choose a collection of random numbers r_v^l,
for each 1 ≤ l ≤ mn and each relevant v, where each r_v^l is chosen uniformly at random
from the interval [0, 1]. Next, one would find, for each v and l, via binary search, an entry
j in the array A_v – for the corresponding i – such that A_v(j) ≥ r_v^l > A_v(j − 1), and output
e_v^l := e_j as the sample corresponding to r_v^l. It is easy to verify that the obtained
samples {e_v^l}_{v,l} would have the desired distribution.
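A small Python sketch of this sampling scheme (the function names are ours; bisect performs the binary search):

import bisect
import random

def build_prefix_array(probs):
    # A_v(i) = P_v(e_1) + ... + P_v(e_i), following the fixed edge ordering.
    A, total = [], 0.0
    for p in probs:
        total += p
        A.append(total)
    return A

def sample_edge(A, edge_ordering, r=None):
    if r is None:
        r = random.random()
    # Smallest j with A_v(j) >= r > A_v(j-1).
    return edge_ordering[bisect.bisect_left(A, r)]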
Obviously, the problem in implementing the above approach in our case is that
Lemma 7.3.1 gives us access only to δ-approximate values of the P_v(e)'s. Therefore, if we
repeated the above procedure with these values, then the computed entries A'_v(i) in
the arrays might differ by as much as δ·i ≤ δn from the true values of A_v(i), and thus
the obtained samples {e'^l_v}_{v,l} might have a different distribution than {e^l_v}_{v,l}.
To explain how we deal with this problem, consider the situation when we have
some {r_v^l}_{v,l} chosen and are finding the corresponding edges {e'^l_v}_{v,l} using the inac-
curate values A'_v(i) instead of A_v(i). The key observation to make here is that
as long as the chosen r_v^l, for some v and l, differs by more than δn from all the entries in
the array A'_v (we will call such a sample r_v^l clear), then we know
that the edge e'^l_v that we output is equal to the edge e^l_v that would be output if we
used the true values of A_v(i). In other words, whenever our sample r_v^l is clear, it
does not matter that we are using only approximate values of the P_v(e)'s – the obtained
sample is the same.
So, the way our sampling procedure will work is the following. Once the num-
bers {r_v^l}_{v,l} are chosen, we find the smallest value k* of k such that if we use the
algorithm from Lemma 7.3.1 to compute the ε_k-approximate values of the P_v(e)'s with
ε_k = 1/(2^{k+1}m³n²), then all the r_v^l's are clear. We find such a k* by just starting with
k = 0 and incrementing it each time there is some sample r_v^l that is still not clear.
Once such a k* is found, it is easy to see that the samples {e^l_v}_{v,l} produced
using the ε_{k*}-approximate values of the P_v(e)'s have the desired distribution. So, the algo-
rithm produces the right samples and indeed it provides the desired simulation of the
walk X̃.
Now, to bound its expected running time, we just need to bound the running time
required to find k* and compute the final samples {e'^l_v} corresponding to it. To this
end, note that the cost of computing the ε_k-approximate values of the P_v(e)'s and the
corresponding edge samples is, by Lemma 7.3.1 and (7.2), at most
Õ(φmn) + T(φ, γ, m, n) log 1/ε_k = Õ(T(φ, γ, m, n)·k + φmn).
On the other hand, the probability that for a given v and l the sample r_v^l differs by less
than ε_k n from a given entry A'_v(i) is at most 4ε_k n, as A'_v(i) is within ε_k n of the true value
of A_v(i), and thus r_v^l would need to fall within the interval [A_v(i) − 2ε_k n, A_v(i) + 2ε_k n]
for this to happen.
So, by union-bounding over all (at most m) entries we get that the probability
of r_v^l being not clear is at most 4ε_k mn. Finally, by union-bounding over all v and l,
we get that the probability that there is at least one r_v^l that is not clear is at most
4ε_k m³n² ≤ 2^{−(k−1)}.
As a result, the total expected time of obtaining the needed edge samples is at
most
Σ_{k=0}^{∞} 2^{−(k−1)} Õ(T(φ, γ, m, n)·k + φmn) = Õ(T(φ, γ, m, n) + φmn),
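The precision-doubling loop just analyzed can be phrased compactly; in the Python sketch below, approx_arrays(eps) is a stand-in (an assumption of this sketch) for recomputing the prefix arrays from ε-approximate values of the P_v(e)'s via Lemma 7.3.1:

def draw_clear_arrays(approx_arrays, samples, m, n):
    # samples: list of (v, r) pairs with r drawn uniformly from [0, 1].
    k = 0
    while True:
        eps = 1.0 / (2 ** (k + 1) * m ** 3 * n ** 2)
        arrays = approx_arrays(eps)          # {v: approximate array A'_v}
        margin = eps * n
        clear = all(min(abs(r - a) for a in arrays[v]) > margin
                    for (v, r) in samples)
        if clear:
            return arrays, eps               # every sample is clear; done
        k += 1                               # halve the error and retry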
We now proceed to the final ingredient of our algorithm—finding good (φ, γ)-
decompositions quickly.
Lemma 7.3.3. For any G and any φ = o(1), there exists a (φ, Õ(1/φ))-decomposition
of G. Moreover, such a decomposition can be computed in Õ(m) time.
We omit the proof, as we shall prove a stronger statement in Lemma 7.4.2.
7.3.3 Generating a Random Arborescence in Time Õ(m^{3/2})
We can now put these results together to generate a random arborescence in expected
time Õ(m^{3/2}). Note that this bound is worse than the one stated in Theorem 7.1.2 –
we shall improve it to obtain the better time bound in Section 7.4.
Let us set φ = 1/m^{1/2}. By Lemma 7.3.3, we can get a (1/m^{1/2}, Õ(m^{1/2}))-
decomposition of G in Õ(m) time. As we explained already, all that we need to sample
the desired arborescence is to be able to simulate the walk X̃ until it covers the whole
graph. By Lemma 7.3.1 and Lemma 7.3.2 this can be done in Õ(m(γ + φn) + φm²) =
Õ(m^{3/2}) time.
7.4 Obtaining an Õ(m√n) Running Time
In this section, we show how to modify the Õ(m^{3/2})-running-time algorithm obtained
above to make it run in Õ(m√n) time, and thus prove Theorem 7.1.2.
To understand what needs to be modified to get this improvement, one should note
that the main bottleneck in the algorithm presented above is the computation of
the approximations to the probabilities P_v(e), i.e., Lemma 7.3.1 – everything else can
be done in time Õ(m√n) (if one chooses φ = 1/√n). Unfortunately, it is not
clear how we could improve the running time of these computations. To circumvent
this problem, we will alter our algorithm to use slightly different probabilities and a
slightly different random walk that will end up yielding a faster simulation time.
To introduce these probabilities, let us assume that there are no edges in G between
different D_i's (we will ensure that this is the case later) and let C_i(u), for a vertex
u ∈ S, be the set of edges incident both to u and to the component D_i. Now, for a given
i, some v ∈ V(D_i), and u ∈ S with |C_i(u)| > 0, we define Q_v(u) to be the probability
that u is the first vertex not in V(D_i) that is reached by a random walk that starts
at v. We will use these probabilities to simulate a new random walk X̂ that we are
about to define. For a given trajectory X(ω) of the walk X, X̂(ω) is equal to X̃(ω) (as
defined before) except that whenever X̃(ω) shortcuts some block visited by X, X̂(ω)
contains only the first vertex visited in this block (as opposed to both the first and last
vertices retained in X̃(ω)). Clearly, X̂ is a certain shortcutting of X̃ and is thus a
shortcutting of X as well.
It is not hard to see that by using the Q_v(u)'s in a way completely analogous to the
way we used the P_v(e)'s before, we can simulate the walk X̂ efficiently, and the expected
length of this walk is bounded by the expected length of the walk X̃. However, unlike
X̃, X̂ does not necessarily possess all of the information needed to reconstruct the
final arborescence. This shortcoming manifests itself whenever some u is visited for
the first time in X̂ directly after X̂ entered some D_i after τ_i has already occurred. In
this case, we know that the corresponding trajectory of the walk X visited u for the
first time through some edge whose other end was in D_i (and thus we should add it
to our arborescence as e_u), but we don't know which one it was.
To deal with this problem, we will define a stronger decomposition of G whose
properties will imply that the above failure to learn eu will occur only for a small
number of vertices u. Then, at the end of our simulation of the walk X, b we will
employ a procedure that will compute the missing arcs in a manner that will not
distort the desired distribution over output arborescences.
We proceed now to formalizing the above outline.
Definition 7.4.1. A partition (D1 , . . . , Dk , S, C) of G is a strong (φ, γ)-decomposition
of G if:
1. (D_1, . . . , D_k, S, C) is a (φ, γ)-decomposition,
2. there are no edges between distinct D_i's,
3. |C(S)| ≤ φ|V(G)|, where C(S) is the set of vertices from S that are connected
by an edge to some D_i.
Lemma 7.4.2. For any G and any φ = o(1), there exists a strong (φ, Õ(1/φ))-
decomposition of G. Moreover, such a decomposition can be computed in Õ(m) time.
Proof. We will use a slightly modified version of the ball-growing technique of Leighton
and Rao [100] as presented by Trevisan [134]. For a graph H, let BH (v, j) be the ball
of radius j around the vertex v in H, i.e., let BH (v, j) consist of the subgraph of H in-
duced by all vertices in H that are reachable by a path of length at most j from v. Fur-
thermore, let RH (v, j) be the set of vertices that are at distance exactly i from v in H.
+ −
Finally, let RH (v, j) (RH (v, j) respectively) be the set E(BH (v, j + 1)) \ E(BH (v, j))
(the set E(BH (v, j)) \ E(BH (v, j − 1)) respectively).
Consider now the following procedure, where t is chosen so that t/(1 + t) = φ:
• While H ≠ ∅:
• (* Ball-growing *) Pick any vertex v of H and, starting from j = 0, keep
increasing j as long as one of the following three conditions holds:
|R_H(v, j + 1)| > t|V(B_H(v, j))|, |R^+_H(v, j + 1)| > t|E(B_H(v, j))|, or
|R^−_H(v, j + 1)| > t|E(B_H(v, j))|. (∗)
• Once none of these conditions holds for the resulting radius j_i, make
B_H(v, j_i) \ R_H(v, j_i) a new component D_i, add the vertices of R_H(v, j_i) to S
and the edges incident to them to C, and remove B_H(v, j_i) from H.
First, we note that the above procedure can be implemented in nearly-linear time,
since each edge is examined at most twice before it is removed from H. More-
over, an elementary charging argument shows that in the resulting partition |C| ≤
(1/(1 + 1/t))|E(G)| = φ|E(G)|, and similarly |C(S)| = |S| ≤ φ|V(G)|. By construc-
tion, there are no edges between distinct D_i's. We want to argue now that for all i,
j_i ≤ 3(1 + log |E(G)|/log(1 + t)), which in turn would imply that all the D_i's have
diameter at most 3(1 + log |E(G)|)/log(1 + t) = 3(1 + log |E(G)|)/log(1/(1 − φ)) =
O(log m/(− log(1 − φ))) = O(log m/φ), where we used Taylor expansion of log(1 − x)
around x = 0 to get this estimate. To see why the above bound on ji holds, as-
sume that it was not the case for some i and v. Then, during the corresponding
ball-growing procedure, a particular one of the three conditions from (∗) must have
been triggered more than j_i/3 = 1 + log |E(G)|/log(1 + t) times. If this condition
was |R_H(v, j + 1)| > t|V(B_H(v, j))|, then, since we never remove vertices from the
ball that is being grown and B_H(v, 0) has one vertex, the final ball B_H(v, j_i) has to
have at least (1 + t)^{j_i/3} > |E(G)| ≥ |V(G)| vertices, which is a contradiction. Simi-
larly, if |R^+_H(v, j + 1)| > t|E(B_H(v, j))| (respectively |R^−_H(v, j + 1)| > t|E(B_H(v, j))|)
was the condition in question, then |E(B_H(v, j_i))| > (1 + t)^{j_i/3} ≥ |E(G)|, which is a
contradiction as well. Thus we may conclude that the above bound on j_i holds, and
thus all the D_i's have diameter Õ(1/φ), as desired.
At this point, we know that the partition of G that we obtained satisfies all of the
properties of a strong (φ, Õ(1/φ))-decomposition except possibly the one that asserts
that there is no D_i such that |E(D_i)| is smaller than the number |C(D_i)| of edges
from C incident to D_i. However, if such D_i's exist then we can just add them to our
cut, i.e., we add V(D_i) to S, and E(D_i) to C. Note that the size of C can at most
triple as a result of this operation, since an edge that is initially in C can be incident
to at most two of the D_i's, and the edges E(D_i) are by definition not incident to any other
D_j. Similarly, C(S) does not increase as a result of adding V(D_i) to S. Thus, we
may conclude that the decomposition returned by this algorithm is indeed a strong
(φ, Õ(1/φ))-decomposition of G.
Proof. Let us fix some D = D_i, let S' be the set of vertices w ∈ S with |C_i(w)| > 0,
and let us fix some u ∈ S'. Consider now a graph D'_u that we obtain from D ∪ S'
by merging all vertices in S' \ {u} into one vertex called u*. For any given vertex
v ∈ D, Q_v(u) is exactly the probability that a random walk in D'_u started at v will
hit u before it hits u*. Once again, as in Lemma 7.3.1, we can quickly compute such
probabilities using the connection of random walks to electrical flows (cf. Fact 2.5.1)
and the fast Laplacian system solver discussed in Section 2.7 – see the proof of Lemma
7.3.1 for details.
Note that we will need to run the Laplacian system solver once per each vertex u ∈ C(S)
and component D_i such that |C_i(u)| > 0. Thus the total running time of the
resulting algorithm is bounded by Õ(|C(S)| Σ_i |E(D_i)| log 1/δ) = Õ(φmn log 1/δ), as
desired.
7.4.3 Coping with Shortcomings of X̂
As mentioned above, the walk X̂ can be simulated more efficiently than the walk
X̃, but it does not have all the information needed to construct the arborescence
that would be generated by the walk X that X̂ is meant to shortcut. This lack
of information occurs only when some vertex u is visited for the first time by X̂
immediately after X̂ visited a component D_i after time τ_i has already occurred. Note
that, by the properties of the strong (φ, γ)-decomposition that we are using, it must
be the case that u ∈ C(S) and |C(S)| ≤ φn. This shows that X̂ fails to determine the
arcs e_u only for a small fraction of the vertices of G. We prove now that, in this case, these
missing arcs can be reconstructed efficiently and in a way that preserves the desired
distribution over arborescences.
Lemma 7.4.4. For a trajectory X̂(ω) that starts at vertex s, let F(X̂(ω)) be the set
of arcs e_v, for v ∉ C(S) ∪ {s}, corresponding to this walk. Let F* be the set of all
arborescences H rooted at s such that e_H(v) = e_v for v ∉ C(S) ∪ {s}. Then, given F,
we can generate a random arborescence from F* in time O(m + (φn)^{2.376}).
Let F(T) = {e_T(v) | v ∉ C(S) ∪ {s}}. By the fact that X̂ is a shortcutting
of the walk X and by Theorem 7.1.1, we know that F(X̂(ω)) = F(T) for exactly a
|F(T)*|/|T_s(G)| fraction of trajectories. But since the procedure from Lemma 7.4.4
generates a random arborescence from F(T)*, T is generated with probability exactly
1/|T_s(G)|. This concludes the proof of Theorem 7.1.2.
Chapter 8
Conclusions
Maximum s-t Flow Problem: Given the fundamental nature of the maximum
s-t flow problem, it is natural to ask whether one can further refine the approach we
presented in Chapter 3 to obtain an algorithm for the undirected maximum
s-t flow problem that runs in nearly-linear time and computes an exact answer. Note
that, by the connection we described in Section 3.6, having such an algorithm for
undirected graphs would also yield a nearly-linear time algorithm for the directed
setting.
One observation to keep in mind when investigating the above question is that
it is known [14, 94] that the multiplicative-weights-update routine we described in
Section 2.8 and employed in Chapter 3 is fundamentally unable to guarantee a
dependence of the running time on the error parameter ε that is better than polynomial
in 1/ε. Therefore, to make our electrical-flow-based approach lead to exact answers, one
would need to change the way the electrical flow computations are combined to form
the final answer. One promising approach to doing that is to use the ideas underlying
interior-point methods. It is known that the running time of generic interior-
point method algorithms can be sped up significantly in the context of flow problems
[51]. So, one could wonder if it is possible to obtain further improvement here by
utilizing the insight gained from understanding the interplay between electrical flows
and the combinatorial structure of graphs.
grades quickly as this connectivity worsens. It would be interesting to understand if
basing the partitioning procedure on electrical flows – instead of basing it just on the
eigenvector corresponding to the second-smallest eigenvalue of the Laplacian – could
mitigate this problem.
Yet another possible application of electrical flows might be making further progress
on understanding thin trees. As explained in Chapter 6, obtaining (α, O(1))-thin
trees, for α = o(log n/ log log n), would allow us to obtain an improved, O(α) factor,
approximation to the asymmetric traveling salesman problem. As electrical-flow-
based techniques were already useful in establishing our (O(log n/ log log n), O(1))-
thinness bound, it is possible that a deeper understanding of the interplay of electrical
flows and the cut structure of graphs will lead to further improvement.
Bibliography
[1] I. Abraham, Y. Bartal, and O. Neiman. Nearly tight low stretch spanning trees.
In FOCS’08: Proceedings of the 49th Annual IEEE Symposium on Foundations
of Computer Science, pages 781–790, Washington, DC, USA, 2008. IEEE Com-
puter Society. Full version available at arXiv:0808.2017.
[4] D. J. Aldous. A random walk construction of uniform spanning trees and uni-
form labelled trees. SIAM Journal on Discrete Mathematics, 3(4):450–465,
1990.
[7] N. Alon, Z. Galil, O. Margalit, and M. Naor. Witnesses for boolean matrix
multiplication and for shortest paths. In FOCS ’92: Proceedings of the 33rd
Annual IEEE Symposium on Foundations of Computer Science, pages 417–426,
Washington, DC, USA, 1992. IEEE Computer Society.
[8] N. Alon, R. M. Karp, D. Peleg, and D. West. A graph-theoretic game and its
application to the k-server problem. SIAM Journal on Computing, 24(1):78–
100, 1995.
[9] N. Alon and V. Milman. λ1 , isoperimetric inequalities for graphs and supercon-
centrators. Journal of Combinatorial Theory, Series B, 38:73–88, 1985.
[11] R. Andersen, F. Chung, and K. Lang. Local graph partitioning using pagerank
vectors. In FOCS’06: Proceedings of the 47th Annual IEEE Symposium on
Foundations of Computer Science, pages 475–486, Washington, DC, USA, 2006.
IEEE Computer Society.
[12] R. Andersen and Y. Peres. Finding sparse cuts locally using evolving sets.
In STOC’09: Proceedings of the 41st Annual ACM symposium on Theory of
Computing, pages 235–244, New York, NY, USA, 2009. ACM.
[13] S. Arora, E. Hazan, and S. Kale. O(√log n) approximation to sparsest cut in
Õ(n²) time. In FOCS'04: Proceedings of the 45th Annual IEEE Symposium
on Foundations of Computer Science, pages 238–247, Washington, DC, USA,
2004. IEEE Computer Society.
[16] S. Arora, J. R. Lee, and A. Naor. Euclidean distortion and the sparsest cut.
In STOC’05: Proceedings of the 37th Annual ACM Symposium on Theory of
Computing, pages 553–562, New York, NY, USA, 2005. ACM.
[17] S. Arora, S. Rao, and U. Vazirani. Expander flows, geometric embeddings and
graph partitioning. Journal of the ACM, 56(2):1–37, 2009.
[22] S. Baswana, R. Hariharan, and S. Sen. Maintaining all-pairs approximate short-
est paths under deletion of edges. In SODA’03: Proceedings of the 14th Annual
ACM-SIAM Symposium on Discrete Algorithms, pages 394–403, Philadelphia,
PA, USA, 2003. Society for Industrial and Applied Mathematics.
[30] M. Bläser. A new approximation algorithm for the asymmetric TSP with tri-
angle inequality. In SODA, pages 638–645, 2002.
[35] J. R. Bunch and J. E. Hopcroft. Triangular factorization and inversion by fast
matrix multiplication. Mathematics of Computation, 28(125):231–236, 1974.
[36] M. Charikar, M. Goemans, and H. Karloff. On the integrality ratio for the
asymmetric traveling salesman problem. Mathematics of Operations Research,
31:245–252, 2006.
[38] J. Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In
Problems in Analysis (R. C. Gunning, ed.), 1970.
[41] N. Christofides. Worst case analysis of a new heuristic for the traveling salesman
problem. Report 388, Graduate School of Industrial Administration, Carnegie-
Mellon University, Pittsburgh, PA, 1976.
[48] W. H. Cunningham and L. Tang. Optimal 3-terminal cuts and linear program-
ming. In IPCO ’99: Proceedings of the 7th International IPCO Conference on
Integer Programming and Combinatorial Optimization, pages 114–125, London,
UK, 1999. Springer-Verlag.
[51] S. I. Daitch and D. A. Spielman. Faster approximate lossy generalized flow via
interior point algorithms. In STOC’08: Proceedings of the 40th Annual ACM
Symposium on Theory of Computing, pages 451–460, 2008.
[53] C. Demetrescu and G. F. Italiano. A new approach to dynamic all pairs short-
est paths. In STOC’03: Proceedings of the 35th Annual ACM Symposium on
Theory of Computing, pages 159–166, New York, NY, USA, 2003. ACM.
[54] C. Demetrescu and G. F. Italiano. Fully dynamic all pairs shortest paths with
real edge weights. Journal of Computer and System Sciences, 72(5):813–837,
2006.
[57] P. Elias, A. Feinstein, and C. E. Shannon. A note on the maximum flow through
a network. IRE Transactions on Information Theory, 2, 1956.
[60] S. Even and Y. Shiloach. An on-line edge-deletion problem. Journal of the
ACM, 28(1):1–4, 1981.
[61] S. Even and R. E. Tarjan. Network flow and testing graph connectivity. SIAM
Journal on Computing, 4(4):507–518, Dec. 1975.
[63] U. Feige and M. Singh. Improved approximation ratios for traveling salesman
tours and paths in directed graphs. In APPROX, pages 104–118, 2007.
[68] M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved
network optimization algorithms. Journal of the ACM, 34:596–615, July 1987.
[70] N. Garg and J. Könemann. Faster and simpler algorithms for multicommodity
flow and other fractional packing problems. In FOCS’98: Proceedings of the 39th
Annual IEEE Symposium on Foundations of Computer Science, pages 300–309,
1998.
[73] A. V. Goldberg and S. Rao. Beyond the flow decomposition barrier. Journal
of the ACM, 45(5):783–797, 1998.
[74] N. Goyal, L. Rademacher, and S. Vempala. Expanders via random spanning
trees. In SODA’09: Proceedings of the 20th Annual ACM-SIAM Symposium on
Discrete Algorithms, pages 576–585, 2009.
[75] M. D. Grigoriadis and L. G. Khachiyan. Fast approximation schemes for con-
vex programs with many blocks and coupling constraints. SIAM Journal on
Optimization, 4(1):86–107, 1994.
[76] M. D. Grigoriadis and L. G. Khachiyan. Approximate minimum-cost multicom-
modity flows in Õ(ε⁻²knm) time. Mathematical Programming, 75(3):477–482,
1996.
[77] A. Guénoche. Random spanning tree. Journal of Algorithms, 4(3):214–220,
1983.
[78] H. Kaplan, M. Lewenstein, N. Shafrir, and M. Sviridenko. Approximation algorithms
for asymmetric TSP by decomposing directed regular multigraphs. Journal of
the ACM, pages 602–626, 2005.
[79] J. Hao and J. B. Orlin. A faster algorithm for finding the minimum cut in a
graph. In SODA’92: Proceedings of the 3rd Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA ’92, pages 165–174, Philadelphia, PA, USA, 1992.
Society for Industrial and Applied Mathematics.
[80] C. Harrelson, K. Hildrum, and S. Rao. A polynomial-time tree decomposition
to minimize congestion. In SPAA’03: Proceedings of the 15th Annual ACM
Symposium on Parallel Algorithms and Architectures, pages 34–43, New York,
NY, USA, 2003. ACM.
[81] M. Held and R. Karp. The traveling salesman problem and minimum spanning
trees. Operations Research, 18:1138–1162, 1970.
[82] D. Kandel, Y. Matias, R. Unger, and P. Winkler. Shuffling biological sequences.
Discrete Applied Mathematics, 71(1-3):171–185, 1996.
[83] G. Karakostas. Faster approximation schemes for fractional multicommodity
flow problems. In SODA’02: Proceedings of the 13th Annual ACM-SIAM Sym-
posium on Discrete Algorithms, pages 166–173, Philadelphia, PA, USA, 2002.
Society for Industrial and Applied Mathematics.
[84] D. Karger and S. Plotkin. Adding multiple cost constraints to combinatorial op-
timization problems, with applications to multicommodity flows. In STOC’95:
Proceedings of the 27th Annual ACM Symposium on Theory of Computing,
pages 18–25, New York, NY, USA, 1995. ACM.
[85] D. R. Karger. Global min-cuts in RNC, and other ramifications of a simple
min-cut algorithm. In SODA '93: Proceedings of the 4th Annual ACM-SIAM
Symposium on Discrete Algorithms, pages 21–30, Philadelphia, PA, USA, 1993.
Society for Industrial and Applied Mathematics.
[86] D. R. Karger. Better random sampling algorithms for flows in undirected
graphs. In SODA ’98: Proceedings of the 9th Annual ACM-SIAM Symposium
on Discrete Algorithms, pages 490–499, Philadelphia, PA, USA, 1998. Society
for Industrial and Applied Mathematics.
[87] D. R. Karger. Minimum cuts in near-linear time. Journal of the ACM, 47:46–76,
January 2000.
[89] D. R. Karger and C. Stein. An Õ(n2 ) algorithm for minimum cuts. In STOC’93:
Proceedings of the 25th Annual ACM Symposium on the Theory of Computing,
pages 757–765, New York, NY, USA, 1993. ACM.
[91] R. Khandekar, S. Rao, and U. Vazirani. Graph partitioning using single com-
modity flows. In STOC’06: Proceedings of the 38th Annual ACM Symposium
on Theory of Computing, pages 385–390, New York, NY, USA, 2006. ACM.
[92] G. Kirchhoff. Über die Auflösung der Gleichungen, auf welche man bei der Unter-
suchung der linearen Vertheilung galvanischer Ströme geführt wird. Poggendorffs
Ann. Phys. Chem., 72:497–508, 1847.
[94] P. Klein and N. Young. On the number of iterations for Dantzig-Wolfe optimiza-
tion and packing-covering approximation algorithms. In IPCO'02: Proceedings
of the 7th International IPCO Conference, pages 320–327. Springer, 2002.
[95] I. Koutis, G. L. Miller, and R. Peng. Approaching optimality for solving SDD
systems. In FOCS’10: Proceedings of the 51st Annual IEEE Symposium on
Foundations of Computer Science, 2010.
[98] G. F. Lawler and A. D. Sokal. Bounds on the L² spectrum for Markov chains
and Markov processes: a generalization of Cheeger's inequality. Transactions of
the American Mathematical Society, 309(2):577–580, 1988.
[100] T. Leighton and S. Rao. Multicommodity max-flow min-cut theorems and their
use in designing approximation algorithms. Journal of the ACM, 46(6):787–832,
1999.
[101] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some
of its algorithmic applications. In FOCS 1994: Proceedings of the 35th An-
nual IEEE Symposium on Foundations of Computer Science, pages 577–591,
Washington, DC, USA, 1994. IEEE Computer Society.
[103] R. Lyons and Y. Peres. Probability on Trees and Networks. 2009. In preparation.
Current version available at http://mypage.iu.edu/~rdlyons/.
[108] Y. Nesterov. Fast gradient methods for network flow problems. In The 20th
International Symposium of Mathematical Programming, 2009.
[110] A. Panconesi and A. Srinivasan. Randomized distributed edge coloring via
an extension of the Chernoff-Hoeffding bounds. SIAM Journal on Computing,
26:350–368, 1997.
[115] T. Radzik. Fast deterministic approximation for the multicommodity flow prob-
lem. In SODA’95: Proceedings of the 6th Annual ACM-SIAM Symposium on
Discrete Algorithms, pages 486–492, Philadelphia, PA, USA, 1995. Society for
Industrial and Applied Mathematics.
[116] L. Roditty and U. Zwick. Improved dynamic reachability algorithms for directed
graphs. In FOCS’02: Proceedings of the 43rd Annual IEEE Symposium on
Foundations of Computer Science, page 679, Washington, DC, USA, 2002. IEEE
Computer Society.
[117] L. Roditty and U. Zwick. Dynamic approximate all-pairs shortest paths in undi-
rected graphs. In FOCS’04: Proceedings of the 45th Annual IEEE Symposium
on Foundations of Computer Science, pages 499–508, Washington, DC, USA,
2004. IEEE Computer Society.
[118] L. Roditty and U. Zwick. A fully dynamic reachability algorithm for directed
graphs with an almost linear update time. In STOC’04: Proceedings of the
36th Annual ACM Symposium on the Theory of Computing, pages 184–191,
New York, NY, USA, 2004. ACM.
[122] J. Sherman. Breaking the multicommodity flow barrier for O(√log n)-
approximations to sparsest cut. In FOCS'09: Proceedings of the 50th Annual
IEEE Symposium on Foundations of Computer Science, pages 363–372, Los
Alamitos, CA, USA, 2009. IEEE Computer Society.
[125] D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. Journal of
Computer and System Sciences, 26(3):362–391, 1983.
[129] D. A. Spielman and S.-H. Teng. Nearly-linear time algorithms for graph parti-
tioning, graph sparsification, and solving linear systems. In STOC’04: Proceed-
ings of the 36th Annual ACM Symposium on the Theory of Computing, pages
81–90, New York, NY, USA, 2004. ACM.
[130] D. A. Spielman and S.-H. Teng. Nearly-linear time algorithms for precondi-
tioning and solving symmetric, diagonally dominant linear systems. CoRR,
abs/cs/0607105, 2006.
[131] D. A. Spielman and S.-H. Teng. A local clustering algorithm for massive
graphs and its application to nearly-linear time graph partitioning. CoRR,
abs/0809.3232, 2008.
[132] M. Stoer and F. Wagner. A simple min-cut algorithm. Journal of the ACM,
44:585–591, July 1997.
[133] M. Thorup and U. Zwick. Spanners and emulators with sublinear distance
errors. In SODA’06: Proceedings of the 17th Annual ACM-SIAM Symposium
on Discrete Algorithms, pages 802–809, New York, NY, USA, 2006. ACM.
[134] L. Trevisan. Approximation algorithms for unique games. In FOCS’05: Pro-
ceedings of the 46th Annual IEEE Symposium on Foundations of Computer
Science, pages 05–34. IEEE Computer Society, 2005. Full version available at
http://www.cs.berkeley.edu/~luca/pubs/unique.pdf.
[138] D. B. Wilson. Generating random spanning trees more quickly than the cover
time. In STOC’96: Proceedings of the 28th Annual ACM Symposium on the
Theory of Computing, pages 296–303. ACM, 1996.
[140] U. Zwick. All pairs shortest paths using bridging sets and rectangular matrix
multiplication. Journal of the ACM, 49(3):289–317, 2002.