Lecture Notes in Computer Science
1412
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors
Robert E. Bixby
Department of Computational and Applied Mathematics, Rice University
6020 Annapolis, Houston, TX 77005, USA
E-mail: bixby@rice.edu
E. Andrew Boyd
PROS Strategic Solutions
3223 Smith Street, Houston, TX 77006, USA
E-mail: boyd@prosx.com
Roger Z. Rı́os-Mercado
Department of Industrial Engineering, Texas A&M University
1000 Country Place Dr. Apt. 69, Houston, TX 77079, USA
E-mail: roger@hpc.uh.edu
ISSN 0302-9743
ISBN 3-540-64590-X Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1998
Printed in Germany
Typesetting: Camera-ready by author
SPIN 10637207 06/3142 – 5 4 3 2 1 0 Printed on acid-free paper
Preface
This volume contains the papers selected for presentation at IPCO VI, the Sixth
International Conference on Integer Programming and Combinatorial Optimiza-
tion, held in Houston, Texas, USA, June 22–24, 1998. The IPCO series of confer-
ences highlights recent developments in theory, computation, and applications
of integer programming and combinatorial optimization.
These conferences are sponsored by the Mathematical Programming Society,
and are held in the years in which no International Symposium on Mathemati-
cal Programming takes place. Earlier IPCO conferences were held in Waterloo
(Canada) in May 1990; Pittsburgh (USA) in May 1992; Erice (Italy) in April
1993; Copenhagen (Denmark) in May 1995; and Vancouver (Canada) in June
1996.
The proceedings of IPCO IV (edited by Egon Balas and Jens Clausen in
1995) and IPCO V (edited by William Cunningham, Thomas McCormick, and
Maurice Queyranne in 1996) were published by Springer-Verlag in the series
Lecture Notes in Computer Science as Volumes 920 and 1084, respectively. The
proceedings of the first three IPCO conferences were published by the organizing
institutions.
A total of 77 extended abstracts, mostly of excellent quality, were initially
submitted. Following the IPCO policy of having only one stream of sessions over
a three-day span, the Program Committee selected 32 papers. As a result, many
outstanding papers could not be selected.
The papers included in this volume have not been refereed. It is expected
that revised versions of these works will appear in scientific journals.
The Program Committee thanks all the authors of submitted extended ab-
stracts and papers for their support of the IPCO conferences.
Bipartite Designs .......................................................... 23
G. Gasparyan
A Theorem of Truemper ...................................................... 53
M. Conforti and A. Kapoor

Edge Connectivity

Edge-Splitting and Edge-Connectivity Augmentation in Planar Graphs ......... 96
H. Nagamochi and P. Eades

Algorithms

Multicuts in Unweighted Graphs with Bounded Degree and Bounded Tree-Width . 137
G. Călinescu, C. G. Fernandes, and B. Reed

Network Flows

Building Chain and Cactus Representations of All Minimum Cuts from Hao-Orlin
in the Same Asymptotic Run Time ........................................... 294
L. Fleischer

Scheduling

Non-approximability Results for Scheduling Problems with Minsum Criteria .. 353
J. A. Hoogeveen, P. Schuurman, and G. J. Woeginger
Approximation Bounds for a General Class of Precedence Constrained Parallel
Machine Scheduling Problems ............................................... 367
A. Munier, M. Queyranne, and A. S. Schulz
An Efficient Approximation Algorithm for Minimizing Makespan on Uniformly
Related Machines .......................................................... 383
C. Chekuri and M. Bender
On the Relationship between Combinatorial and LP-Based Approaches to
NP-Hard Scheduling Problems ............................................... 394
R. N. Uma and J. Wein
1 Introduction
A clutter C is a pair (V (C), E(C)), where V (C) is a finite set and E(C) =
{S1 , . . . , Sm } is a family of subsets of V (C) with the property that Si ⊆ Sj
implies Si = Sj . The elements of V (C) are the vertices of C and those of E(C)
are the edges. A transversal of C is a minimal subset of vertices that intersects
all the edges. Let τ (C) denote the cardinality of a smallest transversal. A clutter
C packs if there exist τ (C) pairwise disjoint edges.
For j ∈ V (C), the contraction C/j and deletion C \ j are clutters defined as
follows: both have V (C) − {j} as vertex set, E(C/j) is the set of minimal elements
of {S − {j} : S ∈ E(C)}, and E(C \ j) = {S : j ∉ S ∈ E(C)}. Contractions and
deletions of distinct vertices can be performed sequentially, and it is well known
that the result does not depend on the order. A clutter D obtained from C by
deleting Id ⊆ V (C) and contracting Ic ⊆ V (C), where Ic ∩ Id = ∅ and Ic ∪ Id ≠ ∅,
is a minor of C and is denoted by C \ Id /Ic.
We say that a clutter C has the packing property if it packs and all its minors
pack. A clutter is minimally non packing (mnp) if it does not pack but all its
minors do. In this paper, we study mnp clutters.
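A small worked example of a clutter that fails to pack is the triangle: V (C) =
{1, 2, 3} and E(C) = {{1, 2}, {2, 3}, {1, 3}}. Every transversal needs two vertices,
so τ (C) = 2, yet any two edges intersect, so there is at most one pairwise disjoint
edge. Its minors all pack (each deletion leaves a single edge, and each contraction
leaves two disjoint singleton edges), so this clutter is in fact mnp.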
* This work was supported in part by NSF grants DMI-9424348, DMS-9509581, ONR
grant N00014-9710196, a William Larimer Mellon Fellowship, and the Swiss National
Research Fund (FNRS).
  min { Σ_{j=1}^n x_j : Ax ≥ e, x ∈ {0, 1}^n } = max { Σ_{i=1}^m y_i : yA ≤ e, y ∈ {0, 1}^m },   (1)

  max { Σ_{j=1}^n x_j : Ax ≤ e, x ∈ {0, 1}^n } = min { Σ_{i=1}^m y_i : yA ≥ e, y ∈ {0, 1}^m }.
The 0,1 matrix A has the Max-Flow Min-Cut property (or simply MFMC
property) if the linear system Ax ≥ e, x ≥ 0 is totally dual integral (Seymour
[16]). Specifically, let

  τ (A, w) = min { wx : Ax ≥ e, x ∈ {0, 1}^n },
  ν(A, w) = max { Σ_{i=1}^m y_i : yA ≤ w, y ∈ {0, 1}^m }.

Then A has the MFMC property if and only if τ (A, w) = ν(A, w) for all w ∈ Z_+^n.
Conjecture 1. A clutter has the packing property if and only if it has the MFMC
property.
This conjecture for the packing property is the analog of the following version of
Lovász’s theorem [11]: A 0, 1 matrix A is perfect if and only if the linear system
Ax ≤ e, x ≥ 0 is totally dual integral.
In this paper, our first result is that this conjecture holds for diadic clutters.
A clutter is diadic if its edges intersect its transversals in at most two vertices
(Ding [6]). In fact, we show the stronger result:
Theorem 2. A diadic clutter is ideal if and only if it has the MFMC property.
A clutter is said to be minimally non ideal (mni) if it is not ideal but all
its minors are ideal. Theorem 1 is equivalent to saying that mni clutters do not
pack. Therefore mnp clutters fall into two distinct classes, namely ideal mnp
clutters and mni mnp clutters.
Next we consider ideal mnp clutters. Seymour [16] showed that Q6 is the only
ideal mnp clutter which is binary (a clutter is binary if its edges have an odd
intersection with its transversals). Aside from Q6 , only one ideal mnp clutter
was known prior to this work, due to Schrijver [14]. We construct an infinite
family of such mnp clutters (see Appendix). The clutter Q6 , Schrijver’s example
and those in our infinite class all satisfy τ (C) = 2. Our next result is that all
ideal mnp clutters with τ (C) = 2 share strong structural properties with Q6 .
Theorem 3. Every ideal mnp clutter C with τ (C) = 2 has the Q6 property, i.e.
A(C) has 4 rows such that every column restricted to this set of rows contains
two 0’s and two 1’s and, furthermore, each of the 6 such possible 0,1 vectors
occurs at least once.
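To see the Q6 property on Q6 itself, recall from the Appendix that E(Q6) =
{S1, S2, S3, S4} with S1 = {1, 3, 5}, S2 = {1, 4, 6}, S3 = {2, 4, 5}, S4 = {2, 3, 6},
so that

  A(Q6) =
    1 0 1 0 1 0
    1 0 0 1 0 1
    0 1 0 1 1 0
    0 1 1 0 0 1

Each of the six columns contains two 0's and two 1's, and the six columns are
exactly the six possible such 0,1 vectors.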
We make the following conjecture and prove later that it implies Conjecture 1.

Conjecture 2. Every ideal mnp clutter C satisfies τ (C) = 2.
The blocker b(C) of a clutter C is the clutter with V (C) as vertex set and the
transversals of C as edge set. For Id , Ic ⊆ V (C) with Id ∩ Ic = ∅, it is well known
and easy to derive that b(C \ Id /Ic ) = b(C)/Id \ Ic .
We now consider minimally non ideal mnp clutters. The clutter Jt, for t ≥ 2
integer, is given by V (Jt) = {0, . . . , t} and E(Jt) = {{1, . . . , t}, {0, 1}, {0, 2}, . . . ,
{0, t}}.
{0, t}. Given a mni matrix A, let x̃ be any vertex of {x ≥ 0 : Ax ≥ e} with
fractional components. A maximal row submatrix Ā of A for which Āx̃ = e is
called a core of A. The next result is due to Lehman [10] (see also Padberg [13],
Seymour [17]).
Theorem 4. Let A be a mni matrix, B = b(A), r = τ (B) and s = τ (A). Then
(i) A (resp. B) has a unique core Ā (resp. B̄).
(ii) Ā, B̄ are square matrices.
Moreover, either A = A(Jt ), t ≥ 2, or the rows and columns of Ā can be
permuted so that
(iii) ĀB̄ᵀ = J + (rs − n)I.
Here J denotes a square matrix filled with ones and I the identity matrix. Only
three cores with rs = n + 2 are known and none with rs ≥ n + 3. Nevertheless
Cornuéjols and Novick [5] have constructed more than one thousand mni matri-
ces from a single core with rs = n + 2. An odd hole C_k^2 is a clutter with k ≥ 3
odd, V (C_k^2) = {1, . . . , k} and E(C_k^2) = {{1, 2}, {2, 3}, . . . , {k − 1, k}, {k, 1}}. Odd
holes and their blockers are mni with rs = n + 1, and Luetolf and Margot [12]
give dozens of additional examples of cores with rs = n + 1 and n ≤ 17. We
prove the following theorem.
Theorem 5. Let A ≠ A(Jt) be a mni matrix. If A is minimally non packing,
then rs = n + 1.
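As an illustration of the numbers involved, take the odd hole C_5^2: here s =
τ (A) = 3 (a minimum vertex cover of the 5-cycle) and r = τ (B) = 2 (the minimal
transversals of the blocker are the edges of C_5^2, the smallest of which has two
vertices), so rs = 6 = n + 1.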
We conjecture that the condition rs = n + 1 is also sufficient.
Conjecture 3. Let A ≠ A(Jt) be a mni matrix. Then A is minimally non packing
if and only if rs = n + 1.
Using a computer program, we were able to verify this conjecture for all known
mni matrices with n ≤ 14.
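Such a verification is conceptually simple. The following is a minimal brute-force
sketch (our hypothetical illustration, not the authors' program; the helper names
tau and nu are ours) that tests whether a small clutter, given as a list of edges
over vertices 0, . . . , n−1, packs:

  from itertools import combinations

  def tau(edges, n):
      # cardinality of a smallest transversal: fewest vertices meeting every edge
      for k in range(n + 1):
          for T in combinations(range(n), k):
              if all(set(T) & S for S in edges):
                  return k

  def nu(edges):
      # maximum number of pairwise disjoint edges, by exhaustive branching
      best = 0
      def extend(i, count, used):
          nonlocal best
          best = max(best, count)
          for j in range(i, len(edges)):
              if not (edges[j] & used):
                  extend(j + 1, count + 1, used | edges[j])
      extend(0, 0, set())
      return best

  # Q6, the triangles of K4 (vertices renumbered 0..5): tau = 2 but nu = 1,
  # so Q6 does not pack.
  Q6 = [{0, 2, 4}, {0, 3, 5}, {1, 3, 4}, {1, 2, 5}]
  print(tau(Q6, 6), nu(Q6))  # prints: 2 1

A clutter packs precisely when the two values agree; checking the packing
property means running the same test on the clutter and on all its minors.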
A clutter is minimally non MFMC if it does not have the MFMC property
but all its minors do. Conjecture 1 states that these are exactly the mnp clutters.
Although we cannot prove this conjecture, the next proposition shows that a
tight link exists between minimally non MFMC and mnp clutters. The clutter
D obtained by replicating element j ∈ V (C) of C is defined as follows: V (D) =
V (C) ∪ {j′} where j′ ∉ V (C), and E(D) = E(C) ∪ {S − {j} ∪ {j′} : j ∈ S ∈ E(C)}.
Proof. Let w ∈ Z_+^n be chosen such that τ (C, w) > ν(C, w) and τ (C, w′) = ν(C, w′)
for all w′ ∈ Z_+^n with w′ ≤ w and w′_j < w_j for at least one j. Note that w_j > 0
for all j, since otherwise some deletion minor of C does not have the MFMC
property. Construct D by replicating w_j − 1 times every element j ∈ V (C). By
Remark 2, D does not pack. Let D′ = D \ Id /Ic be any minor of D. If j or one of
its replicates is in Ic then we can assume that j and all its replicates are in Ic.
Then D′ is a replication of a minor C′ of C/j. Since C′ has the MFMC property,
D′ packs by Remark 2. Thus we can assume Ic = ∅. By the choice of w and
Remark 2, if Id ≠ ∅ then D′ packs. ⊓⊔
Proposition 1 can be used to show that if every ideal mnp clutter C satisfies
τ (C) = 2 then the packing property and the MFMC property are the same.
Proposition 2. Conjecture 2 implies Conjecture 1.
Proof. Suppose there is a minimally non MFMC clutter C that packs. By Theo-
rem 1, C is ideal. Then by Proposition 1, there is a mnp clutter D with a
replicated element j. Furthermore, D is ideal. Using Conjecture 2, 2 = τ (D) ≤
τ (D/j). Since D/j packs, there are sets S1, S2 ∈ E(D) with S1 ∩ S2 = {j}.
Because j is replicated in D, we have a set S′1 = S1 ∪ {j′} − {j}. Note that
j′ ∉ S2. But then S′1 ∩ S2 = ∅, hence D packs, a contradiction. ⊓⊔
Finally, we introduce a new class of clutters called weakly binary. They can
be viewed as a generalization of binary and of balanced clutters. (A 0,1 matrix
is balanced if it does not have A(C_k^2) as a submatrix, k ≥ 3 odd, where as above
C_k^2 denotes an odd hole. See [4] for a survey of balanced matrices.) We say that
a clutter C has an odd hole C_k^2 if A(C_k^2) is a submatrix of A(C). An odd hole C_k^2
of C is said to have a non intersecting set if there exists S ∈ E(C) such that S ∩ V (C_k^2) = ∅.
A clutter C is weakly binary if, in C and all its minors, all odd holes have non
intersecting sets.
Theorem 6. Let C be weakly binary and minimally non MFMC. Then C is
ideal.
Note that, when C is binary, this theorem is an easy consequence of Seymour’s
theorem saying that a binary clutter has the MFMC property if and only if it
does not have Q6 as a minor [16]. Observe also that Theorem 6, together with
Conjecture 2, Proposition 2, and Theorem 3, would imply that a weakly binary
clutter has the MFMC property if and only if it does not contain a minor with
the Q6 property.
References
1. C. Berge. Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr
sind (Zusammenfassung). Wissenschaftliche Zeitschrift, Martin-Luther-Universität
Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, 114–115, 1960.
2. W. G. Bridges and H. J. Ryser. Combinatorial designs and related systems. J.
Algebra, 13:432–446, 1969.
3. M. Conforti and G. Cornuéjols. Clutters that pack and the max-flow min-cut prop-
erty: A conjecture. In W. R. Pulleyblank and F. B. Shepherd, editors, The Fourth
Bellairs Workshop on Combinatorial Optimization, 1993.
4. M. Conforti, G. Cornuéjols, A. Kapoor, and K. Vušković. Balanced matrices. In
J. R. Birge and K. G Murty, editors, Math. Programming, State of the Art 1994,
pages 1–33, 1994.
5. G. Cornuéjols and B. Novick. Ideal 0, 1 matrices. J. Comb. Theory Ser. B, 60:145–
157, 1994.
6. G. Ding. Clutters with τ2 = 2τ . Discrete Math., 115:141–152, 1993.
7. J. Edmonds and R. Giles. A min-max relation for submodular functions on graphs.
Annals of Discrete Math., 1:185–204, 1977.
8. B. Guenin. Packing and covering problems. Thesis proposal, GSIA, Carnegie Mel-
lon University, 1997.
9. A. Lehman. On the width-length inequality. Mathematical Programming, 17:403–
417, 1979.
10. A. Lehman. On the width-length inequality and degenerate projective planes. In
W. Cook and P.D. Seymour, editors, Polyhedral Combinatorics, DIMACS Series
in Discrete Math. and Theoretical Computer Science, Vol. 1, pages 101–105, 1990.
11. L. Lovász. Normal hypergraphs and the perfect graph conjecture. Discrete Math.
2:253–267, 1972.
12. C. Luetolf and F. Margot. A catalog of minimally nonideal matrices. Mathematical
Methods of Operations Research, 1998. To appear.
13. M. W. Padberg. Lehman’s forbidden minor characterization of ideal 0−1 matrices.
Discrete Math., 111:409–420, 1993.
14. A. Schrijver. A counterexample to a conjecture of Edmonds and Giles. Discrete
Math., 32:213–214, 1980.
15. A. Schrijver. Theory of Linear and Integer Programming, Wiley, 1986.
16. P. D. Seymour. The matroids with the max-flow min-cut property. J. Comb. Theory
Ser. B, 23:189–222, 1977.
17. P. D. Seymour. On Lehman’s width-length characterization. In W. Cook and P. D.
Seymour, editors, Polyhedral Combinatorics, DIMACS Series in Discrete Math.
and Theoretical Computer Science, Vol. 1, pages 107–117, 1990.
Appendix
S1 = I1 ∪ I3 ∪ I5 , S2 = I1 ∪ I4 ∪ I6 ,
S3 = I2 ∪ I4 ∪ I5 , S4 = I2 ∪ I3 ∪ I6 .
Without loss of generality we can reorder the vertices in V (C) so that elements
in Ik precede elements in Ip when k < p.
Given a set P of p elements, let Hp denote the ((2^p − 1) × p) matrix whose
rows are the characteristic vectors of the nonempty subsets of P, and let Hp* be
its complement, i.e. Hp + Hp* = J.
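For instance, for p = 2,

  H2 =  1 1      H2* =  0 0
        1 0             0 1
        0 1             1 0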
For each r, t ≥ 1 let |I1 | = |I2 | = r, |I3 | = |I4 | = t and |I5 | = |I6 | = 1. We
call Qr,t the clutter corresponding to the matrix
              I1    I2    I3    I4    I5   I6

  A(Qr,t) =   Hr    Hr*   J     0     1    0
              Hr*   Hr    0     J     1    0
              J     0     Ht*   Ht    0    1
              0     J     Ht    Ht*   0    1
where J denotes a matrix filled with ones. The rows are partitioned into four
sets that we denote respectively by T (3, 5), T (4, 5), T (1, 6), T (2, 6). The indices
k, l for a given family indicate that the set Ik ∪ Il is contained in every element of
the family. Note that the edge S1 occurs in T (3, 5), S2 in T (1, 6), S3 in T (4, 5)
and S4 in T (2, 6).
Since H1 contains only one row, we have Q1,1 = Q6 and Q2,1 is given by

  A(Q2,1) =
    1 1 0 0 1 0 1 0   T (3, 5)
    1 0 0 1 1 0 1 0   T (3, 5)
    0 1 1 0 1 0 1 0   T (3, 5)
    0 0 1 1 0 1 1 0   T (4, 5)
    1 0 0 1 0 1 1 0   T (4, 5)
    0 1 1 0 0 1 1 0   T (4, 5)
    1 1 0 0 0 1 0 1   T (1, 6)
    0 0 1 1 1 0 0 1   T (2, 6)
Proposition 3. For all r, t ≥ 1, the clutter Qr,t is ideal and minimally non
packing.
The clutter D obtained by duplicating element j ∈ V (C) of C is defined by:
V (D) = V (C) ∪ {j′} where j′ ∉ V (C) and E(D) = {S : j ∉ S ∈ E(C)} ∪ {S ∪ {j′} :
j ∈ S ∈ E(C)}. Let α(k) be the mapping defined by: α(1) = 2, α(2) = 1, α(3) =
4, α(4) = 3, α(5) = 6, α(6) = 5.
Suppose that, for k ∈ {1, . . . , 6}, we have that Ik contains a single element
j ∈ V (C). Then j belongs to exactly two of S1, . . . , S4. These two edges are of
the form {j} ∪ Ir ∪ It and {j} ∪ Iα(r) ∪ Iα(t). We can construct a new clutter
C ⊗ j by duplicating element j in C and including in E(C ⊗ j) the edges:

  {j} ∪ Iα(j) ∪ Ir ∪ It,
  {j′} ∪ Iα(j) ∪ Iα(r) ∪ Iα(t).   (2)
A Characterization of Weakly Bipartite Graphs

Bertrand Guenin
1 Introduction
Let G = (V, E) be a graph and Σ ⊆ E. Edges in Σ are called odd and edges in
E −Σ are called even. The pair (G, Σ) is called a labeled graph. Given a subgraph
H of G, V (H) denotes the set of vertices of H, and E(H) the set of edges of H.
A subset L ⊆ E(G) is odd (resp. even) if |L ∩ Σ| is odd (resp. even). A cycle of
G is a connected subgraph of G with all degrees equal to two.
A labeled graph (G, Σ) is said to be weakly bipartite if the following polyhe-
dron Q is integral (i.e. all its extreme points are integral):
  Q = { x ∈ R_+^{|E|} : Σ_{i∈C} x_i ≥ 1 for all odd cycles C of (G, Σ) }   (1)
See Gerards [7] for a recent survey on weakly bipartite graphs and connections
with multicommodity flows. Particularly interesting is the case where Σ = E(G).
Let x̂ be any 0, 1 extreme point of Q. Then x̂ is the incidence vector of a set of
edges which intersects every odd cycle of G. In other words e − x̂ is the incidence
vector of a bipartite subgraph of G. Let w ∈ R_+^{|E|} be weights for the edges of G
and let x̄ be a solution to

  min { wx : x ∈ Q ∩ {0, 1}^{|E|} }.   (2)
Remark 1. (G, Σ) and (G, Σ △ δ(U )) have the same set of odd cycles, since every
cycle meets the cut δ(U ) in an even number of edges.
(H, θ) is called a proper minor of (G, Σ) if it is a minor of (G, Σ) and |E(H)| <
|E(G)|. An odd K5, denoted by K̃5, is the complete graph on 5 vertices where
all edges are labeled odd. For the polyhedron Q associated with K̃5, the 10
constraints corresponding to the triangles (the odd cycles of length three) de-
fine a fractional point (1/3, . . . , 1/3) of Q. Thus K̃5 is not weakly bipartite. Sey-
mour [18],[19] predicted, as part of a more general conjecture on binary clutters
(see Sec. 2.3), that:

Conjecture 1. (G, Σ) is weakly bipartite if and only if it has no K̃5 minor.

Theorem 1. Every minimally non weakly bipartite graph is a relabeling of K̃5.
2 Clutters
A clutter A is said to be binary if for any three sets S1, S2, S3 ∈ Ω(A), the set
S1 △ S2 △ S3 contains a set of Ω(A). Lehman [11] showed (see also Seymour [17]):
Thus in particular if A is binary then so is its blocker. The following is easy, see
for example [6].
Proposition 1. Let (G, Σ) be a labeled graph. Then the clutter of odd cycles of
(G, Σ) is binary.
Note that the blocker of Jt is Jt itself. We therefore have {0, 1} ∈ Ω(Jt) and
{0, 1} ∈ Ω(b(Jt)). It follows by Theorem 4 that Jt is not binary. The clutter F7
is defined as follows: E(F7 ) = {1, . . . , 7} and
Ω(F7 ) = {{1, 3, 5}, {1, 4, 6}, {2, 4, 5}, {2, 3, 6}, {1, 2, 7}, {3, 4, 7}, {5, 6, 7}} .
Since we can readily check that F7 and b(OK5 ) are not clutters of odd cycles this
conjecture implies Conjecture 1. Next are two results on mni binary clutters.
Case 1: |C| = r.
By Remark 5, we have C ∈ Ω(Ā). Let U be the mate of C and q = |C ∩ U | ≥
2. By Theorem 4, q is odd so in particular q ≥ 3. Since C ⊆ C1 ∪ C2 , we
must have |U ∩ C1 | > 1 or |U ∩ C2 | > 1. This implies that U is the mate of
C1 or C2 , i.e. that C = C1 or C = C2 .
Case 2: |C| > r.
Let t = |C1 ∩ C2 ∩ C|. Since C ⊆ C1 ∪ C2, it follows that

  |(C1 △ C2) ∩ C| = |C| − t.   (3)

For T = C1 △ C2 △ C, we have

  |T | = |(C1 ∩ C2 ∩ C) ∪ [(C1 △ C2) − C]|,   since C ⊆ C1 ∪ C2
       = |C1 ∩ C2 ∩ C| + |C1 △ C2| − |(C1 △ C2) ∩ C|
       = t + |C1 △ C2| − (|C| − t),   by (3)
       = 2t + |C1| + |C2| − 2|C1 ∩ C2| − |C|
       ≤ |C1| + |C2| − |C|,   since t ≤ |C1 ∩ C2|
       ≤ 2r − (r + 1),   since C1, C2 ∈ Ω(Ā) and |C| > r.

Since A is binary we have that T is equal to, or contains, an element of Ω(A).
But |T | ≤ r − 1, which contradicts Theorem 3(i) and Remark 5. ⊓⊔
Notice that for OK5 the previous theorem simply says that given two triangles
C1 , C2 there is no odd cycle (distinct from C1 and C2 ) which is contained in the
union of C1 and C2 . It is worth mentioning that this is a property of mni binary
clutters only. Indeed the property does not hold for odd holes or, more generally,
for any circulant matrix with the consecutive ones property. For a description of
many classes of mni clutters see [4] and [14].
Proposition 3. Let A be a mni binary clutter and B its blocker. For any e ∈
E(A) there exist C1 , C2 , C3 ∈ Ω(Ā) and U1 , U2 , U3 ∈ Ω(B̄) such that
(i) C1 ∩ C2 = C1 ∩ C3 = C2 ∩ C3 = {e}
(ii) U1 ∩ U2 = U1 ∩ U3 = U2 ∩ U3 = {e}
(iii) For all i, j ∈ {1, 2, 3} we have:
  Ci ∩ Uj = {e} if i ≠ j, and |Ci ∩ Uj | = q ≥ 3 if i = j.
and
Lemma 2. Let (G, Σ) be a minimally non weakly bipartite graph with a partition
of its edges as given in (4)-(5).
Fig. 1. Lemma 2. Bold solid lines represent paths in R ∪ W−e, dashed lines paths in
B ∪ W−e, and thin solid lines paths in G ∪ W−e.
Lemma 3. Let (G, Σ) be a minimally non weakly bipartite graph with a partition
of its edges as given in (4)-(5). Suppose we also have odd cycles CR , CB and CG
as defined in Lemma 2 where e is the only odd edge in CR ∪ CB ∪ CG .
(i) There is an odd path PR (resp. PB , PG ) between a vertex vBR (resp.
vRB , vBG ) of CB (resp. CR , CB ) distinct from w1 , w2 , and a vertex vGR
(resp. vGB , vRG ) of CG (resp. CG , CR ) distinct from w1 , w2 .
(ii) PR ⊆ R ∪ W−e, PB ⊆ B ∪ W−e and PG ⊆ G ∪ W−e.
Fig. 3. Lemma 3.
Proof (of Lemma 3). By symmetry it is sufficient to show the result for path
PR. Since |CR ∩ R| ≥ 2 there is an edge eB = (vBR, v′BR) ∈ B of CB such that
vBR is distinct from w1, w2 and CB−e(w1, vBR) contains exactly one edge in B,
namely eB. Similarly, we have an edge eG = (vGR, v′GR) ∈ CG with vGR distinct
from w1, w2 and CG−e(w1, vGR) ∩ G = {eG}.
By property (P2) there is an odd cycle C such that C ∩ B = {eB} and
C ∩ G = {eG}. The cycle C can be written as {eB, eG} ∪ PR ∪ P′R where PR and
P′R are paths included in R ∪ W−e. Since C is odd we can assume w.l.o.g. that
PR is odd and P′R is even.
Case 1: The endpoints of PR are v′BR, vGR (resp. vBR, v′GR).
Then let S = [CB−e(w1, v′BR), PR, CG−e(vGR, w1)]. By Lemma 1, E(S) con-
tains an odd cycle but e ∉ E(S) and E(S) ∩ B = ∅, a contradiction to
(P1).
Case 2: The endpoints of PR are v′BR, v′GR.
Then let S = [CB−e(w1, v′BR), PR, CG−e(v′GR, w1)]. By Lemma 1, E(S) con-
tains an odd cycle but e ∉ E(S) and E(S) ∩ B = E(S) ∩ G = ∅, a contra-
diction to (P1).
Thus PR has endpoints vBR, vGR. ⊓⊔
Remark 7. Let (G, Σ) be a labeled graph with an odd path P where all internal
vertices of P have degree two (in G). Then there is a sequence of relabeling
and contractions that will replace P by a single odd edge, without changing the
remainder of the graph.
(iii) P′R and P′B (resp. P′G) share an internal vertex tRB (resp. tRG).
(iv) P′B(tRB, v′GB) and P′G share an internal vertex tBG.
(v) P′B and P′G(tRG, v′BG) share an internal vertex t′BG.
(vi) Paths P′R(vBR, tRB) and P′R(vGR, tRG) consist of a single edge.
(vii) P′B (resp. P′G) has exactly one odd edge, which is incident to v′GB
(resp. v′BG).
(viii) P′R(vBR, tRB), P′R(vGR, tRG) are even and P′R(tRB, tRG) is odd.
(ix) No vertex is common to all three paths PR, PB and PG.
Fig. 4. The cycles CR, CB, CG with the paths P′R, P′B, P′G and the vertices tRB,
tRG, vRB, vRG, vBR, vGR, v′BG, v′GB.
We did not represent the vertices tBG (and t′BG) in the previous figure. Let q denote
the first vertex of P′G, starting from vRG, which is also a vertex of P′B(tRB, v′GB)
(see Fig. 4). By Lemma 5(iv) there exists such a vertex. Let q′ be the first vertex
of P′G(q, vRG), starting from q, which is either a vertex of P′B or equal to vRG.
By Lemma 5(ix), q, q′ are distinct from tRB.
Definition 2. (K, ΣK) is the graph (see Fig. 5) obtained by deleting every edge
of (H, ΣH) which is not an edge of CR, CB, CG or of P′R(vBR, tRB), P′B and P′G(q, q′).
From Lemma 5 we can readily obtain the following properties:
Remark 9.
(i) There are exactly two odd edges in (K, ΣK), namely e and the edge of P′B
incident to v′GB.
(ii) Let S be the set of vertices {w1, w2, vBR, vRB, v′GB, tRB, q, q′} shown in
Fig. 5. vRB and q may denote the same vertex but all other vertices of S
are distinct.
(iii) S is the set of all vertices of (K, ΣK) which have degree greater than two.
Fig. 5. The graph (K, ΣK) (two cases), with the vertices w1, w2, vBR, v′GB, tRB,
t′RB, q, q′ and the cycles CB, CG.
Lemma 6. Let (H, ΣH) be the graph defined in Lemma 5. There are three dis-
tinct odd paths F1, F2, F3 from tRB to t′RB.
Proof. Suppose for a contradiction this is not the case and we have Fi with
no internal vertices in common with (K, ΣK), see Fig. 6. Consider the graph
obtained from (H, ΣH) by deleting ē = (tRB, t′RB) and all edges which are not
edges of (K, ΣK) or edges of Fi. The following sequence of operations yields K̃5:
Fig. 6. The path Fi attached to (K, ΣK) at tRB and t′RB (here vRG = q′).
Let (H̄, ΣH̄) be the graph obtained by deleting from (H, ΣH) all the edges
which are not edges of CR, CB, CG and not edges of P′R, P′B and P′G. Because of
Lemma 7 we can define fi, for i = 1, 2, 3, to be the first internal vertex of Fi,
starting from tRB, which is also a vertex of (H̄, ΣH̄). By symmetry (see Fig. 4)
there is a vertex t′RG of P′G(tRG, vRG) which is incident to tRG, and there
are odd paths F′1, F′2, F′3 between tRG and t′RG. As previously, we can define f′i,
for i = 1, 2, 3, to be the first internal vertex of F′i, starting from tRG, which is
also a vertex of (H̄, ΣH̄).
The remainder of the proof is a case analysis which shows that for each
possible set of vertices f1, f2, f3 and f′1, f′2, f′3 the graph (H, ΣH) contains a
K̃5 minor. In order to prove Lemma 5 and to make the case analysis tractable,
we first establish general results for labeled graphs with properties (P1) and (P2).
Bipartite Designs

Grigor Gasparyan
1 Introduction
Suppose we are given two n × n {0, 1} matrices A and B, and a full support
vector d. Let us call the pair of matrices (A, B) a (bipartite) d-design if
  AᵀB = J + diag(d),
where J is the matrix filled with ones. It seems difficult to say anything about
the structure of the matrices A and B in such a general setting. But if d > 0
then a surprising result of Lehman [7] asserts that either
  A = B ≅ DPP ≡  0  1
                 1  I

(then we call the pair (A, B) a DPP-design), or for some r and s:

  AJ = JA = rJ;   BJ = JB = sJ;   AᵀB = BAᵀ = J + (rs − n)I
(then we call the pair (A, B) an (r, s)-design). This result generalizes the earlier
results of de Bruijn and Erdős [2] and Ryser [12], and it is one of the main
arguments in the proof of Lehman’s theorem on minimally non-ideal polyhedra
[8].
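A concrete (r, s)-instance (our illustration) is the Fano plane: taking A = B to
be its point-line incidence matrix (n = 7, r = s = 3), one checks AJ = JA = 3J
and AᵀB = AᵀA = J + 2I = J + (rs − n)I, so (A, B) is a (3, 3)-design.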
In this paper we would like to investigate the d-designs a bit more generally.
Our main goal is to find sufficient conditions which force a d-design to become
an (r, s)-design. The following theorem summarizes our results in that direction:
Definition 3. Let (A, B) be a d-design. If 1 + Σ_{i=1}^n d_i^{−1} = 0 then we call such a
design singular. The ith row of A (B) we call a d-row if ai.·d^{−1} = 0 (bi.·d^{−1} = 0).
Let P be a polyhedron. We say that P is vertex {0, 1} if all its vertices are
{0, 1} vectors. We say that P is facet {0, 1} if it can be given with the help
of {0, 1} constraints. We call P a {0, 1}-polyhedron if it is both vertex {0, 1}
and facet {0, 1}. P/j denotes the orthogonal projection of P on the hyperplane
xj = 0, and P \j denotes the intersection of P with the hyperplane xj = 0. The
first operation is called contraction and the second one deletion of the coordinate
j. A polyhedron P 0 is called a minor of P if it can be obtained from P by
successively deletion or contraction one or several coordinates.
For an m × n {0, 1} matrix A, we denote by P≤ (A) = {x ∈ Rn : Ax ≤
1; x ≥ 0} the set packing polytope (SP-polytope) and by P≥ (A) = {x ∈ Rn :
Ax ≥ 1; x ≥ 0} the set covering polyhedron (SC-polyhedron) associated with A.
A {0, 1} SP-polytope is called perfect, and a {0, 1} SC-polyhedron is called
ideal. A SP-polytope P is called minimally imperfect if it is not perfect, but all
its minors are perfect. A minimally non-ideal polyhedron is defined similarly.
It is easy to see that P≥ (DPP) is a minimally non-ideal polyhedron.
For more information on polyhedral combinatorics we refer to Schrijver [13].
If G = (V, E) is a graph, then n = n(G) denotes the number of vertices of G;
ω = ω(G) denotes the cardinality of a maximum clique of G; α = α(G) denotes
the cardinality of a maximum stable set; and χ = χ(G) denotes the chromatic
number of G. A k-clique or k-stable set will mean a clique or stable set of size k.
There are several ways to associate a combinatorial structure to a d-design (A, B).
A straightforward way to do it is to take two set systems A = {A1 , . . . , An } and
B = {B1 , . . . , Bn } on some ground set V = {v1 , . . . , vn } such that the matri-
ces A and B are (point-set) incidence matrices of A and B, respectively. Then
the pair of set systems (A, B) has the following property: for each 1 ≤ i, j ≤
n, |Ai ∩ Bj| = 1 + eij di. We call such a pair of set systems a d-design. In par-
ticular, it was proved by Padberg [11] (see also [3]) that the pair of set systems
of ω-cliques and α-stable sets of an (α, ω)-graph is an (α, ω)-design. Another
interesting special case is when A = B. It was proved by de Bruijn and Erdős
[2] that (A, A) is a d-design iff Aᵀ is a (possibly degenerate) projective plane.
A d-design can be characterized with the help of just one set system Aᵀ.
Then the ith column of B can be interpreted as the incidence vector of a set
subsystem of Aᵀ which contains all the points except Ai exactly once and Ai
exactly di + 1 times. We call such a set system a d-hypergraph. A d-hypergraph
corresponding to an (r, s)-design we call an (r, s)-hypergraph. In particular, a −1-
hypergraph is a hypergraph having an equal number of edges and vertices such
that for each vertex v, V \v can be partitioned with the help of its edges. We
will show that a −1-hypergraph is an (α, ω)-hypergraph, which corresponds to
the ω-clique hypergraph of some (α, ω)-graph.
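For example, the 5-cycle C5, viewed as a hypergraph whose five edges are its
graph edges, is a −1-hypergraph: it has as many edges as vertices, and for each
vertex v the remaining four vertices are partitioned by two edges. It is exactly
the ω-clique hypergraph of C5, an (α, ω)-graph with α = ω = 2.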
An interesting special case is when Aᵀ is 2-uniform, i.e. it is a graph. We
call a nonsingular d-design (A, B) a G-design if A is 2-regular, and we call a
graph G a d-graph if there exists a G-design (A, B) such that A is the (edge-vertex)
incidence matrix of G. If G is an odd cycle, then we call (A, B) a C-design.
Let G be a d-graph. Then it is not difficult to show that, for each 1 ≤ i ≤ n,
di = ±1 (see Lemma 9). Denote by G\v (G/v) the graph obtained from G after
deleting (duplicating) the vertex v. It is easy to see that, for each vertex v, either
G\v or G/v has a perfect matching. Call such a graph matchable; for example,
every odd cycle is matchable, since deleting any vertex leaves a path on an even
number of vertices. The following lemma characterizes d-graphs:
Proof. We proceed by induction on the number of vertices. Suppose the theo-
rem is true for graphs having fewer than n vertices and G is a d-graph with n
vertices. Then, clearly, G is connected, has an equal number of edges and vertices,
and an odd number of vertices. Hence G has exactly one cycle, which is odd, as the
(edge-vertex) incidence matrix of G is nonsingular. Furthermore, if v1 is a leaf of
G, and v1v2 ∈ E(G), then G\{v1; v2} is matchable. Indeed, for each v ≠ v1, v2,
the perfect matching of G\v (or G/v) must contain the edge v1 v2 . Hence after
deleting the edge v1 v2 from the matching, it will be a perfect matching for G\v
(or G/v). It follows that either v2 has degree 2, or it is a vertex of the cycle and
has degree 3.
Let v3 ≠ v1 be such that v3v2 ∈ E. Now if v2 has degree two then G\{v1; v2} is a
d-graph. Hence by the induction hypothesis, the distance from each vertex
v ≠ v1, v3 of degree other than two to the cycle is even. If v1 is the unique leaf
of G nonadjacent to the cycle, then the degree of v3 in G\{v1; v2} is one, hence
the distances from v1 and v3 to the cycle are also even. If v4 ≠ v1 is a leaf
nonadjacent to the cycle and v4v5 ∈ E then, by the induction hypothesis,
G\{v4; v5} and G\{v1; v2; v4; v5} are d-graphs, hence the distances from v1 and
v3 to the cycle are again even.
If G has no leaves then it is an odd cycle and we have nothing to prove. Suppose
G has leaves, but all of them are adjacent to the cycle. Then it is easy to see that G
has exactly two leaves, the neighbors of which are adjacent vertices in the cycle.
Denote by V1 the set of vertices v of G such that G\v has a perfect matching
and by V2 = V \V1. Then it is easy to see that G/v has a perfect matching iff
v ∈ V2, and |V1| = |V2| + 1. Now if (A, B) is a d-design corresponding to G, then
d^{−1}·1 = |V2| − |V1| = −1, a contradiction.
The sufficiency of the condition is proved by similar arguments.
4 De Bruijn-Erdős’s Matrices
In this section we summarize the information about DE-matrices, which we use
in this paper.
Proof. Delete all-one rows and columns and apply Lemma 2.
Lemma 4. Let A and A′ be DE-matrices, where a.1 ≠ a′.1 and a.j = a′.j, 1 <
j ≤ n. Then either A or A′ has an all one column.
Proof. Indeed, if say ai1 = 1 and a′i1 = 0, then ai. = 1ᵀ. Hence by Lemma 3, A
also has an all one column.
Proof. If (1 − b.i)·(1 − b.j) ≠ 0, then there exists an index k such that bki =
bkj = 0, hence 1·b.j = bk.·1 = 1·b.i. As J − B is connected, it follows that
JB = sJ for some integer s. Since B has no all one column, and it is a DE-matrix,
it cannot have an all one row. Hence B is totally s-regular.
The following result, which has been extracted by Sebő [14] from Lehman [8]
(see also [15] and [10]), is one of our key arguments in the proof of Theorem 13.
Denote by A* the set of all solutions of Aᵀx = 1, and by A^{01} the set of all
{0, 1} vectors in A*.
then A is a DE-matrix.
Proof. Let Bi ⊆ (A/i)^{01} be a matrix such that the equation Bi y = x/i has a
unique solution. As x/i has full support, Bi has no all zero row.
Suppose aij = 0, L = {l : alj = 1} and B′i is the submatrix of Bi induced by
the rows L. Then all the columns of B′i have exactly one 1. Thus we have:
The next important result of Lehman [8] will be used to prove Theorem 12.
Theorem 4 ([8]). Let A be an n × m {0, 1} matrix having full row rank and
no all one, all zero, or equal columns. If the vector x ∈ A* has full support,
and for each i the affine hull of (A/i) can be given with the help of {0, 1} constraints,
then A is a DE-matrix.
5 A Lemma on Matrices
The following lemma contains our main argument from linear algebra. Though
we need just a very special case of the lemma, we would like to state it in general
form, for the sake of possible other applications.
Suppose A, B, D ∈ R^{n×n}; U, W ∈ R^{n×m}, where D is nonsingular and U
(or W) has full column rank.

  AᵀB = D + UWᵀ  ⇔  BD^{−1}Aᵀ = I + X∆^{−1}Yᵀ,

Proof. Denote by

  F = ( ∆ᵀ  Uᵀ ),   E = ( −∆  −Wᵀ ),   D0 = ( −∆  0 )
      ( Y   A  )        ( X    B  )         ( 0   D );
Notice that if the inverse of the matrix D is easy to compute, then Lemma 6
reduces the singularity test of the n × n matrix D + UWᵀ to the singularity
test of an m × m matrix ∆.
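A worked instance of this reduction (our illustration, assuming all di ≠ 0) is the
determinant identity

  det(diag(d) + 1·1ᵀ) = (Π_{i=1}^n di)·(1 + Σ_{i=1}^n d_i^{−1}),

so diag(d) + J is singular exactly when 1 + Σ_{i=1}^n d_i^{−1} = 0, the singularity
condition of Definition 3.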
Taking m = 1, U = W = 1 and D = diag(d) we get:
It follows that if ai. is a d-row of a nonsingular d-design (A, B), then for
each k ≤ n, Σ_{j=1}^n aij bkj d_j^{−1} = eik. On the other hand, if for some i and
k, Σ_{j=1}^n aij bkj d_j^{−1} = eik, or, in particular, i ≠ k and ai.·bk. = 0, then ei-
ther ai. or bk. is a d-row.
The following simple lemma will also be useful in the sequel.
Proof. If one of the sets has λ elements, then all the other sets contain this one
and are disjoint otherwise. It follows that m ≤ n. Hence we may suppose that
di = |Ai| − λ > 0. Then AᵀA = diag(d) + λJ. Since λd^{−1}·1 ≠ −1, m ≤ rk
A ≤ n.
The following interesting fact can also be easily deduced from Lemma 6.
Theorem 7 ([12]). In any square design, there exists a set incident to each
given pair of points.
The following theorem completely characterizes the d-designs (A, B), where J −
A is disconnected (the proof is omitted).
1.
  (A, B) ≅  1 0 0     1 0 0
            1 1 0  ,  1 0 1  ;
            1 0 1     1 1 0
2.
  (A, B) ≅  1  e1ᵀ       1   0
            1  J − I  ,  e1  I  ;
3.
  (A, B) ≅  0  1      1     1 0 1
            1  J − I  J  ,  0 I 0  ;
            1  0      I     0 0 I
4.
  (A, B) ≅  1 0 1 1     1 1 0 0
            0 0 1 1     0 0 0 1
            1 1 0 0  ,  1 0 0 0  .
            1 1 0 1     0 0 1 1
Proof. Suppose (A, B) is not an (r, s)-design. Then by Lemma 9 and Theorem
9, for each j, either dj = −1 or dj ≥ r − 1, and by Theorem 9, either A or B
has a d-row. Consider two cases:
Case 1: a1. is a d-row. Now, it follows from Lemma 7 that for each i 6= 1,
either a1. · bi. = 0 or a1. ≤ bi. . As B has no equal columns, it follows that r = 2,
dj = ±1, hence (A, B) is a G-design.
Case 2: b1. is a d-row having the maximum number of ones.
Suppose b1. = (1 . . . 1, 0 . . . 0), where b1.·1 = k. Then we have that for each
i ≠ 1, either ai.·b1. = 0 or ai.·b1. = r. Moreover, a1.·b1. = r − 1. It follows
that r ≤ k < n. Suppose ai.·b1. = r if 1 < i ≤ l, ai.·b1. = 0 if i > l, and
a1,k+1 = 1. Denote by A1 = A[l+1 . . . n; k+1 . . . n], B1 = B[l+1 . . . n; k+1 . . . n].
As A is nonsingular, l ≥ k. On the other hand, A1ᵀB1 = diag(dk+1, . . . , dn) + J,
where Σ_{j=k+1}^n d_j^{−1} = Σ_{j=1}^n d_j^{−1} ≠ −1, hence k ≥ l. It follows that k = l and
A1 is nonsingular. Hence the equation A1ᵀx = 1 − e1 has a unique solution.
As all the columns of B′ = B[k + 1 . . . n; 1 . . . k] satisfy that equation, B′ has
an all one row. Suppose bp. is the row of B corresponding to that row of B′.
As Σ_{j=1}^n a2j bpj d_j^{−1} = 0, bp. is a d-row having more ones than b1., which is a
contradiction.
Notice that only Lemma 8 and Theorem 8 contain some information about
the structure of singular designs so far. The characterization of singular designs
seems to be a more difficult problem, as Lemma 6 cannot be applied directly.
In particular, it would be interesting to check whether there exist singular d-
designs (A, B) where A is r-regular and r > 2. A partial answer to this question
is given in [14]. Here is another result in that direction; the proof is similar to
the proof of Theorem 10.
In this section we apply Theorem 9 to get some new, smaller sets of conditions
characterizing (α, ω)-graphs. It is not difficult to deduce from Theorem 9 (see
[3]) that G is an (α, ω)-graph iff it has a family of n cliques A and a family
of n stable sets B such that AT B = J − I. The following reformulation of this
statement is a strengthening of a similar result of Hougardy and Gurvich [6].
Theorem 11 ([4]). G is an (α, ω)-graph iff it has an α-stable set A1 such that
for each vertex s ∈ A1 and each stable set S ⊂ V , χ(G\s) = ω = ω(G\S).
Notice that Theorem 11 immediately implies Theorem 2 and all the proper-
ties of minimal imperfect graphs shown by Padberg [11].
From the proof of Theorem 11 we get:
Proof. Let B be the matrix whose ith column is the incidence vector of a
q-clique disjoint from the stable set corresponding to the ith column of A. Then
1ᵀAᵀB1 ≥ pqn, hence n = pq + 1 and AᵀB = J − I.
The following two theorems contain both Padberg’s theorem on minimally im-
perfect polytopes [11] and Lehman’s theorem on minimally non-ideal polyhedra
[8], and the second one also contains the part of Sebő [14], which corresponds to
nonsingular matrix equations. In the proofs we mainly use the ideas of Lehman
[8], Padberg [10], several results on d-designs of the present work and the follow-
ing simple but surprising fact communicated by Sebő [14]: if P is a facet {0, 1}
polyhedron such that, for each i ≤ n, both P \i and P/i are {0, 1}-polyhedra then
P is full dimensional.
(A, B) the d-design corresponding to F, then the following cases are possible:
1. Either (A, B) is a DPP-design or

  (A, B) ≅  0  1     1  1
            1  I  ,  0  I  ;
References
1. R. C. Bose. A note on Fisher's inequality for balanced incomplete block designs. J.
2. N. G. de Bruijn and P. Erdős. On a combinatorial problem. Indag. Math., 10:421–
423, 1948.
3. V. Chvátal, R. L. Graham, A. F. Perold, and S. H. Whitesides. Combinatorial
designs related to the strong perfect graph conjecture. Discrete Math., 26:83–92,
1979.
4. G. S. Gasparyan. Minimal imperfect graphs: A simple approach. Combinatorica,
16(2):209–212, 1996.
5. G. S. Gasparyan and A. Sebő. Matrix equations in polyhedral combinatorics. In
preparation.
6. S. Hougardy and V. Gurvich. Partitionable Graphs. Working paper.
7. A. Lehman. On the width-length inequality. Math. Programming, 17:403–417, 1979.
Characterizing Noninteger Polyhedra with 0–1 Constraints

András Sebő
1 Introduction
This paper is organized as follows: Section 2 states the main result, its corollaries,
and reformulations. The proof of the main result is provided in sections 3 and
4. Section 5 is devoted to some more examples and other comments.
2 Results
When this does not cause misunderstanding, we will occasionally use the
shorter notations P ≤ := P ≤ (A≤ ), P ≥ := P ≥ (A≥ ), P := P (A≤ , A≥ ) = P ≤ ∩P ≥ .
Recall that the polyhedra P ≤ (A′≤ ) := P ≤ \ I/J and P ≥ (A′≥ ) := P ≥ \ I/J
(I, J ⊆ V := {1, . . . , n}, I ∩ J = ∅) are called corresponding minors, and
(A′≤ , A′≥ ) =: (A≤ , A≥ ) \ I/J is a minor of (A≤ , A≥ ). (Note that two minors are
corresponding if and only if the two I ∪ J are the same, since for set-packing
polyhedra deletion is the same as contraction.) Furthermore, if for all such I, J
the polyhedron (P ≤ (A≤ ) \ I/J) ∩ (P ≥ (A≥ ) \ I/J) is integer, then the system
(A≤ , A≥ ) will be called fully integer.
– A′≤ is a minimal nongraph clutter, or it is partitionable with µ = 0; moreover,
in either case the regular vertex of P ≤ (A′≤ ) is in P ≥ (A′≥ ), and it is the
unique packing type fractional vertex of P ≤ (A′≤ ) ∩ P ≥ (A′≥ ).
– A′≥ is a degenerate projective plane, or it is partitionable with µ ≥ 2; more-
over, in either case the regular vertex of P ≥ (A′≥ ) is in P ≤ (A′≤ ), and it is
the unique covering type fractional vertex of P ≤ (A′≤ ) ∩ P ≥ (A′≥ ).
– (A′≤ , A′≥ ) is a mixed odd circuit.
Lovász’s NP-characterization of imperfect graphs [8] (with the additional
properties proved by Padberg[10]), follow:
Corollary 1. Let A≤ be a 0–1-matrix with n columns. Then A≤ is imperfect
≤
if and only if it has either a minimal nongraph or a partitionable minor A0 ,
≤
moreover P (A0 ) has a unique fractional vertex.
Specializing Theorem 1 to set-covering polyhedra one gets Lehman's celebrated
result [6], see also Seymour [12]:
Corollary 2. Let A≥ be a 0–1-matrix with n columns. Then A≥ is nonideal if
and only if it has either a degenerate projective plane or a partitionable minor
A′≥; moreover P (A′≥ ) has a unique fractional vertex.
The following two consequences are stated in a form helpful for coNP char-
acterization theorems (see Section 5):
Corollary 3. Let A≤ and A≥ be 0–1-matrices with n columns. Then (A≤ , A≥ )
is not fully integer if and only if at least one of the following statements holds:
– A≤ has a minimal nongraph or a partitionable, furthermore minimal imper-
fect minor with its regular vertex in the corresponding minor of P ≥ (A≥ ),
– A≥ has a degenerate projective plane or a partitionable minor with its regular
vertex in the corresponding minor P ≤ (A′≤ ) of P ≤ (A≤ ), where A′≤ is perfect,
– (A≤ , A≥ ) has a mixed odd circuit minor.
If we concentrate on the structural properties of the matrices A≤ and A≥ implied
by the existence of a fractional vertex we get the following. This statement is not
reversible: if A≤ consists of the maximal stable sets of an odd antihole, and A≥
of one maximal but not maximum stable set, then (A≤ , A≥ ) is fully integer,
although A≤ is minimal imperfect!
Note the asymmetry between ‘minimal imperfect’ in the first, and ‘partitionable’
in the second case (for an explanation see 5.2).
The results certainly provide a coNP characterization in the following case:
– P is noninteger, and
– P ∩ {x ∈ R^n : xi = 0} (= P ≤ ∩ {x ∈ R^n : xi = 0} ∩ P ≥ ∩ {x ∈ R^n : xi = 0})
is an integer polyhedron for all i ∈ V , and
– P has the sandwich property.
Note that Theorem 2 sharpens Theorem 1 in two directions: first, the con-
straint of Theorem 2 does not speak about all minors, but only about the dele-
tion and contraction of elements; second, the integrality after the contraction of
elements is replaced by the sandwich property.
The corollaries about combinatorial and polyhedral minimal nonintegrality
satisfy the condition of Theorem 2 for two distinct reasons. In the combinatorial
case simplicity does not necessarily hold, but deleting the certain equalities from
A≥ , the system remains combinatorially minimal noninteger (see 5.2).
Corollary 6. If (A≤ , A≥ ) is combinatorially minimal noninteger, then at least
one of the following statements holds:
– A≤ is a minimal nongraph or a partitionable clutter with µ = 0, furthermore
it is minimal imperfect, and the regular vertex of P ≤ (A≤ ) is the unique
packing type fractional vertex of P ≤ (A≤ ) ∩ P ≥ (A≥ ).
– A≥ is a degenerate projective plane, or a partitionable clutter with µ ≥ 2,
while A≤ is perfect, and the regular vertex of P ≥ (A≥ ) is in P ≤ (A≤ ).
– (A≤ , A≥ ) is a mixed odd circuit, and (1/2)1 is its unique fractional vertex.
This easily implies Theorem 1 and its corollaries using the following remark.
(it is particularly close to Corollary 3), while the next corollary does not have
similar consequences. This relies on the following:
– If P is noninteger, (A≤ , A≥ ) does contain a combinatorially minimal nonin-
teger minor. (Proof: In both P ≤ and P ≥ delete and contract elements so that
the intersection is still noninteger. Since the result has still 0–1 constraints
this can be applied successively until arriving at a combinatorially minimal
noninteger system.)
– If P is noninteger, one does not necessarily arrive at a polyhedrally minimal
noninteger polyhedron with deletions and restrictions of variables. (Coun-
terexample: Example 1.)
Gasparyan [4] has deduced this statement by proving that in the polyhedral
minimal case the matrices involved in the matrix equations are nonsingular (see
comments concerning nonsingularity in Example 1).
The main frame of the present paper tries to mix (the polar of) Lehman’s
polyhedral and Padberg’s matricial approaches so as to arrive at the simplest
possible proof. Lemmas 1–4 and Lemma 7 are more polyhedral, Lemma 5,
Lemma 6 and Lemma 8 are matricial and combinatorial. When specializing
these to ideal clutters, their most difficult parts fall out and quite short variants
of proofs of Lehman’s or Padberg’s theorem are at hand.
Remark 2. Compare Lemma 1 with Fonlupt, Sebő [2]: a graph is perfect if and
only if the linear rank of the maximum cliques (as vertex-sets) in every induced
subgraph is at most n − ω + 1 where ω is the size of the maximum clique in the
subgraph; the equality holds if and only if the subgraph is uniquely colorable.
We note and use in the sequel without reference that if P is minimal non-
integer, then w > 0 for all fractional vertices w of P (wi = 0 would imply that
(P ≤ \ i) ∩ (P ≥ \ i) is also noninteger).
In sections 3 and 4, I will denote the identity matrix and J the all 1 matrix
of appropriate dimensions; A is called r-regular if 1A = r1, and r-uniform if
A1 = r1; Ac := {V \ A : A ∈ A}. A is said to be connected if V cannot be
partitioned into two nonempty classes so that every A ∈ A is a subset of one of
the two classes. There is a unique way of partitioning A and V into the connected
components of A.
Lemma 2. If (A≤ , A≥ ) is minimal noninteger, w is a fractional vertex of P :=
P (A≤ , A≥ ), and A ⊆ Aw is a set of n linearly independent members of Aw ,
then every connected component K of Ac is (n − rK)-regular and (n − rK)-uniform
(rK ∈ N), and r(A − v) = n − dA(v).
Proof. Recall that w > 0. If P is minimal noninteger, then for arbitrary i ∈ V the
sandwich property provides us Q^i ⊆ R^{V\{i}} with (P ≤ (A≤ ) ∩ P ≥ (A≥ ))^i ⊆ Q^i ⊆
(P ≤ (A≤ ))^i ∩ (P ≥ (A≥ ))^i, that is, w^i ∈ Q^i and w^i > 0. Applying the inequality in
Lemma 1 to Q^i and w^i, and using the trivial but crucial fact that A_{w^i}(Q^i) ⊇ A−i,
we get the inequality r(A − i) ≤ n − max{ |A| : A ∈ A − i }.
On the other hand, r(A) = n by assumption. One can now finish in a few
lines, as Conway proves de Bruijn and Erdős's theorem [7], which is actually
the same as Seymour [12, Lemma 3.2]:
Let H := Ac for the simplicity of the notation. What we have proved so far
translates as dH(v) ≤ |H| for all v ∈ H ∈ H. But then,

  n = Σ_{H∈H} 1 = Σ_{H∈H} Σ_{v∈H} 1/|H| = Σ_{v∈V} Σ_{H∈H, v∈H} 1/|H|
    = Σ_{v∈V} dH(v)/|H| ≤ Σ_{v∈V} 1 = n,
Remark 3. The situation of the above proof will be repeated several times:
when applying Lemma 1, the 0–1 vectors that have an important auxiliary role
for bounding the rank of some sets are in Bw(Q^i), and are not necessarily vertices
of P. The reader can check on mixed odd circuits that the neighbors B =
{B1, . . . , Bn} of (1/2)1 are not suitable for the same task (unlike in the special
cases): the combinatorial ways that use B had to be replaced by this more general
polyhedral argument. Watch for the same technique in Lemma 7!
The next lemma synthesizes two similar proofs occurring in the special cases:
The following lemma extracts and adapts to our needs well-known statements
from Lehman’s, Seymour’s and Padberg’s works, and reorganizes these into one
statement. It can also be deduced by combining results of Gasparyan [4], which
investigate combinatorial properties implied by matrix equations. For instance
the connectivity property of Lemma 6 below is stated in [4] in a general self-
contained combinatorial setting.
Lemma 6. If P is minimal noninteger, and w ∈ P is a fractional vertex of P,
then A = A(w) is nonsingular and connected; moreover,
Proof. (Sketch) If Ac has at least two components, then any two sets whose
complements are in different components cover V . This, and the matrix equation
of Lemma 5 determine a degenerate combinatorial structure. (For instance one
can immediately see that the associate of a third set has cardinality at most two,
and it follows that all but at most one members of B have at most two elements.)
If Ac has one component, then the uniformity and regularity of Ac claimed
by Lemma 2 implies that of A. ⊓⊔
Check the statement for the mixed C7 of Example 1! (It can also be instructive
to follow the proof on this example.)
5 Comments
5.1 Further Examples
A system (A≤ , A≥ ) for which P (A≤ , A≥ ) ⊆ R^5 is integer, but (A≤ , A≥ ) is
not fully integer: the rows of A≤ are (1, 1, 0, 0, 0), (0, 1, 1, 0, 0), (1, 0, 1, 0, 0) and
(0, 0, 1, 1, 1); A≥ consists of only one row, (0, 0, 0, 1, 1).
We mention that a class of minimal noninteger simple systems (A≤ , A≥ ) with
the property that (A≤ , A≥ ) \ i (i ∈ V ) defines an integer, but not always fully
integer polyhedron, can be defined with the help of ‘circular’ minimal imperfect
and minimal nonideal systems (see Cornuéjols and Novick [1]): define A≤ := C_n^r,
A≥ := C_n^s, where r ≤ s and A≤ is minimal imperfect, A≥ is minimal nonideal.
Such examples do not have mixed vertices, so they also show that the first
two cases of our results can both occur in the same polyhedron.
Acknowledgments
I am thankful to Grigor Gasparyan and Myriam Preissmann for many valuable
comments. Furthermore, I feel lucky to have learnt Lehman’s results and espe-
cially to have heard the main ideas of Padberg’s work from Grigor Gasparyan.
I would like to thank András Frank for comparing a lemma in [12] concerning
ideal matrices to Erdős and de Bruijn's theorem: this helped me get closer
to ideal matrices and strengthened my belief in a common generalization (a
particular case of Fisher’s inequality is implicit in proofs for minimal imperfect
graphs as well, see [3]).
Last but not least I am indebted to Kazuo and Tomo Murota for their mirac-
ulous help of various nature: due to them, it was possible to convert an extended
abstract to a paper during five jet-lag-days.
References
1. G. Cornuéjols and B. Novick. Ideal 0–1 matrices. J. of Comb. Theory B, 60(1):145–
157, 1994.
2. J. Fonlupt and A. Sebő. The clique rank and the coloration of perfect graphs. In
R. Kannan and W. Pulleyblank, editors, Integer Programming and Combinatorial
Optimization I. University of Waterloo Press, 1990.
3. G. Gasparyan. Minimal imperfect graphs: A simple approach. Combinatorica,
16(2):209–212, 1996.
4. G. Gasparyan. Bipartite designs. In R. E. Bixby, E. A. Boyd, and R. Z. Rı́os-
Mercado, editors, Integer Programming and Combinatorial Optimization: Proceed-
ings of the 6th International Conference on Integer Programming and Combinato-
rial Optimization, LNCS, Vol. 1412, pages 23–35. Springer, 1998. This volume.
5. G. Gasparyan and A. Sebő. Matrix equations in polyhedral combinatorics. 1998.
In preparation.
6. A. Lehman. The width-length inequality and degenerate projective planes. In
W. Cook and P. D. Seymour, editors, Polyhedral Combinatorics, DIMACS, Vol. 1,
pages 101–105, 1990.
7. J. H. van Lint and R. M. Wilson. A Course in Combinatorics. Cambridge Univer-
sity Press, 1992.
8. L. Lovász. A characterization of perfect graphs. J. of Comb. Theory B, 13:95–98,
1972.
9. M. Padberg. Perfect zero-one matrices. Math. Programming, 6:180–196, 1974.
10. M. Padberg. Lehman’s forbidden minor characterization of ideal 0–1 matrices.
Discrete Mathematics, 111:409–420, 1993.
11. A. Schrijver. Theory of Linear and Integer Programming. Wiley, 1986.
12. P. D. Seymour. On Lehman’s width-length characterization. In Polyhedral Combi-
natorics, DIMACS, Vol. 1, pages 107–117, 1990.
A Theorem of Truemper
Michele Conforti and Ajai Kapoor*
1 Truemper’s Theorem
Let β be a 0,1 vector indexed by the chordless cycles of an undirected graph
G = (V, E). G is β-balanceable if its edges can be labelled with labels 0 and 1
such that l(C) ≡ βC (mod 2) for every chordless cycle C of G, where l(e) is the
label of edge e and l(C) = Σ_{e∈E(C)} l(e).
We denote by β^H the restriction of the vector β to the chordless cycles of an
induced subgraph H of G.
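Note that relabeling along a cut preserves all the relevant parities: if l′(e) =
1 − l(e) for e ∈ δ(U) and l′(e) = l(e) otherwise, then l′(C) ≡ l(C) (mod 2) for
every cycle C, since every cycle meets the cut δ(U) in an even number of edges.
This "scaling" operation is used freely in the proofs below.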
In [14] Truemper showed the following theorem:
Theorem 1. A graph G is β-balanceable if and only if every induced subgraph
H of type (a), (b), (c) and (d) (Figure 1) is β H -balanceable.
Graphs of type (a), (b) or (c) are referred to as 3-path configurations (3PC's).
A graph of type (a) is called a 3PC(x, y), where node x and node y are connected
by three internally disjoint paths P1, P2 and P3. A graph of type (b) is called
a 3PC(xyz, u), where xyz is a triangle and P1, P2 and P3 are three internally
disjoint paths with endnodes x, y and z respectively and a common endnode
u. A graph of type (c), called a 3PC(xyz, uvw), consists of two node disjoint
triangles xyz and uvw and disjoint paths P1, P2 and P3 with endnodes x and
u, y and v, and z and w respectively. In all three cases the nodes of Pi ∪ Pj,
i ≠ j, induce a chordless cycle. This implies that all paths P1, P2, P3 of (a) have
length greater than one. Graphs of type (d) are wheels (H, x). These consist of a
chordless cycle H called the rim together with a node x called the center, that
has at least three neighbors on the cycle. Note that a graph of type (b) may also
be a wheel.
* Supported in part by a grant from Gruppo Nazionale Delle Ricerche-CNR.
Assume G is connected and contains a clique cutset Kl with l nodes and let
G′1, G′2, . . . , G′n be the components of the subgraph induced by V (G) \ Kl. The
blocks of G are the subgraphs Gi induced by V (G′i) ∪ Kl, i = 1, . . . , n.
Proof: The "only if" part is obvious. We prove the "if" statement when G has
a K2 cutset {u, v}, since the other case is again immediate. All blocks have a
β-balancing in which edge uv has the same label, since all blocks Gi are β^{Gi}-
balanceable and we can always scale on a cut of Gi separating u and v. The
in T and vi is chosen so that, amongst all nodes in N(v) \ {v0, ..., vi−1}, the path between nodes v and vi is shortest in the subgraph of G with edge set E(Gv) ∪ {vv0, ..., vvi−1}. Now place first the edges of T, then the other edges of Gv in a consistent ordering with respect to T \ {v}, then vv1, ..., vvk. This ordering is a consistent ordering for G, and the signing algorithm can be applied to produce, from the β-balancing of Gv, a β-balancing of G. □
Proof: Let G″ be the subgraph of G′ induced by V(G′) \ V(C). If G″ is a single node, say u, and u has only two neighbors ci and cj in C, then ci and cj are nonadjacent and G′ is a 3PC(ci, cj). Otherwise G′ is a wheel with u as center.
If G″ contains more than one node, by 3) we have that G″ is connected and that:
4) Every node of G″ has at most two neighbors in C and these two neighbors are adjacent.
5) G″ contains at most one pair of nodes, say x1 and xn, such that both x1 and xn have neighbors in C and (N(x1) ∪ N(xn)) ∩ V(C) contains either at least three nodes or two nonadjacent nodes.
(Indeed by 3), we have that G″ is connected. So if G″ contains more than one such pair, let x1, xn be chosen satisfying 5) and closest in G″. Let P = x1, ..., xn be a shortest path in G″ connecting them. The subgraph G* of G induced by V(C) ∪ V(P) satisfies 1) and 2). Then if more than one such pair exists, G* is a proper subgraph of G′ and this contradicts 3).)
Let C = c1, ..., cm and assume first that G″ contains one pair of nodes x1, xn satisfying 5). Then by 3), G″ is a path P = x1, ..., xn. If a node of C, say ci, is adjacent to a node xi, 1 < i < n, of P, then by 3) and 4), x1 is adjacent to ci−1, possibly to ci and no other node of C. Node xn is adjacent to ci+1, possibly to ci (indices mod m) and no other node of C. Therefore no other node of C is adjacent to an intermediate node of P. In this case, G′ is a wheel with center ci.
If no node of C is adjacent to an intermediate node of P, then by 4) we can assume w.l.o.g. that x1 is adjacent to ci−1 and possibly ci, and xn is adjacent to cj+1 and possibly cj. If x1 or xn has two neighbors in C and i = j, then G′ is a wheel with center ci. In the remaining cases G′ is a 3-path configuration.
If G″ contains no pair of nodes satisfying 5), by 1) and 4) we have that C is a triangle c1, c2, c3, all three nodes of C have neighbors in G″ and no node of G″ has more than one neighbor in C. If G″ is a chordless path P = x1, ..., xn with
x1 adjacent to, say, c1 and xn adjacent to c2, then c3 has some neighbor in P and G′ is a wheel with center c3. Otherwise let P12 be a shortest path connecting c1 and c2 whose intermediate nodes are in G″. If c3 has a neighbor in P12 we have a wheel with center c3. Otherwise let P3 be a shortest path connecting c3 and V(P12) \ {c1, c2} whose intermediate nodes are in G″. By 3), G′ is made up of C together with P12 and P3; furthermore P3 meets P12 either in a node x or in two adjacent nodes t1, t2. In the first case we have a 3PC(c1c2c3, x), otherwise we have a 3PC(c1c2c3, t1t2t3). □
For e ∈ E(G), Ge denotes the graph whose node set represents the chordless cycles of G containing e and whose edges are the pairs C1, C2 in V(Ge) such that C1 and C2 belong to a common 3-path configuration or wheel.
Proof: Assume not. Let Ge1 and Ge2 be two components of Ge. Let Gi be the subgraph of G induced by the node set ∪_{C∈Gei} V(C), for i = 1, 2.
Assume first that {u, v} is a K2 cutset separating G1 from G2 in the graph induced by V(G1) ∪ V(G2). Pick C1 ∈ Ge1 and C2 ∈ Ge2 and a path P in G such that in the subgraph G′ of G induced by V(C1) ∪ V(C2) ∪ V(P), {u, v} is not a K2 cutset, and C1, C2 and P are chosen so that |P| is minimized. (Note that P exists since {u, v} is not a K2 cutset of G.) Then by the minimality of P, no node of P is contained in a chordless cycle containing edge e. By Lemma 6, C1 is a chordless cycle in a 3-path configuration or wheel H, contained in G′. Since any edge in a 3-path configuration or wheel is contained in two chordless cycles, V(C1) ∪ V(C2) ⊆ V(H). But then C1C2 is an edge of Ge, a contradiction.
So {u, v} is not a K2 cutset in the graph induced by V(G1) ∪ V(G2). Let C2 ∈ Ge2 be such that for some C ∈ Ge1, {u, v} is not a K2 cutset in the graph induced by V(C) ∪ V(C2). Let C2 be u = v1, ..., vm = v. Let v^C be the node of lowest index in V(C2) \ V(C) and let SC be the component of the graph induced by V(C2) \ V(C) containing node v^C. Amongst all C ∈ Ge1 such that {u, v} is not a K2 cutset in the graph induced by V(C) ∪ V(C2), let C1 be the chordless cycle for which the node v^C1 has the highest index and, with respect to that, |SC1| is smallest possible. By Lemma 6, C1 is a chordless cycle of a 3-path configuration or wheel H contained in V(C1) ∪ V(SC1). Let C3 be the chordless cycle of H distinct from C1 containing edge e. We show that C3 contradicts the choice of C1. Since H contains C1 and C3, C3 ∈ Ge1. Also C2 and C3 have a common node which is distinct from u or v, and so uv is not a K2 cutset in the subgraph of G induced by V(C3) ∪ V(C2). If v^C1 is contained in V(C3) then v^C3 has a higher index than v^C1, a contradiction; otherwise, since SC3 ⊆ SC1 and some node of SC1 belongs to C3, |SC3| < |SC1|, a contradiction. □
Proof of Theorem 1: The necessity of the condition is obvious. We prove the
sufficiency by contradiction. Assume that G and β are chosen so that G is a
counterexample to the theorem with respect to β and V (G) is minimal. Then G
is connected and by Remark 5 G contains no K1 or K2 cutset.
Clearly triangulated graphs, i.e. graphs that do not contain a hole, are universally signable. In [5] these graphs are shown to generalize many of the structural properties of triangulated graphs. Here we show a decomposition theorem that follows easily from the co-NP characterization of these graphs given by Theorem 1.
Theorem 11. A connected universally signable graph that is not a hole and is
not a triangulated graph contains a K1 or K2 cutset.
It was the above decomposition theorem that prompted us to look for a new
proof for Theorem 1.
Now Theorem 11 and the following result of Hajnal and Suranyi [11] can be
used to decompose with clique cutsets a universally signable graph into holes
and cliques.
Theorem 12. A triangulated graph that is not a clique contains a clique cutset.
Theorem 13. A graph is α-balanceable if and only if α_C is even for all even length chordless cycles C and odd otherwise, and every subgraph H of G of type (a), (b), (c) or (d) is α_H-balanceable.
To see that the two theorems are equivalent, note that an α-balancing of G with labels 1 and −1 is implied by a β-balancing with β_C = ((α_C − |E(C)|)/2) mod 2, by replacing the 0's by −1's. Similarly, the β-balancing of G with labels 0 and 1 is implied by an α-balancing with α_C = (2β_C + |E(C)|) mod 4, by replacing the −1's by 0's.
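Both conversions are simple modular arithmetic; the following small sketch (our own illustration, with the cycle lengths |E(C)| supplied explicitly) writes them out.

```python
# Sketch: the two conversions of the equivalence argument above.
# alpha, beta: dicts over cycle indices; cycle_len[c] = |E(C)|.
# Theorem 13 guarantees alpha[c] and cycle_len[c] have the same
# parity, so the division by 2 below is exact.

def beta_from_alpha(alpha, cycle_len):
    return {c: ((alpha[c] - cycle_len[c]) // 2) % 2 for c in alpha}

def alpha_from_beta(beta, cycle_len):
    return {c: (2 * beta[c] + cycle_len[c]) % 4 for c in beta}
```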
The bipartite graph G(A) of a matrix A has the row and column sets of A as color classes, and for all entries aij ≠ 0, G(A) has an edge ij of label aij.
A 0, ±1 matrix A is balanced if G(A) is α-balanced for the vector α of all zeroes. A 0,1 matrix A is balanceable if G(A) is α-balanceable for the vector α of all zeroes. (From now on, signing consists of replacing some of the 1's with −1's.)
Note that the same signing algorithm of Section 1, applied to G(A), can be used to obtain a balanced matrix from A, when A is balanceable. Here signing the edges of G(A) means assigning labels ±1.
We can now derive from Theorem 13 a co-NP characterization of balanceable
matrices:
Proof: By Theorem 13 we only need to find in G(A) the subgraphs of type (a), (b), (c) or (d) that are not balanceable. Since G(A) is bipartite it cannot contain graphs of type (b) or (c). Graphs of type (a) with both endnodes in the same side of the bipartition are seen to be balanceable by signing the edges so that the three paths have the same length mod 4. When the two nodes of degree 3 belong to opposite sides of the bipartition, then since two of the paths have the same length mod 4, either 1 mod 4 or 3 mod 4, there exists a chordless cycle signed incorrectly with respect to the α of all zeroes.
For a wheel (H, x), let C1, ..., Ck be the chordless cycles of (H, x) containing x. Obtain a signing of the graph so that C1, ..., Ck are signed correctly. For F ⊆ E, let l(F) = Σ_{e∈F} l(e). Then Σ_{i=1}^{k} l(Ci) ≡ 0 mod 4. But l(H) = Σ_{i=1}^{k} l(Ci) − 2l(S), where S consists of all edges with one endpoint at the center node of the wheel. Since 2l(S) ≡ 2|S| mod 4, clearly l(H) ≡ 0 mod 4 if and only if k = |S| is even. □
In [8], [2], a polynomial algorithm is given to recognize whether a matrix is balanceable or balanced. Balanced 0, ±1 matrices have interesting polyhedral properties and have recently been the subject of several investigations; see [7] for a survey.
length 6 (and the center node has obviously three neighbors in the rim). We will see this later in this section.
To state the theorem of Tutte characterizing regular matrices, we need to introduce the notion of pivoting in a matrix. Pivoting on an entry ε ≠ 0 of a matrix

  A = ( ε  y^T )
      ( x  D   ),

we obtain the matrix

  B = ( −ε  y^T        )
      (  x  D − εxy^T ).
Remark 15. Let B be obtained from A by pivoting on the nonzero entry aij. Then:
– A can be obtained from B by pivoting on the same entry.
– Let aij = ε be the pivot element. Then bij = −aij. For l ≠ j, bil = ail and for k ≠ i, bkj = akj. For l ≠ j and k ≠ i, bkl = akl − ε·ail·akj.
– det(A) = ±det(D − εxy^T) and det(B) = ±det(D).
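As an illustration, the pivot update of Remark 15 can be coded in a few lines. The sketch below is ours, assumes the pivot entry is ±1 (as it is in a 0, ±1 matrix), and performs the GF(2) variant, introduced just below, simply by reducing the same update modulo 2.

```python
# Sketch: pivoting on entry (i, j) following Remark 15.

def r_pivot(A, i, j):
    """R-pivot of the matrix A (list of lists) on A[i][j] = +1 or -1."""
    eps = A[i][j]
    B = [row[:] for row in A]
    B[i][j] = -eps                      # b_ij = -a_ij
    for k in range(len(A)):             # row i and column j stay fixed
        for l in range(len(A[0])):
            if k != i and l != j:
                B[k][l] = A[k][l] - eps * A[k][j] * A[i][l]
    return B

def gf2_pivot(A, i, j):
    """GF(2)-pivot: the same update read modulo 2 (pivot entry 1)."""
    return [[x % 2 for x in row] for row in r_pivot(A, i, j)]
```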
We are interested in performing the pivot operations on A both over the reals (R-pivoting) and over GF(2) (GF(2)-pivoting). Let B be a matrix obtained from A by performing a GF(2)-pivot or an R-pivot. We next show how to obtain G(B) from G(A).
Remark 16. Let B be the 0,1 matrix obtained from a 0,1 matrix A by GF(2)-pivoting on aij = 1. Then G(B) is obtained from G(A) as follows:
Remark 17. Let B̃ be the matrix obtained from a weakly balanced 0, ±1 matrix Ã by R-pivoting on a non-zero entry aij = ε. Then B̃ is a 0, ±1 matrix and G(B̃) is obtained from G(Ã) as follows:
Proof: 1) is trivial. By Remark 15, for k ≠ i and l ≠ j, bkl = akl − ε·akj·ail. Note that ε² = 1. So bkl is the value of the determinant of the 2 × 2 submatrix of Ã with rows i, k and columns j, l. Since Ã is weakly balanced, b̃kl has value in {0, ±1}. For 2), note that all 2 × 2 submatrices of Ã with all four entries non-zero have determinant 0. Finally, since the 2 × 2 submatrix of B̃ has determinant 0, part 3) follows. □
To prove the above result, we need the following three lemmas (the first is
well known):
Lemma 20. A 0,1 matrix A is regular if and only if any matrix obtained from A by GF(2)-pivoting is regular.
Proof: We show that G(B̃) can be obtained from G(B) by applying the signing algorithm. Let T be any tree in G(B̃) chosen to contain all edges in {ij} ∪ {ix : x ∈ N(i)} ∪ {jy : y ∈ N(j)}. Then T is also a tree of G(Ã). Let S = t1, ..., t|T|−1, e1, ..., el be a consistent ordering of the edges of G(B̃), where the ti are edges in T. We show that the signing of G(B̃) can be obtained by the signing algorithm with sequence S, where the edges of T are labeled as in G(B̃). Let ek be an edge of S and Cek be a chordless cycle of G(B) containing ek and edges in S \ {ek+1, ..., em}, such that Cek has the largest possible intersection with {i, j} and, subject to this, Cek is shortest. We show that Cek forces ek to be signed as in G(B̃).
Remark 17 shows that if Cek contains both nodes i and j and has length 4, then Cek forces ek to be signed as in G(B̃).
All other edges ek are labeled the same in G(B̃) and G(Ã). We show that Cek forces this signing of edge ek.
If Cek contains both nodes i and j and has length bigger than 4, then in G(Ã) the nodes of Cek induce a cycle with unique chord i1j1, where i1 and j1 are the neighbors of i and j in Cek. By Remark 17, the sum of the labels on the edges i1i, ij, jj1 in G(B̃) is equivalent modulo 4 to the label of edge i1j1 in G(Ã). Thus the cycle C′ek of G(Ã) induced by V(Cek) \ {i, j} and the cycle Cek of G(B̃) force ek to be signed the same.
If Cek contains one of {i, j}, say i, then by the choice of Cek, node j has i as unique neighbor in Cek. For, if not, ek either belongs to a chordless cycle of G(B̃) or to a chordless cycle that is shorter than Cek and contains node j (this happens when (Cek, j) is the rim of a wheel with center j and no hole of (Cek, j) contains i, j and ek), a contradiction to our assumption.
But then Cek is also a chordless cycle of G(Ã) and forces ek to be signed as in G(B̃).
If Cek contains neither i nor j and at most one neighbor of i or j, then Cek is also a chordless cycle of G(Ã) and forces ek to be signed as in G(B̃). Otherwise, by the choice of Cek, node i has a unique neighbor i′ in Cek, node j has a unique neighbor j′ in Cek, and i′, j′ are adjacent. So, by Remark 17, G(Ã) contains a hole C′ek whose node set is V(Cek) ∪ {i, j}. This hole C′ek and the hole Cek of G(B̃) force ek to be signed the same. □
Lemma 22. From every 0,1 matrix A that is not regular, we can obtain a 0,1 matrix that is not balanceable by a sequence of GF(2)-pivots.
Proof: Let A be the smallest 0,1 matrix (in terms of the sum of the number of rows and columns) that is not regular but cannot be pivoted to a matrix that is not balanceable. Since A is obviously balanceable, let Ã be a corresponding balanced 0, ±1 matrix. By minimality, we can assume that Ã is square and |det(Ã)| ≥ 2. By Remark 15, we can R-pivot on any nonzero entry of Ã to obtain a 0, ±1 matrix B̃ which contains a proper submatrix C̃ with the same determinant value as Ã. Since Ã is weakly balanced, by Remark 17, B̃, and hence C̃, is a 0, ±1 matrix. Let B be the 0,1 matrix with the same support as B̃ and C the submatrix of B corresponding to C̃. By Corollary 18, B is obtained from A with a GF(2)-pivot on the same element. Assume B̃ is balanced. Then C̃ would be a balanced matrix which is not TU. However, this implies that C is not regular (this was already known to Camion [1]): Indeed, C̃ is a signing of C which is balanced but not TU. So C̃ can be obtained by applying the signing algorithm on G(C), starting with a tree T of G(C). Assume C has a TU signing C̃′. Since C̃′ is also a balanced matrix, G(C̃′) can be obtained through the signing algorithm by signing T as in G(C̃′). So G(C̃) and G(C̃′) differ on some fundamental cuts of T. So C̃ can be transformed into C̃′ by multiplying by −1 the rows and columns corresponding to the nodes on one shore of this cut. However, this operation preserves the TU property.
So B̃ is not balanced and by Lemma 21, B is not balanceable. □
Proof of Theorem 19: By Lemma 20, regular matrices are closed under GF(2)-pivoting, and if A is a 0,1 matrix such that G(A) contains a wheel G(W) whose rim has length 6, then W (hence A) is obviously not regular.
For the sufficiency part, if A is a 0,1 matrix which is not regular, then by Lemma 22 we can obtain by GF(2)-pivots a 0,1 matrix B which is not balanceable. By Theorem 14, G(B) contains a 3PC(x, y) where x and y belong to distinct color classes, or a wheel with rim H and center v, where v has an odd number, greater than one, of neighbors in H.
If G(B) contains a 3PC(x, y), Remark 16 shows that we can GF(2)-pivot on B so that all of its paths have length three, and by doing a last GF(2)-pivot on an entry corresponding to an edge incident to x, we obtain a wheel whose rim has length 6.
5 Decomposition
The co-NP characterizations obtained in Theorems 8, 9 and 14 are used in [3],
[4], [8], [2] to obtain the decomposition results for graphs without even holes,
cap-free graphs and balanceable matrices. However the proofs of these theorems
are long and technical. We have seen how Theorem 1 can be used to decompose
universally signable graphs with K1 and K2 cutsets into holes and triangulated
graphs. Here we further illustrate in two easy cases the use of a co-NP character-
ization to obtain decomposition results and polynomial recognition algorithms.
Proof: Let G′ be the bipartite graph obtained from G(A) by replacing each edge with a path of length 3, and let A′ be the 0,1 matrix such that G′ = G(A′). Then there is a correspondence between the cycles of G(A) and the holes of G′. So A is signable to be RU if and only if G′ is α-balanceable for the vector α of all zeroes.
Proof: We prove the first statement. Let x and y be two attachments of a bridge B of a cycle C. If x and y belong to distinct color classes of G(A), then G(A) contains a heterogeneous W3PC(x, y), where P1, P2 are the two xy-subpaths of C and P3 is any xy-path in B. The proof of the second statement is similar. □
Proof: To prove the first statement, assume B1 and B2 are homogeneous bridges of C having attachments x1, y1 of B1 and x2, y2 of B2, appearing in the order x1, x2, y1, y2 when traversing C. If x1, y1 and x2, y2 are in distinct color classes of G(A), we have a heterogeneous W3PC(x1, x2), where P1 is the subpath of C connecting x1, x2 and not containing y1. P2 and P3 contain respectively an x1,y1-path in B1 and an x2,y2-path in B2. The proof of the second part is similar. □
Proof: By Lemma 24, x and y are the only two attachments of B. Let P1, P2 be the two subpaths of C connecting x and y. By Lemma 25, no bridge of C has an attachment in both P1 \ {x, y} and P2 \ {x, y}. So no two of B, P1 and P2 are in the same component of G(A) \ {x, y} and, since at least two of them are not edges, G(A) \ {x, y} contains at least two components. □
Theorem 27 ([18]). Let A be a 0,1 matrix that is signable to be RU and C any cycle of G(A) containing homogeneous bridges B1 and B2 with attachments in distinct color classes of G(A). Then C contains two edges whose removal disconnects G(A) and separates B1 and B2.
Proof: Assume that the attachments of B1 and B2 belong to the "red" and "blue" sides of the bipartition of G(A). Let P1 be the minimal subpath of C with the following property:
P1 contains all the attachments of B1 and no bridge of C with red attachments has all its attachments either in P1 or outside P1.
The subpath P2 is defined similarly, with respect to B2 and the bridges with blue attachments. By Lemma 25, P1 and P2 can be chosen to be nonoverlapping. Furthermore, by minimality of P1 and P2, the endnodes a1, b1 of P1 are red nodes and the endnodes a2, b2 of P2 are blue nodes. Let C = a1, P1, b1, Pb1b2, b2, P2, a2, Pa2a1, let b be any edge of Pb1b2 and a any edge of Pa2a1. By Lemma 25 and the construction of P1 and P2, P1 ∪ B1 and P2 ∪ B2 belong to distinct components of G \ {a, b}. □
Clearly, to test if A is signable to be RU, we can assume that G(A) is biconnected; otherwise we work on the biconnected components.
If G(A) is biconnected and contains no cycle with homogeneous bridges with attachments in distinct color classes of G(A), then A has two ones per row or per column. (This is easy from network flows.) In this case A is obviously RU: sign A so that each row or column contains a 1 and a −1 to obtain a network matrix (or its transpose). This fact and the above theorem yield in a straightforward way a polytime algorithm to test if a 0,1 matrix is signable to be RU. This algorithm, combined with the signing algorithm of Section 1, gives a procedure to test if a 0, ±1 matrix is RU.
In a similar manner, see [9], Theorem 26 and the signing algorithm give procedures to test if a 0,1 matrix is signable to be TO and to test if a 0, ±1 matrix is TO.
References
1. P. Camion. Caractérisation des matrices totalement unimodulaires. Cahiers Centre
Études Rech. Op., 5:181–190, 1963.
2. M. Conforti, G. Cornuéjols, A. Kapoor, and K. Vušković. Balanced 0, ±1 matrices, Parts I–II. Submitted for publication, 1994.
3. M. Conforti, G. Cornuéjols, A. Kapoor, and K. Vušković. Even-hole-free graphs,
Parts I–II. Preprints, Carnegie Mellon University, 1997.
The Stable Set Problem for Claw-Free Bidirected Graphs
Daishin Nakamura and Akihisa Tamura
1 Introduction
Let G = (V, E) be an undirected graph. A subset S of V is called a stable set if any two elements of S are nonadjacent. Given a weight vector w ∈ ℝ^V, a maximum weight stable set is a stable set S maximizing w(S) = Σ_{i∈S} wi. The problem of finding a maximum weight stable set is called the maximum weight stable set problem (MWSSP). It is well known that the problem can be formulated as the following integer programming problem:

  maximize Σ_{i∈V} wi·xi  subject to  xi + xj ≤ 1 for all (i, j) ∈ E,  x ∈ {0, 1}^V.
In this paper, we consider the problem generalized as follows: for a given finite set V and for given P, N, I ⊆ V × V,

  maximize Σ_{i∈V} wi·xi  subject to  xi + xj ≤ 1 for (i, j) ∈ P,  xi + xj ≥ 1 for (i, j) ∈ N,  xi ≤ xj for (i, j) ∈ I,  x ∈ {0, 1}^V;

these degree-two inequalities correspond to the three edge types of the bidirected graph introduced below. Here we call this problem the generalized stable set problem (GSSP). We note
that the GSSP is equivalent to the generalized set packing problem discussed in
[1,2]. To deal with the GSSP, a ‘bidirected’ graph is useful. A bidirected graph
G = (V, E) has a set of vertices V and a set of edges E, in which each edge
e ∈ E has two vertices i, j ∈ V as its endpoints and two associated signs (plus
or minus) at i and j. The edges are classified into three types: the (+, +)-edges
with two plus signs at their endpoints, the (−, −)-edges with two minus signs,
and the (+, −)-edges (and the (−, +)-edges) with one plus and one minus sign.
Given an instance of the GSSP, we obtain a bidirected graph by making (+, +)-
edges, (−, −)-edges and (+, −)-edges for vertex-pairs of P, N and I respectively.
Conversely, for a given bidirected graph with a weight vector on the vertices, by
associating a variable xi with each vertex, we may consider the GSSP. We call
a 0−1-vector satisfying the inequality system arising from a bidirected graph G
a solution of G. We also call a subset of vertices a solution of G if its incidence
vector is a solution of G. The GSSP is an optimization problem over the solutions
of a bidirected graph.
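Concretely, membership in the solution set is one inequality per edge. The following Python sketch is our own illustration, under an assumed representation in which each edge is stored as (u, v, su, sv) with signs ±1.

```python
# Sketch: check whether a 0-1 vector x is a solution of a bidirected
# graph. An edge (u, v, su, sv) contributes the degree-two inequality
#   su*x_u + sv*x_v <= 1 - (number of minus signs among su, sv),
# i.e. x_u + x_v <= 1 for (+,+), x_u + x_v >= 1 for (-,-), and
# x_u <= x_v for a (+,-)-edge with the minus sign at v.

def is_solution(edges, x):
    for u, v, su, sv in edges:
        if su * x[u] + sv * x[v] > 1 - (su < 0) - (sv < 0):
            return False
    return True

# (+,+)-edge ab, (-,-)-edge bc, (+,-)-edge ca with the minus sign at a:
edges = [("a", "b", 1, 1), ("b", "c", -1, -1), ("c", "a", 1, -1)]
print(is_solution(edges, {"a": 1, "b": 0, "c": 1}))   # True
```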
Since several distinct bidirected graphs may have the same set of solutions,
we deal with some kind of ‘standard’ bidirected graphs. A bidirected graph is
said to be transitive, if whenever there are edges e1 = (i, j) and e2 = (j, k) with
opposite signs at j, then there is also an edge e3 = (i, k) whose signs at i and k
agree with those of e1 and e2 . Obviously, any bidirected graph and its transitive
closure have the same solutions. A bidirected graph is said to be simple if it has
no loop and if it has at most one edge for each pair of distinct vertices. Johnson and Padberg [3] showed that any transitive bidirected graph can be reduced to a simple one without essentially changing the set of solutions, or determined to
have no solution. We note that a transitive bidirected graph has no solution if
and only if it has a vertex with both a (+, +)-loop and a (−, −)-loop. For any
bidirected graph, the associated simple and transitive bidirected graph can be
constructed in time polynomial in the number of vertices.
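A naive but transparent way to compute the transitive closure is to keep adding the implied edges until a fixed point is reached; a sketch under the same assumed edge representation as above (quadratic rounds, fine for illustration):

```python
# Sketch: naive transitive closure of a bidirected graph. Whenever
# e1 = (i, j) and e2 = (j, k) carry opposite signs at the common
# vertex j, add (i, k) with the sign of e1 at i and of e2 at k.
# Loops (i == k) may be created; they matter for detecting
# infeasibility, as noted above.

def transitive_closure(edges):
    E = set()
    for u, v, su, sv in edges:          # store both orientations
        E.add((u, v, su, sv))
        E.add((v, u, sv, su))
    changed = True
    while changed:
        changed = False
        for i, j, si, sj in list(E):
            for j2, k, sj2, sk in list(E):
                if j2 == j and sj2 == -sj and (i, k, si, sk) not in E:
                    E.add((i, k, si, sk))
                    E.add((k, i, sk, si))
                    changed = True
    return E
```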
Given a bidirected graph G, its underlying graph, denoted by Ḡ, is defined as the undirected graph obtained from G by changing all the edges to (+,+)-edges.
A bidirected graph is said to be claw-free if it is simple and transitive and if its
underlying graph is claw-free (i.e., does not contain a vertex-induced subgraph
which is isomorphic to the complete bipartite graph K1,3 ).
It is well known that the MWSSP is NP-hard for general undirected graphs
(and hence, the GSSP is also NP-hard). However, for several classes of undirected
graphs, the MWSSP is polynomially solvable. For example, Minty [4] proposed a
polynomial time algorithm for the MWSSP for claw-free undirected graphs. On
the other hand, there are several polynomial transformations from the GSSP to
the MWSSP (see [5,6]). Unfortunately, we cannot easily derive the polynomial
solvability of the GSSP for claw-free bidirected graphs by using these transfor-
mations, because these do not preserve claw-freeness. Our aim in this paper is
to verify that the GSSP for claw-free bidirected graphs is polynomially solvable.
In this section, we will give several definitions and discuss basic properties of solutions of bidirected graphs. Let G = (V, E) be a simple and transitive bidirected graph.
We say that a vertex is positive (or negative) if all edges incident to it have plus (or minus) signs at it, and that a vertex is mixed if it is neither positive nor negative. If a bidirected graph has no (−,−)-edge, it is said to be pure. We say that a bidirected graph is canonical if it is simple, transitive and pure and it has
no negative vertex. For any instance (G, w) of the GSSP, we can transform it into an equivalent one whose bidirected graph is canonical as follows. From the previous section, we can assume that G is simple and transitive. Johnson and Padberg [3] proved that G has at least one solution U ⊆ V. From Lemma 1, G:U has the solution U △ U = ∅, that is, G:U must be pure. Let W be the set of negative vertices of G:U. Then G:U:W has no negative vertex, and furthermore, it is pure because any edge (v, w) of G:U with w ∈ W must be a (+,−)-edge. Since this transformation is done in polynomial time, we assume that a given bidirected graph of the GSSP is canonical in the sequel.
For any solution X of a canonical bidirected graph G, we partition X into two parts:

  XB = {i ∈ X | NG^{−+}(i) ∩ X = ∅}  and  XI = {i ∈ X | NG^{−+}(i) ∩ X ≠ ∅},

where NG^{−+}(i) denotes the set of vertices adjacent to i by a (−,+)-edge incident to i with a minus sign; NG^{+−}(i) is defined analogously. Here we call XB a base of X. Let

  ex(XB) = XB ∪ {i ∈ V | i ∈ NG^{+−}(x) for some x ∈ XB}.
Proof. Suppose to the contrary that there exists a mixed vertex v such that NG^{−+}(v) contains two vertices u and w of distinct connected components Hi and Hj. Since X is a stable set, u, w ∈ YB. Let x be a vertex of X adjacent to v. From the transitivity, x must be adjacent to both u and w. This contradicts the fact that Hi and Hj are distinct connected components of G[XB △ YB]. □
Si^* = {X ∈ Si | w(X) = wi}.
Suppose that N denotes the smallest number j with wj = max_i wi. Minty [4] showed that if a given undirected graph is claw-free, then w0 < · · · < wN. More precisely, (0, w0), ..., (N, wN) lie on an increasing concave curve. Minty's algorithm for solving the MWSSP for claw-free undirected graphs finds an optimal solution by tracing the (i, wi) one by one. However, even if a given bidirected graph is claw-free, this fact does not hold, as shown by the example in Figure 1, where (+,+)-edges are drawn as lines and (+,−)-edges as arrows whose heads mean minus signs. Thus it seems to be difficult to trace the (i, wi) one by one for the GSSP.
Fig. 1. A claw-free bidirected graph on the vertices a, ..., i (with vertex weights a = 3, b = 5, c = 4, d = 2, e = 5, f = −4, g = 3, h = 4, i = 4) together with the corresponding values:

  i   wi   solution
  0    5   {e}
  1   10   {b, e}
  2   14   {b, e, h}
  3   13   {b, e, f, g, i}
  4   15   {a, c, e, f, g, i}
Fig. 2. The upper envelope of the convex hull of the pairs (i, wi); here (0, w0), (1, w1), (3, w3), (4, w4), (6, w6) and (N, wN) are the Pareto-optimal pairs lying on the envelope.
We will use a technique of fractional programming. Let us consider the upper envelope of the convex hull of the set of pairs (0, w0), (1, w1), ..., (N, wN), as in Figure 2. We call (i, wi) a Pareto-optimal pair if it lies on the envelope, and their solutions Pareto-optimal solutions. Obviously, (0, w0) and (N, wN) are always Pareto-optimal. In Figure 2, (0, w0), (1, w1), (3, w3), (4, w4), (6, w6) and (N, wN) are Pareto-optimal.
Let X^i be a Pareto-optimal solution with X^i ∈ Si. Suppose that F is a subset of all the solutions of G such that X^i ∈ F and F is defined independently of the weight vector w. Let us also consider the Pareto-optimal solutions for the restriction to F. Obviously, X^i is also Pareto-optimal in F. We consider the following two problems:

  [MAXδ]  max_{Y∈F} { δ(Y) = w(Y) − w(X^i) },  and
  [MAXρ]  max_{Y∈F} { ρ(Y) = δ(Y)/ν(Y) | δ(Y) > 0 },

where ν(Y) denotes the difference of the numbers of all the positive vertices of Y and X^i. We denote ρ(·) and δ(·) for a weight vector w̄ by ρ_w̄(·) and δ_w̄(·) explicitly. Suppose that X^i is not optimal in F. Let Y^1 be an optimal solution of MAXδ for w̄^0 = w. We set r = ρ_{w̄^0}(Y^1) and consider the new weight vector w̄^1 defined by

  w̄i^1 = w̄i^0 − r  if i is a positive vertex,
  w̄i^1 = w̄i^0      otherwise.                  (1)
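One reweighting step is then a single pass over the vertices; a minimal sketch (our names; δ(Y^1) and ν(Y^1) are assumed already computed):

```python
# Sketch: the weight update (1). positive is the set of positive
# vertices; r = rho(Y1) = delta(Y1) / nu(Y1).

def reweight(w_bar, positive, delta_y1, nu_y1):
    r = delta_y1 / nu_y1
    return {v: w - r if v in positive else w for v, w in w_bar.items()}
```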
in Figure 2.) In addition, if X^0 ∈ S0^* can be found in polynomial time, the GSSP for (G, w) can be solved in polynomial time. In fact, this initialization is not so difficult if we can apply the above technique to any vertex-induced subgraph of G, because it is sufficient to solve the GSSP for the bidirected graph obtained from the current one by deleting all the positive vertices, recursively.
Finally we introduce a tool in order to trace Pareto-optimal pairs. Let X^i be a Pareto-optimal solution with i < N. Without loss of generality, we assume that X^i and G satisfy the conditions of Lemma 6. We say that H ⊆ V is an alternating set for X^i if H is connected in G and if X^i △ H is a stable set of G. We define the weight δ(H) of an alternating set H with respect to w by δ(H) = w(ex(X^i △ H)) − w(X^i).
Lemma 8. Let (j, wj) be the next Pareto-optimal pair after (i, wi). Then, for any X^j ∈ Sj^*, there exists a connected component H of G[XB^i △ XB^j] such that ex(XB^i △ H) is a Pareto-optimal solution with more positive vertices than X^i.
only alternating cycles and free alternating paths in order to find a next Pareto-optimal solution. An alternating cycle or a free alternating path is called an augmenting cycle or an augmenting path, respectively, if it has a positive weight. For two distinct black vertices x and y, let W denote the set of all the bounded vertices adjacent to both x and y. If W is not empty, W is called a wing adjacent to x (and y). A black vertex is called regular if it is adjacent to three or more wings, irregular if it is adjacent to exactly two wings, and useless otherwise. An alternating cycle is said to be small if it has at most two regular vertices, and large otherwise. Here we call C1, ..., Ck a large augmenting cycle family if each Ci is a large augmenting cycle and each vertex in Ci is adjacent to no vertex in Cj for 1 ≤ i < j ≤ k. From Lemma 7, δ(C1 ∪ · · · ∪ Ck) = δ(C1) + · · · + δ(Ck) holds.
Our algorithm for finding a next Pareto-optimal solution is described using the technique discussed in the previous section; a schematic sketch in code follows the remarks below:
(0) w^0 ← w and i ← 0;
(1) Find a small augmenting cycle A^{i+1} of maximum weight for w^i if it exists, otherwise go to (2); construct the new weight w^{i+1} by applying the update (1), set i ← i + 1 and repeat (1);
(2) Find a large augmenting cycle family A^{i+1} of maximum weight for w^i if it exists, otherwise go to (3); construct the new weight w^{i+1} by applying the update (1), set i ← i + 1 and repeat (2);
(3) Find an augmenting path A^{i+1} of maximum weight for w^i if it exists, otherwise go to (4); construct the new weight w^{i+1} by applying the update (1), set i ← i + 1 and repeat (3);
(4) If i = 0 then X is optimal, otherwise ex(X △ A^i) is a next Pareto-optimal solution.
Note that in (2) there is no small augmenting cycle, since these are eliminated in (1), and that in (3) there is no augmenting cycle, since these are eliminated in (1) and (2). These facts are important in the following sense.
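Schematically, steps (0)–(4) amount to three phases of the same loop. In the sketch below every finder (find_small_cycle, find_large_family, find_augmenting_path) is a hypothetical stub standing for the matching-based constructions of the later sections, and ex_delta stands for the operation ex(X △ A).

```python
# Sketch of steps (0)-(4): repeatedly take a maximum weight augmenting
# object for the current weights, reweight by the update (1), and move
# to the next phase when none exists.

def next_pareto_optimal(X, w, finders, reweight, ex_delta):
    """finders: (find_small_cycle, find_large_family, find_augmenting_path),
    each mapping (X, weights) to a max-weight augmenting object or None."""
    w_i, i, last = dict(w), 0, None
    for find in finders:
        while True:
            A = find(X, w_i)
            if A is None:
                break                    # go to the next phase
            last, i = A, i + 1
            w_i = reweight(w_i, X, A)    # apply the update (1)
    if i == 0:
        return None                      # X is already optimal
    return ex_delta(X, last)             # ex(X delta A^i)
```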
Theorem 10. The GSSP for claw-free bidirected graphs is polynomially solv-
able.
In the rest of the section, we briefly explain a proof of Theorem 9. Our approach is an extension of Minty's algorithm for undirected claw-free graphs. This, however, does not seem a straightforward extension, because we must overcome several problems. A significant problem is how to deal with 'induced weights'. Let A be an alternating cycle or a free alternating path. Then its weight is expressed as

  δ_X(A) = w(A − X) − w(X ∩ A) + Σ {w(v) | v is mixed, NG^{−+}(v) ∩ (A − X) ≠ ∅}.

We call the Σ term the induced weight, which appears in the bidirected case but not in the undirected case.
We first consider cycles. Let x1 , . . . , xk with k ≥ 3 be distinct black vertices
and W1 , . . . , Wk , Wk+1 = W1 be wings such that xi is adjacent to Wi and Wi+1
for i = 1, . . . , k. Then (W1 , x1 , W2 , . . . , Wk , xk , W1 ) is called a cycle of wings. It
is easy to show the following:
Lemma 12. Let v be a mixed vertex such that NG^{−+}(v) has a bounded vertex but is not included in a wing. Then there uniquely exists a black vertex x such that [x = v or x is adjacent to v] and all the vertices in NG^{−+}(v) are adjacent to x.
Let v be a mixed vertex such that NG^{−+}(v) ∩ (W1 ∪ · · · ∪ Wk) ≠ ∅. From Lemma 12, there uniquely exists i ∈ {1, ..., k} such that NG^{−+}(v) ∩ Wi−1 = ∅ and NG^{−+}(v) ∩ Wi ≠ ∅. Moreover, from Lemma 12 again, for such i, NG^{−+}(v) ∩ ((W1 ∪ · · · ∪ Wk) − (Wi ∪ Wi+1)) = ∅. Hence the mapping conserves weights. A maximum weight directed cycle of red edges can be found in polynomial time by breadth first search. □
Lemma 14. A maximum weight small augmenting cycle can be found in poly-
nomial time.
• v1 ∼ v2 means that v1 and v2 are adjacent, and v1 ≁ v2 means v1 and v2 are not adjacent.
• v1 ∼^{+−} v2 says there is an edge having plus and minus signs at v1 and v2 respectively, and v1 ≁^{+−} v2 is its negation.
• v1 ∼^{+} v2 denotes either v1 ∼^{++} v2 or v1 ∼^{+−} v2, and v1 ≁^{+} v2 is the negation of v1 ∼^{+} v2.
• v1 ≈ v2 says that v1 and v2 are contained in the same wing, and v1 ≉ v2 is its negation.
Lemma 15 ([4]). Given a regular vertex x, let B(x) = {v | v ∼ x and v is bounded}. Then there exists a partition of B(x), namely [N^1(x), N^2(x)], such that for any v1, v2 ∈ B(x) with v1 ≉ v2,
This is the key lemma of Minty's algorithm. If a large alternating cycle or a free alternating path passes through v1 ∈ N^1(v) and a regular vertex v, then it must pass through a vertex v2 such that v2 ∈ N^2(v) and v2 ≉ v1. From this property
Minty showed that by constructing a graph called the “Edmonds’ graph” and by
finding a maximum weight perfect matching of it, a maximum weight augmenting
path for any Pareto-optimal stable set can be found in polynomial time. To
deal with induced weights, we require an additional property of the partition of
vertices adjacent to a regular vertex.
Lemma 16. For a regular vertex x and a vertex v such that v = x or v ∼ x, we define

  N^1(x) →_v N^2(x)  :⟺  ∃a ∈ N^1(x), ∃b ∈ N^2(x) such that a ≁ b, a ∼^{+−} v and b ≁^{+−} v,
  N^2(x) →_v N^1(x)  :⟺  ∃c ∈ N^2(x), ∃d ∈ N^1(x) such that c ≁ d, c ∼^{+−} v and d ≁^{+−} v.

Then at most one of N^1(x) →_v N^2(x) and N^2(x) →_v N^1(x) holds.
Proof. Let us consider the case v = x. If b ≁^{+−} x, then b ∼^{+} x because b ∼ x. In addition, if a ∼^{+−} x, then a ∼^{+} b. Hence neither N^1(x) →_x N^2(x) nor N^2(x) →_x N^1(x) holds.
Suppose to the contrary that v ∼ x, N^1(x) →_v N^2(x) and N^2(x) →_v N^1(x). There exist a, d ∈ N^1(x) and b, c ∈ N^2(x) such that a ≁ b, c ≁ d, a ∼^{+−} v, b ≁^{+−} v, c ∼^{+−} v and d ≁^{+−} v. Note that b, d and v are mutually distinct. Assume to the contrary that b ∼ v. Then b ∼^{+} v because b ≁^{+−} v. But a ∼^{+−} v and v ∼^{+} b induce a ∼^{+} b, contradicting a ≁ b. Hence b ≁ v and similarly d ≁ v. Now b ∼ d, since otherwise {x, b, d, v} induces a claw. Thus b ≈ d from Lemma 15.
Suppose that a ≈ c. Because x is regular, i.e., x is adjacent to at least three wings, there exists e ∈ N(x) such that e ≉ a ≈ c and e ≉ b ≈ d. Suppose that e ∈ N^1(x). Then e ≁ b and e ≁ c from Lemma 15. If e ≁^{+−} v, then replace d by e, and from the above discussion, b ≈ e, a contradiction. Hence e ∼^{+−} v, and we can replace a by e. Similarly, if e ∉ N^1(x), i.e., e ∈ N^2(x), then we can replace c by e. Henceforth we assume that a ≉ c.
Suppose to the contrary that a ≉ d. From Lemma 15, a ∼ d. Since a is bounded, a is adjacent to two black vertices: x and, say, y. Then d ≁ y, since otherwise d ∼ x and d ∼ y imply a ≈ d, a contradiction. Now v ∼ y, since otherwise {a, d, v, y} induces a claw. Note that v ∼^{+} y, since otherwise v ∼^{−+} y, contradicting the fact that y is black and v is white. Thus c ∼^{+−} v and v ∼^{+} y induce c ∼^{+} y. However, c ∼ x and c ∼ y imply a ≈ c, a contradiction. Hence a ≈ d and similarly c ≈ b. Since a ≈ d, d ≈ b and b ≈ c, a ≈ c holds. However, this contradicts the assumption a ≉ c. □
We add the induced weight of an alternating cycle or a free alternating path to the weights of appropriate vertices in it. We define w̃ : (V ∪ (V × V)) → ℝ by the following procedure: let w̃ ← 0 and for each mixed vertex v,
• if B^{−+}(v) = {u | u is bounded, v ∼^{−+} u} is empty or included in a wing, w̃(u) ← w̃(u) + w(v) for each u ∈ B^{−+}(v),
If there is no small augmenting cycle, by using Lemma 17, we can construct the
Edmonds’ graph Ĝ such that
1. each edge of Ĝ is colored black or white, and it has a weight ŵ,
2. all the black edges form a perfect matching M of Ĝ,
3. if M is a maximum weight perfect matching of Ĝ then there is no large
augmenting cycle family in G and
4. if ŵ(M ) < ŵ(M ∗ ) for a maximum weight perfect matching M ∗ of Ĝ, let
Ĉ1 , . . . , Ĉk be all the augmenting cycles in M ∗ 4 M ; then Ĉ1 , . . . , Ĉk cor-
respond to a maximum weight large augmenting cycle family C1 , . . . , Ck in
G.
In the next section, we show that the Edmonds’ graph can be constructed in
polynomial time. Hence the step (2) in our algorithm can be done in polynomial
time. Analogously, if there is no augmenting cycle, for any pair of vertices a and
b, we can find a maximum weight augmenting path whose endpoints are a and
b, if it exists, by constructing the Edmonds’ graph and by finding a maximum
weight perfect matching in it. Now we can find a maximum weight augmenting
path by trying all the pairs of vertices a and b.
Lemma 20. If N^1(xj) ⊆ W(xi, xj), then any large alternating cycle in G passes through neither P12 nor P22. That is, we can delete the edges (x_i^1, x_j^2) and (x_i^2, x_j^2) from G_Ed. Similarly, if N^2(xj) ⊆ W(xi, xj), we can delete (x_i^1, x_j^1) and (x_i^2, x_j^1). If N^1(xi) ⊆ W(xi, xj), we can delete (x_i^2, x_j^1) and (x_i^2, x_j^2). If N^2(xi) ⊆ W(xi, xj), we can delete (x_i^1, x_j^1) and (x_i^1, x_j^2).
Proof. Suppose that a large alternating cycle C passes through xi, P12 (or P22) and xj. Before xj, it passes through a vertex in N^2(xj). Hence after xj, it must pass through a vertex v ∈ N^1(xj) ⊆ W(xi, xj). Hence C contains exactly two regular vertices xi and xj, contradicting the fact that C is large. □
In the sequel, we suppose that none of N^1(xi), N^2(xi), N^1(xj) and N^2(xj) is contained in W(xi, xj).
Lemma 21. There exists k such that y_k^1 = y_k^2 and 2 ≤ k ≤ ℓ − 1, or there exists k such that y_k^1 ∼ y_k^2 and 1 ≤ k ≤ ℓ.
Proof. Suppose that this lemma does not hold, i.e., y_k^1 ≠ y_k^2 and y_k^1 ≁ y_k^2 for all k = 1, ..., ℓ (note that y_1^1 ≠ y_1^2 and y_ℓ^1 ≠ y_ℓ^2 since B^1(xi) ∩ B^2(xi) = B^1(xj) ∩ B^2(xj) = ∅). Let z_0 = xi, z_ℓ = xj and C_k = (y_k^1, z_k, y_k^2, z_{k−1}, y_k^1) (k = 1, ..., ℓ). Then C_k is a small alternating cycle for all k = 1, ..., ℓ. We can show that Σ_{k=1}^{ℓ} δ_X(C_k) = δ̃_X(P11) + δ̃_X(P22) − w(xi) − w(xj) (> 0). (The proof is slightly complicated because we must take the induced weight w̃ into account.) Hence at least one C_k is a small augmenting cycle, a contradiction. □
Now we can show the next two lemmas, but the proofs are omitted.
Lemma 22. If ℓ = 1, any large alternating cycle passes through neither P11 nor P22. Hence we can delete the edges (x_i^1, x_j^1) and (x_i^2, x_j^2) from G_Ed.
and let P′12 = (P11^i, z_k, P22^j) and P′21 = (P22^i, z_k, P11^j). Then δ̃_X(P′12) + δ̃_X(P′21) = δ̃_X(P11) + δ̃_X(P22); P′12 is an IWAP between B^1(xi) and B^2(xj), and P′21 is an IWAP between B^2(xi) and B^1(xj).
1. Delete the four edges (x_i^1, x_j^1), (x_i^2, x_j^2), (x_i^1, x_j^2) and (x_i^2, x_j^1) (Lemma 23 guarantees the existence of these four edges),
2. Add two new vertices z_k^i and z_k^j, join z_k^i and z_k^j by a black edge and assign its weight ŵ((z_k^i, z_k^j)) to be 0, where k satisfies the conditions of Lemma 23,
3. Add four white edges (x_i^1, z_k^i), (x_i^2, z_k^i), (x_j^1, z_k^j) and (x_j^2, z_k^j), and assign their weights to be ŵ((x_i^1, z_k^i)) = δ̃_X(P11), ŵ((x_i^2, z_k^i)) = δ̃_X(P22), ŵ((x_j^1, z_k^j)) = 0 and ŵ((x_j^2, z_k^j)) = δ̃_X(P′12) − δ̃_X(P11) (= δ̃_X(P22) − δ̃_X(P′21)).
All large alternating cycles through the black edges (x_i^1, x_i^2) and (x_j^1, x_j^2) are preserved by our revision, because (x_i^p, x_j^q) in the original Edmonds' graph (p, q ∈ {1, 2}) is interpreted as the path (x_i^p, z_k^i, z_k^j, x_j^q) in the revised Edmonds' graph. Furthermore, Lemma 23 guarantees that the weights of these four edges are equal to those of the corresponding four paths, respectively.
Lemma 24. A maximum weight large alternating cycle family can be found in
polynomial time if there is no small augmenting cycle.
Proof. Make the Edmonds' graph. Then eliminate all the augmenting cycles of the form (x_i^1, x_j^1, x_j^2, x_i^2, x_i^1) or (x_i^1, x_j^2, x_j^1, x_i^2, x_i^1). Let G′_Ed be the modified graph and M′ be the set of its black edges. Note that M′ is perfect. Let M* be a maximum weight perfect matching and Ĉ1, ..., Ĉk be all the augmenting cycles in M′ △ M* (k may be zero). Note that (∪_{i=1}^{k} Ĉi) is a maximum weight alternating cycle family of G′_Ed. Then each Ĉi has length at least 6, because we eliminated all augmenting cycles of length 4, and hence Ĉi corresponds to a large augmenting cycle Ci of X such that δ_X(Ci) = δ_{M′}(Ĉi). Moreover, C1, ..., Ck are disjoint because Ĉ1, ..., Ĉk are vertex-disjoint. Now from the construction and modification of the Edmonds' graph, we can conclude that (∪_{i=1}^{k} Ci) is a maximum weight large alternating cycle family of X. □
References
1. E. Boros and O. Čepek. On perfect 0, ±1 matrices. Discrete Math., 165/166:81–100, 1997.
2. M. Conforti, G. Cornuéjols, and C. De Francesco. Perfect 0, ±1 matrices. Linear
Algebra Appl., 253:299–309, 1997.
3. E. L. Johnson and M. W. Padberg. Degree-two inequalities, clique facets, and
biperfect graphs. Ann. Discrete Math., 16:169–187, 1982.
4. G. J. Minty. On maximal independent sets of vertices in claw-free graphs. J.
Combin. Theory Ser. B, 28:284–304, 1980.
5. E. C. Sewell. Binary integer programs with two variables per inequality. Math.
Programming, 75:467–476, 1996.
6. A. Tamura. The generalized stable set problem for perfect bidirected graphs. J.
Oper. Res. Soc. Japan, 40:401–414, 1997.
On a Min-max Theorem of Cacti*
Zoltán Szigeti
* This work was done while the author visited Laboratoire LEIBNIZ, Institut IMAG, Grenoble.
1 Introduction
The graph matching problem and the matroid intersection problem are two well-
solved problems in Combinatorial Theory in the sense of min-max theorems and
polynomial algorithms for finding an optimal solution. The matroid parity prob-
lem, a common generalization of them, turned out to be much more difficult. For
the general problem there exists no polynomial algorithm [2], [3]; moreover, it contains NP-hard problems. On the other hand, for linear matroids Lovász [3] provided a min-max formula and a polynomial algorithm. There are several earlier results which can be derived from Lovász' theorem, e.g. Tutte's result on f-factors [9], a result of Mader on openly disjoint A-paths [5], and a result of Nebesky concerning the maximum genus of graphs [6]. Another application, which can be found in the book of Lovász and Plummer [4], is the problem of cacti. It is mentioned there that "a direct proof would be desirable." Our aim is to fill in this gap, that is, to provide a simpler proof for this problem. We remark here that we shall apply the matroid intersection theorem twice. We refer the reader
to [7] for basic concepts of matroids.
A graph K is called a cactus if each block (maximal 2-connected subgraph) of K is a triangle (a cycle of length three). The size of a cactus K is the number of its blocks. Lovász derived a min-max theorem for the maximum size of a
cactus contained in a given graph G from his general min-max theorem on linear
matroid parity problem. Here we shall give a simple proof for this result on cacti.
The proof follows the line of Gallai’s (independently Anderson’s [1]) proof for
Tutte’s theorem on the existence of perfect matchings.
In fact, we shall solve the graphic matroid parity problem in the special case
when for each pair the two edges have exactly one vertex in common. The graphic
matroid parity problem is the following: given a graph G and a partition of its edge set into pairs, what is the maximum size of a forest which consists of pairs; in other words, what is the maximum number of pairs whose union is a forest?
A pair of edges is called a v-pair if these two edges have exactly one vertex in common and they are not loops. If G is an arbitrary graph and V is a partition of the edge set of G into v-pairs, then (G, V) is called a v-graph. From now on, a cactus of (G, V) is a forest of G consisting of v-pairs in V. The size of a cactus is the number of v-pairs contained in it. The v-graphic matroid parity problem consists of finding the maximum size β(G, V) of a cactus in a v-graph (G, V).
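Whether a given set of v-pairs forms a cactus is just a forest test on the union of their edges; a small union-find sketch (our own representation) follows.

```python
# Sketch: a set of v-pairs is a cactus iff the union of their edges
# is a forest. Each v-pair is a pair of edges ((a, b), (b, c))
# sharing exactly one vertex.

def is_cactus(v_pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for pair in v_pairs:
        for u, v in pair:
            ru, rv = find(u), find(v)
            if ru == rv:
                return False                # this edge closes a cycle
            parent[ru] = rv
    return True

# A cactus of size 2:
print(is_cactus([(("a", "b"), ("b", "c")), (("c", "d"), ("d", "e"))]))
```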
The original cactus problem can be formulated as a v-graphic matroid parity problem as follows; a small constructive sketch is given below. Let (G′, V) be the following v-graph. The vertex set of G′ is the same as that of G. We define the edge set of G′ and the partition V of the edge set into v-pairs as follows. For each triangle T of G we introduce a v-pair in (G′, V): choose any two edges of T, add them to the edge set of G′ and add this v-pair to V. (G′ will contain lots of parallel edges. In fact, G′ is obtained from G by multiplying edges.) Obviously, there is a one-to-one correspondence between the cacti of G and the forests of G′ that are unions of v-pairs. Thus the problem is indeed a v-graphic matroid parity problem.
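The reduction just described is mechanical; a sketch (triangles of G assumed to be listed already):

```python
# Sketch: build the v-pair list of (G', V) from the triangles of G.
# For each triangle we take any two of its edges as a v-pair; G' keeps
# the vertex set of G and may acquire parallel edges.

def vgraph_from_triangles(triangles):
    """triangles: iterable of vertex triples (x, y, z)."""
    return [((x, y), (y, z)) for (x, y, z) in triangles]
```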
To state the theorem on cacti we need some definitions. Let (G, V) be a v-graph. Let P := {V1, V2, ..., Vl} be a partition of the vertex set V(G). Let VP ⊆ V (SP ⊆ V) be the set of those v-pairs whose end vertices belong to three (two) different members of P. Let Q := {H1, H2, ..., Hk} be a partition of VP ∪ SP. Let us denote by p(Hi) the number of Vj's for which there exists at least one v-pair in Hi with a vertex in Vj. We say that (P, Q) is a cover of (G, V). The value val(P, Q) of a cover is defined as follows:

  val(P, Q) := n − l + Σ_{Hi∈Q} ⌊(p(Hi) − 1)/2⌋.
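Computationally, the value of a cover needs only n, l and the numbers p(Hi); a one-line sketch:

```python
# Sketch: val(P, Q) = n - l + sum over H_i of floor((p(H_i) - 1) / 2).

def cover_value(n, l, p_values):
    return n - l + sum((p - 1) // 2 for p in p_values)
```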
Let (P, Q) be a cover of a v-graph (G, V). The elements Hi ∈ Q are called components of the cover. G/P will denote the graph obtained from G by contracting each set Vi in P into one vertex. We identify the edge sets of G and G/P. (GP, VP) is the v-graph where VP is defined as above: it contains those v-pairs of V which remain v-pairs after the contraction; the vertex set of GP is the same as that of G/P and the edge set of GP is the set of edges of the v-pairs in VP, that is, GP is obtained from G/P by deleting the edges which do not belong to any v-pair in VP. For Hi ∈ Q, (GP[Hi], Hi) will denote the v-graph for which the edge set of GP[Hi] is the set of edges of the v-pairs in Hi and the vertex set of GP[Hi] contains those vertices of GP with which at least one v-pair of Hi is incident. Then p(Hi) is the number of vertices of GP[Hi]. (Note that if Hi ∈ SP, then (GP[Hi], Hi) contains two edges which are parallel or one of them is a loop, that is, it is not really a v-graph. However, we shall need this type of "v-graphs" in the proof.) If F is a subset of edges of a v-graph (G, V) then the number of v-pairs of V contained in F is denoted by vV(F).
For a graph G on n vertices and with c connected components, a forest of G containing n − c edges is called spanning. For a connected graph G, a forest F of G containing n − 2 edges (that is, F has exactly two connected components) is called almost spanning. A v-graph will be called (cactus-)critical if the v-graph obtained by identifying any two vertices has a perfect cactus. In particular, this means that in a critical v-graph there exists a cactus which is almost perfect, that is, it is an almost spanning tree consisting of v-pairs. Critical v-graphs will play an important role in the proof, just as factor-critical graphs play the key role in the proof of Tutte's theorem. A component Hi ∈ Q is said to be critical in (G, V) if the v-graph (GP[Hi], Hi) is critical. If Hi ∈ SP, then (GP[Hi], Hi) is considered to be critical.
We say that the partition P of V is the trivial partition if l := |P| = n := |V|. The cover (P, Q) is the trivial cover if l = n and k := |Q| = 1. Let P′ = {V11, ..., V1r1, V21, ..., V2r2, ..., Vl1, ..., Vlrl}, where ∪j Vij = Vi for all i; then the partition P′ is called a refinement of the partition P. If P′ is a refinement of P with |P′| = |P| + 1, then we say it is an elementary refinement. If Vi ∈ P then the partition obtained from P by replacing Vi by its singletons will be denoted by P ÷ {Vi}. If P′ is a refinement of P, then we shall use p′(Hi) instead of p(Hi).
We shall need later two auxiliary graphs B and D. These graphs will depend on a v-graph (G, V) and a cover (P, Q) of this v-graph. We suppose that for each component Hi, p(Hi) is even. First we define the graph B = (V(G), E(B)). e = uv will be an edge of B if and only if there exist u, v ∈ Vj ∈ P, Hi ∈ Q and a cactus K in (G_{P÷{Vj}}[Hi], Hi) consisting of p(Hi)/2 v-pairs so that exactly the two vertices u and v of Vj are connected in K, not necessarily by an edge but by a path in K, that is, u and v are in the same connected component of K. (Note that K contains a cactus of size (p(Hi) − 2)/2 in (GP[Hi], Hi). We mention that (by Lemma 2, see later) (GP[Hi], Hi) will always contain a cactus consisting of (p(Hi) − 2)/2 v-pairs of V.) We call this edge e an augmenting edge for Hi. In other words, the trace of the cactus K in P is the edge e. We will call the edges of B augmenting edges. Note that an edge of B may be augmenting for more
The existence of a forest with (a) and (b) can be proved, using (ii), by the matroid partition theorem (for a graphic matroid and a truncated partitional matroid). We shall see in Lemma 5 that if for all such forests we consider the components where the corresponding forest is a spanning tree, then we get the set of bases of a matroid on the set of indices of the components.
Two matroids will be defined on the edge set of the auxiliary graph D, one
of them will be defined by the above introduced matroid, and the other one will
be defined by the cycle matroid of B. The matroid intersection theorem will
provide a forest of G with (a), (b) and (c). As we mentioned earlier, each part of
the forest, which corresponds to a component, can be replaced by a convenient
cactus, and thus the desired cactus has been found.
2 The Proof
Proof. (max ≤ min) Let F be an arbitrary cactus in (G, V) and let (P, Q) be any cover of (G, V). Contract each Vi ∈ P, i = 1, 2, ..., l, into a vertex and let F′ be a subset of F of maximum size so that F′ is a forest in the contracted graph G/P. For the numbers c (c′) of connected components of F in G (of F′ in G/P) we obviously have c′ ≤ c. Thus |F| = n − c ≤ n − c′ = (l − c′) + (n − l) = |F′| + (n − l). It follows that vV(F) ≤ vV(F′) + n − l. Let F″ be the maximum subforest of F′ in GP consisting of v-pairs in V. Obviously, F″ forms a cactus in each (GP[Hi], Hi), Hi ∈ Q. By definition, ∪_{Hi∈Q} Hi covers all the v-pairs contained in F″. Thus vV(F′) = vV(F″) = vVP(F″) = Σ_{Hi∈Q} vHi(F″) ≤ Σ_{Hi∈Q} ⌊(p(Hi) − 1)/2⌋, and the desired inequality follows. □
Lemma 1. For each Hi ∈ Q, the unique minimum cover of (GP [Hi ], Hi ) is the
trivial one.
  = n − l* + Σ_{Hj∈Q−{Hi}} ⌊(p(Hj) − 1)/2⌋ + (val(P′, Q′) − (p(Hi) − l′))
  ≤ n − (l − p(Hi) + l′) + Σ_{Hj∈Q−{Hi}} ⌊(p(Hj) − 1)/2⌋ + ⌊(p(Hi) − 1)/2⌋ − p(Hi) + l′ = val(P, Q).

It follows that equality holds everywhere, so val(P′, Q′) = ⌊(p(Hi) − 1)/2⌋; thus the trivial cover of (GP[Hi], Hi) is minimal. Furthermore, by the minimality of l, P′ is the trivial partition of V(GP[Hi]) and by the maximality of k, Q′ may contain only one set, and we are done. □
Proof. Suppose that there exists a component Hi ∈ Q for which (GP[Hi], Hi) is not critical, that is, there are two vertices a and b in GP[Hi] so that after identifying a and b the new v-graph (G′, V′) has no perfect cactus. By the induction hypothesis, it follows that there is a cover (P′, Q′) of (G′, V′) so that in G′, val(P′, Q′) < ((p(Hi) − 1) − 1)/2 ≤ ⌊(p(Hi) − 1)/2⌋. This cover can be considered as a cover (P″, Q″) of (GP[Hi], Hi) with val(P″, Q″) = val(P′, Q′) + 1. Thus val(P″, Q″) ≤ ⌊(p(Hi) − 1)/2⌋, that is, (P″, Q″) is a minimal cover of (GP[Hi], Hi) but not the trivial one (a and b are in the same member of P″), which contradicts Lemma 1. □
Remark 2. By Corollary 1, for any component Hi the v-graph (GP[Hi], Hi) (and consequently (G, V)) contains a cactus containing ⌊(p(Hi) − 1)/2⌋ v-pairs. However, at this moment we cannot see whether we can choose a cactus containing ⌊(p(Hi) − 1)/2⌋ v-pairs for all Hi so that their union is a cactus as well. Note that by Corollary 1, p(Hi) is even for each component Hi ∈ Q, that is, ⌊(p(Hi) − 1)/2⌋ = (p(Hi) − 2)/2.
Proof. Let (uv, vw) be one of the v-pairs in V. Let us consider the following cover (P′, Q′) of (G, V). Each set of P′ contains exactly one vertex of G except one, which contains u and v, and Q′ contains exactly one set H (containing all v-pairs in V). Then, clearly, this is a minimum cover. By the assumptions for l and k this cover also minimizes l and maximizes k; thus by Lemma 2, its unique component H is critical, that is, the v-graph (GP′, VP′) is critical. Let F be a perfect cactus of the v-graph obtained from (GP′, VP′) by identifying v and w. Obviously, F ∪ uv ∪ vw is a perfect cactus of (G, V) and we are done. □
Proof. Hi ∉ ΓD(AP′) implies that there exists no augmenting edge for Hi with respect to P′, that is, (GP′[Hi], Hi) has no perfect cactus. By Proposition 1, we can use the induction hypothesis (of Theorem 1), that is, there exists a cover (P″, Q″) of (GP′[Hi], Hi) so that val(P″, Q″) ≤ ((p(Hi) + 1) − 1)/2 − 1 = (p(Hi) − 2)/2. This cover gives a cover (P*, Q*) of (GP[Hi], Hi) with val(P*, Q*) ≤ (p(Hi) − 2)/2. So (P*, Q*) is a minimum cover of (GP[Hi], Hi) and by Lemma 1, it is the trivial cover. Moreover, v1 and v2 are in different sets of P″ (otherwise val(P*, Q*) < (p(Hi) − 2)/2, a contradiction), hence (P*, Q*) is the trivial cover of (GP′[Hi], Hi) and its value is (p(Hi) − 2)/2. It follows that v1 or v2 is not a vertex in GP′[Hi] and the proposition is proved. □
Let P′ = {V11, ..., V1r1, V21, ..., V2r2, ..., Vl1, ..., Vlrl}, where ∪j Vij = Vi for all i, be a refinement of P. It is enough to prove that for all i with ri ≥ 2 there exists an elementary refinement P* of P with Vi = Vij ∪ (Vi − Vij) for some 1 ≤ j ≤ ri so that in GP*[Hi] the vertex corresponding to Vi − Vij is isolated. Applying Proposition 2 at most ri times, we see that such an elementary refinement indeed exists. □
Corollary 2. The vertex sets of the connected components of the graph B (de-
fined by the augmenting edges) are exactly the sets in P.
|Q*| − (l′ − l). Let us consider the following cover (P′, Q³) of (G, V), where Q³ is obtained from the above defined Q″ by adding each element in SP′ − SP as a component and adding the v-pairs in VP′ − VP to appropriate members of Q″. If L ∈ VP′ − VP, then L corresponds in W to a vertex or an edge, and in the latter case L ∈ Q¹, so L corresponds to an edge of a connected component Kj of K*. We add L to the member H″j of Q″ corresponding to Kj. (If Kj is an isolated vertex, then the corresponding H″j of Q″ was empty earlier.) Now, (P′, Q³) is a cover of (G, V) indeed. We shall denote the members of Q³ − Q by H³j, 1 ≤ j ≤ c. Clearly, Σ_{j=1}^{c} p′(H³j) ≤ l′. By Lemma 3, the value of the new cover is the following:

  val(P′, Q³) = n − l′ + Σ_{j=1}^{c} ⌊(p′(H³j) − 1)/2⌋ + Σ_{Hi∈Q−Q*} (p(Hi) − 2)/2
              ≤ n − l′ + (l′ − c)/2 + Σ_{Hi∈Q−Q*} (p(Hi) − 2)/2.
By Lemma 4, for the trivial partition P′ of V(G), the following fact is immediate.
there exists an almost perfect cactus K in (GP[Hi], Hi) so that a and b belong to different components of K. Then, by Proposition 3, F − (E(F) ∩ E(Fi)) ∪ E(K) is a forest. We can do this for all components, so the v-graph (G, V) contains a cactus containing Σ_{Hi∈Q} ⌊(p(Hi) − 1)/2⌋ v-pairs.
Thus e ∈ E(Fi) so that Hi ∈ Q′, and then obviously Q″ ∪ {Hi} ∈ M and we are done. □
We shall apply the matroid intersection theorem for the following two ma-
troids on the edge set of the graph D. For a set Z ⊆ E(D), let us denote the
end vertices of Z in the colour class E(B) (Q) by Z1 (Z2 , respectively). The
rank of Z in the first matroid will be rB (Z1 ) and rM (Z2 ) in the second matroid,
where rB is the rank function of the cycle matroid of the graph B and rM is the
rank function of the above defined matroid M. Note that if a vertex x of D is
in the colour class E(B) (Q) then the edges incident to x correspond to parallel
elements of the first (second) matroid.
Proof. By the matroid intersection theorem (see for example [7]) we have to
prove that, for any set Z ⊆ E(D), (+) n − l ≤ rB(E(D) − Z) + rM(Z).
Suppose that there is a set Z violating (+). Clearly, we may assume that
E(D)− Z is closed in the first matroid. This implies that there is a set J ⊆ E(B)
so that E(D) − Z is the set of all edges of D incident to J and J is closed in
the cycle matroid of B. Let us denote by V1′, V2′, ..., V_{l′}′ the vertex sets of the
connected components of the graph on vertex set V(B) with edge set J. Then,
by the closedness of J, E(B) − J is the set of augmenting edges of the refinement
P′ := {V1′, V2′, ..., V_{l′}′} of P, that is, AP′ = E(B) − J. (Obviously, Z is the set
of all edges incident to E(B) − J in D.) Then rM(Z) = rM(ΓD(AP′)) and
rB(E(D) − Z) = rB(J) = n − l′. By (1), l′ − l ≤ rM(ΓD(AP′)), and thus
n − l = (l′ − l) + (n − l′) ≤ rM(Z) + rB(E(D) − Z), contradicting the fact that
Z violates (+). ⊓⊔
Remark 5. While I was writing the final version of this paper I realized that the
same proof (after the natural changes) works for the general graphic matroid
parity problem. The details will be given in a forthcoming paper [8].
References
1. I. Anderson. Perfect matchings of a graph. Journal of Combinatorial Theory, Series
B, 10:183–186, 1971.
2. P. Jensen and B. Korte. Complexity of matroid property algorithms. SIAM J.
Comput., 11:184–190, 1982.
3. L. Lovász. Matroid matching problem. In Algebraic Methods in Graph Theory.
Colloquia Mathematica Societatis J. Bolyai 25, Szeged, 1978.
4. L. Lovász and M. D. Plummer. Matching Theory. North Holland, Amsterdam,
1986.
5. W. Mader. Über die Maximalzahl kreuzungsfreier H-Wege. Archiv der Mathematik,
31, 1978.
6. L. Nebesky. A new characterization of the maximum genus of a graph. Czechoslovak
Mathematical Journal, 31, 1981.
7. A. Recski. Matroid Theory and its Applications in Electric Network Theory and
in Statics. Akadémiai Kiadó, Budapest, 1989.
8. Z. Szigeti. On the graphic matroid parity problem. In preparation.
9. W. T. Tutte. Graph factors. Combinatorica, 1:79–97, 1981.
Edge-Splitting and Edge-Connectivity
Augmentation in Planar Graphs
1 Introduction
Let G = (V, E) stand for an undirected multigraph, where an edge with end
vertices u and v is denoted by (u, v). For a subset S ⊆ V in G (a singleton set {x}
may simply be written as x, and “⊂” denotes proper inclusion while “⊆” means
“⊂” or “=”), G[S] denotes
the subgraph induced by S. For two disjoint subsets X, Y ⊂ V , we denote by
EG (X, Y ) the set of edges (u, v) with u ∈ X and v ∈ Y , and by cG (X, Y )
the number of edges in EG (X, Y ). The set of edges EG (u, v) may alternatively
be represented by a single link (u, v) with multiplicity cG (u, v). In this way,
we also represent a multigraph G = (V, E) by an edge-weighted simple graph
N = (V, LG , cG ) (called a network) with a set V of vertices and a set LG of links
weighted by cG : LG → Z+, where Z+ is the set of non-negative integers. We
denote |V| by n, |E| by e, and |LG| by m. A cut is defined as a subset X of V
with ∅ ≠ X ≠ V, and the size of the cut X is defined by cG(X, V − X), which
may also be written as cG (X). If X = {x}, cG (x) denotes the degree of vertex x.
For a subset X ⊆ V, define its inner-connectivity by λG(X) = min{cG(X′) | ∅ ≠
X′ ⊂ X}. In particular, λG(V) (i.e., the size of a minimum cut in G) is called
the edge-connectivity of G. For a vertex v ∈ V , a vertex u adjacent to v is called
a neighbor of v in G. Let ΓG (v) denote the set of neighbors of v in G.
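These definitions translate directly into code; the following minimal Python sketch (ours, not from the paper) assumes the multigraph is stored as nested dictionaries adj[u][v] = cG(u, v). The subset enumeration in inner_connectivity is exponential and merely restates the definition (|X| ≥ 2 assumed):

    from itertools import chain, combinations

    def cut_size(adj, X):
        # c_G(X): total multiplicity of the edges with exactly one endpoint in X.
        X = set(X)
        return sum(c for u in X for v, c in adj[u].items() if v not in X)

    def inner_connectivity(adj, X):
        # lambda_G(X) = min{ c_G(X') : X' a nonempty proper subset of X }.
        X = list(X)
        proper = chain.from_iterable(combinations(X, r) for r in range(1, len(X)))
        return min(cut_size(adj, S) for S in proper)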
Let s ∈ V be a designated vertex in V. A cut X is called s-proper if ∅ ≠
X ⊂ V − s. The size λG(V − s) of a minimum s-proper cut is called the s-
based-connectivity of G. Hence λG (V ) = min{λG (V − s), cG (s)}. A splitting at
s is (k, s)-feasible if λG′(V − s) ≥ k holds for the resulting graph G′. Lovász [6]
showed the following important property: if cG(s) is even and λG(V − s) ≥ k ≥ 2,
then G admits a complete (k, s)-feasible splitting.
Since a complete (k, s)-feasible splitting effectively reduces the number of ver-
tices in a graph while preserving its s-based-connectivity, it plays an important
role in solving many graph connectivity problems (e.g., see [1,2,9]).
In this paper, we prove an extension of Lovász’s edge-splitting theorem, aim-
ing to solve the edge-connectivity augmentation problem with an additional con-
straint that preserves the planarity of a given planar graph. Firstly, we consider
the following type of splitting; for a multigraph G = (V, E) with a designated
vertex s, let ΓG (s) = {w0 , w1 , . . . , wp−1 } (p = |ΓG (s)|) be the set of neighbors of s, and
assume that a cyclic order π = (w0 , w1 , . . . , wp−1 ) of ΓG (s) is given. We say
that two edges e1 = (wh , wi ) and e2 = (wj , wℓ ) are crossing (with respect to π)
if e1 and e2 are not adjacent and the four end vertices appear in the order of
wh , wj , wi , wℓ along π (i.e., h + a = j + b = i + c = ℓ (mod p) holds for some
1 ≤ c < b < a ≤ p − 1).
if no two split edges resulting from the sequence are crossing. We prove that
there always exists a complete and noncrossing (k, s)-feasible splitting for even
integers k, and such a splitting can be found in O(n2 (m + n log n)) time.
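The crossing condition can be tested directly from the indices in π. A small Python predicate (our sketch; the function and argument names are ours), where pos maps each neighbor of s to its index in π and p = |ΓG (s)|:

    def crossing(pos, e1, e2, p):
        # e1 = (wh, wi) and e2 = (wj, wl) are split edges between neighbors of s.
        h, i = pos[e1[0]], pos[e1[1]]
        j, l = pos[e2[0]], pos[e2[1]]
        if {h, i} & {j, l}:
            return False            # adjacent split edges never cross
        def inside(x):              # is x strictly inside the arc from wh to wi?
            return 0 < (x - h) % p < (i - h) % p
        return inside(j) != inside(l)

The pair crosses exactly when one endpoint of e2 lies strictly inside the arc of π from wh to wi and the other lies strictly outside, which is equivalent to the index condition above.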
Next we consider a planar multigraph G = (V, E) with a vertex s ∈ V of even
degree. A complete splitting at s is called planarity-preserving if the resulting
graph from the splitting remains planar. Based on the result of noncrossing
splitting, we prove that, if k is an even integer with k ≤ λG (V − s), then there
always exists a complete (k, s)-feasible and planarity-preserving splitting, and
the splitting can be found in O(n3 log n) time. For k = 3, we prove by a separate
argument that there exists a complete (k, s)-feasible and planarity-preserving
splitting if the resulting graph is allowed to be re-embedded in the plane.
Example 1. (a) Fig. 1(a) shows a graph G1 = (V, E) with cG1 (s, wi ) = 1 and
cG1 (wi , wi+1 ) = a, 0 ≤ i ≤ 3 for a given integer a ≥ 1. Clearly, λG1 (V − s) =
k for k = 2a + 1. For a cyclic order π = (w0 , w1 , w2 , w3 ), G1 has a unique
complete (k, s)-feasible splitting (i.e., splitting the pair (s, w0 ), (s, w2 ) and the pair
of (s, w1 ), (s, w3 )), which is crossing with respect to π. This implies that, for every
odd k ≥ 3, there is a graph G with a designated vertex s and a cyclic order of
Fig. 1. The graphs (a) G1, (b) G2, and (c) G3 of Example 1.
ΓG (s) which has no complete and noncrossing (k, s)-feasible splitting. Note that
the planar G1 has a complete and planarity-preserving (k, s)-feasible splitting (by
putting one of the split edges in the inner area of cycle C1 = {w0 , w1 , w2 , w3 }).
(b) Fig. 1(b) shows a planar graph G2 = (V, E) with cG2 (wi , wi+1 (mod 12)) = a
for 0 ≤ i ≤ 11 and cG2 (e) = 1 otherwise for an integer a ≥ 1, which satisfies
λG2 (V − s) = k for k = 2a + 1. G2 has a unique complete (k, s)-splitting,
which is not planarity-preserving unless the embedding of subgraph G2 [V − s] is
changed; if G2 [V − s] is re-embedded in the plane so that block components
{w2 , w3 , w4 } and {w8 , w9 , w10 } of G2 [V − s] are flipped and two vertices w3 and
w9 share the same inner face, then the complete (k, s)-splitting is now planarity-
preserving. From this, we see that for every odd k ≥ 3, there is a planar graph
G with a designated vertex s which has no complete and planarity-preserving
(k, s)-feasible splitting (unless G is re-embedded).
(c) Let a ≥ 2 be an integer, and consider the graph G3 = (V, E) in Fig. 1(c),
where cG3 (wi , wi+1 ) = a − 1 for i ∈ {1, 7}, cG3 (wi , wi+1 (mod 12)) = a for i ∈
{0, 1, . . . , 11} − {1, 7}, and cG3 (e) = 1 otherwise. Clearly, λG3 (V − s) = k for
k = 2a + 1 (≥ 5). It is easily observed that the unique complete (k, s)-feasible
splitting is not planarity-preserving for any choice of re-embedding of G3 in the
plane. This implies that for every odd k ≥ 5, there exists a graph which has no
complete and planarity-preserving (k, s)-feasible splitting even if re-embedding
after splitting is allowed. ⊓⊔
2 Preliminaries
2.1 Computing s-Based Connectivity
The vertex set V of a multigraph G = (V, E) is denoted by V (G). We say that
a cut X separates two disjoint subsets Y and Y 0 of V if Y ⊆ X ⊆ V − Y 0 (or
Y 0 ⊆ X ⊆ V − Y ). The local edge-connectivity λG (x, y) for two vertices x, y ∈ V
is defined to be the minimum size of a cut in G that separates x and y. A cut X
crosses another cut Y if none of the subsets X ∩ Y , X − Y , Y − X and V − (X ∪ Y )
is empty.
Algorithm CONTRACT
Input: A multigraph G = (V, E) with |V | ≥ 3 and a designated vertex s ∈ V .
Output: an s-proper cut X ∗ with cG (X ∗ ) = λG (V − s) < λG (X ∗ ).
1 α := min{cG (v) | v ∈ V − s};
2 Let X := {v} for a vertex v ∈ V − s with cG (v) = α;
3 H := G;
4 while |V (H)| ≥ 4 do { cH (u) ≥ α holds for all u ∈ V (H) − s }
5 Find an MA-ordering in H starting from v1 = s, and let v, w (6= s) be the
last two vertices in this ordering; { λH (v, w) = cH (w) by Lemma 1(ii) }
6 Contract v and w into a vertex, say z, and let H be the resulting graph;
7 if cH (z) < α then
8 α := cH (z), and let X ∗ be the set of all vertices in V − s contracted into z so far;
{ cH (z) = cG (X ∗ ) }
9 end { if }
10 end. { while }
It should be noted that, for each u ∈ V (H) − s, cH (u) ≥ α holds before every
iteration of the while-loop. The last two vertices v, w in an MA ordering in line 5,
which are clearly distinct from s, satisfy λH (v, w) = cH (w) by Lemma 1(ii).
Let X ∗ be the cut output by CONTRACT, and α∗ be the final value of α
(i.e., α∗ = cG (X ∗ )). Note that any two vertices v and w in line 5 have been
contracted into a single vertex only when λH (v, w) ≥ α∗ holds. We prove that
α∗ = λG (V − s). For a vertex u ∈ V (H) − s, let Xu denote the set of all
vertices in V − s contracted into u so far. Assume that there is an s-proper cut Y with
cG (Y ) < α∗ . Clearly, the final graph H has three vertices z1 , z2 and s, and
satisfies α∗ ≥ min{cH (z1 ), cH (z2 )}, and thus Y 6= Xz1 , Xz2 . Hence there is a
vertex pair v, w ∈ V (H) chosen in line 5 at some iteration of the while-loop such
that Xv ⊆ Y and Xw ⊆ (V − s) − Y (or Xw ⊆ Y and Xv ⊆ (V − s) − Y ).
Assume that v and w are the vertices in the earliest iteration of the while-loop
among such pairs of vertices. This implies that when v and w are contracted
into a vertex, the current graph H has a subset Y ′ ⊂ V (H) − s such that
∪y∈Y ′ Xy = Y . However, λH (v, w) ≤ cH (Y ′ ) = cG (Y ) < α∗ , contradicting
λH (v, w) ≥ α∗ . Therefore, α∗ = λG (V − s).
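The following Python sketch reconstructs CONTRACT under assumptions of our own (a connected multigraph stored as nested dictionaries adj[u][v] = edge multiplicity; all function names are ours, and this is an illustration rather than the authors' implementation). MA orderings are computed with a lazy-deletion heap, and the original vertices merged into each super-vertex are tracked in groups:

    import heapq
    from itertools import count

    def ma_ordering(adj, start):
        # Maximum-adjacency ordering: repeatedly append the vertex with the
        # largest total edge multiplicity into the already-ordered set.
        order, seen, tie = [start], {start}, count()
        weight, heap = {}, []
        for u, c in adj[start].items():
            weight[u] = c
            heapq.heappush(heap, (-weight[u], next(tie), u))
        while len(order) < len(adj):
            w, _, u = heapq.heappop(heap)
            if u in seen or -w != weight[u]:
                continue                          # stale heap entry
            order.append(u)
            seen.add(u)
            for v, c in adj[u].items():
                if v not in seen:
                    weight[v] = weight.get(v, 0) + c
                    heapq.heappush(heap, (-weight[v], next(tie), v))
        return order

    def contract_pair(adj, v, w, z):
        # Merge v and w into a new vertex z, dropping v-w edges (loops).
        merged = {}
        for x in (v, w):
            for y, c in adj.pop(x).items():
                if y not in (v, w):
                    merged[y] = merged.get(y, 0) + c
        adj[z] = merged
        for y, c in merged.items():
            adj[y].pop(v, None)
            adj[y].pop(w, None)
            adj[y][z] = c

    def contract_algorithm(adj, s):
        # Mirrors CONTRACT: returns (alpha*, X*) with c_G(X*) = lambda_G(V - s).
        groups = {u: frozenset([u]) for u in adj}
        v0 = min((u for u in adj if u != s), key=lambda u: sum(adj[u].values()))
        alpha, x_star, fresh = sum(adj[v0].values()), groups[v0], count()
        while len(adj) >= 4:
            order = ma_ordering(adj, s)           # line 5: v1 = s
            v, w = order[-2], order[-1]           # lambda_H(v, w) = c_H(w)
            z = ('z', next(fresh))
            groups[z] = groups[v] | groups[w]
            contract_pair(adj, v, w, z)
            if sum(adj[z].values()) < alpha:      # line 7
                alpha, x_star = sum(adj[z].values()), groups[z]
        return alpha, x_star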
(i) If two (k, s)-semi-critical cuts X and Y s-cross each other, then cG (X) =
cG (Y ) = k + 1, cG (X − Y ) = cG (Y − X) = k and cG (X ∩ Y, V − (X ∪ Y )) = 1.
(ii) Let X be an admissible cut with respect to u, u′ ∈ V − s (possibly u = u′),
and Y be a (k, s)-semi-critical cut. If X and Y cross each other, then cG (X) =
cG (Y ) = k + 1 and cG (Y − X) = k.
(iii) Let Xi (resp., Xj ) be admissible cuts with respect to ui , u′i (resp., with respect
to uj , u′j ), where possibly ui = u′i or uj = u′j holds. If {ui , u′i } ∩ {uj , u′j } = ∅ or
cG (s, u) ≥ 2 for some u ∈ {ui , u′i } ∩ {uj , u′j }, then the two cuts Xi and Xj do not
cross each other. ⊓⊔
We now describe an algorithm, called COLLECTION, which computes a
(k, s)-semi-critical covering collection X in a graph G∗ obtained from G by a
noncrossing sequence of (k, s)-feasible splittings. Let π = (w0 , w1 , . . . , wp−1 ) be
a cyclic order of ΓG (s). In a graph G′ obtained from G by a noncrossing sequence
of (k, s)-feasible splittings, a vertex wj is called a successor of a vertex wi if
cG′ (s, wj ) ≥ 1 and h > 0 with j = i + h (mod p) is minimum.
0. Initialize X to be ∅.
1. for each wi , i := 0, . . . , p − 1 do
if wi is not in any cut X ∈ X then execute MAXSPLIT(wi , wi , k) to compute
G′ = G/(wi , wi , δ) with δ = ∆G (wi , wi , k) and an admissible cut Xwi in G′ (if
cG′ (s, wi ) ≥ 2); let G := G′ ;
if cG (s, wi ) ≥ 2 then X := X ∪ {Xwi }, discarding all X ∈ X with X ⊂ Xwi
from X .
end { for }
2. for each wi such that cG (s, wi ) = 1, i := 0, . . . , p − 1 do
if wi is not in any cut X ∈ X then execute MAXSPLIT(wi , wj , k) for wi and
the successor wj of wi in the current G to compute G′ = G/(wi , wj , δ) with
δ = ∆G (wi , wj , k) and an admissible cut Xwi in G′ (if cG′ (s, wi ) = 1); let
G := G′ ;
if cG (s, wi ) = 1 then X := {X − Xwi | cG (s, X − Xwi ) > 0, X ∈ X } ∪ {Xwi }
else (if cG (s, wi ) = 0) remove any cut X with cG (s, X) = 0 from X .
end { for }
Output G∗ := G. ⊓⊔
Clearly, the resulting sequence of splittings is (k, s)-feasible and noncrossing.
Lemma 5. Algorithm COLLECTION correctly computes a (k, s)-semi-critical
covering collection X in the output graph G∗ .
Proof: Let X be the set of cuts obtained after step 1. If two cuts Xwi , Xwj ∈ X
(0 ≤ i < j ≤ p − 1) have a common vertex v, then wj ∉ Xwi and Xwi − Xwj ≠ ∅
(otherwise, Xwi must have been discarded). However, this implies that Xwi and
Xwj cross each other, contradicting Lemma 4(iii). Thus X is a (k, s)-semi-
critical collection.
Now we prove by induction that X is a (k, s)-semi-critical collection dur-
ing step 2. Assume that MAXSPLIT(wi , wj , k) is executed to compute G′ =
G/(wi , wj , δ) with δ = ∆G (wi , wj , k). If cG′ (s, wi ) = 0, then a cut X ∈ X with
wj ∈ X may satisfy cG′ (s, X) = 0 after the splitting. However, such a cut will
most 4(|ΓG (s)|+|X |) = O(|ΓG (s)|) times, we can obtain a complete (k, s)-feasible
splitting of a given graph G, which is obviously noncrossing.
4 Planarity-Preserving Splitting
(i) Any s-proper cut X with cG (X) ≤ 4 induces a connected subgraph G[X].
(ii) Assume λG[V −s] (V − s) = 0, and let u and v be two neighbors of s such that
they belong to different 1-components in G[V − s]. Then ∆G (u, v, 3) ≥ 1.
(iii) Assume λG[V −s] (V − s) = 1. Let Y be a leaf 2-component in G[V − s]. If
∆G (u, v, 3) = 0 for some neighbors u ∈ ΓG (s) ∩ Y and v ∈ ΓG (s) − Y , then
ΓG (s) − Y − v ≠ ∅ and ∆G (u′ , v ′ , 3) ≥ 1 for any neighbors v ′ ∈ ΓG (s) − Y − v
and u′ ∈ ΓG (s) ∩ Y .
(iv) Assume λG[V −s] (V − s) = 2. Let Y be a 3-component with cG (s, Y ) ≥ 2
in G[V − s]. Then ∆G (u, v, 3) ≥ 1 for any neighbors u ∈ ΓG (s) ∩ Y and
v ∈ ΓG (s) − Y .
(v) Assume λG[V −s] (V − s) = 2. Let Y be a non-leaf 3-component with cG (s, Y ) =
1 in G[V − s]. If ∆G (u, v, 3) = 0 for some neighbors u ∈ ΓG (s) ∩ Y and
v ∈ ΓG (s) − Y , then ΓG (s) − Y − v ≠ ∅ and ∆G (u, v ′ , 3) ≥ 1 for any neighbor
v ′ ∈ ΓG (s) − Y − v. ⊓⊔
Based on this lemma, for a given cyclic order π of ΓG (s), we can find a
noncrossing sequence of (k, s)-feasible splittings (which may not be complete)
such that the resulting graph G∗ satisfies either cG∗ (s) = 0 (i.e., the obtained
splitting is complete) or the following condition:
Algorithm PREPROCESS
Input: A multigraph G = (V, E) (which is not necessarily planar), a designated
vertex s ∈ V with even degree and λG (V − s) ≥ 3, and a cyclic order π of ΓG (s).
Output: A multigraph G∗ obtained from G by a noncrossing (3, s)-feasible
splitting such that G∗ satisfies either cG∗ (s) = 0 or (2).
1 G′ := G;
2 while G′ [V − s] is not connected do
3 Find a neighbor w ∈ ΓG′ (s) and its successor w′ ∈ ΓG′ (s) such that
w and w′ belong to different 1-components in G′ [V − s];
4 G′ := G′ /(w, w′ , 1)
5 end; { while }
6 while G′ [V − s] is not 2-edge-connected (i.e., λG′ [V −s] (V − s) = 1) do
7 Choose a leaf 2-component Y in G′ [V − s];
8 Find a neighbor w ∈ ΓG′ (s) ∩ Y and its successor w′ ∈ ΓG′ (s) − Y ;
9 if ∆G′ (w, w′ , 3) ≥ 1 then G′ := G′ /(w, w′ , 1)
10 else
11 Find a neighbor w′′ ∈ ΓG′ (s) − Y and its successor w′′′ ∈ ΓG′ (s) ∩ Y ,
and G′ := G′ /(w′′ , w′′′ , 1)
12 end { if }
13 end; { while }
14 while G′ [V − s] has a 3-component Y with cG′ (s, Y ) ≥ 2 or a non-leaf
3-component Y with cG′ (s, Y ) = 1 do
15 Find neighbors w ∈ ΓG′ (s) ∩ Y and w′ ∈ ΓG′ (s) − Y such that w′ is
the successor of w;
16 if ∆G′ (w, w′ , 3) ≥ 1 then G′ := G′ /(w, w′ , 1)
17 else { Y is a non-leaf 3-component with cG′ (s, Y ) = 1 }
18 Find a neighbor w′′ ∈ ΓG′ (s) − Y such that w is the successor of w′′ ,
and G′ := G′ /(w′′ , w, 1)
19 end { if }
20 end; { while }
21 Output G∗ := G′ .
In other words, the cactus GG′′ represents all 2-cuts in G′′ = G∗ [V − s]. By Lemma 9,
there is a bijection between EG∗ (s) and L(GG′′ ), and thus GG′′ has an even
number of leaf vertices. A set σ of new edges which pairwise connect all leaf
vertices in a cactus is called a leaf-matching. Hence finding a complete (k, s)-
feasible splitting in G∗ amounts to obtaining a leaf-matching σ in GG′′ such that adding
the leaf-matching destroys all 2-cuts in the cactus GG′′ .
However, to ensure that the complete splitting corresponding to a leaf-mat-
ching preserves the planarity of G∗ , we need to choose a leaf-matching σ of GG′′
carefully. An embedding χ of a cactus in the plane is called standard if all leaf
vertices are located on the outer face of χ. In particular, for a cyclic order π of leaf
vertices, a standard embedding χ of a cactus is called a standard π-embedding
if the leaf vertices appear in the order of π along the outer face of χ. Note that
a standard π-embedding of a cactus is unique (unless we distinguish one edge
from the other in a cycle of length two). We define a flipping operation in an
embedding χ of cactus G = (V, E) as follows. Choose a cycle C in G and a vertex
z in C. We see that removal of the two edges in C incident to z creates two
connected components, say G ′ and G ′′ ; we assume that z ∈ V (G ′ ). Let G[C, z]
denote the subgraph G ′ of G. We say that the embedding χ of G is flipped by
(C, z) if we fold the subgraph G[C, z] back into the other side of area surrounded
by C while fixing the other part of the embedding χ. An embedding obtained
from a standard π-embedding of a cactus by a sequence of flipping operations is
called a π-embedding.
Recall that the neighbors in ΓG∗ (s) appear in the cyclic order πψ′ = (w1 , . . . , wr )
in an embedding χψ of G∗ . We also use πψ′ to represent the cyclic order of
leaf vertices z1 = ϕ(w1 ), z2 = ϕ(w2 ), . . . , zr = ϕ(wr ) in the cactus GG′′ . Then the
standard πψ′ -embedding χψ of GG′′ has the following property:
each vertex z ∈ V in GG′′ can be replaced with the subgraph
G∗ [ϕ−1 (z)] without creating crossing edges in χ.   (3)
Observe that a flipping operation preserves property (3), and hence any π-
embedding χ of GG′′ also satisfies property (3).
A pair (σ, χ) of a leaf-matching σ on leaf vertices in a cactus G and a π-
embedding χ of G is called a π-configuration. A π-configuration (σ, χ) of G is
called good if it satisfies the following conditions.
(a) all 2-cuts in cactus G = (V, E) are destroyed by adding σ (i.e., for any 2-cut
X, σ contains an edge (z, z ′ ) with z ∈ X and z ′ ∈ V − X), and
(b) all edges in σ can be drawn in π-embedding χ of G without creating crossing
edges.
Now the problem of computing a complete and planarity-preserving (3, s)-
feasible splitting in G∗ is to find a good πψ′ -configuration (σ, χ) of GG′′ . To show
that such a good πψ′ -configuration always exists in GG′′ , it suffices to prove the
next lemma (the proof is omitted).
Lemma 10. For a cactus G = (V, E) and a cyclic order π of leaf vertices, as-
sume that there is a standard π-embedding of G. Then there always exists a good
π-configuration (σ, χ) of G, and such a configuration can be found in O(|V|2 )
time. ⊓⊔
By this lemma, a complete and planarity-preserving (3, s)-feasible splitting
in a graph G∗ which satisfies (2) can be computed in O(n2 ) time. This and
Lemma 8 establish the following theorem.
Theorem 4. Given a planar multigraph G = (V, E) with a designated vertex
s ∈ V of even degree, and λG (V − s) ≥ 3, there exists a complete and planarity-
preserving (3, s)-feasible splitting, and such a splitting can be found in O(n2 )
time. ⊓⊔
Acknowledgments
This research was partially supported by the Scientific Grant-in-Aid from Min-
istry of Education, Science, Sports and Culture of Japan, the grant from the
Inamori Foundation and Kyoto University Foundation, and the Research Man-
agement Committee from The University of Newcastle.
References
1. G.-R. Cai and Y.-G. Sun. The minimum augmentation of any graph to k-edge-
connected graph. Networks, 19:151–172, 1989.
2. A. Frank. Augmenting graphs to meet edge-connectivity requirements. SIAM J.
Disc. Math., 5:25–53, 1992.
3. G. Kant. Algorithms for Drawing Planar Graphs. PhD thesis, Dept. of Computer
Science, Utrecht University, 1993.
4. G. Kant. Augmenting outerplanar graphs. J. Algorithms, 21:1–25, 1996.
5. G. Kant and H. L. Bodlaender. Planar graph augmentation problems. LNCS,
Vol. 621, pages 258–271. Springer-Verlag, 1992.
6. L. Lovász. Combinatorial Problems and Exercises. North-Holland, 1979.
7. H. Nagamochi and T. Ibaraki. A linear time algorithm for computing 3-edge-
connected components in multigraphs. J. of Japan Society for Industrial and
Applied Mathematics, 9:163–180, 1992.
8. H. Nagamochi and T. Ibaraki. Computing edge-connectivity of multigraphs and
capacitated graphs. SIAM J. Disc. Math., 5:54–66, 1992.
9. H. Nagamochi and T. Ibaraki. Deterministic Õ(nm) time edge-splitting in undi-
rected graphs. J. Combinatorial Optimization, 1:5–46, 1997.
10. R. E. Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput.,
1:146–160, 1972.
A New Bound for the 2-Edge Connected Subgraph Problem
1 Introduction
The 2-Edge Connected Subgraph Problem is a fundamental problem in Sur-
vivable Network Design. This problem arises in the design of communication
networks that are resilient to single-link failures and is an important special case
in the design of survivable networks [11,12,14].
1.1 Formulation
An integer programming formulation for the 2-Edge Connected Subgraph Prob-
lem is as follows. Let Kn = (V, E) be the complete graph of feasible links on
* Supported by NSF grant DMS9509581 and DOE contract AC04-94AL85000.
** Research supported in part by an NSF CAREER grant CCR-9625297.
minimize c · x
subject to x(δ(v)) ≥ 2 for all v ∈ V,
x(δ(S)) ≥ 2 for all S ⊂ V,   (1)
xe ≥ 0 for all e ∈ E,
xe integral.
minimize c · x
subject to x(δ(v)) = 2 for all v ∈ V,
x(δ(S)) ≥ 2 for all S ⊂ V,   (2)
xe ≥ 0 for all e ∈ E.
The constraints of the subtour relaxation are called the degree constraints, the
subtour elimination constraints, and the non-negativity constraints respectively.
If one has the relationship cij ≤ cik + cjk for all distinct i, j, k ∈ V , then
c is said to satisfy the triangle inequality. An interesting known result is that
if the costs satisfy the triangle inequality, then there is an optimal solution to
(1) which is also feasible and hence optimal for (2). This follows from a result
of Cunningham [11] (A more general result called the Parsimonious Property is
shown by Goemans and Bertsimas in [7]). We can show that this equivalence
holds even when the costs do not satisfy the triangle inequality. In the latter
case, we replace the given graph by its metric completion, namely, for every
edge ij such that cij is greater than the cost of the shortest path between i and
j in the given graph, we reset the cost to that of this shortest path. The intent
is that if this edge is chosen in the solution of (1), we may replace it by the
shortest cost path connecting i and j. Since multiedges are allowed in the 2-edge
connected graph this transformation is valid. Hence without loss of generality,
we can assume that the costs satisfy the triangle inequality.
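As an illustration (ours; the function name is ours), the metric-completion step is just all-pairs shortest paths, e.g., by Floyd–Warshall:

    def metric_completion(n, cost):
        # cost maps frozenset({i, j}) -> c_ij on the complete graph K_n with
        # vertices 0..n-1.  Each cost is lowered to the shortest-path value,
        # after which the triangle inequality holds.
        d = [[0 if i == j else cost[frozenset((i, j))] for j in range(n)]
             for i in range(n)]
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        return {frozenset((i, j)): d[i][j]
                for i in range(n) for j in range(i + 1, n)}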
2 Motivation
In this section we discuss two distinct motivations that led us to focus on half-
integral extreme points and prove a version of Conjecture 1 for this special case.
One follows from a particular strategy to prove Conjecture 1 and the other
from examining subclasses of subtour extreme points that are sufficient to prove
Conjectures 1 and 2.
Let an arbitrary point x∗ of the subtour polytope for Kn be given. Multiply this
by 4/3 to obtain the vector (4/3) x∗ . Denote the edge incidence vector for a given 2-edge
connected subgraph H in Kn by χH . Note that edge variables could be 0, 1, or 2
in this incidence vector. Suppose we could express (4/3) x∗ as a convex combination
of incidence vectors of 2-edge connected subgraphs Hi for i = 1, 2, . . . , k. That
is, suppose that

(4/3) x∗ = Σ_{i=1}^{k} λ_i χ^{H_i} ,   (3)

where Σ_{i=1}^{k} λ_i = 1.
Then, taking dot products on both sides of (3) with the cost vector c yields
(4/3) c · x∗ = Σ_{i=1}^{k} λ_i c · χ^{H_i} .   (4)
Since the right hand side of (4) is a weighted average of the numbers c · χHi , it
follows that there exists a j ∈ {1, 2, . . . , k} such that
c · χ^{H_j} ≤ (4/3) c · x∗ .   (5)
If we could establish (5) for any subtour point x∗ , then it would in particular
be valid for the optimal subtour point, which would prove Conjecture 1.
In an attempt at proving Conjecture 1, we aim at contradicting the idea of a
minimal counterexample, that is, a subtour point x∗ having the fewest number of
vertices n0 such that (3) cannot hold for any set of 2-edge connected subgraphs.
First we have the following observation.
of convex multipliers. Thus, for each l, we can find a set of 2-edge connected
subgraphs H_i^l such that

(4/3) x^l = Σ_i λ_i^l χ^{H_i^l} ,

where the λ_i^l ’s satisfy the usual constraints for a set of convex multipliers. Then

(4/3) x∗ = Σ_l µ_l (4/3) x^l = Σ_l Σ_i µ_l λ_i^l χ^{H_i^l} .   (6)
x∗ (δ(H)) = 2.
We can then split x∗ into 2 smaller subtour solutions x1 and x2 in the following
way. Take the vertices of V \ H in x∗ and contract them to a single vertex to
obtain x1 . Likewise, take the vertices of H in x∗ and contract them to a single
vertex to obtain x2 . An example of this is shown in Figure 1.
Since x1 and x2 are not counterexamples to our conjecture, we would be able
to decompose (4/3) x1 and (4/3) x2 into combinations of 2-edge connected subgraphs,
which we may then attempt to glue together to form a similar combination for
(4/3) x∗ , thereby showing that x∗ is not a counterexample (we show how this can be
accomplished for the case of half-integral extreme points in Case 1 in the proof
of Theorem 6).
What if there are no tight substantial cuts however? The following proposi-
tion which is shown in [1] shows us what we need to do.
Fig. 1. An idea for splitting a minimal counterexample into two smaller instances.
Note that H defines a substantial tight cut, i.e., both H and V \ H have at least
three vertices and x(δ(H)) = 2.
where the H̄_i ’s are 2-edge connected graphs spanning V̄ . For each i, contract
each circle of nodes Cv back to the vertex v ∈ V in H̄_i . Call the resulting graph
H_i . Since contraction preserves edge connectivity, H_i is a 2-edge connected graph
spanning V . When one performs this contraction on x̄∗ , one gets x∗ . As a result,
we obtain that

(4/3) x∗ = Σ_i λ_i χ^{H_i} ,   (8)
We define a tight cut for a 4-edge connected graph G to be a cut which has
exactly 4 edges in it. We define a non-trivial cut for such a graph to be a cut
where both shores have at least 2 vertices each. We have the following lemma.
Lemma 1. Let G = (V, E) be a 4-regular 4-edge connected graph which has no
tight non-trivial cut which includes an edge e = uv ∈ E. Let the other 3 (not
necessarily distinct) neighbors of v be x, y, and z. Then either ux or yz is a loop
or G′ = G − v + ux + yz is 4-regular and 4-edge connected, and likewise for the
other combinations.
(2/3) χ^{E(G1)\{e1}} = Σ_i λ_i χ^{H_i^1} ,   (11)

In (11), consider the edges incident to v1 in each of the H_i^1 ’s. There are clearly
at least 2 such edges for every H_i^1 . The values of edges a1 , b1 , c1 , and e1 in
(2/3) χ^{E(G1)\{e1}} are 2/3, 2/3, 2/3, and 0, respectively. This adds up to 2. Hence, since we
are dealing with convex combinations, which are weighted averages, when the
weights are taken into account, the H_i^1 ’s have on average 2 edges incident to v1
each. But since every H_i^1 has at least 2 such edges, it follows that every H_i^1 has
exactly 2 edges incident to v1 in it.
For each 2-edge connected subgraph H_i^1 which has edges a1 and b1 , denote
the corresponding convex multiplier by λ_i^{ab} . Define λ_i^{ac} and λ_i^{bc} similarly. One
can see that the only way for the variable values of edges a1 , b1 , and c1 to end
up all being 2/3 in (2/3) χ^{E(G1)\{e1}} is for the following to hold:

Σ_i λ_i^{ab} = Σ_i λ_i^{ac} = Σ_i λ_i^{bc} = 1/3 .   (13)
Call the three types of 2-edge connected graphs H_i^j ab-graphs, ac-graphs,
and bc-graphs. Our strategy is to combine say each ab-graph H_i^1 of G1 with an
ab-graph H_j^2 of G2 to form an ab-graph H_{ij}^{ab} of G which is also 2-edge connected.
So, we define

H_{ij}^{ab} := (H_i^1 − v1 ) + (H_j^2 − v2 ) + a + b,   (15)

where H_i^1 and H_j^2 are ab-graphs. Since H_i^1 − v1 and H_j^2 − v2 are both connected,
it follows that H_{ij}^{ab} is 2-edge connected. Similarly define H_{ij}^{ac} and H_{ij}^{bc} .
Now consider the following expression:

Σ_{i,j} 3λ_i^{ab} µ_j^{ab} H_{ij}^{ab} + Σ_{i,j} 3λ_i^{ac} µ_j^{ac} H_{ij}^{ac} + Σ_{i,j} 3λ_i^{bc} µ_j^{bc} H_{ij}^{bc} .   (16)

One can verify that this is in fact a convex combination. Any edge f in say
G1 − v1 occurs in (16) with a weight of

Σ_{i | f ∈ H_i^1} (λ_i^{ab} · (3 · Σ_j µ_j^{ab}) + λ_i^{ac} · (3 · Σ_j µ_j^{ac}) + λ_i^{bc} · (3 · Σ_j µ_j^{bc})).   (17)
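Using (13), together with the analogous identity Σ_j µ_j^{ab} = Σ_j µ_j^{ac} = Σ_j µ_j^{bc} = 1/3 for the convex multipliers of G2, the weight (17) simplifies:

\[
\sum_{i \,:\, f \in H_i^1}\Bigl(\lambda_i^{ab}\cdot 3\cdot\tfrac{1}{3}
 + \lambda_i^{ac}\cdot 3\cdot\tfrac{1}{3}
 + \lambda_i^{bc}\cdot 3\cdot\tfrac{1}{3}\Bigr)
 \;=\; \sum_{i \,:\, f \in H_i^1} \lambda_i \;=\; \tfrac{2}{3},
\]

the last equality being the value of the edge f in (11); symmetrically, every edge of G2 − v2 keeps its weight of 2/3.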
G1 = G − v + ux + yz,   (20)
G2 = G − v + uy + xz,   (21)

(2/3) χ^{E1\{e1}} = Σ_i λ_i χ^{H_i^1} ,   (22)

and

(2/3) χ^{E2\{e2}} = Σ_i µ_i χ^{H_i^2} .   (23)
Define Ĥ_i^1 by

Ĥ_i^1 = H_i^1 − yz + yv + zv for yz ∈ H_i^1 , and Ĥ_i^1 = H_i^1 + yv + xv for yz ∉ H_i^1 .   (24)
Every edge f ∈ E \ δ(v) occurs with a total weight of 2/3 in (26) since f occurred
with that weight in both (22) and (23). Since yz occurs with a total weight of 2/3
in (22) and xz occurs with a total weight of 2/3 in (23), one can verify that xv, yv,
and zv each occur with a total weight of 2/3 in (26) as well. Therefore, we have

(2/3) χ^{E\{e}} = (1/2) Σ_i λ_i χ^{Ĥ_i^1} + (1/2) Σ_i µ_i χ^{Ĥ_i^2} .
4 Concluding Remarks
An obvious open problem arising from our work is to extend our strategy and
settle Conjecture 1. In another direction, it would be interesting to apply our
ideas to design a 4/3-approximation algorithm for the minimum cost 2-edge- and
2-vertex-connected subgraph problems.
Another interesting question is the tightness of the bound proven in Theo-
rem 1. The examples we have been able to construct seem to demonstrate an
asymptotic ratio of 6/5 between the cost of a minimum cost 2-edge connected sub-
graph and that of an optimal half-integral subtour solution. Finding instances
with a worse ratio or improving our bound in Theorem 1 are open problems.
References
1. M. Balinski. On recent developments in integer programming. In H. W. Kuhn,
editor, Proceedings of the Princeton Symposium on Mathematical Programming,
pages 267–302. Princeton University Press, NJ, 1970.
2. S. Boyd and R. Carr. Finding low cost TSP and 2-matching solutions using certain
half-integer subtour vertices. Manuscript, March 1998.
3. S. Boyd and R. Carr. A new bound for the 2-matching problem. Report TR-96-07,
Department of Computer Science, University of Ottawa, Ottawa, 1996.
4. N. Christofides. Worst case analysis of a new heuristic for the traveling salesman
problem. Report 388, Graduate School of Industrial Administration, Carnegie Mel-
lon University, Pittsburgh, 1976.
5. G. N. Fredrickson and J. Ja Ja. On the relationship between the biconnectiv-
ity augmentation and traveling salesman problems. Theoretical Computer Science,
19:189–201, 1982.
6. M. X. Goemans. Worst-case comparison of valid inequalities for the TSP. Math.
Programming, 69:335–349, 1995.
7. M. X. Goemans and D. J. Bertsimas. Survivable networks, linear programming
relaxations and the parsimonious property. Math. Programming, 60:145–166, 1993.
8. M. X. Goemans, A. Goldberg, S. Plotkin, D. Shmoys, É. Tardos, and D. P.
Williamson. Approximation algorithms for network design problems. Proceedings of the
Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’94), pages
223–232, 1994.
An Approximation Algorithm for 2-Edge Connected Spanning Subgraphs
Abstract. We give a 17/12-approximation algorithm for the following NP-
hard problem:
Given a simple undirected graph, find a 2-edge connected span-
ning subgraph that has the minimum number of edges.
The best previous approximation guarantee was 3/2. If the well known
TSP 4/3 conjecture holds, then there is a 4/3-approximation algorithm. Thus our
main result gets half-way to this target.
1 Introduction
Given a simple undirected graph, consider the problem of finding a 2-edge con-
nected spanning subgraph that has the minimum number of edges. The problem
is NP-hard, since the Hamiltonian cycle problem reduces to it. A number of
recent papers have focused on approximation algorithms 1 for this and other
related problems, [2]. We use the abbreviation 2-ECSS for 2-edge connected
spanning subgraph.
Here is an easy 2-approximation algorithm for the problem:
Take an ear decomposition of the given graph (see Section 2 for defini-
tions), and discard all 1-ears (ears that consist of one edge). Then the
resulting graph is 2-edge connected and has at most 2n − 3 edges, while
the optimal subgraph has ≥ n edges, where n is the number of nodes.
1 An α-approximation algorithm for a combinatorial optimization problem runs in
polynomial time and delivers a solution whose value is always within the factor α
of the optimum value. The quantity α is called the approximation guarantee of the
algorithm.
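To make the heuristic concrete, here is a Python sketch (ours; the DFS-based chain construction below is one standard way to build an ear decomposition and is not necessarily the method used later in this paper). It computes an ear decomposition of a 2-edge connected simple graph, given as a dict adj mapping each node to its neighbors, and then discards the 1-ears:

    def ear_decomposition(adj, root):
        # In an undirected DFS every non-tree edge joins a vertex to one of
        # its ancestors.  Tracing each non-tree edge, followed by the tree
        # path upward until the first already-covered vertex, yields an ear
        # decomposition when the graph is 2-edge connected.
        parent, dfi = {}, {}
        stack = [(root, None)]
        while stack:
            u, p = stack.pop()
            if u in dfi:
                continue
            parent[u], dfi[u] = p, len(dfi)
            for v in adj[u]:
                if v not in dfi:
                    stack.append((v, u))
        nontree = [(u, v) for u in adj for v in adj[u]
                   if dfi[u] < dfi[v] and parent[v] != u]
        ears, covered = [], {root}
        for u, v in sorted(nontree, key=lambda e: dfi[e[0]]):
            ear, w = [(u, v)], v              # start with the non-tree edge
            while w not in covered:
                ear.append((w, parent[w]))
                covered.add(w)
                w = parent[w]
            ears.append(ear)                  # a 1-ear if v was already covered
        return ears

    def two_ecss_2approx(adj, root):
        # Discarding the 1-ears leaves a 2-edge connected spanning subgraph
        # with at most 2n - 3 edges.
        return [e for ear in ear_decomposition(adj, root) if len(ear) > 1
                for e in ear]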
Khuller & Vishkin [8] were the first to improve on the approximation guarantee
of 2. They gave a simple and elegant algorithm based on depth-first search that
achieves an approximation guarantee of 1.5. In an extended abstract, Garg, San-
tosh & Singla [6] claimed to have a 1.25-approximation algorithm for the prob-
lem. No proof of this claim is available; on the other hand, there is no evidence
indicating that achieving an approximation guarantee of 1.25 in polynomial time
is impossible.
We improve Khuller & Vishkin’s 18/12-approximation guarantee to 17/12. If the
well known TSP 4/3 conjecture holds, then there is a 4/3-approximation algorithm,
see Section 5. Thus our main result gets half-way to this target.
Let G = (V, E) be the given simple undirected graph, and let n and m denote
|V | and |E|. Assume that G is 2-edge connected.
Our method is based on a matching-theory result of András Frank, namely,
there is a good characterization for the minimum number of even-length ears
over all possible ear decompositions of a graph, and moreover, an ear decom-
position achieving this minimum can be computed efficiently, [4]. Recall that
the 2-approximation heuristic starts with an arbitrary ear decomposition of G.
Instead, if we start with an ear decomposition that maximizes the number of
1-ears, and if we discard all the 1-ears, then we will obtain the optimal solution.
In fact, we start with an ear decomposition that maximizes the number of odd-
length ears. Now, discarding all the 1-ears gives an approximation guarantee of
1.5 (see Proposition 8 below). To do better, we repeatedly apply “ear-splicing”
steps to the starting ear decomposition to obtain a final ear decomposition such
that the number of odd-length ears is the same, and moreover, the internal nodes
of distinct 3-ears are nonadjacent. We employ two lower bounds to show that
discarding all the 1-ears from the final ear decomposition gives an approximation
guarantee of 17/12. The first lower bound is the “component lower bound” due to
Garg et al [6, Lemma 4.1], see Proposition 4 below. The second lower bound
comes from the minimum number of even-length ears in an ear decomposition
of G, see Proposition 7 below.
After developing some preliminaries in Sections 2 and 3, we present our
heuristic in Section 4. Section 5 shows that the well known 4/3 conjecture for the
metric TSP implies that there is a 4/3-approximation algorithm for a minimum-
size 2-ECSS, see Theorem 18. Almost all of the results in Section 5 are well
known, but we include the details to make the paper self-contained. Section 6
has two examples showing that our analysis of the heuristic is tight. Section 6
also compares the two lower bounds with the optimal value.
A Useful Assumption
For our heuristic to work, it is essential that the given graph be 2-node con-
nected. Hence, in Section 4 of the paper where our heuristic is presented, we
will assume that the given graph G is 2-node connected. Otherwise, if G is not
2-node connected, we compute the blocks (i.e., the maximal 2-node connected
subgraphs) of G, and apply the algorithm separately to each block. We compute
a 2-ECSS for each block, and output the union of the edge sets as the edge set of
a 2-ECSS of G. The resulting graph has no cut edges since the subgraph found
for each block has no cut edge, and moreover, the approximation guarantee for
G is at most the maximum of the approximation guarantees for the blocks.
2 Preliminaries
Except in Section 5, all graphs are simple, that is, there are neither loops nor multi-
edges. A closed path means a cycle, and an open path means that all the nodes
are distinct.
An ear decomposition of the graph G is a partition of the edge set into open
or closed paths, P0 + P1 + . . . + Pk , such that P0 is the trivial path with one
node, and each Pi (1 ≤ i ≤ k) is a path that has both end nodes in Vi−1 =
V (P0 ) ∪ V (P1 ) ∪ . . . ∪ V (Pi−1 ) but has no internal nodes in Vi−1 . A (closed
or open) ear means one of the (closed or open) paths P0 , P1 , . . . , Pk in the ear
decomposition, and for a nonnegative integer `, an `-ear means an ear that has `
edges. An `-ear is called even if ` is an even number, otherwise, the `-ear is called
odd. (The ear P0 is always even.) An open ear decomposition P0 + P1 + . . . + Pk
is one such that all the ears P2 , . . . , Pk are open.
An odd ear decomposition is one such that every ear (except the trivial path
P0 ) has an odd number of edges. A graph is called factor-critical if for every node
v ∈ V (G), there is a perfect matching in G − v. The next result gives another
characterization of factor-critical graphs.
Let ε(G) denote the minimum number of edges in a 2-ECSS of G. For a graph
H, let c(H) denote the number of (connected) components of H. Garg et al [6,
Lemma 4.1] use the following lower bound on ε(G).
The next result is not useful for our main result, but we include it for com-
pleteness.
Proof. Let t be the number of internal nodes in the odd ears of P0 +P1 +. . .+Pk .
(Note that the node in P0 is not counted by t.) Then, the number of edges
contributed to E ′ by the odd ears is ≤ 3t/2, and the number of edges contributed
to E ′ by the even ears is ≤ ϕ + |V | − t − 1. By applying Proposition 7 (and the fact
that ε(G) ≥ |V |) we get |E ′ |/ε(G) ≤ (t/2 + ϕ + |V | − 1)/ max(|V |, ϕ + |V | − 1) ≤
t/(2|V |) + (ϕ + |V | − 1)/(ϕ + |V | − 1) ≤ 1.5. ⊓⊔
Property (α)
(0) the number of even ears is the same in P0 + P1 + . . . + Pk and in Q0 + Q1 +
. . . + Qk ,
(1) every 3-ear Qi is a pendant ear,
(2) for every pair of 3-ears Qi and Qj , there is no edge between an internal node
of Qi and an internal node of Qj , and
(3) every 3-ear Qi is open.
Proof. The proof is by induction on the number of ears. The result clearly holds
for k = 1. Suppose that the result holds for (j − 1) ears P0 + P1 + . . . + Pj−1 . Let
Q′0 + Q′1 + . . . + Q′j−1 be the corresponding ear decomposition that satisfies prop-
erty (α). Consider the open ear Pj , j ≥ 2. Let Pj be an ℓ-ear, v1 , v2 , . . . , vℓ , vℓ+1 .
Possibly, ℓ = 1. (So v1 and vℓ+1 are the end nodes of Pj , and v1 ≠ vℓ+1 .)
Let T denote the set of internal nodes of the 3-ears of Q′0 + Q′1 + . . . + Q′j−1 .
Suppose Pj is an ear of length ℓ ≥ 2 with exactly one end node, say, v1 in T .
Let Q′i = w1 , v1 , w3 , w4 be the 3-ear having v1 as an internal node. We take
Q0 = Q′0 , . . . , Qi−1 = Q′i−1 , Qi = Q′i+1 , . . . , Qj−2 = Q′j−1 . Moreover, we take
Qj−1 to be the (ℓ + 2)-ear obtained by adding the last two edges of Q′i to Pj , i.e.,
Qj−1 = w4 , w3 , v1 , v2 , . . . , vℓ , vℓ+1 , and we take Qj to be the 1-ear consisting of
the first edge w1 v1 of Q′i . Note that the parities of the lengths of the two spliced
ears are preserved, that is, Qj−1 is even (odd) if and only if Pj is even (odd),
and both Qj and Q′i are odd. Hence, the number of even ears is the same in
P0 + P1 + . . . + Pj and in Q0 + Q1 + . . . + Qj .
Now, suppose Pj has both end nodes v1 and vℓ+1 in T . If there is one 3-ear
Q′i that has both v1 and vℓ+1 as internal nodes (so ℓ ≥ 2), then we take Qj−1
to be the (ℓ + 2)-ear obtained by adding the first edge and the last edge of Q′i
to Pj , and we take Qj to be the 1-ear consisting of the middle edge v1 vℓ+1 of
Q′i . Also, we take Q0 = Q′0 , . . . , Qi−1 = Q′i−1 , Qi = Q′i+1 , . . . , Qj−2 = Q′j−1 .
Observe that the number of even ears is the same in P0 + P1 + . . . + Pj and in
Q 0 + Q1 + . . . + Qj .
If there are two 3-ears Q′i and Q′h that contain the end nodes of Pj , then we
take Qj−2 to be the (ℓ + 4)-ear obtained by adding the last two edges of both Q′i
and Q′h to Pj , and we take Qj−1 (similarly, Qj ) to be the 1-ear consisting of the
first edge of Q′i (similarly, Q′h ). (For ease of description, assume that if a 3-ear
has exactly one end node v of Pj as an internal node, then v is the second node
of the 3-ear.) Also, assuming i < h, we take Q0 = Q′0 , . . . , Qi−1 = Q′i−1 , Qi =
Q′i+1 , . . . , Qh−2 = Q′h−1 , Qh−1 = Q′h+1 , . . . , Qj−3 = Q′j−1 . Again, observe that
the number of even ears is the same in P0 +P1 +. . .+Pj and in Q0 +Q1 +. . .+Qj .
If the end nodes of Pj are disjoint from T , then the proof is easy (take
Qj = Pj ). Also, if Pj is a 1-ear with exactly one end node in T , then the proof
is easy (take Qj = Pj ).
The proof ensures that in the final ear decomposition Q0 + Q1 + . . . + Qk ,
every 3-ear is pendant and open, and moreover, the internal nodes of distinct 3-
ears are nonadjacent. We leave the detailed verification to the reader. Therefore,
the ear decomposition Q0 + Q1 + . . . + Qk satisfies property (α). ⊓⊔
Remark 10. In the induction step, which applies for j ≥ 2 (but not for j = 1),
it is essential that the ear Pj is open, though Q′i (and Q′h ) may be either open
or closed. Our main result (Theorem 12) does not use part (3) of property (α).
Our approximation algorithm for a minimum-size 2-ECSS computes the ear
decomposition Q0 + Q1 + . . . + Qk satisfying property (α), starting from an open
evenmin ear decomposition P0 + P1 + . . . + Pk . (Note that Q0 + Q1 + . . . + Qk
is an evenmin ear decomposition.) Then, the algorithm discards all the edges
in 1-ears. Let the resulting graph be G′ = (V, E ′ ). G′ is 2-edge connected by
Proposition 1.
Theorem 12. Given a 2-edge connected graph G = (V, E), the above algorithm
finds a 2-ECSS G′ = (V, E ′ ) such that |E ′ |/ε(G) ≤ 17/12. The algorithm runs in
time O(|V | · |E|).
Proof. By the previous lemma and Proposition 7, ε(G) ≥ max(n + ϕ(G) − 1, 3t/2).
We claim that

|E ′ | ≤ t/4 + 5(n + ϕ(G) − 1)/4 .
To see this, note that the final ear decomposition Q0 + Q1 + . . . + Qk satisfies
the following: (i) the number of edges contributed by the 3-ears is 3t/2; (ii) the
number of edges contributed by the odd ears of length ≥ 5 is ≤ 5q/4, where q is
the number of internal nodes in the odd ears of length ≥ 5; and (iii) the number
of edges contributed by the even ears of length ≥ 2 is ≤ ϕ(G) + (n − t − q − 1),
since there are ϕ(G) such ears and they have a total of (n − t − q − 1) internal
nodes. (The node in Q0 is not an internal node of an ear of length ≥ 1.)
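Summing (i)–(iii) and using t + q ≤ n − 1 yields the claim:

\[
|E'| \;\le\; \frac{3t}{2} + \frac{5q}{4} + \varphi(G) + (n - t - q - 1)
      \;=\; \frac{t}{2} + \frac{q}{4} + n + \varphi(G) - 1
      \;\le\; \frac{t}{4} + \frac{5(n + \varphi(G) - 1)}{4},
\]

where the final inequality is equivalent to t + q ≤ n + ϕ(G) − 1.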
The approximation guarantee follows since

|E ′ |/ε(G) ≤ (t/4 + 5(n + ϕ(G) − 1)/4) / ε(G)
            ≤ (t/4 + 5(n + ϕ(G) − 1)/4) / max(n + ϕ(G) − 1, 3t/2)
            ≤ (t/4) · (2/(3t)) + (5(n + ϕ(G) − 1)/4) · (1/(n + ϕ(G) − 1))
            = 17/12. ⊓⊔
5 Relation to the TSP 4/3 Conjecture
This section shows that the well known 4/3 conjecture for the metric TSP (due
to Cunningham (1986) and others) implies that there is a 4/3-approximation al-
gorithm for a minimum-size 2-ECSS, see Theorem 18. Almost all of the results
in this section are well known, except possibly Fact 13, see [1,3,5,7,11,13]. The
details are included to make the paper self-contained.
In the metric TSP (traveling salesman problem), we are given a complete
graph G′ = Kn and edge costs c′ that satisfy the triangle inequality (c′vw ≤
c′vu + c′uw , ∀v, w, u ∈ V ). The goal is to compute c′TSP , the minimum cost of a
Hamiltonian cycle.
Recall our 2-ECSS problem: Given a simple graph G = (V, E), compute ε(G),
the minimum size of a 2-edge connected spanning subgraph. Here is the multiedge
(or uncapacitated) version of our problem. Given G = (V, E) as above, compute
µ(G), the minimum size (counting multiplicities) of a 2-edge connected spanning
submultigraph H = (V, F ), where F is a multiset such that e ∈ F =⇒ e ∈ E.
(To give an analogy, if we take ε(G) to correspond to the f -factor problem, then
µ(G) corresponds to the f -matching problem.)
Proof. Let H = (V, F ) give the optimal solution for µ(G). If H uses two copies
of an edge vw, then we can replace one of the copies by some other edge of G
in the cut given by H − {vw, vw}. In other words, if S is the node set of one of
the two components of H − {vw, vw}, then we replace one copy of vw by some
edge from δG (S) − {vw}. ⊓⊔
Remark 14. The above is a lucky fact. It fails to generalize, both for minimum-
cost (rather than minimum-size) 2-ECSS, and for minimum-size k-ECSS, k ≥ 3.
Fact 15. Let G be a 2-edge connected graph, and let c assign unit costs to the
edges. The minimum cost of the TSP on the metric completion of G, c, satisfies
c′TSP ≥ µ(G) = ε(G).
c′TSP = minimize c′ · x
subject to x(δ(v)) = 2, ∀v ∈ V
x(δ(S)) ≥ 2, ∀S ⊂ V, ∅ ≠ S ≠ V
x ≥ 0,
x integral.
TSP 4/3 Conjecture. If c′ is a metric, then c′TSP ≤ (4/3) zST .
To derive the lower bound zST ≤ ε(G), we need a result of Goemans &
Bertsimas on the subtour LP, [7, Theorem 1]. In fact, a special case of this result
that appeared earlier in [11, Theorem 8] suffices for us.
Proposition 17 (Parsimonious property [7]). Consider the TSP on G′ =
(V, E ′ ), c′ , where G′ = K|V | . Assume that the edge costs c′ form a metric, i.e.,
c′ satisfies the triangle inequality. Then the optimal value of the subtour LP
remains the same even if the constraints {x(δ(v)) = 2, ∀v ∈ V } are omitted.
Note that this result does not apply to the subtour integer program given
above.
Let z2CUT denote the optimal value of the LP obtained from the subtour LP
by removing the constraints x(δ(v)) = 2 for all nodes v ∈ V . The above result
states that if c′ is a metric, then zST = z2CUT . Moreover, for a 2-edge connected
graph G and unit edge costs c = 1l, we have z2CUT ≤ µ(G) = ε(G), since µ(G) is
the optimal value of the integer program whose LP relaxation has optimal value
z2CUT . (Here, z2CUT is the optimal value of the LP on the metric completion of
G, c.) Then, by the parsimonious property, we have zST = z2CUT ≤ ε(G). The
main result in this section follows.
Theorem 18. Suppose that the TSP 4/3 conjecture holds. Then

zST ≤ ε(G) ≤ c′TSP ≤ (4/3) zST .

A 4/3-approximation of the minimum-size 2-ECSS is obtained by computing
(4/3) zST on the metric completion of G, c, where c = 1l.
Wolsey [13] proved that the TSP tour found by the Christofides heuristic achieves an approx-
imation guarantee of 1.5. Simpler proofs of this result based on Theorem 16 were
found later by Cunningham (see [11, Theorem 8]) and by Goemans & Bertsimas
[7, Theorem 4].
Consider the minimum-cost 2-ECSS problem on a 2-edge connected graph
G = (V, E) with nonnegative edge costs c. Let the minimum cost of a simple 2-
ECSS and of a multiedge 2-ECSS be denoted by cε and cµ , respectively. Clearly,
cε ≥ cµ . Even for the case of arbitrary nonnegative costs c, we know of no exam-
ple where cµ /zST > 7/6. There is an example G, c with cµ /zST ≥ 7/6. Take two copies of
K3 , call them C1 , C2 , and add three disjoint length-2 paths P1 , P2 , P3 between
C1 and C2 such that each node of C1 ∪ C2 has degree 3 in the resulting graph G.
In other words, G is obtained from the triangular prism C6 by subdividing once
each of the 3 “matching edges”. Assign a cost of 2 to each edge in C1 ∪ C2 , and
assign a cost of 1 to the remaining edges. Then cε = cµ = 14, as can be seen by
taking 2 edges from each of C1 , C2 , and all 6 edges of P1 ∪ P2 ∪ P3 . Moreover,
zST ≤ 12, as can be seen by taking xe = 1/2 for each of the 6 edges e in C1 ∪ C2 ,
and taking xe = 1 for the remaining 6 edges e in P1 ∪ P2 ∪ P3 .
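A quick Python check of these numbers (our illustration; the vertex names a1..a3, b1..b3 for the two triangles and m1..m3 for the path midpoints are ours):

    triangles = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
                 ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]    # cost 2 each
    paths = [("a1", "m1"), ("m1", "b1"), ("a2", "m2"),
             ("m2", "b2"), ("a3", "m3"), ("m3", "b3")]        # cost 1 each
    cost = {e: 2 for e in triangles}
    cost.update({e: 1 for e in paths})

    # The 2-ECSS described above: 2 edges from each triangle + all path edges.
    chosen = triangles[:2] + triangles[3:5] + paths
    print(sum(cost[e] for e in chosen))        # 14 = c_mu = c_epsilon

    # The fractional subtour point: 1/2 on triangle edges, 1 on path edges.
    x = {e: 0.5 for e in triangles}
    x.update({e: 1.0 for e in paths})
    print(sum(cost[e] * x[e] for e in x))      # 12 >= z_ST

The two printed values give the ratio 14/12 = 7/6.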
6 Conclusions
References
1. R. Carr and R. Ravi. A new bound for the 2-edge connected subgraph problem. In
R. E. Bixby, E. A. Boyd, and R. Z. Rı́os-Mercado, editors, Integer Programming
and Combinatorial Optimization: Proceedings of the 6th International Conference
on Integer Programming and Combinatorial Optimization, LNCS, Vol. 1412, pages
110–123. Springer, 1998. This volume.
2. J. Cheriyan and R. Thurimella. Approximating minimum-size k-connected span-
ning subgraphs via matching. Proc. 37th Annual IEEE Sympos. on Foundat. of
Comput. Sci., pages 292–301, 1996.
3. N. Christofides. Worst-case analysis of a new heuristic for the traveling salesman
problem. Technical report, G.S.I.A., Carnegie-Mellon Univ., Pittsburgh, PA, 1976.
4. A. Frank. Conservative weightings and ear-decompositions of graphs. Combinator-
ica, 13:65–81, 1993.
5. G. L. Frederickson and J. Ja’Ja’. On the relationship between the biconnectivity
augmentation and traveling salesman problems. Theor. Comp. Sci., 19:189–201,
1982.
6. N. Garg, V. S. Santosh, and A. Singla. Improved approximation algorithms for
biconnected subgraphs via better lower bounding techniques. Proc. 4th Annual
ACM-SIAM Symposium on Discrete Algorithms, pages 103–111, 1993.
7. M. X. Goemans and D. J. Bertsimas. Survivable networks, linear programming
relaxations and the parsimonious property. Mathematical Programming, 60:143–
166, 1993.
8. S. Khuller and U. Vishkin. Biconnectivity approximations and graph carvings.
Journal of the ACM, 41:214–235, 1994. Preliminary version in Proc. 24th Annual
ACM STOC, pages 759–770, 1992.
9. L. Lovász. A note on factor-critical graphs. Studia Sci. Math. Hungar., 7:279–280,
1972.
10. L. Lovász and M. D. Plummer. Matching Theory. Akadémiai Kiadó, Budapest,
1986.
11. C. L. Monma, B. S. Munson, and W. R. Pulleyblank. Minimum-weight two-
connected spanning networks. Mathematical Programming, 46:153–171, 1990.
12. H. Whitney. Nonseparable and planar graphs. Trans. Amer. Math. Soc., 34:339–
362, 1932.
13. L. A. Wolsey. Heuristic analysis, linear programming and branch and bound. Math-
ematical Programming Study, 13:121–134, 1980.
Multicuts in Unweighted Graphs with Bounded
Degree and Bounded Tree-Width
1 Introduction
Multicommodity Flow problems have been intensely studied for decades [7,11,9],
[13,15,17] because of their practical applications and also of the appealing hard-
ness of several of their versions. The fractional version of a Multicut problem
is the dual of a Multicommodity Flow problem and, therefore, Multicut is of
similar interest [3,9,10,13,20].
* Research supported in part by NSF grant CCR-9319106.
** Research partially supported by NSF grant CCR-9319106 and by FAPESP (Proc.
96/04505–2).
*** Research supported in part by ProNEx (MCT/FINEP) (Proj. 107/97) and FAPESP
(Proc. 96/12111–4).
Max SNP-hard in binary trees; therefore, letting the input graph be weighted makes
the problem harder. Finally, we show that Unweighted Edge Multicut is Max
SNP-hard if the input graphs are walls. Walls, to be formally defined in Sec-
tion 4, have degree at most three and there are walls with tree-width as large as
we wish. We conclude that letting the input graph have unbounded tree-width
makes the problem significantly harder.
In Section 2 we present the polynomial-time algorithm for Unrestricted Ver-
tex Multicut in trees and the polynomial-time approximation scheme for Unre-
stricted Vertex Multicut in bounded-tree-width graphs. In Section 3, we show the
approximation-preserving reduction from Edge Multicut to Unrestricted Vertex
Multicut. Finally, in Section 4 we present our hardness results.
Algorithm:
Input: a tree T .
Start with S = ∅.
Call a pair (si , ti ) in C active if it is not disconnected by S.
Traverse the tree in postorder.
Clearly the following invariant holds: all non-active pairs in C are discon-
nected by S. A pair in C that becomes non-active does not go back to active
since we never remove vertices from S. At the end of the algorithm, no pair in
C is active, meaning that S is a solution for the problem. For the minimality
of S, note that the paths joining si to ti in T for all marked pairs (si , ti ) form
a pairwise disjoint collection of paths. Any solution should contain at least one
vertex in each of these paths. But there are |S| marked paths, meaning that
any solution has at least |S| vertices. This implies that S is a minimum-size
solution. Besides, it is not hard to see that the algorithm can be implemented
in polynomial time.
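The rule applied during the postorder traversal can be read off this analysis (one vertex added per marked pair, with the marked paths pairwise disjoint): at each vertex v in postorder, if v is the least common ancestor of some pair not yet disconnected by S, add v to S and mark that pair. A Python sketch of this greedy (ours; names are illustrative):

    def tree_multicut(adj, root, pairs):
        # adj: dict node -> neighbors in the tree T; pairs: list of (s_i, t_i).
        parent, depth, post = {root: None}, {root: 0}, []
        stack = [(root, False)]
        while stack:
            v, done = stack.pop()
            if done:
                post.append(v)
                continue
            stack.append((v, True))
            for w in adj[v]:
                if w != parent[v]:
                    parent[w], depth[w] = v, depth[v] + 1
                    stack.append((w, False))

        def path_and_lca(s, t):
            a, b, verts = s, t, {s, t}
            while depth[a] > depth[b]:
                a = parent[a]
                verts.add(a)
            while depth[b] > depth[a]:
                b = parent[b]
                verts.add(b)
            while a != b:
                a, b = parent[a], parent[b]
                verts.add(a)
                verts.add(b)
            return verts, a                  # a is now the pair's LCA

        info = [path_and_lca(s, t) for s, t in pairs]
        S = set()
        for v in post:                       # postorder traversal
            for verts, lca in info:
                if lca == v and not (verts & S):
                    S.add(v)                 # v cuts this active pair; mark it
        return S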
Get(u, A), which returns a vertex u of T^{i−1} and a solution A for G^{i−1}(u) such
that |A| ≤ (1 + ε) opt(G^{i−1}(u)) and X_u^{i−1} ⊆ A. Then the algorithm starts a new
iteration with G^i = G^{i−1} \ G^{i−1}(u), Θ^i = (T^i, (X_w^i)_{w∈V(T^i)}), where T^i = T^{i−1} \
T^{i−1}(u) and X_w^i = X_w^{i−1} \ V(G^{i−1}(u)), for all w ∈ V(T^i), and S^i = S^{i−1} ∪ A.
The formal description of the algorithm appears in Figure 2.1.
Algorithm:
G^0 ← G;
Θ^0 ← Θ;
S^0 ← ∅;
i ← 1;
while G^{i−1} ≠ ∅ do
Get(u_i, A_i); /* |A_i| ≤ (1 + ε) opt(G^{i−1}(u_i)) and X_{u_i}^{i−1} ⊆ A_i */
G^i ← G^{i−1} \ G^{i−1}(u_i);
T^i ← T^{i−1} \ T^{i−1}(u_i);
X_w^i ← X_w^{i−1} \ V(G^{i−1}(u_i)), for each w ∈ V(T^i);
S^i ← S^{i−1} ∪ A_i;
i ← i + 1;
endwhile;
f ← i − 1;
output S^f.
We will postpone the description of Get (u, A) and, for now, assume that it
works correctly and in polynomial time. The next lemma states a property of
tree decompositions that we will use later.
of X_{u_i}^{i−1} in the segment of P from s to y. Since X_{u_i}^{i−1} ⊆ S^f, there is a vertex of
P in S^f. If y is not in G^{i−1} \ G^{i−1}(u_i), then y is not in G^{i−1}. This means y is in
G^{j−1}(u_j), for some j < i. Moreover, s is in G^{j−1} \ G^{j−1}(u_j) (because this is a
Lemma 3. |S^f| ≤ (1 + ε) opt(G).

|S^f| ≤ Σ_{i=1}^{f} |A_i| ≤ (1 + ε) Σ_{i=1}^{f} opt(G^{i−1}(u_i)) ≤ (1 + ε) opt(G),
3 Edge Multicut
In this section we show that Edge Multicut can be reduced to Unrestricted
Vertex Multicut by a reduction that preserves approximability.
The reduction has the following property. If the instance of Edge Multicut is
a graph with bounded degree and bounded tree-width, then the corresponding
instance of Unrestricted Vertex Multicut has bounded tree-width.
Given a graph G = (V, E), the line graph of G is the graph whose vertex set
is E and such that two of its vertices (edges of G) are adjacent if they share an
endpoint in G. In other words, the line graph of G is the graph (E, L), where
L = {ef : e, f ∈ E and e and f have a common endpoint}.
Consider an instance of Edge Multicut, that is, a graph G = (V, E) and a set C
of pairs of distinct vertices of G. Let us describe the corresponding instance of
Unrestricted Vertex Multicut. The input graph for Unrestricted Vertex Multicut
is the line graph of G, denoted by G′. Now let us describe the set C′ of pairs of
vertices of G′. For each pair (s, t) in C, we have in C′ all pairs (e, f) such that e
has s as endpoint and f has t as endpoint.
Clearly G′ can be obtained from G in polynomial time. Note that C′ has at
most k∆^2 pairs, where k = |C| and ∆ is the maximum degree of G. Also C′ can
be obtained from G and C in polynomial time.
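A direct Python sketch of this construction (ours; edges are represented as frozensets of their two endpoints):

    from itertools import combinations

    def line_graph_instance(edges, pairs):
        # Build (G', C') from the Edge Multicut instance (G, C).
        edges = [frozenset(e) for e in edges]
        gprime = {e: set() for e in edges}         # the line graph of G
        for e, f in combinations(edges, 2):
            if e & f:                              # share an endpoint in G
                gprime[e].add(f)
                gprime[f].add(e)
        incident = {}
        for e in edges:
            for v in e:
                incident.setdefault(v, []).append(e)
        # For each (s, t) in C, all pairs (e, f) with s in e and t in f.
        cprime = [(e, f) for s, t in pairs
                  for e in incident[s] for f in incident[t]]
        return gprime, cprime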
The following theorem completes the reduction.
by the description of Edge Multicut, s ≠ t.) Let e be the first edge of P and f
the last one (possibly e = f). Clearly s is incident to e, and t to f. Thus (e, f) is a
pair in C′. Corresponding to P, there is a path P′ in G′ from e to f containing
as vertices all edges of P. Since S is a solution for Unrestricted Vertex Multicut
in G′ and (e, f) is in C′, S must contain a vertex of P′. Therefore there is an
edge of P in S, which implies that S is a solution of Edge Multicut in G.
The next lemma shows the previously mentioned property of this reduction.
Lemma 5. If G has bounded degree and bounded tree-width, then the line graph
of G has bounded tree-width.
In fact we know how to implement the PTAS given in Section 2.1, for Edge
Multicut in bounded-degree trees, in time O((n + k)⌈1/ε⌉ d^{⌈1/ε⌉+2}), where n
is the number of vertices of the tree, k is the number of (s_i, t_i) pairs, d is the
maximum degree of the tree, and 1 + ε is the desired approximation ratio of the
algorithm. The size of the input is Θ(n + k). We omit the description of this
linear-time implementation in this extended abstract.
4 Complexity Results
Fig. 2. (a) The gadget for variable xi . (b) The gadget for clause Cj =
{x1 , x2 , x3 }. (c) Tree T built for the instance Φ = (x1 ∨ x2 ∨ x3 ) ∧ (x1 ∨ x2 ∨ x3 ),
that is, C1 = {x1 , x2 , x3 } and C2 = {x1 , x2 , x3 }.
1. For each variable x_i, S contains the edge in the gadget for x_i incident to the
leaf labeled x_i if x_i = TRUE, or to the leaf labeled x̄_i if x_i = FALSE.
2. For each clause C_j, S contains two distinct edges in the gadget for C_j. These
edges are such that (1) they disconnect the two pairs in the gadget, and (2)
the only leaf that is still connected to the root of the gadget is a leaf with
a label x̃_i ∈ C_j such that x̃_i = TRUE. (The four possible choices for the two
edges are shown in Figure 3.)
Fig. 3. Possible choices of two edges, the dashed edges, in the gadget for a clause
that leave exactly one leaf (the marked leaf v) connected to the root r.
For each clause C_j, there is exactly one leaf v in the gadget for C_j that is
connected to the root r of the gadget. Let x̃_i ∈ {x_i, x̄_i} be the label for this leaf.
There is a pair formed by this leaf v and the leaf in the gadget for xi whose label
is x̃i . In S, there must be an edge e in the path between these two leaves. Since
leaf v is connected to the root r of the gadget for Cj and all edges in S are either
in a variable gadget or in a clause gadget, this edge e has to be in the variable
gadget. This means e is the edge incident to the leaf labeled x̃i in the gadget
for xi . Hence x̃i =TRUE, and the clause is satisfied. Since this holds for all the
clauses, the given assignment makes Φ TRUE, implying that Φ is satisfiable.
We omit the proof. The construction is similar to the one used in Theorem 7.
Proof sketch. Let us reduce Edge Multicut in stars to Weighted Edge Multicut
in binary trees. From an instance of the Unweighted Edge Multicut restricted
to stars, we construct an instance of the Weighted Edge Multicut restricted to
binary trees in the following way: for each leaf of the star S, there is a corre-
sponding leaf in the binary tree T . The pairs are the same (we may assume there
is no pair involving the root of the star). We connect the leaves of T arbitrarily
into a binary tree. The edges in T incident to the leaves get weight one and all
other edges of T get weight 2n + 1, where n is the number of leaves in the star
S (which is the same as the number of leaves in the tree T we construct). Any
solution within twice the optimum for the Weighted Edge Multicut instance we
constructed will contain only edges of T incident to the leaves, since any other
edge is too heavy (removing all edges incident to the leaves, we get a solution of
weight n). Then it is easy to see that any optimal solution for the Weighted Edge
Multicut instance we constructed corresponds to an optimal solution for the orig-
inal Unweighted Multicut star instance, and vice versa. Also approximability is
preserved by this reduction.
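A hypothetical rendering of this construction in Python (leaves 0..n−1 mirror the star's leaves; internal node ids n, n+1, ... are fresh):

def star_to_weighted_binary_tree(n, pairs):
    """Each edge incident to a leaf gets weight 1; every internal edge gets
    the prohibitive weight 2n + 1; the pairs carry over unchanged."""
    heavy = 2 * n + 1
    edges = []                     # (child, parent, weight)
    roots = list(range(n))         # current roots of the forest being merged
    nxt = n                        # next fresh internal node id
    while len(roots) > 1:
        a, b, roots = roots[0], roots[1], roots[2:]
        edges.append((a, nxt, 1 if a < n else heavy))
        edges.append((b, nxt, 1 if b < n else heavy))
        roots.append(nxt)
        nxt += 1
    return edges, pairs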
A wall of height h consists of h + 1 vertex disjoint paths R0 , . . . , Rh , which
we call rows, and h + 1 vertex disjoint paths L0 , . . . , Lh , which we call columns.
A wall of height six is depicted in Figure 4 (a). The reader should be able to
complete the definition by considering Figure 4 (a). The formal definition is as
follows. Each row is a path on 2h + 2 vertices, and each column is a path on
2h + 2 vertices. Column r contains the (2r + 1)st and the (2r + 2)nd vertices of all rows,
as well as the edge between them. For i < h and even, each L_r contains an edge
between the (2r + 2)nd vertex of R_i and the (2r + 2)nd vertex of R_{i+1}. For i < h
and odd, each L_r contains an edge between the (2r + 1)st vertex of R_i and the
(2r + 1)st vertex of R_{i+1}. These are all the edges of the wall.

Fig. 4. (a) A wall of height six. The dark edges indicate row R_i and column L_r.
(b) The last three rows of the wall built from Φ = (x1 ∨ x2 ∨ x3)(x1 ∨ x2 ∨ x3).
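As a sanity check on the indexing (1-based in the definition, 0-based below), a wall can be generated by the following sketch:

def wall(h):
    """Wall of height h: rows R_0..R_h are paths on 2h + 2 vertices; the
    vertical edge of column L_r below row R_i sits at the (2r+2)nd vertex
    (0-based 2r+1) when i is even and at the (2r+1)st (0-based 2r) when
    i is odd."""
    V = {(i, j) for i in range(h + 1) for j in range(2 * h + 2)}
    E = {((i, j), (i, j + 1))
         for i in range(h + 1) for j in range(2 * h + 1)}   # row edges
    for i in range(h):                                      # vertical edges
        for r in range(h + 1):
            j = 2 * r + 1 if i % 2 == 0 else 2 * r
            E.add(((i, j), (i + 1, j)))
    return V, E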
We prove that Edge, Vertex and Unrestricted Vertex Multicut are Max SNP-
hard in walls. This means, by Arora et al. [1], that there is a constant ε > 0
such that the existence of a polynomial-time approximation algorithm for any
of the three versions of Multicut with performance ratio at most 1 + ε implies
that P = NP.
As in [18], we use the concept of L-reduction, which is a special kind of
reduction that preserves approximability.
Let A and B be two optimization problems. We say A L-reduces to B if there
are two polynomial-time algorithms f and g, and positive constants α and β,
such that for each instance I of A,
1. Algorithm f produces an instance I′ = f(I) of B, such that the optima
of I and I′, of costs denoted Opt_A(I) and Opt_B(I′) respectively, satisfy
Opt_B(I′) ≤ α · Opt_A(I), and
2. Given any feasible solution of I′ with cost c′, algorithm g produces a solution
of I with cost c such that |c − Opt_A(I)| ≤ β · |c′ − Opt_B(I′)|.
Theorem 12. Edge, Vertex and Unrestricted Vertex Multicut are Max SNP-
hard in walls.
Proof sketch. The reduction is from the well-known Max SNP-hard problem
MAX 3-SAT [18]. We show the reduction for Unrestricted Vertex Multicut. The
other two reductions are similar.
The first part of the L-reduction is the polynomial-time algorithm f and
the constant α. Given any instance Φ of MAX 3-SAT, f produces an instance
W, C of Unrestricted Vertex Multicut such that W is a wall. Also, the cost of
Acknowledgments
The first two authors would like to thank Howard Karloff for suggesting the
problem, and for some helpful discussions.
References
20. É. Tardos and V. V. Vazirani. Improved bounds for the max-flow min-multicut
ratio for planar and K_{r,r}-free graphs. Information Processing Letters, 47(2):77–80,
1993.
21. J. van Leeuwen. Graph algorithms. Handbook of Theoretical Computer Science,
Vol. A, chapter 10, pages 525–631. The MIT Press/Elsevier, 1990.
Approximating Disjoint-Path Problems Using Greedy Algorithms and Packing Integer Programs⋆

⋆ Research partly supported by NSF Award CCR-9308701 and NSF Career Award CCR-9624828.
1 Introduction
This paper examines approximation algorithms for disjoint-path problems and
their generalizations. In the edge- (vertex-) disjoint path problem, we are given a
graph G = (V, E) and a set T of connection requests, also called commodities.
Every connection request in T is a vertex pair (s_i, t_i), 1 ≤ i ≤ K. The objective
is to connect a maximum number of the pairs via edge- (vertex-) disjoint paths.
For the vertex-disjoint paths problem, the connection requests are assumed to be
disjoint. We call the set of connected pairs realizable. A generalization of the edge-
disjoint paths problem is multiple-source unsplittable flow. In this problem every
commodity k in the set T has an associated demand ρk , and every edge e has a
capacity ue . The demand ρk must be routed on a single path from sk to tk . The
objective is to maximize the sum of the demands that can be fully routed while
respecting the capacity constraints. Wlog, we assume that max_k ρ_k = 1 and,
following the standard definition of the problem in the literature, u_e ≥ 1, ∀e ∈
E. When all demands and capacities are 1 in the multiple-source unsplittable
flow problem we obtain the edge-disjoint path problem. (See [10,14] for further
applications and motivation for unsplittable flow.) In all the above problems
one can assign a weight wi ≤ 1 to each connection request and seek to find a
realizable set of maximum total weight. In this paper we will state explicitly
when we deal with the weighted version of a problem.
Both the edge- and vertex-disjoint path problems are fundamental, exten-
sively studied (see e.g. [26,6,27,21,10,13,3]), NP-hard problems [9], with a mul-
titude of applications in areas such as telecommunications, VLSI and schedul-
ing. Despite the attention they have received, disjoint-path problems on general
graphs remain notoriously hard in terms of approximation; even for edge-disjoint
paths, no algorithm is known that can find even an ω(1/√|E|) fraction of the
realizable paths.
In approximating these problems, we use the traditional notion of a ρ-approxi-
mation algorithm, ρ > 1, which is one that outputs, in polynomial time, a real-
izable set of size at least 1/ρ times the optimum. We will also give and refer to
algorithms which output a realizable set whose size is a non-linear function of
the optimum OPT, such as OPT^2/|E|.
Overview of Previous Work. Two main approaches have been followed for ap-
proximation.
(i) The first approach, which we call the rounding approach, consists of solving a
fractional relaxation and then use rounding techniques to obtain an integral solu-
tion. The fractional relaxation is typically multicommodity flow and the rounding
techniques used to date involved sophisticated and non-standard use of random-
ized rounding [31]. The objective value of the resulting solution is compared to
the fractional optimum y ∗ , which is an upper bound on the integral optimum,
OPT. This approach has been the more successful one and recently yielded the
first approximation algorithm for uniform unsplittable flow [31] which is the spe-
cial case of unsplittable flow where all the capacities have the same value. Let d
denote the dilation of the fractional solution, i.e. the maximum length of a flow
path in the fractional relaxation. Bounds that rely on the dilation are particu-
larly appealing for expander graphs where it is known that d = O(polylog(n))
[16,12]. The rounding approach yields, for unweighted uniform unsplittable flow
(and thus for unweighted edge-disjoint paths as well) a realizable set of size
Ω(max{(y^*)^2/|E|, y^*/√|E|, y^*/d}) and an Ω(max{(y^*)^2/|E|, y^*/d}) bound for
the weighted version [31]. This approach is known to have limitations, e.g.
it is known that a gap of Ω(√|V|) exists between the fractional and integral
optima for both the edge- and vertex-disjoint path problems on a graph with
|E| = Θ(|V|) [7].
(ii) Under the second approach, which we call the routing approach, a commodity
is never split, i.e. routed fractionally along more than one path during the course
of the algorithm. In the analysis, the objective value of the solution is compared
to an estimated upper bound on OPT. This approach has found very limited
applicability so far, one reason being the perceived hardness of deriving upper
bounds on OP T without resorting to a fractional relaxation. The only example
of this method we are aware of is the on-line Bounded Greedy Algorithm in
[10], whose approximation guarantee depends also on the diameter of the graph.
The algorithm can be easily modified into an off-line procedure that outputs
realizable sets of size Ω(OPT/√|E|) (Ω(OPT/√|V|)) for edge- (vertex-) disjoint
paths. The Ω(OPT/√|V|) bound is the best known bound to date for vertex-
disjoint paths.
We turn to the rounding approach (approach (i)) to handle the weighted dis-
joint path and unsplittable flow problems. We propose the use of packing integer
programs as a unifying framework that abstracts away the need for customized
and complex randomized rounding schemes. A packing integer program is of the
form: maximize c^T·x subject to Ax ≤ b, where A, b, c ≥ 0. We first develop, as part
of our tools, an improved approximation algorithm for a class of packing integer
programs, called column restricted, that are relevant to unsplittable flow prob-
lems. Armed with both this new algorithm and existing algorithms for general
packing integer programs, we show how packing formulations both provide a
unified and simplified derivation of many results from [31] and lead to new ones.
In particular, we obtain the first approximation algorithm for weighted multiple-
source unsplittable flow on networks with arbitrary demands and capacities and
the first approximation algorithm for weighted vertex-disjoint paths. Further, we
believe that our new algorithm for column-restricted packing integer programs
is of independent interest. We now elaborate on our results under the rounding
approach, providing further background as necessary.
Packing integer programs are a well-studied class of integer programs that can
model several NP-complete problems, including independent set, hypergraph
k-matching [19,1], job-shop scheduling [23,28,33,20] and many flow and path
related problems. Many of these problems seem to be difficult to approximate,
and not much is known about their worst-case approximation ratios. Following
[30] a packing integer program (PIP) is defined as follows.
Definition 1. Given A ∈ [0, 1]^{m×n}, b ∈ [1, ∞)^m and c ∈ [0, 1]^n with max_j c_j = 1,
a PIP P = (A, b, c) seeks to maximize c^T·x subject to x ∈ Z_+^n and Ax ≤ b.
Constraints of the form 0 ≤ x_j ≤ d_j are also allowed. If A ∈ {0, 1}^{m×n}, each
entry of b is assumed integral. Let B = min_i b_i, and let α be the maximum number
of non-zero entries in any column of A.
The parameters B and α in the definition above appear in the approximation
bounds. For convenience we call bi the capacity of row i. The restrictions on the
values of the entries of A, b, c are wlog; the values in an arbitrary packing program
can be scaled to satisfy the above requirements [29]. We will state explicitly when
some packing program in this paper deviates from these requirements. When
A ∈ {0, 1}^{m×n}, we say that we have a (0, 1)-PIP.
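To fix ideas, the generic rounding paradigm for a PIP, namely solving the LP relaxation, scaling, and rounding each coordinate independently, can be sketched as below using scipy's linprog; this is plain randomized rounding, not the customized schemes of [31], and the alteration step that restores feasibility is omitted.

import numpy as np
from scipy.optimize import linprog

def round_pip(A, b, c, scale=0.5, seed=0):
    """Sketch: maximize c.x subject to Ax <= b, x >= 0 fractionally, then
    randomly round the scaled fractional optimum coordinate by coordinate."""
    n = A.shape[1]
    lp = linprog(-np.asarray(c), A_ub=A, b_ub=b, bounds=[(0, None)] * n)
    x_star = lp.x                            # fractional optimum, value y*
    rng = np.random.default_rng(seed)
    base = np.floor(scale * x_star)
    frac = scale * x_star - base
    return base + (rng.random(n) < frac)     # randomized rounding of x*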
New Results for Column-Restricted PIP's. The above results show that for various
combinations of values for y^*, m and B, the bounds obtained for a (0, 1)-PIP
are significantly better than those for general PIP's. In fact they are always better
when y^* < m. As another example, the approximation ratio m^{1/(B+1)} obtained
for a (0, 1)-PIP is polynomially better than the approximation ratio of a PIP with
the same parameters. Thus it is natural to ask whether we can bridge this gap.
We make progress in this direction by defining a column-restricted PIP Pr as one
where all non-zero entries of the j-th column of A have the same value ρj ≤ 1.
Column-restricted PIP’s arise in applications such as unsplittable flow problems
(see next section). We show how to obtain approximation guarantees for column-
restricted PIP's that are similar to the ones obtained for (0, 1)-PIP's. Let y_r^* denote
the optimum of the linear relaxation of P_r. We obtain an integral solution
of value Ω(y_r^*/m^{1/(B+1)}) and Ω(y_r^*/α^{1/B}). Letting σ(y_r^*) = Ω(y_r^*(y_r^*/m)^{1/B}), we
also obtain a bound that is at least as good as σ(y_r^*) for y_r^* < m log n, and in any
case it is never worse by more than an O(log^{1/B} n) factor. Finally we show how to
improve upon the stated approximations when max_j ρ_j is bounded away from 1.
We develop the latter two results in a more complete version of this paper [15].
We now give an overview of our technique. First we find an optimum solution
x∗ to the linear relaxation of the column-restricted PIP Pr . We partition the ρj ’s
into a fixed number of intervals according to their values and generate a packing
subproblem for each range. In a packing subproblem P L corresponding to range
L, we only include the columns of A with ρj ∈ L and to each component of
the bL -vector we allocate only a fraction of the original bi value, a fraction
our usage of packing in the rounding algorithms we have assumed that the pa-
rameter B of the packing program is equal to 1. Allowing B > 1 is equivalent to
allowing congestion B in the corresponding disjoint-path problem. Thus another
advantage of the packing approach is that tradeoffs with the allowed congestion
B can be obtained immediately by plugging in B in the packing algorithms that
we use as a black box. For example the approximation for edge-disjoint paths
becomes Ω(max{y^*(y^*/(|E| log |E|))^{1/B}, y^*/|E|^{1/(B+1)}, y^*/d^{1/B}}) when the number
of connection requests is O(|E|). Our congestion tradeoffs generalize previous
work by Srinivasan [31], who showed the Ω(y^*/d^{1/B}) tradeoff for uniform-capacity
unsplittable flow. We do not state the tradeoffs explicitly for the various
problems since they can be obtained easily by simple modifications to the given
algorithms.
Proof. Transform the given system P to a (0, 1)-PIP P′ = (A′, b′, c) where b′_i =
k_i Γ, and A′_ij = A_ij/ρ. Every feasible solution (either fractional or integral) x̄ to
P′ is a feasible solution to P and vice versa. Therefore the fractional optimum y^*
is the same for both programs. Also the maximum number of non-zero entries in
any column is the same for A and A′, so we can unambiguously use α for both.
We have assumed that there is an approximation algorithm for P′ returning a
solution with objective value σ(m, kΓ, α, y^*). Invoking this algorithm completes
the proof. ⊓⊔
The proof of the following lemma generalizes that of Lemma 4.1 in [11].

Lemma 1. Let P = (A, b, c) be a column-restricted PIP with column values in
the interval (a_1, a_2], and b_i ≥ Γa_2 for all i and some number Γ ≥ 1. Here min_i b_i
is not necessarily greater than 1. There is an algorithm α_Packing that finds a
solution g to P of value at least σ(m, Γ, α, (a_1/2a_2)y^*), where y^* is the optimum of
the fractional relaxation of P. The algorithm runs in polynomial time.
Proof sketch. We sketch the algorithm α_Packing. Obtain a PIP P′ = (A′, b′, c)
from P as follows. Round down b_i to the nearest multiple of Γa_2 and then
multiply it by a_1/a_2. Set b′_i equal to the resulting value. Every b′_i is now
between a_1/2a_2 and a_1/a_2 times the corresponding b_i. Set A′_ij to a_1 if A_ij ≠ 0
and to 0 otherwise. P′ thus has a fractional solution of value at least (a_1/2a_2)y^*
that can be obtained by scaling down the optimal fractional solution of P. Note
that every b′_i is a multiple of Γa_1. Thus we can invoke Theorem 1 and find
a solution g′ to P′ of value at least σ(m, Γ, α, (a_1/2a_2)y^*). Scaling up every
component of g′ by a factor of at most a_2/a_1 yields a vector g that is feasible
for P and has value at least σ(m, Γ, α, (a_1/2a_2)y^*). ⊓⊔
Two tasks remain. First, we must show that the vector x output by the
algorithm is a feasible solution to the original packing problem P. Second, we
must lower bound c^T·x in terms of the optimum y_* = c^T·x^* of the fractional
relaxation of P. Let y_*^{r_1,r_2} = c^{r_1,r_2}·x_*^{r_1,r_2}, and if r_1 = (1/4)^λ and r_2 = (1/4)^{λ−1}
then we abbreviate y_*^{r_1,r_2} as y_*^λ. We examine first the vector x̂.

Lemma 3. Algorithm Column_Partition runs in polynomial time and the n-vector
x̂ it outputs is a feasible solution to P of value at least Σ_{λ=2}^{ξ−1} σ(m, B, α, (1/16)y_*^λ).
Proof sketch. Let ŷ^λ be the optimal solution to the linear relaxation of P^λ. By
Lemma 1, the value of the solution x^λ, 2 ≤ λ ≤ ξ − 1, found at Step 4 is at
Proof. Each of the three vectors ẋ, x̂, x̄ solves a packing problem with column
values lying in (0, 1/n^k], (1/n^k, 1/4] and (1/4, 1] respectively. Let Ṗ, P̂, P̄ be
the three induced packing problems. The optimal solution to the linear relax-
ation of at least one of them will have value at least 1/3 of the optimal so-
lution to the linear relaxation of P. It remains to lower bound the approxi-
mation achieved by each of the three vectors on its corresponding domain of
column values. Since m, B, and α are fixed for all subproblems, note that σ
is a function of one variable, y^*. Vector ẋ solves Ṗ optimally. The solution is
feasible since all column values are less than 1/n^k and thus the value of the
left-hand side of any packing constraint cannot exceed 1. By Lemma 1, vector
x̄ gives a solution to P̄ of value at least σ(m, B, α, (1/4)y_*^{1/4,1}). For P̂, the
value of the solution output is given by the sum A = Σ_{λ=2}^{ξ−1} σ(m, B, α, (1/16)y_*^λ)
in Lemma 3. We distinguish two cases. If σ is a function linear in y_*, then
A ≥ σ(m, B, α, (1/16)y_*^{n^{-k},4^{-1}}). If σ is a function convex in y_*, the sum A is
minimized when all the terms y_*^λ are equal to Θ(y_*^{n^{-k},4^{-1}}/log n). Instantiating
σ(y_*) with the function Ω(max{y_*/m^{1/(B+1)}, y_*/α^{1/B}}) in the linear case and
with Ω(y_*(y_*/m)^{1/B}) in the convex case completes the proof. ⊓⊔
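The grouping step of Column_Partition can be sketched as follows (only the partition is shown; the per-range capacity allocation and the calls to α_Packing are omitted):

import math

def partition_columns(rho, k, n):
    """Group the column values rho_j into the ranges used above: (0, 1/n^k]
    (solved exactly by x-dot), the windows ((1/4)^lam, (1/4)^(lam-1)] for
    lam >= 2 (the subproblems P^lam), and (1/4, 1] (handled by x-bar)."""
    tiny, windows, big = [], {}, []
    for j, r in enumerate(rho):
        if r <= n ** (-k):
            tiny.append(j)
        elif r > 0.25:
            big.append(j)
        else:
            # (1/4)^lam < r <= (1/4)^(lam-1)
            lam = math.floor(math.log(r, 0.25)) + 1
            windows.setdefault(lam, []).append(j)
    return tiny, windows, big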
Our approach for weighted unsplittable flow consists of finding first the optimum
of the fractional relaxation, i.e. weighted multicommodity flow, which can be
solved in polynomial time via linear programming. The relaxation consists of
allowing commodity k to be shipped along more than one path. Call these paths
the fractional paths. We round in two stages. In the first stage we select at most
one of the fractional paths for each commodity, at the expense of congestion,
i.e. some capacities may be violated. In addition, some commodities may not
be routed at all. In the second stage, among the commodities routed during the
first stage, we select those that will ultimately be routed while respecting the
capacity constraints. It is in this last stage that a column-restricted PIP is used.
We introduce some terminology before giving the algorithm. A routing is
a set of s_{k_i}-t_{k_i} paths P_{k_i}, used to route ρ_{k_i} amount of flow from s_{k_i} to t_{k_i}
for each (s_{k_i}, t_{k_i}) ∈ I ⊆ T. Given a routing g, the flow g_e through edge e is
equal to Σ_{P_i ∈ g, P_i ∋ e} ρ_i. A routing g for which g_e ≤ u_e for every edge e is an
unsplittable flow. A fractional routing is one where for commodity k (i) the flow
is split along potentially many paths and (ii) a demand f_k ≤ ρ_k is routed. A fractional
routing corresponds thus to standard multicommodity flow. A fractional single-
path routing is one where the flow for a commodity is shipped on one path if at
all, but only a fraction fk ≤ ρk of the demand is shipped for commodity k. The
value of a routing g is the weighted sum of the demands routed in g.
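The fractional paths that the first stage selects among can be recovered from a multicommodity flow by standard path decomposition; here is a sketch for one commodity, assuming the flow has been made acyclic:

def decompose_flow(flow, s, t, eps=1e-12):
    """Strip s-t paths off an acyclic fractional flow (a dict mapping
    directed edges (u, v) to flow values), returning (path, amount) pairs."""
    paths = []
    while True:
        path, u = [], s
        while u != t:                     # follow positive-flow edges
            e = next(((a, v) for (a, v), f in flow.items()
                      if a == u and f > eps), None)
            if e is None:                 # no s-t flow left to strip
                return paths
            path.append(e)
            u = e[1]
        amount = min(flow[e] for e in path)
        paths.append(([s] + [e[1] for e in path], amount))
        for e in path:
            flow[e] -= amount             # zeroes at least one edge per pass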
The construction in the proof of Theorem 4 can also be used to give an
Ω(max{α_f(T)/√|E|, (α_f(T))^2/(|E| log |E|)}) bound in the case where the
number of commodities |T| = O(|E|). We omit the details in this version.
Theorem 6. Given a graph G = (V, E) and c ∈ [0, 1]^{|V|} a weight vector on the
vertices, there exists a polynomial-time algorithm that outputs an independent set
in the square G^2 = (V, E^2) of G of weight Ω(max{y^*/√|V|, (y^*)^2/|V|, y^*/Δ}).
Here y^* denotes the optimum of a fractional relaxation and Δ is the maximum
vertex degree in G.
We now give improved bounds on the size of the output realizable set.
Theorem 9. Algorithm Greedy_Path outputs a solution to an edge-disjoint
path problem (G = (V, E), T) of size Ω(OPT^2/|E_o|), where E_o ⊆ E is the set of
edges used by the paths in an optimal solution.
Proof. Let t be the total number of iterations of Greedy_Path and A_i be the set
A at the end of the i-th iteration. Let O be an optimal set of paths. We say that a
path P_x hits a path P_y if P_x and P_y share an edge. We define the set O \ A as the
paths in O that correspond to commodities not routed in A. Let P_i be the path
added to A at the i-th iteration of the algorithm. If P_i hits k_i paths in O \ A_i that
are not hit by a path in A_{i−1}, then P_i must have length at least k_i. In turn, each
of the paths hit has length at least k_i, otherwise it would have been selected by
the algorithm instead of P_i. Furthermore, all paths in O are edge-disjoint with
total number of edges |E_o|. Therefore Σ_{i=1}^{t} k_i^2 ≤ |E_o|. Applying the Cauchy-
Schwarz inequality on the left-hand side we obtain that (Σ_{i=1}^{t} k_i)^2/t ≤ |E_o|. But
Σ_{i=1}^{t} k_i = |O \ A_t|, since upon termination of the algorithm all paths in O \ A_t
must have been hit by some path in A_t. We obtain |O \ A_t|^2/t ≤ |E_o|. Wlog we can
assume that |A_t| = o(|O|), since otherwise Greedy_Path obtains a constant-
factor approximation. It follows that t = Ω(|O|^2/|E_o|) = Ω(OPT^2/|E_o|). ⊓⊔
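The paper does not restate Greedy_Path here, but the length argument in the proof suggests a shortest-path-first greedy; the following is a plausible rendering for illustration, not necessarily the authors' exact procedure.

from collections import deque

def bfs_path(adj, s, t):
    """Shortest s-t path by breadth-first search, or None if unreachable."""
    parent, queue = {s: None}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

def greedy_path(adj, pairs):
    """Repeatedly route the terminal pair with the currently shortest path,
    then delete that path's edges.  adj: dict vertex -> set of neighbours."""
    routed = []
    while True:
        best = None
        for s, t in pairs:
            p = bfs_path(adj, s, t)
            if p is not None and (best is None or len(p) < len(best[1])):
                best = ((s, t), p)
        if best is None:
            return routed
        routed.append(best)
        pairs = [q for q in pairs if q != best[0]]
        for u, v in zip(best[1], best[1][1:]):   # consume the path's edges
            adj[u].discard(v)
            adj[v].discard(u)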
References
1. R. Aharoni, P. Erdős, and N. Linial. Optima of dual integer linear programs.
Combinatorica, 8:13–20, 1988.
2. Y. Aumann and Y. Rabani. Improved bounds for all-optical routing. In Proc. 6th
ACM-SIAM Symp. on Discrete Algorithms, pages 567–576, 1995.
3. A. Z. Broder, A. M. Frieze, and E. Upfal. Static and dynamic path selection on
expander graphs: a random walk approach. In Proc. 29th Ann. ACM Symp. on
Theory of Computing, pages 531–539, 1997.
4. C. Cooper. The threshold of Hamilton cycles in the square of a random graph.
Random Structures and Algorithms, 5:25–31, 1994.
5. H. Fleischner. The square of every two-connected graph is Hamiltonian. J. of
Combinatorial Theory B, 16:29–34, 1974.
6. A. Frank. Packing paths, cuts and circuits – A survey. In B. Korte, L. Lovász, H. J.
Prömel, and A. Schrijver, editors, Paths, Flows and VLSI-Layout, pages 49–100.
Springer-Verlag, Berlin, 1990.
7. N. Garg, V. Vazirani, and M. Yannakakis. Primal-dual approximation algorithms
for integral flow and multicut in trees. Algorithmica, 18:3–20, 1997.
8. R. M. Karp, F. T. Leighton, R. L. Rivest, C. D. Thompson, U. V. Vazirani, and
V. V. Vazirani. Global wire routing in two-dimensional arrays. Algorithmica,
2:113–129, 1987.
Approximation Algorithms for the Mixed Postman Problem

Balaji Raghavachari and Jeyakesavan Veerasamy

1 Introduction

Applications like mail delivery, snow removal, and trash pick-up can be modeled
as instances of the mixed postman problem (MPP), and hence it is important to
design good approximation algorithms for this problem.
Previous Work: Numerous articles have appeared in the literature over
the past three decades about the mixed postman problem. Edmonds and John-
son [3], and Christofides [2] presented the first approximation algorithms. Fred-
erickson [7] showed that the algorithm of [3] finds a tour whose length is at
most 2 times the length of an optimal tour (i.e., approximation ratio of 2).
He also presented a mixed strategy algorithm, which used the solutions output
by two different heuristics, and then selected the shorter of the two tours. He
proved that the approximation ratio of the mixed strategy algorithm is 5/3. Com-
prehensive surveys are available on postman problems [1,4]. Integer and linear
programming formulations of postman problems have generated a lot of interest
in recent years [10,12,16]. Ralphs [16] showed that a linear relaxation of MPP
has optimal solutions that are half-integral. One could use this to derive a 2-
approximation algorithm for the problem. Improvements in implementation are
discussed in [12,15]. It is interesting to note that Nobert and Picard [12] state
that their implementation has been used for scheduling snow removal in Mon-
treal. Several other articles have appeared on generalized postman problems,
such as k-CPP [14] and the Windy Postman problem [4].
Our Results: Even though numerous articles have appeared in the litera-
ture on MPP after Frederickson’s paper in 1979, his result has been the best
approximation algorithm for MPP in terms of proven worst-case ratio until now.
In this paper, we present an improved approximation algorithm for MPP with an
approximation ratio of 3/2. We study the properties of feasible solutions to MPP,
and derive a new lower bound on the cost of an optimal solution. Our algorithm
uses a subtle modification of Frederickson’s algorithm, and the improved perfor-
mance ratio is derived from the new lower bound. We present examples showing
that our analysis is tight.
2 Preliminaries
The above conditions can be checked in polynomial time using algorithms for
the maximum flow problem [6]. In other words, we can decide in polynomial time
whether a given mixed graph is Eulerian. The problem we are interested in is to
find a set of additional edges and arcs of minimum total cost that can be added
to G to make it Eulerian; this problem is NP-hard. In the process of solving the
mixed postman problem, arcs and edges may be duplicated. For convenience, we
may also orient some undirected edges by giving them a direction. The output
of our algorithm is an Eulerian graph H that contains the input graph G as a
subgraph. So each edge of H can be classified either as an original edge or as
a duplicated edge. Also, each arc of H is either an original arc, a duplicated
arc, an oriented edge, or a duplicated and oriented edge.
[Fig.: flowchart of Frederickson's Mixed algorithm: Mixed1 runs Evendegree, Inoutdegree, and Evenparity; Mixed2 runs Inoutdegree and Largecycles; the cheaper of the two solutions is selected.]
Since Largecycles disregards the arcs of the graph, no further steps are
needed. The Mixed algorithm outputs the better of the two heuristics' solutions,
and Frederickson showed its performance ratio is at most 5/3.
The following lemmas show how alternating cycles of Gbr can be used to
switch between different Inout solutions.
Lemma 1. The arcs of Gbr can be decomposed into alternating cycles.
Proof. This proof is based on the properties of Eulerian directed graphs. Since
they are Inout solutions, each vertex of Gb and Gr satisfies the condition
indegree = outdegree. However, this rule does not apply to Gbr, since all com-
mon arcs and edges have been removed. Since only common arcs are removed, it
can be verified that for each vertex in Gbr, surplus_b = surplus_r. In other words,
at each vertex of Gbr, the net surplus or deficit created by blue arcs is equal to
the net surplus or deficit created by red arcs. So at each vertex, an incoming
blue arc can be associated with either an outgoing blue arc or an incoming red
arc. A similar statement holds for red arcs.
Consider a directed walk starting at an arbitrary vertex in Gbr , in which
blue edges are traversed in the forward direction and red edges are traversed
in the reverse direction. Whenever any vertex is revisited during the walk, an
alternating cycle is removed and output. When the start vertex is re-visited
and there are no more arcs incident to it, restart the walk from another vertex.
Continue the process until all arcs of Gbr have been decomposed into alternating
cycles. An alternative proof can be given by showing that reversing all red arcs
makes Gbr satisfy indegree = outdegree at each node and hence the arcs can be
decomposed into cycles.
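The reversal trick of the alternative proof translates directly into code; here is a sketch, assuming the input is indeed the symmetric difference of two Inout solutions (so that reversing the red arcs balances every vertex):

def alternating_cycles(blue, red):
    """Decompose the arcs of Gbr into alternating closed walks: reverse each
    red arc so that indegree = outdegree holds everywhere, then peel off
    directed cycles; blue arcs run forward and red arcs backward."""
    out = {}
    for u, v in blue:
        out.setdefault(u, []).append((v, 'blue'))
        out.setdefault(v, [])
    for u, v in red:
        out.setdefault(v, []).append((u, 'red'))    # red arc, reversed
        out.setdefault(u, [])
    cycles = []
    while any(out.values()):
        start = next(u for u, arcs in out.items() if arcs)
        walk, u = [], start
        while True:          # balance: the walk can only get stuck at start
            v, colour = out[u].pop()
            walk.append(((u, v) if colour == 'blue' else (v, u), colour))
            u = v
            if u == start:
                break
        cycles.append(walk)
    return cycles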
Each alternating cycle AC in Gbr is composed of one or more blue paths and
red paths. In each cycle, all blue paths go in one direction and all red paths go
in the opposite direction. Call the set of blue arcs of AC B and the set of red arcs R.
Define the operation of removing a set of arcs B from an Inout solution
GIO , denoted by GIO − B, as follows. If an arc of B is an additional arc or edge
added during the augmentation, then it can be removed. On the other hand, if
the arc was an undirected edge in G that was oriented during the augmentation,
then we remove the orientation but leave the edge in the graph. The cost of such
an operation is the total cost of the arcs/edges that are removed. Similarly, we
can define the addition operation GIO + B, where a set of arcs B are added to
GIO .
Lemma 2. Consider two Inout solutions Gb and Gr , and their symmetric dif-
ference Gbr . Let AC be an alternating cycle in Gbr , with blue component B and
red component R. Then, Gb − B + R and Gr − R + B are also Inout solutions.
Proof. Clearly, adding or deleting edges of AC does not affect the deficit/surplus
of the nodes that are not in AC. For all nodes in AC, the net deficit/surplus
created by blue paths of AC is the same as the net deficit/surplus created by red
paths of AC. So, the net deficit/surplus created by removal of B from Gb is
compensated by the addition of R. Therefore, Gb −B+R is an Inout solution, as
each node of Gb −B +R has net deficit/surplus of zero. By symmetry, Gr −R+B
is also an Inout solution.
Lemma 3. Let d− (R) be the cost of removing R from Gr and let d+ (R) be the
cost of adding R to Gb . Similarly, let d− (B) be the cost of removing B from
Gb and let d+ (B) be the cost of adding B to Gr . Then d+ (R) ≤ d− (R) and
d+ (B) ≤ d− (B).
Proof. We prove that the cost of adding R to Gb does not exceed the cost of
removing R from Gr. Each arc of R in Gr is either an additional arc, an oriented
edge, or an additional oriented edge. Additional arcs and additional oriented edges
contribute at most the same cost to d+(R). If an oriented edge is in R, then either
it is not currently oriented in Gb or it is oriented in the opposite direction in Gb .
To add this oriented edge to Gb , we can either orient the undirected edge, or
remove its opposite orientation; either way there is no additional cost. Therefore
R can be added to Gb without incurring additional cost. Hence, d+ (R) ≤ d− (R).
By symmetry, d+ (B) ≤ d− (B).
H. Therefore d− (R) < d− (B) for all cycles AC that involve at least one edge
from U . Choose any such cycle AC. We already know that d+ (R) ≤ d− (R) and
d−(R) < d−(B). Combining these, we get d+(R) < d−(B).
Note that GIO − B + R uses at most one copy of each edge from U and its cost
is less than C(GIO ). In other words, GIO − B + R is an Inout solution of G,
and C(GIO − B + R) < C(GIO ). This contradicts the assumption that GIO is
an optimal Inout solution of G. Therefore, adding additional copies of edges
from U does not decrease the augmentation cost of an Inout solution.
[Fig.: flowchart of the improved algorithm: Modified Mixed1 and Mixed2 (built from Inoutdegree, Largecycles, and Evenparity); the min-cost solution is selected.]
at this stage. Note that Evendegree duplicates each arc of M at most once to
form H. We follow Frederickson's analysis [7] for the rest of the proof: let M1
be the multiset of arcs such that there are two arcs in M1 for each arc in M.
Clearly, M and M1 both satisfy the Inout property. Hence, the union of U, X, and
M1 forms an Inout solution containing H, whose cost is C_U + C_X + 2C_M. Since
Inoutdegree is an optimal algorithm for the Inout problem, it is guaranteed to
find an Inout solution of cost at most C_U + C_X + 2C_M. This is at most
C* + C_M by Lemma 4. Finally, Evenparity does not change the cost of the
solution.
7 Conclusion
Acknowledgments
The research of the first author was supported in part by a grant from the
National Science Foundation under Research Initiation Award CCR-9409625.
The second author gratefully acknowledges the support of his employer, Samsung
Telecommunications America Inc.
References
15. W. L. Pearn and C. M. Liu. Algorithms for the Chinese postman problem on
mixed networks. Computers & Operations Research, 22:479–489, 1995.
16. T. K. Ralphs. On the mixed Chinese postman problem. Operations Research
Letters, 14:123–127, 1993.
Improved Approximation Algorithms for Uncapacitated Facility Location⋆

Fabián A. Chudak

⋆ Research partially supported by NSF grants DMS-9505155 and CCR-9700029 and by ONR grant N00014-96-1-00500.
1 Introduction
The study of the location of facilities to serve clients at minimum cost has been
one of the most studied themes in the field of Operations Research (see, e.g., the
textbook edited by Mirchandani and Francis [9]). In this paper, we focus on one
of its simplest variants, the uncapacitated facility location problem, also known
as the simple plant location problem, which has been extensively treated in the
literature (see the chapter by Cornuéjols, Nemhauser, and Wolsey in [9]). This
problem can be described as follows. There is a set of potential facility locations
F ; building a facility at location i ∈ F has an associated nonnegative fixed cost
fi, and any open facility can provide an unlimited amount of a certain commodity.
There also is a set of clients or demand points D that require service; client j ∈ D
has a positive demand d_j of the commodity that must be shipped from one of the
open facilities. If a facility at location i ∈ F is used to satisfy the demand of client
j ∈ D, the service or transportation cost incurred is proportional to the distance
from i to j, cij . The goal is to determine a subset of the set of potential facility
locations at which to open facilities and an assignment of clients to these facilities
so as to minimize the overall total cost, that is, the fixed costs of opening the
facilities plus the total service cost. We will only consider the metric variant of the
problem in which the distance function c is nonnegative, symmetric and satisfies
the triangle inequality. Throughout the paper, a ρ-approximation algorithm is
a polynomial-time algorithm that is guaranteed to deliver a feasible solution of
objective function value within a factor of ρ of optimum. The main result of
this paper is a (1 + 2/e)-approximation algorithm for the metric uncapacitated
facility location problem, where 1 + 2/e ≈ 1.736.
Notice that our result is based on worst case analysis, that is, our solu-
tions will be within a factor of (1 + 2/e) of optimum for any instance of the
problem. Such a strong assurance can have salient practical implications: these
algorithms often outperform algorithms whose design was not grounded by the
mathematical understanding required for proving performance guarantees. We
have corroborated this assertion for our algorithm through a few computational
experiments, which will be reported in a follow-up paper.
In contrast to the uncapacitated facility location problem, Cornuéjols, Fisher,
and Nemhauser [3] studied the problem in which the objective is to maximize
the difference between assignment and facility costs. They showed that with this
objective, the problem can be thought of as a bank account location problem.
Notice that even though these two problems are equivalent from the point of
view of optimization, they are not from the point of view of approximation.
Interestingly, Cornuéjols, Fisher, and Nemhauser showed that for the maximiza-
tion problem, a greedy procedure that iteratively tries to open the facility that
most improves the objective function yields a solution of value within a constant
factor of optimum.
The metric uncapacitated facility location problem is known to be NP-hard
(see [4]). Very recently, Guha and Khuller [7] have shown that it is Max SNP-
hard. In fact, they have also shown that the existence of a ρ-approximation
algorithm for ρ < 1.463 implies that NP ⊆ TIME(n^{O(log log n)}) (see also Feige
[5]).
We briefly review previous work on approximation algorithms for the metric
uncapacitated facility location problem. The first constant factor approximation
algorithm was given by Shmoys, Tardos, and Aardal [11], who presented a 3.16-
approximation algorithm, based on rounding an optimal solution of a classical
linear programming relaxation for the problem. This bound was subsequently
improved by Guha and Khuller [7], who provided a 2.408-approximation algo-
rithm. Guha and Khuller’s algorithm requires a stronger linear programming
relaxation. They add to the relaxation of [11] a facility budget constraint that
bounds the total fractional facility cost by the optimal facility cost. After run-
ning the algorithms of [11], they use a greedy procedure (as in [3]) to improve the
quality of the solution: iteratively, open one facility at a time if it improves the
cost of the solution. Since they can only guess the optimal facility cost to within
a factor of (1 + ε), they are in fact solving a weakly polynomial number of linear
programs. In contrast, the 1.736-approximation algorithm presented in this pa-
per requires the solution of just one linear program, providing as a by-product
further evidence of the strength of this linear programming relaxation.
Without loss of generality we shall assume that the sets F and D are disjoint;
let N = F ∪ D, n = |N |. Even though all our results hold for the case of
arbitrary demands, for sake of simplicity of the exposition, we will assume that
each demand dj is 1 (j ∈ D); thus, the cost of assigning a client j to an open
facility at location i is c_ij. The distance between any two points k, ℓ ∈ N is c_kℓ.
We assume that the n×n distance matrix (c_kℓ) is nonnegative, symmetric (c_kℓ =
c_ℓk for all k, ℓ ∈ N) and satisfies the triangle inequality, that is, c_ij ≤ c_ik + c_kj,
for all i, j, k ∈ N . The simplest linear programming relaxation (due to Balinski,
1965 [2]), which we will refer to as P, is as follows:
Min Σ_{j∈D} Σ_{i∈F} c_ij x_ij + Σ_{i∈F} f_i y_i

(P) subject to Σ_{i∈F} x_ij = 1 for each j ∈ D (1)
x_ij ≤ y_i for each i ∈ F, j ∈ D (2)
x_ij ≥ 0 for each i ∈ F, j ∈ D. (3)
Khuller for this special case. Independently of our work, Ageev and Sviridenko
[1] have recently shown that the randomized rounding analysis for the maximum
satisfiability problem of Goemans and Williamson [6] can be adapted to obtain
improved bounds for the maximization version of the problem.
The following simple ideas enable us to develop a rounding procedure for the
linear programming relaxation P with an improved performance guarantee. We
explicitly exploit optimality conditions of the linear program, and in particular,
we use properties of the optimal dual solution and complementary slackness. A
key element to our improvement is the use of randomized rounding in conjunction
with the approach of Shmoys, Tardos, and Aardal. To understand our approach,
suppose that for each location i ∈ F, independently, we open a facility at i with
probability yi∗ . The difficulty arises when attempting to estimate the expected
service cost: the distance from a given demand point to the closest open facility
might be too large. However, we could always use the routing of the algorithm
of Shmoys, Tardos, and Aardal if we knew that each cluster has a facility open.
In essence, rather than opening each facility independently with probability yi∗ ,
we instead open one facility in each cluster with probability yi∗ . The algorithm
is not much more complicated, but the most refined analysis of it is not quite so
simple. Our algorithms are randomized, and can be easily derandomized using
the method of conditional expectations. The main result of this paper is the
following.
Theorem 1. There is a polynomial-time algorithm that rounds an optimal so-
lution to the linear programming relaxation P to a feasible integer solution whose
value is within (1 + 2/e) ≈ 1.736 of the optimal value of the linear programming
relaxation P.
Since the optimal LP value is a lower bound on the integer optimal value, the
theorem yields a 1.736-approximation algorithm, whose running time is domi-
nated by the time required to solve the linear programming relaxation P. As a
consequence of the theorem, we obtain the following corollary on the quality of
the value of the linear programming relaxation.
Corollary 1. The optimal value of the linear programming relaxation P is within
a factor of 1.736 of the optimal cost.
This improves on the previously best known factor of 3.16 presented in [11].
The following definition was crucial for the algorithm of Shmoys, Tardos, and
Aardal [11].
Definition 2. Suppose that (x, y) is a feasible solution to the linear program-
ming relaxation P and let gj ≥ 0, for each j ∈ D. Then (x, y) is g-close if xij > 0
implies that cij ≤ gj (j ∈ D, i ∈ F).
Notice that if (x, y) is g-close and j ∈ D is any demand point, j is fractionally
serviced by facilities inside the ball of radius gj centered at j. The following
lemma is from [11].
Lemma 1. Given a feasible g-close solution (x, y), we can find, in polynomial
time, a feasible integer 3g-close solution (x̂, ŷ) such that
Σ_{i∈F} f_i ŷ_i ≤ Σ_{i∈F} f_i y_i.
We briefly sketch the proof below. The algorithm can be divided into two
steps: a clustering step and a facility opening step. The clustering step works as
follows (see Table 1). Let S be the set of demand points that have not yet been
assigned to any cluster; initially, S = D. Find the unassigned demand point
j◦ with smallest gj -value and create a new cluster centered at j◦ . Then all of
the unassigned demand points that are fractionally serviced by facilities in the
neighborhood of j◦ (that is, all the demand points k ∈ S with N(k) ∩ N(j◦ ) 6= ∅)
are assigned to the cluster centered at j◦ ; the set S is updated accordingly.
Repeat the procedure until all the demand points are assigned to some cluster
(i.e., S = ∅). We will use C to denote the set of centers of the clusters.
1. S ← D, C ← ∅
2. while S ≠ ∅
3.   choose j◦ ∈ S with smallest g_j value (j ∈ S)
4.   create a new cluster Q centered at j◦, C ← C ∪ {j◦}
5.   Q ← {k ∈ S : N(k) ∩ N(j◦) ≠ ∅}
6.   S ← S − Q
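In Python the clustering step reads as follows, where N[j] is the neighborhood of demand point j (the facilities that fractionally serve it):

def cluster(D, g, N):
    """Clustering step of Table 1: pick the unassigned demand point with
    smallest g_j as a new center and absorb every unassigned point whose
    neighborhood intersects the center's."""
    S, centers, clusters = set(D), [], {}
    while S:
        j0 = min(S, key=lambda j: g[j])        # smallest g-value in S
        Q = {k for k in S if N[k] & N[j0]}     # N(k) meets N(j0)
        centers.append(j0)
        clusters[j0] = Q
        S -= Q
    return centers, clusters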
The following fact follows easily from the clustering construction and the
definition of neighborhood, and is essential for the success of the algorithm.
Fact 1. Suppose that we run the clustering algorithm of Table 1, using any g-
close solution (x, y), then:
(a) neighborhoods of distinct centers are disjoint (i.e., if j and k are centers,
j ≠ k ∈ C, then N(j) ∩ N(k) = ∅),
(b) for every demand point k ∈ D, Σ_{i∈N(k)} x_ik = 1.
After the clustering step, the algorithm of [11] opens exactly one facility per
cluster. For each center j ∈ C we open the facility i◦ in the neighborhood of j,
N(j), with smallest fixed cost f_i, and assign all the demand points in the cluster
of j to facility i◦. Observe that by inequalities (2), Σ_{i∈N(j)} y_i ≥ 1, thus f_{i◦} ≤
Σ_{i∈N(j)} f_i y_i. Using Fact 1(a), the total facility cost incurred by the algorithm
is never more than the total fractional facility cost Σ_{i∈F} f_i y_i. Next consider
any demand point k ∈ D and suppose it belongs to the cluster centered at j◦;
let ℓ ∈ N(k) ∩ N(j◦) be a common neighbor and let i be the open facility in
the neighborhood of j◦ (see Figure 1). Then, the distance from k to i can be
bounded by c_kℓ + c_ℓj◦ + c_j◦i ≤ g_k + 2g_{j◦} ≤ 3g_k, since g_{j◦} ≤ g_k.

[Fig. 1: routing k through the common neighbor ℓ and the center j◦ to the open facility i; the three legs are at most g_k, g_{j◦}, and g_{j◦}.]
Max Σ_{j∈D} v_j (4)

(D) subject to Σ_{j∈D} w_ij ≤ f_i for each i ∈ F (5)
v_j − w_ij ≤ c_ij for each i ∈ F, j ∈ D (6)
w_ij ≥ 0 for each i ∈ F, j ∈ D. (7)
3 A Randomized Algorithm
After solving the linear program P, a very simple randomized algorithm is the
following: open a facility at location i ∈ F with probability yi∗ independently
for every i ∈ F, and then assign each demand point to its closest open facility.
Notice that the expected facility cost is just Σ_{i∈F} f_i y_i^*, the same bound as in the
algorithm of Section 2. Focus on a demand point k ∈ D. If it happens that one
of its neighbors has been opened, then the service cost of k would be bounded
by the optimal dual variable vk∗ . However, if we are unlucky and this is not
the case (an event that can easily be shown to occur with probability at most
1/e ≈ 0.368, where the bound is tight), the service cost of k could be very large.
On the other hand, suppose that we knew, for instance, that for the clustering
computed in Section 2, k belongs to a cluster centered at j, where one of the
facilities in N(j) has been opened. Then in this unlucky case we could bound the
service cost of k using the routing cost of the 4-approximation algorithm.
Our algorithm is also based on randomized rounding and the expected facility
cost is Σ_{i∈F} f_i y_i^*. However, we weaken the randomized rounding step and do not
open facilities independently with probability yi∗ , but rather in a dependent way
to ensure that each cluster center has one of its neighboring facilities opened.
Even though the algorithms presented in this section work for any g-close
feasible solution, for sake of simplicity we will assume as in the end of Section 2,
that we have a fixed optimal primal solution (x∗ , y ∗ ) and a fixed optimal dual
solution (v ∗ , w∗ ), so that (x∗ , y ∗ ) is v ∗ -close. It is easy to see that we can assume
that yi∗ ≤ 1 for each potential facility location i ∈ F. To motivate the following
probability x_ij^* (recall Fact 1(b)). Next, independently, we open each noncentral
facility i ∈ R with probability y_i^*. The algorithm then simply assigns each
demand point to its closest open facility.
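A sketch of this dependent rounding step (function and variable names are hypothetical; by Fact 1(b) the probabilities x*_ij over i ∈ N(j) sum to 1 for each center j, so each cluster opens exactly one facility):

import numpy as np

def open_facilities(centers, N, x_star, noncentral, y_star, seed=0):
    """Each center j opens exactly one neighbouring facility i, chosen with
    probability x*_ij; each noncentral facility i opens independently with
    probability y*_i."""
    rng = np.random.default_rng(seed)
    opened = set()
    for j in centers:
        nbrs = sorted(N[j])
        opened.add(rng.choice(nbrs, p=[x_star[i, j] for i in nbrs]))
    for i in noncentral:
        if rng.random() < y_star[i]:
            opened.add(i)
    return opened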
Corollary 2. The expected total facility cost is Σ_{i∈F} f_i y_i^*.
where exp(x) = e^x. We will bound the expected service cost of k by considering
a clearly worse algorithm: assign k to its closest open neighbor; if none of the
≤ Σ_{i=1}^{d} c_ik x_ik^* + (1/e)·3v_k^* = C_k + (3/e)v_k^*,
Now we can argue essentially as in the simple case when |S_j| ≤ 1 for each
center j ∈ C. Assume that there are d events E_ℓ, and for notational simplicity,
they are indexed by ℓ ∈ {1, . . . , d}, with C_1 ≤ . . . ≤ C_d. Let D be the event
that none of E_1, . . . , E_d occurs; that is, D is precisely the event in which all
the facilities in the neighborhood of k, N(k), are closed; let q be the probability
of event D. Note that, as in the simple case, the service cost of k is never
greater than its backup routing cost 3v_k^*; in particular, this bound holds even
conditioned on D. As before, we will analyze the expected service cost of a worse
algorithm: k is assigned to the open neighboring facility with smallest C_ℓ; and if
all the neighbors are closed, k is assigned through its backup routing to the open
facility i◦ ∈ N(j◦). If the event E_1 occurs (with probability p_1), the expected
Fig. 2. Estimating the expected service cost of k. Here the centers that share a
neighbor with k are demand locations 2 and 4 (Ck = {2, 4}). The neighbors of
k that are noncentral locations are 1 and 3. Event E2 (respectively E4 ) occurs
when a facility in N(k) ∩ N(2) (respectively N(k) ∩ N(4)) is open, while event E1
(respectively E3 ) occurs when facility 1 (respectively 3) is open. Though there
are dependencies among the neighbors of a fixed center, the events E1 , E2 , E3
and E4 are independent.
service cost of k is C_1. If event E_1 does not occur, but event E_2 occurs (which
happens with probability (1 − p_1)p_2), the expected service cost of k is C_2, and
To prove the lemma we bound the first d terms of (8) by C_k, and q by 1/e. ⊓⊔
[Figure: the backup routing of k through its center j◦ to the open facility i ∈ N(j◦): (a) the deterministic bound; (b) the bound in expected value, with legs bounded by v_k^*, v_{j◦}^* and C_{j◦}.]
4 Discussion
We conclude with a few remarks concerning the algorithm of Section 3. A stan-
dard technique to improve the guarantees of randomized rounding is to use a
fixed parameter, say γ ≥ 0, to boost the probabilities. For instance, the sim-
plest randomized rounding algorithm would open facility i ∈ F with probability
min{γy_i^*, 1}. This technique can be also applied to our randomized algorithm
in a very simple fashion. The bounds of Theorem 3 are thus parameterized for
each γ. Even though this does not lead to an overall improvement in the perfor-
mance guarantee, it allows us to improve the performance guarantee for some
values of ρ, where ρ ∈ [0, 1] is defined by ρ·LP^* = Σ_{i∈F} f_i y_i^*. In particular,
we can show that if ρ ≤ 2/e ≈ 0.736, there is a variant of our algorithm with
References
1. A. A. Ageev and M. I. Sviridenko. An approximation algorithm for the uncapaci-
tated facility location problem. Manuscript, 1997.
2. M. L. Balinski. Integer programming: Methods, uses, computation. Management
Science, 12(3):253–313, 1965.
3. G. Cornuéjols, M. L. Fisher, and G. L. Nemhauser. Location of bank accounts to
optimize float: An analytic study of exact and approximate algorithms. Manage-
ment Science, 23(8):789–810, 1977.
4. G. Cornuéjols, G. L. Nemhauser, and L. A. Wolsey. The uncapacitated facility
location problem. In P. Mirchandani and R. Francis, editors, Discrete Location
Theory, pages 119–171. John Wiley and Sons, Inc., New York, 1997.
5. U. Feige. A threshold of ln n for approximating set-cover. In 28th ACM Symposium
on Theory of Computing, pages 314–318, 1996.
6. M. X. Goemans and D. P. Williamson. New 3/4-approximation algorithms for
MAX SAT. SIAM Journal on Discrete Mathematics, 7:656–666, 1994.
7. S. Guha and S. Khuller. Greedy strikes back: improved facility location algorithms.
In Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1998.
To appear.
8. J. H. Lin and J. S. Vitter. ε-Approximation with minimum packing constraint
violation. In Proceedings of the 24th Annual ACM Symposium on Theory of Com-
puting, pages 771–782, 1992.
9. P. Mirchandani and R. Francis, editors. Discrete Location Theory. John Wiley
and Sons, Inc., New York, 1990.
10. P. Raghavan and C. D. Thompson. Randomized rounding. Combinatorica, 7:365–
374, 1987.
11. D. B. Shmoys, É. Tardos, and K. Aardal. Approximation algorithms for facility
location problems. In 29th ACM Symposium on Theory of Computing, pages 265–
274, 1997.
12. M. I. Sviridenko. Personal communication, July 1997.
The Maximum Traveling Salesman Problem
Under Polyhedral Norms
1 Introduction
In the Traveling Salesman Problem (TSP), the input consists of a set C of cities
together with the distances d(c, c′) between every pair of distinct cities c, c′ ∈ C.
The goal is to find an ordering or tour of the cities that minimizes (Minimum
TSP) or maximizes (Maximum TSP) the total tour length. Here the length of a
tour cπ(1) , cπ(2) , . . . , cπ(n) is
Σ_{i=1}^{n−1} d(c_{π(i)}, c_{π(i+1)}) + d(c_{π(n)}, c_{π(1)}).
⋆ Supported by an Alfred P. Sloan Research Fellowship and NSF grant DMS 9501129.
⋆⋆ Supported by the START program Y43-MAT of the Austrian Ministry of Science.
⋆⋆⋆ Supported by the NSF through the REU Program.
with the natural asymptotic interpretation that distance under the L∞ norm is
This paper concentrates on a second class of norms which also includes the
Rectilinear and Sup norms, but can only approximate the Euclidean and other Lp
norms. This is the class of polyhedral norms. Each polyhedral norm is determined
by a unit ball which is a centrally-symmetric polyhedron P with the origin at its
center. To determine d(x, y) under such a norm, first translate the space so that
one of the points, say x, is at the origin. Then determine the unique factor α by
which one must rescale P (expanding if α > 1, shrinking if α < 1) so that the
other point (y) is on the boundary of the polyhedron. We then have d(x, y) = α.
Alternatively, and more usefully for our purposes, we can view a polyhedral
norm as follows. If P is a polyhedron as described above and has F facets, then
F is divisible by 2 and there is a set H_P = {h_1, . . . , h_{F/2}} of points in R^d such
that P is the intersection of a collection of half-spaces determined by H_P:

P = ∩_{i=1}^{F/2} {x : x·h_i ≤ 1} ∩ ∩_{i=1}^{F/2} {x : x·h_i ≥ −1}.

Then we have

d(x, y) = max{ |(x − y)·h_i| : 1 ≤ i ≤ F/2 }.
Note that for the Rectilinear norm in the plane we can take HP = {(1, 1), (−1, 1)}
and for the Sup norm in the plane we can take HP = {(1, 0), (0, 1)}.
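Distances under a polyhedral norm can thus be evaluated directly from H_P; for example (the absolute value accounts for the two opposite families of half-spaces):

def polyhedral_distance(x, y, H):
    """Distance under the polyhedral norm with half-space set H.
    H = [(1, 1), (-1, 1)] gives the Rectilinear (L1) norm in the plane and
    H = [(1, 0), (0, 1)] gives the Sup (L-infinity) norm."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return max(abs(sum(d * hi for d, hi in zip(diff, h))) for h in H)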
For the Minimum TSP on geometric instances, all of the key complexity
questions have been answered. As follows from results of Itai, Papadimitriou,
and Szwarcfiter [5], the Minimum TSP is NP-hard for any fixed dimension d and
any Lp or polyhedral norm. On the other hand, recent results of Arora [1,2] and
Mitchell [8] imply that in all these cases a polynomial-time approximation scheme
(PTAS) exists, i.e., a sequence of polynomial-time algorithms Ak , 1 ≤ k < ∞,
where Ak is guaranteed to find a tour whose length is within a ratio of 1 + (1/k)
of optimal.
The situation for geometric versions of the Maximum TSP is less completely
resolved. Barvinok [3] has shown that once again polynomial-time approximation
schemes exist for all fixed dimensions d and all Lp or polyhedral norms (and in
a sense for any fixed norm; see [3]). Until now, however, the complexity of the
optimization problems themselves when d is fixed has remained open: For no
fixed dimension d and Lp or polyhedral norm was the problem of determining
the maximum tour length known either to be NP-hard or to be polynomial-time
solvable. In this paper, we resolve the question for polyhedral norms, showing
that, in contrast to the case for the Minimum TSP, the Maximum TSP is solvable
in polynomial time for any fixed dimension d and any polyhedral norm:
Theorem 1. Let dimension d be fixed, and let ‖·‖ be a fixed polyhedral norm in
R^d whose unit ball is a polyhedron P determined by a set of f facets. Then for any
set of n points in R^d, one can construct a traveling salesman tour of maximum
length with respect to ‖·‖ in time O(n^{f−2} log n), assuming arithmetic operations
take unit time.
As an immediate consequence of Theorem 1, we get relatively efficient algo-
rithms for the Maximum TSP in the plane under Rectilinear and Sup norms:
Corollary 2. The Maximum TSP for points in R^2 under the L1 and L∞ norms can be solved in O(n^2 log n) time, assuming arithmetic operations take unit time.
The restriction to unit cost arithmetic operations in Theorem 1 and Corol-
lary 2 is made primarily to simplify the statements of the conclusions, although
it does reflect the fact that our results hold for the real number RAM compu-
tational model. Suppose on the other hand that one assumes, as one typically
must for complexity theory results, that the components of the vectors in HP
and the coordinates of the cities are all rationals. Let U denote the maximum
absolute value of any of the corresponding numerators and denominators. Then
the conclusions of the Theorem and Corollary hold with running times multi-
plied by n log(U ). If the components/coordinates are all integers with maximum
absolute value U , the running times need only be multiplied by log(nU ). For
simplicity in the remainder of this paper, we shall stick to the model in which
numbers can be arbitrary reals and arithmetic operations take unit time. The
reader should have no trouble deriving the above variants.
The paper is organized as follows. Section 2 introduces a new special case
of the TSP, the Tunneling TSP , and shows how the Maximum TSP under a
polyhedral norm can be reduced to the Tunneling TSP with the same number of
cities and f/2 tunnels. Section 3 sketches how the latter problem can be solved in O(n^{f+1}) time, a slightly weaker result than that claimed in Theorem 1. The details of how to improve this running time to O(n^{f−2} log n) will be presented
in the full version of this paper, a draft of which is available from the authors.
Section 4 concludes by describing some related results and open problems.
Note that this distance function, like our geometric norms, is symmetric.
It is easy to see that Maximum TSP remains NP-hard when distances are
determined by arbitrary tunnel system distance functions. However, for the case
where k = |T | is fixed and not part of the input, we will show in the next section
that Maximum TSP can be solved in O(n^{2k+1}) time. We are interested in this
special case because of the following lemma.
Lemma 3. If ‖·‖ is a polyhedral norm determined by a set H_P of k vectors in R^d, then for any set C of points in R^d one can in time O(dk|C|) construct a tunnel system distance function with k tunnels that yields d(c, c′) = ‖c − c′‖ for all c, c′ ∈ C.
Proof. Suppose we are given an instance of the Tunneling TSP with sets C =
{c1 , . . . , cn } and T = {t1 , . . . , tk } of cities and tunnels, and access distances
F (c, t), B(c, t) for all c ∈ C and t ∈ T . We begin by transforming the problem
to one about subset construction.
assignment problem on an expanded graph. The overall running time for the
algorithm is thus O(n^{k−1} · n^{k−1} · n^3) = O(n^{2k+1}), as claimed. ⊓⊔
In the full paper we show how two additional ideas enable us to reduce our running times to O(n^{2k−2} log n), as needed for the proof of Theorem 1. The
first idea is to view each b-matching problem as a transportation problem with
a bounded number of customer locations. This latter problem can be solved
in linear time by combining ideas from [7,4,11]. The second idea is to exploit
the similarities between the transportation instances we need to solve. Here a
standard concavity result implies that one dimension of our search over degree
sequences can be handled by a binary search. In the full paper we also discuss
how the constants involved in our algorithms grow with k.
4 Conclusion
We have derived a polynomial time algorithm for the Maximum TSP when the
cities are points in Rd for some fixed d and when the distances are measured
according to some polyhedral norm. The complexity of the Maximum TSP with
Euclidean distances and fixed d remains unsettled, however, even for d = 2.
Although the Euclidean norm can be approximated arbitrarily closely by polyhedral norms (which yields Barvinok's result [3] that the Maximum TSP has a PTAS), it is not itself a polyhedral norm.
A further difficulty with the Euclidean norm (one shared by both the Min-
imum and Maximum TSP) is that we still do not know whether the TSP is in
NP under this norm. Even if all city coordinates are rationals, we do not know
how to compare a tour length to a given rational target in less than exponential
time. Such a comparison would appear to require us to evaluate a sum of n
square roots to some precision, and currently the best upper bound known on the number of bits of precision needed to ensure a correct answer remains ex-
ponential in n. Thus even if we were to produce an algorithm for the Euclidean
Maximum TSP that ran in polynomial time when arithmetic operations (and
comparisons) take unit time, it might not run in polynomial time on a standard
Turing machine.
Another set of questions concerns the complexity of the Maximum TSP when
d is not fixed. It is relatively easy to show that the problem is NP-hard for all
Lp norms (the most natural norms that are defined for all d > 0). For the case
of L∞ one can use a transformation from Hamiltonian Circuit in which each
edge is represented by a separate dimension. For the Lp norms, 1 ≤ p < ∞,
one can use a transformation from the Hamiltonian Circuit problem for cubic
graphs, with a dimension for each non-edge. However, this still leaves open the
question of whether there might exist a PTAS for any such norm when d is not
fixed. Trevisan [10] has shown that the Minimum TSP is Max SNP-hard for any
such norm, and so cannot have such PTAS’s unless P = NP. We can obtain a
similar result for the Maximum TSP under L∞ by modifying our NP-hardness
transformation so that the source problem is the Minimum TSP with all edge
lengths in {1, 2}, a special case that was proved Max SNP-hard by Papadimitriou
and Yannakakis [9]. The question remains open for Lp , 1 ≤ p < ∞, although we
conjecture that these cases are Max SNP-hard as well.
Finally, we note that our results can be extended in several ways. For instance,
one can get polynomial-time algorithms for asymmetric versions of the Maximum
TSP in which distances are computed based on non-symmetric unit balls. Also,
algorithmic approaches analogous to ours can be applied to geometric versions of
other NP-hard maximization problems: For example, consider the Weighted 3-
Dimensional Matching Problem that consists in partitioning a set of 3n elements
into n triples of maximum total weight. The special case where the elements
are points in Rd and where the weight of a triple equals the perimeter of the
corresponding triangle measured according to some fixed polyhedral norm can
be solved in polynomial time.
Acknowledgment. We thank Arie Tamir for helpful comments on a preliminary
version of this paper, and in particular for pointing out that a speedup of O(n2 )
could be obtained by using the transportation problem results of [7] and [11].
Thanks also to Mauricio Resende and Peter Shor for helpful discussions.
References
1. S. Arora. Polynomial-time approximation schemes for Euclidean TSP and other
geometric problems. Proc. 37th IEEE Symp. on Foundations of Computer Science,
pages 2–12. IEEE Computer Society, Los Alamitos, CA, 1996.
2. S. Arora. Nearly linear time approximation schemes for Euclidean TSP and other
geometric problems. Proc. 38th IEEE Symp. on Foundations of Computer Science,
pages 554–563. IEEE Computer Society, Los Alamitos, CA, 1997.
3. A. I. Barvinok. Two algorithmic results for the traveling salesman problem. Math.
of Oper. Res., 21:65–84, 1996.
4. D. Gusfield, C. Martel, and D. Fernandez-Baca. Fast algorithms for bipartite net-
work flow. SIAM J. Comput., 16:237–251, 1987.
5. A. Itai, C. Papadimitriou, and J. L. Szwarcfiter. Hamilton paths in grid graphs.
SIAM J. Comput., 11:676–686, 1982.
6. E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys. The
Traveling Salesman Problem, Wiley, Chichester, 1985.
7. N. Megiddo and A. Tamir. Linear time algorithms for some separable quadratic
programming problems. Oper. Res. Lett., 13:203–211, 1993.
8. J. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: Part II
– A simple PTAS for geometric k-MST, TSP, and related problems. Preliminary
manuscript, April 1996.
9. C. H. Papadimitriou and M. Yannakakis. The traveling salesman problem with
distances one and two. Math. of Oper. Res., 18:1–11, 1993.
10. L. Trevisan. When Hamming meets Euclid: The approximability of geometric TSP
and MST. Proc. 29th Ann. ACM Symp. on Theory of Computing, pages 21–29.
ACM, New York, 1997.
11. E. Zemel. An O(n) algorithm for the linear multiple choice knapsack problem and
related problems. Inf. Proc. Lett., 18:123–128, 1984.
Polyhedral Combinatorics of Benzenoid
Problems
Hernán Abeledo and Gary Atkinson
1 Introduction
In this paper we study optimization problems defined for maps of graphs that
are bipartite and 2-connected. These problems are generalizations of ones that
arise in chemical graph theory, specifically those encountered in the analysis of
benzenoid hydrocarbons.
Benzenoid hydrocarbons are organic molecules composed of carbon and hy-
drogen atoms organized into connected (hexagonal) benzene rings. The structure
of such a molecule is usually represented by a benzenoid system, a bipartite and 2-connected plane graph whose interior faces are all regular hexagons. Each
node of a benzenoid system represents a carbon atom, and each edge corresponds
to a single or double bond between a pair of carbon atoms. The hydrogen atoms
are not explicitly represented since their location is immediately deduced from
the configuration of the carbon atoms.
Generally, there is more than one way to arrange n carbon atoms into h
hexagonal rings where each arrangement has n − 2(h − 1) hydrogen atoms. The
chemical formula Cn Hn−2(h−1) represents a series of benzenoid isomers, struc-
turally distinct molecules having the same atomic makeup. However, not all
possible arrangements correspond to actual molecules. For benzenoid molecules,
empirical evidence [4,5] shows that their graphs always have a perfect matching.
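As a quick illustrative check of this fact (our own sketch, not from the paper), the smallest benzenoid, benzene itself, has exactly two perfect matchings, which can be verified by brute force:

from itertools import combinations

hexagon = [(i, (i + 1) % 6) for i in range(6)]  # the 6 carbon-carbon bonds of benzene

def count_perfect_matchings(num_nodes, edges):
    # enumerate all (num_nodes/2)-subsets of edges; keep those covering every node
    count = 0
    for sub in combinations(edges, num_nodes // 2):
        covered = {v for e in sub for v in e}
        count += len(covered) == num_nodes
    return count

print(count_perfect_matchings(6, hexagon))  # -> 2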
In chemical graph theory, perfect matchings are called Kekulé structures. A
benzenoid system will usually have many Kekulé structures; analyzing them is
of interest to chemists and gives rise to interesting combinatorial optimization
problems [8]. For example, the Clar number of a benzenoid system is the opti-
mal value of a particular maximization problem over its set of Kekulé structures
and is a key concept in the aromatic sextet theory developed by E. Clar [3] to
explain benzenoid phenomenology. It has been observed that as the Clar number
increases within an isomeric series of benzenoid hydrocarbons, (1) isomeric sta-
bility increases, so chemical reactivity decreases, and (2) the absorption bands
shift towards shorter wavelengths, so the isomer colors change from dark blue-
green to red, yellow or white [5]. It has also been demonstrated that the Clar
number provides a rough estimate of the Dewar-type resonance energy [1].
Hansen and Zheng [7] formulated the computation of the Clar number as an
integer (linear) program and conjectured, based on empirical results, that solving
its linear programming relaxation yields integer solutions for all benzenoids.
We prove here that the constraint matrix of this integer program is always
unimodular. In particular, this establishes that the relaxation polytope is integral
since the linear program is in standard form.
Interestingly, these unimodular constraint matrices coming from an applied
problem constitute an unusual case: they are not, in general, totally unimodular
as often occurs with optimization problems on graphs that give rise to integral
polyhedra. However, for the Clar problem, we remark that a subset of them are
(also) totally unimodular, namely those constraint matrices corresponding to a
natural subset of benzenoids called catacondensed benzenoid hydrocarbons.
In a previous paper, Hansen and Zheng [6] considered a minimization problem
for benzenoid systems that we call here the minimum weight cut cover problem.
They showed that the optimal value of the cover problem is an upper bound for
the Clar number and conjectured that equality always holds. We prove this is the
case by formulating the minimum weight cut cover problem as a network flow
problem and using its dual to construct a solution for the Clar number problem.
As a consequence, the Clar number and the minimum weight cut cover problems
can be solved in strongly polynomial time using a minimum cost network flow
algorithm. Our results are established for the generalized versions of the Clar
number and cut cover problems that are defined here.
The terminology we use is standard and may be found in [2] or [9]. Here (V, E, F ) will always
denote the plane map with set of finite faces F that results from embedding a
bipartite and 2-connected planar graph G = (V, E). We will further assume that
the nodes of V are colored black or white so that all edges connect nodes of
different color.
Since G is bipartite and 2-connected, the boundary of each face f ∈ F
is an even cycle which can be perfectly matched in two different ways. Each
of these two possible matchings is called a frame of the face. A frame of a
face f is clockwise oriented if each frame edge is drawn from a black node to a white node in a clockwise direction along the boundary of f . Otherwise, the
frame is counterclockwise oriented. A framework for (V, E, F) is a mapping ϕ : F → E such that ϕ(f) is a frame for each f ∈ F. A framework ϕ is simple if ϕ(f) ∩ ϕ(f′) = ∅ whenever f ≠ f′. An oriented framework has all frames with
the same orientation. It follows that oriented frameworks are simple.
A set of node-disjoint faces F′ ⊆ F is called a resonant set for (V, E, F) if there exists a perfect matching for G that simultaneously contains a frame for each face in F′. We denote by res(V, E, F) the maximum cardinality of a resonant set for (V, E, F). The maximum resonant set problem seeks to determine
res(V, E, F ). It can be viewed as a set partitioning problem where nodes in a
plane map must be covered by faces or edges, and the objective is to maximize
the number of faces in the partition. Of course, this problem is infeasible if G
does not have a perfect matching.
When (V, E, F ) is a benzenoid system, res(V, E, F ) is called its Clar number.
A Clar structure is a representation of an optimal solution to this problem where
the (hexagonal) faces and edges in the partition have inscribed circles and double
lines, respectively (see Fig. 1).
We say two subgraphs of a plane graph are intrafacial if they are edge-disjoint
and each one is contained in the closure of a face of the other. Let G1 and G2 be
two intrafacial subgraphs. We say G1 is interior to G2 and G2 is exterior to G1
if G1 is contained in the closure of an interior face of G2 . We say G1 and G2 are
mutually exterior if each subgraph is contained in the exterior face of the other
one.
Proposition 3. A plane graph with all even degree nodes can be decomposed
into pairwise intrafacial cycles.
Proof. We outline here an iterative procedure that yields the desired decom-
position. Note that the blocks of a plane graph with all even degree nodes are
eulerian, 2-connected and pairwise intrafacial. Since a block is 2-connected, the
boundary of each of its faces is a cycle. Removing such a cycle from the original
graph results in a new graph whose blocks continue to be eulerian, 2-connected
and pairwise intrafacial. Furthermore, the removed cycle is intrafacial with re-
spect to the new graph. ⊓⊔
Proof. All cycles referred to in this proof are those of the G′ decomposition.
Since G is bipartite, all cycles are signable. We will show that the following rule
yields the desired signing: if a cycle lies in the interior of an even number of
cycles (not counting itself), sign it clockwise; otherwise sign it counterclockwise.
Consider any face f ∈ F . Then f is interior or exterior to any cycle. Let
n denote the total number of cycles that contain f and assume, without loss
of generality, that ϕ(f ) is clockwise oriented. If f is not frame adjacent to any
cycle, the proposition holds vacuously for f . Next, we consider the cases when
f is interior or exterior to a frame adjacent cycle.
Let c be a frame adjacent cycle that contains f in its interior. Then c must
be the only such cycle and must also be the innermost cycle that contains f
in its interior. Therefore, c must be interior to n − 1 cycles. For n odd (even),
c is signed clockwise (counterclockwise). Since f is interior to c and ϕ(f ) is
clockwise oriented, its frame edges in c are drawn from a black to a white node
in the clockwise direction along c. Hence, the frame edges in c are all assigned a
+1 if n is odd and a -1 if n is even.
Next, let C denote the set of cycles exterior and frame adjacent to f . All cycles
in C are signed with the same orientation since each must be interior to the
same number of cycles as f . Since ϕ(f ) is clockwise oriented, the frame edges
that f shares with a cycle in C are drawn from a black node to white node in
the counterclockwise direction along the cycle. Thus, the frame edges of f in the
exterior cycles are assigned a +1 if n is odd and a -1 if n is even.
In both cases, edges in ϕ(f) ∩ E′ are assigned a +1 if n is odd and a −1 if n is even. ⊓⊔
    max {1^T y : Kx + Ry = 1, x ≥ 0, y ≥ 0, x ∈ ZZ^E, y ∈ ZZ^F}     (IP1)
where K is the node-edge incidence matrix of G and R is the node-face incidence matrix of (V, E, F). The maximum resonant set problem can alternatively be formulated as
    max {1^T y : Kz = 1, x + Uy − z = 0, x ≥ 0, y ≥ 0, z ≥ 0, y ∈ ZZ^F, x, z ∈ ZZ^E}.     (IP2)
All feasible solutions to the above integer program are necessarily binary
vectors. Hence, constraints Kz = 1 express that z is the incidence vector of a
perfect matching for G. Constraints x + U y − z = 0 partition the edges in the
perfect matching represented by z between those in faces considered resonant
(represented by U y) and those identified by x.
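As a minimal numerical illustration (ours, not from the paper, and assuming scipy is available), the LP relaxation of formulation (IP1) for benzene, a single hexagon, already has an integral optimum, consistent with the integrality result proved below:

from scipy.optimize import linprog

# Benzene: nodes 0..5, edges e_v = {v, v+1 mod 6}, one hexagonal face.
# Row v of [K | R]: the two edges incident to node v, plus the face covering v.
A_eq = []
for v in range(6):
    row = [0.0] * 7            # 6 edge variables x_e, then one face variable y
    row[v] = 1.0               # edge {v, v+1}
    row[(v - 1) % 6] = 1.0     # edge {v-1, v}
    row[6] = 1.0               # the face boundary contains every node
    A_eq.append(row)
c = [0.0] * 6 + [-1.0]         # maximize 1^T y  <=>  minimize -y
res = linprog(c, A_eq=A_eq, b_eq=[1.0] * 6, bounds=(0, None))
print(res.x)                   # -> x = 0, y = 1: integral, Clar number 1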
Note that this alternative formulation of the maximum resonant set problem
is valid for the incidence matrix U of any framework of (V, E, F ). To facilitate the
subsequent analysis of this integer program, it is advantageous to assume that
U is the incidence matrix of a simple framework. Recall this occurs whenever
the framework is oriented.
The incidence matrix of a simple framework has the following useful property.
Proof. Since unimodularity is preserved when rows and columns are multiplied by −1, it is equivalent to show that the following matrix is unimodular:

    A = ( 0  0  K )
        ( I  U  I )
    ( 0       0       K_{E′} )   ( u )
    ( I_{D′}  U_{F′}  I_{E′} ) · ( v ) = 0            (1)
                                 ( w )
If E′ is empty, we can set all entries of u equal to 1 and all entries of v equal to −1. Otherwise, let G′ denote the plane subgraph of G = (V, E) induced by E′. Since K_{E′} is column eulerian, G′ is a plane graph with all even degree nodes. By Proposition 3, we can assume that G′ is decomposed into intrafacial cycles. Since these cycles are even and edge-disjoint, each cycle can be signed independently of the others. Note that in any signing, if we set w_e equal to the value given to edge e in the cycle signing, for each e ∈ E′, then K_{E′} w = 0. In particular, we choose to sign the cycles so that, for each face f ∈ F, w_e = w_{e′} for all pairs of edges e and e′ in ϕ(f) ∩ E′. We showed in Proposition 4 that such a signing exists.
To assign values to the entries of u and v we proceed as follows. For each f ∈ F′, if ϕ(f) ∩ E′ ≠ ∅, we let v_f = −w_e for e ∈ ϕ(f) ∩ E′. Otherwise, if ϕ(f) ∩ E′ = ∅, v_f can be chosen arbitrarily, say v_f = 1. Finally, for each e ∈ D′, either e ∈ E′ or e ∈ ϕ(f) for some f ∈ F′. In the first case, we assign u_e = −w_e and, in the second case, we let u_e = −v_f. It can now easily be seen that the vectors u, v, w so defined are a solution to the homogeneous system (1), establishing that the columns of A′ are linearly dependent. ⊓⊔
For any given framework incidence matrix U, let P2(b′) denote the polyhedron defined by the system of equations with constraint matrix of problem IP2 and integer right hand side vector b′, together with nonnegativity constraints on all variables. Similarly, let P1(b) denote the polyhedron defined by the system of equations with constraint matrix of problem IP1 with integer right hand side vector b, and nonnegativity constraints on all variables. Proving the following lemma is straightforward.
Lemma 8. Let b be an integer vector such that P1(b) is nonempty. Then, there exists an integer vector b′ such that (x, y) is an extreme point of P1(b) if and only if (x, y, x + Uy) is an extreme point of P2(b′).
Note that Lemma 8 holds for the incidence matrix U of any framework. In par-
ticular, if we choose the framework to be simple, then combining Theorem 7,
Lemma 8, and Theorem 1 establishes the unimodularity of the constraint matrix
of problem IP1.
Definition 11. A cut cover of (V, E, F ) is a set of cuts C such that each face
in F is intersected by at least one cut in C.
Theorem 13 ([6]). The weight of a cut is the same for all perfect matchings.
Thus, for any cut cover C of (V, E, F ), its weight m(C) is also independent
of the perfect matching. Let cov(V, E, F ) denote the value of a minimum weight
cut cover for (V, E, F). Then

    res(V, E, F) ≤ cov(V, E, F).
Hansen and Zheng [6] conjectured that the above inequality is satisfied with equality by all benzenoid systems. In what follows, we sketch our proof that this conjecture holds true for the class of plane maps we consider in this paper. To
accomplish this, we will define a minimum cost network flow problem associated
with the minimum weight cut cover problem. The directed graph of this flow
problem is based on the geometric dual graph of the plane map (V, E, F ).
Let G∗ = (F ∪ {t}, E ∗ ) denote the dual graph of (V, E, F ), where t denotes
the external face of G. We now transform G∗ into a directed graph. Let ϕ denote
the clockwise oriented framework of (V, E, F ) and let {f, i} be an edge in E ∗ .
Without loss of generality we can assume f ∈ F . If the edge that separates faces f
and i in (V, E, F ) belongs to ϕ(f ), then {f, i} becomes arc (f, i). Otherwise, {f, i}
becomes the arc (i, f ). Next, for each node f ∈ F , we perform a node splitting
transformation to obtain nodes f1 ∈ F1 and f2 ∈ F2 connected by the arc
(f1 , f2 ) ∈ AF , where F1 and F2 are two copies of F and AF is the set of resulting
arcs connecting nodes in these two sets. Outgoing (incoming) arcs of node f
become outgoing (incoming) arcs of node f2 (f1 ). Let D = (F1 ∪F2 ∪{t}, AE ∪AF )
denote the resulting directed graph, where AE is the set of arcs connecting nodes
in different faces. Observe that each directed cycle in D = (F1 ∪F2 ∪{t}, AE ∪AF )
corresponds to a cut of (V, E, F ).
We now define a minimum cost circulation problem on the directed graph
D. A lower bound of 1 is imposed on the flow value of each arc in AF (this
enforces that all faces in F are covered by cuts). Flow on all remaining arcs is
required to be nonnegative. The cost assigned to each arc in AF is zero. Finally,
let p ∈ {0, 1}E be the incidence vector of a perfect matching for G = (V, E).
The cost of arcs in AE is given by the corresponding entry of the vector p. Let γ
denote the optimal value of this network flow problem. Since the right hand side
data for this network flow problem is integer, there exists an optimal solution
to the problem that is integer. As integer solutions can be interpreted as cut
covers, it follows that cov(V, E, F ) ≤ γ. We next prove that γ ≤ res(V, E, F ).
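A sketch of this construction (ours; it assumes networkx, a hypothetical data layout, and handles the flow lower bounds by the standard reduction to node demands, where in networkx a node's inflow minus outflow must equal its demand):

import networkx as nx

def min_weight_cut_cover_value(arcs_AF, arcs_AE, p_cost):
    # arcs_AF: node-splitting arcs (f1, f2), flow lower bound 1, cost 0
    # arcs_AE: dual-graph arcs; cost p_cost[(u, v)] is the entry of the matching vector p
    G = nx.DiGraph()
    demand = {}
    for (u, v) in arcs_AF:
        G.add_edge(u, v, weight=0)            # capacity defaults to infinity
        demand[u] = demand.get(u, 0) + 1      # tail must re-collect the forced unit
        demand[v] = demand.get(v, 0) - 1      # head receives the forced unit
    for (u, v) in arcs_AE:
        G.add_edge(u, v, weight=p_cost[(u, v)])
    for node, d in demand.items():
        G.add_node(node, demand=d)
    flow = nx.min_cost_flow(G)                # feasible by construction of D
    return nx.cost_of_flow(G, flow)           # forced AF units cost 0, so this is gamma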
Let Ū be the incidence matrix of the counterclockwise oriented framework of (V, E, F) and let a ∈ {−1, 0, 1}^E be defined as a = (U − Ū)1. We note that the nonzero entries of a correspond to the edges in the perimeter of (V, E, F) and they alternate in sign along the perimeter.
    minimize   1^T y
    subject to U u − Ū v − a w + x = p            (2)
               y − u + v = 0                      (3)
               y, u, v, w, x ≥ 0                  (4)

    Kx + R(u − v) = Kp = 1
The arguments in the proof of Theorem 15 also show that res(V, E, F ) and
cov(V, E, F ) can be computed in strongly polynomial time using a network flow
algorithm.
References
1. J. Aihara. On the number of aromatic sextets in a benzenoid hydrocarbon. Bulletin
of the Chemical Society of Japan, 49:1429–1430, 1976.
2. B. Bollobás. Graph Theory: An Introductory Course. Springer-Verlag, New York,
1979.
3. E. Clar. The Aromatic Sextet. John Wiley & Sons, London, 1972.
4. S. J. Cyvin and I. Gutman. Kekulé Structures in Benzenoid Hydrocarbons.
Springer-Verlag, Berlin, 1988.
5. I. Gutman and S. J. Cyvin. Introduction to the Theory of Benzenoid Hydrocarbons,
Springer-Verlag, Berlin, 1989.
6. P. Hansen and M. Zheng. Upper bounds for the Clar number of benzenoid hy-
drocarbons. Journal of the Chemical Society, Faraday Transactions, 88:1621–1625,
1992.
7. P. Hansen and M. Zheng. The Clar number of a benzenoid hydrocarbon and linear
programming. Journal of Mathematical Chemistry, 15:93–107, 1994.
8. P. Hansen and M. Zheng. Numerical bounds for the perfect matching vectors of
a polyhex. Journal of Chemical Information and Computer Sciences, 34:305–308,
1994.
9. F. Harary. Graph Theory. Addison-Wesley, Boston, 1969.
10. G. L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. John
Wiley & Sons, New York, 1988.
11. A. Schrijver. Theory of Integer and Linear Programming. John Wiley & Sons, New
York, 1986.
1 Introduction
penalty of cµ . The problem is then to find a linear order of the probes minimizing
the sum of penalties for violated constraints.
This Weighted Betweenness Problem can also be stated in a different version: A 0/1 matrix A ∈ {0, 1}^{m×n} has the consecutive ones property (for rows) if the columns of A can be permuted so that the 1's in each row appear consecutively. For a 0/1 matrix B ∈ {0, 1}^{m×n} having the consecutive ones property, let n^B_ρ (n^B_µ) denote the number of 1's (of 0's) that have to be switched to transform A into B. For given nonnegative numbers c_ρ and c_µ, we define the Weighted Consecutive Ones Problem as the task of finding a matrix B with the consecutive ones property minimizing c_ρ n^B_ρ + c_µ n^B_µ. This problem is known to be NP-hard [2]. All column permutations π of a feasible matrix B such that the 1's in each row of B^π appear consecutively can be found in time linear in the number of 1's in B by a so-called PQ-tree algorithm [1].
For our discussion, we assume that the data is given as a clone × probe 0/1-matrix A where a_{i,i_t} and a_{i,i_h} are fixed to 1. The other entries are obtained from some experiment, where an entry a_{ij} = 1 gives rise to a betweenness constraint (i_t, j, i_h) and an entry a_{ij} = 0 corresponds to a non-betweenness constraint on (i_t, j, i_h). A solution of the Weighted Betweenness Problem then corresponds to a solution of the Weighted Consecutive Ones Problem with the additional constraint that in some column-permuted matrix B^π (in which the 1's in each row appear consecutively) the first and the last 1 in each row i correspond to the end probes of clone i.
The Weighted Consecutive Ones Problem models the biological situation
when the information that the probes are the ends of clones is missing [8]. Note
that by introducing artificial variables the Weighted Consecutive Ones Problem
can easily be transformed to a Weighted Betweenness Problem.
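For concreteness, a small evaluator (our own sketch; the data layout is hypothetical) for the objective of the Weighted Betweenness Problem under a given probe order:

def betweenness_penalty(order, clones, A, c_rho, c_mu):
    # order: list of probes; clones: dict i -> (i_t, i_h); A[i]: dict probe j -> a_ij
    pos = {p: k for k, p in enumerate(order)}
    penalty = 0
    for i, (t, h) in clones.items():
        lo, hi = sorted((pos[t], pos[h]))
        for j, a_ij in A[i].items():
            if j in (t, h):
                continue
            between = lo < pos[j] < hi
            if a_ij == 1 and not between:
                penalty += c_rho   # violated betweenness constraint (i_t, j, i_h)
            elif a_ij == 0 and between:
                penalty += c_mu    # violated non-betweenness constraint
    return penalty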
The paper is organized as follows. Section 2 discusses our previous approach, which has already been applied successfully. An improved approach is presented in Section 3, leading to the definition of the betweenness polytope. This polytope is then studied in the following section. Separation in the context of a branch-and-cut algorithm to solve the betweenness problem to optimality is the topic of Section 5. Computational results conclude this paper.
2 A First IP Model
Let m be the number of clones and n be the number of probes. In our special setup we have n = 2m, and we have n − 2 betweenness constraints (i_t, j, i_h) for every clone i. We write, for short, (i, j) for the betweenness constraint (i_t, j, i_h), and call (i, j) a clone-probe pair. The set B_m of all possible clone-probe pairs is
    B_m := {(i, j) | 1 ≤ i ≤ m, 1 ≤ j ≤ n, j ≠ i_t, j ≠ i_h}.
To ensure that the x_{ij} count violations of the betweenness and non-betweenness constraints, we have to add further inequalities. To force an x_{ij}, (i, j) ∈ B_m, to be 0 if and only if (i_t, j, i_h) is violated (or the non-betweenness constraint on (i_t, j, i_h) is satisfied), we add
    0 ≤ x_{ij} ≤ 1.
The objective function will force these variables to have integer values if the
linear ordering variables are integer. Note that the objective function is zero on
the linear ordering variables.
With every feasible solution of the problem, corresponding to a permutation π of the probes, we associate 0/1-vectors ψ^π ∈ {0, 1}^{n(n−1)} and χ^π ∈ {0, 1}^{|B_m|} with
    ψ^π_{ij} = 1 if i precedes j in the order π, and 0 otherwise,
and
    χ^π_{ij} = 1 if constraint (i_t, j, i_h) is met in the order π, and 0 otherwise.
The polytope P^m_{LOBW} associated with the instance of the Weighted Betweenness Problem is the convex hull of all possible vectors (ψ^π, χ^π):
    P^m_{LOBW} = conv { (ψ^π, χ^π) : π is a permutation of the probes }.
Conversely, for a given χ there exist one or more feasible settings of ψ. These settings cannot be read off directly, but they can be obtained in linear time by application of the PQ-tree algorithm [1]. For a given χ we define for every clone i three sets in the following way:
    S_i^1 = {j | χ_{ij} = 1} ∪ {i_t},
    S_i^2 = {j | χ_{ij} = 1} ∪ {i_h},
    S_i^3 = {j | χ_{ij} = 1} ∪ {i_t} ∪ {i_h}.
The feasible settings of the linear ordering variables ψ correspond to all permutations of {1, . . . , n} where the elements of every set introduced above occur in consecutive order.
We now define the projection of P^m_{LOBW} onto the χ variables:
    P^m_{BW} = conv { χ | there exists ψ such that (ψ, χ) ∈ P^m_{LOBW} }.
clones, the application of the PQ-tree algorithm would yield the result that no permutation exists in which the elements of the sets appear as consecutive subsequences. Otherwise, if χ ∈ P^m_{BW}, then the PQ-tree provides all consistent permutations.
Because a feasible linear ordering can be derived from a χ ∈ P^m_{BW} and the objective function is zero for the linear ordering variables, they can be omitted. A corresponding optimization problem then reads

    max c^T x
    x ∈ P^m_{BW}
    x ∈ {0, 1}^{2m(m−1)}
Proposition 1. χ ∈ P^m_{BW} if and only if Mχ has the consecutive ones property for rows.
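Since PQ-trees are beyond the scope of this sketch, the following brute-force check (ours, exponential and only for tiny matrices) makes Proposition 1 testable:

from itertools import permutations

def ones_consecutive(row):
    ones = [k for k, v in enumerate(row) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

def has_consecutive_ones_property(M):
    # try every column permutation; a PQ-tree algorithm [1] does this in linear time
    ncols = len(M[0])
    return any(all(ones_consecutive([row[j] for j in perm]) for row in M)
               for perm in permutations(range(ncols)))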
4 The Polytope P^m_{BW}
In this section we will investigate some properties of the polytope P^m_{BW} (see also [13] for a more detailed discussion). In particular we are interested in exhibiting classes of facet-defining inequalities.
for
    (i, j) ∈ {(r, m_t), (r, m_h), (m, r_t)}.
Moreover, for 1 ≤ l ≤ m, we have
    χ^{ρ_{k−2+3l}}_{lm_t} = χ^{ρ_{k−1+3l}}_{lm_h} = χ^{ρ_{k+3l}}_{ml_t} = 0,
and
    χ^{ρ_{k−2+3l}}_{lm_h} = χ^{ρ_{k−2+3l}}_{ml_t} = 1,
    χ^{ρ_{k−1+3l}}_{lm_t} = χ^{ρ_{k−1+3l}}_{ml_t} = 1,
    χ^{ρ_{k+3l}}_{lm_t} = χ^{ρ_{k+3l}}_{lm_h} = 1.
In addition, for 1 ≤ r, l ≤ m,
    χ^{ρ_{k+3m+l}}_{mr_h} = 1 for l = r, and 0 otherwise,
and, for 1 ≤ r, l ≤ m,
    χ^{ρ_{k−2+3l}}_{mr_h} = χ^{ρ_{k−1+3l}}_{mr_h} = χ^{ρ_{k+3l}}_{mr_h} = 0.
In the following table, the rows are the incidence vectors (χ^{ρ_{k+i}})^T, 1 ≤ i ≤ 4m, restricted to the variables in B_m \ B̄_m. The vectors are partitioned into m blocks of coefficients χ^{ρ_{k+i}}_{lm_t}, χ^{ρ_{k+i}}_{lm_h}, χ^{ρ_{k+i}}_{ml_t}, and the remaining coefficients χ^{ρ_{k+i}}_{m1_h}, . . . , χ^{ρ_{k+i}}_{mm_h}:

                            block 1   · · ·   block m    χ_{m1_h} · · · χ_{mm_h}
    i = 1, 2, 3              1 − I       0        0               0
    ...                        ∗         ⋱        0               0
    i = 3m−2, 3m−1, 3m         ∗         ∗      1 − I             0
    i = 3m+1, . . . , 4m       ∗         ∗        ∗               I
The system has full rank, and since it holds that, for 0 ≤ l ≤ k,
    χ^{ρ_l}_{ij} = χ^{π_l}_{ij} for (i, j) ∈ B̄_m, and χ^{ρ_l}_{ij} = 0 for (i, j) ∈ B_m \ B̄_m,
From these results we obtain that trivial lifting preserves the facet-defining property and that P^m_{BW} is full-dimensional.
Corollary 6. For m ≥ 2,
    dim P^m_{BW} = 2m(m − 1),
i.e., P^m_{BW} is full-dimensional.
Proof. P^2_{BW} is full-dimensional. By application of Theorem 4 with F = P^2_{BW} and G = P^m_{BW} the corollary follows.
We used the algorithms for facet enumeration discussed in [5] to get more insight into the facet structure of P^m_{BW} for m ≤ 4. It is clear that also the facet-defining inequalities of P^m_{BW} can be assigned to equivalence classes.
P^2_{BW} has 7 vertices and is completely described by 3 equivalence classes of facets. P^3_{BW} has 172 vertices and 241 facets in total, which can be partitioned into 16 equivalence classes of facets. They are displayed in Table 1 (f̃^i denotes the clone-clone matrix representation of f^i). Observe that the (lifted) facets of P^2_{BW} are among the facets of P^3_{BW} ((f^1)^T x ≤ f^1_0 to (f^3)^T x ≤ f^3_0).
The computation of the complete linear description of P^4_{BW}, which has 9,197 vertices, was not possible in reasonable time. However, by an algorithm for parallel facet enumeration (see [5]) we found more than 1.16 · 10^7 (!) different equivalence classes of facet-defining inequalities, yielding a lower bound of 4.4 · 10^9 for the number of facets of P^4_{BW}.
Table 1. Facet-defining inequalities (f^i)^T x ≤ f^i_0 of P^3_{BW}.
[Table 1 shows the 16 clone-clone matrices f̃^1, . . . , f̃^16 together with their right hand sides: f^1_0 = 2, f^2_0 = 0, f^3_0 = 0, f^4_0 = 2, f^5_0 = 2, f^6_0 = 2, f^7_0 = 2, f^8_0 = 4, f^9_0 = 4, f^{10}_0 = 2, f^{11}_0 = 2, f^{12}_0 = 2, f^{13}_0 = 2, f^{14}_0 = 2, f^{15}_0 = 6, f^{16}_0 = 2.]
    Σ_{i=1}^{m−1} (x_{i(i+1)_t} + x_{(i+1)i_h}) + x_{m1_t} + x_{1m_h} ≤ 2m − 2        (4)
Fig. 1. Cycle inequality with 4 clones
First we show that all coefficients a_{ij} are constant for the clone-probe pairs (i, j) corresponding to the nonzero coefficients in (4).
Let
    π_1 = (1_h, 2_t, 3_t, 2_h, . . . , i_t, (i−1)_h, . . . , m_t, (m−1)_h, m_h, 1_t),
    π_2 = (2_t, 1_h, 3_t, 2_h, . . . , i_t, (i−1)_h, . . . , m_t, (m−1)_h, m_h, 1_t).
It is easy to see that χ^{π_1} and χ^{π_2} satisfy (4) with equality. Thus a^T χ^{π_1} = a^T χ^{π_2}. This yields a_{12_t} = a_{21_h}.
Because R is invariant under the cyclic permutation 1 → 2 → · · · → m → 1, the same operations applied to π_1 and π_2 imply that
we obtain
Again, the associated incidence vectors satisfy (4) with equality. The equations a^T χ^{ρ_i} = a^T χ^{ρ_{i+1}}, for 1 ≤ i ≤ m − 2, give
With (7) and (8) we have shown that a_{13_t} = a_{31_h} = a_{14_t} = a_{41_h} = · · · = a_{1m_t} = a_{m1_h} = 0.
Applying cyclic permutations to ρ_i and τ_i, for 1 ≤ i ≤ m − 1, we finally obtain that all coefficients a_{ij} are zero for all zero-coefficients in (4).
Hence, the left hand side of (3) is a multiple of the left hand side of (4). By the properties of (3) and (4), (4) induces the same facet as (3).
5 Separation Procedures
In our computations, we do not use the IP formulation presented above, because on the one hand it is fairly complicated, and on the other hand the inequalities
The separation of the cycle inequalities can be done by a shortest path algorithm.
Let x* be an LP solution and D_n = (V_n, A_n) be the complete directed graph on n nodes with edge weights
Then it is easy to see by the following transformation that a cycle with weight less than 2 corresponds to a violated inequality. Let y* = 1 − x*. Then x* violates a cycle inequality (4) if and only if y* violates an inequality
    Σ_{(i,j)∈R} (x_{c(i)j} + x_{c(j)q(i)}) ≥ 2,
which is true if and only if w(R) < 2 for the cycle R in D_n = (V_n, A_n).
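A sketch of this separation routine (ours; it assumes the arc weights w of D_n have already been computed from y* and that networkx is available; since the weights are nonnegative, Dijkstra applies):

import networkx as nx

def separate_cycle_inequality(n, w, threshold=2.0):
    # w[(i, j)]: nonnegative weight of arc (i, j) in the complete digraph D_n
    G = nx.DiGraph()
    G.add_weighted_edges_from((i, j, w[(i, j)])
                              for i in range(n) for j in range(n) if i != j)
    for u in range(n):
        dist, path = nx.single_source_dijkstra(G, u)
        for v in range(n):
            if v != u and dist[v] + w[(v, u)] < threshold:
                return path[v] + [u]   # cycle of weight < 2: violated inequality
    return None                        # no violated cycle inequality found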
    C(σ) = Σ_{i=1}^{m} Σ_{j=1, j≠i}^{m} (x*_{ij_t} g_{σ(i)σ(j)_t} + x*_{ij_h} g_{σ(i)σ(j)_h}).
It is clear that the set {σ | C(σ) > g_0} gives all violated inequalities among all inequalities which are (by means of relabeling of the clones) equivalent to g^T x ≤ g_0. Thus the separation problem for equivalent inequalities (relative to relabeling of the clones) is a quadratic assignment problem in which all entries are 2-dimensional vectors.
6 Computational Results
In our branch-and-cut computations we used ABACUS [14,9,10] with LP solver
CPLEX [7] and an implementation of the PQ-tree algorithm [11].
In [4] we used a set of 150 randomly generated problem instances with 42 to 70
probes which resemble instances occurring in practical applications. To generate
the data the clones were randomly distributed across the chromosome. We used
a coverage (which gives the average number of how often a single point along the
chromosome is covered by a clone) varying from 3 to 5, a false negative rate of
10%, and a false positive rate varying from 0% to 5%. To create a false positive
or false negative, a coin was flipped at each entry with the given probability.
Across experiments with varying coverage, the clone length was held constant.
Our computational experiments show that the new approach clearly super-
sedes our previous approach. Table 2 compares our previous approach (using
linear ordering variables) with the new model described here. The table displays
the average CPU-times ttot (in hh:mm:ss, on a Sun Sparc IPX), the average
number of nodes of the branch-and-cut tree nsub , the average number of cutting
planes which are added to the linear program ncut and the average number of
LP reoptimizations nlp to solve the problem instances.
facet classes of P^2_{BW} not enough cutting planes are found, we execute the heuristics of Section 5.2 for a small subset (less than 15) of all equivalence classes of facet-defining inequalities of P^4_{BW}. If this separation fails, then we separate the cycle inequalities.
To improve the performance of our branch-and-cut algorithm we developed
algorithms for executing the separation procedures for different classes of facets
in parallel. In addition, we studied by computational experiments two important
questions in detail:
– Which and how many classes of facets should be used for separation?
– Given a large set of cutting planes generated by the separation heuristics,
which ones should be selected and should be added to the LP?
Concerning the first problem, we observed that facet classes with a large
number of roots (feasible incidence vectors contained in the facet) are the most
important ones. The best solution to the second problem is to use cutting planes
whose normal vectors have a small angle with the objective function gradient.
Detailed computational experiments using small instance relaxations in parallel
branch-and-cut algorithms are reported in [3,6].
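The second selection criterion can be stated in one formula; a tiny helper (ours, for illustration) for ranking candidate cuts:

import math

def cut_angle(normal, grad):
    # angle between a cut's normal vector and the objective function gradient;
    # cuts with a small angle are preferred when selecting from the pool
    dot = sum(a * g for a, g in zip(normal, grad))
    nn = math.sqrt(sum(a * a for a in normal))
    ng = math.sqrt(sum(g * g for g in grad))
    return math.acos(max(-1.0, min(1.0, dot / (nn * ng))))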
For the Weighted Betweenness Problem we tested 25 different strategies on
8 randomly generated instances with the following characteristics: 125 clones
(=250 probes), a coverage of 4, a false positive rate of 5% and a false negative
rate of 10%. With our best strategy we could solve the 8 instances on a parallel
machine with 8 Motorola Power PC processors (7 of them reserved for executing
the separation heuristics) in 2:01:34h total time.
References
1. K. Booth and G. Lueker. Testing for the consecutive ones property, interval graphs,
and graph planarity using PQ-tree algorithms. Journal of Computer and System
Sciences, 13:335–379, 1976.
2. K. S. Booth. PQ-Tree Algorithms. PhD thesis, University of California, Berkeley,
1975.
3. T. Christof. Low-Dimensional 0/1-Polytopes and Branch-and-Cut in Combinato-
rial Optimization. Aachen: Shaker, 1997.
4. T. Christof, M. Jünger, J. Kececioglu, P. Mutzel, and G. Reinelt. A branch-and-cut
approach to physical mapping of chromosomes by unique end-probes. Journal of
Computational Biology, 4(4):433–447, 1997.
5. T. Christof and G. Reinelt. Efficient parallel facet enumeration for 0/1 polytopes.
Technical report, University of Heidelberg, Germany, 1997.
6. T. Christof and G. Reinelt. Algorithmic aspects of using small instance relaxations
in parallel branch-and-cut. Technical report, University of Heidelberg, Germany,
1998.
7. CPLEX. Using the CPLEX Callable Library. CPLEX Optimization, Inc, 1997.
8. M. Jain and E. W. Myers. Algorithms for computing and integrating physical maps
using unique probes. In First Annual International Conference on Computational
Molecular Biology, pages 84–92. ACM, 1997.
are very large (≈ 10^7). The special characteristics of these instances – some very large and some relatively small coefficients and a very large right-hand side value a_0 – are due to the difference in periodicity of the nested loops. This difference is explained by the composition of a television screen image. Such an image consists of 625 lines, and each line is composed of 720 pixels. Every second, 25 pictures are shown on the screen, so the time between two pictures is 40 ms. The times between two lines and between two pixels are 64 µs and 74 ns, respectively. Since
the output rate of the signals has to be equal to the input rate, we get large
differences in periodicity when the data stream corresponds to operations that
have to be repeated for all screens, lines and pixels. Due to the large difference
in the magnitude of the coefficients we often observe that the LP-based branch-and-bound algorithm terminates with a solution in which, for instance, variable x_j takes the value 4.999999, simply because the hardware does not allow for greater precision. If one rounds x_j to x_j = 5.0, then one obtains a vector x such that ax ≠ a_0. It is obviously a serious drawback that the algorithm
terminates with an infeasible solution.
To overcome the mentioned deficiencies we have developed an algorithm
based on the L3 basis reduction algorithm as developed by Lenstra, Lenstra
and Lovász [10]. The motivation behind choosing basis reduction as a core of
our algorithm is twofold. First, basis reduction allows us to work directly with in-
tegers, which avoids the round-off problems. Second, basis reduction finds short,
nearly orthogonal vectors belonging to the lattice described by the basis. Given
the lower and upper bounds on the variables, we can interpret problem (1) as
checking whether there exists a short vector satisfying a given linear diophantine
equation. It is easy to find an initial basis that describes the lattice containing
all vectors of interest to our problem. This initial basis is not “good” in the
sense that it contains very long vectors, but it is useful as we can prove struc-
tural properties of the reduced basis obtained by applying the L3 algorithm to
it. It is important to note that basis reduction does not change the lattice, it
only derives an alternative way of spanning it. Furthermore, our algorithm is
designed for feasibility problems. Once we have obtained the vectors given by
the reduced basis, we use them as input to a heuristic that tries to find a feasible
vector fast or, in case the heuristic fails, we call an algorithm that branches on
linear combinations of vectors and yields either a vector satisfying the bound
constraints, or a proof that no such vector exists.
In Sect. 2 we give a short description of the L3 basis reduction algorithm
and a brief review of the use of basis reduction in integer programming. In Sect.
3 we introduce a lattice that contains all interesting vectors for our problem
(1), and provide an initial basis spanning that lattice. We also derive structural
properties of the reduced basis. Our algorithm is outlined in Sect. 4, and we
report on our computational experience in Sect. 5.
We are indebted to Laurence Wolsey for stimulating discussions and for nu-
merous suggestions on how to improve the exposition.
Let ‖·‖ denote the Euclidean length in IR^n. Lenstra, Lenstra and Lovász [10] used the following definition of a reduced basis:

Definition 2. A basis b_1, b_2, . . . , b_l is reduced if
    |µ_{jk}| ≤ 1/2 for 1 ≤ k < j ≤ l                                   (5)
and
    ‖b*_j + µ_{j,j−1} b*_{j−1}‖^2 ≥ (3/4)‖b*_{j−1}‖^2 for 1 < j ≤ l,   (6)
where b*_1, . . . , b*_l is the Gram-Schmidt orthogonalization of the basis and µ_{jk} = (b_j · b*_k)/(b*_k · b*_k).
Assume that we start with a full-dimensional bounded convex body X ⊂ IR^n and that we consider the lattice ZZ^n. The problem is to determine whether there exists a vector x ∈ (X ∩ ZZ^n). We refer to this problem as problem P. We use b_j = e_j, 1 ≤ j ≤ n, as a basis for the lattice ZZ^n, where e_j is the j-th unit vector, i.e., all its elements are equal to zero except element j, which is equal to one. To avoid having a convex body that is thin, we apply a linear transformation τ to X to make it appear "regular". Problem P is equivalent to the problem of determining whether there exists a vector x ∈ (τX ∩ τZZ^n). The new convex body τX has a regular shape, but the basis vectors τe_j are not necessarily orthogonal any longer, so from the point of view of branching the difficulty is still present.
We can view this as having shifted the problem we had from the convex body
to the lattice. This is where basis reduction proves useful. By applying the L3
algorithm to the basis vectors τ ej , we obtain a new basis b̂1 , . . . , b̂n spanning the
same lattice, τ ZZ n , but having short, nearly-orthogonal vectors. In particular it
is possible to show that the distance d between any two consecutive hyperplanes H + k b̂_n and H + (k + 1) b̂_n, where H = Σ_{j=1}^{n−1} IR b̂_j and k ∈ ZZ, is not too short,
which means that if we branch on these hyperplanes, then there cannot be too
many of them. Each branch at a certain level of the search tree corresponds to
a subproblem with dimension one less than the dimension of its predecessor. In
Fig. 2 we show how the distance between hyperplanes H + k b̂n increases if we
use a basis with orthogonal vectors instead of a basis with long non-orthogonal
ones.
Fig. 2. The convex body τX with basis vectors b_1, b_2: (a) a long, non-orthogonal basis; (b) a reduced basis.
    B = ( I^{(n)}     0^{(n×1)} )
        ( 0^{(1×n)}   N_1       )
        ( N_2 a       −N_2 a_0  )
where N_1 and N_2 are large integral numbers. Note that the basis vectors are given columnwise, and that the basis consists of n + 1 vectors b_j = (b_{1j}, . . . , b_{n+2,j})^T, 1 ≤ j ≤ n + 1. A vector (x, x_0)^T in the lattice ZZ^{n+1} that belongs to N(a, −a_0) corresponds to a vector (x, N_1 x_0, 0)^T in the lattice L. We will show below that by choosing the multipliers N_1 and N_2 large enough the reduced basis will contain a vector (x_d, N_1, 0)^T, and vectors (x_j, 0, 0)^T, 1 ≤ j ≤ n − 1. The vector x_d is a solution to the equation ax = a_0, and the vectors x_j are solutions to ax = 0, i.e., they belong to the null-space N(a). Since the vectors x_d and x_j, 1 ≤ j ≤ n − 1, belong to the reduced basis we can expect them to be relatively short.
Lemma 1 (Lenstra, Lenstra, Lovász [10]). Let Λ ⊂ IR^n be a lattice with reduced basis b_1, b_2, . . . , b_l ∈ IR^n. Let y_1, y_2, . . . , y_t ∈ Λ, t ≤ l, be linearly independent. Then
    ‖b_j‖^2 ≤ 2^{n−1} max{‖y_1‖^2, ‖y_2‖^2, . . . , ‖y_t‖^2} for 1 ≤ j ≤ t.    (9)
Suppose that b̂_{n+1,j} ≠ 0 for some j, 1 ≤ j ≤ n − 1. Then ‖b̂_j‖^2 ≥ b̂_{n+1,j}^2 ≥ N_1^2, as N_1 divides b̂_{n+1,j}. As a consequence, ‖b̂_j‖^2 > N_0^2, which contradicts the outcome of Lemma 1. We therefore have that b̂_{n+1,j} = 0 for 1 ≤ j ≤ n − 1.
Due to Lemma 1, the following holds for the reduced basis vectors b̂_j, 1 ≤ j ≤ n:
    ‖b̂_j‖^2 ≤ 2^{n+1} max{‖z_1‖^2, . . . , ‖z_n‖^2}
            = 2^{n+1} max{‖z_1‖^2, . . . , ‖z_{n−1}‖^2, a_0^2 + N_1^2 a_n^2}
            = 2^{n+1} max{‖v_1‖^2, . . . , ‖v_{n−1}‖^2, a_0^2 + N_1^2 a_n^2}
            ≤ 2^{n+1} max{‖v_1‖^2, . . . , ‖v_{n−1}‖^2, N_1^2 ‖v_n‖^2}
            ≤ N_1^2 2^{n+1} max{‖v_1‖^2, . . . , ‖v_{n−1}‖^2, ‖v_n‖^2} < N_1^2 N_0^2.
If b̂_{n+2,j} ≠ 0 for some j, 1 ≤ j ≤ n, then ‖b̂_j‖^2 ≥ b̂_{n+2,j}^2 ≥ N_2^2, since N_2 divides b̂_{n+2,j}. This implies that ‖b̂_j‖^2 > N_1^2 N_0^2, which yields a contradiction. We can therefore conclude that b̂_{n+2,j} = 0 for 1 ≤ j ≤ n.
Finally, we prove Property 3. The equation ax = a_0 has a feasible solution since we assume that gcd(a_1, . . . , a_n) = 1 and since a_0 is integral. Let x_d be a solution to ax = a_0. This implies that the vector (x_d, N_1, 0)^T = B(x_d, 1)^T belongs to the lattice L spanned by B. The lattice L is also spanned by the reduced basis RB, and hence the vector (x_d, N_1, 0)^T can be obtained as (x_d, N_1, 0)^T = RB(λ_1, . . . , λ_{n+1})^T. Properties 1 and 2 imply the following:
Example 1. Consider the following instance of problem (1): does there exist a vector x ∈ ZZ^5 such that
    8,400,000x_1 + 4,000,000x_2 + 15,688x_3 + 6,720x_4 + 15x_5 = 371,065,262,
    0 ≤ x_1 ≤ 45, 0 ≤ x_2 ≤ 39, 0 ≤ x_3 ≤ 349, 0 ≤ x_4 ≤ 199, 0 ≤ x_5 ≤ 170?
Let N_1 = 1,000 and N_2 = 10,000. The initial basis B looks as follows:

    B = ( 1 0 0 0 0 0 )
        ( 0 1 0 0 0 0 )
        ( 0 0 1 0 0 0 )
        ( 0 0 0 1 0 0 )
        ( 0 0 0 0 1 0 )
        ( 0 0 0 0 0 N_1 )
        ( 8,400,000N_2  4,000,000N_2  15,688N_2  6,720N_2  15N_2  −371,065,262N_2 )
Notice that the sixth ((n + 1)-st) element of each of the first four (n − 1) basis vectors of RB is equal to zero, and that the last ((n + 2)-nd) element of each of the first five (n) basis vectors is equal to zero. We also note that b̂_{n+1,n} = b̂_{6,5} = N_1 and that the elements of the first n = 5 rows of RB are quite small. ⊓⊔
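The computation in Example 1 can be reproduced along the following lines (our sketch; it assumes the fpylll library, which reduces row bases, so the basis B is fed in transposed):

from fpylll import IntegerMatrix, LLL

def reduced_basis(a, a0, N1=1000, N2=10000):
    n = len(a)
    rows = [[1 if i == j else 0 for i in range(n)] + [0, N2 * a[j]]
            for j in range(n)]                  # columns b_j = (e_j, 0, N2*a_j)
    rows.append([0] * n + [N1, -N2 * a0])       # column b_{n+1} = (0, N1, -N2*a0)
    B = IntegerMatrix.from_matrix(rows)
    LLL.reduction(B)                            # L^3 reduction, in place
    return B

a = [8400000, 4000000, 15688, 6720, 15]
RB = reduced_basis(a, 371065262)
# one reduced vector has (n+1)-st entry N1; its first n entries give x_d with a.x_d = a0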
We conclude the section with a brief comparison between our initial basis
B and various bases used when trying to find solutions to subset sum problems
that arise in cryptography. In the cryptography application the message that
the "sender" wants to transmit to the "receiver" is represented by a sequence x ∈ {0, 1}^n of "bits". The receiver knows a sequence of numbers a_1, . . . , a_n, and
instead of sending the actual message x, the sender sends a number a0 = ax.
Once the receiver knows a0 he can recover the message x by solving the subset
sum problem ax = a0 . Here, the equation ax = a0 is known to have a solution.
Lagarias and Odlyzko [9] considered the following basis to generate the lattice containing vectors (x, (−ax + a_0 x_0))^T:

    B = ( I^{(n)}   0^{(n×1)} )
        ( −a        a_0       )
Here, I^{(n)} denotes the n-dimensional identity matrix, and 0^{(n×1)} denotes the n × 1-matrix consisting of zeros only. Lagarias and Odlyzko proposed a polynomial time algorithm based on basis reduction. There is no guarantee that the algorithm finds a feasible vector x, but the authors show that for "almost all" instances of density d = n/log_2(max_j a_j) < 0.645, a feasible vector is found.
Several similar bases have been considered later as input to algorithms for trying to find solutions to subset sum problems. Schnorr and Euchner [15], for instance, used the basis

    B = ( diag(2)^{(n×n)}   1^{(n×1)} )
        ( na                na_0      )
        ( 0^{(1×n)}         1         )
where diag(2)^{(n×n)} is the n × n-matrix with twos along the main diagonal and zeros otherwise. Here, a lattice vector v ∈ ZZ^{n+2} that satisfies |v_{n+2}| = 1, v_{n+1} = 0, and v_j ∈ {±1} for 1 ≤ j ≤ n corresponds to a feasible vector x_j = (1/2)|v_j − v_{n+2}|, 1 ≤ j ≤ n. Schnorr and Euchner [15] proposed an algorithm that for "almost all" subset sum problem instances with d < 0.9408 finds a feasible vector. The algorithm uses the above basis as input. For further details on finding
4 The Algorithm
Here we discuss how we can use the properties stated in Theorem 1 to design an algorithm for solving the feasibility problem (1). Since we are only interested in vectors belonging to the null-space N(a, −a_0/N_1), we consider only the first n + 1 elements of the first n vectors of RB. We now have obtained a basis for ZZ^{n+1} ∩ N(a, −a_0/N_1) with the following structure:

    RB′ = ( X               x_d )
          ( 0^{(1×(n−1))}   N_1 )
The last column (x_d, N_1)^T of the matrix RB′ is a solution to the equation ax − (a_0/N_1)x_0 = 0, which implies that the vector x_d is a solution to the equation ax = a_0. All other columns of RB′ are solutions to the equation ax − 0 = 0, i.e., the columns of the submatrix X all lie in the null-space N(a).
In our algorithm we first check whether the vector xd satisfies the lower and
upper bounds. If yes, we are done. If any of the bounds is violated we search for a
vector that is feasible, or for a proof that no feasible vector exists, by branching on linear integer multiples of the vectors in N(a). Note that by adding any linear integer combination x_λ of vectors in N(a) to x_d, we obtain a vector x_d + x_λ that satisfies a(x_d + x_λ) = a_0. For the feasible instances the search for a feasible vector
turned out to be easy. To speed up the algorithm for most of these instances we developed a heuristic as follows. Let X_j be the j-th column of the submatrix X of RB′. Suppose that we are at iteration t of the heuristic and that an integer linear combination of t′ < t vectors of N(a) has been added to the vector x_d. The vector obtained in this way is called the "current vector". For simplicity we assume that only variable x_k of the current vector violates one of its bound constraints. At iteration t we add or subtract an integer multiple λ_t of the column vector X_t if the violation of variable x_k's bound constraint is reduced and if no other bound constraint becomes violated. As soon as the value of x_k satisfies its bounds, we do not consider any larger values of λ_t. If the heuristic does not find any feasible solution, we call an exact branching algorithm that branches on linear combinations of vectors in N(a). A summary of the complete algorithm is given in Fig. 3.
procedure main(a, a_0, u)
begin
    store initial basis B;
    RB = L^3(B);
    extract RB′ from RB;
    if 0 ≤ x_d ≤ u then return x_d;
    heuristic(RB′);
    if heuristic fails then
        branch on linear combinations of columns j = 1, . . . , n − 1 of the submatrix X;
    return feasible vector x, or a proof that no such vector exists;
end

Fig. 3. Algorithm 1.
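A simplified variant of the heuristic step (ours; it scans a symmetric range of integer multipliers per column instead of tracking a single violated variable x_k as described above, and all names are our own):

def violation(x, lb, ub):
    # total amount by which x violates its bound constraints
    return sum(max(lb[i] - v, 0) + max(v - ub[i], 0) for i, v in enumerate(x))

def repair_heuristic(x_d, X_columns, lb, ub, max_mult=1000):
    x = list(x_d)
    for col in X_columns:              # one pass over the null-space vectors of N(a)
        if violation(x, lb, ub) == 0:
            return x                   # feasible: a.x = a0 and lb <= x <= ub
        best, best_v = x, violation(x, lb, ub)
        for lam in range(-max_mult, max_mult + 1):
            cand = [v + lam * c for v, c in zip(x, col)]
            if violation(cand, lb, ub) < best_v:
                best, best_v = cand, violation(cand, lb, ub)
        x = best
    return x if violation(x, lb, ub) == 0 else None   # None: fall back to branching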
5 Computational Experience
We solved thirteen instances of problem (1). Eight of the instances were feasible
and five infeasible. The instances starting with “P” in Table 1 were obtained
from Philips Research Labs. The instances starting with “F” are the Frobenius
instances of Cornuéjols et al. [4]. Here we used the Frobenius number as right-
hand side a0 . The two other instances, starting with “E”, were derived from
F3 and F4. The information in Table 1 is interpreted as follows. In the first
two columns, “Instance” and “n”, the instance names and the dimension of
the instances are given. An “F” in column “Type” means that the instance is
feasible, and an “N” that it is not feasible. In the two columns of LP-based
branch-and-bound, “LP B&B”, the number of nodes and the computing time
are given. In the "# Nodes" column, 500,000* means that we terminated the
search after 500,000 nodes without reaching a result. Two asterisks after the
number of nodes indicate that a rounding error occurred, i.e., that the rounded
solution given by the algorithm did not satisfy the diophantine equation. In both
Table 1. Computational results. For each instance: its name, dimension n, and type; the number of nodes and computing time for LP-based branch-and-bound ("LP B&B"); and the heuristic count ("Heur."), number of nodes, and computing time for our algorithm.
cases we do not report on the computing times since no result was obtained. In
the three columns corresponding to our algorithm, “Algorithm”, the column
“Heur.” gives the number of vectors belonging to N(a) that was used in the
integer linear combination of vectors added to the vector xd by the heuristic in
order to obtain a feasible solution. Notice that for every feasible instance, the
heuristic found a feasible solution. A zero in column “Heur.” therefore means that
the vector xd was feasible. For the infeasible instances the heuristic obviously
failed, and therefore the sign “–” is given in the column. In that case we turn to
the branching phase. Here, a one in the column “# Nodes” means that we solved
the problem in the root node by using logical implications. The computing times
are given in seconds on a 144MHz Sun Ultra-1. For the LP-based branch-and-
bound we used CPLEX version 4.0.9 [6], and in our algorithm we used LiDIA, a
library for computational number theory [12], for computing the reduced basis.
Our results indicate that the instances are rather trivial once they are rep-
resented in a good way. Using the basis ej and branching on variables as in
LP-based branch-and-bound is clearly not a good approach here, but it is the
standard way of tackling integer programs. Using basis reduction seems to give
a more natural representation of the problem. For our instances the computing
times were very short, and, contrary to LP-based branch-and-bound, we avoid
round-off errors. It is also worth noticing that the infeasibility of instances F1–F5
was particularly quickly verified using our algorithm.
References
1. K. Aardal, A. K. Lenstra, and C. A. J. Hurkens. An algorithm for solving
a diophantine equation with upper and lower bounds on the variables. Re-
port UU-CS-97-40, Department of Computer Science, Utrecht University, 1997.
ftp://ftp.cs.ruu.nl/pub/RUU/CS/techreps/CS-1997/
2. W. Cook, T. Rutherford, H. E. Scarf, and D. Shallcross. An implementation of the
generalized basis reduction algorithm for integer programming. ORSA Journal on
Computing, 5:206–212, 1993.
3. D. Coppersmith. Small solutions to polynomial equations, and low exponent RSA
vulnerability. Journal of Cryptology, 10:233–260, 1997.
4. G. Cornuéjols, R. Urbaniak, R. Weismantel, and L. Wolsey. Decomposition of in-
teger programs and of generating sets. In R. Burkard and G. Woeginger, editors,
Algorithms – ESA ’97, LNCS, Vol. 1284, pages 92–103. Springer-Verlag, 1997.
5. M. J. Coster, A. Joux, B. A. LaMacchia, A. M. Odlyzko, and C. P. Schnorr.
Improved low-density subset sum algorithms. Computational Complexity, 2:111–
128, 1992.
6. CPLEX Optimization Inc. Using the CPLEX Callable Library, 1989.
7. B. de Fluiter. A Complexity Catalogue of High-Level Synthesis Problems. Master’s
thesis, Department of Mathematics and Computing Science, Eindhoven University
of Technology, 1993.
8. G. Havas, B. S. Majewski, and K. R. Matthews. Extended gcd and Hermite nor-
mal form algorithms via lattice basis reduction. Working paper, Department of
Mathematics, The University of Queensland, Australia, 1996.
9. J. C. Lagarias and A. M. Odlyzko. Solving low-density subset sum problems. Jour-
nal of the Association for Computing Machinery, 32:229–246, 1985.
10. A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász. Factoring polynomials with
rational coefficients. Mathematische Annalen, 261:515–534, 1982.
11. H. W. Lenstra, Jr. Integer programming with a fixed number of variables. Mathe-
matics of Operations Research, 8:538–548, 1983.
12. LiDIA – A library for computational number theory. TH Darmstadt / Universität
des Saarlandes, Fachbereich Informatik, Institut für Theoretische Informatik.
http://www.informatik.th-darmstadt.de/pub/TI/LiDIA
13. L. Lovász and H. E. Scarf. The generalized basis reduction algorithm. Mathematics
of Operations Research, 17:751–764, 1992.
14. M. C. McFarland, A. C. Parker, and R. Camposano. The high-level synthesis of
digital systems. Proceedings of the IEEE, Vol. 78, pages 301–318, 1990.
15. C. P. Schnorr and M. Euchner. Lattice basis reduction: Improved practical algo-
rithms and solving subset sum problems. Mathematical Programming, 66:181–199,
1994.
16. C. P. Schnorr and H. H. Hörner. Attacking the Chor-Rivest Cryptosystem by
improved lattice reduction. In L. C. Guillou and J.-J. Quisquater, editors, Advances
in Cryptology – EUROCRYPT ’95, LNCS, Vol. 921, pages 1–12. Springer Verlag,
1995.
17. W. F. J. Verhaegh, P. E. R. Lippens, E. H. L. Aarts, J. H. M. Korst, J. L. van Meer-
bergen, and A. van der Werf. Modeling periodicity by PHIDEO streams. Proceed-
ings of the Sixth International Workshop on High-Level Synthesis, pages 256–266.
ACM/SIGDA, IEEE/DATC, 1992.
The Intersection of Knapsack Polyhedra and Extensions

Alexander Martin and Robert Weismantel

Konrad-Zuse-Zentrum Berlin
Takustr. 7
D-14195 Berlin, Germany
{martin, weismantel}@zib.de
1 Introduction
Consider some matrix $A \in \mathbb{R}^{m\times n}$, vectors $b \in \mathbb{R}^m$, $u \in \mathbb{R}^n$, and the polytope
$$P := \mathrm{conv}\,\{x \in \mathbb{Z}^n : Ax \le b,\ 0 \le x \le u\},$$
that is, the convex hull of all integral vectors $x$ satisfying $Ax \le b$ and $0 \le x \le u$. Set $N := \{1,\dots,n\}$. For $S \subseteq N$, define $P_S := P \cap \{x \in \mathbb{R}^n : x_i = 0 \text{ for } i \in N\setminus S\}$, where $P_N = P$, and, for $x \in \mathbb{R}^n$, denote by $x|_S := (x_i)_{i\in S}$ the vector restricted to the components in $S$.
Definition 1. Let $T \subseteq N$ be such that $\sum_{i\in T} A_{\cdot i} v_i \le b$ for all $v \in \mathbb{R}^T$ with $v \le u|_T$. Then $T$ is called a feasible set. Let $w : T \to \mathbb{Z}$ be some weighting of the elements of $T$. For $j \in N\setminus T$ with $\sum_{i\in T} A_{\cdot i} u_i + A_{\cdot j} u_j \not\le b$, the inequality
$$\sum_{i\in T} w_i x_i + w_j x_j \le \sum_{i\in T} w_i u_i \qquad (1)$$
$$w_{\pi_k} := \min_{l=1,\dots,u_{\pi_k}} \frac{\gamma - \gamma(k,l)}{l}. \qquad (4)$$
Note that the right hand side of (2) coincides with (4) applied to variable j
if we substitute in (3) the set T ∪ {j} by T . In other words, a lifted feasible set
inequality associated with T and {j}, where the variables in N \ (T ∪ {j}) are
lifted according to the sequence π1 , . . . , πn−|T |−1 , coincides with the inequality
associated with T , where j is lifted first, and the remaining variables N \(T ∪{j})
are lifted in the same order π1 , . . . , πn−|T |−1 . Thus, instead of speaking of a
feasible set inequality associated with T and {j}, we speak in the sequel of a
feasible set inequality associated with T and view j as the variable that is lifted
first.
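As an illustration of formula (4), the following brute-force sketch (our own code, not the authors') computes a lifting coefficient exactly on tiny examples. The lifting problem (3) is not reproduced in full above, so we assume its standard form: maximize the current weights over the integer box subject to $\sum_i A_{\cdot i} x_i + A_{\cdot\pi_k}\, l \le b$.

import itertools
import numpy as np

def box_max(A, b, u, w, cols):
    # max sum_{i in cols} w[i] * x_i  s.t.  sum_i A[:, i] x_i <= b,
    # 0 <= x_i <= u[i], x integer; brute force (exponential, tiny cases only)
    best = -np.inf
    for x in itertools.product(*[range(u[i] + 1) for i in cols]):
        lhs = np.zeros(A.shape[0])
        for val, i in zip(x, cols):
            lhs += val * A[:, i]
        if np.all(lhs <= b):
            best = max(best, sum(val * w[i] for val, i in zip(x, cols)))
    return best

def lift_next(A, b, u, w, lifted, k, gamma):
    # formula (4): w_k = min over l = 1..u[k] of (gamma - gamma(k, l)) / l,
    # where gamma(k, l) is box_max with right-hand side b - l * A[:, k]
    return min((gamma - box_max(A, b - l * A[:, k], u, w, lifted)) / l
               for l in range(1, u[k] + 1))

For instance, for the feasible set $T = \{2,3\}$ of Example 18 below, lifting $x_4$ first yields the coefficient 1, as stated there.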
Odd hole and clique inequalities for the set packing polytope are examples of lifted feasible set inequalities. Consider the set packing polytope $P = \mathrm{conv}\,\{x \in \{0,1\}^n : Ax \le \mathbb{1}\}$ for some 0/1 matrix $A \in \{0,1\}^{m\times n}$. Let $G_A = (V, E)$ denote the associated column intersection graph, whose nodes correspond to the columns of $A$ and in which nodes $i$ and $j$ are adjacent if and only if the columns associated with $i$ and $j$ intersect in some row. Let $Q \subseteq V$ be a clique in $G_A$; then the clique inequality $\sum_{i\in Q} x_i \le 1$ is valid for $P$. To see that this inequality is a lifted feasible set inequality, let $T = \{i\}$ for some $i \in Q$. The feasible set inequality $x_i \le 1$ is valid for $P_{\{i\}}$. Lifting the remaining variables $k \in Q\setminus\{i\}$ by applying formula (4) yields $w_k = 1$, and the clique inequality follows.
For a feasible set inequality associated with T , the calculation of the lifting
coefficients for the variables in N \ T requires the solution of an integer program.
In this section we study lower and upper bounds for these coefficients. It will turn
out that these bounds are sometimes easier to compute. We assume throughout
the section that A ≥ 0 and wi ≥ 0 for i ∈ T .
$\not\ge$-Incomparability Number
$$\phi^{\not\ge}(v) := \min\Big\{\sum_{i\in T} w_i x_i \,:\, \sum_{i\in T} A_{\cdot i} x_i \not\ge v,\ 0 \le x_i \le u_i,\ x_i \in \mathbb{Z},\ i \in T,\ \exists\, j \in T,\ x_j < u_j : \sum_{i\in T} A_{\cdot i} x_i + A_{\cdot j} \ge v\Big\},$$

$\not\le$-Incomparability Number
$$\phi^{\not\le}(v) := \min\Big\{\sum_{i\in T} w_i x_i \,:\, \sum_{i\in T} A_{\cdot i} x_i \not\le v,\ 0 \le x_i \le u_i,\ x_i \in \mathbb{Z},\ i \in T\Big\}.$$
Proposition 5.
(a) $w_{\pi_1} = \min_{l=1,\dots,u_{\pi_1}} \frac{1}{l}\,\phi^{\ge}(A_{\cdot\pi_1} l - r(T))$.
(b) $w_{\pi_k} \le \min_{l=1,\dots,u_{\pi_k}} \frac{1}{l}\,\phi^{\ge}(A_{\cdot\pi_k} l - r(T))$, for $k = 2,\dots,n-|T|$.
Proof. (a) directly follows from Theorem 2. To see (b), it suffices to show that $\gamma - \gamma(k,l) \le \phi^{\ge}(A_{\cdot\pi_k} l - r(T))$ for $l = 1,\dots,u_{\pi_k}$, see (4). This relation is obtained by
$$\begin{aligned}
\gamma - \gamma(k,l) &= \gamma - \max\Big\{\textstyle\sum_{i\in T} w_i x_i + \sum_{i\in\{\pi_1,\dots,\pi_{k-1}\}} w_i x_i :\\
&\qquad\quad \textstyle\sum_{i\in T} A_{\cdot i} x_i + \sum_{i\in\{\pi_1,\dots,\pi_{k-1}\}} A_{\cdot i} x_i + A_{\cdot\pi_k} l \le b,\ 0 \le x \le u|_{T\cup\{\pi_1,\dots,\pi_{k-1}\}},\ x \in \mathbb{Z}^{T\cup\{\pi_1,\dots,\pi_{k-1}\}}\Big\}\\
&= \min\Big\{\textstyle\sum_{i\in T} w_i x_i - \sum_{i\in\{\pi_1,\dots,\pi_{k-1}\}} w_i x_i :\\
&\qquad\quad \textstyle\sum_{i\in T} A_{\cdot i} x_i - \sum_{i\in\{\pi_1,\dots,\pi_{k-1}\}} A_{\cdot i} x_i \ge A_{\cdot\pi_k} l - r(T),\ 0 \le x \le u|_{T\cup\{\pi_1,\dots,\pi_{k-1}\}},\ x \in \mathbb{Z}^{T\cup\{\pi_1,\dots,\pi_{k-1}\}}\Big\}\\
&\le \min\Big\{\textstyle\sum_{i\in T} w_i x_i \,:\, \sum_{i\in T} A_{\cdot i} x_i \ge A_{\cdot\pi_k} l - r(T),\ 0 \le x \le u|_T,\ x \in \mathbb{Z}^T\Big\}\\
&= \phi^{\ge}(A_{\cdot\pi_k} l - r(T)).
\end{aligned}$$
To derive lower bounds on the lifting coefficients we need the following relations.

Lemma 6. For $v_1, v_2 \in \mathbb{R}^m$ with $v_1, v_2 \ge 0$ the following hold:
(a) $\phi^{\ge}(v_1) \ge \phi^{\not\ge}(v_1)$ and $\phi^{\ge}(v_1) \ge \phi^{\not\le}(v_1)$.
(b) $\phi^{\ge}$, $\phi^{\not\ge}$, and $\phi^{\not\le}$ are monotonically increasing, that is, for $v_1 \ge v_2$: $\phi^{\ge}(v_1) \ge \phi^{\ge}(v_2)$, $\phi^{\not\ge}(v_1) \ge \phi^{\not\ge}(v_2)$, and $\phi^{\not\le}(v_1) \ge \phi^{\not\le}(v_2)$.
(c) $\phi^{\ge}(v_1 + v_2) \ge \phi^{\not\ge}(v_1) + \phi^{\not\le}(v_2)$.
(d) $\phi^{\ge}(v_1 + v_2) + \max\{w_i : i \in T\} \ge \phi^{\not\le}(v_1) + \phi^{\ge}(v_2)$.
(e) $\phi^{\not\le}(v_1 + v_2) + \max\{w_i : i \in T\} \ge \phi^{\not\le}(v_1) + \phi^{\not\le}(v_2)$.
Proof. (a) and (b) are obvious; the proofs of (c) and (e) follow the same line. We show (c) exemplarily. Let $\bar x \in \mathbb{R}^T$ with $0 \le \bar x \le u|_T$ and $\sum_{i\in T} A_{\cdot i} \bar x_i \ge v_1 + v_2$ such that $\sum_{i\in T} w_i \bar x_i = \phi^{\ge}(v_1 + v_2)$. If $v_1 = v_2 = 0$ the statement is trivial. Otherwise, suppose w.l.o.g. $v_1 > 0$. Let $z \in \mathbb{R}^T$ with $z \le \bar x$ such that $\sum_{i\in T} A_{\cdot i} z_i \ge v_1$ and $\sum_{i\in T} w_i z_i$ is minimal. Since $v_1 > 0$ there exists some $i_0 \in T$ with $z_{i_0} > 0$, and since $z$ was chosen to be minimal, we have that $\sum_{i\in T} A_{\cdot i} z_i - A_{\cdot i_0} \not\ge v_1$. This implies that $\sum_{i\in T} A_{\cdot i} (\bar x_i - z_i) + A_{\cdot i_0} \not\le v_2$. Summing up, we get $\phi^{\ge}(v_1 + v_2) = \sum_{i\in T} w_i \bar x_i \ge \phi^{\not\ge}(v_1) + \phi^{\not\le}(v_2)$. Finally, (d) directly follows from (c) and the fact that $\phi^{\not\ge}(v_1) + \max\{w_i : i \in T\} \ge \phi^{\ge}(v_1)$. □
With the help of Lemma 6 we are able to bound the lifting coefficients from
below.
Theorem 7. For $k = 1,\dots,n-|T|$ we have
$$w_{\pi_k} \ge \min_{l=1,\dots,u_{\pi_k}} \frac{\phi^{\not\le}(A_{\cdot\pi_k} l - r(T))}{l} - \max\{w_i : i \in T\}. \qquad (5)$$
Proof. Let $c_{\pi_k} := \min_{l=1,\dots,u_{\pi_k}} \frac{\phi^{\not\le}(A_{\cdot\pi_k} l - r(T))}{l} - \max\{w_i : i \in T\}$ denote the right-hand side of (5), for $k = 1,\dots,n-|T|$. We show by induction on $k$ that the inequality $\sum_{i\in T} w_i x_i + \sum_{i=1}^{k} c_{\pi_i} x_{\pi_i} \le \sum_{i\in T} w_i u_i$ is valid. For $k = 1$, the statement follows from Proposition 5 (a) and Lemma 6 (a). Now let $k \ge 2$ and suppose the statement is true for all $l < k$. Let $\bar x$ be an optimal solution of
$$\begin{aligned}
\max\ & \sum_{i\in T} w_i x_i + \sum_{i=1}^{k} c_{\pi_i} x_{\pi_i}\\
& \sum_{i\in T} A_{\cdot i} x_i + \sum_{i=1}^{k} A_{\cdot\pi_i} x_{\pi_i} \le b\\
& 0 \le x_i \le u_i,\ x_i \in \mathbb{Z} \ \text{ for } i \in T \cup \{\pi_1,\dots,\pi_k\}.
\end{aligned}$$
We must show that $\sum_{i\in T} w_i \bar x_i + \sum_{i=1}^{k} c_{\pi_i} \bar x_{\pi_i} \le \sum_{i\in T} w_i u_i$. First note that the inequality is valid if $\bar x_{\pi_k} = 0$. This is equivalent to saying
$$\phi^{\ge}\Big(\sum_{i=1}^{k-1} A_{\cdot\pi_i} x_{\pi_i} - r(T)\Big) \ge \sum_{i=1}^{k-1} c_{\pi_i} x_{\pi_i},$$
for all $x \in \mathbb{Z}^{\{\pi_1,\dots,\pi_{k-1}\}}$ with $\sum_{i=1}^{k-1} A_{\cdot\pi_i} x_{\pi_i} \le b$, $0 \le x_{\pi_i} \le u_{\pi_i}$, $i = 1,\dots,k-1$. Applying Lemma 6 (d) and (b) we obtain, with $w_{\max} := \max\{w_i : i \in T\}$,
$$\begin{aligned}
\phi^{\ge}\Big(\sum_{i=1}^{k} A_{\cdot\pi_i} \bar x_{\pi_i} - r(T)\Big)
&\ge \phi^{\ge}\Big(\sum_{i=1}^{k-1} A_{\cdot\pi_i} \bar x_{\pi_i} - r(T)\Big) + \phi^{\not\le}(A_{\cdot\pi_k} \bar x_{\pi_k}) - w_{\max}\\
&\ge \sum_{i=1}^{k-1} c_{\pi_i} \bar x_{\pi_i} + \phi^{\not\le}(A_{\cdot\pi_k} \bar x_{\pi_k} - r(T)) - w_{\max}\\
&\ge \sum_{i=1}^{k-1} c_{\pi_i} \bar x_{\pi_i} + \bar x_{\pi_k}\Big(\min_{l=1,\dots,u_{\pi_k}} \frac{\phi^{\not\le}(A_{\cdot\pi_k} l - r(T))}{l} - w_{\max}\Big)\\
&= \sum_{i=1}^{k} c_{\pi_i} \bar x_{\pi_i}.
\end{aligned}$$
On account of $\sum_{i\in T} w_i (u_i - \bar x_i) \ge \phi^{\ge}\big(\sum_{i=1}^{k} A_{\cdot\pi_i} \bar x_{\pi_i} - r(T)\big)$ the statement follows. □
The question remains whether the values $\phi^{\ge}$, $\phi^{\not\ge}$, and $\phi^{\not\le}$ are easier to compute than the exact lifting coefficient. Indeed, they sometimes are. Suppose $w_i = 1$ for all $i \in T$ and consider the comparability digraph $G = (V, E)$ obtained by introducing a node for each column and arcs $(i, j)$ if $A_{\cdot i} \ge A_{\cdot j}$ and $A_{\cdot i} \ne A_{\cdot j}$, or if $A_{\cdot i} = A_{\cdot j}$ and $i > j$, for $i, j \in \{1,\dots,n\}$ (where transitive arcs may be deleted). Let $r$ denote the number of nodes with indegree zero, i.e., $\delta^-(i) = 0$. Then $\phi^{\ge}$, $\phi^{\not\ge}$, and $\phi^{\not\le}$ can be computed in time $O(nr + \alpha)$, where $\alpha$ is the time to construct the comparability digraph. For example, in the case of a single knapsack inequality the comparability digraph turns out to be a path, and thus $\phi^{\ge}$, $\phi^{\not\ge}$, and $\phi^{\not\le}$ can be computed in time $O(n + n\log n) = O(n\log n)$.
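A sketch of this construction (our own illustration): the number $r$ of indegree-zero nodes can be read off the column-domination relation directly; deleting transitive arcs does not change which nodes have indegree zero, since transitive reduction of a DAG preserves reachability.

import numpy as np

def comparability_sources(A):
    # number r of indegree-zero nodes in the comparability digraph of the
    # columns of A (transitive arcs are not removed; this does not change r)
    n = A.shape[1]
    indeg = [0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dominates = np.all(A[:, i] >= A[:, j])
            equal = np.all(A[:, i] == A[:, j])
            if (dominates and not equal) or (equal and i > j):
                indeg[j] += 1
    return sum(d == 0 for d in indeg)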
So far we have been discussing feasible set inequalities for general integer programs. Apart from requiring that, together with $u|_T$, every vector $v \le u|_T$ is feasible for $P$, we have not imposed any assumptions on the matrix $A$. Thus a comparison to Chvátal-Gomory cutting planes, which do not rely on any particular structure of $A$, is natural. Recall that Chvátal-Gomory inequalities for the system $Ax \le b$, $0 \le x \le u$, $x \in \mathbb{Z}^n$ are cutting planes $d^T x \le \delta$ such that $d_i = \lfloor\lambda^T \hat A_{\cdot i}\rfloor$, $i = 1,\dots,n$, and $\delta = \lfloor\lambda^T \hat b\rfloor$ for some $\lambda \in \mathbb{R}^{m+n}_+$, where
$$\hat A = \begin{pmatrix} A \\ I \end{pmatrix} \quad\text{and}\quad \hat b = \begin{pmatrix} b \\ u \end{pmatrix}.$$
Consider a (lifted) feasible set inequality $w^T x \le \sum_{i\in T} w_i u_i$ associated with $T$, whose remaining variables $N\setminus T$ are lifted in the sequence $\pi_1,\dots,\pi_{n-|T|}$. This lifted feasible set inequality is compared to Chvátal-Gomory inequalities resulting from multipliers $\lambda \in \mathbb{R}^{m+n}_+$ that satisfy $\lfloor\lambda^T \hat A_{\cdot i}\rfloor = w_i$ for $i \in T$.
Proposition 9.
(a) $\lfloor\lambda^T \hat b\rfloor \ge \sum_{i\in T} u_i w_i$.
(b) If $\lfloor\lambda^T \hat b\rfloor = \sum_{i\in T} u_i w_i$ and $j$ is the smallest index with $\lfloor\lambda^T \hat A_{\cdot\pi_j}\rfloor \ne w_{\pi_j}$, then $\lfloor\lambda^T \hat A_{\cdot\pi_j}\rfloor < w_{\pi_j}$.
T
Proof. Since T is a feasible set, (a) is obviously true. To see (b) suppose the
contrary and let j be the first index with bλT ·πj c 6= wπj and bλT ·πj c >
P
wπj . Set γ = i∈T ui wi and consider an optimal solution x̄ ∈ ZZ T ∪{π1 ,...,πj }
γ−γ(j,x̄j )
of (3) such that wπj = x̄j . Obviously, x̄ can be extended to a feasible
solution x̃ of P by setting x̃i = x̄i , if i ∈ T ∪ {π1 , . . . , πj }, x̃i = 0, otherwise.
250 Alexander Martin and Robert Weismantel
This solution satisfies the feasible set inequality with equality, since wT x̃ =
P γ−γ(j,x̃j )
i∈T ∪{π1 ,...,πj−1 } wi x̃i +wπj x̃j = γ(j, x̃j )+ x̃j x̃j = γ. On the other hand, j
is the first index where
P the Chvátal-Gomory
P and the feasible set coefficient differ
and we conclude i∈N bλT ·i cx̃i = i∈T ∪{π1 ,...,πj−1 } bλT ·i cx̃i + bλT ·πj cx̃j >
P
i∈T ∪{π1 ,...,πj−1 } wi x̃i + wπj x̃j = γ = bλ b̂c, contradicting the validity of the
T
Chvátal-Gomory inequality. t
u
The right hand side of the Chvátal-Gomory cutting plane and the lifted feasible set inequality coincide. With respect to the lifting order 5, 6, 7, the coefficient of item 6 is the first one in which the two inequalities differ: it is 11 in the feasible set inequality and 10 in the Chvátal-Gomory cutting plane. For item 7 the coefficient in the feasible set inequality is then smaller than the corresponding one in the Chvátal-Gomory cutting plane. For $b = 27$ we obtain the lifted feasible set inequality that is valid for $P(27)$. The right hand side of this inequality is smaller by one than the right hand side of the corresponding Chvátal-Gomory cutting plane for $\lambda = \frac12$. However, the coefficients of items 6 and 7 are smaller than the corresponding coefficients of the Chvátal-Gomory cutting plane.
Under certain conditions a feasible set inequality and a Chvátal-Gomory cutting plane coincide.

Theorem 11. Let $A \in \mathbb{N}^{m\times n}$ and let $T$ be a feasible set of the integer program $\max\{c^T x : Ax \le b,\ 0 \le x \le \mathbb{1},\ x \in \mathbb{Z}^n\}$. Let $\lambda \in \mathbb{R}^m_+$ be such that $\lfloor\lambda^T b\rfloor = \sum_{i\in T}\lfloor\lambda^T A_{\cdot i}\rfloor$. If, for all $j \in N\setminus T$, the column vector $A_{\cdot j}$ is a 0/1-combination of elements from $\{A_{\cdot i} : i \in T\}$, then the Chvátal-Gomory cutting plane given by $\lambda$ coincides with the lifted feasible set inequality associated with $T$.
Proof. We first show that $\lfloor\lambda^T A_{\cdot j}\rfloor \ge \phi^{\ge}(A_{\cdot j} - r(T))$. Since $A_{\cdot j}$ is a 0/1-combination of elements from $\{A_{\cdot i} : i \in T\}$, there exists $\sigma \in \{0,1\}^T$ such that $\sum_{i\in T}\sigma_i A_{\cdot i} = A_{\cdot j}$. Thus, $\lfloor\lambda^T A_{\cdot j}\rfloor = \lfloor\lambda^T(\sum_{i\in T}\sigma_i A_{\cdot i})\rfloor \ge \sum_{i\in T}\sigma_i\lfloor\lambda^T A_{\cdot i}\rfloor = \dots$ It follows that this Chvátal-Gomory cutting plane must coincide with any lifted feasible set inequality $\sum_{i\in T}\lfloor\lambda^T A_{\cdot i}\rfloor x_i + \sum_{j\in N\setminus T} w_j x_j \le \lfloor\lambda^T b\rfloor$, independent of the sequence in which the lifting coefficients $w_j$ for the items in $N\setminus T$ are computed.
$$C^i x \le \gamma^i \quad\text{for } i = 1, 2.$$
Proof. Note that the system of linear inequalities given in the theorem is valid for $P$. To see that it suffices to describe $P$, let $c^T x \le \gamma$ be a non-trivial facet-defining inequality of $P$ that is not a positive multiple of the inequality $\sum_{j\in N_1\cap N_2} x_j \le 1$. W.l.o.g. we assume that $\mathrm{supp}(c) \cap N_1 \ne \emptyset$. Let $Z = \{z_1,\dots,z_t\} = N_1 \cap N_2$. We claim that $(\mathrm{supp}(c)\setminus Z) \cap N_2 = \emptyset$. Suppose the contrary. We define natural numbers $\gamma^0, \gamma^1,\dots,\gamma^t$ by
$$\gamma^0 := \max\Big\{\sum_{j\in N_1\setminus Z} c_j x_j \,:\, \sum_{j\in N_1\setminus Z} A^1_{\cdot j} x_j \le b^1,\ x \in \{0,1\}^{N_1\setminus Z}\Big\},$$
$$\gamma^i := \max\Big\{\sum_{j\in N_1\setminus Z} c_j x_j \,:\, \sum_{j\in N_1\setminus Z} A^1_{\cdot j} x_j \le b^1 - A^1_{\cdot z_i},\ x \in \{0,1\}^{N_1\setminus Z}\Big\},$$
and claim that the face induced by the inequality $c^T x \le \gamma$ is contained in the face induced by the inequality
$$\sum_{j\in N_1\setminus Z} c_j x_j + \sum_{i=1}^{t} (\gamma^0 - \gamma^i)\, x_{z_i} \le \gamma^0. \qquad (6)$$
This inequality is valid for $P$ by definition of the values $\gamma^i$, $i = 0,\dots,t$, and because $\sum_{i\in Z} x_i \le 1$.
Let $x \in P \cap \mathbb{Z}^n$ be such that $c^T x = \gamma$. If $\sum_{i\in Z} x_i = 0$, then $c^T x = \gamma^0$, since otherwise we obtain a contradiction to the validity of $c^T x \le \gamma$. If $\sum_{i\in Z} x_i > 0$, then $\sum_{i\in Z} x_i = 1$. Let $z_i \in Z$ with $x_{z_i} = 1$. Then $c^T x = \gamma$ implies that $x - e_{z_i}$ is an optimal solution of the program
$$\max\Big\{\sum_{j\in N_1\setminus Z} c_j w_j \,:\, w \in \{0,1\}^{N_1\setminus Z},\ A^1_{\cdot}\, w \le b^1 - A^1_{\cdot z_i}\Big\}.$$
This shows that in this case $x$ satisfies inequality (6) as an equation, too. We obtain that $c^T x \le \gamma$ must be a facet of the knapsack polyhedron $P^1$. This completes the proof. □
The correctness of the theorem strongly relies on the fact that every pair of
items from the intersection N1 ∩ N2 is a cover. If this condition is not satisfied,
then Example 14 demonstrates that the facet-defining inequalities of the two
single knapsack polyhedra do not suffice in general to describe the polyhedron
associated with the intersection of the two knapsack constraints. Geometrically,
the fact that we intersect two knapsack constraints generates incomparabilities
between the column vectors of the associated matrix. These incomparabilities
give rise to cutting planes that do not define facets of one of the two single
knapsack polyhedra that we intersect. In fact, incomparabilities between col-
umn vectors in a matrix make it possible to “melt” inequalities from different
knapsack polyhedra. A basic situation to which the operation of melting applies
is the following.

Proposition 16. Let $\sum_{i\in N_1\setminus N_2} c_i x_i + \sum_{i\in N_1\cap N_2} c_i x_i \le \gamma$ be a valid inequality for $P^1$, and let $\sum_{i\in N_2\setminus N_1} c_i x_i + \sum_{i\in N_1\cap N_2} c_i x_i \le \gamma$ be a valid inequality for $P^2$. Setting $\Theta := \sum_{i\in N_1\setminus N_2} c_i$, the melted inequality
$$\sum_{i\in N_1\setminus N_2} c_i x_i + \sum_{i\in N_1\cap N_2} c_i x_i + \sum_{i\in N_2\setminus N_1} (c_i - \Theta)^+ x_i \le \gamma$$
is valid for $P$.
Proof. Let $x \in P \cap \mathbb{Z}^n$. If $x_i = 0$ for all $i \in N_2\setminus N_1$ with $c_i - \Theta > 0$, the inequality is satisfied because $\sum_{i\in N_1\setminus N_2} c_i x_i + \sum_{i\in N_1\cap N_2} c_i x_i \le \gamma$ is valid for $P^1$. Otherwise, $\sum_{i\in N_2\setminus N_1} (c_i - \Theta)^+ x_i \le \sum_{i\in N_2\setminus N_1} c_i x_i - \Theta$, and we obtain
$$\begin{aligned}
\sum_{i\in N_1\setminus N_2} c_i x_i + \sum_{i\in N_1\cap N_2} c_i x_i + \sum_{i\in N_2\setminus N_1} (c_i - \Theta)^+ x_i
&\le \sum_{i\in N_1\setminus N_2} c_i x_i - \Theta + \sum_{i\in N_1\cap N_2} c_i x_i + \sum_{i\in N_2\setminus N_1} c_i x_i\\
&\le \sum_{i\in N_1\cap N_2} c_i x_i + \sum_{i\in N_2\setminus N_1} c_i x_i\\
&\le \gamma. \qquad \Box
\end{aligned}$$
Proposition 16 can be extended to general upper bounds $u \in \mathbb{N}^n$. Often the
inequality that results from melting valid inequalities from knapsack polyhedra
as described in Proposition 16 does not define a facet of P . In such situations
there is a good chance of strengthening the melted inequality by determining
lifting coefficients for the items in (N1 \ N2 ) ∪ (N2 \ N1 ) with respect to a given
order.
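As a sketch (our own code and naming, not from the paper), the melting operation of Proposition 16 is straightforward to compute:

def melt(c1, c2, gamma, N1, N2):
    # melted inequality of Proposition 16: coefficients c1 on N1 (valid for
    # P^1), c2 on N2 (valid for P^2), both with right-hand side gamma and
    # agreeing on N1 & N2
    theta = sum(c1[i] for i in N1 - N2)                        # Theta
    coef = {i: c1[i] for i in N1}                              # N1 terms
    coef.update({i: max(c2[i] - theta, 0) for i in N2 - N1})   # (c_i - Theta)^+
    return coef, gamma

# Example 18 below: melting x1+x2+x3+x4 <= 2 with x2+x3+x4+2x5+2x6 <= 2
c1 = {1: 1, 2: 1, 3: 1, 4: 1}
c2 = {2: 1, 3: 1, 4: 1, 5: 2, 6: 2}
print(melt(c1, c2, 2, {1, 2, 3, 4}, {2, 3, 4, 5, 6}))
# -> ({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}, 2)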
Example 17. Let
$$A = \begin{pmatrix} 1 & 4 & 5 & 6 & 0 & 0\\ 0 & 0 & 5 & 6 & 1 & 4 \end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix} 15\\ 15 \end{pmatrix}.$$
The inequality $x_1 + 4x_2 + 5x_3 + 6x_4 \le 15$ defines a facet of $P^1 := \mathrm{conv}\,\{x \in \mathbb{Z}^4 : x_1 + 4x_2 + 5x_3 + 6x_4 \le 15,\ 0 \le x_i \le 3,\ i \in \{1,2,3,4\}\}$. For $\alpha \in \{0,1,2,3\}$ the inequality $5x_3 + 6x_4 + \alpha x_6 \le 15$ is valid for $P^2 := \mathrm{conv}\,\{x \in \mathbb{Z}^{\{3,4,5,6\}} : 5x_3 + 6x_4 + x_5 + 4x_6 \le 15,\ 0 \le x_i \le 3,\ i \in \{3,4,5\},\ 0 \le x_6 \le 1\}$. Setting $\alpha = 1$ and applying Proposition 16 and the succeeding remark, we obtain that $0x_1 + 3x_2 + 5x_3 + 6x_4 + x_6 \le 15$ is valid for $P := \mathrm{conv}\,\{x \in \mathbb{Z}^6 : Ax \le b,\ 0 \le x_i \le 3,\ i \in \{1,2,3,4,5\},\ 0 \le x_6 \le 1\}$. This inequality can be strengthened by lifting to yield the facet-defining inequality $x_1 + 3x_2 + 5x_3 + 6x_4 + x_6 \le 15$. For $\alpha = 2$, we end up with the melted inequality $0x_1 + 2x_2 + 5x_3 + 6x_4 + 2x_6 \le 15$. For $\alpha = 3$ we obtain the inequality $0x_1 + x_2 + 5x_3 + 6x_4 + 3x_6 \le 15$, which can again be strengthened to yield a facet-defining inequality of $P$: $x_2 + 5x_3 + 6x_4 + x_5 + 3x_6 \le 15$.
It turns out that sometimes the feasible set inequalities for two consecu-
tively intersecting knapsacks can be interpreted in terms of melting feasible set
inequalities for the associated single knapsack polytopes.
Example 18. Consider
$$A = \begin{pmatrix} 15 & 3 & 5 & 13 & 0 & 0\\ 0 & 17 & 18 & 17 & 19 & 20 \end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix} 20\\ 35 \end{pmatrix}.$$
The set $T = \{2,3\}$ is feasible, and the inequality $x_2 + x_3 + x_4 + 2x_5 + 2x_6 \le 2$ is a feasible set inequality for $P^2 := \mathrm{conv}\,\{x \in \{0,1\}^{\{2,3,4,5,6\}} : 17x_2 + 18x_3 + 17x_4 + 19x_5 + 20x_6 \le 35\}$, where the variables not in $T$ are lifted in the sequence 4, 5, 6. In addition, $x_1 + x_2 + x_3 + x_4 \le 2$ is a feasible set inequality for $P^1 := \mathrm{conv}\,\{x \in \{0,1\}^{\{1,2,3,4\}} : 15x_1 + 3x_2 + 5x_3 + 13x_4 \le 20\}$, with respect to the same feasible set $T$ and the lifting sequence 4, 1. Now $\Theta = \sum_{i\in N_1\setminus N_2} c_i = 1$ and the melted inequality reads $x_1 + x_2 + x_3 + x_4 + x_5 + x_6 \le 2$, which is facet-defining for $P := \{x \in \{0,1\}^6 : Ax \le b\}$. Note that the melted inequality is also a feasible set inequality with respect to $T$ and the lifting sequence 4, 1, 5, 6.
Note that Example 18 also shows that under certain conditions the operation
of melting feasible set inequalities produces a facet-defining inequality for P .
5 Computational Experience
In this section we briefly report on our computational experience with a separation algorithm for the feasible set inequalities. We have incorporated this algorithm in a general mixed integer programming solver called SIP and tested it on instances from MIPLIB 3.0 ([2]). Details of our separation algorithm, such as how to determine a feasible set T, how to weight the variables in T, how to perform the lifting, and for which substructures of the underlying constraint matrix the separation algorithm should be called, will be given in a forthcoming paper. We compared SIP with and without feasible set inequalities. The time limit for our runs was 3600 CPU seconds on a Sun Enterprise 3000. It turns out that for 14 out of 59 problems (air05, fiber, gesa2, gesa2_o, gesa3_o, misc03, misc07, p0033, p0201, p2756, qnet1, seymour, stein27, stein45) we find feasible set inequalities or our separation routines use more than 1% of the total running time.
Table 1. Comparison of SIP with and without feasible set (FS) inequalities.
Table 1 summarizes our results over these 14 problem instances. The first column gives the name of the problem, Column 2 the number of branch-and-bound nodes. The following two columns, headed Cuts, give the number of cutting planes found: Others are those found by the default separation routines in SIP, and FS shows the number of feasible set inequalities added. Columns 5 and 6 present the time spent separating feasible set inequalities and the total time. The last column gives the sum of the gaps $\frac{\text{upper bound} - \text{lower bound}}{\text{lower bound}}$, in percent, between the lower and upper bounds. Table 1 shows that the running time slightly increases (by 2%), but the quality of the solutions is significantly improved: the gaps decrease by around 43% when feasible set inequalities are added. Based on these results we conclude that feasible set inequalities are a tool that helps in solving mixed integer programs.
References
1. E. Balas, S. Ceria, and G. Cornuéjols. A lift-and-project cutting plane algorithm for mixed 0–1 programs. Mathematical Programming, 58:295–324, 1993.
2. R. E. Bixby, S. Ceria, C. M. McZeal, and M. W. P. Savelsbergh. An updated mixed
integer programming library: MIPLIB 3.0. 1998. Paper and problems available at
http://www.caam.rice.edu/∼bixby/miplib/miplib.html
3. S. Ceria, C. Cordier, H. Marchand, and L. A. Wolsey. Cutting planes for inte-
ger programs with general integer variables. Technical Report CORE DP9575,
Université Catholique de Louvain, Louvain-la-Neuve, Belgium, 1997.
4. V. Chvátal. Edmonds polytopes and a hierarchy of combinatorial problems. Dis-
crete Mathematics, 4:305–337, 1973.
5. H. Crowder, E. Johnson, and M. W. Padberg. Solving large-scale zero-one linear
programming problems. Operations Research, 31:803–834, 1983.
6. R. E. Gomory. Solving linear programming problems in integers. In R. Bellman
and M. Hall, editors, Combinatorial analysis, Proceedings of Symposia in Applied
Mathematics, Vol. 10. Providence, RI, 1960.
7. H. Marchand and L. A. Wolsey. The 0–1 knapsack problem with a single continuous
variable. Technical Report CORE DP9720, Université Catholique de Louvain,
Louvain-la-Neuve, Belgium, 1997.
8. M. W. Padberg. On the facial structure of set packing polyhedra. Mathematical
Programming, 5:199–215, 1973.
9. M. W. Padberg. A note on zero-one programming. Operations Research, 23:833–
837, 1975.
10. M. W. Padberg. (1, k)-configurations and facets for packing problems. Mathemat-
ical Programming, 18:94–99, 1980.
11. M. W. Padberg, T. J. Van Roy, and L. A. Wolsey. Valid inequalities for fixed
charge problems. Operations Research, 33:842–861, 1985.
12. A. Schrijver. On cutting planes. Annals of Discrete Mathematics, 9:291–296, 1980.
13. L. A. Wolsey. Faces of linear inequalities in 0-1 variables. Mathematical Program-
ming, 8:165–178, 1975.
New Classes of Lower Bounds for Bin Packing Problems
Abstract. The bin packing problem is one of the classical NP-hard op-
timization problems. Even though there are many excellent theoretical
results, including polynomial approximation schemes, there is still a lack
of methods that are able to solve practical instances optimally. In this
paper, we present a fast and simple generic approach for obtaining new
lower bounds, based on dual feasible functions. Worst case analysis as
well as computational results show that one of our classes clearly out-
performs the currently best known “economical” lower bound for the bin
packing problem by Martello and Toth, which can be understood as a
special case. This indicates the usefulness of our results in a branch and
bound framework.
1 Introduction
The bin packing problem (BPP) can be described as follows: Given a set of n
“items” with integer size x1 , . . . , xn , and a supply of identical “containers” of
capacity C, decide how many containers are necessary to pack all the items.
This task is one of the classical problems of combinatorial optimization and NP-
hard in the strong sense – see Garey and Johnson [9]. An excellent survey by
Coffman, Garey, and Johnson can be found as Chapter 2 in the recent book [2].
Over the years, many clever methods have been devised to deal with the re-
sulting theoretical difficulties. Most notably, Fernandez de la Vega and Lueker [8], and Karmarkar and Karp [12] have developed polynomial time approximation schemes that make it possible to approximate an optimal solution within a factor of $1 + \varepsilon$ in polynomial (even linear) time, for any fixed $\varepsilon$. However, these methods can hardly be called practical, due to the enormous size of the constants involved. On the other hand, there is still a lack of results that allow solving even moderately sized test problems optimally – see Martello and Toth [15,16], and the ORLIB set of benchmark problems [1].
problems [1]. The need for better understanding is highlighted by a recent ob-
servation by Gent [10]. He showed that even though some of these benchmark
This work was supported by the German Federal Ministry of Education, Science, Research and Technology (BMBF, Förderkennzeichen 01 IR 411 C7).
problems of 120 and 250 items had defied all systematic algorithmic solution
attempts, they are not exactly hard, since they can be solved by hand.
This situation emphasizes the need for ideas that are oriented towards the
exact solution of problem instances, rather than results that are mainly worst
case oriented. In this paper, we present a simple and fast generic approach for
obtaining lower bounds, based on dual feasible functions. Assuming that the
items are sorted by size, our bounds can be computed in linear time with small
constants, a property they share with the best lower bound for the BPP by
Martello and Toth [15,16]. As it turns out, one of our classes of lower bounds
can be interpreted as a systematic generalization of the bound by Martello and
Toth. Worst case analysis as well as computational results indicate that our
generalization provides a clear improvement in performance, indicating their
usefulness in the context of a branch and bound framework. Moreover, we can
show that the simplicity of our systematic approach is suited to simplify and
improve other hand-tailored types of bounds.
The rest of this paper is organized as follows. In Section 2, we give an intro-
duction to dual feasible functions and show how they can be used to obtain fast
and simple lower bounds for the BPP. In Section 3, we consider the worst case
performance of one of our classes of lower bounds, as well as some computational
results on the practical performance. Section 4 concludes with a discussion of
possible extensions, including higher-dimensional packing problems.
For the rest of this paper, we assume without loss of generality that the items
have size xi ∈ [0, 1], and the container size C is normalized to 1. Then we
introduce the following:
Dual feasible functions have been used in the performance analysis of heuristics for the BPP, first by Johnson [11], then by Lueker [13]; see Coffman and Lueker [3] for a more detailed description. The term (first introduced by Lueker [13]) refers to the fact that for any dual feasible function u and for any bin packing instance with item sizes $x_1,\dots,x_n$, the vector $(u(x_1),\dots,u(x_n))$ is a feasible solution for the dual of the corresponding fractional bin packing problem (see [12]). By definition, convex combinations and compositions of dual feasible functions are again dual feasible.
We show in this paper that dual feasible functions can be used for improving
lower bounds for the one-dimensional bin packing problem. This is based on the
following easy lemma.
Fig. 1. The dual feasible functions $y = u^{(k)}(x)$ for $k = 1$ and $k = 2$.
Our third class of dual feasible functions has some similarities to bounds that were hand-tailored for the two-dimensional and three-dimensional BPP by Martello and Vigo [17], and Martello, Pisinger, and Vigo [14]. However, our bounds are simpler and dominate theirs.

Fig. 2. The dual feasible function $y = U^{(\varepsilon)}(x)$.

Fig. 3. The stair functions $y = \phi^{(\varepsilon)}(x)$ for $\varepsilon = 1/3, 1/4, 1/6, 1/8$.
This third class also ignores items of size below a threshold value $\varepsilon$. On the interval $(\varepsilon, \frac12]$, these functions are constant; on $(\frac12, 1]$ they have the form of stair functions. Figure 3 shows that for small values of $\varepsilon$, the area of loss zones for $\phi^{(\varepsilon)}$ exceeds the area of win zones by a clear margin. This contrasts with the behavior of the functions $u^{(k)}$ and $U^{(\varepsilon)}$, where the win and loss areas have the same size.
Proof. Let $S$ be a finite set of nonnegative real numbers with $\sum_{x\in S} x \le 1$. Let $S' := \{x \in S \mid \varepsilon \le x \le \frac12\}$. We distinguish two cases. □
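The printed definitions of $u^{(k)}$ and $U^{(\varepsilon)}$ are lost in this extraction; the following sketch uses the forms consistent with Figs. 1 and 2 (and with dual feasible functions of this kind in the literature), so it should be read as our reconstruction rather than as the paper's definition.

from fractions import Fraction

def u(k, x):
    # u^(k)(x) = x if (k+1)x is integral, floor((k+1)x)/k otherwise
    # (breakpoints 1/(k+1), ..., k/(k+1); cf. Fig. 1)
    y = (k + 1) * x
    return x if y == int(y) else Fraction(int(y), k)

def U(eps, x):
    # U^(eps): sizes > 1 - eps are rounded up to 1, sizes < eps are
    # dropped, everything in between is left unchanged (cf. Fig. 2)
    if x > 1 - eps:
        return Fraction(1)
    return x if x >= eps else Fraction(0)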
The easiest lower bound for the BPP is the total volume of all items (the size
of the container will be assumed to be 1), rounded up to the next integer. For a
normalized BPP instance I := (x1 , . . . , xn ), this bound can be formulated as
$$L_1(I) := \left\lceil\, \sum_{i=1}^{n} x_i \,\right\rceil. \qquad (8)$$
Martello and Toth showed that for their ILP formulation of the BPP, the bound $L_1$ dominates the LP relaxation, the surrogate relaxation, and the Lagrangean relaxation (see [16], p. 224). According to their numerical experiments, $L_1$ approximates the optimal value very well as long as there are sufficiently many items of small size, meaning that the remaining capacity of bins with big items can be exploited. If this is not the case, then the ratio between $L_1$ and the optimal value can reach a worst case performance of $r(L_1) = \frac12$, as shown by the class of examples with $2k$ items of size $\frac12 + \frac{1}{2k}$.
In the situation that is critical for $L_1$ ("not enough small items to fill up the bins"), we can make use of the dual feasible functions $U^{(\varepsilon)}$ from Theorem 2; these functions neglect small items in favor of big ones. This yields the bound $L_2$: for a BPP instance $I$, let
$$L_2(I) := \max_{\varepsilon\in[0,\frac12]} L_1(U^{(\varepsilon)}(I)).$$
From Lemma 1 it follows immediately that $L_2$ is a lower bound for the BPP. The worst case performance is improved significantly:

Theorem 5 (Martello and Toth). $r(L_2) = \frac23$.

This bound of $\frac23$ is tight: consider the class of BPP instances with $6k$ items of size $\frac13 + \frac{1}{6k}$.
There is little hope that a lower bound for the BPP that can be computed in polynomial time can reach an absolute worst case performance above $\frac23$. Otherwise, the NP-hard problem Partition (see [9], p. 47) of deciding whether a set of items can be split into two sets of equal total size could be solved in polynomial time.
$L_2$ can be computed in time $O(n\log n)$. The computational effort is determined by sorting the items by their size; the rest can be performed in linear time by using appropriate updates.
Lemma 2 (Martello and Toth). Consider a BPP instance I := (x1 , . . . , xn ).
If the sizes xi are sorted, then L2 (I) can be computed in time O(n).
Now we define, for any $k \in \mathbb{N}$,
$$L_2^{(k)}(I) := \max_{\varepsilon\in[0,\frac12]} L_1(u^{(k)} \circ U^{(\varepsilon)}(I)). \qquad (10)$$
By Lemma 1, these are valid lower bounds. As for $L_2$, the time needed for computing these bounds is dominated by sorting:

Lemma 3. Let $I := (x_1,\dots,x_n)$ be an instance of the BPP. If the items $x_i$ are given sorted by size, then $L_*^{(q)}(I)$ can be computed in time $O(n)$ for any fixed $q \ge 2$.
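A direct, unoptimized sketch of these bounds, reusing the functions u, U, and Fraction from the sketch above (the $O(n)$ computation of Lemma 3 is not reproduced): only finitely many thresholds $\varepsilon$ need to be tried, since the bound is piecewise constant in $\varepsilon$.

from math import ceil

def L1(sizes):
    # bound (8): total volume rounded up
    return ceil(sum(sizes))

def L2k(sizes, k):
    # bound (10): max over eps in [0, 1/2] of L1(u^(k) o U^(eps)(I));
    # evaluating at breakpoints and at midpoints between them suffices
    half = Fraction(1, 2)
    bps = sorted({Fraction(0), half}
                 | {x for x in sizes if x <= half}
                 | {1 - x for x in sizes if 1 - x <= half})
    cands = bps + [(a + b) / 2 for a, b in zip(bps, bps[1:])]
    return max(L1([u(k, U(e, x)) for x in sizes]) for e in cands)

# 12 items of size 1/4 + 1/100 need 4 bins; L1 = 4 while L2k(..., 2) = 0,
# which is why the maximum of several bounds is taken (cf. Theorem 7)
sizes = [Fraction(1, 4) + Fraction(1, 100)] * 12
print(L1(sizes), L2k(sizes, 2))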
$$u^{(2)}(x_i) = \frac{\lceil 3x_i - 1\rceil}{2} \ge \frac12. \qquad (12)$$
For $n \ge 2m - 1$, we get
$$L_2^{(2)}(I) \ge L_1(u^{(2)}(I)) = \left\lceil\, \sum_{i=1}^{n} u^{(2)}(x_i) \,\right\rceil \ge \left\lceil\frac{2m-1}{2}\right\rceil = m. \qquad (13)$$
Hence the lower bound $L_2^{(2)}(I)$ yields the optimal value, leaving the case
$$n < 2m - 1. \qquad (14)$$
Fig. 4. Normal form of an optimal solution for a BPP instance with sizes $x_i > 1/3$.

$$x_{m-1} + x_m \le 1. \qquad (20)$$

Fig. 5. Determining $i^*$.
Using the assumptions (14) and (20), we show that there is an $i^* \in \{2m-n,\dots,n\}$ with (21). Figure 5 shows the meaning of this statement: at least one of the upper items cannot be combined with the lower item two bins to the left.
This is shown in the following way: if for all $i^* \in \{2m-n,\dots,n\}$ we had
$$x_{i^*} + x_{2m-i^*-1} \le 1, \qquad (22)$$
then all upper items could be moved two bins to the left, since the first two bins do not contain more than one item by (14). This would make it possible to pack item $x_{m-1}$ together with $x_m$, saving a bin.
Now let $\varepsilon := x_{2m-i^*-1}$. By (21), we have for all $i \in \{1,\dots,i^*\}$ that
$$x_i \ge x_{2m-i^*-1} = \varepsilon, \qquad (25)$$
and therefore
$$u^{(2)} \circ U^{(\varepsilon)}(x_i) \ge u^{(2)}(x_i) \ge \frac12. \qquad (26)$$
Summarizing, we get
$$L_2^{(2)}(I) \ge \left\lceil\, \sum_{i=1}^{i^*} u^{(2)}\circ U^{(\varepsilon)}(x_i) + \sum_{i=i^*+1}^{2m-i^*-1} u^{(2)}\circ U^{(\varepsilon)}(x_i) \,\right\rceil \qquad (27)$$
$$\ge \left\lceil\, i^* + \sum_{i=i^*+1}^{2m-i^*-1} \frac12 \,\right\rceil \qquad (28)$$
$$= i^* + \left\lceil\frac{2m - 2i^* - 1}{2}\right\rceil = m. \qquad (29)$$
As we have stated above, we cannot hope to improve the worst case performance of $L_2$. However, we can show that the asymptotic worst case performance is improved by $L_*^{(2)}$:

Theorem 7.
$$r_\infty(L_*^{(2)}) = \frac34. \qquad (30)$$
Proof. Let $I := (x_1,\dots,x_n)$ be a BPP instance. We start by showing
$$\max\{L_2(I),\, L_2^{(2)}(I)\} \ge \frac34\,\mathrm{opt}(I) - 1. \qquad (31)$$
By Theorem 6, all items with $x_i > \frac13$ fit into $m \le L_2^{(2)}(I)$ bins. Let these bins be indexed by $1,\dots,m$. Using the First Fit Decreasing heuristic, we add the remaining items, i.e., we sort these items by decreasing size and put each item into the first bin with sufficient capacity, opening a new bin if necessary. Let $q$ denote the number of bins in this solution.
If we need $m$ bins for the big items and not more than $\frac{m}{3}$ additional bins for the rest, then Theorem 6 yields the first part of the statement:
$$\mathrm{opt}(I) \le q \le \frac43\,m \le \frac43\,\max\{L_2(I),\, L_2^{(2)}(I)\}. \qquad (32)$$
Therefore, assume
$$\frac{3q}{4} > m \qquad (33)$$
for the rest of the proof.
Let α denote the largest free capacity of one of the bins 1 through m, i. e.,
the total size of all items in this bin is at least (1 − α).
No item that was placed in one of the bins $m+1$ through $q$ can fit into the bin with free capacity $\alpha$. This means that all these bins can only contain items $x_i > \alpha$. On the other hand, the size of these items does not exceed $\frac13$. This implies that the bins $m+1$ through $q-1$ must contain at least three items of size $x_i > \alpha$, while bin $q$ holds at least one item $x_i > \alpha$.
Thus, we get
$$L_2(I) \ge L_1(I) = \left\lceil\, \sum_{i=1}^{n} x_i \,\right\rceil \ge \lceil (1-\alpha)m + 3\alpha(q-1-m) + \alpha \rceil \qquad (34)$$
$$= \lceil (1-4\alpha)m + 3\alpha(q-1) + \alpha \rceil.$$
Now consider two cases.

Case 1: Assume $\alpha > \frac14$. Since $(1-4\alpha) < 0$, we can replace the term $m$ in (34) by $\frac{3q}{4} > m$:
$$L_2(I) > (1-4\alpha)\,\frac34\,q + 3\alpha(q-1) + \alpha \qquad (35)$$
$$= \frac34\,q - 2\alpha \qquad (36)$$
$$\ge \frac34\,q - 1 \ge \frac34\,\mathrm{opt}(I) - 1. \qquad (37)$$

Case 2: For $\alpha \le \frac14$, neglect the term $(1-4\alpha)m \ge 0$ in (34):
$$L_2(I) \ge \lceil 3\alpha(q-1) + \alpha \rceil \qquad (38)$$
$$\ge \frac34\,q - 1 \ge \frac34\,\mathrm{opt}(I) - 1. \qquad (39)$$
This proves $r_\infty(L_*^{(2)}) \ge \frac34$. To show that equality holds, consider the family of bin packing instances with $3k$ items of size $\frac14 + \delta$ with $\delta > 0$. This needs at least $k$ bins. For sufficiently small $\delta$, we have $L_2^{(2)} = 0$ and $L_2(I) = L_1(I) \le \frac34\,k + 1$. □
4 Conclusions
We have presented a fast new method for generating lower bounds for the bin packing problem. The underlying method of dual feasible functions can also be used in the case of higher dimensions by combining our ideas with the approach for modeling higher-dimensional orthogonal packings that we developed for finding exact solutions for the d-dimensional knapsack problem [4]. Details will be contained in the forthcoming papers [5,6,7].
References
10. I. P. Gent. Heuristic solution of open bin packing problems. (To appear in Journal
of Heuristics). http://www.cs.strath.ac.uk/~apes/papers
11. D. S. Johnson. Near-optimal bin packing algorithms. PhD thesis, Massachusetts
Institute of Technology, Cambridge, Massachusetts, 1973.
12. N. Karmarkar and R. M. Karp. An efficient approximation scheme for the one-
dimensional bin packing problem. Proc. 23rd Annual Symp. Found. Comp. Sci.
(FOCS 1982), pages 312–320, 1982.
13. G. S. Lueker. Bin packing with items uniformly distributed over intervals [a, b].
Proc. 24th Annual Symp. Found. Comp. Sci. (FOCS 1983), pages 289–297, 1983.
14. S. Martello, D. Pisinger, and D. Vigo. The three-dimensional bin packing problem.
Technical Report DEIS-OR-97-6, 1997. http://www.deis.unibo.it
15. S. Martello and P. Toth. Lower bounds and reduction procedures for the bin pack-
ing problem. Discrete Applied Mathematics, 28:59–70, 1990.
16. S. Martello and P. Toth. Knapsack Problems. Wiley, New York, 1990.
17. S. Martello and D. Vigo. Exact solution of the two-dimensional finite bin packing problem. Technical Report DEIS-OR-96-3, 1996. http://www.deis.unibo.it
18. J. Schepers. Exakte Algorithmen für Orthogonale Packungsprobleme. PhD thesis,
Mathematisches Institut, Universität zu Köln, 1997.
Solving Integer and Disjunctive Programs by
Lift and Project
1 Introduction
$$K_0 \cup K_1 \qquad (1)$$
We shall denote

or, equivalently,
$$\Big\{\, x \;\Big|\; Ax \ge b,\ \bigwedge_{i=1,\dots,p} \big(d_{i0}\, x \ge g_{i0} \ \vee\ d_{i1}\, x \ge g_{i1}\big) \,\Big\} \qquad (5)$$
$$K_{i0} = \{\, x \mid Ax \ge b,\ x_i \le 0 \,\}, \qquad K_{i1} = \{\, x \mid Ax \ge b,\ x_i \ge 1 \,\}$$
for $i = 1,\dots,p$. With this definition of $K_{i0}$ and $K_{i1}$, (5) is the usual way of expressing the feasible set of a mixed 0–1 program. Moreover, if for some $i$ we choose
The next theorem, due to Balas, provides a representation of the set P in (1).
This representation will be used in the next section as the basis for disjunctive
cutting plane generation.
Theorem 1 (Balas [4]). Assume that the sets $K_0$ and $K_1$ are nonempty. Then $(\alpha, \beta) \in P^*$ if and only if there exist $u^0, v^0, u^1, v^1$ such that
$$\begin{aligned}
u^j A + v^j d_j &= \alpha \qquad (j = 0, 1)\\
u^j b + v^j g_j &\ge \beta \qquad (j = 0, 1)\\
u^j,\ v^j &\ge 0 \qquad (j = 0, 1)
\end{aligned} \qquad (7)$$
In fact, this result holds under a more general regularity condition than the nonemptiness of all the $K_i$'s (assuming nonemptiness, the result follows simply from Farkas' lemma).
The lift-and-project method is based on the generation of disjunctive or lift-and-project cuts $\alpha x \ge \beta$ which are valid for $P$ and violated by the current LP-solution $\bar x$, i.e., $\alpha\bar x < \beta$. Theorem 1 provides a way of generating disjunctive cuts for a general disjunctive program through the solution of a linear program, called the cut-generation LP, of the form
$$\begin{aligned}
\max\ & \beta - \alpha\bar x\\
\text{s.t.}\ & (\alpha, \beta) \in P^*\\
& (\alpha, \beta) \in S
\end{aligned} \qquad (8)$$
where S is a normalization set ensuring boundedness of the CLP.
There are several possible choices for the set S (see [13] for a complete de-
scription). In our current implementation we use the following normalization
constraint:
$$S = \Big\{(u^0, v^0, u^1, v^1) \,:\, \sum_{j=0}^{1} (u^j + v^j)^T e \le 1\Big\}$$
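For illustration, here is a sketch of the cut-generation LP (8) for a single disjunction, using scipy's LP solver (our own code and naming; the authors' implementation is built on CPLEX). It assumes the representation of Theorem 1 together with the normalization above.

import numpy as np
from scipy.optimize import linprog

def disjunctive_cut(A, b, d0, g0, d1, g1, xbar):
    # solve max beta - alpha.xbar over (alpha, beta) in P* (Theorem 1)
    # with normalization sum(u^j + v^j) <= 1; the returned inequality
    # alpha.x >= beta is valid for cl conv(K0 u K1)
    m, n = A.shape
    # variable layout: alpha (n) | beta | u0 (m) | v0 | u1 (m) | v1
    zn, zm = np.zeros((n, 1)), np.zeros((n, m))
    # equalities: alpha - A^T u^j - d_j v^j = 0 for j = 0, 1
    A_eq = np.vstack([
        np.hstack([np.eye(n), zn, -A.T, -d0.reshape(n, 1), zm, zn]),
        np.hstack([np.eye(n), zn, zm, zn, -A.T, -d1.reshape(n, 1)]),
    ])
    b_eq = np.zeros(2 * n)
    # inequalities: beta <= b.u^j + g_j v^j, plus the normalization row
    A_ub = np.vstack([
        np.concatenate([np.zeros(n), [1.0], -b, [-g0], np.zeros(m), [0.0]]),
        np.concatenate([np.zeros(n), [1.0], np.zeros(m), [0.0], -b, [-g1]]),
        np.concatenate([np.zeros(n + 1), np.ones(2 * m + 2)]),
    ])
    b_ub = np.array([0.0, 0.0, 1.0])
    c = np.concatenate([xbar, [-1.0], np.zeros(2 * m + 2)])  # min alpha.xbar - beta
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (2 * m + 2)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:n], res.x[n]  # (alpha, beta); check res.success in practice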
2 Facial Disjunctions
Our purpose is to use disjunctive cuts for solving general disjunctive programs.
We will devote particular attention to a special class of facial disjunctive pro-
grams. A disjunctive set is called facial if the sets K0 and K1 are faces of K.
Some examples of facial and non-facial disjunctive programs include:
• Variable upper bound constraints: Suppose that we wish to model the following situation: if an arc $i$ in a network is installed, we can send up to $u_i$ units of flow on it; if it is not installed, we can send none. If we denote by $y_i$ the amount of flow on the arc, and $x_i$ is a 0–1 variable indicating whether or not the arc is installed, then clearly
$$x_i = 1 \ \vee\ y_i = 0, \qquad 0 \le y_i \le u_i \qquad (9)$$
is a correct model.
• Linear complementarity: The linear complementarity problem (LCP for short) is finding $x, z$ satisfying
$$x, z \ge 0, \qquad Mx + z = q, \qquad x^T z = 0. \qquad (10)$$
Clearly, (10) is a facial disjunctive program in CNF, with the disjunctions being $x_i = 0 \vee z_i = 0$.
is valid for all the feasible solutions of the SPP, if R is a subset of any Ri .
However, if R is chosen as the intersection of Ri1 and Ri2 for two rows i1
and i2 , then the disjunction will perform particularly well as a branching
rule, when solving the SPP by branch-and-bound.
• Machine scheduling (see [6]): In the machine scheduling problem we
are given a number of operations {1, . . . , n} with processing times p1 , . . . , pn
that need to be performed on different items using a set of machines. The
objective is to minimize total completion time while satisfying precedence
constraints between operations and the condition that a machine can process
one operation at a time, and operations cannot be interrupted. The problem
can be formulated as
M in tn
tj − ti ≥ di , (i, j) ∈ Z
(11)
ti ≥ 0, i = {1, . . . , n}
tj − ti ≥ di ∨ ti − tj ≥ dj (i, j) ∈ W
where ti is the starting time of job i, Z is the set of pairs of operations
constrained by precedence relations and W is the set of pairs that use the
same machine and therefore cannot overlap.
This disjunctive program is NOT facial.
One of the most important theoretical properties of 0–1 disjunctive cuts is
the ease with which they can be lifted ([8,9]). For clarity, recall that
$$\begin{aligned}
K &= \{x \mid Ax \ge b\}\\
K_j &= \{x \mid x \in K,\ d_j x \ge g_j\} \quad (j = 0, 1)\\
P &= \mathrm{cl\,conv}\,(K_0 \cup K_1)
\end{aligned} \qquad (12)$$
Let $K'$ be a face of $K$, and
$$\begin{aligned}
K_j' &= \{x \mid x \in K',\ d_j x \ge g_j\} \quad (j = 0, 1)\\
P' &= \mathrm{cl\,conv}\,(K_0' \cup K_1')
\end{aligned} \qquad (13)$$
Suppose that we are given a disjunctive cut $(\alpha', \beta')$ valid for $P'$ and violated by $\bar x \in K'$. Is it possible to quickly compute a cut $(\alpha, \beta)$ which is valid for $P$ and violated by $\bar x$?
The answer is yes, if $(\alpha', \beta')$ is a cut obtained from a 0–1 disjunction ([8,9]) and $K'$ is obtained from $K$ by setting several variables to their bounds. Moreover, not only does it suffice to solve the CLP with the constraints representing $K'$ in place of $K$; its size can be further reduced by removing those columns of $A$ which correspond to the variables at bounds. In practice these variables are the ones fixed in branch-and-cut, plus the ones that happen to be at their bounds in the optimal LP-solution at the current node. Cut lifting is vital for the viability of the lift-and-project method within branch-and-cut: if the cut-generation LPs are solved with all columns of $A$ included in the CLP, the time spent on cut generation is an order of magnitude larger, while the cuts obtained are rarely better ([9]). The main result of this section is:
Theorem 2. Let $K'$ be an arbitrary face of $K$, and let $K_j'$ and $P'$ be as above. Let $(\alpha', \beta') \in (P')^*$ with the corresponding multipliers given. Then we can compute a cut $(\alpha, \beta)$ that is valid for $P$ and satisfies, for all $x \in K'$,
$$\alpha x - \beta = \alpha' x - \beta'. \qquad (14)$$
Proof. Somewhat surprisingly, our proof is even simpler than the original one for the 0–1 case. Let $K'$ be represented as
$$K' = \{x \mid A^= x = b^=,\ A^+ x \ge b^+\}.$$
$$\begin{aligned}
\alpha &= u^0_= A^= + u^0_+ A^+ + v^0 s = u^1_= A^= + u^1_+ A^+ + v^1 t\\
\beta &\le u^0_= b^= + u^0_+ b^+ + v^0 s_0\\
\beta &\le u^1_= b^= + u^1_+ b^+ + v^1 t_0
\end{aligned} \qquad (15)$$
any given node of the tree, the LP-relaxation is always a system that arises from the system defining $K$ by imposing equality in some valid inequalities. In other words, the LP-relaxation at any given node defines a face of the LP-relaxation at the root. Therefore, if we also wish to generate disjunctive cuts, this can always be done using the LP-relaxation at the current node of the branch-and-cut tree, then lifting the resulting cut to be globally valid. Notice that for this scheme to work, we only require the disjunctions for branching to be facial; the disjunctions for cutting can be arbitrary.
3 Computations
3.1 The Implementation
The computational issues that need to be addressed when generating lift-and-
project cuts were thoroughly studied in [9]. Their experience can be briefly sum-
marized as:
• It is better to generate cuts
  • in large rounds, before adding them to the linear programming relaxation and reoptimizing;
  • in the space of the variables that are strictly between their upper and lower bounds, and then to lift the cut to the full space.
• The distance of the current fractional point from the cut hyperplane is a
reliable measure of cut quality.
We adopted most of their choices, and based on our own experience, we added
several new features to our code, namely,
• In every round of cutting, we choose 50 (or fewer, if not enough are available) 0–1 disjunctions for cut generation. In an analogous way, we also choose a set of general integer disjunctions of the form $x_i \le \lfloor\bar x_i\rfloor \vee x_i \ge \lceil\bar x_i\rceil$.
• The 50 0–1 disjunctions are chosen from a candidate set of 150. It is quite conceivable that a disjunction gives rise to a strong cut if and only if it would perform well when used for branching, i.e., if the improvement of the objective function on the two branches would be substantial. Therefore, we use "branching" information to choose the variables from the candidate set. In the current implementation we used the strongbranch routine of CPLEX 5.0, which returns an estimate of the improvements on both generated branches. Our strategy is to test the 150 candidate variables (or fewer if not enough are available) and then pick the 50 that maximize some function (currently the harmonic mean) of the two estimates. We call this procedure the strong choice of cutting variables. We then repeat this process for the general integer variables, if any. We are in the process of testing other rules commonly used for selecting branching variables, like pseudo-costs and integer estimates.
• We pay particular attention to the accuracy of the cuts. If the cuts we derive
are numerically unstable we resolve the CLP with a better accuracy.
• We use a normalization constraint (the set S in CLP) that bounds the sum
of all multipliers (ui , v i ).
• In joint work with Avella and Rossi [1], we have chosen to generate more
than one cut from one disjunction using a simple heuristic. After solving
the CLP, we fix a nonzero multiplier to zero, then resolve. We repeat this
procedure several times always checking whether the consecutive cuts are
close to being parallel; if so, one of them is discarded.
many special purpose cuts, and noswot is highly symmetric. However, disjunctive
cuts perform strikingly well on the last two instances.
Until now, set1ch could be solved to optimality only by using special pur-
pose path-inequalities [26]. After exhausting the memory limits, CPLEX could
only find a solution within 15.86 % of the optimum. We ran our cutting plane
generator for 10 rounds, raising the lower bound to within 1.4 % of the integer
optimum. CPLEX was then able to solve the strengthened formulation in 28
seconds by enumerating 474 nodes. It is important to note that no numerical difficulties were encountered during the cutting phase (even when we generated 15 rounds, although this proved unnecessary), and the optimal solution found agrees precisely with the one reported in MIPLIB (the objective coefficients are one-fourth integral).
The problem seymour is an extremely difficult set covering instance; it was donated to MIPLIB by Paul Seymour, and its purpose is to find a minimal "irreducible configuration" in the proof of the four-colour conjecture. It has not been solved to optimality. The value of the LP-relaxation is 403.84, and an integer solution of 423.0 is known. The best previously known lower bound of 412.76 [2] was obtained by running CPLEX 4.0 on an HP SPP2000 with 16 processors, each processor having 180 MHz frequency and 720 Mflops peak performance, for a total of approximately 58 hours and using approx. 1360 Mbytes of memory.
Due to the difficulty of the problem, we ran our cutting plane algorithm on this problem with rather generous settings. We generated 10 rounds of cuts, in each round choosing the 50 cutting variables picked by our strong choice from among all fractional variables, with the iteration limit set to 1000. The total time spent on generating the 10 rounds was approximately 10.5 hours, and the lower bound was raised to 413.16. The memory usage was below 50 Mbytes. Running CPLEX 4.0 on the strengthened formulation for approximately 10 more hours raised the lower bound to 414.20, a bound that currently seems unattainable without using our cuts.
References
1. P. Avella and F. Rossi. Private communication.
2. G. Astfalk and R. Bixby. Private communication.
3. E. Balas. Intersection cuts – A new type of cutting planes for integer programming.
Operations Research, 19:19–39, 1971.
4. E. Balas. Disjunctive programming: Facets of the convex hull of feasible points.
Technical Report No. 348, GSIA, Carnegie Mellon University, 1974.
5. E. Balas. Disjunctive programming. Annals of Discrete Mathematics, 5:3–51, 1979.
6. E. Balas. Disjunctive programming and a hierarchy of relaxations for discrete op-
timization problems. SIAM J. Alg. Disc. Meth., 6:466–486, 1985.
7. E. Balas. Enhancements of lift-and-project. Technical Report, GSIA, Carnegie Mel-
lon University, 1997.
8. E. Balas, S. Ceria, and G. Cornuéjols. A lift-and-project cutting plane algorithm
for mixed 0–1 programs. Mathematical Programming, 58:295–324, 1993.
9. E. Balas, S. Ceria, and G. Cornuéjols. Mixed 0–1 programming by lift-and-project in a branch-and-cut framework. Management Science, 42:1229–1246, 1996.
10. E. Balas, S. Ceria, G. Cornuéjols, and G. Pataki. Polyhedral Methods for the
Maximum Clique Problem. AMS, DIMACS Series on Discrete Mathematics and
Computer Science, 26:11–28, 1996.
11. N. Beaumont. An algorithm for disjunctive programs. European Journal of Oper-
ational Research, 48:362–371, 1990.
12. C. Blair. Two rules for deducing valid inequalities for 0–1 problems. SIAM Journal
of Applied Mathematics, 31:614–617, 1976.
13. S. Ceria and J. Soares. Disjunctive cuts for mixed 0–1 programming: Duality and
lifting. Working paper, Graduate School of Business, Columbia University, 1997.
14. S. Ceria and J. Soares. Convex programming for disjunctive optimization. Working
paper, Graduate School of Business, Columbia University, 1997.
15. J. Hooker. Logic based methods for optimization. In A. Borning, editor, Principles
and practice of constraint programming, LNCS, Vol. 626, pages 184–200, 1992.
16. J. Hooker, M. Osorio. Mixed logical/linear programming. Technical report, GSIA,
Carnegie Mellon University, 1997.
17. J. Hooker, H. Yan, I. Grossman, and R. Raman. Logic cuts for processing networks
with fixed charges. Computers and Operations Research, 21:265–279, 1994.
18. R. Jeroslow. Representability in mixed-integer programming I: Characterization
results. Discrete Applied Mathematics, 17:223–243, 1987.
19. R. Jeroslow. Logic based decision support: Mixed-integer model formulation. An-
nals of Discrete Mathematics, 40, 1989.
20. L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0–1 optimiza-
tion. SIAM J. Optimization, 1:166–190, 1991.
21. R. Meyer. Integer and mixed-integer programming models: General properties.
Journal of Optimization Theory and Applications, 16:191–206, 1975.
22. H. Sherali and W. Adams. A hierarchy of relaxations between the continuous and
convex hull representations for zero-one programming problems. SIAM J. Disc.
Math., 3:411–430, 1990.
23. H. Sherali and C. Shetty. Optimization with disjunctive constraints, In M. Beckman
and H. Kunzi, editors, Lecture notes in Economics and Mathematical Systems,
Vol. 181. Springer-Verlag, 1980.
24. J. Soares. Disjunctive Methods for Discrete Optimization Problems, PhD thesis,
Graduate School of Business, Columbia University, 1997. In preparation.
1 Introduction
Goal programming [2] is a useful model when a decision maker wants to come “as
close as possible” to satisfying a number of incompatible goals. It is frequently
cited in introductory textbooks in management science and operations research.
This model usually assumes that the variables are continuous but, of course, it
can also arise when the decision variables must be 0,1 valued. As an example,
consider the following market-sharing problem proposed by Williams [18]: A
large company has two divisions D1 and D2 . The company supplies retailers
with several products. The goal is to allocate each retailer to either division D1
or division D2 so that D1 controls 40% of the company’s market for each product
and D2 the remaining 60% or, if such a perfect 40/60 split is not possible for
all the products, to minimize the sum of percentage deviations from the 40/60
split. This problem can be modeled as the following integer program (IP):
This work was supported in part by NSF grant DMI-9424348. Part of the work was done while this author was affiliated with Carnegie Mellon University.
$$\begin{aligned}
\min\ & \sum_{i=1}^{m} |s_i|\\
\text{s.t.}\ & \sum_{j=1}^{n} a_{ij} x_j + s_i = b_i, && i = 1,\dots,m\\
& x_j \in \{0,1\}, && j = 1,\dots,n\\
& s_i \text{ free}, && i = 1,\dots,m,
\end{aligned}$$
where $n$ is the number of retailers, $m$ is the number of products, $a_{ij}$ is the demand of retailer $j$ for product $i$, and the right-hand sides $b_i$ are determined from the desired market split between the two divisions $D_1$ and $D_2$. Note that the objective function of IP is not linear, but it is straightforward to linearize it.
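A standard linearization (a textbook device, not spelled out here) splits each $s_i$ into nonnegative parts $s_i = s_i^+ - s_i^-$:
$$\begin{aligned}
\min\ & \sum_{i=1}^{m} (s_i^+ + s_i^-)\\
\text{s.t.}\ & \sum_{j=1}^{n} a_{ij} x_j + s_i^+ - s_i^- = b_i, && i = 1,\dots,m\\
& x_j \in \{0,1\},\ j = 1,\dots,n; \quad s_i^+,\, s_i^- \ge 0, && i = 1,\dots,m.
\end{aligned}$$
At an optimal solution at most one of $s_i^+, s_i^-$ is positive, so the objective equals $\sum_i |s_i|$.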
This integer program also models the following basic Feasibility Problem (FP) in geometry:

Feasibility Problem: Given $m$ hyperplanes in $\mathbb{R}^n$, does there exist a point $x \in \{0,1\}^n$ which lies on the intersection of these $m$ hyperplanes?
If the optimum solution to IP is 0, the answer to FP is “yes”, else the answer
to FP is “no”. Clearly, FP is NP-complete since for m = 1, FP is the subset-
sum problem which is known to be NP-complete [6]. Problems of this form can
be very difficult for existing IP solvers even for a relatively small number n of
retailers and number m of products (e.g. n = 50, m = 6, and uniform integer
demand between 0 and 99 for each product and retailer). More generally, with
this choice of aij , asking for a 50/50 split and setting n = 10(m − 1) produces a
class of hard instances of 0–1 programs for existing IP solvers.
In this paper, we consider instances from the above class generated as follows: $a_{ij}$ uniform integer between 0 and 99 ($= D - 1$), $n = 10(m-1)$, and $b_i = \lfloor\frac12\sum_{j=1}^{n} a_{ij}\rfloor$ or, more generally, in the range $\lfloor\frac12(-D + \sum_{j=1}^{n} a_{ij})\rfloor$ to $\lfloor\frac12(-D + \sum_{j=1}^{n} a_{ij})\rfloor + D - 1$.
case, the solution to IP is strictly greater than 0. Instances with 40–50 variables take several weeks to run to completion.
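A generator for this instance class might look as follows (our own sketch; the function name and the fixed seed are ours):

import random

def market_split_instance(m, D=100, seed=0):
    # random instance from the class above: a_ij ~ U[0, D-1],
    # n = 10(m-1), b_i = floor(sum_j a_ij / 2)  (the 50/50 split)
    rng = random.Random(seed)
    n = 10 * (m - 1)
    A = [[rng.randrange(D) for _ in range(n)] for _ in range(m)]
    b = [sum(row) // 2 for row in A]
    return A, b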
Note: The problem class IP is related to the knapsack problems considered by Chvátal [3]. It is shown in [3] that these knapsack problems are hard to solve using branch-and-bound algorithms. However, the coefficients of the knapsack constraint are required to be very large ($U[1, 10^{n/2}]$). For the instances in the class IP that we consider, the coefficients $a_{ij}$ are relatively small ($U[0, 99]$). By combining the constraints of IP with appropriate multipliers (e.g., multiplying constraint $i$ by $(nD)^{i-1}$) and obtaining a surrogate constraint, we get an equivalent problem by choosing $D$ large enough, say $D = 100$ in our case. The resulting class of knapsack instances is similar to that considered in [3].
Using the surrogate constraint approach described above, we get a 0–1 knapsack problem which is equivalent to the original problem. Clearly, this technique is suitable only for problems with few constraints. Dynamic programming algorithms can be used to solve this knapsack problem. Here, we use an implementation due to Martello and Toth [13], pages 108–109. The complexity of the algorithm is $O(\min(2^{n+1}, nc))$, where $c$ is the right-hand side of the knapsack
constraint. For the instances of size $3 \times 20$ and $4 \times 30$, we used the multiplier $(nD)^{i-1}$ with $D = 100$ for constraint $i$ to obtain the surrogate constraint. For the larger instances, we ran into space problems: we tried using smaller multipliers (e.g., $(nD')^{i-1}$ with $D' = 20$), but then the one-to-one correspondence between solutions of the original problem and those of the surrogate constraint is lost, so one has to investigate all the solutions to the surrogate constraint. Unfortunately, this requires that we store all the states of the dynamic program (the worst case bound for the number of states is $2^n$). Hence, even when we decreased the multipliers, we faced space problems. See Table 2. We note that there are other, more sophisticated dynamic programming based procedures [14] which could be used here. Hybrid techniques which use, for example, dynamic programming within branch-and-bound are also available [13].
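A compact restatement of the surrogate construction together with a set-based subset-sum DP (our own sketch, not the Martello-Toth code; for realistic m the right-hand side c is astronomical, which is exactly the space problem described above):

def surrogate(A, b, D=100):
    # single knapsack equality equivalent to Ax = b: multiply row i by
    # (nD)^(i-1) and add; the one-to-one correspondence needs D large enough
    n, mult = len(A[0]), 1
    a, c = [0] * n, 0
    for row, bi in zip(A, b):
        for j in range(n):
            a[j] += mult * row[j]
        c += mult * bi
        mult *= n * D
    return a, c

def dp_feasible(a, c):
    # O(min(2^n, nc)) subset-sum DP: is there x in {0,1}^n with a.x = c?
    reachable = {0}
    for aj in a:
        reachable |= {s + aj for s in reachable if s + aj <= c}
    return c in reachable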
For the basis reduction approach, we report results obtained using an imple-
mentation of the generalized basis reduction by Xin Wang [16][17]. It uses the
ideas from Lenstra [11], Lovász and Scarf [12] and Cook, Rutherford, Scarf and
Shallcross [4]. We consider the feasibility question FP here. Given the polytope
P = {0 ≤ x ≤ 1 : Ax = b}, the basis reduction algorithm either finds a 0,1
point in P or generates a direction d in which P is “flat”. That is, max{dx − dy :
x, y ∈ P } is small. Without loss of generality, assume d has integer coordinates.
For each integer t such that dmin{dx : x ∈ P }e ≤ t ≤ bmax{dx : x ∈ P }c the
feasibility question is recursively asked for P ∩ {x : dx = t}. The dimension of
each of the polytopes P ∩ {x : dx = t} is less than the dimension of P . Thus,
applying the procedure to each polytope, a search tree is built which is at most
n deep. A direction d in which the polytope is “flat” is found using a generalized
basis reduction procedure [12,4].
Next we consider a group relaxation GP of FP, obtained from the set S = {x ∈ {0, 1}^n : Ax = b} by replacing each equation by a congruence: Sδ = {x ∈ {0, 1}^n : Ax ≡ b (mod δ)}, where δ ∈ Z_+^m. GP is interesting because (i) in general, GP is easier to solve than IP [15] and (ii) every solution of FP satisfies GP. The feasible solutions to
GP can be represented as s-t paths in a directed acyclic layered network G. The
digraph G has a layer corresponding to each variable xj , j ∈ N , a source node s
and a sink node t. The layer corresponding to variable xj has δ1 × δ2 × ... × δm nodes. Node j^k, where k = (k1, ..., km), ki = 0, ..., δi − 1 for i = 1, ..., m, can be reached from the source node s if variables x1, x2, ..., xj can be assigned values 0 or 1 such that Σ_{ℓ=1..j} aiℓ xℓ ≡ ki (mod δi), i = 1, 2, ..., m. When this is the case, node j^k has two outgoing arcs (j^k, (j+1)^k) and (j^k, (j+1)^{k′}), where k′i ≡ ki + ai,j+1 (mod δi), corresponding to setting variable x_{j+1} to 0 or to 1. The only arc into t is from node n^{b (mod δ)}. So, the digraph G has N = 2 + n × δ1 × ... × δm nodes and at most twice as many arcs.
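A small sketch of this construction (our own illustration) computes the reachable nodes layer by layer:

  # Node (j, k) is created only if some 0-1 assignment of x1,...,xj has
  # partial sums congruent to k = (k1,...,km) componentwise mod delta.
  def group_digraph(A, b, delta):
      m, n = len(A), len(A[0])
      target = tuple(b[i] % delta[i] for i in range(m))
      layers = [{(0,) * m}]          # residues reachable after 0 variables
      arcs = []
      for j in range(n):
          nxt = set()
          for k in layers[-1]:
              k0 = k                                                  # x_{j+1} = 0
              k1 = tuple((k[i] + A[i][j]) % delta[i] for i in range(m))  # x_{j+1} = 1
              arcs.append(((j, k), (j + 1, k0), 0))
              arcs.append(((j, k), (j + 1, k1), 1))
              nxt.update((k0, k1))
          layers.append(nxt)
      return layers, arcs, target in layers[n]   # last flag: GP is feasible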
Example 2: Consider the set
S = {x ∈ {0, 1}^4 : 3x1 + 2x2 + 3x3 + 4x4 = 5
                    6x1 + 7x2 + 3x3 + 3x4 = 10}
Corresponding to the choice δi = 2, i = 1, 2, we have the following group relaxation
S22 = {x ∈ {0, 1}^4 : 1x1 + 0x2 + 1x3 + 0x4 ≡ 1 (mod 2)
                      0x1 + 1x2 + 1x3 + 1x4 ≡ 0 (mod 2)}
Figure 1 gives the layered digraph G for this relaxation.
Fig. 1. The layered digraph G for the group relaxation S22 of Example 2.
Every s-t path in G represents a feasible 0–1 solution to S22. What is more important is that every feasible solution to FP is also an s-t path in G. But this relationship is not reversible; that is, an s-t path in G may not be feasible for IP.
Also, G may contain several arcs which do not belong to any s-t path. Such arcs
can be easily identified and discarded as follows: Among the outgoing arcs from
nodes in layer n − 1, we only keep those arcs which reach the node nb (mod δ)
and delete all other arcs. This may introduce paths which terminate at layer
n − 1. Hence, going through the nodes in layer n − 2, we discard all outgoing
arcs on paths which terminate at layer n − 1, and so on. It can be easily seen
that performing a “backward sweep” in G in this manner, in time O(N), we get a new graph G′ which consists of only the arcs in solutions to the group relaxation. For the graph G in Figure 1, the graph G′ is shown in Figure 2.
Fig. 2. The solution digraph, G′, for Example 2.
3. Let k = 2^p, i = 1 and j = k.
   do while ((i ≤ k) AND (j ≥ 1)) {
     If (SS1[i] + SS2[j] = b) then quit. (Answer to SSFP is “yes”)
     else if (SS1[i] + SS2[j] < b) then set i = i + 1
     else set j = j − 1 }.
4. Answer to SSFP is “No”.
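The missing Steps 1–2 sort the 2^p subset sums of each half of the variables (p = n/2). A complete sketch of the procedure SSP (our names) is:

  # Meet-in-the-middle: Steps 1-2 build the sorted lists of subset sums of
  # each half; Steps 3-4 are the two-pointer scan.
  def ssp(a, b):
      half = len(a) // 2
      def subset_sums(coeffs):
          sums = [0]
          for c in coeffs:
              sums += [s + c for s in sums]          # all 2^p sums of this half
          return sorted(sums)
      ss1, ss2 = subset_sums(a[:half]), subset_sums(a[half:])
      i, j = 0, len(ss2) - 1
      while i < len(ss1) and j >= 0:
          t = ss1[i] + ss2[j]
          if t == b:
              return True                            # answer to SSFP is "yes"
          if t < b:
              i += 1                                 # need a larger left sum
          else:
              j -= 1                                 # need a smaller right sum
      return False                                   # answer to SSFP is "no"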
The complexity of this procedure is dominated by the sorting step and therefore it is O(n · 2^(n/2)). This procedure can be extended to answer FP as follows: For i = 1, ..., m, multiply the ith constraint by (nD)^(i−1) and add all the constraints
to obtain a single surrogate constraint
Σ_{j=1..n} (Σ_{i=1..m} (nD)^(i−1) aij) xj = Σ_{i=1..m} (nD)^(i−1) bi.
Let a = (Σ_{i=1..m} (nD)^(i−1) ai1, ..., Σ_{i=1..m} (nD)^(i−1) ain) and b = Σ_{i=1..m} (nD)^(i−1) bi. Call SSP(n, a, b).
Note: As for dynamic programming, this technique is suitable only for problems with a few constraints. If (nD)^(m−1) is too large, a smaller number can be used, but then the one-to-one correspondence between the solutions of FP and the solutions of the surrogate constraint is lost. In this case note that, if for some i and j we have SS1[i] + SS2[j] = b, the corresponding 0–1 solution may not satisfy FP. So, in order to solve FP, we need to find all the pairs i, j such that SS1[i] + SS2[j] = b.
4 Conclusions
In this paper, we consider a class of 0,1 programs with n variables, where n is a
multiple of 10. As noted in the previous section, although the group relaxation is
able to solve problem instances with up to 50 variables, its running time increases
rapidly. Its space complexity and expected time complexity can be estimated to be O(√n · 2^(n/2)). It is an open question to find an algorithm with expected time complexity better than O(2^(n/2)). Our computational experience indicates that the
standard approaches to integer programming are not well suited for this class
of problems. As such, we would like to present these small-sized 0–1 integer
programs as a challenge for the research community and we hope that they may
lead to new algorithmic ideas.
References
1. E. Balas, S. Ceria, and G. Cornuéjols. Mixed 0–1 programming by lift-and-project
in a branch-and-cut framework. Management Science, 42:1229–1246, 1996.
2. A. Charnes and W. W. Cooper. Management Models and Industrial Applications
of Linear Programming. Wiley, New York, 1961.
3. V. Chvátal. Hard knapsack problems. Operations Research, 28:1402–1411, 1980.
4. W. Cook, T. Rutherford, H. E. Scarf, and D. Shallcross. An implementation of the
generalized basis reduction algorithm for integer programming. ORSA Journal on
Computing, 3:206–212, 1993.
5. H. P. Crowder and E. L. Johnson. Use of cyclic group methods in branch-and-
bound. In T. C. Hu and S. M. Robinson, editors, Mathematical Programming,
pages 213–216. Academic Press, 1973.
Chain and Cactus Representations of Minimum Cuts
Lisa Fleischer
1 Introduction
A minimum cut of a graph is a non-empty, proper subset of the vertices such
that the sum of the weights of the edges with only one endpoint in the set is
minimized. An undirected graph on n vertices can contain up to n(n − 1)/2 minimum cuts [7,4]. For many applications, it is useful to know many, or all minimum cuts
of a graph, for instance, in separation algorithms for cutting plane approaches
to solving integer programs [5,8,2], and in solving network augmentation and
reliability problems [3]. Many other applications of minimum cuts are described
in [19,1].
In 1976, Dinits, Karzanov, and Lomonosov [7] published a description of a
very simple data structure called a cactus that represents all minimum cuts of
an undirected graph in linear space. This is notable considering the number of
possible minimum cuts in a graph, and the space needed to store one minimum
cut. Ten years later, Karzanov and Timofeev [16] outlined the first algorithm
to build such a structure for an unweighted graph. Although their outline lacks
some important details, it does provide a framework for constructing correct
algorithms [20]. In addition, it can be extended to weighted graphs.
The full paper is available at http://www.ieor.columbia.edu/∼lisa/papers.html. This work was supported in part by ONR through an NDSEG fellowship, by AASERT through grant N00014-95-1-0985, by an American Association of University Women Educational Foundation Selected Professions Fellowship, by the NSF PYI award of Éva Tardos, and by NSF through grant DMS 9505155.
The earliest algorithm for finding all minimum cuts in a graph uses maximum
flows to compute minimum (s, t)-cuts for all pairs of vertices (s, t). Gomory
and Hu [12] show how to do this with only n (s, t)-flow computations. The
fastest known deterministic maximum flow algorithm, designed by Goldberg and Tarjan [11], runs in O(nm log(n^2/m)) time. Hao and Orlin [13] show how a minimum cut can be computed in the same asymptotic time as one run of the Goldberg-Tarjan algorithm. Using ideas of Picard and Queyranne [18], the Hao-Orlin algorithm can be easily modified to produce all minimum cuts. Karger and Stein [15] describe a randomized algorithm that finds all minimum cuts in O(n^2 log^3 n) time. Recently, Karger has devised a new randomized algorithm that finds all minimum cuts in O(n^2 log n) time.
The Karzanov-Timofeev outline breaks neatly into two parts: generating a
sequence of all minimum cuts, and constructing the cactus from this sequence.
This two-phase approach applies to both unweighted and weighted graphs. It is
known that the second phase can be performed in O(n^2) time for both weighted
and unweighted graphs [16,17,20]. The bottleneck for the weighted case is the
first phase. Thus the efficiency of an algorithm to build a cactus tree depends
on the efficiency of the algorithm used to generate an appropriate sequence of
all minimum cuts of an undirected graph. The Karzanov-Timofeev framework
requires a sequence of all minimum cuts found by generating all (s, t) minimum
cuts for a given s and a specific sequence of t’s. The Hao-Orlin algorithm also
generates minimum cuts by finding all (s, t) minimum cuts for a given s and a
sequence of t’s. However, the order of the t’s cannot be predetermined in the
Hao-Orlin algorithm, and it may not be an appropriate order for the Karzanov-
Timofeev framework.
All minimum cuts, produced by any algorithm, can be sequenced appropriately for constructing a cactus tree in O(n^3) time. This is not hard to do considering there are at most n(n − 1)/2 minimum cuts, and each can be stored in O(n) space. The main result of this paper is an algorithm that constructs an appropriate sequence of minimum cuts within the same time as the asymptotically fastest deterministic algorithm that finds all minimum cuts in weighted graphs, improving the deterministic time to construct the cactus in graphs with n vertices and m edges from O(n^3) to O(nm log(n^2/m)).
Why build a cactus tree? Any algorithm that is capable of computing all
minimum cuts of a graph implicitly produces a data structure that represents
all the cuts. For instance, Karger’s randomized algorithm builds a data struc-
ture that represents all minimum cuts in O(k + n log n) space, where k is the
number of minimum cuts. The cactus tree is special because it is simple. The
size of a cactus tree is linear in the number of vertices in the original graph,
and any cut can be retrieved in time linearly proportional to the size of the cut.
In addition, the cactus displays explicitly all nesting and intersection relations
among minimum cuts. This, as well as the compactness of a cactus, is unique
among representations of minimum cuts of weighted graphs.
Karzanov and Timofeev [16] propose a method of constructing a cactus tree
using the chain representation of minimum cuts (described in Section 2.4). Their
algorithm is lacking details, some of which are provided by Naor and Vazirani [17] in a paper that describes a parallel cactus algorithm. Neither of these algorithms is correct, since they construct cacti that may not contain all minimum cuts of the original graph. De Vitis [20] provides a complete and correct description
of an algorithm to construct a cactus, based on the ideas in [16] and [17]. Karger
and Stein [15] give a randomized algorithm for constructing the chain represen-
tation in O(n^2 log^3 n) time. Benczúr [3] outlines another approach to build a
cactus without using the chain representation. However, it is not correct since
it constructs cacti that may not contain all minimum cuts of the original graph.
Gabow [9,10] describes a linear-sized representation of minimum cuts of an un-
weighted graph and gives an O(m + λ^2 n log(n/λ)) time algorithm to construct
this representation, where λ is the number of edges in the minimum cut.
This paper describes how to modify the output of the Hao-Orlin algorithm
and rearrange the minimum cuts into an order suitable for a cactus algorithm
based on Karzanov-Timofeev framework. The algorithm presented here runs in
O(nm + n^2 log n) time—at least as fast as the Hao-Orlin algorithm. Since an algorithm based on the Karzanov-Timofeev outline requires O(n^2) time, plus
the time to find and sort all minimum cuts, this leads to the fastest known
deterministic algorithm to construct a cactus of a weighted graph, and the fastest
algorithm for sparse graphs.
2 Preliminaries
We assume the reader is familiar with standard graph terminology as found in [6].
A graph G = (V, E) is defined by a set of vertices V, with |V| = n, and a set of edges E ⊆ V × V, with |E| = m. A weighted graph also has a weight function on the edges, w : E → ℝ. For the purposes of this paper, we assume w is non-
negative. A cut in a graph is a non-empty, proper subset of the vertices. The
weight of a cut C is the sum of weights of edges with exactly one endpoint in the
set. The weight of edges with one endpoint in each of two disjoint vertex sets S
and T is denoted as w(S, T). A minimum cut is a cut C with w(C, C̄) ≤ w(C′, C̄′) for all cuts C′. An (S, T)-cut is a cut that contains S and is disjoint from T. An (S, T) minimum cut is a minimum cut that separates S and T. Note that if the
value of the minimum (S, T )-cut is greater than the value of the minimum cut,
there are no (S, T ) minimum cuts.
Two cuts (S1, S̄1) and (S2, S̄2) that meet the conditions of the above lemma are
called crossing cuts. A fundamental lemma further explaining the structure of
minimum cuts in a graph is the Circular Partition Lemma. This lemma is proven
by Bixby [4] and Dinits, et al. [7], with alternate proofs in [3,20].
The proof of this lemma is long and technical, and it is omitted here. The
proof uses Lemma 1 to argue that any minimum cut not represented in a circular
partition must be contained in one of the sets of the partition.
It may be that G has more than one circular partition. Let P := {V1 , . . . , Vk }
and Q := {U1 , . . . , Ul } be two distinct circular partitions of V . These partitions
are compatible if there are unique i and j such that Ur ⊂ Vi for all r ≠ j and Vs ⊂ Uj for all s ≠ i. Proofs of the next two statements are in the full paper.
A set of sets is called laminar if every pair of sets is either disjoint or one set is
contained in the other. Laminar sets that do not include the empty set can be
represented by a tree where the root node corresponds to the entire underlying
set, and the leaves correspond to the sets that contain no other sets. The parent
of a set A is the smallest set properly containing A. Since the total number of
nodes in a tree with n leaves is less than 2n, the size of the largest set of laminar
sets of n objects is at most 2n − 1.
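As a small illustration (our own sketch, not from the paper), the tree of a laminar family can be built by sorting the sets by size and attaching each to the smallest set properly containing it:

  # Parent pointers for a laminar family; the root is the whole universe.
  def laminar_forest(family, universe):
      fam = sorted((frozenset(s) for s in family), key=len)
      parent = {}
      for idx, A in enumerate(fam):
          parent[A] = frozenset(universe)          # default: the root
          for B in fam[idx + 1:]:                  # candidates, smallest first
              if A < B:                            # proper-subset test
                  parent[A] = B
                  break
      return parent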
As defined here, and in [7], a graph does not necessarily have a unique cactus.
For instance, a cycle on three nodes can also be represented by a cactus that
is a star: an empty node of degree three, and three nodes of degree one each
containing a distinct vertex. There are many rules that could make the definition
unique [17,20]. We will follow De Vitis [20] in the definition of a canonical cactus.
Theorem 8 (De Vitis [20]). Every weighted graph has a unique canonical
cactus.
Karzanov and Timofeev [16] make an important observation about the struc-
ture of minimum cuts, which they use to build the cactus: if two vertices s and
t are adjacent in the graph G, then the minimum cuts that separate these ver-
tices are nested sets. This motivates assigning vertices of G an adjacency order
{v1 , . . . , vn } so that vi+1 is adjacent to a vertex in Vi := {v1 , . . . , vi }. Let Mi
be the set of minimum cuts that contain Vi but not vi+1 . We will refer to such
minimum cuts as (Vi , vi+1 ) minimum cuts. The following lemma summarizes the
observation of [16].
Lemma 9. If the vertices in G are adjacency ordered, then all cuts in Mi are
non-crossing.
This implies that the cuts in Mi form a nested chain and can be represented by
a path Pi of sets that partition V : let A1 ⊂ A2 ⊂ · · · ⊂ Al be the minimum cuts
separating Vi and vi+1 . The first set X1 of the path is A1 , and each additional
set Xr = Ar \Ar−1 , with the final set Xl+1 equal to V \Al . Each link of the
path represents the minimum cut Ar that would result if the path were broken
between sets Xr and Xr+1 .
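In code, passing from the nested cuts to the path is immediate; the following sketch (names ours) returns the partition classes in order:

  # From nested cuts A_1 ⊂ ... ⊂ A_l to the chain P_i.
  def chain_to_path(nested_cuts, V):
      path, prev = [], set()
      for A in nested_cuts:                 # assumed given in increasing order
          path.append(set(A) - prev)        # X_r = A_r \ A_{r-1}
          prev = set(A)
      path.append(set(V) - prev)            # X_{l+1} = V \ A_l
      return path                           # breaking after X_r recovers A_r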
Note that, for any ordering of vertices, the Mi form a partition of the min-
imum cuts of G: vi and vi+1 are the smallest, consecutively indexed pair of
vertices that are separated by a cut, if and only if the cut is in Mi . The set of
Pi for all 1 ≤ i < n is called the chain representation of minimum cuts. For this
reason, we will refer to these paths as chains.
Lemma 12. Let (u, w) be a cycle edge of H(Gi+1) that lies on cycle Y, and let T1, T2, ..., Tr be the circular partition represented by Y with Vi ⊂ T1. Then either ∪_{l=2..r} Tl ⊂ Xj for some j, or there are indices a and b such that T2 = Xa, T3 = Xa+1, ..., Tr = Xb.
Theorem 13. The canonical cactus of a graph can be constructed from the chain
representation of all minimum cuts of the graph in O(n2 ) time.
To find all minimum cuts, we make minor modifications to the minimum cut
algorithm of Hao and Orlin [13]. The Hao-Orlin algorithm is based on Goldberg
and Tarjan’s preflow-push algorithm for finding a maximum flow [11]. It starts
by designating a source vertex, u1 , assigning it a label of n, and labeling all other
vertices 0. For the first iteration, it selects a sink vertex, u2 , and sends as much
flow as possible from the source to the sink (using the weights as arc capacities),
increasing the labels on some of the vertices in the process. The algorithm then
repeats this procedure n − 2 times, starting each time by contracting the current
sink into the source, and designating a node with the lowest label as the new
sink. The overall minimum cut is the minimum cut found in the iteration with
the smallest maximum flow value, where the value of a flow is the amount of
flow that enters the sink.
To find all minimum cuts with this algorithm, we use the following result.
Define a closure of a set A of vertices in a directed graph to be the smallest set
containing A that has no arcs leaving the set.
Let Ui be the set of vertices in the source at iteration i of the Hao-Orlin routine,
1 ≤ i ≤ n − 1. If a complete maximum flow is computed at iteration i, Lemma 14
implies that the set of (Ui , ui+1 ) minimum cuts equals the set of closures in the
residual graph of this flow. This representation can be made more compact by
contracting strongly connected components of the residual graph, creating a
directed, acyclic graph (DAG). We refer to the graph on the same vertex set,
but with the direction of all arcs reversed (so that they direct from Ui to ui+1 ),
as the DAG representation of (Ui , ui+1 ) minimum cuts.
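A sketch of this contraction step follows; using networkx here is our assumption, since the paper prescribes no implementation:

  # Contract strongly connected components of the residual graph; in the
  # resulting DAG (arcs reversed) the closures are the (U_i, u_{i+1}) minimum cuts.
  import networkx as nx                     # assumption: networkx is available

  def dag_representation(residual_arcs):
      G = nx.DiGraph(residual_arcs)         # arcs with positive residual capacity
      dag = nx.condensation(G)              # one node per SCC, acyclic
      return dag.reverse()                  # direct arcs from U_i towards u_{i+1}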
The Hao-Orlin routine does not necessarily compute an entire maximum flow
at iteration i. Instead it computes a flow of value ≥ λ that obeys the capacity
constraint for each edge, but leaves excess flow at some nodes contained in a dor-
mant region that includes the source. Let Ri denote the vertices in the dormant
region at the end of the ith iteration of the Hao-Orlin algorithm. The following
lemma follows easily from the results in [13]. Together with the above observa-
tions, this implies that at the end of iteration i of the Hao-Orlin algorithm, we
can build a linear-sized DAG representation of all (Ui , ui+1 ) minimum cuts.
The second and more serious problem with the Hao-Orlin algorithm is that
the ordering of the vertices implied by the selection of sinks in the algorithm
may not be an adjacency ordering as required for the construction phase of the
Karzanov-Timofeev framework. This means that some of the DAGs may not
correspond to directed paths. We address this problem in the next section.
3 The Algorithm
In this section, we discuss how to transform the DAG representation of all min-
imum cuts into the chains that can be used to construct a cactus representation
of all minimum cuts. To do this, we use an intermediate structure called an
(S, T )-cactus.
3.1 (S,T)-Cacti
Each cycle in an (S, T )-cactus has two nodes that are adjacent to the rest of the
path, one on the source side, one on the sink side. The other nodes on the cycle
form two paths between these two end nodes. We will call these cycle-paths.
Thus each cycle in the (S, T )-cactus can be described by a source-side node, a
sink-side node, and two cycle-paths of nodes connecting them. The length of a
cycle-path refers to the number of nodes on the cycle-path. A zero length cycle-
path implies that it consists of just one arc joining the source-side node to the
sink-side node.
Without loss of generality, we assume that no cycle of an (S, T )-cactus has a
zero length cycle-path. Any cycle of an (S, T )-cactus that contains a zero length
cycle-path can be transformed into a path by deleting the arc in the zero length
cycle-path. This operation does not delete any (S, T )-cut represented by the
(S, T )-cactus.
Note that Lemma 9 implies that an (S, T )-cactus of a chain is a path without
cycles. A first step in building these chains needed for the cactus algorithm is
the (Uj−1 , uj )-cacti, 1 < j ≤ n. Let Dj be the (Uj−1 , uj )-cactus for 1 < j ≤ n.
We start by fixing an adjacency ordering {v1 , . . . , vn } with v1 = u1 and let
σ : V → V be the permutation σ(i) = j if vi = uj .
The algorithm we present below builds each chain Pi , 1 < i ≤ n, one at
a time by reading cuts off of the Dj . The basic idea is to use the following
fact established below: If any two cuts from Dj are such that one is completely
contained within the other, then the index of the chain to which the first belongs
is no more than the index of the chain to which the second belongs. Thus, all
the minimum cuts belonging in Pi that are contained in one Dj are consecutive;
and, as we walk through the cuts from Uj−1 to uj in Dj , the index of the chain
to which a cut belongs is non-decreasing. The validity of these statements is
established in the next two subsections.
The Code
(s,t)-CactiToChains (v)
0. For i = 2 to n,
0.   Pi = ∅.
0.   For j = 2 to n − 1,
1.     nij := the node in Dj that contains vi.
2.     If nij is not the source node,
3.       Identify the path of nodes from source to predecessor of nij in Dj.
4.       If Pi = ∅, let the source node of Dj be the source node of Pi.
5.       Else, add vertices in the source but not in Pi to the last node of Pi.
6.       Append the remaining path to Pi, skipping empty nodes;
7.       and, in Dj, contract this path into the source node.
8.       If nij is on a cycle in Dj,
9.         Append to Pi the chain of the cycle that does not contain nij.
10.      Contract nij into the source node of Dj.
11.      Add a component to Pi that contains only uj.
12.  If Pi ≠ ∅, add the vertices not in Pi to the last node of Pi.
Fig. 2. The (s, t)-cactus Dj, updated for iteration i. The black nodes represent the possible locations of nij.
Lemma 20. At the end of phase i, all vertices vk , k ≤ i are contained in the
source node of every remaining Dj .
Proof. Proof is by induction. At start, all sources contain vertex v1 = u1 . Assume
vertices vk , k < i are contained in the sources of the remaining Dj at the start
of phase i. For each j, if nij is not the source node, then nij is contracted into
the source in Step 10. ⊓⊔
Lemma 21. In phase i, for all j, nij either lies on the path in Dj from the
source to the first node of the cycle closest to the source, or is one of the two
nodes on the cycle adjacent to this path, as in Figure 2.
Proof. Figure 2 shows the possible locations of nij in Dj as specified in the
lemma. We show that if there is a cycle between nij and the source Sj , then we
can find crossing cuts in Mi , contradicting our choice of the ordering of the vi .
If nij is not in one of the locations stipulated in the lemma, then some cycle
contains at least one node that either precedes, or is unrelated to nij , on both
of its cycle-paths. (Recall that we are assuming that every cycle-path has length
greater than zero.) Thus there are two crossing minimum cuts that separate the
source of Dj from vi . Lemma 20 implies that Vi−1 is contained in the source,
and hence these are cuts in Mi, contradicting Lemma 1. ⊓⊔
Lemma 22. The algorithm constructs chains that contain every vertex exactly
once.
Proof. By construction, each Pi is a chain that contains every vertex. Suppose,
in iteration j, we are about to add a vertex ur to the chain Pi . If ur is in the
source component of Dj , we explicitly check before we add ur . Otherwise, r ≥ j
and there is some cut that separates vertices in Uj−1 and Vi−1 from vertices ur
and uj . All cuts added to Pi before this iteration come from a Dk with k < j, and
thus contain a strict subset of vertices in Uj−1 . By Lemma 9, the cuts separating
Vi−1 from vi are nested, and so none of the previous cuts contain ur in the source
side. Thus we add ur to Pi exactly once. ⊓⊔
Lemma 23. Each complete node added to Pi represents the same cut that is
represented by the node in the current Dj .
Proof. Proof is by induction on j. The first node added to Pi is the source node
of some Dj and clearly both these cuts are the same. Now assume that up until
this last node, all previous nodes added to Pi represented the corresponding cut
of the relevant Dk , k ≤ j. Adding the current node from Dj , we add or have
added in previous steps to Pi all the vertices that are in the source side of this
cut. We must argue that there are no additional vertices in any node preceding
this one in Pi . Recall that the cuts in Mi are nested. Thus all previous nodes
added to Pi represent cuts strictly contained in the current cut. Hence the node
added to Pi represents the same cut this node represents in the current Dj. ⊓⊔
Lemma 24. Each complete node added to Pi , 1 < i ≤ n, except the last, repre-
sents a distinct cut in Mi .
Proof. Since each node added to Pi contains some vertex, and no vertex is re-
peated, each represents a distinct cut. When Pi is started, Lemma 20 implies
that all vertices in Vi−1 are contained in the source. Vertex vi is only added in
Step 11, when j = σ(i). This starts the final node of Pi since vi = uj is con-
tained in the source nodes of all Dk , k > j. Hence all previous nodes added to
Pi represent cuts that contain Vi−1 but not vi, and are thus cuts in Mi. ⊓⊔
Proof of Theorem 19. For a fixed cut C, let vi be the vertex of least index in C.
We show that this cut is represented in Pi . By the end of each phase k, k < i, the
algorithm contracts all nodes on the path from the source to the node containing
vk . Since Vi−1 is contained in the source side of C, this cut is not contracted
in Dj at the start of phase i. If the cut C is represented in Dj by one node, it
lies on the path of Dj from the expanded source to nij . Thus this node is added
to Pi in either Step 5 or 6. Otherwise the cut is through the same cycle that
contains nij . One node representing the cycle is the node immediately preceding
nij , currently source-side node of the cycle. The other node lies on the cycle-path
not containing nij . This cut is added to Pi in Step 9, when this latter node is
added. ⊓⊔
Proof. In Step 1, we use the labeling of the arcs to walk through Dj starting from
the source until we find nij . By Lemmas 20 and 21, all these cuts are (Vi−1 , vi )-
cuts, and hence the arcs are labeled with vi . The time to do this depends on the
out degree of each node encountered in Dj (at most 2), and is charged to the
minimum cuts we encounter. Note that if none of the arcs leaving the source is
labeled with the current chain index i, then nij must be the source node. This
takes care of Steps 1-3, and Step 8.
Let’s put aside Steps 5 and 12 and concentrate on the remaining steps. In
these, we are either adding nodes and vertices to Pi , or contracting nodes of Dj .
This takes constant time per addition or contraction of vertex. Since each vertex
is added at most once to Pi and contracted into the source of Dj at most once,
over the course of the algorithm these steps take no more than O(n^2) time.
Finally, we consider Steps 5 and 12. In these, we need to determine the
difference of two sets that are increasing over the course of the algorithm. To do
this efficiently, we wait until the end of a phase to add the vertices that would
have been added at Step 5. At the end of each phase, we make a list of the
vertices that are missing from Pi . This can be done in linear time per phase by
maintaining an array in each phase that indicates if a vertex has been included
in Pi . For each of the vertices not in Pi , we then need to determine to which
node of Pi it should be added. The nodes that are possible candidates are the
incomplete nodes—any node created in Step 11.
Vertex v is added to the node containing uk in Step 5 if v is contained in the
source of Dj , j > k but is not contained in the source of Dk , for iterations k and
j where nik and nij are not source nodes, and k is the highest index with these
properties. Consider the Dj for which nij is not the source node. The sources
of these Dj are nested for any fixed i, by the observation that the cuts of Mi
are nested. That is, once a vertex is contained in one of them, it is contained
in the rest. During each phase, we maintain an array-list of the consecutive j for
which nij is not the source node. For each vertex, we can now binary search
through these nested sources using the arrays described before the statement
of the theorem to find the node of Pi that should contain v in O(log n) time.
The time spent on this step over the course of the algorithm is then bounded by
O(n^2 log n): logarithmic time per vertex per phase.
Note that we have added vertices to the sources of these Dj since we started
the phase. All the vertices we added in Step 7 have also been added to Pi , so
they are not among the vertices we need to place at the end (i.e. the vertices not
in Pi ). We do need to worry about the vertices added in Step 10, however. To
avoid this problem, simply postpone the contractions described in Step 10 until
after we have assigned all vertices to a node of Pi. ⊓⊔
Corollary 26. The chain and the cactus representations of all minimum cuts
of a weighted graph can both be constructed in O(mn log(n^2/m)) time. ⊓⊔
Acknowledgement
I would like to thank Éva Tardos for assistance with the proof of Theorem 25.
References
17. D. Naor and V. V. Vazirani. Representing and enumerating edge connectivity cuts
in RNC. In Proc. Second Workshop on Algorithms and Data Structures, LNCS,
Vol. 519, pages 273–285. Springer-Verlag, 1991.
18. J.-C. Picard and M. Queyranne. On the structure of all minimum cuts in a network
and applications. Mathematical Programming Study, 13:8–16, 1980.
19. J.-C. Picard and M. Queyranne. Selected applications of minimum cuts in net-
works. INFOR, 20(4):394–422, 1982.
20. A. De Vitis. The cactus representation of all minimum cuts in a weighted graph.
Technical Report 454, IASI-CNR, 1997.
Simple Generalized Maximum Flow Algorithms
Éva Tardos and Kevin D. Wayne
1 Introduction
In this paper we present new algorithms for the generalized maximum flow prob-
lem, also known as the generalized circulation problem. In the traditional max-
imum flow problem, the objective is to send as much flow through a network
from one distinguished node called the source to another called the sink, subject
to capacity and flow conservation constraints. In generalized networks, a fixed
percentage of the flow is lost when it is sent along an arc. Specifically, each arc (v, w) has an associated gain factor γ(v, w): when g(v, w) units of flow enter arc (v, w) at node v, then γ(v, w)g(v, w) units arrive at w. The gain factors can represent physical transformations due to evaporation, energy dissipation, breeding, theft, or interest rates. They can also represent transformations from one commodity to another as a result of manufacturing, blending, or currency exchange. They may also represent arc failure probabilities. Many applications are described in [1,3,5].
Research supported in part by an NSF PYI award DMI-9157199, by NSF through grant DMS 9505155, and by ONR through grant N00014-96-1-0050. Research supported in part by ONR through grant AASERT N00014-97-1-0681.
Since the generalized maximum flow problem is a special case of linear pro-
gramming, it can be solved using simplex, ellipsoid, or interior-point methods.
Many general purpose linear programming algorithms can be tailored for the
problem. The network simplex method can handle generalized flows. Kapoor
and Vaidya [16] showed how to speed up interior-point methods on network
flow problems by exploiting the structured sparsity in the underlying constraint
matrix. Murray [18] and Kamath and Palmon [15] designed different interior-
point algorithms for the problem. We note that these simplex and interior-point
methods can also solve the generalized minimum cost flow problem.
The first combinatorial algorithms for the generalized maximum flow prob-
lem were the augmenting path algorithms of Jewell [14] and Onaga [19] and
exponential-time variants. Truemper [22] observed that the problem is closely
related to the minimum cost flow problem, and that many of the early gen-
eralized maximum flow algorithms were, in fact, analogs of pseudo-polynomial
minimum cost flow algorithms. Goldberg, Plotkin and Tardos [7] designed the
first two combinatorial polynomial-time algorithms for the problem: Fat-Path
and MCF. The Fat-Path algorithm uses capacity-scaling and a subroutine that
cancels flow-generating cycles. The MCF algorithm performs minimum cost flow
computations. Radzik [20] modified the Fat-Path algorithm by canceling only flow-generating cycles with sufficiently large gains. Goldfarb and Jin [12] modified
the MCF algorithm by replacing the minimum cost flow subroutine with a sim-
pler computation. Goldfarb and Jin [11] also presented a dual simplex variant of
this algorithm. Recently, Goldfarb, Jin and Orlin [13] designed a new capacity-
scaling algorithm, motivated by the Fat-Path algorithm. Tseng and Bertsekas
[23] proposed an ε-relaxation method for solving the more general generalized
minimum cost flow problem with separable convex costs. However, their running
time may be exponential in the input size.
Researchers have also developed algorithms for the approximate generalized
maximum flow problem. Here, the objective is to find a ξ-optimal flow, i.e., a flow that generates excess at the sink within a (1 − ξ) factor of the optimum, where ξ is an input parameter. Cohen and Megiddo [2] showed that the approximate generalized maximum flow problem can be solved in strongly polynomial time. Their algorithm uses a subroutine which tests feasibility of a linear system with two variables per inequality. Radzik [20] observed that the Fat-Path algorithm can be used to compute approximate flows faster than optimal flows. His
rithm can be used to compute approximates flow faster than optimal flows. His
Fat-Path variant, that cancels only flow-generating cycles with large gain, is the
fastest algorithm for computing approximate flows. Subsequently, Radzik [21]
gave a new strongly polynomial-time analysis for canceling all flow-generating
cycles, implying that the original Fat-Path algorithm computes an approximate
flow in strongly polynomial-time. For the linear programming algorithms, it is
not known how to improve the worst-case complexity of the exact algorithms to
find approximate flows.
2 Preliminaries
2.1 Generalized Networks
Since some of our algorithms are iterative and recursive, it is convenient to solve a
seemingly more general version of the problem which allows for multiple sources.
An instance of the generalized maximum flow problem is a generalized network
G = (V, E, t, u, γ, e), where V is an n-set of nodes, E is an m-set of directed arcs, t ∈ V is a distinguished node called the sink, u : E → ℝ≥0 is a capacity function, γ : E → ℝ>0 is a gain function, and e : V → ℝ≥0 is an initial excess
function. A residual arc is an arc with positive capacity. A lossy network is a
generalized network in which no residual arc has gain factor exceeding one. We consider only simple directed paths and cycles. The gain of a path P is denoted by γ(P) = Π_{e∈P} γ(e). The gain of a cycle is defined similarly. A flow-generating
cycle is a cycle whose gain is more than one.
For notational convenience we assume that G has no parallel arcs. Our algo-
rithms easily extend to allow for parallel arcs and the running times we present
remain valid. Without loss of generality, we assume the network is symmetric and
the gain function is antisymmetric. That is, for each arc (v, w) ∈ E there is an
arc (w, v) ∈ E (possibly with zero capacity) and γ(w, v) = 1/γ(v, w). We assume
the capacities and initial excesses are given as integers between 1 and B, and
the gains are given as ratios of integers which are between 1 and B. To simplify
the running times we assume B ≥ m, and use Õ(f) to denote f · log^O(1) m.
A pseudoflow g obeys the capacity constraints and antisymmetry, g(v, w) = −γ(w, v)g(w, v) for all (v, w) ∈ E. The residual excess of g at node v is eg(v) = e(v) − Σ_{(v,w)∈E} g(v, w), i.e., the initial excess minus the net flow out
node v. A flow g is a pseudoflow that has no residual deficits; it may have residual
excesses. A proper flow is a flow which does not generate any additional residual
excess, except possibly at the sink. We note that a flow can be converted into a
proper flow, by removing flow on useless paths and cycles. For a flow g we denote
its value |g| = eg (t) to be the residual excess at the sink. Let OPT(G) denote the
maximum possible value of any flow in network G. A flow g is optimal in network
G if |g| = OPT(G) and ξ-optimal if |g| ≥ (1 − ξ)OPT(G). The (approximate)
generalized maximum flow problem is to find a (ξ-) optimal flow. We sometimes
omit the adjective generalized when its meaning is clear from context.
An augmenting path is a residual path from a node with residual excess to the
sink. A generalized augmenting path (GAP) is a residual flow-generating cycle,
together with a (possibly trivial) residual path from a node on this cycle to the
sink. By sending flow along augmenting paths or GAPs we increase the net flow
into the sink. The following theorem of Onaga [19] says that the nonexistence of
augmenting paths and GAPs implies that the flow is optimal.
3 Gain-Scaling
In this section we present a rounding and scaling framework. Together, these
ideas provide a technique which can be viewed as a type of gain-scaling. By
rounding the gains, we can improve the complexity of many generalized flow
computations (e.g., canceling flow-generating cycles above). However, the rounding introduces an approximation error. Using an iterative or recursive approach, we can gradually refine the approximation until we obtain the desired
level of precision.
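As a concrete illustration of the rounding step (anticipating Section 3.1, where gains are rounded to powers of b = (1 + ξ)^(1/n); the function name is ours):

  import math

  # Round every residual gain of a lossy network (all gains <= 1) down to an
  # integer power of b = (1 + xi)**(1/n): at most a factor b is lost per arc,
  # hence at most a factor (1 + xi) along any simple path.
  def round_gains(gains, n, xi):
      b = (1.0 + xi) ** (1.0 / n)
      return {arc: b ** math.floor(math.log(g, b)) for arc, g in gains.items()}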
Theorem 4. Let G be a lossy network and let H be the rounded network con-
structed as above. If 0 < ξ < 1 then (1 − ξ)OPT(G) ≤ OPT(H) ≤ OPT(G).
Proof. Clearly OPT(H) ≤ OPT(G) since we only decrease the gain factors of
residual arcs. To prove the other inequality, we consider the path formulation
of the maximum flow problem in lossy networks. We include a variable xj for
each path Pj , representing the amount of flow sent along the path. Let x∗ be an
optimal path flow in G. Then x∗ is also a feasible path flow in H. From path Pj ,
γ(Pj )x∗j units of flow arrive at the sink in network G, while only γ̄(Pj )x∗j arrive
in network H. The theorem then follows, since for each path Pj ,
γ(Pj ) γ(Pj )
γ̄(Pj ) ≥ ≥ ≥ γ(Pj )(1 − ξ).
b |P |
1+ξ
Corollary 1. Let G be a lossy network and let H be the rounded network con-
structed as above. If 0 < ξ < 1 then the interpretation of a ξ′-optimal flow in H is a (ξ + ξ′)-optimal flow in G.
4 Truemper’s Algorithm
Truemper’s maximum flow based augmenting path algorithm is one of the sim-
plest algorithms for the generalized maximum flow problem. We apply our gain-
scaling techniques to Truemper’s algorithm, producing perhaps the cleanest and
simplest polynomial-time algorithms for the problem. In this section we first
review Truemper’s [22] algorithm. Our first variant runs Truemper’s algorithm
in a rounded network. It computes a ξ-optimal flow in polynomial-time, for
any constant ξ > 0. However, it requires exponential-time to compute optimal
flows, since we would need ξ to be very small. By incorporating error-scaling, we
show that a simple variant of Truemper’s algorithm computes an optimal flow
in polynomial-time.
A natural and intuitive algorithm for the maximum flow problem in lossy net-
works is to repeatedly send flow from excess nodes to the sink along highest-gain
(most-efficient) augmenting paths. Onaga observed that if the input network has
no residual flow-generating cycles, then the algorithm maintains this property.
Thus, we can find a highest-gain augmenting path using a single shortest path
computation with costs c(v, w) = − log γ(v, w). By maintaining canonical labels,
we can ensure that all relabeled gains are at most one, and a Dijkstra shortest
path computation suffices. Unit gain paths in the canonically relabeled network
correspond to highest gain paths in the original network. This is essentially
Onaga’s [19] algorithm. If the algorithm terminates, then the resulting flow is
optimal by Theorem 1. However, this algorithm may not terminate in finite time
if the capacities are irrational. Truemper’s algorithm [22] uses a (nongeneralized)
maximum flow computation to simultaneously augment flow along all highest-
gain augmenting paths. It is the generalized flow analog of Jewell’s primal-dual
minimum cost flow algorithm.
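A sketch of the highest-gain path computation just described (our illustration, assuming an adjacency-map representation of the residual network):

  import heapq, math

  # Dijkstra on costs c(v,w) = -log(gain); with all residual gains at most
  # one, these costs are nonnegative, so Dijkstra applies directly.
  def highest_gain_path(adj, source, sink):
      # adj: {v: [(w, gain), ...]} over residual arcs, each gain in (0, 1]
      dist, prev = {source: 0.0}, {}
      heap = [(0.0, source)]
      while heap:
          d, v = heapq.heappop(heap)
          if d > dist.get(v, float("inf")):
              continue                      # stale heap entry
          for w, gain in adj.get(v, []):
              nd = d - math.log(gain)       # nonnegative arc cost
              if nd < dist.get(w, float("inf")):
                  dist[w], prev[w] = nd, v
                  heapq.heappush(heap, (nd, w))
      if sink not in dist:
          return None                       # no augmenting path
      path, v = [sink], sink
      while v != source:
          v = prev[v]
          path.append(v)
      return path[::-1]                     # its gain is exp(-dist[sink])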
Proof. After each maximum flow computation, µ(v) strictly increases for each excess node v ≠ t.
In the full paper we prove that Algorithm IRT actually finds an optimal flow in
Õ(m^2 n^3 log B + m^2 n^2 log^2 B) time.
5 Preflow-Push
In this section we adapt Goldberg and Tarjan’s [10] preflow-push algorithm to
the generalized maximum flow problem. This is the first polynomial-time preflow
push algorithm for generalized network flows. Tseng and Bertsekas [23] designed
a preflow push-like algorithm for the generalized minimum cost flow problem,
but it may require more than B^n iterations. Using our rounding technique, we
present a preflow-push algorithm that computes a ξ-optimal flow in polynomial-
time for any constant ξ > 0. Then by incorporating error-scaling, we show how
to find an optimal flow in polynomial-time.
flow through this node, then that flow does not reach the sink. So we can safely
disregard this useless residual excess. (Periodically RPP determines which nodes
have residual paths to the sink.) Algorithm RPP maintains a flow h and node
labels µ. The algorithm repeatedly selects an active node v. If there is an admissible arc (v, w) emanating from node v, RPP pushes δ = min{eh(v), uh(v, w)} units of flow from node v to w. If δ = uh(v, w) the push is called saturating; otherwise it is nonsaturating. If there is no such admissible arc, RPP increases the label of node v by a factor of b^(1/n); this corresponds to an additive potential increase for minimum cost flows. This process is referred to as a relabel
operation. Relabeling node v can create new admissible arcs emanating from v.
To ensure that we do not create residual flow-generating cycles, we only increase
the label by a relatively small amount.
The input to Algorithm RPP is a lossy network G and error parameter ξ. Before applying the preflow-push method, RPP rounds the gains to powers of b = (1 + ξ)^(1/n), as described in Section 3.1. The method above is then applied to the rounded network H. Algorithm RPP is described in Figure 2.
We note that our algorithm maintains a pseudoflow with excesses, but no
deficits. In contrast, the Goldberg-Tarjan algorithm allows both excesses and
deficits. Also, their algorithm scales ε. We currently do not see how to improve the worst-case complexity by a direct scaling of ε.
RPP does not compute an optimal flow in polynomial time, since the precision required is roughly ξ = B^(−m). Like Algorithm IRT, Algorithm IRPP adds error-scaling, resulting in the following theorem.
6 Fat-Path
In this section we present a simple variant of Goldberg, Plotkin, and Tardos’ [7]
Fat-Path algorithm which has the same complexity as Radzik’s [20] Fat-Path
variant. Our algorithm is intuitive and its proof of correctness is much simpler
than Radzik’s. The Fat-Path algorithm can be viewed as an analog of Orlin’s
capacity scaling algorithm for the minimum cost flow problem. The original Fat-
Path algorithm computes a ξ-optimal flow in Õ(mn^2 log B log ξ^(−1)) time, while Radzik’s and our variants require only Õ(m(m + n log log B) log ξ^(−1)) time.
The bottleneck computation in the original Fat-Path algorithm is cancel-
ing residual flow-generating cycles. Radzik’s variant reduces the bottleneck by
canceling only residual flow-generating cycles with big gains. The remaining
flow-generating cycles are removed by decreasing the gain factors. Analyzing the
precision of the resulting solution is technically complicated. Instead, our vari-
ant rounds down the gains to integer powers of a base b, which depends on the
precision of the solution desired. Our rounding is done in a lossy network, which
makes the quality of the resulting solution easy to analyze. Subsequent calls to
CancelCycles are performed in a rounded network, improving the complexity.
We first review the FatAugmentations subroutine which finds augmenting
paths with sufficiently large capacity. Then we present Algorithm RFP, which
runs the Fat-Path algorithm in a rounded network. It computes approximately
optimal and optimal flows in polynomial-time. We then present a recursive ver-
sion of RFP, which improves the complexity when computing nearly optimal and
optimal flows.
Proof. To bound the running time, we note that there are at most log2(n/ξ) phases. FatAugmentations requires Õ(m^2) time per phase, and CancelCycles requires Õ(mn log C) time, where we bound C = O(nξ^(−1) log B) as above.
The algorithm terminates when ∆ ≤ ξOPT(H). At this point h is ξ-optimal in
network H, since we maintain ∆ ≥ OPT(H) − |h|. The quality of the resulting
solution then follows using Theorem 4.
Algorithm RFP computes a ξ-optimal flow in Õ(m^2 + mn log log B) time when
ξ > 0 is inversely polynomial in m. However it may require more time to
compute optimal flows than the original Fat-Path algorithm. By using the re-
cursive scheme from Section 3.2, we can compute nearly optimal and optimal
flows faster than the original Fat-Path algorithm. In each recursive call, we
reround the network. We cancel flow-generating cycles in an already (partially)
rounded network. The benefit is roughly to decrease the average value of C from
O(nξ^(−1) log B) to O(n log B).
7 Practical Cycle-Canceling
References
1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms,
and Applications. Prentice Hall, Englewood Cliffs, New Jersey, 1993.
2. Edith Cohen and Nimrod Megiddo. New algorithms for generalized network flows.
Math Programming, 64:325–336, 1994.
3. F. Glover, J. Hultz, D. Klingman, and J. Stutz. Generalized networks: A funda-
mental computer based planning tool. Management Science, 24:1209–1220, 1978.
A New Max Flow Algorithm
Dorit S. Hochbaum
1 Introduction
The s-excess problem is defined on a directed graph with arc capacities and node weights, containing no distinguished
source and sink nodes. The objective of the s-excess problem is to find a subset
of the nodes that maximizes the sum of node weights, minus the weight of
the arcs separating the set from the remainder of the nodes. The new problem
is shown to be equivalent to the minimum cut problem that is traditionally
solved by deriving a maximum flow first. With the new algorithm these problems
can be solved without considering flows explicitly. The steps of the algorithm
can be interpreted as manipulating pseudoflow – a flow that does not satisfy
flow balance constraints. For this reason we choose to call the algorithm the
pseudoflow algorithm.
The main feature that distinguishes the pseudoflow algorithm from other
known algorithms for the maximum flow problem is that it does not seek to either
preserve or progress towards feasibility. Instead the algorithm creates “pockets”
of nodes so that at optimum there is no residual arc that can carry additional
flow between an “excess pocket” and a “deficit pocket”. The set of nodes in
all the “excess pockets” form the source set of a minimum cut and also the
maximum s-excess set.
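For concreteness, a brute-force sketch of the s-excess objective (our names; exponential time, intended only to make the definition explicit):

  from itertools import combinations

  # Maximize, over subsets S, the node weights in S minus the capacity of the
  # arcs leaving S; the empty set scores 0.
  def max_s_excess(weights, cap):
      nodes = list(weights)
      best_val, best_set = 0, set()
      for r in range(1, len(nodes) + 1):
          for S in map(set, combinations(nodes, r)):
              val = (sum(weights[v] for v in S)
                     - sum(c for (u, v), c in cap.items()
                           if u in S and v not in S))
              if val > best_val:
                  best_val, best_set = val, S
      return best_val, best_set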
The certificate maintained by our algorithm bears a resemblance to the ba-
sic arcs tree maintained by simplex. It is demonstrated that this certificate is
analogous to the concept of a strong basis introduced by Cunningham [C76]
if implemented in a certain “extended network” permitting violations of flow
balance constraints. It is further shown that the algorithmic steps taken by our
algorithm are substantially different from those of the simplex and lead to a
different outcome in the next iteration based on the same strong basis in the
given iteration.
The contributions in this paper include:
1.1 Notation
For P, Q ⊂ V, the set of arcs going from P to Q is denoted by (P, Q) = {(u, v) ∈ A | u ∈ P and v ∈ Q}. Let the capacity of arc (u, v) be denoted by cuv or c(u, v).
We elaborate here further on this problem and its relationship to the maxi-
mum flow and minimum cut problems.
The maximum flow problem is defined on a directed graph Gst = (V ∪ {s, t}, A ∪ A(s) ∪ A(t)) with distinguished source and sink nodes s and t, where A(s) and A(t) denote the sets of arcs adjacent to the source and the sink.
The standard formulation of the maximum flow problem, with zero lower bounds and xij variables indicating the amount of flow on arc (i, j), is

Max x_ts
subject to Σ_i xki − Σ_j xjk = 0 for all k ∈ V
           0 ≤ xij ≤ cij for all (i, j) ∈ A.
In this formulation the first set of (equality) constraints is called the flow bal-
ance constraints. The second set of (inequality) constraints is called the capacity
constraints.
We next prove that the s-excess problem is equivalent to the minimum cut
problem.
The latter term is the capacity of the cut ({s}∪S, S̄ ∪{t}). Hence maximizing the
s-excess capacity is equivalent to minimizing the capacity of the cut separating
s from t.
To see that the opposite is true, consider the network (V ∪ {s, t}, A) with
arc capacities. Assign to node v ∈ V a weight that is csv if the node is adjacent
to s and −cvt if the node is adjacent to t. Note that it is always possible to
remove paths of length 2 from s to t thus avoiding the presence of nodes that
are adjacent to both source and sink. This is done by subtracting from the arcs’
capacities csv , cvt the quantity min{csv , cvt }. The capacities then translate into
node weights that serve as input to the s-excess problem and satisfy the equalities
above. ⊓⊔
The reader may wonder about the arbitrary nature of the s-excess problem,
at least in the sense that it has not been previously addressed in the literature.
The explanation is that this problem is a relaxation of the maximum closure
problem where the objective is to find, in a node weighted graph, a closed set of
nodes of maximum total weight. In the maximum closure problem it is required
that all successors of each node in the closure set will belong to the set. In the
s-excess problem this requirement is replaced by a penalty assigned to arcs of
immediate successors that are not included in the set. In that sense the s-excess
problem is a relaxation of the maximum closure problem. The proof of Lemma
3 is effectively an extension of Picard’s proof [Pic76] demonstrating that the
maximum weight closed set in a graph (maximum closure) is the source set of a
minimum cut.
We provide a detailed account of the use of the pseudoflow and other algo-
rithms for the maximum closure problem in [Hoc96]. We also explain there how
these algorithms have been used in the mining industry and describe the link to
the algorithm of Lerchs and Grossmann, [LG64].
When the excess of a set D is negative, its absolute value is called the deficit, i.e. deficit(D) = −excess(D).
Given a rooted tree T, for a subtree Tv = T[v,p(v)] let M[v,p(v)] = Mv = excess(Tv) be called the mass of the node v or of the arc [v, p(v)]. That is, the mass is the amount of flow on (v, p(v)) directed towards the root. A flow directed in the opposite direction – from p(v) to v – is interpreted as negative excess or mass.
We define the extended network as follows: The network Gst is augmented
with a set of arcs – two additional arcs per node. Each node has one arc of
infinite capacity directed into it from the sink, and one arc of infinite capacity
directed from it to the source. This construction is shown in Figure 1. We refer
to the appended arcs from sink t as the deficit arcs and the appended arcs to the
source s as the excess arcs. The source and sink nodes are compressed into a ‘root’
node r. We refer to the extended network’s set of arcs as Aaug. These include, in addition to the deficit and excess arcs, also the arcs adjacent to source and sink – A(s) and A(t). The extended network is the graph Gaug = (V ∪ {r}, Aaug).

Fig. 1. The extended network.
Any pseudoflow on a graph has an equivalent feasible flow on the extended
network derived from the graph – a node with excess sends the excess back to
the source, and a node with deficit receives a flow that balances this deficit from
the sink.
Throughout our discussion of flows on extended networks, all the flows con-
sidered saturate the arcs adjacent to source and sink and thus the status of
these arcs, as saturated, remains invariant. We thus omit repeated reference to
the arcs A(s) and A(t).
4 A Normalized Tree
The algorithm maintains a construction that we call a normalized tree after the
use of this term by Lerchs and Grossmann in [LG64] for a construction that
inspired ours. Let node r 6∈ V serve as root and represent a contraction of s and
t. Let (V ∪ {r}, T ) be a tree where T ⊂ Ā. The children of r are called the roots
of their respective branches or subtrees. The deficit and excess arcs are only used
to connect r to the roots of the branches.
A normalized tree is a rooted tree in r that induces a forest in (V, A). We refer to each rooted tree in the forest as a branch. A branch Tri of the normalized tree, rooted at a child ri of r, is called strong if excess(Tri) > 0, and weak otherwise. All nodes of strong branches are considered strong, and all nodes of weak branches are considered weak.
A normalized tree is depicted in Figure 2.
Fig. 2. A normalized tree with strong and weak branches.
Property 5. In every branch all downwards residual capacities are strictly pos-
itive.
Property 6. The only nodes that do not satisfy flow balance constraints are the
roots of their respective branches that are adjacent to r in the extended network.
This equality is used to prove the superoptimality of the set of strong nodes (in
the superoptimality Property).
We choose to work with normalized trees that satisfy an optional property stronger than Property 5:
Property 8 (Unsaturated arcs property). The tree T has all upwards resid-
ual capacities strictly positive.
From the proof of the superoptimality property it follows that when all arcs
from S to S̄ are saturated, then no set of nodes other than S has a larger s-excess.
We thus obtain the optimality condition as a corollary to the superoptimality
property.
Corollary 11 (Optimality condition). Given a normalized tree, a pseud-
oflow f and the collection of strong nodes in the tree S. If f saturates all arcs
in (S, S̄) then S is a maximum s-excess set in the graph.
The next Corollary holds only if Property 8 is satisfied. It implies minimality
of the optimal solution set S.
Corollary 12 (Minimality). Any proper subset of the strong nodes is not a
maximum s-excess set in (V, A0 ).
On the other hand, it is possible to append to the set of strong nodes S any
collection of 0-deficit branches that have no residual arcs to weak nodes without
changing the value of the optimal solution. This leads to a maximal maximum
s-excess set.
Each strong branch forms an “excess pocket” with the total excess of the branch assigned to its root.
Within each branch the pseudoflow is feasible.
Each iteration of the algorithm consists of identifying an infeasibility in the
form of a residual arc from a strong node to a weak node. The arc is then added
in and the tree is updated. The update consists of pushing the entire excess
of the strong branch along a path from the root of the strong branch to the
merger arc (s0 , w) and progressing towards the root of the weak branch. The first
arc encountered that does not have sufficient residual capacity to accommodate
the pushed flow gets split and the subtree suspended from that arc becomes a
strong branch with excess equal to the amount of flow that could not be pushed
through the arc. The process continues along the path until the next bottleneck
arc is encountered.
Recall that G = (V, A), and A_f is the set of residual arcs with respect to a
given pseudoflow f.
begin
    Initialize: ∀(s, j) ∈ A(s), f(s, j) = c(s, j); ∀(j, t) ∈ A(t), f(j, t) = c(j, t);
        for all arcs (i, j) ∈ A, f(i, j) = 0.
    T = ∪_{j∈V} [r, j]; the branches of the tree are {j}, j ∈ V.
    Nodes with positive excess are strong, S, and the rest are weak, W.
    while (S, W) ∩ A_f ≠ ∅ do
        Select (s′, w) ∈ (S, W).
        Merge: T ← T \ [r, r_{s′}] ∪ (s′, w).
        Renormalize:
        Push δ = M_{[r, r_{s′}]} units of flow along the path [r_{s′}, . . . , s′, w, . . . , r_w, r]:
        begin
            Let [v_i, v_{i+1}] be the next edge on the path.
            If c_f(v_i, v_{i+1}) > δ, augment flow by δ: f(v_i, v_{i+1}) ← f(v_i, v_{i+1}) + δ.
            Else, split{(v_i, v_{i+1}), δ − c_f(v_i, v_{i+1})};
                set δ ← c_f(v_i, v_{i+1}); set f(v_i, v_{i+1}) ← c(v_i, v_{i+1}).
            i ← i + 1.
        end
    end
end

procedure split{(a, b), M}
    T ← T \ (a, b) ∪ (r, a); M_{(r,a)} = M.
    (The branch T_a is strong or 0-deficit, with excess M.)
    A_f ← A_f ∪ {(b, a)} \ {(a, b)}.
end
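To make the flow of control concrete, here is a minimal Python sketch of one merge/push/split iteration, under simplifying assumptions: branches are kept as flat parent-pointer trees, residual capacities in a dictionary, and the strong/weak bookkeeping and the dynamic-tree structures needed for the O(mn log n) bound are omitted. All names (State, iterate, and so on) are ours, not the paper's.

    class State:
        def __init__(self, nodes, cap):
            self.parent = {v: 'r' for v in nodes}  # every node starts as a root
            self.excess = {v: 0 for v in nodes}    # branch excess, keyed by root
            self.cap = dict(cap)                   # residual capacities c_f(u, v)

        def path_to_root(self, v):
            """Nodes from v up to, but excluding, the artificial root r."""
            path = [v]
            while self.parent[path[-1]] != 'r':
                path.append(self.parent[path[-1]])
            return path

    def iterate(st, s_prime, w):
        """Process a merger arc (s', w) from a strong node to a weak node."""
        up = st.path_to_root(s_prime)     # s', ..., r_{s'}
        down = st.path_to_root(w)         # w, ..., r_w
        delta = st.excess.pop(up[-1])     # push the entire excess of the branch
        # Merge: hang the strong branch below w by inverting the path to r_{s'}.
        for k in range(len(up) - 1):
            st.parent[up[k + 1]] = up[k]
        st.parent[s_prime] = w
        # Walk from r_{s'} through (s', w) towards r_w, splitting at bottlenecks.
        path = list(reversed(up)) + down
        for u, v in zip(path, path[1:]):
            res = st.cap.get((u, v), 0)
            if res > delta:                # enough residual capacity: push delta
                st.cap[(u, v)] = res - delta
                st.cap[(v, u)] = st.cap.get((v, u), 0) + delta
            else:                          # bottleneck: split off the subtree T_u
                st.parent[u] = 'r'
                st.excess[u] = delta - res # T_u keeps the excess it could not push
                st.cap[(v, u)] = st.cap.get((v, u), 0) + res
                st.cap[(u, v)] = 0         # arc (u, v) is now saturated
                delta = res
        # Whatever reaches the weak root reduces its deficit.
        st.excess[path[-1]] = st.excess.get(path[-1], 0) + delta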
The push step, in which we augment flow by δ if cf (vi , vi+1 ) > δ, can be
replaced by augmentation if cf (vi , vi+1 ) ≥ δ. The algorithm remains correct,
but the set of strong nodes is no longer minimal among maximum s-excess sets,
and will not satisfy Property 8. We prove in the expanded version of this paper
that the tree maintained is indeed normalized, which establishes the algorithm’s
correctness.
5.2 Initialization
We choose an initial normalized tree with each node as a separate branch for
which the node serves as root. The corresponding pseudoflow saturates all arcs
adjacent to source and to sink. Thus all nodes adjacent to source are strong
nodes, and all those adjacent to sink are weak nodes. All the remaining nodes
have zero inflow and outflow; they thus have 0 deficit and are set as weak. If a node
is adjacent to both source and sink, then the lower-capacity arc of the two
is removed and its capacity is subtracted from the other. Thus each node
is adjacent to the source only, to the sink only, or to neither.
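A sketch of this initialization, in the same hypothetical setting as the sketch above (cap_s[v] = c(s, v), cap_t[v] = c(v, t)):

    def initialize(nodes, cap_s, cap_t, cap):
        # If a node is adjacent to both source and sink, remove the
        # lower-capacity arc and subtract its value from the other.
        for v in set(cap_s) & set(cap_t):
            m = min(cap_s[v], cap_t[v])
            cap_s[v] -= m
            cap_t[v] -= m
        st = State(nodes, cap)           # every node is its own branch
        for v, c in cap_s.items():
            if c > 0:
                st.excess[v] = c         # saturating (s, v): v is strong
        for v, c in cap_t.items():
            if c > 0:
                st.excess[v] = -c        # saturating (v, t): v carries a deficit
        return st                        # nodes with excess > 0 form S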
5.3 Termination
The algorithm terminates when there is no residual arc between any strong and
weak nodes.
From Corollary 12 we conclude that at termination the set of strong nodes
is a minimal source set of a minimum cut. In other words, any proper subset
of the set of strong nodes cannot be a source set of a minimum cut. It will be
necessary to identify additional optimal solutions, and in particular a minimum
cut with maximal source set for the parametric analysis.
To that end, we identify the 0-deficit branches among the weak branches. The
set of strong branches can be appended with any collection of 0-deficit branches
without residual arcs to weak nodes for an alternative optimal solution. The
collection of all such 0-deficit branches with the strong nodes forms the source set
of a minimum cut that is maximal. To see that, consider an analogue of Corollary
12 that demonstrates that no proper subset of a weak branch (of negative deficit)
can be in a source set of a minimum cut.
Given any tree in the extended network, and a specification of pseudoflow values
on the out of tree arcs, it is possible to determine in linear time O(n) whether
the tree is normalized. If the tree is not normalized, then the process is used to
derive a normalized tree which consists of a subset of the arcs of the given tree.
Given a normalized tree it is possible to derive in O(n) time the values of the
pseudoflow on the tree arcs, and in time O(mn) to derive an associated feasible
flow. At termination, that feasible flow is a maximum flow.
6 Pseudoflow-Based Simplex
source, thus the merger arc completes a cycle. Alternatively we shrink source
and sink into a single node r as before.
Nodes that are on the source side of the tree are referred to as strong, and
those that are on the sink side, as weak, with the notation of S and W respec-
tively. Let an entering arc with positive residual capacity be (s′, w). The cycle
created is [r, r_{s′}, . . . , s′, w, . . . , r_w, r].
The largest amount of the flow that can be augmented along the cycle is
the bottleneck residual capacity along the cycle. The first arc attaining this
bottleneck capacity is the leaving arc.
In the simplex the amount of flow pushed is determined by the bottleneck
capacity. In the pseudoflow algorithm the entire excess is pushed even though it
may be blocked by one or more arcs that have insufficient residual capacity.
The use of the lowest label selection rule in the pseudoflow-based simplex
algorithm for the choice of an entering arc leads to precisely the same complexity
as that of our pseudoflow algorithm, O(mn log n).
average work per iteration (which depends on the length of the path from the
root to the merger node). Other details on the similarities and differences be-
tween simplex and the pseudoflow algorithm are provided in the full version of
the paper.
References
C76. W. H. Cunningham. A network simplex method. Mathematical Program-
ming, 11:105–116, 1976.
GGT89. G. Gallo, M. D. Grigoriadis and R. E. Tarjan. A fast parametric maximum
flow algorithm and applications. SIAM Journal of Computing, 18(1):30–55,
1989.
GT86. A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow
problem. J. Assoc. Comput. Mach., 35:921–940, 1988.
GC96. D. Goldfarb and W. Chen. On strongly polynomial dual algorithms for the
maximum flow problem. Special issue of Mathematical Programming B, 1996.
To appear.
GH90. D. Goldfarb and J. Hao. A primal simplex method that solves the Maxi-
mum flow problem in at most nm pivots and O(n2 m) time. Mathematical
Programming, 47:353–365, 1990.
Hoc96. D. S. Hochbaum. A new – old algorithm for minimum cut on closure graphs.
Manuscript, June 1996.
LG64. H. Lerchs, I. F. Grossmann. Optimum design of open-pit mines. Transac-
tions, C.I.M., LXVIII:17–24, 1965.
Pic76. J. C. Picard. Maximal closure of a graph and applications to combinatorial
problems. Management Science, 22:1268–1272, 1976.
An Implementation of a Combinatorial
Approximation Algorithm for Minimum-Cost
Multicommodity Flow
Research partially supported by an NSF Graduate Research Fellowship, ARO Grant
DAAH04-95-1-0121, and NSF Grants CCR-9304971 and CCR-9307045.
Research supported by ARO Grant DAAH04-95-1-0121, NSF Grants CCR-9304971
and CCR-9307045, and a Terman Fellowship.
Research partly supported by NSF Award CCR-9308701 and NSF Career Award
CCR-9624828. Some of this work was done while this author was visiting Stanford
University.
1 Introduction
The minimum-cost multicommodity flow problem involves simultaneously ship-
ping multiple commodities through a single network so the total flow obeys the
arc capacity constraints and has minimum cost. The problem occurs in many
contexts where different items share the same resource, e.g., communication net-
works, transportation, and scheduling problems [AMO93,HL96,HO96].
Traditional methods for solving minimum-cost and no-cost multicommodity
flow problems are linear-programming based [AMO93,Ass78,CN96,KH80]. Using
the ellipsoid [Kha80] or the interior-point [Kar84] methods, linear-programming
problems can be solved in polynomial time. Theoretically, the fastest algorithms
for solving the minimum-cost multicommodity flow problem exactly use the
problem structure to speed up the interior-point method [KP95a,KV86,Vai89].
In practice, solutions to within, say, 1% often suffice. More precisely, we say
that a flow is ε-optimal if it overflows the capacities by at most a factor of 1 + ε and
has cost that is within 1 + ε of the optimum. Algorithms for computing approxi-
mate solutions to the multicommodity flow problem were developed in [LMP+ 95]
(no-cost case) and [GK96,KP95b,PST95] (minimum-cost case). Theoretically,
these algorithms are much faster than interior-point method based algorithms
for constant ε. The algorithm in [LMP+ 95] was implemented [LSS93] and it was
shown that it often outperforms the more traditional approaches. Prior to
our work, it was not known whether the combinatorial approximation algorithms
for the minimum-cost case could be implemented to run quickly.
In this paper we describe MCMCF, our implementation of a combinatorial ap-
proximation algorithm for the minimum-cost multicommodity flow problem. A
direct implementation of [KP95b] yielded a correct but practically slow imple-
mentation. Much experimentation helped us select among the different theoret-
ical insights of [KP95b,LMP+ 95,LSS93,PST95,Rad97] to achieve good practical
performance.
We compare our implementation with CPLEX [CPL95] and PPRN [CN96]. (Sev-
eral other efficient minimum-cost multicommodity flow implementations, e.g.,
[ARVK89], are proprietary so we were unable to use these programs in our study.)
Both are based on the simplex method [Dan63] and both find exact solutions.
CPLEX is a state-of-the-art commercial linear programming package, and PPRN
uses a primal partitioning technique to take advantage of the multicommodity
flow problem structure.
Our results indicate that the theoretical advantages of approximation algo-
rithms over linear-programming-based algorithms can be translated into prac-
tice. On the examples we studied, MCMCF was several orders of magnitude faster
than CPLEX and PPRN. For example, for 1% accuracy, it was up to three orders
of magnitude faster. Our implementation’s dependence on the number of com-
modities and the network size is also smaller, and hence we are able to solve
larger problems.
We would like to compare MCMCF’s running times with modified CPLEX and
PPRN programs that yield approximate solutions, but it is not clear how to make
the modifications. Even if we could make the modifications, we would probably
need to use CPLEX’s primal simplex to obtain a feasible flow before an exact
solution is found. Since its primal simplex is an order of magnitude slower than
its dual simplex for the problem instances we tested, the approximate code would
probably not be any faster than computing an exact solution using dual simplex.
To find an ε-optimal multicommodity flow, MCMCF repeatedly chooses a com-
modity and then computes a single-commodity minimum-cost flow in an auxil-
iary graph. This graph’s arc costs are exponential functions of the current flow.
The base of the exponent depends on a parameter α, which our implementation
chooses. A fraction σ of the commodity’s flow is then rerouted to the correspond-
ing minimum-cost flow. Each rerouting decreases a certain potential function.
The algorithm iterates this process until it finds an ε-optimal flow.
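Schematically, the main loop looks as follows. This is our sketch only: min_cost_flow, choose_sigma, and done are injected placeholders for the routines discussed below, not part of any published interface, and the auxiliary arc costs are merely indicated.

    import math

    def reroute_loop(flows, costs, caps, alpha, min_cost_flow, choose_sigma, done):
        # flows[i] maps each arc to commodity i's current flow on it.
        k = len(flows)
        i = 0
        while not done(flows):
            # Congestion lambda_a and dual weights y_a = e^{alpha (lambda_a - 1)}.
            lam = {a: sum(flows[j][a] for j in range(k)) / caps[a] for a in caps}
            y = {a: math.exp(alpha * (lam[a] - 1.0)) for a in caps}
            # Arc costs related to the gradient of the potential (schematic).
            aux = {a: y[a] / caps[a] + costs[a] for a in caps}
            f_star = min_cost_flow(i, aux)        # single-commodity subproblem
            sigma = choose_sigma(flows[i], f_star)
            flows[i] = {a: (1.0 - sigma) * flows[i][a] + sigma * f_star[a]
                        for a in caps}            # convex-combination rerouting
            i = (i + 1) % k                       # round-robin commodity choice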
As we have mentioned above, a direct implementation of [KP95b], while
theoretically fast, is very slow in practice. Several issues are crucial for achieving
an efficient implementation:
Exponential Costs: The value of the parameter α, which defines the base of
the exponent, must be chosen carefully: Using a value that is too small will
not guarantee any progress, and using a value that is too large will lead
to very slow progress. Our adaptive scheme for choosing α leads to signifi-
cantly better performance than using the theoretical value. Importantly, this
heuristic does not invalidate the worst-case performance guarantees proved
for algorithms using fixed α.
Stopping Condition: Theoretically, the algorithm yields an ε-optimal flow
when the potential function becomes sufficiently small [KP95b]. Alterna-
tive algorithms, e.g., [PST95], explicitly compute lower bounds. Although
these stopping conditions lead to the same asymptotic running time, the
latter one leads to much better performance in our experiments.
Step Size: Theory specifies the rerouting fraction σ as a fixed function of α.
    Computing the σ that maximizes the reduction of the exponential potential
    function experimentally decreases the running time. We show that it is possible
    to use the Newton-Raphson method [Rap90] to quickly find a near-optimal value
    of σ for every rerouting (a sketch follows this list). Additionally, a commodity's
    flow usually differs from its minimum-cost flow on only a few arcs. We use this
    fact to speed up these computations.
Minimum-Cost Flow Subroutine: Minimum-cost flow computations domi-
nate the algorithm’s running time both in theory and in practice. The arc
costs and capacities do not change much between consecutive minimum-cost
flow computations for a particular commodity. Furthermore, the problem
size is moderate by minimum-cost flow standards. This led us to decide to
use the primal network simplex method. We use the current flow and a ba-
sis from a previous minimum-cost flow to “warm-start” each minimum-cost
flow computation. Excepting the warm-start idea, our primal simplex code
is similar to that of Grigoriadis [Gri86].
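Here is what the Newton-Raphson step-size computation might look like, restricted to the arcs on which the two flows differ; it minimizes the exponential potential φ(σ) = Σ_a e^{α(λ_a(σ)−1)} over σ, with `others` holding the summed flow of the other commodities per arc. This is our sketch of the idea, not the authors' code.

    import math

    def newton_sigma(f, f_star, others, caps, alpha, iters=20):
        # Only arcs where the two flows differ contribute to the derivatives.
        diff = {a: f_star[a] - f[a] for a in f if f_star[a] != f[a]}
        sigma = 0.5
        for _ in range(iters):
            d1 = d2 = 0.0
            for a, d in diff.items():
                lam = (others[a] + f[a] + sigma * d) / caps[a]
                w = math.exp(alpha * (lam - 1.0))
                g = alpha * d / caps[a]     # alpha * d(lambda_a)/d(sigma)
                d1 += w * g                 # phi'(sigma)
                d2 += w * g * g             # phi''(sigma) > 0: phi is convex
            if d2 == 0.0:
                break                       # flows agree on every arc
            sigma = min(1.0, max(0.0, sigma - d1 / d2))
        return sigma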
In the rest of this paper, we first introduce the theoretical ideas behind the
implementation. After discussing the various choices in translating the theoreti-
cal ideas into practical performance, we present experimental data showing that
MCMCF substantially outperforms the linear-programming-based codes.
2 Theoretical Background
2.1 Definitions
and the step size σ. It then computes the dual variables y_r = e^{α(λ_r − 1)}, where
r ranges over the arcs A and the arc cost function c, and a potential function
φ(f) = Σ_r y_r. The algorithm chooses a commodity i to reroute in round-robin
order, as in [Rad97]. It computes, for that commodity, a minimum-cost flow f*_i
in a graph with arc costs related to the gradient ∇φ(f) of the potential func-
tion and arc capacities λ_A u. The commodity's flow f_i is changed to the convex
combination (1 − σ)f_i + σf*_i. An appropriate choice of values for α and σ leads
to Õ(ε^{−3} nmk) running time (suppressing logarithmic terms). Grigoriadis and
Khachiyan [GK96] decreased the dependence on ε to ε^{−2}.
Since these minimum-cost algorithms compute a multiflow having arc cost at
most a budget bound B, we use binary search on B to determine an ε-optimal
cost. The arc cost of the initial flow gives the initial lower bound because the
flow is the union of minimum-cost single-commodity flows with respect to the arc
cost function c and arc capacities u. Lower bound computations (see Sect. 3.1)
increase the lower bound, and the algorithm decreases the congestion and cost
until an ε-optimal flow is found.
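One way to organize the outer search is sketched below, under the assumption of a routine run_to_budget(B) that runs the scheme with budget B and reports success together with an updated lower bound; both names, and the bracketing scheme itself, are ours.

    def search_budget(initial_lower, initial_upper, run_to_budget, eps):
        lower, upper = initial_lower, initial_upper
        while upper > (1.0 + eps) * lower:
            B = (lower + upper) / 2.0
            ok, lower = run_to_budget(B)   # may also raise the lower bound
            if ok:
                upper = B                  # an eps-optimal flow fits budget B
            else:
                lower = max(lower, B)      # no flow of cost <= B exists
        return upper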
Theoretically, a small potential function value and a sufficiently large value of the
constant α indicate that the flow is ε-optimal [KP95b], but this pessimistic indicator
leads to poor performance. Instead, we periodically compute the lower bound
on the optimal congestion λ* as found in [LMP+ 95,PST95]. Since the problem
For commodity i, C_i(λ_A) represents the cost of the current flow f_i with respect
to arc capacities λ_A u and the cost function yᵗA, where A is the km × m arc
adjacency matrix together with the arc cost function c. C*_i(λ_A) is the minimum-
cost flow. For all choices of dual variables and λ_A ≥ 1,

    λ* ≥ Σ_i C*_i(1) / Σ_r y_r ≥ Σ_i C*_i(λ_A) / Σ_r y_r .

Thus, this ratio serves as a lower bound on the optimal
congestion λ*.
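As code, the bound is a direct transcription of the inequality above (C_star maps each commodity to its minimum-cost-flow value C*_i(λ_A), and y maps each arc to its dual weight; the names are ours):

    def congestion_lower_bound(C_star, y):
        # lambda* >= sum_i C*_i(lambda_A) / sum_r y_r
        return sum(C_star.values()) / sum(y.values())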
3.3 Choosing α
The algorithm's performance depends on the value of α. The larger its value,
the more running time the algorithm requires. Unfortunately, α must be large
enough to produce an ε-optimal flow. Thus, we developed heuristics for slowly
increasing its value. There are two different theoretical explanations for α that
can be used to develop two different heuristics.
Karger and Plotkin [KP95b] choose α so that, when the potential function is
less than a constant factor of its minimum, the flow is -optimal. The heuristic
of starting with a small α and increasing it when the potential function's value
became too small failed, in our experiments, to decrease the running time
significantly.

Table 1. Computing an (almost) optimal step size reduces the running time and
number of minimum-cost flow (MCF) computations by two orders of magnitude.
Plotkin, Shmoys, and Tardos [PST95] use the weak duality inequalities (1)
upon which we base a different heuristic. The product of the gaps bounds the
distance between the potential function and the optimal flow. The algorithm’s
improvement is proportional to the size of the right gap, and increasing α de-
creases the left gap’s size. Choosing α too large, however, can impede progress
because progress is proportional to the step size σ which itself depends on how
closely the potential function’s linearization approximates its value. Thus, larger
α reduces the step size.
Our heuristic attempts to balance the left and right gaps. More precisely, it
chooses α dynamically to ensure the ratio of inequalities
    [ (λ Σ_r y_r / Σ_{comm. i} C_i(λ_A)) − 1 ] / [ (Σ_{comm. i} C_i(λ_A) / Σ_{comm. i} C*_i(λ_A)) − 1 ]    (2)
remains balanced. We increase α by factor β if the ratio is larger than 0.5 and
otherwise decrease it by γ. After limited experimentation, we decided to use the
golden ratio for both β and γ. The α values are frequently much lower than
those from [PST95]. Using this heuristic rather than using the theoretical value
of ln(3m)/ε [KP95b] usually decreases the running time by a factor of between
two and six. See Table 2.
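A sketch of the balancing rule (our rendering, with placeholder names):

    GOLDEN = (1.0 + 5.0 ** 0.5) / 2.0

    def adjust_alpha(alpha, lam, y, C, C_star):
        # Ratio (2): left gap over right gap of the weak-duality inequalities.
        sum_y = sum(y.values())
        sum_c = sum(C.values())          # sum over commodities of C_i(lambda_A)
        sum_cs = sum(C_star.values())    # sum over commodities of C*_i(lambda_A)
        left = lam * sum_y / sum_c - 1.0
        right = sum_c / sum_cs - 1.0
        ratio = left / right if right > 0.0 else float('inf')
        # Increase alpha by beta when the ratio exceeds 0.5, else decrease by
        # gamma; after limited experimentation, beta = gamma = the golden ratio.
        return alpha * GOLDEN if ratio > 0.5 else alpha / GOLDEN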
Several strategies for selecting the next commodity to reroute, proposed for con-
current flows, also apply to the minimum-cost multicommodity flow problem.
These strategies include weighted [KPST94] and uniform [Gol92] randomiza-
tion, and round-robin [Rad97] selection. Our experiments, which also included
adaptive strategies, suggest that round robin works best.
Theoretically, MCMCF can use any minimum-cost flow subroutine. In practice, the
repeated evaluation of single-commodity problems with similar arc costs and
capacities favor an implementation that can take advantage of restarting from a
previous solution. We show that using a primal network simplex implementation
allows restarting and thereby reduces the running time by one-third to one-half.
To solve a single-commodity problem, the primal simplex algorithm repeat-
edly pivots arcs into and out of a spanning tree until the tree has minimum cost.
Each pivot maintains the flow’s feasibility and can decrease its cost. The sim-
plex algorithm can start with any feasible flow and any spanning tree. Since the
cost and capacity functions do not vary much between MCF calls for the same
commodity, we can speed up the computation, using the previously-computed
spanning tree. Using the previously found minimum-cost flow instead would require
O(km) additional storage; moreover, that flow is more frequently unusable than
the current flow because it may be infeasible with respect to the capacity constraints. In
contrast, using the current flow requires no additional storage, this flow is known
to be feasible, and starting from this flow experimentally requires a very small
number of pivots.
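The warm start itself can be as simple as caching one spanning tree per commodity (a sketch; network_simplex stands in for an implementation along the lines of [Gri86], and the interface is our assumption):

    class WarmStartMCF:
        def __init__(self, network_simplex):
            self.solve = network_simplex
            self.trees = {}                        # commodity -> last basis tree

        def min_cost_flow(self, i, current_flow, arc_cost):
            tree = self.trees.get(i)               # None forces a cold start
            f_star, tree = self.solve(current_flow, arc_cost, start_tree=tree)
            self.trees[i] = tree                   # O(n) storage per commodity
            return f_star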
Fairly quickly, the number of pivots per MCF iteration becomes very small.
See Fig. 1. For the 2121-arc rmfgen-d-7-10-040 instance, the average numbers of
pivots are 27, 13, and 7 for the three commodities shown. Fewer than two percent
of the arcs served as pivots. For the 260-arc gte problem, the average numbers are
8, 3, and 1, i.e., at most three percent of the arcs.
Instead of using a commodity’s current flow and its previous spanning tree,
a minimum-cost flow computation could start from an arbitrary spanning tree
and flow. On the problem instances we tried, restarting reduces the running time
by a factor of about 1.3 to 2. See Table 3. Because optimal flows are not unique,
the numbers of MCF computations differ, but the difference is usually less than
five percent.
[Fig. 1: number of simplex pivots per minimum-cost flow computation for the rmfgen-d-7-10-040 and gte instances.]
4 Experimental Results
4.1 Dependence on the Approximation Factor ε
The approximation algorithm MCMCF yields an ε-optimal flow. Plotkin, Shmoys,
and Tardos [PST95] solve the minimum-cost multicommodity flow problem using
shortest paths as a basic subroutine. Karger and Plotkin [KP95b] decreased
the running time by m/n using minimum-cost flow subroutines and adding a
linear-cost term to the gradient to ensure each flow's arc cost is bounded. This
change increases the ε-dependence of [PST95] by 1/ε to ε^{−3}. Grigoriadis and
Khachiyan [GK96] improved the [KP95b] technique, reducing the ε-dependence
back to ε^{−2}. MCMCF implements the linear-cost term, but experimentation showed
the minimum-cost flows' arc costs were bounded even without using the linear-
cost term. Furthermore, running times usually decrease when omitting the term.
Table 3. Restarting the minimum-cost flow computations using the current flow
and the previous spanning tree reduces the running time by at least 35%.

problem instance                ε      restarting    no restarting    ratio
                                       time (s)      time (s)
rmfgen-d-7-10-020               0.01       88            180          2.06
rmfgen-d-7-10-240               0.01      437            825          1.89
rmfgen-d-7-12-240               0.01      613            993          1.62
rmfgen-d-7-14-240               0.01      837           1396          1.67
rmfgen-d-7-16-240               0.01     1207           2014          1.67
multigrid-032-032-128-0080      0.01       21             37          1.77
multigrid-064-064-128-0160      0.01      275            801          2.92
[Figures: running times and numbers of MCF computations plotted against the inverse 1/ε of the desired accuracy (log scale).]
The combinatorial algorithm MCMCF solves our problem instances two to three
orders of magnitude faster than the linear-programming-based implementations
CPLEX and PPRN. Furthermore, its running time depends mostly on the network
structure and much less on the arc costs’ magnitude.
We solved several different rmfgen networks (see Fig. 4) with various num-
bers of commodities and two different arc cost schemes. Even for instances having
as few as fifty commodities, MCMCF required less running time. Furthermore, its
dependence on the number k of commodities was much smaller. For the left
side of Fig. 4, the arc costs were randomly chosen from the range [1, 100]. For
these problems, CPLEX’s running time is roughly quadratic in k, while MCMCF’s
is roughly linear. Although for problems with few commodities, CPLEX is some-
what faster, for larger problems MCMCF is faster by an order of magnitude. PPRN
is about five times slower than CPLEX for these problems. Changing the cost of
interframe arcs significantly changes CPLEX’s running time. (See the right side
of Fig. 4.) Both MCMCF’s and PPRN’s running times decrease slightly. The running
times’ dependences on k do not change appreciably.
[Fig. 4: running time (min, log scale) of MCMCF, CPLEX, and PPRN versus the number k of commodity groups (log scale), for rmfgen instances with nonzero-cost (left) and zero-cost (right) interframe arcs; the curves are labeled MCMCF-12/-14, CPLEX-12/-14, and PPRN-12/-14.]
[Figure: running times of PPRN and MCMCF.]
6 Concluding Remarks
For the problem classes we studied, MCMCF solved minimum-cost multicommodity
flow problems significantly faster than state-of-the-art linear-programming-based
programs. This is strong evidence that the approximate problem is simpler, and that
combinatorial-based methods, appropriately implemented, should be considered
for this problem. We believe many of these techniques can be extended to other
problems solved using the fractional packing and covering framework of [PST95].
We conclude with two unanswered questions. Since our implementation never
needs to use the linear-cost term [KP95b], it is interesting to determine whether the
term is indeed unnecessary. Also, it is interesting to try to prove the experimental
O(ε^{−1.5}) dependence of Sect. 4.1.
References
AMO93. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory,
Algorithms, and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.
ARVK89. I. Adler, M. G. C. Resende, G. Veiga, and N. Karmarkar. An implemen-
tation of Karmarkar’s algorithm for linear programming. Mathematical
Programming A, 44(3):297–335, 1989.
Ass78. A. A. Assad. Multicommodity network flows – A survey. Networks, 8(1):37–
91, Spring 1978.
Bad91. T. Badics. genrmf. 1991. ftp://dimacs.rutgers.edu/pub/netflow/
generators/network/genrmf/
CN96. J. Castro and N. Nabona. An implementation of linear and nonlinear
multicommodity network flows. European Journal of Operational Research,
92(1):37–53, 1996.
CPL95. CPLEX Optimization, Inc., Incline Village, NV. Using the CPLEX
Callable Library, 4.0 edition, 1995.
Dan63. G. B. Dantzig. Linear Programming and Extensions. Princeton University
Press, Princeton, NJ, 1963.
GK96. M. D. Grigoriadis and L. G. Khachiyan. Approximate minimum-cost
multicommodity flows in Õ(ε^{−2}knm) time. Mathematical Programming,
75(3):477–482, 1996.
Gol92. A. V. Goldberg. A natural randomization strategy for multicommodity
flow and related algorithms. Information Processing Letters, 42(5):249–
256, 1992.
GOPS97. A. Goldberg, J. D. Oldham, S. Plotkin, and C. Stein. An implementation of
a combinatorial approximation algorithm for minimum-cost multicommod-
ity flow. Technical Report CS-TR-97-1600, Stanford University, December
1997.
Gri86. M. D. Grigoriadis. An efficient implementation of the network simplex
method. Mathematical Programming Study, 26:83–111, 1986.
HL96. R. W. Hall and D. Lotspeich. Optimized lane assignment on an automated
highway. Transportation Research—C, 4C(4):211–229, 1996.
HO96. A. Haghani and S.-C. Oh. Formulation and solution of a multi-commodity,
multi-modal network flow model for disaster relief operations. Transporta-
tion Research—A, 30A(3):231–250, 1996.
Kar84. N. Karmarkar. A new polynomial-time algorithm for linear programming.
Combinatorica, 4(4):373–395, 1984.
KH80. J. L. Kennington and R. V. Helgason. Algorithms for Network Program-
ming. John Wiley & Sons, New York, 1980.
1 Introduction
Since the early 1970s, the algorithms and optimization community has put
a great deal of effort into identifying the computational complexity of various combi-
natorial optimization problems. Nowadays it is common knowledge that, when
dealing with an N P-hard optimization problem, one should not expect to find
a polynomial-time solution algorithm. This insight motivates the search for ap-
proximation algorithms that output provably good solutions in polynomial time.
It also immediately raises the question of how well we can approximate a specific
optimization problem in polynomial time.
We say that a polynomial-time approximation algorithm for some optimiza-
tion problem has a performance guarantee or worst-case ratio ρ, if it outputs a
feasible solution with cost at most ρ times the optimum value for all instances of
the problem; such an algorithm is also called a polynomial-time ρ-approximation
algorithm. Now, given an N P-hard optimization problem, for which values of
ρ does there exist and for which values of ρ does there not exist a polynomial-
time ρ-approximation algorithm? In this paper we focus on ‘negative’ results
for scheduling problems in this area, i.e., we demonstrate for several scheduling
?
Supported by the START program Y43–MAT of the Austrian Ministry of Science.
– Lenstra & Rinnooy Kan [9] prove that P | prec, pj = 1 | Cmax (makespan min-
imization on parallel machines with precedence constraints and unit process-
ing times) does not have a polynomial-time approximation algorithm with
performance guarantee strictly better than 4/3.
– Lenstra, Shmoys & Tardos [11] show that R | | Cmax (makespan minimiza-
tion on unrelated machines) does not have a polynomial-time approximation
algorithm with performance guarantee strictly better than 3/2.
– Hoogeveen, Lenstra & Veltman [5] prove that P | prec, c = 1, pj = 1 | Cmax
and P ∞ | prec, c = 1, pj = 1 | Cmax (two variants of makespan minimization
on parallel machines with unit processing times and unit communication
delays) cannot be approximated with worst-case ratios better than 5/4 and
7/6, respectively.
– Williamson et al. [16] prove that O | | Cmax and F | | Cmax (makespan mini-
mization in open shops and flow shops, respectively) cannot be approximated
in polynomial time with performance guarantees better than 5/4.
All the listed non-approximability results have been derived via the so-called gap
technique, i.e., via N P-hardness reductions that create gaps in the cost function
of the constructed instances. More precisely, such a reduction transforms the
YES-instances of some N P-hard problem into scheduling instances with ob-
jective value at most c∗ , and it transforms the NO-instances into scheduling
instances with objective value at least g · c∗ , where g > 1 is some fixed real
number. Then a polynomial-time approximation algorithm for the scheduling
problem with performance guarantee strictly better than g (i.e., with guaran-
tee g − ε where ε > 0) would be able to separate the YES-instances from the
NO-instances, thus yielding a polynomial-time solution algorithm for an N P-hard
problem; hence such an approximation algorithm cannot exist unless P = N P.
2 Preliminaries on Non-approximability
This section gives some information on approximation algorithms, and it summa-
rizes some basic facts on Max SNP-hardness. For a more extensive explanation
we refer the reader to Papadimitriou & Yannakakis [14]. For a compendium of
publications on non-approximability results we refer to Crescenzi & Kann [2].
A polynomial-time approximation scheme for an optimization problem, PTAS
for short, is a family of polynomial-time (1 + ε)-approximation algorithms for
all ε > 0. Essentially, a polynomial-time approximation scheme is the strongest
possible approximability result for an N P-hard problem. For N P-hard prob-
lems an important question is whether such a scheme exists. The main tool for
disproving the existence of a PTAS is the L-reduction of Papadimitriou &
Yannakakis [14].

Definition 1. Let A and B be two optimization problems. An L-reduction from
A to B is a pair of polynomial-time computable functions R and S such that
(1) for every instance I of A, R(I) is an instance of B with Opt(R(I)) ≤ α · Opt(I),
for some positive constant α;
(2) for every feasible solution s of R(I), S(s) is a feasible solution of I with
|Opt(I) − c(S(s))| ≤ β · |Opt(R(I)) − c(s)|
for some positive constant β, where c(S(s)) and c(s) represent the costs of
S(s) and s, respectively. □
Papadimitriou & Yannakakis [14] prove that L-reductions in fact are approx-
imation preserving reductions. If there is an L-reduction from the optimiza-
tion problem A to problem B with parameters α and β, and if there exists
a polynomial-time approximation algorithm for B with performance guarantee
1 + ε, then there exists a polynomial-time approximation algorithm for A with
performance guarantee 1 + αβε. Consequently, if there exists a PTAS for B, then
there also exists a PTAS for A.
Papadimitriou & Yannakakis [14] also define a class of optimization problems
called Max SNP, and they prove that every problem in this class is approx-
imable in polynomial time within some constant factor. The class Max SNP is
closed under L-reductions, and the hardest problems in this class (with respect
to L-reductions) are the Max SNP-complete ones. A problem that is at least as
hard (with respect to L-reductions) as a Max SNP-complete problem is called
Max SNP-hard. For none of these Max SNP-hard problems has a PTAS been
constructed. Moreover, if there exists a PTAS for one Max SNP-hard prob-
lem, then all Max SNP-hard problems have a PTAS. In fact:

Proposition 2 (Arora, Lund, Motwani, Sudan, and Szegedy [1]). If
there exists a PTAS for some Max SNP-hard problem, then P = N P. □
Proposition 2 provides a strong tool for proving the non-approximability of an
optimization problem X. Just provide an L-reduction from a Max SNP-hard
problem to X. Then unless P = N P, problem X cannot have a PTAS.
job. The q dummy machines are split into two groups: q − k of them process an
A-job and a B-job, and k of them process a single dummy job. An illustration for
schedules of this structure is given in Figure 1. All jobs are scheduled as early as
possible. The cost Zk of a schedule σ with the special structure described above
[Figure 1: schedules of the special structure. k triple machines process jobs a, b, c; q − k triple machines process d, c; s − q triple machines process d; q − k dummy machines process a, b; and k dummy machines process a single dummy job d.]
Lemma 4. Let σ be any feasible schedule for R(I) with k good machines. Then
the objective value of σ is at least equal to Zk , and there exists a feasible sched-
ule σ 0 satisfying the structure described above with the same number k of good
machines.
k_1 + k_2 ≥ q − ℓ_C − k.    (3)

The number of jobs that have release date zero equals the number of machines.
Hence, the processing of at least k_1 of the A-jobs and dummy jobs does not start
at time zero, and we conclude that ℓ_A + ℓ^D_1 + ℓ^D_2 ≥ k_1. Analogously, at least k_2
of the B-jobs and dummy jobs are not processed during the time interval [1, 2],
and therefore ℓ_B + ℓ^D_2 ≥ k_2 holds. Summarizing, one gets that the additional
cost caused by jobs in σ that do not start at their release date is at least

    ℓ_A + ℓ_B + ℓ_C + ℓ^D_1 + 2ℓ^D_2 = (ℓ_A + ℓ^D_1 + ℓ^D_2) + (ℓ_B + ℓ^D_2) + ℓ_C
        ≥ k_1 + k_2 + ℓ_C ≥ q − ℓ_C − k + ℓ_C = q − k.
Lemma 6. For any feasible schedule σ of R(I), the feasible solution S(σ) of
instance I fulfills the inequality |Opt(I)−c(S(σ))| ≤ |Opt(R(I))−c(σ)|. Hence,
the polynomial-time transformation S fulfills condition (2) of Definition 1.
Proof. The statement of Lemma 4 implies that if there exists a schedule of cost
7q + 3s − k′ for R(I), then there exists a solution T′ for I with cardinality |T′| = k′.
Let k = Opt(I). Then |Opt(I) − c(S(σ))| = k − k′ and |Opt(R(I)) − c(σ)| =
|7q + 3s − k − (7q + 3s − k′)| = k − k′. □
– For each variable x_j, there are two assignment machines: the first one pro-
cesses the operations B(x_j1) and B(x̄_j1), whereas the second one processes
the operations B(x_j2) and B(x̄_j2).
– For each variable x_j, there are two consistency machines: the first one pro-
cesses the operations M(x_j1) and M(x̄_j2), whereas the second one processes
the operations M(x_j2) and M(x̄_j1).
– For each clause a ∨ b, there is a clause machine that processes E(a) and E(b).
– For each literal x ∈ {x_i2, x̄_i2} that does not occur in the formula, there
is a garbage machine that processes the operation E(x) and the dummy
operations D_1(x), D_2(x), and D_3(x).
The processing of every job first goes through the assignment machines, then
through the consistency machines, then through the clause machines, and finally
through the garbage machines. Since most operations have length 0, the precise
processing order within every machine class is not essential; we only note that
for every variable job, processing on the first assignment (first consistency) ma-
chine always precedes processing on the second assignment (second consistency)
machine. Similarly as in Section 3, we are mainly interested in schedules for
R(I) with a special combinatorial structure. In a so-called consistent schedule,
for every variable x either both operations B(x_i1) and B(x_i2) are processed in
the interval [0, 1], or both operations B(x̄_i1) and B(x̄_i2) are processed during [0, 1].

[Figure 2: a consistent schedule. Assignment machines: B(x_i1), B(x̄_i1) and B(x_i2), B(x̄_i2); consistency machines: M(x_i1), M(x̄_i2) and M(x_i2), M(x̄_i1); clause machines: E(x), E(y) and E(u), E(v).]
[0, 1]. Moreover, in a consistent schedule the machines process the operations of
length 1 during the following intervals. The assignment machines are only pro-
cessing length 1 operations during [0, 2], and the consistency machines are only
processing such operations during [1, 3]. On every clause machine, the operations
of length 1 are either processed during [2, 3] and [3, 4] or during [3, 4] and [4, 5].
On the garbage machines, all four operations of length 1 are processed during
[0, 4]; there are no restrictions on the processing order on the garbage machines.
Figure 2 gives an illustration of a consistent schedule.
Lemma 11. Let σ be any feasible schedule for R(I). Then there exists a feasible
consistent schedule σ 0 whose objective value is at most the objective value of σ.
Proof. Let σ be an arbitrary feasible schedule for R(I). Throughout the proof,
we only deal with the placement of length 1 operations (as placing the length 0
operations is straightforward); a machine is busy if it is processing a length 1
operation. Our first goal is to transform σ into a schedule in which the assign-
ment machines are only busy during [0, 2] and the consistency machines are only
busy during [1, 3]. We start by shifting all operations on the assignment and
consistency machines as far to the left as possible without violating feasibility.
Clearly, in the resulting schedule all operations on the assignment machines are
processed during [0, 2].
Now suppose that on some consistency machine some operation, say opera-
tion M (xi1 ), only completes at time four. Since all operations were shifted to the
left, this yields that both operations B(xi1 ) and B(xi2 ) are scheduled during the
time interval [1, 2]. Moreover, operations B(xi1 ) and B(xi2 ) are both processed
Proof. There are s clause machines and at most 2q garbage machines. Since
the total completion time of two jobs on the same clause machine is at most
4 + 5 = 9, the total completion time of the variable jobs that have non-zero
processing requirement on some clause machine is at most 9s. Moreover, the total
completion time of the remaining jobs (i.e., the jobs with non-zero processing
time on some garbage machine) is at most (1 + 2 + 3 + 4) · 2q ≤ 20s. Hence
Opt(R(I)) ≤ 29s, and Lemma 10 completes the proof. □
Lemma 13. For any feasible schedule σ of R(I), the feasible solution S(σ) of in-
stance I fulfills the inequality |Opt(I) − c(S(σ))| ≤ ½ |Opt(R(I)) − c(σ)|. Hence,
the polynomial-time transformation S fulfills condition (2) of Definition 1.

Proof. As has been argued before, the length 1 operations on a clause machine
have completion times 3 and 4 if the corresponding clause is satisfied, and they
have completion times 4 and 5 if the clause is not satisfied. In other words, every
unsatisfied clause induces an extra cost of 2 in the objective function. The claim
follows. □
Lemmas 12 and 13 state that the transformations R and S satisfy the condi-
tions in Definition 1. Since both transformations are computable in polynomial
time, the problem F | | Σ C_j is Max SNP-hard. This completes the proof of
the flow shop part and of the job shop part of Theorem 8.
The essential difference between the flow shop problem and the open shop prob-
lem is that the order in which the operations belonging to the same job are
processed is no longer given; therefore, we must look for a different way to en-
force that the beginning operation indeed precedes the middle operation, which
in turn must precede the ending operation. To this end, we introduce a number
of additional jobs, which are used to fill the interval [0, 1] on the consistency ma-
chines and the interval [0, 2] on the clause machines; these additional jobs can
be forced to go there, because our objective is to minimize the total completion
time, which favors small jobs. We further need some more jobs, which are used
to remove unnecessary idle time. This can be worked out as follows.
Similarly to the flow shop, we start from an instance I = (U, C) of Max-2Sat-B.
We introduce the same set of variable jobs, dummy jobs, assignment machines,
consistency machines, clause machines, and garbage machines. Additionally, the
instance R(I) contains 26q + 6s so-called structure jobs. Every structure job con-
sists of m − 1 operations of length 0 and of a single operation of non-zero length;
this operation is called the structure operation corresponding to the structure
job.
It can be shown that for any feasible schedule for R(I), there exists a reasonable
consistent schedule with non-larger objective value. With this, one can define a
truth setting S(σ) as in Subsection 4.1. Again, the constructed transformations
are polynomial-time computable and fulfill the conditions in Definition 1. Hence,
problem O | | Σ C_j is Max SNP-hard. This completes the proof of Theorem 8.
5 Conclusions
In this paper, we have derived a number of non-approximability results for
scheduling problems with total job completion time objective. The approxima-
bility status of most scheduling problems with this objective function or its
weighted counterpart remains amazingly unclear: until today, all that we know
is that some of these problems can be solved in polynomial time by straightfor-
ward algorithms (like the problems 1 | | Σ w_j C_j and P | | Σ C_j) and that some
of these problems do not have a PTAS (like the problems investigated in this
paper). However, there is not a single strongly N P-hard scheduling problem
with minsum objective for which a PTAS has been constructed. We state the
following conjectures.
Conjecture 14. The problems 1 | r_j | Σ C_j (scheduling a single machine with
job release dates) and P | | Σ w_j C_j (scheduling parallel identical machines with
the objective of minimizing the total weighted job completion time) both do have
a PTAS.

Conjecture 15. Neither of the problems P | r_j | Σ C_j (scheduling parallel iden-
tical machines with job release dates) and 1 | prec | Σ C_j (scheduling a single
machine with precedence constraints) has a PTAS.
References
1. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and
hardness of approximation problems. Proceedings of the 33rd IEEE Symposium on
the Foundations of Computer Science, pages 14–23, 1992.
2. P. Crescenzi and V. Kann. A compendium of NP-optimization problems. 1997.
http://www.nada.kth.se/nada/theory/problemlist/html.
3. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the
Theory of N P-Completeness. Freeman, San Francisco, 1979.
4. R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. Opti-
mization and approximation in deterministic sequencing and scheduling: A survey.
Annals of Discrete Mathematics, 5:287–326, 1979.
5. J. A. Hoogeveen, J. K. Lenstra, and B. Veltman. Three, four, five, six, or the
complexity of scheduling with communication delays. Operations Research Letters,
16:129–137, 1994.
6. V. Kann. Maximum bounded 3-dimensional matching is MAX SNP-complete. In-
formation Processing Letters, 37:27–35, 1991.
7. H. Kellerer, T. Tautenhahn, and G. J. Woeginger. Approximability and nonapprox-
imability results for minimizing total flow time on a single machine. Proceedings
of the 28th Annual ACM Symposium on the Theory of Computing, pages 418–426,
1996.
1 Introduction
Scheduling problems involving precedence constraints are among the most dif-
ficult problems in the area of machine scheduling, in particular for the design
of good approximation algorithms. Our understanding of the structure of these
problems and our ability to generate near-optimal solutions remain limited. The
following examples illustrate this point. (i) The first approximation algorithm
for P|prec|Cmax by Graham [14] is not only more than thirty years old, but it is
also still essentially the best one available for this problem. On the other hand,
it is only known that no polynomial-time algorithm can have a better approx-
imation ratio than 4/3 unless P = NP [23]. (ii) The computational complexity
of the problem Pm|pj = 1, prec|Cmax , open problem 8 from the original list of
Garey and Johnson [11] is still open. (iii) The situation is also unsatisfactory with
machines running at different speeds, for which no constant-factor approximation
algorithms are known. For the makespan objective, Chudak and Shmoys [10] only
recently improved to O(log m) an almost twenty-year-old approximation ratio of
O(√m) due to Jaffe [20]. They obtained the same approximation ratio for the
total weighted completion time objective. (iv) Progress is also quite recent for
the latter objective on a single machine or identical parallel machines. Until re-
cently, no constant-factor approximation algorithms were known. Lately, a better
understanding of linear programming relaxations and their use to guide solution
strategies led to a 2– and a 2.719–approximation algorithm for 1|prec| Σ w_j C_j
and 1|r_j, prec| Σ w_j C_j, respectively [16,34,35], and to a 5.328–approximation al-
gorithm for P|r_j, prec| Σ w_j C_j [7]. Few deep negative results are known for these
problems (see [18] for the total completion time objective).
In this paper, we consider (a generalization of) the scheduling problem
P|r_j, prec| Σ w_j C_j and answer a question of Hall et al. [16, Page 530]:
“Unfortunately, we do not know how to prove a good performance guar-
antee for this model by using a simple list-scheduling variant.”
Indeed their algorithm, as well as its improvement by Chakrabarti et al. [7],
is rather elaborate and its performance ratio does not match the quality of the
lower bound it uses. We show that using the same LP relaxation in a different
way (reading the list order from the LP midpoints instead of LP completion
times) yields a simple 4–approximation algorithm for P|r_j, prec| Σ w_j C_j. We
actually obtain this result in the more general framework of precedence delays.
We consider a general class of precedence-constrained scheduling problems
on identical parallel machines. We have a set N of n jobs and m identical par-
allel machines. Each job j has a nonnegative processing requirement (size) pj
and must be processed for that amount of time on any one of the machines. A
job must be processed in an uninterrupted fashion, and a machine can process
only one job at a time. We are interested in constrained scheduling problems
in which each job j may have a release date rj before which it cannot be pro-
cessed, and there may be a partial order A on the jobs. We associate with each
precedence-constrained job pair (i, j) ∈ A a nonnegative precedence delay dij ,
with the following meaning: in every feasible schedule, job j cannot start until
dij time units after job i is completed. Special cases include ordinary precedence
constraints (dij = 0); and release dates rj ≥ 0 (which may be modeled by adding
a dummy job 0 with zero processing time and precedence delays d0j = rj for
all other jobs). Delivery times (or lags), which must elapse between the end of a
job’s processing and its actual completion time, may also be modeled by adding
one or several dummy jobs and the corresponding precedence delays.
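For instance, the reduction of release dates to precedence delays is immediate; the sketch below is a direct transcription of the text, with names of our choosing.

    def add_release_dates(p, delays, r):
        # Add a dummy job 0 with p_0 = 0 and a delay d_{0j} = r_j for every
        # job j with a release date: j cannot start until r_j after job 0 ends.
        p = dict(p)
        delays = dict(delays)
        p[0] = 0
        for j, rj in r.items():
            if rj > 0:
                delays[(0, j)] = rj
        return p, delays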
We denote the completion time of a job j in a schedule S as CjS and will
drop the superscript S when it is clear to which schedule we refer. We con-
sider the usual objectives of minimizing the makespan C_max = max_j C_j and,
for given nonnegative weights w_j ≥ 0, a weighted sum Σ_j w_j C_j of completion
times. In an extension of the common notation introduced in [15], we may de-
note these problems as P|prec. delays d_ij|C_max and P|prec. delays d_ij| Σ w_j C_j,
respectively. These problems are NP-hard (see, e.g., Lawler et al. [22]), and we
discuss here the quality of relaxations and approximation algorithms. An α–
approximation algorithm is a polynomial-time algorithm that delivers a solution
with objective value at most α times the optimal value. Sometimes α is called
the (worst-case) performance guarantee or ratio of the algorithm.
Precedence delays were considered for resource-constrained project schedul-
ing under the name “finish-to-start lags”, e.g., by Bartusch, Möhring, and Rader-
macher [4] and Herroelen and Demeulemeester [17], for one-machine scheduling
by Wikum, Llewellyn, and Nemhauser [37] under the name “generalized prece-
dence constraints”, and by Balas, Lenstra, and Vazacopoulos [3] under that of
“delayed precedence constraints”; the latter authors use the Lmax minimization
problem as a key relaxation in a modified version of the shifting bottleneck
procedure for the classic job-shop scheduling problem. Most of the theoretical
studies concerning this class of precedence constraints consider the one ma-
chine problem 1|prec. delays dij = k, pj = 1|Cmax which corresponds to a basic
pipeline scheduling problem (see [21] for a survey). Leung, Vornberger, and Wit-
thoff [24] showed that this problem is NP-complete. Several other authors (e.g.,
[6,5,28]) obtained polynomial-time algorithms for particular instances by utiliz-
ing well-known algorithms for special cases of the classical m–machine problem.
In the context of approximation algorithms, the main result is that Graham’s list
scheduling algorithm [14] was extended to P|prec. delays dij = k, pj = 1|Cmax
to give a worst-case performance ratio of 2 − 1/(m(k + 1)) [21,28]. We extend
this result, in Section 3, to nonidentical precedence delays and processing times.
List scheduling algorithms, first analyzed by Graham [14] are among the
simplest and most commonly used approximate solution methods for parallel
machine scheduling problems. These algorithms use priority rules, or job rank-
ings, which are often derived from solutions to relaxed versions of the problems.
For example, several algorithms of Hall, Schulz, Shmoys, and Wein [16] use the
job completion times obtained from linear programming relaxations. In Section 2
we show that a modified, job-driven version of list scheduling based on job com-
pletion times can, in the presence of precedence constraints, lead to solutions
that are about as bad as m times the optimum, for both the C_max and Σ_j w_j C_j
objectives; this behavior may also occur when using actual optimum completion
times.
Graham's original list scheduling, however, works well for minimizing the
makespan, as we show in Section 3 by extending it to the case of precedence de-
lays. For minimizing a weighted sum Σ_j w_j C_j of completion times, we present in
Section 4 a new algorithm with approximation ratio bounded by 4 for the general
problem with precedence delays. This algorithm is based on an LP relaxation
of this problem, which is a straightforward extension of earlier LP relaxations
proposed by Hall et al. [16]. The decision variables in this relaxation are the com-
pletion times Cj of every job j, so we choose to ignore the machine assignments
in these relaxations. There are two sets of linear constraints, one representing
the precedence delays (and, through the use of a dummy job, the release dates)
in a straightforward fashion; the other set of constraints is a relatively simple
way of enforcing the total capacity of the m machines. Although the machine
assignments are ignored and the machine capacities are modeled in a simplistic
way, this is sufficient to obtain the best relaxation and approximation bounds
known so far for these problems and several special cases thereof. We show that
using midpoints derived from the LP relaxation leads to a performance ratio
bounded by 4 for the general problem described above. Recall that in a given
schedule the midpoint of a job is the earliest point in time at which half of its
processing has been performed; that is, if the schedule is (or may be considered
as) nonpreemptive then the midpoint of job j is simply C^R_j − p_j/2, where C^R_j is
its completion time in the relaxation R. The advantage of using midpoints in the
analysis of approximation algorithms was first observed by Goemans [12] and
has since then been used by several authors (e.g., [35,36,13]). Our result seems
to be the first, however, where midpoints are really needed within the algorithm
itself. We also show how the analysis yields tighter bounds for some special cases,
and then conclude with some additional remarks in Section 5. We believe that
the approach of applying a list-scheduling rule in which the jobs are ordered
based on their midpoints in an LP solution will have further consequences for
the design of approximation algorithms.
In summary, the main contributions of this paper are as follows.
Due to space limitations some details are omitted from this paper. They can
be found in the complete version, see [27].
In his seminal paper, Graham (1966) showed that a simple list-scheduling rule
is a (2 − 1/m)–approximation algorithm for P|prec|C_max. In this algorithm, the
jobs are ordered in some list, and whenever one of the m machines becomes
idle, the next available job on the list is started on that machine, where a job
is available if all of its predecessors have completed processing. By their non-
idling property, Graham’s List Scheduling Algorithms (GLSAs) are well suited
when machine utilization is an important consideration. Indeed, it is shown in
Section 3 that, for the makespan minimization problem P|prec. delays dij |Cmax ,
any GLSA (i.e., no matter which list of jobs is used) produces a schedule with
objective function value within a factor 2 of the optimum. In this case, a job is
available if all its predecessors are completed and the corresponding precedence
delays have elapsed.
In contrast, the elementary Example 1 shows that the non-idling property
may lead to an arbitrarily poor performance ratio for a weighted sum of com-
pletion times objective Σ_{j∈N} w_j C_j.
The ordinary precedence constraints (j, k) (with djk = 0) are defined as follows:
(i) j = 1 + h(m + 1) and k = j + g, for all h = 0, . . . , m − 1 and all g = 1, . . . , m;
and (ii) for all j < n and k = n. The processing times are p_j = 1 + h(m + 1) for
j = 1 + h(m + 1) and h = 0, . . . , m − 1, and p_j = ε otherwise. The objective is
either to minimize the makespan, or a weighted sum Σ_j w_j C_j of job completion
times with weights w_j = 0 for all j < n and w_n = 1; note that, due to the
precedence constraints (ii) above, these two objectives coincide for any feasible
schedule.
An optimal solution has, for h = 0, . . . , m − 1, job j = 1 + h(m + 1) starting
at time S*_j = 0 on machine h + 1, immediately followed by jobs j + 1, . . . , j + m
assigned as uniformly as possible to machines 1, . . . , h + 1. Job n is then processed
last on machine m, so that the optimal objective value is C*_max = C*_n = 1 + (m² + 1)ε.
A corresponding list is L = (1, 2, . . . , n). Any version of the list scheduling
algorithm described above produces the following schedule from this list: job 1
is scheduled with start time S^L_1 = 0 and completion time C^L_1 = 1; the m jobs k =
2, . . . , m + 1 are then scheduled, each with S^L_k = 1 and C^L_k = 1 + ε, on a different
machine; this will force all subsequent jobs to be scheduled no earlier than time
1 + ε. As a result, for h = 0, . . . , m − 1, job j = 1 + h(m + 1) is scheduled with start
time S^L_j = h + ε(½(h − 1)h(m + 1) + h), followed by jobs k = j + 1, . . . , j + (m + 1)
each with S^L_k = h + 1 + ε(½h(h + 1)(m + 1) + h) on a different machine. Finally,
job n is scheduled last with S^L_n = m + ε(½(m − 1)m(m + 1) + m), and thus the
objective value is C^L_max = C^L_n = m + O(ε), or arbitrarily close to m times the
optimal value C*_max for ε > 0 small enough. □
The example shows that list scheduling according to completion times can
lead to poor schedules on identical parallel machines with job precedence con-
straints. In contrast, we will present in Section 4 a linear programming relax-
ation of the general problem with precedence delays and show that job-based
list scheduling according to job midpoints leads to a bounded performance ratio.
Constraints (1) are the precedence delay constraints. Constraints (2) are a rela-
tively weak way of expressing the requirement that each machine can process at
most one job at a time; for the single-machine case (m = 1) they were introduced
by Wolsey [38] and Queyranne [30], and studied by Queyranne and Wang [31],
von Arnim, Schrader, and Wang [1], von Arnim and Schulz [2], and Schulz [33]
in the presence of ordinary precedence constraints; they were extended to m ≥ 2
parallel machines by Hall et al. [16]. Note that these constraints, for F = {j},
imply Cj ≥ 0; these and, as indicated above, the use of a dummy job 0 allow
the formulation of release date constraints.
For a weighted sum of completion times objective, the LP formulation is
simply:

    minimize Σ_{j∈N} w_j C_j subject to (1)–(2).    (3)
Let C^LP denote any feasible solution to the constraint set (1)–(2) of this LP;
we will call C^LP_j the LP completion time of job j. We now use this LP solution
to define a feasible schedule with completion time vector C^H and analyze the
job-by-job relationship between C^H_j and C^LP_j for every job j ∈ N.
We define the LP start time S^LP_j and LP midpoint M^LP_j as S^LP_j = C^LP_j − p_j
and M^LP_j = C^LP_j − p_j/2, respectively. We now use the List Scheduling Algo-
rithm of Section 2 with the LP midpoint list L defined by sorting the jobs in
nondecreasing order of their midpoints M^LP_j.
Theorem 2. Let $C^{LP}$ denote any feasible solution to the constraint set (1)--(2) and let $M^{LP}$ and $S^{LP}$ denote the associated LP midpoints and start times, respectively. Let $S^H$ be the vector of start times of the feasible schedule constructed by the List Scheduling Algorithm using the LP midpoint list. Then
\[ S^H_j \le 2 S^{LP}_j + 2 M^{LP}_j \quad \text{for all } j \in N. \tag{4} \]
Proof. Assume for simplicity that the jobs are indexed in the order of their LP midpoints, that is, $M^{LP}_1 \le M^{LP}_2 \le \cdots \le M^{LP}_n$. We fix job $j \in N$ and consider the schedule constructed by the List Scheduling heuristic using the LP midpoint list $L = (1, 2, \dots, n)$ up to and including the scheduling of job $j$, that is, up to the completion of Step 3 with $k = j$. Let $[j] = \{1, \dots, j\}$.
Let $\mu$ denote the total amount of time between 0 and the start time $S^H_j$ of job $j$ during which all $m$ machines are busy at this stage of the algorithm. Since only jobs in $[j-1]$ have been scheduled so far, we have $\mu \le \frac{1}{m}\sum_{i=1}^{j-1} p_i$. Let $\lambda = S^H_j - \mu$. To prove (4), it thus suffices to show that
\[ \text{(i)}\;\; \frac{1}{m}\sum_{i=1}^{j-1} p_i \le 2 M^{LP}_j \qquad\text{and}\qquad \text{(ii)}\;\; \lambda \le 2 S^{LP}_j . \]
that is, $A[j]$ is the set of precedence pairs in $[j]$ for which the precedence delay constraints (1) are tight for $C^H$. If $b_q > 0$ then a machine becomes busy at date $b_q$ (or starts processing job $j$ if $b_q = S^H_j$) and thus there exists a job $x(q) \in [j]$ with start time $S^H_{x(q)} = b_q$. We repeat the following process for decreasing values of the interval index $h$, starting with $h = q$, until we reach the date 0 or the busy interval $[0, a_1]$. Let $(v(1), \dots, v(s))$ denote a maximal path in $G[j]$ with last node (job) $v(s) = x(h)$. Note that we must have $b_g < S^H_{v(1)} \le a_{g+1}$ for some busy interval $[b_g, a_{g+1}]$ with $a_{g+1} < b_h$, for otherwise some machine is idle immediately before the start time $S^H_{v(1)}$ of job $v(1)$ and this job, not being constrained by any tight precedence delay constraint, should have started earlier than that date. This implies in particular that $s \ge 2$. We have
\[ b_h - a_{g+1} \le S^H_{v(s)} - S^H_{v(1)} = \sum_{i=1}^{s-1}\bigl(S^H_{v(i+1)} - S^H_{v(i)}\bigr) = \sum_{i=1}^{s-1}\bigl(p_{v(i)} + d_{v(i)v(i+1)}\bigr). \tag{5} \]
The LP midpoints along the path satisfy
\[ M^{LP}_{v(i+1)} \ge M^{LP}_{v(i)} + \tfrac{1}{2}\, p_{v(i)} + d_{v(i)v(i+1)} + \tfrac{1}{2}\, p_{v(i+1)} \]
for all $i = 1, \dots, s-1$. Therefore
\[ M^{LP}_{x(h)} - M^{LP}_{v(1)} \ge \tfrac{1}{2}\, p_{x(h)} + \tfrac{1}{2}\sum_{i=1}^{s-1}\bigl(p_{v(i)} + d_{v(i)v(i+1)}\bigr) \]
and thus
\[ S^{LP}_{x(h)} - M^{LP}_{v(1)} \ge \tfrac{1}{2}\,(b_h - a_{g+1}) . \]
If $b_g > 0$, then let $x(g)$ denote a job with start time $S^H_{x(g)}$ satisfying $b_g \le S^H_{x(g)} \le a_{g+1}$ and with minimum value of $M^{LP}_{x(g)}$ under this condition. Therefore $M^{LP}_{x(g)} \le M^{LP}_{v(1)}$. We also have $(k, x(g)) \in A[j]$ for some $k \in [j]$ with $S^H_k < b_g \le S^H_{x(g)}$, for otherwise job $x(g)$ should have started processing on some idle machine before date $b_g$. This implies
\[ S^{LP}_{x(h)} - M^{LP}_{x(g)} \ge \tfrac{1}{2}\,(b_h - a_{g+1}) \tag{6} \]
and we may repeat the above process with $h = g$ and job $x(h) = x(g)$. Since $g < h$ at each step, this whole process must terminate, generating a decreasing sequence of indices $q = h(1) > \dots > h(q') = 0$ such that every idle interval is contained in some interval $[a_{h(i+1)+1}, b_{h(i)}]$. Adding the inequalities (6) and using $S^{LP}_j \le M^{LP}_j$ for all $j = h(i)$, we obtain
\[ \lambda \le \sum_{i=1}^{q'}\bigl(b_{h(i)} - a_{h(i+1)+1}\bigr) \le 2\bigl(S^{LP}_{x(h(1))} - M^{LP}_{x(h(q'))}\bigr) \le 2\bigl(S^{LP}_j - 0\bigr). \tag{7} \]
There exist instances showing that the factors 2 in inequality (4) are (asymp-
totically) best possible for any number of machines, see [27].
Using for $C^{LP}$ an optimal LP solution, Theorem 2 implies performance ratios of 1/4 and 4 for the LP relaxation and the heuristic solution, respectively, for the $\sum w_j C_j$ objective.
Examples show that the latter bound is (asymptotically) tight for an arbi-
trary number of machines. We suspect that the first inequality in (8), bounding
the performance ratio of the LP relaxation, is not tight. The worst instances we
know have a performance ratio of 1/3 for the LP relaxation, see [27] for details.
The analysis in Theorem 2 may be refined for some special cases, yielding tighter performance ratios. For the problem $P|prec|\sum w_j C_j$, observe that the list scheduling algorithm will not allow all machines to be simultaneously idle at any date before the start time of any job $i \in N$. Therefore, in the proof of Theorem 2, all the idle intervals, with total length $\lambda$, contain some processing of some job(s) $i < j$; as a result, the total work during the busy intervals is at most $\sum_{i=1}^{j-1} p_i - \lambda$. Hence, we obtain the following result.
5 Concluding Remarks
The appropriate introduction of idle time is an important ingredient in the design of approximation algorithms to minimize the weighted sum of completion times subject to precedence delays. As Example 1 illustrates, idle time is needed to prevent profitable jobs that become available soon from being delayed by other, less important jobs. On the other hand, too much idle time is undesirable as well. The necessity to balance these two effects contributes to the difficulty of this problem. Interestingly, all previous approximation algorithms for $P|r_j, prec|\sum w_j C_j$ with constant-factor performance ratios are based on variants of Graham's original list scheduling, which actually tries to avoid machine idle time. In fact, Hall et al. [16] partition jobs into groups that are individually scheduled according to a GLSA, and then these schedules are concatenated to obtain a solution of
the original problem. To find a good partition, this scheme was enriched with
randomness by Chakrabarti et al. [7]. Chekuri et al. [8] presented a different
variant of a GLSA by artificially introducing idle time whenever it seems that
a further delay of the next available job in the list (if it is not the first) can be
afforded. Hence, the algorithm analyzed in Section 4 seems the first within this
context that does not take the machine-based point of view of GLSAs.
However, an advantage of the rather simple scheme of Chekuri et al. is its small running time (though the performance ratios obtained are considerably worse). In fact, so far we have not even argued that the algorithms presented above are indeed polynomial-time algorithms. Whereas this is obvious for the GLSA variant for makespan minimization, we have to argue that in the case of the total weighted completion time the linear programming relaxation (3) behaves well. Indeed, it can be solved in polynomial time since the corresponding separation problem is polynomially solvable [33].
It seems worth noting that Theorem 2 actually implies stronger results than stated in Corollaries 1 and 3. The performance ratios given there not only hold for the weighted sum of completion times, but even for the weighted sum of third-points and midpoints, respectively. Recall that, in a given schedule and for a given value $0 < \alpha \le 1$, the $\alpha$-point of job $j$ is the earliest time at which a fraction $\alpha$ of its processing has been performed; thus, if the schedule is (or may be considered as) nonpreemptive, then the $\alpha$-point of job $j$ is simply $C_j - (1-\alpha)p_j$. Speaking of $\alpha$-points, it is interesting to note that other $\alpha$-points could be used in the proof of Theorem 2, provided that $\tfrac12 \le \alpha < 1$, but the midpoint ($\alpha = 1/2$) leads to the best bound.
Finally, let us relate precedence delays to another kind of restriction that has received quite some attention recently. Assume that the precedence constraints $(i, j)$ are caused by some technological requirements. Then it seems reasonable that the information or object produced during the processing of job $i$ needs to be transferred to job $j$ before the processing of $j$ can start. Hence, precedence delays might be interpreted as a kind of communication delay. That term is used with a slightly different meaning, however (see, e.g., [32,19,9,25,26]). In fact, in this context one usually assumes that a delay occurs only if $i$ and $j$ are assigned to different machines. Our results can be extended to the model with precedence and communication delays. For details, we again refer the reader to the full version of this paper [27].
References
1. A. von Arnim, R. Schrader, and Y. Wang. The permutahedron of N-sparse posets.
Mathematical Programming, 75:1–18, 1996.
2. A. von Arnim and A. S. Schulz. Facets of the generalized permutahedron of a
poset. Discrete Applied Mathematics, 72:179–192, 1997.
3. E. Balas, J. K. Lenstra, and A. Vazacopoulos. The one machine problem with
delayed precedence constraints and its use in job-shop scheduling. Management
Science, 41:94–109, 1995.
4. M. Bartusch, R. H. Möhring, and F. J. Radermacher. Scheduling project net-
works with resource constraints and time windows. Annals of Operations Research,
16:201–240, 1988.
5. D. Bernstein and I. Gertner. Scheduling expressions on a pipelined processor with
a maximum delay of one cycle. ACM Transactions on Programming Languages and Systems, 11:57–66, 1989.
6. J. Bruno, J. W. Jones, and K. So. Deterministic scheduling with pipelined proces-
sors. IEEE Transactions on Computers, C-29:308–316, 1980.
7. S. Chakrabarti, C. A. Phillips, A. S. Schulz, D. B. Shmoys, C. Stein, and J. Wein.
Improved scheduling algorithms for minsum criteria. In F. Meyer auf der Heide
and B. Monien, editors, Automata, Languages and Programming, LNCS, Vol. 1099,
pages 646–657. Springer, Berlin, 1996.
8. C. S. Chekuri, R. Motwani, B. Natarajan, and C. Stein. Approximation techniques
for average completion time scheduling. In Proceedings of the 8th ACM–SIAM
Symposium on Discrete Algorithms, pages 609–618, 1997.
9. P. Chrétienne and C. Picouleau. Scheduling with communication delays: A survey.
In P. Chrétienne, E. G. Coffman Jr., J. K. Lenstra, and Z. Liu, editors, Scheduling
Theory and its Applications, chapter 4, pages 65–90. John Wiley & Sons, 1995.
10. F. A. Chudak and D. B. Shmoys. Approximation algorithms for precedence–
constrained scheduling problems on parallel machines that run at different speeds.
In Proceedings of the 8th ACM–SIAM Symposium on Discrete Algorithms, pages
581–590, 1997.
11. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the
Theory of NP–Completeness. Freeman, San Francisco, 1979.
12. M. X. Goemans. Improved approximation algorithms for scheduling with release
dates. In Proceedings of the 8th ACM–SIAM Symposium on Discrete Algorithms,
pages 591–598, 1997.
13. M. X. Goemans, M. Queyranne, A. S. Schulz, M. Skutella, and Y. Wang. Single
machine scheduling with release dates. Working paper, 1997.
14. R. L. Graham. Bounds for certain multiprocessing anomalies. Bell System Tech.
J., 45:1563–1581, 1966.
15. R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. Opti-
mization and approximation in deterministic sequencing and scheduling: A survey.
Annals of Discrete Mathematics, 5:287–326, 1979.
16. L. A. Hall, A. S. Schulz, D. B. Shmoys, and J. Wein. Scheduling to minimize average
completion time: Off–line and on–line approximation algorithms. Mathematics of
Operations Research, 22:513–544, 1997.
17. W. Herroelen and E. Demeulemeester. Recent advances in branch-and-bound pro-
cedures for resource-constrained project scheduling problems. In P. Chrétienne,
E. G. Coffman Jr. , J. K. Lenstra, and Z. Liu, editors, Scheduling Theory and its
Applications, chapter 12, pages 259–276. John Wiley & Sons, 1995.
⋆ Supported primarily by an IBM Cooperative Fellowship. Remaining support was provided by an ARO MURI Grant DAAH04-96-1-0007 and NSF Award CCR-9357849, with matching funds from IBM, Schlumberger Foundation, Shell Foundation, and Xerox Corporation.
1 Introduction
Liu and Liu [9] analyzed the performance of Graham’s list scheduling algo-
rithm for the case of different speeds and showed that the approximation guar-
antee depends on the ratio of the largest to the smallest speed. This ratio could
be arbitrarily large even for a small number of machines. The first algorithm to
have a bound independent of the speeds was given by Jaffe [7]. He showed that
list scheduling restricted to the set of machines with speeds that are within a factor of $1/\sqrt{m}$ of the fastest machine speed results in an $O(\sqrt{m})$ bound. More
recently, Chudak and Shmoys [1] improved the ratio considerably and gave an
algorithm which has a guarantee of O(log m). At a more basic level their algo-
rithm has a guarantee of O(K) where K is the number of distinct speeds. The
above mentioned algorithm relies on solving a linear programming relaxation
and uses the information obtained from the solution to allocate jobs to proces-
sors. We present a new algorithm which finds an allocation without solving a
linear program. The ratio guaranteed by our algorithm is also $O(\log m)$, but the algorithm is advantageous for the following reasons. It runs in $O(n^3)$ time and is combinatorial, and hence is more efficient than the algorithm in [1]. Further, the
analysis of our algorithm relies on a new lower bound which is very natural,
and might be useful in other contexts. In addition we show that our algorithm
achieves a constant factor approximation when the precedence constraints are
induced by a collection of chains. We remark here that our work was inspired
by, and builds upon the ideas in [1].
2 Preliminaries
We summarize below the basic ideas in the work of Chudak and Shmoys [1].
Their main result is an algorithm which gives a ratio of O(K) for the problem
of Q|prec|Cmax where K is the number of distinct speeds. They also show how
to reduce the general case with arbitrary speeds to one in which there are only
O(log m) distinct speeds as follows.
– Ignore all machines with speed less than 1/m times the speed of the fastest
machine.
– Round down all speeds to the nearest power of 2.
They observe that the above transformation can be done while losing only a con-
stant factor in the approximation ratio. Using this observation, we will restrict
ourselves to the case where we have K distinct speeds.
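A minimal sketch of this two-step reduction (illustrative Python, not from [1]; function and variable names are our own):

    import math

    def reduce_speeds(speeds):
        """Drop machines slower than 1/m times the fastest speed, then round
        the remaining speeds down to powers of 2; this leaves only O(log m)
        distinct speed values while losing only a constant factor."""
        m = len(speeds)
        fastest = max(speeds)
        kept = [s for s in speeds if s >= fastest / m]
        return sorted((2 ** math.floor(math.log2(s)) for s in kept), reverse=True)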
When all machines have the same speed (K = 1), Graham [4] showed that
list scheduling gives a 2 approximation. His analysis shows that in any schedule
produced by list scheduling, we can identify a chain of jobs j1 ≺ j2 . . . ≺ jr such
that a machine is idle only when one of the jobs in the above chain is being
processed. The time spent processing the above chain is a lower bound on the
optimal makespan. In addition, the measure of the time instants during which
all machines are busy is also a lower bound by arguments about the average
load. These two bounds provide an upper bound of 2 on the approximation ratio
of list scheduling. One can apply a similar analysis for the multiple speed case.
As observed in [1], the difficulty is that the time spent in processing the chain identified from the list scheduling analysis is not a lower bound. The only claim
that can be made is that the processing time of any chain on the fastest machine
is a lower bound. However the jobs in the chain guaranteed by the list scheduling
analysis do not necessarily run on the fastest machine. Based on this observation,
the algorithm in [1] tries to find an assignment of jobs to speeds (machines) that
ensures that the processing time of any chain is bounded by some factor of the
optimal.
We will follow the notation of [1] for the sake of continuity and convenience. Recall that we have $K$ distinct speeds. Let $m_k$ be the number of machines with speed $s_k$, $k = 1, \dots, K$, where $s_1 > \dots > s_K$. Let $M_{uv}$ denote the sum $\sum_{k=u}^{v} m_k$. In the sequel we will be interested in assigning jobs to speeds. For a given assignment, let $k(j)$ denote the speed at which job $j$ is assigned to be processed. The average processing allocated to a machine of a specific speed $k$, denoted by $D_k$, is the following:
\[ D_k = \frac{1}{m_k s_k} \sum_{j : k(j) = k} p_j . \]
A chain is simply a subset of jobs which are totally ordered by the precedence constraints. Let $\mathcal{P}$ be the set of all chains induced by the precedence constraints. We compute a quantity $C$ defined by the following equation:
\[ C = \max_{P \in \mathcal{P}} \sum_{j \in P} \frac{p_j}{s_{k(j)}} . \]
A natural variant of list scheduling, called speed based list scheduling, is developed in [1]; it is constrained to schedule according to the speed assignments of the jobs. In classical list scheduling, the first available job from the list is scheduled as soon as a machine is free. In speed based list scheduling, an available job is scheduled on a free machine provided the speed of the free machine matches the speed assignment of the job. The following theorem is from [1]; the analysis is a simple generalization of Graham's analysis of list scheduling.

Theorem 1 ([1]). For any assignment of jobs to speeds, the schedule produced by speed based list scheduling satisfies
\[ C_{\max} \le C + \sum_{k=1}^{K} D_k . \]
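The following sketch gives one concrete reading of this rule (illustrative code, not from [1]): an idle machine of class $k$ takes the first job in the list that is unstarted, has all predecessors finished, and is assigned to class $k$.

    import heapq

    def speed_based_list_schedule(p, preds, k_of, machines, order):
        """p[j]: processing requirement; preds[j]: predecessors of j; k_of[j]:
        assigned speed class of j; machines: list of (speed, class) pairs;
        order: the priority list. Returns job finish times."""
        finish = {}
        idle = [True] * len(machines)
        running, t = [], 0.0
        remaining = set(range(len(p)))
        while remaining or running:
            assigned = True
            while assigned:                   # let every idle machine grab a job
                assigned = False
                for i, (s, k) in enumerate(machines):
                    if not idle[i]:
                        continue
                    j = next((j for j in order if j in remaining
                              and k_of[j] == k
                              and all(finish.get(q, float("inf")) <= t
                                      for q in preds[j])), None)
                    if j is not None:
                        remaining.remove(j)
                        finish[j] = t + p[j] / s      # machine of speed s
                        heapq.heappush(running, (finish[j], i))
                        idle[i] = False
                        assigned = True
            if running:                        # advance to the next completion
                t, i = heapq.heappop(running)
                idle[i] = True
                while running and running[0][0] == t:
                    _, i2 = heapq.heappop(running)
                    idle[i2] = True
            elif remaining:
                raise ValueError("no machine class matches the remaining jobs")
        return finish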
In this section we develop a simple and natural lower bound that will be used in
the analysis of our algorithm. Before formally stating the lower bound we provide
some intuition. The two lower bounds used in Graham’s analysis for identical
parallel machines are the maximum chain length and the average load. As dis-
cussed in the previous section, a naive generalization of the first lower bound
implies that the maximum chain length (a chain's length is the sum of processing
times of jobs in it) divided by the fastest speed is a lower bound. However it is
easy to generate examples where the maximum of this bound and the average
load is O(1/m) times the optimal. We describe the general nature of such exam-
ples to motivate our new bound. Suppose we have two speeds with s1 = D and
s2 = 1. The precedence constraints between the jobs are induced by a collection
of ` > 1 chains, each of the same length D. Suppose m1 = 1, and m2 = ` · D.
The average load can be seen to be upper bounded by 1. In addition the time
to process any chain on the fastest processor is 1. However if D ` it is easy to
observe that the optimal is Ω(`) since only ` machines can be busy at any time
instant. The key insight we obtain from the above example is that the amount
of parallelism in an instance restricts the number of machines that can be used.
We capture this insight in our lower bound in a simple way. We need a few defi-
nitions to formalize the intuition. We view the precedence relations between the
jobs as a weighted poset where each element of the poset has a weight associated
with it that is the same as the processing time of the associated job. We will
further assume that we have the transitive closure of the poset.
With the above definitions in place we are ready to state and prove the new
lower bound.
Theorem 2. Let $P = \{P_1, P_2, \dots, P_r\}$ be any maximal chain decomposition of the precedence graph of the jobs. Let $AL = (\sum_{j=1}^{n} p_j)/(\sum_{i=1}^{m} s_i)$, which represents the average load. Then
\[ C^*_{\max} \ge \max\{AL,\; L_P\} . \]
Moreover, the lower bound is valid for the preemptive case as well.
Proof. It is easy to observe that $C^*_{\max} \ge AL$. We will show that, for $1 \le j \le m$,
\[ C^*_{\max} \ge \frac{\sum_{i=1}^{j} |P_i|}{\sum_{i=1}^{j} s_i} , \]
which will prove the theorem. Consider the first $j$ chains. Suppose our input instance was modified to have only the jobs in the first $j$ chains. It is easy to see that a lower bound for this modified instance is a lower bound for the original instance. Since it is possible to execute only one job from each chain at any time instant, only the $j$ fastest machines are relevant for this modified instance. The expression $(\sum_{i=1}^{j} |P_i|)/(\sum_{i=1}^{j} s_i)$ is simply the average load for the modified instance, which, as we observed before, is a lower bound. Since the average load is also a lower bound for the preemptive case, the claimed lower bound applies even if preemptions are allowed.
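Given a maximal chain decomposition, the bound of Theorem 2 is immediate to evaluate; a small illustrative sketch (names are our own):

    def chain_lower_bound(chain_lengths, speeds):
        """max{AL, L_P}: chains and speeds are sorted in decreasing order, and
        every prefix of chains, run on the same-size prefix of fastest
        machines, yields an average-load lower bound."""
        P = sorted(chain_lengths, reverse=True)
        s = sorted(speeds, reverse=True)
        al = sum(P) / sum(s)
        lp = max(sum(P[:j]) / sum(s[:j])
                 for j in range(1, min(len(P), len(s)) + 1))
        return max(al, lp)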
Horvath, Lam, and Sethi [6] proved that the above lower bound gives the
optimal schedule length for preemptive scheduling of chains on uniformly related
machines. The idea of extending their lower bound to general precedences using
maximal chain decompositions is natural but does not appear to have been
effectively used before.
Theorem 3. A maximal chain decomposition can be computed in $O(n^3)$ time. If all $p_j$ are the same, the running time can be improved to $O(n^2\sqrt{n})$.
Proof. It is necessary to find the transitive closure of the given graph of precedence constraints. This can be done in $O(n^3)$ time using a BFS from each vertex. From a theoretical point of view this can be improved to $O(n^\omega)$, where $\omega \le 2.376$, using fast matrix multiplication [2]. A longest chain in a weighted DAG can be found in $O(n^2)$ time using standard algorithms. Using this at most $n$ times, a maximal chain decomposition can be obtained in $O(n^3)$ time. If all $p_j$ are the same (without loss of generality we can assume they are all 1), the length of a chain is the same as the number of vertices in the chain. It is possible to use this additional structure to obtain a maximal chain decomposition in $O(n^2\sqrt{n})$ time. We omit the details.
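A sketch of the $O(n^3)$ procedure from this proof (assuming the transitive closure is given and vertices are topologically numbered; all names are our own):

    def maximal_chain_decomposition(p, succs):
        """Repeatedly peel off a longest (weight-p) chain from the remaining
        DAG; succs[v] lists the successors of v in the transitive closure."""
        remaining, chains = set(range(len(p))), []
        while remaining:
            best = {v: (p[v], None) for v in remaining}  # longest chain ending at v
            for v in sorted(remaining):                  # topological order
                for w in succs[v]:
                    if w in remaining and best[v][0] + p[w] > best[w][0]:
                        best[w] = (best[v][0] + p[w], v)
            v = max(remaining, key=lambda u: best[u][0])
            chain = []
            while v is not None:                         # walk the chain backwards
                chain.append(v)
                v = best[v][1]
            chain.reverse()
            chains.append(chain)
            remaining -= set(chain)
        return chains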
Proof. We prove the above assertions by induction on $u$. Consider the base case when $u = 1$ and $\ell(1) = 1$. From the definition of $L_P$ it follows that $|P_1|/s_1 \le B$. Since $P_1$ is the longest chain, it follows that $|P_j|/s_1 \le B$ for $1 \le j \le r$. Let $t$ be the last chain allocated to $s_1$. If $t = r$ we are done. If $t < r$, it must be the case that adding $P_{t+1}$ increases the average load on $s_1$ to more than $4B$. Since $|P_{t+1}|/s_1 \le B$, we conclude that $D_1 = \sum_{j=1}^{t} |P_j| / (m_1 s_1) > 3B > 2B$.
Assume that the conditions of the lemma are satisfied for speeds $s_1$ to $s_{u-1}$ and consider speed $s_u$. We will assume that $\ell(u) < r$, for otherwise there is nothing to prove. We observe that the second condition follows from the first using an argument similar to the one used above for the base case. Therefore it is sufficient to prove the first condition. Suppose $|P_{\ell(u)}|/s_u > 2B$; we will derive a contradiction. Let $j = \ell(u)$ and let $v$ be the index such that $M_{1,v-1} < j \le M_{1v}$ (recall that $M_{1v} = \sum_{k=1}^{v} m_k$). If $j > m$, no such index exists and we set $v$ to $K$, the slowest speed. If $j \le m$, for convenience of notation we assume that $j = M_{1v}$, simply by ignoring other machines of speed $s_v$. It is easy to see that $v \ge u$ and $j > M_{1,u-1}$. From the definitions of $L_P$, $AL$, and $B$ we get the following. If $j \le m$ then $L_P \ge (\sum_{i=1}^{j} |P_i|)/(\sum_{k=1}^{v} m_k s_k)$. If $j > m$ then $AL \ge (\sum_{i=1}^{j} |P_i|)/(\sum_{k=1}^{K} m_k s_k)$. In either case we obtain the fact that
\[ \frac{\sum_{i=1}^{j} |P_i|}{\sum_{k=1}^{v} m_k s_k} \le \max\{L_P, AL\} = B . \tag{1} \]
Since $|P_j|/s_u > 2B$, it must be the case that $|P_i|/s_u > 2B$ for all $M_{1,u-1} < i \le j$. This implies that
\[ \sum_{M_{1,u-1} < i \le j} |P_i| > 2B\,(j - M_{1,u-1})\,s_u \ge 2B \sum_{k=u}^{v} m_k s_k \]
\[ \Rightarrow\quad \sum_{i=1}^{j} |P_i| > 2B \sum_{k=u}^{v} m_k s_k . \tag{2} \]
The last inequality follows since we are summing more terms on the left-hand side. From the induction hypothesis it follows that speeds $s_1$ to $s_{u-1}$ have an average load greater than $2B$. From this we obtain
\[ \sum_{i=1}^{j-1} |P_i| > 2B \sum_{k=1}^{u-1} m_k s_k \tag{3} \]
\[ \Rightarrow\quad \sum_{i=1}^{j} |P_i| > 2B \sum_{k=1}^{u-1} m_k s_k . \tag{4} \]
Adding (2) and (4) yields
\[ 2 \sum_{i=1}^{j} |P_i| > 2B \sum_{k=1}^{u-1} m_k s_k + 2B \sum_{k=u}^{v} m_k s_k = 2B \sum_{k=1}^{v} m_k s_k \]
\[ \Rightarrow\quad \sum_{i=1}^{j} |P_i| > B \sum_{k=1}^{v} m_k s_k , \tag{5} \]
which contradicts (1) and completes the induction. ⊓⊔
Corollary 1. If chain $P_j$ is assigned to speed $i$, then $|P_j|/s_i \le 2B$.

Lemma 2. For $1 \le k \le K$, $D_k \le 4C^*_{\max}$.

Proof. Since $B \le C^*_{\max}$ and the algorithm never loads a speed by more than an average load of $4B$, the bound follows.
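For intuition, the following sketch reconstructs an allocation rule consistent with the way Chain-Alloc is used in Lemmas 1 and 2 (the exact rule and its tie-breaking are assumptions recovered from the proofs, not the authors' stated pseudocode):

    def chain_alloc(chain_lengths, m, s, B):
        """Assign chains (longest first) to speed classes 1..K in order; move
        to the next class once taking another chain would push that class's
        average load D_k = (assigned length)/(m_k * s_k) above 4B."""
        P = sorted(chain_lengths, reverse=True)
        alloc = [[] for _ in range(len(s))]
        k, load = 0, 0.0
        for length in P:
            while k < len(s) - 1 and (load + length) / (m[k] * s[k]) > 4 * B:
                k, load = k + 1, 0.0
            alloc[k].append(length)
            load += length
        return alloc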
Lemma 3. For the job assignment produced by Chain-Alloc, $C \le 2KC^*_{\max}$.

Proof. Let $P$ be any chain. We will show that $\sum_{j \in P} p_j / s_{k(j)} \le 2KC^*_{\max}$, where $k(j)$ is the speed to which job $j$ is assigned. Let $A_i$ be the set of jobs in $P$ which are assigned to speed $i$, and let $|A_i|$ denote their total processing time. Let $P_\ell$ be the longest chain assigned to speed $i$ by the algorithm. We claim that $|P_\ell| \ge \sum_{j \in A_i} p_j$. This is because the jobs in $A_i$ formed a chain when we picked $P_\ell$ to be the longest chain in the max chain decomposition. From Corollary 1 we know that $|P_\ell|/s_i \le 2B \le 2C^*_{\max}$. Therefore it follows that
\[ \sum_{j \in P} \frac{p_j}{s_{k(j)}} = \sum_{i=1}^{K} \frac{|A_i|}{s_i} \le 2KC^*_{\max} . \]
Theorem 4. Using speed based list scheduling on the job assignment produced by Algorithm Chain-Alloc gives a $6K$ approximation, where $K$ is the number of distinct speeds. Furthermore, the algorithm runs in $O(n^3)$ time. The running time can be improved to $O(n^2\sqrt{n})$ if all $p_j$ are the same.

Proof. From Lemma 2 we have $D_k \le 4C^*_{\max}$ for $1 \le k \le K$, and from Lemma 3 we have $C \le 2KC^*_{\max}$. Putting these two facts together, for the job assignment produced by the algorithm Chain-Alloc, speed based list scheduling gives the following upper bound by Theorem 1:
\[ C_{\max} \le C + \sum_{k=1}^{K} D_k \le 2KC^*_{\max} + 4KC^*_{\max} \le 6KC^*_{\max} . \]
It is easy to see that speed based list scheduling can be implemented in
O(n2 ) time. The running time is dominated by the time to do the maximum
chain decomposition. Theorem 3 gives the desired bounds.
We remark here that the leading constant in the LP based algorithm in [1]
is better. We also observe that the above bound is based on our lower bound
which is valid for preemptive schedules as well. Hence our approximation ratio
is also valid for preemptive schedules. In [1] it is shown that the lower bound
provided by the LP relaxation is a factor of Ω(log m/ log log m) away from the
optimal. Surprisingly it is easy to show using the same example as in [1] that
our lower bound from Section 3 is also a factor of Ω(log m/ log log m) away from
the optimal.
Theorem 5. There are instances where the lower bound given in Theorem 2 is
a factor of Ω(log m/ log log m) away from the optimal.
Proof. The proof of Theorem 3.3 in [1] provides the instance and it is easily
verified that any maximum chain decomposition of that instance is a factor of
Ω(log m/ log log m) away from the optimal.
Now consider the scenario where each job j has a release date rj before which
it cannot be processed. By a general result of Shmoys, Wein, and Williamson
[10] an approximation algorithm for the problem without release dates can be
transformed to one with release dates losing only a factor of 2 in the process.
Therefore we obtain the following.
Theorem 6. There is an O(log m) approximation algorithm for the problem
Q|prec, rj |Cmax that runs in time O(n3 ).
In this subsection we show that Chain-Alloc followed by speed based list scheduling gives a constant factor approximation if the precedence constraints are induced by a collection of chains. We give an informal proof in this version of the paper. We first observe that any maximal chain decomposition of a collection of chains is simply the collection itself. The crucial observation is that the algorithm Chain-Alloc allocates all jobs of any chain to the same speed class. The two observations together imply that there are no precedence relations between jobs allocated to different speeds. This allows us to obtain a stronger version of Theorem 1, in which we can upper bound the makespan obtained by speed based list scheduling.
5 Conclusions
The main contribution of this paper is a simple and efficient O(log m) approxi-
mation to the scheduling problem Q|prec|Cmax . Chudak and Shmoys [1] provide
similar approximations for the more general case when the objective function is the average weighted completion time ($Q|prec|\sum w_j C_j$) using linear programming relaxations. We believe that the techniques of this paper can be extended
to obtain a simpler and combinatorial algorithm for that case as well. It is known
that the problem of minimizing makespan is hard to approximate to within a
factor of 4/3 even if all machines have the same speed [8]. However, for the single
speed case a 2 approximation is known, while the current ratio for the multi-
ple speed case is only O(log m). Obtaining a constant factor approximation, or
improving the hardness are interesting open problems.
Acknowledgments
We thank Monika Henzinger for simplifying the proof of Lemma 1.
References
1. F. Chudak and D. Shmoys. Approximation algorithms for precedence-constrained
scheduling problems on parallel machines that run at different speeds. Proceedings
of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA),
1997.
2. D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. Proceedings of the 19th ACM Symposium on Theory of Computing, pages 1–6, 1987.
3. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the
Theory of NP-Completeness. Freeman, San Francisco, 1979.
4. R. L. Graham. Bounds for certain multiprocessing anomalies. Bell System Tech. J., 45:1563–1581, 1966.
5. R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. Opti-
mization and approximation in deterministic sequencing and scheduling: A survey.
Ann. Discrete Math., 5:287–326, 1979.
6. E. Horvath, S. Lam, and R. Sethi. A level algorithm for preemptive scheduling.
Journal of the ACM, 24(1):32–43, 1977.
7. J. Jaffe. Efficient scheduling of tasks without full use of processor resources. The-
oretical Computer Science, 26:1–17, 1980.
8. J. K. Lenstra and A. H. G. Rinnooy Kan. Complexity of scheduling under prece-
dence constraints. Operations Research, 26:22–35, 1978.
9. J. W. S. Liu and C. L. Liu. Bounds on scheduling algorithms for heterogeneous
computing systems. In J. L. Rosenfeld, editor, Information Processing 74, pages
349–353. North-Holland, 1974.
10. D. Shmoys, J. Wein, and D. Williamson. Scheduling parallel machines on-line.
SIAM Journal on Computing, 24:1313–31, 1995.
On the Relationship Between Combinatorial and
LP-Based Approaches to NP-Hard Scheduling
Problems
R. N. Uma and Joel Wein
1 Introduction
A well-studied approach to the exact solution of NP-hard scheduling problems
may be called enumerative methods, in which (implicitly) every possible solution
to an instance is considered in an ordered fashion. An example of these methods
is branch-and-bound, which uses upper and lower bounds on the value of the
optimal solution to cut down the search space to a (potentially) computationally
tractable size. Such methods are typically most effective when the subroutines
used to calculate both the upper and lower bounds are fast and yield strong
bounds, hence quickly eliminating much of the search space from consideration.
Although there is a wealth of approaches to designing the lower-bounding
subroutines, we can identify two that have been particularly prominent. The
first relies on a linear-programming relaxation of the problem, which itself is
often derived from an integer linear-programming formulation by relaxing the
integrality constraints; Queyranne and Schulz give an extensive survey of this
approach [15]. The second relies on what we will call a combinatorial relaxation of
the problem and yields what we will call a combinatorial lower bound. By this we
simply mean that the lower bound is produced by exploiting some understanding
of the structure of the problem as opposed to by solving a mathematical program.
For example, in this paper we focus on combinatorial lower bounds that are
obtained by relaxing the constraint (in a nonpreemptive scheduling problem)
that the entire job must be processed in an uninterrupted fashion.
Another approach to an NP-hard scheduling problem is to develop an ap-
proximation algorithm. Here the goal is to design an algorithm that runs in
polynomial time and produces a near-optimal solution of some guaranteed qual-
ity. Specifically, we define a ρ-approximation algorithm to be an algorithm that
runs in polynomial time and delivers a solution of value at most ρ times optimal;
see [9] for a survey. In contrast, an enumerative approach attempts to solve a
(usually small) problem to optimality, with no guarantee that the solution will
be obtained in time polynomial in the size of the input.
Recently various researchers have been successful in creating new connec-
tions between linear-programming relaxations used to give lower bounds for
certain scheduling problems and the design of approximation algorithms. Specif-
ically, they have used these relaxations to develop approximation algorithms
with small-constant-factor worst-case performance guarantees; as a by-product
one obtains worst-case bounds on the quality of the lower bound delivered by
these relaxations [14,10,7,18,17]. We define a ρ-relaxation of a problem to be a
relaxation that yields a lower bound that is always within a factor of ρ of the
optimal solution.
In this paper we establish additional connections between different approaches to these problems. We consider two NP-hard scheduling problems in which the goal is to minimize the average weighted completion time of the jobs scheduled: $1|r_j|\sum w_j C_j$, the problem of scheduling $n$ jobs with release dates on a single machine, and $P||\sum w_j C_j$, the problem of scheduling $n$ jobs on identical parallel processors. For each problem we show that a combinatorial lower bound that was used successfully in a branch-and-bound code for the problem is equivalent
Akker et al. based on $x_{jt}$-relaxations [20] (since they were developed several years apart, in different programming languages, and on different architectures, etc.), the evidence seems to be that neither much dominates the other; however, the bound of Belouadah, Posner and Potts seems to have been somewhat stronger, as they were able to solve to optimality problems of size 50, whereas van den Akker et al. solved to optimality problems of size 30. The enhanced strength of the lower bounds due to the $x_{jt}$-relaxations does not make up for the amount of time it takes to solve them.
Discussion of Results: This paper was born out of an interest in making more precise, from both an analytical and an empirical perspective, the comparison between the LP-based techniques and the techniques associated with the best combinatorial branch-and-bound algorithm. In this process several potentially surprising relationships between these two approaches arose. Specifically, we show that the solution delivered by the $y_{jt}$-based relaxation for $1|r_j|\sum w_j C_j$ is identical to that used to deliver the weaker of the two lower bounds used by Belouadah, Posner and Potts. We also show that the stronger of their two lower bounds, while empirically usually weaker than the $x_{jt}$-based relaxation, neither always dominates that lower bound nor is dominated by it. A corollary of this observation is that the optimal preemptive schedule for an instance of $1|r_j|\sum w_j C_j$ neither always dominates nor is dominated by the solution to the $x_{jt}$-relaxation.
We then establish a similar relationship for a different problem. Webster [23] gave a series of lower bounds for $P||\sum w_j C_j$ that are based on a notion similar to the job-splitting approach of Belouadah, Posner and Potts. We show that the weakest of his lower bounds (which in fact was originally proposed by Eastman, Even and Isaacs in 1964 [6]) is equivalent to a generalization of the $y_{jt}$ relaxation to parallel machines that was also used to give approximation algorithms for $P|r_j|\sum w_j C_j$ by Schulz and Skutella [18,17].
We then give an empirical evaluation of the quality of the different lower bounds and associated heuristics for $1|r_j|\sum w_j C_j$; this extends and puts in a broader context a recent experimental study of Savelsbergh and the authors [16]. We demonstrate that, on synthetic data sets, the stronger lower bound considered by Belouadah, Posner and Potts (which can be computed in $O(n^2)$ time) on average improves on the $y_{jt}$ lower bound by a few percent, and that heuristics based on this relaxation improve by a few percent as well. However, we demonstrate that on most of the synthetic data sets we consider, the simple greedy heuristic used by Belouadah, Posner and Potts is superior to all of the heuristics based on the approximation algorithms associated with the different LP relaxations.
It is only on data sets that were specifically designed to be difficult that the
LP-based heuristics outperform the simple greedy approach.
Finally, we note that simple local-improvement techniques are often very suc-
cessful in giving good solutions to scheduling problems [1]; we therefore consider
the impact of some simple local improvement techniques when applied both
“from scratch” and to the solutions yielded by the various heuristics that we
consider.
2 Background
In this section we briefly review the relevant lower bounds and algorithms. In both problems we consider, we have $n$ jobs $j$, $j = 1, \dots, n$, each with positive processing time $p_j$ and nonnegative weight $w_j$. For $1|r_j|\sum w_j C_j$, with each job is associated a release date $r_j$ before which it is not available for processing.
LP-Relaxations: We begin with the two relevant linear-programming relaxations of $1|r_j|\sum w_j C_j$. As mentioned earlier, Dyer and Wolsey introduced several integer linear programming formulations of the problem. We focus on two. In the first, with variables $y_{jt}$ and completion-time variables $C_j$, $y_{jt} = 1$ if job $j$ is being processed in the time period $[t, t+1]$ and $y_{jt} = 0$ otherwise:
\[
\begin{array}{lll}
\text{minimize} & \sum_{j=1}^{n} w_j C_j & \\
\text{subject to} & \sum_{t=1}^{T} y_{jt} = p_j, & j = 1, \dots, n; \quad (1)\\
& \sum_{j=1}^{n} y_{jt} \le 1, & t = 1, \dots, T; \quad (2)\\
& C_j = \frac{p_j}{2} + \frac{1}{p_j}\sum_{t=1}^{T}\bigl(t + \tfrac12\bigr)\, y_{jt}, & j = 1, \dots, n;\\
& y_{jt} \ge 0, & j = 1, \dots, n,\; t = r_j, \dots, T.
\end{array}
\]
The assignment constraints (1) state that each job has to be completed exactly once, and the capacity constraints (2) state that the machine can process at most one job during any time period.
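As an illustration, this relaxation can be written down directly with an off-the-shelf LP modeler; a minimal sketch assuming the PuLP library (the horizon $T$ and all names are our own):

    from pulp import LpMinimize, LpProblem, LpVariable, lpSum

    def yjt_lower_bound(p, w, r):
        """Build and solve the preemptive time-indexed relaxation above;
        p, w, r are lists of processing times, weights, and release dates."""
        n = len(p)
        T = max(r) + sum(p)                       # all jobs fit in [0, T]
        prob = LpProblem("yjt_relaxation", LpMinimize)
        y = {(j, t): LpVariable("y_%d_%d" % (j, t), lowBound=0)
             for j in range(n) for t in range(r[j], T)}
        C = [p[j] / 2.0 + (1.0 / p[j]) * lpSum((t + 0.5) * y[j, t]
             for t in range(r[j], T)) for j in range(n)]
        prob += lpSum(w[j] * C[j] for j in range(n))          # objective
        for j in range(n):                                    # constraints (1)
            prob += lpSum(y[j, t] for t in range(r[j], T)) == p[j]
        for t in range(T):                                    # constraints (2)
            prob += lpSum(y[j, t] for j in range(n) if t >= r[j]) <= 1
        prob.solve()
        return prob.objective.value()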
Job Splitting: The lower bounds of Belouadah, Posner and Potts are based on job splitting. This technique is based on the idea that a relaxation of a nonpreemptive scheduling problem may be obtained by splitting each job into smaller pieces that can be scheduled individually. If the objective function is $\sum w_j C_j$, then when we split job $j$ into pieces we must also split its weight $w_j$ among the pieces. In essence, we create a number of smaller jobs. If we split the jobs in such a way that we can solve the resulting relaxed problem in polynomial time,
One Machine: We first introduce some notation to describe the two BPP lower bounds. We say job $l$ is "better" than job $j$ if $\frac{p_l}{w_l} < \frac{p_j}{w_j}$, or, equivalently, if $\frac{w_l}{p_l} > \frac{w_j}{p_j}$. For both bounds jobs are split as follows. When a better job arrives, we split the
currently executing job into two pieces such that one piece completes at the arrival time of the new job and the second piece is considered for scheduling later. When a job is split into pieces, its weight is also split. So if job $j$ is split into $k$ pieces, then each piece $i$ has a processing time $p^i_j$, a weight $w^i_j$, and release date $r_j$, such that $\sum_{i=1}^{k} p^i_j = p_j$ and $\sum_{i=1}^{k} w^i_j = w_j$.
For the BPP1 bound, the weights are assigned to the pieces of a job such that $\frac{w^i_j}{p^i_j} = \frac{w_j}{p_j}$ for all $i = 1, \dots, k$, whereas for BPP2 the weights are assigned to the pieces in a greedy fashion with the aim of maximizing the lower bound; in [2] it is shown that BPP2 is always greater than or equal to BPP1.
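A runnable sketch of the BPP1 bound as we read it (illustrative code, not the authors' implementation): preemptive WSPT in which the running job is split exactly when a strictly better job arrives, each piece inheriting its proportional weight.

    import heapq

    def bpp1_lower_bound(jobs):
        """jobs: list of (r_j, p_j, w_j). A piece of length q of job j that
        completes at time t contributes q * (w_j / p_j) * t to the bound."""
        events = sorted(jobs)                 # by release date
        heap, lb, t, k, n = [], 0.0, 0.0, 0, len(jobs)
        cur = None                            # [p/w ratio, remaining, density, piece]
        while cur or heap or k < n:
            if cur is None and not heap:
                t = max(t, events[k][0])      # machine idles until next release
            while k < n and events[k][0] <= t:
                _, p, w = events[k]; k += 1
                heapq.heappush(heap, (p / w, p, w / p))
            if cur and heap and heap[0][0] < cur[0]:
                lb += cur[2] * cur[3] * t     # better job arrived: close the piece
                heapq.heappush(heap, (cur[0], cur[1], cur[2]))
                cur = None
            if cur is None:
                ratio, rem, dens = heapq.heappop(heap)
                cur = [ratio, rem, dens, 0.0]
            nxt = events[k][0] if k < n else float("inf")
            run = min(cur[1], nxt - t)        # run to completion or next release
            t += run; cur[1] -= run; cur[3] += run
            if cur[1] == 0:                   # job finished: last piece ends at t
                lb += cur[2] * cur[3] * t
                cur = None
        return lb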
Let us denote the lower bound given by BPP1 as $LB^{BPP1}$ and the lower bound given by the $y_{jt}$ LP relaxation as $LB^{y_{jt}}$.

Theorem 1. $LB^{y_{jt}} = LB^{BPP1}$.
Proof. Consider job $j$ and its contribution $LB^{BPP1}_j$ to the BPP1 bound; we compare it with the contribution $LB^{y_{jt}}_j$ of job $j$ to the $y_{jt}$ bound. Let job $j$ be released at $r_j$ with a processing time requirement of $p_j$ and weight $w_j$. Let this job be split into $k$ pieces of lengths $p^1_j, \dots, p^k_j$ starting at times $t^1_j, \dots, t^k_j$, respectively. So we have $p^1_j + p^2_j + \cdots + p^k_j = p_j$. BPP1 would assign weights $w^i_j = \frac{p^i_j}{p_j} w_j$ for $i = 1, \dots, k$. Note that
\[ w_j\,(t^1_j + p_j) = \sum_{i=1}^{k} w^i_j C^i_j + CBRK_j , \tag{3} \]
where $C^i_j = t^i_j + p^i_j$ is the completion time of piece $i$. The cost of breaking job $j$ is given by
\[
\begin{aligned}
CBRK_j &= \sum_{i=1}^{k-1} \sum_{h=i+1}^{k} w^i_j p^h_j \\
&= \sum_{i=1}^{k-1} \frac{w_j}{p_j}\, p^i_j \sum_{h=i+1}^{k} p^h_j \\
&= \frac{w_j}{p_j} \cdot \frac{1}{2}\Bigl(\bigl(\sum_{i=1}^{k} p^i_j\bigr)^2 - \sum_{i=1}^{k} (p^i_j)^2\Bigr) \\
&= \frac{1}{2}\, w_j p_j - \frac{1}{2} \sum_{i=1}^{k} w^i_j p^i_j .
\end{aligned}
\]
On the other hand,
\[
\begin{aligned}
LB^{y_{jt}}_j &= \frac{w_j p_j}{2} + \frac{w_j}{p_j} \sum_{i=1}^{k} \Bigl( (t^i_j + \tfrac12) + (t^i_j + 1 + \tfrac12) + \cdots + (t^i_j + (p^i_j - 1) + \tfrac12) \Bigr) \\
&= \frac{w_j p_j}{2} + \frac{w_j}{p_j} \sum_{i=1}^{k} \Bigl( p^i_j t^i_j + (p^i_j)^2 - \frac{(p^i_j)^2}{2} \Bigr) \\
&= \sum_{i=1}^{k} \Bigl(\frac{p^i_j}{p_j}\, w_j\Bigr)\bigl(t^i_j + p^i_j\bigr) + \frac{w_j p_j}{2} - \frac{w_j}{2 p_j} \sum_{i=1}^{k} (p^i_j)^2 \\
&= LB^{BPP1}_j .
\end{aligned}
\]
Summing over all jobs $j$ we have the required result. ⊓⊔
As an immediate corollary we obtain an upper bound on the quality of the lower bounds provided by both BPP1 and BPP2. Goemans et al. [8] proved that the $y_{jt}$-relaxation is a 1.685-relaxation of $1|r_j|\sum w_j C_j$; thus we see that BPP1 and BPP2 are as well. We now turn to the relationship with the $x_{jt}$ relaxation; it is known that this is stronger than the $y_{jt}$ [5].
Theorem 2. The lower bound given by BPP2 neither always dominates nor is
dominated by the xjt -lower bound.
The proof is by exhibiting two instances; one on which BPP2 is better and
one on which xjt is better. Due to space constraints the instances are omitted
in this extended abstract.
Parallel Identical Machines: In [23], Webster considers the problem $P||\sum w_j C_j$. He gives a series of progressively stronger lower bounds for this problem, all of which are based on ideas similar to job splitting; these lower bounds lead to a successful branch-and-bound algorithm. The weakest of these bounds is actually due to a 1964 paper by Eastman, Even and Isaacs [6]. By an argument of a similar flavor to that for the one-machine problem, we can show that in fact the weakest bound is equivalent to an extension of the $y_{jt}$-relaxation to parallel machines. In this extension the $m$ machines are conceptualized as one machine of speed $m$. Schulz and Skutella show that this formulation is a 2-relaxation of $P|r_j|\sum w_j C_j$ and a 3/2-relaxation of $P||\sum w_j C_j$. Williamson [24] had also, independently, observed this equivalence between the lower bounds due to Eastman, Even and Isaacs and the $y_{jt}$-relaxation of $P||\sum w_j C_j$. By establishing this equivalence we obtain a worst-case upper bound on the performance of all of Webster's lower bounds. In fact, he considers a more general problem in which each processor can have a non-trivial ready time; our results extend to this case as well. The details are omitted due to the space constraints of the extended abstract.
4 Empirical Evaluation
bound at modest additional computational cost (O(n log n) to O(n2 )), but since
both are relaxations of the optimal preemptive schedule, on the Hard1 data
set the BPP2 bound is still far from the xjt -lower bound. (Do not be misled by
the small numbers recorded for Hard2. They are with respect to the best lower
bound we know for that instance, which is BPP2.) Furthermore, note that on
the Synthetic1 data set sometimes BPP2 is better than the xjt -relaxation, and
this explains why none of the numbers in that column is 0 – the comparison
is always with respect to the best lower bound for that instance. We note that
the maximum improvement observed by BPP2 over BPP1 on any instance was
9.492%, on an instance in the Hard1 data set.
Table 1. Quality of the lower bounds with respect to the weighted flow time given by the $y_{jt}$, BPP2, and $x_{jt}$ relaxations. We report $\frac{(BEST - LB)}{LB} \times 100$, where BEST is the best available lower bound and LB is the corresponding lower bound. The values reported are averaged over all the instances in each case. Results for the $y_{jt}$ and $x_{jt}$ relaxations are from the experimental work of Savelsbergh et al.
In some sense this performance is quite surprising, but in another it is not very surprising at all. WSPT, while lacking any worst-case performance guarantee, is a natural choice that should work well "most of the time". It is likely that when one generates synthetic data one is generating the sort of instances that arise "much of the time". It is on the harder sorts of instances that the potential problems with such a heuristic arise.
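For reference, WSPT here is the simple rule sketched below (illustrative code; nonpreemptive, choosing among released jobs by largest w/p):

    import heapq

    def wspt(jobs):
        """jobs: list of (r_j, p_j, w_j). Whenever the machine falls idle, run
        the released job with the largest w/p ratio; returns sum of w_j*C_j."""
        events = sorted(jobs)                  # by release date
        avail, t, k, obj = [], 0.0, 0, 0.0
        n = len(events)
        while k < n or avail:
            if not avail:
                t = max(t, events[k][0])       # idle until the next release
            while k < n and events[k][0] <= t:
                _, p, w = events[k]; k += 1
                heapq.heappush(avail, (p / w, p, w))   # min p/w = max w/p
            _, p, w = heapq.heappop(avail)
            t += p
            obj += w * t
        return obj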
Although it would not be difficult to generate instances that would yield horrendous performance for WSPT, the purpose of an experimental study is to yield insight into algorithm performance on instances that may be somewhat natural. Towards this end we generated a spectrum of problems in the framework of "a few jobs of size P and lots of jobs of size 1", with P ranging from 1 to 1000. The results of the experiment, plotted in Figure 1, give a qualitative sense of the range in which WSPT is competitive with algorithms with worst-case performance guarantees, which is essentially up to P = 100 or so.
Finally, we experimented with the impact of local improvement, which can
be a very powerful algorithmic technique for certain scheduling problems. Specif-
ically, given a solution we considered all pairs and all triples of jobs, switching
any that led to an improved solution and iterating until no improving switch
existed. We applied these to the solutions of all the heuristics. For the heuristics
that considered several possible schedules we applied local improvement to all
of them and took the best resulting schedule. The results, reported in Table 3, indicate that local improvement applied to a collection of high-quality schedules yields the best results. We note further that applying this
local improvement from scratch (random orderings) was not competitive with
its application to a good initial schedule; the latter approach on average yielded
solutions of quality 40-50% better. The importance of the quality of the initial
schedule for local improvement was also observed by Savelsbergh et al [16] in
the case of a real resource constrained scheduling problem.
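The pairwise version of this local improvement is simple enough to sketch (illustrative code for the single-machine objective; the triple version is analogous):

    def pairwise_improve(order, jobs):
        """Swap any two positions in the sequence whenever that lowers
        sum w_j C_j (respecting release dates); repeat until no swap helps.
        jobs: list of (r_j, p_j, w_j); order: a starting permutation."""
        def cost(seq):
            t, obj = 0.0, 0.0
            for j in seq:
                r, p, w = jobs[j]
                t = max(t, r) + p
                obj += w * t
            return obj
        best, improved = cost(order), True
        while improved:
            improved = False
            for a in range(len(order)):
                for b in range(a + 1, len(order)):
                    order[a], order[b] = order[b], order[a]
                    c = cost(order)
                    if c < best - 1e-9:
                        best, improved = c, True
                    else:
                        order[a], order[b] = order[b], order[a]  # undo
        return order, best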
Fig. 1. Plot of ALG/LP ratios for $\sum w_j F_j$ for Best-α, 5-Best-α, and WSPT. Each data point is an average of 10 instances. On the x-axis we plot the maximum job size $p_{\max}$; $p_{\max} = k$ means that the processing time was generated so that $p = 1$ with probability 0.9 and $p = k$ with probability 0.1. The arrival times were generated uniformly in the range [0..500].
5 Conclusions
Enthusiasts of worst-case analysis may be disappointed by the relatively poor
showing of the approximation algorithms with respect to the simple WSPT
heuristic. They may be consoled by their advantages on certain classes of diffi-
cult problems and their supremacy when coupled with local improvement. We
suggest, however, that although it is interesting to compare these approaches on
simple models, perhaps the most important role for worst-case analysis is as a
tool that forces the algorithm designer to have new ideas that may also be useful
in other more complex problems1 . Savelsbergh et al. showed that the idea of us-
ing many α values led to improved solutions for actual complex manufacturing
scheduling problems; for these problems it is unlikely that there exist simple
heuristics that will perform as well in such settings.
Finally, in this study we have touched on several approaches to scheduling
problems: linear programming, approximation, combinatorial relaxations and
branch-and-bound, and local improvement, and made a modest contribution to
our further understanding of their relationship. We feel that it is an important
direction to continue to try to understand in a unified fashion the different roles
that these elements can play in the theory and practice of scheduling.
also like to thank Andreas Schulz for inviting him to ISMP ’97, where the idea for
this paper arose in discussions with Chris Potts. Some of the results in Section
3 were independently obtained by David Williamson and a joint journal paper
is forthcoming.
Polyhedral Combinatorics of QAPs

Volker Kaibel
1 Introduction
In the symmetric case we have quadratic costs $\hat q_{(\{i,k\},\{j,l\})}$ for assigning the two objects $i$ and $k$ anyhow to the two locations $j$ and $l$, and we have to find an assignment $\varphi : \{1, \dots, m\} \longrightarrow \{1, \dots, n\}$ that minimizes
\[ \sum_{i=1}^{m} \sum_{k=i+1}^{m} \hat q_{(\{i,k\},\{\varphi(i),\varphi(k)\})} + \sum_{i=1}^{m} c_{(i,\varphi(i))} . \]
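Written out directly, the objective above can be evaluated as follows (an illustrative sketch; the containers q and c and their keying are assumptions):

    from itertools import combinations

    def sqap_value(q, c, phi):
        """Value of an assignment phi (phi[i] = location of object i, all
        distinct) under symmetric quadratic costs q and linear costs c."""
        m = len(phi)
        quad = sum(q[frozenset((i, k)), frozenset((phi[i], phi[k]))]
                   for i, k in combinations(range(m), 2))
        lin = sum(c[i, phi[i]] for i in range(m))
        return quad + lin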
2 QAP-Polytopes
A hyperedge $\{(i,j), (k,l), (i,l), (k,j)\}$ is denoted by $\langle i,j,k,l \rangle$. The sets $\mathrm{row}_i := \{(i,j) \in V_{m,n} : j \in \mathcal N\}$ and $\mathrm{col}_j := \{(i,j) \in V_{m,n} : i \in \mathcal M\}$ are called the $i$-th row and the $j$-th column of $V_{m,n}$, respectively.
We call a subset C ⊂ Vm,n of nodes a clique of the hypergraph Ĝm,n if it
intersects neither any row nor any column more than once. The maximal cliques
of Ĝm,n are the m-cliques. The set of hyperedges that is associated with an m-
clique C ⊂ Vm,n of Ĝm,n consists of all hyperedges that share two nodes with C.
This set is denoted by $\hat E_{m,n}(C)$. Solving symmetric QAPs is then equivalent to finding minimum node- and hyperedge-weighted $m$-cliques in $\hat G_{m,n}$.
We denote by $x^{(\cdot)} \in \mathbb{R}^{V_{m,n}}$ and $z^{(\cdot)} \in \mathbb{R}^{\hat E_{m,n}}$ the characteristic vectors of subsets of $V_{m,n}$ and $\hat E_{m,n}$, respectively. Thus the following polytope encodes the structure of the symmetric QAP in an adequate fashion (where we simplify $(x^C, z^C) := (x^C, z^{\hat E_{m,n}(C)})$):
\[ SQAP_{m,n} := \operatorname{conv}\bigl\{ (x^C, z^C) : C \text{ is an } m\text{-clique of } \hat G_{m,n} \bigr\} . \]
Proof. The "only if" part is obvious. To prove the "if" part, let $(x, z) \in \mathbb{R}^{V_{m,n}} \times \mathbb{R}^{\hat E_{m,n}}$ satisfy (1), \dots, (5). Since $x$ is a 0/1-vector that satisfies (1) and (2), it must be the characteristic vector of some $m$-clique $C \subset V_{m,n}$ of $\hat G_{m,n}$. Considering two appropriate equations from (3), one obtains (by the nonnegativity of $z$) that $z_{\langle i,j,k,l \rangle} > 0$ implies $x_{(i,j)} = x_{(k,l)} = 1$ or $x_{(i,l)} = x_{(k,j)} = 1$. But then, in each of the equations (3) there is at most one hyperedge involved that corresponds to a non-zero component of $z$. This leads to the fact that $z_{\langle i,j,k,l \rangle} > 0$ implies $z_{\langle i,j,k,l \rangle} = 1$.
How is the $n \times n$-case related to the $m \times n$-case? Obviously, $SQAP_{m,n}$ arises from $SQAP_{n,n}$ by the canonical orthogonal projection $\hat\sigma^{(m,n)} : \mathbb{R}^{V_{n,n}} \times \mathbb{R}^{\hat E_{n,n}} \longrightarrow \mathbb{R}^{V_{m,n}} \times \mathbb{R}^{\hat E_{m,n}}$. Let $W_{m,n} = V_{n,n} \setminus V_{m,n}$ and $F_{m,n} = \{\langle i,j,k,l \rangle \in \hat E_{n,n} : \langle i,j,k,l \rangle \cap W_{m,n} \ne \emptyset\}$ be the sets of nodes and hyperedges that are "projected out" this way. The following connection is very useful.

Remark 1. If an inequality $(a, b)^T (x, z) \le \alpha$ defines a facet of $SQAP_{n,n}$, and $a_v = 0$ holds for all $v \in W_{m,n}$ as well as $b_h = 0$ for all $h \in F_{m,n}$, then the "projected inequality" $(a', b')^T (x', z') \le \alpha$ with $(a', b') = \hat\sigma^{(m,n)}(a, b)$ defines a facet of $SQAP_{m,n}$.
The following result shows that in order to investigate the m × n-case with
m < n it suffices to restrict even to m ≤ n − 2. In fact, it turns out later that
the structures of the polytopes for m ≤ n − 2 differ a lot from those for m = n
or m = n − 1.
The equations
\[ x(\mathrm{col}_j) = 1 \quad (j \in \mathcal N) \tag{6} \]
\[ -x_{(i,j)} - x_{(i,l)} + z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr) = 0 \quad (i, j, l \in \mathcal N,\; j < l) \tag{7} \]
are valid for $SQAP_{n,n}$. Obviously, the columns of these equations that correspond to nodes in $W_{n-1,n}$ or to hyperedges in $F_{n-1,n}$ are linearly independent. This implies the theorem. ⊓⊔
The questions for the affine hull, the dimension, and the trivial facets are an-
swered by the following theorem.
Theorem 3. Let $3 \le m \le n-2$.
(i) The affine hull of $SQAP_{m,n}$ is precisely the solution space of (1) and (3).

\[ X(\mathrm{row}_i) = 1 \quad (i \in \mathcal N) \tag{8} \]
\[ X(\mathrm{col}_j) = 1 \quad (j \in \mathcal N) \tag{9} \]
\[ -X_{(i,j)} - X_{(k,j)} + Z\bigl(\Delta^{(i,j)}_{(k,j)}\bigr) = 0 \quad (i, j, k \in \mathcal N,\; i < k) \tag{10} \]
\[ -X_{(i,j)} - X_{(i,l)} + Z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr) = 0 \quad (i, j, l \in \mathcal N,\; j < l) \tag{11} \]
The fact that (9) and (11) are needed additionally to describe the affine hull of the polytope in the $n \times n$-case is the most important difference from the $m \times n$-case with $m \le n-2$.
It turned out ([12,8]) that minimizing over the intersection $SEQP_{n,n}$ of $\operatorname{aff}(SQAP_{n,n})$ and the nonnegative orthant empirically yields a very strong lower bound for the symmetric $n \times n$-QAP. In contrast to that, minimizing over the intersection $SEQP_{m,n}$ of $\operatorname{aff}(SQAP_{m,n})$ and the nonnegative orthant usually gives rather poor lower bounds (for $m \le n-2$). However, solving the corresponding linear programs is much faster in the $m \times n$-case (as long as $m$ is much smaller than $n$).
In order to obtain a good lower bound also in the $m \times n$-case, one could add $n - m$ dummy objects to the instance and then calculate the bound in the $n \times n$-model. Clearly it would be desirable to be able to compute that bound without "blowing up" the model by adding dummies. The following result provides a possibility to do so, and hence enables us to compute good lower bounds fast in the case of considerably fewer objects than locations. One more notational convention is needed. For two disjoint columns $\mathrm{col}_j$ and $\mathrm{col}_l$ of $\hat G_{m,n}$ we denote by $\langle \mathrm{col}_j : \mathrm{col}_l \rangle$ the set of all hyperedges that share two nodes with $\mathrm{col}_j$ and two nodes with $\mathrm{col}_l$.
Theorem 4. A vector $(x, z) \in \mathbb{R}^{V_{m,n}} \times \mathbb{R}^{\hat E_{m,n}}$ is contained in $\hat\sigma^{(m,n)}(SEQP_{n,n})$ if and only if it satisfies the following system:
\[ x(\mathrm{row}_i) = 1 \quad (i \in \mathcal M) \tag{12} \]
\[ x(\mathrm{col}_j) \le 1 \quad (j \in \mathcal N) \tag{13} \]
\[ -x_{(i,j)} - x_{(k,j)} + z\bigl(\Delta^{(i,j)}_{(k,j)}\bigr) = 0 \quad (i, k \in \mathcal M,\; i < k,\; j \in \mathcal N) \tag{14} \]
\[ -x_{(i,j)} - x_{(i,l)} + z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr) \le 0 \quad (j, l \in \mathcal N,\; j < l,\; i \in \mathcal M) \tag{15} \]
\[ x(\mathrm{col}_j \cup \mathrm{col}_l) - z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle) \le 1 \quad (j, l \in \mathcal N,\; j < l) \tag{16} \]
\[ x_v \ge 0 \quad (v \in V_{m,n}) \tag{17} \]
\[ z_h \ge 0 \quad (h \in \hat E_{m,n}) . \tag{18} \]
Proof. It should always be clear from the context whether a symbol like $\mathrm{row}_i$ refers to the $i$-th row of $V_{n,n}$ or of $V_{m,n}$. The rule is that in connection with variables denoted by lower-case letters, $V_{m,n}$ is the reference set, while variables denoted by upper-case letters refer to $V_{n,n}$.
We shall first consider the "only if" claim of the theorem. Let $(X, Z) \in SEQP_{n,n}$, and let $(x, z) = \hat\sigma^{(m,n)}(X, Z)$ be the projection of $(X, Z)$. Obviously, (12), (14), and the nonnegativity constraints hold for $(x, z)$. The inequalities (13) and (15) follow from (9), (11), and the nonnegativity of $(X, Z)$.
It remains to show that the inequalities (16) are satisfied by $(x, z)$. Certain equations, easily seen to be valid for $SQAP_{n,n}$, hold since $(X, Z) \in \operatorname{aff}(SQAP_{n,n})$; adding to them some of the equations (11) (the ones with $i > m$) yields, by the nonnegativity of $Z$, that the inequalities (16) hold for the projected vector $(x, z)$.
We come to the more interesting "if" claim. In order to show that the given system (12), \dots, (18) of linear constraints forces the point $(x, z)$ to be contained in the projected polytope $\hat\sigma^{(m,n)}(SEQP_{n,n})$, we shall exhibit a map $\phi : \mathbb{R}^{V_{m,n}} \times \mathbb{R}^{\hat E_{m,n}} \longrightarrow \mathbb{R}^{V_{n,n}} \times \mathbb{R}^{\hat E_{n,n}}$ that maps such a point $(x, z)$ satisfying (12), \dots, (18) to a point $(X, Z) = \phi(x, z) \in SEQP_{n,n}$ which coincides with $(x, z)$ on the components belonging to $\hat G_{m,n}$ (as a subgraph of $\hat G_{n,n}$). Hence, the first step is to define $(X, Z) = \phi(x, z)$ as an extension of $(x, z)$, and the second step is to prove that this $(X, Z)$ indeed satisfies (8), \dots, (11), as well as $(X, Z) \ge 0$. The following extension turns out to be a suitable choice (recall that $m \le n-2$):
\[ X_{(i,j)} := \frac{1 - x(\mathrm{col}_j)}{n - m} \quad (i > m) \tag{21} \]
\[ Z_{\langle i,j,k,l \rangle} := \frac{x_{(i,j)} + x_{(i,l)} - z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr)}{n - m} \quad (i \le m,\; k > m) \tag{22} \]
\[ Z_{\langle i,j,k,l \rangle} := \frac{2\bigl(1 - x(\mathrm{col}_j \cup \mathrm{col}_l) + z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle)\bigr)}{(n-m-1)(n-m)} \quad (i, k > m) \tag{23} \]
Let $(x, z) \in \mathbb{R}^{V_{m,n}} \times \mathbb{R}^{\hat E_{m,n}}$ satisfy (12), \dots, (18), and let $(X, Z) = \phi(x, z)$ be the extension defined by (21), \dots, (23). Clearly, $X$ is nonnegative (by (13)), and $Z$ is nonnegative (by (15) for $i \le m$, $k > m$, and by (16) for $i, k > m$).
The validity of (8), \dots, (11) for $(X, Z)$ is shown by the following series of calculations. We use the notation $\Delta(i,j)$ for the set of all hyperedges of $\hat G_{m,n}$ that contain the node $(i, j)$. Note that by (14) we have
\[ \sum_{i \in \mathcal M} z(\Delta(i,j)) = 2(m-1)\, x(\mathrm{col}_j) \quad (j \in \mathcal N) . \tag{24} \]
For $i > m$ we get
\[ X(\mathrm{row}_i) \overset{(21)}{=} \sum_{j \in \mathcal N} \frac{1 - x(\mathrm{col}_j)}{n - m} = \frac{1}{n-m}\Bigl(n - \sum_{j \in \mathcal N} x(\mathrm{col}_j)\Bigr) = \frac{1}{n-m}\Bigl(n - \sum_{i \in \mathcal M} x(\mathrm{row}_i)\Bigr) \overset{(12)}{=} 1 \]
and, for $j \in \mathcal N$,
\[ X(\mathrm{col}_j) \overset{(21)}{=} x(\mathrm{col}_j) + (n - m)\,\frac{1 - x(\mathrm{col}_j)}{n - m} = 1 . \]
It remains to consider the case $i, k > m$. Here we get (using $\alpha := \frac{1}{(n-m-1)(n-m)}$ in order to increase readability)
\[
\begin{aligned}
-X_{(i,j)} - X_{(k,j)} + Z\bigl(\Delta^{(i,j)}_{(k,j)}\bigr)
&\overset{(23)}{=} -X_{(i,j)} - X_{(k,j)} + 2\alpha \sum_{l \in \mathcal N \setminus j} \bigl(1 - x(\mathrm{col}_j \cup \mathrm{col}_l) + z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle)\bigr) \\
&= -X_{(i,j)} - X_{(k,j)} + 2\alpha \Bigl(n - 1 - (n-2)\,x(\mathrm{col}_j) - x(V_{m,n}) + \frac{1}{2} \sum_{i \in \mathcal M} z(\Delta(i,j))\Bigr) \\
&\overset{(12),(24)}{=} -X_{(i,j)} - X_{(k,j)} + 2\alpha\bigl(n - m - 1 + (m - n + 1)\,x(\mathrm{col}_j)\bigr) \\
&= -X_{(i,j)} - X_{(k,j)} + \frac{2}{n-m}\bigl(1 - x(\mathrm{col}_j)\bigr) \overset{(21)}{=} 0 .
\end{aligned}
\]
For $i \le m$ and $j < l$ we have
\[
\begin{aligned}
-X_{(i,j)} - X_{(i,l)} + Z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr)
&= -x_{(i,j)} - x_{(i,l)} + \sum_{k \in \mathcal M \setminus i} Z_{\langle i,j,k,l \rangle} + \sum_{k \in \mathcal N \setminus \mathcal M} Z_{\langle i,j,k,l \rangle} \\
&\overset{(22)}{=} -x_{(i,j)} - x_{(i,l)} + z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr) + \sum_{k \in \mathcal N \setminus \mathcal M} \frac{x_{(i,j)} + x_{(i,l)} - z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr)}{n - m} \\
&= 0 .
\end{aligned}
\]
Finally, for $i > m$ and $j < l$,
\[
\begin{aligned}
-X_{(i,j)} - X_{(i,l)} + Z\bigl(\Delta^{(i,j)}_{(i,l)}\bigr)
&= -X_{(i,j)} - X_{(i,l)} + \sum_{k \in \mathcal M} Z_{\langle i,j,k,l \rangle} + \sum_{k \in (\mathcal N \setminus \mathcal M) \setminus i} Z_{\langle i,j,k,l \rangle} \\
&\overset{(22),(23)}{=} -X_{(i,j)} - X_{(i,l)} + \sum_{k \in \mathcal M} \frac{x_{(k,l)} + x_{(k,j)} - z\bigl(\Delta^{(k,l)}_{(k,j)}\bigr)}{n - m} + 2 \sum_{k \in (\mathcal N \setminus \mathcal M) \setminus i} \frac{1 - x(\mathrm{col}_j \cup \mathrm{col}_l) + z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle)}{(n-m-1)(n-m)} \\
&= -X_{(i,j)} - X_{(i,l)} + \frac{1}{n-m}\bigl(x(\mathrm{col}_l) + x(\mathrm{col}_j) - 2\, z(\langle \mathrm{col}_l : \mathrm{col}_j \rangle)\bigr) + \frac{2}{n-m}\bigl(1 - x(\mathrm{col}_j \cup \mathrm{col}_l) + z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle)\bigr) \\
&= -X_{(i,j)} - X_{(i,l)} + \frac{1}{n-m}\bigl(2 - x(\mathrm{col}_j) - x(\mathrm{col}_l)\bigr) \overset{(21)}{=} 0 . \qquad ⊓⊔
\end{aligned}
\]
Finally, we investigate the system (12), . . . ,(18) with respect to the question
of redundancies. From Theorem 3 we know already that the nonnegativity con-
straints (17) and (18) define facets of SQAP m,n as well as that (12) and (14)
are needed in the linear description of the affine hull of the polytope. Thus it
remains to investigate (13), (15), and (16). And in fact, it turns out that one of
these classes is redundant.
Theorem 5. Let $4 \le m \le n-2$.
(i) The inequalities (13) are implied by (12), (14), and (16).

Proof. The equations
\[ \sum_{l \in \mathcal N \setminus j} z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle) = (m-1)\, x(\mathrm{col}_j) \quad (j \in \mathcal N) \tag{25} \]
and
\[ x(V_{m,n}) = m \tag{26} \]
hold for $SQAP_{m,n}$. Thus, they are implied by the linear system (12), (14) due to Theorem 3. Adding up all inequalities (16) for a fixed $j \in \mathcal N$ and all $l \in \mathcal N \setminus j$, subtracting (26), and using (25) (for that $j$) yields
\[ n - m - 1 \ge \sum_{l \in \mathcal N \setminus j} \bigl( x(\mathrm{col}_j \cup \mathrm{col}_l) - z(\langle \mathrm{col}_j : \mathrm{col}_l \rangle) \bigr) - x(V_{m,n}) = (n - m - 1)\, x(\mathrm{col}_j) , \]
and hence $x(\mathrm{col}_j) \le 1$: the inequalities (13) are indeed implied.
6 Computational Results
Using the ABACUS framework (Jünger and Thienel [7]) we have implemented a
simple cutting plane algorithm for (symmetric) m×n-instances (with m ≤ n−2)
that uses (12), (14), . . . ,(18) as the initial set of constraints. Thus, by Theorem 4
(and Theorem 5 (i)), the first bound that is computed is the symmetric equa-
tion bound (SEQB), which is obtained by optimizing over the intersection of
aff(SQAP n,n ) with the nonnegative orthant.
The separation algorithm that we use is a simple 2-opt-based heuristic for
finding violated 1-box inequalities with β = 2. We limited the experiments to
this small subclass of box inequalities since, on the one hand, they emerged as
the most valuable ones in initial tests, and, on the other hand, even our simple
heuristic usually finds many violated inequalities among the 1-box inequalities
with β = 2 (if it is called several times with different randomly chosen initial
boxes T).
The experiments were carried out on a Silicon Graphics Power Challenge
computer. For solving the linear programs we used the barrier code of CPLEX
4.0, which was run in its parallel version on four processors.
We tested our code on the esc instances of the QAPLIB, which are the
only ones in that problem library that have many fewer objects than locations.
Note that all these instances have both a symmetric (integral) flow and
a symmetric (integral) distance matrix, so that only even numbers occur
as objective function values of feasible solutions. Thus, every lower bound can
be rounded up to the smallest even integer greater than or equal to it (see the
small helper after this paragraph). Our tests are restricted to those instances
that have 16 or 32 locations. All the esc16 instances (with 16 locations) were
solved to optimality for the first time by Clausen and Perregaard [3]. The esc32
instances (with 32 locations) are still unsolved, except for three easy ones among them.
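For illustration, the even-rounding step amounts to the following one-line helper (the function name is ours):

```python
import math

def round_up_to_even(lb: float) -> int:
    # smallest even integer >= lb; valid here since all feasible
    # objective values of these instances are even
    return 2 * math.ceil(lb / 2)
```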
Table 1 shows the results for the esc16 instances. The instances esc16b,
esc16c, and esc16h are omitted since they do not satisfy m ≤ n − 2 (esc16f
was removed from the QAPLIB since it has an all-zero flow matrix). The bounds
Table 1. The column objects contains the number of objects in the respective
instance, opt is the optimal solution value, SEQB is the symmetric equation
bound (i.e., the bound after the first LP), box is the bound obtained after some
cutting plane iterations, LPs shows the number of linear programs being solved,
time is the CPU time in seconds, and speed up is the quotient of the running
times for working in the n × n- and in the m × n-model. The last column gives
the running times Clausen and Perregaard needed to solve the instances on a
parallel machine with 16 i860 processors.
name objects opt SEQB box LPs time (s) speed up CP (s)
esc16a 10 68 48 64 3 522 4.87 65
esc16d 14 16 4 16 2 269 2.74 492
esc16e 9 28 14 28 4 588 3.37 66
esc16g 8 26 14 26 3 58 14.62 7
esc16i 9 14 0 14 4 106 28.18 84
esc16j 7 8 2 8 2 25 32.96 14
produced by our cutting plane code match the optimal solution values for all
instances but esc16a (see also Fig. 1). Working in the m × n-model speeds up
the cutting plane code considerably for some instances. The running times in the
m × n-model are comparable to those of the branch-and-bound code of [3].
Fig. 1. The bars (gray for SEQB and black for the bound obtained by the box
inequalities) show the ratios lower/upper of the lower and upper bounds for the
instances esc16a, esc16d, esc16e, esc16g, esc16i, and esc16j, where the upper
bounds are always the optimal solution values here.
For the esc32 instances, our cutting plane algorithm always computes the
best known lower bounds (see Tab. 2 and Fig. 2). The three instances esc32e,
esc32f, and esc32g were solved to optimality for the first time by Brüngger,
Marzetta, Clausen, and Perregaard [1]. Our cutting plane code is able to solve
these instances to optimality within a few hundred seconds of CPU time (on
Table 2. The column labels have the same meanings as in Tab. 1. Additionally,
upper gives the objective function value of the best known feasible solution, and
prev lb denotes the best previously known lower bound. (Running times marked
with a ? were only measured approximately, due to problems with the queuing
system of the machine.)
four processors). These running times are about the same as those needed by [1]
with a branch-and-bound code on a 32-processor NEC Cenju-3 machine.
The formerly best known lower bounds for the other esc32 instances were
calculated by the triangle decomposition bounding procedure of Karisch and
Rendl [9]. The bounds obtained by the cutting plane code improve (or match, in
the case of esc32c) all these bounds. The most impressive gain is the improvement
of the bound quality from 0.28 to 0.68 for esc32a. While for the esc16 instances
switching from the n × n- to the m × n-model yields a significant speed-up, for
the esc32 instances solving the linear programs became possible only
in the m × n-model.
Nevertheless, for the hard ones among the esc32 instances the running times
of the cutting plane code are rather large. Here, a more sophisticated cutting
plane algorithm is required in order to succeed in solving these instances to
optimality. This concerns the separation algorithms and strategies, the treatment
of the linear programs, and the exploitation of the sparsity of the objective
functions, which will be briefly addressed in the following section.
Fig. 2. The dark gray and the black bars have the same meaning as in Fig. 1,
now for the instances esc32a through esc32h. Additionally, the light gray bars
show the qualities of the previously best known lower bounds.
7 Conclusion
The polyhedral studies reported in this paper have enabled us to build, for the
first time, a cutting plane code for QAPs with fewer objects than locations that
performs similarly to current parallel branch-and-bound codes on smaller
instances and gives new lower bounds for the larger ones. More elaborate separation
procedures (including parallelization) and a more sophisticated handling
of the linear programs will surely increase the performance of the cutting plane
algorithm still further.
At the moment, the limiting factor for the cutting plane approach is the size
(and the hardness) of the linear programs. But if one considers the instances
in the QAPLIB more closely, it turns out that the flow matrices are very often
extremely sparse. If one exploits this sparsity, one can "project out" even more
variables than we did by passing from the n × n- to the m × n-model. In our
opinion, investigations of the associated projected polytopes will eventually lead
to cutting plane algorithms in much smaller models, which perhaps will push
the limits for exact solutions of quadratic assignment problems far beyond the
current ones.
References
1. A. Brüngger, A. Marzetta, J. Clausen, and M. Perregaard. Joining forces in solving
large-scale quadratic assignment problems. In Proceedings of the 11th International
Parallel Processing Symposium IPPS, pages 418–427, 1997.
2. R. E. Burkard, S. E. Karisch, and F. Rendl. QAPLIB - A quadratic assignment
problem library. Journal of Global Optimization, 10:391–403, 1997.
3. J. Clausen and M. Perregaard. Solving large quadratic assignment problems in
parallel. Computational Optimization and Applications, 8(2):111–127, 1997.
4. M. Jünger and V. Kaibel. On the SQAP-polytope. Technical Report 96.241,
Angewandte Mathematik und Informatik, Universität zu Köln, 1996.
5. M. Jünger and V. Kaibel. Box-inequalities for quadratic assignment polytopes.
Technical Report 97.285, Angewandte Mathematik und Informatik, Universität zu
Köln, 1997.
6. M. Jünger and V. Kaibel. The QAP-polytope and the star-transformation. Techni-
cal Report 97.284, Angewandte Mathematik und Informatik, Universität zu Köln,
1997.
7. M. Jünger and S. Thienel. Introduction to ABACUS – A Branch-And-CUt System.
Technical Report 97.263, Angewandte Mathematik und Informatik, Universität zu
Köln, 1997. (To appear in OR Letters).
8. V. Kaibel. Polyhedral Combinatorics of the Quadratic Assignment Problem.
PhD thesis, Universität zu Köln, 1997. http://www.informatik.uni-koeln.de/
ls_juenger/staff/kaibel/diss.html.
9. S. E. Karisch and F. Rendl. Lower bounds for the quadratic assignment problem
via triangle decompositions. Mathematical Programming, 71(2):137–152, 1995.
10. T. C. Koopmans and M. J. Beckmann. Assignment problems and the location of
economic activities. Econometrica, 25:53–76, 1957.
11. M. Padberg and M. P. Rijal. Location, Scheduling, Design and Integer Program-
ming. Kluwer Academic Publishers, 1996.
12. M. G. C. Resende, K. G. Ramakrishnan, and Z. Drezner. Computing lower bounds
for the quadratic assignment problem with an interior point solver for linear pro-
gramming. Operations Research, 43:781–791, 1995.
13. M. P. Rijal. Scheduling, Design and Assignment Problems with Quadratic Costs.
PhD thesis, New York University, 1995.
14. S. Sahni and T. Gonzalez. P-complete approximation problems. Journal of the
Association for Computing Machinery, 23(3):555–565, 1976.
Incorporating Inequality Constraints in the
Spectral Bundle Method
1 Introduction
Since the landmark papers [9,4,10,2] it is well known that semidefinite
programming makes it possible to design powerful relaxations for constrained
quadratic 0-1 programming and graph partitioning problems. The most commonly
used algorithms for solving these relaxations, primal-dual interior point methods,
offer few possibilities to exploit problem structure. Typically their runtime is
governed by the factorization of a dense symmetric positive definite matrix whose
order is the number of constraints, and by the line search that ensures the positive
definiteness of the matrix variables. Computation times for problems with more than
3000 constraints or matrix variables of order 500, say, are prohibitive. Very
recently, a pure dual approach has been proposed in [1] that is able to exploit the
sparsity of the cost matrix in the case of the max-cut relaxation with diagonal
constraints. It is not yet clear whether these results extend to problems with a
huge number of less structured constraints.
The spectral bundle method [6] works on a reformulation of semidefinite
relaxations as eigenvalue optimization problems and was developed to provide
approximate solutions to structured problems fast. In contrast to standard bundle
methods [7,11], a non-polyhedral semidefinite cutting plane model is constructed
from the subgradients. Reinterpreted in terms of the original semidefinite program,
the semidefinite model ensures positive semidefiniteness of the dual matrix
variable on a subspace only. Subgradients correspond to eigenvectors associated
with negative eigenvalues and are used to correct the subspace. By means of an
aggregate subgradient the dimension of the subspace can be kept small, thus
ensuring efficient solvability of the subproblems. Lanczos methods (see e.g. [3])
make it possible to compute a few extremal eigenvalues and their eigenvectors
efficiently by a series of matrix-vector multiplications, which do not require the
matrix in explicit form. Thus structural properties of cost and coefficient
matrices can be exploited.
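As an illustration of such a matrix-free extremal eigenvalue computation (not from the paper; a sketch using SciPy's Lanczos-based eigsh for the max-cut case, where Aᵀ(y) = Diag(y); the instance data is random):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigsh

n = 500
C = sp.random(n, n, density=0.01, format='csr')
C = (C + C.T) / 2                       # a sparse symmetric cost matrix
y = np.random.rand(n)                   # multipliers of the diagonal constraints

def matvec(v):
    v = v.ravel()
    # product with C - Diag(y); the matrix is never formed explicitly
    return C @ v - y * v

op = LinearOperator((n, n), matvec=matvec, dtype=float)
lam_max, vec = eigsh(op, k=1, which='LA')   # largest eigenvalue + eigenvector
```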
Like most first order methods, the spectral bundle method makes fast
progress in the beginning, but shows a strong tailing-off effect as the optimal
solution is approached. Fortunately, many semidefinite relaxations do not have
to be solved exactly. Rather, an approximate solution is used to improve the
current relaxation, e.g. by cutting planes, and the relaxation is then resolved.
In its original form the spectral bundle method is designed for equality
constraints only, because sign constraints on the dual variables may increase
computation times for the semidefinite subproblem significantly. In this paper we
employ Lagrangian relaxation to approximate the solution of a sign-constrained
semidefinite subproblem. Surprisingly, just one update of the Lagrange multipliers
per function evaluation suffices to ensure convergence. The semidefinite subproblem
can be solved as efficiently as in the unconstrained case, thus rendering this
method an attractive choice for large scale semidefinite cutting plane algorithms.
Section 2 introduces some notation and explains the connection to semidefi-
nite programming. This is followed by a very brief review of important properties
of the maximal eigenvalue function. Section 4 explains the extension of the spec-
tral bundle method to inequality constraints. In Section 5 we discuss efficiency
aspects of the subproblem solution. Section 6 gives computational results.
2 Semidefinite Programs
Let S_n denote the space of symmetric matrices of order n. The inner product
in this space is the usual matrix inner product, ⟨A, B⟩ = tr(BᵀA) for A, B ∈
ℝ^{m×n}. Let S_n^+ denote the set of symmetric positive semidefinite matrices; S_n^+
is a pointed closed convex cone. Except for its apex {0}, a face F of this cone
can be described as
\[ F = \bigl\{ P V P^T : V \in S_r^+ \bigr\}, \]
where P ∈ ℝ^{n×r} is some fixed matrix with orthonormal columns (w.l.o.g.). The
dimension of such a face F is \(\binom{r+1}{2}\). For A, B ∈ S_n, A ⪰ B refers to the Löwner
partial order induced by the cone S_n^+ (A ⪰ B ⟺ A − B ∈ S_n^+).
\[
\text{(P)}\quad
\begin{array}{ll}
\max & \langle C, X\rangle \\
\text{s.t.} & A_{\bar J}(X) = b_{\bar J} \\
& A_{J}(X) \le b_{J} \\
& X \succeq 0,
\end{array}
\qquad\qquad
\text{(D)}\quad
\begin{array}{ll}
\min & \langle b, y\rangle \\
\text{s.t.} & Z = A^T(y) - C \\
& Z \succeq 0,\ y_J \ge 0.
\end{array}
\]
If the columns of P ∈ ℝ^{n×r} form an orthonormal basis of the eigenspace of the maximal eigenvalue of X̂, then the
subdifferential (the set of subgradients) of λ_max(·) at X̂ is
\[
\partial\lambda_{\max}(\hat X) = \bigl\{ P V P^T : \mathrm{tr}(V) = 1,\ V \in S_r^+ \bigr\}
= \mathrm{conv}\bigl\{ v v^T : v^T \hat X v = \lambda_{\max}(\hat X),\ \|v\| = 1 \bigr\}. \tag{2}
\]
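To make the subgradient computation concrete, here is a small sketch (ours, not the paper's code); it assumes the eigenvalue function f(y) = λ_max(C − Aᵀ(y)) + ⟨b, y⟩ with dense constraint matrices A_i, so that A(W)_i = ⟨A_i, W⟩:

```python
import numpy as np

def evaluate(C, A_ops, b, y):
    """Return f(y) = lambda_max(C - A^T(y)) + <b, y>, a subgradient of f at y,
    and the dyad W = vv^T from (2) used to extend the model."""
    M = C - sum(yi * Ai for yi, Ai in zip(y, A_ops))
    eigvals, eigvecs = np.linalg.eigh(M)
    v = eigvecs[:, -1]                              # eigenvector of lambda_max
    W = np.outer(v, v)                              # tr(W) = 1, W in the subdifferential
    g = b - np.array([v @ Ai @ v for Ai in A_ops])  # b - A(W): a subgradient
    return eigvals[-1] + b @ y, g, W
```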
We will specify the choice of Ŵ^k and discuss the computation of f̂^k in detail in
Sect. 5. The usual bundle subproblem involves, for W ∈ Ŵ^k and η ∈ ℝ^m_+, a
Lagrangian of the form L(·; W, η) = f̄_{W,η}(·) + (u/2)‖· − x^k‖²,
where
\[ \bar f_{W,\eta}(\cdot) = \langle C, W\rangle + \bigl\langle b - \eta - A(W),\ \cdot\ \bigr\rangle \]
is an affine function minorizing f̂_S^k = f̂^k + ı_S and hence f_S. The dual function
φ(W, η) = min L(·; W, η), computed by finding
\[ y_{W,\eta} = \mathop{\mathrm{arg\,min}}_y\ \bar f_{W,\eta}(y) + \frac{u}{2}\,\|y - x^k\|^2 \;=\; x^k - \frac{b - \eta - A(W)}{u}, \]
has the form
\[ \varphi(W, \eta) = \langle C, W\rangle + \bigl\langle b - \eta - A(W),\ x^k \bigr\rangle - \frac{1}{2u}\,\bigl\| b - \eta - A(W) \bigr\|^2 . \]
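In the same sketch setting (helper names are ours), y_{W,η} and φ(W, η) follow directly from the formulas above; moreover, since φ is a concave separable quadratic in η, its maximizer over η ≥ 0 is available coordinatewise in closed form:

```python
import numpy as np

def prox_point_and_phi(C, A_ops, b, W, eta, x_k, u):
    AW = np.array([np.trace(Ai @ W) for Ai in A_ops])
    r = b - eta - AW                                # b - eta - A(W)
    y = x_k - r / u                                 # y_{W,eta}
    phi = np.trace(C @ W) + r @ x_k - (r @ r) / (2 * u)
    return y, phi

def eta_max(b, AW, x_k, u, J):
    # argmax of phi(W, .) over eta >= 0 (components in the inequality index
    # set J; the others stay 0): the unconstrained maximizer of the quadratic
    # is eta = b - A(W) - u*x^k, projected onto the nonnegative orthant
    eta = np.zeros_like(b)
    eta[J] = np.maximum(0.0, (b - AW - u * x_k)[J])
    return eta
```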
Denote by η^k the multipliers of the previous iteration. We find an approximate
maximizer (W^{k+1}, η^{k+1}) of φ over Ŵ^k × ℝ^m_+ by first computing
\[ W^{k+1} \in \mathop{\mathrm{Arg\,max}}_{W\in \hat W^k}\ \varphi(W, \eta^k) \tag{6} \]
and then
\[ \eta^{k+1} = \mathop{\mathrm{arg\,max}}_{\eta \ge 0}\ \varphi(W^{k+1}, \eta). \tag{7} \]
At the test point y^{k+1} = y_{W^{k+1},η^{k+1}} the function is evaluated and a new subgradient is computed.
By the observations above, we have f̄^{k+1}(y^{k+1}) = φ(W^{k+1}, η^{k+1}) for f̄^{k+1} := f̄_{W^{k+1},η^{k+1}}.
This and the following lemma motivate a stopping criterion of the form
\[ f(x^k) - \bar f^{\,k+1}(y^{k+1}) \;\le\; \varepsilon_{\mathrm{opt}}\,\bigl(|f(x^k)| + 1\bigr). \]
Lemma 1. f(x^k) ≥ f̄^{k+1}(y^{k+1}), and if f(x^k) = f̄^{k+1}(y^{k+1}) then x^k is optimal.
For practical reasons we state the algorithm with an inner loop allowing for
several repetitions of the two "coordinatewise" maximization steps (6) and (7).
Algorithm 1.
Input: y⁰ ∈ ℝ^m_+, ε_opt ≥ 0, ε_M ∈ (0, ∞], an improvement parameter m_L ∈ (0, ½), a weight u > 0.
1. (Initialization) Set k = 0, x⁰ = y⁰, η⁰ = 0; compute f(x⁰) and Ŵ⁰ (cf. Sect. 5).
2. (Direction finding) Set η⁺ = η^k.
   (a) Find W⁺ ∈ Arg max_{W∈Ŵ^k} φ(W, η⁺).
   (b) Compute η⁺ = arg max_{η≥0} φ(W⁺, η).
   (c) Compute y⁺ = y_{W⁺,η⁺}.
   (d) (Termination) If f(x^k) − f̄_{W⁺,η⁺}(y⁺) ≤ ε_opt(|f(x^k)| + 1) then STOP.
   (e) If f̂^k(y⁺) − f̄_{W⁺,η⁺}(y⁺) > ε_M [f(x^k) − f̄_{W⁺,η⁺}(y⁺)] then go to (a).
   Set η^{k+1} = η⁺, W^{k+1} = W⁺, y^{k+1} = y⁺.
3. (Evaluation) Compute f(y^{k+1}), and
   find W_S^{k+1} ∈ Arg max{⟨C − Aᵀ(y^{k+1}), W⟩ : tr(W) = 1, W ⪰ 0}.
4. (Model updating) Determine Ŵ^{k+1} ⊇ conv{W^{k+1}, W_S^{k+1}} (cf. Sect. 5).
5. (Descent test) If f(x^k) − f(y^{k+1}) ≥ m_L [f(x^k) − f̄^{k+1}(y^{k+1})] then set x^{k+1} =
   y^{k+1} (serious step); otherwise set x^{k+1} = x^k (null step).
6. Increase k by 1 and go to Step 2.
For εM = ∞ the algorithm performs exactly one inner iteration as described
before. This single inner iteration suffices to guarantee convergence.
Theorem 1. Let ε_opt = 0 and ε_M = ∞. If Arg min f_S ≠ ∅, then x^k → x̄ ∈
arg min f_S; otherwise ‖x^k‖ → ∞. In both cases f(x^k) ↓ inf f_S.
The proof is rather technical and we refer the reader to the full version of the
paper for details. To sketch the main idea only, assume that test point y^k has
caused a null step. Then y^k is the (unconstrained) minimizer of the strongly
convex function f̄^k(·) + (u/2)‖· − x^k‖² with modulus u and therefore, since W^k ∈ Ŵ^k,
\[ L\bigl(y_{W^{k+1},\eta^k};\, W^{k+1}, \eta^k\bigr) \;\ge\; L\bigl(y^k;\, W^k, \eta^k\bigr) + \frac{u}{2}\,\bigl\| y_{W^{k+1},\eta^k} - y^k \bigr\|^2 . \]
Likewise we obtain
\[ L\bigl(y^{k+1};\, W^{k+1}, \eta^{k+1}\bigr) \;\ge\; L\bigl(y_{W^{k+1},\eta^k};\, W^{k+1}, \eta^k\bigr) + \frac{u}{2}\,\bigl\| y^{k+1} - y_{W^{k+1},\eta^k} \bigr\|^2 . \]
Using these relations and the fact that the subgradients generated by the algorithm
remain locally bounded, one can arrive at Theorem 1 along a similar line
of arguments as given in [7].
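Putting the pieces together, the following heavily simplified sketch mimics one pass of Algorithm 1 with ε_M = ∞; to stay short it shrinks the bundle Ŵ^k to the single newest dyad, so that step 2(a) becomes trivial instead of solving the quadratic semidefinite subproblem described below. It reuses the helpers evaluate, prox_point_and_phi, and eta_max sketched above:

```python
import numpy as np

def bundle_loop(C, A_ops, b, y0, J, u=1.0, mL=0.1, eps_opt=1e-5, iters=100):
    x = y0.copy()
    f_x, _, W = evaluate(C, A_ops, b, x)
    for _ in range(iters):
        AW = np.array([np.trace(Ai @ W) for Ai in A_ops])
        eta = eta_max(b, AW, x, u, J)                         # step 2(b)
        y, _ = prox_point_and_phi(C, A_ops, b, W, eta, x, u)  # step 2(c)
        fbar_y = np.trace(C @ W) + (b - eta - AW) @ y         # affine model at y
        if f_x - fbar_y <= eps_opt * (abs(f_x) + 1):          # step 2(d): stop
            break
        f_y, _, W_new = evaluate(C, A_ops, b, y)              # step 3: evaluate
        if f_x - f_y >= mL * (f_x - fbar_y):                  # step 5: descent test
            x, f_x = y, f_y                                   # serious step
        W = W_new                                             # crude model update
    return x, f_x
```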
The model sets Ŵ^k are chosen of the form
\[ \hat W^k = \bigl\{ P^k V (P^k)^T + \alpha\, \overline W{}^k \;:\; \mathrm{tr}(V) + \alpha = 1,\ V \in S_r^+,\ \alpha \ge 0 \bigr\}, \]
where P^k ∈ ℝ^{n×r} is some fixed matrix with orthonormal columns and W̄^k ∈ S_n^+
satisfies tr(W̄^k) = 1. P^k should span, at least partially, the eigenspace belonging
to the largest eigenvalues of the matrices C − Aᵀ(y) for y in the vicinity of x^k. W̄^k
serves to aggregate subgradient information that cannot be represented within
the set {P^k V (P^k)ᵀ : tr(V) = 1, V ∈ S_r^+}. The use of W̄^k ensures convergence
of the spectral bundle method even if the number of columns of P^k is restricted
to r = 1. The quality of the semidefinite model as well as the computation time
for solving (6) depend heavily on r.
The special structure of Ŵ^k allows us to evaluate f̂^k directly:
\[ \hat f^k(y) = \max\Bigl\{ \lambda_{\max}\bigl((P^k)^T (C - A^T(y))\, P^k \bigr),\ \bigl\langle \overline W{}^k,\ C - A^T(y) \bigr\rangle \Bigr\} + \langle b, y\rangle . \]
Observe that the eigenvalue computation involved is cheap, since the argument
is an r × r symmetric matrix.
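In the sketch's notation, evaluating the model therefore costs only one eigenvalue computation on an r × r matrix (P, W_agg, and the helper name are ours):

```python
import numpy as np

def fhat(C, A_ops, b, y, P, W_agg):
    # f_hat(y) = max{lambda_max(P^T (C - A^T(y)) P), <W_bar, C - A^T(y)>} + <b, y>
    M = C - sum(yi * Ai for yi, Ai in zip(y, A_ops))
    lam_proj = np.linalg.eigvalsh(P.T @ M @ P)[-1]   # r x r problem: cheap
    return max(lam_proj, float(np.sum(W_agg * M))) + b @ y
```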
By the choice of Ŵ^k, for a fixed η = η^k, (6) is equivalent to the quadratic
semidefinite program
\[
\begin{array}{ll}
\min & \frac{1}{2u}\,\bigl\| b - \eta - A\bigl(P^k V (P^k)^T + \alpha \overline W{}^k\bigr) \bigr\|^2
- \bigl\langle P^k V (P^k)^T + \alpha \overline W{}^k,\ C - A^T(x^k) \bigr\rangle - \bigl\langle b - \eta,\ x^k \bigr\rangle \\[0.5ex]
\text{s.t.} & \mathrm{tr}(V) + \alpha = 1 \\
& V \succeq 0,\ \alpha \ge 0.
\end{array}
\tag{8}
\]
Its optimal solution (V*, α*) gives rise to W^{k+1} = P^k V* (P^k)ᵀ + α* W̄^k and
is also used to update P^k and W̄^k in Step 4 of the algorithm. Let QΛQᵀ be
an eigenvalue decomposition of V*. Then the 'important' part of the spectrum
of W^{k+1} is spanned by the eigenvectors associated with the 'large' eigenvalues
of V*. Thus the eigenvectors of Q are split into two parts Q = [Q₁ Q₂] (with
corresponding spectra Λ₁ and Λ₂), Q₁ containing as columns the eigenvectors
associated to 'large' eigenvalues of V* and Q₂ containing the remaining columns.
P^{k+1} then contains an orthonormal basis of the columns of P^k Q₁ and at least
one eigenvector to the maximal eigenvalue of C − Aᵀ(y^{k+1}) computed in Step 3
of the algorithm. The next aggregate matrix is
\[ \overline W{}^{k+1} = \bigl( \alpha^* \overline W{}^k + P^k Q_2 \Lambda_2 Q_2^T (P^k)^T \bigr) \big/ \bigl( \alpha^* + \mathrm{tr}(\Lambda_2) \bigr). \tag{9} \]
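The subspace and aggregate update just described might look as follows in the sketch (the threshold keep separating 'large' from remaining eigenvalues is our arbitrary choice, and we assume α* + tr(Λ₂) > 0):

```python
import numpy as np

def update_model(P, W_agg, V_star, alpha_star, v_new, keep=5):
    lam, Q = np.linalg.eigh(V_star)                 # eigenvalues of V*, ascending
    keep = min(keep, Q.shape[1])
    Q1, Q2, lam2 = Q[:, -keep:], Q[:, :-keep], lam[:-keep]
    # P^{k+1}: orthonormal basis of [P Q1, newest eigenvector from Step 3]
    P_new, _ = np.linalg.qr(np.column_stack([P @ Q1, v_new]))
    # aggregate update (9)
    PQ2 = P @ Q2
    W_new = alpha_star * W_agg + (PQ2 * lam2) @ PQ2.T
    return P_new, W_new / (alpha_star + lam2.sum())
```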
Problem (8) can be brought into the following form (recall that, for A, B ∈ S_r, ⟨A, B⟩ =
svec(A)ᵀsvec(B) and that tr(V) = ⟨I, V⟩):
\[
\begin{array}{ll}
\min & \frac12
\begin{pmatrix} \mathrm{svec}(V) \\ \alpha \end{pmatrix}^{\!T}
\begin{pmatrix} Q_{11} & q_{12} \\ q_{12}^T & q_{22} \end{pmatrix}
\begin{pmatrix} \mathrm{svec}(V) \\ \alpha \end{pmatrix}
+ \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}^{\!T}
\begin{pmatrix} \mathrm{svec}(V) \\ \alpha \end{pmatrix} \\[1.5ex]
\text{s.t.} & s_I^T\, \mathrm{svec}(V) + \alpha = 1,\quad V \succeq 0,\ \alpha \ge 0,
\end{array}
\tag{10}
\]
where
\[
\begin{aligned}
Q_{11} &= \tfrac1u \sum_{i=1}^m \mathrm{svec}(P^T A_i P)\, \mathrm{svec}(P^T A_i P)^T, \\
q_{12} &= \tfrac1u\, \mathrm{svec}\bigl(P^T A^T(A(\overline W))\, P\bigr), \\
q_{22} &= \tfrac1u\, \bigl\langle A(\overline W), A(\overline W) \bigr\rangle, \\
c_1 &= -a\, \mathrm{svec}\bigl(P^T(\tfrac1u A^T(b - \eta) + C - A^T(y))\, P\bigr), \\
c_2 &= -a\, \bigl( \bigl\langle \tfrac1u (b - \eta) - y,\ A(\overline W) \bigr\rangle + \langle C, \overline W\rangle \bigr), \\
s_I &= \mathrm{svec}(I).
\end{aligned}
\]
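For completeness, a minimal svec satisfying ⟨A, B⟩ = svec(A)ᵀsvec(B) for symmetric A, B (our sketch); the √2 scaling of the off-diagonal entries is exactly what makes this identity hold:

```python
import numpy as np

def svec(A):
    i, j = np.triu_indices(A.shape[0])
    v = A[i, j].astype(float)        # upper triangle, row by row
    v[i != j] *= np.sqrt(2.0)        # off-diagonal entries occur twice in <A, B>
    return v
```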
This problem has \(\binom{r+1}{2} + 1\) variables and can be solved quite efficiently by interior
point methods if r is not too large, say smaller than 25. Since convergence is
guaranteed even for r = 1, it is possible to run this algorithm for problems with
a huge number of constraints m. Several remarks are in order.
First, it is not necessary to have W̄ available as a matrix; it suffices to store
the m-vector A(W̄) and the scalar ⟨C, W̄⟩. These values are easily updated
whenever W̄ is changed.
Second, almost all computations involve the projected matrices PᵀA_iP.
These have to be computed only once for each evaluation of (10), they are
symmetric of size r × r, and only one such projected matrix has to be kept in
memory if the values of Q₁₁ to c₂ are accumulated.
Third, the most expensive operation in computing the cost coefficients is the
accumulation of Q₁₁, which involves the summation of m dyadic products of
vectors of size \(\binom{r+1}{2}\), for a total of O(mr⁴) operations. Even for rather small r
but sufficiently large m, this operation takes longer than solving the reduced
quadratic semidefinite program.
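In the sketch's dense notation, this dominant accumulation is the following loop over the m constraint matrices (reusing the svec above):

```python
import numpy as np

def accumulate_Q11(A_ops, P, u):
    # Q11 = (1/u) * sum_i svec(P^T A_i P) svec(P^T A_i P)^T : O(m r^4) work
    S = np.column_stack([svec(P.T @ Ai @ P) for Ai in A_ops])
    return (S @ S.T) / u
```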
Finally, changes in η do not affect the quadratic cost matrices Q11 to q22 ,
but only the linear cost coefficients c1 and c2 . Within the inner loop it suffices
to update the linear cost coefficients. The dominating cost of computing Q11
can be avoided. If the inner iteration yields only small changes in η then the
optimal solution will also change only slightly. This can be exploited in restarting
strategies.
We solve (10) by the primal-dual interior point code used in [6]. It maintains
primal and dual feasibility throughout (U is the dual variable of V, β is the dual
variable of α, and t is the dual variable of the primal equality constraint).
After a change in η, the restart determines the changes in the dual variables up
to a diagonal shift that can be applied through t. This diagonal shift is chosen
so that the changes in U and β are positive definite. Thus U⁰ (β⁰) is strictly
positive definite but not too far from U* (β*). Experience shows that this
restarting heuristic reduces the number of iterations to between two thirds
and one half, depending on the size of the changes in η.
For the bundle algorithm the diagonal of C is removed. This does not change
the problem, because the diagonal elements of X are fixed to one. The offset
eᵀ(Ae − diag(A))/4 is not used internally in the spectral bundle code. However,
after each serious step we check externally whether the stopping criterion is
satisfied when the offset is added; if so, we terminate the algorithm. As starting
vector y⁰ we choose the zero vector.
For inequality constrained problems our spectral bundle code exhibits an
even stronger tailing-off effect than for equality constrained problems. Solving
the relaxation exactly seems to be out of reach. However, within the first few
iterations the objective value already gets very close to the optimal value. In
cutting plane approaches it does not make sense to solve relaxations to optimality
if much faster progress can be achieved by adding new inequalities. Therefore
the quick initial convergence is a desirable property for these applications.
In this spirit we give, in Table 1, the time needed to get within 1% and 0.1% of the
optimum, both for the spectral bundle code with ε_M = .6 and for the
interior point code. The first column specifies the problem instance; the
second (IP-sol) gives the optimal value of the relaxation with triangle inequalities.
Columns three and five (four and six) display the times needed by the two codes to
get within 1% (0.1%, resp.) of the optimum. (The entries for the interior point
code are estimates, since its output only gives the overall computation time.)
Table 2. Performance of the spectral bundle code for ε_opt = 5·10⁻⁴ and ε_M = .6.

Problem    f*        f(x^k)    (f(x^k)−f*)/f*  total time  eig time [%]  # calls  # serious
G1      12049.33  12055.79   5.4·10⁻⁴     5:10     62.9    15   11
G2      12053.88  12057.64   3.1·10⁻⁴    12:06     53.99   25   16
G3      12048.56  12051.59   2.5·10⁻⁴    13:15     53.21   27   18
G4      12071.59  12078.58   5.8·10⁻⁴     4:26     61.28   15   11
G5      12059.68  12065.88   5.1·10⁻⁴     6:28     56.96   18   13
G6       2615.55   2617.04   5.7·10⁻⁴    26:17     45.21   49   22
G7       2451.61   2453.27   6.8·10⁻⁴    15:38     45.95   34   21
G8       2463.68   2464.86   4.8·10⁻⁴    31:33     46.01   59   22
G9       2493.51   2494.79   4.8·10⁻⁴    28:06     45.37   54   22
G10      2447.79   2449.71   7.8·10⁻⁴    15:20     43.8    35   22
G11       623.49    623.95   7.3·10⁻⁴  1:11:04     68.64   94   30
G12       613.02    613.61   9.6·10⁻⁴    47:42     78.13   90   35
G13       636.45    637.03   9.1·10⁻⁴    39:00     79.74   82   35
G14      3181.35   3182.53   3.7·10⁻⁴    19:46     64.67   43   29
G15      3161.89   3163.24   4.3·10⁻⁴    19:51     66.16   40   27
G16      3164.95   3165.98   3.3·10⁻⁴    29:32     64.73   50   28
G17      3161.78   3163.52   5.5·10⁻⁴    18:48     67.55   41   28
G18      1147.61   1148.25   5.6·10⁻⁴    21:54     55.48   48   31
G19      1064.57   1065.06   4.6·10⁻⁴    22:33     54.99   48   28
G20      1095.46   1095.94   4.4·10⁻⁴    20:05     56.02   46   29
G21      1087.89   1088.72   7.6·10⁻⁴    14:48     59.57   40   29
Table 3. Performance of the spectral bundle code for ε_opt = 5·10⁻⁴ and ε_M = ∞.

Problem    f*        f(x^k)    (f(x^k)−f*)/f*  total time  eig time [%]  # calls  # serious
G1      12049.33  12055.79   5.4·10⁻⁴     5:14     62.42   15   11
G2      12053.88  12057.31   5.4·10⁻⁴    13:13     54.22   27   16
G3      12048.56  12053.27   3.9·10⁻⁴    11:54     54.06   25   17
G4      12071.59  12078.58   5.8·10⁻⁴     4:23     61.98   15   11
G5      12059.68  12065.88   5.1·10⁻⁴     6:26     57.51   18   13
G6       2615.55   2616.89   5.1·10⁻⁴    26:32     45.79   52   24
G7       2451.61   2453.27   6.8·10⁻⁴    15:14     46.50   34   21
G8       2463.68   2464.86   4.7·10⁻⁴    32:02     44.59   59   22
G9       2493.51   2494.69   4.7·10⁻⁴    30:14     46.14   58   23
G10      2447.79   2449.51   7.0·10⁻⁴    14:32     44.84   34   22
G11       623.49    623.89   6.4·10⁻⁴  1:30:05     71.3   113   33
G12       613.02    613.69   1.1·10⁻³    53:17     74.8    95   33
G13       636.45    637.05   9.4·10⁻⁴    43:31     79.1    87   36
G14      3181.35   3182.68   4.1·10⁻⁴    19:23     64.49   43   28
G15      3161.89   3163.18   4.1·10⁻⁴    19:29     66.47   40   27
G16      3164.95   3165.98   3.3·10⁻⁴    28:55     65.42   50   28
G17      3161.78   3163.46   5.3·10⁻⁴    18:35     67.89   41   28
G18      1147.61   1148.28   5.8·10⁻⁴    20:16     58.31   47   30
G19      1064.57   1065.04   4.4·10⁻⁴    23:24     59.69   51   28
G20      1095.46   1095.94   4.3·10⁻⁴    18:39     59.34   46   28
G21      1087.89   1088.63   6.8·10⁻⁴    15:09     61.17   42   29
References