Dynamic Matchings in Convex Bipartite Graphs
1 Introduction
[Figure 1; the interval labels shown in the figure are, left to right: [1, 3] [2, 2] [3, 5] [4, 4] [5, 7] [6, 6] [7, 7] and [1, 1] [2, 2] [1, 3] [4, 4] [3, 5] [6, 6] [5, 7].]
Fig. 1. Sequence of operations inducing a linear number of changes in any explicit representation of a convex matching. Only edges in the maximum matching are shown.
time per operation and support queries asking whether a given edge is contained
in any perfect matching. The same asymptotic update bound is achieved in [17]
for maintaining the size of a maximum matching in general dynamic graphs
(modified by edge insertions and deletions). The running time in both results
depends on the value of the exponent ω of matrix multiplication.
The situation is similar in the case of convex bipartite graphs; it is not hard
to construct examples where a single update changes all the matching edges.
Still, due to the special structure of convex graphs, it is less obvious that a small
change in the input can force a dramatic change in any representation of a maximum
matching. For example, since the values in Y are ordered, a matching
can be described by giving a corresponding order for X (together with a dummy
variable for each unmatched value). To measure the difference between two
convex matchings more accurately we can count the number of inversions between
pairs of successive variables in the two corresponding orderings of X.
Figure 1 shows that even with respect to this measure, the difference between
two matchings after O(1) updates can be Θ(|V |), also in the amortized sense.
Such problematic cases rule out the possibility of maintaining any explicit
representation of a maximum matching in sub-linear time, even for convex bipartite
graphs. On the other hand, it is easy to verify that the sets of matched vertices
in X (and Y ) can change only by one element per update. (This fact holds also
for general graphs modified by edge insertions and deletions, since a matching
M is maximum iff there is no augmenting path relative to M [1].) The structure
we develop is based on the previous observation and on a parallel algorithm of
Dekel and Sahni for convex bipartite matchings [3]. We review their algorithm
in Section 2.1.
Recall that Y is totally ordered and the input graph G = (X∪Y, E) is represented
succinctly by giving an interval [s(x), t(x)], for each x ∈ X; s(x) (t(x)) is the
first (last) vertex in Y adjacent to x. For convenience, we identify the vertices
in Y by their rank in the underlying linear arrangement, and henceforth assume
Y = {1, . . . , m}. We allow G to be modified by a collection U composed of the
following update operations:
· Insert a vertex x into X together with an interval [s(x), t(x)]; make x adjacent
to every y ∈ Y such that s(x) ≤ y ≤ t(x).
· Delete a vertex x from X; remove all edges incident to x.
· Insert a vertex y into Y ; make y adjacent to any x ∈ X such that s(x) <
y < t(x).
· Delete a vertex y from Y if there is no x ∈ X such that s(x) = y or t(x) = y;
remove all edges incident to y.
Clearly, the above operations maintain the convexity of G, and also maintain the
given order of Y . As a result, we do not need to test if G remains convex after
each operation (in fact we are not aware of any efficient dynamic algorithm for
this test), but this also implies that we cannot support arbitrary edge updates.
Restricted edge updates are possible by deleting a variable and re-inserting it
with a different interval. Given a dynamic convex bipartite graph G that is
modified subject to U, we wish to support a collection Q of the following query
operations:
· Given a vertex x ∈ X or y ∈ Y , return the status matched or unmatched of
this vertex.
· Given a matched vertex u (u ∈ X ∪ Y ), return its mate w in the maximum
matching; the pair {u, w} is an edge of the maximum matching.
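The data structure developed in this paper supports exactly these operations. For orientation, the following sketch (plain Python, with names of our choosing, not taken from the paper) shows the naive alternative it improves upon: after any update, a maximum matching of a convex bipartite graph can be recomputed from scratch by the classical greedy rule that scans the values of Y in order and matches each value to an available variable whose interval ends earliest.

import heapq

def maximum_convex_matching(intervals, m):
    """intervals: dict mapping a variable x to its interval (s(x), t(x)),
    with 1 <= s(x) <= t(x) <= m.  Returns a dict mapping each matched x
    to its mate y."""
    by_start = {}
    for x, (s, t) in intervals.items():
        by_start.setdefault(s, []).append(x)
    heap, mate = [], {}
    for y in range(1, m + 1):
        for x in by_start.get(y, []):
            heapq.heappush(heap, (intervals[x][1], x))  # key: the endpoint t(x)
        while heap and heap[0][0] < y:                  # drop intervals that already ended
            heapq.heappop(heap)
        if heap:
            mate[heapq.heappop(heap)[1]] = y            # match y to the earliest-ending x
    return mate

# Example, using the interval labels from the left part of Fig. 1:
# maximum_convex_matching({1: (1, 3), 2: (2, 2), 3: (3, 5), 4: (4, 4),
#                          5: (5, 7), 6: (6, 6), 7: (7, 7)}, 7)
# -> {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7}

The point of Theorems 1 and 2 below is to maintain such a matching under U without ever performing this linear-time rebuild.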
Our results are summarized in the following theorems.
Theorem 1. We can maintain a maximum matching of a convex bipartite graph
G = (X ∪ Y, E), under the update operations in U, in O(log^2 |X|) amortized
time per update and O(|Y| + |X| log |X|) space. A status query can be answered
in constant worst-case time. A pair query for a vertex u can be answered in
O(min{k log^2 |X| + log |X|, |X| log |X|}) worst-case time, where k is the number
of update operations that occurred since the last pair query for u or its mate.
Theorem 2. The amortized running time for a pair query is O(√|X| log^2 |X|).
Also, there is an update sequence which forces our structure to spend Ω(√|X|)
amortized time for each pair query.
Suppose we wish to maintain a greedy matching when all intervals have the
same start-point. That is, the input graph G = (X ∪ Y, E) is such that s(x) = 1
for all x ∈ X, and Y = {1, . . . , m}. We say that an x ∈ X has higher priority
than an x′ ∈ X if t(x) < t(x′). Define ni as the number of variables whose
interval is [1, i]. Also define ai to be the number of matched variables x ∈ X
such that t(x) ≤ i. These quantities satisfy the recurrence ai = min{ai−1 + ni, i},
for 1 ≤ i ≤ m, with a0 = 0. Inserting (deleting) a variable x with domain [1, k]
amounts to incrementing (decrementing) nk. Let n′i (a′i) be the value of ni (ai)
after the update (insertion or deletion of [1, k]).
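As a sanity check on these definitions, the ai can be computed from scratch in one left-to-right pass. The sketch below (a hypothetical helper, not part of the data structure; names are ours) does exactly that, and the structure described next exists precisely to avoid this linear scan after every update.

def greedy_sizes(n, m):
    """n: dict mapping i to n_i, the number of variables with interval [1, i].
    Returns the list a with a[0] = 0 and a[i] = min(a[i-1] + n_i, i)."""
    a = [0] * (m + 1)
    for i in range(1, m + 1):
        a[i] = min(a[i - 1] + n.get(i, 0), i)
    return a

# Example: three variables [1, 2] and one variable [1, 4], with m = 4:
# greedy_sizes({2: 3, 4: 1}, 4)  ->  [0, 0, 2, 2, 3]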
Consider the insertion of the variable [1, k]. For the given k we have n′k =
nk + 1 and, for i ≠ k, n′i = ni. Let j be the smallest integer such that j ≥ k and
aj = j, if such a j exists, and let j = m + 1 otherwise. If j = k then the first
k values are all matched with variables that have the same priority as x or higher,
and nothing changes. Otherwise, ak < k and the new variable will be matched;
this happens because we always maintain a greedy matching and [1, k] has higher
priority than any interval that ends after k. Thus, a′i = ai + 1 if k ≤ i < j, and
a′i = ai otherwise. The remaining update operations are analyzed similarly. In
every update operation we are interested in the quantities bj = j − aj − nj+1
for all j ∈ Y. When we insert a variable with domain [1, k], we need to locate
the first j ≥ k with aj = j. Then, by the definition of aj, we have aj−1 + nj ≥ j,
so j′ − aj′ − nj′+1 ≤ −1 for j′ = j − 1. Similar inequalities are derived for the
remaining update operations.
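Written out on the explicit array a, the insertion case above reads as follows (again a naive helper for illustration only; the tree described next performs the same change implicitly).

def insert_variable_naive(n, a, k):
    """Insert a variable with domain [1, k]: increment n_k, find the first
    j >= k with a_j = j (j = m + 1 if none exists), and add one to a_i for
    k <= i < j; nothing else changes."""
    m = len(a) - 1
    n[k] = n.get(k, 0) + 1
    j = next((i for i in range(k, m + 1) if a[i] == i), m + 1)
    for i in range(k, j):
        a[i] += 1
    return j   # j == k means the set of matched values is unchanged

# Continuing the previous example: after insert_variable_naive({2: 3, 4: 1}, a, 3)
# the array a = [0, 0, 2, 2, 3] becomes [0, 0, 2, 3, 4].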
We construct a data structure that maintains the bi quantities implicitly.
This implicit representation is necessary because a single update can change
many bi ’s. The data structure consists of a balanced binary search tree T (e.g.,
a red-black tree [8]) over Y with each leaf corresponding to a value. Leaf i stores
ni and a number b̂i ; an internal node v stores the numbers add(v) and min(v).
We define these quantities later. Given a node or leaf u and an ancestor v of u in
T , define sum(u, v) to be the sum of add(w) for all nodes w on the tree path from
u to v, excluding v. Given a node or leaf u, define sum(u) to be sum(u, root),
where root is the root of T . Let leaves(v) be the set of leaves contained in the
subtree rooted at v. The numbers stored in the tree will satisfy the invariants:
(a) bj = b̂j + sum(j), and (b) min(v) = min{b̂j + sum(j, v) | j ∈ leaves(v)}.
Combining (a) and (b) we get min{bj | j ∈ leaves(v)} = min{b̂j + sum(j, v) | j ∈
leaves(v)} + sum(v) = min(v) + sum(v). Initially add(v) = 0 for all nodes v ∈ T .
As T is modified after an update operation the values add(v) keep track of the
changes that have to be applied to the affected bj values. By its definition add(v)
affects the bj values of all the leaves j ∈ leaves(v). Also, b̂j is simply the value
of bj before the update. We note that the numbers stored at the internal nodes
are easy to maintain in constant time for each restructuring operation of the
balanced tree (e.g., rotation).
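Two consequences of invariants (a) and (b) are used repeatedly below: min(v) can be recomputed from the two children of v in constant time (this is what makes the restructuring operations cheap), and carrying sum(v) while descending from the root yields min(v) + sum(v) = min{bj | j ∈ leaves(v)}. A minimal sketch follows, with field and function names of our choosing; for uniformity the sketch gives each leaf add = 0 and min equal to its b̂ value, so one formula covers both leaf and internal children.

from dataclasses import dataclass
from typing import Optional

@dataclass
class TNode:                      # field names are ours, not the paper's
    add: int = 0                  # add(v); kept at 0 for a leaf
    mn: int = 0                   # min(v); for a leaf this is its b-hat value
    left: Optional["TNode"] = None
    right: Optional["TNode"] = None

def pull(v: TNode) -> None:
    """Recompute min(v) from its two children in O(1), e.g. after a rotation:
    by invariant (b), min(v) = min over children c of (min(c) + add(c))."""
    v.mn = min(v.left.mn + v.left.add, v.right.mn + v.right.add)

def subtree_min(v: TNode, sum_v: int) -> int:
    """Given sum(v), accumulated while descending from the root, return
    min{ b_j : j in leaves(v) } = min(v) + sum(v), by invariants (a) and (b)."""
    return v.mn + sum_v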
We sketch how T is updated when we insert a variable with domain [1, k].
After locating leaf k we increment nk . Since this affects bk−1 , we increment b̂k−1
and update min(v) for the ancestors v of k − 1 in T bottom-up. As we ascend
the path toward the root we calculate sum(k − 1, v) at each node v, which takes
constant time per node. Next we locate the first leaf j ≥ k such that bj < 0. To
that end we perform a top-down search on the appropriate subtree of T . Note
that as we descend the tree we can compute sum(v) at each node v that we
visit, and thus compute min(v) + sum(v) (in constant time per node). So, let
P = (v0 = root, v1, . . . , k) be the search path for k in T. Let P′ = (u1, u2, . . .) be
the subsequence of P composed of the nodes vi such that vi+1 is the left child
of vi; let ri′ be the right child of ui′. Then j is located in the subtree rooted
at ri∗, where i∗ is the largest index i′ that satisfies min(ri′) + sum(ri′) < 0. We
can locate ri∗ in O(log m) time. Then the search for j proceeds top-down in
the subtree of ri∗ as follows: Let v be the current node, and let u be its left
child. If min(u) + sum(u) < 0 then the search continues in the subtree of u.
Otherwise, we move to the right child of v. The last step is to subtract one from
bi, for k ≤ i ≤ j. Since the number of involved leaves may be large, we perform the
decrements implicitly using the add quantities. Let η be the nearest common
ancestor of k and j in T ; we can locate η easily in O(log m) time since T is
balanced. Let Pk be the path from η to k, excluding the end-points, and let Pj
be the path from η to j excluding the end-points. Let v be a node on Pk such
that its right child u is not on Pk . If u is a leaf then decrement b̂u ; otherwise
decrement add(u). Perform the symmetric steps for Pj (visiting the left children
of nodes on this path). Finally, decrement b̂k and b̂j and update min for all
O(log m) nodes on Pk , Pj and on the path from root to η; each update takes
constant time if we process the paths bottom-up.
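The walkthrough above needs three primitives on T: changing a stored b̂ at a single leaf, locating the first leaf j ≥ k whose (implicitly represented) bj is negative, and adding −1 to every bi in a contiguous range. The sketch below supports these on a static complete tree over Y; an array-based tree with lazy additions stands in for the red-black tree, so rebalancing, the ni counters at the leaves, and the exact index bookkeeping of the insertion are deliberately omitted, and all names are ours.

class ImplicitB:
    """Maintains b_1, ..., b_m under range additions and answers
    'first j >= k with b_j < 0' by a top-down descent that accumulates
    the pending additions (the role played by sum(v) in the text).
    A single-leaf change to b-hat_i is range_add(i, i, delta)."""

    def __init__(self, b):
        self.m = len(b)
        self.size = 1
        while self.size < self.m:
            self.size *= 2
        INF = float("inf")
        self.mn = [INF] * (2 * self.size)   # mn[v]: min over the subtree, adds above v excluded
        self.add = [0] * (2 * self.size)    # add[v]: pending addition for every leaf below v
        for j, val in enumerate(b):
            self.mn[self.size + j] = val
        for v in range(self.size - 1, 0, -1):
            self.mn[v] = min(self.mn[2 * v], self.mn[2 * v + 1])

    def range_add(self, l, r, delta, v=1, lo=1, hi=None):
        """Add delta to b_j for all l <= j <= r (lazily)."""
        hi = self.size if hi is None else hi
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            self.mn[v] += delta
            if v < self.size:               # internal node: defer to the descendants
                self.add[v] += delta
            return
        mid = (lo + hi) // 2
        self.range_add(l, r, delta, 2 * v, lo, mid)
        self.range_add(l, r, delta, 2 * v + 1, mid + 1, hi)
        self.mn[v] = min(self.mn[2 * v], self.mn[2 * v + 1]) + self.add[v]

    def first_negative(self, k, v=1, lo=1, hi=None, above=0):
        """Smallest leaf j >= k with b_j < 0, or None; `above` plays the role of sum(v)."""
        hi = self.size if hi is None else hi
        if hi < k or self.mn[v] + above >= 0:
            return None
        if lo == hi:
            return lo
        mid = (lo + hi) // 2
        above += self.add[v]
        left = self.first_negative(k, 2 * v, lo, mid, above)
        return left if left is not None else self.first_negative(k, 2 * v + 1, mid + 1, hi, above)

# Toy usage of the two nontrivial primitives:
T = ImplicitB([1, 2, 3, 4, 5, 6, 7])   # some initial b values
T.range_add(3, 5, -4)                  # b becomes 1, 2, -1, 0, 1, 6, 7
assert T.first_negative(2) == 3        # leftmost j >= 2 with b_j < 0
assert T.first_negative(4) is None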
Notice that by storing at each leaf i the variables x with t(x) = i, we can
locate a variable that is replaced in the maximum matching by the newly inserted
variable. This is a variable stored in the leaf j that satisfies aj = j and j ≥ k.
Similarly, after a deletion we can locate a variable that replaces the deleted one
in the maximum matching. We will use this capability in the next section. Also,
we note that by grouping successive empty leaves (i.e., leaves j with nj = 0) we
can reduce the number of leaves in T to min{m, n}. Thus, we get:
Lemma 1. Updates of variables and values in the equal start-points case take
worst-case O(log n) time.
Lower bound. We begin with T being a full binary tree with n leaves, where
each leaf stores a unique variable: the ith leaf stores a variable with domain
[i, n]. For simplicity we take m = n. Next we insert √n variables with domains
[1, j], j = 1, . . . , √n. These variables will shift the matching by √n positions,
and the last √n variables will become infeasible. Consider the √n nodes of T
at height log √n. For each such node Pi we make a pair query for a leaf in Pi's
subtree. The corresponding query paths intersect only above the Pi's. To get
the claimed bound it suffices to count the work done at each Pi. When the pair
query algorithm visits Pi, update(Pi) will have Θ(√n) variables, so it will spend
Ω(√n) time to process these variables. Hence the total work for all nodes at
height log √n is Ω(n). By deleting the variables [1, j] and reinserting them, we
can repeat this process √n times. This implies an amortized cost of Ω(√n) for
a pair query.
Upper bound. For simplicity, we consider T to be a full binary tree with n leaves,
each storing a single variable. Informally, having more variables in some leaves
cannot increase the asymptotic cost bound, because the height of the tree decreases
and each pair query updates the matching for more than one variable.
We analyze the worst case running time of a sequence of operations composed of ν
updates, followed by µ pair queries. This will give us the worst-case scenario, be-
cause the total cost for querying a set of paths is maximized after all updates are
performed. At any node P, processing update(P) takes O(2^i log n) if the height
i of P is at most log ν, and O(ν log n) for i > log ν. We calculate separately
the cost for processing the nodes at height above log ν (top subtree) and below
log ν (bottom subtrees). First, for the top subtree of T we observe that each pair
query visits log(n/ν) of its nodes and at each node it spends at most O(ν log n)
time. Hence, if µ ≤ n/ν, the total cost is O(µν log(n/ν) log n) = O(µν log^2 n),
so the amortized cost is O(µν/(ν + µ) log^2 n) = O(√n log^2 n). For µ > n/ν, the total
cost is O(n log n + µ log n), because the top subtree has O(n/ν) nodes. So, in
this case, the amortized cost is O(n/(ν + µ) log n + µ/(ν + µ) log n) =
O(nν/(ν^2 + n) log n + log n), which is O(√n log n). We now turn to the bottom
subtrees. The contribution of a path inside a bottom subtree is O(ν log n). Hence,
if µ ≤ n/ν, the total cost is O(µν log n), which gives O(√n log n) amortized cost.
The total contribution of a bottom subtree to all the pair queries is O(ν log n),
because the total number of variables in all update lists is O(ν). Since we have
O(n/ν) bottom subtrees, for µ > n/ν the total cost is O(n log n + µ log n), i.e.,
O(√n log n) amortized cost.
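For completeness, the elementary inequalities used to simplify the two amortized bounds above can be written out as follows; the first chain assumes µν ≤ n, which holds in the case µ ≤ n/ν, and the second uses ν^2 + n ≥ 2ν√n (AM-GM):

\[
  \frac{\mu\nu}{\nu+\mu} \;\le\; \min\{\mu,\nu\} \;\le\; \sqrt{\mu\nu} \;\le\; \sqrt{n},
  \qquad
  \frac{n\nu}{\nu^{2}+n} \;\le\; \frac{n\nu}{2\nu\sqrt{n}} \;=\; \frac{\sqrt{n}}{2}.
\]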