Trees and Optimization
Proof: The first statement holds because the endpoints of a maximal nontrivial path in an acyclic graph have degree 1. Next, since a cut-vertex v of a connected graph G has a neighbor in each component of G − v, a leaf cannot be a cut-vertex. Thus when v is a leaf of a tree G, the graph G − v is connected and acyclic.
PROPERTIES OF TREES
Trees have many equivalent definitions. Verifying any one shows
that a graph is a tree, and then the others are available for use.
1.1.3. THEOREM. For a graph G on n vertices, the following properties are equivalent (and define the class of trees on n vertices).
A) G is connected and has no cycles.
B) G is connected and has n − 1 edges.
C) G has no cycles and has n − 1 edges.
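These conditions are easy to test computationally. The following sketch (our illustration, not part of the text; the graph is assumed given as an adjacency-list dictionary) checks property B:

from collections import deque

def is_connected(adj):
    # BFS from an arbitrary vertex; G is connected iff every vertex is reached
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

def edge_count(adj):
    # each edge is stored twice in an adjacency list
    return sum(len(nbrs) for nbrs in adj.values()) // 2

def satisfies_B(adj):
    # property B: connected with n - 1 edges
    return is_connected(adj) and edge_count(adj) == len(adj) - 1

path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}   # a path, hence a tree
print(satisfies_B(path))                        # True

By the theorem, a graph passing this test also has no cycles.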
Since edges of cycles are not cut-edges, we can delete edges from a
connected graph to obtain a spanning tree. We next use two properties of
trees to prove a result about spanning trees: 1) Since a tree has no cycles,
every edge is a cut-edge. 2) Since a tree has a unique connecting path for
each vertex pair, adding any edge creates exactly one cycle. We use subtraction and addition to indicate deletion and inclusion of single edges.
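In code, obtaining a spanning tree this way amounts to keeping each edge whose endpoints are not yet connected and discarding one edge of every cycle. A minimal sketch (ours, using union-find; names are illustrative):

def spanning_tree(vertices, edges):
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    tree = []
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:                        # edge joins two components: keep it
            parent[ra] = rb
            tree.append((a, b))
    return tree                             # a spanning tree if G is connected

print(spanning_tree(range(4), [(0, 1), (1, 2), (2, 0), (2, 3)]))
# [(0, 1), (1, 2), (2, 3)] -- the edge (2, 0) completed a cycle and was dropped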
1.1.5. PROPOSITION. If T and T′ are two spanning trees of a connected graph G and e ∈ E(T) − E(T′), then there exists e′ ∈ E(T′) − E(T) such that T − e + e′ and T′ + e − e′ are both spanning trees of G.
Proof: The figure below illustrates T and T′ sharing two edges, with E(T) bold and E(T′) solid. The specified edge e is a cut-edge of T; let U and U′ be the vertex sets of the components of T − e. Adding e to T′ creates a unique cycle C. The path C − e contained in T′ has its endpoints in U and U′, so it has an edge e′ with endpoints in U and U′.
Since e is the only edge of T joining U and U′, we have e′ ∈ E(T′) − E(T). Since e′ joins the components of T − e, the graph T − e + e′ is a spanning tree. Since e′ is on the path from U to U′ in T′, it lies on the unique cycle formed by adding e to T′. Hence T′ + e − e′ also is a spanning tree.
[Figure: spanning trees T and T′ of G, with e ∈ E(T) shown bold and e′ ∈ E(T′) shown solid.]
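The exchange in Proposition 1.1.5 can be carried out directly. A sketch (our illustration; trees are given as edge lists): delete e from T, record the vertex set U of one resulting component, and walk the T′-path joining the endpoints of e until an edge crosses between U and U′.

from collections import deque

def adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj

def tree_path(edges, s, t):
    # the unique s,t-path in a tree, as a list of edges
    adj, parent, queue = adjacency(edges), {s: None}, deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path, v = [], t
    while parent[v] is not None:
        path.append((parent[v], v))
        v = parent[v]
    return path

def exchange_edge(T, Tprime, e):
    rest = [f for f in T if set(f) != set(e)]
    adj = adjacency(rest)
    U, queue = {e[0]}, deque([e[0]])
    while queue:                                # component of T - e containing e[0]
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in U:
                U.add(v)
                queue.append(v)
    for a, b in tree_path(Tprime, e[0], e[1]):  # the path C - e lies in T'
        if (a in U) != (b in U):
            return (a, b)                       # e' with endpoints in U and U'

T, Tp = [(1, 2), (2, 3), (3, 4)], [(1, 3), (1, 4), (2, 4)]
print(exchange_edge(T, Tp, (2, 3)))             # (1, 3)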
The graph below has five blocks: three copies of K2, one copy of K3, and one subgraph that is neither a cycle nor complete.
[Figure: a graph G with vertices a through j and x, and its five blocks b1, ..., b5.]
Next we seek a spanning tree with the most leaves. When our graph models a communication network, we seek the smallest set of vertices to protect so that all surviving vertices after an attack can communicate. The non-leaf vertices in a spanning tree form a set S such that G[S] is connected and every vertex outside S is adjacent to some vertex of S. Such a set is called a connected dominating set.
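Minimizing a connected dominating set is thus the same problem: l(G) = n − min |S|. A brute-force sketch for small graphs (our illustration; exponential time):

from collections import deque
from itertools import combinations

def is_cds(adj, S):
    # G[S] connected and every vertex outside S has a neighbor in S
    S = set(S)
    start = next(iter(S))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in S and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == S and all(v in S or any(u in S for u in adj[v]) for v in adj)

def max_leaves(adj):
    n = len(adj)
    for k in range(1, n + 1):
        for S in combinations(adj, k):
            if is_cds(adj, S):
                return n - k      # leaves in a best spanning tree

cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(max_leaves(cycle5))         # 2: a spanning path of C5 has two leaves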
For k = 3, there are several proofs that the construction in Example 1.1.14 is optimal. For k = 4, the optimal bound l(n, 4) ≥ (2/5)n + 8/5 was proved in Griggs–Wu [1990] and in Kleitman–West [1991] (two small graphs have no tree with (2/5)n + 2 leaves). Griggs–Wu [1990] also proved that l(n, 5) = n/2 + 2. The proofs are algorithmic, constructing a tree with at least this many leaves. We present a proof for k = 3.

1.1.15. THEOREM. (Linial–Sturtevant [unpub.], Griggs–Wu [1990], Kleitman–West [1991]) Every connected N-vertex graph G with δ(G) ≥ 3 has a spanning tree with at least N/4 + 2 leaves.
Proof: We provide an algorithm to grow such a tree. Let T denote the current tree, with n vertices and l leaves. If x is a leaf of T, then the external degree of x, denoted d(x), is |N_G(x) − V(T)|. The operation of expansion at x consists of adding to T the d(x) edges from x to N_G(x) − V(T). We grow T by operations, where each operation consists of some number of expansions. Note that expansion preserves the property that all edges from T to G − V(T) are incident to leaves of T.

A leaf x of T with d(x) = 0 is dead; no expansion is possible at a dead leaf, and it remains a leaf in the final tree. Let m be the number of dead leaves in T. An expansion that makes y a dead leaf kills y. We call an operation admissible if its effect on T satisfies the augmentation inequality 3Δl + Δm ≥ Δn, where Δl, Δm, and Δn denote the changes in the numbers of leaves, dead leaves, and vertices in T, respectively.
We grow a spanning tree by admissible operations. If G is not 3-regular, we begin with the edges at a vertex of maximum degree. If G is 3-regular and every edge belongs to a triangle, then G = K4, and the claim holds. Otherwise, G is 3-regular and has an edge in no triangle; in this case we begin with such an edge and the four edges incident to it.
If T is grown to a spanning tree with L leaves by admissible operations, then all leaves eventually die. The final operation will kill at least two leaves not counted by the augmentation inequality, so the total of Δm from the augmentation inequalities for the operations will be at most L − 2. We begin with 4 leaves and 6 vertices if G is 3-regular; otherwise with r leaves and r + 1 vertices for some r > 3. Summing the augmentation inequalities over all operations yields 3(L − 4) + (L − 2) ≥ N − 6 if G is 3-regular and 3(L − r) + (L − 2) ≥ N − r − 1 otherwise. These simplify to 4L ≥ N + 8 and 4L ≥ N + 2r + 1 ≥ N + 9, respectively, which yield L ≥ N/4 + 2.
It remains to present admissible operations that can be applied until
T absorbs all vertices and to show that the last operation kills two extra
leaves. We use the three operations shown below, applying O2 only when
O1 is not available.
[Figure: the operations O1, O2, and O3, each expanding the tree at a leaf x.]
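The flavor of the procedure can be seen in a simplified greedy sketch (ours; it always expands a leaf of maximum external degree and omits the case analysis behind O1, O2, O3, so it does not by itself guarantee the N/4 + 2 bound):

def grow_leafy_tree(adj):
    root = max(adj, key=lambda v: len(adj[v]))   # start at maximum degree
    in_tree = {root} | set(adj[root])
    edges = [(root, v) for v in adj[root]]
    fringe = set(adj[root])                      # the current leaves
    while len(in_tree) < len(adj):
        # expand the leaf with the most neighbors outside the tree
        x = max(fringe, key=lambda v: len(set(adj[v]) - in_tree))
        new = set(adj[x]) - in_tree
        edges += [(x, v) for v in new]
        in_tree |= new
        fringe.remove(x)                         # x becomes internal
        fringe |= new
    return edges, fringe                         # fringe holds the leaves

cube = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}   # the 3-regular cube Q3
edges, leaves = grow_leafy_tree(cube)
print(len(leaves))   # 4 here, matching N/4 + 2 for N = 8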
≥ (1 − 1/(1 + r))N > (1 − 1/(1 + r0))N,
external degree i. Operation Pi consists of expansion at a leaf with external degree i plus expansion at one of those new vertices (see figure below). When the maximum external degree is i, we perform Pi at a vertex x with d(x) = i if the number of vertices introduced by the second expansion is at least 2r − i. If no such choice is available, then we perform Oi. This procedure always chooses some operation, and we grow a spanning tree. The operations will be admissible because (1) Pi provides enough leaves, and (2) when Pi is not available, Oi increases deadness by enough.

When the second expansion in Pi introduces s leaves, Δn = s + i and Δl = s + i − 2, since x and the leaf chosen for the second expansion stop being leaves. We need rΔl + Δm ≥ (r − 1)Δn, that is, r(s + i − 2) + Δm ≥ (r − 1)(s + i), and always Δm ≥ 0. Hence s + i − 2r ≥ 0 suffices, and this holds since Pi is used only when s ≥ 2r − i.
[Figure: operations Oi and Pi applied at a leaf of external degree i; in Pi the second expansion introduces s ≥ 2r − i new vertices.]
suffices, so we set r = (k − (1 + b) ln k)/((1 + 2b) ln k). Now the augmentation inequality holds for Oi.

With these choices, we have proved that l(n, k) ≥ (1 − 1/(1 + r0))n. Since

1/(1 + r0) = ((1 + 2b) ln k)/(k + b ln k) = ((1 + 2b) ln k/k)(1 + O(ln k/k)),

choosing b < ε/3 yields the desired lower bound when k is sufficiently large.
Note that for small ε, the proof of Theorem 1.1.16 sets r asymptotically to about εk, but it still requires r ≥ 2, so roughly k > 2/ε is needed. Caro–West–Yuster [2000] also gave a probabilistic algorithm for connected domination that does as well as Theorem 1.1.16. For each fixed k with k ≥ 6, the exact value of l(n, k) in terms of n remains unknown.
The expected length is Σ pi li = 90/33 < 3; the expected length of a code using the eight words of length 3 would be 3.
[Figure: Huffman tree for the frequencies 5, 1, 1, 7, 8, 2, 3, 6 (total weight 33), yielding the codewords 8:11, 7:01, 6:101, 5:100, 3:001, 2:0001, 1:00000, 1:00001.]
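A standard heap-based implementation of Huffman's procedure (an illustrative sketch; function names are ours) reproduces the expected length computed above:

import heapq
from itertools import count

def huffman(freqs):
    # repeatedly merge the two least likely items
    tick = count()   # tie-breaker so the heap never compares subtrees
    heap = [(w, next(tick), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tick), (t1, t2)))
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix or "0"
    walk(heap[0][2], "")
    return code

freqs = dict(zip("abcdefgh", (5, 1, 1, 7, 8, 2, 3, 6)))
code = huffman(freqs)
avg = sum(freqs[s] * len(code[s]) for s in freqs) / sum(freqs.values())
print(avg)   # 90/33, about 2.727, as in the example above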
Huffman's algorithm computes an optimal code, but how does it compare to a balanced tree in which every codeword has length ⌊lg n⌋ or ⌈lg n⌉? If n = 2^k and the words are equally likely, then the balanced tree with all leaves at depth k is optimal, as produced by the algorithm. With pi = 1/n, the resulting expected length is Σ kpi = −Σ pi lg pi; the latter quantity is called the entropy of the discrete probability distribution {pi}. The formula is no coincidence; entropy is a lower bound on the expected length. This holds for all codes with binary codewords, not just prefix-free codes.
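For the distribution above, the entropy can be computed directly (illustrative sketch):

from math import log2

def entropy(ps):
    # -sum of p lg p, the bound in the next theorem
    return -sum(p * log2(p) for p in ps if p > 0)

ps = [f / 33 for f in (5, 1, 1, 7, 8, 2, 3, 6)]
print(entropy(ps))   # about 2.69, just below the Huffman expected length 90/33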
1.1.22. THEOREM. (Shannon) For every probability distribution {pi} on n messages and every binary code for these messages, the expected length of codewords is at least −Σ pi lg pi.
Proof: Let W0 and W1 be the sets of messages whose codewords begin with 0 and 1, respectively, and let q0 = Σ_{i∈W0} pi and q1 = Σ_{j∈W1} pj. Deleting the first bit leaves codes for W0 and W1, whose conditional distributions are {pi/q0} and {pj/q1}. By induction on the number of messages, the expected length is at least

1 + q0 Σ_{i∈W0} (pi/q0) lg(q0/pi) + q1 Σ_{j∈W1} (pj/q1) lg(q1/pj)
= 1 − Σ_{i∈W0} pi (lg pi − lg q0) − Σ_{j∈W1} pj (lg pj − lg q1)
= 1 + q0 lg q0 + q1 lg q1 − Σ_i pi lg pi.

Since q0 + q1 = 1, we have q0 lg q0 + q1 lg q1 ≥ −1, and the desired bound follows.
When the probabilities vary greatly, Huffman coding is much more efficient than a code with words of equal length. The length of a computer file coded for compactness may be only half its length under ASCII coding, which assigns 8 bits per character. Coding individual files accentuates this; the distribution of characters in a program source file may be quite different from that of a document or another program.
However, we may want the codewords in the same order as the message words. This makes searching easy, because we can store at each internal vertex a word between the largest message word at a leaf of the left
subtree and the smallest message word at a leaf of the right subtree. The
expected length will be longer than in Huffman coding, but these alphabetic prefix-free codes can be easier to use while almost as efficient.
Since the items must appear at leaves in left-to-right order, the leaves in any subtree get one of the (n choose 2) sets of consecutive messages. The final merge to complete an optimal alphabetic tree must combine an optimal tree for the first k leaves and an optimal tree for the last n − k leaves, for some k. To choose the best k, we solve the subproblems for all consecutive segments. This algorithmic technique of solving all subproblems is called dynamic programming.
1.1.23. ALGORITHM. (Optimal Alphabetic Trees)
Input: Frequencies p1, ..., pn and a fixed left-to-right ordering of the leaves.
Initialization: Set c(S) = 0 for each singleton leaf set S.
Iteration: For i from 2 through n, compute a cost c(S) for each segment S of i consecutive leaves. With Sk being the first k of these and S′k = S − Sk, the cost is c(S) = Σ_{j∈S} pj + min_k [c(Sk) + c(S′k)].
1.1.24. THEOREM. Algorithm 1.1.23 computes optimal alphabetic trees in time O(n³).
Proof: When two adjoining segments are merged, the search path to each leaf lengthens by 1. This explains the combining cost Σ_{j∈S} pj in the algorithm. Since an optimal tree must merge optimal subtrees, induction on i proves that the algorithm finds an optimal tree.

A separate dynamic program computes the (n choose 2) combining costs Σ_{j∈S} pj in advance, in increasing order of |S|, using constant time per computation. The algorithm then computes a potential combination for each choice of two adjoining segments, in increasing order of the size of the union. Such a pair is specified by the start and end of the first segment (possibly equal) and the end of the second segment. The algorithm thus performs (n choose 3) + (n choose 2) constant-time computations to find all values c(S), always keeping the best value found among the |S| − 1 candidates for c(S).
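Algorithm 1.1.23 translates directly into code; the sketch below (ours) returns only the optimal cost, although recording the minimizing k for each segment would also yield the tree:

def alphabetic_cost(p):
    # c[i][j] = optimal cost for the consecutive leaves p[i..j]
    n = len(p)
    pre = [0]
    for w in p:              # prefix sums give each combining cost in O(1)
        pre.append(pre[-1] + w)
    seg = lambda i, j: pre[j + 1] - pre[i]
    c = [[0] * n for _ in range(n)]
    for size in range(2, n + 1):          # segments in increasing size
        for i in range(n - size + 1):
            j = i + size - 1
            c[i][j] = seg(i, j) + min(c[i][k] + c[k + 1][j] for k in range(i, j))
    return c[0][n - 1]

print(alphabetic_cost([5, 1, 1, 7, 8, 2, 3, 6]))
# the optimal alphabetic cost; at least 90, the unrestricted Huffman cost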
depths. If the new item replaces its left child in the list, and the algorithm breaks ties by merging the leftmost (lexicographically least) compatible pair, then the merged pair always consists of two consecutive items. This produces a fast implementation (Garsia–Wachs [1977]).

The proof of correctness is surprisingly hard (it may be skipped without loss of continuity). Hu [1973] shortened the original Hu–Tucker [1971] proof. We follow the still shorter proof in Hu–Kleitman–Tamaki [1979], which permits a more general optimization criterion.
The proof has two main steps. Feasibility: The depths resulting from Step 2 are realizable by an alphabetic tree. Optimality: The tree resulting from Step 1 has minimum cost among a class of trees that includes all alphabetic trees. The final alphabetic tree is then also optimal, because Σ pi li does not change when we rearrange the tree to make it alphabetic without changing the depths of the leaves.

The proof of feasibility requires technical lemmas about the weights of nodes and their order of formation. As Step 1 proceeds, the items in a segment bounded by noncrossable nodes (including the boundary nodes) are pairwise compatible. When a noncrossable item is merged, the two compatibility sets involving it combine. Therefore, if u and v are compatible and both are crossable, then they are compatible with the same set of nodes as long as they both exist.
Let T be the tree produced by Step 1 of the algorithm. We write w(u) for the weight of a node u in T, with w(u) ≺ w(v) meaning that w(u) < w(v) or that w(u) = w(v) with u to the left of v in the list. Now no two nodes or pairs of nodes are considered equal in weight. A compatible pair (u, v) in T is a locally minimal compatible pair (LMCP) if w(v) ≺ w(x) for every other node x compatible with u and w(u) ≺ w(y) for every other node y compatible with v. The algorithm chooses an LMCP. Also, each node belongs to at most one LMCP, so the current LMCPs are pairwise disjoint.
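For intuition, a small sketch (ours) finds the LMCPs in a list of weights, with compatibility simplified to adjacency in the list (the actual definition also lets pairs see across crossable nodes); ties are broken by position, as in the text:

def lmcps(w):
    # (i, i+1) is an LMCP when each is the best partner of the other
    def precedes(i, j):                  # strict order: weight, then position
        return (w[i], i) < (w[j], j)
    pairs = []
    for i in range(len(w) - 1):
        left_ok = i == 0 or precedes(i + 1, i - 1)
        right_ok = i + 2 >= len(w) or precedes(i, i + 2)
        if left_ok and right_ok:
            pairs.append((i, i + 1))
    return pairs

print(lmcps([4, 1, 2, 9, 5, 3]))   # [(1, 2), (4, 5)] -- LMCPs are disjoint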
1.1.27.* LEMMA. Merging an LMCP {x, y} cannot decrease the weight of the lightest node compatible with another node v. In particular, merging an LMCP preserves other LMCPs.

Proof: Let a be the weight of the lightest node compatible with v before merging x and y. By symmetry, we may assume that y is not between v and x. If v becomes compatible with a node of weight less than a, then the merge must make some node compatible with v that was not compatible before. Hence x or y is noncrossable, and there is no noncrossable node between v and x. This requires v and x to be compatible, so a ≼ w(x).

Suppose u becomes newly compatible with v when x and y merge. If u is compatible with y before the merge, then local minimality implies w(x) ≺ w(u), so a ≺ w(u). If u is not compatible with y before the merge,
Proof: Let a and d be the children of v, and let b and c be the children of u, listed in each case from left to right. The possible orderings of these in the node list are (a, b, c, d), (b, a, c, d), and (a, b, d, c). If a is compatible with b and c is compatible with d when u is formed, then local minimality in the formation of u implies w(c) ≺ w(a) and w(b) ≺ w(d), which in turn implies w(u) ≺ w(v).

Otherwise, we apply Lemma 1.1.29 with {x, y} = {b, c} and use induction on the number of merges that cross over u before the formation of v. If v is the first to cross over u, then Lemma 1.1.29 again implies w(c) ≺ w(a) and w(b) ≺ w(d) (the ordering (a, b, c, d) requires two invocations of Lemma 1.1.29; the other two orderings use Lemma 1.1.29 once and the local minimality of {b, c} once). For the induction step, let t be the last vertex to cross over u before v. By the induction hypothesis, w(u) ≺ w(t). By arguments like those made already for u and v, we have w(t) ≺ w(v).

After v is formed, the set of nodes compatible with v is the same as the set compatible with u as long as both exist. Since w(u) ≺ w(v), this implies that u merges before v.
1.1.31.* LEMMA. Let u and v be compatible crossable nodes with parents u′ and v′ in T, respectively. If w(u) ≺ w(v), then w(u′) ≺ w(v′).
If a depth list is not realizable, then the bottom-up process of converting it to an alphabetic tree fails at some point where k is the largest entry and some maximal segment of entries equal to k has odd length. Since each depth except the root's has an even number of nodes, the tree produced by Step 1 has a merge between an element a of this segment and an element d of another segment of nodes at depth k, where between a and d there is an element b with smaller depth. Before a and d can be merged to form v, the leaf b must become crossable by merging into a parent node u. Hence v crosses over u. By Lemma 1.1.30, this implies that u and v are compatible crossable nodes with w(u) ≺ w(v).

It thus suffices to show that if u and v are compatible crossable nodes with w(u) ≺ w(v), then l(u) ≥ l(v). Since w(u) ≺ w(v), it follows that v is not a descendant of u. We use induction on the distance from v to the closest common ancestor of u and v to prove the claim. If this distance is 0, then u is a descendant of v, and the statement is immediate. Otherwise, let u′ and v′ be the parents of u and v, respectively. By Lemma 1.1.31, w(u′) ≺ w(v′); also u′ and v′ are compatible crossable nodes. By the induction hypothesis, l(u′) ≥ l(v′), which yields l(u) ≥ l(v).
Given an input w (a list of weights, each designated as crossable or
noncrossable), let C be the class of trees formed by merges that can be ordered so that every merge is a compatible pair (in the current order) and
the list of depths of the nodes formed, in the order of formation, is nonincreasing. The optimal Huffman (non-alphabetic) trees for the frequency
lists 1,2,1 and 1,2,2,2,1 do not have such an ordering, since they require
the first merge to be a noncompatible pair. Since alphabetic trees merge
only adjacent nodes in the current list, every alphabetic tree with input
w belongs to C. Therefore, it suffices to show that the tree T produced
by Step 1 belongs to C and achieves minimum cost over the trees in C.
1.1.33.* THEOREM. The Hu–Tucker Algorithm produces an optimal alphabetic binary tree.
Proof: By Lemma 1.1.28, every algorithm that always merges LMCPs produces T. Hence it suffices to show that for every input w, there is a tree of minimum cost in C having a merge order in which the first (deepest) merge is an LMCP. This implies simultaneously that T ∈ C and that T has minimum cost in C.

We use induction on the length n of the input; the claim holds trivially for n = 2. Choose an optimal tree T′ in C and a bottom-up merge ordering of T′ that minimize the sum of the weights of the first merged pair. Let this pair be {u, v}, with parent v′; it suffices to show that {u, v} is an LMCP. If not, then we may assume by symmetry that w(u) > w(x) for some node
x compatible with v; let x have the least such weight (leftmost if there is
a tie).
By the induction hypothesis, subsequent merges are LMCPs. First suppose that before x merges, v′ merges with some node y. Since x is compatible with v, also x is compatible with v′ and with every node compatible with v′ (regardless of whether x is crossable). In particular, x is compatible with y, but w(v′) ≥ w(u) > w(x), which contradicts the choice of {v′, y} as an LMCP.
Next suppose that x eventually merges with v′. We wish to replace the first merge {u, v} with {v, x} (forming a node x′) and to replace the {v′, x} merge with {u, x′}; this would yield the same list of depths of merges and a cheaper tree, since w(u) > w(x). Since x is compatible with v, this list will also be a list of compatible merges unless u is noncrossable and some intervening merge in forming T′ crosses the position of u. We must eliminate this possibility.
We consider this together with the remaining case, which is when x merges with some y other than v′ before v′ merges. Here replacing the initial {u, v} merge by {v, x} and the later {x, y} merge by {u, y} would again yield the same list of depths of merges, and it would produce either a cheaper tree or one with the same cost but a cheaper first merge. Again these pairs are compatible at the time of merge, and the new list will be a list of compatible merges unless u is initially noncrossable and some intervening merge crosses the position of u.
If u is noncrossable, then the hypothesis that x is compatible with v implies that v and x are on the same side of u. Suppose {r, s} is the first merge crossing the position of u, with s on the same side of u as v and x. If s is compatible with v initially, then the choice of x implies w(x) ≺ w(s), and thus {r, s} is not an LMCP when it merges. We prove that w(x) ≺ w(z) whenever z is a node on the v side of u that is not initially compatible with v but becomes compatible with v′ before x merges. In particular, s has this property and could not form an LMCP with r.
We use induction on the number of merges until z becomes compatible with v′. We have already argued that w(x) ≺ w(z) if no such merges occur. Otherwise, z becomes compatible with v′ via some merge {a, b} in which b is compatible with z. If a is compatible with v initially, then the choice of x implies that w(x) ≺ w(a). If instead a became compatible with v′ before the {a, b} merge, then the induction hypothesis yields w(x) ≺ w(a) again. In either case, {a, b} being an LMCP implies that w(a) ≺ w(z).
We have proved that there is no first merge crossing over a noncrossable u before x merges. Thus the initial merge chosen, the cheapest deepest compatible merge, must be an LMCP, which completes the proof.
EXERCISES
1.1.1. () Prove that a tree with maximum degree k has at least k leaves.
1.1.2. () Prove that every graph with n vertices and k edges has at least n − k components.
1.1.3. () Prove C ⇒ {A, B} in Theorem 1.1.3 by adding edges to connect components.
1.1.5. () Prove that a connected n-vertex graph has exactly one cycle if and only if it has exactly n edges.

1.1.6. () Every tree is bipartite. Prove that every tree has a leaf in its larger partite set (in both if they have equal size).

1.1.7. () Let T be a tree in which every vertex has degree 1 or degree k. Determine the possible values for |V(T)|.

1.1.8. () Let T be a tree in which all vertices adjacent to leaves have degree at least 3. Prove that T has some pair of leaves with a common neighbor.

1.1.9. () Let G be a tree. Prove that there is a partition of V(G) into two nonempty sets such that each vertex has at least half of its neighbors in its own set in the partition if and only if G is not a star.

1.1.10. () There are five cities in a network. The cost of building a road directly between i and j is the entry a_{i,j} in the matrix below; a_{i,j} = ∞ indicates that there is a mountain in the way and the road cannot be built. Determine the least cost of making all the cities reachable from each other.

     0    3    5   11    9
     3    0    3    9    8
     5    3    0    ∞   10
    11    9    ∞    0    7
     9    8   10    7    0

1.1.14. Prove that a connected graph having exactly two vertices that are not cut-vertices is a path.

1.1.15. Let T be a tree with k vertices. Let G be a graph that does not contain K3 or K2,t. Prove that if δ(G) > (k − 2)(t − 1), then T occurs as an induced subgraph of G. (Zaker [2011])

1.1.17. () Prove that every n-vertex graph with n + 1 edges has a cycle of length at most ⌈(2n + 2)/3⌉. For each n, construct an example achieving this bound.

1.1.18. () Prove that every n-vertex graph with n + 2 edges has a cycle of length at most ⌈(n + 2)/2⌉. For each n, construct an example with no shorter cycle.

1.1.19. () Let T be a tree with k edges, and let G be an n-vertex graph with more than n(k − 1) − (k choose 2) edges. Use Proposition 1.1.4 to prove that T ⊆ G if n > k.
1.1.33. Let l(G) be the maximum number of leaves in a spanning tree of G, and let f(G) = Σ_{v∈V(G)} (d(v) − 2)/(d(v) + 1). Linial conjectured that l(G) ≥ f(G) for all G; this is false. Prove that f(G) − l(G) can be arbitrarily large by considering the graph Gm with 10m vertices formed by adding a matching joining K5m and mC5.
1.1.34. () Let G be an n-vertex graph other than K3 in which every edge belongs to a triangle. Prove that G has a spanning tree with at least (n + 5)/3 leaves and that this is sharp for an infinite family of graphs. (Hint: Grow a tree by operations satisfying an appropriate augmentation inequality. To get the constant right, an extra dead leaf may need to be guaranteed at the beginning.)
1.1.35. Prove that the number of binary trees with n + 1 leaves equals the number of ordered trees with n + 1 vertices (Example 1.1.18 shows the five binary trees
with four leaves). (Hint: Show that the two families satisfy the same recurrence.)