0% found this document useful (0 votes)
8 views65 pages

Generating Functions

The document contains lecture notes on Enumerative Combinatorics by Peter J. Cameron from Autumn 2013, covering ten sections including subsets, partitions, permutations, and generating functions. It discusses various counting techniques and important concepts such as binomial coefficients, Bell numbers, and Stirling numbers, along with exercises for practice. The notes serve as a comprehensive introduction to the field, emphasizing both theoretical and practical aspects of combinatorial enumeration.

Uploaded by

Manjaree Agrawal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views65 pages

Generating Functions

The document contains lecture notes on Enumerative Combinatorics by Peter J. Cameron from Autumn 2013, covering ten sections including subsets, partitions, permutations, and generating functions. It discusses various counting techniques and important concepts such as binomial coefficients, Bell numbers, and Stirling numbers, along with exercises for practice. The notes serve as a comprehensive introduction to the field, emphasizing both theoretical and practical aspects of combinatorial enumeration.

Uploaded by

Manjaree Agrawal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 65

Enumerative Combinatorics

The LTCC lectures


Peter J. Cameron
Autumn 2013

Abstract
These are the notes of my lecture course on Enumerative Combina-
torics at the London Taught Course Centre in Autumn 2013. Thanks
to all who attended for their support. There are ten sections, as fol-
lows:
• Subsets, partitions, permutations
• Formal power series
• Catalan numbers
• Unimodality
• q-analogues
• Symmetric polynomials
• Group actions
• Species
• Möbius inversion
• Cayley’s theorem
Exercises are included at the end of the sections.

1 Subsets, Partitions, Permutations


Enumerative combinatorics is concerned with counting discrete structures of
various types. There is a great deal of variation both in what we mean by
“counting” and in the types of structures we count. Typically each structure
has a “size” measured by a non-negative integer n, and “counting” may mean

1
(a) finding an exact formula for the number f (n) of structures of size n;

(b) finding an approximate or asymptotic formula for f (n);

(c) finding an analytic expression for a generating function for f (n);

(d) finding an efficient algorithm for computing f (n) exactly or approxi-


mately;

(e) finding an efficient algorithm for stepping from one of the counted ob-
jects to the next (in some natural ordering).

In this course I will mostly be concerned with the first three goals; discussing
algorithms would lead too far afield. The exception to this is one particularly
important algorithm, a recurrence relation, in which the value of f (n) is
computed from n and the earlier values f (0), . . . , f (n − 1).
An asymptotic formula for f (n) is an analytic function g(n) such that
f (n)/g(n) → 0 as n → ∞. There are several types of generatingX functions;
the most important for us are the ordinary generating function f (n)xn ,
n≥0
X f (n)xn
and the exponential generating function .
n≥0
n!
If you want to learn the state-of-the-art in combinatorial enumeration,
I recommend the two volumes of Richard Stanley’s Enumerative Combina-
torics, or the book Analytic Combinatorics by Philippe Flajolet and Robert
Sedgewick. The On-line Encyclopedia of Integer Sequences is another valu-
able resource for combinatorial enumeration.

1.1 Subsets
The three most important objects in elementary combinatorics are subsets,
partitions and permutations; we briefly discuss the counting functions for
these. First, subsets.
The total number of subsets of an n-element set is 2n . This can be used
by noting that this number f (n) satisfies the recurrence relation f (n) =
2f (n − 1); this is proved by observing that any subset of {1, . . . , n − 1} can
be extended to a subset of {1, . . . , n} in two different ways, either including
the element n or not.

2
 
n
The binomial coefficient is the number of k-element subsets of
k
{1, . . . , n}. The formula is
 
n n(n − 1) · · · (n − k + 1) n!
= = .
k k(k − 1) · · · 1 k! (n − k)!

Note
  that there
 are k factors in both numerator and denominator. We have
n n
= = 1. We can extend the definition to all non-negative inte-
0 n  
n
gers n and k by defining = 0 for k > n: this fits with the counting
k
interpretation.
The recurrence relation for binomial coefficients is Pascal’s Triangle
     
n n−1 n−1
= + for 0 < k < n.
k k−1 k

For the first term on the right counts subsets containing n, while the second
counts subsets not containing n.
Counting subsets by cardinality gives
n  
X n
= 2n .
k=0
k

There is a huge literature on “binomial coefficient identities”. A few


examples are given as exercises.
Anticipating our discussion of formal power series in the next chapter, we
now discuss generating functions for binomial coefficients.
n  
X n
xk = (1 + x)n .
k=0
k

This is the Binomial Theorem for non-negative integer exponents. If we


write (1 + x)n = (1 + x) · · · (1 + x) and expand the product, then we obtain
the term in xk by choosing x from
  k of the brackets and 1 from the remaining
n
n − k, which can be done in ways; each contributes 1 to the coefficient
k
of xk , so the theorem holds.

3
If we multiply this equation by y n and sum, we obtain the bivariate
generating function for the binomial coefficients:
n  
X X n k n X
x y = (1 + x)n y n
n≥0 k=0
k n≥0
1
=
1 − (1 + x)y
1 1
= ·
1 − y 1 − xy/(1 − y)
X yk
= k+1
xk ,
k≥0
(1 − y)

so we obtain the other univariate generating function for binomial coefficients:


X n yk
yn = k+1
.
n≥k
k (1 − y)

This formula is actually a rearrangement of the Binomial Theorem for neg-


ative integer exponents. The basis of this connection is the following evalu-
ation, for positive integers m and k:
 
−m −m(−m − 1) · · · (−m − k + 1
=
k k!
(m + k − 1) · · · (m + 1)m
= (−1)k
 k!
m + k − 1
= (−1)k .
k

1.2 Partitions
In this case and the next, we are unable to write down a simple formula
for the counting numbers, and have to rely on recurrence relations or other
techniques.
The Bell number B(n) is the number of partitions of a set of cardinality
n. We refine this in the same way we did for subsets. The Stirling number of
the second kind, S(n, k), is the number of partitions of an n-set into k parts.

4
Thus, S(0, 0) = 1 and S(0, k) = 0 for k > 0; and if n > 0, then S(n, 0) = 0,
S(n, 1) = S(n, n) = 1, and S(n, k) = 0 for k > n. Clearly we have
n
X
S(n, k) = B(n) for n > 0.
k=1

The recurrence relation replacing Pascal’s is:

S(n, k) = S(n − 1, k − 1) + kS(n − 1, k) for 1 ≤ k ≤ n.

It turns out that we can turn this into a statement about a generating
function, but with a twist. Let

(x)k = x(x − 1) · · · (x − k + 1) (k factors).

Then we have n
X
n
x = S(n, k)(x)k for n > 0. How?
k=1

It is possible to find a traditional generating function for the index n:


X
n yk How?
S(n, k)y = .
n≥k
(1 − y)(1 − 2y) · · · (1 − ky)

Also, the exponential generating function for the index n is


X S(n, k)xn (exp(x) − 1)k
= . How?
n≥k
n! k!

Summing over k gives the e.g.f. for the Bell numbers:


X B(n)xn
= exp(exp(x) − 1).
n≥0
n!

1.3 Permutations
The number of permutations of an n-set (bijective functions from the set to
itself) is the factorial function n! = n(n − 1) · · · 1 for n ≥ 0. The exponen-
tial generating function for this sequence is 1/(1 − x), while the ordinary
generating function has no analytic expression (it is divergent for all x 6= 0).

5
Any permutation can be decomposed uniquely into disjoint cycles. So we
refine the count by letting u(n, k) be the number of permutations of an n-set
which have exactly k cycles (including cycles of length 1). Thus,
n
X
u(n, k) = n! for n > 0.
k=1

The numbers u(n, k) are the unsigned Stirling numbers of the first kind. The
reason for the name is that it is common to use a different count, where a
permutation is counted with weight equal to its sign (as defined in elementary
algebra, for example the theory of determinants). Let s(n, k) be the sum of
the signs of the permutations of an n-set which have k cycles. Since the sign
of such a permutation is (−1)n−k , we have s(n, k) = (−1)n−k u(n, k). The
numbers s(n, k) are the signed Stirling numbers of the first kind.
We have n
X
s(n, k) = 0 for n > 1.
k=1

This is related to the algebraic fact that, for n > 1, the permutations with
sign + form a subgroup of the symmetric group of index 2 (that is, containing
half of all the permutations), called the alternating group.
We will mainly consider signed Stirling numbers below, though it is some-
times convenient to prove a result first for the unsigned numbers.
As usual we take s(n, 0) = 0 for n > 0 and s(n, k) = 0 for k > n.
We have s(n, n) = 1, s(n, 1) = (−1)n−1 (n−1)!, and the recurrence relation

s(n, k) = s(n − 1, k − 1) − (n − 1)s(n − 1, k) for 1 ≤ k ≤ n.

From this, we find a generating function:


n
X
s(n, k)xk = (x)n .
k=1

Putting x = 1 in this equation shows that indeed the sum of the signed
Stirling numbers is zero for n > 1.
Note that this is the inverse of the relation we found for the Stirling
numbers of the second kind. So the matrices formed by the Stirling numbers
of the first and second kind are inverses of each other.

6
Exercises
1. Let A be the matrix of binomial
  coefficients (with rows and columns
i
indexed by N, and (i, j) entry ), and B the matrix of “signed binomial
j  
i−j i
coefficients” (as before but with (i, j) entry (−1) ). Prove that A and
j
B are inverses of each other.
What are the entries of the matrix A2 ?
n  2  
X n 2n
2. Prove that = .
k=0
k n
3. (a) Prove that the following are equivalent for sequences (a0 .a1 , . . .) and
(b0 , b1 , . . .), with exponential generating functions A(x) and B(x) respec-
tively:
n
X
(ii) b0 = a0 and bn = S(n, k)ak for n ≥ 1;
k=1

(i) B(x) = A(exp(x) − 1).

(b) Prove that the following are equivalent for sequences (a0 .a1 , . . .) and
(b0 , b1 , . . .), with exponential generating functions A(x) and B(x) respec-
tively:
n
X
(i) b0 = a0 and bn = s(n, k)ak for n ≥ 1;
k=1

(ii) B(x) = A(log(1 + x)).

4. Construct a bijection between the set of all k-element subsets of {1, 2, . . . , n}


containing no two consecutive elements, and the set of all k-element subsets
of {1, 2,
 . . . , n − k + 1}. Hence show that the number of such subsets is
n−k+1
k
.
In the UK National Lottery, six numbers are chosen randomly (with-
out replacement, order unimportant) from the set {1, . . . , 49}. What is the
probability that the selection contains no two consecutive numbers?

7
2 Formal power series
Probably you recognised in the last chapter a few things from analysis, such
as the exponential and geometric series; you may know from complex analysis
that convergent power series have all the nice properties one could wish. But
there are reasons for considering non-convergent power series, as the following
example shows.
Recall the generating function for the factorials:
X
F (x) = n!xn ,
n≥0

which converges nowhere. Now consider the following problem. A permu-


tation of {1, . . . , n} is said to be connected if there is no number m with
1 ≤ m ≤ n − 1 such that the permutation maps {1, . . . , m} to itself. Let Cn
be the number of connected permutations of {1, . . . , n}. Any permutation is
composed of a connected permutation on an initial interval and an arbitrary
permutation of the remainder; so
n
X
n! = Cm (n − m)!.
m=1

Hence, if X
G(x) = 1 − C n xn ,
n≥1

we have F (x)G(x) = 1, and so G(x) = 1/F (x).


Fortunately we can do everything that we require for combinatorics (ex-
cept some asymptotic analysis) without assuming any convergence proper-
ties.

2.1 Formal power series


Let R be a commutative ring with identity. A formal power series over R is,
formally, an infinite sequence (r0 , r1 , r2 , . . .) of elements of R; but we always
represent it in the suggestive form
X
r0 + r1 x + r2 x2 + · · · = rn xn .
n≥0

8
We denote the set of all formal power series by R[[x]].
The set R[[x]] has a lot of structure: there are many things we can do with
formal power series. All we require of any operations is that they only require
adding or multiplying finitely many elements of R. No analytic properties
such as convergence of infinite sums or products are required to hold in R.

(a) Addition: We add two formal power series term-by-term.

(b) Multiplication: The rule for multiplication of formal power series, like
that of matrices, looks unnatural but is really the obvious thing: we
multiply powers of x by adding the exponents, and then just gather up
the terms contributing to a fixed power. Thus
X  X  X
n n
an x · bn x = cn x n ,

where n
X
cn = ak bn−k .
k=0

Note that to produce a term of the product, only finitely many addi-
tions and multiplications are required.

(c) Infinite sums and products: These are not always defined. Suppose, for
example, that A(i) (x) are formal power series for i = 0, 1, 2, . . .; assume
that the first non-zero coefficient in A(i) (x) is the coefficient of xni ,
where ni → ∞ as i → ∞. Then, to work out the coefficient of xn in
the infinite sum, we only need the finitely many series A(i) (x) for which
ni ≤ n. Similarly, the product of infinitely many series B (i) is defined
provided that B (i) (x) = 1 + A(i) (x), where A(i) satisfy the condition
just described.

(d) Substitution: Let B(x) be a formal power series with constant term
zero. Then, for any formal power series A(x), the series A(B(x)) ob-
tained by substituting B(x)
P for x inn A(x) is defined. For, if A(x) =
n
an B(x) , and B(x) has no terms in xk
n
P
an x , then A(B(x)) =
for k < n.

(e) Differentiation: of formal power seriesPis always P


defined; no limiting
n
process isPrequired. The derivative of an x is nan xn−1 , or alter-
n
natively, (n + 1)an+1 x .

9
(f) Negative powers: We can extend the notion of formal power series to
formal Laurent series, which are allowed to have finitely many negative
terms: X
an x n .
n≥−m

Infinitely many negative terms would not work since multiplication


would then require infintely many arithmetic operations in R.

(g) Multivariate formal power series: We do not have to start again from
scratch to define series in several variables. For R[[x]] is a commutative
ring with identity, and so R[[x, y]] can be defined as the set of formal
power series in y over R[[x]].

As hinted above, R[[x]] is indeed a commutative ring with identity: veri-


fying the axioms is straightforward but tedious, and I will just assume this.
With the operation of differentiation it is a differential ring.
Recall that a unit in a ring is an element with a multiplicative inverse.
The units in R[[x]] are easy to describe:

rn xn is a unit in R[[x]] if and


P
Proposition 2.1 The formal power series
only if r0 is a unit in R.

Proof If ( rn xn ) ( sn xn ) = 1, then looking at the constant term we see


P P
that r0 s0 = 1, so r0 is a unit.
Conversely, suppose that r0 s0 = 1. Considering the coefficient of xn in
the above equation with n > 0, we see that
n
X
rk sn−k = 0,
k=0

so we can find the coefficients sn recursively:


n
!
X
sn = −s0 rk sn−k .
k=1

This argument shows the very close connection between finding inverses
in R[[x]] and solving linear recurrence relations.

10
2.2 Example: partitions
We are considering partitions of a number n, rather than of a set, here. A
partition of n is an expression for n as a sum of positive integers arranged in
non-increasing order; so the five partitions of 4 are

4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1.

Let p(n) be the number of partitions of n.

Theorem 2.2 (Euler’s Pentagonal Numbers Theorem)


X
p(n) = (−1)k−1 (p(n − k(3k − 1)/2) + p(n − k(3k + 1)/2)) ,
k≥1

where the sum contains all terms where the argument n − k(3k ± 1)/2 is
non-negative.

This is a very efficient recurrence


√ relation for p(n), allowing it to be
computed with only about n arithmetic operations if smaller values are
known. For example, if we know

p(0) = 1, p91) = 1, p(2) = 2, p(3) = 3, p(4) = 5,

then we find p(5) = p(4) + p(3) − p(0) = 7, p(6) = p(5) + p(4) − p(1) = 11,
and so on.
I will give a brief sketch of the proof.

Step 1: The generating function.


X Y
p(n)xn = (1 − xk )−1 .
n≥0 k≥1

k
For on the right, we have the product of geometric series 1 + xP + x2k + · · ·,
n
and the coefficient of x is the number of ways of writing n = kak , which
is just p(n).

11
Step 2: The inverse of the generating function. We need to find
Y
(1 − xk ).
k≥1

The coefficient of xn in this product is obtained from the expressions for n as


a sum of distinct positive integers, where sums with an even number of terms
contribute +1 and sums with an odd number contribute −1. For example,

9 = 8 + 1 = 7 + 2 = 6 + 3 = 5 + 4 = 6 + 2 + 1 = 5 + 3 + 1 = 4 + 3 + 2,

so there are four sums with an even number of terms and four with an odd
number of terms, and so the coefficient is zero.

Step 3: Pentagonal numbers appear. It turns out that the following is


true:
The numbers of expressions for n as the sum of an even or an
odd number of distinct positive integers are equal for all n except
those of the form k(3k ± 1)/2, for which the even expressions
exceed the odd ones by one if k is even, and vice versa if k is odd.
This requires some ingenuity, and I do not give the proof here.
This shows that the expression in Step 2 is equal to
X
(−1)k xk(3k+1)/2 + xk(3k−1)/2 ,

1+
k≥1

and we immediately obtain the required recurrence relation.

Exercises
1. Suppose that R is a field. Show that R[[x]] has a unique maximal ideal,
consisting of the formal power series with constant term zero. Describe all
the ideals of R[[x]].
2. Suppose that A(x), B(x) and C(x) are the exponential generating func-
tions of sequences (an ), (bn ) and (cn ) respectively. Show that A(x)B(x) =
C(x) if and only if
n  
X n
cn = ak bn−k .
k=0
k

12
3. (a) Let (an ) be a sequence of integers, and (bn ) the sequence of partial sums
X n
of (an ) (in other words, bn = ai ). Suppose that the generating function
i=0
for (an ) is A(x). Show that the generating function for (bn ) is A(x)/(1 − x).
(b) Let (an ) be a sequence of integers, and let cn = nan for all n ≥
0. Suppose that the generating function for (an ) is A(x). Show that the
generating function for (cn ) is x(d/dx)A(x). What is the generating function
for the sequence (n2 an )?
(c) Use the preceding parts of this exercise to find the generating function
X n
for the sequence whose nth term is i2 , and hence find a formula for the
i=1
sum of the first n squares.

3 Catalan numbers
In the last chapter, as in most of this course, we treated power series as formal
objects: even differentiation involves no limiting processes. However, if the
coefficients are complex numbers, and the series converge in some neighbour-
hood of the origin, then analytic methods can be used. These methods can
be very powerful. We will see them at work in the derivation of a formula for
the Catalan numbers, and then give a few examples of combinatorial objects
counted by Catalan numbers.

3.1 Analysis
A complex function which is analytic in some neighbourhood of the origin
is represented by a convergent power series in a disc about the origin. If
an analytic relation between functions holds in a suitable disc, then any
connection between the coefficients which can be derived will also be true in
the world of formal power series.
The most important formal power series to which this principle can be
applied are
X a
a
(a) The binomial series (1 + x) = xn , where a is any complex
n≥0
n

13
number, and the binomial coefficient is defined as
 
a a(a − 1) · · · (a − n + 1)
= .
n n!
X xn
(b) The exponential series exp(x) = ..
n≥0
n!

X (−1)n−1 xn
(c) The logarithmic series log(1 + x) = .
n≥1
n

Here is a simple example. The identity


(1 + x)a (1 + x)b = (1 + x)a+b ,
valid for |x| < 1, gives rise to the Vandermonde convolution
n     
X a b a+b
= .
k=0
k n−k n

3.2 Example: Catalan numbers


The Catalan numbers are one of the most important sequences of combi-
natorial numbers, with a large range of occurrences in apparently different
counting problems. I will introduce them with one particular occurrence, and
then give a number of different places where they arise. The derivation of
the formula for them is on the border between formal and analytic methods,
and multivariate versions of this method are useful in areas such as lattice
path problems.

Problem Given an algebraic structure with a (non-associative) binary op-


eration ◦, in how many different ways can a product of n terms be evaluated
by inserting brackets?
For example, the product a ◦ b ◦ c ◦ d has five evaluations:
((a ◦ b) ◦ c) ◦ d, (a ◦ (b ◦ c)) ◦ d, (a ◦ b) ◦ (c ◦ d), a ◦ ((b ◦ c) ◦ d), a ◦ (b ◦ (c ◦ d)).

X of n terms, for n ≥ 1,
Let Cn be the number of evaluations of a product
so that C1 = C2 = 1, C3 = 2, C4 = 5. Let c(x) = Cn xn be the generating
n≥1
function.

14
In a bracketing of n terms, the last application of ◦ will combine some
product of the first m terms with some product of the last n − m terms, for
some m with 1 ≤ m ≤ n − 1. So we have the recurrence relation
n−1
X
Cn = Cm Cn−m for n > 1.
m=1

Combined with the initial condition C1 = 1, this determines the sequence.


Now consider the product c(x)2 . The recurrence relation shows that the
terms in xn in c(x)2 are the same as those in c(x) for n > 1; only the terms
in x differ, with c(x) containing 1x and c(x)2 containing 0x. So we have

c(x) = x + c(x)2 .

We can rearrange this as a quadratic equation:

c(x)2 − c(x) + x = 0.

The solution of this equation is


1
√ 
c(x) = 2
1± 1 − 4x .

The choice of sign in the square root is determined by the fact that c(0) = 0,
so we must take the negative sign:

c(x) = 21 1 − 1 − 4x .


From this expression it is possible to extract the coefficient of xn . Ac-


cording to the Binomial Theorem,
X 1/2
1/2
(1 − 4x) = (−4x)n ,
n≥0
n

and so  
1/2
Cn = − 12 (−4)n .
n
Now
 
1/2 (1/2)(−1/2) · · · (−(2n − 3)/2)
=
n n!

15
1 1 · 3 · (2n − 3)
= n
(−1)n−1
2 n!
1 1 (2n − 2)!
= n (−1)n−1 n−1
2 n 2 ((n − 1)!)2
 
1 n 1 2n − 2
= −2(− 4 ) ,
n n−1

so finally we obtain  
1 2n − 2
Cn = .
n n−1
The result and its proof call for a few remarks.
First, are these manipulations really valid?

(a) We have used here the Binomial Theorem for exponent 1/2, which is
proved analytically by observing that the function (1 + x)1/2 is analytic
in the interior of the unit disc (it has a branchpoint at x = −1), and
then using the formula for the coefficient of xn in the Taylor series
(differentiate n times, put x = 0, divide by n!).

(b) It is clear, from back substitution, that the function c(x) = 21 (1 −



1 − 4x) does indeed satisfy the equation c(x) = x + c(x)2 ; so its
coefficients satisfy the recurrence relation and initial condition for the
Catalan numbers Cn . Since these data determine the numbers uniquely,
our final formula is indeed valid.

Second, this is a case where, even once you know the formula for the Cata-
lan numbers, it is difficult to show directly that they satisfy the recurrence
relation. (Spend a few moments trying; you will be convinced of this!)
And
 third,it is not at all obvious that n divides the binomial coeffi-
2n − 2
cient ; but since Cn counts something, it is an integer, and so this
n−1
divisibility is indeed true.

3.3 Other Catalan objects


Here are a small selection of the many objects counted by Catalan numbers.
The obvious ways of verifying this for a class of objects are either

(a) to verify the Catalan recurrence and initial condition; or

16
(b) to find a bijection to a known class of Catalan objects.

There are sometimes other less obvious ways, as we will see in the case of
Dyck paths.
Where possible I have given an illustration of the five Catalan objects
counted by C4 .

Binary trees
A binary tree has a root of degree 2; the other vertices have degree 1 or
3. So every non-root vertex is either a leaf or has two descendants, which we
specify as left and right descendants.
The number of binary trees with n leaves is Cn . Figure 1 shows the
correspondence with bracketed products: the tree is a “parse tree” for the
product.

r r r r r r r r
r \\r \
\r  r r r r r r \\r  \\ r r
r \\r r \r  LLr LLr
\r r r r
 \ \ 
\
\\r \\r \\r r
\  r
\ 
(a ◦ (b ◦ (c ◦ d))) (a ◦ ((b ◦ c) ◦ d)) ((a ◦ b) ◦ (c ◦ d)) ((a ◦ (b ◦ c)) ◦ d) (((a ◦ b) ◦ c) ◦ d)

Figure 1: Binary trees and bracketed products

Rooted plane trees


The number of rooted plane trees with n edges is Cn+1 . Figure 2 shows
the rooted plane trees with three edges.
r
r r r r r
@
r @r r r r r r r r
@ @ @
r r @r @r @r

Figure 2: Rooted plane trees

17
Dissections of polygons
An n-gon can be dissected into triangles by drawing n − 2 non-crossing
diagonals. There are Cn−1 dissections of an n-gon. Figure 3 shows dissections
of a pentagon.

q q q q q
q  Zq q BB Zq q Zq q BB Zq q Zq
Z Z Z Z Z
B   Z
B ZB  B  B  B  BZZ 
q
B q Bq ZBq q
B q Bq Bq Bq Zq

Figure 3: Dissections of a polygon

Dyck paths
A Dyck path starts at the origin and ends at (2n, 0), moving at each step
to the adjacent lattice point in either the north-easterly or south-easterly
direction and never going below the X-axis. (An even number of steps is
required since each step either increases or decreases the Y-coordinate by 1.)
Figure 4 shows the Dyck paths with n = 3.
q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q
q q q q@q q q
@ q q q q q q q q q q q q q q q q q q q q q q q q q q q q
q q q q q@q q q q q q
@ q q q
@ q q q q q
@ q q q q q q q@q q
@ q q q q q q q
q q q q q q@q q q q q q @
q@q q q q @
q@q q q
@ q q q
@ q q q@q q q q
@ q q
@ q q
@

Figure 4: Dyck paths

The number of Dyck paths is Cn+1 , and of these, Cn never return to the
X-axis before the end. I will indicate the proof since it illustrates another
technique.
Let Dn be the number of Dyck paths, and En the number which never
return to the axis. Now a Dyck path begins by moving from (0, 0) to (1, 1)
and ends by moving from (2n − 1, 1) to (2n, 0); if it did not return to the
axis in between, then removing these “legs” gives a shorter Dyck path. So

En = Dn−1 .

Suppose that a Dyck path first returns to the axis at (2k, 0). Then it is a
composite of a non-returning Dyck path of length 2k with an arbitrary Dyck

18
path of length 2(n − k); so
n
X
Dn = Ek Dn−k .
k=1

Solving these simultaneous recurrences gives the result.

Ballot numbers
An election is held with two candidates A and B, each of whom receives
exactly n votes. In how many ways can the votes be counted so that A is
never behind in the count?
It is easy to match these ballot numbers with Dyck paths. For n = 3, the
five counts are AAABBB, AABABB, AABBAB, ABAABB, and ABABAB.
This can be described another way. In a 2×n array, we place the numbers
1, . . . , 2n in order against the candidates who receive those votes. This gives
the representations shown in Figure 5.

1 2 3 1 2 4 1 2 5 1 3 4 1 3 5
4 5 6 3 5 6 3 4 6 2 5 6 2 4 6
AAABBB AABABB AABBAB ABAABB ABABAB
Figure 5: Tableaux

Note that the numbers increase along each row and down each column.

3.4 Young diagrams and tableaux


The five objects shown are known as Young tableaux ; they arise in the rep-
resentation theory of the symmetric group and much related combinatorics.
A Young diagram (sometimes called a Ferrers diagram) consists of n
boxes arranged in left-aligned rows, the number of boxes in each row being
a non-decreasing function of the row number. This is simply a graphical
representation of a partition of n: for each partition n = a1 + a2 + · · ·, with
a1 ≥ a2 ≥ . . ., we take a1 boxes in the first row, a2 in the second, and so on.
Now a Young tableau is a filling of the boxes with the numbers 1, 2, . . . , n
so that each row and each column is in increasing order. You maay like to

19
invent a ballot interpretation for the number of Young tableaux belonging to
a given diagram.
This combinatorics is important in describing the representation theory
of the symmetric group Sn , the group of all permutations of {1, . . . , n}. It
is known that the irreducible matrix representations of Sn over the complex
numbers are in one-to-one correspondence with the partitions of n (that is, to
the Young diagrams); the degree of a representation is equal to the number
of Young tableaux belonging to the corresponding diagram. Thus, the five
Young tableaux shown in the preceding section correspond to an irreducible
representation of degree 5 of the group S6 .
There is a “hook length formula” for the number of Young tableaux cor-
responding to a given diagram. The hook associated with a cell consists of
that cell and all those to its right in the same row or below it in the same
column. The hook length of a cell is the number of cells in its hook. Now the
number of Young tableaux associated with the diagram is equal to n! divided
by the product of the hook lengths of all its cells.
Thus for the diagram with two rows of length 3, the formula gives
6!
= 5.
4·3·2·3·2·1

3.5 Wedderburn–Etherington numbers


What happens if we count binary trees without the left-right distinction
between the two children at each node? In other words, two binary trees will
count as “the same” if a sequence of reversals of subtrees above each point
converts one to the other.
It can be shown that the recurrence relation for the number Wn of binary
trees with this convention (the Wedderburn–Etherington numbers is
 n−1
X
1

Wi Wn−i if n is odd,


2

i=1
Wn = n−1
!
 X
1
Wi Wn−i + Wn/2 if n is even,


2

i=1

and that the generating function w(x) satisfies

w(x) = x + 12 (w(x)2 + w(x2 )).

20
This is much more difficult to solve. Whereas Cn is roughly 4n (in the
1/n
sense that the limit of Cn as n → ∞ is 4), Wn is roughly 2.483 . . .n in the
same sense.

Exercises
1 Give a counting proof of the Vandermonde convolution in the case where
a and b are natural numbers.

2 Verify some of the formulae for Catalan objects in the notes, either by
deriving a recurrence, or by finding bijections between the objects counted.

3 In the analysis of Dyck paths, adopt the convention that D0 = 1 and


E0 = 0. Prove that, if d(x) and e(x) are the generating functions, then

xd(x) = e(x), d(x) = 1 + e(x)d(x).

Hence derive formulae for Dn and En .

4 Use the hook length formula to derive the formula for the Catalan number
Cn .

5 Prove the recurrence relation and the equation for the generating function
for the Wedderburn–Etherington numbers.

4 Unimodality
It is well known that the binomial coefficients increase up to halfway, and
then decrease. Indeed, the shape of the bar graph of binomial coefficients
is well approximated by a scaled version of the “bell curve” of the normal
distribution.

....................................
....... ......
...... .....
....... ....
.... ....
...... ....
....
..
... ....
..
... ....
.
.... ....
....
.
.... ....
....
. ....
..
..
..... ......
.......
..
..
..
..... ........
...
..
..
.......
. ..........
..............
......
............................ .........................

21
This property of binomial coefficients is easily proved. Since
   
n n−k n
= ,
k+1 k+1 k
the binomial coefficient increases from k to k + 1, remains constant, or de-
creases, according as n − k > k + 1, n − k = k + 1 or n − k < k + 1, that
is, as n is greater than, equal to, or less than 2k + 1. So, if n is even, the
binomial coefficients increase up to k = n/2 and then decrease; if n is odd,
the two middle values (k = (n − 1)/2 and k = (n + 1)/2) are equal, and they
increase before this point and decrease after.
Other combinatorial numbers also show this unimodality property, but
in cases where we don’t have a formula, we need general techniques.

4.1 Unimodality and log-concavity


Given a sequence of positive numbers, say a0 , a1 , a2 , . . . , an , we say that the
sequence is unimodal if there is an index m with 0 ≤ m ≤ n such that

a0 ≤ a1 ≤ · · · ≤ am ≥ am+1 ≥ · · · ≥ an .

The sequence a0 , a1 , a2 , . . . , an of positive integers is said to be log-concave


if a2k ≥ ak−1 ak+1 for 1 ≤ k ≤ n − 1. The reason for the name is that
the logarithms of the as are concave: setting bk = log ak , we have 2bk ≤
bk−1 + bk+1 , or in other words, bk+1 − bk ≤ bk − bk−1 . So if we plot the points
(k, bk ) for 0 ≤ k ≤ n, then the slopes of the lines joining consecutive points
decrease as k increases, so that the figure they form is concave when viewed
from above.
Now it is clear that a log-concave sequence is unimodal.
A nice general result is:
n
X
Theorem 4.1 Let A(x) = ak xk be the generating polynomial for the
k=0
numbers a0 , . . . , an . Suppose that all the roots of the equation A(x) = 0 are
real and negative. Then the sequence a0 , . . . an is log-concave.

Before we begin the proof, we note that a polynomial with all coefficients
positive cannot have a real non-negative root, and a polynomial all of whose
roots are negative has all its coefficients positive.

22
The proof is by induction: there is nothing to prove when n = 1, since
any sequence of two numbers is log-concave. For n = 2, the condition for
the polynomial a0 + a1 x + a2 x2 to have real roots is a21 − 4a0 a2 ≥ 0, which
is stronger than log-concavity; as remarked, if the roots are real, they are
negative.
Now we turn to the general case. Suppose that A(x) = (x+c)B(x), where
c > 0 and
B(x) = bn−1 xn−1 + · · · + b1 x + b0 .
Now the polynomial B(x) has all its roots real and negative, since they are
all the roots of A(x) except for −c. So the coefficients are all positive, and
by the inductive hypothesis, the sequence b0 , . . . , bn−1 is log-concave; that is,

b2k ≥ bk−1 bk+1

for k = 1, . . . , n − 2. Also, since A(x) = (x + c)B(x), we have a0 = cb0 ,


an = bn−1 , and ak = bk−1 + cbk for 1 ≤ k ≤ n − 1.
We first show that bk bk−1 ≥ bk+1 bk−2 for 2 ≤ k ≤ n − 2. For we have

b2k bk−1 ≥ bk+1 b2k−1 ≥ bk+1 bk bk−2 ;

dividing by bk gives the result.


Now for 2 ≤ k ≤ n − 2, we have

a2k − ak+1 ak−1 = (bk−1 + cbk )2 − (bk + cbk+1 )(bk−2 + cbk−1 )


= (b2k−1 − bk bk−2 ) + c(bk−1 bk − bk+1 bk−2 ) + c2 (b2k − bk+1 bk−1 );

and all three terms are non-negative since c > 0.


The cases k = 1 and k = n − 1 are left to the reader.

4.2 Binomial coefficients and Stirling numbers


For the binomial coefficients, we have
n  
X n
xk = (1 + x)n ;
k=0
k

all its roots are −1, and so the theorem shows that the binomial coefficients
are log-concave, and hence unimodal.

23
For the unsigned Stirling numbers of the first kind, we have
n
X
u(n, k)xk = x(x + 1) · · · (x + n − 1),
k=1

and the polynomial on the right has roots 0, −1, −2, . . . , −(n − 1). We can
neglect the zero root: the Stirling numbers start at k = 1 rather than zero,
and dividing by x simply changes the indexing so that they start at 0. So
again the Stirling numbers are log-concave and hence unimodal.
The Stirling numbers of the second kind are more difficult, since there
is no convenient form for the generating polynomial. We start with the
recurrence relation
S(n, 1) = S(n, n) = 1, S(n, k) = S(n−1, k−1)+kS(n−1, k) for 1 < k < n.
Let n
X
An (x) = S(n, k)xk .
k=0

We have A0 (x) = 1. For n > 0, we have A(n, 0) = 0, so zero is a root of


An (x) = 0. We have to show that the other roots are real and negative.
We prove this by induction: P1 (x) = x has a single root at x = 0, while
A2 (x) = x + x2 has roots at x = 0 and x = −1; so the induction begins.
From the recurrence relation, we have
n
X
An (x) = S(n, k)xk
k=1
Xn n
X
k
= S(n − 1, k − 1)x + kS(n − 1, k)xk
k=1 k=1
= x (dAn−1 (x)/dx + An−1 (x)) .
Putting Bn (x) = An (x)ex , we see that An (x) = 0 and Bn (x) = 0 have
the same roots. The identity above, multiplied by ex , gives
x dBn−1 (x)/dx = Bn (x).
By Rolle’s Theorem, there is a root of Bn (x) between each pair of roots of
Bn−1 (x), and one to the left of the smallest root of Bn−1 (x) (since Bn−1 (x) →
0 as x → −∞); and also a a root at 0. This accounts for (n − 2) + 1 + 1
roots, that is, all the roots of Bn (x). So the induction step is complete.

24
Exercises
1 Let S be a fixed set of positive integers, and let rn be the number of
partitions ofPn into distinct parts from the set S. What is the generating
polynomial rn xn ? Is the sequence (rn ) unimodal?

2 Let (an ) be an infinite sequence of positive numbers which is log-concave


(that is, an−1 an+1 ≤ a2n for all n ≥ 1). Show that the ratio an+1 /an tends to
a limit as n → ∞.

5 q-analogues
In a sense, a q-analogue of a combinatorial formula is simply another formula
involving a variable q which has the property that, as q → 1, the second
formula becomes the first. Of course there is more to it than that; some
q-analogues are more important than others. What follows is nothing like
a complete treatment; I will concentrate on a particularly important case,
the Gaussian or q-binomial coefficients, which are, in the above sense, q-
analogues of binomial coefficients.

5.1 Definition of Gaussian coefficients


The Gaussian (or q-binomial) coefficient is defined for non-negative integers
n and k as
(q n − 1)(q n−1 − 1) · · · (q n−k+1 − 1)
 
n
= .
k q (q k − 1)(q k−1 − 1) · · · (q − 1)
In other words, in the formula for the binomial coefficient, we replace each
factor r by q r − 1. Note that this is zero if k > n; so we may assume that
k ≤ n.
qr − 1
Now observe that lim = r. This follows from l’Hôpital’s rule: both
q→1 q − 1
numerator and denominator tend to 0, and their derivatives are rq r−1 and 1,
whose ratio tends to r. Alternatively, use the fact that
qr − 1
= 1 + q + · · · + q r−1 ,
q−1
and we can now harmlessly substitute q = 1 in the right-hand side; each of
the r terms becomes 1.

25
Hence if we replace each factor (q r − 1) in the definition of the Gaussian
coefficient by (q r − 1)/(q − 1), then the factors (q − 1) in numerator and
denominator cancel, so the expression is unchanged; and now it is clear that
   
n n
lim = .
q→1 k k
q

5.2 Interpretations
Quantum calculus The letter q stands for “quantum”, and the q-binomial
coefficients do play a role in “quantum calculus” similar to that of the ordi-
nary binomial coefficients in ordinary calculus. I will not discuss this further;
see the book Quantum Calculus, by V. Kac and P. Cheung, Springer, 2002,
for further details.

Vector spaces over finite fields The letter q is also routinely used for
the number of elements in a finite field (which is necessarily a prime power,
and indeed there is a unique finite field of any given prime power order q –
a theorem of Galois).

Theorem 5.1 Let V be an n-dimensional vector space over a field


 with q
n
elements. Then the number of k-dimensional subspaces of v is .
k q

Proof The proof follows the standard proof for binomial coefficients count-
ing subsets of a set.
A k-dimensional subspace of V is specified by choosing a basis, a sequence
of k linearly independent vectors. Now the number of choices of the first
vector is q n − 1 (since every vector except the zero vector is eligible); the
second can be chosen in q n − q ways (since the q multiples of the first vector
are now ineligible); the third in q n −q 2 ways (since the q 2 linear combinations
of the first two are now ruled out); and so on. In total,

(q n − 1)(q n − q) · · · (q n − q k−1 )

choices.
We have to divide this by the number of k-tuples of vectors which form
a basis for a given k-dimensional subspace. This number is obtained by

26
replacing n by k in the above formula, that is,

(q k − 1)(q k − q) · · · (q k − q k−1 ).

Dividing, and cancelling the powers of q, gives the result.

Remark Let F denote a field of q elements. Then a set of k linearly


independent vectors in F n can be represented as a k × n matrix of rank
k. We may put it into reduced echelon form by elementary row operations
without changing the subspace it spans; and, indeed, any subspace has a
unique basis in reduced echelon form. So as a corollary we obtain

Corollary 5.2 The number of k×n  matrices over a field of q elements which
n
are in reduced echelon form is .
k q

As a reminder, a matrix is in reduced echelon form if

(a) the first non-zero entry in any row is a 1 (a leading 1);


(b) the leading 1s occur further to the right in successive rows;
(c) all the other elements in the column of a leading 1 are 0.

This has two consequences. First, it gives us another way of calculating


the Gaussian coefficients. For example, the 2 × 4 matrices in reduced echelon
form are as follows, where ∗ denotes any element of the field:
     
1 0 ∗ ∗ 1 ∗ 0 ∗ 1 ∗ ∗ 0
, , ,
0 1 ∗ ∗ 0 0 1 ∗ 0 0 0 1
     
0 1 0 ∗ 0 1 ∗ 0 0 0 1 0
, , .
0 0 1 ∗ 0 0 0 1 0 0 0 1

So we have
 
4
= q 4 + q 3 + q 2 + q 2 + q + 1 = (q 2 + 1)(q 2 + q + 1).
2 q

This expression, and the definition, are polynomials in q, which agree for ev-
ery prime power q; so they coincide. In a similar way, any Gaussian coefficient
can be written out as a polynomial.

27
The other consequence is that algebra is not required here. Over any
alphabet of size q, containing two distinguished elements 0 and 1, the number
of k × n matrices in
 “reduced echelon form” (satisfying (a)–(c) above) with
n
no zero rows is .
k q

Lattice paths How many lattice paths are there from the origin to the
point (m, n), where each step in the path moves one unit either north or
east?  
m+n
Clearly the number is , since we must take m + n steps of which
m
m are north and n are east, and the northward steps may occur in any m of
the m + n positions.
Suppose we want to count the paths by the area under the path (that is,
bounded by the X-axis, the line x = m, and the path). We use a generating
function approach, so that a path enclosing an area of a units contributes
q a to the overall generating function. Here q is simply a formal variable; the
answer is obviously a polynomial in q.

Theorem 5.3 The generating function


 for lattice paths from (0, 0) to (m, n)
m+n
by area under the path is .
m q

 section. Note that, as q → 1, we expect the


We will see whyin the next
m+n
formula to tend to .
m

A non-commutative interpretation Let x and y be elements of a (non-


commutative) algebra which satisfy yx = qxy, where the coefficient q is a
“scalar” and commutes with x and y. Then we have the following analogue
of the Binomial Theorem (see Exercises):
Theorem 5.4
X n
n
(x + y) = n x( n − k)y k .
k=0
k q

For example,

(x + y)3 = xxx + xxy + xyx + yxx + xyy + yxy + yyx + yyy.

28
We can use the relation to move the y’s to the end in each term; each time
we jump a y over an x we pick up a factor q. So

(x + y)3 = x3 + (1 + q + q 2 )x2 y + (1 + q + q 2 )xy 2 + y 3 ,

in agreement with the theorem.

5.3 Combinatorial properties


These properties can be proved in two ways: by using the counting inter-
pretation involving subspaces of a vector space, or directly from the formula
(usually easiest). The proofs are all relatively straightforward; I will just give
outlines where appropriate.
   
n n
Proposition 5.5 = .
k q n−k q

This is straightforward from the formula. Alternatively we can invoke


vector space duality: there is a bijection between subspaces of dimension k
of a vector space and their annihilators (subspaces of codimension k of the
dual space).
         
n n n n−1 k n−1
Proposition 5.6 = = 1, and = +q for
0 q n q k q k−1 q k q
0 < k < n.

Again, straightforward from the formula. Alternatively, consider k × n


matrices in reduced echelon. If the leading 1 in the last row is in the last
column, then the other entries in the last row and column are zero, and
deleting them gives a (k − 1) × (n − 1) matrix in reduced echelon. Otherwise,
the last column is arbitrary (so there are q k possibilities for it; deleting it
leaves a k × (n − 1) matrix in reduced echelon.

Remark From this we can prove Theorem 5.3, as follows. Let Q(n, k) be
the sum of the weights of lattice paths from (0, 0) to (n − k, k), where the
weight of a path is q a if the area below it is a. Clearly Q(n, 0) = Q(n, n) = 1.
Consider all the lattice paths from (0, 0) to (n − k, k), and divide them
into two classes: those in which the last step is vertical, and those in which
it is horizontal. In the first case, the last step is from the end of a path

29
counted by Q(n − 1, k − 1) (ending at (n − k, k − 1)), and adds no area to the
path. In the second step, it is from the end of a path counted by Q(n − 1, k)
(ending at (n − k − 1, k)), and increases the area by k, adding q k Q(n − 1, k)
to the sum. So the numbers Q(n, k) have the same recurrence and boundary
conditions as the Gaussian coefficients, and must be equal to them.
From the last two results, we can deduce an alternative recurrence:
     
n n−k n − 1 n−1
=q + .
k q k−1 q k q

5.4 The q-binomial theorem


The q-analogue of the Binomial Theorem states:
Theorem 5.7 For any positive integer n,
n n  
k(k−1)/2 k n
Y X
i−1
(1 + q z) = q z .
i=1 k=0
k q

The proof is by induction on n; starting the induction at n = 1 is trivial.


Suppose that the result is true for n − 1. For the inductive step, we must
compute
n−1  !
X n − 1
q k(k−1)/2 z k 1 + q n−1 z .

k=0
k q
The coefficient of z k in this expression is
   
k(k−1)/2 n − 1 (k−1)(k−2)/2+n−1 n − 1
q +q
k q k−1 q
   !
n−1 n−1
= q k(k−1)/2 + q n−k
k q k−1 q
 
n
= q k(k−1)/2
k q
by the alternative recurrence relation.
I state without proof here Heine’s formula, the q-analogue of the negative
binomial theorem:
n ∞  
Y
i−1 −1
X n+j−1 j
(1 − q z) = z .
i=1 j=0
j q

30
5.5 Jacobi’s Triple Product Identity
This is only loosely connected with the topics of this chapter, but is inter-
esting in its own right.
Theorem 5.8 (Jacobi’s Triple Product Identity)
Y X 2
(1 + q 2n−1 z)(1 + q 2n−1 z −1 )(1 − q 2n ) = ql zl .
n>0 l∈Z

The sharp-eyed will notice that the series on the right breaks my rules that
formal Laurent series should have only finitely many negative terms. Well,
this just shows that formal power series are more flexible than might first
appear! You can check that the three infinite products on the left contribute
only finitely many terms to each power, positive or negative, of z.
By replacing q by q 1/2 and moving the third term in the product to the
right-hand side, the identity takes the form
! !
Y X 2 Y
(1 + q n−1/2 z)(1 + q n−1/2 z −1 ) = q l /2 z l (1 − q n )−1 ,
n>0 l∈Z n>0

in which form we will prove it. The proof here is a remarkable argument by
Richard Borcherds, and this write-up from my Combinatorics textbook.
A level is a number of the form n + 21 , where n is an integer. A state is
a set of levels which contains all but finitely many negative levels and only
finitely many positive levels. The state consisting of all the negative levels
and no positive ones is called the vacuum. Given a state S, we define the
energy of S to be
X X
{l : l > 0, l ∈ S} − {l : l < 0, l 6∈ S},

while the particle number of S is

|{l : l > 0, l ∈ S}| − |{l : l < 0, l 6∈ S}|.

Although it is not necessary for the proof, a word about the background
is in order!
Dirac showed that relativistic electrons could have negative as well as
positive energy. Since they jump to a level of lower energy if possible, Dirac
hypothesised that, in a vacuum, all the negative energy levels are occupied.

31
Since electrons obey the exclusion principle, this prevents further electrons
from occupying these states. Electrons in negative levels are not detectable.
If an electron gains enough energy to jump to a positive level, then it becomes
‘visible’; and the ‘hole’ it leaves behind behaves like a particle with the same
mass but opposite charge to an electron. (A few years later, positrons were
discovered filling these specifications.) If the vacuum has no net particles
and zero energy, then the energy and particle number of any state should be
relative to the vacuum, giving rise to the definitions given.
We show that the coefficient of q m z l on either side of the equation is equal
to the number of states with energy m and particle number l. This will prove
the identity.
For the left-hand side this is straightforward. A term in the expansion
1 1
of the product is obtained by selecting q n− 2 z or q n− 2 z −1 from finitely many
factors. These correspond to the presence of an electron in positive level
n − 21 (contributing n − 12 to the energy and 1 to the particle number), or a
hole in negative level −(n − 21 ) (contributing n − 12 to the energy and −1 to
the particle number). So the coefficient of q m z l is as claimed.
The right-hand side is a little harder. Consider first the states with parti-
cle number 0. Any such state can be obtained in a unique way from the vac-
uum by moving the electrons in the top k negative levels up by n1 , n2 , . . . , nk ,
say, where n1 ≥ n2 ≥ . . . ≥ nk . (The monotonicity is equivalent to the re-
quirement that no electron jumps over another. The jumping process allows
the possibility that some electrons jump from negative levels to higher but
still negative levels, so k is not the number of occupied positive levels.) The
energy of the state is thus m = n1 +. . .+nk . Thus, the number of states with
energy m and particle number 0 is equal to the Q number p(m) of partitions
m n −1
of m, which is the coefficient of q in P (q) = n>0 (1 − q ) , as we saw in
lecture 1.
Now consider states with positive particle number l. There is a unique
ground state, in which all negative levels and the first l positive levels are
filled; its energy is
1 3 2l − 1 1
+ + ... + = l2 ,
2 2 2 2
and its particle number is l. Any other state with particle number l is
obtained from this one by ‘jumping’ electrons up as before; so the number of
such states with energy m is p(m − 12 l2 ), which is the coefficient of q m z l in
2
q l /2 z l P (q), as required.

32
The argument for negative particle number is similar.

Exercises
1 Prove that, for fixed n, the Gaussian coefficients are unimodal.
 
n
2 For fixed n and k, the Gaussian coefficient is a polynomial in q of
k q
degree k(n − k), whose coefficients a0 , . . . , ak(n−k) are non-negative integers.
Prove that the coefficients are symmetric: that is, ai = ak(n−k)−i .

Remark It is also true that the coefficients are unimodal, but this is not so
easy to prove. The polynomial does not have all its roots real and negative!

3 Show that, for a, b equal to 0 or 1,




2m + a
  0  if a = 0 and b = 1,
= m
2l + b −1  l otherwise.

 
n
Remark For a more challenging exercise, find a formula for , where
k ω
ω is a primitive dth root of unity.

4 Deduce Euler’s Pentagonal Numbers Theorem from Jacobi’s Triple Prod-


uct Identity. (Hint: put q = t3/2 , z = −t−1/2 .)

5 Consider the algebra generated by two non-commuting variables x and y


satisfying the relation yx = qxy. Prove that
n  
n
X n
(x + y) = xn−k y k .
k=0
k q

33
6 Symmetric polynomials
A symmetric polynomial in n indeterminates is one which is unchanged under
any permutation of the indeterminates. The theory of symmetric polynomials
goes back to Newton, but more recently has been very closely connected with
the representation theory of the symmetric group, which we glanced at in
Lecture 3. I will just give a few simple results here. The best reference is Ian
Macdonald’s book Symmetric Functions and Hall Polynomials.

6.1 Symmetric polynomials


Let x1 , . . . , xn be indeterminates. If π is a permutation of {1, . . . , n}, we
denote by iπ the image of i under π. Now a polynomial F (x1 , . . . , xn ) is a
symmetric polynomial if
F (x1π , . . . , xnπ ) = F (x1 , . . . , xn ) for all π ∈ Sn ,
where Sn is the symmetric group of degree n (the group of all polynomials
of degree n).
Any polynomial is a linear combination of monomials xa11 · · · xann , where
a1 , . . . , an are non-negative integers. The degree of this monomial is a1 +· · ·+
an . A polynomial is homogeneous of degree r if every monomial has degree
r. Any polynomial can be written as a sum of homogeneous polynomials of
degrees 1, 2, . . ..
In a homogeneous symmetric polynomial of degree r, the exponents in
any monomial form a partition of r into at most n parts; two monomials
which give rise to the same partition are equivalent under a permutation,
and so must have the same coefficient. Thus, the dimension of the space
of homogeneous symmetric polynomials of degree r is pn (r), the number of
partitions of r with at most n parts.
There are three especially important symmetric polynomials:
(a) The elementary symmetric polynomial er , which is the sum of all the
monomials consisting
  of products of r distinct indeterminates. Note
n
that there are monomials in the sum.
r
(b) The complete symmetric polynomial  hr , which
 is the sum of all the
n+r−1
monomials of degree r. There are terms in the sum: the
r
proof of this is given in the Appendix to these notes.

34
n
X
(c) The power sum polynomial pr , which is simply xri .
i=1

For example, if n = 3 and r = 2,

(a) the elementary symmetric polynomial is x1 x2 + x2 x3 + x1 x3 ;


(b) the complete symmetric polynomial is x1 x2 + x2 x3 + x1 x3 + x21 + x22 + x23 ;
(c) the power sum polynomial is x21 + x22 + x23 .
   
n n+r−1
Note that er (1, . . . , n) = , hr (1, . . . , 1) = , and pr (1, . . . , 1) =
r r
n.
Also, the q-binomial theorem that we met in the last lecture shows that
 
2 n−1 r(r−1)/2 n
er (1, q, q , . . . , q ) = q ,
r q

and Heine’s formula shows that, similarly,


 
2 n−1 n+r−1
hr (1, q, q , . . . , q )= .
r q

6.2 Generating functions


The best-known occurrence of the elementary symmetric polynomials is the
connection with the roots of polynomials. (To avoid conflict with xi , the
variable in a polynomial is t in this section.) The coefficient of tn−r in a
polynomial of degree n is (−1)r er (a1 , . . . , an ), where a1 , . . . , an are the roots.
This is because the polynomial can be written as

(t − a1 )(t − a2 ) · · · (t − an ),

and the term in tn−r is formed by choosing t from n − r of the factors and
−ai from the remaining r.
Said otherwise, and putting xi = −1/ai , this says that the generating
function for the elementary symmetric polynomials is
n
X n
Y
r
E(t) = er (x1 , . . . , xn )t = (1 + xi t),
r=0 i=1

35
with the convention that e0 = 1.
In a similar way, the generating function for the complete symmetric
polynomials is
X n
Y
H(t) = r
hr (x1 , . . . , xn )t = (1 − xi t)−1 .
r≥0 i=1

We also take P (t) to be the generating function for the power sum polyno-
mials, with a shift:
X
P (t) = pr (x1 , . . . , xn )tr−1 .
r≥1

Now we see that H(t) = E(−t)−1 , so that


n
X
(−1)r 3r hn−r = 0 for n ≥ 1.
r=0

For P (t), we have

d d
H(t) = P (t)H(t), E(t) = P (−t)E(t).
dt dt

6.3 Functions indexed by partitions


We extend the definitions of symmetric polynomials as follows. Let λ =
(a1 , a2 , . . .) be a partition of r, a non-decreasing sequence of integers with
sum r. Then, if z denotes one of the symbols e, h or p, we define zλ to be the
product of zai over all the parts ai of λ; this is again a symmetric polynomial
of degree r. For example, if n = 3 and λ is the partition (2, 1) of 3, we have

eλ = (x1 x2 + x1 x3 + x2 x3 )(x1 + x2 + x3 ),
pλ = (x21 + x22 + x23 )(x1 + x2 + x3 ),
hλ = eλ + pλ .

We also define the basic polynomial mλ to be the sum of all monomials with
exponents a1 , a2 , . . .. In the above case,

mλ = x21 x2 + x21 x3 + x22 x1 + x22 x3 + x23 x1 + x23 x2 .

36
Theorem 6.1 If n ≥ r, and z is one of the symbols m, e, h, p, then any
symmetric polynomial of degree r can be written uniquely as a linear com-
bination of the polynomials zλ , as λ runs over all partitions. Moreover, in
all cases except z = p, if the polynomial has integer coefficients, then it is a
linear combination with integer coefficients.
So the polynomials er or hr , with r ≤ n, are generators of the ring of
symmetric polynomials in n variables with integer coefficients. For z = e,
this is a version of Newton’s Theorem on symmetric polynomials (which,
however, applies also to rational functions).

6.4 Appendix: Selections with repetition


Theorem
 6.2 The number of n-tuples of non-negative integers with sum r
n+r−1
is .
r
The claim about the number of monomials of degree r follows immediately
from this result, which should be contrasted with
 the fact that the number
n
of n-tuples of zeros and ones with sum r is .
r

Proof We can describe any such n-tuple in the following way. Take a line
of n + r − 1 boxes. Then choose n − 1 boxes, and place barriers in these
boxes. Let
(a) a1 be the number of empty boxes before the first barrier;
(b) a2 be the number of empty boxes between the first and second barriers;
(c) . . .
(d) an be the number of empty boxes after the last barrier.
Then a1 , . . . , an are non-negative integers with sum r. Conversely, given n
non-negative integers with sum r, we can represent it with n − 1 barriers in
n + r − 1 boxes: place the first barrier after a1 empty boxes, the second after
a2 further empty boxes, and so on.
So the required number of n-tuples is equal to the number of ways to
position n − 1 barriers in n + r − 1 boxes, which is
   
n+r−1 n+r−1
= ,
n−1 r

37
as required.

7 Group actions
How many ways can you colour the faces of a cube with three colours? Clearly
the answer is 36 = 729. But what if we regard two colourings as the same
if one can be transformed into the other by a rotation of the cube? This is
typical of the problems we consider in this chapter.

7.1 The Orbit-Counting Lemma


This chapter of the lectures, unlike most of the others, requires some technical
background. I assume that you know the definition of a group. I will run
briefly through the theory of group actions, and finally reach the Orbit-
Counting Lemma, which solves our introductory problem.
Throughout this section, permutations act “on the right”, that is, the
effect of applying a permutation π to an element x of the domain is written
xπ. This is not just a matter of notation; it entails the fact that the product
π1 π2 of two permutations is calculated by the rule “first π1 , then π2 ”, rather
than the other way round. This ensures that x(π1 π2 ) = (xπ1 )π2 for all
elements x.
An action of a group G on a set X is a map associating to each group
element g ∈ G a permutation πg of X in such a way that the following two
conditions hold:
(a) πgh = πg πh for all g, h ∈ G (that is, xπgh = xπg πh for all g, h ∈ G and
all x ∈ X);
(b) if 1 denotes the identity element of G, then π1 is the identity permuta-
tion (that is, xπ1 = x for all x ∈ X).
Usually we simplify notation by not distinguishing between g and πg , writing
simply xg instead of xπg . From a different point of view, an action is a
homomorphism from the group G to the symmetric group of all permutations
of X.
Two elements x, y ∈ X are equivalent under the action if there exists an
element g ∈ G such that xg = y. It is routine to show that this is really an
equivalence relation; its equivalence classes are called orbits, and the action
is transitive if there is just one orbit. Thus we have a first structure theorem:

38
any action can be split uniquely into transitive actions on the sets of the
orbit partition of the domain.
In our motivating problem, the group G of 24 rotations of the cube acts
on the set X of 729 coloured cubes, and we want to count the orbits. So our
immediate goal is to count the orbits in an arbitrary action.
If H is a subgroup of G, then there is a natural partition of G into right
cosets Hx of H, for x ∈ G. Lagrange’s Theorem assures us that each coset
has the same cardinality, so the number of cosets is equal to |G|/|H|. We
denote the set of right cosets of H in G by cos(H, G). Now there is an action
of G on the set cos(H, G): the group element g induces the permutation
Hx 7→ H(xg). At risk of some confusion, we write this as (Hx)g = H(xg).
Now, given any transitive action of G on a set X, and x ∈ X, the set
{g ∈ G : xg = x}
is a subgroup of H, called the stabiliser of x, and denoted by StabG (x). Now
there is a natural bijection between X and cos(H, G), where the element
y ∈ X corresponds to the set
{g ∈ G : xg = y}
(it is easily checked that this is a coset of H). This bijection also respects
the action of G: if z ∈ G satisfies yg = z, and Hk and Hl are the cosets
corresponding to y and z, then (Hk)g = (Hl).
So the so-called “coset spaces” of subgroups of G give a complete list of
transitive actions of G, up to a natural notion of isomorphism of actions.
Note in addition that any two points in the same orbit have stabilisers of
the same order. (The stabilisers are in fact conjugate subgroups of G.)
In an arbitrary action of G on X, we let fixX (g) denote the number of
points of X which are fixed by the permutation g. Now we can state the
Orbit-Counting Lemma, the foundation of enumeration under group action.
Theorem 7.1 Let G act on the finite set X. Then the number of G-orbits
in X is equal to the average number of fixed points of elements of G, that is,
1 X
fixX (g).
|G| g∈G

The theorem has a probabilistic interpretation. Choose a random element


of G (from the uniform distribution). Then its expected number of fixed
points is equal to the number of orbits of G.

39
Proof Construct a bipartite graph as follows. The vertices are of two types:
the elements of X, and the elements of G. There is an edge from x to g if
xg = x. We count the number of edges in two different ways. X
Each vertex g lies in fixX (g) edges; so the number of edges is fixX (g).
g∈G
Now we count the other way. Take a point x ∈ X. The number of edges
containing it is | StabG (x)|. This value is the same for all the points in the
orbit OG (x) containing x. So the number of edges containing points in the
orbit is | StabG (x)| · |OG (x)| = |G|. Since each orbit contributes |G| edges,
the number of orbits is obtained by dividing the number of edges by |G|, as
claimed.
Now consider the coloured cubes. In order to do the calculations, we need
to classify the elements of the group G of rotations of the cube (a group of
order 24). They are of the following types:

(a) the identity;


(b) “face rotations” (about an axis through two opposite face centres)
through ±π/2 (six of these, two for each pair of opposite faces);
(c) “face rotations” through π (three of these);
(d) “edge rotations” (about an axis through two opposite edge midpoints)
through π (six of these);
(e) “vertex rotations” (about an axis through two opposite vertices) through
±2π/3 (eight of these, two for each pair of opposite vertices).

For each type of rotation, we have to count the number of coloured cubes it
fixes. A cube will be fixed if faces in the same cycle of the permutation have
the same colour. So the answer will be 3c , where c is the number of cycles
of the permutation on faces. For the five types listed above the numbers of
cycles are 6 (each single face is a cycle), 3 (for the vertical axis, the top and
bottom faces, and the other four in a single cycle), 4 (as the previous except
that the 4-cycle splits into two 2-cycles), 3 (the faces are permuted in cycles
of two), and 2 (the faces are permuted in cycles of three). So the calculation
of the theorem is:
1 6
(3 + 6 · 33 + 3 · 34 + 6 · 33 + 8 · 32 ) = 57.
24

40
7.2 Labelled and unlabelled
Many combinatorial objects that we want to count are based on an underlying
set, which we usually assume to be the set {1, 2, . . . , n}. Very often the
simplest method of counting gives us the total number of objects that can
be built on this set. But we may be completely uninterested in the labels
1, 2, . . . , n, and want to count two objects as being the same if there are some
labellings of the underlying set that make them identical.
We distinguish these two problems as counting labelled and unlabelled
objects.
Counting unlabelled objects is thus an orbit-counting problem: we want
to know the number of orbits of the symmetric group Sn , acting on the
objects in question by permuting the labels. 
n
To take an extreme case: there are labelled k-element subsets of an
k
n-element set, but there is only one unlabelled subset. Here are a few more
examples.

Objects Labelled Unlabelled


Subsets 2n n+1
Partitions B(n) p(n)
Permutations n! p(n)
Linear orders n! 1

Here B(n) is the Bell number (the number of partitions of an n-set) and
p(n) the number of partitions of the number n. Note that the numbers of
unlabelled structures can agree and those of labelled structures disagree, or
vice versa.
The third entry needs a little explanation. Any permutation can be writ-
ten as a product of disjoint cycles; the cycle lengths form a partition of n
called the cycle structure of the permutation. Now given two permutations
with the same cycle structure, we can replace the entries in one by those in
the other. For example, (1)(2, 3) can be transformed into (2)(1, 3) by swap-
ping the labels 1 and 2. (You might recognise this as the argument that
shows that two permutations are conjugate in the symmetric group if and
only if they have the same cycle structure.)
In the three cases in the table, we can count the unlabelled objects di-
rectly; but in more complicated cases, the Orbit-Counting Lemma is required.
One example is the number of graphs on n vertices. The labelled number

41
is 2n(n−1)/2 , since for each of the n(n − 1)/2 pairs of vertices we can choose
whether to join it by an edge or not; but the only way to calculate the nuber
of unlabelled graphs is via the Orbit-Counting Lemma.

7.3 Cycle index


There is a way to “mechanise” the counting in many important cases, which
we now discuss. This was introduced by Redfield and, independently, by
Pólya, and refined by de Bruijn and others. (Incidentally, these early work-
ers found the Orbit-Counting Lemma in Burnside’s group theory book, and
called it “Burnside’s Lemma”, a name which is still sometimes used. How-
ever, the result is due to Frobenius, and earlier to Cauchy in a special case.)
The set-up is as follows. We have a set X on which a group G acts. We
are going to decorate X by placing one of a set of “figures” at each point.
Each figure has a weight, which is a non-negative integer. We don’t require
the number of figures to be finite, but we ask that there should be only
finitely many figures of any given weight. The figures can thus be counted
by the figure-counting series
X
A(x) = an x n ,
n≥0

where an is the number of figures of weight n.


Now one of the configurations we want to count consists of the set X with
a figure at each point; this can be described by a function from X Pto the set
of figures. Such a function f will have a weight, given by w(f ) = {w(x) :
x ∈ X}. There are only finitely many functions of any given weight, and the
action of the group G preserves weight; so we can let bn be the number of
functions of weight n, and define the function-counting series
X
B(x) = bn x n .
n≥0

The final ingredient is the cycle index polynomial Z(G), defined as


1 X c1 (g) c2 (g)
Z(G) = s s2 · · · sncn (g) .
|G| g∈G 1

Here s1 , . . . , sn are indeterminates, and ci (g) is the number of cycles of length


i in the cycle decomposition of g, for i = 1, . . . , n.
Now the Cycle Index Theorem states:

42
Theorem 7.2
B(x) = Z(G; si ← A(xi ) for i = 1, . . . , n).
The notation on the right means that we substitute A(xi ) for si , for
i = 1, . . . , n.
I won’t prove the theorem here – it follows from the Orbit-Counting
Lemma with a certain amount of ingenuity – but will conclude with a simple
application which doesn’t even hint at the uses of the theorem.
First, let us calculate the cycle index of the rotation group of the cube.
The five types of elements mentioned earlier have the following cycle struc-
tures in their action on faces:
(a) Identity: (1, 1, 1, 1, 1, 1) (usually abbreviated to 16 ).
(b) Face rotations through ±π/2: 12 4.
(c) Face rotations through π: 12 22 .
(d) Edge rotations: 23 .
(e) Vertex rotations: 32 .
So the cycle index is
1 6
Z(G) = (s + 6s21 s4 + 3s21 s22 + 6s32 + 8s23 ).
24 1
Now any counting problem for which we can write a figure-counting series
can be solved by substitution. For example:
(a) Take each of the three colours to be a figure of weight 0. The figure-
counting series is simply 3. We recover our earlier count.
(b) Take one of the colours (say red) to have weight 1, and all the others
weight 0. The figure-counting series is x + 2. So substituting xi + 2
for si gives a polynomial in which the coefficient of xk is the number of
types of cube which have exactly k red faces.
(c) A small extension of the Cycle Index Theorem shows that, if we sub-
stitute pi (x, y, z) = xi + y i + z i for si , we obtain a trivariate polynomial
in which the coefficient of xi y j z k is the number of cubes with i red, j
blue, and k green faces.
(d) The generalisation to an arbitrary number of colours is now routine.

43
Exercises
1 Perform the calcuations in the four counting problems above.

2 A necklace has ten beads, each of which is either black or white, arranged
on a loop of string. A cyclic permutation of the beads counts as the same
necklace. How many necklaces are there?
How many are there if the necklace obtained by turning over the given
one is regarded as the same?

3 Let G be a permutation group on a set X, where |X| = n.


For 0 ≤ i ≤ n, let pi be the proportion P of i elements of G which have
exactly i fixed points on X, and let p(x) = pi x be the generating function
for these numbers (the probability generating function for fixed points).
For 0 ≤ i ≤ n, let Fi be the number of orbits of G in its action on the set
of i-tuples of distinct elements of X, and let
X Fi xi
F (x) =
i!
be the exponential generating function for these numbers.
Use the Orbit-counting Lemma to show that

F (x) = P (x + 1)

and deduce that the proportion of fixed-point-free elements in G is p0 =


F (−1).
Taking G to be the symmetric group Sn , show that the number of fixed-
point-free permutations (the derangement number ) is
n
X (−1)k
n! .
k=0
k!

Deduce that this number is the closest integer to n!/e.

4 Consider the set of all functions from {1, . . . , n} to {1, . . . , m}. There
are mn functions in the set. Now let the symmetric group Sn act on these
functions by permuting their arguments: (f π)(x) = f (xπ −1 ). [Incidentally,
the inverse is there to make this an action – can you see why?]

44
Show that orbits correspond to m-tuples
 of non-negative
 integers with
m+n−1
sum n, so that the number of orbits is . (See the Appendix in
n
Lecture Notes 7.)
Show that a permutation g with k cycles fixes mk functions. Hence use
the Orbit-Counting Lemma to show that
n  
1 X m+n−1
u(n, k)mk = .
n! k=1 n

Show that we can replace m by an indeterminate x and multiply by n! to get


the identity
Xn
u(n, k)xk = x(x + 1) · · · (x + n − 1),
k=1

from which some sign changes yield


n
X
s(n, k)xk = x(x − 1) · · · (x − n + 1),
k=1

a formula we met in Section 1. (Here s(n, k) and u(n, k) are the signed and
unsigned Stirling numbers of the first kind.)

8 Species
In this lecture I will discuss a very nice unifying principle for a number of
topics in enumerative combinatorics, the theory of species, introduced by
André Joyal in 1981. Species have been used in areas ranging from infinite
permutation groups to statistical mechanics, and I can’t do more here than
barely scratch the surface.
Joyal gave a category-theoretic definition of species; I will take a more
informal approach.
There is a book on species, by Bergeron, Labelle and Leroux, entitled
Combinatorial Species and Tree-Like Structures; but I think that Joyal’s
original paper in Advances in Mathematics is hard to beat.

45
8.1 What is a species?
As I said earlier, a typical combinatorial structure of the type we wish to
count is often built on a finite set; we are interested in counting labelled
structures (the different structures built on a fixed set) and also the unla-
belled structures (essentially the isomorphism types of structures).
A species is a functor F (this word is used by Joyal in its technical sense
from category theory; I will be less formal but will explain what is going on)
which takes an n-element set and produces the set of objects in which we
are interested; it should also have the property that the functor transforms
any bijection between n-element sets A and B to a bijection between the sets
F(A) and F(B) of objects built on these sets. Because of this condition, we
can use the standard n-element set {1, 2, . . . , n}, but don’t have to worry if
during the argument we have a non-standard set (such as a proper subset of
the standard set).
Joyal’s intuition is that we think of a formal power series where the coef-
ficients are not numbers, but sets of combinatorial objects:
X
F= F ({1, 2, . . . , n})xn .
n≥0

Suitable specialisations will give us the generating functions for unlabelled


and unlabelled objects.
The first specialisation is to replace the set F(A) by the sum of the cycle
indices of the automorphism groups of the unlabelled structures in F(A):
let us call this Z(F). This will be a formal power series in infinitely many
variables s1 , s2 , . . .. Now it turns out that the specialisations

f (x) = Z(F; sn ← xn for all n),


F (x) = Z(F; s1 ← x, xn ← 0 for n > 1),

give us, respectively, the ordinary generating function for the unlabelled
structures in the species F, and the exponential generating function for the
labelled structures.

8.2 Examples
If this is a bit abstract, hopefully some examples will bring it back to earth.

46
Sets Let Set denote the “identity” species, where the structure on the finite
set A is simply a labelling of A. Thus, for each n, there is one unlabelled
srtucture, and one labelled structure. So the generating functions are
X 1
set(x) = xn = ,
n≥0
1−x
X xn
Set(x) = = exp(x)
n≥0
n!

respectively.
The cycle index of the species Set can be computed as follows. First,
1 X n!
Z(Sn ) = sa1 · · · sann ,
n! 1 · · · n n a1 ! · · · an ! 1
a1 a

where the sum is over all partitions of n having ai parts of size i for i =
1, 2, . . . , n (the coefficient is the number of permutations with this cycle struc-
ture). Summing this over all n seems a formidable task, but a remarkable
simplification occurs: since n! cancels we can sum over the variables a1 , . . . , an
independently. We obtain
!
X  si 
Z(Set) = exp .
i≥1
i

Now substituting X i for si for all i gives


!
X  xi 
set(x) = exp
i≥1
i
= exp(− log(1 − x))
1
= ,
1−x
Set(x) = exp(x),

as expected.
Note that the formula for the sum of the cycle indices of the symmetric
groups was known in the combinatorial enumeration community before Joyal
provided it with this nice interpretation.

47
Linear orders A much easier case is the species Lin of linear (or total)
orders. There are n! labelled linear orders on n points; all are isomorphic,
and there are no non-trivial automorphisms, so we have
X 1
Z(Lin) = sn1 = ,
n≥0
1 − s1

from which the generating functions are lin(x) = Lin(x) = 1/(1 − x).

8.3 Operations on species


There are three important ways that we can add two species F and G.

Sum F + G is the species which constructs on the set A all the F-objects
and all the G-objects (we assume these two classes to be disjoint). Clearly the
cycle index and the generating functions for unlabelled and labelled objects
are simply obtained by adding those for F and G.

Product F · /mathbf G is the species whose objects on a set A are con-


structed in the following way: partition A into two (possibly empty) parts
B and C; put an F -object on B, and a G-object on C. A slightly harder
calculation shows that the cycle index, and hence the generating functions
for unlabelled and labelled objects, are obtained by multiplying those for F
and G.
Here is an example. What is Set2 ? Given a set A, we partition it into
a subset B and its complement A \ B. So we can regard this as the species
Subset. The numbers of unlabelled and labelled objects in this species on
n points are n + 1 and 2n respectively, and their generating functions are (as
expected) 1/(1 − x)2 and exp(2x).

Substitution As with power series in general, there is a formal restriction


on substitution: we can only substitute G into F provided that G(∅) = ∅. If
this condition holds, then we define F[G]-objects on A as follows: partition
A (into non-empty parts); put a G-structure on each part; and put a F-
structure on the set of parts.
The cycle index is given by substituting the cycle index of G into that of
F in the following way:
Z(F[G]) = Z(F : sn ← Z(G, sm ← snm )).

48
In other words, for the indeterminate sn in Z(F, we substitute the cycle
index of G but in the indeterminates sn , s2n , . . . in place of s1 , s2 , . . ..
The effect on the generating functions for labelled objects is simple sub-
stitution: F [G](x) = F (G(x)). For unlabelled objects it is a bit more com-
plicated, we need the cycle index for F:

f g(x) = Z(F; sn ← g(xn ) for all n).

For example, let Set∗ be the species of non-empty sets. Then the e.g.f.
for labelled objects is Set∗ (x) = exp(x) − 1. Now Set[Set∗ ] is the species of
set partitions, where the labelled objects are counted by the Bell numbers:
the exponential generaing function is thus exp(exp(x) − 1), as we saw earlier.
As an exercise, obtain the ordinary generating function for partitions of the
integer n from this approach.

Remark The fact that substituting a species into Set exponentiates the
generating function for labelled structures is sometimes called the exponential
principle in enumerative combinatorics. We see that substitution of species
is much more general.

Rooted structures This means structures where one point is distinguished.


It can be shown that the effect of rooting a species is to apply the operator

s1 to the cycle index, and hence to apply the operator x d/dx to the
∂s1
generating function for labelled structures. I will denote the operation of
rooting a species by R, and the operation of rooting and then removing the
root (i.e., deleting a point) by D: this just corresponds to differentiation.
There are many other nice examples, some of which are described in the
exercises.

8.4 Exercises
1 Define the species Circ of circular orders and the species Perm of per-
mutations, and calculate the generating functions for unlabelled and labelled
objects in these species.
Show that
X φ(m)
Z(Circ) = − log(1 − sm ),
m≥1
m

49
where φ is Euler’s totient function.
Use the decomposition of permutations into disjoint cycles to show that
Set[Circ] = Perm,
and verify the appropriate identities for the generating functions.

Remark It is not so easy to calculate the cycle index of Perm directly,


but using the above expression it is not too hard to show that
Y
Z(Perm) = (1 − sn )−1 .
n≥1

2 Use the fact that Catalan objects are rooted binary trees to show that
the species Cat of Catalan objects satisfies
Cat = E + Cat2 ,
where E denotes the species of singleton sets (that is, it returns its input if
this has cardinality 1, and the empty set otherwise).
Show similarly that the species W of rooted binary trees without the left-
right distinction (counted by Wedderburn–Etherington numbers) satisfies
W = E + Set2 [W],
where Set2 is the species of 2-element sets.

3 Let F denote the species of “1-factors” or partitions of a set into subsets


of size 2. Show that
D(F) = E · F,
F = Set[Set2 ].
Use each of these equations to show that the exponential generating function
for labelled 1-factors is exp(x2 /2).

4 Let Graph and ConnGraph be the species of graphs and connected


graphs respectively. (Here, assume that a connected graph has at least one
vertex.) Show that
Graph = Set[ConnGraph].
(It follows from this that the e.g.f. for connected graphs is the logarithm of
the e.g.f. for graphs.)

50
9 Möbius inversion
In this section we will discuss the Inclusion-Exclusion principle, with a few
applications (including a formula for the chromatic polynomial of a graph),
and then consider a wide generalisation of it due to Gian-Carlo Rota, involv-
ing the Möbius function of a partially ordered set. The q-binomial theorem
gives a simple formula for the Möbius function of the lattice of subspaces of
a vector space.

9.1 Inclusion-Exclusion
The Inclusion-Exclusion Principle is one of the most familiar results in com-
binatorics. For two sets A and B, it asserts simply that |A ∪ B| = |A| +
|B| − |A ∩ B|. For the general case, we need some notation. Let A1 , . . . , An
\of a finite set S. For any subset I of the index set {1, 2, . . . , n, we
be subsets
let AI = Ai . By convention, we take A∅ = S.
i∈I

Theorem 9.1 The number of elements lying in none of the sets A1 , . . . , An


is X
(−1)|I| |AI |.
I⊆{1,...,n}

Proof We count the contribution of each element s ∈ S to the sum in the


above formula.
If s lies in none of the sets Ai then it is counted once in the term A∅ and
in none of the others.
Suppose that J = {i : s ∈ Ai } = 6 ∅. Then the terms to which s contributes
come from sets AI with I ⊆ J, and the contribution is
j  
X
|I|
X j
(−1) = (−1)k = (1 − 1)j = 0,
I⊆J k=0
k
where j = |J|. 
Corollary 9.2 Suppose that the family of sets has the property that, if |I| =
i, then |AI | = mi . Then the number of points lying in none of the sets is
n  
i n
X
(−1) mi .
i=0
i

51
9.2 Applications
We begin with two standard applications of the Corollary. First, a formula
for the Stirling numbers of the second kind.

Theorem 9.3 The number of surjective functions from an m-set to an n-set


is n  
i n
X
(−1) (n − i)m .
i=0
i

Proof Let S be the set of all functions from M to N , where |M | = m and


|N | = n, say N = {1, . . . , n}. Let Ai be the set of functions which do not
take the value i. Then a function is surjective if and only if it lies in none of
the sets Ai .
If |I| = i, then AI consists of functions which take values in the set
{1, . . . , n} \ I; there are (n − i)m such functions. So the theorem follows
immediately from Corollary 9.2. 
Corollary 9.4
n  
1 X i n
S(m, n) = (−1) (n − i)m .
n! i=0 i

Proof We can describe a surjective function as follows: choose a partition


of the domain into n parts (we can do this in S(m, n) ways, by definition
of the Stirling number); then assign each part to a point of the codomain
(which can be done in n! ways). So n!S(m, n) is the number of surjective
functions. 

The second application concerns derangements: these are permutations


of {1, . . . , n} with no fixed points.

Theorem 9.5 The number of derangements of {1, . . . , n} is given by the


formula
n
X (−1)i
dn = n! .
i=0
i!

52
Proof Let S be the set of all permutations, and Ai the set of permutations
which fix the element i ∈ {1, . . . , n}. Then a permutation is a derangement
if and only if it lies in no set Ai . The permutations in AI fix every point in
the set I, so there are (n − i)! of them if |I| = i. Thus Corollary 9.2 gives
n n
(−1)i
 
X
i n X
dn = (−1) (n − i)! = n! .
i=0
i i=0
i!

as claimed. 

The summation here is the partial sum of the series for e−1 , so dn is
approximately n!/e. Indeed, it is easy to show that it is the nearest integer
to n!/e.
The “secretary problem” asks: a secretary puts n letters into n addressed
envelopes at random: what is the probability that no letter is correctly ad-
dressed? The answer is very close to 1/e, perhaps a little surprising at first
sight.
For our final application we consider graphs. A graph consists of a set V
of vertices and a set E of edges, each edge being a 2-element set of vertices.
Given a set of q colours, a colouring of the graph is an assignment of colours
to the vertices; it is proper if the two vertices in each edge have different
colours.

Theorem 9.6 For any graph G = (V, E), there is a polynomial PG (x) such
that, for any natural number q, PG (q) is the number of proper colourings of
G with q colours. Moreover, PG is a monic polynomial with degree n = |V |.

This is usually proved by operations on the graph (“deletion” and “con-


traction”. The Inclusion-Exclusion proof here provides a formula.

Proof Let S be the set of all colourings of G with q colours. For each edge e,
let Ae be the set of colourings for which the edge e is “improperly coloured”,
that is, its vertices have the same colour. A colouring is proper if it lies in
no set Ae . Given a set I ⊆ E, how many colourings lie in AI ? Consider the
graph (V, I) with edge set I. A colouring in Ai assigns the same colour to
all vertices in the same connected component of this graph; so |AI | = q c(I) ,
where c(I) is the number of connected components of (V, I).

53
By Theorem 9.1, the number of proper colourings is
X
(−1)|I| q c(I) .
I⊆E

It is clear that this is a polynomial in q; the leading term comes from the
unique graph (V, I) with n connected components, namely I = ∅. 

This formula shows a connection between graph colouring and the Potts
model in statistical mechanics, but we cannot pursue this here.

9.3 The Möbius function of a poset


A poset, or partially ordered set, consists of a set A with a relation ≤ on A
which is
(a) reflexive: a ≤ a for all a ∈ A;
(b) antisymmetric: a ≤ b and B ≤ a imply a = b, for all a, b ∈ A;
(c) transitive: a ≤ b and b ≤ c imply a ≤ c, for all a, b, c ∈ A.
An important combinatorial example consists of the case where A is the set
of all subsets of a finite set S, and a ≤ b means that a is a subset of b. It
turns out that the Inclusion-Exclusion principle can be formulated in terms
of this poset, and then generalised so as to apply to any poset.
We begin with an observation which will not be proved here.

Theorem 9.7 Let P = (A, ≤) be a finite poset. Then we can label the
elements of A as a1 , a2 , . . . , an such that, if ai ≤ aj , then i ≤ j.

This is sometimes stated “Every poset has a linear extension”. The anal-
ogous result for infinite posets requires a weak form of the Axiom of Choice
in its proof.
Now let P = (A, ≤) be a poset. We define the incidence algebra of P as
follows: the elements are all functions f : A × A → R such that f (a, b) = 0
unless a ≤ b. Addition and scalar multiplication are defined in the obvious
way, and multiplication by the rule
 X
 f (a, c)g(c, b) if a ≤ b,
f g(a, b) = a≤c≤b
0 if a 6≤ b.

54
If we number the elements of A as in the preceding theorem, then we can
represent a function from A × A to R by an n × n matrix; the definition
of the incidence algebra shows that any function which lies in the algebra
is upper triangular. The multiplication in the algebra is then just matrix
multiplication, so the incidence algebra is a subalgebra of the algebra of all
n × n real matrices.
We now define three particular elements of the incidence algebra.
(a) ι is the identity function:

1 if a = b,
ι(a, b) = ,
0 if a 6= b
represented by the identity matrix.
(b) ζ is the zeta function:

1 if a ≤ b,
ζ(a, b) =
0 if a 6≤ b.

(c) µ, the Möbius function, is the inverse of the zeta function: µζ = ζµ = ι.


The zeta function is represented by an upper unitriangular matrix with
integer entries; so its inverse, the Möbius function, is also represented by an
upper unitriangular matrix with integer entries. Its definition shows that, if
a < b, then X
µ(a, c) = 0,
a≤c≤b

so that X
µ(a, b) = − µ(a, c).
a≤c<b

This gives a recursive method for calculating the Möbius function, as we will
see.
From the definition, we immediately have the Möbius inversion formula:
Theorem 9.8 Let P be a poset with Möbius function µ. Then the following
are equivalent:
P
(a) g(a, b) = a≤c≤b f (a, c) for all a ≤ b;
P
(b) f (a, b) = a≤c≤b g(a, c)µ(c, b) for all a ≤ b.

55
9.4 Some examples
The preceding remark shows that the value of µ(a, b) depends only on the
structure of the interval [a, b] = {c : a ≤ c ≤ b}.
Many important posets have a least element (which is usually called 0)
and a “homogeneity property”: for any a, b with a ≤ b, there is an element
c such that the interval [a, b] is isomorphic to the interval [0, c]. In a poset
with this property, µ(a, b) = µ(0, c), and we can regard the Möbius function
as a one-variable function.

A chain
A chain, or linear order, is a poset in which every pair of elements is
comparable. Any finite chain is isomorphic to {0, 1, . . . , n − 1} with the
usual order. Its Möbius function is given by
1 if b = a,
(
µ(a, b) = −1 if b = a + 1,
0 otherwise.
This follows immediately from the recursive method of computing µ.
In this case, any interval [a, b] is isomorphic to the interval [0, b − a], so
it would have sufficed to take a = 0; but the general case is simple enough.

Direct product
The direct product of posets P1 = (A1 , ≤1 ) and P2 = (A2 , ≤2 ) has set
A1 × A2 (Cartesian product), and
(a1 , a2 ) ≤ (b1 , b2 ) ⇔ a1 ≤1 b1 and a2 ≤2 b2 .
It is easily checked that
µ((a1 , a2 ), (b1 , b2 )) = µ(a1 , b1 )µ(a2 , b2 ).
This extends in a straightforward way to the direct product of any finite
number of posets.

Subsets of a set The poset of all subsets of {1, 2, . . . , n} can be represented


as the direct product of n copies of the 2-element chain {0, 1}; the subset a
is identified with the n-tuple (a1 , . . . , an ), where
1 if i ∈ a,
n
ai =
0 if i ∈ / a.

56
It follows from the two preceding paragraphs that the Möbius function is

(−1)|b\a| if a ⊆ b,
µ(a, b) =
0 if a 6⊆ b.

In this case, if a ⊆ b, then [a, b] is isomorphic to [∅, b \ a], and we see the
homogeneity property in action. So the following are equivalent:
P
(a) f (a) = b≤a g(b);

(b) g(a) = b≤a f (b)(−1)|a\b| .


P

With a little rearrangement, this is a generalisation of the Inclusion-Exclusion


principle, with cardinality replaced by an arbitrary function (see Exercise 1).

The classical Möbius function The classical Möbius function from num-
ber theory is defined on the natural numbers; the partial order is given by
a ≤ b if a divides b. Although this partial order is infinite, all intervals are
finite, and it has the homogeneity property: if a | b, then the interval [a, b] is
isomorphic to [1, b/a].
This poset is isomorphic to the product of chains, one for each prime
power. We have
1 if b = a,
(
µ(pa , pb ) = −1 if b = a + 1,
0 otherwise.
Hence we have the general formula:

d
µ(m, n) = (−1) if m | n and n/m is a product of d distinct primes,
0 otherwise.

In particular, µ(1, n) is the number-theorists’ Möbius function, which they


write as µ(n). We have the classical Möbius inversion formula, the equiva-
lence of the following functions f, g on N:
P
(a) g(n) = m|n f (m);
P
(b) f (n) = m|n f (n)µ(n/m).

57
Subspaces of a vector space For our final example, let A be the set of all
subspaces of an n-dimensional vector space over a field of order q. If V ≤ W ,
the structure of the interval [V, W ] depends only on dim(W ) − dim(V ), and
so is isomorphic to [{0}, W/V ].
Recall the q-binomial theorem:
n n  
k(k−1)/2 k n
Y X
i−1
(1 + q z) = q z .
i=1 k=0
k q

Putting z = −1, the left-hand side becomes 0; then we have


n−1  
n n(n−1)
X
k k(k−1)/2 n
(−1) q /2 = − (−1) q .
k=0
k q

This shows, recursively, that if dim(V ) = n, then µ[{0}, V ] = (−1)n q n(n−1)/2 .

Exercises
1 Let (Ai : i = 1, . . . , n} be a family of subsets of a set X. For I ⊆
{1, . . . , n}, let
• f (I) be the number of points lying in Ai for all i ∈ I, and
• g(I) be the number of points lying in Ai for all i ∈ I and for no i ∈
/ I.
Prove that X
f (I) = g(J),
J⊇I

and deduce from Theorem 9.8 and the form of the Möbius function for the
power set of a set that
X
g(I) = (−1)|J\I| f (J).
J⊃I

Putting I = ∅, deduce the classical form of the Inclusion–Exclusion principle.

2 There is a partial order on the set of all partitions of {1, . . . , n}, defined
as follows: if a and b are partitions, say that a refines b if every part of b is
a union of parts of a.
Can you find the Möbius function of this partial order?

58
3 Prove the following “approximate version” of Inclusion-Exclusion:

Let A1 , . . . , An , A01 , . . . , A0n be subsets of a set X. For I ⊆ N =


{1, . . . , n}, let

\ \
aI = Ai , a0I = A0i .
i∈I i∈I

If aI = a0I for all proper subsets I of N , then |aN −a0N | ≤ |X|/2n−1 .

4 Prove that the exponential generating function for the derangement num-
bers dn (Theorem 9.5) is
X dn xn e−x
= .
n≥0
n! 1−x

Give an alternative proof of this formula, by showing that, if Derang is the


species of derangements, then

Perm = Set · Derang.

(A set carrying a permutation is the union of the set of fixed points and a
set none of whose points is fixed.)

5 The following problem, based on the children’s game “Screaming Toes”,


was suggested to me by Julian Gilbey.

n people stand in a circle. Each player looks down at someone


else’s feet (i.e., not at their own feet). At a given signal, everyone
looks up from the feet to the eyes of the person they were looking
at. If two people make eye contact, they scream. What is the
probability of at least one pair of people screaming?

Prove that the required probability is


bn/2c
X (−1)k−1 (n)2k
2k 2k k!
,
k=1
(n − 1)

where (n)j = n(n − 1) · · · (n − j + 1).

59
10 Cayley’s Theorem
The course ends with four entirely different proofs of Cayley’s theorem for
the number of labelled trees on n vertices, some of which introduce new ideas.
There is a direct bijective proof due to Prüfer; Joyal’s proof using species;
a proof using Kirchhoff’s Matrix-Tree Theorem; and a proof using Lagrange
inversion.
A tree is a connected graph without cycles. It is not hard to show by
induction that a tree on n vertices has n − 1 edges. There are 16 trees on the
vertex set {1, 2, 3, 4}: four of them are “stars” in which one vertex is joined
to the other three, and the other twelve are “paths”.

Theorem 10.1 The number of labelled trees on the vertex set {1, . . . , n} is
nn−2 .

10.1 Prüfer codes


We construct a bijection between the set of all trees on the vertex set
{1, . . . , n} and the set of all (n − 2)-tuples of elements from this set. The
tuple associated with a tree is called its Prüfer code.
First we describe the map from trees to Prüfer codes. Start with the
empty code. Repeat the following procedure until only two vertices remain:
select the leaf with smallest label; append the label of its unique neighbour
to the code; and then remove the leaf and its incident edge.
Next, the construction of a tree from a Prüfer code P . We use an auxiliary
list L of vertices added as leaves, which is initially empty. Now, while P is
not empty, we join the first element of P to the smallest-numbered vertex
v which is not in either P or L, and then add v to L and remove the first
element of P . When P is empty, two vertices have not been put into L; the
final edge of the tree joins these two vertices.
I leave it as a (quite non-trivial) exercise to show that these maps are
inverse bijections.
This proof gives extra information: the valency of vertex i of the tree
is one more than the number of occurrences of i in its Prüfer code; so the
number of trees with prescribed vertex valencies can be calculated.

60
10.2 A proof using species
Let Lin and Perm be the species of linear orders and permutations respec-
tively. We have seen that these two species have the same counting function
for labelled structures on n points (namely n!); so Lin[F] and Perm[F] will
also have the same counting function for labelled structures, for any species
F.
Joyal takes F = RTree, the species of rooted trees (trees with a distin-
guished vertex.
Now Lin[RTree] consists of a linear order on a set, say {1, 2, . . . , k} with
the usual order, with a rooted tree at each point. We can identify the root of
the tree at point i to be i itself. What we have constructed is a tree with a
distinguished path {1, 2, . . . , k}. Joyal calls such an object a vertebrate, since
it has a “backbone” from the “head” 1 to the “tail” k. We get a vertebrate
by taking a tree on n vertices and distinguishing two of them to be the head
and the tail; in a tree there is a unique path between any two vertices. So
the number of vertebrates is n2 T (n), where T (n) is the number of trees.
Also Perm[RTree] consists of a set of, say, k points carrying a permu-
tation, with a rooted tree attached at each point. If we direct every edge of
each tree towards the root, we have a picture representing what Joyal calls
an endofunction, a function from {1, . . . , n} to itself. Such a function has a
set of “periodic points” which return to their initial positions after finitely
many steps; any other point is “transient”, and the transient points feed into
periodic points in a treelike fashion. The number of endofunctions is clearly
nn .
So n2 T (n) = nn , giving the result.

10.3 The Matrix-Tree Theorem


This theorem, proved by Kirchhoff in the nineteenth century for analysis of
electrical circuits, depends on the notion of the Laplacian matrix of a graph
G = (V, E). Assuming that V = {v1 , . . . , vn }, this is the n × n symmetric
matrix whose (i, i) entry is the valency of vertex vi , and whose (i, j) entry
for i 6= j is −1 if {vi , vj } is an edge, and 0 otherwise. Note that the row sums
of this matrix are all zero, so its determinant is zero.
Recall that the (i, j) cofactor of a square matrix A is the determinant
of the matrix obtained from A by deleting the ith row and the jth column,
multiplied by (−1)i+j .

61
Theorem 10.2 The cofactors of the Laplacian matrix of a graph are all
equal to the number of spanning trees of the graph.
A tree on the vertex set {1, . . . , n} is simply a spanning tree of the com-
plete graph, the graph whose edges are all pairs of vertices. The Lapla-
cian matrix of the complete graph is nIn − Jn , where In and Jn denote the
n × n identity and all-1 matrices. Deleting the last row and column gives
nIn−1 − Jn−1 .
We find the determinant of the last matrix by computing its eigenvalues.
Every row and column sum is n − (n − 1) = 1, so the all-1 vector is an
eigenvector with eigenvalue 1. If v is a vector orthogonal to the all-1 vector,
then Jn−1 v = 0, so v is an eigenvector with eigenvalue n. Thus nIn−1 − Jn−1
has eigenvalues 1 (multiplicity 1) and n (multiplicity n−2); so its determinant
is nn−2 , which is thus the number of spanning trees.
The proof of the Matrix-Tree Theorem depends on the Cauchy–Binet for-
mula, a nineteenth century determinant formula which asserts the following.
et A be an m × n matrix, and B an n × m matrix, where m < n. Then
X
det(AB) = det(A(X)) det(B(X)),
X

where X ranges over all m-element subsets of {1, . . . , n}. Here A(X) is the
m × m matrix whose columns are the columns of A with index in X, and
B(X) is the m × m matrix whose rows are the rows of B with index in X.
To prove the Matrix-Tree Theorem for the graph G = (V, E) with Lapla-
cian matrix L(G), choose an arbitrary orientation of the edges of G, and let
M be the signed vertex-edge incidence matrix of G, with (v, e) entry +1 if
v is the “head” of the arc e, −1 if v is the “tail” of e, and 0 otherwise. It is
straightforward to show that M M > = L(G). Let v be any vertex of G, and
let N = Mv be the matrix obtained by deleting the row of M indexed by e.
It can be shown that, if X is a set of n − 1 edges, then
±1 if X is the edge set of a spanning tree,
n
det(N (X)) =
0 otherwise.
By the Cauchy–Binet formula, det(N N > ) is equal to the number of spanning
trees. But N N > is the principal cofactor of L(G) obtained by deleting the
row and column indexed by v.
The fact that all cofactors are equal is not really necessary for us, and
can be proved by elementary linear algebra.

62
10.4 Lagrange inversion
Our final approach involves another general technique, Lagrange inversion.
Let G be the set of all formal power series (over the commutative ring R
with identity) which have the form x + · · ·, that is, constant term is zero and
coefficient of x is 1. Any of these series can be substituted into any other.
We make a simple observation:

Proposition 10.3 The set G, with the operation of substitution, is a group.

This group is sometimes called the Nottingham group, for reasons that
are a little obscure.

Proof Closure and the associative law are straightforward, and the formal
power series x is the identity. Let f (x) = x +a2 x2 +a3 x3 +· · · be any element
of G. We seek an inverse g(x) = x + b2 x2 + b3 x3 + · · · such that f (g(x)) = x.
The coefficient of xn in

f (g(x)) = g(x) + a2 g(x)2 + a3 g(x)3 + · · ·

is bn + stuff, where stuff involves the as and bi for i < n. Equating it to zero
gives bn in terms of as and bi for i < n; so the bs can be found recursively.
In a similar way, we find a unique element h(x) ∈ G for which h(f (x)) = x.
Then
g(x) = h(f (g(x)) = h(x),
and the inverse is unique. 

The proof implicitly shows us how to find the inverse; Lagrange inversion
gives a more direct approach.

Theorem 10.4 The coefficient of xn in g(x) is


 n−1  n 
d x .
n!.
dxn−1 f (x) x=0

I will not give the proof here; it involves working with Laurent series and
extending the notion of poles and the calculus of residues to formal power
series.

63
Now let RTree be the species of rooted trees, as before. We clearly have
the equation
RTree = E · Set[RTree],
where E is the species of 1-element sets; this is because a rooted tree is a
(ppossibly empty) set of rooted trees all joined to a new root.
Thus the exponential generating function T ∗ (x) for rooted trees satisfies

T ∗ (x) = x exp(T ∗ (X)).

So the function T ∗ (x) is the inverse (in the group G) of the function x/ exp(x).
From Lagrange inversion, we find that the coefficient of xn /n! in T ∗ (x) is
 n−1 
d
exp(nx) = nn−1 .
dxn−1 x=0

Since the number of rooted trees is n times the number of trees, we conclude
that there are nn−2 trees on n vertices.

10.5 Stirling’s formula


The most famous asymptotic formula in enumerative combinatorics is Stir-
ling’s formula, an estimate for the factorial function. We write f ∼ g to mean
that f (n)/g(n) → 1 as n → ∞. Typically this is used with f a combinatorial
counting function and g an analytic approximation to f . Stirling’s formula
is an example.
√  n n
Theorem 10.5 n! ∼ 2πn .
e
It follows that, if T (n) is the number of labelled trees on n vertices, then
 1/n
T (n)
lim = e,
n→∞ n!

so the exponential generating function for T (n) has radius of convergence


1/e.
Using more complicated methods, Otter showed that the number of unla-
belled trees on n vertices is asymptotically An−5/2 cn , where A = 0.5349485 . . .
and c = 2.955765 . . ..

64
Exercises
1 Calculate the chromatic polynomial of

(a) the path with n vertices,


(b) the cycle with n vertices.

2 A forest is a graph whose connected components are trees. Show that


there is a bijection between labelled forests of rooted trees on n vertices, and
labelled rooted trees on n + 1 vertices with root n + 1.
Use Stirling’s formula to show that, if a forest of rooted trees on n vertices
is chosen at random, then the probability that it is connected tends to the
limit 1/e as n → ∞.

3 Count the labelled trees in which the vertex i has valency ai for 1 ≤ i ≤ n,
where a1 , . . . , an are positive integers with sum 2n − 2.

65

You might also like