Generating Functions
Generating Functions
Abstract
These are the notes of my lecture course on Enumerative Combina-
torics at the London Taught Course Centre in Autumn 2013. Thanks
to all who attended for their support. There are ten sections, as fol-
lows:
• Subsets, partitions, permutations
• Formal power series
• Catalan numbers
• Unimodality
• q-analogues
• Symmetric polynomials
• Group actions
• Species
• Möbius inversion
• Cayley’s theorem
Exercises are included at the end of the sections.
1
(a) finding an exact formula for the number f (n) of structures of size n;
(e) finding an efficient algorithm for stepping from one of the counted ob-
jects to the next (in some natural ordering).
In this course I will mostly be concerned with the first three goals; discussing
algorithms would lead too far afield. The exception to this is one particularly
important algorithm, a recurrence relation, in which the value of f (n) is
computed from n and the earlier values f (0), . . . , f (n − 1).
An asymptotic formula for f (n) is an analytic function g(n) such that
f (n)/g(n) → 0 as n → ∞. There are several types of generatingX functions;
the most important for us are the ordinary generating function f (n)xn ,
n≥0
X f (n)xn
and the exponential generating function .
n≥0
n!
If you want to learn the state-of-the-art in combinatorial enumeration,
I recommend the two volumes of Richard Stanley’s Enumerative Combina-
torics, or the book Analytic Combinatorics by Philippe Flajolet and Robert
Sedgewick. The On-line Encyclopedia of Integer Sequences is another valu-
able resource for combinatorial enumeration.
1.1 Subsets
The three most important objects in elementary combinatorics are subsets,
partitions and permutations; we briefly discuss the counting functions for
these. First, subsets.
The total number of subsets of an n-element set is 2n . This can be used
by noting that this number f (n) satisfies the recurrence relation f (n) =
2f (n − 1); this is proved by observing that any subset of {1, . . . , n − 1} can
be extended to a subset of {1, . . . , n} in two different ways, either including
the element n or not.
2
n
The binomial coefficient is the number of k-element subsets of
k
{1, . . . , n}. The formula is
n n(n − 1) · · · (n − k + 1) n!
= = .
k k(k − 1) · · · 1 k! (n − k)!
Note
that there
are k factors in both numerator and denominator. We have
n n
= = 1. We can extend the definition to all non-negative inte-
0 n
n
gers n and k by defining = 0 for k > n: this fits with the counting
k
interpretation.
The recurrence relation for binomial coefficients is Pascal’s Triangle
n n−1 n−1
= + for 0 < k < n.
k k−1 k
For the first term on the right counts subsets containing n, while the second
counts subsets not containing n.
Counting subsets by cardinality gives
n
X n
= 2n .
k=0
k
3
If we multiply this equation by y n and sum, we obtain the bivariate
generating function for the binomial coefficients:
n
X X n k n X
x y = (1 + x)n y n
n≥0 k=0
k n≥0
1
=
1 − (1 + x)y
1 1
= ·
1 − y 1 − xy/(1 − y)
X yk
= k+1
xk ,
k≥0
(1 − y)
1.2 Partitions
In this case and the next, we are unable to write down a simple formula
for the counting numbers, and have to rely on recurrence relations or other
techniques.
The Bell number B(n) is the number of partitions of a set of cardinality
n. We refine this in the same way we did for subsets. The Stirling number of
the second kind, S(n, k), is the number of partitions of an n-set into k parts.
4
Thus, S(0, 0) = 1 and S(0, k) = 0 for k > 0; and if n > 0, then S(n, 0) = 0,
S(n, 1) = S(n, n) = 1, and S(n, k) = 0 for k > n. Clearly we have
n
X
S(n, k) = B(n) for n > 0.
k=1
It turns out that we can turn this into a statement about a generating
function, but with a twist. Let
Then we have n
X
n
x = S(n, k)(x)k for n > 0. How?
k=1
1.3 Permutations
The number of permutations of an n-set (bijective functions from the set to
itself) is the factorial function n! = n(n − 1) · · · 1 for n ≥ 0. The exponen-
tial generating function for this sequence is 1/(1 − x), while the ordinary
generating function has no analytic expression (it is divergent for all x 6= 0).
5
Any permutation can be decomposed uniquely into disjoint cycles. So we
refine the count by letting u(n, k) be the number of permutations of an n-set
which have exactly k cycles (including cycles of length 1). Thus,
n
X
u(n, k) = n! for n > 0.
k=1
The numbers u(n, k) are the unsigned Stirling numbers of the first kind. The
reason for the name is that it is common to use a different count, where a
permutation is counted with weight equal to its sign (as defined in elementary
algebra, for example the theory of determinants). Let s(n, k) be the sum of
the signs of the permutations of an n-set which have k cycles. Since the sign
of such a permutation is (−1)n−k , we have s(n, k) = (−1)n−k u(n, k). The
numbers s(n, k) are the signed Stirling numbers of the first kind.
We have n
X
s(n, k) = 0 for n > 1.
k=1
This is related to the algebraic fact that, for n > 1, the permutations with
sign + form a subgroup of the symmetric group of index 2 (that is, containing
half of all the permutations), called the alternating group.
We will mainly consider signed Stirling numbers below, though it is some-
times convenient to prove a result first for the unsigned numbers.
As usual we take s(n, 0) = 0 for n > 0 and s(n, k) = 0 for k > n.
We have s(n, n) = 1, s(n, 1) = (−1)n−1 (n−1)!, and the recurrence relation
Putting x = 1 in this equation shows that indeed the sum of the signed
Stirling numbers is zero for n > 1.
Note that this is the inverse of the relation we found for the Stirling
numbers of the second kind. So the matrices formed by the Stirling numbers
of the first and second kind are inverses of each other.
6
Exercises
1. Let A be the matrix of binomial
coefficients (with rows and columns
i
indexed by N, and (i, j) entry ), and B the matrix of “signed binomial
j
i−j i
coefficients” (as before but with (i, j) entry (−1) ). Prove that A and
j
B are inverses of each other.
What are the entries of the matrix A2 ?
n 2
X n 2n
2. Prove that = .
k=0
k n
3. (a) Prove that the following are equivalent for sequences (a0 .a1 , . . .) and
(b0 , b1 , . . .), with exponential generating functions A(x) and B(x) respec-
tively:
n
X
(ii) b0 = a0 and bn = S(n, k)ak for n ≥ 1;
k=1
(b) Prove that the following are equivalent for sequences (a0 .a1 , . . .) and
(b0 , b1 , . . .), with exponential generating functions A(x) and B(x) respec-
tively:
n
X
(i) b0 = a0 and bn = s(n, k)ak for n ≥ 1;
k=1
7
2 Formal power series
Probably you recognised in the last chapter a few things from analysis, such
as the exponential and geometric series; you may know from complex analysis
that convergent power series have all the nice properties one could wish. But
there are reasons for considering non-convergent power series, as the following
example shows.
Recall the generating function for the factorials:
X
F (x) = n!xn ,
n≥0
Hence, if X
G(x) = 1 − C n xn ,
n≥1
8
We denote the set of all formal power series by R[[x]].
The set R[[x]] has a lot of structure: there are many things we can do with
formal power series. All we require of any operations is that they only require
adding or multiplying finitely many elements of R. No analytic properties
such as convergence of infinite sums or products are required to hold in R.
(b) Multiplication: The rule for multiplication of formal power series, like
that of matrices, looks unnatural but is really the obvious thing: we
multiply powers of x by adding the exponents, and then just gather up
the terms contributing to a fixed power. Thus
X X X
n n
an x · bn x = cn x n ,
where n
X
cn = ak bn−k .
k=0
Note that to produce a term of the product, only finitely many addi-
tions and multiplications are required.
(c) Infinite sums and products: These are not always defined. Suppose, for
example, that A(i) (x) are formal power series for i = 0, 1, 2, . . .; assume
that the first non-zero coefficient in A(i) (x) is the coefficient of xni ,
where ni → ∞ as i → ∞. Then, to work out the coefficient of xn in
the infinite sum, we only need the finitely many series A(i) (x) for which
ni ≤ n. Similarly, the product of infinitely many series B (i) is defined
provided that B (i) (x) = 1 + A(i) (x), where A(i) satisfy the condition
just described.
(d) Substitution: Let B(x) be a formal power series with constant term
zero. Then, for any formal power series A(x), the series A(B(x)) ob-
tained by substituting B(x)
P for x inn A(x) is defined. For, if A(x) =
n
an B(x) , and B(x) has no terms in xk
n
P
an x , then A(B(x)) =
for k < n.
9
(f) Negative powers: We can extend the notion of formal power series to
formal Laurent series, which are allowed to have finitely many negative
terms: X
an x n .
n≥−m
(g) Multivariate formal power series: We do not have to start again from
scratch to define series in several variables. For R[[x]] is a commutative
ring with identity, and so R[[x, y]] can be defined as the set of formal
power series in y over R[[x]].
This argument shows the very close connection between finding inverses
in R[[x]] and solving linear recurrence relations.
10
2.2 Example: partitions
We are considering partitions of a number n, rather than of a set, here. A
partition of n is an expression for n as a sum of positive integers arranged in
non-increasing order; so the five partitions of 4 are
4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1.
where the sum contains all terms where the argument n − k(3k ± 1)/2 is
non-negative.
then we find p(5) = p(4) + p(3) − p(0) = 7, p(6) = p(5) + p(4) − p(1) = 11,
and so on.
I will give a brief sketch of the proof.
k
For on the right, we have the product of geometric series 1 + xP + x2k + · · ·,
n
and the coefficient of x is the number of ways of writing n = kak , which
is just p(n).
11
Step 2: The inverse of the generating function. We need to find
Y
(1 − xk ).
k≥1
9 = 8 + 1 = 7 + 2 = 6 + 3 = 5 + 4 = 6 + 2 + 1 = 5 + 3 + 1 = 4 + 3 + 2,
so there are four sums with an even number of terms and four with an odd
number of terms, and so the coefficient is zero.
Exercises
1. Suppose that R is a field. Show that R[[x]] has a unique maximal ideal,
consisting of the formal power series with constant term zero. Describe all
the ideals of R[[x]].
2. Suppose that A(x), B(x) and C(x) are the exponential generating func-
tions of sequences (an ), (bn ) and (cn ) respectively. Show that A(x)B(x) =
C(x) if and only if
n
X n
cn = ak bn−k .
k=0
k
12
3. (a) Let (an ) be a sequence of integers, and (bn ) the sequence of partial sums
X n
of (an ) (in other words, bn = ai ). Suppose that the generating function
i=0
for (an ) is A(x). Show that the generating function for (bn ) is A(x)/(1 − x).
(b) Let (an ) be a sequence of integers, and let cn = nan for all n ≥
0. Suppose that the generating function for (an ) is A(x). Show that the
generating function for (cn ) is x(d/dx)A(x). What is the generating function
for the sequence (n2 an )?
(c) Use the preceding parts of this exercise to find the generating function
X n
for the sequence whose nth term is i2 , and hence find a formula for the
i=1
sum of the first n squares.
3 Catalan numbers
In the last chapter, as in most of this course, we treated power series as formal
objects: even differentiation involves no limiting processes. However, if the
coefficients are complex numbers, and the series converge in some neighbour-
hood of the origin, then analytic methods can be used. These methods can
be very powerful. We will see them at work in the derivation of a formula for
the Catalan numbers, and then give a few examples of combinatorial objects
counted by Catalan numbers.
3.1 Analysis
A complex function which is analytic in some neighbourhood of the origin
is represented by a convergent power series in a disc about the origin. If
an analytic relation between functions holds in a suitable disc, then any
connection between the coefficients which can be derived will also be true in
the world of formal power series.
The most important formal power series to which this principle can be
applied are
X a
a
(a) The binomial series (1 + x) = xn , where a is any complex
n≥0
n
13
number, and the binomial coefficient is defined as
a a(a − 1) · · · (a − n + 1)
= .
n n!
X xn
(b) The exponential series exp(x) = ..
n≥0
n!
X (−1)n−1 xn
(c) The logarithmic series log(1 + x) = .
n≥1
n
X of n terms, for n ≥ 1,
Let Cn be the number of evaluations of a product
so that C1 = C2 = 1, C3 = 2, C4 = 5. Let c(x) = Cn xn be the generating
n≥1
function.
14
In a bracketing of n terms, the last application of ◦ will combine some
product of the first m terms with some product of the last n − m terms, for
some m with 1 ≤ m ≤ n − 1. So we have the recurrence relation
n−1
X
Cn = Cm Cn−m for n > 1.
m=1
c(x) = x + c(x)2 .
c(x)2 − c(x) + x = 0.
The choice of sign in the square root is determined by the fact that c(0) = 0,
so we must take the negative sign:
√
c(x) = 21 1 − 1 − 4x .
and so
1/2
Cn = − 12 (−4)n .
n
Now
1/2 (1/2)(−1/2) · · · (−(2n − 3)/2)
=
n n!
15
1 1 · 3 · (2n − 3)
= n
(−1)n−1
2 n!
1 1 (2n − 2)!
= n (−1)n−1 n−1
2 n 2 ((n − 1)!)2
1 n 1 2n − 2
= −2(− 4 ) ,
n n−1
so finally we obtain
1 2n − 2
Cn = .
n n−1
The result and its proof call for a few remarks.
First, are these manipulations really valid?
(a) We have used here the Binomial Theorem for exponent 1/2, which is
proved analytically by observing that the function (1 + x)1/2 is analytic
in the interior of the unit disc (it has a branchpoint at x = −1), and
then using the formula for the coefficient of xn in the Taylor series
(differentiate n times, put x = 0, divide by n!).
Second, this is a case where, even once you know the formula for the Cata-
lan numbers, it is difficult to show directly that they satisfy the recurrence
relation. (Spend a few moments trying; you will be convinced of this!)
And
third,it is not at all obvious that n divides the binomial coeffi-
2n − 2
cient ; but since Cn counts something, it is an integer, and so this
n−1
divisibility is indeed true.
16
(b) to find a bijection to a known class of Catalan objects.
There are sometimes other less obvious ways, as we will see in the case of
Dyck paths.
Where possible I have given an illustration of the five Catalan objects
counted by C4 .
Binary trees
A binary tree has a root of degree 2; the other vertices have degree 1 or
3. So every non-root vertex is either a leaf or has two descendants, which we
specify as left and right descendants.
The number of binary trees with n leaves is Cn . Figure 1 shows the
correspondence with bracketed products: the tree is a “parse tree” for the
product.
r r r r r r r r
r \\r \
\r r r r r r r \\r \\ r r
r \\r r \r LLr LLr
\r r r r
\ \
\
\\r \\r \\r r
\ r
\
(a ◦ (b ◦ (c ◦ d))) (a ◦ ((b ◦ c) ◦ d)) ((a ◦ b) ◦ (c ◦ d)) ((a ◦ (b ◦ c)) ◦ d) (((a ◦ b) ◦ c) ◦ d)
17
Dissections of polygons
An n-gon can be dissected into triangles by drawing n − 2 non-crossing
diagonals. There are Cn−1 dissections of an n-gon. Figure 3 shows dissections
of a pentagon.
q q q q q
q Zq q BB Zq q Zq q BB Zq q Zq
Z Z Z Z Z
B Z
B ZB B B B BZZ
q
B q Bq ZBq q
B q Bq Bq Bq Zq
Dyck paths
A Dyck path starts at the origin and ends at (2n, 0), moving at each step
to the adjacent lattice point in either the north-easterly or south-easterly
direction and never going below the X-axis. (An even number of steps is
required since each step either increases or decreases the Y-coordinate by 1.)
Figure 4 shows the Dyck paths with n = 3.
q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q
q q q q@q q q
@ q q q q q q q q q q q q q q q q q q q q q q q q q q q q
q q q q q@q q q q q q
@ q q q
@ q q q q q
@ q q q q q q q@q q
@ q q q q q q q
q q q q q q@q q q q q q @
q@q q q q @
q@q q q
@ q q q
@ q q q@q q q q
@ q q
@ q q
@
The number of Dyck paths is Cn+1 , and of these, Cn never return to the
X-axis before the end. I will indicate the proof since it illustrates another
technique.
Let Dn be the number of Dyck paths, and En the number which never
return to the axis. Now a Dyck path begins by moving from (0, 0) to (1, 1)
and ends by moving from (2n − 1, 1) to (2n, 0); if it did not return to the
axis in between, then removing these “legs” gives a shorter Dyck path. So
En = Dn−1 .
Suppose that a Dyck path first returns to the axis at (2k, 0). Then it is a
composite of a non-returning Dyck path of length 2k with an arbitrary Dyck
18
path of length 2(n − k); so
n
X
Dn = Ek Dn−k .
k=1
Ballot numbers
An election is held with two candidates A and B, each of whom receives
exactly n votes. In how many ways can the votes be counted so that A is
never behind in the count?
It is easy to match these ballot numbers with Dyck paths. For n = 3, the
five counts are AAABBB, AABABB, AABBAB, ABAABB, and ABABAB.
This can be described another way. In a 2×n array, we place the numbers
1, . . . , 2n in order against the candidates who receive those votes. This gives
the representations shown in Figure 5.
1 2 3 1 2 4 1 2 5 1 3 4 1 3 5
4 5 6 3 5 6 3 4 6 2 5 6 2 4 6
AAABBB AABABB AABBAB ABAABB ABABAB
Figure 5: Tableaux
Note that the numbers increase along each row and down each column.
19
invent a ballot interpretation for the number of Young tableaux belonging to
a given diagram.
This combinatorics is important in describing the representation theory
of the symmetric group Sn , the group of all permutations of {1, . . . , n}. It
is known that the irreducible matrix representations of Sn over the complex
numbers are in one-to-one correspondence with the partitions of n (that is, to
the Young diagrams); the degree of a representation is equal to the number
of Young tableaux belonging to the corresponding diagram. Thus, the five
Young tableaux shown in the preceding section correspond to an irreducible
representation of degree 5 of the group S6 .
There is a “hook length formula” for the number of Young tableaux cor-
responding to a given diagram. The hook associated with a cell consists of
that cell and all those to its right in the same row or below it in the same
column. The hook length of a cell is the number of cells in its hook. Now the
number of Young tableaux associated with the diagram is equal to n! divided
by the product of the hook lengths of all its cells.
Thus for the diagram with two rows of length 3, the formula gives
6!
= 5.
4·3·2·3·2·1
20
This is much more difficult to solve. Whereas Cn is roughly 4n (in the
1/n
sense that the limit of Cn as n → ∞ is 4), Wn is roughly 2.483 . . .n in the
same sense.
Exercises
1 Give a counting proof of the Vandermonde convolution in the case where
a and b are natural numbers.
2 Verify some of the formulae for Catalan objects in the notes, either by
deriving a recurrence, or by finding bijections between the objects counted.
4 Use the hook length formula to derive the formula for the Catalan number
Cn .
5 Prove the recurrence relation and the equation for the generating function
for the Wedderburn–Etherington numbers.
4 Unimodality
It is well known that the binomial coefficients increase up to halfway, and
then decrease. Indeed, the shape of the bar graph of binomial coefficients
is well approximated by a scaled version of the “bell curve” of the normal
distribution.
....................................
....... ......
...... .....
....... ....
.... ....
...... ....
....
..
... ....
..
... ....
.
.... ....
....
.
.... ....
....
. ....
..
..
..... ......
.......
..
..
..
..... ........
...
..
..
.......
. ..........
..............
......
............................ .........................
21
This property of binomial coefficients is easily proved. Since
n n−k n
= ,
k+1 k+1 k
the binomial coefficient increases from k to k + 1, remains constant, or de-
creases, according as n − k > k + 1, n − k = k + 1 or n − k < k + 1, that
is, as n is greater than, equal to, or less than 2k + 1. So, if n is even, the
binomial coefficients increase up to k = n/2 and then decrease; if n is odd,
the two middle values (k = (n − 1)/2 and k = (n + 1)/2) are equal, and they
increase before this point and decrease after.
Other combinatorial numbers also show this unimodality property, but
in cases where we don’t have a formula, we need general techniques.
a0 ≤ a1 ≤ · · · ≤ am ≥ am+1 ≥ · · · ≥ an .
Before we begin the proof, we note that a polynomial with all coefficients
positive cannot have a real non-negative root, and a polynomial all of whose
roots are negative has all its coefficients positive.
22
The proof is by induction: there is nothing to prove when n = 1, since
any sequence of two numbers is log-concave. For n = 2, the condition for
the polynomial a0 + a1 x + a2 x2 to have real roots is a21 − 4a0 a2 ≥ 0, which
is stronger than log-concavity; as remarked, if the roots are real, they are
negative.
Now we turn to the general case. Suppose that A(x) = (x+c)B(x), where
c > 0 and
B(x) = bn−1 xn−1 + · · · + b1 x + b0 .
Now the polynomial B(x) has all its roots real and negative, since they are
all the roots of A(x) except for −c. So the coefficients are all positive, and
by the inductive hypothesis, the sequence b0 , . . . , bn−1 is log-concave; that is,
all its roots are −1, and so the theorem shows that the binomial coefficients
are log-concave, and hence unimodal.
23
For the unsigned Stirling numbers of the first kind, we have
n
X
u(n, k)xk = x(x + 1) · · · (x + n − 1),
k=1
and the polynomial on the right has roots 0, −1, −2, . . . , −(n − 1). We can
neglect the zero root: the Stirling numbers start at k = 1 rather than zero,
and dividing by x simply changes the indexing so that they start at 0. So
again the Stirling numbers are log-concave and hence unimodal.
The Stirling numbers of the second kind are more difficult, since there
is no convenient form for the generating polynomial. We start with the
recurrence relation
S(n, 1) = S(n, n) = 1, S(n, k) = S(n−1, k−1)+kS(n−1, k) for 1 < k < n.
Let n
X
An (x) = S(n, k)xk .
k=0
24
Exercises
1 Let S be a fixed set of positive integers, and let rn be the number of
partitions ofPn into distinct parts from the set S. What is the generating
polynomial rn xn ? Is the sequence (rn ) unimodal?
5 q-analogues
In a sense, a q-analogue of a combinatorial formula is simply another formula
involving a variable q which has the property that, as q → 1, the second
formula becomes the first. Of course there is more to it than that; some
q-analogues are more important than others. What follows is nothing like
a complete treatment; I will concentrate on a particularly important case,
the Gaussian or q-binomial coefficients, which are, in the above sense, q-
analogues of binomial coefficients.
25
Hence if we replace each factor (q r − 1) in the definition of the Gaussian
coefficient by (q r − 1)/(q − 1), then the factors (q − 1) in numerator and
denominator cancel, so the expression is unchanged; and now it is clear that
n n
lim = .
q→1 k k
q
5.2 Interpretations
Quantum calculus The letter q stands for “quantum”, and the q-binomial
coefficients do play a role in “quantum calculus” similar to that of the ordi-
nary binomial coefficients in ordinary calculus. I will not discuss this further;
see the book Quantum Calculus, by V. Kac and P. Cheung, Springer, 2002,
for further details.
Vector spaces over finite fields The letter q is also routinely used for
the number of elements in a finite field (which is necessarily a prime power,
and indeed there is a unique finite field of any given prime power order q –
a theorem of Galois).
Proof The proof follows the standard proof for binomial coefficients count-
ing subsets of a set.
A k-dimensional subspace of V is specified by choosing a basis, a sequence
of k linearly independent vectors. Now the number of choices of the first
vector is q n − 1 (since every vector except the zero vector is eligible); the
second can be chosen in q n − q ways (since the q multiples of the first vector
are now ineligible); the third in q n −q 2 ways (since the q 2 linear combinations
of the first two are now ruled out); and so on. In total,
(q n − 1)(q n − q) · · · (q n − q k−1 )
choices.
We have to divide this by the number of k-tuples of vectors which form
a basis for a given k-dimensional subspace. This number is obtained by
26
replacing n by k in the above formula, that is,
(q k − 1)(q k − q) · · · (q k − q k−1 ).
Corollary 5.2 The number of k×n matrices over a field of q elements which
n
are in reduced echelon form is .
k q
So we have
4
= q 4 + q 3 + q 2 + q 2 + q + 1 = (q 2 + 1)(q 2 + q + 1).
2 q
This expression, and the definition, are polynomials in q, which agree for ev-
ery prime power q; so they coincide. In a similar way, any Gaussian coefficient
can be written out as a polynomial.
27
The other consequence is that algebra is not required here. Over any
alphabet of size q, containing two distinguished elements 0 and 1, the number
of k × n matrices in
“reduced echelon form” (satisfying (a)–(c) above) with
n
no zero rows is .
k q
Lattice paths How many lattice paths are there from the origin to the
point (m, n), where each step in the path moves one unit either north or
east?
m+n
Clearly the number is , since we must take m + n steps of which
m
m are north and n are east, and the northward steps may occur in any m of
the m + n positions.
Suppose we want to count the paths by the area under the path (that is,
bounded by the X-axis, the line x = m, and the path). We use a generating
function approach, so that a path enclosing an area of a units contributes
q a to the overall generating function. Here q is simply a formal variable; the
answer is obviously a polynomial in q.
For example,
28
We can use the relation to move the y’s to the end in each term; each time
we jump a y over an x we pick up a factor q. So
Remark From this we can prove Theorem 5.3, as follows. Let Q(n, k) be
the sum of the weights of lattice paths from (0, 0) to (n − k, k), where the
weight of a path is q a if the area below it is a. Clearly Q(n, 0) = Q(n, n) = 1.
Consider all the lattice paths from (0, 0) to (n − k, k), and divide them
into two classes: those in which the last step is vertical, and those in which
it is horizontal. In the first case, the last step is from the end of a path
29
counted by Q(n − 1, k − 1) (ending at (n − k, k − 1)), and adds no area to the
path. In the second step, it is from the end of a path counted by Q(n − 1, k)
(ending at (n − k − 1, k)), and increases the area by k, adding q k Q(n − 1, k)
to the sum. So the numbers Q(n, k) have the same recurrence and boundary
conditions as the Gaussian coefficients, and must be equal to them.
From the last two results, we can deduce an alternative recurrence:
n n−k n − 1 n−1
=q + .
k q k−1 q k q
30
5.5 Jacobi’s Triple Product Identity
This is only loosely connected with the topics of this chapter, but is inter-
esting in its own right.
Theorem 5.8 (Jacobi’s Triple Product Identity)
Y X 2
(1 + q 2n−1 z)(1 + q 2n−1 z −1 )(1 − q 2n ) = ql zl .
n>0 l∈Z
The sharp-eyed will notice that the series on the right breaks my rules that
formal Laurent series should have only finitely many negative terms. Well,
this just shows that formal power series are more flexible than might first
appear! You can check that the three infinite products on the left contribute
only finitely many terms to each power, positive or negative, of z.
By replacing q by q 1/2 and moving the third term in the product to the
right-hand side, the identity takes the form
! !
Y X 2 Y
(1 + q n−1/2 z)(1 + q n−1/2 z −1 ) = q l /2 z l (1 − q n )−1 ,
n>0 l∈Z n>0
in which form we will prove it. The proof here is a remarkable argument by
Richard Borcherds, and this write-up from my Combinatorics textbook.
A level is a number of the form n + 21 , where n is an integer. A state is
a set of levels which contains all but finitely many negative levels and only
finitely many positive levels. The state consisting of all the negative levels
and no positive ones is called the vacuum. Given a state S, we define the
energy of S to be
X X
{l : l > 0, l ∈ S} − {l : l < 0, l 6∈ S},
Although it is not necessary for the proof, a word about the background
is in order!
Dirac showed that relativistic electrons could have negative as well as
positive energy. Since they jump to a level of lower energy if possible, Dirac
hypothesised that, in a vacuum, all the negative energy levels are occupied.
31
Since electrons obey the exclusion principle, this prevents further electrons
from occupying these states. Electrons in negative levels are not detectable.
If an electron gains enough energy to jump to a positive level, then it becomes
‘visible’; and the ‘hole’ it leaves behind behaves like a particle with the same
mass but opposite charge to an electron. (A few years later, positrons were
discovered filling these specifications.) If the vacuum has no net particles
and zero energy, then the energy and particle number of any state should be
relative to the vacuum, giving rise to the definitions given.
We show that the coefficient of q m z l on either side of the equation is equal
to the number of states with energy m and particle number l. This will prove
the identity.
For the left-hand side this is straightforward. A term in the expansion
1 1
of the product is obtained by selecting q n− 2 z or q n− 2 z −1 from finitely many
factors. These correspond to the presence of an electron in positive level
n − 21 (contributing n − 12 to the energy and 1 to the particle number), or a
hole in negative level −(n − 21 ) (contributing n − 12 to the energy and −1 to
the particle number). So the coefficient of q m z l is as claimed.
The right-hand side is a little harder. Consider first the states with parti-
cle number 0. Any such state can be obtained in a unique way from the vac-
uum by moving the electrons in the top k negative levels up by n1 , n2 , . . . , nk ,
say, where n1 ≥ n2 ≥ . . . ≥ nk . (The monotonicity is equivalent to the re-
quirement that no electron jumps over another. The jumping process allows
the possibility that some electrons jump from negative levels to higher but
still negative levels, so k is not the number of occupied positive levels.) The
energy of the state is thus m = n1 +. . .+nk . Thus, the number of states with
energy m and particle number 0 is equal to the Q number p(m) of partitions
m n −1
of m, which is the coefficient of q in P (q) = n>0 (1 − q ) , as we saw in
lecture 1.
Now consider states with positive particle number l. There is a unique
ground state, in which all negative levels and the first l positive levels are
filled; its energy is
1 3 2l − 1 1
+ + ... + = l2 ,
2 2 2 2
and its particle number is l. Any other state with particle number l is
obtained from this one by ‘jumping’ electrons up as before; so the number of
such states with energy m is p(m − 12 l2 ), which is the coefficient of q m z l in
2
q l /2 z l P (q), as required.
32
The argument for negative particle number is similar.
Exercises
1 Prove that, for fixed n, the Gaussian coefficients are unimodal.
n
2 For fixed n and k, the Gaussian coefficient is a polynomial in q of
k q
degree k(n − k), whose coefficients a0 , . . . , ak(n−k) are non-negative integers.
Prove that the coefficients are symmetric: that is, ai = ak(n−k)−i .
Remark It is also true that the coefficients are unimodal, but this is not so
easy to prove. The polynomial does not have all its roots real and negative!
n
Remark For a more challenging exercise, find a formula for , where
k ω
ω is a primitive dth root of unity.
33
6 Symmetric polynomials
A symmetric polynomial in n indeterminates is one which is unchanged under
any permutation of the indeterminates. The theory of symmetric polynomials
goes back to Newton, but more recently has been very closely connected with
the representation theory of the symmetric group, which we glanced at in
Lecture 3. I will just give a few simple results here. The best reference is Ian
Macdonald’s book Symmetric Functions and Hall Polynomials.
34
n
X
(c) The power sum polynomial pr , which is simply xri .
i=1
(t − a1 )(t − a2 ) · · · (t − an ),
and the term in tn−r is formed by choosing t from n − r of the factors and
−ai from the remaining r.
Said otherwise, and putting xi = −1/ai , this says that the generating
function for the elementary symmetric polynomials is
n
X n
Y
r
E(t) = er (x1 , . . . , xn )t = (1 + xi t),
r=0 i=1
35
with the convention that e0 = 1.
In a similar way, the generating function for the complete symmetric
polynomials is
X n
Y
H(t) = r
hr (x1 , . . . , xn )t = (1 − xi t)−1 .
r≥0 i=1
We also take P (t) to be the generating function for the power sum polyno-
mials, with a shift:
X
P (t) = pr (x1 , . . . , xn )tr−1 .
r≥1
d d
H(t) = P (t)H(t), E(t) = P (−t)E(t).
dt dt
eλ = (x1 x2 + x1 x3 + x2 x3 )(x1 + x2 + x3 ),
pλ = (x21 + x22 + x23 )(x1 + x2 + x3 ),
hλ = eλ + pλ .
We also define the basic polynomial mλ to be the sum of all monomials with
exponents a1 , a2 , . . .. In the above case,
36
Theorem 6.1 If n ≥ r, and z is one of the symbols m, e, h, p, then any
symmetric polynomial of degree r can be written uniquely as a linear com-
bination of the polynomials zλ , as λ runs over all partitions. Moreover, in
all cases except z = p, if the polynomial has integer coefficients, then it is a
linear combination with integer coefficients.
So the polynomials er or hr , with r ≤ n, are generators of the ring of
symmetric polynomials in n variables with integer coefficients. For z = e,
this is a version of Newton’s Theorem on symmetric polynomials (which,
however, applies also to rational functions).
Proof We can describe any such n-tuple in the following way. Take a line
of n + r − 1 boxes. Then choose n − 1 boxes, and place barriers in these
boxes. Let
(a) a1 be the number of empty boxes before the first barrier;
(b) a2 be the number of empty boxes between the first and second barriers;
(c) . . .
(d) an be the number of empty boxes after the last barrier.
Then a1 , . . . , an are non-negative integers with sum r. Conversely, given n
non-negative integers with sum r, we can represent it with n − 1 barriers in
n + r − 1 boxes: place the first barrier after a1 empty boxes, the second after
a2 further empty boxes, and so on.
So the required number of n-tuples is equal to the number of ways to
position n − 1 barriers in n + r − 1 boxes, which is
n+r−1 n+r−1
= ,
n−1 r
37
as required.
7 Group actions
How many ways can you colour the faces of a cube with three colours? Clearly
the answer is 36 = 729. But what if we regard two colourings as the same
if one can be transformed into the other by a rotation of the cube? This is
typical of the problems we consider in this chapter.
38
any action can be split uniquely into transitive actions on the sets of the
orbit partition of the domain.
In our motivating problem, the group G of 24 rotations of the cube acts
on the set X of 729 coloured cubes, and we want to count the orbits. So our
immediate goal is to count the orbits in an arbitrary action.
If H is a subgroup of G, then there is a natural partition of G into right
cosets Hx of H, for x ∈ G. Lagrange’s Theorem assures us that each coset
has the same cardinality, so the number of cosets is equal to |G|/|H|. We
denote the set of right cosets of H in G by cos(H, G). Now there is an action
of G on the set cos(H, G): the group element g induces the permutation
Hx 7→ H(xg). At risk of some confusion, we write this as (Hx)g = H(xg).
Now, given any transitive action of G on a set X, and x ∈ X, the set
{g ∈ G : xg = x}
is a subgroup of H, called the stabiliser of x, and denoted by StabG (x). Now
there is a natural bijection between X and cos(H, G), where the element
y ∈ X corresponds to the set
{g ∈ G : xg = y}
(it is easily checked that this is a coset of H). This bijection also respects
the action of G: if z ∈ G satisfies yg = z, and Hk and Hl are the cosets
corresponding to y and z, then (Hk)g = (Hl).
So the so-called “coset spaces” of subgroups of G give a complete list of
transitive actions of G, up to a natural notion of isomorphism of actions.
Note in addition that any two points in the same orbit have stabilisers of
the same order. (The stabilisers are in fact conjugate subgroups of G.)
In an arbitrary action of G on X, we let fixX (g) denote the number of
points of X which are fixed by the permutation g. Now we can state the
Orbit-Counting Lemma, the foundation of enumeration under group action.
Theorem 7.1 Let G act on the finite set X. Then the number of G-orbits
in X is equal to the average number of fixed points of elements of G, that is,
1 X
fixX (g).
|G| g∈G
39
Proof Construct a bipartite graph as follows. The vertices are of two types:
the elements of X, and the elements of G. There is an edge from x to g if
xg = x. We count the number of edges in two different ways. X
Each vertex g lies in fixX (g) edges; so the number of edges is fixX (g).
g∈G
Now we count the other way. Take a point x ∈ X. The number of edges
containing it is | StabG (x)|. This value is the same for all the points in the
orbit OG (x) containing x. So the number of edges containing points in the
orbit is | StabG (x)| · |OG (x)| = |G|. Since each orbit contributes |G| edges,
the number of orbits is obtained by dividing the number of edges by |G|, as
claimed.
Now consider the coloured cubes. In order to do the calculations, we need
to classify the elements of the group G of rotations of the cube (a group of
order 24). They are of the following types:
For each type of rotation, we have to count the number of coloured cubes it
fixes. A cube will be fixed if faces in the same cycle of the permutation have
the same colour. So the answer will be 3c , where c is the number of cycles
of the permutation on faces. For the five types listed above the numbers of
cycles are 6 (each single face is a cycle), 3 (for the vertical axis, the top and
bottom faces, and the other four in a single cycle), 4 (as the previous except
that the 4-cycle splits into two 2-cycles), 3 (the faces are permuted in cycles
of two), and 2 (the faces are permuted in cycles of three). So the calculation
of the theorem is:
1 6
(3 + 6 · 33 + 3 · 34 + 6 · 33 + 8 · 32 ) = 57.
24
40
7.2 Labelled and unlabelled
Many combinatorial objects that we want to count are based on an underlying
set, which we usually assume to be the set {1, 2, . . . , n}. Very often the
simplest method of counting gives us the total number of objects that can
be built on this set. But we may be completely uninterested in the labels
1, 2, . . . , n, and want to count two objects as being the same if there are some
labellings of the underlying set that make them identical.
We distinguish these two problems as counting labelled and unlabelled
objects.
Counting unlabelled objects is thus an orbit-counting problem: we want
to know the number of orbits of the symmetric group Sn , acting on the
objects in question by permuting the labels.
n
To take an extreme case: there are labelled k-element subsets of an
k
n-element set, but there is only one unlabelled subset. Here are a few more
examples.
Here B(n) is the Bell number (the number of partitions of an n-set) and
p(n) the number of partitions of the number n. Note that the numbers of
unlabelled structures can agree and those of labelled structures disagree, or
vice versa.
The third entry needs a little explanation. Any permutation can be writ-
ten as a product of disjoint cycles; the cycle lengths form a partition of n
called the cycle structure of the permutation. Now given two permutations
with the same cycle structure, we can replace the entries in one by those in
the other. For example, (1)(2, 3) can be transformed into (2)(1, 3) by swap-
ping the labels 1 and 2. (You might recognise this as the argument that
shows that two permutations are conjugate in the symmetric group if and
only if they have the same cycle structure.)
In the three cases in the table, we can count the unlabelled objects di-
rectly; but in more complicated cases, the Orbit-Counting Lemma is required.
One example is the number of graphs on n vertices. The labelled number
41
is 2n(n−1)/2 , since for each of the n(n − 1)/2 pairs of vertices we can choose
whether to join it by an edge or not; but the only way to calculate the nuber
of unlabelled graphs is via the Orbit-Counting Lemma.
42
Theorem 7.2
B(x) = Z(G; si ← A(xi ) for i = 1, . . . , n).
The notation on the right means that we substitute A(xi ) for si , for
i = 1, . . . , n.
I won’t prove the theorem here – it follows from the Orbit-Counting
Lemma with a certain amount of ingenuity – but will conclude with a simple
application which doesn’t even hint at the uses of the theorem.
First, let us calculate the cycle index of the rotation group of the cube.
The five types of elements mentioned earlier have the following cycle struc-
tures in their action on faces:
(a) Identity: (1, 1, 1, 1, 1, 1) (usually abbreviated to 16 ).
(b) Face rotations through ±π/2: 12 4.
(c) Face rotations through π: 12 22 .
(d) Edge rotations: 23 .
(e) Vertex rotations: 32 .
So the cycle index is
1 6
Z(G) = (s + 6s21 s4 + 3s21 s22 + 6s32 + 8s23 ).
24 1
Now any counting problem for which we can write a figure-counting series
can be solved by substitution. For example:
(a) Take each of the three colours to be a figure of weight 0. The figure-
counting series is simply 3. We recover our earlier count.
(b) Take one of the colours (say red) to have weight 1, and all the others
weight 0. The figure-counting series is x + 2. So substituting xi + 2
for si gives a polynomial in which the coefficient of xk is the number of
types of cube which have exactly k red faces.
(c) A small extension of the Cycle Index Theorem shows that, if we sub-
stitute pi (x, y, z) = xi + y i + z i for si , we obtain a trivariate polynomial
in which the coefficient of xi y j z k is the number of cubes with i red, j
blue, and k green faces.
(d) The generalisation to an arbitrary number of colours is now routine.
43
Exercises
1 Perform the calcuations in the four counting problems above.
2 A necklace has ten beads, each of which is either black or white, arranged
on a loop of string. A cyclic permutation of the beads counts as the same
necklace. How many necklaces are there?
How many are there if the necklace obtained by turning over the given
one is regarded as the same?
F (x) = P (x + 1)
4 Consider the set of all functions from {1, . . . , n} to {1, . . . , m}. There
are mn functions in the set. Now let the symmetric group Sn act on these
functions by permuting their arguments: (f π)(x) = f (xπ −1 ). [Incidentally,
the inverse is there to make this an action – can you see why?]
44
Show that orbits correspond to m-tuples
of non-negative
integers with
m+n−1
sum n, so that the number of orbits is . (See the Appendix in
n
Lecture Notes 7.)
Show that a permutation g with k cycles fixes mk functions. Hence use
the Orbit-Counting Lemma to show that
n
1 X m+n−1
u(n, k)mk = .
n! k=1 n
a formula we met in Section 1. (Here s(n, k) and u(n, k) are the signed and
unsigned Stirling numbers of the first kind.)
8 Species
In this lecture I will discuss a very nice unifying principle for a number of
topics in enumerative combinatorics, the theory of species, introduced by
André Joyal in 1981. Species have been used in areas ranging from infinite
permutation groups to statistical mechanics, and I can’t do more here than
barely scratch the surface.
Joyal gave a category-theoretic definition of species; I will take a more
informal approach.
There is a book on species, by Bergeron, Labelle and Leroux, entitled
Combinatorial Species and Tree-Like Structures; but I think that Joyal’s
original paper in Advances in Mathematics is hard to beat.
45
8.1 What is a species?
As I said earlier, a typical combinatorial structure of the type we wish to
count is often built on a finite set; we are interested in counting labelled
structures (the different structures built on a fixed set) and also the unla-
belled structures (essentially the isomorphism types of structures).
A species is a functor F (this word is used by Joyal in its technical sense
from category theory; I will be less formal but will explain what is going on)
which takes an n-element set and produces the set of objects in which we
are interested; it should also have the property that the functor transforms
any bijection between n-element sets A and B to a bijection between the sets
F(A) and F(B) of objects built on these sets. Because of this condition, we
can use the standard n-element set {1, 2, . . . , n}, but don’t have to worry if
during the argument we have a non-standard set (such as a proper subset of
the standard set).
Joyal’s intuition is that we think of a formal power series where the coef-
ficients are not numbers, but sets of combinatorial objects:
X
F= F ({1, 2, . . . , n})xn .
n≥0
give us, respectively, the ordinary generating function for the unlabelled
structures in the species F, and the exponential generating function for the
labelled structures.
8.2 Examples
If this is a bit abstract, hopefully some examples will bring it back to earth.
46
Sets Let Set denote the “identity” species, where the structure on the finite
set A is simply a labelling of A. Thus, for each n, there is one unlabelled
srtucture, and one labelled structure. So the generating functions are
X 1
set(x) = xn = ,
n≥0
1−x
X xn
Set(x) = = exp(x)
n≥0
n!
respectively.
The cycle index of the species Set can be computed as follows. First,
1 X n!
Z(Sn ) = sa1 · · · sann ,
n! 1 · · · n n a1 ! · · · an ! 1
a1 a
where the sum is over all partitions of n having ai parts of size i for i =
1, 2, . . . , n (the coefficient is the number of permutations with this cycle struc-
ture). Summing this over all n seems a formidable task, but a remarkable
simplification occurs: since n! cancels we can sum over the variables a1 , . . . , an
independently. We obtain
!
X si
Z(Set) = exp .
i≥1
i
as expected.
Note that the formula for the sum of the cycle indices of the symmetric
groups was known in the combinatorial enumeration community before Joyal
provided it with this nice interpretation.
47
Linear orders A much easier case is the species Lin of linear (or total)
orders. There are n! labelled linear orders on n points; all are isomorphic,
and there are no non-trivial automorphisms, so we have
X 1
Z(Lin) = sn1 = ,
n≥0
1 − s1
from which the generating functions are lin(x) = Lin(x) = 1/(1 − x).
Sum F + G is the species which constructs on the set A all the F-objects
and all the G-objects (we assume these two classes to be disjoint). Clearly the
cycle index and the generating functions for unlabelled and labelled objects
are simply obtained by adding those for F and G.
48
In other words, for the indeterminate sn in Z(F, we substitute the cycle
index of G but in the indeterminates sn , s2n , . . . in place of s1 , s2 , . . ..
The effect on the generating functions for labelled objects is simple sub-
stitution: F [G](x) = F (G(x)). For unlabelled objects it is a bit more com-
plicated, we need the cycle index for F:
For example, let Set∗ be the species of non-empty sets. Then the e.g.f.
for labelled objects is Set∗ (x) = exp(x) − 1. Now Set[Set∗ ] is the species of
set partitions, where the labelled objects are counted by the Bell numbers:
the exponential generaing function is thus exp(exp(x) − 1), as we saw earlier.
As an exercise, obtain the ordinary generating function for partitions of the
integer n from this approach.
Remark The fact that substituting a species into Set exponentiates the
generating function for labelled structures is sometimes called the exponential
principle in enumerative combinatorics. We see that substitution of species
is much more general.
8.4 Exercises
1 Define the species Circ of circular orders and the species Perm of per-
mutations, and calculate the generating functions for unlabelled and labelled
objects in these species.
Show that
X φ(m)
Z(Circ) = − log(1 − sm ),
m≥1
m
49
where φ is Euler’s totient function.
Use the decomposition of permutations into disjoint cycles to show that
Set[Circ] = Perm,
and verify the appropriate identities for the generating functions.
2 Use the fact that Catalan objects are rooted binary trees to show that
the species Cat of Catalan objects satisfies
Cat = E + Cat2 ,
where E denotes the species of singleton sets (that is, it returns its input if
this has cardinality 1, and the empty set otherwise).
Show similarly that the species W of rooted binary trees without the left-
right distinction (counted by Wedderburn–Etherington numbers) satisfies
W = E + Set2 [W],
where Set2 is the species of 2-element sets.
50
9 Möbius inversion
In this section we will discuss the Inclusion-Exclusion principle, with a few
applications (including a formula for the chromatic polynomial of a graph),
and then consider a wide generalisation of it due to Gian-Carlo Rota, involv-
ing the Möbius function of a partially ordered set. The q-binomial theorem
gives a simple formula for the Möbius function of the lattice of subspaces of
a vector space.
9.1 Inclusion-Exclusion
The Inclusion-Exclusion Principle is one of the most familiar results in com-
binatorics. For two sets A and B, it asserts simply that |A ∪ B| = |A| +
|B| − |A ∩ B|. For the general case, we need some notation. Let A1 , . . . , An
\of a finite set S. For any subset I of the index set {1, 2, . . . , n, we
be subsets
let AI = Ai . By convention, we take A∅ = S.
i∈I
51
9.2 Applications
We begin with two standard applications of the Corollary. First, a formula
for the Stirling numbers of the second kind.
52
Proof Let S be the set of all permutations, and Ai the set of permutations
which fix the element i ∈ {1, . . . , n}. Then a permutation is a derangement
if and only if it lies in no set Ai . The permutations in AI fix every point in
the set I, so there are (n − i)! of them if |I| = i. Thus Corollary 9.2 gives
n n
(−1)i
X
i n X
dn = (−1) (n − i)! = n! .
i=0
i i=0
i!
as claimed.
The summation here is the partial sum of the series for e−1 , so dn is
approximately n!/e. Indeed, it is easy to show that it is the nearest integer
to n!/e.
The “secretary problem” asks: a secretary puts n letters into n addressed
envelopes at random: what is the probability that no letter is correctly ad-
dressed? The answer is very close to 1/e, perhaps a little surprising at first
sight.
For our final application we consider graphs. A graph consists of a set V
of vertices and a set E of edges, each edge being a 2-element set of vertices.
Given a set of q colours, a colouring of the graph is an assignment of colours
to the vertices; it is proper if the two vertices in each edge have different
colours.
Theorem 9.6 For any graph G = (V, E), there is a polynomial PG (x) such
that, for any natural number q, PG (q) is the number of proper colourings of
G with q colours. Moreover, PG is a monic polynomial with degree n = |V |.
Proof Let S be the set of all colourings of G with q colours. For each edge e,
let Ae be the set of colourings for which the edge e is “improperly coloured”,
that is, its vertices have the same colour. A colouring is proper if it lies in
no set Ae . Given a set I ⊆ E, how many colourings lie in AI ? Consider the
graph (V, I) with edge set I. A colouring in Ai assigns the same colour to
all vertices in the same connected component of this graph; so |AI | = q c(I) ,
where c(I) is the number of connected components of (V, I).
53
By Theorem 9.1, the number of proper colourings is
X
(−1)|I| q c(I) .
I⊆E
It is clear that this is a polynomial in q; the leading term comes from the
unique graph (V, I) with n connected components, namely I = ∅.
This formula shows a connection between graph colouring and the Potts
model in statistical mechanics, but we cannot pursue this here.
Theorem 9.7 Let P = (A, ≤) be a finite poset. Then we can label the
elements of A as a1 , a2 , . . . , an such that, if ai ≤ aj , then i ≤ j.
This is sometimes stated “Every poset has a linear extension”. The anal-
ogous result for infinite posets requires a weak form of the Axiom of Choice
in its proof.
Now let P = (A, ≤) be a poset. We define the incidence algebra of P as
follows: the elements are all functions f : A × A → R such that f (a, b) = 0
unless a ≤ b. Addition and scalar multiplication are defined in the obvious
way, and multiplication by the rule
X
f (a, c)g(c, b) if a ≤ b,
f g(a, b) = a≤c≤b
0 if a 6≤ b.
54
If we number the elements of A as in the preceding theorem, then we can
represent a function from A × A to R by an n × n matrix; the definition
of the incidence algebra shows that any function which lies in the algebra
is upper triangular. The multiplication in the algebra is then just matrix
multiplication, so the incidence algebra is a subalgebra of the algebra of all
n × n real matrices.
We now define three particular elements of the incidence algebra.
(a) ι is the identity function:
1 if a = b,
ι(a, b) = ,
0 if a 6= b
represented by the identity matrix.
(b) ζ is the zeta function:
1 if a ≤ b,
ζ(a, b) =
0 if a 6≤ b.
so that X
µ(a, b) = − µ(a, c).
a≤c<b
This gives a recursive method for calculating the Möbius function, as we will
see.
From the definition, we immediately have the Möbius inversion formula:
Theorem 9.8 Let P be a poset with Möbius function µ. Then the following
are equivalent:
P
(a) g(a, b) = a≤c≤b f (a, c) for all a ≤ b;
P
(b) f (a, b) = a≤c≤b g(a, c)µ(c, b) for all a ≤ b.
55
9.4 Some examples
The preceding remark shows that the value of µ(a, b) depends only on the
structure of the interval [a, b] = {c : a ≤ c ≤ b}.
Many important posets have a least element (which is usually called 0)
and a “homogeneity property”: for any a, b with a ≤ b, there is an element
c such that the interval [a, b] is isomorphic to the interval [0, c]. In a poset
with this property, µ(a, b) = µ(0, c), and we can regard the Möbius function
as a one-variable function.
A chain
A chain, or linear order, is a poset in which every pair of elements is
comparable. Any finite chain is isomorphic to {0, 1, . . . , n − 1} with the
usual order. Its Möbius function is given by
1 if b = a,
(
µ(a, b) = −1 if b = a + 1,
0 otherwise.
This follows immediately from the recursive method of computing µ.
In this case, any interval [a, b] is isomorphic to the interval [0, b − a], so
it would have sufficed to take a = 0; but the general case is simple enough.
Direct product
The direct product of posets P1 = (A1 , ≤1 ) and P2 = (A2 , ≤2 ) has set
A1 × A2 (Cartesian product), and
(a1 , a2 ) ≤ (b1 , b2 ) ⇔ a1 ≤1 b1 and a2 ≤2 b2 .
It is easily checked that
µ((a1 , a2 ), (b1 , b2 )) = µ(a1 , b1 )µ(a2 , b2 ).
This extends in a straightforward way to the direct product of any finite
number of posets.
56
It follows from the two preceding paragraphs that the Möbius function is
(−1)|b\a| if a ⊆ b,
µ(a, b) =
0 if a 6⊆ b.
In this case, if a ⊆ b, then [a, b] is isomorphic to [∅, b \ a], and we see the
homogeneity property in action. So the following are equivalent:
P
(a) f (a) = b≤a g(b);
The classical Möbius function The classical Möbius function from num-
ber theory is defined on the natural numbers; the partial order is given by
a ≤ b if a divides b. Although this partial order is infinite, all intervals are
finite, and it has the homogeneity property: if a | b, then the interval [a, b] is
isomorphic to [1, b/a].
This poset is isomorphic to the product of chains, one for each prime
power. We have
1 if b = a,
(
µ(pa , pb ) = −1 if b = a + 1,
0 otherwise.
Hence we have the general formula:
d
µ(m, n) = (−1) if m | n and n/m is a product of d distinct primes,
0 otherwise.
57
Subspaces of a vector space For our final example, let A be the set of all
subspaces of an n-dimensional vector space over a field of order q. If V ≤ W ,
the structure of the interval [V, W ] depends only on dim(W ) − dim(V ), and
so is isomorphic to [{0}, W/V ].
Recall the q-binomial theorem:
n n
k(k−1)/2 k n
Y X
i−1
(1 + q z) = q z .
i=1 k=0
k q
Exercises
1 Let (Ai : i = 1, . . . , n} be a family of subsets of a set X. For I ⊆
{1, . . . , n}, let
• f (I) be the number of points lying in Ai for all i ∈ I, and
• g(I) be the number of points lying in Ai for all i ∈ I and for no i ∈
/ I.
Prove that X
f (I) = g(J),
J⊇I
and deduce from Theorem 9.8 and the form of the Möbius function for the
power set of a set that
X
g(I) = (−1)|J\I| f (J).
J⊃I
2 There is a partial order on the set of all partitions of {1, . . . , n}, defined
as follows: if a and b are partitions, say that a refines b if every part of b is
a union of parts of a.
Can you find the Möbius function of this partial order?
58
3 Prove the following “approximate version” of Inclusion-Exclusion:
\ \
aI = Ai , a0I = A0i .
i∈I i∈I
4 Prove that the exponential generating function for the derangement num-
bers dn (Theorem 9.5) is
X dn xn e−x
= .
n≥0
n! 1−x
(A set carrying a permutation is the union of the set of fixed points and a
set none of whose points is fixed.)
59
10 Cayley’s Theorem
The course ends with four entirely different proofs of Cayley’s theorem for
the number of labelled trees on n vertices, some of which introduce new ideas.
There is a direct bijective proof due to Prüfer; Joyal’s proof using species;
a proof using Kirchhoff’s Matrix-Tree Theorem; and a proof using Lagrange
inversion.
A tree is a connected graph without cycles. It is not hard to show by
induction that a tree on n vertices has n − 1 edges. There are 16 trees on the
vertex set {1, 2, 3, 4}: four of them are “stars” in which one vertex is joined
to the other three, and the other twelve are “paths”.
Theorem 10.1 The number of labelled trees on the vertex set {1, . . . , n} is
nn−2 .
60
10.2 A proof using species
Let Lin and Perm be the species of linear orders and permutations respec-
tively. We have seen that these two species have the same counting function
for labelled structures on n points (namely n!); so Lin[F] and Perm[F] will
also have the same counting function for labelled structures, for any species
F.
Joyal takes F = RTree, the species of rooted trees (trees with a distin-
guished vertex.
Now Lin[RTree] consists of a linear order on a set, say {1, 2, . . . , k} with
the usual order, with a rooted tree at each point. We can identify the root of
the tree at point i to be i itself. What we have constructed is a tree with a
distinguished path {1, 2, . . . , k}. Joyal calls such an object a vertebrate, since
it has a “backbone” from the “head” 1 to the “tail” k. We get a vertebrate
by taking a tree on n vertices and distinguishing two of them to be the head
and the tail; in a tree there is a unique path between any two vertices. So
the number of vertebrates is n2 T (n), where T (n) is the number of trees.
Also Perm[RTree] consists of a set of, say, k points carrying a permu-
tation, with a rooted tree attached at each point. If we direct every edge of
each tree towards the root, we have a picture representing what Joyal calls
an endofunction, a function from {1, . . . , n} to itself. Such a function has a
set of “periodic points” which return to their initial positions after finitely
many steps; any other point is “transient”, and the transient points feed into
periodic points in a treelike fashion. The number of endofunctions is clearly
nn .
So n2 T (n) = nn , giving the result.
61
Theorem 10.2 The cofactors of the Laplacian matrix of a graph are all
equal to the number of spanning trees of the graph.
A tree on the vertex set {1, . . . , n} is simply a spanning tree of the com-
plete graph, the graph whose edges are all pairs of vertices. The Lapla-
cian matrix of the complete graph is nIn − Jn , where In and Jn denote the
n × n identity and all-1 matrices. Deleting the last row and column gives
nIn−1 − Jn−1 .
We find the determinant of the last matrix by computing its eigenvalues.
Every row and column sum is n − (n − 1) = 1, so the all-1 vector is an
eigenvector with eigenvalue 1. If v is a vector orthogonal to the all-1 vector,
then Jn−1 v = 0, so v is an eigenvector with eigenvalue n. Thus nIn−1 − Jn−1
has eigenvalues 1 (multiplicity 1) and n (multiplicity n−2); so its determinant
is nn−2 , which is thus the number of spanning trees.
The proof of the Matrix-Tree Theorem depends on the Cauchy–Binet for-
mula, a nineteenth century determinant formula which asserts the following.
et A be an m × n matrix, and B an n × m matrix, where m < n. Then
X
det(AB) = det(A(X)) det(B(X)),
X
where X ranges over all m-element subsets of {1, . . . , n}. Here A(X) is the
m × m matrix whose columns are the columns of A with index in X, and
B(X) is the m × m matrix whose rows are the rows of B with index in X.
To prove the Matrix-Tree Theorem for the graph G = (V, E) with Lapla-
cian matrix L(G), choose an arbitrary orientation of the edges of G, and let
M be the signed vertex-edge incidence matrix of G, with (v, e) entry +1 if
v is the “head” of the arc e, −1 if v is the “tail” of e, and 0 otherwise. It is
straightforward to show that M M > = L(G). Let v be any vertex of G, and
let N = Mv be the matrix obtained by deleting the row of M indexed by e.
It can be shown that, if X is a set of n − 1 edges, then
±1 if X is the edge set of a spanning tree,
n
det(N (X)) =
0 otherwise.
By the Cauchy–Binet formula, det(N N > ) is equal to the number of spanning
trees. But N N > is the principal cofactor of L(G) obtained by deleting the
row and column indexed by v.
The fact that all cofactors are equal is not really necessary for us, and
can be proved by elementary linear algebra.
62
10.4 Lagrange inversion
Our final approach involves another general technique, Lagrange inversion.
Let G be the set of all formal power series (over the commutative ring R
with identity) which have the form x + · · ·, that is, constant term is zero and
coefficient of x is 1. Any of these series can be substituted into any other.
We make a simple observation:
This group is sometimes called the Nottingham group, for reasons that
are a little obscure.
Proof Closure and the associative law are straightforward, and the formal
power series x is the identity. Let f (x) = x +a2 x2 +a3 x3 +· · · be any element
of G. We seek an inverse g(x) = x + b2 x2 + b3 x3 + · · · such that f (g(x)) = x.
The coefficient of xn in
is bn + stuff, where stuff involves the as and bi for i < n. Equating it to zero
gives bn in terms of as and bi for i < n; so the bs can be found recursively.
In a similar way, we find a unique element h(x) ∈ G for which h(f (x)) = x.
Then
g(x) = h(f (g(x)) = h(x),
and the inverse is unique.
The proof implicitly shows us how to find the inverse; Lagrange inversion
gives a more direct approach.
I will not give the proof here; it involves working with Laurent series and
extending the notion of poles and the calculus of residues to formal power
series.
63
Now let RTree be the species of rooted trees, as before. We clearly have
the equation
RTree = E · Set[RTree],
where E is the species of 1-element sets; this is because a rooted tree is a
(ppossibly empty) set of rooted trees all joined to a new root.
Thus the exponential generating function T ∗ (x) for rooted trees satisfies
So the function T ∗ (x) is the inverse (in the group G) of the function x/ exp(x).
From Lagrange inversion, we find that the coefficient of xn /n! in T ∗ (x) is
n−1
d
exp(nx) = nn−1 .
dxn−1 x=0
Since the number of rooted trees is n times the number of trees, we conclude
that there are nn−2 trees on n vertices.
64
Exercises
1 Calculate the chromatic polynomial of
3 Count the labelled trees in which the vertex i has valency ai for 1 ≤ i ≤ n,
where a1 , . . . , an are positive integers with sum 2n − 2.
65