Random Graphs
Lecture 6: February 6
Instructor: Alistair Sinclair
Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They
may be distributed outside this class only with the permission of the Instructor.
The probabilistic method is non-constructive, in that it merely proves the existence of objects with certain
properties rather than explicitly constructing them. In some cases, there is an easy way to make the method
algorithmic, which we now illustrate with a simple example. (More sophisticated tools for making the method
constructive will be discussed later in the class.)
6.1 MAX3SAT
The MAX3SAT problem is defined as follows. Given a boolean formula ϕ in 3CNF form on variables
{x1 , · · · , xn } and clauses {C1 , · · · , Cm }, find the maximum number of clauses that can be satisfied by some
truth assignment to the variables. (For the purposes of this section, we assume that an instance of MAX3SAT
has exactly three literals per clause, all of whose variables are distinct.)
This is the optimization version of the 3SAT problem, which is NP-complete. Therefore, MAX3SAT is an
NP-hard optimization problem. However, a simple probabilistic argument yields a surprisingly good lower
bound on the optimum value for this problem.
Claim 6.1 For every such ϕ, there exists an assignment satisfying at least 7m/8 clauses.
Proof: Choose an assignment uniformly at random, i.e., set each variable independently to T or F with probability 1/2 each, and let the random variable X denote the number of satisfied clauses. Since each clause contains three distinct variables, it is left unsatisfied only when all three of its literals are false, which happens with probability 1/8; so each clause is satisfied with probability 7/8, and by linearity of expectation E[X] = 7m/8.
Finally, note that there must exist a point in the sample space at which X takes value at least E[X]. Hence there exists an assignment satisfying at least 7m/8 clauses.
Note that we can apply exactly the same argument to CNF formulas with clauses of varying lengths (and indeed for general constraint satisfaction problems, defined by a conjunction of more general boolean constraints). In general, the number of clauses we can satisfy is at least Σi pi , where pi is the probability that the ith clause is satisfied by a random assignment. The values pi can easily be computed by inspection.
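The following short Python snippet (an illustration, not part of these notes; the DIMACS-style clause representation is an assumed convention) computes this lower bound Σi pi for a CNF formula whose clauses may have different lengths, assuming no clause contains both a literal and its negation.

    # Sketch: lower bound sum_i p_i on the number of satisfiable clauses, where
    # p_i = 1 - 2^(-k_i) for a clause with k_i distinct variables (assumes no clause
    # contains both a literal and its negation). Clauses are tuples of nonzero
    # integers, e.g. (1, -2, 3) means x1 OR (NOT x2) OR x3.

    def expected_satisfied(clauses):
        """E[number of satisfied clauses] under a uniformly random assignment."""
        total = 0.0
        for clause in clauses:
            k = len({abs(lit) for lit in clause})  # distinct variables in the clause
            total += 1.0 - 2.0 ** (-k)             # p_i = 1 - 2^(-k_i)
        return total

    # Example: three 3-literal clauses on distinct variables; expectation is 21/8.
    print(expected_satisfied([(1, -2, 3), (-1, 2, 4), (2, -3, -4)]))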
Exercise: Let X be the random variable above, denoting the number of satisfied clauses in a random
assignment. For any 1 < α < 8, show that
Pr[X ≤ (1 − α/8)m] ≤ 1/α.

[Hint: Apply Markov’s inequality to the random variable Z = m − X.] Hence deduce that a random assignment satisfies at least a 3/4 fraction of the clauses with probability at least 1/2.
Exercise: Use Markov’s inequality and the fact that X is integer-valued to show that we actually get
Pr[X ≥ 7m/8] ≥ 1/(1 + m).

Deduce that we can actually achieve the expected value of 7m/8 in polynomial time with high probability.
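As a hedged sketch of the algorithm this exercise suggests (not from the notes; the clause representation follows the snippet above), one can simply resample random assignments until one satisfies at least ⌈7m/8⌉ clauses; by the bound above, the expected number of trials is at most 1 + m.

    # Sketch: resample uniformly random assignments until at least ceil(7m/8) clauses
    # are satisfied. By Pr[X >= 7m/8] >= 1/(1+m), the expected number of trials is at
    # most 1 + m, so this is an expected-polynomial-time algorithm.
    import math
    import random

    def num_satisfied(clauses, assignment):
        """Count clauses having at least one true literal; assignment maps var -> bool."""
        return sum(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )

    def seven_eighths_assignment(clauses, n_vars):
        target = math.ceil(7 * len(clauses) / 8)
        while True:
            assignment = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
            if num_satisfied(clauses, assignment) >= target:
                return assignment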
In many examples, it is possible to efficiently derandomize the randomized construction used in the probabilis-
tic method, and thus achieve the expected value (or better) deterministically. In its simplest incarnation, this
technique is usually called the “method of conditional probabilities.” We illustrate it using our MAX3SAT
example.
Consider a 3CNF formula ϕ. We can think of the random construction of a truth assignment as sequentially
assigning truth values: first pick a T/F value for x1 , then for x2 , and so on. This process can be pictured as
a tree (see Figure 6.1).
We index each node of the tree with a formula Ψ: the formula at the root is ϕ, and to get the formula at a
node at level i we simply replace the variables x1 , . . . , xi in ϕ with their appropriate T/F values. Note that
the expression associated with each node is similar to the original expression ϕ, except that some clauses
may have fewer than three literals in them.
Also, for each node Ψ in the tree, we define the random variable XΨ to be the number of clauses that are
satisfied in the tree below Ψ (i.e., when the assignments to the remaining variables are made randomly).
Now consider the expectation E[XΨ ], where Ψ is a node at level i (so that the next variable to be assigned
is xi+1 ). Using conditional expectation,
E[XΨ ] = Pr[xi+1 = T ] · E[XΨ|xi+1 =T ] + Pr[xi+1 = F ] · E[XΨ|xi+1 =F ] = (1/2)(E[XΨ1 ] + E[XΨ2 ]),
where Ψ1 and Ψ2 are the children of Ψ. Thus at least one of the children must have an expectation at least
as large as that of Ψ.
Now, note that at the root, E[XΨ ] ≥ 7m/8, when Ψ = ϕ. Hence, at each node we can choose a child such that the expectation remains at least 7m/8. Moreover, note that for any partially assigned formula, it is possible to compute exactly the probability that each of its clauses is satisfied, and hence the expectation E[XΨ ], in linear time. So we can in linear time determine the child of Ψ with the larger expectation. We choose this child and proceed down the tree, while maintaining the invariant that E[XΨ ] ≥ 7m/8. When we hit a leaf of the tree (after n levels) we will have a complete assignment, and since the invariant still holds, this assignment must in fact satisfy at least 7m/8 clauses.
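Here is a minimal Python sketch of this derandomization (an illustration, not code from the notes), assuming the same clause representation as in the earlier snippet: tuples of nonzero integers with three distinct variables per clause. Each level of the tree is handled by comparing the two conditional expectations and keeping the larger one.

    # Sketch of the method of conditional probabilities for MAX3SAT. We fix the
    # variables one by one, always choosing the truth value that maximizes the
    # conditional expectation E[X_Psi]; this maintains E[X_Psi] >= 7m/8 and ends
    # with a concrete assignment satisfying at least 7m/8 clauses.

    def conditional_expectation(clauses, partial):
        """E[# satisfied clauses] when the unset variables are assigned at random."""
        total = 0.0
        for clause in clauses:
            satisfied, unset = False, 0
            for lit in clause:
                var = abs(lit)
                if var in partial:
                    satisfied = satisfied or partial[var] == (lit > 0)
                else:
                    unset += 1
            # Pr[clause satisfied] is 1 if already satisfied, else 1 - 2^(-#unset).
            total += 1.0 if satisfied else 1.0 - 2.0 ** (-unset)
        return total

    def derandomized_max3sat(clauses, n_vars):
        partial = {}
        for v in range(1, n_vars + 1):   # descend one level of the tree per variable
            if (conditional_expectation(clauses, {**partial, v: True})
                    >= conditional_expectation(clauses, {**partial, v: False})):
                partial[v] = True
            else:
                partial[v] = False
        return partial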
The above approach can be made to work in many of our examples. (Exercise: which ones?) The key
ingredient is the ability to compute the expectation when some of the random choices have already been made.
Even when this is not possible exactly, it is sometimes possible to compute the expectation approximately
and thus to deterministically achieve a final result that is close to the expectation.
Markov’s Inequality gives the best tail bound when the expectation is all we know about a non-negative
random variable. In this lecture we will explore how this can be improved if additional information is
available, such as higher moments. In particular we are interested in using the variance/second moment to
obtain a tighter bound.
Definition 6.2 The variance of a random variable X is defined as Var(X) = E[(X − EX)²] = EX² − (EX)².
Intuitively variance measures how far the random variable is likely to be from its expectation. The following
standard inequality makes this intuition quantitative.
Lemma 6.3 (Chebyshev’s Inequality) Pr[|X − EX| ≥ α] ≤ Var(X)/α².
Proof: Let Y = (X − EX)², so that Y is a non-negative random variable with EY = Var(X). Applying Markov’s inequality to Y gives

Pr[|X − EX| ≥ α] = Pr[Y ≥ α²] ≤ EY/α² = Var(X)/α². (6.1)
Corollary 6.4 Pr[|X − EX| ≥ β√Var(X)] ≤ 1/β². [Note that √Var(X) is the standard deviation of X.]
Corollary 6.5 Pr[|X − EX| ≥ β·EX] ≤ Var(X)/(β²(EX)²).
Exercise: Show that Pr[|X − EX| ≥ β√Var(X)] ≤ 2/(1 + β²). Hence deduce that Chebyshev’s inequality is not
necessarily tight (i.e., it does not provide the tightest bound possible given EX and Var(X)).
Notwithstanding the previous exercise, we cannot do much better than Chebyshev’s inequality if all we know
are the mean and variance of a random variable. However, in many cases we have much more information,
which enables us to prove much tighter tail bounds. For example, if the random variable X is normally
distributed with mean µ and variance σ², i.e., X ∼ N(µ, σ²), then
Pr[|X − µ| ≥ βσ] ≈ √(2/π) · e^(−β²/2)/β,
which is much sharper than Corollary 6.4. Similarly tight bounds hold when X is the sum of a large number
of independent random variables with bounded range. We shall discuss this at greater length in a later
lecture.
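As a quick numerical illustration (not from the notes), the following snippet compares Chebyshev's bound 1/β² with the exact Gaussian two-sided tail 2(1 − Φ(β)) and with the asymptotic expression quoted above.

    # Compare Chebyshev's bound 1/beta^2 with the exact Gaussian two-sided tail
    # 2*(1 - Phi(beta)) = erfc(beta/sqrt(2)) and with the asymptotic form above.
    import math

    for beta in (1.0, 2.0, 3.0, 4.0):
        chebyshev = 1.0 / beta ** 2
        exact = math.erfc(beta / math.sqrt(2))
        approx = math.sqrt(2.0 / math.pi) * math.exp(-beta ** 2 / 2) / beta
        print(f"beta={beta:.0f}  Chebyshev {chebyshev:.4f}  Gaussian {exact:.2e}  approx {approx:.2e}")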
As a first example, we will apply the variance/second moment method to find thresholds in random graphs.
In the Gn,p model of random graphs, a graph G with n vertices is constructed by including each of the (n choose 2) possible edges independently with probability p. Thus E[number of edges] = p·(n choose 2) and E[deg of any vertex] = p(n − 1).
Typical Questions:
• Is G ∈ Gn,p connected?
• Does G ∈ Gn,p contain a 4-clique?
We will now attack the second of these questions, i.e., we’ll look at the property that G contains a 4-clique.
This will serve as an illustration of some general ideas.
Let the random variable X denote the number of 4-cliques in G. We look first at the expectation EX (i.e., the first moment). As usual we write X = ΣC XC , where C ranges over all subsets of four vertices and

XC = 1 if C is a 4-clique, and XC = 0 otherwise.
Now EXC is the probability that C is a clique, which is just p⁶ (as there must be six edges within C). So by linearity of expectation we have

EX = ΣC EXC = (n choose 4) · p⁶ = Θ(n⁴p⁶).
• If p ≪ n−2/3 , then EX → 0;
• If p ≫ n−2/3 , then EX → ∞.
(We use the notation f(n) ≪ g(n) to mean that f(n)/g(n) → 0 as n → ∞.)
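The following small simulation sketch (an illustration, not part of the notes; n, the number of trials, and the values of c are arbitrary choices) samples G(n, p) at p = c·n−2/3 and counts 4-cliques by brute force, which already shows the jump in EX for modest n.

    # Simulation sketch: count 4-cliques in samples of G(n, p) for p below, at, and
    # above n^(-2/3). Brute force over all 4-subsets, so n is kept small.
    import itertools
    import random

    def sample_gnp(n, p):
        """Adjacency sets of a random graph drawn from G(n, p)."""
        adj = {v: set() for v in range(n)}
        for u, v in itertools.combinations(range(n), 2):
            if random.random() < p:
                adj[u].add(v)
                adj[v].add(u)
        return adj

    def count_4_cliques(adj):
        return sum(
            1
            for quad in itertools.combinations(range(len(adj)), 4)
            if all(v in adj[u] for u, v in itertools.combinations(quad, 2))
        )

    n, trials = 40, 10
    for c in (0.25, 1.0, 4.0):                     # p = c * n^(-2/3)
        p = c * n ** (-2.0 / 3.0)
        avg = sum(count_4_cliques(sample_gnp(n, p)) for _ in range(trials)) / trials
        print(f"c = {c}: average number of 4-cliques = {avg:.2f}")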
Can we translate this sharp jump in the expectation into a stronger statement about probabilities? It turns
out we can. We call p(n) a threshold for a property Q if
p ≫ p(n) ⇒ Pr[G ∈ Gn,p has Q] → 1 as n → ∞, and
p ≪ p(n) ⇒ Pr[G ∈ Gn,p has Q] → 0 as n → ∞.
Theorem 6.6 The value p(n) = n−2/3 is a threshold for containing a 4-clique.
Proof: As above, let the random variable X denote the number of 4-cliques in G.
One direction is easy. Since X is an integer-valued random variable, Pr[X > 0] = Pr[X ≥ 1] ≤ EX, by Markov’s inequality. Since EX → 0 for p ≪ n−2/3 , we deduce that Pr[G contains a 4-clique] → 0, as desired.
The other direction is less trivial: notice that EX → ∞ does not immediately imply a useful lower bound
on Pr[X > 0], since X could be 0 most of the time and very large with small probability. We therefore need
to use the second moment of X. Specifically, we will show the following:

Claim 6.7 If p ≫ n−2/3 , then Var(X) = o((EX)²).

Note that the claim is equivalent to saying that E(X²) = (1 + o(1))(EX)², i.e., that the second moment is asymptotically no larger than the square of the mean.
Before proving the claim, let us see how it implies the second direction of our theorem. By Chebyshev’s Inequality, Pr[X = 0] ≤ Pr[|X − EX| ≥ EX] ≤ Var(X)/(EX)². Therefore, by the claim, this latter ratio tends to zero as desired.
Proof of Claim 6.7: Expanding the variance of X = ΣC XC we have

Var(X) = EX² − (EX)²                                                        (6.2)
       = ΣC EXC² − ΣC (EXC)² + ΣC≠D E(XC XD) − ΣC≠D EXC · EXD               (6.3)
       = ΣC Var(XC) + ΣC≠D Cov(XC , XD),                                    (6.4)
where the covariance of two random variables Y, Z is defined as Cov(Y, Z) = E(Y Z) − EY EZ. Note that
Cov(Y, Z) = 0 if Y, Z are independent, and otherwise can be positive, negative or zero.
Since the XC are {0,1} random variables, we have ΣC Var(XC) = ΣC (EXC − (EXC)²) = Θ(n⁴p⁶). To analyze the covariance terms, we consider three cases:
• Case 1: |C ∩ D| ≤ 1. In this case C, D share zero or one vertex, so XC , XD are independent and
Cov(XC , XD ) = 0.
• Case 2: |C ∩ D| = 2. Here C, D have two vertices in common, and Cov(XC , XD ) ≤ E(XC XD ) = Pr[C, D are both cliques]. For each choice of C, D, this latter event requires that 11 specific edges are present, so the probability is p¹¹. Since there are (n choose 6)·(6 choose 2) = Θ(n⁶) such pairs C, D, the total contribution of this covariance term is Θ(n⁶p¹¹).
• Case 3: |C ∩ D| = 3. By similar reasoning to Case 2, the contribution from the covariance of such pairs C, D is Θ(n⁵p⁹).

Putting these together, Var(X) = Θ(n⁴p⁶) + Θ(n⁶p¹¹) + Θ(n⁵p⁹), while (EX)² = Θ(n⁸p¹²). When p ≫ n−2/3 , each of the three terms is o(n⁸p¹²), so Var(X) = o((EX)²) as claimed.
What happens when p = c · p(n) for some constant c? The definition of threshold doesn’t say anything
about this. We will now briefly discuss the behavior at the threshold for some particular properties.
6.4.1 4-Cliques
Fact 6.9 For p = cn−2/3 , let X be the number of 4-cliques. Then X is asymptotically Poisson with parameter c⁶/24.
(As a somewhat technically involved exercise, you are invited to prove this Fact. You need to show that, for each fixed k, Pr[X = k] approaches e^(−λ) λ^k/k!, where λ = c⁶/24.) This implies that Pr[X > 0] → 1 − e^(−c⁶/24).
Observe that as c varies, the probability that G contains a 4-clique varies smoothly. This is therefore called
a “coarse threshold”.
Fact 6.10 p(n) = n−1 is a threshold for the property Q ≡ “G has a connected component of size Θ(n)”.
Moreover, for p(n) = c/n, the size of the largest component in G is almost surely
• Θ(log n) if c < 1;
• Θ(n2/3) if c = 1;
• Θ(n) if c > 1.
(By “almost surely” we mean “with probability tending to 1 as n → ∞”.) Since the behavior depends in detail on the value of the constant c, there must be some activity in the lower order terms. It turns out that, if we set p(n) = n−1 + c′ · n−4/3 , then we get smooth behavior as c′ varies. The “width of the transition” is thus Θ(n−4/3). This is an example of a “sharp threshold.”
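For illustration (a sketch, not from the notes; n and the values of c are arbitrary choices), the following Python snippet estimates the largest component size of G(n, c/n) for c below, at, and above 1 using breadth-first search.

    # Simulation sketch: largest component of G(n, c/n) for c = 0.5, 1.0, 1.5,
    # using plain breadth-first search (no external libraries).
    import random
    from collections import deque

    def largest_component(n, p):
        adj = [[] for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                if random.random() < p:
                    adj[u].append(v)
                    adj[v].append(u)
        seen, best = [False] * n, 0
        for s in range(n):
            if seen[s]:
                continue
            seen[s], size, queue = True, 1, deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if not seen[v]:
                        seen[v] = True
                        size += 1
                        queue.append(v)
            best = max(best, size)
        return best

    n = 2000
    for c in (0.5, 1.0, 1.5):
        print(f"c = {c}: largest component has ~{largest_component(n, c / n)} vertices")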
A graph property is said to be monotone increasing if adding edges to G cannot destroy the property (i.e.,
whenever G has the property, so does any graph obtained by adding edges to G). Monotone decreasing
properties are defined analogously. A host of natural graph properties (including all the ones discussed
above) are monotone. The following theorem of Bollobás and Thomason [BT86] confirms that the threshold
phenomenon is ubiquitous:
Theorem 6.11 Every monotone increasing (or decreasing) graph property has a threshold.
In our examples so far, the threshold values p(n) were pretty small (e.g., n−2/3 , n−1 ). It is natural to look
at the case p = 1/2, i.e., the random graph model Gn,1/2 , in which every graph on n vertices has equal
probability. For example, we know that almost every graph in this model has a k-clique for any fixed k, since the threshold is n^(−k/(k choose 2)) = n−2/(k−1) ≪ 1/2. This is a special case of the following more general theorem (which we will not prove) about properties of almost every graph:
Theorem 6.12 (Fagin [F76]) For any first order property Q, and for any constant p ∈ (0, 1), either almost
every G ∈ Gn,p has Q or almost every G ∈ Gn,p does not have Q.
(The term “almost every” here means “with probability tending to 1 as n → ∞”.) Informally, we can define
a first order property as one expressible as a finite sentence, using ∀, ∃, ∨, ∧, ¬, variables denoting vertices of
the graph, and the relation ∼ denoting adjacency in the graph.
Fact 6.14 The following properties hold for almost every G in Gn,1/2 :
• G has diameter 2.
• G does not have diameter 1.
• G is Hamiltonian.
• G is connected.
The first two of these follow from the above theorem about first-order properties. The last two are not
first-order properties and require separate proofs.
The clique number of a graph G is the size of a largest clique in G. Determining the clique number of a graph
is a famous NP-hard problem. However, the following theorem says that the clique number of a random
graph is known rather precisely:
Theorem 6.15 For G ∈ Gn,p for any constant p ∈ (0, 1), the clique number of G is almost surely ∼
2 log1/p (n).
In particular, when p = 1/2 the maximum clique size is almost surely ∼ 2 log2 n. We will see how to prove
this theorem in the next lecture.
In fact, even more is known:
Theorem 6.16 (Bollobás and Erdős [BE76], Matula [M76]) For p = 1/2, the maximum clique in
almost every graph G has size either k(n) or k(n) + 1, for some integer k(n).
In other words, the clique number of almost every graph is specified within two adjacent integers!
As an example, for n = 1000 and p = 1/2, the largest clique size is 15 or 16 with high probability.
Challenge: Find a polynomial-time algorithm that outputs a clique of size ≥ (1 + ε) log2 n with high probability in almost every graph G. This is a major open problem in average-case complexity. The folklore says that all known algorithms find a clique of size no larger than (and usually as large as) (1 − ε) log2 n. Even though we know that G almost certainly contains a clique of size 2 log2 n, we don’t know how to find one more than half this size!
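For concreteness, here is a minimal sketch (an illustration, not from the notes; the parameters are arbitrary) of the folklore greedy heuristic: scan the vertices in order and keep each one that is adjacent to all vertices kept so far. In Gn,1/2 this finds a clique of size about log2 n with high probability, i.e., roughly half of the 2 log2 n guaranteed by Theorem 6.15.

    # Sketch of the folklore greedy heuristic in G(n, 1/2): scan vertices in order and
    # keep each one adjacent to all vertices kept so far. Edges are sampled lazily.
    import math
    import random

    def greedy_clique(n, p=0.5):
        edges = {}                                   # lazily sampled edge indicators

        def adjacent(u, v):
            key = (min(u, v), max(u, v))
            if key not in edges:
                edges[key] = random.random() < p
            return edges[key]

        clique = []
        for v in range(n):
            if all(adjacent(v, u) for u in clique):
                clique.append(v)
        return clique

    n = 4096
    print(f"greedy clique size: {len(greedy_clique(n))}  (log2 n = {math.log2(n):.1f})")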
References
[B81] B. Bollobás, “Random graphs,” in Combinatorics, Proceedings, Swansea 1981, London Math-
ematical Society Lecture Note Series 52, Cambridge University Press, pp. 80–102.
[BE76] B. Bollobás and P. Erdős, “Cliques in random graphs,” Mathematical Proceedings of the
Cambridge Philosophical Society 80 (1976), pp. 419–427.
[BT86] B. Bollobás and A. Thomason, “Threshold functions,” Combinatorica 7 (1986), pp. 35–38.
[ER60] P. Erdős and A. Rényi, “On the evolution of random graphs,” Publ. Math. Inst. Hungar.
Acad. Sci. 5 (1960), pp. 17–61.
[F76] R. Fagin, “Probabilities in finite models,” Journal of Symbolic Logic 41 (1976), pp. 50–58.
[M76] D.W. Matula, “The largest clique size in a random graph,” Technical Report, Southern Methodist University, Dallas, 1976.