Lecture 2:
Review of thermodynamics and intro to statistical mechanics
Prof. Mark W. Tibbitt – ETH Zürich – 21 February 2019
1 Suggested reading
• Molecular Driving Forces – Dill and Bromberg: Chapters 1, 2, 3, 5, 6, & 10
2 Thermodynamics
In this course, we will not always rigorously derive or prove the thermodynamic relations we use, leaving that for dedicated thermodynamics or statistical mechanics courses. Instead, we are interested in developing
physical intuition and simple models grounded in thermodynamic principles for the problems that we address.
That is, we want to understand the physical system and be able to make appropriate assumptions as needed.
Thermodynamics is a useful toolset for reasoning about energies and entropies. It allows us to predict the tendency of atoms and molecules to react; to adsorb, diffuse, or partition; to dissolve or change phase; and to alter their shape or bonding arrangement. Recall the first law of thermodynamics, dU = dq − dw, and the second law of thermodynamics, dS ≥ 0 for an isolated system. At the simplest level, thermodynamics can be thought of as these two laws and some calculus.
A thermodynamic system is a collection of matter in any form, separated from its surroundings by real or
imaginary boundaries. How the system is defined will depend on the problem that we want to solve. Boundaries
are key as they define what comes in and goes out. At that point, we need to do some bookkeeping, accounting
for energy and matter exchange across the boundaries or for changes in volume. Many times these systems
contain subsystems that are also defined by real or imaginary boundaries.
Within these systems we have extensive and intensive properties. An extensive property X is the sum of the same property in each of the subsystems: X = X1 + X2 + X3 + · · · . That is, extensive properties increase with the system size. On the other hand, intensive properties are independent of the system size. Temperature T, pressure p, and concentrations [Ni] are intensive properties. Extensive properties include volume V, number of atoms or molecules Ni, internal energy U, and entropy S. In thermodynamics, each extensive property is
related to an intensive property. We will see these conjugate pairs below in the discussion of thermodynamic
driving forces.
The study of thermal and chemical equilibrium is governed by the second law of thermodynamics or
entropy maximization. In words, this is the assertion that the equilibrium observable state (macrostate) of an
isolated system is the state that can occur in the largest number of different ways (microstates). Or, when given the choice among all observable macrostates, the isolated system at equilibrium will adopt the macrostate that is represented by the largest number of microstates. We can write down a fundamental thermodynamic equation
for entropy as a multivariate expression of extensive properties:
S = S(U, V, {Ni })
This is just a mathematical way of saying that the entropy depends on U, V, N1, N2, N3, . . . and that each of these is independent of the others. Traditionally, thermodynamics is discussed by writing down a multivariate
fundamental thermodynamic equation for energy:
U = U (S, V, {Ni })
We can also compose an equation for a free energy, such as the Helmholtz free energy: F = U − T S. This is more applicable to the ‘real’ (non-isolated) systems that we encounter in the world around us, and here we define equilibrium as the minimum of the free energy. For this to occur, the system tends toward both low energy and high entropy. Thus, there is an interplay between the energy of the system and the entropy of the system, which we will explore throughout this course. At high temperatures, entropy dominates; at low temperatures, energy dominates. Note that this is still entropy maximization if you construct an isolated composite system by placing the ‘real’ system in a reservoir: say, a test tube in a water bath.
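To make the competition between energy and entropy concrete, here is a minimal sketch in Python (all energies and multiplicities are arbitrary, hypothetical values) that compares the Helmholtz free energy F = U − T S, with S = kB ln W, of a low-energy, low-multiplicity state and a higher-energy, high-multiplicity state at several temperatures:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

# A hypothetical two-state comparison: state 1 is low-energy but ordered
# (few microstates); state 2 is higher-energy but disordered (many microstates).
# All numbers are arbitrary, illustrative values.
U1, W1 = 0.0, 1.0e3          # energy (J), multiplicity
U2, W2 = 1.0e-19, 1.0e6      # energy (J), multiplicity

def helmholtz(U: float, W: float, T: float) -> float:
    """F = U - T S with S = k_B ln W."""
    return U - T * k_B * math.log(W)

for T in (10.0, 300.0, 3000.0):  # temperatures in K
    F1, F2 = helmholtz(U1, W1, T), helmholtz(U2, W2, T)
    favored = 1 if F1 < F2 else 2
    print(f"T = {T:6.0f} K: F1 = {F1:.3e} J, F2 = {F2:.3e} J -> state {favored} favored")
```

With these illustrative numbers, the low-energy state has the lower free energy at the two lower temperatures, while the high-multiplicity state wins at the highest temperature.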
3 Thermodynamic driving forces
Taking the total differential of U = U(S, V, {Ni}) via the chain rule gives:

dU = (∂U/∂S)_{V,{Ni}} dS + (∂U/∂V)_{S,{Ni}} dV + Σ_{j=1}^{M} (∂U/∂Nj)_{S,V,Ni≠j} dNj .

For now these are just mathematical definitions, based on the chain rule from calculus. However, each of the partial derivatives of dU corresponds to a measurable physical property: temperature, T ; pressure, p; and the chemical potential of species j, µj . We are perhaps more used to seeing:
dU = T dS − p dV + Σ_{j=1}^{M} µj dNj . (3)
From this we see the relationships between measurable parameters and the partial differentials:
T ≡ (∂U/∂S)_{V,{Ni}} ,   p ≡ −(∂U/∂V)_{S,{Ni}} ,   µj ≡ (∂U/∂Nj)_{S,V,Ni≠j} . (4)
We can perform the same logic for dS by rearranging Eq. 3 and we see that:
dS = (1/T) dU + (p/T) dV − Σ_{j=1}^{M} (µj/T) dNj . (5)
and, therefore:
1/T ≡ (∂S/∂U)_{V,{Ni}} ,   p/T ≡ (∂S/∂V)_{U,{Ni}} ,   µj/T ≡ −(∂S/∂Nj)_{U,V,Ni≠j} . (6)
We can think of the expressions in Eqs. 4 and 6 as thermodynamic driving forces that push the system
toward maximum entropy or minimum energy. Temperature, T , describes the tendency of the system for energy
exchange. Energy is the capacity of the system to do work and that capacity can flow as heat. Entropy is the
tendency of the work capacity, or energy, to flow from one system to another. In this manner, entropy is a
kind of potential to transfer energy from one place to another and 1/T is the corresponding thermodynamic
driving force. Pressure, p, is a ‘force’ for changing volume. The chemical potential, µ, is a tendency for matter
exchange. In each case, this is done by applying the maximum entropy principle (dS_total = 0) to specify the
state of equilibrium. We can exploit this logic to derive simple concepts, such as the ideal gas law, and use
lattice models and the tools of statistical mechanics to understand these principles at the molecular scale.
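As a quick illustration of how these driving forces can be exploited (a sketch, assuming an ideal gas at fixed U and {Ni} whose entropy depends on volume as S = N kB ln V + const, a form that the lattice models below motivate), the volume relation in Eq. 6 gives:

p/T = (∂S/∂V)_{U,{Ni}} = N kB /V , i.e. pV = N kB T ,

which is the ideal gas law.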
4 Statistical mechanics
To gain a better understanding of the essence of entropy, we use the tools of statistical mechanics. Statistical
mechanics takes a probabilistic view of macroscopic systems whereby the macroscopic observable system is
viewed in terms of the average properties of a system consisting of a large number of degrees of freedom.
For an isolated system with fixed energy U, the basic postulate is that each of its Ω(U) microstates µ is equally probable:

p(µ) = 1/Ω(U). (7)

More generally, for any probability distribution p(i) over M states, the entropy can be written as:

S = − Σ_{i=1}^{M} p(i) ln p(i) (8)

This is the generalized entropy for any probability distribution. Applied to the uniform distribution of Eq. 7, we see:
S = − Σ_{µ=1}^{Ω} p(µ) ln p(µ) (9)
  = − Σ_{µ=1}^{Ω} [1/Ω(U)] ln [1/Ω(U)] (10)
  = − ln [1/Ω(U)] (11)
  = ln Ω(U). (12)
This is a reformulation of the classic definition of entropy postulated by Ludwig Boltzmann, and inscribed on
his tombstone:
S = kB ln W (13)
where kB is the Boltzmann constant, ∼ 1.38 × 10^−23 m^2 kg s^−2 K^−1, and provides the proper units on entropy.
Eq. 13 states that as we increase the number of microstates (or multiplicity W ) in a system, the entropy
increases. Also, it states that the entropy can never be negative. From these postulates, and Eq. 13, all we need
to do is some clever counting to calculate entropy and find those systems that have maximum multiplicity W as
these will have maximum entropy! This is true for the microcanonical ensemble discussed here and demonstrated
below. We will also discuss and explore the implications of statistical mechanics on other ensembles in future
lectures.
As an aside, to my knowledge the form of S = kB ln W, while rigorously tested and confirmed, has no a priori proof. However, some simple logic allows us to understand why it should take the logarithmic form. Consider a
system with two subsystems A and B having multiplicities WA and WB . The multiplicity of the total system will
be the product W = WA WB . We require that entropy is extensive, meaning that S = SA + SB . The logarithm
function satisfies this requirement: if SA = kB ln WA and SB = kB ln WB then S = kB ln W = kB ln WA WB =
kB ln WA + kB ln WB = SA + SB . This simple argument illustrates why S should be a logarithmic function of
W but is not a rigorous proof.
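A quick numerical check of this additivity (a minimal sketch; the multiplicities WA and WB are arbitrary illustrative values):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

# Arbitrary illustrative multiplicities for two subsystems A and B
W_A, W_B = 1.0e20, 2.5e12

S_A = k_B * math.log(W_A)
S_B = k_B * math.log(W_B)
S_total = k_B * math.log(W_A * W_B)   # multiplicity of the composite system is W_A * W_B

# Additivity of entropy follows from the logarithm: S_total = S_A + S_B
print(S_total, S_A + S_B, math.isclose(S_total, S_A + S_B))
```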
5 Probability
The concepts that we introduce here – probability, multiplicity, combinatorics, averages, and distribution functions – provide the foundation for statistical mechanics and calculating entropy.
We define probability as follows. If we have in total N events, measurements, or trials, each with t distinct possible outcomes A, B, C, . . . , and the number of events with each outcome is nA , nB , nC , . . . or {ni }, then the probability of each outcome is calculated as:

pi = ni /N . (14)
By definition,
Σ_{i=1}^{t} ni = N   and   Σ_{i=1}^{t} pi = 1. (15)
Thus, probabilities are in the range of zero to one: pi ∈ [0, 1]. If only one outcome is possible, the event is
deterministic and the outcome has a probability of one. If an outcome can never occur, it has a probability of
zero.
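A short simulation in the spirit of Eqs. 14 and 15 (a sketch, using a fair six-sided roll as the repeated event; N and the random seed are arbitrary choices):

```python
import random

random.seed(0)        # fixed seed so the sketch is reproducible
N = 100_000           # total number of trials (rolls)

counts = {face: 0 for face in range(1, 7)}
for _ in range(N):
    counts[random.randint(1, 6)] += 1   # one fair roll per trial

# Empirical probabilities p_i = n_i / N (Eq. 14); each should be close to 1/6
probabilities = {face: n / N for face, n in counts.items()}
print(probabilities)
print(sum(probabilities.values()))      # equals 1 up to floating-point rounding (Eq. 15)
```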
Consider the event of a single roll of a die with six sides. The probability that a 6 is rolled is 1/6, as there are N = 6 possible outcomes, of which n6 = 1 is a 6. But suppose we roll the die four times. We may want to calculate the probability that we roll a 6 every time. Or we may want to calculate the probability that we roll one 2 followed by two 4’s followed by one 2. Or we may want to calculate the probability that we roll two 4’s and two 5’s in any order. The rules of probability and combinatorics provide the tools to calculate these probabilities, as illustrated in the sketch further below. First we need to define certain aspects of the events.
The outcomes A, B, C, . . . are considered mutually exclusive if the occurrence of one precludes the occurrence of all the others. For example, with the die, 2 and 4 are mutually exclusive as only one number can appear face up for each roll. The outcomes A, B, C, . . . are considered collectively exhaustive if they comprise all possible outcomes for the event, trial, or measurement and no other possible outcomes exist. For example, {1, 2, 3, 4, 5, 6} is the collectively exhaustive set of outcomes for the roll of the die, as long as we discount any of the rare times that it might end up balanced on an edge. Events are independent if the outcome of each is unrelated to, or not correlated with, the outcome of any other event. That is, there is no information transfer from event to event. Assuming that no one has manipulated the die, each roll is independent of all past rolls.
The multiplicity of events is the total number of ways in which different outcomes can occur. If the number
of outcomes of type X is nX and the number of outcomes of type Y is nY , then the total number of possible
combinations of outcomes is the multiplicity
W = nX nY (16)
In the case where we have multiple subsystems 1, 2, 3, . . . with multiplicities W1 , W2 , W3 , . . . we can calculate
the multiplicity of the total system:
W = W1 W2 W3 · · · (17)
The rules of probability allow us to calculate the probabilities of combinations of events. One such case
is the probability that outcome A OR outcome B will occur in a given event. If the outcomes A, B, C, . . . are
mutually exclusive, then the probability of observing A OR B (A ∪ B) is,
p(A ∪ B) = pA + pB . (18)
This is often stated as the addition rule and requires that the outcomes are mutually exclusive.
Another such case is the probability that outcome A AND outcome B occur in successive events. If the
outcomes A, B, C, . . . are independent, then the probability of observing A AND B in successive events (A ∩ B)
is,
p(A ∩ B) = pA pB . (19)
This is often stated as the multiplication rule and requires that the outcomes are independent. A more general
form can be constructed in cases where the outcomes are not independent and occur with a conditional
probability. The conditional probability p(B|A) is the probability of outcome B, given that outcome A has
occurred. In this case, the multiplication rule generalizes to p(A ∩ B) = p(B|A) pA .
The number of distinguishable ways to obtain n outcomes of one type in N total trials is given by the binomial coefficient:

W (n, N ) = N !/[n!(N − n)!] (23)

where the factorial N ! is defined as:
N ! = N (N − 1)(N − 2) · · · 1. (24)
By definition, 0! ≡ 1.
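Returning to the four rolls of the die posed above, a minimal sketch (assuming a fair die) that applies the multiplication rule (Eq. 19) together with factorial counting:

```python
from fractions import Fraction
from math import factorial

p = Fraction(1, 6)   # probability of any particular face of a fair die

# Four 6's in a row: multiply the independent probabilities (Eq. 19)
p_all_sixes = p**4

# One specific ordered sequence, e.g. (2, 4, 4, 2): the same product, since the order is fixed
p_specific_sequence = p**4

# Two 4's and two 5's in any order: count the distinguishable orderings of
# {4, 4, 5, 5}, which is 4!/(2! 2!) = 6, and multiply by the per-sequence probability
orderings = factorial(4) // (factorial(2) * factorial(2))
p_two_fours_two_fives = orderings * p**4

print(p_all_sixes, p_specific_sequence, p_two_fours_two_fives)   # 1/1296, 1/1296, 1/216
```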
It is often convenient to combine W (n, N ) with the sequence probability p^n (1 − p)^{N −n} to generate a function from which we can calculate averages and visualize distributions. We consider here the binomial distribution, obtained by combining Eq. 23 with Eq. 19:
P (n, N ) = [N !/(n!(N − n)!)] p^n (1 − p)^{N −n} . (25)
where p is the probability of ‘success’. As expected, the average value of the distribution is ⟨n⟩ = N p. What happens to this distribution in the limit of large N ?
First observation: the probability distribution rapidly collapses onto the condition that maximizes W . We will see that this is equivalent to maximizing entropy! Second, we can approximate the binomial distribution as a continuous function, the Gaussian function, for sufficiently large N . Let us define q = 1 − p and then solve for the most probable value n̄. We can do this more easily by taking the derivative of ln P (n, N ) with respect to n and setting it to 0.
ln P (n, N ) = ln [ (N !/(n!(N − n)!)) p^n q^{N −n} ] (26)
            = n ln p + (N − n) ln q + ln N ! − ln n! − ln(N − n)! (27)
            ≈ n ln p + (N − n) ln q + N ln N − N − n ln n + n − (N − n) ln(N − n) + (N − n) (28)
∂ ln P (n, N )/∂n = ln p − ln q − ln n + ln(N − n) (29)
                  = ln [ p(N − n)/(nq) ] = 0 at the maximum (30)
∴ n̄ q = p(N − n̄) (31)
n̄ = N p/(p + q) = N p. (32)
NB: In this derivation, we used Stirling’s approximation: ln x! ≈ x ln x − x. We will employ this throughout the course. Next, we can take a Taylor series expansion of ln P (n, N ) about the most probable value, n̄:
ln P (n, N ) = ln P (n̄, N ) + [∂ ln P (n, N )/∂n]_{n=n̄} (n − n̄) + (1/2!) [∂² ln P (n, N )/∂n²]_{n=n̄} (n − n̄)² + · · · (33)
From Eq. 29:
∂² ln P (n, N )/∂n² = −1/n − 1/(N − n) (34)
                    = −1/(N p) − 1/(N − N p)   evaluated at n̄ (35)
                    = −(1/N ) [1/p + 1/(1 − p)] (36)
                    = −1/[N p(1 − p)] = −1/(N pq). (37)
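The curvature in Eq. 37 implies that, near n̄ = N p, ln P (n, N ) is approximately quadratic, so the binomial distribution approaches a Gaussian with mean N p and variance N pq. A quick numerical sketch (N = 1000 and p = 0.3 are arbitrary choices) comparing the exact binomial with this Gaussian:

```python
import math

# Arbitrary illustrative parameters
N, p = 1000, 0.3
q = 1.0 - p
mean, var = N * p, N * p * q          # n_bar = Np and variance Npq (Eq. 37)

def binomial_pmf(n: int) -> float:
    """Exact binomial probability P(n, N) from Eq. 25."""
    return math.comb(N, n) * p**n * q**(N - n)

def gaussian_pmf(n: float) -> float:
    """Gaussian approximation with mean Np and variance Npq."""
    return math.exp(-((n - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

for n in (270, 285, 300, 315, 330):
    print(n, binomial_pmf(n), gaussian_pmf(n))   # the two columns agree closely
```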
As a first example, consider placing N = 3 particles on a lattice of V sites, with V = 5 (Case A), V = 4 (Case B), and V = 3 (Case C). We can then calculate the multiplicity based on the assumption that every possible arrangement of vacancies and occupancies is equally likely (the statistical mechanical approach):
W (N, V ) = V !/[N !(V − N )!] (41)
so that WA (3, 5) = 10, WB (3, 4) = 4, and WC (3, 3) = 1. The multiplicity increases as the volume increases. That
is, if the degree of freedom for the system is its volume, the particles will spread to occupy the largest allowable
volume to maximize the multiplicity of the system. This is the basis for the force called pressure. This simple
model can be used to understand the entropic driving force in pressure and even to derive the ideal and van der
Waals gas laws!
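A one-line version of this counting (a sketch, using the same N = 3 particles on V = 5, 4, and 3 lattice sites as in Cases A, B, and C above):

```python
from math import comb

# Multiplicity of N indistinguishable particles on V lattice sites (Eq. 41)
def W(N: int, V: int) -> int:
    return comb(V, N)

for label, V in (("A", 5), ("B", 4), ("C", 3)):
    print(f"Case {label}: W(3, {V}) = {W(3, V)}")   # expect 10, 4, 1
```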
In another example, we can ask why molecules diffuse. Here, we can place four white spheres and four black spheres on eight lattice sites. Imagine that we are able to organize the white spheres and black spheres into separate sections of the lattice (see Case C below) and place a perfectly permeable barrier between them. This represents the unmixed case. Now calculate the multiplicities for the left and right regions. For Case C, Wleft = (4 choose 0) = 1 and Wright = (4 choose 4) = 1, and from Eq. 17, WC = 1. We can do the same for Cases B and A and see that WB = (4 choose 1) · (4 choose 3) = 16 and WA = (4 choose 2) · (4 choose 2) = 36.
Here, if the degree of freedom is the extent of particle exchange, then the system will have the greatest multiplicity when the ‘concentration’ is uniform throughout. This can be understood as the entropic basis of chemical
potential!
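The same counting reproduces the mixing multiplicities (a sketch; the number of white spheres assigned to the left region in each case follows the text above):

```python
from math import comb

# Number of white spheres on the 4 left-hand sites in each case; the remaining
# white spheres sit on the 4 right-hand sites (black spheres fill what is left)
cases = {"A": 2, "B": 1, "C": 0}

for label, n_left in cases.items():
    W_left = comb(4, n_left)         # arrangements of white spheres on the left
    W_right = comb(4, 4 - n_left)    # arrangements of white spheres on the right
    print(f"Case {label}: W = {W_left * W_right}")   # expect 36, 16, 1
```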
As a final example, we can ask why rubber is elastic. This is a simple question that is foundational for this course. When you pull on a piece of rubber, it pulls back; when you push on it, it pushes back. As we will discuss in much more detail, these restorative forces can be understood by the tendency of a polymer chain to adopt the conformation that maximizes its multiplicity. A simple lattice model can be considered where we attach the first monomer to a wall and ‘grow’ a short chain of three monomers on a two-dimensional lattice. The degree of freedom here is the distance, ℓ, of the end of the chain from the wall. All we then need to do is count the possible conformations of the polymer chain as a function of ℓ. Counting these, we find W (ℓ = 1) = 2, W (ℓ = 2) = 4, and W (ℓ = 3) = 1. Thus, the polymer chain will adopt a conformation with ℓ = 2, and if we pull it to ℓ = 3 there will be a restoring force that is felt as elasticity!
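We can reproduce this count by brute-force enumeration (a minimal sketch, assuming a square lattice with an impenetrable wall at x = 0, the first monomer fixed at x = 1, a self-avoiding chain of three monomers, and ℓ taken as the x-position of the last monomer):

```python
from collections import Counter

steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # unit steps on the 2D square lattice
start = (1, 0)                               # first monomer, attached next to the wall at x = 0
counts = Counter()                           # W(ell), tallied by the chain end's distance from the wall

for dx1, dy1 in steps:
    second = (start[0] + dx1, start[1] + dy1)
    if second[0] < 1:                                       # cannot step into the wall
        continue
    for dx2, dy2 in steps:
        third = (second[0] + dx2, second[1] + dy2)
        if third[0] < 1 or third in (start, second):        # wall or self-overlap
            continue
        counts[third[0]] += 1                               # ell = x-coordinate of the chain end

print(sorted(counts.items()))   # expect [(1, 2), (2, 4), (3, 1)]
```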
In these model examples we see that the systems tend toward maximum multiplicity. In all of these examples
the energy of the system is fixed and so all we need to do is count multiplicities and find that the system tends
to maximize multiplicity, which directly maximizes the entropy: S = kB ln W . In later examples, we will
also consider the energies of non-ideal systems and derive physical relations from the same simple principles
introduced here.