
Lecture 2:
Review of thermodynamics and intro to statistical mechanics
Prof. Mark W. Tibbitt – ETH Zürich – 21 February 2019

1 Suggested reading
• Molecular Driving Forces – Dill and Bromberg: Chapters 1, 2, 3, 5, 6, & 10

2 Thermodynamics
In this course, we do not intend to rigorously derive or prove every thermodynamic relation, leaving
that for dedicated thermodynamics or statistical mechanics courses. Instead, we are interested in developing
physical intuition and simple models grounded in thermodynamic principles for the problems that we address.
That is, we want to understand the physical system and be able to make appropriate assumptions as needed.
Thermodynamics is a useful toolset for reasoning about energies and entropies. It allows us to predict the
likelihood of atoms and molecules to react; to adsorb, diffuse, or partition; to dissolve or change phase; and to
alter their shape or bonding arrangement. Recall the first law of thermodynamics, dU = dq − dw, and the
second law of thermodynamics, dS ≥ 0 for an isolated system. At the simplest level, thermodynamics can be thought of as these
two laws and some calculus.
A thermodynamic system is a collection of matter in any form, separated from its surroundings by real or
imaginary boundaries. How the system is defined will depend on the problem that we want to solve. Boundaries
are key as they define what comes in and goes out. At that point, we need to do some bookkeeping, accounting
for energy and matter exchange across the boundaries or for changes in volume. Many times these systems
contain subsystems that are also defined by real or imaginary boundaries.
Within these systems we have extensive and intensive properties. An extensive property X is the sum of
the same property in each of the subsystems, X = X_1 + X_2 + X_3 + · · · . That is, extensive properties increase with
the system size. On the other hand, intensive properties are independent of the system size. Temperature T,
pressure p, and concentrations [N_i] are intensive properties. Extensive properties include volume V, number
of atoms or molecules N_i, internal energy U, and entropy S. In thermodynamics, each extensive property is
related to an intensive property. We will see these conjugate pairs below in the discussion of thermodynamic
driving forces.
The study of thermal and chemical equilibrium is governed by the second law of thermodynamics or
entropy maximization. In words, this is the assertion that the equilibrium observable state (macrostate) of an
isolated system is that state that can occur in the largest number of different ways (microstates). Or, when
given the choice among all observable macrostates, the isolated system at equilibrium will adopt the macrostate
that is represented by the largest number of microstates. We can write down a fundamental thermodynamic equation
for entropy as a multivariate expression of extensive properties:

S = S(U, V, {N_i})
This is just a mathematical form to say that the entropy is dependent on U, V, N_1, N_2, N_3, . . . and that each of
these are independent of one another. Traditionally, thermodynamics is discussed by writing down a multivariate
fundamental thermodynamic equation for energy:

U = U(S, V, {N_i})
We can also compose an equation for the free energy, such as the Helmholtz free energy, F = U − TS. This
is more applicable for ‘real’ (non-isolated) systems that we encounter in the world around us, and we
define equilibrium as the minimum of the free energy – for this to occur the system will tend toward low energy and
high entropy. Thus, there is an interplay between the energy of the system and the entropy of the system,
which we will explore throughout this course. At high temperatures, entropy dominates. At low temperatures,
energy dominates. Note, this is also entropy maximization if you construct an isolated composite system by placing the ‘real’
system in a reservoir: say, a test tube in a water bath.


3 Thermodynamic driving forces


The fundamental definitions of temperature, pressure, and chemical potential are classically based on the form
U = U(S, V, {N_i}). The microscopic origin of these properties is better understood in terms of entropy, and
so we will often alternate between U = U(S, V, {N_i}) and S = S(U, V, {N_i}) in this course.
In general, systems will be in equilibrium at extrema in the fundamental thermodynamic equation for energy,
entropy, or free energy. As stated above, the tendency toward maximum entropy S (or maximum microstate
multiplicity W) presides over the study of thermal and chemical equilibrium. Thus, we are often interested in
calculating small changes in this parameter as we deviate a system from its current state. Therefore, we calculate
dS (or in the case of minimizing U, dU ). We will also see that the partial differentials of these multivariate
equations can be considered as thermodynamic driving forces (connected to intensive physical properties) that
push a system toward equilibrium.
First consider dS:

$$dS = \left(\frac{\partial S}{\partial U}\right)_{V,\{N_i\}} dU + \left(\frac{\partial S}{\partial V}\right)_{U,\{N_i\}} dV + \sum_{j=1}^{M} \left(\frac{\partial S}{\partial N_j}\right)_{U,V,N_{i \neq j}} dN_j. \tag{1}$$

We can calculate the same for dU:

$$dU = \left(\frac{\partial U}{\partial S}\right)_{V,\{N_i\}} dS + \left(\frac{\partial U}{\partial V}\right)_{S,\{N_i\}} dV + \sum_{j=1}^{M} \left(\frac{\partial U}{\partial N_j}\right)_{S,V,N_{i \neq j}} dN_j. \tag{2}$$

For now these are just mathematical definitions, based on the chain rule from calculus. However, each of
the partial derivatives in the expression for dU corresponds to a measurable physical property: temperature, T; pressure, p; and the
chemical potential of species j, µ_j. We are perhaps more used to seeing:
$$dU = T\,dS - p\,dV + \sum_{j=1}^{M} \mu_j\, dN_j. \tag{3}$$

From this we see the relationships between measurable parameters and the partial differentials:

$$T \equiv \left(\frac{\partial U}{\partial S}\right)_{V,\{N_i\}}, \qquad p \equiv -\left(\frac{\partial U}{\partial V}\right)_{S,\{N_i\}}, \qquad \mu_j \equiv \left(\frac{\partial U}{\partial N_j}\right)_{S,V,N_{i \neq j}}. \tag{4}$$
We can perform the same logic for dS by rearranging Eq. 3 and we see that:

$$dS = \frac{1}{T}\,dU + \frac{p}{T}\,dV - \sum_{j=1}^{M} \frac{\mu_j}{T}\, dN_j. \tag{5}$$

and, therefore:

$$\frac{1}{T} \equiv \left(\frac{\partial S}{\partial U}\right)_{V,\{N_i\}}, \qquad \frac{p}{T} \equiv \left(\frac{\partial S}{\partial V}\right)_{U,\{N_i\}}, \qquad \frac{\mu_j}{T} \equiv -\left(\frac{\partial S}{\partial N_j}\right)_{U,V,N_{i \neq j}}. \tag{6}$$

We can think of the expressions in Eqs. 4 and 6 as thermodynamic driving forces that push the system
toward maximum entropy or minimum energy. Temperature, T, describes the tendency of the system to exchange
energy. Energy is the capacity of the system to do work, and that capacity can flow as heat. Entropy describes the
tendency of this work capacity, or energy, to flow from one system to another. In this manner, entropy is a
kind of potential to transfer energy from one place to another, and 1/T is the corresponding thermodynamic
driving force. Pressure, p, is a ‘force’ for changing volume. The chemical potential, µ, is a tendency for matter
exchange. In each case, equilibrium is specified by applying the maximum entropy principle (dS_total = 0). We can exploit this logic to derive simple concepts, such as the ideal gas law, and use
lattice models and the tools of statistical mechanics to understand these principles at the molecular scale.
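As a short worked illustration (not in the original notes), consider an isolated system split into two subsystems A and B that can exchange only energy, so that dU_B = −dU_A. Applying the maximum entropy condition together with Eq. 6 gives

$$dS_{\mathrm{total}} = \left(\frac{\partial S_A}{\partial U_A}\right)_{V,\{N_i\}} dU_A + \left(\frac{\partial S_B}{\partial U_B}\right)_{V,\{N_i\}} dU_B = \left(\frac{1}{T_A} - \frac{1}{T_B}\right) dU_A = 0 \quad \Rightarrow \quad T_A = T_B.$$

The analogous argument for volume or particle exchange yields p_A = p_B and µ_{j,A} = µ_{j,B} at equilibrium.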

4 Statistical mechanics
To gain a better understanding of the essence of entropy, we use the tools of statistical mechanics. Statistical
mechanics takes a probabilistic view of macroscopic systems whereby the macroscopic observable system is
viewed in terms of the average properties of a system consisting of a large number of degrees of freedom.


Macrostate refers to the observable, equilibrium properties of a system as described by thermodynamics.


Microstate describes one particular configuration of a system that is consistent with the observable macrostate.
Let us consider an ideal gas of a large number of molecules (on the order of 10^23) at a given temperature T and
volume V as a macrostate. The observable properties say nothing about the relative positions and momenta
of the molecules and we can imagine an immense number of ways of arranging them that lead to the same
macrostate – each arrangement corresponds to a microstate. We can imagine the molecules in constant flux,
sampling many such microstates such that the average properties of the system give the macrostate properties
that we observe. We refer to the collection of possible microstates associated with a given macrostate as an
ensemble. At this scale, we only know that the molecules obey the laws of quantum mechanics and are often
well approximated by classical Newtonian mechanics. Statistical mechanics gives us the tools to explain
the macroscale properties, including but not limited to thermodynamic state functions, by applying probability
theory to the mechanical equations of motion for a large ensemble of systems of particles. Instead of trying to
follow each particle (atom or molecule) and calculating its 6 degrees of freedom (position: x, y, z and momentum:
p_x, p_y, p_z) we use the rules of probability to make predictions about possible observable states.
There are several key assumptions that the rules of statistical mechanics are built on. First, we assert that
macroscopic systems are composed of molecules that obey quantum mechanical or classical equations of motion.
Second, we assume that the systems we study are ergodic, meaning that the time average of an observable is
equivalent to its ensemble average: ⟨X⟩ = X̄. Stated another way, this means that, provided infinite time, the
system will visit all possible microstates.
For now we will restrict our discussion to a specific ensemble known as the microcanonical ensemble.
Here, we assume that we know the macrostate possesses a given fixed internal energy U . Thus, for a microstate
to be consistent with the macrostate, it must also have the same internal energy U . Since all microstates have
the same energy, the central postulate of statistical mechanics is that all microstates are equally
probable. This is also called the assumption of equal a priori equilibrium probabilities. This is to say that if
we have a collection of particles (atoms or molecules) of some fixed energy, then any time we take an image of
a microstate, all possible positions and momenta that together are consistent with the fixed internal energy U
will be equally likely. We can express this mathematically:

$$p(\mu)_U = \frac{1}{\Omega(U)}, \tag{7}$$
where µ is a given microstate and Ω(U ) is the total number of microstates with internal energy U . This states
simply that the probability of any given microstate is equal.
We now introduce a general definition of the entropy of a system, Gibbs’ coarse-grained entropy, based on
the probability distribution of microstates for that system:

$$S = -\sum_{i=1}^{M} p(i) \ln p(i) \tag{8}$$

This is the generalized entropy for any probability distribution. Applying it to the probability distribution from Eq.
7, we see:


$$\begin{aligned}
S &= -\sum_{\mu=1}^{\Omega(U)} p(\mu) \ln p(\mu) && (9)\\
&= -\sum_{\mu=1}^{\Omega(U)} \frac{1}{\Omega(U)} \ln \frac{1}{\Omega(U)} && (10)\\
&= -\ln \frac{1}{\Omega(U)} && (11)\\
&= \ln \Omega(U). && (12)
\end{aligned}$$

This is a reformulation of the classic definition of entropy postulated by Ludwig Boltzmann, and inscribed on
his tombstone:

$$S = k_B \ln W \tag{13}$$

where k_B is the Boltzmann constant, approximately 1.38 × 10^−23 m^2 kg s^−2 K^−1, and provides the proper units for entropy.
Eq. 13 states that as we increase the number of microstates (or multiplicity W) in a system, the entropy
increases. Also, it states that the entropy can never be negative. From these postulates, and Eq. 13, all we need
to do is some clever counting to calculate entropy and find those systems that have maximum multiplicity W as
these will have maximum entropy! This is true for the microcanonical ensemble discussed here and demonstrated
below. We will also discuss and explore the implications of statistical mechanics on other ensembles in future
lectures.
As an aside, to my knowledge, the form of S = k_B ln W, while rigorously tested and confirmed, has no a priori
proof. However, some simple logic allows us to understand why it should take the logarithmic form. Consider a
system with two subsystems A and B having multiplicities W_A and W_B. The multiplicity of the total system will
be the product W = W_A W_B. We require that entropy is extensive, meaning that S = S_A + S_B. The logarithm
function satisfies this requirement: if S_A = k_B ln W_A and S_B = k_B ln W_B, then S = k_B ln W = k_B ln(W_A W_B) =
k_B ln W_A + k_B ln W_B = S_A + S_B. This simple argument illustrates why S should be a logarithmic function of
W but is not a rigorous proof.
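As an optional numerical check (not part of the original notes), the short Python sketch below evaluates the Gibbs entropy of Eq. 8 for a uniform distribution over Ω microstates and confirms that it reduces to ln Ω (Eq. 12), and that multiplying the multiplicities of two subsystems adds their entropies; the particular values of Ω, W_A, and W_B are arbitrary.

```python
import math

def gibbs_entropy(probabilities):
    """Gibbs entropy S = -sum_i p_i ln p_i (in units of k_B), Eq. 8."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

# Uniform distribution over Omega equally probable microstates (Eq. 7)
omega = 1000
uniform = [1.0 / omega] * omega
print(gibbs_entropy(uniform), math.log(omega))   # both ~ 6.9078 = ln(1000)

# Additivity: W = W_A * W_B  =>  ln W = ln W_A + ln W_B (up to floating point)
w_a, w_b = 250, 4000
print(math.log(w_a * w_b), math.log(w_a) + math.log(w_b))
```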

5 Probability
The concepts that we introduce here – probability, multiplicity, combinatorics, averages, and distribution
functions – provide the foundation for statistical mechanics and calculating entropy.
We define probability as follows. If we have in total N events, measurements, or trials, each with t distinct
possible outcomes A, B, C, . . ., and the number of events for each outcome is n_A, n_B, n_C, . . . or {n_i}, then the
probability of each outcome is calculated as:
$$p_i = \frac{n_i}{N}. \tag{14}$$
By definition,

$$\sum_{i=1}^{t} n_i = N \qquad \text{and} \qquad \sum_{i=1}^{t} p_i = 1. \tag{15}$$

Thus, probabilities are in the range of zero to one: pi ∈ [0, 1]. If only one outcome is possible, the event is
deterministic and the outcome has a probability of one. If an outcome can never occur, it has a probability of
zero.
Consider the event of a single roll of a six-sided die (Würfel). The probability that a 6 is rolled is 1/6, as
there are N = 6 possible outcomes and n_6 = 1 of them is a 6. But suppose we roll the die four times. We
may want to calculate the probability that we roll a 6 every time. Or we may want to calculate the probability
that we roll one 2, followed by two 4’s, followed by another 2. Or we may want to calculate the probability that we
roll two 4’s and two 5’s in any order. The rules of probability and combinatorics provide the tools to calculate
these probabilities. First we need to define certain aspects of the events.
The outcomes A, B, C, . . . are considered mutually exclusive if the occurrence of one precludes the
occurrence of all the others. For example, with the die, 2 and 4 are mutually exclusive as only one number
can appear face up on each roll. The outcomes A, B, C, . . . are considered collectively exhaustive if they
comprise all possible outcomes for the event, trial, or measurement and no other possible outcomes exist. For
example, {1, 2, 3, 4, 5, 6} is the collectively exhaustive set of outcomes for the roll of the die, as long as we
discount any of the rare times that it might land balanced on an edge. Events are independent if the outcome of each
is unrelated to, or not correlated with, the outcome of any other event. That is, there is no information transfer
from event to event. Assuming that no one has manipulated the die, each roll is independent of all past
rolls.
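As a quick illustration of Eq. 14 and of independence (not in the original notes), the following Python sketch estimates probabilities as frequencies n_i/N from simulated die rolls; the number of trials is arbitrary.

```python
import random

random.seed(0)
N = 200_000  # number of simulated trials

# Probability of rolling a 6 on a single roll: expect 1/6 ~ 0.167
count_six = sum(1 for _ in range(N) if random.randint(1, 6) == 6)
print(count_six / N)

# Probability that four independent rolls are all 6: expect (1/6)**4 ~ 0.00077
count_all_six = sum(
    1 for _ in range(N)
    if all(random.randint(1, 6) == 6 for _ in range(4))
)
print(count_all_six / N)

# Probability of two 4's and two 5's in any order: expect 6*(1/6)**4 ~ 0.0046
count_two_fours_two_fives = sum(
    1 for _ in range(N)
    if sorted(random.randint(1, 6) for _ in range(4)) == [4, 4, 5, 5]
)
print(count_two_fours_two_fives / N)
```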
The multiplicity of events is the total number of ways in which different outcomes can occur. If the number
of outcomes of type X is n_X and the number of outcomes of type Y is n_Y, then the total number of possible
combinations of outcomes is the multiplicity

$$W = n_X n_Y. \tag{16}$$
In the case where we have multiple subsystems 1, 2, 3, . . . with multiplicities W_1, W_2, W_3, . . . we can calculate
the multiplicity of the total system:

$$W = W_1 W_2 W_3 \cdots. \tag{17}$$
The rules of probability allow us to calculate the probabilities of combinations of events. One such case
is the probability that outcome A OR outcome B will occur in a given event. If the outcomes A, B, C, . . . are
mutually exclusive, then the probability of observing A OR B (A ∪ B) is,


$$p(A \cup B) = p_A + p_B. \tag{18}$$
This is often stated as the addition rule and requires that the outcomes are mutually exclusive.
Another such case is the probability that outcome A AND outcome B occur in successive events. If the
outcomes A, B, C, . . . are independent, then the probability of observing A AND B in successive events (A ∩ B)
is,

$$p(A \cap B) = p_A\, p_B. \tag{19}$$
This is often stated as the multiplication rule and requires that the outcomes are independent. A more general
form can be constructed in cases where the outcomes are not independent and occur with a conditional
probability. The conditional probability p(B|A) is the probability of outcome B, given that outcome A has
occurred. In this case,

$$p(A \cap B) = p(B|A)\, p_A = p(A|B)\, p_B. \tag{20}$$


Eq. 20 is more commonly referred to as Bayes’ rule (rearranging it gives the familiar form p(A|B) = p(B|A) p_A / p_B). Importantly,
Eq. 20 reduces to Eq. 19 in the case of independence, i.e., p(B|A) = p_B.
The addition rule can also be generalized for outcomes that are not mutually exclusive:

$$p(A \cup B) = p_A + p_B - p(A \cap B). \tag{21}$$


Again Eq. 21 reduces to Eq. 18 in the case where A and B are mutually exclusive, i.e., p(A ∩ B) = 0.
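To make Eqs. 20 and 21 concrete (a small check that is not in the original notes), the Python sketch below enumerates all 36 outcomes of two die rolls for two hypothetical outcomes that are neither mutually exclusive nor independent: A = “the first die is even” and B = “the sum is at least 8”.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all 36 equally likely pairs
total = len(outcomes)

def prob(event):
    """Probability as (favorable outcomes) / (total outcomes), Eq. 14."""
    return Fraction(sum(1 for o in outcomes if event(o)), total)

A = lambda o: o[0] % 2 == 0        # first die is even
B = lambda o: o[0] + o[1] >= 8     # sum is at least 8

p_A, p_B = prob(A), prob(B)
p_A_and_B = prob(lambda o: A(o) and B(o))
p_A_or_B = prob(lambda o: A(o) or B(o))
p_B_given_A = p_A_and_B / p_A      # conditional probability p(B|A)

# General addition rule (Eq. 21) and conditional-probability rule (Eq. 20)
assert p_A_or_B == p_A + p_B - p_A_and_B
assert p_A_and_B == p_B_given_A * p_A
print(p_A, p_B, p_A_and_B, p_A_or_B)  # 1/2, 5/12, 1/4, 2/3
```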
Combinatorics allows us to count events. This process is central to statistical mechanics and will be the
basis of calculating entropy. Combinatorics is the study of the composition of a series of events as opposed to
following the direct sequence of events. For example, we can contrast the following two curiosities. First: What
is the probability of the specific sequence of four coin flips HTTH (where H refers to heads, T refers to
tails, and p_H = p_T = 1/2)? Second: What is the probability of two H’s and two T’s in any order? The
first is addressed by Eq. 19: p(HTTH) = p_H p_T p_T p_H = (1/2)^4 = 1/16. The second is answered by counting
the number of sequences with two H’s and two T’s: HTTH, HHTT, HTHT, THHT, THTH, and TTHH.
Thus, the probability of observing two H’s and two T’s is 6/16 or 3/8.
These problems are tractable for coin flips or die rolls with small N. However, most physical systems of
interest are composed of extremely large numbers of atoms or molecules and we need to generalize our ability
to count! For this we use the math of permutations and combinations. In the example above, if each coin
is distinguishable – say by using coins of different values – there would be W = N! = 4! = 24 permutations for each
outcome. However, in general one coin is indistinguishable from another coin of the same value and if we use
four coins of the same value we can only distinguish H from T . Throughout this course, in general, items of
the same character will be treated as indistinguishable. In this case, the number of permutations is reduced by
the fact that we cannot distinguish between the different H’s and T ’s. In general, for a collection of N objects
(or events) with t categories, of which the n_i objects in each category are indistinguishable from each other but
distinguishable from the objects in the other categories, the number of permutations W is

$$W = \frac{N!}{n_A!\, n_B!\, n_C! \cdots n_t!}. \tag{22}$$
The multiplicity is also written as Ω and we will likely use the terms interchangeably in this course.
If there are only two categories, as in the case of a coin flip or of success/failure, the number of permutations
for n successes out of N events is

$$W(n, N) = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}. \tag{23}$$
As an aside, the notation N!, read ‘N factorial’, is the product of the integers from 1 to N:

$$N! = N(N-1)(N-2) \cdots 1. \tag{24}$$
By definition, 0! ≡ 1.
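A short Python sketch (not in the original notes) that reproduces the coin-flip count above using Eqs. 22 and 23; the die-roll category counts at the end are an invented example.

```python
import math

# Number of distinct sequences with two H's and two T's among four flips (Eq. 23)
W = math.comb(4, 2)                    # = 4!/(2! * 2!) = 6
print(W)

# Probability of two H's and two T's in any order, with p_H = p_T = 1/2 (Eq. 19)
print(W * 0.5 ** 4)                    # = 6/16 = 0.375

# Multinomial form of Eq. 22, e.g., orderings of one 2, two 4's, and one 5
# in four die rolls (category counts 1, 2, 1)
counts = [1, 2, 1]
W_multi = math.factorial(sum(counts)) // math.prod(math.factorial(n) for n in counts)
print(W_multi)                         # = 4!/(1! * 2! * 1!) = 12
```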
It is often convenient to combine W(n, N) with the probability of each individual sequence to generate a function from which we can calculate
averages and visualize distributions. We consider here the binomial distribution, obtained by combining Eq. 23
with Eq. 19:


$$P(n, N) = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^{N-n}, \tag{25}$$
where p is the probability of ‘success’. As expected, the average value for the distribution is ⟨n⟩ = Np. What
happens to this distribution in the limit of large N?

First observation: as N grows, the probability distribution rapidly concentrates around the value of n that maximizes W. We will
see that this is equivalent to maximizing entropy! Second, we can approximate the binomial distribution as a
continuous function, the Gaussian function, for sufficiently large N. Let us define q = 1 − p and then solve for
n̄, the most probable value of n. We can do this most easily by taking the derivative of ln P(n, N) and setting it to 0.

$$\begin{aligned}
\ln P(n,N) &= \ln\!\left[\frac{N!}{n!\,(N-n)!}\, p^n q^{N-n}\right] && (26)\\
&= n \ln p + (N-n)\ln q + \ln N! - \ln n! - \ln (N-n)! && (27)\\
&\approx n \ln p + (N-n)\ln q + N \ln N - N - n \ln n + n - (N-n)\ln(N-n) + (N-n) && (28)\\
\frac{\partial \ln P(n,N)}{\partial n} &= \ln p - \ln q - \ln n + \ln(N-n) && (29)\\
&= \ln \frac{p(N-n)}{n q} = 0 \quad \text{at the maximum} && (30)\\
\therefore \quad \bar{n} q &= p (N - \bar{n}) && (31)\\
\bar{n} &= \frac{Np}{p+q} = Np. && (32)
\end{aligned}$$
NB: In this derivation, we used Stirling’s approximation: ln x! ≈ x ln x − x. We will employ this throughout the
course. Next, we can take a Taylor series approximation about the most probable value, n̄:

$$\ln P(n,N) = \ln P(\bar{n},N) + \frac{\partial}{\partial n}\ln P(n,N)\Big|_{n=\bar{n}} (n - \bar{n}) + \frac{1}{2!}\frac{\partial^2}{\partial n^2}\ln P(n,N)\Big|_{n=\bar{n}} (n-\bar{n})^2 + \cdots \tag{33}$$
From Eq. 29:


$$\begin{aligned}
\frac{\partial^2}{\partial n^2}\ln P(n,N) &= -\frac{1}{n} - \frac{1}{N-n} && (34)\\
&= -\frac{1}{Np} - \frac{1}{N - Np} \quad \text{evaluated at } \bar{n} && (35)\\
&= -\frac{1}{N}\left(\frac{1}{p} + \frac{1}{1-p}\right) && (36)\\
&= -\frac{1}{N p (1-p)} = -\frac{1}{Npq}. && (37)
\end{aligned}$$

Back to the Taylor series: the first-order term in Eq. 33 vanishes because the derivative is zero at the maximum n̄, leaving:


$$\ln P(n,N) = \ln P(\bar{n},N) - \frac{1}{2}\frac{(n - Np)^2}{Npq}. \tag{38}$$
Exponentiate to solve for P (n, N ):
$$P(n,N) = P(\bar{n},N)\, e^{-\frac{(n-Np)^2}{2Npq}}. \tag{39}$$
Determine the normalization constant by integrating over all n and requiring that $\int_{-\infty}^{\infty} P(n, N)\, dn = 1$.
This results in:
$$P(n,N) = \frac{1}{\sqrt{2\pi N p q}}\, e^{-\frac{(n-Np)^2}{2Npq}}. \tag{40}$$
Note that this is the form of a Gaussian distribution with mean n̄ = Np and variance σ^2 = Npq. This implies that the
fraction of likely outcomes decreases as N increases: the width of the peak grows as σ = √(Npq) ∝ √N, while the
range of possible outcomes grows as N, so the ratio of likely outcomes to total outcomes scales as √N/N = 1/√N.
Again, this is pushing toward maximizing W and entropy!
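A small numerical sketch (not in the original notes) comparing the exact binomial distribution, Eq. 25, with its Gaussian approximation, Eq. 40; the values of N and p are arbitrary choices.

```python
import math

def binomial_pmf(n, N, p):
    """Exact binomial distribution, Eq. 25."""
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

def gaussian_approx(n, N, p):
    """Gaussian approximation, Eq. 40, with mean Np and variance Npq."""
    q = 1 - p
    return math.exp(-(n - N * p) ** 2 / (2 * N * p * q)) / math.sqrt(2 * math.pi * N * p * q)

N, p = 100, 0.3
for n in (20, 25, 30, 35, 40):
    print(n, binomial_pmf(n, N, p), gaussian_approx(n, N, p))

# Relative width of the peak shrinks as 1/sqrt(N): sigma/N = sqrt(p*(1-p)/N)
for N in (10, 100, 1000, 10000):
    print(N, math.sqrt(p * (1 - p) / N))
```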

6 Simple lattice models


Statistical mechanics often employs simple models and counting to predict equilibrium behavior. One such
class of models is the lattice model, which will be used several times in this course. A lattice
model considers atoms and molecules as hard spheres and space as a collection of sphere-sized bins, which are
artificial, mutually exclusive, and collectively exhaustive units of space. Each lattice site is either occupied or
not, and no two atoms or molecules can reside in the same lattice site. Though very simple, this illustrates the concept
that atoms or molecules are located at different places in space and that no two particles can be in the same
place at the same time.
Lattice models can be very helpful in understanding simple thermodynamic concepts. For example, we can
ask: why does an ideal gas exert pressure? Imagine an ideal gas of N atoms or molecules that are free to distribute
throughout a large or small volume, V. The tendency to spread out, i.e., pressure, is often described in mechanics
as a result of the atoms or molecules smashing into the side walls of the enclosed volume. However, we can also
employ the concept of multiplicity to understand pressure, as the gas will seek to maximize the multiplicity (or
entropy). Consider a simplified case of N = 3 atoms or molecules placed into three ‘volumes’ of V_A = 5, V_B = 4,
and V_C = 3 lattice sites.

We can then calculate the multiplicity based on the assumption that every possible arrangement of vacancies
and occupancies is equally likely (the statistical mechanical approach):

$$W(N, V) = \binom{V}{N} = \frac{V!}{N!\,(V-N)!}, \tag{41}$$
so that W_A(3, 5) = 10, W_B(3, 4) = 4, and W_C(3, 3) = 1. The multiplicity increases as the volume increases. That
is, if the degree of freedom for the system is its volume, the particles will spread to occupy the largest allowable
volume to maximize the multiplicity of the system. This is the basis for the force called pressure. This simple
model can be used to understand the entropic driving force in pressure and even to derive the ideal and van der
Waals gas laws!
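A one-line Python check of these multiplicities using Eq. 41 (not in the original notes):

```python
import math

# Multiplicity W(N, V) = C(V, N) for N = 3 particles on V lattice sites (Eq. 41)
for label, V in (("A", 5), ("B", 4), ("C", 3)):
    print(label, math.comb(V, 3))   # A 10, B 4, C 1
```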
In another example, we can ask: why do molecules diffuse? Here, we can place four white spheres and four
black spheres on eight lattice sites. Imagine that we are able to organize the white spheres and black spheres
into separate sections of the lattice (see Case C below) and place a perfectly permeable barrier between them. This
represents the unmixed case. Now calculate the multiplicities for the left and right regions, counting the ways of
placing the white spheres among the four sites on each side. For Case C, $W_{\mathrm{left}} = \binom{4}{0} = 1$ and $W_{\mathrm{right}} = \binom{4}{4} = 1$, and from Eq. 17, W_C = 1. We can do the same for Cases B and A and see that
$W_B = \binom{4}{1}\binom{4}{3} = 16$ and $W_A = \binom{4}{2}\binom{4}{2} = 36$.

[Figure: lattice configurations for Cases A (fully mixed), B (partially mixed), and C (unmixed).]
Here, if the degree of freedom is the extent of particle exchange, then the system will have the greatest
multiplicity when the ‘concentration’ is uniform throughout. This can be understood as the entropic basis of chemical
potential!
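The same counting in a few lines of Python (not in the original notes), scanning over how many white spheres sit in the left half:

```python
import math

# Four white and four black spheres on eight sites, four sites per half.
# If n_white_left white spheres are on the left, the other 4 - n_white_left are on the right.
for n_white_left in range(5):
    W = math.comb(4, n_white_left) * math.comb(4, 4 - n_white_left)
    print(n_white_left, W)   # W = 1, 16, 36, 16, 1 -> maximum at the uniform split
```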


As a final example, we can ask: why is rubber elastic? This is a simple question that is foundational for this
course. When you pull on rubber, it pulls back. When you push on rubber, it pushes back. As we will
discuss in much more detail, these restoring forces can be understood through the tendency of a polymer chain to
adopt the conformation that maximizes its multiplicity. A simple lattice model can be considered where we
attach the first monomer to a wall and ‘grow’ a short chain of three monomers. The degree of freedom here is
the distance, ℓ, of the end of the chain from the wall in two-dimensional space. All we then need to do is
count the possible conformations of the polymer chain as a function of ℓ on the two-dimensional lattice. We
find W(ℓ = 1) = 2, W(ℓ = 2) = 4, and W(ℓ = 3) = 1. Thus, the polymer chain will preferentially adopt a conformation with
ℓ = 2, and if we pull it to ℓ = 3 there will be a restoring force that is felt as elasticity!
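One way to reproduce these counts in Python (not in the original notes, and with assumed conventions that are not spelled out in the text: the wall occupies x = 0, the first monomer is fixed at x = 1, the chain is self-avoiding, and ℓ is the x-coordinate of the last monomer):

```python
from itertools import product

# Enumerate conformations of a three-monomer chain on a 2D square lattice.
# Assumed conventions: the wall sits at x = 0 (forbidden), the first monomer is
# fixed at (1, 0), the chain is self-avoiding, and ell is the x-coordinate of
# the last monomer (its distance from the wall).
steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
counts = {}
for s1, s2 in product(steps, repeat=2):
    first = (1, 0)
    second = (first[0] + s1[0], first[1] + s1[1])
    third = (second[0] + s2[0], second[1] + s2[1])
    chain = [first, second, third]
    if any(x <= 0 for x, _ in chain):   # no monomer inside or on the wall
        continue
    if len(set(chain)) < 3:             # self-avoidance: no overlapping monomers
        continue
    ell = third[0]
    counts[ell] = counts.get(ell, 0) + 1

print(sorted(counts.items()))  # [(1, 2), (2, 4), (3, 1)], matching W(ell) above
```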

In these model examples we see that the systems tend toward maximum multiplicity. In all of these examples
the energy of the system is fixed and so all we need to do is count multiplicities and find that the system tends
to maximize multiplicity, which is directly maximizing the entropy: S = k_B ln W. In later examples, we will
also consider the energies of non-ideal systems and derive physical relations from the same simple principles
introduced here.
