Cristallografia PDF
Crystallography
Edited by
C. GIACOVAZZO
Dipartimento Geomineralogico, University of Bari, Italy
and Istituto di Ricerca per lo Sviluppo delle Metodologie Cristallografiche,
CNR, Bari, Italy
...
Contents
List of contributors
1 Symmetry in crystals
Carmelo Giacovazzo
  The crystalline state and isometric operations
  Symmetry elements
  Axes of rotational symmetry
  Axes of rototranslation or screw axes
  Axes of inversion
  Axes of rotoreflection
  Reflection planes with translational component (glide planes)
  Lattices
  The rational properties of lattices
  Crystallographic directions
  Crystallographic planes
  Symmetry restrictions due to the lattice periodicity and vice versa
  Point groups and symmetry classes
  Point groups in one and two dimensions
  The Laue classes
  The seven crystal systems
  The Bravais lattices
  Plane lattices
  Space lattices
  The space groups
  The plane and line groups
  On the matrix representation of symmetry operators
  Appendices:
  1.A The isometric transformations
  1.B Some combinations of movements
  1.C Wigner-Seitz cells
  1.D The space-group rotation matrices
  1.E Symmetry groups
  1.F Symmetry generalization
  References
Crystallographic computing
Carmelo Giacovazzo
  Introduction
  The metric matrix
  The reciprocal lattice
  Basic transformations
  Transformation from triclinic to orthonormal axes
  Rotations in Cartesian systems
  Some simple crystallographic calculations
  Torsion angles
  Best plane through a set of points
  Best line through a set of points
  Principal axes of a quadratic form
  Metric considerations on the lattices
  Niggli reduced cell
  Sublattices and superlattices
  Coincidence-site lattices
  Twins
  Calculation of the structure factor
  Calculation of the electron density function
  The method of least squares
  Linear least squares
  Reliability of the parameter estimates
  Linear least squares with constraints
  Non-linear (unconstrained) least squares
Symmetry elements
Suppose that the isometric operations described in the preceding section
not only bring a pair of congruent objects to coincidence, but act on the
entire space. If all the properties of the space remain unchanged after a
given operation has been carried out, the operation is a symmetry
operation. Symmetry elements are points, axes, or planes with respect to
which symmetry operations are performed.
In the following these elements will be considered in more detail, while
the description of translation operators will be treated in subsequent
sections.
Table 1.1. Graphical symbols for symmetry elements: (a) axes normal to the plane of
projection; (b) axes 2 and 2₁ parallel to the plane of projection; (c) axes parallel or
inclined to the plane of projection; (d) symmetry planes normal to the plane of
projection; (e) symmetry planes parallel to the plane of projection
Axes of inversion
An inversion axis of order n is present when all the properties of the space
remain unchanged after performing the product of a 2π/n rotation around
the axis by an inversion with respect to a point located on the same axis.
The written symbol is n̄ (read 'minus n' or 'bar n'). As we shall see on p. 9
we will be mainly interested in 1̄, 2̄, 3̄, 4̄, 6̄ axes, and their graphic symbols
are given in Table 1.1, while their effects on the space are represented in the
second column of Fig. 1.1. According to international notation, if an object
is represented by a circle, its enantiomorph is depicted by a circle with a
comma inside. When the two enantiomorphous objects fall one on top of
the other in the projection plane of the picture, they are represented by a
single circle divided into two halves, one of which contains a comma. To
each half the appropriate + or - sign is assigned.
We may note that:
(1) the direction of the 1̄ axis is irrelevant, since the operation coincides
with an inversion with respect to a point;
Axes of rotoreflection
A rotoreflection axis of order n is present when all the properties of the
space do not change after performing the product of a 2π/n rotation around
an axis by a reflection with respect to a plane normal to it. The written
symbol of this axis is ñ. The effects on the space of the 1̃, 2̃, 3̃, 4̃, 6̃ axes
coincide with those caused by an inversion axis (generally of a different
order). In particular: 1̃ = m, 2̃ = 1̄, 3̃ = 6̄, 4̃ = 4̄, 6̃ = 3̄. From now on we will
no longer consider the ñ axes but their equivalent inversion axes.
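The equivalences between rotoreflection and inversion axes can be checked numerically. The sketch below is only an illustration, not part of the original treatment: it assumes the axis lies along z of a Cartesian frame, represents each operation by a 3×3 matrix, and compares the cyclic groups generated by the two kinds of axis. The function names are hypothetical.

```python
import math

def rot_z(n):
    """Matrix of an anticlockwise rotation by 2*pi/n around the z axis."""
    t = 2.0 * math.pi / n
    c, s = math.cos(t), math.sin(t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

INVERSION = ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))
MIRROR_Z = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0))

def key(m):
    """Rounded, hashable form of a matrix, so that powers can be compared."""
    return tuple(round(x, 6) for row in m for x in row)

def cyclic_group(g):
    """Set of all distinct powers of the matrix g (identity included)."""
    seen = set()
    m = g
    while key(m) not in seen:
        seen.add(key(m))
        m = matmul(m, g)
    return seen

def inversion_axis(n):
    """Operations generated by a rotation of 2*pi/n followed by inversion."""
    return cyclic_group(matmul(INVERSION, rot_z(n)))

def rotoreflection_axis(n):
    """Operations generated by a rotation of 2*pi/n followed by reflection in z = 0."""
    return cyclic_group(matmul(MIRROR_Z, rot_z(n)))

ORDERS = (1, 2, 3, 4, 6)
# For each rotoreflection order, list the inversion-axis orders giving the same operations.
equivalent = {
    n: [k for k in ORDERS if inversion_axis(k) == rotoreflection_axis(n)]
    for n in ORDERS
}
print(equivalent)
```

The dictionary maps each rotoreflection order to the order of the equivalent inversion axis; order 1 maps to the inversion axis of order 2, which is the mirror plane m.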
Lattices
Translational periodicity in crystals can be conveniently studied by
considering the geometry of the repetition rather than the properties of the
motif which is repeated. If the motif is periodically repeated at intervals a,
b, and c along three non-coplanar directions, the repetition geometry can be
fully described by a periodic sequence of points, separated by intervals a, b,
c along the same three directions. This collection of points will be called a
lattice. We will speak of line, plane, and space lattices, depending on
whether the periodicity is observed in one direction, in a plane, or in a
three-dimensional space. An example is illustrated in Fig. 1.2(a), where
HOCl is a geometrical motif repeated at intervals a and b. If we replace the
molecule with a point positioned at its centre of gravity, we obtain the
lattice of Fig. 1.2(b). Note that, if instead of placing the lattice point at the
centre of gravity, we locate it on the oxygen atom or on any other point of
the motif, the lattice does not change. Therefore the position of the lattice
with respect to the motif is completely arbitrary.
If any lattice point is chosen as the origin of the lattice, the position of
any other point in Fig. 1.2(b) is uniquely defined by the vector
$$\mathbf{Q}_{u,v} = u\mathbf{a} + v\mathbf{b} \qquad (1.1)$$
where u and v are positive or negative integers. The vectors a and b define a
parallelogram which is called the unit cell: a and b are the basis vectors of
the cell. The choice of the vectors a and b is rather arbitrary. In Fig. 1.2(b)
four possible choices are shown; they are all characterized by the property
that each lattice point satisfies relation (1.1) with integer u and v.
Nevertheless we are allowed to choose different types of unit cells, such
as those shown in Fig. 1.2(c), having double or triple area with respect to
those selected in Fig. 1.2(b). In this case each lattice point will still satisfy
(1.1) but u and v are no longer restricted to integer values. For instance, the
point P is related to the origin O and to the basis vectors a' and b' through
(u, v) = (1/2, 1/2).
The different types of unit cells are better characterized by determining
the number of lattice points belonging to them, taking into account that the
points on sides and on corners are only partially shared by the given cell.
The cells shown in Fig. 1.2(b) contain only one lattice point, since the
four points at the corners of each cell belong to it for only 1/4. These cells
are called primitive. The cells in Fig. 1.2(c) contain either two or three
points and are called multiple or centred cells. Several kinds of multiple
cells are possible: i.e. double cells, triple cells, etc., depending on whether
they contain two, three, etc. lattice points.
Fig. 1.2. (a) Repetition of a graphical motif as an example of a
two-dimensional crystal; (b) corresponding lattice with some examples of
primitive cells; (c) corresponding lattice with some examples of multiple cells.
The above considerations can be easily extended to linear and space
lattices. For the latter in particular, given an origin O and three basis
vectors a, b, and c, each node is uniquely defined by the vector
$$\mathbf{Q}_{u,v,w} = u\mathbf{a} + v\mathbf{b} + w\mathbf{c}. \qquad (1.2)$$
The three basis vectors define a parallelepiped, called again a unit cell.
Fig. 1.3. Notation for a unit cell.
The directions specified by the vectors a, b, and c are the X, Y, Z
crystallographic axes, respectively, while the angles between them are
indicated by α, β, and γ, with α opposing a, β opposing b, and γ opposing
c (cf. Fig. 1.3). The volume of the unit cell is given by
$$V = \mathbf{a} \cdot \mathbf{b} \wedge \mathbf{c}$$
where the symbol '·' indicates the scalar product and the symbol '∧' the
vector product. The orientation of the three crystallographic axes is usually
chosen in such a way that an observer located along the positive direction of
c sees a moving towards b by an anti-clockwise rotation. The faces of the
unit cell facing a, b, and c are indicated by A, B, C, respectively. If the
chosen cell is primitive, then the values of u, v, w in (1.2) are bound to be
integer for all the lattice points. If the cell is multiple then u, v, w will have
rational values. To characterize the cell we must recall that a lattice point at
a vertex belongs to it only for 1/8, a point on an edge for 1/4, and one on a
face for 1/2.
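As an illustration of the volume formula, the following sketch builds Cartesian components for a, b, c from the cell lengths and angles and evaluates the triple product a · b ∧ c. The Cartesian setting (a along x, b in the xy plane), the function names, and the numerical cell are assumptions made for this example only; the closed-form expression is the standard one for a general cell and is used here as a cross-check.

```python
import math

def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian components of a, b, c (angles in degrees); a along x, b in the xy plane."""
    al, be, ga = (math.radians(x) for x in (alpha, beta, gamma))
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(ga), b * math.sin(ga), 0.0)
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c * c - cx * cx - cy * cy)  # positive root: right-handed axes
    return va, vb, (cx, cy, cz)

def cross(u, v):
    """Vector product u ^ v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product u . v."""
    return sum(x * y for x, y in zip(u, v))

def volume_triple_product(a, b, c, alpha, beta, gamma):
    """Cell volume as a . (b ^ c)."""
    va, vb, vc = cell_vectors(a, b, c, alpha, beta, gamma)
    return dot(va, cross(vb, vc))

def volume_closed_form(a, b, c, alpha, beta, gamma):
    """Cell volume from the standard closed-form expression in the cell angles."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)

# Hypothetical triclinic cell (lengths in arbitrary units, angles in degrees).
cell = (5.0, 6.0, 7.0, 80.0, 95.0, 100.0)
print(volume_triple_product(*cell), volume_closed_form(*cell))
```

Both routes should give the same volume; for a cell with three right angles the result reduces to the product abc.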
Crystallographic directions
Since crystals are anisotropic, it is necessary to specify in a simple way
directions (or planes) in which specific physical properties are observed.
Two lattice points define a lattice row. In a lattice there are an infinite
number of parallel rows (see Fig. 1.4): they are identical under lattice
translation and in particular they have the same translation period.
A lattice row defines a crystallographic direction. Suppose we have chosen a
primitive unit cell. The two lattice vectors $\mathbf{Q}_{u,v,w}$ and
$\mathbf{Q}_{nu,nv,nw}$, with u, v, w, and n integer, define two different
lattice points, but only one direction.
This property may be used to characterize a direction in a unique way. For
instance, the direction associated with the vector $\mathbf{Q}_{9,3,6}$ can be
uniquely defined by the vector $\mathbf{Q}_{3,1,2}$, with no common factor
among the indices. This direction will be indicated by the symbol [3 1 2], to
be read as 'three, one, two' and not 'three hundred and twelve'.
Fig. 1.4. Lattice rows and planes.
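The reduction of a lattice vector to its direction symbol amounts to dividing the indices by their greatest common divisor. A minimal sketch, with a hypothetical function name:

```python
from functools import reduce
from math import gcd

def direction_indices(u, v, w):
    """Reduce the components of a lattice vector to indices with no common factor."""
    g = reduce(gcd, (abs(u), abs(v), abs(w)))
    if g == 0:
        raise ValueError("the null vector does not define a direction")
    return (u // g, v // g, w // g)

print(direction_indices(9, 3, 6))
```

For the vector with indices 9, 3, 6 quoted in the text this returns the indices of the direction [3 1 2].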
Crystallographic planes
Three lattice points define a crystallographic plane. Suppose it intersects the
three crystallographic axes X, Y, and Z at the three lattice points (p, 0, 0),
(0, q, 0), and (0, 0, r) with integer p, q, r (see Fig. 1.5). Suppose that m is
the least common multiple of p, q, r. Then the equation of the plane, with
x, y, z expressed in units of a, b, c, is
$$x/p + y/q + z/r = 1. \qquad (1.3)$$
Multiplying both sides by m gives
$$hx + ky + lz = m \qquad (1.4)$$
where h = m/p, k = m/q, and l = m/r are integers, the largest common integer
factor of which will be 1.
Fig. 1.5. Some lattice planes of the set (236).
We can therefore construct a family of planes parallel to the plane (1.4),
by varying m over all integer numbers from −∞ to +∞. These will also be
crystallographic planes since each of them is bound to pass through at least
three lattice points.
The rational properties of all points being the same, there will be a plane
of the family passing through each lattice point. For the same reason each
lattice plane is identical to any other within the family through a lattice
translation.
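Let us now show that (1.4) represents a plane at a distance from the
origin m times the distance of the plane
$$hx + ky + lz = 1. \qquad (1.5)$$
The intercepts of plane (1.5) on X, Y, Z are 1/h, 1/k, 1/l, while those of
plane (1.4) are m/h, m/k, m/l; the distance of plane (1.4) from the origin is
therefore m times that of plane (1.5). The indices h, k, and l, written (hkl),
indicate that the planes of the family divide a in h parts, b in k parts, and c
in l parts.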
Crystallographic planes parallel to one of the three axes X, Y, or Z are
defined by indices of type (0kl), (h0l), or (hk0) respectively. Planes parallel
to faces A, B, and C of the unit cell are of type (h00), (0k0), and (00l)
respectively. Some examples of crystallographic planes are illustrated in Fig.
1.6.
As a numerical example let us consider the plane
$$x/9 + y/6 + z/15 = 1 \qquad (1.6)$$
which can be written as
$$10x + 15y + 6z = 90. \qquad (1.7)$$
The first plane of the family with integer intersections on the three axes will
be the 30th (30 being the least common multiple of 10, 15, and 6) and all the
planes of the family can be obtained from the equation 10x + 15y + 6z = m,
by varying m over all integers from −∞ to +∞. We observe that if we divide
p, q, and r in eqn (1.6) by their common integer factor we obtain
x/3 + y/2 + z/5 = 1, from which
$$10x + 15y + 6z = 30. \qquad (1.8)$$
Planes (1.7) and (1.8) belong to the same family. We conclude that a
family of crystallographic planes is always uniquely defined by three indices
h, k, and l having the largest common integer factor equal to unity.
Fig. 1.6. Miller indices for some crystallographic planes parallel to Z (Z is
supposed to be normal to the page).
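The passage from axial intercepts to the indices h, k, l can be sketched as follows. The function name is hypothetical and the sketch handles only positive integer intercepts (planes parallel to an axis are not treated). The intercepts 3, 2, 5 come from the text; the second set, 9, 6, 15, is an illustrative multiple of them chosen for this example.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def miller_from_intercepts(p, q, r):
    """Return ((h, k, l), m) for a plane cutting X, Y, Z at the lattice points p, q, r.

    m is the least common multiple of the intercepts, so that
    x/p + y/q + z/r = 1 becomes h*x + k*y + l*z = m.
    """
    if min(p, q, r) <= 0:
        raise ValueError("this sketch handles positive integer intercepts only")
    m = reduce(lcm, (p, q, r))
    return (m // p, m // q, m // r), m

print(miller_from_intercepts(3, 2, 5))
print(miller_from_intercepts(9, 6, 15))
```

Both planes give the same indices and differ only in m, which is what is meant by saying that they belong to the same family.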
Proper axis   Improper axis   Proper and improper axis
1             1̄               (1/1̄ = 1̄)
2             2̄ = m           2/2̄ = 2/m
3             3̄               (3/3̄ = 3̄)
4             4̄               4/4̄ = 4/m
6             6̄ = 3/m         6/6̄ = 6/m
5      +      5       +       3 = 13
Table 1.3. For each combination of symmetry axes the minimum angles between axes
are given. For each angle the types of symmetry axes are quoted in parentheses

Symmetry axes   Minimum angles between axes (degrees)
2 2 2           90 (2 2)          90 (2 2)          90 (2 2)
3 2 2           90 (2 3)          90 (2 3)          60 (2 2)
4 2 2           90 (2 4)          90 (2 4)          45 (2 2)
6 2 2           90 (2 6)          90 (2 6)          30 (2 2)
2 3 3           54°44'08" (2 3)   54°44'08" (2 3)   70°31'44" (3 3)
4 3 2           35°15'52" (2 3)   45 (2 4)          54°44'08" (4 3)

Fig. 1.9. Arrangement of equivalent objects around two intersecting symmetry axes.
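The non-trivial angles of the last two rows can be reproduced by taking the axes along simple directions of a cube. The choice of directions below (fourfold axis along [001] or [100], threefold along [111], twofold along [110] or [001]) is an assumption made for illustration, as are the function names.

```python
import math

def axis_angle(u, v):
    """Acute angle, in degrees, between two axes given by Cartesian direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))

def dms(angle):
    """Format an angle in degrees as degrees, minutes, and seconds."""
    total = round(angle * 3600)
    d, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{d}°{m:02d}'{s:02d}\""

# Combination 2 3 3: twofold axis along [001], threefold axes along body diagonals.
print(dms(axis_angle((0, 0, 1), (1, 1, 1))))    # angle (2 3)
print(dms(axis_angle((1, 1, 1), (-1, -1, 1))))  # angle (3 3)
# Combination 4 3 2: twofold axis along [110], fourfold axes along cube edges.
print(dms(axis_angle((1, 1, 0), (1, 1, 1))))    # angle (2 3)
print(dms(axis_angle((1, 1, 0), (1, 0, 0))))    # angle (2 4)
print(dms(axis_angle((0, 0, 1), (1, 1, 1))))    # angle (4 3)
```

Since an axis is a line rather than an arrow, the acute angle between the two directions is taken.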
Table 1.4. Crystallographic point groups with more than one axis
Table 1.5. Crystallographic point groups with more than one axis, each axis being
proper and improper simultaneously
2. The variation of the refractive index of the crystal with the vibration
direction of a plane-polarized light wave is represented by the optical
indicatrix (see p. 607). This is in general a three-axis ellipsoid: thus the
lowest symmetry of the property 'refraction' is 2/m 2/m 2/m, the point
group of the ellipsoid. In crystal classes belonging to tetragonal, trigonal, or
hexagonal systems (see Table 1.6) the shape of the indicatrix is a rotational
ellipsoid (the axis is parallel to the main symmetry axis), and in symmetry
classes belonging to the cubic system the shape of the indicatrix is a sphere.
For example, in the case of tourmaline, with point group 3m, the indicatrix
is an ellipsoid of revolution around the threefold axis, showing a symmetry
higher than that of the point group.
We shall now see how the point group of a crystal can be inferred from
some of its physical properties:
1. The morphology of a crystal tends to conform to its point group
symmetry. From a morphological point of view, a crystal is a solid body
bounded by plane natural surfaces, the faces. The set of symmetry-
equivalent faces constitutes a form: the form is open if it does not enclose
space, otherwise it is closed. A crystal form is named according to the
number of its faces and to their nature. Thus a pedion is a single face, a
pinacoid is a pair of parallel faces, a sphenoid is a pair of faces related by a
diad axis, a prism a set of equivalent faces parallel to a common axis, a
pyramid is a set of planes with equal angles of inclination to a common axis,
etc. The morphology of different samples of the same compound can show
different types of face, with different extensions, and different numbers of
edges, the external form depending not only on the structure but also on the
chemical and physical properties of the environment. For instance, galena
crystals (PbS, point group m3m) tend to assume a cubic, cube-octahedral,
or octahedral habit (Fig. 1.12(a)). Sodium chloride grows as cubic crystals
from neutral aqueous solution and as octahedral from active solutions (in
the latter case cations and anions play a different energetic role). But at the
same temperature crystals will all have constant dihedral angles between
corresponding faces (J. B. L. Romé de l'Isle, 1736-1790). This property, the
observation of which dates back to N. Steno (1669) and D. Guglielmini
(1688), can be explained easily, following R. J. Haüy (1743-1822), by
considering that faces coincide with lattice planes and edges with lattice
rows. Accordingly, Miller indices can be used as form symbols, enclosed in
braces: {hkl}. The indices of well-developed faces on natural crystals tend
to have small values of h, k, l (integers greater than six are rarely
involved). Such faces correspond to lattice planes with a high density of
lattice points per unit area, or equivalently, with large intercepts a/h, b/k,
c/l on the reference axes (Bravais' law). An important extension of this law
is obtained if space group symmetry (see p. 22) is taken into account: screw
axes and glide planes normal to a given crystal face reduce its importance
(Donnay-Harker principle).
The origin within the crystal is usually chosen so that faces (hkl) and
(h̄k̄l̄) are parallel faces on opposite sides of the crystal. In Fig. 1.13 some
idealized crystal forms are shown.
Fig. 1.12. (a) Crystals showing cubic or cube-octahedral or octahedral
habitus, (b) crystal with a sixfold symmetry axis.
The orientation of the faces is more important than their extension. The
orientations can be represented by the set of unit vectors normal to them.
This set will tend to assume the point-group symmetry of the given crystal
Plane lattices
An oblique cell (see Fig. 1.14(a)) is compatible with the presence of axes 1
or 2 normal to the cell. This cell is primitive and has point group 2.
If the row indicated by m in Fig. 1.14(b) is a reflection line, the cell must
be rectangular. Note that the unit cell is primitive and compatible with the
point groups m and 2mm. Also the lattice illustrated in Fig. 1.14(c) with
a = b and γ ≠ 90° is compatible with m. This plane lattice has an oblique
primitive cell. Nevertheless, each of the lattice points has a 2mm symmetry
and therefore the lattice must be compatible with a rectangular system. This
can be seen by choosing the rectangular centred cell defined by the unit
vectors a' and b'. This orthogonal cell is more convenient because a simpler
coordinate system is allowed. It is worth noting that the two lattices shown
in Figs. 1.14(b) and 1.14(c) are of different type even though they are
compatible with the same point groups.
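The centred rectangular cell can be illustrated numerically. The sketch below assumes that the new axes are a' = a + b and b' = b − a, which is one natural choice for a lattice with a = b (the choice actually drawn in Fig. 1.14(c) is not recoverable here); the function names and the numerical values are hypothetical.

```python
import math

def rhombic_to_centred(a, gamma_deg):
    """Primitive vectors (equal lengths, angle gamma) and the derived cell a' = a + b, b' = b - a."""
    g = math.radians(gamma_deg)
    va = (a, 0.0)
    vb = (a * math.cos(g), a * math.sin(g))
    a_new = (va[0] + vb[0], va[1] + vb[1])
    b_new = (vb[0] - va[0], vb[1] - va[1])
    return va, vb, a_new, b_new

def dot(u, v):
    """Scalar product of two plane vectors."""
    return u[0] * v[0] + u[1] * v[1]

def area(u, v):
    """Area of the parallelogram defined by two plane vectors."""
    return abs(u[0] * v[1] - u[1] * v[0])

va, vb, a_new, b_new = rhombic_to_centred(1.0, 70.0)
print(dot(a_new, b_new))                  # zero within rounding: the new axes are orthogonal
print(area(a_new, b_new) / area(va, vb))  # ratio of the two cell areas
```

The new axes are orthogonal because (a + b) · (b − a) = b² − a² = 0 when a = b, and the area is doubled, so the rectangular cell contains two lattice points.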
In Fig. 1.14(d) a plane lattice is represented compatible with the presence
of a fourfold axis. The cell is primitive and compatible with the point groups
4 and 4mm.
In Fig. 1.14(e) a plane lattice compatible with the presence of a three- or
a sixfold axis is shown. A unit cell with a rhombus shape and angles of 60°
and 120° (also called hexagonal) may be chosen. A centred rectangular cell
can also be selected, but such a cell is seldom chosen.
Fig. 1.14. The five plane lattices and the corresponding two-dimensional
point groups.
The basic features of the five lattices are listed in Table 1.7.

Cell          Type of cell   Point group of the net   Lattice parameters
Oblique       P              2                        a, b, γ
Rectangular   P, C           2mm                      a, b, γ = 90°
Square        P              4mm                      a = b, γ = 90°
Hexagonal     P              6mm                      a = b, γ = 120°
Space lattices
In Table 1.8 the most useful types of cells are described. Their fairly limited
number can be explained by the following (or similar) observations:
A cell with two centred faces must be of type F. In fact a cell which is at
the same time A and B, must have lattice points at (0,1/2,1/2) and
(1/2,0,1/2). When these two lattice translations are applied one after
the other they will generate a lattice point also at (1/2,1/2,0);
A cell which is at the same time body and face centred can always be
reduced to a conventional centred cell. For instance an I and A cell will
have lattice points at positions (1/2,1/2,1/2) and (0,1/2,1/2): a lattice
point at (1/2,0,0) will then also be present. The lattice can then be
described by a new A cell with axes a ' = a/2, b' = b, and c' = c (Fig.
1.15).
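Both observations follow from adding centring translations modulo lattice translations. A small sketch with exact fractions (the function names are hypothetical):

```python
from fractions import Fraction as F
from itertools import product

H, Z = F(1, 2), F(0)
A = (Z, H, H)   # A-face centring
B = (H, Z, H)   # B-face centring
C = (H, H, Z)   # C-face centring
I = (H, H, H)   # body centring

def add_mod1(t1, t2):
    """Sum of two translations, reduced to the unit cell."""
    return tuple((x + y) % 1 for x, y in zip(t1, t2))

def closure(generators):
    """All lattice points in the cell generated by the given centring translations."""
    points = {(Z, Z, Z)}
    grew = True
    while grew:
        grew = False
        for p, g in product(list(points), generators):
            q = add_mod1(p, g)
            if q not in points:
                points.add(q)
                grew = True
    return points

print(sorted(closure([A, B])))   # includes (1/2, 1/2, 0): the cell is F
print(sorted(closure([I, A])))   # includes (1/2, 0, 0): a can be halved
```

A cell that is both A and B centred contains four lattice points, including the C-centring point; a cell that is both I and A centred contains a lattice point at (1/2,0,0).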
It is worth noting that the positions of the additional lattice points in
Table 1.8 define the minimal translational components which will move an
object into an equivalent one. For instance, in an A-type cell, an object at
(x, y, z) is repeated by translation into (x, y + m/2, z + n/2) with m and n
integers: the shortest translation will be (0,1/2,1/2).
Let us now examine the different types of three-dimensional lattices
grouped in the appropriate crystal systems.
Fig. 1.15. Reduction of an I- and A-centred cell to an A-centred cell.

Table 1.8. The conventional types of unit cell

Symbol  Type                                     Positions of additional lattice points    Number of lattice points per cell
P       primitive                                -                                         1
I       body centred                             (1/2,1/2,1/2)                             2
A       A-face centred                           (0,1/2,1/2)                               2
B       B-face centred                           (1/2,0,1/2)                               2
C       C-face centred                           (1/2,1/2,0)                               2
F       All faces centred                        (1/2,1/2,0), (1/2,0,1/2), (0,1/2,1/2)     4
R       Rhombohedrally centred (description      (1/3,2/3,2/3), (2/3,1/3,1/3)              3
        with 'hexagonal axes')
Triclinic lattices
Even though non-primitive cells can always be chosen, the absence of axes
with order greater than one suggests the choice of a conventional primitive
cell with unrestricted α, β, γ angles and a:b:c ratios. In fact, any triclinic
lattice can always be referred to such a cell.
Monoclinic lattices
The conventional monoclinic cell has the twofold axis parallel to b, angles
α = γ = 90°, unrestricted β and a:b:c ratios. A B-centred monoclinic cell
with unit vectors a, b, c is shown in Fig. 1.16(a). If we choose a' = a,
b' = b, c' = (a + c)/2 a primitive cell is obtained. Since c' lies on the (a, c)
plane, the new cell will still be monoclinic. Therefore a lattice with a B-type
monoclinic cell can always be reduced to a lattice with a P monoclinic cell.
An I cell with axes a, b, c is illustrated in Fig. 1.16(b). If we choose
a' = a, b' = b, c' = a + c, the corresponding cell becomes an A monoclinic
cell. Therefore a lattice with an I monoclinic cell may always be described
by an A monoclinic cell. Furthermore, since the a and c axes can always be
interchanged, an A cell can be always reduced to a C cell.
An F cell with axes a, b, c is shown in Fig. 1.16(c). When choosing
a' = a, b' = b, c' = (a + c)/2 a type-C monoclinic cell is obtained.
Therefore, also, a lattice described by an F monoclinic cell can always be
described by a C monoclinic cell.
We will now show that there is a lattice with a C monoclinic cell which is
not amenable to a lattice having a P monoclinic cell. In Fig. 1.16(d) a C cell
with axes a, b, c is illustrated. A primitive cell is obtained by assuming
a' = (a + b)/2, b' = (−a + b)/2, c' = c, but this no longer shows the
features of a monoclinic cell, since γ' ≠ 90°, a' = b' ≠ c', and the 2 axis lies
along the diagonal of a face. It can then be concluded that there are two
distinct monoclinic lattices, described by P and C cells, and not amenable
one to the other.
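The cell reductions described above can be verified by re-expressing the centring points of the old cell in the new basis. The sketch below is an illustration with hypothetical function names: each new basis vector is given by its components on the old a, b, c, and the old fractional coordinates are multiplied by the inverse of the matrix whose columns are the new vectors.

```python
from fractions import Fraction as F

def inverse3(m):
    """Exact inverse of a 3x3 matrix (adjugate divided by determinant)."""
    (a, b, c), (d, e, f), (g, h, i) = [[F(x) for x in row] for row in m]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("the new vectors are coplanar")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def centring_in_new_cell(new_basis, centring_points):
    """Non-zero centring points of the old cell, expressed in the new cell (modulo 1)."""
    # Columns of P are the new basis vectors a', b', c' in components on a, b, c.
    P = [[new_basis[j][i] for j in range(3)] for i in range(3)]
    Pinv = inverse3(P)
    result = set()
    for t in centring_points:
        new_t = tuple(sum(Pinv[i][k] * F(t[k]) for k in range(3)) % 1 for i in range(3))
        if any(new_t):
            result.add(new_t)
    return result

H = F(1, 2)
I_POINTS = [(H, H, H)]
F_POINTS = [(H, H, 0), (H, 0, H), (0, H, H)]
B_POINTS = [(H, 0, H)]

# I cell with c' = a + c: the centring point becomes (0, 1/2, 1/2), an A cell.
print(centring_in_new_cell([(1, 0, 0), (0, 1, 0), (1, 0, 1)], I_POINTS))
# F cell with c' = (a + c)/2: only (1/2, 1/2, 0) survives, a C cell.
print(centring_in_new_cell([(1, 0, 0), (0, 1, 0), (H, 0, H)], F_POINTS))
# B cell with c' = (a + c)/2: no centring point is left, a P cell.
print(centring_in_new_cell([(1, 0, 0), (0, 1, 0), (H, 0, H)], B_POINTS))
```

In all three cases b is unchanged and c' stays in the (a, c) plane, so the new cell is still monoclinic.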
Orthorhombic lattices
In the conventional orthorhombic cell the three proper or inversion axes are
parallel to the unit vectors a, b, c, with angles α = β = γ = 90° and general
a:b:c ratios. With arguments similar to those used for monoclinic lattices,
the reader can easily verify that there are four types of orthorhombic
lattices, P, C, I, and F.
Tetragonal lattices
In the conventional tetragonal cell the fourfold axis is chosen along c with
α = β = γ = 90°, a = b, and unrestricted c value. It can be easily verified
that because of the fourfold symmetry an A cell will always be at the same
time a B cell and therefore an F cell. The latter is then amenable to a
tetragonal I cell. A C cell is always amenable to another tetragonal P cell.
Thus only two different tetragonal lattices, P and I, are found.
cell is also an F cell. There are three cubic lattices, P, I, and F which are not
amenable one to the other.
Hexagonal lattices
In the conventional hexagonal cell the sixfold axis is parallel to c, with
a = b, unrestricted c, α = β = 90°, and γ = 120°. P is the only type of
hexagonal Bravais lattice.
Trigonal lattices
As for the hexagonal cell, in the conventional trigonal cell the threefold axis
is chosen parallel to c, with a = b, unrestricted c, α = β = 90°, and γ = 120°.
Centred cells are easily amenable to the conventional P trigonal cell.
Because of the presence of a threefold axis some lattices can exist which
may be described via a P cell of rhombohedral shape, with unit vectors
$\mathbf{a}_R$, $\mathbf{b}_R$, $\mathbf{c}_R$ such that $a_R = b_R = c_R$,
$\alpha_R = \beta_R = \gamma_R$, and the threefold axis along the
$\mathbf{a}_R + \mathbf{b}_R + \mathbf{c}_R$ direction (see Fig. 1.17). Such
lattices may also be described by three triple hexagonal cells with basis
vectors $\mathbf{a}_H$, $\mathbf{b}_H$, $\mathbf{c}_H$ defined according to [6]
$$\mathbf{a}_H = \mathbf{a}_R - \mathbf{b}_R, \quad \mathbf{b}_H = \mathbf{b}_R - \mathbf{c}_R, \quad \mathbf{c}_H = \mathbf{a}_R + \mathbf{b}_R + \mathbf{c}_R$$
and by the two further sets obtained from this one by cyclic permutation of
$\mathbf{a}_R$, $\mathbf{b}_R$, $\mathbf{c}_R$.
These hexagonal cells are said to be in obverse setting. Three further triple
hexagonal cells, said to be in reverse setting, can be obtained by changing
$\mathbf{a}_H$ and $\mathbf{b}_H$ to $-\mathbf{a}_H$ and $-\mathbf{b}_H$. The
hexagonal cells in obverse setting have centring points (see again Fig. 1.17) at
(0,0,0), (2/3,1/3,1/3), (1/3,2/3,2/3)
while for reverse setting centring points are at
(0,0,0), (1/3,2/3,1/3), (2/3,1/3,2/3).
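The centring points of the triple hexagonal cell can be derived by expressing the rhombohedral lattice points on hexagonal axes. The sketch below assumes the usual obverse relation a_H = a_R − b_R, b_H = b_R − c_R, c_H = a_R + b_R + c_R, and solves for the hexagonal coordinates by Cramer's rule; the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c2[1] * c1[2])
            - c1[0] * (c0[1] * c2[2] - c2[1] * c0[2])
            + c2[0] * (c0[1] * c1[2] - c1[1] * c0[2]))

def hexagonal_coordinates(point, a_h, b_h, c_h):
    """Coordinates of a rhombohedral lattice point on the hexagonal axes, modulo 1."""
    d = det3(a_h, b_h, c_h)
    x = F(det3(point, b_h, c_h), d)
    y = F(det3(a_h, point, c_h), d)
    z = F(det3(a_h, b_h, point), d)
    return (x % 1, y % 1, z % 1)

def centring_points(a_h, b_h, c_h):
    """Distinct positions, inside the hexagonal cell, of the rhombohedral lattice points."""
    return {hexagonal_coordinates(p, a_h, b_h, c_h) for p in product(range(3), repeat=3)}

# Hexagonal axes as components on a_R, b_R, c_R (assumed obverse relation).
A_H, B_H, C_H = (1, -1, 0), (0, 1, -1), (1, 1, 1)
obverse = centring_points(A_H, B_H, C_H)
# Reverse setting: a_H and b_H change sign.
reverse = centring_points((-1, 1, 0), (0, -1, 1), C_H)

print(det3(A_H, B_H, C_H))  # ratio of the cell volumes: a triple cell
print(sorted(obverse))
print(sorted(reverse))
```

The determinant of the transformation gives the number of lattice points in the hexagonal cell, in agreement with the entry for R in Table 1.8.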
The total number of crystallographic space groups is 230. They were first
derived at the end of the last century by the mathematicians Fedorov (1890)
and Schoenflies (1891) and are listed in Table 1.9.
In Fedorov's mathematical treatment each space group is represented by
a set of three equations: such an approach enabled Fedorov to list all the
space groups (he rejected, however, five space groups as impossible: Fdd2,
Fddd, I4̄3d, P4₃32, P4₁32). The Schoenflies approach was most practical and
is described briefly in the following.
On pp. 11-16 we saw that 32 combinations of either simple rotation or
inversion axes are compatible with the periodic nature of crystals. By
combining the 32 point groups with the 14 Bravais lattices (i.e. P, I, F, . . .)
one obtains only 73 (symmorphic) space groups. The others may be
obtained by introducing a further variation: the proper or improper
symmetry axes are replaced by screw axes of the same order and mirror
planes by glide planes. Note, however, that when such combinations have
more than one axis, the restriction that all symmetry elements must
intersect in a point no longer applies (cf. Appendix 1.B). As a consequence
of the presence of symmetry elements, several symmetry-equivalent objects
will coexist within the unit cell. We will call the smallest part of the unit cell
which will generate the whole cell when applying to it the symmetry
Table 1.9. The 230 three-dimensional space groups arranged by crystal systems and
point groups. Space groups (and enantiomorphous pairs) that are uniquely
determinable from the symmetry of the diffraction pattern and from systematic
absences (see p. 159) are shown in bold-type. Point groups without inversion
centres or mirror planes are emphasized by boxes

Crystal system  Point group  Space groups
Triclinic       [1]          P1
                1̄            P1̄
Orthorhombic    [222]        P222, P222₁, P2₁2₁2, P2₁2₁2₁, C222₁, C222, F222, I222,
                             I2₁2₁2₁
                mm2          Pmm2, Pmc2₁, Pcc2, Pma2, Pca2₁, Pnc2, Pmn2₁, Pba2,
                             Pna2₁, Pnn2, Cmm2, Cmc2₁, Ccc2, Amm2, Abm2, Ama2,
                             Aba2, Fmm2, Fdd2, Imm2, Iba2, Ima2
                mmm          Pmmm, Pnnn, Pccm, Pban, Pmma, Pnna, Pmna, Pcca,
                             Pbam, Pccn, Pbcm, Pnnm, Pmmn, Pbcn, Pbca, Pnma,
                             Cmcm, Cmca, Cmmm, Cccm, Cmma, Ccca, Fmmm,
                             Fddd, Immm, Ibam, Ibca, Imma
the screw axes and the glide planes with their corresponding symmorphic
symmetry elements. For instance, the space groups P4₂/mmc, P4/ncc,
I4₁/acd, all belong to the point group 4/mmm.
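The rule can be sketched as a string substitution on the Hermann-Mauguin symbol. The example assumes an ASCII transliteration in which a screw axis is written with an underscore (for instance `4_2` for 4₂) and glide planes by the letters a, b, c, d, e, n; the function name is hypothetical and the sketch makes no attempt to validate the symbol.

```python
import re

def point_group(space_group):
    """Point-group symbol obtained from an ASCII Hermann-Mauguin space-group symbol."""
    body = space_group[1:]                   # drop the lattice letter (P, A, B, C, I, F, R)
    body = re.sub(r"(\d)_\d", r"\1", body)   # screw axis -> rotation axis of the same order
    body = re.sub(r"[abcden]", "m", body)    # glide plane -> mirror plane
    return body

for symbol in ("P4_2/mmc", "P4/ncc", "I4_1/acd"):
    print(symbol, "->", point_group(symbol))
```

The three space groups quoted in the text all reduce to the same point-group symbol.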
5. The frequency of the different space groups is not uniform. Organic
compounds tend to crystallize in the space groups that permit close
packing of triaxial ellipsoids.[8] According to this view, rotation axes and
reflection planes can be considered as rigid scaffolding which makes the
comfortable accommodation of molecules more difficult, while screw axes
and glide planes, when present, make it easier because they shift the
molecules away from each other.
Mighell and Rodgers [9] examined 21 051 organic compounds of known
crystal structure; 95% of them had a symmetry not higher than orthorhombic.
In particular 35% belonged to the space group P2₁/c, 13.3% to P1̄,
12.4% to P2₁2₁2₁, 7.6% to P2₁, and 6.9% to C2/c. A more recent study by
Wilson,[10] based on a survey of the 54 599 substances stored in the
Cambridge Structural Database (in January 1987), confirmed Mighell and
Rodgers' results and suggested a possible model to estimate the number
$N_{sg}$ of structures in each space group of a given crystal class:
$$N_{sg} = A_{cc} \exp\{-B_{cc}[2]_{sg} - C_{cc}[m]_{sg}\}$$
where $A_{cc}$ is the total number of structures in the crystal class,
$[2]_{sg}$ is the number of twofold axes, $[m]_{sg}$ the number of reflexion
planes in the cell, $B_{cc}$ and $C_{cc}$ are parameters characteristic of the
crystal class in question. The same results cannot be applied to inorganic
compounds, where ionic bonds are usually present. Indeed most of the 11 641
inorganic compounds considered by Mighell and Rodgers crystallize in space
groups with orthorhombic or higher symmetry. In order of decreasing frequency
we have: Fm3m, Fd3m, P6₃/mmc, P2₁/c, Pm3m, R3̄m, C2/m, C2/c, . . . .
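As a purely numerical illustration of the model, the expression can be evaluated as below. The parameter values are hypothetical placeholders and are not taken from Wilson's survey; the function name is likewise an assumption.

```python
import math

def expected_count(a_cc, b_cc, c_cc, n_twofold, n_mirror):
    """Number of structures predicted by N_sg = A_cc * exp(-B_cc*[2]_sg - C_cc*[m]_sg)."""
    return a_cc * math.exp(-b_cc * n_twofold - c_cc * n_mirror)

# Hypothetical parameters for one crystal class (not fitted values).
A_CC, B_CC, C_CC = 1000.0, 0.5, 1.0
for n2, nm in ((0, 0), (1, 0), (2, 1)):
    print(n2, nm, round(expected_count(A_CC, B_CC, C_CC, n2, nm), 1))
```

With positive B and C the predicted number of structures decreases as twofold axes and reflection planes are added to the cell, in line with the packing argument given above.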
The standard compilation of the plane and of the three-dimensional space
groups is contained in volume A of the International Tables for
Crystallography. For each space group the Tables include (see Figs 1.20 and 1.21):
1. At the first line: the short international (Hermann-Mauguin) and the
Schoenflies symbols for the space groups, the point group symbol, the
crystal system.
2. At the second line: the sequential number of the plane or space group,
the full international (Hermann-Mauguin) symbol, the Patterson symmetry
(see Chapter 5, p. 327). Short and full symbols differ only for the
monoclinic space groups and for space groups with point group mmm,
4/mmm, 3̄m, 6/mmm, m3, m3m. While in the short symbols symmetry
planes are suppressed as much as possible, in the full symbols axes and
planes are listed for each direction.
[International Tables entry: Pbcn, Orthorhombic, No. 60, full symbol
P 2₁/b 2/c 2₁/n, Patterson symmetry Pmmm. Origin at 1̄ on 1 c 1.]
[International Tables entry: P4₂22, Tetragonal, No. 93, Patterson symmetry
P4/mmm. Origin at 222 at 4₂ 2 1. Generators selected (1); t(1,0,0); t(0,1,0);
t(0,0,1); (2); (3); (5).]
Fig. 1.22. Some space group diagrams (including C2 and P2/m).
Oblique cell       p1, p2
Rectangular cell   pm, pg, cm, p2mm, p2mg, p2gg, c2mm
Square cell        p4, p4mm, p4gm
Hexagonal cell     p3, p3m1, p31m, p6, p6mm
When applying the symmetry operator C₁ = (R₁, T₁) to a point at the end
of a vector r, we obtain X' = C₁X = R₁X + T₁. If we then apply to r' the
symmetry operator C₂, we obtain
$$X'' = C_2 X' = R_2(R_1 X + T_1) + T_2 = R_2 R_1 X + R_2 T_1 + T_2.$$
Fig. 1.25. A periodic decoration of the plane according to the 17
crystallographic plane groups (drawing by SYMPATI, a computer program by
L. Loreto and M. Tonetti, pixel, 9, 9-20; Nov 1990).
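The composition rule for two symmetry operators, rotational part R₂R₁ and translational part R₂T₁ + T₂, can be sketched with exact arithmetic. The representation of an operator as a pair (R, T) of nested tuples and the function names are choices made for this illustration; the example operator is a twofold screw axis along z.

```python
from fractions import Fraction as F

def apply(op, x):
    """Apply the operator C = (R, T) to the coordinates x: X' = R X + T."""
    R, T = op
    return tuple(sum(R[i][k] * x[k] for k in range(3)) + T[i] for i in range(3))

def compose(op2, op1):
    """Operator equivalent to applying op1 first and op2 second: (R2 R1, R2 T1 + T2)."""
    R2, T2 = op2
    R1, T1 = op1
    R = tuple(tuple(sum(R2[i][k] * R1[k][j] for k in range(3)) for j in range(3))
              for i in range(3))
    T = tuple(sum(R2[i][k] * T1[k] for k in range(3)) + T2[i] for i in range(3))
    return R, T

# Twofold screw axis along z: rotation by 180 degrees plus a translation of c/2.
SCREW_2_1 = (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (F(0), F(0), F(1, 2)))

twice = compose(SCREW_2_1, SCREW_2_1)
x = (F(1, 10), F(1, 5), F(3, 10))
print(twice)
print(apply(twice, x), apply(SCREW_2_1, apply(SCREW_2_1, x)))
```

Applying the screw axis twice gives the identity rotation combined with a full lattice translation along z.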
Appendices
1.A The isometric transformations
It is convenient to consider a Cartesian basis (e₁, e₂, e₃). Any
transformation which will keep the distances unchanged will be called an
isometry or
Direct movements
Let us separate (1.A.1) into two movements:
$$X' = X_0 + T \qquad (1.A.3)$$
$$X_0 = RX. \qquad (1.A.4)$$
(1.A.3) adds to each position vector a fixed vector and corresponds
therefore to a translation movement. (1.A.4) leaves the origin point
invariant. In order to find the other points left invariant we have to set
X₀ = X and obtain
$$(R - I)X = 0. \qquad (1.A.5a)$$
(1.A.5a) will have solutions for X ≠ 0 only if det(R − I) = 0. Since R is
orthogonal ($R^{T}R = I$, with $R^{T}$ the transpose of R) and det R = +1 for
a direct movement,
$$\det(R - I) = \det(R - R^{T}R) = \det[(I - R^{T})R] = \det(I - R^{T})\det R = \det(I - R) = -\det(R - I),$$
then this condition is satisfied. Therefore one of the three equations
represented by (1.A.5a) must be a linear combination of the other two. The
two independent equations will define a line, which is the locus of the
invariant points; the movement described by (1.A.4) is therefore a
rotation. In conclusion, a direct movement can be considered as the
combination (or, more properly, the product) of a translation with a
rotation around an axis.
If R = I in eqn (1.A.1) the movement is a pure translation; if T = 0
the movement is a pure rotation. When the translation is parallel to the
rotation axis the movement will be indicated as rototranslation. An
example of direct movement is the transformation undergone by the points
of a rigid body when it is moved. Another example is the anti-clockwise
rotation around the z axis of an angle θ; this will move r(x, y, z) into
r'(x', y', z'), where
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \qquad (1.A.5b)$$
Opposite movements
An opposite movement can be obtained from a direct one by changing the sign of one or three rows of the R matrix. For instance, when changing the sign of the third row, we substitute the vector (x', y', z') with (x', y', -z'), i.e. the point P' with its symmetry-related point with respect to the plane at z = 0. This operation is called a reflection with respect to the plane at z = 0. Changing the signs of all three rows of the R matrix implies the substitution of the vector (x', y', z') with (-x', -y', -z'), i.e. of the point P' with its symmetry-related point with respect to the origin of the coordinate system. This operation is called inversion with respect to a point.
We may conclude that each direct movement, followed by a reflection with respect to a plane or by an inversion with respect to a point, yields an opposite movement. Conversely, an opposite movement may be obtained as the product of a direct movement by a reflection with respect to a plane or by an inversion with respect to a point.
connecting the point with the south pole of the unit sphere. If the point to be projected is in the -z hemisphere then the north pole is used. In Fig. 1.B.3(b) parts of the stereographic projections for m3m are magnified in order to make clearer the statements made in the text.
(d) Solution 5, 3, 2. This solution, which is compatible with the symmetry of the regular icosahedron (20 faces, 12 vertices) and its dual, the regular pentagon-dodecahedron (12 faces, 20 vertices), but not with the periodicity property of crystals, will not be examined. It is however of particular importance in crystallography as the symmetry of virus molecules and of quasi-crystals.
6. Composition of two glide planes. In Fig. 1.B.4 let S and S' be the traces of two glide planes forming an angle $\alpha$ and O be the trace of their intersection line. The translational components OA and OB are chosen to lie on the plane of the drawing and Q is the meeting point of the axes of the OA and OB segments. X, Y, and Q' are the reflection images of Q with respect to S, S', and to the point O, respectively. The product S'S moves Q to Q' and then back to Q. Since S'S is a direct movement it leaves Q unchanged and corresponds to a rotation around an axis normal to the plane of the figure and passing through Q. Since S'S moves first A to O and then to B, the rotation angle is AQB = 2$\alpha$. Note that the two glides are equivalent to a rotation around an axis not passing along the intersection line of S and S'.
7. Composition of two twofold axes, with and without translational component. From point 5 we know that the coexistence of two orthogonal twofold axes passing through O implies a third binary axis perpendicular to them and also passing through O (see Fig. 1.B.5(a)). The reader can easily verify the following conclusions:
(a) if one of the two axes is $2_1$ (Fig. 1.B.5(b)), then another 2 axis, at 1/4 from O and intersecting orthogonally the screw axis, will exist;
(b) if two $2_1$ intersect in O (Fig. 1.B.5(c)), then another 2 axis perpendicular to them and passing at (1/4, 1/4) from O will be present;
(c) if a pair of mutually perpendicular 2 axes is separated by 1/4 of a period (Fig. 1.B.5(d)), then a $2_1$ axis orthogonally intersecting both axes will exist;
(d) if a 2 and a $2_1$ axis are separated by 1/4 of a period (Fig. 1.B.5(e)) there will then be a new $2_1$ axis normal to both of them and intersecting the first $2_1$ axis at 1/4 from the 2 axis;
(e) if two orthogonal screws are separated by 1/4 of a period (Fig. 1.B.5(f)), then a third screw axis normal to them and passing at (1/4, 1/4) from them will be present.
Fig. 1.B.3. (a) Geometry of the stereographic projection. (b) Angular values occurring in m3m stereographic projection.
$$\bar{1} = \begin{pmatrix} \bar{1} & 0 & 0 \\ 0 & \bar{1} & 0 \\ 0 & 0 & \bar{1} \end{pmatrix}.$$
This corresponds to changing the sign of all the elements of the original
matrix. Therefore in the following list we will not give all the 64 matrices
necessary to describe the space groups, but only the 32 matrices cor-
responding to proper symmetry elements.
Direction [0 0 01
Direction [I 0 01
2 (~ 0 0 ° 1) ( 1 )
0H 2 = 0 1 0 4
( l o O); ( l o
~ 0 0 14 3 = O 0 1 .
0)
o o i o o i o I 0' 0 1 0
(yg;)
Direction [0 1 01
2 =( i0 o10 O) ; H 2 =O 1 O1 O) ; 3 =( O
0 O1 i O
); 4 3 = 0 1 0 .
o o i o o i l o o
Direction [0 0 11
i o o 1 1 0 o i o
Direction [ l 1 01
Direction [l 0 11
0 0 1
Direction [0 1 11
Direction [l i 01
Direction [ i 0 11
Direction [0 1 I]
Direction [I 1 11
0 0 1 0 1 0
Direction [I1 11
o i 0 o o i
Direction [l i 11
l o o o i o
Direction [I 1 ?I
3
(
~ 0
0);
0 13
(O O T )
2 ' 1 0 0 .
T o o 0 i 0
Direction [2 1 0]
Direction [ l 2 01.
then $g^{n+1} = g$, $g^{n+2} = g^2$, .... If n is the smallest integer for which (1.E.1) is satisfied, there will only be n distinct powers of g. Since $g^j g^{n-j} = g^{n-j} g^j = e$, then $g^{n-j}$ is the inverse of $g^j$. The element g is then said to be of order n and
Table 1.E.1. List of generators for non-cyclic point groups. There are 21 proper generators in all
We note that
1. Each element appears once and only once in a given row (or column) of the table. In order to demonstrate this statement let us consider the ith row of the table and suppose that there are two different elements $g_j$ and $g_k$ for which $g_i g_j = g_i g_k = g_p$. Then $g_p$ would appear twice in the row, but by multiplying the two equations by $g_i^{-1}$ we obtain $g_j = g_k$, in contrast with the hypothesis.
2. Each row (column) is different from any other row (column); this
property follows immediately from property 1.
3. For abelian groups the table is symmetric with respect to the diagonal.
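The three properties can be verified on a small example. The sketch below is an illustration only: it assumes the point group mm2 represented by four diagonal 3x3 matrices (with the twofold axis along z), and the function names are not taken from the text.

```python
def mat_mul(A, B):
    """Row-by-column product of two 3x3 integer matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

# Assumed matrix representation of mm2: identity, m normal to x, m normal to y, 2 along z.
MM2 = [diag(1, 1, 1), diag(-1, 1, 1), diag(1, -1, 1), diag(-1, -1, 1)]

def cayley_table(elements):
    """table[i][j] is the index of the product g_i g_j in the element list."""
    return [[elements.index(mat_mul(gi, gj)) for gj in elements] for gi in elements]

def is_latin_square(table):
    """Property 1: each index occurs exactly once in every row and column."""
    n = len(table)
    rows_ok = all(sorted(row) == list(range(n)) for row in table)
    cols_ok = all(sorted(table[i][j] for i in range(n)) == list(range(n))
                  for j in range(n))
    return rows_ok and cols_ok

def is_symmetric(table):
    """Property 3: the table of an abelian group is symmetric about the diagonal."""
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))

table = cayley_table(MM2)
for row in table:
    print(row)
print(is_latin_square(table), is_symmetric(table))
```

Property 2 follows from the first check: if two rows were equal, some column would contain the same element twice.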
1
$\bar{1}$, 2, m
3
4, $\bar{4}$
2/m, mm2, 222
6, $\bar{6}$, $\bar{3}$
32, 3m
mmm
4/m
4mm, 422, $\bar{4}$2m
6/m
$\bar{3}$m, $\bar{6}$2m, 6mm, 622
23
4/mmm
432, $\bar{4}$3m
m3
6/mmm
m3m
Groups having the same multiplication table, even though their elements
might have different physical meaning, are called isomorphous. They must
have the same order and may be considered as generated from the same
abstract group. For instance the three point groups 222, 2/m and mm2 are
isomorphous. To show this let us choose $g_1$, $g_2$, $g_3$, $g_4$ in the following way.
group 222: 1, 2, 2, 2;
group 2/m: 1, 2, m, $\bar{1}$;
group mm2: 1, m, m, 2.
The multiplication table of the abstract group is
Subgroups
A set H of elements of the group G satisfying the group conditions is called
a subgroup of G. The subgroup H is proper if there are symmetry
operations of G not contained in H. Examples of subgroups are:
(1) the set of even integers (including zero) under the sum law is a
subgroup of the group of all integers;
(2) the point group 32 has elements $g_1 = 1$, $g_2 = 3_{[001]}$, $g_3 = 3^2_{[001]} = 3^{-1}$, $g_4 = 2_{[100]}$, $g_5 = 2_{[010]}$, $g_6 = 2_{[\bar{1}\bar{1}0]}$; $H = (g_1, g_2, g_3)$ is a subgroup of G;
Cosets
Let $H = (h_1, h_2, \ldots)$ be a subgroup of G and $g_i$ an element of G not contained in H. Then the products
$$g_iH = (g_ih_1, g_ih_2, \ldots) \quad \text{and} \quad Hg_i = (h_1g_i, h_2g_i, \ldots)$$
form a left and a right coset of H respectively. In general they will not be identical.
Furthermore H cannot have any common element with $g_iH$ or $Hg_i$. In fact, if for instance we had $g_ih_j = h_k$, it would follow that $g_i = h_kh_j^{-1}$, i.e., contradicting the hypothesis, $g_i$ would belong to H.
It can be shown that two right (or left) cosets, either have no common
element or are identical one to the other. This allows us to decompose G
with respect to H in the following way:
It follows that the order of a subgroup is a divisor of the order of the group; if the order of the group is a prime number, the only proper subgroup of G is e, and G must also be cyclic.
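The decomposition into cosets, and the divisor property that follows from it, can be sketched as below. The matrices chosen to represent 2/m (unique axis along z) and the function name `left_cosets` are assumptions made for illustration.

```python
def mat_mul(A, B):
    """Row-by-column product; tuples are used so that matrices are hashable."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def diag(a, b, c):
    return ((a, 0, 0), (0, b, 0), (0, 0, c))

E = diag(1, 1, 1)        # identity
TWO = diag(-1, -1, 1)    # twofold axis along z (assumed setting)
M = diag(1, 1, -1)       # mirror normal to z
INV = diag(-1, -1, -1)   # inversion centre

G = [E, TWO, M, INV]     # point group 2/m
H = [E, TWO]             # subgroup 2

def left_cosets(group, subgroup):
    """Distinct left cosets gH, in order of first appearance."""
    cosets = []
    for g in group:
        coset = frozenset(mat_mul(g, h) for h in subgroup)
        if coset not in cosets:
            cosets.append(coset)
    return cosets

cosets = left_cosets(G, H)
print(len(cosets), len(G) == len(cosets) * len(H))
```

With this choice the two cosets are H itself and the set containing the mirror and the inversion; since the cosets are disjoint and of equal size, the order of H divides the order of G.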
The decomposition of the group 2/m into separate left cosets with respect
to the subgroup 2 is:
Conjugate classes
An element $g_i$ is said to be conjugate to an element $g_j$ of G if G contains an element $g_k$ such that
$$g_i = g_k^{-1}g_jg_k. \qquad (1.E.5)$$
If $g_j$ is fixed and $g_k$ varies within G, then the set of elements $g_i$ forms a class of conjugate elements.
In agreement with relation (1.E.5) the element e forms a class on its own. Since each element of G cannot belong to two different classes, it is possible to decompose G into the factorized set $G = e \cup T_1 \cup T_2 \cup \ldots$.
A physical or geometrical meaning may be attributed to the classes. In
Conjugate subgroups
Let H be a subgroup of G and g an element of G not in H. Then all the elements $g^{-1}Hg$ form a group. H and $g^{-1}Hg$ are conjugate subgroups.
We can now define a new type of group of order p, called a factor group
or quotient group, indicated by the symbol G/H: its elements are cosets of
H. The following multiplication table is for the quotient group (we assume
= e).
reasons they are labelled as subgroups of type IIa. Space groups with primitive cells have no entry in the block IIa. Some further subgroups of C222 are $C222_1$ (with c' = 2c), I222 (with c' = 2c) and $I2_12_12_1$ (with c' = 2c). These subgroups have conventional cells larger than that of C222 and are denoted as subgroups of type IIb. For k subgroups the point group P of G is unchanged.
3. By combination of 1 and 2. In this case both the translation group T
and the point group P of G are changed.
A theorem by Hermann states that a maximal subgroup of G is either a t
subgroup or a k subgroup. Thus in the International Tables only I, IIa, IIb,
and IIc subgroups are listed.
Sometimes we are interested in the possible space groups G' of which a given space group G is a subgroup. G' is called a minimal supergroup of the group G if G is a maximal subgroup of G'. Of course we will have a minimal t, or a minimal non-isomorphous k, or a minimal isomorphous k supergroup G' of G according to whether G is a maximal t, or a maximal non-isomorphous k, or a maximal isomorphous k subgroup of G'. The minimal non-isomorphous supergroups of C222 are:
of type t: Cmmm, Cccm, Cmma, Ccca, P422, $P42_12$, $P4_222$, $P4_22_12$, $P\bar{4}m2$, $P\bar{4}c2$, $P\bar{4}b2$, $P\bar{4}n2$, P622, $P6_222$, $P6_422$;
of type k: F222, P222 (with a' = a/2, b' = b/2).
For very large values of the order of the rotation axis the two types approach $\infty$ and $\infty m$ respectively. From the geometrical point of view $\infty$ and $\infty m$ are identical, and our standard notation will be $\infty m$ (the situation $\infty \neq \infty m$ occurs when the rotation direction is taken into account: i.e. for a magnetic field round a disc).
In three dimensions a point group can include continuous rotations about one or about all axes (this is a consequence of the Euler theorem applied to such a limiting case). In the first case two groups can be identified, $\infty m$ and $\frac{\infty}{m}\frac{2}{m}$, according to whether a mirror perpendicular to the $\infty$ axis is absent or present (the symmetry of the two groups can be represented by a circular cone and by a circular cylinder respectively). The symbol $\frac{\infty}{m}\frac{\infty}{m}$ represents the case in which continuous rotation about any axis is allowed (the symmetry is represented by a sphere).
Representation of a group
If a square matrix d can be associated to each $g \in G$, in such a way that when $g_ig_j = g_k$ also $d_id_j = d_k$, then the matrices form a group D isomorphous with G. These matrices form an isomorphous or exact representation of the group: the order n of the matrices is the dimension of the representation. In accordance with this point of view, in Chapter 1 we have represented the symmetry groups through square matrices of order 3.
Different representations of G may be obtained through a transformation of
the type
is transformed into
with dl of order m < n and d2 of order (n - m). If this can not be obtained
by any transformation, then the representation is called irreducible;
otherwise it is called reducible. Sometimes dl and d2 can be further
reduced, and at the end of the process each matrix d j will be transformed
into
$$q^{-1}d_jq = \mathrm{diag}[d_j^{(1)}, d_j^{(2)}, \ldots, d_j^{(n)}] = d_j'$$
where the $d_j^{(i)}$ are themselves matrices.
The matrices $d_1^{(1)}, d_2^{(1)}, d_3^{(1)}, \ldots$ all have the same dimension. Similarly $d_1^{(2)}, d_2^{(2)}, d_3^{(2)}, \ldots$ have the same dimension. From the rule of the product of blocked matrices it follows that $(d_1^{(1)}, d_2^{(1)}, d_3^{(1)}, \ldots)$ form a representation of the group, as well as $(d_1^{(2)}, d_2^{(2)}, d_3^{(2)}, \ldots)$, etc.
It can be shown that for finite groups the number of irreducible
representations is equal to the number of classes. For instance an
isomorphous (reducible) representation of the point group 32 is
Character tables
The sum of the diagonal elements of a matrix, elsewhere called trace, in group theory is called character and is indicated by $\chi(g)$. It is obvious that $\chi(e)$ defines the dimensionality of the representation. The complete set of characters for a given representation is called the character of the
The $G^1$ groups
1. $G^1_0$ groups. In a one-dimensional space (a line), which is non-periodic, only two symmetry operators are conceivable: 1 and $\bar{1}$ (which is the reflection operator m). The only two (point) groups are therefore 1 and $\bar{1}$.
2. $G^1_1$ groups. Besides the 1 and $\bar{1}$ operators, they contain the translation operator. Only two groups of type $G^1_1$ are then possible.
The $G^2$ groups
1. $G^2_0$ groups. In a two-dimensional space (a plane), which is non-periodic, the only conceivable operators are those of rotation around an axis perpendicular to the plane and of reflection with respect to a line on the plane. The number of (point) groups is infinite, but there are only ten crystallographic groups (see p. 16).
2. $G^2_1$ (border) groups. In a two-dimensional space, periodic in one dimension, only the symmetry operators (and their combinations) which transform that direction into itself are allowed. We may therefore consider reflection planes parallel or perpendicular to the invariant direction, glides with translational component parallel to it and two-fold axes. There are seven $G^2_1$ groups (the symmetries of linear decorations) which are represented in Fig. 1.F.1.
3. $G^2_2$ groups. These are the 17 plane groups described on pages 30 and 34.
Fig. 1.F.1. The seven border groups.
The $G^3$ groups
1. $G^3_0$ groups. These describe non-periodic spaces in three dimensions. The number of (point) groups is infinite (see Appendix 1.B), but there are only 32 crystallographic point groups (see pp. 11-16).
2. $G^3_1$ (rod) groups. Rod groups may be considered as arising from the combination of one-dimensional translation groups with point groups $G^3_0$. They describe three-dimensional objects which are periodic in only one direction (say z). This must remain invariant with respect to all symmetry operations. The only allowed operations are therefore n and $\bar{n}$ axes coinciding with z, 2 and $\bar{2}$ axes perpendicular to it, screw axes and glide planes with a translational component parallel to the invariant direction.
There are 75 $G^3_1$ crystallographic groups. In Table 1.F.1 the rod group symbols are shown alongside the point groups from which they are derived. The first position in the symbol indicates the axis (n or $\bar{n}$) along z, the
1 1
2 2 21
3 3 31
4 4 41
6 6 61
Im Im Ic
2mm 2mm 2,mc
31-17 3m 3c
4m m 4mm 4,mc
6mm 6m m 6,mc
m m
2/m 2/m 2,/m
4/m 4/m 421m
6/m 61m 6,/m
m2m m2m m2c
2 2 2
--- 2 2 2
--- 2, 2 2
---
mmm mmm mmc
4 2 2
--- 4 2 2
--- 4, 22
---
rnrnrn mmm mmc
6 2 2
--- 6 2 2
--- 6, 2 2
---
mmm mmm mmc
12 12
222 222 2,22
32 32 312
422 422 4,22
622 622 6,22
1 1
3 3
4 4
6 6
- 2 -1 -
2 -2
1-
1-
m m C
- 2 - 2
3-
-2
3- 3-
m m C
4m2 4m2 4c2
6m2 6m2 6c2
The $G^4$ groups
The three-dimensional Euclidean space may be insufficient to describe the symmetries of some physical objects. We can therefore introduce one or more additional continuous variables (e.g. the time, the phase of a wave function, etc.), thus passing from a three-dimensional space into a space with dimensions m > 3. In a four-dimensional Euclidean space the symmetry groups $G^4_4$ may be constructed from their three-dimensional projections $G^3_3$, which are all well known. Thus there are 227 point groups $G^4_0$ and 4895 groups $G^4_4$.
1 P1
2 P2
rn prn Pb
Tim p2/m p2/b
1 P1
2 2 2
--- 2 2 2 P--l
P--- 222 22,2,
mrnm mmm rnbrn
p2- -2A2 p22
- l2L 222 22,2
brnm bm a P b b i Pbba-
22121 p---
P--- 22,2 P,b2a2 2
nrnm nbm
1rn c l rn
2mm c,mrn
m2m cm2m
2 2 2
--- 2 2 2
C---
mmm mmm
12 c12
222 c222
- 2 - 2
1- cl-
m rn
-
4 P4
4m rn p4mm p4grn
41rn p4m/m p4/n
4 2 2
--- 4 2 2
P--- 2- 2
p -4A 42,2 422
mmrn mmm r n g m P---
r n g m P---
ngm
4 P4
422 ~ 2 2 ~ 4 2 ~ 2
4m2 p4m2 p4g2 p42rn p42,rn
in the ceramic and electronic industries. At room temperature, NiO is rhombohedral with edge length $a_R = 2.952$ Å and $\alpha_R = 60°4'$: $\alpha_R$ approaches 60° with increasing temperature and, above 250°C, NiO is cubic with $a_C = 4.177$ Å. The relation between the two cells is shown in Fig. 1.F.2(b): the same set of lattice points is described by the primitive rhombohedral unit cell and by the face-centred cubic cell provided that $\alpha_R = 60°$ exactly and $a_R = a_C/\sqrt{2}$. If the cube is compressed (or extended) along one of the four threefold axes of the cubic unit cell then the symmetry reduces from cubic to rhombohedral (the only threefold axis is the compression axis). The polymorphism of NiO is due to its magnetic properties. Each Ni$^{2+}$ ion has two unpaired spins (the [Ar]3d$^8$ electronic configuration). At room temperature the spins in NiO form an ordered antiferromagnetic array: layers of Ni$^{2+}$ with net spin magnetic moments all in the same direction alternate with layers of Ni$^{2+}$ with magnetic moments all in the opposite direction, as in Fig. 1.F.2(c). In these conditions the threefold axis is unique and the structure is rhombohedral. Above 250°C the antiferromagnetic ordering is lost: the rhombohedral → cubic transition occurs and NiO displays ordinary paramagnetism.
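The geometric relation between the two cells is easy to check. The sketch below assumes the usual choice of primitive vectors for a face-centred cubic lattice (half face diagonals of the cube); the function name is hypothetical.

```python
import math

def primitive_rhombohedral_cell(a_cubic):
    """Edge length and interaxial angle (degrees) of the primitive cell of an F cubic lattice."""
    h = a_cubic / 2.0
    # Assumed primitive vectors: half face diagonals of the cube.
    v1, v2 = (0.0, h, h), (h, 0.0, h)
    length = math.sqrt(sum(c * c for c in v1))
    cos_alpha = sum(p * q for p, q in zip(v1, v2)) / length ** 2
    return length, math.degrees(math.acos(cos_alpha))

a_r, alpha_r = primitive_rhombohedral_cell(4.177)   # a_C of cubic NiO, from the text
print(round(a_r, 3), round(alpha_r, 3))
```

The computed edge, about 2.954 Å, is close to the room-temperature value of 2.952 Å quoted above; the small difference reflects the slight rhombohedral distortion and the different temperatures of the two measurements.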
2. If we project a $G^3_3$ group, in which a $6_1$ axis is present, on a plane perpendicular to the axis, we obtain a $G^2_2$ group. But, if we assign a different colour to each of the six atoms related by the $6_1$ axis, we will obtain a colour group $G^{2,(6)}_2$ with a clear meaning of the symbols.
In the groups with antisymmetry there will be four types of equivalence between geometrically related objects: identity, identity after an inversion operation, anti-identity (the two objects differ only in the colour), identity after both an inversion operation and a change in colour. A general rotation matrix may be written in the form
$$R = \begin{pmatrix} R_{11} & R_{12} & R_{13} & 0 \\ R_{21} & R_{22} & R_{23} & 0 \\ R_{31} & R_{32} & R_{33} & 0 \\ 0 & 0 & 0 & R_{44} \end{pmatrix}$$
where $R_{44} = -1$ or $+1$ depending on whether or not the operation changes the colour.
Fig. 1.F.2. Examples of structure described by an antisymmetry group: (a) CoAl$_2$O$_4$ magnetic structure; (b) geometrical relation between a face-centred cubic unit cell and a primitive rhombohedral unit cell; (c) antiferromagnetic superstructure of NiO (only Ni$^{2+}$ ions are shown).
For the three-dimensional groups with antisymmetry we observe that, because of the existence of the anti-identity operation 1' (only the colour is changed), the anti-translation operation t' = t1' will also exist. New types of Bravais lattices, such as those given in Fig. 1.F.3, will come out. As an example, in Fig. 1.F.2(c) the quasi-cubic magnetic unit cell of NiO has an edge length twice that of the chemical unit cell. It may be seen[14] that if the five Bravais lattices are centred by black and white lattice points (in equal percentage) then five new plane lattices are obtained. In three dimensions there are 36 black and white Bravais lattices, including the traditional uncoloured lattices.
Fig. 1.F.3. Two antisymmetrical Bravais lattices.
References
1. Shubnikov, A. V. (1960). Krystallografiya, 5, 489.
2. Shubnikov, A. V. and Belov, N. V. (1964). Coloured symmetry. Pergamon, Oxford.
3. Bradley, C. J. and Cracknell, A. P. (1972). The mathematical theory of symmetry in solids. Representation theory for point groups and space groups. Clarendon Press, Oxford.
4. Lockwood, E. H. and MacMillan, R. H. (1978). Geometric symmetry. Cambridge University Press.
5. Vainshtein, B. K. (1981). Modern crystallography I: Symmetry of crystals, methods of structural crystallography. Springer, Berlin.
6. (1983). International tables for crystallography, Vol. A, Space group symmetry. Reidel, Dordrecht.
7. Dougherty, J. P. and Kurtz, S. K. (1976). Journal of Applied Crystallography, 9, 145.
8. Kitaigorodskij, A. I. (1955). Organic crystallochemistry. Moscow.
9. Mighell, A. and Rodgers, J. R. (1980). Acta Crystallographica, A36, 321.
10. Wilson, A. J. C. (1988). Acta Crystallographica, A44, 715.
11. Fischer, W., Burzlaff, H., Hellner, E., and Donnay, J. D. M. (1973). Space groups and lattice complexes, NBS Monograph 134. National Bureau of Standards, Washington, D.C.
12. Alexander, E. and Herrmann, K. (1929). Zeitschrift für Kristallographie, 70, 328.
13. Alexander, E. (1929). Zeitschrift für Kristallographie, 70, 367.
14. Mackay, A. L. (1957). Acta Crystallographica, 10, 543.
CARMELO GIACOVAZZO
Introduction
In this chapter elements of crystallographic computing are described.
Material is treated in order to answer day-to-day questions and to provide a
basis for reference. Among the various topics, those which are of more
frequent use have been selected: axis transformations, geometric calcula-
tions (bond angles and distances, torsion angles, principal axes of the
quadratic forms, metric considerations on the lattices, structure factors,
Fourier calculations,. . .). The method of least squares and its main
crystallographic applications are treated in greater detail. For practical
reasons some calculations useful in characterization of thermal ellipsoids are
developed in Appendix 3.B .
The following notation is adopted: $\mathbf{r}_1 \cdot \mathbf{r}_2$ denotes the scalar product between the two vectors $\mathbf{r}_1$ and $\mathbf{r}_2$, $\mathbf{r}_1 \wedge \mathbf{r}_2$ is their cross product, and r will be the modulus of $\mathbf{r}$. $\mathbf{S}_1\mathbf{S}_2$ is the (row by column) product of two matrices $\mathbf{S}_1$ and $\mathbf{S}_2$; $\bar{\mathbf{S}}$ is the transposed matrix of $\mathbf{S}$, and $S$ is the determinant of the square matrix $\mathbf{S}$.
We will also distinguish between coordinate matrices and vectors. For example, with respect to a coordinate system [O, a, b, c] the vector r will be written as
$$\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c} = \bar{X}A$$
where X is the coordinate matrix and A is the matrix which represents the basis vectors of the rectilinear coordinate system.
G is the metric matrix, also called the metric tensor: its elements define both the moduli of a, b, c and the angles between them. The value of its determinant is
$$G = a^2b^2c^2(1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2\cos\alpha\cos\beta\cos\gamma) \qquad (2.2)$$
which (see Table 2.1 and p. 69) is equal to $V^2$ (square of the volume of the unit cell). If $\mathbf{r}_1 = \mathbf{r}_2 = \mathbf{r}$ then (2.1) becomes
$$r^2 = \bar{X}GX \qquad (2.3a)$$
which gives the modulus square of a vector. We can now calculate the following:
1. The interatomic distance d between two atoms positioned in $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$. Denoting
$$\Delta_1 = a(x_1 - x_2), \quad \Delta_2 = b(y_1 - y_2), \quad \Delta_3 = c(z_1 - z_2)$$
gives
$$d^2 = \Delta_1^2 + \Delta_2^2 + \Delta_3^2 + 2\Delta_1\Delta_2\cos\gamma + 2\Delta_1\Delta_3\cos\beta + 2\Delta_2\Delta_3\cos\alpha. \qquad (2.3b)$$
2. The angle $\theta$ between two vectors
$$\cos\theta = \bar{X}_1GX_2/(r_1r_2). \qquad (2.4)$$
3. The cross product $\mathbf{r}_2 \wedge \mathbf{r}_3$:
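The first two calculations, together with the cell volume from (2.2), can be sketched as follows. The function names and the hexagonal test cell are illustrative assumptions; the formulas are those given above, with coordinates in fractional units and angles in degrees.

```python
import math

def metric_matrix(a, b, c, alpha, beta, gamma):
    """Metric matrix G of the direct cell (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return [[a * a, a * b * cg, a * c * cb],
            [a * b * cg, b * b, b * c * ca],
            [a * c * cb, b * c * ca, c * c]]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def scalar_product(G, x1, x2):
    """r1 . r2 computed as the row-by-column product X1(transposed) G X2."""
    return sum(x1[i] * G[i][j] * x2[j] for i in range(3) for j in range(3))

def cell_volume(G):
    """V is the square root of the determinant of G, eqn (2.2)."""
    return math.sqrt(det3(G))

def distance(G, x1, x2):
    """Interatomic distance between two fractional positions, eqn (2.3b)."""
    d = [p - q for p, q in zip(x1, x2)]
    return math.sqrt(scalar_product(G, d, d))

def angle(G, x1, x2):
    """Angle in degrees between two vectors, eqn (2.4)."""
    r1 = math.sqrt(scalar_product(G, x1, x1))
    r2 = math.sqrt(scalar_product(G, x2, x2))
    cos_t = scalar_product(G, x1, x2) / (r1 * r2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Hypothetical hexagonal cell used only as a check.
G_HEX = metric_matrix(3.0, 3.0, 5.0, 90.0, 90.0, 120.0)
print(cell_volume(G_HEX), distance(G_HEX, (0, 0, 0), (1, 1, 0)), angle(G_HEX, (1, 0, 0), (0, 1, 0)))
```

For the hexagonal cell the vector a + b has the same length as a, and the angle between a and b is 120°, which the sketch reproduces.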
Equation (2.10a) suggests that a* is normal to the plane (b, c), b* to the plane (a, c) and c* to the plane (a, b). The modulus and sense of a*, b*, c* are fixed by (2.10b).
According to (2.10a) a* may be written as
Equation (2.10) also suggests that the roles of direct and reciprocal space may be interchanged: i.e. the reciprocal of the reciprocal lattice is the direct lattice. Therefore
$$\mathbf{a} = \frac{1}{V^*}(\mathbf{b}^* \wedge \mathbf{c}^*), \quad \mathbf{b} = \frac{1}{V^*}(\mathbf{c}^* \wedge \mathbf{a}^*), \quad \mathbf{c} = \frac{1}{V^*}(\mathbf{a}^* \wedge \mathbf{b}^*). \qquad (2.13)$$
$$V^* = 1/V$$
where $d_H$ is the spacing of the planes (hkl) in the direct lattice. This may be proved by observing that $d_H$ is equal to the length of the normal ON to the plane ABC from the origin O. Since $\mathbf{r}^*_H$ has the same direction as ON,
$$d_H = (\mathbf{a}/h) \cdot \frac{\mathbf{r}^*_H}{r^*_H} = \frac{1}{r^*_H}.$$
4. As well as for the direct lattice, a metric matrix G* may be defined for the reciprocal lattice:
from which
$$d_H = (h^2a^{*2} + k^2b^{*2} + l^2c^{*2} + 2hka^*b^*\cos\gamma^* + 2hla^*c^*\cos\beta^* + 2klb^*c^*\cos\alpha^*)^{-1/2} \qquad (2.17b)$$
is easily obtained.
Specific expressions of $d_H$ for the various crystal systems are given in Table 2.2.
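A general interplanar spacing can be computed without the system-specific formulas. The sketch below relies on the standard relation $G^* = G^{-1}$ between the reciprocal and direct metric matrices, which is assumed here rather than quoted from the text; the function names are hypothetical.

```python
import math

def metric_matrix(a, b, c, alpha, beta, gamma):
    """Metric matrix G of the direct cell (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return [[a * a, a * b * cg, a * c * cb],
            [a * b * cg, b * b, b * c * ca],
            [a * c * cb, b * c * ca, c * c]]

def inverse3(m):
    """Inverse of a 3x3 matrix through its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def d_spacing(cell, hkl):
    """d_H = 1/r*_H, with r*_H^2 = H(transposed) G* H and G* taken as the inverse of G."""
    g_star = inverse3(metric_matrix(*cell))
    r_star_sq = sum(hkl[i] * g_star[i][j] * hkl[j] for i in range(3) for j in range(3))
    return 1.0 / math.sqrt(r_star_sq)

print(d_spacing((4.0, 4.0, 4.0, 90, 90, 90), (1, 1, 1)))     # cubic: a / sqrt(3)
print(d_spacing((3.0, 3.0, 5.0, 90, 90, 120), (1, 0, 0)))    # hexagonal: a * sqrt(3) / 2
```

Both test values agree with the cubic and hexagonal expressions of Table 2.2.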
Basis transformations
In three-dimensional space the coordinate system defined by the base vectors a', b', c' may be defined in terms of the base vectors a, b, c by
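Table 2.2. The algebraic expressions of $1/d_H^2$ for the various crystal systems
Cubic: $(h^2 + k^2 + l^2)/a^2$
Tetragonal: $(h^2 + k^2)/a^2 + l^2/c^2$
Orthorhombic: $h^2/a^2 + k^2/b^2 + l^2/c^2$
Hexagonal and trigonal (P): $\frac{4}{3}(h^2 + hk + k^2)/a^2 + l^2/c^2$
Trigonal (R): $\dfrac{(h^2 + k^2 + l^2)\sin^2\alpha + 2(hk + hl + kl)(\cos^2\alpha - \cos\alpha)}{a^2(1 + 2\cos^3\alpha - 3\cos^2\alpha)}$
Monoclinic: $\dfrac{h^2}{a^2\sin^2\beta} + \dfrac{k^2}{b^2} + \dfrac{l^2}{c^2\sin^2\beta} - \dfrac{2hl\cos\beta}{ac\sin^2\beta}$
Triclinic: $(1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2\cos\alpha\cos\beta\cos\gamma)^{-1}\left[\dfrac{h^2}{a^2}\sin^2\alpha + \dfrac{k^2}{b^2}\sin^2\beta + \dfrac{l^2}{c^2}\sin^2\gamma + \dfrac{2kl}{bc}(\cos\beta\cos\gamma - \cos\alpha) + \dfrac{2lh}{ca}(\cos\gamma\cos\alpha - \cos\beta) + \dfrac{2hk}{ab}(\cos\alpha\cos\beta - \cos\gamma)\right]$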
to (2.19) and (2.5), $V' = VM$. Thus, for any transformation of axes the unit cell volume is multiplied by the determinant of the transformation matrix. The set of matrices conventionally used to pass from centred cells to primitive ones and vice versa is shown in Table 2.C.1 (however, transformation matrices are not unique).
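As a quick illustration of the volume rule, the sketch below evaluates the determinants of two commonly used centred-to-primitive matrices. These matrices are assumed standard choices, not a transcription of Table 2.C.1, and the rows give the new basis vectors in terms of the old ones.

```python
from fractions import Fraction as F

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

H = F(1, 2)
# Assumed conventional choices (transformation matrices are not unique).
F_TO_P = [[0, H, H], [H, 0, H], [H, H, 0]]       # face-centred -> primitive
I_TO_P = [[-H, H, H], [H, -H, H], [H, H, -H]]    # body-centred -> primitive

def transformed_volume(volume, matrix):
    """V' = V M, with M the determinant of the transformation matrix."""
    return volume * det3(matrix)

print(det3(F_TO_P), det3(I_TO_P))
print(float(transformed_volume(F(4177, 1000) ** 3, F_TO_P)))
```

The determinants 1/4 and 1/2 reflect the four and two lattice points contained in the F and I cells respectively.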
Let us now apply (2.20) to derive the transformation rules of a quadratic
form. Let
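$$\mathbf{r}' = \mathbf{r}^* = x'\mathbf{a}^* + y'\mathbf{b}^* + z'\mathbf{c}^* = (\mathbf{r}' \cdot \mathbf{a})\mathbf{a}^* + (\mathbf{r}' \cdot \mathbf{b})\mathbf{b}^* + (\mathbf{r}' \cdot \mathbf{c})\mathbf{c}^*. \qquad (2.24)$$
On assuming in (2.23) r = a*, b*, c*, the following relations
$$\mathbf{a}^* = (\mathbf{a}^* \cdot \mathbf{a}^*)\mathbf{a} + (\mathbf{a}^* \cdot \mathbf{b}^*)\mathbf{b} + (\mathbf{a}^* \cdot \mathbf{c}^*)\mathbf{c}$$
$$\mathbf{b}^* = (\mathbf{b}^* \cdot \mathbf{a}^*)\mathbf{a} + (\mathbf{b}^* \cdot \mathbf{b}^*)\mathbf{b} + (\mathbf{b}^* \cdot \mathbf{c}^*)\mathbf{c}$$
$$\mathbf{c}^* = (\mathbf{c}^* \cdot \mathbf{a}^*)\mathbf{a} + (\mathbf{c}^* \cdot \mathbf{b}^*)\mathbf{b} + (\mathbf{c}^* \cdot \mathbf{c}^*)\mathbf{c}$$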
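where $(l_1, l_2, l_3)$, $(m_1, m_2, m_3)$, $(n_1, n_2, n_3)$ are direction cosines of the unit vectors a/a, b/b, c/c in E. Therefore
$$\sum_i l_i^2 = \sum_i m_i^2 = \sum_i n_i^2 = 1.$$
From Fig. 2.2 it may be deduced that
$$l_1 = 1, \; l_2 = 0, \; l_3 = 0, \; m_1 = \cos\gamma, \; m_2 = \sin\gamma, \; m_3 = 0, \; n_1 = \cos\beta.$$
Since
$$\cos\alpha = \sum_i m_in_i = \cos\gamma\cos\beta + \sin\gamma \, n_2$$
we obtain
$$n_2 = (\cos\alpha - \cos\beta\cos\gamma)/\sin\gamma = -\sin\beta\cos\alpha^*.$$
Furthermore, from the relation $\sum_i n_i^2 = 1$,
$$n_3 = \sin\beta\sin\alpha^* = 1/(cc^*)$$
is easily obtained. Finally
$$\begin{pmatrix} \mathbf{a} \\ \mathbf{b} \\ \mathbf{c} \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ b\cos\gamma & b\sin\gamma & 0 \\ c\cos\beta & -c\sin\beta\cos\alpha^* & 1/c^* \end{pmatrix} \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{e}_3 \end{pmatrix} = M^{-1}E. \qquad (2.30)$$
Fig. 2.2. Orthonormalization of crystallographic bases.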
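The orthonormalization (2.30) can be sketched numerically. The code below builds the rows of $M^{-1}$ from direct-cell parameters only, using $n_2$ and $n_3$ as derived above; the function names and the triclinic test cell are assumptions for illustration.

```python
import math

def cartesian_basis(a, b, c, alpha, beta, gamma):
    """Rows: components of a, b, c in E (e1 along a, e2 in the (a, b) plane); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    n2 = (ca - cb * cg) / sg                    # equals -sin(beta) cos(alpha*)
    n3 = math.sqrt(1.0 - cb * cb - n2 * n2)     # equals sin(beta) sin(alpha*) = 1/(c c*)
    return [[a, 0.0, 0.0],
            [b * cg, b * sg, 0.0],
            [c * cb, c * n2, c * n3]]

def metric_from_basis(rows):
    """G as the table of scalar products between the basis vectors."""
    return [[sum(p * q for p, q in zip(u, v)) for v in rows] for u in rows]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def to_cartesian(rows, x):
    """Cartesian coordinates of the fractional position x."""
    return [sum(x[i] * rows[i][k] for i in range(3)) for k in range(3)]

CELL = (5.0, 6.0, 7.0, 80.0, 95.0, 105.0)       # hypothetical triclinic cell
ROWS = cartesian_basis(*CELL)
print(metric_from_basis(ROWS)[0][1], det3(ROWS))
```

The scalar products of the rows reproduce the metric matrix G, and the determinant of the matrix equals the cell volume V, as stated in the text.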
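$$M = \begin{pmatrix} a^* & b^*\cos\gamma^* & c^*\cos\beta^* \\ 0 & b^*\sin\gamma^* & -c^*\sin\beta^*\cos\alpha \\ 0 & 0 & 1/c \end{pmatrix}$$
and
$$M^{-1} = \begin{pmatrix} 1/a^* & -\cot\gamma^*/a^* & a\cos\beta \\ 0 & 1/(b^*\sin\gamma^*) & b\cos\alpha \\ 0 & 0 & c \end{pmatrix}$$
where $s\gamma$, $c\gamma$, $t\gamma$ stand for $\sin\gamma$, $\cos\gamma$, and $\tan\gamma$, etc.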
The family of all the possible transformations M which orthonormalize a
given frame A according to E = MA may be obtained from the following
decomposition of the metric matrix G of A:
$$V = \mathbf{a} \cdot \mathbf{b} \wedge \mathbf{c} = \det(M^{-1})$$
and
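$$R_x(\alpha_1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_1 & -\sin\alpha_1 \\ 0 & \sin\alpha_1 & \cos\alpha_1 \end{pmatrix}, \quad R_y(\alpha_2) = \begin{pmatrix} \cos\alpha_2 & 0 & \sin\alpha_2 \\ 0 & 1 & 0 \\ -\sin\alpha_2 & 0 & \cos\alpha_2 \end{pmatrix}, \quad R_z(\alpha_3) = \begin{pmatrix} \cos\alpha_3 & -\sin\alpha_3 & 0 \\ \sin\alpha_3 & \cos\alpha_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (2.32)$$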
Since the matrices are orthogonal, the following relations hold:
$$\bar{R}(\alpha) = R^{-1}(\alpha) = \bar{R}^{-1}(-\alpha) = R(-\alpha).$$
Corresponding clockwise rotations are obtained by changing $\alpha_i$ into $-\alpha_i$. Matrices corresponding to rotatory-reflection operations about $e_1$, $e_2$, $e_3$ are obtained by replacing in (2.32) the integer 1 by -1. Matrices corresponding to rotatory-inversion operations about $e_1$, $e_2$, $e_3$ are obtained by changing the signs of all the elements. The traces of the matrices that represent proper rotations, rotatory-reflection and rotatory-inversion operations are $1 + 2\cos\alpha$, $2\cos\alpha - 1$, $-2\cos\alpha - 1$ respectively.
It should be noted that rotating r about $e_1$, $e_2$, $e_3$ in anti-clockwise mode is equivalent to rotating $e_1$, $e_2$, $e_3$ in clockwise mode. For example, if the framework $[O, e_1, e_2, e_3]$ may be superimposed on the new framework $[O, e_1', e_2', e_3']$ by a clockwise rotation through $\alpha_3$ about $e_3$, then $E' = R_z(\alpha_3)E$. According to (2.20), $X' = \bar{R}_z^{-1}(\alpha_3)X = R_z(\alpha_3)X$, which corresponds to an anti-clockwise rotation of a vector r in E.
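The properties of the elementary rotation matrices quoted above lend themselves to a numerical check. This is a minimal sketch with hypothetical function names, using the anti-clockwise matrices about the three Cartesian axes.

```python
import math

def rot_x(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def rotoinversion(m):
    """Rotatory-inversion: all signs of the proper rotation matrix are changed."""
    return [[-v for v in row] for row in m]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

alpha = 0.7
R = rot_z(alpha)
print(close(transpose(R), rot_z(-alpha)))             # transpose = inverse = R(-alpha)
print(trace(R), 1 + 2 * math.cos(alpha))              # trace of a proper rotation
print(trace(rotoinversion(R)), -2 * math.cos(alpha) - 1)
```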
We now consider some useful applications:
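1. Rotation about the unitary vector l in a rectilinear coordinate system. Given the crystal base A = (a, b, c), an orthonormal base E = ($e_1$, $e_2$, $e_3$) may be chosen (E = MA) such that $e_1$ coincides with l. In accordance with (2.32) a rotation about l in E is represented by $R_x$ and, according to Table 2.E.1, the same rotation in A is represented by $R = \bar{M}R_x(\bar{M})^{-1}$. As an example we calculate in the hexagonal system the matrix corresponding to an anti-clockwise rotation through $\chi$ about a. According to (2.30) and (2.31) the matrices M and $M^{-1}$ are
$$M = \begin{pmatrix} 1/a & 0 & 0 \\ 1/(a\sqrt{3}) & 2/(a\sqrt{3}) & 0 \\ 0 & 0 & 1/c \end{pmatrix}, \quad M^{-1} = \begin{pmatrix} a & 0 & 0 \\ -a/2 & a\sqrt{3}/2 & 0 \\ 0 & 0 & c \end{pmatrix}$$
from which
$$R = \bar{M}R_x(\bar{M})^{-1} = \begin{pmatrix} 1 & (\cos\chi - 1)/2 & -c\sin\chi/(a\sqrt{3}) \\ 0 & \cos\chi & -2c\sin\chi/(a\sqrt{3}) \\ 0 & a\sqrt{3}\sin\chi/(2c) & \cos\chi \end{pmatrix}.$$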
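The hexagonal example can be reproduced numerically by carrying out the three matrix products explicitly. In this sketch the function names and the cell values a = 3, c = 5 are assumptions; `to_cart` and `to_frac` play the roles of $(\bar{M})^{-1}$ and $\bar{M}$.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def hexagonal_rotation_about_a(a, c, chi):
    """Rotation through chi about a, expressed in the hexagonal basis."""
    s3 = math.sqrt(3.0)
    # Columns: Cartesian components of a, b, c (e1 along a, e3 along c).
    to_cart = [[a, -a / 2, 0.0], [0.0, a * s3 / 2, 0.0], [0.0, 0.0, c]]
    to_frac = [[1 / a, 1 / (a * s3), 0.0], [0.0, 2 / (a * s3), 0.0], [0.0, 0.0, 1 / c]]
    co, si = math.cos(chi), math.sin(chi)
    r_x = [[1.0, 0.0, 0.0], [0.0, co, -si], [0.0, si, co]]
    return mat_mul(to_frac, mat_mul(r_x, to_cart))

def closed_form(a, c, chi):
    """The matrix R written out in the text."""
    s3 = math.sqrt(3.0)
    co, si = math.cos(chi), math.sin(chi)
    return [[1.0, (co - 1) / 2, -c * si / (a * s3)],
            [0.0, co, -2 * c * si / (a * s3)],
            [0.0, a * s3 * si / (2 * c), co]]

R_180 = hexagonal_rotation_about_a(3.0, 5.0, math.pi)
print([[round(v) for v in row] for row in R_180])
```

For $\chi = 180°$ the terms containing the ratio a/c vanish and R becomes an integer matrix, sending (x, y, z) into (x - y, -y, -z).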
Incidentally, it may be noted that the ratio a/c is unconstrained. Thus R
(which is an integer matrix) may correspond to a symmetry axis along a in
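from which
$$X' = R_xR_yR_zX = RX$$
where
$$R = \begin{pmatrix} c\theta_2c\theta_3 & -c\theta_2s\theta_3 & s\theta_2 \\ c\theta_1s\theta_3 + s\theta_1s\theta_2c\theta_3 & c\theta_1c\theta_3 - s\theta_1s\theta_2s\theta_3 & -s\theta_1c\theta_2 \\ s\theta_1s\theta_3 - c\theta_1s\theta_2c\theta_3 & s\theta_1c\theta_3 + c\theta_1s\theta_2s\theta_3 & c\theta_1c\theta_2 \end{pmatrix}$$
and $c\theta_i$ and $s\theta_i$ stand for $\cos\theta_i$ and $\sin\theta_i$ respectively.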
Fig. 2.3. (a) Eulerian angles. (b) Spherical polar coordinates.
The Eulerian angles can also be used in order to calculate, in any crystallographic system A, the rotation function corresponding to any desired rotation. The simplest procedure could be:
(a) Transform coordinates in A (say $X_A$) into coordinates in E (say $X_E$). If E = MA, then according to (2.20), $X_E = (\bar{M})^{-1}X_A$. For example, M may be (2.31a) or (2.31b) or (2.31c).
(b) Transform the Cartesian coordinates into a rotated set of axes. Then $X_E$ transforms into $X_E' = R_{Eu}X_E$.
(c) Return these coordinates into the system A. Then the inverse operation described in (a) has to be made.
The final coordinates are $X_A' = \bar{M}X_E' = \bar{M}R_{Eu}(\bar{M})^{-1}X_A$ so that the desired rotation function is
$$R = \bar{M}R_{Eu}(\bar{M})^{-1}.$$
Since (2.32b) represents the anti-clockwise rotation matrix about the unit vector l in an orthonormal frame we can replace the direction cosines $l_1$, $l_2$, $l_3$ of the rotation axis l by
$$l_1 = \sin\psi\cos\varphi, \quad l_2 = \cos\psi, \quad l_3 = -\sin\psi\sin\varphi$$
and so obtain the expression of the rotation matrix $R_{sp}$ in terms of the rotation angle $\chi$ and the spherical polar coordinates $\varphi$ and $\psi$:
CX+ (1 - cx)s2*c2Q) -s*sQ)sx+ (1 -cx)c*s*cQ) -cvsx-- (1 -cx)s2*cQ)sQ)
Since
(
R,, = s * s q s ~ +(1 - c ~ ) c v s v c q ,
c ysx- (1 - cx)s2*cQ)sq,
cx + (1 - c x ) c 2 ~
-s*cQ)sx- (1 - cx)c*s*sq,
S~CQ)SX - (1 - C X ) C I ~ S ~ S Q )
cx + (1 - c~)sz*s2Q)
Torsion angles
For a sequence of four atoms A, B, C, D, the torsion angle $\omega$(ABCD) is defined as the angle between the normals to the planes ABC and BCD (see Fig. 2.4). By convention[2] $\omega$ is positive if the sense of rotation from BA to CD, viewed down BC, is clockwise, otherwise it is negative. Note that $\omega$(ABCD) and $\omega$(DCBA) have the same sign; furthermore, the sign of a torsion angle does not change by rotation or translation, and is reversed by reflection or inversion. According to the definition (see again Fig. 2.4)
$$\cos\omega = \frac{(\mathbf{a} \wedge \mathbf{b}) \cdot (\mathbf{b} \wedge \mathbf{c})}{ab^2c\sin\alpha\sin\gamma}, \qquad \sin\omega \, \frac{\mathbf{b}}{b} = \frac{(\mathbf{a} \wedge \mathbf{b}) \wedge (\mathbf{b} \wedge \mathbf{c})}{ab^2c\sin\alpha\sin\gamma}$$
which, owing to (2.7) and (2.8), become
$$\cos\omega = \frac{\cos\alpha\cos\gamma - \cos\beta}{\sin\alpha\sin\gamma}, \qquad \sin\omega = \frac{Vb}{ab^2c\sin\alpha\sin\gamma}.$$
Fig. 2.4. Definition of the torsion angle $\omega$.
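In Cartesian coordinates the two expressions can be combined through the two-argument arctangent, which returns the angle with its sign. The sketch below assumes that a, b, c are the bond vectors AB, BC, CD (an interpretation of Fig. 2.4, which is not reproduced here); the function names are hypothetical.

```python
import math

def sub(p, q):
    return [p[i] - q[i] for i in range(3)]

def dot(u, v):
    return sum(u[i] * v[i] for i in range(3))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def torsion_angle(A, B, C, D):
    """Torsion angle omega(ABCD) in degrees from Cartesian coordinates."""
    a, b, c = sub(B, A), sub(C, B), sub(D, C)   # assumed bond vectors AB, BC, CD
    n1, n2 = cross(a, b), cross(b, c)           # normals to the planes ABC and BCD
    # cos(omega) is proportional to n1 . n2, sin(omega) to V b with V = a . (b ^ c);
    # the common factor a b^2 c sin(alpha) sin(gamma) cancels inside atan2.
    x = dot(n1, n2)
    y = math.sqrt(dot(b, b)) * dot(a, n2)
    return math.degrees(math.atan2(y, x))

A, B, C, D = (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)
print(torsion_angle(A, B, C, D), torsion_angle(D, C, B, A))
```

The example gives the same value for ABCD and DCBA, and reflecting the four atoms through a plane reverses the sign, in agreement with the properties listed above.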
where d is the distance of the plane from the origin of the coordinate system, and $\mathbf{n} = n_1\mathbf{a} + n_2\mathbf{b} + n_3\mathbf{c} = \bar{A}N = n_1^*\mathbf{a}^* + n_2^*\mathbf{b}^* + n_3^*\mathbf{c}^* = \bar{A}^*N^*$ is the normal to the plane. The weights $w_j$ should be taken as being inversely proportional to the variances of the atomic positions in the direction normal to the desired plane, but they are often assumed to be unitary.
If the atoms are considered as point masses of weight $w_j$, the least squares plane coincides with the principal plane of least inertia.
The minimum of Q will be searched with respect to d and $n_1^*$, $n_2^*$, $n_3^*$ under the condition that n is a unit vector. This kind of problem is best solved by the method of Lagrange multipliers. The function to minimize is then[3]
$$\sum_j w_j(\bar{N}^*X_j - d)^2 - \lambda(\bar{N}^*G^*N^* - 1). \qquad (2.34)$$
The partial derivative of (2.34) with respect to d gives
from which
$$d = \bar{N}^*\Big[\Big(\sum_j w_jX_j\Big)\Big(\sum_j w_j\Big)^{-1}\Big] = \bar{N}^*X_0. \qquad (2.35)$$
Equation (2.35) states that the plane passes through the centroid $\mathbf{r}_0 = \bar{A}X_0$. Owing to (2.35), eqn (2.34) becomes
Note that $\bar{N}^*SN^*$ is the weighted sum of the squares of the distances of the atoms from the plane. Setting to zero the derivative of (2.36) with respect to
N* (in practice with respect to the components $n_1^*$, $n_2^*$, $n_3^*$) gives $SN^* - \lambda G^*N^* = 0$, which may be also written as
$$(A - \lambda I)N = 0 \qquad (2.37)$$
where A = SG and N = G*N* (see Table 2.E.1). Writing (2.37) as AN = $\lambda$N and multiplying both sides by $\bar{N}^*$ gives
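For atoms given in Cartesian coordinates (so that G and G* reduce to the identity) the eigenvalue problem (2.37) can be sketched as below. The power iteration used to extract the eigenvector of smallest eigenvalue is a substitute technique chosen to keep the example short, not the procedure of the text; all names are hypothetical.

```python
import math

def best_plane(points, weights=None, iterations=200):
    """Least-squares plane through Cartesian points: returns (unit normal, distance d)."""
    w = list(weights) if weights is not None else [1.0] * len(points)
    wsum = sum(w)
    centroid = [sum(wj * p[i] for wj, p in zip(w, points)) / wsum for i in range(3)]
    # S: weighted scatter matrix about the centroid; N S N is the weighted sum
    # of squared distances of the points from the plane with unit normal N.
    S = [[sum(wj * (p[i] - centroid[i]) * (p[k] - centroid[k]) for wj, p in zip(w, points))
          for k in range(3)] for i in range(3)]
    t = S[0][0] + S[1][1] + S[2][2]
    # The largest eigenvalue of B = t I - S belongs to the smallest eigenvalue of S.
    B = [[(t if i == k else 0.0) - S[i][k] for k in range(3)] for i in range(3)]

    def quad(v):
        return sum(v[i] * S[i][k] * v[k] for i in range(3) for k in range(3))

    best = None
    for start in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
        v = start
        for _ in range(iterations):
            u = [sum(B[i][k] * v[k] for k in range(3)) for i in range(3)]
            norm = math.sqrt(sum(x * x for x in u))
            if norm == 0.0:
                break
            v = [x / norm for x in u]
        if best is None or quad(v) < quad(best):
            best = v
    d = sum(n * c for n, c in zip(best, centroid))   # the plane passes through the centroid
    if d < 0:
        best, d = [-x for x in best], -d
    return best, d

normal, d = best_plane([(3, 0, 0), (0, 3, 0), (0, 0, 3), (1, 1, 1)])
print(normal, d)
```

Three starting vectors are tried so that at least one of them has a non-zero component along the sought normal; the result with the smallest residual is kept.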
be the quadratic form. Finding its principal axes is equivalent to finding the
directions n in which q is stationary. As in calculating the best plane through
a set of points, the problem may be solved via the Lagrange multipliers by
minimizing
from which
$$(A - \lambda I)N^* = 0$$
where A = QG*, and N* = GN is the general eigenvector, the components of which are referred to the reciprocal axes. The eigenvalue λ gives the value of q in the n direction. Indeed, if (2.38) is written as AN* = λN* and both sides
Substituting the three eigenvalues $\lambda_1, \lambda_2, \lambda_3$ into (2.38a) provides the three eigenvectors $N_1^*, N_2^*, N_3^*$ which represent the principal axes of q.
If the quadratic form is referred to the reciprocal basis (i.e. $q^* = \bar{H}\beta H = \beta_{11}h^2 + 2\beta_{12}hk + \ldots + \beta_{33}l^2$) the problem may be solved in the same way on condition that G* and N* replace G and N respectively. As an example let us determine the principal axes of an atomic thermal ellipsoid for which $\beta_{11} = 0.00906$, $\beta_{12} = -0.00049$, $\beta_{13} = -0.00102$, $\beta_{22} = 0.00401$, $\beta_{23} = 0.00038$, $\beta_{33} = 0.01424$. Let the orthorhombic unit cell parameters be a = 8.475, b = 10.742, c = 5.899 Å. The function to minimize is
which has solutions $\lambda_1 = 0.483$, $\lambda_2 = 0.448$, $\lambda_3 = 0.677$. Using the first root gives
$$\begin{pmatrix} 0.1677 & -0.0565 & -0.0354 \\ -0.0352 & -0.0203 & 0.0132 \\ -0.0733 & 0.0438 & 0.0125 \end{pmatrix}\begin{pmatrix} n_{11} \\ n_{21} \\ n_{31} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Since the three equations are linearly dependent, $n_{11}$ and $n_{21}$ can be found in terms of $n_{31}$: $n_{11} = 0.2592\,n_{31}$, $n_{21} = 0.185\,n_{31}$. The eigenvector $N_1$ will have unitary modulus (remember that $n_{11}, n_{21}, n_{31}$ are the components of $N_1$ in A) if $n_{31} = 0.1515$. Therefore $\bar{N}_1 = [0.0393, 0.0280, 0.1515]$. In an analogous way $\bar{N}_2 = [-0.0139, -0.0866, 0.0587]$ and $\bar{N}_3 = [-0.1097, 0.0210, 0.0492]$. Since
transforms the basis A into a Cartesian coordinate system A′ in which the axes are the eigenvectors of β. Indeed, according to (2.E.8), β transforms into $\beta' = VG\beta G\bar{V}$. Because of (2.38b)
so that
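The eigenvalues quoted in the thermal-ellipsoid example can be cross-checked numerically. The sketch below rests on one assumption that holds for an orthorhombic cell: the matrix βG is similar to the symmetric matrix with elements $\beta_{ij}a_ia_j$ (with $a_1 = a$, $a_2 = b$, $a_3 = c$), so both have the same eigenvalues. The closed-form trigonometric solution for a symmetric 3 × 3 matrix is used here purely for convenience; it is not the procedure of the text.

```python
import math

def symmetric_eigenvalues(m):
    """Eigenvalues (in descending order) of a real symmetric 3x3 matrix."""
    p1 = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p2 = sum((m[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(m[i][k] - (q if i == k else 0.0)) / p for k in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))   # guard against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return e1, 3.0 * q - e1 - e3, e3

beta = [[0.00906, -0.00049, -0.00102],
        [-0.00049, 0.00401, 0.00038],
        [-0.00102, 0.00038, 0.01424]]
cell = (8.475, 10.742, 5.899)   # a, b, c in angstroms
# Symmetric matrix similar to beta*G for an orthorhombic cell.
m = [[beta[i][k] * cell[i] * cell[k] for k in range(3)] for i in range(3)]
eigenvalues = symmetric_eigenvalues(m)
```

The three values agree with the roots 0.677, 0.483, and 0.448 given in the example to within rounding.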
smallest three non-coplanar translations will be the Buerger cell edges. This
cell is, however, not unique: if it is, then it coincides with the Niggli cell.
For 7 of the 14 Bravais lattices a unique Buerger cell exists, while in a
face centred cubic lattice (see later) two Buerger cells can be found. In
other lattice types up to five types of Buerger cell can be found (the values 4
and 5 occur only in triclinic lattices) according to whether some conditions
on the parameters of the conventional cell are satisfied or not. For example,
the triclinic lattice described by a Buerger cell with
may be described by means of four other Buerger cells having the same a,
b, c values, but with
α = 60°00′, β = 86°24′, γ = 75°31′;
α = 120°00′, β = 93°36′, γ = 100°48′;
α = 117°57′, β = 93°36′, γ = 104°28′;
α = 113°58′, β = 100°48′, γ = 104°28′.
It will be shown later that only the first of the five cells is the Niggli cell.
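If $g_{ij}$ are the elements of the metric matrix, the Niggli cell is defined by the following conditions:[12,13]
1. Positive reduced cell (all the angles <90°). Main conditions:
$g_{11} \le g_{22} \le g_{33}$; $g_{23} \le \tfrac{1}{2}g_{22}$; $g_{13} \le \tfrac{1}{2}g_{11}$; $g_{12} \le \tfrac{1}{2}g_{11}$.
Special conditions:
(a) if $g_{11} = g_{22}$ then $g_{23} \le g_{13}$; if $g_{22} = g_{33}$ then $g_{13} \le g_{12}$;
(b) if $g_{23} = \tfrac{1}{2}g_{22}$ then $g_{12} \le 2g_{13}$; if $g_{13} = \tfrac{1}{2}g_{11}$ then $g_{12} \le 2g_{23}$; if $g_{12} = \tfrac{1}{2}g_{11}$ then $g_{13} \le 2g_{23}$.
2. Negative reduced cell (all the angles ≥90°). Main conditions:
$g_{11} \le g_{22} \le g_{33}$; $|g_{23}| \le \tfrac{1}{2}g_{22}$; $|g_{13}| \le \tfrac{1}{2}g_{11}$; $|g_{12}| \le \tfrac{1}{2}g_{11}$; $(|g_{23}| + |g_{13}| + |g_{12}|) \le \tfrac{1}{2}(g_{11} + g_{22})$.
Special conditions:
(a) if $g_{11} = g_{22}$ then $|g_{23}| \le |g_{13}|$; if $g_{22} = g_{33}$ then $|g_{13}| \le |g_{12}|$;
(b) if $|g_{23}| = \tfrac{1}{2}g_{22}$ then $g_{12} = 0$; if $|g_{13}| = \tfrac{1}{2}g_{11}$ then $g_{12} = 0$; if $|g_{12}| = \tfrac{1}{2}g_{11}$ then $g_{13} = 0$; if $(|g_{23}| + |g_{13}| + |g_{12}|) = \tfrac{1}{2}(g_{11} + g_{22})$ then $g_{11} \le 2|g_{13}| + |g_{12}|$.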
The main conditions define a cell based on the three shortest non-
coplanar vectors. Conditions (a) break down ambiguity when two cell edges
are equal, conditions (b) define the Niggli cell when there is more than one
symmetrically independent Buerger cell.
As an example of systematic ambiguity let us consider the face-centred
cubic lattice with cubic edge a. If we move to the primitive unit cell by
means of the appropriate matrix quoted in Table 2.C.1 we get
$$g'_{11} = g'_{22} = g'_{33} = a^2/2; \qquad g'_{12} = g'_{13} = g'_{23} = a^2/4.$$
If we move to the primitive cell by means of the transformation matrix
1(1/2 112 011 1/2 112 011 0 1/2 11211 then we get
Both the primitive cells are Buerger cells but the second violates the
conditions (a): thus the first is the Niggli reduced cell.
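The metric matrix of the first primitive cell can be verified with exact rational arithmetic. Table 2.C.1 is not reproduced here, so the transformation matrix below is an assumption: it is the commonly used F to P matrix whose rows are (0, 1/2, 1/2), (1/2, 0, 1/2), (1/2, 1/2, 0). Metric elements are expressed in units of $a^2$.

```python
from fractions import Fraction

def metric_matrix(rows, G):
    """Metric matrix M G M^T of the cell whose basis vectors are the rows of M."""
    MG = [[sum(rows[i][k] * G[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[sum(MG[i][k] * rows[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

half = Fraction(1, 2)
G_cubic = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]          # cubic cell, units of a^2
M_primitive = [[0, half, half],                       # assumed F -> P matrix
               [half, 0, half],
               [half, half, 0]]
G_prim = metric_matrix(M_primitive, G_cubic)
```

The diagonal elements come out as 1/2 and the off-diagonal ones as 1/4, i.e. $g_{12} = g_{13} = g_{23} = g_{11}/2$.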
Matrices which derive Niggli cells from Buerger cells are given by Santoro and Mighell. A very efficient algorithm to derive the Niggli cell from any primitive cell is described by Krivy and Gruber.
For any Bravais lattice Niggli defined the algebraic relations that the $g_{ij}$s
of the reduced cell must satisfy. The type of Bravais lattice may be thus
derived from the Niggli cell just by comparing the found with the expected
relations. For example, in a face-centred cubic lattice the gij of the Niggli
cell must satisfy $g_{11} = g_{22} = g_{33}$, $g_{12} = g_{13} = g_{23} = g_{11}/2$.
The use of automatic procedures devoted to identify the Niggli cell may
yield incorrect conclusions as a consequence of errors in the cell parameters
or of rounding errors in the calculations. Some auxiliary procedures recently
suggested by different authors are less sensitive to these error sources.
The final steps from the Niggli cell to the conventional cell may be performed by means of suitable transformation matrices. It would be worthwhile recalling that the lattice symmetry determined via the Niggli cell is only of metric nature, and that it may be equal to or higher than the symmetry of the crystal structure.
Reduced cells may be used:
1. As a useful step for the correct definition of the space group (see also Chapter 3). An advisable sequence may be the following:[18] from the conventional cell to a primitive cell, and then to the Niggli cell; analysis of the lattice symmetry; analysis of Laue symmetry and of systematic extinctions; space group choice.
2. As an effective tool for the identification and characterization of crystalline materials[19] (as an alternative to powder methods in which the identification is based on matching diffraction positions and intensities). An advisable sequence may be: a unit cell is determined, the reduced cell is derived together with derivative supercells and subcells (derivative cells are calculated to overcome possible errors made by the experimentalist). These cells are checked against a suitable file of crystallographic data, as complete as possible (the NBS Crystal Data File handles data of more than 60 000 materials).
It could be asked now if the Niggli cell expresses some geometrical property. Gruber has shown that a cell is a Niggli cell if and only if the following conditions are fulfilled:
(1) a + b + c is a minimum when calculated for all primitive cells of the lattice;
cell
The Niggli cell is obtained from the previous one by application of the
matrix 001/100/111:
a = 8.070 Å, b = 9.562 Å, c = 12.434 Å,
α = 100.97°, β = 106.54°, γ = 110.03°.
This cell satisfies the geometrical properties suggested by Gruber.
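As a complementary check, the metric matrix of this cell can be tested against the main conditions of a negative reduced cell. The inequalities coded below are an assumption: they follow the standard formulation of the Niggli conditions (ordered diagonal elements, non-positive off-diagonal elements, and the usual bounds on their magnitudes), and the tolerance `eps` is a hypothetical parameter introduced to absorb rounding in the cell parameters.

```python
import math

def metric_from_cell(a, b, c, alpha, beta, gamma):
    """Return (g11, g22, g33, g23, g13, g12) from cell lengths and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * a, b * b, c * c, b * c * ca, a * c * cb, a * b * cg

def satisfies_negative_main_conditions(g11, g22, g33, g23, g13, g12, eps=1e-6):
    """Main conditions of a negative reduced cell (assumed standard form)."""
    return (g11 <= g22 + eps and g22 <= g33 + eps
            and g23 <= eps and g13 <= eps and g12 <= eps
            and abs(g23) <= g22 / 2 + eps
            and abs(g13) <= g11 / 2 + eps
            and abs(g12) <= g11 / 2 + eps
            and abs(g23) + abs(g13) + abs(g12) <= (g11 + g22) / 2 + eps)

g = metric_from_cell(8.070, 9.562, 12.434, 100.97, 106.54, 110.03)
is_reduced = satisfies_negative_main_conditions(*g)
```

The last inequality is the tightest one for this cell, which illustrates why small errors in the cell parameters can change the outcome of an automatic reduction.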
Table 2.3.
Coincidence-site lattices
Most materials of technological interest are used in their polycrystalline
form. Their mechanical and chemical properties are controlled to a large
extent by the boundary between crystallites. The energy of a polycrystal is
higher than that of a single crystal with the same mass: the additional energy
is stored in the grain boundary areas, and depends on the orientations of the
neighbouring grains. Thus modern treatments of these materials tend to
optimize the size of the grains and the quality of the grain boundaries.
The mathematical model of the crystalline interfaces is today based on the
properties of coincidence-site lattice (CSL) and related lattices. Consider
two lattices L and L' with bases A and A'. Without loss of generality it will
be assumed that the two lattices have one lattice point in common, taken as
the origin of the coordinate systems. Let N and N' be two matrices with
integer elements. The lattices L and L' will have a common superlattice if a
lattice point of L (defined by NA) can be found which is also a lattice point
of L′ (defined by N′A′): then NA = N′A′, or also
$$A' = X_cA$$
where $X_c = N'^{-1}N$ is a matrix with rational elements. In this case the CSL is defined as that superlattice (in one, two, or three dimensions) of L and L′ which contains all (and only) the lattice points in common to L and L′. Note that several other lattices could be defined having points in common with L and L′ but all of them will be superlattices of the CSL.
To determine the CSL one has to find[22,23] a factorization of $X_c$ of the type $X_c = N'^{-1}N$ with the smallest possible values of N and N′. If $N_0$ and $N'_0$ satisfy this condition then $N_0$ and $N'_0$ indicate the reciprocal fraction of coincidence points (degree of coincidence) in lattices L and L′ respectively, and the CSL basis will be $N_0A = N'_0A'$. If N or N′ are sufficiently small (a large fraction of points of one of the two lattices consists of coincidence sites) and if the boundary coincides with a dense net plane of the CSL, then the boundary energy per unit area will be a minimum.
Analogously, two lattices L and L′ will have a common sublattice if two matrices N and N′ (with integer elements) can be found such that $N^{-1}A = N'^{-1}A'$ or also
$$A' = X_dA,$$
where $X_d = N'N^{-1}$ is a matrix with rational elements. In this case the displacement-shift-complete lattice (DSC) is defined as the sublattice with the largest volume of the primitive cell. All lattices which are sublattices of both L and L′ will be sublattices of the DSC.
The DSC may be determined by means of a factorization similar to that used for $X_c$: again we will look for the smallest possible values of N and N′. If $N_0$ and $N'_0$ satisfy such a factorization process then the DSC basis will be $N_0^{-1}A = N_0'^{-1}A'$, which defines a cell with volume $1/N_0$ times the volume of the cell defined by A and $1/N'_0$ times the volume of the cell defined by A′.
It may be also shown that:
1. The CSL (DSC) of the reciprocal lattices is the reciprocal lattice of the DSC (CSL) of the two lattices.[24]
2. The coarsest lattice which contains all vectors of the form u + u′, where u and u′ are vectors of L and L′ respectively, is the DSC lattice.[25] With
respect to the energy of the grain boundaries DSC lattices have the same
importance as CSL: indeed translations by DSC vectors do not destroy
the coincidence sites. Such vectors are the geometrically possible Burgers
vectors (energy considerations will dictate the most probable of them) of
dislocations in grain boundaries.
If the two lattices L and L' are congruent then one can be transformed
into the other by means of a rotation: this is called coincidence rotation if L
and L' have a CSL in common. The ratio between the volume of the CSL
unit cell and the volume V of the crystal unit cell is called the multiplicity of
the CSL and is denoted by Σ (the analogous ratio for the DSC cell will be 1/Σ). The determination of all possible coincidence orientations with low values of Σ is an important premise for the understanding of grain boundaries. If A′ = XA defines one of the required orientations, then, owing to (2.21),
$$G' = XG\bar{X}. \qquad (2.40)$$
Several attempts have been made to find the general solution of (2.40).
Special methods for the solution of this problem were developed for cubic, hexagonal, and rhombohedral lattices. The problem may be so stated: determine all the rotation angles θ about a given lattice axis [uvw] which generate CSLs. It may be shown that in cubic lattices a CSL is obtained by a rotation θ about an axis [uvw] coincident with a lattice direction if
$$\tan(\theta/2) = (u^2 + v^2 + w^2)^{1/2}\,n/m$$
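A short numerical illustration of coincidence rotations in cubic lattices follows. It assumes that m and n are coprime integers and uses the generating function usually attributed to Ranganathan, Σ = m² + (u² + v² + w²)n² divided by 2 until it is odd; that expression is not shown in the text above and is an assumption of this sketch.

```python
import math

def coincidence_rotation(u, v, w, m, n):
    """Rotation angle (degrees) and multiplicity for a cubic CSL about [uvw].

    Assumes m and n are coprime integers.
    """
    big_n = u * u + v * v + w * w
    theta = 2.0 * math.degrees(math.atan2(n * math.sqrt(big_n), m))
    sigma = m * m + big_n * n * n
    while sigma % 2 == 0:        # keep only the odd part
        sigma //= 2
    return theta, sigma

theta_001, sigma_001 = coincidence_rotation(0, 0, 1, 3, 1)   # about [001]
theta_111, sigma_111 = coincidence_rotation(1, 1, 1, 3, 1)   # about [111]
```

With these inputs the rotation about [001] is the familiar 36.87° orientation of multiplicity 5, and the rotation about [111] is the 60° orientation of multiplicity 3.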
Twins
Twins are regular aggregates consisting of individual crystals of the same species joined together in some definite mutual orientation. There are three principal types of twin: growth twins (produced by accident as the crystal grows from its initial nucleus), deformation twins (considered as a means of relieving the strain induced by some applied stress), and transformation twins (the product of a polymorphic transformation, i.e., when a higher symmetry crystal is cooled and converts to a lower symmetry structure).
From a geometrical point of view a twin is characterized by the symmetry operations which relate one individual to the other individuals in the composite crystal. The operation is very frequently a rotation through π about a zone axis (in this case the axis is the twin axis and the twin is a rotation twin), or a reflection in a lattice plane called the twin plane (the twin is then a reflection twin). Rotation twins of π/3, π/2, 2π/3 also occur but are less common.
Fig. 2.5. (a): CSL lattice and CSL unit cell for a rotation of a cubic lattice about [001]. (b): CSL and DSC lattices for the cubic (111) plane.
Obviously diad, tetrad, or hexad axes cannot be considered twin axes (at least for rotation through π). If a triad is a twin axis the twinning operation may be equivalently described as a π/3, π, or 5π/3 rotation about the axis: conventionally the π rotation is preferred.
A twin is called a contact twin if the two components are joined in a plane (known as the composition plane). In the case of a rotation twin the composition plane is parallel to the twin axis, in reflection twins the composition plane is parallel to the twin plane. In interpenetrating twins the twin components intergrow so as to generate an irregular interface between components.
Multiple twins consist of three or more components. If the twinning operations relating adjacent components are all identical then the twins are known as lamellar or polysynthetic twins (the components have a lamellar form parallel to the composition plane). Polysynthetic twins may be on a microscopic or macroscopic scale.
Supplementary information on the most common types of twin and some
Twins of special interest are TLS twins with n = 1. They were called by Friedel twins by merohedry since the crystal symmetry is a merohedry of order n (subgroup of order n) of the symmetry of its lattice. Accordingly, merohedral twins have one or more symmetry operations which are present in the lattice and not in the crystal. In order to explain their diffraction behaviour, they may be divided into two classes:[35]
Twins in class I show the same crystal Laue symmetry as the lattice symmetry. Then the twin operation belongs to the Laue symmetry of the crystal: in these conditions the set of intensities collected from the twin coincides, except for anomalous scattering, with that which would be measured on a single crystal. Structure determination is therefore not hindered but the determination of the absolute configuration (using the methods described on p. 97) is impossible.
Twins in class II are characterized by the fact that the Laue symmetry of the crystal is lower than the crystal lattice symmetry. Then at least one of the twin operations belongs to the lattice symmetry but not to the Laue symmetry of the crystal. Twins by hemihedry, tetartohedry, and ogdohedry can be found: they are made by two, four, and eight crystals respectively.
A scheme for twin classification is drawn in Fig. 2.6. In Fig. 2.7 some examples of twinning are collected.[36]
In Fig. 2.7(a) the projection along the b axis of a monoclinic lattice with β ≈ 90° is shown, together with its twinned lattice (the assumed twin operation is the mirror plane m perpendicular to a, but we could also choose the mirror plane perpendicular to c). The ω misfit is intentionally exaggerated.
Fig. 2.6. A scheme for the classification of twins.
In Fig. 2.7(b) the projection along the b axis of the monoclinic lattice of L-aspartic acid with a = 7.617, b = 6.982, c = 5.142 Å, β = 99.84° is shown: the lattice is also shown after a two-fold rotation about the a* axis. It is easily seen that 2a of the original lattice nearly coincides with 2a − c of the rotated lattice, while a* is the common reciprocal lattice of the two lattices. The twin lattice unit cell is defined by a′ = 2a, c′ = c, b′ = b, β′ = β (but also a B-centred orthorhombic cell may be chosen, four times larger than the original cell).
In Fig. 2.7(c) the projection of a hexagonal lattice along the c axis is shown. If the space group of the crystal is supposed to be R3, a TLS twin of class II may be generated by reflection with respect to the plane m drawn in the figure. The diffraction pattern will show then R3m symmetry.
In Fig. 2.7(d) the classical penetration twin of fluorite (CaF₂) is described. Two cubic lattices, related by the twin operation (111) mirror plane, are viewed along the direction [1̄10]. The twin lattice has a volume three times the volume of the original cell.
An elegant derivation of twin laws by merohedry has been recently proposed.[37] Denote by H and G the point-group symmetry of the crystal and the point group of its lattice respectively (G may be obtained by the process of cell reduction). Since H is a subgroup of G, the coset decomposition of G with respect to H may be made (see Appendix 1.E). Any system of $g_i$ operations ($g_i \in G$, $g_i \notin H$) used for the (left) coset decomposition will lead to the superposition of the lattice onto itself, and therefore will contain the possible merohedral twin laws for a crystal of point symmetry H in a lattice of point symmetry G.
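For example, α-quartz crystallizes in P3₁21 with a = 4.913 and c = 5.404 Å. The crystal point group is 321 and the metric symmetry is 6/mmm: thus
$$H = \{1;\ 2_{[110]};\ 2_{[100]};\ 2_{[010]};\ 3_{[001]};\ 3^{-1}_{[001]}\}$$
$$G = \{H;\ 2_{[001]};\ 2_{[\bar{1}10]};\ 2_{[120]};\ 2_{[210]};\ 6_{[001]};\ 6^{-1}_{[001]};\ \bar{1};\ \bar{2}_{[001]};\ \bar{2}_{[\bar{1}10]};\ \bar{2}_{[120]};\ \bar{2}_{[210]};\ \bar{3}_{[001]};\ \bar{3}^{-1}_{[001]};\ \bar{6}_{[001]};\ \bar{6}^{-1}_{[001]};\ \bar{2}_{[110]};\ \bar{2}_{[100]};\ \bar{2}_{[010]}\}.$$
The coset decomposition is therefore
$$G = H \cup (2_{[001]}H) \cup (\bar{1}H) \cup (\bar{2}_{[001]}H).$$
It may be seen that $2_{[001]}$, $\bar{1}$, and $\bar{2}_{[001]}$ correspond to the classical twin laws for Dauphiné, Brazil, and combined twinning respectively. The twin-related reflections are therefore $(hkl)$, $(\bar{h}\bar{k}l)$, $(\bar{h}\bar{k}\bar{l})$, $(hk\bar{l})$.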
The same procedure, applied to a crystal with point group H = 4/m, will decompose the metric symmetry group G = 4/mmm into
$$G = H \cup (2_{[010]}H).$$
Hence the twin-related reflections are $(hkl)$ and $(\bar{h}kl)$.
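The coset decomposition for the 4/m example can be reproduced with a few lines of code. The sketch below is an illustration, not the published procedure: the groups are generated by closure from 3 × 3 integer matrices referred to the tetragonal axes, and the function names are hypothetical.

```python
def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def apply(R, hkl):
    return tuple(sum(R[i][k] * hkl[k] for k in range(3)) for i in range(3))

def generate_group(generators):
    """Closure of a finite set of generators under multiplication."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            new = matmul(s, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def left_cosets(G, H):
    """List of (representative, coset) pairs of the decomposition of G by H."""
    remaining, cosets = set(G), []
    while remaining:
        g = min(remaining)                      # deterministic representative
        coset = {matmul(g, h) for h in H}
        cosets.append((g, coset))
        remaining -= coset
    return cosets

four_z = ((0, -1, 0), (1, 0, 0), (0, 0, 1))     # fourfold axis along [001]
inversion = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
two_y = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))     # twofold axis along [010]

H = generate_group([four_z, inversion])          # point group 4/m
G = generate_group([four_z, inversion, two_y])   # point group 4/mmm
cosets = left_cosets(G, H)
twin_cosets = [c for _, c in cosets if c != H]
```

Every operation in the single non-trivial coset is a valid representative of the twin law; applying any of them to (hkl) gives an index triple that is Laue-equivalent to $(\bar{h}kl)$.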
In the case of hemihedry (twins in class II, two individuals) two reflections which are not equivalent by Laue symmetry contribute to a twin reflection:[38] then
$$|F_{t1}|^2 = \alpha|F_H|^2 + (1 - \alpha)|F_K|^2$$
$$|F_{t2}|^2 = (1 - \alpha)|F_H|^2 + \alpha|F_K|^2$$
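When the twin fraction α is known and differs from 1/2, the two linear relations can be inverted to recover the single-crystal intensities. The sketch below is plain algebra on the pair of equations above; the function name and the numerical values are hypothetical.

```python
def detwin(i_t1, i_t2, alpha):
    """Recover |F_H|^2 and |F_K|^2 from the two twinned intensities.

    i_t1 = alpha*|F_H|^2 + (1 - alpha)*|F_K|^2
    i_t2 = (1 - alpha)*|F_H|^2 + alpha*|F_K|^2
    """
    denom = 1.0 - 2.0 * alpha
    if abs(denom) < 1e-8:
        raise ValueError("alpha = 1/2: the two equations are not independent")
    fh2 = ((1.0 - alpha) * i_t2 - alpha * i_t1) / denom
    fk2 = ((1.0 - alpha) * i_t1 - alpha * i_t2) / denom
    return fh2, fk2

alpha = 0.3
i_t1 = alpha * 100.0 + (1.0 - alpha) * 20.0
i_t2 = (1.0 - alpha) * 100.0 + alpha * 20.0
fh2, fk2 = detwin(i_t1, i_t2, alpha)
```

As α approaches 1/2 the denominator vanishes and experimental errors are strongly amplified, so the inversion is meaningful only for twin fractions well away from that value.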
where
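$$A_H = \sum_{j=1}^{t} n_jf_{0j}(H)\sum_{s=1}^{m}\exp(-\bar{H}\beta_{js}H)\cos 2\pi\bar{H}(R_sX_j + T_s) = \sum_{j=1}^{t} A_j \qquad (2.42a)$$
$$B_H = \sum_{j=1}^{t} n_jf_{0j}(H)\sum_{s=1}^{m}\exp(-\bar{H}\beta_{js}H)\sin 2\pi\bar{H}(R_sX_j + T_s) = \sum_{j=1}^{t} B_j. \qquad (2.42b)$$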
$A_j$ and $B_j$ are the contributions of the jth atom and of its symmetry equivalents to $A_H$ and $B_H$ respectively, $\beta_{js}$ is the 3 × 3 temperature factor matrix for the atom j in symmetry position s, $n_j$ is the occupation number of atom j, defined as $m_j/m$, where $m_j$ is the number of different atomic positions which are symmetry equivalent to the jth atom. Accordingly, $n_j = 1$ for an atom in a general position, $n_j < 1$ for atoms in special positions (the use of $n_j$ allows the summation over s to be always extended from 1 to m, independently of the atomic site type). If the jth site is only partially occupied because of some statistical disorder then $n_j$ will be proportionally reduced.
The calculation of (2.42) will be simpler if, for a given H, the symmetry equivalent indices $\bar{H}_s = \bar{H}R_s$, s = 1, ..., m are calculated. In this case (see eqn (3.36)) $\bar{H}R_sX_j$ may be replaced by $\bar{H}_sX_j$ and (see eqn (2.E.8)) $\bar{H}\beta_{js}H$ by $\bar{H}_s\beta_jH_s$.
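$$A_H = \sum_{j=1}^{t}\Big[n_jf_{0j}(H)\sum_{s=1}^{m}\exp(-\bar{H}_s\beta_jH_s)\cos 2\pi(\bar{H}_sX_j + \bar{H}T_s)\Big]$$
$$B_H = \sum_{j=1}^{t}\Big[n_jf_{0j}(H)\sum_{s=1}^{m}\exp(-\bar{H}_s\beta_jH_s)\sin 2\pi(\bar{H}_sX_j + \bar{H}T_s)\Big]$$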
For each j the maximum number of $u_{js}$ (and $v_{js}$) to calculate is 24. Indeed, if the space group is centrosymmetric (origin on a centre of symmetry), s may vary only over the symmetry matrices not related by the inversion centre; $A_H$ is then multiplied by 2 and $B_H$ is set to zero. For space groups with a centred unit cell s may vary only over the matrices not related by non-primitive lattice translations: $A_H$ and $B_H$ are then multiplied by the centring order of the cell.
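The double summation over atoms and symmetry operators can be sketched as follows. To keep the example short it assumes point-like atoms with a constant scattering factor and no thermal motion; these simplifications, the data layout, and the function name are assumptions of the sketch and not part of the formulas above.

```python
import math

def structure_factor(hkl, atoms, symops):
    """Return (A_H, B_H) for the reflection hkl.

    atoms:  list of (f0, occupancy, (x, y, z)) for the asymmetric unit
    symops: list of (R, T), R a 3x3 matrix and T a translation vector
    """
    A = B = 0.0
    for f0, occupancy, x in atoms:
        for R, T in symops:
            # Symmetry-equivalent position R x + T.
            xs = [sum(R[i][k] * x[k] for k in range(3)) + T[i] for i in range(3)]
            arg = 2.0 * math.pi * sum(h * c for h, c in zip(hkl, xs))
            A += occupancy * f0 * math.cos(arg)
            B += occupancy * f0 * math.sin(arg)
    return A, B

identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
inversion = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
p1bar = [(identity, (0.0, 0.0, 0.0)), (inversion, (0.0, 0.0, 0.0))]
atoms = [(6.0, 1.0, (0.1, 0.2, 0.3))]
A_H, B_H = structure_factor((1, 0, 0), atoms, p1bar)
```

With the origin on the inversion centre the sine terms cancel in pairs, so $B_H$ vanishes and $A_H$ is twice the contribution of the independent operator, as stated in the text.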
Scattering factors $f_{0j}$ have been tabulated[42] for all elements: their accuracy depends on the wave functions and on the numerical methods used. The values of $f_{0j}$ at the actual sin θ/λ value may be obtained from the tables by interpolation. A more usual procedure is to approximate scattering factors by the sum of one or more Gaussian functions: for accurate structure factor calculations four Gaussians are used according to[43]
$$f_0(\theta) = \sum_{i=1}^{4} a_i\exp[-b_i\sin^2\theta/\lambda^2] + c.$$
It should be noted that only nine parameters have to be stored for each element.
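The nine-parameter approximation is simple to evaluate. In the sketch below the coefficients are illustrative values of the kind tabulated for neutral carbon; they are quoted here as an assumption and should be checked against a published table before any real use.

```python
import math

def scattering_factor(s, a, b, c):
    """Four-Gaussian approximation of f0 at s = sin(theta)/lambda (in 1/angstrom)."""
    return sum(ai * math.exp(-bi * s * s) for ai, bi in zip(a, b)) + c

# Illustrative coefficients for carbon (assumed, verify before use).
CARBON_A = (2.3100, 1.0200, 1.5886, 0.8650)
CARBON_B = (20.8439, 10.2075, 0.5687, 51.6512)
CARBON_C = 0.2156

f_at_zero = scattering_factor(0.0, CARBON_A, CARBON_B, CARBON_C)
f_at_half = scattering_factor(0.5, CARBON_A, CARBON_B, CARBON_C)
```

At s = 0 the sum of the $a_i$ and c must approach the number of electrons of the atom, which gives a quick sanity check on any set of coefficients.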
where the prime to the summation implies that only half of the reflections
(0, k, 1) have to be considered.
The calculations may be performed in a trivial fashion starting from the
list of symmetry independent Fhkl, generating symmetry equivalents, and
evaluating the sum in (2.44) for every X. The crystal symmetry may be more
conveniently exploited by combining in advance the terms containing the
symmetrical structure factors, thus obtaining an expression valid for that
given symmetry. The summations in (2.44) are then limited to the set of
independent Fhklvalues. For example, in Pmmm
Then
p(x, y) Z) =
1
Vh=0
x
"
several approaches for answering this problem. Because of its wide use in
crystallography we will mostly be interested in the method of least squares.
Alternative approaches are briefly mentioned on pp. 108-9.
Suppose that a set of n experimental observations
$$\bar{F} \equiv (f_1, f_2, \ldots, f_n)$$
is available for which:
(1) $f_i$ is subjected to some random error $e_i$ due to the finite precision of the measurement process;
(2) $f_i$ is known to linearly depend on a set of m ≤ n parameters
$$\bar{X} \equiv (x_1, x_2, \ldots, x_m).$$
Then the observational equations may be written[45] as
$$S = \sum_{i=1}^{n} v_i^2 = \bar{V}V = \text{minimum}$$
where $w_i = 1/\sigma_i^2$.
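A compact numerical illustration of weighted linear least squares follows. It assumes the standard normal-equation form, in which the design matrix A, the weights $w_i = 1/\sigma_i^2$, and the observations f give $(\bar{A}WA)X = \bar{A}Wf$; this form and the helper names are assumptions of the sketch, since the corresponding equations are not legible in the text above.

```python
def solve(B, D):
    """Solve B x = D by Gaussian elimination with partial pivoting."""
    n = len(D)
    M = [list(row) + [d] for row, d in zip(B, D)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def weighted_least_squares(A, f, sigma):
    """Parameters minimizing sum_i w_i (f_i - sum_j A_ij x_j)^2 with w_i = 1/sigma_i^2."""
    n, m = len(A), len(A[0])
    w = [1.0 / (s * s) for s in sigma]
    B = [[sum(w[i] * A[i][j] * A[i][k] for i in range(n)) for k in range(m)]
         for j in range(m)]
    D = [sum(w[i] * A[i][j] * f[i] for i in range(n)) for j in range(m)]
    return solve(B, D)

# Straight line f = x1 + x2 * t sampled at t = 0, 1, 2, 3.
design = [[1.0, float(t)] for t in range(4)]
observations = [1.0 + 2.0 * t for t in range(4)]
parameters = weighted_least_squares(design, observations, [1.0, 1.0, 2.0, 2.0])
```

Because the observations in this example are error free, the fitted parameters reproduce the intercept and slope exactly whatever weights are chosen; with noisy data the weights decide how strongly each observation pulls on the solution.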
In matrix notation
where
$|F_{H_i}|_c$ is the modulus of the ith structure factor calculated in $X^0$; its derivatives are also calculated in $X^0$. Thus eqn (2.66) is again obtained.
The normal equations may be obtained by setting to zero the derivatives of S with respect to ΔX:
$$\frac{\partial S}{\partial\Delta x_j} = 0 \quad \text{for } j = 1, \ldots, m.$$
In matrix form (see eqns (2.53) and (2.54))
$$B\,\Delta X = D$$
or, more explicitly,
where
$$\frac{\partial|F_H|_c}{\partial x_{ji}} = \cos\varphi_H\frac{\partial A_j}{\partial x_{ji}} + \sin\varphi_H\frac{\partial B_j}{\partial x_{ji}}$$
where $A_j$ and $B_j$ are the contributions of the jth atom to $A_H$ and $B_H$ respectively.
Consider the various cases:
1. $x_{ji}$ is an atomic coordinate: then
mx = &B;'=-B;' 3
=
H
K1>
($) n-m
where $B_f = \bar{A}N_f^{-1}A$, from which variance and covariance values for the parameters may be calculated. In particular the variance of the estimated parameters is given by
$\rho_{ij}$ can range from 0 to ±1: as a rule, values $|\rho_{ij}| \le 0.2$-0.3 are frequent; $\rho_{ij} = \pm 1$ refers to two completely dependent parameters, one of which has to be eliminated.
As already stressed, owing to the presence of systematic errors very often
the working matrix $N_f$ experimentally available is related to $M_f$ by an
unpredictable relation much more complex than a scaling factor. In these
cases the technique of multiplying the working variance-covariance matrix
B;' by $I($) in order to obtain M, may be highly questionable. A report[511
of the International Union of Crystallography Subcommittee suggests that
besides indices R or R, as given in eqns (5.3) and (5.85) the goodness of fit
ratio $I($) (see also eqn (5.86)) should be also reported in publications as
a global measure of fit.
(Cpu) and the storage (St) needed for a 'structure factor calculation-least-
squares refinement' cycle will comply with the following table:
Step Cpu St
In accordance with the above table, computing time and storage rapidly
increase with the complexity of the structure, so that the task soon becomes
prohibitive even for large and fast computers when large-scale problems
(thousands of parameters) are treated. A useful suggestion arises by
observing that the elements on the principal diagonal of B are sums of
squares, so that they are always positive and rather large. On the contrary,
the off-diagonal elements are sums of products which may be positive or
negative; therefore they are generally expected to be smaller than diagonal
elements. Accordingly computer storage and computing time may be
reduced by setting all off-diagonal elements to zero (diagonal-least-squares
approximation). That is equivalent to assuming a complete statistical
independence among the parameters, but this is often unrealistic. For
example:
(1) errors in the thermal parameters generate a systematic error (larger at high sin θ/λ) on $|F|_c$, which in its turn produces a bias in the estimate of the scale factor K;
(2) in oblique coordinate systems, non-negligible correlations between some coordinates of the same atom may be found. Indeed an error on a coordinate is compensated by errors on some other coordinates (see Fig. 2.10);
(3) a high correlation will be found among the site occupancy and temperature factors if they are contemporaneously refined.
Fig. 2.10. P(x₀, y₀) is the true atomic position. If an error Δy is introduced, the 'best' value for x is obtained by minimizing, along the line y₀ + Δy = const, the distance of the atom from the true position P. That produces Δx = −Δy cos γ.
An alternative to the diagonal approximation is the block-diagonal
approximation; a first block may involve the correlation between the scale
and the overall thermal parameter; the other blocks, one for each atom, are
9 x 9 matrices comprising positional and anisotropic temperature factors (a
4 x 4 matrix for an isotropic atom). The matrix B will then appear as in Fig.
2.11. Larger blocks are sometimes used; e.g. all the atoms in the same
molecule could belong to the same block; or, also, a block for all positional
parameters and a block for all vibrational parameters with the overall scale
factor; or, . . . .
The storage requirement for the block diagonal approximation is certainly
smaller than for full matrix methods. Each refinement cycle is faster, but
convergence is slower: so the complete refinement process requires more
cycles and almost the same computing time. Thus the major advantage of
the method is the lessened storage requirements for problems too large to
be treated by a full matrix.
Fig. 2.11. A scheme for block-diagonal approximation.
A very large saving of computer time has been recently achieved by
from the two different models. In practice T = R(1)/R(2) is often used because usually $R_w(1)/R_w(2) \simeq R(1)/R(2)$. The test compares T with the function
for small molecules) there are about 30 observations for each coordinate
being refined, at 2.7 A the number of parameters is nearly equal to the
number of the observations.
2. Atoms with very large atomic number and very light atoms coexist in the
unit cell. Then modest errors on the heavy-atom parameters cause strong
errors on the light-atoms parameters.
3. Too high thermal motion, presence of structural disorder, etc. Only poor
and scarce data are then available.
If, however, some prior stereochemical information is available on parts
of the structure its use in the least-squares procedures may increase the
degree of overdetermination of the system and improve the accuracy of the
results. Three methods will be briefly recalled here.
The atomic coordinates so obtained may be used in the usual manner for
calculating structure factors. The problem is now to calculate the appropri-
ate contribution for the rigid parameters x,, yo, zo, w, v, q5 to the matrix of
normal equations.
Derivatives with respect to such parameters are calculated from those for the atomic parameters $X_j$ using the chain rule: i.e.
$$\frac{\partial|F_H|_c}{\partial x_0} = \sum_{j=1}\frac{\partial|F_H|_c}{\partial x_j},$$
represent the fixed constraints. Usually the available model $X^0$ will not exactly satisfy (2.69). The problem will be linearized by expanding both the $|F_H|_c$'s and the $G_i$'s in Taylor series:
Use of restraints
Soft, flexible constraints (say restraints) may be imposed on some functions of the parameters in order to permit only realistic deviations of their values from fixed standard ones. These functions are used as supplementary observations, so that the order of the normal equation matrix is neither increased nor reduced. In this situation the function to minimize is
where $g_q$ is the function describing the qth restraint, $g_{0q}$ is its standard (or optimal) value, $w_q$ is the weight to associate to the qth restraint. The normal equations are obtained by expanding S in Taylor series and equalling to zero its derivatives (with respect to ΔX):
where
and so on.
Another restraint may be: the sum of all the atomic coordinates along a
polar direction can be fixed to its current value in order to keep fixed the
centre of gravity of the molecule and so determine the origin. Several other
types of restraints can be imposed (see Chapter 8, p. 564) and they concern
van der Waals distances, planarity of groups, chirality (a restraint on the
chiral volume about an asymmetric carbon atom may maintain the
conformation in the correct hand), bond and torsion angles, thermal
$$P(e_i) = (2\pi\sigma_i^2)^{-1/2}\exp[-e_i^2/(2\sigma_i^2)]$$
Since the second and third terms are constant, In L has its maximum
when
Rietveld refinement
The basis of the technique
Powder diffraction patterns (see p. 293) may be collected in a step scan mode: intensity is measured for a given interval of time, and the theta and two-theta axes are then stepped to the next position. The pattern is then indexed: i.e. appropriate Miller indices are associated with observed reflections and simultaneously accurate unit-cell dimensions are calculated. Because of unavoidable experimental errors in the estimates of the diffraction angles and because of the frequent overlapping of peak intensities, indexing is a rather difficult task for relatively large cell volumes and/or low-symmetry crystals. Several approaches are today available for this aim: implemented in computer programs, they often provide more than one solution, so that proper figures of merit can be used to distinguish between bad and good solutions. The reader will find general remarks on the various approaches, and tests on their efficiency, in a paper by Shirley, and in some more recent papers. This stage of analysis is followed by the examination of possible systematic absences to suggest a space group.
If a (even imperfect) structural model is available then the intensity $y_{io}$ observed at the ith step may be compared with the corresponding intensity
$$S = \sum_i w_i|y_{io} - y_{ic}|^2$$
where $w_i$, given by
$$(w_i)^{-1} = \sigma_i^2 = \sigma_{ip}^2 + \sigma_{ib}^2,$$
is a suitable weight. $\sigma_{ip}$ is the standard deviation associated with the peak (usually based on counting statistics) and $\sigma_{ib}$ is that associated with the background intensity $y_{ib}$.
yic is the sum of the contributions from neighbouring Bragg reflections
and from the background:
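$$\frac{C_0^{1/2}}{H_k\pi^{1/2}}\exp(-C_0X_{ik}^2) \quad \text{(Gaussian)};$$
$$\frac{C_1^{1/2}}{\pi H_k}(1 + C_1X_{ik}^2)^{-1} \quad \text{(Lorentzian)};$$
$$\frac{2C_2^{1/2}}{\pi H_k}(1 + C_2X_{ik}^2)^{-2} \quad \text{(modif. 1 Lorentzian)};$$
$$\frac{C_3^{1/2}}{2H_k}(1 + C_3X_{ik}^2)^{-3/2} \quad \text{(modif. 2 Lorentzian)};$$
$$\eta L + (1 - \eta)G \quad \text{with } 0 \le \eta \le 1 \quad \text{(pseudo-Voigt)};$$
$$\frac{\Gamma(\beta)}{\Gamma(\beta - 0.5)}\,\frac{2C_4^{1/2}}{\pi^{1/2}H_k}(1 + 4C_4X_{ik}^2)^{-\beta} \quad \text{(Pearson VII)};$$
where $C_0 = 4\ln 2$, $C_1 = 4$, $C_2 = 4(2^{1/2} - 1)$, $C_3 = 4(2^{2/3} - 1)$, $C_4 = 2^{1/\beta} - 1$, $X_{ik} = \Delta\theta_{ik}/H_k$. $H_k$ is the full-width at half-maximum (FWHM) of the kth Bragg reflection, and Γ is the gamma function.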
It is easily seen that the pseudo-Voigt function presents the mixing parameter η which gives the per cent Lorentzian character of the profile. When β = 1, 2, ∞, Pearson VII becomes a Lorentzian, modified Lorentzian, and Gaussian function respectively. Of some use also is the pure Voigt function which is the convolution of Gaussian and Lorentzian forms.
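The Gaussian, Lorentzian, and pseudo-Voigt profiles can be coded directly from their normalized forms. The sketch below assumes unit-area profiles written in terms of the offset `dx` from the peak position and the full width at half maximum `H`, with the constants 4 ln 2 and 4; the function names are illustrative.

```python
import math

C0 = 4.0 * math.log(2.0)
C1 = 4.0

def gaussian(dx, H):
    """Unit-area Gaussian with full width at half maximum H."""
    X = dx / H
    return math.sqrt(C0) / (H * math.sqrt(math.pi)) * math.exp(-C0 * X * X)

def lorentzian(dx, H):
    """Unit-area Lorentzian with full width at half maximum H."""
    X = dx / H
    return math.sqrt(C1) / (math.pi * H) / (1.0 + C1 * X * X)

def pseudo_voigt(dx, H, eta):
    """Linear mix: eta is the Lorentzian fraction, 0 <= eta <= 1."""
    return eta * lorentzian(dx, H) + (1.0 - eta) * gaussian(dx, H)

H = 0.2
ratio = pseudo_voigt(H / 2.0, H, 0.4) / pseudo_voigt(0.0, H, 0.4)
```

Since both components share the same width, the mixed profile also falls to half of its maximum at a distance H/2 from the peak, whatever the value of η.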
The FWHM is usually considered to vary with scattering angle according to
$$(\mathrm{FWHM})_G = (U\tan^2\theta + V\tan\theta + W)^{1/2} \qquad (2.71)$$
for the Gaussian component,[78] according to
$$(\mathrm{FWHM})_L = X\tan\theta + Y/\cos\theta \qquad (2.72)$$
for the Lorentzian component.[79] U, V, W, and/or X, Y are variable parameters in the profile refinement.
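Both width functions are easily evaluated once the profile parameters are known. In this sketch the input is the detector angle 2θ in degrees, and the numerical values of U, V, W, X, Y used in the example are hypothetical, chosen only to exercise the formulas.

```python
import math

def fwhm_gaussian(two_theta_deg, U, V, W):
    """Gaussian width: (U tan^2(theta) + V tan(theta) + W)^(1/2)."""
    t = math.tan(math.radians(two_theta_deg / 2.0))
    return math.sqrt(U * t * t + V * t + W)

def fwhm_lorentzian(two_theta_deg, X, Y):
    """Lorentzian width: X tan(theta) + Y / cos(theta)."""
    theta = math.radians(two_theta_deg / 2.0)
    return X * math.tan(theta) + Y / math.cos(theta)

# Hypothetical profile parameters.
width_g = fwhm_gaussian(90.0, 0.02, -0.01, 0.01)
width_l = fwhm_lorentzian(90.0, 0.1, 0.05)
```

The quantity under the square root can become negative for unfortunate combinations of U, V, and W, so a refinement program would normally guard against that case.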
Besides analytical functions, non-analytical functions arising from an
analysis of resolved peaks may also be used[80] to describe peak shape (in
the Rietveld method the peak shape is not the end but a tool of the
method).
There is no well established approach to the background. It is mainly due
to insufficient shielding, to diffuse scattering, to incoherent scattering
(rather high for neutrons), and to electronic noise of the detector system. The
background and its variation with angle is usually defined by refinement of
the coefficients of a power series in $2\theta$:
where $y_{io}$ and $y_{ic}$ are the observed and calculated values at the position $2\theta_i$.
For good experimental data the R value is expected to be around one per
cent.
The values of the profile parameters may be refined for each single peak
collected from the standard: each peak is separated and analysed in a region
with enough points to allow a good sampling of background on each side of
the peak (a parameter for background may also be refined). In particular,
the values of FWHM obtained from refinement are used for a first estimate
of the U, V, W values in eqn (2.71) and/or of X, Y values in eqn (2.72). It
is worthwhile mentioning that standard specimen data are also used for the
determination of wavelength $\lambda$ (if necessary) and for the zero-point
calibration $2\theta_0$ of the detector scale.
Analysis of the specimen profile f usually follows instrument profile
or of trigonometric type[96]
The function $g_1$ depends on the projected focal spot profile, $g_2$ is due to the
varying displacements of the various parts of the flat specimen surface from
the focusing circle, $g_3$ is due to the axial divergence (as regulated by Soller
slit collimators), $g_4$ arises from specimen transparency (i.e. from the
instrument.
5. Resolution of the pattern may be improved by mathematical
techniques, i.e. by deconvolution of the pure from the observed
profile, this last containing the effect of instrumental broadening (see
Appendix 3.A, p. 185).
Fig. 2.13. Functions defining the instrument function g.
The resolution of the pattern as well as the signal-to-noise ratio can be
appreciably improved with carefully designed diffraction experiments. When
a non-conventional diffraction apparatus is used one of the first choices to
be made is fixing the wavelength. Complex structures will produce peaks
closely spaced and frequently overlapping: a long wavelength improves
separation between peaks but reduces the number of accessible reflections
which cannot then offset the large number of structural parameters to
refine. A short wavelength increases the number of accessible peaks but
produces severe overlap of them. Thus a useful compromise has to be
chosen between the two requirements.[104] When higher resolution is needed
for conventional X-ray sources it is advisable to use the Kg doublet (but the
total experiment time will increase).
In order to compare the performances of different experimental
arrangements, in Fig. 2.14 typical FWHM (instrument-only contribution) are
plotted against the diffraction angle for a modern neutron diffractometer (N),
for a conventional Bragg-Brentano with (XCS) and without (XCW)
diffracted beam Soller slit (Cu $K\alpha$ radiation and a diffracted-beam curved
graphite monochromator) limiting vertical beam divergence to less than the
standard value of 5°, and for a synchrotron powder diffractometer (S). The
use of incident beam monochromators can further improve performances of
conventional divergent beam diffractometry.[105] Indeed, the $K\alpha_2$ component
may be removed, so halving the number of lines in the pattern. As a
consequence, the resolution of the remaining lines is improved, the
[Fig. 2.14. Typical instrumental FWHM versus 2θ (deg.) for the arrangements N, XCW, XCS, and S.]
The width of peaks is rather large for polymers, so that overlapping
events in their pattern are more frequent. In spite of that, discrimination
among various structural models may often be accomplished provided prior
information (size and shape of rigid molecular fragments) is adequately
introduced into refinement.
for which the internal molecular motion may be negligible and the thermal
motion may be described in terms of a rigid body model.[114-116]
In the crystallographic least-squares procedures (see p. 94) any
correlation among $\beta$ tensors of different atoms is usually ignored. However, if
thermal ellipsoids correctly represent the thermal motion some a posteriori
correlations among them can be found. For example, since bond stretching
vibrations have a much smaller amplitude than other sources (i.e. bond
bending or torsional vibrations) the mean square displacement of pairs of
bonded atoms should be approximately equal in the bond direction. Or
also, in long-chain molecules the thermal motion normal to the chain should
be greater than that along it; or, terminal atoms such as the O atoms
in carbonyl groups or H atoms in methyl groups have generally greater
thermal motion than the atoms to which they are bonded. Thus if the crystal
contains more-or-less rigid groups of atoms, it makes sense to analyse
thermal motion in terms of translational and librational oscillations of these
groups. In the following, such an analysis will be described on assuming a
Cartesian coordinate system.
In accordance with Appendix 1.A the most general motion of a rigid
body is a screw rotation. If the axis of rotation is correctly oriented but
incorrectly positioned, the rotation and the translation component parallel
to the axis do not vary, but additional translation components perpendicular
to the rotation axis are introduced. Thus the most general motion of a rigid
body may be considered as the combination of a rotation with a suitable
translation. These operations do not commute in general; luckily an
adequate treatment of anisotropic thermal motion, accurate to the level of
quadratic approximation, can be based upon infinitesimal rotations which do
commute.
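
The statement that finite rotations do not commute while infinitesimal ones do (to first order) can be checked numerically. The sketch below is illustrative only: it assumes a Cartesian system, builds rotation matrices with the Rodrigues formula (the names `rotation` and `commutator_norm` are hypothetical), and measures the size of AB − BA for rotations about x and y through the same angle χ.

```python
import numpy as np

def rotation(axis, chi):
    """Rotation matrix for angle chi (radians) about a unit vector (Rodrigues formula)."""
    l = np.asarray(axis, dtype=float)
    l = l / np.linalg.norm(l)
    K = np.array([[0.0, -l[2], l[1]],
                  [l[2], 0.0, -l[0]],
                  [-l[1], l[0], 0.0]])   # cross-product matrix of l
    return np.eye(3) + np.sin(chi) * K + (1.0 - np.cos(chi)) * (K @ K)

def commutator_norm(chi):
    """Frobenius norm of AB - BA for rotations of chi about x and about y."""
    A = rotation([1, 0, 0], chi)
    B = rotation([0, 1, 0], chi)
    return np.linalg.norm(A @ B - B @ A)

for chi in (0.5, 0.05, 0.005):
    print(chi, commutator_norm(chi))   # decreases roughly as chi**2
```

The commutator shrinks as the square of the angle, so it is negligible at the first-order level used for small librations.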
To illustrate this representation of rotation, consider an atom at X =
$(x_1, x_2, x_3)$ in a Cartesian coordinate system E. According to p. 71, in such
a system, a rotation through $\chi$ about the unit vector $\mathbf{l}$ ($l_1$, $l_2$, $l_3$ will be its
direction cosines) is represented by relation (2.32b). If $\chi$ is sufficiently small
(2.32b) may be written as
S + D provided
AD+DA=O
is satisfied. Indeed, if S in (2.77) is replaced by S +D
from which
and
where Xo represents now the direct lattice components of the vector do.
3. The rigid body model. Let $\mathbf{n}$ be the unit vector about which a group of
atoms oscillates with mean square amplitude $\langle\omega^2\rangle$. A small rotation $d\omega$
about $\mathbf{n}$ will produce over the interatomic vector $\mathbf{d}$ the variation
$\delta\mathbf{d} = (\mathbf{n} \wedge \mathbf{d})\, d\omega$.
Since $\delta\mathbf{d}$ is perpendicular to $\mathbf{d}$, we can write
$$\langle \delta d^2 \rangle = d^2 \langle\omega^2\rangle \sin^2\vartheta \simeq d_0^2 \langle\omega^2\rangle \sin^2\vartheta$$
where $\vartheta$ is the angle between $\mathbf{d}_0$ and $\mathbf{n}$. In accordance with (2.80)
$$\langle d \rangle = d_0 + (d_0 \langle\omega^2\rangle \sin^2\vartheta)/2 = d_0 (1 + \langle\omega^2\rangle \sin^2\vartheta / 2).$$
distance from P to the axis. The radial displacement is approximately given
by $d - d\cos\omega = d(1 - \cos\omega) \simeq d(\omega^2/2)$.
As an example, a librational root-mean-square amplitude of 10° may lead to
shortening of interatomic distances of up to 0.025 Å.
Fig. 2.16. Shortening of interatomic distances due to libration motion.
Thermal motion is expected to produce distortion also in apparent angles.
To study this type of effect the joint distribution of three correlated thermal
motions has to be taken into account.[116]
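
The apparent shortening of interatomic distances due to libration can be estimated with a few lines of code. This is a minimal sketch under simple assumptions: the function name is hypothetical, the angle is treated as a single libration amplitude ω in degrees, and the distance d of the atom from the libration axis is a free input.

```python
import math

def libration_shortening(d, omega_deg):
    """Return (exact, approximate) radial shortening for distance d and angle omega.

    exact       = d * (1 - cos(omega))
    approximate = d * omega**2 / 2   (small-angle expansion)
    """
    w = math.radians(omega_deg)
    exact = d * (1.0 - math.cos(w))
    approx = d * w ** 2 / 2.0
    return exact, approx

# A distance of 1.5 (in angstrom) and a libration angle of 10 degrees
exact, approx = libration_shortening(1.5, 10.0)
print(f"exact = {exact:.4f}, small-angle = {approx:.4f}")
```

For distances typical of bonded atoms the two values agree to better than 1 per cent, and both are of the order of a few hundredths of an ångström.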
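where
$$A_1 = \sigma(a)/a, \qquad B_1 = \sin\alpha\,(\cos\alpha - \cos\beta\cos\gamma)\,\sigma(\alpha),$$
$$A_2 = \sigma(b)/b, \qquad B_2 = \sin\beta\,(\cos\beta - \cos\alpha\cos\gamma)\,\sigma(\beta),$$
$$A_3 = \sigma(c)/c, \qquad B_3 = \sin\gamma\,(\cos\gamma - \cos\alpha\cos\beta)\,\sigma(\gamma).$$
On applying (2.84) to the expression of $\cos\alpha^*$ in Table 2.1 we obtain
$$\sigma^2(\alpha^*) = \{\sigma^2(\alpha)\sin^2\alpha + \sigma^2(\beta)(\sin\beta\cos\gamma + \cos\alpha^*\cos\beta\sin\gamma)^2 + \sigma^2(\gamma)(\cos\beta\sin\gamma + \cos\alpha^*\sin\beta\cos\gamma)^2\} / (\sin\alpha^*\sin\beta\sin\gamma)^2.$$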
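
The propagation of errors from the direct-cell angles to $\alpha^*$ can be verified numerically. The sketch below is illustrative: it assumes uncorrelated errors, angles in radians, and the usual relation $\cos\alpha^* = (\cos\beta\cos\gamma - \cos\alpha)/(\sin\beta\sin\gamma)$; the function names are hypothetical. The analytical variance is compared with one obtained from finite-difference derivatives.

```python
import numpy as np

def alpha_star(al, be, ga):
    """Reciprocal-cell angle alpha* from the direct-cell angles (radians)."""
    return np.arccos((np.cos(be) * np.cos(ga) - np.cos(al)) /
                     (np.sin(be) * np.sin(ga)))

def sigma_alpha_star(al, be, ga, s_al, s_be, s_ga):
    """Analytical standard deviation of alpha* for uncorrelated angle errors."""
    a_s = alpha_star(al, be, ga)
    ca = np.cos(a_s)
    num = ((s_al * np.sin(al)) ** 2
           + (s_be * (np.sin(be) * np.cos(ga) + ca * np.cos(be) * np.sin(ga))) ** 2
           + (s_ga * (np.cos(be) * np.sin(ga) + ca * np.sin(be) * np.cos(ga))) ** 2)
    return np.sqrt(num) / (np.sin(a_s) * np.sin(be) * np.sin(ga))

def sigma_alpha_star_numeric(al, be, ga, s_al, s_be, s_ga, h=1e-6):
    """Same quantity from central finite differences of alpha_star."""
    args = np.array([al, be, ga], dtype=float)
    sig = np.array([s_al, s_be, s_ga], dtype=float)
    var = 0.0
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        deriv = (alpha_star(*(args + step)) - alpha_star(*(args - step))) / (2 * h)
        var += (deriv * sig[k]) ** 2
    return np.sqrt(var)

angles = np.radians([80.0, 95.0, 110.0])
sigmas = np.radians([0.02, 0.03, 0.01])
print(sigma_alpha_star(*angles, *sigmas), sigma_alpha_star_numeric(*angles, *sigmas))
```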
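Apply now (2.84) to the relation (2.3b) in order to derive $\sigma^2(d)$, where d
is the distance between two uncorrelated atoms positioned in $(x_1, y_1, z_1)$ and
$(x_2, y_2, z_2)$ respectively. Then, with $\Delta_1 = a(x_1 - x_2)$, $\Delta_2 = b(y_1 - y_2)$,
$\Delta_3 = c(z_1 - z_2)$,
$$\sigma^2(d) = \frac{1}{d^2} \{ (\Delta_1 + \Delta_2\cos\gamma + \Delta_3\cos\beta)^2 [\Delta_1^2 \sigma^2(a)/a^2 + a^2(\sigma^2(x_1) + \sigma^2(x_2))]$$
$$\quad + (\Delta_1\cos\gamma + \Delta_2 + \Delta_3\cos\alpha)^2 [\Delta_2^2 \sigma^2(b)/b^2 + b^2(\sigma^2(y_1) + \sigma^2(y_2))]$$
$$\quad + (\Delta_1\cos\beta + \Delta_2\cos\alpha + \Delta_3)^2 [\Delta_3^2 \sigma^2(c)/c^2 + c^2(\sigma^2(z_1) + \sigma^2(z_2))]$$
$$\quad + (\Delta_1\Delta_2\,\sigma(\gamma)\sin\gamma)^2 + (\Delta_1\Delta_3\,\sigma(\beta)\sin\beta)^2 + (\Delta_2\Delta_3\,\sigma(\alpha)\sin\alpha)^2 \}. \qquad (2.85a)$$
(2.85a) is simpler if the errors on the unit cell parameters can be neglected.
An additional simplification is obtained if the errors are isotropic (i.e.
$a^2\sigma^2(x_1) = b^2\sigma^2(y_1) = c^2\sigma^2(z_1) = \sigma_1^2$ and $a^2\sigma^2(x_2) = b^2\sigma^2(y_2) = c^2\sigma^2(z_2) = \sigma_2^2$) and the axes are
orthogonal. Then (2.85a) reduces to
$$\sigma^2(d) = \sigma_1^2 + \sigma_2^2.$$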
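For a torsion angle $\omega_{ABCD}$ defined by four atoms with isotropic errors
$\sigma_A$, $\sigma_B$, $\sigma_C$, $\sigma_D$ the corresponding variance is
$$\sigma^2(\omega_{ABCD}) = \frac{\sigma_A^2}{d_{AB}^2 \sin^2(ABC)} + \frac{\sigma_D^2}{d_{CD}^2 \sin^2(BCD)}$$
$$\quad + \frac{\sigma_B^2}{d_{BC}^2}\left[\left(\frac{d_{BC} - d_{AB}\cos(ABC)}{d_{AB}\sin(ABC)}\right)^2 - 2\cos\omega\,\cot(BCD)\left(\frac{d_{BC} - d_{AB}\cos(ABC)}{d_{AB}\sin(ABC)}\right) + \cot^2(BCD)\right]$$
$$\quad + \frac{\sigma_C^2}{d_{BC}^2}\left[\left(\frac{d_{BC} - d_{CD}\cos(BCD)}{d_{CD}\sin(BCD)}\right)^2 - 2\cos\omega\,\cot(ABC)\left(\frac{d_{BC} - d_{CD}\cos(BCD)}{d_{CD}\sin(BCD)}\right) + \cot^2(ABC)\right].$$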
Appendices
2.A Some metric relations between direct and
reciprocal lattices
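Let us prove that $V^* = 1/V$. By definition
$$V^* = \mathbf{a}^* \cdot \mathbf{b}^* \wedge \mathbf{c}^*.$$
Owing to (2.8)
$$V^* = \frac{1}{V^3} (\mathbf{b} \wedge \mathbf{c}) \cdot [(\mathbf{c} \cdot \mathbf{a} \wedge \mathbf{b})\,\mathbf{a}] = \frac{1}{V}.$$
Derive now the values of $\alpha^*$, $\beta^*$, $\gamma^*$ from direct lattice parameters.
According to the first eqn (2.13)
$$\sin\alpha^* = \frac{V}{abc \sin\beta \sin\gamma}$$
is obtained. Expressions for $\sin\beta^*$ and $\sin\gamma^*$ quoted in Table 2.1 are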
obtained by cyclic permutation of the parameters. Derive now the expres-
4. The planes $H_1$, $H_2$, $H_3$ shall belong to the same zone if $\mathbf{r}^*_{H_1}$, $\mathbf{r}^*_{H_2}$, $\mathbf{r}^*_{H_3}$ are
coplanar. Then they shall define a cell in the reciprocal lattice whose volume
is zero:
5. The condition that the point P(x, y, z) lies in a plane (the nth of the set
from the origin) of the family (hkl). We require that the projection of
$\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}$ onto the direction of $\mathbf{r}^*_H$ be equal to n times the interplanar
spacing $d_H$:
$$\mathbf{r} \cdot \mathbf{r}^*_H / r^*_H = n d_H = n / r^*_H$$
from which
$$\mathbf{r} \cdot \mathbf{r}^*_H = hx + ky + lz = \bar{H}X = n. \qquad (2.B.2)$$
6. The condition that the point P(h, k, l) of the reciprocal lattice lies in
the nth (starting from the origin) plane of the set of planes (uvw) of the
reciprocal lattice. It is the same problem as in 5 above, but transferred in
the reciprocal lattice. The required condition is therefore
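12. Plane determined by the three points $P_1$, $P_2$, $P_3$ whose positional
vectors are $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$: $(\mathbf{r} - \mathbf{p}_1) \cdot (\mathbf{p}_2 - \mathbf{p}_1) \wedge (\mathbf{p}_3 - \mathbf{p}_1) = 0$.
In terms of coordinates
$$\det \begin{pmatrix} x - x_1 & y - y_1 & z - z_1 \\ x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \end{pmatrix} = 0. \qquad (2.B.4)$$
Equation (2.B.4) derives from (2.5): indeed the unit cell volume defined by
the vectors
$$\mathbf{r}_1 = \mathbf{r} - \mathbf{p}_1, \quad \mathbf{r}_2 = \mathbf{p}_2 - \mathbf{p}_1, \quad \mathbf{r}_3 = \mathbf{p}_3 - \mathbf{p}_1$$
is vanishing.
13. Plane normal to the unit vector $\mathbf{n}$: $\mathbf{r} \cdot \mathbf{n} = d$, where d is the distance
of the plane from the origin. It may also be written
$$\mathbf{r} \cdot \mathbf{n} = \bar{N}^* X = d. \qquad (2.B.5)$$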
14. Plane normal to the unit vector $\mathbf{n}$ and through the point defined by
$\mathbf{r}_0$:
$$(\mathbf{r} - \mathbf{r}_0) \cdot \mathbf{n} = \bar{N}^*(X - X_0) = 0. \qquad (2.B.6)$$
15. Distance from $P_1$ to the plane $\bar{N}^* X - d = 0$:
$$D = \bar{N}^* X_1 - d. \qquad (2.B.7)$$
If the plane is defined by means of the equation $\mathbf{r} \cdot \mathbf{m} = d'$, where $\mathbf{m}$ is a
general vector, before applying (2.B.7) $\mathbf{n} = \mathbf{m}/m$ and $d = d'/m$ have to be
calculated.
16. Projection of the vector $\mathbf{r}_1$ on to the plane $\bar{N}^* X - d = 0$:
$$P(\mathbf{r}_1 \perp \mathbf{n}) = \mathbf{r}_1 - (\bar{N}^* X_1)\,\mathbf{n}$$
17. Principal axes of the symmetry operator R. A vector $\mathbf{r}$ along an axis
of R should satisfy the eigenvalue equation $R\mathbf{r} = \lambda\mathbf{r}$.
To give an example, let $R = R_z$ in a Cartesian system. The secular
equation will be
$$\det(R_z - \lambda I) = \det \begin{pmatrix} \cos\theta - \lambda & -\sin\theta & 0 \\ \sin\theta & \cos\theta - \lambda & 0 \\ 0 & 0 & 1 - \lambda \end{pmatrix} = (1 + \lambda^2 - 2\lambda\cos\theta)(1 - \lambda) = 0.$$
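
Several of the geometric recipes listed above (plane through three points, distance of a point from a plane, axis of a rotation from the eigenvalue equation) are easy to sketch numerically. The example below is only illustrative: it assumes that all coordinates are Cartesian (orthonormal axes), so that no metric matrix is needed, and the function names are hypothetical.

```python
import numpy as np

def plane_through(p1, p2, p3):
    """Unit normal n and origin distance d of the plane r.n = d through three points."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    n = np.cross(p2 - p1, p3 - p1)
    n = n / np.linalg.norm(n)
    return n, float(n @ p1)

def signed_distance(x, n, d):
    """Signed distance D = n.x - d of the point x from the plane r.n = d."""
    return float(n @ np.asarray(x, dtype=float) - d)

def rotation_axis(R):
    """Eigenvector of R belonging to the eigenvalue closest to +1."""
    w, v = np.linalg.eig(np.asarray(R, dtype=float))
    k = np.argmin(np.abs(w - 1.0))
    axis = np.real(v[:, k])
    return axis / np.linalg.norm(axis)

n, d = plane_through([1, 0, 0], [0, 1, 0], [0, 0, 1])
print(n, d, signed_distance([0, 0, 0], n, d))

theta = np.radians(60.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
print(rotation_axis(Rz))   # parallel (or antiparallel) to z
```

In a non-orthogonal crystallographic frame the same steps apply after conversion to Cartesian coordinates, or by using the metric matrix in the scalar products.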
or, in matrix notation, A: = QA:, where Q coincides (see Table 2.C.1) with
the matrix which transforms an I cell into a P cell. It may be concluded that
the reciprocal of an F lattice is an I lattice whose cell is defined by the vectors
$2\mathbf{a}^*_F$, $2\mathbf{b}^*_F$, $2\mathbf{c}^*_F$.
If we index the reciprocal lattice with respect to $A^*_F$ and $A^*_I$ we obtain
where all of $h_F = 2h_I$, $k_F = 2k_I$, $l_F = 2l_I$ will either assume even values (when
$h_I$, $k_I$, $l_I$ are integer numbers) or odd values (when $h_I$, $k_I$, $l_I$ are of type m/2,
n/2, p/2 with odd integer values of m, n, p). These are just the conditions for
systematic extinctions in an F lattice.
cides with (2.E.1). Note that the trace of R is invariant under
transformations such as (2.E.1).
A particular case of (2.E.1) occurs when A' = A*. Then
$$R^* = G R G^*, \qquad T^* = G T. \qquad (2.E.2)$$
If the second of the equations (2.9) is introduced into the first equation
(2.E.2) the following result is obtained
$$R^* = (\bar{R})^{-1} G R^{-1} R G^{-1} = (\bar{R})^{-1}. \qquad (2.E.3)$$
In conclusion, if the operator R is a symmetry element in the direct space,
$(\bar{R})^{-1}$ is a symmetry element in the reciprocal space. The list of the
symmetry operators in the reciprocal space may be obtained without
inverting the various matrices. Indeed, if R is a symmetry element in the
direct space, group properties guarantee that $R^{-1}$ is also an element of the
direct space symmetry group: therefore $\bar{R}$ is a symmetry operator of the
reciprocal space symmetry group. Consequently, if the set of matrices R
operate in direct space, the set $\bar{R}$ operate in reciprocal space (however, R
and $\bar{R}$ may pertain to different symmetry elements).
In an orthonormal system a* = a, b* = b, c* = c and G = I. Therefore any
symmetry operator C = (R, T) will have identical expression both in direct
and in the reciprocal space. Furthermore $\bar{R} = R^{-1}$ holds, as already obtained
in Appendix 1.A.
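
The relation between a symmetry matrix in direct space and its counterpart in reciprocal space can be illustrated with a hexagonal cell. The sketch below rests on stated assumptions: G is the metric matrix for a = b, γ = 120°, with arbitrary values of a and c; R is the matrix of a threefold rotation about c acting on fractional coordinates as (x, y, z) → (−y, x − y, z); and the transformation to reciprocal space is taken as $R^* = G R G^{-1}$. The variable names are hypothetical.

```python
import numpy as np

a, c = 4.0, 6.5                      # arbitrary hexagonal cell edges
G = np.array([[a * a, -a * a / 2, 0.0],
              [-a * a / 2, a * a, 0.0],
              [0.0, 0.0, c * c]])    # metric matrix for gamma = 120 degrees

R = np.array([[0.0, -1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0]])      # threefold rotation about c

# A symmetry operation leaves the metric matrix unchanged
assert np.allclose(R.T @ G @ R, G)

R_star = G @ R @ np.linalg.inv(G)    # operator referred to the reciprocal basis
print(np.round(R_star, 10))
print(np.allclose(R_star, np.linalg.inv(R.T)))   # equals the inverse transpose
print(np.allclose(R_star, R))                     # but differs from R itself
```

For this non-orthonormal basis $R^*$ differs from R, whereas in an orthonormal basis (G = I) the two coincide.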
It is often useful to know the transformation rules valid in reciprocal
space when the basis vector transformation A' = MA is performed in direct
space. We describe some of them:
1. A* ↔ A'*: according to (2.25), A' = G'A'* = MGA*. Owing to (2.21),
that gives
Note that relations (2.E.5) are the transformation rules of the Miller
indices (hkl).
3. G* ↔ G'*: let us introduce the first eqn (2.16) in the first eqn (2.21),
which thus becomes $G'^{*-1} = M G^{*-1} \bar{M}$. On post-multiplying both sides of
this equation by $G'^*$ we obtain $I = M G^{*-1} \bar{M} G'^*$ from which
are the metric matrices of A and A' respectively, G* and GI* are
the metric matrices of A* = ( a * , b*, c*) and A'* ( a r * ,b ' * , c ' * )
respectively. C (R,T) is a symmetry operator (R is its rotational
part, T its translational part): C, C ' , C*, C'* are symmetry
operators defined in A, A', A*, A'* respectively. Q and Q* are
the quadratic forms of A and A*.
from which
Qr* = (M)-'Q*M-', Q* = MQ'*M. (2.E.8)
The transformation rules obtained in this paragraph and on pp. 65-7 are
collected in Table 2.E.1.
A particular basis transformation is that corresponding to a symmetry
operation. In accordance with eqns (2.19) and (2.20) a transformation R
acting on the coordinates is equivalent to the transformation $M = (\bar{R})^{-1}$
acting on the basis vectors. In this case the metric matrix will not vary:
indeed, according to (2.21), $G' = (\bar{R})^{-1} G R^{-1}$ which is identical (because of
(2.9)) to G. Find now the relationships existing between the matrices $\beta'$ and
$\beta$ defining the anisotropic temperature factors of the atoms related by a
symmetry operator. In accordance with (2.E.8)
As an example, the reader will easily verify that the relationships existing in
the cubic system among the components of $\beta$ of two atoms related by the
symmetry axis 3 ∥ [111] (see the matrix given in Appendix 1.D) are
$$\beta'_{11} = \beta_{33}, \quad \beta'_{12} = \beta_{13}, \quad \beta'_{13} = \beta_{23},$$
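
The relationships between the temperature-factor components of symmetry-related atoms can be checked numerically. The sketch below is hedged on two assumptions that are not spelled out in the surviving text: the transformation is taken as $\beta' = R \beta \bar{R}$, and the threefold axis along [111] is represented by the cyclic-permutation matrix taking (x, y, z) into (z, x, y). The numerical values of β are arbitrary.

```python
import numpy as np

# Assumed matrix of the threefold rotation about [111]: (x, y, z) -> (z, x, y)
R = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)

# Arbitrary symmetric matrix of anisotropic temperature-factor coefficients
beta = np.array([[0.011, 0.002, 0.003],
                 [0.002, 0.022, 0.004],
                 [0.003, 0.004, 0.033]])

beta_prime = R @ beta @ R.T   # assumed transformation law

print(beta_prime[0, 0] == beta[2, 2])   # beta'_11 = beta_33
print(beta_prime[0, 1] == beta[0, 2])   # beta'_12 = beta_13
print(beta_prime[0, 2] == beta[1, 2])   # beta'_13 = beta_23
```

Because R is a permutation matrix, the transformed components are simply a relabelling of the original ones, which is why the relations hold exactly.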
Since X and $\delta X$ are vectors, the first two terms on the right-hand side of
(2.F.2) are 1 × 1 matrices which are equal ($M_f^{-1} = \bar{M}_f^{-1}$). For the same
reason the third and the fourth terms are equal too. Therefore
$$\delta S = 2(\delta\bar{X})(\bar{A} M_f^{-1} A X - \bar{A} M_f^{-1} F) = 0$$
from which
$$(\bar{A} M_f^{-1} A) X = \bar{A} M_f^{-1} F,$$
which coincides with (2.53).
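
The normal equations obtained above translate directly into a few lines of linear algebra. This is a minimal sketch, not a production refinement routine: the function name is hypothetical, `A` is the design matrix, `F` the vector of observations, and `M_f` their variance-covariance matrix (taken as diagonal in the example).

```python
import numpy as np

def solve_weighted_lsq(A, F, M_f):
    """Solve (A^T M_f^-1 A) X = A^T M_f^-1 F for the parameter vector X."""
    A = np.asarray(A, dtype=float)
    F = np.asarray(F, dtype=float)
    M_inv = np.linalg.inv(np.asarray(M_f, dtype=float))
    B = A.T @ M_inv @ A          # normal matrix
    D = A.T @ M_inv @ F          # right-hand side
    return np.linalg.solve(B, D)

# Straight-line model F = X0 + X1 * t observed at four points
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.column_stack([np.ones_like(t), t])
X_true = np.array([2.0, 0.5])
F = A @ X_true                              # error-free observations
M_f = np.diag([0.1, 0.2, 0.1, 0.4]) ** 2    # assumed variances

print(solve_weighted_lsq(A, F, M_f))        # recovers X_true
```

With error-free observations the solution reproduces the true parameters whatever the weights; with noisy data the weights decide how strongly each observation influences the estimate.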
S = FM,-'F- ~ B X (F-
= F)M,-I(F- F ) - (X - X)B(X-X)
where X = A-IF.The expected value of 3 will then be
(9) = ((F - F)Myl(F -F)) - ((X - X)B(X - x)).
(F - F ) and ( 2 - X) are random variables with zero means and finite
variances, and with variance-covariance matrices of rank n and m
respectively. In this condition it is possible to show[45] that
((F - F)M,-'(F - F ) ) = n, ((X - x)M;~(% - X)) = m.
Therefore
where $-N/2 \le h' < N/2$. Subdivide the x axis in N parts so that
Similarly, the expression $j_1 r + j_0$ will generate all the integers j in the range
[0, (N − 1)]:
$$j = j_1 r + j_0 \quad (j_1 = 0, 1, \ldots, s - 1; \; j_0 = 0, 1, \ldots, r - 1).$$
Then
$$W_N^{hj} = W_s^{h_0 j_1} W_r^{h_1 j_0} W_N^{h_0 j_0}$$
and
$$\rho(j) = \rho(j_0, j_1) = (-1)^j \sum_{h_0=0}^{s-1} \sum_{h_1=0}^{r-1} F(h_1, h_0) W_s^{h_0 j_1} W_r^{h_1 j_0} W_N^{h_0 j_0}.$$
operations, the FFT method involves r(s + r) operations in the first step and
s(s + r) operations in the second step. If N is sufficiently large that
corresponds to a large saving of time.
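
The two-step factorization for N = rs can be sketched as follows. The code is illustrative only and rests on stated assumptions: the indices are split as $h = h_1 s + h_0$ and $j = j_1 r + j_0$, the kernel is taken as $W_N = \exp(-2\pi i/N)$ (the sign convention of `numpy.fft.fft`), and the $(-1)^j$ factor arising from an origin shift of the indices is omitted. The function name is hypothetical.

```python
import numpy as np

def two_step_dft(F, r, s):
    """Discrete Fourier transform of length N = r*s computed in two steps."""
    N = r * s
    F = np.asarray(F, dtype=complex)
    assert F.size == N
    Fmat = F.reshape(r, s)                 # Fmat[h1, h0] = F[h1*s + h0]
    W_r = np.exp(-2j * np.pi / r)
    W_s = np.exp(-2j * np.pi / s)
    W_N = np.exp(-2j * np.pi / N)
    h1 = np.arange(r); j0 = np.arange(r)
    h0 = np.arange(s); j1 = np.arange(s)
    # Step 1: sum over h1 for every pair (h0, j0)
    step1 = Fmat.T @ (W_r ** np.outer(h1, j0))          # shape (s, r)
    # Multiply by the mixed factors W_N^(h0*j0)
    step1 = step1 * (W_N ** np.outer(h0, j0))
    # Step 2: sum over h0 for every pair (j1, j0)
    rho = (W_s ** np.outer(j1, h0)) @ step1             # rho[j1, j0]
    return rho.reshape(N)                               # index j = j1*r + j0

rng = np.random.default_rng(0)
F = rng.normal(size=12) + 1j * rng.normal(size=12)
print(np.allclose(two_step_dft(F, 3, 4), np.fft.fft(F)))
```

Each step is a set of short transforms (of length r and s respectively), which is where the saving over the direct N-term sum for each of the N outputs comes from; applying the same split recursively leads to the usual FFT.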
Besides calculating electron density maps, the FFT method is also used
for the calculation of the structure factors when the number of atoms is very
large and computing time has to be saved. Calculations are organized into
two steps:
1. Step 1: All the atoms (in the asymmetric unit) which contribute to the
electron density are selected. For each of them the electron density is
nothing else but the Fourier transform of the atom scattering curve
corrected for thermal motion. Usually each atom is represented by a
single Gaussian function (for low resolution data) or as the sum of two or
three Gaussian components. The overall model electron density map is
sampled on a grid not too fine (that would greatly increase cost and
computer storage) and not too coarse (in order to avoid too rough an
approximation of the electron density).
2. Step 2: The Fourier inversion of the map is made, which provides
structure factor magnitudes and phases.
The speed of steps 1 and 2 depends on the number of grid points in which
the density has been sampled and on the number of terms in the Gaussian
approximation to the atomic scattering curve. Too coarse a sampling grid
would produce structure factors which are the sum of the desired ones and
those with indices spaced away by multiples of N. Ten Eyck[123] proved that
the number of grid points may be taken small by artificially increasing the B value
of the atoms.
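
The two steps above, and the aliasing caused by too coarse a grid, can be illustrated in one dimension. The sketch below is a toy model under explicit assumptions: each atom is a single periodic Gaussian of standard deviation `sigma` (in fractional units) and weight `z`, the positions and weights are invented for the example, and all function names are hypothetical.

```python
import numpy as np

def density_on_grid(xs, zs, sigma, N):
    """Step 1: sample the periodic model density on N grid points."""
    grid = np.arange(N) / N
    rho = np.zeros(N)
    for x0, z in zip(xs, zs):
        for shift in range(-3, 4):      # neighbouring periodic images
            rho += z / (sigma * np.sqrt(2 * np.pi)) * np.exp(
                -(grid - x0 - shift) ** 2 / (2 * sigma ** 2))
    return rho

def sf_by_fft(rho):
    """Step 2: F(h) = (1/N) sum_k rho_k exp(+2 pi i h k / N)."""
    return np.fft.ifft(rho)

def sf_direct(xs, zs, sigma, h):
    """Direct summation using the analytical transform of a Gaussian."""
    return sum(z * np.exp(-2 * np.pi ** 2 * sigma ** 2 * h ** 2)
               * np.exp(2j * np.pi * h * x0) for x0, z in zip(xs, zs))

xs, zs, sigma = [0.10, 0.37], [6.0, 8.0], 0.05
fine = sf_by_fft(density_on_grid(xs, zs, sigma, 64))
coarse = sf_by_fft(density_on_grid(xs, zs, sigma, 8))
for h in range(4):
    print(h, abs(sf_direct(xs, zs, sigma, h)), abs(fine[h]), abs(coarse[h]))
```

On the fine grid the FFT values agree with the direct sum, while on the coarse grid they are contaminated by terms with indices h ± N, h ± 2N, and so on. Broadening the Gaussians (a larger B value) makes those terms decay faster, which is why a coarser grid then becomes acceptable.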
Besides structure factor calculations most of the steps involved in LSQ
structure refinement can be performed by the FFT algorithm:[125,126] i.e. the
calculation of the gradient vector D and of the normal matrix B (see p. 92).
Indeed, in accordance with p. 145, the following properties of the Fourier
transform hold:
$$\frac{\partial\rho(\mathbf{r})}{\partial y} \;\xrightarrow{\;FT\;}\; (-2\pi i k) F(\mathbf{H})$$
etc.
Cubic system
Twinning according to the spinel law is commonly found in crystals of the
class 4̄3m, which often exhibit forms {111}. Twinning occurs with (111) as
a twin plane (see Fig. 2.J.1(a)). The same law applies to the penetration
twin shown in Fig. 2.J.1(b), where the two cubes are rotated about [111]
(classical mineral: fluorite). An interpenetrant rotation twin about [111̄] is
shown in Fig. 2.J.1(c).
Crystals of pyrite, point group m3, frequently obey the iron cross law:
they twin with (110) as a twin plane.
In Fig. 2.J.1(d) the classical pentagonal dodecahedron is shown, with
form e = {210}: twinned crystals display the form shown in Fig. 2.J.1(e).
Tetragonal system
Cassiterite (SnO2, point group 4/mmm) presents the so called elbow twin,
twin plane {101}, shown in Fig. 2.J.2. The elbow twin is also found in rutile
(TiO2) and zircon (ZrSiO4), which are often polysynthetic.
Chalcopyrite crystals (CuFeS2, point group 4̄2m) are commonly
tetrahedral, with {112} as the dominant form. Twinning occurs on {112},
contact, lamellar, or penetration.
Orthorhombic system
Aragonite (CaCO3, point group mmm, polymorphous with calcite) often
twins on (110) faces. In Fig. 2.J.5(a) a single crystal of aragonite is shown,
Monoclinic system
Orthoclase crystals (KAlSi3O8, point group 2/m) often twin according to the
Carlsbad, Baveno, or Manebach law.
In the Carlsbad law c is the twin axis and (010) the composition plane
(see Fig. 2.J.7(a)). The twin and composition plane is (021) in Baveno
twins (see Fig. 2.J.7(b)), and (001) in Manebach twins (see Fig. 2.J.7(c)).
While Carlsbad twins are penetrating twins, Baveno and Manebach are
usually contact twins.
Triclinic system
In twins of plagioclase feldspars [(Ca, Na)(Al, Si)AlSi2O8, point group 1̄]
the albite law is often satisfied: twin plane (010), type of twinning usually
multiple and polysynthetic (see Fig. 2.J.8(a,b,c)). Also common is the
pericline law, with twin axis [010]. The composition plane is the so-called
'rhombic section', a plane parallel to b whose orientation does not
correspond to rational indices; see Fig. 2.J.8(d).
Fig. 2.J.3. Some examples of calcite twins. (a) Plane {0001}; (b) twin plane {11̄02}.
Fig. 2.J.6. Staurolite twinning. (a) Twin after {032}; (b) twin after {232}.
Fig. 2.J.7. Orthoclase twins. (a) Carlsbad twin; (b) Baveno twin; (c) Manebach twin.
Fig. 2.J.8. Plagioclase feldspars. (a) Single crystal; (b) twin by Albite law; (c) polysynthetic albite twin; (d) twin by pericline law.
References
1. Sands, D. E. (1982). Vectors and tensors in crystallography. Addison-Wesley,
Reading.
2. Klyne, W. and Prelog, V. (1960). Experientia, 16, 521.
3. Schomaker, V., Waser, J., Marsh, R. E., and Bergman, G. (1958). Acta
Crystallographica, 8, 600.
4. Niggli, P. (1928). Handbuch der Experimentalphysik, Vol. 7, Part 1.
Academische Verlagsgesellschaft, Leipzig.
5. Delaunay, B. N. (1933). Zeitschrift für Kristallographie, 84, 109.
6. Katayama, C. (1986). Journal of Applied Crystallography, 19, 69.
7. Burzlaff, H. and Zimmermann, H. (1984). Zeitschrift für Kristallographie, 170,
241.
8. Burzlaff, H. and Zimmermann, H. (1984). Zeitschrift für Kristallographie, 170,
247.
9. Buerger, M. J. (1957). Zeitschrift für Kristallographie, 109, 42.
10. Buerger, M. J. (1960). Zeitschrift für Kristallographie, 113, 52.
11. Gruber, B. (1973). Acta Crystallographica, A29, 433.
12. Santoro, A. and Mighell, A. D. (1970). Acta Crystallographica, A26, 124.
13. Krivy, I. and Gruber, B. (1976). Acta Crystallographica, A32, 297.
14. Clegg , W. (1981). Acta Crystallographica, A37, 913.
15. Ferraris, G. and Ivaldi, G. (1983). Acta Crystallographica, A39, 595.
16. Andrews, L. C. and Bernstein, H. J. (1988). Acta Crystallographica, A44,
1009.
17. Azaroff, L. V. and Buerger, M. J. (1958). The powder method in X-ray
crystallography. McGraw-Hill, New York.
18. Mighell, A. D. and Rodgers, J. R. (1980). Acta Crystallographica, A36, 321.
19. Mighell, A. D. and Himes, V. L. (1986). Acta Crystallographica, A42, 101.
20. Gruber, B. (1989). Acta Crystallographica, A45, 123.
21. Santoro, A. and Mighell, A. D. (1972). Acta Crystallographica, A28, 284.
22. Fortes, M. A. (1983). Acta Crystallographica, A39, 351.
23. Grimmer, H. (1976). Acta Crystallographica, A32, 783.
24. Grimmer, H. (1974). Scripta Metallurgica, 8, 1221.
25. Bollmann, W. (1970). Crystal defects and crystalline interfaces. Springer, Berlin.
26. Ranganathan, S. (1966). Acta Crystallographica, A21, 197.
27. Grimmer, H. (1984). Acta Crystallographica, A40, 108.
28. Grimmer, H. (1989). Acta Crystallographica, A45, 320.
29. Bonnet, R., Cousineau, E., and Warrington, D. M. (1981). Acta
Crystallographica, A37, 184.
30. Grimmer, H. (1989). Acta Crystallographica, A45, 505.
31. Donnay, G. and Donnay, J. D. H. (1974). Canadian Mineralogist, 12, 422.
32. Le Page, Y., Donnay, J. D. H., and Donnay, G. (1984). Acta
Crystallographica, A40, 679.
33. Friedel, G. (1926). Leçons de cristallographie. Berger-Levrault, Paris.
(Reprinted 1964. Blanchard, Paris.)
34. Santoro, A. (1974). Acta Crystallographica, A30, 224.
35. Catti, M. and Ferraris, G. (1976). Acta Crystallographica, A32, 163.
36. Van der Sluis, P. (1989). Thesis. University of Utrecht.
37. Flack, H. D. (1987). Acta Crystallographica, A43, 564.
38. Britton, D. (1972). Acta Crystallographica, A28, 296.
108. Lehmann, M. S., Christensen, A. N., Fjellvåg, H., Feidenhans'l, R., and
Nielsen, M. (1987). Journal of Applied Crystallography, 20, 123.
109. Hibble, S. J., Cheetham, A. K., Bogle, A. R. L., Wakerley, H. R., and Cox,
D. E. (1988). Journal of the American Chemical Society, 110, 3295.
110. Maichle, J. K., Ihringer, J., and Prandl, W. (1988). Journal of Applied
Crystallography, 21, 22.
111. Cheetham, A. K. and Taylor, J. C. (1977). Journal of Solid State Chemistry,
21, 253.
112. Meille, S. V., Brückner, S., and Lando, J. B. (1989). Polymer, 30, 786.
113. Brückner, S., Meille, S. V., Porzio, W., and Ricci, G. (1988).
Makromolekulare Chemie, 189, 2145.
114. Cruickshank, D. W. J. (1956). Acta Crystallographica, 9, 754.
115. Schomaker, V. and Trueblood, K. N. (1968). Acta Crystallographica, B24, 63.
116. Johnson, C. K. (1970). In Crystallographic computing (ed. F. R. Ahmed), pp.
207-19. Munksgaard, Copenhagen.
117. Busing, W. R. and Levy, H. A. (1964). Acta Crystallographica, 17, 142.
118. Cruickshank, D. W. J. and Robertson, A. P. (1953). Acta Crystallographica, 6,
698.
119. Stanford, R. H. Jr and Waser, J. (1972). Acta Crystallographica, A28, 213.
120. Nardelli, M. (1983). Computers and Chemistry, 7, 3, 95.
121. Shmueli, U. (1974). Acta Crystallographica, A30, 848.
122. Ten Eyck, L. F. (1973). Acta Crystallographica, A29, 183.
123. Ten Eyck, L. F. (1977). Acta Crystallographica, A33, 486.
124. Immirzi, A. (1976). In Crystallographic computing techniques (ed. F. R.
Ahmed), pp. 399-412. Munksgaard, Copenhagen.
125. Agarwal, R. (1978). Acta Crystallographica, A34, 791.
126. Jack, A. and Levitt, M. (1978). Acta Crystallographica, A34, 931.
The diffraction of X-rays by crystals
CARMELO GIACOVAZZO
Introduction
Crystal structure analysis is usually based on diffraction phenomena caused
by the interaction of matter with X-rays, electrons, or neutrons. Although
the theory of diffraction is the same for all types of radiation, we shall
consider X-ray scattering with particular interest: some references to
electron and neutron scattering are made in Appendix 3.B pp. 195 and 198.
The most important properties of X-rays were described by Röntgen in
1896. However, with equipment in common use in optics at that time he
could not measure any effect of interference, reflection, or refraction.
Several years later Sommerfeld measured an X-ray wavelength of about
0.4 Å. In 1912 M. von Laue, starting from an article by Ewald, a student of
Sommerfeld, suggested the use of crystals as natural lattices for diffraction.
This experiment was successfully performed by Friedrich and Knipping,
both students of Röntgen. In 1913 W. L. Bragg and M. von Laue used
X-ray diffraction patterns for deducing the structure of NaCl, KCl, KBr,
and KI. In such a way, in only a few years, the electromagnetic nature of
X-rays and their usefulness in the determination of crystal structure was
indisputably demonstrated.
Let us recall some properties of electromagnetic radiation:
4. The refractive index of X-rays is very near to unity: for $\lambda$ = 2 Å and for
high-density substances the difference from unity is of the order $10^{-4}$,
being $10^{-5}$ for most cases. For this reason the X-rays cannot be focused
by means of suitable lenses like ordinary light or electrons. Thus, if
X-rays are used, we cannot talk about a direct observation of objects by
means of instruments equivalent to optical or electron microscopes.
Thomson scattering
Let us suppose that (see Fig. 3.1(a)) a free material particle with electric
charge e and mass m is at the origin O of our coordinate system and that a
plane monochromatic electromagnetic wave with frequency $\nu$ and electric
vector $\mathbf{E}_i$ propagates along the x axis in positive direction. Its electric field is
described by the equation
$$\mathbf{E}_i = \mathbf{E}_{0i} \exp 2\pi i \nu (t - x/c)$$
where $\mathbf{E}_{0i}$ is the amplitude of the wave and $\mathbf{E}_i$ is the value of the field at
position x at time t. The field exerts on the particle a periodic force $\mathbf{F} = e\mathbf{E}_i$
and therefore the particle will undergo oscillatory motion with acceleration
$\mathbf{a} = \mathbf{F}/m = e\mathbf{E}_i/m$ and frequency $\nu$. In accordance with classical theory of
electromagnetism a charged particle in accelerated motion is a source of
electromagnetic radiation: its field at $\mathbf{r}$ is proportional to acceleration and
lies in the plane ($\mathbf{E}_i$, $\mathbf{r}$). Let us orient the axes y and z of our coordinate
system in such a way that the observation point Q defined by vector $\mathbf{r}$ is in
the plane (x, y). At the point Q we will measure the electric field $\mathbf{E}_d$ due to
scattered radiation
$$\mathbf{E}_d = \mathbf{E}_{0d} \exp[2\pi i \nu (t - r/c) - i\alpha].$$
Thomson showed that (see also pp. 165-6)
$$E_{0d} = \frac{1}{r} E_{0i} (e^2/mc^2) \sin\varphi$$
Fig. 3.1. (a) A free charged particle is in O: a plane monochromatic electromagnetic wave propagates along the x axis. (b) Surface element at scattering angle $2\theta$.
where $(2\pi r \sin 2\theta)\, r\, d(2\theta)$ is the surface element at angle $2\theta$. The total
scattering 'cross-section' $P/I_i$ is equal to $6.7 \times 10^{-25}$ cm²/electron, which is a
very small quantity. It may be calculated that the total fraction of incident
radiation scattered by one 'crystal' composed only of free electrons and
having dimensions less than 1 mm is less than 2 per cent.
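
The order of magnitude of the total cross-section can be checked from the physical constants. The sketch below assumes the classical result that the total Thomson cross-section equals $(8\pi/3)(e^2/mc^2)^2$ and uses rounded CGS values of the constants; the variable names are arbitrary.

```python
import math

e = 4.80320e-10      # electron charge (esu)
m = 9.10938e-28      # electron mass (g)
c = 2.99792458e10    # speed of light (cm/s)

r_e = e ** 2 / (m * c ** 2)                   # classical electron radius (cm)
sigma_total = 8.0 * math.pi / 3.0 * r_e ** 2  # total cross-section (cm^2)

print(f"e^2/mc^2            = {r_e:.4e} cm")
print(f"total cross-section = {sigma_total:.3e} cm^2 per electron")
```

The result, about $6.65 \times 10^{-25}$ cm² per electron, agrees with the rounded value quoted in the text.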
The scattered radiation will be partially polarized even if the incident
radiation is not. Thus, if the beam is scattered first by a crystal
(monochromator) and then by the sample the polarization of the beam will be
different. The scattering is coherent, according to Thomson, because there
is a well defined phase relation between the incident radiation and the
scattered one: for electrons $\alpha = \pi$.
Unfortunately it is very difficult to verify by experiment the Thomson
formula since it is almost impossible to have a scatterer composed
exclusively of free electrons. One could suppose that scatterers composed of
light elements with electrons weakly bound to the nucleus are a good
approximation to the ideal Thomson scatterer. But experiments with light
elements have revealed a completely different effect, the Compton effect.
Compton scattering
The process can be described in terms of elastic collision between a photon
and a free electron. The incident photon is deflected by a collision from its
original direction and transfers a part of its energy to the electron.
Consequently there is a difference in wavelength between the incident
radiation and the scattered one which can be calculated by means of the
relation (see also Appendix 3.B, p. 185)
$$\Delta\lambda\ (\text{Å}) = 0.024\,(1 - \cos 2\theta). \qquad (3.4)$$
The following properties emerge from eqn (3.4): $\Delta\lambda$ does not depend on
the wavelength of incident radiation; the maximum value of $\Delta\lambda$ ($\Delta\lambda$ =
0.048 Å) is reached for $2\theta = \pi$ (backscattering) which is small but significant
for wavelengths of about 1 Å. Besides, $\Delta\lambda = 0$ for $2\theta = 0$.
Compton scattering is incoherent; it causes a variation in wavelength but
does not involve a phase relation between the incident and the scattered
radiation. It is impossible to calculate interference effects for Compton
radiation.
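
The wavelength shift of eqn (3.4) is easily tabulated. In the sketch below the prefactor is computed as h/(mc) from SI values of the constants (it rounds to the 0.024 Å quoted in the text); the function name is hypothetical and the scattering angle 2θ is given in radians.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant (J s)
M_ELECTRON = 9.1093837e-31   # electron mass (kg)
C_LIGHT = 2.99792458e8       # speed of light (m/s)

# Compton wavelength of the electron, converted from metres to angstrom
LAMBDA_C = H_PLANCK / (M_ELECTRON * C_LIGHT) * 1e10

def compton_shift(two_theta):
    """Wavelength shift (angstrom) for scattering angle two_theta (radians)."""
    return LAMBDA_C * (1.0 - math.cos(two_theta))

for deg in (0, 45, 90, 135, 180):
    print(deg, round(compton_shift(math.radians(deg)), 4))
```

The shift vanishes in the forward direction and reaches its maximum, about 0.0485 Å, for backscattering, independently of the incident wavelength.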
$$F(\mathbf{r}^*) = \sum_{j=1}^{N} A_j \exp(2\pi i \mathbf{r}^* \cdot \mathbf{r}_j) \qquad (3.7a)$$
The coherent intensity $I_{coe}$ can be calculated on the basis of the following
observations. An atomic electron can be represented by its distribution
function $\rho_e(\mathbf{r}) = |\psi(\mathbf{r})|^2$, where $\psi(\mathbf{r})$ is the wave function which satisfies the
Schrödinger equation. The volume dv contains $\rho_e\,dv$ electrons and scatters
an elementary wave which will interfere with the others emitted from all the
elements of volume constituting the electron cloud. In accordance with
p. 145 the electron scattering factor will be
respectively.
Equations (3.12) are illustrated in Fig. 3.3(a) and eqns (3.13) in Fig.
3.3(b). In accordance with eqn (3.10) the electron scattering factor is equal
to 1 when $r^* = 0$. Moreover, the scattering of 1s electrons, whose
distribution is very sharp, is more efficient at higher values of $r^*$. If the
distribution of 1s electrons could really be considered point-like their
scattering factor would be constant with varying $r^*$ (see Appendix 3.A, p.
177 for the transform of a Dirac delta function).
According to the premise of this section the intensity of the Compton
radiation of an atomic electron will be
$$I_{incoe} = I_{eTh}(1 - f_e^2)$$
where $I_{eTh}$ is given by eqn (3.2) or eqn (3.3). The intensity of the Compton
radiation has the same order of magnitude as the radiation scattered
coherently.
Scattering by atoms
Let $\psi_1(\mathbf{r}), \ldots, \psi_Z(\mathbf{r})$ be the wave functions of the $Z$ atomic electrons: then $\rho_{ej}\,dv = |\psi_j(\mathbf{r})|^2\,dv$ is the probability of finding the $j$th electron in the volume $dv$. If every function $\psi_j(\mathbf{r})$ can be considered independent of the others, then $\rho_a(\mathbf{r})\,dv = \left(\sum_{j=1}^{Z}\rho_{ej}\right)dv$ is the probability of finding an electron in the volume $dv$. The Fourier transform of $\rho_a(\mathbf{r})$ is called the atomic scattering factor and will be denoted by $f_a$.
Fig. 3.3. (a) Radial distribution for 1s and 2s electrons of a C atom as defined by Slater functions. (b) Scattering factors for 1s and 2s electrons.
Generally the function $\rho_a(\mathbf{r})$ does not have spherical symmetry. In most crystallographic applications the deviations from it, for instance because of covalent bonds, are neglected in first approximation. If we assume that $\rho_a$ is spherically symmetric and, without loss of generality, that the centre of the atom is at the origin, we will have
$$f_a(r^*) = \int_0^\infty U_a(r)\,\frac{\sin(2\pi r r^*)}{2\pi r r^*}\,dr = \sum_{j=1}^{Z} f_{ej}$$
where $U_a(r) = 4\pi r^2\rho_a(r)$ is the radial distribution function for the atom. The $\rho_a$ function is known with considerable accuracy for practically all neutral atoms and ions: for lighter atoms via Hartree-Fock methods, and for heavier atoms via the Thomas-Fermi approximation. In Fig. 3.4(a) the $f_a$ functions for some atoms are shown. Each curve reaches its maximum value, equal to $Z$, at $\sin\theta/\lambda = 0$ and decreases with increasing $\sin\theta/\lambda$. According to the previous paragraph most of the radiation scattered at high values of $\sin\theta/\lambda$ is due to electrons of the inner shells of the electron cloud (core). Conversely scattering of valence electrons is efficient only at low $\sin\theta/\lambda$ values. $f_a$ can thus be considered the sum of core and valence electron scattering:
$$f_a = f_{\mathrm{core}} + f_{\mathrm{valence}}.$$
In Fig. 3.4(b) $f_{\mathrm{core}}$ and $f_{\mathrm{valence}}$ of a nitrogen atom are shown as functions of $\sin\theta/\lambda$.
As a consequence of eqn (3.14) the intensity of the radiation coherently
scattered from an atom can be obtained by summing the amplitudes relative
148 1 Carmelo Giacovazzo
Since $f_e = 1$ for $\sin\theta/\lambda = 0$ there is no Compton radiation in the direction of the primary beam. Nevertheless it is appreciable at high values of $\sin\theta/\lambda$.
When we consider the diffraction phenomenon from one crystal the
intensity coherently diffracted will be proportional to the square of the
vectorial sum of the amplitudes scattered from the single atoms while the
intensity of the Compton radiation will be once more the sum of the single
intensities. As a consequence of the very high number of atoms which
contribute to diffraction, Compton scattering can generally be ignored: its
presence is detectable as background radiation, easily recognizable in
crystals composed of light atoms.
Fig. 3.4. (a) Scattering factors for S, Na$^+$, O. (b) Core and valence scattering for nitrogen atom.
The temperature factor
In a crystal structure an atom is bound to others by bond forces of various
types. Their arrangement corresponds to an energy minimum. If the atoms
are disturbed they will tend to return to the positions of minimal energy:
they will oscillate around such positions gaining thermal energy.
The oscillations will modify the electron density function of each atom
and consequently their capacity to scatter. Here we will suppose that the
thermal motion of an atom is independent of that of the others. This is not
completely true since the chemical bonds introduce strong correlations
between the thermal motions of various atoms (see pp. 117-20 and
Appendix 3.B, p. 186).
The time-scale of a scattering experiment is much longer than the periods of thermal vibration of atoms. Therefore the description of the thermal motion of an atom requires only the knowledge of the time-averaged distribution of its position with respect to that of equilibrium. If we suppose that the position of equilibrium is at the origin, that $p(\mathbf{r}')$ is the probability of finding the centre of one atom at $\mathbf{r}'$, and that $\rho_a(\mathbf{r} - \mathbf{r}')$ is the electron density at $\mathbf{r}$ when the centre of the atom is at $\mathbf{r}'$, then we can write
where
$$q(\mathbf{r}^*) = \int_{S'} p(\mathbf{r}')\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}')\,d\mathbf{r}' \qquad (3.17)$$
crystals between 0.05 and 0.20 Å ($B$ lying between 0.20 and 3.16 Å$^2$) but can also reach 0.5 Å ($B \approx 20$ Å$^2$) for some organic crystals. The consequence of this is to make the electron density of the atom more diffuse and therefore to reduce the capacity for scattering with increasing values of $\sin\theta/\lambda$.
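The numerical ranges quoted above can be reproduced with a short sketch. It assumes the standard isotropic relation $B = 8\pi^2\langle u^2\rangle$ and the attenuation factor $\exp(-B\sin^2\theta/\lambda^2)$; the function names are illustrative only.

```python
import math

def b_factor(u_rms):
    """Isotropic B (Å^2) for a root-mean-square displacement u_rms (Å),
    assuming B = 8 π^2 <u^2>."""
    return 8.0 * math.pi ** 2 * u_rms ** 2

def temperature_factor(b, s):
    """Attenuation exp(-B s^2) of the scattering factor, with s = sin(θ)/λ in Å^-1."""
    return math.exp(-b * s ** 2)

for u in (0.05, 0.20, 0.5):
    print(u, round(b_factor(u), 2))

# A larger B damps the high-angle scattering more strongly.
print(temperature_factor(b_factor(0.20), 0.5))
```

Displacements of 0.05, 0.20 and 0.5 Å give $B$ values close to 0.20, 3.16 and 20 Å$^2$, in line with the figures in the text.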
In general an atom will not be free to vibrate equally in all directions. If we assume that the probability $p(\mathbf{r}')$ has a three-dimensional Gaussian distribution the surfaces of equal probability will be ellipsoids called vibrational or thermal, centred on the mean position occupied by the atom. Now eqn (3.19) will be replaced (see Appendix 3.B, pp. 186 and 188) by the anisotropic temperature factor (3.20) which represents a vibrational ellipsoid in reciprocal space defined by six parameters $U^*_{11}$, $U^*_{22}$, $U^*_{33}$, $U^*_{12}$, $U^*_{13}$, $U^*_{23}$:
$$q(\mathbf{r}^*) = \exp[-2\pi^2(U^*_{11}x^{*2} + U^*_{22}y^{*2} + U^*_{33}z^{*2} + 2U^*_{12}x^*y^* + 2U^*_{13}x^*z^* + 2U^*_{23}y^*z^*)]. \qquad (3.20)$$
The six parameters $U^*_{ij}$ (five more than the unique parameter $U$ necessary to characterize the isotropic thermal motion) define the orientation of the thermal ellipsoid with respect to the crystallographic axes and the lengths of the three ellipsoid axes. In order to describe graphically a crystal molecule
$$F_M(\mathbf{r}^*) = \int_S \sum_{j=1}^{N}\rho_j(\mathbf{r} - \mathbf{r}_j)\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r})\,d\mathbf{r} = \sum_{j=1}^{N}\int_S \rho_j(\mathbf{R}_j)\exp[2\pi i\,\mathbf{r}^*\cdot(\mathbf{r}_j + \mathbf{R}_j)]\,d\mathbf{R}_j = \sum_{j=1}^{N} f_j(\mathbf{r}^*)\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_j),$$
where $f_j(\mathbf{r}^*)$ is the atomic scattering factor of the $j$th atom (thermal motion included; in the previous section indicated by $f_{at}$). The fact that in eqn (3.21) we have neglected the redistribution of the outer electrons leads to negligible errors for $F_M(\mathbf{r}^*)$, except in the case of small $r^*$ and for light atoms, where the number of outer electrons represents a considerable fraction of $Z$.
$\rho_M(\mathbf{r})$, as defined by (3.21), is the electron density of a promolecule, or, in other words, of an assembly of spherically averaged free atoms thermally agitated and superimposed on the molecular geometry. Such a model is unsatisfactory if one is interested in the deformation of the electron density consequent to bond formation. In a real molecule the electron density is generated by superposition of molecular space orbitals $\psi_i$ with occupation $n_i$:
Since $\rho_{\mathrm{molecule}}$ can be decomposed into atomic fragments, a finite set of appropriately chosen basis functions can be used to represent each $j$th atomic fragment (see Appendix 3.D). Then
Diffraction by a crystal
A three-dimensional infinite lattice can be represented (see Appendix 3.A, p. 174) by the lattice function
where $V$ is the volume of the unit cell and $\mathbf{r}^*_H = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ is the generic lattice vector of the reciprocal lattice (see pp. 63-5).
If the scattering object is non-periodic (atom, molecule, etc.) the amplitude of the scattered wave $F_M(\mathbf{r}^*)$ can be non-zero for any value of $\mathbf{r}^*$. On the contrary, if the scattering object is periodic (crystal) we observe a non-zero amplitude only when $\mathbf{r}^*$ coincides with a reciprocal lattice point:
The directions $\mathbf{s}$ which satisfy eqns (3.26) are called diffraction directions and relations (3.26) are the Laue conditions.
Finiteness of the crystal may be taken into account by introducing the form function $\Phi(\mathbf{r})$: $\Phi(\mathbf{r}) = 1$ inside the crystal, $\Phi(\mathbf{r}) = 0$ outside the crystal. In this case we can write
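where
$$D(\mathbf{r}^*) = \int_S \Phi(\mathbf{r})\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r})\,d\mathbf{r} = \int_\Omega \exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r})\,d\mathbf{r}$$
and $\Omega$ is the volume of the crystal. Because of eqn (3.A.40) the relation (3.27) becomes
$$F(\mathbf{r}^*) = \frac{1}{V}\,F_M(\mathbf{H})\sum_{\mathbf{H}} D(\mathbf{r}^* - \mathbf{r}^*_H).$$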
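$$F_H = \sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf{r}^*_H\cdot\mathbf{r}_j)$$
p. 64 we write
$$F_H = \sum_{j=1}^{N} f_j \exp(2\pi i\,\bar{H}X_j) = A_H + iB_H \qquad (3.30a)$$
where
$$A_H = \sum_{j=1}^{N} f_j \cos 2\pi\bar{H}X_j, \qquad B_H = \sum_{j=1}^{N} f_j \sin 2\pi\bar{H}X_j. \qquad (3.30b)$$
$$F_{hkl} = \sum_{j=1}^{N} f_j \exp 2\pi i(hx_j + ky_j + lz_j).$$
$$F_H = \sum_{j=1}^{N} f_{0j} \exp(2\pi i\,\bar{H}X_j - 8\pi^2 U_j \sin^2\theta/\lambda^2)$$
Fig. 3.7. $F_H$ is represented in the Gauss plane for a crystal structure with $N = 5$. It is $\alpha_j = 2\pi\bar{H}X_j$.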
$$F_H = \sum_{j=1}^{N} f_{0j} \exp(2\pi i\,\bar{H}X_j - 2\pi^2\,\bar{H}U^*_jH)$$
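The explicit form $F_{hkl} = \sum_j f_j \exp 2\pi i(hx_j + ky_j + lz_j)$ translates almost literally into code. The sketch below is illustrative only: the atom list is a hypothetical two-atom, body-centred arrangement, the scattering factors are treated as constants, and thermal motion is ignored.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F_hkl = sum_j f_j exp[2πi(h x_j + k y_j + l z_j)].

    atoms is a list of (f_j, (x_j, y_j, z_j)) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical example: two identical atoms related by the (1/2, 1/2, 1/2) translation.
atoms = [(6.0, (0.0, 0.0, 0.0)), (6.0, (0.5, 0.5, 0.5))]

F = structure_factor((1, 1, 0), atoms)
A_H, B_H = F.real, F.imag                  # components of eqn (3.30a)
modulus, phase = abs(F), cmath.phase(F)
print(A_H, B_H, modulus, phase)
```

For this arrangement the two contributions add when $h + k + l$ is even and cancel when it is odd.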
Bragg's law
A qualitatively simple method for obtaining the conditions for diffraction was described in 1912 by W. L. Bragg who considered the diffraction as the consequence of simultaneous reflections of the X-ray beam by various lattice planes belonging to the same family (physically, from the atoms lying on these planes). Let $\theta$ be (see Fig. 3.8) the angle between the primary beam and the family of lattice planes with indices $h$, $k$, $l$ (having no integer common factor larger than unity). The difference in 'path' between the waves scattered in D and B is equal to $AB + BC = 2d\sin\theta$. If it is a multiple of $\lambda$ then the two waves combine with maximum positive interference:
$$2d_H\sin\theta = n\lambda. \qquad (3.32)$$
Fig. 3.8. Reflection of X-rays from two lattice planes belonging to the family $H = (h, k, l)$. $d$ is the interplanar spacing.
Since the X-rays penetrate deeply into the crystal a large number of lattice planes will reflect the primary beam: the reflected waves will interfere destructively if eqn (3.32) is not verified. Equation (3.32) is the Bragg equation and the angle for which it is verified is the Bragg angle: for $n = 1, 2, \ldots$ we obtain reflections (or diffraction effects) of first order, second order, etc., relative to the same family of lattice planes $H$.
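As a minimal sketch of how the Bragg equation is used in practice, the function below solves $2d_H\sin\theta = n\lambda$ for $\theta$. The function name and the spacing $d = 2.0$ Å are illustrative assumptions; the wavelength 1.542 Å is the CuK$\alpha$ value quoted later in this chapter.

```python
import math

def bragg_angle(d, wavelength, n=1):
    """Bragg angle θ in degrees from 2 d sin θ = n λ, or None if no solution exists."""
    s = n * wavelength / (2.0 * d)
    if s > 1.0:
        return None  # the order n cannot be observed for this spacing and wavelength
    return math.degrees(math.asin(s))

wavelength = 1.542   # Å, CuKα
d = 2.0              # Å, hypothetical interplanar spacing

for n in (1, 2, 3):
    print(n, bragg_angle(d, wavelength, n))
```

With these values the first and second orders are observable while the third is not, because $3\lambda/2d$ exceeds unity.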
The point of view can be further simplified by observing that the family of fictitious lattice planes with indices $h' = nh$, $k' = nk$, $l' = nl$ has interplanar spacing $d_{H'} = d_H/n$. Now eqn (3.32) can be written as
lattice at O. When the vector $\mathbf{r}^*_H$ is on the surface of the sphere then the corresponding direct lattice planes will lie parallel to IP and will make an angle $\theta$ with the primary beam. The relation
$$OP = r^*_H = 1/d_H = IO\sin\theta = 2\sin\theta/\lambda$$
holds, which coincides with Bragg's equation. Therefore: the necessary and sufficient condition for the Bragg equation to be verified for the family of planes $(hkl)$ is that the lattice point defined by the vector $\mathbf{r}^*_H$ lies on the surface of the sphere called the reflection or Ewald sphere. AP is the direction of the diffracted waves (it makes an angle of $2\theta$ with the primary beam): therefore we can suppose that the crystal is at A.
For X-rays and neutrons $\lambda \approx$ (0.5-2) Å, which is comparable with the dimensions of the unit cell (~10 Å): the sphere then has appreciable curvature with respect to the planes of the reciprocal lattice. If the primary beam is monochromatic and the crystal randomly oriented, in general no point of the reciprocal lattice will be in contact with the surface of the Ewald sphere except the (000) point, which represents scattering in the direction of the primary beam. It will be seen in Chapter 4 that the experimental techniques aim to bring as many nodes of the reciprocal lattice as possible into contact with the surface of the reflection sphere.
In electron diffraction $\lambda \approx 0.05$ Å: therefore the curvature of the Ewald sphere is small with respect to the planes of the reciprocal lattice. A very high number of lattice points can simultaneously be in contact with the surface of the sphere: for instance, all the points belonging to a plane of the reciprocal lattice passing through O.
If $r^*_H > 2/\lambda$ (then $d_H < \lambda/2$) we will not be able to observe the reflection $H$. This condition defines the so-called limiting sphere, with centre O and radius $2/\lambda$: only the lattice points inside the limiting sphere will be able to diffract. Vice versa if $\lambda > 2a_{\max}$, where $a_{\max}$ is the largest period of the unit cell, then the diameter of the Ewald sphere will be smaller than $r^*_{\min}$ (the smallest period of the reciprocal lattice). Under these conditions no node could intercept the surface of the reflection sphere. That is the reason why we can never obtain diffraction of visible light (wavelength ~5000 Å) from crystals.
The wavelength determines the amount of information available from an experiment. In ideal conditions the wavelength should be short enough to leave out of the limiting sphere only the lattice points with diffraction intensities close to zero due to the decrease of atomic scattering factors.
Friedel law
In accordance with eqn (3.30) we write $F_H = A_H + iB_H$. Then it will also be: $F_{-H} = A_H - iB_H$ and consequently
$$\varphi_{-H} = -\varphi_H \qquad (3.34)$$
From that the Friedel law is deduced, according to which the diffraction intensities associated with the vectors $H$ and $-H$ of the reciprocal space are equal. Since these intensities appear to be related by a centre of symmetry, it is usually said, although imprecisely, that the diffraction by itself introduces a centre of symmetry.
$$F_{\bar{H}R}\exp(2\pi i\,\bar{H}T) = \sum_{j=1}^{N} f_j \exp(2\pi i\,\bar{H}RX_j)\exp(2\pi i\,\bar{H}T) = \sum_{j=1}^{N} f_j \exp 2\pi i\,\bar{H}(RX_j + T) = F_H$$
we can write
$$F_{\bar{H}R} = F_H \exp(-2\pi i\,\bar{H}T). \qquad (3.35)$$
Sometimes it is convenient to split eqn (3.35) into two relations:
From (3.36) it is concluded that intensities $I_H$ and $I_{\bar{H}R}$ are equal while their phases are related by eqn (3.37). The most relevant consequences of eqn (3.35) are described in the following.
If the space group was $Pm$ $[(x, y, z), (x, \bar{y}, z)]$, by using $R_2$ and by applying the Friedel law we would obtain eqn (3.38) again. If the space group was $P2/m$ eqn (3.38) would be obtained again only by using matrices $R_2$, $R_3$, and $R_4$ in eqn (3.36). This time the Friedel law does not add any additional relationship to those obtained from eqn (3.36). We can conclude that the symmetry of the diffraction intensities from crystals belonging to space groups $P2$, $Pm$ and $P2/m$ is that of the Laue class $2/m$.
The reader will easily verify that the crystals belonging to groups $P222$, $Pmm2$, and $P2/m\,2/m\,2/m$ show intensity symmetry of the $2/m\,2/m\,2/m$ Laue class:
$$|F_{hkl}| = |F_{\bar{h}kl}| = |F_{h\bar{k}l}| = |F_{hk\bar{l}}| = |F_{\bar{h}\bar{k}\bar{l}}| = |F_{h\bar{k}\bar{l}}| = |F_{\bar{h}k\bar{l}}| = |F_{\bar{h}\bar{k}l}|$$
will exist. In this case every reflection is a restricted phase reflection and will assume the values $\pi\bar{H}T$ or $\pi(\bar{H}T + 1)$. If the origin is assumed on the centre of symmetry then $T = 0$ and the permitted phase values are 0 and $\pi$. Then according to eqn (3.30b), $F_H$ will be a real positive number for $\varphi_H$ equal to 0, and a negative one for $\varphi_H$ equal to $\pi$. For this reason we usually talk in centrosymmetric space groups about the sign of the structure factor instead of about the phase.
In Fig. 3.10 $F_H$ is represented in the complex plane for a centrosymmetric structure of six atoms. Since for each atom at $\mathbf{r}_j$ another symmetry equivalent atom exists at $-\mathbf{r}_j$, the contribution of every couple to $F_H$ will have to be real.
As an example of a non-centrosymmetric space group let us examine $P2_12_12_1$ $[(x, y, z), (\tfrac{1}{2} - x, \bar{y}, \tfrac{1}{2} + z), (\tfrac{1}{2} + x, \tfrac{1}{2} - y, \bar{z}), (\bar{x}, \tfrac{1}{2} + y, \tfrac{1}{2} - z)]$ where the reflections $(hk0)$, $(0kl)$, $(h0l)$ satisfy the relation $\bar{H}R = -\bar{H}$ for $R = R_2, R_3, R_4$ respectively. By introducing $T = T_2$ in eqn (3.39) we obtain
Point group    Restricted phase reflections
$1$            None
$\bar{1}$      All
$m$            (0, k, 0)
$2$            (h, 0, l)
$2/m$          All
$mm2$          [(h, k, 0) masks (h, 0, 0), (0, k, 0)]
$222$          Three principal zones only
$mmm$          All
$4$            (h, k, 0)
$\bar{4}$      (h, k, 0); (0, 0, l)
$4/m$          All
$422$          (h, k, 0); {h, 0, l}; {h, h, l}
$\bar{4}2m$    [(h, k, 0), {h, h, 0}]; [{h, 0, l}, (0, 0, l)]
$4mm$          [(h, k, 0), {h, 0, 0}, {h, h, 0}]
$4/mmm$        All
$3$            None
$\bar{3}$      All
$3m$           {h, 0, $\bar{h}$, 0}
$32$           {h, 0, $\bar{h}$, l}
$\bar{3}m$     All
$6$            (h, k, 0)
$\bar{6}$      (0, 0, l)
$6/m$          All
$\bar{6}m2$    [{h, h, l}, {h, h, 0}, (0, 0, l)]
$6mm$          [(h, k, 0), {h, h, 0}, {h, 0, 0}]
$622$          (h, k, 0); (h, 0, l); (h, h, l)
$6/mmm$        All
$23$           {h, k, 0}
$m\bar{3}$     All
$\bar{4}3m$    ({h, k, 0}, {h, h, 0})
$432$          {h, k, 0}; {h, h, l}
$m\bar{3}m$    All
Systematic absences
Let us look for the class of reflections for which $\bar{H}R = \bar{H}$ and let us apply eqn (3.35). This relation would be violated for those reflections for which $\bar{H}T$ is not an integer number unless $|F_H| = 0$. From this fact the rule follows: reflections for which $\bar{H}R = \bar{H}$ and $\bar{H}T$ is not an integer will have diffraction intensity zero or, as usually said, will be systematically absent or extinct. Let us give a few examples.
In the space group $P2_1$ $[(x, y, z), (\bar{x}, y + \tfrac{1}{2}, \bar{z})]$ the reflections $(0k0)$ satisfy the condition $\bar{H}R_2 = \bar{H}$. If $k$ is odd, $\bar{H}T_2$ is half-integer. Thus, the reflections $(0k0)$ with $k \neq 2n$ are systematically absent.
In the space group $P4_1$ $[(x, y, z), (\bar{x}, \bar{y}, \tfrac{1}{2} + z), (\bar{y}, x, \tfrac{1}{4} + z), (y, \bar{x}, \tfrac{3}{4} + z)]$ only the reflections $(00l)$ satisfy the condition $\bar{H}R_j = \bar{H}$ for $j = 2, 3, 4$. Since $\bar{H}T_2 = l/2$, $\bar{H}T_3 = l/4$, $\bar{H}T_4 = 3l/4$, the only condition for systematic absence is $l \neq 4n$, with $n$ integer.
In the space group $Pc$ $[(x, y, z), (x, \bar{y}, z + \tfrac{1}{2})]$ the reflections $(h0l)$ satisfy the condition $\bar{H}R_2 = \bar{H}$. Since $\bar{H}T_2 = l/2$ the reflections $(h0l)$ with $l \neq 2n$ will be systematically absent.
Note that the presence of a glide plane imposes conditions for systematic absences on two-dimensional classes of reflections. In particular, glide planes perpendicular to $a$, $b$, and $c$ impose conditions on classes $(0kl)$, $(h0l)$, and $(hk0)$ respectively. The conditions will be $h = 2n$, $k = 2n$, $l = 2n$ for glide planes of type $a$, $b$, or $c$ respectively. The reader can easily check the data listed in Table 3.2.
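The rule stated above (a reflection is absent when $\bar{H}R = \bar{H}$ and $\bar{H}T$ is not an integer) is simple enough to test mechanically. The sketch below is illustrative: the operator lists are transcribed from the three examples in the text, the helper name `is_absent` is hypothetical, and exact fractions are used to avoid rounding problems.

```python
from fractions import Fraction as Fr

def is_absent(hkl, operators):
    """True if some operator (R, T) gives H R = H with H T non-integer.

    hkl is treated as a row vector; R is a 3x3 integer matrix given as rows,
    T a tuple of three Fractions.
    """
    for R, T in operators:
        HR = tuple(sum(hkl[i] * R[i][j] for i in range(3)) for j in range(3))
        HT = sum(Fr(hkl[i]) * T[i] for i in range(3))
        if HR == tuple(hkl) and HT.denominator != 1:
            return True
    return False

# P2_1: (-x, y + 1/2, -z)
P21 = [([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], (Fr(0), Fr(1, 2), Fr(0)))]
# Pc: (x, -y, z + 1/2)
Pc = [([[1, 0, 0], [0, -1, 0], [0, 0, 1]], (Fr(0), Fr(0), Fr(1, 2)))]
# P4_1: (-x, -y, 1/2 + z), (-y, x, 1/4 + z), (y, -x, 3/4 + z)
P41 = [
    ([[-1, 0, 0], [0, -1, 0], [0, 0, 1]], (Fr(0), Fr(0), Fr(1, 2))),
    ([[0, -1, 0], [1, 0, 0], [0, 0, 1]], (Fr(0), Fr(0), Fr(1, 4))),
    ([[0, 1, 0], [-1, 0, 0], [0, 0, 1]], (Fr(0), Fr(0), Fr(3, 4))),
]

print([k for k in range(1, 7) if is_absent((0, k, 0), P21)])   # odd k
print([l for l in range(1, 9) if is_absent((0, 0, l), P41)])   # l not a multiple of 4
```

The same function can be applied to centring translations by using the identity matrix for `R`.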
Let us now apply the same considerations to the symmetry operators centring the cell. If the cell is of type $A$, $B$, $C$, $I$, symmetry operators will exist whose rotational matrix is always the identity while the translational matrix is:
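Symmetry element                          Reflection type   Conditions for presence
Lattice P                                 hkl               none
Lattice I                                 hkl               h + k + l = 2n
Lattice C                                 hkl               h + k = 2n
Lattice A                                 hkl               k + l = 2n
Lattice B                                 hkl               h + l = 2n
Lattice F                                 hkl               h + k = 2n, k + l = 2n, h + l = 2n
Lattice R (obverse)                       hkl               -h + k + l = 3n
Lattice R (reverse)                       hkl               h - k + l = 3n
Glide plane ∥ (001): a, b, n, d           hk0
Glide plane ∥ (100): b, c, n, d           0kl
Glide plane ∥ (110): c, b, n, d           hhl
Screw axis ∥ b: $2_1$, $4_2$; $4_1$, $4_3$  0k0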
Consequently only the reflections for which $h$, $k$, and $l$ are all even or all odd will be present.
The same criteria lead us to establish the conditions for systematic absences ($-h + k + l \neq 3n$ for obverse setting and $h - k + l \neq 3n$ for reverse setting) for a hexagonal cell with rhombohedral lattice.
Rules for systematic absences may also be derived by using the explicit algebraic form of the structure factor. Suppose, for example, that the space group contains a $c$-glide plane perpendicular to the $b$ axis (then $(x, y, z)$ and $(x, \bar{y}, z + \tfrac{1}{2})$ will be symmetry equivalent points). The structure factor is then
$$F_{hkl} = \sum_{j=1}^{N/2} f_j \exp 2\pi i(hx_j + ky_j + lz_j) + \sum_{j=1}^{N/2} f_j \exp 2\pi i[hx_j - ky_j + l(z_j + \tfrac{1}{2})].$$
For $l$ even
$$F_{hkl} = 2\sum_{j=1}^{N/2} f_j \exp[2\pi i(hx_j + lz_j)]\cos 2\pi ky_j;$$
for $l$ odd
$$F_{hkl} = 2i\sum_{j=1}^{N/2} f_j \exp[2\pi i(hx_j + lz_j)]\sin 2\pi ky_j.$$
It is easily seen that $F_{h0l} = 0$ for $l$ odd, in accordance with our previous results.
Diffraction intensities
The theory so far described is called kinematic: basically it calculates
interference effects between the elementary waves scattered inside the
where $t_0$ and $t$ are lengths in the direction of the primary and diffracted beams respectively, $\sigma$ is the diffracted power per unit distance and intensity. The equations have to be solved subject to the boundary conditions: $I_0$ should be equal to the intensity of the primary beam when $t_0 = 0$ and $I = 0$ when $t = 0$. The sum of the two equations is zero, which is the condition for the conservation of energy. Zachariasen's theory has been modified by other authors: the introduction of an extinction correction parameter in
Anomalous dispersion
It is well known that electrons are bound to the nucleus by forces which
depend on the atomic field strength and on the quantum state of the
electron. Therefore they have to be considered as oscillators with natural
frequencies. If the frequency of the primary beam is near to some of these
natural frequencies resonance will take place. The scattering under these
conditions is called anomalous and can be analytically expressed by
substitution of the atomic scattering factor fa defined earlier by a complex
quantity
$\Delta f'$ and $f''$ are called the real and imaginary dispersion corrections. In order to have a simple insight into the problem (a rigorous quantum-mechanical treatment was carried out by Hönl) we recall that the classical differential equation describing the motion of a particle of mass $m$ and charge $e$ in an alternating field of intensity $E_{0i}\exp(i\omega t)$ is
moment:
$$E_d = E_{0d}\exp(i\omega t) = \frac{\omega^2 e\,x(t)}{rc^2} = \frac{\omega^2 e^2}{mrc^2}\,\frac{E_{0i}\exp(i\omega t)}{\omega_0^2 - \omega^2 + ig\omega}.$$
If the electron is unrestrained and undamped then $g = \omega_0 = 0$ and
$$E_d = (E_d)_{\mathrm{Th}} = -\frac{e^2}{mrc^2}\,E_{0i}\exp(i\omega t)$$
which agrees well with eqn (3.1) suggested by Thomson: $\pi$ is the phase lag between the scattered and the incident radiation.
Since $g \ll \omega$, when $\omega \gg \omega_0$ the expression of $E_d$ is not very different from that of a free electron. Therefore Thomson scattering is only applicable when $\omega \gg \omega_0$.
We define now the scattering factor for an electron as the ratio
While the imaginary term is always positive, the real term is negative when $\omega < \omega_0$ and positive when $\omega > \omega_0$. From the quantum-theory point of view, the frequency $\omega_0$ coincides with that of a photon with just sufficient energy to eject the electron from the atom. Such an energy corresponds to the wavelength $\lambda_0 = 2\pi c/\omega_0$ corresponding to the absorption edge. Thus it may be expected that a remarkable deviation from Thomson scattering will arise when the primary beam wavelength is close to an absorption edge of the atom being considered.
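The sign behaviour just described can be illustrated numerically. The sketch below assumes that the electron scattering factor is the ratio of the damped-oscillator amplitude to the Thomson amplitude given above, i.e. $f = \omega^2/(\omega^2 - \omega_0^2 - ig\omega)$; the frequencies are in arbitrary units and the damping value is purely illustrative.

```python
def electron_scattering_factor(omega, omega0, g):
    """Classical damped-oscillator scattering factor, assumed here to be
    E_d / (E_d)_Th = ω^2 / (ω^2 - ω0^2 - i g ω)."""
    return omega ** 2 / complex(omega ** 2 - omega0 ** 2, -g * omega)

omega0, g = 1.0, 0.01          # hypothetical natural frequency and damping
for omega in (0.9, 1.1, 100.0):
    f = electron_scattering_factor(omega, omega0, g)
    print(omega, round(f.real, 4), round(f.imag, 4))
```

Below the natural frequency the real part is negative, above it the real part is positive, the imaginary part is positive in both cases, and far above $\omega_0$ the factor tends to the Thomson value of 1.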
An important question is whether $\Delta f'$ and $f''$ vary with diffraction angle. Existing theoretical treatments suggest changes of some per cent with $\sin\theta/\lambda$ but no rigorous experimental checks have been made so far: therefore in most of the routine applications $\Delta f'$ and $f''$ are considered to be constant.
For most substances at most X-ray wavelengths from conventional sources dispersion corrections are rather small. Calculated values for CrK$\alpha$ ($\lambda = 2.291$ Å), CuK$\alpha$ ($\lambda = 1.542$ Å), and MoK$\alpha$ ($\lambda = 0.7107$ Å) are listed in the International tables for x-ray crystallography, Vol. III. In some special cases ordinary X-ray sources can also generate relevant dispersion effects. For example, holmium has the $L_3$ absorption edge (~1.5368 Å) very close to CuK$\alpha$ radiation: in this case the holmium scattering factor is not the same for K$\alpha_1$ and K$\alpha_2$ wavelengths. The following dispersion corrections are calculated:
CuK$\alpha_1$ ($\lambda = 1.5406$ Å): $\Delta f' = -15.41$, $f'' = 3.70$
where + and - indicate that the magnitudes are calculated for the vectors $H$ and $-H$ respectively. The subscripts P and Q indicate that the structure factors are calculated only with the contribution of P or Q atoms
[Figure caption fragment: (b) samarium near the absorption edge.]
respectively:
$$F''^{+}_{P} = \sum_{j=1}^{P} f''_j \exp 2\pi i\,\bar{H}X_j;$$
from which
$$\Delta I = 4\,|F''_P|\,|F_Q|\cos\varphi.$$
Furthermore
In general, as we can see, $|F_H| = |F_{-H}|$ is no longer valid, i.e. the Friedel law is not satisfied in the presence of anomalous dispersion. The value of $\Delta I$ depends on the collinearity of $F_P$ and $F_Q$. If they are collinear then $|F^+| = |F^-|$: but this happens by mere chance. $\Delta I$ is a maximum when $F_P$ and $F_Q$ are approximately at right angles.
The Friedel law is satisfied if: the structure is centrosymmetric (in this case $|F^+|$ and $|F^-|$ are always equal); the reflection is centrosymmetric even if the structure is non-centrosymmetric; the crystal is constituted of only one chemical element which is the anomalous scatterer.
As a last observation it should be mentioned that besides X-ray, neutron
and gamma-ray anomalous dispersion are also very useful in crystal
structure analysis. Neutron anomalous dispersion techniques employ
$$= \frac{1}{V}\sum_{h,k,l=-\infty}^{+\infty} F_{hkl}\exp[-2\pi i(hx + ky + lz)]. \qquad (3.45)$$
$X = [x, y, z]$ are the fractional coordinates of the point defined by the vector $\mathbf{r}$. The atomic positions will correspond to the maxima of $\rho(\mathbf{r})$.
If in eqn (3.45) we sum up the contributions of H and -H we will have
from which
$\Phi(\mathbf{r}^*)$ is the form function: $\Phi(\mathbf{r}^*) = 1$ inside the available reflection sphere, $\Phi(\mathbf{r}^*) = 0$ outside this sphere. According to eqn (3.A.35) we have
directions). In practical cases $N_r$ and $N_p$ are rather large: for instance, for $V = 1000$ Å$^3$, $d_{\min} = 0.8$ Å, $\Delta = 0.25$ Å, we have $N_r \approx 8180$ and $N_p \approx 64\,000$.
If symmetry is present the amount of calculation is smaller. The number of independent reflections to be measured is roughly $N_r/(\tau m)$ where $\tau$ is the centring order of the cell and $m$ the multiplicity factor of the Laue class (this is not strictly exact as the multiplicity factor refers to general reflections of type $(hkl)$ and may be different and less than $m$ for certain zones of reflections). Furthermore, it will be sufficient to sample $\rho$ upon the grid points lying inside the asymmetric unit for reconstructing the whole content of the cell.
For instance, let $P2/m$ be the space group with $a = 7.8$ Å, $b = 16.2$ Å and $c = 8.1$ Å and $\beta = 93°$. If we divide $a$ and $c$ into 33 and $b$ into 66 intervals the grid spacing will have a sufficient and almost identical resolution in all three directions. The number of grid points lying inside the asymmetric unit (1/4 of the unit cell) is now 33 × 33 × 17 = 18 513.
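The figures quoted above can be reproduced with a short sketch. It assumes that $N_r$ is the number of reciprocal lattice points inside a sphere of radius $1/d_{\min}$, i.e. $N_r \approx (4\pi/3)V/d_{\min}^3$, and that $N_p = V/\Delta^3$ is the number of grid points for a grid spacing $\Delta$; both expressions are assumptions chosen because they match the quoted numbers.

```python
import math

def n_reflections(volume, d_min):
    """Approximate number of reciprocal lattice points within resolution d_min,
    assuming N_r = (4π/3) V / d_min^3."""
    return 4.0 * math.pi / 3.0 * volume / d_min ** 3

def n_grid_points(volume, spacing):
    """Number of grid points for a cubic sampling grid, assuming N_p = V / Δ^3."""
    return volume / spacing ** 3

print(round(n_reflections(1000.0, 0.8)))      # close to 8180
print(round(n_grid_points(1000.0, 0.25)))     # 64 000

# Grid example for the P2/m cell: spacings along a, b, c and points in the asymmetric unit.
spacings = (7.8 / 33, 16.2 / 66, 8.1 / 33)
points_in_asymmetric_unit = 33 * 33 * 17
print([round(s, 3) for s in spacings], points_in_asymmetric_unit)
```

The three grid spacings come out between about 0.24 and 0.25 Å, which is what "almost identical resolution" refers to.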
Very often the volume of the unit cell is much larger than $10^3$ Å$^3$ ($V > 10^6$ Å$^3$ is not infrequent for macromolecules). Thus even with the use of high-speed computers, the calculation of $\rho$ is a fairly arduous task involving time-consuming procedures. Different algorithms are used to make calculations faster. The most convenient are the Beevers-Lipson technique and the fast Fourier transform algorithm by Cooley and Tukey (see Chapter 2, pp. 88-90, and Appendix 2.1).
Unfortunately, it is not possible to apply eqn (3.47) only on the basis of information obtained directly from X-ray diffraction. Indeed, according to eqn (3.41), only the moduli $|F_H|$ can be obtained from diffraction intensities because the corresponding phase information is lost. This is the so-called crystallographic phase problem: how to identify the atomic positions starting only from the moduli $|F_H|$. A general solution to the problem has not been found, but there are methods we can successfully apply (see Chapters 5 and 8).
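$$\sum_{m=-\infty}^{+\infty} \exp(-im\alpha)\,J_{-m}(z)$$
may be applied. $J_m$ (see Fig. 3.17) is the Bessel function of the first kind of order $m$, satisfying $J_{-m}(z) = (-1)^m J_m(z)$. Then (3.48) reduces to
$$f_j(\mathbf{r}^*)\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_j^0)\sum_u \exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_u)\sum_m \exp\{im[\varphi_j - 2\pi\mathbf{K}\cdot(\mathbf{r}_j^0 + \mathbf{r}_u)]\}\,J_{-m}(2\pi\mathbf{r}^*\cdot\mathbf{g}_j)$$
$$= f_j(\mathbf{r}^*)\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_j^0)\sum_m J_{-m}(2\pi\mathbf{r}^*\cdot\mathbf{g}_j)\exp\{im(\varphi_j - 2\pi\mathbf{K}\cdot\mathbf{r}_j^0)\}\sum_u \exp[2\pi i\,\mathbf{r}_u\cdot(\mathbf{r}^* - m\mathbf{K})].$$
Fig. 3.17. The Bessel functions $J_n(z)$, $n = 0, 1, 2, 3$.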
Provided the number of cells in the crystal is large enough the sum over $u$ leads to $(1/V)\,\delta(\mathbf{H} - \mathbf{r}^* + m\mathbf{K})$ where $\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$. Consequently the reflections occur for $\mathbf{r}^* = \mathbf{H}' = \mathbf{H} + m\mathbf{K}$: for $m = 0$ we have main reflections ($\mathbf{H}' = \mathbf{H}$), for $m \neq 0$ satellites are defined. We also see that four indices are now needed for the identification of a diffraction effect. The structure factor of the reflections $\mathbf{H}' = (h, k, l, m)$ may then be written as
$$F_{\mathbf{H}'} = \sum_{j=1}^{N} f_j(\mathbf{H}')\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_j^0)\,J_m(2\pi\mathbf{H}'\cdot\mathbf{g}_j)\,(-1)^m \exp(-2\pi i\,m\mathbf{K}\cdot\mathbf{r}_j^0)\exp(im\varphi_j). \qquad (3.49)$$
According to Fig. 3.17 the average intensity of satellite reflections rapidly decreases with $m$.
The above formalism may be extended to one-dimensionally density
modulated structures and also to multi-dimensional (harmonic or not)
modulations. The above results also suggest that the reciprocal lattice of an
IMS is aperiodic in the three-dimensional space, and that the symmetry
group of an IMS cannot be a three-dimensional space group. We will show
in Appendix 3.E (main references will also be given) that such a reciprocal
lattice may be transformed into a periodic lattice provided a higher-
dimensional space is taken into consideration. Thus the symmetry in the
three-dimensional space can also be a poor residue of the full symmetry in
the higher-dimensional space.
Appendices
where $S$ indicates the integration space. Thus the delta function corresponds to an infinitely sharp line of unit weight located at $\mathbf{r}_0$. It is easily seen that, if $\mathbf{r}_0 = x_0\mathbf{a} + y_0\mathbf{b} + z_0\mathbf{c}$, then
$$\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0). \qquad (3.A.2)$$
$\delta(x - x_0)$ may be considered as the limit of different analytical functions. For example, as the limit for $\sigma \to 0$ of the Gaussian function
where $x^*$ is a real variable. It is easily seen that (3.A.4) satisfies the properties
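$$\delta(\mathbf{r} - \mathbf{r}_0) = \int_{S^*} \exp[2\pi i\,\mathbf{r}^*\cdot(\mathbf{r} - \mathbf{r}_0)]\,d\mathbf{r}^* \qquad (3.A.6)$$
is derived. Consequently
$$L(x) = \sum_{n=-\infty}^{+\infty} \delta(x - x_n)$$
where $\mathbf{r}_{u,v,w} = u\mathbf{a} + v\mathbf{b} + w\mathbf{c}$ and $u$, $v$, $w$ are integer values.
Accordingly, in a three-dimensional space:
(1) a periodic array of points along the $z$ axis with positions $z_n = nc$ may be represented as
$$P_1(\mathbf{r}) = \delta(x)\,\delta(y)\sum_{n=-\infty}^{+\infty}\delta(z - z_n); \qquad (3.A.12)$$
(2) a series of lines in the $(x, z)$ plane, parallel to $x$ and separated by $c$ may be represented by
$$P_3(\mathbf{r}) = \sum_{n=-\infty}^{+\infty}\delta(z - z_n). \qquad (3.A.14)$$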
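$$\int_S \rho(\mathbf{r}')\left(\int_{S^*}\exp[2\pi i\,\mathbf{r}^*\cdot(\mathbf{r}' - \mathbf{r})]\,d\mathbf{r}^*\right)d\mathbf{r}',$$
but
$$TT[\rho(\mathbf{r})] = T[F(\mathbf{r}^*)] = \rho(-\mathbf{r})$$
$F(\mathbf{r}^*)$ is a complex function: by denoting
$$A(\mathbf{r}^*) = \int_S \rho(\mathbf{r})\cos(2\pi\,\mathbf{r}^*\cdot\mathbf{r})\,d\mathbf{r}$$
$$B(\mathbf{r}^*) = \int_S \rho(\mathbf{r})\sin(2\pi\,\mathbf{r}^*\cdot\mathbf{r})\,d\mathbf{r}$$
then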
Since
+m
3. Rectangular aperture:
$\rho(x) = c$ for $-g < x < g$, otherwise $\rho(x) = 0$.
Then
$$F(x^*) = c\int_{-g}^{+g}\exp(2\pi i x^* x)\,dx = c\,\frac{\sin(2\pi g x^*)}{\pi x^*}$$
which is plotted in Fig. 3.A.1.
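The closed form for the rectangular aperture can be compared with a direct numerical evaluation of the integral. The sketch below uses a simple midpoint rule; the function names, the aperture half-width and the number of integration steps are illustrative choices.

```python
import cmath
import math

def ft_numeric(c, g, x_star, steps=20000):
    """Midpoint-rule estimate of c * integral_{-g}^{g} exp(2πi x* x) dx."""
    dx = 2.0 * g / steps
    total = sum(cmath.exp(2j * math.pi * x_star * (-g + (i + 0.5) * dx))
                for i in range(steps))
    return c * total * dx

def ft_analytic(c, g, x_star):
    """Closed form c sin(2π g x*) / (π x*), with the limit 2 c g at x* = 0."""
    if x_star == 0:
        return 2.0 * c * g
    return c * math.sin(2.0 * math.pi * g * x_star) / (math.pi * x_star)

c, g = 1.0, 0.5
for x_star in (0.0, 0.7, 1.0):
    print(x_star, ft_analytic(c, g, x_star), abs(ft_numeric(c, g, x_star)))
```

The transform has its maximum $2cg$ at $x^* = 0$ and its first zero at $x^* = 1/(2g)$, so a wider aperture gives a narrower central peak.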
4. Dirac delta function: $\rho(x) = \delta(x)$. Then (see Fig. 3.A.2(c))
+m
$$T[\rho_p(x)] = \sum_{n=-p}^{p}\exp(2\pi i n a x^*) = \sum_{n=-p}^{p}\cos(2\pi n a x^*)$$
$$= \frac{1}{2\sin(\pi a x^*)}\sum_{n=-p}^{p}\{\sin[\pi(2n + 1)ax^*] - \sin[\pi(2n - 1)ax^*]\} =$$
$$= \frac{1}{2\sin(\pi a x^*)}\{\sin[\pi(2p + 1)ax^*] - \sin[\pi(2p - 1)ax^*] + \sin[\pi(2p - 1)ax^*] - \sin[\pi(2p - 3)ax^*] + \cdots + \sin[\pi(-2p + 1)ax^*] - \sin[\pi(-2p - 1)ax^*]\}$$
$$= \frac{\sin[(2p + 1)\pi a x^*]}{\sin \pi a x^*} = \frac{\sin N\pi a x^*}{\sin \pi a x^*}.$$
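The function
$$f(y) = \frac{\sin N\pi y}{\sin \pi y}$$
$$F(x^*) = \lim_{N\to\infty}\frac{\sin N\pi a x^*}{\sin \pi a x^*}.$$
The function $F(x^*)$ will present infinitely sharp lines at $x^* = h/a$ of weight $1/a$. Indeed
$$\int_{-\varepsilon}^{+\varepsilon}\frac{\sin N\pi a x^*}{\sin \pi a x^*}\,dx^* = \frac{1}{a}\lim\int_{-\varepsilon'}^{+\varepsilon'}\frac{\sin N\pi y}{\sin \pi y}\,dy.$$
Whichever the value of $\varepsilon'$, when $N \to \infty$ the value of the integral is unity.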
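Both steps of this argument can be checked numerically: the identity between the cosine sum and the ratio of sines, and the fact that the area under one peak tends to unity. The sketch below works in the variable $y = ax^*$ and integrates the cosine sum term by term; the function names and the chosen values of $p$ and $\varepsilon$ are illustrative.

```python
import math

def interference(N, y):
    """sin(N π y) / sin(π y), valid away from integer y."""
    return math.sin(N * math.pi * y) / math.sin(math.pi * y)

def cosine_sum(p, y):
    """Sum of cos(2π n y) for n = -p, ..., p, equal to interference(2p + 1, y)."""
    return sum(math.cos(2.0 * math.pi * n * y) for n in range(-p, p + 1))

def peak_weight(p, eps):
    """Integral of the cosine sum over [-eps, eps], evaluated term by term."""
    return 2.0 * eps + 2.0 * sum(math.sin(2.0 * math.pi * n * eps) / (math.pi * n)
                                 for n in range(1, p + 1))

print(interference(11, 0.013), cosine_sum(5, 0.013))
for p in (5, 50, 5000):
    print(p, peak_weight(p, 0.1))
```

As $p$ grows the weight of the peak at $y = 0$ approaches 1 whatever small $\varepsilon$ is chosen; in the variable $x^* = y/a$ this corresponds to a weight $1/a$.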
$$F(\mathbf{r}^*) = \sum_{u=-p_1}^{p_1}\sum_{v=-p_2}^{p_2}\sum_{w=-p_3}^{p_3}\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{r}_{u,v,w})$$
with integer values of $h$, $k$, $l$. It is easily seen that the solution of the above three equations is given by
$$\mathbf{r}^*_H = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*,$$
where
and $V = \mathbf{a}\cdot\mathbf{b}\wedge\mathbf{c}$. The vectors $\mathbf{a}^*$, $\mathbf{b}^*$, $\mathbf{c}^*$ are nothing else but the basic vectors of the reciprocal lattice defined in § 2.3. When $N_1$, $N_2$, $N_3$ are sufficiently large then $F(\mathbf{r}^*)$ has appreciable values only in the reciprocal lattice points defined by the triple of integers $H = (h, k, l)$.
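Its Fourier transform is the limit of (3.A.29) for $N_1$, $N_2$, $N_3$ tending to infinity:
$$F(\mathbf{r}^*) = \lim_{N_1,N_2,N_3\to\infty}\frac{\sin N_1\pi\,\mathbf{a}\cdot\mathbf{r}^*}{\sin \pi\,\mathbf{a}\cdot\mathbf{r}^*}\,\frac{\sin N_2\pi\,\mathbf{b}\cdot\mathbf{r}^*}{\sin \pi\,\mathbf{b}\cdot\mathbf{r}^*}\,\frac{\sin N_3\pi\,\mathbf{c}\cdot\mathbf{r}^*}{\sin \pi\,\mathbf{c}\cdot\mathbf{r}^*}.$$
According to points 6 and 7 $F(\mathbf{r}^*)$ represents a three-dimensional lattice by an array of delta functions the weight of which may be calculated by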
sin nh
sin N2nk sin N,nl d l = V * = -1
x dhl
E2
sin n~ v
arises. In conclusion, the Fourier transform of a lattice in direct space (represented by the function $L(\mathbf{r})$) is the function $L(\mathbf{r}^*)/V$:
which represents a lattice again (called the reciprocal lattice) in the Fourier transform space.
9. Fourier transform of a one-dimensional periodic array of points along the $z$ axis, as defined by (3.A.12). Then
10. Fourier transform of a lattice plane lying on the plane $z = 0$ and with translation constants $a$ and $b$. Then
$$\rho(\mathbf{r}) = \sum_{n}\sum_{m}\delta(x - na)\,\delta(y - mb)\,\delta(z)$$
and
$$x = r\sin\varphi\cos\theta, \qquad y = r\sin\varphi\sin\theta, \qquad z = r\cos\varphi$$
with $r > 0$, $0 \le \varphi \le \pi$, $0 \le \theta < 2\pi$. Analogous transformations could be written for $\mathbf{r}^*$. Without loss of generality we can choose $z$ along the $\mathbf{r}^*$ direction: then $\mathbf{r}\cdot\mathbf{r}^* = rr^*\cos\varphi$. Furthermore, for each point with coordinates $(r, \theta, \varphi)$ another point will exist, equivalent to the first, with coordinates $(r, \pi + \theta, \pi - \varphi)$. The contribution of both the points to the integral (3.A.15) will be $\exp(2\pi i rr^*\cos\varphi) + \exp[2\pi i rr^*\cos(\pi - \varphi)] = 2\cos(2\pi rr^*\cos\varphi)$. Thus (3.A.15) reduces to
$$F(r^*) = \int_0^{\infty}\int_0^{\pi}\int_0^{2\pi}\rho(r)\cos(2\pi rr^*\cos\varphi)\,r^2\sin\varphi\,dr\,d\varphi\,d\theta$$
Fig. 3.A.4. Polar and Cartesian coordinates.
$$F(r^*) = \int_0^{\infty} 4\pi r^2\rho(r)\,\frac{\sin 2\pi rr^*}{2\pi rr^*}\,dr = \int_0^{\infty} U(r)\,\frac{\sin 2\pi rr^*}{2\pi rr^*}\,dr \qquad (3.A.33)$$
where $U(r) = 4\pi r^2\rho(r)$ is the radial distribution function. Thus $F(r^*)$ is also spherically symmetric and its value at $r^* = 0$ is given by
F(r*) = 6 4s2
sin 2nrr*
2nrr *
2
+
r sin 2nrr* dr = $ n ~ ~ r p ( y )
Convolutions
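The convolution (or folding) of two functions ρ(r) and g(r) (it will be
denoted by ρ(r) * g(r)) is defined by the integral
$$C(\mathbf{u}) = \rho(\mathbf{r}) * g(\mathbf{r}) = \int_S \rho(\mathbf{r})\,g(\mathbf{u}-\mathbf{r})\,d\mathbf{r}. \qquad (3.A.34)$$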
Convoluting two functions very often has the effect of 'broadening' the
one by the other. As an example, the convolution of two Gaussian functions
N(σ1, a1) and N(σ2, a2) is the Gaussian function N((σ1² + σ2²)^{1/2}, a1 + a2).
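The Gaussian example can be checked numerically. The following is a minimal sketch, not part of the original text: the grid, the helper names (`gaussian`, `convolve`, `mean_and_sigma`) and the parameter values are illustrative assumptions. It samples two Gaussians, forms their discrete convolution, and measures the mean and standard deviation of the result.

```python
import math

def gaussian(x, sigma, a):
    """Normalised Gaussian with standard deviation sigma and mean a."""
    return math.exp(-0.5 * ((x - a) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def convolve(p, g, dx):
    """Discrete approximation of the convolution integral on a uniform grid."""
    out = [0.0] * (len(p) + len(g) - 1)
    for i, p_i in enumerate(p):
        for j, g_j in enumerate(g):
            out[i + j] += p_i * g_j * dx
    return out

def mean_and_sigma(xs, ys):
    """Mean and standard deviation of a sampled distribution ys(xs)."""
    norm = sum(ys)
    mean = sum(x * y for x, y in zip(xs, ys)) / norm
    var = sum((x - mean) ** 2 * y for x, y in zip(xs, ys)) / norm
    return mean, math.sqrt(var)

dx = 0.1
xs = [-15.0 + i * dx for i in range(301)]       # grid from -15 to +15
p = [gaussian(x, 1.0, 1.5) for x in xs]         # sigma1 = 1, a1 = 1.5 (illustrative)
g = [gaussian(x, 2.0, -0.5) for x in xs]        # sigma2 = 2, a2 = -0.5 (illustrative)

c = convolve(p, g, dx)
# element i + j of the convolution sits at u = xs[i] + xs[j]
us = [2.0 * xs[0] + k * dx for k in range(len(c))]
mean_c, sigma_c = mean_and_sigma(us, c)
```

With these inputs `mean_c` should agree with a1 + a2 and `sigma_c` with (σ1² + σ2²)^{1/2}, to within the discretization error of the grid.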
The convolution operation appears in many scientific areas, and is
involved in the interpretation of most experimental measurements. It
arises, for example, when the intensity of a spectral line is measured by
scanning it with a detector having a finite slit as input aperture, or when
a beam of light passes through a ground-glass screen and is broadened out
into a diffuse beam. Suppose in the second example that ρ(θ) is the angular
distribution of the incident beam and g(θ) is the angular distribution which
would be obtained if the incident beam were perfectly collimated. For any
given ρ(θ) the angular distribution of the transmitted beam is given by:
$$C(\theta) = \int \rho(\theta')\,g(\theta-\theta')\,d\theta' = \rho(\theta) * g(\theta).$$
That may be explained by observing that the component of the
transmitted beam emerging at angle θ due to the light component incident
at angle θ' (and therefore deviated through the angle θ − θ') has intensity
ρ(θ')g(θ − θ'). If interference effects are absent the total intensity in the
direction θ is the integral of ρ(θ')g(θ − θ') over all values of θ'.
A very important theorem for crystallographers is the following:
T[ρ(r) * g(r)] = T[ρ(r)] T[g(r)].   (3.A.35)
The left-hand side of (3.A.35) may be written
If g(r) = ρ(−r), (3.A.34) will represent the autoconvolution of ρ(r) with
itself inverted with respect to the origin: in crystallography it has a special
significance, the 'Patterson function', and will be denoted by P(u). It is
If r and u are assumed to belong to the same space, choosing the same
coordinate system transforms the above relation into
δ(r − r0) * ρ(r) = ρ(r − r0).   (3.A.40)
We see that the convolution of ρ(r) with δ(r − r0) is equivalent to a shift
of the origin by r0 (see Fig. 3.A.6(a)).
Suppose now that f (x) is a function defined between 0 and a. Because of
(3.A.10)
Fig. 3.A.6. Convolutions of the function f with: (a) the Dirac δ(x − a) delta
function; (b) a one-dimensional lattice. In (c) the convolution of the
function f(x, y) with a two-dimensional lattice is shown.
$$\int\!\!\int \rho(x)\,g(u-x)\exp(itu)\,dx\,du.$$
Thus, the mean of the convolution is equal to the sum of the means of the
constituent distributions.
By extending the procedure Table 3.A.1 may be obtained. The following
notation has been used:
are the central moments of order p for the convolution distribution (similar
expression can be derived for the constituent distributions). Accordingly
μ0 = 1, μ1 = 0, μ2 coincides with the variance σ², while γ1 = μ3/σ³ and
γ2 = (μ4/σ⁴) − 3 are the skewness and the excess parameters for any
distribution.
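As an illustration of these definitions, the sketch below computes central moments, skewness and excess for two small discrete distributions and for their convolution. The weights and the helper names are illustrative assumptions. The additivity of the variance and of the third central moment checked at the end is the standard result for convolutions, quoted here on that basis rather than taken from Table 3.A.1.

```python
def convolve(p, g):
    """Discrete convolution of two weight lists sampled on the integers."""
    out = [0.0] * (len(p) + len(g) - 1)
    for i, p_i in enumerate(p):
        for j, g_j in enumerate(g):
            out[i + j] += p_i * g_j
    return out

def central_moment(w, order):
    """Central moment of the given order for weights w[k] located at x = k."""
    norm = sum(w)
    mean = sum(k * w_k for k, w_k in enumerate(w)) / norm
    return sum((k - mean) ** order * w_k for k, w_k in enumerate(w)) / norm

def skewness_and_excess(w):
    """Return (gamma1, gamma2) = (mu3/sigma^3, mu4/sigma^4 - 3)."""
    var = central_moment(w, 2)
    gamma1 = central_moment(w, 3) / var ** 1.5
    gamma2 = central_moment(w, 4) / var ** 2 - 3.0
    return gamma1, gamma2

p = [0.5, 0.3, 0.2]            # illustrative constituent distribution
g = [0.1, 0.2, 0.3, 0.4]       # illustrative constituent distribution
c = convolve(p, g)             # convolution distribution

mu2_sum = central_moment(p, 2) + central_moment(g, 2)
mu3_sum = central_moment(p, 3) + central_moment(g, 3)
gamma1_c, gamma2_c = skewness_and_excess(c)
```

Here `central_moment(c, 2)` and `central_moment(c, 3)` should reproduce `mu2_sum` and `mu3_sum`, while `gamma1_c` and `gamma2_c` describe the shape of the convolution distribution.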
Deconvolution of spectra
It often occurs that an experimentally measured function C may be
considered as the convolution of the functions p and g. If p is known in
advance then it may be of some interest to obtain g. That frequently occurs
in spectroscopy or in powder diffraction, where a spectrum often consists
of overlapping peaks and one wishes to deconvolute a given lineshape
function from it. Effects of such self-deconvolution
are:['"16] the component lines are more clearly distinguished, their location,
area, etc. are more correctly defined, the signal to noise ratio is increased.
Let us consider C as the convolution of the lineshape function p and of
the ideal spectrum g. Then p may be decon~oluted[l~~ from C by taking the
Fourier transform of
and by calculating
T[g] = T[C]/T[p].   (3.A.43)
g is finally obtained as inverse Fourier transform of the right-hand side of
(3.A.43):
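The procedure of (3.A.43) can be sketched on a small noise-free example. Everything below is an illustrative assumption: the one-sided exponential lineshape, the two-line ideal spectrum, the periodic (circular) convolution, and the plain discrete Fourier transform written with the standard library. With real data the division amplifies noise wherever T[p] is small, so some filtering is normally required.

```python
import cmath

def dft(x, inverse=False):
    """Plain discrete Fourier transform (O(n^2)); the inverse is scaled by 1/n."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(p, g):
    """Periodic convolution c[k] = sum_m p[m] g[k - m]."""
    n = len(p)
    return [sum(p[m] * g[(k - m) % n] for m in range(n)) for k in range(n)]

n = 32
g_true = [0.0] * n
g_true[10], g_true[13] = 1.0, 0.6            # ideal spectrum: two sharp lines
p = [0.6 ** m for m in range(n)]             # hypothetical lineshape function
total = sum(p)
p = [v / total for v in p]

c = circular_convolve(p, g_true)             # the 'measured' spectrum C

# (3.A.43): T[g] = T[C] / T[p], followed by the inverse transform
t_g = [t_c / t_p for t_c, t_p in zip(dft(c), dft(p))]
g_rec = [v.real for v in dft(t_g, inverse=True)]
max_error = max(abs(a - b) for a, b in zip(g_rec, g_true))
```

In this idealized case `g_rec` reproduces `g_true` to rounding error, because the transform of the chosen lineshape never vanishes.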
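$$\mathbf{U} = \begin{pmatrix} \langle x'^2\rangle & \langle x'y'\rangle & \langle x'z'\rangle \\ \langle x'y'\rangle & \langle y'^2\rangle & \langle y'z'\rangle \\ \langle x'z'\rangle & \langle y'z'\rangle & \langle z'^2\rangle \end{pmatrix} = \langle \mathbf{X}'\bar{\mathbf{X}}'\rangle.$$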
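The Fourier transform of ρ(X') gives
$$q(\mathbf{X}^*) = \exp(-2\pi^2\,\bar{\mathbf{X}}^*\mathbf{U}^*\mathbf{X}^*) = \exp[-2\pi^2(U^*_{11}x^{*2} + 2U^*_{12}x^*y^* + \dots)]$$
$$\phantom{q(\mathbf{X}^*)} = \exp(-\bar{\mathbf{X}}^*\boldsymbol{\beta}\,\mathbf{X}^*) = \exp[-(\beta_{11}x^{*2} + 2\beta_{12}x^*y^* + \dots)]$$
where U* is the variance-covariance matrix expressed in reciprocal
coordinates and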
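or also
$$\langle u^2\rangle_{\rm equiv} = \frac{1}{6\pi^2}\,{\rm Tr}(\boldsymbol{\beta}\mathbf{G}) = \frac{1}{6\pi^2}\sum_{i,j}\beta_{ij}\,\mathbf{a}_i\cdot\mathbf{a}_j.$$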
The reader will easily derive from (3.B.10d) specific formulae for
specific crystallographic systems: e.g. for cubic, tetragonal, and
orthorhombic systems
⟨u²⟩equiv = (β11 a² + β22 b² + β33 c²)/(6π²),
for hexagonal and trigonal systems (hexagonal setting)
$$P(C) = \int_{\rm ellipsoid} \rho(\mathbf{X}')\,d\mathbf{X}'.$$
The most often used value is C = 1.5382: then[''] the ellipsoid encloses 50
per cent of the trivariate Gaussian probability density.
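The 50 per cent constant can be verified with a short calculation. The sketch assumes that the ellipsoid of scale C has principal semi-axes equal to C times the root-mean-square displacements, so that the enclosed probability of a trivariate Gaussian is the cumulative chi distribution with three degrees of freedom; the function names are illustrative.

```python
import math

def enclosed_probability(c):
    """Fraction of a trivariate Gaussian inside the ellipsoid of scale c.

    Assumes semi-axes equal to c times the r.m.s. displacements along the
    principal directions (chi distribution, three degrees of freedom).
    """
    return (math.erf(c / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * c * math.exp(-0.5 * c * c))

def scale_for_probability(prob, lo=0.0, hi=10.0):
    """Solve enclosed_probability(c) = prob by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if enclosed_probability(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c50 = scale_for_probability(0.50)   # scale of the 50 per cent ellipsoid
p_at_one = enclosed_probability(1.0)
```

Under these assumptions `c50` comes out close to the standard tabulated value 1.5382, whereas an ellipsoid with C = 1 encloses only about a fifth of the probability density.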
arise, from which the β restrictions can be derived. Such restrictions can
also be obtained in an easier way by using Wigner's theorem according to
which the symmetry of the β restrictions is displayed by the matrix
Table 3.B.1. Site symmetry table giving key for the 29 types
of symmetry β-restrictions

Cross-reference No.   β11   β22   β33   β12   β13   β23
 1                    A     A     A     0     0     0
 2                    A     A     C     0     0     0
 3                    A     B     A     0     0     0
 4                    A     B     B     0     0     0
 5                    A     A     A     D     D     D
 6                    A     A     A     D     −D    −D
 7                    A     A     A     D     −D    D
 8                    A     A     A     D     D     −D
 9                    A     A     C     A/2   0     0
10                    A     B     C     0     0     0
11                    A     A     C     D     0     0
12                    A     B     A     0     E     0
13                    A     B     B     0     0     F
14                    A     B     C     B/2   0     0
15                    A     B     C     A/2   0     0
16                    A     B     C     D     0     0
17                    A     B     C     0     E     0
18                    A     B     C     0     0     F
19                    A     A     C     D     E     −E
20                    A     A     C     D     E     E
21                    A     B     A     D     E     −D
22                    A     B     A     D     E     D
23                    A     B     B     D     −D    F
24                    A     B     B     D     D     F
25                    A     B     C     B/2   F/2   F
26                    A     B     C     A/2   0     F
27                    A     B     C     B/2   E     0
28                    A     B     C     A/2   E     E/2
29                    A     B     C     D     E     F
where the Ds are the amplitudes of the excited wavefields in the crystal (the
intensities are given by I = |D|²). In eqn (3.B.13) D_0 is the two-beam
amplitude of the reflection H1, and D_U is the amplitude of the Umweg wave
defined by
D_U = R(ψ) F_H2 F_H1−H2.
The C_i are suitable parameters.
|R(ψ)| and Δ(ψ) are the modulus and phase of the complex resonance[33]
term R(ψ) = |R(ψ)| exp[iΔ(ψ)] which governs the amplitude and the
resonance phase shift of the Umweg wave. Typical resonance curves for
|R(ψ)| and Δ(ψ) are given in Fig. 3.B.4. |R(ψ)| is highest near the
three-beam position. Δ(ψ) varies from zero to π when ψ is scanned through
the three-beam position: it is less than π/2 when H2 is inside the Ewald
sphere; it is between π/2 and π when it is outside.
Equations (3.B.13) and (3.B.14) may be used to interpret the main
features of typical ψ-scan profiles (we assume that r*_H2 crosses the Ewald
sphere from inside (ψ < 0) to outside (ψ > 0)).
Fig. 3.B.3. Diagram of the interference between the unperturbed two-beam
amplitude D_0 and the Umweg resonance term D_U.
Suppose Φ3 = 0: for ψ << 0 it is Φ3 + Δ(ψ) = 0, interfering waves are
expected profiles for Φ3 = +45° (or Φ3 = −45°) should present
characteristics between those for Φ3 = 0 and for Φ3 = 90° (or Φ3 = −90°).
Ideal ψ-scan profiles for Φ3 = 0, π, ±π/2, ±π/4 have been recently
secured[34] and are
I_+(ψ) and I_−(ψ) are the ψ-scan profiles for the positive (+Φ3) and the
negative triplet phase (−Φ3) respectively. Thus the ideal profiles shown in
Fig. 3.B.5 are marked by the condition AZ(tp)lG1= 1.
Fig. 3.B.4. Schematic drawings for the amplitude of the resonance term R
and for its phase factor as a function of ψ. ψ is assumed to be zero at the
ideal three-beam position.
Ideal ψ-scan profiles can only be obtained if the dominant process in the
three-beam interaction is due to the interference effect. That occurs when
|F_H2| and |F_H1−H2| are about twice as strong as |F_H1|. If |F_H2| is small
and |F_H2−H1| is large enough (or vice versa, as the influence of |F_H2| and
|F_H2−H1| is symmetric) then intensity is removed from the H1 reflection and
coupled into the H2 reflection via the coupling H2 − H1. This loss of
intensity is not compensated by the scattering power from the H2 into the H1
reflection. The result is the so-called aufhellung effect, that is a strong
depletion of the H1 intensity. If |F_H1| is very small and |F_H2| is large
then the intensity of H1 is increased at the cost of the H2 reflection
intensity. This effect is called umweganregung by Renninger.[35]
Umweganregung and aufhellung effects can be evaluated[36] by comparing
the ψ-scan profiles for Φ3 and −Φ3. The profiles can then be separated into
two parts: the symmetric ΔI curve, which represents the phase-independent
umweganregung or aufhellung effect; the ideal ψ-scan profiles, which
contain the phase information.
Some concluding remarks may be useful:
1. Exact ψ-scans can be difficult for conventional four-circle
diffractometers.[~~] A special six-circle diffractometer[~~~] proved to be
useful: two circles (θ, Y) for the detector and four circles for the crystal
motion (see Fig. 3.B.6). When the vector H1 is aligned along the ψ axis, the
ψ scan is performed by rotating only about this axis: the detector circles
(θ, Y) may be moved to observe the reflections H1 or H2.
2. Since dynamical three-beam interaction has a very small angular
range, both the divergence and the spectral width of the primary beam
should be small enough. Synchrotron radiation seems therefore the most
suitable source for ψ-scan experiments. But properly modified X-ray
equipment based on rotating-anode generators can also play an important
role.
3. Measuring a single ψ-scan profile with the necessary statistical
accuracy is a time-consuming process (say, half an hour when synchrotron
radiation is used, some hours for less intense sources). In order to avoid the
influence of long-range intensity variations, the ψ-scan profile is usually
obtained as the sum of many fast scans.
4. Finding three-beam points is not very easy for large structures (more
than three reciprocal lattice points can simultaneously lie on the Ewald
sphere). It is safe to calculate in advance the most suitable ψ angles for each
reflection H1 and choose for measurements three-beam cases which are
separated from all the others by an angular distance greater than 0.1°.
5. As a rule, more reliable phase estimates can be obtained if F_H1, F_H2,
F_H1−H2 have comparable magnitudes. If polarized radiation is used, the
polarization factor may be exploited to satisfy this condition.
6. The absolute configuration of a non-centrosymmetric structure may be
determined by accurately measuring one (or more) triplet phase having a
value near ±π/2.
Fig. 3.B.6. Non-conventional six-circle diffractometer.
Electron diffraction
The diffraction of electrons (e-diffraction) was demonstrated by Davisson
and Germer in 1927. The electron beam is produced in an electron gun by a
'hair-pin' filament with a diameter of about 10 μm, or by a heated pointed
filament of 1-2 μm size (see Chapter 9). Electromagnetic lenses restrict the
divergence to the order of 10⁻⁴ rad, but a divergence of 10⁻⁶ rad may also be
achieved for special purposes. Electrons are accelerated through a potential
difference of V volts. The following relation is valid:
Two energy ranges are commonly used: we will talk about high-energy
electron diffraction (HEED) when V ≈ 50-120 kV (with the advent of
high-voltage electron microscopes this range needs to be extended to
1 MeV) and λ ≈ 0.05 Å, and low-energy electron diffraction (LEED) when
V = 10-300 V and λ = 4-1 Å. Electrons are strongly absorbed by matter and
therefore e-diffraction in transmission is applicable to very thin layers of
matter (10⁻⁷-10⁻⁵ cm). The scattering is caused by the interaction of the
electrons with the electrostatic field φ(r) of the atoms. φ(r) is the sum of
the field caused by the nucleus and the field caused by the electron cloud.
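The wavelengths quoted for the HEED and LEED ranges can be checked from the accelerating voltage. The sketch below uses the standard relativistic de Broglie relation λ = h/[2meV(1 + eV/2mc²)]^{1/2}; this form is an assumption here, since the relation itself is not reproduced in this section, and the function name is illustrative.

```python
import math

H_PLANCK = 6.62607015e-34      # J s
M_ELECTRON = 9.1093837015e-31  # kg
Q_ELECTRON = 1.602176634e-19   # C
C_LIGHT = 299792458.0          # m/s

def electron_wavelength_angstrom(volts):
    """Relativistic de Broglie wavelength (in angstrom) after acceleration through `volts`."""
    energy = Q_ELECTRON * volts
    momentum = math.sqrt(2.0 * M_ELECTRON * energy
                         * (1.0 + energy / (2.0 * M_ELECTRON * C_LIGHT ** 2)))
    return H_PLANCK / momentum * 1e10

heed = [electron_wavelength_angstrom(v) for v in (50e3, 100e3, 120e3)]
leed = [electron_wavelength_angstrom(v) for v in (10.0, 150.0, 300.0)]
```

The HEED values fall between about 0.03 and 0.05 Å, and the LEED values run from nearly 4 Å at 10 V down to below 1 Å at 300 V, in line with the orders of magnitude given in the text.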
Thus the interaction of electrons with matter may be divided into three
processes: (a) no interaction: the electron passes straight through the
specimen; (b) elastic scattering: the electrons are scattered by the
Coulombic potential due to the nucleus. Since the proton mass is much larger
than that of the electron, no loss of energy occurs: such scattering is the
most important in electron microscopy; (c) inelastic scattering: electrons of
the primary beam interact with atomic electrons, and are scattered having
suffered a loss of energy. In a microscope such electrons are focused at
I
= 4n fax(r*)exp (-2niP .r) dr*
I
- 4n Z exp (-2nir*
where f_x is in electrons and f_e in Å (f_e occurs in the first Born
approximation for electron scattering by atoms).
If we compare numerically eqn (3.B.15) with the atomic scattering factor
for X-rays, we observe that:
1. e-scattering is much more efficient than X-scattering (see Fig. 3.B.7).
Consequently diffraction effects are easily detected even from volumes
much smaller than those required for X-rays (in practice, starting from
thicknesses of 100 Å for simple structures constituted of heavy atoms).
Fig. 3.B.7. Typical scattering curves, as functions of sin θ/λ, for:
(1) electrons; (2) X-rays; (3) neutrons.
2. The curves f_e are less sensitive (see Table 3.B.3) to the atomic number
Z than f_x (on the average f_e(0) ∝ Z^{1/3}). Therefore the positions of the
the crystal surface: thus in principle the structure of outer layers can be
different from that of internal layers. Interpretation of the spectra is not
easy because of multiple scattering of the electrons. Additional information
is provided by means of Auger spectroscopy of scattered electrons.
Neutron scattering
A neutron is a heavy particle with spin 1/2 and magnetic moment of 1.9132
nuclear magnetons. Its wave properties were shown in 1936 by Halban and
Preiswerk and by Mitchell and Powers. Neutron diffraction experiments
(n-diffraction) require high fluxes, provided nowadays by modern reactors.
They produce fast neutrons whose energy is reduced by collisions in a
moderator of heavy water or graphite. Neutrons slowed down in this way are
called thermal and their speed obeys a Maxwell distribution: the
corresponding spectrum is white (see Fig. 3.B.8(a)). A monochromator
(usually single crystals of Ge, Cu, Zn, Pb) selects the desired wavelength λ.
Neutrons can also be produced in a pulsed manner by spallation, at a
repetition rate between 24 Hz and 50 Hz. High-energy protons (~1 GeV),
in short pulses at the appropriate pulse frequency, strike a target such as
uranium or tungsten, releasing several tens of neutrons per proton (~25
neutrons for ²³⁸U). The pulsed neutron flux is only present for a very short
time (the burst lasts around 0.4 μs): heat removal is then easy and high
fluxes are allowed (higher than those produced by reactors).
High-energy neutrons are slowed down to thermal energies by appropri-
ate moderators. Water, polyethylene, liquid hydrogen, or liquid methane
are frequently used: the choice is determined by the scattering experiment.
During the thermalization process neutrons undergo a large number of
collisions which cause pulse broadening. The width of the pulse leaving the
moderator is roughly proportional to the wavelength (see Fig. 3.B.8(b)) so
that the fractional wavelength resolution is nearly constant. A characteristic
neutron spectrum is shown in Fig. 3.B.8(c): the high intensity at short
wavelengths (λ < 1 Å) is a very significant characteristic of pulsed neutron
sources.
The neutron-atom interaction comprises interaction with the nucleus and
interaction of the magnetic moment associated with the spin of the
neutron with the magnetic moment of the atom. The latter effect mainly
occurs for atoms with incompletely occupied outer electron shells (for
instance, transition elements).
The neutron-nucleus interaction is governed by very short range nuclear
forces (~10⁻¹³ cm). Since the nuclear radius is of the order 10⁻¹⁵ cm, i.e.
several orders of magnitude less than the wavelength associated with the
incident neutrons, the nucleus will behave like a point and its scattering
factor b0 will be isotropic and not dependent on sin θ/λ (see Fig. 3.B.7). By
convention, the scattering amplitude is assumed positive where there is a
phase change of 180° between incident and scattered waves: it has the
dimension of length and is measured in units of 10⁻¹² cm. When the neutron
is very close to the nucleus a metastable system, nucleus + neutron, is
created which decays by re-emitting the neutron. For appropriate energy a
resonance effect can occur: then the scattering factor assumes the form
b = b0 − Δb'. Since Δb' can be greater than b0 it is possible to have
negative scattering factors for some nuclei (for instance, ¹H, ⁴⁸Ti, ⁶²Ni,
⁵⁵Mn): for them the scattering is out of phase by 180° with respect to the
Fig. 3.B.8. (a) Spectrum from a nuclear reactor. The shaded wavelengths are
selected by a monochromator. (b) A schematic time-dependence of two pulses
of neutrons leaving the moderator with different energies (λ1 > λ2). A next
pulse starts when the number of neutrons (N) is small enough. (c) Neutron
flux distribution on a pulsed neutron machine.
orientation in the sample and the vector r*. For a sample containing a single
magnetic domain sin² α is equal for all atoms. For multi-domain samples
it will be necessary to average over the various spin orientations. For
ferromagnetic and ferrimagnetic samples external fields for orienting the
spins can be applied so as to obtain sin² α = 0 or sin² α = 1. Measurements
of |F|² for both cases separate magnetic from nuclear scattering and allow
the study of magnetic structures. Sometimes the magnetic cell coincides with
the 'chemical' cell so that the magnetic contribution to the intensities is
added to the nuclear contribution. In other cases the magnetic structure has
a cell which is a multiple of the chemical cell, causing additional purely
magnetic reflections. Magnetic symmetry can be described by means of
space groups of antisymmetry or by means of colour groups (see Appendix
1.F and Fig. 1.F.2(b)) associated with classical three-dimensional space
groups.
Monochromatic beams of polarized neutrons are easily obtained from
unpolarized beams by using suitable single-crystal monochromators (for
example, a Co-Fe alloy). For a given polarization, either constructive or
destructive interference between nuclear and magnetic scattering amplitudes
can occur. Thus the technique may detect a weak magnetic scattering even
when it is accompanied by a strong nuclear scattering.
Among the most important characteristics and applications of neutron
diffraction we quote:
1. The interaction of neutrons with matter is weaker than that of
X-rays or electrons (see Table 3.B.3). Generally speaking, scattering
amplitudes are f_x ≈ (10⁻¹²-10⁻¹¹) cm for X-rays, f_e ≈ 10⁻⁸ cm for
electrons, f_n ≈ 10⁻¹² cm for neutrons. Therefore high neutron fluxes and
crystals with dimensions of several millimetres are needed for measuring
appreciable scattered intensities.
2. The b values vary non-monotonically with the atomic number Z:
isotopes of the same element can have very different values of b. This allows
us to distinguish between atoms having very close values of Z (but very
different values of b: e.g. b_Mn = −0.36, b_Fe = 0.96, b_Co = 0.25) and to
localize the positions of light atoms in the presence of heavy atoms.
Neutrons are particularly useful for localizing hydrogen atoms. Usually they
are partially or completely substituted by deuterium, with a value b > 0 and
with negligible incoherent scattering.
4. Since b does not depend on sin θ/λ, nuclear scattering decreases only
because of the temperature effect. Thus reflections with high values of
sin θ/λ can be collected, giving atomic positions and thermal parameters
with accuracy higher than from X-rays. In many cases X- and n-diffraction
experiments are both performed in such a way that accurate maps of the
electron density can be obtained.
t_H = 505.555 L d_H sin θ0
where L is measured in metres, λ and d_H in Å, t in μs. Thus time of flight
depends linearly on both the flight path and on the wavelength. In addition
the equations suggest that resolution improves with increasing flight path.
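The numerical coefficient can be recovered from the de Broglie relation. The sketch below assumes t = L/v with v = h/(mλ) and λ = 2d sin θ0, and uses current values of the physical constants, so the last digits differ slightly from the 505.555 quoted above; the function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34      # J s
M_NEUTRON = 1.67492749804e-27  # kg

def tof_microseconds(path_m, wavelength_angstrom):
    """Time of flight (microseconds) over path_m metres for a given wavelength."""
    velocity = H_PLANCK / (M_NEUTRON * wavelength_angstrom * 1e-10)   # m/s
    return path_m / velocity * 1e6

def tof_bragg_microseconds(path_m, d_angstrom, theta0_deg):
    """Time of flight for the reflection of spacing d observed at Bragg angle theta0."""
    wavelength = 2.0 * d_angstrom * math.sin(math.radians(theta0_deg))
    return tof_microseconds(path_m, wavelength)

# L = 1 m, d = 1 angstrom, sin(theta0) = 1 isolates the numerical coefficient
coefficient = tof_bragg_microseconds(1.0, 1.0, 90.0)

# hypothetical example: 10 m flight path, d = 2 angstrom, 2*theta0 = 90 degrees
t_example = tof_bragg_microseconds(10.0, 2.0, 45.0)
```

Doubling either the flight path or the d spacing doubles the time of flight, which is the linear dependence stated in the text.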
Area detectors (see Chapter 4) are the best choice for time-of-flight
techniques. Indeed, even if the source is on for brief pulses the full
spectrum may be used (in monochromatic techniques coupled with a reactor
the source is on all the time but a small portion of the spectrum is used).
where k varies over the individual electrons of the atom. If we sum (3.C.1)
over all the atoms in the assemblage the total incoherent scattering is
obtained, which is a smooth function slowly increasing with sin θ/λ. Such a
function has to be subtracted from the total scattering pattern: only after
that can the coherently scattered intensity be used to deduce structural
parameters.
$$P(\mathbf{u}) = \int_S \rho_\infty(\mathbf{r})\,\rho_\infty(\mathbf{r}+\mathbf{u})\,\Phi(\mathbf{r})\,\Phi(\mathbf{r}+\mathbf{u})\,d\mathbf{r}. \qquad (3.C.2)$$
The product Φ(r)Φ(r + u) always vanishes except when both the points
r and r + u are inside the object: in this case Φ(r)Φ(r + u) = 1. The space
domain in which this condition is verified is a function of u and will be
denoted by Ωv(u), where 0 ≤ v(u) ≤ 1. It is readily seen from Fig. 3.C.1
that Ωv(u) is the volume which belongs to the object and to the object
shifted by u. Furthermore, v(u) is centrosymmetric (v(u) = v(−u)), it
decreases with increasing u and takes its maximum for u = 0 (v(0) = 1).
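These properties of v(u) are easy to verify on a one-dimensional model. In the sketch below the object is a segment sampled on a grid (an illustrative assumption), Φ is 1 inside it and 0 outside, and v(u) is the fraction of the object shared with its copy shifted by u grid steps.

```python
def shape_overlap(phi, shift):
    """v(u): fraction of the object common to itself and to its copy shifted by `shift` steps."""
    n = len(phi)
    inside = sum(phi)
    common = sum(phi[i] * phi[i + shift] for i in range(n) if 0 <= i + shift < n)
    return common / inside

# form function of a 1-D object 100 grid steps long, embedded in empty space
phi = [0] * 50 + [1] * 100 + [0] * 50

v = {shift: shape_overlap(phi, shift) for shift in range(-120, 121)}
```

For this segment v(u) = 1 − |u|/100 up to |u| = 100 and zero beyond, so v(0) = 1, v(u) = v(−u), and v decreases steadily with increasing |u|.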
Accordingly
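where
$$\Omega v(\mathbf{u}) = \int_S \Phi(\mathbf{r})\,\Phi(\mathbf{r}+\mathbf{u})\,d\mathbf{r} = \Phi(\mathbf{r}) * \Phi(-\mathbf{r})$$
so that
$$\int_{S^*} |D(\mathbf{r}^*)|^2\,d\mathbf{r}^* = \Omega\int_S v(\mathbf{u})\,d\mathbf{u}\int_{S^*}\exp(2\pi i\,\mathbf{r}^*\cdot\mathbf{u})\,d\mathbf{r}^* = \Omega.$$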
domain volume is obtained, say w = 1/Ω, together with its approximate
dimension r*_0 ≈ (1/Ω)^{1/3} = 1/r_0, where r_0 is the average dimension of
the diffracting object.
Fig. 3.C.2. The typical form of |D(r*)|² as a function of r*.
In terms of diffraction angles the width of the diffraction peak near the
origin of the reciprocal space may be obtained from the Bragg relation
(2/r*_0) sin θ_0 ≈ (2/r*_0) θ_0 ≈ 2r_0θ_0 = λ, from which
Fig. 3.C.3. The distribution p(u) for one-dimensional arrangements of
objects of length d as a function of the compactness.
real gases, liquids, and amorphous solids. p(u) will oscillate about unity at
short distances from the origin, while p(u) ≈ 1 at long distances. Oscillations
of p(u) are larger for higher concentrations. When d/l becomes maximum
the close packing of the particles gives rise to a perfect lattice of period d:
then very sharp maxima of p(u) will occur at u = nd (full short-range and
long-range order).
Let us now denote by z(u) the probability of finding an atom in the
element of volume dv located at the extremity of the vector u from an atom
located at the origin. Then
It may be noted that 1/v_1 is not only the asymptotic value of z(u) but also
its mean value. Indeed, integrating z(u) on the unit volume gives
The function |D(r*)|² can be considered very broad with respect to the
Dirac delta function but very sharp with respect to T[p(u) − 1] (which in
disordered structures is a function slowly varying with r*). According to
(3.C.9) the integral of the |D(r*)|² peak is Ω: therefore
The first term in (3.C.19) corresponds to the peak at the origin of the
reciprocal space. It is detectable only at very small angles and it is
distinguishable from the primary beam (see the previous section) provided
the diffracting object is very small (say, < 1 μm). It does not depend on the
internal structure of the object but only on its external shape. We will
discuss such a peak on p. 213.
The second term depends exclusively on the statistical distribution of the
atoms: if this is perfectly uniform p(u) = 1 and |E(r*)|² = 1. Thus the
variations of |E(r*)|² about its average contain information about the atomic
distribution in the object. Such a distribution may be determined by
inversion of (3.C.19):
be the structure factor of the nth group of atoms (each group composed of p
atoms). Then
N
and
Since the observed intensity will be the average with respect to all the
possible mutual configurations of the atomic groups
sin 2nr*uii
2nr*uij
) (3.C.23)
$$\langle |E(r^*)|^2\rangle = 1 + \sum_{i\neq j=1}^{p} Y_iY_j\,\frac{\sin 2\pi r^*u_{ij}}{2\pi r^*u_{ij}}$$
where Y_i = f_i/(Σ_{j=1}^{p} f_j²)^{1/2}. In order to give some examples, for a
molecule composed of two atoms at distance x (3.C.23) becomes
The position of the mth maximum for large values of m will approach more
and more closely to
x sin θ/λ = (0.125 + 0.5m)
but maxima will become weak and ill defined.
It is easily seen that the maxima in the |E(r*)|² curve will coincide with
those given by (3.C.27). In particular, from the first maximum located in
(sin θ/λ)_1 the interatomic distance
x = 0.615/(sin θ/λ)_1   (3.C.28)
is obtained.
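The constants 0.615 and (0.125 + 0.5m) can be reproduced numerically. The sketch assumes that, for two equal atoms, the angular dependence of |E(r*)|² is carried by sin t/t with t = 2πr*x = 4πx sin θ/λ (taking r* = 2 sin θ/λ); the maxima of sin t/t other than the one at the origin satisfy t cos t − sin t = 0 and are located by bisection. The function name is illustrative.

```python
import math

def debye_maximum(m):
    """Value of x*sin(theta)/lambda at the m-th maximum (m >= 1) of sin(t)/t,
    where t = 4*pi*x*sin(theta)/lambda."""
    def slope_numerator(t):
        # d/dt [sin(t)/t] = (t cos t - sin t) / t^2
        return t * math.cos(t) - math.sin(t)

    lo, hi = 2.0 * m * math.pi, (2.0 * m + 0.5) * math.pi   # brackets exactly one maximum
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope_numerator(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / (4.0 * math.pi)

first = debye_maximum(1)        # close to 0.615
high = debye_maximum(20)        # approaches 0.125 + 0.5 * 20
```

From a measured first maximum at (sin θ/λ)_1 the interatomic distance then follows as `first / (sin θ/λ)_1`, which is (3.C.28).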
It is easily seen that Debye maxima will correspond to those provided by
(3.C.21).
Diffraction by gases
For a perfect gas (consisting of identical atoms with negligible volume and
exercising no action on each other) any atom can occupy any position with
the same probability. Therefore p(u) = 1, (3.C.14) reduces to
If the central peak is also taken into consideration, the observed intensity is
of the type described in Fig. 3.C.4. For r* = 0 the value of ⟨|F(r*)|²⟩ is N²f²
(replace |D(r*)|² by |D(0)|² = Ω² and v_1 by Ω/N in (3.C.18)). As soon as r*
is out of the peak at the origin the value of |F(r*)|² is equal to Nf².
If the gas is a mixture of perfect monoatomic gases the scattering is
practically given by
in Fig. 3.C.5 (curve (a)). The maxima in the experimental curve are not well
defined (they are represented by roughly horizontal portions of the curve)
because of the continuous decay of f with sin θ/λ. The maxima should be
better emphasized by plotting |E(r*)|² (curve (b)).
Fig. 3.C.4. Schematic diagram of the scattered intensity from a monoatomic
gas.
For real gases the intrinsic volume of the atoms is no longer negligible.
Since atoms are impenetrable, an atomic distribution function p(u) such as
that described in Fig. 3.C.3(a) may be chosen, where d is the atomic
diameter. Then (3.C.21) becomes
Fig. 3.C.8. Radial distribution curves for CH NO, (E, experimental; T,
theoretical), plotted against r (Å). Also shown is the difference curve (E−T).
In spite of the purely geometrical nature of the model the diffraction pattern
of monoatomic liquids closely satisfies the model predictions. In Fig. 3.C.10
the |E(r*)|² curve of liquid mercury is shown: its Fourier transform (eqn
(3.C.22)) shows a maximum a little over 3 Å.
A further example concerns the diffraction pattern of water. As is well
known, the molecules are strongly polar and V-shaped with O-H distances
of about 1 Å, and nearly tetrahedral HOH angles (109°). For our purposes,
they may be geometrically represented by spheres of about 2.8 Å diameter.
The water scattering curves[50] at 1.5 °C and 83 °C are shown in Fig.
3.C.11(a); the corresponding radial distribution curves are given in Fig.
3.C.11(b). The main maximum for the radial distribution occurs at a radius
of about 2.8 Å while a second broader maximum occurs at about 4.5 Å,
nearly vanishing at high temperature. Such results suggested to Bernal and
Fowler[51] that the arrangement of H2O molecules in water may be
described as a broken-down ice structure in which each molecule tries to
bind four neighbours: since bonds are continually breaking and re-forming,
at any instant each molecule is bonded to fewer than four neighbours.
Fig. 3.C.9. (a) Distribution functions P(u) for hard spheres of diameter d
for two values of the concentration parameter C. (b) |E(r*)|² curves
corresponding to the two distributions shown in (a).
As a last example, examine now Fig. 3.C.12, where the diffraction
patterns of vitreous silica, a cristobalite crystal powder, and silica gel are
shown. The main maxima of all three curves nearly overlap: but
cristobalite shows numerous sharp maxima while only one maximum occurs
in vitreous silica. Furthermore, vitreous silica intensity decreases with
sin θ/λ as we have just seen for liquids, while silica gel shows increasing
intensity toward small sin θ/λ. The radial distribution function for vitreous
silica shows a first peak at r = 1.62 Å and a second one at 2.65 Å: that
indicates that the tetrahedral coordination in the crystalline state persists in
the vitreous state (but here the orientations of the tetrahedral groups are
randomly distributed). The vitreous state is essentially homogeneous
(diffraction intensity decreases at small sin θ/λ as for liquids) while silica gel
is made from very small discrete particles (10-100 Å) with voids among
them (diffraction intensity increases at small scattering angles: see the next
section).
Fig. 3.C.10. |E(r*)|² curves for liquid mercury.
Small-angle scattering
Small-angle scattering is a technique for studying structural features or
inhomogeneities of colloidal dimensions.[40] For wavelengths of about 1 Å
the typical angular domain of the technique ranges up to one or two
degrees.
We have already seen (p. 204) that when diffraction occurs from a finite
statistically homogeneous object of volume Ω a central peak in the intensity
curve may be measured which becomes broader as the object size decreases,
and does not depend on the internal structure of the object. Its intensity
distribution varies according to (3.C.18):
Since diffraction is considered at very small angles, f(r*) will coincide with
the number of electrons per atom. Thus the above equation reduces to
where ⟨|D_p(r*)|²⟩ is the average value of |D_p(r*)|² over all the possible
particle orientations. Different shapes of particles will give rise to different
shapes of the scattering function in reciprocal space: ideal scattering
intensities can therefore be calculated for spherical, cylindrical, flat,
ellipsoidal, etc., particles. The results are all rather similar, particularly in
the central range, but remarkable differences occur at larger angles. It may
then be expected that in the central part a universal approximation for all
particle shapes must exist.
Fig. 3.C.11. (a) |F(r*)|² water scattering curves at 1.5 °C and 83 °C; (b) the
corresponding distributions 4πu²p(u).
Let us assume the origin 0 in the centre of gravity of a particle of volume
Fig. 3.C.12. Diffraction patterns for silica gel, vitreous SiO2, and for
cristobalite.
$$R = \left(\frac{1}{v_p}\int_{v_p} r^2\,dv\right)^{1/2}$$
is the radius of gyration (= the root-mean-square distance) of the particle
with respect to its centre of gravity.
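As a worked instance of this definition, the radius of gyration of a homogeneous sphere of radius a can be evaluated by integrating over spherical shells; the analytical result is R = (3/5)^{1/2} a. The sketch below is illustrative only (the shell integration and the function name are assumptions, not part of the original treatment).

```python
import math

def radius_of_gyration_sphere(radius, steps=100000):
    """Root-mean-square distance from the centre for a homogeneous sphere,
    obtained by midpoint integration over thin spherical shells."""
    dr = radius / steps
    numerator = 0.0     # integral of r^2 dv
    volume = 0.0        # integral of dv
    for i in range(steps):
        r = (i + 0.5) * dr
        shell = 4.0 * math.pi * r * r * dr
        numerator += r * r * shell
        volume += shell
    return math.sqrt(numerator / volume)

a = 10.0                               # sphere radius, e.g. in angstrom
rg = radius_of_gyration_sphere(a)      # expected: sqrt(3/5) * a
```

Other particle shapes only require replacing the shell volume by the appropriate volume element.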
According to (3.C.31) the scattering power per particle will be
averaged over all orientations. From the above equation the function p ( u )
may be derived by inverse Fourier transform.
Quadrupoles:
Y_{2,2+} = 3 sin²θ cos 2φ = 3(s_x² − s_y²)
Y_{2,2−} = 6 sin²θ sin φ cos φ = 6 s_x s_y
Y_{2,1+} = 3 sin θ cos θ cos φ = 3 s_x s_z
Y_{2,1−} = 3 sin θ cos θ sin φ = 3 s_y s_z
Y_{2,0} = 6 cos²θ − 2 = 6 s_z² − 2

Octapoles:
Y_{3,3+} = 15 sin³θ cos 3φ = 15(s_x² − 3 s_y²) s_x
Y_{3,3−} = 15 sin³θ sin 3φ = (45 s_x² − 15 s_y²) s_y
Y_{3,2+} = 15 sin²θ cos θ cos 2φ = 15(s_x² − s_y²) s_z
Y_{3,2−} = 30 sin²θ cos θ sin φ cos φ = 30 s_x s_y s_z
Y_{3,1+} = 1.5 sin θ (5 cos²θ − 1) cos φ = 1.5(5 s_z² − 1) s_x
Y_{3,1−} = 1.5 sin θ (5 cos²θ − 1) sin φ = 1.5(5 s_z² − 1) s_y
Y_{3,0} = 10 cos³θ − 6 cos θ = (10 s_z² − 6) s_z
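The equivalence between the angular and the Cartesian forms of the quadrupole and octapole functions listed above can be checked numerically. The sketch assumes the usual convention s_x = sin θ cos φ, s_y = sin θ sin φ, s_z = cos θ for the direction cosines; the dictionary keys are descriptive labels chosen for this example only.

```python
import math

# each entry: (angular form f(theta, phi), Cartesian form f(sx, sy, sz))
FUNCTIONS = {
    "quad cos2phi": (lambda t, p: 3 * math.sin(t) ** 2 * math.cos(2 * p),
                     lambda x, y, z: 3 * (x * x - y * y)),
    "quad sin2phi": (lambda t, p: 6 * math.sin(t) ** 2 * math.sin(p) * math.cos(p),
                     lambda x, y, z: 6 * x * y),
    "quad xz": (lambda t, p: 3 * math.sin(t) * math.cos(t) * math.cos(p),
                lambda x, y, z: 3 * x * z),
    "quad yz": (lambda t, p: 3 * math.sin(t) * math.cos(t) * math.sin(p),
                lambda x, y, z: 3 * y * z),
    "quad zz": (lambda t, p: 6 * math.cos(t) ** 2 - 2,
                lambda x, y, z: 6 * z * z - 2),
    "oct cos3phi": (lambda t, p: 15 * math.sin(t) ** 3 * math.cos(3 * p),
                    lambda x, y, z: 15 * (x * x - 3 * y * y) * x),
    "oct sin3phi": (lambda t, p: 15 * math.sin(t) ** 3 * math.sin(3 * p),
                    lambda x, y, z: (45 * x * x - 15 * y * y) * y),
    "oct cos2phi": (lambda t, p: 15 * math.sin(t) ** 2 * math.cos(t) * math.cos(2 * p),
                    lambda x, y, z: 15 * (x * x - y * y) * z),
    "oct xyz": (lambda t, p: 30 * math.sin(t) ** 2 * math.cos(t) * math.sin(p) * math.cos(p),
                lambda x, y, z: 30 * x * y * z),
    "oct x": (lambda t, p: 1.5 * math.sin(t) * (5 * math.cos(t) ** 2 - 1) * math.cos(p),
              lambda x, y, z: 1.5 * (5 * z * z - 1) * x),
    "oct y": (lambda t, p: 1.5 * math.sin(t) * (5 * math.cos(t) ** 2 - 1) * math.sin(p),
              lambda x, y, z: 1.5 * (5 * z * z - 1) * y),
    "oct z": (lambda t, p: 10 * math.cos(t) ** 3 - 6 * math.cos(t),
              lambda x, y, z: (10 * z * z - 6) * z),
}

def max_discrepancy():
    """Largest difference between the two forms over a grid of directions."""
    worst = 0.0
    for i in range(1, 12):
        theta = i * math.pi / 12.0
        for j in range(24):
            phi = j * math.pi / 12.0
            sx = math.sin(theta) * math.cos(phi)
            sy = math.sin(theta) * math.sin(phi)
            sz = math.cos(theta)
            for angular, cartesian in FUNCTIONS.values():
                worst = max(worst, abs(angular(theta, phi) - cartesian(sx, sy, sz)))
    return worst

discrepancy = max_discrepancy()
```

A discrepancy at the level of rounding error confirms that the two columns describe the same functions.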
In Fig. 3.D.1 some electron density plots in special sections of direct
space are shown.[61]
In most organic molecules the expansion (3.D.2) may be stopped at
octapoles: then the deformation charge density is described by a linear
combination of 16 terms, for which 16 population parameters have to be
estimated.
In some formulations[62] the free-atom valence shell is modified by an
expansion-contraction radial parametrization in order to take into account
the fact that an atom will expand or contract when it becomes more
negative or more positive. Such a feature may be incorporated into a
perturbed valence density ρ'_valence(r) given by
where
(1) $F_1 = F_{\text{calc, multipole}}$, $F_2 = F_{\text{calc, free atoms}}$. Δρ is then the deformation map,
i.e. the difference between the atomic densities represented by spherical
harmonics and those represented by free atoms;
(2) $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc, multipole}}$. Δρ is then the residual density, i.e. the part
of electron density not accounted for by the multipole model;
(3) $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc, free atoms}}$, under the condition that $F_2$ is calculated
with parameters from high-order (HO) refinement (high-order reflections
are unaffected by chemical deformation of valence orbitals and
locate atoms carefully). Δρ is then denoted as a (X − X_HO) deformation
map;
(4) $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc, free atoms}}$, with $F_2$ calculated as in (3) but for core
electrons only. Δρ is the (X − X_HO) valence map;
(5) $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc, free atoms}}$, with $F_2$ calculated with parameters from
neutrons or from joint high-order X-ray and neutron data. Then Δρ is the
X − N or X − (X_HO + N) deformation map;
(6) $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc, free atoms}}$, with $F_2$ calculated as in (5) from core
electrons only. Δρ is then a X − (X_HO + N) valence map.
The estimation of the charge density model obtained at the end of
calculations is different according to whether least squares or Fourier
methods have been used.[67] However, direct and reciprocal fitting are
nearly equivalent: indeed, if we assume in (3.D.4) that $F_1 = F_{\text{obs}}$, $F_2 = F_{\text{calc}}$,
$\Delta F = F_{\text{obs}}(\mathbf{H}) - F_{\text{calc}}(\mathbf{H})$, and we calculate the difference Patterson
$$\int_V \Delta\rho(\mathbf{r})\,\Delta\rho(\mathbf{r}+\mathbf{u})\,\mathrm{d}\mathbf{r} = \frac{1}{V}\sum_{\mathbf{H}} |\Delta F|^2\cos 2\pi\mathbf{H}\cdot\mathbf{u},$$
Thus least-squares parameters which give the best fit between observed
and calculated structure factors are expected to give rise to the lowest
variance in Ap.
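As a short worked step, the link between the two fits can be made explicit by evaluating the difference Patterson function at the origin. The relation below is the standard Parseval identity, written here as a sketch in the notation of this appendix rather than as a formula quoted from the text:

```latex
\int_V \left[\Delta\rho(\mathbf{r})\right]^2 \mathrm{d}\mathbf{r}
  \;=\; \frac{1}{V}\sum_{\mathbf{H}} \left|\Delta F(\mathbf{H})\right|^2
  \qquad (\mathbf{u} = 0)
```

With unit weights, the sum on the right is the quantity minimized in a least-squares fit on structure factors, so minimizing it also minimizes the mean-square value of Δρ over the cell.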
Direct evaluation of atomic or molecular charges, of dipoles and higher
moments, etc., can be obtained from the estimated population parameters or
directly from the electron density. For example, the net charge on atom
j is
The diffraction of X-rays by crystals 1 221
when the integration is made over the (not always easily defined) atomic
volume. Also, the dipole moment of a molecule may be calculated as
where
and $k_i^1$, $k_i^2$, $k_i^3$ are the a*, b*, c* components of $K_i$. One of them should be
irrational for incommensurate modulation. Since (3 + d) indices $h_i$ are
necessary to label a single diffraction effect, we will speak of d-dimensional
periodic modulation or of d-dimensional modulated structure. In Fig. 3.E.1
a section of the three-dimensional diffraction pattern of a one-dimensional
modulated structure is sketched. The main reflections are marked by the
largest spots.
Fig. 3.E.1. One-dimensional modulated structure: sketch of a section of the three-dimensional diffraction pattern, showing main and satellite reflections.
A lattice $L^*_{3+d}$ may now be defined with basis vectors
$$b_1 = a^*,\quad b_2 = b^*,\quad b_3 = c^*,\quad b_{3+i} = K_i + e_i \quad (i = 1, \ldots, d) \tag{3.E.2}$$
where the $e_i$ are unit vectors perpendicular to $S_3$. Then in $S^*_{3+d}$ a reciprocal
vector may be written as
If a position is defined in S3 by
where
so that
$$x_{3+i} = k_i^1x_1 + k_i^2x_2 + k_i^3x_3$$
is obtained, which coincides with (3.E.4).
If in $S_{3+d}$ the new coordinates
Quasicrystals
In a famous paper by Shechtman, Blech, Gratias, and Cahn, electron
diffraction patterns of a rapidly solidified Al-Mn alloy were shown which
surprisingly displayed five-fold symmetry. Sharpness of the spots suggested
long-range translational order, but the presence of the five-fold symmetry
violated the sacred rules of crystallography. In particular, by rotation of the
specimen, five-fold axes (in six directions), three-fold axes (in 10 directions)
and two-fold axes (in 15 directions) could be revealed: the subsequent
ascertainment of the existence of an inversion centre fixed, for this Al-Mn
phase, the icosahedral point group $m\bar{3}\bar{5}$. Somewhat later a large number of
alloys with 'forbidden' symmetries were found: such kinds of materials
(providing electron diffraction patterns displaying sharp peaks and forbidden
symmetry such as icosahedral, octagonal, decagonal, dodecagonal, etc.)
are called quasicrystals. A large number of theoretical and experimental
publications are now available: for a more comprehensive treatment of the
subject and for relevant literature the reader is referred to three excellent
reviews.
Fig. 3.E.3. Schematic representation of the density ρ' in $S_4$. $S_3$ is the horizontal line: its bulging parts represent real atoms. (a) Perfect crystal; (b) density modulation; (c) displacive modulation.
A useful premise to quasicrystals (as well as to the IMSs) is the definition
of quasiperiodic functions. It is well known that a periodic function
f(x) = f(x + 1) may be uniformly approximated by finite sums of functions
exp(2πinx). Accordingly, in a p-dimensional space
$$f'(x_1, \ldots, x_p) = \sum_{n_1, \ldots, n_p} q(n_1, \ldots, n_p)\exp[2\pi i(n_1x_1 + \ldots + n_px_p)]$$
is a periodic function. However, its 'projection' on one-dimensional space
$$f(x) = \sum_{n_1, \ldots, n_p} q(n_1, \ldots, n_p)\exp[2\pi i(n_1\nu_1 + \ldots + n_p\nu_p)x] \tag{3.E.6}$$
obtained by fixing in f' each $x_j$ to $\nu_jx$, is not periodic if some of the $\nu_j$ are
irrational. Functions which may be uniformly approximated by finite sums
of functions $\exp[2\pi i(n_1\nu_1x + n_2\nu_2x + \ldots + n_p\nu_px)]$ are quasiperiodic
functions. We take three simple examples.
1. Let $f'(x_1, x_2) = A_1\sin 2\pi x_1 + A_2\sin 2\pi x_2$.
If we assume $x_2 = \alpha x_1$ where α is an irrational number then
$$f(x) = A_1\sin 2\pi x_1 + A_2\sin 2\pi\alpha x_1$$
is not periodic.
2. f(x) is the superposition of a periodic sequence of large open circles,
schematically represented by $\sum_n \delta(x - na)$, and a periodic sequence of
small solid circles represented by $\sum_m \delta(x - ma\tau/2)$. Here,
$\tau = 2\cos(\pi/5) = (1 + \sqrt{5})/2 = 1.618034\ldots$ is the golden mean, n and m are
any integers. Since a and aτ/2 are incommensurate numbers the
structure is not periodic (see Fig. 3.E.4, first line) but the diffraction
pattern will display delta peaks owing to the fact that the order is
perfectly maintained at long distances. The existence of delta peaks may
be demonstrated by examining periodic approximations of f(x). Since
again Fig. 3.E.5(b)) one has to attach parallel line elements in the lattice
vertices. Thus the real one-dimensional quasicrystal results from a cut of the
disconnected 'line atoms' of the two-dimensional crystal with the physical
space $S_1$.
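The cut construction can be tried numerically. The following is a minimal sketch, not the procedure of the text: it assumes a square two-dimensional lattice, a physical line of slope 1/τ, and an acceptance window equal to the projection of one unit cell onto the perpendicular direction. The names `fibonacci_chain`, `n_max`, and `half_length` are hypothetical.

```python
import math

TAU = (1 + math.sqrt(5)) / 2  # golden mean, tau = 2*cos(pi/5)


def fibonacci_chain(n_max=40, half_length=30.0):
    """Project square-lattice points lying in a strip onto the physical line.

    Assumptions of this sketch: the line makes an angle theta with the
    lattice x axis, tan(theta) = 1/tau, and the strip width equals the
    projection of one unit square onto the perpendicular direction.
    """
    theta = math.atan(1 / TAU)
    c, s = math.cos(theta), math.sin(theta)
    half_window = (c + s) / 2
    points = []
    for m in range(-n_max, n_max + 1):
        for n in range(-n_max, n_max + 1):
            x_par = m * c + n * s      # coordinate along the physical line
            x_perp = -m * s + n * c    # coordinate in the perpendicular space
            inside = -half_window <= x_perp < half_window
            if inside and abs(x_par) <= half_length:
                points.append(x_par)
    return sorted(points)


chain = fibonacci_chain()
# Distinct nearest-neighbour distances along the physical line.
spacings = sorted({round(b - a, 9) for a, b in zip(chain, chain[1:])})
print(spacings, spacings[1] / spacings[0])
```

Only two distances occur along the line, a short and a long one, and their ratio is the golden mean; the sequence never repeats, which is the one-dimensional analogue of the aperiodic order discussed above.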
The procedure may be generalized to the n-dimensional case: the
dimension of the space $S_n$ in which a lattice with translational periodicity is
obtained is determined by the number n of rationally independent reciprocal
basis vectors which are necessary to index the diffraction pattern. The
real aperiodic crystal is again the cut (in the direct space) of the
n-dimensional crystal with the physical space.
Well known two-dimensional examples of quasiperiodic structures are
Penrose tilings. Two examples are shown in Fig. 3.E.6: in (a) tiling of
the plane is achieved by putting together two rhombic units in accordance
with some matching rules (without them the plane could be covered in a
periodic way). The pattern shows a five-fold rotation point. In (b) 'kites'
and 'darts' are used: no pentagonal symmetry is shown. Mackay first
showed that their Fourier transform satisfies a five-fold symmetry.
Octagonal, decagonal, and dodecagonal two-dimensional quasicrystals are
also known; all of them can be embedded in a periodic five-dimensional
space. The three-dimensional icosahedral lattice mentioned at the beginning
of this section may be embedded in a six-dimensional space.
The characteristics of the quasicrystals do not coincide with those of the
incommensurately modulated structures. While the latter show main and
satellite reflections, an average structure, and crystallographic point sym-
metry, the quasicrystals show one kind of reflection only, no average
structure, and non-crystallographic point symmetry.
Fig. 3.E.6. (a) A two-dimensional quasilattice showing one five-fold rotation point (plane symmetry 5mm). Basic oblate and prolate rhombi with their matching rules are shown: similarly arrowed edges must fit. (b) Penrose tiling with kites and darts (it does not show five-fold symmetry).
References
1. Ewald, P. P. (1917). Annalen der Physik, 54, 519.
2. Laue, M. von (1931). Ergebnisse der exakten Naturwissenschaften, 10, 133.
3. Pinsker, Z. G. (1978). Dynamical scattering of x-rays in crystals. Springer, Berlin.
4. Darwin, C. G. (1914). Philosophical Magazine, 27, 315.
5. Darwin, C. G. (1922). Philosophical Magazine, 43, 800.
6. Zachariasen, W. H. (1967). Acta Crystallographica, 23, 558.
7. Becker, P. J. and Coppens, P. (1974). Acta Crystallographica, A30, 129.
8. Kato, N. (1976). Acta Crystallographica, A32, 453.
9. Becker, P. J. (1982). In Computational crystallography, pp. 462-9. Clarendon, Oxford.
10. Chapuis, G., Templeton, D. H., and Templeton, L. K. (1985). Acta Crystallographica, A41, 274.
11. Templeton, L. K., Templeton, D. H., Phizackerley, R. P., and Hodgson, K. O. (1982). Acta Crystallographica, A38, 74.
12. Phillips, J. C. and Hodgson, K. O. (1980). Acta Crystallographica, A36, 856.
13. Bijvoet, J. M. (1949). Proceedings of the K. Nederlandse Akademie van Wetenschappen, B52, 313.
14. Phillips, D. L. (1962). Journal of the Association for Computing Machinery, 9, 84.
15. Twomey, S. (1963). Journal of the Association for Computing Machinery, 10, 97.
16. Kennett, T. J., Brewster, P. M., Prestwich, W. V., and Robertson, A. (1978). Nuclear Instruments and Methods, 153, 125.
Introduction
This chapter discusses the experimental methods used to study the
diffraction of X-rays by crystalline materials. Although, as seen in Appendix
3.B (pp. 195 and 198), electrons and neutrons are also diffracted by crystals,
we will concentrate our attention on X-ray diffraction. We begin by discussing
how X-rays are produced and how one can define the beam of radiation that
will interact with the crystalline sample. The specimens that will receive our
attention are single crystals and polycrystalline materials, that is aggregates
of a very large number of very small crystals. We discuss the methods used
to record the diffraction pattern and to measure the intensities of the X-rays
scattered by these two types of specimen in separate sections. The ultimate
goal of extracting structure factor amplitudes from diffracted intensities
requires the application of a series of correction factors. This process, called
data reduction, is discussed in the final section of the chapter.
X-ray sources
Conventional generators
All the standard laboratory sources normally used for X-ray diffraction
experiments generate radiation using the same physical principles but can
vary substantially in their construction details. The two types of conven-
tional generators that are used in conjunction with the data recording
devices discussed on pp. 245 and 287 are sealed-tube and rotating-anode
generators. Most of the techniques used for diffraction data collection
require monochromatic radiation. Due to the way in which radiation is
produced in the conventional generators, only a discrete number of possible
wavelengths can be selected for experimental use. This limited choice and
the difference in intensity are two of the major differences between this type
of radiation and that generated by synchrotrons.
and
and 3. When the two orbitals involved in the transition are adjacent the line
is called α; if they are separated by another shell, the line is called β. Thus,
the Cu Kα line is produced by a copper target in which the atoms lost an
electron in the orbital of n = 1 and the vacancy was filled by an electron of
the orbital n = 2. The X-ray photon energy is the difference between these
two energy levels. Since for every principal quantum number n there are n
energy levels corresponding to the possible values of the quantum number l
(from 0 to n − 1), the α and β lines are actually split into multiple lines that
are very close to one another because the difference between these energy
levels is small. Still, X-ray radiation corresponding to all the possible energy
differences is not observed because some energy transitions are forbidden
by the selection rules. Thus, although Fig. 4.2 has a scale on the abscissa
which cannot show it, the Cu Kα line is actually split into a doublet, the Kα1
and Kα2 lines, of very similar wavelengths and which are, for this reason,
not easily separable.
The frequency of the characteristic line corresponding to a given
transition is related to the atomic number of the element that gave rise to it,
Z, by Moseley's law
$$\nu = C(Z - \sigma)^2 \tag{4.3}$$
where the constant C depends on the atomic energy levels involved in the
transition and the constant σ takes into account the interactions with other
electrons. Thus, in a plot of $\nu^{1/2}$ as a function of Z for a given transition the
points corresponding to different target elements lie on a straight line and
different lines are obtained for the Kα1, Kα2, Kβ1, etc., transitions. The
characteristic frequency is higher the higher the atomic number and so the
Mo Kα line (Z = 42) has a higher frequency and therefore a higher energy
than the Cu Kα line (Z = 29). A full list of the wavelengths of the
characteristic lines of the elements which are used in X-ray diffraction
studies can be found in the International tables for x-ray crystallography.[1]
Here, we will just point out that the two most frequently used lines are the
Cu Kα line, λ = 1.5418 Å, and the Mo Kα line, λ = 0.7107 Å. Both are
doublets of slightly different wavelengths as pointed out before.
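As an illustration of eqn (4.3), the two constants of Moseley's law can be estimated from the two Kα wavelengths quoted above. This is only a sketch: two points define the straight line exactly, so nothing is tested by the fit itself, and the function names `fit_moseley` and `predicted_wavelength` are hypothetical.

```python
import math

C_LIGHT = 2.99792458e8  # speed of light, m/s
ANGSTROM = 1e-10        # metres per angstrom


def frequency(wavelength_angstrom):
    """Frequency (Hz) of a photon of the given wavelength."""
    return C_LIGHT / (wavelength_angstrom * ANGSTROM)


def fit_moseley(z1, lam1, z2, lam2):
    """Solve sqrt(nu) = sqrt(C) * (Z - sigma) through two (Z, wavelength) points."""
    r1 = math.sqrt(frequency(lam1))
    r2 = math.sqrt(frequency(lam2))
    slope = (r2 - r1) / (z2 - z1)  # this is sqrt(C)
    sigma = z1 - r1 / slope        # intercept of the straight line on the Z axis
    return slope ** 2, sigma


def predicted_wavelength(z, c_const, sigma):
    """Wavelength (angstrom) predicted by nu = C * (Z - sigma)**2."""
    nu = c_const * (z - sigma) ** 2
    return C_LIGHT / nu / ANGSTROM


# Cu K-alpha (Z = 29) and Mo K-alpha (Z = 42), wavelengths as quoted in the text.
C_CONST, SIGMA = fit_moseley(29, 1.5418, 42, 0.7107)
print(f"C = {C_CONST:.3e} Hz, sigma = {SIGMA:.2f}")
print(f"Estimated K-alpha wavelength for Z = 47: "
      f"{predicted_wavelength(47, C_CONST, SIGMA):.3f} angstrom")
```

The fitted σ comes out between 1 and 2, and any heavier target is predicted to give a shorter Kα wavelength than molybdenum, in line with the trend described above. Values obtained this way are rough estimates and do not replace the tabulated wavelengths.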
The intensity of a characteristic K line can be calculated using the
equation:
$$I_K = B\,i\,(V - V_K)^{1.5} \tag{4.4}$$
where B is a constant, i the electrical current, and $V_K$ the excitation
potential of the K series, a quantity which is proportional to the energy
required to remove a K electron from the target atom. It can be shown[2]
that the ratio $I_K/I_w$ is a maximum if the accelerating potential is chosen to
be $V = 4V_K$. If this condition is fulfilled, the Kα line is about 90 times more
intense than the white radiation of equal wavelength ($I_w$). The Kα1 line is
approximately twice as intense as the Kα2 line and the ratio Kα/Kβ depends
on Z but it averages 5 (see Rieck[3]). The data collection methods that use
monochromatic radiation discussed later all use Kα radiation and therefore
require the elimination of the Kβ component of the spectrum which is
always present. The methods used to achieve this are discussed on p. 241.
Synchrotron radiation
X-rays, as well as other types of electromagnetic radiation, can also be
generated by sources known as synchrotron radiation facilities. In these
installations either electrons or positrons are accelerated at relativistic
velocities along orbits of very large radii, several metres or even hundreds
of metres. These sources are, by necessity, very complex and those that
produce suitable X-rays are limited, located mainly in the United States and
Europe. However, since, as we will see, the X-rays they produce are in
many ways much better than those generated by conventional sources, their
use has grown steadily in the crystallographic community. As a result, more
beam time has been made available to crystallographers and more synchrotron
sources are planned for construction in different countries. Among
those, the 6 GeV storage ring to be built in the USA[4] and the European
Synchrotron Radiation Facility,[5] designed specifically to produce the best
possible X-rays, promise to be of special importance in the development of
this field.
From the extensive literature that exists in this ever expanding field we
recommend two very elementary descriptions,[6,7] an introductory
textbook,[8] and a more advanced treatise in several volumes.[9] In this last
treatise chapters 1, 2, and 11 of volume 1 are specially relevant to our
discussion; volume 3 of the series is totally devoted to X-ray methods.
Notice the scale at the bottom which gives an idea of the size of this type of
facility. The basic element of the installation from which radiation is
generated is the storage ring, a toroidal cavity in which the charged particles
are kept circulating under vacuum. An extremely high vacuum is required
or else the particles are lost by collision with the atoms present in the cavity.
Prior to injection into the storage ring, the particles must be accelerated, for
example, first by a linear accelerator and then by a booster as shown in the
figure. The other two elements that are essential to the operation of the ring
are the so-called lattice, that is the set of magnets which force the particles
to follow a closed trajectory as well as performing other functions, and the
radio-frequency cavity system which restores to the particles the energy they
lose as synchrotron radiation. The beam lines, not shown in the figure, are
tangential to the storage ring.
It can be shown that a charged particle moving along a circular orbit emits
as electromagnetic radiation the following power[8,9]
where P is the energy emitted per unit time, e is the particle charge, c the
speed of light, E the energy of the particle, mo its mass at rest, and R the
bending radius of the orbit. This equation explains why high-energy
particles are required and also why only electrons or positrons are used. The
power emitted by heavier particles such as protons is too low to be of
significant importance.
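The dependence on the rest mass can be made concrete with a small calculation. The sketch below assumes the standard scaling of the emitted power, P proportional to E⁴/(m₀⁴R²), and therefore works only with relative values; the 6 GeV energy is the one mentioned above, while the 25 m bending radius and the function names are hypothetical.

```python
# Rest energies in MeV (CODATA values, rounded).
ELECTRON_REST_ENERGY_MEV = 0.51099895
PROTON_REST_ENERGY_MEV = 938.27209


def lorentz_gamma(energy_mev, rest_energy_mev):
    """Ratio of the total energy to the rest energy of the particle."""
    return energy_mev / rest_energy_mev


def relative_power(energy_mev, rest_energy_mev, radius_m):
    """Emitted power up to a constant factor, assuming P ~ gamma**4 / R**2."""
    return lorentz_gamma(energy_mev, rest_energy_mev) ** 4 / radius_m ** 2


ENERGY_MEV = 6000.0  # 6 GeV, as for the storage ring mentioned in the text
RADIUS_M = 25.0      # hypothetical bending radius

ratio = (relative_power(ENERGY_MEV, ELECTRON_REST_ENERGY_MEV, RADIUS_M)
         / relative_power(ENERGY_MEV, PROTON_REST_ENERGY_MEV, RADIUS_M))
print(f"gamma (electron) = {lorentz_gamma(ENERGY_MEV, ELECTRON_REST_ENERGY_MEV):.0f}")
print(f"P(electron) / P(proton) = {ratio:.2e}")
```

At equal energy and bending radius the ratio depends only on the fourth power of the mass ratio, about 10¹³ in favour of the electron, which is why protons are of no practical use as a source.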
The quantity γ, the ratio of the total energy to the rest energy of the
particle, is of considerable importance since it is approximately related to
the opening angle of the cone of radiation by
[Figure residue: plot with a horizontal axis labelled WAVELENGTH (Å).]
Table 4.1. Relevant parameters of the synchrotron radiation sources in operation in 1987 (taken from reference 5)
Although $\sigma_x$, $\sigma_{x'}$ and $\sigma_z$, $\sigma_{z'}$ vary along the orbit their variations are
correlated and it is thus useful to define another parameter, the emittance,
which at special symmetry positions is found to be
The emittance is instead a constant along the charged particle path and it is
thus another important parameter characteristic of an installation. The
emittances of the synchrotron sources in operation in 1987 are also shown in
Table 4.1.
Another useful function, which is often used to compare the potential
performance of two sources, is the spectral brightness, also called spectral
brilliance, defined as the number of photons emitted per unit area of the
source at point x, z over a 0.1 per cent relative band width per unit solid
angle dΩ and unit time in the direction defined by the angles ψ (defined by
the instantaneous velocity of the charged particle and the projection of the
direction of observation onto the vertical plane) and θ (defined by the
projection onto the horizontal plane instead). It can be shown that
the spectral brightness b is equal to (see Margaritondo,[8] chapter 2)
where N is the spectral flux. If one defines the central brightness, which is
the brightness for x = z = ψ = 0, it is obvious that
From this equation it can be seen that the brightness can be increased by
increasing the flux or by decreasing the σs. Decreasing the σs can be
accomplished by reducing the emittance of the storage ring (eqn (4.12)).
The emittance of a ring is thus a fundamental parameter to be taken into
account in comparing the expected performance from two different sources.
A radiation spectrum quite different from that produced by bending
magnets can be obtained by the use of insertion devices. These are a series
of periodically spaced magnets of alternating polarity which are inserted in a
straight region of the ring and which do not alter the ideal closed orbit of
the particles in the storage ring. Most insertion devices create a sinusoidal
magnetic field which forces the charged particles to oscillate around the
mean orbit. According to their characteristics they are called wigglers or
undulators.
The parameter that has to be examined to determine whether an insertion
device is a wiggler or an undulator is the K parameter
Experimental methods in X-ray crystallography 1 239
Filters
One way to select a wavelength interval out of the spectrum generated by
the source is by filtering the radiation through a material that selectively
absorbs the unwanted radiation while letting through most of the photons of
the wavelength that will be used for the diffraction experiment. The
absorption of X-rays by a material follows Beer's law:
$$I = I_0\exp(-\mu x)$$
where I is the transmitted intensity, $I_0$ the incident intensity, x the distance
travelled by the X-rays in the material, i.e. the thickness of the filter, and μ
is the linear absorption coefficient which depends on the substance, its
density, and the wavelength of the X-rays. Since μ depends on the density
of the material, the quantity that is usually tabulated is $\mu_m = \mu/\rho$, the mass
absorption coefficient, a characteristic of the substance that depends only on
the wavelength considered. Complete tables of $\mu_m$ as a function of the
wavelength for different materials used as filters can be found in Koch and
MacGillavry, and a plot of $\mu_m$ versus λ for nickel is shown in Fig. 4.8
along with the radiation spectrum generated by a copper anode. In the
figure it can be seen that the curve of $\mu_m$ versus λ shows two continuous
branches separated by a sharp discontinuity, called the absorption edge. If
the filter is a pure element, the continuous parts of the curve follow
approximately the equation
$$\mu_m = kZ^3\lambda^3 \tag{4.17}$$
where k is a constant with different values for the two branches of the curve
and Z is the atomic number of the element. This equation shows why harder
X-rays, i.e. X-rays with a shorter wavelength, are absorbed less than those
242 1 Hugo L. Monaco
Fig. 4.8. The broken line represents the variation of the mass absorption coefficient $\mu_m$ as a function of the wavelength for nickel. The continuous line is the X-ray spectrum generated by a copper anode (the continuous background is labelled 'white radiation'). Notice that the absorption edge of nickel falls in between the Kα and the Kβ characteristic lines of copper.
with a longer wavelength. The presence of the absorption edge in the curve
is explained by the fact that the photons at the edge have the wavelength
corresponding to the energy necessary to eject an electron from the K
orbital of the atoms of the filter. Thus, when this energy is reached massive
absorption of radiation occurs with photoionization of the filter and
production of fluorescent radiation.
Similarly to the displacement of the position of the characteristic lines
with Z, absorption edges move to shorter wavelength as the atomic number
of the element increases. A common single filter is chosen so that its
absorption edge falls in between the Kα and the Kβ peaks of the anode that
has been used to generate the X-rays. In this way, the unwanted radiation
of highest intensity, i.e. the unavoidable Kβ peak, can be greatly attenuated
without reducing too much the intensity of the Kα peak that will be used for
the experiment. Figure 4.8 shows that a nickel filter has its absorption edge
at the wavelength necessary to very strongly absorb the copper Kβ peak.
Since copper has Z = 29 and nickel Z = 28 there is a difference of one
between the atomic numbers of the target anode used to generate the
X-rays and the filter with an absorption edge falling in between its Kα and
Kβ peaks. This is generally true for every element with Z ≤ 70 and for the
elements of the second long row of the periodic table it is also true that both
the elements of Z − 1 and Z − 2 can be used to absorb the Kβ peak of the
anode with atomic number Z. Thus both Nb (Z = 41) and Zr (Z = 40) can
be used as filters for Mo (Z = 42) radiation.
The relative intensities of the Kα and the Kβ peaks depend not only on
the absorption coefficient of the filter but also on its thickness. Roberts and
Parrish[16] give a table of the appropriate filter thicknesses necessary to
produce Kα/Kβ ratios of 100 and 500 for different elements used as targets
and filters. The same table also gives the percentage of Kα peak lost by
filtering which can vary between about 40 and 70 per cent.
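The trade-off between Kβ suppression and Kα loss follows directly from Beer's law. The sketch below estimates the nickel thickness needed to bring a Cu Kα/Kβ ratio of 5 up to 100. The density and the two mass absorption coefficients are approximate illustrative values assumed here, not figures taken from the text or from the table of Roberts and Parrish, and the function names are hypothetical.

```python
import math

# Illustrative, approximate values (assumptions of this sketch):
RHO_NI = 8.9            # density of nickel, g/cm^3
MU_M_KALPHA = 49.0      # mass absorption coefficient of Ni for Cu K-alpha, cm^2/g
MU_M_KBETA = 280.0      # mass absorption coefficient of Ni for Cu K-beta, cm^2/g
UNFILTERED_RATIO = 5.0  # average K-alpha/K-beta ratio quoted in the text


def transmission(mu_m, rho, thickness_cm):
    """Beer's law: I/I0 = exp(-mu * x), with mu = mu_m * rho."""
    return math.exp(-mu_m * rho * thickness_cm)


def thickness_for_ratio(target_ratio):
    """Filter thickness (cm) at which the K-alpha/K-beta ratio reaches target_ratio."""
    delta_mu = (MU_M_KBETA - MU_M_KALPHA) * RHO_NI  # cm^-1
    return math.log(target_ratio / UNFILTERED_RATIO) / delta_mu


x = thickness_for_ratio(100.0)
kalpha_lost = 1.0 - transmission(MU_M_KALPHA, RHO_NI, x)
print(f"thickness = {x * 1e4:.1f} micrometres, K-alpha lost = {kalpha_lost:.0%}")
```

With these assumed coefficients the thickness comes out at roughly 15 micrometres and a little under half of the Kα intensity is lost, which falls within the 40 to 70 per cent range quoted above.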
A variation of the simple filter technique is the Ross balanced-filter
method.[16] In this method two filters are used: one with its absorption edge
at slightly shorter and the other at slightly longer wavelength than the Kα
peak selected. The thickness of the filters is chosen so that the radiation is
absorbed to the same extent except in the interval in between the two
absorption edges. With this technique two measurements are made with
either one of the two filters in position and the measured intensity is then
taken to be the difference between the two values obtained.
Crystal monochromators
An alternative and more selective way to produce a beam of X-rays with a
narrow wavelength distribution is by using a single-crystal monochromator.
Bragg's equation (3.32) shows that when radiation of different wavelengths
impinges upon a crystal, diffracted beams are observed at scattering
angles θ that depend on the wavelength of the radiation λ. Thus, selecting a
given diffraction angle θ is equivalent to choosing a particular wavelength
out of the spectrum incident on the crystal.
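The correspondence between angle and wavelength can be sketched numerically. The example assumes Bragg's equation in its usual form nλ = 2d sin θ and, purely as an illustration, a monochromator crystal with an interplanar distance of about 3.355 Å (close to the 002 spacing of graphite); the function name and the chosen crystal are assumptions, not taken from the text.

```python
import math


def bragg_angle_deg(wavelength, d_spacing, order=1):
    """Bragg angle theta in degrees from n*lambda = 2*d*sin(theta).

    Wavelength and d_spacing must be in the same units (here angstrom).
    """
    s = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("reflection not accessible at this wavelength")
    return math.degrees(math.asin(s))


D_SPACING = 3.355  # angstrom, assumed example value

theta_cu = bragg_angle_deg(1.5418, D_SPACING)  # Cu K-alpha
theta_mo = bragg_angle_deg(0.7107, D_SPACING)  # Mo K-alpha
print(f"Cu K-alpha: theta = {theta_cu:.2f} deg, Mo K-alpha: theta = {theta_mo:.2f} deg")
```

Each wavelength is diffracted at its own angle, so setting the crystal at the angle computed for Kα rejects the other components of the spectrum, apart from harmonics that satisfy the same equation with a higher order n.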
The simplest type of crystal monochromator consists of a single crystal
with one face parallel to a major set of crystal planes mounted so that its
orientation with respect to the X-ray beam can be properly adjusted. The
most important properties of a crystal monochromator are:
(1) the crystal used must be mechanically strong and should be stable in the
X-ray beam;
(2) the interplanar distance should be in the appropriate range to allow the
selection of the desired wavelength at a reasonable scattering angle;
(3) the presence of one or more strong diffracted intensities that can be
chosen so that the intensity loss of the beam, which is always
appreciable, may be reduced as much as possible; and
(4) the mosaicity of the crystal, which determines the divergence of the
diffracted beam and the resolution of the crystal, should be small.
The reflection chosen should also have a scattering angle as small as possible
in order to minimize the loss of intensity due to the polarization factor (see
p. 303). Roberts and Parrish[16] give a table with the important properties of
crystals commonly used as monochromators.
In a variation of this simple type of flat monochromator, the crystal
surface is cut so that it forms an angle with the set of planes that diffract the
radiation. In this way, the diffracted beam has a smaller width and as a
result more photons are concentrated in a smaller cross section of the
beam.[16] By properly curving their surface, crystal monochromators can be
used to focus the X-ray beam in a very small area.[19] The curvature of the
surface can be produced by simply bending the crystal, in which case the
diffracting planes should ideally be tangential to the curved surface. If the
monochromator is bent in the shape of a cylinder of elliptical section with
the source in one of the foci, the reflected radiation will concentrate on the
other focus of the ellipse. A further variation consists not only of bending
the crystal but also of grinding its surface so that the radius of curvature of
the diffracting planes of the crystal is different from that of its surface. The
advantage of this type of monochromator is that it does not suffer from
some optical aberrations present in singly bent crystals.[16] Curved crystal
monochromators are frequently used to select the wavelength of synchrotron
radiation. In addition to the requirements stated before, the
crystals should have in this case a very small thermal expansion and a large
thermal conductivity because the power applied is much larger than in the
case of conventional sources.[18]
Another type of monochromator of wide application in synchrotron
sources is the double-crystal monochromator in which the incident X-ray
beam is diffracted twice by two similar crystals. This type of monochromator
can be constructed with different geometries designed to improve the
resolution and/or to keep the X-ray beam in the original direction. A
discussion of this type of crystal monochromator can be found in
Margaritondo.[8] Crystal monochromators are more selective than filters
and, in the case of conventional generators, are capable of resolving the Kα1
and Kα2 doublet which cannot be separated by any filtering method.
Collimators
The function of collimators is to define a narrow cylindrical beam of X-rays
that ideally should be as parallel as possible.
A simple pinhole collimator is shown sketched in Fig. 4.9. It consists of a
cylinder with two apertures defining the beam and a third guard aperture
which does not affect the beam size defined by the other two but eliminates
the radiation scattered by the defining aperture furthest from the X-ray
source. These apertures are commonly circular, although slits can be used
instead in which case square or rectangular beams can be defined.
Cylindrical beam pinhole collimators are typically used with conventional
sources to define a beam of radiation that is monochromatized by either a
filter or a crystal monochromator. Such collimators never produce an ideally
parallel X-ray beam but, in addition to the parallel X-rays, they also
produce convergent and divergent X-rays as shown in the figure. A
conventional X-ray source, when viewed at the appropriate take-off angle is
seen as a square. If l is the distance between the two defining apertures $S_1$
and $S_2$ and d is the diameter of the collimator the maximum angle of
divergence of the beam, γ, can be calculated as shown in the figure as
$$\tan\frac{\gamma}{2} = \frac{d/2}{l/2} = \frac{d}{l}$$
and since the angle is very small
$$\gamma \simeq \frac{2d}{l}$$
Fig. 4.9. Pinhole collimator showing the angle of greatest divergence γ. $S_1$ and $S_2$ are the two apertures defining the beam which are separated by the distance l, and have a diameter of d; $S_3$ is the guard aperture.
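A quick numerical check shows how good the small-angle form is for typical collimator dimensions. The aperture diameter and separation used below are hypothetical values chosen for illustration, as are the function names.

```python
import math


def max_divergence(aperture_diameter, aperture_separation):
    """Exact maximum divergence (radians) from tan(gamma/2) = d/l."""
    return 2.0 * math.atan(aperture_diameter / aperture_separation)


def max_divergence_small_angle(aperture_diameter, aperture_separation):
    """Small-angle approximation gamma = 2d/l (radians)."""
    return 2.0 * aperture_diameter / aperture_separation


D_MM = 0.5    # aperture diameter in mm (hypothetical)
L_MM = 100.0  # separation of the two defining apertures in mm (hypothetical)

gamma_exact = max_divergence(D_MM, L_MM)
gamma_approx = max_divergence_small_angle(D_MM, L_MM)
print(f"exact: {math.degrees(gamma_exact):.4f} deg, "
      f"approximate: {math.degrees(gamma_approx):.4f} deg")
```

For these dimensions the divergence is about half a degree and the two expressions differ by less than one part in ten thousand, so the approximation is harmless in practice.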
variables that can be adjusted are the crystal and X-ray focus size, the
crystal to focus distance, and the crystal to detector distance. Depending
also on the reflection to reflection resolution necessary for the experiment,
different conditions are found which maximize the signal to noise ratio given
the restrictions imposed by the experiment.
Mirrors
X-rays can be reflected by mirrors when the angle of incidence is smaller
than a certain critical angle $\theta_c$ which is a function of the wavelength λ and
can be calculated by the equation[17,19]
where Z is the atomic number, ρ the density, and A the atomic weight of
the reflecting material. If the angle of incidence is chosen to be within ten
per cent of $\theta_c$ for the Cu Kα radiation, the X-rays having a wavelength
shorter than the corresponding λ, in particular the fairly intense Kβ peak,
will not be reflected and therefore they will be eliminated from the beam.
Thus, by properly choosing the glancing angle of the X-rays on the mirror
the radiation can be partially monochromatized. The values of the critical
angles $\theta_c$ depend on the reflecting material as shown by the equation given
above and they are, in general, very small: 14' for glass and 23' for nickel
mirrors, calculated for a wavelength corresponding to the Cu Kα radiation.
A table of this property and other parameters of interest of mirrors can be
found in Witz.[19]
If the reflecting surface of the mirror is curved, ideally in the shape of an
elliptical cylinder with the source in one focus, the reflected radiation will
converge to the other focus of the ellipse and a very intense X-ray beam will
be obtained at that point.
This principle is used in the design of a very powerful device that is used
to focus and partially monochromatize an X-ray beam and which uses two
curved mirrors with perpendicular axes of curvature.[21] This double-mirror
system has been used for X-ray diffraction work on virus crystals which,
having very large unit cell parameters, pose particularly serious problems
for the spatial separation of the very close diffracted beams.[22] Mirrors are
also very extensively used in the beamlines of synchrotrons. In addition to
focusing and partial monochromatization they perform several other functions:
splitting of a beam into two, magnification or demagnification of the
source, and change in the polarization of the radiation.[18] The function to
be performed determines their geometry and so their surface can be flat or
curved and the mirror can be bent or segmented, that is constituted by
several small flat pieces which are easier to produce than large curved single
mirrors.
defined (on p. 155) and it was shown that diffraction is observed whenever a
reciprocal lattice node lies on this sphere. The direction of the diffracted
beam is determined by the vector joining the centre of the Ewald sphere
and the reciprocal lattice node lying on its surface. We saw that the
characteristics of the crystal and the conditions of the experiment cause
these nodes of reciprocal space to have a volume. We will make use of all
these concepts in our discussion of the data collection devices that will
follow.
If the scattering experiment is performed with radiation of a single
wavelength there will be a single Ewald sphere of radius $1/\lambda$ and the
probability that a stationary reciprocal lattice node may by chance be on its
surface will be fairly low. Furthermore, and as pointed out on p. 163,
diffraction from only a cross-section of the node is not acceptable since it is
the entire volume that should give rise to the diffracted intensity from which
the structure factor amplitude is to be derived. In addition, in order to solve
a structure, one needs all of the diffracted intensities that can be measured
corresponding to the nodes found within a sphere whose radius $D^*$ is
the inverse of the resolution of the structure. Broadly speaking there are
two ways to tackle these problems: the first is to use polychromatic
radiation, i.e. to have a series of Ewald spheres corresponding to different
wavelengths; the second method is to move the reciprocal lattice nodes, i.e.
the crystal, so that all the nodes from which one wishes to measure the
diffraction cross the Ewald sphere completely. The first method is historically
the oldest, it is called the Laue method, and it was with it that
diffraction from crystals was discovered.[23]
In Fig. 4.10, the shaded area represents the volume of reciprocal space
containing all the nodes that will produce diffraction when the stationary
specimen is hit with radiation of a wavelength in the interval between $\lambda_{\min}$
and $\lambda_{\max}$. In the Laue method, different cross-sections of a node are excited
by radiation of slightly different wavelengths and the result is an intensity
integrated over the wavelength rather than over the volume sweeping
through a single Ewald sphere. If the white radiation spectrum resulting
from a conventional generator is used to produce Laue diffraction, the
practical applications of the method are rather limited.[24,25] It is for this
reason, and also because the diffraction pattern produced is rather difficult
to interpret, that the method fell into disuse until the advent of synchrotron
radiation and the development of very powerful computers. The use of the
Laue method with synchrotron radiation[26] provides an extremely fast and
efficient method to record diffraction data. The applications are already
important in the field of small-molecule[27,28] as well as macromolecular
crystallography[29,30] where it has opened up the possibility of performing
time-resolved studies in the crystalline state.[31]
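The Laue condition for a stationary node can be sketched in a few lines. The example assumes a unit vector along the incident beam and an Ewald sphere of radius $1/\lambda$ passing through the origin of reciprocal space, so that its centre lies at minus the beam vector divided by $\lambda$. The function names and the choice of the beam along x are illustrative assumptions.

```python
def laue_wavelength(r_star, beam=(1.0, 0.0, 0.0)):
    """Wavelength at which a stationary node lies on the Ewald sphere.

    r_star is the node position (reciprocal angstroms); beam must be a unit
    vector along the incident X-rays. Returns None if no wavelength works.
    The condition |r* + beam/lambda| = 1/lambda gives
    lambda = -2 (r* . beam) / |r*|^2.
    """
    dot = sum(a * b for a, b in zip(r_star, beam))
    norm2 = sum(a * a for a in r_star)
    if norm2 == 0.0 or dot >= 0.0:
        return None
    return -2.0 * dot / norm2


def excited(r_star, lam_min, lam_max, beam=(1.0, 0.0, 0.0)):
    """True if the node diffracts for some wavelength in [lam_min, lam_max]."""
    lam = laue_wavelength(r_star, beam)
    return lam is not None and lam_min <= lam <= lam_max


node = (-0.1, 0.2, 0.0)
print(laue_wavelength(node), excited(node, 0.5, 2.5))
```

Each node is thus excited by exactly one wavelength (or none), which is why a whole band of wavelengths is needed to record many reflections from a stationary crystal.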
All the other data collection methods discussed in this chapter use
monochromatic radiation and therefore require a more or less complicated
mechanism to move the crystal as the diffracted radiation is measured.
Using this value, the unit cell parameter corresponding to the real axis
coincident with the rotation axis can be calculated.
Incidentally, notice that because of this relationship equally spaced planes
in reciprocal space do not produce equally spaced layer lines on the film.
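The uneven spacing of the layer lines can be made concrete with a small sketch. It assumes the standard geometry of a cylindrical film of radius r coaxial with the rotation axis, for which the n-th layer line at height $y_n$ satisfies $\sin[\tan^{-1}(y_n/r)] = n\lambda/c$, c being the axis length. The function names and numerical values are illustrative assumptions.

```python
import math


def layer_height(n, axis_length, wavelength, film_radius):
    """Height of the n-th layer line above the equator (same units as radius)."""
    s = n * wavelength / axis_length
    if abs(s) >= 1.0:
        raise ValueError("layer line not observable at this wavelength")
    return film_radius * math.tan(math.asin(s))


def axis_from_layer(n, height, wavelength, film_radius):
    """Axis length recovered from the measured height of the n-th layer line."""
    return n * wavelength / math.sin(math.atan2(height, film_radius))


heights = [layer_height(n, 10.0, 1.5418, 28.65) for n in range(4)]
print([round(h, 2) for h in heights])
print(axis_from_layer(2, heights[2], 1.5418, 28.65))
```

The successive differences between the printed heights grow with n, even though the reciprocal lattice planes themselves are equally spaced.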
The first is the use of a layer line screen that blocks all the diffracted
radiation with the exception of that due to one selected reciprocal lattice
plane at a time. The second difference is the coupling of the rotation motion
to a displacement of the film along the cylinder axis. In this way, spots
belonging to the same reciprocal lattice plane that cross the Ewald sphere at
different times and which would end up recorded on the same layer line are
recorded instead in different positions on the film. Thus, a single reciprocal
lattice plane is mapped on to the film plane.
Figure 4.13(a) shows the Ewald sphere projected on to a reciprocal lattice
plane that would produce one layer line in a normal cylindrical film rotation
camera. The view is in the direction of the rotation axis and each of the
lines represented corresponds to a reciprocal lattice row of a given index. In
this representation the origin of reciprocal space is found at point O, the
intersection of the incident X-ray beam with the Ewald sphere. The line
tangent to this point has one index equal to zero and corresponds to one of
the reciprocal lattice axes. In Fig. 4.13(b) the reciprocal lattice point P is
crossing the Ewald sphere and therefore it is in diffracting position. Its
coordinates, shown on the unrolled film on the right, are x and z; the first is
proportional to the angle $2\theta$ as can be seen in the same figure, the second is
proportional to the rotation angle $\omega$ since, as already pointed out, crystal
rotation and film translation along the cylinder axis are coupled.
If $r_f$ is again the cylindrical cassette radius one can write
250 1 Hugo L. Monaco
and
Normally $r_f$ is chosen so that $C_1$ has the value of 2° mm$^{-1}$. Thus measuring
the x coordinate of a reflection in millimetres one can automatically
calculate the corresponding value of $2\theta$.
The coupling parameters of the film movement to the crystal rotation are
chosen so that the constant $C_2$ in
is also made equal to 2 and thus, the two angles $\omega$ and $2\theta$ can be measured
on the film on the same scale.
Figure 4.13(b) shows that a normal to the zero-level reciprocal lattice row
passing through A bisects the angle $2\theta$. Thus, for this particular level, $\theta$ is
equal to $\omega$ because the sides of the two angles are perpendicular and one
can write
and
$$x = 2C_2 z / C_1$$
which is the equation of a straight line of slope 2 if the two constants $C_1$ and
$C_2$ are chosen to be equal. Thus the reciprocal lattice axis passing through O
will produce on the film a series of spots that will be found on a straight line
which will normally have a slope of 2.
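The film-coordinate relations can be summarized in a short sketch. It assumes the proportionalities described above in the form $2\theta = C_1 x$ and $\omega = C_2 z$, with angles in degrees and film coordinates in millimetres; the function names are illustrative.

```python
import math


def cassette_radius_mm(c1_deg_per_mm=2.0):
    """Cassette radius giving the requested C1.

    On a cylinder, x = r_f * 2theta(rad), so 2theta(deg) = x * 180 / (pi * r_f)
    and therefore C1 = 180 / (pi * r_f).
    """
    return 180.0 / (math.pi * c1_deg_per_mm)


def film_to_angles(x_mm, z_mm, c1=2.0, c2=2.0):
    """Return (two_theta_deg, omega_deg) for a spot at film position (x, z)."""
    return c1 * x_mm, c2 * z_mm


def zero_level_x(z_mm, c1=2.0, c2=2.0):
    """x coordinate of a zero-level axial spot, for which theta = omega."""
    return 2.0 * c2 * z_mm / c1


print(round(cassette_radius_mm(), 3))
two_theta, omega = film_to_angles(zero_level_x(10.0), 10.0)
print(two_theta, omega)
```

With both constants equal to 2° mm$^{-1}$ the cassette radius comes out close to 28.65 mm, and a spot on the axial line at z = 10 mm has $2\theta$ equal to twice $\omega$, as required.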
When the angle $\omega$ reaches the value of 90° the zero-level line will be
found in the direction of the X-ray beam and the line traced by the spots
will have reached the point where the film is cut to let the X-rays through.
Immediately thereafter the spots will be recorded on the other side of the
film interruption, that is they will begin to be recorded on the bottom half of
the film. When $\omega$ equals 180° the zero-level line is found again tangent to
the Ewald sphere but it has been flipped over. The spots recorded after that
will change the sign of the only index which is varying along the line. Figure
4.14 shows the Ewald construction and the film appearance at the beginning
of the rotation cycle and after a rotation angle of slightly more than 180°.
If the second reciprocal lattice axis, which has to be found in the plane
selected for recording, forms with this axis an angle of $\alpha^*$, when $\omega$ equals
$\alpha^*$ this second axis will be found tangent to the sphere of reflection and
after that will begin to produce a second straight line parallel to that traced
by the first axis and separated from it by a distance $\omega = \alpha^*$. Thus, reciprocal
lattice axes are identified in the photograph by the straight lines they
produce and the angle between them is simply the $\omega$ value that separates
the lines on the film.
Figure 4.15 shows that for reciprocal lattice lines not passing through the
origin of reciprocal space, $\theta$ is not equal to $\omega$ and therefore the relationship
between x and z is no longer the equation of a straight line. It is instead a
curve of the type shown on the right-hand side of Fig. 4.15. Each of the
layer lines that do not pass through O will produce a curve similar to the
one shown in the figure and therefore the film will show a family of
non-intersecting curves or festoons that will be found on both sides of the
line crossing the centre of the film, at increasing distances from the centre.
Each festoon corresponds to one reciprocal space line and therefore the
reflections found on it will have one index in common. On every film there
will be two festoon families, one for each of the two reciprocal lattice axes
found in the plane that is being examined. All non-central reflections are
found at the intersections of two festoons, one from each of the two families
found on the film.
Figure 4.16 is a picture of the camera and Fig. 4.17 is a typical zero-level
Weissenberg photograph. After the plane axes and the festoons have been
identified, it is not difficult to index a photograph like this by inspection.
beam, it is instead tilted so that its intersection with the reciprocal lattice
plane falls on the Ewald sphere. As a consequence, the angle made by the
incident and diffracted X-rays with a normal to the rotation axis going
through the centre of the Ewald sphere are equal (see Fig. 4.18(b)) and the
method is called the equi-inclination method. A detailed description of this
method is found in Buerger.
Since the main uses of the Weissenberg camera are nowadays space-group
and unit cell parameter determination and no longer quantitative intensity
measurements, upper-level Weissenberg photographs are seldom recorded.
described by the same Figs 4.19(b) and (c) if we imagine that we are viewing
the Ewald sphere not from a side but from the top or the bottom.
When a full revolution has been completed the intersection of the
reciprocal lattice plane and the Ewald sphere has described a circle whose
radius is the diameter of the intersection and equal to
$$2 \sin\mu / \lambda. \qquad (4.22)$$
All the points in the reciprocal lattice plane which are found within this
radius have passed through the Ewald sphere and therefore have produced
a diffracted beam which has been recorded on the film. Since the precession
angle is usually not larger than 30°, the maximum radius of this circle is
normally $\lambda^{-1}$.
In Fig. 4.19(b) the reciprocal lattice point R is inside the Ewald sphere, in
Fig. 4.19(c) it is outside. When a full cycle has been completed and we are
back to the situation shown in Fig. 4.19(b), the point is again inside the
Ewald sphere. Thus in a full precession cycle, every reciprocal lattice point
that will produce a signal crosses the Ewald sphere twice, moving each time
in opposite directions.
and
Thus
$$\cos\nu = \cos\mu - n d^* \lambda$$
and finally
$$s = r_s \cot\left[\cos^{-1}\left(\cos\mu - n d^* \lambda\right)\right] \qquad (4.25)$$
which can be used to calculate the crystal to screen distance for any
upper-level precession photograph. Incidentally, we notice that if n is made
equal to 0 this equation reduces to (4.23), the expression derived for a
zero-level precession photograph.
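A minimal sketch of the screen setting follows, taking $\mu$ as the precession angle, $r_s$ as the screen annulus radius, n as the level number and $d^*$ as the spacing between reciprocal lattice planes. The reading of the garbled symbols and the function names are assumptions; the radius of the recorded circle, $2\sin\mu/\lambda$, is included for comparison with (4.22).

```python
import math


def screen_distance(r_s, mu_deg, n=0, d_star=0.0, wavelength=1.5418):
    """Crystal-to-screen distance s = r_s * cot(nu), cos(nu) = cos(mu) - n d* lambda."""
    cos_nu = math.cos(math.radians(mu_deg)) - n * d_star * wavelength
    if not -1.0 < cos_nu < 1.0:
        raise ValueError("level not accessible with these settings")
    return r_s / math.tan(math.acos(cos_nu))


def recorded_radius(mu_deg, wavelength):
    """Radius (reciprocal units) of the zero-level region swept in one cycle."""
    return 2.0 * math.sin(math.radians(mu_deg)) / wavelength


zero = screen_distance(20.0, 25.0)
upper = screen_distance(20.0, 25.0, n=1, d_star=0.05)
print(round(zero, 2), round(upper, 2), round(recorded_radius(30.0, 1.5418), 4))
```

For n = 0 the function reduces to $r_s \cot\mu$, and for an upper level the screen must be moved closer to the crystal.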
Re-examining Fig. 4.22, we notice that there is an area of the reciprocal
lattice plane that will never pass through the Ewald sphere. The projection
of this area onto the plane of the figure is the line that goes from the point
0, to Q,. The blind region circle has a radius, rb, that can be calculated
from Fig. 4.22
$$r_b = \frac{1}{\lambda}\,(\sin\nu - \sin\mu).$$
and
$$a^* = d_f / \lambda f \qquad (4.28)$$
where f is the crystal to film distance and a* the reciprocal lattice parameter
from which the unit cell parameter can be easily calculated.
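As a brief illustration, the relation can be applied to a measured row spacing. The sketch reads (4.28) as $a^* = d_f/(\lambda f)$, with $d_f$ the spacing between adjacent rows on the film; this reading, the function names and the numbers are assumptions. The conversion to a direct cell edge shown here holds only when the axes are orthogonal.

```python
def reciprocal_parameter(d_film_mm, wavelength, crystal_to_film_mm):
    """Reciprocal lattice parameter (1/angstrom) from a spacing on the film."""
    return d_film_mm / (wavelength * crystal_to_film_mm)


def direct_parameter_orthogonal(a_star):
    """Direct cell edge, valid only for mutually orthogonal axes."""
    return 1.0 / a_star


# Illustrative values: 10 row spacings spanning 18.5 mm, f = 60 mm, Cu K-alpha
a_star = reciprocal_parameter(18.5 / 10, 1.5418, 60.0)
print(round(a_star, 5), round(direct_parameter_orthogonal(a_star), 2))
```

Measuring the distance across several rows and dividing by their number, as done here, reduces the effect of the error in locating individual spots.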
is wasted. If the crystals do not decay in time, that is if the intensity of the
reflections does not change as the crystal is exposed to the X-ray, this is not
too serious a problem; with sufficient time available all the reciprocal
lattice region of interest can, in theory, be explored one layer at a time. But
if radiation damage is a problem, as is the case of macromolecular crystals,
both the precession and Weissenberg methods are extremely inefficient. For
example, the recording of a complete data set of a protein crystal with the
precession camera usually requires one crystal per photograph, i.e. one
crystal per reciprocal lattice plane. Although in the early days of protein
crystallography this was the way that data were collected, the search for a
more efficient way to record diffracted intensities led in the early 1970s to
the reintroduction of the screenless flat-film rotation (oscillation) method
for macromolecular data collection. The flat-film rotation method had been
used since the very beginning of X-ray crystallography[36] but it had been
abandoned due to the difficulties in indexing and quantitatively measuring
the reflection intensities on the film. This situation was changed by the
introduction of computer-controlled microdensitometers which assured that
films could be conveniently scanned and reliable intensities could be
extracted from them. A new type of flat-film rotation camera with eight
cassettes that are used for successive exposures, the Arndt-Wonacott
camera,[37,38] was built, and in a relatively short time the rotation method
became one of the major, if not the major, method for macromolecular
crystal data collection.
that contain all the reciprocal lattice points that will produce diffraction on
the flat film perpendicular to the X-ray beam. Thus a rotation photograph
contains reflections coming from all the reciprocal layers that intersect the
Ewald sphere as shown in Fig. 4.28. The reflections contained in each of the
lune pairs come from the same reciprocal lattice plane and therefore have
one index in common.
Since nodes in reciprocal space have a volume, a reflection is not
completely recorded on the film until the entire volume has passed through
the Ewald sphere. Any reflection whose reciprocal lattice node has not
completely passed through the Ewald sphere is called a partially recorded
reflection or, more improperly, half spot, regardless of the percentage of the
volume that has passed through the sphere of reflection. Since the rotation
range in macromolecular crystallography is usually quite small, as we will
see, a substantial number of reflections on a rotation film are partially
recorded reflections. These have to be properly identified and dealt with
during film processing. One of the reasons why the rotation method has
been so successful is that it has been found that reflections partially
recorded on different films can be added together[38] to yield a reliable value
for the total diffracted intensity.
Figure 4.29 shows the idealized shape of the partially recorded reflections,
which is different if they are recorded at the beginning or the end of the
rotation range. In one case the missing part of the reflection has been
recorded in the previous film, in the other it will be recorded on the next.
Re-examining Fig. 4.28 we notice that the area of reciprocal space that
crosses the Ewald sphere becomes a series of points along the projection of
the rotation axis. Thus, no matter how small their reflecting range,
reflections found along this line will always be partially recorded.
Fig. 4.29. The idealized shape of partially recorded reflections. The spots labelled A are recorded at one end of the rotation range and those labelled B at the other end. The missing parts of reflections A will be found on the previous rotation photograph and those of spots B on the following one.
Wonacott has examined the factors that limit the maximum rotation
range allowed without having reflection overlap from different reciprocal
lattice planes on the film. The expression for the maximum rotation range
is:
$$\Delta\varphi_{\max} = |r^*| / R^*_{\max} - \Delta \qquad (4.29)$$
where $\Delta\varphi_{\max}$ is the maximum allowed rotation range in radians, $R^*_{\max}$ the
rotation range per photograph. Still, there may be practical reasons that
partially or totally limit the freedom of choice of the rotation axis. An
example is crystal morphology. A crystal shaped as a very thin plate with its
highest symmetry axis perpendicular to the plate cannot be easily mounted
with that axis parallel to the spindle.
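To give a feeling for the rotation ranges involved, the sketch below evaluates the overlap limit in the form $\Delta\varphi_{\max} = |r^*|/R^*_{\max} - \Delta$. It assumes that $|r^*|$ is the reciprocal lattice spacing along the X-ray beam, $R^*_{\max}$ the radius of the limiting sphere for the chosen resolution and $\Delta$ the reflecting range; these interpretations, the function name and the numbers are assumptions made for the example.

```python
import math


def max_rotation_range_deg(r_star, r_star_max, reflecting_range_deg):
    """Largest rotation range (degrees) before adjacent lunes overlap."""
    delta_phi = r_star / r_star_max - math.radians(reflecting_range_deg)
    return math.degrees(delta_phi)


# 100 angstrom axis along the beam, 2.5 angstrom resolution, 0.2 degree reflecting range
print(round(max_rotation_range_deg(1.0 / 100.0, 1.0 / 2.5, 0.2), 3))
```

With these illustrative values the limit is a little over one degree, which shows why large unit cells force small rotation ranges per photograph and why the choice of rotation axis matters.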
as can be seen in Fig. 4.32. If the angle $\varphi$ at which a given reciprocal lattice
point crosses the Ewald sphere is known, since $x_0$, $y_0$, and $z_0$ are only
functions of the reflection indices and the unit cell parameters, x, y, and z,
the reflection coordinates in the laboratory system, can be calculated.
The fourth coordinate system is the projection of the laboratory coordin-
ate system on to the film plane. In order to convert from the laboratory to
the film coordinate system we only need to know the crystal to film distance.
Figure 4.32 shows the relationship between the crystal, laboratory, and film
coordinate systems.
Thus we can calculate the film coordinates for a reflection if we know the
angle $\varphi$ at which this reflection crossed the Ewald sphere.
Let us consider the reciprocal lattice point P which crosses the Ewald
sphere at a rotation angle $\varphi$. Figure 4.33 shows that P is on the Ewald
sphere when the centre of the sphere has moved from A to A', that is from
$(\lambda^{-1}, 0, 0)$ to $(\lambda^{-1}\cos\varphi, \lambda^{-1}\sin\varphi, 0)$.
Since P lies on the Ewald sphere, its coordinates must satisfy the equation
of a sphere
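One way to carry this condition through numerically is sketched below. With the centre of the sphere at $(\lambda^{-1}\cos\varphi, \lambda^{-1}\sin\varphi, 0)$ and the sphere passing through the origin, the condition that $P = (x_0, y_0, z_0)$ lies on it reduces to $x_0\cos\varphi + y_0\sin\varphi = \lambda d^{*2}/2$, where $d^{*2} = x_0^2 + y_0^2 + z_0^2$. The function names are illustrative assumptions, and the result is a sketch rather than the procedure of any particular program.

```python
import math


def crossing_angles(x0, y0, z0, wavelength):
    """Rotation angles (radians) at which P lies on the Ewald sphere.

    Solves x0 cos(phi) + y0 sin(phi) = lambda * d*^2 / 2. Returns an empty
    list when the point never crosses the sphere (blind region).
    """
    d2 = x0 * x0 + y0 * y0 + z0 * z0
    rho = math.hypot(x0, y0)
    if rho == 0.0:
        return []
    c = wavelength * d2 / (2.0 * rho)
    if abs(c) > 1.0:
        return []
    base = math.atan2(y0, x0)
    delta = math.acos(c)
    return [base - delta, base + delta]


def distance_to_centre(x0, y0, z0, phi, wavelength):
    """Distance from P to the sphere centre for rotation angle phi."""
    cx, cy = math.cos(phi) / wavelength, math.sin(phi) / wavelength
    return math.sqrt((x0 - cx) ** 2 + (y0 - cy) ** 2 + z0 ** 2)


for phi in crossing_angles(0.10, 0.05, 0.02, 1.5418):
    print(round(math.degrees(phi), 3), distance_to_centre(0.10, 0.05, 0.02, phi, 1.5418))
```

The two solutions correspond to the point entering and leaving the sphere; a point on the rotation axis, for which $x_0 = y_0 = 0$, has no solution.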
The rotation picture recorded before the one shown in Fig. 4.35 was
exposed with the X-rays in approximately the direction of the b axis. In fact
in Fig. 4.35 we can see the small circle closer to the beam stop that
corresponds to the h0l plane.
The k index of the reflections is the easiest to identify; for the smallest
circle it is 0 and for the concentric lunes it is successively 1, 2, 3, etc.
(compare with Fig. 4.28). In order to find h and l we have to find the c* and
a* axes on the film; then indexing is done simply by counting spots. The c*
axis is horizontal because it coincides with the rotation axis; its intersection
with a* can be found by locating a*, and here we are aided by systematic
extinctions. The first index h is the most difficult to determine but the
position of the c* axis, which corresponds to h = 0, can be found by looking at
the picture exposed before Fig. 4.35, which shows symmetry about the
rotation axis. The axis of symmetry is the c* axis.
Bernal proposed the use of specially designed charts to more easily
index rotation photographs. These charts are nowadays not very widely
used.
Densitometry
The data collection cameras that we have seen in this chapter normally use
film as a detector for the radiation diffracted by the crystals. In Chapter 3
(p. 161) we have seen that the quantity that is proportional to the structure
factor amplitude is the integrated intensity of the diffracted X-rays.
Therefore, in order to be able to quantitatively use the data recorded with
these methods, the first step is the extraction of the relative integrated
intensities from the film. This is currently done using an instrument called a
Figure 4.36 shows the computer output from a scanning program with a
where $(n_0 - n)/n_0$ is the fraction of particles that have not been excited, a is
the fraction of radiation absorbed by excited and unexcited particles, and m
is the number of particles excited per photon, normally taken to be equal to
1. Solving the differential equation gives
$$n = n_0\left[1 - \exp(-maE/n_0)\right]$$
using as boundary conditions n = 0 for E = 0 and $n = n_0$ for very large
values of E.
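The differential equation itself is not reproduced in this passage. A form consistent with the definitions just given and with the stated solution, offered here as a reconstruction rather than as the original expression, is that the increment in excited particles is proportional to the exposure increment and to the fraction of particles still unexcited:

```latex
\begin{align*}
\frac{dn}{dE} &= m a\,\frac{n_0 - n}{n_0}
  && \text{(assumed starting equation)}\\
\int_0^{n} \frac{dn'}{n_0 - n'} &= \frac{m a}{n_0}\int_0^{E} dE'
  && \text{(separate variables, } n = 0 \text{ at } E = 0)\\
-\ln\frac{n_0 - n}{n_0} &= \frac{m a E}{n_0}\\
n &= n_0\left[1 - \exp\left(-\frac{m a E}{n_0}\right)\right]
\end{align*}
```

The result tends to $n_0$ for very large E, in agreement with the second boundary condition.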
As more and more particles are excited on the film, the fraction of light
transmitted, when the optical density is measured, will decrease. Assuming
that the fraction of light transmitted by the film is proportional to the area
that has not been excited one can obtain
$$d\left(\frac{I_t}{I_0}\right) = -\frac{f I_t}{I_0 K}\,dn$$
where $I_t/I_0$ is proportional to the unexcited area, K is the proportionality
constant, f is the surface covered by a grain, and dn the increment in excited
particles as before.
Solving the differential equation we obtain
$$I_t/I_0 = \exp(-fn/K),$$
$$2.3 \log(I_t/I_0) = -fn/K,$$
and
$$D = \frac{fn}{2.3K}.$$
The maximum optical density that can be measured, $D_{\max}$, will correspond
to $n = n_0$. If, in the expression of n as a function of E, we substitute
n and $n_0$ in terms of D and $D_{\max}$ we obtain
$$D = D_{\max}\left[1 - \exp(-S_0 E / D_{\max})\right] \qquad (4.34)$$
where $S_0 = maf/2.3K$.
This equation can be simplified[42] by two successive series expansions to
$$D/E = S_0\,(1 - D/2D_{\max}). \qquad (4.35)$$
From this equation we see that for large values of $D_{\max}$ and small optical
densities the relationship between optical density and exposure is linear.
The exposure E, that is the number of photons exciting the film, is
proportional to exposure time when a film is exposed to successively higher
X-ray doses by increasing the time that it is hit by a constant X-ray beam.
The parameter $D_{\max}$ is characteristic of the film and can be determined by
plotting D/E as a function of D. Some typical values have been determined
by Vonk and Pijpers and more recently by Elder for very widely used
types of X-ray film. The value of $D_{\max}$, representing the optical density of
the film when all the silver halide granules on it have been excited, is usually
well beyond the optical density range that can be measured with the
densitometer.
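The plot of D/E against D can be turned into a straight-line fit. According to the linearized relation $D/E = S_0(1 - D/2D_{\max})$ the intercept is $S_0$ and the slope is $-S_0/2D_{\max}$. The sketch below uses an ordinary unweighted least-squares line; the function name is hypothetical and the data are synthetic values generated from the relation itself, not measurements.

```python
def fit_film_speed(densities, exposures):
    """Fit D/E = S0 - (S0 / (2 Dmax)) * D and return (S0, Dmax)."""
    ratios = [d / e for d, e in zip(densities, exposures)]
    n = len(densities)
    mean_d = sum(densities) / n
    mean_r = sum(ratios) / n
    sxx = sum((d - mean_d) ** 2 for d in densities)
    sxy = sum((d - mean_d) * (r - mean_r) for d, r in zip(densities, ratios))
    slope = sxy / sxx
    s0 = mean_r - slope * mean_d      # intercept at D = 0
    d_max = -s0 / (2.0 * slope)
    return s0, d_max


# Synthetic data generated with S0 = 0.5 and Dmax = 6.0 (illustrative only)
true_s0, true_dmax = 0.5, 6.0
densities = [0.2, 0.5, 1.0, 1.5, 2.0]
exposures = [d / (true_s0 * (1.0 - d / (2.0 * true_dmax))) for d in densities]
print(fit_film_speed(densities, exposures))
```

Because $D_{\max}$ is obtained from the ratio of intercept to slope, it is poorly determined when only a narrow range of low densities is available.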
The ratio D/E is called the film speed at density D, $S_0$ is the initial speed,
and the function in parentheses shows how this value decreases as exposure
progresses.
Arndt et al. have derived a simple approximate expression for the
fractional standard deviation of an optical density measurement on film. If
$\sigma_D$ is the standard deviation
Microdensitometers
There are currently three different types of microdensitometers in use: the
rotating-drum, cathode ray tube and flat-bed microdensitometer. The
first kind is by far the most widely used by X-ray crystallographers and it is
for that reason that it will be briefly described here.
Rotating-drum densitometers have been described by Abrahamsson
and Xuong and their performance in the scanning of precession photographs
has been analysed by Nockolds and Kretsinger[48] and by Matthews
et al.[49]
In a rotating-drum scanner, the film is mounted on a cylindrical drum that
has a rectangular aperture for the film and which rotates about the cylinder
axis. The rotation speed is variable, and depending on the instrument can
be as high as 12 revolutions per second. A beam of visible light is passed
through the film and its intensity is measured by a detector. Source and
detector are stationary during one revolution of the drum and are
automatically stepped along the cylinder axis until the entire film is covered.
The incident light intensity $I_0$ is measured as the beam goes through air.
The raster size is variable; it can be 12.5, 25, 50, 100, or 200 μm or more
and the transmitted light is measured at intervals equal to these values.
Thus if a raster size of say 100 μm has been chosen a strip 100 μm wide will
be read in one revolution at 100 μm intervals. After that, source and
detector will be advanced 100 μm and another strip will be read until the
entire film is covered by 100 × 100 μm pixels. The instrument is interfaced
to a computer and the process is totally computer controlled.
When the light beam goes through air, the instrument gives an optical
density of 0. The maximum integer reading of 255 ($2^8 - 1$) can be made to
correspond to an optical density of either 2 or 3. The optical density values
thus measured can be stored on magnetic tape or directly in the computer
connected to the densitometer. There are two strategies used by computer
programs in processing densitometer data.[48,49] In the first, the entire film is
read as described above and the data are stored for subsequent computer
processing to obtain the integrated intensities. In the second approach,
integrated intensities are obtained on-line as scanning of the film proceeds.
Both strategies have their advantages and disadvantages and have been used
extensively to scan all types of diffraction films.
The equations relating optical density on the film to the total number of
photons that have caused it ((4.34) and (4.35)) show that there is a
maximum value of optical density that can be measured for a given type of
film and that the optical density is a linear function of the exposure only for
relatively low values of D. We will briefly discuss how these two limitations
are handled experimentally.
The problem of a limited dynamic range of the film, i.e. of a limited
optical density range that can be measured, can be solved by placing more
than one film in the cassette that records the X-ray diffraction pattern. Since
a substantial fraction of the radiation is absorbed by the X-ray film, those
reflections which are too strong to be measured on the first film will
normally fall within measuring range on the second or third. After the
integrated intensities have been calculated, all the films in the pack can be
scaled together.
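A simple way to scale two films of a pack is sketched below. It assumes a single multiplicative factor k between the films, estimated by unweighted least squares from reflections measured reliably on both, so that $k = \sum I_1 I_2 / \sum I_2^2$. The function name and the intensity values are purely illustrative; actual film-scaling programs may weight the observations and treat the film response differently.

```python
def film_scale_factor(pairs):
    """Least-squares k minimizing sum (I1 - k * I2)^2 over common reflections.

    pairs is a list of (intensity_on_film_1, intensity_on_film_2) tuples for
    reflections that fall within the measurable range on both films.
    """
    numerator = sum(i1 * i2 for i1, i2 in pairs)
    denominator = sum(i2 * i2 for _, i2 in pairs)
    return numerator / denominator


# Illustrative intensities only
common = [(300.0, 100.0), (600.0, 200.0), (150.0, 50.0)]
k = film_scale_factor(common)
# A reflection saturated on film 1 but measured as 900 on film 2:
print(k, k * 900.0)
```

Once k is known, a reflection too strong for the first film can be placed on the scale of that film from its measurement on the second.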
The non-linearity of the film response can be handled by constructing
experimentally a table that relates an optical density produced by the
scanner to a given exposure. Since one is interested in relative integrated
intensities, the table can be constructed by exposing the film to the same
X-ray beam during different times and measuring the optical density of the
spots under the conditions that will be used for data scanning. In this
experiment the exposure times are proportional to the number of photons
hitting the X-ray film.
An alternative approach proposed by Matthews et al.[49] is to assume a
parabolic relationship between integrated intensity and optical density.
In this second approach the non-linearity of the film response is handled in the
film scaling procedure.
Although the simplest approach to determine the integrated intensity is to
simply subtract the background from the area covering the spot, smaller
estimated standard deviations can be obtained using the technique called
profile fitting. In this method, a model profile for the reflection is constructed,
that is a model intensity distribution in two dimensions is determined by
averaging the measured profiles of a certain number of reflections on the
film. It is then assumed that all the reflections have this standard profile and
the measured data are fitted to it.
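The fitting step can be illustrated with a minimal sketch. It assumes a model profile normalized to unit sum and a constant, already determined background, and finds by unweighted least squares the scale K for which the background-corrected densities are best represented by K times the profile. The function name and the numbers are illustrative assumptions; practical programs usually apply weights and fit the background as well.

```python
def profile_fit_intensity(counts, profile, background):
    """Integrated intensity from a least-squares fit to a model profile.

    counts and profile are flattened lists over the same pixels; profile must
    sum to 1. Minimizing sum (c - b - K p)^2 gives K = sum p (c - b) / sum p^2,
    and the integrated intensity is K * sum(p) = K.
    """
    if abs(sum(profile) - 1.0) > 1e-9:
        raise ValueError("profile must be normalized to unit sum")
    numerator = sum(p * (c - background) for p, c in zip(profile, counts))
    denominator = sum(p * p for p in profile)
    return numerator / denominator


model = [0.05, 0.20, 0.50, 0.20, 0.05]
observed = [10.0 + 50.0 * p for p in model]   # noise-free illustrative spot
print(profile_fit_intensity(observed, model, 10.0))
```

Pixels near the peak, where the profile is large, dominate the estimate, which is why the method performs better than simple summation for weak reflections.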
Diffractometer geometry
A single-crystal diffractometer consists of an X-ray source, an X-ray
detector, a goniostat that orients the crystal so that a chosen X-ray
diffracted beam can be received by the detector, and a computer that
controls goniostat and detector movements and performs the mathematical
operations required to position the crystal and detector in the desired
orientations.
The detector is usually of the scintillation counter type in which X-rays
excite a fluorescent material and thus generate visible light which is
measured in an appropriate way. Xenon filled proportional counter detectors
are also used, particularly with Cu Kα radiation. This type of X-ray
detector will be further discussed on p. 281.
Both the molecular excitation of the fluorescent material and the gas
ionization in the counter detector are events that can be triggered by the
arrival of only one photon. It can be shown experimentally[53] that if an
X-ray beam is measured several times with a diffractometer detector, the
different intensity values obtained follow a Poisson distribution. For such a
distribution, the estimated standard deviation is
$$\sigma = N^{1/2}$$
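A short sketch shows how counting statistics propagate to a background-corrected intensity. It assumes that peak and background are counted for the same time and that both follow Poisson statistics, so that the variances simply add. The function name and the counts are illustrative assumptions.

```python
import math


def net_intensity(peak_counts, background_counts):
    """Net intensity and its estimated standard deviation.

    For Poisson-distributed counts the variance equals the count, so the
    variance of the difference P - B is P + B.
    """
    net = peak_counts - background_counts
    sigma = math.sqrt(peak_counts + background_counts)
    return net, sigma


net, sigma = net_intensity(1000, 100)
print(net, round(sigma, 2), round(sigma / net, 4))
```

The relative error falls only as the square root of the total count, so halving it requires roughly four times the counting time.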
X-rays and the rotation of the detector about an axis passing through the
crystal. The detector can only move on this plane and it forms an angle $2\theta$
with the incident beam as shown in Fig. 4.37. In the figure, point P is in
diffracting position because it is on the Ewald sphere and produces a
scattered beam that can be detected because it is on the equatorial plane. In
order to observe diffraction, all the reciprocal lattice nodes are brought in
turn to some point on the circle defined by the intersection of the sphere of
reflection and the equatorial plane. At the same time, the detector is moved
to the appropriate $2\theta$ angle so that it can receive the diffracted beam.
Fig. 4.37. The diffractometer equatorial geometry. The detector, rotating about the instrument main axis, defines a plane that contains the incident beam. Reflections will always be measured on this plane.
The most widely used type of goniostat uses the Eulerian cradle which
gives rise to the four-circle diffractometer shown schematically in Fig. 4.38.
The cradle is constituted by the $\chi$ circle which carries the goniometer head
with the crystal. The instrument has a main axis that is normal to the
equatorial plane and therefore to the incident and diffracted X-ray beams
and passes through the crystal. Rotation of the cradle about the main axis
defines the angle $\omega$; rotation about the spindle axis of the goniometer head
defines the angle $\varphi$, in exactly the same way as in the rotation camera. The
angle $\chi$ is defined by the spindle of the goniometer head and the main
instrument axis. The four circles of the diffractometer are thus the $\varphi$ and $\chi$
circles about which the crystal can be rotated, the $\omega$ circle, defined by the
rotation of the cradle, and the $2\theta$ circle described by the rotation of the
detector about the main axis.
Both the $2\theta$ and $\omega$ rotations are about the main axis but the first moves
the detector and the second the cradle. In a three-circle diffractometer the
degree of freedom that is missing is the rotation $\omega$; in other words the
cradle is fixed with its $\chi$ plane perpendicular to the incident X-ray beam.
The angle $2\theta$ is 0 when the detector is positioned in the direction of the
incident X-ray beam, $\chi$ is 0 if the spindle axis is parallel to the main axis, $\omega$
is 0 when the $\chi$ circle is perpendicular to the incident X-ray beam, and the
zero position of $\varphi$ is arbitrary and can be defined with respect to the crystal
orientation. If the angle $\chi$ is 0 the $\omega$ and $\varphi$ rotations coincide.
In general, only two rotations are required to bring a reciprocal lattice
node to the intersection of the Ewald sphere and the equatorial plane. If we
9 rotations that would cause collisions or would produce diffracted beams that
would be blocked by the $\chi$ circle in the conventional four-circle
diffractometer are still practicable.
and let
$$X_G = |r^*| \begin{pmatrix} \sin\varphi\cos\chi \\ \cos\varphi\cos\chi \\ -\sin\chi \end{pmatrix}. \qquad (4.38)$$
be calculated since they depend only on the angles and the magnitude of the
vectors $|r^*|$ (eqn (4.38)).
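The relation between the setting angles and the vector components can be sketched as follows. The component order and signs follow (4.38) as far as it can be read here, namely $|r^*|(\sin\varphi\cos\chi, \cos\varphi\cos\chi, -\sin\chi)$; they should be treated as an assumption, since conventions differ between instruments. The function names are illustrative, and $\chi$ is assumed to lie between −90° and 90°.

```python
import math


def xg_vector(r_mag, phi_deg, chi_deg):
    """Components of the scattering vector from |r*|, phi and chi."""
    phi, chi = math.radians(phi_deg), math.radians(chi_deg)
    return (r_mag * math.sin(phi) * math.cos(chi),
            r_mag * math.cos(phi) * math.cos(chi),
            -r_mag * math.sin(chi))


def angles_from_xg(vector):
    """Recover (|r*|, phi_deg, chi_deg) from the components (inverse mapping)."""
    x, y, z = vector
    r_mag = math.sqrt(x * x + y * y + z * z)
    chi = -math.asin(z / r_mag)
    phi = math.atan2(x, y)
    return r_mag, math.degrees(phi), math.degrees(chi)


v = xg_vector(0.3, 30.0, 20.0)
print(v)
print(angles_from_xg(v))
```

The round trip recovers the input angles, which is a convenient check when the measured angles of centred reflections are converted to vectors.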
Multiplying both sides of the first eqn (4.39) by H and introducing the
second eqn (4.40) we obtain
$$HA^* = HUA_G = X_G U^{-1} U A_G = X_G A_G. \qquad (4.41)$$
Now we can write for the three reflections, 1, 2, and 3
$$H_1 A^* = X_{G1} A_G$$
and finally,
$$U = H_M^{-1} X_M. \qquad (4.43)$$
Since $H_M$ and $X_M$ are known, one can calculate U, the orientation matrix
from (4.43).
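The last step can be sketched numerically. The example assumes that $H_M$ is the 3 × 3 matrix whose rows are the indices of three non-coplanar reflections and that $X_M$ holds the corresponding $X_G$ vectors as rows, so that $U = H_M^{-1} X_M$. The helper names and the test matrices are illustrative assumptions; no external library is used.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse3(m):
    """Inverse of a 3x3 matrix by the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("reflections are coplanar; choose another triple")
    adjugate = [[e * i - f * h, c * h - b * i, b * f - c * e],
                [f * g - d * i, a * i - c * g, c * d - a * f],
                [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[value / det for value in row] for row in adjugate]


def orientation_matrix(h_m, x_m):
    """U = H_M^-1 X_M for three indexed reflections."""
    return matmul3(inverse3(h_m), x_m)


# Illustrative check: build X_M from a known U, then recover it
u_true = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
h_m = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 2.0]]
x_m = matmul3(h_m, u_true)
print(orientation_matrix(h_m, x_m))
```

The determinant test makes explicit the requirement that the three reflections must not be coplanar, otherwise $H_M$ cannot be inverted.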
If the unit cell parameters are known the angles corresponding to only
two reflections are sufficient to calculate U.
In the more general case we have discussed, the orientation matrix yields
also the unit cell parameters of the crystal
$$A^* = UA_G$$
and
Area detectors
Due to its high precision, the diffractometer is the ideal data collection
instrument for small-molecule crystals but it suffers, as we said in the
previous section, from the drawback that it collects only one reflection at a
time. When data have to be collected from macromolecular crystals which
have very large unit cells and which therefore require the recording of many
reflections and which in addition have, in general, a more or less serious
radiation decay problem, the diffractometer is an inadequate data collection
device. On the other hand, the rotation method described earlier (p. 259),
that is with the reflections recorded on film and with a choice of the rotation
range Δφ made in order to minimize the number of films exposed and the
fraction of partially recorded reflections, has an intrinsically lower precision.
This is due to two main reasons; the first is that, as we have seen, film is a
poorer detector than diffractometer counters, the second is that during the
film exposure the signal is recorded by the film only during a fraction of the
total exposure time. If the reflecting range of a reflection of the crystal is Δ
and the rotation range selected is Δφ, this fraction is Δ/Δφ. Typically Δ is no
more than a few tenths of a degree whereas Δφ can be one degree or more.
In other words, in the rotation method the signal is recorded on the film
during a time equal to tΔ/Δφ, where t is the total exposure time, whereas the
background is recorded instead during the total time t.[57,58] One could, in
principle, improve this situation by simply reducing Δφ, so that Δ is
spanned by several rotation photographs, and then measure the integrated
intensities only on those films in which the reflection is found. There are
many reasons why this is not done when working with film but this is instead
perfectly feasible when the detection is done by the devices called area
detectors or X-ray position-sensitive detectors.[59]
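The argument can be made concrete with a short calculation. The sketch below is a minimal illustration with hypothetical function names and values; it assumes a constant background rate and a reflection that starts diffracting at the edge of a frame. It gives the fraction of the integrated exposure time during which the reflection actually diffracts, first for one wide rotation photograph and then for narrow frames of which only those containing the reflection are integrated.

```python
import math

def frames_spanned(reflecting_range, frame_width):
    """Number of consecutive frames over which one reflection is spread."""
    return max(1, math.ceil(reflecting_range / frame_width))

def integrated_signal_fraction(reflecting_range, frame_width):
    """Fraction of the integrated exposure time that carries signal.

    Only the frames that contain the reflection are integrated, so the
    background is accumulated over frames_spanned * frame_width degrees.
    """
    n = frames_spanned(reflecting_range, frame_width)
    return reflecting_range / (n * frame_width)

# Hypothetical reflecting range of 0.25 degrees.
wide = integrated_signal_fraction(0.25, 1.0)     # one 1-degree photograph
narrow = integrated_signal_fraction(0.25, 0.05)  # frames of 0.05 degrees
print(wide, narrow)
```

With a wide photograph the fraction is simply Δ/Δφ; with frames narrower than Δ it approaches unity, which is the improvement in signal to background described above.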
Area detectors were designed to combine the photon counting efficiency
of the diffractometer with the ability to record a large fraction of the
reflections which simultaneously cross the Ewald sphere, which is the main
advantage of the rotation method. Area detectors are thus probably the best
choice for the data collection of macromolecules, and although they have
not yet found many important applications in small-molecule crystallography
they will probably turn out to be very useful in cases when
radiation damage is a problem. However, it should be pointed out that the
devices that are currently available commercially have been optimized for
the detection of copper radiation and are, in most cases, less efficient in the
detection of the higher-energy molybdenum radiation which is very often
used in small-molecule work.[60]
Principles of operation of area detectors
The most common area detector types currently used in macromolecular
work, which include the commercially available
instruments, belong to two groups: multiwire proportional counters and
television area detectors. To these two groups there has been a recent
addition: the imaging plate, which is based on entirely different physical
principles and, because it is more recent, has not yet been as thoroughly
tested as the other two groups. We will briefly discuss the X-ray detection
mechanisms of these three types of detector.
Multiwire proportional counters are gas-filled chambers that contain three
parallel planar electrodes: an anode sandwiched between two cathodes.
The anode and at least one of the cathodes are arrays of parallel wires,
the two arrays being mutually perpendicular.[61,62] The gas filling the chamber is
usually a xenon-carbon dioxide mixture (see also Mokulskaya et al.[63]).
When an X-ray photon is absorbed by a Xe atom an inner shell electron
is emitted with a kinetic energy that is most of the energy of the absorbed
photon and which is sufficient to produce the ionization of many more
atoms. It has been calculated[62] that an X-ray photon of the Cu Kα
wavelength has enough energy to induce on average the formation of 320
ion pairs. The free electrons and the positive ions move in opposite
directions, the former in the direction of the anode where they produce an
ionization avalanche that multiplies the number of ion pairs by several
orders of magnitude.[62] These ion pairs move again in opposite directions under
the influence of the electric field and in so doing generate the electrical
signal which is measured in the detector and which is localized in the region
where the initial photon hit the counter. The function of the carbon dioxide
molecules is to absorb the ultraviolet photons which are generated in the
avalanche process and which could produce the photoemission of electrons
and thus start the whole process in another region of the counter.
In television area detectors the X-ray radiation is converted into visible
light by a fluorescent phosphor. These visible photons, after suitable
intensification, are detected by the photocathode of a standard high-
sensitivity television camera tube which is linked to a computer.[64-67] The
area detector phosphor, that is the fluorescent material that transforms
X-rays into visible light, is either polycrystalline gadolinium oxysulphide or
zinc sulphide and it produces between 250 and 500 visible photons per X-ray
photon.[66] In spite of this gain in photon numbers, the sensitivity of the
camera is not enough to measure them and so an increase in the signal is
required to make it detectable. This enhancement is achieved by an image
intensifier in which the photons produced by the first phosphor generate a
certain number of electrons from a photocathode which are then accelerated
and strike a second phosphor that is optically coupled to the camera. The
photon gain of these intensifiers is of the order of either 100 or 1000. The
television camera tube consists of a photoemissive cathode with an
intensifier that accelerates the electrons generated by the light producing a
charge image which is scanned by an electron beam used to measure the
signal. Since the photons arriving in 40 ms, which is the period necessary to
scan the image, are not enough to give good counting statistics, a certain
number of these images have to be added before the statistics become
satisfactory.
The imaging plate is essentially a storage phosphor. This means that the
X-ray photons produce on the plate a latent image that is then excited by
stimulation with a He-Ne laser producing light at 633 nm. The light thus
generated has a wavelength of 390 nm and is irradiated from the plate areas
which were previously hit by the X-ray photons. This phenomenon is called
photostimulated luminescence.[68] The radiation energy of the X-ray photons
can be stored by the phosphor for fairly long periods; it has been found
that the photostimulated luminescence is reduced to one half of its initial
value after approximately ten hours. The photostimulable material covering
the plate consists of BaFBr:Eu2+ crystals. When the plate is hit by X-ray photons
some of the Eu2+ ions are ionized to Eu3+ ions and the electrons that are
freed are trapped in Br vacancies introduced in the crystal that are called F
centres. Subsequent excitation of these centres by the laser liberates again
these electrons that return to the Eu3+ ions which thus become excited Eu2+
ions. An electronic transition in these ions generates the luminescence with
an intensity proportional to that of the original X-rays. The storage
phosphor is read by an image reader which releases the stored information
by means of the laser and collects the emitted radiation and channels it into
a photomultiplier tube which converts the radiation into an electrical signal.
The plate can be used repeatedly, since exposure to visible radiation
restores it to its initial condition. The two other elements of the detection
system are an image processor and an image writer that can be used to
imprint the plate image onto photographic film to produce a permanent
record. The characteristics of the image reader turn out to be crucial for
the performance of the entire system and the best precision could not be
obtained until an adequate read-out instrument was built.
The performance of the three types of detector mentioned here has been
analysed in several of the references given in this section and in particular
by Arndt. Multiwire proportional counters and television detectors are
already in fairly widespread use in many laboratories and the data they have
produced have been used to solve several new protein structures. The
special characteristic of the imaging plate is its very wide dynamic range
which appears to make it the ideal detector to be used with the very intense
synchrotron radiation.
where R_max is the maximum and R_min the minimum resolution of the
reflections that can be measured by the detector in the position corresponding
to the selected values of θ_c and D. The angles θ_max and θ_min are defined
in the figure and are a function of the detector size, the angle θ_c, and the
crystal to detector distance D. From the figure it can be seen that
$$ 2\theta_{\max} = \theta_c + \tan^{-1}(a/2D), \qquad 2\theta_{\min} = \theta_c - \tan^{-1}(a/2D), \tag{4.45} $$
where a is the detector width.
Equations (4.45) can be used to calculate the value of θ_c to be selected
for data collection to a particular resolution once D, the crystal to
detector distance, has been chosen.
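A minimal Python sketch of this calculation follows. It assumes that the resolution is converted into a Bragg angle through Bragg's law, d = λ/(2 sin θ), and that the detector angle is the θ_c of eqns (4.45). The function names and the numerical values (a 30 cm wide detector at 60 cm, Cu Kα radiation) are hypothetical.

```python
import math

def detector_angle(d_min, wavelength, width, distance):
    """Detector angle theta_c (degrees) that places resolution d_min at the detector edge."""
    two_theta_max = 2.0 * math.asin(wavelength / (2.0 * d_min))
    return math.degrees(two_theta_max - math.atan(width / (2.0 * distance)))

def resolution_range(theta_c_deg, wavelength, width, distance):
    """(highest, lowest) resolution, in the units of wavelength, from eqns (4.45)."""
    half_width = math.atan(width / (2.0 * distance))
    two_theta_max = math.radians(theta_c_deg) + half_width
    two_theta_min = math.radians(theta_c_deg) - half_width
    d_high = wavelength / (2.0 * math.sin(two_theta_max / 2.0))
    # If the detector still intercepts the direct beam there is no low-resolution cut-off.
    if two_theta_min <= 0.0:
        d_low = math.inf
    else:
        d_low = wavelength / (2.0 * math.sin(two_theta_min / 2.0))
    return d_high, d_low

# Hypothetical set-up: width and distance in cm, wavelength and resolution in angstroms.
theta_c = detector_angle(d_min=2.0, wavelength=1.5418, width=30.0, distance=60.0)
print(theta_c, resolution_range(theta_c, 1.5418, 30.0, 60.0))
```

The second function is the inverse check: for the θ_c returned by the first, the highest resolution recorded at the detector edge is the one requested.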
D must be selected according to the characteristics of the detector, the
wavelength of the radiation, and the unit cell parameters of the crystal. The
first two parameters do not normally change for different experiments
performed at a given installation and therefore D can usually be calculated
with a very simple formula in which the only variable is the maximum unit
cell parameter of the crystal. For example in Howard et al.,[73] the crystal to
detector distance for one type of multiwire proportional counter is
calculated in centimetres, for Cu Kα radiation, by the following equation:
where a_max is the longest unit cell parameter of the crystal measured in
ångströms. The equivalent equation for another type of multiwire proportional
counter[74] is:
Thus, data from the same crystal would have to be collected at very
different Ds by the two detectors which, although based on the same
general principles, differ in their construction details. Once D has been
determined, knowing the detector width a, one can calculate the θ_c required
to collect data to the resolution desired.
Each of the electronic pictures generated by the detector is called a frame
and the individual elements of the picture are called pixels. The reflection
size on the picture, the space between reflections, and, in general, the
spatial resolution of the detector are expressed by the number and size of
the pixels.
The camera or the goniostat and the detector are controlled by a
computer which is, in general, connected to another computer which
receives from it the frames that are then used to calculate the integrated
intensities (see for example Blum et al.).
Two methods have been proposed to make the reciprocal lattice nodes of
the crystal cross the Ewald sphere: the rotation (oscillation) method and the
stationary picture method. In both cases the detector does not move while
data collection proceeds. In the first method the crystal is rotated about an
axis which is often the vertical axis in pretty much the same way as when
film is used. Many of the considerations discussed earlier (p. 259) are thus
applicable to this technique in which data are collected in a series of
consecutive rotation (oscillation) frames. In addition to the detection
method there are basically two fundamental differences between the two
techniques. The first is that the detector is not always perpendicular to the
X-ray beam but can form with it an angle θ_c as pointed out before.
Obviously this fact has to be taken into account in the prediction of the
detector coordinates of the reflections collected. The second major
difference is in the choice of Δφ, the rotation range, which in this case is
selected so that each reflection appears in several frames.[73,75] As pointed
out before this strategy improves the signal to noise ratio since reflections
are integrated only in the frames in which they appear.
In the electronic stationary picture method,[76,77] the crystal is also rotated
about an axis but the frame is recorded with the crystal held stationary. The
reflection intensity is thus extracted from a series of still electronic pictures
at slightly different values of φ. The Δφ between frames is of the order of
0.06° and subframes are sampled at distances of 0.01° in order to better scan
reflections that in some cases can be very sharp.[77] This second data
collection strategy is less widely used than the rotation (oscillation) method.
It is worth noticing that these strategies of data collection are the only
ones described so far that truly sample the entire volume of a reciprocal
lattice node. With film methods what one sees is a projection of the entire
volume on to the film plane, whereas with the diffractometer one looks at a
reflection profile on a single plane that can be chosen to cut the node
volume in different ways as seen earlier (p. 278). Thus, area detector data
are the only ones that can be profile fitted in three dimensions, a possibility
that ought to further improve their quality.
In most cases the crystal is more or less accurately aligned before data
collection can begin so that x and y, the coordinates of a reflection on the
detector, and φ, the rotation angle at which the node crosses the Ewald
sphere, can be predicted for all the reflections to be collected[76] (see also p.
265). However, a full data set collected with an area detector contains a
very large fraction, if not all, of the reciprocal lattice nodes to a given
resolution and, since the crystal orientation can be obtained automatically
by efficient computer programs,[78] it is also possible not to orient the crystal
before data collection begins and find the orientation afterwards, during
frame processing.[73] A strategy that can be used to cover a section of
reciprocal space with an area detector, which is obviously applicable when
the crystal orientation is known before data collection begins, is discussed
by Xuong et al.[74]
are the cases in which the preferred orientations and other properties of the
crystallites need to be studied.
Another important feature which distinguishes powder diffraction is that
the intensity of the diffracted radiation on the cone surfaces can arise from
the contributions of more than one single-crystal reciprocal lattice node.
Figure 4.48 shows in projection that this can happen both as a result of
chance and crystal symmetry. A powder diffraction maximum, measured
along any direction on the cone surface, is thus said to have a certain
multiplicity that will be higher the higher the symmetry of the crystallites
under examination.
When the diffraction experiment is performed with monochromatic
radiation, that is when there is only a single Ewald sphere, there is only one
diffraction cone corresponding to each sphere of a given radius r*_1 in
reciprocal space. In other words, the angle 2θ_1 corresponds unambiguously
to the sphere of radius r*_1, 2θ_2 to that of radius r*_2, etc., and we have only
one possibility if we want to measure the diffraction that arises from the
sphere of radius r*_1: to have some means of detecting radiation at an angle
2θ_1 with the incident X-ray beam. It is, however, possible to shine on the
specimen X-rays with a wavelength variable within a certain range. The
experiment is exactly equivalent to the Laue method used for single crystals.
In this case, there will be many Ewald spheres, one for each wavelength,
and each will generate a diffraction cone with a given sphere of radius r*_1.
Figure 4.49 shows the Ewald spheres corresponding to the two values
limiting the wavelength interval of the radiation used. In the figure it can be
seen that the diffraction due to the sphere of radius r*_1 can be measured at
many different values of the angle 2θ_1. For different acceptable choices of
2θ_1 there will be diffraction produced by radiation of different wavelengths.
The methods which use polychromatic incident radiation and analyse the
energy or wavelength of the scattered radiation at a fixed scattering angle
are called energy dispersive methods in powder diffraction. They obviously
require a detector that will discriminate the energy of the arriving scattered
radiation and have some advantages that make them the best choice in
certain situations.[79] Just like the Laue method they are best practised with
a synchrotron source which can furnish, as we have seen, radiation of
adequate intensity in a rather extended energy interval. For the remainder
of this chapter we will assume that we are dealing with monochromatic
X-rays. The methods which use them are the most widely used in
standard laboratories. From the rich literature that covers the diffraction of
polycrystalline materials in depth we recommend two books.[80,81]
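The energy dispersive geometry described above can be illustrated with a few lines of Python. The sketch applies Bragg's law, λ = 2d sin θ, at a fixed scattering angle and converts the wavelength into a photon energy with E = hc/λ, taking hc = 12.39842 keV Å. The function names and the numerical values are hypothetical.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # product hc expressed in keV * angstrom

def diffracted_wavelength(d_spacing, two_theta_deg):
    """Wavelength (angstroms) diffracted by spacing d at the fixed angle 2-theta."""
    return 2.0 * d_spacing * math.sin(math.radians(two_theta_deg) / 2.0)

def diffracted_energy(d_spacing, two_theta_deg):
    """Photon energy (keV) recorded for spacing d at the fixed angle 2-theta."""
    return HC_KEV_ANGSTROM / diffracted_wavelength(d_spacing, two_theta_deg)

# Hypothetical spacings observed with a detector fixed at 2-theta = 20 degrees.
for d in (3.0, 2.0, 1.5):
    print(d, diffracted_wavelength(d, 20.0), diffracted_energy(d, 20.0))
```

At a fixed angle each Bragg spacing selects its own wavelength from the polychromatic beam, so the pattern is obtained as a spectrum of photon energies instead of a set of angles.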
pattern shows a series of arches resulting from the projection of the
diffraction cones on to the cylindrical surface. The big advantage of the
camera is that it records the entire pattern generated at all possible values of
28; its main disadvantage is that it does not record the entire projection of
the diffraction cone but only a segment. Since, as we have seen, in most
cases the diffraction pattern is isotropic, and one is therefore only interested
in the position of the arches and their relative intensities, this limitation is
not very severe.
In addition to the cylinder that holds the film strip in place, the main body
of the camera has a collimator that serves to define the incident X-ray beam
and a beam trap that stops it after it has travelled through the specimen.
Although one can place the film so that the cut in the cylindrical surface is
made to coincide with the collimator or beam trap, punching a hole for the
other, and both ways of mounting the film have been used, a third
alternative is usually preferred. In the so-called Straumanis method of film
mounting two holes are punched in the film strip positioned at about one
quarter and three quarters of the total film length. One of the two holes is
then used for the collimator and the other for the beam trap. The advantage
of the Straumanis method of film mounting is that it provides accurate
measurements for the positions of the arches that will then be translated
into Bragg spacings, d_H, for both high and low values of 2θ. As seen in Fig.
4.51(b), the arches centred on one of the two holes punched are present as
doublets. They correspond to the Kα₁ and Kα₂ lines of conventional
generators which are normally not resolved by X-ray filters but are clearly
separated after diffraction by powder samples at high 2θ values. That the
doublets correspond to high 2θ values can be seen by differentiating Bragg's
law:
$$ 2d_H \sin\theta = \lambda, $$
$$ 2d_H \cos\theta\,\mathrm{d}\theta = \mathrm{d}\lambda, \qquad \frac{\lambda}{\sin\theta}\cos\theta\,\Delta\theta = \Delta\lambda, $$
and
$$ \Delta\theta = \lambda^{-1}\tan\theta\,\Delta\lambda. \tag{4.46} $$
In the case of Cu Kα radiation, the doublet is separated by 0.0038 Å; if we
take λ = 1.5418 Å, Δθ = 0.0249° for θ = 10° and Δθ = 0.8009° for θ = 80°
instead. Thus, the presence of double lines centred on one of the punched
holes serves to unambiguously identify it as that corresponding to the
collimator, making it unnecessary to mark the strip. It is the diffraction
pattern recorded that tells us which hole corresponds to θ = 90°. An
important advantage of this method of film mounting is that the positions
θ = 0° and θ = 90° can be very precisely determined by taking the averages
of the arch positions corresponding to several diffraction cones on the film.
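The doublet separations quoted above follow directly from eqn (4.46). The short Python sketch below evaluates it; the function name is hypothetical and the result of eqn (4.46), which is in radians, is converted to degrees.

```python
import math

def doublet_separation(theta_deg, wavelength, delta_lambda):
    """Angular separation (degrees) of a doublet from eqn (4.46)."""
    delta_theta = math.tan(math.radians(theta_deg)) * delta_lambda / wavelength
    return math.degrees(delta_theta)

# Cu K-alpha doublet: wavelength 1.5418 A, separation 0.0038 A.
for theta in (10.0, 80.0):
    print(theta, round(doublet_separation(theta, 1.5418, 0.0038), 4))
```

Because of the tan θ factor the separation grows by more than a factor of thirty between θ = 10° and θ = 80°, which is why the doublets are resolved only at high angles.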
From the position of the symmetrical arches, one can easily calculate the
corresponding θ values since, as seen in Fig. 4.52, if S is the distance
between the arches due to a diffraction cone and R is the radius of the
cylinder
$$ S/2\pi R = 4\theta/360^\circ $$
for the arches centred on the beam trap (θ = 0°) and
$$ S'/2\pi R = (360^\circ - 4\theta)/360^\circ $$
for those centred on the collimator (θ = 90°).
Fig. 4.52. Projection of the drum of the Debye-Scherrer camera on to its axis. The specimen is
in the centre of the circle, S and S' are the distances between the symmetrical arches
corresponding to one diffraction cone, and R is the radius of the camera.
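The two relations can be turned into a small Python sketch that converts a measured distance between symmetrical arches into θ and then into a Bragg spacing through Bragg's law. The function names and the camera radius used in the example are hypothetical; S (or S') and R must be expressed in the same units.

```python
import math

def theta_from_arches(s, radius, centre="beam trap"):
    """Bragg angle (degrees) from the distance s between symmetrical arches."""
    fraction = s / (2.0 * math.pi * radius)
    if centre == "beam trap":      # S / 2 pi R = 4 theta / 360
        return 90.0 * fraction
    if centre == "collimator":     # S' / 2 pi R = (360 - 4 theta) / 360
        return 90.0 * (1.0 - fraction)
    raise ValueError("centre must be 'beam trap' or 'collimator'")

def bragg_spacing(theta_deg, wavelength):
    """Bragg spacing d_H from Bragg's law, in the units of the wavelength."""
    return wavelength / (2.0 * math.sin(math.radians(theta_deg)))

# Hypothetical camera whose circumference is 360 mm, so that 1 mm of film
# corresponds to 1 degree of 4-theta.
R = 180.0 / math.pi
theta = theta_from_arches(40.0, R)
print(theta, bragg_spacing(theta, 1.5418))
```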
Fig. 4.58. Reproduction of a card of the J.C.P.D.S. powder diffraction data file.
[Card shown: Barium Aluminum Fluoride; Cu Kα radiation, λ = 1.5418 Å, Ni filter; orthorhombic, a0 = 5.156, b0 = 7.575, c0 = 19.64 Å.]
Joint Committee for Powder Diffraction Standards (JCPDS), International
Centre for Diffraction Data.
Figure 4.58 is a reproduction of a card in the PDF. The information
contained in the card should be readily interpretable. Notice that the
relative intensities are expressed as percentages of the strongest line which
is arbitrarily assigned an intensity equal to 100. Not all of the cards contain
all of the information shown in the figure. In particular it may not always be
possible to unambiguously index the lines present in a pattern and therefore
the Miller indices corresponding to a given Bragg spacing may not be
available in the file.
Using the information contained in the PDF it is often possible to match
the diffraction pattern of an unknown to that of one of the known
substances present in the file. This task can be accomplished using both
manual[95] and computer methods[96,97] with a current tendency in favour of
the latter.
The simultaneous identification of more than one component in a sample
is also possible using the method described above but clearly with a degree
of difficulty that increases with the complexity of the diffraction pattern
generated.
The quantitative analysis of the different crystalline phases present in an
unknown is another important application of powder diffraction. Due to
absorption effects in these specimens, a direct proportionality
between the intensities measured and the amount of a given
crystalline phase present in the sample cannot be assumed. Alexander and
Klug have derived the equation that relates the intensity of a given
diffraction line due to a component to its weight fraction in the sample for
the case of a flat polycrystalline specimen. If the sample is a uniform
mixture of n components and extinction and microabsorption effects can be
neglected, it can be shown that[98]
If I⁰_H1 is the intensity of the same line for the pure component 1
and
so in this case I_H1 is not a linear function of x_1. Plots of the ratio I_H1/I⁰_H1 as a
function of x_1 can then be either calculated using the tabulated values of μ_1
and μ_2 or determined experimentally from the intensities measured from
samples of known composition. These curves can then be used to determine
x_1 for an unknown specimen.
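A calibration curve of this kind can be sketched in Python as follows. Since the displayed equations are not reproduced in this passage, the sketch assumes the usual textbook form of the Klug and Alexander relation for a two-component mixture, I_H1/I⁰_H1 = x_1 μ_1/[x_1(μ_1 − μ_2) + μ_2], with μ_1 and μ_2 taken as the mass absorption coefficients of the two components. The function names and the values of the coefficients are hypothetical and are not tabulated data.

```python
def intensity_ratio(x1, mu1, mu2):
    """Ratio of the line intensity in the mixture to that of pure component 1."""
    return x1 * mu1 / (x1 * (mu1 - mu2) + mu2)

def weight_fraction(ratio, mu1, mu2):
    """Weight fraction x1 recovered from a measured intensity ratio."""
    return ratio * mu2 / (mu1 - ratio * (mu1 - mu2))

# Hypothetical mass absorption coefficients (cm^2/g) of the two components.
MU_1, MU_2 = 35.0, 90.0
for x1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x1, round(intensity_ratio(x1, MU_1, MU_2), 4))
```

When the two coefficients are equal the ratio reduces to x_1 and the curve is a straight line; otherwise it is curved, which is why a calibration plot or an internal standard is needed.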
In the general case in which μ_i is not equal to μ_M and there are more than
two components, the determination of x_1 requires the addition of an
internal standard. For this case it can be shown that
where x_i and x_f are the weight fractions of the i component and the flushing
agent, I_i and I_f their diffracted intensities and the two constants k_i and k_f
their intensities relative to that of a reference substance, normally corundum
(α-Al₂O₃). If the flushing agent is chosen to be corundum, this
equation reduces to
and differentiating
$$ 2\,\mathrm{d}d_H = -\lambda \csc\theta \cot\theta\,\mathrm{d}\theta $$
whence
Data reduction
In Chapter 3 we saw that in the relationship between integrated intensity
and the square of the structure factor amplitude there are several factors
that vary from reflection to reflection. In order to calculate the relative
structure factor amplitudes to be used in the solution of the crystal
structures as described in Chapter 5 one needs first to take these effects into
account. The procedure followed to extract relative structure factor
amplitudes from raw integrated intensities is called data reduction. In data
reduction the different reflection dependent parameters present in eqn
(3.41) are taken into account by multiplying the relative intensities by
suitable correction factors. Here we will neglect E, the extinction coefficient
which was discussed on pp. 97 and 164 and will concentrate on L, P, and T,
the other three factors. The corrections applied are called, as we have seen,
Lorentz, polarization, and absorption corrections respectively. In addition,
we will also discuss the problem of radiation damage of the crystals which is
usually handled before the other corrections are applied. Another problem
that is often encountered, especially in the case of macromolecular crystal
data sets, is that of scaling partial data sets originating from different
crystals which when merged will produce the final total set of relative
integrated intensities. We discuss this problem briefly at the end of the
chapter.
Lorentz correction
We have seen that diffraction arises whenever reciprocal lattice nodes, that
always have a non-negligible volume, cross the sphere of reflection. If a
node is in diffracting position for a longer time, the intensity of the
corresponding reflection will be proportionally higher. This factor would not
be important if the method used to record the integrated intensities ensured
that every reciprocal lattice node were in a diffracting position for exactly
the same time, as it would affect every reflection in the same way and in the
end it would simply scale all the intensities by the same factor. This,
however, is not the case. Depending on the method used to record the
reflection intensity and on the position of the reciprocal lattice node, the
times required for different nodes to cross the Ewald sphere are different.
The Lorentz correction simply takes this factor into account.
The time a node is in diffracting position is dependent on two factors: the
position of the node and the velocity with which it sweeps through the
sphere of reflection. We will derive the form of the Lorentz factor in a very
simple case and then show the form it takes in a more complicated situation.
Figure 4.59 shows the Ewald sphere for a diffraction experiment in which
the crystal is rotated about an axis which is normal to the plane defined by
the incident and the diffracted beams. This is for example the case of a
zero-level rotation or Weissenberg photograph or of the equatorial reflec-
tions measured with a diffractometer.
The crystal, and therefore the reciprocal lattice, is assumed to be rotated
at a constant angular velocity ω; if V_n is the linear velocity component of the
reciprocal lattice node along the radius of the sphere of reflection, the
Lorentz factor can be defined as follows
which is indeed proportional to the time during which diffraction takes place
for a given reciprocal lattice node.
The linear velocity of the point P is
and the point P is not on a zero-level layer but rather on the nth layer with a
diffraction cone with a semiangle equal to 90° − ν it can be shown that[105,106]
$$ L = (\cos\mu \cos\nu \sin\Upsilon)^{-1} \tag{4.52} $$
where Υ is the projection on to the zero layer of the angle 2θ between the
incident and the diffracted beam.
If the rotation axis is normal to the X-ray beam and the reflection is on a
zero level, μ = 0 and ν = 0. In this situation the projection onto the zero
level of 2θ, i.e. Υ, is identical to 2θ and the expression for L given by eqn
(4.52) reduces to eqn (4.51).
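The reduction can be verified numerically with the Python sketch below. Eqn (4.51) is not reproduced in this passage, so the sketch assumes its standard zero-level form, L = 1/sin 2θ; the function names are hypothetical and all angles are given in degrees.

```python
import math

def lorentz_factor(mu_deg, nu_deg, upsilon_deg):
    """General Lorentz factor of eqn (4.52)."""
    mu, nu, upsilon = (math.radians(a) for a in (mu_deg, nu_deg, upsilon_deg))
    return 1.0 / (math.cos(mu) * math.cos(nu) * math.sin(upsilon))

def lorentz_zero_level(theta_deg):
    """Zero-level Lorentz factor, assumed here to be 1 / sin(2 theta)."""
    return 1.0 / math.sin(math.radians(2.0 * theta_deg))

# For a zero-level reflection mu = nu = 0 and upsilon equals 2 theta.
theta = 15.0
print(lorentz_factor(0.0, 0.0, 2.0 * theta), lorentz_zero_level(theta))
```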
~ i ~ s odiscusses ~ of the L factor for the different experimen-
n [ ~ ~the~ form
tal arrangements which are used in data collection and gives tables of the
values of L as a function of the parameters which can be selected.
Polarization correction
The polarization correction depends on the state of polarization of the
incident X-ray beam and on the scattering angle of the diffracted beam. In
Chapter 3 we have seen that when a totally non-polarized beam is diffracted
by a crystal, the diffracted intensity is affected by a factor, called the
polarization factor, which in this simple case was shown to be equal to
where θ is the Bragg angle of the reflection considered and the diffracting
crystal was tacitly assumed to be ideally mosaic. This simple expression for
the polarization correction can be applied whenever the incident X-rays are
not polarized, that is when the radiation is produced by a conventional
source and monochromatized using an appropriate filter. Notice that in
theory this factor can have values ranging between 1.0 and 0.5 depending on
the scattering angle, although in practice this variation is less substantial.
For a data set collected with Cu Kα radiation between 50 Å and 2 Å
resolution it varies between P_50 = 0.9995 and P_2 = 0.7470.
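The two values quoted above can be reproduced with a few lines of Python. The displayed expression for the polarization factor is not reproduced in this passage, so the sketch assumes the standard form for an unpolarized incident beam, P = (1 + cos² 2θ)/2, with θ obtained from the resolution through Bragg's law. The function name is hypothetical.

```python
def polarization_factor(d_spacing, wavelength):
    """P = (1 + cos^2 2theta) / 2 for an unpolarized incident beam."""
    sin_theta = wavelength / (2.0 * d_spacing)   # Bragg's law
    cos_2theta = 1.0 - 2.0 * sin_theta ** 2      # cos 2theta = 1 - 2 sin^2 theta
    return (1.0 + cos_2theta ** 2) / 2.0

# Cu K-alpha radiation, resolution limits of 50 A and 2 A.
for d in (50.0, 2.0):
    print(d, round(polarization_factor(d, 1.5418), 4))
```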
The more general form of the polarization correction for an incident
beam monochromatized with a crystal is[17,108]
where θ is the Bragg angle of the reflection produced by the specimen and
θ_M the angle of the reflection of the monochromator crystal which was used
to select the wavelength. The angle ρ is the angle between the projection of
the normal to the reflecting plane on to a plane perpendicular to the
incident monochromatized X-rays and the plane of incidence.[108] When the
original X-ray beam, the monochromated beam, and the scattered beam all
lie in the same plane this angle is equal to 0 and the polarization factor takes
the simpler form
Here EL is the amplitude of the optical field in the plane of incidence of the
X-rays and Ek is the component perpendicular to it. In this expression for
the polarization correction, the problem is to obtain an accurate value for
the parameter c' which depends on the set-up of the facility used. This can
be done in two ways; one is by measuring the polarization ratio of the beam
that will strike the specimen. The second method is by calculating it
theoretically on the basis of the characteristics of the source and of the
crystal used to monochromatize the radiation.
The polarization correction is frequently grouped with the Lorentz
correction in a single factor, the LP correction.
Absorption corrections
As pointed out in Chapter 3, the transmission factor T is related to the
absorption of the incident and diffracted X-ray beams by the crystal. We
have briefly discussed the absorption of X-rays on p. 241 where it was
pointed out that according to Beer's law, absorption reduces the intensity of
an X-ray beam travelling through a given material by an amount which
depends on the material and the length of the path travelled by the
radiation in it. Figure 4.60 shows that, for a given scattered beam, this path
can be very different for different points in the crystal. The path lengths are
dependent, as can be seen in the picture for points O, R, and T, on the
location of the point scattering the X-rays, and on the incident and
scattering angle, that is on the reflection considered.
The intensity of the diffracted X-rays is thus reduced, with respect to what
it would be without absorption by the factor
which is valid for every point in the crystal. Here x is the total path length
and μ is, as we have seen, the linear absorption coefficient, in this case, of
the crystal.
Fig. 4.60. For a given scattering angle 2θ, the path of the incident and scattered beams in the
crystal depends on the position of the scattering point within the crystal.
Equation (4.55) can be used to calculate a very rough estimate of the
optimum crystal size for a given compound of linear absorption coefficient
μ. In eqn (3.41), the constant K2 included Ω, the crystal volume, that we
Experimental methods in X-ray crystallography ( 305
and x = 3 / p .
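The step leading to $x = 3/\mu$ is incomplete in the text as it stands. The sketch below shows one reading that reproduces the result. It assumes (this is an assumption, not a statement recovered from the text) that the diffracting volume grows as $x^3$ while the beam is attenuated by $\exp(-\mu x)$ over a path of about $x$, so that the intensity is proportional to $x^3 \exp(-\mu x)$. The value of `mu` is hypothetical.

```python
import math


def relative_intensity(x, mu):
    """Relative diffracted intensity for a crystal of linear size x.

    Assumption: scattering volume ~ x**3, attenuation ~ exp(-mu * x).
    """
    return x ** 3 * math.exp(-mu * x)


def optimum_size(mu):
    """Setting d/dx [x**3 * exp(-mu * x)] = 0 gives x = 3 / mu."""
    return 3.0 / mu


mu = 1.5  # hypothetical linear absorption coefficient, in mm^-1
x_opt = optimum_size(mu)

# Coarse numerical scan as a sanity check on the analytical result.
grid = [0.01 * k for k in range(1, 1001)]  # 0.01 ... 10.00 mm
x_scan = max(grid, key=lambda x: relative_intensity(x, mu))
```

The scan and the closed-form expression should agree to within the grid step; the estimate is only as rough as the assumed $x^3$ scaling.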
In order to get T, the transmission factor for an entire crystal, one simply
has to integrate eqn (4.55) over the total crystal volume. If, instead of
writing $x$, we decompose the path into $p$, the incident or primary beam path,
and $q$, the diffracted or secondary beam path, the transmission factor T can
be written as follows:
$$T = \frac{1}{V}\int_V \exp[-\mu(p + q)]\,\mathrm{d}V. \qquad (4.56)$$
The linear absorption coefficient of the crystal can be obtained as
$\mu = \rho \sum_i g_i \mu_i$,
where $g_i$ is the mass fraction of element i present in the unit cell, $\mu_i$ is its
mass absorption coefficient, and $\rho$ is the crystal density. Recall that $\mu_i$ is a
function of the atomic number of the element and of the wavelength of the
radiation used: it is smaller for lower atomic numbers and for shorter
wavelengths. This explains why absorption corrections become more
important for heavy-element crystals and for radiation of longer
wavelengths. Sometimes all it takes is a change from copper to molybdenum
radiation to sufficiently reduce the absorption problem in a given crystal
structure determination. In any case it is always instructive to calculate the
value of $\mu$ for the crystal being examined in order to get an indication of the
severity of the absorption problem.
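As a quick illustration of this calculation, the sketch below evaluates $\mu = \rho \sum_i g_i \mu_i$ and the attenuation $\exp(-\mu x)$ along a single path. The element labels, mass fractions, mass absorption coefficients, and density are placeholder numbers chosen for easy arithmetic, not tabulated data; real values should be taken from tables for the wavelength in use.

```python
import math


def linear_absorption_coefficient(mass_fractions, mass_abs_coeffs, density):
    """Return mu = rho * sum_i g_i * mu_i (cm^-1 for cm^2/g and g/cm^3)."""
    if abs(sum(mass_fractions.values()) - 1.0) > 1e-6:
        raise ValueError("mass fractions must sum to 1")
    return density * sum(
        g * mass_abs_coeffs[element] for element, g in mass_fractions.items()
    )


# Hypothetical two-element compound (placeholder values only).
g = {"A": 0.4, "B": 0.6}           # mass fractions
mu_mass = {"A": 5.0, "B": 10.0}    # mass absorption coefficients, cm^2/g
rho = 1.5                          # crystal density, g/cm^3

mu = linear_absorption_coefficient(g, mu_mass, rho)  # cm^-1
attenuation = math.exp(-mu * 0.02)  # single path of 0.02 cm (0.2 mm)
```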
An analytical evaluation of T according to eqn (4.56) would be, in theory,
the ideal method to use in order to take care of the absorption correction.
The result would depend on the beam path in the crystal which is a function
of the reflection considered, i.e. one would get a different value of T for
every reflection measured. The problem is that the integral in eqn (4.56)
cannot be calculated analytically even in the case of the simplest crystal
shapes. Numerical evaluations have been obtained in the case of spheres or
cylinders; they can be found in Lipson, where T is given as a function of
$\mu R$ and $\theta$, R being the sphere or cylinder radius and $\theta$ the scattering angle.
Spheres or cylinders are, however, not very good approximations for the
shape of most real crystals so that if one wants to assume that the specimen
under study is a cylinder or a sphere, it is usually necessary to grind it into
that shape. Mechanical devices exist that can be used to accomplish this.[114]
This approach is, however, not always possible since there are many crystals
that will not survive the very harsh treatment required to shape them into
an ideal form.
An analytical method that can be used to calculate T for any polyhedral
crystal was proposed by de Meulenaer and Tompa. This method divides
an arbitrary crystal volume into smaller polyhedra that are ultimately
subdivided into tetrahedra. The total transmission factor is then calculated
as
most serious, if not the most serious, source of error in the experimental
determination of relative integrated intensities.
that is it is the inverse of the factor by which the intensity measured at time t
has to be multiplied to yield the intensity corrected at time zero, the value
that should be used to extract the relative structure factor amplitudes. This
correction factor is a function, not only of the time after data collection
started but also of the scattering angle of the reflection $\theta$.
The value of R can be estimated without resorting to any particular model
for the radiation damage process, for example by fitting the intensities of
each monitored reflection to a polynomial of the form[125]
where t is the exposure time and n is a number that ranges from 1 to 7. The
discrepancy index
where D is a disorder parameter, $S = \sin\theta/\lambda$, and the quantities $A_1(t)$ and
Relative scaling
If the final total data set of relative integrated intensities has not been
collected under strictly constant conditions but results from merging a
certain number of subsets, each measured under more or less different
conditions, these subsets must be scaled by applying the appropriate
relative scale factors before they can be merged into a single set.
The subsets may be derived from different crystals, if radiation damage has
made it necessary to stop data collection at a certain stage before the set
was completed, or they may also come from the same crystal. For example,
data collection on films with the rotation method requires scaling, as there is
no way to ensure that all the data collection parameters (film developing
and fixing times, for instance) will be held strictly constant throughout
the data collection process.
Relative scaling of partial data subsets is done on the basis of the
reflections which these subsets have in common. In order to determine the
relative scale factors to be applied to the subsets, it is first necessary to
define the conditions on the data that the scale factors ought to satisfy.
Among the different criteria proposed, that of Hamilton et al.[133] is
currently the most widely used. In this method one defines
where $I_{Hi}$ is the ith observation of reflection H, $l$ is the subset in which the
ith reflection is present, $K_l$ the relative scale factor to be applied and $W_{Hi}$ the
weight of the ith observation of reflection H.
It can be shown that this condition can be stated in the equivalent
the relative scale factors are chosen so that the quantity $\psi$ is a minimum and
therefore the condition from which the best value for the intensity of
reflection H, $I_H$, is found is
and $I_H$ is
Since in this formulation the residual is not linear in the $G_l$s, the best values
for these parameters are determined using the iterative non-linear
least-squares procedure described in Chapter 2 (p. 94), that is for each iteration
$\varphi_{Hi}$ is approximated by
Given that the $\Delta G_l$s are not independent, one of them is arbitrarily set
equal to zero, that is one $G_l$ is made constant, and then the other $G_l$s are
corrected until convergence is achieved.
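Since the equations of this passage did not survive, the following is only a hedged sketch of the idea, not the procedure of Hamilton et al. It assumes the model $I_{Hi} \approx G_l I_H$ with $G_l = 1/K_l$, and it swaps the shift-based non-linear least squares described above for a simpler alternating scheme: for fixed $G_l$ the best $I_H$ is a weighted mean, and for fixed $I_H$ the best $G_l$ follows from a linear fit. One scale factor is held at 1, as in the text. The function names and the data are hypothetical.

```python
def best_intensities(obs, G):
    """Weighted least-squares estimate of I_H for fixed scale factors G_l."""
    num, den = {}, {}
    for h, l, intensity, w in obs:
        num[h] = num.get(h, 0.0) + w * G[l] * intensity
        den[h] = den.get(h, 0.0) + w * G[l] ** 2
    return {h: num[h] / den[h] for h in num}


def scale_subsets(obs, n_subsets, n_cycles=50):
    """Alternate between updating I_H and G_l, keeping G_0 fixed at 1.

    obs is a list of (reflection, subset index, intensity, weight) tuples;
    the assumed model is I_obs ~ G_l * I_H.
    """
    G = [1.0] * n_subsets
    for _ in range(n_cycles):
        I_best = best_intensities(obs, G)
        num = [0.0] * n_subsets
        den = [0.0] * n_subsets
        for h, l, intensity, w in obs:
            num[l] += w * intensity * I_best[h]
            den[l] += w * I_best[h] ** 2
        G = [n / d for n, d in zip(num, den)]
        G = [g / G[0] for g in G]  # remove the arbitrary overall scale
    return G, best_intensities(obs, G)


# Hypothetical noise-free data: subset 1 is on twice the scale of subset 0,
# and only reflections "b" and "c" are common to both subsets.
obs = [
    ("a", 0, 100.0, 1.0), ("b", 0, 200.0, 1.0), ("c", 0, 50.0, 1.0),
    ("b", 1, 400.0, 1.0), ("c", 1, 100.0, 1.0), ("d", 1, 800.0, 1.0),
]
G, I_merged = scale_subsets(obs, 2)
```

The factor to apply to subset l is then $K_l = 1/G_l$. With real, noisy data the weights matter and convergence should be monitored rather than assumed.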
An alternative way of solving the equations of Hamilton et al. was
proposed by Fox and Holmes. In their formulation one sets in turn all
the derivatives of $\psi$ with respect to the $G_l$s to zero, i.e.
$$\partial\psi/\partial G_l = 0 \quad \text{for } l = 1, \ldots, L$$
and then approximates $\psi$ by the Taylor expansion:
In the special simple case in which the weights can be written as the product
of a term which depends on the reflection and another which depends on
the subset an exact solution for this problem has been found.[134] It is useful
in scaling the different films present in a pack which have in common many
reflections and differ only in those which are outside the dynamic range of
each film.
Another alternative procedure that avoids the use of iterations to
determine the scale factors has been proposed by Rae, who defines the
residual
$$\Delta_{Hll'} = \log K_l I_{Hl} - \log K_{l'} I_{Hl'} \qquad (4.68)$$
and minimizes the quantity
(4.70)
where $I(H)_i$ is the ith measurement of reflection H, $\langle I(H) \rangle$ is its mean value
and the summation extends over all the reflections measured more than
once in the set.
In the cases in which R is calculated using independent reflections which
ought to have equal intensities for symmetry reasons, the notation $R_{sym}$ is
used.
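Equation (4.70) itself is missing from the text above. A minimal sketch of a merging R factor consistent with the symbols just defined, assuming the commonly used form $R = \sum_H \sum_i |I(H)_i - \langle I(H) \rangle| / \sum_H \sum_i I(H)_i$ (this form is an assumption, since the original equation is not available here). The function name and the data are hypothetical.

```python
def r_merge(measurements):
    """Merging R factor over reflections measured more than once.

    measurements maps a reflection index (h, k, l) to a list of intensities.
    """
    num = 0.0
    den = 0.0
    for values in measurements.values():
        if len(values) < 2:
            continue  # singly measured reflections carry no information
        mean = sum(values) / len(values)
        num += sum(abs(v - mean) for v in values)
        den += sum(values)
    return num / den


data = {
    (1, 0, 0): [90.0, 110.0],
    (2, 0, 0): [50.0],           # measured once: skipped
    (1, 1, 0): [200.0, 200.0],
}
r = r_merge(data)
```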
Appendix
where n is the number of molecules in the unit cell, M the molecular weight
of the substance, and N is Avogadro's number.
The density of the crystal is
$$\rho = \frac{nM}{0.602\,V} \qquad (4.A.2)$$
where the unit cell volume has to be measured in Å$^3$ and the density in
g cm$^{-3}$.
Measuring the density of a small-molecule crystal usually poses no serious
problem. There are several methods available and the measurements can be
done with high precision.[136]
If the molecular weight of the compound is known, then, using (4.A.2),
the number of molecules in the unit cell, n, can be easily calculated. Since n
has to be an integer, if the density measurement is very reliable and the
molecular weight is not, eqn (4.A.2) can be used to calculate a more
accurate molecular weight. Alternatively, a precise molecular weight can be
used to yield an accurate density for the crystal using the integer closest to
the n determined experimentally.
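The two uses of the density relation described above can be sketched as follows, taking $n = \rho V N / M$ with V converted from Å$^3$ to cm$^3$. The cell volume, density, and molecular weight are hypothetical values chosen only for illustration.

```python
AVOGADRO = 6.02214076e23  # mol^-1


def molecules_per_cell(density, volume_A3, mol_weight):
    """n = rho * V * N / M, with V in cubic angstroms and rho in g cm^-3."""
    return density * volume_A3 * 1e-24 * AVOGADRO / mol_weight


def density_from_n(n, volume_A3, mol_weight):
    """Inverse relation: density implied by an integer number of molecules."""
    return n * mol_weight / (volume_A3 * 1e-24 * AVOGADRO)


# Hypothetical small-molecule crystal.
n_raw = molecules_per_cell(density=1.50, volume_A3=800.0, mol_weight=180.16)
n = round(n_raw)                              # n has to be an integer
rho_refined = density_from_n(n, 800.0, 180.16)
```

Here the measured density gives a value of n slightly above 4, which is rounded to the nearest integer and then used to recompute a more consistent density.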
In the case of protein crystals, the situation is not so simple and there are
several alternative equations that are equivalent to (4.A.2). We will briefly
discuss one of them.[137]
In a macromolecular crystal (see Chapter 8, p. 536) there is water which
is eliminated when the crystal is dried and water which remains bound to
the macromolecule and there is also salt dissolved in the solvent. If d is the
fractional loss of mass when the crystal is dried, u is the fraction of liquid
which remains in the crystal, s is the mass of salt per unit mass of solvent,
and w is the solvent not accessible to the salt because it is strongly bound to
the macromolecule, the total mass of the unit cell of the crystal is
$$m = m_p + dm + um_p + s(m - m_p - wm_p),$$
where $m$ is the total mass of the unit cell and $m_p$ the mass of the protein in
the unit cell.
From this equation we can obtain
$$n = \frac{N V \rho}{M_p}\,\frac{1 - d - s}{1 + u - s - sw}$$
or
$$n = 0.602\,\frac{V \rho}{M_p}\,\frac{1 - d - s}{1 + u - s - sw}, \qquad (4.A.3)$$
where n is the number of molecules in the unit cell and $M_p$ the molecular weight
of the protein.
Equation (4.A.3) can be used in much the same way as eqn (4.A.2) but in
this case determining the crystal density is a much more serious
experimental problem.[138] In addition, one needs to know u, w, s, and d; the first two
parameters are usually not determined but are instead estimated from the
average of known protein crystals; s is the quantity most easily measured,
while d is quite difficult to determine and requires the use of several crystals
for better precision.
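A small sketch of eqn (4.A.3), with a check that the result satisfies the mass balance from which it was derived. All numerical values (cell volume, density, molecular weight, and the parameters d, s, u, w) are hypothetical.

```python
def protein_molecules_per_cell(density, volume_A3, mol_weight, d, s, u, w):
    """Eqn (4.A.3): n = 0.602 V rho / M_p * (1 - d - s) / (1 + u - s - s w)."""
    solvent_term = (1.0 - d - s) / (1.0 + u - s - s * w)
    return 0.602 * volume_A3 * density / mol_weight * solvent_term


# Hypothetical protein crystal.
V, rho, Mp = 200000.0, 1.2, 25000.0
d, s, u, w = 0.2, 0.05, 0.2, 0.3
n_protein = protein_molecules_per_cell(rho, V, Mp, d, s, u, w)

# Mass balance m = m_p + d m + u m_p + s (m - m_p - w m_p), in daltons.
m = 0.602 * V * rho
m_p = n_protein * Mp
balance = m_p + d * m + u * m_p + s * (m - m_p - w * m_p)
```

With d = s = u = w = 0 the expression reduces to the small-molecule relation (4.A.2).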
An alternative to eqn (4.A.3) has been derived by Matthews. Another
approach to determine n for macromolecular crystals is discussed in Chapter
8 (p. 538).
References
1. Rieck, G. D. (1962). In International tables for x-ray crystallography, Vol. III
(ed. C. H. MacGillavry and G. D. Rieck), pp. 59-72. Kynoch, Birmingham.
2. Luger, P. (1980). Modern x-ray analysis on single crystals, Ch. 2. Walter de
Gruyter, Berlin.
3. Phillips, W. C. (1985). In Methods in enzymology, Vol. 114 (ed. H. W.
Wyckoff, C. H. W. Hirs, and S. N. Timasheff), pp. 300-16. Academic,
Orlando.
4. Eisenberger, P. (1986). Science, 231, 687-93.
5. ESRF (1987). Foundation Phase Report. ESRF, Grenoble.
6. Bienenstock, A. and Winick, H. (1983). Physics Today, 36, 48-58.
7. Winick, H. (1987). Scientific American, 257, 72-81.
8. Margaritondo, G. (1988). Introduction to synchrotron radiation. Oxford
University Press, New York.
9. Koch, E. E. (ed.) (1983). Handbook on synchrotron radiation. North Holland,
Amsterdam.
10. Materlik, G. (1982). In Uses of synchrotron radiation in biology (ed. H. B.
Stuhrmann), pp. 1-21. Academic, New York.
11. Hendrickson, W. A., Smith, J. L., Phizackerley, R. P., and Merritt, E. A.
(1988). Proteins: Structure, Function and Genetics, 4, 77-88.
12. Arndt, U. W. (1984). Journal of Applied Crystallography, 17, 118-19.
13. Bonse, U. (1980). In Characterization of crystal defects by x-ray methods (ed.
B. K. Tanner and D. K. Bowen), pp. 298-319. Plenum, New York.
14. Rieck, W., Euler, H., and Schulz, H. (1988). Acta Crystallographica, A44,
1099-101.
15. Koch, B. and MacGillavry, C. H. (1962). In International tables for x-ray
crystallography, Vol. III (ed. C. H. MacGillavry and G. D. Rieck), pp.
157-200. Kynoch, Birmingham.
16. Roberts, B. W. and Parrish, W. (1962). In International tables for x-ray
crystallography, Vol. III (ed. C. H. MacGillavry and G. D. Rieck), pp.
73-88. Kynoch, Birmingham.
17. Arndt, U. W. and Sweet, R. M. (1977). In The Rotation Method in
Introduction
The goal of a structural analysis is to obtain the distribution of atomic
electron density in the unit cell (in practice the atomic positions) starting
from the diffraction data. As already observed in Chapter 3 (p. 169) it is not
possible to reach this goal in a unique and automatic way, because from the
experimental data only the magnitudes, but not the phases, of the structure
factors may be obtained. Therefore, in order to compute the electron
density by means of eqn (3.45), we must somehow derive the missing
information. In this chapter we shall analyse the most important methods
commonly used to solve the phase problem.
The problem must in principle have a solution (even if not necessarily
unique), since the measured intensities are proportional to the squares of
the structure factors, which may be expressed as†
† Throughout this chapter the convention of using capital letters for the 1 × 3 matrix of the
reciprocal lattice indices will not be followed, in order to conform with the notation generally
used in the literature on Patterson and direct methods, where generally lower case letters
indicate the general reciprocal vectors as well as the matrix of their components. With this
notation no ambiguity should arise between, for instance, the scalar product of the reciprocal
vector h by the direct position vector r, indicated by $h \cdot r$, and the product of the indices
matrix h by the rotation matrix R of a symmetry operator, indicated by hR (the transpose sign
is usually omitted).
320 ( Davide Viterbo
sets.[1] Nevertheless, in practice the constraint that the solution must obey
stereochemical rules makes it extremely unlikely that more than one
homometric set is chemically acceptable.
The possibility of solving a system of non-linear equations relies on that
of obtaining a first approximate solution, constituting the so called initial
structural model. This can then be refined until the best agreement with the
experimental data is achieved.
Before considering the different methods employed to define an initial
model, it is therefore necessary to establish the criteria which allow us to
assess its correctness. From the M positional vectors of the model the
structure factors
$$F_h^c = \sum_{j=1}^{M} f_j \exp(2\pi i\, h \cdot r_j) \qquad (5.2)$$
may be computed. A good agreement between the $|F_h^c|$s and the observed
moduli $|F_h^o|$, obtained directly from the intensities, will indicate a correct
model. The most common parameter used to express this agreement is the R
index (also called agreement index or residual)
$$R = \frac{\sum_h \bigl|\,|F_h^o| - K|F_h^c|\,\bigr|}{\sum_h |F_h^o|} \qquad (5.3)$$
where K is a scale factor bringing $|F_h^c|$ on the same scale as $|F_h^o|$, obtained as
$K = \sum_h |F_h^o| / \sum_h |F_h^c|$.
In the case of equal atom structures, the R value for totally random
atomic positions has been statistically evaluated to be 0.83 for
centrosymmetric structures and 0.59 for non-centrosymmetric structures.[2] Structural
models yielding values of the R index lower than these extreme values may
be considered as plausible initial guesses to start the refinement process. In
general a model with $R \le 0.5$ if centrosymmetric or 0.4 if
non-centrosymmetric will be a good starting point. It may also happen that the
postulated model contains errors which can not be corrected by the
following refinement, and therefore it does not converge to the correct
solution. A quite frequent case is represented by crystals containing one or
more solvent molecules; if the presence of these molecules in the cell is
overlooked, then the initial model will be incomplete and the index R will
not decrease below 0.15-0.25, unless the positions of the solvent molecules
are taken into account. We will consider the behaviour of the R index in
some more detail in the paragraph on structure refinement.
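A minimal sketch of the R index calculation, assuming the form $R = \sum_h ||F_h^o| - K|F_h^c|| / \sum_h |F_h^o|$ with $K = \sum_h |F_h^o| / \sum_h |F_h^c|$ (the placement of K in the formula is an assumption, since the printed equation is not available here). The amplitudes are hypothetical.

```python
def r_index(f_obs, f_calc):
    """Agreement index between observed and calculated amplitudes."""
    k = sum(f_obs) / sum(f_calc)  # brings |F_calc| onto the |F_obs| scale
    residual = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return residual / sum(f_obs)


f_obs = [10.0, 20.0, 30.0]
r_perfect = r_index(f_obs, [5.0, 10.0, 15.0])   # same pattern, different scale
r_model = r_index(f_obs, [6.0, 9.0, 15.0])      # slightly wrong model
```

The first call shows that a pure scale difference does not contribute to R; the second gives a small non-zero value.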
Historically, the first crystal structures were solved by trial and error
methods, consisting in a systematic trial of all structural hypotheses
compatible with the known physical and chemical properties of the
considered crystal. These methods require a great effort, ingenuity, and skill
and can only be used with simple structures. They are seldom used today
and for this reason they will not be treated (for a comprehensive account
the reader is referred to Lipson and Cochran). Only the methods based on
the use of the Patterson function and the so-called direct methods will be
considered here, while those using isomorphous replacement and anoma-
lous dispersion, mainly used in solving biological macromolecular struc-
tures, will be dealt with in Chapter 8.
Solution and refinement of crystal structures 1 321
$$P_1(|F|) = \frac{2|F|}{\Sigma}\exp\left(-\frac{|F|^2}{\Sigma}\right) \quad \text{(acentric)}, \qquad (5.5)$$
where
are introduced. In the same way by which $\langle |F|^2 \rangle$ was defined, we have
and in the case of all equal atoms Z, = 1/N. From (5.7) we can immediately
derive
where $K'$ (reciprocal of the scale factor used in eqn (5.3)) is the scale factor,
$|{}^0F_h|$ is the structure amplitude in absolute scale for atoms at rest, B is the
overall isotropic temperature factor, and $s = \sin\theta/\lambda$.
Table 5.1. Theoretical values of some functions of $|E|$ obtained from the centric (5.10)
and acentric (5.11) distributions and their comparison with the corresponding
experimental values for the AZOS structure
where $\langle s^2 \rangle$ is the mean value of $\sin^2\theta/\lambda^2$ in the considered interval and
$\Sigma_s$ is computed using the tabulated values of the atomic scattering factors for
atoms at rest for $s = \sqrt{\langle s^2 \rangle}$. Dividing the reciprocal lattice into several
intervals of s, (5.14) tells us that a linear relation will exist between
$\ln(\langle |F_h^{obs}|^2 \rangle / \Sigma_s)$ and $\langle s^2 \rangle$
and that a plot of these values obtained from the
experimental data can be interpolated by the best straight line passing
through them. The intercept of the line on the vertical axis will give us $\ln K'$
and its slope the value of $-2B$.
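The straight-line fit of a Wilson plot can be sketched as below, assuming that eqn (5.14) has the usual form $\ln(\langle |F^{obs}|^2 \rangle / \Sigma_s) = \ln K' - 2B\langle s^2 \rangle$ (an assumption, since the equation is not reproduced here). The data are synthetic, generated from chosen values of $K'$ and B so that the fit can be checked; they are not the AZOS values of Table 5.2.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def wilson_fit(mean_s2, mean_f2, sigma_s):
    """Return (K', B) from shell averages of s^2, |F_obs|^2 and Sigma_s."""
    ys = [math.log(f2 / sig) for f2, sig in zip(mean_f2, sigma_s)]
    slope, intercept = fit_line(mean_s2, ys)
    return math.exp(intercept), -slope / 2.0


# Synthetic shells (hypothetical numbers).
k_true, b_true = 0.5, 3.0
s2 = [0.02, 0.06, 0.10, 0.14, 0.18]
sigma = [900.0, 700.0, 520.0, 400.0, 310.0]
f2 = [k_true * sig * math.exp(-2.0 * b_true * x) for sig, x in zip(sigma, s2)]

k_fit, b_fit = wilson_fit(s2, f2, sigma)
```

With real data the points scatter about the line, for the reasons discussed in the text, and the low-angle shells are often given less weight or omitted.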
Figure 5.2 is an example of such a Wilson plot for a typical small organic
structure (p-carboxyphenylazoxycyanide-dimethyl sulphoxide, AZOS
hereinafter); the numerical values of the terms appearing in eqn (5.14),
obtained from 1908 observed reflections, are given in Table 5.2. The main
reason for the deviations of the experimental points from the straight line is
the breakdown of the condition of equiprobability of all atomic positions,
assumed in deriving (5.4) and (5.5); in fact, the presence of structural
Table 5.2. Numerical values of the different terms in eqn (5.14) employed to obtain the
Wilson plot of AZOS shown in Fig. 5.2
The last column in Table 5.1 gives the experimental values of several
statistical indicators based on the $|E|$ values of AZOS (space group $P2_1/a$),
which confirm the presence of an inversion centre.
We have shown (see (3.A.39)) that the Fourier transform of P(u) is $|F(r^*)|^2$
(in symbols $|F(r^*)|^2 = T[P(u)]$) and vice versa
$$P(u) = T^{-1}[|F(r^*)|^2] = \int_{V^*} |F(r^*)|^2 \exp(-2\pi i\, r^* \cdot u)\,\mathrm{d}r^*
= \frac{1}{V} \sum_h |F_h|^2 \exp(-2\pi i\, h \cdot u).$$
and then P(u) = P(-u), i.e. the Patterson function is always
centrosymmetric even when $\rho(r)$ is not. This is in agreement with the deductions of p.
176 concerning the centrosymmetric nature of all functions with real Fourier
transform. Since the $|F_h|^2$ depend on the interatomic vectors [cf. eqn (5.1)] we
may expect that also P(u) will contain information on these quantities. This
can be verified starting from the definition (5.16). Let us, for simplicity,
suppose we have an idealized structure made up of n point atoms with an
associated weight equal to their atomic number (Fig. 5.3). The integral in
(5.16) then becomes a summation over the n points, and
In order to derive P(u), all the atoms of the original structure $\rho(r_i)$ are
shifted by a fixed vector u to obtain the corresponding $\rho(r_i + u)$, then the
products $\rho(r_i)\rho(r_i + u)$ are performed and finally all these contributions are
summed; the products will be non-zero only when $\rho(r_i + u) \neq 0$, that is
when a point in the translated structure (broken lines in Fig. 5.3) coincides
with an atom of the original structure. This condition is verified only when
the vector u coincides with an interatomic vector (in Fig. 5.3(a) u coincides
with the vector 2-4), while for a general u (Fig. 5.3(b)) all points of the
translated image fall into regions where $\rho(r)$ is zero. In the first case the
value of P(u) will be proportional to the product of the weights of the two
superposed atoms, in the second P(u) = 0. In Fig. 5.3 a higher weight has
been given to atom 1 (heavier atom) and in 5.3(c) the case of a vector u
coinciding with the distance 1-4 is represented; it will be $P(u_{1,4}) > P(u_{2,4})$,
both being single Patterson peaks, as each corresponds to only one
interatomic vector. Finally in 5.3(d) the case of a vector u coinciding with
two parallel interatomic vectors of equal length is illustrated; two terms will
contribute to the summation in (5.18) and P(u) becomes twice as large as
$P(u_{2,4})$ and it is said to have a multiplicity of two. Let us, for instance,
suppose that atom 1 is a sulphur and the others are carbons; we will then
have: $P(u_{2,4}) = 6 \times 6 = 36$, $P(u_{1,4}) = 6 \times 16 = 96$ and, with reference to Fig.
5.3(d), $P(u) = 2 \times (6 \times 6) = 72$.
From what we have seen so far it follows that the Patterson function will
have maxima corresponding to all possible interatomic vectors within the
unit cell; the height of each peak will be proportional to the product of the
atomic numbers of the atoms connected by the vector u, multiplied by the
multiplicity of the same vector.
This concept can be further clarified by considering Fig. 5.4, where in
5.4(a) a set of N = 5 points is represented, while in 5.4(b) the corresponding
distribution of interatomic vectors, and in 5.4(c) the same set of vectors,
after they have been translated to a common origin, are shown; the last
corresponds to the distribution of peaks in the Patterson function.
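The construction just described can be reproduced for a toy structure of point atoms. In the sketch below the coordinates are hypothetical (they are not those of Fig. 5.3 or Fig. 5.4): one sulphur (Z = 16) and three carbons (Z = 6) in a two-dimensional cell, with the carbons equally spaced along a line so that one interatomic vector occurs twice. Exact fractions are used so that coincident vectors are grouped reliably.

```python
from collections import defaultdict
from fractions import Fraction as Fr


def patterson_peaks(atoms):
    """Map each interatomic vector (mod 1) to the summed weight Z_i * Z_j."""
    peaks = defaultdict(int)
    for zi, ri in atoms:
        for zj, rj in atoms:
            u = tuple((b - a) % 1 for a, b in zip(ri, rj))
            peaks[u] += zi * zj
    return dict(peaks)


# Hypothetical fractional coordinates, in twentieths of the cell edges.
atoms = [
    (16, (Fr(2, 20), Fr(2, 20))),    # S
    (6, (Fr(6, 20), Fr(4, 20))),     # C2
    (6, (Fr(9, 20), Fr(9, 20))),     # C3
    (6, (Fr(12, 20), Fr(14, 20))),   # C4
]
peaks = patterson_peaks(atoms)

origin = peaks[(Fr(0), Fr(0))]                  # sum of Z^2 over all atoms
single_cc = peaks[(Fr(6, 20), Fr(10, 20))]      # C2 -> C4
double_cc = peaks[(Fr(3, 20), Fr(5, 20))]       # C2 -> C3 and C3 -> C4
single_sc = peaks[(Fr(10, 20), Fr(12, 20))]     # S -> C4
```

Of the N² = 16 vectors, 4 fall on the origin and the remaining 12 give 10 distinct peaks, because one vector and its inverse each occur twice.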
The Patterson function will have the same periodicity as the electron
density and therefore the size of the unit cell will be identical. On the other
hand, the number of peaks in the Patterson function is much greater than
that in $\rho(r)$; given N atoms in the cell they will give rise to $N^2$ peaks in
P(u), N of which will superpose on a single peak at the origin (they
correspond to the N zero distances of each atom with itself), while the
remaining N(N - 1) are distributed over the cell. This higher density of
peaks becomes a more serious problem for real structures with non-point-
like atoms. In fact the Patterson peaks are wider than the maxima in an
electron density map. As illustrated in the one-dimensional example of Fig.
5.5, because of the non-zero width of the peaks in $\rho(x)$, the Patterson peaks
will have a width twice as large.
For these reasons the Patterson map of a structure, with even a moderate
number of atoms, may appear as an almost featureless distribution of vector
density. To overcome this problem it is convenient to employ a sharpening
procedure, consisting in computing the Patterson function with coefficients
$|E_h|^2$ or better $|F_h E_h|$. In fact the normalized structure factors correspond to
a point-atom structure with no decrease of the atomic scattering factor with
increasing $\sin\theta/\lambda$. The $|F_h E_h|$ coefficients are more convenient, because
over-sharpening is sensitive to the series truncation errors (cf. Fig. 3.16) and
may produce spurious peaks or a down-scaling of correct peaks in the map.
It is also possible to eliminate the origin peak, which may obscure some
short vectors, by subtracting from the coefficients the terms corresponding
to the interaction of each atom with itself (i.e. the first term in the
right-hand side of (5.1)); the coefficient in the series (5.17) will be
Fig. 5.4. (a) Scheme of a molecule formed by five point atoms; (b) corresponding
representation of all possible interatomic vectors; (c) Patterson function
obtained by translating all vectors in (b) to a common origin.
Vectors                   Height
heavy atom-heavy atom     $Z_p Z_p$   very high
heavy atom-light atom     $Z_p Z_l$   intermediate
light atom-light atom     $Z_l Z_l$   very low

Space groups                                                          Patterson symmetry
Triclinic ($P1$, $P\bar{1}$)                                          $P\bar{1}$
Primitive monoclinic ($P2$, $P2_1$, ..., $P2_1/c$)                    $P2/m$
Centred monoclinic ($C2$, $Cc$, ..., $C2/c$)                          $C2/m$
Primitive orthorhombic ($P2_12_12_1$, ..., $Pna2_1$, ..., $Pbca$ ...) $Pmmm$

Symmetry element                      Harker lines and sections
2 axis parallel to a, b, c            0, v, w;  u, 0, w;  u, v, 0
$2_1$ axis parallel to a, b, c        1/2, v, w;  u, 1/2, w;  u, v, 1/2
m mirror perpendicular to a, b, c     u, 0, 0;  0, v, 0;  0, 0, w
a glide perpendicular to b, c         1/2, v, 0;  1/2, 0, w
b glide perpendicular to a, c         u, 1/2, 0;  0, 1/2, w
c glide perpendicular to a, b         u, 0, 1/2;  0, v, 1/2

Fig. 5.7. Argand diagram in which two heavy atoms (with atomic scattering factors
$f_1$ and $f_2$) and six light atoms contribute to the structure factor F; the
resultant of the contributions of the two heavy atoms is quite close to F.
coefficients the observed amplitudes (to which all atoms in the structure will
contribute) with the corresponding calculated phases $\varphi_h^c$. The map will not
only reveal the heavy atoms but also other atoms of the structure. In the
most favourable cases the structure may be completed from the first
electron density map, but in general it is necessary to operate in more than
one cycle by the so-called method of Fourier synthesis recycling. Each
cycle requires the calculation of the structure factors from the coordinates of
the known atoms; their phases will then be used to compute a new electron
density map. If the initial model is correct, each cycle will reveal new atoms
until the structure is completed.
From the previous considerations one may get the impression that it
would be advantageous to have compounds containing atoms of high atomic
number, but one has also to consider that their contribution to the diffracted
amplitudes may become so dominant that the observed data will be almost
unaffected by the contribution of the remaining light atoms. The definition
of the final structure will then be rather inaccurate. It has been
demonstrated empirically that the best ratio between heavy and light atoms is
that for which
$$\frac{\sum_{\text{heavy}} Z^2}{\sum_{\text{light}} Z^2} \approx 1. \qquad (5.20)$$
interpretation of the Patterson function and the process of completing the
structure become more and more difficult, but at the same time the accuracy
of the refined positions of the light atoms will increase. As an example let us
consider a hypothetical organic compound of formula C3,H3,04X, where X
is a halogen considered as a heavy atom; then, supposing that there are
two molecules in the cell, $\sum Z_l^2 = 2708$ and $\sum Z_p^2 = 578, 2450, 5618$ for
X = Cl, Br, I respectively and the ratios (5.20) will be 0.21, 0.90, 2.07
respectively. Supposing that the data measured for the three derivatives are
equally good, the chlorine compound will be difficult to solve, but the
refined structure will be quite accurate; with bromine the solution should be
quite easy with still a reasonable accuracy of the final structure, while with
iodine it will be very easy to solve the structure but its accuracy will be
further reduced.
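The ratios quoted in this example can be checked directly from the atomic numbers of the halogens, taking the ratio (5.20) as $\sum Z^2$(heavy) / $\sum Z^2$(light) and using the light-atom sum of 2708 given in the text for two molecules per cell. The function name is illustrative only.

```python
LIGHT_SUM = 2708                      # sum of Z^2 over light atoms, from the text
HALOGEN_Z = {"Cl": 17, "Br": 35, "I": 53}


def heavy_light_ratio(z_heavy, n_heavy, light_sum):
    """Ratio of sum Z^2 over heavy atoms to sum Z^2 over light atoms."""
    return n_heavy * z_heavy ** 2 / light_sum


# Two molecules per cell, one halogen atom per molecule.
ratios = {
    name: heavy_light_ratio(z, 2, LIGHT_SUM) for name, z in HALOGEN_Z.items()
}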
In order to find the positions of the heavy atoms it is necessary to locate
them with respect to the symmetry elements and to the conventional origin
of the unit cell. Let us now consider some examples, assuming for the
moment that the asymmetric unit only contains one heavy atom.
1. Space group $P\bar{1}$. The vector between equivalent atoms related by the
inversion centre has components u = 2x, v = 2y, w = 2z; once it has been
localized on the map it will immediately give the heavy atom coordinates
with respect to the origin chosen on the inversion centre.
2. Space group $P2_1$ (twofold screw axis parallel to b). As we have seen,
the vector between equivalent heavy atoms gives rise to a peak on the
Harker section at 2x, 1/2, 2z. From its position one may easily derive the x
and z coordinates of the heavy atom; the y coordinate may be arbitrarily
assigned in order to fix the origin along the twofold screw axis.† Let us
† This is correct in the process of finding a starting model formed by a heavy atom, but
during the refinement a more robust way of fixing the origin should be used, as described in
Chapter 2, p. 107.
The heavy atom is the sulphur and the value of the ratio (5.20) is 0.37.
The general equivalent positions are:
Fig. 5.9. Methyl(phenylsulphonyl)furoxan: (a) Harker section 1/2, v, w;
(b) Harker section u, v, 1/2; (c) Harker section u, 1/2, w.
The general equivalent positions are
The three mutually perpendicular twofold screw axes give rise to three
Harker sections of the type 1/2, 1/2 - 2y, 2z; 1/2 - 2x, 2y, 1/2; 2x, 1/2, 1/2 - 2z (Fig.
5.9(a, b, c)). In the first the highest peak is positioned at v = 1/2 - 2y =
50/100, w = 2z = 6.6/100, giving y = 0.0 and z = 0.033. In the second the
highest peak is at u = 1/2 - 2x = 38.2/100, v = 2y = 0/100 (in agreement with
the first section) and then x = 0.059 and y = 0.0. In the third section the
peak at u = 2x = 11.8/100 and w = 1/2 - 2z = 43.2/100, confirming the
coordinates derived from the first two sections, is not the highest peak but the
fourth highest. The largest peak at 50 and 6.6 is in common with the first
section, because of the special value of y = 0; the second and third peaks are
giving for Fe(2) the coordinates referred to the eight possible origins:
Only one of these positions will account for the vectors Fe(1) - Fe(2) and
Fe(1) + Fe(2) in the Patterson. In our case it will be the last position; in fact
$u_2 = 1.380$, $v_2 = 1.300$, $w_2 = 1.645$ will give $x_2 = 0.690$, $y_2 = 0.650$, $z_2 = 0.822$.
Peak 3 and peak 5 may then be interpreted as Fe(1) - Fe(2) and
Fe(1) + Fe(2) respectively, after these Fe(2) coordinates have been
translated by -1.0 along the three axes to obtain the values listed in the bottom
part of Table 5.4. With a similar procedure we may deduce the coordinates
of the remaining two Fe atoms listed in the table.

Peak     u      v      w      H    Interpretation
Origin   0      0      0      364
1        0.080  0.260  0.110  144
2        0.020  0.000  0.710  138
3        0.490  0.560  0.470  120
4        0.070  0.275  0.400   72
5        0.110  0.245  0.820   65
6        0.485  0.455  0.810   65
7        0.420  0.180  0.420   63
8        0.405  0.290  0.360   59
9        0.480  0.570  0.760   55
10       0.400  0.195  0.700   38
11       0.380  0.300  0.645   33
12       0.430  0.285  0.075   30
13       0.450  0.165  0.135   27
Direct methods
Introduction
The term direct methods indicates those methods which try to
derive the structure factor phases directly from the observed amplitudes
through mathematical relationships. In general the phase and the amplitude
of a wave are independent quantities and in order to understand how, in the
case of X-ray diffraction, it is possible to relate these two quantities, two
important properties of the electron density function should be considered:
(1) it is everywhere positive, i.e. $\rho(r) \ge 0$ (positivity);
(2) it is composed of discrete atoms (atomicity).
The relation between positivity and phase values may be simply
understood by just imagining the computation of $\rho(r)$ of a centrosymmetric
structure as a Fourier series, first with all signs correct and then with all
signs reversed: the first map will be everywhere positive or zero, while the
second will be negative or zero and therefore physically unacceptable. Two
pictorial examples of how positivity restricts the possible values of the
phases are described in Appendix 5.C, while a more formal explanation
will be given later.
Historically, the first mathematical relationships capable of giving phase
information were obtained, in the form of inequalities, by Harker and
Kasper in 1948 and then further developed by Karle and Hauptman
and by other authors. Because of their limited practical interest, they will
not be treated here and the reader is referred to more specialized
textbooks.[19] In 1953 Hauptman and Karle[20] established the basic concepts
and the probabilistic foundations of direct methods; the great power of
these methods in solving complex crystal structures had its highest
recognition in the Nobel Prize for Chemistry conferred in 1985 on the
mathematician H. Hauptman and the physicist J. Karle.
Also in 1953 Sayre,[21] using the atomicity condition, was able to derive a
very important relation. He considered that for a structure formed by well
resolved and almost equal atoms, the two functions ρ(r) and ρ²(r) are quite
similar and show maxima at the same positions. A one-dimensional example
is illustrated in Fig. 5.10.
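The similarity between the density and its square can be illustrated with a small Python sketch. The four atomic positions, the Gaussian width, and the grid size below are hypothetical; the sketch only shows that, for well resolved equal atoms, the peaks of the two functions fall on the same grid points.

```python
import math

ATOMS = [0.10, 0.27, 0.52, 0.80]   # hypothetical fractional coordinates
SIGMA = 0.02                       # hypothetical atomic width
N_GRID = 400

def rho(x):
    """Periodic density built from equal Gaussian atoms."""
    total = 0.0
    for xj in ATOMS:
        for shift in (-1.0, 0.0, 1.0):      # nearest periodic images
            d = x - xj + shift
            total += math.exp(-d * d / (2 * SIGMA ** 2))
    return total

def peak_indices(values):
    """Indices of strict local maxima on a periodic grid."""
    n = len(values)
    return [i for i in range(n)
            if values[i] > values[i - 1] and values[i] > values[(i + 1) % n]]

density = [rho(i / N_GRID) for i in range(N_GRID)]
density_sq = [v * v for v in density]
peaks_rho = peak_indices(density)
peaks_rho_sq = peak_indices(density_sq)
```

The peaks of ρ² are sharper than those of ρ, but their positions are unchanged, which is the property exploited by Sayre.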
We have seen that the Fourier transform of ρ(r) is (1/V)F_h and, for the
case of all equal atoms,
$$F_h = f_h \sum_{j=1}^{N} \exp(2\pi i\, h \cdot r_j). \qquad (5.21)$$
For large values of |F_h| the left-hand side will be large, real, and positive. It
is therefore likely that the largest terms in the sum on the right will also be
real and positive. It follows that, if |F_k| and |F_{h-k}| also have large values, it
will be
$$\Phi_{hk} = \varphi_{-h} + \varphi_{k} + \varphi_{h-k} \approx 0 \qquad (5.26)$$
which for centrosymmetric structures becomes
$$S(-h)\,S(k)\,S(h-k) \approx + \qquad (5.27)$$
where S(h) stands for the sign of reflection h and the symbol ≈ stands for
'probably equal'. We note that (5.27) coincides with the indication obtained
in Appendix 5.C.
Relations (5.26) and (5.27) are expressed in a probabilistic form and
indicate the necessity of applying probability methods to estimate their
reliability. On the whole, the use of probability techniques to obtain
relationships between phases and magnitudes has proved to be the most
important approach for the practical use of direct methods. We will
therefore describe in more detail these methods and the procedures
employed for their practical applications.
when
Let us show that its value does not change when the origin is moved by a
general vector ro. The structure factor of index h, referred to the new
origin, will be
$$F'_h = \sum_{j=1}^{N} f_j \exp[2\pi i\, h \cdot (r_j - r_0)]$$
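The effect of such an origin shift can be verified numerically. The sketch below is an illustration only: it uses a hypothetical one-dimensional point-atom structure (so that h·r reduces to a product of scalars) and arbitrary scattering factors. It shows that the moduli are unchanged, that each phase changes by −2πh·r_0, and that a sum of phases whose indices add up to zero is unaffected.

```python
import cmath
import math

# Hypothetical 1D structure: (scattering factor, fractional coordinate).
ATOMS = [(6.0, 0.11), (7.0, 0.37), (8.0, 0.62), (6.0, 0.85)]

def structure_factor(h, origin=0.0):
    """F_h referred to an origin placed at `origin`."""
    return sum(f * cmath.exp(2j * math.pi * h * (x - origin)) for f, x in ATOMS)

def triplet_phase(h, k, origin=0.0):
    """phi_{-h} + phi_k + phi_{h-k}, returned in (-pi, pi]."""
    product = (structure_factor(-h, origin)
               * structure_factor(k, origin)
               * structure_factor(h - k, origin))
    return cmath.phase(product)

def wrapped_difference(a, b):
    """Difference a - b of two angles, wrapped to (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * (a - b)))
```

For example, `triplet_phase(3, 5, 0.23)` and `triplet_phase(3, 5)` agree to rounding error, whereas the individual phases do not.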
is a s.i. for any value of k. When the origin is chosen on a twofold screw
axis, then Φ = φ_{2h,0,2l} − πk. The permissible origins are located on any of
the four screw axes present in the unit cell (cf. Fig. 5.18).
The two examples given above refer to single phases and are therefore
one-phase semi-invariants. We can generalize what we have seen so far
to linear combinations of several phases. Thus, the
combination Σφ_h is a s.s. in P1̄ if Σh = 2H (i.e. if the three components
of the sum vector are all even); in P2₁ it is a s.s. if Σh = (2H, 0, 2L). A
compact way to indicate these two conditions is: Σh = 0 mod (2, 2, 2)
(meaning that each of the three components of the vector Σh gives a
zero remainder when divided by 2), and Σh = 0 mod (2, 0, 2) (with the same
meaning as before for the first and third components, while the second
must be zero). In general we may write Σh = 0 mod ω_s, where ω_s is a
vector, called the semi-invariant modulus, with integer components; the
vector Σh is called the semi-invariant vector.
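The modulus condition lends itself to a direct test. The following Python sketch is a minimal illustration; the function name and the convention that a zero entry in the modulus means 'this component must vanish' are choices made here to mirror the notation mod (2, 0, 2).

```python
def is_semi_invariant(indices, modulus):
    """Check whether the sum of the given (h, k, l) triples is 0 mod `modulus`.

    A zero component in `modulus` means that the corresponding component
    of the sum vector must itself be zero.
    """
    total = [sum(component) for component in zip(*indices)]
    for value, m in zip(total, modulus):
        if m == 0:
            if value != 0:
                return False
        elif value % m != 0:
            return False
    return True

MODULUS_222 = (2, 2, 2)   # first case discussed in the text
MODULUS_202 = (2, 0, 2)   # second case discussed in the text
```

For instance, the pair (1, 2, 3) and (1, 0, 1) sums to (2, 2, 4): it satisfies the (2, 2, 2) condition but not the (2, 0, 2) one, because the second component is not zero.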
In order to identify which phases or combinations of phases are s.s. one
can refer to special tables (see, for example, Giacovazzo[22]) in which the
space groups are classified in such a way that those belonging to the same
class have the same permissible origins. The same process may be carried
out automatically on a computer following an algebraic approach.[23]
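Before considering probability methods, let us use the positivity property
of the electron density function to obtain an indication on the value of
triplet invariants. The development of the product F_{-h}F_kF_{h-k} gives
$$
\begin{aligned}
F_{-h}F_kF_{h-k} &= \sum_{j_1} f_{j_1}\exp[-2\pi i\,h\cdot r_{j_1}] \times \sum_{j_2} f_{j_2}\exp[2\pi i\,k\cdot r_{j_2}] \times \sum_{j_3} f_{j_3}\exp[2\pi i\,(h-k)\cdot r_{j_3}] \\
&= \sum_{j} f_j^3 && (1) \\
&\quad + \sum_{j_1\neq j_2} f_{j_1}f_{j_2}^2 \exp[2\pi i\,h\cdot(r_{j_2}-r_{j_1})] && (2) \\
&\quad + \sum_{j_1\neq j_2} f_{j_1}^2 f_{j_2} \exp[2\pi i\,k\cdot(r_{j_2}-r_{j_1})] && (3) \\
&\quad + \sum_{j_1\neq j_3} f_{j_1}^2 f_{j_3} \exp[2\pi i\,(h-k)\cdot(r_{j_3}-r_{j_1})] && (4) \\
&\quad + \sum_{j_1\neq j_2\neq j_3} f_{j_1}f_{j_2}f_{j_3} \exp\{2\pi i\,[h\cdot(r_{j_3}-r_{j_1}) + k\cdot(r_{j_2}-r_{j_3})]\},
\end{aligned}
$$
where the last summation will be indicated as R.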
If the electron density is positive, the atomic scattering factors, which are
the Fourier transform of the electron density around each atom, will also be
positive and the summation in (1) will then be positive. Recalling equation
(5.A.9), the terms (2), (3), and (4) are proportional to |F_h|² − ⟨|F|²⟩,
|F_k|² − ⟨|F|²⟩, and |F_{h-k}|² − ⟨|F|²⟩ respectively, and for large values of |F|
they will be large and positive. The last term R is the only complex term,
but being a sum of positive and negative quantities, it will on average be
small; R may be considered as a 'noise' term. For large values of the
structure amplitudes we can then write
$$F_{-h}F_kF_{h-k} = \sum_j f_j^3 + K\{|F_h|^2 + |F_k|^2 + |F_{h-k}|^2 - 3\langle|F|^2\rangle\} + R \geq 0. \qquad (5.37)$$
In the case of a centrosymmetric structure this relation implies that, when
|F_h|, |F_k|, and |F_{h-k}| are large, the product of the three signs of the
structure factors is S(h)S(k)S(h − k) = +. This deduction is in agreement
with (5.27), derived from Sayre's equation, and with the pictorial derivation
given in Appendix 5.C.
Probability methods
For the same reasons considered on pp. 321-2, from now on we will use
the normalized structure factors. The use of normalized amplitudes is also
suggested by (5.37), which shows that the value of the triplet does not
depend on the diffraction angle. Indeed we may note that at low sin θ/λ,
where the scattering power is higher, both the positive terms and the noise
term R are larger, and that they decrease by comparable amounts as the
diffraction angle increases. It is also important to point out that the use of
normalized structure factors, corresponding to a point-atom structure,
implicitly corresponds to a sharpening of the atomicity assumption for the
electron density.
In Appendix 5.D the probability formulae for triplet invariants are
derived; here we will just consider the main results. For a
non-centrosymmetric structure the distribution associated with (5.26),
derived by Cochran,[24] is given by (cf. (5.D.17))
$$P(\Phi_{hk}) = [2\pi I_0(G_{hk})]^{-1} \exp(G_{hk}\cos\Phi_{hk}) \qquad (5.38)$$
where:
(1) for equal atoms
$$G_{hk} = (2/\sqrt{N})\,|E_hE_kE_{h-k}|;$$
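To give a feeling for the role of the concentration parameter, the following Python sketch evaluates G_hk for equal atoms and the corresponding von Mises density. It is a minimal illustration: the function names are hypothetical, the modified Bessel functions are summed from their power series, and the expected value of cos Φ is taken as I_1(G)/I_0(G), the standard result for a von Mises distribution.

```python
import math

def bessel_i(order, x, terms=40):
    """Modified Bessel function I_order(x) from its power series."""
    return sum((x / 2) ** (2 * m + order)
               / (math.factorial(m) * math.factorial(m + order))
               for m in range(terms))

def concentration(e_h, e_k, e_hk, n_atoms):
    """G_hk = (2 / sqrt(N)) |E_h E_k E_{h-k}| for N equal atoms."""
    return 2.0 * abs(e_h * e_k * e_hk) / math.sqrt(n_atoms)

def cochran_density(phi, g):
    """von Mises density of the triplet phase for concentration g."""
    return math.exp(g * math.cos(phi)) / (2 * math.pi * bessel_i(0, g))

def expected_cosine(g):
    """Mean value of cos(Phi) under the von Mises density."""
    return bessel_i(1, g) / bessel_i(0, g)
```

With |E| = 2.0 for all three reflections and N = 64 one obtains G = 2.0 and an expected cosine close to 0.70; for larger N the same amplitudes give a flatter, less informative distribution.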
where
$$\alpha_h = \Big[\Big(\sum_{j=1}^{r} G_j\cos\omega_j\Big)^2 + \Big(\sum_{j=1}^{r} G_j\sin\omega_j\Big)^2\Big]^{1/2} \qquad (5.43)$$
and
$$\tan\beta_h = \frac{\sum_{j=1}^{r} G_j\sin\omega_j}{\sum_{j=1}^{r} G_j\cos\omega_j} \qquad (5.44)$$
with G_j = G_{hk_j} and ω_j = φ_{k_j} + φ_{h-k_j}. Equations (5.42) to (5.44) are easily
understood when the r relationships of type (5.40) are plotted on an Argand
diagram as vectors of modulus G_j and phase ω_j; the angle between each
vector and the real axis is an indication of a probable value of φ_h. In Fig.
5.12 the case with r = 5 is illustrated; it can be seen that
$$AH = \alpha_h\sin\beta_h = \sum_{j=1}^{r} G_j\sin\omega_j, \qquad OH = \alpha_h\cos\beta_h = \sum_{j=1}^{r} G_j\cos\omega_j$$
and relations (5.42), (5.43), (5.44) become immediately clear.

Fig. 5.12. Vector representation, in the complex plane, of the combination
of five triplets of type (5.40) involving the same reflection h.
Finally eqn (5.41) becomes
$$P(\varphi_h) = [2\pi I_0(\alpha_h)]^{-1}\exp[\alpha_h\cos(\varphi_h - \beta_h)] \qquad (5.45)$$
which is still a von Mises distribution with a maximum for φ_h = β_h and a
variance[26] depending on α_h as shown in Fig. 5.13. For instance we may
deduce that for α_h = 2, ⟨φ_h⟩ = β_h ± 50°, while for α_h = 10, ⟨φ_h⟩ = β_h ± 19°.
Equation (5.44) gives the most probable value of φ_h and is known as the
tangent formula;[27] as we shall see later, this formula plays an important
role in the phase determination process.
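The vector construction behind the tangent formula translates directly into code. The sketch below is a minimal Python illustration in which each relation is supplied as a pair (G_j, ω_j); the function name and the input layout are assumptions made for the example.

```python
import math

def tangent_formula(relations):
    """Combine relations (G_j, omega_j) into (alpha_h, beta_h).

    beta_h is the most probable phase and alpha_h the reliability
    parameter, i.e. the polar form of the vector sum on the Argand diagram.
    """
    relations = list(relations)
    numerator = sum(g * math.sin(w) for g, w in relations)     # alpha sin(beta)
    denominator = sum(g * math.cos(w) for g, w in relations)   # alpha cos(beta)
    alpha = math.hypot(numerator, denominator)
    beta = math.atan2(numerator, denominator)   # keeps the correct quadrant
    return alpha, beta
```

Using `atan2` instead of a plain arctangent of the ratio preserves the quadrant of β_h. Two equally weighted indications at 0 and π/2 give β_h = π/4, while two opposite indications cancel and give α_h = 0.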
In the case of centrosymmetric structures the probability that the sign
relationship (5.27) is true is given by[28] (Appendix 5.D)
$$P^+ = \tfrac12 + \tfrac12\tanh\big(\sigma_3\sigma_2^{-3/2}\,|E_hE_kE_{h-k}|\big). \qquad (5.46)$$
When several relations for the same S(h) exist
$$P^+(E_h) = \tfrac12 + \tfrac12\tanh\Big(\sigma_3\sigma_2^{-3/2}\,|E_h|\sum_j E_{k_j}E_{h-k_j}\Big). \qquad (5.48)$$
Figure 5.14 shows the trend of (5.48); it can be seen that, when several
terms, all with the same sign, contribute to the summation, then the
absolute value of the tanh argument may become rather large and P⁺
approaches the extreme values 1 or 0.
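The behaviour of the tanh expression is easily explored numerically. The following Python sketch is an illustration under a stated assumption: the argument of the tanh is taken in the usual form σ_3σ_2^{-3/2}|E_h| Σ E_k E_{h-k}, with the signed products of the contributing pairs summed; the function name and the parameter `scale` (standing for σ_3σ_2^{-3/2}) are hypothetical.

```python
import math

def sign_probability(scale, e_h, pairs):
    """Probability that E_h is positive.

    scale : value of sigma_3 * sigma_2**(-3/2) for the structure
    e_h   : normalized structure factor of the reflection h (modulus is used)
    pairs : list of signed pairs (E_k, E_{h-k}) contributing to h
    """
    argument = scale * abs(e_h) * sum(e_k * e_hk for e_k, e_hk in pairs)
    return 0.5 + 0.5 * math.tanh(argument)
```

A single relation among three reflections with |E| = 2.21 and a scale of 0.141 already gives a probability above 0.95; five concordant relations among reflections with |E| = 2.0 push the probability very close to 1.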
In the past 10-15 years probability methods have seen important new
developments. Not only has it been possible to improve the estimate of
triplets, but also to derive reliable estimates of other phase relationships,
both s.i. and s.s.
At the basis of the new approaches stands the following principle: 'It is
possible to obtain a good estimate of s.i. or s.s. given "appropriate" sets of
normalized structure factor moduli, which are statistically the most effective
in determining the value of the given s.i. or s.s.'

Fig. 5.13. Trend of the standard deviation, σ, of the distribution (5.45) as a
function of α_h, defined by (5.43).
The first task will then be to identify these moduli, indicated as the
phasing magnitudes, and rank them according to their effectiveness in
estimating s.i. or s.s. Given their set {|E|}, the second task will be to derive
the probability distribution
$$P(\Phi \mid \{|E|\}) \qquad (5.49)$$
where Φ is any phase relationship we want to estimate and the vertical bar
after it stands for: 'given all magnitudes in {|E|}'.
Cochran's formula[24] for triplets, derived in Appendix 5.D, is a trivial
example of such a conditional distribution; in fact it may be seen as
P(Φ_hk | |E_h|, |E_k|, |E_{h-k}|).
Schenk[29,30] extended this principle to the case of four-phase s.i.
(quartets):
$$Q = \varphi_h + \varphi_k + \varphi_l + \varphi_{-h-k-l}. \qquad (5.50)$$
Their distribution, given the four associated magnitudes, derived by
Simerska,[31] indicates that Q ≈ 0, with a variance depending on
$$B = (2/N)\,|E_hE_kE_lE_{h+k+l}|. \qquad (5.51)$$
Because of the 1/N factor any reasonably sized structure will have very
small B values, and for this reason quartets estimated in this way cannot be
used in practice. Schenk pointed out that quartet (5.50) could be considered
as the sum of two triplets, such as T_1 = φ_h + φ_k − φ_{h+k} and
T_2 = φ_l + φ_{h+k} + φ_{-h-k-l}. Then, if |E_{h+k}| is also large, we will have
T_1 ≈ 0 and T_2 ≈ 0 and therefore Q = T_1 + T_2 ≈ 0 with a strengthened
reliability with respect to that indicated by (5.51). In a similar way we can
see that the same quartet can be written as the sum of two other pairs of
triplets, and that Q also depends on |E_{h+l}| and |E_{k+l}|. We can then say
that the quartet not only depends on the four basis magnitudes |E_h|, |E_k|,
|E_l|, |E_{h+k+l}|, but also on the three cross magnitudes |E_{h+k}|, |E_{h+l}|,
|E_{k+l}|. If the last three moduli are also large, then the indication Q ≈ 0 is
strengthened. Empirically it was also found that, when the cross magnitudes
have very small values, then Q ≈ π (since cos Q = −1, these are called
negative quartets).
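The bookkeeping of basis and cross reflections for a quartet can be sketched in a few lines of Python. The code below is an illustration only: the function names, the example indices, and the two thresholds used to call a cross magnitude 'large' or 'very small' are hypothetical and are not prescribed by the text.

```python
def add(*vectors):
    """Component-wise sum of index triples."""
    return tuple(sum(c) for c in zip(*vectors))

def quartet_reflections(h, k, l):
    """Return the four basis and the three cross reflections of a quartet."""
    fourth = tuple(-c for c in add(h, k, l))      # -h-k-l closes the invariant
    basis = [h, k, l, fourth]
    cross = [add(h, k), add(h, l), add(k, l)]
    return basis, cross

def classify_quartet(cross_magnitudes, strong=1.5, weak=0.5):
    """Qualitative indication from the three cross magnitudes |E|.

    `strong` and `weak` are illustrative thresholds (assumed values).
    """
    if all(e >= strong for e in cross_magnitudes):
        return "positive"       # Q close to 0, indication strengthened
    if all(e <= weak for e in cross_magnitudes):
        return "negative"       # Q close to pi
    return "undetermined"
```

For h = (1, 5, 3), k = (4, −5, 1), and l = (2, 1, 2) the fourth basis reflection is (−7, −1, −6) and the cross reflections are (5, 0, 4), (3, 6, 5), and (6, −4, 3).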
Later Hauptman[32] derived the probability distribution of Q, given the
seven basis and cross magnitudes, and confirmed Schenk's empirical
findings. He then formulated the neighbourhood principle,[33] the concept
if the first reflection has k_1 even and l_1 odd and restricted phase 0, π, the
second has h_2 even and phase 0, π, and the third has consequently phase
±π/2, then T = ±π/2, in contrast with the indication T ≈ 0 given by (5.38)
when the three normalized amplitudes are large. The symmetry operators in
P2₁2₁2₁ are: R_1 = I, T_1 = (0, 0, 0); R_2 = (1̄, 1̄, 1) (only the diagonal elements
are indicated, all the others being zero), T_2 = (½, 0, ½); R_3 = (1, 1̄, 1̄),
T_3 = (½, ½, 0); R_4 = (1̄, 1, 1̄), T_4 = (0, ½, ½); and a triplet equivalent to T may
be written as
because l_1 is odd and k_1 even. T and T', forming the first representation of
the triplet, have the same G but opposite phase and their combination in a
phase diagram (similar to that of Fig. 5.12) will give a null vector, which
does not contradict the expected restricted value T = ±π/2. The general
use of the space-group symmetry in estimating triplets has been described
by Giacovazzo, who also showed that for non-primitive cells the 1/√N
factor in (5.39a) should be replaced by 1/√N_p, where N_p is the number of
atoms in the primitive cell.
Let us now consider the example of a quartet such as (5.50). As we have
seen, Q in general depends on seven magnitudes, and when no special
symmetry conditions exist, the first representation is formed just by Q and
the seven magnitudes form the first phasing shell. On the other hand, if,
besides the identity, there is another symmetry operator C_s = (R_s, T_s)
which leaves some special reflections unchanged (H = HR_s), and if one of the
cross reflections (e.g. h + k) is of this special type, then the quartet
$$Q' = \varphi_{hR_s} + \varphi_{kR_s} + \varphi_l + \varphi_{-h-k-l}$$
is equivalent to Q [Q' − Q = 2π(h + k)T_s], but with two new cross terms
|E_{hR_s+l}| and |E_{kR_s+l}|. In this case the first representation is formed by Q and
Q' and the first phasing shell contains nine magnitudes. Since the larger the
number of phasing magnitudes the better the estimate of the s.i., a
significant advantage is obtained by considering also Q'.
A numerical example will further clarify the above ideas; let us consider
the quartet
$$Q = \varphi_{153} + \varphi_{4\bar{5}1} + \varphi_{212} + \varphi_{\bar{7}\bar{1}\bar{6}} \qquad (\text{cross: } 504,\ 365,\ 6\bar{4}3)$$
in the space group P2₁/c, for which R_1 = I, R_2 = (1̄, 1, 1̄), R_3 = (1̄, 1̄, 1̄),
R_4 = (1, 1̄, 1). The first cross reflection is such that (504)R_4 = (504), and an
equivalent quartet may be set up
where H is any reciprocal lattice vector for which |E_H| is large. Since the
term added to T is null, then C = T, but the quintet will depend on four
basis magnitudes |E_h|, |E_k|, |E_{h+k}|, |E_H| and on six cross terms |E_{h±H}|,
|E_{k±H}|, |E_{h+k±H}|. If M vectors H are selected (80-100 reflections with
largest |E_H|), then the second representation will be formed by M quintets
and the second phasing shell will contain 10 × M magnitudes.
Let us now consider a general n-phase s.s.
In general (the few exceptions are beyond our scope) it is possible to find a
phase φ_H and two symmetry operators C_i and C_j such that
then have
$$P^+(E_{2h,0,2l}) = \tfrac12 + \tfrac12\tanh\Big\{\tfrac12\,\sigma_3\sigma_2^{-3/2}\,|E_{2h,0,2l}|\sum_k (-1)^k\,(|E_{hkl}|^2 - 1)\Big\} \qquad (5.57)$$
where the sum is over all values of k. The term (−1)^k = exp(iπk)
corresponds to the constant angle πk and is obtained by applying (3.37) to
express the phase relation between φ_{hkl} and φ_{h̄kl̄} in (5.35). When all major
terms in the sum have k even, then P⁺(E_{2h,0,2l}) > ½ and S(2h, 0, 2l) = +;
when they have k odd, then P⁺(E_{2h,0,2l}) < ½ and S(2h, 0, 2l) = −.
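The way in which the parity of k drives the sign indication of (5.57) can be seen with a short Python sketch. The function name and the input layout are hypothetical; `scale` stands for σ_3σ_2^{-3/2}, and the factor ½ in the tanh argument is assumed as in the usual form of this formula.

```python
import math

def sigma1_probability(scale, e_target, contributors):
    """Probability that E_{2h,0,2l} is positive.

    scale        : sigma_3 * sigma_2**(-3/2)
    e_target     : |E_{2h,0,2l}|
    contributors : list of (k, |E_hkl|) pairs entering the sum over k
    """
    total = 0.0
    for k, e in contributors:
        parity_sign = 1.0 if k % 2 == 0 else -1.0     # (-1)**k
        total += parity_sign * (e * e - 1.0)
    return 0.5 + 0.5 * math.tanh(0.5 * scale * abs(e_target) * total)
```

Strong contributors with k even give a probability above ½ (positive sign), strong contributors with k odd a probability below ½ (negative sign), and weak reflections with |E| < 1 act in the opposite sense.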
Formulae (5.56) and (5.57) and those for some other space groups were
derived long before the concept of representation was introduced, and they
were known as the Σ₁ formulae.[20]
The second representation of a s.s. will also be illustrated by means of an
example. In space group P2₁2₁2₁ let us consider the s.s. Φ = φ_{406}, for which
the first representation (5.55) reduces to the set of triplets
with q varying over all reflections with |E_q| large. The second phasing shell
will include |E_H| = |E_{406}|, a term |E_h| = |E_{2k3}| for each triplet (5.58) and, for
each quintet, |E_q| and the four cross terms |E_{H±q}|, |E_{hR±q}|.
Once the phasing magnitudes have been identified it becomes possible to
derive the appropriate probability distribution (5.49). So far formulae have
been derived for estimating:
(1) one-phase s.s. by the second representation;
(2) two-phase s.s. by the first representation;
(3) triplets by the second representation (P10 formula);
(4) quartets by the first representation.
In most cases, the derived probability distributions for non-
centrosymmetric structures have the form of a von Mises function as (5.38)
and for centrosymmetric structures the tanh form of (5.48); the concentra-
tion parameter G or the tanh argument are now substituted by more
complex functions of the considered phasing magnitudes, which will not be
reported. Only their practical use will be illustrated in the following
sections.
the absolute value) of the s.i. and s.s. In order to obtain all phases referred
to one form it will be necessary to fix the enantiomorph. The phase
determination process can then be summarized as shown in the flow
diagram of Fig. 5.15.

Fig. 5.15. Flow diagram of the phase determination process by direct methods.

The choice of the space group with the corresponding set of symmetry
operators already imposes some restrictions on the points where the origin
may be located. Indeed, only the sites with the same point symmetry will
be suitable and represent the so-called permissible origins.
We will then have to choose the origin among the permissible ones and
this may be performed by fixing the value of a limited number of suitable
phases. In order to show how this is possible, let us first consider a
one-dimensional structure (Fig. 5.16(a)) formed by three atoms of different
atomic numbers (Z_1 = 3, Z_2 = 2, Z_3 = 1 and ΣZ_j = 6). The unitary structure
factor for this structure, referred to an origin at X_0, will be
$$U_h = A_h + iB_h = (1/6)\sum_{j=1}^{3} Z_j\cos 2\pi h(x_j - X_0) + (i/6)\sum_{j=1}^{3} Z_j\sin 2\pi h(x_j - X_0). \qquad (5.60)$$
When the origin is shifted so that X_0 varies from zero to the identity period
a, the trigonometric terms in (5.60) will change and A_h, B_h, and therefore
U_h, will follow these changes. The modulus |U_h| does not depend on the
position of the origin, and only the phase φ_h will depend on X_0. Figure
5.16(b) illustrates this variation in the case h = 1: we can see that at X_0 = 0,
φ_1 = 109°, while at X_0 = 0.7, φ_1 = 0°. A change of enantiomorph is achieved
by inverting the direction of the x axis; again |U_h| will not change, while all
phases will change their sign (φ → 360° − φ).
Since there is only one value of X_0 for which φ_1 = 0°, the origin is
uniquely defined by fixing φ_1 = 0°. In Fig. 5.16(c) the variation of φ_2 with
X_0 is also shown: it can be seen that φ_2 goes to zero at two points, X_0 = 0.26
and X_0 = 0.76, at a distance a/2 from each other. Fixing φ_2 = 0° does not
define the origin uniquely, but only restricts its possible position to two
points; in general, by fixing φ_h = 0°, the origin is restricted to h possible
positions. It should finally be observed that at X_0 = 0.7 (position for which
φ_1 = 0°) φ_2 = 315°, and a change of enantiomorph will give φ_2 = 45°. It can
then be seen that the enantiomorph can be chosen by restricting the value of
φ_2 (or in general of another phase φ_h) within the interval 0-π (or, as in our
case, π-2π).

Fig. 5.16. (a) One-dimensional structure made of three non-equal atoms;
(b) values of the phase φ_1 for different origin positions; (c) values of the
phase φ_2 for different origin positions.
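The one-dimensional argument can be reproduced with a few lines of Python. In the sketch below the atomic numbers 3, 2, 1 are those of the example, while the three coordinates are hypothetical (the coordinates of Fig. 5.16 are not given in the text), so the numerical phases differ from those quoted above; only the qualitative behaviour is illustrated. The phase is taken to vary as φ_h(X_0) = φ_h(0) − 2πhX_0, which follows from the (x_j − X_0) form of (5.60).

```python
import cmath
import math

# (Z_j, x_j): atomic numbers from the text, coordinates assumed for illustration.
ATOMS = [(3, 0.15), (2, 0.40), (1, 0.70)]
Z_TOTAL = sum(z for z, _ in ATOMS)

def unitary_structure_factor(h, x0):
    """U_h referred to an origin at X0."""
    return sum(z * cmath.exp(2j * math.pi * h * (x - x0)) for z, x in ATOMS) / Z_TOTAL

def origins_with_zero_phase(h):
    """Origin positions in [0, 1) at which the phase of U_h vanishes (h > 0)."""
    phi0 = cmath.phase(unitary_structure_factor(h, 0.0))
    return sorted(((phi0 / (2 * math.pi) + n) / h) % 1.0 for n in range(h))
```

`origins_with_zero_phase(1)` returns a single position, `origins_with_zero_phase(2)` two positions half a period apart, and in general h positions are found, as stated above.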
Let us now generalize to three dimensions the procedure just described to
fix the origin in one dimension. We will first consider the space group P1, in
which any point of the unit cell is a permissible origin and therefore no
semi-invariants exist which are not at the same time invariants. In analogy
with the one-dimensional case, we can uniquely fix the origin along the x
axis by fixing the phase of the (100) reflection and thus restricting the
possible origins to lie on planes parallel to (100). Similarly the origin may be
fixed in the other two directions by fixing the phases φ_{010} and φ_{001}. The
three reflections (100), (010), and (001) define a primitive cell in the
reciprocal lattice, but in a triclinic lattice there are infinite ways in which a
primitive cell may be chosen. Indeed, any three non-coplanar vectors
The three vectors will then form a primitive cell if
$$\Delta = \begin{vmatrix} h_1 & k_1 & l_1 \\ h_2 & k_2 & l_2 \\ h_3 & k_3 & l_3 \end{vmatrix} = \pm 1. \qquad (5.63)$$
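The primitivity condition is simply a 3 × 3 integer determinant and can be checked as in the following Python sketch; the function names are hypothetical and the reflections used in the examples are arbitrary.

```python
def determinant(h1, h2, h3):
    """Determinant of the matrix whose rows are the three index triples."""
    (a, b, c), (d, e, f), (g, h, i) = h1, h2, h3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fixes_origin_uniquely(h1, h2, h3):
    """True when the three reflections define a primitive reciprocal cell."""
    return abs(determinant(h1, h2, h3)) == 1
```

The triple (100), (010), (001) obviously satisfies the condition, and so does (210), (110), (001); the triple (110), (011), (101) has a determinant of 2 and does not.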
By fixing φ_{H_1} = 0° we will restrict the possible origins to lie on planes
parallel to the crystallographic planes of indices H_1. If we also fix φ_{H_2} = 0°,
the possible origins will be at the same time restricted to lie on planes
parallel to the planes H_2, i.e. they will lie on the intersection lines between
the two sets of planes. Finally, by fixing φ_{H_3} = 0° (with H_3 ≠ mH_1 + nH_2, m,
n positive or negative integers) the origin will be further restricted to be at
the intersection points of the above lines with the planes parallel to the H_3
crystallographic planes. The number of such points within the unit cell is
given by the value of Δ, and only when the primitivity condition (5.63) is
obeyed will the three phases fix the origin in a unique way.

Fig. 5.17. Position of the eight distinct inversion centres in a P1̄ unit cell,
corresponding to the positions of the permissible origins, numbered as in
Table 5.5.

In the space group P1̄ the permissible origins lie on the eight distinct
inversion centres in the unit cell (Fig. 5.17). When the origin is shifted, for
instance, from (0, 0, 0) to (½, 0, 0), the phase φ_H of the reflection H = (hkl)
will change by −πh; this is equivalent to saying that the sign of the
reflection will change or not depending on whether h is odd (u for ungerade
in German) or even (g for gerade). Any change of origin among the eight
permissible ones will have an effect on the sign of a reflection H which will
depend on the parity of the three components h, k, and l (cf. Table 5.5).

Table 5.5. Sign variations for the reflections divided into parity groups, when
the origin is placed at the different inversion centres in the P1̄ cell of Fig. 5.17
Reflections of type ggg are structure semi-invariants and may not be used to
distinguish among the possible origins. If we consider a reflection of
different parity, e.g. ggu, by imposing that its sign must be +, we restrict
the possible origins to lie on four points (in Table 5.5 these are the origins 1,
3, 5, 7). In order to further reduce the ambiguity we will have to fix the sign
of a reflection of different parity (not ggg), e.g. ugg; with reference to Table
5.5, by fixing the sign to be +, the possible origins are restricted to points 1
and 3. In order to fix the origin uniquely, we will have to fix the sign of a
third different reflection. Its parity should not only be different from that of
the two already chosen, but it must also be different from ugu (for which
both origins 1 and 3 have a + sign); in fact the combination ggu + ugg +
ugu = ggg is a s.s. We will, for instance, choose uug.
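The parity argument can be made explicit by enumerating the eight permissible origins. The Python sketch below is an illustration: the origins are represented by their shifts (0 or 1, in units of one half) along the three axes, and their ordering is an arbitrary choice that need not coincide with the numbering of Table 5.5.

```python
from itertools import product

# Eight inversion centres: shift of 0 or 1/2 along each axis (ordering assumed).
ORIGINS = list(product((0, 1), repeat=3))

def parity(hkl):
    """Parity group of a reflection, e.g. (0, 1, 2) -> 'gug'."""
    return "".join("g" if index % 2 == 0 else "u" for index in hkl)

def sign_change(parity_group, origin):
    """+1 if the sign is unchanged at this origin, -1 if it is reversed."""
    odd_hits = sum(1 for p, s in zip(parity_group, origin) if p == "u" and s == 1)
    return -1 if odd_hits % 2 else 1

def surviving_origins(parity_groups):
    """Origins at which all the listed parity groups keep a + sign."""
    return [o for o in ORIGINS
            if all(sign_change(p, o) == 1 for p in parity_groups)]
```

Fixing ggu leaves four origins, adding ugg leaves two, and a third reflection of parity uug leaves only one, whereas ugu would leave the ambiguity unresolved.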
The above rules given for P1̄ are also valid for all primitive centrosymmetric
space groups with symmetry not higher than orthorhombic.
Similar procedures may be devised for the other space groups. If along a
given direction the origin is restricted to points separated by ½, then it can
be fixed by fixing the phase of a reflection with an appropriate parity. When
the origin can be shifted in a continuous way along an axis, it is possible to
fix it by using a reflection the phase of which only takes unique values within
the corresponding unit period.
Thus, in space group P2₁ (Fig. 5.18), with the b axis parallel to the
twofold screw axis, the permissible origins are all the points on the four 2₁
axes at (0, y, 0), (½, y, 0), (0, y, ½), (½, y, ½). The choice of the twofold screw
axis corresponds to that of one of the four inversion centres on the
projection along y; the projection reflections h0l have restricted phases 0,
π. The phases φ_{g0g} are s.s. and indeed the (g0g) crystallographic planes pass
through all permissible origins. On the other hand, a phase φ_{u0g} will have,
for instance, a zero value if the origin is chosen at (0, y, 0) or (0, y, ½), and a
π value if the origin is on the other two screw axes. By fixing φ_{u0g} = 0° the
origins are restricted to the first two 2₁ axes. A second phase of type φ_{g0u}
(or φ_{u0u}) will assume a zero value at (0, y, 0) and a π value at (0, y, ½), and
by fixing φ_{g0u} = 0° the origins are restricted to lie on the first screw axis. In
order to fix the origin uniquely along the y axis, we will have to fix a phase
of type φ_{h1l}, because the (h1l) planes intersect the y axis only once within
the period b.

Fig. 5.18. Projection along the y axis of a P2₁ cell with the twofold screw
axis parallel to b.
Let us finally consider the very common space group P2₁2₁2₁, in which the
permissible origins are located at the eight points midway between the three
orthogonal, non-intersecting 2₁ axes. The origin may be fixed in a simple
way by using reflections belonging to the three principal zones (0kl), (h0l),
and (hk0). These reflections have phases restricted to two values: 0, π or
±π/2 depending on whether the index following the zero, in a cyclic way, is
even or odd (cf. Chapter 3, pp. 157-8). It is therefore possible to apply
the rules derived for the space group P1̄: the origin may be fixed by fixing
the phases of three zone reflections belonging to three linearly independent
parity groups (not ggg).
In the examples considered so far, we have always used three phases to
fix the origin, and this is true for all primitive space groups up to the
orthorhombic system. In the centred space groups some of the permissible
origins are related by translational symmetry and are indistinguishable. For
this reason the number of phases needed to fix the origin is reduced and, at
the same time, so is the number of allowed parity groups. Thus, for
instance, in a C-centred lattice all ugg, ugu, gug, and guu reflections are
systematically absent and the origin is fixed by fixing the phases or the signs
of two reflections belonging to the other four parity groups.
Going from one enantiomorphic form to the other will change the sign of
all individual phases and, as a consequence, the sign of all linear
combinations of phases. Therefore, the most general way of fixing the
enantiomorph will be that of restricting a suitable s.i. or s.s. within the
interval 0-π.
Let us for instance, consider the space group P21. Suppose that the two
phases q,, and qZl3have been assigned zero value to fix the origin; they
form a triplet invariant with qjil
Normalization
By the method described on pp. 323-4 and using equation (5.15), the values of
the normalized structure factors are first calculated. Most computer
programs will supply a list of reflections sorted in decreasing order of |E|
and perform a statistical analysis of the normalized amplitudes as shown
earlier. In some cases a more detailed statistical analysis is carried out in
order to reveal the presence of pseudo-translational symmetry (cf. Appendix
5.E).[48-52]
Some of the most recent programs allow one to introduce the available a
priori information, such as the existence of pseudo-translational symmetry
or the coordinates of a previously located fragment;[53,54] examples of the
use of this information will be given later.
and in (5.65) the signs are of no use, while G represents the tanh argument
in (5.46) or the corresponding value from the second representation.
where Qz, = ( q h )- qkl- q h p k l ZQhk1. Since at the beginning the phases are
not known, α_h cannot be computed, but it can be estimated a priori by
and it can be computed for each reflection, before any phase information
has been obtained, using the G values only.
The convergence process is a step process in which, at each step, the
reflection with minimum ⟨α_h⟩ is (temporarily) eliminated, provided it is not
an accepted one-phase s.s. or another reflection already included in the
starting set. When a reflection is eliminated, at the same time, all phase
relationships contributing to it are eliminated and the ⟨α_h⟩ values of the
other reflections involved in these relations are updated. Since, at each step,
the reflection which is least related to the remaining reflections is eliminated,
the process must converge towards the group of reflections which are most
strongly interrelated and are therefore the most effective in starting the
phase determination process.
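The elimination loop just described can be sketched as follows. This Python code is a simplified illustration and not the procedure of any particular program: the estimate of ⟨α_h⟩ is replaced here by the plain sum of the G values of the relations still active for each reflection (a stand-in assumed for the example), ties are broken alphabetically, and the check on whether the origin can still be defined is reduced to a user-supplied list of `protected` reflections.

```python
def convergence_order(reflections, relations, protected=()):
    """Return the reflections in order of elimination (weakest first).

    reflections : iterable of reflection labels
    relations   : list of (G, (label1, label2, label3)) phase relationships
    protected   : labels that must never be eliminated (assumed mechanism)
    """
    active = list(relations)
    remaining = set(reflections)
    keep = set(protected)
    order = []
    while remaining - keep:
        # Stand-in for <alpha_h>: sum of G over the relations still active.
        alpha = {r: 0.0 for r in remaining}
        for g, members in active:
            for r in members:
                alpha[r] += g
        weakest = min(remaining - keep, key=lambda r: (alpha[r], str(r)))
        order.append(weakest)
        remaining.discard(weakest)
        # Every relation involving the eliminated reflection is dropped.
        active = [(g, m) for g, m in active if weakest not in m]
    return order

TOY_RELATIONS = [
    (3.0, ("A", "B", "C")),
    (2.5, ("A", "B", "D")),
    (1.0, ("C", "D", "E")),
    (0.8, ("B", "D", "E")),
]
```

With the toy data, reflection E is eliminated first and the reflections eliminated last are the natural candidates for the starting set.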
Fig. 5.21. Flow diagram of the convergence procedure.
Figures of merit
As we shall see, the phase determination process usually leads to more than
one solution. Given several sets of phases it would be rather time
consuming to compute and interpret all the corresponding electron density
maps to see which yield the correct structure. It is instead easier to compute
some appropriate functions, called figures of merit (fom), which allow an a
priori estimate of the goodness of each phase set. Several functions have
been proposed[60] and we will analyse those most commonly used.
MABS (absolute fom) represents a measure of the internal consistency of
the employed triplet relationships in estimating the phases. It is defined
as
~ c =e lo0(z
h - (ah)~)be; (5.71)
it should be minimum for the correct set of phases.
ψ_0 fom: this is defined as
Table 5.6. AZOS: list of the 50 reflections with largest |E| value
No. h k l |E| No. h k l |E|
Besides, in the case of complex structures, the E-maps often only show a
partial image of the structure, which will then have to be completed.
We may now illustrate the most common phase determination
procedures.
Symbolic addition method
This method will be illustrated through an example. We will use the same
AZOS structure employed earlier (p. 323); the 50 reflections with largest
|E| value are listed in Table 5.6. When using all sign relations of type (5.47)
relating these reflections, the convergence procedure defines the following
starting set of signs:

                              No.    h    k    l   Sign   Parity
Origin-fixing reflections       3    0    1    2    +     gug
                                8    1    1    2    +     uug
                               16   10    4    1    +     ggu
Other reflections with          1    4    0    2    a
symbolic sign                   2   16    0    4    b
                               12    8    6    4    c
In the unit cell there are four molecules of formula C10H11N3O4S and
σ_3σ_2^{-3/2} = 0.141. The probability with which a sign is determined is given by
(5.46) or (5.48) and it can be seen that it is never less than 0.95, the
minimum tanh argument being 0.141 × (2.21)³ = 1.522.
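These figures can be reproduced from the cell content. The Python sketch below assumes that σ_n is the sum of Z^n over all atoms in the unit cell, hydrogen atoms included (an assumption made here, which reproduces the quoted value), and that 2.21 is the smallest |E| among the reflections used.

```python
import math

ATOMIC_NUMBERS = {"C": 6, "H": 1, "N": 7, "O": 8, "S": 16}
FORMULA = {"C": 10, "H": 11, "N": 3, "O": 4, "S": 1}    # C10H11N3O4S
MOLECULES_PER_CELL = 4

def sigma(order):
    """Sum of Z**order over all atoms in the unit cell (assumed definition)."""
    per_molecule = sum(count * ATOMIC_NUMBERS[element] ** order
                       for element, count in FORMULA.items())
    return MOLECULES_PER_CELL * per_molecule

scale = sigma(3) / sigma(2) ** 1.5           # sigma_3 * sigma_2**(-3/2)
min_argument = scale * 2.21 ** 3             # weakest triple of |E| = 2.21
min_probability = 0.5 + 0.5 * math.tanh(min_argument)
```

The computed scale rounds to 0.141 and the corresponding minimum probability is just above 0.95, in agreement with the statement above.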
The sign expansion procedure may be followed in detail in Table 5.7. At
step (17) a new symbol d must be introduced in order to define new signs; at
step (20) we have two different, but not contradictory, symbolic indications
for the sign of the same reflection and their comparison suggests that
symbols b and c correspond to the same sign. Similarly at step (22) we have
the indication that a and b represent opposite signs. Not all the signs of the
first 50 reflections are determined; 10 signs are not defined with only four
symbols. At this point it is not convenient to introduce new symbols, but
rather to use other reflections with smaller |E|. The probability of the
individual sign relations will decrease, but, with 40 signs already defined, we
will have several multiple indications. For AZOS 200 reflections with
|E| ≥ 1.61 were used. As the sign determination is carried out, several other
indications about the values of the symbols are obtained: the indications
b = c and a = −b are confirmed several times and new indications that
a = − and b = + (in agreement with a = −b) are obtained. No indication is
obtained about symbol d. Out of the 2⁴ = 16 possible sign combinations,
only two are consistent with the previous indications: − + + + and
− + + −. Also some contradictory indications are obtained; for instance,
S(165) has seven contributions: −, ab, ab, ab, −, −, +. The first six
indications confirm the relation a = −b, while the seventh, with very low
probability, is clearly contradictory with respect to the others and it will be
neglected. If the determination of one or more signs turns out to have
several contradictory indications of similar probability, it should be
supposed that a wrong choice has been made at some previous step and the
procedure should be reconsidered from the beginning. In our case this did
not happen and it has been possible to define the signs of all reflections
Table 5.7. Step by step illustration of the symbolic addition procedure for the 50
reflections of Table 5.6
using four symbols. When the most probable values are substituted for the
symbols, two sets of signs are obtained for AZOS and the corresponding
electron density maps may be computed. The values of the MABS, R_α, and
ψ_0 foms indicate that the set with d = − is more reliable and, indeed, the
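The algebra of symbolic signs used in this example is easily mechanized. The following Python sketch is an illustration with hypothetical helper names: a symbolic sign is stored as a numerical sign together with a set of symbols, products of signs cancel repeated symbols, and the indications quoted for AZOS (b = c, a = −b, a = −, b = +) are used to enumerate the sign combinations that remain possible.

```python
from itertools import product

SYMBOLS = "abcd"

def sym(text):
    """Parse e.g. '-ab' into (-1, {'a', 'b'}); '+' and '-' are pure signs."""
    sign = -1 if text.startswith("-") else 1
    return sign, frozenset(ch for ch in text if ch in SYMBOLS)

def multiply(s1, s2):
    """Product of two symbolic signs; a symbol appearing twice cancels."""
    return s1[0] * s2[0], s1[1] ^ s2[1]

def evaluate(symbolic, assignment):
    """Numerical sign obtained when each symbol is given the value +1 or -1."""
    value = symbolic[0]
    for symbol in symbolic[1]:
        value *= assignment[symbol]
    return value

# Indications quoted in the text, each as a pair of signs expected to be equal.
INDICATIONS = [(sym("b"), sym("c")), (sym("a"), sym("-b")),
               (sym("a"), sym("-")), (sym("b"), sym("+"))]

def consistent_assignments(indications):
    """All assignments of +1/-1 to the symbols that satisfy every indication."""
    result = []
    for values in product((1, -1), repeat=len(SYMBOLS)):
        assignment = dict(zip(SYMBOLS, values))
        if all(evaluate(left, assignment) == evaluate(right, assignment)
               for left, right in indications):
            result.append(assignment)
    return result
```

Of the 16 possible combinations only two survive, differing in the value of d alone, which is the situation described for AZOS.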
Multisolution methods
The basic idea of these methods is that of assigning approximate numerical
values to the starting phases, instead of using symbols. It has been found
empirically that initial errors of 40-50° usually do not spoil structure
solutions. The starting set, defined by the convergence procedure, will
include, besides the origin and, when needed, enantiomorph fixing
reflections, also a limited number of other phases necessary to initiate the
phase expansion process by means of the tangent formula (5.44). If these are
general phases with values anywhere between 0 and 2π, we may tentatively
give them the four quadrant values: ±π/4, ±3π/4. One of these will be
correct within 45°. All restricted phases are assigned the values defined by
the space-group symmetry; e.g. 0, π, or ±π/2.

Fig. 5.22. AZOS: representation and interpretation of the E-map computed
with the best sign set obtained by application of the symbolic addition
procedure.
This number grows very rapidly with increasing $n_g$ and $n_s$, and only by
limiting the number of starting-set reflections is it possible to keep the
computing time within reasonable limits. This limitation can be greatly reduced
by using the so-called magic integers (described in Appendix 5.F),
which allow a considerable reduction of the number of combinations with a
minimum increase in the phase error.
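The growth of the number of trial phase sets can be made concrete with a short sketch. It assumes, following the description above, that each of the $n_g$ general phases is given the four quadrant values and each of the $n_s$ restricted phases its two symmetry-allowed values, so that the count is $4^{n_g} \times 2^{n_s}$. The function names are illustrative only and do not belong to any existing program.

```python
import itertools
import math

# Four quadrant values assigned to every general phase (radians).
QUADRANT_VALUES = (math.pi / 4, 3 * math.pi / 4, -3 * math.pi / 4, -math.pi / 4)


def number_of_trial_sets(n_general, n_restricted):
    """Count the phase permutations: 4 per general phase, 2 per restricted one."""
    return 4 ** n_general * 2 ** n_restricted


def trial_phase_sets(n_general, restricted_choices):
    """Yield every trial starting set.

    restricted_choices is a list of 2-tuples holding the two values allowed
    by the space-group symmetry for each restricted phase, e.g. (0, pi).
    """
    pools = [QUADRANT_VALUES] * n_general + [tuple(c) for c in restricted_choices]
    return itertools.product(*pools)


if __name__ == "__main__":
    for n_g in range(1, 7):
        print(n_g, "general phases, 2 restricted:", number_of_trial_sets(n_g, 2))
```

Each added general phase multiplies the computing effort by four, which is why the starting set has to be kept small.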
Most multisolution computer programs, such as MULTAN,[~~] are mainly
based on the use of triplets estimated by Cochran's formula, to which a few
one-phase s.s. estimated by the $\Sigma_1$ formulae may be added. With these
programs it is possible to solve structures with up to 60-70 atoms in the
asymmetric unit. Although more complex structures have also been solved,
simpler structures sometimes cannot be solved. In fact,
because of the rather crude probabilistic estimate of triplets, it may happen
that at some stage of the 'chain' phase-expansion process some triplets with
an actual value quite far from zero are used; as a result the determined
phases are completely wrong. In recent years several new developments
have been proposed in order to overcome these problems and to make
direct methods capable of solving increasingly complex structures. In
Appendix 5.G the most promising developments of the multisolution
techniques are outlined, while in the following we will describe, through a
practical example, the use of the SIR (semi-invariant representation)
program;[67] this program is based on the multisolution strategy strengthened
by the use of all phase relationships for which a reliable estimate may
be obtained by means of the representation method.
We will follow in detail the solution of the structure of the antibiotic
21-acetoxy-11-(R)-rifamicinol (RIFOL), C39H49NO13·CH3OH·H2O,
which crystallizes in the space group P2, with Z = 2. Its structural formula is
estimated using their first representation; all of them are used to compute
SS2FOM, while 129 with |G| > 0.6 are actively used in the phase
determination procedure.
Triplet invariants relating the 362 strongest reflections are set up and
estimated by means of their second representation (PI0 formula).[4s1The
concentration parameter of the von Mises distribution is given by
Table 5.8. RIFOL: starting set of phases defined by the convergence procedure
Origin 5 5 0
15 2 0
4 6 1
Permuted 18 6 4
72 10 4
149 3 1
25 1 2
17 8 4
Total number of permutations 24
where the summations over t extend to all triplets linking $\varphi_h$ to two other
known or previously determined phases, while the sums over s refer to
two-phase s.s. The weight attributed to each relation is calculated from the
$\alpha = (T^2 + B^2)^{1/2}$ values of the reflections contributing to it. At the
beginning the origin-fixing reflections and the restricted permuted phases
are given $\alpha = 100$, one-phase s.s. (if any) are assigned their G values, while
the general phases represented by magic integers are given an $\alpha$ value which
depends on the root mean square deviation of the representation. Starting
from the eight reflections in the starting set, the phase extension is carried
out following the order indicated by the convergence procedure. As we
have seen, the reflections eliminated at the end are those best related to the
starting set; the phase determination process should therefore be carried out
in an order inverse to that of elimination. In Table 5.9 the final part of the
inverted convergence map (divergence map) of RIFOL is reported and
with its aid we can follow the phase determination process.
Reflection 24 is the first to be determined by a single triplet relation
(written in the form of (5.65)); it will be used to define the following
reflections with its corresponding $\alpha$ value. The second reflection, no. 99,
forms a two-phase s.s. with 24 (written as a triplet but with an asterisk
instead of the third code number) and is related by a triplet to the starting
set; each summation in (5.78) will have one term. The remaining reflections
are then determined by a similar chain process.
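The chain process can be illustrated with a minimal sketch of a weighted tangent step. It is an assumption-laden simplification: only the triplet sums are kept (the two-phase s.s. sums of (5.78) are omitted), each contributor is weighted by a user-supplied weight times $|E_k E_{h-k}|$, and the function name is hypothetical.

```python
import math


def tangent_phase(contributors):
    """Estimate a new phase from known ones by a weighted tangent step.

    contributors: iterable of (weight, e_product, phi_k, phi_h_minus_k), where
    e_product stands for |E_k E_{h-k}| and the phases are in radians.
    Returns (phi_h, alpha) with alpha = (T^2 + B^2)^(1/2).
    """
    contributors = list(contributors)
    T = sum(w * e * math.sin(pk + pq) for w, e, pk, pq in contributors)
    B = sum(w * e * math.cos(pk + pq) for w, e, pk, pq in contributors)
    return math.atan2(T, B), math.hypot(T, B)


if __name__ == "__main__":
    # A single triplet defines the first new phase; a second step adds one more term.
    phi, alpha = tangent_phase([(1.0, 2.0, 0.3, 0.4)])
    print(f"one contributor:  phi = {phi:.3f} rad, alpha = {alpha:.3f}")
    phi, alpha = tangent_phase([(1.0, 2.0, 0.3, 0.4), (1.0, 1.5, 1.0, -0.2)])
    print(f"two contributors: phi = {phi:.3f} rad, alpha = {alpha:.3f}")
```

A phase determined by several consistent contributors acquires a larger $\alpha$ and therefore more influence on the phases determined after it.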
When a sufficiently large number (60-100) of phases has been deter-
mined, one should proceed to their refinement. It is in fact possible to
redetermine the initial phases with a greater number of contributors to the
sums in (5.78); for instance the phase of reflection 24 can now be
determined using all relations involving 24 and two other reflections among
the 60-100 determined ones. The tangent refinement process is repeated
until self-consistency and then the remaining phases are determined and
refined. At this point different foms are computed.
Table 5.9. RIFOL: divergence map illustrating the phase determination path starting
from the set of reflections in Table 5.8
Table 5.10. RIFOL: list of the ten best sets of phases with their relative figures of merit
CFOM yields the E-map shown in Fig. 5.23, which may be interpreted in
terms of the RIFOL structure as indicated by the connected circled maxima.
Not all atoms are found in the map (solvent molecules plus eight atoms are
missing, two to close the macrocycle and the others in the side chains), but
the structure can be easily completed by the methods described in the next
paragraph. The final refined structure is shown in Fig. 5.24.
model is formed by one or more heavy atoms. The same method can be
used to complete a molecular fragment when all atoms have approximately
the same weight. If 50-60 per cent of the electron density has been located
with sufficient accuracy it is quite easy to complete the structure. If the
initial model only contains a smaller percentage of the electron density, the
method can still be applied but the Fourier coefficients should be corrected
by appropriate statistical weights, such as those proposed by Sim,
taking into account the different contribution of the known atoms to the
different structure factors. The derivation of these weights is given in
Appendix 5.H, where some other procedures for completing a partial model
will also be described. The Fourier cycles not only allow the location of the
new atoms but also the improvement of the positions of the model atoms.
For very small fragments the direct method procedures mentioned at the
end of the last paragraph are usually easier to apply.
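will show maxima at the positions of the atoms of the given model, while a
series with coefficients $F_h^o = |F_h^o| \exp(i\varphi_h^{\mathrm{true}})$

$$\rho_o(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{h}} F_h^o \exp(-2\pi i\,\mathbf{h}\cdot\mathbf{r}) \qquad (5.80)$$

represents the true structure. In order to see how much the initial model
deviates from the real structure, the difference series

$$\Delta\rho(\mathbf{r}) = \rho_o(\mathbf{r}) - \rho_c(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{h}} (F_h^o - F_h^c) \exp(-2\pi i\,\mathbf{h}\cdot\mathbf{r}) \qquad (5.81)$$

should be computed. Unfortunately the values of $\varphi_h^{\mathrm{true}}$ are not known and
we have to assume $\varphi_h^{\mathrm{true}} \approx \varphi_h^c$; this approximation, illustrated in the Argand
diagram of Fig. 5.25, holds better the better the initial model is.
Equation (5.81) then becomes

$$\Delta\rho(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{h}} (|F_h^o| - |F_h^c|) \exp(-2\pi i\,\mathbf{h}\cdot\mathbf{r} + i\varphi_h^c). \qquad (5.82)$$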
If in the model an atom is missing, then $\rho_c(\mathbf{r})$ will be zero at the
corresponding position, while $\rho_o(\mathbf{r})$ will show a maximum. The difference
synthesis will also show a peak at the same position but it will be almost
zero at the positions of the model atoms (if these are correct) where
$\rho_o(\mathbf{r}) = \rho_c(\mathbf{r})$.
An important property of the difference syntheses is that they are almost
unaffected by series truncation errors. Indeed, because of the limited
number of observations, the Fourier maps computed by means of (5.79) and
(5.80) will show some ripples around each peak (see Fig. 3.16), the size of
which increases with increasing peak height. As a consequence a light atom
close to a heavy atom may be obscured by its ripples. Since the number of
terms in the two series (5.79) and (5.80) is the same, the truncation errors
will also be approximately the same and will cancel out in the difference
(5.82).

Fig. 5.25. Illustration of the approximation $\varphi_h^{\mathrm{true}} \approx \varphi_h^c$.
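The behaviour of the difference synthesis (5.82) can be checked numerically on a toy case. The sketch below is a one-dimensional illustration under stated assumptions: a hypothetical centrosymmetric cell of unit length with point atoms, two heavy atoms (weight 16) at x = ±0.1 forming the model, and two light atoms (weight 6) at x = ±0.3 missing from it. None of these values comes from the text.

```python
import numpy as np


def structure_factors(h, x, z):
    """F_h = sum_j z_j exp(2 pi i h x_j) for point atoms in a 1-D cell."""
    return (z[None, :] * np.exp(2j * np.pi * np.outer(h, x))).sum(axis=1)


h = np.arange(-20, 21)

# "True" structure: heavy atoms at +/-0.1, light atoms at +/-0.3.
x_true = np.array([0.1, 0.9, 0.3, 0.7])
z_true = np.array([16.0, 16.0, 6.0, 6.0])

# Partial model: heavy atoms only.
x_model, z_model = x_true[:2], z_true[:2]

F_o = structure_factors(h, x_true, z_true)    # plays the role of the observed data
F_c = structure_factors(h, x_model, z_model)  # calculated from the model

# Coefficients of (5.82): observed minus calculated moduli, model phases.
coeff = (np.abs(F_o) - np.abs(F_c)) * np.exp(1j * np.angle(F_c))

grid = np.arange(1000) / 1000.0
delta_rho = (coeff[None, :] * np.exp(-2j * np.pi * np.outer(grid, h))).sum(axis=1).real

peak = grid[np.argmax(delta_rho)]
print("highest difference peak at x =", peak)
print("difference density at the heavy-atom site x = 0.1:", delta_rho[100])
```

The highest peak falls on a missing light atom, while the map stays close to zero at the heavy-atom site, even though only 41 terms are used in the series.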
Let us now see how the different types of errors in the model are reflected
in a difference synthesis.
1. Missing atoms. We have already seen that they appear as positive
maxima, but, because of the approximation made for the phases, their
height is usually smaller than that corresponding to the atomic number of
the missing atom.
The lack of truncation errors allows the correct localization of light atoms
even when the model contains much heavier atoms. For this reason,
difference Fourier series, computed when the model has been corrected for
most other important errors, allow the localization of hydrogen atoms (with
only one electron they contribute very little to the X-ray diffraction) even in
the presence of medium size atoms (cf. Fig. 5.28).
2. Position errors. Their effect is shown in Fig. 5.26. If $\rho_o$ gives the
correct position of the atom and $\rho_c$ its wrong position in the model, in the
$\Delta\rho$ map the latter will be close to a negative minimum, while the correct
position will be towards the neighbouring positive maximum along the
maximum gradient line. It is possible to have a quantitative estimate of the
shifts to be applied, but in general, when the errors are not too large, it is
easier to correct the position errors by the least-squares methods (cf.
Chapter 2, pp. 90-108, and later in this section).
3. Errors in the thermal parameters. As we have seen in Chapter 3 (p.
148), because of the thermal motion the electron density function around
each atomic nucleus becomes wider. In Fig. 5.27(a) the case in which the
thermal motion has been neglected or underestimated in the model is
represented. The $\rho_o$ density will therefore have a smaller and wider peak
with respect to $\rho_c$, and in the difference synthesis a negative depression,
surrounded by a positive ring, will appear. If, on the other hand, too high a
thermal motion has been assumed for one atom of the model, then $\Delta\rho$ will
show a small positive maximum surrounded by a negative ring.
Finally, the case in which an isotropic thermal motion has been assumed
in the model (the $\rho_c$ maximum has a spherical distribution) while the real
motion is anisotropic (the $\rho_o$ maximum has an ellipsoidal distribution), is
illustrated in Fig. 5.27(b) together with the corresponding difference
synthesis showing two positive maxima and two negative minima; the line
joining the two positive lobes represents the direction of largest thermal
motion. The qualitative indications obtained in this way allow one to
recognize those atoms for which it is more important to carry out the
least-squares refinement varying the six parameters of the anisotropic
thermal motion (cf. Appendix 3.B).

Fig. 5.26. Position error of a model atom (top) and corresponding difference synthesis
(bottom).
Fig. 5.27. Thermal parameter errors and their effect on the difference syntheses; for the atom
of the model it is assumed: (a) too small an isotropic motion; (b) an isotropic motion when
an anisotropic model should be assumed.

Least-squares method
By far the most widely used method of structure refinement is the
least-squares method. The theory and the computing procedures have been
described in Chapter 2, pp. 90-108. Several computer programs have been
implemented to carry out the crystallographic least-squares refinement;
usually they are integrated within complete crystallographic packages such
368 1 Davide Viterbo
as SHELX, XTAL, NRCVAX, CRYSTALS and those commercially
available together with most single-crystal diffractometers. Here we
will illustrate the practical application to the refinement of the fairly small
structure of 2-azoxycyanopyridine (AZOP, C6H4N4O, space group $P\bar{1}$,
Z = 2). The structure was solved using the SIR program;[67] from the best
E-map the coordinates of all the 11 non-hydrogen atoms of the molecule
(cf. Fig. 5.28) were obtained. The structure factors computed with this
model, assuming an isotropic temperature factor equal for all atoms
(U = 0.05 Å²), yield a residual R = 0.34, a sufficiently low value to indicate
the correctness of the model. As we have seen in Chapter 2 (p. 94) the
refinement must be repeated in several cycles until convergence. For AZOP
the refinement was performed using 1176 observed reflections by means of
the system of programs SHELX; the first four cycles were carried out
varying the overall scale factor and the three coordinates plus the isotropic
thermal parameter of each atom; the total number of varied parameters is
1 + 4 × 11 = 45 and the R factor reduces to 0.148. In the next step the
refinement is carried out varying for each atom the six components of the
anisotropic thermal parameter as described on pp. 186-8. Four anisotropic
refinement cycles were performed with 1 + 9 × 11 = 100 varied parameters
and at convergence R = 0.078.
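The parameter bookkeeping quoted for AZOP is easy to reproduce. The short sketch below only restates the arithmetic of the text (one overall scale factor plus 4 or 9 parameters per atom) and adds the observations-to-parameters ratio as a derived figure; the function name is illustrative.

```python
def n_parameters(n_atoms, per_atom, n_global=1):
    """Number of refined parameters: global ones plus per-atom ones."""
    return n_global + per_atom * n_atoms


n_observations = 1176                      # observed reflections used for AZOP
isotropic = n_parameters(11, per_atom=4)   # x, y, z and one isotropic U per atom
anisotropic = n_parameters(11, per_atom=9) # x, y, z and six U_ij per atom

print("isotropic refinement:  ", isotropic, "parameters")
print("anisotropic refinement:", anisotropic, "parameters")
print("observations per parameter (anisotropic):", n_observations / anisotropic)
```

With 1176 observations the anisotropic refinement still has more than eleven observations per refined parameter.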
At this point we may assume that the model formed by the heavier atoms
is sufficiently well refined and we can compute a difference Fourier synthesis
in order to localize the hydrogen atoms. Figure 5.28 shows a projection of
the map onto the mean plane of the AZOP molecule; the known atomic
positions are also shown linked by their chemical bonds, and the hydrogen
atoms are clearly seen as maxima in the map. The coordinates of the four
hydrogen atoms derived from the map are then introduced in the refinement
process with appropriate isotropic temperature factors, but this usually can
not be done in a straightforward way. In fact the contribution of the
GofF =
n-m
are also computed. Equation (5.86) is nothing but the quantity defined in
(2.58) and GofF should be close to unity if the weights of the observations
have been correctly assessed, the errors in the model are negligible in
comparison with the errors in the data and there are no significant
systematic errors. For AZOP $R_w$ = 0.054 and GofF = 1.504.
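As an illustration of how these agreement indices are evaluated, the following sketch assumes the customary definitions $R = \sum ||F_o| - |F_c|| / \sum |F_o|$, $R_w = [\sum w(|F_o| - |F_c|)^2 / \sum w|F_o|^2]^{1/2}$ and GofF $= [\sum w(|F_o| - |F_c|)^2 / (n - m)]^{1/2}$, with n observations and m refined parameters. These forms are an assumption of the example; individual programs may define the indices slightly differently (for instance on $F^2$), and the data are invented.

```python
import numpy as np


def agreement_indices(f_obs, f_calc, weights, n_params):
    """Return (R, R_w, GofF) under the definitions assumed in the lead-in."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    weights = np.asarray(weights, dtype=float)
    delta = np.abs(f_obs) - np.abs(f_calc)
    r = np.sum(np.abs(delta)) / np.sum(np.abs(f_obs))
    wd2 = np.sum(weights * delta ** 2)
    r_w = np.sqrt(wd2 / np.sum(weights * f_obs ** 2))
    goof = np.sqrt(wd2 / (f_obs.size - n_params))
    return float(r), float(r_w), float(goof)


if __name__ == "__main__":
    # Invented moduli, unit weights, one refined parameter.
    r, r_w, goof = agreement_indices([10, 20, 30], [9, 21, 30], [1, 1, 1], n_params=1)
    print(f"R = {r:.4f}, R_w = {r_w:.4f}, GofF = {goof:.3f}")
```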
In the final stages of the refinement one should check whether some
intense low-angle reflections are affected by secondary extinction (cf. p.
97) and have $|F_o|$ systematically smaller than $|F_c|$; some least-squares
programs allow an empirical correction of this effect but it is common
practice to simply discard these few reflections from the refinement. In the
case of AZOP three such reflections were eliminated.
It is also important to check for possible correlations between parameters
during the least-squares refinement. In Chapter 2 we saw that the
least-squares procedure gives the variance-covariance matrix of the derived
parameters (eqn (2.56)). It is therefore straightforward to calculate, for
each term in the matrix, the correlation coefficient
$$\rho_{ij} = \frac{\mathrm{cov}(p_i, p_j)}{\sigma(p_i)\,\sigma(p_j)}$$
which is close to zero if the correlation is negligible and close to 1 for large
correlation. Most programs compute the correlation matrix and output the
off-diagonal elements which are greater than 0.5. No such correlation was
found for AZOP.
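The check described above can be sketched in a few lines: the correlation matrix is obtained by dividing each covariance by the product of the two standard deviations, and the off-diagonal elements above 0.5 are listed. The covariance matrix used here is invented for illustration and the function names are hypothetical.

```python
import numpy as np


def correlation_matrix(cov):
    """rho_ij = cov(p_i, p_j) / (sigma(p_i) sigma(p_j))."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)


def large_correlations(cov, threshold=0.5):
    """List (i, j, rho_ij) for off-diagonal elements with |rho_ij| > threshold."""
    corr = correlation_matrix(np.asarray(cov, dtype=float))
    n = corr.shape[0]
    return [
        (i, j, float(corr[i, j]))
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i, j]) > threshold
    ]


# Invented variance-covariance matrix for three parameters.
cov = np.array([[4.0, 3.6, 0.0],
                [3.6, 9.0, 0.3],
                [0.0, 0.3, 1.0]])
flagged = large_correlations(cov)
print(flagged)
```

Only the pair (0, 1) is reported here, since its correlation coefficient is 3.6/(2 × 3) = 0.6.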
How low should the final R factor be in order to have good confidence in
the quality of the refined model? The answer to this question is not
straightforward, because it depends on the type and complexity of the
structure and on the quality of the experimental data. Nevertheless, for
small or medium size structures giving good quality crystals (as is usually
the case for most organic or organometallic compounds), with intensities
collected at room temperature on a diffractometer, we should expect R
values in the range 0.03-0.07. For very accurate low-temperature measurements
these values may be further reduced. The quality of the model should
then also be assessed by looking at the standard deviations of the atomic
parameters or at those of the derived geometrical quantities, such as bond
distances, which should be less than 0.006 Å for non-hydrogen atoms.
Usually as the complexity of the structure increases the quality of the
crystals and of the diffraction data decreases, and, at convergence, the R
value may be in the range 0.08-0.15 for large molecules and up to 0.25-0.30
for macromolecules solved at low or medium resolution. In these cases the
final model may still contain some fairly large errors, which may often be
corrected by applying stereochemical or energy constraints (cf. Chapter 8,
p. 568).
The final R value will also depend on the selection of reflections used
during the refinement. It is in fact common practice to discard the very
already considered in Chapter 2 (p. 103), where it was shown that the ratio
of the residuals $R_w(1)/R_w(2)$ obtained from two refinements of different
models may be used for testing the significance of the difference between
the two models. For instance this test may be used for determining the
absolute configuration of optically active molecules from the effect of
anomalous dispersion,[83] although, as pointed out by Rogers, some
caution is necessary.
Although the R index is the most widely used criterion to assess the
goodness of a structural model, its indications should be regarded with some
caution. As an example of the weakness of the R index, let us consider the
case of a structure containing one heavy atom, such as Pt, Ir, Os, or Cd,
and a certain number of lighter atoms (C, N, O). Since the R value is less
sensitive to the position of the light atoms, a model with the heavy atom
correctly placed and the light atoms quite inaccurately localized will give a
lower R value than a model with a small error in the position of the heavy
atom and an almost correct position of the light atoms. Nevertheless the
second model, with a smaller average position error and a uniform
distribution of errors over all atoms, is certainly a better one.
When the model is an incomplete representation of the molecule, R
becomes a rather weak guide, as it measures the agreement between
quantities which depend on largely different numbers of parameters ($|F_o|$
depends on all atomic parameters, while $|F_c|$ only depends on the limited
number of parameters defining the incomplete model). The only valid check
of the indication given by the R value is the convergence of the subsequent
process of completing and refining the model.
As mentioned in the introduction of this chapter, the refinement of a
structural model will not converge if the model is affected by large errors.
This is a direct consequence of the non-linear nature of the problem, which,
as observed in Chapter 2 (p. 93), implies the presence of several local
minima in the function to be minimized. Two types of errors, which occur
quite often, are particularly disturbing.
known one may initially assume an occupation factor of 0.5 and then refine
the factor to convergence (the occupation factor is highly correlated to the
temperature factor and some caution should be used in the least-squares
refinement). Only when disorder is treated in a proper way will the
refinement converge to an acceptably low R value.
Absolute configuration
In Chapter 3 (p. 168) it was shown that, when the anomalous dispersion
effect is present, Friedel's law is no longer satisfied. We shall see in Chapter
7 (p. 489) that for compounds containing asymmetric carbon atoms, isomers
of opposite chirality (enantiomers) are possible and that their solutions
rotate the plane of polarized light in opposite directions. By simple chemical
methods it is not possible to decide which of the two configurations
corresponds to the isomer rotating the plane of the light to the right
((+)-rotamer). Fischer proposed to assume as reference that (+)-glyceraldehyde
corresponds to the configuration 5a shown in the table on p.
487 and to establish the configuration of all other chiral molecules in a
relative way with respect to this convention; thus (+)-tartaric acid
corresponds to 7a. Since this is an arbitrary assumption there was only a 50
per cent chance of it being correct. Fortunately the first determination of
the absolute configuration of NaRb-(+)-tartrate[94] by the method described
below (using the anomalous scattering of the Zr Kα radiation by
Rb), proved that the assumption was correct. Since then the anomalous
dispersion method has been used to determine the absolute configuration of
a large number of molecules and it will be shown in Chapter 8 (p. 545) that
the method is also applied in macromolecular crystallography.
The two optical isomers differ in hand and may be related either by an
inversion or by a reflection operation (inverse congruence).†
In a normal X-ray experiment, with no anomalous dispersion, because of
Friedel's law, the two enantiomers are indistinguishable, as they give rise to
the same diffraction pattern. When one or more anomalous scatterers are
present, then both $|F_h|$ and $|F_{-h}|$ should be measured and compared with the
computed values. Since the Bijvoet differences $|F_h|^2 - |F_{-h}|^2$ are rather
small, and anomalous dispersion is only large for heavier atoms near their
absorption edges, it is essential to apply a proper absorption correction to
the diffraction data.
In order to illustrate how the absolute configuration may be determined it
is instructive to consider the procedure originally used by Bijvoet, Peerdeman,
and van Bommel, which gives reliable results only when the
anomalous dispersion effect is sufficiently large and for this reason is no
longer used. The ratios $|F_h^o|/|F_{-h}^o|$ and $|F_h^c|/|F_{-h}^c|$ are tabulated for a limited
number of reflections with large Bijvoet differences; a one-by-one comparison
of the two ratios indicates that the wrong configuration has been
chosen if, when the first ratio is greater than 1, the second is less than 1 and
vice versa. If this is the case, then the correct enantiomer is obtained by
changing the sign of all atomic coordinates.

† We should note that the term 'absolute configuration' is not always correct in describing
crystal structures, as it does not apply to the case of non-centrosymmetric but achiral space
groups (polar groups with reflection symmetry operations, such as $Pna2_1$) or to that of achiral
molecules (without asymmetric centres) crystallizing in non-centrosymmetric space groups,
such as quartz. The term 'absolute structure' has been recently proposed and discussed.
When the anomalous dispersion effect is small all the reflections should be
used and efficient methods have been proposed by and
in the latter an absolute-structure or chirality parameter x is
refined in a least-squares process (cf. p. 97) in which the structure factor is
written as

$$|F(\mathbf{h}, x)|^2 = (1 - x)\,|F_h|^2 + x\,|F_{-h}|^2 \qquad (5.89)$$

where x is close to zero when the model and the crystal have the same
chirality, and approaches 1 if they are inverted with respect to one another.
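Because (5.89) is linear in x, the parameter can be estimated in closed form once the two calculated intensities are available. The sketch below is a simplified, unweighted illustration: it assumes that the scale factor is already known, that $|F_h|^2$ and $|F_{-h}|^2$ come from the current model, and it uses invented intensities. In an actual refinement x is varied together with the other parameters.

```python
import numpy as np


def estimate_chirality_parameter(i_obs, i_h, i_minus_h):
    """Unweighted least-squares estimate of x in
    I_obs = (1 - x) |F_h|^2 + x |F_-h|^2  (eqn 5.89)."""
    i_obs = np.asarray(i_obs, dtype=float)
    i_h = np.asarray(i_h, dtype=float)
    d = np.asarray(i_minus_h, dtype=float) - i_h  # calculated Bijvoet differences
    return float(np.sum((i_obs - i_h) * d) / np.sum(d * d))


# Invented calculated intensities for four Bijvoet pairs.
i_h = np.array([100.0, 250.0, 80.0, 400.0])
i_minus_h = np.array([110.0, 230.0, 95.0, 380.0])

# Synthetic "observations" for a crystal containing 20% of the inverted structure.
x_true = 0.2
i_obs = (1 - x_true) * i_h + x_true * i_minus_h

x_est = estimate_chirality_parameter(i_obs, i_h, i_minus_h)
print("estimated x =", x_est)
```

Reflections with a negligible calculated Bijvoet difference contribute almost nothing to the estimate, which is why the precision of x depends on the strength of the anomalous scattering.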
Appendices
$$\langle x\rangle = \sum_{j=1}^{N}\langle x_j\rangle \quad\text{and}\quad \sigma^2 = \sum_{j=1}^{N}\sigma_j^2 \qquad (5.A.1)$$
where the sum is extended to the independent atoms only so that $F_h$ can be
considered as the sum of N/2 independent random variables
$x_j = 2f_j\cos 2\pi\mathbf{h}\cdot\mathbf{r}_j$ with mean value
The joint probability that A lies between A and A + dA and at the same
time B lies between B and B + dB is given by the product of the two
distributions (5.A.12)

$$P(A, B)\,dA\,dB = P(A)P(B)\,dA\,dB = \frac{1}{\pi\Sigma}\exp\left(-\frac{A^2 + B^2}{\Sigma}\right) dA\,dB. \qquad (5.A.13)$$

Equation (5.A.13) represents the probability that the structure factor F lies
within a region of the complex plane of area dS = dA dB. If the structure
factor is expressed in polar coordinates (i.e. in modulus and phase), then
$dS = |F|\,d|F|\,d\varphi$, and
other symmetry elements besides the inversion centre. Let us, for instance,
consider the effects due to the presence of a mirror plane, by considering
the space group Pm (m perpendicular to the b axis). In Fig. 5.A.1 we can
see that on the projection along the b axis, all atoms related by the m plane
are superimposed; the projection of the electron density on the ac plane will
only contain half of the peaks, each with double weight. The Fourier
coefficients $F_{h0l}$ relative to this projection will have a distribution
corresponding to a structure of N/2 atoms with scattering power $2f_j$ and then
The mean intensity of the h0l reflections will be twice that of the general
reflections hkl. In order to take into account the symmetry element effects,
(5.A.9) is generalized to
where $\varepsilon$ will be equal to 1 for the general reflections and greater than 1 for
certain classes of reflections which are influenced by the presence of
symmetry elements; thus in Pm $\varepsilon = 2$ for the h0l reflections. The values of $\varepsilon$
and the relative reflection classes for the different symmetry elements are
tabulated in the International Tables.[99]
We note that the calculation of the ratio of the average observed intensity
for specific classes of reflections over that of the general reflections can be
used as an indicator of the presence of some symmetry elements. For
instance we can distinguish between the space groups Pm and P2 by
computing the ratio $\langle |F_{h0l}|^2\rangle / \langle |F_{hkl}|^2\rangle$, which will be close to 2 for Pm and
close to 1 for P2.
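The factor of two for the h0l reflections in Pm can be verified with a small simulation. The sketch below is only illustrative: it assumes point atoms of unit scattering power placed at random, 20 independent atoms plus their mirror images (x, -y, z), and an arbitrary index range; the seed and all names are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random structure in Pm (mirror perpendicular to b): atoms at (x, y, z) and (x, -y, z).
n_independent = 20
xyz = rng.random((n_independent, 3))
atoms = np.vstack([xyz, xyz * np.array([1.0, -1.0, 1.0])])


def mean_intensity(hkl):
    """Average |F|^2 over the given reflections for unit point scatterers."""
    F = np.exp(2j * np.pi * (hkl @ atoms.T)).sum(axis=1)
    return float(np.mean(np.abs(F) ** 2))


idx = np.arange(-12, 13)

# h0l reflections (F(000) excluded).
h, l = np.meshgrid(idx, idx, indexing="ij")
h0l = np.column_stack([h.ravel(), np.zeros(h.size, dtype=int), l.ravel()])
h0l = h0l[np.any(h0l != 0, axis=1)]

# Reflections with k different from zero, taken as the general class.
h, k, l = np.meshgrid(idx, np.arange(1, 9), idx, indexing="ij")
hkl = np.column_stack([h.ravel(), k.ravel(), l.ravel()])

ratio = mean_intensity(h0l) / mean_intensity(hkl)
print("<|F_h0l|^2> / <|F_hkl|^2> =", round(ratio, 2))
```

For a single random structure the ratio fluctuates around 2; repeating the calculation without the mirror-related atoms would give a value close to 1.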
Each image represents the ends of all the vectors from the atom at the
origin to all the others. Figure 5.B.1(b) illustrates how the Patterson map of
Fig. 5.4(c) can be interpreted in terms of five displaced images of the
five-atom structure of Fig. 5.4(a), repeated in Fig. 5.B.1(a). It can be easily
seen that the same map can also be interpreted in terms of five displaced
images of the enantiomeric structure.
It is in theory rather simple[101] to extract one image of the structure from
the overlapping set, but, as we shall see, the proposed methods may be
difficult to apply in practice; these are the so-called superposition methods.
Let us first consider the vector superposition in which one Patterson map
is superposed on another in such a way that the origin of the second
coincides with a given vector of the first; the two maps are thus separated by
an interatomic vector. The set of coincident peaks will reveal one or more
different peaks will eventually yield a single image. This is illustrated in Fig.
5.B.2, where superposition of the double image on a new peak reveals one
image only.
The basic principle of the method may be understood by considering that,
if the structure contains atoms 1 , 2 , 3 , . . . , N in the unit cell, the Patterson
map P will be the superposition of the images I(l), I(2), I(3), . . . ,I(N)
obtained by placing atoms 1 , 2 , 3 , . . . , N in turn at the origin, i.e.
$$\Pi(\mathbf{r}) = P(\mathbf{r}) \times P(\mathbf{r} - \mathbf{u}) \qquad (5.B.1)$$
which takes very high values when coincidences occur, but is subject to
considerable background noise and is very sensitive to small errors in the
value of the vector u used for the superposition.
2. The sum function
$$\Sigma(\mathbf{r}) = P(\mathbf{r}) + P(\mathbf{r} - \mathbf{u}) \qquad (5.B.2)$$
which is less sensitive to the errors on u, but can often produce spurious
peaks and in general a high background noise.
3. The minimum function
where P(uj) are the values of the Patterson function at the ends of the
vectors;
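The three superposition functions can be compared on a toy example. The sketch below is a one-dimensional illustration under stated assumptions: four equal point atoms at hypothetical grid positions in a periodic cell of 50 points, a Patterson function obtained as the autocorrelation of the density, and a superposition vector u equal to the vector between the first two atoms. The minimum function is taken here in its simplest form, min[P(r), P(r - u)], which is an assumption of this example.

```python
import numpy as np

n_grid = 50
atom_sites = [3, 10, 24, 41]          # hypothetical atomic positions (grid points)

density = np.zeros(n_grid)
density[atom_sites] = 1.0

# Patterson function = periodic autocorrelation of the density, via FFT.
patterson = np.rint(np.fft.ifft(np.abs(np.fft.fft(density)) ** 2).real)

# Superposition vector: the interatomic vector from atom 0 to atom 1.
u = (atom_sites[1] - atom_sites[0]) % n_grid
shifted = np.roll(patterson, u)        # shifted[r] = P(r - u)

product_fn = patterson * shifted
sum_fn = patterson + shifted
minimum_fn = np.minimum(patterson, shifted)

# Image of the structure as seen from atom 0 (atom 0 moved to the origin).
image = [(s - atom_sites[0]) % n_grid for s in atom_sites]

print("Patterson peaks:        ", np.flatnonzero(patterson).tolist())
print("minimum-function peaks: ", np.flatnonzero(minimum_fn).tolist())
print("expected image of atoms:", sorted(image))
```

The peaks surviving in the product and minimum functions include the image of the structure (together with its inverted copy), whereas the sum function keeps every Patterson peak of both maps and therefore a much higher background.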
will be large and are illustrated in Fig. 5.C.1, where curve 1 shows the term
$+|F_h| \cos 2\pi hx$ ($F_h$ positive), curve 2 the term $-|F_h| \cos 2\pi hx$ ($F_h$ negative),
and curve 3 the term $+|F_{2h}| \cos 2\pi 2hx$ ($F_{2h}$ positive). $F_h$ would then
contribute to the electron density in the region indicated by the letter A, if
positive, and in the region B, if negative; in both cases $F_{2h}$ will contribute to
both regions only with a positive sign. It follows that, with $\rho(x)$ everywhere
positive, if $|F_h|$ and $|F_{2h}|$ are large, whatever the sign of $F_h$, the sign of $F_{2h}$
is more likely positive. This argument can be generalized to three
dimensions.

Fig. 5.C.1. One-dimensional centrosymmetric structure with $|F_h|$ and $|F_{2h}|$ large: curve 1 shows
the trend of $|F_h| \cos 2\pi hx$, curve 2 that of $-|F_h| \cos 2\pi hx$, and curve 3 that of
$|F_{2h}| \cos 2\pi 2hx$; the hatched regions A and B are those in which a high value of the electron
density is indicated.

As a second example we shall consider the projection of a centrosymmetric
structure for which the amplitudes $|F_h|$, $|F_k|$, and $|F_{h-k}|$ are large. In
Fig. 5.C.2 the traces of the three families of planes h, k, and h - k are
shown as full lines, while the dotted lines are located half way. If we
associate to the full lines the maxima of the terms $|F_h| \cos 2\pi\mathbf{h}\cdot\mathbf{r}$,
$|F_k| \cos 2\pi\mathbf{k}\cdot\mathbf{r}$, $|F_{h-k}| \cos 2\pi(\mathbf{h}-\mathbf{k})\cdot\mathbf{r}$ contributing to the electron density
function $\rho(\mathbf{r})$, then the dotted lines will represent the corresponding
minima. If $|F_h|$, $|F_k|$, $|F_{h-k}|$ are large, the corresponding terms must add up
to give a preponderant contribution to $\rho(\mathbf{r})$. This will happen when
$\cos 2\pi\mathbf{h}\cdot\mathbf{r} = +1$, if $F_h$ is positive, and $\cos 2\pi\mathbf{h}\cdot\mathbf{r} = -1$ if it is negative; the
same applies to the other two terms. It follows that the possible maxima of
the electron density will be restricted to the regions indicated by A, B, C,
and D in Fig. 5.C.2, where the above conditions are satisfied for all three
terms at the same time. For each region the signs S(h), S(k), S(h - k) of
the three structure factors are

Region   S(h)   S(k)   S(h - k)
A         +      -      -
B         +      +      +
C         -      -      +
D         -      +      -

Fig. 5.C.2. Three-dimensional centrosymmetric structure with $|F_h|$, $|F_k|$, and $|F_{h-k}|$ large: the full
lines represent the maxima of $|F_h| \cos 2\pi\mathbf{h}\cdot\mathbf{r}$, $|F_k| \cos 2\pi\mathbf{k}\cdot\mathbf{r}$ and
$|F_{h-k}| \cos 2\pi(\mathbf{h}-\mathbf{k})\cdot\mathbf{r}$, the broken lines the corresponding minima; the hatched regions
A, B, C, D are those in which high values of the electron density are more likely.
Under the assumption that all points in the unit cell have equal
probability of housing an atom, the atomic position vectors q will form a set
of independent random variables. Thus also the terms g,, which are
functions of q, will be random variables. Then, because of the central limit
theorem, Eh will have a normal distribution with mean
and variance
1
4=- 2 ( ( E )- ( 4 ) ' ) .
N ,=I (5.D.3)
m,=1
Ah = -
Bh =-
fl
x1
,=I
1
[S,(k)Cj(h - k) + Cj(k)Sj(h - k)] = -
fl, = I
x Pi. (5.D.10)
where
and
$\Phi_{hk} = \varphi_h - \varphi_k - \varphi_{h-k}$.
Since $\varphi_k$ and $\varphi_{h-k}$ are known, $d\Phi_{hk} = d\varphi_h$ and the distribution of $\varphi_h$ is
we obtain
[
1 exp - 21 (E, - ~ -)'Im
.
k ~ h - k
(5.D.19)
and
$$P_+ = \frac{1}{1 + \exp(-2x)} = \frac{\exp x}{\exp x + \exp(-x)} = \tanh x + \frac{\exp(-x)}{\exp x + \exp(-x)} = \tanh(x) + 1 - P_+$$

and finally

$$P_+ = \frac{1}{2} + \frac{1}{2}\tanh\left(\frac{1}{\sqrt{N}}\,|E_h E_k E_{h-k}|\right). \qquad (5.D.22)$$
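The probability formula (5.D.22) can be compared with a simple numerical experiment. The sketch below is an illustration under stated assumptions: random one-dimensional centrosymmetric structures of N = 20 equal point atoms, normalized structure factors $E_h = (2/\sqrt{N})\sum_j \cos 2\pi h x_j$ summed over the N/2 independent atoms, and only triplets with $|E_h E_k E_{h-k}| > 2$. The number of trial structures, the index range and the threshold are arbitrary choices made for this example.

```python
import numpy as np


def normalized_E(x_independent, h_max):
    """E_h for h = 0..h_max in a 1-D centrosymmetric structure of equal atoms."""
    n_atoms = 2 * len(x_independent)
    h = np.arange(h_max + 1)
    return (2.0 / np.sqrt(n_atoms)) * np.cos(2 * np.pi * np.outer(h, x_independent)).sum(axis=1)


rng = np.random.default_rng(0)
n_atoms, h_max, threshold = 20, 15, 2.0

positive, predicted = [], []
for _ in range(500):                      # 500 random trial structures
    E = normalized_E(rng.uniform(0.0, 0.5, n_atoms // 2), h_max)
    for h in range(3, h_max + 1):
        for k in range(1, (h + 1) // 2):  # k < h - k, so each triplet is used once
            triple = E[h] * E[k] * E[h - k]
            if abs(triple) > threshold:
                positive.append(triple > 0)
                predicted.append(0.5 + 0.5 * np.tanh(abs(triple) / np.sqrt(n_atoms)))

observed_fraction = float(np.mean(positive))
predicted_fraction = float(np.mean(predicted))
print("strong triplets examined:      ", len(positive))
print("fraction with positive product:", round(observed_fraction, 3))
print("mean P+ from (5.D.22):         ", round(predicted_fraction, 3))
```

The fraction of strong triplets with a positive sign product lies well above one half, in line with the trend predicted by the formula.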
The reader may assign any three values to the phases and find by trial and o 766
error, with the help of Fig. 5.F.1, the value of x which best satisfies (5.F.2). p,
So far we have used the sequence of three magic integers 3, 4, 5, but
other longer or shorter sequences of different integers may be used. The
accuracy with which phases can be represented by magic integers depends
on the values of the integers and on the length of the sequence. Main
described the theory of magic integers and gave some rules for selecting
those sequences which minimize the mean square deviation $\langle\Delta\varphi^2\rangle$ of the
represented phases. It turns out that the optimal sequences $m_1, m_2, \ldots, m_n$
are those for which $2m_1 = m_n + 1$ and the differences $m_n - m_{n-1}$,
$m_{n-1} - m_{n-2}$, ..., $m_3 - m_2$, $m_2 - m_1$ form a geometric progression of
integers with common ratio $r \geq 1$. For r = 2 the progression 1, 2, 4, 8,
16, ... gives rise to the magic integers listed in the top part of Table
5.F.1. An integer progression with common ratio less than 2 is the so-called
Fibonacci series 1, 1, 2, 3, 5, 8, 13, ..., where each term is defined as the
sum of the two preceding ones ($F_n = F_{n-1} + F_{n-2}$) and r tends to the golden
number 1.618. The corresponding magic-integer sequences are listed in the
bottom part of Table 5.F.1, where the column headed $(\Delta\varphi)_{\mathrm{r.m.s.}}$ gives the
root mean square error of the phases represented with the different
magic-integer sequences. The optimal sequences also have the advantage
that the errors are equally distributed among the represented phases. From
Table 5.F.1 it may be seen that $(\Delta\varphi)_{\mathrm{r.m.s.}}$ increases as n increases and, for a
given n, it is lower when the integers are larger (r = 2).
Fig. 5.F.1. Representation of the three phases in eqns (5.F.2) as a function of x.
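The selection rule stated above can be sketched in a few lines of Python. This is only an illustration of the two conditions quoted in the text ($2m_1 = m_n + 1$ and differences taken from a geometric or Fibonacci progression); the function names are hypothetical and are not taken from any multisolution program. Phases are expressed in cycles, as in $\varphi_i = m_i x \bmod (1)$.

```python
def fibonacci_differences(count):
    """First `count` terms of the Fibonacci series 1, 1, 2, 3, 5, ..."""
    terms, a, b = [], 1, 1
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


def geometric_differences(count, ratio=2):
    """First `count` terms of the progression 1, r, r**2, ..."""
    return [ratio ** k for k in range(count)]


def magic_integers(differences):
    """Build m_1 < m_2 < ... < m_n from the n-1 differences.

    differences[0] is m_n - m_(n-1) and differences[-1] is m_2 - m_1.
    Since m_n = m_1 + sum(differences), the condition 2*m_1 = m_n + 1
    gives m_1 = 1 + sum(differences).
    """
    sequence = [1 + sum(differences)]
    for step in reversed(differences):
        sequence.append(sequence[-1] + step)
    return sequence


def phases_in_cycles(sequence, x):
    """Phases represented by the sequence for a given x (fractions of a cycle)."""
    return [(m * x) % 1.0 for m in sequence]


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, magic_integers(geometric_differences(n - 1)),
              magic_integers(fibonacci_differences(n - 1)))
```

For n = 3 the Fibonacci differences reproduce the sequence 3, 4, 5 used so far in the text.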
In several multisolution programs the general phases in the starting set
are expressed in the form (5.F.1) and x is explored at regular intervals in the
range 0-1 (if the enantiomorph has to be fixed, only the interval 0-0.5
should be explored). The interval $\Delta x$ is chosen in such a way
that the mean variation of the phases from one value of x to the next is of
the same order of magnitude as the corresponding $(\Delta\varphi)_{\mathrm{r.m.s.}}$. $\Delta x$ is
therefore a function of n and of the $m_i$ values, and it is smaller the larger
the integers are. In order to keep the number of explored x values (each value
corresponding to a trial starting set of phases) within reasonable limits, it is
preferable to use magic integers based on the Fibonacci series, at the cost of
a slightly higher $(\Delta\varphi)_{\mathrm{r.m.s.}}$. A comparison between the use of quadrant
permutations and that of magic-integer permutations is shown in Table
5.F.2. This comparison may be better illustrated by considering the case
n = 2, for which it is possible to draw, in the plane $(\varphi_1, \varphi_2)$, the lines
$\varphi_1 = 2x \bmod (1)$ and $\varphi_2 = 3x \bmod (1)$ (Fig. 5.F.2(a)). The choice of regular
Table 5.F.1. Magic-integer sequences based on the geometric progression with
common ratio r = 2 (top) and on the Fibonacci series (bottom); $(\Delta\varphi)_{\mathrm{r.m.s.}}$ is the root
mean square error of the phases represented by each sequence
n    Sequence    $(\Delta\varphi)_{\mathrm{r.m.s.}}$
Table 5.F.2. Comparison between quadrant permutations ($\pm\pi/4$, $\pm 3\pi/4$) and the
permutations obtained with magic integers based on the Fibonacci series.
then $\varphi_s = (m_1 + m_2)x + \pi$. The new phases obtained in this way are called
secondary phases. The starting set of phases will therefore include the
origin-fixing phases, the primary and secondary phases, and their number
can be up to 60-70.
The selection of the x values to be tried is done by means of the q-map.
This is set up using all triplets relating the phases in the starting set. With a
where
$$X_h = \sum_k |E_h E_k E_{h-k}| \cos(\varphi_{-h} + \varphi_k + \varphi_{h-k})$$
and (5.G.6)
where $\alpha = \varphi_N - \varphi_P$; but in the right-hand side of (5.H.2) $\alpha$ is unknown and
the best estimate of $F_Q$ is obtained by replacing $\exp(i\alpha)$ by its expected
value
$$\langle\exp(i\alpha)\rangle = \langle\cos\alpha\rangle + i\langle\sin\alpha\rangle = \langle\cos\alpha\rangle;$$
since $\alpha$ can equally have positive and negative values, then $\langle\sin\alpha\rangle = 0$ and
(5.H.2) becomes
indicating that the 'best' Fourier synthesis is obtained by weighting the $|F_N|$
factors by $w = \langle\cos\alpha\rangle$. In order to derive $\langle\cos\alpha\rangle$ we must first know the
probability distribution of $\alpha$.
For $F_Q = A_Q + iB_Q$ the probability distribution (5.A.13) becomes
$$P(A_Q, B_Q) = \frac{1}{\pi\Sigma_Q}\exp[-(A_Q^2 + B_Q^2)/\Sigma_Q] \quad (5.H.4)$$
where $\Sigma_Q = \sum_j f_j^2$ is a summation over the unknown atoms. From the
triangle in Fig. 5.H.1 we have
$$A_Q^2 + B_Q^2 = |F_Q|^2 = |F_N|^2 + |F_P|^2 - 2|F_N||F_P|\cos\alpha \quad (5.H.5)$$
Fig. 5.H.1. Vector representation in the complex plane of the structure factor as a sum of two
components.
Finally
$$P(\alpha) = \frac{\exp(X\cos\alpha)}{2\pi I_0(X)}.$$
Recalling eqn (5.68), we may now derive
When this weight is used in Fourier recycling, then $|F_N| = |F_{obs}|$ and the
employed coefficients are
Similarly Woolfson derived that for centrosymmetric structures
$$F_P + \langle F_Q\rangle = |F_{obs}|\tanh(X/2)\,S_P \quad (5.H.13)$$
where $S_P$ is the sign of $F_P$.
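As a numerical illustration, the weight $w = \langle\cos\alpha\rangle$ can be obtained directly from the distribution $P(\alpha) \propto \exp(X\cos\alpha)$ by integrating over one period. The sketch below uses only the Python standard library; the function names and the midpoint-rule integration are choices made for this example, and X is simply treated as a given non-negative parameter. The ratio of the two integrals is the ratio $I_1(X)/I_0(X)$ of modified Bessel functions, and the centrosymmetric counterpart $\tanh(X/2)$ is included for comparison.

```python
import math


def _weighted_integral(x, factor, steps=2000):
    """Midpoint-rule integral of factor(a) * exp(x * cos a) over 0..2*pi."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        a = (k + 0.5) * h
        total += factor(a) * math.exp(x * math.cos(a))
    return total * h


def mean_cos_alpha(x):
    """<cos alpha> for P(alpha) proportional to exp(x * cos alpha)."""
    numerator = _weighted_integral(x, math.cos)
    denominator = _weighted_integral(x, lambda a: 1.0)  # equals 2*pi*I0(x)
    return numerator / denominator


def centrosymmetric_weight(x):
    """Weight tanh(x/2) appearing in the centrosymmetric expression (5.H.13)."""
    return math.tanh(x / 2.0)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(f"X = {x:3.1f}  <cos a> = {mean_cos_alpha(x):.4f}  "
              f"tanh(X/2) = {centrosymmetric_weight(x):.4f}")
```

Both weights go from 0 (no information from the known part) towards 1 as X increases.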
where the last two terms mainly contribute to the background, the first term
contributes to the known part of the structure, and the second to the desired
unknown part, but only with half weight. In order to give the P and the Q
peaks the same weight, as in the case of the $\beta$ synthesis, the coefficients of
the $\gamma'$ synthesis should be altered to
$$(2|F_N|\langle\cos\alpha\rangle - |F_P|)\exp(i\varphi_P) = F_P + F_Q + F_Q^*\exp(2i\varphi_P) + \{|F_Q|^2 - \langle|F_Q|^2\rangle\}/F_P^*. \quad (5.H.20)$$
The new synthesis is referred to as the $2F_o - F_c$ synthesis; it has been shown
that, for non-centrosymmetric structures, it contains fewer background
peaks than the $\beta$ synthesis and is more effective in suppressing the peaks
corresponding to wrong atoms in the initial model.
Let us conclude this paragraph with a comparison, proposed by Tollin et
al., between the $\alpha$ synthesis and the optical image synthesis by
holography (see any optics textbook such as Meyer-Arendt). The
principle of holography is illustrated in Fig. 5.H.2. The waves scattered by
the object Q interfere with those from a coherent point source P and the
fringe pattern is recorded on a photographic plate to form a hologram (H).
The intensity pattern on the hologram will depend on the phase relationships
between the interfering waves from P and Q. By illuminating the
hologram with light identical to that from P, the image of the object Q may
be reconstructed.
Fig. 5.H.2. Scheme of the formation of a hologram: (a) recording; (b) image reconstruction.
If $t_P(x)$ and $t_Q(x)$ are the functions describing the source P
and the object Q, the corresponding scattered amplitudes will be their
Fourier transforms $T_P(s)$ and $T_Q(s)$. The amplitude at a point s on the
hologram will be
$$A(s) = T_P(s) + T_Q(s)\exp(2\pi i\,as) \quad (5.H.21)$$
where a is the separation between $t_P(x)$ and $t_Q(x)$. The intensity at s on H
will then be
References
1. Patterson, A. L. (1944). Physical Review, 65, 195.
2. Wilson, A. J. C. (1950). Acta Crystallographica, 3, 397.
3. Lipson, H. and Cochran, W. (1953). The crystalline state. vol. III: The
determination of crystal structures. G. Bell, London.
4. Wilson, A. J. C. (1949). Acta Crystallographica, 2, 318.
5. Wilson, A. J. C. (1942). Nature, 150, 151.
6. Viterbo, D., Gasco, A., Serafino, A. and Mortarini, V. (1975). Acta
Crystallographica, B31, 2151.
7. Harker, D. (1936). Journal of Chemical Physics, 4, 381.
8. Calleri, M., Chiari, G., Chiesi Villa, A., Gaetani Manfredotti, A., Guastini,
C. and Viterbo, D. (1976). Acta Crystallographica, B32, 1032.
9. Gervasio, G., Rossetti, R. and Stanghellini, P. L. (1979). Journal of Chemical
Research (S) 334, (M)3943.
10. Sheldrick, G. M. (1985). In Crystallographic computing 3 (ed. G. M. Sheldrick,
C. Krüger and R. Goddard), pp. 184-9. Oxford University Press; Robinson,
W. and Sheldrick, G. M. (1988). In Crystallographic Computing 4 (ed. N. W.
Isaacs and R. M. Taylor), pp. 366-77. Oxford University Press.
11. Luger, P. and Fuchs, J. (1986). Acta Crystallographica, A42, 380.
12. Lenstra, A. T. H. and Schoone, J. C. (1973). Acta Crystallographica, A29,
419.
13. Pavelčík, F. (1989). Journal of Applied Crystallography, 22, 181.
14. Argos, P. and Rossmann, M. G. (1976). Acta Crystallographica, B32, 2975.
15. Rossmann, M. G., Arnold, E. and Vriend, G. (1986). Acta Crystallographica,
A42, 325.
16. Terwilliger, T. C., Sung-Hou Kim and Eisenberg, D. (1987). Acta
Crystallographica, A43, 1.
17. Harker, D. and Kasper, J. S. (1948). Acta Crystallographica, 1, 70.
18. Karle, J. and Hauptman, H. (1950). Acta Crystallographica, 3, 181.
19. Giacovazzo, C. (1980). Direct methods in crystallography. Academic, London.
20. Hauptman, H. and Karle, J. (1953). The solution of the phase problem. I. The
centrosymmetric crystal, ACA Monograph, No. 3. Polycrystal Book Service,
New York.
21. Sayre, D. (1952). Acta Crystallographica, 5, 60.
22. Hauptman, H. and Karle, J. (1956). Acta Crystallographica, 9, 45.
23. Cascarano, G. and Giacovazzo, C. (1983). Zeitschrift fur Kristallographie, 165,
169.
24. Cochran, W. (1955). Acta Crystallographica, 8, 473.
25. von Mises, R. (1918). Physikalisches Zeitschrift, 19, 490.
26. Karle, J. and Karle, I. L. (1966). Acta Crystallographica, 21, 849.
27. Karle, J. and Hauptman, H. (1956). Acta Crystallographica, 9, 635.
28. Cochran, W. and Woolfson, M. M. (1955). Acta Crystallographica, 8, 1.
29. Schenk, H. (1973). Acta Crystallographica, A29, 77.
30. Schenk, H. (1973). Acta Crystallographica, A29, 480.
31. Simerska, M. (1965). Czechoslovakian Journal of Physics, 6, 1.
32. Hauptman, H. (1975). Acta Crystallographica, A31, 671.
33. Hauptman, H. (1975). Acta Crystallographica, A31, 680.
34. Hauptman, H. (1976). Acta Crystallographica, A32, 934.
Introduction
This chapter deals with a class of natural and synthetic compounds that, at
least to a first approximation, can be treated as being composed of
oppositely charged spheres. We start by illustrating some important
chemical and physical concepts that are later applied to the organization
of ionic crystals. For instance, we shall see how it is possible to rationalize
some atomic structures in terms of the energetics of crystal formation. To
this end, the ionic model, attractive for its simplicity, will be used even though
departures from its predictions are common, particularly when the bonds have
a predominantly covalent character. The close-packing model of spheres is
examined closely and used as a helpful tool to describe many structures.
Finally we see how the concepts and principles illustrated previously are
very useful for the systematization of several ionic structures with different
degrees of complexity.
The energy of the orbitals for various neutral atoms increases in the
following order:
This experimental scale, though not always reliable (see Fig. 6.4),
On the same principle, the entire periodic table can be obtained in the
same way, as illustrated in Table 6.1.
Elements whose valence electrons are in the s orbitals belong to groups
IA or IIA: alkali metals have only one valence electron in their s orbital,
while alkaline-earth metals have two. The elements in groups from IB to
VIIIB, known as transition metals, have their external electrons in the
(n - 1)d orbitals, which immediately precede the last orbital ns.
Those elements whose valence electrons are in the p orbitals belong to
groups III to VIIIA. Such elements are known as post-transition metals and
non-metals, and are roughly separated in Table 6.1 by a broken line.
Helium (in the hatched square) can be placed either in group IIA, taking
into account its electron configuration, or in group VIIIA, taking into
account its chemical behaviour, similar to that of the other noble gases.
Finally there are the lanthanides and actinides, which are characterized by a
gradual filling of the (n - 2)f orbitals, which precede the last orbital ns
by two places.
[Table 6.1. Periodic table of the elements (table body not recoverable).]
a Light metals.
b Transition metals.
c The stair step running from boron to astatine divides non-metals from post-transition metals.
d Noble gases.
408 1 Fernando Scordari
Ionic crystals
The periodic table (Table 6.1) and Table 6.2 show that the EN of elements
varies with a certain regularity. In fact it increases much more from left to
right (along the periods) than from bottom to top (along the groups). The
further apart two elements are in a given period, the greater is the
difference in electronegativity between them, and the more markedly ionic
are the bonds of the compounds they form together. The various types of
bonds which form between the atoms play an important role in determining
geometrically the structural pattern and the physico-chemical properties
which characterize a crystalline substance.
In molecular crystals, more or less complex, finite, atomic aggregates can
be found, i.e. molecules, within which considerably stronger covalent bonds
are formed than the van der Waals bonds which are formed between the
molecules (see Chapter 7). In non-molecular crystals the protagonists are
atoms or ions (simple or complex) which are held together by more or less
localized bonds, and in limited cases of maximum polarization, by Coulomb
interactions.
This chapter will deal with non-molecular crystals, in particular ionic
crystals. The following questions will be considered:
1. What conditions must be satisfied in order that a combination of ions
may be considered stable?
2. What are the configurations that such combinations give rise to?
The energy associated with a given arrangement of ions will be analysed
briefly in order to answer the first question, while the atomic building of
some typical structures will be analysed in order to answer the second.
The Gibbs free energy, G = E - TS + PV, indicates the stability of a crystalline
structure. The term PV (pressure × volume) can be omitted if the
pressure is not high; the free energy then takes the form:
(groups I, IIA), part of group IIIA, transition metals with low oxidation
numbers, and non-metals (groups VI and VIIA). The lattice energy $U_L$
of a crystal can be defined as the energy necessary, at
absolute zero, to break down a mole of crystal into its ionic components,
carrying them to an infinite distance. The formation of an ionic bond can be
considered as taking place in two distinct steps:
(1) a positive ion or cation forms in the following way:
If ionization takes place by the removal of more than one
electron, then the total IE is the sum of the partial IEs.
For example, let us consider two opposite charges supposedly concentrated
in a single point, $+z_1e$, $-z_2e$ (e being the charge of an electron). At a
distance R from each other, they are mutually attracted by a force
from which the potential electrostatic energy of the charges can be derived:
hence:
Fig. 6.5. 'Touching' cations and anions alternating in a row.
Equation (6.16) differs from (6.12) not only in the number of the
interactions N, but also in the constant 1.39. This constant derives from the
positions of the ions considered in Fig. 6.5: more generally it depends on
the type of three-dimensional lattice which the ions form. Madelung's
constant expresses the geometric characteristics of the lattice, and will be
indicated here as $A_M$. If $N_0$ (Avogadro's number) is the number of pairs
of ions contained in a mole, (6.16) becomes:
$A_M$ values for some important lattice types are given in Table 6.3.
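The constant 1.39 for the row of Fig. 6.5 can be checked numerically. The sketch below assumes the usual one-dimensional model: an ion at the origin, with neighbours of alternating sign at distances R, 2R, 3R, ... on both sides, so that the geometric factor is the alternating series 2(1 - 1/2 + 1/3 - ...), whose limit is 2 ln 2. The function name is hypothetical.

```python
import math


def chain_madelung_constant(terms):
    """Partial sum of 2 * (1 - 1/2 + 1/3 - ...) for an alternating row of ions.

    The factor 2 accounts for the neighbours on both sides of the
    reference ion; the k-th neighbour lies at distance k*R and has the
    opposite sign for odd k and the same sign for even k.
    """
    return 2.0 * sum((-1) ** (k + 1) / k for k in range(1, terms + 1))


if __name__ == "__main__":
    for terms in (10, 1000, 100000):
        print(terms, chain_madelung_constant(terms))
    print("limit 2 ln 2 =", 2.0 * math.log(2.0))
```

The series converges slowly (the error after k terms is of order 2/k), which is one reason why three-dimensional Madelung constants are normally evaluated with more refined summation schemes.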
Since a crystal is the content, in terms of atoms, of a cell which is
repeated many times in space, its potential energy can be calculated as
follows:
Structure    Coordination number    $A_M$
CsCl
Halite (NaCl)
Zinc blende (ZnS)
Wurtzite (ZnS)
Fluorite (CaF$_2$)
Cuprite (Cu$_2$O)
Rutile (TiO$_2$)
Anatase (TiO$_2$)
$\Delta H_{EA} = -83.3$ kcal mol$^{-1}$ then $U_L = -186.4$ kcal mol$^{-1}$. If in (6.17) we
insert the values $A_M = 1.74745$, $R_e = 2.814 \times 10^{-8}$ cm, n = 9, and introduce
Fig. 6.6. Sketch of the Born-Haber cycle.
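The numerical substitution can be sketched as follows. The code assumes that (6.17) has the Born-Landé form $U = -N_0 A_M z_1 z_2 e^2 (1 - 1/n)/R_e$, which is an assumption of this example since the equation itself is not reproduced here; the physical constants (CGS units) and the function name are likewise supplied for the example. The input values for NaCl are those quoted in the text.

```python
AVOGADRO = 6.02214e23            # mol^-1
ELECTRON_CHARGE = 4.80320e-10    # esu (CGS electrostatic units)
ERG_PER_KCAL = 4.184e10          # 1 kcal = 4.184e10 erg


def born_lande_energy(madelung, z1, z2, r_e_cm, n):
    """Lattice energy in kcal/mol (negative for a bound crystal).

    Assumed form: U = -N0 * A_M * z1 * z2 * e^2 / R_e * (1 - 1/n).
    """
    coulomb = AVOGADRO * madelung * z1 * z2 * ELECTRON_CHARGE ** 2 / r_e_cm  # erg/mol
    return -coulomb * (1.0 - 1.0 / n) / ERG_PER_KCAL


if __name__ == "__main__":
    u_nacl = born_lande_energy(1.74745, 1, 1, 2.814e-8, 9)
    print(f"U(NaCl) = {u_nacl:.1f} kcal/mol")
```

The value obtained in this way can be compared with $U_L = -186.4$ kcal mol$^{-1}$ derived from the Born-Haber cycle.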
[Figure: energy levels of the d orbitals of a metal ion in a spherical field, in a tetrahedral field, and in an octahedral field.]
Let us now analyse how the electrons are positioned in the d orbitals. The
first three electrons occupy the three $t_{2g}$ orbitals.
The fourth electron has two possibilities: it may occupy an $e_g$ orbital, in
accordance with Hund's first rule, or maximum multiplicity principle
(in its ground state an atom has the maximum number of electrons with
parallel spins in a partially filled set of orbitals), or it may pair with
another electron of opposite spin already in a $t_{2g}$ orbital, thus further
stabilizing the complex.
Two factors influence the fourth electron and decide which tendency will
prevail: $\Delta_o$ and the pairing energy, P. The latter can be defined as the
energy necessary:
(1) to overcome the repulsion which exists between two electrons which
occupy the same orbital;
(2) to compensate the exchange energy which is lost when Hund's first rule
is violated.
If $\Delta_o > P$, then the fourth electron will go to fill a $t_{2g}$ orbital, forming a
low-spin complex (two electrons with opposite spins plus two
electrons with parallel spins give a compound which is not very
paramagnetic). If $\Delta_o < P$, then the fourth electron will go to fill an $e_g$
orbital, forming a high-spin complex (four electrons with parallel spins
give a very paramagnetic compound).
Table 6.4 shows the CFSE relative to high- and low-spin configurations,
for octahedral sites.
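The filling scheme just described can be turned into a small calculation. The sketch below assumes the usual barycentre convention, in which each $t_{2g}$ electron contributes $-2/5\,\Delta_o$ and each $e_g$ electron $+3/5\,\Delta_o$; the pairing energy P is not included, so the numbers give only the orbital term, and the sign convention (negative means stabilizing) may differ from that of Table 6.4. Function names are hypothetical.

```python
from fractions import Fraction


def octahedral_occupation(n_d, low_spin=False):
    """Return (electrons in t2g, electrons in eg) for a d^n ion."""
    if not 0 <= n_d <= 10:
        raise ValueError("n_d must be between 0 and 10")
    if low_spin:
        t2g = min(n_d, 6)               # fill the three t2g orbitals completely first
        return t2g, n_d - t2g
    first = min(n_d, 5)                 # one electron per orbital (Hund's first rule)
    second = n_d - first                # then pairing, again starting from t2g
    t2g = min(first, 3) + min(second, 3)
    e_g = max(first - 3, 0) + max(second - 3, 0)
    return t2g, e_g


def cfse_octahedral(n_d, low_spin=False):
    """Orbital part of the CFSE in units of Delta_o (negative = stabilizing)."""
    t2g, e_g = octahedral_occupation(n_d, low_spin)
    return Fraction(-2 * t2g + 3 * e_g, 5)


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, octahedral_occupation(n), float(cfse_octahedral(n)),
              octahedral_occupation(n, True), float(cfse_octahedral(n, True)))
```

For the fourth electron discussed above, the two alternatives give $-3/5\,\Delta_o$ (high spin) and $-8/5\,\Delta_o$ (low spin), the latter at the cost of one pairing energy.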
We will now examine the case of a transition metal in a tetrahedral
configuration. Figure 6.8(b) shows four ligands at the vertices of a
tetrahedron. The three d orbitals, which we shall call $t_2$, are nearer to the
ligands than are the other two e orbitals. This implies that, unlike an
octahedral coordination, in a tetrahedral coordination the three $t_2$ orbitals
have more energy than the two e orbitals (Fig. 6.7(a)).
The crystal field separation value is indicated by $\Delta_t$ and is smaller than
$\Delta_o$ ($\Delta_t/\Delta_o = 4/9$) for two reasons:
(1) because there is no direct interaction between the ligands and the d
orbitals of the metal ion;
(2) because there are four ligands instead of six.
This explains why, as experience has demonstrated, tetrahedral complexes
have only high-spin configurations.
Table 6.5 shows the CFSE relative to high-spin configurations for
tetrahedral sites.
Table 6.4. CFSE for transition metals of the first series in octahedral configurations
Table 6.5. CFSE for transition metals of the first series in tetrahedral configurations. For
the ions involved see Table 6.4
Number of d electrons    High-spin configuration (e, $t_2$)    CFSE ($\Delta_t$)
At this point it will be useful to list the factors, which are at times
mutually opposed, which govern cation coordination:
voluminous ligands favour tetrahedral coordination (see p. 425);
the position of the ligand in the spectrochemical series (see p. 415)
favours tetrahedral coordination if on the left, octahedral if on the
right;
a high ligand charge favours octahedral coordination, because it
increases $\Delta_o$, which in turn favours the formation of low-spin complexes,
increasing the number of electrons in $t_{2g}$ orbitals;
the CFSE favours octahedral complexes, both because $\Delta_o = (9/4)\Delta_t$,
and because the CFSE of tetrahedral complexes is generally less than
that of octahedral complexes (cf. Tables 6.4 and 6.5).
If the CFSEs of Table 6.5 are multiplied by the scale factor 4/9, they can
be compared with the CFSEs of Table 6.4, and the octahedral site
stabilization energy, OSSE (i.e. the difference between the CFSEs concerning
octahedral and tetrahedral sites), can be calculated in $\Delta_o$ units (see
Table 6.6).
Table 6.6. OSSE for transition metals of the first series. For the ions involved
see Table 6.4
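The rescaling described above can be sketched as a short calculation for high-spin ions. The code assumes the barycentre convention for both geometries (octahedral: $t_{2g}$ at $-2/5\,\Delta_o$, $e_g$ at $+3/5\,\Delta_o$; tetrahedral: e at $-3/5\,\Delta_t$, $t_2$ at $+2/5\,\Delta_t$) and $\Delta_t = (4/9)\Delta_o$; these conventions and all function names are assumptions of the example, and the OSSE is reported here as a positive number when the octahedral site is favoured.

```python
from fractions import Fraction


def high_spin_occupation(n_d, n_lower):
    """Electrons in the lower and upper orbital sets for a high-spin d^n ion.

    n_lower is the number of orbitals in the lower set (3 for an
    octahedral field, 2 for a tetrahedral field).
    """
    if not 0 <= n_d <= 10:
        raise ValueError("n_d must be between 0 and 10")
    first = min(n_d, 5)                 # singly occupy all five orbitals
    second = n_d - first                # then pair, lower set first
    lower = min(first, n_lower) + min(second, n_lower)
    upper = max(first - n_lower, 0) + max(second - n_lower, 0)
    return lower, upper


def cfse_octahedral(n_d):
    """High-spin octahedral CFSE in Delta_o units (negative = stabilizing)."""
    t2g, e_g = high_spin_occupation(n_d, 3)
    return Fraction(-2 * t2g + 3 * e_g, 5)


def cfse_tetrahedral(n_d):
    """High-spin tetrahedral CFSE in Delta_t units (negative = stabilizing)."""
    e, t2 = high_spin_occupation(n_d, 2)
    return Fraction(-3 * e + 2 * t2, 5)


def osse(n_d):
    """Octahedral site stabilization energy in Delta_o units (positive = octahedral favoured)."""
    return -(cfse_octahedral(n_d) - Fraction(4, 9) * cfse_tetrahedral(n_d))


if __name__ == "__main__":
    for n in range(0, 11):
        print(n, float(osse(n)))
```

With these conventions the largest octahedral preference is found for $d^3$ and $d^8$ ions, and none for $d^0$, $d^5$, and $d^{10}$.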
Ionic radius
An important factor which influences the geometry of the structure of ionic
compounds is the so-called ionic radius. It expresses the 'dimensions' of an
2. Use the multiplication coefficients 0.35 and 0.85 for electrons belonging
to the groups (ns, np) and ((n - 1)s, (n - 1)p) respectively, and 1 for
those belonging to the group ((n - 2)s, (n - 2)p) or to any preceding
group. It is preferable to use 0.30 rather than 0.35 in the case of the group (1s).
3. If the electrons belong to groups (nd) or (nf), use the coefficients 0.35
and 1, respectively, for electrons which belong to the same group, or for
those which belong to the next group to the left.
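Rule 2 can be illustrated for atoms whose outermost electron is of s or p type. The sketch below is deliberately limited to that case (it does not implement rule 3 for d or f electrons); the shell populations are passed in by hand and the function name is hypothetical. The four examples are chosen because the resulting $Z_{\mathrm{eff}}$ values (Na 2.20, Cl 6.10, Li 1.3, Cs 2.2) are those quoted in this chapter.

```python
def slater_zeff_sp(atomic_number, shell_population):
    """Effective nuclear charge felt by an outermost s or p electron.

    shell_population maps the principal quantum number to the number of
    electrons in that shell. Only valid when the outermost shell holds
    s and p electrons only.
    """
    n = max(shell_population)
    same_group = shell_population[n] - 1          # the electron does not shield itself
    coefficient = 0.30 if n == 1 else 0.35
    shielding = coefficient * same_group
    shielding += 0.85 * shell_population.get(n - 1, 0)
    shielding += 1.00 * sum(count for shell, count in shell_population.items()
                            if shell < n - 1)
    return atomic_number - shielding


if __name__ == "__main__":
    atoms = {
        "Li": (3, {1: 2, 2: 1}),
        "Na": (11, {1: 2, 2: 8, 3: 1}),
        "Cl": (17, {1: 2, 2: 8, 3: 7}),
        "Cs": (55, {1: 2, 2: 8, 3: 18, 4: 18, 5: 8, 6: 1}),
    }
    for symbol, (z, shells) in atoms.items():
        print(symbol, round(slater_zeff_sp(z, shells), 2))
```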
For example, for the compound NaF:
By solving the system (6.21), $r_e$(Na$^+$) = 0.95 and $r_e$(F$^-$) = 1.36 Å can be
obtained, which agree perfectly with the ionic radii quoted by Pauling.
This procedure can be extended to analogous isoelectronic compounds.
Various ionic radius systems have been proposed; the most accurate are
those based on a large number of examined structures. Effective ionic
radii extracted from Shannon's original work[19] are given in Table 6.7.
These systems show that:
1. Cations are generally smaller than anions. The average radius of the
outermost occupied orbital varies considerably from the least electronegative
atom to the cation, but very little from the most electronegative atom to
the anion, e.g. $r_o$(Li) = 1.586, $r_o$(Li$^+$) = 0.186 Å and $r_o$(K) = 2.162, $r_o$(K$^+$) =
0.592 Å. In the anions the added electron increases the radius by a few per
cent, e.g. $r_o$(Cl) = 0.725 Å, $r_o$(Cl$^-$) = 0.742 Å and $r_o$(Br) = 0.851, $r_o$(Br$^-$) =
0.869 Å.
Table 6.7. Effective ionic radii ($r_e$) versus coordination number (CN) according to Shannon[19] (table body not recoverable)
2. The dimensions of the atoms decrease along the period (with n
constant), the alkali metals having the largest dimensions, and increase
sharply along the group from top to bottom (with n variable). This occurs
because the dimensions of an atom depend both on n and on $Z_{\mathrm{eff}}$, which act
on the atomic radius producing contrasting effects. Thus an increase in n
tends to increase the atomic volume (the most probable radius increases
with n, see Fig. 6.2), while an increase in $Z_{\mathrm{eff}}$ tends to contract the orbitals. For
example, the $Z_{\mathrm{eff}}$ calculated with Slater's rules for Na and Cl are
respectively 2.20 and 6.10, while for Li and Cs they are 1.3 and 2.2. In the
first case the sharp increase in $Z_{\mathrm{eff}}$, which is not countered by n (which is
constant), induces a decrease in the dimensions from Na to Cl. In the
second case the regular increase in n, hardly countered at all by the effects
of $Z_{\mathrm{eff}}$, induces a notable increase in the atomic volumes from Li to Cs.
Obviously this also holds for the ionic radius.
3. The ionic radius varies with the coordination number (CN). Note that
the repulsion forces increase with the CN, which in turn increases the
effective ionic radius $r_e$ of the ion of the opposite charge (the cation). The
lattice energy allows this trend to be quantified approximately. Equation (6.15)
can be rewritten as:
$$E = -\frac{A_M z_1 z_2 e^2}{R} + \frac{\mathrm{CN}\,B}{R^n}$$
where $A_M$ is Madelung's constant (which has taken the place of 1.39) and
CN, which is defined below, is the coordination number (which has taken
the place of the number 2, that is the two anions coordinated by the cation,
see Fig. 6.5 and eqn (6.13)).
For $R = R_e$, $dE/dR = 0$, from which we can work out:
Let us suppose that ions of opposite charges can crystallize just as well
into a CsCl type structure as into an NaCl (halite) or ZnS (wurtzite) one.
Assuming the intermediate value n = 9, we obtain:
The result of (6.24) indicates that the inter-ionic distances of a CsCl type
lattice ($R_1$) are 2.3 per cent greater than those of an NaCl type lattice ($R_2$),
while (6.25) shows that those of a ZnS type lattice ($R_3$) are smaller than the
latter by ≈4 per cent. Since the anion radii remain more or less the same, the
variations occur almost entirely in the cation radius.
4. The ionic radius depends on the electronic spin state. There is a close
relationship between ionic radius and spin state in transition metals. In fact,
This means, for example, that for alkali metals the polarization power
increases according to the following scale: Li$^+$ > Na$^+$ > K$^+$ > Rb$^+$ > Cs$^+$
(the ionic potential, $\phi = Z/r$, increases from right to left).
Regarding the anions, the greater the dimensions of their outermost
electrons, the weaker are their bonds, due both to the greater distance from
the nucleus and to the increasing shield effect. It follows that the
polarizability of halogens varies according to the scale: I$^-$ > Br$^-$ > Cl$^-$ > F$^-$,
analogously with that of chalcogens. Anions with a high negative charge,
such as As$^{3-}$ and P$^{3-}$, are particularly easily polarized. Moreover, it has
been shown above how the different peripheral electrons are shielded by the
s, p, and d type electrons. Therefore, although Ni$^{2+}$ has approximately the
same ionic radius as Mg$^{2+}$, it polarizes a contiguous anion more strongly
than Mg$^{2+}$ does: this effect is even more noticeable between Cu$^+$ and Na$^+$.
The lattice energy is minimal if the total potential energy is minimal; this
occurs when:
(1) the interatomic distances are as close as possible to the equilibrium
distances ($R_e$);
(2) the number of ions at, or very nearly at, the equilibrium distance is as
high as possible.
These conditions are general and apply to crystals having simple structural
units, or complex units connected to similar units by central or near central
forces of attraction.
The maximum filling principle can be expressed as follows: 'Simple
(atoms or ions) or complex (groups of atoms or ions) structural units on
which central or near central attractive forces act tend to increase contact
with each other to the maximum, whilst reducing their distances to a
minimum'.
By supposing that the ions behave as if they were rigid spheres on which
central, or almost central, forces are acting, we can hope to:
(1) predict the most probable arrangement of the anions which surround
the cation (coordination polyhedra), by means of a simple geometric
rule (radius ratio rule);
Coordination polyhedra
It is sometimes convenient to illustrate crystal structures by means of
coordination polyhedra, in order to underline the geometrical relationships
which characterize structural frameworks. Often such polyhedra continue to
exist beyond the crystal state as definite physico-chemical realities, e.g.
[SiO$_4$]$^{4-}$ tetrahedra in silicate melts, or Fe(H$_2$O)$_{6-n}$(SO$_4$)$_n$ octahedra where
1 < n < 1.3, in aqueous solution of iron sulphate.
The term coordination is used to refer to the atoms or ions which
surround a central atom or ion. If it is not otherwise specified below, it may
be assumed that the central ion is the cation (M) surrounded by the anions
(X).
The X ions in contact with M are known as the first nearest neighbours of M
and constitute the first coordination shell. The coordination number (CN)
is equal to the number of the first nearest neighbours, i.e. to the number of
bonds formed by an atom. The next X ions, in contact with the first nearest
neighbours, are known as the second nearest neighbours of M, and form the
second coordination shell, and so on. Sometimes the first coordination
shell is not easily distinguished from the second. In such cases the
uncertainty is expressed by attributing two numbers to the CN, e.g. 6 + 2.
The distances $d_{MX}$ are all equal when M occupies certain symmetrical
positions, e.g. M in 3, with CN = 6; otherwise the $d_{MX}$ vary, if only a little,
from their average.
The coordination polyhedron of M can be obtained by 'joining the dots',
i.e. all the centres of X. Some of the possible coordination polyhedra are
shown in Figs. 6.10 and 6.11. They can be found in structures as isolated
polyhedra, and/or connected to polyhedra of the same or another type.
[Figs. 6.10 and 6.11: coordination polyhedra (dumb-bell, triangle, tetrahedron, square, octahedron, cube, hexagonal cuboctahedron, cuboctahedron).]
and $r_e$(X) are more or less the same. Otherwise, such compounds are
governed by the cation coordination number (since the cation is generally
smaller than the anion) and the stoichiometry. The dimensions of the cation
depend on the CN (see p. 422). Therefore ambiguity remains as to the ionic
radius dimensions when the structure (and thus the CN) is not known. Let
us compare the theoretical data with the experimental data for the halides.
'Average' ionic radii for CN = 6 are used. For example, CsBr ($r_M$ = 1.67,
$r_X$ = 1.96 Å) forms a compound Cs$^+$:Br$^-$ = 8:8, in accordance with both the
radius ratio ($\rho$ = 0.85) and the stoichiometry (M:X = 1:1). Also for NaCl
both $\rho$ = 0.53 and the stoichiometry indicate a compound Na$^+$:Cl$^-$ =
6:6, in accordance with the results of structural analysis.
The rule of the radius ratio predicts M:X = 4:4 structures in the higher
field T ($\rho$ < 0.41), M:X = 6:6 in the intermediate field O (0.41 ≤ $\rho$ < 0.73),
and M:X = 8:8 in the lower field C ($\rho$ > 0.73). Experimental data, on the
other hand, show that almost all the structures are of M:X = 6:6 type, two
are of M:X = 8:8 type (CsI, CsBr), and one, CsCl, can crystallize in either
type of coordination.
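The three fields can be written as a small classifier. The limits 0.41 and 0.73 quoted above are the rounded geometrical values $\sqrt{2} - 1$ and $\sqrt{3} - 1$ for touching rigid spheres in octahedral and cubic coordination; using the exact values, and the function names below, are choices made for this sketch. The two examples use the radii and ratios given in the text for CsBr and NaCl.

```python
import math

OCTAHEDRAL_LIMIT = math.sqrt(2.0) - 1.0   # about 0.414, lower limit of the O field
CUBIC_LIMIT = math.sqrt(3.0) - 1.0        # about 0.732, lower limit of the C field


def radius_ratio(r_cation, r_anion):
    """Radius ratio rho = r_M / r_X."""
    return r_cation / r_anion


def predicted_coordination(rho):
    """Cation coordination number predicted by the radius ratio rule (1:1 compounds)."""
    if rho >= CUBIC_LIMIT:
        return 8      # field C, M:X = 8:8
    if rho >= OCTAHEDRAL_LIMIT:
        return 6      # field O, M:X = 6:6
    return 4          # field T, M:X = 4:4


if __name__ == "__main__":
    rho_csbr = radius_ratio(1.67, 1.96)
    print("CsBr", round(rho_csbr, 2), predicted_coordination(rho_csbr))
    print("NaCl", 0.53, predicted_coordination(0.53))
```

As the text points out, this purely geometrical prediction fails for many alkali halides, so the output should be read as the rule's prediction and not as the observed structure.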
In Fig. 6.14 a broken line is shown for $\rho$ = 0.32, which corresponds to
the point of intersection of the NaCl and ZnS curves in Fig. 6.12. The first
anomaly involves the T field. The incongruence between the observed
structure M:X = 6:6 and the structure M:X = 4:4 predicted by the radius
ratio rule is only apparent. Considering Fig. 6.12, NaCl type configurations
are stable for 0.32 < $\rho$ < 0.41, i.e. the values observed. The other anomaly,
involving the lower field, C, is constituted by M:X = 6:6 structures and can
be explained by considering not only the interactions between ions of
opposite charge, but also those between ions of similar charge in the CN
function, and the geometric relationships between the coordination polyhedra
(see p. 435).
Fig. 6.14. Three fields are shown (T, O, C), delimited by two unbroken lines ($\rho$ =
0.73, 0.41). In accordance with geometrical criteria, they represent stability areas for the
alkaline halides. The broken line ($\rho$ = 0.32) shows the reduced T field according to the
energy indications of Fig. 6.12. The diamonds and squares show the type of coordination
polyhedron on which the real halide structures are based: octahedra and cubes respectively.
However, if $r_e$($^{IV}$Li$^+$) were used for LiI, the appropriate value would be
$\rho$ = 0.27. The configuration Li$^+$:I$^-$ = 4:4 would gain potential energy
and would therefore seem likely to be preferred, which is contrary to the
experimental data.
Another example can be illustrated by means of Fig. 6.15, as deduced by
Phillips. The 'fields of stability' for the most important structures of
A$_2$BO$_4$ composition are shown. The diagram was obtained empirically on
the basis of about 130 structures using the effective radii of the cations in
octahedral ($r_A$) and tetrahedral ($r_B$) coordinations. Such a diagram makes it
possible to work out the field in which a certain A$_2$BO$_4$ structure prevails,
even if some borderline cases may be contrary to the general trend.
Improvements in structural predictions can be obtained by considering not
only geometrical factors but also chemical factors, such as polarization.
Graphs for compositions of M$_m$X$_n$ type have been obtained by plotting
the average quantum number n (which indicates roughly the dimensions of
the ions) on the y axis and the electronegativity difference, $\Delta$EN (which
indicates the degree of polarization of the bond),[24] on the x axis. The
prediction is improved if $\rho\Delta$EN is used instead of $\Delta$EN.
Finally, using effective mean ionic radii, several authors have found
quantitative relations between crystallographic and compositional parameters
in garnets. So, according to Basso, it is possible to calculate the
oxygen fractional coordinates, the metal-oxygen distances, and the cell
Fig. 6.15. Taking into account ionic radii $r_e$(A) and $r_e$(B) the stability fields for A$_2$BO$_4$ structures
are shown. These fields conform to the following structure types: (a) $\beta$-K$_2$SO$_4$, (b)
Na$_2$SO$_4$ (thenardite), (c) (Mg, Fe)$_2$SiO$_4$ (olivine), (d) Be$_2$SiO$_4$ (phenakite), (e) Al$_2$MgO$_4$ (spinel), (f)
Sr$_2$PbO$_4$, (g) K$_2$NiF$_4$, (h) Fe$_2$CaO$_4$, (i) Al$_2$BaO$_4$.
where r(X), r(Y), and r(Z) are the mean ionic radii of the 8-, 6-, and
4-coordinated cations respectively, and (OH) is the number of OH groups present in
the formula unit (f.u.) of garnets:
Closest packings
Most metals, some compounds of the type MX, MX$_2$, etc., and others with a
more complex stoichiometry, take on structures which can be described by
means of closest packings of rigid spheres of equal radius. Let us denote by
packing coefficient ($c_i$) the ratio:
$$c_i = \sum (V_a/V_c) \quad (6.27)$$
where $\sum V_a$ is the volume of the atoms or anions contained in the
elementary cell of volume $V_c$. For discussion of the concept of $c_i$ see
Appendix 6.A. According to the maximum filling principle, $c_i \to$ max.
Hexagonal closest packing (HCP) and cubic closest packing (CCP) satisfy
this requirement. When analysing their characteristics, it should be borne in
mind that in the common structures, ions, atoms, or groups of either,
replace rigid spheres forming the so-called closest-packing structures. The
basis of the HCP and CCP dispositions is a layer of spheres (layer A)
packed as closely as possible, similar to that illustrated in Fig. 6.16.
This layer has the following characteristics: every sphere is in contact with
six other spheres; between every three spheres there is a space, known as
a hole; every sphere is surrounded by six holes, one third of each of which
belongs to it, so a total of two holes can be associated with each sphere.
Such holes, which are all alike from a physical point of view, are indicated
by b (white) and c (black). Spheres belonging to the next layer may
correspond to the b or the c holes of layer A, and are thus said to belong to
430 1 Fernando Scordari
Many cations have a radius of just the right size, compared with that of
the oxygen atoms, that they can occupy tetrahedral or octahedral holes. Let
us examine, therefore, these two types of holes more closely in order to find
out what is their quantitative ratio to the spheres. In Fig. 6.18 a layer A
(black spheres) and a layer B (white spheres) are shown.
Supposing that the sequence is an ABAB . . . type, a sphere marked S
(belonging to layer A and not visible in the figure) is surrounded by a total
of 12 spheres (six of A and six of two B layers, above and below A). It
contributes to the formation of:
(1) six octahedral holes (O), three of which are shown in the figure; the
other three can be obtained by a symmetry plane m which passes
through A;
(2) eight tetrahedral holes, three of which, shown in the figure, are in
correspondence with the three spheres marked with a thick T and one
of which is in correspondence with the sphere marked S and three of
the spheres marked with the same T; the other four can be obtained by
a symmetry plane m which passes through A.
Fig. 6.18. Octahedral, O, (top right) and tetrahedral, T, (bottom right) holes with their coordination polyhedra, formed by two successive compact layers.
Since six spheres are needed to form an octahedral hole and every sphere
helps to form six holes, obviously there is a ratio of one hole to each sphere.
Four spheres are needed to form each tetrahedral hole, while each sphere
helps to form eight tetrahedral holes, so there is a ratio of two tetrahedral
holes to each sphere.
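The counting argument above can be sketched in a few lines. This is only an illustration; the function name `holes_per_sphere` is hypothetical, and it assumes, as the text does, that every hole is shared equally by the spheres that form it.

```python
from fractions import Fraction

def holes_per_sphere(holes_touched_by_sphere, spheres_per_hole):
    """Ratio of holes to spheres when each hole is shared equally
    among the spheres that build it (assumption taken from the text)."""
    return Fraction(holes_touched_by_sphere, spheres_per_hole)

# Each sphere helps to form 6 octahedral holes; 6 spheres build one hole.
octahedral = holes_per_sphere(6, 6)
# Each sphere helps to form 8 tetrahedral holes; 4 spheres build one hole.
tetrahedral = holes_per_sphere(8, 4)

print(octahedral, tetrahedral)  # prints: 1 2
```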
A layer of octahedral holes (OL) is enclosed by two layers of spheres, as
are two layers of tetrahedral holes (TL), though they are orientated in
different directions (see Fig. 6.32). This situation is common to both HCP
and CCP; consequently a double layer of spheres has the following
characteristics in both HCP and CCP:
while a CCP one contains four (Fig. 6.17). For example, the side of an
elementary CCP cell will be $a_0 = D\sqrt{2}$, where D is the diameter of a sphere.
The volume of a sphere is $V_a = 4.189(D/2)^3$, from which $c_i = 4V_a/a_0^3 = 0.74$.
The other parameter is the ratio h/D. Since
$h = [(D\sqrt{3}/2)^2 - (D\sqrt{3}/6)^2]^{1/2}$ (see Fig. 6.20) it follows that
h/D = 0.8165. For example, in
HCP, $c_0 = 2h$, therefore $c_0/D = 1.63$.
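The numerical values quoted above can be checked with a short calculation. The sketch below assumes only the hard-sphere geometry described in the text (spheres touching along the face diagonal of the CCP cell, four spheres per cell); the variable names are illustrative.

```python
import math

D = 1.0  # sphere diameter, arbitrary length unit

# CCP: spheres touch along the face diagonal, so a0 * sqrt(2) = 2 * D.
a0 = D * math.sqrt(2)
v_sphere = (4.0 / 3.0) * math.pi * (D / 2) ** 3   # = 4.189 * (D/2)^3
c_ccp = 4 * v_sphere / a0 ** 3                     # four spheres per CCP cell

# Distance h between two successive closest-packed layers.
h = math.sqrt((D * math.sqrt(3) / 2) ** 2 - (D * math.sqrt(3) / 6) ** 2)
c0_over_D = 2 * h / D                              # HCP: c0 = 2h

print(round(c_ccp, 4), round(h / D, 4), round(c0_over_D, 3))
# prints: 0.7405 0.8165 1.633
```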
Combinations of HCP and CCP can create an indefinite number of
closest-packing structures. The repetition period, nh, will be in the range
$2h \le nh \le p$, where p is the maximum dimension of the crystal
perpendicular to the closest-packed layers which the sequence can reach when the
structure is completely disordered.
As n increases, the number of possible sequence combinations grows very
rapidly. For example, there is only one possible packing for n = 4:
ABACA; for n = 6 there are two: ABCACBA and ABABACA; but for
n = 20 the number of possible packings rises to 4625. One of the following
ways can be chosen to derive the possible packings: the first is to count all
the possible sequences once having generated them;[30] the second is to
obtain them by means of combinatorial analysis.[31] Figure 6.16 shows the
symmetry of a closest-packed layer (6mm) and of a pair of such layers (3m).
The minimum symmetry content common to all closest packings is P3m1.
Other symmetries are represented by all those space groups which can have
P3m1 as a subgroup (see also Fig. 1.E.1), i.e. $P\bar{3}m1$, $P\bar{6}m2$, $P6_3mc$,
$P6_3/mmc$, $R3m$, $R\bar{3}m$, $Fm\bar{3}m$. For HCP and CCP the space groups are
respectively $P6_3/mmc$ and $Fm\bar{3}m$.
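The first of the two counting strategies mentioned above (generate all sequences, then count the distinct ones) can be sketched by brute force for small n. The equivalence rules used here are an assumption of this sketch, not a statement of the method of the cited references: two stackings are taken as the same packing if they differ only by a cyclic shift, a reversal of the stacking direction, or a relabelling of the positions A, B, C, and only sequences whose true period is exactly n are counted. All function names are hypothetical.

```python
from itertools import permutations, product

def stackings(n):
    """All cyclic sequences of n layers over A, B, C in which no two
    adjacent layers (including last-first) are alike. The first layer is
    fixed to A, which loses nothing once relabelling is allowed."""
    for rest in product("ABC", repeat=n - 1):
        seq = ("A",) + rest
        if all(seq[i] != seq[(i + 1) % n] for i in range(n)):
            yield seq

def has_period(seq, d):
    """True if the cyclic sequence repeats after d layers."""
    n = len(seq)
    return all(seq[i] == seq[(i + d) % n] for i in range(n))

def canonical(seq):
    """Smallest representative under shifts, reversal and relabelling."""
    n = len(seq)
    best = None
    for perm in permutations("ABC"):
        table = dict(zip("ABC", perm))
        relabelled = [table[s] for s in seq]
        for cand in (relabelled, relabelled[::-1]):
            for k in range(n):
                rot = tuple(cand[k:] + cand[:k])
                if best is None or rot < best:
                    best = rot
    return best

def count_packings(n):
    """Number of distinct closest packings repeating after exactly n layers."""
    classes = set()
    for seq in stackings(n):
        if any(n % d == 0 and has_period(seq, d) for d in range(1, n)):
            continue  # true period is shorter than n
        classes.add(canonical(seq))
    return len(classes)

print([count_packings(n) for n in range(2, 7)])  # prints: [1, 1, 1, 1, 2]
```

The results for n = 4 and n = 6 agree with the single and the two packings quoted in the text. The enumeration grows as 2^n, so this approach is only practical for short periods.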
When real structures are considered the symmetry may be reduced, due
not only to the arrangement of the cations which occupy the various holes,
but also to the distortions which the cations produce in the lattice.
Therefore, some structures belonging to systems different from those just
examined can be described as closest packings, distorted to a greater or
lesser degree.
There are other close packings characterized by spheres with fewer next
neighbours. These are the PTP (primitive tetragonal packed), the BCT
(body-centred tetragonal), and the BCC (body-centred cubic). The first has
$c_i$ = 0.72 and CN = 11, the second (not found in the known structures) has
$c_i$ = 0.70 and CN = 10, the third (taken on by several structures) has
$c_i$ = 0.68 and 8 + 6 spheres of the same type coordinated around a central
sphere, eight at a distance of D and six at a distance of 1.15D (Fig. 6.21).
HCP and CCP, on the other hand, have 12 spheres all at the same distance
from the central sphere.
Fig. 6.21. A BCC cell. The central sphere is surrounded by eight spheres at a distance of D and six more distant spheres at 1.15D ($2D = l\sqrt{3}$, where l is the side of the cell).
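The BCC figures quoted above follow directly from the relation $2D = l\sqrt{3}$ given in the caption. A minimal check, assuming hard spheres touching along the body diagonal and two spheres per cell (variable names are illustrative):

```python
import math

D = 1.0                                # sphere diameter, arbitrary unit
side = 2 * D / math.sqrt(3)            # cell edge l, from 2D = l * sqrt(3)
v_sphere = math.pi * D ** 3 / 6        # volume of one sphere
c_bcc = 2 * v_sphere / side ** 3       # two spheres per BCC cell
second_shell = side / D                # six second neighbours lie one cell edge away

print(round(c_bcc, 2), round(second_shell, 3))  # prints: 0.68 1.155
```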
Pauling's rules
Ionic bonds play an important role in the building up of structures for the
vast majority of minerals. In fact, even the groups (SiO4)4-, (SO4)2-,
(CO3)2-, which have prevalently covalent bonds internally, can be seen as
large negative ions bonded to the cations by essentially ionic interactions.
The general principle of the minimization of potential energy can be applied
to such compounds. The Pauling rules are derived from this principle.
where $s_0$, $R_0$, and N are empirical constants. For $s_0 = 1$, $R_0 = R_1$, and the
parameter $R_1$ represents the expected bond length when the valence is
assumed to be one. In such a case (6.29) becomes:
By inserting the correct values for $R_1$ and N for each cation[34] and the
bond distance R, which can be deduced from the structure, s can be
determined. For example Na+, Mg2+, Al3+, Si4+, P5+, and S6+ have values
of $R_1$ = 1.622 Å and N = 4.290. The experimental bond valence can be
obtained by applying (6.30). The agreement between the experimental and
the theoretical valence is very good; generally the difference does not
exceed 5 per cent. If the agreement is not satisfactory for an ionic structure
it can be ascribed to the following:
(1) the structure has not been correctly determined;
(2) the structure has not been interpreted properly, or certain bonds have
been overlooked, or the wrong atoms have been ascribed to certain
sites (e.g. Si4+ in place of Al3+ or vice versa);
(3) the cations have stereochemically active lone-pair electrons (e.g. Sb3+).
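A bond-valence check of the kind described above can be sketched as follows. The expressions (6.29) and (6.30) are not reproduced in this copy of the text; the sketch assumes the usual power-law form $s = (R/R_1)^{-N}$, which reduces to s = 1 at R = $R_1$ as the text requires. The function names and the example distances are hypothetical, not data from the source.

```python
R1 = 1.622  # Å, value quoted in the text for Na+ ... S6+
N = 4.290   # exponent quoted in the text for the same cations

def bond_valence(r, r1=R1, n=N):
    """Bond valence for a bond of length r (Å), assuming s = (r / r1) ** -n."""
    return (r / r1) ** (-n)

def atomic_valence(distances):
    """Sum of the bond valences over all bonds formed by one atom."""
    return sum(bond_valence(r) for r in distances)

# Hypothetical tetrahedron with four equal Si-O bonds of 1.62 Å:
# the sum should come out close to the formal valence of Si4+.
print(round(atomic_valence([1.62] * 4), 2))  # prints: 4.02
```

A sum that departs from the formal valence by much more than the 5 per cent mentioned in the text would point to one of the three causes listed above.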
Amongst the various applications of this method[35] it is often used by
crystallographers to distinguish O2- from (OH)- and H2O in some complex
mineral structures.[36] Calculated bond valences can be used through (6.28)
to calculate experimental atomic valence and so the oxidation state of an
that they will share vertices is reduced, so much so that in the case of groups
having yet higher cation charges, like [SO4]2-, it is almost nil.
Figure 6.22 shows that do, = dCf. However, in octahedral coordinations
the shielding of the two cations is more accentuated. This explains why
numerous halides tend to keep the NaCl structure rather than that of
CsCl, favoured by p,. Structures like CsCl are only possible if the anions
are large enough to keep the cations far enough apart, as can be observed in
the case of CsCl, CsBr, CsI (in the NaCl structure type, the distance
between two cations for CsF is 4.24 Å and 4.92 Å for CsCl). It should be
remembered here that, in accordance with the indications of Fig. 6.22, ionic
compounds with an MX stoichiometry (see p. 437) crystallize not with an
HCP lattice, but with a CCP lattice (e.g. NaCl, MgO, CaO, etc.). This is
because the energy levels involved in the formation of the two packings seen
along h are different: in the case of CCP the octahedra MX6 share edges,
while in the case of HCP they share faces.
On the basis of what has been said, the question of how to realize the
relative stability of certain TiO2 polymorphs can be understood. This
compound has three polymorphic modifications: rutile, brookite, and
anatase. The structural differences concern mainly the way in which the
TiO6 octahedra are linked to one another. In rutile, brookite, and anatase,
each octahedron shares respectively two, three, and four edges with a
similar number of other octahedra. The structural stability of brookite is less
than that of rutile, while anatase is the least stable, as predicted by the third
Pauling rule.
MX structures
As predicted by the third Pauling rule, the packing definitely preferred by
MX compounds is CCP, while BCC is rare, and HCP is never adopted. It
can be seen from the stoichiometry of these compounds that all the
octahedra are occupied and that therefore an HCP-based structure is
decidedly less probable than a CCP-based structure. In the first case, in fact,
the octahedra share edges and faces, while in the second the octahedra only
share edges. The non-existence of hexagonal packing in the structures of
MX compounds can be best explained with the help of Fig. 6.23.
$A_mB_nX_p$ structures
A very large number of compounds of great geological, crystallochemical,
and applicative interest belong to this class. Some structures are illustrated
below in which A and B represent monovalent to pentavalent metals and X
is generally oxygen.
Ilmenite (FeTiO3) and perovskite (CaTiO3) have ABX3 stoichiometry.
Ilmenite is isostructural with corundum, in which the cations are substituted
according to the following scheme: 2Al3+ → (Fe2+, Mg2+)Ti4+. Such
cations are ordered so that the octahedra of one layer are occupied by Fe2+
and those of the next layer are occupied by Ti4+. It follows that two
contiguous octahedra belonging to two adjacent layers are occupied by Fe2+
and by Ti4+, so that the average valence (+3) of the two cations remains the
same as that of the two corresponding cations in corundum. The specialization
of the octahedral sites, half occupied only by Fe2+ and half only by
Ti4+, determines the disappearance of the glide c. Thus the space group $R\bar{3}c$
(corundum) becomes $R\bar{3}$. A number of structures with various valences
(shown in brackets) have a similar structure to ilmenite, including
LiNbO3 (+1, +5); MgTiO3, FeTiO3 (+2, +4); Mn(Fe, Sb)O3, α-Al2O3,
α-Fe2O3 (the last two can be considered as particular cases of the structure of
ilmenite) and Ti2O3, V2O3 (+3, +3). In these compounds A and B have
similar ionic radii (⟨$r_e$⟩ = 0.65 Å), therefore they occupy two octahedral
holes of a CCP type packing.
If, however, the cation is too large to fit into the octahedral holes,
another type of structure with the same stoichiometry is formed (perovskite
type). Thus, as Na+, K+, Ca2+, Sr2+, Ba2+, Pb2+, etc., cations have an ionic
radius equal to that of the anions when they have CN = 12, they can substitute an
X (O2-, F-) anion. If one quarter of the X anions are substituted by such
cations, then a particular type of CCP is obtained, a representative cell of
which is illustrated in Fig. 6.28.
† Incidentally, the inversion and the complete disorder between Fe2+ and Fe3+ in octahedral
sites make magnetite an excellent conductor of electricity. This is due to the ease with which
Fe2+ and Fe3+ exchange electrons. Below -153 °C the Fe2+ cations become ordered, reducing
noticeably the capacity of magnetite to conduct electricity.
direction of the shortest identity period within the silicate anion are
chosen.
If rule (a) is satisfied by more than one chain, the fundamental
chains are chosen such that their number is lowest.
Once rules (a) and (b) have been satisfied, the third rule indicates
the following order of preference (for fundamental chains):
unbranched > loop-branched > open-branched > mixed-branched >
hybrid (for > read 'is preferred to').
The order in which the various parameters have been presented represents
the classification hierarchy: superclass ($N_{an}$), classes (CN), sub-classes
(L), branches (B), orders (M), groups (D), sub-groups (r or t),
families (P, Pr).
If there are a number of silicates with the same type of silicate anion, i.e.
silicates belonging to the same family, further subdivisions can be made. The
following criteria are particularly important:
(1) the Si:O atomic ratio of the silicate anions;
(2) the degree of stretching of a chain.
As far as criterion (1) is concerned, it should be noted that two or more
rings, or two or more single chains having P > 1, can join together via all or
only part of their tetrahedra. As a result, complex anions with various Si:O
ratios are generated, depending on the portion of the tetrahedra involved.
As far as criterion (2) is concerned, Liebau proposes that the degree of
stretching should be measured by means of the stretching factor ($f_s$), which
can be obtained as follows:
$$f_s = \frac{I_{chain}}{l_T \, P}$$
where $I_{chain}$ is the identity period of the chain, $l_T$ is the length of the edge of the
tetrahedron, both in Å, and P is the periodicity of the chain (the number of
tetrahedra needed to identify the period).
Since amongst all the silicates so far discovered shattuckite,
Cu5[Si2O6]2(OH)2, has the most stretched chain, it is taken here as a point of
reference. From it the value $l_T$ = 2.70 Å can be obtained (equal to half of the
repetition period of the chain, which is two tetrahedra [SiO4]4-). Thus for
shattuckite $f_s$ = 5.40 Å/(2 × 2.70 Å) = 1.00, while for enstatite, Mg2[Si2O6],
($I_{chain}$ = 5.21 Å, P = 2) $f_s$ = 0.956 and for alamosite, Pb12[Si12O36],
($I_{chain}$ = 19.63 Å, P = 12) $f_s$ = 0.606.
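The three worked values can be reproduced with a short function. The sketch assumes the ratio implied by the shattuckite example, i.e. the identity period divided by P times the reference edge length of 2.70 Å; the function name is hypothetical.

```python
L_T = 2.70  # Å, reference tetrahedron edge length derived from shattuckite

def stretching_factor(identity_period, periodicity, l_t=L_T):
    """Stretching factor of a chain: identity period (Å) divided by
    the length of `periodicity` fully stretched tetrahedra."""
    return identity_period / (l_t * periodicity)

examples = [("shattuckite", 5.40, 2), ("enstatite", 5.21, 2), ("alamosite", 19.63, 12)]
for name, period, p in examples:
    print(name, round(stretching_factor(period, p), 3))
# prints: shattuckite 1.0, enstatite 0.965, alamosite 0.606
```

With the identity period printed for enstatite (5.21 Å) the ratio evaluates to about 0.965 rather than the quoted 0.956, so one of the two printed figures may have been altered in reproduction.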
The crystallochemical classification presents the same periodicity
characteristics as the periodic system of the elements, so that, for example, by
gradual condensation of [TO4] tetrahedra in a linear way, uB anions can be
generated that are linked only by means of vertices in the following way:
(1) for D = 0 the number of tetrahedra that can be condensed linearly
increases as M increases, so that for M → ∞, D → 1; in which case a
single uB chain will result;
(2) for D = 1 the number of chains increases with M, and for M → ∞,
D → 2, thus forming a single uB layer;
(3) when single layers are condensed and for M → ∞, D → 3, a
three-dimensional building of tetrahedra, i.e. a uB framework, is obtained.
By varying M and D the whole of Table 6.8 can be obtained, which lists
Table 6.8. Chemical and mineralogical (in round brackets) nomenclature of silicates

                          M = 1             M = 2             M = 3          ...
D = 0   Oligosilicates    Monosilicates     Disilicates       Trisilicates   ...
        (-)               (Nesosilicates)   (Sorosilicates)
Structural formulae
A structural formula should contain the largest possible amount of
information and therefore should include the parameters used in
crystallochemical classification. Some of the parameters, CN and P, are best used
only when there is some ambiguity: the first is written in square brackets as
a right-hand superscript to the cation, while the second is written without
brackets as a left-hand superscript to the cation. Other parameters, $N_{an}$ and
L, can be deduced directly from the structural formula. Therefore a
structural formula containing the essential parameters is as follows:
anions with CN = 4 are found with L > 1 (at present only one case has been
found with L = 2). Highly electronegative cations favour a higher degree of
linkedness because the [SiO4] effective charge tends to be reduced.
3. B: branched silicate anions, in particular branched ring anions, are less
stable than unbranched varieties, due to the shorter average Si-Si distances.
If, however, cations with high EN are present, then the stability of such
silicate anions is increased, since they attenuate the effective negative
charge thus reducing the repulsion between the tetrahedra. The stabilizing
effect of such cations decreases as D increases, because in such cases the
effective charge of the tetrahedra [TO4] also decreases.
4. M: the number of structures with increasing M falls drastically both in
the oligosilicates and in the cyclosilicates, and in polysilicates and
phyllosilicates. This is due to the decided increase in the potential energy of
structures as M increases. To examine this phenomenon the differences in
energy between linear groups of tetrahedra will be analysed. It should be
borne in mind that the greater their distance from the tetrahedron which
terminates the group, the less two contiguous tetrahedra will vary from the
point of view of energy. For example two Q1 tetrahedra are similar from the
point of view of energy because they each start and terminate a group. Two
Q1 tetrahedra and one Q2, belonging to a group of three tetrahedra, are very
different from the point of view of energy, because the distance that
separates Q2 from the external tetrahedron Q1 is only that of one
tetrahedron. Finally, in a single unbranched chain every tetrahedron has
practically the same energy as that next to it, since both are more or less at
the same distance from the end tetrahedron. In contrast a linear group of
tetrahedra could, from an energetic point of view, be regarded as being
constituted by different structural units.
To this the principle of parsimony is applied (see p. 436, Pauling's fifth
rule), according to which the smaller the number of structural units a
structure is composed of, the more stable it is. This explains the uniform
number reduction of the groups as M increases.
Fig. 6.41. Single and double layers (the fundamental chain is hatched) and the structural formulae of: (a) muscovite, KAl2$\{uB, 1^2_\infty\}$[$^2$(AlSi3)O10](OH)2, and talc, Mg3$\{uB, 1^2_\infty\}$[$^2$Si4O10](OH)2; (b) apophyllite, KCa4$\{uB, 1^2_\infty\}$[$^4$Si8O20](F,OH)·8H2O; (c) hexacelsian, Ba$\{uB, 2^2_\infty\}$[$^2$(AlSi)O4]2 (hT).
What has been said above regarding linear groups can be extended to
chains or layers, by considering the whole chain or the whole layer as a
structural unit, instead of a tetrahedron.
5. D: the general rule is that where there is a fixed Si:O ratio, the silicate
anions tend to join together, in accordance with D → max. Though there
are exceptions to this rule, it can be justified by the fact that condensation
of the [SiO4]4- tetrahedron results in a better local electrostatic valence
balance, in accordance with Pauling's second rule.
6. t, r: multiple tetrahedron silicate anions are more stable than cyclic
ones. In fact, since the average Si4+-Si4+ distances are greater for the
former, the force of repulsion between Si4+ and Si4+ is less.
7. P: the periodicity of a chain can be evaluated, in general, by means of
the stretching factor $f_s$. The higher this factor, the more the chain is
stretched, and the lower is P. There are, however, important exceptions
where P increases with $f_s$ (pyroxenoids and pyroxenes).
8. $f_s$: this factor depends, in particular, on EN and on the average valence
(⟨v⟩) of the cations. Strongly electropositive cations tend to increase the
negative charge of the [TO4] groups; therefore the repulsive forces which
act between these groups are greater and as a result $f_s \to 1$. Vice versa,
strongly electronegative cations, by reducing the charge of the [TO4]
groups, reduce $f_s$ and so P tends to increase.
An increase in the ⟨v⟩ of the cations has a similar effect. In fact, when
⟨v⟩ is high, a greater number of oxygens of the chain must be involved in
cation-oxygen bonds. As a consequence $f_s$ is reduced and P increases.
The ionic radius of the cation has more effect on the distortion of the
polyhedra than on the periodicity P. The greater the difference between the
silicon radius and that of the other cations, the greater this effect will be.
9. $N_{an}$: bearing in mind still the principle of parsimony, it can be stated
that the number of the structural units, distinguished either chemically or
geometrically, must be as small as possible. This explains why the vast
majority of silicates have $N_{an}$ = 1; only a few structures have $N_{an}$ = 2 (e.g.
okenite, Ca10$\{uB, 2^1_\infty\}$[$^3$Si6O16]$\{uB, 1^2_\infty\}$[$^3$Si6O15]2·18H2O) and there are no
known crystal structures at present with $N_{an}$ > 2; however in glasses, melts,
etc. $N_{an}$ > 2 is common. Silicates with mixed anions are favoured by cations
with high EN.
10. s: for a given Si:O ratio of the silicate anion, high EN cations favour
a high s value.
Fig. 6.42. Three-dimensional buildings of [TO4] tetrahedra, and the structural formulae of: (a) tridymite, $\{uB, ^3_\infty\}$[$^2$Si2O4]; (b) cristobalite, $\{uB, ^3_\infty\}$[$^3$Si2O4] (note the thickened fundamental chain and how in cristobalite the layers perpendicular to [111] are all orientated the same way, while in tridymite the same layers perpendicular to [0001] are turned through 180° compared to one another); (c) orthoclase, K$\{lB, ^3_\infty\}$[$^3$(AlSi3)O8] (mC). The structure of orthoclase is represented schematically. The letters U and D indicate the position of the [TO4] tetrahedra with one vertex pointing either up (U) or down (D); (d) the fundamental chain lB, on which the structure of orthoclase is based (the 'branch' consists of the dotted tetrahedra).
Appendices
6.A Application of the concept of the packing coefficient ($c_i$)
In general, calculated packing coefficients almost never match the expected
value (0.74). It is sometimes found that structures with lower $c_i$ (~0.60)
Oxides
Periclase [MgO]
Rutile [TiO2]
β-Tridymite [SiO2]
Coesite [SiO2]
Stishovite [SiO2]
Brucite [Mg(OH)2]
Gibbsite [Al(OH)3]
Ilmenite [FeTiO3]
Spinel [MgAl2O4]
Carbonates
Aragonite [CaCO3] $c_i$ = 0.61 (δ = 2.93)
Calcite [CaCO3] $c_i$ = 0.56, c.p. (δ = 2.71)
Alstonite [BaCa(CO3)2] $c_i$ = 0.51
Ewaldite [Ba3Ca2(CO3)5] $c_i$ = 0.55
Shortite [Na2Ca2(CO3)3] $c_i$ = 0.53
Tychite [Na2Mg2(CO3)4(SO4)] $c_i$ = 0.55
Gaylussite [Na2Ca(H2O)5(CO3)2] $c_i$ = 0.51
Borates
Kotoite [Mg3(BO3)2] $c_i$ = 0.67, c.p.
Tincalconite [Na2B4O5(OH)4·3H2O] $c_i$ = 0.55
Kernite [Na2B4O6(OH)2·3H2O] $c_i$ = 0.53
Borax [Na2B4O5(OH)4·8H2O] $c_i$ = 0.53
Sinhalite [MgAl(BO4)] $c_i$ = 0.76, c.p.
Aksaite [Mg(B3O4(OH)2)2·3H2O] $c_i$ = 0.44
Gowerite [Ca(B3O4(OH)2)2·3H2O] $c_i$ = 0.42
Sulphates
Baryte [BaSO4] $c_i$ = 0.53
Chlorothionite [K2Cu(SO4)Cl2] $c_i$ = 0.50
Fibroferrite [Fe(OH)SO4·5H2O] $c_i$ = 0.53
Parabutlerite [Fe(OH)SO4·2H2O] $c_i$ = 0.60
Hohmannite [Fe(H2O)4[(SO4)2O]·4H2O] $c_i$ = 0.58
Coquimbite [Fe2(SO4)]·9H2O $c_i$ = 0.55
Phosphates
Triphylite [Li(Fe2+, Mn2+)PO4]
Heterosite [(Mn, Fe)PO4]
Hydroxylapatite [Ca5(OH)(PO4)3]
Brushite [CaHPO4·2H2O]
Moraesite [Be2(OH)PO4·4H2O]
Struvite [NH4MgPO4·6H2O]
Silicates
Forsterite [Mg2SiO4]
Larsenite [PbZnSiO4]
Monticellite [CaMgSiO4]
Humite [Mg7(OH,F)2(SiO4)3]
Zircon [ZrSiO4]
Grossular [Ca3Al2Si3O12]
Enstatite [Mg2Si2O6]
Anthophyllite [Mg7Si8O22(OH)2]
Pyrophyllite [Al2Si4O10(OH)2]
Orthoclase [KAlSi3O8]
Some remarks concerning the relationships between the anion arrange-
ment and the packing coefficient are helpful.
The compound CaCO3 crystallizes in two structures (polymorphs): calcite
($c_i$ = 0.56) and aragonite ($c_i$ = 0.61). Calcite can be described by means of a
close-packed oxygen arrangement with & triangular and 4 octahedral sites
filled respectively by C4+ and Ca2+. This packing is not the closest possible
for CaCO3 because Ca2+ is surrounded by six oxygens, i.e. in this structure,
calcium shows the lowest CN among those adopted (see Table 6.7). If
pressure increases, a more compact structure (aragonite, in which Ca2+ has
CN = 9) originates; this structure no longer conforms to the closest packing
model.
The compound SiO2 gives rise to several phases. Some of them are
considered here: (1) β-tridymite, stable between 870 and 1470 °C;
(2) coesite, roughly stable from 30 to 100 kbar; and (3) stishovite, a very
high-pressure phase, stable above 100 kbar. The β-tridymite structure ($c_i$ = 0.50,
δ = 2.22) is an infinite three-dimensional framework of [SiO4]4- tetrahedra.
It can be sliced into sheets of tetrahedra like muscovite (Fig. 6.41(a)) but
arranged in such a way that the tetrahedra vertices alternate up and down.
The structure of coesite ($c_i$ = 0.67, δ = 2.91) is quite dense and somewhat
more complex compared with that of β-tridymite. The main difference
between them concerns the second and following Si4+ coordinations that, of
course, are greater for coesite. This explains the different packing
coefficients, though neither of them (see p. 429 and following) conforms to the
closest packing. Stishovite has an exceptionally high packing coefficient
($c_i$ = 0.99) and a density (δ = 4.29) much greater than that of β-tridymite or
coesite. The structure is rutile type, TiO2 (Fig. 6.26(c)), so it conforms to
the closest packing model. The very high pressure under which stishovite
crystallizes forces Si4+ to renounce its habitual four for an unusual six
coordination.
According to the mechanical closest packing model it is impossible to
have $c_i$ = 1, unless we suppose that the available space of the structure is
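where $CN_A$ and $CN_B$ are respectively the coordination numbers of cations A
and B.
The relationship between the average cation coordination number
($\overline{CN}_c$) and the average anion coordination number ($\overline{CN}_x$) is expressed by
$$\overline{CN}_c / \overline{CN}_x = p/(m + n). \qquad (A.2)$$
Replacing A.1 in A.2, the latter can be written as
$$m\,CN_A + n\,CN_B = p\,\overline{CN}_x. \qquad (A.3)$$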
parameters, and symmetry, showing how they can sometimes have predictive
value.
At first, we will assume tentatively that the ionic structures having
$c_i$ > 0.60 can be described in terms of a close-packed anion arrangement
with cations filling the holes, even if in some cases this hypothesis does not
hold (as some of the following examples will prove).
1. Periclase, MgO, $Fm\bar{3}m$, and Z = 4, has a packing coefficient $c_i$ = 0.62.
According to the accredited Mg2+ CNs, 4, 5, 6, 8 (see Table 6.7), this cation
can fill the tetrahedral (half) or the octahedral sites of a structure based on a
close-packing model. Moreover the space group $Fm\bar{3}m$ and the number of
formula units Z = 4 inform us that Mg2+ and O2- lie at 1/2, 1/2, 1/2 and 0, 0, 0
or vice versa, from which $\overline{CN}_x = CN_x$ follows. Using (A.3) in the
simpler form $m\,CN_A = p\,CN_x$ we reach an obvious conclusion, i.e. Mg2+ and
O2- have the same CN (4 or 6) and of course for both coordinations the
electrostatic-valence principle is satisfied. To solve the CN dilemma, we
observe that periclase is isotypic with lime, CaO, and that the accredited CNs for
Ca2+ are 6, 7, 8, 9, 10, 12 (Table 6.7). Only 6 and 8 belong to both the Mg2+
and Ca2+ CN sets; consequently only the octahedral sites of a close-packed
anion arrangement can be filled.
2. Rutile, TiO2, $P4_2/mnm$, and Z = 2, has a packing coefficient $c_i$ = 0.74.
The accredited CNs of Ti4+ are 4, 5, 6, 8 (Table 6.7). Therefore, according
to the close packing and valence requirements, Ti4+ can fill tetrahedral or
octahedral sites. The octahedral site stabilization energy (OSSE = 0) cannot
remove the uncertainty regarding the site, even if the high electrical charge
informs us that Ti4+ prefers the octahedral site. It can be observed that,
according to the space group, Ti4+ can be placed at 0, 0, 1/2 or 0, 0, 0 and that
O2- can lie on 2/m or $\bar{4}$ or mm (from 4c to 4g according to International
Tables of X-ray Crystallography Vol. A); in this case $\overline{CN}_x = CN_x$. No
profitable information can be drawn from isotypic structures, but we know
that Nb4+ and Ta4+ can replace Ti4+ in very high percentage (~0.40). The
common CN for these two cations, i.e. 6, indicates that very probably Ti4+
prefers the octahedral site.
From (A.3) the calculated oxygen coordination number is $\overline{CN}_x$ = 3, i.e.
three Ti4+ surround one O2- whereas, as shown above, Ti4+ seems linked to
six oxygens.
3. Ilmenite, FeTiO3, $R\bar{3}$, and Z = 2, has a packing coefficient $c_i$ = 0.66.
According to the space group, two Fe2+ and two Ti4+ can lie on a three-fold
axis or one of them on a three-fold axis and the other on the two
independent $\bar{3}$ (1a, 1b). As for the oxygen atoms, they can be situated on
the two independent $\bar{1}$ (3e, 3d) or in the general position 1 (6f); in both
cases ($\bar{1}$ or 1) $\overline{CN}_x$ must be an integer. The accredited CNs for Fe2+ and Ti4+
are respectively 4, 6, 8 and 4, 5, 6, 8 (Table 6.7), so either Fe2+ or Ti4+ may
fill tetrahedral and/or octahedral sites. From (A.3) three different $\overline{CN}_x$ can
be calculated for ilmenite:
$$1\cdot 4 + 1\cdot 4 = 3\,\overline{CN}_x; \quad \overline{CN}_x = 8/3$$
$$1\cdot 4 + 1\cdot 6 = 3\,\overline{CN}_x; \quad \overline{CN}_x = 10/3$$
$$1\cdot 6 + 1\cdot 6 = 3\,\overline{CN}_x; \quad \overline{CN}_x = 4.$$
Only the third result agrees with the statement that $\overline{CN}_x$ must be an
integer and precisely it tells us that four cations are linked to each oxygen
without specifying their nature. Applying the electrostatic-valence rule to
the three possible combinations 1Fe2+ + 3Ti4+, 2Fe2+ + 2Ti4+, 3Fe2+ +
1Ti4+, we find respectively 2.33, 2, 1.67 v.u. (valence units). So 2Fe2+ and
2Ti4+ are the cations coordinated by one oxygen atom, in agreement with
the stoichiometry.
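The two checks used in the ilmenite example can be sketched with exact fractions. The sketch assumes that (A.3) has the form implied by the ilmenite arithmetic (sum of cation count times cation CN equals p times the mean anion CN) and that each cation contributes a bond strength of charge/CN to the anion; the function names are hypothetical.

```python
from fractions import Fraction

def mean_anion_cn(cations, p):
    """Mean anion coordination number.
    cations: list of (count in formula, coordination number);
    p: number of anions in the formula."""
    return Fraction(sum(m * cn for m, cn in cations), p)

def valence_sum(neighbours):
    """Electrostatic valence received by one anion.
    neighbours: list of (number of cations, charge, coordination number)."""
    return sum(k * Fraction(z, cn) for k, z, cn in neighbours)

# Ilmenite FeTiO3: tetrahedral (4) or octahedral (6) sites for Fe2+ and Ti4+.
for cn_fe, cn_ti in [(4, 4), (4, 6), (6, 6)]:
    print(cn_fe, cn_ti, mean_anion_cn([(1, cn_fe), (1, cn_ti)], 3))
# prints the mean anion CN 8/3, 10/3 and 4 in turn

# Four octahedral cations around one oxygen: n Fe2+ and (4 - n) Ti4+.
for n_fe in (1, 2, 3):
    total = valence_sum([(n_fe, 2, 6), (4 - n_fe, 4, 6)])
    print(n_fe, 4 - n_fe, round(float(total), 2))
# prints the sums 2.33, 2.0 and 1.67 in turn
```

The same two functions apply to any other stoichiometry by changing the cation list and the anion count.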
4. Perovskite, CaTiO3, $Pm\bar{3}m$, and Z = 1, has a packing coefficient
$c_i$ = 0.62. The space group and Z inform us that Ca2+ and Ti4+ are in $m\bar{3}m$
(1a, 1b) and that oxygen lies on 4/mmm (3c or 3d), i.e. the number of
cations around it is an integer and, what is more, even. The accredited CNs
for Ca2+ are 6, 7, 8, 9, 10, 12 and for Ti4+ they are 4, 5, 6, 8. So according
to $c_i$ and CN, Ca2+ and Ti4+ should fill, like ilmenite, octahedral sites. This
eventuality would require that 2/3 of the octahedral sites be filled by Ca2+ and
Ti4+ and that consequently several edges (three are in the sheet) be shared
between Ca2+ and Ti4+ octahedra. Now we observe that for CN = 6,
unlike ilmenite ($r_e$ = 0.61 Å (Fe2+ low spin) and $r_e$ = 0.605 Å (Ti4+)),
the Ca2+ and Ti4+ effective ionic radii are quite different: $r_e$ = 1.00 Å (Ca2+)
and $r_e$ = 0.605 Å (Ti4+). This strong difference is reflected in the octahedral
edges that, if shared, would involve a strong strain in the structure and
consequently an increase of its potential energy. Moreover for CN = 12,
the Ca2+ effective ionic radius ($r_e$ = 1.34 Å) matches well with the O2- one,
so we have a special close packing form in which 1/4 of the oxygens are replaced
by Ca2+. For Ti4+ in tetrahedral or octahedral sites and Ca2+
12-coordinated, (A.3) gives respectively $\overline{CN}_x$ = 16/3 or $\overline{CN}_x$ = 6, supporting the
hypothesis that Ti4+ is six-coordinated like the oxygen. As far as the nature
of the cations around the oxygen is concerned, the valence sum principle
indicates that around the oxygen atom there are two Ti4+ and four Ca2+.
5. Zircon, ZrSiO4, $I4_1/amd$, and Z = 4, has a packing coefficient $c_i$ =
0.70. In accordance with the space group and the number of formula units Z,
zirconium and silicon are located at $\bar{4}2m$ (4a and 4b). In theory, oxygen
can be placed at 2/m (8c and 8d) or at 2mm (8e × 2), but it can easily be
shown that, starting from the unit cell parameters, a = 6.60 and c = 5.98 Å,
the calculated interatomic distances Zr4+-O2- and Si4+-O2- do not match
well with those obtained from the effective ionic radii (Table 6.7). For
instance, supposing the cation (M4+), i.e. Zr4+ or Si4+, at 0, 0, 0 (4a) and
O2- at 0, 1/4, 1/8 (8c), then the M4+-O2- calculated distance is 1.81 Å ≠ 1.66 Å
(Si4+-O2-) and 2.12 Å (Zr4+-O2-). On the other hand the situation does not
improve if O2- is located at 0, 0, z because the calculated Zr4+-O and
Si4+-O distance sum is 2.99 Å (5.98/2 Å) ≠ 3.78 Å (1.66 + 2.12 Å).
These kinds of argument support the hypothesis that O2- is situated on 2 or
m (16f or 16g or 16h), so $\overline{CN}_x = CN_x$.
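The trial distances in the zircon example can be reproduced from fractional coordinates. The sketch below assumes a tetragonal cell with orthogonal axes and takes the oxygen trial position as 0, 1/4, 1/8, the coordinates that reproduce the quoted 1.81 Å; it ignores periodic images, and the function name is hypothetical.

```python
import math

def tetragonal_distance(p, q, a, c):
    """Distance (Å) between fractional positions p and q in a tetragonal
    cell with edges a, a, c (orthogonal axes, no periodic images)."""
    dx, dy, dz = (u - v for u, v in zip(p, q))
    return math.sqrt((a * dx) ** 2 + (a * dy) ** 2 + (c * dz) ** 2)

a, c = 6.60, 5.98  # Å, cell parameters quoted in the text

# Cation at the origin, oxygen at the assumed trial position 0, 1/4, 1/8.
d_trial = tetragonal_distance((0, 0, 0), (0, 0.25, 0.125), a, c)
print(round(d_trial, 2))  # prints: 1.81

# Oxygen on the c axis between Zr and Si, which are c/2 apart:
# the two bond lengths must then add up to c/2.
print(round(c / 2, 2), round(1.66 + 2.12, 2))  # prints: 2.99 3.78
```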
The packing coefficient would suggest that zircon adopts a close-packing
structure. However, if Si4+ cations are located at the habitual tetrahedral
holes, $\overline{CN}_x$ assumes integer values (A.3) when Zr4+ fills tetrahedral sites or
when it is 8- or 12-coordinated. Let us now consider, besides ZrSiO4, the
zircon isotypic structures: coffinite, USiO4; thorite, ThSiO4; hafnon,
HfSiO4. The accredited CNs (Table 6.7) of the bigger cations are: 4, 5, 6, 7,
8, 9 (Zr4+); 6, 7, 8, 9, 12 (U4+); 6, 8, 9, 10, 11, 12 (Th4+); 4, 6, 7, 8 (Hf4+).
They show that some CNs (6 and 8) are present in all four cations. This
peculiarity together with the results of (A.3) suggest that a packing,
462 1 Fernando Scordari
References
1. Cartmell, E. and Fowles, G. W. A. (1966). Valency and molecular structure.
Butterworth, London.
2. Huheey, J. E. (1983). Inorganic chemistry. Principles of structure and reactivity,
(3rd edn). Harper, New York.
3. Vainshtein, B. K., Fridkin, V. M., and Indenbom, V. L. (1982). Modern
crystallography II. Springer. Berlin.
4. Pauling, L. (1959). The nature of the chemical bond. Cornell University Press,
Ithaca.
5. Hamilton, W. C. and Ibers, J. A. (1968). Hydrogen bonding in solids.
Benjamin, New York.
6. Greenwood, N. N. (1970). Ionic crystals, lattice defects and non-stoichiometry.
Butterworths, London.
7. Catti, M. (1978). Acta Crystallographica, A34, 974.
8. Cotton, F. A. and Wilkinson, G. (1980). Advanced inorganic chemistry, (4th
edn). Wiley, New York.
9. Orgel, L. E. (1960). An introduction to transition-metal chemistry ligand field
theory. John Wiley, New York.
10. Basolo, F. and Johnson, R. (1964). Coordination chemistry. Benjamin, New
York.
11. Ballhausen, C. J. (1962). Introduction to ligand field theory. McGraw-Hill, New
York.
12. Burns, R. G. (1970). Mineralogical applications of crystal field theory.
Cambridge University Press.
13. Busing, W. R. (1981). WMIN, computer program to model molecules and
crystals in terms of potential energy functions, U.S. National Technical
Information Service, ORNL-5747.
14. Catlow, C. R. A. and Cormack, A. N. (1984). Acta Crystallographica, B40, 195.
15. Alberti, A. and Vezzalini, G. (1978). Zeitschrift für Kristallographie, 147, 167.
16. Brown, G. E. and Fenn, P. M. (1979). Physics and Chemistry of Minerals, 4, 83.
17. Parker, S. C., Catlow, C. R., and Cormack, A. N. (1984). Acta
Crystallographica, B40, 200.
18. Shannon, R. D. and Prewitt, C. T. (1969). Acta Crystallographica, B25, 925.
19. Shannon, R. D. (1976). Acta Crystallographica, A32, 751.
20. Fajans, K. (1923). Naturwissenschaften, 11, 165.
phenomenon the reader is referred to the second part of Dunitz's book
X-ray analysis and the structure of organic molecules.[4] The attempts aimed
at interpreting, rationalizing, and understanding this continuously growing
mass of data give rise to problems far beyond the limits of a single chapter,
being dealt with by the scientific discipline of structural chemistry,
initiated more than 50 years ago by Linus Pauling in his famous book The
nature of the chemical bond and the structure of molecules and crystals. An
introduction to modern structural chemistry.
The present chapter is intended to summarize the structural chemistry of
molecular crystals. As regards its content, it has been necessary to be
severely selective of topics. The discussion is strictly confined to the nature
of molecular crystals and to the stereochemical aspects of molecules which
are directly derived from the results of structural investigation. Final
applications, such as structure-property studies in chemistry and molecular
biology or pharmacology, have been completely excluded. The topics have
been divided into two parts, according to whether they refer to single
molecules or molecular crystals as a whole, and can be summarized as follows:
(1) molecular crystals: molecular interactions, their nature and effect on the
crystal packing; elements of crystal thermodynamics and polymorphism;
(2) single molecules: classical stereochemistry; molecular geometry and
chemical bond; molecular mechanics; interpretation of molecular
structures and the structure correlation method.
Fig. 7.2. Curves of non-bonded energy, $E_{nb}(r)$, as a function of
internuclear distance r for the C-C, C-H, and H-H interactions.
their proper contact distance, given by the sum of their van der Waals radii
(Table 7.1).
Dispersion or London forces
These are weak short-range attractive forces which decrease with the sixth
power of the intermolecular distance and are caused by the mutual
attraction of the small transient dipoles that molecules can induce in each
other. They give the greatest contribution to the lattice energy in crystals of
neutral molecules even if the energy of any single interaction is very small.
Dispersion energy between two molecules is usually expressed as the sum of
the crossed interactions between all pairs of atoms on the two molecules and
the sum of repulsion and dispersion energies is named non-bonded energy,
$E_{nb}$. When plotted versus the interatomic distance, r, the non-bonded
energy displays a typical minimum at a distance nearly equal to the sum of the
van der Waals radii of the two atoms (Fig. 7.2). Such a minimum is quite
shallow, being of the order of a few tenths of a kcal mol⁻¹.
Dipolar forces
Molecules having permanent dipole moments experience electrostatic
attraction when properly oriented (orientation forces). According to
Kitaigorodskii, such attraction forces must cancel out in crystals having
only translational symmetry operations (space group P1); for other space
groups it has been estimated that the dipole-dipole interactions may
contribute one tenth of the lattice energy for molecules having a dipole
moment of 3-4 D, while their contribution becomes negligible when the
dipole moment is less than 1 D.
Monopolar forces
Monopoles are associated with ions, a situation which is not very common
in molecular crystals and is not discussed here in detail. Two main points,
however, deserve particular attention: ionic interactions are known to cause
a considerable increase in lattice energies, as can be shown, for example, by
the comparison of these energies for acetic acid and sodium acetate,
respectively 17.4 and 182 kcal mol⁻¹; monopole-monopole interactions are
long-range forces whose energy decreases only with the first power of the
470 1 Gastone Gilli
interionic distance (Coulomb law) and in this respect differ from all other
short- or very-short-range intermolecular forces.
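The difference in range can be made concrete with a one-line comparison of the two power laws. This is a minimal sketch with a hypothetical helper name: it only evaluates how an energy proportional to a negative power of the distance changes when the distance is doubled.

```python
def relative_energy(distance_factor, power):
    """Factor by which an energy proportional to r**(-power) changes
    when the distance is multiplied by distance_factor."""
    return distance_factor ** (-power)

coulomb_at_double = relative_energy(2.0, 1)      # monopole-monopole, r^-1
dispersion_at_double = relative_energy(2.0, 6)   # dispersion, r^-6

print(coulomb_at_double, dispersion_at_double)   # 0.5 0.015625
```

Doubling the distance halves the Coulomb energy but reduces a sixth-power term to less than 2 per cent of its value, which is why summations over ionic interactions must be carried much further out.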
Hydrogen bonding
With the exception of monopolar forces, H-bonds are the highest-energy
interactions in molecular crystals. They greatly affect the way in which
molecules are packed, in the sense that the observed packing is almost
inevitably that allowing the maximum number of such bonds to be made.
Moreover, H-bonding is, by itself, the most relevant non-bonded interaction
in nature, being the main factor determining the structure of water, the
folding of proteins, and the pairing of bases in DNA. For this reason most
crystal packing studies are essentially attempts to understand the laws
governing the intermolecular H-bonding in an easily reproducible ex-
perimental environment, that given by the molecular crystal.
H-bonding occurs when a hydrogen atom is bonded to two (or sometimes
more) other atoms. This situation may be depicted schematically as
D-H···A, where D is the H-bonding donor and A the acceptor. In
principle all atoms more electronegative than hydrogen (C, N, O, F, S, Cl,
Se, Br, I) can play the role of A and D, though stronger hydrogen bonds
are necessarily associated with the most electronegative ones (N, O, F, Cl).
Several theoretical studies have been devoted to clarifying the nature of
the bond in H-bonding complexes and in particular the relative contributions
of different terms to its total energy. Probably the most popular and
quoted partitioning scheme is that developed by Umeyama and Morokuma[19]
for the treatment of (H2O)2 and (HF)2 dimers. It makes use
of the energy decomposition analysis developed within the ab initio
SCF-MO theory,[20,21] where the total H-bonding energy is partitioned into
four terms: the repulsion or exchange energy and the electrostatic,
polarization, and charge transfer attraction energies. The authors were able
to conclude that the main attractive term is electrostatic and that the
contribution of charge transfer is small, so that H-bonding can be
qualitatively defined as an 'electrostatic more than charge transfer', or
simply electrostatic, interaction.
H-bonds can be classified (Fig. 7.3) according to their topology as
intramolecular, intermolecular, or bifurcated and, according to their
energy, as going from weak to very strong.
1. Weak H-bonding can be observed for any pair of donor and
acceptor atoms whenever the two groups cannot achieve the correct
approach for some steric reason. The main factor is usually the D-H···A
angle which, to maximize the electrostatic interaction between the D-H
dipole and the negatively charged acceptor, must be in the range of some
160-180°. A good example comes from the intramolecular H-bonds closing
five-, six-, or seven-membered rings; the H-bond closing a five-membered
ring is always weak, so weak that the hydrogen of its D-H group forms,
whenever possible, a second bifurcated hydrogen bond with another
acceptor (Fig. 7.3(g1)). A second reason why an H-bond is to be classified as
weak comes from the small intrinsic electronegativities of the H-bonded
partners, the most classical case being that of the C-H···A interactions.[22]
2. Medium H-bonding is typical of water, alcohols, amines, amides, and
carboxylic acids. Its geometry is rather well defined: the O-H···O group
Molecules and molecular crystals 1 471
Charge transfer[26-29]
Intermolecular charge transfer or donor-acceptor interactions occur
between electron donors (Lewis bases) and acceptors (Lewis acids). They
establish an at least partially covalent bond between highly polarizable
groups, which is often described as the formation of a molecular orbital by
electron donation from the highest occupied molecular orbital (HOMO) of
the donor to the lowest unoccupied molecular orbital (LUMO) of the
acceptor. Classical examples are the molecular complexes NH3 + BF3 =
H3N-BF3 and I2 + I- = [I-I-I]-, or the molecular crystal of iodine, where the
I2 molecule is both an acceptor along the interatomic axis and a donor
perpendicular to it. This type of interaction occurs in crystals only in the
presence of large and easily polarizable (soft) atoms. Another type of
donor-acceptor interaction which has been actively studied in the recent
past is that present in metallic or semiconducting organic crystals, a class of
mixed crystals which contain planar donor and acceptor molecules packed in
separated (segregated) infinite stacks and have attracted great interest
for their potential applications in electronics. For a recent review of their
structural aspects see reference [30].
displace the molecules from their sharp minima, already determined by the
short-range balance of repulsion and van der Waals interactions of the outer
atoms of the molecule, so that these latter remain the true controlling factor
of the packing arrangement. The only exception to this rule is represented
by the hydrogen bond, which is electrostatic in nature but requires a quite
specific geometry in the donor-acceptor D-H···A interaction and, moreover,
involves energies much greater than dispersion interactions do.
The important role played by short-range van der Waals forces in the
crystal packing helps us to understand the apparently paradoxical fact that
the large majority of molecular crystals belong to a few space groups having
second-order symmetry elements. Since van der Waals forces are both
adirectional and additive, the energy of any single atom is lower the greater
the number of atoms of other molecules surrounding it at contact distances
or, in other words, the most stable crystal is that in which molecules pack
with the highest coordination number. A simple geometrical
analysis of the problem has been carried out by Kitaigorodskii, who has
shown that rows of molecules staggered by a glide lattice operation can
produce a very efficient close packing (coordination of twelve) by repeating
a molecule of arbitrary form in the space groups P1̄, P21, P21/c, Pca21,
Pna21, and P212121, or a centrosymmetric molecule in the space groups P1̄,
P21/c, C2/c, or Pbca. These space groups are actually those most frequently
observed for molecular crystals.
Dispersion energy
The first theoretical treatment of dispersion forces is due to London (so
that they are often called London forces), who used perturbation theory
to obtain the following simplified equation for the dispersion energy E(r) of
two molecules whose centres are separated by the distance r

$$E(r) = -\frac{3}{2}\,\frac{\alpha'_1 \alpha'_2\, I_1 I_2}{I_1 + I_2}\, r^{-6} \qquad (7.1a)$$

where $\alpha'_1$ and $\alpha'_2$ are the molecular polarizability volumes and
$I_1$ and $I_2$ their first ionization potentials. For identical molecules it
becomes

$$E(r) = -\frac{3}{4}\,\alpha'^2 I\, r^{-6} \qquad (7.1b)$$

showing that the attraction energy increases for molecules having high
polarizability and ionization potential values and is of the general form
$-c\,r^{-6}$, where c is a constant mainly determined by the value of
$\alpha'$. E(r) is
dipole moment and determines all phase transition temperatures of the
substance. Polarizabilities are known to increase with the molecular volume
and with the number of π bonds (particularly extended systems of
conjugated bonds) and are essentially a measure of how much electron
clouds can be displaced from their equilibrium positions around the nuclei;
higher polarizabilities imply stronger intermolecular attractions and this is
the reason why, at room temperature, larger molecules give crystals, smaller
ones give liquids, and only very small ones are gaseous. It may be said that
dispersion forces (or more basically polarizabilities) are the true reason why
molecular crystals can exist.
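A short numerical sketch of (7.1a) and (7.1b) may help to fix orders of magnitude and to show the steep fall-off with distance. The argon values used below (polarizability volume 1.64 Å³, first ionization potential 15.76 eV) are illustrative inputs assumed here, not data from the text, and the function names are hypothetical.

```python
def london_energy(alpha1, alpha2, ion1, ion2, r):
    """Dispersion energy from the simplified London expression (7.1a).
    alpha1, alpha2: polarizability volumes (A^3); ion1, ion2: first
    ionization potentials (eV); r: separation (A). Result in eV."""
    return -1.5 * alpha1 * alpha2 * ion1 * ion2 / (ion1 + ion2) / r ** 6

def london_energy_identical(alpha, ion, r):
    """Special case (7.1b) for two identical molecules."""
    return -0.75 * alpha ** 2 * ion / r ** 6

# Illustrative inputs for argon (assumed values, not taken from the text)
ALPHA_AR = 1.64    # polarizability volume, A^3
ION_AR = 15.76     # first ionization potential, eV

e_4 = london_energy(ALPHA_AR, ALPHA_AR, ION_AR, ION_AR, 4.0)
e_8 = london_energy(ALPHA_AR, ALPHA_AR, ION_AR, ION_AR, 8.0)

print(f"E(4 A) = {e_4:.5f} eV, E(8 A) = {e_8:.7f} eV")
print(f"ratio E(4 A)/E(8 A) = {e_4 / e_8:.1f}")
```

For identical molecules the general expression reduces to (7.1b), and doubling the separation lowers the attraction by a factor of 64, as expected for an $r^{-6}$ law.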
Atom-atom potentials
When the interacting bodies are atoms (e.g. noble gases) the distance r can
be taken as the internuclear distance. In the case of molecules some
complications arise and the atom-atom approximation is most commonly
used: the total interaction energy between two molecules is expressed as the
sum of those among the constituent atoms. The two-atom potentials in the
generalized form
$$E_{nb}(r) = A \exp(-Br)\, r^{-D} - C\, r^{-6} \qquad (7.4)$$
are often named atom-atom potentials. Values for the A, B, C, and D
parameters are not derived from theory but determined by experiment, such
as gas deviations from ideality, compressibility of liquids and solids, and
neutron scattering by liquids. Otherwise, the parameters minimizing the
difference between observed and calculated values of some physical quantity
(typically lattice energies and cell parameters for crystals; bond distances,
bond angles, and torsion angles for molecules) are used.
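As a minimal sketch, (7.4) can be coded directly and its minimum located numerically. Setting B = 0 and D = 12 turns the repulsive term into an inverse twelfth power (the familiar 12-6 form), for which the minimum is known in closed form and can serve as a check. The parameter values below are hypothetical, chosen only to give a shallow minimum at a plausible contact distance, and the helper names are not from the text.

```python
import math

def e_nb(r, A, B, C, D):
    """Generalized atom-atom potential of eqn (7.4)."""
    return A * math.exp(-B * r) * r ** (-D) - C * r ** (-6)

def grid_minimum(func, r_lo, r_hi, n=200001):
    """Locate the minimum of func on [r_lo, r_hi] by a plain grid search."""
    best_r, best_e = r_lo, func(r_lo)
    step = (r_hi - r_lo) / (n - 1)
    for k in range(1, n):
        r = r_lo + k * step
        e = func(r)
        if e < best_e:
            best_r, best_e = r, e
    return best_r, best_e

# Hypothetical 12-6 parameters (B = 0, D = 12), for illustration only
A12 = 4.0e5    # kcal mol^-1 A^12
C6 = 4.0e2     # kcal mol^-1 A^6

r_min, e_min = grid_minimum(lambda r: e_nb(r, A12, 0.0, C6, 12.0), 2.0, 8.0)

# Closed-form results for the 12-6 case, used as a consistency check
r_exact = (2.0 * A12 / C6) ** (1.0 / 6.0)
e_exact = -C6 ** 2 / (4.0 * A12)

print(f"grid search: r_min = {r_min:.3f} A, E_min = {e_min:.3f} kcal/mol")
print(f"closed form: r_min = {r_exact:.3f} A, E_min = {e_exact:.3f} kcal/mol")
```

In a fitting procedure of the kind described above, the same function would be evaluated for every atom pair and the parameters adjusted until the calculated quantities reproduce the observed ones.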
Equation (7.4) requires four constants for any pair of atoms, which can be
a problem when the number of different atoms increases. For this reason
other forms of (7.4) are used, such as
Electrostatic energies
The study of interactions among permanent molecular multipoles is
simplified by the fact that the energy involved rapidly decreases with the
order of the multipole itself and it is generally admitted that the role played
by quadrupoles is negligible. Even considering only the first two terms of
the multipolar expansion, the number of terms remains considerable, that is
monopole with monopole, dipole, and induced dipole together with dipole
with dipole and induced dipole. However, a great simplification can be
reached in some cases because interactions with induced dipoles can be
usually neglected in lattice energy calculations and crystals of neutral
molecules (practically the only case so far studied) do not need monopoles
to be taken into account.
A point of general interest concerns the value of the electric permittivity
$\varepsilon$ (normally expressed as the product of the dielectric constant
$\varepsilon_r$ and of the vacuum permittivity $\varepsilon_0$, so that
$\varepsilon = \varepsilon_r \varepsilon_0$), which is included in all of the
following equations. In interactions decreasing rapidly (e.g. with $r^{-6}$ in
the dispersive or dipole-dipole interactions) $\varepsilon_r$ is assumed to be
unity, as in a vacuum, because the space around the atom considered can be
taken as empty within the short distance (usually 15 Å) to which the
calculations are
extended. When monopoles are involved the energy decreases only with the
first or second power of distance; calculations must be extended to a wide
range and the value of $\varepsilon_r$ to be used becomes a complex and barely
known function of the distance in consequence of the shielding effects
produced around the ion by the intervening atoms.
1. Monopole-monopole interactions occur in crystals containing ions.
The interaction energy of ions i and j having formal charges $q_i$ and $q_j$
located at a distance $r_{ij}$ is given by Coulomb's law,
$E = q_i q_j / (4\pi\varepsilon r_{ij})$.
where r is the distance between the charge q and the central point of the
dipole p, and @ the angle between r and the direction of the dipole.
3. The case of dipole-dipole interactions is of great interest as most
neutral molecules have small dipole moments associated with chemical
bonds between atoms of different electronegativities, whose vector sum
produces, unless it vanishes because of symmetry, the overall
molecular dipole moment. The interaction energy is calculated according to
two main models.
In the dipolar model the bond dipole moments are tabulated for all bonds
of interest or directly calculated from the known values of atom
electronegativities. The interaction energy between two dipoles pi and pj
whose centres are separated by the vector q is
and the total dipole-dipole energy, $E_{DD}$, in the crystal is one half of the
sum of all the intermolecular terms. A slightly different way would be
to sum all the small bond dipoles into the overall molecular
dipole moment and to compute the total energy over all molecules.
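For orientation, the interaction of two point dipoles can be sketched numerically. The expression coded below is the standard textbook point-dipole formula, $E = [\mathbf{p}_i\cdot\mathbf{p}_j - 3(\mathbf{p}_i\cdot\hat{\mathbf{r}})(\mathbf{p}_j\cdot\hat{\mathbf{r}})]/(4\pi\varepsilon r^3)$, assumed here because the chapter's own equation is not reproduced in this text. SI units, the 1 D dipoles, and the 4 Å separation are likewise illustrative choices, and the function name is hypothetical.

```python
import math

def dipole_dipole_energy(p1, p2, r_vec, eps_r=1.0):
    """Point-dipole interaction energy in joules.
    p1, p2: dipole vectors (C m); r_vec: vector joining the dipole centres (m)."""
    eps0 = 8.8541878128e-12                      # vacuum permittivity, F/m
    r = math.sqrt(sum(c * c for c in r_vec))
    unit = [c / r for c in r_vec]
    dot12 = sum(a * b for a, b in zip(p1, p2))
    dot1r = sum(a * b for a, b in zip(p1, unit))
    dot2r = sum(a * b for a, b in zip(p2, unit))
    return (dot12 - 3.0 * dot1r * dot2r) / (4.0 * math.pi * eps0 * eps_r * r ** 3)

DEBYE = 3.33564e-30     # C m per debye
ANGSTROM = 1.0e-10      # m
p = 1.0 * DEBYE         # hypothetical 1 D bond dipoles
r = [0.0, 0.0, 4.0 * ANGSTROM]

head_to_tail = dipole_dipole_energy([0, 0, p], [0, 0, p], r)    # both along r
side_by_side = dipole_dipole_energy([p, 0, 0], [p, 0, 0], r)    # parallel, normal to r
antiparallel = dipole_dipole_energy([p, 0, 0], [-p, 0, 0], r)   # opposed, normal to r

for label, e in (("head-to-tail", head_to_tail),
                 ("side-by-side parallel", side_by_side),
                 ("side-by-side antiparallel", antiparallel)):
    print(f"{label:26s} {e:+.3e} J")
```

The sign changes with orientation: head-to-tail and antiparallel arrangements are attractive, parallel side-by-side dipoles repel, which is why the crystal sum depends on the space-group symmetry relating the molecules.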
In the monopolar model the total energy is calculated as the sum of the
Coulomb interactions among all atomic partial charges $q_i$ according to
where $F_{cr}$ and $F_{mol}$ are the inter- and intramolecular contributions
to the total crystal free energy. In this approximation $F_{mol}$ could be
separately calculated at any temperature from the frequencies of the internal
vibrational modes of the rigid molecule by the usual methods of statistical
thermodynamics; the same methods allow us to calculate another molecular
quantity which will be shown to be necessary, the contribution of internal
vibrational modes to the molar heat capacity, $C_{mol}$.
Within our approximations the molar free energy of the crystal can be
written as
capacities at constant pressure and volume and from this difference can
actually be calculated as $\Delta U(T) = \int_0^T (C_p - C_V)\,dT$.
$K_0$ is the zero-point energy, that is the crystal vibrational energy at zero
kelvin; it is known that it may have some relevance only for molecules of
extremely small moment of inertia (N2, O2, CO) and strongly directional
bonds (H2O). In common crystals it is very small (<0.2 kcal mol⁻¹) and can
be neglected. It cannot be dissociated from the lattice energy and the global
term $U + K_0$ can easily be obtained from the experimental molar
sublimation enthalpy at the temperature T according to
$E_{vib}$ is the vibrational part of the internal energy. At not too low
temperatures an oscillator gives a contribution of kT to the internal energy
(equipartition principle). An Avogadro number N of molecules having six
degrees of freedom (three rotational and three translational) accumulates, at
the temperature T in the gaseous phase, a kinetic energy of 6NkT/2 = 3RT; in
the solid state translations and rotations are hindered and become
oscillations and librations around the equilibrium positions which, having both
kinetic and potential components, can accumulate an energy
$E_{vib} = 6NkT = 6RT$, and the crystal should have a molar heat capacity
$C_V = 6R$. In practice $C_V$ tends to zero for T tending to zero because an
ever decreasing number of vibrational levels is accessible, and
$E_{vib} = \int_0^T C_V\,dT$ becomes increasingly smaller than 6RT.
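The departure of $E_{vib}$ from the classical value 6RT can be illustrated with a simple model. The sketch below assumes, purely for illustration, that the six lattice modes per molecule behave as independent Einstein oscillators with a single characteristic temperature; this model, the value of 100 K, and the function names are assumptions and are not taken from the text. The integral of $C_V$ is evaluated by the trapezoidal rule.

```python
import math

R = 1.987204e-3    # gas constant, kcal mol^-1 K^-1

def einstein_cv(T, theta, modes=6):
    """Heat capacity of `modes` Einstein oscillators per molecule
    (kcal mol^-1 K^-1); tends to modes*R at high T and to zero at low T."""
    if T <= 0.0:
        return 0.0
    x = theta / T
    ex = math.exp(-x)              # exp(-x) avoids overflow at low T
    return modes * R * x * x * ex / (1.0 - ex) ** 2

def e_vib(T, theta, steps=20000):
    """E_vib(T) = integral of C_V from 0 to T, by the trapezoidal rule."""
    h = T / steps
    total = 0.5 * (einstein_cv(0.0, theta) + einstein_cv(T, theta))
    for k in range(1, steps):
        total += einstein_cv(k * h, theta)
    return total * h

THETA = 100.0      # hypothetical characteristic temperature, K

for T in (20.0, 100.0, 300.0):
    print(f"T = {T:5.0f} K   E_vib = {e_vib(T, THETA):.4f}   "
          f"6RT = {6 * R * T:.4f} kcal/mol")
```

Within this model the integrated energy always lies below 6RT, the shortfall being proportionally largest at low temperature where few vibrational levels are populated.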
$S_{vib}$ is the vibrational entropy. Clearly $\lim_{T \to 0} S_{vib} = S_0$
for any crystal of a pure substance without static disorder, where $S_0$ is
the residual entropy at absolute zero. $S_0$, a quantity strictly related to
$K_0$, is very small and it is assumed to be zero when the entropies are
measured according to the third law of thermodynamics. $S_{vib}$ can be
calculated as $S_{vib} = S_{cr} = \int_0^T (C_p - C_{mol})\,d(\ln T)$.
The thermodynamic equations needed to evaluate the different terms in
(7.11) and an actual calculation concerning naphthalene at two different
temperatures are summarized in Table 7.4. The experimental data necessary
are not easily found, being the molar heat capacity at constant pressure,
$C_p$, the isobaric thermal expansivity,
$\alpha = (1/V)(\partial V/\partial T)_p$, the isothermal compressibility,
$\beta = -(1/V)(\partial V/\partial p)_T$, and the volume of the unit cell, V,
from zero to the temperature of interest. Two other quantities are needed:
$U_0$, obtained from the molar sublimation enthalpy according to (7.12), and
the contribution of internal vibration modes to the molar heat capacity,
$C_{mol}$, as a function of T, which has already been shown to be obtainable
without great difficulty from spectroscopic data in the case that there is no
mixing of internal modes (intramolecular vibrations) and external modes
(vibrations and librations of the molecules as a rigid body). Figure 7.5
reports the plot
of the calculated values of molar vibrational free energy for the crystal of
naphthalene over a much wider range of temperatures.
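The expansivity, the compressibility, and the volume enter the calculation through the difference between the two heat capacities, for which the standard thermodynamic identity $C_p - C_V = T V \alpha^2/\beta$ holds. A minimal sketch of its use follows; the numerical inputs are hypothetical order-of-magnitude values for an organic molecular crystal, not experimental data for naphthalene, and the function name is an assumption.

```python
def cp_minus_cv(alpha, beta, molar_volume, temperature):
    """C_p - C_V = T * V * alpha**2 / beta, all quantities in SI units.
    alpha: isobaric thermal expansivity (K^-1)
    beta: isothermal compressibility (Pa^-1)
    molar_volume: m^3 mol^-1; temperature: K.
    Returns J mol^-1 K^-1."""
    return temperature * molar_volume * alpha ** 2 / beta

# Hypothetical order-of-magnitude inputs (not measured values)
ALPHA = 3.0e-4       # K^-1
BETA = 2.0e-10       # Pa^-1
V_MOLAR = 1.1e-4     # m^3 mol^-1

diff = cp_minus_cv(ALPHA, BETA, V_MOLAR, 298.0)
print(f"C_p - C_V = {diff:.1f} J mol^-1 K^-1")
```

Because the expansivity enters squared, uncertainties in its measured value dominate the error of the computed difference.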
From these data it is possible to obtain some rules regarding crystals in
general and molecular crystals in particular. The lattice energy constitutes
by far the greatest part of the crystal free energy, $F_{cr}$; at low
temperature and even at higher temperatures it is the prevailing part. The
general effect of the vibrational part is that of causing a continuous
decrease of $F_{cr}$ because $|TS_{vib}|$ increases more rapidly than
$E_{vib}$ with temperature. In other words the crystal is increasingly
stabilized by higher temperatures; this, of course, is not accidental but the
expression of a fundamental physical law, as differentiation of the free
energy expression $dG = V\,dp - S\,dT$ implies that
$(\partial G/\partial T)_p = -S$ and that, S being always positive, the free
energy decreases with increasing T. A last point is that the work done against
the internal cohesion forces of the crystal, $\Delta U$, is small up to room
temperature and can be neglected in a first approximation.
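The statement that the vibrational free energy decreases steadily with temperature, with slope equal to minus the entropy, can be verified on a simple model. The sketch below again assumes six independent Einstein oscillators per molecule with one characteristic temperature (an assumption made only for illustration, as are the 100 K value and the function names) and uses the closed-form energy and entropy of such oscillators, zero-point term excluded.

```python
import math

R = 1.987204e-3    # gas constant, kcal mol^-1 K^-1

def einstein_energy(T, theta, modes=6):
    """Thermal vibrational energy (zero-point term excluded), kcal mol^-1."""
    return modes * R * theta / math.expm1(theta / T)

def einstein_entropy(T, theta, modes=6):
    """Vibrational entropy, kcal mol^-1 K^-1."""
    x = theta / T
    return modes * R * (x / math.expm1(x) - math.log1p(-math.exp(-x)))

def einstein_free_energy(T, theta, modes=6):
    """Vibrational free energy E_vib - T * S_vib, kcal mol^-1."""
    return einstein_energy(T, theta, modes) - T * einstein_entropy(T, theta, modes)

THETA = 100.0                        # hypothetical characteristic temperature, K
temperatures = [50.0, 100.0, 200.0, 300.0]
free_energies = [einstein_free_energy(T, THETA) for T in temperatures]

# Central finite difference of the free energy at 300 K, to compare with -S
dT = 1.0e-3
slope = (einstein_free_energy(300.0 + dT, THETA)
         - einstein_free_energy(300.0 - dT, THETA)) / (2.0 * dT)

for T, F in zip(temperatures, free_energies):
    print(f"T = {T:5.0f} K   F_vib = {F:8.4f} kcal/mol")
print(f"dF/dT at 300 K = {slope:.6f},  -S_vib = {-einstein_entropy(300.0, THETA):.6f}")
```

The free energy is negative and falls monotonically, and its numerical temperature derivative reproduces the negative of the entropy, as required by the relation quoted above.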
where the first term is the non-bonded energy (eqn (7.4) or (7.5)), the
second the electrostatic energy (7.9) and the last one the H-bonding energy.
The index i runs over all atoms of a reference molecule and the index j over
those of all the surrounding molecules. The summation can be truncated at
Fig. 7.6. Eulerian angles θ, φ, and ψ defining the orientation of the
(x', y', z') orthogonal base of the rigid molecule with respect to the
orthogonal base of the crystal, (a, b, c). The broken line OA is the
intersection of planes (x'y') and (ab).

The number of parameters for the unit cell goes from one for the cubic
system to six for the triclinic system. Moreover, the molecule may have
fewer degrees of freedom as it is in a special position; for example it has
only three rotational degrees of freedom if located on a symmetry centre.
So, the lattice energy of naphthalene (P21/a, Z = 2) depends on only seven
degrees of freedom (a, b, c, β, θ, φ, ψ).
If n is the total number of parameters, it is useful to think of U as a
hypersurface in an (n + 1)-dimensional space where the geometrical
parameters are the abscissae and the energy the ordinate: the different minima
on this surface correspond to all possible crystal structures in the space
group chosen. The large number of calculations done on different
crystals indicates that the experimental structure usually corresponds to the
deepest minimum or, at least, to one of the deepest minima of the potential
surface and that the calculated and experimental values of U agree
within a few kcal mol⁻¹. This seems to indicate that the structure can be
predicted from the simple evaluation of lattice energy, independently of the
vibrational part of the free energy.
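The counting of variables on which U depends can be sketched for the simplest case of one rigid molecule per asymmetric unit. The helper below is hypothetical and deliberately limited: it distinguishes only a general position from a position on a symmetry centre, and it ignores other special positions and the free origin of polar space groups.

```python
# Independent unit cell parameters for each crystal system
CELL_PARAMETERS = {
    "cubic": 1, "tetragonal": 2, "hexagonal": 2, "trigonal": 2,
    "orthorhombic": 3, "monoclinic": 4, "triclinic": 6,
}

def lattice_energy_dof(crystal_system, on_symmetry_centre=False):
    """Number of variables of U for one rigid molecule in the asymmetric unit."""
    rotational = 3                                    # Eulerian angles
    translational = 0 if on_symmetry_centre else 3    # position fixed on a centre
    return CELL_PARAMETERS[crystal_system] + rotational + translational

# Naphthalene: monoclinic, molecule on a symmetry centre
print(lattice_energy_dof("monoclinic", on_symmetry_centre=True))   # 7
```

The value of seven for naphthalene agrees with the count given in the text: four cell parameters plus three Eulerian angles.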
The reasons for this fact will be discussed in the next section. What is
important here is that it allows us to obtain the best potential energy
parameters (globally called a force field) to be used in (7.14) by a
least-squares procedure on a minimum number of known crystal structures
where the quantities to be reproduced are the lattice energy, the unit cell
parameters, and the positional parameters of the molecule. This method has
been applied to several classes of chemical compounds and is reported to
give far better results than the use of potentials derived from other
molecular properties. Unfortunately it has been impossible to find a force
field able to give a very accurate reproduction of the crystal properties that
is valid for all chemical compounds. For instance, the calculations can be
very good for a class (e.g. hydrocarbons) but any attempt to extend the
force field to molecules containing heteroatoms causes a worsening of the
final results. Moreover, very few cases of intermolecular H-bonding have
been studied so far and no general evaluation of the term $E_{HB}$ in
(7.14) is therefore possible.
Another point concerns molecular flexibility. In theory, lattice energy
calculations can be extended to the case in which the molecule has internal
degrees of freedom caused by rotation around single bonds. If $\tau_j$ are
the j torsion angles of interest, (7.15) can be rewritten as

$$U = U(a_i, \theta, \varphi, \psi, x, y, z, \tau_j) \qquad (7.16)$$

where θ, φ, ψ, x, y, z concern a reference fragment of the molecule and
$\tau_j$ are the torsion angles defining the orientation of the other
fragments with
moved by the rotation around the single bond. Let us assume that, when the
molecule is in its conformational minimum, all contact distances of interest
are not far from the minimum of their atom-atom potentials (Fig. 7.2): the
global conformational energy minimum will inevitably be shallow and the
crystal field might even produce significant changes of the torsion angle. A
classical example is diphenyl, which is not planar by itself but becomes
planar under the slight compression of the crystal field. Conversely, crystal
forces will weakly affect the value of the torsion angle when the walls of the
potential well are steep, a situation occurring when the molecule is in
tension (or rigid) because, in the conformational minimum, the atomic
contact distances are in part shorter and in part longer than the optimal
contact distances. From this point of view it can be stated that overcrowded
molecules are the least affected by the weak crystal forces.
It remains for us to consider the case of molecules having different
possible conformations which crystallize in a conformation which is not that
of minimum energy for the free molecule. This is quite possible and, in fact,
many conformationally flexible molecules are found to crystallize in different
space groups with different conformations (conformational polymorphism).
It has already been remarked that the enthalpy of polymorphic transitions
seldom exceeds 1 kcal mol⁻¹ in molecular crystals and this makes it possible
that a molecule, whose second most stable conformation differs from the
first one by not more than 1-2 kcal mol⁻¹, may gain from a more efficient
crystal packing that energy (or even more) which has been lost in
consequence of the choice of an unpreferred conformation.
Isomerism
Molecules having the same formula but different structures are called
isomers and are said to display isomerism. There are three different types of
isomerism: constitutional, configurational, and conformational, and the
last two are grouped under the common term of stereoisomerism.
Configurational isomerism
Molecules having the same constitution but different configurations are
called configurational isomers. Two main cases are to be distinguished:
when the two stereoisomers can be related by a symmetry operation of
reflection they are enantiomers and when this is not possible they are called
diastereoisomers or, sometimes, diastereomers. The distinction has a
precise physical meaning. Both enantiomers and diastereoisomers are
identical as far as their chemical bonds are concerned. Only the former,
however, have indistinguishable non-bonded interactions; they have the
same physico-chemical properties in all respects except for their optical
properties and reactivity towards other enantiomeric species. In particular,
enantiomers rotate the plane of polarized light by the same angle but in
transition state having an energy higher only by a few kcal mol⁻¹. Further
investigations have shown that other potentially chiral compounds of
trivalent phosphorus and arsenic have higher interconversion barriers and
that their enantiomers may be actually resolvable, at least at low
temperature.
The absolute configuration of an enantiomer (i.e. its actual atomic
disposition in space) is a problem which cannot be tackled by chemical
methods. All attempts to obtain it from the sign of optical activity in
solution, the property which is most strictly related to enantiomerism, have
failed. The first determination of the absolute configuration of an
enantiomer was accomplished only in 1951 by Bijvoet and co-workers on the
(+)-tartrate of sodium and rubidium by anomalous scattering methods, as
discussed in Chapter 5. Since then absolute configuration determination has
become routine.
The nomenclature for identifying enantiomers has changed over the years
and reflects the discovery of new techniques for their study. The oldest one
simply reports the sign of the optical activity (+ or -); later on Fischer, in
his fundamental studies on carbohydrates, discovered the stereochemical
series and enantiomers were named after the series they belonged to (D or
L). More recently Cahn, Ingold, and Prelog introduced the R-S
nomenclature which describes exactly the spatial arrangement of groups or
atoms as obtainable from X-ray diffraction studies.
In this context it is worthwhile to remark that chemical nomenclature has
been developed for solutions and not for the crystal state, so that some
misunderstandings can sometimes occur. As far as enantiomers are concerned,
it is known that the usual chemical synthesis can only produce a 1:1
mixture of the two enantiomers, which is called a racemic mixture; it can be
separated (resolved) by reacting it with other chiral molecules by which the
two enantiomers are transformed into diastereoisomers; only the more
recent asymmetric synthesis can produce single R or S enantiomers. When
a pure enantiomer crystallizes, it can only adopt a polar space group (i.e.
without $S_n$ rotary-reflection axes), otherwise the symmetry element would
generate the other enantiomer; this is the case, for instance, for all natural
amino acids or proteins. The crystallization of the racemic mixture may be a
more complex problem. Usually the mixture is thermodynamically more
stable and crystallizes as a homogeneous solid containing equimolecular
amounts of both enantiomeric molecules, which is termed a racemate. The
crystal obtained is usually centrosymmetric with the two enantiomers
related by a symmetry centre, though sometimes it happens to be polar with
a double (or of higher even multiplicity) asymmetric unit allocated to both
enantiomers. In the rare case that the racemate is less stable than its
components, the racemic mixture crystallizes in a polar space group with
spontaneous resolution of the enantiomers, that is it produces two types of
enantiomeric crystals, each one containing just one enantiomer (which is,
incidentally, the way Louis Pasteur discovered enantiomers for the first
time). This fact is the origin of a not uncommon error in structural
determination, that of determining the absolute molecular configuration
without taking into account the fact that the bottle from which the crystal
was taken contains one half of crystals of the opposite configuration. The
subject of chiral crystals can become quite difficult to understand in its
generality because there are also space groups which are chiral by
themselves and not because they have to allocate chiral objects (e.g. $P3_1$
and $P3_2$). Moreover, what has been said for configurational enantiomerism
has to be extended to conformational enantiomerism because conformations
become fixed within the crystals. For more details the reader is referred to a
more specialized treatment.[57]
Going back to the other isomers, it has already been remarked that
geometrical isomerism is the traditional way of indicating diastereoisomerism
in molecules having an internal $S_n$. Classical examples are cis-trans
isomers in ethylene derivatives (13a, 13b), syn-anti isomers in oximes
(14a, 14b) or azocompounds (15a, 15b), cis-trans isomers in square planar
(16a, 16b) or octahedral (17a, 17b) complexes, and mer-fac (meridional-
facial) octahedral complexes (18a, 18b). Geometrical isomerism is also
observed in saturated cyclic compounds where the substituents can be over
or under the ring plane. For instance, four-membered rings substituted as in
(19) can only show geometrical isomerism in consequence of their internal
mirror plane; accordingly (19a) and (19b) are cis-trans isomers. Complete
nomenclature rules for geometrical isomers are available.
Conformational isomerism
Conformational isomers (rotational isomers, rotamers, conformers) are
the molecular states corresponding to minima of the potential energy curve
expressed as a function of the torsion angle around a single bond. With
reference to Fig. 7.8, states b, d, and f are conformers while a, c, and e are
transition states; all are possible conformations. As is the case for all
stereoisomers, conformers can have mutual relationships of enantiomerism
or diastereoisomerism; so conformers b and f are enantiomers while b and c
(or c and f) are diastereoisomers.
The energy barrier protecting conformers is usually rather small
(5-15 kcal mol⁻¹) and much smaller than that protecting configurational
isomers. For reference, the thermal activation barrier for cis-trans
isomerization around a double C-C bond (13a, 13b) is some 40 kcal mol⁻¹.
However, it is not the height of the barrier that can distinguish between
conformational and configurational isomers but the fact that conformers
differ in a rotation around what is a single bond (even if with partial
double-bond character) in the ground state of the molecule. This rule allows
us to classify as conformers the enantiomers produced by hindered rotation
in o,o'-disubstituted diphenyls (20a, 20b), although the interconversion
barrier is so high that they can be resolved at room temperature. Such
resolvable conformers are sometimes reported as atropisomers.
Ring conformations
It seems reasonable to assume that a ring containing single bonds, e.g.
cyclohexane, could have different conformations describable by a potential
energy curve of the type shown in Fig. 7.8, having minima and maxima
corresponding to ring conformers and transition states. The difficulties arise
when trying to define the independent variables that the potential energy is
a function of, and this for the evident reason that the torsion angles fixing
the conformation are not mutually independent but related by the condition
of ring closure. The problem is not easy to solve and only some basic
concepts will be discussed here; the mathematical complications will be
discussed in the next section.
Let us focus our attention on the origins of the energy barrier which
separates the different conformers. As will be discussed in more detail
later (p. 507), the deformation of a molecule can be conceived of in terms of
displacements of its bond distances, bond angles, and torsion angles from
their optimal equilibrium values. The deformation energy needed rapidly
decreases in the order bond stretching or compression, bond angle bending,
and torsion around single bonds, while the torsion around double bonds
requires energies in between stretching and bending. Since energies of
single bond torsions are very small, transitions between conformations
which can occur without angle bending will have an almost null activation
barrier. In this case the ring is flexible because many different combinations
of the two isoenergetic (degenerate) conformations become possible. The
pathway interconverting isoenergetic conformations is called, for reasons to
be discussed later, a pseudorotation path. Classical examples of degenerate
conformations are the envelope (E) and twisted (T) conformations of
cyclopentane (Fig. 7.9) and the boat (B) and twist-boat or twisted (T)
conformations of cyclohexane (Fig. 7.10). The opposite occurs when it is
impossible for the molecule to change conformation without bond angle
deformation (or, in general, bond stretching or rotation around double
bonds). In this case the molecule is rigid and its actual conformation is
protected by a barrier which can be of the order of magnitude of
10 kcal mol⁻¹ for saturated rings and somewhat greater if there are double
bonds stiffening the ring. Figure 7.10 shows the somewhat idealized shape of
the potential energy curve of cyclohexane as a function of a generalized
coordinate of conformational interconversion as could be obtained by
molecular mechanics methods. The rigid conformation of cyclohexane is the
chair (C), which is the most stable and is transformed into the T form
through a transition state called half-chair (H) which is some 11 kcal mol⁻¹
higher in energy. T and B are almost isoenergetic and can be interconverted
practically without any activation barrier (pseudorotation). C, B, and T are
the three low-energy conformations of cyclohexane and therefore its three
conformers according to the definitions given in the previous section.

Fig. 7.9. The two degenerate conformers of cyclopentane (E = envelope,
T = twisted) and their symmetries. The ring conformation is flexible because
the two conformers can interconvert with zero activation energy along a
pseudorotation path. Any observed conformation can be described as a linear
combination of E and T.
The number of possible independent conformers of a ring can be
determined by the use of simple considerations. A planar ring of N atoms
has 2N degrees of freedom, out of which two are of translation, one of
rotation, and 2N - 3 of in-plane vibration. Allowing the atoms to vibrate
out of plane, that is the ring to become puckered, the number of vibrational
D6h    E   2C6  2C3  C2   3C2'  3C2''  i    2S3  2S6  σh   3σd  3σv
A1g    1    1    1    1    1     1     1    1    1    1    1    1     x² + y², z²
A2g    1    1    1    1   -1    -1     1    1    1    1   -1   -1     R_z
B1g    1   -1    1   -1    1    -1     1   -1    1   -1    1   -1
B2g    1   -1    1   -1   -1     1     1   -1    1   -1   -1    1
E1g    2    1   -1   -2    0     0     2    1   -1   -2    0    0     (R_x, R_y), (xz, yz)
E2g    2   -1   -1    2    0     0     2   -1   -1    2    0    0     (x² − y², xy)
A1u    1    1    1    1    1     1    -1   -1   -1   -1   -1   -1
A2u    1    1    1    1   -1    -1    -1   -1   -1   -1    1    1     z
B1u    1   -1    1   -1    1    -1    -1    1   -1    1   -1    1
B2u    1   -1    1   -1   -1     1    -1    1   -1    1    1   -1
E1u    2    1   -1   -2    0     0    -2   -1    1    2    0    0     (x, y)
E2u    2   -1   -1    2    0     0    -2    1    1   -2    0    0
applying all the group symmetry operations to the six small vertical vectors
of Fig. 7.11; the character for any class of symmetry operations (or column
of the character table) is given by the number of small vectors not moved by
the symmetry operation. It is easily seen that the characters of such
representation, $\Gamma_6$, are
where $h$ is the order of the group (in the present case 24) and $\chi_i(R)$ and $\chi'(R)$ are the
characters of the symmetry operation $R$ in the representations $\Gamma_i$ and $\Gamma'$, respectively.
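The quantities defined in the footnote above ($h$, $\chi_i(R)$, $\chi'(R)$) are those of the standard reduction formula, $n_i = \frac{1}{h}\sum_R \chi_i(R)\,\chi'(R)$, which counts how many times the irreducible representation $\Gamma_i$ occurs in a reducible one. The following minimal Python sketch applies it to the $D_{6h}$ character table. The characters used for $\Gamma_6$ are an assumption worked out here and not copied from the text: a vector left in place counts +1, one left in place but reversed counts −1, and the $C_2'$ axes and $\sigma_v$ planes are taken to pass through the atoms.

```python
from fractions import Fraction

# Number of operations in each class of D6h, in the order
# E, 2C6, 2C3, C2, 3C2', 3C2'', i, 2S3, 2S6, sigma_h, 3sigma_d, 3sigma_v.
CLASS_SIZES = [1, 2, 2, 1, 3, 3, 1, 2, 2, 1, 3, 3]

D6H = {
    "A1g": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "A2g": [1, 1, 1, 1, -1, -1, 1, 1, 1, 1, -1, -1],
    "B1g": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1],
    "B2g": [1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1],
    "E1g": [2, 1, -1, -2, 0, 0, 2, 1, -1, -2, 0, 0],
    "E2g": [2, -1, -1, 2, 0, 0, 2, -1, -1, 2, 0, 0],
    "A1u": [1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1],
    "A2u": [1, 1, 1, 1, -1, -1, -1, -1, -1, -1, 1, 1],
    "B1u": [1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1],
    "B2u": [1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1],
    "E1u": [2, 1, -1, -2, 0, 0, -2, -1, 1, 2, 0, 0],
    "E2u": [2, -1, -1, 2, 0, 0, -2, 1, 1, -2, 0, 0],
}

# ASSUMED characters of Gamma_6 (six out-of-plane unit vectors on the ring
# atoms): only E, 3C2', sigma_h and 3sigma_v leave any vector on its atom.
GAMMA_6 = [6, 0, 0, 0, -2, 0, 0, 0, 0, -6, 0, 2]


def reduce_representation(chars, table=D6H, sizes=CLASS_SIZES):
    """Return {irrep: multiplicity} using n_i = (1/h) * sum_R chi_i(R) chi'(R)."""
    h = sum(sizes)  # order of the group (24 for D6h)
    result = {}
    for name, irrep in table.items():
        n_i = Fraction(sum(g * a * b for g, a, b in zip(sizes, irrep, chars)), h)
        if n_i.denominator != 1:
            raise ValueError("characters do not belong to a representation")
        if n_i != 0:
            result[name] = int(n_i)
    return result


if __name__ == "__main__":
    print(reduce_representation(GAMMA_6))
```

Under these assumptions the sketch returns one each of $B_{2g}$, $E_{1g}$, $A_{2u}$, and $E_{2u}$. According to the basis functions listed in the character table, $A_{2u}$ transforms as $z$ (translation) and $E_{1g}$ as $(R_x, R_y)$ (rotations), which leaves $B_{2g}$ and $E_{2u}$ for the puckering modes, as stated in the text.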
for $E_{2u}$ is not intuitive and must be found by ordinary group theory
methods. A first base can be obtained by associating the out-of-plane
displacement to atom 1 of the ring and operating on it by all symmetry
operations multiplied by their characters. The required base is the sum of all
the terms obtained, that is
which is a base for (or is transformed as) the boat conformation B and can
be written in normalized form as
† If the two already normalized functions $\phi_1$ and $\phi_2$ are not orthogonal, i.e. $\int \phi_1 \phi_2 \, dv = S \neq 0$,
the new function $\phi' = \phi_2 - S\phi_1$ is orthogonal to $\phi_1$, as can be shown by simple substitution.
where $i$ can assume the integer values from one to six. Moreover, it can be
verified by simple computation that the following functions give the same
displacements as $\phi_C$ in (7.17) and $\phi_B$ and $\phi_T$ in (7.18) and (7.19), and
therefore are transformed by the operations of the point group $D_{6h}$ as the
i.r. $B_{2g}$ and $E_{2u}$ of Fig. 7.12
where $6^{-1/2}$ and $3^{-1/2}$ are normalization factors and $q_3$ and $q_2$ the mean
puckering amplitudes of the two vibration modes.
The function $\phi_C$ describes the out-of-plane displacements of the C
conformer while any linear combination $k_1\phi_B + k_2\phi_T$ describes those of the
degenerate B and T conformers. For the properties of trigonometric
functions, such combinations can be written as
where $\phi_2$ is a phase angle having values of 0, 60, 120, 180, 240, or 300° for
the pure B conformations and 30, 90, 150, 210, 270, or 330° for the pure T
conformations.
Equations (7.21) and (7.24) describe analytically the conformations of a
six-membered ring in terms of the three parameters $q_3$, $q_2$, and $\phi_2$ and the
$N - 3$ possible conformations remain characterized as the three C, B, and T
conformers. The characterizing parameters ($q_3$, $q_2$, $\phi_2$) are more generally
identified as the $N - 3$ generalized puckering coordinates, that is the $N - 3$
variables necessary and sufficient to define any out-of-plane deformation of
the ring. $q_3$ and $q_2$ are called puckering amplitudes and $\phi_2$ the phase angle
and any possible conformation of the six-membered ring corresponds to a
point in the three-dimensional space spanned by the orthogonal base
($q_2$, $\phi_2$, $q_3$).
conformers C, B, and T. The ring, however, also has high-energy transition
states and there is general agreement[59] that these are the envelope (E),
half-chair (H), and screw-boat (S) conformations of $C_s$, $C_2$, and $C_2$
symmetry, respectively, as shown in Fig. 7.13. Cremer and Pople[60] have
shown that the E and H forms are located at $\tan\theta = \pm\sqrt{3/2}$ and $\pm\sqrt{2}$,
† According to the nomenclature usually employed, the conformations chair, boat, and
twisted (or twist-boat) are called C, B, and T and the atoms which are over or under the mean
plane of the ring are indicated by upper left or lower right indices. So, $^{2,5}B$ is a boat ring having
the atoms number two and five directed upwards and $B_{2,5}$ its enantiomer while $^2T_5$ is a twisted
ring with the second atom pointing up and the fifth one pointing down.
which can be decomposed as $\Gamma_5 = A_2'' + E_1'' + E_2''$, where $A_2''$ and $E_1''$ correspond
to translation and rigid body rotations, respectively. The puckering
modes belong to the two-dimensional i.r. $E_2''$ and a base for them, found by
the usual methods, is shown in Fig. 7.15. The two conformers found,
envelope (E) and twisted (T), are degenerate and all their linear combinations
are possible conformations on a pseudorotation pathway. By analogy
with (7.20-7.24) the non-translation and non-rotation conditions are
where $\phi_2$ is a phase angle having values of 0, 36, 72, 108, 144, 180, 216, 252,
288, and 324° for the ten E conformations and 18, 54, 90, 126, 162, 198, 234,
270, 306, and 342° for the other ten T conformations, intermediate values of
$\phi_2$ corresponding to mixtures of the two along the pseudorotation path. The
complete pathway is shown in Fig. 7.16.
Computation of puckering coordinates
The theory given in the previous section concerns equilateral and isogonal
rings. However, the equations obtained can also be used for irregular rings
without loss of generality, at least as regards the Cremer and Pople
treatment,[60] which makes use of the displacements $z_i$ from the ring mean
plane. Other methods using the values of the endocyclic torsion angles (e.g.
that proposed by Altona et al.[62] for five-membered rings) may not have the
same degree of generality.

Fig. 7.16. Pseudorotation wheel describing the conformations which can be
assumed by a five-membered ring during the changes of the phase angle $\phi_2$.

Several computer programs[63,64] are available for the computation of the
puckering parameters according to Cremer and Pople.[60] The procedure
usually includes: transformation from crystal to orthogonal coordinates;
passage to the system of coordinates defined by (7.20) or (7.27), having
their origin at the centre of gravity of the ring, the z axis perpendicular to
the mean ring plane and the y axis passing through the first ring atom;
determination of the parameters $q$ and $\phi$ (or $Q$, $\theta$, and $\phi$) satisfying eqns
(7.21) and (7.24) or (7.30).
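The procedure just outlined can be sketched in a few lines of Python. This is a minimal illustration written from the published Cremer and Pople definitions, not a transcription of the programs cited above; the function names `puckering` and `six_ring_polar` are hypothetical. It assumes that the input coordinates are already Cartesian (orthogonal) and listed in ring order, and it places the origin at the unweighted geometric centre of the ring.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def puckering(coords):
    """Return (z, q, phi) for a ring given as Cartesian (x, y, z) tuples.

    z   : displacements of the atoms from the mean plane
    q   : dict m -> puckering amplitude q_m (same length unit as the input)
    phi : dict m -> phase angle phi_m in degrees (paired modes only)
    """
    n = len(coords)
    centre = [sum(p[k] for p in coords) / n for k in range(3)]
    pts = [tuple(p[k] - centre[k] for k in range(3)) for p in coords]
    ang = [2.0 * math.pi * j / n for j in range(n)]

    # The mean plane is normal to R' x R'' (sine- and cosine-weighted sums).
    r1 = tuple(sum(p[k] * math.sin(a) for p, a in zip(pts, ang)) for k in range(3))
    r2 = tuple(sum(p[k] * math.cos(a) for p, a in zip(pts, ang)) for k in range(3))
    normal = _cross(r1, r2)
    length = math.sqrt(sum(c * c for c in normal))
    normal = tuple(c / length for c in normal)
    z = [sum(p[k] * normal[k] for k in range(3)) for p in pts]

    q, phi = {}, {}
    for m in range(2, (n - 1) // 2 + 1):
        a = math.sqrt(2.0 / n) * sum(zj * math.cos(m * t) for zj, t in zip(z, ang))
        b = -math.sqrt(2.0 / n) * sum(zj * math.sin(m * t) for zj, t in zip(z, ang))
        q[m] = math.hypot(a, b)
        phi[m] = math.degrees(math.atan2(b, a)) % 360.0
    if n % 2 == 0:
        # Even rings have one extra, unpaired amplitude (q3 for N = 6).
        q[n // 2] = sum((-1) ** j * zj for j, zj in enumerate(z)) / math.sqrt(n)
    return z, q, phi


def six_ring_polar(q):
    """Total puckering amplitude Q and polar angle theta (degrees) for N = 6."""
    return math.hypot(q[2], q[3]), math.degrees(math.atan2(q[2], q[3]))
```

For a six-membered ring, `six_ring_polar` gives $\theta$ near 0 or 180° for the two chairs and $\theta$ = 90° for the boat and twist-boat family. The sign of the normal, and hence the distinction between 0 and 180°, depends on the direction in which the ring atoms are numbered.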
An example of application to the fructose and glucose rings of the
molecule of sucrose is reported in Table 7.6. The five-membered fructose
ring has $q_2$ = 0.353 Å and $\phi_2$ = 265.1°. By comparing the calculated value of
$\phi_2$ with those of Fig. 7.16, it is found that the ring conformation is very near
$^4T_3$ with a small component of $E_3$. The six-membered ring has a $Q$ value of
0.556 Å, i.e. it is more puckered than the five-membered ring. The
calculated values of $\theta$ = 5.2° and $\phi$ = 183.7° indicate, when compared with
those of Fig. 7.14, that the ring conformation is an almost perfect $^4C_1$ with a
very small distortion towards $^5H_4$ and E.

Table 7.6. Atomic coordinates and puckering parameters for fructose and glucose rings
of the molecule of sucrose (coordinates from Brown and Levy, Acta Crystallographica,
B29, 790 (1973))
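The comparison of a computed phase angle with the canonical values listed earlier (multiples of 36° for E and odd multiples of 18° for T in five-membered rings; multiples of 60° for B and odd multiples of 30° for T in six-membered rings) is easy to automate. The helper below is a hypothetical illustration; it returns only the type of the nearest canonical form, not the atom indices, which have to be read from the pseudorotation wheel.

```python
def nearest_canonical(phi_deg, ring_size):
    """Return (label, canonical_angle, deviation) for a phase angle in degrees.

    ring_size 5: E at 0, 36, 72, ...   and T at 18, 54, 90, ...
    ring_size 6: B at 0, 60, 120, ...  and T at 30, 90, 150, ...
    """
    step = {5: 18.0, 6: 30.0}[ring_size]
    labels = {5: ("E", "T"), 6: ("B", "T")}[ring_size]
    phi = phi_deg % 360.0
    k = round(phi / step)           # index of the closest canonical angle
    deviation = phi - k * step      # signed distance from it, in degrees
    return labels[k % 2], (k * step) % 360.0, deviation


if __name__ == "__main__":
    # Fructose ring of sucrose, phase angle taken from the text.
    print(nearest_canonical(265.1, 5))
```

For the fructose ring the nearest canonical form is the twisted conformation at 270°, about 5° away, which agrees with the description of the ring as a T form with a small E component.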
The VSEPR
The VSEPR theory assumes that the relative ligand arrangement around a
central atom is determined by the mutual repulsions of bonding or
non-bonding (lone) pairs of electrons. Bonding and non-bonding pairs are
identified from the Lewis-type electronic structure of the molecule. The
method can be summed up in the following few rules: Rule 1. Electron pairs
repel each other and adopt the geometrical disposition which minimizes
mutual repulsions; Rule 2. Lone pairs take more room than bonding pairs;
Rule 3: The space occupied by a bonding pair decreases with the increasing
electronegativity of the substituent; Rule 4. The two bonding pairs of a
double bond (and, more so, the three bonding pairs of a triple bond) occupy
more space than a single bond does.
Calling A the central atom, X the bonding pair (or the generic ligand
connected by it) and E the non-bonding pair, Rule 1 can be used to predict
the geometry of a large number of compounds (Fig. 7.17). BeCl2, BF3, and
CF4 are AX2, AX3, and AX4 molecules and will have linear, trigonal, and
tetrahedral geometries, respectively. Molecules of type AX3E (NH3, PH3,
NF3, PF3, or AsCl3) are pyramidal and those of type AX2E2 (H2O, H2S,
H2Se, or OF2) are bent (angular). Molecules of the type AX5 may be either
trigonal bipyramidal or square pyramidal. According to VSEPR calculations,
the first geometry is slightly more stable and, in fact, AX5 molecules
(PF5, AsF5, and PCl5) are usually trigonal bipyramidal. Finally, molecules
belonging to the types AX6 (e.g. SF6) and AX5E (e.g. IF5) display the
expected octahedral and square pyramidal geometries.
For AX4E (TeCl4, SF4), AX3E2 (ClF3, BrF3), and AX2E3 (XeF2, ICl2−) it
is necessary to establish whether the non-bonding pairs are in the axial or
equatorial positions of the trigonal bipyramid. According to Rule 2 the
non-bonding pair takes more room and should prefer the equatorial position
where it makes angles of 120, 120, and 90° with the vicinal electron pairs
(the three angles would be 90° for the axial position). Similar considerations
predict that AX4E2 molecules (e.g. ICl4−) are square planar with the
non-bonding pairs in trans positions.
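The predictions of the last two paragraphs amount to a lookup from the AXmEn type to a molecular shape, as in the following sketch. The names "seesaw", "T-shaped", and "linear" for AX4E, AX3E2, and AX2E3 are the customary labels for trigonal bipyramidal arrangements with equatorial lone pairs; they are supplied here and are not quoted from the text.

```python
# (number of bonding pairs X, number of lone pairs E) -> predicted shape
VSEPR_SHAPES = {
    (2, 0): "linear",                # BeCl2
    (3, 0): "trigonal planar",       # BF3
    (4, 0): "tetrahedral",           # CF4
    (3, 1): "pyramidal",             # NH3
    (2, 2): "bent",                  # H2O
    (5, 0): "trigonal bipyramidal",  # PF5
    (4, 1): "seesaw",                # SF4, lone pair equatorial
    (3, 2): "T-shaped",              # ClF3, lone pairs equatorial
    (2, 3): "linear",                # XeF2, lone pairs equatorial
    (6, 0): "octahedral",            # SF6
    (5, 1): "square pyramidal",      # IF5
    (4, 2): "square planar",         # ICl4-, lone pairs trans
}


def vsepr_shape(n_bonding, n_lone):
    """Return the shape predicted for an AX(n_bonding)E(n_lone) molecule."""
    try:
        return VSEPR_SHAPES[(n_bonding, n_lone)]
    except KeyError:
        raise ValueError(f"no prediction tabulated for AX{n_bonding}E{n_lone}")


if __name__ == "__main__":
    for formula, key in [("H2O", (2, 2)), ("SF4", (4, 1)), ("ICl4-", (4, 2))]:
        print(formula, "->", vsepr_shape(*key))
```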
An interesting feature of VSEPR is its ability to interpret fine structural
details. For example, Rule 2 predicts that increasing substitution of X by E
will cause a narrowing of the remaining X-A-X angles because of the
greater space taken by the lone pairs. The effect is generally observed, as
exemplified by the series CH4 (AX4), NH3 (AX3E), and H2O (AX2E2)
where the X-A-X angle decreases in the order 109.5, 107.3, and 104.5°.

Fig. 7.17. Prediction of the shape of simple [...] electron pairs (lone pairs),
TP = total number of electron pairs (bonding and non-bonding); HYB = type of
VB hybridization.
Rule 2 can be considered a particular case of Rule 3 since the
non-existent substituent E can be thought of as the least electronegative of
all the possible X ligands. This last rule is thus the most suited for
understanding the angular changes caused by substitution and predicts that
the X-A-X angle decreases while the electronegativity of X increases.
Although a few exceptions are known, the effect is correctly observed in a
variety of cases. For example, the X-A-X angle is 103.8° in OF2 and 104.9°
in H2O, 102.1° in NF3 and 106.6° in NH3. In the halides of the main-group
elements of the type AX2E2 or AX3E the bond angles are known to increase
in the order F < Cl < Br = I. Another well documented[106] series of
electronegativity-dependent angular variations concerns the monosubstituted
benzenes. The endocyclic C-C-C angle in ipso (that is, the one carrying
the substituent) changes with the substituent itself, being 117.2° for N(Me)2,
118.1° for CH3, 120.0° for H (unsubstituted benzene), 121.4° for Cl, 121.8° for
-CN, and 123.4° for F. It is evident that the angle is wider the greater is the
electronegativity of the substituent group.
Rule 4 explains why the bond angles trans to a multiple bond are smaller
than usual, as can be illustrated by some examples in the X2A=Y system.
The X-A-X angle should be 120" in the absence of perturbing effects but
becomes respectively 108 and 111° in F2C=O and Cl2C=O, where both the
double-bond repulsion and the high electronegativity of the substituents
contribute to the squeezing of the angle. The effect of the double bond is
better seen when the X substituent is less electronegative, for example in
H2C=O and (NH2)2C=O, where the X-C-X angle is 108° in both
compounds.
and the problem consists of finding the set of $a_{ij}$ values matching the
required molecular fragment geometry. A few rules allow us to set up a
3. Since the atomic orbitals $\phi_j$ are orthonormal, the sum over all HOs of the
squares of the coefficients of any $\phi_j$ must be unity, that is
$\sum_i a_{ij}^2 = 1$ for any $j$.
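As an example, let us build up the three trigonal HOs in the plane (xy).
The axes are chosen in such a way that HO1 points along x and HO2 and
HO3 are in the (−x, y) and (−x, −y) quadrants, respectively. Let us write
the HOs as
$$\mathrm{HO}_i = a_i s + b_i p_x + c_i p_y \qquad (i = 1, 2, 3).$$
The s orbital has spherical symmetry and must be equally shared by the
three HOs; since $\sum_i a_i^2 = 1$ for condition (3), it must be that $a_1 = a_2 = a_3 = 1/\sqrt{3}$.
$c_1$ is zero because $p_y$ is orthogonal to HO1 in view of the axes chosen
and therefore $a_1^2 + b_1^2 = 1$ for (2), so that $b_1 = 2/\sqrt{6}$. Since HO2 and HO3 are
symmetrical with respect to the $\sigma_v$ containing x and $c_2^2 + c_3^2 = 1$, it results
that $c_2 = -c_3 = 1/\sqrt{2}$. Likewise $b_2 = b_3 = -1/\sqrt{6}$ because of condition (2).
The final HOs are
$$\mathrm{HO}_1 = \frac{1}{\sqrt{3}}\, s + \frac{2}{\sqrt{6}}\, p_x$$
$$\mathrm{HO}_2 = \frac{1}{\sqrt{3}}\, s - \frac{1}{\sqrt{6}}\, p_x + \frac{1}{\sqrt{2}}\, p_y$$
$$\mathrm{HO}_3 = \frac{1}{\sqrt{3}}\, s - \frac{1}{\sqrt{6}}\, p_x - \frac{1}{\sqrt{2}}\, p_y$$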
It is easy to check that condition (1) is fulfilled for all the symmetry
operations of the point group $D_{3h}$ that the trigonal AX3 fragment belongs
to. The hybridization index $n$ of each HO is defined as the ratio between
the sum of the squares of the p AO coefficients and the square of the s
coefficient. In this case $n_1 = n_2 = n_3 = 2$ and the hybridization is termed sp2.
Alternatively, the s and p characters ($S$ and $P$, respectively) of the HO can
be used, as defined by the conditions $n = P/S = P/(1 - P) = (1 - S)/S$ and
$S + P = 1$ and by the inverse relationships $S = 1/(n + 1)$, $P = n/(n + 1)$. In
the present case $S = 0.333$ and $P = 0.667$.
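The coefficients derived above for the trigonal hybrids can be checked numerically. The sketch below stores each HO as its (s, p_x, p_y) coefficients, taken from the values worked out in the text, and verifies orthonormality, the hybridization index, the s and p characters, and the 120° angle between hybrids. The function names are illustrative only.

```python
import math

# (s, p_x, p_y) coefficients of the three trigonal hybrids, from the text.
HYBRIDS = {
    "HO1": (1 / math.sqrt(3), 2 / math.sqrt(6), 0.0),
    "HO2": (1 / math.sqrt(3), -1 / math.sqrt(6), 1 / math.sqrt(2)),
    "HO3": (1 / math.sqrt(3), -1 / math.sqrt(6), -1 / math.sqrt(2)),
}


def overlap(a, b):
    """Overlap of two hybrids built on an orthonormal set of AOs."""
    return sum(x * y for x, y in zip(a, b))


def hybridization_index(coeffs):
    """n = (sum of squared p coefficients) / (squared s coefficient)."""
    s, *p = coeffs
    return sum(c * c for c in p) / (s * s)


def characters(n):
    """Return the s and p characters (S, P) for a hybridization index n."""
    return 1.0 / (n + 1.0), n / (n + 1.0)


def interhybrid_angle(a, b):
    """Angle in degrees between the p-orbital directions of two hybrids."""
    pa, pb = a[1:], b[1:]
    cos_angle = overlap(pa, pb) / math.sqrt(overlap(pa, pa) * overlap(pb, pb))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    for name, coeffs in HYBRIDS.items():
        n = hybridization_index(coeffs)
        print(name, round(n, 3), [round(v, 3) for v in characters(n)])
    print(round(interhybrid_angle(HYBRIDS["HO1"], HYBRIDS["HO2"]), 2))
```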
By the same methods it can be calculated that the two sp HOs
($n_1 = n_2 = 1$) oriented along x have equations $\mathrm{HO}_1 = (s + p_x)/\sqrt{2}$ and
$\mathrm{HO}_2 = (s - p_x)/\sqrt{2}$ and that four equivalent sp3 HOs ($n_1 = n_2 = n_3 = n_4 = 3$),
oriented in such a way that each HO makes the same angles with the
Both HO coefficients $a_{ij}$ and hybridization indices $n$ have been worked out
starting from symmetry considerations. If the HO geometry is not exactly
linear, trigonal, or tetrahedral the $n$ values are no longer integers but
become fractional numbers and not necessarily equal. Equations for the
calculation of the $a_{ij}$ and $n$ values have been reported[112,113] for some particular
geometries, that is the quasi-tetrahedral systems AX2Y2 of symmetry $C_{2v}$
and AX2YZ of symmetry $C_s$ and the planar quasi-trigonal AX2Y system of
symmetry $C_{2v}$.
For the AX2Y2 system of $C_{2v}$ symmetry (left side of Fig. 7.18) it has been
shown that $n_1 = n_2 = -\sec\alpha$, $n_3 = n_4 = -\sec\beta$ and that the angles $\alpha$ and $\beta$
are related by the equation $\cot^2(\alpha/2) + \cot^2(\beta/2) = 1$. By putting $\alpha$ =
109.47° we obtain $n_1 = n_2 = n_3 = n_4 = 3$, $\beta$ = 109.47°, $S = 0.25$, and $P = 0.75$
for any HO, in agreement with the theory. In the water molecule
($\alpha$ = 104.9°) the two HOs pointing to the hydrogens have $n = 3.89$,
$S = 0.20$, and $P = 0.80$ and the angle between the lone pairs is calculated to
be $\beta$ = 114.8°, in agreement with Bent's rule because the p character is
found to concentrate on the HOs pointing to the more electronegative
substituents, that is the hydrogens. The opposite case occurs in (CH3)2O,
where the methyl is an electron donating group; the observed angle $\alpha$ is
111°, allowing us to calculate that $n = 2.79$, $S = 0.26$, and $P = 0.74$ for the
HOs directed towards the CH3 groups and $\beta$ = 108°.

Fig. 7.18. Definition of the coordinate system and labelling of the relevant
angles between hybrid orbitals in the AX2Y2 (left) and AX2Y (right) systems
discussed in the text.
In the planar system AX2Y (Fig. 7.18, right side) the important
relationships become $n_1 = n_2 = -\sec\alpha$ and $n_3 = \tan^2(\alpha/2) - 1$. Only if
$\alpha$ = 120° it will be that $n_1 = n_2 = n_3 = 2$, while $n_1 = n_2 > n_3$ for $\alpha$ < 120° and
$n_1 = n_2 < n_3$ for $\alpha$ > 120°. Since $\alpha$ is smaller or greater than 120° when HO3
is directed towards substituents less or more electronegative, Bent's rule is
obeyed.
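The relationships quoted for the AX2Y2 and AX2Y systems lend themselves to a direct numerical check. The following sketch uses only the formulas given in the text ($n = -\sec\alpha$, $\cot^2(\alpha/2) + \cot^2(\beta/2) = 1$, $n_3 = \tan^2(\alpha/2) - 1$, $S = 1/(n + 1)$); the function names are illustrative, and angles are taken in degrees with 90° < $\alpha$ < 180°.

```python
import math


def ax2y2(alpha_deg):
    """Quasi-tetrahedral AX2Y2 (C2v): return (n_x, n_y, beta_deg).

    n_x is the index of the two hybrids separated by alpha, n_y that of the
    two hybrids separated by beta.
    """
    alpha = math.radians(alpha_deg)
    n_x = -1.0 / math.cos(alpha)
    cot2_half_beta = 1.0 - 1.0 / math.tan(alpha / 2.0) ** 2
    if cot2_half_beta <= 0.0:
        raise ValueError("alpha must be greater than 90 degrees")
    beta = 2.0 * math.atan(1.0 / math.sqrt(cot2_half_beta))
    n_y = -1.0 / math.cos(beta)
    return n_x, n_y, math.degrees(beta)


def ax2y(alpha_deg):
    """Planar quasi-trigonal AX2Y (C2v): return (n_1 = n_2, n_3)."""
    alpha = math.radians(alpha_deg)
    return -1.0 / math.cos(alpha), math.tan(alpha / 2.0) ** 2 - 1.0


def s_character(n):
    return 1.0 / (n + 1.0)


if __name__ == "__main__":
    n_oh, n_lp, beta = ax2y2(104.9)          # water, alpha from the text
    print(round(n_oh, 2), round(s_character(n_oh), 2), round(beta, 1))
    print([round(v, 2) for v in ax2y(120.0)])
```

A useful consistency check is that the s characters of all the hybrids on the central atom add up to one, since a single s orbital is being shared: $2S_X + 2S_Y = 1$ for AX2Y2 and $2S_{1,2} + S_3 = 1$ for AX2Y.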
These few examples can help to illustrate some interesting and not always
well understood aspects of VB theory that can be summarized as follows:
(1) The hybridization indices are, in general, fractional numbers which
become integers only for a few very special geometries;
(2) the hybridization indices and the p and s characters of the HOs are
straightforwardly calculable from the bond angles and vice versa;
(3) Bent's rule is not a simple qualitative rule but can be quantified, at least
in a number of simple cases.
Molecular mechanics
Molecular mechanics is a method for calculating the equilibrium geometries
and other properties of ground state molecules on the basis of a purely
classical mechanical model. The molecule is considered to be a set of atoms
connected by elastic springs, and to any internal molecular coordinate (bond
where the sum is extended to all the $N$ bonds within the molecule, $k_{s,i}$ are
their stretching force constants and $\Delta l_i = l_i - l_{0,i}$ the differences between the
actual and equilibrium bond lengths. This model tends to overestimate the
energy needed to produce very large lengthenings because, in such a case,
the bond becomes increasingly yielding in consequence of the decreased
overlapping of orbitals. It is common practice to add a small cubic term of
the type $k_{sc,i}(\Delta l_i)^3$, where $k_{sc,i}$ is a small negative constant taking into
account this last effect.
2. Bond angle bending. Its total energy is expressed as
where the sum is extended to all the $M$ bond angles, $\Delta\theta_i = \theta_i - \theta_{0,i}$ are the
differences between the actual and equilibrium bond angles, and $k_{b,i}$ the
bending force constants. Bending energies are smaller than stretching
energies by an order of magnitude because angles are much more yielding
than distances. Also in this case a cubic term can be added, which is of the
type $k_{bc,i}(\Delta\theta_i)^3$ and where $k_{bc,i}$ is a small negative constant.
3. Stretching-bending terms. An improved force field is obtained if
proper allowance is made for the fact that the narrowing of a bond angle is
paralleled by the lengthening of the two encompassing bonds and vice versa.
If $\theta_{i,ABC}$ is the angle delimited by the bonds $l_{i,AB}$ and $l_{i,BC}$ connecting the
three atoms A, B, and C, the stretching-bending energy is defined as
where $k_{sb,i}$ are the corresponding force constants and $\Delta l_i$ and $\Delta\theta_i$ have the
usual meaning. The term $E_{sb}$ is small in comparison with the previous terms
and is neglected in some force fields. A field containing it is called a valence
force field; this differs from the Urey-Bradley force field which presents a
different treatment of the A . . . C geminal interactions and is not discussed
here.
4. Out-of-plane bending. In the deformations concerning trigonal sp2
atoms it is useful to distinguish between in-plane and out-of-plane deformations
(Fig. 7.19). The in-plane deformations concern the angles A-P-B,
A-P-O, and B-P-O and are dealt with by eqn (7.39), while the
out-of-plane deformations are the displacements from zero ($\Delta\theta$) of the
angles C-A-P, C-B-P, and C-O-P. The associated energy is

Fig. 7.19. In-plane and out-of-plane deformations of a trigonal sp2 atom.
$$E_t = \sum \left[ V_1(1 + \cos\phi)/2 + V_2(1 - \cos 2\phi)/2 + V_3(1 + \cos 3\phi)/2 \right] \qquad (7.42)$$
where $\phi$ is the torsion angle (−180° ≤ $\phi$ ≤ 180°), $V_1$, $V_2$, and $V_3$ are the
force constants, and the sum is extended to all sequences 1-2-3-4 of
bonded atoms in the molecule. The second term has the value zero for
$\phi$ = 0 and ±180° and maxima for $\phi$ = ±90°; it is used to describe the rotation
around double bonds and in this case $V_1 = V_3 = 0$. The third term is equal to
zero for $\phi$ = ±60°, ±180° and has maxima for $\phi$ = 0, ±120°; it describes the
rotation around single bonds connecting sp3 atoms. The first term is zero for
$\phi$ = 180°, has a maximum for $\phi$ = 0°, is used as a small corrective term in
special cases, and is often neglected.
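The behaviour of the three terms of eqn (7.42) can be verified with a short function evaluating the contribution of a single torsion angle. The force constants used in the example calls are arbitrary illustrative numbers, not parameters of any published force field.

```python
import math


def torsion_energy(phi_deg, v1=0.0, v2=0.0, v3=0.0):
    """Contribution of one torsion angle (degrees) to eqn (7.42).

    The result has the same energy unit as the force constants v1, v2, v3.
    """
    phi = math.radians(phi_deg)
    return (v1 * (1.0 + math.cos(phi)) / 2.0
            + v2 * (1.0 - math.cos(2.0 * phi)) / 2.0
            + v3 * (1.0 + math.cos(3.0 * phi)) / 2.0)


if __name__ == "__main__":
    # Threefold term only (single bond between sp3 atoms), v3 = 3 (arbitrary).
    for angle in (0, 60, 120, 180):
        print(angle, round(torsion_energy(angle, v3=3.0), 6))
    # Twofold term only (double bond), v2 = 40 (arbitrary).
    for angle in (0, 90, 180):
        print(angle, round(torsion_energy(angle, v2=40.0), 6))
```

With only $V_3$ different from zero the energy vanishes at ±60° and ±180° (staggered arrangements) and reaches $V_3$ at 0° and ±120° (eclipsed arrangements); with only $V_2$ it vanishes at 0° and ±180° and reaches $V_2$ at ±90°, as described in the text.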
6. Non-bonded interactions. The corresponding energy, $E_{nb}$, can be
expressed by the atom-atom potentials discussed on p. 474. The interactions
among all the atoms of the molecule are taken into account, with the
exclusion of those between first neighbours (1-2 or bonded atoms) and
second neighbours (1-3 or vicinal atoms). Third-neighbour (or 1-4)
interactions are considered in both the $E_{nb}$ and $E_t$ terms. Special treatment
has sometimes been reserved for the lone pairs on etheric oxygens or aminic
nitrogens by considering them as pseudo-atoms localized at about 1 Å from
the atom and having their own atom-atom potentials.
7. Electrostatic interactions. It has already been shown on p. 477 that
electrostatic interactions can be evaluated as interactions among partial
charges localized on the atoms or among small dipoles associated with each
chemical bond. Both methods are used in molecular mechanics.
A considerable number of different force fields proposed by different
authors are currently available and Table 7.8 reports, as an example, the

Table 7.8. Force field parameters for alkanes and non-conjugated alkenes according to
MM2/MMP2. Force constants in kcal mol⁻¹ Å⁻² or in kcal mol⁻¹ deg⁻¹; bond
moments (b.m.) in D. In the symbol $(\theta_0)_n$, $n$ indicates the number of additional hydrogen
atoms bonded to the central atom. 1 = C(sp3), 2 = C(sp2), 3 = H. Final energies in
kcal mol⁻¹
ions by a charge transfer reaction where the Lewis base I− donates to the
empty σ* orbital of I2. Several crystal structures containing the I3− ion are
known: the anion is nearly linear with an I1-I2-I3 angle of 175-180°. The
bond distance in the I2 molecule is 2.67 Å and the sum of the van der Waals
radii is 4.30 Å. Actual structures show many intermediate distances, the
shortening of the I2···I3 distance being always associated with a lengthening
of the I1-I2 one, and vice versa. In general the anion is strongly
asymmetric in presence of small cations and tends to become symmetrical
with larger cations. A plot of the I1-I2 versus I2···I3 distances is shown in
Fig. 7.20(a). The experimental points are clearly arranged on an equilateral
hyperboloid whose analytic form can be derived from Pauling's formula
(7.31)
$$\Delta d = d(n) - d(1) = -c \log_{10} n$$
where $n$ is the bond number, $d(n)$ and $d(1)$ the bond distances for the bond
numbers $n$ and 1, respectively, and $c$ a constant easily determined as
$c = \Delta d'/\log 2$, $\Delta d'$ being the bond length increment for the symmetrical
anion having $n = 1/2$ (currently $c = 0.85$ for a $\Delta d'$ of 0.26 Å). Assuming that
the single bond is shared among the three atoms, it can be written
and the resulting function for $c = 0.85$ is drawn as a continuous curve in Fig.
7.20(a); only one branch of the hyperboloid is shown since the second one,
obtainable by interchanging I1 with I3, is identical because of ion symmetry.
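The curve of Fig. 7.20(a) can be regenerated from Pauling's formula if the phrase "the single bond is shared among the three atoms" is read as conservation of the total bond number, $n_{12} + n_{23} = 1$. That reading is an assumption made here, because the corresponding equation is not reproduced above; it does recover the symmetrical anion at $n = 1/2$. The sketch uses $d(1)$ = 2.67 Å and $c$ = 0.85 from the text, and the function names are illustrative.

```python
import math

D_SINGLE = 2.67   # I-I distance in the I2 molecule, in angstrom (from the text)
C_PAULING = 0.85  # Pauling constant c, in angstrom (from the text)


def bond_number(d, d_single=D_SINGLE, c=C_PAULING):
    """Invert Pauling's formula d(n) - d(1) = -c log10(n)."""
    return 10.0 ** (-(d - d_single) / c)


def partner_distance(d, d_single=D_SINGLE, c=C_PAULING):
    """Second I...I distance, ASSUMING the two bond numbers add up to one."""
    n = bond_number(d, d_single, c)
    if not 0.0 < n < 1.0:
        raise ValueError("d must be longer than the single-bond distance")
    return d_single - c * math.log10(1.0 - n)


if __name__ == "__main__":
    for d in (2.75, 2.85, 2.95, 3.10):
        print(d, round(partner_distance(d), 3))
```

The symmetrical anion corresponds to both distances being equal to $d(1) + c \log_{10} 2$, about 2.93 Å with the constants used; shortening one distance towards 2.67 Å makes the other increase without limit, which reproduces the shape of one branch of the curve.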
The points on the curve may be supposed to represent the geometrical
changes occurring along the reaction coordinate of the process
The shape of the upper part of the curve indicates that the approach of the
I− ion to the iodine molecule is much more rapid than the lengthening of
approaches the carbon (i.e. $d_1$ decreases), the plane containing the carbon
and the two R and R' substituents bends away causing an increasing
pyramidalization of the carbon which rehybridizes from sp2 to sp3, while the
C=O distance is slightly increased. It seems of interest that the nucleophile
does not approach the carbonyl plane perpendicularly but makes an almost
constant N···C=O angle of 110°, which has been interpreted in terms of
elementary banana bond considerations but also shown to match the results
of ab initio quantum-mechanical calculations.[4] The constraint arising from
the fixed 110° angle is more clearly seen in (7.VI). The two -NR2 and
-COR groups would both be expected to be splayed outwards in order to
reduce their van der Waals repulsions but the C-N bond is actually found to
be splayed inwards by the need to maintain the correct approach angle to
the carbonyl C=O bond.

Fig. 7.25. The coordinate system is that of Fig. 7.24 projected on the (NCO)
plane. The open circles indicate the positions of the nitrogen atoms (top), of
the bisector of RCR' (bottom left) and of the carbonyl oxygen (bottom right)
in the 14 (A-L) structures considered. (Adapted from Bürgi and Dunitz.)

When looking at the dependence (Fig. 7.24) between $\Delta$ (carbonyl group
pyramidality) and $d_1$ (N···C distance) it is found that they are correlated
according to the regression line
and anilines respectively, in agreement with the fact that the contribution of
the polar form decreases with the decrease of electronegativity of X). In
spite of the general tendency to planarity, there are many crystal structures
$\chi_N$ against $\tau$ for all the chemical classes except amides and thioamides.
These latter cluster in the lower left corner with $\tau$ < 1.Y and $\chi_N$ ≤ 27°,
showing the greater resistance to out-of-plane deformation of these com-
pounds in agreement with the larger electronegativities of the oxygen and
sulphur atoms.
Compounds of the other classes undergo more severe distortions which
appear to be of two different types. The first, producing simple nitrogen
pyramidalization, occurs for τ nearly equal to zero and for increasing values
of χN and can be called a butterfly deformation. The second, which could
be called combined, associates nitrogen pyramidalization (increasing χN)
with a rotation around the C-N bond (increasing τ). The diagonal straight
line of Fig. 7.27 ideally separates mostly butterfly (on the left) from mostly
combined (on the right) distortions.
The combined motion admits a simple interpretation. In the planar
fragment the nitrogen is sp2 hybridized and its pz AO is involved in a π bond
with the pz AO on the carbon. The rotation around the C-N bond causes a
decoupling of the π system while the nitrogen rehybridizes, engaging its pz
AO into an sp3 HO carrying the lone pair. The opposite mechanism is also
possible: the planar nitrogen undergoes an out-of-plane vibration (butterfly
motion) which causes the rehybridization of the atom from sp2 to sp3. The
sp3 HO on nitrogen is essentially decoupled from the pz AO on carbon, the
double bond fades, and the rotation around C-N becomes possible. This
second mechanism seems to be preferred because Fig. 7.28 shows that χN (a
measure of the nitrogen rehybridization) is strongly correlated with a
lengthening of the C-N distance which almost encompasses the full
variation range of d(C-N) (from 1.27 to 1.44 Å going from double to single
C(sp2)-N(sp3) bond), while no similar correlation can be established with τ,
the rotation angle around the C-N bond.[141]
From the point of view of the structure correlation method the butterfly
motion corresponds to a simple out-of-plane vibration and does not produce
τ′ changes from zero to 180° and, accordingly, the nitrogen hybridization
from sp2 to sp3.
The values of the constants have been chosen as average values from
those of the different classes of compounds. The CTIB value is in the range
5-22 kcal mol⁻¹, QP is evaluated from vibrational spectroscopy and molecular
mechanics to be some 5-10 kcal mol⁻¹ rad⁻², and IB can be assimilated
to the inversion barrier of ammonia, usually reported as ≈6
kcal mol⁻¹. Values chosen were CTIB = 14, IB = 8 kcal mol⁻¹, and QP =
8 kcal mol⁻¹ rad⁻². Equation (7.49), however, evaluates the position of the
transition state at τ′ = 180° and χN = 60°, which does not correspond to that
found from Fig. 7.27 as far as χN is concerned. The reason is that some
allowance must also be made for the 1-4 non-bonded interactions, that is
particular system was studied originates from the empirical observation that the
π-conjugated system present in the β-diketone enol HOCR=CR-CR=O
fragment undergoes greater delocalization when it forms either intramolecular
or infinite-chain intermolecular hydrogen bonds. The nature
and the extent of this effect are shown in Table 7.9. Here the standard
distances are those tabulated[109] for pure single and double bonds, the
unperturbed ones are an average of nine fragments that cannot make
H-bonds (having OR instead of OH). The unperturbed geometry is an 87:13
mixture of the resonance forms (7.IXa) and (7.IXb) according to Pauling's
formula (7.31). However, when the fragments form H-bonds all distances
undergo changes consistent with an increased contribution of the polar form
(7.IXb), which becomes respectively 29 per cent for the intermolecular and
48 per cent for the intramolecular cases reported in the table; at the same
time the O···O contact distance, which is typically 2.74 Å in the O-H···O
bond in ice, becomes much shorter, being respectively 2.575 and 2.485 Å in
Table 7.9. Selected bond distances (Å) for the β-diketone enol fragment. %(7.IXb) = per
cent contribution of the polar form (7.IXb) according to Pauling's formula (7.31); the
standard O···O contact distance in parentheses is that observed in ice

Perturbed by:                    Bond distances (Å)             d(O···O)   %(7.IXb)
Intermolecular H-bond (7.XI)a    1.316  1.372  1.431  1.238     2.575      29
Intramolecular H-bond (7.X)b     1.281  1.398  1.410  1.279     2.485      48

the two cases, while the O-H bond distance (0.95 Å in the absence of
H-bonding and according to Table 7.7) is lengthened up to 1.24 Å in the last
compound of Table 7.9.
In a first study, 25 X-ray or neutron structures of β-diketones and
β-ketoesters were examined, 22 of type (7.X) and 3 of type (7.XI), and the
discussion reported here concerns this first analysis. Later on[25,149] the
investigation was extended to a much larger set of fragments (81 and 37 for
(7.X) and (7.XI), respectively) without significant differences
in the final results. In summary, all the following phenomena are observed
to occur together:
(1) increased delocalization of the π-conjugated system;
(2) strengthening of the O-H···O bond as indicated by the shortening of
d(O···O) and the lengthening of d(O-H) distances;
(3) shift of the proton position towards the middle point of the O···O
contact.
The three effects may occur with different intensities but maintain the
same intercorrelation for both intramolecularly (7.X) and intermolecularly
(7.XI) bonded fragments (the intramolecular H-bond is usually stronger and
causes greater π delocalization) and even for different classes of compounds
(e.g. β-diketone enols form stronger H-bonds and are more delocalized
than the enols of β-ketoesters or β-ketoamides).
While the H-bond strengthening is easily measured by both d(O···O) and
d(O-H) changes, a specific geometric parameter is needed in order to describe
the π system delocalization. The simplest way consists in using symmetry
coordinates for the in-plane antisymmetric vibration of the group (7.X) or
(7.XI), that is q1 = d4 - d1 and q2, the analogous difference between d2 and d3.
Clearly q1 = q2 = 0 for the
totally delocalized structure, while the greatest values will occur for the
standard bond distances of Table 7.9. Figure 7.31 reports the scatter plot of
[Figure residue: scatter plot whose legend distinguishes intramolecular (INTRA HB) and intermolecular (INTER HB) hydrogen bonds.]
the H-bond itself (E_HB) and the van der Waals energy originated from the
attractive and repulsive terms of the 1-4 interactions (E_vdw). Re-establishing
now the double C=C bond, the resonance (7.IXa) ↔ (7.IXb)
will cause a net shift of electrons from left to right (Fig. 7.34), which will
stop when the minimum of E_RES + E_BP is attained, where the first term is
the resonance energy gain and the second the energy needed to dissociate
the opposite partial charges on the terminal oxygens. The charges have the
correct sign for strengthening the H-bond with consequent shortening of
d(O···O) and lengthening of d(O-H). The movement of the proton to the right is
equivalent to a negative charge going to the left and the global effect is the
annihilation of the partial charges initially set up by resonance, so allowing
an increased contribution of the polar form (7.IXb) and a further
strengthening of the H-bond, an imaginary process going on until the
minimum of the function

$$E = E_{HB} + E_{vdw} + E_{RES} + E_{BP} \qquad (7.51)$$

is attained. The H-bond formed has quite specific features and, in view of
the strict interplay of π delocalization and H-bond strengthening which
causes it, has been called[24] resonance assisted hydrogen bonding
(RAHB). It is essentially a charge assisted H-bond where, however, the
charges do not arise from the presence of ions but from the resonance in a
heteroconjugated system.

Fig. 7.34. A graphical representation of the resonance assisted hydrogen bonding (RAHB) model. (Reproduced by permission from Gilli et al.)

The RAHB model is supported by a wealth of spectroscopic data and
theoretical calculations for which the reader is referred to the original
paper. What is more important here is to discuss whether the correlation
among geometrical parameters can be interpreted in terms of the potential
energy hypersurface of the fragment. The approximate energy partitioning
used in (7.51) can be evaluated quantitatively since semi-empirical equations
are available for all four terms. E_HB is the total energy, including
both attraction and repulsion terms, of the O-H···O bond as a function of
d(O-H) = r and d(O···O) = R. It can be written as E_HB(r, R) and has been

all the desired values of r at the R value for which the energy was a relative
minimum. In such a way R turns out to be a function of r and is known for
any value of r. E_RES can be calculated by the method proposed by
Krygowski et al.,[144] which is known to give the resonance energy of a
π-conjugated system from its bond distances with fairly good accuracy. E_BP
is the energy required to create the opposite fractional charges ±q on the
two terminal oxygens and can be easily evaluated by means of the
coefficients of the atomic ionization energy versus electron affinity curves
tabulated by Hinze and Jaffé for the main elements. Finally, E_vdw
can be calculated by the methods and equations already discussed in the
section on molecular mechanics.
The final potential energy map calculated according to (7.51) for
acetylacetone (R1 = R3 = CH3, R2 = H) is shown in Fig. 7.35 as a function of
two coordinates: the bond number of the O-H bond, n(O-H) (which is 1.0
in the absence of H-bonding, when d(O-H) assumes the value of 0.97 Å, and
0.5 when the hydrogen is equally shared by the two oxygens and
d(O-H) = d(H···O) = 1.25 Å), and the coupling parameter λ (which assumes the
values of λ = 0, 0.5, and 1 for π-localized EK, fully π-delocalized, and
π-localized KE fragment structures, respectively) or its related quantity Q.
Figure 7.36 illustrates the physical meaning that can be given to the four
corners and the central point of the plot of Fig. 7.35 while Fig. 7.37 shows a
three-dimensional representation of the potential energy map in the same
coordinate system.
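
The bond-number coordinate n(O-H) can be illustrated numerically. The sketch below assumes a Pauling-type logarithmic relation d(n) = d(1) - c log10(n); the exact form of formula (7.31) is not reproduced in this section, so both the functional form and the constant c (fixed here only by the two anchor values quoted above, 0.97 Å at n = 1.0 and 1.25 Å at n = 0.5) are assumptions, and the function names are hypothetical.

```python
import math

D_OH_SINGLE = 0.97  # Å, O-H distance at bond number 1.0 (value quoted in the text)
D_OH_SHARED = 1.25  # Å, O-H distance at bond number 0.5 (value quoted in the text)

# Assumption: Pauling-type relation d(n) = d(1) - c * log10(n).
# The constant c is not taken from the book; the two anchors above fix it.
C_PAULING = (D_OH_SHARED - D_OH_SINGLE) / math.log10(2.0)


def bond_number(d_oh: float) -> float:
    """Bond number n(O-H) for an O-H distance in Å under the assumed relation."""
    return 10.0 ** ((D_OH_SINGLE - d_oh) / C_PAULING)


def distance_from_bond_number(n: float) -> float:
    """Inverse mapping: O-H distance in Å for a bond number 0 < n <= 1."""
    return D_OH_SINGLE - C_PAULING * math.log10(n)
```

By construction the two anchor distances map to n = 1.0 and n = 0.5; intermediate distances are only as reliable as the assumed logarithmic form.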
The potential energy map is centrosymmetric in consequence of the
fragment symmetry and displays a diagonal valley of lower energy which can
[Figure residue: axis labels 'PROTON TRANSFER' and 'FORMAL CHARGE' (0, 0.5, 1); scheme labels 7.XII and 7.XIII.]
From this point of view, the β-diketone enols so far discussed correspond to
the heterodienic scheme (7.52) (X = Y = O), but the RAHB mechanism is
supposed to operate independently of the number of interleaving carbon
atoms provided the conjugation between the lone pair on X and the C=Y
double bond is maintained. In the X = Y = O series the scheme (7.55)
represents carboxylic acids which, in fact, are well known to give strongly
H-bonded dimers (Fig. 7.3, scheme b), though a more convincing proof of
RAHB could come from the observation of strongly delocalized and
H-bonded heterotrienes (7.53) and heterotetraenes (7.54). Structures
corresponding to these fragments have been recently analysed[25] and found to
display the expected behaviour. Compounds (7.XII) behave as totally
[Scheme labels 7.XIV, 7.XV, 7.XVI, 7.XVII, ..., 7.XXIII; the drawings are not reproduced.]
References
1. Wells, A. F. (1975). Structural inorganic chemistry (4th edn). Clarendon,
Oxford.
2. Bondi, A. (1964). Journal of Physical Chemistry, 68, 441.
3. Allen, F. H., Bellard, S., Brice, M. D., Cartwright, B. A., Doubleday, A.,
Higgs, H., Hummelink, T., Hummelink-Peters, B. G., Kennard, O., Motherwell,
W. D. S., Rodgers, J. R., and Watson, D. G. (1979). Acta
Crystallographica, B35, 2331.
4. Dunitz, J. D. (1979). X-ray analysis and the structure of organic molecules.
Cornell University Press, Ithaca, N.Y.
5. Pauling, L. (1960). The nature of the chemical bond and the structure of
molecules and crystals. An introduction to modern structural chemistry (3rd
edn). Cornell University Press, Ithaca, N.Y.
6. Kitaigorodski, A. I. (1973). Molecular crystals and molecules. Academic, New
York.
7. Kaplan, I. G. (1986). Theory of molecular interactions. Elsevier, Amsterdam.
8. Pertsin, A. J., and Kitaigorodski, A. I. (1987). The atom-atom potential
method. Springer, Berlin.
9. London, F. (1930). Zeitschrift für Physikalische Chemie, B11, 222, 236; (1930).
Zeitschrift für Physik, 63, 245.
10. Lennard-Jones, J. E. (1931). Proceedings of the Physical Society, 43, 461.
11. Buckingham, R. A. (1938). Proceedings of the Royal Society of London, A168,
264.
12. Giglio, E. (1969). Nature, 222, 339.
13. Allinger, N. L. (1975). MM1/MMP1, QCPE No. 318. Indiana University.
14. Hamilton, W. C., and Ibers, J. A. (1968). Hydrogen bonding in solids.
Benjamin, New York.
15. Pimentel, G. C., and McClellan, A. L. (1960). The hydrogen bond. Freeman,
San Francisco; Pimentel, G. C., and McClellan, A. L. (1971). Annual Reviews
of Physical Chemistry, 22, 347.
16. Kollman, P. A., and Allen, L. C. (1972). Chemical Reviews, 72, 283; Kollman,
P. A. and Joesten, M. D. (1982). Journal of Chemical Education, 59, 362.
17. Schuster, P., Zundel, G., and Sandorfy, C. (ed.) (1976). The hydrogen bond,
Vols. I, II, and III. North-Holland, Amsterdam.
18. Emsley, J. (1980). Chemical Society Review, 9, 91.
19. Umeyama, H., and Morokuma, K. (1977). Journal of the American Chemical
Society, 99, 1316.
20. Morokuma, K. (1971). Journal of Chemical Physics, 55, 1236.
530 1 Gastone Gilli
Bijvoet, J. M., Peerdeman, A. F., and van Bommel, A. J. (1951). Nature, 168,
271.
Rogers, D. (1975). In Anomalous scattering (ed. S. Ramaseshan and S. C.
Abrahams). Munksgaard, Copenhagen. p. 231.
Cotton, F. A. (1971). Chemical applications of group theory (2nd edn). Wiley,
New York.
Cano, F. H., Foces-Foces, C., and García-Blanco, S. (1977). Tetrahedron, 33,
797.
Cremer, D. and Pople, J. A. (1975). Journal of the American Chemical Society,
97, 1354.
Boeyens, J. C. A. (1978). Journal of Crystal and Molecular Structure, 8, 317.
Altona, C., Geise, H. J., and Romers, C. (1968). Tetrahedron, 24, 13.
Parkanyi, L. (1980). RING. Hungarian Academy of Sciences, Budapest.
Nardelli, M. (1982). PARST. University of Parma, Italy.
Clark, T. (1985). A handbook of computational chemistry. Wiley, New York.
Hehre, W. J., Radom, L., Schleyer, P. v. R., and Pople, J. A., (1985). Ab
initio molecular orbital theory. Wiley, New York.
Pople, J. A., and Beveridge, D. L. (1970). Approximate molecular orbital
theory. McGraw-Hill, New York.
Dewar, M. J. S. (1969). The molecular orbital theory of organic chemistry.
McGraw-Hill, New York.
Murrell, J. N., and Harget, A. J. (1972). Semiempirical self-consistent-field
molecular orbital theory of molecules. Wiley, London.
Frisch, M. J., Head-Gordon, M., Schlegel, H. B., Raghavachari, K., Binkley,
J. S., Gonzalez, C., et al. (1988). GAUSSIAN 88. Gaussian Inc., Pittsburgh,
PA.
Bingham, R. C., Dewar, M. J. S., and Lo, D. H. (1975). Journal of the
American Chemical Society, 97, 1285, 1294, 1302, 1307, 1311; Dewar, M. J. S.,
and Thiel, W. (1977). Journal of the American Chemical Society, 99, 4899,
4907.
Dewar, M. J. S., Zoebisch, E. G., Healy, E. F., and Stewart, J. J. P. (1985).
Journal of the American Chemical Society, 107, 3902.
Coulson, C. A., (1961). Valence. Oxford University Press, London.
McWeeny, R. (1979). Coulson's valence. Oxford University Press, London.
75. Streitwieser, A. Jr. (1961). Molecular orbital theory for organic chemists.
Wiley, New York.
76. Gillespie, R. J. (1972). Molecular geometry. Van Nostrand-Reinhold, London.
77. Gillespie, R. J. (1963). Journal of Chemical Education, 40, 295; (1970). 47, 18.
78. Bader, R. F. W., Gillespie, R. J., and MacDougall, P. J. (1988). Journal of the
American Chemical Society, 110, 7329.
79. Orgel, L. E. (1966). An introduction to transition-metal chemistry (2nd edn).
Wiley, New York.
80. Bethe, H. (1929). Annalen der Physik, 3, 135.
81. Van Vleck, J. H. (1932). Physical Review, 41, 208; (1935). Journal of Chemical
Physics, 3, 803, 807.
82. Ballhausen, C. J. (1962). Introduction to ligand field theory. McGraw-Hill,
New York.
83. Dunn, T. M., McClure, D. S., and Pearson, R. G. (1965). Some aspects of
crystal field theory. Harper and Row, New York.
84. Schäffer, C. E., and Jørgensen, C. K. (1965). Molecular Physics, 9, 401.
85. Schaffer, C. E. (1973). Structure and Bonding, 14, 69.
86. Burdett, J. K. (1978). Chemical Society Review, 7, 507.
87. Burdett, J. K., (1980). Molecular shapes. Theoretical models of inorganic
stereochemistry, p. 23. Wiley, New York.
88. Walsh, A. D. (1953). Journal of the Chemical Society, 2260, 2266, 2288, 2296,
2301, 2306.
146. Hinze, J., Whitehead, M. A., and Jaffé, H. H. (1963). Journal of the
American Chemical Society, 85, 148.
147. Iijima, K., Ohnogi, A., and Shibata, S. (1987). Journal of Molecular Structure,
156, 111.
148. Ferretti, V., Bellucci, F., Bertolasi, V., and Gilli, G. (1985). Proceedings of the
IX European Crystallographic Meeting, Torino, p. 337.
149. Bertolasi, V., Gilli, P., Ferretti, V., and Gilli, G. (1991). Journal of the
American Chemical Society, 113, 4917.
Protein crystallography
Introduction
Protein crystallography, the subject of this chapter, is a specialized branch
of crystallography that investigates, by using diffraction techniques on single
crystals, the three-dimensional structure of biological macromolecules.[1-5]
For a long time, good quality single crystals were obtainable only for
globular proteins. Nowadays, other biological macromolecules, like t-RNA,
polysaccharides, and polynucleotides, give crystals suitable for X-ray
analysis. Therefore, the techniques for structure solution and refinement
described in this chapter apply to all cases where crystals with a high proportion
of disordered solvent can be grown.
Despite the fact that crystals of proteins have been known since the
beginning of the century, only around 1960 were the first structures,
myoglobin and haemoglobin, elucidated. Their solution was made
possible mainly by the development of the isomorphous replacement
technique. Since then, many theoretical and technical advances have been
made: among them, the use of anomalous dispersion, molecular replacement,
and the development of oscillation techniques in data collection from
crystals with large cell dimensions. At the end of the 1970s the growth of
computing power made refinement in reciprocal space possible. In the
meantime, the introduction of powerful and affordable graphic stations
brought the development of interactive graphic software, allowing faster
interpretation of electron density maps and model building. Only in very
recent years has the availability of synchrotron radiation and two-dimensional
detectors made data collection a routine procedure.
An idea of the 'state of the art' of protein crystallography is given by the
Brookhaven Protein Data Bank (hereafter called PDB), which collects
coordinates of the macromolecular structures solved, mainly by X-rays.
Looking at the January 1989 release, one finds, perhaps with some surprise,
that only 413 sets of coordinates are available (plus 86 bibliographic entries,
that is structures solved but not yet deposited).† They include 10 polysaccharides,
26 DNA fragments, 6 t-RNA, and 18 model structures. The
remaining 353 are protein structures, including 14 virus coat proteins: taking
into account the fact that most of them are related proteins or variants of
the same molecule (there are for example 34 lysozyme variants or mutants,
12 haemoglobins, and 8 myoglobins), this number must be considered quite
small and demonstrates that solving the structure of a new macromolecule
† It must be remembered that the Protein Data Bank cannot be considered fully
representative of the macromolecular structures so far solved, since not all of them are quickly
deposited.
536 1 Giuseppe Zanotti
still remains a long and demanding job. But possibly the reason for the
relatively low number of new structures published every year is the
crystallization process, a field in which technical (and theoretical) advances
have proceeded very slowly and which remains the most uncertain step in
protein structure determination.
Protein crystals
It is the solvent content that makes the difference between a classical
molecular crystal and a protein crystal: in the former, all the atoms can be
described in terms of a regular lattice, whilst in the latter a crystalline array
coexists with a high proportion of material in the liquid state. The mother
liquor, whose content can range approximately from 30 to 75 per cent or
more, has a strong influence on the behaviour of this kind of crystal, making
it very peculiar and creating some advantages along with some obvious
disadvantages. Among the latter, the major one is that protein crystals are
much less ordered than classical crystals, not only because of the large amount of
disordered material present in the crystal itself, but also because surface
groups of the macromolecule in contact with the solvent can show great
mobility. As a consequence, diffraction data cannot be measured to the
resolution normally attainable with 'small' molecules. On the other hand,
among the advantages, the environment of the macromolecule in the crystal
is not too different from that of the solution from which the crystal was
obtained (the influence of the solvent on the conformation of the protein
should not be underestimated) and we can profit from the solvent in the
preparation of heavy-atom derivatives, as will be described in the next
paragraph.
A last important point must be remembered about protein crystals: since
inversion symmetry elements are not allowed, the number of possible space
groups is reduced from the original number of 230 to 65 (see Chapter 1,
p. 24).
since the first term in parentheses represents the specific volume of the
protein and the second the reciprocal of VM, and remembering that
molecular weight is expressed in Da:
If we assume a value of about 1.35 g cm⁻³ for the protein density, as a first
approximation:

$$V'_{prot} \simeq 1.23 / V_M \qquad (8.3)$$

$$V'_{solv} = 1 - V'_{prot}. \qquad (8.4)$$
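
The arithmetic behind (8.3) and (8.4) can be checked with a short sketch. It assumes the usual Matthews definition of V_M as cell volume divided by the total protein mass in the cell (Å³ per Da), which is not written out in this section; the function names and the cell data in the usage note are illustrative only.

```python
AVOGADRO = 6.02214076e23  # mol^-1


def protein_volume_per_dalton(density_g_cm3: float = 1.35) -> float:
    """Volume occupied by 1 Da of protein, in Å^3 (1 cm^3 = 1e24 Å^3)."""
    return 1.0e24 / (density_g_cm3 * AVOGADRO)


def matthews_coefficient(cell_volume_a3: float, n_molecules: int,
                         mol_weight_da: float) -> float:
    """Assumed definition: V_M = V_cell / (n_molecules * M), in Å^3 per Da."""
    return cell_volume_a3 / (n_molecules * mol_weight_da)


def volume_fractions(v_m: float, density_g_cm3: float = 1.35):
    """Protein and solvent volume fractions, following (8.3) and (8.4)."""
    v_prot = protein_volume_per_dalton(density_g_cm3) / v_m
    return v_prot, 1.0 - v_prot
```

For a hypothetical cell of 237 000 Å³ holding 8 molecules of 14 300 Da, `matthews_coefficient` gives a V_M close to 2.07 Å³/Da and `volume_fractions` a solvent fraction of roughly 40 per cent. With the default density, `protein_volume_per_dalton()` returns the constant 1.23 used in (8.3).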
The most important information contained in V_M is not the solvent
content, but the molecular mass of the macromolecule in the crystal cell. Very
often the number of molecules per asymmetric unit can be determined
unambiguously when the molecular weight of the protein is known, at least
approximately. For monomeric proteins, or for those without the possibility
of internal symmetry, the number of molecules in the cell is relevant only
for the subsequent steps in structure determination. For multimeric proteins,
composed of identical subunits, this number may have important
biological consequences: in fact, if some of the internal symmetry elements
are coincident with the crystallographic ones, the relative arrangement of
such subunits in the molecule is immediately available.[15]
The experimental determination of the density of the protein crystal,
along with that of the dried content of the crystal itself, can allow a quite
accurate determination of the molecular mass of the protein contained in
the asymmetric unit, and consequently the molecular weight of the protein.
These techniques have been reviewed by Matthews, but the reader must
be warned that the experimental measurement of the density of such
crystals is difficult and the results very often uncertain.
tion is made possible by their quite high solvent content: the presence of
channels of disordered or only partially ordered liquid allows the diffusion
of relatively small compounds into the crystal. Reactions between the diffused
compounds and any accessible reactive sites of the protein can take
place. In some special cases, protein molecules are derivatized in solution
and subsequently crystallized, but the former procedure is simpler and it is
advisable to try it first. In practice, the crystal is soaked in a solution of its
mother liquor in which the heavy-atom compound has been dissolved. A list
of the more commonly used heavy-atom derivatives is given in Table 8.2.
Concentration of heavy atoms and time of soaking are the most relevant
variables: they can range from 0.001 M to 0.1 M and from a few hours to
several days (these numbers must be considered only a rough indication,
since extreme cases have been reported). These conditions strongly depend
on the protein under study, on the precipitating agents, on the pH used, and
on the temperature.
1. pH. At high pH (greater than 9) the acidic groups of the protein are
mostly negatively charged, and potentially they can react with cations. At
the same time, many heavy atoms form insoluble hydroxides. At low pH,
on the other hand, potentially reactive groups will be protonated and
prevented from reacting.
2. Precipitating agents. High salt concentration is not the ideal medium
for heavy-atom reaction with proteins, not only because the solubilization of
the compound can be prevented by a very high salt concentration, but also
because ions in the solution will compete with the protein. Polyethylene
glycol solutions on the contrary provide favourable conditions for heavy-
atom reactions, since PEG does not react with most of the compounds
commonly used.
3. Temperature. There is an obvious kinetic effect of temperature on
heavy-atom reactions, which are slower at 4 °C than at room temperature.
Very often preparation of heavy-atom derivatives is a trial and error
technique, unless some special features of the macromolecule are known a
first protein structures, but it remains, with few exceptions, the only
procedure available to solve the structure of completely new proteins.
Let $\mathbf{F}_P$ be the structure factor of the native protein, $F_P$ its magnitude and
$\varphi_P$ its phase. $\mathbf{F}_{PH}$, $F_{PH}$, and $\varphi_{PH}$ are the corresponding quantities of the
heavy-atom derivative. If we assume perfect, 'ideal' isomorphism, the
relationship between $\mathbf{F}_P$ and $\mathbf{F}_{PH}$ is illustrated in Fig. 8.1 and can be
written:

$$\mathbf{F}_{PH} = \mathbf{F}_P + \mathbf{F}_H$$

where $\mathbf{F}_H$ is a vector representing the contribution of the heavy atom
The right-hand side of eqn (8.6) shows that the Patterson map calculated
with these coefficients will contain peaks corresponding to heavy-atom to
heavy-atom vectors, $F_H^2$, and to heavy-atom to protein-atom vectors, the
mixed terms.
A more common choice among protein crystallographers is to calculate a
Patterson synthesis using coefficients $(F_{PH} - F_P)^2$. This was originally called
modulus difference squared synthesis, but is commonly known as
isomorphous difference Patterson. The resultant map has several interesting
features, but the main reason for its success is that it is more
representative of the heavy-atom situation. In particular, in the centrosymmetric
case (that is for centric† projections in a non-centrosymmetric space
group), since $\mathbf{F}_{PH}$ and $\mathbf{F}_P$ are collinear, these coefficients represent a true
estimate of the value of $F_H$. In fact:
† The word centric is used here in a general sense, indicating that phases are restricted to two
values, not necessarily 0 or π.
for $\varphi_P$. It is important to notice that this problem is immediately solved for
restricted phase reflections, for which eqn (8.12) becomes:

Since the value of $F_H$ is known from calculation and the values of $F_{PH}$ and $F_P$
from observation, the relative sign of $F_P$ with respect to $F_H$ can be
determined unambiguously from the comparison of the magnitude of the
three vectors (Fig. 8.3). For general reflections this is unfortunately not
true, and the ambiguity between the two possible values for the phase angle
must be solved with other techniques.
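
The two situations just described can be sketched numerically. The code below assumes perfect isomorphism, so that the derivative structure factor is the vector sum of the native and heavy-atom contributions; the function names are hypothetical and the routines are an illustration, not the procedure of any particular program. For a centric reflection the three magnitudes fix the relative sign; for a general reflection the cosine rule applied to the same triangle leaves two candidate phases, symmetric about the heavy-atom phase.

```python
import math


def centric_sign(f_p: float, f_ph: float, f_h: float) -> int:
    """Relative sign of F_P with respect to F_H for a centric reflection.

    Returns +1 if the magnitudes are best explained by F_P and F_H having
    the same sign (F_PH = F_P + F_H), -1 if opposite (F_PH = |F_H - F_P|).
    """
    misfit_same = abs((f_p + f_h) - f_ph)
    misfit_opposite = abs(abs(f_h - f_p) - f_ph)
    return 1 if misfit_same <= misfit_opposite else -1


def native_phase_candidates(f_p: float, f_ph: float, f_h: float, phi_h: float):
    """Two possible native phases (radians) for a general reflection.

    From F_PH^2 = F_P^2 + F_H^2 + 2 F_P F_H cos(phi_P - phi_H);
    f_p and f_h must be non-zero.
    """
    cos_alpha = (f_ph ** 2 - f_p ** 2 - f_h ** 2) / (2.0 * f_p * f_h)
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding/errors
    alpha = math.acos(cos_alpha)
    return phi_h + alpha, phi_h - alpha
```

With a small heavy-atom contribution, `centric_sign(100, 120, 20)` and `centric_sign(100, 80, 20)` distinguish the two ordinary cases, while `centric_sign(30, 70, 100)` reproduces the 'cross-over' situation, where F_P is smaller than F_PH and yet the sign is reversed.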
Fig. 8.3. The sign of reflections of centric zones can be fully determined using only one heavy-atom derivative (after Blundell and Johnson[16]). When $F_H$ is small compared with $F_P$ and $F_{PH}$, cases (a) and (b) apply: if $F_P < F_{PH}$, the sign of $F_P$ is equal to that of $F_H$; if $F_P > F_{PH}$, the sign of $F_P$ is reversed. The so-called 'cross-over', represented in (c), may occur when $F_H$ is large and $F_P$ is small: despite the condition $F_P < F_{PH}$, the sign of $F_P$ is reversed with respect to $F_H$.

In principle, a three-dimensional electron density map could be calculated,
using the two possible phase values: this kind of synthesis is called an
SIR map and it will contain information about the true structure plus
noise. In fact, if we call $\varphi_{PT}$ the correct phase value of $\varphi_P$ and $\varphi_{PW}$ the
wrong one, looking at Fig. 8.4 we can write:

$$F_{SIR} = F_P \exp(i\varphi_{PT}) + F_P \exp(i\varphi_{PW}). \qquad (8.16)$$

Since $\varphi_H = \tfrac{1}{2}(\varphi_{PT} + \varphi_{PW})$, then:

(8.17)
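
The body of eqn (8.17) is not legible in this copy. As a hedged reconstruction, the following short derivation shows what follows from (8.16) once the mean of the two candidate phases is identified with the heavy-atom phase; the form printed in the book may be arranged differently.

```latex
\begin{aligned}
F_{SIR} &= F_P\left[\exp(i\varphi_{PT}) + \exp(i\varphi_{PW})\right] \\
        &= 2F_P \cos\!\left(\tfrac{\varphi_{PT}-\varphi_{PW}}{2}\right)
           \exp\!\left(i\,\tfrac{\varphi_{PT}+\varphi_{PW}}{2}\right) \\
        &= 2F_P \cos(\varphi_{PT}-\varphi_H)\,\exp(i\varphi_H)
\end{aligned}
```

Under these assumptions the SIR coefficient carries the heavy-atom phase, with an amplitude that shrinks as the true phase moves away from it.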
method used to solve this phase ambiguity, and it remains the most
common and widely adopted. It is based on the preparation of a second
heavy-atom derivative having heavy atoms bound to the protein in
position(s) different from those of the first one. Let us assume we have
measured coefficients from two derivatives, called $F_{PH1}$ and $F_{PH2}$, and in
some way we have calculated $\mathbf{F}_{H1}$ and $\mathbf{F}_{H2}$ (that means we know the
positions of the heavy atoms of the two derivatives, relative to the same
origin). The complete solution of the phase problem can be illustrated using
a diagram due to Harker, illustrated in Fig. 8.5: a circle of radius $F_P$
centred on the origin O of an Argand plane is drawn, and from O the two
vectors $-\mathbf{F}_{H1}$ and $-\mathbf{F}_{H2}$; they are in turn the centres of two circles of radii
$F_{PH1}$ and $F_{PH2}$ respectively. Since the two derivatives are assumed perfectly
isomorphous, the three circles must intersect in one point (B in Fig. 8.5):
the vector OB will coincide with the structure factor of the protein. The
graphical construction just described is equivalent to solving a pair of

Fig. 8.4. The same diagram shown in Fig. 8.2, except that now the 'true' and the 'wrong' value for the native structure factors are explicitly indicated as $\mathbf{F}_{PT}$ and $\mathbf{F}_{PW}$ respectively. It can be seen that the further the two vectors are from each other, the more the S.I.R. phase (corresponding to $\varphi_H$) is different from $\varphi_{PT}$.
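
A numerical counterpart of the Harker construction can be sketched as follows. It assumes ideal isomorphism and error-free magnitudes, and the function names are hypothetical: each derivative contributes its two candidate native phases (the intersections of its circle with the native circle), and the candidate that best closes all the phase triangles is retained.

```python
import cmath
import math


def phase_candidates(f_p, f_ph, f_h_vec):
    """Two native phases compatible with one derivative (cosine rule)."""
    f_h = abs(f_h_vec)
    phi_h = cmath.phase(f_h_vec)
    cos_alpha = (f_ph ** 2 - f_p ** 2 - f_h ** 2) / (2.0 * f_p * f_h)
    alpha = math.acos(max(-1.0, min(1.0, cos_alpha)))
    return [phi_h + alpha, phi_h - alpha]


def total_misfit(phi, f_p, derivatives):
    """Sum over derivatives of |F_PH(obs) - |F_P exp(i phi) + F_H||."""
    return sum(abs(abs(f_p * cmath.exp(1j * phi) + f_h_vec) - f_ph)
               for f_ph, f_h_vec in derivatives)


def harker_phase(f_p, derivatives):
    """Native phase in (-pi, pi] that best satisfies all derivatives.

    derivatives: list of (F_PH magnitude, complex F_H) pairs on a common origin.
    """
    candidates = [phi for f_ph, f_h_vec in derivatives
                  for phi in phase_candidates(f_p, f_ph, f_h_vec)]
    best = min(candidates, key=lambda phi: total_misfit(phi, f_p, derivatives))
    return math.atan2(math.sin(best), math.cos(best))
```

With real data the circles do not meet in a single point, so choosing a single candidate in this way is only a starting point; a probabilistic treatment of the errors is then needed.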
Notice that we do not need native coefficients for the calculation, so (8.21)
is also useful when an anomalous scatterer is present in the native protein
crystal. A map calculated with coefficients (8.21), supposing the anomalous
contribution $F''_H$ is small compared with $F_P$, which is very often true in our
case, will contain maxima corresponding to vectors relating the positions of
anomalous scatterers. This synthesis was for example used by Rossmann
to determine the Hg position in two derivatives of horse haemoglobin.
Other kinds of anomalous Patterson maps with coefficients different from

Fig. 8.7. For restricted-phase reflections, if only one type of anomalous scatterer is present, $F^+_{PH}$ and $F^-_{PH}$ have the same magnitude, independent of the size of the imaginary component.
Protein crystallography 1 545
† Readers must be warned that measurement of Bijvoet pairs requires very accurate data
collection, since anomalous differences are very often of the same order of magnitude as the
experimental errors on intensities. Synchrotron radiation is extremely helpful in this respect:
the appropriate wavelengths that optimize the anomalous effect can be selected (see p. 239).
value of $\varphi_P$ associated with every $\varphi_H$. Let us assume the solution for the
phase angle of the derivative structure factor, from (8.A.7) and (8.A.15), is
$\varphi_{PH} = \varphi_H + \delta$. We are left with another ambiguity, since the original choice
of $\varphi_H$ was arbitrary: the two possibilities are in fact $\varphi_{PH} = \varphi_H + \delta$ or
$\varphi_{PH} = -\varphi_H - \delta$, and they determine the selection of the enantiomorph
(Fig. 8.9).
At least two heavy-atom derivatives and anomalous differences for one of
them are necessary to select the correct hand of the molecule: using phases
from one derivative along with the anomalous contribution and the heavy atoms
in the two possible arrangements, two difference Fourier maps of the second one
are calculated. Peaks in the difference map calculated with atoms in the
correct hand will be reinforced, those with the wrong one lowered.
Fig. 8.10. Harker construction analogous to Fig. 8.5, where, due to lack of isomorphism and
errors in the data, the three circles of radius $F_P$, $F_{PH1}$, and $F_{PH2}$ do not intersect at the same point.
[Fig. 8.11, caption fragment] ... between the true and the estimated value of F. The third source of
error, the inaccuracy in the measurement of $F_P$ and $F_{PH}$, prevents the ...
The difference $F_{PH(\mathrm{obs})} - F_{PH(\mathrm{calc})}$ in (8.24) is called lack of closure and is
usually denoted by $\varepsilon_i$; its physical meaning is illustrated in Fig. 8.11, which
is a realistic picture of what was idealized in Fig. 8.1: now the triangle
formed by the three vectors $F_P$, $F_{PH}$, and $F_H$ does not close. The lack of
closure depends on the phase of the protein, since:
$N_i$ in (8.24) is a normalizing factor and $E_i$ is a measure of the total error. The
choice of $E_i$ is crucial and not obvious. In practice, the value commonly
used is:[31]
The right-hand side of (8.26) represents the mean square value of the lack
of closure residual. Since it depends on the resolution, it is evaluated in
ranges of $\sin\theta/\lambda$.
An a priori derivation of (8.24) along with that of $E_i$ for centric and
acentric reflections has been obtained by Terwilliger and Eisenberg[29] as a
function of all the different sources of errors:
Equations (8.27a) and (8.27b) are not of direct use, since we lack a value
for q and p, but from them a theoretical justification of (8.26) can be
obtained.[29]
If more than one derivative is present, (8.24) can be modified to give the
total probability distribution of the phase of a reflection:
The sum is extended to all the j derivatives used in calculating the phases. A
diagram showing $P_j(\varphi_P)$ for the hypothetical derivatives of Fig. 8.10 is
reported in Figs. 8.12(a) and (b), and the total probability in Fig. 8.12(c).
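How several derivatives resolve the phase ambiguity can be illustrated with a small numerical sketch. It assumes the usual Gaussian lack-of-closure model, in which each derivative contributes a factor exp(-ε²/2E²) with ε = F_PH(obs) - |F_P exp(iφ_P) + F_H|; equations (8.24) and (8.25) are not reproduced in this passage, so this form is an assumption. The function names and the numerical values are hypothetical.

```python
import numpy as np

def phase_probability(f_p, f_ph_obs, f_h, e_rms, phases):
    """Unnormalized P_j(phi_P) for one derivative (assumed Gaussian error model)."""
    # Calculated derivative amplitude |F_P exp(i phi_P) + F_H| for every trial phase
    f_ph_calc = np.abs(f_p * np.exp(1j * phases) + f_h)
    closure = f_ph_obs - f_ph_calc            # lack of closure at each trial phase
    return np.exp(-closure**2 / (2.0 * e_rms**2))

def total_probability(f_p, derivatives, n_steps=360):
    """Multiply the single-derivative distributions on a regular phase grid."""
    phases = np.deg2rad(np.arange(n_steps) * 360.0 / n_steps)
    p_tot = np.ones_like(phases)
    for f_ph_obs, f_h, e_rms in derivatives:
        p_tot *= phase_probability(f_p, f_ph_obs, f_h, e_rms, phases)
    return phases, p_tot / p_tot.sum()

def synthetic_derivative(f_p, true_phase_deg, f_h_amp, f_h_phase_deg, e_rms):
    """Hypothetical error-free derivative built from a known protein phase."""
    f_h = f_h_amp * np.exp(1j * np.deg2rad(f_h_phase_deg))
    f_ph_obs = abs(f_p * np.exp(1j * np.deg2rad(true_phase_deg)) + f_h)
    return f_ph_obs, f_h, e_rms
```

Each synthetic derivative alone gives two equally probable phases, symmetric about the heavy-atom phase; the product of two derivatives with different heavy-atom phases keeps only the common solution.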
The lack of closure for anomalous scattering measurements assumes a
different form:[25,26,29]
but the probability for the phase of the structure factor due to the
anomalous scattering is analogous to (8.24):
$E_{\mathrm{ano}}$ is usually smaller than $E_{\mathrm{iso}}$ and can be estimated for example from the
agreement among centric reflections.[25] If isomorphous and anomalous
information are available for the derivative j, they can be combined to give
a total probability:
$$P_j(\varphi_P) = P_{j(\mathrm{iso})}(\varphi_P)\,P_{j(\mathrm{ano})}(\varphi_P). \qquad (8.31)$$
On the basis of a different formulation of the error model, Hendrickson and
if every one is cast in the form (8.32), $P_{\mathrm{tot}}(\varphi_P)$ can be obtained by the sum
of the individual coefficients:
$$\dots + \sum_i C_i \cos 2\varphi_P + \sum_i D_i \sin 2\varphi_P$$
$\varphi_{P(\mathrm{best})}$ can be calculated easily by developing (8.38) and observing that
$P(\varphi_P)$ can be sampled and the integral substituted by a sum:
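A minimal sketch of the sampled calculation follows. It assumes that (8.38), which is not reproduced in this passage, is the usual centroid definition m exp(iφ_best) = Σ P(φ) exp(iφ) / Σ P(φ), with m the figure of merit; the function name is hypothetical.

```python
import numpy as np

def best_phase(phases, prob):
    """Centroid of a sampled phase distribution on the unit circle.

    phases : sampled phase angles in radians
    prob   : (unnormalized) probability at each sampled phase
    Returns (phi_best in radians, figure of merit m between 0 and 1).
    """
    weights = prob / prob.sum()                        # normalize the samples
    centroid = np.sum(weights * np.exp(1j * phases))   # the integral replaced by a sum
    return np.angle(centroid), np.abs(centroid)
```

For a sharp unimodal distribution m is close to 1; for two equal peaks 90° apart the best phase lies midway between them and m drops to about 0.71.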
$F_{H(\mathrm{calc})}$ can be calculated in the usual way from the known heavy-atom
parameters, but a correct estimate of $F_{H(\mathrm{obs})}$ is available only for centric
reflections (see p. 541), so (8.41) should in principle be employed to refine
that class of reflections and the refined parameters used to calculate all the
phases. Nevertheless, since (8.8) can be considered an approximate estimate
of $F_{H(\mathrm{obs})}$, expression (8.41) has sometimes been used to refine acentric
data.‡
Possibly the most widely used refinement scheme among
protein crystallographers is the so-called phase refinement, introduced for
the first time in the refinement of myoglobin.[34] The quantity minimized is:
‡ The − and + values are called lower ($F_{HLE}$) and upper ($F_{HUE}$) estimates, respectively. It has
been shown that the lower estimate would represent the correct value of $F_H$ for the large
majority of the reflections. It must be considered, in any case, that when anomalous differences
are low or the measurements are affected by large errors, coefficients (8.8) can be more safely
used.
§ A modification of the phase refinement scheme, called MVFC (minimum variance Fourier
coefficients), has been proposed by Sygusch. The expression minimized is again (8.42), but
the sine and cosine of the phase are considered as variables. The values of the trigonometric
functions estimated in the refinement are used directly for the electron density calculation,
without the need to postulate a probability distribution for the phase error.
$$\Delta F = m\,\bigl(F_{PH(\mathrm{obs})} - F_{PH(\mathrm{calc})}\bigr)\exp\bigl(i\varphi_{PH(\mathrm{calc})}\bigr). \qquad (8.49)$$
If the position of all the major heavy atom peaks has been determined
correctly, coefficients (8.49) allow the localization of low-occupancy sites.
Figure 8.13 shows the coefficients used in (8.48) and (8.49) compared with
the 'true' ones.[38]
map and so the new phases can be used to modify the original ones. The
procedure described here, suggested by Wang, is based on the same basic
principle.
We have seen on p. 543 that an electron density map calculated using SIR
phases will contain the correct density of the molecule, covered by an
enormous amount of noise. Information about the correct phases can be
obtained via the following automated procedure.
1. After calculating an electron density SIR (or single anomalous scattering,
SAS) map, errors are removed in direct space by evaluating for
every point an average value $\rho_j$ from the formula:
where
$$w_i = 1 - r_{ij}/R \quad \text{if } \rho_i > 0 \text{ and } R > r_{ij}. \qquad (8.52)$$
Since the resulting map has large homogeneous connected regions, the
molecular boundary can be revealed by a threshold density level appropri-
ately selected. Real-space filtering methods can also be used to extend the
resolution of the electron density map.
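The local averaging step can be sketched as below. The sketch assumes that the averaged value at point j is proportional to the weighted sum Σ w_i ρ_i over the neighbouring grid points i, with the weights of (8.52) and zero weight otherwise; the averaging formula itself is not reproduced in this passage and the proportionality constant is set to 1. An orthogonal grid with periodic boundaries is also assumed, and the function name is hypothetical.

```python
import numpy as np

def wang_smooth(rho, radius, spacing=1.0):
    """Weighted local average of the positive density on a periodic 3D grid.

    rho     : 3D array of electron density values
    radius  : averaging radius R, in the same units as `spacing`
    spacing : grid step (orthogonal axes assumed)
    """
    positive = np.where(rho > 0.0, rho, 0.0)   # points with rho_i <= 0 get weight 0
    n = int(np.floor(radius / spacing))
    smoothed = np.zeros(rho.shape, dtype=float)
    for dx in range(-n, n + 1):
        for dy in range(-n, n + 1):
            for dz in range(-n, n + 1):
                r = spacing * np.sqrt(dx * dx + dy * dy + dz * dz)
                if r >= radius:
                    continue                   # outside the sphere: weight 0
                w = 1.0 - r / radius           # linear weight, as in (8.52)
                # np.roll applies the periodic shift along the three axes
                smoothed += w * np.roll(positive, (dx, dy, dz), axis=(0, 1, 2))
    return smoothed
```

A molecular envelope would then follow from a threshold, for example `mask = smoothed > level`, where `level` is chosen to match the expected solvent fraction.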
space group. At other times we have reasons to believe that the conforma-
tion of a protein is quite similar to that of another that has been previously
solved, which is often the case for the same protein from different species.
In all the cases mentioned above, six variables, three rotational and three
translational, will approximately describe the transformation from one set of
coordinates to the other. In fact, if we call X the set of vectors† representing
the atoms of the original molecule and X' the transformed ones, the
transformation is simply described by:
$$X' = [C]X + t \qquad (8.53)$$
where [C] is a matrix that rotates the coordinates X into the new orientation
and t is a translation vector. Equation (8.53) is illustrated for a two-
dimensional situation in Fig. 8.14, where a 'molecule' formed by three point
atoms can be superimposed on an identical molecule in a different
orientation by the translation of a vector t, after a rotation by an angle α.
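The two-dimensional situation of Fig. 8.14 can be reproduced with a few lines of code, assuming that (8.53) has the usual form X' = [C]X + t. The function names and the coordinates are hypothetical.

```python
import numpy as np

def rotation_2d(alpha_deg):
    """Counter-clockwise rotation matrix for an angle given in degrees."""
    a = np.deg2rad(alpha_deg)
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def transform(coords, rot, t):
    """Apply X' = C X + t to an (n_atoms, dim) array of Cartesian coordinates."""
    # Row vectors are used, so C X becomes coords @ C^T
    return coords @ rot.T + t
```

Because [C] is orthogonal, all interatomic distances are preserved; only the three (in three dimensions, six) rigid-body parameters change.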
As mentioned in Appendix 5.B (p. 383), the technique of positioning a
molecule or a fragment of known structure in a crystal cell is called
molecular replacement. In principle it is possible to simultaneously search
for the six variables which minimize the difference between $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$,
but in practice this is a very hard task, even for the fastest computer.‡ The
solution of the problem was pioneered by Rossmann and Blow, who
explored the possibility of finding the orientation of similar subunits in a
crystal cell without any knowledge about the translation t, making use of the
Patterson function. After the correct orientation has been found, a search
for the translation vector can be carried out (a collection of papers on
molecular replacement is found in the book by Rossmann). Let us first
describe the methodology and the problems connected with the rotation
function.
where C is a matrix that rotates the coordinates of the model molecule with
respect to the reference system of the crystal, $P_{\mathrm{cryst}}(u)$ is the Patterson
[Figure caption fragment] ... (the unit cell is dashed). (b) Its corresponding
Patterson map. Circled points indicate self-vectors. Squared points are cross-vectors close
to the origin: some of the points of Fig. 8.15(b) accidentally superimpose on them during
rotation, giving rise to false maxima in the rotation function.
F(h) are the Fourier coefficients of the crystal and $F_{\mathrm{mol}}(p)$ the coefficients of
the Fourier transform of the isolated molecule, rotated by C (h, h' are used
here to indicate different terms of (hkl) values, p represents a point in
reciprocal space of a continuous transform). $G_{h,h'}$ is an interference function
whose magnitude depends on h, h', and the volume used in the integration
of (8.54).
The function R(C) can be evaluated in real space using (8.54) or in
reciprocal space, using (8.55). In both cases the computing time is strongly
dependent on the sampling chosen, which in turn is related to resolution. In
real space $P_{\mathrm{cryst}}$ and $P_{\mathrm{mol}}$ must be sampled finely enough for the resolution
selected (generally this means a value around 1/2 or 1/3 of the d spacing). The
volume of integration is a sphere whose radius depends on the size of the
isolated molecule, and this value determines the steps of the angular
variables used in evaluating R(C). In reciprocal space problems are quite
similar, since $F_{\mathrm{mol}}(p)$ is a continuous function, defined over all the
reciprocal space. The isolated molecule can be put in an artificial cell,
generally a cube whose edges can be about two to three times the size of the
molecule, and the continuous function evaluated with a sampling appropriate
to the resolution used. In practice, since (8.55) is dominated by large
Fourier coefficients, it is possible to limit the number of F(p) used.
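The real-space evaluation can be illustrated with a deliberately simplified two-dimensional sketch. It is not the procedure of (8.54) itself: the model Patterson is reduced to a set of point self-vectors, the crystal Patterson to a sum of Gaussians, and the integral to a sum of the crystal Patterson values at the rotated model vectors. All names, the Gaussian width and the coordinates are hypothetical.

```python
import numpy as np

def self_vectors(coords):
    """All interatomic vectors r_i - r_j (i != j) of one molecule."""
    diff = coords[:, None, :] - coords[None, :, :]
    mask = ~np.eye(len(coords), dtype=bool)      # drop the origin peak (i == j)
    return diff[mask]

def patterson_density(points, peaks, sigma=0.3):
    """Sum of Gaussians centred on the Patterson peaks, evaluated at `points`."""
    d2 = np.sum((points[:, None, :] - peaks[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=1)

def rotation_function(cryst_peaks, model_coords, angles_deg, r_max=5.0):
    """Overlap of the rotated model self-vectors with the crystal Patterson."""
    vecs = self_vectors(model_coords)
    vecs = vecs[np.linalg.norm(vecs, axis=1) < r_max]   # sphere of integration
    scores = []
    for ang in angles_deg:
        a = np.deg2rad(ang)
        rot = np.array([[np.cos(a), -np.sin(a)],
                        [np.sin(a),  np.cos(a)]])
        scores.append(patterson_density(vecs @ rot.T, cryst_peaks).sum())
    return np.array(scores)
```

Since a Patterson map is centrosymmetric, in two dimensions the score repeats after 180°. A real crystal Patterson also contains cross-vectors, which this toy example omits and which are the source of the false maxima mentioned in the text.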
A faster but more complex approach in evaluating the rotation function,
the so-called fast-rotation function, has been devised by Crowther. If we
express the Patterson function in terms of spherical polar coordinates,
$(r, \theta, \varphi)$, for a rotation C, corresponding to the three angles $\alpha_1, \alpha_2, \alpha_3$, the
rotation function can be written:
$$R(C) = \int P_{\mathrm{cryst}}(r,\theta,\varphi)\,\mathbf{R}P_{\mathrm{mol}}(r,\theta,\varphi)\,r^2\sin\theta\,dr\,d\theta\,d\varphi \qquad (8.56)$$
rotation angles $\theta_1$, $\theta_2$, and $\theta_3$, illustrated in Fig. 2.3(a): $\theta_1$ is the rotation
angle about the z axis and is positive when the rotation is clockwise looking
from the origin; $\theta_2$ is a rotation about the new x axis and $\theta_3$ a rotation about
the new z axis. The matrix C describing such a rotation is given in (2.32b).†
An appropriate rotation for the three angles will cover all the space (see p.
72), but if the Patterson map presents some rotational symmetry, the
rotation function will also have symmetry and a partial rotation will be
sufficient.
The symmetry of the rotation function is a combination of the symmetries
of the two Patterson functions, $P_{\mathrm{cryst}}$ and $P_{\mathrm{mol}}$. The Eulerian angles make it
easy to describe the symmetry of the rotation function.[49] Any triplet of
angles $\theta_1$, $\theta_2$, and $\theta_3$ can be considered as a point of a three-dimensional
system, whose unit cell has dimensions $2\pi$ in all directions: a rotation $\alpha$ is in
fact equivalent to $\alpha + 2\pi$. The resulting rotation space groups are some of
those described in the International tables for x-ray crystallography.‡
A disadvantage in using $\theta$ angles is that when $\theta_2$ is small, $\theta_1$ and $\theta_3$
represent a rotation about nearly the same axis, and maxima will resemble
strips rather than peaks. The distortion effect can be avoided if a
combination of Eulerian angles is used instead:
A different possibility is the use of spherical polar angles, $\varphi$, $\psi$, and $\chi$
(Fig. 2.3(b)). Angles $\varphi$ and $\psi$ define a spin axis, and a rotation of $\chi$ around
this axis is performed. Polar angles are very useful when a particular
direction has to be exploited or when a defined rotation has to take place, as
is sometimes the case for self-rotation (see p. 558).
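The degeneracy of the Eulerian angles at small $\theta_2$ is easy to verify numerically. The sketch below builds the rotation as a product of elementary rotations about z, the new x and the new z; it uses the common right-handed convention, which may differ in sign from the matrix (2.32b) of Chapter 2, so it should be read as an illustration and not as that matrix. The function names are hypothetical.

```python
import numpy as np

def rot_z(angle):
    """Right-handed rotation about z (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Right-handed rotation about x (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def euler_zxz(theta1, theta2, theta3):
    """Rotation about z, then about the new x, then about the new z."""
    # Successive rotations about the moving axes multiply on the right
    return rot_z(theta1) @ rot_x(theta2) @ rot_z(theta3)
```

With $\theta_2 = 0$ the result depends only on the sum $\theta_1 + \theta_3$: `euler_zxz(0.3, 0.0, 0.5)` and `euler_zxz(0.5, 0.0, 0.3)` are the same matrix, which is why peaks at small $\theta_2$ are smeared along lines of constant $\theta_1 + \theta_3$.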
Translation functions
Once the orientation of a known molecule in an unknown cell has been
found, the next step is the determination of its absolute position. Only when
one molecule is present in space group P1 is this problem non-existent, since
in this case the origin of the crystal cell can be chosen arbitrarily with
respect to all three axes. In all the other cases, when the reference
molecule, exactly oriented, is translated in the unknown cell, symmetry-
related molecules move accordingly and all the intermolecular vectors
change: only when all the molecules in the crystal cell are in the correct
position do the calculated Patterson cross-vectors superimpose on those of the
observed Patterson (intramolecular vectors are insensitive to translation).
Figure 8.17 illustrates the method. Molecule 1 is positioned in the crystal
cell of the unknown structure in the correct orientation: $s_1$ is the vector
defining its position with respect to the origin. Since we do not know yet the
correct position of the molecule in the cell, $s_1$ is arbitrarily chosen. Molecule
2 is generated by the twofold axis, and its position is defined by vector $s_2$.
The correct solution is shown in Fig. 8.17(a), where the correct origin of
molecule 1 is indicated by $s_1^0$. As vector $s_1$ varies, all the intermolecular
vectors among symmetry-related atoms will change: they will coincide with
t In Chapter 2 the rotation matrix C is called RE, for Eulerian angles and R,, for spherical
polar angles.
‡ The reader must be warned that the Eulerian rotation matrix is not Hermitian, that is
reversing the order of the Patterson functions does not produce the same rotation-equivalent
positions.
those of Fig. 8.17(a) only when $s_1 = s_1^0$, that is when the local origin is
correctly defined with respect to the symmetry element. The determination
of the translation vector is performed by comparing two Patterson maps,
just as before in the rotation case, except that now we are interested in
maximizing the superposition of a different class of peaks. The problem
described above is in practice quite difficult, since the function representing
the superposition is generally very noisy and with many small maxima.
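The principle can be shown with a toy two-dimensional search for a single twofold axis at the origin (x → −x), working in fractional coordinates. This is a direct real-space comparison of point cross-vectors and is only meant to illustrate the idea; it is not one of the published translation functions. All names, the tolerance and the coordinates are hypothetical.

```python
import numpy as np

def cross_vectors(coords, s):
    """Vectors from molecule 1 (placed at s) to its twofold mate, modulo the cell."""
    mol1 = coords + s
    mol2 = -mol1                                   # twofold axis through the origin
    diff = mol2[None, :, :] - mol1[:, None, :]
    return diff.reshape(-1, 2) % 1.0

def overlap(trial, observed, tol=0.02):
    """Number of trial vectors lying within `tol` of some observed vector."""
    d = trial[:, None, :] - observed[None, :, :]
    d -= np.round(d)                               # periodic wrap into [-0.5, 0.5]
    close = np.linalg.norm(d, axis=-1) < tol
    return int(np.sum(np.any(close, axis=1)))

def translation_search(coords, observed, n_grid=20):
    """Exhaustive search of the position s on an n_grid x n_grid lattice."""
    best_score, best_s = -1, None
    for ix in range(n_grid):
        for iy in range(n_grid):
            s = np.array([ix, iy]) / n_grid
            score = overlap(cross_vectors(coords, s), observed)
            if score > best_score:
                best_score, best_s = score, s
    return best_s, best_score
```

For this symmetry the cross-vectors depend on s only through 2s, so positions differing by half a cell edge are equivalent: the search recovers the position up to the permitted origin shifts of the space group.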
Several translation functions have been proposed, and some of them are
briefly summarized in Appendix 8.B. To illustrate the principles of
translational search, only the T function of Crowther and Blow will be
described here, following the treatment of Lattman. In the case illustrated
in Fig. 8.17, the set of cross-vectors of the calculated Patterson from
molecule 1 to molecule 2 can be written as:
$$T(t) = \sum_h I_{\mathrm{obs}}(h)\,F_1(h)\,F_1^*(hA)\exp(-2\pi i\,h\cdot t) \qquad (8.60)$$
where $F_1(h)$ is the Fourier transform of the model molecule 1 and $F_1(hA)$
where $P_1(Cu)$ is the $P_1(u)$ Patterson function rotated by matrix C, and U(u)
is a function which is 1 inside a sphere and 0 elsewhere. The function U is
necessary since both Patterson maps extend to all space, but we are
interested only in the superposition of self-vectors, confined to a region
around the origin of the cell. The sphere defined by U generally has a
diameter slightly larger than the maximum supposed molecular dimension.
The choice of polar rotation angles is quite common for self-rotation and
deserves a brief comment. Quite often the non-crystallographic symmetry is
represented by a rotation axis, in a direction different from the crystal-
lographic ones. In that event, the use of polar angles reduces the search for
the position of the axis from a three-dimensional problem to a two-
dimensional one: a twofold axis, for example, will correspond to a rotation
$\chi$ of 180° around the polar axis, and a two-dimensional map (calculated for
$\varphi$ from 0° to 360°, $\psi$ from 0° to 360°, and $\chi$ = 180°) will show the presence of
the axis. A clear example of that is presented by Evans et al.[55]
The definition of the translational component of the non-crystallographic
symmetry represents the last and possibly the most difficult step. Let us for
example assume that the direction of a twofold non-crystallographic axis is
[Block diagram residue. Recoverable box labels: Crystallization; Space group determination;
Isomorphous derivative preparation and data collection; replacement; Difference-Patterson map;
2nd, 3rd, etc. derivative measurements; Real space ...]
The eight different combinations of the six phases give rise to eight
conditional probability distributions for $\Phi_j$, whose formulation is not
discussed here. An important point, however, must be stressed: the
probability distributions do not depend only on the total number of atoms
present in the crystal cell (see Appendix 5.C), but also on the difference
between native and derivative data. The reliability factor for a triplet phase,
G, defined in (5.37) and (5.C.16), in the presence of isomorphous data must
be rewritten:[60]
where $N_P$ and $N_H$ are the number of protein atoms and heavy atoms in the
unit cell, respectively, and $\Delta_h$ represents the normalized difference for
reflection h. The second term of (8.63) may substantially increase the value
of the reliability factor G. A relevant application of (8.62) is that one single
derivative allows in principle the determination of the phase of the protein
structure factor.
A similar approach can also be used in the presence of anomalous dispersion
data, where in addition to the classical triplet invariants:
where the Fourier coefficients are usually weighted by their figure of merit.
The initial interpretation of the map is in general not easy, unless very good
phases at high resolution are available, which is seldom the case. Therefore,
the strategy generally adopted is to calculate first an electron density map at
low resolution, say 5-6 Å: these maps allow us to identify the contours of
the molecules in the crystal cell, and to distinguish between solvent regions
and protein. Possibly, some elements of secondary structure can be
identified: α-helices will appear as cylindrical rods with a diameter of about
4-6 Å. β-sheets are more difficult to distinguish, and in any case single
β-strands are not visible.
When the position of the molecule has been located in the unit cell, a
map at medium resolution, say 3.5-2.5 Å, is calculated and an
attempt to trace the polypeptide chain is made. Chain tracing at this
resolution is made easier, and sometimes only possible, if the amino acid
sequence is known. Mistakes are quite common in the interpretation at
medium resolution: the connections among secondary structure elements
are often difficult to recognize and amino acids can be positioned along the
chain shifted from their correct position by one or more residues.
Higher-resolution phases, say 2 Å or better, allow us to correct for this
kind of mistake and to locate more accurately the amino acid side chains.
Unfortunately, MIR phases very seldom extend to that resolution, and
high-resolution maps can be obtained using calculated or combined phases,
as will be discussed later.
† Despite the distinction between them described on p. 105, rigid-body and constrained
refinement are taken as synonyms in this chapter.
atoms is accurately known and there are reasons to believe that it will not
be significantly modified by the environment, the entire group can be
treated as a rigid entity. In the classical case of a phenyl ring, the eighteen
positional variables can be reduced to only three translational and three
rotational.
Bond length and valence angles in amino acids are very well known from
the structures of hundreds of small peptides. In a protein, they can be held
fixed to their theoretical values and only torsion angles around single bonds
allowed to vary. This approach was used by Diamond in real-space
refinement, but it can be used in reciprocal space as well. Taking into
account the fact that the peptide bond can be considered planar, only two
torsion angles, called $\varphi$ and $\psi$ (see Appendix 8.D), need to be varied for
the backbone chain of every amino acid: for a protein of n residues, the
parameters are reduced to about 2n for backbone plus the torsion angles of
side chains. An illustration of a possible choice of constrained parameters is
reported in Fig. 8.20 for a simple dipeptide. This solves the problem of
underdetermination, but the model becomes in some way too rigid and the
radius of convergence, that is the maximum displacement allowed for an
atom in a wrong position to be corrected, becomes quite small.
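Since torsion angles become the only conformational variables in this scheme, it may help to recall how a torsion angle is obtained from four atomic positions. The sketch below uses the common atan2 formulation; the function name is hypothetical, and the sign convention (zero for the cis arrangement, ±180° for trans) is the one usually adopted for peptides, assumed here to agree with Appendix 8.D.

```python
import numpy as np

def torsion_angle(p1, p2, p3, p4):
    """Torsion angle (degrees) about the bond p2-p3 for four Cartesian points."""
    b1, b2, b3 = p2 - p1, p3 - p2, p4 - p3
    n1 = np.cross(b1, b2)                  # normal to the plane p1, p2, p3
    n2 = np.cross(b2, b3)                  # normal to the plane p2, p3, p4
    x = np.dot(n1, n2)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    return np.degrees(np.arctan2(y, x))    # range (-180, 180]
```

For a chain of n residues, two such angles per residue describe the backbone, compared with three positional coordinates for every atom in free-atom refinement.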
Constrained least squares can be applied to very different extents: the
definition of rigid body can be applied to only some groups of atoms or to
the entire molecule. If, for example, an approximate solution of the
structure has been found using the molecular replacement technique, a first
refinement can be performed by considering the entire protein (or a
subunit) as a rigid group and the best position in the new crystal cell can be
searched for using only three translational and three rotational variables. In
that event, there is the supplementary advantage that, since the number of
variables is very limited, only low-resolution data need to be included in
refinement, greatly increasing the radius of convergence of the method.
Increasing the number of observations is another possible solution of the
underdetermination problem in macromolecular refinement (see p. 107).
Information from other sources can, in fact, be introduced and treated in a
way similar to that used for observations coming from X-ray diffraction. The
use of geometrical restraints has been proposed by Konnert and
Hendrickson, following a procedure devised by Waser for small
$d_{j(\mathrm{ideal})}$ is the ideal value for the specific distance we are considering, $d_{j(\mathrm{calc})}$ is
that calculated from our present model and $w_j$ is usually chosen as the
reciprocal of the standard deviation of the distribution expected for the
distances of type j. Notice that since $d_{j(\mathrm{calc})}$ is a function of the atomic
coordinates, (8.70) does not increase the number of variables. The total
number of equations like (8.70) is equal to the number of distances that are restrained:
bond lengths, the distances between one atom and the next-nearest-
neighbour (which is equivalent to restraining valence angles), and the
first-to-fourth atom distances, where the dihedral angle described by the
four atoms is in some way fixed (this is for example the case of the planar
peptide bonds). An example of the number of distances that can be
restrained for a simple dipeptide is illustrated in Fig. 8.21. Other possible
restraints in the Hendrickson and Konnert formulation are:
$S_3$ represents the sum of the deviations of the atoms i from the plane k,
which is defined by its unit normal $m_k$ and by the origin to plane distance
$d_k$; $r_i$ is the vector that defines a point i whose distance from plane k has to
be minimized.[75] $S_4$ restrains the volume of chiral atoms, defined for an
α-carbon by the product of the interatomic vectors of the three atoms
bound to it:
Since the sign of $V_{C\alpha}$ depends upon the handedness, $S_4$ restrains chiral
centres to their correct configuration (Fig. 8.22). $S_5$ is applied to all
non-bonded atoms (except those taken into account in $S_2$) and avoids too
close contacts. Other kinds of restraints can be considered, e.g. on isotropic
thermal parameters, occupancy, and non-crystallographic symmetry. It may
sometimes happen, particularly during the first stages of refinement, that
some part of the structure is poorly determined and the model 'blows up'.
In that event, a restraint on the excessive shifts can be applied:
(8.74)
where $r_k$ and $r_0$ are the atomic vectors of the target and the initial model
respectively. Using eqns (8.70)-(8.74), the number of observational functions
is now greatly increased from the original number, represented by eqn
(8.69). Equation (8.75) has effect only on the diagonal terms of the normal
matrix. The number of restrained parameters for the example described in
Table 8.3 is shown in Table 8.4.
Fig. 8.22. The chiral volume for an α-carbon atom. The central atom is chosen as the origin
of the coordinate system, and vectors $(r_N - r_{C\alpha})$, $(r_{C\beta} - r_{C\alpha})$ and $(r_C - r_{C\alpha})$ are denoted $s_N$,
$s_{C\beta}$, and $s_C$ respectively. The cross product $s_C \times s_{C\beta}$ is a vector perpendicular to the plane
C, Cα, Cβ. If it is on the same side as vector $s_N$, as in the figure, that is if angle θ is less than 90°,
the dot product between the two vectors is positive. If $s_C$ and $s_{C\beta}$ are reversed, that is if the
wrong configuration is chosen for the α-carbon, the vector $s_C \times s_{C\beta}$ points in the opposite
direction and the value of $V_{C\alpha}$ becomes negative.
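Two of the restraint terms described above can be sketched in a few lines. The distance term assumes that (8.70), which is not reproduced in this passage, is a sum of squared weighted residuals with the weight taken as the reciprocal of the standard deviation; the chiral volume follows the construction of Fig. 8.22, a scalar triple product with the α-carbon as origin. The function names and the numerical values are hypothetical.

```python
import numpy as np

def distance_restraint_sum(coords, pairs, ideal, sigma):
    """Sum over restrained pairs of [(d_ideal - d_calc) / sigma]^2 (assumed form)."""
    total = 0.0
    for (i, j), d_ideal, s in zip(pairs, ideal, sigma):
        d_calc = np.linalg.norm(coords[i] - coords[j])   # distance from the model
        total += ((d_ideal - d_calc) / s) ** 2
    return total

def chiral_volume(r_ca, r_n, r_c, r_cb):
    """V = s_N . (s_C x s_CB), with all vectors taken from the alpha-carbon."""
    s_n, s_c, s_cb = r_n - r_ca, r_c - r_ca, r_cb - r_ca
    return float(np.dot(s_n, np.cross(s_c, s_cb)))
```

Exchanging two of the substituents changes the sign of the chiral volume, which is what allows this term to distinguish the correct configuration from its mirror image, something that distance restraints alone cannot do.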
It must be remembered that in protein crystallography 'experimental'
phases are very often available. They can be included in least-squares as
additional information that imposes another restraint:
$$S_7 = \sum_i w_\varphi\,\bigl(\varphi_{i(\mathrm{obs})} - \varphi_{i(\mathrm{calc})}\bigr)^2. \qquad (8.76)$$
$\varphi_{\mathrm{obs}}$ is the estimate of phase angle from isomorphous and anomalous data
and $\varphi_{\mathrm{calc}}$ is the phase calculated from the model. Weights for (8.76) must
take into account the cyclic nature of phase angles.
Phase information is also used by Lunin and Urzhumtsev. They suggest
that only differences among crystallographic quantities be minimized, that is
structure factor amplitudes and phases. Since phase probability distribution
may be represented by (8.34), they assume an analogous probabilistic
distribution for the modulus of the structure factor F for reflection i of the
form:
$$P(F_i) = \exp\bigl[-(F_i - F_{i(\mathrm{obs})})^2 / 2\sigma^2\bigr], \qquad (8.77)$$
$$S = \sum_i \Bigl\{ (1/2\sigma^2)\,[F_i - F_{i(\mathrm{obs})}]^2 - [A\cos\varphi_i + B\sin\varphi_i + C\cos 2\varphi_i + D\sin 2\varphi_i] \Bigr\}. \qquad (8.78)$$
Using (8.78) the multimodality of the phase distribution is taken into
account.
Table 8.4. Number of restraints, following Hendrickson and Konnert, for a protein
molecule of 1469 atoms (excluding H) in the asymmetric unit. The example is relative to
the case of Table 8.3
  Number of distances:
    bond distances
    angle distances
    planar 1-4 distances
  Planes
  Chiral centres
  Torsion angles
  Possible contacts:
    contacts due to single torsion
    contacts due to multiple torsion
    possible H-bonds
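The remark on the cyclic nature of phase angles in connection with (8.76) can be made concrete: 350° and 10° differ by 20°, not by 340°. A minimal sketch of a phase residual that respects this periodicity is given below; the wrapping rule is a common choice and the function names are hypothetical, not taken from the cited works.

```python
import numpy as np

def phase_difference(phi_obs, phi_calc):
    """Signed difference between phases in degrees, wrapped into [-180, 180)."""
    delta = np.asarray(phi_obs, dtype=float) - np.asarray(phi_calc, dtype=float)
    return (delta + 180.0) % 360.0 - 180.0

def phase_restraint(phi_obs, phi_calc, weights):
    """Weighted sum of squared, wrapped phase differences (form assumed for 8.76)."""
    delta = phase_difference(phi_obs, phi_calc)
    return float(np.sum(np.asarray(weights) * delta ** 2))
```

A simple quadratic term of this kind still assumes a unimodal phase distribution; the expression (8.78) avoids that limitation by using the four phase coefficients directly.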
The four terms on the right-hand side describe bond, valence angle,
dihedral torsion angle, and non-bonded interactions, respectively. Kb is the
bond stretching constant and K, the bond angle bending force constant; k,
is the torsional barrier and m and δ the periodicity and the phase of the
barrier. A and B are the repulsive and the long-range non-bonded parameters.
The summation extends to the j bonds, the I valence angles, the 8 torsion
angles, and the n non-bonded interactions between all pairs of atoms
separated by at least three bonds. Despite the apparently very different
approach, the energy minimization and the geometrically restrained least-
squares are not too different in practice, since the final effect of (8.80) is to
impose restraints on the model.
Whatever method is used, special care is needed about the weights
applied to the different functions. We are in fact dealing with
non-homogeneous quantities, like structure factor amplitudes and interatomic
distances, so the weights of the relative observational functions must be
chosen in such a way that everything is put on the same scale: an
overestimate of geometric restraints will in fact produce a stereochemically
perfect model associated with a very high crystallographic R factor; on the
contrary, an underestimate of the same weight will result in a good R factor
with unreasonable bond lengths and angles.
where $t_i$ and $R_i$ are the translation vector and the rotation matrix of the
entire group i, $\psi_{i1}, \dots, \psi_{im}$ are the m torsion angles and $B_{i1}, \dots, B_{in}$ the
n temperature factors of group i.
Since the definition of rigid group is left to the user, the entire molecule
or a portion of it can be constrained, or possibly some subunits. The
restrained-constrained approach was originally devised for nucleic-acid
refinement, but it has been successfully used in refinement of protein
structures too.[82,83] A computationally quite efficient method of combining
sterical restraints and rigid-body refinement has also been recently
proposed.[84]
To take into account the effect of the medium and the approximations used
to calculate the total energy, dynamical effects can be better represented by
a set of Langevin equations:
$w_A$ is a factor which puts $E_{\mathrm{Xray}}$ on the same scale as the empirical potential
energy term and $N_A$ is given by $\sum w_h (F_{\mathrm{obs}})^2$, to ensure that $w_A$ is
independent of the resolution range used. The terms $E_P$ and $E_{NB}$ can be
included to take into account experimental information about MIR phases
and crystal packing respectively.
Molecular dynamics simulations can be performed at ambient
temperature,[87] or at higher temperature, as in the version called simulated
annealing.[86,88] The latter consists in starting the simulation at room
temperature, say 300 K, heating up the system (for example to
2000-4000 K), and subsequently cooling it down to the initial value. The
advantage of going to high temperatures, unreasonable from the biological
point of view, is that the model can come out of local minima, and the radius of
convergence of the method is increased with respect to classical least
squares.†
The result of molecular dynamics calculations is a family of conforma-
tions, but the constraints imposed by X-ray data restrict these conforma-
tions to those with the lowest crystallographic R factor.
[Fig. 8.23, block diagram. Recoverable box labels: Starting model (see Fig. 8.18); Some cycles of
least-squares refinement; Structure factor calculation; Calculation of new electron density maps
with new phases; End of refinement.]
Fig. 8.23. Block diagram summarizing the phases of structural refinement. The starting
model is obtained by one of the procedures schematized in Fig. 8.18. The number of
iteration cycles necessary to reach convergence can be quite variable, depending on the quality
of the initial model. If M.I.R. phases are not available (i.e. the structure has been solved by
molecular replacement techniques), only maps with calculated phases can be used. Some of the
coefficients used in electron density map calculation are described on p. 571.
Protein structure
The three-dimensional structure of a globular protein is the resultant of a
very large number of interactions. This makes it at the same time stable and
flexible: a modification in a specific site can, in fact, be assimilated by small
local adjustments, without altering the overall conformation. In other cases,
modifications taking place in a region of the macromolecule can be
transmitted and influence the conformation in zones far apart from where
the movement originated, a phenomenon called cooperative or allosteric
effect.† It must also be remembered that the medium may have a relevant
influence on the conformation of the macromolecule, which in vivo operates
in a non-homogeneous environment, for example the cell cytoplasm or
the cellular membrane. Flexibility can consequently be useful, or even
necessary, to the protein to fulfil its biological functions. On the contrary,
the picture of a molecular model resulting from the X-ray structure
determination is, at least apparently, a static one, the only 'dynamic'
information residing in the atomic temperature factors. Nevertheless, this
must not be considered in contrast with reality, since the X-ray structure
represents an average, over time and space, of single structures around a
conformational minimum, which is representative of the conformation of
the protein molecule in a solution of the same medium. On the other hand,
a description of a dynamic situation is quite difficult, particularly for a
complex molecule: the sort of static image that will emerge from the
following discussion must be considered a necessary simplification as well as
the starting point for the comprehension, at the molecular level, of the
behaviour of biological systems.
General aspects
Proteins are polymers of the 20 natural amino acids (some symbols and
conventions relative to amino acids are reported in Appendix 8.D). They
may contain other groups, often relevant for the conformation of the
molecule, like haems, prosthetic groups, carbohydrates, and so on, or they
can coordinate cations (see p. 582). Nevertheless, the building blocks of
proteins are the amino acids, and the protein fold‡ is the first and perhaps
most general aspect of its structure.
A polypeptide chain of a given sequence, in the appropriate conditions,
can refold spontaneously into its final three-dimensional structure:[93] all the
necessary structural information must be contained in the amino acid
sequence. It should therefore be possible, at least in principle, to predict
from it the structure of a protein. Unfortunately, despite several efforts, this
remains a hope for the future. Even so, some simple, schematic rules
relative to the forces that contribute to stabilize a structure can be
summarized: they are, however, only indications and exceptions are quite
common.
1. Groups potentially able to form hydrogen bonds will be positioned
accordingly: either at the surface of the molecule, where they can be
hydrogen bonded with the solvent, or, when in the interior, with a hydrogen-
bond donor close to an acceptor. Not all the possible
interactions of this kind will necessarily be realized, but most of them usually are.
2. Polar residues in a water-soluble protein will preferably be located on
the surface of the macromolecule and the hydrophobic ones in the
interior (the contrary applies for lipid-interacting proteins).
3. Charged groups will be close to the surface, where they can be
† Cooperativity and allostery are not exactly synonymous, but in this context no distinction is
made between them.
* The term 'fold' is not used here to indicate a process, but simply the way in which the main
chain is wrapped on itself.
574 ( Giuseppe Zanotti
† For historical reasons, the amino acid sequence is called the primary structure.
Fig. 8.24. (a) Stereo view of two α-helices of myoglobin (data from PDB),
connected by a short piece of chain (to simplify the representation, for every
side chain only the β-carbon is drawn). Helix 7 goes from residue 100 to 118
and helix 8 from 125 to 148. Residues from 116 to 119 participate also in a
type I β-turn. From 119 to 124 the structure is less regular. The two helices
are not exactly aligned, but their axes are slightly bent, a classical
situation of the packing of helices. (b) Scheme of the hydrogen bond pattern
for the most common types of helices: in the α-helix, 13 atoms are included in
the ideal ring from the carbonyl oxygen to the N-H. Only 10 atoms in the
$3_{10}$-helix.
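Table of regular polypeptide conformations (cf. Table 8.5; row labels only, torsion-angle values not recoverable): right-handed α-helix; left-handed α-helix; $3_{10}$-helix; parallel-chain pleated sheet; antiparallel-chain pleated sheet; polyglycine; polyproline II.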
One of the reasons that helices are quite common is the possibility to
accommodate in a very short length a large number of residues: all side
chains in fact point away from the helical axis, giving rise to quite an
efficient packing of bulky groups, one on the top of the other.C9'] The only
residue that cannot be accommodated in an α-helix is proline, due to the
steric hindrance of its side chain and to the fact that its nitrogen is lacking
the hydrogen necessary to form the hydrogen bond. Proline can be fitted
properly only in the first turn of a helix, but will interrupt it, or introduce a
bend, if present in any other place. A special kind of secondary structure,
called polyproline helix, has sometimes been observed.
Another important peculiarity of helical structures is represented by their
electrostatic properties: since all the dipoles of the peptides are pointing in
the same direction, the result is a net total dipole for the helix[98] that may
contribute, for example, in stabilizing the binding of charged species.
2. β-structures. A β-strand is a portion of polypeptide chain in a nearly
extended, pleated conformation. It is less regular than a helix, and the torsion
angle values reported in Table 8.5 must be considered only indicative, since
large variations are observed in practice. A β-strand is normally not stable
in itself, but it is stabilized by the contact with another β-strand, with which
it forms hydrogen bonds. According to the relative direction of the two
chains, β-strands can be parallel or antiparallel (Fig. 8.25): they differ in
the way the hydrogen bonds are formed, making the latter slightly more
stable[101] and more commonly observed in proteins. Since β-chains are
quite flexible, this kind of structure can present irregularities and defects: a
jump of one residue in a strand is for example called a β-bulge.
Several parallel or antiparallel β-strands form a so-called β-sheet. Since in
a β-strand amino acid side chains point alternately in opposite directions,
a β-sheet has the possibility of having a polar surface and an apolar one.
Finally, strands in a sheet are not exactly aligned, but are twisted with
respect to each other.[102]
3. Others. Short tight loops connecting two strands are called β-turns, or
hairpins, or reverse turns. Different kinds of classifications have been
attempted: one of them, based on the hydrogen bond patterns, is that of
Milner-White and Poet.[103] The more classical β-turns are called type I, II,
and III (and type I', II', and III' those with φ and ψ torsion angles
reversed). A schematic pattern of two turns is illustrated in Fig. 8.26.
In real structures, short pieces of polypeptide chain not falling in one of
the previous categories are normally found. In general their φ and ψ torsion
angles correspond to those of one of the structures discussed above, but
nevertheless that portion of the molecule cannot be classified as one of the
previous categories.
A last important feature strongly influencing the polypeptide chain
conformation is the presence of a disulphide bridge: the sulphur atoms of
two cysteines form a covalent bond, connecting two pieces of polypeptide
chain or even two different chains. This results in a strong stabilization of
the three-dimensional structure, at the expense of some local irregularities.
ARG PRO ASP PHE CYS LEU GLU PRO PRO TYR
THR GLY PRO CYS LYS ALA ARG ILE ILE ARG
TYR PHE TYR ASN ALA LYS ALA GLY LEU CYS
GLN THR PHE VAL TYR GLY GLY CYS ARG ALA
LYS ARG ASN ASN PHE LYS SER ALA GLU ASP
CYS MET ARG THR CYS GLY GLY ALA
TER
† Other possibilities are: a modification of the N or the C terminal part of the molecule, or a
proteolytic cleavage of a peptide bond.
Protein crystallography 1 583
Solvent structure
A large fraction of the solvent contained in a crystal cell can be considered
not to be relevant for the macromolecule; it is simply there to fill the
Protein classification
The secondary structural elements of a protein could be combined, at least
in principle, in a nearly infinite number of ways to produce the
three-dimensional structure. An examination of the structures solved so far has
on the contrary shown that these elements are put together in quite a
limited number of ways, giving rise to a small set of possible patterns. The
previous statement must nevertheless be regarded cautiously: it could be
partially ascribed to the limited size of the database available (as explained
in the introduction, in the Brookhaven Protein Data Bank only 353 sets of
protein coordinates are reported, most of them relative to parent
molecules); in addition, a bias could be introduced by the fact that only some
classes of molecules have so far been crystallized. In any case, regularities
and similarities in tertiary structures can be observed: a distribution of
proteins or domains into classes and subclasses has been attempted.
According to Richardson, protein molecules solved up to the present
belong to one of the following classes:
The same holds for class (2), the all-β domains: the strands in this group are
antiparallel to each other (only if more than seven strands are present can a
parallel chain be observed). Figure 8.33(c) shows a picture of prealbumin,
an example of a protein of the subclass called Greek key β-barrel, and Fig.
8.29 illustrates all the topologies found in that class; these are not all
the possible ones, but simply those found in the structures solved so far.
Appendices
Appendix 8.A Some formulae for isomorphous
replacement and anomalous dispersion
In this appendix, formulae (8.13) and (8.23), relative to isomorphous
replacement and anomalous scattering, will be derived in a general way.
Equation (8.13) can also be obtained from Fig. 8.1, using simple trigono-
metric considerations. It will be derived to illustrate the formalism used in
the rest of the appendix. Let us call:
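$$\mathbf{F}_{PH} = F_{PH}\exp(i\varphi_{PH}),$$
$$\mathbf{F}_{P} = F_{P}\exp(i\varphi_{P}),$$
$$\mathbf{F}_{H} = F_{H}\exp(i\varphi_{H}).$$
From (8.5):
$$\mathbf{F}_{PH}\mathbf{F}^*_{PH} = (\mathbf{F}_P + \mathbf{F}_H)(\mathbf{F}^*_P + \mathbf{F}^*_H). \tag{8.A.1}$$
By substituting in (8.A.1):
$$F_{PH}\exp(i\varphi_{PH})\,F_{PH}\exp(-i\varphi_{PH}) = \{F_P\exp(i\varphi_P) + F_H\exp(i\varphi_H)\}\{F_P\exp(-i\varphi_P) + F_H\exp(-i\varphi_H)\}, \tag{8.A.2}$$
$$F^2_{PH} = F^2_P + F^2_H + F_H F_P\exp[i(\varphi_P - \varphi_H)] + F_P F_H\exp[-i(\varphi_P - \varphi_H)], \tag{8.A.3}$$
$$F^2_{PH} = F^2_P + F^2_H + 2F_P F_H\cos(\varphi_P - \varphi_H). \tag{8.A.4}$$
Equation (8.A.4) is equivalent to (8.11). An analogous expression can be
derived for $\varphi_{PH}$. Since, from (8.5):
$$\mathbf{F}_P = \mathbf{F}_{PH} - \mathbf{F}_H$$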
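As a numerical check of the cosine-rule expression (8.A.4), the following minimal Python sketch builds the derivative structure factor as the vector sum of (8.5) and compares its squared modulus with the right-hand side of (8.A.4). The amplitudes and phases are arbitrary illustrative values, not data from the text, and the variable names are hypothetical.

```python
import cmath
import math

# Hypothetical amplitudes and phases (radians), chosen only for illustration.
F_P, phi_P = 100.0, math.radians(40.0)   # native protein
F_H, phi_H = 25.0, math.radians(110.0)   # heavy-atom contribution

# Complex structure factors built from modulus and phase.
FP = cmath.rect(F_P, phi_P)
FH = cmath.rect(F_H, phi_H)

# Vector sum of (8.5): derivative = protein + heavy atoms.
FPH = FP + FH

# Left-hand side of (8.A.4): squared modulus of the derivative.
lhs = abs(FPH) ** 2
# Right-hand side of (8.A.4): cosine rule on the amplitudes.
rhs = F_P**2 + F_H**2 + 2.0 * F_P * F_H * math.cos(phi_P - phi_H)

print(lhs, rhs)
```

The two printed numbers agree to rounding error for any choice of amplitudes and phases, since (8.A.4) is an identity.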
finding the correct origin with respect to the symmetry operator A. In the
case of a superposition of m maps translated by vectors $\mathbf{t}_i$, the sum function
defined in (5.B.5) (Buerger;[126] see Appendix 5.B) becomes:
$$S(\mathbf{u}) = \sum_{\mathbf{h}} I(\mathbf{h}) \sum_i \cos[2\pi\mathbf{h}\cdot(\mathbf{u} - \mathbf{t}_i)]. \tag{8.B.1}$$
is satisfied, and these peaks will have the same magnitude as those defining
the correct origin. However, the position of false peaks can be predicted, or
they can be removed by modifying the I(h) values.
2. The T function (Crowther and Blow). The function for two
molecules related by symmetry operator A is given by (8.60). It has been
shown that the T function is virtually equivalent to the Q function,
except that the origin is looked at in a different way.
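Modified T functions have also been defined. If more than one symmetry
operation is considered, $(A_1, \ldots, A_n)$, the T function becomes:
$$T(\mathbf{t}) = \sum_{\mathbf{h}} I_{\mathrm{obs}}(\mathbf{h}) \Big( \sum_i \sum_j F_M(\mathbf{h}A_i)\,F^*_M(\mathbf{h}A_j) \Big) \exp(-2\pi i\,\mathbf{h}\cdot\mathbf{t}) \tag{8.B.5}$$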
Vector t here does not have the same meaning as in Fig. 8.17: function TH
gives in fact the absolute value of the translation that has to be applied to
the model in order to position it correctly in the crystal cell. Equation
(8.B.7) is similar to the Crowther T function, but O(t) is a new term, taking
into account the interpenetration among molecules. TH will present a
maximum when cross-vectors of Patterson maps superimpose and molecules
do not overlap each other.
$$\mathbf{r}_0 = \mathbf{p}_0 = \mathbf{b} - A\mathbf{X}_0.$$
Let us also define:
$$\alpha_i = (\mathbf{r}_i^T\mathbf{r}_i)/(\mathbf{p}_i^T A\mathbf{p}_i)$$
$$\beta_i = -(\mathbf{r}_{i+1}^T A\mathbf{p}_i)/(\mathbf{p}_i^T A\mathbf{p}_i).$$
Using (8.C.4) and (8.C.5), a new value of X, called $\mathbf{X}_1$, will be given by:
$$\mathbf{X}_1 = \mathbf{X}_0 + \alpha_0\mathbf{p}_0. \tag{8.C.7}$$
Now a new residual and a new value for $\mathbf{p}_i$ can be calculated:
From $\mathbf{p}_1$ we can now evaluate a new vector X and a new residual vector:
By analogy with (8.C.9) a new value of p can be calculated, and so on. The
method can proceed until convergence, that is until the norm of the vector of
residuals $\mathbf{r}_i$ is less than a predefined amount. The advantages of the method are that
every estimate of X improves the previous one and that the elements of the
matrix A are not moved from their positions: the non-zero elements of A
can be stored contiguously, taking advantage of the sparsity of the
matrix.
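The iteration described in this appendix (residual r, search direction p, step length α, correction β) can be sketched in Python as follows. The update formulas used for the residual and for the new direction, r_{i+1} = r_i − α_i A p_i and p_{i+1} = r_{i+1} + β_i p_i, are the standard conjugate-gradient ones of Hestenes and Stiefel and are an assumption here; the matrix, right-hand side, tolerance, and function names are purely illustrative.

```python
def matvec(A, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]   # r0 = b - A x0
    p = list(r)                                         # p0 = r0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                      # converged
            break
        Ap = matvec(A, p)
        pAp = dot(p, Ap)
        alpha = dot(r, r) / pAp                         # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]   # improved estimate
        r = [ri - alpha * api for ri, api in zip(r, Ap)]  # new residual
        beta = -dot(r, Ap) / pAp                        # uses the new residual
        p = [ri + beta * pi for ri, pi in zip(r, p)]    # new search direction
    return x

# Small hypothetical normal-equations system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(A, b, [0.0, 0.0, 0.0])
print(x)
```

The matrix A is accessed only through `matvec`, so a routine that works on a compact storage of the non-zero elements could be substituted without changing the rest of the loop.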
Fig. 8.D.4. Symbols used in the Protein Data Bank for the atoms of the 20 side
chains of Fig. 8.D.3. The same nomenclature is also used, with some small
modifications, in most of the refinement programs. The three-letter and the
one-letter codes for the amino acids are used.
Side chains recoverable from the figure (branch atoms in parentheses):
Ala, A: CB
Val, V: CB(CG1, CG2)
Leu, L: CB-CG(CD1, CD2)
Met, M: CB-CG-SD-CE
Phe, F: CB-CG(CD1-CE1, CD2-CE2)-CZ
Ser, S: CB-OG
Thr, T: CB(OG1, CG2)
Tyr, Y: CB-CG(CD1-CE1, CD2-CE2)-CZ-OH
Asn, N: CB-CG(OD1, ND2)
Asp, D: CB-CG(OD1, OD2)
Glu, E: CB-CG-CD(OE1, OE2)
Arg, R: CB-CG-CD-NE-CZ(NH1, NH2)
Lys, K: CB-CG-CD-CE-NZ
References
6. Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Phillips, D.
C., and Shore, V. C. (1960). Nature, 185, 422-7; Cullis, A. F., Muirhead, H.,
Perutz, M., Rossmann, M. G., and North, A. C. T. (1961). Proceedings of the
Royal Society of London, A265, 161-87; Perutz, M. (1985). In Methods in
Enzymology, Vol. 114 (ed. H. W. Wyckoff, C. H. W. Hirs, and S. Timasheff),
pp. 3-46. Academic, Orlando.
7. Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F. Jr., Brice,
M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M.
(1977). Journal of Molecular Biology, 112, 535-42.
8. Pusey, M. L., Snyder, R. S., and Naumann, R. (1986). Journal of Biological
Chemistry, 261, 6524-9.
9. Zeppezauer, M. (1971). Methods in Enzymology, 22, 253-69.
10. McPherson, A. (1976). Journal of Biological Chemistry, 251, 6300-3.
11. McPherson, A. (1976). Methods in Biochemical Analysis, 23, 249-345.
12. Garavito, R. M., Jenkins, J., Jansonius, J. N., Karlsson, R., and Rosenbusch,
J. P. (1983). Journal of Molecular Biology, 164, 313-27.
13. Michel, H. (1983). Trends in Biological Science, 56-9.
14. Matthews, B. W. (1968). Journal of Molecular Biology, 33, 491-7.
15. Matthews, B. W., and Bernhard, S. A. (1973). Annual Reviews in Biophysics
and Bioengineering, 2, 257-317.
16. Matthews, B. W. (1974). Journal of Molecular Biology, 82, 513-26.
17. Green, D. W., Ingram, V. M., and Perutz, M. F. (1954). Proceedings of the
Royal Society of London, A225, 287-307; Rossmann, M. G. (1960). Acta
Crystallographica, 13, 221-6.
18. Ramachandran, G. N. and Srinivasan, R. (1970). Fourier methods in crystal-
lography (ed. M. J. Buerger) pp. 96-235. Wiley, New York.
19. Sim, G. A. (1959). Acta Crystallographica, 12, 813-15.
20. Sim, G. A. (1960). Acta Crystallographica, 13, 511-12.
21. Blow, D. M., and Rossmann, M. G. (1961). Acta Crystallographica, 14,
1195-202.
22. Harker, D. (1956). Acta Crystallographica, 9, 1-9.
23. Hendrickson, W. A. and Teeter, M. M. (1981). Nature, 290, 107-13.
24. Rossmann, M. G. (1961). Acta Crystallographica, 14, 383-8.
25. North, A. C. T. (1965). Acta Crystallographica, 18, 212-16.
26. Matthews, B. W. (1966). Acta Crystallographica, 20, 82-6.
27. Parthasarathy, S. (1970). Acta Crystallographica, A27, 45-7.
28. Bijvoet, M. (1954). Nature, 173, 888-91.
98. Hol, W. G. J., van Duijnen, P. T., and Berendsen, H. J. C. (1978). Nature,
273, 443-6.
99. Takano, T. (1977). Journal of Molecular Biology, 110, 569-84.
100. Borkakoti, N., Moss, D. S., and Palmer, R. A. (1982). Acta Crystallographica,
B38, 2210-17.
101. Richardson, J. S. (1977). Nature, 268, 495-500.
102. Chothia, C. (1973). Journal of Molecular Biology, 75, 295-302.
103. Milner-White, E. J. and Poet, R. (1987). Trends in Biological Science, 189-92.
104. Ramachandran, G. N., Ramakrishnan, C., and Sasisekharan, V. (1963).
Journal of Molecular Biology, 7, 95-9.
105. Bode, W. and Schwager, P. (1975). Journal of Molecular Biology, 98, 693-717.
106. Phillips, D. C. (1970). British biochemistry past and present (ed. T. W.
Goodwin), pp. 11-28. Academic, London.
107. Wlodawer, A., Deisenhofer, J., and Huber, R. (1987). Journal of Molecular
Biology, 193, 145-56.
108. Chothia, C., Levitt, M., and Richardson, D. C. (1977). Proceedings of the
National Academy of Sciences of the USA, 74, 4130-4.
109. Richardson, J. S. (1981). Advances in Protein Chemistry, 34, 167-339.
110. Babu, Y. S., Sack, J. S., Greenhough, T. J., Bugg, C. E., Means, A. R., and
Cook, W. J. (1985). Nature, 315, 37-40.
111. Dickerson, R. E. and Geis, I. (1983). Hemoglobin: structure, function,
evolution and pathology. Benjamin Cummings, Menlo Park.
112. Jurnak, F. A. and McPherson, A. (ed.) (1984). Biological macromolecules and
assemblies. Wiley, New York.
113. Krebs, E. G. and Beavo, J. A. (1979). Annual Reviews in Biochemistry, 48,
923-59.
114. Paulson, J. C. (1989). Trends in Biological Science, 14, 272-6.
115. Petsko, G. A. and Ringe, D. (1984). Annual Reviews in Biophysics and
Bioengineering, 13, 331-71.
116. Wlodawer, A., Borkakoti, N., Moss, D. S., and Howlin, B. (1986). Acta
Crystallographica, B42, 379-87.
117. Perutz, M. F. (1970). Nature, 228, 726-39.
118. Barford, D. J. and Johnson, L. N. (1989). Nature, 340, 609-16.
119. Adman, E. T., Sieker, L. C., and Jensen, L. H. (1976). Journal of Biological
Chemistry, 251, 3801-6.
120. Zhang, R.-g., Joachimiak, A., Lawson, C. L., Schevitz, R. W., Otwinowski,
Z., and Sigler, P. B. (1987). Nature, 327, 591-7.
121. Priestle, J. P. (1988). Journal of Applied Crystallography, 21, 572-6.
122. Sheriff, S., Hendrickson, W. A., and Smith, J. L. (1987). Journal of Molecular
Biology, 197, 273-96.
123. Blake, C. C. F., Geisow, M. J., Oatley, S. J., Rerat, B., and Rerat, C.
(1978). Journal of Molecular Biology, 121, 339-56.
124. Banner, D. W., Bloomer, A. C., Petsko, G. A., Phillips, D. C., and Wilson, I.
A. (1976). Biochemical and Biophysical Research Communications, 72, 146.
125. Tollin, P. (1966). Acta Crystallographica, 21, 613-14.
126. Buerger, M. J. (1967). Vector space. Wiley, New York.
127. Tollin, P. (1969). Acta Crystallographica, A25, 376-7.
128. Harada, Y., Lifchitz, A., Berthou, J., and Jolles, P. (1981). Acta
Crystallographica, A37, 398-406.
129. Agarwal, R. C. (1978). Acta Crystallographica, A34, 791-809.
130. Ten Eyck, L. F. (1973). Acta Crystallographica, A29, 183-91.
131. Hestenes, M. R. and Stiefel, E. (1952). Journal of Research of the National
Bureau of Standards, 49, 409-36.
132. Rollett, J. S. (ed.) (1965). Computing methods in crystallography. Pergamon,
Paris.
Physical properties of crystals
MICHELE CATTI
Introduction
The discipline named crystal physics has been associated traditionally with
two topics:
1. The study of phenomenological and macroscopic properties of crystals,
concerning mainly the effects of crystal anisotropy and symmetry on the
physical properties of matter.
2. The microscopic investigation of crystal defects, i.e. deviations from the
ideal periodicity of the crystal structure, and of their influence on the
macroscopic physical properties.
Both subjects are closely related to crystallographic results: in particular,
the first one relies upon consideration of symmetry, while the second one is
founded on the structural analysis of crystals.
Crystal physics is not only a discipline of great fundamental interest, but it
also has important technological applications. Crystals are used in industry
because of their useful physical (optical, electrical, magnetic, etc.) pro-
perties, which are studied by crystal physics. It suffices to mention
piezoelectric transducers, magnetic oxides for tapes for recorders and
computers, crystals for non-linear optics of laser technology, and many
other examples. Analogously, defects have large effects on the crystal
properties, such as mechanical strength, electrical and thermal conductivity,
etc., so as to play an outstanding role in several branches of technology. It is
also worthwhile to point out the close relations between crystal physics and
solid state physics. The latter discipline, in a restricted meaning, emphasizes
such topics as the quantum study of electronic and vibrational energy levels,
the interaction of radiation with matter, and related items, both in ideally
periodic and in defective solids. All these subjects try to explain on a
microscopic basis the phenomenological properties which are dealt with by
crystal physics; on the other hand, their treatment relies heavily upon the
geometrical features of solids studied by crystallography. Thus crystal
physics can be considered to be a bridging discipline between crystal-
lography and solid state physics, partially overlapping in some areas with
both of them.
600 1 Michele Catti
Tensorial quantities
Let us consider a functional relationship between two vectorial physical
quantities, X and Y. If they have a common direction, the functional
relation between their moduli is dealt with by writing a Taylor expansion
about X = 0:
In the Taylor expansion (9.1) a single coefficient ($Y_0$, $y'$, $y''$, etc.) is required
for each term. On the other hand, according to (9.2) the constant part of
the Y(X) dependence is expressed by 3 quantities $Y_{0,i}$, the linear part by 9
coefficients $y_{ih}$, the quadratic part by 27 coefficients $y_{ihk}$, and so on. In
general, for a term of nth order in the Taylor expansion $3^{n+1}$
coefficients are necessary to define the Y(X) dependence.
We now want to examine how the different coefficients appearing in the
expansion (9.2) are transformed, when the orthonormal reference basis E is
changed into another one E'. Let T be the transformation matrix (cf. p. 65)
relating the two Cartesian bases: E' = TE. The metric matrices of both bases
are equal to the identity matrix I, so that G' = G = I; by substituting into
$G' = TG\bar{T}$ (the bar denoting the transposed matrix), equivalent to (2.21), the result $T\bar{T} = I$ is obtained. This condition
is equivalent to
As for the nine coefficients $y_{ih}$ expressing the linear part of the Y(X)
dependence, they can be considered to be components of a 3 x 3 square
matrix y, so that the first-order term in (9.2) is written in matrix form as:
$$\mathbf{Y} = y\mathbf{X}. \tag{9.5}$$
where T is the matrix relating the new basis to the old one. It should be
noticed that the general rule (9.8) is equivalent to rule (9.4) applied n
times: thus we can also say that a tensor of rank n transforms in the same
way as a product of n coordinates (or vector components). It is important to
point out that a set of $3^n$ coefficients with n subscripts does not necessarily
obey the (9.8) transformation rule, and so need not represent a tensor of
rank n. Take for instance the $3^2$ components of the matrix T, or those of the
orthonormalization matrix M relating the Cartesian basis E to the lattice
basis A: in no way can an expression of type (9.6) be applied to them, so
that they are not components of second-rank tensors. Tensors of rank
higher than two could be represented by matrices with more than two
dimensions, but this is usually avoided for simplicity and the matrix
formalism is limited to tensors of first and second rank. Further, in tensor
calculus the Einstein convention is commonly adopted, according to which
summation symbols are omitted and understood; however, for the sake of
clarity this convention is not followed in this text. Because of the particular
importance of second-rank tensors in representing the physical properties of
crystals, their features are analysed in detail in Appendix 9.A.
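The statement that a second-rank tensor transforms like a product of two vector components can be checked numerically. The sketch below assumes the usual form of the rule for orthonormal bases, $y'_{ij} = \sum_{h,k} T_{ih}T_{jk}y_{hk}$, together with $X'_i = \sum_h T_{ih}X_h$ for vectors; the rotation angle, the tensor, and the vector are hypothetical values, and all function names are illustrative.

```python
import math

def rotation_about_z(angle_rad):
    """Orthogonal matrix T for a rotation of the Cartesian basis about z."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_rank1(T, v):
    """Vector rule: v'_i = sum_h T_ih v_h."""
    return [sum(T[i][h] * v[h] for h in range(3)) for i in range(3)]

def transform_rank2(T, y):
    """Second-rank rule: y'_ij = sum_hk T_ih T_jk y_hk."""
    return [[sum(T[i][h] * T[j][k] * y[h][k]
                 for h in range(3) for k in range(3))
             for j in range(3)]
            for i in range(3)]

def apply_rank2(y, x):
    """Linear relation Y = yX, i.e. Y_i = sum_h y_ih X_h."""
    return [sum(y[i][h] * x[h] for h in range(3)) for i in range(3)]

T = rotation_about_z(math.radians(30.0))
y = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 4.0]]      # hypothetical property tensor
X = [1.0, -2.0, 0.5]       # hypothetical "cause" vector

# Route 1: evaluate Y = yX in the old basis, then transform the result.
Y_route1 = transform_rank1(T, apply_rank2(y, X))
# Route 2: transform tensor and vector first, then evaluate Y' = y'X'.
Y_route2 = apply_rank2(transform_rank2(T, y), transform_rank1(T, X))

print(Y_route1)
print(Y_route2)
```

Both routes give the same components, which is what makes the physical relation independent of the choice of orthonormal basis.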
Tensors of rank n have been introduced as coefficients of terms of order
n - 1 in the Taylor expansion (9.2) expressing a functional dependence
between two vectors. However, tensors can represent a physical property
relating not only vectors, but also other tensors. Let us consider a linear
dependence only, for the sake of simplicity. Then the coefficients expressing
such a dependence between vector components and second-rank tensor
components are clearly themselves components of a third-rank tensor:
vector may have any direction lying in the mirror plane, while in group 1 its
orientation is unrestricted. On the other hand, a spontaneous moment of
magnetic dipole (ferromagnetism) is consistent only with crystal symmetry
represented by subgroups of the limit group ∞/m: 1, 2, 3, 4, 6, m, 2/m, $\bar{6}$,
4/m, 6/m, $\bar{1}$, $\bar{3}$, $\bar{4}$. The axial vector may have any orientation in point groups
1 and $\bar{1}$, it must be normal to the mirror plane in group m, and its direction
has to be parallel to the symmetry axis in all other cases.
A symmetric second-rank tensorial property is represented geometrically
by a second-order surface (quadric), which may be an ellipsoid or a
hyperboloid of one or two sheets (cf. Appendix 9.A). In general, such a
surface has mmm symmetry, but in special cases higher symmetries can be
displayed. When two tensor eigenvalues are equal but different from the
third one, then the quadric becomes a revolution ellipsoid or hyperboloid
with symmetry ∞/mm. Because of Neumann's principle, this must occur for
all crystal point groups belonging to the tetragonal, trigonal, and hexagonal
systems, which are subgroups of the limit group ∞/mm, but not of the group
mmm. Besides, the principal direction of the tensor corresponding to the
unique eigenvalue must be parallel to the symmetry axis 4, 3, or 6. When all
three eigenvalues are equal, the quadric is a sphere with symmetry ∞∞m;
such a situation is required for all cubic point groups (subgroups of ∞∞m),
as both mmm and ∞/mm symmetries are lower than those consistent with
the cubic system. For the orthorhombic, monoclinic, and triclinic systems,
the second-rank tensor can be represented by a general mmm quadric with
all three eigenvalues different, but with a constrained orientation in the
orthorhombic and monoclinic cases. In the former instance, the principal
directions are to be parallel to the crystallographic axes, while in the latter
case one of the principal directions must be parallel to the unique
monoclinic axis.
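The restrictions imposed by Neumann's principle can be illustrated by averaging an arbitrary symmetric tensor over all operations of a point group: the average is invariant under every operation and therefore shows the form that the tensor is allowed to take. The sketch below uses this group-averaging device (an illustrative technique, not a procedure given in the text) for point group 4 with the fourfold axis along z; the starting tensor is hypothetical.

```python
def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def average_over_group(y, operations):
    """Average R y R^T over all operations R of a finite point group."""
    n = len(operations)
    avg = [[0.0] * 3 for _ in range(3)]
    for R in operations:
        rotated = matmul(matmul(R, y), transpose(R))
        for i in range(3):
            for j in range(3):
                avg[i][j] += rotated[i][j] / n
    return avg

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C4 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]    # 90 degree rotation about z
C4_2 = matmul(C4, C4)
C4_3 = matmul(C4_2, C4)
POINT_GROUP_4 = [IDENTITY, C4, C4_2, C4_3]

# Hypothetical general (triclinic-like) symmetric tensor.
y_general = [[1.0, 2.0, 3.0],
             [2.0, 4.0, 5.0],
             [3.0, 5.0, 6.0]]

y_tetragonal = average_over_group(y_general, POINT_GROUP_4)
for row in y_tetragonal:
    print(row)
```

The averaged tensor is diagonal with its first two components equal, i.e. a quadric of revolution about the fourfold axis, as required for the tetragonal system.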
The forms shown by first- and symmetric second-rank tensors as imposed
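by symmetry are given in Table 9.1. Of course if the symmetric second-rank
Table 9.1 (c) Symmetric second-rank tensor y (forms imposed by symmetry; matrix layout not recoverable):
Triclinic: general symmetric matrix, six independent components
Monoclinic: four independent components, one principal direction parallel to the unique axis
Orthorhombic: diagonal, $y_{11}$, $y_{22}$, $y_{33}$
Tetragonal, trigonal, hexagonal: diagonal, $y_{11}$, $y_{11}$, $y_{33}$
Cubic: diagonal, $y_{11}$, $y_{11}$, $y_{11}$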
Physical properties of crystals 1 605
Pyroelectricity
The presence of the zeroth-order term $D_{0,i}$ in the expansion (9.13) implies
that, in an anisotropic crystal, a spontaneous polarization given by the
vector $\mathbf{P}_0 = \mathbf{D}_0$ is possible when E = 0. This adds to the normal induced
wave, drawn through the centre of the optical indicatrix (Fig. 9.4). The
central section of the ellipsoid normal to OP is an ellipse with semi-axes OM
and ON. It can be shown that not just one but two wave fronts propagate
through the crystal normally to OP with different velocities $v_1 = c/n_1$ and
$v_2 = c/n_2$ (c is the velocity of light in a vacuum); the corresponding
refractive indices $n_1$ and $n_2$ are equal to the lengths of semi-axes OM and
ON. The two waves are plane polarized with directions of vibration of the D
vectors parallel to OM and ON, respectively. The whole subject of optical
crystallography, which will not be dealt with here in detail, is based on the
study of properties of the optical indicatrix and other related surfaces. We
would just point out that the orientation and symmetry behaviour of the
optical indicatrix in the different crystal point groups follows all results
derived from Neumann's principle for symmetrical second-rank tensors on
p. 604 and in Table 9.1.
Fig. 9.4. The optical indicatrix.
Thus in the cubic system all three principal refractive indices are equal,
the optical indicatrix is a sphere, and the crystal is optically isotropic in all
directions. In the tetragonal, trigonal, and hexagonal systems two principal
indices are equal but different from the third one, whose corresponding
principal axis (called optic axis) is parallel to the high-symmetry direction.
The indicatrix is an ellipsoid of revolution about the optic axis. The two
waves propagating along that direction have equal refractive indices,
and therefore coincide; no double refraction is observed, and the optic axis
is a direction of optical isotropy. Crystals belonging to these symmetry
systems are called uniaxial. For the orthorhombic, monoclinic, and triclinic
systems, the indicatrix is a general ellipsoid with two circular central
sections; the corresponding normal directions, lying on the plane of the
longest and shortest semi-axes, are the optic axes and the crystals are said to
be biaxial. The two optic axes are directions of optical isotropy, just as in
the uniaxial case.
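The classification into isotropic, uniaxial, and biaxial crystals, and the position of the optic axes of a biaxial indicatrix, can be sketched as follows. The formula for the optic-axis angle follows from requiring a circular central section of the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ with $a < b < c$; it is not given in the text, and the refractive indices, tolerance, and function names below are hypothetical.

```python
import math

def classify_indicatrix(n1, n2, n3, tol=1e-6):
    """Optical class from the three principal refractive indices."""
    a, b, c = sorted((n1, n2, n3))
    if c - a < tol:
        return "isotropic"      # sphere (cubic system)
    if b - a < tol or c - b < tol:
        return "uniaxial"       # ellipsoid of revolution
    return "biaxial"            # general ellipsoid

def optic_axis_angle(n1, n2, n3):
    """Angle (radians) between an optic axis and the longest semi-axis
    of a biaxial indicatrix with semi-axes a < b < c."""
    a, b, c = sorted((n1, n2, n3))
    return math.atan((c / a) * math.sqrt((b * b - a * a) / (c * c - b * b)))

def central_radius(direction, semi_axes):
    """Length of the indicatrix radius vector along a unit direction."""
    return 1.0 / math.sqrt(sum((d / s) ** 2 for d, s in zip(direction, semi_axes)))

indices = (1.50, 1.60, 1.70)          # hypothetical biaxial crystal
V = optic_axis_angle(*indices)
# The optic axis lies in the plane of the shortest (x) and longest (z) semi-axes.
optic_axis = (math.sin(V), 0.0, math.cos(V))
# Direction of the same plane that is perpendicular to the optic axis.
in_section = (math.cos(V), 0.0, -math.sin(V))
r = central_radius(in_section, sorted(indices))

print(classify_indicatrix(*indices), math.degrees(V), r)
```

The radius `r` of the central section normal to the optic axis equals the intermediate index, which is also the radius along y: the section is therefore circular, and no double refraction occurs along the optic axis.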
In some crystalline materials the optical behaviour depends significantly
on terms of order higher than one (typically, second-order terms) in the
D(E) expansion (9.13). This subject is called non-linear optics, and is
associated with important technological applications. Optical non-linearity is
also known as the electro-optical effect, and the electro-optical coefficients
$e_{ihk}$ are evidently the components of a third-rank tensor. The (9.13)
expansion truncated to second order can be rewritten as:
Crystal strain
In a general way, the state of strain of the crystal is defined by the vector
field u = x' - x, which gives for every point the change between the equilibrium
x and strained x' position vectors. As usual, the dependence u(x) of the
vector field can be expanded in a power series of type (9.2) (we assume that
the displacement of the point at the origin is vanishing):
Taking into account that, for small strains, $e_{11} \ll 1$, $e_{22} \ll 1$, $\tan\varphi_1 \simeq \varphi_1$,
$\tan\varphi_2 \simeq \varphi_2$, we obtain: $\varphi_1 = e_{21}$, $\varphi_2 = e_{12}$.
If $e_{11} = e_{22} = 0$ and $e_{12} = -e_{21}$, then the strain reduces to an anti-clockwise
rigid rotation by the angle $\varphi = e_{21}$; in this case e is antisymmetrical
($e_{ij} = -e_{ji}$). Generally, the strain tensor e can always be written as the sum
of a symmetrical component $\varepsilon = \tfrac{1}{2}(e + \bar{e})$ plus an antisymmetrical component
$\omega = \tfrac{1}{2}(e - \bar{e})$, where the bar denotes the transposed matrix:
$\varepsilon_{ij} + \omega_{ij} = \tfrac{1}{2}(e_{ij} + e_{ji}) + \tfrac{1}{2}(e_{ij} - e_{ji}) = e_{ij}$. As ω is antisymmetrical, it
represents a rigid rotation; therefore $\varepsilon = e - \omega$ corresponds to the physically
relevant part of the strain. A geometrical picture of the decomposition
$e = \varepsilon + \omega$ for the planar deformation of Fig. 9.5 is shown in Fig. 9.6.
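The decomposition of the strain tensor into its symmetrical and antisymmetrical parts is easily sketched in Python; the components of e below are hypothetical small values, and the function names are illustrative.

```python
def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def half_combination(a, b, sign):
    """Return (a + sign * b) / 2, element by element."""
    return [[0.5 * (a[i][j] + sign * b[i][j]) for j in range(3)]
            for i in range(3)]

def decompose(e):
    """Split e into a symmetric strain and an antisymmetric rotation part."""
    e_t = transpose(e)
    eps = half_combination(e, e_t, +1.0)     # physically relevant strain
    omega = half_combination(e, e_t, -1.0)   # rigid rotation
    return eps, omega

# Hypothetical displacement-gradient components (dimensionless).
e = [[0.002, 0.004, 0.000],
     [-0.001, 0.001, 0.000],
     [0.000, 0.000, 0.003]]

eps, omega = decompose(e)
print(eps)
print(omega)
```

For a pure rotation (zero diagonal, $e_{12} = -e_{21}$) the symmetric part vanishes identically, so that no physical deformation is left.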
The Lagrangian strain tensor ε is called infinitesimal, because it is suitable
to represent small deformations only; the use of the same symbol as that for
dielectric permittivity may sometimes be confusing. Another tensor which is
more convenient to express larger deformations is the finite Lagrangian
strain tensor ($\bar{e}$ being the transpose of e):
$$\eta = \tfrac{1}{2}(e + \bar{e} + \bar{e}e). \tag{9.17}$$
For small strains, the difference $\eta - \varepsilon = \tfrac{1}{2}\bar{e}e$ is negligible, so that the use of the η
or ε tensors is equivalent. Let us look now for a relation between the
previous macroscopic representation of strain and the lattice microscopic
nature of the crystal. M is the orthonormalization matrix of the undeformed
lattice basis (E = MA), and M' that of the deformed basis (E = M'A'); as
fractional coordinates of points are not changed by a homogeneous
deformation, then $\bar{M}'x' = \bar{M}x$. From (9.16), x' - x = ex; by substituting
$x' = \bar{M}'^{-1}\bar{M}x$, one obtains:
Substitution into $\varepsilon = \tfrac{1}{2}(e + \bar{e})$ and into (9.17) yields the required expressions
for the ε and η tensors:
Its eigenvalues $\alpha_i$ are always positive, so that the representation quadric of
equation
is an ellipsoid, whose radius vector has a length equal to the inverse square
root of the coefficient of thermal expansion along its direction. The
corresponding volume thermal expansion is
$$\frac{1}{V}\frac{dV}{dT} = \sum_{i=1}^{3}\alpha_{ii} = \operatorname{tr}\alpha.$$
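The linear thermal expansion along a direction, the corresponding radius of the representation ellipsoid, and the volume expansion as the trace of the tensor can be computed as in the sketch below. The expansion along a unit vector l is taken as $\sum_{i,j}\alpha_{ij}l_il_j$, the usual expression for the magnitude of a second-rank tensor property in a given direction; the tensor components are hypothetical values in K$^{-1}$ and the function names are illustrative.

```python
import math

def expansion_along(alpha, direction):
    """Linear thermal expansion coefficient along a (not necessarily unit) direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    l = [d / norm for d in direction]
    return sum(alpha[i][j] * l[i] * l[j] for i in range(3) for j in range(3))

def quadric_radius(alpha, direction):
    """Radius vector of the representation ellipsoid along a direction."""
    return 1.0 / math.sqrt(expansion_along(alpha, direction))

def volume_expansion(alpha):
    """Volume thermal expansion coefficient: the trace of the tensor."""
    return sum(alpha[i][i] for i in range(3))

# Hypothetical symmetric thermal expansion tensor (K^-1).
alpha = [[1.0e-5, 0.0, 0.2e-5],
         [0.0, 1.5e-5, 0.0],
         [0.2e-5, 0.0, 2.0e-5]]

print(expansion_along(alpha, [1.0, 0.0, 0.0]))
print(quadric_radius(alpha, [0.0, 0.0, 1.0]))
print(volume_expansion(alpha))
```

Since the trace is invariant under a change of orthonormal basis, the volume expansion does not depend on the reference system chosen for the tensor.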
Inner deformation
The strain tensor η was related by (9.20) to a change of metric tensor
G' - G, i.e. of unit-cell geometry. This corresponds to a purely homogeneous
deformation of the crystal structure, leaving the atomic fractional
coordinates constant (lattice strain). If, on the other hand, such coordinates
vary, then in addition to the lattice deformation, an inner strain arises,
which is just defined by the coordinate changes $\Delta x_i$ for all atoms in the
asymmetric unit. The inner strain generally occurs as a relaxation of atomic
positions to minimize the energy of the deformed lattice, so it is a function
of the lattice strain. The overall deformation of the atomic arrangement is
the sum of these two effects. Changes of interatomic distances due to the
total strain can be decomposed into the separate effects, according to
where $d_{ij}$ and $d'_{ij}$ are the distances between atoms i and j before and after
deformation, respectively. The first term is due to lattice strain only, while
the second one (in square brackets) is mainly ascribed to inner strain
contribution.
Let us consider an application of these concepts to the strain induced by
thermal expansion in mica muscovite. Two crystal-structure refinements at
25 and 700 °C allowed us to determine the changes of unit-cell constants and
of atomic fractional coordinates for the corresponding temperature range
(Table 9.2). By calculating the metric tensor change G' - G and applying
equation (9.20), the strain tensor η can be obtained; dividing it by the
temperature difference ΔT yields the tensor of thermal expansion α, which
represents the lattice component of thermal strain: $\alpha_{11} = 1.12$, $\alpha_{22} = 1.18$,
$\alpha_{33} = 1.89 \times 10^{-5}$ K$^{-1}$. The inner component of thermal strain is given by
the changes of atomic fractional coordinates divided by ΔT. It appears that
the largest contribution to inner deformation is related to basal oxygen
atoms O(1), O(2), O(3), while the effect is very small in all other cases. This
corresponds to a substantial decrease of the ditrigonal distortion of the (001)
layer of (Si, Al)O$_4$ tetrahedra sharing corners, which approaches a more
symmetrical hexagonal configuration (cf. Fig. 6.41). The contributions of
lattice and internal deformations to changes of interatomic distances can be
analysed in the case of K-O bond lengths; in Table 9.3 the two components
Stress tensor
The other field tensor which is necessary to define the elastic properties of
crystals is the stress tensor. The mechanical forces applied externally to the
crystal are 'contact forces' or 'surface forces' acting on the external surface
of every volume element, and not 'body forces' acting on every point of the
body (like the force of gravity). Therefore the force field is represented by
the vector p (force per unit area) as a function of the unit vector n normal to
the surface element dS. For a homogeneous stress, p depends not on the
absolute position of dS but only on its orientation, so that p = p(n);
furthermore, the dependence is simply linear, being expressed by a tensorial
relationship:
Elasticity tensor
A solid is said to be elastic when the dependence of strain on the applied
stress is linear (Hooke's law). If the solid is anisotropic, as is the case for
crystals, then Hooke's law is expressed in tensorial form. The coefficients of
the linear dependence must be characterized by four subscripts, as they
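relate the $\eta_{ij}$ and $\tau_{hk}$ components of two second-rank tensors (cf. eqn
(9.10)):
$$\tau_{ij} = \sum_{h,k=1}^{3} c_{ijhk}\,\eta_{hk} \qquad (9.23)$$
and, inversely:
$$\eta_{ij} = \sum_{h,k=1}^{3} s_{ijhk}\,\tau_{hk} \qquad (9.24)$$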
The quantity $c_{ijhk}$ has the physical meaning of the stress component $\tau_{ij}$ which
must be applied to the crystal so as to achieve a deformation state
characterized by an $\eta_{hk}$ component of unit value. Similarly, $s_{ijhk}$ is the $\eta_{ij}$
strain component resulting from application of a unit stress $\tau_{hk}$ to the
crystal. As the $\eta$ components are dimensionless, and those of $\tau$ have the
dimension of pressure, it follows that the quantities $c_{ijhk}$ and $s_{ijhk}$ have
the dimensions of pressure and of pressure$^{-1}$, respectively.
The coefficients $c_{ijhk}$ and $s_{ijhk}$ obey the transformation rule (9.8), and then
are components of the fourth-rank tensors c and s, respectively; c is called
tensor of elastic constants or stiffness coefficients, while s is the tensor of
elastic moduli or compliance coefficients. The two tensors are related by the
generalized inversion relationship:
Unlike $\eta$ and $\tau$, which are field tensors, c and s are matter tensors, that is
they characterize an intrinsic property of the crystalline medium and are
independent of the applied force field. On the basis of the symmetry
relations $\eta_{ij} = \eta_{ji}$ and $\tau_{ij} = \tau_{ji}$ for the strain and stress tensors, and of the
condition that the total moment of force applied to the crystal be zero
(otherwise the whole crystal would rotate rigidly), a number of symmetry
relations can be derived for the subscripts of $c_{ijhk}$ and $s_{ijhk}$ components.
These are:
and similarly for $s_{ijhk}$. Thus of the 81 components of each tensor only 21 are
actually independent.
The mechanical work per unit volume of an infinitesimal elastic deforma-
tion of the crystal is given by
Physical properties of crystals 1 615
and integrating with respect to $d\eta_{ij}$ the work per unit volume necessary to
produce the finite strain $\eta$ is derived:
by substitution of (9.24) for $\eta_{ij}$, and remembering that $\tau_{hk} = -p\,\delta_{hk}$ for
isotropic pressure, one obtains:
The expression for the reciprocal of volume compressibility, the elastic bulk
modulus K, is obtained accordingly:
$$K = 1/\beta. \qquad (9.30)$$
A simplified convention, based on a condensation of subscripts, is often
used to express components of stress, strain, and elasticity tensors (Voigt's
notation). The symmetrical pair of indices ij (i, j = 1, 2, 3) is substituted by a
single subscript p (p = 1, 2, 3, 4, 5, 6), according to the rule: 11 → 1,
22 → 2, 33 → 3, 23 → 4, 13 → 5, 12 → 6. Then $\tau_p = \tau_{ij}$ and $c_{pq} = c_{ijhk}$,
with the above correspondence law for subscripts implied. The tensor $\tau$ is
now represented by a 6 × 1 linear matrix, instead of a 3 × 3 symmetrical
square matrix:
complicated for the $\eta$ and s tensors. In fact, in order to have relations (9.23)
and (9.24) transformed into the corresponding ones
holds for the two 6 x 6 square matrices representing the elasticity tensors.
Thus the relations (9.31) and (9.32) can be rewritten in matrix form as:
$$\tau = c\,\eta, \qquad (9.34)$$
By use of Voigt's notation the expressions (9.27), (9.29), and (9.30) for
the energy of elastic deformation, volume compressibility, and elasticity
bulk modulus become, respectively:
invariance, such components can only have zero values. Thus in the
monoclinic system, where a twofold axis parallel to e2 is present, the elastic
constants $c_{14}$, $c_{24}$, $c_{34}$, $c_{45}$, $c_{16}$, $c_{26}$, $c_{36}$, $c_{56}$ are always equal to zero. It is
easy to show that no further constraints are imposed on the $c_{pq}$ components
by the other symmetry operations in monoclinic point groups. In fact, by
the same procedure as above the c tensor can be proved to be invariant to
action of the inversion centre. Thus elasticity is a centrosymmetrical
property, and in order to derive the symmetry constraints on the cpq
components only the generators of the point group not containing the
inversion centre need be taken into account. For instance, a mirror plane
normal to e2 is the product of a twofold axis parallel to e2 and of the
inversion centre, so that it is completely equivalent to the twofold axis as far
as the elastic behaviour is concerned.
In the orthorhombic system, two twofold axes parallel to e2 and e3 can be
considered as generators for the 222 and mmm point groups (excluding the
inversion centre). Thus in this case the symmetry constraints on the elastic
constants are the sum of those already found for the monoclinic system, plus
those due to the twofold axis parallel to e3. Such a rotation transforms the
indices of vectors $e_i$ according to: 1 → -1, 2 → -2, 3 → 3, and the Voigt
condensed subscripts according to: 1 → 1, 2 → 2, 3 → 3, 4 → -4, 5 →
-5, 6 → 6. Following the same reasoning as before, this implies that all $c_{pq}$
components with only one index equal to either 4 or 5 must have zero value.
By combining that result with the conditions found previously for invariance
to a rotation parallel to $e_2$, one obtains that the elastic constants which may
differ from zero are only $c_{11}$, $c_{22}$, $c_{33}$, $c_{12}$, $c_{13}$, $c_{23}$, $c_{44}$, $c_{55}$, $c_{66}$. This can be
easily shown to be true for point group mm2 as well, and then holds for the
whole orthorhombic system.
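The sign-counting argument used above for the monoclinic and orthorhombic systems can be checked mechanically. The following is a minimal sketch, not taken from the text: the function names are illustrative, and it assumes only the Voigt contraction rule and the axis sign changes quoted in the preceding paragraphs (a twofold axis along e2 reverses e1 and e3; one along e3 reverses e1 and e2). A constant $c_{pq}$ must vanish when the product of the signs acquired by p and q is -1.

```python
from itertools import combinations_with_replacement

# Voigt contraction used in the text: 11->1, 22->2, 33->3, 23->4, 13->5, 12->6
VOIGT_PAIRS = {1: (1, 1), 2: (2, 2), 3: (3, 3), 4: (2, 3), 5: (1, 3), 6: (1, 2)}


def voigt_signs(axis_signs):
    """Sign acquired by each Voigt index when e_i -> axis_signs[i] * e_i."""
    return {p: axis_signs[i] * axis_signs[j] for p, (i, j) in VOIGT_PAIRS.items()}


def vanishing_constants(axis_signs):
    """Pairs (p, q), p <= q, whose c_pq changes sign and must therefore be zero."""
    s = voigt_signs(axis_signs)
    return {(p, q)
            for p, q in combinations_with_replacement(range(1, 7), 2)
            if s[p] * s[q] == -1}


# Twofold axes parallel to e2 and e3 (sign of each basis vector after rotation)
two_fold_e2 = {1: -1, 2: 1, 3: -1}
two_fold_e3 = {1: -1, 2: -1, 3: 1}

monoclinic_zero = vanishing_constants(two_fold_e2)
orthorhombic_zero = monoclinic_zero | vanishing_constants(two_fold_e3)

all_pairs = set(combinations_with_replacement(range(1, 7), 2))
orthorhombic_nonzero = sorted(all_pairs - orthorhombic_zero)

print("monoclinic, zero:", sorted(monoclinic_zero))
print("orthorhombic, non-zero:", orthorhombic_nonzero)
```

Of the 21 independent pairs, 8 vanish for the twofold axis along e2 and 9 survive once the axis along e3 is added, in agreement with the lists given in the text.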
The symmetry restrictions on the components of the elasticity tensor c
(Voigt's notation) are summarized in Table 9.4 for all crystal point groups.
Then by means of (9.20), and using the Voigt notation, the following
expressions are obtained for the strain components related to changes of
lattice constants:
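$$\eta_1 = \tfrac{1}{2}[(a'/a)^2 - 1], \quad \eta_2 = \tfrac{1}{2}[(b'/b)^2 - 1], \quad \eta_3 = \tfrac{1}{2}[(c'/c)^2 - 1],$$
$$\eta_4 = \tfrac{1}{2}(b'/b)(c'/c)\cos\alpha', \quad \eta_5 = \tfrac{1}{2}(a'/a)(c'/c)\cos\beta', \quad \eta_6 = \tfrac{1}{2}(a'/a)(b'/b)\cos\gamma'.$$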
Table 9.5. Independent values of elastic stiffnesses $c_{pq}$ (GPa) and compliances $s_{pq}$
(TPa$^{-1}$) of some crystals (Landolt-Börnstein Tables, 1983)
We want to apply, for instance, a uniaxial compression of 1 GPa
(= $10^9$ N m$^{-2}$) to a crystal of CaSO$_4$ along the x crystallographic direction,
and to determine the corresponding deformation. The stress tensor takes
the form $\tau$ = [-1 0 0 0 0 0] GPa, and, by using (9.32) or (9.35), the strain
components are obtained: $\eta$ = [-0.011 0.00076 0.00128 0 0 0]. Taking
into account the above relations between $\eta$ components and lattice
constants, the unit-cell edges of anhydrite undergo changes of -1.10 per
cent (a), +0.08 per cent (b), +0.13 per cent (c). The mechanical work per
unit volume required to perform this deformation can be calculated by
(9.36), yielding a value of 5.5 MJ m$^{-3}$. Let us consider now an isotropic
compression of 1 GPa of the same crystal, corresponding to the stress
$\tau$ = [-1 -1 -1 0 0 0] GPa. Again, by (9.35) we obtain the resulting
deformation: $\eta$ = [-0.00896 -0.00344 -0.00675 0 0 0], corresponding to
relative changes of the a, b, c cell edges of -0.90 per cent, -0.34 per
cent, and -0.68 per cent, respectively. The energy per unit volume amounts
to 9.6 MJ m$^{-3}$, and the relative volume decrease is -1.9 per cent
($\sum_{p=1}^{3} \eta_p = -0.0192$). This value could also have been computed by deriving
the volume compressibility $\beta$ = 0.01915 GPa$^{-1}$ by (9.37), which gives for a
pressure of 1 GPa the same volume contraction as that calculated
previously.
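The figures quoted in this example can be cross-checked from the stress and strain vectors alone. The sketch below is illustrative only: it assumes that (9.36) has the Voigt form W = (1/2) Σ τ_p η_p (half the product of stress and strain), and it inverts the relation η_1 = (1/2)[(a'/a)² - 1] given earlier to recover the relative cell-edge changes. The function names are not from the text.

```python
import math

GPA_TO_MJ_PER_M3 = 1000.0  # 1 GPa = 1e9 J m^-3 = 1000 MJ m^-3


def work_density(stress_gpa, strain):
    """Elastic work per unit volume (MJ m^-3), assuming W = 1/2 * sum(tau_p * eta_p)."""
    return 0.5 * sum(t * e for t, e in zip(stress_gpa, strain)) * GPA_TO_MJ_PER_M3


def edge_change_percent(eta):
    """Relative change of a cell edge (%), from eta = 1/2 * ((a'/a)**2 - 1)."""
    return (math.sqrt(1.0 + 2.0 * eta) - 1.0) * 100.0


# Values quoted in the text for CaSO4 (anhydrite)
uniaxial_stress = [-1, 0, 0, 0, 0, 0]
uniaxial_strain = [-0.011, 0.00076, 0.00128, 0, 0, 0]
isotropic_stress = [-1, -1, -1, 0, 0, 0]
isotropic_strain = [-0.00896, -0.00344, -0.00675, 0, 0, 0]

w_uniaxial = work_density(uniaxial_stress, uniaxial_strain)
w_isotropic = work_density(isotropic_stress, isotropic_strain)
edges_uniaxial = [edge_change_percent(e) for e in uniaxial_strain[:3]]
edges_isotropic = [edge_change_percent(e) for e in isotropic_strain[:3]]
volume_strain = sum(isotropic_strain[:3])

print(f"uniaxial:  W = {w_uniaxial:.2f} MJ m^-3, edges (%) = {edges_uniaxial}")
print(f"isotropic: W = {w_isotropic:.2f} MJ m^-3, edges (%) = {edges_isotropic}")
print(f"relative volume change = {volume_strain:.5f}")
```

Under these assumptions the script reproduces the work densities and cell-edge changes of the text to within rounding, and the sum of the three normal strains equals the compressibility value quoted for a pressure of 1 GPa.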
Piezoelectricity
Some physical properties of crystals are expressed by a relation between a
vector and a second-rank tensor, rather than between two vectors or
between two second-rank tensors, as have been considered until now. If the
dependence between the components is linear, the relative coefficients
represent a third-rank tensor. This type of third-rank tensor is different
from that examined previously (cf. the electro-optical effect), which
expresses a second-order dependence between the components of two
vectors.
In piezoelectric crystals, by applying a mechanical stress $\tau$ an electric
dipole moment (per unit volume) P arises, whose components $P_i$ are related
linearly to the stress components:
As in the cases of second- ($\tau$, $\eta$) and fourth- (c, s) rank tensors, also for
the third-rank tensor d the number of subscripts can be reduced using the
contracted notation of Voigt. Of course only the second and third subscripts
of $d_{ihk}$ components, i.e. the pair of indices referring to the $\tau$ or $\eta$
components, are affected by contraction; the first subscript, relative to
components of vector P or E, is not involved. Similarly to what is done for
the strain tensor $\eta$, coefficients equal to 2 have to be introduced in the shear
components in order to have the relations
$$P_i = \sum_{p=1}^{6} d_{ip}\,\tau_p \quad \text{and} \quad \eta_p = \sum_{i=1}^{3} d_{ip}\,E_i$$
satisfied: $d_{i1} = d_{i11}$, $d_{i2} = d_{i22}$, $d_{i3} = d_{i33}$, $d_{i4} = 2d_{i23}$, $d_{i5} = 2d_{i13}$, $d_{i6} = 2d_{i12}$.
According to Voigt's notation, the piezoelectric tensor d is represented
simply by a 3 x 6 rectangular matrix with two-subscript components dip
(i = 1, ..., 3; p = 1, ..., 6).
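The role of the factor 2 in the shear components can be illustrated numerically. The sketch below is a hypothetical example: the tensor entries are arbitrary numbers chosen only to be symmetric in the last two subscripts, and the helper names are not from the text. It contracts a full $d_{ihk}$ array to the 3 × 6 matrix $d_{ip}$ and checks that both forms give the same polarization for a symmetric stress.

```python
# Voigt contraction of the last two subscripts: (h, k) -> p
VOIGT = {(1, 1): 1, (2, 2): 2, (3, 3): 3, (2, 3): 4, (1, 3): 5, (1, 2): 6}


def contract_d(d_full):
    """Contract d_full[i][h][k] (0-based, symmetric in h, k) to a 3x6 matrix d_ip."""
    d_voigt = [[0.0] * 6 for _ in range(3)]
    for i in range(3):
        for (h, k), p in VOIGT.items():
            factor = 1.0 if h == k else 2.0  # factor 2 on the shear components
            d_voigt[i][p - 1] = factor * d_full[i][h - 1][k - 1]
    return d_voigt


def contract_stress(tau):
    """Contract a symmetric 3x3 stress to the 6-component Voigt vector (no factor 2)."""
    out = [0.0] * 6
    for (h, k), p in VOIGT.items():
        out[p - 1] = tau[h - 1][k - 1]
    return out


def polarization_full(d_full, tau):
    """P_i = sum over h, k of d_ihk * tau_hk."""
    return [sum(d_full[i][h][k] * tau[h][k] for h in range(3) for k in range(3))
            for i in range(3)]


def polarization_voigt(d_voigt, tau_voigt):
    """P_i = sum over p of d_ip * tau_p."""
    return [sum(d_voigt[i][p] * tau_voigt[p] for p in range(6)) for i in range(3)]


# Arbitrary illustrative data (not measured values)
d_full = [[[float((i + 1) * (h + k + 1)) for k in range(3)] for h in range(3)]
          for i in range(3)]
tau = [[1.0, 2.0, 3.0],
       [2.0, 4.0, 5.0],
       [3.0, 5.0, 6.0]]

p_full = polarization_full(d_full, tau)
p_voigt = polarization_voigt(contract_d(d_full), contract_stress(tau))
print(p_full, p_voigt)
```

Without the factor 2, each off-diagonal stress component would be counted once in the contracted sum but twice in the full sum, and the two polarizations would disagree.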
yielding the result that the only components of tensor d which have
non-zero values are $d_{15}$, $d_{24}$, $d_{31}$, $d_{32}$, $d_{33}$. By similar reasoning the
conditions imposed by symmetry on the piezoelectric moduli can be derived
for all other non-centrosymmetric point groups. In particular, it should be
remarked that in the cubic point group 432 all dip values turn out to be zero:
so this crystal symmetry, though lacking the inversion centre, is forbidden
for piezoelectric crystals. The forms taken by the d tensor in all permitted
point groups are shown in Table 9.6.
Crystal defects
In previous chapters, some properties of the crystalline state were presented
and discussed on the basis of a simplified model, the ideal crystal, relying
upon the fundamental assumption of an atomic structure with perfect
translational periodicity. A large number of very important crystal pro-
perties can be accounted for in such a way: for example, the diffraction of
X-rays, electrons, and neutrons, the main aspects of dielectric and optical
behaviour, and most features of electronic and thermodynamic properties.
However, some crucial phenomena of crystalline solids can not be
explained at all by the perfect crystal model. First, the mechanical
behaviour in non-elastic conditions. The force per unit surface required to
break brittle crystals (break modulus), and the energy necessary to produce
a plastic deformation in ductile crystals have experimental values lower by
several orders of magnitude with respect to theoretical values calculated on
the basis of the perfect crystal model. The maximum shear stress which an
ideal crystal can withstand in elastic conditions is estimated to be about
K/100, where K is the elastic bulk modulus: on the other hand, in real
crystals stresses of K/$10^5$ are sufficient to start plastic slip processes. Second,
all transport phenomena in crystals can not be understood by a perfectly
ordered lattice model, where all atomic sites are occupied. This includes
very important processes such as diffusion, which is involved in all chemical
or phase transformations in crystalline solids, and the electrical conductivity
due to ionic motion within the crystal structure. Furthermore, several
particular features of electronic, spectroscopic, and thermal behaviour of
real crystals are not accounted for by the perfect crystal model.
Thus the deviations from ideality, or defects, have to be examined in
detail in order to build up a more general model: the real or defective
crystal. Two classes can be distinguished, point defects and extended
defects. The former ones are a violation of translational symmetry in a
single lattice site: for instance, the absence of an atom from its expected
position (vacancy), or the presence of an atom in an unexpected position
(interstitial). Extended defects, on the other hand, break the lattice
periodicity by relating two portions of the crystal, each of which is perfect in
its interior, in a 'wrong' way. The misfit may involve an extended part of a
plane, and in this case we have a planar defect (e.g. stacking faults in layer
structures); or it may involve just a region closely surrounding a line, so that
we speak of a linear defect or dislocation. Extended defects are weak points
of the crystal, from a mechanical point of view, and are then responsible for
plastic behaviour and for the lower observed strength with respect to
theoretical values. Point defects, on the other hand, provide vacant lattice
Experimental methods
Methods for direct observation (e.g. microscopy techniques) can be applied
to extended but generally not to point defects, as these usually have
dimensions smaller than the limit of resolution of the best instruments.
Transmission electron microscopy (TEM) is the experimental technique best
suited to observation of extended defects; in the case of high-resolution
instruments, the resolution power can reach 2-3 Å (HRTEM). An electron
beam accelerated by a potential of 100 kV is associated with a de Broglie
wave of wavelength $\lambda = h/p$ = 0.037 Å, where h is the Planck constant and p
is the linear momentum of electrons (cf. p. 185). Such a small wavelength is
consistent with a much higher resolution than can be attained by conventional
optical microscopy (not better than $10^4$ Å for monochromatic green
light). A system of electromagnetic lenses allows the electron beam to be
focused on to the sample and on to the fluorescent observation screen,
analogously to what happens with light in an optical microscope. The
regions of the crystal which contain extended defects give rise to changes of
intensity of the transmitted beam (effect of 'contrast'), so as to visualize the
defects themselves.
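The wavelength quoted above for 100 kV electrons can be reproduced from λ = h/p. The sketch below is illustrative and makes one assumption not spelled out in the text: the momentum is evaluated with the relativistic expression p = [2 m eV (1 + eV / 2mc²)]^(1/2), since at 100 kV the non-relativistic value differs in the third significant figure. The constant and function names are chosen here for the example.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron rest mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
C = 2.99792458e8            # speed of light, m s^-1


def electron_wavelength_angstrom(voltage, relativistic=True):
    """de Broglie wavelength (in angstrom) of an electron accelerated through `voltage` volts."""
    kinetic = E_CHARGE * voltage  # kinetic energy, J
    p_squared = 2.0 * M_E * kinetic
    if relativistic:
        p_squared *= 1.0 + kinetic / (2.0 * M_E * C ** 2)
    return H / math.sqrt(p_squared) * 1e10


lam_rel = electron_wavelength_angstrom(100e3)
lam_nonrel = electron_wavelength_angstrom(100e3, relativistic=False)
print(f"relativistic:     {lam_rel:.4f} angstrom")
print(f"non-relativistic: {lam_nonrel:.4f} angstrom")
```

The relativistic value rounds to the 0.037 Å given in the text, while the non-relativistic estimate is a few per cent larger.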
A scheme of the optical path of a TEM is shown in Fig. 9.8. The beams
which are scattered by the sample at small angles (1 to 2°) to the transmitted
beam are focused by the objective lens to form a diffraction pattern at its
back focal plane. By properly focusing the intermediate and projector lens
system, a magnified image of the back focal plane of the objective lens is
projected on the viewing screen. The intermediate aperture can limit the
sample area from which the diffraction pattern is obtained, so as to explore
very small portions of the specimen (selected area diffraction mode). In the
imaging mode, the aperture at the back focal plane of the objective lens is
inserted in order to block all diffracted beams and let out the transmitted
beam only (bright-field technique); alternatively, all but a single diffracted
beam are blocked by the aperture (dark-field technique). Intermediate
methods are based on selection of a number of diffracted beams, which
recombine at the image plane giving a contrast contributed only by the
corresponding diffracting directions.
Fig. 9.8. Optical path of a transmission electron microscope: (a) selected area
diffraction mode; (b) imaging mode. (1) Specimen; (2) objective lens; (3) back
focal plane of objective lens; (4) first intermediate image plane; (5) intermediate
lens; (6) second intermediate image plane; (7) projector lens; (8) viewing screen.
Important sources of errors in operating conditions are due to the
spherical aberration of the objective lens and to an incorrect focusing of the
objective lens. The spherical aberration error causes a slight displacement
$\Delta = C_s \alpha^3$ of points on the image plane. $\alpha$ (rad) is the maximum angle of
electron scattering which can pass through the objective lens (effective
aperture of the lens); $C_s$ is the coefficient of spherical aberration (≈ focal
length, of order 1-3 mm). Taking into account the Rayleigh formula for the
resolution of the instrument (R is the size of the resolved object):
$$R = \frac{0.61\,\lambda}{\alpha}$$
unit-cell volume and of the crystal density may give important indications
about the number and nature of point defects present in the crystal.
Another physical quantity which is closely related to such defects is the
electrical resistivity, particularly in ionic crystals where the very small
conductivity observed is entirely due to the thermally activated motion of
ions through the crystal. Ionic transport is made possible by vacancies, so
that resistivity measurements as a function of temperature are able to
characterize many features of point defects in the crystal. Other very
important techniques are based on resonance phenomena (e.g. electron spin
resonance or ESR) which may be particularly sensitive to impurities present
at the concentration level of point defects. These topics will be considered
in more detail in the following sections.
Planar defects
The most important planar defects are observed in layer structures and are
called stacking faults. Let us examine the simplest case of layer structures,
already mentioned on p. 429: the cubic and hexagonal close-packed
arrangements. A single close-packed layer is shown in Figs. 9.10 and 9.11; it
corresponds to a (111) or a (001) crystallographic plane in the cubic or
hexagonal case, respectively. Vectors are shown along the important lattice
directions lying on the plane; for example, in the cubic case (Fig. 9.10) the
arrow along $[11\bar{2}]$ means the vector a + b - 2c, and is denoted by the
conventional notation a$[11\bar{2}]$. The stacking sequences of layers
ABCABC... (cubic) and ABAB... (hexagonal) are projected on to the
plane $(1\bar{1}0)$ (cubic) or $(1\bar{2}0)$ (hexagonal) normal to the close-packed layers
and containing the $[11\bar{2}]$ (cubic) or [210] (hexagonal) direction (Fig. 9.12).
The three positions A, B, C of a general layer are related to one another by
translation vectors equal to a$[11\bar{2}]$/6 (cubic) or a[210]/3 (hexagonal). Full
circles represent atoms belonging to different layers, linked by thick lines to
emphasize the stacking sequence.
A stacking fault occurs when a single layer takes a different position with
respect to that required by the periodic sequence. This corresponds to a
Fig. 9.12. Stacking sequences of perfect FCC (left) and HCP (right) close-packed
structures shown on the $(1\bar{1}0)$ and $(1\bar{2}0)$ planes, respectively.
slip plane but with perfect lattice continuity
(right).
fault, where glide occurs by a fraction of lattice vector breaking the lattice
continuity over the whole slip plane. In the present instance if the slip plane
cuts the whole crystal, no defect is created (Fig. 9.17). On the other hand, if
only a part of the slip plane, limited by a boundary line, is involved in the
glide, then that line separating the slipped from the unslipped portion of the
crystal is a defective region: a dislocation.
Thus a dislocation is a linear defect defined completely by the corresponding
curve (which is a planar line of length l and then determines the
slip plane as well), and by the lattice vector coplanar with it which measures
the magnitude and direction of slip (Burgers vector b). A perfect superposi-
tion of lattice points of the two slipped portions of the crystal is observed far
from the dislocation line, while close to it the lattice periodicity fails. It is
important to examine the orientation of the Burgers vector with respect to
the dislocation line. Two limit cases are observed for a straight line: the
edge dislocation, with b normal to the line, and the screw dislocation, with b
parallel to the line (Fig. 9.18). If the dislocation line is a general curve
(Fig. 9.19), then the dislocation character (edge or screw) changes from
point to point, according to whether the unit vector l tangent to the line in a
given point is normal or parallel to the Burgers vector; for an intermediate
angle, the dislocation is said to have a mixed character.
Fig. 9.19. General dislocation with mixed edge-screw character.
Energy of a dislocation
Dislocations form, move, and interact with one another in a crystal
according to the energy of elastic distortion of the lattice associated with
them. It is thus important to analyse the elastic strain around line defects
using the basic ideas of crystal elasticity developed on p. 614, and to find out
which factors control the dislocation energy. The simplest case is that of a
pure screw dislocation, which can be represented for convenience as a shear
deformation of a cylindrical ring of isotropic material (Fig. 9.24). A radial
slit LMNO is cut parallel to the z axis, and the free surfaces are displaced
rigidly with respect to each other by the distance b in the z direction (cf.
Fig. 9.18(b)). Owing to the symmetry of the problem, cylindrical r, $\varphi$, z
rather than Cartesian coordinates are best suited to represent the strain and
stress field in the ring. The only non-zero strain component turns out to be
the shear term $\varepsilon_{\varphi z} = b/2\pi r$. The corresponding stress is $\tau_{\varphi z} = G\varepsilon_{\varphi z} =
Gb/2\pi r$, where G is the shear modulus (the equivalent of a shear
component of the elasticity tensor).
Fig. 9.24. Elastic strain of a cylindrical ring simulating the lattice distortion
of a screw dislocation.
It is easy to calculate the strain energy per unit volume W of the
deformed ring as half the product of stress × strain:
$$W = \tfrac{1}{2}\,\tau_{\varphi z}\,\varepsilon_{\varphi z} = \tfrac{1}{2}\,G\,(b/2\pi r)^2. \qquad (9.41)$$
As the radius r of the cylindrical ring varies between the inner $r_0$ and the
outer $r_1$ values, it is necessary to integrate the above expression in order to
obtain the total deformation energy per unit length of the ring. For a section
of thickness dr and unit length, the volume is $2\pi r\,dr$; thus the overall energy
is given by:
$$E = \int_{r_0}^{r_1} \tfrac{1}{2}\,G\,(b/2\pi r)^2\, 2\pi r\, dr = \frac{1}{4\pi}\,G b^2 \log\frac{r_1}{r_0}. \qquad (9.42)$$
This formula represents the elastic energy per unit length of a screw
dislocation, neglecting the contribution of the dislocation core (r < ro) where
strains are very large and the elasticity theory fails. However, estimates
suggest that the core energy is only a small fraction of the elastic energy,
owing to the much smaller crystal volume involved in the core distortion.
The $r_0$ length is of the order of $10^{-9}$ m, $r_1$ is of the order of the crystal
dimensions, and a typical value of the shear modulus G is $4 \times 10^{10}$ N m$^{-2}$;
for a Burgers vector of, say, $2.5 \times 10^{-10}$ m, and a crystal of 0.01 m the
energy of a single screw dislocation amounts to $3.2 \times 10^{-9}$ J m$^{-1}$.
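The numerical estimate above follows directly from the logarithmic expression for the energy per unit length. As a minimal sketch (function and variable names are illustrative, and the core radius is taken as $10^{-9}$ m), the closed form can be evaluated and compared with a direct numerical integration of the energy density (9.41) over cylindrical shells of volume 2πr dr.

```python
import math


def screw_energy_per_length(shear_modulus, burgers, r_inner, r_outer):
    """Closed form: (G b^2 / 4 pi) * ln(r_outer / r_inner), in J m^-1."""
    return shear_modulus * burgers ** 2 / (4.0 * math.pi) * math.log(r_outer / r_inner)


def screw_energy_numeric(shear_modulus, burgers, r_inner, r_outer, steps=1000):
    """Midpoint integration of 1/2 G (b / 2 pi r)^2 * 2 pi r dr on a logarithmic grid."""
    ratio = (r_outer / r_inner) ** (1.0 / steps)
    total = 0.0
    r_low = r_inner
    for _ in range(steps):
        r_high = r_low * ratio
        r_mid = math.sqrt(r_low * r_high)
        density = 0.5 * shear_modulus * (burgers / (2.0 * math.pi * r_mid)) ** 2
        total += density * 2.0 * math.pi * r_mid * (r_high - r_low)
        r_low = r_high
    return total


G = 4e10      # shear modulus, N m^-2 (value used in the text)
b = 2.5e-10   # Burgers vector, m
r0 = 1e-9     # core radius, m
r1 = 0.01     # crystal size, m

energy_closed = screw_energy_per_length(G, b, r0, r1)
energy_numeric = screw_energy_numeric(G, b, r0, r1)
print(f"closed form: {energy_closed:.3e} J m^-1")
print(f"numerical:   {energy_numeric:.3e} J m^-1")
```

Because the dependence on $r_1/r_0$ is only logarithmic, changing the crystal size or the core radius by an order of magnitude alters the result by roughly 15 per cent, which is why rough estimates of both radii are sufficient.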
The energy per unit length of an edge dislocation is given by expression
(9.42) divided by 1 - $\nu$, where $\nu = (3K - 2G)/[2(3K + G)]$ is Poisson's ratio
(≈ 0.3). This is related to the fact that not only shear but also compressive
stresses are involved in edge dislocations; thus the bulk elastic modulus K in
addition to G appears in the corresponding energy expression.
An important aspect of eqn (9.42) is that the dislocation energy is
proportional to the square of Burgers vector magnitude b. As a conse-
quence, for a dislocation with a large value of b it is energetically more
favourable to separate into several dislocations characterized by b values as
small as possible (i.e. equivalent to a lattice spacing). Partial dislocations
(cf. p. 634 below) arise from splitting of a normal dislocation so as to attain
Burgers vectors shorter than a lattice spacing, in association with a stacking
fault.
Using the G and b values assumed in the numerical example on p. 632, the
interaction force per unit length between two dislocations at a distance of
100 Å turns out to be 0.04 N m$^{-1}$. It should be noticed that the force (9.44)
decays very slowly with distance, according to the 1/d dependence, so that
the interaction between dislocations is quite long range.
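The quoted force can be checked with a short calculation. Expression (9.44) is not reproduced in this passage, so the sketch below assumes the standard result for two parallel screw dislocations with equal Burgers vectors, F = G b² / (2π d) per unit length; this form has the 1/d dependence mentioned in the text, but it is an assumption here, and the names are illustrative.

```python
import math


def interaction_force_per_length(shear_modulus, burgers, distance):
    """Assumed form F = G b^2 / (2 pi d), in N m^-1, for two parallel screw dislocations."""
    return shear_modulus * burgers ** 2 / (2.0 * math.pi * distance)


G = 4e10     # shear modulus, N m^-2
b = 2.5e-10  # Burgers vector, m
d = 100e-10  # separation of 100 angstrom, m

force = interaction_force_per_length(G, b, d)
force_ten_times_farther = interaction_force_per_length(G, b, 10 * d)
print(f"F(100 A)  = {force:.4f} N m^-1")
print(f"F(1000 A) = {force_ten_times_farther:.4f} N m^-1")
```

With the values of the numerical example the assumed expression gives about 0.04 N m$^{-1}$, and a tenfold increase in distance reduces the force only by a factor of ten.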
Partial dislocations
A peculiar type of linear defect is observed in crystals showing layer
structures, and is closely connected with stacking faults (p. 625). Let us
consider the close-packed cubic structure: atomic layers (Fig. 9.10) are
represented by {111} lattice planes (i.e. the system of symmetry equivalent
planes (111), $(11\bar{1})$, $(1\bar{1}1)$, $(\bar{1}11)$), and clearly most processes of plastic glide
related to formation of dislocations occur on these layers. The shortest
lattice vector on the (111) plane, a$[1\bar{1}0]$/2, is the most probable Burgers
vector for dislocations in this case. Similarly, in a general layer structure the
layer plane is preferred as slip plane for formation of dislocations. However,
we have learnt (p. 626) that a slip on these layers by a suitable vector,
different from a lattice vector, may give rise to a stacking fault. In the FCC
structure such a vector is typically a$[11\bar{2}]$/6, translating for instance a B
layer into a C position, while the whole stack above the slip plane follows it
(cf. Fig. 9.13(a)). Now if the glide process occurs not on the whole plane
area, but just on a part of it, then at the border between slipped and
unslipped portions a linear defect arises. This is called a partial dislocation,
or Shockley partial: it is entirely equivalent to a normal dislocation, except for its
Burgers vector not being a lattice vector. Thus a stacking fault not
extending through the whole crystal is bounded by a partial dislocation. If
the fault occupies a ribbon on the slip plane, then its boundary is formed by
two partial dislocations, one for each side. In this case the Burgers vectors
of the two partials have as sum a lattice vector, which is the Burgers vector
of the unit dislocation equal to the sum of the two partials. Let us consider
an ordinary dislocation with Burgers vector a$[1\bar{1}0]$/2 on the slip plane (111)
of an FCC structure (cf. Fig. 9.10); it can be decomposed into two partials
separated by a stacking fault according to the following relation between the
corresponding Burgers vectors:
It should be stressed that the vector on the left-hand side of the equality is a
lattice vector, while those on the right-hand side are not; this is consistent
with the characters of unit and partial dislocations, respectively. The driving
force of the above dislocation reaction is the minimization of the dislocation
energy (9.42), brought about by a decrease of the Burgers vector b.
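The energy argument can be made concrete with exact arithmetic. The sketch below is an illustration under a stated assumption: the two partials are taken to be a[2 -1 -1]/6 and a[1 -2 1]/6, the standard Shockley decomposition of a[1 -1 0]/2 on (111), which may differ in labelling from the relation used in the text. Squared lengths are expressed in units of a², using the fact that the cubic basis is orthonormal up to the factor a.

```python
from fractions import Fraction


def scaled(direction, denominator):
    """Vector a[uvw]/n expressed in units of the cubic cell edge a."""
    return tuple(Fraction(c, denominator) for c in direction)


def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def norm_squared(v):
    """Squared length in units of a^2 (cubic lattice)."""
    return dot(v, v)


unit_dislocation = scaled((1, -1, 0), 2)
partial_1 = scaled((2, -1, -1), 6)   # assumed Shockley partial
partial_2 = scaled((1, -2, 1), 6)    # assumed Shockley partial
plane_normal = (1, 1, 1)             # normal to the (111) slip plane

sum_of_partials = add(partial_1, partial_2)
energy_before = norm_squared(unit_dislocation)                       # proportional to b^2
energy_after = norm_squared(partial_1) + norm_squared(partial_2)

print("sum of partials:", sum_of_partials)
print("b^2 before:", energy_before, " b^2 after:", energy_after)
```

Both partials lie in the slip plane, their sum is the unit Burgers vector, and the sum of squared lengths drops from a²/2 to a²/3, so by the b² dependence of (9.42) the split is energetically favoured (the stacking-fault energy of the ribbon between the partials is not included in this comparison).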
Fig. 9.25. Small-angle grain boundary, showing the array of edge dislocations
at the interface.
Point defects
Besides vacancies and interstitials (intrinsic defects), which have already
been mentioned on p. 622, a third kind of point defects should be
considered. These are impurities, and correspond to substitution of an atom
at a regular lattice site by another atom of a different chemical species
(extrinsic defect). All such types of point defects are usually observed in
ionic, covalent, and metal crystals. If the crystal is not monoatomic, it is
necessary to distinguish between stoichiometric and non-stoichiometric
compounds. In the first case the defect concentrations of different atomic
species are related by the constraint of constant ratios between numbers of
atoms, while this does not occur for non-stoichiometric crystals. Further-
more, in ionic solids the numbers of point defects concerning anions and
cations are always constrained by the need for electroneutrality. We shall
examine stoichiometric ionic crystals in more detail.
A single vacancy, either cationic or anionic, would violate the crystal
electroneutrality; thus it must be associated with another vacancy of
opposite sign, or with an interstitial of the same sign. In the former case we
have a pair of vacancies of different signs called Schottky defect: in the
latter one, a vacancy + interstitial pair (Frenkel defect) is observed. Also an
ionic impurity with different charge with respect to the substituted ion
introduces an electric imbalance and has then to be compensated by a
vacancy or an interstitial of the same sign. For instance, an M$^{2+}$ impurity in a
Diffusion
One of the fundamental phenomena made possible by point defects in
crystals is the mobility of atoms and ions in the solid state. The process
concerning mass transport by purely thermal activation, with absence of
electric fields, is called diffusion. It is governed by Fick's laws, the second of
which is the following:
Fig. 9.27. Arrhenius plot of log D versus 1/T, showing the impurity and intrinsic
regions characterizing diffusion in crystals.
However, the diffusion coefficient D as a function of temperature shows a
more complex experimental behaviour (Fig. 9.27). Two regions, of high and
low temperature respectively, can be distinguished: these are characterized
by quite different slopes of log D versus 1/T. Moreover, in the low-temperature
region the observed straight line may be shifted up or down by
changing the crystal sample of the same substance. The explanation is
related to the presence of impurities of aliovalent ions, which, as mentioned
on p. 635, give rise to vacancies distinct from those due to thermal disorder.
For instance, crystals of NaCl can contain small, different quantities of
divalent impurities (e.g. Mn$^{2+}$ ions) substituting Na$^+$ cations: then for each
Mn$^{2+}$ impurity a vacancy $V_{Na}$ is created in order to keep the electroneutrality.
The number of these vacancies is independent of temperature, changes
from sample to sample, and at low temperature is much larger than that of
thermal (intrinsic) vacancies. Thus at low temperature (impurity region) the
quantity $n_v$ of eqn (9.50) represents essentially the number of vacancies due
to impurities and is a constant; the slope of the straight line log D versus
1/T is simply determined by $H_a/k$, according to (9.49). At high temperature, on the other
hand, intrinsic vacancies outnumber extrinsic ones (intrinsic region), and
expression (9.46) for the thermal distribution of Schottky defects must
replace $n_v$ in equation (9.50) for $D_0$. The result obtained for the diffusion
coefficient D is:
$$D = g\,\nu\,a^2 \exp\left[-\frac{H_a + \Delta H_s/2}{kT}\right].$$
Therefore, in the intrinsic region the slope of the line log D versus 1/T
equals $(H_a + \Delta H_s/2)/k$ instead of $H_a/k$, and is then larger than the
corresponding one in the impurity region.
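The change of slope between the two regions can be illustrated with a small numerical model. Everything numerical in the sketch below is hypothetical: the activation enthalpy, the Schottky formation enthalpy, the impurity vacancy fraction, and the prefactor are placeholder values chosen only to produce a visible crossover, and the vacancy fraction is simply taken as the larger of the impurity and thermal contributions.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV K^-1


def diffusion_coefficient(temperature, h_a, dh_s, x_impurity, prefactor=1.0):
    """D = prefactor * (vacancy fraction) * exp(-H_a / kT), in arbitrary units.

    The vacancy fraction is the larger of the constant impurity-induced value
    and the thermal Schottky value exp(-dH_s / 2kT) (a simplifying assumption).
    """
    x_thermal = math.exp(-dh_s / (2.0 * K_B * temperature))
    x_vacancy = max(x_impurity, x_thermal)
    return prefactor * x_vacancy * math.exp(-h_a / (K_B * temperature))


def arrhenius_slope(t1, t2, **params):
    """Slope of ln D versus 1/T between two temperatures (in K)."""
    d1 = diffusion_coefficient(t1, **params)
    d2 = diffusion_coefficient(t2, **params)
    return (math.log(d2) - math.log(d1)) / (1.0 / t2 - 1.0 / t1)


# Hypothetical parameters, for illustration only
params = dict(h_a=0.7, dh_s=2.3, x_impurity=1e-6)

slope_impurity = arrhenius_slope(500.0, 600.0, **params)     # low-temperature region
slope_intrinsic = arrhenius_slope(1000.0, 1050.0, **params)  # high-temperature region
crossover = params["dh_s"] / (2.0 * K_B * math.log(1.0 / params["x_impurity"]))

print(f"impurity slope : {slope_impurity:.1f} K  (-H_a/k = {-params['h_a'] / K_B:.1f} K)")
print(f"intrinsic slope: {slope_intrinsic:.1f} K")
print(f"crossover temperature: {crossover:.0f} K")
```

In this model a purer sample (smaller impurity fraction) shifts the low-temperature line downwards and moves the crossover to lower temperature, while the high-temperature line is unaffected, which is the behaviour described for different samples of the same substance.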
Ionic conductivity
Alternatively to the thermally activated process (diffusion), ionic transport
in crystals is driven by an applied electric field. In this case electric
conduction based on migration of ions rather than electrons occurs;
however, the atomistic mechanism always relies upon the presence of point
defects, just as for diffusion. Generally, the relevant contribution to
conductivity is given either by cations (e.g. alkali or silver ions) or by anions
(halide or oxide ions). In either case the conductivity $\sigma$ is proportional to
the number of charge carriers per unit volume, n, and to the mobility of the
migrating ion, $\mu$ (e is the electron charge); for cations:
Appendix
where the first subscript is the identity number of the eigenvector, and the
second one specifies the Cartesian component.
It is very useful to change the reference basis from the original one to that
formed by the three normalized eigenvectors $Q_1$, $Q_2$, $Q_3$, and to find out
the representation of tensor y in the new basis. The transformation matrix
Q has rows given by the components Qij of each eigenvector:
taking into account (9.A.2) and (9.A.3). This means that the representation
of tensor y in the basis of its eigenvectors is a diagonal matrix whose
non-zero components are just the eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$. The eigenvalues
take the names of principal components ($y_1 = \lambda_1$, $y_2 = \lambda_2$, $y_3 = \lambda_3$), and the
directions of the eigenvectors are called principal axes (or principal
directions) of tensor y; this is represented with respect to the basis of its
eigenvectors by the matrix:
Along the principal directions the crystal behaves isotropically, i.e. each
eigenvector $X_i$ is transformed by tensor y into a parallel vector $Y_i$ according
to $Y_i = y_i X_i$, which is the equivalent of (9.A.2).
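The change of basis to the principal axes can be illustrated with a small numerical example. This is a sketch under stated assumptions: the symmetric tensor below and its hand-derived eigenvectors are hypothetical, and the usual transformation law for a second-rank tensor under an orthogonal change of basis (Q y Qᵀ, with the eigenvectors as rows of Q) is assumed.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose of a 3x3 matrix."""
    return [list(row) for row in zip(*m)]


def apply(m, v):
    """Matrix-vector product: Y = y X."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# Hypothetical symmetric tensor in a general Cartesian basis
y = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 5.0]]

# Normalized eigenvectors as rows of Q (eigenvalues 1, 3 and 5)
s = 1.0 / math.sqrt(2.0)
Q = [[s, -s, 0.0],
     [s, s, 0.0],
     [0.0, 0.0, 1.0]]
eigenvalues = [1.0, 3.0, 5.0]

# Representation of the tensor in the basis of its eigenvectors
y_principal = matmul(matmul(Q, y), transpose(Q))

# Each eigenvector is mapped onto a parallel vector: Y_i = y_i X_i
for lam, x in zip(eigenvalues, Q):
    mapped = apply(y, x)
    assert all(abs(m - lam * c) < 1e-12 for m, c in zip(mapped, x))

for row in y_principal:
    print([round(v, 12) for v in row])
```

The printed matrix is diagonal, with the three eigenvalues on the diagonal, up to rounding.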
Let us now consider the case in which all eigenvalues are positive. Then, by
writing equation (9.A.7) as $\sum_{i=1}^{3} (x_i/a_i)^2 = 1$, it turns out that the semi-axes
of the representation ellipsoid have lengths given by $a_i = 1/\sqrt{y_i}$, i.e. they
are equal to the inverse square roots of the principal tensor components. The
symmetry axes of the quadric correspond to the principal directions of the tensor.
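As a brief numerical check of this relation, the sketch below computes the semi-axes from a set of principal components and verifies that the end point of each semi-axis lies on the quadric. The principal components are hypothetical values, and the quadric is assumed to read $\sum_i y_i x_i^2 = 1$ in the principal-axes basis, as implied by the rewritten form of (9.A.7).

```python
import math


def semi_axes(principal_components):
    """Semi-axes a_i = 1/sqrt(y_i) of the representation ellipsoid."""
    if any(c <= 0.0 for c in principal_components):
        raise ValueError("an ellipsoid requires all principal components > 0")
    return [1.0 / math.sqrt(c) for c in principal_components]


def on_quadric(point, principal_components, tol=1e-12):
    """True if the point satisfies sum_i y_i x_i^2 = 1 (principal-axes basis)."""
    value = sum(c * x * x for c, x in zip(principal_components, point))
    return abs(value - 1.0) < tol


# Hypothetical principal components of a positive-definite tensor
y_principal = [1.0, 4.0, 0.25]
axes = semi_axes(y_principal)
print(axes)

# The end point of each semi-axis lies on the quadric surface
for i, a in enumerate(axes):
    point = [0.0, 0.0, 0.0]
    point[i] = a
    assert on_quadric(point, y_principal)
```

The largest principal component corresponds to the shortest semi-axis, and vice versa.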
We now want to derive an expression for the component $y_l$ of tensor y
along a given direction characterized by direction cosines $l_1$, $l_2$, $l_3$ (l is then
the unit vector parallel to that direction). Such a component is defined as
the projection onto l of the Y vector corresponding to an X vector parallel to
l, divided by the modulus X: $y_l = Y \cdot l / X$. With respect to a general
Cartesian basis, by applying the usual formula of the scalar product and taking
into account (9.A.5) one obtains:
where of course x represents the length of a radius vector joining the origin
to a general point on the quadric surface. Because of (9.A.8), it follows that
$x = 1/\sqrt{y_l}$, i.e. the radius vector of the representation quadric has a length
equal to the inverse square root of the tensor component along the same
direction.
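The relation between the directional component and the radius vector of the quadric can be sketched as follows. The example assumes that Y = yX in components reads $Y_i = \sum_j y_{ij} X_j$, so that the definition $y_l = Y \cdot l / X$ gives $y_l = \sum_{ij} y_{ij} l_i l_j$, and that the quadric is $\sum_{ij} y_{ij} x_i x_j = 1$. The tensor and the function names are hypothetical.

```python
import math


def component_along(y, direction):
    """Tensor component y_l = sum_ij y_ij l_i l_j along a direction.

    The direction is normalized first, so l holds the direction cosines.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    l = [c / norm for c in direction]
    return sum(y[i][j] * l[i] * l[j] for i in range(3) for j in range(3))


def quadric_radius(y, direction):
    """Length x = 1/sqrt(y_l) of the radius vector of the quadric."""
    y_l = component_along(y, direction)
    if y_l <= 0.0:
        raise ValueError("no real radius vector along this direction")
    return 1.0 / math.sqrt(y_l)


# Hypothetical symmetric tensor in a general Cartesian basis
y = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 5.0]]

direction = [1.0, 1.0, 0.0]
y_l = component_along(y, direction)
x = quadric_radius(y, direction)

# The point at distance x along the direction satisfies sum_ij y_ij x_i x_j = 1
norm = math.sqrt(sum(c * c for c in direction))
point = [x * c / norm for c in direction]
quadric_value = sum(y[i][j] * point[i] * point[j]
                    for i in range(3) for j in range(3))
print(y_l, x, quadric_value)
```

For this tensor the direction [1, 1, 0] is a principal direction, so the computed component coincides with one of the principal components.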
Further reading
Amelinckx, S., Gevers, R., and Van Landuyt, J. (ed.) (1978). Diffraction and
imaging techniques in material science. North-Holland, Amsterdam.