Shrake 1973


J. Mol. Biol. (1973) 79, 351-371

Environment and Exposure to Solvent of Protein Atoms.
Lysozyme and Insulin

A. SHRAKE AND J. A. RUPLEY†

Department of Chemistry
The University of Arizona
Tucson, Arizona 85721, U.S.A.

(Received 12 March 1973, and in revised form 6 June 1973)

A computer program is described for calculating the environment and the
exposure to solvent of atoms of a protein. The computation is based on the
atomic co-ordinates of the protein and on assumptions like those of Lee &
Richards (1971). Results for lysozyme and insulin are presented. Changes in
exposure to solvent and in the nature of contacts that develop through folding,
association reactions and crystallization are described numerically. The compu-
tations suggest several generalizations. (a) Lattice contacts within the protein
crystal are characterized by a significantly smaller involvement of non-polar side
chains and a proportionately greater involvement of ionizable side chains than
is found for protein folding or for protein association reactions important for
biological function. (b) In helical regions the carbonyl oxygen of the first residue
in the helix has high probability of being shielded from solvent. (c) Glycine is
among the residues having exposure least affected by folding: this accords with
the expectation that it lies at bends of the peptide chain on the surface of the
molecule.

1. Introduction
Atomic co-ordinates derived from high resolution crystallographic analyses are
available for more than 30 proteins (Dickerson, 1972). Some method of describing
these structures in a way that allows simple and objective comparisons among them
seems necessary. Particular importance is attached to descriptions of the molecular
surface and the environments of reactive groups because these features should most
closely relate to chemical properties. Lee & Richards (1971) have made an effective
approach to this problem through developing a computer program for calculating
the exposure of protein atoms to solvent. The present report extends this method by
focusing attention on the nature of contacts between atoms. Results are given for
native lysozyme and insulin and for changes in their surfaces that occur during
folding and several association reactions, including crystallization.

2. Methods
Like Lee & Richards (1971), we describe the protein by a set of solvated van der Waals’
spheres. The surface of a sphere is represented by a set of 92 test points that are nearly
uniformly distributed. Each atom of the protein is considered separately as a central atom
that is checked for overlap with all other atoms of the molecule, the test atoms. The latter

† To whom inquiries concerning this paper should be addressed.


TABLE 1. Average areas exposed to solvent in Gly-X-Gly models for the unfolded state

ALA (14)... 0 25 5 NOE2 49 3 CG 51 TRP (6). . . CG2 60 6 CA 13


N 2 1 :“G 41
11 3 0” 203: CD1 18 4 N 1 1 C 5
BB 40 7 CEl 38 3 6 5 BE 36 8
CA 19 3 CA OEND 41
SC 150 5 SC 133 10
ODl 32 5 :“G 23
10 4 OEND’ 39
; 28
0 0
3 OD2 41 5 CD1 65 7 z2 4: 3 : 218 ‘5 25
GLY (18)... CD2 19 8 30 6
CB 75 2 CD2 66 7 :: :G” 3
BB 39 6 N 5 2
TEmiINAL CD1 52
BB 49 4 SC 115 5 CA 53 5 CD1 3: : I)ESIDUES CD2 71
SC 75 2 0” 30
1 6
1 s”“c 164
34 5
6 SBE
1:: 1: NE 17 1
CYS (20)... 9 4 ALA (2) . . . BB 99
CEl
ARG (13). . . N LYS (7)... PRO (4) . . . SC 152
BB 89 6 CZl 37 1 N 2 1
N 2 2 CA f : N 2 2 N 0 0
CA 11 0
CA 13 3 CA 11 2 CA 14 2 C LYS (l)...
HIS (5)... :;2 “3; : 1 0
C 0 0 0” 25
0 4
0 C 0 1 NEND 37
N CEZ 19 4 OEND 39 3
0 27 3 0 26 3 0” it : CA 24
CA 1; “3 CD2 4 1 OEND’41 3
CB 27 4 s”“G 32
30 5
7 CB 26 6 CB 37 5 C 0
C CB 75 1
CG 26 7 cc: 32
25 6
5 BB 36 6 0 33
32 5 0 2: ; 9 BB 94 1 32
CD 36 4 :: 30
45 3
2 SC 229
:“c 62 10 :BG 33 2 SC 75 1 :G” 13
NE 11 2
NC; 41
51 3
5 TYR (ll)... CD 33
au (10). . . ND1 1: 2’ N 2 2 ASN (Z)... CE
z1 464 : 5
N CEl 60 3 BB 40 CA 7 4 N 2 0 NZ t:
NH2 52 5 SC 174 6 SER (16)...
CA 13 “s NE2 15 1 CA 11 3
BB 42 3 C 1 1 CD2 37 10 N 2 2 BB 94
: 0
25 1
4 C 2 1
SC 202 7 0 23 2 MET (2) . . CA 15 4 SC 172
CB 34 4 OEND 35 3
CB 29 2 N 1 0 C 1 1 0END’42 4
:“c 1:; ; PHE (Z)...
CG 38 6 CA 13 2 0 26 4
c”:1 2: : CB 39 5
CB 51 6 NEND 35 2
CEl 35 2 CG 2 1
:ED1 3; 25 $ 208 ; OG 31 4 CA 13 2
INLE (lo)*** CZ 11 1 NOD1 41 1
OEZ 42 3 CB 20 1 C 11
CA ; : BB 44 6 OH 40 1 NOD2 38 2
: 0 32 1
SC 82 5 CE2 35 3 30 0
CB 38 5 :: 43
32 4
3 CD2 22 6
0” 23
0 1
4 :G" 41
cc 1 1 CE 79 2
CB 6 3 THR (ll)... CD1 27 4
NOD1 37 5 N
Ctil 27 3 GLY (2)... CEl 39 1
NOD2 45 6 2: 173
42 3
4 CA 1: :
%” (g)*..2 2 CD1 74 3 NEND 44 1 cz 38 1
BB 40 5 CA 12 5 CG2 53 4 VAL (14)... CA 64 4 CE2 38 1
SC 121 6 ; 25
0 1
4 CD2 25 2
34 5 N
: 2: 5 CB 19 3 lo’ : 0” 32
2 1
:c” 160 4 CA BB 80 5
ASP (7)... 25 5 OCl 28 2
s :: 35 3 CC2 67 6 BB 142 6 SC 202 0
0” 2: :
CA 1: : 0” 260 : BB 38 5
LNE” (19)*.. LElJ (1) . . .
C 0 0 %El 4: : CB 29 5 SC 114 8 ::l 63
10 7
2
CA : a N 2

The numbers of residues of each type included in the averaging (a total of 127 non-terminal residues in lysozyme and 94 in insulin dimer) are given in
parentheses following the residue type. The atom designations are those used in Imoto et al. (1972). BB and SC stand for the sums over all backbone and
over all side chain atoms of the residue, respectively. The first number column gives the average area (Å²) exposed to solvent in the peptide model; the
second gives the root-mean-square deviation. Mean values for terminal residues (Gly-X or X-Gly models) are given at the end of the list.
EXPOSURE CALCULATIONS 363

are divided into two categories. Near test atoms are those of a Gly-X-Gly tripeptide model
in which residue X contains the central atom. The tripeptide model for a half-cystine
residue of a disulfide includes the SG, CB and CA atoms of the partner half-cystine as
near atoms. Long test atoms are all other test atoms.
The exposure of a particular central atom to solvent is the area of the solvated sphere
that contains test points not occluded by test atoms. Each test point on the surface of a
central atom is considered separately with respect to all the test atoms. The test atom
given credit for occluding a test point is determined by the greatest value for the ratio of
the solvated radius of the test atom to the distance from the test point to the center of
the test atom. The list of interacting test atoms and the corresponding areas occluded on
a particular central atom describes the environment of the central atom. This list, which
is the basic output of the computation, is stored on magnetic tape for subsequent use in
summations and comparisons.
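
The crediting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names `sphere_points` and `environment` are hypothetical, and the test points come from a golden-spiral layout, which is swapped in here because the construction of the 92-point set is not given in this section.

```python
import math

def sphere_points(n):
    """Return n roughly uniform unit vectors (golden-spiral layout, an assumption)."""
    step = math.pi * (3.0 - math.sqrt(5.0))      # golden angle in radians
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n            # evenly spaced heights
        rho = math.sqrt(max(0.0, 1.0 - z * z))   # circle radius at that height
        points.append((rho * math.cos(step * i), rho * math.sin(step * i), z))
    return points

def environment(central, test_atoms, n_points=92, probe=1.4):
    """central: (x, y, z, vdw_radius).
    test_atoms: list of (label, x, y, z, vdw_radius).
    Returns (exposed_area, {label: occluded_area})."""
    cx, cy, cz, c_radius = central
    big_r = c_radius + probe                     # solvated radius of central atom
    area_per_point = 4.0 * math.pi * big_r ** 2 / n_points
    exposed_points = 0
    credited = {}
    for ux, uy, uz in sphere_points(n_points):
        point = (cx + big_r * ux, cy + big_r * uy, cz + big_r * uz)
        # A ratio above 1 means the point lies inside that test atom's solvated
        # sphere; the largest ratio decides which test atom gets the credit.
        best_label, best_ratio = None, 1.0
        for label, x, y, z, radius in test_atoms:
            dist = math.dist(point, (x, y, z))
            ratio = math.inf if dist == 0.0 else (radius + probe) / dist
            if ratio > best_ratio:
                best_label, best_ratio = label, ratio
        if best_label is None:
            exposed_points += 1
        else:
            credited[best_label] = credited.get(best_label, 0) + 1
    occluded = {label: count * area_per_point for label, count in credited.items()}
    return exposed_points * area_per_point, occluded
```

The returned dictionary plays the role of the environment list: one entry per interacting test atom, with the area it occludes on the central atom. The exposed and occluded areas always sum to the area of the solvated sphere.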
The Gly-X-Gly tripeptide serves as a model for the environment of a central atom in
the unfolded protein. This model assumes that side chains of adjacent residues in an
unfolded chain on the average do not contact the central residue. The conformation used
for the tripeptide is that for the corresponding atoms of the native protein. The area
exposed to solvent for a particular type of atom (or residue) therefore varies for this model
according to its location in the folded molecule. Table 1 gives averages for areas exposed to
solvent in the unfolded state and the corresponding root-mean-square deviations (a) for
each type of atom and (b) for the sums over all backbone atoms and over all side chain
atoms for each type of residue. The small values of the root-mean-square deviations show
that use of the native conformation for the tripeptide model introduces no systematic
error and is likely a better representation of the random coil than that obtained using a
single conformation for the tripeptide.
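
The averaging behind Table 1 can be illustrated as follows. This is a sketch only: the function name `average_exposure` and the record layout are hypothetical, and the root-mean-square deviation is taken in its population form (dividing by the number of values), since the text does not say which form was used.

```python
import math
from collections import defaultdict

def average_exposure(records):
    """records: iterable of (residue_type, atom_name, exposed_area).
    Returns {(residue_type, atom_name): (mean, rms_deviation, count)}."""
    groups = defaultdict(list)
    for residue_type, atom_name, area in records:
        groups[(residue_type, atom_name)].append(area)
    summary = {}
    for key, values in groups.items():
        mean = sum(values) / len(values)
        # Population form of the root-mean-square deviation (an assumption).
        rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        summary[key] = (mean, rms, len(values))
    return summary
```

Each record would come from one Gly-X-Gly model taken at a different position in the native structure, so the deviation measures how much the exposure of a given atom type depends on the local backbone conformation.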
In calculations for unfolded proteins, the test atoms are near atoms only. In computations
for folded molecules, the test atoms include both near and long atoms. Because near
and long test atoms are considered on equal terms in determining which test atom is given
credit for occluding a test point, the area assigned to near atoms in calculations for a
folded protein is in general less than that occluded by the same near atoms in the unfolded
model.
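
The division of test atoms into near and long atoms can be sketched as below. This is an illustration resting on one reading of the Gly-X-Gly model: near atoms are taken to be the other atoms of residue X, the backbone atoms (N, CA, C, O) of the two flanking residues, and the SG, CB and CA atoms of a disulfide partner. The names `is_near` and `split_test_atoms` are hypothetical, and a single chain with consecutive residue numbers is assumed.

```python
BACKBONE = {"N", "CA", "C", "O"}          # heavy atoms of a glycine residue
PARTNER_ATOMS = {"SG", "CB", "CA"}        # atoms kept from a disulfide partner

def is_near(central, test, disulfide_partner=None):
    """central, test: (residue_number, atom_name) pairs.
    disulfide_partner: residue number of the partner half-cystine, if any."""
    c_res, _ = central
    t_res, t_name = test
    if t_res == c_res:
        return True                        # same residue X
    if abs(t_res - c_res) == 1:
        return t_name in BACKBONE          # flanking residues modelled as Gly
    if disulfide_partner is not None and t_res == disulfide_partner:
        return t_name in PARTNER_ATOMS
    return False

def split_test_atoms(central, atoms, disulfide_partner=None):
    """Return (near, long) lists of test atoms for one central atom."""
    others = [a for a in atoms if a != central]
    near = [a for a in others if is_near(central, a, disulfide_partner)]
    long_atoms = [a for a in others if not is_near(central, a, disulfide_partner)]
    return near, long_atoms
```

A calculation for the unfolded state would then use only the `near` list as test atoms, whereas a calculation for the folded molecule would use both lists together.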

TABLE 2
Van der Waals’ radii

Atom type                                  Radius†    Area of solvated
                                           (Å)        sphere (Å²)

All nitrogen: -N-, -NH, -NH2, -NH3+        1.5        106
All oxygen: =O, -O-, -OH                   1.4         99
All sulfur: -S-, -SH                       1.85       133
Non-aromatic carbon: -CH-, -CH2, -CH3      2.0        145
Aromatic carbon: =CH, =C-                  1.85       133
Carbonyl and other carbon                  1.5        106
Zn2+                                       0.74        58
Solvent (water)                            1.4          -

† From Pauling (1960).



The van der Waals’ radii used in these computations and the surface areas of the
solvated van der Waals’ spheres are given in Table 2. The radius of the solvated sphere is
the sum of the van der Waals’ radius of the atom and that of the solvent (1.4 Å). The
values of Table 2, which are those of Pauling (1960), differ somewhat from those of Lee &
Richards (1971), who used the van der Waals’ radii of Bondi (1964). Like Lee & Richards,
we do not explicitly consider hydrogen atoms, which are incorporated into the van der
Waals’ radii for groups.
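
The relation between a van der Waals' radius and the area of the solvated sphere is simple enough to check directly. The short sketch below (the function name `solvated_area` is hypothetical) computes 4π(r + 1.4)² and reproduces the tabulated areas for oxygen and non-aromatic carbon.

```python
import math

SOLVENT_RADIUS = 1.4  # Å, radius taken for water

def solvated_area(vdw_radius, solvent_radius=SOLVENT_RADIUS):
    """Surface area (Å²) of the solvated van der Waals' sphere."""
    return 4.0 * math.pi * (vdw_radius + solvent_radius) ** 2

print(round(solvated_area(1.4)))  # oxygen, radius 1.4 Å: 99
print(round(solvated_area(2.0)))  # non-aromatic carbon, radius 2.0 Å: 145
```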
Atoms are classed as polar or non-polar. Hetero-atoms and carbon atoms bonded to two
or more hetero-atoms are considered polar and all other atoms non-polar. Charged atoms
are polar atoms that are part of an ionizable group, e.g. CD, OE1 and OE2 in glutamyl
residues and ND1, CE1 and NE2 in histidyl residues.
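
The polar/non-polar rule can be written as a small predicate. This is a sketch under assumptions: the function `is_polar` and its `bonded_elements` argument (the elements of the heavy atoms bonded to the atom in question) are hypothetical, and the `CHARGED` lookup holds only the two examples named in the text, not a complete list of ionizable groups.

```python
def is_polar(element, bonded_elements):
    """Hetero-atoms are polar; a carbon is polar only if it is bonded
    to two or more hetero-atoms. Hydrogens are not considered."""
    if element != "C":
        return True
    hetero_neighbours = sum(1 for e in bonded_elements if e != "C")
    return hetero_neighbours >= 2

# Charged atoms: only the examples given in the text (partial, illustrative).
CHARGED = {
    "GLU": {"CD", "OE1", "OE2"},
    "HIS": {"ND1", "CE1", "NE2"},
}

def is_charged(residue_type, atom_name):
    """True for polar atoms listed as part of an ionizable group."""
    return atom_name in CHARGED.get(residue_type, set())
```

Under this rule the CD of a glutamyl residue (bonded to CG, OE1 and OE2) and the CE1 of a histidyl residue (bonded to ND1 and NE2) come out polar, whereas an alpha carbon (bonded to N, C and CB) comes out non-polar.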
Ninety-two points represent the surface of the solvated sphere with sufficient accuracy
for this type of calculation. Decreasing the size of the set from approx. 400 to 92 changes
the percentage of total surface area exposed to solvent by less than 2% (with respect to
the total surface of the solvated van der Waals’ sphere) for 90% of the atoms of lysozyme.
The set of 92 surface points has 5-fold symmetry about an axis parallel to the z-axis of
the Cartesian co-ordinate frame. Table 3 gives the atom contacts for the two Zn2+ ions
of the insulin hexamer. The three dimer units of the hexamer are related by a 3-fold
symmetry axis coincident with the z-axis. Thus, a rotation of 120° about this axis results
in no significant change in contacts (the differences correspond to ± 2 surface points at
most).
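
To put the granularity of the 92-point set in concrete terms, the sketch below (hypothetical function names) converts a number of test points into an area and into a percentage of the solvated sphere. It uses the zinc radius of 0.74 Å from Table 2 as the example.

```python
import math

def area_per_point(vdw_radius, n_points=92, probe=1.4):
    """Area (Å²) represented by one test point on the solvated sphere."""
    return 4.0 * math.pi * (vdw_radius + probe) ** 2 / n_points

def percent_per_point(n_points=92):
    """Share of the sphere surface represented by one test point, in per cent."""
    return 100.0 / n_points

zinc_point = area_per_point(0.74)   # about 0.63 Å² per point for Zn2+
two_points = 2.0 * zinc_point       # size of a 2-point difference in Å²
```

For zinc, a difference of two surface points therefore amounts to roughly 1.25 Å², or a little over 2% of the solvated sphere.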

TABLE 3
Contacts of the two zinc atoms within the insulin hexamer

A. Zn2+ (0.0, 0.0, +8.1)†: Total area exposed to solvent in hexamer = 0.6 Å²

                    Extents of contact‡
                    Dimer I    Dimer II    Dimer III
                    (Å²)       (Å²)        (Å²)

B’10 His CE1        6.6        6.0         6.6
B’10 His NE2        6.3        7.6         6.3
B’10 His CD2        6.9        6.9         6.9

B. Zn2+ (0.0, 0.0, -8.0)†: Total area exposed to solvent in hexamer = 8.2 Å²

                    Extents of contact‡
                    Dimer I    Dimer II    Dimer III
                    (Å²)       (Å²)        (Å²)

B10 His CE1         3.8        2.6         3.8
B10 His NE2         10.6       10.6        11.3
B10 His CD2         2.6        2.6         1.9

† Cartesian co-ordinates given by Guy Dodson (personal communication).
‡ Area of zinc atom occluded by protein atoms.

The input for a computation consists of the Cartesian co-ordinates of all atoms heavier
than hydrogen and for each such atom the residue number and a designator of the atom
type†. The co-ordinates for the tetragonal lysozyme crystal structure were obtained from
D. C. Phillips and colleagues (Blake et al., 1967; Blake, 1967) and those for the rhombohedral
2 zinc insulin crystal structure‡ from D. C. Hodgkin and colleagues (Blundell et
al., 1971). Both co-ordinate sets are those that were current in 1970 and both had been
refined by the method of Diamond (1966). We emphasize that the co-ordinates used in

† The atoms are defined as by Imoto et al. (1972).
‡ The two zinc atoms were deleted in all calculations except in that giving the data in Table 3.
[Figure 1 graphic not reproduced. Recoverable labels: vertical axis "Area exposed (Å²)"; panel "Lysozyme".]
TABLE 4
Exposure and environment results for lysozyme

0 23 7 CG c 28 28 CO 27 22 5
: ZL 20 co1 c 36 33 CG 2 31) 5
32 lb CEI 6 32 k3 CO 24 31 A
c 35 36 cz 22 17 36 HE P 29 10
0 26 29 CEZ 13 27 26
4 22 29 CD2 6 26 26 C2
HH2 322 23 :
14 IO 33 WI 28 22 23
6 10 40
6 *s 43 4b LSN 73 zia 61
0 26 ** CA is 30 A
ib 12 29 N 0 15 15
C t ti 9
38 ibb 136 0 * 26 i7
Ll 52 14 CB 28 32 2
* 26 16 cc D 22 Q
c 21 21 NOD2 26 i8 7
0 21 *9 Nom 2 47 3
35 32 **
2 14 34 36 SE9 7 206 195 47 ,“R 151 36 Jb
CA b 4Ir 36 CA 8 13 6
3 178 177 N 02618 N 0 t *c
[I 32 38 C 0 I3 6
0 29 ZD 0 26 9 0
G 16 22 C9 lb 0 6
6 16 32
* *4 61
0 24 71
1s HIS 43 2LP tr9 2 22 H
CA 9 27 ib c 16 73

50 SER 3 232 ii?


CL c 52 22
N G t3 15
C c ** 16
Ni 10 13 2 0 ‘ 27 19
cz P 40 6 CR 2 66 33
N”2 28 36 7 OG 1 S2 6
NH1 0 k9 14
51 THR 3 275 194
6 CYS 52 97 103 CA 0 49 32
CA L lk **
N i 9 23 N
C 0 :: ::
C c 15 17 P 3* 27
I!* c 54 32
r&l 3: 163 *b
II CGZ 3 36 55
SG 13 39 4 OGl C 33 19

7 CL” 82 131 233 52 AZP ** 276 209


CA 11 24 3* CA 0 49 36
,” 01 :3 *2
22 29 “AL N ‘ 17 2s
CL C G 28 37
COR 1 ib
16 24
30 N
C
CC :: 9 27
CD 2 8 25
OEt 15 16 25
oe1 17 7 75
53 T”ll 23 232 506
CA c 25 rc
N 0 24 15

31 AL1
CA
9 ALA c 119 t35 N
CA 0 33 58 c
N 0 PA 23 0
C G 16 33 CB 54 GLI 2 163 126
0 0 Ii 43 CA P 57 $9
CB 0 3t 77 32 ALA
CA F C
0 Zb
32 29
23
N 0 b 27 26

0 i lb 39
:z* 0
A 166 77
56

CG1 16 7b
CDI : 35 77
NH2 33 ib 17
NH1 JC 10 li 58 LE” 0 98 *35
CA A I7 5*
22 H 0 13 PA
GLY
CA :: ‘A
27 21
bo C A i7 30
N * 25 9
c 0 i5 20
0 P9 li II
I7 tl.r 19 26.3 223 ooz 0 31 40 2 ii 36 Pi 0 5 b
CL D 32 21 DOi a 22 z5 0 2c 56 C 0 10 9
II 0 Ll zi CC1 0 54 0 0 SO I3
67 GL” 76 6b 31 co1 4: 0 47 GE b 9 30
CA 39 32 6 ct 1 0 0
c" 0 29
8 14
3 79 PRO 79 ii9 iu CB NO02 33 b 3
CI 0 28 30 ot NOD1 39 0 a
0 36 0 7 II I) 721
c i 21 is 9* “IL II151 357 104 GL” 12 91 iP6
68 A:; 120 262 91 0 0 lb 30 CI 0 25 47 CA il 36 S3
is rr 9 co 25 27 il : 0 22
26 L3
IZ H I 20 lb
5b ILE 5 177 323 F 0 i6
is 15
0 co 1.7 b 39
CI 22 CC 4* 6 22 0 0 16 36
CR
& 2,
7 10
i? 27
0 (10 cvs 0 ii3 255 CGZ : :: ::
CG 3 43 3 CI 0 24 47 CC1 0 13 90
co 19 33 3
NE 6 21 3 : 0 IO
17 31
13
CZ 3 25 5
NH2 25 23 21 EB 0 13 40
.?i 60
H"i i6 36 5 SG c 29 55
59 ASN 33 237 ib9
GA 0 32 32 69 THR 3 239 221 6i SER 75 7z 135
H I 17 23 C1 0 25 33 CA Z 16 36
N 0 14 i* t 13 21)
C 0 17 25 c" 0 16 16
0 1 er 21 &I 49
6 14
8 19
27
CB 0 *9 38
ct2 P 49 56 OG 16 5 13
OGl 0 31 27
82 ALA 33 134 137
70 PRO 131 54 61 CI ib 35 28
Cd 13 3 9 N 0 34 6
N 0 5 9 C 0 16 17

ifI 16
1 21
26 63
PO 95 ALA 0 114 247
CA ‘ 32 62
83 LE" 0 240 3t9 N 0 26 t*
CA : 36 35 C 0 a 29
61 ARG 84 176 322 0 G 14 *4
CA 0 2c 47 ! 0 22
25 10
17 CE L 22 88
N t zi to ,"B 0 32 24
c 0 825 96 LIS b, 252 340
c C 16 14 CG : 4: ix CA 0 39 47
0 9 zz 19 co* 0 38 68
coi C 27 68 ," c
f 28
25 28
16
72 StR 3t 159 tot :9 62 33
22 39
30
CA 0 36 17 84 LE" 22 206 268
N F 23 14 CI 0 2.8 25 CG L 27 5r
C 0 7 13 N 0 26 10 CO I3 zc 49
0 21 7 11 C G 15 14 CE 5 31 60 CEI i 9 4b
CB 11 49 2, NZ 21 26 17 cz1 IO 74
OG D 36 20 :, 2
9 *i
26 22
CG G 33 49 119 169 146 c":, :
0 17
LO 71
69
73 ARG 147 126 102 co2 30 16 51 3 Pf+ 30 CEZ 0 13 4"
CA 13 14 2, CO1 0 38 58 CO2 Y 3 30
N 0 16 13
C G 9 26 85 SER 61 1~7 76 109 “AL 99 135 74
0 0 26 26 C1 17 ,?d 9 CA 3 25 16
:," 259 14
13 li
30 c 2* F 0
1 20
lb I73
c" 15 ;
CO 14 1r 24 0 : 19 22 0 0 20 21
z: it230 6 D CR 39 27 17 CB 13 6 0
06 4 33 it CG2 36 17 9
NH2 55 0 0 4b ILE 12 100 320 CGl 46 25 t
NM 17 6 24 b6 SER 69 50 164 CA 0 25 33
CA 2 16 41 N Li 23 15 110 ALA 11 133 184
74 ASN 47 165 119 N i 11 10 c c I7 13 c* 3 32 49
63 TRP 43 zro 559 CA 0 22 27 c I225 4 26 15
CA 0 33 35 N 2 3 i6 . &I c 21 41 F 0 17
2.3 29
21
C D 17 13 CG2 6 14 6Z 0 ‘ 20 30
," : :: :," 0 16 20 9 OG 21 I* 15 CGI 0 25 62 CLI L) 38 55
0 35 26 C4 24 13 36 COi z 2b *I
CB : 39 36 CG 0 23 25 iii TRP iZ 374 546
CG 0 20 39 99 YAL 3 157 316 CA 0 39 30
CD1 3 12 63 CA 0 33 32
NE 8 0 45 H c 20 23 2 0 24
I.¶ 16
25
CEI *: 0 66 75 LE" 9b 107 168 c 0 15 15 0 ! 30 19
CZ1 1 50 CA 3 is i4 0 3 ib t9 c"G" 0 26
27 36
52
cn 10 14 50 N 0 26 6 co II 25 66
CZ2 ‘ 19 50 C L 13 14 CGZ 0 28 65 CO1 0 36 48
cc.2 c 20 38 cti 0 17 87 2:1 D
C 22
25 46
36
co2 E 9 38 COB 112 14
11 20
3s
CG 6: II 26 bb ILE 5 122 315 100 SER 34 202 IOt Cl1 9 17 56
64 t"S 0 i4t 177 CO2 0 2 CA E 2, 21 CA 0 k9 22 cn 3 32 53
CA b 36 211 CDi 6 16 5t N 0 16 3 N 0 24 lb CIZ 35 b.¶
C 0 16 ti CEZ : 25 42
F c
0 17
23 23
16 76 CYS 20 113 183 0 0 16 30 COP 0 19 29
0 8 20 16 CA 2 25 36 CB 0 2 4i
CR 0 25 43 N 0 26 e0 CG.? 0 it 71 112 ARG iO.? 267 237
SG 0 2; 50 C 0 21 23 CGi 5 *7 44 CA 0 35 38
0 12 3 31 CD1 0 17 84 101 ASP 108 80 46 N 0'24 *a
65 ASN 57 162 171 CB 6 16 31) CA H 22 6 C ‘ 20 i4
CA 0 39 33 SG 6 20 40 89 THR 6‘ 201 124 N 0 23 .3
N 0 30 is CA D 35 35
N 0 24 17
C : to 14
0 2, t*

:z 4: :: ::
OGl 7 31 7
ioe GL” 6i 104 ti
66 ASP 35 181 171 90 ALA 23 62 191 CA 30 *3 6 113 A;: 106 98 140
CA I3 30 17 CA 9 17 46
N 0 24 9 H 0
0
10
i,
30
z5
c" (I
0 16
29 9
3 N 6 :: ::
B
C 1lI 3 C cl 31 14 0 c i ii I3
0 II 5 iz 0 t it 39 0 32 b 7
3 N i i6 ii CB ii 25 56 103 PSN 8, 69 66 21 16 27
St 0 :: :; c
0
0
6
16
*9
30
25
CA 6 6 13

24 43
35 36

The values given for each atom are from left to right (columns 2 to 4): the area (Å²) exposed to
solvent, the area occluded by polar long atoms and the area occluded by non-polar long atoms,
respectively. The corresponding sums of these areas over all atoms of each residue are given on the
first line of each block.

this paper are preliminary and are being refined in the crystallographic laboratories. The
effect of uncertainty in the co-ordinates is discussed in the Discussion (section (f)).
All computations were carried out on the CDC6400 computer of the University of
Arizona Computer Center except for preliminary work, which was done on the Argus
system of the Laboratory of Molecular Biophysics of Oxford University†. Run times on
the CDC6400 were approx. 5.6 and 4.2 min for lysozyme and insulin dimer, respectively.

3. Results

(a) Exposure and environment of atoms
Individual atom data for native lysozyme and insulin dimer are presented in
Tables 4 and 5. Comparison of the values of column 3 (area in Å² occluded by polar
long atoms) with those of column 4 (area occluded by non-polar long atoms) gives a
measure of the polarity of the environment within the folded protein. The values of
column 2 (area exposed to solvent) can be compared with the areas exposed in the
Gly-X-Gly models for the unfolded state that are given in Table 1 in order to see the
effect of folding on exposure of an atom or residue.
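
The two comparisons described above reduce to simple arithmetic on the tabulated columns. The sketch below uses hypothetical function names and invented numbers for a single atom; the values are for illustration only and are not taken from Tables 1, 4 or 5.

```python
def polar_fraction(polar_area, nonpolar_area):
    """Share of the long-atom contact area that is polar, or None if no contact."""
    total = polar_area + nonpolar_area
    return polar_area / total if total > 0 else None

def change_on_folding(unfolded_area, native_area):
    """Native minus unfolded exposure; negative values mean burial on folding."""
    return native_area - unfolded_area

# Hypothetical atom (numbers invented for illustration only):
native_exposed, by_polar, by_nonpolar = 5.0, 10.0, 30.0
unfolded_exposed = 40.0
print(polar_fraction(by_polar, by_nonpolar))                # 0.25
print(change_on_folding(unfolded_exposed, native_exposed))  # -35.0
```

A polar fraction well below one half would point to a predominantly non-polar environment, and a large negative change would mark an atom that is mostly buried by folding.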

(b) Exposure and change in exposure of backbone and side chain elements
Figure 1 shows the area exposed to solvent for the backbone and side chain of each
residue of native lysozyme and insulin dimer. The values plotted are summations
over the appropriate atoms of the results given in Tables 4 and 5. Graphs of this kind
are a convenient way to present the changes in exposed surface area that follow from
association reactions. Figure 2 shows the changes developed through binding of the

† During tenure of a Special Fellowship from the National Institutes of Health held by J. A.
Rupley.
[Figure 2 graphic not reproduced. Panels (a) and (b); horizontal axis "Lysozyme residue number", 10 to 120.]

FIG. 2. Changes in area exposed to solvent for backbone atoms (--o--o--) and side chain
atoms (-•-•-) for (a) binding of N-acetylglucosamine hexasaccharide to lysozyme; and
(b) incorporation into the crystal lattice.

[Figure 3 graphic not reproduced. Panels (a), (b) and (c); horizontal axis "Insulin residue number", A10 to A'20.]

FIG. 3. Changes in area exposed to solvent for backbone atoms (--o--o--) and side chain
atoms (-•-•-) for (a) association of insulin monomers to the dimer; (b) incorporation of
dimer into the hexamer; and (c) incorporation of hexamer into the crystal lattice.
TABLE 5

E 3 LSH 75 53 233 ‘ 24 *6
C. fi 6
23
23
33
25
,” IA 36
ii c 0 : 1.3 34
CB 9 21 73

co9 :,’ t,” :: 815 LE” G 192 404


C1 33 51
Nom
CG 2: i :: z : 25 23
NOrI 5 i6 25 3 22 *I. et6 TYR 6 07 647
Cl 14 r9
* * GLH 77 ii6 t23 H : 1* 30
CL 5 I.9 *r c 13 2*
ai5 GLH 64 263 12.2 rt Z% 0 : i6 3k
z : 16 2r ce 0
N
CA 2 26
43 ,: 0 c 22 32 CG 0 : 2:
C : 29 i, CB *2 e* to1 c 7 59
A s GLU 65 163 170 0 6 29 26 CG 2; li 2, CEI c 6 7%
CL 2 26 25 CB C 46 13 CZ 0 ‘ 72
N E LB 26 CG 33 30 9 NOEl
CO 30 : 2: CEZ 0 0 76
t i i9 16 CD e 13 10 “DE2 4, 0 13 CDL 7).
0 26 20 HDEi 1G 22 PI OH : % 56
CB ,: L3 21 WE2 17 26 13 9 5 “IS ii.9 139 19.2 CG 0 7 Lz
CG *t I* 6 CA ‘ EZ 32 cm 0 I2 59 BZT THR 33 64 259
Lib LE” 6 165 420 cxi 25 46 CP 0” 13 9*
::i 2: :: :: CA 0 26 46 z s2 131 1.3
25 c2 E 12 33 H 9 39
OEZ L 3G 22 N 3 20 I6 CE2 27 ii 26 t c J 26
,“6 :: 213 II16 -co* c 13 27
6 5 GLN 66 293 i29 : F :: ti Cd 3 19 I* OH ** 7 2% :6 6 26a L5
Ia?
CI 5 *9 22 NOi i 29 10 OGI *: 16 i6
N 0 20 is “0” ‘t 25
26 41
76 CEI 25 32 29 CG2 ib 21 51
co* *4 90 NEP li
: 6
G 41
43 ii
I1 cm : 22 62 CO2 29 : i6G E2e PRO It 167 269
CB 2 49 14 CA c 17 c*
CG 14 39 21 Ai, GL” 79 266 153 6 6 LE” 22 125 329 : 6 11
CD c eo 5 CI 0 4* 32 CA 3 LZ 32 : is 16
NOEi 5 24 26 F G 26 23
NOEZ Q3 9 5 ‘ 29
0 5 20 ,:
16 CYS c 125 242 St” e 33 32

ii" : :: 4i9 CD ‘: 262 253 829 LIS 166 99 125


: 0
c 20
22' Z6
20

CB G 16 71 6 7 cvs 57 31 73
SG u 13 611 C1 2 22
N * i i.3 CR 9 16 II
* 7 % 25 is* 11.3 ,” 0 26 13 c c 3 I.3
0 39 22 0 i6 15
N
c
0 29 13
i; 2.9 11
EC4 17
17
14
21
16
21
& t'
. 30
*c 32
19 CG 6 13 7
HUDl 22 13 20 e 6 GLY 36 .66 122
56 25 9 16 NO02 32 13 7 CI 32 29 41

6 6 THR 131 i63 68 A19 TYR 26 235 466


CL D 33 a* CL k 19 36
N

s 56
9 30

29
23
9

102
z
0
5 :: :;
0 16 34
e 9 SER
CL
36
c
6’4
lb
196
*9
920 GLY
CA
3G
26
Ok
36
35
96
26
II
ce 16 I7 17 ce 0 21 56 F Y 9 16 A9’ i GL” 76 60 93
mt 15 16 I5 CG 6 14 *5 0 c 9 3, CA 36 2, 28
CC2 65 14 6 CD1 6 i” 55 NEW 23 13 ill
CE1 1‘ ill 93 Btl GL” 131 6t 96 C c 15 *2
A 9 SER 89 72 89 C?. 6 23 36 0 17 7 25
CL 5 21 21 GEE I 36 45
CO2 t .16 b2 619 HIS it9 193 121
: 0 20
1C iI21 OH 6 32 *9 CPI 3 35 22
0 1: 7 26 H c 9 15
CB 36 14 13 620 CIS 16 155 163 CG 25 3 5 c I *c 21
Ot 36 6 0 CA ‘ 36 35 0 * 19 26 CD 5 0 0 0 t 2, 25
16 33 16 K2 02 13
14 **
35
6lJ ILE 64 232 ii6 ! 6 ii21 ZP
20
CA 0 *4 it 0 13 9 24 CGl c 24 69
CE1 56 12 co* 9 33 69
: ; :: :
A- 3 “AL 21 i35 266
&I 9 24
26 1: CL 2 I* 38
N c 9 26
::: 260 2I
33 =c
33
COI 47 17 30

6%’ “E: * 160 136 ctt i* 66


0 39 19 cc2 i”, 32 4*
N C 29 ii
L. 4 GLU 67 136 279
: 6i 2+
32 13
29 CA 3 ic 39
CB 6 26 35 823 ‘LY S 93 165
Et 6 2, 3.¶ 6 1 PHE 176 25 EC2 CI ,” 36 63
CA 13 P 1,
z 22
16 31
3i
0 ii 16 *o

75 206 i6i
9i3 GLU 51 172 255 3 35 25
CL 6 25 43 N 0 26 Li
: Y? 29 29
c” *6 24
26 26
23 2-4 22
CEI 0 9 75 C6 29 z-8 a*

CG 11 20 22 c i IO 23 8.13 GL” 216 193 36


co 0 b 14 0 12 1:. 26 CA 35 36 23
NW1 21 12 11 CB 25 5 39 N 29 17 2i
NOE2 16 22 20 CC 3 3 I7 c 26 10 Z"
NODi 17 2 2* 0 26 27 37
A- 6 c*s 6 179 175 1116 LE” 0 193 309 NOD2 23 3 36 CB 25 20
CA P r9 19 CA C 26 39 CG Ir 41
1‘ N L 26 14 13
E! L
L
**
29 17 C b 22 16
R’ 4 GLN
C&
55 I22
9 13
273
3c
CD
OEl 311 17
6

0 0 26 28 H OEt 17 2
CR 0 33 hi &I c 25
16 21
49 I: :. :: :,”
SG G &* 59 CG i 27 36 0 : is 32 8.14 ALA 29ll
CB i 19 32 CA 42 P-24

Ei
WE1
0
25
2,5
1
4,
23
14
z ::

NOE2 2c 3 43 c”9 ::
9’15 622
49
20
24
25
i 54
c 76
c 96 8.25
79

38,
35
24
15
29
5L
39
39 ra
AVi9 7*4 62 192 355 1: 12 30 42
CA 6 22 25 3 22 2;
,” i 1s
14 11 0 29 30 a62
L 21 1 10 53
0 19 20 El’ 7 CYS 5L 50 99 tr 16 it
C’I : 20 36 CA 2 5 19
CG c 13 26 N c 2 26 LEV 126 127 tzli
co1 6 23 23 C 614 CA 5 P5 Ilr
Gil 19 1* ** & 32t 14c 25 N 25 15
CI 3 3 39 C 14
CE2 3 1" 56 SG 13 23 : 0 * ,"
C"2 it *II CB 9 9
OH *; 9 2(' CG 21 I]
46 16 25
30 13 35

39 lY7 159
9 21 6
17 15
1* 2
A.12 SER 2: 171 166 SG ‘ 38 39 5 T
CA in 39 35 32
N L 29 14 A*21 A'S!
CA
*3
16
5:
66
N c 24 18
C 2 16 13 96 119
OEND iG 1, 25 19 33
OCND'26 25 * N 15 2"
.C6 32 22 6 c 0 2 11
CG I ii 23 0 10 6 2*
NOOi 1 i7 39 L k2 43
NOW 16 8 22 :G” 0 32 49

GL" 27 I3 102
CA 27 27 33
‘ 22 2E
1,
:: 32
Wll LE"
CA
3 169 491
i 32 *, GL” 24 79
& 11

N c 29 22 CA 6 2i CG 2:
C 24 25 6 CO 21
0 s .21 29 6 CE de6
CR 2” 9 13 CB 3 28 PI 3 llz 33
CG 3 9 CG L 13 62
co1 19 4 6 CO1 0 ii a7 ALA 113 61
CEI 33 4 14 9* 2 "AL 122 38 73 CO2 * li 88 CA 2 16
cz 7 : 26 CA 5 3 9 H c 6
CEZ I, 6 2, N i 0 13 c 3
CD2
OH
ic
28
0
J
33
22
C 0 6 10 OENO 1: 13
0 2 12 34 215 DEW 3* 5

A*15 GLN 85 205 176


C9 II * 0 CA 25 co 66 1, *
CGI *7 16 N 17
CA 16 bi 14 cc2 55 c c
L 22 ik 0
z i 25 13 8' 3 ASN "2 44 22, 03
0 2 21 25 CA 0 A 39 CG
CB 6 25 30 N i 2 23

The insulin monomer consists of two polypeptide chains, A and B. The polypeptide chains of
the second monomer unit of the dimer are distinguished by asterisks. See Table 4 for additional
description.
TABLE 6
Contact information for lysozyme

1 L”S... 127 CIS 13 33 L”S 1*0 I* LE” I2 53 TlR... 57 GLN i7


3 PHE 110 ID ALA 30 CIS 76 *o THR ii 51 TMQ 131 se SIR IO
24 SER 6.6 123 TRP : 123 TRP b3 53 ,“R IO 66 AS"
119 PLL 29 *c 1% 7 “,: E:: 53 TV9 :
*: ::i? If 19 ALP... 31 ALA *6 52 ASP 3
is LIS 52 115 CIS 12 ;; ::: 60 SER...
“2’1:: :: 129 LEU 49 32
36
“LA
SER
9
7
lr3
8.6
,“Q...
ASH 57
50 ILE b6 69
6*
,“R
CIS
63
63
s1 GLH 27 I G‘” 49 13 LE” 5d
14 ARG 33 35 GLU 53 ,“R 47 $3 THR 5b 5i THQ 60
34
38 LEU
PHE : 6 CIS zi iii TRP : 52 ASP 3i 60 SER *3
I) LE” 19 14 ASN.,. LO TRP. . . 37 ASH 2 lb LE” 39 It ALL 17 :: ::: ::
2 “AL... 9 ALA 17 22 GL" 67 17 LE.” 198 6% GLN 29 90 SER 23
39 ASW 66 i2 MET A2 105 “ET izo 35 GLU... 51 THQ 21 :: ::: :: 53 TVR El
3A PIE 63 ii ALA :: ::: :: 12 WE, 99 57 GLN 96 22 ALA ** 52 ASP 10 II SER 23
i L1S 59 ii?6 ARG : *iI SEll 44 t5 LE” 60 ii” ALA 62 51 GLN 16 60 ARG 9 60 C”S z1
3 PM lb 25 LEU * 17 LE” *e 31 ALA 66 56 LE” 2
Zi ARG 12 :: ::: :: i66 ,RP 50 44 1SN.B. :: fP’ *A
“I ::: ix 4 ii ALA... 25 LEU 99 “AL 53 32 ALA 46 9* ASP 9i Sr. GLY... 65 ASN b’
15 HIS 55 PO ,“Q : 3i ALL 51 56 LE” *A 51 THR 41 56 ILE 5
3 PHE... 6 LEU 51 2.0 ,RP i 27 AS” 35 109 “AL 23 57 GLN 43 :I ::fl :i 6i ARG 5
38 WE 129 14 ARG *4 56 LE” 35 36 SEQ *I 46 ASH 26 42 ALA 31 13 AR6 *
61 ILE 36 ** SEQ 31 33 LIS 11 *3 ,HQ 24 *o ,“Q 30
: ::; ::; 7 GLU 32 to.9 TQP 30 3b PHE 16 45 ARG 9 63 LEU 25 61 AQG,..
7 CLU 1)& 95 ALL 26 55 ILE 12 42 ALA 7 56 LEU 21 6Z ,RP 16i
WI ILE 45 ‘i ::i :: 32 ALi t. 37 ASN 7 35 GL” 5 36 SER LO II GL” 94
40 IHR 36 13 CYS ib t9 VAL 19 44 ASN 56 SEQ t 53 ,vTR 14 69 ASS SO
** ALA 33 9 ALA i* 26 GL” 15 30 CIS : 39 rsn ii 69 THR 46
55 ILE 30 iZ MET 12 JO c*s 6 45 ARG... 55 ILE LO ,t IL4 39
2 “LlL 26 18 ASP * 36 SEQ... 611 AQG 1** *I ASP 36
12 WT... 96 LIS .3 55 ILE 75 51 THQ 76 4i
91 GLN
SER :
““5 :‘R: *’ 2 17 LEU 127 21 APG 1 39 ASN 72 C9 GL” 59 :: 2: ::
(I LEU 92 32 ILL 52 5E SCQ 30 55 ILE...
rWI... *I ,RP 79 29 “IL... 37 PlSW 8.D I+6 1% 30 4C ,“R 92 “,: 7:: lb I
7 GLU 77 IA ILE 47 i23 TRD 56 $2 ALA 36 kr. ASN 2? 36 SLQ A4
A LEU 56 I5.HIS *c Zi ARC... 25 LE” 50 57 GLN li 3* LLA 67 6Z TPP...
36 PHE 35 9 Lt.9 36 180 SE4 120 26 GL” *9 38 PM 51 56 LIU 62 61 AQG 1%
3 PHE 30 25 LE” 3t 20 TII 73 3* PHE 42 39 GL” 24 6 LEU 49 73 APC 105
6 CIS 19 92 “AL 26 23 TIP 63 6 LE” 36 33 LIS 19 39 ASN 44, 63 TRP 103
* VAL 12 29 VAL 22 99 VAL 60 9 ALA 35 54 CL” 1* 88 ILE 39 75 LEU II
5ARG E 56 LEU ** 19 PSH 22 12c *LE 33 3* PHE 7 38 PHL 26
10 ALA ZD 132 GL” 2 3t ALA 3P 3 PHE 21 :; ::t ::
5 ARG... 12 llki 31 3, AM... 91 SER 20 h* C”S 9
i.?, ,RP iv9 :: ::: I9 7 22 ‘L”... 33 LVS 29 33 LIS 106 5, GLH 16 ,* ACN
3A PHE 83 *+ 1QC 7 19 ASH II 5 diRG P7 38 DUE L, :
PO “IL 52 16 GL” 23 ,“Q 3* 26 TPP *o 39 PSN 47 :: ‘;:“u :i :: ::s 1
i*: t;;z :; 13 LIS : 20 ,“R 19 31 ALL 15 36 SE4 19 92 “AL 72 SLR 1
II ASP z *+ SER II 32 hi.4 t7 47 TWR... 12 RC, :
i.22 ALA 30 55 ILE i 21 1RG 3 :: tti it 2 2 “LL 15 56 4SN ** 53 TIP i 63 ,PP...
0 LE” es 35 GLU 5 41) ASP 24 91) 1I.E 117
9 ALA 25 13 LYS... 23 TIQ... 30 C”S... 49 GL” 24 56 LE”... 6Z ,RP if2
7 GLO i7 119 LE” ID1 iP5 NE, iE9 123 fRP 8I 36 WE... 106 ,QP iPi 76 CYS 95
izr ILC 15 i6 ASO 60 *I ARG 76 34 WE 50 3 PHE 133 411 asp... 95 ICE *A
25 LE” 67 20 tva 66 33 LYS i*s 50 SEQ 37 56 ILE 67 :; ::,” ::
,*:
126
5:: i4
Cl.” :
10
9
ALL
&LA
sr
II
ZLI
19
TPP
&ON
bi
57
::
26
!i:f,
GL”
47
::
5
II
4%
LE”
83
70 “*: ::“, ::
t8
35
TQP
GLU
43
37
56
w
ILE
CIS
65
37
i6 GL” 30 27 ASN 56 29 “AL .?* t “AL 6‘ Ir7 ,HQ *I 95 ALA 35 9L CIS 36
15 “IS 26 ill TRP 55 i*o YLL 20 55 1l.E 61 69 T”Q r 54 CL” 26 9, LIS 3’
6 CIS...
9 ARG 66 ii ALA 19 104 GL” 5* 32 ALL 13 29 “AL bt 71 PPO 3 91 SE9 25 7c ASH 25
12 ME, 10 99 “AL *1 illr A46 10 3* ILL 25 inr ASP If
i7 LEU 13 25 LEU P6 ,w 16 i; ::: :: 49 GL”... 12 WE, *t 76 ILE ic
ir APG * 106 AS4 t si ALA 5 4 GL” 3i 45 ARG 47 5, GLN 19 66 SER 7
17 LE” 4 tI6 THQ I AZ3 ,SP 16 69 ,HQ 36 a2 “AL 13 73 AR‘ 5
24 SEQ 3 40 ,HQ AC 46 ASN ZL 3i *LA H 77 ASN *
ill ASP 3 31 ALA... 39 15* *7 ,H9 16 53 ““9 9 63 LEU 2
I03 ASH i 35 GL” 55 i LYS : k(L LSP 15 1, LE”
37 ASN 6 51 THR 14 36 SEI : 6L C”9.e.
129 LEU tr SER... ‘:: :;r :: 6.9 ARG ii 105 IlET 3 60 SEQ 63
123 TRP : 27 ASH I’Jl *I ASH 33 39’ 9SN. . . TG PRO 14 P.SN 21
19 AS” 47 115 evs 27 2 “AL 13.4 50 SER 6’ 57 GLN... 73 ILi 46
I GL”... 26 GL” 36 Sk WE 23 42 ALA 75 52 ISP 189 63 LE” 3s
c GL” 9F 15 HIS... ZLI TRP 30 56 LEU 23 36 SE4 71 50 SEQ... 63 ,I)P 39
3 PHE 07 I* AQG A5 1*0 “IL 26 105 HE, 17 41 GLN 6C 41 LSP 76 :: Kt :: 59 ASN 23
* L”S 49 92 “Al. 75 111 LSP 16 33 L”S 16 55 ILE 43 46 PSN 66 52 GL” 5, 6C c*s 19
IO ALA ‘4 22 GL” 12 29 “AL 15 31 ASH 35 69 THQ 54 WI LSr( 9* 53 ,“Q 13
11 ALP *L) :: ::: :: 30 GIS 1* 1 LIS 3* 59 ASN 5i 79 PQO II
6 CIS t7 Be ILE 55 ‘:: t:“, a29 ii9 AL& 1.I 41 TYQ 14 60 SER 26 ;: ::: :: 53 ILE
5 AQG 20 12 MET 53 124 ILE 6 *03 TRP * 5‘ GL” 5 51 THR 25 56 ILE 14 66 ASP s
9 PLP i(r 13 L”S 31 25 LE” 6 32 ALA 6 36 PHE 5 45 ARG Z6 10s TRP ** 61 ,QP 5
B LE” 7 96 L”5 27 26 GL” 1 61 &RG 19 *3 THR i* 73 AQG b
i7 CF.” 23 25 LEU... 40 THR... 52 ASP i 59 ASH 12 65 ASN 3
8 LEU... 67 ASP 7 16 ASP 113 32
55
ALA...
,LS 6*
55
1
ILr.
L”S
A7
o* 51 ,“.=.a..
55
56
ILC
LE” 7’
9* CIS
72 SE* :
3 PHE 120 93 ASN 3 it+ ILE 65
ie ME, 89 *a “RP 85 56 LE” IelI 112 LE” 79 53 TIP 1?6
1% ALP 55 16 GL”... 9 ALL 51 35 GlU 40 54 GL” 3* 115 ARG 62 56 ILE... 65 ASH...
55 ILE 47 16 ASP 39 29 YAL r9 36 SEQ 36 39 ASN 32 60 SEQ 57 A3 LF.” 67 7, Aft4 64
aa P”E 97 *o ,“R JZ 13 LIS rr 3S PHE 56 3 PYE t7 44 1SU rs 79 PCD 59
29 “LL 46 13 LIS 10 ** SEQ 36 29 “AL 32 86 SE9 26 63 ADG k3 z: ::z :i 66 C”E 59
5 GLY 41 96 LIS SD 129 LEU 26 23 ,IP 30 65 SER 26 50 SE9 27 46 LE” 60 67 ‘L” 53
5 AQG 37 14 ARG 19 12 *i* ** 31 ALA .?* 06 TLE *r I.9 CL” 27 63 IRP 5s 79 SER 34
32 LLP 25 12 “ET ib 27 ISN 14 6 LE” 17 (13 LEU 22 59 ASH 25 94 CIS 41 66 ASP t6
7 GL” 22 23 TYQ ID 30 CIS 16 42 ALA 21 52 ASP to 91 SER 36 76 XLE 22
68 ILE 19 :: ::: I3 7 17 LE” 9 37 ASN li 41 GLN 16 I.6 LSH 17 v* ASP ** 6L CIC %Z
6 CIS it, 19 ASN * 34 PHE 7 3A WE is 66 LSP 16 6ll CIS IA 60 SER 5
17 LE”... i* “ET 6 43 ,“R Ii 95 ALA i6
26 ,QP 147 t6 GL”,.. 33 LIS 3 *i GM... 69 ,HR 6 iBS Tt?P (16 ‘SC...
2i TRP 2 I* ME, 135 iZ(I “AL 63 or LEU 126 56 ILE 4 59 LStd ,’ 63 ARG .D
20 f”P 77 24 SEQ *i 33 Llf... 39 ASN 66 51 THP. 7
9 ALA... 92 “AL 75 29 “AL 36 33 WE 115 i LIS 35 52 ASP... 51 GLH 2 :: :*I: :t
96 L”S 36 12s ILE Sk 37 ASH 109 43 THR 3i 5, GLH iZ6 60 SER 2 c9 ,“R 57
::
129
:::
LE”
5s
.I
39
15
i3
“IS
LIS
*1
I*
3”
123
C”S
,QP
11
*5
123
34
TRP
PHE
(12
6.0
40
42
THQ
ALA
Zk
it,
4*
k6
LSU 107
ASH 56 59 AS&..
63
69
SER
ASW
L6
31
6 c*s 37 56 LEU ii 25 LE” 23 30 C”S 56 5k GL” 9 59 A% 5i 63 TRP 67 Si ““R ii
5 ARG 31 26 ,QP 17 29 “I‘ 3c 55 TIR kc 5L ISP 56 66 c** IP
i2 “ET. 30 :: ::i I* 12i CLN 6 35 GL” 26 SZ ALA... 5~ TRR 40 SO SEQ 4, 72 SER 7
25 LEU : 27 ASN 6 32 ALA 17 39 mn 66 56 ILE 27 bt TRP 65 79 PRO. *
‘:: ::: :: 23 ,“R 6 31 ALA 15 57 GLN .lL) *3 ,“R *a 61 ARC 39
6 LEU eo id ASP t 27 AD+... 36 SER It 43 ,HQ 36 c* ALA 3 56 ILL 39 67 GL”...
I GLU 19 i*o “IL $06 54 GL” 36 50 SER 1 51 ,“R 36
ii ALA ilr 10 ASP... 1ii TRP 66 35 WE... oi GLW 22 WI CIS 26 :: :E ::
** LEU 101 /b SEQ 85 11* ARG 20s 36 SER *i 66 ASP 29

77 ASH... 40 THR 60 55 ILE 23 24 SER i1


I4 ASN 127 53 7*4 5.5 A9 THR 17 26 GL”
I8 ILE 35 43 THR 35 93 ASN 14 I22 ALA ,”
75 LC” 27 54 GL" 32 92 VAL 10
?h CIS 6 .¶JLE" 19 5c. GL” b I22 ALA...
79 PRO 4 82 ALA 17 A? ASP 1 125 ARG l?C
63 TRP 3 I)0cvs 17 119 ASP $5
42 ALI *i 9t “AL... llS TRP 31
78 ILE... A6 SER 6 17 LE” 79 I*& VAL 19
79 PRO 94 05 SER 5 .¶a ILE 79 I2A GLN 19
76 CIS 61 I LIS 2 09 THR 62 5 ARC $8
711 ASH 56 95 ALA 59 i2’l ILL 14
wo CIS 52 85 SER...
82 ALA 51 A? ASP Bk :; $f :: 123 TRP...
90 ALA J2 A* ALA 57 5 ARG 133
13 LC” ** ro THR 24 ;: ::i :: l’l CIS 128
65 AS+4 19 83 Lt” 19 I2 “tT 21 33 LYS 82
94 2’15 9 84 LE” 14 94 CIS 13 99 “IL... 120 “AL 17
63 TPP 9 At SER 12 90 ALA 12 2i ARC 66 29 “AL 60
89 CIS 8 88 ILE 7 93 ASN 11 5k PHE +c
7? ASN 1 86 SER 5 55 ILi 3 I,“:, i:: :: i** ALA 39
90 ALA 1 96 LYS 67 12’, ILE 39
79 PRO... 93 ASN... 95 ALA 54 26 ‘LY 25
78 *ii )‘J 8b SER.,. 09 TU9 96 20 T”R 4‘ Itl GLN PJ
$5 ASN 62 i LlS lib 96 CYS 71 108 TIP 37 125 ARG 17
70 PRO... 81 Si9 $7 3 PHE 30 9‘ ALA 54 90 ILE 26 38 PHE $6
?Z SCR 43 82 ALA 51 8, P,SP Z6 92 “AL 31 23 TIR 24 121 CIS 5
68 ARG 40 64 Cl3 i5 85 SEP 22 91 SER 19 y; sr” 23 119 ASP 3
69 THR 19 72 AS’1 :: Ire YHR 22 97 LIS 15 21 9 ALA 1
49 GC” I1 83 it” 5 88 ILE 13 95 ALA 12 1Di ASP 1’.
48 A5P * 90 C”S 5 14 LE” 12 92 C”S 3 9, LYS II 124 ILE...
?7 ASN 5 15 HIS i ill* ‘L” 7 121 GLN 86
71 GLYI.. 66 ASP 2 87 ASP... 129 LEU 75
61 LRG 69 85 SEP 97 94 CF.... 1tifJ SER... 121 e*s 73
TO PPO 36 80 C”S... 89 T”R 73 91 ILE k9 21 APG iill 25 LE” 51
69 THQ 28 53 TW 106 88 ILE LC 91 SER 45 97 LYS 40 Z6 GL” 39
73 APG 2, 83 LE” 56 86 SE4 16 58 ILE kJ 96 LIS J9 i2: ;;; 2;
72 SER 5 65 ASN 52 90 ALA 10 97 LIS $2 99 “AL 33
66 ASP 40 15 “IS 5 90 ALA 41 182 GL. 26 Z AK 26
I2 SER... 60 SER 22 82 ALA 3 96 L”S 24 26 TIS t3 29 “AL 22
h9 T*R 64 71 ILE 16 13 LE” 21 98 ILE 18 122 ALA 11
6i A% *o ** AL4 16 83
9i LE”
SER : 63 TRP te 101 PSC ? 125 APG 12
70 PRO r7 6* c*s 15 9s ASN 19 120 VAL 7
65 ASY 36 se Li” 13 92 “AL 18 iOi ASP... 126 GLI ?
6” SE-? 30 79 PRO 10 78 ILE 9 9, LIS 37 24 SER 5
73 ARG 28 81 SEP 5 95 ALA 4 9B ILE 33
74 ASN 4 bh CIS 1 63 YRP 27 125 ARG...
66 ASP 4 .3* SER... 89 THR 1 106. SER 14 A22 ALA 191
8‘4 LE” 7* 103 ASN iP 121 GLN 86
79 PRO 65 95 ALA... 99 “AL 5 ii9 ASP 6*
00 CYS 19 98 ILE 64 5 A9G 57
83 LE” 17 92 “AL 52 iz7 CYS 26
53 TIR 16 iDB TV 46 124 ILE 21
115 SE9 9 56 LE” 32 6 C”S 7
(12 ALA 7 99 “AL 31 123 TW 3
91 SEC? 29
a* ALA.,. 58 ILE 23 126 GLY...
?9 PPO 51 ZB TRP *i 6 CYS 29
85 SEP iJ 94 CIS ZL 128 A46 19
90 ALA 39 97 LIS li 12* ILE *(I
81 SER 36 93 ASH ic 125 ARG I?
78 ILE 33 17 LE” 9 127 C”S 8
I)& CIS 27 89 THI... 119 ASP... 5 PRG 5
83 LE” 20 87 ASP 4‘ 96
90 LIS
ALA i 125 ARC 69
94 LE” 18 93 ASN I? 1.21 GLN 58 127 CIS...
87 ASP r 15 H’IS 74 96 LIS... 12.2 ALA k8 129 CELI 06
92 VlL J4 2C TYR 152 118 YHR 36 124 ILE 74
83 Lc”... 91 SE4 ZL 9J ASN 91 120 “AL I* 1*& APG PI
91 SE9 8s 88 TLE 19 123 TRP 4 9 ALA 19
58 IL2 74 90 ALA I? :: ::: 59
A2 125 ARG 19
80 CIS 69 17 LE” 43 it0 “AL... 6 c*s 16
53 TIS 55 90 ALA... ioc SE@ 28 27 ASN 82 5 ARC 13
75 LE”... 90 ALA I.2 93 ASN 53 94 C”S 20 la YRP 73 I.23 TRP i
63 TRP 85 82 ALA 34 83 LF” *9 15 HIS 25 115 THR 60
62 TRP 81 6k CIS 33 A9 YHS *7 97 I_** 24 25 GL” 68 12e ARC...
73 LPG 39 78 ILE 30 9r CIS 33 16 GLY 20 12I GLN 32 129 Le.” 45
71 ASN 26 VI LF” 39 78 ILE 23 95 ALA 20 St CIS 23 126 GLY 3i
97 LIS 22 56 GL” 27 88 ILE *I 98 ILE 12 115 CIS *i I27 C”S 2*
74 95N 13 81 SER 26 82 ALA *I 28 YRP 6 ** SER 19 10 ALA 7
76 CYS 0 ‘15 SLR 2* 8, ASP 13 I22 ALA 14
9k c*s *1 92 “AL ic 9, LIS... iP5 “ET... 124 ILE 5 129 LE”...
76 CYS... 40 THQ 13 91 SER (1 94 CYS 52 Ail YPP 177 119 ASL 3 13 LIS 110
63 TW 96 19 PRO 7 85 SER 2 100 SE.7 ci 100 YRP 131 116 LYS 2 I27 CYS 9*
78 I‘E 67 ‘6 CIS 15 *3 TIR 99 111 TR- * 12‘ ILE J+?
7k ASN c2 87
55 ASP
ILE : 91 SER... 63 TPP 3r, 28 TRP 7.8 liT GCY * 25 LE” 43
77 ASH 35 63 TRP t 83 LE” 86 75 LE” *a 99 “AL 33 9 ALA 35
97 LVS 32 88 ILE 1 88 ILE 71 ihl ASO 27 27 ASN 29 121 GLN... lP ALA 27
94 CYS I2 56 IL< 54 96 LYS Zb 10, ALA 22 12’. IL< 19 128 ARG ?
75 LE” II 84 LEI,... 9c CIS 42 93 ASN ZE 31 ALA I? 125 AR‘ Lli 6 C”S 6
98 ILE 4 $1 GL’J *c* 95 ALA 37 95 ALA 16 101 AiN ic li9 ASP 60
6Z TPP * Al Sr9 93 56 LE” *A 99 “IL 15 106 ASN I) **o “A‘ **
96 A.LA 26 98 ILE ii 123 TW 1,

The first line of each block gives the residue number and name of the central residue. The following
lines list in descending order of significance all residues containing long atoms that occlude
surface of the central residue. The values are residue number, residue name, and surface area (Å²)
occluded on the central residue. Because only long atoms are considered, the area occluded by a
residue adjacent to a central residue represents contacts of only side chain atoms of the adjacent
residue.

hexasaccharide of N-acetylglucosamine in the lysozyme cleft (Fig. 2(a)) and incorporation
of a lysozyme molecule into the crystal lattice (Fig. 2(b)). Figure 3 shows the
changes in exposed area for formation of the insulin dimer from two monomers
(Fig. 3(a)), for incorporation of insulin dimer into the hexamer (Fig. 3(b)) and for
incorporation of hexamer into the crystal lattice (Fig. 3(c)). The graphs of Figure 1
serve as a basis for evaluating the extent of the changes shown in Figures 2 and 3.

(c) Contact information


Table 6 gives the extent of contact of each residue of lysozyme with its neighbors.
Tabulations of this kind give an objective description of the environment of atoms or
residues within the native structure. The contact values of Table 6 are constructed
from long atom information only. This seems to be the most appropriate description
of the special environment of a central atom in the folded state (near atoms are
present in both the folded and unfolded states).
Contact information can also be displayed graphically. Figure 4 plots data for the
insulin dimer corresponding to those listed for lysozyme in Table 6. Ooi and co-workers
(Nishikawa et al., 1972) have used similar plots of α-carbon distances to
display structures of proteins. Ooi pointed out that elements of the matrix near the
diagonal reflect secondary structure and off-diagonal elements represent tertiary
structure. Helical regions show as four-residue thick ribbons on either side of the
diagonal. The helices of insulin are largely irregular, and this is reflected as irregularity
in the patterns of Figure 4. Contacts between monomer units of the dimer are
given in the upper right and lower left quadrants. The anti-parallel β-structure,
involving residues B23 to B28 and B*23 to B*28, developed in the association of
insulin monomers to the dimer, appears as a ribbon of unit slope orthogonal to the
diagonal in both the lower left and upper right quadrants; these ribbons are encircled
in Figure 4. Within each monomer the chain from residues A6 to A13 runs anti-parallel
to that from B1 to B6 to give irregular β-structure, which is also seen as off-diagonal
ribbons of unit slope.
Figures 5 and 6 give contacts that develop through incorporation of the insulin
dimer into hexamer and through binding of hexasaccharide to lysozyme. Figure 5
describes the two different dimer-dimer interfaces of insulin. The dimer I-dimer II and
dimer III-dimer I contacts are related by symmetry and therefore both may be
plotted in Figure 5. The pseudo 2-fold symmetry axis of the dimer is reflected in the
symmetry about the diagonal of Figure 5.
As one expects, the areas that two atoms occlude on each other are approximately
the same. Thus, the pairs of numbers in Figures 5 and 6 are comparable and Figure 4
has approximate symmetry about the diagonal.

FIG. 4. Ooi plot for the insulin dimer of contacts between residues. Letter symbols give the
extent that an abscissa residue occludes area of an ordinate residue. Each step in the alphabet
stands for 15 Å² of occluded surface. The right-hand column gives the surface area (Å²) exposed
to solvent times 1/30. Only long atom contacts are included in the sums; thus, the diagonal is
blank. The following gives the correspondence between the standard residue designations of Table 5
and those used in this Figure: A1 to A21 = 101 to 121; B1 to B30 = 201 to 230; A*1 to A*21 =
301 to 321; B*1 to B*30 = 401 to 430. Regions of the plot corresponding to each chain are delineated.
Contacts within monomer units are given in the upper left and lower right quadrants.
Contacts between the monomers are shown in the upper right and lower left quadrants. Lines along
the diagonal indicate α-helical sections. The two enclosed regions off the diagonal represent the
β-structure at the monomer-monomer interface.
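
The encoding described in the legend of Figure 4 can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the band edges (the letter A taken to cover areas above 0 and up to 15 Å², B the next 15 Å², and so on) are an assumption, since the legend states only that each letter step stands for 15 Å².

```python
import math

# Numbering offsets taken from the Figure 4 legend:
# A1-A21 = 101-121, B1-B30 = 201-230, A*1-A*21 = 301-321, B*1-B*30 = 401-430.
CHAIN_OFFSET = {"A": 100, "B": 200, "A*": 300, "B*": 400}


def plot_index(chain: str, residue: int) -> int:
    """Convert a standard designation such as B*23 into the plot number 423."""
    return CHAIN_OFFSET[chain] + residue


def contact_symbol(area: float, step: float = 15.0) -> str:
    """Letter symbol for an occluded area in Å²; a blank means no contact.

    Assumed banding: 'A' for 0 < area <= 15, 'B' for 15 < area <= 30, ...
    Areas beyond the 26th band are clamped to 'Z'.
    """
    if area <= 0:
        return " "
    band = math.ceil(area / step)  # 1 for the first 15 Å² band
    return chr(ord("A") + min(band, 26) - 1)


def exposure_column(exposed_area: float) -> float:
    """Right-hand column of the plot: exposed area (Å²) times 1/30."""
    return exposed_area / 30.0
```

With this numbering, the interface strand B*23 to B*28 occupies plot positions 423 to 428.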
NAG C'    36    3   35   69    1   48   44   82    0   58   54   62  120   92
          43    2   35   98    2   74   37   97    3   61   34   88   86  122

NAG A     20   10   49  143   38   50
          18    9   27  185   46   58

NAG B    112   81   18   50   37   11    3
          93   70   19   69   57   20    3

NAG C      5   21   68   56   68   68   62   90   95
           8   22   46   96   89   78   56  134   58

NAG D     31  110   93    8   76    6   53    9   50   57
          32  118  120    7   85    3   63   17   39   63

NAG E      3  100   24    2   28   12  139    5   33   83   75   29
           3  139   13    0   19    8  132    8   30   93   95   32

NAG F      6   64   34   16   24    3   74
           4   94   26    7   14    0   99

Lysozyme residue:
  33  34  35  36  37  42  43  44  46  52  56  57  58  59  62  63  75  98 101 102 103 107 108 109 110 112 114
 Lys Phe Glu Ser Asn Ala Thr Asn Asn Asp Leu Gln Ile Asn Trp Trp Leu Ile Asp Gly Asn Ala Trp Val Ala Arg Arg

FIG. 6. Contacts of the N-acetylglucosamine hexasaccharide and the α-anomer of N-acetylglucosamine with residues of lysozyme. The α-anomer binds "anomalously"
with some contacts like those of the unit of the hexasaccharide bound at site C (Blake et al., 1967). Contacts for hexasaccharide bound at sites A to F are given
separately. The upper number of each pair gives the area (Å²) of the protein residue occluded by the saccharide unit; the lower number gives the converse. NAG,
N-acetylglucosamine.
---------------------------------------------------------------------------------------------------------------------------------------
                                            All      Polar    Charged  Non-polar   Backbone Side chain      Polar    Charged  Non-polar
                                                                                                       side chain side chain side chain
---------------------------------------------------------------------------------------------------------------------------------------
A  Lysozyme
  Unfolded                               21,723       6176       2466     13,082       5840     15,884       3777       5141       6966
  Native                                   6583       1811       1261       3511       1599       4984       1564       2302       1118
  Hexasaccharide complex                   5919       1586       1128       3205       1462       4457       1395       2162        900
  Crystal lattice                          4786       1261        944       2581       1157       3629       1064       1659        906

B  Lysozyme differences
  Unfolded-native                        15,140       4364       1205       9571       4241     10,900       2213       2839       5848
  Native-hexasaccharide complex             664        225        133        306        137        527        169        140        218
  Native-lattice                           1797        550        317        930        442       1355        500        643        212

C  Insulin
  Unfolded dimer                         17,348       4178       1954     11,215       4507     12,841       2565       4212       6065
  Monomers                                 7334       1642       1278       4459       1557       5777       1420       2508       1849
  Dimer                                    6023       1346       1169       3510       1245       4778       1249       2053       1477
  Dimer in hexamer                         4585       1130        859       2595       1017       3568       1119       1648        801
  Dimer in lattice                         3057        766        506       1784        739       2317        791        978        549

D  Insulin differences
  Unfolded dimer-native dimer            11,325       2833        785       7705       3262       8063       1316       2159       4588
  Monomers-dimer                           1311        297        109        949        312        999        171        455        372
  Dimer-dimer in hexamer                   1438        215        310        915        228       1210        130        405        676
  Dimer in hexamer-dimer in lattice        1528        364        353        811        278       1251        328        670        252
---------------------------------------------------------------------------------------------------------------------------------------

The contact information focuses on the immediate protein environment of an atom
or group and is complementary to the extent of exposure to solvent. Thus, Table 6 is
complementary to Figure 1(a), Figure 4 to Figures 1(b) and 3(a), Figure 5 to Figure
3(b), and Figure 6 to Figure 2(a).

(d) Summary tabulations
Table 7 gives results for lysozyme and insulin summed over classes of atoms (all
atoms; polar, charged, non-polar; backbone, side chain) and over types of side chains
(polar, charged, non-polar). The side chain categories are specified as follows: charged,
those containing groups that bear charge at any pH in the range 0 to 12; polar, those
containing polar but no charged atoms; and non-polar, those containing only
non-polar atoms, and tryptophan, methionine and cystine. The values presented in
Table 7 are the areas exposed to solvent and, in sections B and D, changes in exposed
area (areas are in Å²).

4. Discussion
(a) Comparison with results of Lee & Richards (1971)
The van der Waals' radii used in these computations (Table 2) differ significantly
from those of Lee & Richards (1971), in particular for side chain atoms, for which Lee
& Richards use the uniform value of 1.8 Å. The values of the static accessibility†
calculated for lysozyme from column 2 of Table 4 and the surface areas of Table 2
are very close to the values for lysozyme listed by Lee & Richards. If areas rather
than ratios of areas are considered, differences due to changes in radii become apparent.
Nevertheless, general conclusions drawn from the computations remain essentially
unaltered by the changes in radii. For example, Lee & Richards made the striking
point that a large fraction of the total surface of globular proteins is composed of
non-polar atoms in the folded as well as in the unfolded state. The data of Table 7
(compare columns 1 and 4) confirm this conclusion; non-polar atoms constitute
0.53 and 0.60 of the lysozyme surface for the folded and unfolded molecules, respectively.
Because in the present calculation the van der Waals' radii assigned to non-polar
atoms are larger than those assigned to polar atoms, the above fractions are
each about 0.1 greater than those determined by Lee & Richards. The salient point is
that in spite of the crude model used in the computations, exposure values have semi-quantitative
reliability and trends within self-consistent sets of results appear to be
meaningful.
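
The two fractions quoted above follow directly from columns 1 and 4 of Table 7. The short check below is only a worked illustration; the dictionary layout and function name are hypothetical, and the areas are the lysozyme entries of Table 7.

```python
# Exposed areas (Å²) for lysozyme from Table 7:
# column 1 ("All") and column 4 ("Non-polar" atoms).
LYSOZYME = {
    "native": {"all": 6583, "non_polar": 3511},
    "unfolded": {"all": 21723, "non_polar": 13082},
}


def non_polar_fraction(row):
    """Fraction of the exposed surface contributed by non-polar atoms."""
    return row["non_polar"] / row["all"]


fractions = {state: round(non_polar_fraction(row), 2) for state, row in LYSOZYME.items()}
print(fractions)
```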
In explanation of the considerable non-polar surface in the folded state, exami-
nation of Table 7 (compare columns 4 and 9) shows that a relatively high proportion
(approx. two-thirds) of the non-polar surface of the folded molecules is associated with
non-polar atoms that are part of polar or charged side chains, e.g. the methylene
carbons of lysine. The extent to which the surface exposed in the unfolded state
becomes buried on folding is two to three times greater for non-polar residues than
for polar residues (charged and uncharged). This observation is consistent with the
“oil-drop” model of protein folding.
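
The two statements in this paragraph can be retraced from the lysozyme rows of Table 7. The sketch below rests on one reading of the text: the "two-thirds" figure is taken as the share of native non-polar atom surface (column 4) that does not lie on non-polar side chains (column 9), and the "two to three times" figure as the ratio of the unfolded-to-native reduction factors of the side chain classes. Both readings and all variable names are assumptions made for illustration.

```python
# Lysozyme side chain areas (Å²), Table 7 columns 7 to 9.
UNFOLDED = {"polar": 3777, "charged": 5141, "non_polar": 6966}
NATIVE = {"polar": 1564, "charged": 2302, "non_polar": 1118}
NATIVE_NON_POLAR_ATOMS = 3511  # Table 7 column 4, native lysozyme

# Share of the native non-polar atom surface that is not on non-polar side chains.
outside_non_polar_side_chains = 1 - NATIVE["non_polar"] / NATIVE_NON_POLAR_ATOMS

# Factor by which exposed area shrinks on folding, per side chain class.
reduction = {kind: UNFOLDED[kind] / NATIVE[kind] for kind in UNFOLDED}

# Reduction for non-polar side chains relative to the polar and charged classes.
relative = {kind: reduction["non_polar"] / reduction[kind] for kind in ("polar", "charged")}

print(round(outside_non_polar_side_chains, 2))
print({kind: round(value, 1) for kind, value in relative.items()})
```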
Cavities within the lysozyme structure located by the graphics display of Lee &
Richards (1971) do not exist according to the present calculations. This reflects the
differences in van der Waals' radii. Computations with the solvent radius reduced
[...] and to 0.4 Å show the cavities. These remarks do not weaken the important
conclusion of Lee & Richards: the density of packing of side chains within a native
protein is not uniform.

† Defined by Lee & Richards (1971) as 100 × area of solvated sphere exposed to solvent/total
area of solvated sphere.

(b) Exposure of main chain carbonyl oxygen atoms in helical regions

The carbonyl oxygen is the major contributor to the exposed surface of the
peptide backbone (Table 1). The computations for insulin and lysozyme show that
about half of the carbonyl oxygens are exposed to solvent in both helical and non-helical
regions. Thus, it is of interest that the first residue of each of the eleven helices
in these two molecules has the carbonyl oxygen entirely buried and that in ten of the
eleven helices the carbonyl oxygen of the last helical residue is exposed to solvent.
This observation is confirmed by examination of seven other proteins for which
computations have been carried out and assignments of helical regions could be made.
Although this correlation cannot be used to predict secondary structure because the
exposure of the carbonyl oxygen depends on long range interactions, the origin of the
effect presumably is related to the stability of helical regions in macromolecules and
ultimately should be explicable in terms of theories of protein folding.

(c) Ranking of residues according to exposure of backbone or side chain atoms to solvent

In lysozyme, glycine is among the residues having exposure least affected by folding.
The relatively low effect of folding on glycine exposure was confirmed through
calculations on nine other proteins (glycine, serine and glutamic acid are the residues with
the highest probability for exposure of backbone atoms). The relatively high exposure
of glycine residues in the folded state accords with the expectation that glycine lies
at bends of the chain (Venkatachalam, 1968) and, thus, on the surface of the molecule.
Lee & Richards (1971) noted that proline is more exposed than expected from the
non-polar character of its side chain. The present calculations confirm this behavior,
which probably also reflects the participation of proline in bends in the chain.

(d) Comparison of folding, binding, association and crystallization reactions

Figures 2 and 3 and Table 7 show that lattice contacts are extensive, involving
about 30% of the surface of lysozyme and insulin. The types of residues participating in
lattice contacts of lysozyme and insulin are not the same as those that on the average
constitute the surface of the free molecule or that are involved in the contacts between
saccharides and lysozyme or that constitute the monomer-monomer and dimer-dimer
interfaces of the insulin polymers. A significantly smaller participation of non-polar
side chains in the lattice contacts and a proportionately greater involvement of
charged side chains are found (Table 7).
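
The comparison drawn from Table 7 can be made explicit with a few ratios. The sketch below uses the "All" column for the extent of lattice contact and the three side chain columns of section B for the lysozyme composition; the names and the choice of the side chain columns as the measure of "participation" are illustrative assumptions.

```python
# Exposed area before lattice formation and area lost to the lattice (Å²), Table 7 "All".
BEFORE_LATTICE = {"lysozyme": 6583, "insulin dimer in hexamer": 4585}
LOST_TO_LATTICE = {"lysozyme": 1797, "insulin dimer in hexamer": 1528}

lattice_fraction = {
    name: round(LOST_TO_LATTICE[name] / BEFORE_LATTICE[name], 2) for name in BEFORE_LATTICE
}

# Lysozyme side chain area lost (polar, charged, non-polar), Table 7 section B.
SIDE_CHAIN_LOSS = {
    "folding": (2213, 2839, 5848),
    "hexasaccharide binding": (169, 140, 218),
    "lattice": (500, 643, 212),
}


def shares(triple):
    """Fractions of the side chain area lost in polar, charged and non-polar side chains."""
    total = sum(triple)
    return tuple(round(area / total, 2) for area in triple)


composition = {process: shares(triple) for process, triple in SIDE_CHAIN_LOSS.items()}
print(lattice_fraction)
print(composition)
```

On these numbers the non-polar share falls from roughly one-half for folding to well under one-fifth for lattice contacts, while the charged share rises, which is the trend stated in the text.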

(e) Protein environment of atoms and residues


The description of the environment of an atom or residue through listing atoms or
residues that occlude its surface includes all important interacting elements except
for those involved in long range ionic interactions. A comparison of the tabulations
for lysozyme and insulin with Kendrew-type models of these proteins shows that
the listing of residues contacting other residues is substantially correct and that
hydrogen bonding or hydrophobic interactions are reflected in extensive overlap.
Since solvated radii are used, the listing includes some less significant elements of the
environment; thus, contacts between residues with areas less than 5 to 10 Å² are not
important.
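
A contact listing of the kind described, ranked by occluded area and with the minor entries dropped, might be produced as follows. The function name, the residue labels and the areas in the example are hypothetical and are not taken from Table 6; the 10 Å² cutoff is the upper end of the range mentioned above.

```python
def ranked_contacts(occluded, min_area=0.0):
    """Rank the neighbors of one central residue by the area they occlude on it.

    occluded maps a neighbor label to the area (Å²) it occludes on the central
    residue; entries below min_area are dropped.
    """
    kept = [(name, area) for name, area in occluded.items() if area >= min_area]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical neighbors of one central residue (illustrative values only).
example = {"Trp 108": 46.0, "Ala 110": 12.0, "Gln 57": 3.0}
print(ranked_contacts(example, min_area=10.0))
```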
The list of contacting residues is altered substantially when the solvent radius is
decreased, e.g. from 1.4 to 0.4 Å. However, the most important contacts remain in the
list generated with the reduced solvent radius. Decreasing the solvent radius to 1.0 Å
does not significantly change the nature and extent of residue-residue contacts.
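
The dependence on solvent radius can be illustrated with a small test-point calculation on solvated spheres (van der Waals radius plus solvent radius). This is a minimal sketch and not the program described in this paper: the golden-spiral point layout is a stand-in for whatever point set the program uses, the default of 92 points per atom is assumed here, and the function names and the two-atom example are hypothetical.

```python
import math


def sphere_points(n):
    """Return n roughly uniform unit vectors (golden-spiral layout, a stand-in choice)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points


def exposed_areas(atoms, probe=1.4, n_points=92):
    """Exposed area (Å²) of each atom, measured on its solvated sphere.

    atoms is a list of (x, y, z, van_der_waals_radius). A test point counts as
    exposed when it lies inside no other atom's solvated sphere.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe
        free = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2:
                    break  # point is occluded by atom j
            else:
                free += 1
        areas.append(4.0 * math.pi * big_r * big_r * free / n_points)
    return areas


# Two hypothetical atoms of radius 1.8 Å placed 3.0 Å apart.
pair = [(0.0, 0.0, 0.0, 1.8), (3.0, 0.0, 0.0, 1.8)]
fraction_large_probe = exposed_areas(pair, probe=1.4)[0] / (4.0 * math.pi * 3.2 ** 2)
fraction_small_probe = exposed_areas(pair, probe=0.4)[0] / (4.0 * math.pi * 2.2 ** 2)
```

In this toy case the neighbor occludes a smaller fraction of the solvated sphere when the solvent radius is reduced, in line with the observation that the contact list changes when the radius is decreased.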
The environment about each atom of native lysozyme and insulin is described in
Tables 4 and 5 by giving the surface area occluded by polar and non-polar long atoms,
with the intent of indicating the polarity of the environment. These numbers can be
summed over groups of atoms. The carboxyl groups of Glu-35 and Asp-52 of lysozyme
are both largely shielded from solvent; the ratios of exposure in the folded state to
that in the unfolded state are 0.2 and 0.3, respectively. Glu-35 has a considerably
less polar environment, however. The fraction of its occluded surface assigned to
polar contacts is 0.1 compared with 0.7 for Asp-52. This difference in environment is
reflected in the pK 6 to 6.5 found for Glu-35 and pK approx. 4 found for Asp-52
(Imoto et al., 1972).

(f) Structural assumptions
It is assumed that computations based on the crystallographically defined co-ordinates
are relevant to solution properties. Two aspects of this assumption should
be discussed. First, uncertainty in the crystallographic results is generally estimated
at 0.5 Å for protein molecules studied at 2 to 3 Å resolution. In order to investigate
this difficulty, a random error was introduced into the Cartesian co-ordinates of
each atom of lysozyme using a Gaussian probability distribution that gave an arithmetic
average movement for each atom of 0.39 Å. Computations with this perturbed
set of co-ordinates show no significant changes in contacts between residues, i.e. very
few changes greater than 5 to 10 Å² in area of contact. Exposure of individual atoms
to solvent is also only slightly affected by this co-ordinate perturbation; atoms that
are completely buried according to the unperturbed calculations remain so and atoms
exposed to solvent undergo changes in exposure of approximately 5 Å². The exposure
summed over classes of atoms (as in Table 7) changes by less than 5%. Thus, the conclusions
drawn from exposure and contact computations of the kind described here
are not sensitive to considerable error in the co-ordinates.
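
A perturbation of this kind might be set up as follows. The sketch assumes that "arithmetic average movement" means the mean length of the displacement vector and that the noise is independent and identical along each Cartesian axis; under those assumptions the per-axis standard deviation s satisfies mean displacement = s·sqrt(8/π). The function names are hypothetical.

```python
import math
import random


def perturb(coords, mean_shift=0.39, seed=None):
    """Add isotropic Gaussian noise to each (x, y, z) co-ordinate triple.

    For independent per-axis noise with standard deviation s, the mean length of
    the displacement vector is s * sqrt(8 / pi); s is chosen so that this mean
    equals mean_shift (in Å).
    """
    rng = random.Random(seed)
    sigma = mean_shift / math.sqrt(8.0 / math.pi)
    return [tuple(c + rng.gauss(0.0, sigma) for c in xyz) for xyz in coords]


def mean_displacement(before, after):
    """Arithmetic mean of the distances moved by the individual atoms."""
    return sum(math.dist(a, b) for a, b in zip(before, after)) / len(before)
```

Running the exposure and contact calculations on both the original and the perturbed co-ordinates and comparing the per-residue results then gives the kind of sensitivity check described above.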
Second, the conformation of a protein molecule in the crystal may differ from that
in solution. This problem has been considered by many workers (see review by Rupley,
1969). The time-average conformation of a protein as reflected in equilibrium
properties appears to be unaffected by crystallization. Surface side chains involved in
lattice contacts can be expected to undergo perturbation if they are relatively
unrestricted in solution.
X-ray studies of complexes of lysozyme with the β(1→4)-linked monomer, dimer
and trimer of N-acetylglucosamine have provided co-ordinate information for the
saccharide moieties binding at sites A through C. The co-ordinates for the moieties
binding at the remaining sites (for N-acetylglucosamine hexamer) are derived from
model building. A few of the side chains of the protein residues that are involved in
the binding of hexasaccharide (see Figs 2(a) and 6) also participate in lattice contacts
(see Fig. 2(b)).

The co-ordinates of the free insulin monomer and dimer are assumed to be the same
as those in the hexamer. In the hexamer the first few residues of the B and B’ chains
are buried in the adjacent dimers. In the free dimer these residues are possibly
folded back onto the surface. Thus, the actual areas exposed to solvent in the free
dimer are presumably less than the computed values and the calculated changes in
exposed area brought about by the association of dimers to form hexamer are only
approximate.
(g) Concluding remarks
Exposure values for atoms can be summed in different ways, e.g. to describe
exposure of chromophores or other side chain elements of a protein. Values of this
kind based on the present calculations have been used (see review by Imoto et al.,
1972) for examining free energies of association reactions and for understanding
perturbations of ionizable groups and perturbations of chromophores. Exposure and
contact information define environment more precisely than terms such as "partially
buried".
Lee & Richards (1971) have discussed the limitations in applying exposure computations
that are based on the relatively crude model of hard-sphere atoms and on
the equilibrium structure determined by X-ray diffraction. In particular, conclusions
related to rate processes must be made with caution. Nevertheless, the use of exposure
calculations is justified by the need to summarize structural information objectively
and semi-quantitatively and by the advantages of concise tabulations and graphical
display.

We are grateful to Professor D. C. Phillips and his colleagues for the hospitality they
extended and the encouragement they offered in the early stages of this work. We are
also indebted to Professor D. C. Hodgkin and her colleagues for giving us the co-ordinates
of insulin and for discussions on this structure. This work was supported by the American
Cancer Society, the National Institutes of Health and the University of Arizona Computer
Center. One of us (A. S.) thanks the National Institutes of Health for support in the form
of a postdoctoral fellowship from 1971 to 1973.

REFERENCES
Blake, C. C. F. (1967). Proc. Roy. Soc., London (ser. B), 167, 435-438.
Blake, C. C. F., Johnson, L. N., Mair, G. A., North, A. C. T., Phillips, D. C. & Sarma, V. R.
(1967). Proc. Roy. Soc., London (ser. B), 167, 378-388.
Blundell, T. L., Cutfield, J. F., Cutfield, S. M., Dodson, E. J., Dodson, G. G., Hodgkin,
D. C., Mercola, D. A. & Vijayan, M. (1971). Nature (London), 231, 506-511.
Bondi, A. (1964). J. Phys. Chem. 68, 441-451.
Diamond, R. (1966). Acta Crystallog. 21, 253-266.
Dickerson, R. E. (1972). Annu. Rev. Biochem. 41, 815-842.
Imoto, T., Johnson, L. N., North, A. C. T., Phillips, D. C. & Rupley, J. A. (1972). The
Enzymes, 7, 665-868.
Lee, B. & Richards, F. M. (1971). J. Mol. Biol. 55, 379-400.
Nishikawa, K., Ooi, T., Isogai, Y. & Nobuhiko, S. (1972). J. Phys. Soc. (Japan), 32, 1331-
1337.
Pauling, L. C. (1960). The Nature of the Chemical Bond, 3rd edn. Cornell University Press,
Ithaca, New York.
Rupley, J. A. (1969). In Structure and Stability of Biological Molecules (Timasheff, S. N.
& Fasman, G. D., eds), pp. 291-352, Marcel Dekker, New York.
Venkatachalam, C. M. (1968). Biopolymers, 6, 1425-1436.
