Downloaded from rsta.royalsocietypublishing.org on May 24, 2011
First-principles quantum chemistry in the life sciences
Phil. Trans. R. Soc. Lond. A 2004 362, 2653-2670
doi: 10.1098/rsta.2004.1469
This journal is © 2004 The Royal Society
First-principles quantum chemistry
in the life sciences
By Tanja van Mourik
Department of Chemistry, University College London,
20 Gordon Street, London WC1H 0AJ, UK (t.vanmourik@ucl.ac.uk)
Published online 24 September 2004
The area of computational quantum chemistry, which applies the principles of quantum mechanics to molecular and condensed systems, has developed drastically over
the last decades, due to both increased computer power and the efficient implementation of quantum chemical methods in readily available computer programs. Because
of this, accurate computational techniques can now be applied to much larger systems
than before, bringing the area of biochemistry within the scope of electronic-structure
quantum chemical methods. The rapid pace of progress of quantum chemistry makes
it a very exciting research field; calculations that are too computationally expensive
today may be feasible in a few months’ time!
This article reviews the current application of ‘first-principles’ quantum chemistry in biochemical and life sciences research, and discusses its future potential. The
current capability of first-principles quantum chemistry is illustrated in a brief examination of computational studies on neurotransmitters, helical peptides, and DNA
complexes.
Keywords: quantum chemistry; ab initio calculations;
density functional theory calculations; gas-phase biomolecules
1. Introduction
As a result of the rapid development of computational quantum chemistry, high-accuracy techniques can be applied to much larger systems than previously envisaged, bringing the area of biochemistry within the reach of electronic structure quantum chemistry. With the ongoing advances in computer architecture and quantum
chemical software development, the range of molecular systems that can be studied
continues to grow. This paper will focus on the application of ‘first-principles’ (ab initio and density functional theory) quantum chemical methods in
biochemical and life sciences research. The recent award of the 1998 Nobel Prize
in Chemistry to Walter Kohn and John Pople reflects the enormous capability and
value of first-principles quantum chemistry in today’s chemistry research. John Pople
was honoured for the development of quantum chemical methods, whereas Walter
Kohn’s contribution was the development of the relatively new density functional
theory (DFT) methodology.
One contribution of 17 to a Triennial Issue ‘Chemistry and life science’.
First-principles quantum mechanical computations are traditionally not well suited
for large molecules, because of their huge computational demand. The results of
cheaper methods (such as force-field and semi-empirical methods), however, vary
significantly depending on the specific parametrization. As first-principles methods
do not require empirical calibration, they are applicable to all molecular systems
and properties one may be interested in, even in the absence of experimental data.
As a result of this, they have the potential of yielding much more accurate results
than the methods traditionally used in the life sciences, thereby greatly enhancing
the reliability of computational research in this area.
A comprehensive appraisal of first-principles studies in biochemistry and the life
sciences is, if at all possible, beyond the scope of this paper. Researchers apply a
plethora of different methods, implemented in a large number of quantum chemistry
programs, to calculate properties of a wide variety of biomolecular systems. In this
paper, I will therefore not attempt to be complete, but instead I will present a
selection of interesting research done in this field, and hope that these case studies
illustrate the current potential of first-principles methods. I will start with a very
brief review of the theory of first-principles methods, after which calculations in three
subareas, neurotransmitters, peptides, and DNA fragments, will be discussed. The
reviewed research is focused on gas-phase studies at a temperature of 0 K. The paper
concludes with an outlook on the future potential of first-principles research in the
life sciences.
2. First-principles methods
Computational chemistry encompasses many methods, which can be divided into
quantum chemical and classical (non-quantum chemical) methods. Classical methods
consider only the nuclei of a molecular system, while ignoring the electronic motions.
These methods calculate the energy of a molecular system using a force field, which
is a parametric function of the positions of the nuclei only. Classical methods are very
computationally efficient, and are generally applied to very large molecular systems.
In contrast, quantum chemical (or electronic structure) methods do take the
motions of the electrons into account—they deal with the computation of molecular electronic structures, and are therefore much more computationally demanding
than their classical counterparts. Molecular quantum chemistry attempts to solve the
time-independent Schrödinger equation: HΨ = EΨ. Solving this equation yields the
wavefunction of a molecular system, from which all its molecular properties can (in
principle) be derived. The Schrödinger equation can be solved exactly only for the
hydrogen atom, and therefore, all quantum chemical methods are necessarily approximate. In this review, I will consider two groups of quantum chemical methods: the
pure ‘ab initio’ methods and DFT.
Ab initio calculations use no other input than the Schrödinger equation, a few
fundamental physical constants, and the atomic numbers of the atoms present in
the molecular system of interest. The simplest ab initio method is the Hartree–Fock
(HF) method. In the HF method each electron only sees an average field of the other
electrons; local distortions of the electron distribution are ignored. As a result, HF
ignores electron correlation, which limits the accuracy achievable with this method.
There are many approaches that attempt to include this electron correlation, such as
configuration interaction, coupled cluster and many-body perturbation-theory methods. The basic idea of perturbation theory is that the problem at hand is first solved
for a simpler system, after which the solution is adjusted in the direction of the
more complicated, true system. Møller–Plesset perturbation theory (Møller & Plesset 1934) considers the HF Hamiltonian to be the undistorted (simpler) system,
and the electron correlation is computed as a perturbation. This leads to a series in
which each higher-order method should (in principle) be more accurate than the previous one (however, the Møller–Plesset perturbation series does not always converge
towards the ‘correct’ result, as first convincingly shown by Olsen et al. (1996)). The
second-order method in the Møller–Plesset series, MP2, is one of the most common
electron-correlation methods, because of its relative inexpensiveness as compared
with other correlated ab initio methods.
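The idea of solving a simpler system first and then correcting towards the true one can be illustrated on a toy problem. The sketch below is not a Møller–Plesset calculation: it applies textbook second-order perturbation theory to a hypothetical two-level model whose numbers are invented for illustration, and compares each order with the exact lowest eigenvalue.

```python
import math

# Hypothetical two-level model (all numbers are illustrative, arbitrary units).
E1, E2 = 0.0, 1.0      # eigenvalues of the simple, unperturbed system H0
V11, V12 = 0.0, 0.1    # diagonal and off-diagonal elements of the perturbation V


def exact_ground_energy(e1, e2, v11, v12, v22=0.0):
    """Lowest eigenvalue of the 2x2 symmetric matrix H0 + V."""
    a, d = e1 + v11, e2 + v22
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + v12 ** 2)


def perturbation_series(e1, e2, v11, v12):
    """Ground-state energy summed through zeroth, first and second order."""
    order0 = e1
    order1 = order0 + v11
    order2 = order1 + v12 ** 2 / (e1 - e2)
    return order0, order1, order2


exact = exact_ground_energy(E1, E2, V11, V12)
for order, energy in enumerate(perturbation_series(E1, E2, V11, V12)):
    print(f"order {order}: E = {energy:+.6f}  (error {energy - exact:+.2e})")
print(f"exact  : E = {exact:+.6f}")
```

In this model the second-order energy is already close to the exact value because the perturbation is small compared with the level spacing; when that condition fails, the series can behave poorly, in line with the convergence caveat noted above.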
Another way to include electron correlation is via DFT. DFT is based on the notion
that the total energy of a molecular system is a functional of the charge density
(Hohenberg & Kohn 1964). The main idea of DFT is to describe a molecular system
directly via its density, without first finding the wavefunction. However, the exact
form of the universal energy density functional is unknown. The general strategy is
therefore to approximate it by various model functionals. One of the most widely
used functionals is B3LYP, devised by Becke (1993). The precise form of density
functionals is commonly determined by fitting to atomic or molecular data. DFT is
therefore strictly speaking not an ab initio method (though it is often labelled as one).
In principle, ‘first principles’ is a synonym for ‘ab initio’. In this paper however, I use
the term ‘first principles’ to include pure ab initio methods, as well as DFT. DFT is
more computationally efficient than correlated ab initio methods and can therefore
be applied to larger molecular systems. However, one of the major deficiencies of
current density functionals is their inability to correctly account for the dispersion
energy—the intermolecular energy contribution arising from the correlation between
fluctuations in the electron distributions of neighbouring molecules (van Mourik &
Gdanitz 2002)—which makes DFT less suitable for dispersion-dominated interactions
(such as stacking interactions and hydrogen bonds to π electron clouds).
Both ab initio and (Kohn–Sham) DFT methods generally use basis sets, which
are collections of basis functions representing atomic orbitals (AOs). From a linear
combination of these AOs, molecular orbitals (MOs) can be constructed. In HF-based methods, the MOs are a representation of the wavefunction, whereas in DFT
the so-called Kohn–Sham orbitals are simply a way of representing the density. The
expansion of the MOs (or Kohn–Sham orbitals) in a set of AOs is not an approximation if the basis is complete. Practically, however, finite basis sets are used, thereby
introducing an additional approximation in the calculations. Particularly for ab initio
methods it generally holds that, the larger the basis, the better the approximation,
and the choice of basis set is therefore often determined by a trade-off between accuracy and computational cost.
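The effect of a finite basis can be seen in the smallest possible case: the hydrogen atom described by a single Gaussian function. The sketch below uses the standard closed-form energy expression for a normalized trial function exp(-alpha r^2) in atomic units and scans the exponent on a grid; the grid range and spacing are arbitrary choices made for this illustration.

```python
import math

EXACT_H_ENERGY = -0.5  # exact ground-state energy of the hydrogen atom (hartree)


def gaussian_energy(alpha):
    """Energy expectation value (hartree) of the normalized trial function
    exp(-alpha * r**2) for the hydrogen atom: kinetic + nuclear attraction."""
    kinetic = 1.5 * alpha
    attraction = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + attraction


# Crude variational optimization: scan the exponent on a fine grid.
alphas = [0.001 * k for k in range(1, 2001)]
best_alpha = min(alphas, key=gaussian_energy)
best_energy = gaussian_energy(best_alpha)

print(f"best exponent  : {best_alpha:.3f}")
print(f"one-Gaussian E : {best_energy:.4f} hartree")
print(f"exact E        : {EXACT_H_ENERGY:.4f} hartree")
```

The optimized one-function energy stays above the exact value, which is the basis-set error in its simplest form. Adding further basis functions would lower the energy towards the exact result at greater computational cost.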
3. First-principles calculations on neurotransmitters
(a) Neurotransmitters
Neurotransmitters are small chemical messengers that transmit information across
the gap (‘synapse’) between nerve cells (‘neurons’). The neurotransmitter’s release
from one side of the synapse to the other is triggered by an electrical impulse travelling along a nerve towards the synapse. The neurotransmitters cross the synapse and
are received by a receptor, which is a protein that binds the neurotransmitter and
undergoes a conformational change as a result. The outcome is either a continued
electrical impulse, or an inhibition of it, on the other side of the synapse.
Figure 1. The structures of a selection of neurotransmitters. (Structure labels: GABA, acetylcholine, dopamine, NA, adrenaline, tryptamine, serotonin.)
The neurotransmitters are a diverse group of chemical compounds ranging
from simple monoamines such as glycine, γ-aminobutyrate (GABA) and acetylcholine, compounds containing single aromatic rings (dopamine, noradrenaline and
adrenaline) or double (fused) rings (tryptamine and serotonin), to polypeptides such
as the enkephalins (see figure 1). A common characteristic of these molecules is their
flexibility, leading to a large number of possible conformations. A knowledge of the
favoured shape of these molecules is of vital importance, as the shape influences their
transport properties and plays an important role in the binding of these biomolecules
to their receptor sites. In recent years, much effort has been devoted to illuminate
the conformational preferences of neurotransmitters.
(b) Studies on neurotransmitters in the gas phase
A recent computational study provided a prediction for the structure and function
of the β-adrenergic receptor, one of the receptors that bind the neurotransmitter
adrenaline (Vaidehi et al. 2002). The novel aspect of this research was that the new
set of methods used correctly predicted the structure and binding characteristics of
the receptor, starting with nothing but a knowledge of its sequence of amino acids.
However, such studies are still out of reach for first-principles quantum chemistry
methods. The neurotransmitter-in-receptor study was feasible because it used newly
developed computer programs for the prediction of protein structure and ligand-binding
conformations, in which the interaction between atoms is calculated using
force fields. Force-field methods apply a parametric function of the nuclear coordinates to calculate the atom–atom interactions and are much cheaper computationally
than first-principles quantum chemistry methods.
Even though first-principles studies of neurotransmitters in genuine physiological environments are not yet feasible, the area is not wholly out of reach of first-principles quantum chemistry. The strategy of physical chemists has traditionally
been to reduce and simplify the molecular problem at hand. True to this approach,
researchers have started to study neurotransmitters ‘in the gas phase’. Even though
environmental effects play a very important role in modelling biological processes,
an understanding of the intrinsic energetics of flexible biomolecules is essential. By
studying the biomolecules and their hydrated clusters in the gas phase, the environmental effects are effectively eliminated, allowing the molecule–solvent interaction
to be studied in a controlled environment. Comparison with studies in solution can
then provide information about the relative importance of the intrinsic and environmental effects. The study of biomolecules in the gas phase can also be considered a
first step in a progressive investigation from isolated (gas-phase) molecules, via small
hydrated clusters, to completely solvated molecules, to molecules in truly physiological systems, such as the neurotransmitter in its binding site.
There are several reasons for the recent upsurge in gas-phase studies of molecules
of biological interest. Firstly, the structures of flexible molecules depend on subtle
intramolecular interactions. A study on the conformational landscape of serotonin
showed that semi-empirical methods, which are computationally much less demanding than first-principles methods, are incapable of accurately predicting the structures and relative stabilities of the different serotonin conformers (van Mourik &
Emson 2002), indicating the need for higher-level quantum chemical methods to
study flexible biomolecules. Due to ongoing software and hardware developments, it
is nowadays feasible to perform accurate first-principles calculations on a large number of possible conformations of molecules of the size of those in figure 1. A second
reason for the renewed interest in gas-phase biomolecules lies in developments
in electronic, vibrational and microwave spectroscopy techniques using free-jet gas-phase expansions (Levy 1980; Smalley et al. 1974), which allow flexible molecules
and their hydrated clusters to be studied in the gas phase at very low rotational and
vibrational temperatures (Robertson & Simons 2001; Zwier 2001).
The experimental and theoretical studies demonstrate a unique synergic relationship, thereby encouraging collaborative research: assignment of the experimental
spectra is crucially dependent on comparison with spectra computed using quantum
chemical techniques, whereas the experimental results are indispensable for validation of the theoretical results. As a result, a large number of combined experimental/theoretical papers on biomolecules in the gas phase have appeared in the literature over recent years, including studies of the neurotransmitters tryptamine and serotonin (Carney & Zwier 2000, 2001; van Mourik & Emson
2002), the ephedra (Butz et al. 2001b, 2002), and the adrenergic neurotransmitters and
their analogues (Alagona & Ghio 2002; Butz et al. 2001a; Graham et al. 1999; Nagy
et al. 2003; Snoek et al. 2003a, b).
(c) The adrenergic neurotransmitters
As figure 1 shows, the neurotransmitters noradrenaline (NA) and adrenaline (A)
have a very similar structure; the only difference is that a methyl group in adrenaline
replaces one of the amino hydrogens in NA. Noradrenaline has one chiral centre (the
side chain carbon atom bearing four different groups), and thus, every conformer has
a corresponding non-superimposable mirror image. The two mirror images (the chiral
forms R and S) are spectroscopically indistinguishable, and therefore only one form
needs to be considered. The substitution of one of the amino hydrogens by a methyl
group to form adrenaline introduces a second chiral centre (the nitrogen), resulting
in the existence of two ‘diastereoisomers’, structures that are not mirror images of
each other: one of these (1R2S/1S2R) is adrenaline, and the other (1S2S/1R2R) is
pseudoadrenaline (PA). Thus, for A/PA, twice as many structures need to be
considered as for NA.
Figure 2. The two hydrogen-bonding configurations of the catechol hydroxyl groups.
The two molecules do not only have a similar structure, they also have a similar
function in the body. Both are released as a response to mental or physical stress,
and prepare the body for strenuous activity: the so-called ‘fight-or-flight’ syndrome,
causing the familiar ‘adrenaline rush’. They also act as hormones and are secreted in
response to a low blood-glucose level. They increase the amount of glucose released
into the blood by the liver and decrease the use of glucose by muscle.
Most of the flexibility of NA and adrenaline is located in the ethanolamine side
chain. The orientation of the side chain is determined by four torsion angles. In
addition, the catecholic OH groups have two different orientations supporting an
intramolecular hydrogen bond (see figure 2). If one only considers side-chain conformations in which atomic groups on adjacent atoms are in a staggered position with
respect to each other, and not those that are eclipsed, then one needs to take into
account only three different orientations of each of the four side-chain torsion angles.
With these restrictions, there are
$3^{4}$ (four flexible bonds, each having three different values for the torsion angle)
$\times\, 2$ (two different orientations of the catechol groups)
$= 162$ possible conformations.
Considering that a B3LYP/6-31+G* geometry optimization of one structure may
take 1–2 days of computer time on a reasonably fast PC (1.7 GHz Pentium 4), it
is clear that the full characterization of these molecules takes a considerable amount
of computer time.
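The counting argument above can be reproduced by enumerating the starting structures explicitly. The sketch below is illustrative only: the torsion and catechol labels are hypothetical placeholders, and the cost estimate simply multiplies the number of structures by the 1–2 days per optimization quoted above, assuming every structure is optimized serially on one machine.

```python
from itertools import product

# Labels are placeholders chosen for this illustration.
TORSION_VALUES = ("g+", "anti", "g-")             # three staggered positions
N_SIDE_CHAIN_TORSIONS = 4                          # flexible side-chain bonds
CATECHOL_ORIENTATIONS = ("config_1", "config_2")   # two OH...O arrangements

conformers = [
    (torsions, catechol)
    for torsions in product(TORSION_VALUES, repeat=N_SIDE_CHAIN_TORSIONS)
    for catechol in CATECHOL_ORIENTATIONS
]

n_conformers = len(conformers)                     # 3**4 * 2
days_low, days_high = 1 * n_conformers, 2 * n_conformers

print(f"starting structures : {n_conformers}")
print(f"serial CPU estimate : {days_low}-{days_high} days")
```

For A/PA the list would be twice as long, since both diastereoisomers have to be considered.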
A theoretical search for the most stable NA, A, and PA conformer reveals a close
competition between two structures: AG1a, which has an extended side chain, and
GG1a, in which the side chain is folded back towards the catechol ring (Snoek et al.
2003b; van Mourik 2004) (see figure 3). The AG1a and GG1a structures are nearly
isoenergetic, and which of the two is the more stable depends on the level of
theory used. The narrow energy gap between the AG-type and GG-type conformers
was also observed in 2-amino-1-phenylethanol (APE), the benzene analogue of NA
(Graham et al. 1999; Macleod et al. 2003). However, whereas both AG1 and GG1
APE conformers were observed experimentally, almost the entire NA population was
found to adopt the global-minimum structure (Snoek et al. 2003b). The reasons for
the absence of a larger spread of populated conformers in jet-cooled NA are not
well understood. In the experiments, NA was created by laser ablation into a supersonic argon expansion. The relatively large barriers (of the order of 10–20 kJ mol⁻¹
(van Mourik & Früchtl 2005)) between different NA conformers indicate that it is
unlikely that conformational relaxation occurs during cooling in the supersonic jet.
Hence, it seems likely that laser-ablated NA is generated predominantly in the global-minimum conformation. Thus far there have been no jet-cooled experimental data
available for A/PA. It would be very interesting to see which A and PA conformers
are formed experimentally.
Figure 3. The two most stable conformers of the neurotransmitters NA, A and PA. (Panels: NA (AG1a), NA (GG1a), A (AG1a), A (GG1a), PA (AG1a), PA (GG1a).)
(d) Protonated neurotransmitters
Under aqueous physiological conditions, biomolecules containing basic amino
groups will exist predominantly in their protonated form. Though some theoretical studies on protonated neurotransmitters have appeared in the literature (Alagona & Ghio 2002; Nagy et al. 2003), corresponding spectroscopic work has
lagged behind. This is partly because it is notoriously difficult to produce reasonable quantities of protonated biomolecules at sufficiently low vibrational
and rotational temperatures to allow their structural analysis. However, a recent
spectroscopic study on protonated ethanolamine (Macleod & Simons 2004) introduces a potentially very promising approach to generate protonated molecules in
high concentration in the gas phase. The method is based upon the photo-excitation
of hydrogen-bonded complexes with phenol. It is found that the infrared spectrum of
the [phenoxy-protonated-ethanolamine] complex is almost identical to the computed
spectrum of free protonated ethanolamine. I expect that this new approach will pave
the way for an upsurge in experimental and theoretical studies probing the structural
preferences of protonated biomolecules.
(e) Hydrated neurotransmitters
Gas-phase biomolecule–(H2O)n clusters bridge the gap between isolated and fully
hydrated biomolecules. Their study allows the relative importance of individual water
molecules to be investigated. To find the most stable hydrates, it is not sufficient to
consider the water complexes of just the most stable isolated-biomolecule conformer,
as the interaction with water may change the relative stability of the conformers
(Butz et al. 2002). Indeed, the interaction with water may even change the biomolecular conformation to one that is non-existent in the absence of water, as found in
calculations on adrenaline–(H2O)2 (van Mourik 2004). Thus, as a minimum, one has
to study hydrates involving several of the most stable isolated-biomolecule conformers. The functional groups of the adrenergic neurotransmitters provide many possible
water-binding sites: calculations on 1:1 hydrates of NA and A (Snoek et al. 2003a;
van Mourik 2004) show that there are about 10 different ways for a single water
molecule to bind to these neurotransmitters. In addition, the number of local minima increases steeply with the number of constituents in the cluster, and thus, a full
study of the hydrates is a formidable task. Structural data from spectroscopy experiments may help to reduce the conformational space that needs to be investigated,
further stressing the importance of collaborative research in this field.
Figure 4. The basic structure of a peptide. Amino acids are linked by peptide bonds to form polypeptide chains, running from the amino-terminal residue to the carboxyl-terminal residue.
(f) Anharmonic vibrational frequencies
The interpretation of the experimental IR vibrational spectra in the studies discussed above crucially depends on comparison with reliable theoretical vibrational
frequencies. Most standard algorithms to calculate vibrational frequencies employ the
harmonic approximation, yielding harmonic vibrational frequencies. As the anharmonicity contribution to the vibrational frequencies can be rather large in hydrogen-bonded systems, this complicates the comparison between computed and experimental infrared spectroscopic data. The application of scaling factors (Scott & Radom
1996) can remedy this to some extent, but it is very difficult to account for varying
degrees of anharmonicity in the frequency modes using scaling factors. To overcome these deficiencies, techniques have been developed to compute anharmonic
vibrational frequencies. One of the principal methods for calculating anharmonic
frequencies is the vibrational self-consistent field (VSCF) method, of which the
basic structure was first introduced in 1978 (Bowman 1978). Because VSCF and its
correlation-corrected form, CC-VSCF (Jung & Gerber 1996), require the calculation
of many points on the ab initio potential energy surface, they are very computationally demanding, restricting their application to only small molecules. However,
promising new implementations of the CC-VSCF method, employing cost-reducing
techniques such as the use of ab initio-improved semi-empirical potentials (Gerber
et al. 2003), or the use of pseudo-potential basis sets for heavy atoms coupled with
an algorithm to reduce the number of pair-coupling elements (Benoit 2004), show
potential for widening the application field of the CC-VSCF method.
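Why a single scaling factor cannot capture mode-dependent anharmonicity can be seen from a simple one-dimensional model. The sketch below is not a VSCF calculation. It assumes an oscillator with one quadratic anharmonic term, for which the fundamental transition is the harmonic wavenumber minus twice the anharmonicity constant. The two modes, their constants and the uniform scaling factor are invented for illustration and are not taken from any of the studies cited here.

```python
# Hypothetical modes: (label, harmonic wavenumber, anharmonicity constant), cm^-1.
MODES = [
    ("free O-H stretch", 3850.0, 85.0),
    ("hydrogen-bonded O-H stretch", 3650.0, 150.0),
]
UNIFORM_SCALE = 0.96  # illustrative value only, not a recommended factor


def fundamental(omega_e, omega_e_x_e):
    """v = 0 -> 1 transition for levels
    E(v) = omega_e * (v + 1/2) - omega_e_x_e * (v + 1/2)**2."""
    return omega_e - 2.0 * omega_e_x_e


for label, omega_e, omega_e_x_e in MODES:
    anharmonic = fundamental(omega_e, omega_e_x_e)
    scaled = UNIFORM_SCALE * omega_e
    ideal_scale = anharmonic / omega_e
    print(f"{label:28s} anharmonic {anharmonic:7.1f}  "
          f"scaled harmonic {scaled:7.1f}  ideal factor {ideal_scale:.3f}")
```

In this model the factor needed to reproduce the anharmonic fundamental differs between the two modes, so any uniform factor is a compromise. That mode dependence is what explicit anharmonic treatments aim to recover.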
4. First-principles calculations on peptides
(a) Peptide terminology
Amino acids are the building blocks of peptides and proteins. An amino acid consists
of a central α carbon, an amino (NH2) group, a carboxyl (COOH) group, and a
distinctive R group, often called the side chain (for reasons mentioned below). In
solution, and at neutral pH, amino acids predominantly occur in their zwitterionic
forms, containing a protonated amino group (NH3+) and a deprotonated carboxyl
group (COO−). In peptides and proteins, amino acids are commonly called ‘residues’.
They are linked by peptide bonds: the carboxyl group of one amino acid is joined
to the amino group of another amino acid under elimination of a water molecule
(see figure 4). The amino group of the first amino acid in a polypeptide chain,
and the carboxyl group of the last amino acid, remain intact. By convention, the
chain extends from the ‘amino (or N) terminus’ to the ‘carboxyl (or C) terminus’.
The successive peptide bonds generate the ‘main chain’ or ‘backbone’ of the peptide,
which consists of repeating NH–CαH–C=O units. Thus, an n-residue peptide consists
of a main chain and n side chains.
The peptide group, (C=O)–N–H, is planar and rigid, but the peptide main chain
has flexibility through the two bonds that flank the peptide bond, the Cα–C and
N–Cα bonds. The dihedral angles that define the orientation of these bonds, δNCαCN
and δCNCαC (ψ and φ, respectively), can be plotted against each other to produce a so-called
‘Ramachandran plot’ (Ramachandran et al. 1963). It turns out that many angle
combinations almost never occur because they would produce collisions between
different parts of the peptide (the exception is glycine, which, with only a hydrogen
as its side chain, can adopt conformations that are forbidden for other residues).
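The two Ramachandran angles are ordinary dihedral angles and can be computed from four atomic positions. The sketch below uses the common atan2 formulation; the coordinates in the example are toy points chosen to give simple angles, not atoms from a real peptide, and the helper names are introduced here for illustration only.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometry: eclipsed (0 degrees) and perpendicular (90 degrees) arrangements.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For residue i, φ would be obtained by passing the positions of C(i-1), N(i), Cα(i) and C(i), and ψ by passing N(i), Cα(i), C(i) and N(i+1), matching the atom sequences given above.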
(b) The dimensionality problem
One of the major problems of computational studies of peptides is their flexibility,
resulting in a large number of possible conformations. Even if one ignores the
flexibility of the side chains (which is reasonable only for glycine and alanine residues,
which contain as their side chain just a hydrogen or methyl group, respectively), and
takes into account only three different positions of the Ramachandran angles
φ and ψ, then for an octapeptide one still needs to consider $3^{14} = 4\,782\,969$ different conformations. This dimensionality problem can be overcome to some extent by
realizing that most polypeptide chains fold into regular periodic structures. These
recurring structural motifs are called secondary structures. Some common secondary
structure elements include the α helix, β pleated sheets and the β turn. Protein
folding can be regarded as a process in which first secondary structure elements
are formed before the protein assembles into its complete three-dimensional structure (Karplus & Weaver 1994). A detailed understanding of the properties of these
secondary structures is therefore paramount to understanding protein folding.
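The growth of the conformational space with chain length can be tabulated directly. The sketch below assumes that an n-residue peptide has 2n - 2 Ramachandran angles to vary. That count is an assumption chosen because it reproduces the 14 angles implied by the octapeptide figure above; it is not stated in the text.

```python
POSITIONS_PER_ANGLE = 3  # three positions per Ramachandran angle, as in the text


def backbone_conformations(n_residues):
    """Number of backbone conformations under the ASSUMED count of
    2 * n_residues - 2 Ramachandran angles (14 for an octapeptide)."""
    n_angles = 2 * n_residues - 2
    return POSITIONS_PER_ANGLE ** n_angles


for n in (2, 4, 8, 10, 17):
    print(f"{n:2d} residues: {backbone_conformations(n):,} conformations")
```

Each additional residue multiplies the count by nine under this assumption, which is why restricting the search to known secondary-structure motifs is so effective.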
(c) Studies on helical peptides
Until just a few years ago, first-principles calculations on peptides focused
mainly on elucidating conformational preferences of small peptides (dipeptides and
tripeptides), as reviewed by Császár & Perczel (1999). However, researchers have
started to do calculations on unprecedentedly large systems, thereby advancing the
boundaries of first-principles quantum chemistry. With HF, calculations on complete
proteins are becoming feasible. Van Alsenoy et al. (1998) performed an HF
geometry optimization of crambin, a 46-residue protein containing 642 atoms. Very
recently, Zhang et al. (2003) computed the interaction energy of the streptavidin–biotin
complex, containing 1775 atoms, at the HF level of theory. This calculation
was made possible by using a new computational technique, MFCC (molecular fractionation with conjugate caps). Other researchers have used the (more expensive)
DFT method to study helical peptides containing 10 (Elstner et al. 2000; Topol et
al. 2001) or 17 (Wieczorek & Dannenberg 2003) alanine residues. In addition to traditional DFT, the study by Elstner et al. (2000) also used the self-consistent charge
tight binding scheme (SCC-DFTB), which can be viewed as an approximation to
DFT, which enabled them to study 20-residue alanine-based peptides with inclusion
of as many as 38 water molecules.
Many studies have focused on α helical peptides because the α helix is the
most common secondary structure motif in peptides. Most of these studies concentrate on alanine-rich peptides, as alanine has one of the highest helix propensities
(Chakrabartty & Baldwin 1995; Marqusee et al. 1989; O’Neil & DeGrado 1990). Alanine peptides also have a slight advantage for theoretical studies, because alanine,
with a methyl group as its side chain, is one of the simplest amino acids. Studies on
alanine-based peptides are nevertheless computationally demanding. α-helices form
hydrogen bonds between the C=O group of the nth residue and the NH group of the
(n + 4)th residue, and thus, at least five residues are needed to form one α helical
turn. Studies on α helical peptides therefore need to consider peptides containing at
least five amino acid residues. However, calculations on alanine-based peptides show
that, in vacuum, short peptides adopt the more tightly coiled 3₁₀ helical structure.
3₁₀ helices form hydrogen bonds between the nth and (n + 3)th amino acid residues.
Longer peptide chains, and inclusion of solvent effects, are needed to stabilize the
α helix (Elstner et al. 2000), increasing the complexity of the problem.
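The two hydrogen-bonding patterns described above differ only in the residue offset, which makes the minimum peptide length easy to check. The sketch below lists the residue pairs for each helix type; it counts residues only and ignores terminal capping groups, which is a simplification made for this illustration.

```python
def helix_hydrogen_bonds(n_residues, offset):
    """Residue pairs (i, i + offset) joined by a C=O...H-N hydrogen bond.
    offset = 4 for an alpha helix, 3 for a 3-10 helix."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


for n in (4, 5, 10):
    alpha = helix_hydrogen_bonds(n, 4)
    three_ten = helix_hydrogen_bonds(n, 3)
    print(f"{n:2d} residues: {len(alpha)} alpha-helical bonds {alpha}, "
          f"{len(three_ten)} 3-10 bonds {three_ten}")
```

A four-residue peptide can form one 3₁₀-type bond but no α-helical bond, while five residues allow the first α-helical bond, consistent with the five-residue minimum stated above.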
The inclusion in the peptide structure of residues more complex than glycine or alanine may also complicate the calculations. The larger side chains of these residues add
to the dimensionality problem as they exhibit flexibility, and their hydrogen-bonding
capabilities may affect the stability of the different structural motifs. HF calculations on isolated AKAAA-AKAAA (A = alanine, K = lysine) peptides indicate that
lysine’s charged side chain (CH2CH2CH2CH2NH3+) tends to bind to the C-terminus
of the peptide, thereby distorting the α-helical structure. Figure 5a shows how the
side chain of the second lysine curves to enable hydrogen bonding to the carboxyl of
the C-terminus. Increasing the peptide length improves the stability of the α helix,
as indicated by the only slightly distorted α-helical structure of AKAAA-AKAAA-AKAAA (see figure 5b). The inclusion of water molecules around the charged NH3+
group of lysine, as well as around the carboxyl group of the C-terminus, prevents the
lysine side chain from hydrogen-bonding to the C-terminal carboxyl group, thereby reducing
the distortion from the ideal α-helical structure (figure 5c).
5. First-principles calculations on DNA fragments
DNA (deoxyribonucleic acid) is one of the most important biomolecules, as it encodes
the genetic information in the nucleus of cells. The basic structural units of DNA
are nucleotides, which consist of a deoxyribose sugar, a phosphate group, and a
base. The most common and well-known form of DNA is the double helix, whose
three-dimensional structure was deduced by Watson, Crick (Watson & Crick
1953), Franklin and Wilkins. This discovery, the 50th anniversary of which was widely
celebrated last year, won Watson, Crick and Wilkins the 1962 Nobel Prize in Physiology
or Medicine. In the DNA double helix, two helical nucleotide chains, coiled around
a common axis, are held together by hydrogen bonding between the bases on opposite
strands (see figure 6). DNA contains four different bases: thymine, cytosine, adenine
and guanine. Adenine pairs with thymine, and guanine pairs with cytosine.
Other forms of DNA are known to exist, however. For example, four-stranded
DNA structures occur in telomeric DNA. Telomeres are the specialized ends of chromosomes that protect these ends from recombination and from being recognized as
Figure 5. HF/3-21G optimized structures of (AKAAA)ₙ peptides. (a) AKAAA. (b) AKAAA-AKAAA-AKAAA. The view down the helical spiral shows the α-helical nature of the structure. (c) AKAAA-AKAAA + 10H₂O. The water molecules prevent the charged lysine side chain from
hydrogen-bonding to the C-terminus.
Figure 6. The DNA double helix.
Figure 7. Crystal structure of the potassium form of an Oxytricha nova G-quadruplex.
Figure 8. Guanine tetrad structures optimized with B3LYP/6-311++G**.
(a) Bifurcated hydrogen-bonded G-tetrad; (b) G·G hydrogen-bonded G-tetrad.
damaged DNA. Telomeric DNA contains guanine-rich segments, which can fold into
four-stranded G-quadruplex structures (see figure 7). These quadruplexes result from
stacking of guanine tetrads. Cations (NH₄⁺ or monovalent metal cations), located
either in the tetrad cavity or between successively stacked tetrads, are essential for
structural integrity of the DNA quadruplex. Recently, there has been considerable
interest in these G-quadruplexes, not least because of their potential use in the development of anti-cancer drugs (Neidle & Parkinson 2002). The DNA G-quadruplex has
also attracted the interest of computational researchers, who have so far focused on
the guanine tetrad and the cation-tetrad complex. First-principles calculations on
systems of this size are computationally very demanding, and have only recently
become feasible. The first HF and DFT calculations on the stability and structure
of the G-tetrad were presented in 1999 (Gu et al. 1999). As the crystal structures
of DNA quadruplexes show near-coplanar geometries for the guanine tetrads, these
calculations were performed for the planar, C₄ₕ-symmetric, structure of the guanine
tetrad.
Gu and co-workers (Gu & Leszczynski 2000; Gu et al. 1999) found that, whereas
the cation-containing G-tetrad adopts the G·G N1-carbonyl, N7-amino hydrogen-bonded structure (in agreement with the crystal structures of the DNA quadruplexes), in the optimized geometry of the bare G-tetrad the guanine monomers are
held together by bifurcated hydrogen bonds. Calculations performed in my group
show that both the G·G hydrogen-bonded and bifurcated hydrogen-bonded C₄ₕ-symmetric tetrads (see figure 8) can be located (van Mourik & Dingley 2005); however, these structures are transition states, not true minima, on the DFT/B3LYP
potential energy surface (the true minimum is a twisted S₄-symmetric structure).
Meyer et al. (2001) have shown that ions with a small radius (Li⁺, Be²⁺, Cu⁺ and
Zn²⁺) cause non-planarity of the complex, which may prevent stacking of the G-tetrads. K⁺ is too large to fit in the central cavity and this ion therefore prefers to
be located between successive tetrads in the G-quadruplex.
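The distinction drawn above between transition states and true minima is usually settled by a harmonic frequency calculation at the optimized geometry: a minimum has no imaginary frequencies, a transition state has exactly one. The Python below is a minimal sketch of that bookkeeping, not the procedure used in the cited work. The function name `classify_stationary_point` is hypothetical, imaginary frequencies are assumed to be supplied as negative numbers, translations and rotations are assumed to have been removed already, and the values in the usage example are arbitrary placeholders rather than results for the G-tetrad.

```python
def classify_stationary_point(frequencies_cm1, tol=0.0):
    """Classify a stationary point from its harmonic frequencies (cm^-1).

    Assumption of this sketch: imaginary frequencies are passed as
    negative numbers, and the list holds vibrational modes only.
    """
    # Count modes along which the energy decreases (imaginary frequencies).
    n_imag = sum(1 for f in frequencies_cm1 if f < -tol)
    if n_imag == 0:
        return "minimum"
    if n_imag == 1:
        return "transition state"
    return f"saddle point of order {n_imag}"


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not computed results.
    print(classify_stationary_point([35.0, 120.0, 1650.0]))
    print(classify_stationary_point([-42.0, 120.0, 1650.0]))
```

Following the imaginary mode downhill from a symmetric structure is one common way to reach a lower-symmetry minimum, such as the twisted structure mentioned above.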
6. Future prospects
Computational quantum chemistry is a rapidly developing field. This is partly due
to the development and implementation of innovative first-principles methods. These
include more efficient approaches to existing methods, such as linear scaling algorithms (Goedecker & Scuseria 2003), as well as hybrid (QM/MM) methods based on
the combination of classical (molecular mechanics (MM)) and quantum mechanics
(QM) methodologies (Morokuma 2002). Promising new techniques that treat anharmonicity and quantum effects to calculate free energies of biomolecular systems,
which are required at temperatures above 0 K, are being developed. Whereas diffusion quantum Monte Carlo (DQMC) techniques (Benoit & Clary 2000; Clary 2001;
van Mourik et al. 2001) yield the vibrational ground state (at 0 K) of a molecular system, the torsional path integral Monte Carlo (TPIMC) technique (Miller &
Clary 2002) does account for temperature effects. The construction of a global ab
initio potential energy surface is not yet feasible for biomolecules larger than glycine
(Miller & Clary 2004), and thus both DQMC and TPIMC currently rely on force
fields to calculate the potential energy surface of biomolecular systems. A second
cause of the increasing capability of quantum chemistry is the advance in computer technology. Moore (1965) observed that the number of components on an integrated circuit doubles at regular intervals, an observation dubbed ‘Moore’s law’ and popularly restated as
the speed of computers doubling roughly every 18 months. Furthermore, the
dawn of parallel computer architectures and ‘Grid technology’ developments (Foster
& Kesselman 1999) increases the available computational power. However, the rapid increase in central-processing-unit speed and novel architectures also
comes with huge challenges for scientists and engineers (Dunning et al. 2002), as
algorithms and programs need to be rewritten and parallelized to make full use of
the computational resources.
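A rough worked example shows why algorithmic scaling matters as much as hardware. The Python below is a back-of-the-envelope sketch under stated assumptions: hardware speed doubles every 18 months (the figure quoted above), and the cost of a method grows as N^k with system size N. The exponents used, k = 1 for linear scaling, k = 4 for formal Hartree-Fock scaling and k = 5 for formal MP2 scaling, are textbook formal values chosen for illustration; the function names are hypothetical.

```python
def speedup(months, doubling_months=18.0):
    """Hardware speed-up factor after `months`, assuming one doubling
    every `doubling_months` months."""
    return 2.0 ** (months / doubling_months)


def size_gain(months, scaling_exponent, doubling_months=18.0):
    """Factor by which the tractable system size N grows at fixed wall
    time when the cost scales as N ** scaling_exponent."""
    return speedup(months, doubling_months) ** (1.0 / scaling_exponent)


if __name__ == "__main__":
    # Formal scaling exponents, used here as illustrative assumptions.
    methods = (("linear scaling", 1), ("HF, formal N^4", 4), ("MP2, formal N^5", 5))
    for label, k in methods:
        print(f"{label}: size grows x{size_gain(36, k):.2f} in 36 months")
```

Under these assumptions, three years of hardware progress quadruples the system size a linear-scaling method can treat, but increases it by only about 40% for a method with N^4 cost.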
The future potential of first-principles quantum chemistry can hardly be overestimated. I expect a dramatic increase in the application of first-principles quantum chemistry in the life sciences. This may lead to improved methodologies for
drug development as well as an increased understanding of complex processes such
as protein folding. As outlined in this paper, small biomolecular systems (such as
neurotransmitters, DNA fragments and peptides) are already being studied using
ab initio and DFT methods, and papers appear in the literature reporting first-principles computational research on ever larger systems. With more powerful computers and more efficient computer algorithms, electronic-structure calculations on
complete proteins and neurotransmitter–receptor systems will be feasible. I believe
that research will become more interdisciplinary, with experimental and theoretical researchers working side by side on the same projects. In addition, despite the
complexity of quantum chemical methods, computer programs are becoming more
and more ‘user friendly’, inviting non-theoreticians to supplement their experimental
research with complementary results from computer simulations.
I gratefully acknowledge The Royal Society for their support under the University Research
Fellowship scheme. I thank my collaborators in Oxford (Professor John P. Simons and Dr Lavina C. Snoek) and at University College London (Dr Andrew J. Dingley) for joint experimental/theoretical research and stimulating discussions.
References
Alagona, G. & Ghio, C. 2002 Interplay of intra- and intermolecular H-bonds for the addition
of a water molecule to the neutral and N-protonated forms of noradrenaline. Int. J. Quant.
Chem. 90, 641–656.
Becke, A. D. 1993 Density functional thermochemistry. 3. The role of exact exchange. J. Chem.
Phys. 98, 5648.
Benoit, D. M. 2004 Fast vibrational self-consistent field calculations through a reduced mode-mode coupling scheme. J. Chem. Phys. 120, 562–573.
Benoit, D. M. & Clary, D. C. 2000 Quaternion formulation of diffusion Monte Carlo for the
rotation of rigid molecules in clusters. J. Chem. Phys. 113, 5193–5202.
Bowman, J. M. 1978 Self-consistent field energies and wavefunctions for coupled oscillators. J.
Chem. Phys. 68, 608–610.
Butz, P., Kroemer, R. T., MacLeod, N. A., Robertson, E. G. & Simons, J. P. 2001a Conformational preferences of neurotransmitters: norephedrine and the adrenaline analogue, 2-methylamino-1-phenylethanol. J. Phys. Chem. A 105, 1050–1056.
Butz, P., Kroemer, R. T., MacLeod, N. A. & Simons, J. P. 2001b Conformational preferences
of neurotransmitters: ephedrine and its diastereoisomer, pseudoephedrine. J. Phys. Chem.
A 105, 544–551.
Butz, P., Kroemer, R. T., MacLeod, N. A. & Simons, J. P. 2002 Hydration of neurotransmitters: a
spectroscopic and computational study of ephedrine and its diastereoisomer pseudoephedrine.
Phys. Chem. Chem. Phys. 4, 3566–3574.
Carney, J. R. & Zwier, T. S. 2000 The infrared and ultraviolet spectra of individual conformational isomers of biomolecules: tryptamine. J. Phys. Chem. A 104, 8677–8688.
Carney, J. R. & Zwier, T. S. 2001 Conformational flexibility in small biomolecules: tryptamine
and 3-indole-propionic acid. Chem. Phys. Lett. 341, 77–85.
Chakrabartty, A. & Baldwin, R. L. 1995 Stability of α-helices. Adv. Protein Chem. 46, 141–176.
Clary, D. C. 2001 Torsional diffusion Monte Carlo: a method for quantum simulations of proteins.
J. Chem. Phys. 114, 9725–9732.
Császár, A. G. & Perczel, A. 1999 Ab initio characterization of building units in peptides and
proteins. Prog. Biophys. Mol. Biol. 71, 243–309.
Dunning Jr, T. H., Harrison, R. J., Feller, D. & Xantheas, S. S. 2002 Promise and challenge
of high-performance computing, with examples of molecular modelling. Phil. Trans. R. Soc.
Lond. A 360, 1089–1105.
Elstner, M., Jalkanen, K. J., Knapp-Mohammady, M., Frauenheim, T. & Suhai, S. 2000 DFT
studies on helix formation in N-acetyl-(L-alanyl)ₙ-N′-methylamide for n = 1–20. Chem. Phys.
256, 15–27.
Foster, I. & Kesselman, C. (eds) 1999 The Grid: blueprint for a new computing infrastructure.
San Mateo, CA: Morgan Kaufmann.
Gerber, R. B., Chaban, G. M., Gregurick, S. K. & Brauer, B. 2003 Vibrational spectroscopy
and the development of new force fields for biological molecules. Biopolymers 68, 370–382.
Goedecker, S. & Scuseria, G. 2003 Linear scaling electronic structure methods in chemistry and
physics. Comput. Sci. Engng 5, 14–21.
Graham, R. J., Kroemer, R. T., Mons, M., Robertson, E. G., Snoek, L. C. & Simons, J. P. 1999
Infrared ion-dip spectroscopy of a noradrenaline analogue: hydrogen bonding in 2-amino-1-phenylethanol and its singly hydrated complex. J. Phys. Chem. A 103, 9706–9711.
Gu, J. & Leszczynski, J. 2000 A remarkable alteration in the bonding pattern: an HF and DFT
study on the interactions between the metal cations and the Hoogsteen hydrogen-bonded
G-tetrad. J. Phys. Chem. A 104, 6308–6313.
Gu, J., Leszczynski, J. & Bansal, M. 1999 A new insight into the structure and stability of
Hoogsteen hydrogen-bonded G-tetrad: an ab initio SCF study. Chem. Phys. Lett. 311, 209–214.
Hohenberg, P. & Kohn, W. 1964 Inhomogeneous electron gas. Phys. Rev. A 136, 864–871.
Jung, J. O. & Gerber, R. B. 1996 Vibrational wave functions and spectroscopy of (H₂O)ₙ,
n = 2, 3, 4, 5: vibrational self-consistent field with correlation corrections. J. Chem. Phys.
105, 10 332–10 348.
Karplus, M. & Weaver, D. L. 1994 Protein folding dynamics: the diffusion-collision model and
experimental data. Protein Sci. 3, 650–668.
Levy, D. H. 1980 Laser spectroscopy of cold gas-phase molecules. A. Rev. Phys. Chem. 31,
197–225.
Macleod, N. A. & Simons, J. P. 2004 Neurotransmitters in the gas phase: infrared spectroscopy
and structure of protonated ethanolamine. Phys. Chem. Chem. Phys. 6, 2821–2826.
Macleod, N. A., Robertson, E. G. & Simons, J. P. 2003 Hydration of neurotransmitters: a computational and spectroscopic study of a noradrenaline analogue, 2-amino-1-phenyl-ethanol.
Mol. Phys. 101, 2199–2210.
Marqusee, S., Robbins, V. H. & Baldwin, R. L. 1989 Unusually stable helix formation in short
alanine-based peptides. Proc. Natl Acad. Sci. USA 86, 5286–5290.
Meyer, M., Steinke, T., Brandl, M. & Sühnel, J. 2001 Density functional study of guanine and
uracil quartets and of guanine quartet/metal ion complexes. J. Computat. Chem. 22, 109–124.
Miller III, T. F. & Clary, D. C. 2002 Torsional path integral Monte Carlo method for the
quantum simulation of large molecules. J. Chem. Phys. 116, 8262–8269.
Miller III, T. F. & Clary, D. C. 2004 Quantum free energies of the conformers of glycine on an
ab initio potential energy surface. Phys. Chem. Chem. Phys. 6, 2563–2571.
Møller, C. & Plesset, M. S. 1934 Note on an approximation treatment for many-electron systems.
Phys. Rev. 46, 618–622.
Moore, G. E. 1965 Cramming more components onto integrated circuits. Electronics 38, 114–
117.
Morokuma, K. 2002 New challenges in quantum chemistry: quests for accurate calculations for
large molecular systems. Phil. Trans. R. Soc. Lond. A 360, 1149–1164.
Nagy, P. I., Alagona, G., Ghio, C. & Takács-Novák, K. 2003 Theoretical conformational analysis
for neurotransmitters in the gas phase and in aqueous solution: norepinephrine. J. Am. Chem.
Soc. 125, 2770–2785.
Neidle, S. & Parkinson, G. 2002 Telomere maintenance as a target for anticancer drug discovery.
Nat. Rev. Drug. Discov. 1, 383–393.
Olsen, J., Christiansen, O., Koch, H. & Jørgensen, P. 1996 Surprising cases of divergent behavior
in Møller–Plesset perturbation theory. J. Chem. Phys. 105, 5082–5090.
O’Neil, K. T. & DeGrado, W. F. 1990 A thermodynamic scale for the helix-forming tendencies
of the commonly occurring amino acids. Science 250, 646–651.
Ramachandran, G. N., Sasisekharan, V. & Ramakrishnan, C. J. 1963 Stereochemistry of
polypeptide chain configurations. J. Mol. Biol. 7, 95–99.
Robertson, E. G. & Simons, J. P. 2001 Getting into shape: conformational and supramolecular
landscapes in small biomolecules and their hydrated clusters. Phys. Chem. Chem. Phys. 3,
1–18.
Scott, A. P. & Radom, L. 1996 Harmonic vibrational frequencies: an evaluation of Hartree–Fock, Møller–Plesset, quadratic configuration interaction, density functional theory, and semiempirical scale factors. J. Phys. Chem. 100, 16 502–16 513.
Smalley, R. E., Ramakrishna, B. L., Levy, D. H. & Wharton, L. 1974 Laser spectroscopy of
supersonic molecular beams: application to the NO₂ spectrum. J. Chem. Phys. 61, 4363–
4364.
Snoek, L. C., van Mourik, T., Carçabal, P. & Simons, J. P. 2003a Neurotransmitters in the gas
phase: hydrated noradrenaline. Phys. Chem. Chem. Phys. 5, 4519–4526.
Snoek, L. C., van Mourik, T. & Simons, J. P. 2003b Neurotransmitters in the gas phase: a
computational and spectroscopic study of noradrenaline. Mol. Phys. 101, 1239–1248.
Topol, I. A., Burt, S. K., Deretey, E., Tang, T.-H., Perczel, A., Rashin, A. & Csizmadia, A. G.
2001 α- and 3₁₀-helix interconversion: a quantum-chemical study on polyalanine systems in
the gas phase and in aqueous solvent. J. Am. Chem. Soc. 123, 6054–6060.
Vaidehi, N., Floriano, W. B., Trabanino, R., Hall, S. E., Freddolino, P., Choi, E. J., Zamanakos,
G. & Goddard III, W. A. 2002 Prediction of structure and function of G protein-coupled
receptors. Proc. Natl Acad. Sci. USA 99, 12 622–12 627.
van Alsenoy, C., Yu, C.-H., Peeters, A., Martin, J. M. L. & Schäfer, L. 1998 Ab initio geometry
determination of proteins. I. Crambin. J. Phys. Chem. A 102, 2246–2251.
van Mourik, T. 2004 The shape of neurotransmitters in the gas phase: a theoretical study of
adrenaline, pseudoadrenaline, and hydrated adrenaline. Phys. Chem. Chem. Phys. 6, 2827–
2837.
van Mourik, T. & Dingley, A. J. 2005 (In preparation.)
van Mourik, T. & Emson, L. E. V. 2002 A theoretical study of the conformational landscape of
serotonin. Phys. Chem. Chem. Phys. 4, 5863–5871.
van Mourik, T. & Früchtl, H. A. 2005 The potential energy landscape of noradrenaline. An
electronic structure study. (Submitted.)
van Mourik, T. & Gdanitz, R. J. 2002 A critical note on density functional theory studies on
rare-gas dimers. J. Chem. Phys. 116, 9620–9623.
van Mourik, T., Price, S. L. & Clary, D. C. 2001 Diffusion Monte Carlo simulations on uracil–
water using an anisotropic atom–atom potential model. Faraday Disc. 118, 95–108.
Watson, J. D. & Crick, F. H. C. 1953 A structure for deoxyribose nucleic acid. Nature 171,
737–738.
Wieczorek, R. & Dannenberg, J. J. 2003 H-bonding cooperativity and energetics of α-helix
formation of five 17-amino acid peptides. J. Am. Chem. Soc. 125, 8124–8129.
Zhang, D. W., Xiang, Y. & Zhang, J. Z. H. 2003 New advance in computational chemistry: full
quantum mechanical ab initio computation of streptavidin–biotin interaction energy. J. Phys.
Chem. B 107, 12 039–12 041.
Zwier, T. S. 2001 Laser spectroscopy of jet-cooled biomolecules and their water-containing clusters: water bridges and molecular conformation. J. Phys. Chem. A 105, 8827–8839.
AUTHOR PROFILE
Tanja van Mourik
Born in 1966 in Vlissingen, a small city on the Dutch coast, Tanja van Mourik studied
chemistry at the University of Utrecht, the Netherlands, with a nine-month intermezzo at the Ruhr University of Bochum, Germany. She returned to Utrecht in July
1989, where in the same year she graduated cum laude. She obtained her PhD in
the field of theoretical chemistry from the University of Utrecht in 1994, after which
she spent three years as a Postdoctoral Associate at the Pacific Northwest National
Laboratory in Richland, WA, USA. In 1997 she came to the UK to take up a postdoctoral research position at University College London. She was awarded a Royal
Society University Fellowship in 2000, and has since been working as a University
Research Fellow at the Department of Chemistry, University College London. Her
research interests include the accurate quantum chemical computation of molecular
properties in general, and more specifically the application of these methods to the
study of small molecules of biological interest.