
Abstract

Unbiased parallel tempering Monte Carlo simulations of a 49-residue protein, starting from random conformations, reveal a non-trivial path followed by the molecule to the native state. The molecule (PDB id: 2GJH) consists of an α-helix and a three-stranded β-sheet, in which two of the adjacent strands straddle the other secondary structure elements along the sequence. In the course of folding, one of the strands making sequence non-local contacts is seen to be "cached" as a non-native extension of the native α-helix. After the other secondary structure elements have formed and assembled in their proper tertiary arrangement, the cached segment is released and changes its secondary structure to a strand as it attaches to a β-hairpin to complete the native structure. The study is based on a physics-based, implicit-water, all-atom interaction potential called the Lund force field.

Introduction

While many protein structures contain β-sheets with complex arrangements of β-strands, the folding mechanisms giving rise to such structures are largely unclear. Most successful folding simulations to date have been with α-helical proteins, which are dominated by sequence-local interactions. In an α-helix, the hydrogen bonds are formed between residues i and i + 4 along the sequence, which, even for the most stretched-out conformation of the chain, are in the spatial neighbourhood of each other. Local interactions quickly lead such proteins into their folded structures. While β-sheets consisting of one or more sequence-adjacent β-hairpins require somewhat longer-range contacts along the sequence, they are still local structures. Typically they arise through a zipper-like mechanism starting from the turn regions and sequentially forming hydrogen bonds away from the turns. An important feature of these sequence-local structures is that they do not interfere with the formation of other similar structures elsewhere along the chain.

When two neighbouring strands in a β-sheet come from regions of the sequence separated by a large number of residues, formation of contacts between them is no longer independent of the structure of the intervening segment. Premature formation of such contacts creates large steric barriers and can hinder the proper folding of other secondary structure elements. The protein is then led into a deep local minimum and can only fold by first breaking the prematurely formed long-distance native β-contacts. Such considerations suggest that proteins with complex β-sheets should on average fold more slowly than α-helical proteins. This is indeed consistent with experimental observations. The so-called "contact order", the average sequence separation of residues in contact, bears a striking correlation with the folding rate over a huge range of folding rates. β-sheets with complex strand arrangements have high contact orders, and are seen to fold slowly. But is it possible that some proteins with relatively high contact order might have evolved tricks to avoid the deep local minima and fold much faster than others of the same complexity? Plots of the folding rates versus

Methods

Our model represents all atoms of the protein chains, including all hydrogen atoms, but treats the solvent implicitly through an effective interaction potential. The model assumes constant bond lengths and bond angles, and peptide-bond torsion angles fixed at 180 degrees. Each protein molecule has only the Ramachandran backbone torsion angles and the side-chain torsion angles as its degrees of freedom. The effective force field contains terms to account for excluded-volume repulsion, local backbone electrostatics, hydrogen bonds and hydrophobic interactions. The model and the force field have been described in detail elsewhere (Refs. 1, 2), where we also show that the force field describes the folding and thermodynamics of a range of short peptides with both α-helical and β-sheet structures. Sampling of protein conformations is carried out using replica exchange Monte Carlo techniques with 32 replicas. For this work, we have used the protein folding software package PROFASI (Ref. 3), version 1.1.2. The results presented here are based on 1.4 × 10^10 elementary Monte Carlo updates of the protein chain per replica. All simulations were initialised with random values for all degrees of freedom and different random number seeds.
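The exchange step of replica exchange (parallel tempering) Monte Carlo can be illustrated with a minimal sketch. This is a generic textbook version and not the PROFASI implementation: the function names, the geometric temperature ladder and its range (in arbitrary units) are assumptions made for illustration only. Only the number of replicas, 32, is taken from the text.

```python
import math
import random


def geometric_ladder(t_min, t_max, n):
    """Temperatures spaced geometrically between t_min and t_max (assumed layout)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** k for k in range(n)]


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance for exchanging the conformations of two replicas:
    min(1, exp((beta_i - beta_j) * (E_i - E_j)))."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(betas, energies, rng, offset=0):
    """Try to exchange neighbouring replicas (k, k+1) for k = offset, offset+2, ...

    energies[k] is the energy of the conformation currently held at inverse
    temperature betas[k]. Accepted swaps exchange the entries in place.
    Returns the list of indices k at which a swap was accepted.
    """
    swapped = []
    for k in range(offset, len(betas) - 1, 2):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            swapped.append(k)
    return swapped


if __name__ == "__main__":
    rng = random.Random(1)                       # fixed seed for reproducibility
    temps = geometric_ladder(0.5, 1.0, 32)       # hypothetical range, 32 replicas
    betas = [1.0 / t for t in temps]
    energies = [rng.gauss(-50.0, 10.0) for _ in temps]  # placeholder energies
    print(attempt_swaps(betas, energies, rng, offset=0))
```

In a full simulation, each replica would perform ordinary Monte Carlo updates of the torsion angles at its own temperature between exchange attempts, and the `offset` would typically alternate between 0 and 1 so that every neighbouring pair is tried.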

Results

The simulated molecule folds to the native state with a backbone RMSD (all residues) of about 1.8 Å. The global energy minimum found in the simulations has a backbone RMSD of 1.7 Å, and is shown superimposed on the PDB structure in Fig. 1. This minimum energy structure shares all the hydrogen bonds and Cα contacts with the native state.
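One way to quantify how many Cα contacts a simulated structure shares with the native state is sketched below. The 6.0 Å distance cutoff, the minimum sequence separation of 3, the function names and the toy coordinates are all assumptions for illustration; the contact definition actually used in the study is not stated here.

```python
import math


def ca_contacts(coords, cutoff=6.0, min_separation=3):
    """Set of residue index pairs (i, j), j >= i + min_separation, whose
    C-alpha atoms are closer than `cutoff` (assumed to be in angstroms)."""
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


def native_contact_fraction(model, native, cutoff=6.0, min_separation=3):
    """Fraction of the native C-alpha contacts that are also present in the model."""
    native_set = ca_contacts(native, cutoff, min_separation)
    if not native_set:
        raise ValueError("native structure has no contacts under this definition")
    model_set = ca_contacts(model, cutoff, min_separation)
    return len(native_set & model_set) / len(native_set)


# Toy 8-residue "hairpin" and a fully extended chain (hypothetical coordinates).
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
           (11.4, 5.0, 0.0), (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
extended = [(3.8 * k, 0.0, 0.0) for k in range(8)]

if __name__ == "__main__":
    print(native_contact_fraction(hairpin, hairpin))
    print(native_contact_fraction(extended, hairpin))
```

A value of 1 means every native contact is reproduced, which is the situation described above for the minimum energy structure.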

Figure 1

Comparison of the global energy minimum (colour) with the PDB structure 2GJH.pdb (grey).

contact orders suggest such a possibility, as there is a large fluctuation in the folding rates for proteins with intermediate contact orders. The exact nature of such fold-accelerating mechanisms is unknown. From all-atom Monte Carlo simulations of a 49-residue protein, Top7-CFr (PDB id: 2GJH, residues 2-50), with both helical and β-sheet structures and a non-trivial β-sheet geometry, we propose one possible mechanism.
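The contact order discussed here, the average sequence separation of residues in contact, is simple to compute once a contact list is available. The following is a minimal sketch: the function name and the two contact lists are hypothetical and are not derived from 2GJH. Dividing by the chain length gives the relative contact order, a commonly used normalisation.

```python
def contact_order(contacts, chain_length=None):
    """Average sequence separation |i - j| over a list of residue contacts.

    If chain_length is given, the result is divided by it, giving the
    relative contact order.
    """
    if not contacts:
        raise ValueError("need at least one contact")
    mean_separation = sum(abs(i - j) for i, j in contacts) / len(contacts)
    if chain_length:
        return mean_separation / chain_length
    return mean_separation


# Hypothetical contact lists for illustration only.
helix_like = [(i, i + 4) for i in range(1, 20)]               # only i, i+4 contacts
sheet_like = helix_like[:5] + [(2, 45), (4, 43), (6, 41)]     # plus long-range pairs

if __name__ == "__main__":
    print(contact_order(helix_like))
    print(contact_order(sheet_like))
    print(contact_order(sheet_like, chain_length=49))
```

A chain with only helical i, i + 4 contacts has a low contact order, while adding a few contacts between distant strands raises it sharply, which is why complex β-sheets score high on this measure.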

More interesting than the fact that the molecule folds is the observed manner of formation of β-sheet contacts between the N-terminal strand and the C-terminal hairpin (Ref. 4). When the molecule folds from random conformations, the first structures to emerge are the native helix and the C-terminal hairpin. These are strong structural elements consisting of sequence-local contacts, and they fold to the same structures as excised segments in simulations. But we observe that the N-terminal strand initially folds simply as a continuation of the native helix. β-strands are stabilised by inter-strand interactions. Therefore, initially, when the C-terminal hairpin is absent, the N-terminal region does not have any stabilising interactions as a β-strand. The helix, which forms first, provides a good template for the N-terminal strand residues, and absorbs them. The helix, even with its non-native extension, is a structure that folds and unfolds easily. The β-hairpin forms independently, and subsequently makes hydrophobic contacts with the helix. Upon the formation of hydrophobic contacts between the helix and the hairpin, both structures are stabilised. The non-native extension of the helix, containing the N-terminal strand residues, does not benefit from the hydrophobic contacts with the hairpin and eventually unfolds. Unlike the situation for an entirely unfolded molecule, when the N-terminal residues are freed with both the helix and hairpin in place, they do have other β-strands to bind to, and binding to them turns out to be lower in energy. Hence, the N-terminal strand residues join the hairpin with a larger probability.

Conclusions

Using all-atom Monte Carlo simulations starting from random initial conformations, we find that the molecule Top7-CFr folds to within 1.7 Å of its native state, following a non-trivial folding pathway. The observed mechanism of formation of sequence non-local β-sheet contacts depends on the chameleon behaviour of the N-terminal strand. We believe that such caching of β-strands in neighbouring helices is one mechanism for accelerating the formation of complex β-sheet structures. More detailed results are published in Refs. 4, 5.