

* Institut de Biologie Structurale Jean-Pierre Ebel CNRS-CEA.-UJF, Grenoble, France;
Institut de Recerca Biomèdica, Parc Científic de Barcelona, Barcelona, Spain; and
Departamento de Bioquímica y Biología Molecular y Celular, Facultad de Ciencias, and Biocomputation and Complex Systems Physics Institute-BIFI, University of Zaragoza, Zaragoza, Spain
Correspondence: Address reprint requests to Pau Bernadó, Institut de Recerca Biomèdica, Parc Científic de Barcelona, Josep Samitier 1-5, 08028 Barcelona, Spain. E-mail: pbernado{at}pcb.ub.es; or to Javier Sancho, Departamento de Bioquímica y Biología Molecular y Celular, Facultad de Ciencias, Universidad de Zaragoza, 50009-Zaragoza, Spain. E-mail: jsancho{at}unizar.es.
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
The thermodynamic stability of proteins is dependent on a delicate balance between different interactions involving protein and solvent atoms in both the native and unfolded states (2–3). However, there is still no general agreement on the net contribution of the different fundamental interactions, such as hydrogen bonds, van der Waals, etc. (3–7). On the other hand, the hydrophobic effect, which describes the observed tendency of apolar compounds to minimize their exposure to water, is widely acknowledged as stabilizing the native state (8–11). The experimental quantification of the contribution of the hydrophobic effect to protein stability is not trivial, one reason being that, unlike in the native state, the side-chain solvent exposures in the unfolded state are hard to estimate due to the large number of discrete conformations available to the main chain. So far, side-chain accessibilities have been approximated by calculations performed on different models of the unfolded state, including Gly-X-Gly extended tripeptides (12), Gly-X-Gly peptides with dihedral angles characteristic of protein structures (13,14), and Ala-X-Ala simulated ensembles (15), or, more recently, by averages of peptide-fragment collections extracted from native structures (16,17). Large differences in solvent exposures have been reported depending on the model. Until now, reported exposures have tended to be residue-type averages, rather than residue-specific exposures calculated in specific sequences. The solvent accessibility in the unfolded state is intimately linked to the conformational sampling occurring on the backbone, and it has been shown that neighboring residues limit the conformational space sampled by certain amino acids (18,19). In addition, large amino acids may bury the chain from the solvent more effectively. Therefore, it is clear that both the interpretation of stability mutational studies (20) and the parameterization of protein stability calculations (21) could benefit from using sequence-specific solvent exposure data calculated from accurate models of unfolded protein ensembles.
The consensus view of the unfolded state is that of a large ensemble of more or less randomized conformations in fast equilibrium, although certain bias toward the native conformation (22,23) or toward certain types of secondary structure, especially the polyproline II, has been reported in some cases (24,25). At present, diverse structural techniques can provide valuable structural information about highly disordered proteins. Nuclear magnetic resonance, by measuring residual dipolar couplings in partially aligned proteins (26), has provided insight into the conformational sampling observed in intrinsically and chemically unfolded proteins (22,23,27–32). Paramagnetic relaxation enhancement experiments measured in spin-labeled mutants of several proteins have provided information about the presence of long-range contacts in unstructured chains (33–36). Small-angle x-ray scattering experiments have become an important tool for the study of size characteristics of the unfolded state (37–39). Recently, we developed an algorithm, Flexible-Meccano, which generates ensembles of realistic atomic models that are compatible with biophysical data measured using NMR and small-angle x-ray scattering (31).
We present here a fast method that provides sequence-specific solvent exposures for any residue in a given unfolded ensemble based on the flexible-Meccano algorithm (31). Large differences in solvent exposures are observed for residue types, depending essentially on the different primary sequence context and, to a lesser extent, on the length of the protein and its global amino acid composition.
| METHODS |
|---|
|
|
|---|
[…] and C' atoms of plane (i + 1), the selected φ/ψ combination and the tetrahedral angle (set to 109°). Amino-acid-specific φ/ψ combinations used to create the main chain are randomly extracted from a database built from 500 high-resolution x-ray structures with resolutions of <1.8 Å and B factors <30 Å², from which all residues in α-helices and β-sheets were removed (40) […] for Gly) atom spheres of volumes derived from Levitt's simplified force field (42) […] φ/ψ pair and another set of φ/ψ dihedral angles is selected, until no overlap is found or 500 φ/ψ combinations have been tested; otherwise, a completely new structure is calculated from the last residue. In a second step (Fig. 1), side chains are incorporated to the ensemble using the program Sccomp, which places and optimizes side-chain conformations in a fixed protein backbone (43).
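The retry logic of the backbone-building step can be sketched as follows. This is a minimal illustration under stated assumptions, not the Flexible-Meccano code: `place_residue` and `has_clash` are hypothetical callbacks standing in for the peptide-plane geometry and the steric-overlap test, `dihedral_db` is a hypothetical mapping from residue type to φ/ψ pairs, and restarting the whole chain after 500 failed draws (capped by the hypothetical `max_restarts`) is only one reading of the restart rule.

```python
import random

MAX_TRIES = 500  # limit on phi/psi draws per residue, as stated in the text


def build_chain(sequence, dihedral_db, place_residue, has_clash,
                rng=None, max_restarts=100):
    """Grow a chain residue by residue with rejection sampling.

    dihedral_db, place_residue, has_clash and max_restarts are
    hypothetical names introduced for this sketch.
    """
    rng = rng or random.Random()
    for _ in range(max_restarts):
        chain = []
        for aa in sequence:
            for _ in range(MAX_TRIES):
                # Draw an amino-acid-specific phi/psi pair at random.
                phi, psi = rng.choice(dihedral_db[aa])
                candidate = place_residue(chain, aa, phi, psi)
                if not has_clash(chain, candidate):
                    chain.append(candidate)
                    break
            else:
                break  # no clash-free draw found: abandon this chain
        else:
            return chain  # every residue was placed without overlap
    raise RuntimeError("no clash-free chain found")
```

Keeping the geometry behind callbacks leaves the sampling loop readable and lets the overlap criterion be swapped without touching the control flow.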
All calculations, for both the generation of the unfolded ensembles and quantification of solvent accessibilities, have been done on the computation center at the Biocomputation and Complex Systems Physics Institute in Zaragoza.
Protein sequences used
A set of 19 proteins corresponding to Set3 from Eyal et al. (43) was used for the calculation of solvent exposures. The PDB codes and residue lengths of the proteins are shown in Table 3. This set was originally collected based on the structural characteristics of the proteins. Of importance for our study, the proteins included in the set share <20% sequence identity. This implies a variety of amino acid contexts that should provide enough cases to derive reliable solvent-exposure statistics, as well as to reveal sequence-specific solvation characteristics. The total number of amino acid residues in the database was 4346. Notice that cysteine residues are simulated in their reduced form. For each one of these 19 amino acid sequences, an ensemble of 2000 conformations was generated using the flexible-Meccano algorithm.
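Once per-residue accessibilities have been computed for every conformer, sequence-specific and residue-type exposures follow from simple averaging. The sketch below assumes the accessibilities are already available as one list of values per conformer; the function names and the input layout are hypothetical and are not taken from the original software.

```python
def ensemble_average_exposure(per_conformer_exposure):
    """Average each residue's exposure over all conformers.

    per_conformer_exposure: list of conformers, each a list with one
    accessibility value per residue (hypothetical input layout).
    """
    n_conf = len(per_conformer_exposure)
    n_res = len(per_conformer_exposure[0])
    if any(len(conf) != n_res for conf in per_conformer_exposure):
        raise ValueError("all conformers must have the same length")
    return [sum(conf[i] for conf in per_conformer_exposure) / n_conf
            for i in range(n_res)]


def residue_type_average(sequence, per_residue_exposure):
    """Collapse sequence-specific exposures into residue-type means."""
    sums, counts = {}, {}
    for aa, value in zip(sequence, per_residue_exposure):
        sums[aa] = sums.get(aa, 0.0) + value
        counts[aa] = counts.get(aa, 0) + 1
    return {aa: sums[aa] / counts[aa] for aa in sums}
```

The first function yields the sequence-specific values; the second discards the sequence context and returns one mean per residue type.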
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
[…] 118 Å²/residue, lying between the two bounds proposed by Creamer et al. (17) […]
The sequences flanking the least and most exposed residue of each type are shown in Fig. 3 a. A statistical analysis has been performed to compare the enrichment of sequences in certain residues with respect to their overall abundance in the proteins studied (Fig. 3 b). A general prevalence of Pro immediately after poorly exposed residues was found. This is due to the proximity of the Pro Cδ atom, which also imposes the special conformational restriction to X-Pro residues (41). In addition to Pro, the sequences flanking the least exposed residues are rich in the three aromatic residues, Trp, Tyr and Phe, which, due to their size, can easily screen neighboring residues from solvent. The least exposed sequences are also moderately enriched in Gly. A possible explanation for this counterintuitive result is that the large conformational freedom of Gly would facilitate the peptide chain folding into itself, thus enhancing solvent screening. On the other hand, most exposed sequences are rich in Pro as well. However, these Pro residues appear located at position i + 2 and i + 3, probably forming rigid elbows that could direct the following chain away from the exposed residue. Most exposed sequences are rich in small residues such as Ser and Gly, especially in the closest positions, and poor in large residues such as Tyr, Phe, Leu, and Arg.
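A positional enrichment of this kind can be computed as the frequency of a residue type at a given offset from the selected residues, divided by its overall frequency in the sequences. The sketch below is one plausible way to do this and is not necessarily the statistic behind Fig. 3 b; the function name, the `centers` layout, and the default offsets are assumptions made for illustration.

```python
from collections import Counter


def flank_enrichment(sequences, centers, offsets=(-3, -2, -1, 1, 2, 3)):
    """Enrichment of residue types at positions flanking selected residues.

    sequences: list of one-letter amino acid strings.
    centers: list of (sequence_index, position) pairs, e.g. the least
             exposed residues (hypothetical input layout).
    Returns {offset: {residue: observed_freq / background_freq}}.
    """
    background = Counter()
    for seq in sequences:
        background.update(seq)
    total = sum(background.values())

    result = {}
    for off in offsets:
        counts = Counter()
        for seq_idx, pos in centers:
            j = pos + off
            if 0 <= j < len(sequences[seq_idx]):  # skip chain ends
                counts[sequences[seq_idx][j]] += 1
        n = sum(counts.values())
        result[off] = {aa: (counts[aa] / n) / (background[aa] / total)
                       for aa in counts}
    return result
```

A value above 1 at offset +1, for instance, means the residue type occurs immediately after the selected residues more often than its overall abundance would suggest.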
Global solvent accessibilities
Different thermodynamic terms that characterize the free energy of unfolding, ΔH, ΔS, and ΔCp, have been parameterized in terms of the change in the total (ΔAtot), the apolar (ΔAap), and the polar (ΔApol) surface area exposed upon unfolding. A detailed description of these relationships can be found in Robertson and Murphy (46). Whereas calculating the surface accessibility of the folded state is straightforward if the three-dimensional structure of the protein is available, performing the same calculation for the denatured state is much more difficult, and approximations based on tripeptides or small protein fragments (see Introduction) are normally used. Using more realistic polar and apolar solvent exposures may help to improve the parameterizations. Total, as well as apolar and polar, accessibilities in the unfolded states of the 19 proteins studied are shown in Table 3. Notice that in this case, all residues in the sequence were used for the calculation. The accessibilities obtained using the atomic model of the denatured ensembles are in good agreement with those estimated using the amino acid compositions and the average residue accessibilities. The correlations of the calculated and estimated global accessibilities, shown in Fig. S1 of Supplementary Material, present slopes close to 1.0, and the root-mean-square deviations found are 324, 186, and 141 Å² for the total, apolar, and polar solvent accessibilities, respectively. These results indicate that good approximations for the global, as well as the apolar and polar, accessibilities of the denatured state of proteins can be estimated from the residue averaged solvent exposures derived in this study. Notice that in the reported residue averaged solvent exposures (Table 1), the excess of exposed surface typically present in the chain termini is not accounted for. Therefore, the accessibilities derived from the detailed simulations of the denatured state ensembles are more accurate.
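The composition-based estimate amounts to summing one average exposure per residue in the sequence, and the agreement with the ensemble-derived values can be summarized by a root-mean-square deviation. The sketch below illustrates both steps; the function names are hypothetical and the per-residue averages in the usage note are placeholders, not the values of Table 1.

```python
import math


def estimate_global_accessibility(sequence, avg_exposure):
    """Sum residue-type average exposures over a sequence (in Å²).

    avg_exposure: mapping from one-letter residue code to its average
    exposure in the unfolded state (values must be supplied by the user).
    """
    return sum(avg_exposure[aa] for aa in sequence)


def rmsd(calculated, estimated):
    """Root-mean-square deviation between two equally long value lists."""
    if len(calculated) != len(estimated):
        raise ValueError("lists must have the same length")
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calculated, estimated))
                     / len(calculated))
```

For example, with placeholder averages `{"A": 100.0, "G": 80.0}`, the sequence `"AAG"` gives an estimate of 280.0 Å². Because the sum ignores the extra exposure at the chain termini, such estimates would be expected to fall slightly below the ensemble-derived totals.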
It is interesting that all three solvent exposures also present good correlations with the number of residues in the protein, with regression coefficients >0.98. Total, apolar, and polar solvent exposures follow the relationships
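[Three equations giving the total, apolar, and polar solvent exposures as functions of the number of residues; the equation images were not recoverable.]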
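Relationships of this kind can be obtained by an ordinary least-squares fit of the global accessibility against the number of residues. The following sketch shows such a fit in plain Python, assuming a straight-line model; it does not reproduce the coefficients of the original equations, and the numbers in the usage note are synthetic.

```python
import math


def linear_fit(n_residues, accessibility):
    """Least-squares line through (n_residues, accessibility) points.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(n_residues)
    mean_x = sum(n_residues) / n
    mean_y = sum(accessibility) / n
    sxx = sum((x - mean_x) ** 2 for x in n_residues)
    syy = sum((y - mean_y) ** 2 for y in accessibility)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(n_residues, accessibility))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r
```

Calling `linear_fit([50, 100, 200], [110.0, 210.0, 410.0])` on these synthetic points recovers a slope of 2, an intercept of 10, and r = 1 to within rounding. In practice the inputs would be the protein lengths and the total, apolar, or polar accessibilities of Table 3.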
| CONCLUSIONS |
|---|
|
|
|---|
The unfolded states of 19 proteins with low sequence and structural homology have been simulated. They represent a database of residues large enough to allow deriving statistically robust averaged atom- and residue-specific solvent accessibilities that can in turn be used for the parameterization of the different contributions involved in protein stability. It is important that, despite the usefulness of those averaged solvent exposures, the sequence-specific context of the different residues of any particular protein exerts a strong influence on the solvent exposures, and thus sequence-specific solvent exposures of the residue of interest should be used for the interpretation of mutational studies. On the other hand, we anticipate that the simulated unfolded ensembles could be useful to investigate the elusive balance of interactions occurring in the native and denatured states that is so important for understanding protein stability.
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
J.S. acknowledges financial support from Institute of Biocomputation and Physics of Complex Systems (BIFI) grant BFU2004-01411. M.B. acknowledges UMR 5075, Centre National de la Recherche Scientifique, Commissariat a l'Energie Atomique, and Universite Joseph Fourier Grenoble, and ANR NT05-4_42781 for financial support. P.B. acknowledges the European Molecular Biology Organization for a long-term fellowship and funds from the Ramon y Cajal program (Spain).
Submitted on April 21, 2006; accepted for publication August 21, 2006.
| REFERENCES |
|---|
|
|
|---|
2. Funahashi, J., Y. Sugita, A. Kitao, and K. Yutani. 2003. How can free energy component analysis explain the difference in protein stability caused by amino acid substitutions? Effect of three hydrophobic mutations at the 56th residue on the stability of human lysozyme. Protein Eng. Des. Sel. 16:665–671.
3. Lazaridis, T., and M. Karplus. 2003. Thermodynamics of protein folding: a microscopic view. Biophys. Chem. 100:367–395.
4. Myers, J. K., and C. N. Pace. 1996. Hydrogen bonding stabilizes globular proteins. Biophys. J. 71:2033–2039.
5. Campos, L. A., S. Cuesta-Lopez, J. Lopez-Llano, F. Falo, and J. Sancho. 2005. A double-deletion method to quantifying incremental binding energies in proteins from experiment: example of a destabilizing hydrogen bonding pair. Biophys. J. 88:1311–1321.
6. Kono, H., M. Saito, and A. Sarai. 2000. Stability analysis for the cavity-filling mutations of the Myb DNA-binding domain utilizing free-energy calculations. Proteins. 38:197–209.
7. Gatchell, D. W., S. Dennis, and S. Vajda. 2000. Discrimination of near-native protein structures from misfolded models by empirical free energy functions. Proteins. 41:518–534.
8. Baldwin, R. L. 1986. Temperature dependence of the hydrophobic interaction in protein folding. Proc. Natl. Acad. Sci. USA. 83:8069–8072.
9. Kauzmann, W. 1959. Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14:1–63.
10. Privalov, P. L. 1979. Stability of proteins: small globular proteins. Adv. Protein Chem. 33:167–241.
11. Dill, K. A. 1990. Dominant forces in protein folding. Biochemistry. 29:7133–7155.
12. Miller, S., J. Janin, A. M. Lesk, and C. Chothia. 1987. Interior and surface of monomeric proteins. J. Mol. Biol. 196:641–656.
13. Shrake, A., and J. A. Rupley. 1973. Environment and exposure to solvent of protein atoms. Lysozyme and insulin. J. Mol. Biol. 79:351–371.
14. Rose, G. D., A. R. Geselowitz, G. J. Lesser, R. H. Lee, and M. H. Zehfus. 1985. Hydrophobicity of amino acid residues in globular proteins. Science. 229:834–838.
15. Zielenkiewicz, P., and W. Saenger. 1992. Residue solvent accessibilities in the unfolded polypeptide chain. Biophys. J. 63:1483–1486.
16. Creamer, T. P., R. Srinivasan, and G. D. Rose. 1995. Modeling unfolded states of peptides and proteins. Biochemistry. 34:16245–16250.
17. Creamer, T. P., R. Srinivasan, and G. D. Rose. 1997. Modeling unfolded states of proteins and peptides. II. Backbone solvent accessibility. Biochemistry. 36:2832–2835.
18. Pappu, R. V., R. Srinivasan, and G. D. Rose. 2000. The Flory isolated-pair hypothesis is not valid for polypeptide chains: implications for protein folding. Proc. Natl. Acad. Sci. USA. 97:12565–12570.
19. Goldenberg, D. P. 2003. Computational simulation of the statistical properties of unfolded proteins. J. Mol. Biol. 326:1615–1633.
20. Fersht, A. R., A. Matouschek, and L. Serrano. 1992. The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J. Mol. Biol. 224:771–782.
21. Freire, E. 1993. Structural thermodynamics: prediction of protein stability and protein binding affinities. Arch. Biochem. Biophys. 303:181–184.
22. Shortle, D., and M. S. Ackerman. 2001. Persistence of native-like topology in a denatured protein in 8 M urea. Science. 293:487–489.
23. Ohnishi, S., A. L. Lee, M. H. Edgell, and D. Shortle. 2004. Direct demonstration of structural similarity between native and denatured eglin C. Biochemistry. 43:4064–4070.
24. Tran, H. T., X. Wang, and R. V. Pappu. 2005. Reconciling observations of sequence-specific conformational propensities with the generic polymeric behavior of denatured proteins. Biochemistry. 44:11369–11380.
25. Shi, Z., K. Chen, Z. Liu, and N. R. Kallenbach. 2006. Conformation of the backbone in unfolded proteins. Chem. Rev. 106:1877–1897.
26. Tjandra, N., and A. Bax. 1997. Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science. 278:1111–1114.
27. Louhivuori, M., K. Pääkkönen, K. Fredriksson, P. Permi, J. Lounila, and A. Annila. 2003. On the origin of residual dipolar couplings from denatured proteins. J. Am. Chem. Soc. 125:15647–15650.
28. Mohana-Borges, R., N. K. Goto, G. J. A. Kroon, H. J. Dyson, and P. E. Wright. 2004. Structural characterization of unfolded states of apomyoglobin using residual dipolar couplings. J. Mol. Biol. 340:1131–1142.
29. Meier, S., S. Güthe, T. Kiefhaber, and S. Grzesiek. 2004. Foldon, the natural trimerization domain of T4 fibritin, dissociates into a monomeric A-state form containing a stable β-hairpin: atomic details of trimer dissociation and local β-hairpin stability from residual dipolar couplings. J. Mol. Biol. 344:1051–1069.
30. Jha, A. K., A. Colubri, K. F. Freed, and T. R. Sosnick. 2005. Statistical coil model of the unfolded state: resolving the reconciliation problem. Proc. Natl. Acad. Sci. USA. 102:13099–13104.
31. Bernadó, P., L. Blanchard, P. Timmins, D. Marion, R. W. H. Ruigrok, and M. Blackledge. 2005. A structural model for unfolded proteins from residual dipolar couplings and small-angle x-ray scattering. Proc. Natl. Acad. Sci. USA. 102:17002–17007.
32. Bernadó, P., C. W. Bertoncini, C. Griesinger, M. Zweckstetter, and M. Blackledge. 2005. Defining long-range order and local disorder in native α-synuclein using residual dipolar couplings. J. Am. Chem. Soc. 127:17968–17969.
33. Gillespie, J. R., and D. Shortle. 1997. Characterization of long-range structure in the denatured state of staphylococcal nuclease. I. Paramagnetic relaxation enhancement by nitroxide spin labels. J. Mol. Biol. 268:158–169.
34. Lindorff-Larsen, K., S. Kristjansdottir, K. Teilum, W. Fieber, C. M. Dobson, F. M. Poulsen, and M. Vendruscolo. 2004. Determination of an ensemble of structures representing the denatured state of the bovine acyl-coenzyme A binding protein. J. Am. Chem. Soc. 126:3291–3299.
35. Bertoncini, C. W., Y.-S. Jung, C. O. Fernandez, W. Hoyer, C. Griesinger, T. M. Jovin, and M. Zweckstetter. 2005. Release of long-range tertiary interactions potentiates aggregation of natively unstructured α-synuclein. Proc. Natl. Acad. Sci. USA. 102:1430–1435.
36. Dedmon, M. W., K. Lindorff-Larsen, J. Christodoulou, M. Vendruscolo, and C. M. Dobson. 2005. Mapping long-range interactions in α-synuclein using spin-label NMR and ensemble molecular dynamics simulations. J. Am. Chem. Soc. 127:476–477.
37. Doniach, S. 2001. Changes in biomolecular conformation seen by small angle x-ray scattering. Chem. Rev. 101:1763–1798.
38. Kohn, J. E., I. S. Millett, J. Jacob, B. Zagrovic, T. M. Dillon, N. Cingel, R. S. Dothager, S. Seifert, P. Thiyagarajan, T. R. Sosnick, M. Z. Hasan, V. S. Pande, I. Ruczinski, S. Doniach, and K. W. Plaxco. 2004. Random-coil behaviour and dimensions of chemically unfolded proteins. Proc. Natl. Acad. Sci. USA. 101:12491–12496.
39. Fitzkee, N. C., and G. D. Rose. 2004. Reassessing random-coil statistics in unfolded proteins. Proc. Natl. Acad. Sci. USA. 101:12497–12502.
40. Lovell, S. C., I. W. Davis, W. B. Arendall III, P. I. W. de Bakker, J. M. Word, M. G. Prisant, J. S. Richardson, and D. C. Richardson. 2003. Structure validation by Cα geometry: φ, ψ and Cβ deviation. Proteins. 50:437–450.
41. MacArthur, M. W., and J. M. Thornton. 1991. Influence of proline residues on protein conformation. J. Mol. Biol. 218:397–412.
42. Levitt, M. 1976. A simplified representation of protein conformations for rapid simulation of protein folding. J. Mol. Biol. 104:59–107.
43. Eyal, E., R. Najmanovich, B. J. McConkey, M. Edelman, and V. Sobolev. 2004. Importance of solvent accessibility and contact surfaces in modeling side-chains conformations in proteins. J. Comp. Chem. 25:712–724.
44. Hubbard, S. J., and J. M. Thornton. 1993. NACCESS Computer Program. Department of Biochemistry and Molecular Biology, University College London, London, UK.
45. Chou, P. Y., and G. D. Fasman. 1974. Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry. 13:211–222.
46. Robertson, A. D., and K. P. Murphy. 1997. Protein structure and the energetics of protein stability. Chem. Rev. 97:1251–1267.
47. Lazaridis, T., and M. Karplus. 1999. Discrimination of the native from misfolded protein models with an energy function including implicit solvation. J. Mol. Biol. 288:477–487.
48. Richards, F. M. 1977. Areas, volumes, packing and protein structure. Annu. Rev. Biophys. Bioeng. 6:151–176.
49. Shrake, A., and J. A. Rupley. 1973. Environment and exposure to solvent of protein atoms. J. Mol. Biol. 79:351–371.
50. Chothia, C. 1975. Structural invariants in protein folding. Nature. 254:304–308.
51. Lee, B., and F. M. Richards. 1971. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55:379–400.