| Correction Biophysical Journal, Volume 87, Issue 2, 1 August 2004, Pages 1398 Full Text | PDF (43 kb) |
| Modeling the Backbone Dynamics of Reduced and Oxidized Solvated Rat Microsomal Cytochrome b5 Biophysical Journal, Volume 87, Issue 1, 1 July 2004, Pages 498-512 Andrea Giachetti, Giovanni La Penna, Angelo Perico and Lucia Banci Abstract In this article, a description of the statistics and dynamics of cytochrome in both reduced and oxidized forms is given. Results of molecular dynamics computer simulations in the explicit solvent have been combined with mode-coupling diffusion models including and neglecting the molecule-solvent correlations. and nuclear magnetic relaxation parameters of N in the protein backbone have been calculated and compared with experiments. Slight changes in charge density in the heme upon oxidation produces a cascade of changes in charge distributions from heme propionates up to charged residues ∼1.5nm from Fe. These changes in charge distributions modify the molecular surface and the water shell surrounding the protein. The statistical changes upon oxidation can be included in diffusive models that physically explain the upper and lower limits of relaxation parameters at high off-resonance fields. Abstract | Full Text | PDF (379 kb) |
| Studies of Proton Translocations in Biological Systems: Simulating Proton Transport in Carbonic Anhydrase by EVB-Based Models Biophysical Journal, Volume 87, Issue 4, 1 October 2004, Pages 2221-2239 Sonja Braun-Sand, Marek Strajbl and Arieh Warshel Abstract Proton transport (PTR) processes play a major role in bioenergetics and thus it is important to gain a molecular understanding of these processes. At present the detailed description of PTR in proteins is somewhat unclear and it is important to examine different models by using well-defined experimental systems. One of the best benchmarks is provided by carbonic anhydrase III (CA III), because this is one of the few systems where we have a clear molecular knowledge of the rate constant of the PTR process and its variation upon mutations. Furthermore, this system transfers a proton between several water molecules, thus making it highly relevant to a careful examination of the “proton wire” concept. Obtaining a correlation between the structure of this protein and the rate of the PTR process should help to discriminate between alternative models and to give useful clues about PTR processes in other systems. Obviously, obtaining such a correlation requires a correct representation of the “chemistry” of PTR between different donors and acceptors, as well as the ability to evaluate the free energy barriers of charge transfer in proteins, and to simulate long-time kinetic processes. The microscopic empirical valence bond (Warshel, A., and R. M. Weiss. 1980. . 102:6218–6226; and Åqvist, J., and A. Warshel. 1993. . 93:2523–2544) provides a powerful way for representing the chemistry and evaluating the free energy barriers, but it cannot be used with the currently available computer times in direct simulation of PTR with significant activation barriers. Alternatively, one can reduce the empirical valence bond (EVB) to the modified Marcus’ relationship and use semimacroscopic electrostatic calculations plus a master equation to determine the PTR kinetics (Sham, Y., I. Muegge, and A. Warshel. 1999. . 36:484–500). However, such an approximation does not provide a rigorous multisite kinetic treatment. Here we combine the useful ingredients of both approaches and develop a simplified EVB effective potential that treats explicitly the chain of donors and acceptors while considering implicitly the rest of the protein/solvent system. This approach can be used in Langevin dynamics simulations of long-time PTR processes. The validity of our new simplified approach is demonstrated first by comparing its Langevin dynamics results for a PTR along a chain of water molecules in water to the corresponding molecular dynamics simulations of the fully microscopic EVB model. This study examines dynamics of both models in cases of low activation barriers and the dependence of the rate on the energetics for cases with moderate barriers. The study of the dependence on the activation barrier is next extended to the range of higher barriers, demonstrating a clear correlation between the barrier height and the rate constant. The simplified EVB model is then examined in studies of the PTR in carbonic anhydrase III, where it reproduces the relevant experimental results without the use of any parameter that is specifically adjusted to fit the energetics or dynamics of the reaction in the protein. We also validate the conclusions obtained previously from the EVB-based modified Marcus’ relationship. It is concluded that this approach and the EVB-based model provide a reliable, effective, and general tool for studies of PTR in proteins. Finally in view of the behavior of the simulated result, in both water and the CA III, we conclude that the rate of PTR in proteins is determined by the electrostatic energy of the transferred proton as long as this energy is higher than a few kcal/mol. Abstract | Full Text | PDF (917 kb) |
Copyright © 2007 The Biophysical Society. All rights reserved.
Biophysical Journal, Volume 93, Issue 12, 4128-4140, 15 December 2007
doi:10.1529/biophysj.107.111849
Biophysical Theory and Modeling
Esther Caballero-Manrique, Jenelle K. Bray, William A. Deutschman1, Frederick W. Dahlquist2 and Marina G. Guenza
, 
Department of Chemistry, University of Oregon, Eugene, Oregon
Address reprint requests to Marina G. Guenza, Dept. of Chemistry and Institute of Theoretical Science, University of Oregon, Eugene, OR 97403. Tel.: 541-3462877; Fax: 541-3464643.Proteins are dynamic systems in which motion is characterized by an extended range of timescales. NMR is a powerful technique to study protein dynamics because it is sensitive to motions on many different timescales. The interpretation of NMR data, however, often involves model-dependent analyses. Indeed, several theoretical models have been successfully developed to interpret NMR relaxation data for protein motion. Nevertheless, such models often contain a finite number (sometimes large) of adjustable parameters that limit the theory’s predictive power and generality.
Here we present a theoretical approach that is parameter free, since all of the physical quantities entering the theoretical description of protein dynamics are defined a priori from experiments and molecular dynamics simulations. Protein dynamics spanning a broad range of timescales are predicted as analytical functions of the eigenvalue/eigenvector solution to our equation of motion. We find good quantitative agreement between our theoretical predictions and the experimental and simulation data, without the use of adjustable parameters.
The dynamics of proteins occur over an extended range of time- and length-scales. For example, local bond librational motions can occur on picosecond timescales, whereas bond reorientation can occur on nanoseconds, and the global tumbling and cooperative interdomain motions can occur on tens of nanoseconds and longer timescales 1. Although local dynamics depends on the chemical structure of the residue and on the local barriers that determine semiflexibility, global dynamics depends on the size of the protein and noticeably slows down with increasing protein molecular weight: timescales of milliseconds or longer are often observed in the global motions of large protein complexes. Due to the many orders of magnitude in timescale, and the many degrees of freedom available to the molecule, understanding the general principles of protein dynamics is a complicated problem.
Nevertheless, the development of a general understanding of protein local dynamics can provide important insights into many physical properties. It is widely recognized that the dynamics of proteins are relevant to function 2,3,4,5. For example, kinetic processes involving protein-protein or protein-ligand interactions can depend on conformational fluctuations. In many cases, protein stability is dominated by the thermodynamic entropy, which in turn is a function of the density of conformational states explored by the molecule and increases with increasing flexibility 2,3,6,7,8. Furthermore, there are subtle ways in which protein flexibility and dynamics can influence function. In allosteric mechanisms of activation, protein dynamics allow for structural fluctuations of coupled domains, which can transmit information between distant sites inside the protein. Such fluctuations can be responsible for promoting reactivity by bringing within close proximity catalytic sites that are otherwise distant. Finally, large cooperative motions are important for protein reactivity, because they can allow substrates to access internal regions of the protein that are sterically hindered.
The combination of experimental NMR methods and capable theoretical tools can provide a complete and quantitative picture of protein dynamics over a wide range of timescales. One might be tempted to directly model NMR relaxation by computer simulations. Unfortunately, this is generally not possible, since simulations cannot reach the longest timescales measured by NMR for most systems of interest 9,10,11. The longest timescale sampled by simulations is limited by computer power. For a typical computer, atomistically detailed simulations, explicitly including solvent molecules, are constrained to the nanosecond timescale. Such simulations contain data pertaining to local-scale motions for medium-sized proteins, i.e., ∼100 amino acids. Alternatively, simulations can achieve longer than nanosecond timescales when employing multiscale coarse-graining algorithms. In the latter case, local scale information is averaged over, and the effects of solvent interactions enter through effective parameters. Thus, a complete description of the long-time dynamics of proteins, through atomistically detailed simulations, is limited to modest-sized proteins.
Several theoretical approaches have been developed to correlate NMR relaxation data to models of protein mobility. Of the initial attempts to model NMR relaxation by assuming specific mechanisms of diffusive dynamics 12,13,14, the most widely accepted is the model-free approach by Lipari and Szabo 15,16. In a seminal article they presented a formally simple but physically sound theory that maps the decay of the time autocorrelation function for the single bond vector i,
(defined in the next section) into a linear combination of two uncorrelated dynamical processes. The theory in its simplest form requires three fitting parameters: the local and global correlation times, and a generalized order parameter determined by the weights assigned to the two dynamical processes. The main contribution of the Lipari-Szabo approach is that it allows for an analysis of NMR relaxation data to provide an approximate, although fairly realistic, picture of the local motions of the protein. Later theoretical approaches correlate the measured order parameters to protein entropy 17,18 and function.
The assumptions in Lipari-Szabo theory are consistent with the dynamics of flexible bonds in a globular protein. For this system, local motions follow two uncorrelated decay processes given by the local bond dynamics and the overall protein tumbling. However, when more complex systems are involved, the original three-parameter theory becomes inadequate. Proteins that do not obey the simple three-parameter scheme are, for example: a), proteins with rigid structure, where local and global dynamics both occur on the slow timescale of the overall protein rotation; b), partially unfolded proteins in which local and global motions are both relatively fast; c), proteins with asymmetric shapes, for which rotational diffusion is anisotropic and the slow motion includes at least three contributions corresponding to the tensorial components of the overall tumbling motion 19; and d), proteins composed of two or more large domains that slowly move relative to one another (e.g., the protein calmodulin). In general, experimental data for such proteins can be reproduced by the theory only at the expense of introducing a high number of fitting parameters 20,21,22,23.
The complexity of the underlying dynamics, even for a simple flexible globular protein, has been made apparent in the recent work by Brüschweiler and co-workers. In the Reorientational Eigenmode Dynamics theory 24,25, local reorientation of backbone bonds is obtained from simulations by initially eliminating the overall tumbling motion from molecular dynamics (MD) trajectories. The global dynamic modes are a posteriori calculated through a fitting of correlation times. This procedure is repeated until good numerical agreement with the data is achieved. It is clear from this model that, even for globular proteins, the dynamics is a combination of many slow dynamical processes.
The theory we present in this article approaches the modeling of NMR relaxation and protein dynamics from a new perspective. In our approach, we derive a first principles equation of motion for protein dynamics that correctly describes a broad range of time regimes, and provides a good prediction of different NMR experiments, (i.e., T1, T2, and nuclear Overhauser effect (NOE)), x-ray crystallography temperature factors, and computer simulations. We are interested in pursuing this alternative-avenue, first-principles approach for the following reasons: it should provide a physically accurate picture of protein dynamics within a general framework, applicable to most proteins and experimental methods. Although parameter-dependent models can achieve good agreement with experiments by including an increasing number of adjustable parameters, this is often accomplished at the expense of physical self-consistency and broad predictive power.
Our approach combines MD simulations and the solution of a diffusion equation in the spirit of our early work 26. However it differs from mode-coupling dynamics approaches 27,28, as developed by La Penna, Perico, and Freed, in many fundamental points. First of all, we are not interested in including higher-order bond-bond correlations. Second, in the proteins we investigate, the solvation dynamics of water molecules are uncorrelated with protein motions so that there is no modification of the protein friction. Finally, we adopt an original form for the hydrodynamic interaction that properly accounts for hydrophobic regions in the protein structure.
Although atomistically detailed simulations provide the needed information for the short-time dynamic regime, the long timescales of protein dynamics is governed mainly by hydrodynamics 29,30,31. It is known that conventional equations of motion, which are quite accurate in representing the dynamics of synthetic macromolecules 32, do not apply to proteins unless arbitrary numerical prefactors in the hydrodynamic interaction are included. The difficulty with conventional hydrodynamic interaction formalisms to treat biological systems is ascribed to the following: although macromolecules of synthetic origin explore a large conformational space, protein dynamics are characterized by fluctuations around the native conformation, which is stabilized by hydrophobic interactions. Each residue in the folded structure may be partially exposed to the protein hydrophobic core, and partially exposed to the solvent. This effect is essential in determining the dependence on solvent viscosity of the rate of conformational transitions in proteins, as shown by Ansari et al. in a study of myoglobin dynamics after ligand dissociation 33. The presence of hydrophobic regions modifies local friction coefficients and requires a new formalism for the hydrodynamic interaction, as we discuss further below.
In our procedure, we begin by performing molecular dynamics simulations of a protein in its solvent described at the atomistic level. Simulations provide a trajectory of ∼4ns during which the protein is stable and exhibits fluctuations about its equilibrium configuration. From this relatively short “equilibrium” trajectory, we calculate the statistical quantities that characterize the matrices in the equation of motion for the time evolution of the protein spatial coordinates. The equation is solved by diagonalization of these matrices, yielding eigenvalues and eigenvectors that characterize the spectral density. From the latter, we calculate NMR relaxation parameters, which we compare with experimental 1H-15N NOE, spin-lattice relaxation time (T1), and spin-spin relaxation time (T2). We test our procedure against NMR data for the dynamics of the response regulator CheY that controls chemotaxis in Escherichia coli. We find that theoretical predictions and experimental NMR data are in good agreement. Moreover, to formally connect our approach to Lipari-Szabo’s model, we calculate the sequence-dependent order parameter,
which is found to agree well with the corresponding values extracted from NMR experiments.
The response regulator protein CheY is part of a two-component regulatory system that controls the chemotactic swimming response of motile bacteria. Understanding the dynamics of CheY is in general relevant, since two-component regulatory systems are conserved through all bacteria, archea, and low eukaryotes 34. Specifically in bacteria, response regulators control gene expression, chemotaxis, antibiotic resistance, and many other processes 35. CheY is an ideal protein to investigate since it is the best characterized member of the response regulator superfamily 36.
As an additional test of our theory, we use our analytical solution of the equation of motion to calculate the mean-square displacements of α-carbon atoms in the picosecond regime. We observe that our theoretical results agree well with experimental data of temperature factors for CheY, as measured with high-resolution x-ray crystallography by Volz and Matsumura 36. The good agreement we obtain between our theoretical predictions and both the x-ray and NMR experimental data underlines the power and utility of our general approach.
The remainder of this article is organized in the following manner. In the Theory section, we present the Langevin equation for protein dynamics. The section includes a new treatment of the hydrodynamic interaction that accounts for varying degrees of residue exposure to the solvent, from the completely screened to the fully exposed. In the Methods section, we describe the method to perform molecular dynamics simulations, as well as the methodology we used to analyze simulation data. This section includes a brief description of the NMR relaxation experiments. In the Results section, we present the theoretical predictions of our approach for a series of experimentally measured quantities. In all of our analyses, we directly compare theory with simulations and experimental data for NMR relaxation and order parameters, and Debye-Waller temperature factors from x-ray crystallography.
According to Bloch, Wangsness, and Redfield 37,38,39,40, spin relaxation parameters measured by NMR are functions of the spectral density,
![]() | (1.1) |
The latter is the second-order Legendre polynomial of the cosine of the angle, 
![]() | (1.2) |
In Eq. (1.2),
is the angle between the bond vectors
and
and describes the bond vector reorientation during the time interval t.
For a semiflexible macromolecule, the function
assumes a simple analytical form 41, expressed as a function of eigenvalues and eigenvectors of the matrix product
which governs the equation of motion for the time evolution of the spatial coordinates for the protein in solution (see Eqs. (1.6)). Here H is the matrix of hydrodynamic interaction and A is the structural matrix, which are defined in the following sections. In this context, the goal is to define an equation of motion that properly accounts for the dynamic properties of the protein.
The equation of motion for the time evolution of the spatial coordinates is a diffusive Langevin equation, where the inertial term is neglected because protein dynamics are overdamped 29,31,41,42,43,44. In our approach the protein is represented as a collection of effective units, or residues, centered on the position of the α-carbon. In this way, the structure of the protein appears to be coarse-grained at the level of its primary sequence of amino acids. The time evolution of the spatial coordinate of the ith residue inside a protein, i.e., the vector
obeys a Langevin equation (LE), given by the balance of forces acting on the residue 29,31,32,41,42,43,44
![]() | (1.3) |
is Boltzmann constant, T is the temperature, and
is the friction coefficient obtained from the average over the residue-dependent local friction coefficient
It is also useful to define the following quantities,
that will be used later on in this article. In Eq. (1.3), the viscous force on the left-hand side,
is balanced on the right-hand side by the intramolecular force,
and by the random force,
The random force is assumed to be a Gaussian white noise process with zero mean. The LE is an inhomogeneous, first-order differential equation, linear in R, that is solved by diagonalization of the product of matrices HA![]() | (1.4) |
The matrix of eigenvectors, Q, also diagonalizes A through the congruent transformation
![]() | (1.5) |
defines the mean-square length of the normal mode a as 
Perico and Guenza have shown that for a macromolecule, the autocorrelation function
can be expressed as a function of
following the relation 41
![]() | (1.6) |
![]() | (1.7) |
The time correlation function
characterizes the disorientation of bond
and is defined by eigenvalues and eigenvectors of the matrices A and H according to
![]() | (1.8) |
The condition applies that
i.e., the normalized average length of ith the bond. The time correlation function is a linear combination of N decay processes characterized by the mode index a. Each decay process has a weight of
and a correlation time
which shows how negative eigenvalues correspond to unphysical correlation times. Two points emerge from this description: i), the time correlation functions characterizing protein dynamics contain contributions from N correlation times, where N in our model corresponds to the number of residues in the protein; ii), the function
for a macromolecule is, in general, not a simple linear combination of exponential functions 41.
Eqs. (1.6) define the autocorrelation function
as a function of the eigenvalues,
and eigenvectors,
of the matrix product HA, and of the eigenvalues,
of the connectivity matrix A. Therefore the quality of the predicted autocorrelation functions, depends on the accuracy of the definition of matrices H and A.
If the matrices H and A are known, it is straightforward to calculate
and the spectral density. Following the approach developed by Bixon and Zwanzig to treat the dynamics of semiflexible macromolecules 43, the matrix A is defined as
![]() | (2.1) |
![]() | (2.2) |
with
and
with
The matrix A defines the effective mean-force potential,![]() | (2.3) |
With a slightly different form of the matrix A our approach recovers known models of protein dynamics. For example, if the residue potential is assumed to be harmonic, and uniform for each residue, while local intra- and intermolecular connectivity are included through the matrix M, our approach recovers a Gaussian network model 46,47,48, implemented to include hydrodynamic interaction. The traditional Gaussian network model is recovered when hydrodynamic interaction is neglected, and H=1.
The generic element of the hydrodynamic interaction matrix,
describes how the motion of the ith residue produces instantaneous waves in the solvent, which perturb the velocity of the fluid surrounding a generic residue
The perturbation of the velocity decreases as the inverse of the interresidue distance
in this way the hydrodynamic interaction is a long-ranged perturbation, which affects large-distance and long-time scale dynamics of the protein.
Because the perturbation propagates through the solvent, in hydrophobic regions, which are not in contact with the solvent, the hydrodynamic interaction becomes screened and its perturbation on the dynamics is negligible 29. This effect is not accounted for in the conventional expression of the hydrodynamic interaction matrix 29,
![]() | (2.4) |
is the average residue friction,
is the solvent viscosity, and
is the average hydrodynamic radius of the residue. In Eq. (2.4), each residue is assumed averaged over the solvent degrees of freedom, which is a good approximation for flexible macromolecules since they explore an extended conformational space; i.e., each residue is statistically exposed to the solvent. Nevertheless, for folded proteins with residues buried in the hydrophobic core, self- and cross-contributions in Eq. (2.4) are not properly balanced. This often leads to negative eigenvalues, and unphysical correlation times, which implies time correlation functions that diverge at long time, instead of decaying to zero (see Eq. (1.8) with
and Fig. 1).To take into account the presence of residues partially or totally buried in the protein, we derive a new form for the hydrodynamic matrix. We couple to the traditional contribution for solvent exposed regions 49, a solvent-screened equation of motion driving the dynamics of the regions excluded from the solvent. The perturbation of the velocity of residue j, due to the motion in the solvent of residue i, is given by
where
is the preaveraged Oseen tensor.
In this new formalism, each residue has a friction coefficient,
that contains a contribution from the area exposed to solvent,
and a contribution from the area screened from the solvent (buried),
The effective friction coefficient for each residue,
is given by a simple extension of Stokes’ law as
![]() | (2.5) |
is the radius of a spherical bead of surface area equal to the surface exposed to the solvent 50, whereas
is the radius of a spherical bead of surface area equal to the surface shielded from the solvent. The internal viscosity of the hydrophobic core of the protein,
is assumed to be twice the magnitude of the viscosity of the water solvent,
This value is chosen on the physical grounds that residues in the hydrophobic core move in a “liquid” of hydrophobic particles, similar to the environment of a monomer in a melt of polymer chains 51. The exact calculation from the decay of the velocity autocorrelation function provides a consistent estimate 52.The hydrodynamic force due to residue i,
contains the friction related to the exposure to solvent of residue i,
To allow the solution of Eq. (1.3) through a normal mode representation, we approximate
as the total friction on residue i,
weighted by the ratio of the average contribution due to solvent exposure,
to the average total friction coefficient, 
![]() | (2.6) |
and 
Once Eqs. (2.5) are included in our derivation, by balancing the hydrodynamic drag force on the jth residue,
with the intramolecular force and the solvent Brownian force, we obtain the equation of motion given by Eq. (1.3), with a novel form of the hydrodynamic interaction matrix, namely
![]() | (2.7) |
Equation (2.7) is general and holds also for nonbiological macromolecules since it recovers the conventional definition for the dynamics of unperturbed macromolecules of synthetic origin 29, Eq. (2.4), if all the residues are statistically exposed to solvent, i.e., no regions of hydrophobicity are present in the molecule. As mentioned before, if the presence of hydrophobic regions in the dynamics of proteins is overlooked, the diagonalization of the matrix product HA leads, in most cases, to negative eigenvalues, which correspond to unphysical negative correlation times. An arbitrary parameter, the so-called reduced friction coefficient, was previously included as a weight of the cross-contributions to overcome this problem 44. With the new formulation for the hydrodynamic interaction proposed here, no arbitrary parameters need be included.
The averaged statistical quantities that enter matrices H and A are calculated from trajectories of MD simulations of the protein in its solvent, performed using the experimental values of the thermodynamic parameters. These include the average normalized statistical correlation between effective bonds defined in Eq. (2.2), as well as the average distance between each pair of α-carbons,
, which enters the hydrodynamic matrix Eq. (2.7). We also calculate, for each residue, the average surface area available to solvent as well as the average surface area exposed to the hydrophobic region, which define the local friction coefficients entering Eq. (2.7), through the definition in Eq. (2.5). In this way, all the input parameters to our theory are known, as well as the numerical values of the matrices H and A.
The procedure just outlined extracts the relevant physical information from short-time all-atom simulations and, through the solution of the equation of motion, allows us to predict experimentally measurable dynamical properties of proteins in a broad range of timescales. The key idea is that if initial position and forces acting on each residue are known, it should be possible to make predictions about its motion for any timescale. This is analogous to solving Newton’s equation of motion to predict the trajectory of a point-mass, when its initial position and forces acting on it are known. The idea of using computer simulations as input to diffusive equations is well established 26,32,53,54,55. In the following sections, we test our approach by investigating the dynamics of the signal transduction protein CheY, as measured in simulations and experiments of NMR relaxation.
In the specific case of E. coli CheY, we start with the x-ray structure from the Protein Data Bank (entry 3CHY 36) and perform molecular dynamics simulations using the GROMACS package 56 (GROMOS96 force field) for the protein in a box containing ∼3000 water molecules, with periodic boundary conditions. Density and temperature are set to those from experiments. The temperature was controlled using a Berendsen algorithm at 300K with a coupling constant of 100 fs. The cutoff for the Van der Waals and Coulomb interactions was 1.4nm, with neighbor lists updated every fifth step. The x-ray structure presents seven amino acids in two different rotameric configurations: Asn-23, Asn-32, Glu-37, Ser-56, Met-85, Tyr-106, and Leu-127. All the rotamers interconvert during the simulation with the exception of Tyr-106: we perform two distinct sets of runs starting with CheY having the side chain in Tyr-106 either exposed to solvent (the CheYa conformation) or embedded in a hydrophobic groove (the CheYb conformation). After energy minimization and system equilibration, the longest simulation run for CheYa extended up to 19ns whereas the longest run for CheYb included 21ns. Long time “stretches” in the simulation, during which the structure was fluctuating around a transient stable configuration, were identified. A stable folded configuration was defined by having the root mean-square deviation of the α-carbon positions from the initial equilibrated structure, smaller than ∼4Å. We identified one long stable “stretch” of ∼4ns for CheYa and of ∼3ns for CheYb. From each stretch, we calculated statistical averages and residue-specific friction coefficients, which are input to the protein equation of motion. Available surface areas (ASA) are calculated for each picosecond of simulation 57,58. The contribution to the friction coefficient due to exposure to the hydrophobic region is in turn obtained by subtracting the ASA from the total surface area of the residue 59, and it is also calculated for each picosecond of MD trajectory.
For each NMR experiment a single sample was used, containing ∼1mM concentration of 15N-labeled CheY protein in 50mM phosphate buffer. Experiments were carried out on a 600-MHz spectrometer, equipped with a triple resonance probe and pulse field gradient, at a 1H frequency of 599.98MHz, and a 15N frequency of 60.8MHz, at 25°C. Spectra were collected for the measurement of longitudinal relaxation rates (T1), 15N transverse relaxation rates (T2), and the 1H-15N steady-state NOE of the 15N nuclei in CheY.
As a test of consistency for the theory, we compare time correlation functions calculated from eigenvalues and eigenvectors of the equation of motion, with the ones obtained directly from the simulation trajectories that provided the input parameters to the theory. If the equation of motion is correct and contains all the needed information the two should be identical. We calculate the decay of the time correlation function for bond disorientation,
The comparison is limited to intervals of 1.5ns, since the longest simulation “stretch” during which the protein fluctuates around a stable configuration has a total length of ∼4ns for CheYa, and ∼3ns for CheYb. We find that the theoretical predictions of bond disorientation, as a function of time, are in quantitative agreement with simulations for all the bonds along the primary structure of the rotamer CheYb, whereas a more complex picture emerges for the CheYa rotamer.
To summarize our results in a visual picture and rapidly identify bonds presenting this “anomalous” decay of the correlation function, we plot in Fig. 2 the mean-square difference between theoretical and simulated
at fixed time intervals (0.5ns, 1.0ns, and 1.5ns), for each bond along the primary sequence of CheYa.
Fig. 2 shows that in the CheYa rotamer, most of the bonds display quantitative agreement with simulated bond disorientation with the exception of two main peaks clearly above the “noise” level, which correspond to regions including residues 13–15, and bond 117. Uniform “noise” appears in the baseline, more pronounced with increasing time, which is a consequence of the numerical error accumulating in the simulations 60,61. The disagreement between theory and simulations for bonds 13–15, and 117 is due to the presence of local conformational barriers 62, which are not accounted for in the theory. Fig. 3 shows as an example of this effect, the decorrelation in time for bonds 70–72, which are well behaved, and the decorrelation in time of bonds 13–15, which are part of the region that displays anomalous relaxation. For bonds 70–72, theoretical predictions and simulations of the time correlation function superimpose, whereas for bonds 13–15 the theory predicts faster motion than observed in simulations. Each sequence of three bonds defines the comprised dihedral angle. For sequence 70–72, the distribution of dihedral angles in the simulation follows Gaussian statistics, which is consistent with our theory. However, in the region of anomalous decorrelation the distribution of the comprised dihedral angle shows two stable conformational states, which cannot be described by our approach in its present form. Further development of the theory will require the inclusion of activated barrier crossing.
It is interesting to consider the possible implications of the different local conformational behaviors that we observe for the two CheY rotamers. The comparison between theory and simulation identifies regions, along the protein primary structure, characterized by small conformational energy barriers, easily overcome during simulations, but still relevant in affecting the local bond reorientation. These barriers are present only in one of the two main rotamers of CheY, specifically CheYa. The two rotamers differ in the position of the side chain of residue Tyr-106, which is exposed to solvent in CheYa and buried in a hydrophobic pocket in CheYb. Residue 106 is a highly conserved aromatic amino acid, either tyrosine or phenylalanine, in the extended family of response regulators, and is relevant in the mechanism of signal transduction. The external position of the aromatic ring in residue 106 facilitates binding of the response regulator, here CheY, to its autokinase protein during the first step of the phosphorylation process. Its external position also hinders access of the flagellar motor to the binding surface of CheY 63. Consistently, inactive mutants of CheY have the aromatic ring in 106 exclusively solvent exposed.
Fig. 3 shows that the transition of the Tyr-106 side chain from a local configuration buried in a hydrophobic pocket to one exposed to solvent, is coupled to a change in the local energy landscape for the sequence Asp-13, Phe-14, Ser-15, and Thr-16, and the appearance of two stable energetic states. It is possible that the region defined by residues 13–16 in CheYa functions as an “on-off switch” during CheY phosphorylation, since it is characterized by two possible states when the ring in Tyr-106 is exposed to solvent, and by only one state when the ring is buried. In support of this hypothesis are the facts that residues 13–16 appear to play some role in the activation of CheY. For example: a), residue Asp-13 is part of the active site of CheY; b), specific amino acids in this region appear to be conserved in the large family of signal transduction proteins; c), mutations that lead to active CheY usually involve residues 13 and 106. In conclusion, this simple procedure of comparing our theoretical predictions for bond decorrelation times with simulation data, allows for the identification of regions along the primary protein sequence where relatively slow conformational transition take place. Such slow regions could potentially play a key role in the protein’s biological function.
Since the theory predicts faster dynamics for bonds 13–15, and bond 117, than should be expected, theoretical predictions of NMR relaxation for these bonds will not be reported. The results presented in Figure 2 and Figure 3 show that for most of the bonds, the agreement between theory and simulations is good. We emphasize that no adjustable parameters are included in the analytical theory, which characterizes the decays of ∼250 time correlation functions.
From the eigenvalues and eigenvectors obtained from the solution of the equation-of-motion, we calculate the spectral density. The spectral density determines the relaxation times measured in NMR, i.e., the 1H-15N nuclear Overhauser effect (NOE), the spin-lattice relaxation time (T1), and the spin-spin relaxation time (T2), according to
![]() | (3.1) |
and
Here,
is the permeability of vacuum, h is Planck’s constant, ωH and ωN are the 1H and 15N Larmor frequencies of the experimental magnetic field, γH and γN are their respective gyromagnetic ratios, δN is the chemical shift anisotropy of the 15N nucleus, and
is the average N-H bond length. For each effective bond, we compare the theory with these three independent NMR measurements. In the theory, the effective bond is defined as the vector connecting two adjacent α-carbons along the protein primary sequence, whereas experimental data concern the relaxation of the bond connecting nitrogen and hydrogen atoms. Although the segments considered are different, the relatively good agreement between the analytical theory and experimental NMR data implies that the most relevant contribution to the dynamics of N-H bond relaxation is given by the backbone dynamics, as represented by the reorientation of the coarse-grained statistical segment connecting two alpha carbons.The theory predicts NMR relaxations for CheYa and CheYb separately because the interconversion between the two rotameric states of residue Tyr-106 exceeds our longest simulation time. We expect both forms to be present in solution, since the transition between the two rotamers of Tyr-106 has been observed experimentally, and in simulations 64, although their relative concentration is not known. The reduced affinity of CheY to bind the flagellar motor could suggest that only a low percentage of CheYb is present in solution. However, since the x-ray structure of the protein (which is input to our simulations) shows 50% occupancy of the two rotamers 36,65, we adopt this concentration for the two species. Because the differences in the theoretically predicted bond relaxation times between CheYa and CheYb are small, and the experimental data are affected by experimental error, it is not possible for us to accurately assess by comparison the most probable concentration of the two allosteric forms.
In Fig. 4, we show a comparison between theoretical predictions and experiments for T1, T2, and NOE of E. coli CheY, for the 50:50 concentration ratio of CheYa and CheYb. Predictions for bonds 13–16 and 117 are not reported in the plots, as discussed earlier on. Bonds 13–15 show a peak with the maximum corresponding to residue 14, where T1=0.94s, T2=0.19s, and a reverse peak with NOE=0.01. Moreover, residue 117 shows a peak with T1=0.74s, T2=0.14s, and a reverse peak with NOE=0.45. These peaks are unrealistic, since the theory incorrectly predicts faster dynamics than should be physically expected for these bonds. Experimental data for the NMR relaxation of bonds 13–15 are not available.
In the comparison between the experimental data and our theoretical predictions, we interpolate each data set with a spline fit that serves as a guide to the eye. Each interpolation line contains ∼50% of the available points in the data set. As an estimate of the agreement between theory and experiments, we calculate the mean square predictive error
![]() | (3.2) |
For purposes of visual clarity, we have omitted from Fig. 4 the “trivial” relaxation of the bonds at the two ends of the protein. For those segments, both theory and experiments show enhanced flexibility and fast dynamics due to the lack of connectivity, so that NMR relaxation times largely exceed the range depicted in Fig. 4.
For all of the residues, we find that the theory reproduces well the experimentally observed trend of fast and slow relaxation. Notice that the slow relaxation processes mentioned here do not include slow exchange processes, which cannot be predicted by our theory in its present stage of development. The baseline in the data represents the overall protein rotation, which relates to the protein long-time dynamics and depends on the model for hydrodynamic interaction. Peaks in the spin-lattice relaxation indicate local flexibility and fast relaxation. Both theory and experiments show an enhanced flexibility in regions of the protein corresponding to the α2-β3 loop 47,48,49,50,51,52, the β4-α4 loop (87–92), and the turn including bonds 76–80. In conclusion, the quality of agreement obtained in this study shows that simulations, theory, and experiments are largely compatible in providing consistent information on the physics of the system.
Through the direct modeling of NMR relaxation data using the Lipari-Szabo theory, a commonly extracted quantity is the order parameter
with
and
the protein orientational correlation time, i.e., the time of global molecular rotation. For times short in comparison to the relaxation of the orientational time correlation function,
the bond reorientation can appear restricted to fluctuations around a specific angle. This angle could be similar to that in the protein’s native conformation, suggesting that the bond is part of a rigid local structure and
Alternatively, the angle could strongly differ from that of the native structure if the bond is flexible, i.e., 
In Fig. 5 (top panel), we directly compare our theoretically predicted order parameters with the corresponding values extracted from experimental NMR data using the Lipari-Szabo theory. Several bonds in CheY cannot be modeled and “experimental” values are available only for 90 of the 129 residues. We chose the sampling time interval for the theoretical data
by optimizing the agreement between the orientational correlation function for the first bond along the protein sequence and its corresponding experimental value. Overall, the agreement between our theory and the experimental data is good for the entire primary sequence.
Both theory and experiments show several regions with significant loss of orientation, including the two ends of the protein, the α2-β3 loop 47,48,49,50,51,52, the β4-α4 loop (87–92), and the turn (76–80), consistent with the relaxation data discussed previously. An unphysical enhanced mobility appears for residues 13–16, where activated barrier crossing cannot be accounted for by the theory, as discussed previously. In general, order parameters extracted from NMR data have sharp peaks and fast transitions from completely immobile, to strongly mobile residues along the primary sequence. Our theory appears to exhibit somewhat smoother behavior. In conclusion, the agreement between theory and experiments in identifying regions of high and low mobility is good, with a correlation coefficient
Choosing slightly different values of the time interval, from
to
has a moderate effect on the quality of this agreement.
One of the advantages of our general first-principles approach is that it is independent of adjustable parameters. In contrast, phenomenological models optimize their agreement with a specific set of experimental data by adjusting multiple parameters. Such optimizations seldom achieve equally good agreement with data taken from other experiments that probe similar fundamental quantities.
In this section, we present a comparison between our theoretical predictions of temperature factors and their corresponding experimental values obtained from high resolution x-ray crystallography of E. coli CheY by Volz and Matsumura 36. Theoretical values of the temperature factors are function of the mean-square fluctuation about the equilibrium position of individual residues, and are expressed, using eigenvalues and eigenvectors of our equation of motion, following the general relation
![]() | (3.3) |
which provides the best agreement for the baseline between our predictions and the experimental data. The adopted value of 3.5ps is consistent with the expected value. Furthermore, we observe that differences in
on the order of
do not significantly change the quality of the agreement between theory and experiment.Fig. 5 (bottom panel) shows that both experiment and theory present similar regions of enhanced short-time fluctuations along the protein primary structure. Most of these regions correspond to flexible loops. The agreement between theory and experiment is good, with a correlation coefficient of 0.60. The observed agreement quality is comparable to other theoretical models used to calculate local fluctuations from x-ray structures, such as the network models pioneered by Bahar et al. and Tirion 46,47,48. A notable factor that affects the precision of the experimental data is the presence of intermolecular constraints due to crystal packing. Intermolecular constraints, which are absent in physiological conditions and are not included in our approach, suppress the amplitude of local fluctuations, and could be responsible for the lack of mobility observed, for example, in the region defined by residues 100–105.
In summary, Fig. 5 illustrates the utility of a general approach for protein dynamics, which describes in a unified theoretical framework data obtained from independent measurements and different experimental techniques.
We propose an equation of motion for protein dynamics that allows for the calculation of NMR relaxation parameters, starting from the protein static structure. The theory is ab initio, as it is developed using the conventional tools of nonequilibrium statistical mechanics. The equation of motion describing protein dynamics is analytically solved through matrix diagonalization, as opposed to a simulation procedure, to provide eigenvalues and eigenvectors from which the experimentally measurable quantities are expressed. Inputs to the equation of motion are structural correlations and friction coefficients obtained from molecular dynamics trajectories. Since the numerical values of all the physical parameters that enter our formalism (such as local friction, degree of exposure to solvent, bond flexibility, and statistical bond length) are determined from the analysis of computer simulation trajectories, the theory is free of adjustable parameters.
The theory is novel since it explicitly includes in the equation of motion the effects of hydrophobic regions. Hydrodynamic effects are accounted for in a completely general formalism that describes the dynamics of residues in the whole range, from totally exposed to completely screened from solvent. Interresidue interactions are mapped onto an effective pairwise potential, which is obtained from the statistical mechanical description of the intramolecular distribution. Our approach provides a realistic description of the system, since, through the site-specific hydrodynamic interaction and the effective interresidue potential, the specific nature of each amino acid is represented.
The approach is fairly accurate since good agreement with data from different experimental techniques and simulations is obtained. The theory makes predictions of NMR spin relaxation along the protein primary structure, in reasonably good agreement with experiments. We tested our approach through direct comparison to measurements of 1H-15N nuclear Overhauser effect, spin-lattice relaxation time, and spin-spin relaxation time for the signal transduction protein CheY, which controls the swimming behavior of E. coli bacteria. Since the timescale characteristic of the physical quantities measured by NMR relaxation exceed the timescale investigated in computer simulations, the analytical approach presented here projects information obtained from computer simulations into the long-time dynamical regime.
The theory is general, because its predictions are not limited to NMR relaxation experiments, but includes any dynamical quantity of interest related to protein equilibrium fluctuations. To highlight the generality of our approach we present comparisons of theoretical predictions with experimental Debye-Waller temperature factors obtained from high-resolution x-ray crystallography, with bond reorientation time correlation functions measured by molecular dynamics computer simulations, and with order parameters extracted from NMR measurements using the Lipari-Szabo model. We find that our theory agrees reasonably well with experiments and simulations for all of the dynamical properties investigated.
Because of its simplicity, our approach provides a straightforward procedure for a first assessment of dynamical properties from the static structure of a protein. In relation to existing NMR experiments, it provides an estimate of the dynamics of bond relaxation for those regions in the protein that cannot be measured experimentally. Furthermore, our approach could be useful as a systematic method to assess the long-time dynamical properties of a protein with known structure, before NMR relaxation experiments are carried on.
Since our theory is site specific, it allows for calculation of structural and dynamical properties of interest in relation to the biological function of the protein under study. Through a direct comparison with computer simulation data, the theory identifies regions along the protein primary sequence where relevant conformational energy barriers are present. For the protein CheY, we observe a correlation between residues forming such high barrier regions, and the extent to which those residues are conserved in the large family of response regulator proteins. This fact indicates that local energetic barriers, as identified by our method, likely play an important role in the signal transduction mechanism of the CheY protein.
We thank Prof. Andrew H. Marcus for a critical reading of this article.
This material is based upon work supported by the National Science Foundation under grant No. 0509808. We acknowledge the donors of the American Chemical Society Petroleum Research Fund for partial support of this research.
1. (2001). Protein dynamic studies move to a new time slot. Nat. Struct. Biol. 8, 912–914. CrossRef | PubMed
2. (2006). Characterization of the fast dynamics of protein amino acid side chains using NMR relaxation in solution. Chem. Rev. 106, 1672–1699. CrossRef | PubMed
3. (2001). NMR probes of molecular dynamics: overview and comparison with other techniques. Annu. Rev. Biophys. Biomol. Struct. 30, 129–155. CrossRef | PubMed
4. (2001). Dynamic activation of protein function: a view emerging from NMR spectroscopy. Nat. Struct. Biol. 8, 926–931. CrossRef | PubMed
5. (1988). Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics. (New York: John Wiley & Sons). PubMed
6. (1996). Contributions to conformational entropy arising from bond vector fluctuations measured from NMR-derived order parameters: application to protein folding. J. Mol. Biol. 263, 369–382. CrossRef | PubMed
7. (1984). Allostery without conformational change. A plausible model. Eur. Biophys. J. 11, 103–109. CrossRef | PubMed
8. (1981). Method for estimating the configurational entropy of macromolecules. Macromol 14, 325–332. PubMed
9. (2004). High-resolution field-cycling NMR studies of a DNA octamer as a probe of phosphodiester dynamics and comparison with computer simulations. Biochemistry 43, 3637–3650. PubMed
10. (2004). Discriminating the helical forms of peptides by NMR and molecular dynamic simulations. J. Am. Chem. Soc. 126, 10478–10484. CrossRef | PubMed
11. (1998). Molecular dynamics of staphylococcal nuclease: comparison of simulation with 15N and 13C NMR relaxation data. J. Am. Chem. Soc. 120, 5301–5311. CrossRef | PubMed
12. (1962). Spin relaxation processes in a two-proton system undergoing anisotropic reorientation. J. Chem. Phys. 36, 1–4. CrossRef | PubMed
13. (1967). Effect of internal rotation on angular correlation functions. J. Chem. Phys. 47, 5258–5268. CrossRef | PubMed
14. (1997). A protocol for the interpretation of side-chain dynamics based on NMR relaxation: application to phenylalanines in antamanide. J. Am. Chem. Soc. 119, 4272–4284. CrossRef | PubMed
15. (1982). Model-free approach to the interpretation of nuclear magnetic relaxation in macromolecules. 1. Theory and range of validity. J. Am. Chem. Soc. 104, 4546–4559. CrossRef | PubMed
16. (1982). Model-free approach to the interpretation of nuclear magnetic relaxation in macromolecules. 2. Analysis of experimental results. J. Am. Chem. Soc. 104, 4559–4570. CrossRef | PubMed
17. (2003). Temperature dependence of NMR order parameters and protein dynamics. J. Am. Chem. Soc. 125, 11158–11159. CrossRef | PubMed
18. (1997). Contributions to protein entropy and heat capacity from bond vector motions measured by NMR spin relaxation. J. Mol. Biol. 272, 790–804. CrossRef | PubMed
19. (2000). Dynamic Light Scattering. (NY: Dover Publications, Mineola). PubMed
20. (1990). Deviations from the simple two-parameter model-free approach to the interpretation of nitrogen-15 nuclear magnetic relaxation of proteins. J. Am. Chem. Soc. 112, 4989–4991. CrossRef | PubMed
21. (2003). Beyond the decoupling approximation in the model free approach for the interpretation of NMR relaxation of macromolecules in solution. J. Am. Chem. Soc. 125, 8400–8404. CrossRef | PubMed