Article Outline

Article Information

PubMed

Related Articles

  • …more

Copyright © 2005 The Biophysical Society. All rights reserved.
Biophysical Journal, Volume 89, Issue 5, 2939-2949, 1 November 2005

doi:10.1529/biophysj.105.065664

Biophysical Theory and Modeling

Comparison of Mode Analyses at Different Resolutions Applied to Nucleic Acid Systems

Adam W. Van Wynsberghe* and Qiang Cui*Go To Corresponding Author 

* Graduate Program in Biophysics and University of Wisconsin, Madison, Wisconsin
Department of Chemistry and Theoretical Chemistry Institute, University of Wisconsin, Madison, Wisconsin

Address reprint requests to Q. Cui, Tel.: 608-262-9801.

Abstract

More than two decades of different types of mode analyses has shown that these techniques can be useful in describing large-scale motions in protein systems. A number of mode analyses are available and include quasiharmonics, classical normal mode, block normal mode, and the elastic network model. Each of these methods has been validated for protein systems and this variety allows researchers to choose the technique that gives the best compromise between computational cost and the level of detail in the calculation. These same techniques have not been systematically tested for nucleic acid systems, however. Given the differences in interactions and structural features between nucleic acid and protein systems, the validity of these techniques in the protein regime cannot be directly translated into validity in the nucleic acid realm. In this work, we investigate the usefulness of the above mode analyses as applied to two RNA systems, i.e., the hammerhead ribozyme and a guanine riboswitch. We show that classical normal-mode analysis can match the magnitude and direction of residue fluctuations from the more detailed, anharmonic technique, quasiharmonic analysis of a molecular dynamics trajectory. The block normal-mode approximation is shown to hold in the nucleic acid systems studied. Only the mode analysis at the lowest level of detail, the elastic network model, produced mixed results in our calculations. We present data that suggest that the elastic network model, with the popular parameterization, is not best suited for systems that do not have a close packed structure; this observation also hints at why the elastic network model has been found to be valid for many globular protein systems. The different behaviors of block normal-mode analysis and the elastic network model, which invoke similar degrees of coarse-graining to the dynamics but use different potentials, suggest the importance of applying a heterogeneous potential function in a robust analysis of the dynamics of biomolecules, especially those that are not closely packed. In addition to these comparisons, we briefly discuss insights into the conformational space available to the hammerhead ribozyme.

Introduction

Studying biologically relevant problems via computational methods is always a compromise between the level of detail and accuracy needed to answer a specific question and the availability of computational resources. This is due to both the large size of biological polymers and the long timescales over which they exhibit interesting behavior. Processes such as protein folding or conformational change often can occur on the millisecond timescale, whereas conventional molecular dynamics (MD) simulations need to propagate in femtosecond time-steps to simulate atomic motion accurately. This necessitates 1012 evaluations of the energy and its gradient, which is an impractical if not impossible requirement given commonly available computing power. Consequently, considerable effort has been devoted to developing techniques to gain access to specific quantities of interest while treating other degrees of freedom that do not affect those quantities in less detail 1,2,3,4,5. The most commonly coarse-grained aspect of molecular simulation is solvent degrees of freedom 1,3, but other methods also approximate properties of the macromolecule. For example, to study enzyme mechanisms, hybrid QM/MM potential energy functions have been developed 6 that treat the chemically important residues quantum mechanically, whereas the remaining residues’ electronic structure is approximated with a classical force field. Biased dynamics such as targeted MD 2 have been designed to force specific transitions to take place during nanosecond simulations; attempts have also been made 7,8,9 to recover the equilibrium thermodynamics using Jarzynski’s equality 10 from such nonequilibrium simulations. The successes of these few examples vary, but in each case, simplifying approximations are made to access the desired properties. These approximations can only be useful if they do not affect the observable of interest. Obviously, this must be checked through careful test calculations using representative systems.

The computational studies in this work test the validity of the approximations made in several mode analyses, techniques that are generally used to identify mobile structural motifs in large macromolecular systems. Since this observable has major contributions from motions that occur at long timescales, only highly approximate methods are able to access this type of information for very large systems. One such method that has been successfully applied to protein systems for more than three decades is classical normal-mode analysis (CNMA) 11,12,13,14. Traditionally, this method represents the potential energy surface of a system by a harmonic approximation around a single minimum in an all-atom molecular mechanics force field. Construction and subsequent diagonalization of the mass-weighted second derivative matrix (the Hessian matrix) allows the equations of motion of this simplified system to be solved analytically (see Methods for details). Although this level of approximation is not appropriate for the analysis of detailed side-chain motions, useful information about large-scale motion and mobile structural motifs can be readily obtained; a number of studies 15,16,17,18,19,20 have illustrated that CNMA often gives rather reliable results in this context.

Unfortunately, even for an approximate method such as CNMA, the larger biological complexes such as the ribosome, ATPases, and RNA polymerases require such significant memory storage and time for diagonalization as to be intractable on conventional computers. To treat such systems, the blocked Hessian method was developed 21 and subsequently improved upon 17. This method transforms the Hessian into a reduced space spanned by the rotational and translational degrees of freedom of user-defined blocks, usually taken as a single residue or nucleotide. This reduced space results in a much smaller Hessian that can be more easily stored and diagonalized. The block approximation was found to be valid in calculations of residue fluctuations in protein systems where only the low-frequency modes are of interest. Currently known as the block normal-mode analysis, this technique has made it possible to study the thermal fluctuations of large macromolecular systems while still using an all-atom molecular mechanics force field. In addition, other methods have also been devised to deal with the problem of storage and diagonalization of large matrices 22.

For even larger systems or when computational power is limited and calculational speed is of the utmost importance, a further simplified method of mode analysis, the elastic network model (ENM), has been proposed 23,24,25,26,27,28,29. It uses a simplified potential to create a network of harmonic springs that connect atoms that are within a given cutoff distance. This method has been used successfully to study multiple RNA polymerases from different organisms 30 as well as the extremely large ribosomal multisubunit complex 31,32. As it is the least detailed method described here, it also has the most severe limitations. The arbitrary nature of the utilized spring constant removes any ability to predict the magnitude of thermal fluctuations, and the results also have the lowest resolution, usually at the residue level.

All of the above methods have been extensively tested for and used in protein systems. They have not been rigorously tested for nucleic acids, however. Since the early 1980s, it has been clear that nucleic acids, particularly RNA, can play cellular roles other than just genomic storage 33,34,35,36. Crystal and NMR structures of RNA molecules that exhibit enzymatic and regulatory functions have been numerous in the last 10 years 37,38,39,40,41,42,43,44,45. Discussions of the mechanisms and the mode of action of these systems have led to questions about the inherent flexibilities in these structures 46,47,48,49,50,51,52,53,54. Single-molecule experiments have suggested that the dynamics of even small RNA systems can be strikingly complex and intimately coupled to function 55. Computational efforts that could assist in understanding the flexibility present in theses systems have the potential to be very useful. It is not clear, however, which, if any, of the mode analyses can be reliably applied to nucleic acids. The stabilizing interactions in nucleic acids are mainly hydrogen-bonding and base-stacking, whereas in proteins hydrophobic effects dominate. Protein structures are generally globular in shape and densely packed, whereas nucleic acids tend to be nonspherical with elongated and loosely packed global structures. With these differing structural features and interactions, it is not known a priori if these model techniques, which have worked surprisingly well in proteins, are reliable for nucleic acids. To our knowledge, the only tests of mode analysis applied to nucleic acid systems are a work from Viduna et al. 56 and a work from Zacharias 57. The latter work compared a quasiharmonic analysis of a molecular dynamics trajectory of a simple double-stranded DNA octamer with a classical normal-mode calculation. In this article, we hope to extend this work to nucleic acid systems with more complex tertiary structures as well as to other types of mode analyses.


Methods

All calculations were performed with the CHARMM 58 suite of programs compiled to use the Cornell force field 59, normally associated with AMBER 60. This choice of force field was necessitated by our inability to acquire stable MD trajectories with the CHARMM27 nucleic acid force field 61 (data not shown). The CHARMM-formatted Cornell parameters were obtained from Dr. Tom Cheatham III. Nonbonded parameters for Mg2+ were adapted from Aqvist 62. Both equilibration and productive simulations were performed using periodic boundary conditions in the microcanonical ensemble. Electrostatic interactions were treated using the particle-mesh Ewald summation 63, and the van der Waals potential was smoothly shifted to zero at 12.0Å 64.

Two different systems were studied in this work: the hammerhead ribozyme (PDB code #301D) 37 and the guanine riboswitch (PDB code #1U8D) 40 and are shown in Fig. 1. All four calculation types—quasiharmonic analysis of a molecular dynamics trajectory, classical normal-mode analysis, block normal-mode analysis, and the elastic network model—were applied to the hammerhead ribozyme. For the guanine riboswitch, only the latter three calculations were performed.

Display large version of this figure
Figure 1
The hammerhead ribozyme (A) and the guanine riboswitch (B) are shown in surface representations. Atoms closer to the center of mass of each system are colored red, whereas those far away are colored blue with a linear gradient for intermediate distances. Note that the guanine riboswitch is more densely packed than the hammerhead ribozyme. This figure was created with VMD 88.

Molecular dynamics simulations and quasiharmonic analysis (QHA)

Nucleic acid systems, especially those with marginal stability such as the hammerhead ribozyme, must be carefully prepared to ensure a stable molecular dynamics (MD) simulation. In particular, because these systems are so highly charged, accurate treatment of the electrostatics is essential. To prepare the system, the hammerhead ribozyme along with its five associated Mg2+ ions was immersed in a rhombic dodecahedron (RHDO) of thermalized TIP3P waters 65 with its characteristic dimension set at 78.28Å. Twenty-nine Na+ ions were added to neutralize the total charge via Solvate 1.0 66.

A careful equilibration scheme was adapted from the work of Hermann et al. 67. The equilibration in this work was accomplished in four stages. Initially, the ribozyme and the counterion positions were held fixed, while the water coordinates as well as the lattice dimension of the RHDO unit cell were optimized with 2000 steepest-descent (SD) steps. The optimized RHDO dimension was 78.03Å and remained fixed throughout the rest of equilibration and productive simulations. Keeping the constraints as before, the system was then subjected to 75ps of dynamics starting at 50K with the temperature increasing in 50K increments every 5ps until 300K was reached. Temperature was checked and equilibrated (if necessary) every 1ps. The second stage of equilibration removed the constraint from the counterions to allow their positions to adjust to the ribozyme and the water. Both counterion and water positions were optimized with 2000 SD steps followed by another 75ps of heating dynamics as before. In the third stage of equilibration, the ribozyme was allowed to move but was restrained by a harmonic potential to allow unfavorable contacts to dissipate without disturbing the structure significantly. Initially a mass-weighted force constant of 25 kcal/mol/amu/Å2 was applied to all atoms in the ribozyme and 2000 SD steps were performed. Using the same heating procedure as before, the system was heated to 300K. The harmonic constraints were then gradually relaxed with 75ps of simulation at each distinct constraint value. Velocities were reassigned at 300K between each constraint. The values for the force constant were (all in kcal/mol/amu/Å2): 20, 15, 10, 5, 1, 0.1, 0.01, and 0.001. The last stage of equilibration was 75ps of simulation, with the entire system evolving without any constraints along with velocity rescaling every 1ps if the temperature had veered >10° from 300K. Productive simulation differed from this last stage of equilibration only by the absence of this velocity rescaling. Productive simulations for the hammerhead system were carried out for a total of 13.5ns with coordinates saved every 100 fs. For the majority of this time, the system exhibited stable dynamics with an all-atom RMSD of ∼2.75Å, as shown in Fig. 2. This value is somewhat larger than is typically seen in protein simulations because these small RNA systems are fairly flexible. It should also be noted that the Stem I conformation observed in the crystal structure used in these simulations, PDB #301D, is not the only low-energy conformer of the hammerhead. A curved conformation of Stems I and II was observed in a crystal structure of a RNA-DNA hammerhead ribozyme-inhibitor complex 68. The evolution of the structure away from the conformation seen in 301D to this curved conformation is reflected in the increase in RMSD in the first nanosecond of simulation. At ∼12.5ns, the system developed a local instability in the unpaired region between Stems I and III, which then quickly caused alterations in the structure and a sharp increase in the RMSD. Removing both the beginning and the end of this trajectory still results in >10ns of stable simulation in which the system evolves as a solution-phase ribozyme. Thus, we chose to perform the quasiharmonic analysis on only this center part of the trajectory.

Display large version of this figure
Figure 2
All-atom RMSD calculated every 100 fs with respect to the crystal structure (301D) of the hammerhead ribozyme. The outlined box denotes the portion of the trajectory used for the quasiharmonic analysis.

Principal component analysis is a general statistical technique that allows one to analyze a distribution to extract information about its variances and the directions of those variances 69,70. When applied to molecular dynamics it is usually known as essential dynamics or quasiharmonic analysis (QHA). In this specific case, the distribution of interest is that of the atomic positions, with the variance relating to the thermal fluctuations of the system. QHA provides data like any of the other mode analyses discussed in this article: directions that correspond to certain modes of motion and associated variances that describe how energetically costly it is to move in those directions. QHA is considered the more exact of the techniques discussed in this article, since it is derived from a MD trajectory sampling an atomic and anharmonic potential energy function. Because of this, results from QHA are a harmonic approximation of the free energy surface, whereas the other methods discussed are harmonic approximations to a potential energy surface. Nevertheless, it is understood that the reliability of QHA depends on the quality of the force field as well as on the length of the MD simulation 71.

The directions of motion and associated variances from QHA are the eigenvectors and eigenvalues of the covariance matrix from the MD trajectory, C72,

(1)
where qi is the ith mass-weighted displacement coordinate from the minimized structure and the brackets average over all time-steps. Fluctuations can be calculated directly from the MD trajectory.


Classical normal-mode analysis (CNMA)

Classical normal-mode analysis is the second most detailed calculation in this work, but it is far more approximate than QHA. It does not require a MD trajectory, so the calculations are far easier to accomplish and can be finished in less than a day for small systems. The only preliminary step for CNMA is a local energy minimization. This minimization was completed using cycles of the adapted-basis Newton-Raphson method with gradually decreasing harmonic constraints to remove local steric clashes without perturbing the structure significantly. Minimizations were considered complete when only six modes (corresponding to rotational and translational degrees of freedom) had frequencies less than or equal to zero. For the hammerhead ribozyme, this required minimizing the RMS energy gradient to 0.0001kcal/mol per Å, whereas the guanine riboswitch was minimized to 0.001kcal/mol per Å. This minimization step is necessary for CNMA so that the linear term in the Taylor expansion of the potential is zero 13. Truncating at the second-order leaves a system whose equations of motion can be analytically solved by diagonalizing the Hessian matrix, H,

(2)
where V is the potential and qi is a mass-weighted displacement coordinate. The diagonalization of this matrix reduces the problem from a system of 3N coupled harmonic oscillators, a rather difficult problem, to a system of 3N uncoupled harmonic oscillators, a trivial problem to solve. This procedure yields eigenvectors and eigenvalues that can be used to calculate atomic fluctuations using
(3)
where kB is Boltzmann’s constant, T is the temperature, mk is the atomic mass, and Wki is the kth component of the eigenvector with associated eigenvalue ωi.


Block normal-mode analysis (BNMA)

Block normal-mode analysis was originally suggested by Tama et al. 21 and improved by Li and Cui 17, and is a technique that extends CNMA to larger systems. BNMA constructs the same Hessian, Eq. (2), as in CNMA, but then the Hessian is projected onto a space spanned by the rotational and vibrational degrees of freedom of predefined blocks, reducing the Hessian storage requirement by approximately a factor of 200 and the diagonalization time by a factor of 3000 if one nucleotide equals one block. The resulting eigenvectors are then projected back to the all-atom space and have the same form as the eigenvectors from CNMA. This procedure perturbs the magnitudes of the eigenvalues, but in a linear fashion for the low-frequency modes 17,21. A plot of CNMA eigenvalues versus BNMA eigenvalues returns a straight line whose slope can be used to scale the BNMA results; the scaling factor has been observed to be insensitive to the system being studied and correlates with the size of the defined block 17,21. Fluctuations are again calculated using Eq. (3), but with these scaled frequencies.


Elastic network model (ENM)

The elastic network model was originally suggested by Tirion 23 and is the most coarse-grained, and subsequently, is the least computationally expensive of all the methods discussed in this article. It requires no minimization and uses a simplified representation of the macromolecule and the intramolecular potential. In this work, each base was represented by a single sphere located at the position of its phosphorus atom. Any pair of spheres within a 20Å cutoff radius, rcut, of each other was connected with a Hookean spring at its equilibrium position with a force constant of 0.95kcal/mol/Å224. Previous work has shown that slight modifications to the coarse-graining scheme, e.g., using a sphere for both the phosphate and the N1 or N9 base nitrogen or changing rcut, does not affect the results significantly 73. Fluctuations are calculated with Eq. (3). Because the spring constant used is entirely arbitrary, ENM cannot predict the absolute magnitudes of the fluctuations. To compare the trends in the ENM root mean-square fluctuations (RMSFs) to the other calculations, the ENM RMSFs must be scaled. A simple, uniform scale factor was chosen so that the sum of the ENM RMSFs equaled the sum of the CNMA RMSFs.


Methods of comparison

In addition to comparing RMSFs from different mode analyses, the directional nature of the modes can also be examined. One can compare the subspaces each set of low-frequency modes spans by examining the spanning coefficients as defined by 17

(4)
where the Wi values correspond to individual modes from the appropriate technique. Obviously, if the sum in Eq. (4) went over all modes, then each SPANi would be equal to 1 and this metric would be useless. The useful comparison is for the low-frequency modes only, because these modes significantly contribute to the thermal fluctuations of the system. Therefore, the sum is limited to only a few modes, and in this work, only the 41 lowest-frequency modes are used.

One can make a more detailed comparison of the sets of modes by looking at the individual terms that are summed over in Eq. (4), the expansion coefficients. This quantity, given by the absolute value of the expression within the parentheses in Eq. (4), is simply the absolute value of the dot product between two eigenvectors and will be referred to as the overlap. Calculating all possible overlaps between two sets of modes gives rise to an overlap-matrix 74. If the two sets of modes were identical, the matrix would have ones on the minor diagonal and zeros everywhere else. Spanning coefficients can be generated from an overlap matrix by summing the squares of an individual column.

Note that the ENM eigenvectors have a different dimensionality than the other, all-atom calculations. To compare the eigenvectors of ENM with the other calculations directly, the components corresponding to the phosphorus atoms in the all-atom modes are abstracted and then renormalized.



Results and discussion

Hammerhead ribozyme

The hammerhead ribozyme is a particularly useful and interesting system for the purpose of this work because it is relatively small and therefore more easily studied computationally with multiple techniques. It is also an attractive system because questions regarding its mechanism can be explored using molecular dynamics. Although this system has been studied for more than 15 years, the detailed mechanism is still not completely elucidated 75. In this work, we attempt to investigate one of the remaining questions: Is the P9/G10.1 Mg2+ able to play a role in the enzymatic mechanism?

It was observed that the abolishment of the binding site of P9/G10.1 Mg2+ via a phosphorthioate substitution results in a decrease in catalytic activity of nearly 1000-fold 51. However, various crystal structures have consistently placed the binding site of this Mg2+ nearly 20Å away from the cleavable phosphodiester bond 37,68,76. Attempting to explain this discrepancy, Peracchi et al. proposed that the hammerhead ribozyme undergoes a conformational change that brings this Mg2+ close to the active site 51. To test this hypothesis, we ran 12.5ns of molecular dynamics and analyzed the trajectory with quasiharmonic analysis. This type of analysis allows one to visualize the soft motions of a macromolecular system and thus determine which parts of the system are mobile (i.e., exhibit large thermal fluctuations) and which are rigid (i.e., small thermal fluctuations). In Fig. 3, the root mean-square fluctuations (RMSFs) as computed from the MD trajectory are mapped onto the phosphate backbone. As expected, the most mobile parts of the ribozyme are the three stem regions at the extremities away from the core. The region known as Domain I contains the active site and seems to be the most rigid as it has the lowest calculated fluctuations. Domain II has larger thermal fluctuations, but it is still not as mobile as the three stems. Using QHA, it is possible to not only calculate the magnitude of the nanosecond timescale fluctuation but also obtain its direction. Figure 4AB, show end-structures of the two quasiharmonic modes with the highest variances (lowest associated frequencies). These two modes show that Stems I and II primarily flex into and out of the cleft between them with a motion much like a pair of pliers. Stem III’s major direction of fluctuation is in and out of the plane formed by the three helices.

Display large version of this figure
Figure 3
RMSFs from the MD trajectory for each phosphate atom is plotted onto a tube structure of the hammerhead ribozyme. The conformation corresponds to the average structure from the portion of the MD trajectory shaded in Fig. 2. The black sphere highlights the binding site of the P9/G10.1 Mg2+. This figure was created with MOLSCRIPT 2.1.2 89.
Display large version of this figure
Figure 4
The extremes of the quasiharmonic modes with the lowest (ω = 0.26cm−1) (A) and the next-to-lowest (ω=1.38cm−1) (B) frequencies. In A, the mode is shown with a temperature of 300K, whereas in B, the mode is shown with a temperature of 2500K. Since mode 1 has a much lower frequency, the amplitude of motion allowed at a given temperature is much greater. B is shown at the elevated temperature to make the direction of motion easier to visualize. This figure was created with MOLSCRIPT 2.1.2 89.

What can these simulations tell us about the possible large-scale motion in the hammerhead ribozyme? The major conclusion from these simulations is that Stems I and II are highly mobile as reflected by their large magnitude of thermal fluctuations. As has been suggested by earlier experiments, we directly observe the conformational freedom of these two helices 77. This motion is characterized by a hinge near the active site and allowed the distance between Stems I and II to vary on the order of 5Å during the MD trajectory. Perhaps surprisingly, we see little motion in the active site or Domain II. The claim that, during catalysis, Domain II approaches Domain I remains controversial, however 78. In this simulation, we see no evidence that such a conformational transition takes place. Unfortunately, however, our simulation result is limited by its timescale. With only 12.5ns of simulation, any conformation change or rare event that takes place on longer timescales cannot be observed. It is possible that, in order to bring Domains I and II into close proximity, a large reorganization of the hammerhead structure must take place. Recent cross-linking studies have suggested that this indeed may be the case 79.


Mode analysis comparison: hammerhead ribozyme

The primary goal of this work is to study how well the approximations made in various types of mode analysis perform for nucleic acid systems. Because BNMA is a direct approximation to CNMA, the appropriate test for BNMA is a comparison to CNMA. For consistency, all other techniques are compared to CNMA as well, although in principle, QHA is the most detailed and physically motivated technique presented in this work. QHA, CNMA, and BNMA are all-atom methods and their results depend on the quality of the molecule force field with QHA being most sensitive; ENM is not an all-atom technique, but is computationally efficient and involves the least amount of parameterization.

Since these techniques are primarily used to identify mobile regions in a macromolecular system, the first and most relevant comparison to make is an examination of the RMSFs. As shown in Fig. 5, the CNMA results match the MD RMSFs in a qualitative way, as the trends match up very well. The magnitude predicted by CNMA is in fairly good agreement but seem to underestimate the MD fluctuations. This is reasonable, given that the CNMA results come from a harmonic approximation to the potential energy surface. The results also show that BNMA matches both the trends and the magnitudes of the RMSFs calculated from the MD and CNMA very well, and suggest that the BNMA approximation at least holds for the hammerhead ribozyme. The last method, the ENM, with the popular parameterization, performs rather poorly. The overall magnitude of the ENM fluctuations is correct, but they have been scaled to allow easier comparison. They would not be predictive if the CNMA results were not already known (see Methods for details). Even the qualitative trends in the RMSFs were not reproduced. In particular, it misses the minimum in RMSF at residue index 8 and poorly miscalculates the trend and magnitude of Stem I of the enzyme strand.

Display large version of this figure
Figure 5
RMSFs for the hammerhead ribozyme from the four mode analyses. The Residue Index is as follows: 1–15 corresponds to the phosphorus atom from nucleotides within the enzyme strand, going from 5′ to 3′ and 16–39 are from the substrate strand, also going from 5′ to 3′. Note that the 5′ end of each strand is terminated with a hydroxyl, not a phosphate, so that no fluctuation is plotted for this residue. (See Fig. 1 from Scott et al. 37 for accepted nucleotide numbering.)

As to the eigenvectors, Fig. 6 shows that the BNMA modes span the CNMA modes very well, with the first 11 modes having a spanning coefficient >0.9 with a slow decrease as the frequency of the mode increases. The QHA modes are nearly equally as impressive, illustrating that CNMA, and by extension BNMA, modes well represent the conformational space accessed by the MD trajectory. This again gives credence that both CNMA and BNMA are valid approximations for this system. The ENM results are not quite as impressive, however. The spanning coefficients for the lowest-frequency modes, the ones important in calculating the thermal fluctuations, are much lower than observed in the QHA or BNMA. The 10 lowest-frequency modes’ spanning coefficients fluctuate at ∼0.75 and then fall off. This raises serious concerns that ENM may not be reproducing the structural fluctuations observed in the MD trajectory.

Display large version of this figure
Figure 6
Spanning coefficients (Eq. (4)) for the hammerhead ribozyme. Only the 41 lowest-frequency modes are included in the sum for this quantity.

As seen in Figure 7A, the individual CNMA modes match up reasonably well with those from the QHA. Because BNMA is a direct approximation of CNMA, the correlation between the two sets is also very good, as the overlap matrix in Figure 7B shows high overlap only near the diagonal, especially at the low-frequency modes. Figure 7C, however, shows the very poor overlap between the ENM modes and the CNMA modes. There is a weak correlation between the two sets at the very lowest frequencies, but very little elsewhere. It should be noted that these comparisons are only meaningful because the eigenvalues associated with the discussed eigenvectors are well-correlated (data not shown).

Display large version of this figure
Figure 7
Overlap matrices for the hammerhead ribozyme. Perfect correspondence between two techniques would result in a black line along the minor diagonal. The scale along each axis is 41 modes.

The most immediate conclusion from these data is that CNMA seems to be a reliable way to identify mobile regions (i.e., exhibit large thermal fluctuations) in nucleic acids. Even with its harmonic approximation and the lack of explicit solvent, the RMSFs from CNMA predict which parts of the hammerhead ribozyme are mobile and which are rigid, and do so in a semiquantitative manner. The directional motion is also reproduced, as is illustrated in both the spanning coefficients and the overlap matrices. The data suggest, perhaps unsurprisingly, that the BNMA approximation holds up well, since BNMA results match those from CNMA. Unfortunately, the same success does not hold for the ENM. The RMSFs have some qualitative inaccuracies and the modes generated from ENM are not very similar to the other three methods. It is perhaps not too surprising that CNMA, and therefore BNMA, are valid for nucleic acids as well as for proteins systems. The changes in the interactions are coded into the parameters used in the actual force field. Obviously the changes in relative weights of certain types of interactions do not perturb the validity of the harmonic approximation used in CNMA. In the ENM, however, there is only one parameter that actually matters: the cutoff distance, rcut. Physically, the ENM can be viewed as a density detector; any atoms within rcut are included in a measure of the density of adjacent atoms. The denser the packing, the more springs are added, and the more rigidly the given atom is held. This correlates well to the trend in thermal fluctuations in proteins where, for the most part, the systems have a dense core and the mobile regions are protrusions. Previous work has shown that ENM has difficulties dealing with protrusions that are pathologically extended from the majority of the protein 80. In the case of the hammerhead ribozyme, nearly the entire system is a loosely packed, extended system. This is generally true for most small nucleic acid systems, where the burial of hydrophobic surface does not dominate the folding, as is often the case with proteins. We hypothesize that it is this irregularly packed structure, on the scale of rcut, that causes the breakdown of the ENM in the hammerhead system. To test this hypothesis, we performed the same calculations (excepting the MD simulation) on a more densely packed system, the guanine riboswitch.


Mode analysis comparison: guanine riboswitch

The guanine riboswitch has a more compact structure as compared to the hammerhead ribozyme, as shown in Fig. 1. Instead of three elongated double-helical regions made from two separate RNA strands as in the hammerhead, the riboswitch is made from a single self-complementary RNA strand that forms a tightly packed pancake-shaped structure. For the purposes of this work, this system is a small RNA like the hammerhead but is somewhat more compact. Using the same analysis tools as discussed in the above section on the hammerhead ribozyme, we show that the ENM is able to predict atomic fluctuations for this densely packed system more accurately.

As illustrated in Fig. 8, the BNMA calculations match both the magnitude and the trends of the CNMA RMSFs very well for the guanine riboswitch. The quality of this match is similar to that seen in Fig. 5 for the hammerhead ribozyme. However, unlike in the hammerhead ribozyme, the ENM results have no major inaccuracies and for the most part capture the trends from the CNMA well. Although there are some minor discrepancies, in this measure, the ENM performs much better for the guanine riboswitch than in the hammerhead ribozyme.

Display large version of this figure
Figure 8
RMSFs for the guanine riboswitch. Only three curves are shown, since no MD trajectory was run for this system. The Residue Index corresponds to the single RNA strand going from 5′ to 3′. This structure also lacked a 5′ phosphate group, and no fluctuation from the 5′ terminal nucleotide is plotted.

As seen in the hammerhead system, Fig. 9 shows that BNMA approximates the CNMA riboswitch results very well with high spanning coefficients, especially at low frequencies. The ENM results still do not match those from BNMA, and are similar in quality as for the hammerhead ribozyme. The BNMA/CNMA overlap matrix shown in Figure 10A illustrates that the two sets of modes are, as expected, highly correlated. Figure 10B shows that the ENM results are still not very good, but at low frequencies, the ENM results do correlate with CNMA very well. This correlation quickly dies off as the frequency of the modes increases, however. Overall, the ENM results from the guanine riboswitch are better than those from the hammerhead ribozyme. RMSFs for the riboswitch match up qualitatively with CNMA, whereas those for the hammerhead have some discrepancies. The ENM for the riboswitch still does not do a great job at predicting the direction of the structural mobility, but ENM applied to the hammerhead is worse. ENM does not work particularly well with either system, though, when compared to BNMA. From these data and previous analysis using a range of different parameters in the ENM model 73, we suggest that the ENM does not work well for systems with loosely packed structures, and therefore is not an appropriate choice for small RNA systems.

Display large version of this figure
Figure 9
Spanning coefficients (Eq. (4)) for the guanine riboswitch. Only the 41 lowest-frequency modes are used to calculate this quantity.
Display large version of this figure
Figure 10
Overlap matrices for the guanine riboswitch. The scale along each axis is 41 modes.


Concluding remarks

A stable, 12.5-ns MD simulation of the hammerhead ribozyme was performed, the longest reported hammerhead simulation by >10ns. Analysis of this trajectory allows us to identify mobile structural motifs in this small RNA system. Our results suggest that the most mobile regions are the three stems, while the active site is fairly rigid. We can also suggest that, in order for Domain II to approach the active site, a fairly extensive reorganization of the hammerhead must take place.

We have also used this MD trajectory as a benchmark to test the validity of several types of mode analyses in a complex nucleic acid system, within the limit of 10-ns-scale simulation time. The widely used CNMA provided qualitative and semiquantitative prediction of the magnitude of thermal fluctuations and the directionality of the soft motions. The BNMA approximation also was shown to hold for the two small RNAs studied here, and we expect that this observation will transfer to other RNA systems. The ENM model, however, had difficulties in matching the trends in fluctuations from the MD trajectory and the other two mode analyses based on physical potentials; this result was not found to be sensitive to the variation in the cutoff or identity of nodes in the network 73, although it is possible that more extensive reparameterization of the model will improve the results (I. Bahar, private communication). Another set of mode analyses performed on the guanine riboswitch suggests that the reliability of the ENM model may be improved for larger, more compact systems; indeed, the agreement between ENM and BNMA for 30S and 50S ribosome was found to be quite impressive 73.

It is interesting to discuss these observations briefly, in the context of constructing appropriate coarse-grained models for biological molecules 81. This is a topic of tremendous interest considering the rapid development of structural biology, which has made structures of biological systems at various resolutions and of increasing size and complexity readily available. In an increasing number of research articles, it has been demonstrated that a simple model like the elastic network model works extremely well for many large macromolecules, even when only low-resolution structural data were available 82,83. An interesting question, then, is “why does ENM work?” On physical grounds, one may argue that global flexibility is, by definition, a collective property and therefore is not expected to be sensitive to fine structural details but depends mainly on the overall shape of the macromolecule. This argument explains why models like BNMA, which neglect motions at small scales but maintain physical interactions, work well for many protein and nucleic acid systems. It remains intriguing why ENM, which assumes a homogeneous interaction strength (i.e., uniform spring constant in different parts of the macromolecule), works well for many protein complexes. The observation in this work that the quality of ENM depends on the compactness of the RNA system offers an interesting clue: in compact systems, such as most globular proteins 84, interatomic interactions, on average, are fairly homogeneous at the residue-level. That BNMA—which coarse-grained the dynamics of the system to a similar degree to ENM but kept physical interactions—produced reliable results, further corroborates the importance of heterogeneous interactions, even at the residue level, in systems that are not compact. A more quantitative connection between the compactness of the system and appropriate level of coarse-graining is under study. Finally, the fact that small changes in sequence (due to either mutation or organism origin) at hinge residues can often have profound impact on the displacement along functionally important direction of biomolecules (e.g., point mutations can abolish the motility of molecular motors 85,86) also suggests the importance of heterogeneous interactions, although it is possible that such mutations mainly modify the free energy cost for moving along a particular direction instead of significantly modifying the character of low-frequency modes. Recent work has shown that these hinges can be identified by introducing nonhomogeneous interactions into the ENM 87, which actually relies on the variation of low-frequency modes upon changing the elastic force constants associated with specific residues 87. In other words, although homogeneous ENM is indeed incredibly useful in many studies, it is also of great interest to develop coarse-grained models that go beyond the limit of the homogeneous elastic model for robustly describing the flexibility of large biomolecular systems, for which the resolution of experimental characterization is typically low and the cost of calculations is an important factor.


Acknowledgments

The authors thank Tom E. Cheatham 3rd for helpful discussions regarding nucleic acid simulations as well as coordinates used for initial testing of simulation protocols. The authors also thank Sam Butcher for suggesting the guanine riboswitch as a particularly compact small RNA structure for study. Guohui Li’s assistance with the block normal-mode and elastic network mode code was also greatly appreciated. Discussions with Profs. I. Bahar and C. Brooks are appreciated.

A.W.V.W. was supported by a National Science Foundation predoctoral fellowship. Q.C. is an Alfred P. Sloan Research Fellow. Computational resources from the National Center for Supercomputing Applications at the University of Illinois are greatly appreciated.

References

1. Feig, M., and Brooks, C.L. (2004). Recent advances in the development and application of implicit solvent models in biomolecule simulations. Curr. Opin. Struct. Biol. 14, 217–224. CrossRef | PubMed

2. Schlitter, J., Engels, M., and Kruger, P. (1994). Targeted molecular-dynamics—a new approach for searching pathways of conformational transitions. J. Mol. Graph. 12, 84–89. CrossRef | PubMed

3. Im, W., Berneche, S., and Roux, B. (2001). Generalized solvent boundary potential for computer simulations. J. Chem. Phys. 114, 2924–2937. CrossRef | PubMed

4. Shurki, A., and Warshel, A. (2003). Structure/function correlations of proteins using MM, QM/MM, and related approaches: methods, concepts, pitfalls, and current progress. Adv. Protein Chem. 66, 249–313. CrossRef | PubMed

5. Gao, J.L. (1996). Hybrid quantum and molecular mechanical simulations: an alternative avenue to solvent effects in organic chemistry. Acct. Chem. Res. 29, 298–305. PubMed

6. Field, M.J., Bash, P.A., and Karplus, M. (1990). A combined quantum-mechanical and molecular mechanical potential for molecular-dynamics simulations. J. Comput. Chem. 11, 700–733. CrossRef | PubMed

7. Rodriguez-Gomez, D., Darve, E., and Pohorille, A. (2004). Assessing the efficiency of free energy calculation methods. J. Chem. Phys. 120, 3563–3578. CrossRef | PubMed

8. Park, S., and Schulten, K. (2004). Calculating potentials of mean force from steered molecular dynamics simulations. J. Chem. Phys. 120, 5946–5961. CrossRef | PubMed

9. Jensen, M.O., Park, S., Tajkhorshid, E., and Schulten, K. (2002). Energetics of glycerol conduction through aquaglyceroporin GlpF. Proc. Natl. Acad. Sci. USA 99, 6731–6736. CrossRef | PubMed

10. Jarzynski, C. (1997). Nonequilibrium equality for free energy differences. Phys. Rev. Lett. 78, 2690–2693. CrossRef | PubMed

11. Brooks, B., and Karplus, M. (1983). Harmonic dynamics of proteins—normal modes and fluctuations in bovine pancreatic trypsin-inhibitor. Proc. Natl. Acad. Sci. USA 80, 6571–6575. CrossRef | PubMed

12. Go, N., Noguti, T., and Nishikawa, T. (1983). Dynamics of a small globular protein in terms of low-frequency vibrational modes. Proc. Natl. Acad. Sci. USA 80, 3696–3700. CrossRef | PubMed

13. Hayward, S. (2001). Normal mode analysis of biological molecules. In Computational Biochemistry and Biophysics. Becker, O.M., MacKerell, A.D., Roux, B., Watanabe, M., eds. (New York: Marcel Dekker). PubMed

14. Levitt, M., Sander, C., and Stern, P.S. (1983). The normal modes of a protein—native bovine pancreatic trypsin inhibitor. Int. J. Quantum Chem. 10, 181–199. PubMed

15. Ma, J.P. (2005). Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure 13, 373–380. Abstract | Full Text | PDF (346 kb) | CrossRef | PubMed

16. Cui, Q., Li, G.H., Ma, J.P., and Karplus, M. (2004). A normal mode analysis of structural plasticity in the biomolecular motor F-1-ATPase. J. Mol. Biol. 340, 345–372. CrossRef | PubMed

17. Li, G.H., and Cui, Q. (2002). A coarse-grained normal mode approach for macromolecules: an efficient implementation and application to Ca2+-ATPase. Biophys. J. 83, 2457–2474. Abstract | Full Text | PDF (1159 kb) | PubMed

18. Krebs, W.G., Alexandrov, V., Wilson, C.A., Echols, N., Yu, H.Y., and Gerstein, M. (2002). Normal-mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic. Proteins 48, 682–695. CrossRef | PubMed

19. Thomas, A., Hinsen, K., Field, M.J., and Perahia, D. (1999). Tertiary and quaternary conformational changes in aspartate transcarbamylase: a normal mode study. Proteins 34, 96–112. CrossRef | PubMed

20. Seno, Y., and Go, N. (1990). Deoxymyoglobin studied by the conformational normal mode analysis. I. Dynamics of globin and the hemoglobin interaction. J. Mol. Biol. 216, 95–109. CrossRef | PubMed

21. Tama, F., Gadea, F.X., Marques, O., and Sanejouand, Y.H. (2000). Building-block approach for determining low-frequency normal modes of macromolecules. Proteins 41, 1–7. CrossRef | PubMed

22. Mouawad, L., and Perahia, D. (1993). Diagonalization in a mixed basis—a method to compute low-frequency normal modes for large macromolecules. Biopolymers 33, 599–611. CrossRef | PubMed

23. Tirion, M.M. (1996). Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys. Rev. Lett. 77, 1905–1908. CrossRef | PubMed

24. Doruker, P., Jernigan, R.L., and Bahar, I. (2002). Dynamics of large proteins through hierarchical levels of coarse-grained structures. J. Comput. Chem. 23, 119–127. CrossRef | PubMed

25. Haliloglu, T., Bahar, I., and Erman, B. (1997). Gaussian dynamics of folded proteins. Phys. Rev. Lett. 79, 3090–3093. CrossRef | PubMed

26. Bahar, I., Atilgan, A.R., and Erman, B. (1997). Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold. Des. 2, 173–181. CrossRef | PubMed

27. Tama, F., and Sanejouand, Y.H. (2001). Conformational change of proteins arising from normal mode calculations. Protein Eng. 14, 1–6. PubMed

28. Atilgan, A.R., Durell, S.R., Jernigan, R.L., Demirel, M.C., Keskin, O., and Bahar, I. (2001). Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. 80, 505–515. Abstract | Full Text | PDF (664 kb) | PubMed

29. Doruker, P., Atilgan, A.R., and Bahar, I. (2000). Dynamics of proteins predicted by molecular dynamics simulations and analytical approaches: application to α-amylase inhibitor. Proteins 40, 512–524. CrossRef | PubMed

30. Delarue, M., and Sanejouand, Y.H. (2002). Simplified normal mode analysis of conformational transitions in DNA-dependent polymerases: the elastic network model. J. Mol. Biol. 320, 1011–1024. CrossRef | PubMed

31. Tama, F., Valle, M., Frank, J., and Brooks, C.L. (2003). Dynamic reorganization of the functionally active ribosome explored by normal mode analysis and cryo-electron microscopy. Proc. Natl. Acad. Sci. USA 100, 9319–9323. CrossRef | PubMed

32. Wang, Y.M., Rader, A.J., Bahar, I., and Jernigan, R.L. (2004). Global ribosome motions revealed with elastic network model. J. Struct. Biol. 147, 302–314. CrossRef | PubMed

33. Cech, T.R. (2002). Ribozymes, the first 20 years. Biochem. Soc. Trans. 30, 1162–1166. CrossRef | PubMed

34. Guerriertakada, C., Gardiner, K., Marsh, T., Pace, N., and Altman, S. (1983). The RNA moiety of ribonuclease-P is the catalytic subunit of the enzyme. Cell 35, 849–857. Abstract | | CrossRef | PubMed

35. Cech, T.R., Zaug, A.J., and Grabowski, P.J. (1981). vitro splicing of the ribosomal-RNA precursor of Tetrahymena—involvement of a guanosine nucleotide in the excision of the intervening sequence. Cell. 27, 487–496. Abstract | | CrossRef | PubMed

36. Winkler, W., Nahvi, A., and Breaker, R.R. (2002). Thiamine derivatives bind messenger RNAs directly to regulate bacterial gene expression. Nature 419, 952–956. CrossRef | PubMed

37. Scott, W.G., Murray, J.B., Arnold, J.R.P., Stoddard, B.L., and Klug, A. (1996). Capturing the structure of a catalytic RNA intermediate: the hammerhead ribozyme. Science 274, 2065–2069. CrossRef | PubMed

38. Serganov, A., Yuan, Y.R., Pikovskaya, O., Polonskaia, A., Malinina, L., Phan, A.T., Hobartner, C., Micura, R., Breaker, R.R., and Patel, D.J. (2004). Structural basis for discriminative regulation of gene expression by adenine- and guanine-sensing mRNAs. Chem. Biol. 11, 1729–1741. Abstract | Full Text | PDF (838 kb) | CrossRef | PubMed

39. Egli, M. (2004). Nucleic acid crystallography: current progress. Curr. Opin. Chem. Biol. 8, 580–591. CrossRef | PubMed

40. Batey, R.T., Gilbert, S.D., and Montange, R.K. (2004). Structure of a natural guanine-responsive riboswitch complexed with the metabolite hypoxanthine. Nature 432, 411–415. CrossRef | PubMed

41. Adams, P.L., Stahley, M.R., Kosek, A.B., Wang, J.M., and Strobel, S.A. (2004). Crystal structure of a self-splicing group I intron with both exons. Nature 430, 45–50. CrossRef | PubMed

42. Ke, A.L., Zhou, K.H., Ding, F., Cate, J.H.D., and Doudna, J.A. (2004). A conformational switch controls hepatitis δ-virus ribozyme catalysis. Nature 429, 201–205. CrossRef | PubMed

43. Krasilnikov, A.S., Yang, X.J., Pan, T., and Mondragon, A. (2003). Crystal structure of the specificity domain of ribonuclease P. Nature 421, 760–764. CrossRef | PubMed

44. Rupert, P.B., Massey, A.P., Sigurdsson, S.T., and Ferre-D’Amare, A.R. (2002). Transition state stabilization by a catalytic RNA. Science 298, 1421–1424. CrossRef | PubMed

45. Zhang, L., and Doudna, J.A. (2002). Structural insights into group II intron catalysis and branch-site selection. Science 295, 2084–2088. CrossRef | PubMed

46. Ando, T., Tanaka, T., and Kikuchi, Y. (2003). Substrate shape specificity of E. coli RNase P ribozyme is dependent on the concentration of magnesium ion. J. Biochem. (Tokyo) 133, 445–451. CrossRef | PubMed

47. Reiter, N.J., Blad, H., Abildgaard, F., and Butcher, S.E. (2004). Dynamics in the U6 RNA intramolecular stem-loop: a base-flipping conformational change. Biochemistry 43, 13739–13747. PubMed

48. Hampel, K.J., and Burke, J.M. (2003). Solvent protection of the hammerhead ribozyme in the ground state: evidence for a cation-assisted conformational change leading to catalysis. Biochemistry 42, 4421–4429. PubMed

49. Shan, S.O., and Herschlag, D. (2002). Dissection of a metal-ion-mediated conformational change in Tetrahymena ribozyme catalysis. RNA 8, 861–872. CrossRef | PubMed

50. Pereira, M.J.B., Harris, D.A., Rueda, D., and Walter, N.G. (2002). Reaction pathway of the trans-acting hepatitis δ-virus ribozyme: a conformational change accompanies catalysis. Biochemistry 41, 730–740. PubMed

51. Peracchi, A., Beigelman, L., Scott, E.C., Uhlenbeck, O.C., and Herschlag, D. (1997). Involvement of a specific metal ion in the transition of the hammerhead ribozyme to its catalytic conformation. J. Biol. Chem. 272, 26822–26826. CrossRef | PubMed

52. Kim, H.D., Nienhaus, G.U., Ha, T., Orr, J.W., Williamson, J.R., and Chu, S. (2002). Mg2+-dependent conformational change of RNA studied by fluorescence correlation and FRET on immobilized single molecules. Proc. Natl. Acad. Sci. USA 99, 4284–4289. CrossRef | PubMed

53. Pinard, R., Hampel, K.J., Heckman, J.E., Lambert, D., Chan, P.A., Major, F., and Burke, J.M. (2001). Functional involvement of G8 in the hairpin ribozyme cleavage mechanism. EMBO J. 20, 6434–6442. CrossRef | PubMed

54. Peracchi, A., Karpeisky, A., Maloney, L., Beigelman, L., and Herschlag, D. (1998). A core-folding model for catalysis by the hammerhead ribozyme accounts for its extraordinary sensitivity to basic mutations. Biochemistry 37, 14765–14775. PubMed

55. Zhuang, X.W., Kim, H., Pereira, M.J.B., Babcock, H.P., Walter, N.G., and Chu, S. (2002). Correlating structural dynamics and function in single ribozyme molecules. Science 296, 1473–1476. CrossRef | PubMed

56. Viduna, D., Hinsen, K., and Kneller, G. (2000). Influence of molecular flexibility on DNA radiosensitivity: a simulation study. Phys. Rev. E 62, 3986–3990. PubMed

57. Zacharias, M. (2000). Comparison of molecular dynamics and harmonic mode calculations on RNA. Biopolymers 54, 547–560. CrossRef | PubMed

58. Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., and Karplus, M. (1983). CHARMM—a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187–217. CrossRef | PubMed