QRNAS: software tool for refinement of nucleic acid structures

Abstract

Background

Computational models of RNA 3D structure often present various inaccuracies caused by simplifications used in structure prediction methods, such as template-based modeling or coarse-grained simulations. To obtain a high-quality model, the preliminary RNA structural model needs to be refined, taking into account atomic interactions. The goal of the refinement is not only to improve the local quality of the model but to bring it globally closer to the true structure.

Results

We present QRNAS, a software tool for fine-grained refinement of nucleic acid structures, which is an extension of the AMBER simulation method with additional restraints. QRNAS is capable of handling RNA, DNA, chimeras, and hybrids thereof, and enables modeling of nucleic acids containing modified residues.

Conclusions

We demonstrate the ability of QRNAS to improve the quality of models generated with different methods. QRNAS was able to improve MolProbity scores of NMR structures, as well as of computational models generated in the course of the RNA-Puzzles experiment. The overall geometry improvement may be associated with increased model accuracy, especially on the level of correctly modeled base-pairs, but the systematic improvement of root mean square deviation to the reference structure should not be expected. The method has been integrated into a computational modeling workflow, enabling improved RNA 3D structure prediction.

Background

Ribonucleic acid (RNA) molecules play pivotal roles in living organisms. RNAs are involved in a variety of biological processes: they transmit genetic information, they sense and communicate responses to cellular signals, and even catalyze chemical reactions [1]. With the very rapid discovery of new classes of RNA molecules, new functions beyond storing genetic information are also being discovered. The functions of RNA molecules and interactions of proteins, RNAs, and their complexes, often depend on their structure, which in turn is encoded in the linear sequence of ribonucleotide residues. Thus, the understanding of the molecular basis of RNA function requires the knowledge of RNA structure.

The experimental determination of RNA 3D structures is expensive and difficult [2, 3]. However, because the ribonucleotide sequence determines RNA structure (in a similar manner as the amino acid sequence determines protein structure), it is theoretically possible to infer RNA structures from sequences. Since the first prediction of tRNA 3D structure in 1969 [4], numerous computational methods have been developed to generate RNA 3D structure from sequence. Currently, the field of research on RNA structure prediction is quite advanced, and the advantages and limitations of different methods are known, in particular from the assessment within the RNA-Puzzles community-wide experiment [5,6,7], which has been inspired by the CASP experiment for protein structure prediction [8].

Because of the very high costs of all-atom simulations, RNA 3D structures are usually not predicted by simulating all the details of the physical process of macromolecular folding, starting from sequence alone. The most successful general strategy for RNA 3D structure prediction that emerged from the RNA-Puzzles experiment involves the following approaches or their combination: 1) identification of pre-existing information in databases of molecular structure, e.g., use of known structures as templates to develop a comparative model for the whole structure or a part of it; 2) running a simulation, often using a coarse-grained strategy, with restraints that represent all available knowledge about the target structure, to generate ensembles of structurally similar conformations with the best possible scores. In this strategy, a typical approach is to derive potentials (scoring functions) based on a statistical analysis of experimentally determined structures. Statistical potentials can be used to replace or supplement the calculation of the physical free energy by evaluating the relative frequencies of features, such as pairwise distances of atoms (bonded and non-bonded) and mutual orientations of chemical groups (e.g., torsion angles). In this methodological framework, the most frequently observed structural features are also the most probable ones.

Simplifications applied in the process of RNA 3D structure prediction come with a cost of the loss of fine structural details. Computational models often present imperfect stereochemistry, unnatural bond lengths or steric conflicts. These deficiencies are clearly visible when using quality assessment tools, such as MolProbity [9, 10]. To obtain a high-quality model, a structure obtained from template-based modeling or from coarse-grained simulations needs to be further refined. However, even models perceived as correct by validation tools can still be far from their native structures. The most challenging task faced by the refinement is not only to improve the visible quality of the model but to bring it closer to the ‘true’ structure (which in case of real predictions is unknown at the time of the modeling). According to RNA-Puzzles, the best models of medium-sized RNA molecules exhibit root mean square deviation (RMSD) of 5–10 Å from the reference structure. It is tempting to ask whether a dedicated software tool could improve these results.

In this article, we present QRNAS, a new software tool for fine-grained refinement of nucleic acid structures, dedicated to improving the quality of models generated by low- to medium-resolution methods commonly used, e.g., for RNA 3D structure modeling. QRNAS is capable of handling RNA, DNA, or chimeras and hybrids thereof, and enables modeling of nucleic acids containing modified residues. We demonstrate the ability of QRNAS to improve the quality of models generated in the course of RNA-Puzzles, often with an improvement in model accuracy as measured against the reference structure. QRNAS is also able to improve MolProbity scores of NMR structures from the Protein Data Bank.

Implementation

Force field

The force field used by QRNAS is a modified version of AMBER [11, 12] adapted to represent 107 modified nucleotides currently known to be present in RNA [13]. Currently, 130 residues are parametrized, including four canonical ribonucleotides (A, G, C, U) and deoxyribonucleotides (dA, dC, dG, dT) as well as naturally occurring modifications thereof (e.g., m7G, m1A, dU, wybutosine, queuosine, etc.). The key novel feature of QRNAS is an extension of the AMBER force field with energy terms that allow for modeling of restrained structures and enforce backbone regularization. Imposition of secondary structure is also possible thanks to interaction types that go beyond the original AMBER force field, namely explicit hydrogen bonds and enforcement of base pair co-planarity. These two interaction types are often poorly modeled in structures generated by computational structure prediction methods, and in our experience, their enforcement is a critical element of high-resolution refinement. Application of custom distance restraints required the introduction of pairwise harmonic interactions. Regularization of backbone torsions was realized by the introduction of a knowledge-based energy term. All these add-ons carry a certain degree of arbitrariness, and for this reason, we made them optional. In particular, our program falls back to plain AMBER [13] when all four additional terms are disabled. Similarly, electrostatics and van der Waals interactions can be disabled by the user (e.g., to speed up the calculation). With electrostatics enabled, the user can choose between a generalized Born solvent and a vacuum environment. In either case, the system is assumed to be non-periodic.

The new energy terms associated with hydrogen bonds, base pairs, backbone irregularities, and custom restraints are given, respectively, by Eqs. (1)–(4) (see below).

Explicit hydrogen bonds

Although hydrogen bonds in AMBER are currently handled by means of electrostatic and van der Waals interactions, we decided to reintroduce an additional explicit description. Our goal was to gain finer control over the strength of this interaction. This was prompted in part by our observation, e.g., in the context of the RNA-Puzzles experiment, that in computational models of RNA structure obtained by low- to medium-resolution computational methods, interactions based on hydrogen bonding are often poorly modeled [5,6,7]. Computationally modeled structures often present an “almost correct” orientation of hydrogen bond donors and acceptors, which nonetheless deviates from the values typically observed in high-resolution structures. In these computational models, a relatively small adjustment of geometry often leads not only to an interaction that can be detected as a “proper” hydrogen bond by software for structure analysis but to an improved overall orientation of base moieties involved in pairing via these hydrogen bonds. Thus, with high force constant, explicit hydrogen bonds can be used as restraints when imposing secondary structure on the modeled nucleic acid molecule. Another benefit of enforcing strong hydrogen bonds in the structure optimization procedure is that geometrically correct contacts are preserved throughout the computational simulation once they are formed.

According to Lu et al., the statistical analysis of hydrogen bonds obtained from simulations shows that the strengths of hydrogen bonds in liquid water conform to a Gaussian distribution [14]. Therefore, the energy term associated with a hydrogen bond (EH-bond) was chosen to be Gaussian in its length with an exponential dependence on the cosine of its angle:

$$ E_{H\text{-}bond} = k_1 \exp\left(-r_{ij}^2/d\right) \exp\left(\cos\left(\theta_{ijk}-\theta_0\right)\right) $$
(1)

where k1 denotes the force constant, rij is the hydrogen bond length between donor hydrogen i and acceptor j, and θijk is the donor-hydrogen-acceptor bond angle. The parameters k1, d, and θ0 were iteratively tuned to reproduce experimental hydrogen bond lengths. The multiplier was arbitrarily set at a value of −1 kcal/mol, which proved to provide good persistence of contacts in the course of energy minimization.
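As a minimal numerical sketch of Eq. (1), the function below evaluates the explicit hydrogen bond term for a given bond length and angle. Only the multiplier of −1 kcal/mol is taken from the text; the default values of `d` and `theta0` are placeholders chosen for illustration (they are not the tuned QRNAS parameters), and the function name is hypothetical.

```python
import math

def hbond_energy(r, theta, k1=-1.0, d=4.0, theta0=math.pi):
    """Eq. (1): k1 * exp(-r^2 / d) * exp(cos(theta - theta0)).

    r      -- hydrogen-acceptor distance (angstrom)
    theta  -- donor-hydrogen-acceptor angle (radians)
    k1     -- multiplier in kcal/mol (-1, as stated in the text)
    d      -- Gaussian width in angstrom^2 (ASSUMED placeholder value)
    theta0 -- reference angle in radians (ASSUMED: linear bond, pi)
    """
    return k1 * math.exp(-r * r / d) * math.exp(math.cos(theta - theta0))

# Same bond length, two different angles
linear = hbond_energy(1.9, math.pi)            # angle equal to theta0
bent = hbond_energy(1.9, math.radians(120.0))  # 60 degrees away from theta0
```

With a negative multiplier, the term is most favorable when the angle equals `theta0`, so `linear` is lower in energy than `bent`; raising the magnitude of `k1` turns the term into a stronger restraint.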

Base pair co-planarity

Models of RNA structure obtained by computational methods (in particular by coarse-grained methods and in the process of comparative modeling) often present various deviations of base-pair geometry. In particular, canonical Watson-Crick base pairs often deviate from co-planarity. Therefore, QRNAS was equipped with an optional feature that performs the idealization of base pair planarity. When enabled, Watson-Crick base pairs are not only restrained by explicit hydrogen bonds but also additionally flattened. The flattening is implemented by application of force to the atoms of each base according to Eq. (2):

$$ E_{BP} = k_2 \sum_{i \in base} r_{i0}^2 $$
(2)

where k2 denotes the force constant and ri0 is the distance from the i-th atom of the base to the plane that best matches the base pair. The plane is least-squares fitted to the atoms of both bases. The magnitude of the force acting on each atom is proportional to its distance from the plane of the base, while the direction of the force is perpendicular to this plane. Base pair restraints are introduced only at startup. For two Watson-Crick bases to be considered as a pair, the energy resulting from term (2) must be below −2 kcal/mol. A user can also override this behavior by providing the secondary structure in Vienna format (for a single chain) or as a list of contacts (in the general case). In such a case, automatic detection of base pairs is disabled.
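The flattening term of Eq. (2) can be illustrated with a short, dependency-free sketch. It fits a least-squares plane to a set of atom coordinates, sums the squared out-of-plane distances, and returns forces perpendicular to the plane. The function names, the value of `k2`, and the use of power iteration to find the plane normal are assumptions made for this example; the forces also treat the fitted plane as fixed, which is a simplification and not necessarily how QRNAS differentiates the term.

```python
import math

def fit_plane(points, iterations=200):
    """Least-squares plane through 3D points; returns (centroid, unit normal)."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    # 3x3 scatter matrix of the centred coordinates
    cov = [[sum((p[a] - c[a]) * (p[b] - c[b]) for p in points)
            for b in range(3)] for a in range(3)]
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    # The normal is the eigenvector of cov with the smallest eigenvalue,
    # i.e. the dominant eigenvector of m = trace * I - cov.
    m = [[(trace if a == b else 0.0) - cov[a][b] for b in range(3)]
         for a in range(3)]
    best, best_q = None, -1.0
    # Try three starting vectors so at least one overlaps with the normal
    for start in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
        v = start
        for _ in range(iterations):
            w = [sum(m[a][b] * v[b] for b in range(3)) for a in range(3)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: largest value identifies the dominant eigenvector
        q = sum(v[a] * m[a][b] * v[b] for a in range(3) for b in range(3))
        if q > best_q:
            best, best_q = v, q
    return c, best

def coplanarity_energy(points, k2=1.0):
    """Eq. (2): k2 * sum of squared distances to the best-fit plane."""
    centroid, normal = fit_plane(points)
    energy, forces = 0.0, []
    for p in points:
        dist = sum((p[a] - centroid[a]) * normal[a] for a in range(3))
        energy += k2 * dist * dist
        # Force is perpendicular to the plane and proportional to the distance
        forces.append([-2.0 * k2 * dist * normal[a] for a in range(3)])
    return energy, forces

flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
buckled = [(0.0, 0.0, 0.1), (1.0, 0.0, -0.1), (0.0, 1.0, -0.1), (1.0, 1.0, 0.1)]
e_flat, _ = coplanarity_energy(flat)
e_buckled, f_buckled = coplanarity_energy(buckled)
```

In this toy input, the perfectly planar set carries no penalty, while each atom of the buckled set is pushed back toward the fitted plane.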

Backbone regularization

The feature of backbone regularization is intended to correct outlying conformers reported by MolProbity. Upon energy minimization, it drags the backbone atoms of each residue to a known conformation, stored in an internal database. The database of preferred conformations was populated with data from all crystal structures of RNA stored in Protein Data Bank (PDB) [15] with a resolution below 1.4 Å as of June 2013. QRNAS identifies a local backbone conformation in a fragment stored in the database that is closest to the one in the input model according to a minimal Root Mean Square Deviation (RMSD) value. The forces acting on atoms are harmonic, as given by Eq. (3).

$$ E_{regul} = k_3 \sum_{i \in backbone} \left(\vec{r}_i - \vec{b}_i\right)^2 $$
(3)

The parameter k3 denotes the force constant; bi is the position of the i-th backbone atom in a reference backbone. Coordinates bi are transformed by translations and rotations to minimize the RMSD between the optimized backbone and the reference one. A similar library-based approach has been used in the RNAfitme web server for remodeling of nucleic acid residue conformations of RNA structures [16].

Notably, the original force field parameters were subject to minor tuning to generate structures with better MolProbity scores. We changed the rest values of the OP1-P-OP2 and N9-C1’-O4’ angles to 119.62° and 109.00°, respectively, thereby allowing for the elimination of most ‘bad angles’ reported by MolProbity.

Custom restraints

Distance restraints are implemented as simple harmonic forces, as given by Eq. (4).

$$ E_{spring} = k_4 \left(\vec{r}_i - \vec{c}_i\right)^2 $$
(4)

where k4 denotes the force constant, which can be set by the user. The spring forces can be used as positional or distance restraints, since their anchor points ci can be either atoms or arbitrary points in space.
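A minimal sketch of the harmonic term of Eq. (4) and its analytic gradient follows. The function name and the numerical values are illustrative assumptions; the anchor `c` may be a fixed point in space (positional restraint) or the current position of another atom (distance-type restraint).

```python
def spring_energy_and_force(r, c, k4):
    """Eq. (4): E = k4 * |r - c|^2, with force F = -dE/dr = -2 * k4 * (r - c)."""
    diff = [ri - ci for ri, ci in zip(r, c)]
    energy = k4 * sum(x * x for x in diff)
    force = [-2.0 * k4 * x for x in diff]
    return energy, force

# Atom displaced by 2 angstrom along x from its anchor, k4 = 0.5 (arbitrary)
energy, force = spring_energy_and_force((3.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.5)
```

The force always points from the atom toward the anchor and vanishes when the two coincide, which is what makes the term usable as a restraint during minimization.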

Minimization

After setting up the model, QRNAS starts to minimize the energy of the system. All force field terms in our model are analytically differentiable, enabling us to use minimization schemes with explicit gradient information. We implemented two algorithms: steepest descent with golden section search and Polak-Ribiere conjugate gradients [17].
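To make the first of the two schemes concrete, the sketch below implements steepest descent with a golden-section line search on a toy two-variable quadratic. It is a generic textbook version, not the QRNAS code: the function names, the step bracket `max_step`, the tolerances, and the test function are all assumptions, and the Polak-Ribiere variant is not shown.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618

def golden_section(f, a, b, tol=1e-8):
    """Locate the minimum of a unimodal function f on the interval [a, b]."""
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def steepest_descent(energy, gradient, x, steps=200, max_step=1.0, gtol=1e-10):
    """Follow the negative gradient, choosing each step length by line search."""
    for _ in range(steps):
        g = gradient(x)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:
            break
        def along_line(t, x=x, g=g):
            return energy([xi - t * gi for xi, gi in zip(x, g)])
        t = golden_section(along_line, 0.0, max_step)
        x = [xi - t * gi for xi, gi in zip(x, g)]
    return x

# Toy energy surface with its minimum at (1, -2)
def toy_energy(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def toy_gradient(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

minimum = steepest_descent(toy_energy, toy_gradient, [0.0, 0.0])
```

Each golden-section iteration reuses one of the two interior energy evaluations, so the line search costs a single new energy evaluation per iteration.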

Performance optimization

Calculation of electrostatics was parallelized for machines with symmetric multiprocessing (SMP) capability, i.e., multicore workstations. Parallelism was achieved by processing of the ‘electrostatic interaction matrix’ in blocks that share no common atoms. Consequently, the proposed algorithm is nearly lock-free and has much-improved cache hit rate compared to a version which processes pairwise interactions in a random order. We tuned the parameters of the algorithm (block size and pointer hashing function) to achieve good performance on workstations with up to 8 cores. As a proof of concept, we successfully conducted minimization of ribosomal RNA taken from the 60S subunit of the eukaryotic ribosome (PDB code: 4A18) achieving the performance of 0.2 golden-section search steps per hour.
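The idea of processing blocks that share no common atoms can be sketched with a round-robin schedule. The text does not describe how QRNAS orders its blocks, so the following is only one well-known way to obtain such a schedule (the circle method), with a hypothetical function name: atoms are assumed to be partitioned into `n_blocks` groups, and each round lists group pairs in which no group appears twice, so the pairs of a round can be processed concurrently without locking.

```python
def disjoint_block_rounds(n_blocks):
    """Circle-method schedule of off-diagonal blocks (group pairs).

    Returns a list of rounds; within a round no group index is repeated,
    so threads working on different pairs never touch the same atoms.
    """
    ids = list(range(n_blocks))
    if n_blocks % 2:
        ids.append(None)  # dummy entry: its partner is idle in that round
    m = len(ids)
    rounds = []
    for _ in range(m - 1):
        pairs = []
        for i in range(m // 2):
            a, b = ids[i], ids[m - 1 - i]
            if a is not None and b is not None:
                pairs.append((min(a, b), max(a, b)))
        rounds.append(pairs)
        # Keep the first entry fixed and rotate the rest by one position
        ids = [ids[0]] + [ids[-1]] + ids[1:-1]
    return rounds

schedule = disjoint_block_rounds(8)
```

Interactions within a single group (the diagonal blocks) are independent of each other by construction and can be handled in one additional round.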

Example run-times for representative models of RNA structure analyzed in this paper, minimized for 1000 steps on a single core of a 2.40 GHz Intel® Xeon-E5620 CPU (Linux 4.15.0-45-generic-x86_64/Ubuntu 18.04.1 with the g++/gcc 7.3.0 compiler), with/without the new options (explicit hydrogen bonds, base pair co-planarity, and backbone regularization): 1byx (16 residues): 39.48 s/39.12 s; 2lu0 (49 residues): 254.00 s/250.19 s; 2jyf (86 residues): 689.26 s/685.86 s.

Results

Regularization of NMR structures

First, we tested QRNAS on a set of twelve nucleic acid 3D structures determined by solution NMR (1A60 [18], 1B36 [19], 2L7D [20], 1P5M [21], 1YG3 [22], 2JYF, 2LC8 [23], 2LU0 [24], 2M4Q [25], 2M58 [26], 1BYX [27], 1DXN [28] in the Protein Data Bank). The common feature of the targets chosen for this analysis was suboptimal scores reported by MolProbity [9]. The test set included mostly RNA structures, except for three chimeric and hybrid (RNA/DNA) structures (2L7D, 1BYX, 1DXN). Whenever an ensemble of models was present, we used the first model. All models except two (2LC8, 1BYX) suffered from high clash scores. All models except two (2L7D, 1DXN) were reported as having bad backbone conformations. Some bad bonds were detected in 1A60 and 1YG3, and bad angles were found in 1A60, 1YG3, 2LC8, 2M58, 1BYX, and 1DXN.

We used QRNAS with restraints on explicit hydrogen bonds, restraints on base pair co-planarity, and backbone regularization. No custom restraints were used at this stage. QRNAS was able to resolve all clashes in the studied set, outperforming both the RNAfitme web server (which uses NAMD with the CHARMM force field for optimizing RNA structures) and sander from the AMBER package (Table 1). The mean fraction of bad angles was reduced from 3.46 to 1.31%. The average fraction of wrong backbone conformations was reduced from 27.43 to 14.83%. In contrast, RNAfitme and sander increased the percentages of bad angles and wrong backbone conformations upon refinement. None of the methods showed a consistent improvement in the fraction of bad bonds. This analysis demonstrates the ability of QRNAS to regularize structures and improve their MolProbity scores, and also shows the limitations of current methods. For practical application of QRNAS to optimize NMR-derived RNA models, it will be worthwhile to use NMR-derived data as additional custom restraints in the optimization process and to validate the optimized structures against the NMR data that were not used in the optimization.

Table 1 Performance of QRNAS on a selection of NMR structures in terms of optimization of MolProbity scores. QRNAS resolved nearly all steric clashes. It also improved backbone conformations and bond lengths in all studied cases at the price of small perturbations in the angle space. Quality scores of models optimized with RNAfitme and sander from the AMBER package are shown for comparison. In three cases, RNAfitme was unable to process the input file

Assessment of model accuracy

In molecular modeling, one of the essential steps is the selection of the potentially best models. Once the different conformations are generated, a scoring function can be applied to assess the global and local features of the model, aiming at discriminating models that are closer to the ‘true’ structure (usually represented as a model obtained in the course of X-ray crystallography or NMR experiments and used as a reference) from those that are less accurate. While the selection of models was not the primary goal of QRNAS, we tested its ability to score models. In general, in our various analyses, we did not observe a correlation of QRNAS single point energy values (combined with additional scoring from our custom terms) with model quality (data not shown) [6, 7, 29,30,31]. We suspected that this might be caused by the fine-grained character of the scoring function and its extreme sensitivity to the ruggedness of the RNA energy landscape. In other words, we expected that QRNAS might be able to discriminate ‘good’ and ‘bad’ models only very close to the global energy minimum corresponding to the reference structure. On the other hand, in typical modeling exercises, models generated computationally are relatively far from the reference structure, and their RMSD values rarely fall below 5 Å.

Instead of looking at models generated by folding simulation, we started from six experimentally determined structures which include P4-P6 ribozyme domain of group I intron (PDB code: 1GID [32]), GBS/omegaG group-I intron (PDB code: 1K2G [33]), ai5-gamma group II self-splicing intron (PDB code: 1KXK [34]), viral RNA pseudoknot (PDB code: 1L2X [35]), G-riboswitch aptamer (PDB code: 1Y27 [36]), and fluoride riboswitch (PDB code: 4ENC [37]); and we generated models by introducing minor random perturbations to positions of all atoms. From the pool of generated models, we selected 1000 structures with RMSD to the starting/reference structure ranging from near 0.00 to 5.00 Å. Scoring these models with QRNAS revealed a funnel-like shape, indicative of an energy/score minimum near the native structure (Fig. 1). Alas, the funnel was very narrow, less than 2 Å, which indicated that QRNAS could discriminate only between models that were extremely close to the reference and all the others, but it was incapable of discriminating between models that are very good (RMSD, e.g., around 2 Å) and those that are much worse. This also suggested that the optimization of QRNAS score (e.g., in the course of model refinement) is unlikely to improve the global accuracy of models unless the starting models are already extremely close to the ‘true’ structure. For models of lower accuracy, statistical potentials can be used, such as RASP [38] or the energy functions used in 3D structure prediction methods such as SimRNA [31, 39] or ROSETTA/FARNA/FARFAR [40, 41]. It is worth emphasizing that computational improvement of model accuracy remains a difficult problem, for which no perfect solution exists. QRNAS addresses one of the aspects of this problem, at the level of local geometry.
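The decoy-generation step described above can be sketched as follows. The text does not specify the noise model or whether decoys were superimposed on the reference before computing RMSD, so this example assumes independent Gaussian displacements of every coordinate and computes RMSD without superposition; the function names, the toy coordinates, and the noise levels are hypothetical.

```python
import math
import random

def rmsd(a, b):
    """Root mean square deviation between two equally ordered coordinate lists
    (ASSUMPTION: no superposition is applied before the comparison)."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

def perturb(coords, sigma, rng):
    """Displace every coordinate by Gaussian noise with standard deviation sigma."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in atom) for atom in coords]

rng = random.Random(0)  # fixed seed so the example is reproducible
# Toy 'reference structure': 1000 atoms on a line (placeholder coordinates)
reference = [(float(i), 0.0, 0.0) for i in range(1000)]

# Build a pool of decoys at several noise levels, keep those within 5 angstrom
pool = [perturb(reference, sigma, rng) for sigma in (0.1, 0.5, 1.0, 2.0, 4.0)]
decoys = [d for d in pool if rmsd(reference, d) <= 5.0]
```

Under these assumptions the expected RMSD of a decoy is roughly sigma times the square root of three, so the noise level directly controls where a decoy falls on the RMSD axis of an energy-versus-RMSD plot.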

Fig. 1

QRNAS single point energy vs. RMSD on sets of decoys derived from the six different experimentally determined structures (1GID, 1KXK, 1L2X, 1Y27, and 4ENC solved by X-ray crystallography and 1K2G by NMR). No correlation between the QRNAS score and model quality is observed, except for the immediate vicinity of the reference structures (RMSD 0–2 Å). 3D models of the native structures are displayed as an inset in the respective plots

Refinement of models in RNA-puzzles experiment

We analyzed the performance of QRNAS on models for two targets of the RNA-Puzzles experiment (Puzzle #1, relatively easy [5]; Puzzle #6, very difficult [6]), which together cover a broad range of model accuracy. We analyzed up to five top-ranked structures submitted by various participants, generated with different modeling methods, and hence presenting different types of errors and inaccuracies. The modeling methods used by different groups for Puzzles #1 and #6 include ModeRNA [42] and SimRNA [31, 39] (Bujnicki group), Vfold [43] (Chen group), FARNA/FARFAR [40, 41] (Das group), iFoldRNA [44] (Dokholyan group), MC-Fold|MC-Sym [45] (Major group), and RNA123 software suite [46] (SantaLucia group). The models were obtained from the RNA-Puzzles experiment website (currently: http://rnapuzzles.org/). In Puzzle #1, the average RMSD of models was 4.93 Å (the best model exhibited 3.42 Å), while in Puzzle #6 the models deviated from the reference structure by 23.05 Å on average (the best model exhibited 11.29 Å).

To assess the capabilities of QRNAS, we conducted a full refinement with default parameters for 10,000 steps. For comparison, we performed refinement with RNAfitme and minimization with sander from the Amber 14 package [47]. RNAfitme was run with the default settings on the web server. Minimization with sander was performed in a truncated octahedral box of 10 Å with TIP3P water model [48] and leaprc.ff14SB variant of the forcefield [49, 50]. The following parameters were used while running sander: imin 1, maxcyc 10,000, cut 300, igb 2, saltcon 0.2, gbsa 1, ntpr 10, ntx 1, ntb 0. For the resulting models, we calculated the value of global RMSD to assess the overall accuracy, and the Interaction Network Fidelity (INF) to compare the accuracy of residue-residue contacts identified in the original and optimized structures [51]. INF values are calculated for all types of contacts including canonical and non-canonical base-pairs and stacking. For the detection of base pairs, we have used our in-house method ClaRNA [52].
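For readers unfamiliar with INF, the sketch below computes it as the geometric mean of the positive predictive value and the sensitivity of predicted contacts, which is the standard definition of the measure [51]. Representing a contact as a `(residue_i, residue_j, type)` tuple, the function name, and the toy contact lists are assumptions made for this example; in the analysis described above, the contacts themselves are detected with ClaRNA.

```python
import math

def inf_score(reference, model):
    """Interaction Network Fidelity: sqrt(PPV * sensitivity) over contact sets."""
    tp = len(reference & model)   # contacts present in both
    fp = len(model - reference)   # contacts only in the model
    fn = len(reference - model)   # reference contacts missing from the model
    if tp == 0:
        return 0.0
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return math.sqrt(ppv * sensitivity)

# Toy contact sets (hypothetical data)
reference_contacts = {(1, 10, "WC"), (2, 9, "WC"), (3, 8, "WC"), (4, 5, "stack")}
model_contacts = {(1, 10, "WC"), (2, 9, "WC"), (3, 7, "WC"), (4, 5, "stack")}
score = inf_score(reference_contacts, model_contacts)
```

Restricting both sets to one contact type before the call gives type-specific variants of the measure, for example one limited to non-Watson-Crick contacts.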

In all cases, QRNAS improved MolProbity scores; in particular, it resolved nearly all steric clashes (Tables 2 and 3). For Puzzle #1 (Table 2), the average change of RMSD was −0.01 for QRNAS vs. 0.26 for sander (i.e., essentially no change vs. minimal deterioration). However, the average INF value decreased from 0.802 to 0.768, 0.759, and 0.482 for the models optimized using QRNAS, sander, and the RNAfitme web server, respectively. For Puzzle #6 (Table 3), the average change of RMSD was 0.53 for QRNAS vs. 0.51 for sander and 0.52 for RNAfitme (negligible deterioration), and the average improvement of INF was 0.001 (for QRNAS) compared to 0.00 (for sander) and −0.04 (for RNAfitme) with respect to the starting models. To evaluate how well QRNAS optimizes non-canonical contacts, we calculated INF considering only the non-Watson-Crick contacts (INF_nWC) for the models of RNA-Puzzles #1 and #6. In both cases, QRNAS improved the INF_nWC values with respect to the starting models. Although QRNAS and RNAfitme yield a comparable (very minor) improvement of non-canonical contacts, sander does not improve such contacts. In summary, in terms of RMSD, the structures changed very little; sometimes the models improved slightly, sometimes they deteriorated slightly. This was expected because in all cases the models were so far from the reference structure that the local refinement was not expected to drive them towards the global energy minimum, but rather towards a local minimum, which could be further away from the reference structure. On the other hand, we could observe a small increase in the INF values, indicating a small improvement of predicted contacts. We attribute this small change to the ability of QRNAS to improve the local geometry, in particular in the case of base pairs. In models that are reasonably close to the ‘true’ structure and exhibit residues that are ‘almost’ in proper contact with each other (as in many models for Puzzle #1), the optimization by QRNAS can refine these contacts and enable the formation of proper base pairs. The smaller improvement of contacts in models of Puzzle #6 can be explained by the low quality of the starting structures and the lower fraction of ‘nearly correct’ contacts that could be optimized.

Table 2 Performance of QRNAS on RNA Puzzle #1 models in terms of model accuracy, as compared to RNAfitme and sander from the AMBER package
Table 3 Performance of QRNAS on RNA Puzzle #6 models in terms of model accuracy, as compared to RNAfitme and sander from the AMBER package

Previously published examples of QRNAS application

Following the development and initial tests of QRNAS, we applied it in various modeling studies. In the course of collaborative work on models generated by all groups for Puzzles #5, #6, and #10, we found that models submitted by the Das group had poor clash scores, despite their overall relative accuracy, as measured in terms of RMSD to the reference structure. We therefore ran QRNAS on all Das models submitted for Puzzles #5, #6, and #10 (17 models in total). In all cases, a dramatic reduction of clash scores was obtained; in 10 models, the score dropped to zero. In only three cases did the clash scores remain larger than 4; however, these models had initial clash scores of nearly 30. Details of this analysis were reported in an article describing RNA-Puzzles Round II [6].

In order to evaluate the performance of QRNAS for blind predictions (at the time when the experimentally determined structure was not available), we calculated the MolProbity scores of RNA-Puzzles #6 models generated in our group before the refinement. The MolProbity scores show an improvement in the quality of the models, as the average clash score decreased from 8.99 to 1.99 (Table 4). The current version of QRNAS has also reduced the bad conformations, bad angles, and bad bonds in the models submitted for RNA-Puzzles #6 (Table 3).

Table 4 Performance of QRNAS for RNAs with unknown reference structures. MolProbity scores “before” and “after” QRNAS optimization of the models generated in the Bujnicki group for RNA-Puzzles #6

In the case of a group I intron modeling study [29], QRNAS was used as the final step of a workflow to improve a model generated with ModeRNA [42] and SimRNA [31]. It reduced the clash score from 184.69 to 0.37, bad bonds from 4.12 to 0.00%, and bad angles from 6.53 to 0.88%, without major changes in the deviation from the reference structure (10.9 Å to 11.0 Å).

Conclusions

QRNAS is a software tool for fine-grained refinement of nucleic acid structures, based on the AMBER force field with additional restraints. QRNAS is capable of handling RNA, DNA, chimeras, and hybrids thereof, and enables modeling of nucleic acids containing modified residues. We demonstrate the ability of QRNAS to improve the quality of RNA 3D structure models generated with different methods. QRNAS was able to improve MolProbity scores of NMR structures, as well as of computational models generated in the course of the RNA-Puzzles experiment. The overall geometry improvement may be associated with the improvement of local contacts, but the systematic improvement of root mean square deviation to the reference structure should not be expected. QRNAS can be integrated into a computational modeling workflow with other tools, enabling improved RNA 3D structure prediction. Our group systematically uses QRNAS at the final stage of model refinement in the context of the RNA-Puzzles experiment.

Availability and requirements

Project name: QRNAS

Project home page: http://genesilico.pl/software/stand-alone/qrnas

GitHub page (Mirror): https://github.com/sunandanmukherjee/QRNAS.git

Operating systems: GNU/Linux, macOS, and WSL on Windows 10.

Programming language: C++

License: GNU GPLv3+

Any restrictions to use by non-academics: None

To compile QRNAS, a C++ compiler such as GNU g++ is required. A Makefile is provided for the compilation of the package. Download the software from http://genesilico.pl/software/stand-alone/qrnas or clone it from https://github.com/sunandanmukherjee/QRNAS.git. Unzip the archive, and compile it with the command make to create an executable version of QRNAS. To execute the program, use the command …/path/to/QRNAS/QRNA -i input.pdb -o output.pdb, where input.pdb is the file to be optimized and output.pdb is the optimized structure. For more advanced usage of QRNAS, users should consult the user manual and the README.txt file in the QRNAS package.

Abbreviations

INF: Interaction Network Fidelity

PDB: Protein Data Bank

RMSD: Root mean square deviation

Acknowledgments

We thank members of the Bujnicki lab, in particular Grzegorz Łach, Grzegorz Chojnowski, Dorota Niedziałek, Adriana Żyła, Filip Stefaniak, Pritha Ghosh, and Almudena Ponce Salvatierra for discussions and intensive testing of QRNAS. The computational resources from the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw (grant: G73-4) and Poznań Supercomputing and Networking Center at the Institute of Bioorganic Chemistry, Polish Academy of Sciences (grant: 312) were used for calculations and benchmarking.

Funding

J.S. was supported by the Polish National Science Center (NCN, 2012/05/N/NZ1/02970 to J.S.). S.M. was supported by IIMCB statutory funds. C.N. was supported by the NCN (MAESTRO 2017/26/A/NZ1/01083 to J.M.B.) and by the IIMCB statutory funds. J.M.B. was supported by the Foundation of Polish Science (FNP, TEAM/2009–4/2 and TEAM/2016–3/18). Funding for open access charge: statutory funds of the International Institute of Molecular and Cell Biology in Poland.

Availability of data and materials

The data sets supporting the conclusions of this article can be downloaded from http://genesilico.pl/QRNAS/QRNAS_data.tar.gz.

Author information

JS contributed to the conception of the software, designed, developed, and packaged the software, and drafted the manuscript. JS, SM and CN ran tests, analyzed the data, and benchmarked the software. SM and CN edited the manuscript in response to comments from the referees. CN made the software compatible with various operating systems and wrote the compilation script. JMB conceived of and supervised the project, analyzed the data, and edited the manuscript. All authors read and approved the manuscript.

Correspondence to Janusz M. Bujnicki.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Stasiewicz, J., Mukherjee, S., Nithin, C. et al. QRNAS: software tool for refinement of nucleic acid structures. BMC Struct Biol 19, 5 (2019). doi:10.1186/s12900-019-0103-1

Keywords

  • RNA
  • DNA
  • 3D structure
  • Molecular modeling
  • Structure refinement
  • AMBER force field
  • Software