Skip to main content

Estimating loop length from CryoEM images at medium resolutions



De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. The skeleton connecting two secondary structures such as α-helices represent the loop in the 3D image. The accuracy of the skeleton and of the detected secondary structures are critical in De novo modeling. It is important to measure the length along the skeleton accurately since the length can be used as a constraint in modeling the protein.


We have developed a novel computational geometric approach to derive a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density data from Electron Microscopy Data Bank (EMDB). The test using simulated density maps shows that it is possible to estimate within 0.5Å of the expected length for 48 of the 50 cases. The experiments, involving eighteen experimentally derived CryoEM images, show that twelve cases have error within 2Å.


The tests using both simulated and experimentally derived images show that it is possible for our proposed method to estimate the loop length along the skeleton if the secondary structure elements, such as α-helices, can be detected accurately, and there is a continuous skeleton linking the α-helices.


Over the last ten years, electron cryomicroscopy (CryoEM) experiments yielded increasing numbers of 3D electron density images of protein molecules. The Electron Microscopy Data Bank (EMDB) currently archives the 3D images, referred to as density maps in this paper, with a wide range of resolutions from 3Å to over 80Å [1]. When the density map is resolved to high resolution (3-5Å) [2, 3], it is possible to derive the near atomic structure from the density map. However, when the density map is not resolved to the high resolution range, it is still challenging to derive the structure of the imaged molecule [46]. Fitting and comparative modeling approaches have been developed to utilize the existing atomic structures in the Protein Data Bank (PDB) [6, 7]. These approaches apply when a component of the target density map has been resolved to near atomic resolution structure or when the target protein shares significant homology with existing atomic structures.

Modeling protein molecules using de novo methods is a general approach to derive the atomic structure from medium resolution (5-10Å) electron density 3D images [6, 810]. Only the 3D image (top left in Figure 1) and amino acid sequence (top right of Figure 1) are used in de novo processes. It does not need an atomic template protein structure from PDB as required for fitting and comparative modeling methods. First, the secondary structure elements (SSEs) such as α-helices (red sticks in Figure 1) and β-sheets are often identified using pattern recognition methods [1116]. Skeletonization methods detect the medial axis (green, left in Figure 1) of a 3D image's iso-surface [10, 17]. Next, the amino acid sequence segments (red cylinders, right of Figure 1) of the SSEs can be predicted using existing prediction tools [1821]. Various approaches have been developed to combine the secondary structure information from the 3D image and 1D sequence in order to derive the topology. The atomic structures can be built once the possible topologies are predicted [68].

Figure 1
figure 1

Deriving the topology of the secondary structure elements from CryoEM images. The skeleton (green) and detected helices (red) derived from the density map (gray) are combined with the predicted sequence segments of the helices to form a topology graph [8, 9, 23].

An amino acid sequence has a direction, starting with a nitrogen atom (N-terminal) and ending with the a carbon atom (C-terminal). The SSE topology is the order in which this sequence traverses the protein's helices and sheets, including the direction of entry into and exit from the secondary structure. The native topology of a protein's SSEs is likely to produce the lowest energy state compared to incorrect topologies [22]. Determining the correct topology is a crucial step in de novo modeling. We have formulated the SSE topology problem into a constrained graph matching problem and provided a dynamic programming algorithm [9]. We later used a dynamic graph approach to handle errors in the data [23].

The distance between two SSEs is an important constraint in graph matching. As an example, two helices closely located in a 3D image should be matched to two helices with similar distance estimated from the 1D sequence. The distance between two ends of two helices (one on each) can be simply estimated as the Euclidean distance [9], or can be measured more accurately along the skeleton [8, 23, 24]. From the amino acid sequence input, the distance between SSEs can be estimated assuming a 3.8Å distance between adjacent amino acids in the sequence. A scoring function can be developed to represent the overall matching between two sets of SSEs, one from the 3D image and the other from the 1D sequence. The correct topology is assumed to be the one with the best match score.

Despite the relative accuracy of skeletonization algorithms, overestimation may occur if length is measured directly along their piecewise linear curves, which contain many right angles and some error from the thinning process and the 3D image itself.

Here, we extend our previous work in [25], in which we obtained preliminary results testing a computational-geometric method to measure the length of a simplified skeleton. In addition to expanding our test set to include synthetically generated density maps and additional experimentally derived data, we used the directed Hausdorff distance to handle segmentation issues. The measured length appears to agree with the expected length when the SSEs are detected fairly accurately.

Results and discussion

Test data and overall process

Two data sets were used in testing performance. The simulated data set consists of fifty randomly selected helix-loop-helix (HLH) motifs from atomic structures in PDB. The proteins extracted exhibit less than 10% sequence identity. Each extracted HLH of the protein structure was used to generate a 3D density map using EMAN1.9 pdb2mrc [26]. The density maps were simulated to 8Å resolution.

The real data set consists of 18 cases whose density maps were downloaded from EMDB with resolution from 4.2Å to 6.8Å. Their EMDB entries are 5030 (6.4Å), 1733 (6.8Å), 5001 (4.2Å), 1740 (6.8Å) and 5168 (6.6Å). Each of these density maps is aligned with their PDB structures at download and provided multiple helix-loop-helix motif samples for the experiment.

The length of a loop was measured along the skeleton voxel points between (and including) the end points of the two surrounding helices. An endpoint of a helix represents an end of the central axis of the helix [11, 12]. The helices were detected using SSETracer, a simplified version of SSELearner [16]. The skeleton was detected using a local maximum clustering method, more details of which are forthcoming in a separate paper. In order to test the accuracy of our algorithm, we visually inspected the detected helices and included only those cases in which the helices were roughly accurate. This was done to distinguish the potential error in our loop length estimation from that of helix detection, skeletonization, or production of the CryoEM image itself.


The accuracy of the measurement was evaluated using both the simulated data and the real data from the EMDB. Table 1 summarizes the results for the simulated data. The input to our method includes two pieces of information: the detected helix (red sticks) end points and the skeleton voxels (red dots) (Figure 2B). Each measured length along the skeleton was compared with the expected length of the loop. The expected length was calculated as 3.8Å ×(n + 1), where n is the number of the amino acids on the loop and 3.8Å is the average distance between two amino acids.

Table 1 Accuracy of the loop length estimation in the simulated data set.
Figure 2
figure 2

Loop length estimation from a simplified curve. The density map (gray), detected helices (red sticks), the true structure (cyan) are shown for the HLH portion of the structure for 1DU0 (PDB Id) in (A, B) and 1MW8 in (C, D). The detected skeleton (yellow) is shown as surface view in (A) and (C), as voxels (red dots) in (B) and (D). The simplified curve derived is shown in blue.

The fifty tested cases were sorted by the length of the loop, ranging from 1 to 10 amino acids. Almost all the 50 test cases appear to have the error within 0.5Å (column 6 of Table 1). As an example, the loop in 1DU0 (row 15 of Table 1) has three amino acids and the expected length of the loop is 15.2Å. The measured length of the loop along the skeleton is 14.99Å. The relative error is 1.4% of the expected loop length. The simplified curve (blue in Figure 2B) detected by the algorithm appears to be close to the skeleton points (red dots). Another example is from 1MW8 (Figure 2 C, D, row 29 of Table 1) with six amino acids on the loop. The error of the measurement is 0.358Å in this case (column 6 of row 29, Table 1). Note that the skeleton points branch into multiple directions (Figure 2D), yet the algorithm correctly measured the length between the two ending points of the helices by using Hausdorff measurements (see Algorithm). In some cases, as in rows 18 and 28 in Table 1 the greedy step in the Hausdorff computation breaks down and the wrong pair of endpoints was used or the wrong skeleton segment was measured.

The test using the experimentally derived density data involves eighteen HLH motifs from density maps with 4-7Å resolution from EMDB. Twelve of the eighteen cases have measured error within 2Å, and six have error between 2Å and 5Å. The real density maps from the experiments are often more challenging with missing density and additional densities that do not align with the true structure. The helices and skeletons detected from the real maps are often less accurate than those from the simulated density maps. Figure 3 shows an example of experimentally derived data in EMDB 5168 (row 15 in Table 2). The difference between the measured and the expected distance is 2.88Å, higher than a comparable case with a synthetic density map used instead. In general, we saw an increase in error using the real density images, due to greater errors in helix detection and skeletonization induced by the noise present.

Figure 3
figure 3

Detected simplified curve for a loop in CryoEM image (EMDB 5168). The color scheme is the same as that in Figure 2.

Table 2 Accuracy of the measured loop length for the experimentally derived CryoEM data.

The algorithm uses a simplification parameter that is user defined. is the width of the vertex removal band (refer to the algorithm for more details). In general, the smaller the value, the less change in the simplified curve compared to the initial path. In some cases, = 0 is the best option, leaving the original path unchanged. In other cases, a much larger value of was needed. In order to see the degree of simplification that produced the most accurate results, we sampled 's range inside the interval 0[6] in increments of 0.05. The measured lengths w.r.t. values appear to form a step function, and the value closest to the expected value (Figure 4 left) was marked. As seen from this case, the measured length reduces as increases stepwise.

Figure 4
figure 4

The Douglas-Peucker step function. (Left) The step function for case 21 in Table 1 (PDB 1D8L), with the value of used for the best estimate. (Right) Distribution of the best in the simulated data set of 800 loops. The vertical lines show the values that are listed in Table 1.

Figure 4 (right) shows the distribution of the values of for about 800 simulated cases that had less than 0.5Å difference. The vertical lines represent values of for cases in Table 1. It appears that most of the values between 0.0 and 1.5 minimize the error in the measurement (Figure 4, right). However, we observed that we need larger values for the experimentally derived data than for the simulated density maps. This difference is likely to be associated with the quality of skeletonization and helix detection. For the simulated cases, between 0.0 and 1.5 is more likely to produce a good estimate after sufficient preprocessing of the density maps. Multiple values might be needed to sample the expected length when working with the experimentally derived cryoEM data.


We have developed a new approach to estimate loop length along the skeleton from a CryoEM density map. Our tests, using both simulated and experimentally derived images at medium resolution, show that it is possible for our proposed method to estimate fairly accurately the loop length along the skeleton if the SSEs such as α-helices and the skeleton are detected fairly accurately.


The overall process to measure the loop length along the skeleton consists of two tasks: preprocessing and length calculation (Figure 5). The purpose of the preprocessing is to derive the skeleton and the endpoints of the two helices from the density map. Once such information is obtained, our algorithm uses graphs and computational geometric concepts to derive the simplified curve.

Figure 5
figure 5

The process of loop length estimation. (Left) preprocessing, and (right) the length estimation algorithm.


Each case in Table 1 had a density map generated using the HLH segment of the PDB structure and EMAN's pdb2mrc [26]. We applied a skeletonization method that utilizes the local maximum points and clustering to derive the skeleton points from the density map. The HLH regions of cases in Table 2 were extracted from entire density images downloaded from EMDB. We used SSETracer, a secondary structure detection method to detect helices from the density map. It is modified from SSELearner [16] with improved speed. Since helix detection is independent of skeletonization, it is necessary to remove the skeleton voxels that belong to the helix region in order to obtain the skeleton belonging to the loop. We removed those skeleton voxels that are within 2.3Å of the central axis of the helix. Note that a helix is 2.3 - 2.5Å in radius [11, 27]. After such processing, the skeleton voxels that presumably belong to the loop are segmented from the rest of the skeleton voxels and are subject for length calculation.


Local connectivity graphs

A local connectivity graph (LCG) represents a cluster of skeleton voxels. We impose a constraint on the maximum allowable edge length in a graph, possibly yielding multiple disconnected graphs when all skeleton voxels are considered. For our tests, we normalized the distances between the image's voxels to unity, and chose a maximum edge length l < 2, producing individual connected subcomponents if they can be clustered into distant groups, referred to as LCGs in this paper.

Selecting connected components

Oftentimes, segmented or sparse density data yield multiple LCGs. Also, in general, it is not known which helix endpoints the loop actually lies between. We must then determine the best LCG for each possible pair of helix endpoints. For two helices, one with endpoints p and q and the other with r and s, there exists a set Z of four possible endpoint pairs: Z := {{p, r}, {p, s}, {q, r}, {q, s}}. For each endpoint pair z Z, let the directed Hausdorff distance to an LCG [28] be defined as

h ( z , b ) = max min z i z b j b d ( z i , b j ) ,

where z is the set of helix endpoints (comprised of voxels denoted z i ) and b is an LCG (comprised of voxels denoted b j ) from the set B of all LCGs; d(z i , b j ) is then the Euclidean distance between a helix endpoint voxel and LCG voxel. In the presence of multiple LCGs, we choose the best LCG l ^ z per endpoint pair z Z by taking the minimum directed Hausdorff distance over all LCGs:

l ^ z = min b B h ( z , b ) .

We can then use the voxels of l ^ z to build our model of the loop between the endpoints of z.

It should be noted here that the directed Hausdorff is not commutative-in general, h(M, N) ≠ h(N, M)- and we always chose M as a set (pair) of helix endpoints, and N as an LCG. Figure 6 shows the configuration for case 30 (PDB 1O6L) from Table 1, where we want to find l ^ z among the set of LCGs B := {1, 2, 3, 4, 5, 6} to search for the loop that may lie between the helix endpoint pair a. After finding l ^ z using equation (2), we repeat the procedure for each other helix endpoint pair. We try connecting the helix endpoints to their respective closest voxels in l ^ z with respect to the Euclidean distance. If either of the new edges connecting p or r is longer than 5Å, we discard the combination as an infeasible path.

Figure 6
figure 6

Hausdorff distance comparison of the connected skeleton point groups. Two detected helices (solid red lines), with a pair z of helix endpoints (connected by the red dashed line) and several LCGs (gray ellipses) from PDB 1O6L. In this case, LCG 1 is closest to z in terms of directed Hausdorff distance.


After finding the best LCG for a given possible helix endpoint pair, the next step is constructing a path that traverses it in a way that will approximate the loop. We simply performed a breadth-first search starting from one of the helix endpoints we added, and reconstruct the path that ends at the other one in the graph [29], with a helix endpoint as the source. For a given HLH, we find four such paths, one for each possible helix endpoint pair.

Path simplification

Ideally, the distance between two specific ends of two helices should be measured along the skeleton connecting the two ends by using our initial path. If we simply add the length of the line segments along the initial path, there is a danger of over estimation due to the potential zigzagging induced from drawing a path along the edges of the cubic lattice of the 3D image.

Douglas-Peucker line simplification [30, 31] is the systematic removal of points that lie beyond some distance from a line describing the general orientation of a piecewise linear curve (polyline) or one of its subsegments. Consider the two-dimensional example in Figure 7. Part (i) shows an initial polyline a . . . b ¯ . The algorithm is recursive, and takes as parameters the tolerance (Figure 7 (ii)) and a multi-point segment of a polyline. At each recursive iteration it finds an interior point of the current segment which is the most distant from the straight line connecting the end points of the segment, as in Figure 7 (ii) and 7(iii). If all of the current segment's vertices lie within the band, the segment is replaced with a straight line segment containing only its endpoints. Otherwise, the segment is split at the most distant point and each subsegment is handled recursively. In Figure 7 (iii), a c ¯ and c b ¯ are treated in different recursive calls; e is the farthest point from c b ¯ , and no points lie outside the epsilon band for a c ¯ . Overall, the initial polyline a . . . b ¯ is simplified into polyline a c e b ¯ , which approximates the length of the loop between helix endpoints.

Figure 7
figure 7

Recursive iterations of the Douglas Peucker line simplification algorithm. Each gray region, as in (ii), illustrates the distance from the test line ( a b ¯ in (ii)) defined by .



electron cryomicroscopy


secondary structure element - either α-helices or β-sheets


Electron Microscopy Data Bank


Protein Data Bank


helix-loop-helix motif found in protein structures


local connectivity graph - a connected graph of skeleton voxels with a maximum allowed edge length.


  1. Lawson C, et al.: unified data resource for CryoEM. Nucleic Acids Res 2011, 39(Database):D456–64. [] 10.1093/nar/gkq880

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  2. Zhang X, Sun S, Xiang Y, Wong J, Klose T, Raoult D, Rossmann M: Structure of Sputnik, a virophage, at 3.5-A resolution. Proc Nat Acad Sci USA 2012, 109: 18431–18436. 10.1073/pnas.1211702109

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  3. Zhang X, Ge P, Yu X, Brannan J, Bi G, Zhang Q, Schein S, Zhou Z: Cryo-EM structure of the mature dengue virus at 3.5-A resolution. Nat Struct Mol Biol 2012, 20: 105–110. 10.1038/nsmb.2463

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  4. Lu Y, Strauss C, He J: Incorporation of Constraints from Low Resolution Density Map in Ab Initio Structure Prediction Using Rosetta. Proceeding of 2007 IEEE international Conference on Bioinformatics and Biomedicine Workshops 2007, 67–73.

    Google Scholar 

  5. Baker M, Jiang W, Wedemeyer W, Rixon F, Baker D, Chiu W: Ab initio modeling of the herpes virus VP26 core domain assessed by CryoEM density. PLoS Comput Biol 2006, 2(10):e146. 10.1371/journal.pcbi.0020146

    Article  PubMed Central  PubMed  Google Scholar 

  6. Lindert S, Staritzbichler R, Wotzel N, KarakaS M, Stewart P, Meiler J: EM-fold: De novo folding of alphahelical proteins guided by intermediate-resolution electron microscopy density maps. Structure 2009, 17(7):990–1003. 10.1016/j.str.2009.06.001

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  7. Lindert S, Alexander N, Wötzel N, Karakas M, Stewart PL, Meiler J: EM-fold: de novo atomic-detail protein structure determination from medium-resolution density maps. Structure 2012, 20(3):464–478. 10.1016/j.str.2012.01.023

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  8. Nasr KA, Chen L, Si D, Ranjan D, Zubair M, He J: Building the Initial Chain of the Proteins through De Novo Modeling of the Cryo-Electron Microscopy Volume Data at the Medium Resolutions. ACM Conference on Bioinformatics, Computational Biology and Biomedicine, Orlando, FL 2012.

    Google Scholar 

  9. Nasr KA, Ranjan D, Zubair M, He J: Ranking valid topologies of the secondary structure elements using a constraint graph. J Bioinform Comput Biol 2011, 9(3):415–30. 10.1142/S0219720011005604

    Article  PubMed  Google Scholar 

  10. Baker M, Abeysinghe S, Schuh S, Coleman R, Abrams A, Marsh M, Hryc C, Ruths T, Chiu W, Ju T: Modeling protein structure at near atomic resolutions with Gorgon. Journal of Structural Biology 2011, 174(2):360–373. 10.1016/j.jsb.2011.01.015

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  11. Jiang W, Baker M, Ludtke S, Chiu W: Bridging the information gap: computational tools for intermediate resolution structure interpretation. J Mol Biol 2001, 308(5):1033–44. 10.1006/jmbi.2001.4633

    Article  CAS  PubMed  Google Scholar 

  12. Palu A, He J, Pontelli E, Lu Y: Identification of Alpha-Helices from Low Resolution Protein Density Maps. Proceeding of Computational Systems Bioinformatics Conference (CSB) 2006, 89–98.

    Google Scholar 

  13. Baker M, Ju T, Chiu W: Identification of secondary structure elements in intermediate-resolution density maps. Structure 2007, 15: 7–19. 10.1016/j.str.2006.11.008

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  14. Kong Y, Zhang X, Baker T, Ma J: A Structural-informatics approach for tracing beta-sheets: building pseudo-C(alpha) traces for beta-strands in intermediate-resolution density maps. J Mol Biol 2004, 339: 117–30. 10.1016/j.jmb.2004.03.038

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  15. Zeyun Y, Bajaj C: Computational Approaches for Automatic Structural Analysis of Large Biomolecular Complexes. Computational Biology and Bioinformatics, IEEE/ACM Transactions on 2008, 5(4):568–582.

    Article  Google Scholar 

  16. Si D, Ji S, Nasr K, He J: A machine learning approach for the identification of protein secondary structure elements from electron cryo-microscopy density maps. Biopolymers 2012, 97(9):698–708. 10.1002/bip.22063

    Article  CAS  PubMed  Google Scholar 

  17. Ju T, Baker ML, Chiu W: Computing a family of skeletons of volumetric models for shape description. Computer Aided Design 2007, 39(5):352–60. 10.1016/j.cad.2007.02.006

    Article  PubMed Central  PubMed  Google Scholar 

  18. McGuffin L, Bryson K, Jones D: The PSIPRED protein structure prediction server. Bioinformatics 2000, 16(4):404–5. 10.1093/bioinformatics/16.4.404

    Article  CAS  PubMed  Google Scholar 

  19. Ward J, McGuffin L, Buxton B, Jones D: Secondary structure prediction with support vector machines. Bioinformatics 2003, 19(13):1650–5. 10.1093/bioinformatics/btg223

    Article  CAS  PubMed  Google Scholar 

  20. Pollastri G, McLysaght A: Porter: a new, accurate server for protein secondary structure prediction. Bioinformatics 2005, 21(8):1719–20. 10.1093/bioinformatics/bti203

    Article  CAS  PubMed  Google Scholar 

  21. Pollastri G, Przybylski D, Rost B, Baldi P: Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 2002, 47(2):228–35. 10.1002/prot.10082

    Article  CAS  PubMed  Google Scholar 

  22. Sun W, He J: Native secondary structure topology has near minimum contact energy among all possible geometrically constrained topologies. Proteins 2009, 77: 159–173. 10.1002/prot.22427

    Article  CAS  PubMed  Google Scholar 

  23. Biswas A, Si D, Nasr KA, Ranjan D, Zubair M, He J: Improved efficiency in cryo-EM secondary structure topology determination from inaccurate data. J Bioinform Comput Biol 2012, 10(3):1242006–1-1242006–16. 10.1142/S0219720012420061

    Article  PubMed  Google Scholar 

  24. Abeysinghe S, Ju T: Shape modeling and matching in identifying protein structure from low resolution images. Proceedings of the 2007 ACM symposium on Solid and physical modeling 2007.

    Google Scholar 

  25. Mcknight A, Nasr KA, Si D, Chernikov A, Chrisochoides N, He J: CryoEM skeleton length estimation using a decimated curve. Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on: 4–7 October 2012 2012, 109–113. 10.1109/BIBMW.2012.6470283

    Chapter  Google Scholar 

  26. Ludtke S, Baldwin P, Chiu W: EMAN: semiautomated software for high-resolution single-particle reconstructions. J Struct Biol 1999, 128: 82–97. 10.1006/jsbi.1999.4174

    Article  CAS  PubMed  Google Scholar 

  27. Whitford D: Proteins: Structure and Function. Wiley; 2005.

  28. Veltkamp RC: Shape Matching: Similarity Measures and Algorithms. Proceedings of the International Conference on Shape Modeling and Applications SMI '01, Washington, DC, USA: IEEE Computer Society; 2001, 188. []

    Chapter  Google Scholar 

  29. Cormen T, Leierson C, Rivest R, Stein C: Introduction to Algorithms. 3rd edition. MIT Press; 2009.

    Google Scholar 

  30. Douglas D, Peucker T: Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or its Caricature. Cartograhica 1973, 10(2):112–122. 10.3138/FM57-6770-U75U-7727

    Article  Google Scholar 

  31. Hershberger J, Snoeyink J: Speeding up the Douglas-Peucker line simplification algorithm. 5th Intl Symp on Spatial Data Handling 1992, 134–143.

    Google Scholar 

Download references


Research was funded in part by NSF grants: CCF-1139864, CCF-1136538, and CSI-1136536 and by the Richard T. Cheng Endowment, the ODU MSF fund and the ODU startup fund. Nikos Chrisochoides helped defray publication costs via NSF funding.

This article has been published as part of BMC Structural Biology Volume 13 Supplement 1, 2013: Selected articles from the Computational Structural Bioinformatics Workshop 2012. The full contents of the supplement are available online at

Author information

Authors and Affiliations


Corresponding author

Correspondence to Jing He.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

Andrew McKnight compiled the test set and developed the software for the algorithm. Jing He, as advisor, guided through the problem she has previously researched regarding SSE topology matching. Nikos Chrisochoides and Andrey Chernikov provided technical information and guidance for our algorithm. Dong Si and Kamal Al Nasr provided software and technical support for SSE extraction from and skeletonization of cryoEM density maps.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution License ( ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( ) applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Cite this article

McKnight, A., Si, D., Al Nasr, K. et al. Estimating loop length from CryoEM images at medium resolutions. BMC Struct Biol 13 (Suppl 1), S5 (2013).

Download citation

  • Published:

  • DOI:


  • Protein Data Bank
  • Density Image
  • Loop Length
  • Initial Path
  • Protein Data Bank Structure