Crystal structure of Escherichia coli protein ybgI, a toroidal structure with a dinuclear metal site

Ladner, Jane E; Obmolova, Galina; Teplyakov, Alexey; Howard, Andrew J; Khil, Pavel P; Camerini-Otero, R Daniel; Gilliland, Gary L

doi:10.1186/1472-6807-3-7

Research article
Open access
Published: 30 September 2003

Jane E Ladner¹,
Galina Obmolova¹,
Alexey Teplyakov¹,
Andrew J Howard³,
Pavel P Khil²,
R Daniel Camerini-Otero² &
Gary L Gilliland¹

BMC Structural Biology volume 3, Article number: 7 (2003)


Abstract

Background

The protein encoded by the gene ybgI was chosen as a target for a structural genomics project emphasizing the relation of protein structure to function.

Results

The structure of the ybgI protein is a toroid composed of six polypeptide chains forming a trimer of dimers. Each polypeptide chain binds two metal ions on the inside of the toroid.

Conclusion

The toroidal structure is comparable to that of some proteins that are involved in DNA metabolism. The di-nuclear metal site could imply that the specific function of this protein is as a hydrolase-oxidase enzyme.

Background

The protein encoded by the ybgI gene of Escherichia coli is 247 residues in length and has a molecular weight of 27 kDa. It belongs to the DUF34 family of proteins [1]. No biological function is known for members of this family, which at present comprises 67 sequence-related proteins. One member of the family, yeast NIF3, which has 22% identity with ybgI, is reported to interact with the yeast transcriptional coactivator NGG1p, but the exact function of this interaction is not known [2]. It has been suggested that the product of the human gene, NIF3L1, and its mouse ortholog, Nif3l1, which have 22% identity with ybgI and 37% identity with yeast NIF3, inhibits the translocation of Ngg1p to the nucleus, or that NIF3 binds to Ngg1p in the cytoplasm and enters the nucleus by cotransport [3]. Analysis of gene expression levels in Escherichia coli under conditions of genotoxic stress caused by mitomycin C DNA damage showed that expression of ybgI was significantly induced [4]. This protein has been included as a structural genomics target [5, 6] for a study focusing on proteins which have no known function. The initial targets for this project were selected from the first completely sequenced bacterial genome, that of Haemophilus influenzae [7]. The protein ybgI is a sequence homolog of Haemophilus influenzae HI0105, with a sequence identity of 59%. The ybgI gene was cloned, the protein was expressed, and the crystal structure was determined to 2.2-Å resolution.
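As a rough consistency check on the chain length and mass quoted above, the mass can be estimated from the number of residues. The sketch below assumes an average residue mass of about 110 Da, which is a generic rule of thumb and not a value taken from this work; the function name is illustrative only.

```python
# Rough molecular-weight estimate from chain length.
# ASSUMPTION: ~110 Da per residue is a generic rule of thumb, not a value from the paper.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Return an approximate polypeptide mass in kDa."""
    return n_residues * residue_mass_da / 1000.0


monomer_kda = estimate_mass_kda(247)  # chain length reported in the text
print(f"estimated monomer mass: {monomer_kda:.1f} kDa")
```

With these assumptions the estimate lands close to the reported 27 kDa.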

Results and Discussion

The ybgI protein consists of two similar interlinked α/β domains; both are 3-layer sandwiches (alpha-beta-alpha) as shown in Figure 1. The first domain has a 5-stranded mixed β-sheet with two α-helices on one side and three α-helices on the other side. Two of the three α-helices are approximately parallel to the β-strands of the β-sheet and the third is shorter, approximately perpendicular to the β-strands and leads over to the second domain. The order of the β-strands is 1-4-3-2-11. The second domain also has a central mixed β-sheet but has 6 β-strands with the order 5-6-8-9-10-7; the β-sheet is flanked on each side by two α-helices and there is an additional short α-helix leading back to domain 1. The crystallographic asymmetric unit contains three dimers. The application of the three-fold crystal symmetry reveals that the quaternary structure is a toroid formed by three crystallographically related dimers. In the crystals, these toroids stack forming long tubes. The toroidal structure is shown in Figures 2A and 2B.

Searching with CE [8, 9], DALI [10, 11] and SCOP [12, 13] yielded no other polypeptides with the particular arrangement of mixed β-sheets and α-helices observed in either domain.

The toroid is composed of six polypeptide chains generated by the application of 3-fold symmetry on the dimer. The 2-fold noncrystallographic symmetry operators of the dimers are perpendicular to the three-folds. The inside diameter of the toroid is approximately 30 Å and the outside diameter is approximately 90 Å; the height of the toroid is 57 Å. Due to the 2-fold, the toroid appears the same when approached from either direction along a 3-fold symmetry axis. The superposition of the native subunit structure and selenomethionine subunit structure gives RMSDs for the Cα atoms of 0.2-0.3 Å. The z positions of the toroids on the crystallographic 3-folds differ; in other words, the non-crystallographic 2-folds are not coplanar. It is also of interest to note that the relative positions of the toroids differ between the native and selenomethionine crystals. For instance, the E chain selenomethionine/methionine at position 135 is packed up against the C chain region of 138–141 in the selenomethionine structure and against the C chain region of 140–144 in the native structure.
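The way a six-chain ring arises from one dimer and a 3-fold axis can be sketched as follows. This is a minimal illustration, not the procedure used in the structure determination: the 3-fold is placed along z, the dimer 2-fold along x, and the two coordinates are hypothetical placeholders standing in for one point in each chain of a dimer.

```python
import math


def rotate_about_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def threefold_copies(points):
    """Return the three copies of a point set related by 0, 120 and 240 degrees."""
    return [[rotate_about_z(p, k * 120.0) for p in points] for k in range(3)]


# HYPOTHETICAL placeholder positions (in Å), one per chain of a dimer.
# They are related by a 2-fold along x: (x, y, z) -> (x, -y, -z),
# which is perpendicular to the 3-fold along z.
dimer = [(30.0, 5.0, 10.0), (30.0, -5.0, -10.0)]

ring = threefold_copies(dimer)
n_chains = sum(len(copy) for copy in ring)
print(n_chains)  # three dimers, two chains each
```

Because the 2-fold is perpendicular to the 3-fold, every point at height +z has a partner at height -z, which is why the ring looks the same from either end of the 3-fold axis.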

The most likely region for the active site is a group of conserved residues which includes four histidines (63, 64, 97, 215), two glutamic acids (194, 219), one aspartic acid (101), one asparagine (108), one cysteine (171), one tyrosine (22) and one tryptophan (68). There are also two metal ions, 3.3 Å apart in the selenomethionine protein and 2.5 Å apart in the native protein, bound by this cluster of residues. In the early refinement of the selenomethionine structure these were treated as 'water' molecules and the B-values became very low, indicating that they must be something heavier than oxygen. The anomalous Fourier map of the selenomethionine data indicates that there is a significant anomalous signal at these positions, though much lower than that of selenium. X-ray fluorescence identified the presence of Fe in the protein sample. One metal ion is coordinated by H64 Nε2, H215 Nε2 and E219 Oε1; the other metal ion is coordinated by D101 Oδ1 and Oδ2, E219 Oε2 and H63 Nε2. This grouping is set back into the inside wall of the toroid and includes residues from both domains. The metal ion sites of the dimer are at opposite ends of a cavity that extends across the dimer interface. This cavity is separated from the center of the toroid by the Y22 residues of the dimer chains. The Y22 residues narrow the access to the cavity from the center of the toroid to approximately 14 Å. The distances between the metal ions of the dimer chains are 21.9 and 25.6 Å. The distances between 3-fold related metal ions are 45.0 and 42.5 Å. One of the six putative active sites is shown in Figure 3. In the native protein, the metal positions may be filled or partially filled by magnesium ions, which are present in both the growth medium and the crystallization solution. An anomalous Fourier map calculated with the native data does not reveal any anomalous signal at these positions, and negative results using X-ray fluorescence eliminate the presence of Fe, Zn, Cu, Ni, and Co.
In this structure, only 11 sites were included. The electron density at these positions tends to be somewhat smeared. The appearance of the electron density and the refinement of the B factors were used as guides to include or exclude ion sites. The protein structure around the sites is quite good. The presence of iron in the selenomethionine protein sample may indicate the adventitious uptake of iron during preparation since the procedure includes the addition of iron sulfate as a component in the growth medium [14]. The intrinsic metal ions for this protein are not known. The inclusion of histidine, glutamic acid, and aspartic acid in the putative active site with a bridging glutamic acid between the ions is in keeping with cocatalytic sites in a number of proteins where the metal ions are Zn or Zn and Fe, Mn, or Mg [15]. The constancy of the protein structure around these sites supports the view that these are catalytic rather than structural sites.
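The ligand assignment described above can be written down as a small table and queried for the residue that bridges the two ions. This is only a bookkeeping sketch: the ligand lists are copied from the text, the atom labels use PDB-style names (NE2 for Nε2, OD1 for Oδ1, and so on), and the helper function names are illustrative.

```python
# Metal ligands as listed in the text, written as (residue, donor atom).
SITE_1 = [("H64", "NE2"), ("H215", "NE2"), ("E219", "OE1")]
SITE_2 = [("D101", "OD1"), ("D101", "OD2"), ("E219", "OE2"), ("H63", "NE2")]


def residues(site):
    """Return the set of residues that donate at least one atom to a site."""
    return {residue for residue, _atom in site}


def bridging_residues(site_a, site_b):
    """Residues that contribute ligand atoms to both metal ions."""
    return sorted(residues(site_a) & residues(site_b))


def bidentate_residues(site):
    """Residues that donate more than one atom to the same metal ion."""
    counts = {}
    for residue, _atom in site:
        counts[residue] = counts.get(residue, 0) + 1
    return sorted(r for r, n in counts.items() if n > 1)


print(bridging_residues(SITE_1, SITE_2))
print(bidentate_residues(SITE_2))
```

The only residue shared by both lists is E219, the bridging glutamic acid mentioned in the text, and D101 is the only residue listed with two donor atoms on one ion.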

An E. coli operon has been identified that includes the nei gene which codes for endonuclease VIII and four other genes, ybgI, ybgJ, ybgK, ybgL [16]. Endonuclease VIII is an oxidative base excision repair protein. The proteins encoded by ybgJ and ybgK are putative carboxylases and the protein encoded by ybgL is a putative lactam utilization protein. The inclusion of ybgI in the nei operon of other bacteria is not well conserved.

The highly conserved residues of the DUF34 family are concentrated in two regions of the ybgI structure: at the putative active site and on the side of a groove between the polypeptide chains of the trimer. Figure 4 shows the conserved residues mapped onto the surface of the molecule.

The toroidal ring quaternary structure brings to mind many proteins that are involved in DNA metabolism. In a recent review [17], Hingorani and O'Donnell examine these proteins and speculate on the convergence to the toroidal shape as being a means of providing an enclosed environment for otherwise chemically unfavorable reactions. The functions of these proteins include sliding clamps and helicases that catalyze ATP-fuelled DNA unwinding, and exonucleases and topoisomerases that chemically modify DNA. For instance, the exonuclease of λ bacteriophage is a trimer and forms a toroid with an inner diameter of 30 Å at one end and 15 Å at the opposite end. The exonuclease encircles the double-stranded DNA and processively hydrolyzes one of the two strands. The enzyme moves with a specific orientation and degrades the 5' strand so that the product is the 3' strand [18]. The ybgI structure, by contrast, is a symmetric toroid; the inner diameter is the same whether approached from above or below.

In a review of di-iron-carboxylate proteins (proteins with di-iron centers bridged by carboxylate residues and oxide/hydroxide groups) [19], the authors grouped the known structures into four structural categories. The first three categories are all variations on helix bundles. The fourth class is the α/β sandwich category which includes purple acid phosphatases. These proteins have di-metal centers (Fe and Zn) that catalyze the hydrolysis of phosphate esters. There is an active site tyrosine radical that is responsible for the purple color and the OH is 2.2 Å from the iron atom. In ybgI, the closest tyrosine is 11 Å away from the metal ions.

Conclusions

The quaternary structure, taken together with the up-regulated expression in response to DNA damage, the inclusion in the operon with endonuclease VIII, and the sequence homology with the yeast NIF3 protein, appears consistent with a function that involves DNA repair or the transcription process. Comparison of the active site with known structures has not yet yielded a definitive clue concerning the specific biological function. Biochemical studies to further profile the function of the ybgI protein are in progress.

The atomic coordinates and structure factors of the selenomethionine and native structures of ybgI are deposited in the Protein Data Bank [20] as 1NMO and 1NMP, respectively.

Methods

Cloning, expression, and purification

The ybgI gene was amplified by polymerase chain reaction (PCR) from Escherichia coli MG1655 genomic DNA and subcloned into pDONR201 plasmid using Gateway Technology (Invitrogen). For expression, the coding sequence was transferred into pDEST14 plasmid using site-specific recombination (Invitrogen). The protein was produced in E. coli strain BL21 Star (DE3) (Invitrogen) that was transformed with pDEST14. Cells were grown in LB medium containing 100 µg/mL ampicillin at 37°C to an A₆₀₀ of 0.6 and induced with 1 mM isopropyl β-D-thiogalactoside for 3 hours. The protein was purified by column chromatography in two steps using Source 30Q (Pharmacia) and Butyl-560M (Toyopearl).

Crystallization and structure determination

Crystals were obtained by the vapor diffusion method in hanging drops at room temperature for the native protein and the selenomethionine derivative. The reservoir solution for the native protein included 0.1 M cacodylate buffer at pH 7.5, 0.1 M magnesium acetate, 15% (w/v) polyethylene glycol 8000 and 5% (v/v) polyethylene glycol 400. The reservoir solution for the selenomethionine protein included 0.1 M imidazole buffer pH 8.0, 0.2 M calcium acetate and 15% (w/v) polyethylene glycol 3350. The hanging drops were formed by combining equal volumes of protein solution and reservoir solution. The protein concentrations were 4.7 mg/mL for the native protein and 8.2 mg/mL for the selenomethionine protein. For data collection the crystals were passed through a solution made of equal volumes of reservoir solution and saturated lithium formate for the native crystals and 2 volumes of reservoir solution and one volume of saturated lithium formate for the selenomethionine derivative [21].

Diffraction data were collected at the Advanced Photon Source (APS) South East Regional Collaborative Access Team (SER-CAT) beam line 22ID-D at Argonne National Laboratory. All data were collected at 100 K. Data were collected at three wavelengths for the selenomethionine derivative crystal (0.9795 Å, 0.9793 Å and 0.9780 Å) and at 0.9793 Å for the native crystal. The data were processed using d*TREK [22].
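The three wavelengths can be converted to photon energies to see where they sit relative to the selenium K absorption edge. The sketch below uses E = hc/λ with hc ≈ 12398.42 eV·Å. The edge energy of about 12658 eV is the commonly tabulated value for elemental selenium, not a number from this work, and the text does not state which wavelength served which role in the experiment.

```python
# Convert X-ray wavelengths (Å) to photon energies (eV) using E = hc / wavelength.
HC_EV_ANGSTROM = 12398.42       # Planck constant times speed of light, in eV·Å
SE_K_EDGE_EV = 12658.0          # ASSUMPTION: tabulated Se K edge, not from the paper


def wavelength_to_energy_ev(wavelength_angstrom):
    """Return the photon energy in eV for a wavelength given in Å."""
    return HC_EV_ANGSTROM / wavelength_angstrom


# Wavelengths reported for the selenomethionine derivative crystal.
wavelengths = [0.9795, 0.9793, 0.9780]
energies = [wavelength_to_energy_ev(w) for w in wavelengths]

for w, e in zip(wavelengths, energies):
    print(f"{w:.4f} Å -> {e:.1f} eV ({e - SE_K_EDGE_EV:+.1f} eV from the tabulated Se K edge)")
```

All three energies fall within about 20 eV of the tabulated edge, as would be expected for a multiwavelength experiment on a selenomethionine derivative.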

The selenium sites were found with Shake-N-Bake [23, 24]. The polypeptide has four methionine residues and there are three dimers (six monomers) in the asymmetric unit. The 18 highest-ranked sites were entered into SOLVE [25] (http://www.solve.lanl.gov). SOLVE chose the opposite hand and gave a solution with 21 sites. RESOLVE [26] was not able to find the correct noncrystallographic symmetry, but once this was determined by visual and vector examination of the sites, RESOLVE was able to build backbone for 911 of the 1482 residues and place 491 sidechains. By superimposing the partial models for the six copies of the polypeptide chain, a nearly complete tracing was determined. CNS [27] was used to refine this model against the data. As the refinement progressed the noncrystallographic symmetry restraints were reduced. XTALVIEW [28] was used to visualize the structure and to make manual adjustments of the coordinates to improve their agreement with the electron density map. REDUCE and PROBE [29] were used to guide rebuilding to help resolve side chain conformations and PROCHECK [30] was used to validate the structures.
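The counts in this paragraph follow from simple arithmetic on the contents of the asymmetric unit, sketched below. The calculation assumes that every methionine is replaced by selenomethionine and is ordered, which gives an upper bound on the number of selenium sites; the variable names are illustrative.

```python
# Bookkeeping for the asymmetric unit, using numbers quoted in the text.
RESIDUES_PER_CHAIN = 247
MET_PER_CHAIN = 4
CHAINS_IN_ASU = 6               # three dimers

# ASSUMPTION: every methionine is substituted and ordered (an upper bound).
max_se_sites = MET_PER_CHAIN * CHAINS_IN_ASU
total_residues = RESIDUES_PER_CHAIN * CHAINS_IN_ASU

sites_entered = 18              # highest-ranked Shake-N-Bake sites given to SOLVE
sites_from_solve = 21
backbone_built = 911
sidechains_placed = 491

print(f"maximum Se sites: {max_se_sites}")
print(f"residues in asymmetric unit: {total_residues}")
print(f"sites entered / maximum: {sites_entered / max_se_sites:.0%}")
print(f"sites from SOLVE / maximum: {sites_from_solve / max_se_sites:.1%}")
print(f"backbone built automatically: {backbone_built / total_residues:.1%}")
print(f"side chains placed automatically: {sidechains_placed / total_residues:.1%}")
```

The product 6 × 247 reproduces the 1482 residues quoted above, so the automated build covered a little over 60% of the backbone and about a third of the side chains before the six partial models were superimposed.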

The selenomethionine data and the native data are not isomorphous. The cells differ by greater than 1% in the a and b unit cell dimensions. Consequently, the native structure was solved by molecular replacement using CNS. The dimer unit was used as the search molecule. Refinement against the diffraction data was also accomplished using the CNS package. As in the selenomethionine structure, noncrystallographic symmetry restraints were used throughout the refinement but the weighting was reduced after the initial rounds of refinement. The data and refinement statistics are shown in Table 1.
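The isomorphism criterion mentioned above, a change of more than 1% in a cell dimension, can be expressed as a small check. This is a sketch only: the cell lengths below are hypothetical placeholders and are not the values from Table 1, and the function names are illustrative.

```python
def percent_difference(x, y):
    """Absolute difference between two lengths as a percentage of their mean."""
    return 100.0 * abs(x - y) / ((x + y) / 2.0)


def is_isomorphous(cell_1, cell_2, tolerance_percent=1.0):
    """True if every corresponding cell length agrees within the tolerance."""
    return all(percent_difference(p, q) <= tolerance_percent
               for p, q in zip(cell_1, cell_2))


# HYPOTHETICAL cell lengths (a, b, c) in Å, chosen only to illustrate the test.
cell_semet = (100.0, 100.0, 150.0)
cell_native = (101.5, 101.5, 150.2)

for name, p, q in zip("abc", cell_semet, cell_native):
    print(f"{name}: {percent_difference(p, q):.2f}% difference")
print("isomorphous:", is_isomorphous(cell_semet, cell_native))
```

When the check fails, as it does for a and b in this illustration, phases cannot simply be transferred between the two crystal forms, which is why the native structure was solved by molecular replacement instead.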

Table 1 X-Ray Data Processing and Refinement Statistics


Metal ion determination

X-ray fluorescence scans were performed at the absorption edges for Zn, Cu, Ni, Co and Fe at the Advanced Photon Source (APS) Industrial Macromolecular Crystallography Association Collaborative Access Team (IMCA-CAT) beam line 17-ID at Argonne National Laboratory. Solution samples of the native and SeMet proteins were used for the scans. The scans indicated the presence of Fe in the SeMet protein and no Zn, Cu, Ni, or Co, and found none of these metals present in the native protein solution.

References

1. Pfam: Protein Families database of alignments and HMMs [http://www.sanger.ac.uk/Software/Pfam/]
2. Martens JA, Genereaux J, Saleh A, Brandl CJ: Transcriptional Activation by Yeast PDR1p Is Inhibited by Its Association with NGG1p/ADA3p. J. Biol. Chem. 1996, 271: 15884–15890. 10.1074/jbc.271.16.9298
3. Tascou S, Uedelhoven J, Dixkens C, Nayernia K, Engel W, Burfeind P: Isolation and characterization of a novel human gene, NIF3L1, and its mouse ortholog, Nif3l1, highly conserved from bacteria to mammals. Cytogenet. Cell Genet. 2000, 90: 330–336. 10.1159/000056799
4. Khil PP, Camerini-Otero RD: Over 1000 genes are involved in the DNA damage response of Escherichia coli. Mol. Microbiol. 2002, 44: 89–105. 10.1046/j.1365-2958.2002.02878.x
5. Eisenstein E, Gilliland GL, Herzberg O, Moult J, Orban J, Poljak RJ, Banergei L, Richardson D, Howard AJ: Biological function made crystal clear - annotation of hypothetical proteins via structural genomics. Curr Opin Biotechnol 2000, 11: 25–30. 10.1016/S0958-1669(99)00063-4
6. Structure2Function Project [http://s2f.umbi.umd.edu]
7. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb J-F, Dougherty BA, Merrick JM, McKenney K, Sutton G, FitzHugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu L-I, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JL, Geoghagen NSM, Gnehm CL, McDonald LA, Small KV, Fraser CM, Smith HO, Venter JC: Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd. Science 1995, 269: 496.
8. Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Engineering 1998, 11: 739–747. 10.1093/protein/11.9.739
9. Finding 3-D Similarities in Protein Structures [http://cl.sdsc.edu]
10. Holm L, Sander C: Mapping the protein universe. Science 1996, 273: 595–602.
11. The DALI server [http://www2.ebi.ac.uk/dali]
12. Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995, 247: 536–540. 10.1006/jmbi.1995.0159
13. Structural Classification of Proteins [http://scop.mrc-lmb.cam.ac.uk/scop/]
14. Hendrickson WA, Horton JR, LeMaster DM: Selenomethionyl proteins produced for analysis by multiwavelength anomalous diffraction (MAD); a vehicle for direct determination of three-dimensional structure. EMBO J. 1990, 9: 1665–1672.
15. Auld DS: Zinc coordination sphere in biochemical zinc sites. Biometals 2001, 14: 271–313. 10.1023/A:1012976615056
16. Gifford CM, Wallace SS: The genes encoding endonuclease VIII and endonuclease III in Escherichia coli are transcribed as the terminal genes in operons. Nucleic Acids Research 2000, 28: 762–769. 10.1093/nar/28.3.762
17. Hingorani MM, O'Donnell M: A tale of toroids in DNA metabolism. Nat Rev Mol Cell Biol 2000, 1: 22–30. 10.1038/35036044
18. Kovall R, Matthews BW: Toroidal structure of lambda-exonuclease. Science 1997, 277: 1824–1827. 10.1126/science.277.5333.1824
19. Nordlund P, Eklund H: Di-iron-carboxylate proteins. Current Opinion in Structural Biology 1995, 5: 758–766. 10.1016/0959-440X(95)80008-5
20. Protein Data Bank [http://www.rcsb.org/pdb/]
21. Rubinson KA, Ladner JE, Tordova M, Gilliland GL: Cryosalts: suppression of ice formation in macromolecular crystallography. Acta Crystallog. 2000, D56: 996–1001.
22. Pflugrath JW: The finer things in X-ray diffraction data collection. Acta Crystallog. 1999, D55: 1718–1725.
23. Blessing RH, Smith GD: Difference structure-factor normalization for heavy-atom or anomalous-scattering substructure determinations. J. Appl. Cryst. 1999, 32: 664–670. 10.1107/S0021889899003416
24. Weeks CM, Miller R: The design and implementation of SnB v2.0. J. Appl. Cryst. 1999, 32: 120–124. 10.1107/S0021889898010504
25. Terwilliger TC, Berendzen J: Automated MAD and MIR structure solution. Acta Crystallog. 1999, D55: 849–861.
26. Terwilliger TC: Automated structure solution, density modification and model building. Acta Crystallog. 2002, D58: 1937–1940.
27. Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang J-S, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL: Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallog. 1998, D54: 905–921.
28. McRee DE: Practical Protein Crystallography. 2nd edition. San Diego, Academic Press 1999, 477.
29. Word JM, Lovell SC, LaBean TH, Taylor HC, Zalis ME, Presley BK, Richardson JS, Richardson DC: Visualizing and Quantifying Molecular Goodness-of-Fit: Small-probe Contact Dots with Explicit Hydrogen Atoms. J. Mol. Biol. 1999, 285: 1711–1733. 10.1006/jmbi.1998.2400
30. Laskowski RA, MacArthur MW, Moss DS, Thornton JM: PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 1993, 26: 283–291. 10.1107/S0021889892009944
31. Kraulis PJ: MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Applied Crystallography 1991, 24: 946–950. 10.1107/S0021889891004399
32. Bacon DJ, Anderson WF: A Fast Algorithm for Rendering Space-filling Molecule Pictures. J. of Molecular Graphics 1988, 6: 219–220. 10.1016/S0263-7855(98)80030-1
33. Merritt EA, Bacon DJ: Raster3D: Photorealistic Molecular Graphics. Methods in Enzymology (Edited by: Sweet RM and Carter CW Jr). San Diego, Academic Press 1997, 277: 505–524.
34. DeLano WL: The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA 2002. [http://www.pymol.org]


Acknowledgements

We would like to acknowledge the consultations with Celia Chen on the crystallographic packing and selenomethionine substructure. This work was supported by the National Institutes of Health grant No. P01-GM57890. This work was also supported in part by an award from the W.M. Keck Foundation. Diffraction data were collected at Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, under Contract No. W-31-109-Eng-38.

Certain commercial materials, instruments, and equipment are identified in this manuscript in order to specify the experimental procedure as completely as possible. In no case does such identification imply that the materials, instruments, or equipment identified is necessarily the best available for the purpose.

The accepted SI units of concentration, mol/L, and of unified atomic mass unit, u, have been represented by the symbol M and by the symbol Da, respectively, in order to conform to the conventions of the journal.

Author information

Authors and Affiliations

Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute and the National Institute of Standards and Technology, 9600 Gudelsky Drive, Rockville, MD, 20850, U.S.A
Jane E Ladner, Galina Obmolova, Alexey Teplyakov & Gary L Gilliland
Genetics and Biochemistry Branch, NIDDK, National Institutes of Health, 10 Center Drive, Bethesda, MD, 20892, U.S.A
Pavel P Khil & R Daniel Camerini-Otero
Physical Sciences Department, Illinois Institute of Technology, Chicago, Illinois, 60616, U.S.A
Andrew J Howard

Corresponding author

Correspondence to Jane E Ladner.

Additional information

Authors' contributions

JEL grew the crystals used for data collection, collected the diffraction data, and solved and refined the molecular structure. GO expressed and purified the native and SeMet proteins and produced the original crystals. AT contributed to the selection of the protein and performed the X-ray fluorescence experiments. AJH provided us with access to the synchrotron and helped with the X-ray fluorescence experiments. PPK and RDC-O performed the gene expression experiments and contributed to the selection of the protein. GLG conceived the study, participated in the coordination, and provided financial support.

About this article

Cite this article

Ladner, J.E., Obmolova, G., Teplyakov, A. et al. Crystal structure of Escherichia coli protein ybgI, a toroidal structure with a dinuclear metal site. BMC Struct Biol 3, 7 (2003). https://doi.org/10.1186/1472-6807-3-7


Received: 19 June 2003
Accepted: 30 September 2003