- Research article
- Open Access
Crystal structure of Escherichia coli protein ybgI, a toroidal structure with a dinuclear metal site
© Ladner et al; licensee BioMed Central Ltd 2003
- Received: 19 June 2003
- Accepted: 30 September 2003
- Published: 30 September 2003
The protein encoded by the gene ybgI was chosen as a target for a structural genomics project emphasizing the relation of protein structure to function.
The structure of the ybgI protein is a toroid composed of six polypeptide chains forming a trimer of dimers. Each polypeptide chain binds two metal ions on the inside of the toroid.
The toroidal structure is comparable to that of some proteins that are involved in DNA metabolism. The di-nuclear metal site could imply that the specific function of this protein is that of a hydrolase-oxidase enzyme.
The protein encoded by the ybgI gene of Escherichia coli is 247 residues in length and has a molecular weight of 27 kDa. It belongs to the DUF34 family of proteins. No biological function is known for members of this sequence-related family of, at present, 67 proteins. One member of this family, yeast NIF3, which has 22% identity with ybgI, is reported to interact with the yeast transcriptional coactivator NGG1p, but the exact function of this interaction is not known. It has been suggested that the product of the human gene, NIF3L1, and its mouse ortholog, Nif3l1, which have 22% identity with ybgI and 37% identity with yeast NIF3, inhibits Ngg1p from translocation to the nucleus or that NIF3 binds to Ngg1 in the cytoplasm and enters the nucleus by cotransport. Analysis of gene expression levels in Escherichia coli under genotoxic stress caused by mitomycin C DNA damage showed that expression of ybgI was significantly induced. This protein has been included as a structural genomics target [5, 6] for a study focusing on proteins that have no known function. The initial targets for this project were selected from the first completely sequenced bacterial genome, that of Haemophilus influenzae. The protein ybgI is a sequence homolog of Haemophilus influenzae HI0105 with a sequence identity of 59%. The ybgI gene was cloned, the protein was expressed, and the crystal structure was determined to 2.2-Å resolution.
The toroid is composed of six polypeptide chains generated by the application of 3-fold symmetry to the dimer. The 2-fold noncrystallographic symmetry operators of the dimers are perpendicular to the 3-folds. The inside diameter of the toroid is approximately 30 Å and the outside diameter is approximately 90 Å; the height of the toroid is 57 Å. Because of the 2-fold axes, the toroid appears the same when approached from either direction along a 3-fold symmetry axis. The superposition of the native subunit structure and the selenomethionine subunit structure gives RMSDs for the Cα atoms of 0.2–0.3 Å. The z positions of the toroids on the crystallographic 3-folds differ; in other words, the noncrystallographic 2-folds are not coplanar. The relative positions of the toroids also differ between the native and selenomethionine crystals. For instance, the E chain selenomethionine/methionine at position 135 is packed against the C chain region 138–141 in the selenomethionine structure and against the C chain region 140–144 in the native structure.
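The trimer-of-dimers arrangement described above can be illustrated with a small geometric sketch. It assumes an idealized frame in which the 3-fold axis lies along z and one 2-fold axis lies along x; the marker coordinates and the function names are hypothetical and are not taken from the deposited structures.

```python
import math

def rotate_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def twofold_x(point):
    """Apply a 2-fold (180 degree) rotation about the x axis,
    which is perpendicular to the 3-fold axis along z."""
    x, y, z = point
    return (x, -y, -z)

def build_toroid(reference_point):
    """Return six symmetry-related copies of a point:
    a dimer (2-fold) expanded by a 3-fold axis."""
    dimer = [reference_point, twofold_x(reference_point)]
    return [rotate_z(p, k * 120.0) for k in range(3) for p in dimer]

# Hypothetical marker atom in one chain (coordinates in angstroms).
marker = (30.0, 5.0, 10.0)
copies = build_toroid(marker)

for chain_index, (x, y, z) in enumerate(copies):
    print(f"copy {chain_index}: x={x:7.2f} y={y:7.2f} z={z:7.2f}")
```

All six copies lie at the same distance from the 3-fold axis, three above and three below the plane containing the 2-folds, which is why such a ring looks the same when viewed from either end.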
An E. coli operon has been identified that includes the nei gene, which codes for endonuclease VIII, and four other genes, ybgI, ybgJ, ybgK, and ybgL. Endonuclease VIII is an oxidative base excision repair protein. The proteins encoded by ybgJ and ybgK are putative carboxylases and the protein encoded by ybgL is a putative lactam utilization protein. The inclusion of ybgI in the nei operon of other bacteria is not well conserved.
The toroidal ring quaternary structure brings to mind many proteins that are involved in DNA metabolism. In a recent review, Hingorani and O'Donnell examine these proteins and speculate that the convergence to the toroidal shape is a means of providing an enclosed environment for otherwise chemically unfavorable reactions. These proteins include sliding clamps, helicases that catalyze ATP-fuelled DNA unwinding, and exonucleases and topoisomerases that chemically modify DNA. For instance, the exonuclease of bacteriophage λ is a trimer and forms a toroid with an inner diameter of 30 Å at one end and 15 Å at the opposite end. The exonuclease encircles the double-stranded DNA and processively hydrolyzes one of the two strands. The enzyme moves with a specific orientation and degrades the 5' strand so that the product is the 3' strand. The ybgI structure, in contrast, is a symmetric toroid: the inner diameter is the same whether approached from above or below.
In a review of di-iron-carboxylate proteins (proteins with di-iron centers bridged by carboxylate residues and oxide/hydroxide groups), the authors grouped the known structures into four structural categories. The first three categories are all variations on helix bundles. The fourth class is the α/β sandwich category, which includes purple acid phosphatases. These proteins have di-metal centers (Fe and Zn) that catalyze the hydrolysis of phosphate esters. There is an active site tyrosine radical that is responsible for the purple color and the OH is 2.2 Å from the iron atom. In ybgI, the closest tyrosine is 11 Å away from the metal ions.
The quaternary structure, taken together with the upregulated response to DNA damage, the inclusion in the operon with endonuclease VIII, and the sequence homology with the yeast NIF3 protein, appears consistent with a function that involves DNA repair or the transcription process. Comparison of the active site with known structures has not yet yielded a definitive clue concerning the specific biological function. Biochemical studies to further profile the function of the ybgI protein are in progress.
The atomic coordinates and structure factors of the selenomethionine and native structures of ybgI are deposited in the Protein Data Bank as 1NMO and 1NMP, respectively.
Cloning, expression, and purification
The ybgI gene was amplified by polymerase chain reaction (PCR) from Escherichia coli MG1655 genomic DNA and subcloned into pDONR201 plasmid using Gateway Technology (Invitrogen). For expression, the coding sequence was transferred into pDEST14 plasmid using site-specific recombination (Invitrogen). The protein was produced in E. coli strain BL21 Star (DE3) (Invitrogen) that was transformed with pDEST14. Cells were grown on LB media containing 100 µg/µL ampicillin at 37°C to an A600 of 0.6 and induced with 1 mM isopropyl β-D-thiogalactoside for 3 hours. The protein was purified by column chromatography in two steps using Source 30Q (Pharmacia) and Butyl-560M (Toyopearl).
Crystallization and structure determination
Crystals were obtained by the vapor diffusion method in hanging drops at room temperature for the native protein and the selenomethionine derivative. The reservoir solution for the native protein included 0.1 M cacodylate buffer at pH 7.5, 0.1 M magnesium acetate, 15% (w/v) polyethylene glycol 8000 and 5% (v/v) polyethylene glycol 400. The reservoir solution for the selenomethionine protein included 0.1 M imidazole buffer at pH 8.0, 0.2 M calcium acetate and 15% (w/v) polyethylene glycol 3350. The hanging drops were formed by combining equal volumes of protein solution and reservoir solution. The protein concentrations were 4.7 mg/mL for the native protein and 8.2 mg/mL for the selenomethionine protein. For data collection the crystals were passed through a solution made of equal volumes of reservoir solution and saturated lithium formate for the native crystals and two volumes of reservoir solution and one volume of saturated lithium formate for the selenomethionine derivative.
Diffraction data were collected at the Advanced Photon Source (APS) South East Regional Collaborative Access Team (SER-CAT) beam line 22ID-D at Argonne National Laboratory. All data were collected at 100 K. Data were collected at three wavelengths for the selenomethionine derivative crystal (0.9795 Å, 0.9793 Å and 0.9780 Å) and at 0.9793 Å for the native crystal. The data were processed using d*TREK.
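As a quick check on the wavelengths quoted above, the following sketch converts them to photon energies using E = hc/λ. The constant and function names are illustrative; the text does not say which wavelength served as peak, inflection, or remote, so no such labels are assigned here. All three energies fall close to the selenium K absorption edge (roughly 12.66 keV).

```python
# h*c expressed in keV * angstrom (from hc = 1239.8419843 eV * nm).
HC_KEV_ANGSTROM = 12.398419843

def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom

# Wavelengths reported for the selenomethionine data sets.
semet_wavelengths = [0.9795, 0.9793, 0.9780]
energies = {w: wavelength_to_energy_kev(w) for w in semet_wavelengths}

for wavelength, energy in energies.items():
    print(f"{wavelength:.4f} A -> {energy:.4f} keV")
```

The spread between the shortest and longest wavelength is only about 20 eV, which is typical of a multiwavelength experiment collected around a single absorption edge.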
The selenium sites were found with Shake-N-Bake [23, 24]. The polypeptide has four methionine residues and there are three dimers (six monomers) in the asymmetric unit. The 18 highest-ranked sites were entered into SOLVE (http://www.solve.lanl.gov). SOLVE chose the opposite hand and gave a solution with 21 sites. RESOLVE was not able to find the correct noncrystallographic symmetry, but once this was determined by visual and vector examination of the sites, RESOLVE was able to build the backbone for 911 of the 1482 residues and place 491 side chains. By superimposing the partial models for the six copies of the polypeptide chain, a nearly complete tracing was determined. CNS was used to refine this model against the data. As the refinement progressed, the noncrystallographic symmetry restraints were reduced. XTALVIEW was used to visualize the structure and to make manual adjustments of the coordinates to improve their agreement with the electron density map. REDUCE and PROBE were used to guide rebuilding and to help resolve side chain conformations, and PROCHECK was used to validate the structures.
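The counts in the paragraph above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (247 residues and four methionines per chain, six chains in the asymmetric unit); the variable names are illustrative, and the assumption that every methionine is replaced by selenomethionine and ordered is an idealization.

```python
RESIDUES_PER_CHAIN = 247      # length of the ybgI polypeptide
MET_PER_CHAIN = 4             # methionine residues per chain
CHAINS_IN_ASU = 6             # three dimers in the asymmetric unit

# Upper bound on selenium sites, assuming full substitution and ordering.
expected_se_sites = MET_PER_CHAIN * CHAINS_IN_ASU
total_residues = RESIDUES_PER_CHAIN * CHAINS_IN_ASU

# Counts reported for the automated steps.
sites_entered = 18
sites_from_solve = 21
backbone_built = 911
sidechains_placed = 491

backbone_fraction = backbone_built / total_residues
sidechain_fraction = sidechains_placed / total_residues

print(f"expected Se sites: {expected_se_sites}")
print(f"total residues:    {total_residues}")
print(f"backbone built:    {backbone_fraction:.1%}")
print(f"side chains:       {sidechain_fraction:.1%}")
```

Under these assumptions the 21 sites returned by SOLVE account for most of the 24 possible selenium positions, and the automated build covers a little over 60% of the backbone before the six partial models are superimposed.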
X-Ray Data Processing and Refinement Statistics
cell (a,b,c) (Å)
no. measured intensities
no. unique reflection
R merge (all/high res.)
completeness (all/high res.)
I/σ average (all/high res.)
resolution limits used (Å)
R-factor (95% data)
Rfree (5% data)
amino acid residues/atoms
11 Mg ions
12 Fe ions
no. of water molecules
bond length rms deviation (Å)
angle rms deviation (°)
average B (main/side chain) (Å²)
average B water (Å²)
Metal ion determination
X-ray fluorescence scans were performed at the absorption edges for Zn, Cu, Ni, Co and Fe at the Advanced Photon Source (APS) Industrial Macromolecular Crystallography Association Collaborative Access Team (IMCA-CAT) beam line 17-ID at Argonne National Laboratory. Solution samples of the native and SeMet proteins were used for the scans. The scans indicated the presence of Fe, but no Zn, Cu, Ni, or Co, in the SeMet protein; none of these metals was detected in the native protein solution.
We would like to acknowledge the consultations with Celia Chen on the crystallographic packing and selenomethionine substructure. This work was supported by the National Institutes of Health grant No. P01-GM57890. This work was also supported in part by an award from the W.M. Keck Foundation. Diffraction data were collected at the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, under Contract No. W-31-109-Eng-38.
Certain commercial materials, instruments, and equipment are identified in this manuscript in order to specify the experimental procedure as completely as possible. In no case does such identification imply that the materials, instruments, or equipment identified is necessarily the best available for the purpose.
The accepted SI units of concentration, mol/L, and of unified atomic mass unit, u, have been represented by the symbol M and by the symbol Da, respectively, in order to conform to the conventions of the journal.
- Pfam: Protein Families database of alignments and HMMs [http://www.sanger.ac.uk/Software/Pfam/]
- Martens JA, Genereaux J, Saleh A, Brandl CJ: Transcriptional Activation by Yeast PDR1p Is Inhibited by Its Association with NGG1p/ADA3p. J. Biol. Chem. 1996, 271: 15884–15890. 10.1074/jbc.271.16.9298
- Tascou S, Uedelhoven J, Dixkens C, Nayernia K, Engel W, Burfeind P: Isolation and characterization of a novel human gene, NIF3L1, and its mouse ortholog, Nif3l1, highly conserved from bacteria to mammals. Cytogenet. Cell Genet. 2000, 90: 330–336. 10.1159/000056799
- Khil PP, Camerini-Otero RD: Over 1000 genes are involved in the DNA damage response of Escherichia coli. Mol. Microbiol. 2002, 44: 89–105. 10.1046/j.1365-2958.2002.02878.x
- Eisenstein E, Gilliland GL, Herzberg O, Moult J, Orban J, Poljak RJ, Banergei L, Richardson D, Howard AJ: Biological function made crystal clear - annotation of hypothetical proteins via structural genomics. Curr Opin in Biotechnol 2000, 11: 25–30. 10.1016/S0958-1669(99)00063-4
- Structure2Function Project[http://s2f.umbi.umd.edu]
- Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb J-F, Dougherty BA, Merrick JM, McKenney K, Sutton G, FitzHugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu L-I, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JL, Geoghagen NSM, Gnehm CL, McDonald LA, Small KV, Fraser CM, Smith HO, Venter JC: Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd. Science 1995, 269: 496.
- Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Engineering 1998, 11: 739–747. 10.1093/protein/11.9.739
- Finding 3-D Similarities in Protein Structures[http://cl.sdsc.edu]
- Holm L, Sander C: Mapping the protein universe. Science 1996, 273: 595–602.
- The DALI server[http://www2.ebi.ac.uk/dali]
- Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995, 247: 536–540. 10.1006/jmbi.1995.0159
- Structural Classification of Proteins[http://scop.mrc-lmb.cam.ac.uk/scop/]
- Hendrickson WA, Horton JR, LeMaster DM: Selenomethionyl proteins produced for analysis by multiwavelength anomalous diffraction (MAD); a vehicle for direct determination of three-dimensional structure. EMBO J. 1990, 9: 1665–1672.
- Auld DS: Zinc coordination sphere in biochemical zinc sites. Biometals 2001, 14: 271–313. 10.1023/A:1012976615056
- Gifford CM, Wallace SS: The genes encoding endonuclease VIII and endonuclease III in Escherichia coli are transcribed as the terminal genes in operons. Nucleic Acids Research 2000, 28: 762–769. 10.1093/nar/28.3.762
- Hingorani MM, O'Donnell M: A tale of toroids in DNA metabolism. Nat Rev Mol Cell Biol 2000, 1: 22–30. 10.1038/35036044
- Kovall R, Matthews BW: Toroidal structure of lambda-exonuclease. Science 1997, 277: 1824–1827. 10.1126/science.277.5333.1824
- Nordlund P, Eklund H: Di-iron-carboxylate proteins. Current Opinion in Structural Biology 1995, 5: 758–766. 10.1016/0959-440X(95)80008-5
- Protein Data Bank[http://www.rcsb.org/pdb/]
- Rubinson KA, Ladner JE, Tordova M, Gilliland GL: Cryosalts: suppression of ice formation in macromolecular crystallography. Acta Crystallog. 2000, D56: 996–1001.
- Pflugrath JW: The finer things in X-ray diffraction data collection. Acta Crystallog. 1999, D55: 1718–1725.
- Blessing RH, Smith GD: Difference structure-factor normalization for heavy-atom or anomalous-scattering substructure determinations. J. Appl. Cryst. 1999, 32: 664–670. 10.1107/S0021889899003416
- Weeks CM, Miller R: The design and implementation of SnB v2.0. J. Appl. Cryst. 1999, 32: 120–124. 10.1107/S0021889898010504
- Terwilliger TC, Berendzen J: Automated MAD and MIR structure solution. Acta Crystallog. 1999, D55: 849–861.
- Terwilliger TC: Automated structure solution, density modification and model building. Acta Crystallog. 2002, D58: 1937–1940.
- Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang J-S, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL: Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallog. 1998, D54: 905–921.
- McRee DE: Practical Protein Crystallography. 2nd edition. San Diego, Academic Press 1999, 477.
- Word JM, Lovell SC, LaBean TH, Taylor HC, Zalis ME, Presley BK, Richardson JS, Richardson DC: Visualizing and Quantifying Molecular Goodness-of-Fit: Small-probe Contact Dots with Explicit Hydrogen Atoms. J. Mol. Biol. 1999, 285: 1711–1733. 10.1006/jmbi.1998.2400
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM: PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 1993, 26: 283–291. 10.1107/S0021889892009944
- Kraulis PJ: MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Applied Crystallography 1991, 24: 946–950. 10.1107/S0021889891004399
- Bacon DJ, Anderson WF: A Fast Algorithm for Rendering Space-filling Molecule Pictures. J. of Molecular Graphics 1988, 6: 219–220. 10.1016/S0263-7855(98)80030-1
- Merritt EA, Bacon DJ: Raster3D: Photorealistic Molecular Graphics. Methods in Enzymology (Edited by: Sweet RM and Carter CW Jr). San Diego, Academic Press 1997, 277: 505–524.
- DeLano WL: The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA 2002. [http://www.pymol.org]
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.