Structures of EccB1 and EccD1 from the core complex of the mycobacterial ESX-1 type VII secretion system

Background: The ESX-1 type VII secretion system is an important determinant of virulence in pathogenic mycobacteria, including Mycobacterium tuberculosis. This complicated molecular machine secretes folded proteins through the mycobacterial cell envelope to subvert the host immune response. Despite its important role in disease, very little is known about the molecular architecture of the ESX-1 secretion system.

Results: This study characterizes the structures of the soluble domains of two conserved core ESX-1 components: EccB1 and EccD1. The periplasmic domain of EccB1 consists of 4 repeat domains and a central domain, which together form a quasi 2-fold symmetrical structure. The repeat domains of EccB1 are structurally similar to a known peptidoglycan binding protein, suggesting a role in anchoring the ESX-1 system within the periplasmic space. The cytoplasmic domain of EccD1 has a ubiquitin-like fold and forms a dimer with a negatively charged groove.

Conclusions: These structures represent a major step towards resolving the molecular architecture of the entire ESX-1 assembly and may contribute to ESX-1 targeted tuberculosis intervention strategies.

Electronic supplementary material: The online version of this article (doi:10.1186/s12900-016-0056-6) contains supplementary material, which is available to authorized users.


Background
Pathogenic bacteria rely on a variety of secretion systems to transport virulence factors, proteins that mediate host-pathogen interactions, across their hydrophobic cell membranes to sites where they can interact with the host. Gram-positive bacteria need only transport proteins across a single membrane, but Gram-negative bacteria require specialized secretion machinery that spans both inner and outer membranes. Mycobacterium tuberculosis, the causative agent of tuberculosis, was recently re-classified as a diderm bacterium when it was shown to have an outer membrane bilayer, referred to as the mycomembrane, composed largely of mycolic acids [1]. To transport key virulence factors across both membranes, M. tuberculosis has evolved specialized Type VII secretion systems (T7SS). The T7SSs were discovered through attenuated strains of M. tuberculosis deficient in EsxA (ESAT-6, early secreted antigenic target of 6 kDa) secretion and are commonly called ESAT six (ESX) secretion systems [2][3][4]. In M. tuberculosis there are five gene clusters, named ESX-1 to ESX-5, which encode T7SS. Each gene cluster encodes a number of proteins that are either secreted or are building blocks for the secretion apparatus. ESX-1 is responsible for secretion of the important virulence factors EsxA and EsxB as well as other virulence-associated proteins (e.g., EspB, EspF, EspJ) that are secreted to the cell surface or extracellular milieu based on recognition of a conserved C-terminal signal sequence on the secretion substrates [5][6][7][8]. These secreted factors have been linked to mycobacterial virulence through studies of the attenuated BCG strain of M. tuberculosis [2,4,9]; in non-pathogenic Mycobacterium smegmatis the orthologous ESX-1 system is involved in conjugation [10,11]. ESX-3 is critical for mycobacterial survival due to its role in metal acquisition [12][13][14].
ESX-5 is important for the secretion of many members of the PE/PPE family of proteins that also play a role in virulence and cell wall integrity [15][16][17]. The functional role of ESX-2 and ESX-4 is still unknown although ESX-4 appears to be the ancestral system from which the other ESX systems have evolved [18].
All ESX gene clusters contain at least three or four ESX conserved components (Ecc), named EccB, EccC, and EccD, with EccE being present in all ESX systems with the exception of ESX-4 [19]. Multiple copies of each core protein as well as other T7SS-associated proteins are present in the core complex, resulting in a large ~1500 kDa particle [20]. The function of some core components is known; for example, EccC is a member of the FtsK/SpoIIIE-like ATPase family and provides the energy to transport proteins across the mycobacterial membrane(s) [21,22]. EccD contains an N-terminal cytoplasmic domain followed by 11 predicted transmembrane helices, and may form the cytoplasmic membrane channel through which cargo proteins are secreted. The functions of EccB and EccE within the secretion apparatus are less clear. These proteins both have N-terminal transmembrane elements and large C-terminal regions predicted bioinformatically to be localized in the periplasm, but their molecular structures and interacting partners remain unknown.
Understanding the T7SS architecture is critical for development of new antitubercular agents. Currently, no structural data are available for three of the four conserved components: EccB, EccD, and EccE. In this study we report the molecular structures of the periplasmic domain of EccB1 and the cytoplasmic domain of EccD1 from the ESX-1 cluster. The structures reveal probable functional surfaces of EccB1, and an unexpected dimerization by EccD1. Here we describe these structures in detail and how they might fit into the larger context of the T7SS.

Results and discussion
Structure of EccB1
M. tuberculosis EccB1 (Rv3869) is a 51 kDa protein containing a 40 amino acid (aa) N-terminal domain followed by a single membrane-spanning helix and a ~400 aa C-terminal fold. EccB1 is annotated as a protein domain of unknown function (DUF690) in the Pfam database [23]. To gain further insight into the role of EccB1 within the ESX machinery we determined the crystal structure of the C-terminal domains of EccB1 from M. tuberculosis (EccB1mt) to 1.7 Å resolution and of the orthologous protein (MSMEG_0060; EccB1ms) from the nonpathogenic mycobacterial species M. smegmatis to 3.07 Å resolution. Both EccB1 structures contain a single elongated fold in the shape of a distorted propeller, which has an unanticipated quasi 2-fold symmetry (Fig. 1). A structural comparison of the EccB1mt and EccB1ms structures shows that they are highly similar, with an r.m.s.d. of 2.7 Å for the superposition of 381 amino acids (Dali Z-score 42.2); there is considerable variability in the conformation of the extensive unstructured loops connecting secondary structure elements, which are themselves relatively well conserved (Fig. 2). Five domains are present in the structures, including a core domain flanked by two repeat domains on either side. The central core domain consists of a 6-stranded β-sheet with 5 strands (β7-β19-β18-β5-β6) arranged in antiparallel fashion and an additional strand (β21) parallel to strand β6 on the periphery of the sheet; the sheet is further stabilized by a disulfide bond between its two central strands (β5 and β18), formed between Cys150 and Cys345 (EccB1mt) or Cys152 and Cys347 (EccB1ms). The four repeat domains each contain a 4-stranded β-sheet and two α-helices (Fig. 1c).
Repeat 1 (R1) (residues S74-M124) and repeat 4 (R4) (residues G391-L445) are located between the core domain and the N-terminal transmembrane region, while repeat 2 (R2) (residues E185-P241) and repeat 3 (R3) (residues V267-A320) are located on the opposite side of the core domain, distal to the transmembrane region. The interfaces between the R1/R4 and R2/R3 domains are formed by hydrophobic residues on the N-terminal helices of each repeat that pack against each other and against hydrophobic residues from the proline-rich strands downstream of each repeat's C-terminal helix. The R2 and R3 domains also pack tightly with the core domain via residues on their N-terminal helices as well as β-sheet residues. The tight packing involving residues on either side of multiple repeat domains gives EccB1 a stable fold with a continuous hydrophobic core and an elongated pseudo-symmetrical shape.
A comparison of the repeat domains of EccB1mt gives clues to the evolution of the protein. Pairwise sequence alignments of the repeats show that R2, R3, and R4 share 26, 33 and 27 % sequence identity, respectively, with R1. Pairwise alignments comparing R2-R4 to all other repeats revealed that only R1 has significant identity to all 3 other domains (Fig. 3). Therefore, it appears that R1 is the ancestral domain, with R3 sharing more conserved features with R1 than do either R2 or R4. EccB1ms contains a corresponding set of repeats in the same arrangement as seen in EccB1mt; ordered from membrane-proximal to membrane-distal these are R1 (residues Q75-K127), R4 (residues G392-L447), the central core domain, R2 (residues Q187-P243), and R3 (residues G267-E323). EccB1 does not bear significant sequence similarity to any protein of known structure, and Dali searches using the complete EccB1 structures revealed no proteins with significant structural homology. However, Dali searches using only EccB1mt repeat 1 (S74-P124) revealed weak homology (r.m.s.d. 2.7 Å and Dali Z-score of 5.0) to the N-terminal domain of PlyCB (PDB 4F87, residues 14-70) from streptococcal C1 bacteriophage [24]. Eight PlyCB monomers assemble into a ring that associates with the bacterial cell wall and facilitates phage egress by tethering the degradative PlyCA subunit to the bacterial cell wall. The structural similarity between the two proteins and their common localization to bacterial cell envelope structures is intriguing, but no clues to EccB1 function are apparent from our examination of PlyCB.
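The percent identities quoted for the repeat domains depend on how identity is counted over an alignment. The following is a minimal Python sketch of one common convention (identical residues divided by the number of columns in which neither sequence has a gap). The function name `percent_identity` and the toy sequences are hypothetical illustrations, not the EccB1 repeats, and the text does not state which convention or software was used for the reported values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and both strings come from the same
    pairwise alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (NOT taken from EccB1): 8 gap-free columns, 6 identical.
toy_r1 = "SLFTD-RGQV"
toy_rx = "SLYTDAKG-V"
print(percent_identity(toy_r1, toy_rx))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gapped alignments; comparing identities across studies only makes sense when the denominator is the same.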

Structure of EccD1mt
EccD1mt (Rv3877) is a 54 kDa protein containing a ~110 amino acid (aa) N-terminal ubiquitin-like domain followed by a 30 aa linker and 11 closely spaced transmembrane helices at its C-terminus. The ubiquitin-like domain of EccD1 classifies it as a member of the YukD family within the Pfam database. Based on the characteristics of the transmembrane regions, the N-terminal portion of EccD1 is predicted to be localized in the cytoplasm.

Fig. 1 Overall structure and repeat domains of EccB1mt. a Domain organization of EccB1. The predicted transmembrane helix is indicated by a shaded rectangle. The protein variants used for structure determination are shown as horizontal lines. b Overall structure of EccB1mt. The structure is shown in cartoon representation with the central core domain in grey and repeat domains R1-R4 colored red, orange, green, and blue, respectively. The disulfide bond between Cys150 and Cys345 is shown as yellow spheres. c Repeat domains R1-R4 have a common fold. The isolated repeat domains are shown in the same orientation after superposition of repeats R2-R4 on repeat R1 using Chimera [52]

Fig. 2 Superposition of EccB1mt and EccB1ms structures. a EccB1mt (grey) and EccB1ms (blue) were superimposed using Chimera. b Structure-based sequence alignment of EccB1mt and EccB1ms prepared with ESPript (http://espript.ibcp.fr) [53] with numbering and secondary structure elements derived from the EccB1mt sequence and structure
We grew crystals of the predicted cytoplasmic domain of EccD1 from M. tuberculosis (cyto-EccD1mt), which diffracted to 1.88 Å. However, we could not obtain crystals of Se-Met-containing cyto-EccD1mt, and attempts to perform heavy atom soaks of the fragile native crystals of cyto-EccD1mt were unsuccessful. Therefore, we obtained crystals and determined the structure of cyto-EccD1mt fused to maltose binding protein (MBP) at a resolution of 2.20 Å by molecular replacement using an MBP structure (PDB ID 1ANF) as the search model [25]. We subsequently solved the 1.88 Å cyto-EccD1mt structure by molecular replacement using the EccD1mt segment of the MBP fusion protein. In both structures EccD1mt residues 20-109 adopt an identical ubiquitin-like fold characterized by a β-grasp motif and an anti-parallel β-sheet with strands in the order 2,1,5,3,4 (Fig. 4). The MBP fusion protein used as a crystallization aid provides additional crystallization contacts, but it does not perturb the fold of cyto-EccD1mt (Fig. 4d,e). The two EccD1mt structures are superimposable with an r.m.s.d. of 0.7 Å over 90 residues and a Dali Z-score of 18.8.
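The r.m.s.d. values quoted for structure comparisons are root mean square deviations over paired atoms after the two structures have been optimally superposed. The sketch below, in plain Python, illustrates only the final step: computing the r.m.s.d. for two coordinate lists that are assumed to be already superposed and matched atom for atom. The function name `rmsd` and the toy coordinates are hypothetical; the superposition itself (performed by Dali or Chimera in the text) is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two lists of (x, y, z) tuples.

    Assumption: the lists are the same length, paired atom for atom, and the
    structures have already been superposed by an external tool.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates in Å: every atom is displaced by 1 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(toy_a, toy_b))
```

Because the value is an average over the atom pairs included, an r.m.s.d. is only interpretable together with the number of residues superposed, which is why the text reports both.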
Interestingly, the asymmetric unit of both crystal forms contains two EccD1 molecules, and in both crystal forms the two EccD1 molecules are arranged as a head-to-tail homodimer stabilized by an extensive interface. The interface is formed by interlocking side chains from β-strands 1 and 2 and the N-terminal α-helix of both EccD1 molecules (Fig. 4), and ~650 Å² of each EccD1 molecule (13 % of the total surface) is buried in the interface as calculated with the PISA webserver [26]. The interaction is stabilized by 4 hydrogen bonds and a cluster of buried hydrophobic residues including Met1, Val54, and Val58, resulting in a solvation energy of −13.9 kcal/mol and a Complex Significance Score of 1.0 calculated by the PISA server. The extensive nature of the interface and its recurrence in both crystal forms, with or without the MBP fusion, suggest that EccD1 is a natural homodimer.
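As a quick consistency check on the interface figures above, the buried area and the buried fraction together imply the approximate solvent-accessible surface of one cyto-EccD1 monomer. The short Python sketch below works through that arithmetic; the function name `implied_total_surface` is hypothetical, and the derived totals are back-calculated from the rounded values in the text rather than taken from the PISA output.

```python
def implied_total_surface(buried_area: float, buried_fraction: float) -> float:
    """Total surface area implied by a buried area and the fraction it represents."""
    if not 0.0 < buried_fraction <= 1.0:
        raise ValueError("buried_fraction must be in (0, 1]")
    return buried_area / buried_fraction


BURIED_PER_MONOMER = 650.0  # Å², approximate value quoted in the text
BURIED_FRACTION = 0.13      # 13 % of the monomer surface, as quoted in the text

monomer_surface = implied_total_surface(BURIED_PER_MONOMER, BURIED_FRACTION)
total_buried_in_dimer = 2 * BURIED_PER_MONOMER  # both monomers contribute

print(round(monomer_surface))   # roughly 5000 Å² per monomer
print(total_buried_in_dimer)    # roughly 1300 Å² buried across the dimer
```

Since both inputs are rounded, the implied monomer surface of about 5000 Å² should be read as an estimate only.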
Dimerization of cyto-EccD1mt creates a wide open-ended groove bordered on two sides by the α1/β3 loops (Fig. 5). The floor of the groove is formed by the two α-helices. Notably, the dimerization interface brings acidic residues (Glu45, Asp49, Asp50, Glu57, Glu60, and Asp61) from both chains into this groove. These acidic residues are not offset by any basic residues in this region and thus create a highly negative surface (Fig. 5b).

Putative function of EccB1 and EccD1
Mutations in EccB3 of the ESX-3 secretion system have been shown to confer drug resistance in M. tuberculosis [27]. The mutations found to confer resistance (Arg14Leu and Asn24His) occur in the small cytoplasmic domain preceding the transmembrane element of EccB3, a region not present in our EccB1 constructs, which contain only the soluble periplasmic domain. The fact that mutations in this region confer drug resistance indicates an important function for this short region, perhaps in mediating interactions with other cytoplasmically exposed components of the T7SS. The elongated shape and continuous hydrophobic core of EccB1 suggest that it may serve a structural role, perhaps forming part of a structure that spans the inner and outer membrane components of the ESX secretion system. The structural similarities between PlyCB, the viral cell wall binding protein complex, and EccB1 hint that EccB1 may also bind elements of the peptidoglycan layer, but there are not yet any experimental data to support this idea. However, post-translational modification of secreted bacterial proteins with O-linked polysaccharides has been shown to be important for solubility or for maintaining subcellular localization to the cell wall [28,29]. EccB1 contains 24 putative glycosylation sites, as predicted by the NetOGlyc webserver [30], and many of these are surface-exposed in the EccB1 structures (including Ser143, Thr144, Ser351, and Ser356). While this manuscript was in preparation, ATPase activity of EccB1 was reported [31]. Further studies are needed to define the precise role of EccB1 in the context of a functional ESX-1 secretion complex.
The dimerization of the cytoplasmic domain of EccD1 raises interesting possibilities regarding the nature of the transmembrane pore. Each EccD1 monomer has 11 transmembrane elements, so a dimer would have a total of 22 transmembrane elements. Each monomer may form an independent pore, resulting in a pair of closely associated channels, or the transmembrane elements may together comprise a single, larger transmembrane channel. The cytoplasmic domain itself is connected to the first transmembrane element by a 30 amino acid linker that may facilitate protein-protein interactions, either with the cytoplasmic EccD1 domain or with other components of the secretion system, or it may simply form an extended tether allowing increased mobility of the ubiquitin-like domains.

The negatively charged groove of the EccD1 dimer suggests that it associates with one or more positively charged partners. It may act to recruit other T7SS components or secretion substrates with positively charged patches into the system, or it may be part of a gating element required to close the channel during periods of inactivity. The residues contributing to the negatively charged groove are not conserved in EccD1 homologs from other ESX systems, indicating that they may serve a system-specific role. Indeed, the ESX-1 locus encodes a variety of secretion substrates not found in the paralogous M. tuberculosis ESX systems, and thus it is likely that the ESX-1 system has structural adaptations to enable the secretion of these substrates [6][7][8]32]. As more structures of ESX-1 components are determined, likely partners for interaction with the EccD1 dimer may be revealed.

Fig. 3 Structure-based sequence alignment of repeat domains of EccB1mt. Alignment was rendered using ESPript. Amino acid numbering above the alignment refers to the repeat domain R1 sequence and indicated secondary structure elements are derived from the repeat domain R1 structure

Fig. 4 Structure of the cytoplasmic domain of EccD1mt. a Domain organization of EccD1. The predicted transmembrane helices 1-11 are indicated by shaded rectangles. The protein construct used for crystallization is shown as a horizontal line. b cyto-EccD1mt monomer in cartoon representation colored in rainbow colors from N-terminus (blue) to C-terminus (red). The secondary structure elements are labeled. c cyto-EccD1mt dimer in cartoon representation with acidic residues shown in stick representation (see Fig. 5). d MBP-cyto-EccD1mt dimer in cartoon representation with MBP moieties colored in grey and cyto-EccD1mt domains colored in blue and purple. e A close-up view of the MBP-cyto-EccD1mt dimer. The orientation corresponds to panel c

Conclusions
In summary, we have determined the structures of soluble domains of two integral, conserved components, EccB1 and EccD1, of the ESX-1 secretion channel. Given the importance of the ESX-1 secretion system to mycobacterial virulence, our structures provide crucial information about the molecular makeup of this important protein complex that will aid future drug development efforts.

Expression and purification of EccB1mt
A construct for expression of the periplasmic domain of EccB1mt (residues 72-463) was designed based on the transmembrane helix predicted with the TOPCONS server [33], secondary structure prediction with the JPred4 server [34], and the sequence alignment of EccB1 orthologs (Additional file 1: Figure S1). The DNA fragment was PCR-amplified from M. tuberculosis H37Rv genomic DNA using primers EccB1_F72_Nco 5′-CACCATGGGCACCAGCCTGTTCACCGACC and EccB1_RS463_Hind 5′-GCAAGCTTACAGCGTGTCGTGCTCGAGCAG, and cloned into a modified pET-22b(+) vector (Novagen), which contains the Escherichia coli DsbA signal sequence, a hexahistidine tag and a tobacco etch virus (TEV) protease cleavage sequence.
EccB1mt was expressed in the E. coli Rosetta2(DE3) strain using LB media and 0.5 mM IPTG for induction. Cells were harvested after 4 h incubation at 18 °C, resuspended in 20 mM Tris-HCl pH 8.0, 300 mM NaCl buffer, and lysed using a microfluidizer (Avestin). EccB1mt was purified via a Ni-NTA affinity column, incubated with TEV protease to remove the hexahistidine tag, passed over a Ni-NTA column to remove uncleaved protein, and further purified by size exclusion chromatography using a Superdex 200 column (GE Healthcare). Protein was flash-frozen using liquid nitrogen and stored at −80 °C.

Fig. 5 Dimerization of cyto-EccD1mt creates a negatively charged groove. a cyto-EccD1mt dimer is shown in cartoon representation underneath a semitransparent surface. Clustered acidic residues are shown in stick representation. b Electrostatic surface calculated using the APBS server [54] with protonation states at pH 7.0 assigned by PROPKA [55]. The surface was colored +10 eV (blue) to −10 eV (red)

Crystallization and structure determination of EccB1mt
Crystals were grown using the sitting drop vapor diffusion method with precipitant containing 0.1 M Tris-HCl pH 5.6, 15 % PEG2000 MME, 10 mM NiCl2. Crystals were transferred to crystallization solution supplemented with 20 % glycerol, or with 20 % glycerol and 0.5 M NaI [35], and flash-frozen in liquid nitrogen.
Data were collected at the 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory, and processed using XDS [36] and HKL-3000 [37]. Iodide ion positions were determined using SHELXD [38] as implemented in HKL-3000, and phases were calculated using SHARP [39]. The model was built using Buccaneer [40] and Coot [41], and refined with REFMAC5 [42] using TLS groups defined by the TLSMD server [43]. The final structure includes residues 74-458.

Expression and purification of EccB1ms
The periplasmic domain (residues S73-G479) of the MSMEG_0060 gene was PCR-amplified from M. smegmatis mc²155 genomic DNA with the gene-specific primers MsEccB1.For. 5′-AACCTGTATTTCCAGAGTAGTGACCAGCTGCTGGTGG and MsEccB1.Rev. 5′-TTCGGGCTTTGTTAGCAGTTAGCCCTCCCCGCTCG and cloned into the pMAPLe4 expression vector [44], which appends a TEV protease cleavable hexahistidine tag to the N-terminus of the target protein, using the Gibson ISO cloning method [45]. The sequence of the expression clone was verified by DNA sequencing (Genewiz, Piscataway, NJ). Recombinant protein was overexpressed in E. coli BL21(DE3) in 1 L Terrific broth cultures; expression was induced at an OD600 of 1.0 by adding IPTG to 0.5 mM. Cell growth was continued overnight at 18 °C. The following day the cells were harvested by centrifugation, resuspended in Buffer A (20 mM Tris, pH 8.0, 300 mM NaCl, 10 % glycerol) containing 10 mM imidazole, 1 mM EDTA and Complete protease inhibitor, and lysed by sonication. The lysate was clarified by centrifugation (15,000 × g, 30 min, 4 °C) and the supernatant was loaded on a Ni-NTA affinity column equilibrated in Buffer A. After extensive washing the bound protein was eluted with Buffer B (Buffer A containing 250 mM imidazole). The target protein was further purified by size exclusion chromatography using a Sephacryl S-100 column (GE Healthcare) equilibrated in Buffer A.

Table 1 (footnotes) Half-set correlation coefficient CC1/2 as defined in Karplus and Diederichs [56] and calculated using XSCALE [36] or Scala [57]. c Calculated using the MolProbity server (http://molprobity.biochem.duke.edu) [58]

Crystallization and structure determination of EccB1ms
Crystals of EccB1ms were grown using the hanging drop vapor diffusion method by mixing protein and reservoir solution (14 % PEG 8000, 200 mM NaCl, 100 mM phosphate-citrate pH 4.2) at a 1:1 ratio. Crystals were cryoprotected by a brief soak in reservoir solution containing 20 % propylene glycol. Data from a single crystal were collected at beamline 24-ID-C at the Advanced Photon Source, Argonne National Laboratory. The data were processed with XDS [36] and the structure was solved by molecular replacement using the program Phaser [46] and a homology model, prepared with the Phyre2 web server [47], based on the structure of M. tuberculosis EccB1 (PDB ID 4KK7). The structure was refined with BUSTER [48].

Expression and purification of cyto-EccD1mt
A construct for expression of the cytoplasmic domain of EccD1mt (residues 21-109) was designed based on the ubiquitin-like domain predicted with the HHpred server [49]. The DNA fragment was PCR-amplified from M. tuberculosis H37Rv genomic DNA using primers EccD1_F21_Nco 5′-CACCATGGCCACCACCCGGGTGACGATC and EccD1_R109_SpeEcoR 5′-GGGAATTCACTAGTCATGACACCAGAGTCAGCAGTGAC, and cloned into a modified pET-Duet1 vector, which contains an N-terminal hexahistidine tag and TEV protease cleavage sequence. To create a maltose-binding protein (MBP) fusion construct, the same DNA fragment was cloned into a modified pET-22b(+) vector, which contains an N-terminal hexahistidine tag and TEV protease cleavage sequence followed by the MBP sequence. Both cyto-EccD1mt and MBP-cyto-EccD1mt proteins were expressed and purified as described for EccB1mt. Maltose (5 mM) was included in the size-exclusion buffer during purification of the MBP-cyto-EccD1mt variant to obtain ligand-bound MBP [50].
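The M. tuberculosis primer names above end in tags such as Nco, Hind, Spe and EcoR. Assuming these tags refer to the restriction enzymes NcoI, HindIII, SpeI and EcoRI (an inference from the names, not stated in the text), the Python sketch below checks which recognition sequences occur in each primer. The names `SITES`, `PRIMERS` and `sites_in` are hypothetical helpers introduced only for this illustration; the primer sequences are copied from the text with internal spaces removed.

```python
# Standard recognition sequences (5'->3') for the enzymes assumed from the primer names.
SITES = {
    "NcoI": "CCATGG",
    "HindIII": "AAGCTT",
    "SpeI": "ACTAGT",
    "EcoRI": "GAATTC",
}

# Primer sequences as given in the text.
PRIMERS = {
    "EccB1_F72_Nco": "CACCATGGGCACCAGCCTGTTCACCGACC",
    "EccB1_RS463_Hind": "GCAAGCTTACAGCGTGTCGTGCTCGAGCAG",
    "EccD1_F21_Nco": "CACCATGGCCACCACCCGGGTGACGATC",
    "EccD1_R109_SpeEcoR": "GGGAATTCACTAGTCATGACACCAGAGTCAGCAGTGAC",
}


def sites_in(primer: str) -> dict:
    """Return {enzyme: 0-based position of the first match} for sites found in a primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for primer_name, primer_seq in PRIMERS.items():
    print(primer_name, sites_in(primer_seq))
```

In each primer the site sits two bases from the 5′ end, and the NcoI site (CCATGG) contains an ATG that can serve as a start codon, a common design choice for in-frame cloning.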
Crystallization and structure determination of MBP-cyto-EccD1mt and cyto-EccD1mt
Crystals of cyto-EccD1mt were obtained by the sitting drop vapor diffusion method using 0.1 M Tris-HCl pH 8.5, 0.2 M Mg chloride, 30 % PEG4000 as precipitant. Crystals were cryoprotected using crystallization solution supplemented with 10 % glycerol, and vitrified in liquid nitrogen. Crystals grew as thin hexagonal plates and were mounted in cryo-loops with a 60° tilt (MiTeGen) to avoid overlapping reflections along the crystallographic c axis (Table 1). Crystals of MBP-cyto-EccD1mt were obtained by the sitting drop vapor diffusion method using 0.1 M HEPES pH 7.5, 1.4 M Na citrate, and cryoprotected using crystallization solution supplemented with 20 % glycerol.
Data were collected at the 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory, and processed using XDS [36]. The structure of MBP-cyto-EccD1mt was solved by molecular replacement using Phaser [46] and an MBP structure as a search model (PDB ID 1ANF) [25]. Electron density modification was performed using Parrot [51], and the model was extended using Buccaneer and Coot. The fragment corresponding to cyto-EccD1mt from the structure of MBP-cyto-EccD1mt was used as a search model to solve the structure of cyto-EccD1mt alone using Phaser. The structures were refined using REFMAC5 and TLS groups defined by the TLSMD server.

Additional file
Additional file 1: Figure S1. Sequence alignment of EccB orthologs from M. tuberculosis H37Rv. The secondary structure elements of EccB1mt are shown at the top of the alignment. The conserved Cys residues are highlighted in blue. The vertical arrows indicate the beginning and end of the EccB1mt expression construct which was used for crystallization. (PDF 612 kb)

Abbreviations
T7SS: type VII secretion system; ESAT-6: early secreted antigenic target of 6 kDa; ESX: ESAT six; Ecc: ESX conserved component; MBP: maltose binding protein; r.m.s.d.: root mean square deviation.

Competing interests
The authors declare that they have no competing interests.