Cloning, expression, purification and activity detection of CaNce103p and its truncated versions
To isolate a sufficient amount of CaNce103p for structural studies and facilitate preparation of CaNce103p variants, we developed a recombinant expression system in E. coli. The vector pET22b-wtNCE103 was constructed by inserting the CaNce103p coding sequence into a pET22b vector from which the pelB signal sequence for periplasmic localization had been removed. The stop codon at the end of the coding sequence was preserved to prevent attachment of a His-tag to the resulting protein. Four histidine residues naturally occurring at the C-terminus of CaNce103p were sufficient to enable purification using affinity chromatography on a column charged with Ni²⁺ cations.
Truncated versions of CaNce103p were prepared to facilitate crystallization. ScNce103p, the homologous CA from S. cerevisiae, was successfully crystallized only upon removal of the first 13 amino acids, which are not conserved among β-CAs [15]. The non-conserved N-terminal part of CaNce103p is even longer than that of ScNce103p (Fig. 1). The vectors pET22b-∆29_CaNCE103, pET22b-∆48_CaNCE103 and pET22b-∆61_CaNCE103 were prepared to express C. albicans CA lacking the first 29, 48 and 61 amino acids, respectively.
Truncation by 29 and 61 amino acids was based on secondary structure prediction performed using the program CFSSP [17] (Fig. 1). Deletion of the first 29 residues removes the entire helix predicted to occur at the N-terminus. Deletion of 61 residues removes all the residues before the beginning of the next predicted helix (Fig. 1). Deletion of the first 48 amino acids was motivated by multiple sequence alignment of CAs. The N-terminal part of an orthologous enzyme from the pathogenic yeast Candida parapsilosis contains a KR motif, which is a potential target for subtilisin-like processing proteinases. Deleting the first 48 amino acids of CaNce103p removes the homologous position, although CaNce103p does not have a KR motif in its N-terminal part.
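The construct design described above can be illustrated with a short sketch. It is only an illustration, assuming a placeholder sequence: `TOY_SEQUENCE` is not the CaNce103p sequence, and the helper names `truncate_n_terminus` and `find_motif` are hypothetical. It shows how N-terminal truncations of 29, 48 and 61 residues are derived from a full-length sequence and how a dibasic KR motif can be located by its 1-based position.

```python
# Illustrative sketch only: TOY_SEQUENCE is a made-up placeholder,
# NOT the real CaNce103p sequence, and the helper names are hypothetical.

def truncate_n_terminus(sequence, n_removed):
    """Return the sequence lacking its first n_removed residues."""
    if not 0 <= n_removed <= len(sequence):
        raise ValueError("n_removed must lie between 0 and the sequence length")
    return sequence[n_removed:]


def find_motif(sequence, motif="KR"):
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = sequence.find(motif, start + 1)
    return positions


# Placeholder protein: an 11-residue N-terminus containing one KR motif,
# followed by 60 glycines so that all three truncations are possible.
TOY_SEQUENCE = "MSTNKRLLAEQ" + "G" * 60

# The three truncation lengths used in the text.
constructs = {n: truncate_n_terminus(TOY_SEQUENCE, n) for n in (29, 48, 61)}

for n, seq in constructs.items():
    print(f"delta{n}: {len(seq)} residues, KR motifs at {find_motif(seq)}")
```

In practice the same check would be run on the aligned orthologous sequences, so that a motif position in one enzyme can be mapped to the homologous position in another.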
To obtain maximal protein yield, we optimized culture conditions including temperature (20–37 °C), cultivation time post-induction (4–20 h) and IPTG concentration (0.2–1.0 mM). The highest yields of soluble CaNce103p, ∆29_CaNce103p, ∆48_CaNce103p and ∆61_CaNce103p were obtained 24 h post-induction with 0.4 mM IPTG with cultivation at 20 °C. The presence of the disulfide bond-reducing reagent β-mercaptoethanol was necessary to keep the protein in a soluble state during the purification process. All CaNce103p variants were purified (Fig. 2a), with average yields of 12–23 mg purified protein per liter of culture for CaNce103p, ∆29_CaNce103p and ∆48_CaNce103p and 2–8 mg/L for ∆61_CaNce103p.
The activity of full-length and truncated versions of CaNce103p, measured using the stopped-flow pH/dye indicator method [15] (Fig. 3), indicated that truncation by 29 or 48 amino acids did not alter enzyme activity relative to wild-type full-length CaNce103p. Truncation by 61 amino acids rendered the enzyme inactive: the reaction in the presence of Δ61_CaNce103p proceeded at a rate similar to that of the spontaneous hydration of CO₂. The activity measurements also confirmed that β-mercaptoethanol does not negatively influence enzyme activity.
Assessment of the oligomeric structure of Δ29_CaNce103p
Structures of fungal CAs solved to date indicate that these oligomeric enzymes are composed of an even number of identical subunits. We determined the oligomeric state of CaNce103p variants using size-exclusion chromatography. ∆29_CaNce103p formed tetramers (Fig. 3b), the highest-order oligomers observed in this study, and was the only variant present as a monomer, dimer and tetramer under our experimental conditions. WT_CaNce103p occurred only as a dimer and tetramer. ∆48_CaNce103p occurred only as a monomer and a precipitate; ∆61_CaNce103p was present only as a precipitate. These findings suggest that the N-terminus is important for folding and oligomerization of CaNce103p.
Protein crystallization
Protein crystallization was facilitated by removal of the first 29 amino acids. Our attempts to crystallize full-length CaNce103p or the variants truncated by 48 or 61 residues were unsuccessful. WT_CaNce103p, ∆48_CaNce103p and ∆61_CaNce103p formed precipitates or very small crystals with skin on the drop. After repeated unsuccessful attempts under a variety of crystallization conditions, we decided to focus on ∆29_CaNce103p. Purified ∆29_CaNce103p was enzymatically active and crystallized in the form of needles belonging to the space group P2₁2₁2₁ (Tab. 2), which allowed determination of the structure at 2.2 Å resolution.
Overall architecture
The overall structure of CaNce103 is similar to that of CAS1, a β-CA from the plant fungal pathogen Sordaria macrospora [8]. It also resembles structures of β-CAs from red algae [17] and bacteria including Escherichia coli, Vibrio cholerae and Haemophilus influenzae [18,19,20]. CaNce103 is a complex of four identical subunits organized as a dimer of dimers, in which the dimerization and tetramerization surfaces are mutually perpendicular (Fig. 4a). The subunits in each dimer are interlocked by their N-terminal arms, consisting of two perpendicular helices stretched from the rest of the molecule over the neighboring subunit (Fig. 4b, c). Each monomer provides more than 90 residues to make contact with its dimerization partner, creating an interface of 3458 Å². We calculated the interaction energy stabilizing the dimer to be −48.8 kcal/mol. Association of two dimers in a tetramer is not as strong; it relies on 33 residues forming an interface of 996 Å². The tetramer is stabilized by an interaction energy of −11.2 kcal/mol.
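The relative strength of the two interfaces can be made explicit with simple arithmetic on the values reported above. The sketch below is not part of the original analysis; the dictionary layout and the helper name `energy_per_area` are assumptions introduced for illustration, and only the four numbers come from the text.

```python
# Arithmetic on the interface values quoted in the text.
# The data layout and helper name are illustrative assumptions.

interfaces = {
    "dimer": {"area_A2": 3458.0, "energy_kcal_mol": -48.8},
    "tetramer": {"area_A2": 996.0, "energy_kcal_mol": -11.2},
}


def energy_per_area(interface):
    """Interaction energy per unit buried area, in cal/mol per square angstrom."""
    return 1000.0 * interface["energy_kcal_mol"] / interface["area_A2"]


# How much larger and how much more stabilizing the dimer interface is.
area_ratio = interfaces["dimer"]["area_A2"] / interfaces["tetramer"]["area_A2"]
energy_ratio = (interfaces["dimer"]["energy_kcal_mol"]
                / interfaces["tetramer"]["energy_kcal_mol"])

print(f"area ratio (dimer/tetramer):   {area_ratio:.2f}")
print(f"energy ratio (dimer/tetramer): {energy_ratio:.2f}")
for name, interface in interfaces.items():
    print(f"{name}: {energy_per_area(interface):.1f} cal/mol per A^2")
```

By this estimate the dimer interface is roughly 3.5 times larger and about 4.4 times more stabilizing than the tetramer interface, so the difference in stability reflects mainly the difference in buried area.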
The central part of each monomeric subunit is formed by a β-sheet consisting of four parallel strands and one antiparallel strand. This conserved β-structure is flanked on both sides by α-helices. The C-terminal part adjacent to the β-sheet domain is mostly helical. Each monomer contains one zinc atom located in the active site at the bottom of a narrow tunnel, as in ScNce103p [15].
While ∆29_CaNce103p was the only variant that successfully crystallized, the solved structure corresponds to CaNce103p lacking the first 60 amino acids. This indicates the high flexibility of the N-terminal part of the enzyme.
CaNce103p active site
The active site of Δ29_CaNce103p is formed by the catalytic Zn²⁺ coordinated by the Sγ atom of Cys 106, the Nε2 atom of His 160 and the Sγ atom of Cys 163, located 2.3 Å from the acceptor atom. According to these data, CaNce103p appears to be a member of the type I β-CAs, the active sites of which are typically formed by two cysteines, one histidine and a fourth ligand, usually water, acetic acid or acetate ion [7]. However, in the present structure, a molecule of β-mercaptoethanol originating from the crystallization buffer fills the fourth position (Fig. 4d).
The zinc coordination sphere is located near the dimer interface. Two of the residues contributing to the zinc coordination sphere are located at the tips of β-strands (Cys 106 at β1 and His 160 at β3). Cys 163 is located outside of the β-sheet core. The catalytic site is surrounded by amino acids located between the α2 and α4 helices of the contributing monomer units (Figs. 1 and 5a). The contributing residues, most of which are hydrophobic (monomer providing zinc ion ligands: Ile 129, Gly 165; neighboring monomer: Phe 146, Leu 151), create a narrow tunnel (Fig. 4b), which serves as the only point of entry to the positively charged active site (Fig. 5d). The tunnel’s shape and openness may be regulated by the Arg 111 – Asp 163 salt bridge that also contributes to formation of the active site cavity. This salt bridge may function as a pH-dependent regulator of the catalytic activity of Δ29_CaNce103p [21].
Comparison of CaNce103p to other fungal carbonic anhydrases
We aligned the crystal structure of Δ29_CaNce103p with other known β-CA structures from Cryptococcus neoformans (Can2; PDB code: 2W3N), Saccharomyces cerevisiae (ScNce103p; PDB code: 3EYX) and Sordaria macrospora (CAS1; PDB code: 4O1J). The alignment revealed very high similarity among these homologs. The monomer subunits of all structures are nearly identical, although significant differences occur in the N-terminal part. Of the CAs characterized to date, ScNce103p shares the highest sequence homology with Δ29_CaNce103p (Fig. 1). At the overall structural level, however, Δ29_CaNce103p is more closely related to CAS1, which also forms a tetramer. The root mean square deviation (RMSD) for the superposition of 142 Cα atoms of these proteins is 0.7 Å. CAs from S. cerevisiae and C. neoformans form dimers, and RMSD values for superposition of ScNce103p and Can2 with CaNce103p are 1.2 Å for 124 Cα pairs and 1.1 Å for 132 Cα pairs, respectively.
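For readers unfamiliar with how such RMSD values arise, the following is a minimal sketch of a least-squares superposition RMSD over paired Cα coordinates. It is not the software used in this study. It assumes that equivalent atom pairs have already been chosen, uses Horn's quaternion formulation with a plain power iteration to find the optimal rotation, and runs on toy coordinates rather than the deposited structures. The function names `centered` and `superposed_rmsd` are hypothetical.

```python
import math

# Minimal sketch of an optimal-superposition RMSD (Horn's quaternion method).
# Function names and toy coordinates are illustrative assumptions.


def centered(coords):
    """Translate a list of (x, y, z) points so that their centroid is the origin."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - centroid[k] for k in range(3)] for p in coords]


def superposed_rmsd(mobile, reference, iterations=500):
    """RMSD between paired points after optimal rigid-body superposition."""
    if len(mobile) != len(reference) or not mobile:
        raise ValueError("need two non-empty coordinate lists of equal length")
    x, y = centered(mobile), centered(reference)
    n = len(x)
    # Sum of squared distances of all atoms from their own centroids.
    g = sum(v * v for p in x for v in p) + sum(v * v for p in y for v in p)
    # 3x3 correlation matrix between the two centered coordinate sets.
    s = [[sum(a[i] * b[j] for a, b in zip(x, y)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue is the maximal
    # overlap between the two sets achievable by any rotation.
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # All eigenvalues lie in [-g/2, g/2]; adding g/2 to the diagonal makes
    # the largest eigenvalue dominant, so power iteration converges to it.
    shift = g / 2.0
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary asymmetric starting vector
    for _ in range(iterations):
        w = [sum(key[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break
        q = [v / norm for v in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    lam = sum(q[i] * key[i][j] * q[j] for i in range(4) for j in range(4))
    return math.sqrt(max(0.0, (g - 2.0 * lam) / n))


# Toy example: the second set is the first rotated by 90 degrees about z
# and translated, so the superposed RMSD should be numerically close to zero.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0),
             (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
mobile = [(-py + 5.0, px - 3.0, pz + 2.0) for px, py, pz in reference]
print(superposed_rmsd(mobile, reference))
```

Because the value depends on which and how many atom pairs are included, RMSD figures are only comparable when the number of superposed Cα pairs is reported alongside them, as is done here.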
The N-terminal part of Δ29_CaNce103p resembles those of CAS1 and ScNce103p, while the Can2 structure includes an additional helix. However, there are similarities in the substrate tunnel region of the Can2 and Δ29_CaNce103p structures. The substrate tunnels have similar shapes and orientations (Fig. 5b), although the middle part of the Δ29_CaNce103p substrate tunnel is rather narrow compared to those of other fungal CAs (Fig. 5c). We observed the most pronounced differences in shape and proportion of the substrate tunnel when comparing the Δ29_CaNce103p and ScNce103p structures, which interestingly share the highest sequence homology. The active site structure and overall structure of Δ29_CaNce103p are nearly identical to those of CAS1.