Quaternary structure of Artemia haemoglobin II: analysis of T and C polymer alignment and interpolymer interface

Background The brine shrimp Artemia expresses four different types of haemoglobin subunits, namely C1, C2, T1 and T2. Two of these four subunits dimerize in different combinations to produce the three isoforms of the heterodimeric Artemia haemoglobin: HbI (C1 and C2), HbII (C1 and T2) and HbIII (T1 and T2). Previous biochemical, biophysical and computational analyses demonstrate that the T and C polymers are rings of nine concatenated globin domains, which are covalently joined by interdomain linkers. Two such rings stacked coaxially give the functional molecule. This research aimed to construct a quaternary structural model of Artemia HbII that shows the interpolymer interface and domain-domain alignment, using the MS3D (mass spectrometry for three-dimensional analysis) approach. This involved introducing chemical crosslinks between the two polymers, cleaving with trypsin and analyzing the resulting products by mass spectrometry. This was followed by computational analysis of the mass spectrometry data using the program SearchXlinks to identify putatively crosslinked peptides. Results Six putative EGS (ethylene glycol bis[succinimidylsuccinate]) crosslinked tryptic peptides were identified. All of them support a model in which the EF helices of all domains are in contact along the interpolymer surface, and Domain 1 of the T-polymer aligns with Domain 1 of the C-polymer. Any two adjacent interpolymer domain pairs contact through the early Helix H and early Helix A. The orientation of domains is different from the subunit model proposed previously by this group. Crosslinking with GMBS (N-[γ-maleimidobutyryloxy]succinimide ester) was also performed, and the results show good agreement with this model. Conclusion The interpolymer EF-contact allows the hydrophobic E and F helices to be buried in the interface and therefore allows the complex to solubilize readily to facilitate efficient oxygen transport.
Furthermore the EF-contact is a common contact in cooperative haemoglobins and thus the model is consistent with the cooperative behaviour of Artemia HbII.


Background
The brine shrimp Artemia can survive extreme environmental conditions, such as 1.5 M salinity and fluctuating oxygen concentrations. Artemia responds to environmental changes by switching between the two modes of reproduction: ovoviviparity and oviparity, in an oxygen concentration dependent manner. It has a high concentration of haemoglobin in the haemolymph, which also responds to the environment by changes in isoform content. The proposed quaternary structure of Artemia haemoglobin (two nine-domain polymer rings stacked coaxially) is unique among invertebrates (except for the related genus Parartemia). However, the structure has never been solved conclusively.
The brine shrimp Artemia expresses four different types of haemoglobin subunits namely C1, C2, T1 and T2. Two of these four subunits dimerize in different combinations to produce the three isoforms of the heterodimeric Artemia haemoglobin: HbI (C1 and C2), HbII (C1 and T2) and HbIII (T1 and T2). Each polymer consists of nine globin domains arranged in a ring and has a molecular weight of approximately 160 kDa. The mature molecule consists of two polymer rings stacked coaxially and has a molecular weight of approximately 320 kDa. Analysis of translated cDNA sequence, and partial trypsinolysis, confirmed that each polymer is made of nine concatenated globin domains, joined together via flexible linkers [1]. The orientation between each domain within each polymer was predicted from structural analysis of domain linkers [2], indicating a uniform, ring-like structure, characteristic of a repetitive linear concatenation of globin-like domains.
A linker of typically fourteen residues covalently joins each globin domain within each polymer. The interdomain linkers contain the consensus sequence VDPVTGL [2]. Comparisons with pre-A leader sequences of globins from other organisms have led to the hypothetical extended and turn-associated linker structure [2]. The amino acid sequences of all Artemia inter-domain linkers were demonstrated to resemble closely the pre-A leader sequence of the Petromyzon marinus haemoglobin V [2] which has a known 3D structure. The predicted T and C polymer inter-domain linker structures, together with the highly conserved 3D structure of globin-family proteins, suggest a circular polymer structure.
The EM image obtained by Wood et al. (1981) [3] is consistent with the ab initio models of concatenated globin polymers, and with two coaxially stacked polymer rings. The measured sedimentation coefficient and frictional drag coefficient also correlate to this proposed quaternary structure [4]. The Artemia haemoglobins rarely dissociate into subunits in the absence of detergents even at low concentration. Dissociation would be disadvantageous to Artemia, as it would cause excretory loss of haemoglobin. Research by Moens et al. (1984) [5] revealed that when HbII was partially digested by subtilisin and analyzed by electrophoresis, bands with molecular weights in multiples of approximately 17,000 Da were formed. The partially digested sample can still bind oxygen noncooperatively. The result further suggested that the subunits were polymers of myoglobin-like domains joined together by less folded loop-like structures that can be readily digested by proteases under partial proteolysis conditions. Intact Artemia haemoglobin cooperativity depends on the degree of oxygen saturation, temperature and pH [6,7].
The exact polymer interfaces and the domain-domain alignments in each Artemia haemoglobin isoform remain open questions to date. Previous attempts to crystallize Artemia HbII have all been unsuccessful, possibly because of polymorphism in the primary sequence. It has been hypothesized previously that the E and F helices of both polymers are in contact, based on the presence of surface-exposed hydrophobic residues [1].
The aim of this research was to determine the quaternary structure of Artemia HbII, with both the polymer interfaces and the domain-domain alignment identified. The technique of MS3D (mass spectrometry for three-dimensional analysis) [8][9][10][11] was used to achieve this aim. Crosslinks were introduced between the subunits of the native HbII molecule. The molecule was then cleaved with trypsin, and the resulting peptides were isolated via reverse-phase high-performance liquid chromatography (RP-HPLC). The RP-HPLC fractions collected were then subjected to ESI-MS, and significant peaks were subjected to ESI-MS zoom scan to determine their charges. After charge determination, the program SearchXlinks (developed by Welfing et al., 2001) [12] was used to identify putatively crosslinked peaks. These peaks were then subjected to tandem mass spectrometry (MS/MS) to confirm the identity of the peptide. Haemoglobin II was chosen because it is made of one T-type (T2) and one C-type (C1) polymer, which share only 88% sequence identity and therefore have readily distinguishable sequences.

Synthesis of EGS-crosslinked tryptic HbII fragments
Because the EGS/trypsin combination had not been used in previous research involving interface mapping by MS3D analysis, it was decided to characterize the digest by electrophoresis. Crosslinked proteolytic fragments can be distinguished on gels as banding differences between the crosslinked, trypsin treated sample and the noncrosslinked, trypsin treated sample.
Based on the EGS+/Trypsin- sample (Figure 1a, Lane 4), compared with the EGS-/Trypsin- sample (unmodified and undigested HbII, Figure 1a, Lane 5), the 230 kDa band was interpreted as crosslinked HbII complex, and the 140 kDa band as individual HbII subunits. Although these molecular weights do not exactly coincide with the actual molecular weights of the HbII molecule and isolated subunits (320 kDa and 160 kDa respectively), crosslinking is known to alter migration characteristics of Artemia haemoglobins [3]. Pure HbII (Figure 1a, Lane 5) was observed in this experiment as a band of approximately 140 kDa, comparable with the subunit molecular weight found by Moens and Kondo (1978) [13].
As shown in Figure 1a, a smear is present above 25 kDa in the EGS+/Trypsin+ sample (Lane 2) that is absent in the EGS-/Trypsin+ sample (Lane 3). This smear is presumed to be the crosslinked material.

EGS decrosslinking
One of the two replicates of EGS-crosslinked HbII was decrosslinked by treating it with hydroxylamine-HCl (Figure 1b). A major band with molecular weight of approximately 140 kDa was observed. The 140 kDa band is consistent with a single HbII subunit (Figure 1a, Lane 4). This result indicates that EGS-crosslinked HbII complexes were effectively decrosslinked by this method.

GMBS crosslinking and the significance of the result
GMBS is a cysteine-to-lysine crosslinker with a spacer arm of 6.8 Å. Only four cysteine residues occur in HbII, at position 2 of the A helices of domains C1, C2, T1 and T2. Figure 1c shows that inter-polymer crosslinking with GMBS was successful. This indicates that the A helices of the domains in HbII must be near the interface of the two polymers. This is consistent with the result obtained from EGS-crosslinking and mass spectrometry.

Reverse-phase high-performance liquid chromatography fractionation
Having characterized the EGS-crosslinked tryptic fragments by electrophoresis, the next step was to repeat the EGS+/Trypsin+ experiment but with complete trypsinolysis followed by RP-HPLC fractionation. The RP-HPLC chromatogram is presented in Figure 2. Putatively crosslinked peaks identified and subjected to tandem MS analysis and their corresponding RP-HPLC fractions are as indicated in Figure 2. Charge states of the peaks are labeled as the superscript above the m/z ratios of the peaks (Figure 2, see also Figure 3 inset).

Identification of crosslinked lysine pairs: SearchXlinks scoring and tandem mass spectra annotations
Setting a SearchXlinks score threshold of 20.0 [See Additional file 1], the three putative peptide species with the highest scores identified [See Additional file 2] were: GFK-GFK (927.400 1+ peak), HAISVTTK-EAIK (1542.509 1+ peak) and ASWNK-ATIKR (1418.573 1+ peak) (Table 1). When the SearchXlinks score threshold was reduced to the minimum (1.5) [See Additional file 3] and the peptide assignment receiving the best score was taken [See Additional file 4], five more putative peptide species were identified (Table 1). FindPept analysis again showed that none of the peaks showed correlation to trypsin autolytic fragments. The tandem MS spectrum of the 1418.573 peak and the zoom scan are shown in Figure 3. Red labels indicate peaks recognized by SearchXlinks during the scoring process. In all cases, many peaks remained unannotated after SearchXlinks scoring.
Four of the putative crosslinked peptide assignments, including ASWNK-ATIKR, were reconfirmed manually by checking whether any peaks unaccounted for by the SearchXlinks PSD scoring function could be identified. Internal ions formed by at least two b or y-type cleavages, and fragment ions formed by complicated fragmentation pathways, were identified (blue labels) (Figures 3 and 4).
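The peak assignments above can be sanity-checked with a short mass calculation. The following is a minimal sketch, not the SearchXlinks procedure: it assumes standard monoisotopic residue masses and assumes that an intact EGS bridge adds C10H10O6 (about 226.048 Da) across two amine groups, a value derived here from the reagent structure rather than taken from the text. The function names are illustrative.

```python
# Monoisotopic residue masses (Da) for the residues that occur in the
# three peptides discussed above.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
    "V": 99.06841, "I": 113.08406, "N": 114.04293, "K": 128.09496,
    "E": 129.04259, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
# Assumption: net mass added by one intact EGS bridge (C10H10O6).
EGS_BRIDGE = 226.04774


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def crosslinked_mz(seq_a, seq_b, charge=1, bridge=EGS_BRIDGE):
    """m/z of two peptides joined by one crosslinker bridge."""
    neutral = peptide_mass(seq_a) + peptide_mass(seq_b) + bridge
    return (neutral + charge * PROTON) / charge


for pair, observed in [(("GFK", "GFK"), 927.400),
                       (("HAISVTTK", "EAIK"), 1542.509),
                       (("ASWNK", "ATIKR"), 1418.573)]:
    mz = crosslinked_mz(*pair)
    print(pair, round(mz, 3), "difference:", round(mz - observed, 3))
```

Under these assumptions each calculated 1+ value falls within the 2.0 Da matching error quoted for the SearchXlinks analysis.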
Ions formed from these complicated fragmentation pathways (b and y-type fragment ions losing NH3 or H2O, or gaining H2O in the absence of the prerequisite amino acids) were identified. Fragment ions of types b/y-CO2, y-CO and y+H2O were also identified (Figure 4). Formation of all these ions is possible according to Schilling et al. (2003) [10]. The program MS2Assign [10] considers b or y-ions losing NH3, H2O, CO or CO2, or gaining H2O, in sequence-independent and ion-type-independent manners.
The immonium y-ion is 2 Da lighter than the conventional, ammonium-type y-ion, hence it is referred to as the y-2 ion. Formation of such y-ions had been theorized by Biemann et al. (1992) [14]. Immonium-type y-ions were observed in the 927.400, 1542.509 and 1418.573 tandem mass spectra also.
It was noticed, on the 1418.573 tandem MS spectrum (Figure 4) for instance, that peaks identified as non-polypeptide-characteristic internal ion peaks, or as peaks resulting from other complicated fragmentation pathways, occupy a significant fraction of the spectrum, although the polypeptide-characteristic b/y-type ion peaks are present. It is therefore important to check whether the polypeptide-characteristic b/y-type ion matches are statistically significant.
The statistical significance of the b and y-type ion matching result can be evaluated according to Chen et al. (2001) [8]. Briefly, the probability that a random peak matches a peak on the theoretical spectrum by chance (θ) can be expressed as:
$$\theta = \frac{\tau}{r} \qquad (1)$$
Here τ is the sum of the matching ranges of the theoretical spectrum, and r is the observed range of m/z ratios. Suppose that there are n random peaks; the probability that at least x of them match, P(n, x, θ), can be expressed as:
$$P(n, x, \theta) = \sum_{k=x}^{n} \binom{n}{k}\, \theta^{k} (1-\theta)^{n-k} \qquad (2)$$
Hence the significance score, S, can be defined as:
$$S = -\log_{10} P(n, x, \theta) \qquad (3)$$
Statistical significances of the 927.400 and 1418.573 b/y-type ion matching results were measured (Table 2) using equations (1), (2) and (3). The theoretical tandem MS spectra of EGS-crosslinks GFK-GFK and ASWNK-ATIKR were also calculated. The theoretical tandem MS spectra contain only the polypeptide-characteristic b and y-ion, and the following b and y-ion derivatives b-H2O, b n-
All crosslinked tryptic peptides identified suggest that trypsin can cleave after a lysine crosslinked by an NHS-ester (N-hydroxysuccinimide ester) crosslinker. This is surprising, as an NHS-ester crosslinked lysine carries no positive charge. However, it has been reported that tryptic fragments of this type may be possible [8,9]. The crosslinker used in those studies, DSG (disuccinimidyl glutarate), is an NHS-ester crosslinker like EGS. Therefore, EGS-crosslinked tryptic peptides formed by tryptic cleavage after an EGS-crosslinked lysine were not rejected.
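The significance calculation described above can be sketched in a few lines. This is an illustrative sketch only: θ is taken as τ divided by r and P(n, x, θ) as a binomial tail, as the definitions in the text imply, while the form S = -log10 P is an assumption made for this example. The numbers in the usage lines are hypothetical and are not values from Table 2.

```python
import math


def match_probability(tau, r):
    """Chance that one random peak hits the theoretical spectrum.

    tau: summed width of all matching windows (m/z units)
    r:   width of the observed m/z range
    """
    return tau / r


def tail_probability(n, x, theta):
    """Probability that at least x of n random peaks match by chance."""
    return sum(math.comb(n, k) * theta**k * (1 - theta)**(n - k)
               for k in range(x, n + 1))


def significance_score(n, x, theta):
    """Assumed score: negative log10 of the chance-match probability."""
    return -math.log10(tail_probability(n, x, theta))


# Hypothetical example: 30 theoretical ions, each with a +/- 2.0 Da window,
# in a 1400 m/z observation range; 40 observed peaks, 12 of them matched.
theta = match_probability(tau=30 * 4.0, r=1400.0)
print(round(theta, 4), round(significance_score(40, 12, theta), 2))
```

A larger S means that the observed number of b/y-type matches is less likely to have arisen from random peaks.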

Model construction
A model of the HbII quaternary structure was made using all crosslinked lysine pairs identified, by forcing the EF corners of two Chironomus globin III domains in contact to make a domain pair (Figure 5a). Then nine such domain pairs were rotated 40° to form a ring with the GH loop pointing into the centre of the ring (Figure 5a).
(a) Partial trypsinolysis of EGS-crosslinked HbII
Reverse-phase high-performance liquid chromatography (RP-HPLC) chromatogram of completely trypsinolyzed HbII complex
Figure 3 ESI-MS zoom scan results of the 1418.573 peak (inset) and corresponding tandem MS spectra; this spectrum is partially annotated. See Figure 4 for the enlarged version of this spectrum and further annotation.
There are crosslinks in which we cannot judge whether it is inter-polymer or intra-polymer by sequence alone. In those cases we judged whether such crosslinked peptides are inter- or intra-polymer crosslinks by finding out in which instance the two crosslinked lysines can be placed within 16 Å (the maximum crossbridge length of EGS), given that the domains are uniformly orientated in each polymer.
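The distance criterion used to separate inter-polymer from intra-polymer crosslinks can be illustrated as follows. This is a minimal sketch under stated assumptions: the coordinates are hypothetical placeholders rather than atoms from the actual model, and the helper name is illustrative. Only the 16 Å maximum crossbridge length comes from the text.

```python
import math

EGS_MAX_SPAN = 16.0  # Å, maximum crossbridge length of EGS quoted in the text


def feasible_placements(candidates, max_span=EGS_MAX_SPAN):
    """Return the labels of candidate placements that satisfy the span.

    candidates maps a label to a pair of (x, y, z) coordinates in Å,
    one for each crosslinked lysine side-chain nitrogen.
    """
    return [label for label, (a, b) in candidates.items()
            if math.dist(a, b) <= max_span]


# Hypothetical coordinates for one crosslinked lysine pair placed either
# across the interface or within a single polymer.
candidates = {
    "inter-polymer": ((0.0, 0.0, 0.0), (3.0, 4.0, 12.0)),   # 13 Å apart
    "intra-polymer": ((0.0, 0.0, 0.0), (0.0, 0.0, 30.0)),   # 30 Å apart
}
print(feasible_placements(candidates))
```

A crosslink is assigned to whichever placement is the only one that keeps the two nitrogens within the crossbridge span; if both placements pass, sequence and distance alone do not resolve the assignment.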
The domain orientations in this model differ from the orientations suggested by Trotman et al. (1994) [2]. According to this model the GH loops of domains point into the tangential direction with respect to a ring-like polymer. This difference is not surprising, since the subunit model proposed by Trotman et al. was based on the predicted structure of interdomain linkers using the lamprey Hb pre-A leader structure as a model rather than a true linker. However, the similar primary structure of the Artemia interdomain linkers, plus the presence of a conformationally-restricted proline at a conserved position may well be an indication that all interdomain linkers have similar tertiary structures, and this in turn hints at a repetitive arrangement of domains. Therefore, the uniformly arranged domains assumption features in the previous model (Figure 5a and 5b).
As a further scrutiny, the apparent cysteine-lysine distance between C178 TP (2nd residue of Domain T2, modelled as the 2nd residue of Helix A on the Chironomus Hb structure) and K28 TP (5th residue of Domain C1, modelled as the 5th residue of Helix A on the Chironomus Hb structure) was measured as 6.66 Å (Figure 6), which is comparable with the maximum GMBS crossbridge span (6.8 Å). Therefore the model shown in Figure 6a and 6b agrees with the GMBS crosslinking results, although it is possible that the crosslink was formed between a different cysteine-lysine pair, or that more than one cysteine-lysine pair is crosslinked by GMBS simultaneously.

MS3D analysis using cysteine-lysine crosslinker GMBS: scrutinizing the Domain 1: Domain 1, EF: EF model
HbII samples were crosslinked with GMBS and subjected to partial trypsinolysis at 37°C for 1.5 hours (Figure 7a). An additional smear can be observed in the trypsin+/GMBS+ lanes (Figure 7a, Lanes 5 and 7, blue star) by comparing both lanes with the corresponding GMBS- lanes (Figure 7a, Lanes 4 and 6 respectively); this smear may be GMBS-crosslinked HbII partial tryptic fragments. Bands marked with a red star (Figure 7a, Lanes 5 and 7) may be a few Daltons heavier than the 55 kDa bands in Lanes 4 and 6 (Figure 7a). These bands may be GMBS-crosslinked HbII partial tryptic fragments, but it is also possible that the small migration difference is simply due to experimental error.
HbII samples were crosslinked with the same amount of GMBS and subjected to complete trypsinolysis at 37°C for 24 hours using 20 μL of 1× trypsin stock. The digestion mixture was fractionated by RP-HPLC. Fractions collected were subjected to ESI-MS and all (not just a selection) significant peaks were subjected to ESI-MS zoom scan to determine their charge states. After SearchXlinks analysis [See Additional file 5], only one peak, the 1381.305 1+ peak (Figure 7b), was a genuine crosslinked peak that was unequivocally assigned to GMBS-crosslink AVK-GILCSD(g-deco-K) (g-deco, GMBS-decorated) [See Additional file 6]. FindPept analysis showed that the 1381.305 1+ peak is not a trypsin autolytic product peak. There are three other putative crosslinked peaks, but all of them showed possible correlations to uncrosslinked HbII tryptic fragments with or without GMBS decoration.
Figure 4 The enlarged version of the tandem MS spectra of the 1418.573 peak. The putative peptide species are those which received the best SearchXlinks score for a given m/z ratio, and which had a score above 20. The structure of the putative peptide species and the score received are indicated. The "|" sign joins crosslinked lysines. Fragment ions recognized by SearchXlinks during the scoring process were labelled red. Tandem MS precursor ion peaks were labelled green, and the remaining identified peaks were labelled blue.
Therefore, only one pair of crosslinked residues can be conclusively identified: C25 of the C-polymer and K106 of the T-polymer. Taken together, the GMBS-crosslinked residues were identified as C25 CP and K106 TP, rather than C178 TP and K32 CP as previously guessed. Nevertheless, both residues are on Domain 1 of the T and C-polymers and hence agree with the Domain 1: Domain 1 alignment. The domains can be docked in a similar fashion as shown in Figure 4 while still having C25 CP and K106 TP positioned at a distance comparable with the maximum crossbridge span of GMBS (6.8 Å) (Figure 7c). Hence, the Domain 1: Domain 1, EF: EF model was further substantiated by the GMBS MS3D results.

Conclusion
The aim of this research was to investigate the quaternary structure of Artemia HbII, and elucidate the details of domain-domain alignment and the interpolymer interface. The MS3D approach was employed. In total, six putative EGS-crosslinked tryptic peptides (Table 1), and 1 GMBS-crosslinked peptide (Figure 8) were identified through the MS3D experiments. Four of the seven crosslinked peptides mentioned above exceeded a threshold score of 20.0, while others appeared only when the minimum threshold score (1.5) was applied (Table 1). The identification of fragment ions carrying portions of EGS crossbridge (such as the [LEGS8]+ ions identified by SearchXlinks; Figure 3) provided further evidence of crossbridges.
Using the Chironomus globin III X-ray structure (1ECD) as a model, and applying the identified crosslinked lysine pairs showed that by having two globin domains in contact through EF corner, nine such domain pairs could be arranged into a ring with any two adjacent domain pairs contacting through the Helix H (Figure 5a and 5b). Domains T1 and C1 were then forced to align, which produced inter-lysine distances in four of the six crosslinked lysine pairs that were comparable with the in-solution crossbridge span of EGS (9.8 ± 2.5 Å). The other two were within the maximum crossbridge span of 16 Å.
Figure 6 Inter-lysine (or lysine-cysteine) distances (in Å) residues which were experimentally shown to be crosslinked. The TP and CP superscripts indicate T or C polymer residues. Note that this figure is the same as Figure 5a except that only crosslinked residues are shown.
The assumption can be made that the domains in each polymer are oriented in the same fashion and hence the two-domain-pair model shown as Figure 5a and 5b represents a section of the HbII complex. The model is consistent with the low resolution EM image of Artemia HbII [3] that suggests the molecule consists of two coaxially stacked ring-like globin polymers (Figure 5a, b, c, d and 5e). This EF: EF Domain 1: Domain 1 model further agrees with the GMBS crosslinking results since K106 TP and C25 CP can be positioned at a distance comparable with the maximum GMBS crossbridge span (6.8 Å) (Figure 7c) while having the domain orientations comparable to the model shown as Figure 5a and 5b.
This model suggests also that the two polymers are stacked in a symmetrical fashion, and therefore explains the dimeric nature of the complex. Having the two polymers stacked asymmetrically (for example, by having the EF helical regions contact with the GH helical regions), would likely mean that polymers would stack infinitely, since T and C polymers both have almost identical tertiary structure and primary structure. However there was no experimental evidence suggesting that T and C polymers can be stacked in this way.
Most importantly, having the EF helical region of each domain located at the interface explains the previous finding from sequence alignment studies that surface-exposed hydrophobic residues were observed most frequently at Helix E [1]. In order to solubilize the complex readily to facilitate efficient oxygen transport in an aqueous environment, these hydrophobic residues need to be buried.
Despite the diverse quaternary and primary structures of invertebrate haemoglobins, only one form of oligomeric assemblage, the EF-dimer (or EF-contacted single-domain subunits), has been found in more than one phylum. Such EF-contacts are commonly observed in many cooperative invertebrate types of haemoglobin also [16]. Therefore, the observed EF-contact between the T and C polymers may well explain the cooperative behavior of Artemia HbII. Although many cooperative types of haemoglobin contain EF-contact between subunits, the cooperativity mechanisms are various [16]. Therefore, exactly how the EF-contacting between the T and C polymer contributes to the cooperativity of Artemia HbII remains an open question.

EGS crosslinking
Crosslinking reactions were performed with 375 μg HbII, suspended in 25 mM HEPES, pH 7 to a final volume of 48 μL. EGS (ethylene glycol bis[succinimidylsuccinate]) (4.7 μL of 1 mg/mL EGS in DMSO) was added. The reaction was allowed to continue for 1 minute before quenching with 2.7 μL of 1 M Tris (pH 7.5). The samples were incubated at room temperature for a further 15 minutes, before centrifuging for 10 minutes to remove non-solubilized material. Supernatants were transferred to a new microcentrifuge tube.
Figure 8 The tandem MS spectrum of the 1381.305 1+ peak. Fragment ions recognized by SearchXlinks during the scoring process were labelled red, and the remaining identified peaks were labelled blue.

GMBS crosslinking
Crosslinking reactions were performed with 1 mg of HbII, suspended in 25 mM HEPES, pH 7.5 to a final volume of 195 μL. GMBS (4.9 μL of 25 mg/mL in DMSO) was added to HbII samples. The reaction was allowed to proceed for 1 hour at room temperature before quenching with 50 μL of 1 M Tris (pH 7.5) and 50 μL of 1 M cysteine. The tubes were incubated at room temperature for a further 15 minutes after quenching. Crosslinked HbII samples were precipitated with 40% (w/v) ammonium sulphate and resuspended in 55.4 μL of 25 mM HEPES, pH 7.5.
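The reagent quantities in the two crosslinking protocols can be converted into approximate molar excesses over HbII. This is a rough sketch: the 320 kDa molecular weight of HbII is taken from the text, whereas the reagent molecular weights (EGS about 456.36 g/mol, GMBS about 280.23 g/mol) are assumptions based on their chemical formulae and are not stated in the source.

```python
HBII_MW = 320_000.0  # g/mol, approximate molecular weight given in the text
EGS_MW = 456.36      # g/mol, assumed (C18H20N2O12)
GMBS_MW = 280.23     # g/mol, assumed (C12H12N2O6)


def molar_excess(protein_ug, reagent_ul, reagent_mg_per_ml, reagent_mw,
                 protein_mw=HBII_MW):
    """Moles of crosslinker added per mole of protein."""
    protein_mol = protein_ug * 1e-6 / protein_mw
    reagent_ug = reagent_ul * reagent_mg_per_ml  # 1 μL of 1 mg/mL holds 1 μg
    reagent_mol = reagent_ug * 1e-6 / reagent_mw
    return reagent_mol / protein_mol


egs_excess = molar_excess(375.0, 4.7, 1.0, EGS_MW)
gmbs_excess = molar_excess(1000.0, 4.9, 25.0, GMBS_MW)
print(round(egs_excess, 1), round(gmbs_excess, 1))
```

Under these assumptions the EGS reaction uses roughly a ninefold molar excess of crosslinker and the GMBS reaction roughly a 140-fold excess.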

Trypsinolysis
Trypsin was made to a concentration of 20 mg/mL by addition to 25 mM HEPES pH 7. Trypsin stocks of various concentrations (20 μL) were added directly to the crosslinked haemoglobin II sample and then incubated for 3.5 or 24 hours at 37°C. The reaction was quenched by adding 5 μL of 10 mg/mL PMSF. For partial trypsinolysis, 2.5 mg/mL trypsin was used and the incubation time was 3.5 hours. For complete trypsinolysis 20 mg/mL trypsin was used and the incubation time was 24 hours.

EGS decrosslinking
EGS decrosslinking was done by adding an equal volume of 2 M hydroxylamine-HCl (NH2OH·HCl), prepared in 200 mM sodium phosphate buffer pH 8.5, to EGS-crosslinked HbII samples followed by incubation at 37°C for 6 hours with shaking. The efficiency of decrosslinking was tested using undigested crosslinked HbII samples and analyzed by SDS-PAGE using the NuPAGE system.

Gel electrophoresis
The Invitrogen NuPAGE system was used with the NuPAGE Novex 4-12% Bis-Tris Gel (10 well), and the SeeBlue Plus2 prestained standard. Sample preparation and electrophoresis conditions were as recommended by the manufacturer.

Reverse-phase high-performance liquid chromatography
The organic solvent (solution B) used for RP-HPLC was 80% acetonitrile, 0.085% trifluoroacetic acid. The aqueous solution (solution A) used for RP-HPLC was 2% acetonitrile and 0.1% trifluoroacetic acid. Peptides were separated by a gradient of 100% solution A to 50% solution B over a period of 40 minutes at 1 mL per minute.
Complete trypsinolysis was carried out as stated above.
The digest was then made up to 600 μL in solution A. Fractions were collected, reduced to approximately 30 μL in a Speedvac and subjected to mass spectrometry (see below).
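The gradient described above can be expressed as the acetonitrile concentration reaching the column at a given time. This is a small sketch that assumes the gradient is linear, which the text does not state explicitly; the function names are illustrative.

```python
ACN_IN_A = 2.0    # % acetonitrile in solution A (from the text)
ACN_IN_B = 80.0   # % acetonitrile in solution B (from the text)


def percent_b(t_min, duration=40.0, end_b=50.0):
    """Percentage of solution B at time t, assuming a linear gradient."""
    t = min(max(t_min, 0.0), duration)
    return end_b * t / duration


def acetonitrile_percent(t_min):
    """Acetonitrile percentage of the mixed eluent at time t."""
    fraction_b = percent_b(t_min) / 100.0
    return ACN_IN_A * (1.0 - fraction_b) + ACN_IN_B * fraction_b


for t in (0, 20, 40):
    print(t, percent_b(t), acetonitrile_percent(t))
```

With these values the eluent runs from 2% acetonitrile at the start to 41% acetonitrile at 40 minutes.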

Mass spectrometry data analysis: identification of crosslinked lysine pairs
After the ESI-MS zoom scan analysis, peaks were analyzed once with SearchXlinks Version 3.0.10 [12] with the PSD (post source decay) scoring function turned off, to identify all possible peptide assignments of each peak given their charge states, and to see whether any peaks showed possible correlation to crosslinked tryptic peptides of HbII. FindPept analyses were also performed on the same set of peaks, to check for possible correlation to trypsin autolytic products. Peaks showing possible correlation to crosslinked tryptic peptides of HbII but no possible correlation to trypsin autolytic products were then subjected to tandem MS analysis and a second round of SearchXlinks analysis to score each of the peptide assignments. This was done by matching peaks of the experimental tandem MS spectrum to the theoretical tandem MS spectrum of each peptide assignment, using the PSD scoring function of SearchXlinks. The matching error used was 2.0 Da. A monoisotopic mass type (default) was used for the PSD scoring function of SearchXlinks. Currently, tandem MS peak lists for SearchXlinks have to be generated manually. All tandem MS peaks distinguishable from background noise were considered. It is important to ensure that peaks show no possible correlation to trypsin autolytic products because the trypsin used in this research is autolytic, and the PSD scoring function of SearchXlinks can only handle one protein or protein complex at a time.
Currently, the PSD scoring function of SearchXlinks calculates theoretical tandem MS spectra containing the low energy, polypeptide-characteristic b and y-type ions (ions formed by cleavage at a CO-NH bond), and possible fragment ions predicted to be concomitantly formed under low collision energy by cleavage at the crossbridge. The score provides an indication of how closely the observed tandem MS spectrum resembles the tandem MS signature of a possible peptide assignment compared with other possible peptide assignments of the same m/z ratio. The SearchXlinks scoring function was designed in such a way that b and y-type ion matches are weighted higher than fragment ions formed by cleavage at the crossbridge. For consecutive b and y-type ion matches, the weight increases exponentially after each successful consecutive match. Therefore the score reflects mainly the tandem MS signatures of polypeptides. Fragment ions due to crossbridge fragmentation (which are predicted to be concomitantly formed with b and y-type ions) make only small contributions to the score, as the formation of such ions is currently conjectural. Those peptide or crosslinked peptide assignments receiving the best SearchXlinks scores were considered as the putative peptide species represented by each peak.
After scoring each peptide assignment using the PSD scoring function of SearchXlinks, the peptide assignments receiving the best scores compared with all others of the same m/z ratio were considered, as recommended by the program developers. The putative peptide species identified after SearchXlinks scoring were then reconfirmed by checking whether any peaks, especially the dominating peaks, remaining unannotated after scoring using the PSD scoring function of SearchXlinks could at least be identified. Those unannotated peaks were likely to represent the non-polypeptide-characteristic internal ions formed by at least two b/y-type cleavages at peptide groups, although ions formed from more complicated fragmentation pathways may also be possibilities.
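The idea of matching observed peaks against theoretical b and y-type ions can be illustrated with a simplified sketch. This is not the SearchXlinks scoring function: it handles only an unmodified linear peptide, counts matches without any weighting of consecutive ions, and ignores crossbridge fragments. The residue masses are standard monoisotopic values, the function names are illustrative, and the observed peak list is hypothetical; the 2.0 Da tolerance is the matching error quoted in the text.

```python
# Monoisotopic residue masses (Da) for a few residues used in the example.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "F": 147.06841, "K": 128.09496}
WATER = 18.01056
PROTON = 1.00728


def b_y_ions(seq):
    """Singly charged b and y ion m/z values for a linear peptide."""
    masses = [RESIDUE_MASS[aa] for aa in seq]
    b_ions, running = [], 0.0
    for m in masses[:-1]:            # b1 .. b(n-1), from the N-terminus
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):   # y1 .. y(n-1), from the C-terminus
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def count_matches(observed, theoretical, tol=2.0):
    """Number of theoretical ions with an observed peak within tol Da."""
    return sum(any(abs(o - t) <= tol for o in observed) for t in theoretical)


b_ions, y_ions = b_y_ions("GFK")
observed = [147.0, 205.5, 500.0]     # hypothetical peak list
print(count_matches(observed, b_ions + y_ions))
```

The match count and the number of observed peaks are the quantities that enter a chance-match significance estimate of the kind attributed to Chen et al. in the Results.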

Molecular modelling
Since Artemia and Chironomus are both arthropods, the X-ray structure of Chironomus globin III (PDB: 1ECD) was used to model the globin domains of Artemia HbII. Domains within T and C polymers were assumed to be repetitively arranged into a ring. Each polymer has nine globin domains. The inter-lysine distances determined were used to define the interpolymer-contacting surface, subject to the constraints that domains within T and C polymers are repetitively arranged into a ring and both polymers are identical. Modelling was done using the program Swiss PDB Viewer 3.7 (SPDBV 3.7), by stacking together four Chironomus globins to model two domain pairs. Orientations of the four-globin structures were adjusted to best fit all identified inter-lysine distances, under the same constraints.
Unless otherwise stated, residues on the structure 1ECD that map to the same positions as the identified crosslinked lysines were "mutated" in silico to lysine (on the basis of residue number within each secondary structure element, for example, position 3 of Helix A, position 2 of the FG loop, etc.). The inter-lysine distance between any two lysines identified as crosslinked was taken as the distance between the nitrogen atoms of the -NH2 groups.
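The repetitive ring arrangement assumed above amounts to copying one domain nine times with a rotation of 360°/9 = 40° between neighbours. The following is a geometric sketch only: it rotates a single reference point about the z axis, which is assumed here to be the ring axis, and the 35 Å radius is a hypothetical placeholder rather than a measured value. The modelling itself was done interactively in Swiss PDB Viewer, not with code of this kind.

```python
import math


def rotate_about_z(point, degrees):
    """Rotate an (x, y, z) point about the z axis by the given angle."""
    t = math.radians(degrees)
    x, y, z = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t),
            z)


def ring_positions(point, n_domains=9):
    """Positions of one reference point in each domain of a uniform ring."""
    step = 360.0 / n_domains   # 40 degrees for nine domains
    return [rotate_about_z(point, i * step) for i in range(n_domains)]


# Hypothetical reference point 35 Å from the ring axis.
positions = ring_positions((35.0, 0.0, 0.0))
for i, (x, y, z) in enumerate(positions, start=1):
    print(i, round(x, 2), round(y, 2), round(z, 2))
```

Applying the same transform to every atom of a domain pair would give the nine equivalent pairs of the ring; a second ring for the other polymer would be placed along the same axis.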