Equine tapeworm infections are caused by species of the family Anoplocephalidae in the order Cyclophyllidea. The family Anoplocephalidae has 2 valid genera, Anoplocephala (A. perfoliata and A. magna) and Anoplocephaloides (A. mamillana). Previous studies have shown that both A. perfoliata and A. magna are highly prevalent in the equine population and in many cases occur together as a mixed infection [1]. These tapeworms are found in and around the ileocecal valve of horses and are thought to be associated with several intestinal diseases [2,3] that lead to reduced body weight. To minimize economic loss, accurate identification and differentiation are needed to help control these equine parasites. Additionally, the genetics, epidemiology, and biology of these anoplocephalid species are as yet poorly understood.
Complete mitochondrial DNAs have been used effectively to analyze species phylogenetics, ecology, and population genetics, and some genes and gene regions have helped us locate novel molecular markers [2-6]. The complete mitochondrial DNA of platyhelminthes comprises 12 protein-coding genes, 2 ribosomal genes and 22 transfer RNA genes [5]. The complete mitochondrial genome of A. perfoliata was recently sequenced; however, that of A. magna remains undetermined.
In the present study, the complete mitochondrial genome of A. magna was sequenced and compared with that of A. perfoliata of the genus Anoplocephala in order to find useful molecular markers for the identification of the 2 most commonly occurring equine parasites. Determining the complete mitochondrial genome of A. magna will provide new molecular data for future studies of comparative mitochondrial genomics as well as the phylogenetics of parasitic cestodes.
Equine tapeworms were collected from the digestive tracts of donkeys slaughtered at a commercial slaughterhouse in China and identified to species by morphology. Total DNA was extracted from a single A. magna specimen with a Miniprep DNA extraction kit (AXYGEN, Avenue Union City, California, USA).
The complete mt genomic sequence of A. magna was amplified in 3 overlapping fragments. The 3 pairs of conserved primers and the amplification conditions used for the long-PCR reactions were the same as those described for amplifying the corresponding fragments of A. perfoliata [7]. The amplicons were sequenced by Shanghai Sangon (Shanghai, China) using primer walking in both directions.
The complete mt genomic sequence of A. magna was assembled using the CAP3 program [8] and annotated by alignment in ClustalX against the previously reported sequence of A. perfoliata [7]. Gene boundaries in the A. magna mt genome were identified from this alignment by comparison with those of A. perfoliata. A total of 22 tRNA genes were found using ARWEN [9] and confirmed by checking anticodon sequences and potential secondary structures. Putative stem-loop structures of non-coding regions were inferred through comparison with similar published sequences. The amino acid sequences of the 12 protein-coding genes were deduced using the flatworm mitochondrial genetic code (Translation Table 9). Nucleotide and amino acid sequence differences between A. magna and A. perfoliata were determined by pairwise comparison.
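As an illustration of the translation step only, the following minimal Python sketch deduces an amino acid sequence under Translation Table 9 (the echinoderm and flatworm mitochondrial code, in which AAA encodes Asn, AGA and AGG encode Ser, and TGA encodes Trp). It is not the software used in this study; the function name `translate_table9` and the toy input are hypothetical, and the choice to read an initiating GTG as methionine is an assumption of the sketch.

```python
# Illustrative sketch, not the pipeline used in the study.
BASES = "TCAG"

# Amino acids of NCBI translation table 9, with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (first base slowest, third base fastest).
TABLE9_AA = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNNKSSSS"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    b1 + b2 + b3: TABLE9_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Initiation codons listed for table 9; both are read as Met at position 1
# (an assumption of this sketch).
START_CODONS = {"ATG", "GTG"}


def translate_table9(cds: str) -> str:
    """Translate a coding sequence with table 9.

    A trailing incomplete codon (for example a truncated stop codon)
    is ignored; codons with ambiguous bases are reported as 'X'.
    """
    seq = cds.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[pos:pos + 3]
        if pos == 0 and codon in START_CODONS:
            protein.append("M")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)
```

For example, `translate_table9("GTGAAATGAAGATAA")` reads the toy sequence as Met, Asn, Trp, Ser and a stop, whereas the standard code would give Lys for AAA, a stop for TGA and Arg for AGA.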
The complete circular mt genome of A. magna was 13,759 bp in size (GenBank accession no. KU236385). It contains 36 genes, including 12 protein-coding genes (cox1-3, atp6, nad1-6, nad4L, and cytb), 22 transfer RNA genes (trn), and 2 ribosomal RNA genes (the small and large subunits of rRNA were designated rrnS and rrnL). The putative gene arrangement and gene lengths of A. magna are listed in Table 1. All 36 genes are transcribed in the same direction. Gene overlaps were found between nad4 and nad4L, between trnQ and trnF, between trnF and trnM, and between trnV and trnA, as reported in A. perfoliata [7]. The nucleotide composition of A. magna mtDNA is biased toward T and A (70.8%), as observed in A. perfoliata (71.0%) and other flatworm mt genomes [2-6]. The genome organization and structure of A. magna are the same as those of A. perfoliata (Table 1).
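The A+T bias quoted above is a simple base count. A minimal Python sketch of that calculation follows; the function name `at_content` is hypothetical, and skipping ambiguous bases is an assumption of the sketch rather than a description of how the reported values were obtained.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(base in "AT" for base in counted)
    return 100.0 * at_bases / len(counted)
```

Applied to a full mt genome sequence read from a FASTA record, the function returns a single percentage comparable to the figures reported for the 2 species.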
Twelve protein-coding genes in the A. magna mtDNA comprised 73.3% of the total length. The proportion was similar to data for other cestodes [2-6], but higher than that of A. perfoliata (69.8%). This difference in proportion between A. magna and A. perfoliata was caused by the length variation of the non-coding region (Table 1). Eight of the 12 protein-coding genes used ATG as their initiation codon, and 4 used GTG. Eleven genes had a complete stop codon with 4 genes using TAA and 7 genes using TAG (Table 1). One gene, cox3, was predicted to end with an incomplete termination codon, as identified in the cox3 of A. perfoliata [7]. The size of each of the 12 protein-coding genes identified in the mt genome of A. magna was the same as that of A. perfoliata, except for the cox1 gene whose size was 1,590 bp in A. magna and 1,593 bp in A. perfoliata.
A total of 22 tRNAs were identified in A. magna, of which 18 were inferred to fold into conventional secondary structures. The other 4 tRNAs (trnS1, trnS2, trnC, and trnR) were predicted to have lost the DHU arm, as in A. perfoliata. In A. magna, the size of individual tRNA genes ranged from 57 bp to 71 bp, and the sizes of 5 tRNA genes differed from those of the corresponding genes in A. perfoliata (Table 1): trnC (by 3 bp), trnS2 (2 bp), trnG (1 bp), trnQ (1 bp), and trnP (1 bp). Among the 36 predicted genes of the A. magna mt genome, only the trnF gene sequence was identical to that of A. perfoliata.
In the mt genome of A. magna, rrnS was located between trnC and cox2 and was 724 bp in size. The rrnL was located between trnT and trnC, and was 973 bp in size. The respective sizes of rrnS and rrnL in A. perfoliata were 724 bp and 981 bp.
A total of 22 non-coding regions, ranging from 1 bp to 274 bp in size, were found in the A. magna mt genome (Table 1). The 2 largest non-coding regions were designated NC1 and NC2. NC1 was located between trnY and trnS2 (UCN), and NC2 was located between nad5 and trnG. The lengths of NC1 and NC2 were 199 bp and 274 bp, respectively, in A. magna, and 875 bp and 279 bp in A. perfoliata.
The possible secondary structures of the 2 largest non-coding regions are shown in Fig. 1. Part of the NC1 sequence of A. magna could form a stable stem-loop structure with 18 canonical base pairs and a loop of 23 nt (Fig. 1A). NC2 comprises 7 identical repeats of a 32 nt sequence, and part of the 7th repeat together with the remaining 50 bp can fold into a perfectly matched stem-loop structure with 25 base pairs and a loop of 14 nt (Fig. 1B).
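The repeat organization described for NC2 (7 copies of a 32 nt unit followed by 50 bp, giving 7 × 32 + 50 = 274 bp) can be checked on a sequence with a few lines of code. The Python sketch below counts identical tandem copies from the start of a region. The function name is hypothetical, and the assumption that the repeats begin at the first base of the region is made for illustration only.

```python
def count_leading_tandem_repeats(region: str, unit_len: int) -> int:
    """Count consecutive identical copies of the first `unit_len` bases.

    Assumes the repeat array starts at position 0 of `region`.
    """
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    unit = region[:unit_len]
    if len(unit) < unit_len:
        return 0  # region is shorter than one repeat unit
    copies = 0
    pos = 0
    # Slices past the end of the string are shorter than the unit,
    # so the comparison fails and the loop stops.
    while region[pos:pos + unit_len] == unit:
        copies += 1
        pos += unit_len
    return copies
```

On a toy region built from 7 copies of any 32 nt unit plus 50 unrelated bases, the function returns 7 and the region length is 274, matching the layout reported for NC2.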
The overall difference in nucleotide sequence between A. magna and A. perfoliata was 12.9%. The nucleotide sequence divergence for rrnS and rrnL was 8.2% and 9.2%, respectively, indicating that rrnS is more conserved than rrnL. The sequence difference in NC1 between the 2 species was 24.1%, and that in NC2 was 7.7%. The divergence in the nucleotide and amino acid sequences of each of the 12 mt protein-coding genes between A. magna and A. perfoliata ranged from 11.1% to 16% and from 6.8% to 16.4%, respectively (Table 1). Among the deduced amino acid sequences, that of nad4L was the least conserved and that of cox2 was the most conserved.
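Percentage differences of this kind are obtained by counting mismatched columns in a pairwise alignment. The Python sketch below shows one way to do so for 2 pre-aligned sequences of equal length. The function name `percent_difference` is hypothetical, and how gap columns are treated (counted as differences by default here) is an assumption, since the text does not state the convention used for the reported values.

```python
def percent_difference(aln_a: str, aln_b: str, count_gaps: bool = True) -> float:
    """Percentage of differing columns between two aligned sequences.

    Columns where both sequences have a gap ('-') are always skipped.
    If count_gaps is False, columns with a gap in either sequence are skipped too.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue
        if not count_gaps and "-" in (a, b):
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * differences / compared
```

The same function works for nucleotide and amino acid alignments, because it compares characters column by column without interpreting them.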
The complete mt genomes of A. perfoliata and A. magna were 14,459 bp and 13,759 bp in length, respectively. This difference was largely due to the difference in the length of NC1, which results from different numbers of an identical 169-nucleotide repetitive sequence unit. Compared with the NC1 of A. magna, the NC1 of A. perfoliata contained 4 additional repetitive sequence units. The difference in the number of repetitive sequence units in this non-coding region was similar to that between Diphyllobothrium latum and D. nihonkaiense [10]. The significance of variation in the number of repetitive sequence units between closely related species needs to be evaluated further.
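The arithmetic behind this statement can be checked from the lengths reported in the text. The symbols L_NC1 and L_genome are introduced here for illustration only, and the block assumes the amsmath package.

```latex
\begin{align*}
L_{\mathrm{NC1}}^{\text{A. perfoliata}} - L_{\mathrm{NC1}}^{\text{A. magna}}
  &= 875 - 199 = 676~\text{bp} = 4 \times 169~\text{bp}, \\
L_{\mathrm{genome}}^{\text{A. perfoliata}} - L_{\mathrm{genome}}^{\text{A. magna}}
  &= 14{,}459 - 13{,}759 = 700~\text{bp}, \\
\frac{676}{700} &\approx 0.966 .
\end{align*}
```

On these figures, the 4 additional repeat units in NC1 account for about 97% of the length difference, with the remaining 24 bp arising elsewhere in the genome.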
Stem-loop structures were predicted in the non-coding regions of A. magna mtDNA, as in the mtDNAs of many cestodes [6] and nematodes [11]. Because these stem-loop structures resemble those found in vertebrates and invertebrates, the non-coding regions of the A. magna mt genome may serve a function similar to that of their vertebrate counterparts, which are known to be involved in replication and transcription [12-14].
A substantial difference (12.9%) in nucleotide sequence was observed between the complete mt genomes of A. magna and A. perfoliata from China. This sequence variation in the mt genome is consistent with a previous study in which nucleotide sequence variation between the 2 species was detected in the nuclear ITS rDNA [1]. In this study, the differences in the nucleotide and amino acid sequences between A. magna and A. perfoliata support the hypothesis that they are 2 distinct species. To date, only internal transcribed spacer 2 has been proposed as a molecular marker for differentiating infections caused by the 2 most frequently encountered equine tapeworms, A. magna and A. perfoliata [1]. In this study, additional gene regions with greater interspecific variation were identified, including nad5, cytb, nad4, atp6, nad6, nad3, nad1, and cox3, and these may thus be considered as ecological and diagnostic markers for the 2 species. Indeed, some mt genes (cox1 and cytb) have been used as targets for molecular-based methods to identify species in other cestodes [15-18].
This study provides new and valuable information about the mt genome of A. magna. The mt genome data presented here support the hypothesis that A. magna and A. perfoliata represent distinct species. The complete mt genome of A. magna can now be used for comparative mitogenomics among members of the family Anoplocephalidae to find the best molecular markers for characterization, and to reconstruct the molecular systematics of the order Cyclophyllidea in the future.