Plasmodium vivax is the most widely distributed human malaria parasite and is a major cause of concern in Central and South America, Central, South, and Southeast Asia, India, the Middle East, Oceania, and East Africa. Historically, India has been highly endemic for P. vivax, but recent reports suggest that malaria cases due to Plasmodium falciparum and P. vivax now occur in equal proportions [1]. The portrayal of P. vivax as benign is now being challenged, as reports of respiratory distress and coma have emerged along with increasing resistance of the parasite to chloroquine [2,3,4,5,6]. This situation complicates the epidemiology as well as the outcome of P. vivax malaria, not only in India but also globally, and calls for urgent attention to effective control measures.
Malaria parasites carry antigenic genes that encode variant surface antigens (VSAs), which protect the parasite against the host/vector. Among these genes, the vir family of P. vivax is the largest subtelomeric multigene superfamily, containing 346 genes divided into 12 subfamilies (A-L) [7,8]. The function of the vir gene superfamily is not yet clearly known, but its members are likely to be involved in antigenic variation and cytoadherence [9,10]. The virulent nature of the VIR proteins leads us to believe that they have a potential role in malaria pathogenesis [7]. Therefore, it is important to study the population genetic diversity of this gene family to understand the recent changing trends of P. vivax.
Indian P. vivax displays a complex evolutionary history and holds several traits of being part of the ancestral distribution range [11]. Few population genetic studies have been carried out on the vir genes of P. vivax in India. However, 4 vir genes were previously analyzed for the genetic variability existing in different P. vivax populations in India and were found to be highly divergent [12]. In the present study, these vir genes were further analyzed by population genetic approaches using different statistical tools to facilitate the understanding of the existing diversity of this superfamily among different Indian populations.
In the present study, a total of 191 blood samples were collected from symptomatic malaria patients by the finger-prick method. The samples were collected from 6 different epidemiological settings in India, i.e., 39 from Delhi (DEL) in north India, 63 from Mangalore (MNG) in south India, 29 from Goa (GOA) in west India, 39 from Rourkela (RKL) in east India, and 9 from Jabalpur (JBL) and 12 from Raipur (RPR) in central India, during the years 2008-2011 (Fig. 1).
Ethical clearance for the collection of blood samples was obtained from the Institutional Ethics Committee, National Institute of Malaria Research (NIMR), New Delhi, India. The patients were briefed about the study verbally and provided written consent before the samples were collected. Preliminary diagnosis of P. vivax was done by microscopy followed by rapid diagnostic tests (RDT) (Bioline SD Rapid Test, San Diego, California, USA). Bloodspots were made on Whatman (no. 3) filter paper strips for further molecular studies and transported to the laboratory in air-tight sealed bags. Genomic DNA of P. vivax was extracted with the QIAamp DNA Blood Mini Kit (Qiagen Inc., Valencia, California, USA) according to the manufacturer's instructions. The isolates were screened for mixed infections with P. falciparum and P. vivax by PCR assays using the published 18S rRNA primers [13]. Furthermore, single-clone P. vivax infections were identified by genotyping the isolates with the merozoite surface protein 3α (msp3α) gene.
We followed the conventional PCR protocol, using the primers described earlier for vir 27, vir 4, vir 12, and vir 21 [12]. The data generated during the study of these 4 genes have already been published [12]. In this study, we included 1 additional novel vir gene (vir 1/9) and reanalyzed the data of the 5 vir genes (4 previously published and 1 described in the present study) using different statistical tools to infer diversity patterns among them. The 5 vir genes belong to different subfamilies, i.e., vir 27 to subfamily I, vir 4 to subfamily C, vir 12 to subfamily E, vir 21 to subfamily B, and vir 1/9 to subfamily J [2]. The vir 1/9 gene is 871 bp in length and comprises 2 exons and 1 intron. Because the second exon of the vir genes is considered highly variable [14], we designed novel primers to amplify the second exon of the vir 1/9 gene.
The sequences of the primers are: v1/9_1: 5' ATGACAAATGGGGGACTCAA 3' (forward) and v1/9_3: 5' GAAAATTACTGTTTCCTTAAAATGTGT 3' (reverse) for the primary PCR reaction, and v1/9_2: 5' CGTGAAATGTTATCGGAAAATG 3' (forward) and v1/9_3 for the semi-nested PCR reaction. The annealing temperature was 50℃. The purified PCR products were sequenced as described earlier [12]. The sequences of the 4 vir genes (vir 27, vir 4, vir 12, and vir 21) are available in GenBank under accession nos. JQ733915-JQ733988 [12]. The homologous sequences of all the 5 vir genes were compared with the Sal-I reference sequences (GenBank accession nos. AAKM01000041.2 for vir 27, AAKM01000104.1 for vir 4, AAKM01000016.1 for vir 12, AAKM01000003.1 for vir 21, and AAKM01000050.1 for vir 1/9).
The sequenced DNA fragments for each vir gene were viewed in the Finch TV computer program, and the edited DNA sequences were then aligned separately to detect single nucleotide polymorphisms (SNPs) using the MEGA v 5.10 computer program with the ClustalW algorithm [15]. A minimum evolution phylogenetic tree was constructed for vir 1/9 from the sequences of the isolates and the Sal-I reference sequence. The computer program DnaSP v 5.10 was used for the sequence analysis of the vir genes [16]. For each gene and population sample, the number of segregating sites, number of haplotypes, haplotype diversity, and 2 different measures of nucleotide diversity, π and θw, were calculated [17]. Both π and θw were used to estimate the extent of nucleotide diversity in a population independently for each of the 5 different vir genes. Whereas π measures the average number of pairwise nucleotide differences in a set of DNA sequences, θw is based on the total number of segregating sites in a set of DNA sequences [18,19]. Tajima's D test of neutrality [20], which compares the number of segregating sites per site with the nucleotide diversity, was conducted for each gene and each population, and the D values were calculated. The direction of Tajima's D can provide useful information about the evolutionary forces that a population has undergone. For example, a negative value of Tajima's D highlights an excess of low-frequency polymorphisms, which is consistent with population size expansion or purifying selection, whereas a positive value signifies a deficit of both low- and high-frequency polymorphisms, which is consistent with a decrease in population size or balancing selection [21]. All the values were considered significant at P<0.05. Furthermore, pairwise Nei's genetic distances (D) were calculated for the 4 vir genes (vir 27, vir 12, vir 21, and vir 1/9) independently using GenAlEx v 6.5.
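As an illustration of the summary statistics described above, the following is a minimal sketch of how the number of segregating sites, π, θw, haplotype diversity, and Tajima's D can be computed from a gap-free alignment, using the standard formulas of Watterson and Tajima. It is not the DnaSP implementation used in this study; the function names and the toy alignment are hypothetical and serve only to make the definitions concrete.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [seqs.count(h) / n for h in set(seqs)]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def diversity_summary(seqs):
    """Return S, per-site pi, per-site theta_w, Hd, and Tajima's D.

    Assumes equal-length, gap-free sequences; D is None when S == 0.
    """
    n, length = len(seqs), len(seqs[0])
    s = segregating_sites(seqs)
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    result = {
        "S": s,
        "pi": k / length,
        "theta_w": s / a1 / length,
        "Hd": haplotype_diversity(seqs),
        "D": None,
    }
    if s > 0:
        # Variance constants from Tajima's derivation.
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        result["D"] = (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return result


# Made-up alignment of 4 sequences, for illustration only (not study data).
toy_alignment = ["ACGT", "ACGA", "ACCA", "TCGT"]
summary = diversity_summary(toy_alignment)
```

In this sketch, D is positive when π exceeds θw and negative when θw exceeds π, which is the comparison underlying the interpretation given above.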
For each gene, the population pair-wise genetic distance matrix was used to construct Neighbor-Joining (NJ) phylogenetic trees using the MEGA v 5.10 computer program [22].
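The population pairwise distance matrix mentioned above can be sketched as follows, assuming Nei's standard genetic distance computed from haplotype frequencies at a single gene, D = -ln(I), where I is the normalized identity between 2 populations. How GenAlEx v 6.5 was configured for these data is not stated here, so this is an assumption rather than a reproduction of the analysis; the function names, population labels, and haplotype labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype label in one population sample."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return {h: c / n for h, c in counts.items()}


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy))."""
    fx = haplotype_frequencies(pop_x)
    fy = haplotype_frequencies(pop_y)
    jxy = sum(p * fy.get(h, 0.0) for h, p in fx.items())
    jx = sum(p * p for p in fx.values())
    jy = sum(p * p for p in fy.values())
    identity = jxy / sqrt(jx * jy)
    if identity == 0:
        return float("inf")  # no shared haplotypes
    return -log(identity)


def distance_matrix(populations):
    """Pairwise distances for a dict mapping population name to haplotypes."""
    return {
        (a, b): nei_distance(populations[a], populations[b])
        for a, b in combinations(sorted(populations), 2)
    }


# Made-up haplotype labels for 3 hypothetical populations (not study data).
toy_populations = {
    "POP1": ["h1", "h1", "h2", "h2"],
    "POP2": ["h1", "h2", "h2", "h2"],
    "POP3": ["h3", "h3", "h4", "h4"],
}
toy_matrix = distance_matrix(toy_populations)
```

A matrix of this kind is the input a neighbor-joining algorithm needs; populations sharing no haplotypes receive an infinite distance in this sketch, which a real analysis would have to handle explicitly.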
Following the preliminary diagnosis by microscopy and RDT of the 191 symptomatic malaria patients, 108 samples were found to be infected with P. vivax and the remaining 83 with P. falciparum. Further confirmation of differential P. falciparum and P. vivax infections came from PCR diagnosis with a nested 18S rRNA PCR assay. However, the PCR assays detected 15 isolates from Mangalore as mixed malaria infections with both P. falciparum and P. vivax. Therefore, we excluded these 15 samples from further analyses. The remaining 93 P. vivax single infections were further found to be single-clonal with msp3α and were thereafter analyzed with vir-specific primers. The distribution of isolates in each Indian population is depicted in Fig. 1.
The amplified sequence lengths for the vir genes of the present study ranged from 258 to 1,314 bp. The sequenced DNA fragments of the 5 vir genes were independently aligned (with the respective reference sequences of the Sal-I strain) and manually edited, and all insertions and deletions were removed. Multiple sequence alignment of the vir 1/9 gene showed 78 SNPs, of which only 3 were synonymous mutations, indicating high diversity in comparison with the Sal-I reference sequence. The NJ phylogenetic tree for 15 samples showed the presence of 2 principal clades. While the first clade comprises 9 samples (P. vivax isolates from DEL and GOA and the reference isolate), the second clade consists of 7 isolates (from DEL, GOA, and RKL; Fig. 2). Each clade was further divided into a number of subclades. There was no geographical clustering among the isolates, and the distribution appeared random, presenting profiles very similar to those observed for the previous vir genes [12]. Comparison of the number of segregating sites among the 5 vir genes revealed from as few as 2 SNPs (vir 4) to as many as 179 (vir 21) among Indian P. vivax populations. Among the populations, the sample from GOA had fewer segregating sites for vir 27, vir 12, and vir 1/9 than the other populations, and the maximum number of segregating sites for vir 21. Similarly, the number of haplotypes varied among the 5 different vir genes, with the lowest in vir 4 (3) and the highest in vir 12 (17). Furthermore, the number of haplotypes of vir genes in different populations varied from 1 to 8 (Tables 1, 2). Likewise, the haplotype diversity was the lowest in vir 4 (0.711) and the highest in vir 12 (0.962) (Table 1).
The nucleotide diversity parameters π and θw were calculated separately for each gene and population. The average value of π for all 5 genes was 0.047142, and the average value of θw was 0.03686 (Table 1). The average value of π was found to be higher than θw in vir 12, vir 21, and vir 1/9, but lower in vir 27 and almost equal in the vir 4 gene. The π and θw estimates were highest in vir 1/9 (π=0.11280 and θw=0.08805) and lowest in vir 4 (π=0.00068 and θw=0.00054) (Table 1; Fig. 2). Among all the 6 populations, GOA had the lowest nucleotide diversity in the vir 27 and vir 12 genes. In general, the pattern of nucleotide diversity as estimated by π and θw was quite variable across the 6 populations of P. vivax (Table 1), indicating high diversity among the vir genes in Indian populations. As with the estimates of nucleotide diversity, the Tajima's D values were quite variable across populations and among the vir genes (Tables 1, 2). Wide variations in the Tajima's D values therefore indicate ongoing molecular evolution of the vir genes in the Indian P. vivax populations.
In order to infer genetic interrelationships among Indian population samples of P. vivax with respect to 4 vir genes sequenced earlier [12], NJ trees were constructed based on the population pair-wise genetic distance matrix (see above). As shown in Fig. 3A-D, the placement of Indian population samples in the NJ phylogenetic tree was different for different vir genes. For example, in the NJ trees constructed with the vir 27 and vir 12 genes, the DEL and MNG populations fall in 1 cluster and the GOA and RKL populations in another, as observed with the vir 1/9 gene (see above). The placement of different Indian populations in the NJ phylogenetic trees largely points to no particular pattern of genetic relatedness among the Indian populations of P. vivax, as very different patterns were observed for each vir gene. On the one hand, the results therefore corroborate the earlier finding of no geographic sub-structuring of Indian P. vivax [7]; on the other hand, the high sequence diversity of the vir genes previously reported in India [12] has also been found for the vir 1/9 gene.
It is hypothesized that vir genes have a role in malaria pathogenesis [7,23], and that P. vivax uses the high sequence diversity in these genes to gain high virulence. Therefore, the study of the genetic diversity and evolutionary potential of the vir genes is essential to understand malaria transmission, disease severity, vaccine development, and various evolutionary aspects of malaria. Population genetic studies further allow us to comprehend the evolutionary history of P. vivax, and whether different genes are influenced by natural selection across different geographical regions [21]. Considering that India contributes substantially to the global endemicity of P. vivax malaria [1], it is essential to study the genetic diversity of vir genes in Indian populations. The calculated average nucleotide diversities for the 5 vir genes studied here were quite high (π=0.047142; θw=0.03686), indicating that Indian P. vivax populations maintain high genetic diversity [11]. Interestingly, the GOA population sample contains the least diversity in all aspects of the data in comparison with the other Indian populations, signifying a more conserved population of vir genes. Similarly, among the 5 different vir genes studied here, the vir 4 gene was found to contain the least genetic diversity, indicating that this gene might be under the influence of some evolutionary constraint. However, such a conclusion should be taken with caution, as we could sequence only a very low number of P. vivax isolates for this gene. In contrast, given the large number of haplotypes and the high haplotype diversity observed in vir 12, it can be concluded that vir 12 is the most diverse of the 5 genes studied here.
The NJ phylogenetic trees were constructed from the pairwise Nei's genetic distance matrix for each of the 4 vir genes. Interestingly, the placement of Indian populations was quite different for each of the vir genes. For example, in the NJ trees constructed with the vir 27 and vir 12 genes, DEL and MNG were placed in a single clade and GOA and RKL in another, signifying close genetic affinity between DEL and MNG with respect to these 2 vir genes. For these 2 genes, GOA and RKL also appear to be genetically very similar, as these 2 populations are also placed in a single clade, although geographically the 2 locations are far apart. It is known that genetic relatedness between geographically distant populations can arise owing to a common gene pool shared by the isolates in the past [11]. This phenomenon seems to be common in P. vivax populations in India, as very similar patterns of clustering were observed for the vir 21 and vir 1/9 genes. The overall observation in Indian P. vivax populations thus revealed that the existing diversity in vir genes was randomly distributed without any definite geographic pattern. The absence of any genetic structure among populations with differential malaria endemicity further corroborates this contention. Such genetic epidemiological differences spread across the country may also be responsible for the increased complexity of P. vivax infections, which is distinctly exhibited by the vir genes as observed in this study. This population-based study demonstrates differential levels of diversity in the different geographic regions, as also reported in other studies [12]. The differing Tajima's D values also suggest that the Indian P. vivax populations largely have a complex demographic history, as observed in an earlier study [11].
Whatever the case may be, population genetic studies with a larger number of vir genes, together with functional analyses, will provide more concrete knowledge of the population evolutionary history [8] of vir genes in India.
Owing to the limited number of genetic diversity studies in P. vivax populations (in comparison with P. falciparum), it is difficult to draw concrete conclusions on the genetic diversity of the vir genes observed in the present study. Although several studies on genetic polymorphisms in P. vivax have been conducted in worldwide populations [24], genetic diversity studies in Indian P. vivax are limited to some antigenic genes [25]. The results from all these studies cannot be compared, as the molecular markers used are not uniform across countries. Therefore, in order to construct a map of global genetic diversity patterns of vir genes, studies of the same vir genes across P. vivax malaria-endemic countries are essential. Coupled with functional studies, such a diversity map would not only inform on the extent of extant genetic diversity of vir genes, but also help in designing new vaccines for P. vivax malaria management.