Screening for viral AMGs
Annotation of viral proteins with VIBRANT and DRAM-v, combined with manual proofreading, identified viral AMGs in Napahai plateau wetland, including phoH, phoU and pstS, which are associated with phosphorus metabolism.
Phylogenetic analysis of AMGs associated with phosphorus metabolism in Napahai plateau wetland
There were 24 amino acid sequences of the phoH gene in Napahai plateau wetland (Fig. 1A). They fell into 5 clusters: the largest contained 10 sequences, the smallest only 1, and the remaining 3 clusters contained 6, 5 and 2 sequences, respectively. The phoH gene was therefore genetically diverse in Napahai plateau wetland, which might be related to different host origins. A total of 74 phoU sequences formed 7 clusters (Fig. 1B), with the largest cluster containing 27 sequences and the smallest containing 2. Like phoH, phoU was genetically diverse, and more so than phoH. There were 71 pstS sequences forming 9 clusters, with 23 sequences in the largest cluster and only 1 in the smallest (Fig. 1C). Thus the genetic diversity of pstS appeared higher than that of phoH and phoU, which might be related to the unique geographical location. Napahai plateau wetland is located in the Three Parallel Rivers of Yunnan Protected Areas, whose complex landscape shapes the evolution and characteristics of organisms and thus supports abundant biodiversity. Li et al. obtained 58 phoH gene sequences from Northeastern wetland sediments of China, which were 22%–99% consistent at the amino acid level, and found that the phoH gene could regulate phosphate uptake and metabolism under low-phosphate or phosphate-limited conditions16. However, its exact function remained unclear. By comparison, the phoH sequences in Napahai plateau wetland formed five clusters, indicating high genetic diversity. Additionally, water and soil samples were collected from eight separate sampling sites, and differences between the sample environments might also have affected the genetic diversity of the three genes.
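The amino-acid-level consistency quoted for the phoH sequences of Li et al. is a pairwise comparison between sequences. A minimal Python sketch of one common way to compute such a value, percent identity over an existing pairwise alignment, is given here. The function name, the example sequences and the rule of skipping gapped columns are assumptions made for illustration; the source does not state how the 22%–99% range was calculated.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two ALIGNED amino acid sequences.

    Assumed convention: columns in which either sequence has a gap are
    skipped, and identity is matches / compared columns * 100.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped column: not counted under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0  # no overlapping residues to compare
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not real phoH sequences).
example = percent_identity("MKT-LV", "MKSALV")
```

Other conventions divide by the full alignment length or by the shorter sequence, so identities from different studies are comparable only if the same denominator was used.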
Phylogenetic analysis and PCoA analysis of AMGs associated with phosphorus metabolism from different habitats and host origins
To understand the genetic diversity of the viral AMGs (phoH, phoU, pstS) associated with phosphorus metabolism in Napahai plateau wetland, a phylogenetic tree of phosphorus metabolism AMGs from different habitats was constructed and PCoA was performed (Fig. 2). The results showed that most sequences of the phoH, phoU and pstS genes in Napahai plateau wetland clustered individually, especially those of phoU and pstS, and only a few sequences were closely related to those of other habitats. In Fig. 2A, 14 sequences clustered individually and were relatively far from sequences of other habitats, whereas 7 sequences were close to those from freshwater lakes, and the other 3 sequences were close to those from rice fields, oceans and other wetlands, respectively. Therefore, the genetic diversity of phoH in Napahai plateau wetland was independent of habitat. Moreover, some of the phoH sequences clustered with those of other habitats and were distributed in the fourth quadrant (Fig. 2D). In Fig. 2B, apart from 3 sequences that clustered with those from marine habitats and freshwater lakes, the rest clustered separately. Likewise, in Fig. 2E, all but a few phoU sequences were far from those of other habitats, which was consistent with Fig. 2B. Thus, the genetic diversity of the phoU gene in Napahai plateau wetland was also independent of habitat, and the separately clustered sequences may be unique. Fig. 2C shows that, apart from 8 sequences that were more closely related to those from freshwater lakes, oceans, rice fields and other wetlands, all the rest clustered individually. This result was consistent with Fig. 2F. Therefore, the genetic diversity of the pstS gene was also habitat-independent.
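As a rough illustration of what a PCoA such as that in Fig. 2D–F computes, the following is a minimal Python sketch of classical principal coordinates analysis on a pairwise distance matrix. The function name `pcoa`, the toy distance matrix and the use of power iteration to extract the leading axes are assumptions made here for illustration. The source does not state which software or distance measure was used, and the sketch assumes that the leading eigenvalues are positive and distinct.

```python
import math


def _matvec(m, v):
    """Multiply a square matrix (list of lists) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def pcoa(dist, n_axes=2, n_iter=500):
    """Classical PCoA of a symmetric distance matrix.

    Returns (coords, eigenvalues): coords[i] holds the position of
    sequence i on the first n_axes principal coordinates.
    """
    n = len(dist)
    # A = -0.5 * D^2, then double-centre so rows and columns sum to zero.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * n_axes for _ in range(n)]
    eigenvalues = []
    for axis in range(n_axes):
        # Power iteration from a fixed, non-constant start vector.
        v = [float(i + 1) for i in range(n)]
        for _ in range(n_iter):
            w = _matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        vv = sum(x * x for x in v)
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        eigval = sum(x * y for x, y in zip(v, _matvec(b, v))) / vv
        eigenvalues.append(eigval)
        if eigval > 1e-9:
            scale = math.sqrt(eigval / vv)
            for i in range(n):
                coords[i][axis] = v[i] * scale
        # Deflate: remove this axis before searching for the next one.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j] / vv
    return coords, eigenvalues


# Hypothetical distances between three sequences (a 3-4-5 triangle).
toy_dist = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
toy_coords, toy_eigenvalues = pcoa(toy_dist)
```

Sequences that sit far apart on the resulting axes are dissimilar under the chosen distance, which is how separately placed Napahai sequences can be read in an ordination plot. For real data, an established implementation would normally be preferable to this hand-written eigen-solver.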
To study whether the genetic diversity was related to host origin, the three AMGs associated with phosphorus metabolism were subjected to phylogenetic and PCoA analyses together with AMG sequences from different host origins (Fig. 3). The results showed that some sequences of all three genes were similar to those from different host origins, while the remainder clustered separately. In Fig. 3A, apart from 14 sequences that clustered with those from fungi, bacteria, uncultured phages, phages and viruses, all the rest clustered separately. In the PCoA, however, most sequences clustered together with those from different host origins, and only six sequences were far from the sequences of other host origins (Fig. 3D). Only three sequences clustered with those of archaea and uncultured archaea, and the rest formed independent clusters (Fig. 3B). A small number of sequences grouped with those of bacteria, uncultured bacteria, archaea and uncultured archaea, and the rest clustered individually (Fig. 3E). As shown in Fig. 3C, six sequences clustered with those of archaea, fungi and bacteria, while the rest clustered separately. Some sequences grouped with those of bacteria, uncultured bacteria, archaea and uncultured archaea, and the others clustered separately (Fig. 3F). The PCoA results were largely consistent with the phylogenetic analysis. Thus, the genetic diversity of the phoH, phoU and pstS genes in Napahai plateau wetland was independent of host origin.
Overall, based on phylogenetic and PCoA analyses, the genetic diversity of the phoH, phoU and pstS genes associated with phosphorus metabolism in Napahai plateau wetland was independent of both habitat and host origin. This suggests that the three genes showed relatively rich genetic diversity and were not genetically limited by differences in habitat or host origin. Han et al. showed that phoH sequences were widely distributed in soil, freshwater and seawater environments at different locations around the world, indicating genetic diversity independent of the environment17, which corroborates the conclusions of our study. Phylogenetic analysis of the 58 viral phoH gene sequences from Northeastern wetland of China revealed that some sequences clustered with bacterial sequences and others with phage sequences16. In Napahai plateau wetland, some phoH gene sequences clustered with fungal, bacterial, phage, uncultured phage and viral sequences. Hence, the genetic diversity of the phoH gene was independent of host origin in both Northeastern wetland and Napahai plateau wetland. Compared with Northeastern wetland, the phoH genes in Napahai plateau wetland showed more abundant genetic diversity, which may be related to geographical location and climate. Additionally, when compared with sequences from different habitats and host sources, some sequences from Napahai plateau wetland clustered individually and thus appeared to be unique, which might be related to the unique geography. Napahai plateau wetland is located in the Three Parallel Rivers region at low latitude and high altitude and shows specific characteristics not found in other habitats, so its species differ markedly from those elsewhere, which makes the emergence of unique genetic sequences possible. This interpretation requires verification in subsequent studies.
To date, most reports on phosphorus AMGs have focused on function. Wang et al. reported that the phoH gene regulated phosphate uptake or metabolism under low-phosphorus or phosphate-limited conditions18. Kelly et al. isolated several phages from oligotrophic, low-phosphorus water bodies and found by sequencing that they contained the phosphate-binding transporter gene pstS, which increased phosphate utilization by the host cell and thereby benefited the infection cycle of the phages19. Gardner et al. studied the PhoR-PhoB two-component regulatory system in E. coli, which regulated the expression of relevant genes according to the environmental phosphate concentration and enabled cells to adapt to phosphate starvation20. phoU exists in many bacteria and has been identified as an auxiliary protein of the phosphate-specific transport system that acts as a regulator of phosphate metabolism in the host cell21. In contrast, few studies have addressed the genetic diversity of phosphorus AMGs, so information on their genetic diversity is relatively scarce.
α diversity analysis of phosphorus metabolism AMGs in different habitats and different host origins
The Chao, Shannon and Simpson diversity indices are common mathematical measures of species alpha diversity in a community. The Chao index focuses on species richness, whereas the Shannon and Simpson indices reflect both species richness and evenness: the Simpson index gives more weight to evenness and the Shannon index to richness22.
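To make the three indices concrete, the following is a minimal Python sketch that computes them from a list of per-OTU sequence counts. The specific forms used (bias-corrected Chao1, Shannon with the natural logarithm, and Simpson as the dominance sum of squared proportions, for which smaller values mean higher diversity) are assumptions chosen to match the interpretation given in this section; the source does not state which software or variants were used. The function names and the example counts are hypothetical.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU counts."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                    # observed OTUs
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i); larger means more diverse."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


def simpson(counts):
    """Simpson dominance D = sum(p_i^2); smaller means more diverse."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical OTU counts for one gene in one habitat.
example_counts = [4, 2, 1, 1]
example_indices = (chao1(example_counts),
                   shannon(example_counts),
                   simpson(example_counts))
```

Under these definitions, a sample whose sequences all fall into a single OTU has a Shannon value of 0 and a Simpson value of 1, regardless of how many sequences it contains.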
Sequences from different habitats, such as Napahai plateau wetland, the Pacific Ocean, Lake Baikal, Northeast rice fields, glaciers and wetlands, were selected for α-diversity analysis (Fig. 4). The Chao, Shannon and Simpson indices, calculated from the OTU dataset, were used to characterize alpha diversity. Larger Chao values, smaller Simpson values or larger Shannon values indicate higher genetic diversity. In terms of the Chao values (Fig. 4A,D,G) and Shannon values (Fig. 4B,E,H), phoH, phoU and pstS in Napahai plateau wetland scored higher than in other habitats, indicating higher genetic diversity, which might be related to the unique geographical location and abundant water resources. The geographical location makes the wetland distinctive and less influenced by external factors, and the abundant water resources support rich biodiversity, thus providing a favorable genetic environment. The Simpson values (Fig. 4C,F,I) of the phoU and pstS genes were smaller than those of other habitats, also indicating higher genetic diversity. For the phoH gene, the Simpson value was close in magnitude to, and lower than, those of Antarctic Lake and wetlands, again indicating higher genetic diversity.
The three AMGs associated with phosphorus metabolism in Napahai plateau wetland were then subjected to α-diversity analysis together with AMG sequences from different host origins (Fig. 5). In Fig. 5A, the Chao values of the phoH gene from bacteria, phages, uncultured phages and uncultured viruses in Napahai plateau wetland were smaller than those of the bacteria, phages, uncultured phages and uncultured viruses used for comparison, indicating poorer genetic diversity. In addition, compared with sequences from the other host sources, the phoH gene from bacteria in Napahai plateau wetland had higher genetic diversity. As shown in Fig. 5D,G, the Chao values of the phoU and pstS genes from bacteria in Napahai plateau wetland were greater than those of other host origins, indicating higher genetic diversity, whereas the Chao value of the pstS gene from archaea in Napahai plateau wetland was smaller than those of other host origins, indicating poorer genetic diversity.
The Shannon value of the phoH gene from bacteria in Napahai plateau wetland was smaller than those of bacteria and uncultured viruses, indicating poorer diversity, but larger than those of the other host sources, indicating higher genetic diversity (Fig. 5B). The Shannon values of the phoH gene from phages and uncultured phages in Napahai plateau wetland were lower than those of other host origins, indicating poorer diversity. The Shannon value of the phoH gene from uncultured viruses in Napahai plateau wetland was 0, probably because the sample size was too small to calculate a meaningful Shannon value. In Fig. 5E,H, the Shannon values of the phoU and pstS genes from bacteria in Napahai plateau wetland were greater than those from other host sources, indicating higher diversity, while the Shannon value of the pstS gene from archaea in Napahai plateau wetland was 0, probably also because of the small sample size.
The Simpson values of the phoH gene from phages, uncultured phages and uncultured viruses in Napahai plateau wetland were smaller than those of other host origins (except uncultured viruses), indicating higher diversity. Simpson values of the phoH gene that were smaller than those of fungi, phages, uncultured phages and viruses indicated higher diversity, while those that were larger than the values of bacteria, phages and uncultured viruses indicated poorer diversity (Fig. 5C). As shown in Fig. 5F,I, the Simpson values of the phoU gene from bacteria and of the pstS gene from bacteria and archaea in Napahai plateau wetland were smaller than those of other host origins, indicating higher genetic diversity.
Currently, most studies on phosphorus AMGs have employed phylogenetic analysis16,23. In contrast, few studies have reported α-diversity analysis of phosphorus-associated AMGs, so specific α-diversity index values from other studies were difficult to obtain for comparison.
Biogeochemical cycling of AMGs associated with phosphorus metabolism in Napahai plateau wetland
Viruses are gene carriers in susceptible hosts, and AMGs introduced by viruses into new hosts can enhance viral replication and/or influence key microbial metabolic pathways in biogeochemical cycles24. Phosphorus is an essential nutrient and plays essential roles in cells25. Phosphorus deficiency leads to restricted cell division, down-regulation of photosynthesis, and reduced protein content, nitrogen content and chlorophyll synthesis26. To study the effect of AMGs associated with phosphorus metabolism, a phosphorus metabolic pathway containing the phoH, phoU and pstS genes was constructed based on metagenomic data (Fig. 6). When phosphorus deficiency occurs in the host, it induces the expression of the phoH, phoU and pstS genes. phoH is a phosphate starvation-inducible gene, pstS is a phosphate transport gene, and phoU is a phosphate regulatory gene whose action yields dissolved inorganic phosphorus (DIP), which then undergoes a series of reactions to produce ATP. The generated ATP is converted into PolyP by polyphosphate kinase, encoded by ppK, or is used in the Calvin cycle to provide energy for the conversion of Ru5P to RuBP, or provides energy for DNA biosynthesis. PolyP is regenerated into DIP by exopolyphosphatase, encoded by ppX, and also takes part in DNA biosynthesis as Pi, providing phosphate for nucleic acid synthesis. Thus, the phosphorus metabolism in which these AMGs are involved plays a significant role in the life processes of the virus and host. In addition, the phoE and ugpQ genes were also identified in Napahai plateau wetland, but their roles in phosphorus cycling are currently unknown and need further study.
Based on the phylogenetic and PCoA analyses, we found that the phoH, phoU and pstS genes all included unique sequences, which might drive the microorganisms to form this phosphorus metabolic pathway in Napahai plateau wetland. To confirm the pathway, further validation by metabolomics or metabolic flux analysis is needed. Furthermore, phosphorus metabolic pathways have rarely been reported, so we could not compare this pathway with those from other environments to identify commonalities and differences.