MetaPhage: an Automated Pipeline for Analyzing, Annotating, and Classifying Bacteriophages in Metagenomics Sequencing Data
- PMID: 36069454
- PMCID: PMC9599279
- DOI: 10.1128/msystems.00741-22
Abstract
Phages are the most abundant biological entities on the planet, and they play an important role in controlling density, diversity, and network interactions among bacterial communities through predation and gene transfer. To date, a variety of bacteriophage identification tools have been developed that differ in the phage mining strategies used, input files requested, and results produced. However, new users attempting bacteriophage analysis can struggle to select the best methods and interpret the variety of results produced. Here, we present MetaPhage, a comprehensive reads-to-report pipeline that streamlines the use of multiple phage miners and generates an exhaustive report. The report both summarizes and visualizes the key findings and enables further exploration of key results via interactive filterable tables. The pipeline is implemented in Nextflow, a widely adopted workflow manager that enables an optimized parallelization of tasks in different locations, from local server to the cloud; this ensures reproducible results from containerized packages. MetaPhage is designed to enable scalability and reproducibility; also, it can be easily expanded to include new miners and methods as they are developed in this continuously growing field. MetaPhage is freely available under a GPL-3.0 license at https://github.com/MattiaPandolfoVR/MetaPhage.
IMPORTANCE Bacteriophages (viruses that infect bacteria) are the most abundant biological entities on earth and are increasingly studied as members of the resident microbiota community in many environments, from oceans to soils and the human gut. Their identification is of great importance to better understand complex bacterial dynamics and microbial ecosystem function. A variety of metagenome bacteriophage identification tools have been developed that differ in the phage mining strategies used, input files requested, and results produced.
To facilitate the management and the execution of such a complex workflow, we developed MetaPhage (MP), a comprehensive reads-to-report pipeline that streamlines the use of multiple phage miners and generates an exhaustive report. The pipeline is implemented in Nextflow, a widely adopted workflow manager that enables an optimized parallelization of tasks. MetaPhage is designed to enable scalability and reproducibility and offers an installation-free, dependency-free, and conflict-free workflow execution.
Keywords: NGS; bacteriophages; bioinformatics; metagenomics; phage mining.
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- What the Phage: a scalable workflow for the identification and analysis of phage sequences. Gigascience. 2022 Nov 18;11:giac110. doi: 10.1093/gigascience/giac110. PMID: 36399058 Free PMC article.
- ViroProfiler: a containerized bioinformatics pipeline for viral metagenomic data analysis. Gut Microbes. 2023 Jan-Dec;15(1):2192522. doi: 10.1080/19490976.2023.2192522. PMID: 36998174 Free PMC article.
- Biofilm marker discovery with cloud-based dockerized metagenomics analysis of microbial communities. Brief Bioinform. 2024 Jul 23;25(Supplement_1):bbae429. doi: 10.1093/bib/bbae429. PMID: 39266450 Free PMC article.
- Phages in the Gut Ecosystem. Front Cell Infect Microbiol. 2022 Jan 4;11:822562. doi: 10.3389/fcimb.2021.822562. eCollection 2021. PMID: 35059329 Free PMC article. Review.
- Ecological and functional roles of bacteriophages in contrasting environments: marine, terrestrial and human gut. Curr Opin Microbiol. 2022 Dec;70:102229. doi: 10.1016/j.mib.2022.102229. Epub 2022 Nov 5. PMID: 36347213 Review.
Cited by
- Benchmarking informatics approaches for virus discovery: caution is needed when combining in silico identification methods. mSystems. 2024 Mar 19;9(3):e0110523. doi: 10.1128/msystems.01105-23. Epub 2024 Feb 20. PMID: 38376167 Free PMC article.
- Ecogenomics and cultivation reveal distinctive viral-bacterial communities in the surface microlayer of a Baltic Sea slick. ISME Commun. 2023 Sep 18;3(1):97. doi: 10.1038/s43705-023-00307-8. PMID: 37723220 Free PMC article.
- Metagenomic investigation of viruses in green sea turtles (Chelonia mydas). Front Microbiol. 2025 Jan 22;16:1492038. doi: 10.3389/fmicb.2025.1492038. eCollection 2025. PMID: 39911250 Free PMC article.
- Gastrointestinal jumbo phages possess independent synthesis and utilization systems of NAD. Microbiome. 2024 Dec 20;12(1):268. doi: 10.1186/s40168-024-01984-w. PMID: 39707494 Free PMC article.
- Comparative Analyses of Bacteriophage Genomes. Methods Mol Biol. 2024;2802:427-453. doi: 10.1007/978-1-0716-3838-5_14. PMID: 38819567
Grants and funding
- BBS/E/F/000PR10353/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BB/R012490/1/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BBS/E/F/000PR10355/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BB/R506552/1/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BBS/E/F/000PR10356/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom