Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences
- PMID: 21989143
- PMCID: PMC3194235
- DOI: 10.1186/1471-2164-12-S2-S4
Abstract
Background: A major goal of metagenomics is to characterize the microbial composition of an environment. The most popular approach relies on 16S rRNA sequencing; however, this approach can generate biased estimates because the copy number of the gene differs even between closely related organisms, and because of PCR artifacts. The taxonomic composition can also be determined from metagenomic shotgun sequencing data by matching individual reads against a database of reference sequences. One major limitation of prior computational methods used for this purpose is that they apply a single universal classification threshold to all genes at all taxonomic levels.
Results: We propose that better classification results can be obtained by tuning the taxonomic classifier to each matching length, reference gene, and taxonomic level. We present a novel taxonomic classifier, MetaPhyler (http://metaphyler.cbcb.umd.edu), which uses phylogenetic marker genes as a taxonomic reference. Results on simulated datasets demonstrate that MetaPhyler outperforms other tools commonly used in this context (CARMA, MEGAN and PhymmBL). We also present results from the analysis of a real metagenomic dataset.
Conclusions: We have introduced a novel taxonomic classification method for analyzing the microbial diversity from whole-metagenome shotgun sequences. Compared with previous approaches, MetaPhyler is much more accurate in estimating the phylogenetic composition. In addition, we have shown that MetaPhyler can be used to guide the discovery of novel organisms from metagenomic samples.
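The idea stated in the Results, that the classification threshold should depend on the reference gene, the taxonomic level, and the matching length rather than being one universal value, can be illustrated with a minimal sketch. This is not MetaPhyler's implementation: the linear form of the cutoff, the gene name `rpoB`, the numeric values, and the names `LinearCutoff`, `CUTOFFS`, and `deepest_level` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Taxonomic levels ordered from most to least specific.
LEVELS = ["genus", "family", "order", "class", "phylum"]


@dataclass(frozen=True)
class LinearCutoff:
    """Hypothetical score cutoff that grows linearly with alignment length."""
    slope: float
    intercept: float

    def at(self, align_len: int) -> float:
        return self.slope * align_len + self.intercept


# Made-up cutoffs keyed by (reference gene, taxonomic level).
# A real classifier would learn these from reference sequences.
CUTOFFS = {
    ("rpoB", "genus"): LinearCutoff(1.8, 5.0),
    ("rpoB", "family"): LinearCutoff(1.5, 5.0),
    ("rpoB", "phylum"): LinearCutoff(1.0, 5.0),
}


def deepest_level(gene: str, align_len: int, bit_score: float,
                  cutoffs=CUTOFFS) -> Optional[str]:
    """Return the most specific level whose cutoff the hit satisfies.

    Levels without a cutoff for this gene are skipped; None means the
    hit is too weak to be classified at any level.
    """
    for level in LEVELS:
        cutoff = cutoffs.get((gene, level))
        if cutoff is not None and bit_score >= cutoff.at(align_len):
            return level
    return None
```

Under these made-up values, a hit with score 120 over a 60-residue alignment would be assigned at the genus level, while the same score over a 100-residue alignment would only support a phylum-level assignment. A single universal threshold cannot express that dependence on matching length.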
Similar articles
- WGSQuikr: fast whole-genome shotgun metagenomic classification. PLoS One. 2014 Mar 13;9(3):e91784. doi: 10.1371/journal.pone.0091784. eCollection 2014. PMID: 24626336 Free PMC article.
- Deep learning models for bacteria taxonomic classification of metagenomic data. BMC Bioinformatics. 2018 Jul 9;19(Suppl 7):198. doi: 10.1186/s12859-018-2182-6. PMID: 30066629 Free PMC article.
- VITCOMIC2: visualization tool for the phylogenetic composition of microbial communities based on 16S rRNA gene amplicons and metagenomic shotgun sequencing. BMC Syst Biol. 2018 Mar 19;12(Suppl 2):30. doi: 10.1186/s12918-018-0545-2. PMID: 29560821 Free PMC article.
- Reference databases for taxonomic assignment in metagenomics. Brief Bioinform. 2012 Nov;13(6):682-95. doi: 10.1093/bib/bbs036. Epub 2012 Jul 10. PMID: 22786784 Review.
- Benchmarking Metagenomics Tools for Taxonomic Classification. Cell. 2019 Aug 8;178(4):779-794. doi: 10.1016/j.cell.2019.07.010. PMID: 31398336 Free PMC article. Review.
Cited by
- Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains. Microbiome. 2016 May 3;4(1):18. doi: 10.1186/s40168-016-0162-5. PMID: 27138046 Free PMC article.
- CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers. BMC Genomics. 2015 Mar 25;16(1):236. doi: 10.1186/s12864-015-1419-2. PMID: 25879410 Free PMC article.
- Critical Assessment of Metagenome Interpretation: the second round of challenges. Nat Methods. 2022 Apr;19(4):429-440. doi: 10.1038/s41592-022-01431-4. Epub 2022 Apr 8. PMID: 35396482 Free PMC article.
- Lightweight taxonomic profiling of long-read metagenomic datasets with Lemur and Magnet. bioRxiv [Preprint]. 2024 Aug 25:2024.06.01.596961. doi: 10.1101/2024.06.01.596961. PMID: 38895276 Free PMC article. Preprint.
- Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software. Nat Methods. 2017 Nov;14(11):1063-1071. doi: 10.1038/nmeth.4458. Epub 2017 Oct 2. PMID: 28967888 Free PMC article.