SVIM: structural variant identification using mapped long reads
- PMID: 30668829
- PMCID: PMC6735718
- DOI: 10.1093/bioinformatics/btz041
Abstract
Motivation: Structural variants are defined as genomic variants larger than 50 bp. They have been shown to affect more bases in any given genome than single-nucleotide polymorphisms or small insertions and deletions. Additionally, they have a great impact on human phenotype and diversity and have been linked to numerous diseases. Due to their size and association with repeats, they are difficult to detect by shotgun sequencing, especially when based on short reads. Long-read, single-molecule sequencing technologies like those offered by Pacific Biosciences or Oxford Nanopore Technologies produce reads with a length of several thousand base pairs. Despite the higher error rate and sequencing cost, long-read sequencing offers many advantages for the detection of structural variants. Yet, available software tools still do not fully exploit these possibilities.
Results: We present SVIM, a tool for the sensitive detection and precise characterization of structural variants from long-read data. SVIM consists of three components for the collection, clustering and combination of structural variant signatures from read alignments. It discriminates five different variant classes including similar types, such as tandem and interspersed duplications and novel element insertions. SVIM is unique in its capability of extracting both the genomic origin and destination of duplications. It compares favorably with existing tools in evaluations on simulated data and real datasets from Pacific Biosciences and Nanopore sequencing machines.
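The collect-then-cluster idea described above can be illustrated with a minimal Python sketch. It is not SVIM's implementation: the function names, the greedy position-based clustering, the `max_distance` and `min_support` parameters, and the final summarizing step are all assumptions made for illustration. In particular, the last step only reports well-supported clusters and does not reproduce SVIM's combination of signatures into duplication or novel-insertion classes. The 50 bp threshold follows the definition given in the Motivation.

```python
import re
from collections import namedtuple
from statistics import median_low

# Hypothetical record for one structural variant signature seen in one read.
Signature = namedtuple("Signature", ["sv_type", "start", "end", "read"])

MIN_SV_SIZE = 50  # assumed cutoff: variants larger than 50 bp
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def collect_signatures(read_name, ref_start, cigar, min_size=MIN_SV_SIZE):
    """Scan one alignment's CIGAR string for large insertions and deletions."""
    signatures = []
    ref_pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in ("D", "I") and length > min_size:
            sv_type = "DEL" if op == "D" else "INS"
            # For insertions, end - start encodes the inserted length.
            signatures.append(Signature(sv_type, ref_pos, ref_pos + length, read_name))
        if op in "MDN=X":  # operations that consume reference bases
            ref_pos += length
    return signatures


def cluster_signatures(signatures, max_distance=100):
    """Greedily group same-type signatures whose starts lie close together."""
    clusters = []
    for sig in sorted(signatures, key=lambda s: (s.sv_type, s.start)):
        last = clusters[-1] if clusters else None
        if (last and last[-1].sv_type == sig.sv_type
                and sig.start - last[-1].start <= max_distance):
            last.append(sig)
        else:
            clusters.append([sig])
    return clusters


def summarize_clusters(clusters, min_support=2):
    """Report one candidate per cluster supported by enough distinct reads."""
    calls = []
    for cluster in clusters:
        reads = {s.read for s in cluster}
        if len(reads) >= min_support:
            calls.append((cluster[0].sv_type,
                          median_low(s.start for s in cluster),
                          median_low(s.end for s in cluster),
                          len(reads)))
    return calls


# Toy alignments: (read name, reference start, CIGAR string).
alignments = [
    ("read1", 1000, "500M120D300M"),
    ("read2", 1100, "410M118D200M"),
    ("read3", 900, "200M30D100M"),   # 30 bp deletion, below the cutoff
    ("read4", 5000, "100M80I100M"),  # insertion seen in a single read only
]
signatures = [s for a in alignments for s in collect_signatures(*a)]
print(summarize_clusters(cluster_signatures(signatures)))
```

In this toy input, two reads support a deletion at nearly the same position, so only that cluster passes the assumed support threshold. A real caller would also need signatures from split alignments, which CIGAR strings alone do not capture.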
Availability and implementation: The source code and executables of SVIM are available on GitHub: github.com/eldariont/svim. SVIM is implemented in Python 3 and published on bioconda and the Python Package Index.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.
Similar articles
- SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics. 2021 Apr 1;36(22-23):5519-5521. doi: 10.1093/bioinformatics/btaa1034. PMID: 33346817.
- lordFAST: sensitive and Fast Alignment Search Tool for LOng noisy Read sequencing Data. Bioinformatics. 2019 Jan 1;35(1):20-27. doi: 10.1093/bioinformatics/bty544. PMID: 30561550.
- Discovery of tandem and interspersed segmental duplications using high-throughput sequencing. Bioinformatics. 2019 Oct 15;35(20):3923-3930. doi: 10.1093/bioinformatics/btz237. PMID: 30937433.
- A comprehensive evaluation of long read error correction methods. BMC Genomics. 2020 Dec 21;21(Suppl 6):889. doi: 10.1186/s12864-020-07227-0. PMID: 33349243. Review.
- The impact of long-read sequencing on human population-scale genomics. Genome Res. 2025 Apr 14;35(4):593-598. doi: 10.1101/gr.280120.124. PMID: 40228902. Review.
Cited by
- Fundamental Patterns of Structural Evolution Revealed by Chromosome-Length Genomes of Cactophilic Drosophila. Genome Biol Evol. 2024 Sep 3;16(9):evae191. doi: 10.1093/gbe/evae191. PMID: 39228294.
- Cas9 targeted nanopore sequencing with enhanced variant calling improves CYP2D6-CYP2D7 hybrid allele genotyping. PLoS Genet. 2022 Sep 23;18(9):e1010176. doi: 10.1371/journal.pgen.1010176. eCollection 2022 Sep. PMID: 36149915.
- FindCSV: a long-read based method for detecting complex structural variations. BMC Bioinformatics. 2024 Sep 28;25(1):315. doi: 10.1186/s12859-024-05937-w. PMID: 39342151.
- HapKled: a haplotype-aware structural variant calling approach for Oxford nanopore sequencing data. Front Genet. 2024 Jul 9;15:1435087. doi: 10.3389/fgene.2024.1435087. eCollection 2024. PMID: 39045321.
- Whole-genome long-read sequencing downsampling and its effect on variant-calling precision and recall. Genome Res. 2023 Dec 27;33(12):2029-2040. doi: 10.1101/gr.278070.123. PMID: 38190646.