Bioinformatics. 2019 Nov 1;35(22):4754-4756.
doi: 10.1093/bioinformatics/btz431.

ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions

Egor Dolzhenko et al. Bioinformatics. 2019.

Abstract

Summary: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci.

Availability and implementation: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Overview of ExpansionHunter. (a) A locus definition is read from the variant catalog file. (b) A sequence graph is constructed according to the locus specification in the variant catalog. (c) Relevant reads are extracted from the input binary alignment/map (BAM) file. (d) Reads are aligned to the graph. (e) Alignments are pieced together to genotype each variant.
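
Steps (b) and (d) of the overview can be illustrated with a toy example. The sketch below is not ExpansionHunter's implementation (which is written in C++); it is a minimal illustration, assuming a locus made of a left flank, a single repeat unit, and a right flank, and assuming the read spans the whole locus with no mismatches. The names `SequenceGraph`, `build_repeat_graph`, and `count_repeat_units` are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SequenceGraph:
    """Toy sequence graph: nodes hold sequences, edges are (from, to) index pairs."""
    nodes: list[str]
    edges: set[tuple[int, int]] = field(default_factory=set)


# Node indices used by this toy layout (an assumption of the sketch).
LEFT, REPEAT, RIGHT = 0, 1, 2


def build_repeat_graph(left_flank: str, repeat_unit: str, right_flank: str) -> SequenceGraph:
    """Build a three-node graph: left flank -> repeat unit (self-loop) -> right flank."""
    if not repeat_unit:
        raise ValueError("repeat_unit must be non-empty")
    graph = SequenceGraph(nodes=[left_flank, repeat_unit, right_flank])
    graph.edges = {
        (LEFT, REPEAT),    # enter the repeat
        (REPEAT, REPEAT),  # self-loop: one more copy of the repeat unit
        (REPEAT, RIGHT),   # leave the repeat
        (LEFT, RIGHT),     # skip the repeat entirely (zero copies)
    }
    return graph


def count_repeat_units(graph: SequenceGraph, read: str) -> int | None:
    """Walk an exactly matching, locus-spanning read through the graph.

    Returns the number of times the repeat node is visited, or None if the
    read does not spell a complete left-to-right path.
    """
    if not read.startswith(graph.nodes[LEFT]):
        return None
    pos = len(graph.nodes[LEFT])
    node = LEFT
    count = 0
    while True:
        # Exiting through the right flank must consume the rest of the read.
        if (node, RIGHT) in graph.edges and read[pos:] == graph.nodes[RIGHT]:
            return count
        # Otherwise try to take one more step through the repeat node.
        if (node, REPEAT) in graph.edges and read.startswith(graph.nodes[REPEAT], pos):
            pos += len(graph.nodes[REPEAT])
            node = REPEAT
            count += 1
        else:
            return None


graph = build_repeat_graph("ACGT", "CAG", "TTGA")
spanning_read = "ACGT" + "CAG" * 3 + "TTGA"
repeat_count = count_repeat_units(graph, spanning_read)
```

Here the self-loop on the repeat node is what lets a single graph represent every possible repeat length. A real genotyper must additionally handle sequencing errors, reads that only partially overlap the locus, and diploid genotypes, none of which this sketch attempts.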
