HGG Adv. 2021 Apr 8;2(2):100023.
doi: 10.1016/j.xhgg.2021.100023. Epub 2021 Jan 16.

Long-read genome sequencing for the molecular diagnosis of neurodevelopmental disorders

Susan M Hiatt et al. HGG Adv.

Abstract

Exome and genome sequencing have proven to be effective tools for the diagnosis of neurodevelopmental disorders (NDDs), but large fractions of NDDs cannot be attributed to currently detectable genetic variation. This is likely, at least in part, a result of the fact that many genetic variants are difficult or impossible to detect through typical short-read sequencing approaches. Here, we describe a genomic analysis using Pacific Biosciences circular consensus sequencing (CCS) reads, which are both long (>10 kb) and accurate (>99% bp accuracy). We used CCS on six proband-parent trios with NDDs that were unexplained despite extensive testing, including genome sequencing with short reads. We identified variants and created de novo assemblies in each trio, with global metrics indicating these datasets are more accurate and comprehensive than those provided by short-read data. In one proband, we identified a likely pathogenic (LP), de novo L1-mediated insertion in CDKL5 that results in duplication of exon 3, leading to a frameshift. In a second proband, we identified multiple large de novo structural variants, including insertion-translocations affecting DGKB and MLLT3, which we show disrupt MLLT3 transcript levels. We consider this extensive structural variation likely pathogenic. The breadth and quality of variant detection, coupled to finding variants of clinical and research interest in two of six probands with unexplained NDDs, support the hypothesis that long-read genome sequencing can substantially improve rare disease genetic discovery rates.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1
Proband 6 has a de novo insertion resulting in duplication of exon 3 of CDKL5. (A) Ideogram showing location of CDKL5 on chromosome X. Ideogram is from the NCBI Genome Decoration Page. (B) Gene structure of CDKL5, RS1, and PPEF1, indicating the location of the 6,993 bp insertion in CDKL5 (blue/red/gray bars) and location of the origin of the duplicated PPEF1 intronic sequence (red). (C) Zoomed-in view of the insertion. The gray box indicates the entire 6,993 nt insertion, which consists of a partial L1HS retrotransposon (blue box), duplicated PPEF1 intronic sequence (red box), and target site duplication (TSD, yellow box) with duplicated exon 3 (3∗). Green boxes indicate RepeatMasker annotation of the proband’s insertion-bearing contig sequence. (D) Alignment of CCS reads near exon 3 of CDKL5 in IGV in proband 6 and her parents. Gray reads represent alignment to reference, and multicolor alignments represent unaligned ends of reads. The TSD is indicated by a yellow box. Reads highlighted by the pink box include examples of reads that align to reference upstream of the insertion, contain the TSD, and then have inserted sequence at their 3′ end. Those highlighted in the turquoise box represent inserted sequence, TSD, and reference sequence downstream of the insertion. Note that some reads have hard-clipped bases, which are designated with a black diamond.
Figure 2
The duplicated CDKL5 exon 3 is present in a subset of the proband’s CDKL5 transcripts. (A) RT-PCR using primers specific to exons 2–5 of CDKL5 cDNA results in a 240 bp amplicon in proband (P), dad (D), and mom (M). An additional 275 bp amplicon is present only in the proband (asterisk). (B) Sanger sequencing of both amplicons from the proband confirmed that the 240 bp amplicon contains the normal, expected sequence and that the upper, 275 bp band includes a duplicated exon 3. This is predicted to lead to a frameshift (red circle) and downstream stop, p.Thr35ProfsTer52. Yellow outlined box, exon 3 sequence; orange outlined box, duplicated exon 3 sequence.
Figure 3
Proband 4 has several large structural changes on chromosome 6. (A) Ideogram with annotation of chromosome 6 breakpoints identified in proband 4, including pericentric inversion breakpoints (pinv1, pinv2) and multiple breakpoints of a complex genomic rearrangement (red arrows). Ideogram is from the NCBI Genome Decoration Page. (B) Schematic of proband 4’s maternal (pink box) and paternal (blue box) chromosome 6 structures. The maternal structure matches reference, while the paternally inherited derived chromosome 6 has pericentric inversion breakpoints (pinv1/pinv2) and a complex cluster of rearranged fragments (DCAGHFEB). (C) Zoomed-in view of (B), showing the schematic of additional fragmentation near 6q22.31–6q23.3 (vertical dashed lines). Asterisks indicate inverted sequence as compared to hg38 reference. See Table S6 for additional breakpoint coordinates and details. (D) Alignment of four sequential paternal contigs to reference chromosome 6 identified a pericentric inversion spanning 6p22.3 to 6q24.2 and a 9.3 Mb region near 6q22.31–6q23.3 with several additional breaks. (E) Zoomed-in view of (D), showing additional fragmentation near 6q22.31–6q23.3.
Figure 4
Proband 4 has two insertional translocations between chromosomes 7 and 9 and an inversion. (A) Ideogram with annotation of chromosome 7 and 9 breakpoints identified in proband 4. Ideograms are from the NCBI Genome Decoration Page. (B) Schematic of the proband’s maternal (pink box) and paternal (blue box) p arms of chromosomes 7 and 9. The proband’s maternal alleles match reference. The paternal sequences represent the outcome of translocations (7A;9A and 7B;9B) and inversion (7A;7C), with fragment sizes shown. The red fragment in paternal der9p is inverted with respect to hg38 reference. (C) Alignment of three paternal contigs to reference chromosomes 7 and 9 identified two insertional translocations. See Figure S6 and Supplemental methods regarding blue and red boxed areas.
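
The frameshift prediction in the Figure 2 caption can be checked with simple arithmetic on the reported amplicon sizes. The sketch below is illustrative only and is not the authors' analysis: it assumes that the size difference between the 275 bp and 240 bp RT-PCR amplicons equals the length of the extra exon 3 copy in the transcript, and that this extra sequence lies within the coding region. The function names are hypothetical.

```python
# Amplicon sizes (bp) as reported in the Figure 2 caption.
NORMAL_AMPLICON_BP = 240
PROBAND_ONLY_AMPLICON_BP = 275


def inserted_length(normal_bp: int, variant_bp: int) -> int:
    """Length of the extra sequence, assuming the size difference
    between amplicons is entirely due to the inserted exon copy."""
    return variant_bp - normal_bp


def shifts_reading_frame(inserted_bp: int) -> bool:
    """An insertion within coding sequence preserves the reading frame
    only if its length is a multiple of 3 (one codon = 3 nucleotides)."""
    return inserted_bp % 3 != 0


extra = inserted_length(NORMAL_AMPLICON_BP, PROBAND_ONLY_AMPLICON_BP)
print(extra, extra % 3, shifts_reading_frame(extra))  # 35 2 True
```

Under these assumptions the extra 35 bp is not a multiple of 3, which is consistent with the frameshift and downstream stop (p.Thr35ProfsTer52) reported in the caption.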
