Cell. 2019 Jan 10;176(1-2):377-390.e19.
doi: 10.1016/j.cell.2018.11.029. Epub 2019 Jan 3.

A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens

Molly Gasperini et al. Cell.

Erratum in

Abstract

Over one million candidate regulatory elements have been identified across the human genome, but nearly all are unvalidated and their target genes uncertain. Approaches based on human genetics are limited in scope to common variants and in resolution by linkage disequilibrium. We present a multiplex, expression quantitative trait locus (eQTL)-inspired framework for mapping enhancer-gene pairs by introducing random combinations of CRISPR/Cas9-mediated perturbations to each of many cells, followed by single-cell RNA sequencing (RNA-seq). Across two experiments, we used dCas9-KRAB to perturb 5,920 candidate enhancers with no strong a priori hypothesis as to their target gene(s), measuring effects by profiling 254,974 single-cell transcriptomes. We identified 664 (470 high-confidence) cis enhancer-gene pairs, which were enriched for specific transcription factors, non-housekeeping status, and genomic and 3D conformational proximity to their target genes. This framework will facilitate the large-scale mapping of enhancer-gene regulatory interactions, a critical yet largely uncharted component of the cis-regulatory landscape of the human genome.

Keywords: CRISPR; CRISPRi; RNA-seq; crisprQTL; eQTL; enhancer; gene regulation; genetic screen; human genetics; single cell.

Conflict of interest statement

DECLARATION OF INTERESTS

The authors declare no competing interests.

Figures

Figure 1. Multiplex Enhancer-Gene Pair Screening
(A) Enhancer-gene pairs are screened by introducing random combinations of CRISPR/Cas9 candidate enhancer perturbations to each of many cells, followed by scRNA-seq to capture expression levels of all transcripts. Then, all candidate enhancers are tested against any gene by correlating presence of any perturbation with reduction of any transcript. (B) Multiplex perturbations increase power to detect changes in expression in single-cell genetic screens while greatly reducing the number of cells that need to be profiled. Power calculations on simulated data show that increasing the number of perturbations per cell increases power to detect changes in expression, including for genes with low (0.10 mean UMIs per cell), medium (0.32), or high (1.00) mean expression. x axis corresponds to the simulated % repression of target transcript.
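The power argument in (B) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' power calculation: UMI counts are drawn from a Poisson distribution, each cell carries the target gRNA with probability guides_per_cell / library_size, and a repression is called detected by a one-sided Welch-style z-test. The function names, the library size of 100 guides, the 2,000-cell population, and the 25% repression are hypothetical choices for illustration.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def detected(rng, n_cells, guides_per_cell, library_size,
             mean_umis, repression, alpha=0.05):
    """Simulate one screen and report whether the repression is detected."""
    # Assumption: guides are spread evenly, so a cell carries the target
    # guide with probability guides_per_cell / library_size.
    carrier_prob = guides_per_cell / library_size
    with_guide, without_guide = [], []
    for _ in range(n_cells):
        if rng.random() < carrier_prob:
            with_guide.append(poisson(rng, mean_umis * (1.0 - repression)))
        else:
            without_guide.append(poisson(rng, mean_umis))
    if len(with_guide) < 2 or len(without_guide) < 2:
        return False  # too few cells in one group to estimate a variance
    se = math.sqrt(statistics.variance(with_guide) / len(with_guide)
                   + statistics.variance(without_guide) / len(without_guide))
    if se == 0:
        return False
    z = (statistics.fmean(without_guide) - statistics.fmean(with_guide)) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided: reduced expression
    return p_value < alpha


def power(guides_per_cell, n_sims=40, seed=0, **settings):
    """Fraction of simulated screens in which the repression is detected."""
    rng = random.Random(seed)
    hits = sum(detected(rng, guides_per_cell=guides_per_cell, **settings)
               for _ in range(n_sims))
    return hits / n_sims


# Hypothetical settings: a "high" expression gene (1.00 mean UMIs per cell).
settings = dict(n_cells=2000, library_size=100, mean_umis=1.0, repression=0.25)
power_low = power(1, **settings)    # about 1 guide per cell
power_high = power(20, **settings)  # about 20 guides per cell
print(f"power at 1 guide/cell:   {power_low:.2f}")
print(f"power at 20 guides/cell: {power_high:.2f}")
```

With the number of profiled cells held fixed, more guides per cell means more cells carry any given guide, which shrinks the standard error of the comparison. That is the intuition behind the gain in power in this sketch.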
Figure 2. Pilot Multiplex Enhancer-Gene Pair Screen Testing 1,119 Candidate Enhancers in K562 Cells
(A) 1,119 candidate enhancers were chosen based on intersection of enhancer-associated features and each targeted by two gRNAs. (B) Schematic of this multiplex enhancer-gene pair screening method. (i) gRNAs were cloned into a lentiviral vector, and delivered to K562 cells at a high MOI. (ii) scRNA-seq was performed on these cells, with concurrent capture of the multiple gRNAs present in each cell. (iii) For each candidate enhancer, cells were partitioned based on whether or not they contained a gRNA targeting it. (iv) For each such partition, we tested for differential expression between the two populations for any gene within 1 Mb of the candidate enhancer. (C) gRNAs were delivered to K562 cells at a high MOI, with a median of 15 ± 11.3 gRNAs identified per cell. (D) A total of 47,650 single cell transcriptional profiles were generated. Each perturbation was identified in a median of 516 ± 177 cells. (E) Quantile-quantile plot of the differential expression tests. Distributions of observed versus expected p values for candidate enhancer-targeting gRNAs (orange) and NTC gRNAs (gray; downsampled) are shown. (F) Expression of selected TSS (top row) and β-globin LCR positive controls (bottom row). Nearly all targeted TSSs, and all positive controls, showed significant differential expression of the expected target genes between cells with (+) versus without (−) targeting gRNAs, in contrast with NTCs. Percent changes and p values show the effect size and significance of differential expression of the denoted target gene between these cell groups. See also Figure S1 and Table S1.
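Steps (iii) and (iv) of panel (B) can be sketched on toy data. This is a minimal illustration under stated assumptions, not the authors' pipeline: it partitions cells by presence of a gRNA for each candidate enhancer and reports only the percent change in mean expression for genes within 1 Mb, leaving out the differential expression test itself. All names, coordinates, and counts below are hypothetical.

```python
from statistics import fmean

WINDOW_BP = 1_000_000  # test genes whose TSS lies within 1 Mb of the enhancer


def candidate_genes(enhancer, enhancers, genes, window=WINDOW_BP):
    """Genes on the same chromosome with a TSS within `window` of the enhancer."""
    chrom, pos = enhancers[enhancer]
    return [gene for gene, (gene_chrom, tss) in genes.items()
            if gene_chrom == chrom and abs(tss - pos) <= window]


def percent_changes(cells, enhancers, genes):
    """Percent change in mean expression, cells with versus without each guide."""
    results = {}
    for enhancer in enhancers:
        # (iii) partition cells on whether they carry a guide for this enhancer
        with_guide = [expr for guides, expr in cells if enhancer in guides]
        without_guide = [expr for guides, expr in cells if enhancer not in guides]
        if not with_guide or not without_guide:
            continue
        # (iv) compare the two partitions for every nearby gene
        for gene in candidate_genes(enhancer, enhancers, genes):
            mean_with = fmean([expr.get(gene, 0) for expr in with_guide])
            mean_without = fmean([expr.get(gene, 0) for expr in without_guide])
            if mean_without == 0:
                continue  # percent change is undefined without baseline expression
            change = 100.0 * (mean_with - mean_without) / mean_without
            results[(enhancer, gene)] = change
    return results


# Hypothetical toy inputs: (chromosome, position) for enhancers and gene TSSs.
enhancers = {"enh1": ("chr1", 1_000_000), "enh2": ("chr1", 5_000_000)}
genes = {"GENE_A": ("chr1", 1_300_000),
         "GENE_B": ("chr1", 4_200_000),
         "GENE_C": ("chr2", 1_000_000)}
# Each cell: (set of enhancers targeted by its guides, UMI counts per gene).
cells = [
    ({"enh1"}, {"GENE_A": 1, "GENE_B": 4}),
    ({"enh1", "enh2"}, {"GENE_A": 1, "GENE_B": 2}),
    ({"enh2"}, {"GENE_A": 3, "GENE_B": 2}),
    (set(), {"GENE_A": 3, "GENE_B": 4}),
]

result = percent_changes(cells, enhancers, genes)
for (enhancer, gene), change in sorted(result.items()):
    print(f"{enhancer} -> {gene}: {change:+.1f}%")
```

Because each cell carries several guides, the same cell contributes to the "with" group for some enhancers and to the "without" group for others. A real analysis would replace the percent change with a differential expression test and a multiple-testing correction.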
Figure 3. Multiplex Enhancer-Gene Pair Screening at Scale in K562 Cells
(A) For a scaled experiment, gRNAs were designed to target a total of 5,779 candidate enhancers. Characteristics are shown for 3,853 sites chosen by a model informed by the hits identified in the pilot experiment. (B) 948 exploratory candidate enhancers were sampled from K562 DHSs. 978 candidate enhancers from the pilot were re-targeted with the same gRNA pair, and 377 of these were also targeted with a second, alternative gRNA pair. (C) gRNAs were again delivered to K562 cells, but at a higher MOI than the pilot experiment (median 28 ± 15.3 gRNAs identified per cell). (D) A total of 207,324 single cell transcriptional profiles were generated. Each perturbation was identified in a median of 915 ± 280 single cells. (E) Q-Q plot of the differential expression tests. Distributions of observed versus expected p values for candidate enhancer-targeting gRNAs that were correlated with decrease in target gene expression (orange) and NTC gRNAs (gray; downsampled) are shown. (F) Histogram of the number of target genes impacted by each candidate enhancer identified as part of a pair (10% empirical FDR). (G) Histogram of the number of paired candidate enhancers detected as regulating each target gene (10% empirical FDR). (H) Effect sizes for the 664 enhancer-gene pairs that pass a
Figure 4. Replication and Validation of Selected Enhancer-Gene Pairs in Singleton Experiments
(A–D) For each singleton replication experiment of an enhancer-gene pair, bulk RNA-seq was performed on CRISPRi+ K562 cells transduced with gRNAs targeting (purple) e-PRKCB (A), e-PTGER3 (B), e-GYPC (C), e-NMU (D), or the TSSs (dark red) of their respective target genes. Target gene expression in the singleton-target cell lines (red/purple) is compared with that in the replication experiments in which the other 4 candidate enhancers or TSSs were targeted (gray). Eleven other singleton CRISPRi experiments are summarized in Figure S5. (E–G) To validate three enhancer-gene pairs by sequence deletion, monoclonal lines were generated with full deletion of the locus’s genomic sequence in three to six independent clones (e-NMU, E; e-CITED2, F; and e-GLUL, G), followed by bulk RNA-seq. See also Figure S4A. (H) NMU-targeting cells were phenotyped by fluorophore-labeling of intracellular NMU transcripts by RNA flowFISH. (ii–iii) Singleton CRISPRi targeted cells as in (D). (iv–v) A heterogeneous pool of cells engineered such that a portion (based on deletion efficiency) harbor full or scanning deletions of e-NMU (see also Figures S4B and S4C). See also Figure S3 and Table S3.
Figure 5. Highlighted Examples of Enhancer-Gene Pairs
(A) Three candidate enhancers (labeled ii–iv) that reside 32, 14, and 9 kb upstream of PRKCB were paired with PRKCB, but a fourth (i) that lies 50 kb upstream was not (shown: hg19 chr16:23791225–23851797; iii is e-PRKCB in Figure 4A). (B) A single candidate enhancer (e-PTGER3 in Figure 4B) located 371 kb downstream of PTGER3 was paired with PTGER3 (shown: chr1:71104684–71582921). (C) Two candidate enhancers paired with GYPC (ii–iii) lie in the 11 kb region upstream of GYPC. However, a third candidate enhancer (i) immediately adjacent to (ii) was not paired with GYPC (shown: chr1:71104684–71582921; ii is e-GYPC in Figure 4C). (D) Targeting five candidate enhancers (i–v) located 30.5, 87, 93.4, 94.1, and 97.6 kb upstream of NMU significantly reduced expression of NMU (shown: chr1:71104684–71582921; iii-iv is e-NMU in Figure 4D). Target genes’ normalized expression is presented on a log scale. Asterisks denote the candidate enhancers that were targeted as part of a singleton replication experiment (Figure 4). + and - denote the cells from the at-scale screen with or without gRNAs targeting that locus. Percent changes and p values denote the size and significance of differential expression between these cell groups.
Figure 6. Characteristics of K562 Enhancer-Gene Pairs
(A) Paired candidate enhancers fall close to target genes. Distribution of distances between the paired candidate enhancers and their target gene’s TSS (top row, high confidence pairs; second row, lower confidence pairs), the TSS of whatever K562-expressed gene is closest (third row), or the TSS of every K562-expressed gene within 1 Mb (fourth row). Distances are plotted with respect to gene orientation. Of the 470 high confidence pairs, this plot displays only the 354 that fall upstream of the target genes (as the gRNA library does not include candidate enhancers within 1 kb of any gene body, downstream enhancers are biased to fall further from the target TSS). A TSS-focused zoom of this plot is included as Figure S5E. (B) 317 of 470 high-confidence pairs target the most proximal K562-expressed gene. Target genes are ranked by their absolute distance to the paired candidate enhancer (1 = closest, 2 = second closest, etc.). (C) This framework captures regulatory effects on genes from a broad range of expression levels (expression = mean transcript UMIs/cell in the entire 207,324 cell dataset, for 13,135 K562-expressed genes, 10,560 of these within 1 Mb of a targeted candidate enhancer in the scaled experiment, and 470 high-confidence enhancer-gene pairs). See also Figure S5D. (D) Paired candidate enhancers tend to fall in enhancer-associated ChIP-seq peaks that show stronger signals. All ChIP-seq peaks that overlap the scaled experiment’s 5,779 candidate enhancers were divided into quintiles defined by the average enrichment in the ChIP-seq peak region (0 = no such peak overlaps the candidate enhancer, 1 = lowest, 5 = highest). Histograms of the proportion of candidate enhancers in each quintile that were paired with a target gene are shown for the eight most-enriched ChIP-seq datasets.
(E) Enhancer-gene pairs interact more frequently in K562 Hi-C data (left, fractional ranking of enhancer-gene pairs’ Hi-C interaction-frequency against all other possible interactions at similar distances within the same TAD, K-S test against a uniform distribution p value
Figure 7. CRISPRi Is Robust to Multiplexing within a Cell
(A) A biological replicate of the pilot study, targeting the same 1,119 candidate enhancers and 381 TSSs, was performed at a low MOI (median 1 ± 1.6 gRNAs identified per cell). (B) A total of 41,284 single cell transcriptional profiles were generated. Each perturbation was identified in a median of 43 ± 16 single cells. (C) Correlation of effect sizes for TSS controls (top, purple) or enhancer-gene pairs identified in the scaled experiment (10% empirical FDR, bottom, orange) across increasing rates of gRNA per cell (left, 1 versus 15; middle, 15 versus 28; right, 1 versus 28 gRNAs/cell). Point sizes are proportional to each target gene’s expression level. (D) The ratios of repression for each TSS control or paired candidate enhancer (as identified with a 10% empirical FDR in any experiment) in the low MOI experiment versus a high MOI experiment (top = median 1 gRNA versus 15 gRNAs; bottom = median 1 gRNA versus 28 gRNAs). The candidate enhancer outliers with stronger effect sizes in the low MOI experiment (right panel, ratios in long left tail) are likely largely due to stochastic under-sampling of lowly expressed target genes in the low MOI experiment (see also Figure S6).
