Reevaluation of SNP heritability in complex human traits

Nat Genet. 2017 Jul;49(7):986-992. doi: 10.1038/ng.3865. Epub 2017 May 22.

Comparative Study

Authors

Doug Speed¹, Na Cai²,³; UCLEB Consortium; Michael R Johnson⁴, Sergey Nejentsev⁵, David J Balding¹,⁶

Affiliations

¹ UCL Genetics Institute, University College London, London, UK.
² Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
³ European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK.
⁴ Division of Brain Science, Imperial College London, London, UK.
⁵ Department of Medicine, University of Cambridge, Cambridge, UK.
⁶ Centre for Systems Genomics, School of BioSciences, and School of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia.

PMID: 28530675
PMCID: PMC5493198
DOI: 10.1038/ng.3865

Abstract

SNP heritability, the proportion of phenotypic variance explained by SNPs, has been reported for many hundreds of traits. Its estimation requires strong prior assumptions about the distribution of heritability across the genome, but current assumptions have not been thoroughly tested. By analyzing imputed data for a large number of human traits, we empirically derive a model that more accurately describes how heritability varies with minor allele frequency (MAF), linkage disequilibrium (LD) and genotype certainty. Across 19 traits, our improved model leads to estimates of common SNP heritability on average 43% (s.d. 3%) higher than those obtained from the widely used software GCTA and 25% (s.d. 2%) higher than those from the recently proposed extension GCTA-LDMS. Previously, DNase I hypersensitivity sites were reported to explain 79% of SNP heritability; using our improved heritability model, their estimated contribution is only 24%.

Conflict of interest statement

Competing Financial Interests

The authors declare no competing financial interests.

Figures

**Figure 1. Comparison of the GCTA and LDAK Models.**
Region 1 contains five SNPs in low LD (lighter colors indicate weaker pairwise correlations). Each SNP contributes unique genetic variation, reflected by SNP weights close to one. Region 2 contains five SNPs in high LD (strong correlations). The total genetic variation tagged by the region is effectively captured by two of the SNPs, and so the others receive zero weight. Under the GCTA Model, the regions are expected to contribute heritability proportional to their numbers of SNPs, here equal. Under the LDAK Model, they are expected to contribute proportional to their sums of SNP weights, here in the ratio 4.6:1.9. Note that the expected heritability can also depend on the allele frequencies and genotype certainty of the SNPs, but for simplicity, these factors are ignored here.
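The contrast drawn in Figure 1 can be illustrated with a minimal Python sketch. The per-SNP weights below are hypothetical values chosen only so that the regional sums match the 4.6 and 1.9 quoted in the caption; they are not taken from the paper, and the function name `expected_shares` is an assumption made for illustration. As in the caption, allele frequency and genotype certainty are ignored.

```python
def expected_shares(region_weights, model):
    """Return each region's expected share of total heritability.

    model="gcta": share proportional to the number of SNPs in the region.
    model="ldak": share proportional to the sum of SNP weights in the region.
    """
    if model == "gcta":
        totals = [float(len(weights)) for weights in region_weights]
    elif model == "ldak":
        totals = [float(sum(weights)) for weights in region_weights]
    else:
        raise ValueError("model must be 'gcta' or 'ldak'")
    grand_total = sum(totals)
    return [total / grand_total for total in totals]


# Hypothetical weights (assumption): Region 1 is in low LD, so every weight
# is close to one; Region 2 is in high LD, so two SNPs carry all the weight.
region1 = [0.90, 0.95, 0.90, 0.95, 0.90]  # sums to 4.6
region2 = [1.00, 0.00, 0.90, 0.00, 0.00]  # sums to 1.9

gcta = expected_shares([region1, region2], "gcta")
ldak = expected_shares([region1, region2], "ldak")

print("GCTA shares:", [round(share, 3) for share in gcta])
print("LDAK shares:", [round(share, 3) for share in ldak])
```

Under the SNP-count rule the two regions receive equal shares, whereas under the weight-sum rule Region 1 receives the larger share, in the ratio 4.6:1.9.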

**Figure 2**
**(a) Relationship between heritability and MAF.** The parameter α specifies the assumed relationship between heritability and MAF: in human genetics, α = –1 is typically used (solid blue line), while in animal and plant genetics, α = 0 is more common (green); we instead found that α = –0.25 (red) provides a better fit to real data. The gray bars report (relative) estimates of the per-SNP heritability for MAF < 0.1 and MAF > 0.1 SNPs, averaged across the 19 GWAS traits (vertical lines provide 95% confidence intervals); the dashed lines indicate the per-SNP heritability predicted by each α. **(b) Determining best-fitting α for the GWAS traits.** We compare values of α based on likelihood; a higher likelihood indicates a better-fitting α. Lines report log likelihoods from LDAK for seven values of α, relative to the highest observed. Line colors indicate the seven trait categories, while the black line reports averages.
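The role of α in panel (a) can be sketched numerically. The caption does not state the functional form, so the sketch below assumes the parametrization in which expected per-SNP heritability is proportional to [f(1 − f)]^(1 + α), where f is the MAF; this form, the function name and the example MAF values are assumptions for illustration and should be checked against the main text.

```python
def relative_per_snp_heritability(maf, alpha):
    """Expected per-SNP heritability, up to a constant factor.

    Assumed form (not stated in the caption): [f * (1 - f)] ** (1 + alpha),
    where f is the minor allele frequency.
    """
    if not 0.0 < maf <= 0.5:
        raise ValueError("maf must lie in (0, 0.5]")
    return (maf * (1.0 - maf)) ** (1.0 + alpha)


# Compare a low-frequency SNP with a common SNP under the three values of
# alpha discussed in the caption.
ratios = {}
for alpha in (-1.0, -0.25, 0.0):
    low = relative_per_snp_heritability(0.05, alpha)
    common = relative_per_snp_heritability(0.5, alpha)
    ratios[alpha] = low / common
    print(f"alpha = {alpha:+.2f}: low-MAF / common-MAF ratio = {ratios[alpha]:.3f}")
```

Under this assumed form, α = –1 makes per-SNP heritability independent of MAF, α = 0 makes it proportional to f(1 − f), and α = –0.25 lies between the two.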

**Figure 3**
**(a) Relative estimates of $h_{SNP}^{2}$ for the GWAS traits.** $h_{SNP}^{2}$ estimates from LDSC, GCTA-MS (SNPs partitioned by MAF), GCTA-LDMS (SNPs partitioned by LD and MAF) and LDAK are reported relative to those from GCTA. For versions of GCTA and LDAK, we use α = –0.25 (see main text for explanation of α). Line colors indicate the seven trait categories; the black line reports the (inverse variance weighted) averages, with gray boxes providing 95% confidence intervals for these averages. Numerical values are provided in Supplementary Table 3. **(b) Simulation studies can be misleading.** Phenotypes are simulated with 1000 causal SNPs and $h_{SNP}^{2}$ = 0.8 (black horizontal line), then analyzed using GCTA, GCTA-MS, GCTA-LDMS, LDAK and LDAK-MS (LDAK with SNPs partitioned by MAF). Bars report average $h_{SNP}^{2}$ across 200 simulated phenotypes (vertical lines provide 95% confidence intervals). **Left:** copying the study of Yang *et al.*, causal SNP effect sizes are sampled from $\mathcal{N}(0, 1)$, similar to the GCTA Model. **Right:** causal SNP effect sizes are sampled from $\mathcal{N}(0, w_j)$, similar to the LDAK Model.
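The captions report inverse variance weighted averages with 95% confidence intervals. A minimal sketch of the standard fixed-effect calculation follows; the paper's exact procedure may differ, and the input values and the function name are hypothetical, chosen only to show the arithmetic.

```python
import math


def inverse_variance_weighted_mean(estimates, standard_errors):
    """Return the inverse variance weighted mean and a normal 95% interval.

    Each estimate is weighted by 1 / SE^2; the variance of the weighted
    mean is 1 / (sum of weights).
    """
    if not estimates or len(estimates) != len(standard_errors):
        raise ValueError("need equal-length, non-empty inputs")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total_weight
    se_mean = math.sqrt(1.0 / total_weight)
    half_width = 1.96 * se_mean
    return mean, (mean - half_width, mean + half_width)


# Hypothetical per-trait relative estimates and standard errors (assumption).
relative_estimates = [1.40, 1.45, 1.38]
standard_errors = [0.05, 0.10, 0.08]

average, interval = inverse_variance_weighted_mean(relative_estimates, standard_errors)
print(f"average = {average:.3f}, 95% CI = ({interval[0]:.3f}, {interval[1]:.3f})")
```

Traits with smaller standard errors dominate the average, which is why a precisely estimated trait can pull the black line toward its own value.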

**Figure 4. Comparing the GCTA and LDAK Models for the GWAS traits.**
We partition SNPs into low- or high-LD, with the low-LD tranche containing either 50% (left) or 25% (right) of SNPs. For each partition, the horizontal red and black lines indicate the predicted contribution of the low-LD tranche to $h_{SNP}^{2}$ under the GCTA and LDAK Models, respectively. Vertical lines provide point estimates and 95% confidence intervals for the contribution of the low-LD tranche to $h_{SNP}^{2}$, estimated assuming the GCTA Model. Line colors indicate the seven trait categories, while the black lines provide the (inverse variance weighted) averages.

**Figure 5. Enrichment of SNP Classes.**
Block 1 reports the contributions to $h_{SNP}^{2}$ of DNase I hypersensitivity sites (DHS), estimated under the GCTA Model with α = –1 (see main text for explanation of α). The vertical lines provide point estimates and 95% confidence intervals for each trait, and for the (inverse variance weighted) average; for three of the traits, the point estimate is above 100%, as was also the case for Gusev *et al.* Block 2 repeats this analysis, but now assuming the LDAK Model with α = –0.25. Blocks 3 & 4 estimate the contribution of “genic SNPs” (those inside or within 2 kb of an exon) and “inter-genic SNPs” (further than 125 kb from an exon), again assuming the LDAK Model with α = –0.25. To assess enrichment, estimated contributions are compared to those expected under the GCTA or LDAK Model, as appropriate (horizontal lines).

**Figure 6. Varying quality control for the UCLEB traits.**
We consider three SNP filterings: 353 K high-quality common SNPs (information score > 0.99, MAF > 0.01), 8.8 M common SNPs (MAF > 0.01) and all 17.3 M SNPs (MAF > 0.0005). **(a)** Blocks indicate SNP filtering; bars report (inverse variance weighted) average estimates of $h_{SNP}^{2}$ using LDAK (vertical lines provide 95% confidence intervals). Bar color indicates the value of α used. For Blocks 1, 2 & 3, $h_{SNP}^{2}$ is estimated using the non-partitioned model. For Block 4, SNPs are partitioned by MAF; we find this is necessary when rare SNPs are included, and it also allows estimation of the contribution of MAF < 0.01 SNPs (hatched areas). **(b)** Bars report our final estimates of $h_{SNP}^{2}$ for height, body mass index and QT interval, the three traits for which common SNP heritability has been previously estimated with reasonable precision (orange lines mark the 95% confidence intervals from these previous studies). Bar colors now indicate SNP filtering; all estimates are based on α = –0.25, using either a non-partitioned model (red and blue bars) or with SNPs partitioned by MAF (purple bars).

Cited by

Bridging Scales in Alzheimer's Disease: Biological Framework for Brain Simulation With The Virtual Brain.
Stefanovski L, Meier JM, Pai RK, Triebkorn P, Lett T, Martin L, Bülau K, Hofmann-Apitius M, Solodkin A, McIntosh AR, Ritter P. Front Neuroinform. 2021 Apr 1;15:630172. doi: 10.3389/fninf.2021.630172. PMID: 33867964. Review.
PharmGWAS: a GWAS-based knowledgebase for drug repurposing.
Kang H, Pan S, Lin S, Wang YY, Yuan N, Jia P. Nucleic Acids Res. 2024 Jan 5;52(D1):D972-D979. doi: 10.1093/nar/gkad832. PMID: 37831083.
Solving the missing heritability problem.
Young AI. PLoS Genet. 2019 Jun 24;15(6):e1008222. doi: 10.1371/journal.pgen.1008222. PMID: 31233496.
Automatic landmarking identifies new loci associated with face morphology and implicates Neanderthal introgression in human nasal shape.
Li Q, Chen J, Faux P, Delgado ME, Bonfante B, Fuentes-Guajardo M, Mendoza-Revilla J, Chacón-Duque JC, Hurtado M, Villegas V, Granja V, Jaramillo C, Arias W, Barquera R, Everardo-Martínez P, Sánchez-Quinto M, Gómez-Valdés J, Villamil-Ramírez H, Silva de Cerqueira CC, Hünemeier T, Ramallo V, Wu S, Du S, Giardina A, Paria SS, Khokan MR, Gonzalez-José R, Schüler-Faccini L, Bortolini MC, Acuña-Alonzo V, Canizales-Quinteros S, Gallo C, Poletti G, Rojas W, Rothhammer F, Navarro N, Wang S, Adhikari K, Ruiz-Linares A. Commun Biol. 2023 May 8;6(1):481. doi: 10.1038/s42003-023-04838-7. PMID: 37156940.
Power Analysis for Genetic Association Test (PAGEANT) provides insights to challenges for rare variant association studies.
Derkach A, Zhang H, Chatterjee N. Bioinformatics. 2018 May 1;34(9):1506-1513. doi: 10.1093/bioinformatics/btx770. PMID: 29194474.

References

1. Yang J, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42:565–569.
2. Maher B. Personal genomes: the case of the missing heritability. Nature. 2008;456:18–21.
3. Speed D, et al. Describing the genetic architecture of epilepsy through heritability analysis. Brain. 2014;137:2680–2689.
4. Henderson C, Kempthorne O, Searle S, von Krosigk C. The estimation of environmental and genetic trends from records subject to culling. Biometrics. 1959;15:192–218.
5. Falconer D, Mackay T. Introduction to Quantitative Genetics. 4th Edition. Longman; 1996.
