Genome-wide Scan of 29,141 African Americans Finds No Evidence of Directional Selection since Admixture

Gaurav Bhatia; Arti Tandon; Nick Patterson; Melinda C Aldrich; Christine B Ambrosone; Christopher Amos; Elisa V Bandera; Sonja I Berndt; Leslie Bernstein; William J Blot; Cathryn H Bock; Neil Caporaso; Graham Casey; Sandra L Deming; W Ryan Diver; Susan M Gapstur; Elizabeth M Gillanders; Curtis C Harris; Brian E Henderson; Sue A Ingles; William Isaacs; Phillip L De Jager; Esther M John; Rick A Kittles; Emma Larkin; Lorna H McNeill; Robert C Millikan; Adam Murphy; Christine Neslund-Dudas; Sarah Nyante; Michael F Press; Jorge L Rodriguez-Gil; Benjamin A Rybicki; Ann G Schwartz; Lisa B Signorello; Margaret Spitz; Sara S Strom; Margaret A Tucker; John K Wiencke; John S Witte; Xifeng Wu; Yuko Yamamura; Krista A Zanetti; Wei Zheng; Regina G Ziegler; Stephen J Chanock; Christopher A Haiman; David Reich; Alkes L Price

doi:10.1016/j.ajhg.2014.08.011

2014 Oct 2;95(4):437–444. doi: 10.1016/j.ajhg.2014.08.011

Gaurav Bhatia ^1,2,∗, Arti Tandon ^2,3, Nick Patterson ², Melinda C Aldrich ^4,5,6, Christine B Ambrosone ⁷, Christopher Amos ⁸, Elisa V Bandera ⁹, Sonja I Berndt ¹⁰, Leslie Bernstein ¹¹, William J Blot ^4,5,12, Cathryn H Bock ¹³, Neil Caporaso ¹⁰, Graham Casey ¹⁴, Sandra L Deming ^4,5, W Ryan Diver ¹⁵, Susan M Gapstur ¹⁵, Elizabeth M Gillanders ¹⁶, Curtis C Harris ¹⁷, Brian E Henderson ¹⁴, Sue A Ingles ¹⁴, William Isaacs ¹⁸, Phillip L De Jager ^2,3,19, Esther M John ^20,21, Rick A Kittles ²², Emma Larkin ²³, Lorna H McNeill ^24,25, Robert C Millikan ^26,27,36, Adam Murphy ²⁸, Christine Neslund-Dudas ²⁹, Sarah Nyante ^26,27, Michael F Press ¹⁴, Jorge L Rodriguez-Gil ³⁰, Benjamin A Rybicki ²⁹, Ann G Schwartz ¹³, Lisa B Signorello ^4,5,12, Margaret Spitz ⁸, Sara S Strom ³¹, Margaret A Tucker ¹⁰, John K Wiencke ³², John S Witte ³³, Xifeng Wu ⁸, Yuko Yamamura ³¹, Krista A Zanetti ^16,17, Wei Zheng ^4,5, Regina G Ziegler ¹⁰, Stephen J Chanock ¹⁰, Christopher A Haiman ¹⁴, David Reich ^2,3,35, Alkes L Price ^2,34,35

¹Division of Health, Science, and Technology, the Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA

²Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA

³Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA

⁴Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA

⁵Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA

⁶Department of Thoracic Surgery, Vanderbilt University School of Medicine, Nashville, TN 37203, USA

⁷Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, NY 14263, USA

⁸Section of Biostatistics and Epidemiology, Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Hanover, NH 03766, USA

⁹Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08903, USA

¹⁰Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA

¹¹Division of Cancer Etiology, Department of Population Sciences, Beckman Research Institute, City of Hope, CA 91010, USA

¹²International Epidemiology Institute, Rockville, MD 20850, USA

¹³Karmanos Cancer Institute and Department of Oncology, Wayne State University of Medicine, Detroit, MI 48201, USA

¹⁴Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA

¹⁵Epidemiology Research Program, American Cancer Society, Atlanta, GA 30303, USA

¹⁶Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD 20892, USA

¹⁷Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA

¹⁸James Buchanan Brady Urological Institute, Johns Hopkins Hospital and Medical Institutions, Baltimore, MD 21287, USA

¹⁹Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA

²⁰Cancer Prevention Institute of California, Fremont, CA 94538, USA

²¹Stanford Cancer Center, Stanford Medicine, Stanford, CA 94305, USA

²²Department of Medicine, University of Illinois at Chicago, Chicago, IL 60607, USA

²³Division of Allergy, Pulmonary, and Critical Care, Department of Medicine, Vanderbilt University Medical Center, 6100 Medical Center East, Nashville, TN 37232-8300, USA

²⁴Department of Health Disparities Research, Cancer Prevention and Population Sciences, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA

²⁵Center for Community Implementation and Dissemination Research, Duncan Family Institute, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA

²⁶Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA

²⁷Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA

²⁸Department of Urology, Northwestern University, Chicago, IL 60611, USA

²⁹Department of Public Health Sciences, Henry Ford Hospital, Detroit, MI 48202, USA

³⁰Sylvester Comprehensive Cancer Center and Department of Epidemiology and Public Health, University of Miami Miller School of Medicine, Miami, FL 33136, USA

³¹Department of Epidemiology, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA

³²University of California, San Francisco, San Francisco, CA 94158, USA

³³Departments of Epidemiology and Biostatistics and Urology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA

³⁴Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA

∗Corresponding author

³⁵These authors contributed equally to this work

³⁶In memoriam

PMCID: PMC4185117 PMID: 25242497

Abstract

The extent of recent selection in admixed populations is currently an unresolved question. We scanned the genomes of 29,141 African Americans and failed to find any genome-wide-significant deviations in local ancestry, indicating no evidence of selection influencing ancestry after admixture. A recent analysis of data from 1,890 African Americans reported that there was evidence of selection in African Americans after their ancestors left Africa, both before and after admixture. Selection after admixture was reported on the basis of deviations in local ancestry, and selection before admixture was reported on the basis of allele-frequency differences between African Americans and African populations. The local-ancestry deviations reported by the previous study did not replicate in our very large sample, and we show that such deviations were expected purely by chance, given the number of hypotheses tested. We further show that the previous study’s conclusion of selection in African Americans before admixture is also subject to doubt. This is because the F_ST statistics they used were inflated and because true signals of unusual allele-frequency differences between African Americans and African populations would be best explained by selection that occurred in Africa prior to migration to the Americas.

Main Text

Admixed populations, such as African Americans and Latinos, are formed by the mixing of genetically differentiated ancestral populations. Alleles that are highly differentiated between the ancestral populations and advantageous in the admixed population are expected to rise in frequency after admixture, causing local ancestry to deviate from the genome-wide average.¹ These deviations have been interpreted as a signal of the action of natural selection since admixture.^2–4 We note that sampling noise, genetic drift after admixture, and small systematic biases in local-ancestry inference^5,6 will also produce deviations in local ancestry, making it important to account for these factors before concluding that natural selection has occurred.

To better understand deviations in local ancestry as a signal of selection, we simulated the evolution of local ancestry in an admixed population. The population was created seven generations ago with ancestral proportions of 80% and 20% from two ancestral populations to mimic an idealized demographic history of African Americans. We simulated neutral evolution with a variety of effective population sizes (N_e) over seven generations by using a recombination map built from African American data.⁷ For each value of N_e, we assessed the variance in local ancestry, minimum detectable selection coefficient, and effective number of statistical tests (see Table S1, available online). Our results suggest that genetic drift can contribute significantly to the variance in average local ancestry (as a function of N_e) and thus reduce power to detect selection. We note that small systematic biases in local-ancestry inference will also contribute to this variance and have a similar effect on power.
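The dependence of this variance on N_e can be illustrated with a minimal single-locus sketch. It assumes a textbook Wright-Fisher approximation, in which drift accumulates a fraction 1 − (1 − 1/(2N_e))^t of the initial binomial variance after t generations, and it ignores recombination and local-ancestry inference error, so it is not the simulation described above. The function name and the N_e values other than 50,000 are hypothetical.

```python
import math

def drift_sd(ancestry: float, n_e: float, generations: int) -> float:
    """SD of locus-level ancestry frequency due to drift alone.

    Assumption: single-locus Wright-Fisher approximation, no
    recombination, no selection, no inference error.
    """
    # Fraction of the initial binomial variance accumulated by drift.
    accumulated = 1.0 - (1.0 - 1.0 / (2.0 * n_e)) ** generations
    return math.sqrt(ancestry * (1.0 - ancestry) * accumulated)

# 20% minority ancestry and seven generations, as in the idealized history.
for n_e in (10_000, 50_000, 250_000):  # illustrative values only
    print(f"N_e={n_e:>7}: drift SD ~ {drift_sd(0.20, n_e, 7):.4f}")
```

Smaller effective population sizes give larger drift SDs, which raises the deviation needed for significance and so lowers the power to detect selection.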

In light of these simulated results, we sought to investigate possible recent selection in African Americans. We performed an admixture scan for unusual deviations in local ancestry in 29,141 African Americans from five cohorts from the African American Lung Cancer Consortium (AALCC), African American Breast Cancer Consortium (AABCC), African American Prostate Cancer Consortium (AAPCC), Children’s Hospital of Philadelphia, and Candidate Gene Association Resource (CARe). Sample sizes and genotyping arrays are listed in Table S2. We note that the AALCC, AABCC, and AAPCC cohorts consist of disease-affected individuals and control subjects, but phenotype information was not available in the current study. The inclusion of affected individuals could produce false-positive signals of selection as a result of admixture associations with disease but is unlikely to produce false-negative selection signals, which would only occur if admixture association and selection each caused local-ancestry deviations that perfectly negated each other at the same locus.

We filtered the data to remove genotyping artifacts, related individuals, and individuals with very little European or African ancestry (see Table S2). To estimate local ancestry, we used HapMap3 CEU (Utah residents with ancestry from northern and western Europe from the CEPH collection) and YRI (Yoruba in Ibadan, Nigeria) haplotypes as ancestral populations in HAPMIX.⁸ The average local ancestry at each locus was calculated as an average of the local-ancestry estimates across all samples. Because of issues with ancestry inference at the ends of chromosomes, we removed the first and last 2 Mb of each chromosome from analysis. We note that in these regions, three loci (which do not overlap any previously published loci⁹) did show significant deviations in local ancestry, but these are very likely to be artifacts (see Appendix A). We subsequently focused on local-ancestry estimates for 118,007 SNPs in the intersection of all cohorts. Because of the extent of admixture linkage disequilibrium (LD) in African Americans, the number of markers required to tag the entire genome is approximately 2,000–3,000,^1,10,11 making our use of 118,000 markers sufficient to tag local ancestry genome-wide.

The average proportion of European ancestry over all samples and all SNPs was 0.204 (SD = 0.0036 across SNPs). On the basis of the extent of admixture LD in African Americans,^1,10,12 we defined a genome-wide-significant signal of selection as a local-ancestry deviation greater than 4.42 SDs (p < 10⁻⁵), corresponding to 5,000 hypotheses tested.¹ We used the empirical SD because the theoretical SD can be affected by genetic drift after admixture, cryptic relatedness, and other factors that are difficult to quantify (see Tables S1 and S3). Figure 1 displays the average local ancestry at each SNP and indicates no genome-wide-significant deviation in local ancestry. We note that our simulations suggest that the actual effective number of independent hypotheses might be closer to 1,000–1,500 (see Table S1). However, our results remain null even if we correct for only 1,000 hypotheses tested (4.06 SDs; p < 5 × 10⁻⁵). Additionally, our results remain null in a smaller sample of 23,000 individuals with more extensive genomic coverage of 461,000 SNPs (see Appendix A).
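The correspondence between the number of hypotheses tested and the SD thresholds quoted above can be checked with a short sketch. It assumes a two-sided normal test and a Bonferroni correction at a family-wise level of 0.05; the 0.05 level is an assumption inferred from the pairing of p < 10⁻⁵ with 5,000 hypotheses, and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def two_sided_z(n_hypotheses: int, alpha: float = 0.05) -> float:
    """Z threshold for a two-sided test after Bonferroni correction."""
    per_test_p = alpha / n_hypotheses
    return STANDARD_NORMAL.inv_cdf(1.0 - per_test_p / 2.0)

def two_sided_p(z: float) -> float:
    """Two-sided p value for a deviation of z SDs."""
    return 2.0 * STANDARD_NORMAL.cdf(-abs(z))

for n in (5_000, 1_000):
    print(f"{n} hypotheses -> {two_sided_z(n):.2f} SDs")
print(f"3 SDs -> p = {two_sided_p(3.0):.1e}")
```

Under these assumptions the thresholds for 5,000 and 1,000 hypotheses come out close to the 4.42 and 4.06 SDs used in the text.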

Figure 1.

Ancestry at Each Location in the Genome in 29,141 African Americans

This figure gives the proportion of European ancestry at each of the 118,006 SNPs common to all cohorts. The black line indicates the genome-wide average proportion of European ancestry. The red and blue lines indicate the threshold for genome-wide significance (p < 10⁻⁵) in our study and in the Jin et al. study,⁹ respectively. The dashed blue line indicates the significance threshold (p < 2.7 × 10⁻³) that was actually used in the Jin et al. study.⁹ The SD was computed empirically over all SNPs. It is clear that no region attained genome-wide significance in our scan. For the six loci reported under selection in Jin et al.,⁹ dashed vertical lines indicate their location, and blue points indicate their deviation in local ancestry. These deviations are reported in relation to the genome-wide average ancestry proportion in our study. None of the six reported loci exceeded the threshold for genome-wide significance (p < 10⁻⁵) for the Jin et al.⁹ study (blue lines).

To better understand the implications of these results, we evaluated the range of selection coefficients that we would have high power to detect. Assuming a normal sampling distribution of observed average local ancestry, we can solve for the true average local ancestry (γ_L) that would constitute a genome-wide-significant signal of selection. In our case, we had 95% power to detect selection at loci where γ_L < 0.183 or γ_L > 0.225. Assuming seven generations since admixture,⁸ we performed a grid search of possible values of the selection coefficient for local ancestry (s_anc) to find those that would produce these values of γ_L and obtained an estimate of 0.019, providing an upper bound on the strength of selection since admixture. Similarly, our simulations with N_e = 50,000 (whose variance in average local ancestry was similar to the variance observed in real data; see Table S1) also indicated a minimum detectable selection coefficient of approximately 0.019 (see Figure S1). We note that, in general, the selection coefficient per local-ancestry block (s_anc) will be lower than the selection coefficient per allele (s), and an s_anc of 0.019 could correspond to a large value of s, representing strong selection. The conversion between s and s_anc will depend on allele frequencies in European and African populations (see Table S4).
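The two steps of this calculation can be sketched as follows. The power step assumes a normal sampling distribution with the empirical SD. The grid search assumes a simple multiplicative model in which each copy of the favored ancestry has relative fitness 1 + s_anc per generation; the text does not specify the selection model, so this recurrence and all function names are assumptions, and the output should only be expected to land near the reported values.

```python
from statistics import NormalDist

MEAN_ANCESTRY = 0.204   # genome-wide European ancestry (from the text)
EMPIRICAL_SD = 0.0036   # SD across SNPs (from the text)
Z_SIGNIFICANCE = 4.42   # genome-wide threshold in SDs (from the text)
GENERATIONS = 7         # generations since admixture (from the text)

def detectable_bounds(power: float = 0.95):
    """True average local ancestry detectable with the given power."""
    z_power = NormalDist().inv_cdf(power)
    shift = (Z_SIGNIFICANCE + z_power) * EMPIRICAL_SD
    return MEAN_ANCESTRY - shift, MEAN_ANCESTRY + shift

def ancestry_after_selection(start: float, s: float,
                             generations: int = GENERATIONS) -> float:
    """Assumed model: favored ancestry has relative fitness 1 + s."""
    freq = start
    for _ in range(generations):
        freq = freq * (1.0 + s) / (1.0 + freq * s)
    return freq

def smallest_detectable_s(start: float, target: float,
                          step: float = 0.0001, max_s: float = 0.1):
    """Grid search for the smallest s that moves start up to target."""
    for i in range(1, round(max_s / step) + 1):
        s = i * step
        if ancestry_after_selection(start, s) >= target:
            return s
    return None

lower, upper = detectable_bounds()
# Selection favoring European ancestry (frequency rises toward `upper`).
s_european = smallest_detectable_s(MEAN_ANCESTRY, upper)
# Selection favoring African ancestry (European frequency falls to `lower`).
s_african = smallest_detectable_s(1.0 - MEAN_ANCESTRY, 1.0 - lower)
print(round(lower, 3), round(upper, 3), s_european, s_african)
```

Under this assumed model both directions give selection coefficients of roughly 0.02, in line with the reported estimate of 0.019.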

Our results suggest that selection with s_anc > 0.019 since admixture can be ruled out, and they contrast with a report of six loci as targets of selection after admixture in a recent study by Jin et al.⁹ However, that study considered any deviation greater than 3 SDs (p < 2.7 × 10⁻³), corresponding to only 20 hypotheses tested, to be genome-wide significant. The six loci did not replicate at nominal significance (p < 0.05) in our analysis of many more samples (see Figure 1 and Table 1). When we used a threshold of 3 SDs in our data, six loci showed significant deviations. None of these overlap those reported by Jin et al. (see Table S5), suggesting that reported signals of selection after admixture are likely to be false positives because of an insufficient correction for multiple tests. For five of the six loci in Table 1, the deviation that we observed has the same sign as the previously reported deviation. This could be due to statistical chance (p = 0.11; one-sided Fisher’s exact test), genetic drift after admixture, or small systematic biases in local-ancestry inference (see Table S6). In any case, our results show that the proportion of African ancestry at these six loci was not strongly affected by natural selection since admixture.
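The sign-concordance observation can be checked directly against the deviations listed in Table 1. The exact contingency table behind the reported Fisher's exact test is not given in the text, so the sketch below uses a one-sided binomial sign test as a stand-in; that substitution is an assumption, although it yields a p value of about 0.11.

```python
from math import comb

# Deviations in European local ancestry, in Table 1 row order.
jin_deviations = [-0.025, -0.023, 0.023, 0.025, 0.023, 0.023]
current_deviations = [-0.004, -0.006, 0.005, -0.002, 0.004, 0.006]

# Count loci where both studies deviate in the same direction.
concordant = sum(
    (a > 0) == (b > 0) for a, b in zip(jin_deviations, current_deviations)
)

def binomial_upper_tail(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p_value = binomial_upper_tail(concordant, len(jin_deviations))
print(concordant, round(p_value, 3))
```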

Table 1.

Comparison of Deviations in Average Local Ancestry

Region	Jin et al.⁹		Current Study
Region	Deviation	Nominal p Value	Deviation	Nominal p Value
Chr1: 17,409,539–21,604,321	−0.025	7.43 × 10⁻⁴	−0.004	0.55
Chr2: 241,750,403–242,568,618	−0.023	2.07 × 10⁻³	−0.006	0.44
Chr2: 37,451,925–37,508,581	0.023	2.16 × 10⁻³	0.005	0.51
Chr3: 116,930,811–118,313,302	0.025	8.58 × 10⁻⁴	−0.002	0.83
Chr6: 163,653,158–163,653,428	0.023	2.70 × 10⁻³	0.004	0.60
Chr16: 61,214,438–61,242,497	0.023	2.26 × 10⁻³	0.006	0.41

We list the six regions reported by Jin et al.⁹ to have unusual deviations in local ancestry and compare these to our scan. The deviation is in the proportion of European local ancestry. None of the six regions replicated at nominal significance (p < 0.05) in our analysis. All positions are from UCSC Genome Browser build hg18.

Allele-frequency differentiation can be a powerful test for selection.^13–18 Indeed, population differentiation between African Americans and YRI was used as the basis of 14 selection signals recently reported by Jin et al. This was described as a test for selection that occurred after the forced migration of the African ancestors of African Americans (both before and after admixture). Jin et al. ultimately concluded that selection occurred before admixture given the lack of overlap with signals of selection after admixture from deviations in local ancestry.⁹ Specifically, single SNPs were ranked by an estimate of F_ST, and the most highly differentiated SNPs were reported as signals of selection. These single SNP estimates of F_ST were produced with the Weir and Cockerham¹⁹ (WC) F_ST estimator. However, a concern with the use of the WC estimator for this application is that estimates can strongly depend on the ratio of sample sizes used. This can potentially result in overestimates of F_ST at neutral SNPs,²⁰ leading to false-positive signals of selection (see Table S7).

On the other hand, the Hudson estimator,^20,21 which is a simple average of the population-specific estimators of Weir and Hill,²² does not have this bias. We assessed the magnitude of inflation of WC estimates in the loci reported by Jin et al.⁹ Their analysis compared African segments of 1,890 African Americans and 113 YRI at SNPs with minor allele frequency (MAF) > 5% and reported a total of 40 SNPs (the 99.99th percentile of 401,559 SNPs tested) clustered into 14 loci that had F_ST > 0.0452. Ten of these loci were previously unreported targets of natural selection, and four were reported as genome-wide significant in the parallel study of Bhatia et al.²³ (or nearly genome-wide significant in the case of HBB, a previously identified target of selection²⁴). Of the ten novel signals, nine produced lower estimates when we used the Hudson estimator, and three fell below the Jin et al.⁹ threshold (F_ST > 0.0452; see Table 2). We note that the 99.99th percentile of F_ST could change as a result of the switch from the WC estimator to the Hudson estimator; however, our analyses indicated that the magnitude of this change would be smaller than the decreases observed at most of the ten reported novel loci (see Appendix A), suggesting that inflated WC F_ST estimates might lead to false-positive signals of selection.
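For readers who want to compute single-SNP estimates, a minimal sketch of the Hudson estimator follows. It assumes the single-SNP form described in reference 20, with n1 and n2 taken as the numbers of sampled chromosomes (an assumption about units). The function name and the allele frequencies in the example are hypothetical; only the sample sizes of 1,890 and 113 individuals come from the text.

```python
def hudson_fst(p1: float, n1: int, p2: float, n2: int) -> float:
    """Single-SNP Hudson F_ST from sample allele frequencies.

    p1, p2: sample allele frequencies in the two populations.
    n1, n2: numbers of sampled chromosomes (assumed unit).
    """
    # Squared frequency difference, corrected for sampling noise
    # separately in each population.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1.0 - p1) / (n1 - 1)
        - p2 * (1.0 - p2) / (n2 - 1)
    )
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    return numerator / denominator

# Hypothetical frequencies; 2 x 1,890 and 2 x 113 chromosomes.
print(hudson_fst(0.30, 2 * 1890, 0.20, 2 * 113))
```

Because each population's sampling correction uses only its own sample size, the estimate does not depend on the ratio of sample sizes. Single-SNP estimates can be negative when the observed difference is smaller than expected from sampling noise.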

Table 2.

Comparison of Signals of Population Differentiation

SNP ID	Region	Gene	Jin et al.⁹ Data			Bhatia et al.²³ Data
SNP ID	Region	Gene	WC F_ST	Hudson F_ST	Model-Based p Value	Model-Based p Value
rs1541044	chr1: 100,125,058–100,183,875	–	0.0562	0.0439^a	4.7 × 10⁻⁵	0.04
rs4460629	chr1: 153,401,959–153,464,086	–	0.0692	0.0650	6.8 × 10⁻⁷	2.1 × 10⁻⁴
rs12094201	chr1: 236,509,336	–	0.0561	0.0489	1.7 × 10⁻⁵	0.86
rs7642575	chr3: 31,400,165	–	0.0453	0.0393^a	1.1 × 10⁻⁴	0.41
rs652888	chr6: 26,554,684–33,961,049	HLA	0.0711	0.0627	1.1 × 10⁻⁶	1.8 × 10⁻¹¹
rs9478984	chr6: 151,555,551–151,569,258	–	0.0545	0.0596	2.1 × 10⁻⁶	0.02
rs10499542	chr7: 22,235,870	–	0.0461	0.0453	3.6 × 10⁻⁵	0.35
rs304735	chr7: 79,768,487–80,482,597	CD36	0.0946	0.0690	3.0 × 10⁻⁷	3.7 × 10⁻¹³
rs2920283	chr8: 143,754,039–143,758,933	PSCA	0.0468	0.0532	7.6 × 10⁻⁶	6.4 × 10⁻⁷
rs1498487	chr11: 5,034,229–5,421,456	HBB	0.0617	0.0464	2.4 × 10⁻⁵	1.7 × 10⁻⁷
rs4883422	chr12: 7,189,594	–	0.0472	0.0461	3.0 × 10⁻⁵	1.3 × 10⁻³
rs6491096	chr13: 25,488,362	–	0.0472	0.0373^a	1.5 × 10⁻⁴	0.4
rs1075875	chr16: 47,595,721	–	0.0766	0.0608	1.3 × 10⁻⁶	NA^b
rs6015945	chr20: 59,319,574	–	0.0627	0.0550	4.3 × 10⁻⁶	0.5

We recreated Table 2 from Jin et al.⁹ by analyzing the same data with the Hudson instead of the WC estimator. We also estimated the p value at each SNP by using the reported F_ST = 0.0007 of Jin et al.⁹ and a model-based approach.²⁴ Finally, we report the model-based p value of the most significant SNP in the region from the parallel study by Bhatia et al.²³ We note that results reported in that paper were more significant than those reported here because Bhatia et al. analyzed additional populations. All positions are from UCSC Genome Browser build hg18.

^aThese loci fell below the threshold for the 99.99th percentile (0.0452) when the Hudson estimator was used.

^bThis locus was not available (NA) because it lacked data in the Bhatia et al.²³ study.

In addition to having issues with F_ST estimation, studies that simply rank the most highly differentiated SNPs between populations are unable to evaluate genome-wide significance of reported signals. On the other hand, model-based approaches^23–26 can formally assess genome-wide significance. In general, studies that use a model-based approach are well powered if sample sizes are much larger than 1 / F_ST,²³ given that both F_ST and sampling noise contribute to normal variation in allele-frequency differences. In the Jin et al.⁹ comparison, the sample size of YRI (n = 113) is much smaller than the reciprocal of F_ST between African Americans and YRI (1 / F_ST = 1,429). When re-evaluated with a model-based approach,^23,24 none of the reported SNPs achieved genome-wide significance (p < 5 × 10⁻⁸; see Table 2). We note that model-based approaches do require robust estimates of F_ST, but these are easily available from even small samples of genome-wide data. We re-examined the statistical significance of the ten novel loci reported by Jin et al.⁹ in the separate data set of Bhatia et al.,²³ which included 6,209 African Americans and 756 YRI. The Bhatia et al.²³ data include nine of these ten loci, and only four of the nine loci were nominally significant (p < 0.05 without correction for multiple-hypothesis testing; see Table 2). Extending the analysis to all 29,141 African Americans in the current study yielded very similar results, given that the YRI sample size was the limiting factor (see Table S8). We caution that the four nominally significant loci should not be viewed as being independently replicated because genetic drift is common to both analyses such that loci in the tail of one analysis could be expected to lie in the tail of the other analysis. The lack of nominal significance at most loci in the nonindependent analysis of Bhatia et al.²³ data suggests that most of the reported novel loci are false positives. We note that the results of Jin et al. and Bhatia et al. were both corrected for European admixture either locally⁹ or genome-wide.²³ Our analyses (see Table S8) agree with prior results that correction for European admixture is imperative²⁷ and found that both corrections perform similarly in terms of power.
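The role of sample size relative to 1 / F_ST can be illustrated with a rough sketch of a model-based test. It assumes a normal approximation in which the allele-frequency difference has variance p(1 − p)(2 F_ST + 1/N_1 + 1/N_2), with N_1 and N_2 counted in chromosomes and p taken as the mean of the two sample frequencies. This variance form is an assumption made for illustration and should be checked against references 23 and 24. The function name and the allele frequencies are hypothetical.

```python
import math
from statistics import NormalDist

def differentiation_p_value(p1: float, n1: int, p2: float, n2: int,
                            fst: float) -> float:
    """Two-sided p value for an allele-frequency difference (assumed model)."""
    p_bar = (p1 + p2) / 2.0
    # Drift contributes 2 * fst; sampling noise contributes 1/n1 + 1/n2.
    variance = p_bar * (1.0 - p_bar) * (2.0 * fst + 1.0 / n1 + 1.0 / n2)
    z = (p1 - p2) / math.sqrt(variance)
    return 2.0 * NormalDist().cdf(-abs(z))

FST = 0.0007  # African Americans vs. YRI, as reported by Jin et al.
print(round(1.0 / FST))  # reciprocal of F_ST quoted in the text

# Same hypothetical frequency difference, two YRI sample sizes.
small_yri = differentiation_p_value(0.30, 2 * 1890, 0.20, 2 * 113, FST)
large_yri = differentiation_p_value(0.30, 2 * 1890, 0.20, 2 * 756, FST)
print(small_yri, large_yri)
```

With 113 YRI individuals the sampling term 1/N_2 exceeds the drift term 2 F_ST, so sampling noise dominates the variance; the larger YRI sample gives a much smaller p value for the same frequency difference.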

It is important to recognize that even robust, genome-wide-significant evidence of unusual population differentiation (e.g., at the four loci identified by both Bhatia et al.²³ and Jin et al.⁹) does not imply selection following the forced migration from Africa. The observed population differences at these loci are best explained by selection within Africa. As an example, we consider the well-studied sickle-cell variant rs334 at the HBB locus, where biological evidence suggests that some selection since the arrival of Africans in the Americas is likely to have occurred. Homozygotes for the recessive allele are afflicted with sickle-cell anemia, a debilitating condition that results in very low fertility. However, the minor allele at rs334 is maintained at high frequency in Africa because heterozygotes have increased malaria resistance.²⁸ The MAF at rs334 in African Americans is 0.050,²⁹ corresponding to an allele frequency of 0.063 (0.050/0.8) on African segments. Conservatively assuming the strongest possible negative selection against the minor allele, we calculate that the maximum allele-frequency difference due to selection post-Africa (after the African ancestors of African Americans migrated from Africa) would be 0.034 (see Appendix A). However, an allele-frequency difference of 0.20 at the HBB locus was reported between Nigerians and Gambians,²³ indicative of larger allele-frequency differences due to selection in Africa. Although these populations have a higher level of differentiation (F_ST = 0.006) than our comparison of African Americans and Nigerians (F_ST = 0.001), we note that allele-frequency differences at HBB are generally related to malaria endemicity and altitude as opposed to F_ST between the populations.²⁴ Thus, we believe that selection in Africa rather than post-Africa is the most likely explanation for most of the observed frequency differences between African Americans and YRI.

Overall, we conclude that there is no locus with genome-wide-significant evidence of selection influencing ancestry in African Americans after their ancestors left Africa and that genome-wide-significant evidence of population differentiation is likely to be best explained by selection in Africa. In addition, we place an upper bound (s_anc = 0.019) on the strength of selection that could have occurred after admixture without being detected in our data. Although strong selection after admixture can be ruled out by our data, weak selection after admixture might have occurred, for example, at the HBB locus. Although our results contrast with previous reports⁹ of selection post-Africa, this discrepancy can be explained by insufficient correction for multiple tests, usage of the WC F_ST estimator instead of the Hudson estimator, and the action of natural selection in Africa.

Several recent studies have investigated unusual deviations in local ancestry as a possible signal of natural selection in admixed populations. Bryc et al.² analyzed 365 African Americans and reported three loci with >3 SDs but correctly noted that these differences were not significant after correction for multiple tests. Jeong et al.³ analyzed 96 Tibetan individuals (derived from admixture of Han- and Sherpa-related populations thousands of years ago) and focused on genes associated with hemoglobin levels (EGLN1 and EPAS1); they found that the observed deviations (3.59 SDs and 3.74 SDs, respectively) at these candidate loci were statistically significant after correction for multiple tests. A recent study⁴ used a new method of local-ancestry inference and reported three loci (including two in the HLA region) with very large (>20%) deviations in local ancestry in 58 Mexican (MXL) samples, but these very large deviations were not observed in consensus MXL local-ancestry calls^5,8,30,31 published by the 1000 Genomes Consortium³² (see Table S9). Finally, recent studies^33–35 have demonstrated evidence of selection since ancient admixture with archaic human populations.

Although a number of alternate methods of detecting selection exist,^36–40 we have focused here on deviations in local ancestry and on population differentiation. We conclude with four recommendations for future studies utilizing these approaches. First, studies reporting selection since admixture on the basis of deviations in local ancestry in African Americans (or in other admixed populations with similar ages of admixture) should employ a genome-wide-significance threshold of p < 10⁻⁵. Second, studies reporting selection on the basis of deviations in local ancestry should be cognizant of the possibility that errors in local-ancestry inference can lead to false-positive signals¹ and that reports of selection might need to be confirmed by multiple methods. Third, studies reporting selection on the basis of population differentiation and involving unequal sample sizes should not use the WC F_ST estimator,¹⁹ which is susceptible to bias in this case, and instead should use the Hudson estimator.^20–22 Fourth, genome-wide significance should not be assessed on the basis of a simple ranking and instead should be assessed via robust model-based approaches.^23–26,41

Acknowledgments

We thank Carlos Bustamante, Bogdan Pasaniuc, and Amy L. Williams for helpful discussions and Richard Cooper for sharing data from 756 Yoruba samples. This research was funded by NIH grant R01 HG006399.

Appendix A

Systematic Deviations in Average Local Ancestry at the Ends of Chromosomes

In the analysis presented in the main text, we removed the first and last 2 Mb of each chromosome because of observed systematic deviations in these regions. When we included all available data, we did observe significant peaks in ancestry (Figure S2). These peaks resided in the first 2 Mb of chromosomes 1 and 7 and the last 2 Mb of chromosome 9. Strong evidence that these peaks were the result of inaccurate local-ancestry inference in these loci was based on (1) a high degree of heterogeneity in inferred local ancestry across cohorts (see Figure S3)—the cohorts showing significant deviations were all genotyped on the same platform (see Table S10)—and (2) unexpected reduction in the length of local-ancestry segments (measured in cM) (see Figure S4). Because of this evidence, we removed the first and last 2 Mb of each chromosome.

Impact of Number of SNPs Analyzed

To test the effect of using a relatively small set of 118,000 SNPs, we excluded the 6,000 CARe individuals who were genotyped on the Affymetrix 6.0 chip. The remaining 22,900 individuals were all genotyped on 461,000 SNPs. In this data set, which had >4-fold denser coverage, we observed no genome-wide-significant deviations in average local ancestry (maximum deviation = 3.76 SDs). This null result is consistent with our result in the full data set and with the extent of admixture LD in African Americans. Because of this admixture LD, 2,000–3,000 markers are sufficient to tag local ancestry in analyses of natural selection since admixture.^1,10,11

Changes in Estimator Alter the 99.99th Percentile

Use of the Hudson F_ST estimator instead of the WC estimator results in lower estimates of F_ST at the loci reported by Jin et al.⁹ However, it is possible that the threshold at the 99.99th percentile is also lowered by use of this estimator and that reported loci still fall at this upper tail of the distribution. To assess this effect in sample sizes similar to those of Jin et al.,⁹ we subsampled 2,500 African American individuals from our data, subtracted European allele frequencies from CEU,²³ and compared the result to YRI by using both the WC and Hudson F_ST at every SNP. According to this analysis, the 99.99th percentile of F_ST was 0.048 for the WC estimator and 0.046 for the Hudson estimator.

Jin et al.⁹ reported a threshold of 0.0452. Even if this decreases by 0.002 as a result of using the Hudson estimator, the mean difference between the WC and Hudson F_ST estimates at the ten novel loci would be 0.006, and 2 of the 14 reported loci would no longer be in the 99.99th percentile (with F_ST estimates of 0.037 and 0.039; see Table 2).
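These figures can be recomputed from the WC and Hudson columns of Table 2. The sketch below treats the ten rows without a named gene as the novel loci, an assumption based on the four named loci (HLA, CD36, PSCA, and HBB) being the previously reported signals; variable names are hypothetical.

```python
# (WC F_ST, Hudson F_ST) for the ten novel loci, copied from Table 2.
novel_loci = {
    "rs1541044": (0.0562, 0.0439),
    "rs4460629": (0.0692, 0.0650),
    "rs12094201": (0.0561, 0.0489),
    "rs7642575": (0.0453, 0.0393),
    "rs9478984": (0.0545, 0.0596),
    "rs10499542": (0.0461, 0.0453),
    "rs4883422": (0.0472, 0.0461),
    "rs6491096": (0.0472, 0.0373),
    "rs1075875": (0.0766, 0.0608),
    "rs6015945": (0.0627, 0.0550),
}

REPORTED_THRESHOLD = 0.0452                    # from Jin et al.
LOWERED_THRESHOLD = REPORTED_THRESHOLD - 0.002  # allowing for the estimator switch

differences = [wc - hudson for wc, hudson in novel_loci.values()]
mean_difference = sum(differences) / len(differences)
n_lower_with_hudson = sum(d > 0 for d in differences)

below_reported = [snp for snp, (_, h) in novel_loci.items()
                  if h < REPORTED_THRESHOLD]
below_lowered = [snp for snp, (_, h) in novel_loci.items()
                 if h < LOWERED_THRESHOLD]

print(round(mean_difference, 3), n_lower_with_hudson)
print(below_reported)
print(below_lowered)
```

The two SNPs that remain below the lowered threshold are the ones with Hudson estimates of 0.0393 and 0.0373.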

Model of Selection at HBB

We assume the strongest possible negative selection against the minor allele at HBB, that heterozygotes have no advantage (because of much lower rates of malaria in the Americas), and that no people with sickle-cell anemia have children. From this information, we can work backward in time with the following equation:

p_{g + 1} = \frac{p_{g}}{1 - p_{g}}, \qquad \text{(Equation A1)}

where p_g represents the frequency of the sickle-cell allele g generations in the past. Assuming that p₀ = 0.0625²⁹ and that seven generations have passed since the admixture of the African and European ancestors of African Americans,⁸ we have p₇ = 0.0962. Thus, the allele frequency in the African ancestors of African Americans seven generations ago would have been 0.096, and the maximum allele-frequency difference due to selection since the migration from Africa would have been 0.034.
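Equation A1 can be iterated numerically, as in the minimal sketch below. One assumption is made loudly here: the recursion is applied to the population-wide frequency γ·p (with γ = 0.796 held constant) and the result is converted back to a frequency on African segments. This reading is not spelled out in the text; it is chosen because it gives a value of about 0.096. All names are hypothetical.

```python
def step_back(q):
    """One generation back in time under Equation A1: q_{g+1} = q_g / (1 - q_g)."""
    return q / (1.0 - q)

def frequency_generations_ago(q0, generations):
    """Apply Equation A1 repeatedly, starting from the present-day frequency q0."""
    q = q0
    for _ in range(generations):
        q = step_back(q)
    return q

GAMMA = 0.796      # genome-wide proportion of African ancestry (from the text)
P0 = 0.0625        # present-day frequency on African segments (from the text)
GENERATIONS = 7    # generations since admixture (from the text)

# Assumption: the recursion acts on the population-wide frequency GAMMA * p.
q7 = frequency_generations_ago(GAMMA * P0, GENERATIONS)
p7_on_african_segments = q7 / GAMMA

# For comparison: the recursion applied directly to P0.
p7_direct = frequency_generations_ago(P0, GENERATIONS)
```

Equation A1 is equivalent to 1/p_{g+1} = 1/p_g − 1, so the direct application to 0.0625 = 1/16 gives exactly 1/9 after seven steps, whereas the population-wide reading gives roughly 0.096.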

Under this model, the per-allele selection coefficient is simply the sickle-cell allele frequency in the whole population, not on African segments alone, at the current generation (s_g = γp_g, where γ is the proportion of African ancestry at the HBB locus during the current generation). If we assume that the proportion of local ancestry at each locus seven generations ago is equivalent to the current genome-wide average, the maximum value of this coefficient is s = 0.796(p₇) = 0.077. The selection coefficient per copy of African local ancestry is given by s_anc = γ(p)². That is, for selection to act on an individual who carries one African chromosome at the HBB locus, that individual must also carry (1) the sickle-cell allele on this first African chromosome (with probability p), (2) a second African chromosome at this locus (with probability γ), and (3) the sickle-cell allele on that second African chromosome (with probability p). According to our model, the maximum value of this coefficient is s_anc = 0.796(p₇)² = 0.0074. We also explored the effect of weak negative selection against heterozygotes (h) for the sickle-cell allele on both local ancestry and allele-frequency changes following admixture. Our results suggest that only very strong negative selection against heterozygotes (h > 0.05) would produce a genome-wide-significant deviation in average local ancestry, whereas allele frequencies would be affected at smaller values of h (see Table S11).
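The two maximum coefficients quoted above follow from simple arithmetic, sketched here with the values stated in the text (γ = 0.796 and p₇ = 0.0962). The function names are hypothetical.

```python
def per_allele_coefficient(gamma, p):
    """s = gamma * p: sickle-cell allele frequency in the whole population."""
    return gamma * p

def per_african_copy_coefficient(gamma, p):
    """s_anc = gamma * p**2.

    Given one African chromosome at the locus: it carries the allele (p),
    the second chromosome is African (gamma), and it also carries the
    allele (p).
    """
    return gamma * p ** 2

GAMMA = 0.796   # proportion of African ancestry (from the text)
P7 = 0.0962     # allele frequency seven generations ago (from the text)

s_max = per_allele_coefficient(GAMMA, P7)
s_anc_max = per_african_copy_coefficient(GAMMA, P7)
```

Rounded to the precision used in the text, these evaluate to 0.077 and 0.0074.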

Supplemental Data

Document S1. Figures S1–S5 and Tables S1–S11

mmc1.pdf (571.7 KB, PDF)

Document S2. Article plus Supplemental Data

mmc2.pdf (805.8 KB, PDF)

References

1. Seldin M.F., Pasaniuc B., Price A.L. New approaches to disease mapping in admixed populations. Nat. Rev. Genet. 2011;12:523–528. doi: 10.1038/nrg3002.
2. Bryc K., Auton A., Nelson M.R., Oksenberg J.R., Hauser S.L., Williams S., Froment A., Bodo J.-M., Wambebe C., Tishkoff S.A., Bustamante C.D. Genome-wide patterns of population structure and admixture in West Africans and African Americans. Proc. Natl. Acad. Sci. USA. 2010;107:786–791. doi: 10.1073/pnas.0909559107.
3. Jeong C., Alkorta-Aranburu G., Basnyat B., Neupane M., Witonsky D.B., Pritchard J.K., Beall C.M., Di Rienzo A. Admixture facilitates genetic adaptations to high altitude in Tibet. Nat. Commun. 2014;5:3281. doi: 10.1038/ncomms4281.
4. Guan Y. Detecting structure of haplotypes and local ancestry. Genetics. 2014;196:625–642. doi: 10.1534/genetics.113.160697.
5. Baran Y., Pasaniuc B., Sankararaman S., Torgerson D.G., Gignoux C., Eng C., Rodriguez-Cintron W., Chapela R., Ford J.G., Avila P.C. Fast and accurate inference of local ancestry in Latino populations. Bioinformatics. 2012;28:1359–1367. doi: 10.1093/bioinformatics/bts144.
6. Pasaniuc B., Sankararaman S., Torgerson D.G., Gignoux C., Zaitlen N., Eng C., Rodriguez-Cintron W., Chapela R., Ford J.G., Avila P.C. Analysis of Latino populations from GALA and MEC studies reveals genomic loci with biased local ancestry estimation. Bioinformatics. 2013;29:1407–1415. doi: 10.1093/bioinformatics/btt166.
7. Hinch A.G., Tandon A., Patterson N., Song Y., Rohland N., Palmer C.D., Chen G.K., Wang K., Buxbaum S.G., Akylbekova E.L. The landscape of recombination in African Americans. Nature. 2011;476:170–175. doi: 10.1038/nature10336.
8. Price A.L., Tandon A., Patterson N., Barnes K.C., Rafaels N., Ruczinski I., Beaty T.H., Mathias R., Reich D., Myers S. Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet. 2009;5:e1000519. doi: 10.1371/journal.pgen.1000519.
9. Jin W., Xu S., Wang H., Yu Y., Shen Y., Wu B., Jin L. Genome-wide detection of natural selection in African Americans pre- and post-admixture. Genome Res. 2012;22:519–527. doi: 10.1101/gr.124784.111.
10. Patterson N., Hattangadi N., Lane B., Lohmueller K.E., Hafler D.A., Oksenberg J.R., Hauser S.L., Smith M.W., O’Brien S.J., Altshuler D. Methods for high-density admixture mapping of disease genes. Am. J. Hum. Genet. 2004;74:979–1000. doi: 10.1086/420871.
11. Smith M.W., O’Brien S.J. Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nat. Rev. Genet. 2005;6:623–632. doi: 10.1038/nrg1657.
12. Smith M.W., Patterson N., Lautenberger J.A., Truelove A.L., McDonald G.J., Waliszewska A., Kessing B.D., Malasky M.J., Scafe C., Le E. A high-density admixture map for disease gene discovery in African Americans. Am. J. Hum. Genet. 2004;74:1001–1013. doi: 10.1086/420856.
13. Akey J.M., Zhang G., Zhang K., Jin L., Shriver M.D. Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002;12:1805–1814. doi: 10.1101/gr.631202.
14. McEvoy B.P., Montgomery G.W., McRae A.F., Ripatti S., Perola M., Spector T.D., Cherkas L., Ahmadi K.R., Boomsma D., Willemsen G. Geographical structure and differential natural selection among North European populations. Genome Res. 2009;19:804–814. doi: 10.1101/gr.083394.108.
15. Pickrell J.K., Coop G., Novembre J., Kudaravalli S., Li J.Z., Absher D., Srinivasan B.S., Barsh G.S., Myers R.M., Feldman M.W., Pritchard J.K. Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 2009;19:826–837. doi: 10.1101/gr.087577.108.
16. Teo Y.-Y., Sim X., Ong R.T.H., Tan A.K.S., Chen J., Tantoso E., Small K.S., Ku C.-S., Lee E.J.D., Seielstad M., Chia K.S. Singapore Genome Variation Project: a haplotype map of three Southeast Asian populations. Genome Res. 2009;19:2154–2162. doi: 10.1101/gr.095000.109.
17. Akey J.M. Constructing genomic maps of positive selection in humans: where do we go from here? Genome Res. 2009;19:711–722. doi: 10.1101/gr.086652.108.
18. Engelken J., Carnero-Montoro E., Pybus M., Andrews G.K., Lalueza-Fox C., Comas D., Sekler I., de la Rasilla M., Rosas A., Stoneking M. Extreme population differences in the human zinc transporter ZIP4 (SLC39A4) are explained by positive selection in Sub-Saharan Africa. PLoS Genet. 2014;10:e1004128. doi: 10.1371/journal.pgen.1004128.
19. Weir B.S., Cockerham C.C. Estimating F-Statistics for the Analysis of Population Structure. Evolution. 1984;38:1358–1370. doi: 10.1111/j.1558-5646.1984.tb05657.x.
20. Bhatia G., Patterson N., Sankararaman S., Price A.L. Estimating and interpreting FST: the impact of rare variants. Genome Res. 2013;23:1514–1521. doi: 10.1101/gr.154831.113.
21. Hudson R.R., Slatkin M., Maddison W.P. Estimation of levels of gene flow from DNA sequence data. Genetics. 1992;132:583–589. doi: 10.1093/genetics/132.2.583.
22. Weir B.S., Hill W.G. Estimating F-statistics. Annu. Rev. Genet. 2002;36:721–750. doi: 10.1146/annurev.genet.36.050802.093940.
23. Bhatia G., Patterson N., Pasaniuc B., Zaitlen N., Genovese G., Pollack S., Mallick S., Myers S., Tandon A., Spencer C. Genome-wide comparison of African-ancestry populations from CARe and other cohorts reveals signals of natural selection. Am. J. Hum. Genet. 2011;89:368–381. doi: 10.1016/j.ajhg.2011.07.025.
24. Ayodo G., Price A.L., Keinan A., Ajwang A., Otieno M.F., Orago A.S.S., Patterson N., Reich D. Combining evidence of natural selection with association analysis increases power to detect malaria-resistance variants. Am. J. Hum. Genet. 2007;81:234–242. doi: 10.1086/519221.
25. Lewontin R.C., Krakauer J. Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics. 1973;74:175–195. doi: 10.1093/genetics/74.1.175.
26. Price A.L., Helgason A., Palsson S., Stefansson H., St Clair D., Andreassen O.A., Reich D., Kong A., Stefansson K. The impact of divergence time on the nature of population structure: an example from Iceland. PLoS Genet. 2009;5:e1000505. doi: 10.1371/journal.pgen.1000505.
27. Huerta-Sánchez E., Degiorgio M., Pagani L., Tarekegn A., Ekong R., Antao T., Cardona A., Montgomery H.E., Cavalleri G.L., Robbins P.A. Genetic signatures reveal high-altitude adaptation in a set of Ethiopian populations. Mol. Biol. Evol. 2013;30:1877–1888. doi: 10.1093/molbev/mst089.
28. Aidoo M., Terlouw D.J., Kolczak M.S., McElroy P.D., ter Kuile F.O., Kariuki S., Nahlen B.L., Lal A.A., Udhayakumar V. Protective effects of the sickle cell gene against malaria morbidity and mortality. Lancet. 2002;359:1311–1312. doi: 10.1016/S0140-6736(02)08273-9.
29. Auer P.L., Johnsen J.M., Johnson A.D., Logsdon B.A., Lange L.A., Nalls M.A., Zhang G., Franceschini N., Fox K., Lange E.M. Imputation of exome sequence variants into population-based samples and blood-cell-trait-associated loci in African Americans: NHLBI GO Exome Sequencing Project. Am. J. Hum. Genet. 2012;91:794–808. doi: 10.1016/j.ajhg.2012.08.031.
30. Maples B.K., Gravel S., Kenny E.E., Bustamante C.D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 2013;93:278–288. doi: 10.1016/j.ajhg.2013.06.020.
31. Churchhouse C., Marchini J. Multiway admixture deconvolution using phased or unphased ancestral panels. Genet. Epidemiol. 2013;37:1–12. doi: 10.1002/gepi.21692.
32. Abecasis G.R., Auton A., Brooks L.D., DePristo M.A., Durbin R.M., Handsaker R.E., Kang H.M., Marth G.T., McVean G.A., 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632.
33. Vernot B., Akey J.M. Resurrecting surviving Neandertal lineages from modern human genomes. Science. 2014;343:1017–1021. doi: 10.1126/science.1245938.
34. Sankararaman S., Mallick S., Dannemann M., Prüfer K., Kelso J., Pääbo S., Patterson N., Reich D. The genomic landscape of Neanderthal ancestry in present-day humans. Nature. 2014;507:354–357. doi: 10.1038/nature12961.
35. Huerta-Sánchez E., Jin X., Asan, Bianba Z., Peter B.M., Vinckenbosch N., Liang Y., Yi X., He M., Somel M. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature. 2014;512:194–197. doi: 10.1038/nature13408.
36. Sabeti P.C., Schaffner S.F., Fry B., Lohmueller J., Varilly P., Shamovsky O., Palma A., Mikkelsen T.S., Altshuler D., Lander E.S. Positive natural selection in the human lineage. Science. 2006;312:1614–1620. doi: 10.1126/science.1124309.
37. Voight B.F., Kudaravalli S., Wen X., Pritchard J.K. A map of recent positive selection in the human genome. PLoS Biol. 2006;4:e72. doi: 10.1371/journal.pbio.0040072.
38. Sabeti P.C., Varilly P., Fry B., Lohmueller J., Hostetter E., Cotsapas C., Xie X., Byrne E.H., McCarroll S.A., Gaudet R., International HapMap Consortium. Genome-wide detection and characterization of positive selection in human populations. Nature. 2007;449:913–918. doi: 10.1038/nature06250.
39. Chen H., Patterson N., Reich D. Population differentiation as a test for selective sweeps. Genome Res. 2010;20:393–402. doi: 10.1101/gr.100545.109.
40. Peter B.M., Huerta-Sanchez E., Nielsen R. Distinguishing between selective sweeps from standing variation and from a de novo mutation. PLoS Genet. 2012;8:e1003011. doi: 10.1371/journal.pgen.1003011.
41. Grossman S.R., Shlyakhter I., Karlsson E.K., Byrne E.H., Morales S., Frieden G., Hostetter E., Angelino E., Garber M., Zuk O. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science. 2010;327:883–886. doi: 10.1126/science.1183863.

