Papers by Nicholas Patterson
The American Journal of Human Genetics, 2011
PLoS Genetics, 2011
Previous genetic studies have suggested a history of sub-Saharan African gene flow into some West Eurasian populations after the initial dispersal out of Africa that occurred at least 45,000 years ago. However, there has been no accurate characterization of the proportion of mixture, or of its date. We analyze genome-wide polymorphism data from about 40 West Eurasian groups to show that almost all Southern Europeans have inherited 1%-3% African ancestry with an average mixture date of around 55 generations ago, consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations. Levantine groups harbor 4%-15% African ancestry with an average mixture date of about 32 generations ago, consistent with close political, economic, and cultural links with Egypt in the late middle ages. We also detect 3%-5% sub-Saharan African ancestry in all eight of the diverse Jewish populations that we analyzed. For the Jewish admixture, we obtain an average estimated date of about 72 generations. This may reflect descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diasporas.
Nature, 2020
4. Burial Unit 4 (single primary) showing Skeleton A in a foetal position (20-25cm below surface). 5. Burial Unit 5 (single primary) showing Skeleton B partially articulated as disturbed (20cm below surface). 7. Burial Unit 7 (collective) showing disarticulated crania and one partially articulated post-cranial skeleton (20-30cm below surface)
Science, Oct 12, 2012
We present a DNA library preparation method that has allowed us to reconstruct a high-coverage (30×) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of "missing evolution" in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans.
PLoS Genetics, 2009
The prevalence of obesity (body mass index (BMI) ≥30 kg/m²) is higher in African Americans than in European Americans, even after adjustment for socioeconomic factors, suggesting that genetic factors may explain some of the difference. To identify genetic loci influencing BMI, we carried out a pooled analysis of genome-wide admixture mapping scans in 15,280 African Americans from 14 epidemiologic studies. Samples were genotyped at a median of 1,411 ancestry-informative markers. After adjusting for age, sex, and study, BMI was analyzed both as a dichotomized (top 20% versus bottom 20%) and a continuous trait. We found that a higher percentage of European ancestry was significantly correlated with lower BMI (r = −0.042, P = 1.6 × 10⁻⁷). In the dichotomized analysis, we detected two loci on chromosome X as associated with increased African ancestry: the first at Xq25 (locus-specific LOD = 5.94; genome-wide score = 3.22; case-control Z = −3.94); and the second at Xq13.1 (locus-specific LOD = 2.22; case-control Z = −4.62). Quantitative analysis identified a third locus at 5q13.3 where higher BMI was highly significantly associated with greater European ancestry (locus-specific LOD = 6.27; genome-wide score = 3.46). Further mapping studies with dense sets of markers will be necessary to identify the alleles in these regions of chromosomes X and 5 that may be associated with variation in BMI.
The American Journal of Human Genetics, 2011
It has recently been shown that ancestors of New Guineans and Bougainville Islanders have inherited a proportion of their ancestry from Denisovans, an archaic hominin group from Siberia. However, only a sparse sampling of populations from Southeast Asia and Oceania were analyzed. Here, we quantify Denisova admixture in 33 additional populations from Asia and Oceania. Aboriginal Australians, Near Oceanians, Polynesians, Fijians, east Indonesians, and Mamanwa (a "Negrito" group from the Philippines) have all inherited genetic material from Denisovans, but mainland East Asians, western Indonesians, Jehai (a Negrito group from Malaysia), and Onge (a Negrito group from the Andaman Islands) have not. These results indicate that Denisova gene flow occurred into the common ancestors of New Guineans, Australians, and Mamanwa but not into the ancestors of the Jehai and Onge and suggest that relatives of present-day East Asians were not in Southeast Asia when the Denisova gene flow occurred. Our finding that descendants of the earliest inhabitants of Southeast Asia do not all harbor Denisova admixture is inconsistent with a history in which the Denisova interbreeding occurred in mainland Asia and then spread over Southeast Asia, leading to all its earliest modern human inhabitants. Instead, the data can be most parsimoniously explained if the Denisova gene flow occurred in Southeast Asia itself. Thus, archaic Denisovans must have lived over an extraordinarily broad geographic and ecological range, from Siberia to tropical Asia.
PLoS Genetics, 2012
Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene × covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1 × 10⁻⁹).
The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.
Arthritis & Rheumatism, 2008
The American Journal of Human Genetics, 2008
White blood cell count (WBC) is an important clinical marker that varies among different ethnic groups. African Americans are known to have a lower WBC than European Americans. We surveyed the entire genome for loci underlying this difference in WBC by using admixture mapping. We analyzed data from African American participants in the Health, Aging, and Body Composition Study and the Jackson Heart Study. Participants of both studies were genotyped across ≥1322 single nucleotide polymorphisms that were preselected to be informative for African versus European ancestry and span the entire genome. We used these markers to estimate genetic ancestry in each chromosomal region and then tested the association between WBC and genetic ancestry at each locus. We found a locus on chromosome 1q strongly associated with WBC (p < 10⁻¹²). The strongest association was with a marker known to affect the expression of the Duffy blood group antigen. Participants who had both copies of the common West African allele had a mean WBC of 4.9 (SD 1.3); participants who had both common European alleles had a mean WBC of 7.1 (SD 1.3). This variant explained ~20% of population variation in WBC. We used admixture mapping, a novel method for conducting genetic-association studies, to find a region that was significantly associated with WBC on chromosome 1q. Additional studies are needed to determine the biological mechanism for this effect and its clinical implications.
Science, 2019
By sequencing 523 ancient humans, we show that the primary source of ancestry in modern South Asians is a prehistoric genetic gradient between people related to early hunter-gatherers of Iran and Southeast Asia. After the Indus Valley Civilization’s decline, its people mixed with individuals in the southeast to form one of the two main ancestral populations of South Asia, whose direct descendants live in southern India. Simultaneously, they mixed with descendants of Steppe pastoralists who, starting around 4000 years ago, spread via Central Asia to form the other main ancestral population. The Steppe ancestry in South Asia has the same profile as that in Bronze Age Eastern Europe, tracking a movement of people that affected both regions and that likely spread the distinctive features shared between Indo-Iranian and Balto-Slavic languages.
Nature, 2010
Author Information The raw sequence data from the two Denisova fossils, the seven present-day humans, and the tooth mtDNA have been deposited in the European Nucleotide Archive at EMBL-EBI under accession numbers ERP000318, ERP000121 and FR695060, respectively. The alignments of Denisova sequence reads to the human and chimpanzee genomes are accessible for browsing and download from http://genome.ucsc.edu/Denisova.
SSRN Electronic Journal, 2012
This short reply summarizes the concerns of the anthropological community about Ashraf and Galor's (Forthcoming) article in the American Economic Review.
Nature, Jan 23, 2015
Ancient DNA makes it possible to observe natural selection directly by analysing samples from populations before, during and after adaptation events. Here we report a genome-wide scan for selection using ancient DNA, capitalizing on the largest ancient DNA data set yet assembled: 230 West Eurasians who lived between 6500 and 300 BC, including 163 with newly reported data. The new samples include, to our knowledge, the first genome-wide ancient DNA from Anatolian Neolithic farmers, whose genetic material we obtained by extracting from petrous bones, and who we show were members of the population that was the source of Europe's first farmers. We also report a transect of the steppe region in Samara between 5600 and 300 BC, which allows us to identify admixture into the steppe from at least two external sources. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.
PLoS Genetics, 2011
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
The American Journal of Human Genetics, Jan 2, 2014
The extent of recent selection in admixed populations is currently an unresolved question. We scanned the genomes of 29,141 African Americans and failed to find any genome-wide-significant deviations in local ancestry, indicating no evidence of selection influencing ancestry after admixture. A recent analysis of data from 1,890 African Americans reported that there was evidence of selection in African Americans after their ancestors left Africa, both before and after admixture. Selection after admixture was reported on the basis of deviations in local ancestry, and selection before admixture was reported on the basis of allele-frequency differences between African Americans and African populations. The local-ancestry deviations reported by the previous study did not replicate in our very large sample, and we show that such deviations were expected purely by chance, given the number of hypotheses tested. We further show that the previous study's conclusion of selection in African...
Journal of Computational Biology, 2004
In Kellis et al. (2003), we reported the genome sequences of S. paradoxus, S. mikatae and S. bayanus and compared these three yeast species to their close relative, S. cerevisiae. Genome-wide comparative analysis allowed the identification of functionally important sequences, both coding and non-coding. In this companion paper we describe the mathematical and algorithmic results underpinning the analysis of these genomes. We present methods for the automatic determination of genome correspondence. The algorithms enabled the automatic identification of orthologs for more than 90% of genes and intergenic regions across the four species despite the large number of duplicated genes in the yeast genome. The remaining ambiguities in the gene correspondence revealed recent gene family expansions in regions of rapid genomic change. We present methods for the identification of protein-coding genes based on their patterns of nucleotide conservation across related species. We observed the pressure to conserve the reading frame of functional proteins and developed a test for gene identification with high sensitivity and specificity. We used this test to revisit the genome of S. cerevisiae, reducing the overall gene count by 500 genes (10% of previously annotated genes) and refining the gene structure of hundreds of genes. We present novel methods for the systematic de novo identification of regulatory motifs. The methods do not rely on previous knowledge of gene function and in that way differ from the current literature on computational motif discovery. Based on the genome-wide conservation patterns of known motifs, we developed three conservation criteria that we used to discover novel motifs. We used an enumeration approach to select strongly conserved motif cores, which we extended and collapsed into a small number of candidate regulatory motifs.
These include most previously known regulatory motifs as well as several noteworthy novel motifs. The majority of discovered motifs are enriched in functionally related genes, allowing us to infer a candidate function for novel motifs. Our results demonstrate the power of comparative genomics to further our understanding of any species. Our methods are validated by the extensive experimental knowledge in yeast, and will be invaluable in the study of complex genomes like that of human.
Current Anthropology, 2013
We present a critique of a paper written by two economists, Quamrul Ashraf and Oded Galor, which is forthcoming in the American Economic Review and which was uncritically highlighted in Science magazine. Their paper claims there is a causal effect of genetic diversity on economic success, positing that too much or too little genetic diversity constrains development. In particular, they argue that "the high degree of diversity among African populations and the low degree of diversity among Native American populations have been a detrimental force in the development of these regions." We demonstrate that their argument is seriously flawed on both factual and methodological grounds. As economists and other social scientists begin exploring newly available genetic data, it is crucial to remember that nonexperts broadcasting bold claims on the basis of weak data and methods can have profoundly detrimental social and political effects. Explanations for human behavior based on genetic data are powerful and intuitive, but their mobilization comes with responsibility. Since the completion of the full sequencing of the human genome in 2003, several economists have begun to revisit the idea that economic outcomes can be related to genetic background (Ashraf and Galor 2013; Benjamin et al. 2007; Clark Jade d'Alpoim Guedes is a PhD candidate in the