Abstract
Structurally complex genomic regions are not yet well understood. One such locus, human chromosome 17q21.31, contains a megabase-long inversion polymorphism [1], many uncharacterized copy-number variations (CNVs) and markers that associate with female fertility [1], female meiotic recombination [1,2,3] and neurological disease [4,5]. Additionally, the inverted H2 form of 17q21.31 seems to be positively selected in Europeans [1]. We developed a population genetics approach to analyze complex genome structures and identified nine segregating structural forms of 17q21.31. Both the H1 and H2 forms of the 17q21.31 inversion polymorphism contain independently derived, partial duplications of the KANSL1 gene; these duplications, which produce novel KANSL1 transcripts, have both recently risen to high allele frequencies (26% and 19%) in Europeans. An older H2 form lacking such a duplication is present at low frequency in European and central African hunter-gatherer populations. We further show that complex genome structures can be analyzed by imputation from SNPs.
References
1. Stefansson, H. et al. A common inversion under selection in Europeans. Nat. Genet. 37, 129–137 (2005).
2. Chowdhury, R., Bois, P.R., Feingold, E., Sherman, S.L. & Cheung, V.G. Genetic analysis of variation in human meiotic recombination. PLoS Genet. 5, e1000648 (2009).
3. Fledel-Alon, A. et al. Variation in human recombination rates and its genetic determinants. PLoS ONE 6, e20321 (2011).
4. Skipper, L. et al. Linkage disequilibrium and association of MAPT H1 in Parkinson disease. Am. J. Hum. Genet. 75, 669–677 (2004).
5. Simón-Sánchez, J. et al. Genome-wide association study reveals genetic risk underlying Parkinson's disease. Nat. Genet. 41, 1308–1312 (2009).
6. McCarroll, S.A. et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat. Genet. 40, 1166–1174 (2008).
7. Conrad, D.F. et al. Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712 (2010).
8. McCarroll, S.A. et al. Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nat. Genet. 40, 1107–1112 (2008).
9. Willer, C.J. et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat. Genet. 41, 25–34 (2009).
10. Handsaker, R.E., Korn, J.M., Nemesh, J. & McCarroll, S.A. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat. Genet. 43, 269–276 (2011).
11. Quinlan, A.R. & Hall, I.M. Characterizing complex structural variation in germline and somatic genomes. Trends Genet. 28, 43–53 (2012).
12. International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005).
13. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
14. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
15. Mills, R.E. et al. Mapping copy number variation by population-scale genome sequencing. Nature 470, 59–65 (2011).
16. Zody, M.C. et al. Evolutionary toggling of the MAPT 17q21.31 inversion region. Nat. Genet. 40, 1076–1083 (2008).
17. McCarroll, S.A. Copy-number analysis goes more than skin deep. Nat. Genet. 40, 5–6 (2008).
18. Hindson, B.J. et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83, 8604–8610 (2011).
19. Tishkoff, S.A. et al. Convergent adaptation of human lactase persistence in Africa and Europe. Nat. Genet. 39, 31–40 (2007).
20. Genovese, G. et al. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329, 841–845 (2010).
21. Yu, L., Song, Y. & Wharton, R.P. E(nos)/CG4699 required for nanos function in the female germ line of Drosophila. Genesis 48, 161–170 (2010).
22. Smith, E.R. et al. A human protein complex homologous to the Drosophila MSL complex is responsible for the majority of histone H4 acetylation at lysine 16. Mol. Cell. Biol. 25, 9175–9188 (2005).
23. Li, X., Wu, L., Corsa, C.A., Kunkel, S. & Dou, Y. Two mammalian MOF complexes regulate transcription activation by distinct mechanisms. Mol. Cell 36, 290–301 (2009).
24. Sharp, A.J. et al. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77, 78–88 (2005).
25. Redon, R. et al. Global variation in copy number in the human genome. Nature 444, 444–454 (2006).
26. Browning, S.R. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439–450 (2008).
27. Li, Y., Willer, C., Sanna, S. & Abecasis, G. Genotype imputation. Annu. Rev. Genomics Hum. Genet. 10, 387–406 (2009).
28. Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499–511 (2010).
29. Browning, B.L. & Browning, S.R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210–223 (2009).
30. Steinberg, K.M. et al. Structural diversity and African origin of the 17q21.31 inversion polymorphism. Nat. Genet. published online: doi:10.1038/ng.2335 (1 July 2012).
31. Montgomery, S.B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010).
Acknowledgements
J. Korn provided an early version of software for visualizing haplotype diversity. N. Rohland and T. Mullen contributed expertise on laboratory experiments. We thank N. Patterson, D. Reich, D. Altshuler, E. Lander, B. Browning, J. Korn, J. Gray, C. Patil, G. Genovese, A. Sekar and S. Grossman for helpful conversations and/or comments on the manuscript. This work was supported by a Smith Family Award for Excellence in Biomedical Research to S.A.M., by the National Human Genome Research Institute (U01HG005208) and by startup resources from the Harvard Medical School Department of Genetics.
Author information
Contributions
S.A.M., L.M.B. and R.E.H. conceived the strategy for population genetics dissection of structurally complex loci. L.M.B. performed all laboratory experiments and multiple computational analyses, including the estimation of haplotype frequencies, delineation of CNV regions and alignment of next-generation sequence data. R.E.H. performed computational analyses of the 1000 Genomes Project data, including finding breakpoint-spanning reads for CNVs and integrated analyses of SNP-CNV haplotypes. M.C.Z. performed analyses of sequence data to determine large-scale structures, estimate coalescence and mutation dates and reconstruct the evolutionary history of the locus. R.E.H. and L.M.B. developed the imputation strategy. S.A.M., L.M.B., R.E.H. and M.C.Z. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Note, Supplementary Tables 1–17 and Supplementary Figures 1–8 (PDF 3406 kb)
About this article
Cite this article
Boettger, L., Handsaker, R., Zody, M. et al. Structural haplotypes and recent evolution of the human 17q21.31 region. Nat Genet 44, 881–885 (2012). https://doi.org/10.1038/ng.2334
DOI: https://doi.org/10.1038/ng.2334