Abstract
Dissecting the genetic basis of disease risk requires measuring all forms of genetic variation, including SNPs and copy number variants (CNVs), and is enabled by accurate maps of their locations, frequencies and population-genetic properties. We designed a hybrid genotyping array (Affymetrix SNP 6.0) to simultaneously measure 906,600 SNPs and copy number at 1.8 million genomic locations. By characterizing 270 HapMap samples, we developed a map of human CNV (at 2-kb breakpoint resolution) informed by integer genotypes for 1,320 copy number polymorphisms (CNPs) that segregate at an allele frequency >1%. More than 80% of the sequence in previously reported CNV regions fell outside our estimated CNV boundaries, indicating that large (>100 kb) CNVs affect much less of the genome than initially reported. Approximately 80% of observed copy number differences between pairs of individuals were due to common CNPs with an allele frequency >5%, and more than 99% derived from inheritance rather than new mutation. Most common, diallelic CNPs were in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated on specific SNP haplotypes.
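To make two quantities in the abstract concrete (the allele frequency of a CNP derived from integer copy-number genotypes, and the strength of linkage disequilibrium between a diallelic CNP and a SNP), here is a minimal sketch. It is an illustration only, not the authors' analysis pipeline. It assumes a diallelic deletion CNP with diploid copy numbers of 0, 1 or 2, and it approximates LD by the squared Pearson correlation of unphased genotype dosages. The function names and the toy genotypes are hypothetical.

```python
from statistics import mean


def deletion_allele_frequency(copy_numbers):
    """Frequency of the deletion allele at a diallelic deletion CNP.

    Assumption: each individual's diploid copy number is 0, 1 or 2,
    so the number of deletion alleles carried is (2 - copy number).
    """
    if any(cn not in (0, 1, 2) for cn in copy_numbers):
        raise ValueError("expected diploid copy numbers of 0, 1 or 2")
    deleted_alleles = sum(2 - cn for cn in copy_numbers)
    return deleted_alleles / (2 * len(copy_numbers))


def genotype_r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors.

    This is a common proxy for LD r^2 when genotypes are unphased;
    it is not a haplotype-based estimate.
    """
    if len(x) != len(y):
        raise ValueError("genotype vectors must have the same length")
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a monomorphic site has no defined correlation")
    return cov * cov / (var_x * var_y)


# Toy data for ten individuals (made up for illustration).
cnp_copy_number = [2, 2, 1, 2, 0, 1, 2, 2, 1, 2]
snp_alt_dosage = [0, 0, 1, 0, 2, 1, 0, 0, 1, 0]

freq = deletion_allele_frequency(cnp_copy_number)
r2 = genotype_r_squared(cnp_copy_number, snp_alt_dosage)
print(f"deletion allele frequency: {freq:.2f}")
print(f"genotype r^2 with SNP: {r2:.2f}")
```

In this toy example the SNP's alternate allele tags the deletion perfectly, so the deletion could be inferred from the SNP genotype alone. A CNP in weak LD with nearby SNPs would give a value well below 1 and would need to be genotyped directly.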
Acknowledgements
We thank J. Kidd, G. Cooper and E. Eichler for sharing data on high-resolution breakpoints of select CNVs prior to its publication; E. Lander, J. Hirschhorn and S. Kathiresan for thoughtful readings of the manuscript; the Affymetrix team of the Broad Institute Genetic Analysis Platform: W. Brodeur, N. Chia, M. DaSilva, J. Gibbons, N. Houde, M. McConnell, R. Barry, K. Nguyen, J. Camarata, M. Fava and T. Nyinjee under the supervision of C. Gates, B. Blumenstiel, D. Gage and M. Parkin; members of the Affymetrix informatics team: X. Di, H. Gorrell, G. Liu, M. Mittmann, M. Shen, C. Sugnet, A. Willams and G. Yang; members of the Affymetrix arrays and assays team: T. Berntsen, M. Chadha, J. Law, H. Matsuzaki, B. Nguyen, K. Travers, N. Vissa and S. Walsh. S.A.M. was supported by a Lilly Life Sciences Research Fellowship.
Author information
Contributions
F.G.K. conceived a strategy for empirical probe reduction of SNP probe sets. S.A.M. conceived of hybrid arrays consisting of polymorphic (SNP) and nonpolymorphic (copy number) probes. F.G.K., S.A.M. and D.A. proposed to Affymetrix a specific redesign of the 500K SNP array based on these concepts. The idea was further developed with input from R.R., J.B., S.C., S.L., K.W.J., S.B.G. and M.J.D., and a pilot initiated. For the pilot (which became the SNP 5.0 array), F.G.K. and J.B.M. selected SNP probe sets, and S.A.M. and J.M.K. selected copy number probes. For the development of the SNP 6.0 array, R.M. directed laboratory SNP screening experiments which were analyzed by S.C., E.H. and T.W. P.I.W.d.B., J.B.M. and S.C. selected SNPs from those which passed the screening effort, using a linkage-disequilibrium tagging strategy. S.A.M. and M.H.S. designed and M.H.S. directed laboratory work for the titration experiment that guided empirical selection of copy number probes; on the basis of these results, together with informatic analyses which A.K. performed, S.A.M. and J.M.K. selected copy number probes. Laboratory experiments at Broad Institute were led by M.P. and S.B.G. A.W., J.N., R.H. and E.H. developed supporting software. S.A.M., J.M.K. and J.N. analyzed the data to identify CNVs. S.A.M., J.N., F.G.K. and J.M.K. developed CNP genotyping analysis. P.J.C. conducted and J.V. analyzed experiments to validate CNP genotypes experimentally. S.A.M. analyzed the population-genetic and linkage-disequilibrium properties of CNVs. J.M.K. analyzed the data for evidence of de novo CNVs. A.L.E. analyzed platforms' coverage of CNVs. S.A.M., F.G.K., J.M.K., M.J.D. and D.A. wrote the manuscript. Discussions among all authors informed the array design, the development of algorithms for analysis and the interpretation of results.
Ethics declarations
Competing interests
S.C., M.H.S., E.H., T.W., R.M., S.L., J.B., K.J. and R.R. are employees of Affymetrix. The remaining authors (S.A.M., F.G.K., J.M.K., J.N., A.W., P.I.W.d.B., J.M., A.K., A.L.E., M.P., R.H., M.J.B., S.B.G. and D.A.) neither personally nor institutionally receive financial support from Affymetrix, and neither the authors nor their employers receive compensation or royalties from the work described in this article.
Supplementary information
Supplementary Text and Figures
Supplementary Methods, Supplementary Tables 1 and 4, and Supplementary Figures 1 and 2 (PDF 1559 kb)
Supplementary Table 2
Genomic locations of copy-number polymorphisms (XLS 147 kb)
Supplementary Table 3
Sample-level determinations of integer copy number for each CNP (XLS 2419 kb)
About this article
Cite this article
McCarroll, S., Kuruvilla, F., Korn, J. et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat Genet 40, 1166–1174 (2008). https://doi.org/10.1038/ng.238
DOI: https://doi.org/10.1038/ng.238