Abstract
Linear mixed models have attracted considerable attention recently as a powerful and effective tool for accounting for population stratification and relatedness in genetic association tests. However, existing methods for exact computation of standard test statistics are computationally impractical for even moderate-sized genome-wide association studies. To address this issue, several approximate methods have been proposed. Here, we present an efficient exact method, which we refer to as genome-wide efficient mixed-model association (GEMMA), that makes approximations unnecessary in many contexts. This method is approximately n times faster than the widely used exact method known as efficient mixed-model association (EMMA), where n is the sample size, making exact genome-wide association analysis computationally practical for large numbers of individuals.
References
Kang, H.M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Kang, H.M., Ye, C. & Eskin, E. Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots. Genetics 180, 1909–1925 (2008).
Kang, H.M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).
Listgarten, J., Kadie, C., Schadt, E.E. & Heckerman, D. Correction for hidden confounders in the genetic analysis of gene expression. Proc. Natl. Acad. Sci. USA 107, 16465–16470 (2010).
Price, A.L., Zaitlen, N.A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).
Yu, J. et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).
Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 42, 355–360 (2010).
Lippert, C. et al. FaST linear mixed models for genome-wide association studies. Nat. Methods 8, 833–835 (2011).
Aulchenko, Y.S., Ripke, S., Isaacs, A. & van Duijn, C.M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).
Aulchenko, Y.S., de Koning, D.J. & Haley, C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177, 577–585 (2007).
Abney, M., Ober, C. & McPeek, M.S. Quantitative-trait homozygosity and association mapping and empirical genomewide significance in large, complex pedigrees: fasting serum-insulin level in the Hutterites. Am. J. Hum. Genet. 70, 920–934 (2002).
Guan, Y. & Stephens, M. Practical issues in imputation-based association mapping. PLoS Genet. 4, e1000279 (2008).
Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
Knuth, D.E. Big Omicron and big Omega and big Theta. ACM SIGACT News 8, 18–24 (1976).
Bennett, B.J. et al. A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res. 20, 281–290 (2010).
The Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).
Lee, S.H., van der Werf, J.H., Hayes, B.J., Goddard, M.E. & Visscher, P.M. Predicting unobserved phenotypes for complex traits from whole-genome SNP data. PLoS Genet. 4, e1000231 (2008).
Meyer, K. Estimating variances and covariances for multivariate animal models by restricted maximum likelihood. Genet. Sel. Evol. 23, 67–83 (1991).
Searle, S.R., Casella, G. & McCulloch, C.E. Variance Components (Wiley, New York, 2006).
Henderson, C.R. Applications of Linear Models in Animal Breeding (University of Guelph, Guelph, Canada, 1984).
Acknowledgements
This research is supported in part by grants from the US National Institutes of Health (NIH) (HL092206 to Y. Gilad and HG02585 to M.S.). We thank A.J. Lusis for making the mouse genotype and phenotype data available. This study also makes use of data generated by the WTCCC (ref. 15). A full list of the investigators who contributed to the generation of the data is available from the WTCCC website. Funding for the WTCCC project was provided by the Wellcome Trust (award 085475).
Author information
Contributions
X.Z. and M.S. designed the study, developed methods and wrote the manuscript. X.Z. implemented software and analyzed data.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1 and 2, Supplementary Table 1 and Supplementary Note (PDF 365 kb)
About this article
Cite this article
Zhou, X., Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet 44, 821–824 (2012). https://doi.org/10.1038/ng.2310
DOI: https://doi.org/10.1038/ng.2310