Abstract
Determining whether potential causal variants for related diseases are shared can identify overlapping etiologies of multifactorial disorders. Colocalization methods disentangle shared and distinct causal variants. However, existing approaches require independent data sets. Here we extend two colocalization methods to allow for the shared-control design commonly used in comparison of genome-wide association study results across diseases. Our analysis of four autoimmune diseases—type 1 diabetes (T1D), rheumatoid arthritis, celiac disease and multiple sclerosis—identified 90 regions that were associated with at least one disease, 33 (37%) of which were associated with 2 or more disorders. Nevertheless, for 14 of these 33 shared regions, there was evidence that the causal variants differed. We identified new disease associations in 11 regions previously associated with one or more of the other 3 disorders. Four of eight T1D-specific regions contained known type 2 diabetes (T2D) candidate genes (COBL, GLIS3, RNLS and BCAR1), suggesting a shared cellular etiology.
Change history
08 July 2015
In the version of this article initially published, the two panels in Figure 2 were presented in the incorrect order. The error has been corrected in the HTML and PDF versions of the article.
Acknowledgements
M.D.F. is funded by the Wellcome Trust (099772). C.W. and H.G. are funded by the Wellcome Trust (089989).
This work was funded by the JDRF (9-2011-253), the Wellcome Trust (091157) and the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre. The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (100140). ImmunoBase is supported by Eli Lilly and Company.
We thank the UK Medical Research Council (MRC) and Wellcome Trust for funding the collection of DNA for the British 1958 Birth Cohort (MRC grant G0000934 and Wellcome Trust grant 068545/Z/02). Control DNA samples were prepared and provided by S. Ring, R. Jones, M. Pembrey, W. McArdle, D. Strachan and P. Burton.
This research uses resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute of Allergy and Infectious Diseases (NIAID), the National Human Genome Research Institute (NHGRI), the National Institute of Child Health and Human Development (NICHD) and the JDRF and supported by grant U01DK062418 from the NIDDK.
Collection of the rheumatoid arthritis data was funded by the Arthritis Foundation and the US National Institutes of Health.
We are grateful to G. Trynka and D. van Heel for use of the celiac disease data. Funding was provided by the Wellcome Trust, by grants from the Celiac Disease Consortium and an Innovative Cluster approved by the Netherlands Genomics Initiative, by the Dutch government (BSIK03009 to C.W.) and the Netherlands Organisation for Scientific Research (NWO; grant 918.66.620), and by US National Institutes of Health grant 1R01CA141743 and Fondo de Investigación Sanitaria grants FIS08/1676 and FIS07/0353.
Multiple sclerosis data were provided by S. Sawcer and the International Multiple Sclerosis Genetics Consortium. Funding was provided by the US National Institutes of Health, the Wellcome Trust, the UK Multiple Sclerosis Society, the UK MRC, the US National Multiple Sclerosis Society, the Cambridge NIHR Biomedical Research Centre, DeNDRon, the Bibbi and Niels Jensens Foundation, the Swedish Brain Foundation, the Swedish Research Council, the Knut and Alice Wallenberg Foundation, the Swedish Heart-Lung Foundation, the Foundation for Strategic Research, the Stockholm County Council, Karolinska Institutet, INSERM, Fondation d'Aide à la Recherche sur la Sclérose en Plaques, Association Française contre les Myopathies, Infrastructures en Biologie Santé et Agronomie (GIS-IBISA), the German Ministry for Education and Research, the German Competence Network Multiple Sclerosis, Deutsche Forschungsgemeinschaft, Munich Biotec Cluster M4, the Fidelity Biosciences Research Initiative, Research Foundation Flanders, Research Fund KU Leuven, the Belgian Charcot Foundation, Gemeinnützige Hertie Stiftung, University Zurich, the Danish Multiple Sclerosis Society, the Danish Council for Strategic Research, the Academy of Finland, the Sigrid Juselius Foundation, Helsinki University, the Italian Multiple Sclerosis Foundation, Fondazione Cariplo, the Italian Ministry of University and Research, the Torino Savings Bank Foundation, the Italian Ministry of Health, the Italian Institute of Experimental Neurology, the Multiple Sclerosis Association of Oslo, the Norwegian Research Council, the South-Eastern Norwegian Health Authorities, the Australian National Health and Medical Research Council, the Dutch Multiple Sclerosis Foundation and Kaiser Permanente.
M. Evangelou is thanked for motivating the investigation of the FASLG association.
Author information
Contributions
M.D.F. conceived and designed experiments, performed statistical analyses, analyzed data and wrote the manuscript. H.G. conceived and designed experiments. O.B. analyzed data, prepared data and maintained ImmunoBase. E.S. prepared data and maintained ImmunoBase. N.M.W. prepared data. M.B. and S.J.S. contributed multiple sclerosis data and interpreted results. J.B., J.W., A.B. and S.E. contributed rheumatoid arthritis data and interpreted results. J.A.T. analyzed data, contributed T1D data and wrote the manuscript. C.W. conceived and designed experiments, analyzed the data and wrote the manuscript. All authors reviewed and contributed to the final manuscript.
Ethics declarations
Competing interests
The JDRF/Wellcome Trust Diabetes and Inflammation Laboratory receives funding from Hoffmann La Roche and Eli Lilly and Company. ImmunoBase, for which O.B. is a principal investigator, is funded in part by Eli Lilly and Company.
Integrated supplementary information
Supplementary Figure 1 The hypotheses being tested by our two approaches.
(a) The hypotheses being tested by the Bayesian approach are represented as collections of configurations. Each configuration is represented by a line, and each circle represents one of the Q SNPs in a region under consideration. Yellow circles represent SNPs that are causal for disease 1; blue circles represent SNPs that are causal for disease 2. We assume that at most one SNP can be causal for each disease. (b) The proportional approach tests the null hypothesis of proportionality: given colocalization, we expect effect estimates at any set of SNPs to be proportional for the two traits. In this plot, each point represents the effect estimate for the two traits at a SNP. Under colocalization, these should lie on a straight line through the origin. (c) The proportional null hypothesis does not correspond to H0 from the Bayesian approach. The null hypothesis of proportionality corresponds to colocalization, single-disease association or association with neither disease. A failure to reject the null hypothesis could also be caused by insufficient power.
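The grouping of configurations in panel (a) can be illustrated with a minimal sketch. It assumes only what the caption states (Q SNPs, at most one causal SNP per disease); the labels H0 to H4 follow the usual colocalization naming and are an assumption here, as is the function name `enumerate_configurations`.

```python
from itertools import product


def enumerate_configurations(q):
    """Group every (causal SNP for disease 1, causal SNP for disease 2) pair
    into hypotheses, allowing at most one causal SNP per disease.

    None means "no causal SNP for that disease". The H0..H4 labels are an
    assumed naming convention, not taken from the caption itself.
    """
    options = [None] + list(range(q))
    configs = {"H0": [], "H1": [], "H2": [], "H3": [], "H4": []}
    for snp1, snp2 in product(options, options):
        if snp1 is None and snp2 is None:
            configs["H0"].append((snp1, snp2))  # associated with neither disease
        elif snp2 is None:
            configs["H1"].append((snp1, snp2))  # disease 1 only
        elif snp1 is None:
            configs["H2"].append((snp1, snp2))  # disease 2 only
        elif snp1 == snp2:
            configs["H4"].append((snp1, snp2))  # same causal SNP (colocalization)
        else:
            configs["H3"].append((snp1, snp2))  # distinct causal SNPs
    return configs


counts = {name: len(items) for name, items in enumerate_configurations(3).items()}
print(counts)
```

For a region with Q SNPs this gives 1, Q, Q, Q(Q − 1) and Q configurations respectively, so the number of distinct-variant configurations grows much faster with region size than the number of shared-variant ones.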
Supplementary Figure 2 τ, the probability of colocalization, given that both traits are associated with a region.
τ can be expressed as a function of the number of SNPs in the region and p12, the probability of any given SNP being associated with both traits (we assume that the probability of a SNP being associated with the first trait only is held constant at 10⁻⁴). The histogram shows the distribution of the number of SNPs present over all regions analyzed. Superimposed upon this are lines showing τ for each of p12 = 10⁻⁵, p12 = 10⁻⁶ and p12 = 10⁻⁷. The dotted line shows τ = 0.50, which we believe to be a reasonable average value. From this, we conclude that p12 = 10⁻⁶ is the most appropriate value to use.
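One way to see how τ depends on region size is the sketch below. It is not necessarily the authors' exact formula: it assumes that every shared-variant configuration has prior weight p12, that every distinct-variant configuration has weight p1 × p2, and that the second trait's per-SNP prior p2 equals the 10⁻⁴ stated for the first trait. The function and parameter names are illustrative.

```python
def tau(n_snps, p12, p1=1e-4, p2=1e-4):
    """Prior probability of a shared causal variant, given that both traits
    are associated with the region (at most one causal SNP per trait).

    Assumptions (not confirmed by the caption): p2 defaults to the same
    value as p1, and configurations are weighted as described above.
    """
    shared = n_snps * p12                         # n_snps shared-variant configurations
    distinct = n_snps * (n_snps - 1) * p1 * p2    # ordered pairs of different SNPs
    return shared / (shared + distinct)


for p12 in (1e-5, 1e-6, 1e-7):
    values = [round(tau(n, p12), 3) for n in (10, 100, 1000)]
    print(f"p12={p12:g}: tau at 10, 100, 1000 SNPs = {values}")
```

Under these assumptions the expression simplifies to p12 / (p12 + (Q − 1) p1 p2), so with p12 = 10⁻⁶ the value of τ crosses 0.50 at roughly 100 SNPs and falls as regions get larger.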
Supplementary Figure 3 A Manhattan plot of the 6q25.3 region containing candidate causal gene TAGAP.
There is strong evidence of colocalization between celiac disease (CEL) and multiple sclerosis (MS) (posterior probability ≈ 0.94). However, the proportional approach shows that the risk allele for celiac disease is protective for multiple sclerosis and vice versa.
Supplementary Figure 4 A Manhattan plot of the 19p13.2 region containing the candidate causal genes ICAM1, ICAM3 and TYK2.
The SNPs considered most likely to be causal by our analysis are highlighted. The green signal is shared by all diseases, whereas the magenta signal is unique to celiac disease (CEL).
Supplementary Figure 5 Information from the UCSC Genome Browser for the 1q24.3 FASLG region.
This region shows association with type 1 diabetes and celiac disease. Note that there is strong evidence of regulatory activity in the region of rs78037977, suggesting that this SNP may be significant.
Supplementary Figure 6 Signal clouds for rs78037977, a SNP within the 1q24.3 region containing candidate causal gene FASLG.
This SNP was removed from the celiac disease data in the original analysis owing to its failing a missingness check. However, the clustering shown here is of good quality, implying that the rs78037977 genotype can be considered reliable.
Supplementary Figure 7 A Manhattan plot of the 7p12.2 region containing the candidate causal gene IKZF1.
This gene overlaps two Immunochip regions separated by a recombination hotspot, one at the 5ʹ end and one at the 3ʹ end. The 5ʹ region contains a colocalized signal for multiple sclerosis (MS) and type 1 diabetes (T1D), whereas the 3ʹ end contains only a T1D signal.
Supplementary Figure 8 P values for type 2 diabetes at the peak SNP for all T1D-associated regions.
These regions are divided into those associated with T1D only and those also associated with other autoimmune diseases. Regions associated with no other autoimmune disease tend to have lower type 2 diabetes (T2D) P values. T2D data were taken from the stage 1 GWAS and stage 2 Metabochip study (summary statistics downloaded from http://diagram-consortium.org/).
Supplementary Figure 9 P-value and colocalization data from the regions with newly identified associations.
The most significant SNP for the known association is found, and its P value for the newly identified association is computed. This is plotted against the posterior probability of colocalization (as computed using the Bayesian colocalization approach).
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–9, Supplementary Tables 4 and 5, and Supplementary Note. (PDF 2234 kb)
Supplementary Table 1
The regions analyzed and the number of SNPs within each region (after quality control). (CSV 6 kb)
Supplementary Table 2
Detailed results from the two colocalization methods for each region/trait pair. (CSV 110 kb)
Supplementary Table 3
The results from the conditional Bayesian analysis. (CSV 7 kb)
About this article
Cite this article
Fortune, M., Guo, H., Burren, O. et al. Statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls. Nat Genet 47, 839–846 (2015). https://doi.org/10.1038/ng.3330