Abstract
An understanding of the genetic variation underlying transcript splicing is essential to dissect the molecular mechanisms of common disease. The available evidence from splicing quantitative trait locus (sQTL) studies has been limited to small samples. We performed genome-wide screening to identify SNPs that might control mRNA splicing in whole blood collected from 5,257 Framingham Heart Study participants. We identified 572,333 cis sQTLs involving 2,650 unique genes. Many sQTL-associated genes (40%) undergo alternative splicing. Using the National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) catalog, we determined that 528 unique sQTLs were significantly enriched for 8,845 SNPs associated with traits in previous GWAS. In particular, we found 395 (4.5%) GWAS SNPs with evidence of cis sQTLs but not gene-level cis expression quantitative trait loci (eQTLs), suggesting that sQTL analysis could provide additional insights into the functional mechanism underlying GWAS results. Our findings provide an informative sQTL resource for further characterizing the potential functional roles of SNPs that control transcript isoforms relevant to common diseases.
Acknowledgements
This research was conducted in part using data and resources from the Framingham Heart Study (FHS) of the National Heart, Lung, and Blood Institute (NHLBI) of the US National Institutes of Health (NIH) and the Boston University School of Medicine. The analyses reflect intellectual input and resource development from the FHS investigators participating in the SNP Health Association Resource (SHARe) project and in the Systems Approach to Biomarker Research in Cardiovascular Disease (SABRe) project.
We thank J. Zhu and Y. Yang at the DNA Sequencing and Genomics Core of the NHLBI for detailed review of our manuscript and helpful suggestions. We also thank J. Dupuis at the Boston University School of Public Health for her statistical suggestions.
This study used the high-performance computational capabilities of the Biowulf Linux cluster at the US NIH (http://biowulf.nih.gov/).
The FHS is funded by US NIH contract N01-HC-25195; this work was also supported by the NHLBI, Division of Intramural Research.
Author information
Contributions
X.Z. designed the study, developed the method, performed the analyses and wrote the manuscript. C.J.O'D. conceived and coordinated the project and wrote the manuscript. B.H.C. provided key input and revised the manuscript. R.J., S.Y. and P.J.M. provided the normalized exon array expression data. T.H., A.D.J., P.J.M. and D.L. reviewed the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Schematic for coverage of Affymetrix exon array probe sets across the entire length of the transcript.
Black regions represent exons, whereas gray regions represent introns. The short dashes underneath the exon regions indicate individual probes of 25 nt in length representing the probe set. The highlighted red box indicates an alternative splicing event (exon 2 is spliced out in mRNA transcript isoform 1), which can be detected by analyzing the exon-level probe sets one by one in this genomic locus. The Affymetrix GeneChip Human Exon 1.0 ST array allows for exon-level expression profiling on a single chip and can interrogate over 280,000 core exons in the human genome.
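The idea of analyzing exon-level probe sets one by one can be illustrated with a minimal sketch. It assumes a simplified model that is not necessarily the authors' method: each exon's log2 expression is expressed relative to the mean across the gene's exons, and that relative value is regressed on SNP allele dosage by ordinary least squares. The function names `slope` and `exon_level_effects`, the exon labels and the toy values are all hypothetical.

```python
from statistics import mean


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def exon_level_effects(dosage, exon_expr):
    """Per-exon genotype effect on exon expression relative to the gene.

    dosage    -- allele dosage (0, 1 or 2) for each sample
    exon_expr -- dict mapping exon label to per-sample log2 expression
    Assumption: the gene-level signal is the plain mean across exons.
    """
    n = len(dosage)
    exons = list(exon_expr)
    gene_level = [mean(exon_expr[e][i] for e in exons) for i in range(n)]
    effects = {}
    for e in exons:
        # Exon signal with the gene-level signal subtracted out.
        relative = [exon_expr[e][i] - gene_level[i] for i in range(n)]
        effects[e] = slope(dosage, relative)
    return effects


# Toy data (hypothetical): exon2 drops with each copy of the allele.
dosage = [0, 0, 1, 1, 2, 2]
exon_expr = {
    "exon1": [8.0, 8.0, 8.0, 8.0, 8.0, 8.0],
    "exon2": [8.0, 8.0, 7.0, 7.0, 6.0, 6.0],
    "exon3": [8.0, 8.0, 8.0, 8.0, 8.0, 8.0],
}
effects = exon_level_effects(dosage, exon_expr)
```

In this toy example `exon2` receives the most negative slope, which is the pattern expected when an allele promotes skipping of that exon. Because the gene-level mean includes the skipped exon, the other exons pick up a small positive slope; a real analysis would need a more careful gene-level summary and a proper significance test.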
Supplementary Figure 2 The number of exons and SNPs versus the gene length.
Histograms of (a) the number of core probe sets/exons per gene and (b) the number of SNPs with minor allele frequency (MAF) > 0.01 within 50 kb of each gene. Correlation plots of (c) the number of probe sets/exons to the number of SNPs located within 50 kb of each gene, (d) the length of genes to the number of probe sets/exons and (e) the length of genes to the number of SNPs located within 50 kb of each gene.
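The per-gene SNP count described in this legend (MAF > 0.01, within 50 kb of the gene) can be sketched as a simple filter. This is an illustration only: the `Snp` and `Gene` records, the function `snps_near_gene` and the example coordinates are hypothetical, and the window is assumed to extend 50 kb beyond each end of the gene.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    maf: float


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def snps_near_gene(gene, snps, window=50_000, min_maf=0.01):
    """Return SNPs on the gene's chromosome that lie within `window` bp
    of the gene boundaries and have MAF strictly above `min_maf`."""
    lo, hi = gene.start - window, gene.end + window
    return [
        s for s in snps
        if s.chrom == gene.chrom and lo <= s.pos <= hi and s.maf > min_maf
    ]


# Toy data (hypothetical identifiers and positions).
gene = Gene("GENE_A", "1", 100_000, 120_000)
snps = [
    Snp("rsA", "1", 60_000, 0.20),    # inside the 50 kb flank
    Snp("rsB", "1", 40_000, 0.20),    # too far upstream
    Snp("rsC", "1", 170_000, 0.005),  # in window but MAF too low
    Snp("rsD", "2", 110_000, 0.30),   # different chromosome
    Snp("rsE", "1", 110_000, 0.05),   # inside the gene body
]
nearby = snps_near_gene(gene, snps)
```

Counting `len(nearby)` for every gene would give the quantity plotted in panels (b), (c) and (e), under the stated assumptions about the window definition.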
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–6 and Supplementary Note. (PDF 1126 kb)
Supplementary Tables
Supplementary Tables 1–8 (XLSX 874 kb)
About this article
Cite this article
Zhang, X., Joehanes, R., Chen, B. et al. Identification of common genetic variants controlling transcript isoform variation in human whole blood. Nat Genet 47, 345–352 (2015). https://doi.org/10.1038/ng.3220