Single-nucleotide polymorphism
A single nucleotide polymorphism or simple nucleotide polymorphism, often abbreviated to just SNP (pronounced snip; plural snips), is a variation in a single nucleotide which may occur at some specific position in the genome, where each variation is present to some appreciable degree within a population (e.g. >1%).[1]
For example, at a specific base position in the human genome, it may be that in most individuals the base C appears there; but in a minority of individuals, the base A appears at that position instead. There is an SNP at this specific base position, and the two possible nucleotide variations - C or A - are said to be alleles for this base position. Although in this example and most SNPs so far discovered there are only two different alleles, there are also triallelic SNPs in which three different base variations may coexist within a population.[2]
SNPs underlie differences in our susceptibility to disease; a wide range of human diseases, e.g. sickle-cell anemia, β-thalassemia and cystic fibrosis, result from SNPs.[3][4][5] The severity of illness and the way our body responds to treatments are also manifestations of genetic variations. For example, a single base mutation in the APOE (apolipoprotein E) gene is associated with a higher risk for Alzheimer's disease.[6]
Types
Single-nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions (regions between genes). SNPs within a coding sequence do not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code.
SNPs in the coding region are of two types: synonymous and nonsynonymous. Synonymous SNPs do not affect the protein sequence, while nonsynonymous SNPs change the amino acid sequence of the protein. Nonsynonymous SNPs are in turn of two types: missense and nonsense.
SNPs that are not in protein-coding regions may still affect gene splicing, transcription factor binding, messenger RNA degradation, or the sequence of non-coding RNA. A SNP of this type that affects gene expression is referred to as an eSNP (expression SNP) and may be upstream or downstream from the gene.
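The coding-region categories described above can be illustrated with a short Python sketch. It assumes the standard genetic code and a codon given on the coding strand; the function name `classify_coding_snp` and its arguments are hypothetical, chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, offset, alt_base):
    """Classify a single-base change inside one codon.

    ref_codon: three-letter reference codon (coding strand, DNA alphabet)
    offset:    0-based position of the changed base within the codon
    alt_base:  the variant base
    Assumption: the reference codon is not itself a stop codon.
    """
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_coding_snp("CTT", 2, "C"))  # synonymous: CTT and CTC both encode leucine
print(classify_coding_snp("CGT", 1, "T"))  # missense: arginine -> leucine
print(classify_coding_snp("GGA", 0, "T"))  # nonsense: glycine -> stop (TGA)
```

The synonymous case is a direct consequence of the degeneracy of the genetic code: several codons map to the same amino acid, so some single-base changes leave the protein sequence untouched.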
Concepts surrounding the application of SNPs
- Association studies are performed to determine whether a genetic variant is associated with a disease or trait.[7]
- A tag SNP is a representative single-nucleotide polymorphism in a region of the genome with high linkage disequilibrium (the non-random association of alleles at two or more loci). Tag SNPs are useful in whole-genome SNP association studies, in which hundreds of thousands of SNPs across the entire genome are genotyped. A set of tag SNPs can be used to infer the alleles at other nearby SNPs.
- A haplotype is a specific set of alleles (or a DNA sequence) that is frequently observed together on a single chromosome or part of a chromosome.
- Linkage disequilibrium (LD), a term used in population genetics, indicates the non-random association of alleles at two or more loci, not necessarily on the same chromosome. It refers to the phenomenon that SNP alleles or DNA sequences that are close together in the genome tend to be inherited together. LD is affected by two parameters: 1) the distance between the SNPs (the larger the distance, the lower the LD); 2) the recombination rate (the lower the recombination rate, the higher the LD).[8]
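As a rough illustration of linkage disequilibrium, the sketch below computes two commonly used measures for a pair of biallelic SNPs from phased haplotypes: the coefficient D = p(AB) - p(A)p(B) and the squared correlation r². The function name and the toy haplotype data are assumptions made for this example, not taken from any particular study or tool.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for two loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
                one tuple per phased chromosome
    allele_a:   allele of interest at locus 1
    allele_b:   allele of interest at locus 2
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h == (allele_a, allele_b) for h in haplotypes) / n
    d = p_ab - p_a * p_b  # zero when the two alleles associate at random
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Toy data: alleles C and G always travel together (complete LD).
linked = [("C", "G")] * 4 + [("A", "T")] * 4
print(linkage_disequilibrium(linked, "C", "G"))  # (0.25, 1.0)

# Toy data: every allele combination is equally common (no LD).
unlinked = [("C", "G"), ("C", "T"), ("A", "G"), ("A", "T")] * 2
print(linkage_disequilibrium(unlinked, "C", "G"))  # (0.0, 0.0)
```

When r² between two SNPs is close to 1, genotyping one of them is nearly as informative as genotyping both, which is the idea behind choosing tag SNPs.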
Frequency
Within a genome
The genomic distribution of SNPs is not homogeneous; SNPs occur in non-coding regions more frequently than in coding regions or, in general, where natural selection is acting and 'fixing' the allele (eliminating other variants) of the SNP that constitutes the most favorable genetic adaptation.[9] Other factors, like genetic recombination and mutation rate, can also determine SNP density.[10]
SNP density can be predicted by the presence of microsatellites: AT microsatellites in particular are potent predictors of SNP density, with long (AT)(n) repeat tracts tending to be found in regions of significantly reduced SNP density and low GC content.[11]
Within a population
There are variations between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another. Within a population, SNPs can be assigned a minor allele frequency — the lowest allele frequency at a locus that is observed in a particular population. This is simply the lesser of the two allele frequencies for single-nucleotide polymorphisms.
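A minimal Python sketch of the minor allele frequency calculation follows. It assumes a biallelic SNP and unphased diploid genotypes written as two-letter strings; the function name and the sample genotypes are made up for illustration.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Minor allele frequency of a biallelic SNP.

    genotypes: iterable of two-character strings such as "CC", "CA", "AA",
               one per diploid individual (assumed format for this sketch).
    Returns 0.0 if only one allele is observed.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("more than two alleles observed; SNP is not biallelic")
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


# Five individuals: allele C is seen 7 times, allele A 3 times.
sample = ["CC", "CA", "CC", "AA", "CC"]
maf = minor_allele_frequency(sample)
print(maf)          # 0.3
print(maf > 0.01)   # True: above the 1% threshold often used to call a variant a polymorphism
```

Because allele frequencies differ between populations, the same function applied to samples from two groups can return quite different values for the same SNP.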
Importance
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also critical for personalized medicine.[12]
Biomedical research
The greatest importance of SNPs in biomedical research is for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease) in genome-wide association studies. SNPs have been used in genome-wide association studies as high-resolution markers in gene mapping related to diseases or normal traits. SNPs without an observable impact on the phenotype (so-called silent mutations) are still useful as genetic markers in genome-wide association studies, because of their quantity and their stable inheritance over generations.[13]
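The cohort comparison at the heart of an association study can be sketched, for a single SNP, as a chi-square test on a 2×2 table of allele counts in cases and controls. This is only an illustrative simplification: the counts below are invented, the function name is hypothetical, and real genome-wide studies typically use regression models with covariates and must correct for testing very many SNPs at once.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi_square_statistic, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 100 cases and 100 controls, two alleles per person.
chi2, p = allelic_chi_square(case_alt=60, case_ref=140,
                             control_alt=40, control_ref=160)
print(round(chi2, 2))  # 5.33
print(p)
```

A small p-value suggests that the allele is more common in one cohort than the other; it does not by itself show that the SNP causes the trait, since the SNP may merely be in linkage disequilibrium with a causal variant.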
Forensics
Pharmacogenetics
The knowledge of SNPs will help in understanding pharmacokinetics (PK) or pharmacodynamics, i.e. how drugs act in individuals with different genetic variants. Diseases with different SNPs may become relevant pharmacogenomic targets for drug therapy.[14] Some SNPs are associated with the metabolism of different drugs.[15][16][17]
Disease
A single SNP may cause a Mendelian disease. For complex diseases, however, SNPs do not usually function individually; rather, they work in coordination with other SNPs to manifest a disease condition, as has been seen in osteoporosis.[18]
All types of SNPs can have an observable phenotype or can result in disease:
- SNPs in non-coding regions can manifest in a higher risk of cancer,[19] and may affect mRNA structure and disease susceptibility.[20]
- SNPs in coding regions:
- synonymous substitutions by definition do not result in a change of amino acid in the protein, but can still affect its function in other ways. An example is a seemingly silent mutation in the multidrug resistance gene 1 (MDR1), which codes for a cellular membrane pump that expels drugs from the cell: the mutation can slow down translation and allow the peptide chain to fold into an unusual conformation, causing the mutant pump to be less functional.[21]
- nonsynonymous substitutions:
- missense - a single base change results in a change of an amino acid in the protein and in its malfunction, which leads to disease (e.g. the c.1580G>T SNP in the LMNA gene: at position 1580 (nt) of the DNA sequence (CGT codon), guanine is replaced with thymine, yielding a CTT codon; at the protein level this replaces arginine with leucine at position 527,[22] and at the phenotype level it manifests as overlapping mandibuloacral dysplasia and progeria syndrome)
- nonsense - a point mutation in a sequence of DNA that results in a premature stop codon, or a nonsense codon in the transcribed mRNA, and in a truncated, incomplete, and usually nonfunctional protein product (e.g. cystic fibrosis caused by the G542X mutation in the cystic fibrosis transmembrane conductance regulator gene).[23]
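The arithmetic behind the LMNA missense example can be checked with a small sketch: coding position 1580 falls in codon 527, at the middle base of that codon. The sketch assumes coding positions are numbered from 1 starting at the first base of the start codon; the function names and the two-entry codon lookup are illustrative only.

```python
def codon_position(coding_pos):
    """Map a 1-based coding-sequence position to
    (1-based codon number, 0-based offset within that codon)."""
    index = coding_pos - 1
    return index // 3 + 1, index % 3


def apply_substitution(codon, offset, ref_base, alt_base):
    """Return the codon after replacing ref_base with alt_base at offset."""
    if codon[offset] != ref_base:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt_base + codon[offset + 1:]


# Partial lookup covering only the two codons of this example.
AMINO_ACID = {"CGT": "arginine", "CTT": "leucine"}

codon_number, offset = codon_position(1580)
print(codon_number, offset)  # 527 1

mutant = apply_substitution("CGT", offset, "G", "T")
print(mutant)                                    # CTT
print(AMINO_ACID["CGT"], "->", AMINO_ACID[mutant])  # arginine -> leucine
```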
Examples
- rs6311 and rs6313 are SNPs in the Serotonin 5-HT2A receptor gene on human chromosome 13.[24]
- A SNP in the F5 gene causes Factor V Leiden thrombophilia.[25]
- rs3091244 is an example of a triallelic SNP in the CRP gene on human chromosome 1.[26]
- TAS2R38 codes for PTC tasting ability, and contains 6 annotated SNPs.[27]
- rs148649884 and rs138055828 in the FCN1 gene encoding M-ficolin crippled the ligand-binding capability of the recombinant M-ficolin.[28]
Databases
As there are for genes, bioinformatics databases exist for SNPs.
- dbSNP is a SNP database from the National Center for Biotechnology Information (NCBI). As of 8 June 2015, dbSNP listed 149,735,377 SNPs in humans.[29][30]
- Kaviar[31] is a compendium of SNPs from multiple data sources including dbSNP.
- SNPedia is a wiki-style database supporting personal genome annotation, interpretation and analysis.
- The OMIM database describes the association between polymorphisms and diseases (e.g., gives diseases in text form)
- The Human Gene Mutation Database provides gene mutations causing or associated with human inherited diseases and functional SNPs
- The International HapMap Project, in which researchers identify tag SNPs in order to determine the collection of haplotypes present in each subject.
- GWAS Central allows users to visually interrogate the actual summary-level association data in one or more genome-wide association studies.
The International SNP Map working group mapped the sequence flanking each SNP by alignment to the genomic sequence of large-insert clones in GenBank. These alignments were converted to chromosomal coordinates, which are shown in Table 1.[32]
Chromosome | Length (bp) | All SNPs: total | All SNPs: kb per SNP | TSC SNPs: total | TSC SNPs: kb per SNP |
---|---|---|---|---|---|
1 | 214,066,000 | 129,931 | 1.65 | 75,166 | 2.85 |
2 | 222,889,000 | 103,664 | 2.15 | 76,985 | 2.90 |
3 | 186,938,000 | 93,140 | 2.01 | 63,669 | 2.94 |
4 | 169,035,000 | 84,426 | 2.00 | 65,719 | 2.57 |
5 | 170,954,000 | 117,882 | 1.45 | 63,545 | 2.69 |
6 | 165,022,000 | 96,317 | 1.71 | 53,797 | 3.07 |
7 | 149,414,000 | 71,752 | 2.08 | 42,327 | 3.53 |
8 | 125,148,000 | 57,834 | 2.16 | 42,653 | 2.93 |
9 | 107,440,000 | 62,013 | 1.73 | 43,020 | 2.50 |
10 | 127,894,000 | 61,298 | 2.09 | 42,466 | 3.01 |
11 | 129,193,000 | 84,663 | 1.53 | 47,621 | 2.71 |
12 | 125,198,000 | 59,245 | 2.11 | 38,136 | 3.28 |
13 | 93,711,000 | 53,093 | 1.77 | 35,745 | 2.62 |
14 | 89,344,000 | 44,112 | 2.03 | 29,746 | 3.00 |
15 | 73,467,000 | 37,814 | 1.94 | 26,524 | 2.77 |
16 | 74,037,000 | 38,735 | 1.91 | 23,328 | 3.17 |
17 | 73,367,000 | 34,621 | 2.12 | 19,396 | 3.78 |
18 | 73,078,000 | 45,135 | 1.62 | 27,028 | 2.70 |
19 | 56,044,000 | 25,676 | 2.18 | 11,185 | 5.01 |
20 | 63,317,000 | 29,478 | 2.15 | 17,051 | 3.71 |
21 | 33,824,000 | 20,916 | 1.62 | 9,103 | 3.72 |
22 | 33,786,000 | 28,410 | 1.19 | 11,056 | 3.06 |
X | 131,245,000 | 34,842 | 3.77 | 20,400 | 6.43 |
Y | 21,753,000 | 4,193 | 5.19 | 1,784 | 12.19 |
RefSeq | 15,696,674 | 14,534 | 1.08 | | |
Totals | 2,710,164,000 | 1,419,190 | 1.91 | 887,450 | 3.05 |
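The "kb per SNP" columns in the table are simply the chromosome length divided by the number of SNPs, expressed in kilobases. The short sketch below recomputes a few of the "All SNPs" values from rows of the table; the helper name `kb_per_snp` is chosen only for this example.

```python
def kb_per_snp(length_bp, snp_count):
    """Average spacing between SNPs in kilobases."""
    return length_bp / snp_count / 1000


# (label, length in bp, total SNPs) copied from the table above.
rows = [
    ("1", 214_066_000, 129_931),
    ("22", 33_786_000, 28_410),
    ("Y", 21_753_000, 4_193),
    ("Totals", 2_710_164_000, 1_419_190),
]

for label, length_bp, snp_count in rows:
    print(label, round(kb_per_snp(length_bp, snp_count), 2))
# 1 1.65
# 22 1.19
# Y 5.19
# Totals 1.91
```

A larger value means sparser coverage: in this data set the sex chromosomes have markedly fewer mapped SNPs per kilobase than the autosomes.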
Nomenclature
The nomenclature for SNPs can be confusing: several variations can exist for an individual SNP and consensus has not yet been achieved. One approach is to write SNPs with a prefix, period and "greater than" sign showing the wild-type and altered nucleotide or amino acid; for example, c.76A>T.[33][34][35] SNPs are frequently referred to by their dbSNP rs number, as in the examples above.
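The prefix-period-"greater than" notation lends itself to simple parsing. The sketch below handles only plain single-nucleotide substitutions such as c.76A>T, with the DNA-level prefixes c, g, m and n; the `Substitution` type and `parse_substitution` function are hypothetical names for this illustration, and the many other variant forms (deletions, insertions, protein-level changes) are deliberately out of scope.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    prefix: str     # reference type, e.g. "c" for coding DNA, "g" for genomic
    position: int   # 1-based position in that reference
    ref: str        # wild-type nucleotide
    alt: str        # altered nucleotide


# Only simple DNA-level substitutions are recognized in this sketch.
_PATTERN = re.compile(r"^([cgmn])\.(\d+)([ACGT])>([ACGT])$")


def parse_substitution(text):
    """Parse a string like 'c.76A>T' into its components."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a simple nucleotide substitution: {text!r}")
    prefix, position, ref, alt = match.groups()
    return Substitution(prefix, int(position), ref, alt)


print(parse_substitution("c.76A>T"))
# Substitution(prefix='c', position=76, ref='A', alt='T')
```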
SNP analysis
SNPs are usually biallelic and thus easily assayed.[36] Analytical methods to discover novel SNPs and detect known SNPs include:
- DNA sequencing;[37]
- capillary electrophoresis;[38]
- mass spectrometry;[39]
- single-strand conformation polymorphism (SSCP);[40]
- single-base extension;
- electrochemical analysis;
- denaturing HPLC and gel electrophoresis;
- restriction fragment length polymorphism;
- hybridization analysis.
Programs for prediction of SNP effects
An important group of SNPs are those that correspond to missense mutations, which cause an amino acid change at the protein level. A point mutation of a particular residue can have different effects on protein function (from no effect to complete disruption of its function). Usually, a change to an amino acid with similar size and physico-chemical properties (e.g. substitution of leucine by valine) has a mild effect, and the opposite holds for dissimilar amino acids. Similarly, if a SNP disrupts secondary structure elements (e.g. substitution to proline in an alpha helix region), the mutation may affect the whole protein structure and function. Using these simple rules and many others derived by machine learning, a group of programs for the prediction of SNP effects has been developed.
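The two simple rules mentioned above (similar amino acids give mild effects; proline in a helix is disruptive) can be written down as a toy heuristic. This is not how any real prediction program works: the property groups below are one simplified textbook-style grouping chosen as an assumption for this sketch, and the function name and output labels are hypothetical.

```python
# Simplified physico-chemical grouping of the 20 amino acids (an assumption
# for this sketch; other groupings are equally defensible).
PROPERTY_CLASSES = {
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "CGP",
}
CLASS_OF = {
    aa: name for name, members in PROPERTY_CLASSES.items() for aa in members
}


def toy_effect_guess(ref_aa, alt_aa, in_alpha_helix=False):
    """Very rough guess at the effect of a missense change.

    ref_aa, alt_aa: one-letter amino acid codes
    in_alpha_helix: whether the residue lies in an alpha helix
    """
    if ref_aa == alt_aa:
        return "no change"
    if alt_aa == "P" and in_alpha_helix:
        return "likely disruptive"
    if CLASS_OF[ref_aa] == CLASS_OF[alt_aa]:
        return "likely mild"
    return "possibly damaging"


print(toy_effect_guess("L", "V"))                       # likely mild
print(toy_effect_guess("R", "L"))                       # possibly damaging
print(toy_effect_guess("A", "P", in_alpha_helix=True))  # likely disruptive
```

Actual predictors combine many more signals, such as sequence conservation and structural context, so the labels here should be read as illustrations of the rules rather than as predictions.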
See also
- Illumina
- Affymetrix
- International HapMap Project
- SNP array
- SNP genotyping
- SNV calling from NGS data
- Short tandem repeat (STR)
- Single-base extension
- Snpstr
- Tag SNP
- TaqMan
- Variome
- SNP Annotation
Notes
- [7] http://genome.cshlp.org/content/14/5/908.full
- [8] https://www.researchgate.net/publication/229085137_Single_nucleotide_polymorphisms_a_new_paradigm_for_molecular_marker_technology_and_DNA_polymorphism_detection_with_emphasis_on_their_use_in_plantsPK_Gupta_JK_Roy_M_PrasadCurr_Sci_80_%284%29_524-35
- [29] National Center for Biotechnology Information, United States National Library of Medicine. 2014. NCBI dbSNP build 142 for human. http://www.ncbi.nlm.nih.gov/mailman/pipermail/dbsnp-announce/2014q4/000147.html
- [30] National Center for Biotechnology Information, United States National Library of Medicine. 2015. NCBI dbSNP build 144 for human. Summary Page. http://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi?view+summary=view+summary&build_id=144
References
- Nature Reviews Glossary
- Human Genome Project Information — SNP Fact Sheet
- Relation of SNP's with Cancer
External links
- NCBI resources — Introduction to SNPs from NCBI
- The SNP Consortium LTD — SNP search
- NCBI dbSNP database — "a central repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms"
- HGMD — the Human Gene Mutation Database, includes rare mutations and functional SNPs
- SNPedia - a wiki devoted to the medical consequences of DNA variations, including software to analyze personal genomes
- International HapMap Project — "a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals"
- GWAS Central — a central database of summary-level genetic association findings
- 1000 Genomes Project — A Deep Catalog of Human Genetic Variation
- WatCut — an online tool for the design of SNP-RFLP assays
- SNPStats — SNPStats, a web tool for analysis of genetic association studies
- Restriction HomePage — a set of tools for DNA restriction and SNP detection, including design of mutagenic primers
- American Association for Cancer Research Cancer Concepts Factsheet on SNPs
- PharmGKB — The Pharmacogenetics and Pharmacogenomics Knowledge Base, a resource for SNPs associated with drug response and disease outcomes.
- GEN-SNiP — Online tool that identifies polymorphisms in test DNA sequences.
- Rules for Nomenclature of Genes, Genetic Markers, Alleles, and Mutations in Mouse and Rat
- HGNC Guidelines for Human Gene Nomenclature
- SNP effect predictor with galaxy integration
- Open SNP — a portal for sharing own SNP test results