Human Genetic Variation
Why do we care about genetic variations?
Copy-number variations (CNVs)
A form of structural variation: alterations of the DNA of a genome that
result in the cell having an abnormal number of copies of one or more
sections of the DNA.
CNVs correspond to relatively large regions of the genome that have been
deleted (fewer than the normal number) or duplicated (more than the normal
number) on certain chromosomes.
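The deleted/duplicated distinction above can be sketched as a simple comparison against the expected copy number. This is a minimal illustration only: it assumes an expected count of 2 (a diploid autosomal region), and the region names and copy numbers are hypothetical.

```python
def classify_copy_number(observed_copies, expected_copies=2):
    """Label a region by comparing its copy number with the expected count.

    expected_copies=2 is an assumption (diploid autosomal region).
    """
    if observed_copies < expected_copies:
        return "deletion"
    if observed_copies > expected_copies:
        return "duplication"
    return "normal"


# Hypothetical regions and copy numbers, for illustration only.
regions = {"region_A": 1, "region_B": 2, "region_C": 4}
labels = {name: classify_copy_number(n) for name, n in regions.items()}
print(labels)
```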
Main Types of Genetic Variations
A. Single nucleotide mutation
Resulting in single nucleotide polymorphisms (SNPs)
Accounts for up to 90% of human genetic variations
The majority of SNPs do NOT directly or significantly contribute to any phenotype
Variable number tandem repeats (VNTRs) can be found on many chromosomes, and often show variations in length
between individuals.
Each variant acts as an inherited allele, allowing them to be used for personal or
parental identification. Their analysis is useful in genetics and biology
research, forensics, and DNA fingerprinting.
In analysing VNTR data, two basic genetic principles can be used:
Identity Matching: both VNTR alleles from a specific location must match. If two
samples are from the same individual, they must show the same allele pattern.
Inheritance Matching: the VNTR alleles must follow the rules of inheritance. In
matching an individual with their parents or children, a person must have an allele that
matches one from each parent. If the relationship is more distant, such as a grandparent or
sibling, then matches must be consistent with the degree of relatedness.
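The two principles above can be sketched as follows. This is an illustrative sketch, not a forensic method: it assumes each profile is a mapping from a locus name to a pair of alleles given as repeat counts, and the locus names and values are hypothetical. It covers only the parent-child case, not more distant relatives.

```python
def identity_match(sample_a, sample_b):
    """True if both samples show the same allele pair at every shared locus."""
    shared = sample_a.keys() & sample_b.keys()
    return bool(shared) and all(
        sorted(sample_a[locus]) == sorted(sample_b[locus]) for locus in shared
    )


def inheritance_match(child, mother, father):
    """True if, at every locus, the child has one allele from each parent."""
    for locus, (a1, a2) in child.items():
        m, f = mother[locus], father[locus]
        # Either allele may come from the mother, the other from the father.
        if not ((a1 in m and a2 in f) or (a2 in m and a1 in f)):
            return False
    return True


# Hypothetical profiles: locus name -> (allele 1, allele 2) as repeat counts.
child = {"locus1": (14, 18), "locus2": (7, 9)}
mother = {"locus1": (14, 16), "locus2": (9, 11)}
father = {"locus1": (18, 20), "locus2": (7, 7)}
unrelated = {"locus1": (15, 21), "locus2": (8, 12)}

print(inheritance_match(child, mother, father))     # consistent with parentage
print(inheritance_match(child, mother, unrelated))  # excluded at locus1
```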
The Human Gene Mutation Database (HGMD)
The Protein Mutation Database (PMD)
The Protein Mutation Database (PMD) is unique among genetic variation databases
as it contains both natural and artificial mutation data derived from human proteins.
The database gives a detailed description of the functional or structural effects of the
mutations, if known, and provides links to the original publications. Relative
differences in activity and/or stability, in comparison with the wild-type protein,
are also indicated.
PMD contains 119,190 natural and artificial mutations (January 2002), and these can be
searched by keyword or sequence similarity.
EXAMPLE - Hemoglobin
Database of Genomic Variants archive (DGVa)
DGVa is a central archive that receives data from, and distributes data to, a number of
resources:
The DGVa accepts direct submissions from researchers and accession numbers for
data objects included in these are given the prefix 'e'.
The DGVa also exchanges data on a regular basis with dbVar. Data objects
accessioned by dbVar have the prefix 'n'.
You can retrieve DGVa data from the data download page, search the DGVa using
BioMart, and view the data in a genomic context using Ensembl.
The DGVa also supplies data to DGV (Database of Genomic Variants, hosted by The
Centre for Applied Genomics in Canada), where further curation and interpretation are
carried out.
The data stored in the DGVa are organised according to the DGVa's data model and
are centred around three types of object:
the study
the genomic region in which the variation occurs
the particular variation observed in an individual sample (call)
Study
'Study' is the placeholder for all data objects and related information for a genomic
structural variation study. The accession number has the prefix estd (or, if the study has
been curated by dbVar, nstd.) Study-related information includes details about the study
authors and their affiliation, the type of study and the publication that describes the
study.
Region
'Region' denotes the genomic location where structural variation is asserted to exist. The
accession number has the prefix esv (or, if the study has been curated by dbVar, the
prefix is nsv.) Study authors assert the presence of a structural variant region on the basis
of individual variation in samples. Region-related information includes the assertion
method, which describes how the variation in samples has been merged to define the
region (for example, sample calls overlap by at least 80%.)
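An assertion rule such as "sample calls overlap by at least 80%" could be checked as sketched below. The text does not say how the overlap is measured, so this sketch assumes reciprocal overlap (the shared length as a fraction of each call, taking the smaller value) and half-open (start, end) coordinates. The function names and coordinates are hypothetical.

```python
def reciprocal_overlap(call_a, call_b):
    """Smaller of the two overlap fractions between two (start, end) calls.

    Assumes half-open intervals on the same chromosome with end > start.
    """
    start = max(call_a[0], call_b[0])
    end = min(call_a[1], call_b[1])
    shared = max(0, end - start)
    return min(shared / (call_a[1] - call_a[0]), shared / (call_b[1] - call_b[0]))


def same_region(call_a, call_b, threshold=0.8):
    """True if the two calls overlap by at least the threshold fraction."""
    return reciprocal_overlap(call_a, call_b) >= threshold


# Hypothetical sample calls on one chromosome.
print(same_region((1000, 2000), (1100, 2050)))  # overlap fraction 0.9
print(same_region((1000, 2000), (1500, 3000)))  # overlap fraction about 0.33
```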
Call
'Call' describes the individual variation observed in a sample. The accession number has
the prefix essv (or, if the study has been curated by dbVar, the prefix is nssv.) Call-related
information includes the name of the sample, the experimental procedure that generated
the call (e.g. sequencing or array), the type of variation (e.g. deletion or insertion) and
placement (location) in the sample's genome.
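The prefix scheme described above (estd/nstd, esv/nsv, essv/nssv) can be decoded mechanically. The sketch below assumes an accession is one of those prefixes followed by digits; that pattern and the example accessions are illustrative assumptions, not taken from the DGVa documentation.

```python
import re

OBJECT_TYPES = {"std": "study", "sv": "region", "ssv": "call"}
SOURCES = {"e": "DGVa", "n": "dbVar"}

# Assumed shape: source letter, object-type code, then a numeric identifier.
ACCESSION = re.compile(r"^([en])(std|ssv|sv)(\d+)$")


def describe_accession(accession):
    """Return (accessioning archive, object type), or None if not recognised."""
    match = ACCESSION.match(accession)
    if match is None:
        return None
    source, kind, _number = match.groups()
    return SOURCES[source], OBJECT_TYPES[kind]


# Hypothetical accessions, for illustration only.
for acc in ["estd1", "nsv100", "essv2000", "xyz"]:
    print(acc, describe_accession(acc))
```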
Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive
summary of structural variation in the human genome.
EXAMPLE-NM_030798
This gene encodes an RCC1-like G-exchanging factor. It is deleted in Williams
syndrome, a multisystem developmental disorder caused by the deletion of contiguous
genes at 7q11.23.
Online Mendelian Inheritance in Man (OMIM):
A Brief Overview
URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
OMIM is a human genetic disorders database built and curated using results
from published studies.
Each OMIM record provides a summary of the current state of knowledge of the
genetic basis of a disorder, which contains the following information:
OMIM is searchable via NCBI Entrez, and its records are cross-linked to other
NCBI resources.
Online Mendelian Inheritance in Man Stats
OMIM: Allelic Variants
The OMIM database includes genetic disorders caused by various
mutations/variations, from SNPs to large-scale chromosomal abnormalities.
The listed allelic variants are searchable through the "Allelic Variants" field.
Single nucleotide substitutions (SNPs);
small insertions and deletions (INDELs/DIPs);
frame shifts caused by these INDELs.
For most genes, only selected mutations are included.
Criteria for inclusion include: the first mutation to be discovered, high population
frequency, distinctive phenotype, historic significance, unusual mechanism of
mutation, unusual pathogenetic mechanism, and distinctive inheritance.
Some SNPs in the dbSNP records are not linked to the corresponding OMIM
records.
dbVar
dbVar is a database of genomic structural variation that allows you to
search, view, and download variant data from studies submitted for any
organism. In general, variants are ≥ 50 nucleotides, but are occasionally
smaller. dbVar provides access to the raw data (when available) and links
to other NCBI and external resources.
The browser can be sorted by:
- Study accession, Organism, Study Type, Method, Number of Variant
Regions, Number of Variant Calls, or Publication
This is followed by a Detailed Information section where you can download
variant data for the current study (Variant Regions, Variant Calls, Both, or
everything via FTP) or browse details about Variant Summary, Samplesets,
Experimental Details, or Validations in a tabbed format.
GENETIC MARKER AND MICROSATELLITE DATABASES
dbSTS
dbSTS is an NCBI resource that contains sequence data for short genomic landmark
sequences or Sequence Tagged Sites. These STSs can include polymorphic sequences
such as short tandem repeats (STRs) or non-polymorphic sequences.
The dbSTS database maintains complete records for over 133,202 STS markers,
including 18,000 STR markers and gives key information for each record such as
primer sequences, map location and marker aliases.
dbSTS can be searched in several ways. The UniSTS interface allows
direct searches by keyword, the NCBI Map View application allows searching by
genomic location or locus, and dbSTS is also available for sequence searching through
NCBI BLAST.
This array of search options makes the dbSTS database a very reliable source for
retrieval of both genetic and physical STS map markers.
NON-NUCLEAR AND SOMATIC MUTATION DATABASES
The mitochondrial genome consists of a 16,569-bp closed circular molecule in
the mitochondrion. Each of the several thousand mtDNAs per cell contains a
control region encompassing a replication origin and the promoters, and encodes a large (16 S)
and small (12 S) rRNA, 22 tRNAs, and 13 polypeptides.
Maternally inherited mtDNA has a very high mutation rate: mtDNA mutates
10–20 times faster than nuclear DNA as a result of inadequate proofreading by
mitochondrial DNA polymerases and limited mtDNA repair capability.
Somatic Mutations
A completely distinct category of human mutations arises somatically during
the process of tumourigenesis.
These mutations may take many forms; the most commonly characterized are
somatic point mutations identified during the screening of candidate genes in
tumour tissues.
COSMIC
EXAMPLE – T315I mutation