BT504 Current Papers 2022


• BT504 Genomics and Proteomics

• Final Term Current Papers 2022


Two positions independent protocols (2)
What is ORF? (2) *Mostly repeated*
An ORF (open reading frame) is the part of a gene that has the potential
to code for protein. It is read in triplets called codons, beginning with
a translation start codon (ATG) and ending with one of three stop codons
(TAA, TAG, TGA).
An ORF is usually predicted from the DNA sequence and is not proven to
be transcribed.
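The definition above can be sketched in a few lines of Python. This is only an illustrative scan of the forward strand in the three reading frames; the function name `find_orfs` and the `min_codons` cut-off are assumptions made for this example, not part of any standard tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop ORF on the forward strand.

    Positions are 0-based and `end` is exclusive. Only the three forward
    reading frames are scanned (a simplification for illustration).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j + 3 - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return orfs

# ATG at position 2, in-frame stop TAG at positions 8-10.
print(find_orfs("CCATGAAATAGGG"))  # [(2, 11, 'ATGAAATAG')]
```

An ATG with no in-frame stop codon is not reported, which matches the definition of an ORF as running from a start codon to a stop codon.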
*Design of siRNA (2 or 3)
• 21-23 nt dsRNA
• GC% slightly < 50%
• Perfectly complementary to the target mRNA
• Targeting the 3’UTR works better than the 5’UTR
The double-stranded RNA is processed by an RNase III family enzyme,
Dicer, resulting in the generation of an siRNA, a 21–23-nucleotide (nt)
RNA duplex composed of a 19-mer sequence with symmetric 2–3-nt 3′
overhangs. The siRNA associates with cellular proteins to form the
RNA-induced silencing complex (RISC).
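As a rough illustration, the first three design rules above can be checked in Python. This is a minimal sketch: the function names are made up for this example, and the 30-50% GC window is an assumed reading of "slightly below 50%", not a value from the notes.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (A-U, G-C pairing)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def passes_basic_sirna_rules(guide, target_mrna, gc_min=30.0, gc_max=50.0):
    """Check length (21-23 nt), GC% (assumed window) and perfect complementarity."""
    length_ok = 21 <= len(guide) <= 23
    gc_ok = gc_min <= gc_percent(guide) < gc_max
    # Perfect complementarity: the guide's reverse complement occurs in the mRNA.
    complementary = reverse_complement_rna(guide) in target_mrna.upper()
    return length_ok and gc_ok and complementary

# Hypothetical 21-nt target site embedded in a toy mRNA.
site = "AUGCAUUAGCUAAGCUUAGGC"
mrna = "GG" + site + "AA"
guide = reverse_complement_rna(site)
print(passes_basic_sirna_rules(guide, mrna))  # True
```

The 3’UTR preference is not checked here because it needs annotation of the target transcript, which this toy example does not have.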
Why are pumps used in HPLC? (2 or 3) *Mostly repeated*
High-performance liquid chromatography (HPLC; formerly referred to
as high-pressure liquid chromatography) is a technique in analytical
chemistry used to separate, identify, and quantify each component in a
mixture.
• HPLC is a chromatographic technique that can separate a mixture of
compounds. It is used in biochemistry and analytical chemistry to
identify, quantify and purify the individual components of a mixture.
• Although HPLC of intact proteins has not become a widely used
technique for analytical proteomics, it is nevertheless highly applicable
as an initial step to fractionate protein mixtures.
• HPLC would appear to be about as useful as preparative IEF for
resolving protein mixtures into fractions.
• HPLC relies on pumps to pass a pressurized liquid solvent containing
the sample mixture through a column filled with a solid adsorbent
material.
• Each component in the sample interacts slightly differently with the
adsorbent material, causing different flow rates for the different
components and leading to the separation of the components as they
flow out of the column.
*Explain proteins as modular structures.
Segments of amino acid sequences can be considered as functional
building blocks or modules. The modular units in proteins that confer
specific properties and functions are referred to as “motifs” or
“domains”. Motifs and domains are recognizable sequences that confer
similar properties or functions when they occur in a variety of proteins.
In some cases, amino acid sequences within motifs and domains are
highly conserved and do not vary from protein to protein. In other cases,
some key amino acids occur in a reproducible relationship to each other
in a sequence, even though various substitutions in other amino acids
occur.
Longer amino acid sequences often form domains, which confer specific
properties or functions on a protein. Some domain structures refer
simply to sequences that confer a bulk physical property to a segment of
the polypeptide, such as transmembrane domains, which simply form
helices that span a lipid bilayer membrane. Other domain structures
provide hydrogen bonding or other contacts for key enzyme substrates
or prosthetic groups. In many cases, domains are made up of
combinations of units of secondary structure, such as helix-loop-helix
domains.
*Name three types of enzymes (3)
There are three types of enzymes:
• Metabolic enzymes
• Digestive enzymes
• Food enzymes
Three types of genome assembly algorithm (3)

Four factors of genome quality assurance (5)

What is gene ontology (3)


GO (Gene Ontology) terms are implemented as Directed Acyclic Graphs
(DAGs).
Three main types:
• Biological Process (BP)
• Cellular Component (CC)
• Molecular Function (MF)
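To see what a directed acyclic graph means in practice, here is a small Python sketch. The term names and parent links below form a toy graph made up for illustration (not real GO data); the point is that a term may have more than one parent, which a simple tree cannot represent.

```python
def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Toy DAG: each term maps to its parent terms (hypothetical example data).
toy_parents = {
    "biological_process": [],
    "metabolic process": ["biological_process"],
    "cellular process": ["biological_process"],
    "cellular metabolic process": ["metabolic process", "cellular process"],
}

print(sorted(ancestors("cellular metabolic process", toy_parents)))
# ['biological_process', 'cellular process', 'metabolic process']
```

A gene annotated with a specific term is implicitly annotated with all of that term's ancestors, which is why this upward walk is the basic operation on the graph.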
Five advantages of single molecule sequencing (5)
Advantages of single molecule sequencing
• Less sample preparation (no PCR)
• No amplification
• No PCR errors
• Fewer contamination issues
• No GC bias
• Analyze every sample (un-PCR-able / unclonable)
• Analyze low-quality DNA (museum, archeological, forensic samples)
• Absolute quantification
• Sequence RNA directly

Dicer and Drosha 3


Dicer, also known as helicase with RNase motif, is an enzyme of the
RNase III family. In humans, it is encoded by the DICER1 gene.
• It is able to digest dsRNA into uniformly sized small RNAs (siRNA)
• Dicer family proteins are ATP-dependent nucleases.
• RNase III enzyme acts as a dimer.
Drosha is a class 2 ribonuclease III enzyme that in humans is encoded
by the DROSHA (formerly RNASEN) gene. The RNase III Drosha is
the core nuclease that executes the initiation step of microRNA (miRNA)
processing in the nucleus.
• Dicer is an enzyme that is part of the RNase III family
• Drosha belongs to class 2 ribonuclease III enzyme family
Genscan 5 *
Secondary structure of proteins 2
• Secondary structure of proteins refers to the local conformation of
some part of a polypeptide.
• A few types of secondary structure are particularly stable and occur
widely in proteins.
• The most prominent are:
– α-helix
– β-conformations
Limitations of greedy algorithm 3
Greedy assemblers can pick up false overlaps, including high-scoring
ones that result from repetitive sequences. Graph traversal using a
greedy approach may cause the algorithm to become stuck in a local
maximum, which produces a suboptimal solution to the assembly problem.

What is metagenomics?

Briefly explain complex genome.


Genomic complexity: Britten and Davidson defined the relative amounts of
repeated and unique (or single-copy) DNA sequences in an organism's
genome as its genomic complexity. Thus, prokaryotic genomes have a lower
genomic complexity than eukaryotic genomes.
What is the significance of Genie? 5
• Extra predicted exons can be eliminated based on evidence from
homology searches.
• Likelihood scores are provided for each predicted exon.
• The model includes information about:
• Average length of exons and introns.
• Compositional information about exons and introns.
• A neural-net derived model of splice junctions and consensus
sequences around splice junctions.
• Splice junction information can be further improved by including
results of homology searches.
• Genie is approximately 60–75% accurate on eukaryotic genomes.

Degradation of proteins 10

• Protein modifications appear to be critical to initiating processes that
ultimately degrade proteins.
• Phosphorylation of some proteins is rapidly followed by conjugation
with ubiquitin, which leads to degradation by the 26S proteasomal
complex.
• There evidently are other stimuli for protein ubiquitination and
turnover, including oxidative damage and other protein modifications.
• Proteins also undergo degradation by lysosomal enzymes.
• Any protein may be present in many forms at any one time in a cell.
• Collectively, the proteome of a cell comprises all of these many forms
of all expressed proteins. This certainly makes the proteome
bewilderingly complex.
2D SDS-PAGE 5
In the past, 2D SDS-PAGE was difficult to use because of:
• The relative technical difficulty of performing the IEF step.
• The technical challenge of setting up the delicate tube gel containing
the focused proteins so that the proteins transfer efficiently into the
SDS-PAGE slab gel.
***Mechanism of SMRT 5
SMRT sequencing
• Single Molecule Real Time sequencing.
• Utilizes the zero-mode waveguide (ZMW), developed in the
laboratories of Harold G. Craighead and Watt W. Webb at Cornell
University.
• A ZMW guides light energy into a volume that is small in all dimensions
compared to the wavelength of the light.
• It is a nanophotonic structure: a circular hole in an aluminum film on
a silica substrate.
• With an active polymerase immobilized at the bottom of each ZMW,
nucleotides diffuse into the ZMW chamber.
• A, C, G and T are each labeled with a different fluorescent dye having
a distinct emission spectrum.
• Nucleotides held by the polymerase prior to incorporation emit an
extended signal that identifies the base being incorporated.
Describe eukaryotic genome 2
We have to take into account the exon-intron structure of genes in most
known eukaryotic genomes
– 99% of yeast genes are intronless
• Non-consensus splice sites
– other than GT-----AG
• Where is the true first 5' exon?
• cDNA data is incomplete and confusing
• Alternative splicing
• TIC gene
• Alternative Promoters
Gene prediction 3
• A gene is a region of the genome that codes for a functional component
such as an RNA or protein.
• Locating genes within genomic sequence.
• Defining initiation and termination sites of genes.
• Extracting the coding region of each gene.
• Identifying the function of each coding region.
pH gradient in IEF 3
• The isoelectric point is the pH at which the net charge of the protein
molecule is neutral.
• Different proteins have different isoelectric points.
• Isoelectric point is found by drawing the sample through a stable pH
gradient.
• The range of the gradient determines the resolution of the separation.
• This technique is analogous to the first step in 2D SDS-PAGE.
• In IEF, generation of a pH gradient is achieved with soluble
ampholytes, which are polycarboxylic acid compounds that generate a
stable pH gradient when voltage is applied across the focusing cell.
• The protein sample then is added, voltage again is applied, and the
proteins then are separated by isoelectric point.
• In commercially available apparatus, such as the BioRad Rotofor™
cell, the focusing cell is divided by permeable membranes into a series
of chambers.
• After the focusing step, the chambers are quickly and simultaneously
emptied by a vacuum sipper that draws the contents of each section of
the cell into a separate tube.
• With this type of apparatus, the entire protein mixture is separated into
12–20 fractions.
10 types of RNA 3
• Messenger RNA (mRNA)
• Transfer RNA (tRNA)
• Ribosomal RNA (rRNA)
• Micro RNA (miRNA)
• Small nuclear RNA (snRNA)
• Regulatory RNAs.
• Transfer-messenger RNA (tmRNA)
• Ribozymes (RNA enzymes)
• Double-stranded RNA (dsRNA)
• Telomerase RNA
What kind of modifications are needed for degradation of proteins?
Besides single modifications, proteins are often modified through a
combination of post-translational cleavage and the addition of
functional groups through a step-wise mechanism of protein
maturation or activation. Protein PTMs can also be reversible
depending on the nature of the modification.
Define coverage.
Coverage (or depth) in DNA sequencing is the number of unique reads
that include a given nucleotide in the reconstructed sequence. Deep
sequencing refers to the general concept of aiming for a high number of
unique reads of each region of a sequence.
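The definition of coverage can be made concrete with a short Python sketch. It assumes, for illustration only, that each read is given as a half-open (start, end) interval on the reconstructed sequence and that two reads with identical intervals are duplicates; the function name is made up for this example.

```python
def per_base_coverage(reads, reference_length):
    """Count, for each position, how many unique reads include it.

    `reads` is an iterable of (start, end) intervals, 0-based, end exclusive.
    Identical intervals are counted once (a simplification of "unique reads").
    """
    depth = [0] * reference_length
    for start, end in set(reads):
        for pos in range(max(start, 0), min(end, reference_length)):
            depth[pos] += 1
    return depth

reads = [(0, 4), (2, 6), (2, 6), (5, 8)]  # the duplicate (2, 6) is counted once
depth = per_base_coverage(reads, 8)
print(depth)                    # [1, 1, 2, 2, 1, 2, 1, 1]
print(sum(depth) / len(depth))  # 1.375 (mean coverage)
```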
Give any three differences between siRNA and miRNA.
Small interfering RNAs (siRNAs), sometimes known as short interfering
RNAs, are a class of 20-25 nucleotide-long RNA molecules that interfere
with the expression of genes.
• They are naturally produced as part of the RNA interference (RNAi)
pathway by the enzyme Dicer.
• They can be exogenously introduced by investigators to bring about
knockdown of a particular gene.
• siRNAs have a well-defined structure.
• A short (usually 21-nt) double strand of RNA (dsRNA) with 2-nt
overhangs on either end, including a 5' phosphate group and a 3'
hydroxy (-OH) group.
MicroRNAs (miRNAs) are a class of non-coding RNAs that play
important roles in regulating gene expression. The majority of miRNAs
are transcribed from DNA sequences into primary miRNAs and
processed into precursor miRNAs, and finally mature miRNAs.

Write the names of the major types of symmetry in proteins.


• Rotational symmetry
• Helical symmetry
• Cyclic symmetry
• Dihedral symmetry
• Icosahedral symmetry
What is exportin?
A family of proteins (in the karyopherin superfamily) that are involved in
regulating the export of proteins from the nucleus to the cytoplasm, the
reverse of the task carried out by importins.
Properties of siRNA?
Small interfering RNAs (siRNAs) are double stranded RNA molecules
designed to perfectly match the sequence of a target gene and silence its
expression. The function is exerted through the RNA interference (RNAi)
pathway and has revolutionised biological research due to its ease-of-
use and high potency.
Advantages of ALLPATHS-LG over traditional assembler?
Advantages
• Relatively fast runtime
• Can use long reads only for small genomes
Post translational modifications?
Post-translational modification (PTM) refers to the covalent and
generally enzymatic modification of proteins following protein
biosynthesis. Proteins are synthesized by ribosomes translating mRNA
into polypeptide chains, which may then undergo PTM to form the
mature protein product.
What is needed for protein modifications?
Post-translational modification (PTM) refers to the covalent and
generally enzymatic modification of proteins following protein
biosynthesis. Proteins are synthesized by ribosomes translating mRNA
into polypeptide chains, which may then undergo PTM to form the
mature protein product.
Modifications that occur early in the life of the protein:
• Carboxylation of glutamate residues
• Removal of the N-terminal methionine
• Glycosylation
• Addition of Prosthetic groups
• Formation of multisubunit complexes
• Prenylation of cysteine residues assists anchoring of proteins in or on
membranes.
These more or less “permanent” modifications and transport ultimately
result in the delivery of functional proteins to specific locations in cells.
The activities of many proteins are then controlled by posttranslational
modifications.
• The most prominent and best-understood of these is phosphorylation of
serine, threonine, or tyrosine residues.
• Phosphorylation may activate or inactivate enzymes, alter
protein-protein interactions and associations, change protein
structures, and target proteins for degradation.
• Protein phosphorylation regulates protein function in diverse contexts
and appears to be a key switch for rapid on-off control of signaling
cascades, cell-cycle control, and other key cellular functions.
Eulerian path and Eulerian circuit?
Eulerian Path
• An Eulerian path is a path that visits every edge of the graph once and
only once.
• It can end on a vertex different from the one on which it began.
Eulerian Circuit
• An Eulerian circuit is an Eulerian path that begins and ends on the
same vertex.
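For an undirected, connected graph there is a well-known degree test for these two definitions: an Eulerian circuit exists when every vertex has even degree, and an Eulerian path (with different start and end) exists when exactly two vertices have odd degree. The Python sketch below applies that test; the function name and return labels are chosen for this example, and assembly graphs such as de Bruijn graphs are directed, so they would need the in-degree/out-degree version instead.

```python
from collections import defaultdict

def classify_eulerian(edges):
    """Return 'circuit', 'path' or 'none' for an undirected graph given as edges."""
    degree = defaultdict(int)
    adjacency = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacency[u].add(v)
        adjacency[v].add(u)
    if not degree:
        return "none"
    # All vertices that carry edges must be in one connected component.
    start = next(iter(degree))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    if len(seen) != len(degree):
        return "none"
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    if odd == 0:
        return "circuit"
    if odd == 2:
        return "path"
    return "none"

print(classify_eulerian([("a", "b"), ("b", "c"), ("c", "a")]))  # circuit
print(classify_eulerian([("a", "b"), ("b", "c")]))              # path
```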
Types of plasma protein
• Albumin
Made mainly in liver.
Helps to keep the blood from leaking out of blood vessels.
Helps to carry medicines and other substances. Important for tissue
growth and healing.
• Globulin
Made up of different proteins i.e. alpha, beta and gamma types.
Have a role in immunity. Determines chances of developing an infection.
Working of greedy graph algorithm.
• Greedy graph algorithms represent the simplest, most intuitive
solution to the assembly problem.
Steps
Greedy graph algorithms work as follows:
1. Compare all reads or contigs in a pairwise fashion to identify
overlapping sequences.
2. Merge the sequences that overlap each other the best.
3. Repeat step 2 until no more sequences can be merged, or the remaining
overlaps conflict with existing contigs.
• The best-overlapping fragments are the ones with the highest score.
• The scoring function measures the number of matching bases in the
overlap.
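The greedy steps described above can be sketched in Python. This is a toy version under stated assumptions: overlaps must match exactly (real assemblers tolerate sequencing errors), the score is simply the overlap length, and the names `overlap_length`, `greedy_assemble` and `min_overlap` are made up for this example.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the best-scoring overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        # Pairwise comparison of all current reads/contigs.
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))  # ['ATGGCGTACGTTA']
```

Because the best overlap is always taken first, a repeat that produces a long but wrong overlap would be merged without any check, which is the limitation noted elsewhere in these notes.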
Output of Genscan
Eukaryotes – Genscan Output
• Gene structure
• Promoter site
• Translation initiation exon
• Internal exons
• Terminal exon (translation termination)
• Poly-adenylation site
• Genscan is 80% accurate on human sequences.
How are genes predicted? Write two ways.
Predictions are derived from different computational methods.
Two famous approaches:
• “Ab initio” gene finding
• Comparative approach
Limitations of greedy graph algorithm?
Greedy assemblers can pick up false overlaps, including high-scoring
ones that result from repetitive sequences. Graph traversal using a
greedy approach may cause the algorithm to become stuck in a local
maximum, which produces a suboptimal solution to the assembly problem.
How do molecular interactions characterize tertiary structure?
The tertiary structure of a protein refers to the overall three-dimensional
arrangement of its polypeptide chain in space. It is generally stabilized
by hydrogen bonds and ionic interactions among polar, hydrophilic
residues on the outside, and by hydrophobic interactions between
nonpolar amino acid side chains on the inside.
How do two loci show 1% recombination in a genetic map?
Two strategies used in human gene therapy?
Gene-transfer approaches, in which a wild-type copy of the mutated
gene is delivered. RNA modification therapy, in which the mRNA
encoded by a mutant gene is targeted. Stem cell therapy, in which
human stem cells are used to repair disease-damaged tissue.
Protein structure determination techniques?
• X-ray crystallography: reliable but slow; not all proteins crystallize.
• Computer structure prediction programs: not reliable for all proteins.

Pharmaceutical importance of proteins?


• Proteins as pharmaceuticals
• Proteins applications
• Whey proteins health effects
• Iron chelate Protein
• Zinc chelate Protein
• Tumor markers
What are two different types of sequencing?
Whole exome sequencing
Whole genome sequencing
These are increasingly used in healthcare and research to identify
genetic variations; both methods rely on new technologies that allow
rapid sequencing of large amounts of DNA. These approaches are
known as next-generation sequencing (or next-gen sequencing).
2. What are introns?
• They are non-coding sections of a gene.
• They are transcribed into the precursor mRNA sequence but are
ultimately removed by RNA splicing.
• Most of the introns appear to be mobile genetic elements.
3. Name the DNA building blocks of gene-dense and gene-poor areas.
Gene-dense:
Human genome's gene-dense areas are predominantly composed of the
DNA building blocks G and C.
Gene poor:
Gene poor areas are rich in the DNA building blocks A and T.
4. Write three major physical mapping techniques.
A physical map of a chromosome or genome shows the physical
locations of genes and other DNA sequences.
The techniques are as follows:
o Cytogenetic map
o Restriction map
o Fluorescent in situ hybridization (FISH)
o Sequence tagged site (STS) map
o Nucleotide sequence map
5. Describe the D loop of human mitochondria.
The D loop is the site where most replication and transcription are
controlled.
Genes are tightly packed, with almost no non-coding DNA outside the D
loop.
Human mitochondrial genes contain no introns, although introns
are found in the mitochondria of other groups such as plants.
What are histones?
Histones are the family of basic proteins that associate with DNA in the
nucleus and help condense it into chromatin.
Nuclear DNA does not appear in free linear strands; it is highly
condensed and wrapped around histones in order to fit inside the
nucleus and take part in the formation of chromosomes.
What is viral genome and write its types. 5
A viral genome is the genetic material of a virus. It is also known as the
viral chromosome.
Viral genomes vary in size from a few thousand to more than a hundred
thousand nucleotides.
Types of viral genome:
A viral genome can be
▪ ssRNA
▪ ssDNA
▪ dsRNA
▪ dsDNA
▪ Linear
▪ Circular
Describe a brief procedure to create a labeled cDNA in microarray.

• Chips are prepared using cDNA; these are called cDNA chips or cDNA
microarrays.
• The cDNAs are amplified and then immobilized on a
solid support made up of a nylon filter or glass slide.
• The probe DNA is loaded by capillary action.
• A small volume of this DNA is spotted on the solid surface.
• DNA is delivered mechanically or in a robotic manner.
• When one DNA spotting is done, the pin is washed and
loaded with fresh DNA to start the second cycle.
What are operons?

These are contiguous genes, transcribed as a single polycistronic
mRNA, that encode proteins with related functions.

Describe logarithm of odds.

• Usually, the lod (logarithm of odds) score method is used for
statistical analysis of pedigree data.
• A lod score compares the expected distributions of traits if they are
linked or not linked.
• The lod score is the log10 of the ratio of the two probabilities. The
higher the lod score, the stronger the evidence that the two genes are
linked.
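A small worked example may help. The Python sketch below uses the common textbook form of the lod score for phase-known meioses: the likelihood of the data at a recombination fraction theta, divided by the likelihood under no linkage (theta = 0.5), then log10. The function name and the example counts are assumptions made for illustration.

```python
import math

def lod_score(recombinants, non_recombinants, theta):
    """log10 of (likelihood if linked at theta) / (likelihood if unlinked).

    Assumes phase-known meioses and 0 < theta <= 0.5.
    """
    n = recombinants + non_recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** n
    return math.log10(likelihood_linked / likelihood_unlinked)

# Hypothetical pedigree data: 1 recombinant among 10 meioses, tested at theta = 0.1.
print(round(lod_score(1, 9, 0.1), 2))  # 1.6
```

At theta = 0.5 the two likelihoods are equal and the score is 0, meaning the data give no evidence either way.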
Difference between Prokaryotic and Eukaryotic gene Expression. 3

• In prokaryotes, mRNA is not modified.
• The existence of introns in prokaryotes is extremely rare.
• In prokaryotes, the newly synthesized mRNA is polycistronic (codes
for more than one polypeptide chain).
• In prokaryotes, transcription of a gene and translation of the
resulting mRNA occur simultaneously.
• Many polysomes are found associated with an active gene.
• Prokaryotes: operons, simultaneous transcription and translation.
• Eukaryotes: no operons, RNA processing, chromatin remodeling.
Homologues
• Homologues are components or characters (such as
genes/proteins with similar sequences) that can be attributed to a
common ancestor of the two organisms during evolution.
• Types discussed alongside homologues:
– Orthologues
– Paralogues
– Xenologues
– Analogues

Orthologues
• Orthologues are homologues that have evolved from a common
ancestral gene by speciation.
• They usually have similar functions.

Paralogues
• Paralogues are homologues that are produced by duplication within
a genome followed by subsequent divergence.
• They often have different functions.

Xenologues
• Xenologues are homologues that are related by an interspecies
(horizontal) transfer of the genetic material for one of the
homologues.
• The functions of xenologues are quite often similar.

Analogues
• Analogues are non-homologous genes/proteins that have
arisen convergently from unrelated ancestors.
• Similar function, different sequence or structure.
