Two positions independent protocols (2)

What is ORF? (2) *Mostly repeated*
An ORF (Open Reading Frame) is the part of a gene that has the potential to code for protein. It is read in triplets called codons, beginning with a translation start codon (ATG) and ending with one of three stop codons (TAA, TAG, TGA). An ORF is usually predicted from the DNA sequence and is not proven to be transcribed.

Design of siRNA (2 or 3)
• 21-23 nt dsRNA
• GC% slightly < 50%
• Perfect complementarity to the target mRNA
• Targeting the 3' UTR works better than targeting the 5' UTR
Double-stranded RNA is processed by Dicer, an RNase III family enzyme, to generate an siRNA: a 21-23-nucleotide (nt) RNA duplex composed of a 19-mer sequence with symmetric 2-3-nt 3' overhangs. The siRNA associates with cellular proteins to form the RNA-induced silencing complex (RISC).

Why pumps are used in HPLC (2 or 3) *Mostly repeated*
• High-performance liquid chromatography (HPLC; formerly referred to as high-pressure liquid chromatography) is a chromatographic technique used in biochemistry and analytical chemistry to separate, identify, quantify and purify the individual components of a mixture.
• Although HPLC of intact proteins has not become a widely used technique for analytical proteomics, it is nevertheless highly applicable as an initial step to fractionate protein mixtures.
• HPLC would appear to be about as useful as preparative IEF for resolving protein mixtures into fractions.
• HPLC relies on pumps to pass a pressurized liquid solvent containing the sample mixture through a column filled with a solid adsorbent material.
• Each component in the sample interacts slightly differently with the adsorbent material, causing different flow rates for the different components and leading to the separation of the components as they flow out of the column.

Explain the protein as a modular structure.
Segments of amino acid sequences can be considered as functional building blocks or modules. The modular units in proteins that confer specific properties and functions are referred to as "motifs" or "domains". Motifs and domains are recognizable sequences that confer similar properties or functions when they occur in a variety of proteins. In some cases, amino acid sequences within motifs and domains are highly conserved and do not vary from protein to protein. In other cases, some key amino acids occur in a reproducible relationship to each other in a sequence, even though various substitutions occur in other amino acids. Longer amino acid sequences often form domains, which confer specific properties or functions on a protein. Some domain structures refer simply to sequences that confer a bulk physical property to a segment of the polypeptide, such as transmembrane domains, which simply form helices that span a lipid bilayer membrane. Other domain structures provide hydrogen bonding or other contacts for key enzyme substrates or prosthetic groups. In many cases, domains are made up of combinations of units of secondary structure, such as helix-loop-helix domains.

Name three types of enzymes (3)
• Metabolic enzymes
• Digestive enzymes
• Food enzymes

Three types of genome assembly algorithm (3)
Four factors of genome quality assurance (5)
What is gene ontology? (3)
GO (Gene Ontology) terms are implemented as Directed Acyclic Graphs (DAGs). There are three main types:
• Biological Process (BP)
• Cellular Component (CC)
• Molecular Function (MF)

Five advantages of single molecule sequencing (5)
• Less sample preparation (no PCR)
• No amplification
• No PCR errors
• Fewer contamination issues
• No GC bias
• Analyze every sample (unPCRable / unclonable)
• Analyze low-quality DNA (museum, archaeological, forensic samples)
• Absolute quantification
• Sequence RNA directly
Dicer and Drosha (3)
Dicer, also known as helicase with RNase motif, is an enzyme of the RNase III family. In humans, it is encoded by the DICER1 gene.
• It digests dsRNA into uniformly sized small RNAs (siRNA).
• Dicer family proteins are ATP-dependent nucleases.
• The RNase III enzyme acts as a dimer.
Drosha is a class 2 ribonuclease III enzyme that in humans is encoded by the DROSHA (formerly RNASEN) gene. The RNase III Drosha is the core nuclease that executes the initiation step of microRNA (miRNA) processing in the nucleus.
In summary:
• Dicer is an enzyme of the RNase III family.
• Drosha belongs to the class 2 ribonuclease III enzyme family.

Genscan (5)

Secondary structure of proteins (2)
• The secondary structure of a protein refers to the local conformation of some part of a polypeptide. A few types of secondary structure are particularly stable and occur widely in proteins.
• The most prominent are:
  - α-helix
  - β-conformations

Limitations of greedy algorithm (3)
Greedy assemblers can detect false overlaps, including high-scoring ones that result from repetitive sequences. Graph traversal using a greedy approach may cause the algorithm to become stuck in local maxima, which produces a suboptimal solution for the assembly problem.
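The greedy strategy and its weakness can be made concrete with a small sketch. This is a toy illustration rather than any real assembler: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all assumptions made for this note, and the score is simply the number of matching bases in an exact suffix-prefix overlap.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that exactly matches a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the highest overlap score."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        # Pairwise comparison of all current sequences (reads or contigs).
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: stop merging
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


print(greedy_assemble(["ATGGCG", "GCGTAC", "TACGGA"]))
```

Because each merge is chosen only by the current best score and is never undone, a high-scoring overlap between reads that come from two different copies of a repeat would be merged just as readily as a true overlap. That is the local-maximum problem described above.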
What is metagenomics?
Briefly explain complex genome.
Genomic complexity: Britten and Davidson defined the relative amounts of repeated and unique (or single-copy) DNA sequences in an organism's genome as its genomic complexity. Thus, prokaryotic genomes have a lower genomic complexity than eukaryotic genomes.

What is the significance of Genie? (5)
• Extra predicted exons can be eliminated based on evidence from homology searches.
• Likelihood scores are provided for each predicted exon.
• The model includes information about:
  - Average length of exons and introns.
  - Compositional information about exons and introns.
  - A neural-net derived model of splice junctions and consensus sequences around splice junctions.
• Splice junction information can be further improved by including results of homology searches.
• Genie is approximately 60-75% accurate on eukaryotic genomes.
Degradation of proteins (10)
• Protein modifications appear to be critical to initiating processes that ultimately degrade proteins.
• Phosphorylation of some proteins is rapidly followed by conjugation with ubiquitin, which leads to degradation by the 26S proteasomal complex.
• There evidently are other stimuli for protein ubiquitination and turnover, including oxidative damage and other protein modifications.
• Proteins also undergo degradation by lysosomal enzymes.
• Any protein may be present in many forms at any one time in a cell.
• Collectively, the proteome of a cell comprises all of these many forms of all expressed proteins. This certainly makes the proteome bewilderingly complex.

2D SDS-PAGE (5)
In the past, 2D SDS-PAGE was difficult to use because of:
• The relative technical difficulty of performing the IEF step.
• The technical challenge of setting up the delicate tube gel containing the focused proteins so that the proteins transfer efficiently into the SDS-PAGE slab gel.

Mechanism of SMRT (5)
• SMRT stands for Single Molecule Real Time sequencing.
• It utilizes the zero-mode waveguide (ZMW), developed in the laboratories of Harold G. Craighead and Watt W. Webb at Cornell University.
• The ZMW guides light energy into a volume that is small in all dimensions compared to the wavelength of the light.
• The ZMW is a nanophotonic structure: a circular hole in an aluminum film on a silica substrate.
• With an active polymerase immobilized at the bottom of each ZMW, nucleotides diffuse into the ZMW chamber.
• A, C, G and T are each labeled with a different fluorescent dye having a distinct emission spectrum.
• Nucleotides held by the polymerase prior to incorporation emit an extended signal that identifies the base being incorporated.

Describe eukaryotic genome (2)
• We have to take into account the exon-intron structure of genes in most known eukaryotic genomes
  - 99% of yeast genes are intronless
• Non-consensus splice sites (other than GT-----AG)
• Where is the true first 5' exon?
• cDNA data is incomplete and confusing
• Alternative splicing
• TIC gene
• Alternative promoters

Gene prediction (3)
A gene is a region of the genome that codes for a functional component such as an RNA or protein. Gene prediction involves:
• Locating genes within genomic sequence.
• Defining the initiation and termination sites of genes.
• Extracting the coding region of each gene.
• Identifying the function of each coding region.

pH gradient in IEF (3)
• The isoelectric point is the pH at which the net charge of the protein molecule is neutral.
• Different proteins have different isoelectric points.
• The isoelectric point is found by drawing the sample through a stable pH gradient.
• The range of the gradient determines the resolution of the separation. This technique is analogous to the first step in 2D SDS-PAGE.
• In IEF, the pH gradient is generated with soluble ampholytes, which are polycarboxylic acid compounds that form a stable pH gradient when voltage is applied across the focusing cell.
• The protein sample is then added, voltage is applied again, and the proteins are separated by isoelectric point.
• In commercially available apparatus, such as the BioRad Rotofor™ cell, the focusing cell is divided by permeable membranes into a series of chambers.
• After the focusing step, the chambers are quickly and simultaneously emptied by a vacuum sipper that draws the contents of each section of the cell into a separate tube.
• With this type of apparatus, the entire protein mixture is separated into 12-20 fractions.

10 types of RNA (3)
• Messenger RNA (mRNA)
• Transfer RNA (tRNA)
• Ribosomal RNA (rRNA)
• Micro RNA (miRNA)
• Small nuclear RNA (snRNA)
• Regulatory RNAs
• Transfer-messenger RNA (tmRNA)
• Ribozymes (RNA enzymes)
• Double-stranded RNA (dsRNA)
• Telomerase RNA

What kind of modifications are needed for degradation of proteins?
Besides single modifications, proteins are often modified through a combination of post-translational cleavage and the addition of functional groups through a step-wise mechanism of protein maturation or activation. Protein PTMs can also be reversible, depending on the nature of the modification.

Define coverage.
Coverage (or depth) in DNA sequencing is the number of unique reads that include a given nucleotide in the reconstructed sequence. Deep sequencing refers to the general concept of aiming for a high number of unique reads of each region of a sequence.

Give any three differences between siRNA and miRNA.
Small interfering RNAs (siRNAs), sometimes known as short interfering RNAs, are a class of 20-25 nucleotide-long RNA molecules that interfere with the expression of genes.
• They are naturally produced as part of the RNA interference (RNAi) pathway by the enzyme Dicer.
• They can also be introduced exogenously by investigators to bring about knockdown of a particular gene.
• siRNAs have a well-defined structure: a short (usually 21-nt) double strand of RNA (dsRNA) with 2-nt overhangs on either end, including a 5' phosphate group and a 3' hydroxyl (-OH) group.
MicroRNAs (miRNAs) are a class of non-coding RNAs that play important roles in regulating gene expression. The majority of miRNAs are transcribed from DNA sequences into primary miRNAs, processed into precursor miRNAs, and finally into mature miRNAs.
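Returning to the coverage definition given earlier in this block, a short sketch may help make "number of reads that include a given nucleotide" concrete. The function names `per_base_depth` and `mean_coverage`, and the representation of each read as a hypothetical `(start, length)` pair aligned to a reference, are assumptions made for illustration only.

```python
def per_base_depth(genome_length, reads):
    """Count, for every position, how many reads include that nucleotide.

    `reads` is an iterable of (start, length) pairs with 0-based starts;
    this representation is an assumption made for the sketch.
    """
    depth = [0] * genome_length
    for start, length in reads:
        # Clip each read to the bounds of the reference sequence.
        for pos in range(max(start, 0), min(start + length, genome_length)):
            depth[pos] += 1
    return depth


def mean_coverage(genome_length, reads):
    """Average depth over all positions of the reference."""
    return sum(per_base_depth(genome_length, reads)) / genome_length


example_reads = [(0, 4), (2, 4), (6, 4)]
print(per_base_depth(10, example_reads))
print(mean_coverage(10, example_reads))
```

When every read lies fully inside the reference, the mean equals (number of reads × read length) / genome length, which is the usual back-of-the-envelope estimate of sequencing depth.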
Write the names of the major types of symmetry in proteins.
• Rotational symmetry
• Helical symmetry
• Cyclic symmetry
• Dihedral symmetry
• Icosahedral symmetry

What is exportin?
Exportins are a family of proteins (in the karyopherin superfamily) that are involved in regulating the export of proteins from the nucleus to the cytoplasm, the reverse of the task carried out by importins.

Properties of siRNA?
Small interfering RNAs (siRNAs) are double-stranded RNA molecules designed to perfectly match the sequence of a target gene and silence its expression. The function is exerted through the RNA interference (RNAi) pathway and has revolutionised biological research due to its ease of use and high potency.

Advantages of ALLPATHS-LG over traditional assemblers?
• Relatively fast runtime
• Can use long reads only for small genomes

Post-translational modifications?
Post-translational modification (PTM) refers to the covalent and generally enzymatic modification of proteins following protein biosynthesis. Proteins are synthesized by ribosomes translating mRNA into polypeptide chains, which may then undergo PTM to form the mature protein product.

What is needed for protein modifications?
Post-translational modification (PTM) refers to the covalent and generally enzymatic modification of proteins following protein biosynthesis. Proteins are synthesized by ribosomes translating mRNA into polypeptide chains, which may then undergo PTM to form the mature protein product.
Modifications that occur early in the life of the protein:
• Carboxylation of glutamate residues
• Removal of the N-terminal methionine
• Glycosylation
• Addition of prosthetic groups
• Formation of multisubunit complexes
• Prenylation of cysteine residues, which assists anchoring of proteins in or on membranes
These more or less "permanent" modifications and transport ultimately result in the delivery of functional proteins to specific locations in cells. The activities of many proteins are then controlled by post-translational modifications.
• The most prominent and best-understood of these is phosphorylation of serine, threonine, or tyrosine residues.
• Phosphorylation may activate or inactivate enzymes, alter protein-protein interactions and associations, change protein structures, and target proteins for degradation.
• Protein phosphorylation regulates protein function in diverse contexts and appears to be a key switch for rapid on-off control of signaling cascades, cell-cycle control, and other key cellular functions.

Eulerian path and Eulerian circuit?
• Eulerian path: a path that visits every edge of the graph once and only once. It can end on a vertex different from the one on which it began.
• Eulerian circuit: an Eulerian path that begins and ends on the same vertex.

Types of plasma protein
• Albumin: made mainly in the liver. Helps to keep the blood from leaking out of blood vessels. Helps to carry medicines and other substances. Important for tissue growth and healing.
• Globulin: made up of different proteins, i.e. alpha, beta and gamma types. Has a role in immunity. Determines the chances of developing an infection.

Working of greedy graph algorithm.
Greedy graph algorithms represent the simplest, most intuitive solution to the assembly problem. They work as follows:
1. Compare all reads or contigs in a pairwise fashion to identify overlapping sequences.
2. Merge the sequences that overlap each other the best.
3. Repeat step 2 until no more sequences can be merged, or the remaining overlaps conflict with existing contigs.
• The best overlapping fragments are the ones with the highest score.
• The scoring function measures the number of matching bases in the overlap.

Output of Genscan
Eukaryotes: Genscan output
• Gene structure
• Promoter site
• Translation initiation exon
• Internal exons
• Terminal exon (translation termination)
• Polyadenylation site
• Genscan is 80% accurate on human sequences.
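Returning to the Eulerian path and circuit definitions given above, the distinction can be checked mechanically for a small undirected graph. The sketch below relies on the standard result that a connected undirected graph has an Eulerian circuit when every vertex has even degree, and an Eulerian path (but no circuit) when exactly two vertices have odd degree. The function name `classify_euler` and the edge-list input format are assumptions made for this note.

```python
from collections import defaultdict


def classify_euler(edges):
    """Return 'circuit', 'path', or 'none' for an undirected graph given as edge pairs."""
    if not edges:
        return "none"
    degree = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adj[u].add(v)
        adj[v].add(u)
    # Every vertex that carries an edge must be reachable from the first one.
    start = next(iter(degree))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if len(seen) != len(degree):
        return "none"
    odd = sum(1 for d in degree.values() if d % 2)
    if odd == 0:
        return "circuit"  # can start and end on the same vertex
    if odd == 2:
        return "path"  # must start and end on the two odd-degree vertices
    return "none"


print(classify_euler([("A", "B"), ("B", "C"), ("C", "A")]))
print(classify_euler([("A", "B"), ("B", "C")]))
```

Assemblers that use this idea typically work on directed graphs built from read substrings, where the condition is stated in terms of in-degree and out-degree; the undirected version is used here only because it matches the definitions in these notes most directly.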
How are genes predicted? Write two ways.
Predictions are derived from different computational methods. Two well-known approaches:
• "Ab initio" gene finding
• Comparative approach

Limitations of greedy graph algorithm?
Greedy assemblers can detect false overlaps, including high-scoring ones that result from repetitive sequences. Graph traversal using a greedy approach may cause the algorithm to become stuck in local maxima, which produces a suboptimal solution for the assembly problem.

Which interactions characterize the tertiary structure of a protein?
The tertiary structure of a protein refers to the overall three-dimensional arrangement of its polypeptide chain in space. It is generally stabilized by polar, hydrophilic hydrogen-bond and ionic interactions on the outside, and by hydrophobic interactions between nonpolar amino acid side chains on the inside.

How do two loci show 1% recombination on a genetic map?

Two strategies used in human gene therapy?
• Gene-transfer approaches, in which a wild-type copy of the mutated gene is delivered.
• RNA modification therapy, in which the mRNA encoded by a mutant gene is targeted.
• Stem cell therapy, in which human stem cells are used to repair disease-damaged tissue.

Protein structure determination techniques?
• X-ray crystallography: reliable but slow; not all proteins crystallize.
• Computer structure prediction programs: not reliable for all proteins.
Pharmaceutical importance of proteins?
• Proteins as pharmaceuticals
• Protein applications
• Health effects of whey proteins
• Iron chelate protein
• Zinc chelate protein
• Tumor markers

What are two different types of sequencing?
• Whole exome sequencing
• Whole genome sequencing
These are increasingly used in healthcare and research to identify genetic variations; both methods rely on new technologies that allow rapid sequencing of large amounts of DNA. These approaches are known as next-generation sequencing (or next-gen sequencing).

What are introns?
• They are non-coding sections of a gene.
• They are transcribed into the precursor mRNA sequence but are ultimately removed by RNA splicing.
• Most introns appear to be mobile genetic elements.

Name the DNA building blocks of gene-dense and gene-poor areas.
• Gene-dense: the human genome's gene-dense areas are predominantly composed of the DNA building blocks G and C.
• Gene-poor: gene-poor areas are rich in the DNA building blocks A and T.

Write three major physical mapping techniques.
A physical map of a chromosome or genome shows the physical locations of genes and other DNA sequences. The techniques are:
o Cytogenetic map
o Restriction map
o Fluorescence in situ hybridization (FISH)
o Sequence tagged site (STS) map
o Nucleotide sequence map

Describe the D loop of human mitochondria.
The D loop is the site where most replication and transcription is controlled. Genes are tightly packed, with almost no non-coding DNA outside the D loop. Human mitochondrial genes contain no introns, although introns are found in the mitochondria of other groups such as plants.

What are histones?
Histones are a family of basic proteins that associate with DNA in the nucleus and help condense it into chromatin. Nuclear DNA does not appear in free linear strands; it is highly condensed and wrapped around histones in order to fit inside the nucleus and take part in the formation of chromosomes.

What is a viral genome? Write its types. (5)
A viral genome is the genetic material of a virus. It is also known as the viral chromosome. Viral genomes vary in size from a few thousand to more than a hundred thousand nucleotides.
Types of viral genome: a viral genome can be
▪ ssRNA
▪ ssDNA
▪ dsRNA
▪ dsDNA
▪ Linear
▪ Circular

Describe a brief procedure to create a labeled cDNA in microarray.
• Chips are prepared using cDNA; these are called cDNA chips or cDNA microarrays.
• The cDNAs are amplified and then immobilized on a solid support made of nylon filter or a glass slide.
• The probe DNA is loaded by capillary action.
• A small volume of this DNA is spotted on the solid surface.
• DNA is delivered mechanically or in a robotic manner.
• When one DNA spotting is done, the pin is washed and loaded with fresh DNA to start the second cycle.

What are operons?
Operons are contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions.
Describe logarithm of odds.
• Usually, the lod (logarithm of odds) score method is used for statistical analysis of pedigree data.
• A lod score compares the expected distributions of traits if they are linked or not linked.
• The lod score is the log10 of the ratio of the two probabilities. The higher the lod score, the closer the two genes.

Difference between prokaryotic and eukaryotic gene expression. (3)
• In prokaryotes, mRNA is not modified.
• The existence of introns in prokaryotes is extremely rare.
• In prokaryotes, the newly synthesized mRNA is polycistronic (codes for more than one polypeptide chain).
• In prokaryotes, transcription of a gene and translation of the resulting mRNA occur simultaneously.
• Many polysomes are found associated with an active gene.
• Prokaryotes: operons; simultaneous transcription and translation.
• Eukaryotes: no operons; RNA processing; chromatin remodeling.

Homologues
• Homologues are components or characters (such as genes or proteins with similar sequences) that can be attributed to a common ancestor of the two organisms during evolution.
• Homologues can be:
  - Orthologues
  - Paralogues
  - Xenologues
  - Analogues

Orthologues
• Orthologues are homologues that have evolved from a common ancestral gene by speciation.
• They usually have similar functions.

Paralogues
• Paralogues are homologues that are produced by duplication within a genome followed by subsequent divergence.
• They often have different functions.

Xenologues
• Xenologues are homologues that are related by an interspecies (horizontal) transfer of the genetic material for one of the homologues.
• The functions of xenologues are quite often similar.

Analogues
• Analogues are non-homologous genes or proteins that have descended convergently from unrelated ancestors.
• They have similar function but different sequence or structure.