Module_2_Reference Course content


MIT School of Bioengineering Sciences and Research

(A Constituent unit of MIT ADT University)

Basic Concepts In Bioinformatics / BI301

Module02

Databases and Biological Databases

Course Coordinator: Dr. Sanket P. Bapat / Dr. Priyanka Nath


Mail ID: sanket.bapat@mituniversity.edu.in | priyanka.nath@mituniversity.edu.in
Disclaimer:

The content delivered here should be considered of utmost importance. However, note
that this material is not stand-alone material for the fulfilment of the course
syllabus. The content in this presentation should only be used as an aid to learning.
Refer to the books and other resources provided for an exhaustive understanding.

MITBIO/MITADT University
Syllabus:

Module 2:
Biological Database and its Types

Introduction to data types and sources. Population and sample. Classification
and presentation of data. Quality of data; private and public data sources.
General introduction to biological databases. Nucleic acid databases (NCBI,
DDBJ, and EMBL). Protein databases (primary, composite, and secondary).
Specialized genome databases (SGD, TIGR, and ACeDB). Structure databases
(CATH, SCOP, and PDBsum).

Objective/Learning Outcome:

CO1 Understanding the basics of bioinformatics and its Applications

CO2 Difference between databases and various biological databases

CO3 Performing data storage methods and various formats.

CO4 Understanding sequence alignment and types of sequence alignment

CO5 Discuss the basics of gene expression and understand the difference between pattern finding and regular expression

CO6 Deduce the evolutionary relationships between the sequences by generating a phylogenetic tree.

Databases

Sanket Bapat
Database
• A database is an organized collection of data.

• DBMS: Database Management System, the software used to store, query, and update the data.

• 3 types of databases:
1. Flat file format.
2. Relational database.
3. Object-oriented database.
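
The three database types listed above can be contrasted with a small sketch. It is a minimal illustration only: the accessions, organisms, and sequences are made-up toy values, and the table and class names (`entry`, `Entry`) are assumptions chosen for this example.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical toy records; accessions and sequences are made up for illustration.
FLAT_FILE = "ACC0001|Homo sapiens|ATGGCC\nACC0002|Mus musculus|ATGGTT\n"


# 1. Flat file: plain text, one record per line, fields separated by a delimiter.
def read_flat_file(text):
    return [tuple(line.split("|")) for line in text.splitlines() if line]


# 2. Relational: the same records stored as rows of a table and queried with SQL.
def load_relational(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry (accession TEXT PRIMARY KEY, organism TEXT, sequence TEXT)"
    )
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?)", rows)
    return conn


# 3. Object-oriented: each record is an object that bundles data with behaviour.
@dataclass
class Entry:
    accession: str
    organism: str
    sequence: str

    def length(self):
        return len(self.sequence)


rows = read_flat_file(FLAT_FILE)
conn = load_relational(rows)
human = conn.execute(
    "SELECT accession FROM entry WHERE organism = ?", ("Homo sapiens",)
).fetchall()
objects = [Entry(*row) for row in rows]
```

The flat file is the simplest to write but has to be scanned line by line, the relational table supports indexed queries, and the object form keeps methods such as `length()` next to the data.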

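Difference between Primary and Secondary Databases
• Primary databases archive original experimental data, such as sequences or structures submitted by researchers (e.g. GenBank, DDBJ, EMBL, PDB).
• Secondary databases hold information derived from primary data through curation or computational analysis (e.g. Swiss-Prot).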

Biological Databases
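• GenBank:
  • Nucleic acid sequence database.
  • GenPept is the corresponding protein database, holding translations of the coding sequences annotated in GenBank.
  • Established in 1982.
  • Maintained by the NCBI, which is part of the NIH.
  • GenBank uses the flat file format.
  • Each record is divided into three parts: Header, Features, and Sequence.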
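
The Header / Features / Sequence layout can be illustrated with a short sketch. It assumes that the `FEATURES` and `ORIGIN` keywords mark the start of the feature table and the sequence, and that `//` ends the record. The record itself is a hypothetical, heavily shortened toy entry, not a real GenBank record, and `split_genbank` is a name invented for this example.

```python
# Hypothetical, heavily shortened record in a GenBank-like layout (not a real entry).
RECORD = """LOCUS       TOY0001    12 bp    DNA
DEFINITION  Toy record for illustration.
FEATURES             Location/Qualifiers
     source          1..12
     CDS             1..12
ORIGIN
        1 atggccaaat aa
//
"""


def split_genbank(text):
    """Split one flat-file record into header lines, feature lines, and sequence."""
    header, features, sequence = [], [], []
    current = header
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            current = features      # feature table starts after this keyword line
            continue
        if line.startswith("ORIGIN"):
            current = sequence      # sequence lines start after this keyword line
            continue
        if line.startswith("//"):
            break                   # end-of-record marker
        current.append(line)
    # Sequence lines carry position numbers and spaces; keep the letters only.
    residues = "".join(ch for ln in sequence for ch in ln if ch.isalpha())
    return header, features, residues.upper()


header, features, sequence = split_genbank(RECORD)
```

For real records, an established parser (for example the one in Biopython) is the safer choice; this sketch only shows how the three parts are laid out.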

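• DDBJ:
  • DNA Data Bank of Japan.
  • Established in 1986.
  • The only nucleotide sequence database in Asia.
  • Part of the National Institute of Genetics (NIG).

• EMBL:
  • Nucleotide sequence database of the European Molecular Biology Laboratory.
  • Maintained by the European Bioinformatics Institute (EBI).
  • The EBI also provides sequence analysis tools, including MUSCLE and Transeq.
  • MUSCLE is a multiple sequence alignment tool; Transeq translates nucleotide sequences into protein sequences.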
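
The kind of nucleotide-to-protein translation that a tool such as Transeq performs can be sketched as follows. This is a minimal illustration using the standard genetic code and a single forward reading frame; it is not the tool's actual implementation, and the function name `translate` and its `frame` parameter are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string in one forward frame (0, 1 or 2).

    Stop codons become '*'; codons with unknown bases become 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


print(translate("ATGGCCTAA"))  # prints MA*
```

Trailing bases that do not fill a complete codon are ignored in this sketch.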

Ensembl
• Contains all the human genome DNA sequences currently available in the public domain.
• Automated annotation: software tools identify features in the DNA sequences:
  • Genes (known or predicted)
  • Single nucleotide polymorphisms (SNPs)
  • Repeats
  • Homologies
• Created and maintained by the EBI and the Sanger Centre (UK), now the Wellcome Sanger Institute.
• www.ensembl.org
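
To make one of the annotated feature types concrete, the sketch below lists single-nucleotide differences between two sequences that are already aligned and of equal length. It is a toy illustration only, not how Ensembl's annotation pipeline works, and `find_snps` is a hypothetical helper name.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and have
    the same length; insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


print(find_snps("ATGGCC", "ATGACC"))  # prints [(4, 'G', 'A')]
```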
Swiss-Prot
• Annotated protein sequence database established in 1986 and maintained collaboratively since 1987 by the Department of Medical Biochemistry of the University of Geneva and the EBI.
• Complete, curated, and non-redundant.
• Highly cross-referenced: linked to 34 other databases.
• Available from a variety of servers and through sequence analysis software tools.
• Covers more than 8,000 different species.
• The first 20 species account for about 42% of all sequences in the database.

Protein Data Bank (PDB)
• Important in solving real problems in molecular biology.
• Established in 1971 at Brookhaven National Laboratory (BNL).
• Sole international repository of macromolecular structure data.
• Later moved to the Research Collaboratory for Structural Bioinformatics (RCSB).

http://www.rcsb.org/

TrEMBL (Translation of EMBL)
• Computer-annotated supplement to Swiss-Prot, created because manual curation cannot keep pace with the flow of incoming data.
• Well-structured, Swiss-Prot-like resource.
• Derived from automated translation of the coding sequences (CDS) in the EMBL database; maintained at the EBI, UK.
• Generated and annotated automatically using software tools, so its annotation quality is not comparable to that of Swiss-Prot.
• Contains all entries not yet included in Swiss-Prot.

Databases in Bioinformatics
Sequence databases: GenBank, UniProt

Sequence analysis: InterPro

Genomics: GSS, GENOME

Literature databases: PubMed

Structural databases: SCOP, CATH

Metabolic pathway databases: KEGG, REACTOME

Specialized databases: dbSNP, IMGT

Pitfalls of Biological Databases

• Errors in sequence databases
• Redundancy in the primary sequence databases
• False or incomplete gene annotations
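
The redundancy pitfall can be illustrated with a small sketch that collapses entries with identical sequences. It is a simplified assumption about what "redundant" means: real non-redundant resources also merge near-identical and fragment entries. The identifiers and the name `remove_redundant` are hypothetical.

```python
def remove_redundant(records):
    """Keep one (identifier, sequence) pair per distinct sequence.

    Sequences are compared case-insensitively and returned in upper case;
    the identifier of the first occurrence is kept.
    """
    first_seen = {}
    for identifier, sequence in records:
        first_seen.setdefault(sequence.upper(), identifier)
    return [(identifier, sequence) for sequence, identifier in first_seen.items()]


# Hypothetical toy entries: A1 and B2 carry the same sequence.
entries = [("A1", "ATGC"), ("B2", "atgc"), ("C3", "GGCC")]
print(remove_redundant(entries))  # prints [('A1', 'ATGC'), ('C3', 'GGCC')]
```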

References:

• Jin Xiong, Essential Bioinformatics (ebook / present in library)
• David W. Mount, Bioinformatics: Sequence & Genome Analysis (present in library)
• Charlie Hodgman, Bioinformatics: Second Edition (present in library)
• Parry Smith, Introduction to Bioinformatics (present in library)

Interesting Links:

• GenBank database tutorial: A Beginner's Guide
  Link: https://youtu.be/QCgSYQnmVuc?si=tL2JwuIN1UJ3fvZB

• PubMed database tutorial: A Beginner's Guide
  Link: https://youtu.be/HNAmDAqQF0I?si=oXELTPDx5eB9NNPP

• DDBJ database tutorial: A Beginner's Guide
  Link: https://youtu.be/XoImzQQLqiY?si=Fqt9W0Y0ruRhL3WB

• EMBL database tutorial: A Beginner's Guide
  Link: https://youtu.be/AgMw7yRfr3E?si=fH62PMJCp_vL1lD5

• UniProt database tutorial: A Beginner's Guide
  Link: https://youtu.be/KH8qVaWb704?si=I6MBIu5dPECD-zVI

• PDB database tutorial: A Beginner's Guide
  Link: https://youtu.be/FJTCrrs3PKs?si=M_PIvidP262_UDcF

The content is intended for internal use only, and the ownership belongs to the coordinator. It
should not be uploaded on any platform without proper authorization.
