String Database
Digital Assignment 1
Faculty - RAMANATHAN K
STRING
Functional Protein Association Networks
1. About the Database
STRING is a database of known and predicted protein-protein interactions.
The interactions include direct (physical) and indirect (functional)
associations; they stem from computational prediction, from knowledge
transfer between organisms, and from interactions aggregated from other
(primary) databases.
The STRING database currently covers 24,584,628 proteins from 5,090
organisms.
2. Importance of Database
Protein–protein interaction networks are an important ingredient for the
system-level understanding of cellular processes. Such networks can be
used for filtering and assessing functional genomics data and for providing
an intuitive platform for annotating structural, functional and evolutionary
properties of proteins. Exploring the predicted interaction networks can
suggest new directions for future experimental research and provide
cross-species predictions for efficient interaction mapping.
3. Updates, if Any
Data can be uploaded to this database by providing the following:
1. Basic Information
2. Node Data - Proteins
3. Edge Data - Protein Associations
The STRING App allows users to modify an already retrieved network in
three different ways.
First, the confidence cutoff for the imported evidence channels can be
increased or decreased, which in the latter case involves fetching
additional interactions from STRING.
Second, the network can be expanded by a user-specified number of
additional nodes that interact most strongly with the current network or
with a selected subset of its nodes.
Third, any number of additional nodes can be queried by name and added
to the existing network.
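The first and third modifications can be pictured with a toy in-memory network. This is only a minimal sketch, not the STRING App's own code: the Network class, its method names, and the scores below are hypothetical, and scores are assumed to lie on a 0 to 1 scale. Raising the cutoff only needs the edges already in hand, whereas lowering it would require fetching more interactions from STRING, which the sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy protein network (hypothetical structure, for illustration only)."""
    nodes: set = field(default_factory=set)
    # Maps a sorted (protein_a, protein_b) pair to a confidence score in [0, 1].
    edges: dict = field(default_factory=dict)

    def add_edge(self, a, b, score):
        """Store an undirected edge and register both endpoints as nodes."""
        self.nodes.update((a, b))
        self.edges[tuple(sorted((a, b)))] = score

    def apply_cutoff(self, cutoff):
        """Return a copy that keeps all nodes but only edges scoring >= cutoff."""
        kept = Network(nodes=set(self.nodes))
        for (a, b), score in self.edges.items():
            if score >= cutoff:
                kept.add_edge(a, b, score)
        return kept

    def add_nodes(self, names):
        """Add extra proteins by name; they start with no edges."""
        self.nodes.update(names)


# Illustrative scores only, not values taken from STRING.
net = Network()
net.add_edge("TP53", "MDM2", 0.99)
net.add_edge("TP53", "EP300", 0.65)
net.add_edge("MDM2", "CDKN2A", 0.45)

strict = net.apply_cutoff(0.7)   # raise the confidence cutoff
strict.add_nodes(["ATM"])        # add a further node by name
print(sorted(strict.edges))
print(sorted(strict.nodes))
```

Keeping the nodes when edges are dropped mirrors the idea that a cutoff changes which associations are shown, not which proteins were queried.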
4. How to retrieve the information?
To specify your desired starting point of the analysis you have to use the
input form at the STRING start page.
● Protein by name
● Protein by sequence
● Multiple proteins
● Multiple sequences
● Organisms
● Protein families (COGs)
● Examples
● Random entry
You can search STRING by a single protein name, by multiple names, or by
amino acid sequence (in any format). There are also example inputs and a
random input generator, which randomly selects a protein with at least 4
predicted links at medium confidence or better. An organism entry lets you
check whether your species of interest is available. You can also search by
protein family rather than by a protein in a single organism, by
searching the COGs (clusters of orthologous groups).
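The rule behind the random entry (a protein with at least 4 links at medium confidence or better) can be sketched as below. This is an illustration under stated assumptions rather than STRING's actual implementation: the function names and the edge list are hypothetical, and medium confidence is assumed to mean a score of 0.4 or higher on a 0 to 1 scale.

```python
import random

MEDIUM_CONFIDENCE = 0.4  # assumed threshold on a 0-1 scale


def eligible_proteins(edges, min_links=4, cutoff=MEDIUM_CONFIDENCE):
    """Return proteins that have at least min_links edges scoring >= cutoff."""
    counts = {}
    for a, b, score in edges:
        if score >= cutoff:
            counts[a] = counts.get(a, 0) + 1
            counts[b] = counts.get(b, 0) + 1
    return sorted(p for p, n in counts.items() if n >= min_links)


def random_entry(edges, rng=random):
    """Pick one eligible protein at random, or None if there is none."""
    candidates = eligible_proteins(edges)
    return rng.choice(candidates) if candidates else None


# Hypothetical edge list: (protein_a, protein_b, score).
edges = [
    ("P1", "P2", 0.5),
    ("P1", "P3", 0.6),
    ("P1", "P4", 0.7),
    ("P1", "P5", 0.9),
    ("P2", "P3", 0.2),  # below the cutoff, so it is not counted
]
print(eligible_proteins(edges))
print(random_entry(edges))
```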
Commonly, you enter your protein of interest by supplying its name or
identifier. The organism can be selected by clicking on the arrow or by
typing its name directly into the corresponding input field (an
autocompletion mechanism will appear to help you). General names that group
more than one organism (e.g. "Mammals", "Chordata") can also be used.
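The same kind of query (protein names plus an organism) can also be expressed as a request URL for programmatic retrieval. The sketch below only builds the URL and sends nothing. It assumes the layout of the public STRING REST API (a "network" method taking "identifiers", "species" as an NCBI taxonomy ID, and an optional "required_score" from 0 to 1000); the helper name build_network_url is hypothetical, and the details should be checked against the current STRING API documentation before use.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"  # assumed base URL of the REST API


def build_network_url(identifiers, species, required_score=None,
                      output_format="tsv"):
    """Build (but do not send) a network request URL for the given proteins."""
    params = {
        # Multiple identifiers are joined with a carriage return (%0D when encoded).
        "identifiers": "\r".join(identifiers),
        "species": species,  # NCBI taxonomy ID, e.g. 9606 for human
    }
    if required_score is not None:
        params["required_score"] = required_score  # assumed 0-1000 scale
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


url = build_network_url(["TP53", "MDM2"], species=9606, required_score=400)
print(url)
```

Separating URL construction from the actual request keeps the example testable offline; the resulting URL could then be fetched with any HTTP client.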
5. References – At least one Research article that uses this dataset
The paper describing the database is:
Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D.,
Huerta-Cepas, J., Simonovic, M., Roth, A., Santos, A., Tsafou, K. P., Kuhn, M.,
Bork, P., Jensen, L. J., & von Mering, C. (2015). STRING v10: protein-protein
interaction networks, integrated over the tree of life. Nucleic Acids Research,
43(Database issue), D447–D452. https://doi.org/10.1093/nar/gku1003
Example -