Vietnam National University Ho Chi Minh International University

This document contains the assignment submission from group 21 of the Bioinformatics course at Vietnam National University, Ho Chi Minh City International University. The assignment involves performing sequence alignments on protein sequences retrieved from NCBI, including: 1) A local and global alignment of mouse and E. coli HPRT sequences. 2) A local alignment with modified gap penalties. 3) A multiple sequence alignment of 5 species' sequences and analysis of the first 50 residues.


VIETNAM NATIONAL UNIVERSITY

HO CHI MINH INTERNATIONAL UNIVERSITY

BIOINFORMATICS
ASSIGNMENT 3
SEQUENCE ALIGNMENT

GROUP 21:
1. Võ Huỳnh Như - BTBTIU18367
2. Cao Sang - BTBTIU18324
3. Lê Thục Đoan Trinh - BTBTIU18254
4. Nguyễn Uyên Y Xuân - BTBTIU18301

Date of submission: November 20, 2020


Question 1: In the NCBI database, retrieve the protein sequences for
mouse hypoxanthine phosphoribosyl transferase (HPRT)
(NP_038584) and the same enzyme from E. coli (WP_103280448) in
FASTA format.
a. Perform a local alignment of the two sequences using Water in
EMBL-EBI EMBOSS. Report identity, similarity & gaps
- Identity: 63/175 (36.0%)
- Similarity: 97/175 (55.4%)
- Gaps: 13/175 (7.4%)
Figure 1: A local alignment of two sequences using Water
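
The three values reported above can be read from the Water report and rechecked automatically. The sketch below is only an illustration: the `report` text is a hand-typed excerpt assumed to follow the usual EMBOSS pair-format header, and `parse_emboss_stats` is a hypothetical helper name, not part of EMBOSS.

```python
import re

# Hand-typed excerpt; the header layout is assumed, the numbers come from part a.
report = """\
# Length: 175
# Identity:      63/175 (36.0%)
# Similarity:    97/175 (55.4%)
# Gaps:          13/175 ( 7.4%)
# Score: 266.5
"""

def parse_emboss_stats(text):
    """Return {name: (count, length, percent)} for Identity, Similarity and Gaps."""
    pattern = re.compile(r"^#\s*(Identity|Similarity|Gaps):\s*(\d+)/(\d+)", re.MULTILINE)
    stats = {}
    for name, count, length in pattern.findall(text):
        count, length = int(count), int(length)
        # Recompute the percentage from the raw counts instead of trusting the text.
        stats[name] = (count, length, round(100 * count / length, 1))
    return stats

stats = parse_emboss_stats(report)
for name, (count, length, percent) in stats.items():
    print(f"{name}: {count}/{length} ({percent}%)")
```

Recomputing each percentage from the counts is a quick check that the figures copied into the answer match the report.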

b. Perform a global alignment of the two sequences using Needle in
EMBL-EBI EMBOSS. Compare the results with those from the local
alignment.
                 Global alignment (Needle)   Local alignment (Water)
Matrix           EBLOSUM62                   EBLOSUM62
Gap_penalty      10.0                        10.0
Extend_penalty   0.5                         0.5
Length           224                         175
Identity         64/224 (28.6%)              63/175 (36.0%)
Similarity       102/224 (45.5%)             97/175 (55.4%)
Gaps             52/224 (23.2%)              13/175 (7.4%)
Score            261.5                       266.5

Figure 2: A global alignment of the two sequences using Needle
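
The pattern in the comparison (the global alignment is longer, has more gaps and a slightly lower score) follows from how the two algorithms differ: a global alignment must span both sequences end to end, while a local alignment keeps only the best-scoring region. The sketch below illustrates this on toy DNA strings. It is a simplification, not what Needle and Water compute: it assumes a match/mismatch score and a linear gap penalty instead of EBLOSUM62 with affine gaps, and `align_score` and the two sequences are made up for the example.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score: Needleman-Wunsch (global) or Smith-Waterman (local)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are charged along the first row and column.
        for i in range(1, rows):
            H[i][0] = i * gap
        for j in range(1, cols):
            H[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if local:
                # Local alignment: a score never drops below zero, so the
                # alignment can restart anywhere.
                cell = max(cell, 0)
                best = max(best, cell)
            H[i][j] = cell
    return best if local else H[-1][-1]

seq_a = "AAAAGATTACA"  # toy sequences sharing the region GATTACA
seq_b = "GATTACATTTT"
global_score = align_score(seq_a, seq_b)
local_score = align_score(seq_a, seq_b, local=True)
print("global:", global_score, "local:", local_score)
```

The local score covers only the shared region, whereas the global score also pays for the unmatched ends, which mirrors the extra gaps in the Needle result.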


c. Change the default gap penalty from “10/0.5” to “5/0.1”. Run the
local alignment again and compare the results with the previous local
alignment.
                 New local alignment (“5/0.1”)   Previous local alignment (“10/0.5”)
Matrix           EBLOSUM62                       EBLOSUM62
Gap_penalty      5.0                             10.0
Extend_penalty   0.1                             0.5
Length           200                             175
Identity         73/200 (36.5%)                  63/175 (36.0%)
Similarity       105/200 (52.5%)                 97/175 (55.4%)
Gaps             43/200 (21.5%)                  13/175 (7.4%)
Score            312.9                           266.5

Figure 3: A new local alignment (“5/0.1”) of two sequences using Water
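
The jump from 13 to 43 gap positions is easier to see once the cost of a single gap is written out. The sketch below assumes the common affine convention in which the first gap position costs the opening penalty and each further position costs the extension penalty. Whether EMBOSS applies exactly this formula is an assumption here, and `gap_cost` is a hypothetical helper.

```python
def gap_cost(length, gap_open, gap_extend):
    """Affine gap cost: opening penalty once, extension penalty for each extra position."""
    if length < 1:
        raise ValueError("a gap must span at least one position")
    return gap_open + (length - 1) * gap_extend

# Compare the two penalty settings used in part c for a few gap lengths.
for length in (1, 5, 10):
    default_cost = gap_cost(length, 10.0, 0.5)
    relaxed_cost = gap_cost(length, 5.0, 0.1)
    print(f"gap length {length}: 10/0.5 costs {default_cost:.1f}, 5/0.1 costs {relaxed_cost:.1f}")
```

Under this assumption every gap is less than half as expensive with “5/0.1”, so the aligner can afford more gaps to reach additional matching residues, which is consistent with the longer alignment and higher score in the table.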


Question 2: In the NCBI database, retrieve the protein sequences for
human (NP_000509), Pan troglodytes (XP_508242), Canis familiaris
(NP_001257813), Mus musculus (NP_058652), Gallus gallus
(NP_990820). Perform multiple sequence alignment for these
sequences and investigate the first 50 residues of the alignment.
Answer the following questions:

Figure: The protein sequences for human, Pan troglodytes, Canis
familiaris, Mus musculus and Gallus gallus
a. What is the identity (%) and gaps (%)?
% identity = (31/50) × 100 = 62%
% gaps = 0%
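
The same count can be made programmatically. This is a minimal sketch, assuming that identity means the share of alignment columns in which every sequence carries the same residue, and that a gap column is any column containing "-". The function name `column_stats` and the three short sequences are invented for illustration and are not the sequences from the assignment.

```python
def column_stats(aligned, n):
    """Count fully conserved columns and gap-containing columns in the first n columns."""
    identical = 0
    gapped = 0
    for column in zip(*(seq[:n] for seq in aligned)):
        if "-" in column:
            gapped += 1
        elif len(set(column)) == 1:
            identical += 1
    return identical, gapped

# Toy alignment (hypothetical data), six columns long.
toy = ["MKT-AL", "MKTGAL", "MRTGAL"]
identical, gapped = column_stats(toy, 6)
print(f"identity: {100.0 * identical / 6:.1f}%  gaps: {100.0 * gapped / 6:.1f}%")

# The reported figure for the real alignment: 31 conserved columns out of 50.
print(f"reported identity: {100.0 * 31 / 50:.0f}%")
```

Applied to the first 50 columns of the real multiple alignment, the first value returned would correspond to the 31 used in the calculation above.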
b. Which position is highly mutated?
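
One way to approach this question is to count how many different residues appear in each column of the alignment and report the column with the most. The sketch below assumes that "highly mutated" means the largest number of distinct residues. `most_variable_positions` is a hypothetical helper and the sequences are toy data, so the output says nothing about the real alignment.

```python
def most_variable_positions(aligned, n):
    """Return (1-based positions, distinct-residue count) for the most variable columns."""
    counts = []
    for column in zip(*(seq[:n] for seq in aligned)):
        # Gap characters are ignored so that only residue changes are counted.
        counts.append(len(set(column) - {"-"}))
    highest = max(counts)
    positions = [i + 1 for i, c in enumerate(counts) if c == highest]
    return positions, highest

toy = ["MKTAL", "MRTAL", "MQTSL"]  # hypothetical aligned sequences
positions, distinct = most_variable_positions(toy, 5)
print(f"most variable position(s): {positions} with {distinct} distinct residues")
```

Other measures of variability, such as per-column entropy, would work in the same loop.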

c. Which species have 100% identity when comparing the first 50
residues?
Human (NP_000509) and Pan troglodytes (XP_508242) have 100%
identity when comparing the first 50 residues.

Figure: The protein sequences of human and Pan troglodytes
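
A pairwise check like this one can be automated by comparing every pair of aligned sequences over the first n columns. The sketch below is illustrative only: the species labels are real but the short sequences are invented, and `pairwise_identity` is a hypothetical helper that counts a column as identical when both residues match and neither is a gap.

```python
from itertools import combinations

def pairwise_identity(a, b, n):
    """Percent of the first n alignment columns where a and b carry the same residue."""
    matches = sum(1 for x, y in zip(a[:n], b[:n]) if x == y and x != "-")
    return 100.0 * matches / n

# Toy aligned sequences (hypothetical data, not the assignment sequences).
toy = {
    "human": "MKTAL",
    "chimpanzee": "MKTAL",
    "chicken": "MRTSL",
}
identities = {
    (s1, s2): pairwise_identity(toy[s1], toy[s2], 5)
    for s1, s2 in combinations(toy, 2)
}
for (s1, s2), pct in identities.items():
    print(f"{s1} vs {s2}: {pct:.1f}%")
```

Any pair reported at 100.0% is identical over the compared columns, which is the criterion used in the answer above.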
