Bioinformatics Assignment 4
Instructions:
Use the template to complete your assessment. Your answers must be filled in the text boxes using
the blue font colour as indicated. You are encouraged to use screenshots. Ensure that you proofread
your work and answer all the questions to the best of your ability. Submit original work and
acknowledge your sources appropriately. The assessment will be checked against SafeAssign for
plagiarism.
The Department of Health has noticed a surge of patients who suffer from similar symptoms in the
past few months. They believe that the symptoms are due to a viral infection.
To identify the virus that infected the patients, a blood sample will be collected from each
patient and tested by next-generation sequencing (NGS) or ELISA. For NGS, the DNA will be
extracted from the blood and a sequencing library will be prepared. The library will then be
sequenced and the data analysed. The next step will be mutation validation, after which the
virus causing the patient's symptoms can be identified.
The ELISA method may also be used. The antibody-containing sample will be added to the
antigen-coated ELISA plates. The target antibodies will bind to the antigen coated on the surface.
The enzyme-linked secondary antibodies will then bind to and detect the target antibodies. The
substrate for the enzyme will be added. The enzyme will react with the substrate and produce a
colour with an intensity directly related to the target antibody level, and on that basis the
patient will be diagnosed.
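The relationship between colour intensity and antibody level can be illustrated with a small calibration example. The sketch below is only an illustration: the function name `interpolate_level` and the values in `STANDARDS` are hypothetical, and it assumes simple piecewise-linear interpolation between calibration wells, whereas real ELISA standard curves are often fitted with a sigmoidal model.

```python
from bisect import bisect_left


def interpolate_level(od, standards):
    """Estimate an antibody level from an optical density (OD) reading.

    standards: list of (od, level) pairs from calibration wells.
    Uses piecewise-linear interpolation between neighbouring standards.
    """
    pts = sorted(standards)
    ods = [p[0] for p in pts]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the range of the standards")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return pts[i][1]
    (x0, y0), (x1, y1) = pts[i - 1], pts[i]
    return y0 + (y1 - y0) * (od - x0) / (x1 - x0)


# Hypothetical calibration wells: (OD reading, antibody level in arbitrary units).
STANDARDS = [(0.1, 0.0), (0.5, 10.0), (1.0, 25.0), (2.0, 60.0)]

if __name__ == "__main__":
    print(interpolate_level(0.75, STANDARDS))
```

A reading that falls outside the calibrated range raises an error instead of being extrapolated, since the colour response is not reliable beyond the standards.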
2. Your investigation proved that the symptoms experienced are due to an infection from a
new pox virus strain. You send a DNA sample to The Next Generation Sequencing (NGS) Unit
and they successfully sequence the genome of the strain. Why was NGS rather than Sanger
sequencing used for this? (2)
Answer…
Answer…
b) ORFs found: 301
c) The protein that will be translated is RecName: Full=Ankyrin repeat domain-containing
protein OPG023; AltName: Full=Host range protein 1 [Monkeypox virus]
Answer…
>sp|A0A7H0DMZ9.1|PG023_MONPV RecName: Full=Ankyrin repeat domain-containing protein OPG023; AltName: Full=Host range protein 1
MFDYLENEEVVLDELKQMLRDRDPNDTRNQFKNNALHAYLFNEHCNNVEVVKLLLDSGTNPLHKNWRQFT
PIEEYTNSRHVKVNKDIAMALLEATGYSNINDFNIFSYMKSKNVDVDLIKVLVEHGFDLSVKCENHRSVI
ENYVMTDDPVPEIIDLFIENGCSVLYEDDEYGYVYDDYQPRNCGTVLHLYIIAHLYSESDTRAYVRPEVV
KCLINHGIKPSSIDKNYCTALQYYIKSSHIDIDIVKLLMKGIDNTAYSYIDDLTCCTRGIMADYLNSDYR
YNKDVDLDLVKLFLENGKPYGIMCSIVPLWRNDKETISLILKTMNSDVLQHILIEYMTFGDIDIPLVECM
LEYGAVVNKEAIHRYFRNINIDSYTMKYLLKKEGGDAVNHLDDGEIPIGHLCESNYGCYNFYTYTYKKGL
CDMSYVCPILSTINICLPYLKDINMIDKRGETLLHKAVRYNKQSLVSLLLESGSDVNIRSNNGYTCITIA
INESRNIELLKMLLCHKPTLDCVIDSLSEVSNIVDNAYAIKQCIKYTMIIDDCTSSKIPESISQRYNDYI
DLCNQELNEMKKIMVGGNTMFSLIFTEHGAKIIHRYANNPELRAYYESKQNKIYVEAYDIISDAIVKHNK
IHKTIIKSVDDNTYISNLPYTIKYKIFEQQ
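The number of ORFs reported for a genome depends on how an ORF is defined (minimum length, allowed start codons, treatment of nested ORFs). The sketch below shows one simple definition, scanning all six reading frames for ATG-to-stop stretches. The function `find_orfs` and its `min_codons` parameter are hypothetical names used for illustration; this is not the tool used for the count above and would not necessarily reproduce it.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_codons=30):
    """Find simple ATG..stop ORFs in all six reading frames.

    Returns (strand, start, end) tuples. Coordinates are 0-based on the
    scanned strand, and end is exclusive and includes the stop codon.
    Only the first ATG after the previous stop is used as a start.
    """
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in stops:
                    # Length is counted in codons before the stop codon.
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 5 + "TAA"  # toy sequence, not real data
    print(find_orfs(toy, min_codons=3))
```

Raising `min_codons` reduces the number of short, probably spurious ORFs, which is one reason different tools and settings give different counts for the same genome.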
5. Download the amino acid sequence of one of the proteins and search for similar sequences
on the NCBI database (not same family). Comment on the results. (2)
d) What challenges did you encounter and how did you solve them? (1)
Answer…
Top 20 Matches
a) Multiple alignments
b)
8. You have identified that there is a variant of the virulence protein in a specific group of
people.
a) Search for the nucleotide sequence for the protein and design new PCR amplification
primers to target the gene using Sanger sequencing. (2)
c) Report the primer sequences and the sizes of the possible amplicons. (2)
d) Which primer set would you use to amplify the gene and why? (2)
a) atgtttgattatctggaaaacgaagaagtggtgctggatgaactgaaacagatgctgcgc
gatcgcgatccgaacgatacccgcaaccagtttaaaaacaacgcgctgcatgcgtatctg
tttaacgaacattgcaacaacgtggaagtggtgaaactgctgctggatagcggcaccaac
ccgctgcataaaaactggcgccagtttaccccgattgaagaatataccaacagccgccat
gtgaaagtgaacaaagatattgcgatggcgctgctggaagcgaccggctatagcaacatt
aacgattttaacatttttagctatatgaaaagcaaaaacgtggatgtggatctgattaaa
gtgctggtggaacatggctttgatctgagcgtgaaatgcgaaaaccatcgcagcgtgatt
gaaaactatgtgatgaccgatgatccggtgccggaaattattgatctgtttattgaaaac
ggctgcagcgtgctgtatgaagatgatgaatatggctatgtgtatgatgattatcagccg
cgcaactgcggcaccgtgctgcatctgtatattattgcgcatctgtatagcgaaagcgat
acccgcgcgtatgtgcgcccggaagtggtgaaatgcctgattaaccatggcattaaaccg
agcagcattgataaaaactattgcaccgcgctgcagtattatattaaaagcagccatatt
gatattgatattgtgaaactgctgatgaaaggcattgataacaccgcgtatagctatatt
gatgatctgacctgctgcacccgcggcattatggcggattatctgaacagcgattatcgc
tataacaaagatgtggatctggatctggtgaaactgtttctggaaaacggcaaaccgtat
ggcattatgtgcagcattgtgccgctgtggcgcaacgataaagaaaccattagcctgatt
ctgaaaaccatgaacagcgatgtgctgcagcatattctgattgaatatatgacctttggc
gatattgatattccgctggtggaatgcatgctggaatatggcgcggtggtgaacaaagaa
gcgattcatcgctattttcgcaacattaacattgatagctataccatgaaatatctgctg
aaaaaagaaggcggcgatgcggtgaaccatctggatgatggcgaaattccgattggccat
ctgtgcgaaagcaactatggctgctataacttttatacctatacctataaaaaaggcctg
tgcgatatgagctatgtgtgcccgattctgagcaccattaacatttgcctgccgtatctg
aaagatattaacatgattgataaacgcggcgaaaccctgctgcataaagcggtgcgctat
aacaaacagagcctggtgagcctgctgctggaaagcggcagcgatgtgaacattcgcagc
aacaacggctatacctgcattaccattgcgattaacgaaagccgcaacattgaactgctg
aaaatgctgctgtgccataaaccgaccctggattgcgtgattgatagcctgagcgaagtg
agcaacattgtggataacgcgtatgcgattaaacagtgcattaaatataccatgattatt
gatgattgcaccagcagcaaaattccggaaagcattagccagcgctataacgattatatt
gatctgtgcaaccaggaactgaacgaaatgaaaaaaattatggtgggcggcaacaccatg
tttagcctgatttttaccgaacatggcgcgaaaattattcatcgctatgcgaacaacccg
gaactgcgcgcgtattatgaaagcaaacagaacaaaatttatgtggaagcgtatgatatt
attagcgatgcgattgtgaaacataacaaaattcataaaaccattattaaaagcgtggat
gataacacctatattagcaacctgccgtataccattaaatataaaatttttgaacagca
b) 10 Primers
c)
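When comparing candidate primer pairs, it can help to check basic properties such as GC content, an approximate melting temperature, and the expected amplicon size. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, the melting temperature uses the rough Wallace rule (2 °C per A/T, 4 °C per G/C), and the example primers are simply taken from the ends of the first 60 bases of the sequence in a) to demonstrate the calculation. They are not the designed primers.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C (Wallace rule)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def amplicon_size(template, forward, reverse):
    """Expected product length, or None if a primer does not match exactly."""
    t = template.upper()
    start = t.find(forward.upper())
    # The reverse primer anneals to the top strand as its reverse complement.
    end = t.find(reverse_complement(reverse))
    if start == -1 or end == -1 or end < start:
        return None
    return end + len(reverse) - start


# First 60 bases of the sequence in a), used here as a toy template.
TEMPLATE = "atgtttgattatctggaaaacgaagaagtggtgctggatgaactgaaacagatgctgcgc"
FORWARD = TEMPLATE[:20]                       # illustrative only
REVERSE = reverse_complement(TEMPLATE[-20:])  # illustrative only

if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, primer, gc_content(primer), wallace_tm(primer))
    print("amplicon size:", amplicon_size(TEMPLATE, FORWARD, REVERSE))
```

Primer pairs with similar melting temperatures and moderate GC content are generally preferred, so these numbers give a quick way to justify the choice of one set over another.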
9. Describe the NGS workflow and explain why it is important to perform quality check during
data analysis. (3)
With next-generation sequencing (NGS), large and complex genomes can be sequenced in a
single day because of its high throughput. In Illumina NGS systems, massively parallel
sequencing of nucleic acid samples enables high-throughput data collection. The workflow
includes isolation of the desired nucleic acids, fragmentation of the isolated nucleic acids and
preparation of samples for the sequencer (library preparation), the sequencing reactions, and
bioinformatic processing and analysis of the sequencing data.
Performing quality checks during data analysis helps ensure successful sequencing outcomes:
low-quality reads and bases, adapter contamination and other artefacts can be detected and
removed before they cause errors in downstream steps such as alignment and variant calling.
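One common quality check is to look at the per-base Phred scores stored in FASTQ files, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below is a minimal illustration, assuming the standard Phred+33 encoding. The function names, the threshold `min_mean_q`, and the example reads are hypothetical. In practice, dedicated QC tools are normally used for this step.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def mean_quality(quality_line, offset=33):
    """Mean Phred score of a read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores)


def filter_reads(records, min_mean_q=20):
    """Keep (read_id, sequence, quality) records with a high enough mean score."""
    return [r for r in records if mean_quality(r[2]) >= min_mean_q]


if __name__ == "__main__":
    # Hypothetical reads for illustration only.
    reads = [
        ("read1", "ACGT", "IIII"),  # high-quality read
        ("read2", "ACGT", "!!!!"),  # very low-quality read
    ]
    for read_id, _, qual in filter_reads(reads):
        print(read_id, mean_quality(qual))
```

A mean score of 20 corresponds to roughly one expected error per 100 bases, so reads below such a threshold are often trimmed or discarded before alignment.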