
International Journal of Contemporary Issues in Social Sciences

Volume 3, Issue 3, 2024 ISSN(P):2959-3808 | 2959-2461

MACHINE LEARNING TECHNIQUES BASED DNA SEQUENCE ALIGNMENT: A SYSTEMATIC LITERATURE REVIEW

Sidra Ali1, Department of Computer Science, City University of Science and Information Technology, KPK, Email: Sidraali1289@gmail.com

Muhammad Arif Shah*2, Department of Computer Science, Pak-Austria Fachhochschule, Institute of Applied Sciences and Technology, Haripur, Email: arif.websol@gmail.com

Hamza Shaukat3, School of Technology, Jamk University of Applied Sciences, Finland, Email: hamzakhan12599@gmail.com

Muhammad Zakir Khan4, James Watt School of Engineering, University of Glasgow, Email: M.khan.6@research.gla.ac.uk

Corresponding authors*
Received: July 20, 2024 Revised: July 20, 2024 Accepted: September 20, 2024 Published: September 22, 2024

ABSTRACT
Given the massive volume of biological data, conventional computer science methods and algorithms fail to solve complex real-world biological problems. However, modern computational approaches such as machine learning can address the limitations of the traditional techniques. Machine learning has played a significant role in establishing bioinformatics as a field in its own right over the last 30 years. To solve the complex problems of biological data, we use machine learning techniques for Deoxyribonucleic Acid (DNA) sequence alignment data. We present a Systematic Literature Review Protocol (SLRP) of DNA sequence alignment using machine learning techniques. This proposal sets out a documented plan of specific methods for conducting the Systematic Literature Review (SLR). The expected outcomes of this review are to identify DNA sequence alignment approaches that use machine learning techniques, investigate the issues, classify the issues, define the major characteristics, and discuss the general characteristics. The expected benefit of this study is a systematic state of the art of machine learning techniques in bioinformatics that will help new researchers to perform DNA sequence alignment using machine learning techniques.
Keywords: DNA sequence alignment; systematic literature review; machine learning;
bioinformatics

INTRODUCTION
Bioinformatics is a multidisciplinary field in constant development due to technological advances in related sciences (for example computer science, biology, mathematics, chemistry, and medicine) (Pevsner 2009). Genomic research is the most representative area in bioinformatics, as it is the initial step of several types of experiments and is also required in several other bioinformatics fields. It compares genomic features such as DNA sequences, genes, regulatory sequences, or other genomic structural components of different organisms.

https://ijciss.org/ | Ali et al., 2024 | Page 1186



In general, comparative genomics begins with the alignment of orthologous genomic sequences (i.e., sequences that share a common ancestry) to check the degree of similarity (conservation) among sequences (or genomes) (Miller, Makova et al. 2004).

Because of the growing number of experiments (which are also becoming increasingly complex) involving genomic research, as well as advances in DNA sequencing, the amount and complexity of biological data is increasing. This directly affects the computational performance of bioinformatics experiments (Koboldt, Steinberg et al. 2013). To address this fundamental problem we use DNA sequence alignment.

DNA is a molecule that carries genetic information. It consists of the genetic instructions for building, running and maintaining a living organism and for passing life on to the next generation. DNA has a double helix structure in which two individual DNA strands twist around one another in a spiral. A DNA strand consists of four nucleotide bases, Adenine, Cytosine, Guanine and Thymine, abbreviated as A, C, G, and T. In molecular biology, DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule, which is made by the repetition of the four nucleotides: adenine, thymine, guanine and cytosine. The human genome is made up of 3 billion of these genetic letters. There are at least 26 billion base pairs (bp) (Cohen 2004).

In bioinformatics, a Sequence Alignment (SA) is a way of arranging sequences of proteins, RNA and DNA in order to identify regions of similarity, which may give additional information on the functional, structural, evolutionary and other relationships between the sequences. Aligned sequences of amino acid or nucleotide residues are typically represented as rows within a matrix, one on top of the other. For instance, the two sequences ATATAGAGGACACG and ATAGGGGACATGG have one possible alignment in which vertical lines indicate the matches. The closely matching regions in the aligned sequences are called similar regions. These similar regions are the regions mostly conserved from past generations. In certain regions, special characters such as '-', also called indels or gaps, are inserted. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. The insertion of such a special symbol represents a mutation, or can be seen as a deletion from the other sequence's perspective.

DNA sequence alignment has two main types. The first, Multiple Sequence Alignment (MSA), is the predominant practice for inferring biological facts from a set of sequences. It involves the alignment of multiple sequences. An MSA can be viewed as a 2-D table. In this table the sequences are the rows, and equivalent positions of the DNA sequences are arranged in columns by placing gap characters at appropriate positions, so that the biological relationship of the sequences is best described (Simossis, Kleinjung et al. 2003).

Pairwise Sequence Alignment (PSA) is an alignment between any two given sequences. Pairwise sequence alignment can be further classified into local and global sequence alignment. Local sequence alignment finds the best approximate subsequence match within the two given sequences. Local sequence alignments are designed primarily to search for highly similar regions within the two given sequences. Global sequence alignment, on the other hand, is designed to find the best alignment of the two sequences in their entirety. Thus, global sequence alignment searches for a global mapping between entire sequences (Haque, Aravind et al. 2009).

Because of the huge amounts of biological data and the very large number of possible combinations and permutations of various biological sequences, the traditional human knowledge-based techniques cannot work effectively and efficiently. Artificial intelligence techniques such as machine learning can therefore play a critical role in complex biomedical applications (Pan, Wang et al. 2014). Machine learning (ML) is a subfield of artificial intelligence concerned with the development of algorithms and techniques that allow computers to learn. In recent years, the amount of biological data requiring analysis has exploded, and many machine learning methods have been developed to deal with this explosion of data. Consequently, machine learning in bioinformatics has become an important research area for both computer scientists and biologists (Lacey and Xie 2014). Important machine learning methods include support vector machines, kernel machines, feature selection, neural networks, evolutionary algorithms, statistical learning, fuzzy logic, supervised learning, clustering, ensemble learning, Bayesian networks, linear regression, principal component analysis, hidden Markov models, entropy-based information methods, and many others (Pan, Wang et al. 2014). This study presents an SLR that attempts to identify the machine learning techniques used in DNA sequence alignment, following the SLR procedure. SLR is a well-established research method used to integrate the best available empirical evidence from systematic research (Kitchenham 2004).

In contrast to the usual ad hoc literature review process, SLR provides reliable approaches and established procedures to aggregate, evaluate and interpret the best available individual studies to address specific research questions (Kitchenham 2004). It also allows reviewers to determine the real effects and phenomena in areas where small, individual studies are not easily controlled or replicated (Brereton, Kitchenham et al. 2007).

I. Background

In this section, background and related works on the machine learning techniques used for DNA sequence alignment are discussed. Computational biology has gained significant popularity over the last couple of decades. The enormous volume of biological data stored as DNA, RNA and protein sequences requires extensive processing capacity to retrieve and analyze sequences quickly and accurately. Aligning the sequences is an important and fundamental step in solving problems such as predicting the secondary and tertiary structure of a protein, predicting the ancestral sequence or identifying the common genes in two organisms. However, the complexity due to the sheer number of possible combinations and searches makes sequence alignment a compute-intensive problem. This complexity increases exponentially with the size of the sequences. Both hardware and software improvements are generally considered as possible directions to improve the speed and accuracy (Haque, Aravind et al. 2009).

With new biological sequences being discovered almost every day, the biological sequence database is growing exponentially. This explosion of data demands new algorithms which are fast yet efficient. The challenge of balancing both speed and efficiency was quickly recognized by many researchers over the past few decades. They used machine learning techniques to solve such large problems. Machine learning has played a significant role in establishing bioinformatics as a field in its own right over the last 30 years. Many methods, ranging from randomized decision trees to neural networks, support vector machines and hidden Markov models, have been applied successfully to solve problems in novel gene finding, evolutionary analysis, drug and agricultural research and protein classification (Lacey and Xie 2014). Our focus is the machine learning techniques developed for DNA sequence alignment.

A. Related Works

Numerous research contributions on machine learning methods for DNA sequence alignment exist in the area of bioinformatics, across various conferences and journals. To the best of our knowledge, there is no structured work or SLRP published for machine learning techniques in DNA sequence alignment.
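To make the global pairwise alignment described in the Introduction concrete, the following is a minimal sketch of the classical Needleman-Wunsch dynamic programming method, applied to the two example sequences from the text. It is a standard non-learning baseline and not a method proposed by the reviewed studies. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b), where '-' marks a gap.
    The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ATATAGAGGACACG", "ATAGGGGACATGG")
    print(s)
    print(x)
    print(y)
```

The table has (n+1) x (m+1) cells, so two sequences of lengths n and m cost on the order of n*m operations. Extending the same exact approach to many sequences at once is what becomes intractable, which is one reason the heuristic and learning-based methods surveyed here are of interest.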


To perform systematic reviews it is important to follow a pre-established and well-defined protocol. Following well-defined systematic steps helps ensure the reproducibility of the study.

The Cochrane Collaboration, an international organization that produces systematic literature reviews of interventions in health care, proposes a handbook to help conduct systematic reviews (Pentheroudakis, Greco et al. 2009). They recommend that the first step of a systematic review is to establish a protocol that clearly defines the objectives of the review, the criteria for inclusion and exclusion in the systematic review, the methods used to identify the studies, and the analysis strategy for the collected studies. The main result of a Cochrane Collaboration SLR is a list of the best quality scientific studies for a particular topic (Kanewala and Bieman 2014), (van Karnebeek and Stockler 2012). (van Karnebeek and Stockler 2012) proposed that the impact factor, number of citations, and year of publication are important for selecting and ranking relevant scientific papers covering the reviewed topic.

(Goel, Singh et al. 2013) present a theoretical review of soft computing techniques for gene prediction. The problem of gene prediction, along with the issues involved in it, is first described (Goel, Singh et al. 2013). A brief description of soft computing techniques and their application to gene prediction, a list of various soft computing techniques for gene prediction, and finally some limitations of the current research and future research directions are presented (Polato, Ré et al. 2014).

Nikolayevskyy et al (2014) proposed a method for performing systematic reviews aimed at software engineering researchers, and provide detailed guidelines for new areas, for example bioinformatics (Nikolayevskyy, Kranzer et al. 2016).

(Chowdhury and Garai 2017) present a review of multiple sequence alignment from the perspective of genetic algorithms. The protein alignment problem has been studied for a long time; unfortunately, every available method produces different alignment results for a single alignment problem. Multiple sequence alignment is known to be a highly computationally complex problem. Many stochastic methods have therefore been considered for improving the accuracy of alignment. Among them, many researchers frequently use the Genetic Algorithm. The study reviews several kinds of methods applied in alignment and the recent trends in multi-objective genetic algorithms for solving multiple sequence alignment. Many recent studies have demonstrated considerable progress in alignment accuracy.

The article by (Ezziane 2005) aims to examine the use of artificial intelligence in the areas of bioinformatics and DNA sequencing. It also describes the kinds of software programs that serve the needs of researchers and help interpret the huge amounts of data that are continually being gathered in genomic research (Ezziane 2005).

Babasaheb S. Satpute and Dr. Raghav Yadav (2007) state that, because of the enormous growth in the amount of biological data, we need more advanced and technically effective technology and tools for the analysis of those data. Their article presents a survey of Machine Learning Techniques for Bioinformatics and Computational Biology and some of the well-known supervised classification and clustering algorithms (Khattree and Naik 2007).

The literature discussed in this section works on the automation of clinical procedures, particularly treatment planning, and also helps the dentist in detecting dental disease. It particularly enables the dentist to diagnose the nature of the disorder and select the appropriate course of treatment.


B. Existing Systematic Reviews of DNA Sequence Alignment
1. Review on multiple sequence alignment from the perspective of genetic algorithm (Chowdhury and Garai 2017)
2. Machine learning technology in the application of genome analysis: A systematic review (Wu and Zhao 2019).
3. Applications of artificial intelligence in bioinformatics: A review (Ezziane 2005)

Few systematic reviews have been published on DNA sequence alignment. The first review, published by Elsevier, covers the multi-objective genetic algorithm for solving multiple sequence alignment in DNA sequence alignment (Chowdhury and Garai 2017). The second review was presented in ScienceDirect in 2019 by Jie Wu and Yiqiang Zhao, and the third review was presented in ScienceDirect in 2006 by (Ezziane 2005).

The first review, by Biswanath Chowdhury and Gautam Garai (Chowdhury and Garai 2017), focuses on multiple sequence alignment from the perspective of genetic algorithms in DNA sequence alignment. Many parts are missing from this study, for example data extraction, data synthesis and the study selection procedure. The study also has several gaps and limitations: only one machine learning algorithm is discussed, and there is no data extraction and no data synthesis.

The second systematic review, by Jie Wu and Yiqiang Zhao, aims to identify the use of machine learning in genomic analysis. This study also has several gaps and limitations: it only discusses machine learning techniques, with no discussion of search engines, no data extraction and no data synthesis.

The third article, by (Ezziane 2005), aims to examine the use of artificial intelligence in the areas of bioinformatics and DNA sequencing. It also describes the kinds of software programs that serve the needs of researchers and help interpret the huge amounts of data that are continually being gathered in genomic research.

The results of these systematic reviews are not related to our study. Since no systematic review of DNA sequence alignment using machine learning techniques has previously been published, we develop a systematic review in the computer science field to fill the gap in the current study area of machine learning techniques in DNA sequence alignment for further research.

Research Method
A systematic review is a research approach developed to gather, evaluate, and interpret all the available evidence. This research methodology is concerned with a specific research question or area of interest and with identifying gaps in current research. This study follows the systematic review guideline proposed by Kitchenham (Smith, Turner et al. 2016). This guideline, which is derived from those of medical researchers, has been adapted to computer science research problems.

A. Systematic literature review protocol
A systematic literature review protocol specifies the methods to be used, as a documented plan completed before beginning the systematic review. A good quality protocol supports an effective systematic review and is usually updated over time to include more recent publications (Abdelmaboud, Jawawi et al. 2015).

1. Research Questions
Defining the research questions for the systematic review is an important step. This review addresses the following research questions:

Table 1: Research Questions
RQ# | Research Question
RQ1 | Which Machine Learning techniques are used for Sequence Alignment and what are the frequently used techniques?
RQ2 | Which datasets are used in Machine Learning Sequence Alignment papers and what are the frequently used datasets?
RQ3 | What are the assembled techniques used in Machine Learning in DNA Sequence Alignment?
RQ4 | What are the evaluation parameters used in Machine Learning Sequence Alignment papers?
RQ5 | Which Machine Learning techniques are used for Multiple Sequence Alignment?
RQ6 | Which Machine Learning techniques are used for Pairwise Sequence Alignment?

2. Research Objectives
Several objectives are defined in this systematic review protocol. The benefit of these objectives is to give researchers the state of the art of machine learning techniques in DNA sequence alignment. The research objectives are:

Table 2: Research Objectives
ID | Objective
1 | To identify Machine Learning techniques in DNA Sequence Alignment and the frequently used techniques.
2 | To identify datasets in Machine Learning techniques in DNA Sequence Alignment and the frequently used datasets.
3 | To identify assembled techniques in DNA Sequence Alignment.
4 | To identify evaluation parameters in Machine Learning techniques for DNA Sequence Alignment.
5 | To classify ML techniques used for Multiple Sequence Alignment.
6 | To classify ML techniques used for Pairwise Sequence Alignment.

3. Data Sources
Different electronic databases have been used as primary sources for computer science research publications. All of the databases yielded different results except Google Scholar, which returns results identical to those of the other databases (Table 3); nevertheless, we used the Google Scholar search database to provide high quality search results. Sometimes papers are more easily imported from Google Scholar than from the other electronic databases in Table 3.

Table 3: Data Sources
Database Source | URL
ACM Digital Library | https://dl.acm.org/
Science Direct | https://www.sciencedirect.com/
Google Scholar | https://scholar.google.com
IEEE Xplore | https://ieeexplore.ieee.org/
Springer Digital Library | https://link.springer.com/

4. Search Strategy and Search Strings
The search was conducted from September to January 2017. It was decided to begin this systematic review protocol from the year 2018 because the serious research in DNA sequence alignment started during the year 2017. Regarding the search options of the sources, we ensured that the search covered journals, magazines, conferences, book sections and workshops. We tried many search strings, and the following returned the relevant articles:
(''DNA sequence alignment or machine learning technique'') AND ("multiple sequence alignment using machine learning techniques'') AND ''pairwise sequence alignment using machine learning techniques''. All databases, like Google Scholar, accept this string.

5. Study Selection
The primary studies are selected according to the inclusion and exclusion criteria shown in Table 4, applied while reading the title and abstract of the papers to ensure that the results relate to the research area under study. Sometimes titles and abstracts are not sufficiently comprehensive and are therefore not adequate. In such cases we read the entire paper to ensure that the information satisfies the inclusion and exclusion criteria.

The systematic literature review retrieved 101 papers from the electronic databases (Table 3), as shown in Figure 1. After reading titles and abstracts and applying the inclusion and exclusion criteria, the number of papers was reduced to 83, which were exported to the Endnote database. We removed 15 duplicated papers and read the full text of 50 papers; by considering the inclusion and exclusion criteria we discarded 3 papers, and the remaining 83 papers were selected as primary studies for the SLR.

Table 4: Study Selection Criteria
Inclusion Criteria | Exclusion Criteria
Papers in DNA Sequence Alignment related to computer science | Papers in DNA Sequence Alignment related to medical sciences
A scientific paper | The paper was not available in English Language.
Reviewed Paper (by electronic databases) | Published data not available
Papers in press | DNA sequencing classification or DNA sequence compression
Papers that describe DNA Multiple Sequence Alignment or Pairwise Sequence Alignment using Machine Learning techniques. | Papers that describe only DNA Sequence Alignment.

Figure 1: Selected Papers

6. Data Extraction
The search results for the studies are exported from the database sources (Table 3), with associated information (paper title, author, paper reference type, and so on), to the Endnote database. The advantage of using the Endnote database is that it records all information related to the selected studies and makes this information easy to maintain and manage. Table 5 shows the data items that were used in each study, including descriptions and the research questions related to our systematic review. The first research question, RQ1, refers to DNA sequence alignment using machine learning techniques and the frequently used machine learning techniques in DNA sequence alignment, which requires analysis of all data extraction items. RQ2 indicates the datasets and frequently used datasets, RQ3 indicates the assembled techniques, and RQ4 indicates the evaluation parameters in DNA sequence alignment using machine learning techniques. The fifth question, RQ5, is used to find out which machine learning techniques are used for multiple sequence alignment, and the sixth question, RQ6, is used to find out which machine learning techniques are used for pairwise sequence alignment. Table 5 shows the data items and descriptions with the relevant research questions.

Table 5: Data Extraction Form
Data item | Description | Research Questions
Title | Title name | RQ1
Authors | Study authors name | RQ1
Reference type | Journal paper, conference paper, etc. | RQ1
ML techniques | Describe the DNA Sequence Alignment using Machine Learning techniques and mostly used techniques. | RQ1
Datasets | Describe datasets and mostly used datasets in DNA Sequence Alignment using Machine Learning techniques. | RQ2
Assembled techniques, evaluation parameters | Describe assembled techniques and evaluation parameters in DNA Sequence Alignment using ML techniques. | RQ1, RQ3, RQ4
Techniques in Multiple and Pairwise Sequence Alignment | Describe machine learning techniques used for Multiple Sequence Alignment, Pairwise Sequence Alignment. | RQ1, RQ5, RQ6
Findings | Main conclusion | RQ1
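
As an illustration of how the data extraction form could be held in code, the sketch below models one extracted study as a record and tallies how often each machine learning technique appears, which is the kind of count RQ1 asks for. The class name, field names, helper function and the placeholder records are all hypothetical; they are not taken from the reviewed studies or from the Endnote database used in this review.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExtractionRecord:
    """One row of the data extraction form (field names are illustrative)."""
    title: str
    authors: str
    reference_type: str                 # e.g. "journal paper", "conference paper"
    ml_techniques: List[str] = field(default_factory=list)
    datasets: List[str] = field(default_factory=list)
    assembled_techniques: List[str] = field(default_factory=list)
    evaluation_parameters: List[str] = field(default_factory=list)
    alignment_type: str = ""            # "MSA", "PSA", or "" if not stated
    findings: str = ""


def technique_frequency(records):
    """Count the number of studies that use each ML technique."""
    counts = Counter()
    for record in records:
        # set() so a technique listed twice in one study is counted once
        counts.update(set(record.ml_techniques))
    return counts


# Placeholder records, invented purely to show the shape of the data.
example_records = [
    ExtractionRecord("Study A", "Author A", "journal paper",
                     ml_techniques=["genetic algorithm"], alignment_type="MSA"),
    ExtractionRecord("Study B", "Author B", "conference paper",
                     ml_techniques=["genetic algorithm", "fuzzy logic"],
                     alignment_type="MSA"),
    ExtractionRecord("Study C", "Author C", "conference paper",
                     ml_techniques=["ant colony"], alignment_type="PSA"),
]

if __name__ == "__main__":
    for technique, n in technique_frequency(example_records).most_common():
        print(technique, n)
```

Filtering the same records on `alignment_type` before counting would give the per-category tallies needed for RQ5 and RQ6.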
7. Data Synthesis
The data synthesis of this study aims to answer the main questions proposed at the start of the study. The primary question is to identify machine learning techniques in DNA sequence alignment and their contribution type in bioinformatics as published in each year. The sub-questions are used to identify the machine learning techniques in DNA sequence alignment in bioinformatics. The huge volume of biological data stored as DNA, RNA and protein sequences requires extensive computing capacity to retrieve and analyze sequences quickly and accurately. Aligning the sequences is an important and fundamental step in solving problems such as predicting the ancestral sequence or identifying the common genes in two organisms. Researchers have used machine learning techniques to solve such large problems. Machine learning has played a significant role in establishing bioinformatics over the last 30 years. Many methods, ranging from randomized decision trees to neural networks, support vector machines and hidden Markov models, have been applied successfully to solve problems in novel gene finding, evolutionary analysis, drug and agricultural research and protein classification.

101 papers were included in the SLR. These papers cover several machine learning techniques, including genetic algorithm, ant bee colony, artificial colony optimization, particle swarm optimization, artificial neural network, fuzzy logic, etc. We group the papers based on the type of DNA sequence alignment using machine learning techniques. Five data sources are used to retrieve the relevant papers on DNA sequence alignment using machine learning techniques.

8. Threats to validity
The main threats to the validity of our review protocol are analyzed from the following perspectives: exclusion of relevant articles, publication bias, data extraction bias, and study selection bias.

Exclusion of relevant articles: One of the major issues we encountered in this review was finding the relevant papers that addressed the research questions. To achieve this goal, we conducted a search on the databases listed in Table 3, using our search string on their search engines. However, we recognized the possibility that some relevant studies would not be returned by the search strings we used. To reduce this risk, we manually checked the reference list of each of the relevant studies to look for any relevant studies that were missed in the automated search.


Data extraction bias: Along with finding and selecting all the relevant studies, data extraction was the most critical task in this study. To extract data accurately from these studies, each paper was read independently and the data presented in Table 5, which are required to answer the research questions posed, were collected.

Publication bias: Only studies of DNA sequence alignment using machine learning techniques are taken into account; the authors of such studies may have some bias towards DNA sequence alignment using machine learning techniques. Therefore, there is likely a risk of misestimating the performance of DNA sequence alignment using machine learning methods.

Review results
This section discusses the results associated with the Research Questions (Table 1). These questions were aimed at analysing studies of DNA sequence alignment using machine learning techniques from six perspectives: machine learning techniques used for DNA sequence alignment, datasets and evaluation parameters used for DNA sequence alignment using machine learning techniques, assembled techniques, and machine learning techniques for multiple and pairwise sequence alignment. We discuss and interpret the results related to each of these questions in the subsections below.
Table 6: Selected studies
S.No | Year | Authors | Title | Publisher
1 | 1992 | (Eppstein, Galil et al. 1992) | Sparse Dynamic Programming Linear Cost Functions | ACM
2 | 2013 | (Fukunishi, Finch et al. 2013) | A Bayesian Alignment Approach to Transliteration Mining | ACM
3 | 2014 | (Mohanty and Tragoudas 2014) | Scalable Offline Searches in DNA Sequences | ACM
4 | 2016 | Huazheng Zhu, Zhongshi He & Yuanyuan Jia | A Novel Approach to MSA Using Multi-objective EA Based on Decomposition | IEEE
5 | 2000 | (Zhang, Schwartz et al. 2000) | A Greedy Algorithm for Aligning DNA Sequences | GS
6 | 2004 | (Kumar, Tamura et al. 2004) | MEGA3 Integrated software for Molecular Evolutionary Genetics Analysis and SA | GS
7 | 2006 | (Torres and Nieto 2006) | Fuzzy Logic in Medicine and Bioinformatics | GS
8 | 2007 | (Gondro and Kinghorn 2007) | A simple GA for MSA | GS
9 | 2010 | (Wang, Huang et al. 2010) | BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features | GS
10 | 2011 | (Verma, Singh et al. 2011) | DNA Sequence Assembly using PSO | GS
11 | 2012 | (Gupta, Agarwal et al. 2012) | Genetic Algorithm Based Approach for Obtaining Alignment of Multiple Sequences | GS
12 | 2014 | (Huang, Chen et al. 2015) | A memetic PSO algorithm for solving the DNA fragment assembly problem | GS
13 | 2015 | (Kumar 2015) | AN ENHNCED ALGO FR MSA OF PROTEIN SEQ USNG GA | GS
14 | 2016 | (Karaboga and Aslan 2016) | A discrete ABC algorithm for detecting transcription factor binding sites in DNA sequences | GS
15 | 2016 | (Othman 2016) | Survey of the use of genetic algorithm for multiple sequence alignment | GS
16 | 2018 | (Karaboga and Aslan 2019) | discovery of conserved regions in DNA sequences by (ABC) algorithm based methods | GS
17 | 2018 | (Leslie, Eskin et al. 2004) | mismatch-string-kernels-for-svm-protein-classification | GS
18 | 1997 | (Zhang and Wong 1997) | Toward Efficient MSA A System of Genetic and Dynamic Programming | IEEE
19 | 2001 | (Chang and Halgamuge 2001) | fuzzy-sequence-pattern-matching-in-zinc-finger-domain-proteins | IEEE
20 | 2002 | (Ando and Iba 2002) | ant-algorithm-for-construction-of-evolutionary-tree | IEEE
21 | 2002 | (Nguyen, Yoshihara et al. 2002) | a-parallel-hybrid-genetic-algorithm-for-multiple-protein-sequence alignment | IEEE
22 | 2003 | (Meksangsouy and Chaiyaratana 2003) | dna-fragment-assembly-using-an-ant-colony-system-algorithm | IEEE
23 | 2007 | (Nasser, Vert et al. 2007) | Multiple Sequence Alignment using Fuzzy LOGIC | IEEE
24 | 2008 | (Ramaswamy and Purdy 2008) | An Extended Library of Hardware Modules for GA with Applications to DNA Sequence Matching | IEEE
25 | 2008 | (Zhao, Ma et al. 2008) | An Improved Ant Colony Algorithm for DNA Sequence Alignment | IEEE
26 | 2008 | (Al Junid, Abd Majid et al. 2008) | HIGH SPEED DNA SEQUENCING ACCELERATOR USING FPGA | IEEE
27 | 2008 | (Lei and Ruan 2008) | Particle Swarm Optimization Algorithm for Finding DNA Sequence Motifs | IEEE
28 | 2009 | (Bir, Dongardive et al. 2009) | Building Consensus of Human Papillomavirus using Genetic Algorithm | IEEE
29 | 2009 | (Mohamed, Othman et al. 2009) | Classification of “Gracilaria changii” Protein Sequences Using Back-Propagation Classifier | IEEE
30 | 2009 | (Arribas-Gil, Metzler et al. 2008) | Statistical Alignment with a Sequence Evolution Model Allowing Rate Heterogeneity along the Sequence | IEEE
31 | 2010 | (Mahmud, Hosen et al. 2010) | A Novel Two-Tier Multiple Sequence Alignment algorithms | IEEE
32 20 (Lei, Sun et al. 2010) Artificial Bee Colony Algorithm for Solving MSA IEEE
10
33 20 Ankit Agrawal and Pairwise Statistical Significance of LSA Using Sequence- IEEE
11 Xiaoqiu Huang Specific POSITION
34 20 (Yu 2011) Solving Sequence Alignment Based on Chaos Particle Swarm IEEE

https://ijciss.org/ | Ali et al., 2024 | Page 1195


International Journal of Contemporary Issues in Social Sciences
[

Volume 3, Issue 3, 2024 ISSN(P):2959-3808 | 2959-2461

11 Optimization Algorithm
35 | 2012 | (Nagar and Hahsler 2012) | A Novel Quasi-Alignment-Based Method for Discovering Conserved Regions in Genetic Sequences | IEEE
36 | 2012 | (Verma 2012) | DSAPSO DNA Sequence Assembly using Continuous PSA with Smallest Position Value Rule | IEEE
37 | 2012 | (Al Junid, Reffin et al. 2012) | Implementation of GA for DNA sequence alignment | IEEE
38 | 2012 | (Othman and Abdel-Azim 2012) | MSA Based on GA with new Chromosomes representation | IEEE
39 | 2012 | (Halgaswaththa, Atukorale et al. 2012) | Neural Network Based Phylogenetic Analysis | IEEE
40 | 2013 | (Borovska, Gancheva et al. 2013) | Massively Parallel Algorithm for MSA | IEEE
41 | 2016 | (Zheng, Li et al. 2016) | A Modified Multiple Alignment Fast Fourier Transform with Higher Efficiency | IEEE
42 | 2016 | Huazheng Zhu, et al. | A Novel Approach to MSA Using Multi-objective EA Based on Decomposition | IEEE
43 | 2017 | (Rani and Ramyachitra 2017) | Application of Genetic Algorithm by Influencing the Crossover Parameters for MSA | IEEE
44 | 2017 | (Kaur and Sohi 2017) | Pairwise Sequence Alignment Method Using Flower Pollination Algorithm | IEEE
45 | 2017 | (Siswanto, Hendric et al. 2017) | The Genomic Plant Warehouse Framework SLR | IEEE
46 | 2002 | (Rajapakse and Faleel 2002) | genetic-approach-to-biosequence-alignment-gaba | IEEE
47 | 2005 | (Ressom, Natarajan et al. 2005) | Applications of fuzzy logic in genomics | SD
48 | 2007 | (Jangam and Chakraborti 2007) | A novel method for alignment of 2 nucleic acid sequences using ACO and GA | SD
49 | 2007 | (Ho, Yu et al. 2007) | Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM–PSSM method | SD
50 | 2008 | (Kim, Kim et al. 2008) | A DNA sequence alignment algorithm using quality information and a fuzzy inference method | SD
51 | 2008 | (Blum, Vallès et al. 2008) | An ant colony optimization algorithm for DNA sequencing by hybridization | SD
52 | 2008 | (Lee, Su et al. 2008) | Genetic algorithm with ant colony optimization (GA-ACO) for multiple sequence alignment | SD
53 | 2008 | (Qian, Yang et al. 2008) | Particle swarm optimization for SNP haplotype reconstruction problem | SD
54 | 2010 | (Bi 2010) | Deterministic local alignment methods improved by a simple GA | SD
55 | 2012 | (Zou, Shan et al. 2012) | A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band | SD
56 | 2012 | (Li, Wang et al. 2012) | Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices | SD
57 | 2013 | (Hassanien, Al-Shammari et al. 2013) | Computational intelligence techniques in bioinformatics | SD
58 | 2014 | (Kaya, Sarhan et al. 2014) | Multiple Sequence Alignment with Affine Gap by Using Multi-Objective Genetic Algorithm | SD
59 | 2015 | (Garai and Chowdhury 2015) | A cascaded pairwise biomolecular sequence alignment | SD
60 | 2015 | (Lee, Yeu et al. 2015) | BulkAligner A novel sequence alignment algorithm based on graph theory and Trinity | SD
61 | 2015 | (Ji, Pu et al. 2015) | One-dimensional pairwise CNN for the global alignment of two DNA sequences | SD
62 | 2016 | (Rajasekhar, Lynn et al. 2017) | Computing with the Collective Intelligence of Honey Bees – A Survey | SD
63 | 2016 | (Rubio-Largo, Vega-Rodríguez et al. 2016) | Hybrid Multiobjective Artificial Bee Colony for MSA | SD
64 | 2016 | (Rani and Ramyachitra 2016) | Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm | SD
65 | 2016 | (Dakhli and Bellil 2016) | Wavelet Neural Networks for DNA Sequence Classification Using the Genetic Algorithms and the Least Trimmed Square | SD
66 | 2017 | (Chowdhury and Garai 2017) | A review on multiple sequence alignment from the perspective of genetic algorithm | SD
67 | 2017 | (Moustafa, Elhosseini et al. 2017) | Fragmented protein sequence alignment using two-layer particle swarm optimization (FTLPSO) | SD
68 | 2018 | (Amorim, Neves et al. 2018) | An approach for COFFEE objective function to global DNA MSA | SD
69 | 2018 | Mohamed (Issa, Hassanien et al. 2018) | ASCA-PSO Adaptive sine cosine optimization algorithm integrated with particle swarm for pairwise LSA | SD
70 | 2018 | (Surendar, Shaik et al. 2018) | Micro Sequence Identification of DNA Data Using Pattern Mining Techniques | SD
71 | 2018 | (Rubio-Largo, Vanneschi et al. 2018) | Swarm intelligence for optimizing the parameters of multiple sequence Aligners | SD
72 | 2019 | (Saw, Raj et al. 2019) | Alignment-free method for DNA sequence clustering using Fuzzy integral similarity | SD
73 | 2019 | (Wu and Zhao 2019) | Machine learning technology in the application of genome analysis A systematic review | SD
74 | 2004 | (Horng, Wu et al. 2005) | A genetic algorithm for multiple sequence alignment | SPRINGER
75 | 2009 | (Xu and Chen 2009) | A Method for Multiple Sequence Alignment based on PSO | SPRINGER
76 | 2010 | (Xu and Lei 2010) | Multiple Sequence Alignment Based on ABC_SA | SPRINGER
77 | 2012 | (Agarwal, Gupta et al. 2012) | A Genetic Algorithm for Alignment of Multiple DNA Sequences | SPRINGER
78 | 2018 | (Majid, Khan et al. 2019) | Application of Parallel Vector Space Model for Large-Scale DNA Sequence Analysis | SPRINGER
79 | 2018 | (Karaboga and Aslan 2019) | Discovery of conserved regions in DNA sequences by Artificial Bee Colony (ABC) algorithm based methods | SPRINGER
80 | 2019 | (Ishaq, Khan et al. 2019) | Current Trends and Ongoing Progress in the Computational Alignment of Biological Sequences | IEEE
81 | 2014 | (Lacey and Xie 2014) | Supervised Machine Learning Techniques in Bioinformatics: Protein Classification | ANN, SVM
82 | 2011 | (Wen, Li et al. 2012) | Systematic literature review of machine learning based software development effort estimation models | SD
83 | 2014 | (Idri, azzahra Amazal et al. 2015) | Analogy-based software development effort estimation: A systematic mapping and review | SD
84 | 2021 | (Amr Ezz El-Din Rashed et al. 2021) | Sequence Alignment using Machine Learning-Based Needleman-Wunsch Algorithm | IEEE

A. ML techniques used for DNA sequence alignment (RQ1)
We identified 16 types of ML techniques applied to DNA sequence alignment. They are listed as follows.
• Dynamic programming (DP)
• Artificial Neural Networks (ANN)
• Support Vector Machine (SVM)
• Genetic Algorithms (GA)
• Fuzzy logic (FL)
• Artificial bee colony (ABC)
• Ant colony optimization (ACO)
• Particle swarm optimization (PSO)
• Needleman-Wunsch Algorithm (N-WA)

Among the ML techniques listed above, GA, PSO, ACO, FL and ABC are the five most frequently used; together they were adopted by 86% of the selected studies, as illustrated in Fig. 2. Fig. 2 presents only the amount of research attention that each type of ML technique has received during the past 20 years; as a supplement, Fig. 3 is plotted to further present the distribution of research attention in each publication year. As shown in Fig. 3, on the one hand, a noticeable publication peak appears around the year 2008.


[Figure 2, pie chart: GA (23) 38%; PSO (9) 15%; ABC (6) 10%; ACO (6) 10%; FL (6) 10%; ANN (4) 7%; SVM (4) 7%; DP (2) 3%; N-WA 2%]

Figure 2: Distribution of the studies over type of ML technique.
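
The percentages in Fig. 2 can be re-derived from the study counts printed in the figure. The following is a minimal Python sketch of that arithmetic, not part of the review itself. The count for N-WA is not printed in the figure; the value 1 is an assumption based on the single Needleman-Wunsch study [84], and the names `counts` and `shares` are hypothetical.

```python
# Study counts per technique as printed in the Fig. 2 labels.
# ASSUMPTION: the N-WA count is not printed in the figure; 1 is assumed
# because only one Needleman-Wunsch study ([84]) is listed.
counts = {
    "GA": 23, "PSO": 9, "ABC": 6, "ACO": 6, "FL": 6,
    "ANN": 4, "SVM": 4, "DP": 2, "N-WA": 1,
}


def shares(counts):
    """Return each technique's share of all counted studies, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


technique_share = shares(counts)
# Print techniques from most to least used.
for name, pct in sorted(technique_share.items(), key=lambda kv: -kv[1]):
    print(f"{name:5s} {counts[name]:3d} {pct:5.1f}%")
```

Rounded to whole percentages, these shares agree with the labels in Fig. 2 under the stated assumption.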


[Figure 3, stacked bar chart: number of studies (0 to 9) per publication year from 1992 to 2021, stacked by technique (ABC, ACO, DP, ANN, FL, GA, PSO, SVM, N-WA)]

Figure 3: Distribution of the studies over publication year.


On the other hand, compared with other ML techniques, GA and PSO appear to have received the dominant research attention in many years. Note that a few studies involve more than one ML technique. The identified ML techniques were used for DNA sequence alignment mainly in two forms: alone or in combination. The combined form may be obtained by combining two or more ML techniques or by combining ML techniques with non-ML techniques. The ML techniques most often combined with other ML techniques are GA and fuzzy logic, respectively. According to the selected studies, GA, ACO and ABC were found to be used in combined form. The studies reporting the use of GA in combination with other ML techniques include, for instance, GA with ACO, GA with DP, GA with ANN, and GA with ABC. Our findings about the ML techniques used in the DNA sequence alignment domain are highly consistent with the findings of a few other relevant review works.

Table 7: SA Techniques and their relevant studies

Sr. | Technique | Studies
1 | Dynamic programming (DP) | [1], [31]
2 | Artificial Neural Networks (ANN) | [29], [39], [61]
3 | Support Vector Machine (SVM) | [2], [9], [17], [49]
4 | Genetic Algorithms (GA) | [6], [8], [11], [13], [15], [21], [24], [28], [37], [38], [43], [46], [54], [58], [59], [66], [68], [74], [77], [80]
5 | Fuzzy logic (FL) | [7], [19], [23], [47], [50], [72]
6 | Artificial bee colony (ABC) | [14], [16], [32], [40], [62], [63], [76], [79]
7 | Ant colony optimization (ACO) | [20], [22], [25], [51]
8 | Particle swarm optimization (PSO) | [10], [12], [27], [34], [53], [67], [71], [75]
9 | Hybrid | [18], [36], [48], [52], [64], [65], [69], [81], [82], [83]
10 | Needleman-Wunsch Algorithm | [84]
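
Two of the techniques listed here, dynamic programming and the Needleman-Wunsch algorithm, are closely related: Needleman-Wunsch is the classical dynamic-programming method for global pairwise alignment. The following is a minimal textbook sketch in Python, not the machine-learning-based variant of study [84]. The scoring values (match +1, mismatch -1, gap -1) and the function name `needleman_wunsch` are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative
    assumptions with a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning the prefixes a[:i] and b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


nw_score, nw_a, nw_b = needleman_wunsch("ACGT", "AGT")
print(nw_score)
print(nw_a)
print(nw_b)
```

The table has (n+1)(m+1) cells, so time and space both grow with the product of the sequence lengths, which is one reason the surveyed studies turn to heuristic search for longer or multiple sequences.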

B. Datasets used in machine learning sequence alignment papers (RQ2)
We examined the related papers and their sources. Overall, 68 different datasets (public and private) were identified as used for DNA sequence alignment with machine learning techniques within the 84 papers addressing the same research questions. They are listed as follows.
• BAliBASE
• Protein sequences
• TNFAIP2
• TRANSFAC
• Others

Among the datasets listed above, BAliBASE is the most frequently used; about 25% of the selected studies adopted BAliBASE, as illustrated in Fig. 4. To give more detailed information about the datasets, Fig. 5 is plotted to further present the distribution of research attention in each publication year.


[Figure 4, pie chart over the dataset categories BAliBASE, protein sequences, TNFAIP2, TRANSFAC and others; slice labels 65%, 25%, 4%, 3% and 3%]

Figure 4: Distribution of the studies over type of datasets

[Figure 5, bar chart: number of studies (0 to 8) per publication year from 1997 to 2021, by dataset category (BAliBASE, protein sequences, TNFAIP2, TRANSFAC, other)]

Figure 5: Utilized datasets per category

On the other hand, the Benchmark Alignment Database, BAliBASE (Thompson et al., 1999b; Bahr et al., 2001), appears to have received the dominant research attention in many years. The first BAliBASE (Thompson et al., 1999b) consists of a set of 142 reference alignments with more than 1000 sequences. Version 2 of BAliBASE (Bahr et al., 2001) improved some alignments from the first database and extended it to 167 reference alignments and more than 2100 sequences, including sequences with repeated regions, transmembrane sequences and circular permutations. BAliBASE 2 is divided into eight categories of reference sets: 1) equidistant sequences with various levels of conservation, 2) sequences with a highly divergent sequence, 3) groups with less than 25% identity, 4) sequences with N/C-terminal extensions, 5) internal insertions, 6) repeats, 7) circular permutations, and 8) transmembrane proteins (Bahr et al., 2001). Group 1 is subdivided by sequence size. From each group (and from the subgroups of group 1), sequences were randomly selected for the tests, giving a total of 32 as a representative sample of the whole database. This provides a basis for evaluation in which a target result is available for each set of sequences. This result is the product of hand-curation and does not represent the optimal result under the scoring schemes used here. This means that a true global optimization in the tests carried out need not, for the most part, reproduce the complete alignment of the BAliBASE set (A simple genetic algorithm for multiple sequence alignment).

Table 8: Dataset and Relevant Articles

Sr. No. | Dataset | Studies that used the dataset
1 | BAliBASE | [4], [8], [13], [21], [32], [41], [42], [43], [58], [63], [64], [66], [67], [68], [71], [75], [76]
2 | Protein sequences | [59], [69], [81]
3 | TNFAIP2 | [78]
4 | TRANSFAC | [14], [16], [79]
5 | Others | [2], [3], [5], [7], [9], [10], [11], [12], [17], [18], [19], [20], [22], [23], [24], [25], [26], [27], [28], [29], [31], [33], [35], [36], [39], [40], [44], [46], [49], [50], [51], [53], [54], [55], [56], [61], [65], [70], [72], [74], [77], [84]
C. Assembled techniques used in machine learning DNA sequence alignment
Different ML techniques were used in combination with other machine learning techniques to overcome some of the challenges related to DNA sequence alignment. Fig. 6 shows the assembled techniques used in DNA sequence alignment and shows that the genetic algorithm and ant colony optimization are the most commonly used practices (50%), in combination with ACO (25%), followed by SVM and PSO with 25% each.

[Figure 6, bar chart: number of studies (0 to 2) per technique combination (GA+DP, GA+ACO, SVM + position-specific scoring matrices, PSO + Shortest Position Value (SPV), SVM + Bayesian, GA+ABC, GA+WNN, sine cosine algorithm + PSO)]

Figure 6. Distribution of techniques used in combination with machine learning techniques

We investigated the use of GA in combination with other machine learning techniques in the selected studies of DNA sequence alignment. Solving a sequence alignment with a deterministic approach is a complex optimization task. The Genetic Algorithm (GA) is perhaps the most well-known and natural optimization technique. It is based on the mechanism of natural evolution and the Darwinian theory of natural selection, and it is used, in combination with other methods, to optimize complex, large and multidimensional problems, giving an optimal or near-optimal solution [5,10,39,40].
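
To make the selection, crossover and mutation loop behind a GA concrete, the following is a toy Python sketch that places gaps in the shorter of two sequences so that as many columns as possible match. It is an illustration only, not the method of any surveyed study. The encoding (a set of gap columns), truncation selection, the 0.3 mutation rate, the population size, and the names `pad`, `fitness` and `evolve` are all assumptions, and the sketch assumes the first sequence is at least as long as the second.

```python
import random


def pad(gaps, short_seq, length):
    """Insert '-' into short_seq at the column indices listed in gaps."""
    it = iter(short_seq)
    return "".join("-" if col in gaps else next(it) for col in range(length))


def fitness(gaps, long_seq, short_seq):
    """Number of columns where long_seq and the gapped short_seq agree."""
    padded = pad(gaps, short_seq, len(long_seq))
    return sum(x == y for x, y in zip(long_seq, padded))


def evolve(long_seq, short_seq, pop_size=30, generations=60, seed=0):
    """Toy GA: each individual is a set of k gap columns for short_seq."""
    rng = random.Random(seed)
    length, k = len(long_seq), len(long_seq) - len(short_seq)
    columns = range(length)
    pop = [frozenset(rng.sample(columns, k)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (truncation with elitism).
        pop.sort(key=lambda g: fitness(g, long_seq, short_seq), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Crossover: draw k gap columns from the union of both parents.
            child = set(rng.sample(sorted(p1 | p2), k))
            # Mutation: occasionally move one gap to another free column.
            if k and rng.random() < 0.3:
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([c for c in columns if c not in child]))
            children.append(frozenset(child))
        pop = parents + children
    best = max(pop, key=lambda g: fitness(g, long_seq, short_seq))
    return pad(best, short_seq, length), fitness(best, long_seq, short_seq)


aligned, matches = evolve("ACGTTGCA", "ACGGCA")
print(aligned, matches)
```

Because the fitter half of each generation is carried over unchanged, the best score never decreases, but nothing guarantees the global optimum, which matches the "optimal or near-optimal" wording above.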
Table 9: Assembled SA Techniques and their Relevant Studies

Sr. No. | Assembled techniques used in SA | Relevant Studies
1 | Bayesian with SVM | [2]
2 | PSO with Shortest Position Value (SPV) | [10]
3 | GA and DP | [18]
4 | ACO and GA | [48], [52]
5 | SVM + position-specific scoring matrices | [49]
6 | GA-ABC | [64]
7 | WNN + GA | [65]
8 | Sine cosine algorithm (SCA) + PSO | [69]

D. Evaluation parameters used in machine learning sequence alignment papers
We identify the most commonly used evaluation metrics in DNA sequence alignment using ML techniques. We found 8 different existing evaluation metrics. These metrics were categorized by detection efficiency and computational performance. A total of 74 papers discussed the evaluation metrics. Altogether, 7 metrics belong to this group:
• Accuracy
• Precision and recall
• Time
• Space
• Sensitivity and specificity
• Quality
• Speed

Clearly, no single metric is an adequate measure of detection efficiency. For example, accuracy on a skewed dataset can lead to biased results in the performance measure (Chawla, Bowyer et al. 2002) [N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.]. Most of the included papers evaluated their models using several evaluation metrics (see Fig. 7).
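
The point that accuracy alone can mislead on a skewed dataset can be seen with a small worked example. The Python sketch below uses the standard confusion-matrix definitions; the counts are hypothetical and do not come from any surveyed study, and the function name `detection_metrics` is an assumption.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics.

    Assumes non-degenerate counts (no zero denominators).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),       # recall is the same quantity as sensitivity
        "specificity": tn / (tn + fp),
    }


# HYPOTHETICAL skewed dataset: 10 positives and 990 negatives.
# The classifier finds only 1 of the 10 positives.
skewed = detection_metrics(tp=1, fp=1, tn=989, fn=9)
for name, value in skewed.items():
    print(f"{name:12s} {value:.3f}")
```

In this example the accuracy is 0.99 although only one positive in ten is found (recall 0.1), which is why reporting several metrics together gives a more reliable picture.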


[Figure 7, bar chart: number of studies (0 to 30) per evaluation metric (accuracy, precision and recall, quality, space, sensitivity and specificity, speed, time); bar labels 26, 26, 13, 3, 2, 2 and 2]

Figure 7. Detection efficiency metrics
As shown in Fig. 7, accuracy (26), time (26) and space (13) are the most commonly used metrics for measuring the detection effectiveness of DNA sequence alignment using machine learning techniques. To examine how the use of evaluation metrics evolved over time, we tracked the metrics used to measure DNA sequence alignment using machine learning techniques. We observed a lack of attention to certain metrics in the last decade, including, in ascending order, speed, quality, etc.

[Figure 8, chart: number of studies (0 to 14) per publication year from 1992 to 2021 for each evaluation parameter (accuracy, time, space, speed, precision and recall, quality, sensitivity and specificity)]

Figure 8. Popularity of efficiency metrics over time


Table 10: Performance Metric and Relevant Articles

Sr. No. | Performance Metric | Relevant Studies
1 | Accuracy | [5], [11], [16], [23], [27], [28], [30], [32], [41], [47], [48], [49], [50], [53], [54], [58], [63], [64], [66], [68], [71], [76], [78], [82], [83], [84]
2 | Precision and recall | [2], [60]
3 | Time | [1], [12], [20], [21], [22], [24], [25], [26], [35], [39], [40], [51], [55], [56], [61], [65], [74]
4 | Space | [7], [46], [59], [80]
5 | Both Time and Space | [3], [4], [15], [17], [18], [36], [37]
6 | Sensitivity and specificity | [9], [29]
7 | Quality | [42], [43]
8 | Speed | [8], [75]
9 | Size | [14], [44], [72], [79]
10 | Time and Accuracy | [69]
11 | Time and Speed | [70]
E. Machine learning techniques used for multiple sequence alignment
From the selected studies, we identified various types of ML techniques that had been applied to multiple DNA sequence alignment. They are listed as follows.
• Dynamic programming (DP)
• MOMSA
• Fast Fourier transform (FFT)
• Genetic Algorithms (GA)
• Fuzzy logic (FL)
• Artificial Bee Colony (ABC)
• Ant Colony Optimization (ACO)
• Particle Swarm Optimization (PSO)
• Needleman-Wunsch Algorithm

Among the ML techniques listed above, GA is the most frequently used; it was adopted by 46% of the selected studies, as illustrated in Fig. 9. Fig. 10 is plotted to further present the distribution of research attention in each publication year. As shown in Fig. 10, on the one hand, a noticeable publication peak appears around the year 2016.

[Figure 9, pie chart over DP, GA-ABC, ABC, FFT, FL, GA, GA+ACO, GA+DP, MOMSA, PSO and N-WA; slice labels 45%, 14%, three of 7% and six of 3%]

Figure 9: Distribution of the studies on multiple SA using ML techniques

On the other hand, compared with other ML techniques, GA and ABC appear to have received the dominant research attention in many years. Note that a few studies involve more than one ML technique. The identified ML techniques were used for multiple DNA sequence alignment usually in two forms: alone or in combination. The combined form may be obtained by combining two or more ML techniques or by combining ML techniques with other ML techniques. The ML techniques most often combined with other ML techniques are GA and DP, respectively. According to the selected studies, GA, ACO and ABC were found to be used in combined form. The studies reporting the use of GA in combination with other ML techniques include, for instance, GA with ACO, GA with DP and GA with ABC.

[Figure 10, chart: number of studies per publication year from 1997 to 2021, by technique (DP, GA-ABC, ABC, FFT, fuzzy logic, GA, GA+ACO, GA+DP, MOMSA, PSO, N-WA)]

Figure 10: Distribution of the studies over publication year

[Figure 11, pie chart: ACO+GA, GA, PSSM, CNN, flower pollination algorithm and PSO, 17% each]

F. Machine learning techniques used for pairwise sequence alignment
From the selected studies, we identified various types of ML techniques that had been applied to pairwise DNA sequence alignment. They are listed as follows.
• Cellular Neural Network (CNN)
• Genetic Algorithms (GA)
• Flower Pollination Algorithm
• Artificial Bee Colony (ABC)
• Ant Colony Optimization (ACO)
• Particle Swarm Optimization (PSO)

Among the ML techniques listed above, GA is the most frequently used, also in combination with other machine learning techniques (GA with ACO); about 36% of the selected studies adopted these techniques, as illustrated in Fig. 11. Fig. 12 is plotted to further present the distribution of research attention in each publication year. As shown in Fig. 12, an equally high publication peak appears in every year.

Figure 11: Distribution of the studies Pairwise SA using ML techniques


The identified ML techniques were used for pairwise DNA sequence alignment usually in two forms: alone or in combination. The combined form may be obtained by combining two or more ML techniques or by combining ML techniques with other ML techniques. According to the selected studies, GA and ACO were found to be used in combined form.
[Figure 12, bar chart: studies by technique (ACO+GA, GA, PSSM, CNN, flower pollination algorithm, PSO) and publication year (2007, 2010, 2011, 2015, 2017, 2018); y-axis from 0 to 1.2]

Figure 12: Distribution of the studies over publication year.

Implications for research and practice:
This review has found empirical studies on the use of SVM, ACO, GA, DP, PSO and ABC techniques. Researchers are therefore encouraged to conduct more empirical studies on these ML techniques to further strengthen the empirical evidence about their performance. In addition, researchers are also encouraged to explore the possibility of using the ML techniques to estimate software development effort. In order to find ML techniques and use them more efficiently, researchers would be advised to keep track of related disciplines such as machine learning, data mining, statistics and artificial intelligence, since these disciplines may provide valuable ideas and methods for addressing DNA sequence alignment problems. This review has also found that ML models are usually more accurate and that the ML model performs significantly and consistently better than the existing model.

III. Conclusion
The exponential growth of the amount of biological data raises two problems: efficient data storage and management, and the extraction of useful information from these data. To address these problems, ML techniques are used for DNA sequence alignment. DNA sequence alignment using ML techniques in bioinformatics is becoming a critical and important issue for both academic research and the clinical field.
To the best of our knowledge, no systematic literature review has previously been published in the field of DNA sequence alignment using ML techniques in the bioinformatics research area. Hence, this SLR provides useful information on ML techniques in general and on DNA sequence alignment in bioinformatics in particular in the field of SE.


In this paper, we discuss an SLR of DNA sequence alignment using machine learning techniques in bioinformatics, giving an organized plan for achieving a high-quality SLR and for conducting a successful systematic literature review before starting it. The expected results of this review will identify the ML techniques used for DNA sequence alignment in bioinformatics. The expected benefits of this review will give researchers the state of the art of machine learning techniques used for DNA sequence alignment in a systematic manner.

References
[1] S. N. Atluri and S. Shen, "Global weak forms, weighted residuals, finite elements, boundary elements & local weak forms," in The Meshless Local Petrov-Galerkin (MLPG) Method, 1st ed., vol. 1. Henderson, NV, USA: Tech Science Press, 2004, pp. 15-64.
[2] S. N. Atluri, "The Meshless Method (MLPG) for Domain & BIE Discretization". Henderson, NV, USA: Tech Science Press, 2004. [Online]. Available: https://www.techscience.com/books/mlpg_atluri.html
[3] A. M. Farhan, "Effect of rotation on the propagation of waves in hollow poroelastic circular cylinder with magnetic field," Computers, Materials & Continua, vol. 53, no. 2, pp. 129-156, 2017.
[4] X. Chen and J. H. Jiang, "A method of virtual machine placement for fault-tolerant cloud applications," Intelligent Automation & Soft Computing, vol. 22, no. 4, pp. 587-597, 2016.
[5] X. F. Li, Y. B. Zhuang and S. X. Yang, "Cloud computing for big data processing," Intelligent Automation & Soft Computing, vol. 23, no. 4, pp. 545-546, 2017.
[6] L. Ali, R. Sidek, I. Aris and M. A. M. Ali, "Design of a testchip for low cost IC testing," Intelligent Automation & Soft Computing, vol. 15, no. 1, pp. 63-72, 2009.
[7] J. Cheng, R. M. Xu, X. Y. Tang, V. S. Sheng and C. T. Cai, "An abnormal network flow feature sequence prediction approach for DDoS attacks detection in big data environment," Computers, Materials & Continua, vol. 55, no. 1, pp. 95-119, 2018.
[8] W. J. Yang, P. P. Dong, W. S. Tang, X. P. Lou, H. J. Zhou et al., "A MPTCP scheduler for web transfer," Computers, Materials & Continua, vol. 57, no. 2, pp. 205-222, 2018.
[9] A. Abdelmaboud, D. N. Jawawi, I. Ghani, A. Elsafi and B. Kitchenham, "Quality of service approaches in cloud computing: A systematic mapping study," Journal of systems and software, vol. 101, pp. 159-179, 2015.
[10] D. Ebehard and E. Voges, "Digital single sideband detection for interferometric sensors," presented at the 2nd Int. Conf. Optical Fiber Sensors, Stuttgart, Germany, Jan. 2-5, 1984.
[11] P. Agarwal, R. Gupta, T. Maheswari, P. Agarwal, S. Yadav et al., "A Genetic Algorithm for Alignment of Multiple DNA Sequences," presented at the Int. Conf. Advances in Communication, Network, and Computing, Springer, 2012.
[12] S.A.M. Al Junid, Z. Abd Majid and A.K. Halim, "High speed DNA sequencing accelerator using FPGA," presented in Int. Conf. on Electronic Design, IEEE, 2008.
[13] S.A.M. Al Junid, M.S. Reffin, Z. Abd Majid, N.M. Tahir and M.A. Haron, "Implementation of genetic algorithm for optimizing DNA sequence alignment," presented in IEEE Business, Engineering & Industrial Applications Colloquium (BEIAC), IEEE, 2012.
[14] A.R. Amorim, L.A. Neves, C.R. Valêncio, G.F. Roberto and G.F.D. Zafalon, "An approach for COFFEE objective function to global DNA multiple sequence alignment," Computational biology and chemistry, vol. 75, pp. 39-44, 2018.
[15] S. Ando and H. Iba, "Ant algorithm for construction of evolutionary tree," Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No. 02TH8600), IEEE, 2002.
[16] A. Arribas-Gil, D. Metzler and J.L. Plouhinec, "Statistical alignment with a sequence evolution model allowing rate heterogeneity along the sequence," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 281-295, 2008.
[17] C. Bi, "Deterministic local alignment methods improved by a simple genetic algorithm," Neurocomputing, vol. 73, no. 13-15, pp. 2394-2406, 2010.


[18] A. Bir, J. Dongardive, S. Jamkhedkar and S. Abraham, "Building Consensus of Human Papillomavirus using Genetic Algorithm," 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC), IEEE, 2009.
[19] C. Blum, M. Y. Vallès and M.J. Blesa, "An ant colony optimization algorithm for DNA sequencing by hybridization," Computers & Operations Research, vol. 35, no. 11, pp. 3620-3635, 2008.
[20] P. Borovska, V. Gancheva and N. Landzhev, "Massively parallel algorithm for multiple biological sequences alignment," presented at 36th International Conference on Telecommunications and Signal Processing (TSP), IEEE, 2013.
[21] P. Brereton, B.A. Kitchenham, D. Budgen, M. Turner and M. Khalil, "Lessons from applying the systematic literature review process within the software engineering domain," Journal of systems and software, vol. 80, no. 4, pp. 571-583, 2007.
[22] B. C. Chang and S. K. Halgamuge, "Fuzzy sequence pattern matching in zinc finger domain proteins," Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569), IEEE, 2001.
[23] N.V. Chawla, K.W. Bowyer, L.O. Hall and W.P. Kegelmeyer, "SMOTE: synthetic minority over-sampling technique," Journal of artificial intelligence research, vol. 16, pp. 321-357, 2002.
[24] B. Chowdhury and G. Garai, "A review on multiple sequence alignment from the perspective of genetic algorithm," Genomics, vol. 109, no. 5-6, pp. 419-431, 2017.
[28] Z. Ezziane, "DNA computing: applications and challenges," Nanotechnology, vol. 17, no. 2, pp. R27, 2005.
[29] T. Fukunishi, A. Finch, S. Yamamoto and E. Sumita, "A Bayesian alignment approach to transliteration mining," ACM Transactions on Asian Language Information Processing (TALIP), vol. 12, no. 3, pp. 1-22, 2013.
[30] G. Garai and B. Chowdhury, "A cascaded pairwise biomolecular sequence alignment technique using evolutionary algorithm," Information Sciences, vol. 297, pp. 118-139, 2015.
[31] N. Goel, S. Singh and T. C. Aseri, "A review of soft computing techniques for gene prediction," International Scholarly Research Notices, vol. 2013, 2013.
[32] C. Gondro and B. P. Kinghorn, "A simple genetic algorithm for multiple sequence alignment," Genetics and Molecular Research, vol. 6, no. 4, pp. 964-982, 2007.
[33] R. Gupta, P. Agarwal and A.K. Soni, "Genetic algorithm based approach for obtaining alignment of multiple sequences," International Journal of Advanced Computer Science & Applications (IJACSA), vol. 3, no. 12, 2012.
[34] T. Halgaswaththa, A.S. Atukorale, M. Jayawardena and J. Weerasena, "Neural network based phylogenetic analysis," in Proc. 2012 International Conference on Biomedical Engineering (ICoBE), IEEE, 2012.
[35] T. Halgaswaththa, A.S. Atukorale, M. Jayawardena and J. Weerasena, "Pairwise sequence alignment algorithms: a survey," in Proc. of the 2009 conference on Information Science, Technology and Applications, 2009.
[25] J. Cohen, "Bioinformatics—an introduction [36] T. Halgaswaththa, A.S. Atukorale, M.
for computer scientists," ACM Computing Jayawardena and J. Weerasena, "Computa-
Surveys (CSUR), vol. 36, no. 2, pp. 122-158, tional intelligence techniques in bioinformat-
2004. ics," Computational biology and chemistry,
[26] A. Dakhli and W. Bellil, "Wavelet neural net- vol. 47, no. pp. 37-47, 2013.
works for DNA sequence classification using [37] S.Y. Ho, F.C. Yu, C.Y. Chang and H.L.
the genetic algorithms and the least trimmed Huang, "Design of accurate predictors for
square," Procedia Computer Science, vol. 96, DNA-binding sites in proteins using hybrid
pp. 418-427, 2016. SVM–PSSM method," Biosystems, vol. 90,
[27] D. Eppstein, Z. Galil, R. Giancarlo and G. F. no. 1, pp. 234-241, 2007.
Italiano, "Sparse dynamic programming I: [38] J.T. Horng, L.C. Wu, C.M. Lin and B.H.
linear cost functions," Journal of the ACM Yang, "A genetic algorithm for multiple se-
(JACM), vol. 39, no. 3, pp. 519-545, 1992. quence alignment," Soft Computing, vol. 9,
no. 6, pp. 407-420, 2005.


[39] K.W. Huang, J.L. Chen, C.S. Yang and C.W. methods and programs in biomedicine, vol.
Tsai, "A memetic particle swarm optimiza- 114, no. 1, pp. 38-49, 2014.
tion algorithm for solving the DNA fragment [50] R. Khattree and D. Naik "Machine learning
assembly problem," Neural Computing and techniques for bioinformatics" Computational
Applications, vol. 26, no. 3, pp. 495-506, Methods in Biomedical Research, pp. 57-88.
2015. Chapman and Hall/CRC, 2007.
[40] A. Idri, A. F. azzahra and A. Abran, "Anal- [51] K. Kim, M. Kim and Y. Woo, "A DNA se-
ogy-based software development effort esti- quence alignment algorithm using quality in-
mation: A systematic mapping and review," formation and a fuzzy inference method."
Information and Software Technology, vol. Progress in Natural Science, vol. 18, no. 5,
58, pp. 206-230, 2015. pp. 595-602, 2008.
[41] M. Ishaq, A. Khan, M. Khan and M. Imran, [52] B. Kitchenham, (). "Procedures for perform-
"Current Trends and Ongoing Progress in the ing systematic reviews, " Keele, UK, Keele
Computational Alignment of Biological Se- University, vol. 33, pp. 1-26, 2004.
quences," IEEE Access, vol. 7, pp. 68380- [53] D.C. Koboldt, K.M. Steinberg, D.E. Larson,
68391, 2019. R.K. Wilson and E.R. Mardis, "The next-gen-
[42] M. Issa, A.E. Hassanien, D. Oliva, A. Helmi, eration sequencing revolution and its impact
I. Ziedan and A. Alzohairy, "ASCA-PSO: on genomics, " Cell, vol. 155, no. 1, pp. 27-
Adaptive sine cosine optimization algorithm 38, 2013.
integrated with particle swarm for pairwise [54] M. Kumar, "An enhanced algorithm for mul-
local sequence alignment." Expert Systems tiple sequence alignment of protein sequences
with Applications, vol. 99, pp. 56-70, 2018. using genetic algorithm," EXCLI journal, vol
[43] S. R Jangam. and N. Chakraborti, "A novel 14, pp. 1232, 2015.
method for alignment of two nucleic acid se- [55] S. Kumar, K. Tamura and M. Nei, "MEGA3:
quences using ant colony optimization and integrated software for molecular evolution-
genetic algorithms," Applied Soft Computing, ary genetics analysis and sequence align-
vol. 7, no. 3, pp. 1121-1130, 2007. ment," Briefings in bioinformatics, vol. 5, no.
[44] L. Ji, X. Pu, H. Qu and G. Liu, "One-dimen- 2, pp. 150-163, 2004.
sional pairwise CNN for the global alignment [56] A. Lacey and X. Xie, "Supervised Machine
of two DNA sequences," Neurocomputing, Learning Techniques in Bioinformatics: Pro-
vol. 149, pp. 505-514, 2015. tein Classification." (2014).
[45] U. Kanewala and J. M. Bieman, "Testing sci- [57] J. Lee, Y. Yeu, H. Roh, Y. Yoon and S. Park,
entific software: A systematic literature re- "BulkAligner: A novel sequence alignment
view," Information and Software Technology, algorithm based on graph theory and Trinity,"
vol 56, no. 10, pp. 1219-1232, 2014. Information Sciences, vol 303, pp. 120-133,
[46] D. Karaboga and S. Aslan, "A discrete artifi- 2015.
cial bee colony algorithm for detecting tran- [58] Z.J. Lee, S.F. Su, C.C. Chuang and K.H. Liu,
scription factor binding sites in DNA se- "Genetic algorithm with ant colony optimiza-
quences," Genet Mol Res, vol. 15, no. 2, pp. tion (GA-ACO) for multiple sequence align-
1-11, 2016. ment," Applied Soft Computing, vol. 8, no. 1,
[47] D. Karaboga and S. Aslan "Discovery of con- pp. 55-78, 2008.
served regions in DNA sequences by Artifi- [59] Lei, C. and J. Ruan, “A particle swarm opti-
cial Bee Colony (ABC) algorithm based mization algorithm for finding DNA se-
methods," Natural Computing, vol. 18, no. 2, quence motifs,” in proc. 2008 IEEE Interna-
pp. 333-350, 2019. tional Conference on Bioinformatics and
[48] Y. Kaur and N. Sohi, “Pairwise sequence Biomeidcine Workshops, IEEE, 2008.
alignment method using flower pollination al- [60] X. Lei, J. Sun, X. Xu and L. Guo, “Artificial
gorithm,” in proc. 2017 4th International bee colony algorithm for solving multiple se-
Conference on Signal Processing, Computing quence alignment,” in proc. 2010 IEEE fifth
and Control (ISPCC), IEEE, 2017. international conference on bio-inspired com-
[49] M. Kaya, A. Sarhan and R. Alhajj, "Multiple puting: theories and applications (BIC-TA),
sequence alignment with affine gap by using IEEE, 2010.
multi-objective genetic algorithm," Computer


[61] C.S. Leslie, E. Eskin, A. Cohen, J. Weston posium on Computational Intelligence and
and W.S. Noble, "Mismatch string kernels for Bioinformatics and Computational Biology,
discriminative protein classification," Bioin- IEEE.
formatics, vol. 20, no. 4, pp. 467-476, 2004. [72] Nguyen, H. D., et al. (2002). A parallel hy-
[62] G. Li, Y. Wang and X. Su, "Improvements on brid genetic algorithm for multiple protein se-
a privacy-protection algorithm for DNA se- quence alignment. Proceedings of the 2002
quences with generalization lattices." Com- Congress on Evolutionary Computation.
puter methods and programs in biomedicine, CEC'02 (Cat. No. 02TH8600), IEEE.
vol. 108, no. 1, pp. 1-9, 2012. [73] Nikolayevskyy, Vlad, Katharina Kranzer,
[63] M.S.I. Mahmud, M.A. Hosen, M. Saroer-E- Stefan Niemann, and Francis Drobniewski.
Azam, M.A. Mottalib and H.A Al-Mamun, "Whole genome sequencing of Mycobac-
“A novel two-tier multiple sequence align- terium tuberculosis for detection of recent
ment algorithm,” in proc. 2010 13th Interna- transmission and tracing outbreaks: A sys-
tional Conference on Computer and Informa- tematic review." Tuberculosis 98 (2016): 77-
tion Technology (ICCIT), IEEE, 2010. 85.
[64] A. Majid, M. Khan, N. Iqbal, M.A. Jan and [74] Othman, Mohamed Tahar Ben. "Survey of
M. Khan, "Application of parallel vector the use of genetic algorithm for multiple se-
space model for large-scale dna sequence quence alignment." Journal of Advanced
analysis," Journal of Grid Computing, vol. Computer Science & Technology 5, no. 2
17, no. 2, pp. 313-324, 2019. (2016): 28..
[65] P. Meksangsouy and N. Chaiyaratana, “DNA [75] Othman, Mohamed Tahar Ben, and Gamil
fragment assembly using an ant colony sys- Abdel-Azim. "Multiple sequence alignment
tem algorithm,” in porc. The 2003 Congress based on genetic algorithms with new chro-
on Evolutionary Computation, 2003. mosomes representation." In 2012 16th IEEE
CEC'03., IEEE, 2003. Mediterranean Electrotechnical Conference,
[66] A.L. Caicedo and M.D. Purugganan, "Com- pp. 1030-1033. IEEE, 2012.
parative genomics," Annu. Rev. Genomics [76] Pan, Yi, Jianxin Wang, and Min Li. "Wiley
Hum. Genet, vol. 5, pp. 15-56, 2005. Series on Bioinformatics: Computational
[67] N.S. Mohamed, Z.A. Othman and A.A. Techniques and Engineering." (2014): 513-
Bakar, “A classification of “Gracilaria 514.
changii” protein sequences using back-propa- [77] Pentheroudakis, George, F. A. Greco, and
gation classifier,” in proc. 2009 2nd Confer- Nicholas Pavlidis. "Molecular assignment of
ence on Data Mining and Optimization, tissue of origin in cancer of unknown primary
IEEE, 2009. may not predict response to therapy or out-
[68] P. Mohanty and S. Tragoudas, "Scalable Off- come: a systematic literature review." Cancer
line Searches in DNA Sequences," ACM treatment reviews 35, no. 3 (2009): 221-227.
Journal on Emerging Technologies in Com- [78] Pevsner, J. "Completed Genomes: Bacteria
puting Systems (JETC), vol. 11, no. 2, pp. 1- and Archaea." Bioinformatics and Functional
25, 2014. Genomics, Second Edition, John Wiley &
[69] N. Moustafa, M. Elhosseini, T.H. Taha and Sons, Inc., Hoboken, NJ, USA. doi 10 (2009):
M. Salem, "Fragmented protein sequence 9780470451496.
alignment using two-layer particle swarm op- [79] Polato, Ivanilton, Reginaldo Ré, Alfredo
timization (FTLPSO)," Journal of King Saud Goldman, and Fabio Kon. "A comprehensive
University-Science, vol. 29, no. 2, pp.191- view of Hadoop research—A systematic liter-
205, 2017. ature review." Journal of Network and Com-
[70] A. Nagar and M. Hahsler, “A novel quasi- puter Applications 46 (2014): 1-25.
alignment-based method for discovering con- [80] Qian, Weiyi, Yingjie Yang, Ningning Yang,
served regions in genetic sequences,” in proc. and Chun Li. "Particle swarm optimization
2012 IEEE International Conference on for SNP haplotype reconstruction prob-
Bioinformatics and Biomedicine Workshops, lem." Applied mathematics and Computa-
IEEE, 2012. tion 196, no. 1 (2008): 266-272.
[71] Nasser, S., et al. (2007). Multiple sequence [81] Rajapakse, J. C., and I. Faleel. "Genetic ap-
alignment using fuzzy logic. 2007 IEEE Sym- proach to biosequence alignment (GABA)."


In Proceedings of the 9th International Con- Sianipar, Bahtiar Saleh Abbas, and Achmad
ference on Neural Information Processing, Nizar Hidayanto. "The genomic plant ware-
2002. ICONIP'02., vol. 2, pp. 611-615. IEEE, house framework: A systematic literature re-
2002.. view." In 2017 International Conference on
[82] Rajasekhar, Anguluri, Nandar Lynn, Swa- Information Management and Technology
gatam Das, and Ponnuthurai N. Suganthan. (ICIMTech), pp. 244-248. IEEE, 2017.
"Computing with the collective intelligence [92] Smith, Anna Jo, Elizabeth L. Turner, and
of honey bees–a survey." Swarm and Evolu- Sanjay Kinra. "Universal cholesterol screen-
tionary Computation 32 (2017): 25-48. ing in childhood: a systematic review." Aca-
[83] Ramaswamy, Harish, and Carla Purdy. "An demic pediatrics 16, no. 8 (2016): 716-725.
extended library of hardware modules for ge- [93] Surendar, A., Sadulla Shaik, and N. Usha
netic algorithms, with applications to DNA Rani Rani. "Micro Sequence Identification of
sequence matching." In 2008 51st Midwest DNA Data Using Pattern Mining Tech-
Symposium on Circuits and Systems, pp. niques." Materials Today: Proceedings 5, no.
209-212. IEEE, 2008. 1 (2018): 578-587.
[84] Rani, R. Ranjani, and D. Ramyachitra. "Mul- [94] Torres, Angela, and Juan J. Nieto. "Fuzzy
tiple sequence alignment using multi-objec- logic in medicine and bioinformatics." journal
tive based bacterial foraging optimization al- of Biomedicine and Biotechnology 2006
gorithm." Biosystems 150 (2016): 177-189. (2006).
[85] Rani, R. Ranjani, and D. Ramyachitra. "Ap- [95] van Karnebeek, Clara DM, and Sylvia Stock-
plication of genetic algorithm by influencing ler. "Treatable inborn errors of metabolism
the crossover parameters for multiple se- causing intellectual disability: a systematic
quence alignment." In 2017 4th IEEE Uttar literature review." Molecular genetics and
Pradesh Section International Conference on metabolism 105, no. 3 (2012): 368-381.
Electrical, Computer and Electronics (UP- [96] Verma, Ravi Shankar. "DSAPSO: DNA se-
CON), pp. 33-38. IEEE, 2017. quence assembly using continuous particle
[86] Ressom, H., Padma Natarajan, Rency S. swarm optimization with smallest position
Varghese, and Mohamad T. Musavi. "Appli- value rule." In 2012 1st international confer-
cations of fuzzy logic in genomics." Fuzzy ence on recent advances in information tech-
sets and systems 152, no. 1 (2005): 125-138. nology (RAIT), pp. 410-415. IEEE, 2012.
[87] Rubio-Largo, Alvaro, Leonardo Vanneschi, [97] Verma, Ravi Shankar, Vikas Singh, and San-
Mauro Castelli, and Miguel A. Vega-Ro- jay Kumar. "Dna sequence assembly using
dríguez. "Swarm intelligence for optimizing particle swarm optimization." International
the parameters of multiple sequence align- Journal of Computer Applications 28, no. 10
ers." Swarm and Evolutionary Computa- (2011): 33-38.
tion 42 (2018): 16-28. [98] Wang, Liangjiang, Caiyan Huang, Mary Qu
[88] Rubio-Largo, Álvaro, Miguel A. Vega-Ro- Yang, and Jack Y. Yang. "BindN+ for accu-
dríguez, and David L. González-Álvarez. rate prediction of DNA and RNA-binding
"Hybrid multiobjective artificial bee colony residues from protein sequence fea-
for multiple sequence alignment." Applied tures." BMC Systems Biology 4, no. 1
Soft Computing 41 (2016): 157-168. (2010): 1-9.
[89] Saw, Ajay Kumar, Garima Raj, Manashi Das, [99] Wen, Jianfeng, Shixian Li, Zhiyong Lin,
Narayan Chandra Talukdar, Binod Chandra Yong Hu, and Changqin Huang. "Systematic
Tripathy, and Soumyadeep Nandi. "Align- literature review of machine learning based
ment-free method for DNA sequence cluster- software development effort estimation mod-
ing using Fuzzy integral similarity." Scien- els." Information and Software Technol-
tific reports 9, no. 1 (2019): 1-18. ogy 54, no. 1 (2012): 41-59.
[90] Simossis, Victor, Jens Kleinjung, and Jaap [100] Wu, Jie, and Yiqiang Zhao. "Machine learn-
Heringa. "An overview of multiple sequence ing technology in the application of genome
alignment." Current protocols in bioinformat- analysis: a systematic review." Gene 705
ics 3, no. 1 (2003): 3-7. (2019): 149-156.
[91] Siswanto, Teddy, Spits Warnars Harco Leslie [101] Xu, Fasheng, and Yuehui Chen. "A method
Hendric, Harjanto Prabowo, Nesti Fronika for multiple sequence alignment based on


