
Computers in Biology and Medicine 171 (2024) 108038

journal homepage: www.elsevier.com/locate/compbiomed
Optimized fuzzy K-nearest neighbor approach for accurate lung cancer prediction based on radial endobronchial ultrasonography
Jie Xing a,1, Chengye Li b,1, Peiliang Wu b, Xueding Cai b, Jinsheng Ouyang b,*
a Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou, 325035, China
b Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
Keywords: Feature selection; Radial endobronchial ultrasonography; Malignant lung disease; Manta ray foraging optimization; Fuzzy k-nearest neighbor

Abstract

Radial endobronchial ultrasonography (R-EBUS) has driven a surge in the development of new ultrasonography for the diagnosis of pulmonary diseases beyond the central airway. However, it faces challenges in accurately pinpointing the location of abnormal lesions. Therefore, this study proposes an improved machine learning model aimed at distinguishing malignant lung disease (MLD) from benign lung disease (BLD) through R-EBUS features. First, an enhanced manta ray foraging optimization based on elite perturbation search and cyclic mutation strategy (ECMRFO) is introduced. Experimental validation on 29 test functions from CEC 2017 demonstrates that ECMRFO exhibits superior optimization capabilities and robustness compared to other competing algorithms. Subsequently, it is combined with fuzzy k-nearest neighbor for the classification prediction of BLD and MLD. Experimental results indicate that the proposed model achieves a prediction accuracy of up to 99.38%. Additionally, parameters such as R-EBUS1 Circle-dense sign, R-EBUS2 Hemi-dense sign, R-EBUS5 Onionskin sign and CCT5 mediastinum lymph node are identified as having significant clinical diagnostic value.
1. Introduction

Transbronchial lung biopsy (TBLB) is widely used as a tool for diagnosing pulmonary diseases. To increase the diagnostic confidence of TBLB, a bronchoscopist usually performs it under the direction of X-ray before endobronchial ultrasonography (EBUS). The utilization of EBUS for the precise localization of pulmonary masses is predicated upon the fundamental principle that the air content within lung parenchyma serves as an excellent conductor for ultrasound signals. Pulmonary lesions, under such examination, exhibit distinctive characteristics: they manifest a hypoechoic texture and possess well-defined borders, primarily attributable to the robust reflective interface discernible between the aerated lung tissue and the lesion itself. This sonographic approach, in focusing on the internal structure of peripheral pulmonary lesions, has prompted the development of a classification system aimed at effectively discerning between benign and malignant lesions [1]. Consequently, EBUS can serve as a valuable adjunct in guiding TBLB procedures targeted at peripheral intrapulmonary lesions. Two types of EBUS are available for clinical use: radial EBUS (R-EBUS) and convex probe EBUS in the bronchoscope center.

Recently, a noteworthy advancement in EBUS technology has emerged in the form of the ultra-miniature radial probe, specifically denoted as UM-S20-20R and manufactured by Olympus in Tokyo, Japan. This innovative instrument has found application in the detection of intrapulmonary lesions, thus establishing itself as a pivotal diagnostic tool for pulmonary diseases. Furthermore, its integration with Lung Point virtual navigation represents a significant leap in enhancing diagnostic precision and navigation capabilities in the assessment of pulmonary pathology. The ultra-miniature radial probe, UM-S20-20R, operates at a high frequency of 20 MHz and features a slender external diameter measuring only 1.4 mm. This compact design enables the probe's insertion into the 2.0-mm working channel of a flexible electronic bronchoscope, either independently or in conjunction with a guide sheath (GS) [2,3]. After localizing the target lesion, the radial ultrasound probe is removed. Subsequently, biopsy forceps are introduced into the target site beyond the subsegmental bronchus to perform a pathologic biopsy. The utilization of a guide sheath in conjunction with R-EBUS significantly enhances the endoscopic diagnostic capability for peripheral pulmonary diseases [2].
* Corresponding author.
E-mail addresses: xingjie095@163.com (J. Xing), lichengye41@126.com (C. Li), 404350351@qq.com (P. Wu), xuediing514@126.com (X. Cai), ouyangkch@163.com (J. Ouyang).
1 Jie Xing and Chengye Li contributed equally to this work.
https://doi.org/10.1016/j.compbiomed.2024.108038
Received 12 October 2023; Received in revised form 2 January 2024; Accepted 26 January 2024
Available online 17 February 2024
0010-4825/© 2024 Elsevier Ltd. All rights reserved.
In our institution, we perform the R-EBUS examination in combination with virtual navigation only, without the guide sheath. This approach has also achieved satisfactory results, consistent with the final clinical diagnosis, in distinguishing malignant lung disease (MLD) from benign lung disease (BLD) such as inflammation and diffuse pulmonary diseases.

Although the new type of R-EBUS is widely used to confirm the target biopsy site for the diagnosis of intrapulmonary lesions, the location of the abnormal lesion is usually determined by the operating physician's visual perception and subjective experience. Is there a technology that can accurately distinguish malignant lesions (lung cancer) from benign lesions? In recent years, there has been a notable surge of interest in the application of machine learning methods within the medical domain, facilitating the integration of artificial intelligence (AI) technology into medical practices. Machine learning-driven prediction models have garnered substantial attention due to their multifaceted utility. They not only find relevance in medical health planning and resource allocation but also serve as invaluable tools for aiding medical professionals in making informed and data-driven decisions in clinical settings [4–7]. For example, Hu et al. [8] proposed a pyramid pooling based network to address the severe challenge of accurately locating polyps in colonoscopy images. It used a pyramid pooling transformer for feature extraction and achieved better results than other networks. Dai et al. [9] proposed a low-cost musculoskeletal rehabilitation assessment system based on electromyography signals, in which the wavelet transform is applied for feature extraction and long short-term memory is used for model training. In terms of health privacy, Wu et al. [10] proposed a proxy-based user privacy health topic protection algorithm based on identity replacement to improve the security of user privacy health topics on untrusted servers.

Although AI-assisted lung ultrasound (LUS) has shown broad potential for more accurate medical decisions in guiding care management [11–15], the use of AI-assisted R-EBUS for intrapulmonary diseases has rarely been reported. In addition, the training of machine learning predictive models involves a substantial number of data sets and features. However, among these features, some may be trivial, highly correlated, or redundant, which can impact the accuracy of subsequent models. Therefore, prior to classification, feature selection (FS) methods are employed to reduce the dimensionality of the feature space, in order to obtain appropriate features with high discriminative capability.

FS is essentially the process of searching for the best feature subset from the original set of features. Wrapper methods can obtain the optimal feature subset through iterative search strategies. Recently, researchers have employed metaheuristic algorithms to search for the best feature subset because of their strong global search capabilities. Dhillon et al. [16] introduced additional randomness and local search to the basic cat swarm optimization algorithm (CSO). They performed FS using the proposed CSO variant and subsequently employed Bayesian optimization for predicting cancer survival using deep neural networks. Mohamed et al. [17] proposed an improved version of the artificial rabbits optimizer, incorporating Gaussian mutation and crossover operators. They utilized the proposed algorithm for FS and employed the MobileNetV3 model for predicting skin cancer. Thawkar et al. [18] introduced a hybrid FS approach that combines the butterfly optimization algorithm (BOA) and the ant lion optimizer (ALO), resulting in the hybrid BOA-ALO method. This method utilizes the optimal feature subset selected by BOA-ALO and employs three classifiers, namely Artificial Neural Networks, Adaptive Neuro-Fuzzy Inference System, and Support Vector Machine, to predict the malignancy of breast tissue.

Venkatesan et al. [19] proposed an adaptive Harris hawk optimization algorithm for FS. Subsequently, they developed a lung cancer prediction model that incorporates discrete local binary patterns and a hybrid wavelet-partial Hadamard transform based on the proposed Harris hawk optimization algorithm. Liu et al. [20] introduced a disease diagnostic model that combines ant-coronavirus optimization with extreme learning machines. Experiments were conducted using various skin cancer image datasets, and the results confirmed the effectiveness of the proposed method. Huang et al. [21] addressed the shortcomings in the breast X-ray image diagnostic system by proposing a detection system based on the optimal SqueezeNet model. They utilized an improved chef-based optimization algorithm to select features from preprocessed images and then employed SqueezeNet for classification. Hu et al. [22] improved the performance of the coot optimization algorithm based on chaos theory, opposition-based learning, and encirclement predation strategies. Subsequently, they applied the enhanced algorithm for FS on eight medical datasets. Jesi et al. [23] introduced a sinusitis detection model based on recurrent neural networks. In the feature extraction process for sinus CT images, they employed a hybrid FS algorithm combining the spotted hyena optimization algorithm and the rain optimization algorithm to enhance the classification performance of the model.

Manta ray foraging optimization (MRFO) [24] is a metaheuristic algorithm inspired by nature. Research shows that MRFO has strong global optimization capabilities and is very suitable for handling practical problems [25,26]. However, similar to other metaheuristic algorithms, there is still room for improvement in the optimization performance of MRFO when dealing with complex data. In this study, an MRFO variant (ECMRFO) based on an elite perturbation search strategy (EPS) and cyclic mutation (CM) is proposed. The EPS strategy performs individual position updates by introducing suboptimal solutions and perturbation, thereby expanding the search range of the population in the solution space. The CM strategy starts from the perspective of mutation, mutates the population to varying degrees based on the cyclic topology, and expands the diversity of the population to find better solutions. The superior performance of ECMRFO was validated through comparisons with 11 state-of-the-art (SOTA) algorithms on the IEEE Congress on Evolutionary Computation 2017 (CEC 2017) test functions. Subsequently, the modified ECMRFO is employed to select the optimal feature subset, and fuzzy K-nearest neighbors (FKNN) is used as the fitness evaluator. The selected optimal feature set is utilized for predicting whether lung lesions are benign or malignant.

The following are the contributions of the proposed work:

1. An improved MRFO approach (ECMRFO) based on EPS and CM strategies is developed.
2. The introduction of suboptimal solutions in EPS effectively facilitates communication between individuals within the population, thereby enhancing the convergence accuracy of the algorithm. The CM strategy promotes individual mutation, thereby increasing the population diversity, which can balance the exploration and exploitation of the algorithm.
3. Comparison with eleven SOTA methods on CEC 2017 validates the superior performance of ECMRFO.
4. To further validate the superiority and practicality of ECMRFO, a wrapper-based feature selection model is proposed and applied to address the feature selection problem in lung cancer data.

The remainder of this article is structured as follows. Section 2 presents the materials and methods. The proposed ECMRFO is presented in Section 3. Section 4 presents the details of bECMRFO-FKNN. Section 5 presents the parameters and algorithms involved in the experiment. Section 6 presents the experimental results on optimization problems and lung cancer prediction. The discussion of lung cancer prediction results is presented in Section 7. Finally, Section 8 summarizes the conclusions drawn from this work, while also outlining future research work.

2. Materials and methods

This section provides an introduction to the experimental dataset, classifier, and the original MRFO utilized in this study. Firstly, the
characteristics of the dataset will be presented, followed by a statistical analysis. Subsequently, the FKNN classifier used in this research will be introduced, and a brief overview of the MRFO optimization method will be provided.

2.1. Data description

2.1.1. Data collection

We retrospectively collected the data of 156 patients who received transbronchial lung biopsy (TBLB) under the direction of R-EBUS combined with virtual navigation and who were admitted to the First Affiliated Hospital of Wenzhou Medical University between January 16, 2020, and December 16, 2022. This study received ethical approval from the Ethics Committee of the First Affiliated Hospital of Wenzhou Medical University, with the approval number KY2022-R219. The research was conducted in strict adherence to the ethical principles outlined in the 1975 Declaration of Helsinki. For the purpose of this study, general clinical data were collected from patients, encompassing variables such as sex, age, and clinical diagnosis. To obtain the necessary blood samples, fasting venous blood was collected from all subjects using vacuum blood collection containers manufactured by Becton Dickinson, Medical Devices Co Ltd, NC, USA. Subsequently, the Hospital Inspection Center employed the Sysmex XE-2100 automatic blood cell analyzer, developed by Sysmex Corporation, Kobe, Japan, to conduct a comprehensive analysis of seven routine blood indices, as detailed in Table 1. The features of the CCT subtype and the characteristics of the R-EBUS subtype are also listed in Table 1.

Table 1
Description of clinical, biochemical, CCT and R-EBUS image attributes.

Feature  Brief description                     Feature  Brief description
C1       Gender                                C15      White blood cells (WBC)
C2       Age                                   C16      Absolute value of neutrophil cells (AVN)
C3       R-EBUS1 Circle-dense sign             C17      Absolute value of lymphocytes (AVL)
C4       R-EBUS2 Hemi-dense sign               C18      Percentage of neutrophils (PN)
C5       R-EBUS3 Blizzard sign                 C19      Percentage of lymphocytes (PL)
C6       R-EBUS4 Short-linear-hyperecho sign   C20      Hemoglobin (HB)
C7       R-EBUS5 Onionskin sign                C21      Blood platelet (BPL)
C8       R-EBUS6 Focal-low-echo sign           C22      Fibrinogen (Fg)
C9       CCT1 flake shadow                     C23      D-dimer (DD)
C10      CCT2 speckle shadow                   C24      C-reactive protein (CRP)
C11      CCT3 nodular shadow                   C25      Carcinoembryonic antigen (CEA)
C12      CCT4 mass shadow                      C26      Keratin protein cyfra21-1 (Cyfra21-1)
C13      CCT5 mediastinum lymph node           C27      Neuron-specific enolase (NSE)
C14      CCT7 interstitial alteration

2.1.2. Statistical analysis

A two-tailed Student's t-test was employed to compare continuous variables, which encompassed age, routine blood test indices, fibrinogen (Fg), D-dimer (DD), C-reactive protein (CRP), carcinoembryonic antigen (CEA), and keratin protein cyfra21-1 (Cyfra21-1), between cases of malignant lung disease and benign lung disease. The data are presented as mean (X) ± standard deviation (Std) and are detailed in Table 2. To investigate potential statistical disparities, an analysis was conducted using the Pearson chi-square test (χ2 test) for variables related to patients' gender, CCT findings, and R-EBUS type. A significance threshold of P < 0.05 was adopted to determine statistical significance. The entire analytical process was carried out using SPSS software, version 19, developed by IBM. The results are summarized in Table 2, Table 3, and Table 4.

Table 2
Biochemical and clinical parameters in malignant and benign lung disease patients, mean (standard deviation).

Variables           MLD (n = 50)          BLD (n = 50)          P-value
                    Mean(X)    Std        Mean(X)    Std
Age (years)         62.957     12.855     60.080     12.598     0.162
WBC (10^9/L)        6.433      1.980      7.641      2.768      0.003
AVN (10^9/L)        4.018      1.656      5.102      2.460      0.002
AVL (10^9/L)        1.803      0.815      1.834      0.681      0.799
PN (%)              0.668      0.483      0.649      0.097      0.721
PL (%)              0.303      0.155      0.258      0.090      0.025
HB (μg/L)           132.521    15.667     135.196    15.375     0.286
BPL (10^9/L)        225.797    60.292     262.578    66.809     0.000
Fg (mg/L)           3.557      1.210      3.662      1.413      0.624
DD (mg/L)           1.012      2.413      0.690      1.044      0.254
CRP (mg/L)          9.199      11.715     18.825     46.800     0.098
CEA (g/L)           44.959     123.855    3.994      6.404      0.002
Cyfra21-1 (ng/ml)   4.401      6.528      2.535      1.515      0.011
NSE (ng/ml)         15.326     8.381      13.064     3.505      0.024

Table 3
Clinical characteristics and CCT findings in MLD patients and BLD patients.

Factor                         MLD (n = 68)   BLD (n = 87)   χ2 value   P value
Gender (Male/Female)           37/32          56/31          1.845      0.174
CCT1 flake shadow              12/57          36/50          11.535     0.003
CCT2 speckle shadow            6/63           29/58          13.423     0.000
CCT3 nodular shadow            46/21          50/37          4.564      0.102
CCT4 mass shadow               16/53          5/82           10.048     0.002
CCT5 mediastinum lymph node    24/45          9/78           13.778     0.000
CCT7 interstitial alteration   68/1           85/2           0.147      0.701

Table 4
Clinical characteristics and R-EBUS type findings in MLD patients and BLD patients.

Factor                         MLD (n = 68)   BLD (n = 87)   χ2 value   P value
Gender (Male/Female)           37/32          56/31          1.845      0.174
Circle-dense sign              15/53          5/82           10.048     0.002
Hemi-dense sign                39/30          24/63          13.382     0.000
Blizzard sign                  7/62           20/67          4.435      0.035
Short-linear-hyperecho sign    32/37          57/30          5.753      0.016
Onionskin sign                 55/14          80/7           4.952      0.026
Focal-low-echo sign            5/64           18/69          5.533      0.019

2.2. Fuzzy k-nearest neighbor

The K-nearest neighbor (KNN) [27] classifier is a classification method that utilizes the distances between samples as the basis for classification. KNN assumes that each sample has the same degree of membership within its class, which may not always align with real-world scenarios. Consequently, the FKNN classifier, based on fuzzy set theory, was developed [28].

Assuming there are $n$ labeled samples, denoted as $\{x_1, \ldots, x_n\}$, the membership degree $\mu_i(x)$ of a sample $x$ belonging to the $i$-th category is calculated as follows:

$$\mu_i(x) = \frac{\sum_{j=1}^{k} \mu_{ij} \left( 1 / \lVert x - x_j \rVert^{2/(m-1)} \right)}{\sum_{j=1}^{k} \left( 1 / \lVert x - x_j \rVert^{2/(m-1)} \right)} \tag{1}$$
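As a rough illustration of Eq. (1), the following Python sketch computes class memberships for a query sample from its k nearest labeled neighbors. It is a minimal sketch under stated assumptions rather than the implementation used in this study: it assumes Euclidean distance, crisp neighbor memberships (μ_ij = 1 if neighbor j carries label i, and 0 otherwise), and m = 2. The function name `fknn_memberships`, the small `eps` guard against zero distances, and the toy data are all hypothetical.

```python
import math


def fknn_memberships(query, samples, labels, k=3, m=2.0, eps=1e-12):
    """Return {class: membership} for `query` following Eq. (1).

    Assumptions (not from the paper): Euclidean distance, crisp neighbor
    memberships mu_ij in {0, 1}, and `eps` to avoid division by zero.
    """
    # Distances from the query to every labeled sample.
    dists = [math.dist(query, s) for s in samples]
    # Indices of the k nearest neighbors.
    nearest = sorted(range(len(samples)), key=lambda j: dists[j])[:k]
    # Inverse-distance weights 1 / ||x - x_j||^(2/(m-1)).
    weights = {j: 1.0 / (dists[j] ** (2.0 / (m - 1.0)) + eps) for j in nearest}
    total = sum(weights.values())
    # With crisp mu_ij, the numerator of Eq. (1) is the weight mass per class.
    return {
        c: sum(w for j, w in weights.items() if labels[j] == c) / total
        for c in sorted(set(labels))
    }


# Toy two-dimensional data, for illustration only.
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["BLD", "BLD", "MLD", "MLD"]
memberships = fknn_memberships((0.5, 0.5), samples, labels, k=3)
predicted = max(memberships, key=memberships.get)
```

The memberships sum to one by construction, so the predicted class is simply the one with the largest membership. Larger values of m flatten the inverse-distance weights and make the memberships less sensitive to the closest neighbor.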
In Eq. (1), $\lVert x - x_j \rVert$ represents the distance between sample $x$ and the $j$-th sample $x_j$ among its $k$ nearest neighbors, and $\mu_{ij}$ is the membership degree of $x_j$ in the $i$-th category. The parameter $m$ is used to control the degree of membership of samples to categories.

2.3. An overview of MRFO

MRFO [24] is a metaheuristic algorithm inspired by the foraging behavior of manta rays. MRFO has three main updating modes: chain foraging, cyclone foraging, and somersault foraging.

Chain foraging. In this stage, the MRFO population of $N$ individuals is arranged in a chain-like formation to forage collectively. The position update formula for each individual is as follows:

$$X_i^j(t+1) = \begin{cases} X_i^j(t) + r_1 \times \left( X_{best}^j(t) - X_i^j(t) \right) + w_1 \times \left( X_{best}^j(t) - X_i^j(t) \right), & i = 1 \\ X_i^j(t) + r_1 \times \left( X_{i-1}^j(t) - X_i^j(t) \right) + w_1 \times \left( X_{best}^j(t) - X_i^j(t) \right), & i = 2, \cdots, N \end{cases} \tag{2}$$

$$w_1 = 2 \times r_2 \times \sqrt{\left| \log r_3 \right|} \tag{3}$$

where $X_i^j(t)$ represents the current position information of the $i$-th individual in the $j$-th dimension, and $X_{best}^j(t)$ represents the position information of the current best individual in the $j$-th dimension. $w_1$ is a weight factor whose value is given by Eq. (3). $r_1$, $r_2$, and $r_3$ are three random numbers in the range [0, 1].

Cyclone foraging. Similar to the whale optimization algorithm (WOA), individuals in the MRFO population approach food sources in a spiral-like manner. However, since the MRFO population forages in a chain-like manner, individuals also get closer to the preceding individual as they approach the food source, which distinguishes MRFO from WOA. The specific mathematical model can be defined as follows:

$$X_i^j(t+1) = \begin{cases} X_{best}^j(t) + r_4 \times \left( X_{best}^j(t) - X_i^j(t) \right) + w_2 \times \left( X_{best}^j(t) - X_i^j(t) \right), & i = 1 \\ X_{best}^j(t) + r_4 \times \left( X_{i-1}^j(t) - X_i^j(t) \right) + w_2 \times \left( X_{best}^j(t) - X_i^j(t) \right), & i = 2, \cdots, N \end{cases} \tag{4}$$

$$w_2 = 2 e^{r_5 \frac{MaxFEs - FEs + 1}{MaxFEs}} \times \sin(2 \pi r_5) \tag{5}$$

where $w_2$ is a weight factor, and $MaxFEs$ and $FEs$ represent the maximum number of evaluations and the current number of evaluations, respectively. $r_4$ and $r_5$ are random numbers in the range [0, 1].

The updating mechanism in Eq. (4) primarily utilizes the reference position of the best individual, enhancing the algorithm's ability for local exploitation. However, it is also crucial not to disregard the improvement in global exploration capabilities. Therefore, individuals in the MRFO population also employ a spiral-like search pattern with a random individual $X_{rand}(t)$ as the reference position to explore further. The update formula for this approach is as follows:

$$X_i^j(t+1) = \begin{cases} X_{rand}^j(t) + r_4 \times \left( X_{rand}^j(t) - X_i^j(t) \right) + w_2 \times \left( X_{rand}^j(t) - X_i^j(t) \right), & i = 1 \\ X_{rand}^j(t) + r_4 \times \left( X_{i-1}^j(t) - X_i^j(t) \right) + w_2 \times \left( X_{rand}^j(t) - X_i^j(t) \right), & i = 2, \cdots, N \end{cases} \tag{6}$$

The two cyclone updates are combined as follows:

$$X_i^j(t+1) = \begin{cases} \text{update the individual's position using Eq. (4)}, & Coef > r_6 \\ \text{update the individual's position using Eq. (6)}, & \text{otherwise} \end{cases} \tag{7}$$

It should be noted that the choice of update method between Eq. (4) and Eq. (6) is determined by the parameter $Coef = FEs / MaxFEs$. When $Coef$ is greater than a random number $r_6$ in the range [0, 1], the search method that utilizes the best individual as the reference position is employed. Conversely, when $Coef$ is less than or equal to $r_6$, the search method that uses a random individual as the reference position is applied.

Somersault foraging. In this updating mechanism, the position of the best individual is regarded as an axis, and each individual moves back and forth around the best individual to update its own position. The mathematical model established for this process is as follows:

$$X_i^j(t+1) = X_i^j(t) + S \times \left( r_7 \times X_{best}^j(t) - r_8 \times X_i^j(t) \right), \quad i = 1, \cdots, N \tag{8}$$

where $S$ is a parameter controlling the range of movement with a value of 2. $r_7$ and $r_8$ are two random numbers in the range [0, 1].

3. Proposed ECMRFO

3.1. Motivation

It is worth noting that in the MRFO population, individuals are organized in a sequential, chain-like arrangement. The update of their positions is influenced by either the information derived from the current best individual or the position of the preceding individual. This mode of operation imposes certain constraints on the exchange of information among distinct individuals within the population. Consequently, it hinders the diversification of the population and, subsequently, its capacity to explore and identify superior solutions. Moreover, relying solely on the current best individual for updating the population's positions in a fixed manner prevents the full utilization of information and potential offered by other suboptimal solutions. This constraint restricts the algorithm's exploration scope, leading to the possibility of being trapped in local optima and failing to discover higher-quality solutions.

In light of the above considerations, this study introduces an elite perturbation strategy to boost individual spatial exploration capabilities. Additionally, a mutation strategy based on a topological circular structure is devised to enhance the algorithm's diversity.
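For reference, the three baseline MRFO operators of Eqs. (2)-(8) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation evaluated in this study: positions are plain lists of floats, random numbers are drawn per dimension, the choice in Eq. (7) is made once per individual, and the random reference `X_rand` is taken to be a randomly chosen member of the current population. The function names are hypothetical, and boundary handling and fitness evaluation are omitted.

```python
import math
import random


def chain_foraging(X, best, rng):
    """Eqs. (2)-(3): each individual follows its predecessor and the best."""
    new_X = []
    for i, x in enumerate(X):
        ref = best if i == 0 else X[i - 1]   # the first individual follows the best
        row = []
        for j in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            r3 = 1.0 - rng.random()          # in (0, 1], keeps the logarithm finite
            w1 = 2.0 * r2 * math.sqrt(abs(math.log(r3)))
            row.append(x[j] + r1 * (ref[j] - x[j]) + w1 * (best[j] - x[j]))
        new_X.append(row)
    return new_X


def cyclone_foraging(X, best, fes, max_fes, rng):
    """Eqs. (4)-(7): spiral around the best or around a random individual."""
    coef = fes / max_fes
    new_X = []
    for i, x in enumerate(X):
        # Eq. (7): a large Coef (late in the run) favors the best individual.
        ref = best if coef > rng.random() else rng.choice(X)  # X_rand: assumption
        prev = ref if i == 0 else X[i - 1]
        row = []
        for j in range(len(x)):
            r4, r5 = rng.random(), rng.random()
            w2 = (2.0 * math.exp(r5 * (max_fes - fes + 1) / max_fes)
                  * math.sin(2.0 * math.pi * r5))
            row.append(ref[j] + r4 * (prev[j] - x[j]) + w2 * (ref[j] - x[j]))
        new_X.append(row)
    return new_X


def somersault_foraging(X, best, rng, S=2.0):
    """Eq. (8): move each individual back and forth around the best position."""
    return [
        [x[j] + S * (rng.random() * best[j] - rng.random() * x[j])
         for j in range(len(x))]
        for x in X
    ]


# Illustrative run on a tiny population; sizes and bounds are arbitrary.
rng = random.Random(0)
X = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(4)]
best = min(X, key=lambda x: sum(v * v for v in x))  # sphere function as a stand-in
X1 = chain_foraging(X, best, rng)
X2 = cyclone_foraging(X1, best, fes=10, max_fes=100, rng=rng)
X3 = somersault_foraging(X2, best, rng)
```

In every operator the only information sources are the preceding individual and a single reference position, which is the limitation that the elite perturbation and cyclic mutation strategies are meant to address.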
Taken together, these two strategies are intended to facilitate the discovery of high-quality solutions.

3.2. The framework of ECMRFO

The flowchart of ECMRFO is shown in Fig. 1, and Algorithm 1 presents the proposed ECMRFO framework. ECMRFO primarily comprises two main components: a global search strategy based on elite perturbation and a local mutation strategy based on a circular hierarchy structure. In Algorithm 2, the EPS strategy is designed to introduce suboptimal solutions to expand the search scope. In Algorithm 3, the circular hierarchy mutation strategy is designed to nurture diverse individuals to enhance algorithm diversity. Initially, in Algorithm 1, the algorithm parameters are initialized at line 1, followed by the random initialization of a population $X$ with $N$ individuals at line 2. The current best individual $X_{best}$ is then determined. Subsequently, the original MRFO's search strategy is employed from lines 6 to 21. The EPS strategy based on Algorithm 2 is executed at line 22, which involves updating the position of the current best individual. The CM strategy is executed at line 24 (Algorithm 3). This evolution process is repeated from lines 3 to 25 until the termination conditions are met. Ultimately, the output includes the best individual and its corresponding fitness value.

Algorithm 1. ECMRFO framework.

3.3. Elite perturbation search strategy

The original MRFO only considered the position information of the best solution, which posed constraints related to local optima. To overcome this limitation, we introduced the EPS strategy inspired by ant colony optimization (ACO) [29]. This strategy incorporates the position information of suboptimal solutions into the individual update process. Additionally, it introduces a carefully calculated perturbation approach to facilitate a broader exploration of the solution space. The update process can be divided into three main parts:

Elite Archive. A repository of size $k$ is established to store the position information of elite solutions encountered during the search process, denoted as $X_i$ $(i = 1, \cdots, k)$, along with their corresponding fitness values $f(X_i)$. The position information of these solutions serves as guidance for the population. A visual representation of the composition of the elite solution archive is presented in Fig. 2. Each elite solution in the repository is associated with a weight denoted as $W_i$. This weight value represents the degree of excellence of the corresponding solution in terms of quality. That is, when $f(X_1) \le \ldots \le f(X_i) \le \ldots \le f(X_k)$, there is $W_1 \ge \ldots \ge W_i \ge \ldots \ge W_k$. The calculation equation of $W_i$ is shown in Eq. (9).
Fig. 1. The flowchart of ECMRFO.
$$W_i = \frac{1}{q k \sqrt{2 \pi}} \, e^{-\frac{(i-1)^2}{2 q^2 k^2}} \tag{9}$$

$$k = 1 + \left\lfloor (FEs / MaxFEs) \cdot (N / 2 - 1) \right\rfloor \tag{10}$$

$$p_i = \frac{W_i}{\sum_{l=1}^{k} W_l} \tag{11}$$

where the parameter $k$ represents the number of elite solutions stored in the repository. Its size is dynamically adjusted in proportion to the number of evaluations conducted. This dynamic adjustment allows the algorithm to incrementally increase the storage capacity for effective information as optimization progresses. The parameter $q$ is assumed to have a value of 0.6, consistent with ant colony optimization [30]. Furthermore, the parameter $p_i$ represents the probability of selecting the $i$-th elite individual in the repository as a guide, denoted as $X_{guide}$, to lead the subsequent individual updates. This parameter plays a crucial role in the generation of the knowledge carriers, $X_{knowledge}$, in the subsequent steps.

Fig. 2. Schematic diagram of the archive of k elite individuals.
Fig. 3. Schematic diagram of the cyclic mutation strategy.

Elite Traction Operator. This operator serves as a guiding mechanism for steering the population towards more advantageous regions within the search space. It achieves this by introducing perturbations to elite individuals, ensuring that the algorithm avoids becoming ensnared in local optima and thereby enhancing its ability to explore and identify global optima. The precise mathematical model for this process is as follows:
{ ( ) ⌈√̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅⌉
Xi (t) + β. ∗ Xguide − Xi (t) + Xknowledge (t). ∗ Dti , r9 ≤ 0.5 during the learning process, denoted as θmax = dim + 1 . φibest rep
Xi (t + 1) = ( )
Xguide + β. ∗ Xi (t) − Xguide + Xknowledge (t). ∗ Dti , r9 > 0.5 resents the best learning angle for the i-th member of the population.
(12) Additionally, triali indicates the number of instances where the i-th
member, after implementing elite traction, found that the quality of the
∑Num ( )
Xknowledge (t) = Rand(1, dim). ∗ m1
XSorted − Xi (t) (13) newly acquired member was lower than the previous member’s quality.
The pseudocode of EPS is shown in Appendix A.1.
m1 =1
{ m2
XSorted , r10 < pm2
Xguide = , m2 ∈ {1, 2, ⋯, k} (14) 3.4. Cyclic mutation strategy
Xbest , otherwise
{ In MRFO, the position updates of the current individual only utilize
Num =
m3 , r11 < pm3
, m3 ∈ {1, 2, ⋯, k} (15) the position information of the current best individual and the preceding
1, otherwise individual. This means that the position information of other individuals
in the population is not fully exploited, resulting in poor population
β = 2 • e− (4•FEs/MaxFEs)2
(16) diversity and making it challenging to discover higher-quality solutions.
Strategies to enhance diversity come in various types, such as circular
where β is a parameter that controls the direction and range of the topologies [31], multi-population strategies [32], hierarchical strategies
current individual’s search. XSorted is a collection of elite individuals, [33], and novel learning strategies [34], among others. Due to its po
consisting of k individuals. Dti defines the feasible region through which tential for expanding the search scope, this study, based on a circular
members of the elite archive can acquire knowledge. For the current individual $X_i(t)$, there exists a learning angle $\varphi_i(t) = (\varphi_i^1, \varphi_i^2, \cdots, \varphi_i^{dim-1}) \in R^{dim-1}$ obtained from $X_{knowledge}(t)$. During the initialization phase, $\varphi_i^j$ is assigned the value of $\pi/4$. As the optimization process proceeds, $\varphi_i^j$ on each dimension can be transformed into a learnable region $d_i^j$ using a polar-to-Cartesian coordinate transformation method. The difference in the learning regions for the current individual is denoted as $D_i^t(\varphi_i(t)) = (d_i^1, d_i^2, \cdots, d_i^{dim}) \in R^{dim}$. The specific formula is as follows:

$$d_i^1 = \prod_{p=1}^{dim-1} \cos(\varphi_i^p) \tag{17}$$

$$d_i^j = \sin(\varphi_i^{j-1}) \cdot \prod_{p=j}^{dim-1} \cos(\varphi_i^p) \tag{18}$$

$$d_i^{dim} = \sin(\varphi_i^{dim-1}) \tag{19}$$

Intelligent feedback control. The fitness value of the individual after the update serves as a feedback metric for the adaptive adjustment of $\varphi_i(t)$. Specifically, if the fitness value of the updated individual's position is lower than the fitness value of the current position, it indicates the discovery of a more promising search location. In such a case, the current learning angle $\varphi_i(t)$ is retained. Conversely, if the fitness value is higher, $\varphi_i(t)$ undergoes rotation, leading to the exploration of alternative spaces. This adaptive refinement mechanism enhances the algorithm's proficiency in the search space, thereby improving the efficiency and accuracy of global search efforts. The mathematical model of this process is as follows:

$$\varphi_i(t+1) = \begin{cases} \varphi_i(t) + \left(\pi/\theta_{max}^2\right) .* Rand(1, dim-1), & f(X_i(t+1)) < f(X_i(t)) \text{ and } trial_i < \theta_{max} \\ \varphi_i(t) + \left(\pi/8\theta_{max}^2\right) .* Rand(1, dim-1), & f(X_i(t+1)) \ge f(X_i(t)) \text{ and } trial_i < \theta_{max} \\ \varphi_{ibest}(t) + \left(\pi/\theta_{max}^2\right) .* Rand(1, dim-1), & \text{otherwise} \end{cases} \tag{20}$$

$$\varphi_{ibest}(t+1) = \begin{cases} \varphi_i(t+1), & f(X_i(t+1)) < f(X_i(t)) \\ \varphi_i(t), & \text{otherwise} \end{cases} \tag{21}$$

$$trial_i = \begin{cases} 0, & f(X_i(t+1)) < f(X_i(t)) \\ trial_i + 1, & \text{otherwise} \end{cases} \tag{22}$$

where $\theta_{max}$ represents the upper limit of the rotation angle encountered

topology structure, has designed a CM strategy to enhance the diversity of MRFO and facilitate the development of high-quality solutions.

The CM strategy, as shown in Fig. 3, partitions the population into Z subpopulations based on fitness values, where the z-th subpopulation is denoted as $L_z$ (z = 1, …, Z). In this study, Z equals 5, and the number of individuals in each subpopulation is N/Z. These subpopulations, originating from $L_z$, $L_{z-1}$, and $L_{z+1}$, together form a circular topology structure, as illustrated in Fig. 3(a). The subpopulation $L_z$, its preceding subpopulation $L_{z-1}$, and its subsequent subpopulation $L_{z+1}$ collectively constitute a cluster denoted as $C_z$ (z = 1, ..., Z), with each cluster having 3N/Z individuals. Note that when z is 1, representing the first subpopulation $L_1$, the last subpopulation $L_Z$ acts as the preceding subpopulation to $L_1$, as illustrated in Fig. 3(b). Conversely, when z is Z, representing the last subpopulation $L_Z$, the first subpopulation $L_1$ acts as the subsequent subpopulation to $L_Z$.

A tournament selection method is used to select 3N/Z individuals from the cluster $C_z$ for position updates. The update process involves two methods: one that utilizes the specific position information of the population individuals and another that does not use individual information. The method that utilizes the information of population individuals to update the individuals in cluster $C_z$ is represented by Eq. (23). From the formula, it can be observed that this approach involves mutation operations on the current individual using the position information of the best individual $X_{best}(t)$ or three randomly selected individuals $X_{h1}(t)$, $X_{h2}(t)$, and $X_{h3}(t)$ from the population. The method that does not utilize individual information for updates is shown in Eq. (24). This approach utilizes the position information in the search space for update operations.

$$M_i^j(t+1) = \begin{cases} normrnd\left(\dfrac{X_{best}^j(t) + X_i^j(t)}{2}, \left|X_{best}^j(t) - X_i^j(t)\right|\right), & r_{10} < \vartheta \\ X_{h1}^j(t) + r_{11} \times X_{h2}^j(t) - X_{h3}^j(t), & r_{10} \ge \vartheta \end{cases} \tag{23}$$

$$V_i^j(t+1) = r_{12} \times \left(ub + lb - X_i^j(t) - \frac{ub + lb}{2}\right) + \frac{ub + lb}{2} \tag{24}$$
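The polar-to-Cartesian mapping of Eqs. (17)-(19) and the two CM update rules of Eqs. (23) and (24) can be illustrated with a minimal Python sketch. This is not the authors' implementation (their pseudocode is in the appendices). The function names, the use of scalar bounds, the choice to draw $r_{10}$, $r_{11}$, and $r_{12}$ independently per dimension, and the passing of the threshold as an argument are all assumptions made for illustration. The second row of Eq. (23) is coded exactly as printed.

```python
import math
import random


def learning_region(phi):
    """Map dim-1 learning angles to a dim-dimensional direction, following Eqs. (17)-(19)."""
    dim = len(phi) + 1
    d = []
    for j in range(1, dim + 1):  # j is 1-based, as in the equations
        # Product of cos(phi^p) for p = j .. dim-1; empty (equal to 1) when j = dim.
        cos_prod = math.prod(math.cos(phi[p - 1]) for p in range(j, dim))
        if j == 1:
            d.append(cos_prod)                         # Eq. (17)
        else:
            d.append(math.sin(phi[j - 2]) * cos_prod)  # Eqs. (18) and (19)
    return d


def cluster_members(z, Z):
    """Return the 1-based sub-population indices (L_{z-1}, L_z, L_{z+1}) forming cluster C_z."""
    prev_z = Z if z == 1 else z - 1  # L_Z precedes L_1 (circular topology)
    next_z = 1 if z == Z else z + 1  # L_1 follows L_Z
    return prev_z, z, next_z


def mutate_with_individuals(x_i, x_best, x_h1, x_h2, x_h3, theta, rng):
    """Eq. (23), dimension by dimension; theta is the threshold supplied by the caller."""
    out = []
    for j in range(len(x_i)):
        if rng.random() < theta:  # r10 < theta: Gaussian sample around the midpoint
            mean = (x_best[j] + x_i[j]) / 2.0
            std = abs(x_best[j] - x_i[j])
            out.append(rng.gauss(mean, std))
        else:                     # r10 >= theta: second row of Eq. (23), as printed
            out.append(x_h1[j] + rng.random() * x_h2[j] - x_h3[j])
    return out


def mutate_with_bounds(x_i, lb, ub, rng):
    """Eq. (24): update that uses only the search-space bounds (scalar lb and ub assumed)."""
    mid = (ub + lb) / 2.0
    return [rng.random() * (ub + lb - x - mid) + mid for x in x_i]


if __name__ == "__main__":
    rng = random.Random(0)
    direction = learning_region([math.pi / 4] * 4)  # initial angles of pi/4, dim = 5
    print(sum(v * v for v in direction))            # close to 1: a unit direction
    print(cluster_members(1, 5))                    # (5, 1, 2)
    print(mutate_with_bounds([-3.0, 0.0, 7.5], -10.0, 10.0, rng))
```

Because the mapping is a hyperspherical parameterization, the resulting vector always has unit length, and the bounds-only update of Eq. (24) keeps every coordinate inside [lb, ub] when the input is inside the bounds.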
where $M_i^j(t+1)$ represents the mutated individual. The function normrnd(·) generates random values following a normal distribution with the given mean and standard deviation. $r_{10}$, $r_{11}$, and $r_{12}$ are three random numbers within the range [0,1]. $\vartheta$ is a random variable following a normal distribution, which controls the degree of utilization of the position information of the best individual $X_{best}(t)$ or the three randomly selected individuals $X_{h1}(t)$, $X_{h2}(t)$, and $X_{h3}(t)$ from the population. ub and lb represent the upper and lower bounds of the search space.

After the mutation operation described above, a population V containing all mutated individuals is generated. The population V and the original population X are sorted based on fitness values, and the top N individuals are selected for the next round of position updates. The pseudocode for the CM mechanism is shown in Appendix A.2.

Fig. 4. Flowchart of bECMRFO-FKNN framework.

Table 5
The information on comparison algorithms.

| Algorithm | Years | Parameter information | Refs |
|---|---|---|---|
| ECMRFO | Presented | p = 6. | \ |
| CGPSO | 2011 | c1 = c2 = 2; Wmax = 0.9; Wmin = 0; Wmin = 0.2; Vmax = 6. | [39] |
| SAHABC | 2015 | ε = 1e−16. | [40] |
| HSMA_WOA | 2020 | z = 0.03; CI = 12000. | [41] |
| WEMFO | 2021 | b = 1. | [42] |
| CLACO | 2021 | q = 0.5; ibslo = 1; k = 10; wMax = 0.9; wMin = 0.4. | [43] |
| RCACO | 2021 | q = 0.8; ibslo = 0.9; k = 10. | [44] |
| EHGSA | 2022 | Rnorm = 2 | [45] |
| CCDE | 2022 | CR = 0.9; F = 0.8; NC = 10. | [46] |
| ISNMWOA | 2022 | a1 = [0, 2]; a2 = [−2, −1]; alpha = 0.5. | [47] |
| HGSMA | 2023 | C1 = 0.5; C2 = 2; W = 1.4. | [48] |
| GBSMA | 2023 | r = [0, 1]. | [49] |

3.5. Complexity analysis

The computational complexity of the ECMRFO is built around five key components: initialization, fitness evaluation, population updating, EPS, and CM. Mainly influenced by the maximum number of evaluations (MaxFEs), population size (N), and problem dimension (dim), the overall time complexity is the sum of the following terms:

• The population size is N and the problem dimension is dim. The computational complexity of initializing all individuals is O(N × dim).
• The time complexity of the basic MRFO on the objective function with a dim-dimensional search space is O(N × MaxFEs × dim).
• In the EPS, each evaluation of ECMRFO updates the population position, calculates the fitness value of the generated candidate solution, and updates the current individual's position. The time complexity of implementing EPS is O(N × MaxFEs × dim).
• In the CM, each evaluation of ECMRFO updates the population position, calculates the fitness value of the generated candidate solution, and updates the current individual's position. The time complexity of implementing CM is O(N × MaxFEs × dim).

Therefore, the total time complexity of the proposed ECMRFO is O(N × dim + 3(N × MaxFEs × dim)).

4. Proposed bECMRFO-FKNN model

4.1. Discretization

The FS problem can be viewed as a binary classification problem, which is a discrete problem. Therefore, to apply the continuous algorithm ECMRFO to FS, a transfer function (TF) is needed to convert it into binary form, known as bECMRFO. TF is used to convert continuous variables into probability values. After obtaining the probability values, a random number is used to determine the final binary value (0 or 1). The specific conversion process is shown in Eq. (25) and Eq. (26).

$$X_i(t+1) = \begin{cases} \sim X_i(t), & sigmoid(X_i(t)) \ge rand_4 \\ X_i(t), & \text{otherwise} \end{cases} \tag{25}$$

$$sigmoid(x) = \left| erf\left(\frac{\sqrt{\pi}}{2} \cdot x\right) \right| \tag{26}$$

where $X_i(t)$ represents the current search agent, while the random number in Eq. (25) (written $rand_4$ in the equation and $r_{13}$ in the text) ranges between 0 and 1. The TF sigmoid(·) is purposefully designed to transform continuous values into discrete counterparts.

4.2. Fitness function

The purpose of FS is to extract effective features from the original features while improving classification accuracy. Therefore, in the context of wrapper-based FS problems, a fitness function Fit is designed as shown in Eq. (27). This fitness function is a linear combination of the classification error rate E and the number of selected features R. The optimization objective is shown in Eq. (28). A smaller Fit value indicates a better quality of the feature subset currently found.

$$Fit = \alpha \times E + \varphi \times \frac{|R|}{|N|} \tag{27}$$

$$\min(Fit) \tag{28}$$

where N represents the total number of features contained in the dataset and R the number of selected features. Additionally, α is a weight assigned to the classification error rate evaluation, and φ represents the weight of the number of features selected. Their values are set to 0.95 and 0.05, respectively.

Table 6
Wilcoxon signed-rank test results for ECMRFO and the other three algorithms.

| Algorithms | Rank | AVR | R+/R-/R= |
|---|---|---|---|
| ECMRFO | 1 | 1.86 | ~ |
| CMRFO | 2 | 2.07 | 6/3/20 |
| EMRFO | 3 | 2.28 | 12/3/14 |
| MRFO | 4 | 2.69 | 14/4/11 |

Table 7
Friedman's test ranking for ECMRFO and the other three algorithms.

| Algorithms | Rank | AVR |
|---|---|---|
| ECMRFO | 1 | 2.10 |
| CMRFO | 2 | 2.23 |
| EMRFO | 3 | 2.70 |
| MRFO | 4 | 2.96 |

4.3. The proposed bECMRFO-FKNN

Fig. 4 illustrates the detailed FS process in the bECMRFO-FKNN model. This model uses bECMRFO as the feature subset search method and FKNN as the classifier. First, the features are transformed into a binary representation using TF. Then, bECMRFO is employed as the search method to find the optimal feature subset. Finally, the performance of the selected feature subset is tested on the FKNN classifier.
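The discretization step of Eqs. (25) and (26) and the fitness of Eq. (27) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' MATLAB code: the function names are hypothetical, the current binary mask and the continuous position are passed as two separate lists (an assumption about how the bit-complement in Eq. (25) is applied), and the classification error rate is supplied by the caller instead of being computed by a classifier.

```python
import math
import random


def transfer(x):
    """Eq. (26): |erf(sqrt(pi) / 2 * x)|, a value between 0 and 1."""
    return abs(math.erf(math.sqrt(math.pi) / 2.0 * x))


def binarize(bits, position, rng):
    """Eq. (25): complement a bit when transfer(x) >= a fresh uniform random number.

    bits     -- current 0/1 feature mask (assumed representation)
    position -- continuous position produced by the optimizer, one value per feature
    """
    return [1 - b if transfer(x) >= rng.random() else b
            for b, x in zip(bits, position)]


def fitness(error_rate, n_selected, n_total, alpha=0.95, phi=0.05):
    """Eq. (27): weighted sum of error rate and fraction of selected features."""
    return alpha * error_rate + phi * n_selected / n_total


if __name__ == "__main__":
    rng = random.Random(1)
    mask = binarize([1, 0, 1, 0], [0.0, 0.0, 10.0, -10.0], rng)
    print(mask)  # zero positions keep their bits, large |x| flips them
    print(fitness(error_rate=0.02, n_selected=sum(mask), n_total=len(mask)))
```

With the weights 0.95 and 0.05 reported above, a subset that lowers the error rate is almost always preferred, and the feature-count term mainly breaks ties between subsets of similar accuracy.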
Fig. 5. Result of BDT for ablation experiment with α = 0.05 and α = 0.1.

5. Designs for experiments

The experiments in this study are divided into two parts. First, a series of experiments are conducted on CEC 2017 benchmark functions to comprehensively evaluate the performance of ECMRFO. Subsequently, the proposed bECMRFO-FKNN model is applied to predict lung cancer to explore the performance of ECMRFO in handling FS problems.

Evaluation of ECMRFO's Optimization Capability. ECMRFO's global optimization capability is assessed using the CEC 2017 benchmark functions. Because test function F2 is strongly unstable in high dimensions, it is excluded from testing, leaving a total of 29 test functions. These 29 test functions are categorized into four types: unimodal, multi-modal, hybrid functions, and composition functions. The details of these test functions can be found in Ref. [35]. In order to conduct a comprehensive evaluation of the algorithms in this section, we have chosen 11 SOTA algorithms and compared their performance with that of ECMRFO: particle swarm optimization with chaotic and Gaussian local search processes (CGPSO), a self-adaptive hybrid enhanced artificial bee colony algorithm (SAHABC), WOA-enhanced SMA (HSMA_WOA), moth flame optimization with double adaptive weight mechanism (WEMFO), ACO with the Cauchy mutation and the greedy Levy mutation (CLACO), ACO with random spare strategy and chaotic intensification strategy (RCACO), gravitational search algorithm with hierarchical structure (EHGSA), clustering center-based differential evolution (CCDE), WOA driven by the information sharing search mechanism and the Nelder-Mead simplex mechanism (ISNMWOA), hierarchical guided slime mould algorithm (SMA) (HGSMA), and Gaussian barebone mutation enhanced SMA (GBSMA). The specific algorithms, along with their respective parameter settings, are detailed in Table 5. To maintain consistency and ensure a fair comparison, the dimension dim and population size N are both set to 30, and the maximum number of evaluations MaxFEs is set to 300,000. Additionally, the initial search points are randomly generated within the same initialization range for all algorithms. Moreover, each algorithm is executed independently 30 times on each benchmark function. This allows us to calculate the average value (Avg) and standard deviation (Std) for each algorithm's performance. The best results obtained are indicated in bold.

Furthermore, non-parametric statistical tests, including the Wilcoxon signed-rank test (WST) [36] and the Friedman test (FRT) [37] with Bonferroni post-hoc tests (BDT) [38], are used for comparative analysis to assess the statistical significance of the results. In the Wilcoxon signed-rank test, symbols 'R+', 'R-', and 'R=' represent the number of functions for which ECMRFO's performance is significantly better, significantly worse, and statistically equivalent, respectively, compared to the other algorithms at a significance level of α = 0.05.

The FS ability of bECMRFO-FKNN on real lung cancer data. In this part, ECMRFO is combined with several classical classifiers, including FKNN, kernel extreme learning machine (KELM), support vector machine (SVM), multilayer perceptron (MLP), and KNN, for the prediction of lung cancer to determine the best-performing classifier. Subsequently, experiments are conducted with eight different transfer functions to identify the optimal one using the bECMRFO-FKNN model. Then, a detailed comparison is made between bECMRFO-FKNN and various classical classification models (classification decision tree (CART), the ensemble technique AdaBoost, extreme learning machine (ELM), random forest (RandomF), and back-propagation neural network (BP)) and other FS models based on evolutionary computing methods (binary random following ACO (bRFACO) [30], multi-strategy ensemble binary hunger games search (MS_bHGS) [50], binary chaotic diffusion-limited aggregation enhanced grey wolf optimizer (GWO) (bSCGWO) [51], binary teaching-learning-based optimization algorithm with reinforcement learning strategy (bRLTLBO) [52], improved binary gaining-sharing knowledge-based algorithm with mutation (IBGSK) [53], binary enhanced GWO with sine initialization strategy, wormhole strategy, and elimination strategy (bSWEGWO) [54], and binary MRFO (bMRFO) [55]). The parameters for these models are set according to their respective original papers. Six common evaluation metrics, namely specificity, sensitivity, precision, classification accuracy (ACC), Matthews correlation coefficient (MCC), and F-measure, are used to analyze the experimental results. All experiments are implemented in MATLAB and performed on Windows Server 2012 R2 data edge with 128 GB of RAM and an Intel(R) Xeon(R) E5-2650v4 CPU (2.20 GHz).
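The six evaluation metrics named above all follow from the four counts of a binary confusion matrix. The Python sketch below uses the standard textbook definitions; the function name is hypothetical, and treating malignant cases as the positive class is an assumption, since the text does not state which class is positive.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive cases (assumed here to be malignant) predicted right/wrong.
    tn/fp: negative cases (assumed benign) predicted right/wrong.
    Any ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    acc = ratio(tp + tn, tp + fp + tn + fn)
    f_measure = ratio(2.0 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {
        "ACC": acc,
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Precision": precision,
        "MCC": mcc,
        "F-measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical fold: 7 true positives, 1 false positive, 8 true negatives, 0 false negatives.
    for name, value in binary_metrics(tp=7, fp=1, tn=8, fn=0).items():
        print(f"{name}: {value:.4f}")
```

In a 10-fold cross-validation setting such as the one used later for the lung cancer data, these values would be computed per fold and then averaged to give the Avg and Std entries.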
Fig. 6. Convergence curve and boxplot of the ablation experiment.

6. Numerical results

Experimental results on benchmark functions and feature selection using ECMRFO will be presented in this section. First, the effectiveness of EPS and CM in ECMRFO will be validated in section 6.1.1. The efficacy of these two strategies in improving the performance of MRFO will be verified through ablation experiments conducted on CEC 2017. Then, ECMRFO and MRFO will be tested on different dimensions in CEC 2017 to investigate the impact of dimensionality on their performance. Subsequently, ECMRFO will be compared with 11 SOTA algorithms in CEC 2017 to further demonstrate its superior performance. The analysis of experimental results from the perspectives of Avg, Std, and different statistical tests will be presented in section 6.1.2. After validating the optimization capability of ECMRFO, a wrapper-based FS method is proposed based on ECMRFO for selecting important features and predicting lung cancer using the FKNN classifier. The performance of the proposed prediction model is compared with traditional classification algorithms and wrapper-based FS methods based on other metaheuristic algorithms using lung cancer data. The FS experimental analysis results are provided in section 6.2.

6.1. Benchmarks validation

6.1.1. Ablation study

This section aims to validate the effectiveness of the EPS and CM strategies. Algorithms that use only the EPS strategy and only the CM strategy are named EMRFO and CMRFO, respectively. The comparative results between ECMRFO and EMRFO, CMRFO, and MRFO are shown in Table 6. The mean and Std results of these algorithms on test functions can be found in Appendix A.3. From Table 6, it can be observed that both CMRFO and EMRFO outperform the original MRFO on test functions, demonstrating the effectiveness of these two strategies. ECMRFO, which combines the EPS and CM strategies, ranks first overall. ECMRFO outperforms CMRFO, EMRFO, and MRFO on 6, 12, and 14 functions, respectively. Appendix A.4 presents the p-value results obtained by ECMRFO and the three comparative algorithms on the WST.

Table 7 shows the ranking results of the FRT, with ECMRFO ranking first, followed by CMRFO, EMRFO, and MRFO. To further validate the effectiveness of ECMRFO, a post-hoc BDT test was conducted based on the FRT results. In BDT, Critical Difference (CD) is used to determine whether there is a significant difference between algorithms. If the difference in FRT rankings of the two algorithms is greater than CD, it can be considered that there is a significant difference between the two algorithms. The specific calculation formula of CD is as follows:

$$CD = q_\alpha \times \sqrt{\frac{Ak \times (Ak + 1)}{6 \times Num}} \tag{29}$$

where α represents the significance level, and $q_\alpha$ is a critical value. Ak denotes the number of algorithms involved in the comparison. Num represents the number of test functions.

As shown in Fig. 5, at significance levels α = 0.05 and α = 0.1, the corresponding threshold values are 0.92 and 2.82. When the mean ranking of a compared algorithm is greater than the threshold, it can be said that ECMRFO has a significant difference from that algorithm. The figure indicates that ECMRFO and MRFO have a significant difference at these two significance levels. Drawing from these findings, it can be asserted that the incorporation of EPS and CM strategies serves as an effective means to augment the performance of the original algorithm.

Furthermore, ten test functions were selected for convergence speed, accuracy, and stability analysis, as shown in Fig. 6. It can be observed that the original MRFO quickly falls into local optima during the early evaluations, such as in F4, F5, F6, and F7. As a result, the corresponding boxplot values for MRFO on these functions are also relatively high. ECMRFO and CMRFO exhibit better convergence accuracy and speed than the other algorithms. However, in some functions, such as F8 and F16, the convergence curves of ECMRFO are initially similar to those of CMRFO but diverge later on, indicating that CMRFO falls into local optima earlier. The box plots in Fig. 6 demonstrate that ECMRFO has the narrowest interquartile range in most functions, indicating better stability.

6.1.2. Scalability of ECMRFO

The scalability of ECMRFO with respect to dimensionality is investigated in this section. We compare the performance of ECMRFO and MRFO on dimensions 30, 50, and 100. Table 8 presents the average ranking results of ECMRFO and MRFO across these three dimensions. ECMRFO performs well in all dimensions, and its optimization capability improves gradually with increasing dimensionality, as evidenced by a decrease in average ranking results. In contrast, MRFO's ability to handle high-dimensional problems diminishes as the dimensionality increases. Moreover, the specific results of ECMRFO and MRFO on the test functions are presented in Appendix A.5. From the results in the table, it can be observed that ECMRFO outperforms MRFO on most functions, while MRFO demonstrates excellent performance in solving composition function problems.

Table 8
Scalability results of ECMRFO with MRFO.

| Dimension | Algorithms | Rank | AVR |
|---|---|---|---|
| 30 | ECMRFO | 1 | 1.24 |
| 30 | MRFO | 2 | 1.59 |
| 50 | ECMRFO | 1 | 1.20 |
| 50 | MRFO | 2 | 1.59 |
| 100 | ECMRFO | 1 | 1.17 |
| 100 | MRFO | 2 | 1.62 |

6.1.3. Comparison experiment with SOTA methods

After validating the effectiveness of each component of ECMRFO, the proposed algorithm is compared in this section with 11 SOTA algorithms (CGPSO, CLACO, WEMFO, RCACO, SAHABC, EHGSA, ISNMWOA, CCDE, HG_SMA, GBSMA, and HSMA_WOA) to validate its effectiveness.

Table 9
WST results for ECMRFO and other comparison algorithms.

| Algorithms | Rank | AVR | R+/R-/R= |
|---|---|---|---|
| ECMRFO | 1 | 2.66 | ~ |
| CGPSO | 10 | 8.34 | 26/2/1 |
| CLACO | 5 | 5.21 | 19/4/6 |
| WEMFO | 11 | 9.03 | 28/0/1 |
| RCACO | 6 | 5.52 | 21/6/2 |
| SAHABC | 4 | 4.97 | 16/8/5 |
| EHGSA | 8 | 7.28 | 24/2/3 |
| ISNMWOA | 9 | 7.97 | 23/3/3 |
| CCDE | 12 | 10.55 | 27/0/2 |
| HG_SMA | 7 | 6.83 | 20/2/7 |
| GBSMA | 3 | 4.28 | 19/3/7 |
| HSMA_WOA | 2 | 4.03 | 18/2/9 |

Table 10
FRT ranking for ECMRFO and other comparison algorithms.

| Algorithms | Rank | AVR |
|---|---|---|
| ECMRFO | 1 | 3.03 |
| CGPSO | 11 | 8.49 |
| CLACO | 6 | 5.47 |
| WEMFO | 10 | 8.21 |
| RCACO | 5 | 5.45 |
| SAHABC | 4 | 5.38 |
| EHGSA | 8 | 7.34 |
| ISNMWOA | 9 | 7.86 |
| CCDE | 12 | 10.31 |
| HG_SMA | 7 | 7.02 |
| GBSMA | 3 | 4.79 |
| HSMA_WOA | 2 | 4.65 |

Fig. 7. Result of BDT for ECMRFO and comparison methods with α = 0.05 and α = 0.1.
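The Friedman-style average ranking and the critical difference of Eq. (29) are simple to compute. The following Python sketch shows one way to do it; the function names are hypothetical, ties are assumed to share the mean of the ranks they span, and the critical value $q_\alpha$ is left as an input because the section does not list the values used.

```python
import math


def average_ranks(scores):
    """Average rank of each algorithm over all test functions (lower score = better).

    scores[f][a] is the result of algorithm a on function f.
    Tied results share the mean of the ranks they occupy (assumed tie rule).
    """
    n_alg = len(scores[0])
    totals = [0.0] * n_alg
    for row in scores:
        order = sorted(range(n_alg), key=lambda a: row[a])
        i = 0
        while i < n_alg:
            j = i
            while j + 1 < n_alg and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
            for k in range(i, j + 1):
                totals[order[k]] += shared_rank
            i = j + 1
    return [t / len(scores) for t in totals]


def critical_difference(q_alpha, n_algorithms, n_functions):
    """Eq. (29): CD = q_alpha * sqrt(Ak * (Ak + 1) / (6 * Num))."""
    return q_alpha * math.sqrt(n_algorithms * (n_algorithms + 1) / (6.0 * n_functions))


if __name__ == "__main__":
    # Hypothetical mean errors of three algorithms on three functions.
    demo = [[1.0, 2.0, 3.0],
            [1.0, 3.0, 2.0],
            [2.0, 2.0, 5.0]]
    print(average_ranks(demo))
    # Square-root factor for the ablation setting (Ak = 4, Num = 29), roughly 0.339.
    print(critical_difference(1.0, 4, 29))
```

An algorithm is then flagged as significantly different from the control when its average rank exceeds the control's average rank by more than CD, which is how the thresholds drawn in the BDT figures can be read.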
Fig. 8. Convergence curve and boxplot of ECMRFO and SOTA methods.

Table 9 displays the average ranking outcomes derived from mean and standard deviation (Std) measurements, alongside the results of the Wilcoxon signed-rank test for further statistical analysis. For a more comprehensive understanding of the performance of all algorithms across the functions, please refer to Appendix A.6, which provides the average fitness values, Std, and ranking results for each algorithm. The results in the table indicate that ECMRFO achieved the best ranking. The R+ values for each algorithm are greater than R-. ECMRFO significantly outperforms CGPSO, WEMFO, and CCDE, while its performance is close to that of SAHABC, GBSMA, and HSMA_WOA. Appendix A.7 provides the p-values obtained by the comparative algorithms on the WST.

Table 10 shows the ranking results of ECMRFO and the comparative algorithms on the FRT. ECMRFO achieved the best results, followed by HSMA_WOA. The BDT results based on the FRT results are shown in Fig. 7. The threshold values for significance levels α = 0.05 and α = 0.1 are 5.61 and 5.39, respectively. The results in the figure indicate that ECMRFO has a significant difference from CGPSO, WEMFO, EHGSA, ISNMWOA, CCDE, and HG_SMA at both significance levels. At the α = 0.1 significance level, it has a significant difference from CLACO, RCACO, and SAHABC. There is no significant difference between ECMRFO and GBSMA and HSMA_WOA at these two significance levels.

Fig. 8 displays the convergence and stability of the comparative algorithms on the 10 test functions. From Fig. 8, it can be observed that ECMRFO can find the minimum value and converge faster compared to the other eleven algorithms. Most algorithms get trapped in local optima in the later evaluations, while ECMRFO can escape local optima effectively, indicating the effectiveness of the EPS and CM strategies. The box plots in Fig. 8 show that ECMRFO has the narrowest box and the smallest values. Overall, ECMRFO outperforms the other comparative algorithms.

6.2. Application in lung cancer diagnosis

After verifying the optimization performance of ECMRFO on the benchmark functions, this section demonstrates the ability of ECMRFO to solve practical feature selection problems. In the field of machine learning, wrapper-based FS methods are widely adopted [56–58]. Their main goal is to construct an optimal subset of features from the original feature set to enhance model performance. However, the effectiveness of wrapper-based FS is largely influenced by the evaluation of the selected feature subset by the classifier. Therefore, when applying wrapper-based FS methods to predict the pathological types of lung cancer (LC), seeking an appropriate and efficient classifier becomes crucial.

In pursuit of this objective, the proposed bECMRFO is combined with a variety of classical classifiers: FKNN, KELM, MLP, SVM, and KNN. The parameter settings and model calls for the above methods come from the wrapper functions of fuzzy_knn(), trainKELM(), patternnet(), fitcsvm() and fitcknn() in Matlab, respectively. Combining these classifiers with the FS algorithm is intended to improve the prediction of LC pathology types. The parameter configurations for these five wrapper FS methods are presented in Table 11.

Table 11
Parameter settings for the five methods.

| Methods | Parameter values |
|---|---|
| bECMRFO_FKNN | k = 1; m = 2; |
| bECMRFO_KELM | c = 88; γ = 1024; |
| bECMRFO_MLP | p = 1; |
| bECMRFO_SVM | c = 850; γ = 0.17; |
| bECMRFO_KNN | k = 1. |

Fig. 9 depicts histograms portraying the mean errors of these five models in the prediction of LC pathology types. Notably, bECMRFO_FKNN demonstrates superior performance, while bECMRFO_MLP performs worst among the five. bECMRFO_FKNN, bECMRFO_KELM, and bECMRFO_SVM exhibit comparable performance levels regarding metrics such as ACC, Specificity, and Precision. Nevertheless, with regard to Sensitivity, MCC, and F-measure, bECMRFO_FKNN outperforms its counterparts. Consequently, bECMRFO_FKNN is selected as the predictive model for LC pathology types in the following experiments.

Fig. 9. The results of the comparison of bECMRFO combining these five classifiers.

The S-type and V-type transfer functions are widely used and serve as the primary transfer functions in FS [54]. A thorough exposition of these transfer functions is available in the literature [59]. To determine which transfer function best suits the proposed bECMRFO_FKNN prediction model, we undertook a comprehensive comparative study. Each transfer function was individually assessed in tandem with the bECMRFO_FKNN model. The transfer function that performed best on the LC pathology type data was then chosen to enhance the predictive capabilities of the model.

Fig. 10 showcases histograms illustrating the average errors associated with the S-type and V-type transfer functions when applied to bECMRFO_FKNN. Combining bECMRFO_FKNN with the S2–S4 and V3–V4 transfer functions yields favorable outcomes, whereas the pairing with the S1, V1, and V2 transfer functions demonstrates suboptimal performance. Notably, the V4 transfer function presents substantial advantages concerning ACC, Specificity, and Precision metrics. Its performance aligns comparably with other transfer functions in terms of accuracy, MCC, and F-measure. Consequently, in subsequent experiments, the V4 transfer function was employed alongside bECMRFO_FKNN for LC pathology-type prediction.

Fig. 10. The comparison results of bECMRFO_FKNN combined with different transfer functions.

To assess the effectiveness of the bECMRFO FS approach introduced in this study, bECMRFO_FKNN was compared with classifiers that do not use FS. The classifiers considered were AdaBoost, RandomF, CART, ELM, and BP.

Fig. 11 illustrates a comparison between the mean error histograms of bECMRFO_FKNN and classifiers that do not utilize FS techniques. Across the spectrum of assessment metrics, bECMRFO_FKNN consistently demonstrates a superior and reliably stable level of performance, notably surpassing AdaBoost and RandomF. Conversely, the predictive efficacy of CART, ELM, and BP appears to be relatively suboptimal. These findings highlight the substantial predictive capabilities of the proposed bECMRFO FS approach in LC pathology type prediction and support its application in medical research.

Fig. 11. Comparison of bECMRFO_FKNN with well-known classifiers.
Table 12
Avg, Std, and Rank of predicted LC results of eight FS models.
Methods ACC Sensitivity Specificity
Avg Std Rank Avg Std Rank Avg Std Rank
bECMRFO-FKNN 99.38% 0.02 1 100.0% 0.00 1 98.89% 0.04 1

bRFACO-FKNN 97.46% 0.03 5 97.14% 0.06 3 97.78% 0.05 3
MS_bHGS-FKNN 94.29% 0.05 8 94.05% 0.08 8 94.44% 0.08 8
bSCGWO-FKNN 98.08% 0.03 3 97.14% 0.06 3 98.89% 0.04 1
bRLTLBO-FKNN 98.75% 0.03 2 100.0% 0.00 1 97.78% 0.05 3
IBGSK-FKNN 97.50% 0.03 4 97.14% 0.06 3 97.78% 0.05 3
bSWEGWO-FKNN 95.50% 0.05 7 95.71% 0.07 6 95.56% 0.08 7
bMRFO-FKNN 96.13% 0.05 6 95.71% 0.10 7 96.53% 0.06 6
Methods Precision MCC F-measure
Avg Std Rank Avg Std Rank Avg Std Rank
bECMRFO-FKNN 98.75% 0.04 1 98.82% 0.04 1 99.33% 0.02 1

bRFACO-FKNN 97.50% 0.05 3 95.15% 0.06 5 97.13% 0.04 4
MS_bHGS-FKNN 93.85% 0.08 8 89.05% 0.09 8 93.54% 0.05 8
bSCGWO-FKNN 98.75% 0.04 1 96.33% 0.06 3 97.80% 0.04 3
bRLTLBO-FKNN 97.50% 0.05 3 97.64% 0.05 2 98.67% 0.03 2
IBGSK-FKNN 97.50% 0.05 3 95.21% 0.06 4 97.13% 0.04 4
bSWEGWO-FKNN 94.82% 0.09 7 91.43% 0.10 7 94.94% 0.06 7
bMRFO-FKNN 96.25% 0.06 6 92.73% 0.09 6 95.56% 0.06 6
Fig. 12. Comparison of 8 FS algorithms.
Fig. 14. Result of 100 runs of feature selections.
applied to the LC dataset.

Table 12 displays a comprehensive breakdown of the outcomes ob
tained from the eight LC prediction models. The data demonstrates that
each of these models yields commendable results in LC prediction. It is of
particular significance to note that the bECMRFO-FKNN model, as pro
posed, exhibits superior performance relative to the remaining seven
prediction models, with its efficacy in LC prediction being notably
prominent. The specific performance evaluation criteria encompass ac
curacy (99.38%), sensitivity (100.0%), specificity (98.89%), precision
(98.75%), the Matthews correlation coefficient (98.82%), and the F-
measure (99.33%).
Fig. 12 presents the histogram depicting the average error derived
from the 10-fold CV statistics for the eight LC prediction models. Of all
the models under consideration, the bECMRFO-FKNN model, as pro
posed, stands out as the most superior and consistently robust across all
Fig. 13. Convergence curves of feature set evaluation fitness values. evaluated metrics. When contrasted with the subpar performance of the
bMRFO-FKNN model, it undeniably underscores the effectiveness of the
In order to further substantiate the efficacy of the proposed elite perturbation search strategy and the cyclic mutation strategy
bECMRFO_FKNN model in its prognostic capacity for LC pathological within the bECMRFO FS algorithm.
categorizations, a comparative analysis is undertaken vis-à-vis an anal Fig. 13 illustrates the convergence performance of the eight predic
ogous model, wherein it is conjoined with alternative FS algorithms, tion models concerning LC prognosis. Notably, it is essential to highlight
specifically, seven recently-emerged FS algorithms: bRFACO, MS_bHG, that the bECMRFO-FKNN model exhibits remarkable convergence ve
bSCGWO, bRLTLBO, IBGSK, bSWEGWO, and bMRFO. The fusion of locity and exceptional convergence precision, surpassing the remaining
these FS algorithms with the FKNN classifier has led to the development seven models by a substantial margin. This implies that, in LC predic
of seven wrapper FS models. Subsequently, an assessment of the pre tion, the bECMRFO-FKNN model attains precise outcomes more swiftly,
dictive prowess of the bECMRFO_FKNN model is conducted in com aligning more closely with the actual observations.
parison to the predictive efficacy of these aforementioned models when In summary, the superior performance of the bECMRFO-FKNN
14
Fig. 15. Typical R-EBUS images of normal lung and type I -VI. A, normal lung. B, circle hyperechoic dense area without vessels and bronchioles (circle-dense sign,
Type I). C,hemi hyperechoic dense area without vessels and bronchioles(hemi-dense sign, type II). D, heterogeneous pattern with hyperechoic dots and short lines
(red arrow, blizzard sign, type III) and less short line on dense area (white arrow,short-linear-hyperecho, type IV); E, similar to onionskin middle echo (onionskin
sign, type V); F, focal hypo-echo on lesion area of vessel (red arrow,focal-low-echo sign, type VI).
model in LC prediction is evident from the above analysis. The proposed 7. Discussion
model excels across all the evaluated metrics, displaying greater stability
and notable convergence speed while maintaining accuracy. Conversely, In the task of identifying intrapulmonary lesions using R-EBUS im
the bMRFO-FKNN model exhibits comparatively inferior performance, underscoring the significance of the elite perturbation search strategy and the cyclic mutation strategy within the bECMRFO FS algorithm. Consequently, the bECMRFO-FKNN model is the preferred choice for LC prediction owing to its performance and reliability.

The aforementioned experiments demonstrate that the bECMRFO-FKNN model exhibits high stability, fast convergence, and high accuracy. These observations provide evidence of the model's capacity to identify superior feature subsets within LC datasets. To determine the pivotal feature subset for predicting LC pathological types, 100 FS trials were run with the bECMRFO-FKNN model to gauge the significance of the selected features in LC prediction.

Fig. 14 presents the outcomes of 100 rounds of FS conducted on the LC dataset using the bECMRFO-FKNN model. The horizontal axis lists the attributes in the LC dataset, and the vertical axis gives the number of times each attribute was selected. Notably, four attributes, C4 (R-EBUS2 Hemi-dense sign), C3 (R-EBUS1 Circle-dense sign), C13 (CCT5 mediastinum lymph node), and C7 (R-EBUS5 Onionskin sign), emerged as the most frequently selected, with reported selection counts of 34, 39, 31, 36, and 33, placing them in the top four across the 100 feature selections. This indicates that these four attributes have substantial influence on LC prediction. Their recurrent selection points to a pronounced association between these attributes and the occurrence of LC, suggesting that they likely carry pivotal information that can improve both the understanding and the predictive accuracy of LC.

ages, it is noteworthy that R-EBUS has the potential to offer distinctive information that can aid in distinguishing the nature of an intrapulmonary lesion based on image features, such as the presence of concentric circles [60].

Endobronchial ultrasonography can be employed to analyze the internal structure of peripheral pulmonary lesions [2]. Using internal structure evaluation based on internal echoes, vascular and bronchial patency, and the morphology of hyperechoic areas reflecting air in the alveoli and bronchioles, three classes and six subclasses of lesions have been identified by EBUS. The hyperechoic areas serve as indicators of the distribution of air within the alveoli and bronchioles. The classes of lesions are as follows: type I, circle-dense sign (circle hyperechoic dense area without vessels and bronchioles); type II, hemi-dense sign (hemi hyperechoic dense area without vessels and bronchioles); type III, blizzard sign (heterogeneous pattern with hyperechoic dots and short lines); type IV, short-linear-hyperechoic sign (fewer short lines on dense area); type V, onionskin sign (similar to onionskin middle echo); type VI, focal-low-echo sign (focal hypo-echo on lesion area), as shown in Fig. 15.

Fifty-four of 68 type I and II lesions (79.41%) were malignant, while twenty-nine of 87 type I and II lesions (33.33%) were benign. The combination of R-EBUS and virtual navigation proves to be a valuable tool for accurately pinpointing peripheral pulmonary lesions, as evidenced in a previous study [61]. However, comprehensive reports on the use of R-EBUS images specifically for identifying malignant lung lesions are lacking. Notably, a majority of malignant lung lesions exhibited characteristics falling within the type I or II categories on R-EBUS. Furthermore, it was observed that in the case of malignant lung lesions, the proportion of dense echo within the lesion exhibited an inverse correlation with the
quantity of tumor components. Moreover, the R-EBUS feature was influenced by the positional relation between the EBUS probe and the lesion. For a tumor lesion, R-EBUS frequently showed type I when the EBUS probe was located in the center of the lesion, whereas it may show type II (hemi-dense) when the probe was adjacent to the lesion. Both type I and type II indicated that the lesions were full of solid components, reflecting proliferation and infiltration of solid cells without air-containing bronchioles. Therefore, most type I and type II R-EBUS findings (79.41%) were observed in MLD cases. When the lesion was attributable to surrounding inflammatory cells and fluid exudation and contained less air, type V (onionskin sign) may be found frequently, a result confirmed in our study. Furthermore, enlarged mediastinal lymph nodes were often revealed by chest CT examination because malignant pulmonary disease is prone to invade the mediastinal lymph nodes.

Histological findings obtained from TBLB consistently demonstrated that the majority of lesions fell within the type I and II categories as identified by R-EBUS. Within MLD lesions, specific characteristics such as the presence of a circular pattern and the hemi-dense sign may signify areas that are particularly suitable for biopsy. Identifying these regions is of paramount importance. Moreover, establishing a correlation between thin-section CT findings and the various R-EBUS types holds the potential to significantly impact the accuracy of pathological diagnosis. This, in turn, can lead to precise prognostication and may even inform the application of innovative therapeutic approaches.

Because lung tumors, mostly owing to lung cancer, are full of tumor cells and connective tissue, their lesions consist of relatively solid tissue with little air. In contrast, inflammatory lesions contain many infiltrating inflammatory cells, fluid, and aerated bronchioles and alveoli. The use of bECMRFO-FKNN offers the significant advantage of enabling real-time diagnosis of pulmonary lesions. By distinguishing between benign and malignant lesions based on ultrasonic images, it has the potential to obviate unnecessary biopsies. This diagnostic approach not only enhances the precision of diagnosis but also contributes to safety in medical practice. The developed bECMRFO-FKNN diagnosis system is expected to play a pivotal role in interpreting EBUS findings that may pose challenges for clinicians, thereby aiding in the differentiation between benign lesions and lung cancers. In summary, the application of bECMRFO-FKNN in diagnosing peripheral pulmonary lesions stands as a valuable tool for achieving accurate lung cancer diagnosis.

However, the proposed method also has limitations. For ECMRFO, although the EPS and CM strategies improve the performance of MRFO, their incorporation also increases the time complexity. The increased complexity of ECMRFO may reduce the efficiency of the model, which highlights the need for future developments aimed at creating a low-complexity yet high-performance strategy. An additional constraint in this study was the absence of data related to benign diseases, resulting in a reduction in the specificity of the obtained results. It is important to acknowledge that bronchoscopy primarily addresses malignancies and, as such, prioritizes sensitivity over specificity. However, given the multicenter nature of this study, there is a clear need to gather data pertaining to benign diseases as well. In future research, we intend to investigate whether real-time assessment of EBUS data during bronchoscopy can enhance diagnostic accuracy.

8. Conclusions and future works

The original MRFO individual position update is related to the previous individual and the best individual, which reduces the population diversity and affects the final convergence quality. This paper introduces two novel search strategies, namely EPS and CM, to address the shortcomings of the original MRFO. EPS introduces the position information of the suboptimal solution and continuously approaches the optimal solution with disturbance. Furthermore, CM, which increases population diversity, plays a crucial role in avoiding local optima during the search process. The comparison results with 11 SOTA methods confirmed the superiority of ECMRFO. Furthermore, this paper proposes an ECMRFO-based FKNN method for identifying MLD. The comparison results indicate that the accuracy of determining MLD using bECMRFO-FKNN will facilitate more effective drug management and improve the diagnosis and treatment of MLD.

If bECMRFO-FKNN-assisted R-EBUS analysis can replace TBLB, it will increase the benefit of bronchoscopy for patients, improve the accuracy of the final diagnosis, and reduce the patients' economic burden. With the future development of minimally invasive ablation technology, malignant lesions identified by R-EBUS could be directly cured with bronchoscopic ablation, such as electromagnetic navigation ablation. During R-EBUS bronchoscopy procedures for malignant lung lesions, R-EBUS diagnosis and localized ablation may be performed accurately with the least damage to patients.

CRediT authorship contribution statement

Jie Xing: Writing – original draft. Chengye Li: Writing – original draft. Peiliang Wu: Methodology, Writing – review & editing. Xueding Cai: Resources. Jinsheng Ouyang: Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.compbiomed.2024.108038.

References

[1] N. Kurimoto, et al., Analysis of the internal structure of peripheral pulmonary lesions using endobronchial ultrasonography, Chest 122 (6) (2002) 1887–1894.
[2] N. Kurimoto, et al., Endobronchial ultrasonography using a guide sheath increases the ability to diagnose peripheral pulmonary lesions endoscopically, Chest 126 (3) (2004) 959–965.
[3] D. Anantham, M.S. Koh, Endobronchial ultrasound-guided transbronchial needle aspiration in the diagnosis and staging of lung cancer, Thoracic Cancer 1 (1) (2010) 9–16.
[4] Q. Zhang, et al., A machine learning framework for identifying influenza pneumonia from bacterial pneumonia for medical decision making, J. Comput. Sci. 65 (2022) 101871.
[5] L. Liu, et al., Ant colony optimization with Cauchy and greedy Levy mutations for multilevel COVID 19 X-ray image segmentation, Comput. Biol. Med. (2021) 136.
[6] A. Alimadadi, et al., Artificial intelligence and machine learning to fight COVID-19, Physiol. Genom. 52 (4) (2020) 200–202.
[7] A. Vaid, et al., Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York city: model development and validation, J. Med. Internet Res. 22 (11) (2020) e24018.
[8] K. Hu, et al., PPNet: pyramid pooling based network for polyp segmentation, Comput. Biol. Med. 160 (2023) 107028.
[9] Y. Dai, et al., MSEva: a musculoskeletal rehabilitation evaluation system based on EMG signals, ACM Trans. Sens. Netw. 19 (1) (2022). Article 6.
[10] Z. Wu, et al., An Effective Method for the Protection of User Health Topic Privacy for Health Information Services, World Wide Web, 2023.
[11] J.N. Siebert, et al., Deep learning diagnostic and severity-stratification for interstitial lung diseases and chronic obstructive pulmonary disease in digital lung auscultations and ultrasonography: clinical protocol for an observational case–control study, BMC Pulm. Med. 23 (1) (2023) 191.
[12] B. Zhou, et al., Lung mass density prediction using machine learning based on ultrasound surface wave elastography and pulmonary function testing 149 2 (2021) 1318.
[13] B. Bataille, et al., Integrated use of bedside lung ultrasound and echocardiography in acute respiratory failure: a prospective observational study in ICU, Chest 146 (6) (2014) 1586–1593.
[14] S. Siméon, et al., Point-of-care lung ultrasonography for early identification of mild COVID-19: a prospective cohort of outpatients in a Swiss screening center, BMJ Open 12 (6) (2022) e060181.
[15] T. Hotta, et al., Deep learning-based diagnosis from endobronchial ultrasonography images of pulmonary lesions, Sci. Rep. 12 (1) (2022) 13710.
[16] A. Dhillon, A. Singh, V.K. Bhalla, Biomarker identification and cancer survival prediction using random spatial local best cat swarm and Bayesian optimized DNN, Appl. Soft Comput. 146 (2023) 110649.
[17] M. Abd Elaziz, et al., An efficient artificial rabbits optimization based on mutation strategy for skin cancer prediction, Comput. Biol. Med. 163 (2023) 107154.
[18] S. Thawkar, et al., Breast cancer prediction using a hybrid method based on butterfly optimization algorithm and ant lion optimizer, Comput. Biol. Med. 139 (2021) 104968.
[19] N. Venkatesan, S. Pasupathy, B. Gobinathan, An efficient lung cancer detection using optimal SVM and improved weight based beetle swarm optimization, Biomed. Signal Process Control (2023) 105373.
[20] N. Liu, et al., ACO-KELM: anti coronavirus optimized kernel-based softplus extreme learning machine for classification of skin cancer, Expert Syst. Appl. 232 (2023) 120719.
[21] Q. Huang, H. Ding, M. Effatparvar, Breast cancer diagnosis based on hybrid SqueezeNet and improved chef-based optimizer, Expert Syst. Appl. 237 (2024) 121470.
[22] G. Hu, et al., Multi-strategy assisted chaotic coot-inspired optimization algorithm for medical feature selection: a cervical cancer behavior risk study, Comput. Biol. Med. 151 (2022) 106239.
[23] P. Maria Jesi, et al., HRSHO: a hybrid rain optimized spotted hyena optimizer for efficient feature selection in CNN-based sinusitis classification, Biomed. Signal Process Control 87 (2024) 105441.
[24] W. Zhao, Z. Zhang, L. Wang, Manta Ray Foraging Optimization: an Effective Bio-Inspired Optimizer for Engineering Applications, Engineering Applications of Artificial Intelligence, 2020, p. 87.
[25] K.K. Ghosh, et al., S-shaped versus V-shaped transfer functions for binary Manta ray foraging optimization in feature selection problem, Neural Comput. Appl. 33 (17) (2021) 11027–11041.
[26] I.H. Hassan, et al., An improved binary manta ray foraging optimization algorithm based feature selection and random forest classifier for network intrusion detection, Intell. Syst. Appl. 16 (2022) 200114.
[27] H. Tang, et al., Predicting green consumption behaviors of students using efficient firefly grey wolf-assisted K-nearest neighbor classifiers, IEEE Access 8 (2020) 35546–35562.
[28] J.M. Keller, M.R. Gray, J.A. Givens, A fuzzy K-nearest neighbor algorithm, IEEE Trans. Syst., Man, Cyber. 15 (4) (1985) 580–585.
[29] M. Dorigo, V. Maniezzo, A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Trans. Syst. Man Cybern. B Cybern.: Publ. IEEE Syst., Man, Cybernetics Soc. 26 (1) (1996) 29–41.
[30] X. Zhou, et al., Random following ant colony optimization: continuous and binary variants for global optimization and feature selection, Appl. Soft Comput. 144 (2023) 110513.
[31] N.N. Samany, M. Sheybani, S. Zlatanova, Detection of safe areas in flood as emergency evacuation stations using modified particle swarm optimization with local search, Appl. Soft Comput. 111 (2021) 107681.
[32] X. Yang, H. Li, Evolutionary-state-driven multi-swarm cooperation particle swarm optimization for complex optimization problem, Inf. Sci. 646 (2023) 119302.
[33] V. K, E.S. Gopi, Analyzing the performance improvement of hierarchical binary classifiers using ACO through Monte Carlo simulation and multiclass engine vibration data, Expert Syst. Appl. (2023) 121730.
[34] P. Zhang, et al., A novel human learning optimization algorithm with Bayesian inference learning, Knowl. Base Syst. 271 (2023) 110564.
[35] G. Wu, R. Mallipeddi, P. Suganthan, Problem Definitions and Evaluation Criteria for the CEC 2017 Competition and Special Session on Constrained Single Objective Real-Parameter Optimization, 2016.
[36] J. Derrac, et al., A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, Swarm Evol. Comput. 1 (1) (2011) 3–18.
[37] S. García, et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Inf. Sci. 180 (10) (2010) 2044–2064.
[38] J. Demšar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res. 7 (2006) 1–30.
[39] D. Jia, et al., A hybrid particle swarm optimization algorithm for high-dimensional problems, Comput. Ind. Eng. 61 (4) (2011) 1117–1122.
[40] H. Shan, T. Yasuda, K. Ohkura, A self adaptive hybrid enhanced artificial bee colony algorithm for continuous optimization problems, Biosystems 132–133 (2015) 43–53.
[41] M. Abdel-Basset, V. Chang, R. Mohamed, HSMA_WOA: a hybrid novel Slime mould algorithm with whale optimization algorithm for tackling the image segmentation problem of chest X-ray images, Appl. Soft Comput. 95 (2020) 106642.
[42] W. Shan, et al., Double adaptive weights for stabilization of moth flame optimizer: balance analysis, engineering cases, and medical diagnosis, Knowl. Base Syst. 214 (2021) 106728.
[43] L. Liu, et al., Ant colony optimization with Cauchy and greedy Levy mutations for multilevel COVID 19 X-ray image segmentation, Comput. Biol. Med. 136 (2021) 104609.
[44] D. Zhao, et al., Chaotic random spare ant colony optimization for multi-threshold image segmentation of 2D Kapur entropy, Knowl. Base Syst. (2021) 216.
[45] H. Li, et al., Gravitational search algorithm with hierarchical structure guided by elite individual, in: 2022 15th International Symposium on Computational Intelligence and Design (ISCID), 2022.
[46] R. Khosrowshahli, S. Rahnamayan, A.A. Bidgoli, Clustering center-based differential evolution, in: 2022 IEEE Congress on Evolutionary Computation (CEC), 2022.
[47] L. Peng, et al., Information sharing search boosted whale optimizer with Nelder-Mead simplex for parameter estimation of photovoltaic models, Energy Convers. Manag. 270 (2022) 116246.
[48] G. Hu, B. Du, G. Wei, HG-SMA: hierarchical guided slime mould algorithm for smooth path planning, Artif. Intell. Rev. 56 (9) (2023) 9267–9327.
[49] S. Wu, et al., Gaussian Bare-Bone Slime Mould Algorithm: Performance Optimization and Case Studies on Truss Structures, Artificial Intelligence Review, 2023.
[50] B.J. Ma, S. Liu, A.A. Heidari, Multi-strategy ensemble binary hunger games search for feature selection, Knowl. Base Syst. 248 (2022) 108787.
[51] J. Hu, A.A. Heidari, L.J. Zhang, et al., Chaotic diffusion-limited aggregation enhanced grey wolf optimizer: Insights, analysis, binarization, and feature selection, International Journal of Intelligent Systems 37 (8) (2022) 4864–4927.
[52] D. Wu, et al., An Improved Teaching-Learning-Based Optimization Algorithm with Reinforcement Learning Strategy for Solving Optimization Problems, Computational Intelligence and Neuroscience 2022 (2022) 1535957.
[53] G. Xiong, et al., Improved binary gaining–sharing knowledge-based algorithm with mutation for fault section location in distribution networks, J. Comput. Des. Eng. 9 (2) (2022) 393–405.
[54] X. Yang, et al., An optimized machine learning framework for predicting intradialytic hypotension using indexes of chronic kidney disease-mineral and bone disorders, Comput. Biol. Med. 145 (2022) 105510.
[55] W. Zhao, Z. Zhang, L. Wang, Manta ray foraging optimization: an effective bio-inspired optimizer for engineering applications, Eng. Appl. Artif. Intell. 87 (2020) 103300.
[56] X. Zhou, et al., Boosted local dimensional mutation and all-dimensional neighborhood slime mould algorithm for feature selection, Neurocomputing 551 (2023) 126467.
[57] B.H. Nguyen, B. Xue, M. Zhang, A Constrained Competitive Swarm Optimizer With an SVM-Based Surrogate Model for Feature Selection, IEEE Transactions on Evolutionary Computation 28 (1) (Feb. 2024) 2–16.
[58] T. Li, et al., A binary individual search strategy-based bi-objective evolutionary algorithm for high-dimensional feature selection, Inf. Sci. 610 (2022) 651–673.
[59] S. Mirjalili, A. Lewis, S-shaped versus V-shaped transfer functions for binary Particle Swarm Optimization, Swarm Evol. Comput. 9 (2013) 1–14.
[60] T.-Y. Chao, et al., Differentiating peripheral pulmonary lesions based on images of endobronchial ultrasonography, Chest 130 (4) (2006) 1191–1197.
[61] I. Takehiro, et al., Radial endobronchial ultrasound images for ground-glass opacity pulmonary lesions, Eur. Respir. J. 45 (6) (2015) 1661.