Abstract
Biological data are generally complex and high-dimensional. In addition, biological datasets usually contain far fewer samples than features, so the features must be screened carefully to determine an optimal subset. Feature selection (FS) is a vital stage in biological data mining applications (e.g., classification) for mitigating the curse of dimensionality and finding highly informative features. This work proposes an effective FS approach for biological data classification based on a new version of the Grey Wolf Optimizer (GWO), called the Multi-strategy Grey Wolf Optimizer (MSGWO). MSGWO is used to find the optimal subset of features for discriminating between classes, while addressing premature convergence and enhancing the local search ability of the GWO algorithm. Multiple exploration and exploitation strategies are proposed to enhance the global and local search abilities of GWO during the optimization process. A support vector machine (SVM) classifier is used to evaluate the proposed GWO-based FS approaches. MSGWO was evaluated on thirteen high-dimensional biological datasets with small numbers of instances, obtained from the UCI repository. The reported results confirm that employing multiple exploration and multiple exploitation strategies is highly useful for enhancing the search tendency of MSGWO in the FS domain. Statistical tests showed that the superiority of the proposed approach is statistically significant compared to the basic GWO and similar wrapper-based FS techniques, including binary particle swarm optimization (BPSO), the binary bat algorithm (BBA), the binary gravitational search algorithm (BGSA), and the binary whale optimization algorithm (BWOA). In terms of classification accuracy, MSGWO yielded better accuracy rates than the standard GWO algorithm on 84% of the biological datasets used, and better accuracy rates than its other competitors in all 13 cases. In terms of the number of selected features, MSGWO achieved excellent reduction rates compared to its peers.
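To make the wrapper idea in the abstract concrete, the following is a minimal sketch of feature selection driven by the standard GWO position update. It is not the MSGWO method itself, whose strategies are not described in this section. Several parts are assumptions made for illustration: the weighted fitness with `alpha=0.99`, the thresholding of continuous positions into a binary mask, and the hypothetical `toy_error` function, which stands in for the cross-validated SVM error that a real wrapper would compute.

```python
import random

def wrapper_fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classifier error and fraction of selected features (lower is better).
    The weight alpha=0.99 is an assumed value, not taken from the article."""
    n_selected = sum(mask)
    if n_selected == 0:
        return 1.0  # an empty subset cannot be classified; give it the worst score
    return alpha * error_rate(mask) + (1 - alpha) * n_selected / len(mask)

def to_mask(position):
    """Binarize a continuous position; x > 0 is equivalent to sigmoid(x) > 0.5 (assumed rule)."""
    return [1 if x > 0 else 0 for x in position]

def binary_gwo(n_features, error_rate, n_wolves=8, n_iter=30, seed=0):
    """Standard GWO update (alpha, beta, delta leaders) applied to feature masks."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_fit = None, float("inf")

    def score(w):
        return wrapper_fitness(to_mask(w), error_rate)

    for t in range(n_iter):
        # The three fittest wolves lead the pack in this iteration.
        leaders = [list(w) for w in sorted(wolves, key=score)[:3]]
        if score(leaders[0]) < best_fit:
            best_mask, best_fit = to_mask(leaders[0]), score(leaders[0])
        a = 2 - 2 * t / n_iter  # decreases from 2 towards 0: exploration, then exploitation
        for w in wolves:
            for d in range(n_features):
                pulls = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    pulls.append(leader[d] - A * abs(C * leader[d] - w[d]))
                w[d] = sum(pulls) / 3  # average of the three leader-guided moves

    # Evaluate the positions produced by the final update as well.
    for w in wolves:
        if score(w) < best_fit:
            best_mask, best_fit = to_mask(w), score(w)
    return best_mask, best_fit

# Hypothetical stand-in for a classifier: only features 0, 1 and 2 carry signal.
INFORMATIVE = (0, 1, 2)

def toy_error(mask):
    missing = sum(1 - mask[i] for i in INFORMATIVE)
    return 0.5 * missing / len(INFORMATIVE)

mask, fit = binary_gwo(10, toy_error)
print(mask, round(fit, 4))
```

In a real experiment, `toy_error` would be replaced by a function that trains and validates a classifier (the article uses an SVM) on the columns selected by `mask`. The small weight on subset size reflects the usual priority of accuracy over feature reduction in wrapper-based FS.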

Data Availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Acknowledgements
This work is supported by the research committee at Birzeit University.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest and no non-financial competing interests. No funding was received for this work.
Human participants
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Mafarja, M., Thaher, T., Too, J. et al. An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification. Neural Comput & Applic 35, 1749–1775 (2023). https://doi.org/10.1007/s00521-022-07836-8
DOI: https://doi.org/10.1007/s00521-022-07836-8