An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification

  • Original Article
Neural Computing and Applications

Abstract

Biological data generally contain complex, high-dimensional samples. In addition, the number of samples in biological datasets is much smaller than the number of features, so features must be selected carefully to determine the optimal subset. Feature selection (FS) is a vital stage in biological data mining applications (e.g., classification) for dealing with the curse of dimensionality and finding highly informative features. This work proposes an effective FS approach based on a new version of the Grey Wolf Optimizer (GWO), called the Multi-strategy Grey Wolf Optimizer (MSGWO), for better feature selection in biological data classification. MSGWO is used in feature selection to find the optimal subset of features for separating the classes, to address premature convergence, and to enhance the local search ability of the GWO algorithm. Multiple exploration and exploitation strategies are proposed to enhance the global and local search abilities of the GWO algorithm throughout the optimization process. The support vector machine (SVM) classifier is used to evaluate the proposed GWO-based FS approaches. MSGWO was evaluated on thirteen high-dimensional biological datasets with small numbers of instances, obtained from the UCI repository. The reported results confirm that employing multiple exploration and multiple exploitation strategies is highly useful for enhancing the search tendency of MSGWO in the FS domain. Statistical tests showed that the superiority of the proposed approach is statistically significant compared to the basic GWO and similar wrapper-based FS techniques, including binary particle swarm optimization (BPSO), binary bat algorithm (BBA), binary gravitational search algorithm (BGSA), and binary whale optimization algorithm (BWOA). In terms of classification accuracy, MSGWO yielded better accuracy rates than the standard GWO algorithm on 84% of the biological datasets used. MSGWO also recorded better accuracy rates than its other competitors in all 13 cases. In terms of the number of selected features, MSGWO yielded excellent reduction rates compared to its peers.
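
To make the wrapper-based setting described in the abstract more concrete, the following is a minimal sketch of a plain binary GWO used for feature selection. It is not the MSGWO method of the article: the abstract does not specify the multi-strategy operators, so only the standard GWO leader-guided update is shown, binarized with a sigmoid transfer function (a common choice in binary GWO variants). The weighted fitness with `alpha=0.99`, the function names `fitness`, `binary_gwo`, and `toy_error`, and all parameter values are assumptions made for illustration. In a real wrapper, `error_rate` would return the cross-validated error of an SVM trained on the selected columns; here a hypothetical toy function stands in for it.

```python
import math
import random


def fitness(mask, error_rate, alpha=0.99):
    """Assumed wrapper objective: weighted error plus fraction of features kept."""
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot be classified
    return alpha * error_rate(mask) + (1.0 - alpha) * n_selected / len(mask)


def binary_gwo(n_features, error_rate, n_wolves=8, n_iters=30, seed=0):
    """Plain binary GWO sketch (standard update, not the article's MSGWO)."""
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask = list(wolves[0])
    best_fit = fitness(best_mask, error_rate)

    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) guide the pack.
        leaders = sorted(wolves, key=lambda w: fitness(w, error_rate))[:3]
        top_fit = fitness(leaders[0], error_rate)
        if top_fit < best_fit:
            best_mask, best_fit = list(leaders[0]), top_fit

        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 towards 0
        new_wolves = []
        for w in wolves:
            new_w = []
            for d in range(n_features):
                x = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    x += leader[d] - A * abs(C * leader[d] - w[d])
                x /= 3.0  # average of the three leader-guided positions
                # Sigmoid transfer function maps x to a bit probability.
                prob = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
                new_w.append(1 if prob >= rng.random() else 0)
            new_wolves.append(new_w)
        wolves = new_wolves

    # Score the final population as well.
    for w in wolves:
        f = fitness(w, error_rate)
        if f < best_fit:
            best_mask, best_fit = list(w), f
    return best_mask, best_fit


def toy_error(mask):
    """Hypothetical stand-in for SVM error: only features 0 and 3 help."""
    return 0.5 - 0.2 * (mask[0] + mask[3])


mask, score = binary_gwo(6, toy_error)
print(mask, score)
```

The small weight on subset size reflects the usual priority in wrapper FS, where accuracy dominates and the feature count acts as a tie-breaker; other weightings are equally plausible.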

Data Availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Acknowledgements

This work is supported by the research committee at Birzeit University.

Author information

Corresponding author

Correspondence to Essam H. Houssein.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest. Non-financial competing interests. No funding was received for this work.

Human participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mafarja, M., Thaher, T., Too, J. et al. An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification. Neural Comput & Applic 35, 1749–1775 (2023). https://doi.org/10.1007/s00521-022-07836-8

  • DOI: https://doi.org/10.1007/s00521-022-07836-8
