Abstract
Big data raises new challenges and specialized technical difficulties for both academic research groups and commercial IT deployments. Rich big data sources are characterized by data streams and by the curse of dimensionality, which makes it difficult to analyze such data accurately for decision-making systems. Many domains now handle large datasets that contain a large number of superfluous features. The main aim of feature selection techniques is to eliminate noisy, redundant, or irrelevant features that degrade classification performance. This work implements feature selection using Information Gain, Bacterial Foraging Optimization (BFO), and a hybrid BFO for big data. Results on several datasets show that the proposed approach, evaluated with Naïve Bayes and KNN classifiers, performs better than the methods reported in the literature.
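As an illustration of the Information Gain filter named in the abstract, the following is a minimal sketch for discrete features, not the authors' implementation. The function names (`entropy`, `information_gain`, `select_top_k`) and the column-dictionary input format are assumptions made for this example; the abstract does not specify how features are scored or how many are retained.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    # Group the labels by the value the feature takes on each sample.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest information gain.

    `columns` is a hypothetical input format: a dict mapping a feature
    name to its list of per-sample values.
    """
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    X = {"informative": [0, 0, 1, 1], "noise": [0, 1, 0, 1]}
    print(select_top_k(X, y, k=1))
```

A filter of this kind ranks each feature independently of the others, which is cheap but cannot detect redundancy between features; a search-based method such as BFO is typically used to evaluate feature subsets instead.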



Cite this article
Madhusudhanan, B., Sumathi, P., Karpagam, N.S. et al. An hybrid metaheuristic approach for efficient feature selection. Cluster Comput 22 (Suppl 6), 14541–14549 (2019). https://doi.org/10.1007/s10586-018-2337-2
DOI: https://doi.org/10.1007/s10586-018-2337-2