Abstract
In this paper, VisDroid, a novel generic image-based classification method, is proposed and developed for classifying Android malware samples into their families. To this end, five grayscale image datasets, each containing 4850 samples, have been constructed from different files in the contents of the Android malware samples' sources. Two types of image-based features have been extracted and used to train six machine learning classifiers: Random Forest, K-nearest neighbour, Decision Tree, Bagging, AdaBoost and Gradient Boost. The first type comprises local features, namely Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), Oriented FAST and Rotated BRIEF (ORB) and KAZE features. The second type comprises global features, namely Colour Histogram, Hu Moments and Haralick Texture. Furthermore, a hybridized ensemble voting classifier has been proposed to test the efficiency of using several machine learning classifiers, trained on local and global features, as voters in an ensemble voting classifier. Moreover, two well-known deep learning models, i.e. Residual Neural Network and Inception-v3, have been tested using some of the constructed image datasets. Finally, a comparison with some state-of-the-art works revealed that the proposed model outperforms the compared previous models in terms of classification accuracy, computational time, generality and classification mode.
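The pipeline summarized in the abstract (file bytes rendered as a grayscale image, a global feature computed from that image, and several classifiers voting on the family) can be illustrated with a minimal sketch. This is not the authors' implementation: the image width, the zero padding, the 16-bin histogram, the plain hard-majority vote and the labels `family_a` and `family_b` are all assumptions made for illustration, and the hybridized voting scheme of the paper may differ.

```python
from collections import Counter
from typing import List, Sequence


def bytes_to_grayscale(data: bytes, width: int = 64) -> List[List[int]]:
    """Lay raw bytes out row by row as a grayscale image (one byte = one pixel).

    The fixed width and the zero padding of the last row are assumptions.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)  # pad the last row with zeros
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def intensity_histogram(image: List[List[int]], bins: int = 16) -> List[float]:
    """Normalized histogram of pixel intensities, a simple global feature."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // 256] += 1  # map 0..255 onto 0..bins-1
            total += 1
    return [c / total for c in counts] if total else [0.0] * bins


def majority_vote(predictions: Sequence[str]) -> str:
    """Hard voting: return the label chosen by most voters.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    image = bytes_to_grayscale(bytes([0, 255, 16, 17, 200]), width=4)
    print(image)
    print(intensity_histogram(image))
    # Hypothetical family labels produced by three separately trained voters.
    print(majority_vote(["family_a", "family_b", "family_a"]))
```

In practice the histogram would be replaced or complemented by the local and global descriptors named in the abstract, and each voter would be a trained classifier rather than a fixed label.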





Acknowledgements
The authors thank Sen Chen, Minhui Xue, Lingling Fan, Shuang Hao, Lihua Xu and Haojin Zhu for sharing their KuafuDet malware Dataset. Also, the authors thank Daniel Arp, Michael Spreitzenbarth, Malte Hubner, Hugo Gascon and Konrad Rieck for sharing their Drebin Android malware dataset.
Funding
No grant has been received from any agency (whether in the public, commercial, or not-for-profit sectors) for this work.
Ethics declarations
Conflict of interest
The authors of the article have no relationship (either financial or personal) with any people or organizations that can affect or bias the paper’s contents.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Halit Bakır: Khaled Bakour’s name can be written in two different ways due to his dual citizenship.
About this article
Cite this article
Bakour, K., Ünver, H.M. VisDroid: Android malware classification based on local and global image features, bag of visual words and machine learning techniques. Neural Comput & Applic 33, 3133–3153 (2021). https://doi.org/10.1007/s00521-020-05195-w
DOI: https://doi.org/10.1007/s00521-020-05195-w