Malware Detection in Android Applications

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019.
PDF URL: http://www.ijtsrd.com/papers/ijtsrd26449.pdf
Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/26449/malware-detection-in-android-applications/mr-tushar-patil

International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume 3 Issue 5, August 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470

Malware Detection in Android Applications


Mr. Tushar Patil, Prof. Bharti Dhote
Department of Computer Engineering, SIT Lonawala, SPPU, Pune, India

ABSTRACT
Android is a Linux-based operating system used for smartphone devices. Since 2008, Android devices have gained a huge market share due to the platform's open architecture and popularity. The increased popularity of Android devices and the associated benefits attracted malware developers. The rate of Android malware applications increased between 2008 and 2016. In this paper, we propose a dynamic malware detection approach for Android applications. In dynamic analysis, system calls are recorded to calculate the density of the system calls. For density calculation, we used two different lengths of system call sequences, 3-gram and 5-gram. Furthermore, the Naive Bayes algorithm is applied to classify applications as benign or malicious. The proposed algorithm is evaluated on 100 real-world samples of benign and malware applications. We observe that the proposed method gives effective and accurate results. The 3-gram Naive Bayes algorithm detects 84% of malware applications correctly and misclassifies 14% of benign applications. The 5-gram Naive Bayes algorithm detects 88% of malware applications correctly and misclassifies 10% of benign applications.

KEYWORDS: Malware Detection • Naive Bayes Classifier • System Calls • Frequency • Density

How to cite this paper: Mr. Tushar Patil | Prof. Bharti Dhote "Malware Detection in Android Applications" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, pp.2401-2403, https://doi.org/10.31142/ijtsrd26449 (Unique Paper ID: IJTSRD26449)

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

INTRODUCTION
Android is the most popular and fastest-growing mobile application development framework. Since 2008, the adoption rate of Android has increased quickly. Approximately 1.5 million Android devices are activated every day[18]. In the first quarter of 2017, Android held approximately 86.1% of the market[17]. It is an open-source platform based on the Linux kernel. Android OS is developed and maintained by Google and promoted by the Open Handset Alliance. Android applications are developed in Java, Python. Android provides very user-friendly functionality at truly low cost. Android users use their phones for storage, communication, Internet surfing, etc. To analyze malware, static, dynamic and hybrid analysis methods are used[1]. The static analysis method identifies malware by unpacking and decompiling the application. Most commercial anti-virus products use signature-based malware detection. Dynamic analysis identifies malware behavior after deploying and executing the application. Hybrid analysis is the combination of static and dynamic methods. There are two main steps to overcome malware: identification of malware and detection of malware. Application signature, permissions, and Dalvik bytecode are the parameters used for static analysis of malware[3]. System calls, network traffic, and user interactions are the parameters used by dynamic analysis[3]. The hybrid analysis technique uses combined features of the static and dynamic approaches.

In this work, we describe dynamic malware detection techniques. For dynamic analysis, we first install all samples on an emulator. Then, we run each application for 2-3 minutes and record its system calls. After that, we calculate the frequency of each system call. Next, we apply a filter to the system calls. The filtered system calls are used for calculating density. Furthermore, system calls are parsed and mapped into the machine learning algorithm. We use a Naive Bayes classifier to classify each application as benign or malicious. Using the mapped system calls as input, we train the classifier. After that, we apply the classifier and classify the application as benign or malicious. The whole system is applied to real-world benign and malware application samples.

Our main contributions in this work are:
1. We performed system-call-based dynamic malware analysis and used the Naive Bayes classification algorithm for detection.
2. We used 3-gram and 5-gram lengths of system calls, which reduces the time complexity of the system. In addition, we filtered system calls on the basis of their frequency, which reduces overhead without losing accuracy.
3. Performance of the overall dynamic malware detection system is better and gives more accurate results. The proposed system gives 85% and 89% accuracy for the 3-gram and 5-gram Naive Bayes algorithm, respectively.

LITERATURE SURVEY
Faruki Parvez et al.[1] and Arshad Saba et al.[2] give a detailed survey of Android architecture and malware. Faruki Parvez et al.[1] describe the Android security architecture and its issues, malware types, and their penetration techniques. They discussed two malware detection methods, static malware detection and dynamic malware detection. They also covered malware analysis and detection approaches according to their goal, methodology, and deployment. Finally, they proposed a hybrid approach to analyze and
detect Android malware. Arshad Saba et al.[2] give details of different Android malware types and their penetration techniques. They categorized different anti-malware techniques, such as static and dynamic malware detection. At the end, they proposed the hybrid anti-malware concept to overcome the limitations of the static and dynamic approaches.

Feizollah et al.[3] provide details about feature selection from Android applications for malware detection. Based on deep research, they categorized features into four groups: application metadata, hybrid, dynamic and static features. Their review gives an introduction to Android malware detection types and the related features. They list permissions, signatures, Java code, etc. as features used for static malware detection, while system calls, network traffic, and user interactions are the feature set for dynamic malware detection.

In papers [4], [5], [6], [7] the authors suggest static malware analysis techniques with different approaches. Geoffroy Gueguen et al.[4] propose a static malware analysis tool named Androguard. Androguard is a Python-based static malware analysis tool used to disassemble and decompile Android apps by reverse engineering. Androguard calculates application similarities and differences by using NCD (Normalized Compression Distance), a fuzzy risk score, and signatures of malicious applications. Faruki Parvez et al.[5] describe the tool AndroSimilar, a signature-based static malware analysis tool. AndroSimilar automatically generates the signature of the test application. The generated signature is compared against a malware signature database, and the application is then identified as normal or malicious. Daniel Arp et al.[6] propose the static malware analysis tool Drebin, which detects malicious applications directly on the Android phone. Drebin collects various features from the application code and manifest file. A machine learning approach is then used to distinguish normal and malicious applications. Sanz Borja et al.[7] propose a permission-based static malware detection tool called PUMA. PUMA extracts application permissions from the manifest file and then uses a machine learning algorithm to identify normal and malicious permissions.

In papers [8], [9], [10], [11] the authors suggest dynamic malware analysis techniques with different approaches. Suarez-Tangil et al.[8] propose the dynamic analysis tool AlterDroid, a tool for dynamic analysis of hidden malware distributed over application components. AlterDroid analyzes the behavioral difference between the original application and a fault-injected application. It creates behavior signatures for both applications and then analyzes the differential signature with the help of pattern matching. Tam Kimberly et al.[9] describe the tool CopperDroid, a virtual-machine-based automatic dynamic analysis system. It reconstructs the behavior of Android malware by monitoring system calls. Shabtai Asaf et al.[10] suggest the tool Andromaly, a host-based malware detection tool. Andromaly continuously monitors various metrics of the device, such as battery usage, CPU usage, the number of active processes, and the amount of data transferred through the network. It then applies a machine learning algorithm to classify the data as normal or malicious. Lok Kwong Yan et al.[11] propose the dynamic malware analysis tool DroidScope. DroidScope is a dynamic malware analysis platform based on virtual machine introspection. It is built upon the QEMU emulator. It monitors the whole operating system to get more information about the malware and also detects kernel-level attacks.

PROPOSED METHODOLOGY
Preprocessing: The first step of the proposed system is to collect real-world samples of benign and malware applications. After collecting the application samples, the system proceeds to the second step, recording system calls. Figure 1 shows the flow of system call recording. Initially, we installed every application on the Android emulator and ran it for 2-3 minutes. After that, we recorded the system calls of each application and copied them into an external file (.csv). To trace system calls we used. We know that each line in the training set represents a single application's features, with multiple feature integers and feature values. We then labeled each line, that is, each application, with 1 or 0, where 1 means a benign application and 0 means a malicious application in the training set. We used 70% of the applications from the data set for training and the remainder for testing. After all this data preprocessing, we applied the Naive Bayes classifier in the next step.

Algorithm 1: Naive Bayes Algorithm for Malware Detection
• The duplicated files are mapped with a single copy of the file data by mapping with the existing file data in the cloud
• The comprehensive requirements in multi-user cloud storage systems and introduced the model of deduplicatable dynamic PoS.

Input: Android application system calls stored in a .csv file
Output: Class to which the given system calls belong.
1. Foreach line in file .csv do
2.     Remove all parameters except system call name;
3.     Store all system call names in another file called system call name;
4. End
5. Foreach system call name in file system call name do
6.     Assign unique integer number;
7.     Store all integers in file integer system call file;
8. End
9. Foreach integer system call do
10.     Calculate 3-gram and 5-gram length;
11. End
12. Foreach length of system call
13.     Compute frequency of each integer then;
14.     Foreach system call if frequency is less than 100;
15.         Remove from file;
16.     Compute density of each integer then;
17.     Store data into value pair format in data file;
18. End
19. After all this data processing apply Naive Bayes classifier.
20. Foreach class instance
21.     Calculate prior probability;
22.     $P(c) = N_c / N$
23. End
24. Foreach known value pair
25.     Calculate conditional probability;
26.     $P(w \mid c) = \dfrac{\mathrm{count}(w, c) + 1}{\mathrm{count}(c) + |V|}$
27. End
28. Foreach unknown value pair
29.     Calculate posterior probability;
30.     $c_{MAP} = \arg\max_{c} P(x_1, x_2, x_3, \ldots, x_n \mid c)\, P(c)$
    End
31. Compare posterior probability for each class then return class with highest probability as result.

Algorithms
Let D be the whole system, which consists of
D = {I, P, O}

where
Q - User's query {q1, q2, …, qN},
P - Procedure,
F - Set of files {f1, f2, …, fn},
I - Input, I = {F, Q},
O - Output.

Where:
F = represents the file,
m1, m2, m3, m4 = representing the ith block of the file,
e = encryption key

Phase 1: Pre-process Phase
In the pre-processing phase,
e ← H(F), id ← H(e).
Then, the user announces that it has a certain file via id. If the file does not exist, the user goes into the upload phase. Otherwise, the user goes into the de-duplication phase.

Phase 2: The Upload File
(C, T) ← Encoding(e, F)
Let the file F = (m1, . . . , mn).
The user first invokes the encoding according

Phase 3: The De-Duplication Data (file)
res ∈ {0, 1} ← De-Duplication{U(e, F), S(T)}
If a file announced by a user in the pre-process phase exists in the cloud server, the user goes into the de-duplication phase and runs the de-duplication protocol.

Phase 4: The Update File
res ∈ {⟨e*, (C*, T*)⟩, ⊥} ← Updating{U(e, i, m, OP), S(C, T)}
In this phase, a user can arbitrarily update the file by invoking the update protocol.

Phase 5: The Proof of Storage to Owner
res ∈ {0, 1} ← Checking{S(C, T), U(e)}
At any time, users can go into the proof of storage phase if they have ownership of the files. The users and the cloud server run the checking protocol.

RESULT AND DISCUSSIONS
Users can upload, download, and update files on the cloud server, and the system provides data de-duplication.

CONCLUSIONS
In this work, we developed a dynamic malware detection system to detect malware in Android applications. For dynamic detection, we used the system calls invoked by the application during execution. A Naive Bayes classifier is then used to classify the runtime behavior of applications. In addition, we used 3-gram and 5-gram lengths of system calls. Instead of using every system call, we filter system calls based on frequency. The filtered system calls are used to calculate density. Then, by applying the Naive Bayes classifier, we classified applications into two classes, benign and malware. For all system implementations, we used real-world malware and benign application samples. The proposed method gives more accurate results and performs better than previous work. With the 3-gram Naive Bayes classifier the system gives 85% accuracy, while with the 5-gram Naive Bayes classifier it gives 89% accuracy. This indicates that the performance of the system improves with the length of the system call sequences, at least for the two lengths tested.

REFERENCES
[1] Faruki Parvez, Ammar Bharmal, Vijay Laxmi, Vijay Ganmoor, Manoj Singh Gaur, Mauro Conti, and Muttukrishnan Rajarajan. "Android security: a survey of issues, malware penetration, and defenses." IEEE Communications Surveys & Tutorials 17, no. 2 (2015): 998-1022.
[2] Arshad Saba, Munam Ali Shah, Abid Khan, and Mansoor Ahmed. "Android malware detection & protection: a survey." Int. J. Adv. Comput. Sci. Appl 7, no. 2 (2016): 463-475.
[3] Feizollah Ali, Nor Badrul Anuar, Rosli Salleh, and Ainuddin Wahid Abdul Wahab. "A review on feature selection in mobile malware detection." Digital Investigation 13 (2015): 22-37.
[4] Desnos Anthony. "Androguard: Reverse engineering, malware and goodware analysis of android applications." URL code.google.com/p/androguard (2013).
[5] Faruki Parvez, Vijay Ganmoor, Vijay Laxmi, Manoj Singh Gaur, and Ammar Bharmal. "AndroSimilar: robust statistical feature signature for Android malware detection." In Proceedings of the 6th International Conference on Security of Information and Networks, pp. 152-159. ACM, 2013.
[6] Arp Daniel, Michael Spreitzenbarth, Malte Hubner, Hugo Gascon, Konrad Rieck, and C. E. R. T. Siemens. "DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket." In NDSS. 2014.
[7] Sanz Borja, Igor Santos, Carlos Laorden, Xabier Ugarte-Pedrero, Pablo Garcia Bringas, and Gonzalo Álvarez. "Puma: Permission usage to detect malware in android." In International Joint Conference CISIS12-ICEUTE 12-SOCO 12 Special Sessions, pp. 289-298. Springer Berlin Heidelberg, 2013.
[8] Suarez-Tangil, Guillermo, Juan E. Tapiador, Flavio Lombardi, and Roberto Di Pietro. "ALTERDROID: differential fault analysis of obfuscated smartphone malware." IEEE Transactions on Mobile Computing 15, no. 4 (2016): 789-802.
[9] Tam Kimberly, Salahuddin J. Khan, Aristide Fattori, and Lorenzo Cavallaro. "CopperDroid: Automatic Reconstruction of Android Malware Behaviors." In NDSS. 2015.
[10] Shabtai Asaf, Uri Kanonov, Yuval Elovici, Chanan Glezer, and Yael Weiss. "Andromaly: a behavioral malware detection framework for android devices." Journal of Intelligent Information Systems 38, no. 1 (2012): 161-190.
[11] Yan, Lok-Kwong, and Heng Yin. "DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis." In USENIX Security Symposium, pp. 569-584. 2012.
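
The preprocessing part of Algorithm 1 (steps 5 to 18) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the authors' implementation: the function names `to_integer_ids`, `ngrams` and `ngram_density` are hypothetical, and because the paper does not define density, the code assumes it is the count of an n-gram divided by the total number of n-grams in the trace. The threshold of 100 comes from step 14; the toy trace uses a smaller value so that something survives the filter.

```python
from collections import Counter


def to_integer_ids(syscall_names, vocabulary):
    """Map each system call name to a stable integer id (steps 5-8)."""
    ids = []
    for name in syscall_names:
        if name not in vocabulary:
            vocabulary[name] = len(vocabulary)  # next unused integer
        ids.append(vocabulary[name])
    return ids


def ngrams(ids, n):
    """Return every contiguous n-gram of the id sequence as a tuple (steps 9-11)."""
    return [tuple(ids[i:i + n]) for i in range(len(ids) - n + 1)]


def ngram_density(ids, n, min_frequency=100):
    """Count n-grams, drop those seen fewer than min_frequency times,
    and return the density of each surviving n-gram (steps 12-18).

    Assumption: density = count / total number of n-grams in the trace.
    """
    counts = Counter(ngrams(ids, n))
    total = sum(counts.values())
    kept = {gram: c for gram, c in counts.items() if c >= min_frequency}
    return {gram: c / total for gram, c in kept.items()}


# Toy trace; a real trace recorded over 2-3 minutes would be far longer.
trace = ["open", "read", "read", "close", "open", "read", "read", "close"]
vocab = {}
ids = to_integer_ids(trace, vocab)
density = ngram_density(ids, n=3, min_frequency=2)
print(density)
```

The resulting dictionary of n-gram and density pairs corresponds to the "value pair format" that step 17 stores for the classifier.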

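
The classification part of Algorithm 1 (steps 20 to 31) is a Naive Bayes classifier with add-one smoothing. The following Python sketch shows one way to realise those steps. The names `train` and `classify` and the toy data are hypothetical, each application is assumed to be represented as a list of n-gram tokens, and log-probabilities are summed in place of multiplying probabilities, which yields the same arg max. Labels follow the paper's convention: 1 is benign, 0 is malicious.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (tokens, label). Returns priors, per-class counts, vocabulary."""
    class_docs = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in samples:
        class_docs[label] += 1
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    # Prior P(c) = N_c / N (step 22).
    priors = {c: n_c / len(samples) for c, n_c in class_docs.items()}
    return priors, token_counts, vocabulary


def classify(tokens, priors, token_counts, vocabulary):
    """Return the class with the highest posterior score (steps 28-31)."""
    best_label, best_score = None, -math.inf
    for c, prior in priors.items():
        total = sum(token_counts[c].values())
        score = math.log(prior)
        for w in tokens:
            # Smoothed P(w|c) = (count(w, c) + 1) / (count(c) + |V|) (step 26).
            score += math.log((token_counts[c][w] + 1) / (total + len(vocabulary)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Toy training set of 3-gram tokens; 1 = benign, 0 = malicious.
training = [
    ([(0, 1, 1), (1, 1, 2), (0, 1, 1)], 1),
    ([(0, 1, 1), (1, 1, 2)], 1),
    ([(3, 3, 4), (3, 4, 3), (3, 3, 4)], 0),
    ([(3, 3, 4), (1, 1, 2)], 0),
]
priors, token_counts, vocabulary = train(training)
print(classify([(0, 1, 1), (0, 1, 1)], priors, token_counts, vocabulary))
print(classify([(3, 3, 4)], priors, token_counts, vocabulary))
```

Add-one smoothing keeps an n-gram that never occurred in a class from forcing that class's probability to zero, which matters when the training set is as small as the 100 samples used in the paper.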
