Bot Detection System Using CNN Algorithm

The document presents a deep learning approach for detecting Android botnets using convolutional neural networks (CNNs). A CNN model is trained on 342 static app features extracted from a dataset of real Android apps containing botnet and normal samples. The trained model achieves 98.9% accuracy in detecting botnets, outperforming other machine learning classifiers. The CNN approach provides improved performance over previous studies on Android botnet detection.


Mobile Bot Detection: A Deep Learning Approach Using CNN, Hash Detection and API

Vaishnavi S. Shinde
Government College of Engineering & Research, Avasari Khurd, Pune
Vaishnavishinde4112002@gmail.com

Tanmay V. Jadhav
Government College of Engineering & Research, Avasari Khurd, Pune
tanmayj888@gmail.com

Akanksha S. Pingale
Government College of Engineering & Research, Avasari Khurd, Pune
pingale.akanksha2003@gmail.com

Guided by: Prof. S.G.Farkade
Government College of Engineering and Research, Avasari Khurd, Pune

Abstract: Android, being the most widespread mobile operating system, is increasingly becoming a target for malware. Malicious apps designed to turn mobile devices into bots that may form part of a larger botnet have become quite common, thus posing a serious threat. This calls for more effective methods to detect botnets on the Android platform. Hence, in this paper, we present a deep learning approach for Android botnet detection based on Convolutional Neural Networks (CNN). Our proposed botnet detection system is implemented as a CNN-based model that is trained on 342 static app features to distinguish between botnet apps and normal apps. The trained botnet detection model was evaluated on a set of 6,802 real applications containing 1,929 botnets from the publicly available ISCX botnet dataset. The results show that our CNN-based approach had the highest overall prediction accuracy compared to other popular machine learning classifiers. Furthermore, the performance results observed from our model were better than those reported in previous studies on machine learning based Android botnet detection.

Keywords: Botnet detection; Deep learning; Convolutional Neural Networks; Machine learning; Android Botnets

I. INTRODUCTION
Android is now the most widespread mobile operating system worldwide. Over the years the volume of malware targeting Android has continued to grow [1]. This is because it is easier and more profitable for malware authors to target an operating system that is open-source, more prevalent, and does not restrict the installation of apps from any possible source. As a matter of fact, numerous families of malware apps that are capable of infecting Android devices and turning them into malicious bots have been discovered in the wild. These Android bots may become part of a larger botnet that can be used to perform various types of attacks such as Distributed Denial of Service (DDoS) attacks, generation and distribution of spam, phishing attacks, click fraud, stealing login credentials or credit card details, etc.

A botnet consists of a number of Internet-connected devices under the control of a malicious user or group of users known as botmaster(s). It also consists of a Command and Control (C&C) infrastructure that enables the bots to receive commands, get updates and send status information to the malicious actors. Since smartphones and other mobile devices are typically used to connect to online services and are rarely switched off, they provide a rich source of candidates for operating botnets. Thus, the term ‘mobile botnet’ refers to a group of compromised smartphones and other mobile devices that are remotely controlled by botmasters using C&C channels [2], [3].

Nowadays, malicious botnet apps have become a serious threat. Additionally, their increasing use of sophisticated evasive techniques calls for more effective detection approaches. Hence, in this paper we present a deep learning approach that leverages Convolutional Neural Networks (CNN) for Android botnet detection. The CNN model employs 342 static features to classify new or previously unseen apps as either ‘botnet’ or ‘normal’. The features are extracted through automated reverse engineering of the apps, and are used to create feature vectors that feed directly into the CNN model without further pre-processing or feature selection.

We present the design of our CNN-based model for Android botnet detection and evaluate the model on a dataset of real Android apps consisting of 1,929 botnet samples and 4,873 clean samples. Also, we compare the performance of our CNN model to other popular machine learning classifiers including Naïve Bayes, Bayes Net, Decision Tree, Support Vector Machine (SVM), Random Forest, Random Tree, Simple Logistic and Artificial Neural Network (ANN) on the same dataset. The results show that the CNN-based model achieved a botnet detection accuracy of 98.9% with an F1-score of 0.981, thus outperforming all the other machine learning classifiers. Furthermore, our CNN model shows better performance results compared to other existing studies focusing on Android botnet detection. Some of these studies utilized the same ISCX botnet apps employed in this paper.

The rest of the paper is organized as follows: Section II discusses related works in Android botnet detection; Section III presents the overall system and gives some background on CNN, including a discussion of 1D CNN which is adopted in this study; Section IV presents the methodology and the experiments performed; results of the experiments are given in Section V; and finally Section VI presents the conclusions of the study and possible future work.

II. RELATED WORK
In the study conducted by Kadir et al. [4], the objective was to address the gap in understanding mobile botnets and their communication characteristics. Thus, they provided an in-depth analysis of the Command and Control (C&C) and built-in URLs of Android botnets. By combining both static and dynamic analyses with visualization, relationships between the analysed botnet families were uncovered, offering insight into each malicious infrastructure. It is in this study that a dataset of 1,929 samples from 14 Android botnet families was compiled and released to the research community. This dataset is known as the ISCX Android botnet dataset and is available from [5]. This paper and several previous works on Android botnets have utilized the full dataset or a subset of it to evaluate proposed Android botnet detection techniques.

Anwar et al. [6] proposed a static approach towards mobile botnet detection where they utilized MD5 hashes, permissions, broadcast receivers, and background services as features. These features were extracted from Android apps to build a machine learning classifier for detecting mobile botnet attacks. They conducted their experiments on 1400 apps from the UNB ISCX botnet dataset together with 1400 benign apps. Their best result was 95.1% classification accuracy with a recall value of 0.827 and a precision value of 0.97.

Paper [7] used machine learning to detect Android botnets based on permissions and their protection levels. The authors initially used 138 features and then added novel features known as protection levels to increase the number of features to 145. Their approach was evaluated on four machine learning algorithms: Random Forest, MLP, Decision Trees and Naïve Bayes. They performed their study on 3270 app instances (1635 benign and 1635 botnets). The botnet apps used were also obtained from the ISCX botnet dataset. The best results came from Random Forest with 97.3% accuracy, 0.987 recall, and 0.958 precision.

In [8] a method was proposed to detect Android botnets based on Convolutional Neural Networks using permissions as features. Applications are represented as images that are constructed based on the co-occurrence of permissions used within the applications. The proposed CNN is a binary classifier that is trained using the images. The authors evaluated their proposed method on 5450 Android applications, including 1800 botnet applications from the ISCX dataset. Their results show an accuracy of 97.2% with a recall of 0.96, precision of 0.955 and f-measure of 0.957, which is a promising result considering that only permissions were used in the study.

Paper [9] proposed an Android Botnet Identification System (ABIS) for checking Android applications in order to detect botnets. ABIS utilized both static and dynamic features from API calls, permissions and network traffic. The system is evaluated by using several machine learning algorithms with Random Forest obtaining a precision of 0.972 and a recall of 0.969. In [10], a method is proposed for Android botnet detection based on feature selection and classification algorithms. The paper used ‘permissions requested’ as features and ‘Information gain’ to select the most significant permissions. Afterwards, Naïve Bayes, Random Forest and Decision Trees were used to classify the Android apps. Results show Random Forest achieving the highest detection accuracy of 94.6% with the lowest false positive rate of 0.099.

Karim et al. [11] proposed DeDroid, a static analysis approach to investigate botnet-specific properties that can be used to detect mobile botnets. They first identified ‘critical features’ by observing the coding behaviour of a few known malware binaries having C&C features. They then compared these ‘critical features’ with features of malicious applications from the Drebin dataset [12]. Through this comparison, 35% of the malicious apps in the dataset qualified as botnets. However, closer examination revealed that 90% were confirmed as botnets.

Bernardeschia et al. [13] proposed a method to identify botnets in the Android environment through model checking. Model checking is an automated technique for verifying finite state systems. This is accomplished by checking whether a structure representing a system satisfies a temporal logic formula describing its expected behaviour. In [14], Jadhav et al. propose a cloud-based Android botnet detection system which exploits dynamic analysis by using a virtual environment with cluster analysis. The toolchain for the dynamic analysis process within the botnet detection system is composed of strace, netflow, logcat, sysdump, and tcpdump. However, the authors did not provide any experimental results to evaluate the effectiveness of their proposed solution. Moreover, botnets may easily employ different techniques to evade the virtual environment, and code coverage could limit the system’s effectiveness [15], [24].

Paper [16] proposed an approach to detect mobile botnets using network features such as TCP/UDP packet size, frame duration, and source/destination IP address. The authors used a set of ML box algorithms and five machine learning classifiers to classify network traffic. The five supervised machine learning approaches include Naïve Bayes, Decision Tree, K-nearest neighbour, Neural Network, and Support Vector Machine. In [17], a method to detect Android botnets based on source code mining and source code metrics was proposed. There are also a number of works that have proposed signature based methods for Android botnet detection. These include [18-20]. However, these solutions are likely to suffer from the drawbacks of signature based systems, which include the inability to effectively detect previously unseen botnets.

Unlike most existing studies, our paper proposes a deep learning based Android botnet detection system, using Convolutional Neural Networks. Also, unlike previous studies that utilize only the app permissions, our system is based on 342 features that represent Permissions, API calls, Commands, Extra Files, and Intents. Furthermore, different from the study in [8] which utilized only permissions, we do not convert feature vectors into images prior to model training. Instead our feature vectors are used directly to train 1D CNN models. This makes our approach computationally less demanding.

III. BACKGROUND

A. The CNN-based classification system
The classification system is built by extracting static features from the corpus of botnet and clean samples. To achieve this, we used our bespoke tool built in Python for automated reverse engineering of APKs. With the help of the tool, we extracted 342 features consisting of five different types (see Table 2) from all the training apps. The five feature types include: API calls extracted from the executable; Permissions and Intents from the manifest file; Commands and Extra Files from the APK. These features are represented as vectors of binary numbers with each feature in the vector represented by a ‘1’ or ‘0’. Each feature vector (corresponding to one application) is labelled with its class. The feature vectors are loaded into the CNN model and used to train the model. After training, an unknown application can be predicted to be either ‘clean’ or ‘botnet’ by applying its own extracted feature vector to the trained model. The process is depicted in Figure 1.

Figure 1: Training and prediction with the CNN-based botnet detection system.

B. Convolutional Neural Networks (CNN)
A CNN is a deep learning technique that belongs to the family of Artificial Neural Networks. It works well for identifying simple patterns in the data, which are then used to form more complex patterns in higher layers. Two types of layers are typically used for building CNNs: convolutional layers and pooling layers. The role of the convolutional layer is to detect local conjunctions of features from the previous layer, while the role of the pooling layer is to merge semantically similar features into one [21].

Generally, the convolutional layer extracts the optimal features while the pooling layer reduces the dimensions of those features that it receives from the convolutional layer (or another preceding pooling layer). At the tail end of the model, fully connected (dense) layer(s) are typically used for classification. Depending on the characteristics of the dataset, the performance of the CNN may be influenced by the number of layers, number of filters (kernels) or the size of the filters. Generally, more and more abstract features are extracted in the deeper layers of the CNN; hence, the number of layers required depends on the complexity and non-linearity of the data being analysed. Furthermore, the number of filters in each stage determines the number of features extracted. Computational complexity increases with more layers and higher numbers of filters. Also, with more complex architectures, there is the possibility of training an overfitted model which results in poor prediction accuracy on the testing set(s). To reduce overfitting, techniques such as ‘dropout’ [22] and ‘batch regularization’ are implemented during training of our models.

C. One Dimensional Convolutional Neural Networks
Although CNNs are more commonly applied in a multi-dimensional fashion and have thus found success in image and video analysis-based problems, they can also be applied to one-dimensional data. Datasets that possess a one-dimensional structure can be processed using a one-dimensional convolutional neural network (1D CNN). The key difference between a 1D and a 2D or 3D CNN is the dimensionality of the input data and how the filter (feature detector) slides across the data. For 1D CNN, the filters only slide across the input data in one direction. A 1D CNN is quite effective when you expect to derive interesting features from shorter (fixed-length) segments of the overall feature set, and where the location of the feature within the segment is not of high relevance.

The use of 1D CNN can be commonly found in NLP applications. Similarly, 1D CNN is applicable to datasets containing vectorised data being used to characterize the items to be predicted (e.g. an Android application). The 1D CNN could be used to extract potentially more discriminative feature representations that describe any existing patterns or relationships within segments of the vectors characterizing each entity in the dataset. These new features are then fed into a classifier (e.g. a fully connected neural network layer) which will in turn use the derived features in making a final classification decision. Hence, in this scenario, the convolutional layers can be considered as a feature extractor that eliminates the need for feature ranking and selection. The CNN model developed in this paper is applied to vectorised data characterizing the Android applications, in order to derive a trained model that can detect new Android botnet apps with very high accuracy.

D. Key elements of our proposed CNN architecture
Our proposed CNN architecture is a 1D CNN consisting of two convolutional layers and two max pooling layers. These are followed by a fully connected layer of N units, which is in turn connected to a final classification layer containing one neuron with a sigmoid activation function.

The sigmoid activation function is given by: $S(x) = \frac{1}{1 + e^{-x}}$

The final classification layer generates an outcome corresponding to the two classes i.e. ‘botnet’ or ‘normal’. The convolutional layers utilize the ReLU (Rectified Linear Units) activation function given by: $f(x) = \max(0, x)$. ReLU helps to mitigate vanishing and exploding gradient issues [23]. It has been found to be more efficient in terms of time and cost for training on huge data in comparison to classical non-linear activation functions such as Sigmoid or Tangent functions [24]. A simplified view of our architecture is shown in Figure 2.

Figure 2: Overview of the implemented 1D CNN model for Android application classification to detect botnets. (Diagram: input layer of length L = 342; convolutional layer 1 and convolutional layer 2, each with sliding filters; fully connected layer; output 0 = normal, 1 = botnet.)

IV. METHODOLOGY AND EXPERIMENTS
In this section we present the experiments undertaken to evaluate the CNN models developed in this paper. Our models were implemented using Python and utilized the Keras library with TensorFlow backend. Other libraries used include Scikit Learn, Seaborn, Pandas, and Numpy. The model was built and evaluated on an Ubuntu Linux 16.04 64-bit machine with 4GB RAM.

A. Problem definition
Let $A = \{a_1, a_2, \ldots, a_m\}$ be a set of $m$ apps, where each $a_i$ is represented by a vector containing the values of $n$ features ($n = 342$). Let $a_i = \{f_1, f_2, f_3, \ldots, f_n, cl\}$, where $cl \in \{\text{botnet}, \text{normal}\}$ is the class label assigned to the app. Thus, $A$ can be used to train the model to learn the behaviours of botnet and normal apps respectively. The goal of a trained model is then to classify a given unlabelled app $A_{unknown} = \{f_1, f_2, f_3, \ldots, f_n, ?\}$ by assigning a label $cl$, where $cl \in \{\text{botnet}, \text{normal}\}$.

B. Dataset
In this study we used the Android dataset from [5], which is known as the ISCX botnet dataset. The ISCX dataset contains 1,929 botnet apps (from 14 different families) and has been used in previous works including [4], [7-10], and [17]. The botnet families are shown in Table 1. A total of 4,873 clean apps were used for the study in this paper and these were labelled under the category ‘normal’ to facilitate supervised learning when training the CNN and other machine learning classifiers. The clean apps were obtained from different categories of apps on the Google Play store and verified to be non-malicious by using VirusTotal.

The 342 static features extracted from the apps for model training were of 5 types: (a) API calls (b) commands (c) permissions (d) Intents (e) extra files. The ‘API calls’ and ‘permissions’ accounted for most of the features. From Table 2, it can be seen that there were 135 ‘API calls’ related features and 130 ‘permissions’ features, while intents accounted for 53 features. Some of the features are shown in Table 3.

Table 1: Botnet dataset composition.
Botnet Family    Number of samples
Anserverbot      244
Bmaster          6
Droiddream       363
Geinimi          264
Misosms          100
Nickyspy         199
Notcompatible    76
Pjapps           244
Pletor           85
Rootsmart        28
Sandroid         44
Tigerbot         96
Wroba            100
Zitmo            80
Total            1929

Table 2: The five different types of features used to train the CNN model.
Feature type     Number
API calls        135
Permissions      130
Commands         19
Extra files      5
Intents          53
Total            342 features

Table 3: Some of the prominent static features extracted from Android applications for training the CNN model to detect Android Botnets.
Feature name                                  Type
TelephonyManager.*getDeviceId                 API
TelephonyManager.*getSubscriberId             API
abortBroadcast                                API
SEND_SMS                                      Permission
DELETE_PACKAGES                               Permission
PHONE_STATE                                   Permission
SMS_RECIVED                                   Permission
Ljava.net.InetSocketAddress                   API
READ_SMS                                      Permission
Android.intent.action.BOOT_COMPLETED          Intent
io.File.*delete(                              API
chown                                         Command
chmod                                         Command
Mount                                         Command
.apk                                          Extra File
.zip                                          Extra File
.dex                                          Extra File
.jar                                          Extra File
CAMERA                                        Permission
ACCESS_FINE_LOCATION                          Permission
INSTALL_PACKAGES                              Permission
android.intent.action.BATTERY_LOW             Intent
.so                                           Extra File
android.intent.action.POWER_CONNECTED         Intent
System.*LoadLibrary                           API

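The binary encoding described in Section III-A can be illustrated with a minimal Python sketch. It maps the set of features found in an app to a fixed-order vector of 0/1 flags. The short FEATURES list reuses a few names from Table 3 purely for illustration (the real vectors have 342 entries), and the function name encode_app and the example app are hypothetical; this is not the authors' extraction tool, which the paper does not describe in detail.

```python
# Illustrative subset of static features; the paper's vectors hold 342 of them.
FEATURES = [
    "SEND_SMS",                           # permission
    "READ_SMS",                           # permission
    "abortBroadcast",                     # API call
    "chmod",                              # command
    ".dex",                               # extra file
    "android.intent.action.BATTERY_LOW",  # intent
]


def encode_app(found_features, feature_order=FEATURES):
    """Return a list of 0/1 flags, one per entry of feature_order (hypothetical helper)."""
    found = set(found_features)
    return [1 if name in found else 0 for name in feature_order]


# Hypothetical app in which three of the six features were detected.
example_app = {"SEND_SMS", "abortBroadcast", ".dex"}
vector = encode_app(example_app)
print(vector)  # [1, 0, 1, 0, 1, 0]
```

Keeping the feature order fixed across all apps is what lets the same vector position mean the same feature for every training and test sample.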
C. Experiments to evaluate the proposed CNN based model
In order to investigate the performance of our proposed model, we performed different sets of experiments. Table 4 shows the configuration of the CNN model. The 1D CNN model consists of two pairs of convolutional and maxpooling layers as shown in Figure 2. The output of the second max pooling layer is flattened and passed on to a fully connected layer with 8 units. This is in turn connected to a sigmoid activated output layer containing one unit.

The first set of experiments was aimed at evaluating the impact of the number of filters on the model’s performance. The second set of experiments was performed to evaluate the effect of varying the length of the filters. In the third, we investigate the impact of the maxpooling size on performance.

Table 4: Summary of model configurations.
Model design summary - 1D CNN
Input layer: Dimension = 342 (feature vector size)
1D Convolutional layer: 4, 8, 16, 32, 64 filters; size = 4, 8, 16, 32, 64 (with number of filters = 32)
MaxPooling layer: Size = 2, 4, 8, 16 (with number of filters = 32)
1D Convolutional layer: 4, 8, 16, 32, 64 filters; size = 4, 8, 16, 32, 64 (with number of filters = 32)
MaxPooling layer: Size = 2, 4, 8, 16 (with number of filters = 32)
Fully Connected (Dense) layer: 8 units, activation = ReLU
Output layer: Fully Connected layer; 1 unit, activation = sigmoid

In order to measure model performance, we used the following metrics: accuracy, precision, recall and F1-score. The metrics are defined as follows (taking the botnet class as positive):

• Accuracy: Defined as the ratio between correctly predicted outcomes and the sum of all predictions. It is given by: $\frac{TP + TN}{TP + TN + FP + FN}$
• Precision: All true positives divided by all positive predictions, i.e. was the model right when it predicted positive? Given by: $\frac{TP}{TP + FP}$
• Recall: True positives divided by all actual positives. That is, how many positives did the model identify out of all possible positives? Given by: $\frac{TP}{TP + FN}$
• F1-score: This is the harmonic mean of precision and recall, given by: $\frac{2 \times Recall \times Precision}{Recall + Precision}$

Where TP is true positives; FP is false positives; FN is false negatives, while TN is true negatives (all w.r.t. the botnet class). All the results of the experiments are from 10-fold cross validation where the dataset is divided into 10 equal parts with 10% of the dataset held out for testing, while the models are trained from the remaining 90%. This is repeated until all of the 10 parts have been used for testing. The average of all 10 results is then taken to produce the final result. Also, during the training of the CNN models (for each fold), 10% of the training set was used for validation.

V. RESULTS AND DISCUSSIONS

A. Varying the numbers of filters.
In this section, we examine the results from experimenting with different numbers of filters. In our model, we kept the number of filters in both convolutional layers the same. Table 5 shows the results from running the 1D CNN model with different numbers of filters. From the table, it is evident that the number of filters had an effect on the performance of the model. When increased from 4 to 8, there is an improvement in performance. The performance does not improve further until we reach 32 filters. It then drops again when we increase this to 64. Based on these results we select 32 filters as the optimal configuration parameter for the model’s number of filters. Notice the increase in the number of training parameters as the number of filters is increased; for 32 filters, the training of 25,625 parameters is required. With 32 filters we obtain a classification accuracy of 98.9% compared to 98.6% that is obtained with 4 filters. Nevertheless, the results obtained with 4 filters were still acceptable.

1) Training epochs, loss and accuracy graphs.
Figures 3 and 4 show the typical outputs obtained with the validation and training sets during the training epochs. From Fig. 3, it can be seen that the validation loss generally fluctuates from one training epoch to another after an initial drop. During each epoch, a model is trained and the validation loss and accuracy are recorded. Our goal is to obtain the model with the least validation loss because we assume this will be the ‘best’ model that fits the training data. Thus, at every epoch, the validation loss is compared to previous ones and if the current one is lower, the corresponding model is saved as the best model. We implemented a ‘stopping criterion’ which will stop the training once no improvement in performance is observed within 100 epochs. For example in Figure 3, the best model was obtained with the least validation loss of 0.00531 at epoch 45. For the next 100 epochs validation loss did not improve, hence the training was stopped. Figure 4 shows the corresponding accuracy behaviour observed from epoch to epoch.

Table 5: Number of filters vs. model performance. Length of filters used = 4 for first layer and = 4 for second layer; dense layer = 8 units; validation split = 10%.
Number of filters          4        8        16       32       64
Accuracy                   0.986    0.988    0.988    0.989    0.987
Precision                  0.978    0.980    0.980    0.983    0.980
Recall                     0.974    0.977    0.976    0.978    0.975
F1-score                   0.976    0.978    0.978    0.981    0.977
Num. training parameters   2,777    5,657    11,801   25,625   59,417
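
The parameter counts in Table 5 can be approximated by hand from the layer sizes. The sketch below is a back-of-the-envelope calculation rather than the authors' code: it assumes convolutions without padding, non-overlapping pooling windows, and only the layers listed in Table 4, and the function name count_parameters is illustrative. Under these assumptions it gives 25,553 parameters for 32 filters of length 4, which is 72 fewer than the 25,625 reported, and the gap equals 2F + 8 for every filter count F in Table 5. The remainder presumably comes from layers that Table 4 does not list; the source does not specify them.

```python
def count_parameters(num_filters, filter_length, pool_size,
                     input_length=342, dense_units=8):
    """Count weights and biases for conv-pool-conv-pool-dense-output (assumed layout)."""
    # First convolution sees a single input channel.
    conv1 = (filter_length * 1 + 1) * num_filters
    length = input_length - filter_length + 1   # assumption: no padding
    length //= pool_size                        # assumption: non-overlapping pooling

    # Second convolution sees num_filters input channels.
    conv2 = (filter_length * num_filters + 1) * num_filters
    length = length - filter_length + 1
    length //= pool_size

    flattened = length * num_filters
    dense = (flattened + 1) * dense_units       # weights plus one bias per unit
    output = dense_units + 1                    # single sigmoid unit
    return conv1 + conv2 + dense + output


# Filter counts from Table 5, with filter length 4 and pooling size 2.
for f in (4, 8, 16, 32, 64):
    print(f, count_parameters(f, filter_length=4, pool_size=2))
```

The dense layer dominates the total because it connects every flattened convolutional output to each of the 8 units, which is why larger pooling sizes shrink the count so quickly.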
Table 6: Length of filters vs. model performance. Number of
filters used= 32 in both first and second convolutional layers;
dense layer = 8 units; validation split=10%.
Length of
4 8 16 32 64
filters
Accuracy 0.989 0.988 0.988 0.988 0.988
Precision 0.983 0.979 0.980 0.981 0.983
Recall 0.978 0.977 0.978 0.979 0.974
F1-score 0.981 0.978 0.979 0.979 0.978
Training
parameters 25,625 29,081 35,993 49,817 77465
Figure 3: Training and validation losses at different epochs up
to 145. A stopping criterion of 100 is used to obtain the model C. Varying the Maxpooling parameter
with the least validation loss. The results of the third set of experiments are discussed here.
The goal is to investigate the effect of changing the maxpool-
ing parameter. This corresponds to a subsampling ratio of 2, 4,
6, and 8 respectively as shown in Table 7. A value of 2 means
the next layer will be half the dimension of the previous one,
etc. Note that the maxpooling layer can be considered a fea-
ture reduction layer that also helps to alleviate overfitting
since it progressively reduces the number of parameters that
need to be trained. The other parameters were fixed as fol-
lows: Number of filters in both convolutional layers = 32;
Length of convolutional filters = 4; number of units in dense
layer=8.

Figure 4: Training and validation accuracies at different It can be seen from Table 7 that as we increase the maxpool-
epochs up to 145. These plots correspond to the training and ing parameter, the total number of training parameters is re-
validation losses depicted in Figure 3. duced. At the same time, we witness a progressive decline in
overall performance. Therefore, for our CNN model designed
B. Varying the length of the filters. to classify applications into ‘botnet’ and ‘normal’, the optimal
subsampling ratio for both layers is 2.
In this section we examine the effect of the length of filters on
the performance of the model while the number of filters is
Table 7: Maxpooling parameter vs. model performance.
fixed at 32 in each convolutional layer. The length is varied
Length of filters used=4 for both convolutional layers; number
from 4, 8, 16, 32, to 64 respectively (as shown in Table 6).
of filters =32 for both layers; dense layer = 8 units; validation
The number of units in the dense layer was fixed at 8. The
split=10%.
results indicate that the length of the filters does not appear to
have much of an impact on the overall classification accuracy Maxpooling parame-
2 4 6 8
and F1-score performance, when increased. However, the ter/Subsampling ratio
least filter length of 4 achieves the highest accuracy and F1- Accuracy 0.989 0.987 0.983 0.978
score. Note that as we increase the length of the filters, the
Precision 0.983 0.982 0.974 0.971
number of parameters to be trained increases (from 25,652 for
length=4 to 77,465 for length=64). Recall 0.978 0.973 0.967 0.948
F1-score 0.981 0.978 0.970 0.959
The lack of improvement with the length of filters may be Training
attributed to larger number of parameters leading to overfitting 25,625 9497 6,425 5,401
Parameters
the model to the training data thereby reducing its generaliza-
tion capability. This in turn leads to degraded performance D. CNN performance vs. other machine learning classifiers:
when tested on new data. Basically, what these results show is 10 fold cross validation results.
that when the training parameters increase beyond a certain In Table 8, the performance of the CNN model developed in
limit, the model becomes too complex for the data and this this paper is compared to other machine learning classifiers:
leads to overfitting. This becomes evident in lack of improve- Naïve Bayes, SVM, Random Forest, Artificial Neural Net-
ment or degradation in performance when tested on previously work, J48, Random Tree, REPtree, and Bayes Net. Figure 5
unseen data. shows the F1-scores of the classifiers, where CNN has the
highest F1-score (0.981), followed by SVM (0.976), SL (0.973), ANN (0.973) and Random Forest (0.973). Bayes Net had the lowest F1-score of 0.781. Table 8 shows that the recall of CNN is 0.978, the highest of all the classifiers, which indicates that it has the best botnet detection performance. Note that the ANN was a back-propagation neural network built with a single hidden layer consisting of 32 units (neurons). The sigmoid activation function was used within the neurons. This ANN represented the application of a neural network without deep learning. The ANN showed no significant improvement in the results when the number of units in the hidden layer was increased beyond 32.

Table 8: Comparison of our CNN results with results from other ML classifiers.

Classifier      ACC     Prec.   Rec.    F1
Naïve Bayes     0.872   0.728   0.874   0.795
SVM             0.987   0.980   0.973   0.976
RF              0.985   0.982   0.965   0.973
ANN             0.985   0.982   0.965   0.973
SL              0.984   0.983   0.963   0.973
J48             0.981   0.974   0.958   0.966
Random Tree     0.972   0.948   0.955   0.951
REPTree         0.979   0.973   0.954   0.963
Bayes Net       0.867   0.736   0.832   0.781
CNN             0.989   0.983   0.978   0.981

[Bar chart; x-axis: F1 score, 0.7 to 1; one bar per classifier, with the values given in the F1 column of Table 8.]

Figure 5: F1-score of CNN vs other ML classifiers.

E. Comparison with other works on Android botnet detection.

In Table 9, we present a comparison of our results with those reported in other papers that focus on Android botnet detection. Note that all the papers mentioned in the table used the ISCX botnet dataset for their work. In our study we utilized all 1,929 samples within the dataset. The second column of the table shows the numbers of botnet samples and benign samples used in each paper, while the other columns contain the performance results. Not all of the performance metrics we have used are reported in every paper. Nevertheless, it is clear that our CNN model obtained better overall accuracy, F1 and recall than the other works.

Table 9: Performance comparisons with other works. Note that all of the papers used botnet samples from the ISCX dataset.

Paper reference             Botnets/Benign   ACC (%)   Rec.    Prec.   F1
Hojjatinia et al. [8]       1800/3650        97.2      0.96    0.955   0.957
Tansettanakorn et al. [9]   1926/150         -         0.969   0.972   -
Anwar et al. [6]            1400/1400        95.1      0.827   0.97    -
Abdullah et al. [10]        1505/850         -         0.946   0.931   -
Alqatawna & Faris [7]       1635/1635        97.3      0.957   0.987   -
This paper                  1929/4873        98.9      0.978   0.983   0.981

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we proposed a deep learning model based on 1D CNN for the detection of Android botnets. We evaluated the model through extensive experiments with 1,929 botnet apps and 4,387 clean apps. The model outperforms several popular machine learning classifiers evaluated on the same dataset. The results (Accuracy: 98.9%; Precision: 0.983; Recall: 0.978; F1-score: 0.981) indicate that our proposed CNN-based model can be used to detect new, previously unseen Android botnets more accurately than the other models. For future work, we will aim to improve the model training process by automating the search and selection of the key influencing parameters (i.e. number of filters, filter length, and number of fully connected (dense) layers) that jointly result in the best-performing CNN model.
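
For reference, the four performance metrics used in these comparisons (accuracy, precision, recall and F1-score) can all be derived from the counts of a binary confusion matrix, with botnet as the positive class. The following is a minimal Python sketch of that relationship. The function name and the example counts are hypothetical and chosen only for illustration; they are not taken from the experiments reported in this paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: botnet apps correctly flagged as botnet
    fp: benign apps wrongly flagged as botnet
    tn: benign apps correctly passed as benign
    fn: botnet apps wrongly passed as benign
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not results from this paper).
example = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Note that when metrics are averaged over cross-validation folds, the reported F1-score need not equal the harmonic mean of the averaged precision and recall exactly, so small differences in the third decimal place are to be expected.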
REFERENCES

[1] S. Y. Yerima and S. Khan, "Longitudinal Performance Analysis of Machine Learning based Android Malware Detectors," 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), IEEE.
[2] H. Pieterse and M. S. Olivier, "Android botnets on the rise: Trends and characteristics," 2012 Information Security for South Africa, Johannesburg, Gauteng, 2012, pp. 1-5.
[3] I. Letteri, M. Del Rosso, P. Caianiello, and D. Cassioli, "Performance of botnet detection by neural networks in software-defined networks," in CEUR Workshop Proceedings, CEUR-WS, 2018.
[4] A. F. A. Kadir, N. Stakhanova, and A. A. Ghorbani, "Android botnets: What URLs are telling us," in International Conference on Network and System Security, Springer, 2015, pp. 78-91.
[5] ISCX Android botnet dataset. Available from https://www.unb.ca/cic/datasets/android-botnet.html. [Accessed 03/03/2020]
[6] S. Anwar, J. M. Zain, Z. Inayat, R. U. Haq, A. Karim, and A. N. Jabir, "A static approach towards mobile botnet detection," in 2016 3rd International Conference on Electronic Design (ICED), 2016: IEEE, pp. 563-567.
[7] J. f. Alqatawna and H. Faris, "Toward a Detection Framework for Android Botnet," in 2017 International Conference on New Trends in Computing Sciences (ICTCS), 2017: IEEE, pp. 197-202.
[8] S. Hojjatinia, S. Hamzenejadi, and H. Mohseni, "Android Botnet Detection using Convolutional Neural Networks," 28th Iranian Conference on Electrical Engineering (ICEE2020).
[9] C. Tansettanakorn, S. Thongprasit, S. Thamkongka, and V. Visoottiviseth, "ABIS: a prototype of android botnet identification system," in 2016 Fifth ICT International Student Project Conference (ICT-ISPC), 2016: IEEE, pp. 1-5.
[10] Z. Abdullah, M. M. Saudi, and N. B. Anuar, "ABC: android botnet classification using feature selection and classification algorithms," Advanced Science Letters, vol. 23, no. 5, pp. 4717-4720, 2017.
[11] A. Karim, R. Salleh, and S. Shah, "DeDroid: A Mobile Botnet Detection Approach Based on Static Analysis," 2015. doi: 10.1109/UIC-ATC-ScalCom-CBDCom-IoP.2015.240.
[12] The Drebin Dataset. Available at: https://www.sec.cs.tu-bs.de/~danarp/drebin/index.html [Accessed 05/03/2020]
[13] C. Bernardeschi, F. Mercaldo, V. Nardone, and A. Santone, "Exploiting Model Checking for Mobile Botnet Detection," 23rd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, Procedia Computer Science 159 (2019), pp. 963-972.
[14] S. Jadhav, S. Dutia, K. Calangutkar, T. Oh, Y. H. Kim, and J. N. Kim, "Cloud-based android botnet malware detection system," in Advanced Communication Technology (ICACT), 2015 17th International Conference on, IEEE, 2015, pp. 347-352.
[15] S. Y. Yerima, M. K. Alzaylaee, and S. Sezer, "Machine learning-based dynamic analysis of Android apps with improved code coverage," EURASIP Journal on Information Security, 4 (2019). https://doi.org/10.1186/s13635-019-0087-1
[16] X. Meng and G. Spanoudakis, "MBotCS: A mobile botnet detection system based on machine learning," Lecture Notes in Computer Science, 9572, 2016, pp. 274-291. doi: 10.1007/978-3-319-31811-0_17
[17] B. Alothman and P. Rattadilok, "Android botnet detection: An integrated source code mining approach," 12th International Conference for Internet Technology and Secured Transactions (ICITST), 11-14 Dec., Cambridge, UK, 2017, IEEE, pp. 111-115.
[18] A. J. Alzahrani and A. A. Ghorbani, "Real-time signature-based detection approach for sms botnet," in 2015 13th Annual Conference on Privacy, Security and Trust (PST), 2015: IEEE, pp. 157-164.
[19] D. A. Girei, M. A. Shah, and M. B. Shahid, "An enhanced botnet detection technique for mobile devices using log analysis," in 2016 22nd International Conference on Automation and Computing (ICAC), 2016: IEEE, pp. 450-455.
[20] M. Yusof, M. M. Saudi, and F. Ridzuan, "A New Android Botnet Classification for GPS Exploitation Based on Permission and API Calls," in International Conference on Advanced Engineering Theory and Applications, 2017: Springer, pp. 27-37.
[21] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature 521 (2015), no. 7553, pp. 436-444.
[22] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, 15(1):1929-1958, 2014.
[23] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. 14th Int. Conf. Artif. Intell. Statist., 2011, pp. 315-323.
[24] M. K. Alzaylaee, S. Y. Yerima, and S. Sezer, "DL-Droid: Deep learning based android malware detection using real devices," Computers & Security, Volume 89, 2020, 101663, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2019.101663.
