Big Data Analytics For Classification of Network Enabled Devices


2016 30th International Conference on Advanced Information Networking and Applications Workshops

Deepali Arora and Kin Fun Li
Department of Electrical and Computer Engineering
University of Victoria
P.O. Box 3055 STN CSC, Victoria, B.C., CANADA, V8W 3P6
{darora, kinli}@ece.uvic.ca

Alex Loffler
CSO, Telus Telecommunications Inc.
3777 Kingsway
Burnaby, BC, V5H 3Z7
alex.loffler@telus.com

978-1-5090-2461-2/16 $31.00 © 2016 IEEE. DOI 10.1109/WAINA.2016.131

Abstract—As information technology (IT) and telecommunication systems continue to grow in size and complexity, especially with Internet of Things (IoT) gaining popularity, maintaining a secure and seamless exchange of information between devices becomes a challenging task. A large number of devices connected over the Internet leads to an increase in vulnerabilities and security threats, which makes the identification of critical assets necessary. Asset identification helps organizations to identify and to respond quickly to any security breaches. In this paper, machine learning based techniques are used to identify assets based on their connectivity, i.e., servers and endpoints. For the analysis presented in this paper, four different machine learning algorithms, K-Nearest Neighbor, Naive Bayes, Support Vector Machines, and Random Forest, are used and the performance of these algorithms is assessed in terms of the F-score calculated for each algorithm. Results show that for a given dataset, amongst all four algorithms, the Random Forest classifier achieved the highest accuracy in terms of identifying the assets correctly. However, the Random Forest algorithm is computationally intensive and may not work for large datasets. The Naive Bayes algorithm yielded the worst performance and K-Nearest Neighbor’s performance was very close to that achieved by Support Vector Machines. Our results show that for the given dataset, the Support Vector Machine based classifier was found to be a good compromise in terms of accuracy and computational cost.

Index Terms—asset classification, telecommunications, security, device classification

I. INTRODUCTION

The Internet of Things (IoT) paradigm is gaining increased popularity and is predicted to grow tremendously over the next five years. It is believed that by 2020 nearly 50 billion devices will be connected to the Internet and these interconnected device networks would be used to support various applications and services [1]. With a large number of devices connected to the Internet, maintaining a secure and seamless exchange of information between them is a daunting task. This would require addressing several security and privacy related challenges including, but not limited to, user information and data security, device identification and authentication, and end-to-end functional security requirements [2]. Confidentiality, authentication, and data integrity are key issues that need to be addressed to make the IoT more reliable and have it eventually become part of our daily lives, just like mobile phones are today. It is believed that by 2024, the number of gadgets owned per user will increase to six or more [3] and if the security concerns are not addressed in an adequate manner then they will have an adverse impact on the potential market for the IoT applications [4]. Studies suggest that authentication in the IoT network is crucial as it can have a rippling effect on data security over the IoT applications [5] and have suggested ways of addressing authentication challenges based on a device’s applications, platform, or connection. Both identification and authentication are the initial phases of accessing the network system, with identification generally taking place before authorization. However, depending on the systems, these terms may be used synonymously. Identifying critical assets is a key factor in maintaining secure communications between devices in the IoT [6].

Asset identification is also critical for enterprises, especially from a security perspective [6], [7]. In case of a security breach, assessing the impact of the damage caused by unauthorized entities and taking appropriate measures largely depends on the underlying knowledge possessed by an enterprise about its key assets [8], [9]. However, with additional devices being added to networks on a daily basis, identifying all of the assets owned by an enterprise becomes a challenging task, especially in the presence of limited information describing those assets [10].

Studies suggest that assets can generally be classified according to the following criteria: hardware based, which includes network, computer, storage, or security equipment; software based, which includes application or system software; and services or information based assets supporting either network or information services [11]. Studies have suggested different techniques to classify assets based on any of these criteria. For example, Wang et al. [12], [13], [14] suggested Radio Frequency Identification (RFID) technology for maintenance and management of hardware assets. Similarly, [15] suggested classification of assets based on the type of information they store and its criticality. Some studies [16] have also focused on designing a methodological tool (instrument) for asset identification in web applications for the purpose of risk assessment. However, the scalability of these techniques and tools within the IoT environment is still not clear.

Another factor unique to the IoT phenomenon is the pressure for high volume, extremely low cost devices (e.g., Internet
Protocol (IP) enabled light bulbs). This creates an enormous pressure to reduce per-unit cost. This typically manifests itself as a reduction in the devices’ on-board resources and capabilities, which in turn can lead to protocol violations or non-standard behaviors. This presents a risk to both the networks these devices are attached to as well as other services sharing these networks. Having the ability to automatically classify devices would allow a network operator to apply both security and traffic shaping controls to mitigate some of these risks, e.g., quality of service (QoS) policies, application firewalls, etc.

With the growth of IoT and the massive increase in the number of assets over the network infrastructure, identifying assets in a fast and efficient manner is crucial, especially from a security perspective. In this paper, techniques for device (asset) classification using machine learning algorithms, which can be easily used within the IoT framework with high accuracy and level of confidence, are proposed. The devices in the context of this paper refer to computing devices, i.e., servers and endpoints. To the best of our knowledge, classification of telecommunication devices using machine learning algorithms has not been performed in the literature. Moreover, this work is done in collaboration with a leading telecommunication service provider to ensure that the algorithms designed are directly applicable to real world industry data.

The overall objectives of this paper are: 1) to develop algorithms that can classify devices with high accuracy, 2) to develop techniques capable of selecting important features while eliminating any redundant features that can have an adverse effect on the accuracy of classifiers, 3) to deploy these algorithms outside of a laboratory setting and to test their performance within a collaborative industry partner’s framework.

This paper is organized as follows: related work in the field is discussed in Section II, an introduction to different classification algorithms is presented in Section III, the methodology followed to classify devices is discussed in Section IV, results are presented in Section V, and conclusions are made in Section VI.

II. RELATED WORK

A number of studies have used machine learning algorithms to address various security related issues surrounding the network infrastructure. For example, [17], [18], [19], [20] used supervised or unsupervised machine learning algorithms to classify network traffic flows and to identify traffic patterns yielding security related issues such as attacks, misuse of services, or unauthorized access of services. Similarly, [21], [22], [23] explored various machine learning algorithms for network intrusion detection with an underlying assumption that intruders exhibit behaviors different from the known behaviors of legitimate users [24]. In addition, [25], [26], [27] used these algorithms for anomaly detection, where anomalies refer to any unusual behaviors in the system. Other examples of applications of machine learning algorithms for network security purposes include: prediction of the strength of passwords [28], identifying users’ activities on networks [29], or detecting botnets [30]. While earlier studies have focused on techniques to monitor network infrastructure and identify cyber attacks, the efficiency of these techniques heavily relies on the knowledge of the underlying assets. Identifying assets that are at high risk during security breaches is crucial for an organization and its customers.

III. NETWORK DEVICE CLASSIFICATION

Network devices can be classified based on their applications, platforms they support, or connectivity. In this paper, machine learning based techniques are used for device classification based on connectivity. Classifications of devices based on applications and supporting platforms were also carried out, but presenting those results is beyond the scope of this paper. In the context of this paper, connectivity refers to communication between a device and other assets over the network and includes both servers and endpoints. Servers refer to the devices supporting dedicated services and include file servers, printers, web servers, database servers, mail servers, and domain name system (DNS) servers. Devices other than servers were considered as endpoint devices.

The classification algorithms are broadly classified into two main categories, i.e., supervised and unsupervised learning algorithms. In supervised algorithms, the classes are predetermined based on some pre-defined criteria and a sample of this predefined data is used to train the model. The task of the machine learning algorithm is to search for patterns similar to the pre-defined classes used to train the model. These models are evaluated on the basis of their predictive capacity in relation to measures of variance in the data itself. Some of the commonly used machine learning algorithms are K-Nearest Neighbor, Decision Trees, Random Forest, Naive Bayes, Neural Networks, and Support Vector Machines. Unsupervised learning algorithms are not provided with pre-defined classes and the objective of unsupervised learning algorithms is to develop classification labels automatically. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized to form distinct groups. These groups are termed clusters, and a number of machine learning algorithms for clustering are available in the literature [31].

In this paper, for the classification of devices, four different supervised machine learning algorithms were explored and the performance of these algorithms was compared in terms of the accuracy exhibited by each algorithm in predicting device classes. The algorithms considered in this study are: K-Nearest Neighbor (KNN), Naive Bayes, Support Vector Machine (SVM), and Random Forest classifiers. These algorithms are discussed in brief as follows:

A. K-Nearest Neighbor classifier (KNN)

The K-Nearest Neighbor, also known as instance-based learning, is one of the simplest machine learning algorithms that does not make any generalized assumptions either based on the underlying data distribution or training data itself. Additionally, due to lack of generalization, the KNN stores
feature vectors and class labels associated with them, for all the samples that are used for training, to predict a new test set label. To predict the class of a test sample, the closest K neighbors from the training set are selected and the final prediction is made based on the distance between the samples selected from training data and a given test sample. To determine the nearest neighbors, a commonly used distance function, such as the Euclidean distance, is used [32]. Let $x_i$ be a sample with $m$ features $(x_{i1}, x_{i2}, \cdots, x_{im})$, where $n$ is the total number of input samples $(i = 1, 2, \cdots, n)$ and $m$ is the total number of features $(j = 1, 2, \cdots, m)$. The Euclidean distance between samples $x_i$ and $x_l$, where $l = 1, 2, \cdots, n$, is defined as [33]

$$d(x_i, x_l) = \sqrt{(x_{i1} - x_{l1})^2 + (x_{i2} - x_{l2})^2 + \cdots + (x_{im} - x_{lm})^2} \tag{1}$$

B. Naive Bayes classifier (NB)

The Naive Bayes classifiers are simple probabilistic supervised machine learning classifiers based on applying Bayes’ theorem with an assumption of independence among features [34], [35]. Given a class variable $y$ and a dependent feature vector $x_1$ through $x_n$, Bayes’ theorem states

$$P(y \mid x_1, \cdots, x_n) = \frac{P(y)\, P(x_1, \cdots, x_n \mid y)}{P(x_1, \cdots, x_n)} \tag{2}$$

With the independence assumption that

$$P(x_i \mid y, x_1, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n) = P(x_i \mid y) \tag{3}$$

for all $i$, equation 2 can be further simplified as

$$P(y \mid x_1, \cdots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \cdots, x_n)} \tag{4}$$

Here $P(x_1, \cdots, x_n)$ is constant. Combining the above model with the maximum a posteriori (MAP) decision rule, $P(y)$ and $P(x_i \mid y)$ can be estimated. A Bayes classifier is the function that relates a class label $\hat{y}$ with training class $y$ as follows

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y) \tag{5}$$

where $x_i$ are the feature vectors and $y$ is the set of classes that are used in the classification. Equation 5 implies that in order to find which class a new test sample belongs to, the product of the probability of each feature given a particular class (likelihood) multiplied by the probability of the particular class (prior) must be calculated. After calculating $P(y) \prod_{i=1}^{n} P(x_i \mid y)$ for all the classes of set $y$, the one with the highest probability is selected.

C. Support Vector Machine (SVM)

The Support Vector Machine is a set of supervised learning methods used for classification or regression. They are based on the concept of decision planes that define decision boundaries separating sets of objects belonging to different classes. Following supervised learning approaches, SVM also uses the training data to predict the class for the new dataset, known as the test set, based on its attribute values. Given a training set of instance-label pairs $(x_i, y_i)$, $i = 1, \cdots, l$, where $x_i \in R^n$ is an $n$-dimensional real vector and $y_i \in \{-1, 1\}$, the SVM is based on solving the following optimization problem

$$\min_{w, b, \zeta} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \zeta_i \tag{6}$$

subject to $y_i (w^T \phi(x_i) + b) \geq 1 - \zeta_i$ and $\zeta_i \geq 0$, where $C > 0$ is the penalty parameter of the error term, $w$ is the vector of coefficients, $b$ is a constant, and the index $i$ runs over the $l$ training cases. Training vectors $x_i$ are mapped to a higher dimensional space by the function $\phi$. For nonlinear classifiers, corresponding to a linear classifier term $\phi(x_i)$, a kernel function $K$ where $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is used. A number of different kernel functions are proposed in the literature [36]. The four basic kernel functions are:

• linear: $K(x_i, x_j) = x_i^T x_j$
• polynomial: $K(x_i, x_j) = (\gamma x_i^T x_j + r)^d$, $\gamma > 0$
• radial basis function (RBF): $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, $\gamma > 0$
• sigmoid: $K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)$

where $\gamma$, $r$, and $d$ are kernel parameters.

D. Random Forest (RF)

Random forests are an ensemble learning method for classification and regression that is based on constructing decision trees at training time and outputting the class that is the mode of the classes output by individual trees. Several ways have been proposed and compared to each other to randomize the trees. These include: 1) bagging, i.e., training each tree on a random subset of training samples drawn independently and uniformly at random; 2) boosting, in which random subsets of samples are drawn in sequence, each subset is drawn from a distribution that favors samples on which previous classifiers in the sequence failed, and classifiers are given a vote weight proportional to their performance; and 3) arcing, similar to boosting but without the final weighting of votes [37].

IV. METHODOLOGY FOLLOWED

The NetFlow data files containing the information about the network traffic flow, including source IP address, destination IP address, source and destination port numbers, direction of the flow of traffic, protocol used, number of packets transmitted, duration for which the connection was made, and total data received in bytes, were used for analysis. NetFlow is a feature of Cisco routers that provides valuable information about network users and applications, peak usage times, and traffic routing [38]. The attributes present in the NetFlow data are referred to as primary attributes. Primary attributes alone may not classify devices accurately as IP addresses are not dedicated to a single machine. Rather, based on these primary attributes, secondary attributes were also derived for each IP address, including the total number of incoming and outgoing
sessions, ratio of the incoming to outgoing sessions, number and types of ports used for connection, number of other IP addresses with whom connection was made, duration of the connections and bytes received. The attribute values, excluding port numbers, were standardized using z-score scaling such that all features are characterized by zero mean ($\mu$) and unit standard deviation ($\sigma$), using the following expression [39].

$$x_{new} = \frac{x - \mu}{\sigma} \tag{7}$$

The Principal Component Analysis technique, discussed in brief below, was then applied to find the subset of features used for further analysis [40].

A. Principal Component Analysis (PCA)

The PCA converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components, which define the variance in the dataset, with the first component associated with the largest possible variance, and each succeeding component in turn having the highest variance possible under the constraint that it is orthogonal to the preceding components. While these principal factors represent or replace one or more of the original variables, it should be noted that they are not just a one-to-one transformation, so inverse transformations are not possible.

Let $X$ represent the original data matrix with $p$ variables and $n$ samples. Let $Y$ be another matrix obtained by linear transformation of matrix $X$ such that $Y = AX$, with

$$Y_1 = a_1^T X = a_{11} X_1 + a_{12} X_2 + \cdots + a_{1p} X_p \tag{8}$$

The weights $a_{11}, a_{12}, \cdots, a_{1p}$ are selected such that $a_{11}^2 + a_{12}^2 + \cdots + a_{1p}^2 = 1$. Here $Y_1$ represents the first principal component that accounts for the highest possible variance in the data set. The second principal component is calculated in the same way, with the condition that it is uncorrelated with the first principal component and that it accounts for the next highest variance. This procedure is continued until all the principal components, equal in number to the original variables, have been obtained and all the information of the original dataset has been accounted for. The rows of $A$ are the eigenvectors of matrix $C_X$, where $C_X = \frac{1}{n} X X^T$. The covariance matrix of principal components, $C_Y$, is thus a diagonal matrix with each eigenvalue giving the variance of the corresponding principal component. Note that the variances of the principal components are constrained to decrease monotonically from the first principal component to the last. More details on PCA can be found in [41], [42], [43].

Next, for each IP address, attributes derived using PCA and its associated port numbers are obtained. The classifier models were designed to predict the class to which each device, identified by its IP address, belongs. For the analysis presented in this paper, four different classification algorithms, discussed above, were used and their performance was assessed in terms of accuracy yielded by each algorithm. Figure 1 summarizes the steps followed in the design and analysis of our asset classification model presented in this paper.

Fig. 1: Schematic representation of asset classification model

V. RESULTS

For the analysis presented in this paper, 21,500 records were considered. Nearly 4,130 records with dedicated server ports were categorized as server class and the remaining records were categorized as endpoints. Following other studies [44], 70% of the data was used for training and the model was tested using the remaining 30% of the data. The performance of the classifiers was assessed using the following measures:

$$Precision = \frac{tp}{tp + fp} \tag{9}$$

$$Recall = \frac{tp}{tp + fn} \tag{10}$$

where $tp$ is true positive, which is the number of values correctly assigned to a class, $fp$ is false positive, i.e., the number of values incorrectly assigned to a class, and $fn$ is the number of values that are incorrectly not assigned to the class [44]. A measure that combines precision and recall is the F-score given by

$$F = 2 \cdot \frac{precision \cdot recall}{precision + recall} \tag{11}$$

The results obtained using the four classifiers are discussed below.

TABLE I: Performance of Classifier models

Classifier    Precision (%)    Recall (%)    F-Score (%)
KNN           91.8             97.5          94.5
SVM           92.5             88.2          90.2
NB            96               61.3          74.5
RF            95.8             99            97.3

Table I shows the percentage values of precision, recall, and F-score obtained using the four different classifier models
for device classification. The values of the F-score show that for the given dataset, the Naive Bayes algorithm yielded the worst performance in terms of accuracy as compared to the other algorithms. This could be related to the inability of the Naive Bayes algorithm to perform efficiently for the dataset used in this analysis, where the features exhibited high variance. Additionally, features like the ratio of the incoming to outgoing traffic used in the analysis are correlated with other features like incoming traffic sessions and outgoing traffic sessions. Thus, the assumption of independence between features for the Naive Bayes algorithm did not hold well, yielding poor performance. The SVM classifier exhibited higher accuracy than the NB algorithm. This is because SVM algorithms can perform quite well in separating the classes for datasets with a lower number of attributes. Moreover, SVM algorithms are known to perform well when applied to datasets like the one used in this paper, with partially known labels for certain devices. The KNN yielded higher accuracy compared to both the NB and SVM algorithms. This is because the KNN algorithm is quite efficient in identifying the class labels for a dataset with a small number of variables, given that only a subset of features is used in this paper for analysis. However, the performance of the KNN may not remain the same if all the available features of our dataset were to be used for analysis. Finally, the RF algorithm yielded the highest accuracy. The higher accuracy of the RF algorithm is based on the ability of our model to learn from different individual decision trees, thus capturing the effect of randomness in the training data, before yielding the final output classifier. However, the RF algorithm was found to be quite slow and computationally very expensive and therefore may not work well in a real time scenario or for large datasets.

Fig. 2: Confusion matrix results for the four classifiers considered

Figure 2 shows true positive (TP), true negative (TN), false positive (FP), and false negative (FN) results obtained using the four classifiers. In Figure 2, although the TP and FN results convey the same information, namely that the RF algorithm is the best followed by KNN and SVM, with the NB algorithm yielding the worst performance, the TN results reveal that the NB algorithm is best followed by SVM and RF, with the KNN algorithm yielding the worst performance. Given that the number of instances in the server class was quite low compared to the endpoint class, drawing any conclusions about the NB algorithm based on TN or FP results is not straightforward. Similar analysis was also carried out for different datasets and the Naive Bayes algorithm consistently yielded poor performance.

VI. CONCLUSIONS

In this paper a machine learning based asset classification model for application in a telecommunication organization is proposed. The raw data in the form of NetFlow data files serves as an input to the model. Both the primary and secondary features were then extracted to form the class labels for the classifier models. Four different classifiers were used for analysis (KNN, Naive Bayes, SVM and Random Forest) and the accuracy of the classifier models was calculated using F-scores. Results show that the Naive Bayes classifier’s assumption of independence between features did not work well for the given dataset and thus the Naive Bayes algorithm yielded the lowest accuracy when compared to the other three classifier models. The Random Forest algorithm yielded the highest accuracy but was found to be highly computationally expensive. The accuracy obtained by using both KNN and SVM classifier models was relatively close to that obtained by the Random Forest algorithm, but the SVM proved to be a little more computationally efficient than the KNN classifier. This is because in the KNN classifier the distance of each test sample to all the training samples needs to be calculated.

Results show that the use of machine learning based classifier models provides a viable solution in automating asset classification, especially for large datasets. Automating asset classification is important for organizations as it would allow them to maintain and control various devices within the IoT infrastructure, thus mitigating some of the risks associated with device breakages. It would also allow various teams within the organization to identify and respond quickly to any security breaches or event failures.
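The remark above about KNN needing the distance from each test sample to every training sample can be made concrete with a small sketch. The code below is a minimal illustration only, not the implementation used in this work: it combines z-score standardization (Eq. 7) with a brute-force K-Nearest Neighbor vote based on the Euclidean distance (Eq. 1). The function names, the two toy features (incoming and outgoing session counts), and all data values are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

def zscore_fit(rows):
    """Per-feature (mean, std) from the training rows, as used in Eq. 7."""
    n = len(rows)
    stats = []
    for col in zip(*rows):
        mu = sum(col) / n
        sigma = math.sqrt(sum((v - mu) ** 2 for v in col) / n)  # population std
        stats.append((mu, sigma if sigma > 0 else 1.0))  # guard constant features
    return stats

def zscore_apply(row, stats):
    """Standardize one sample with the training statistics."""
    return [(v - mu) / sigma for v, (mu, sigma) in zip(row, stats)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors (Eq. 1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_x, train_y, query, k=3):
    """Brute-force KNN: one distance per training sample, then a majority vote."""
    dists = sorted((euclidean(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: [incoming sessions, outgoing sessions] per IP address.
raw_train = [[900, 20], [850, 35], [780, 15], [12, 60], [8, 75], [20, 90]]
labels = ["server", "server", "server", "endpoint", "endpoint", "endpoint"]

stats = zscore_fit(raw_train)
train = [zscore_apply(r, stats) for r in raw_train]
prediction = knn_predict(train, labels, zscore_apply([820, 25], stats), k=3)
print(prediction)
```

Because `knn_predict` evaluates `euclidean` once per training sample, the cost of classifying a single test sample grows linearly with the size of the training set, which is the effect described in the conclusions.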

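As a rough cross-check of the evaluation measures, the F-score of Eq. 11 can be recomputed from the rounded precision and recall values reported in Table I. The sketch below is illustrative only; the helper names are hypothetical. Recomputing from rounded inputs gives values within a few tenths of a percentage point of the reported F-scores, and the assumption made here is that any small difference comes from rounding of the published precision and recall.

```python
def precision(tp, fp):
    """Eq. 9: fraction of values assigned to a class that belong to it."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Eq. 10: fraction of values of a class that were assigned to it."""
    return tp / (tp + fn)

def f_score(p, r):
    """Eq. 11: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# (precision %, recall %, reported F-score %) as listed in Table I.
table_i = {
    "KNN": (91.8, 97.5, 94.5),
    "SVM": (92.5, 88.2, 90.2),
    "NB": (96.0, 61.3, 74.5),
    "RF": (95.8, 99.0, 97.3),
}

for name, (p, r, reported) in table_i.items():
    print(f"{name}: recomputed {f_score(p, r):.1f}, reported {reported}")

# Ranking by recomputed F-score, best first.
ranking = sorted(table_i, key=lambda n: f_score(*table_i[n][:2]), reverse=True)
print(ranking)
```

The ranking obtained this way matches the ordering discussed in Section V: Random Forest first, then KNN, then SVM, with Naive Bayes last.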
ACKNOWLEDGMENT

The authors would like to thank Telus Telecommunications Inc., Mitacs and NSERC for supporting this project.

REFERENCES

[1] G. Fink, D. Zarzhitsky, T. Carroll, and E. Farquhar, “Security and privacy grand challenges for the internet of things,” in International Conference on Collaboration Technologies and Systems (CTS), June 2015, pp. 27–34.
[2] M. Hossain, M. Fotouhi, and R. Hasan, “Towards an analysis of security issues, challenges, and open problems in the internet of things,” in IEEE World Congress on Services (SERVICES), June 2015, pp. 21–28.
[3] P. N. Mahalle, B. Anggorojati, N. R. Prasad, and R. Prasad, “Identity authentication and capability based access control (IACAC) for the internet of things,” Feb 2013.
[4] S. Subashini and V. Kavitha, “Review: A survey on security issues in service delivery models of cloud computing,” J. Netw. Comput. Appl., vol. 34, no. 1, pp. 1–11, Jan. 2011.
[5] A. Gamundani, “An impact review on internet of things attacks,” in International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), May 2015, pp. 114–118.
[6] M. Rothman. Continuous security monitoring classifying assets. [Online]. Available: http://www.tripwire.com/state-of-security/security-data-protection/continuous-security-monitoring-classifying-assets-4/
[7] A. Kadam. Identifying and classifying assets. [Online]. Available: http://www.networkmagazineindia.com/200212/security2.shtml
[8] S. Corporation. Assets, threats and vulnerabilities: discovery and analysis. [Online]. Available: https://www.symantec.com/content/en/us/enterprise/media/securityresponse/whitepapers/RiskManagement.pdf
[9] O. Brdiczka. Do you know how to protect your key assets? [Online]. Available: http://www.computerworld.com/article/2892229/detecting-the-insider-threat-do-you-know-how-to-protect-your-key-assets.html
[10] R. M. Lee. Active cyber defense cycle asset identification and network security monitoring. [Online]. Available: http://www.csemag.com/
[11] C. N. S.R. Gupta. Asset identification in information security risk analysis. [Online]. Available: http://ijretm.com/paper/SP june 2014/IJRETM-2014-SP-035.pdf
[12] M. Wang, J. Tan, and Y. Li, “Design and implementation of enterprise asset management system based on IOT technology,” in IEEE International Conference on Communication Software and Networks (ICCSN), June 2015, pp. 384–388.
[13] R. W. Klein, M. A. Temple, and M. J. Mendenhall, “Application of wavelet-based RF fingerprinting to enhance wireless network security,” Communications and Networks, Journal of, vol. 11, no. 6, pp. 544–555, Dec 2009.
[14] M. Williams, M. A. Temple, and D. Reising, “Augmenting bit-level network security using physical layer RF-DNA fingerprinting,” in IEEE Global Telecommunications Conference (GLOBECOM 2010), Dec 2010, pp. 1–6.
[15] X. Hu, T. Wang, M. Stoecklin, D. Schales, J. Jang, and R. Sailer, “Asset risk scoring in enterprise network with mutually reinforced reputation propagation,” in IEEE Security and Privacy Workshops (SPW), May 2014, pp. 61–64.
[16] B. Romero M, H. Haddad, and J. Molero A, “A methodological tool for asset identification in web applications: Security risk assessment,” in Fourth International Conference on Software Engineering Advances, Sept 2009, pp. 413–418.
[17] A. W. Moore and D. Zuev, “Internet traffic classification using bayesian analysis techniques,” SIGMETRICS Perform. Eval. Rev., vol. 33, no. 1, pp. 50–60, June 2005.
[18] Z. Li, R. Yuan, and X. Guan, “Accurate classification of the internet traffic based on the SVM method,” in IEEE International Conference on Communications, ICC ’07., June 2007, pp. 1373–1378.
[19] J. Zhang, X. Chen, Y. Xiang, W. Zhou, and J. Wu, “Robust network traffic classification,” Networking, IEEE/ACM Transactions on, vol. 23, no. 4, pp. 1257–1270, Aug 2015.
[20] W. Lu and L. Xue, “A heuristic-based co-clustering algorithm for the internet traffic classification,” in 28th International Conference on Advanced Information Networking and Applications Workshops (WAINA), May 2014, pp. 49–54.
[21] K. Tabia and S. Benferhat, “On the use of decision trees as behavioral approaches in intrusion detection,” in Seventh International Conference on Machine Learning and Applications, ICMLA ’08., Dec 2008, pp. 665–670.
[22] T. Khoshgoftaar, S. Nath, S. Zhong, and N. Seliya, “Intrusion detection in wireless networks using clustering techniques with expert analysis,” in Fourth International Conference on Machine Learning and Applications, Dec 2005, pp. 6 pp.–.
[23] J. Li, G.-Y. Zhang, and G. chang Gu, “The research and implementation of intelligent intrusion detection system based on artificial neural network,” in Proceedings of 2004 International Conference on Machine Learning and Cybernetics, vol. 5, Aug 2004, pp. 3178–3182 vol.5.
[24] S. Mukkamala, G. Janoski, and A. Sung, “Intrusion detection using neural networks and support vector machines,” in Proceedings of the 2002 International Joint Conference on Neural Networks, IJCNN ’02, vol. 2, 2002, pp. 1702–1707.
[25] G. Jidiga and P. Sammulal, “Anomaly detection using machine learning with a case study,” in International Conference on Advanced Communication Control and Computing Technologies (ICACCCT), May 2014, pp. 1060–1065.
[26] S. Omar, A. Ngadi, and H. H. Jebur, “Article: Machine learning techniques for anomaly detection: An overview,” International Journal of Computer Applications, vol. 79, no. 2, pp. 33–41, October 2013, full text available.
[27] T. Ahmed, B. Oreshkin, and M. Coates, “Machine learning approaches to network anomaly detection,” in Proceedings of the 2nd USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques, ser. SYSML’07, 2007, pp. 7:1–7:6.
[28] M. Vijaya, K. Jamuna, and S. Karpagavalli, “Password strength prediction using supervised machine learning techniques,” in International Conference on Advances in Computing, Control, Telecommunication Technologies, ACT ’09., Dec 2009, pp. 401–405.
[29] M. Tao, Y. C. Ming, and C. Juan, “Profiling and identifying users’ activities with network traffic analysis,” in 6th IEEE International Conference on Software Engineering and Service Science (ICSESS), Sept 2015, pp. 503–506.
[30] M. Stevanovic and J. Pedersen, “An efficient flow-based botnet detection using supervised machine learning,” in International Conference on Computing, Networking and Communications (ICNC), Feb 2014, pp. 797–801.
[31] S. Theodoridis and K. Koutroumbas, Introduction to pattern recognition. London: Academic, 2010.
[32] k-nearest neighbors algorithm. [Online]. Available: https://en.wikipedia.org/wiki/K-nearestneighborsalgorithm
[33] k-nearest neighbors algorithm. [Online]. Available: http://www.scholarpedia.org/article/K-nearest neighbor
[34] Naive bayes classifier. [Online]. Available: https://en.wikipedia.org/wiki/Naive Bayes classifier
[35] Naive bayes. [Online]. Available: http://scikit-learn.org/stable/modules/naive bayes.htmlr
[36] C. Hsu, C. Chang, and C. Lin. (2010) A practical guide to support vector classification. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
[37] C. Tomasi. Random forest classifiers. [Online]. Available: https://www.cs.duke.edu/courses/fall15/compsci527/notes/random forests.pdf
[38] Cisco IOS netflow. [Online]. Available: http://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html
[39] Methods for data standardization. [Online]. Available: https://www.biomedware.com/files/documentation/Preparingdata/Methodsfordatastandardization.htm
[40] F. Song, Z. Guo, and D. Mei, “Feature selection using principal component analysis,” in International Conference on System Science, Engineering Design and Manufacturing Informatization (ICSEM), vol. 1, Nov 2010, pp. 27–30.
[41] Applied multivariate statistical analysis. [Online]. Available: https://onlinecourses.science.psu.edu/stat505/node/51
[42] Principal component analysis. [Online]. Available: https://en.wikipedia.org/wiki/Principal component analysis
[43] J. Shlens. A tutorial on principal component analysis. [Online]. Available: http://arxiv.org/pdf/1404.1100v1.pdf
[44] N. Aharrane, K. El Moutaouakil, and K. Satori, “A comparison of supervised classification methods for a statistical set of features: Application: Amazigh ocr,” in Intelligent Systems and Computer Vision (ISCV), 2015, March 2015, pp. 1–8.
