An Intelligent Intrusion Detection System Using Outlier Detection and Multiclass SVM
Int. J. on Recent Trends in Engineering & Technology, Vol. 05, No. 01, Mar 2011
S.GANAPATHY1, N.JAISANKAR2, P.YOGESH3 AND A.KANNAN4, Department of Information Science and Technology, College of Engineering Guindy, Anna University, Chennai-25.
ganapathy.sannasi@gmail.com 1 , jaisasi_win@yahoo.com 2 , yogesh@annauniv.edu 3 ,kannan@annauniv.edu 4
Abstract - Intrusion Detection Systems have been used along with various techniques to detect intrusions in networks, distributed databases and web databases. However, these systems detect intruders with a high false alarm rate. In this paper, we propose a new intrusion detection model that combines an outlier detection method with multiclass SVM classification. For this purpose, we propose a new outlier detection algorithm called the Weighted Distance Based Outlier Detection (WDBOD) algorithm and an Enhanced Multiclass Support Vector Machine algorithm for detecting intruders. The experimental results show that the proposed model detects anomalies with a low false alarm rate and a high detection rate when tested with the KDD Cup 99 data set.
Index Terms - Intrusion Detection System (IDS), Multiclass Support Vector Machine (MSVM), Weighted Distance Based Outlier Detection (WDBOD), K-Nearest Neighbor (KNN)
I. INTRODUCTION
Network based computer systems play an increasingly vital role in society, and the Intrusion Detection System (IDS) has become an indispensable component of security architecture. Though many IDSs have been proposed and implemented in the past, the existing intrusion detection methods, including misuse detection and anomaly detection [6], are generally incapable of adapting to new types of networks. This causes a high false positive rate during intrusion detection using such systems. Moreover, traditional intrusion detection methods can only detect known intrusions, since they classify instances by what they have learned. Hence, building adaptive IDSs with self-learning abilities has become a hot spot in the security field.

An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism [Hawkins 1980]. Like supervised learning methods, outlier detection is useful for classification, and it is better suited to applications in which label information is either hard to obtain or unreliable. Typical examples of such application areas include network intrusion detection, fraud detection and fault detection in manufacturing, among other things.

In this paper, we propose a new intrusion detection model for detecting intruders in networks and databases. It combines an outlier detection algorithm with an Enhanced Multiclass SVM (EMSVM) classification algorithm in order to classify attacks effectively and to detect them. Even though various types of attacks occur on the Internet, we focus mainly on the effective detection of DDoS attacks using the enhanced MSVM algorithm, since DDoS attacks are more serious than other attacks. We use Enhanced Multiclass Support Vector Machines for classification since they are accurate classifiers and are more effective in binary classification.

II. LITERATURE SURVEY
Many works in the literature discuss intrusion detection, classification and outlier detection. Among them, Fabrizio Angiulli et al. [2] have proposed a distance based outlier detection method, which finds the top outliers in an unlabeled data set and provides a subset of it, called the outlier detection solving agent. This solving agent can investigate the accuracy effectively based on outliers. Farhan Abdel-Fattah et al. [3] have proposed a novel intrusion detection method that combines two anomaly methods, namely the conformal predictor k-nearest neighbor and the distance based outlier detection (CPDOD) algorithm. Jeen Shing Wang et al. [1] have proposed a cluster validity measure with outlier detection and cluster merging algorithms for the Support Vector Clustering algorithm. This algorithm is capable of identifying the ideal cluster number with compact and smooth arbitrary shaped cluster contours, which increases robustness to outliers and noise.

Many classification algorithms are found in the literature. For example, an algorithm called Tree structured Multiclass SVM has been proposed by Snehal A. Mulay et al. [9] for classifying data effectively. Their paper proposed decision tree based algorithms to construct a multiclass IDS, with the aim of improving the training time, testing time and accuracy of the IDS.
2011 ACEEE
DOI: 01.IJRTET.05.01.174
Review Paper
Multiple level tree classifiers were proposed by various researchers in the past [8] in order to design effective IDSs. In such systems, the data are first split into normal, DOS, PROBE and others (a new class label covering U2R and R2L). In the second level, the algorithm splits the others into the corresponding U2R and R2L classes, while the third level classifies the attacks into their individual specific attacks. However, it is necessary to classify the DOS attacks with special attention in order to improve network performance. Chaobin Liu et al. [4] have constructed an improved multiclass support vector machine based on a binary tree structure. Srinivas Mukkamala et al. [7] proposed an IDS for detecting DoS attacks using Support Vector Machines (SVMs).

Compared with the works in the literature, the IDS proposed in this paper differs in several ways. First, we use an outlier detection method for finding relevant data. Second, we use an enhanced MSVM for classification. Third, we focus on DDoS attacks, which are the most important among the different types of attacks. Finally, we use the KDD Cup data set for carrying out the experiments.

III. SYSTEM ARCHITECTURE
The architecture of the system proposed in this work consists of two major modules, namely the Data Collection module and the Intrusion Detection module, as shown in Fig. 1. The Data Collection module is used to collect the network data. The Intrusion Detection module is used to detect the intruder from the given data using the Weighted Average Distance based Detection Algorithm and the Enhanced Multiclass Support Vector Machine Algorithm. It distinguishes intruders from normal users using an outlier detection algorithm which is combined with SVM for obtaining better classification accuracy.

IV. PROPOSED WDBOD ALGORITHM
A. Weighted Distance Based Outlier Detection (WDBOD)
We improved the Conformal Prediction for K-Nearest Neighbor (CP-KNN) nonconformity score [4] calculation by using the Manhattan distance between two points. This nonconformity score is calculated from the shortest distances between the i-th and j-th nodes (y) as well as between the j-th and i-th nodes (-y) in the same class y.

B. Weighted Distance based Outlier Detection
In this work, we propose a weighted distance based outlier factor to determine the outlierness of a point in the feature set. We have used this algorithm to improve the detection accuracy. The formal definition of the weighted distance based outlier factor is as follows. Let $s_1, s_2, \ldots, s_k$ be the K-nearest neighbors of an object $x$ with weights $w_1, w_2, \ldots, w_k$. The weighted distances of these k nearest neighbors from the object are defined as $w_1 d_1, w_2 d_2, \ldots, w_k d_k$, where $d_1, d_2, \ldots, d_k$ are the ordinary Euclidean distances, that is, $d(x, s_i) = w_i d_i$. Their average weighted distance, referred to as Eq. (3) in the algorithm below, is then computed over these $k$ values.
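The average weighted distance used by WDBOD, for an object $x$ with K-nearest neighbors $s_1, \ldots, s_k$, weights $w_i$ and Euclidean distances $d_i$, can be written out explicitly. The normalization is not stated in the text, so the form below assumes a plain arithmetic mean over the $k$ weighted distances; dividing by $\sum_i w_i$ instead would be an equally plausible reading. The symbol $\bar{d}_w(x)$ is introduced here only for illustration.

```latex
\bar{d}_w(x) \;=\; \frac{1}{k} \sum_{i=1}^{k} d(x, s_i) \;=\; \frac{1}{k} \sum_{i=1}^{k} w_i \, d_i
```

With uniform weights $w_i = 1$ this reduces to the ordinary mean distance to the $k$ nearest neighbors.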
a) Weighted Distance Based Outlier Detection Algorithm (WDBOD)
Input: The training data set (TS).
Output: The set of p-values, where TS is a two-class data set with classes normal (n) and abnormal (a).
The algorithm steps are as follows.
Phase 1: Training
1. Calculate the weighted average distance using Eq. (3).
2. Compute the number of nodes n whose distance is greater than the weighted average.
3. Compute the inner weighted average for the k-nearest inner nodes.
4. Train the data for inner and outer nodes.
Phase 2: Testing
1. Compute the weighted distance of the newly arriving node.
2. If the distance of the newly arrived node is greater than the weighted average distance, label it abnormal; otherwise label it normal.
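The threshold test at the heart of the two phases above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that the text does not confirm: distances are Euclidean, the weight $w_i$ is paired with the i-th nearest neighbor, the threshold is the plain mean of the per-point weighted distances over the training set, and the inner/outer refinement of training steps 2 to 4 is omitted. The function names `weighted_knn_score`, `fit_threshold` and `classify` and the toy data are hypothetical.

```python
import numpy as np

def weighted_knn_score(x, data, weights, skip_self=False):
    """Average of w_i * d_i over the k nearest neighbors of x (k = len(weights))."""
    weights = np.asarray(weights, dtype=float)
    # Euclidean distances from x to every row of data, nearest first.
    dists = np.sort(np.linalg.norm(np.asarray(data, dtype=float) - x, axis=1))
    if skip_self:
        dists = dists[1:]  # x is itself a row of data: drop its zero distance
    return float(np.mean(weights * dists[:len(weights)]))

def fit_threshold(train, weights):
    """Phase 1, step 1 (assumed form): mean weighted distance over the training set."""
    train = np.asarray(train, dtype=float)
    scores = [weighted_knn_score(x, train, weights, skip_self=True) for x in train]
    return float(np.mean(scores))

def classify(x, train, weights, threshold):
    """Phase 2: flag the new node when its weighted distance exceeds the threshold."""
    score = weighted_knn_score(np.asarray(x, dtype=float), train, weights)
    return "abnormal" if score > threshold else "normal"

# Hypothetical toy data: a 3x3 grid of "normal" points.
train = [(i, j) for i in range(3) for j in range(3)]
weights = [1.0, 1.0]  # k = 2, uniform weights (an assumption)
threshold = fit_threshold(train, weights)
print(classify((0.5, 0.5), train, weights, threshold))  # normal
print(classify((10, 10), train, weights, threshold))    # abnormal
```

A point inside the grid scores below the training average and is labeled normal, while a point far from every training node scores well above it and is labeled abnormal.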
Fig 1 System Architecture
b) Multiclass Support Vector Machine
The Multiclass Support Vector Machine (MSVM) algorithm proceeds as follows. First, we compute the distance between two classes of patterns and repeat this for each class of such patterns. The distance between two classes is computed using the Minkowski distance. According to this method, the distance between two points $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ is defined as

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p}$$

where $p$ is the order. We also find the center point of every class.

After this calculation, the five classes obtained earlier are converted into two classes. For example, let A, B, C, D and E be five classes. If the Minkowski distance between any two classes is less than that of the other classes, then that pair is replaced by 1 (Normal). Otherwise, it is replaced by -1 (Attacker). So, at the end of the repeated process, we have only combinations of 1s and -1s. Since the -1 classes are removed, the remaining classes are used to construct the tree.

c) Enhanced Multiclass Support Vector Machine algorithm
1. Confirm two initial cluster centers by algorithm search m.
2. Import a new class C.
3. Compute the Minkowski distance between two classes.
4. If (dAB > dAC) then B is assigned as Normal, else C is assigned as Attacker.
5. Find the min and max of the distance.
6. If (dAB < threshold limit of the distance) then create a new cluster, with this as the center of the new cluster; else B is assigned as an Attacker.
7. Repeat the operation until the difference between the classes is reduced.

V. EXPERIMENTATION AND RESULTS
A. Training and Test data
The dataset used in the experiment was taken from the Third International Knowledge Discovery and Data Mining Tools Competition (KDD Cup 99). Each connection record is described by 41 attributes. The list of attributes consists of both continuous-type and discrete-type variables, with statistical distributions varying drastically from each other, which makes intrusion detection a very challenging task.

B. Experimental Results
Table 1 gives the performance analysis of the proposed system (outlier detection with EMSVM). Figure 1 illustrates the comparison of false positive rate in different experiments. Figure 2 shows the performance comparison of MSVM, Enhanced MSVM and WDBOD with EMSVM for different attacks like DoS, Probe and Others.
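Returning to the class-distance step of the MSVM in Section IV, the Minkowski distance between class centers can be sketched as follows. The text does not say how the center point of a class is computed, so the sketch assumes it is the per-feature mean of the class members. The helper names `minkowski`, `class_centers` and `closest_pair` and the toy records are hypothetical, and the sketch covers only the distance computation, not the full EMSVM procedure.

```python
import numpy as np

def minkowski(x, y, p=2):
    """Minkowski distance of order p between two points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def class_centers(X, labels):
    """Center of each class, assumed here to be the mean of its members."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    return {c: X[labels == c].mean(axis=0) for c in np.unique(labels).tolist()}

def closest_pair(centers, p=2):
    """Return (distance, class_a, class_b) for the two nearest class centers."""
    names = sorted(centers)
    pairs = [
        (minkowski(centers[a], centers[b], p), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return min(pairs)

# Hypothetical toy records with three class labels.
X = [[0, 0], [0, 2], [1, 0], [1, 2], [10, 10], [10, 12]]
labels = ["A", "A", "B", "B", "C", "C"]
centers = class_centers(X, labels)
dist, a, b = closest_pair(centers, p=2)
print(a, b, dist)  # A B 1.0
```

Here the centers of A and B lie one unit apart while C is far from both, so the pair (A, B) is the one that the pairing rule in the text would mark first.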
Figure 2 The Results Comparison between EMSVM and Outlier Detection with EMSVM
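The detection rate, false alarm rate and accuracy used to compare the systems can be computed from confusion counts. The paper does not spell out its definitions, so the sketch below assumes the standard ones: detection rate as the fraction of attacks flagged, and false alarm rate as the fraction of normal records flagged. The function name and the counts are hypothetical and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (detection rate, false alarm rate, accuracy) from confusion counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal records flagged          tn: normal records passed
    """
    detection_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return detection_rate, false_alarm_rate, accuracy

# Hypothetical counts for illustration only.
dr, far, acc = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"detection rate={dr:.3f}, false alarm rate={far:.3f}, accuracy={acc:.3f}")
```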
VI. CONCLUSION AND FUTURE WORKS
In this work, an intelligent IDS has been proposed and implemented using a combination of two algorithms, namely the Weighted Distance Based Outlier Detection (WDBOD) algorithm and the Enhanced Multiclass SVM, for effective intrusion detection. The classification accuracies for DoS, Probe and other attacks are 99.22%, 99.12% and 99.3%, respectively. The main advantage of this work is the increase in detection accuracy and the reduction in false positive rates. Future work in this direction could use tuple reduction techniques for preprocessing in this intrusion detection model.

REFERENCES
1. Jeen Shing Wang and Jen-Chieh Chiang, A Cluster Validity Measure with Outlier Detection for Support Vector Machine, IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, Vol. 38, No. 1, pp. 78-89, February 2008.
2. Fabrizio Angiulli, Stefano Basta, and Clara Pizzuti, Distance based Detection and Prediction of Outliers, IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 2, February 2006.
3. Farhan Abdel-Fattah, Zulkhairi Md. Dahalin and Shaidah Jusoh, Dynamic Intrusion Detection Method for Mobile Ad Hoc Networks Using CPDOD Algorithm, IJCA Special Issues on Mobile Ad-hoc Networks MANETs, pp. 22-29, 2010.
4. Chaobin Liu, Yuexiang Yang, Chuan Tang, An Improved Method for Multiclass Support Vector Machines, 2010 International Conference on Measuring Technology and Mechatronics Automation, IEEE, 2010.
5. S. Ganapathy, P. Yogesh, A. Kannan, An Intelligent Intrusion Detection System for Mobile Ad-Hoc Networks Using Multiclass Classification, Proceedings of the International Conference on Power, Control and Embedded Systems, pp. 438-444, December 2010.
6. Denning D. E., An Intrusion Detection Model, IEEE Transactions on Software Engineering, Vol. 51, No. 8, pp. 1226, Aug 2003.
7. Srinivas Mukkamala, Andrew H. Sung, Detecting Denial of Service Attacks using Support Vector Machine, The IEEE International Conference on Fuzzy Systems, pp. 1231-1236, 2003.
8. KDD Cup 1999 Data, Information and Computer Science, University of California, Irvine. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.
9. Snehal A. Mulay, P. R. Devale, G. V. Garje, Intrusion Detection System using Support Vector Machine and Decision Tree, International Journal of Computer Applications, Vol. No.3, pp.0975-8887, June 2010.

ABOUT AUTHORS
S. Ganapathy received his M.E. degree in Computer Science & Engineering from Anna University, Chennai, where he is currently pursuing a Ph.D. His area of interest is network security.
N. Jaisankar received his M.E. degree in Computer Science & Engineering from Sathyabama University, Chennai. He is currently pursuing a Ph.D. at Anna University, Chennai. His areas of interest include data mining and security.
P. Yogesh is currently working as an Assistant Professor at Anna University, Chennai. He received his M.E. from Madurai Kamarajar University and his Ph.D. from Anna University. He has published 20 papers in journals and conferences. His areas of interest include computer networks and security.
A. Kannan is currently working as a Professor at Anna University, Chennai, where he received his M.E. and Ph.D. in Computer Science & Engineering. He has published more than 80 articles in journals and conferences. His areas of interest include databases, artificial intelligence and security.