
Classification Rule Discovery on Biological Dataset Using Ant Colony Optimization

Classification systems have been widely used in the medical domain to explore patient data and extract a predictive model. Such a model helps physicians improve their prognosis, diagnosis, or treatment planning procedures. Classification is one of several data mining functionalities: it assigns objects to predefined classes or labels, with the aim of placing each object in its target class. Biology-inspired algorithms such as Genetic Algorithms (GA) and swarm-based approaches such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) have been used to solve many data mining problems. This project addresses binary classification. Its main aim is to discover classification rules on a biological dataset using Ant-Miner, calculating an accuracy function that is tied to the pheromone update levels. Ant-Miner is a rule induction algorithm that uses collective intelligence to construct classification rules.

International Journal of Engineering Technology and Management (IJETM)
ISSN: 2394-6881
Available Online at www.ijetm.org
Volume 3, Issue 4; July-August: 2016; Page No. 01-12

M. Ramachandro (1), Dr. R. Bhramaramba (2)
(1) Asst. Professor, Department of CSE, GMRIT Rajam-32127, Srikakulam, rama00565@gmail.com
(2) Associate Professor, Dept. of Information Technology, GIT, GITAM University, Visakhapatnam, bhramarambaravi@gmail.com
Corresponding author: M. Ramachandro

Keywords: Particle Swarm Optimization, Ant Colonies Optimization, Classification

INTRODUCTION

Data mining has been defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. It uses machine learning and visualization techniques to discover and present knowledge in a form that is easily comprehensible to humans.

The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records, unusual records, and dependencies. Of the several data mining tasks, including regression, clustering, and dependence modeling, classification is the most studied and the most popular. The main objective of classification is to build a model that predicts the class of an unseen data instance from its predictor attributes. Rule discovery is an important data mining task because it generates a set of symbolic rules that describe each class or category in a natural way. The human mind understands rules better than any other data mining model. However, these rules need to be simple and comprehensive; otherwise, a human won't be able to comprehend them. Evolutionary algorithms have been widely used for rule discovery, a well-known approach being learning classifier systems.

1.1 EXISTING SYSTEM

Classification is a data mining technique that assigns items to predefined categories, classes, or labels. Its aim is to predict the target class for the input data. Biology-inspired algorithms such as Genetic Algorithms (GA) and swarm-based approaches such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), currently the most prominent choices in the area of swarm intelligence, have been used to solve many data mining problems. In this paper, binary classification is the problem area, and a modified Ant-Miner is used to solve it. The basic Ant-Miner algorithm has been modified with a different classification accuracy function.

1.2 PROPOSED SYSTEM

In many real-world problems, classification is one of the important decision-making techniques. A classification task arises when a tuple or sample needs to be assigned to one of a predefined set of classes based on a set of attributes.
Many real-world problems can be categorized as classification problems, such as weather forecasting, credit risk evaluation, medical diagnosis, and bankruptcy prediction. In a binary class problem, a tuple is assigned to one of two classes, like a Yes or No decision. In a multi-class problem, a tuple is assigned to one of more than two classes, such as Class A, Class B, and Class C. The proposed system uses a different quality function that is mathematically simpler (it needs fewer multiplication and division operations), is suitable for binary classification, and produces good results. The function is:

$$Q = \frac{TP + TN}{TP + TN + FP + FN}$$

All the parameters passed to the quality function have the same meaning as in basic Ant-Miner.

1.3 ADVANTAGES

• Positive feedback accounts for rapid discovery of good solutions.
• Distributed computation avoids premature convergence.
• The greedy heuristic helps find acceptable solutions in the early stages of the search process.
• Ant-Miner discovered rule lists much simpler (i.e., smaller) than the rule sets discovered by C4.5 and the rule lists discovered by CN2.

2. LITERATURE SURVEY

In the last several decades, the volume of data has grown rapidly every day. The contributing factors include the widespread use of barcodes for most commercial products and the computerization of many business, scientific, and government transactions. In addition, the popular use of the World Wide Web as a global information system has flooded us with a tremendous amount of data. Most of this data is unstructured and complex, though available in digital form. Analyzing unstructured data is difficult and inefficient until it is converted into structured data.
Data mining is the process of discovering new patterns in large data sets using methods from artificial intelligence, machine learning, statistics, and database systems. It is a well-suited way to obtain structured data patterns, and it extracts knowledge from the dataset in a human-understandable structure. The main methods in the data mining process are association, classification, and clustering.

2.1 CLASSIFICATION

Classification is performed on the basis of the learnt classification model and consists of assigning a class label to test samples.

Properties of classification

With classification rules, the groups (or classes) are specified beforehand, and each training data instance belongs to a particular class.
1. This kind of labeled data comes from the training data.
2. This type of learning is called supervised learning.
3. Problems of this type fall under classification.

Classification versus Clustering

Classification is supervised learning, whereas clustering is unsupervised learning. In classification, we have a set of predefined classes and want to know which class a new object belongs to, whereas clustering tries to group a set of objects and find whether there is some relationship between them.

2.2 Performance Evaluation of Classification Methods

Classification methods are usually compared on the basis of the following criteria.

2.2.1 Predictive Accuracy

This is the ability of the classification model [6] to correctly classify unseen data. After a classification model has been built from training data, its accuracy is measured on test samples whose correct class labels are known but not shown to the model. Predictive accuracy is the number of correctly classified test samples divided by the total number of test samples. For example, if we have twenty test samples and the model correctly classifies eighteen of them, the accuracy of the model is 90%.
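The predictive accuracy computation described above can be written out directly. The following is a minimal Python sketch; the function name `predictive_accuracy` and the label lists are hypothetical, constructed only to reproduce the eighteen-out-of-twenty example.

```python
def predictive_accuracy(true_labels, predicted_labels):
    """Fraction of test samples whose predicted class matches the known class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one test sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Twenty hypothetical test samples (ten per class); two are misclassified.
true_labels = [1] * 10 + [0] * 10
predicted_labels = [1] * 9 + [0] + [0] * 9 + [1]

accuracy = predictive_accuracy(true_labels, predicted_labels)
print(f"accuracy = {accuracy:.0%}")  # 18 of 20 correct
```

The known labels are used only for scoring and are never passed to the model, which matches the requirement that test labels are "known but not shown".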
2.2.2 Robustness

This is the ability of the classification model to perform well on data with noisy or missing values.

2.2.3 Speed

This is the computational cost of generating the model. The cost is measured as the running time of the algorithm, expressed as the number of steps or operations the algorithm requires, so it is independent of the operating system and the machine used.

© 2016 IJETM. All Rights Reserved.

2.2.4 Scalability

This is the ability to construct the model efficiently even for a large amount of high-dimensional data. When the size of the input increases, the algorithm should be able to construct the classification model as efficiently as for a small input.

2.2.5 Interpretability

This is the ease with which the user can comprehend and understand the model.

2.3 Types of Classifiers

A large number of classification methods are available. They can be divided into two major groups: comprehensible classifiers and statistical (or mathematical) classifiers.

2.3.1 Comprehensible Classifiers

Comprehensible classifiers [1] are usually rule-based classifiers. They are easy to understand and interpret and are interesting for the users (or at least the domain experts). They contrast with mathematical classifiers, which are difficult to understand. The major benefit of these classifiers is that comprehensibility leads the user to trust the decisions obtained from them. Some commonly used rule induction algorithms are:

• C4.5 Decision Tree
• CN2

3. IMPLEMENTATION ISSUES

3.1 Classification analysis

Classification [6] is one of the most frequently occurring tasks of human decision making. A classification problem encompasses the assignment of an object to a predefined class according to its characteristics.
Many decision problems in a variety of domains, such as engineering, medical sciences, human sciences, and management science, can be considered classification problems. Popular examples are speech recognition, character recognition, medical diagnosis, bankruptcy prediction, and credit scoring. Over the years, a myriad of classification techniques have been proposed, such as linear and logistic regression, decision trees and rules, k-nearest neighbor classifiers, neural networks, and support vector machines (SVMs). Various benchmarking studies indicate the success of the latter two nonlinear classification techniques, but their strength is also their main weakness: since the classifiers generated by neural networks and SVMs are described as complex mathematical functions, they are rather incomprehensible and opaque to humans. This opacity prevents them from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. For example, in credit scoring, since the models concern key decisions of a financial institution, they need to be validated by a financial regulator. Transparency and comprehensibility are, therefore, of crucial importance. Similarly, classification models provided to physicians for medical diagnosis need to be validated, demanding the same clarity as any domain that requires regulatory validation.

Classification is a two-step process.

Model construction: This is the first step. It describes a set of predetermined classes. Each tuple or sample is assumed to belong to a predefined class, as determined by the class label attribute. The set of tuples used for model construction is called the training set. The model is represented as classification rules, decision trees, or mathematical formulae.

Model usage: This is the second step. The model is used to classify future or unknown objects.
This step also estimates the accuracy of the model: the known label of each test sample is compared with the model's classification result. The test set is independent of the training set.

Many algorithms are used for classification in data mining. Some of them are:
1. Rule based classifier
2. Decision tree induction
3. Nearest neighbor classifier
4. Bayesian classifier
5. Artificial neural network
6. Support vector machine
7. Ensemble classifier
8. Regression trees

The Data

The data used in this investigation is the diabetes [7] data. It has a total dimension of 768 rows and 9 columns (eight attributes plus the class variable, as listed in the dataset description table). For training and testing, 75% of the overall data is used for training, and the rest is used for testing the accuracy of the selected classification methods.

3.2 OVERVIEW OF DIABETES

3.2.1 Diabetes

Diabetes [7] is a disease that occurs when insulin production in the body is inadequate or the body is unable to use the insulin it produces properly, which leads to high blood glucose. The body breaks food down into glucose, which needs to be transported to all the cells of the body. Insulin is the hormone that directs this glucose into the body's cells. Any change in the production of insulin leads to an increase in blood sugar levels, which can damage tissues and cause organ failure. Generally, a person is considered to be suffering from diabetes when blood sugar levels are above normal (4.4 to 6.1 mmol/L) [5]. A diabetic patient essentially has low insulin production, or their body is not able to use insulin well.
There are three main types of diabetes: Type 1, Type 2, and Gestational.

Type 1: The disease manifests as an autoimmune disease occurring at a very young age, below 20 years. In this type of diabetes, the pancreatic cells that produce insulin have been destroyed.

Type 2: The various organs of the body become insulin resistant, and this increases the demand for insulin. At this point, the pancreas doesn't make the required amount of insulin.

Gestational: This type tends to occur in pregnant women, as the pancreas doesn't make a sufficient amount of insulin.

All these types of diabetes need treatment, and if they are detected at an early stage, the complications associated with them can be avoided. Nowadays, hospitals collect a large amount of information in the form of patient records. Knowledge discovery for predictive purposes is done through data mining, an analysis technique that helps in proposing inferences.

3.2.2 Types of Diabetes

The three main types of diabetes are described below:

1. Type 1: Although only about 10% of diabetes [7] patients have this form of diabetes, there has recently been a rise in the number of cases of this type in the United States. The disease manifests as an autoimmune disease occurring at a very young age, below 20 years, and is hence also called juvenile-onset diabetes. In this type of diabetes, the pancreatic cells that produce insulin have been destroyed by the defense system of the body. Patients suffering from Type 1 diabetes have to follow a regimen of insulin injections, frequent blood tests, and dietary restrictions.

2. Type 2: This type accounts for almost 90% of diabetes cases and is commonly called adult-onset diabetes or non-insulin-dependent diabetes. In this case, the various organs of the body become insulin resistant, and this increases the demand for insulin.
At this point, the pancreas doesn't make the required amount of insulin. To keep this type of diabetes at bay, patients have to follow a strict diet and exercise routine and keep track of their blood glucose. Obesity, being overweight, and being physically inactive can lead to Type 2 diabetes. The risk of developing diabetes is also considered to increase with ageing. The majority of Type 2 diabetes patients have borderline diabetes, or pre-diabetes, a condition where the blood glucose levels are higher than normal but not as high as in a diabetic patient.

3. Gestational diabetes: This is a type of diabetes that tends to occur in pregnant women due to high sugar levels, as the pancreas doesn't produce a sufficient amount of insulin.

Table 4.2.2.1: Dataset [5] Description

Dataset | No. of attributes | No. of instances
Pima Indians Diabetes Database of National Institute of Diabetes and Digestive and Kidney Diseases | 8 | 768

Table 4.2.2.2: The attributes description [7]

Attribute | Description | Values
Preg | No. of times pregnant |
Plas | Plasma glucose concentration |
Pres | Diastolic blood pressure |
Skin | Triceps skin fold thickness |
Insulin | 2 hour insulin levels |
Mass | Body mass index |
Pedi | Pedigree function |
Age | Age |
Class | Class variable | 1 or 2

3.3 MODULES

3.3.1 Data Pre-processing

Data pre-processing [4] is an important step in the data mining process. Data-gathering methods are often loosely controlled, resulting in out-of-range values (e.g., a negative Income), impossible data combinations (e.g., Sex: Male, Pregnant: Yes), missing values, etc. Analyzing data that has not been carefully screened for such problems can produce misleading results. If much irrelevant and redundant information is present, or the data is noisy and unreliable, then knowledge discovery during the training phase is more difficult. Data preparation and filtering steps can take a considerable amount of processing time.
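The screening problems named above (out-of-range values, impossible combinations, missing values) can be caught with a few explicit checks before mining. The following is a minimal Python sketch under stated assumptions: the field names `income`, `sex`, and `pregnant` and the sample records are hypothetical, taken from the illustrative examples in the text rather than from the diabetes dataset, and `None` is assumed to mark a missing value.

```python
def screen_record(record):
    """Return a list of problems found in one record (empty if it looks clean)."""
    problems = []

    # Missing values: any field whose value is None.
    for field, value in record.items():
        if value is None:
            problems.append(f"missing value: {field}")

    # Out-of-range value: an income cannot be negative.
    income = record.get("income")
    if income is not None and income < 0:
        problems.append("out-of-range value: income is negative")

    # Impossible combination of two individually valid values.
    if record.get("sex") == "Male" and record.get("pregnant") == "Yes":
        problems.append("impossible combination: sex=Male, pregnant=Yes")

    return problems


# Hypothetical records: one clean, three with a single problem each.
records = [
    {"income": 52000, "sex": "Female", "pregnant": "No"},
    {"income": -100, "sex": "Female", "pregnant": "No"},
    {"income": 41000, "sex": "Male", "pregnant": "Yes"},
    {"income": None, "sex": "Male", "pregnant": "No"},
]

for index, record in enumerate(records):
    print(index, screen_record(record))
```

Reporting problems rather than silently dropping records leaves the choice between deletion and imputation to a later step.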
Data pre-processing includes cleaning, normalization, transformation, feature extraction and selection, etc. Some attributes may not be required in the analysis and can be removed from the dataset beforehand. For example, the instance number attribute of the iris dataset is not required in analysis. This attribute can be removed by selecting it in the Attributes check box and clicking Remove. The resulting dataset can then be stored in ARFF file format.

Missing values

Missing data might occur because the value is not relevant to a particular case, could not be recorded when the data was collected, or was withheld by users because of privacy concerns. Missing values make it difficult to extract useful information from the data set [2]. Missing data are absent data items that hide information that may be important [1]. Most real-world databases are characterized by an unavoidable problem of incompleteness, in terms of missing or erroneous values.

Types of missing data

There are different types of missing values:

MCAR: The term "Missing Completely at Random" refers to data where the missingness mechanism does not depend on the variable of interest or on any other variable observed in the dataset.

MAR: Sometimes data might not be missing at random but may be termed "Missing at Random". We can consider an entry Xi as missing at random if the data meets the requirement that missingness does not depend on the value of Xi after controlling for another variable.

NMAR: If the data is not missing at random, or is informatively missing, it is termed "Not Missing at Random". Such a situation occurs when the missingness mechanism depends on the actual value of the missing data.

Missing data imputation techniques

Listwise deletion: This method omits the cases (instances) with missing data and runs the analysis on the remaining cases.
Though it is the most common method, it has two obvious disadvantages:
a) A substantial decrease in the size of the data set available for analysis.
b) Data are not always missing completely at random.

Mean/Mode Imputation (MMI): Replace a missing value with the mean (numeric attribute) or mode (nominal attribute) of all observed cases. To reduce the influence of exceptional data, the median can also be used. This is one of the most commonly used methods.

Replacing the missing values

Weka has a filter called "ReplaceMissingValues" that replaces all missing values in a dataset using the mean of each attribute. One alternative is to replace the missing values of an attribute using the mean of the values that belong to a certain class. For example, in a binary dataset, a missing attribute value in a record of the positive class would be replaced by the mean calculated from only the positive-class records. However, if the missing values of class A are replaced by the mean calculated from the training instances of class A alone, the dataset becomes biased. To avoid this bias (which will eventually overfit the trained model), it is wise to use the default "ReplaceMissingValues" function, i.e., to consider the mean and mode of all training instances rather than of just that particular class.

3.3.2 Applying ACO model on dataset

ANT COLONY SYSTEM (ACS)

Ant Colony Optimization (ACO) [4] is a branch of a newly developed form of artificial intelligence called swarm intelligence. Swarm intelligence is a field which studies "the emergent collective intelligence of groups of simple agents". In groups of insects that live in colonies, such as ants and bees, an individual can only do simple tasks on its own, while the colony's cooperative work is the main reason for the intelligent behavior it shows. Most real ants are blind. However, each ant, while walking, deposits a chemical substance on the ground called pheromone. Pheromone encourages the following ants to stay close to previous moves. The pheromone evaporates over time to allow search exploration.

A number of experiments presented by Dorigo and Maniezzo illustrate the complex behavior of ant colonies. For example, a set of ants built a path to some food. An obstacle with two ends was then placed in their way such that one end of the obstacle was more distant than the other. In the beginning, equal numbers of ants spread around the two ends of the obstacle. Since all ants have almost the same speed, the ants going around the nearer end of the obstacle return before the ants going around the farther end (differential path effect). With time, the amount of pheromone the ants deposit increases more rapidly on the shorter path, and so more ants prefer this path. This positive effect is called autocatalysis. The difference between the two paths is called the preferential path effect; it is the result of the differential deposition of pheromone between the two sides of the obstacle, since the ants following the shorter path will make more visits to the source than those following the longer path. Because of pheromone evaporation, pheromone on the longer path vanishes with time.

ACO Based Classification Rule Discovery: Ant Miner Algorithm

A few authors have applied ACO to the discovery of classification rules. The first ACO-based algorithm for classification rule discovery, called Ant-Miner [3], was proposed by Parpinelli et al. An ant constructs a rule. It starts with an empty rule and incrementally constructs it by adding one term at a time.
The selection of a term to be added is probabilistic and based on two factors: a heuristic quality of the term and the amount of pheromone deposited on it by the previous ants. The authors use information gain as the heuristic value of a term. Rule construction continues until one of two situations occurs. The first is that adding any remaining term would make the rule cover fewer cases than a user-specified threshold called Min_cases_per_rule (the minimum number of cases covered by the rule). The second is that no more attributes can be inserted in the rule because all attributes have already been used by the ant. When one of these two stopping conditions is met, the ant's tour is considered complete (the rule's antecedent part is complete). The consequent of the rule is assigned by taking a majority vote among the training samples covered by the rule. The constructed rule is then pruned to remove irrelevant terms and to improve its accuracy. The quality of the constructed rule is determined, and pheromone values on the trail followed by the ant are updated in proportion to the quality of the rule. After this, a new ant starts with the updated pheromone values to guide its search. When all ants have constructed their rules, the best rule among them is selected and added to a discovered rule list. The training samples correctly classified by that rule are deleted from the training set. This process continues until the number of uncovered samples is less than a threshold specified by the user. The final product is an ordered discovered rule list that is used to classify the test data.

The goal of Ant-Miner is to extract classification rules from data. The algorithm is presented below:

Training set = all training cases;
WHILE (No. of cases in the Training set > max_uncovered_cases)
    i = 0;
    REPEAT
        i = i + 1;
        Ant_i incrementally constructs a classification rule;
        Prune the just constructed rule;
        Update the pheromone of the trail followed by Ant_i;
    UNTIL (i ≥ No_of_Ants)
    Select the best rule among all constructed rules;
    Remove the cases correctly covered by the selected rule from the training set;
END WHILE

Pheromone Initialization

All cells in the pheromone table are initialized equally to the following value:

$$\tau_{ij}(t=0) = \frac{1}{\sum_{i=1}^{a} b_i}$$

where $a$ is the total number of attributes and $b_i$ is the number of values in the domain of attribute $i$.

Rule Construction

Each rule in Ant-Miner contains a condition part as the antecedent and a predicted class. The condition part is a conjunction of attribute-operator-value tuples. The operator used in all experiments is "=" since in Ant-Miner2, just as in Ant-Miner, all attributes are assumed to be categorical. Let us assume a rule condition such as $term_{ij} \approx A_i = V_{ij}$, where $A_i$ is the $i$-th attribute and $V_{ij}$ is the $j$-th value in the domain of $A_i$. The probability that this condition is added to the current partial rule that the ant is constructing is given by the following equation:

$$P_{ij}(t) = \frac{\tau_{ij}(t)\,\eta_{ij}}{\sum_{k \in I} \sum_{l=1}^{b_k} \tau_{kl}(t)\,\eta_{kl}}$$

where $\eta_{ij}$ is a problem-dependent heuristic value for $term_{ij}$, $\tau_{ij}(t)$ is the amount of pheromone currently available (at time $t$) on the connection between attribute $i$ and value $j$, and $I$ is the set of attributes that are not yet used by the ant.

Heuristic Value

In traditional ACO, a heuristic value is usually used in conjunction with the pheromone value to decide on the transitions to be made. In Ant-Miner, the heuristic value is taken to be an information-theoretic measure of the quality of the term to be added to the rule.
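The pheromone initialization and the term-selection rule described under Rule Construction above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names, the two-attribute domain, and the heuristic values are hypothetical, and the initial pheromone is assumed to be 1 divided by the total number of attribute values, as in the original Ant-Miner.

```python
import random


def init_pheromone(domain_sizes):
    """Give every term (attribute i, value j) the same pheromone, 1 / sum(b_i)."""
    total_values = sum(domain_sizes)
    return [[1.0 / total_values] * b for b in domain_sizes]


def term_probabilities(pheromone, heuristic, unused_attributes):
    """P_ij = tau_ij * eta_ij, normalised over all terms of unused attributes."""
    weights = {
        (i, j): pheromone[i][j] * heuristic[i][j]
        for i in unused_attributes
        for j in range(len(pheromone[i]))
    }
    normaliser = sum(weights.values())
    if normaliser <= 0:
        raise ValueError("no selectable term has positive weight")
    return {term: weight / normaliser for term, weight in weights.items()}


def choose_term(probabilities, rng):
    """Draw one (attribute, value) term according to its probability."""
    terms = list(probabilities)
    weights = [probabilities[term] for term in terms]
    return rng.choices(terms, weights=weights, k=1)[0]


# Hypothetical problem: attribute 0 has 2 values, attribute 1 has 3 values.
domain_sizes = [2, 3]
pheromone = init_pheromone(domain_sizes)

# Hypothetical heuristic values eta_ij (not computed from real data).
heuristic = [[0.1, 0.3], [0.2, 0.2, 0.2]]

probs = term_probabilities(pheromone, heuristic, unused_attributes=[0, 1])
for term, p in sorted(probs.items()):
    print(term, round(p, 3))

rng = random.Random(0)  # seeded so repeated runs draw the same term
print("chosen term:", choose_term(probs, rng))
```

Because all pheromone cells start equal, the first ant's choice is driven entirely by the heuristic values; pheromone only begins to bias the selection after the first update.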
The quality here is measured in terms of the entropy for preferring this term to the others. In the standard Ant-Miner formulation, it is given by the following equations:

$$InfoT_{ij} = -\sum_{w=1}^{k} \frac{freqT_{ij}^{w}}{|T_{ij}|} \log_2 \frac{freqT_{ij}^{w}}{|T_{ij}|}$$

$$\eta_{ij} = \frac{\log_2(k) - InfoT_{ij}}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} \left(\log_2(k) - InfoT_{ij}\right)}$$

where $k$ is the number of classes, $T_{ij}$ is the partition of training cases in which attribute $A_i$ has value $V_{ij}$, $|T_{ij}|$ is the number of cases in that partition, and $freqT_{ij}^{w}$ is the number of cases in $T_{ij}$ that have class $w$.

Rule Pruning

Immediately after the ant completes the construction of a rule, rule pruning is undertaken to increase the comprehensibility and accuracy of the rule. After the pruning step, the rule may be assigned a different predicted class based on the majority class in the cases covered by the rule antecedent. The rule pruning procedure iteratively removes the term whose removal will cause a maximum increase in the quality of the rule. The quality of a rule is measured using the following equation:

$$Q = \frac{TruePos}{TruePos + FalseNeg} \times \frac{TrueNeg}{FalsePos + TrueNeg}$$

DESCRIPTION OF ANT-MINER [3]:

The pseudo code of Ant-Miner is given, at a very high level of abstraction, in the algorithm above. Ant-Miner starts by initializing the training set to the set of all training cases and the discovered rule list to an empty list. Then it performs an outer loop in which each iteration discovers a classification rule. The first step of this outer loop is to initialize all trails with the same amount of pheromone, which means that all terms have the same probability of being chosen by an ant to incrementally construct a rule. Rule construction is done by an inner loop consisting of three steps.

First, an ant starts with an empty rule and incrementally constructs a classification rule by adding one term at a time to the current rule. In this step, a $term_{ij}$, representing a triple <Attribute_i = Value_j>, is chosen to be added to the current rule with probability proportional to the product $\eta_{ij} \times \tau_{ij}(t)$, where $\eta_{ij}$ is the value of a problem-dependent heuristic function for $term_{ij}$ and $\tau_{ij}(t)$ is the amount of pheromone associated with $term_{ij}$ at iteration (time index) $t$. More precisely, $\eta_{ij}$ is essentially the information gain associated with $term_{ij}$. The higher the value of $\eta_{ij}$, the more relevant $term_{ij}$ is for classification, and so the higher its probability of being chosen. $\tau_{ij}(t)$ corresponds to the amount of pheromone currently available in position $i,j$ of the trail being followed by the current ant. The better the quality of the rule constructed by an ant, the higher the amount of pheromone added to the trail positions ("terms") visited ("used") by the ant.

The second step of the inner loop consists of pruning the just-constructed rule, that is, removing irrelevant terms: terms that do not improve the predictive accuracy of the rule. In essence, a term is removed from a rule if this operation does not decrease the quality of the rule. This pruning process helps to avoid overfitting the discovered rule to the training set.

The third step of the inner loop consists of updating the pheromone of all trails by increasing the pheromone in the trail followed by the ant in proportion to the quality of the rule. The quality (Q) of a rule is measured by the equation:

$$Q = Sensitivity \times Specificity$$

where Sensitivity = TP / (TP + FN) and Specificity = TN / (TN + FP). The meaning of the acronyms TP, FN, TN, and FP is as follows:

• True Pos (TP) = number of true positives, that is, the number of cases covered by the rule that have the class predicted by the rule;
• False Neg (FN) = number of false negatives, that is, the number of cases that are not covered by the rule but that have the class predicted by the rule;
• True Neg (TN) = number of true negatives, that is, the number of cases that are not covered by the rule and that do not have the class predicted by the rule;
• False Pos (FP) = number of false positives, that is, the number of cases covered by the rule that have a class different from the class predicted by the rule.

4. RESULTS AND DISCUSSIONS

4.1 DATA SET ANALYSIS

The data sets for performing classification were taken from the Pima Indians Diabetes Database of the National Institute of Diabetes and Digestive and Kidney Diseases, and ant colony optimization was applied to them.

4.2 Diabetes dataset

The diabetes data set consists of 9 numerical attributes and 768 instances.
Using this data, we clean the data. After applying preprocessing

4.3 Data fields of Diabetes Dataset

4.4 OUTPUT SCREENS

4.4.1 Overview Screen

5. CONCLUSIONS AND FUTURE ENHANCEMENTS

Conclusion

In this paper, we have discussed the use of ACO for classification. By providing an appropriate environment, the ants choose their paths and implicitly construct a rule. One of the strengths of ant-based approaches is that the results are comprehensible, as they are in a rule-based format. Such rule lists provide insight into the decision making, which is a key requirement in domains such as credit scoring and medical diagnosis. The proposed Ant Miner technique can handle both binary and multiclass classification problems and generates rule lists consisting of propositional and interval rules. Another advantage of ACO that comes out more clearly in our approach is the possibility of handling distributed environments. Since the Ant Miner construction graph is defined as a sequence of vertex groups (the order of which is of no relevance), we are able to mine distributed databases.

Future Enhancements

An issue faced by any rule-based classifier is that, although the classifier is comprehensible, it is not necessarily in line with existing domain knowledge [7]. It may well occur that data instances that are very evident for the domain expert to classify do not appear frequently enough in the data to be appropriately modeled by the rule induction technique. Hence, to be sure that the rules are intuitive and logical, expert knowledge needs to be incorporated.
In an example of such an unintuitive rule list generated by Ant Miner, the underlined term is contradictory to medical knowledge, which suggests that increasing tumor sizes result in a higher probability of recurrence. As shown in Fig. 10, such domain knowledge can be included in Ant Miner by changing the environment. Since the ants extract rules for the recurrence class only, we can remove the second vertex group, corresponding to the upper bound on the variable. Doing so ensures that the rules comply with the constraint required for the tumor size variable. Applying such constraints on relevant datasets to obtain accurate, comprehensible, and intuitive rule lists is an interesting topic for future research.

6. References:

1. D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen, "Ant-based approach to the knowledge fusion problem," in Ant Colony Optimization and Swarm Intelligence (ANTS 2006), LNCS 4150, pp. 84-95, Springer, 2006.
2. Abraham, A., Grosan, C., Ramos, V.: "Swarm Intelligence in Data Mining". Studies in Computational Intelligence, vol. 34, (2006).
3. Smaldon, J. & Freitas, A. A. (2006). A new version of the Ant-Miner algorithm discovering unordered rule sets. Proc. Genetic and Evolutionary Computation Conf. (GECCO-2006), San Francisco, CA: Morgan Kaufmann.
4. Dorigo, M. & Maniezzo, V. (1996). The ant system: optimization by a colony of cooperating [...] Optimization, D. Corne, M. Dorigo, and F. Glover, Eds. London, U.K.: McGraw-Hill, 1999, pp. 11–32.
5. https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes
6. B. Liu, H. A. Abbass, and B. McKay, "Classification rule discovery with ant colony optimization," IEEE Computational Intelligence Bulletin, Vol. 3, No. 1, Feb. 2004.
7. R. S. Parpinelli, H. S. Lopes, and A. A.
Freitas, "An ant colony based system for data mining: Applications to medical data," in Proc. Genetic and Evol. Comput. Conf., 2001, pp. 791–797.
8. Parpinelli, R. S., Lopes, H. S., & Freitas, A. A. (2002). An Ant Colony Algorithm for Classification Rule Discovery. In H. Abbass, R. Sarker, and C. Newton (Eds.), Data Mining: A Heuristic Approach. Idea Group Publishing.

Text Books:

1. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, published by Pearson Education.
2. Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Morgan Kaufmann.