Classifier Based Text Mining For Radial Basis Function
Abstract: - Text mining, also called text data mining or knowledge discovery in text (KDT), applies knowledge
discovery techniques to unstructured text. In neural networks that address classification problems, the training set,
the testing set and the learning rate are key elements: respectively, the collection of input/output patterns used to
train the network, the collection used to assess network performance, and the rate of adjustment. This paper describes a
proposed radial basis function (RBF) neural network classifier that performs cross validation on the original RBF
neural network in order to optimize classification accuracy and training time. The feasibility and the benefits of the
proposed approach are demonstrated by means of two data sets, mushroom and weather symbolic. For mushroom (the
larger data set), the accuracy of the proposed RBF neural network was on average about 1.4 % lower than that of the
original RBF neural network, and the improvement in speed was larger. For weather symbolic (the smaller data set), the
accuracy of the proposed RBF neural network was on average about 35.7 % lower than that of the original RBF neural
network, and the improvement in speed was smaller. The algorithm is independent of specific data sets, so many of its
ideas and solutions can be transferred to other classifier paradigms.
Keywords: - Radial Basis Function, Classification accuracy, Text mining, Time complexity.
1. Introduction
ISSN: 1790-5109
Page 476
ISBN: 978-960-6766-41-1
7th WSEAS Int. Conf. on ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING and DATA BASES (AIKED'08),
University of Cambridge, UK, Feb 20-22, 2008
1.3 Applications
Neural networks are now used in a very large number
of applications. Application areas include system
identification and control (vehicle control, process
control), game playing and decision making
(backgammon, chess, racing), pattern recognition (radar
systems, face identification, object recognition and
more), sequence recognition (gesture, speech and
handwritten text recognition), medical diagnosis,
financial applications, data mining (or knowledge
discovery in databases, "KDD"), visualization and
email spam filtering.
4.1 Overview
From an algorithmic perspective, optimization seeks the
least value of a quantity to be minimized. It can be used
to solve a wide range of tasks, including optimizing the
most important parameters of a neural network.
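As one concrete instance of optimizing network parameters, the following minimal sketch fits the output weights of a small Gaussian RBF network by solving a linear system, which drives the squared training error to its least value (zero) on the training points. It is an illustration under stated assumptions, not the ORBF or PRBF classifier evaluated in this paper: the toy XOR data, the choice of one center per training point, the width value and all function names are hypothetical.

```python
import math

def gaussian(x, center, width):
    """Gaussian radial basis function of the distance between x and center."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def train_rbf(X, y, width):
    """Assumption: every training point is a center (exact interpolation)."""
    Phi = [[gaussian(x, c, width) for c in X] for x in X]
    return solve(Phi, y)

def rbf_output(x, centers, weights, width):
    """Network output: weighted sum of the hidden-layer activations."""
    return sum(w * gaussian(x, c, width) for w, c in zip(weights, centers))

# Hypothetical toy data (XOR), with targets -1 / +1 for class 0 / class 1.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1.0, 1.0, 1.0, -1.0]
WIDTH = 0.5  # assumed basis function width

weights = train_rbf(X, y, WIDTH)
predicted = [1 if rbf_output(x, X, weights, WIDTH) > 0 else 0 for x in X]
print(predicted)
```

In practice the number of centers and the width are themselves parameters to be optimized; using every training point as a center is only reasonable for very small data sets.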
4.2 Standard Methods of the Cross Validation
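As a minimal sketch of how k-fold cross validation partitions a data set, the code below splits sample indices into k folds and uses each fold once as the test set while the remaining folds form the training set. The function name `k_fold_indices`, the fixed seed and the round-robin fold assignment are assumptions for illustration, not the procedure used by the authors; the example size of 14 matches the number of instances in the weather symbolic data set.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed gives reproducible folds
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

# 10-fold split of a 14-instance data set.
splits = list(k_fold_indices(14, k=10))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

The reported accuracy of a cross-validated classifier is then typically the average of the accuracies measured on the k test folds.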
5 Experimental results
In this section we demonstrate the properties and
advantages of our approach by means of two data sets,
mushroom and weather symbolic.
[Figure: Training Time of PRBF and ORBF for the Mushroom and weather.symbolic data sets]

Table 1
Properties of data sets
Dataset             Instances   Attributes
Mushroom            8124        23
Weather.symbolic    14

Table 2
Training Time (seconds)
Dataset             Original Radial Basis   Proposed Radial Basis   Faster by
                    Function (ORBF)         Function (PRBF)
Mushroom            217.23                  246.61                  29.38
Weather.symbolic    0.05                    0.06                    0.01

The performance of classification algorithms is usually
examined by evaluating the accuracy of the
classification. However, since classification is often a
fuzzy problem, the correct answer may depend on the
user. Traditional algorithm evaluation approaches, such
as determining the space and time overhead, can be
used, but these approaches are usually secondary.
Classification accuracy [11] is usually calculated by
determining the percentage of tuples placed in the
correct class. This ignores the fact that there may also
be a cost associated with an assignment to the wrong
class, which perhaps should also be determined.
We examine the performance of classification much as
is done with information retrieval systems. With only
two classes, there are four possible outcomes of the
classification. The upper left and lower right quadrants
are correct actions. The remaining two quadrants are
incorrect actions.
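The four outcomes and the accuracy measure described above can be sketched as follows. This is a minimal illustration, assuming a layout in which true positives occupy the upper left quadrant and true negatives the lower right; the labels, the example predictions and the misclassification costs are hypothetical and are not taken from the experiments.

```python
def confusion_counts(actual, predicted, positive):
    """Count the four outcomes of a two-class classification."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # correct action (upper left quadrant)
        elif p == positive:
            fp += 1   # incorrect action: predicted positive, actually negative
        elif a == positive:
            fn += 1   # incorrect action: predicted negative, actually positive
        else:
            tn += 1   # correct action (lower right quadrant)
    return tp, fp, fn, tn

def accuracy_percent(tp, fp, fn, tn):
    """Percentage of tuples placed in the correct class."""
    return 100.0 * (tp + tn) / (tp + fp + fn + tn)

def total_cost(fp, fn, cost_fp=1.0, cost_fn=1.0):
    """Assumed cost model: each kind of error carries its own fixed cost."""
    return fp * cost_fp + fn * cost_fn

# Hypothetical labels for seven tuples.
actual    = ["yes", "yes", "no", "no",  "yes", "no", "yes"]
predicted = ["yes", "no",  "no", "yes", "yes", "no", "yes"]

tp, fp, fn, tn = confusion_counts(actual, predicted, positive="yes")
acc = accuracy_percent(tp, fp, fn, tn)
cost = total_cost(fp, fn, cost_fp=1.0, cost_fn=5.0)
print(tp, fp, fn, tn, round(acc, 2), cost)
```

Two classifiers with the same accuracy can differ in cost when the two kinds of error are weighted unequally, which is why accuracy alone ignores the cost of a wrong assignment.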
Classification Accuracy
Dataset             % Correct using 10-fold    % Correct class   Difference
                    cross validation (PRBF)    (ORBF)
Mushroom            65.5465                    66.9498           1.4033 %
Weather.symbolic    64.2857                    100               35.7143 %

[Figure: Classification Accuracy of PRBF and ORBF]

6 Conclusions
In this work we developed a text mining classifier
using neural network methods and measured the
training time for two data sets, mushroom and weather
symbolic. First, we applied our text mining algorithms,
including text mining techniques based on classification
of data, to the two data collections. After that, we
employed an existing neural network to measure the
training time for the two data sets. Experimental results
Acknowledgement
The authors gratefully acknowledge the authorities of
Annamalai University for the facilities offered and the
encouragement to carry out this work. This work is
supported in part by the Career Award for Young
Teachers (CAYT) grant awarded to the first author by
the All India Council for Technical Education, New
Delhi. The authors would also like to thank the
reviewers for their valuable remarks.
References:
[1] Guobin Ou, Yi Lu Murphey, Multi-class
pattern classification using neural
networks, Pattern Recognition 40 (2007).
[2] M. Govindarajan, RM. Chandrasekaran,
Classifier Based Text Mining for Neural
Network, Proceedings of the XII International
Conference on Computer, Electrical and
System Science and Engineering, May 24-26,
Vienna, Austria, waste.org, 2007, pp. 200-205.
[3] Oliver Buchtala, Markus Klimek, Bernhard
Sick, Evolutionary Optimization of Radial
Basis Function Classifier for Data Mining
Applications, IEEE Transactions on Systems,
Man, and Cybernetics, Vol. 35, No. 5,
October 2005.
[4] Jiawei Han, Micheline Kamber, Data
Mining: Concepts and Techniques,
Elsevier, 2003, pp. 303-311, 322-325.
[5] Srinivas Mukkamala, Guadalupe Janoski,
Andrew Sung, Intrusion Detection: Support
Vector Machines and Neural Networks,
Department of Computer Science, New Mexico
Institute of Mining and Technology, Socorro,
New Mexico 87801, IEEE, 2002.
[6] N. Jovanovic, V. Milutinovic, and Z.