Genetic Algorithm Based Feature Selection For Medical Diagnosing Using Artificial Neural Network



Under The Esteemed Guidance Of


Mrs. P. Aruna Kumari
Assistant Professor
Department of CSE, JNTUK-UCEV
Project Team
R Priyanka 13VV1A0523
K Ravi Kumar 13VV1A0539
A Jayadhar 13VV1A0528
K.L.M Kumar 14VV5A0563
Problem Statement
Soft computing approaches have emerged as effective medical diagnostic support systems.
Medical datasets are often characterized by a large number of disease measurements and a relatively
small number of patient records. Not all of these measurements (features) are important; some are
irrelevant or noisy.

A set of features that is representative of all variations of the disease is needed.

Feature Selection (FS) addresses this by finding a subset of prominent features, which improves
predictive accuracy and removes redundant features.

The goal of this project is to investigate a fuzzy-logic-based genetic algorithm that generates a
reduced number of features while improving diagnostic performance.

We investigate two prominent diseases: diabetes and thyroid disease.

The reduced feature subset is used to predict the disease with an Artificial Neural Network.
Existing System
Generally, doctors consider all the symptoms in order to diagnose the presence of disease even
though some of the symptoms are not necessary to diagnose the disease.

Because of these unnecessary symptoms, it usually takes more time to decide whether the disease is present.
Proposed System
Redundant or unnecessary symptoms make the diagnosing system take more time to decide
whether the disease is present.

By applying optimization techniques, the unnecessary symptoms can be eliminated so that only the
symptoms necessary for diagnosing the disease are considered.

Applying the optimization techniques to the dataset produces the optimum feature set,
from which the decision model can be built.
Architectural View
Methodology
Data Pre-processing
a. Data Cleaning
b. Data Discretization
Feature Selection
a. Fuzzification
b. Genetic Algorithm
Classification
a. Artificial Neural Networks
PRE-PROCESSING
Why do we need Pre-Processing?
Not all features are important; some are irrelevant or noisy.

Such features can be especially harmful when the training set is relatively small.

They increase redundancy.

Irrelevance and redundancy become harder to evaluate.

An excessive number of features increases memory usage.

They decrease accuracy.
How is Pre-Processing done?
The purpose of data pre-processing is to extract useful data from the raw datasets and convert it
into the format required for the prediction of the disease.
The data is cleaned for missing values, analyzed and transformed for further steps.

Pre-Processing Steps:

1. Data Cleaning (Filling Missing Values) using Attribute Mean

2. Data Transformation using Normalization

3. Data Discretization using Equal Areas Method


Data Cleaning (Filling Missing values)

Missing values are filled with the Attribute Mean, computed from all the other attribute values
in the same column of the dataset.
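The attribute-mean rule above can be sketched in a few lines. This is a minimal illustration, not the project's code: the function name `fill_missing_with_mean`, the "?" marker as the only missing-value symbol, and the three sample rows are all assumptions made for the example.

```python
def fill_missing_with_mean(rows, missing="?"):
    """Replace each missing entry with the mean of the known values in its column.

    Assumes every column has at least one known value.
    """
    filled = [list(row) for row in rows]
    for col in range(len(rows[0])):
        known = [float(row[col]) for row in rows if row[col] != missing]
        mean = sum(known) / len(known)
        for row in filled:
            row[col] = mean if row[col] == missing else float(row[col])
    return filled


# Illustrative records only (not taken from the diabetes or thyroid datasets).
sample = [["6", "148", "72"], ["1", "?", "66"], ["8", "183", "?"]]
cleaned = fill_missing_with_mean(sample)
print(cleaned[1][1], cleaned[2][2])  # 165.5 69.0
```

The mean is taken over the known values only, so a missing cell never contributes to its own replacement.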

Dataset before Handling Missing values

The missing values are represented by "?" in the data set.

Dataset after Handling Missing values

Data Discretization
The attributes consist of continuous values, which are converted into discrete values by applying the
Equal Areas method of discretization to the attributes in the dataset.

Discretizing the data using the Equal Areas method:

0.00 to 0.33 = 1

0.33 to 0.66 = 2

0.66 to 0.99 = 3
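A minimal sketch of how a continuous attribute could be mapped to the three labels above. It assumes the attribute is first min-max normalized to the 0 to 1 range (the normalization step listed under pre-processing) and that a value lying exactly on a cut point falls into the lower bin. The function names and the sample readings are hypothetical, and the cut points are simply the ones shown on the slide rather than computed from the data.

```python
def min_max_normalize(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(x, cuts=(0.33, 0.66)):
    """Map a normalized value to the discrete label 1, 2 or 3."""
    label = 1
    for cut in cuts:
        if x > cut:
            label += 1
    return label


# Illustrative readings only.
readings = [85.0, 120.0, 148.0, 183.0]
normalized = min_max_normalize(readings)
bins = [discretize(v) for v in normalized]
print(bins)  # [1, 2, 2, 3]
```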
FEATURE SELECTION
What is Feature Selection?
Feature selection is widely used in machine learning, especially when the learning task involves
high-dimensional datasets.
The intention of feature selection is to reduce the complexity and improve the quality of a dataset by
selecting prominent features.
Optimization methods are used in feature selection to find the most significant subset of
features in the data set while maintaining an accuracy rate comparable to that of the original set of
features.
Our feature selection comprises
1. Fuzzification followed by
2. Genetic Algorithm.
Fuzzification is needed to deal with uncertainty.
The Genetic Algorithm is used for selecting optimum feature subsets.
Why Fuzzification?

Real-world data may not be binary-valued (0 or 1).

A computer cannot deal with uncertainty directly.
Fuzzy logic helps to deal with uncertainty, providing more accuracy.
The crisp data is converted into fuzzy variables, which represent it more accurately.
This is called fuzzification.
Uses:
To handle uncertainty and imprecision.
To describe grades of truth.
It is useful in complex non-linear applications.
Definition:
Fuzzification is the process of transforming crisp values into grades of membership for
linguistic terms of fuzzy sets.
The membership function is used to associate a grade with each linguistic term.
Algorithm for Fuzzification
Define linguistic variables and terms.
Construct membership functions for them.
Convert crisp data into fuzzy data sets using membership functions.

Step 1: Define linguistic variables and terms


Linguistic variables are input and output variables in the form of simple words or sentences. For
room temperature, cold, warm, hot, etc., are linguistic terms.
Temperature t = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term and it can cover some portion of overall temperature
values.

Step 2: Construct membership functions for them


A membership function (MF) is a curve that defines how each point in the input space is mapped to a
membership value (or degree of membership) between 0 and 1.
Algorithm for Fuzzification (Contd.)
The membership functions of the temperature variable are as shown

There are different types of membership functions.

For continuously changing values, two membership functions are commonly used:
1. Triangular
2. Trapezoidal
We are using the triangular membership function, which indicates 3 possible ranges.
Algorithm for Fuzzification (Contd.)

Step 4: Obtain fuzzy value

The result of the triangular membership function is the fuzzy value of a variable.
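A triangular membership function can be sketched as follows. The three linguistic terms ("low", "medium", "high") and their breakpoints are assumptions chosen to give three overlapping ranges on a normalized attribute; the deck does not state the actual terms or breakpoints used in the project.

```python
def triangular(x, a, b, c):
    """Membership grade of x in a triangle with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for an attribute normalized to [0, 1].
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(x):
    """Convert a crisp value into a grade of membership for every term."""
    return {name: triangular(x, *abc) for name, abc in TERMS.items()}


grades = fuzzify(0.25)
print(grades)  # {'low': 0.5, 'medium': 0.5, 'high': 0.0}
```

A crisp value of 0.25 belongs equally to "low" and "medium" and not at all to "high", which is the kind of graded answer a crisp 0/1 encoding cannot express.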
Fuzzification Example
Sample Input :
Sample Output :
Genetic Algorithm
Biological Background

Each cell of a living organism contains chromosomes - strings of DNA

Each chromosome contains a set of genes - blocks of DNA
A collection of genes - genotype
Reproduction involves recombination of genes from parents
Fitness Function
The fitness of an organism is how much it can reproduce before it dies.
A particular solution may be ranked against all the other solutions
It measures the quality of the represented solution
It is always problem dependent
The fitness of a solution measures how good a result it gives
For instance,

Fitness Function = Accuracy + (No. of Absents * Balancing factor)
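One possible reading of this formula is sketched below. It assumes that a chromosome is a 0/1 mask over the features, that "No. of Absents" counts the features left out of the subset (the 0 bits), and that the accuracy has already been measured by a classifier on the selected features. The balancing factor of 0.01 is a hypothetical value, not one reported in the deck.

```python
def fitness(accuracy, mask, balancing_factor=0.01):
    """Fitness = accuracy + (number of absent features * balancing factor).

    accuracy: classifier accuracy in [0, 1] measured on the selected features.
    mask: list of 0/1 flags, where 0 means the feature is left out.
    """
    absent = mask.count(0)
    return accuracy + absent * balancing_factor


all_features = fitness(0.80, [1, 1, 1, 1, 1, 1, 1, 1])
half_features = fitness(0.80, [1, 0, 1, 0, 0, 1, 1, 0])
print(all_features, half_features)
```

With equal accuracy, the subset that drops four features scores higher, so the search is nudged toward smaller subsets without giving up accuracy.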


Basic Operators of Genetic Algorithm
Reproduction
It is usually the first operator applied to the population.
Chromosomes are selected from the population of parents to cross over and produce offspring.
It is based on Darwin's evolution theory of survival of the fittest.
Therefore, this operator is also known as the Selection Operator.
Cross Over Operator

After the reproduction phase, the population is enriched with better individuals.

Reproduction makes clones of good strings but does not create new ones.
The crossover operator is applied to the mating pool in the hope that it will create better strings.
Mutation Operator

After crossover, the strings are subjected to mutation.

Mutation of a bit involves flipping it, changing 0 to 1 and vice versa.
Algorithm
Step 1: Represent the problem variable domain as a chromosome of a fixed length, choose the size
of a chromosome population N, the crossover probability pc and the mutation probability pm.
Step 2: Define a fitness function to measure the performance, or fitness, of an individual chromosome
in the problem domain. The fitness function establishes the basis for selecting chromosomes that will
be mated during reproduction.
Step 3: Randomly generate an initial population of chromosomes of size N:
x1, x2, . . . , xN
Step 4: Calculate the fitness of each individual chromosome:
f(x1), f(x2), . . . , f(xN)
Algorithm (Contd.)
Step 5: Select a pair of chromosomes for mating from the current population. Parent chromosomes
are selected with a probability related to their fitness.
Step 6: Create a pair of offspring chromosomes by applying the genetic operators crossover and
mutation.
Step 7: Place the created offspring chromosomes in the new population.
Step 8: Repeat Step 5 until the size of the new chromosome population becomes equal to the size of
the initial population, N.
Step 9: Replace the initial (parent) chromosome population with the new (offspring) population.
Step 10: Go to Step 4, and repeat the process until the termination criterion is satisfied.
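The ten steps can be put together in a compact sketch. Several choices here are assumptions rather than anything stated in the deck: the termination criterion is a fixed number of generations, fitness values are assumed to be non-negative, the best chromosome seen so far is remembered, and the demo fitness (the count of 1 bits) is a toy stand-in for the classifier-based fitness.

```python
import random


def run_ga(fitness, length, n=6, pc=0.7, pm=0.01, generations=50, rng=None):
    """Bit-string GA following Steps 1-10; returns the best chromosome seen."""
    rng = rng or random.Random(0)
    # Step 3: random initial population of n chromosomes.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    best = max(population, key=fitness)
    for _ in range(generations):                      # Step 10: fixed budget
        # Step 4: fitness of every chromosome (tiny offset avoids all-zero weights).
        weights = [fitness(c) + 1e-9 for c in population]
        offspring = []
        while len(offspring) < n:                     # Step 8
            # Step 5: fitness-proportionate choice of two parents.
            p1, p2 = rng.choices(population, weights=weights, k=2)
            c1, c2 = p1[:], p2[:]
            if rng.random() < pc:                     # Step 6: crossover
                point = rng.randint(1, length - 1)
                c1, c2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
            for child in (c1, c2):
                for i in range(length):               # Step 6: mutation
                    if rng.random() < pm:
                        child[i] ^= 1
                offspring.append(child)               # Step 7
        population = offspring[:n]                    # Step 9
        best = max(population + [best], key=fitness)
    return best


def one_max(chromosome):
    """Toy fitness: the number of 1 bits."""
    return sum(chromosome)


best = run_ga(one_max, length=8)
print(best, one_max(best))
```

For feature selection, `one_max` would be replaced by a function that trains or evaluates the classifier on the features whose bits are 1.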
Working Principle of Genetic Algorithm
Genetic Algorithm: Case Study
A simple example will help us to understand how a GA works. Let us find the maximum value
of the function (15x - x^2), where the parameter x varies between 0 and 15. For simplicity, we may
assume that x takes only integer values. Thus, chromosomes can be built with only four
genes:

Integer  Binary code    Integer  Binary code    Integer  Binary code
1        0001           6        0110           11       1011
2        0010           7        0111           12       1100
3        0011           8        1000           13       1101
4        0100           9        1001           14       1110
5        0101           10       1010           15       1111
The fitness function and chromosome locations
Chromosome  Chromosome  Decoded  Chromosome  Fitness
label       string      integer  fitness     ratio, %
X1          1 1 0 0     12       36          16.5
X2          0 1 0 0     4        44          20.2
X3          0 0 0 1     1        14          6.4
X4          1 1 1 0     14       14          6.4
X5          0 1 1 1     7        56          25.7
X6          1 0 0 1     9        54          24.8
[Figure: f(x) plotted against x for 0 <= x <= 15, with f(x) ranging from 0 to 60. (a) Chromosome initial locations. (b) Chromosome final locations.]
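The fitness and fitness-ratio columns of the table can be recomputed directly from the chromosome strings. The helper names `decode` and `f` are chosen for this sketch only.

```python
def decode(bits):
    """Interpret a list of 0/1 genes as an unsigned binary integer."""
    return int("".join(str(b) for b in bits), 2)


def f(x):
    """The function being maximized in the case study."""
    return 15 * x - x * x


population = {
    "X1": [1, 1, 0, 0],
    "X2": [0, 1, 0, 0],
    "X3": [0, 0, 0, 1],
    "X4": [1, 1, 1, 0],
    "X5": [0, 1, 1, 1],
    "X6": [1, 0, 0, 1],
}

fitnesses = {name: f(decode(bits)) for name, bits in population.items()}
total = sum(fitnesses.values())
ratios = {name: round(100 * fit / total, 1) for name, fit in fitnesses.items()}
print(total)   # 218
print(ratios)  # {'X1': 16.5, 'X2': 20.2, 'X3': 6.4, 'X4': 6.4, 'X5': 25.7, 'X6': 24.8}
```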
Roulette wheel selection
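One of the most commonly used chromosome selection techniques is
roulette wheel selection.

[Figure: roulette wheel. Each slice is proportional to the chromosome's fitness ratio; the ranges give the slice boundaries in % of the wheel.]
X1: 16.5%  (0 to 16.5)
X2: 20.2%  (16.5 to 36.7)
X3: 6.4%   (36.7 to 43.1)
X4: 6.4%   (43.1 to 49.5)
X5: 25.7%  (49.5 to 75.2)
X6: 24.8%  (75.2 to 100)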
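A spin of the wheel can be sketched as a lookup in the cumulative fitness totals. The function name `spin` and the seeded random generator are assumptions for the example; the fitness values are the six from the case-study table.

```python
import random
from bisect import bisect_right
from itertools import accumulate

NAMES = ["X1", "X2", "X3", "X4", "X5", "X6"]
FITNESS = [36, 44, 14, 14, 56, 54]


def spin(pointer, fitness=FITNESS):
    """Return the index of the slice hit by a pointer position in [0, 1)."""
    bounds = list(accumulate(fitness))  # cumulative slice ends: 36, 80, 94, ...
    return bisect_right(bounds, pointer * bounds[-1])


print(NAMES[spin(0.10)])  # X1, because 10% lies in the 0 to 16.5 slice
print(NAMES[spin(0.40)])  # X3, because 40% lies in the 36.7 to 43.1 slice

# Six spins fill a mating pool of the same size as the population.
rng = random.Random(0)
mating_pool = [NAMES[spin(rng.random())] for _ in range(6)]
print(mating_pool)
```

Fitter chromosomes own wider slices, so they are more likely to be picked, and may be picked more than once.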
Crossover operator
In our example, we have an initial population of 6 chromosomes. Thus, to establish the same
population in the next generation, the roulette wheel would be spun six times.

Once a pair of parent chromosomes is selected, the crossover operator is applied.

First, the crossover operator randomly chooses a crossover point where two parent chromosomes
break, and then exchanges the chromosome parts after that point. As a result, two new offspring
are created.

If a pair of chromosomes does not cross over, then chromosome cloning takes place, and the
offspring are created as exact copies of each parent.
[Figure: crossover of the selected parent pairs.]
X6i  1 0 0 1   and   X2i  0 1 0 0
X1i  1 1 0 0   and   X5i  0 1 1 1
X2i  0 1 0 0   and   X5i  0 1 1 1
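Single-point crossover on two of the parent pairs can be sketched as below. The crossover points (after the second gene for X6 and X2, after the first gene for X1 and X5) are assumptions: the deck says the point is chosen at random, and these particular points are picked here only because they give offspring strings that appear in Generation (i + 1).

```python
def crossover(parent_a, parent_b, point):
    """Swap the chromosome parts after the crossover point."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


x6, x2 = [1, 0, 0, 1], [0, 1, 0, 0]
x1, x5 = [1, 1, 0, 0], [0, 1, 1, 1]

print(crossover(x6, x2, point=2))  # ([1, 0, 0, 0], [0, 1, 0, 1])
print(crossover(x1, x5, point=1))  # ([1, 1, 1, 1], [0, 1, 0, 0])
```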
Mutation operator
Mutation represents a change in the gene.

Mutation is a background operator. Its role is to provide a guarantee that the search algorithm
is not trapped on a local optimum.

The mutation operator flips a randomly selected gene in a chromosome.

The mutation probability is quite small in nature, and is kept low for GAs, typically in the
range between 0.001 and 0.01.
[Figure: mutation of the offspring chromosomes.]
X6'i  1 0 0 0
X2'i  0 1 0 1
X1'i  1 1 1 1   mutates to   X1"i  1 0 1 1
X5'i  0 1 0 0
X2i   0 1 0 0   mutates to   X2"i  0 1 1 0
X5i   0 1 1 1
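Bit-flip mutation can be sketched in two forms: `flip` reproduces a specific flip deterministically, and `mutate` applies the probabilistic version with mutation probability pm. Both function names are hypothetical, and the flipped positions are inferred by comparing the strings before and after mutation.

```python
import random


def flip(chromosome, index):
    """Return a copy of the chromosome with the gene at index inverted."""
    mutated = chromosome[:]
    mutated[index] ^= 1
    return mutated


def mutate(chromosome, pm, rng):
    """Flip each gene independently with probability pm."""
    return [bit ^ 1 if rng.random() < pm else bit for bit in chromosome]


print(flip([1, 1, 1, 1], 1))  # [1, 0, 1, 1]
print(flip([0, 1, 0, 0], 2))  # [0, 1, 1, 0]

rng = random.Random(0)
print(mutate([1, 0, 0, 0], pm=0.01, rng=rng))
```

With pm in the 0.001 to 0.01 range, most chromosomes pass through `mutate` unchanged, which is why mutation acts as a background operator.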
Generation i
X1i    1 1 0 0    f = 36
X2i    0 1 0 0    f = 44
X3i    0 0 0 1    f = 14
X4i    1 1 1 0    f = 14
X5i    0 1 1 1    f = 56
X6i    1 0 0 1    f = 54
[Figure: Crossover and Mutation panels, repeating the diagrams from the Crossover operator and Mutation operator slides.]
Generation (i + 1)
X1i+1  1 0 0 0    f = 56
X2i+1  0 1 0 1    f = 50
X3i+1  1 0 1 1    f = 44
X4i+1  0 1 0 0    f = 44
X5i+1  0 1 1 0    f = 54
X6i+1  0 1 1 1    f = 56
CLASSIFICATION
Classification Techniques
There are different types of classification techniques, such as artificial neural networks, decision trees,
the nearest neighbor method, rule induction, and data visualization.

Every model has some limitations.

E.g.: Decision trees tend to give results based on local optima, which affects the final
output.

Artificial neural networks have limitations too, i.e., time complexity.

For that reason, we decrease the number of features that affect the disease by using fuzzy feature
selection and genetic algorithms.
Why Artificial neural networks?
In medical diagnosis we do not want results based on local optima, as such values may affect the final
output and degrade the system performance.

ADVANTAGES:
Character recognition.
Image compression.
Travelling salesman problem
Artificial neural networks
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the
way biological nervous systems, such as the brain, process information.

Neural networks can be used to extract patterns and detect trends that are too complex to be
noticed by either humans or other computer techniques.

A trained neural network can be thought of as an "expert" in the category of information it has been
given to analyse. This expert can then be used to provide projections given new situations of
interest and answer "what if" questions.

Other advantages include: adaptive learning, self-organization, real-time operation, and fault
tolerance via redundant information coding.
Proposed Process in artificial neural networks with Backpropagation
Step 1: Take as input the already selected features from the defuzzified data.
Step 2: Apply the decision patterns in the hidden layer.
Step 3: Obtain the output and use the cost function to find the difference between it and the desired output.
Step 4: If the difference is greater than the threshold value, change the weights and bias
values again and iterate until the difference is less than the threshold value.
Step 5: If the difference is less than the threshold value, obtain the output and give the result.
Step 6: After that, we use backpropagation to increase the efficiency of the system.
Working of Basic Neural Network
An artificial neuron is a device with many inputs and one output.
The neuron has two modes of operation:
the training mode and the using mode.
In the using mode, firing rules determine whether the neuron fires or not.
E.g.: Hamming distance technique.
Working of Basic Neural Network (Contd.)
An artificial neural network consists of three groups of units:
1. Input layer 2. Hidden layer 3. Output layer

The behavior of an ANN (Artificial Neural Network) depends on both the weights and the
input-output function.
The input-output function falls into one of three categories:
linear units, threshold units, sigmoid units.
Sample working of Neural Network By Realizing the AND gate from XNOR gate
Non-linear classification: XOR/XNOR.
Here we take the output function to be $H_\Theta(x)$.
The sample neural network will be

Here we add a bias unit in order to increase flexibility; the bias value depends on the
selected output.
Sample working of Neural Network By Realizing the AND gate from XNOR gate (continued)
Sometimes it's convenient to add the weights into the diagram. These values are in fact just
the $\Theta$ parameters, so
$\Theta^{(1)}_{10} = -30$
$\Theta^{(1)}_{11} = 20$
$\Theta^{(1)}_{12} = 20$
to use our original notation.
x1  x2  $H_\Theta(x)$
0   0   G(-30)
0   1   G(-10)
1   0   G(-10)
1   1   G(10)

The value of G(t) can be found using the sigmoid function, i.e., $G(t) = 1 / (1 + e^{-t})$.
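The truth table can be checked numerically with a single sigmoid unit using the bias weight -30 and the two input weights 20. The function names `g` and `h` are chosen for this sketch, and rounding the output to the nearest integer is an added convenience for reading the result as a gate output.

```python
import math


def g(t):
    """Sigmoid function 1 / (1 + e^-t)."""
    return 1.0 / (1.0 + math.exp(-t))


def h(x1, x2, theta=(-30.0, 20.0, 20.0)):
    """Output of one sigmoid unit: g(bias + w1 * x1 + w2 * x2)."""
    return g(theta[0] + theta[1] * x1 + theta[2] * x2)


table = {(x1, x2): round(h(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)}
print(table)  # {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
```

G(-30) and G(-10) are very close to 0 and G(10) is very close to 1, so the unit outputs 1 only when both inputs are 1, which is the AND gate.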
Artificial neural networks (conclusion)

Backpropagation is a common method of training artificial neural networks and is used
in conjunction with an optimization method.
When an input vector is presented to the network, it is propagated forward through the
network, layer by layer, until it reaches the output layer.
The output of the network is then compared to the desired output, using a loss
function, and an error value is calculated for each of the neurons in the output layer.
Backpropagation is used to increase the efficiency of the neural networks.
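The forward pass, the loss comparison and the backward error propagation can be sketched on a very small network. Everything specific here is an assumption for illustration: a 2-2-1 layout with sigmoid units, a squared-error loss, plain gradient descent as the optimization method, a learning rate of 0.5, and the AND gate as training data. It is not the network used in the project.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def forward(x, w_hidden, w_out):
    """Propagate x layer by layer; every weight list starts with a bias."""
    hidden = [sigmoid(w[0] + w[1] * x[0] + w[2] * x[1]) for w in w_hidden]
    output = sigmoid(w_out[0] + sum(w * h for w, h in zip(w_out[1:], hidden)))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data) / len(data)


def train(data, epochs=5000, rate=0.5, seed=0):
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    w_out = [rng.uniform(-1, 1) for _ in range(3)]
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Error term at the output neuron (squared error, sigmoid unit).
            delta_out = (out - y) * out * (1 - out)
            # Error terms propagated back to the two hidden neurons.
            delta_hidden = [delta_out * w_out[j + 1] * hidden[j] * (1 - hidden[j])
                            for j in range(2)]
            # Gradient-descent weight updates.
            w_out[0] -= rate * delta_out
            for j in range(2):
                w_out[j + 1] -= rate * delta_out * hidden[j]
                w_hidden[j][0] -= rate * delta_hidden[j]
                w_hidden[j][1] -= rate * delta_hidden[j] * x[0]
                w_hidden[j][2] -= rate * delta_hidden[j] * x[1]
    return w_hidden, w_out


AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
loss_before = mean_squared_error(AND_DATA, *train(AND_DATA, epochs=0))
loss_after = mean_squared_error(AND_DATA, *train(AND_DATA))
print(loss_before, loss_after)
```

Calling `train` with `epochs=0` returns the untrained weights for the same seed, so the two printed losses show the effect of training alone.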
ANALYSIS
Time taken to completely process 8 attributes in a neural network.
ANALYSIS (contd.)
Time taken to process 5 attributes in a neural network.
OUTPUTS
OUTPUTS
Conclusion
a. This work presented a feature selection method based on fuzzification followed by a genetic
algorithm.
b. The results show that a reduced number of features can achieve classification accuracy superior to
that of the full set of features.
c. We considered the Diabetes and Thyroid datasets, applied fuzzification and then the genetic
algorithm to obtain a feature subset of optimal attributes, and then used an artificial neural
network to reach a decision about the disease.
Thank you
