Feed Forward Neural Networks: Prof. Adel Abdennour


Prof. Adel Abdennour
Feed Forward Neural Networks
Overview
Feed Forward Neural Networks
Training Feed Forward Neural Nets by the Error Back Propagation Method
Example
Feed Forward Neural Networks
Learning in a multilayer network proceeds the same way as for a perceptron.
A training set of input patterns is presented to the network.
The network computes its output pattern, and if there is an error (a difference between the actual and desired output patterns), the weights are adjusted to reduce this error.
The classical learning algorithm for FFNNs is based on the gradient descent rule and the generalized delta rule.
Feed Forward Neural Networks
The activation functions used in FFNNs are continuous and differentiable everywhere, so the network output is a differentiable function of the weights.
FFNNs with a typical activation function (the sigmoid function) are capable of approximating any continuous function to any desired degree of accuracy.

[Figure: plots of the sigmoid and linear activation functions]

Feed Forward Neural Networks
The mathematical form of the sigmoid function is:

    f(x) = 1 / (1 + e^(-x))
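As a quick illustration (not part of the original slides), a minimal Python sketch of the sigmoid, assuming NumPy is available:

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: maps any real input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))   # 0.5
print(sigmoid(1.3))   # ~0.79, the value used in the worked example later
```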
Feed Forward Neural Networks
Gradient Descent Learning Rule:
Requires the definition of an error (or objective) function.
The sum of squared errors is usually used:

    E = Σ_p (t_p - o_p)^2, summed over the P_T patterns in the training set

where t_p and o_p are the target and actual output for the p-th pattern, P_T is the total number of input-target vector pairs in the training set, and E is the total error.
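A small Python sketch (illustrative, not from the slides) of the sum-of-squared-errors computation, with hypothetical target and output values:

```python
import numpy as np

def sum_squared_error(targets, outputs):
    # E = sum over the P_T training patterns of (t_p - o_p)^2
    t = np.asarray(targets, dtype=float)
    o = np.asarray(outputs, dtype=float)
    return np.sum((t - o) ** 2)

# Hypothetical targets and actual outputs for a four-pattern training set
print(sum_squared_error([0, 1, 1, 1], [0.03, 0.99, 0.98, 0.99]))
```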
Feed Forward Neural Networks
Gradient Descent Learning Rule:
Minimize the error by moving the weights along the decreasing slope (negative gradient) of the error surface.
Iterate through the training set and adjust the weights in the direction that reduces the error.
Feed Forward Neural Networks
Given a single training pattern, the weights are updated using the delta rule:

    Δw = -η · ∂E_p/∂w

where η is the learning rate and E_p is the error for the given pattern.
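An illustrative sketch of this update for a single linear output unit, assuming the per-pattern error E_p = 0.5·(t_p - o_p)^2 so the update reduces to w ← w + η·(t_p - o_p)·x; this is a simplification, not the full multilayer rule:

```python
import numpy as np

def delta_rule_update(w, x, t, eta=0.5):
    # For a linear unit o = w·x and E_p = 0.5*(t - o)^2,
    # dE_p/dw = -(t - o)*x, so  w <- w - eta*dE_p/dw = w + eta*(t - o)*x
    o = np.dot(w, x)
    return w + eta * (t - o) * x

w = np.array([0.1, -0.2])
print(delta_rule_update(w, x=np.array([1.0, 0.5]), t=1.0))
```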
Feed Forward Neural Networks
Generalized Delta Learning Rule:
Assumes an activation function that is at least once differentiable.
Here it is assumed that the sigmoid function is used.
Taking the derivative of the sigmoid function with respect to net_p, we get:

    f'(net_p) = f(net_p) · (1 - f(net_p)) = o_p · (1 - o_p)
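A short sketch that checks this identity numerically; the test point x = 0.7 is arbitrary:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Numerical check of the analytic derivative at an arbitrary point
x, h = 0.7, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(sigmoid_derivative(x), numeric)   # both ~0.2217
```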
Feed Forward Neural Networks
Example:

[Figure: a small network with input x1 = 0.1; the hidden and output activations are computed with the sigmoid]

    1 / (1 + e^(-1.3)) = 0.79
    1 / (1 + e^(-1.6)) = 0.83
Feed Forward Neural Networks
Error Back Propagation Method
Different methods are available to train a neural network, but the back-propagation method is the most widely used.
It has been used since the 1980s to adjust the weights.
Basic Technique:
Calculate the error by taking the difference between the calculated result and the desired result.
The error is fed back through the network and the weights are adjusted to minimize the error.
Back-propagation training algorithm illustrated:
Back propagation adjusts the weights of the NN in order to minimize the network's total mean squared error.
Forward step: network activation and error computation.
Backward step: error propagation.

[Figure: a network with a hidden layer and an output layer, showing the forward and backward passes]

Feed Forward Neural Networks
Error Back Propagation Method
Symbols used for the layers:

[Figure: notation used for the input, hidden, and output layers]
Feed Forward Neural Networks
Error Back Propagation Method
First Step:
Calculate the weighted sum of the signals arriving at each neuron in the output layer:

    N_k = Σ_j W_jk · O_j

Then apply the sigmoid function to the net input of each output neuron:

    O_k = f(N_k) = 1 / (1 + e^(-N_k))
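A minimal sketch of this forward step for a hypothetical 2-2-1 network (the same shape as the worked example later in these slides); the weight values here are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_layer(O_prev, W):
    # N_k = sum_j W_jk * O_j for every neuron k in the layer, then O_k = f(N_k)
    return sigmoid(O_prev @ W)

x = np.array([0.0, 1.0])
W_hidden = np.array([[1.0, 0.0],
                     [0.0, 1.0]])   # W_ij: input i -> hidden j
W_output = np.array([[1.0],
                     [1.0]])        # W_jk: hidden j -> output k
h = forward_layer(x, W_hidden)
o = forward_layer(h, W_output)
print(h, o)
```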
Feed Forward Neural Networks
Error Back Propagation Method
Second Step:
Determine the amount of error at each output neuron:

    δ_k = (t_k - O_k) · f'(N_k)

Since f is the sigmoid, the equation can be simplified as:

    δ_k = (t_k - O_k) · O_k · (1 - O_k)

Calculate the corrected weights from the following expression (η is the learning rate):

    W_jk ← W_jk + η · δ_k · O_j
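A sketch of the output-layer correction, using the values that appear in the worked example later in these slides (η = 0.5); the function name is illustrative:

```python
import numpy as np

def output_layer_update(W_jk, O_hidden, O_k, t_k, eta=0.5):
    # delta_k = (t_k - O_k) * O_k * (1 - O_k)
    delta_k = (t_k - O_k) * O_k * (1.0 - O_k)
    # W_jk <- W_jk + eta * delta_k * O_j
    W_new = W_jk + eta * delta_k * O_hidden
    return W_new, delta_k

W_out = np.array([1.0, 1.0])      # W10, W20
h = np.array([0.5, 0.5])          # hidden-layer outputs
W_new, delta = output_layer_update(W_out, h, O_k=0.73106, t_k=0.0)
print(delta)   # ~ -0.14373
print(W_new)   # ~ [0.9641, 0.9641]
```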
Feed Forward Neural Networks
Error Back Propagation Method
Second Step (continued):
Determine the amount of error in the hidden layer by using the updated weights W_jk:

    δ_j = O_j · (1 - O_j) · Σ_k δ_k · W_jk

Now the new weights between the input and the hidden layer can be calculated as:

    W_ij ← W_ij + η · δ_j · O_i

Apply these steps to every input pattern and iterate several times until you reach the lowest possible error.
Then the network is ready to use.
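A sketch of the hidden-layer correction for a single-output network; it follows the slides' convention of propagating the error through the already-corrected weights W_jk, and the values are the ones from the worked example:

```python
import numpy as np

def hidden_layer_update(W_ij, x, O_hidden, W_jk_new, delta_k, eta=0.5):
    # delta_j = O_j * (1 - O_j) * delta_k * W_jk  (one output neuron)
    delta_j = O_hidden * (1.0 - O_hidden) * (W_jk_new * delta_k)
    # W_ij <- W_ij + eta * delta_j * x_i
    W_new = W_ij + eta * np.outer(x, delta_j)
    return W_new, delta_j

W_in = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
W_new, delta_h = hidden_layer_update(W_in,
                                     x=np.array([0.0, 0.0]),
                                     O_hidden=np.array([0.5, 0.5]),
                                     W_jk_new=np.array([0.9641, 0.9641]),
                                     delta_k=-0.14373)
print(delta_h)   # ~ [-0.0346, -0.0346]
print(W_new)     # unchanged, because both inputs are zero
```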
Feed Forward Neural Networks
Error Back Propagation Method
Let us understand this process through a simplified example.
We shall select a learning rate η = 0.5 to simplify the operations.

[Figure: the network to be trained — two inputs, two hidden neurons, one output]
Feed Forward Neural Networks
Error Back Propagation Method
Inputs and Outputs are:

    x1  x2  Target (t)
    0   0   0
    0   1   1
    1   0   1
    1   1   1

We shall assume random weights initially and start using the first row of the table of inputs and outputs:

    x1  x2  t   W11  W12  W21  W22  W10  W20
    0   0   0   1    0    0    1    1    1
Feed Forward Neural Networks
Error Back Propagation Method
We shall use the following notation for the network:

    h_i1 = input to the first neuron of the hidden layer
    h_i2 = input to the second neuron of the hidden layer
    h_o1 = output of neuron 1 of the hidden layer
    h_o2 = output of neuron 2 of the hidden layer
    N    = input to the neuron of the output layer
    O    = output of the network
Feed Forward Neural Networks
Error Back Propagation Method
Thus we get the following values:

    h_i1 = W11·x1 + W21·x2 = (1)(0) + (0)(0) = 0
    h_i2 = W12·x1 + W22·x2 = (0)(0) + (1)(0) = 0
Feed Forward Neural Networks
Error Back Propagation Method
    h_o1 = 1 / (1 + e^(-h_i1)) = 1 / (1 + e^0) = 0.5
    h_o2 = 1 / (1 + e^(-h_i2)) = 1 / (1 + e^0) = 0.5
Feed Forward Neural Networks
Error Back Propagation Method
Now the total signal going to the output layer can be calculated as:

    N = W10·h_o1 + W20·h_o2 = (1)(0.5) + (1)(0.5) = 1

And the output of the network would be:

    O = 1 / (1 + e^(-N)) = 1 / (1 + e^(-1)) = 0.73106
Feed Forward Neural Networks
Error Back Propagation Method
Now determining the error:

    δ_O = (t - O) · O · (1 - O) = (0 - 0.73106)(0.73106)(1 - 0.73106) = -0.14373

Now calculating the corrected weights between the hidden and the output layer:

    W10 ← W10 + η·δ_O·h_o1 = 1 + (0.5)(-0.14373)(0.5) = 0.9641
    W20 ← W20 + η·δ_O·h_o2 = 1 + (0.5)(-0.14373)(0.5) = 0.9641
Feed Forward Neural Networks
Error Back Propagation Method
Now we continue with the same approach towards the input layer:

    δ_h1 = h_o1·(1 - h_o1)·W10·δ_O = (0.5)(1 - 0.5)(0.9641)(-0.14373) = -0.0346
    δ_h2 = h_o2·(1 - h_o2)·W20·δ_O = (0.5)(1 - 0.5)(0.9641)(-0.14373) = -0.0346
Feed Forward Neural Networks
Error Back Propagation Method
Now calculating the corrected weights between the input and the hidden layer:

    W11 ← W11 + η·δ_h1·x1 = 1 + (0.5)(-0.0346)(0) = 1
    W12 ← W12 + η·δ_h2·x1 = 0 + (0.5)(-0.0346)(0) = 0
    W21 ← W21 + η·δ_h1·x2 = 0 + (0.5)(-0.0346)(0) = 0
    W22 ← W22 + η·δ_h2·x2 = 1 + (0.5)(-0.0346)(0) = 1

We note here that these weights have not changed, which is expected because both inputs are zero. The situation will change with the other input patterns.
Feed Forward Neural Networks
Error Back Propagation Method
So, the result of the first pass of the training process is:

    x1  x2  t   W11  W12  W21  W22  W10     W20
    0   0   0   1    0    0    1    0.9641  0.9641

Now we will take the second row of the data and re-train the network in the same way, following the same steps.
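The whole first pass can be reproduced with a short Python sketch (illustrative, not from the slides; it follows the slides' convention of using the already-corrected output weights when computing the hidden-layer deltas):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_pattern(x1, x2, t, W11, W12, W21, W22, W10, W20, eta=0.5):
    # Forward pass through the 2-2-1 network
    h_o1 = sigmoid(W11 * x1 + W21 * x2)
    h_o2 = sigmoid(W12 * x1 + W22 * x2)
    O = sigmoid(W10 * h_o1 + W20 * h_o2)
    # Output-layer error and corrected hidden-to-output weights
    delta_O = (t - O) * O * (1 - O)
    W10_new = W10 + eta * delta_O * h_o1
    W20_new = W20 + eta * delta_O * h_o2
    # Hidden-layer errors (using the corrected weights) and input-to-hidden updates
    delta_h1 = h_o1 * (1 - h_o1) * W10_new * delta_O
    delta_h2 = h_o2 * (1 - h_o2) * W20_new * delta_O
    W11 += eta * delta_h1 * x1
    W12 += eta * delta_h2 * x1
    W21 += eta * delta_h1 * x2
    W22 += eta * delta_h2 * x2
    return W11, W12, W21, W22, W10_new, W20_new

# First pass with the first row (x1=0, x2=0, t=0) and the initial weights:
print(train_pattern(0, 0, 0, 1, 0, 0, 1, 1, 1))
# -> (1, 0, 0, 1, ~0.9641, ~0.9641), matching the table above
```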
Feed Forward Neural Networks
Error Back Propagation Method
Using the values and weights acquired in the previous pass as a result of training:

    x1  x2  t   W11  W12  W21       W22     W10      W20
    0   1   1   1    0    0.005115  1.0043  0.97054  0.97544
Feed Forward Neural Networks
Error Back Propagation Method
The training process requires the same steps to be performed many times to reach the lowest value of the error.
The following table shows the weights after a thousand iterations:

    W11      W12     W21      W22     W10       W20
    -3.5402  4.0244  -3.5248  4.5814  -11.9103  4.6940
Feed Forward Neural Networks
Error Back Propagation Method
As we see in the table, the actual results are very close to the desired results:

    x1  x2  Target (t)  Output (O)
    0   0   0           0.0264
    0   1   1           0.9867
    1   0   1           0.9863
    1   1   1           0.9908

This example shows that the difficulty of training does not lie in understanding it, but in the effort required, especially when the operations are repeated, sometimes thousands of times. That is why this process is usually performed by computer.
Applications of Feed Forward Nets
Pattern recognition
Face Recognition
Character Recognition
Sonar mine/rock recognition
Navigation of a car
Stock-market prediction
Pronunciation (NETtalk)
Applications of Feed Forward Nets
Example: Voice Recognition
Task: learn to discriminate between two different voices saying "Hello".
Data
Sources: Ahmed, Naseer
Format: frequency distribution (60 bins)
Applications of Feed Forward Nets
Network architecture
Feed-forward network
60 inputs (one for each frequency bin)
6 hidden neurons
2 outputs (0-1 for Ahmed, 1-0 for Naseer)
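A hypothetical sketch of a network with this shape, using randomly initialised weights; the function and variable names are illustrative, not from the original material:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Randomly initialised 60-6-2 network (60 inputs, 6 hidden, 2 outputs)
W_hidden = rng.normal(scale=0.1, size=(60, 6))
W_output = rng.normal(scale=0.1, size=(6, 2))

def predict(freq_bins):
    # freq_bins: length-60 frequency distribution of one "Hello" sample
    h = sigmoid(freq_bins @ W_hidden)
    return sigmoid(h @ W_output)   # two outputs, one per speaker code

print(predict(rng.random(60)))
```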
Applications of Feed Forward Nets
Presenting the data

[Figure: frequency distributions of "Hello" for Ahmed and Naseer]

Applications of Feed Forward Nets
Presenting the data (untrained network)

[Figure: untrained network outputs — Ahmed: 0.43, 0.26; Naseer: 0.73, 0.55]
Applications of Feed Forward Nets
Calculate error

    Ahmed:  |0.43 - 0| = 0.43,  |0.26 - 1| = 0.74
    Naseer: |0.73 - 1| = 0.27,  |0.55 - 0| = 0.55

Applications of Feed Forward Nets
Back-propagate error and adjust weights

    Ahmed:  |0.43 - 0| = 0.43,  |0.26 - 1| = 0.74   (total 1.17)
    Naseer: |0.73 - 1| = 0.27,  |0.55 - 0| = 0.55   (total 0.82)
Applications of Feed Forward Nets
Repeat process (sweep) for all training pairs
Present data
Calculate error
Back propagate error
Adjust weights
Repeat process multiple times
Applications of Feed Forward Nets
Presenting the data (trained network)

[Figure: trained network outputs — Ahmed: 0.01, 0.99; Naseer: 0.99, 0.01]
Applications of Feed Forward Nets
Results: Voice Recognition
Performance of the trained network:
Discrimination accuracy between known "Hello" samples = 100%
Discrimination accuracy between new "Hello" samples = 100%
Additional Issues of Training Neural Networks
When training these networks, attention must be given to the following issues:
Optimization
Overfitting
Underfitting
Network size selection
Learning rate
Neglecting any one of these issues can lead to an incomplete or ineffective neural network.
Additional Issues of Training Neural Networks
Optimization:
Several optimization algorithms for training NNs have been developed. These algorithms fall into two classes:
Local optimization: the algorithm may get stuck in a local optimum without finding a global optimum. Gradient descent is an example of a local optimizer.
Global optimization: the algorithm searches for the global optimum by employing mechanisms to explore larger parts of the search space.

[Figure: a function with one global maximum and several local maxima]
Additional Issues of Training Neural Networks
Local and global optimization techniques can be combined to form hybrid training algorithms.
Additional Issues of Training Neural Networks
Learning consists of adjusting weights until an acceptable empirical error has been reached. Two types of supervised training algorithms exist, based on when the weights are updated:
Stochastic/online training:
Weights are adjusted after each pattern presentation. In this case the next input pattern is selected randomly from the training set, to prevent any bias that may arise from the order in which patterns occur in the training set.
Batch/offline training:
Weight changes are accumulated and used to adjust the weights only after all training patterns have been presented. A code sketch contrasting the two update schedules follows below.
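The sketch below contrasts the two schedules for a single linear unit (a simplification of the full FFNN, with illustrative function names):

```python
import numpy as np

def grad(w, x, t):
    # Gradient of E_p = 0.5*(t - w·x)^2 for a single linear unit (illustrative only)
    return -(t - np.dot(w, x)) * x

def online_epoch(w, X, T, eta=0.1, rng=np.random.default_rng()):
    # Stochastic/online: update after every pattern, visiting patterns in random order
    for p in rng.permutation(len(X)):
        w = w - eta * grad(w, X[p], T[p])
    return w

def batch_epoch(w, X, T, eta=0.1):
    # Batch/offline: accumulate the gradient over all patterns, then update once
    total = sum(grad(w, X[p], T[p]) for p in range(len(X)))
    return w - eta * total

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 1, 1, 1], dtype=float)
w = np.zeros(2)
print(online_epoch(w, X, T))
print(batch_epoch(w, X, T))
```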
Additional Issues of Training Neural Networks
[Figure: fitting a data set with insufficient training (underfitting), proper training, and excessive training (overfitting)]

Additional Issues of Training Neural Networks
There are a large number of methods used to avoid such problems; the most important is early stopping.
In early stopping, the data is divided into three sections (a sketch follows below):
Training
Validation
Testing
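A hypothetical sketch of early stopping with such a three-way split; `train_step` and `error` are placeholders for one training epoch and an error measure, and the 70/15/15 split is an assumed example, not prescribed by the slides:

```python
import numpy as np

def split_data(X, T, rng=np.random.default_rng(0)):
    # Split the data set into training (70%), validation (15%), and testing (15%)
    idx = rng.permutation(len(X))
    n_train, n_val = int(0.7 * len(X)), int(0.15 * len(X))
    tr, va, te = idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
    return (X[tr], T[tr]), (X[va], T[va]), (X[te], T[te])

def train_with_early_stopping(init_w, train_step, error, data, patience=10):
    # Stop when the validation error has not improved for `patience` epochs
    (X_tr, T_tr), (X_va, T_va), _ = data
    w, best_w = init_w, init_w
    best_err, waited = np.inf, 0
    while waited < patience:
        w = train_step(w, X_tr, T_tr)       # one training epoch
        val_err = error(w, X_va, T_va)      # monitor error on the validation set
        if val_err < best_err:
            best_err, best_w, waited = val_err, w, 0
        else:
            waited += 1
    return best_w                           # weights with the lowest validation error
```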