Artificial Neural Networks (ANN) : 1-Introduction
1-Introduction
An ANN models the relationship between a set of input signals and a set of output signals.
It uses a network of artificial neurons, also referred to as nodes, to solve learning problems.
ANNs are versatile learners, since they can be applied to nearly any learning task: classification, numeric prediction, and unsupervised pattern recognition.
2-Structure:
A directed network diagram defines the relationship between the input signals (x variables) and the output signals (y variables). Each dendrite (sensor input) is weighted (w values) according to its importance; the inputs are summed and then processed through an activation function f.
An ANN is characterized by:
Activation function: transforms a neuron's combined input into an output signal.
Network topology (architecture): the number of neurons and the number of layers.
Training algorithm: sets the connection weights.
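The weighted-sum-plus-activation structure described above can be sketched in a few lines of Python. The function names and the particular weights below are illustrative, not from the source:

```python
import math

def sigmoid(z):
    # Smooth activation function; output lies in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    # Weight each input signal by its importance, sum, then apply f.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(total)

# Three input signals x, each with its own weight w.
y = neuron([0.5, 0.3, 0.2], weights=[0.4, -0.6, 0.9], bias=0.1,
           activation=sigmoid)
```

Here `y` is the neuron's output signal, a single value between 0 and 1.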
3-Activation Functions:
This process sums the total input signal and determines whether it meets the firing threshold: if so, the neuron passes the signal on; otherwise, it does not. This is known as the threshold activation function: an output is produced only once a specified input threshold is attained.
Unit step activation function
The threshold function is rarely used in ANNs; activation functions are instead chosen for their desirable mathematical characteristics and their ability to model the relationships in the data.
The most common one is the sigmoid activation function, where the output is no longer binary but ranges from 0 to 1.
It is common because it is differentiable, which will be used in determining the optimal weights.
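A minimal sketch contrasting the two activation functions mentioned above (the threshold/unit-step value and name are illustrative):

```python
import math

def unit_step(z, threshold=0.0):
    # Fires (outputs 1) only once the input meets the threshold.
    return 1 if z >= threshold else 0

def sigmoid(z):
    # Output is no longer binary: it ranges continuously over (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    # Defined everywhere, which is what gradient-based weight
    # optimization relies on; the step function has no useful derivative.
    s = sigmoid(z)
    return s * (1.0 - s)
```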
Of course, the inputs should be standardized or normalized, which avoids the flat high and low ends of the sigmoid by squeezing the input values into a small range. This also helps the model converge faster during training.
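One common way to squeeze inputs into a small range is min-max normalization; the sketch below (function name is illustrative) rescales values into [0, 1]:

```python
def min_max_normalize(values, low=0.0, high=1.0):
    # Rescale values into [low, high] so the activation function
    # operates away from its flat tails.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [low for _ in values]  # all values identical
    return [low + (v - lo) / span * (high - low) for v in values]

scaled = min_max_normalize([10, 20, 30, 40])
```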
4-Network Topology:
Topology determines the complexity of tasks the network can handle: larger, more complex networks can identify more subtle patterns and more complex decision boundaries.
A recurrent network (feedback network) allows signals to travel in both directions using loops.
The appropriate number of neurons and layers also depends on the amount of training data, how noisy the data is, and the complexity of the learning task.
The topology itself does not learn anything; as input data is processed, connections (weights) are either strengthened or weakened.
5-Training with Backpropagation:
Backpropagation iterates through many cycles of two processes; each cycle is called an epoch. The weights are initially set randomly, since the model has no prior knowledge, and the algorithm keeps cycling through the two processes until a stopping criterion is reached.
1. A forward phase, in which neurons are activated in sequence from the input layer to the
output layer, applying each neuron's weights and activation function, until an output
signal is produced.
2. A backward phase, in which the output signal is compared with the target value in the
training data. The difference is an error that is propagated backward to modify the
connection weights.
The network uses the information sent backward to reduce the total error. But by how much should each weight be changed?
We use the gradient descent technique. Backpropagation uses the derivative of the activation function to identify the gradient at each weight. The gradient indicates how steeply the error will be reduced or increased for a change in that weight, and the algorithm changes the weights in the direction that gives the greatest reduction in error, scaled by the learning rate.
The greater this rate, the faster the algorithm descends the gradient, which reduces training time (though too large a rate can overshoot the minimum).
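The two phases and the gradient descent update can be sketched for a single sigmoid neuron. The toy dataset, learning rate, and epoch count below are illustrative choices, not from the source; the gradient uses the sigmoid's derivative as described above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: learn to output 1 when x > 0.5, else 0.
data = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
w, b = 0.0, 0.0          # weights start with no prior knowledge
learning_rate = 0.5

for epoch in range(2000):                 # each full cycle is one epoch
    for x, target in data:
        # Forward phase: weighted sum -> activation -> output signal.
        out = sigmoid(w * x + b)
        # Backward phase: compare output with the target value.
        error = out - target
        grad = error * out * (1.0 - out)  # uses the sigmoid's derivative
        # Descend the gradient, scaled by the learning rate.
        w -= learning_rate * grad * x
        b -= learning_rate * grad
```

After training, the neuron's output should be above 0.5 for the large inputs and below 0.5 for the small ones.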
6- Batch Training:
For CBOW (continuous bag of words), ReLU is used as the activation function.
Example: the context words "to" and "eat" are used to predict a target word such as "sandwich".
There is also skip-gram, which works in the opposite direction: it uses the target word to estimate its surrounding context words.
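A minimal sketch of the CBOW forward pass with ReLU, under stated assumptions: the tiny vocabulary, the embedding dimension, and the random (untrained) projection matrix are all hypothetical choices for illustration:

```python
import random

random.seed(0)
vocab = ["i", "like", "to", "eat", "a", "sandwich"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    vec = [0.0] * len(vocab)
    vec[word_to_idx[word]] = 1.0
    return vec

def relu(z):
    return max(0.0, z)

embed_dim = 4
# Untrained random projection from vocabulary space to the hidden layer.
W_in = [[random.uniform(-1, 1) for _ in range(embed_dim)]
        for _ in range(len(vocab))]

def cbow_hidden(context_words):
    # Average the context words' one-hot vectors, project, apply ReLU.
    avg = [sum(one_hot(w)[i] for w in context_words) / len(context_words)
           for i in range(len(vocab))]
    return [relu(sum(avg[i] * W_in[i][j] for i in range(len(vocab))))
            for j in range(embed_dim)]

# Context ("to", "eat") would be used to predict a target like "sandwich".
hidden = cbow_hidden(["to", "eat"])
```

A full model would add an output layer over the vocabulary and train both weight matrices; skip-gram would instead feed the target word in and predict each context word.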