Chapter 3-1 Neural Network

Artificial Neural Networks

Part 1/3 – Biological Inspirations
Humans perform complex tasks like vision, motor control, or language understanding very well.

One way to build intelligent machines is to try to imitate the (organizational principles of the) human brain.
Human Brain
• The brain is a highly complex, non-linear, and parallel computer, composed of some 10^11 neurons that are densely connected (~10^4 connections per neuron). We have just begun to understand how the brain works...

• A neuron is much slower (10^-3 sec) than a silicon logic gate (10^-9 sec); however, the massive interconnection between neurons makes up for the comparatively slow rate.
– Complex perceptual decisions are arrived at quickly (within a few hundred milliseconds)

• 100-steps rule: Since individual neurons operate in a few milliseconds, calculations do not involve more than about 100 serial steps, and the information sent from one neuron to another is very small (a few bits)

• Plasticity: Some of the neural structure of the brain is present at birth, while other parts are developed through learning, especially in early stages of life, to adapt to the environment (new inputs).
Biological Neuron
A variety of different neurons exist (motor neurons, on-center off-surround visual cells, ...), with different branching structures.

The connections of the network and the strengths of the individual synapses establish the function of the network.

– dendrites: nerve fibres carrying electrical signals to the cell
– cell body: computes a non-linear function of its inputs
– axon: single long fibre that carries the electrical signal from the cell body to other neurons
– synapse: the point of contact between the axon of one cell and the dendrite of another, regulating a chemical connection whose strength affects the input to the cell
Artificial Neural Networks
Computational models inspired by the human brain:

– Massively parallel, distributed systems made up of simple processing units (neurons)

– Synaptic connection strengths among neurons are used to store the acquired knowledge

– Knowledge is acquired by the network from its environment through a learning process
Properties of ANNs
Learning from examples
– labeled or unlabeled

Adaptivity
– changing the connection strengths to learn things

Non-linearity
– the non-linear activation functions are essential

Fault tolerance
– if one of the neurons or connections is damaged, the whole
network still works quite well

Thus, they might be better alternatives than classical solutions for problems characterised by:
– high dimensionality; noisy, imprecise, or imperfect data; and
– a lack of a clearly stated mathematical solution or algorithm
Neuron Model and Network Architectures
Artificial Neuron Model

[Diagram: neuron i with inputs x1 ... xm, synaptic weights wi1 ... wim, a bias bi fed by the fixed input x0 = +1, a summation producing the net input ni, and an activation function f producing the output ai]

ai = f(ni) = f( Σj=1..n wij xj + bi )

An artificial neuron:
- computes the weighted sum of its inputs (called its net input)
- adds its bias
- passes this value through an activation function

We say that the neuron “fires” (i.e. becomes active) if its output is above zero.
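A minimal sketch of this computation in Python (the helper names and the numbers are illustrative, not from the slides):

```python
# A minimal artificial neuron: weighted sum of inputs, plus bias,
# passed through an activation function f.
import numpy as np

def neuron(x, w, b, f):
    n = np.dot(w, x) + b   # net input n_i
    return f(n)            # output a_i = f(n_i)

# Example: three inputs and a step ("hardlim") activation.
hardlim = lambda n: 1 if n >= 0 else 0
a = neuron(np.array([1.0, 0.0, 1.0]), np.array([0.4, -0.2, 0.3]), -0.5, hardlim)
print(a)  # 0.4 + 0.3 - 0.5 = 0.2 >= 0, so the neuron "fires": prints 1
```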
Bias
Bias can be incorporated as another weight clamped to a fixed input of +1.0.

This extra free variable (bias) makes the neuron more powerful.

ai = f(ni) = f( Σj=0..n wij xj ) = f(wi · x),  where wi0 = bi and x0 = +1
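A short sketch of this trick (with illustrative numbers): the bias is stored as weight w0 and the input vector is augmented with a clamped x0 = +1:

```python
# Bias-as-weight: fold b into the weight vector via a clamped input of +1.
import numpy as np

w = np.array([0.4, -0.2, 0.3])          # synaptic weights
b = -0.5                                # bias
x = np.array([1.0, 0.0, 1.0])

w_aug = np.concatenate(([b], w))        # w0 = b
x_aug = np.concatenate(([1.0], x))      # x0 = +1 (clamped)

assert np.isclose(np.dot(w, x) + b, np.dot(w_aug, x_aug))  # same net input
```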
Activation Functions

Also called the squashing function, as it limits the amplitude of the output of the neuron.

Many types of activation functions are used:

– linear: a = f(n) = n

– threshold (hard-limiting): a = 1 if n >= 0, a = 0 if n < 0

– sigmoid: a = 1 / (1 + e^(-n))

– ...
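The three functions listed above, written out as a short Python sketch (vectorized with NumPy for convenience):

```python
import numpy as np

def linear(n):                 # a = n
    return n

def threshold(n):              # hard-limiting: 1 if n >= 0, else 0
    return np.where(n >= 0, 1, 0)

def sigmoid(n):                # a = 1 / (1 + e^(-n))
    return 1.0 / (1.0 + np.exp(-n))

n = np.array([-2.0, 0.0, 2.0])
print(linear(n), threshold(n), sigmoid(n))
```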
Artificial Neural Networks
A neural network is a massively parallel, distributed processor made up of simple processing units (artificial neurons).

It resembles the brain in two respects:
– Knowledge is acquired by the network from its environment through a learning process
– Synaptic connection strengths among neurons are used to store the acquired knowledge
Different Network Topologies
Single layer feed-forward networks
– Input layer projecting into the output layer

[Diagram: input layer connected directly to the output layer]
Different Network Topologies
Multi-layer feed-forward networks
– One or more hidden layers.
– A layer receives input only from previous layers, typically only from the immediately preceding layer

[Diagram: a 2-layer (1-hidden-layer) fully connected network, with input, hidden, and output layers]
Different Network Topologies
Recurrent networks
– A network with feedback, where some of its
inputs are connected to some of its outputs (discrete
time).

[Diagram: network with feedback connections from outputs back to inputs]
Applications of ANNs
ANNs have been widely used in various domains for:
– Pattern recognition
– Function approximation
– Associative memory
– ...
Artificial Neural Networks
Early ANN Models:
– Perceptron, ADALINE, Hopfield Network

Current Models:
– Deep Learning Architectures
– Multilayer feedforward networks (Multilayer perceptrons)
– Radial Basis Function networks
– Self-Organizing Networks
– ...
How to Decide on a Network Topology?

– # of input nodes?
• Number of features

– # of output nodes?
• Suitable to encode the output representation

– transfer function?
• Suitable to the problem

– # of hidden nodes?
• Not exactly known
Multilayer Perceptron
Each layer may have a different number of nodes and different activation functions.
But commonly:
– the same activation function is used within one layer
• a sigmoid/tanh activation function is used in the hidden units, and
• sigmoid/tanh or linear activation functions are used in the output units, depending on the problem (sigmoid/tanh for classification, linear for function approximation); see the sketch below
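A minimal sketch of such a network's forward pass, assuming illustrative layer sizes and random (untrained) weights: tanh hidden units and a linear output unit, as for function approximation:

```python
# Forward pass of a 1-hidden-layer MLP: tanh hidden units, linear output.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 2)), rng.standard_normal(4)  # hidden layer
W2, b2 = rng.standard_normal((1, 4)), rng.standard_normal(1)  # output layer

def mlp(x):
    h = np.tanh(W1 @ x + b1)   # non-linear hidden activations
    return W2 @ h + b2         # linear output (function approximation)

print(mlp(np.array([0.5, -1.0])))
```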
Neural Networks Resources
Neural Networks Textbooks

Main textbooks:
• “Neural Networks: A Comprehensive Foundation”, S. Haykin (very good, theoretical)
• “Neural Networks for Pattern Recognition”, C. Bishop (very good, more accessible)
• “Neural Network Design”, Hagan, Demuth and Beale (introductory)

Books emphasizing the practical aspects:
• “Neural Smithing”, Reed and Marks
• “Practical Neural Network Recipes in C++”, T. Masters
• Seminal work (but now quite old!):
– “Parallel Distributed Processing”, Rumelhart, McClelland et al.

Deep Learning books and tutorials:
• http://www.deeplearningbook.org/
Neural Networks Literature
Review Articles:
R. P. Lippmann, “An Introduction to Computing with Neural Nets”, IEEE ASSP Magazine, pp. 4-22, April 1987.
T. Kohonen, “An Introduction to Neural Computing”, Neural Networks, 1, 3-16, 1988.
A. K. Jain, J. Mao, K. Mohiuddin, “Artificial Neural Networks: A Tutorial”, IEEE Computer, March 1996, pp. 31-44.

Journals:
IEEE Transactions on Neural Networks
Neural Networks
Neural Computation
Biological Cybernetics
...
Artificial Neural Networks
Part 2/3 – Perceptron

Perceptron

• A single artificial neuron that computes its weighted input and uses a threshold activation function.

• It effectively separates the input space into two categories by the hyperplane:
wTx + b = 0
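A minimal sketch of the perceptron as such a classifier; the hyperplane chosen here is an arbitrary example, not from the slides:

```python
# A perceptron labels an input x by which side of the hyperplane
# w^T x + b = 0 it falls on.
import numpy as np

def perceptron(x, w, b):
    return 1 if np.dot(w, x) + b >= 0 else 0

w, b = np.array([1.0, 2.0]), -2.0              # example hyperplane
print(perceptron(np.array([2.0, 1.0]), w, b))  # 2 + 2 - 2 = 2 >= 0 -> 1
print(perceptron(np.array([0.0, 0.0]), w, b))  # -2 < 0 -> 0
```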

Decision Boundary

The weight vector is orthogonal to the decision boundary.

The weight vector points in the direction of the vectors which should produce an output of 1
• so that the vectors with positive output lie on the side of the decision boundary that w points toward
– if w pointed in the opposite direction, the dot products of all input vectors would have the opposite sign
– this would result in the same classification but with opposite labels

The bias determines the position of the boundary
• solve wTp + b = 0 using one point on the decision boundary to find b

Two-Input Case

a = hardlim(n) = hardlim([1 2]p - 2)    (w1,1 = 1, w1,2 = 2, b = -2)

Decision boundary: all points p for which wTp + b = 0

If we have the weights but not the bias, we can take a point on the decision boundary, p = [2 0]T, and solve [1 2]p + b = 0 to find b = -2.
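A quick numeric check of this step, using the values from the slide:

```python
# Given w and a point p on the decision boundary, w^T p + b = 0
# gives b = -w^T p.
import numpy as np

w = np.array([1.0, 2.0])
p = np.array([2.0, 0.0])
b = -np.dot(w, p)
print(b)  # -2.0, as in the slide
```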

[Diagram: the decision boundary wTp + b = 0, a point p on it, and the projection of p onto the weight vector w]

wT·p = ||w|| ||p|| cos θ
projection of p onto w = ||p|| cos θ = wT·p / ||w||
on the boundary: wTp = -b

• All points on the decision boundary have the same inner product (= -b) with the weight vector
• Therefore they have the same projection onto the weight vector, so they must lie on a line orthogonal to the weight vector

ADVANCED

An Illustrative Example

Boolean OR

Input-output pairs for Boolean OR: {p1 = [0 0]T, t1 = 0}, {p2 = [0 1]T, t2 = 1}, {p3 = [1 0]T, t3 = 1}, {p4 = [1 1]T, t4 = 1}

Given these input-output pairs (p, t), can you find (manually) the weights of a perceptron to do the job?

Boolean OR Solution

1) Pick an admissible decision boundary (one that separates the input with target 0 from the inputs with target 1).

2) The weight vector should be orthogonal to the decision boundary:
1w = [0.5 0.5]T

3) Pick a point on the decision boundary, e.g. p = [0 0.5]T, to find the bias:
1wTp + b = [0.5 0.5] [0 0.5]T + b = 0.25 + b = 0  =>  b = -0.25
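A quick check, in Python, that these hand-picked weights do implement Boolean OR:

```python
# Verify the hand-picked OR solution: w = [0.5, 0.5], b = -0.25.
import numpy as np

w, b = np.array([0.5, 0.5]), -0.25
for p, t in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]:
    a = 1 if np.dot(w, p) + b >= 0 else 0
    assert a == t
print("w =", w, "b =", b, "implement Boolean OR")
```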

Matrix Form

[Slide figure: the perceptron written in matrix form, a = hardlim(Wp + b)]
Multiple-Neuron Perceptron
The weights of one neuron are stored in one row of W:

W = [ w1,1  w1,2  ...  w1,R
      w2,1  w2,2  ...  w2,R
      ...
      wS,1  wS,2  ...  wS,R ]

Writing iwT for the i-th row of W:

W = [ 1wT ; 2wT ; ... ; SwT ],   iw = [ wi,1  wi,2  ...  wi,R ]T

ai = hardlim(ni) = hardlim(iwTp + bi)
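A minimal sketch of the matrix form a = hardlim(Wp + b); the sizes S and R and the random weights are illustrative:

```python
# S neurons, R inputs: one matrix-vector product computes all outputs.
import numpy as np

def hardlim(n):
    return np.where(n >= 0, 1, 0)

S, R = 3, 2                      # illustrative sizes
W = np.random.randn(S, R)        # row i holds the weights of neuron i
b = np.random.randn(S)
p = np.array([1.0, -1.0])

a = hardlim(W @ p + b)           # element i is hardlim(iw^T p + b_i)
print(a)                         # a vector of S binary outputs
```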

Multiple-Neuron Perceptron

Each neuron will have its own decision boundary:
iwTp + bi = 0

A single neuron can classify input vectors into two categories.

An S-neuron perceptron can potentially classify input vectors into 2^S categories.

Perceptron Limitations
• A single-layer perceptron can only learn linearly separable problems.
– The Boolean AND function is linearly separable, whereas the Boolean XOR function is not.

[Plots: Boolean AND (separable by a line) vs. Boolean XOR (not separable by any line)]

AND Network

[Diagram: perceptron with inputs x1 and x2 (weights W1 = 0.5 and W2 = 0.5), a bias input x0 = 1 with weight W0 = -0.8, a summation Σ, and a threshold output]
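A quick check that the weights in the diagram implement Boolean AND:

```python
# Verify the AND network: W1 = W2 = 0.5, bias weight W0 = -0.8.
import numpy as np

w, b = np.array([0.5, 0.5]), -0.8
for x, t in [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]:
    a = 1 if np.dot(w, x) + b >= 0 else 0
    assert a == t
print("The network computes Boolean AND")
```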

Perceptron Limitations
Linear decision boundary:
1wTp + b = 0

[Plots: examples of linearly inseparable problems]

Perceptron Limitations
For a linearly non-separable problem:
– Would it help if we used more layers of neurons?
– What could be the learning rule for each neuron?

Solution: multilayer networks and the backpropagation learning algorithm

• More than one layer of perceptrons (with a hard-limiting activation function) can represent any Boolean function.
• However, a learning algorithm for multi-layer perceptrons was not developed until much later:
– the backpropagation algorithm
– replacing the hard limiter in the perceptron with a sigmoid activation function
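To illustrate the first point, here is a sketch of XOR computed by two layers of hard-limiting units; the weights are one hand-picked solution (combining the OR and AND neurons seen earlier), not the output of a learning algorithm:

```python
# XOR from two layers of hardlim units: XOR(x1, x2) = OR(x1, x2) AND NOT AND(x1, x2).
import numpy as np

hardlim = lambda n: np.where(n >= 0, 1, 0)

def xor(x):
    h_or  = hardlim(np.dot([0.5, 0.5], x) - 0.25)   # x1 OR x2
    h_and = hardlim(np.dot([0.5, 0.5], x) - 0.8)    # x1 AND x2
    return hardlim(0.5 * h_or - 0.5 * h_and - 0.25) # OR but not AND

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, int(xor(np.array(x))))   # prints 0, 1, 1, 0
```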

Summary

• So far we have seen how a single neuron with a threshold activation function separates the input space into two.

• We also talked about how more than one node may define convex (open or closed) regions.

• Next week: the backpropagation algorithm, to learn the weights automatically.
