
11 Machine Learning: Connectionist

11.0 Introduction
11.1 Foundations of Connectionist Networks
11.2 Perceptron Learning
11.3 Backpropagation Learning
11.4 Competitive Learning
11.5 Hebbian Coincidence Learning
11.6 Attractor Networks or “Memories”
11.7 Epilogue and References
11.8 Exercises

Additional sources used in preparing the slides:


Robert Wilensky’s AI lecture notes,
http://www.cs.berkeley.edu/~wilensky/cs188
Various sites that explain how a neuron works
1
Chapter Objectives

• Learn about the neurons in the human brain


• Learn about single neuron systems
• Introduce neural networks

2
Inspiration: The human brain

• We seem to learn facts and get better at doing things without having to run a separate “learning procedure.”
• It is desirable to integrate learning more with
doing.

3
Biology

• The brain doesn’t seem to have a CPU.


• Instead, it’s got lots of simple, parallel,
asynchronous units, called neurons.
• Every neuron is a single cell that has a
number of relatively short fibers, called
dendrites, and one long fiber, called an axon.
 The end of the axon branches out into more short fibers
 Each fiber “connects” to the dendrites and cell bodies of
other neurons
 The “connection” is actually a short gap, called a
synapse
 Axons are transmitters, dendrites are receivers

4
Neuron

5
How neurons work

• The fibers of surrounding neurons emit chemicals (neurotransmitters) that move across the synapse and change the electrical potential of the cell body
 Sometimes the action across the synapse increases the
potential, and sometimes it decreases it.
 If the potential reaches a certain threshold, an electrical
pulse, or action potential, will travel down the axon,
eventually reaching all the branches, causing them to
release their neurotransmitters. And so on ...

6
How neurons work (cont’d)

7
How neurons change

• There are changes to neurons that are presumed to reflect or enable learning:
 The synaptic connections exhibit plasticity. In other
words, the degree to which a neuron will react to a
stimulus across a particular synapse is subject to long-
term change over time (long-term potentiation).
 Neurons also will create new connections to other
neurons.
 Other changes in structure also seem to occur, some less
well understood than others.

8
Neurons as devices

• How many neurons are there in the human brain?


- around 10¹² (with, perhaps, 10¹⁴ or so synapses)
• Neurons are slow devices.
- Tens of milliseconds to do something.
- Feldman translates this into the “100-step
program” constraint: most of the AI tasks we
want to do take people less than a second,
so any brain “program” can’t be longer than
100 neural “instructions.”
• No particular unit seems to be important.
- Destroying any one brain cell has little effect
on overall processing.

9
How do neurons do it?

• Basically, all the billions of neurons in the brain are active at once.
- So, this is truly massive parallelism.
• But, probably not the kind of parallelism that
we are used to in conventional Computer
Science.
- Sending messages (i.e., patterns that
encode information) is probably too
slow to work.
- So information is probably encoded
some other way, e.g., by the
connections themselves.

10
AI / Cognitive Science Implication

• Explain cognition by richly connected networks transmitting simple signals.
• Sometimes called
- connectionist computing
(by Jerry Feldman)
- Parallel Distributed Processing (PDP)
(by Rumelhart, McClelland, and Hinton)
- neural networks (NN)
- artificial neural networks (ANN)
(emphasizing that the relation to biology
is generally rather tenuous)

11
From a neuron to a perceptron

• All connectionist models use a similar model of a neuron
• There is a collection of units, each of which has
 a number of weighted inputs from other units
 inputs represent the degree to which the other unit is firing
 weights represent how much the unit wants to listen to other units
 a threshold that the sum of the weighted inputs is compared against
 the threshold has to be crossed for the unit to do something (“fire”)
 a single output to another bunch of units
 what the unit decided to do, given all the inputs and its threshold
12
A unit (perceptron)
[Diagram: inputs x1 … xn with weights w1 … wn feed the unit; activation y = Σ wixi; output O = f(y)]

xi are the inputs
wi are the weights
wn is usually set for the threshold, with xn = 1 (bias)
y is the weighted sum of the inputs, including the
threshold (the activation level)
o is the output. The output is computed using a
function that determines how far the
perceptron’s activation level is below or
above 0
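A minimal sketch of this unit in Python (the function names and the particular threshold chosen for f are illustrative, not taken from the slides):

```python
def unit_output(x, w, f):
    """Weighted sum of the inputs, passed through the activation function f."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # activation level y = sum of wi*xi
    return f(y)

# Simple threshold activation; the last input is fixed to 1 so that
# the last weight acts as the bias/threshold, as described above.
f = lambda y: 1 if y > 0 else -1
print(unit_output([0.5, 0.2, 1], [2.0, -1.0, -0.4], f))   # -> 1
```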

13
Notes

• The perceptrons are continuously active


- Actually, real neurons fire all the time; what
changes is the rate of firing, from a few to a
few hundred impulses a second
• The weights of the perceptrons are not fixed
- Indeed, learning in a NN system is basically a
matter of changing weights

14
Interesting questions for NNs

• How do we wire up a network of perceptrons?


- i.e., what “architecture” do we use?
• How does the network represent knowledge?
- i.e., what do the nodes mean?
• How do we set the weights?
- i.e., how does learning take place?

15
The simplest architecture: a single
perceptron
[Diagram: inputs x1 … xn with weights w1 … wn; activation y = Σ wixi; output o]

A perceptron computes o = sign(X · W), where
X · W = w1 * x1 + w2 * x2 + … + wn * 1, and
sign(x) = 1 if x > 0, and -1 otherwise
A perceptron can act as a logic gate
interpreting 1 as true and -1 (or 0) as false
16
Logical function and

[Perceptron with inputs x and y, weights +1 and +1, and bias weight -2; it computes x + y - 2 and outputs x ∧ y]

x y x+y-2 output
1 1 0 1
1 0 -1 -1
0 1 -1 -1
0 0 -2 -1

17
Logical function or

[Perceptron with inputs x and y, weights +1 and +1, and bias weight -1; it computes x + y - 1 and outputs x ∨ y]

x y x+y-1 output
1 1 1 1
1 0 0 1
0 1 1 1
0 0 -1 -1
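A quick check of the two gate slides in Python (a sketch; note that the tables above treat an activation of exactly 0 as firing, so the test here is >= 0):

```python
def gate(x, y, bias):
    """Perceptron with weights (+1, +1) and the given bias weight; fires when x + y + bias >= 0."""
    return 1 if x + y + bias >= 0 else -1

for x in (1, 0):
    for y in (1, 0):
        print(x, y, "and:", gate(x, y, -2), "or:", gate(x, y, -1))
```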

18
Training perceptrons

• We can train perceptrons to compute the function of our choice
• The procedure
 Start with a perceptron with any values for the weights
(usually 0)
 Feed the input, let the perceptron compute the answer
 If the answer is right, do nothing
 If the answer is wrong, then modify the weights by
adding or subtracting the input vector (perhaps scaled
down)
 Iterate over all the input vectors, repeating as necessary,
until the perceptron learns what we want

19
Training perceptrons: the intuition

• If the unit should have gone on, but didn’t, increase the influence of the inputs that are on:
- adding the input (or fraction thereof) to the
weights will do so;
• If it should have been off, but was on,
decrease influence of the units that were on:
- subtracting the input from the weights does
this

20
Example: teaching the logical or function

Want to learn this:


b x y output
1 -1 -1 -1
1 -1 1 1
1 1 -1 1
1 1 1 1

Initially the weights are all 0, i.e., the weight vector is (0 0 0)
The next step is to cycle through the inputs and
change the weights as necessary

21
The training cycle

b x y output
1 -1 -1 -1
1 -1 1 1
1 1 -1 1
1 1 1 1

Input        Weights   Result      Action
1. (1 -1 -1) (0 0 0)   f(0) = -1   correct, do nothing
2. (1 -1 1)  (0 0 0)   f(0) = -1   should have been 1,
                                   so add inputs to weights:
                                   (0 0 0) + (1 -1 1) = (1 -1 1)
3. (1 1 -1)  (1 -1 1)  f(-1) = -1  should have been 1,
                                   so add inputs to weights:
                                   (1 -1 1) + (1 1 -1) = (2 0 0)
4. (1 1 1)   (2 0 0)   f(2) = 1    correct, but keep going!
1. (1 -1 -1) (2 0 0)   f(2) = 1    should have been -1,
                                   so subtract inputs from weights:
                                   (2 0 0) - (1 -1 -1) = (1 1 1)
These do the trick!
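A sketch of this training loop in Python (function and variable names are illustrative). Run on the four (b, x, y) patterns above, it reaches the weight vector (1 1 1) found in the trace:

```python
def f(y):
    # as in the trace above, f(0) counts as -1: fire only when y > 0
    return 1 if y > 0 else -1

def train(samples, w, epochs=10):
    """samples: list of (input_vector, desired_output) pairs; w: initial weights."""
    for _ in range(epochs):
        for x, d in samples:
            out = f(sum(wi * xi for wi, xi in zip(w, x)))
            if out != d:
                # wrong answer: add the input vector (d = 1, out = -1)
                # or subtract it (d = -1, out = 1)
                w = [wi + (d - out) // 2 * xi for wi, xi in zip(w, x)]
    return w

or_data = [((1, -1, -1), -1), ((1, -1, 1), 1), ((1, 1, -1), 1), ((1, 1, 1), 1)]
print(train(or_data, [0, 0, 0]))   # -> [1, 1, 1]
```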

22
The final set of weights

b x y output
1 -1 -1 -1
1 -1 1 1
1 1 -1 1
1 1 1 1

The learned set of weights does the right thing for all the data:

(1 -1 -1) . (1 1 1) = -1 → f(-1) = -1
(1 -1 1) . (1 1 1) = 1 → f(1) = 1
(1 1 -1) . (1 1 1) = 1 → f(1) = 1
(1 1 1) . (1 1 1) = 3 → f(3) = 1

23
The general procedure

• Start with a perceptron with any values for the


weights (usually 0)
• Feed the input, let the perceptron compute the
answer
• If the answer is right, do nothing
• If the answer is wrong, then modify the weights
by adding or subtracting the input vector
Δwi = c (d - f) xi
• Iterate over all the input vectors, repeating as
necessary, until the perceptron learns what we
want (i.e., the weight vector converges)

24
More on Δwi = c (d - f) xi

c is the learning constant


d is the desired output
f is the actual output

(d - f) is either 0 (correct), or (1 - (-1)) = 2, or (-1 - 1) = -2.
The net effect is:
When the actual output is -1 and should be 1,
increment the weights on the ith line by 2cxi. When
the actual output is 1 and should be -1, decrement
the weights on the ith line by 2cxi.
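As a sketch, the same update written out in Python (with c = 0.5 it reduces to adding or subtracting the raw input vector, as in the training cycle shown earlier):

```python
def update(w, x, d, out, c=0.5):
    """Apply delta w_i = c * (d - out) * x_i to every weight."""
    return [wi + c * (d - out) * xi for wi, xi in zip(w, x)]

# e.g. update([2, 0, 0], (1, -1, -1), -1, 1) -> [1.0, 1.0, 1.0]
```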

25
A data set for perceptron classification

26
A two-dimensional plot of the data points

27
The good news

• The weight vector converges to (-1.3 -1.1 10.9) after 500 iterations.


• The equation of the line found is

-1.3 * x1 + -1.1 * x2 + 10.9 = 0


• I had different weight vectors in 5 - 7 iterations

28
The bad news: the exclusive-or problem

No straight line in two dimensions can separate the (0, 1) and (1, 0) data points from (0, 0) and (1, 1).
A single perceptron can only learn linearly
separable data sets.
29
The solution: multi-layered NNs
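Since the figure is not reproduced here, a minimal hand-built two-layer network in Python (the choice of hidden units, OR and NAND, is one possible construction, not necessarily the one in the original figure). It computes exclusive-or, which a single perceptron cannot:

```python
def step(y):
    # threshold unit as on the earlier slides: fire when the activation is positive
    return 1 if y > 0 else -1

def xor(x, y):
    h1 = step(x + y - 0.5)      # hidden unit 1: x OR y
    h2 = step(1.5 - x - y)      # hidden unit 2: x NAND y
    return step(h1 + h2 - 1)    # output unit: h1 AND h2

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor(x, y))  # fires (+1) only for (0, 1) and (1, 0)
```

The biases are offset by 0.5 so that no activation lands exactly on the threshold.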

30
The adjustment for wki depends on the total
contribution of node i to the error at the output
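For reference, since the figure is not reproduced here: in its standard general form (assuming a differentiable activation function f), the adjustment is Δwki = c · δi · xk, where δi = (di - Oi) f′(neti) for an output node, and δi = f′(neti) · Σj δj wij for a hidden node, the sum running over the nodes j that node i feeds.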

31
Comments on neural networks

• Parallelism in AI is not new.


- spreading activation, etc.
• Neural models for AI are not new.
- Indeed, they are as old as AI itself; some
subdisciplines, such as computer vision,
have continuously thought this way.
• Much neural network work makes biologically
implausible assumptions about how neurons
work
- backpropagation is biologically
implausible.
- “neurally inspired computing” rather
than “brain science.”
32
Comments on neural networks (cont’d)

• None of the neural network models distinguish humans from dogs from dolphins from flatworms.
- Whatever distinguishes “higher”
cognitive capacities (language,
reasoning) may not be apparent at this
level of analysis.
• Relation between NN and “symbolic AI”?
- Some claim NN models don’t have
symbols and representations.
- Others think of NNs as simply being an
“implementation-level” theory.
- NNs started out as a branch of
statistical pattern classification, and are
headed back that way.

33
Nevertheless

• NNs give us important insights into how to think about cognition
• NNs have been used in solving lots of
problems
 learning how to pronounce words from spelling (NETtalk,
Sejnowski and Rosenberg, 1987)
 Controlling kilns (Ciftci, 2001)

34
