Perceptron Example (Practice Que)
Perceptron Node – Threshold Logic Unit

[Figure: perceptron node with inputs x1 … xn, weights w1 … wn, threshold θ, and output z]

z = 1 if Σ(i=1..n) xi wi ≥ θ
z = 0 if Σ(i=1..n) xi wi < θ
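As a concrete illustration, here is a minimal Python sketch of the threshold logic unit above; the function and argument names are my own, not from the slides.

def perceptron_output(x, w, theta):
    """Threshold logic unit: output 1 if the weighted sum reaches theta, else 0."""
    net = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if net >= theta else 0

# Example: two inputs with weights .4 and -.2 and threshold .1 (the values used later in the slides)
print(perceptron_output([0.8, 0.3], [0.4, -0.2], 0.1))  # net = 0.26 >= 0.1 -> 1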
Perceptron Learning Algorithm

[Figure: perceptron with weights w1 = .4, w2 = -.2 and threshold θ = .1]

z = 1 if Σ(i=1..n) xi wi ≥ θ
z = 0 if Σ(i=1..n) xi wi < θ

Training set:  x1   x2   t
               .8   .3   1
               .4   .1   0
First Training Instance

Apply the first pattern (.8, .3) with target t = 1:
net = .8(.4) + .3(-.2) = .26 ≥ θ = .1, so z = 1
The output matches the target, so no weight change is needed.
Second Training Instance

Apply the second pattern (.4, .1) with target t = 0:
net = .4(.4) + .1(-.2) = .14 ≥ θ = .1, so z = 1
The output does not match the target, so the weights are updated: Δwi = (t - z) * c * xi
Perceptron Rule Learning

Δwi = c(t – z) xi

where wi is the weight from input i to the perceptron node, c is the learning rate, t is the target for the current instance, z is the current output, and xi is the ith input.

Least perturbation principle
◦ Only change weights if there is an error
◦ Use a small c rather than changing the weights enough to make the current pattern correct
◦ Scale the change by xi

A minimal code sketch of this rule follows.
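The update can be written directly from the formula. Below is a small Python sketch (my own naming, not from the slides) that applies the perceptron rule to one training pattern; c = 1 is assumed for illustration, since the slides do not give a learning rate for this example.

def perceptron_rule_update(w, x, t, z, c):
    """Perceptron rule: w_i <- w_i + c * (t - z) * x_i.
    No change when the output z already equals the target t."""
    return [wi + c * (t - z) * xi for wi, xi in zip(w, x)]

# Second training instance from the previous slides:
# x = (.4, .1), t = 0, z = 1, so each weight moves by -x_i.
print(perceptron_rule_update([0.4, -0.2], [0.4, 0.1], t=0, z=1, c=1))  # [0.0, -0.3]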
Augmented Pattern Vectors

1 0 1 -> 0
1 0 0 -> 1

Augmented version:
1 0 1 1 -> 0
1 0 0 1 -> 1

Treat the threshold like any other weight. No special case. Call it a bias since it biases the output up or down.
Since we start with random weights anyway, we can ignore the -θ notation and just think of the bias as an extra available weight. (Note: the author uses a -1 input.)
Always use a bias weight.
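A quick sketch of the augmentation, assuming the bias input is appended as a constant 1 (the slides note that a -1 input is also used):

def augment(pattern):
    """Append a constant bias input of 1 to a pattern vector."""
    return list(pattern) + [1]

print(augment([1, 0, 1]))  # [1, 0, 1, 1]
print(augment([1, 0, 0]))  # [1, 0, 0, 1]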
Perceptron Rule Example

Assume a 3-input perceptron plus bias (it outputs 1 if net > 0, else 0).
Assume a learning rate c of 1 and initial weights all 0: Δwi = c(t – z) xi

Training set:
0 0 1 -> 0
1 1 1 -> 1
1 0 1 -> 1
0 1 1 -> 0
Example

Assume a 3-input perceptron plus bias (it outputs 1 if net > 0, else 0), learning rate c = 1, and initial weights all 0.

Pattern (x1 x2 x3 bias)   Target   Weights before   Net   z   ΔW
0 0 1 1                   0        0  0  0  0       0     0   0  0  0  0
1 1 1 1                   1        0  0  0  0       0     0   1  1  1  1
1 0 1 1                   1        1  1  1  1       3     1   0  0  0  0
0 1 1 1                   0        1  1  1  1       3     1   0 -1 -1 -1
0 0 1 1                   0        1  0  0  0       0     0   0  0  0  0
1 1 1 1                   1        1  0  0  0       1     1   0  0  0  0
1 0 1 1                   1        1  0  0  0       1     1   0  0  0  0
0 1 1 1                   0        1  0  0  0       0     0   0  0  0  0

After the fourth pattern the weights settle at (1, 0, 0, 0); the second epoch makes no further changes, so training has converged.
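Below is a short Python sketch (illustrative naming, not from the slides) that runs the perceptron rule over this training set and prints a per-pattern trace matching the table above (with the weights shown after each update):

def train_perceptron(patterns, targets, c=1, epochs=2):
    """Perceptron rule with a bias input of 1; outputs 1 if net > 0, else 0."""
    w = [0] * (len(patterns[0]) + 1)          # weights start at 0; the last one is the bias weight
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            x = list(x) + [1]                 # augment with the bias input
            net = sum(xi * wi for xi, wi in zip(x, w))
            z = 1 if net > 0 else 0
            dw = [c * (t - z) * xi for xi in x]
            w = [wi + dwi for wi, dwi in zip(w, dw)]
            print(x, t, net, z, dw, w)
    return w

patterns = [(0, 0, 1), (1, 1, 1), (1, 0, 1), (0, 1, 1)]
targets = [0, 1, 1, 0]
print(train_perceptron(patterns, targets))    # final weights [1, 0, 0, 0]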
Linear Separability

Linear Separability and Generalization

Limited Functionality of Hyperplane
How to Handle Multi-Class Output

This is an issue with any learning model which only supports binary classification (perceptron, SVM, etc.).

Create one perceptron for each output class, where the training set considers all other classes to be negative examples (one-vs-rest; see the sketch below)
◦ Run all perceptrons on novel data and set the output to the class of the perceptron which outputs high
◦ If there is a tie, choose the perceptron with the highest net value

Create one perceptron for each pair of output classes, where the training set only contains examples from the 2 classes (one-vs-one)
◦ Run all perceptrons on novel data and set the output to be the class with the most wins (votes) from the perceptrons
◦ In case of a tie, use the net values to decide
◦ The number of models grows with the square of the number of output classes
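As an illustration of the one-vs-rest scheme, here is a minimal Python sketch (names, weight values, and the fallback when no perceptron fires are my own assumptions): each class has its own weight vector, the predicted class is one whose perceptron outputs high, and ties are broken by the largest net value as described above.

def predict_one_vs_rest(x, class_weights, theta=0.0):
    """class_weights maps class label -> weight vector (same length as x)."""
    nets = {label: sum(xi * wi for xi, wi in zip(x, w))
            for label, w in class_weights.items()}
    firing = [label for label, net in nets.items() if net >= theta]
    candidates = firing if firing else list(nets)           # if no perceptron fires, fall back to all classes
    return max(candidates, key=lambda label: nets[label])   # break ties by highest net value

# Hypothetical 2-feature example with three classes:
weights = {"A": [0.5, -0.1], "B": [-0.2, 0.7], "C": [0.1, 0.1]}
print(predict_one_vs_rest([1.0, 1.0], weights))  # nets: A=0.4, B=0.5, C=0.2 -> "B"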
UC Irvine Machine Learning Database

Iris Data Set
4.8,3.0,1.4,0.3, Iris-setosa
5.1,3.8,1.6,0.2, Iris-setosa
4.6,3.2,1.4,0.2, Iris-setosa
5.3,3.7,1.5,0.2, Iris-setosa
5.0,3.3,1.4,0.2, Iris-setosa
7.0,3.2,4.7,1.4, Iris-versicolor
6.4,3.2,4.5,1.5, Iris-versicolor
6.9,3.1,4.9,1.5, Iris-versicolor
5.5,2.3,4.0,1.3, Iris-versicolor
6.5,2.8,4.6,1.5, Iris-versicolor
6.0,2.2,5.0,1.5, Iris-virginica
6.9,3.2,5.7,2.3, Iris-virginica
5.6,2.8,4.9,2.0, Iris-virginica
7.7,2.8,6.7,2.0, Iris-virginica
6.3,2.7,4.9,1.8, Iris-virginica
Objective Functions: Accuracy/Error

How do we judge the quality of a particular model (e.g. a perceptron with a particular setting of weights)?
Consider how accurate the model is on the data set
◦ Classification accuracy = # correct / total instances
◦ Classification error = # misclassified / total instances (= 1 – accuracy)
For nominal data, pattern error is typically 1 for a mismatch and 0 for a match
◦ For nominal (including binary) outputs and targets, SSE and classification error are equivalent
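A small sketch of these two measures (illustrative code and values, not from the slides):

def classification_accuracy(targets, outputs):
    """Fraction of instances where the output matches the target."""
    correct = sum(1 for t, z in zip(targets, outputs) if t == z)
    return correct / len(targets)

targets, outputs = [0, 1, 1, 0], [0, 1, 0, 0]
acc = classification_accuracy(targets, outputs)
print(acc, 1 - acc)  # 0.75 0.25  (accuracy, classification error)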
Mean Squared Error

Mean Squared Error (MSE) – SSE/n, where n is the number of instances in the data set
◦ This can be nice because it normalizes the error for data sets of different sizes
◦ MSE is the average squared error per pattern

Root Mean Squared Error (RMSE) – the square root of the MSE
◦ This puts the error value back into the same units as the targets and can thus be more intuitive
◦ RMSE is a typical distance (error) of the outputs from the targets, in the same scale as the targets
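A quick numeric sketch of SSE, MSE, and RMSE (illustrative values, not from the slides):

import math

targets = [1.0, 0.0, 1.0, 0.0]
outputs = [0.8, 0.2, 0.6, 0.1]

sse = sum((t - z) ** 2 for t, z in zip(targets, outputs))  # sum squared error
mse = sse / len(targets)                                   # average squared error per pattern
rmse = math.sqrt(mse)                                      # back in the same units as the targets
print(sse, mse, rmse)  # 0.25 0.0625 0.25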
Gradient Descent Learning: Minimize (Maximize) the Objective Function

[Figure: error landscape – SSE (sum squared error), Σ(ti – zi)², plotted against the weight values]
Deriving a Gradient Descent Learning Algorithm

The goal is to decrease the overall error (or other objective function) each time a weight is changed.
Total sum squared error is one possible objective function: E = Σ(ti – zi)²
Seek a weight-changing algorithm such that ∂E/∂wij is negative.
If such a formula can be found, then we have a gradient descent learning algorithm.
The delta rule is a variant of the perceptron rule which gives a gradient descent learning algorithm.
Delta Rule Algorithm

The delta rule uses (target – net), i.e. the net value before it goes through the threshold, in the learning rule that decides the weight update.
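Under that description, the per-pattern update looks like the perceptron rule but with net in place of the thresholded output z. A minimal Python sketch (my own naming; a plain, non-batch version):

def delta_rule_update(w, x, t, c):
    """Delta rule: w_i <- w_i + c * (t - net) * x_i, where net is the unthresholded sum."""
    net = sum(xi * wi for xi, wi in zip(x, w))
    return [wi + c * (t - net) * xi for wi, xi in zip(w, x)]

# Same second training instance as earlier (x = (.4, .1), t = 0, c = 1), bias left out for brevity:
print(delta_rule_update([0.4, -0.2], [0.4, 0.1], t=0, c=1))
# net = 0.14, so each weight moves by -0.14 * x_i -> [0.344, -0.214]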
Perceptron Rule vs Delta Rule

The perceptron rule (target – thresholded output) is guaranteed to converge to a separating hyperplane if the problem is linearly separable. Otherwise it may not converge – it could get stuck in a cycle.

The single-layer delta rule is guaranteed to have only one global minimum. Thus it will converge to the best SSE solution whether the problem is linearly separable or not.
◦ It could have a higher misclassification rate than the perceptron rule and a less intuitive decision surface – we will discuss this with regression

Stopping criteria – for these models, stop when no longer making progress
◦ When you have gone a few epochs with no significant improvement/change between epochs (including oscillations); a sketch of such a check follows
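One way to code that stopping check (a sketch under my own assumptions: "progress" is measured as a drop in per-epoch error, and the patience and tolerance values are arbitrary):

def should_stop(epoch_errors, patience=5, tolerance=1e-4):
    """Stop when the last `patience` epochs show no significant improvement over the best earlier error."""
    if len(epoch_errors) <= patience:
        return False
    best_before = min(epoch_errors[:-patience])
    recent_best = min(epoch_errors[-patience:])
    return recent_best > best_before - tolerance

print(should_stop([0.9, 0.5, 0.30, 0.31, 0.30, 0.31, 0.30, 0.31]))  # True: no meaningful improvement lately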