BIM488 Introduction to Pattern Recognition
Classification Algorithms - Part II
Outline
• Introduction
• Linear Discriminant Functions
• The Perceptron Algorithm
• Performance Assessment
Introduction
• Previously, our major concern was to design classifiers
based on probability density functions.
• Now, we will focus on the design of linear classifiers,
regardless of the underlying distributions describing the
training data.
• The major advantage of linear classifiers is their simplicity
and computational attractiveness.
• Here, our assumption is that all feature vectors from the
available classes can be classified correctly using a linear
classifier, and we will develop techniques for the
computation of the corresponding linear functions.
Introduction
Figure: The solid and empty dots can be correctly classified by any
number of linear classifiers. H1 (blue) classifies them correctly, as
does H2 (red). H2 could be considered "better" in the sense that
it is also farthest from both groups. H3 (green) fails to correctly
classify the dots.
Linear Discriminant Functions
• A classifier that uses discriminant functions assigns a
feature vector x to class ωi if
gi(x) > gj(x) for all j≠i
where gi(x), i = 1, . . . , c, are the discriminant functions for c
classes.
• A discriminant function that is a linear combination of the
components of x is called a linear discriminant function and
can be written as
g(x) = wTx + w0 = w1x1 + w2x2 + ... + wdxd + w0
where w is the weight vector and w0 is the bias (or
threshold weight).
Linear Discriminant Functions
• For the two-category case, the decision rule can be written
as
Decide : ω1 if g(x) > 0
ω2 otherwise
• The equation g(x) = 0 defines the decision boundary that
separates points assigned to ω1 from points assigned to ω2.
• When g(x) is linear, the decision surface is a hyperplane
whose orientation is determined by the normal vector w and
whose location is determined by the bias w0.
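The two-category rule above can be sketched in a few lines; the weight values here are arbitrary, chosen only for illustration:

```python
import numpy as np

# Hypothetical weights for a 2-D linear discriminant g(x) = w^T x + w0.
w = np.array([2.0, -1.0])   # weight vector (normal to the hyperplane)
w0 = -1.0                   # bias (threshold weight)

def g(x):
    """Linear discriminant function g(x) = w^T x + w0."""
    return w @ x + w0

def classify(x):
    """Decide omega_1 if g(x) > 0, omega_2 otherwise."""
    return "omega_1" if g(x) > 0 else "omega_2"

print(classify(np.array([2.0, 1.0])))   # g = 4 - 1 - 1 = 2 > 0  -> omega_1
print(classify(np.array([0.0, 2.0])))   # g = 0 - 2 - 1 = -3     -> omega_2
```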
Linear Discriminant Functions
Multicategory Case:
• There is more than one way to devise multicategory
classifiers with linear discriminant functions.
• One against all: we can pose the problem as c two-class
problems, where the i-th problem is solved by a linear
discriminant that separates points assigned to ωi from those
not assigned to ωi.
• One against one: Alternatively, we can use c(c-1)/2 linear
discriminants, one for every pair of classes.
• Also, we can use c linear discriminants, one for each class,
and assign x to ωi if gi(x) > gj(x) for all j≠i.
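The third option (one linear discriminant per class, assign x to the class with the largest value) can be sketched as follows; the weight values are arbitrary illustrations, not taken from the lecture:

```python
import numpy as np

# Hypothetical weights for c = 3 classes in 2-D:
# row i holds the weight vector w_i, and w0[i] holds the bias w_i0.
W = np.array([[ 1.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])
w0 = np.array([0.0, 0.5, -0.5])

def linear_machine(x):
    """Assign x to omega_i with the largest g_i(x) = w_i^T x + w_i0."""
    return int(np.argmax(W @ x + w0))

print(linear_machine(np.array([2.0, 0.0])))   # discriminants [2.0, -1.5, -0.5] -> class 0
```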
Linear Discriminant Functions
Figure: Linear decision boundaries for a 4-class problem devised as
(a) four two-class problems, (b) six pairwise problems. The pink regions
have ambiguous category assignments.
Linear Discriminant Functions
• To avoid the problem of ambiguous regions:
– Define c linear discriminant functions
– Assign x to ωi if gi(x) > gj(x) for all j≠i.
• The resulting classifier is called a linear machine.
Linear Discriminant Functions
Figure: Linear decision boundaries produced by using one linear
discriminant for each class.
Linear Discriminant Functions
• The boundary between two regions Ri and Rj is a portion of
the hyperplane given by:
gi(x) = gj(x), or
(wi − wj)Tx + (wi0 − wj0) = 0
• The decision regions for a linear machine are convex.
The Perceptron Algorithm
• The perceptron algorithm is appropriate for the 2-class
problem and for classes that are linearly separable.
• The perceptron algorithm computes the values of the
weights w of a linear classifier, which separates the two
classes.
• The algorithm is iterative. It starts with an initial estimate in
the extended (d +1)-dimensional space and converges to a
solution in a finite number of iteration steps.
• The solution w correctly classifies all the training points
assuming linearly separable classes.
• Note that the perceptron algorithm converges to one of
infinitely many possible solutions.
• Starting from different initial conditions, different
hyperplanes result.
The Perceptron Algorithm
• The update at the t-th iteration step has the simple form
w(t + 1) = w(t) − ρt Σx∈Y δx x
where
• Y is the set of samples wrongly classified by the current estimate w(t),
• δx is −1 if x ∈ ω1, and +1 if x ∈ ω2,
• ρt is a user-defined parameter that controls the convergence speed and
must obey certain requirements to guarantee convergence (for
example, ρt can be chosen to be constant, ρt = ρ).
• The algorithm converges when Y becomes empty.
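A minimal sketch of this batch update, assuming a constant ρt = 1 and a made-up toy data set; with the δx convention above, a sample is in Y exactly when δx · wTx ≥ 0:

```python
import numpy as np

def perceptron_train(X, deltas, rho=1.0, max_epochs=1000):
    """Batch perceptron in the extended (d+1)-dimensional space.

    deltas: -1 for samples of omega_1, +1 for samples of omega_2
            (the delta_x convention of the update rule).
    """
    Xe = np.hstack([X, np.ones((len(X), 1))])  # append 1 to absorb the bias w0
    w = np.zeros(Xe.shape[1])                  # initial estimate w(0)
    for _ in range(max_epochs):
        # Y: samples wrongly classified by the current estimate w(t)
        wrong = deltas * (Xe @ w) >= 0
        if not wrong.any():                    # Y is empty: converged
            break
        # w(t+1) = w(t) - rho_t * sum over x in Y of delta_x * x
        w = w - rho * (deltas[wrong, None] * Xe[wrong]).sum(axis=0)
    return w

# Toy linearly separable data: two points per class
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 2.0]])
deltas = np.array([-1.0, -1.0, 1.0, 1.0])
w = perceptron_train(X, deltas)
# After convergence, w^T x > 0 for omega_1 samples and < 0 for omega_2 samples.
```

Starting from a different initial w (or a different ρ) would yield a different separating hyperplane, as noted above.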
The Perceptron Algorithm
• Move the hyperplane so that training samples are on its
positive side.
The Perceptron Algorithm
• Once the classifier has been computed, a point, x, is
classified to either of the two classes depending on the
outcome of the following operation:
f(wTx + w0) = f(w1x1 + w2x2 + ··· + wdxd + w0)
• The function f(·) in its simplest form is the step or sign
function (f(z) = 1 if z > 0; f(z) = −1 if z < 0).
• However, it may have other forms; for example, the output
may be either 1 or 0 for z > 0 and z < 0, respectively.
• In general, it is known as the activation function.
The Perceptron Algorithm
• The basic network model, known as perceptron or neuron,
that implements the classification operation is shown
below:
The Perceptron Algorithm
Some important points related to perceptron:
• For a fixed learning parameter, the number of iterations (in
general) increases as the classes move closer to each
other (i.e., as the problem becomes more difficult).
• The algorithm fails to converge for a data set that is not
linearly separable. Then, what should we do?
• Different initial estimates for w may lead to different final
estimates for it (although all of them are optimal in the
sense that they separate the training data of the two
classes).
Performance Assessment
• We can use accuracy or error rate to assess performance
of classifiers.
• Accuracy is the fraction of samples classified correctly.
• Error rate is the fraction of samples classified incorrectly.
• Accuracy = 1 - Error rate.
• Example:
10 images belonging to the same class
Number of correctly classified images = 8
Number of incorrectly classified images = 2
Accuracy = 8 / 10 = 0.8 = 80%
Error Rate = 2 / 10 = 0.2 = 20%
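The same computation in code; the label names are made up to mirror the example above:

```python
def accuracy(true_labels, predicted_labels):
    """Ratio of correct classifications."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# 10 images of the same class: 8 classified correctly, 2 incorrectly
true_labels = ["car"] * 10                      # "car"/"bus" are illustrative labels
predicted_labels = ["car"] * 8 + ["bus"] * 2
acc = accuracy(true_labels, predicted_labels)
error_rate = 1 - acc
print(acc)  # 0.8
```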
Performance Assessment
• Performance is evaluated on a testing set.
• Therefore, the entire dataset should be divided into
– training set
– testing set
• The classification model is obtained using the training set.
• Classification performance is then assessed using the
testing set.
Performance Assessment
• For objective evaluation, the k-fold cross-validation technique
is used: every sample is used for testing exactly once, so the
result does not depend on a single train/test split.
• Example: k = 3
Run 1: Fold 1 = Training, Fold 2 = Training, Fold 3 = Testing → Accuracy1
Run 2: Fold 1 = Training, Fold 2 = Testing, Fold 3 = Training → Accuracy2
Run 3: Fold 1 = Testing, Fold 2 = Training, Fold 3 = Training → Accuracy3
Overall accuracy = (Accuracy1 + Accuracy2 + Accuracy3) / 3
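A sketch of the k-fold procedure; `train_and_score` is a placeholder for any routine that fits a classifier on the training split and returns its accuracy on the testing split:

```python
import numpy as np

def cross_validate(X, y, k, train_and_score):
    """k-fold cross validation: each fold serves as the testing set once."""
    folds = np.array_split(np.arange(len(X)), k)   # k roughly equal folds
    accuracies = []
    for i, test_idx in enumerate(folds):
        # All folds except fold i form the training set
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        accuracies.append(train_and_score(X[train_idx], y[train_idx],
                                          X[test_idx], y[test_idx]))
    return sum(accuracies) / k                     # overall accuracy
```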
Performance Assessment
• We can also use a confusion matrix during assessment: its
entry (i, j) counts the samples of true class i that are
predicted as class j.
• The example below shows predicted and true class labels
for a 10-class recognition problem.
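A confusion matrix can be built directly from the two label lists; the small 3-class labels below are illustrative, not the 10-class example from the slide:

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(true_labels, predicted_labels, 3)
# Correct classifications lie on the diagonal, so
# overall accuracy = trace / total number of samples.
print(cm.trace() / cm.sum())  # 4 correct out of 6
```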
Summary
• Introduction
• Linear Discriminant Functions
• The Perceptron Algorithm
• Performance Assessment
References
• S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras, Introduction
to Pattern Recognition: A MATLAB Approach, Academic Press, 2010.
• S. Theodoridis and K. Koutroumbas, Pattern Recognition (4th Edition),
Academic Press, 2009.
• R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification (2nd Edition),
Wiley, 2001.