07 Pattern Recognition
Perceptron (“single-layer perceptron”)
• a simple neural network for linearly separable patterns only
• error-correcting learning (a minimal sketch follows below)
• output: y(n) = sgn(w^T(n) x(n))
• separating hyperplane: Σ_{i=1}^{p} w_i x_i − θ = 0
[Figure: a separating hyperplane (decision boundary) between class C1 and class C2 in the feature space]
• http://playground.tensorflow.org
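To make the error-correcting rule concrete, here is a minimal Python sketch (my illustration, not from the slides); numpy, labels t(n) ∈ {−1, +1}, and all function and variable names are assumptions of the sketch.

# Minimal sketch of perceptron error-correcting learning (illustrative only).
# Assumes numpy; X is an (N, p) array of patterns with a bias term already
# appended, t is an (N,) array of target labels in {-1, +1}.
import numpy as np

def train_perceptron(X, t, lr=1.0, epochs=100):
    w = np.zeros(X.shape[1])                      # weight vector w(0) = 0
    for _ in range(epochs):
        errors = 0
        for x_n, t_n in zip(X, t):
            y_n = 1.0 if w @ x_n >= 0.0 else -1.0 # y(n) = sgn(w^T(n) x(n))
            if y_n != t_n:                        # update only on misclassification
                w += lr * t_n * x_n               # error-correcting step
                errors += 1
        if errors == 0:                           # stops once every pattern is classified correctly,
            break                                 # which happens for linearly separable data
    return w

The returned w defines the separating hyperplane w^T x = 0.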
Applications of pattern recognition:
• Machine vision
• Character recognition (OCR)
• Computer-aided diagnosis
• Speech recognition
• Face recognition
• Biometrics
• Image database retrieval
• Data mining
• Bioinformatics
The task:
patterns → sensor → feature generation → feature selection → classifier design → system evaluation
Watanabe defines a pattern as the opposite of a chaos; it is an entity, vaguely defined, that could be given a name.
• For example, a pattern could be a fingerprint image, a handwritten cursive word, a human face, or a speech signal.
Design of a pattern recognition system:
1) data acquisition and preprocessing,
2) data representation,
3) decision making.
Learning from a set of examples (training set) is an important and desired attribute of most pattern recognition systems.
Disadvantages
• problems arise if the patterns are distorted due to the imaging process, viewpoint changes, or large intraclass variations among the patterns.
The goal is to choose those features that allow pattern vectors belonging to different
categories to occupy compact and disjoint regions in a p-dimensional feature space.
Given a set of training patterns from each class, the objective is to establish decision
boundaries in the feature space which separate patterns belonging to different
classes.
• In the statistical decision theoretic approach, the decision boundaries are determined by the probability distributions of the patterns belonging to each class, which must either be specified or learned (see the sketch below).
• One can also take a discriminant analysis-based approach to classification: First a parametric form of the
decision boundary (e.g., linear or quadratic) is specified; then the best decision boundary of the specified
form is found based on the classification of training patterns.
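As an illustration of the statistical approach and of a quadratic decision boundary, here is a minimal sketch assuming Gaussian class-conditional densities estimated from training data; numpy and all names are assumptions of the sketch, not part of the slides.

# Minimal sketch of a quadratic (Gaussian) discriminant, illustrative only.
import numpy as np

def fit_gaussian(X_class):
    """Estimate mean and covariance of one class from its training patterns."""
    return X_class.mean(axis=0), np.cov(X_class, rowvar=False)

def log_discriminant(x, mu, Sigma, prior):
    """g_i(x) = ln p(x | class i) + ln P(class i), a quadratic function of x."""
    d = x - mu
    return (-0.5 * d @ np.linalg.inv(Sigma) @ d
            - 0.5 * np.log(np.linalg.det(Sigma))
            + np.log(prior))

def classify(x, class_models, priors):
    """Pick the class whose discriminant value is largest."""
    scores = [log_discriminant(x, mu, S, p)
              for (mu, S), p in zip(class_models, priors)]
    return int(np.argmax(scores))

With equal class covariance matrices the quadratic terms cancel and the boundary becomes linear, matching the parametric forms mentioned above.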
• hierarchical perspective where a pattern is viewed as being composed of simple subpatterns which
are themselves built from yet simpler subpatterns
• The simplest/elementary subpatterns to be recognized are called primitives and the given complex
pattern is represented in terms of the interrelationships between these primitives.
• In syntactic pattern recognition, a formal analogy is drawn between the structure of patterns and
the syntax of a language.
– The patterns are viewed as sentences belonging to a language, primitives are viewed as the
alphabet of the language, and the sentences are generated according to a grammar. Thus, a
large collection of complex patterns can be described by a small number of primitives and
grammatical rules. The grammar for each pattern class must be inferred from the available
training samples.
– Suitable if the patterns have a definite structure that can be captured by a set of rules (a toy sketch follows after this list),
• such as EKG waveforms, textured images, and shape analysis of contours
– Difficulties arise primarily with the segmentation of noisy patterns (to detect the primitives) and with the inference of the grammar from the training data.
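The following toy sketch (my illustration, not from the slides) shows the idea of describing a pattern class by a grammar over primitives: contours are encoded as strings of hypothetical stroke primitives, and membership in a class is tested against a regular grammar written as a regex.

# Toy sketch of syntactic pattern recognition (illustrative only).
# Primitives: r = right, d = down, l = left, u = up stroke of a contour.
import re

# "Rectangle-like" class: one or more r, then d, then l, then u strokes.
# A regular grammar cannot enforce that opposite sides have equal length;
# attributed grammars or extra constraints are typically used for that.
RECTANGLE_LIKE = re.compile(r"r+d+l+u+")

def belongs_to_rectangle_class(primitive_string: str) -> bool:
    return RECTANGLE_LIKE.fullmatch(primitive_string) is not None

print(belongs_to_rectangle_class("rrrddlllluu"))  # True
print(belongs_to_rectangle_class("rdru"))         # False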
• preprocessing
• dimensionality reduction
– „the curse of dimensionality“ – the number of training points needed grows exponentially with the feature dimension
– „peaking phenomenon“ – for a fixed training set size, adding features can degrade the performance of the classifier (when the number of training samples is small relative to the number of features)
• feature extraction
• feature selection
• classification
Pattern recognition systems
• Two-stage system (a minimal sketch follows below):
  input x = (x1, ..., xp) → feature extraction → y = (y1, ..., ym) → classifier → decision
• The feature extraction stage transforms the data space (dim p) into the feature space (dim m):
  – m = p: decorrelation
  – m < p: dimensionality reduction
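A minimal sketch of the two-stage structure, assuming numpy; the linear feature extractor y = A^T x and the nearest-class-mean classifier are illustrative placeholder choices, not the only possible components.

# Minimal sketch of a two-stage system: feature extraction followed by a
# classifier (illustrative placeholder components, assumes numpy).
import numpy as np

def extract_features(X, A):
    """Map the data space (dim p) to the feature space (dim m <= p): y = A^T x.
    X has one pattern per row, A is a (p, m) transformation matrix."""
    return X @ A

def nearest_mean_classifier(Y, class_means):
    """Assign each feature vector to the class with the closest mean.
    Y is (N, m), class_means is (C, m); returns (N,) class indices."""
    dists = np.linalg.norm(Y[:, None, :] - class_means[None, :, :], axis=2)
    return dists.argmin(axis=1)

# decisions = nearest_mean_classifier(extract_features(X, A), class_means)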
• Compute y = A^T x such that E[y(i) y(j)] = 0 for i ≠ j.
➢ Sketch of the proof:
  R_y = E[y y^T] = E[A^T x x^T A] = A^T R_x A
• If A is chosen so that its columns a i are the orthogonal
eigenvectors of Rx, then
R_y = A^T R_x A = Λ,
where Λ is diagonal with elements the respective eigenvalues λ_i.
• Observe that this is a sufficient condition but not
necessary. It imposes a specific orthogonal structure on
A.
• Approximate x by its projection onto the first m eigenvectors, x̂ = Σ_{i=0}^{m−1} y(i) a_i.
− The error is:
  E[‖x − x̂‖^2] = E[‖Σ_{i=m}^{p−1} y(i) a_i‖^2] = Σ_{i=m}^{p−1} λ_i
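A minimal numpy sketch (my illustration, not from the slides) tying these formulas together: the columns of A are the eigenvectors of R_x, y = A^T x is decorrelated, and the mean-square error of keeping only the first m components matches the sum of the discarded eigenvalues. The data matrix X is assumed zero-mean.

# Minimal sketch of the decorrelating (Karhunen-Loeve) transform and the
# resulting approximation error (illustrative only, assumes numpy).
import numpy as np

def klt(X, m):
    """X: zero-mean data matrix of shape (N, p); keep the first m components."""
    Rx = X.T @ X / X.shape[0]               # sample estimate of R_x
    eigvals, A = np.linalg.eigh(Rx)         # orthogonal eigenvectors as columns of A
    order = np.argsort(eigvals)[::-1]       # sort by decreasing eigenvalue
    eigvals, A = eigvals[order], A[:, order]
    Y = X @ A                               # y = A^T x for every pattern (R_y is diagonal)
    X_hat = Y[:, :m] @ A[:, :m].T           # reconstruction from the first m components
    mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
    return Y, mse, eigvals[m:].sum()        # mse matches the sum of discarded eigenvalues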
• Data: a training set of labeled vectors; for a new pattern x, find its k nearest neighbours and let k_i be the number of them belonging to class ω_i.
➢ Assign x → ω_i if k_i ≥ k_j for all j ≠ i (a minimal sketch of this rule follows below).
➢ P_B ≤ P_kNN ≤ P_B + sqrt(2 P_NN / k)
➢ For k → ∞, P_kNN → P_B
➢ P_NN ≤ 2 P_B
➢ P_3NN ≤ P_B + 3 (P_B)^2
(P_B denotes the Bayes error, P_kNN the asymptotic error of the k-NN rule.)
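A minimal sketch of the k-NN rule itself, assuming numpy and integer class labels 0..C−1; the arrays and names are illustrative.

# Minimal k-NN sketch: count the class memberships k_i among the k nearest
# training vectors and assign x to the class with the largest count.
import numpy as np

def knn_classify(x, X_train, labels, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances to all training vectors
    nearest = np.argsort(dists)[:k]               # indices of the k nearest neighbours
    counts = np.bincount(labels[nearest])         # k_i for every class i
    return counts.argmax()                        # class with the largest k_i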
R_i = {x : d(x, x_i) ≤ d(x, x_j), ∀ j ≠ i}
Voronoi tessellation
[Figures: Voronoi tessellation, 2D case and 3D case]
http://courses.cs.tamu.edu/rgutier/cs790_w02/l8.pdf
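The connection to the 1-NN rule can be sketched directly (illustrative 2-D example, assuming numpy; the prototype points are hypothetical): each point is assigned to the Voronoi region R_i of its nearest prototype.

# Minimal sketch: the 1-NN rule partitions the space into Voronoi regions
# around the prototypes (illustrative only, assumes numpy).
import numpy as np

prototypes = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 3.0]])  # hypothetical x_i

def voronoi_region(x):
    """Index i of R_i = {x : d(x, x_i) <= d(x, x_j) for all j != i}."""
    return int(np.linalg.norm(prototypes - x, axis=1).argmin())

print(voronoi_region(np.array([1.8, 0.9])))  # -> 1 (closest to the second prototype)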
Other nice solutions:
◼ ensemble learning
◼ SVM (support vector machine)
◼ neural networks