0% found this document useful (0 votes)
40 views11 pages

ML 06 Multiclass

This document discusses multiclass classification, which involves classifying instances into one of more than two classes. It describes two common techniques for converting binary classifiers into multiclass classifiers: one-vs-rest (OvR) and one-vs-one (OvO). OvR builds one binary classifier for each class, while OvO builds a binary classifier for each pair of classes. The document provides pseudocode for the OvR and OvO algorithms and explains how predictions are made using the classifiers from each approach.

Uploaded by

yiqing he
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
40 views11 pages

ML 06 Multiclass

This document discusses multiclass classification, which involves classifying instances into one of more than two classes. It describes two common techniques for converting binary classifiers into multiclass classifiers: one-vs-rest (OvR) and one-vs-one (OvO). OvR builds one binary classifier for each class, while OvO builds a binary classifier for each pair of classes. The document provides pseudocode for the OvR and OvO algorithms and explains how predictions are made using the classifiers from each approach.

Uploaded by

yiqing he
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 11

Multiclass Classification

The City College of New York


CSc 59929 – Introduction to Machine Learning
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Definitions

• Binary Classification is the problem of classifying


instances into one of two classes.

• Multiclass or Multinomial Classification is the problem of


classifying instances into one of more than two classes.

Multiclass classification should not be confused with multi-


label classification, where multiple labels are to be predicted
for each instance.
Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 2
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Binary Classifiers

While some classification algorithms naturally permit the use


of more than two classes, others are by nature binary
algorithms; these can, however, be turned into multinomial
classifiers by a variety of strategies.

Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 3
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Using binary classifiers
for multiclass classification

• One-vs.-Rest (OvR)
• One-vs.-All (OvA)
• One-against-All (OaA)

• One-vs.-One (OvO)

Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 4
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Comparison of OvR and OvO

The City College of New York


CSc 59929 – Introduction to Machine Learning 5
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Training OvR
fA fB fC fD
A A Not B Not C Not D

B Not A B Not C Not D

C Not A Not B C Not D

D Not A Not B Not C D

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 6
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Predicting with OvR

A
fA ( x)
B
fB ( x) Choose the prediction that
has the highest cofidence.
C
fC ( x )
D fD ( x)

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 7
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Training OvO f AD
f AB f AC
A A A A

B B C D

C f BC f BD fCD
B B C
D

C D D

The City College of New York


CSc 59929 – Introduction to Machine Learning 8
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Predicting with OvO

A
f AB ( x )
f AC ( x )
B f AD ( x ) Choose the result that is
predicted most often.
f BC ( x )
C
f BD ( x )
D fCD ( x )

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 9
Spring 2020 – Erik K. Grimmelmann, Ph.D.
One-vs.-Rest (OvR)
Inputs:
 L, a learner (a training algorithm for binary classifiers)
 X , a sample
 y, labels where yi ∈ {1,..., K }
Output:
 { f k } , a set of classifiers, for k ∈{1,..., K }
Procedure:
 For each k in {1,..., K }, construct a new label vector, z ,
where
= zi 1= if yi k and = zi 0 otherwise.
 Apply L to ( X , z ) to obtain f k
Making a decision consists of applying all classifiers to a sample and
predicting the label i for which the corresponding classifier reports
the highest confidence score, yˆ = argmax f i ( x) Ref.: Wikipedia
i∈{1…K }
The City College of New York
CSc 59929 – Introduction to Machine Learning 10
Spring 2020 – Erik K. Grimmelmann, Ph.D.
One-vs.-One (OvO)
Inputs:
 L, a learner (a training algorithm for binary classifiers)
 X , a sample
 y, labels where yi ∈ {1,..., K }
Output:
 { f } , a set of classifiers, for j,k ∈{1,..., K } and j < k.
jk

Procedure:
 For each j ,k in {1,..., K } with j < k , form subsets X jk of X
such that xi ∈ X jk if yi =j or yi =k .
 Apply L to each ( X jk ) to obtain f jk
Making a decision consists of applying all classifiers to a sample and
predicting the label i for which the most classifiers report a
result of i. Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 11
Spring 2020 – Erik K. Grimmelmann, Ph.D.

You might also like