Lecture 4.2 Supervised Learning Classification
● Classification
Definitions and Terminology
● Example: an object or instance of the data used for learning or evaluation, e.g., an individual's record in gender prediction.
● Labels:
–in classification, the category associated with an object, e.g., positive or
negative in binary classification.
–in regression, real-valued numbers.
Definitions and Terminology
● Training data: data used for training the algorithm.
● Validation data: data used for parameter tuning and model selection.
● Test data: data used exclusively for testing the algorithm.
Classification: Definition
•Given a collection of records (training set).
•Each record contains a set of attributes; one of the attributes is the class.
•Find a model for the class attribute as a function of the values
of the other attributes.
•Goal: previously unseen records should be assigned a
class as accurately as possible.
•A test set is used to determine the accuracy of the model. Usually, the given data
set is divided into training and test sets, with the training set used to build the model
and the test set used to validate it.
Classification
●Data
–We are given a training data set:

Training Set (excerpt):
Tid  Attrib1  Attrib2  Attrib3  Class
3    No       Small    70K      No
6    No       Medium   60K      No

A model is built on the training set and then applied to records whose class is unknown:

Test Set (excerpt):
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
15   No       Large    67K      ?
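The records above can be represented directly in code. A minimal sketch, using the table's values and a hypothetical hand-written rule standing in for a learned model (in the excerpt, every training record with Attrib1 == "No" has Class == "No"):

```python
# Training and test records mirroring the table above (Tid, Attrib1-3, Class).
train = [
    {"Tid": 3, "Attrib1": "No", "Attrib2": "Small",  "Attrib3": 70, "Class": "No"},
    {"Tid": 6, "Attrib1": "No", "Attrib2": "Medium", "Attrib3": 60, "Class": "No"},
]
test = [
    {"Tid": 11, "Attrib1": "No", "Attrib2": "Small", "Attrib3": 55},
    {"Tid": 15, "Attrib1": "No", "Attrib2": "Large", "Attrib3": 67},
]

def model(record):
    """Hypothetical learned rule, stated only for illustration:
    predict "No" whenever Attrib1 is "No", else "Yes"."""
    return "No" if record["Attrib1"] == "No" else "Yes"

# "Apply Model": fill in the '?' class of each test record.
predictions = {r["Tid"]: model(r) for r in test}
print(predictions)
```

A real classifier would induce such a rule from the training set rather than have it written by hand.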
Examples of Classification Task
•Predicting tumor cells as benign or malignant
Examples
●Give example feature vectors and potential labels:
–Problem:
–Data:
–Features:
–Labels:
Hypothesis (Model)
●Training set: a set of examples, each with a class label.
●Class label: e.g., weight-lifter (x) or ballet dancer (o) in the example below.
●Features: define features for all examples (e.g., presence or
absence of an attribute, or numeric values such as height and weight).

[Scatter plot: Weight on the x-axis, Height on the y-axis; x = weight-lifters, o = ballet dancers]
Classification example - Simple Model
Features: height, weight
[Scatter plot: Weight on the x-axis, Height on the y-axis; x = weight-lifters, o = ballet dancers, with a decision boundary separating the two classes]
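A simple model of this kind can be sketched in a few lines. The sketch below uses made-up height/weight values (not the lecture's data) and a nearest-centroid rule, which induces a linear decision boundary between the two classes:

```python
# Toy (weight kg, height cm) points for the two classes; values are made up.
weight_lifters = [(105.0, 175.0), (110.0, 180.0), (95.0, 170.0)]
ballet_dancers = [(50.0, 165.0), (55.0, 170.0), (48.0, 160.0)]

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

c_lift = centroid(weight_lifters)
c_ballet = centroid(ballet_dancers)

def classify(point):
    """Assign the class of the nearer centroid; the points equidistant
    from both centroids form the (linear) decision boundary."""
    d_lift = (point[0] - c_lift[0]) ** 2 + (point[1] - c_lift[1]) ** 2
    d_ballet = (point[0] - c_ballet[0]) ** 2 + (point[1] - c_ballet[1]) ** 2
    return "weight-lifter" if d_lift < d_ballet else "ballet dancer"

print(classify((100.0, 178.0)))  # near the weight-lifter cluster
print(classify((52.0, 163.0)))   # near the ballet-dancer cluster
```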
Supervised learning: methodology
Select a model, e.g., random forest, (deep) neural network, ...
Train the model, i.e., determine its parameters
• Data: input + output
• training data → determine model parameters
• validation data → yardstick to avoid overfitting
Test the model
• Data: input + output
• testing data → final scoring of the model
Production
• Data: input → predict output
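The three data roles above can be sketched as a plain split of a labeled data set. A minimal sketch, assuming a made-up data set of ten (input, output) pairs and a 60/20/20 split:

```python
import random

# Made-up labeled data: ten (input, output) pairs.
random.seed(0)
data = [(x, 2 * x) for x in range(10)]
random.shuffle(data)

n = len(data)
train = data[: int(0.6 * n)]                    # determine model parameters
validation = data[int(0.6 * n): int(0.8 * n)]   # yardstick to avoid overfitting
test = data[int(0.8 * n):]                      # final scoring, touched only once

print(len(train), len(validation), len(test))   # 6 2 2
```

The test split is held back until the very end so that the final score reflects performance on genuinely unseen data.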
Types of classification
Binary: two-class classification.
Multiclass: multiclass classification makes the assumption that
each sample is assigned to one and only one label: a fruit can be
either an apple or a pear, but not both at the same time.
Multilabel: multilabel classification assigns to each sample a set
of target labels.
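The difference shows up in how the targets are encoded. A small sketch with made-up labels: multiclass targets are one-hot (exactly one 1 per sample), multilabel targets are multi-hot (any number of 1s per sample):

```python
classes = ["apple", "pear", "banana"]

def one_hot(label, classes):
    """Multiclass target: exactly one 1 per row."""
    return [1 if c == label else 0 for c in classes]

def multi_hot(labels, classes):
    """Multilabel target: any number of 1s per row."""
    return [1 if c in labels else 0 for c in classes]

print(one_hot("pear", classes))                                # [0, 1, 0]
print(multi_hot({"fruit", "red"}, ["fruit", "red", "round"]))  # [1, 1, 0]
```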
Importance of Features
●Features:
–poor features, uncorrelated with the labels, make learning very
difficult for all algorithms.
–good features can be very effective; often knowledge of the
task can help in designing them.
Objective Functions
There are several ways to study the fundamentals of machine learning; one of them is to
view learning as a form of optimization. Optimization problems are primarily concerned
with determining the best, or "optimal," solution to some problem, usually mathematical
in nature. To find the best answer, some way of judging the quality of any candidate
solution is required. This is where the objective function comes in.
The term "objective function" refers to the concept of a goal. With data and model
parameters as inputs, this function can be evaluated to yield a number. Any given
problem has certain variables that may be altered; our objective is to find values for
these variables that maximise or minimise this number.
For example, in machine learning, you define a model, M. To train M, you usually define a
loss function L (e.g., a mean squared error), which you want to minimise. L is the
"objective function" of your problem (which in this case is to be minimised).
Parameter
The machine learns from the training data to approximate the target function, but the
form of that function is unknown.
Different algorithms make different assumptions, or biases, about the function's
structure, so our task as machine learning practitioners is to try various machine
learning algorithms to see which one is effective at modeling the underlying
function.
Thus machine learning models are parameterized so that their behavior can be
tuned for a given problem. These models can have many parameters, and finding
the best combination of parameters can be treated as a search problem.
What is a parameter in a machine learning model?
A model parameter is a configuration variable that is internal to the model and whose value
can be estimated from the given data.
• They are required by the model when making predictions.
• Their values define the skill of the model on your problem.
• They are estimated or learned from historical training data.
• They are often not set manually by the practitioner.
• They are often saved as part of the learned model.
The examples of model parameters include:
• The weights in an artificial neural network.
• The support vectors in a support vector machine.
• The coefficients in linear regression or logistic regression.
What is the parametric model?
A learning model that summarizes the data with a set of fixed-size parameters (independent of the
number of training instances). Parametric machine learning algorithms optimize
the function to a known form.
In a parametric model, you know exactly which model you are going to fit to the data, for
example, a linear regression line.
Fixing the functional form to a linear line simplifies the learning process greatly: all we
have to do is estimate the coefficients of the line equation, and we have a predictive model for the
problem. With the intercept and the coefficient, one can predict any value along the
regression line.
Some more examples of parametric machine learning algorithms include:
• Logistic Regression
• Linear Discriminant Analysis
• Perceptron
• Naive Bayes
• Simple Neural Networks
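Simple linear regression illustrates the fixed-size-parameter idea: however many training instances there are, the model is summarised by exactly two numbers. A sketch with made-up data (roughly y = 2x), using the closed-form ordinary-least-squares estimates:

```python
# Made-up data, approximately y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares in closed form: slope b1 and intercept b0.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

def predict(x):
    """The whole model is just these two parameters."""
    return b0 + b1 * x

print(round(b0, 2), round(b1, 2))  # intercept near 0, slope near 2
```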
What is a nonparametric model?
Nonparametric machine learning algorithms are those which do not make specific assumptions
about the form of the mapping function.
By not making such assumptions, they are free to learn any functional form from the
training data.
The word nonparametric does not mean that the model has no parameters, but
rather that the number of parameters is adjustable and can grow with the data.
An easy-to-understand nonparametric model is the k-nearest neighbors algorithm, which makes
predictions for a new data instance based on the k most similar training patterns. The only
assumption it makes about the data set is that the training patterns that are most similar
are most likely to have a similar result.
Some more examples of popular nonparametric machine learning algorithms are:
• k-Nearest Neighbors
• Decision Trees like CART and C4.5
• Support Vector Machines
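The k-nearest-neighbors idea fits in a few lines. A sketch on made-up one-dimensional data: predict the majority label among the k closest training patterns.

```python
from collections import Counter

# Made-up 1-D training patterns: (feature value, class label).
train = [(1.0, "o"), (1.5, "o"), (2.0, "o"),
         (5.0, "x"), (5.5, "x"), (6.0, "x")]

def knn_predict(x, k=3):
    """Majority vote among the k training patterns closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

print(knn_predict(1.2))  # 'o' — its nearest patterns are all 'o'
print(knn_predict(5.8))  # 'x' — its nearest patterns are all 'x'
```

Note that the "parameters" here are the stored training patterns themselves, so their number grows with the data set.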
Linear algorithms assume that the sample features x and the label output y are linearly
related, i.e., there is an affine function f(x)=⟨w,x⟩+b describing the underlying relationship.
e.g., linear regression, logistic regression, SVM
Nonlinear algorithms assume a nonlinear relationship between x and y; thus, f(x) can be
a function of arbitrary complexity.
e.g., k-NN, decision trees, neural networks
Machine Learning Methods (Classification)
•Naïve Bayes
•Decision Trees
•Instance Based Methods (CBR, k-NN)
•Support Vector Machines
•Artificial Neural Networks