Lecture 4.2 Supervised Learning Classification
● Classification
Definitions and Terminology
● Example: an object or instance of the data used for learning or evaluation, e.g., an individual's record in gender prediction.
● Labels:
–in classification, the category associated with an object, e.g., positive or
negative in binary classification.
–in regression, real-valued numbers.
Definitions and Terminology
● Training data: data used for training the algorithm.
● Validation data: data used for parameter tuning and model selection.
● Test data: data used exclusively for testing the algorithm.
Classification: Definition
•Given a collection of records (training set).
•Each record contains a set of attributes; one of the attributes is the class.
•Find a model for the class attribute as a function of the values
of the other attributes.
•Goal: previously unseen records should be assigned a
class as accurately as possible.
•A test set is used to determine the accuracy of the model. Usually, the given data
set is divided into training and test sets, with the training set used to build the model
and the test set used to validate it.
Classification
●Data
–We are given a training data set:

Training Set (excerpt):
Tid  Attrib1  Attrib2  Attrib3  Class
3    No       Small    70K      No
6    No       Medium   60K      No

A model is built on the training set and then applied to records whose class is unknown:

Test Set (excerpt):
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
15   No       Large    67K      ?
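The records above can be represented directly in code. A minimal sketch, using the table's values and a hypothetical hand-written rule standing in for a learned model (in the excerpt, every training record with Attrib1 == "No" has Class == "No"):

```python
# Training and test records mirroring the table above (Tid, Attrib1-3, Class).
train = [
    {"Tid": 3, "Attrib1": "No", "Attrib2": "Small",  "Attrib3": 70, "Class": "No"},
    {"Tid": 6, "Attrib1": "No", "Attrib2": "Medium", "Attrib3": 60, "Class": "No"},
]
test = [
    {"Tid": 11, "Attrib1": "No", "Attrib2": "Small", "Attrib3": 55},
    {"Tid": 15, "Attrib1": "No", "Attrib2": "Large", "Attrib3": 67},
]

def model(record):
    """Hypothetical learned rule, stated only for illustration:
    predict "No" whenever Attrib1 is "No", else "Yes"."""
    return "No" if record["Attrib1"] == "No" else "Yes"

# "Apply Model": fill in the '?' class of each test record.
predictions = {r["Tid"]: model(r) for r in test}
print(predictions)
```

A real classifier would induce such a rule from the training set rather than have it written by hand.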
Examples of Classification Task
•Predicting tumor cells as benign or malignant
Examples
●Give example feature vectors and potential labels:
–Problem:
–Data:
–Features:
–Labels:
Hypothesis (Model)
●Training set: a set of examples, each with a class label.
●Class label: e.g., weight-lifter (x) or ballet dancer (o) in the example below.
●Features: define features for all examples (e.g., presence or
absence of an attribute, or numeric values such as height and weight).

[Scatter plot: Weight on the x-axis, Height on the y-axis; x = weight-lifters, o = ballet dancers]
Classification example - Simple Model
Features: height, weight
[Scatter plot: Weight on the x-axis, Height on the y-axis; x = weight-lifters, o = ballet dancers, with a decision boundary separating the two classes]
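A simple model of this kind can be sketched in a few lines. The sketch below uses made-up height/weight values (not the lecture's data) and a nearest-centroid rule, which induces a linear decision boundary between the two classes:

```python
# Toy (weight kg, height cm) points for the two classes; values are made up.
weight_lifters = [(105.0, 175.0), (110.0, 180.0), (95.0, 170.0)]
ballet_dancers = [(50.0, 165.0), (55.0, 170.0), (48.0, 160.0)]

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

c_lift = centroid(weight_lifters)
c_ballet = centroid(ballet_dancers)

def classify(point):
    """Assign the class of the nearer centroid; the points equidistant
    from both centroids form the (linear) decision boundary."""
    d_lift = (point[0] - c_lift[0]) ** 2 + (point[1] - c_lift[1]) ** 2
    d_ballet = (point[0] - c_ballet[0]) ** 2 + (point[1] - c_ballet[1]) ** 2
    return "weight-lifter" if d_lift < d_ballet else "ballet dancer"

print(classify((100.0, 178.0)))  # near the weight-lifter cluster
print(classify((52.0, 163.0)))   # near the ballet-dancer cluster
```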
Supervised learning: methodology
Select a model, e.g., random forest, (deep) neural network, ...
Train the model, i.e., determine its parameters
• Data: input + output
• training data → determine model parameters
• validation data → yardstick to avoid overfitting
Test the model
• Data: input + output
• testing data → final scoring of the model
Production
• Data: input → predict output
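The three data roles above can be sketched as a plain split of a labeled data set. A minimal sketch, assuming a made-up data set of ten (input, output) pairs and a 60/20/20 split:

```python
import random

# Made-up labeled data: ten (input, output) pairs.
random.seed(0)
data = [(x, 2 * x) for x in range(10)]
random.shuffle(data)

n = len(data)
train = data[: int(0.6 * n)]                    # determine model parameters
validation = data[int(0.6 * n): int(0.8 * n)]   # yardstick to avoid overfitting
test = data[int(0.8 * n):]                      # final scoring, touched only once

print(len(train), len(validation), len(test))   # 6 2 2
```

The test split is held back until the very end so that the final score reflects performance on genuinely unseen data.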
Types of classification
Binary: two-class classification.
Multiclass: multiclass classification makes the assumption that
each sample is assigned to one and only one label: a fruit can be
either an apple or a pear, but not both at the same time.
Multilabel: multilabel classification assigns to each sample a set
of target labels.
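The difference shows up in how the targets are encoded. A small sketch with made-up labels: multiclass targets are one-hot (exactly one 1 per sample), multilabel targets are multi-hot (any number of 1s per sample):

```python
classes = ["apple", "pear", "banana"]

def one_hot(label, classes):
    """Multiclass target: exactly one 1 per row."""
    return [1 if c == label else 0 for c in classes]

def multi_hot(labels, classes):
    """Multilabel target: any number of 1s per row."""
    return [1 if c in labels else 0 for c in classes]

print(one_hot("pear", classes))                                # [0, 1, 0]
print(multi_hot({"fruit", "red"}, ["fruit", "red", "round"]))  # [1, 1, 0]
```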
Importance of Features
●Features:
–poor features, uncorrelated with the labels, make learning very
difficult for all algorithms.
–good features can be very effective; often knowledge of the
task can help in designing them.
Objective Functions
There are several ways to study the fundamentals of machine learning; one of them is to
view learning as a form of optimization. Optimization problems are primarily concerned
with determining the best, or "optimal," solution to some problem, usually mathematical
in nature. To find the best answer, some way of judging the quality of any candidate
solution is required. This is where the objective function comes in.
The term "objective function" refers to the concept of a goal. With data and model
parameters as inputs, this function can be evaluated to yield a number. Any given
problem has certain variables that may be altered; our objective is to find values for
these variables that maximise or minimise this number.
For example, in machine learning, you define a model, M. To train M, you usually define a
loss function L (e.g., a mean squared error), which you want to minimise. L is the
"objective function" of your problem (which in this case is to be minimised).
Parameter
The machine learns from the training data to approximate the target function, but the
form of that function is unknown.
Different algorithms make different assumptions, or biases, about the function's
structure, so our task as machine learning practitioners is to try various machine
learning algorithms to see which one is effective at modeling the underlying
function.
Thus machine learning models are parameterized so that their behavior can be
tuned for a given problem. These models can have many parameters, and finding
the best combination of parameters can be treated as a search problem.
What is a parameter in a machine learning model?
A model parameter is a configuration variable that is internal to the model and whose value
can be estimated from the given data.
• They are required by the model when making predictions.
• Their values define the skill of the model on your problem.
• They are estimated or learned from historical training data.
• They are often not set manually by the practitioner.
• They are often saved as part of the learned model.
The examples of model parameters include:
• The weights in an artificial neural network.
• The support vectors in a support vector machine.
• The coefficients in linear regression or logistic regression.
What is the parametric model?
A learning model that summarizes the data with a set of fixed-size parameters (independent of the
number of training instances). Parametric machine learning algorithms optimize
the function to a known form.
In a parametric model, you know exactly which model you are going to fit to the data, for
example, a linear regression line.
Fixing the functional form to a linear line simplifies the learning process greatly: all we
have to do is estimate the coefficients of the line equation, and we have a predictive model for the
problem. With the intercept and the coefficient, one can predict any value along the
regression line.
Some more examples of parametric machine learning algorithms include:
• Logistic Regression
• Linear Discriminant Analysis
• Perceptron
• Naive Bayes
• Simple Neural Networks
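Simple linear regression illustrates the fixed-size-parameter idea: however many training instances there are, the model is summarised by exactly two numbers. A sketch with made-up data (roughly y = 2x), using the closed-form ordinary-least-squares estimates:

```python
# Made-up data, approximately y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares in closed form: slope b1 and intercept b0.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

def predict(x):
    """The whole model is just these two parameters."""
    return b0 + b1 * x

print(round(b0, 2), round(b1, 2))  # intercept near 0, slope near 2
```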
What is a nonparametric model?
Nonparametric machine learning algorithms are those which do not make specific assumptions
about the form of the mapping function.
By not making such assumptions, they are free to learn any functional form from the
training data.
The word nonparametric does not mean that the model has no parameters, but
rather that the number of parameters is adjustable and can grow with the data.
An easy-to-understand nonparametric model is the k-nearest neighbors algorithm, which makes
predictions for a new data instance based on the k most similar training patterns. The only
assumption it makes about the data set is that the training patterns that are most similar
are most likely to have a similar result.
Some more examples of popular nonparametric machine learning algorithms are:
• k-Nearest Neighbors
• Decision Trees like CART and C4.5
• Support Vector Machines
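The k-nearest-neighbors idea fits in a few lines. A sketch on made-up one-dimensional data: predict the majority label among the k closest training patterns.

```python
from collections import Counter

# Made-up 1-D training patterns: (feature value, class label).
train = [(1.0, "o"), (1.5, "o"), (2.0, "o"),
         (5.0, "x"), (5.5, "x"), (6.0, "x")]

def knn_predict(x, k=3):
    """Majority vote among the k training patterns closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

print(knn_predict(1.2))  # 'o' — its nearest patterns are all 'o'
print(knn_predict(5.8))  # 'x' — its nearest patterns are all 'x'
```

Note that the "parameters" here are the stored training patterns themselves, so their number grows with the data set.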
Linear algorithms assume that the sample features x and the label output y are linearly
related, i.e., there is an affine function f(x)=⟨w,x⟩+b describing the underlying relationship.
e.g., linear regression, logistic regression, SVM
Nonlinear algorithms assume a nonlinear relationship between x and y; thus, f(x) can be
a function of arbitrary complexity.
e.g., k-NN, decision trees, neural networks
Machine Learning Methods (Classification)
•Naïve Bayes
•Decision Trees
•Instance Based Methods (CBR, k-NN)
•Support Vector Machines
•Artificial Neural Networks