Week 6
Support Vector Machines (SVMs) are a powerful class of supervised learning algorithms
used for classification and regression tasks. SVMs work by finding the optimal
hyperplane that separates data points into different classes or predicts continuous target
variables. There are several types of SVMs, each with its own characteristics and
variations. Here are the main types of support vector machines:
1. Linear SVM:
- Linear SVM is the simplest form of SVM and is used for linearly separable datasets,
where the classes can be separated by a straight line (hyperplane) in the feature space.
- In linear SVM, the goal is to find the hyperplane that maximizes the margin, which is
the distance between the hyperplane and the closest data points (support vectors) from
each class.
- The decision function of a linear SVM is represented as:
\[f(x) = w^T x + b\]
where \(w\) is the weight vector, \(x\) is the input feature vector, and \(b\) is the bias
term; a new instance is classified according to the sign of \(f(x)\).
- Linear SVM is computationally efficient and works well for high-dimensional
datasets with a large number of features.
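The decision function above can be sketched directly in code. This is a minimal illustration of how a trained linear SVM scores and classifies an instance; the weight vector and bias below are made-up values for a two-feature problem, not parameters learned from data.

```python
# Sketch of a linear SVM's decision rule: f(x) = w^T x + b,
# with the predicted class given by the sign of f(x).
# The parameters w and b here are illustrative, not learned.

def decision_function(w, x, b):
    """Compute f(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, x, b):
    """Predict the class label (+1 or -1) from the sign of f(x)."""
    return 1 if decision_function(w, x, b) >= 0 else -1

# Hypothetical learned parameters for a 2-feature problem.
w = [0.5, -1.0]
b = 0.25

print(decision_function(w, [2.0, 1.0], b))  # 0.5*2.0 - 1.0*1.0 + 0.25 = 0.25
print(predict(w, [2.0, 1.0], b))            # positive score -> class +1
```

In practice \(w\) and \(b\) come from solving the margin-maximization problem over the training data; only the support vectors determine the final values.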
2. Nonlinear SVM:
- Nonlinear SVM is used for datasets that are not linearly separable, meaning that the
classes cannot be separated by a straight line in the feature space.
- Nonlinear SVM uses kernel functions to map the input features into a
higher-dimensional space where the classes become separable by a hyperplane.
- Common kernel functions used in nonlinear SVM include:
- Polynomial kernel: \(K(x, x') = (x^T x' + c)^d\)
- Gaussian (Radial Basis Function) kernel: \(K(x, x') = e^{-\frac{\|x - x'\|^2}{2\sigma^2}}\)
- Sigmoid kernel: \(K(x, x') = \tanh(\alpha x^T x' + c)\)
- Nonlinear SVM allows for more flexible decision boundaries and can capture
complex relationships in the data.
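The three kernels listed above translate directly from their formulas. The hyperparameter values below (\(c\), \(d\), \(\sigma\), \(\alpha\)) are illustrative assumptions; in practice they are chosen by cross-validation.

```python
import math

# The three common kernels, written directly from their formulas.
# Hyperparameter defaults here are illustrative, not recommended values.

def polynomial_kernel(x, xp, c=1.0, d=2):
    """K(x, x') = (x^T x' + c)^d"""
    return (sum(a * b for a, b in zip(x, xp)) + c) ** d

def rbf_kernel(x, xp, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xp))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def sigmoid_kernel(x, xp, alpha=0.1, c=0.0):
    """K(x, x') = tanh(alpha * x^T x' + c)"""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, xp)) + c)

x, xp = [1.0, 2.0], [2.0, 0.0]
print(polynomial_kernel(x, xp))  # (2.0 + 1.0)^2 = 9.0
print(rbf_kernel(x, x))          # identical points -> exp(0) = 1.0
```

Note that each kernel returns a similarity score between two points without ever constructing the higher-dimensional mapping explicitly; this is the "kernel trick" that keeps nonlinear SVMs tractable.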
3. Multiclass SVM:
- Multiclass SVM is an extension of binary SVM to handle datasets with more than two
classes. It can classify instances into multiple classes by training multiple binary
classifiers, typically using the one-vs-rest (OvR) or one-vs-one (OvO) strategy.
- In the OvR strategy, a separate binary classifier is trained for each class, which learns
to distinguish that class from all other classes. The final prediction is made based on the
classifier with the highest confidence score.
- In the OvO strategy, a binary classifier is trained for each pair of classes, which learns
to distinguish between instances of those two classes. The final prediction is made based
on a majority voting scheme.
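The OvO voting scheme can be sketched as follows. The pairwise "classifiers" here are hypothetical stand-ins (simple thresholds on a single feature) so the voting logic is visible; a real OvO setup would use one trained binary SVM per class pair.

```python
from collections import Counter

# Sketch of one-vs-one (OvO) prediction by majority vote.
# Each pairwise classifier, given an instance, returns the winning
# class of its pair; the overall prediction is the most-voted class.

def ovo_predict(x, pairwise_classifiers):
    """pairwise_classifiers maps (class_a, class_b) -> callable(x) -> class."""
    votes = Counter(clf(x) for clf in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]

# Three classes -> 3 pairwise classifiers. These toy rules threshold
# the single feature x[0]; they are illustrative, not trained SVMs.
classifiers = {
    ("A", "B"): lambda x: "A" if x[0] < 1.0 else "B",
    ("A", "C"): lambda x: "A" if x[0] < 2.0 else "C",
    ("B", "C"): lambda x: "B" if x[0] < 2.0 else "C",
}

print(ovo_predict([0.5], classifiers))  # votes A, A, B -> "A"
print(ovo_predict([1.5], classifiers))  # votes B, A, B -> "B"
```

For \(k\) classes, OvO trains \(k(k-1)/2\) binary classifiers, whereas OvR trains only \(k\); each OvO classifier, however, sees only the two classes' data, so its individual training problems are smaller.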
4. Probabilistic SVM:
- Probabilistic SVM extends the binary SVM to output probability estimates for class
membership rather than just binary predictions.
- One common approach for probabilistic SVM is Platt scaling, which fits a logistic
regression model to the SVM decision scores to estimate class probabilities.
- Probabilistic SVM provides a measure of confidence or uncertainty in the predictions,
which can be useful for decision-making and evaluating model performance.
These are the main types of support vector machines, each tailored to different types of
data and tasks. Depending on the characteristics of the dataset and the objectives of the
analysis, practitioners may choose the most appropriate type of SVM and kernel function
to build accurate and effective models.