Data Mining Techniques

Support Vector Machine Algorithm

A Support Vector Machine (SVM) is a powerful machine learning algorithm widely used for both linear
and nonlinear classification, as well as regression and outlier detection tasks. SVMs are highly
adaptable, making them suitable for various applications such as text classification, image
classification, spam detection, handwriting identification, gene expression analysis, face detection,
and anomaly detection.

SVMs are particularly effective because they focus on finding the maximum separating
hyperplane between the different classes in the target feature, making them robust for both binary and
multiclass classification. In this outline, we will explore the Support Vector Machine (SVM) algorithm,
its applications, and how it effectively handles both linear and nonlinear classification, as well
as regression and outlier detection tasks.

Support Vector Machine

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for
both classification and regression tasks. While it can be applied to regression problems, SVM is best
suited for classification tasks. The primary objective of the SVM algorithm is to identify the optimal
hyperplane in an N-dimensional space that can effectively separate data points into different classes in
the feature space. The algorithm ensures that the margin between the closest points of different classes,
known as support vectors, is maximized.

The dimension of the hyperplane depends on the number of features. For instance, if there are two
input features, the hyperplane is simply a line, and if there are three input features, the hyperplane
becomes a 2-D plane. As the number of features increases beyond three, the complexity of visualizing
the hyperplane also increases.
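To make the geometry concrete, here is a minimal sketch (assuming scikit-learn and NumPy are available; the two-feature toy data is invented for illustration) that fits a linear SVM and reads off the hyperplane w·x + b = 0, its support vectors, and the margin width:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data with two features (x1, x2):
# class 0 points cluster near the origin, class 1 points are shifted away.
X = np.array([[1.0, 1.0], [1.5, 0.5], [0.5, 1.5], [1.0, 2.0],
              [4.0, 4.0], [4.5, 3.5], [3.5, 4.5], [4.0, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# A linear kernel with a very large C approximates a hard-margin SVM.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating line w.x + b = 0
b = clf.intercept_[0]
print("hyperplane: %.3f*x1 + %.3f*x2 + %.3f = 0" % (w[0], w[1], b))
print("support vectors:\n", clf.support_vectors_)
print("margin width: %.3f" % (2 / np.linalg.norm(w)))
```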

Consider two independent variables, x1 and x2, and one dependent variable represented as either a
blue circle or a red circle.

• In this scenario, the hyperplane is a line because we are working with two features (x1 and x2).

• There are multiple lines (or hyperplanes) that can separate the data points.

• The challenge is to determine the best hyperplane that maximizes the separation margin
between the red and blue circles.
Figure: Linearly separable data points

From the figure above, it is clear that there are multiple lines (our hyperplane here is a line because we are considering only two input features, x1 and x2) that segregate our data points, i.e., classify the red and blue circles. So how do we choose the best line, or more generally the best hyperplane, to segregate our data points?

How Does the Support Vector Machine Algorithm Work?

One reasonable choice for the best hyperplane in a Support Vector Machine (SVM) is the one that
maximizes the separation margin between the two classes. The maximum-margin hyperplane, also
referred to as the hard margin, is selected based on maximizing the distance between the hyperplane
and the nearest data point on each side.
Figure: Multiple hyperplanes separating the data from two classes

So we choose the hyperplane whose distance to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane, or hard margin. So, from the figure above, we choose L2. Now let's consider a scenario like the one shown below.

Figure: Selecting a hyperplane for data with an outlier

Here we have one blue ball inside the boundary of the red balls. So how does SVM classify this data? It's simple: the blue ball lying among the red ones is an outlier of the blue class. The SVM algorithm ignores such outliers and still finds the hyperplane that maximizes the margin, which makes SVM robust to outliers.

Figure: The most optimal hyperplane

So for this kind of data, SVM finds the maximum margin as it did for the previous datasets, and in addition it adds a penalty each time a point crosses the margin. The margins in such cases are called soft margins. When the dataset requires a soft margin, the SVM tries to minimize (1/margin) + λ·(∑penalty), where λ controls the trade-off between the two terms. Hinge loss is the most commonly used penalty: if there is no violation there is no hinge loss, and if there is a violation the hinge loss is proportional to the distance of the violation.
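To illustrate the penalty concretely, here is a rough sketch using only NumPy (the decision values below are made up for illustration): hinge loss is zero for points that sit on the correct side outside the margin and grows linearly with how far a point violates the margin.

```python
import numpy as np

def hinge_loss(y, scores):
    """Hinge loss per sample: max(0, 1 - y * f(x)), with labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * scores)

y = np.array([+1, +1, -1, -1])            # true labels
scores = np.array([2.3, 0.4, -1.7, 0.2])  # hypothetical decision values f(x) = w.x + b

print(hinge_loss(y, scores))   # [0.  0.6 0.  1.2]
# 1st and 3rd points: correctly classified outside the margin -> no penalty.
# 2nd point: inside the margin -> small penalty; 4th point: misclassified -> larger penalty.
```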

Till now, we were talking about linearly separable data (the group of blue balls and the group of red balls can be separated by a straight line). What do we do if the data are not linearly separable?

Figure: Original 1D dataset for classification


Say our data looks like the figure above. SVM solves this by creating a new variable using a kernel. We call a point on the line xi and create a new variable yi as a function of its distance from the origin o. If we plot this, we get something like the figure shown below.

Figure: Mapping the 1D data to 2D so that the two classes become separable

In this case, the new variable y is created as a function of distance from the origin. A non-linear function
that creates a new variable is referred to as a kernel.
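A minimal sketch of this idea (NumPy only; the 1D points are invented): mapping each point x to the pair (x, x²), so that the new variable is a function of the distance from the origin, makes the two classes separable by a straight line in the new 2D space.

```python
import numpy as np

# Hypothetical 1D data: the "inner" class sits around the origin, the "outer"
# class sits on both sides, so no single threshold on x separates them.
x = np.array([-4.0, -3.5, -3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0, 3.5, 4.0])
label = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1])

# Kernel-style feature map: each point x_i becomes (x_i, y_i) with y_i = x_i**2,
# i.e. y_i is a function of the distance from the origin.
features_2d = np.column_stack([x, x ** 2])

# In the new 2D space the classes are separable by the horizontal line y = 4:
print((features_2d[:, 1] > 4) == (label == 1))   # all True for this toy data
```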

Support Vector Machine Terminology

• Hyperplane: The hyperplane is the decision boundary used to separate data points of different classes in a feature space. For linear classification, it is the linear equation w·x + b = 0.

• Support Vectors: Support vectors are the closest data points to the hyperplane. These points are
critical in determining the hyperplane and the margin in Support Vector Machine (SVM).

• Margin: The margin refers to the distance between the support vector and the hyperplane. The
primary goal of the SVM algorithm is to maximize this margin, as a wider margin typically results
in better classification performance.

• Kernel: The kernel is a mathematical function used in SVM to map input data into a higher-
dimensional feature space. This allows the SVM to find a hyperplane in cases where data points
are not linearly separable in the original space. Common kernel functions include linear,
polynomial, radial basis function (RBF), and sigmoid.

• Hard Margin: A hard margin refers to the maximum-margin hyperplane that perfectly separates
the data points of different classes without any misclassifications.
• Soft Margin: When data contains outliers or is not perfectly separable, SVM uses the soft
margin technique. This method introduces a slack variable for each data point to allow some
misclassifications while balancing between maximizing the margin and minimizing violations.

• C: The C parameter in SVM is a regularization term that balances margin maximization and the penalty for misclassifications. A higher C value imposes a stricter penalty for margin violations, leading to a smaller margin but fewer misclassifications (a short sketch illustrating this appears after this list).

• Hinge Loss: The hinge loss is a common loss function in SVMs. It penalizes misclassified points or
margin violations and is often combined with a regularization term in the objective function.

• Dual Problem: The dual problem in SVM involves solving for the Lagrange multipliers associated
with the support vectors. This formulation allows for the use of the kernel trick and facilitates
more efficient computation.
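As promised above, here is a brief sketch of the effect of the C parameter (scikit-learn assumed; the overlapping toy data is randomly generated for illustration): a small C tolerates margin violations and yields a wider margin, while a large C penalizes violations heavily and narrows the margin.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical overlapping 2D data: two Gaussian blobs that are not perfectly separable.
X = np.vstack([rng.normal(loc=0.0, scale=1.0, size=(50, 2)),
               rng.normal(loc=2.0, scale=1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]
    # A wider margin corresponds to a smaller ||w||; small C trades violations for margin.
    print("C=%-6g  margin width=%.3f  #support vectors=%d"
          % (C, 2 / np.linalg.norm(w), len(clf.support_vectors_)))
```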
Apriori Algorithm

The Apriori algorithm was proposed by R. Agrawal and R. Srikant in 1994 for finding frequent itemsets in a dataset for Boolean association rules. The algorithm is named Apriori because it uses prior knowledge of frequent itemset properties. It applies an iterative, level-wise search in which frequent k-itemsets are used to find (k+1)-itemsets.

To improve the efficiency of this level-wise generation of frequent itemsets, an important property called the Apriori property is used, which reduces the search space.

Apriori Property –
Every non-empty subset of a frequent itemset must itself be frequent. The key concept behind the Apriori algorithm is the anti-monotonicity of the support measure. Apriori assumes that:

All subsets of a frequent itemset must be frequent (the Apriori property).

If an itemset is infrequent, all of its supersets will be infrequent.
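A minimal sketch of how this property prunes candidates (plain Python; the itemsets are illustrative): a candidate k-itemset can be discarded as soon as any of its (k-1)-subsets is missing from the previous frequent level.

```python
from itertools import combinations

def has_infrequent_subset(candidate, previous_frequent):
    """Return True if any (k-1)-subset of the candidate is not frequent (Apriori property)."""
    k = len(candidate)
    return any(frozenset(sub) not in previous_frequent
               for sub in combinations(candidate, k - 1))

# Illustrative example: suppose {I3, I4} was NOT frequent at level 2.
L2 = {frozenset(s) for s in [("I1", "I2"), ("I1", "I3"), ("I2", "I3"), ("I2", "I4")]}

print(has_infrequent_subset(("I1", "I2", "I3"), L2))  # False -> keep the candidate
print(has_infrequent_subset(("I2", "I3", "I4"), L2))  # True  -> prune it ({I3, I4} missing)
```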

Before we start working through the algorithm, go through the basic definitions explained in the previous post. Consider the following dataset; we will find the frequent itemsets and generate association rules for them. Assume that the minimum support count is 2 and the minimum confidence is 60%.

Step-1: K=1
(I) Create a table containing the support count of each item present in the dataset, called C1 (the candidate set).
(II) Compare each candidate item's support count with the minimum support count (here min_support = 2); if a candidate's support count is less than min_support, remove that item. This gives us the itemset L1.
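Here is a small sketch of Step 1 in plain Python. The transaction list is a hypothetical stand-in, since the original table is not reproduced here.

```python
from collections import Counter

# Hypothetical transaction database (the actual table from the example is not shown here).
transactions = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
]
min_support_count = 2

# C1: support count of every individual item.
C1 = Counter(item for t in transactions for item in t)

# L1: items whose support count meets the minimum support count.
L1 = {item: count for item, count in C1.items() if count >= min_support_count}
print(L1)   # e.g. {'I1': 3, 'I2': 4, 'I3': 2, 'I4': 2} -- I5 is dropped (count 1)
```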

Step-2: K=2

• Generate the candidate set C2 by joining L1 with itself (this is called the join step). The condition for joining two itemsets from Lk-1 is that they must have (K-2) elements in common; a short sketch of this join step appears at the end of this step.

• Check whether all subsets of each itemset are frequent and, if not, remove that itemset. (For example, the subsets of {I1, I2} are {I1} and {I2}, which are frequent. Check this for each itemset.)

• Now find the support count of these itemsets by searching the dataset.

(II) Compare each candidate's (C2) support count with the minimum support count (here min_support = 2); if a candidate's support count is less than min_support, remove that itemset. This gives us the itemset L2.
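Here is the promised sketch of the join step (plain Python): two (K-1)-itemsets are joined only when their union has exactly K items, which is equivalent to requiring that they share (K-2) items.

```python
from itertools import combinations

def apriori_join(prev_frequent, k):
    """Join L(k-1) with itself: keep unions of size k whose two parents share k-2 items.

    prev_frequent: iterable of frozensets, each of size k-1.
    """
    candidates = set()
    for a, b in combinations(prev_frequent, 2):
        union = a | b
        if len(union) == k:          # equivalent to |a intersect b| == k - 2
            candidates.add(union)
    return candidates

# Example with K = 2: join the frequent 1-itemsets.
L1 = [frozenset({"I1"}), frozenset({"I2"}), frozenset({"I3"})]
print(sorted(sorted(c) for c in apriori_join(L1, 2)))
# [['I1', 'I2'], ['I1', 'I3'], ['I2', 'I3']]
```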

Step-3:

o Generate the candidate set C3 by joining L2 with itself (join step). The condition for joining itemsets from Lk-1 is that they must have (K-2) elements in common, so for L2 the first element should match. The itemsets generated by joining L2 are {I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I4, I5}, and {I2, I3, I5}.

o Check whether all subsets of these itemsets are frequent and, if not, remove that itemset. (Here the subsets of {I1, I2, I3} are {I1, I2}, {I2, I3}, and {I1, I3}, which are all frequent. For {I2, I3, I4}, the subset {I3, I4} is not frequent, so remove it. Similarly, check every itemset.)

o Find the support count of the remaining itemsets by searching the dataset.

(II) Compare each candidate's (C3) support count with the minimum support count (here min_support = 2); if a candidate's support count is less than min_support, remove that itemset. This gives us the itemset L3.

Step-4:

o Generate the candidate set C4 by joining L3 with itself (join step). The condition for joining itemsets from Lk-1 (here K=4) is that they should have (K-2) elements in common, so for L3 the first two items should match.

o Check whether all subsets of these itemsets are frequent. (Here the only itemset formed by joining L3 is {I1, I2, I3, I5}, and its subset {I1, I3, I5} is not frequent.) So there is no itemset in C4.

o We stop here because no further frequent itemsets are found.

Thus, we have discovered all the frequent itemsets. Now the generation of strong association rules comes into the picture. For that, we need to calculate the confidence of each rule.

Confidence –
A confidence of 60% means that 60% of the customers who purchased milk and bread also bought butter.

Confidence(A->B)=Support_count(A∪B)/Support_count(A)

So here, taking one frequent itemset as an example, we will show the rule generation.
Itemset {I1, I2, I3} //from L3
So the rules can be:
[I1^I2]=>[I3] //confidence = sup(I1^I2^I3)/sup(I1^I2) = 2/4*100=50%
[I1^I3]=>[I2] //confidence = sup(I1^I2^I3)/sup(I1^I3) = 2/4*100=50%
[I2^I3]=>[I1] //confidence = sup(I1^I2^I3)/sup(I2^I3) = 2/4*100=50%
[I1]=>[I2^I3] //confidence = sup(I1^I2^I3)/sup(I1) = 2/6*100=33%
[I2]=>[I1^I3] //confidence = sup(I1^I2^I3)/sup(I2) = 2/7*100=28%
[I3]=>[I1^I2] //confidence = sup(I1^I2^I3)/sup(I3) = 2/6*100=33%

So, if the minimum confidence is 50%, the first three rules can be considered strong association rules.
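The same confidence calculations can be reproduced in a few lines of plain Python, using the support counts from the worked example above:

```python
# Support counts taken from the worked example above.
support = {
    frozenset({"I1"}): 6, frozenset({"I2"}): 7, frozenset({"I3"}): 6,
    frozenset({"I1", "I2"}): 4, frozenset({"I1", "I3"}): 4, frozenset({"I2", "I3"}): 4,
    frozenset({"I1", "I2", "I3"}): 2,
}

def confidence(antecedent, consequent):
    """Confidence(A -> B) = support_count(A U B) / support_count(A)."""
    a = frozenset(antecedent)
    return support[a | frozenset(consequent)] / support[a]

min_confidence = 0.5
for A, B in [({"I1", "I2"}, {"I3"}), ({"I1", "I3"}, {"I2"}), ({"I2", "I3"}, {"I1"}),
             ({"I1"}, {"I2", "I3"}), ({"I2"}, {"I1", "I3"}), ({"I3"}, {"I1", "I2"})]:
    c = confidence(A, B)
    print(sorted(A), "=>", sorted(B), " confidence = %.1f%%" % (100 * c),
          "(strong)" if c >= min_confidence else "")
```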

Limitations of Apriori Algorithm


The Apriori algorithm can be slow. Its main limitation is the time required to hold and process a vast number of candidate sets when there are many frequent itemsets, a low minimum support, or large itemsets; in other words, it is not an efficient approach for very large datasets. For example, if there are 10^4 frequent 1-itemsets, the algorithm needs to generate more than 10^7 candidate 2-itemsets, which must then be tested and accumulated. Furthermore, to detect a frequent pattern of size 100, i.e. {v1, v2, ..., v100}, it has to generate on the order of 2^100 candidate itemsets, making candidate generation costly and time-consuming. The algorithm must check a large number of candidate itemsets and scan the database repeatedly to find them, so Apriori becomes very slow and inefficient when memory is limited and the number of transactions is large.
