Algorithms - K Nearest Neighbors

K-nearest neighbors (KNN) is a simple machine learning algorithm that classifies a new data point based on the majority class of its k nearest neighbors. It calculates the distance between the new point and all points in the training set using a distance measure such as Euclidean distance, finds the k nearest points, and assigns the new point to the most common class among those k neighbors. Choosing a good value for k and normalizing features are important for KNN to perform well. While simple, it can achieve high accuracy on large datasets.


Algorithms: K Nearest Neighbors

Tilani Gunawardena
Simple Analogy
• Tell me about your friends (who your neighbors are) and I will tell you who you are.

Instance-based Learning

• It's very similar to a desktop: training examples are simply stored as they arrive, and no work is done until a new query has to be answered.
KNN – Different names
• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning

What is KNN?

• A powerful classification algorithm used in pattern recognition.
• K nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
• One of the top data mining algorithms used today.
• A non-parametric, lazy learning algorithm (an instance-based learning method).
KNN: Classification Approach

• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class among its K nearest neighbors (as measured by a distance function).
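To make the vote concrete, here is a minimal Python sketch (the function name is ours, not from the slides):

    from collections import Counter

    def majority_vote(neighbor_labels):
        """Return the most common class label among the K nearest neighbors."""
        return Counter(neighbor_labels).most_common(1)[0][0]

    # With K = 3 neighbors labeled Yes, No, Yes, the object is assigned "Yes":
    print(majority_vote(["Yes", "No", "Yes"]))  # -> Yes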
Distance Measure

[Figure: compute the distance between the test record and all training records, then choose the k "nearest" records]
Distance Measures for Continuous Variables

[Figure: distance measures for continuous variables]
Distance Between Neighbors

• Calculate the distance between the new example (E) and all examples in the training set.

• Euclidean distance between two examples:
  – X = [x1, x2, x3, ..., xn]
  – Y = [y1, y2, y3, ..., yn]
  – The Euclidean distance between X and Y is defined as:

    D(X, Y) = sqrt( Σ_{i=1}^{n} (x_i − y_i)² )
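A direct translation of this formula into Python (a sketch; the function name is ours):

    import math

    def euclidean_distance(x, y):
        """D(X, Y) = sqrt(sum over i of (x_i - y_i)^2) for equal-length vectors."""
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    # Rachel vs. John from the example two slides below:
    print(euclidean_distance([22, 50, 2], [37, 50, 2]))  # -> 15.0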
K-Nearest Neighbor Algorithm

• All the instances correspond to points in an n-dimensional feature space.
• Each instance is represented by a set of numerical attributes.
• Each training example consists of a feature vector and the class label associated with it.
• Classification is done by comparing the feature vector of the new point with the feature vectors of the K nearest points.
• Select the K nearest examples to E in the training set.
• Assign E to the most common class among its K nearest neighbors.
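Putting these steps together, a minimal sketch of the whole algorithm (illustrative code, assuming numerical feature vectors):

    import math
    from collections import Counter

    def knn_classify(query, training_set, k):
        """training_set: list of (feature_vector, class_label) pairs."""
        # 1. Compute the distance from the query to every training example.
        distances = [(math.dist(query, x), label) for x, label in training_set]
        # 2. Select the K nearest examples.
        nearest = sorted(distances, key=lambda d: d[0])[:k]
        # 3. Assign the most common class among the K nearest neighbors.
        return Counter(label for _, label in nearest).most_common(1)[0][0]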
3-KNN: Example (1)

Customer | Age | Income | No. credit cards | Class | Distance from John
George   | 35  | 35K    | 3                | No    | sqrt((35−37)² + (35−50)² + (3−2)²) = 15.16
Rachel   | 22  | 50K    | 2                | Yes   | sqrt((22−37)² + (50−50)² + (2−2)²) = 15
Steve    | 63  | 200K   | 1                | No    | sqrt((63−37)² + (200−50)² + (1−2)²) = 152.23
Tom      | 59  | 170K   | 1                | No    | sqrt((59−37)² + (170−50)² + (1−2)²) = 122
Anne     | 25  | 40K    | 4                | Yes   | sqrt((25−37)² + (40−50)² + (4−2)²) = 15.74
John     | 37  | 50K    | 2                | ?     | → YES

The three nearest neighbors are Rachel (15), George (15.16), and Anne (15.74); two of the three are Yes, so John is classified YES.
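The slide's result can be reproduced with the knn_classify sketch above (income in thousands, as in the table):

    customers = [
        ([35, 35, 3], "No"),   # George
        ([22, 50, 2], "Yes"),  # Rachel
        ([63, 200, 1], "No"),  # Steve
        ([59, 170, 1], "No"),  # Tom
        ([25, 40, 4], "Yes"),  # Anne
    ]
    john = [37, 50, 2]  # Age, Income (K), No. credit cards
    print(knn_classify(john, customers, k=3))  # -> Yes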
How to choose K?

• If K is too small, the classifier is sensitive to noise points.
• A larger K works well, but too large a K may include majority points from other classes.
• Rule of thumb: K < sqrt(n), where n is the number of examples.
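In practice, K is often chosen by cross-validation rather than by the rule of thumb alone. A hedged sketch using scikit-learn (assumes features X and labels y are already loaded):

    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    def choose_k(X, y, k_values, cv=5):
        """Return the k with the best mean cross-validated accuracy."""
        scores = {
            k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=cv).mean()
            for k in k_values
        }
        return max(scores, key=scores.get)

    # Search odd values of k up to sqrt(n), following the rule of thumb:
    # best_k = choose_k(X, y, k_values=range(1, int(len(X) ** 0.5) + 1, 2))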
[Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor of a record x]

• The K-nearest neighbors of a record x are the data points that have the k smallest distances to x.
KNN Feature Weighting

• Scale each feature by its importance for classification.
• Can use our prior knowledge about which features are more important.
• Can learn the weights w_k using cross-validation (to be covered later).
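Feature weighting amounts to scaling each squared difference by a per-feature weight; a sketch (the weights shown are illustrative, not learned):

    import math

    def weighted_euclidean(x, y, weights):
        """Euclidean distance with each feature scaled by its importance weight."""
        return math.sqrt(sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y)))

    # E.g., treat income as twice as important as age and ignore card count:
    print(weighted_euclidean([35, 35, 3], [37, 50, 2], weights=[1.0, 2.0, 0.0]))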
Feature Normalization

• The distance between neighbors can be dominated by attributes with relatively large numbers, e.g., the income of customers in our previous example.
• This arises when two features are on different scales.
• It is important to normalize such features, e.g., by mapping values to numbers between 0 and 1.
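The 0–1 mapping is the min-max scaling used in the standardized-distance slide below; a minimal sketch:

    def min_max_normalize(values):
        """Map each value to (x - min) / (max - min), i.e., into [0, 1]."""
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]

    # Ages from the loan example that follows:
    print(min_max_normalize([25, 35, 45, 20, 35, 52, 23, 40, 60, 48, 33]))
    # -> [0.125, 0.375, 0.625, 0.0, 0.375, 0.8, 0.075, 0.5, 1.0, 0.7, 0.325]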
Nominal/Categorical Data

• Distance works naturally with numerical attributes.
• Binary-valued categorical attributes can be regarded as 1 or 0.
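A sketch of the 1/0 treatment for a binary categorical attribute (the attribute and values are made up for illustration):

    def encode_binary(value, positive="Yes"):
        """Map a binary categorical value to 1 or 0 so distance measures apply."""
        return 1 if value == positive else 0

    # E.g., a hypothetical "owns_home" attribute:
    print([encode_binary(v) for v in ["Yes", "No", "Yes"]])  # -> [1, 0, 1]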
KNN Classification

[Figure: scatter plot of Loan$ ($0 to $250,000) against Age (0 to 70), with customers marked Default or Non-Default]
KNN Classification – Distance

Age | Loan     | Default | Distance
25  | $40,000  | N       | 102000
35  | $60,000  | N       | 82000
45  | $80,000  | N       | 62000
20  | $20,000  | N       | 122000
35  | $120,000 | N       | 22000
52  | $18,000  | N       | 124000
23  | $95,000  | Y       | 47000
40  | $62,000  | Y       | 80000
60  | $100,000 | Y       | 42000
48  | $220,000 | Y       | 78000
33  | $150,000 | Y       | 8000

48  | $142,000 | ?       |

D = sqrt( (x1 − x2)² + (y1 − y2)² )
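Because Loan is in dollars while Age is in years, the distances above are dominated by Loan (the first row is essentially |40,000 − 142,000| = 102,000). A sketch reproducing the table's 1-NN result:

    import math

    data = [
        (25, 40000, "N"), (35, 60000, "N"), (45, 80000, "N"), (20, 20000, "N"),
        (35, 120000, "N"), (52, 18000, "N"), (23, 95000, "Y"), (40, 62000, "Y"),
        (60, 100000, "Y"), (48, 220000, "Y"), (33, 150000, "Y"),
    ]
    query = (48, 142000)

    dist, label = min((math.dist((age, loan), query), d) for age, loan, d in data)
    print(round(dist), label)  # -> 8000 Y: on unscaled data, 1-NN predicts Default = Y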
KNN Classification – Standardized Distance

Age   | Loan | Default | Distance
0.125 | 0.11 | N       | 0.7652
0.375 | 0.21 | N       | 0.5200
0.625 | 0.31 | N       | 0.3160
0     | 0.01 | N       | 0.9245
0.375 | 0.50 | N       | 0.3428
0.8   | 0.00 | N       | 0.6220
0.075 | 0.38 | Y       | 0.6669
0.5   | 0.22 | Y       | 0.4437
1     | 0.41 | Y       | 0.3650
0.7   | 1.00 | Y       | 0.3861
0.325 | 0.65 | Y       | 0.3771

0.7   | 0.61 | ?       |

Xs = (X − Min) / (Max − Min)
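Per the table, rescaling changes the answer: the smallest standardized distance is now 0.3160, to the (45, $80,000, N) customer, so the same query is classified N rather than Y. A sketch of the standardized pipeline (same illustrative data as above):

    import math

    ages = [25, 35, 45, 20, 35, 52, 23, 40, 60, 48, 33]
    loans = [40000, 60000, 80000, 20000, 120000, 18000,
             95000, 62000, 100000, 220000, 150000]
    labels = ["N", "N", "N", "N", "N", "N", "Y", "Y", "Y", "Y", "Y"]

    def scale(v, lo, hi):
        return (v - lo) / (hi - lo)

    a_lo, a_hi = min(ages), max(ages)
    l_lo, l_hi = min(loans), max(loans)
    query = (scale(48, a_lo, a_hi), scale(142000, l_lo, l_hi))  # -> (0.7, 0.61)

    # Standardize every example, then take the nearest neighbor.
    dist, label = min(
        (math.dist((scale(a, a_lo, a_hi), scale(l, l_lo, l_hi)), query), lab)
        for a, l, lab in zip(ages, loans, labels)
    )
    print(round(dist, 4), label)  # -> 0.316 N: normalization flips the prediction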
Strengths of KNN

• Very simple and intuitive.
• Can be applied to data from any distribution.
• Gives good classification if the number of samples is large enough.

Weaknesses of KNN

• Takes more time to classify a new example: the distance from the new example to all other examples must be calculated and compared.
• Choosing k may be tricky.
• Needs a large number of samples for accuracy.
