Image Clustering

Prof. Dr. Rafiqul Islam
Department of CSE
Introduction to Clustering
• Clustering: the process of grouping a set of objects into classes of similar objects.
– Documents within a cluster should be similar.
– Documents from different clusters should be dissimilar.

8/12/2020 Dr. Rafiqul Islam 2



Introduction to Clustering
• How would you design an algorithm for finding the three clusters in this case?


Clustering Methods
• Centroid-based methods
• Connectivity-based methods
• Distribution-based methods
• Density-based methods


Clustering Method: K-means

• The K-means clustering algorithm is an unsupervised algorithm; in image analysis it is often used to segment the area of interest from the background.
• It clusters, or partitions, the given data into K clusters based on K centroids.
• The algorithm is used when you have unlabeled data (i.e., data without defined categories or groups).
• The goal is to find groups in the data based on some kind of similarity, with the number of groups represented by K.


Clustering Method: K-means

• Steps in the K-means algorithm:

1. Choose the number of clusters K.
2. Select K points at random as the centroids (not necessarily from your dataset).
3. Assign each data point to the closest centroid → that forms K clusters.
4. Compute and place the new centroid of each cluster.
5. Reassign each data point to the new closest centroid. If any reassignment took place, go to step 4; otherwise, the model is ready.
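The five steps above can be sketched directly in Python. This is a minimal illustration, not the lecturer's code: the function name and data layout are my own choices. Run on the 8 example points used later in the lecture, with seeds A1, A4, and A7, it converges in a few iterations.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_means(points, seeds):
    """Minimal K-means following steps 1-5 above."""
    centroids = list(seeds)               # step 2: initial centroids
    assignment = None
    while True:
        # Step 3: assign each point to the closest centroid
        new_assignment = [min(range(len(centroids)),
                              key=lambda j: dist(p, centroids[j]))
                          for p in points]
        if new_assignment == assignment:  # step 5: no reassignment -> done
            return assignment, centroids
        assignment = new_assignment
        # Step 4: recompute each cluster's centroid as the mean of its members
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]  # A1..A8
labels, centers = k_means(points, seeds=[(2, 10), (5, 8), (1, 2)])          # A1, A4, A7
```

Ties in distance are broken by taking the lowest cluster index; the loop terminates because the assignment eventually stops changing.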


Clustering Method: K-means

• Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters: A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
• The initial seeds (centers of each cluster) are A1, A4, and A7. Run the k-means algorithm.


Clustering Method: K-means

First iteration, with Seed1 = A1 = (2, 10), Seed2 = A4 = (5, 8), Seed3 = A7 = (1, 2):

D(A1, Seed1) = 0.00    D(A2, Seed1) = 5.00    D(A3, Seed1) = 8.49
D(A1, Seed2) = 3.61    D(A2, Seed2) = 4.24    D(A3, Seed2) = 5.00
D(A1, Seed3) = 8.06    D(A2, Seed3) = 3.16    D(A3, Seed3) = 7.28

D(A4, Seed1) = 3.61    D(A5, Seed1) = 7.07    D(A6, Seed1) = 7.21
D(A4, Seed2) = 0.00    D(A5, Seed2) = 3.61    D(A6, Seed2) = 4.12
D(A4, Seed3) = 7.21    D(A5, Seed3) = 6.71    D(A6, Seed3) = 5.39

D(A7, Seed1) = 8.06    D(A8, Seed1) = 2.24
D(A7, Seed2) = 7.21    D(A8, Seed2) = 1.41
D(A7, Seed3) = 0.00    D(A8, Seed3) = 7.62

Cluster 01: A1          Cluster 02: A3, A4, A5, A6, A8    Cluster 03: A2, A7
New Center: (2, 10)     New Center: (6, 6)                New Center: (1.5, 3.5)
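This first iteration can be reproduced programmatically. A short sketch (variable names are mine, not from the slides):

```python
from math import dist  # Euclidean distance, Python 3.8+

points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
          "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}
seeds = [points["A1"], points["A4"], points["A7"]]

# Step 3: assign each point to its nearest seed
assign = {name: min(range(3), key=lambda j: dist(p, seeds[j]))
          for name, p in points.items()}

# Step 4: new centroid of each cluster = coordinate-wise mean of its members
centers = []
for j in range(3):
    members = [p for name, p in points.items() if assign[name] == j]
    centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
```

The computed assignment and the three new centers match the table above.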




Nearest Neighbor clustering
• The k-nearest neighbors algorithm (k-NN) is a non-parametric method proposed by Thomas Cover, used for classification.
• In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbors: the object is assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small).
• If k = 1, then the object is simply assigned to the class of its single nearest neighbor.


Nearest Neighbor clustering
• Initialize K to your chosen number of neighbors.
• For each example in the data:
– Calculate the distance between the query example and the current example from the data.
– Add the distance and the index of the example to an ordered collection.
• Sort the ordered collection of distances and indices from smallest to largest (in ascending order) by the distances.
• Pick the first K entries from the sorted collection.
• Get the labels of the selected K entries.
• Return the most common label among them (the plurality vote).
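The steps above, combined with the plurality vote, fit in a few lines of Python. This is an illustrative sketch with made-up toy data, not code from the lecture:

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_classify(query, examples, k):
    """examples: list of ((x, y), label) pairs; follows the steps above."""
    # Distance and label for every example, sorted ascending by distance
    ordered = sorted((dist(query, point), label) for point, label in examples)
    # First K entries -> their labels -> plurality vote
    top_labels = [label for _, label in ordered[:k]]
    return Counter(top_labels).most_common(1)[0][0]

examples = [((0, 0), "x"), ((1, 0), "x"), ((5, 5), "y")]  # hypothetical labeled data
```

For instance, a query near (0, 0) is labeled "x" for k = 1 and, with two "x" neighbors outvoting one "y", also for k = 3.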


Nearest Neighbor clustering
• Use the Nearest Neighbor clustering algorithm and Euclidean distance to cluster the examples from the previous exercise: A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
• Suppose that the threshold value (t) is 4.
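A common formulation of threshold-based nearest neighbor clustering is: present the points one at a time, and let each new point join the cluster of its nearest already-seen point if that distance is at most t; otherwise it starts a new cluster. A minimal sketch under that assumption (the function name is mine):

```python
from math import dist  # Euclidean distance, Python 3.8+

def nn_cluster(points, t):
    """Each new point joins the cluster of its nearest already-seen
    point when that distance is <= t; otherwise it starts a new cluster."""
    clusters = [[points[0]]]                     # first point opens cluster 1
    for p in points[1:]:
        # nearest already-clustered point, remembering its cluster index
        d, i = min((dist(p, q), i)
                   for i, cluster in enumerate(clusters) for q in cluster)
        if d <= t:
            clusters[i].append(p)
        else:
            clusters.append([p])
    return clusters

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]  # A1..A8
clusters = nn_cluster(points, t=4)
```

With t = 4 and the points processed in order A1..A8, this yields three clusters: {A1, A4, A8}, {A2, A7}, and {A3, A5, A6}.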




DBSCAN Clustering
• Density-based clustering refers to unsupervised learning methods that identify distinctive groups/clusters in the data, based on the idea that a cluster in data space is a contiguous region of high point density, separated from other such clusters by contiguous regions of low point density.
• Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a base algorithm for density-based clustering.
• It can discover clusters of different shapes and sizes in large amounts of data that may contain noise and outliers.


DBSCAN Clustering
• The DBSCAN algorithm uses two parameters:
• minPts: the minimum number of points (a threshold) clustered together for a region to be considered dense.
• eps (ε): a distance measure used to locate the points in the neighborhood of any point.


DBSCAN Clustering
• These parameters can be understood through two concepts called Density Reachability and Density Connectivity.
• Reachability in terms of density establishes a point to be reachable from another if it lies within a particular distance (eps) of it.
• Connectivity, on the other hand, involves a transitivity-based chaining approach to determine whether points are located in a particular cluster. For example, points p and q could be connected if p → r → s → t → q, where a → b means b is in the neighborhood of a.


DBSCAN Clustering

• Core: a point that has at least minPts points within distance ε of itself.
• Border: a point that has at least one Core point within distance ε.
• Noise: a point that is neither a Core nor a Border; it has fewer than minPts points within distance ε of itself.
DBSCAN Clustering
• Algorithmic steps for DBSCAN clustering:
• The algorithm proceeds by arbitrarily picking a point in the dataset (until all points have been visited).
• If there are at least minPts points within a radius of ε of that point, we consider all of them to be part of the same cluster.
• The clusters are then expanded by recursively repeating the neighborhood calculation for each neighboring point.
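The steps above can be sketched as a minimal DBSCAN. This is my own simplified sketch, assuming a point's ε-neighborhood includes the point itself (so minPts counts the point too, the convention that reproduces the worked example that follows):

```python
from math import dist  # Euclidean distance, Python 3.8+

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; a point's neighborhood includes the point itself."""
    neighbors = {p: [q for q in points if dist(p, q) <= eps] for p in points}
    labels, cluster = {}, 0
    for p in points:
        if p in labels or len(neighbors[p]) < min_pts:
            continue                           # already clustered, or not a core point
        cluster += 1                           # start a new cluster from this core
        frontier = [p]
        while frontier:                        # expand the cluster
            q = frontier.pop()
            if q in labels:
                continue
            labels[q] = cluster
            if len(neighbors[q]) >= min_pts:   # recurse only through core points
                frontier.extend(neighbors[q])
    noise = [p for p in points if p not in labels]
    return labels, noise

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]  # A1..A8
labels, noise = dbscan(points, eps=2, min_pts=2)
```

With ε = 2 and minPts = 2 on the lecture's 8 points, this marks A1, A2, and A7 as noise and finds the two clusters {A3, A5, A6} and {A4, A8}.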


DBSCAN Clustering
• If ε is 2 and minPts is 2, what are the clusters that DBSCAN would discover with the following 8 examples: A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9)?
• Draw the 10-by-10 space and illustrate the discovered clusters. What if ε is increased to 10?


DBSCAN Clustering
• The Euclidean distance matrix (values rounded to two decimals) is:

      A1    A2    A3    A4    A5    A6    A7    A8
A1  0.00  5.00  8.49  3.61  7.07  7.21  8.06  2.24
A2        0.00  6.08  4.24  5.00  4.12  3.16  4.47
A3              0.00  5.00  1.41  2.00  7.28  6.40
A4                    0.00  3.61  4.12  7.21  1.41
A5                          0.00  1.41  6.71  5.00
A6                                0.00  5.39  5.39
A7                                      0.00  7.62
A8                                            0.00
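The matrix can be generated in a few lines of Python (the formatting choices are mine):

```python
from math import dist  # Euclidean distance, Python 3.8+

pts = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
       "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}

names = list(pts)
print("    " + "".join(f"{n:>6}" for n in names))   # header row
for a in names:
    row = "".join(f"{dist(pts[a], pts[b]):6.2f}" for b in names)
    print(f"{a:<4}{row}")                           # one distance row per point
```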


DBSCAN Clustering
• N2(x) denotes the set of points within distance ε = 2 of x (excluding x itself):
• N2(A1) = {}; N2(A2) = {}; N2(A3) = {A5, A6}; N2(A4) = {A8}; N2(A5) = {A3, A6}; N2(A6) = {A3, A5}; N2(A7) = {}; N2(A8) = {A4}
• So A1, A2, and A7 are outliers, while we have two clusters C1 = {A4, A8} and C2 = {A3, A5, A6}.




DBSCAN Clustering
• If ε is increased to 10, then the neighborhoods of some points will grow: A1 would join cluster C1, and A2 would join with A7 to form cluster C3 = {A2, A7}.


Clustering Methods
• Single Linkage Method
• Complete Linkage Method
• Hierarchical Method
