Hierarchical Clustering
Definition:
Hierarchical clustering is a method that builds a tree-like structure
(dendrogram) of clusters by either merging smaller clusters into larger ones
(agglomerative) or splitting larger clusters into smaller ones (divisive). It does
not require the number of clusters to be specified in advance.
It’s like building a family tree: you start with individuals and group them into
families, then combine families into bigger groups like tribes or regions.
Advantages (+):
No need to predefine clusters: You don't need to specify the number of clusters beforehand.
Flexible: Works with any distance metric, and can be used with different linkage methods (like single,
complete, or centroid).
Creates a hierarchy: It gives you a tree (dendrogram) to show how clusters are merged, so you can choose
the level of clustering.
Disadvantages (-):
Computationally expensive: It can be slow, especially with large datasets.
Sensitive to noise and outliers: It might incorrectly group noisy or outlying data points.
Doesn't scale well: It’s less efficient with very large datasets compared to other methods.
--> In short, hierarchical (agglomerative) clustering is great for flexibility and hierarchical analysis, but it can be slow and sensitive to noise.
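As a small illustration of the "no need to predefine clusters" point above, scikit-learn's AgglomerativeClustering can cut the hierarchy at a distance threshold instead of a fixed cluster count. This is only a minimal sketch: the toy data and the threshold value of 2.0 are made up for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# hypothetical toy data: two obvious groups plus one far-away point
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [10.0, 0.0]])

# n_clusters is left unset; the tree is cut wherever a merge would
# exceed the chosen distance threshold (value picked for this toy data)
model = AgglomerativeClustering(n_clusters=None, distance_threshold=2.0)
labels = model.fit_predict(X)

print(labels)              # cluster id per point
print(model.n_clusters_)   # number of clusters implied by the threshold
```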
Agglomerative Clustering (Bottom-Up):
1. Start with each data point as its own cluster and compute the distances between all points. (Imagine every person starts alone.)
2. Merge the two closest clusters. (You keep pairing the most similar people into groups.)
3. Repeat step 2, recomputing cluster distances after each merge, until all points belong to a single cluster (or until you stop at a chosen level).
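As a concrete sketch of this bottom-up procedure, SciPy's linkage function performs exactly these steps: it starts from single points, repeatedly merges the two closest clusters, and records every merge. The toy points below are invented for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# hypothetical toy points: every point starts as its own cluster
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.1], [5.2, 4.9], [9.0, 0.5]])

# each row of Z records one merge: which two clusters were joined,
# at what distance, and how many points the new cluster contains
Z = linkage(X, method="single")  # 'single' = nearest-neighbour linkage

# cut the finished tree into (at most) 3 flat clusters
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)
```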
Single Linkage (Nearest Neighbor): The distance between two groups is the shortest distance between any two points in the groups.
Finds the closest pair of points in two clusters; good for detecting elongated
shapes but prone to chaining.
Complete Linkage (Farthest Neighbor): The distance between two groups is the longest distance between any two points in the groups.
Finds the farthest pair of points in two clusters; good for compact clusters.
Average Linkage: The distance between two groups is the average of all the distances between points in the first group and points in the second group.
Balances single and complete linkage, often giving more natural groupings.
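These three rules differ only in how they summarize the set of pairwise distances between two groups. A quick sketch with SciPy (made-up points) makes the difference concrete:

```python
import numpy as np
from scipy.spatial.distance import cdist

# two hypothetical clusters of 2-D points
A = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
B = np.array([[4.0, 4.0], [5.0, 5.0]])

D = cdist(A, B)    # all pairwise distances between points of A and points of B

print(D.min())     # single linkage: closest pair
print(D.max())     # complete linkage: farthest pair
print(D.mean())    # average linkage: mean over all pairs
```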
Ward’s Linkage: This method tries to merge groups in a way that minimizes the increase in
"variance" (spread of points). It tries to keep the new group as compact as
possible.
Definition: The distance between two clusters is the increase in the total
within-cluster variance that results from merging them.
It merges clusters that increase the overall cluster variance the least.
(Basically, it looks for the two groups that "fit" together best.)
Formula:
d(C1, C2) = SSE(C1 ∪ C2) − SSE(C1) − SSE(C2) = (|C1|·|C2| / (|C1| + |C2|)) · ‖μ1 − μ2‖²
where SSE is a cluster's sum of squared distances to its own mean and μ1, μ2 are the centroids of C1 and C2. In words: the increase in within-cluster variance caused by merging C1 and C2.
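A short numeric check of this formula, using invented points: computing the increase in SSE directly and via the closed form gives the same value.

```python
import numpy as np

def sse(points):
    """Sum of squared distances of points to their own mean."""
    return ((points - points.mean(axis=0)) ** 2).sum()

# two hypothetical clusters
C1 = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
C2 = np.array([[6.0, 6.0], [8.0, 6.0]])

# Ward's distance: increase in within-cluster variance caused by merging
increase = sse(np.vstack([C1, C2])) - sse(C1) - sse(C2)

# equivalent closed form using cluster sizes and the centroid distance
n1, n2 = len(C1), len(C2)
closed = n1 * n2 / (n1 + n2) * np.sum((C1.mean(axis=0) - C2.mean(axis=0)) ** 2)

print(increase, closed)   # both give the same value
```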
Advantages:
Creates well-separated, tight clusters. (The groups it forms look neat and
logical.)
Limitations:
Not good for elongated or irregularly shaped clusters. (It assumes clusters
are compact and round.)
Example:
If you’re grouping cities based on population and income, Ward’s method will ensure each group contains cities that are closely related on both attributes.
Centroid Linkage: measures the distance between the centers (centroids) of two groups. The
centroid is simply the average position of all points in the group.
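In code, centroid linkage reduces to the distance between the two cluster means (a sketch with made-up points):

```python
import numpy as np

# two hypothetical clusters
A = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
B = np.array([[4.0, 4.0], [5.0, 5.0]])

# centroid linkage: distance between the average positions of the two groups
centroid_distance = np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
print(centroid_distance)
```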
Divisive Clustering (Top-Down):
1. Start with all data points in one cluster. (Start with everyone in one big group.)
2. Split the biggest cluster into smaller groups. (Divide it into smaller groups step by step.)
3. Repeat the splitting until every point is its own cluster, or until a chosen number of groups is reached.
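Plain divisive hierarchical clustering is rarely shipped as a ready-made library routine; one common approximation of the top-down idea (a sketch, assuming scikit-learn 1.1+, which provides BisectingKMeans) repeatedly splits a cluster in two with k-means until the desired number of groups is reached:

```python
import numpy as np
from sklearn.cluster import BisectingKMeans

# hypothetical toy data
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8], [9.0, 0.0], [9.1, 0.2]])

# top-down: start with one big cluster and bisect it step by step
# until the requested number of clusters is reached
model = BisectingKMeans(n_clusters=3, random_state=0)
labels = model.fit_predict(X)
print(labels)
```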
Advantages:
You don’t need to know the number of clusters beforehand. (You can
decide after looking at the hierarchy.)
Shows relationships clearly in a dendrogram. (It’s like a family tree for your
data.)
Limitations:
Computationally expensive for large datasets. (Takes a long time with too
many data points.)
Can’t reassign points to different clusters later. (Once a point joins a group,
it’s stuck there.)
Worked example:
https://www.youtube.com/watch?v=8QCBl-xdeZI&ab_channel=DATAtab
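To try a worked example yourself, you can plot the dendrogram with SciPy and matplotlib; the data below is made up and the point labels are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# hypothetical toy data
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.1], [5.2, 4.9], [9.0, 0.5]])

# build the hierarchy with Ward's linkage and draw the tree of merges
Z = linkage(X, method="ward")
dendrogram(Z, labels=["A", "B", "C", "D", "E"])
plt.xlabel("data points")
plt.ylabel("merge distance")
plt.show()
```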
Single Linkage:
Good for non-elliptical shapes (long or stretched clusters).
Sensitive to noise and outliers (can be easily affected by weird points).
Complete Linkage:
Less affected by noise and outliers.
Can break larger clusters into smaller ones.
Works best with globular (round) shapes.