Data Mining Functionalities
A hierarchical clustering method works by grouping data
objects into a hierarchy or “tree” of clusters.
Representing data objects in the form of a hierarchy is
useful for data summarization and visualization.
Agglomerative versus Divisive Hierarchical Clustering
A hierarchical clustering method can be either agglomerative or divisive, depending on whether the hierarchical decomposition is formed in a bottom-up (merging) or top-down (splitting) fashion.
An agglomerative hierarchical clustering method uses a bottom-up
strategy.
It typically starts by letting each object form its own cluster and
iteratively merges clusters into larger and larger clusters, until all the
objects are in a single cluster or certain termination conditions are
satisfied.
The single cluster becomes the hierarchy’s root.
For the merging step, it finds the two clusters that are closest to each
other (according to some similarity measure), and combines the two to
form one cluster.
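As a concrete sketch, this bottom-up merging can be run with scikit-learn's AgglomerativeClustering; the toy 2-D points and the choices of n_clusters=2 and single linkage below are illustrative assumptions, not part of the text.

import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Six toy points forming two well-separated groups (illustrative data).
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# Each point starts in its own cluster; the two closest clusters are
# merged repeatedly until only n_clusters clusters remain.
model = AgglomerativeClustering(n_clusters=2, linkage="single")
print(model.fit_predict(X))  # e.g. [1 1 1 0 0 0] (label numbering may vary)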
A divisive hierarchical clustering method employs a top-down
strategy.
It starts by placing all objects in one cluster, which is the hierarchy’s
root.
It then divides the root cluster into several smaller subclusters, and
recursively partitions those clusters into smaller ones.
The partitioning process continues until each cluster at the lowest
level is coherent enough—either containing only one object, or the
objects within a cluster are sufficiently similar to each other.
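Scikit-learn offers no direct divisive method, so the sketch below realizes the top-down strategy by recursively bisecting clusters with k-means (the "bisecting k-means" idea, one common way to implement divisive clustering); the toy data, the max_size stopping rule, and the helper name divisive are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

def divisive(points, max_size=2):
    # Stop splitting once a cluster is "coherent enough" (here: small).
    if len(points) <= max_size:
        return [points]
    # Split the current cluster into two subclusters with k-means.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(points)
    return (divisive(points[labels == 0], max_size)
            + divisive(points[labels == 1], max_size))

X = np.array([[1.0, 2.0], [1.5, 1.8], [8.0, 8.0],
              [8.2, 7.9], [25.0, 30.0], [24.5, 29.5]])
for cluster in divisive(X):  # recursion starts from the root cluster holding all points
    print(cluster)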
Hierarchical Clustering
Clusters are merged based on the distance between them; to calculate the distance between clusters, several types of linkage are used.
Linkage Criteria:
A linkage criterion determines the distance between sets of observations as a function of the pairwise distances between observations.
In Single Linkage, the distance between two clusters is the minimum distance between members of the two clusters.
In Complete Linkage, the distance between two clusters is the maximum distance between members of the two clusters.
In Average Linkage, the distance between two clusters is the average of all pairwise distances between members of the two clusters.
In Centroid Linkage, the distance between two clusters is the distance between their centroids.
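To see how the four criteria differ in practice, the sketch below computes the merge distances for the same toy 1-D points under each linkage with SciPy; the points are illustrative.

import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1.0], [2.0], [6.0], [10.0]])  # illustrative 1-D points

for method in ("single", "complete", "average", "centroid"):
    Z = linkage(X, method=method)
    # Each row of Z is [cluster_i, cluster_j, merge_distance, new_size];
    # column 2 lists the distances at which merges happened.
    print(method, Z[:, 2])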
Hierarchical Clustering
Objective: For the one-dimensional data set {7, 10, 20, 28, 35}, perform hierarchical clustering and plot the dendrogram to visualize it.
Let’s solve the problem by hand using both linkage types for agglomerative hierarchical clustering:
Single Linkage: In single-link hierarchical clustering, we merge at each step the two clusters whose closest members have the smallest distance.
Complete Linkage: In complete-link hierarchical clustering, we merge at each step the two clusters whose merger has the smallest maximum pairwise distance.
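Working the merges out by hand on {7, 10, 20, 28, 35}: single linkage merges at distances 3, 7, 8, 10, while complete linkage merges at distances 3, 7, 13, 28. The sketch below, assuming SciPy's linkage and dendrogram functions and Euclidean distance on the 1-D points, reproduces both dendrograms.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.array([[7.0], [10.0], [20.0], [28.0], [35.0]])

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, method in zip(axes, ("single", "complete")):
    # Single linkage merges at heights 3, 7, 8, 10;
    # complete linkage merges at heights 3, 7, 13, 28.
    Z = linkage(X, method=method)
    dendrogram(Z, labels=[7, 10, 20, 28, 35], ax=ax)
    ax.set_title(f"{method} linkage")
plt.show()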
Fundamental Working Of KNN (K-Nearest Neighbors)
KNN is a supervised machine learning algorithm.
Assume that we have a dataset in which the data points are
classified into 2 categories.
✔ Now, when a new data point comes in, the KNN algorithm will predict which category or group the new data point belongs to.
✔ Once our model is able to classify new data points as category A or category B, we can say that our model is ready to make predictions.
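A minimal KNN sketch with scikit-learn, assuming two illustrative categories A and B, toy 2-D points, and k = 3 neighbors:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1, 2], [2, 3], [3, 1],      # category A points
              [8, 8], [9, 10], [10, 9]])   # category B points
y = ["A", "A", "A", "B", "B", "B"]

# A new point is assigned the majority class among its k nearest
# neighbors in the training data.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

print(knn.predict([[2, 2]]))  # ['A'] -- nearest neighbors are in group A
print(knn.predict([[9, 9]]))  # ['B'] -- nearest neighbors are in group B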