Assignment 5

K-means clustering is an unsupervised machine learning algorithm that groups unlabeled data points into a specified number of clusters (k) based on their similarities. It works by assigning data points to the closest cluster centroid and iteratively updating centroid positions until clusters are stable or the maximum number of iterations is reached.

Uploaded by

Pujan Patel

Assignment 5 (CLUSTERING AND CLASSIFICATION)

1. Explain the k-means clustering algorithm and give its advantages and disadvantages.

Ans)
K-means clustering is an unsupervised learning algorithm used to solve clustering problems in machine
learning and data science. This answer covers what the algorithm is, how it works, and its main
advantages and disadvantages.

It groups the data into different clusters and is a convenient way to discover the categories present in
an unlabelled dataset on its own, without the need for any labelled training data.

It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of squared distances between each data point and the centroid of the
cluster it is assigned to.
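The objective described above is commonly written as the within-cluster sum of squared distances. The symbols below ($C_j$ for the j-th cluster, $\mu_j$ for its centroid, $x_i$ for a data point) are notation assumed here for illustration; they do not appear in the original answer.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```

Each iteration of k-means does not increase $J$: reassigning a point to a nearer centroid and moving a centroid to the mean of its cluster can each only lower $J$ or leave it unchanged.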

The algorithm takes the unlabelled dataset as input, divides it into k clusters, and repeats the
assignment and update steps until the clusters stop changing or an iteration limit is reached. The
value of k must be chosen in advance.

The k-means clustering algorithm mainly performs two tasks:

o Determines the best positions for the k centre points (centroids) by an iterative process.
o Assigns each data point to its closest centroid. The data points nearest to a particular
centroid form a cluster.

Hence each cluster contains data points with some commonalities and is separated from the other clusters.
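The two tasks described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the sample points, and the choice to keep an empty cluster's centroid where it is are all assumptions made for this example.

```python
import math


def assign(points, centroids):
    """Assignment step: put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # Update step: move each centroid to the mean of its cluster.
        # Assumption: an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Final assignment so the returned clusters match the returned centroids.
    return assign(points, centroids), centroids


points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
clusters, centroids = kmeans(points, centroids=[(1, 1), (9, 9)])
print(clusters)
print(centroids)
```

With these sample points the three points near the origin end up in one cluster and the three points near (8, 9) in the other.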

Advantages of k-means:

• Relatively simple to implement.
• Scales to large data sets.
• Guarantees convergence (to a local optimum, not necessarily the global one).
• Can warm-start the positions of centroids.
• Easily adapts to new examples.
• Generalizes to clusters of different shapes and sizes, such as elliptical clusters.

Disadvantages of k-means:

• Choosing k manually.

Plot the loss against the number of clusters (the “Loss vs. Clusters” plot) and pick the k beyond which
the loss stops dropping sharply.
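The numbers behind a “Loss vs. Clusters” plot can be produced as in the sketch below. It is only an illustration: the helper names `kmeans_1d` and `loss`, the evenly spaced initialisation, and the use of the data set from question 2 are assumptions made here, and the loss is taken to be the sum of squared distances to the cluster means.

```python
def kmeans_1d(data, k, max_iter=100):
    """Plain 1-D k-means with a deterministic, evenly spread initialisation."""
    data = sorted(data)
    n = len(data)
    # Assumed initialisation: k evenly spaced values taken from the sorted data.
    means = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in means]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - means[i]))
            clusters[nearest].append(x)
        # An empty cluster keeps its previous mean.
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:
            break
        means = new_means
    return clusters, means


def loss(clusters, means):
    """Sum of squared distances from each point to its cluster mean."""
    return sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)


data = [2, 3, 4, 10, 11, 12, 20, 25, 30]
losses = {k: loss(*kmeans_1d(data, k)) for k in range(1, 6)}
for k, value in losses.items():
    print(k, value)
```

Plotting `losses` against k and looking for the point where the curve flattens (the "elbow") gives a candidate value for k.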

• Being dependent on initial values.

For a low k, you can mitigate this dependence by running k-means several times with different initial
values and picking the best result. As k increases, you need advanced versions of k-means to pick
better values of the initial centroids (called k-means seeding).

• Clustering data of varying sizes and density.

k-means has trouble clustering data where clusters are of varying sizes and density. To cluster such
data, you need to generalize k-means as described in the Advantages section.
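The text above names k-means seeding without saying which scheme is meant. One widely used scheme is k-means++ seeding, sketched below for 1-D data as an assumption about what such seeding could look like: the first centre is picked uniformly at random, and each later centre is picked with probability proportional to its squared distance from the nearest centre chosen so far. The function name `kmeans_pp_seeds` and the fixed `seed` are hypothetical, and the data must contain at least k distinct values.

```python
import random


def kmeans_pp_seeds(data, k, seed=0):
    """Pick k initial centres using k-means++ style weighting (1-D sketch)."""
    rng = random.Random(seed)
    centers = [rng.choice(data)]
    while len(centers) < k:
        # Squared distance from each point to its nearest already-chosen centre.
        d2 = [min((x - c) ** 2 for c in centers) for x in data]
        # Points far from every existing centre are more likely to be chosen;
        # points that are already centres have weight 0 and cannot be picked again.
        centers.append(rng.choices(data, weights=d2, k=1)[0])
    return centers


data = [2, 3, 4, 10, 11, 12, 20, 25, 30]
print(kmeans_pp_seeds(data, k=3))
```

The returned values can be passed as the initial centroids to any k-means routine; running with several different seeds and keeping the lowest-loss result combines this with the restart strategy described for low k.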

• Clustering outliers.

Centroids can be dragged by outliers, or outliers might get their own cluster instead of being ignored.
Consider removing or clipping outliers before clustering.
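Clipping can be done in several ways; the sketch below uses the common 1.5 × IQR rule purely as an illustration. The function name `clip_outliers`, the `whisker` parameter, and the sample data (the question 2 values with one value replaced by an artificial outlier of 300) are assumptions made for this example.

```python
import statistics


def clip_outliers(values, whisker=1.5):
    """Clip values lying more than `whisker` IQRs outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [min(max(v, low), high) for v in values]


data = [2, 3, 4, 10, 11, 12, 20, 25, 300]
clipped = clip_outliers(data)
print(clipped)
```

Only the extreme value is pulled in towards the rest of the data, so it can no longer drag a centroid far away from the bulk of its cluster.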

• Scaling with number of dimensions.

As the number of dimensions increases, a distance-based similarity measure converges to a constant
value between any given examples. Reduce dimensionality either by using PCA on the feature data, or
by using “spectral clustering” to modify the clustering algorithm.
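The effect of dimensionality on distances can be observed with a small experiment. The sketch below is an illustration under assumed settings (uniform random points in the unit cube, 200 points, a fixed seed); the function name `distance_contrast` is hypothetical. It measures how much the farthest neighbour of a point differs from the nearest one, relative to the nearest distance.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Relative gap between the farthest and nearest neighbour of one point."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query, rest = points[0], points[1:]
    dists = [math.dist(query, p) for p in rest]
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 10, 100, 1000):
    print(dim, round(distance_contrast(dim), 3))
```

The contrast shrinks as the dimension grows, which is why "nearest centroid" becomes less informative in very high-dimensional feature spaces.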

2. Use K-means clustering algorithm to divide the following data into clusters.
D = {2,3,4,10,11,12,20,25,30}
K=2
M1 = 4, M2 = 12

Ans)
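A minimal sketch of how this exercise can be worked, assuming absolute difference as the distance and that a point equally far from both means goes to the first one. The function name `kmeans_1d` and the `history` record are introduced here for illustration only.

```python
def kmeans_1d(data, means, max_iter=100):
    """1-D k-means that records the means and clusters of every iteration."""
    means = list(means)
    history = []
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster with the nearest mean.
        clusters = [[] for _ in means]
        for x in data:
            nearest = min(range(len(means)), key=lambda i: abs(x - means[i]))
            clusters[nearest].append(x)
        history.append((list(means), clusters))
        # Update step: recompute each mean (an empty cluster keeps its mean).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:
            break
        means = new_means
    return history


D = [2, 3, 4, 10, 11, 12, 20, 25, 30]
history = kmeans_1d(D, means=[4, 12])
for step, (means, clusters) in enumerate(history, start=1):
    print(step, means, clusters)
```

Under these assumptions the means move from (4, 12) to (3, 18), then (4.75, 19.6), then (7, 25), where they stop changing. The final clusters are {2, 3, 4, 10, 11, 12} and {20, 25, 30}.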

3. Use K-means clustering algorithm to divide the following data into clusters.
D = {2,4,6,9,12,16,20,24,26}
K=2
M1 = 4, M2 = 12

Ans)
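A minimal sketch of one way to work this exercise. For 1-D data with K = 2, assigning each value to its nearest mean is the same as splitting the data at the midpoint of the two means, so the sketch uses that shortcut. It assumes M1 < M2, that both clusters stay non-empty, and that a value exactly on the midpoint goes to the first cluster; the function name `two_means_1d` is hypothetical.

```python
def two_means_1d(data, m1, m2, max_iter=100):
    """k-means for K = 2 on 1-D data, splitting at the midpoint of the means."""
    data = sorted(data)
    for _ in range(max_iter):
        boundary = (m1 + m2) / 2
        # Values at or below the midpoint are nearer to m1, the rest to m2.
        c1 = [x for x in data if x <= boundary]
        c2 = [x for x in data if x > boundary]
        new_m1, new_m2 = sum(c1) / len(c1), sum(c2) / len(c2)
        if (new_m1, new_m2) == (m1, m2):
            break
        m1, m2 = new_m1, new_m2
    return c1, c2, m1, m2


D = [2, 4, 6, 9, 12, 16, 20, 24, 26]
c1, c2, m1, m2 = two_means_1d(D, m1=4, m2=12)
print(c1, m1)
print(c2, m2)
```

Under these assumptions the procedure settles on the clusters {2, 4, 6, 9, 12} and {16, 20, 24, 26}, with means 6.6 and 21.5.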

4. What is clustering? Illustrate with an example the various steps and commands involved in
performing k-means clustering in R.

Ans)
