
ISSN (Online): 2455-3662

EPRA International Journal of Multidisciplinary Research (IJMR) - Peer Reviewed Journal


Volume: 7 | Issue: 8 | August 2021|| Journal DOI: 10.36713/epra2013 || SJIF Impact Factor 2021: 8.047 || ISI Value: 1.188

A COMPARATIVE ANALYSIS OF K-MEANS AND HIERARCHICAL CLUSTERING

Aastha Gupta 1, Himanshu Sharma 2, Anas Akhtar 3

1,2,3 Jagan Institute of Management Studies, Sec-5, Rohini

Article DOI: https://doi.org/10.36713/epra8308 | DOI No: 10.36713/epra8308

ABSTRACT
Clustering is the process of arranging comparable data elements into groups. Clustering analysis is one of the most frequently used data mining analytical techniques, and the clustering algorithm's strategy has a direct influence on the clustering results. This study examines several types of clustering algorithms, such as the k-means algorithm, and compares and contrasts their advantages and disadvantages. The paper also highlights concerns with clustering algorithms, such as time complexity and accuracy, in order to give better outcomes in a variety of environments. The outcomes are described in terms of big datasets. The focus of this study is on clustering algorithms with the WEKA data mining tool. Clustering is the process of dividing a big data set into small groups or clusters; it is an unsupervised approach that may be used to analyze big datasets with many characteristics, and a data-modeling technique that provides a clear image of the data. Two clustering methods, k-means and hierarchical clustering, are explained in this survey, together with their analysis using the WEKA tool on different data sets.
KEYWORDS: data clustering, WEKA, k-means, hierarchical clustering

I. INTRODUCTION

Clustering is a vital part of data mining, and it is also one of the hottest topics of science in recent times. It is a technology that examines the logical or physical relationships between data and divides the data set into many clusters, each of which is made up of data that are similar in nature.

Data clustering is a process in which we group together entities with similar characteristics. Clustering quality depends on the similarity metric and how it is implemented. The clustering's main aim is to find a collection of patterns, points, connections, or objects from a natural grouping. Clustering is one of the most remarkable data mining techniques. Based on some rules, data may be classified into several classes or clusters, resulting in great similarity among data objects of the same class and substantial differences among data objects of other classes. [1]

Clustering is a method for logically categorizing raw data and looking for hidden patterns in large datasets. It is the act of grouping data into fragmented clusters so that data in one cluster matches data in another, while data in other clusters varies. Clustering is a common data analysis approach for identifying homogenous groups of objects based on attribute values. Data clustering has many different real-life applications such as image segmentation, data analysis, machine learning, search engines, document retrieval, object recognition and evaluation, computational economics, libraries, and insurance studies.

Clustering algorithms are effective meta-learning tools for assessing the information generated by modern applications, and clustering methods are widely employed in a variety of applications: data organization and categorization, data modelling, and data compression. When selecting a clustering algorithm, consider whether it can scale to your dataset. Machine learning datasets can contain millions of instances, but not all clustering algorithms scale well, since several clustering algorithms compute the similarity of all pairs of examples.

Clustering approaches are used to classify groups of related data in multivariate data sets. There are a variety of clustering methods, including:


• Partitioning methods
• Fuzzy clustering
• Hierarchical clustering
• Density-based clustering
• Model-based clustering

II. LITERATURE REVIEW

1. Manish Verma, Mauly Srivastava, Neha …, "A Comparative Study of Various Clustering Algorithms in Data Mining" [5]
The authors compared different clustering techniques, with the aim of identifying the algorithm that gives the best performance. It was observed that K-means is faster than all the other algorithms mentioned in the paper, and that K-means and EM give better results than hierarchical clustering when working on huge data sets.

2. U. Kaymak and M. Setnes, "Extended fuzzy clustering algorithms" [6]
The authors use a fuzzy clustering algorithm to divide a dataset into clusters. Some of the issues of using fuzzy algorithms were discussed, such as the number and shape of clusters, the division of data patterns, and choosing the number of clusters in the data. Enhanced versions of fuzzy c-means were given and their properties illustrated. Examples were used to show that the enhanced algorithms do not require any additional input from the user and can determine the partition of the data on their own.

3. Karthikeyan B., Dipu Jo George, G. Manikandan, Tony Thomas, "A comparative study on k-means clustering and agglomerative hierarchical clustering" [7]
The authors have done a comparative study to determine the best-suited algorithm among K-Means and Agglomerative Hierarchical Clustering. It was concluded that k-means is best used for larger datasets, with minimal runtime and memory change rate, and that the agglomerative hierarchical clustering technique is best suited for smaller data sets because of its minimal overall memory consumption.

4. S. H. Sastry, P. Babu and M. S. Prasada, "Analysis & Prediction of Sales Data in SAP-ERP System using Clustering Algorithms" [8]
The authors of this paper used clustering procedures to recognize contrasts in product sales and to identify and compare sales over a specific period. Demand for steel products is cyclical and depends on numerous variables such as customer profile, price, discounts, and cost issues. The authors explored sales data with clustering algorithms such as K-Means and EM (expectation maximization), which uncovered many interesting patterns useful for improving sales income and achieving higher sales volumes. K-Means and EM (partitioning procedures) are better suited to evaluating sales data in comparison with density-based procedures.

5. Soumi Ghosh, S. K. Dubey, "Comparative Analysis of K-Means….." [9]
The paper compares two clustering techniques: centroid-based K-Means and representative-object-based Fuzzy C-Means. The analysis is based on a performance evaluation of these algorithms in terms of how efficiently outputs are generated. The results of this comparative research show that the efficiency of FCM is somewhat close to that of K-means; however, its computation time is longer than that of K-means because fuzzy measure calculations are involved.

6. M. Venkat Reddy, M. Vivekananda, RUVN Satish [10]
The researchers identified an efficient clustering technique by comparing Divisive and Agglomerative Hierarchical Clustering with K-means. The outcome of the paper was that agglomerative clustering along with k-means is the practical choice to achieve a high degree of accuracy. Divisive clustering with k-means also functions efficiently where each cluster is fixed, i.e., where the initial centroids are taken in a fixed number for each cluster rather than by random selection.

7. N. Sharma, "Comparison the various clustering algorithms of weka tools" [11]
The authors have compared and contrasted different clustering algorithms, all implemented with the Weka tool. The purpose of their research is to determine which algorithm is more appropriate and efficient. DBSCAN, EM, Farthest First, OPTICS, and K-Means are among the algorithms considered. They demonstrate the benefits and drawbacks of each method; based on their study, they found that the k-means clustering algorithm is the simplest of the algorithms and the fastest for use with large datasets.


III. CLUSTERING PROCESS

The analytical processes required in cluster analysis have been established in the literature based on the basic paradigm of Knowledge Discovery in Databases. Figure 1 depicts the steps involved in the clustering process. [3]

1. Feature selection
This stage is about choosing characteristics for cluster analysis. Because the class labels aren't predefined in cluster analysis, there's a good chance of picking features that are irrelevant or inconsequential, and removing non-essential information improves clustering results. The process of determining the most effective subset of the original characteristics to employ in clustering is known as feature selection; the application of one or more transformations of the input features to create new salient characteristics is known as feature extraction. To get an adequate collection of characteristics to employ in clustering, one or both of these strategies can be applied.

2. Clustering algorithm
The choice of a clustering algorithm influences the clusters obtained from the data. The results obtained from clustering algorithms rest on assumptions that depend on the properties of the data set (geometry and density distribution) and on input parameter values, since the class labels are not specified. A good clustering algorithm can recognize clusters regardless of their structure.

3. Cluster validation
Cluster validation is an assessment of the clusters generated: the clusters are checked to determine that they are of satisfactory quality and match the desired clusters. External, internal, and relative indices can all be used to test clusters. The clusters generated by the algorithm are assessed in this stage, and visualizing the clusters is a useful way to rapidly double-check the cluster results. [4] One internal index is sketched in code after Fig. 1.

4. Result Analysis
The clusters produced from the initial set of data are analyzed to gain a better understanding of them and to guarantee that the attributes of the clusters are obtained. Integration of expert evaluations with additional experimental findings and analysis might also help to broaden the interpretation.

Fig 1. Clustering Process
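As one concrete example of an internal validity index for step 3, the sketch below computes the mean silhouette coefficient from scratch. This is an illustration only, not the validation method used in this paper; the class name, data, and labels are invented for the example. Values near 1 indicate compact, well-separated clusters.

public class SilhouetteSketch {
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s); // Euclidean distance
    }

    // x: data points, label: cluster index per point, k: number of clusters
    static double meanSilhouette(double[][] x, int[] label, int k) {
        int n = x.length;
        double total = 0;
        for (int i = 0; i < n; i++) {
            double[] sum = new double[k];
            int[] cnt = new int[k];
            for (int j = 0; j < n; j++) {
                if (j == i) continue;
                sum[label[j]] += dist(x[i], x[j]);
                cnt[label[j]]++;
            }
            if (cnt[label[i]] == 0) continue;          // singleton cluster: conventionally s = 0
            double a = sum[label[i]] / cnt[label[i]];  // cohesion: mean distance within own cluster
            double b = Double.MAX_VALUE;               // separation: mean distance to nearest other cluster
            for (int c = 0; c < k; c++)
                if (c != label[i] && cnt[c] > 0) b = Math.min(b, sum[c] / cnt[c]);
            total += (b - a) / Math.max(a, b);
        }
        return total / n;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 1}, {1.2, 0.9}, {5, 5}, {5.1, 4.8}};
        int[] labels = {0, 0, 1, 1};                      // a clean two-cluster assignment
        System.out.println(meanSilhouette(x, labels, 2)); // close to 1: well separated
    }
}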


IV. CLUSTERING BENCHMARKING CRITERION

The comparative strengths and limitations of each algorithm in relation to the three-dimensional [3-D] characteristics of large data should be analyzed by particular criteria for the evaluation of large-data clustering methods, namely Volume, Velocity, and Variety.

The efficiency to manage a large amount of data is called the volume of a clustering process. The following criteria are taken into account while choosing a good clustering algorithm for the Volume property:
i) the dataset's size,
ii) dealing with high dimensionality, and
iii) managing noisy data.

The capability to handle various sorts of data is referred to as the variety of a clustering process. The following criteria are taken into account while choosing a good clustering algorithm for the Variety property:
i) the dataset type;
ii) the shape of clusters.

The speed of an algorithm over massive data is referred to as the velocity of a clustering process. The following criteria are taken into account while choosing a good clustering procedure for the Velocity property:
i) the algorithm's complexity;
ii) the algorithm's run-time performance.

V. COMPARATIVE ANALYSIS

V.I. K-Means
The K-means clustering algorithm is a commonly used, well-known data clustering technique, useful for extracting meaningful information from a large database. It is used in a variety of applications, including information retrieval and computer vision. K-means clustering divides n data points into k clusters, allowing for the grouping of comparable data points. It is an iterative strategy: each point is assigned to the cluster with the closest centroid, and the centroid of each cluster is then recalculated as the average of its members. A minimal sketch of this iteration appears after the lists below.

Advantages
• Simple: easy to understand and to implement.
• Efficient: time complexity is O(t·k·n), very efficient with huge data sets.

Disadvantages
• Requires the number of clusters (k) as input from the user.
• K-Means may be computationally faster only if the value of K is small.
• Can only be used if the mean is known.
• Not suitable for high-dimensional data.
• Sensitive to noise/outliers. [12]
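A minimal from-scratch sketch of the k-means loop just described (the class name, toy data, and seed are illustrative assumptions, not the paper's setup):

import java.util.Arrays;
import java.util.Random;

public class KMeansSketch {
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return s; // squared Euclidean distance is enough for comparisons
    }

    static int[] cluster(double[][] x, int k, int maxIter, long seed) {
        int n = x.length, d = x[0].length;
        Random rnd = new Random(seed);
        double[][] c = new double[k][];
        for (int j = 0; j < k; j++)
            c[j] = x[rnd.nextInt(n)].clone(); // random initial centroids (real code avoids duplicates)
        int[] label = new int[n];
        for (int it = 0; it < maxIter; it++) {
            boolean changed = false;
            for (int i = 0; i < n; i++) {      // assignment step: nearest centroid
                int best = 0;
                for (int j = 1; j < k; j++)
                    if (dist(x[i], c[j]) < dist(x[i], c[best])) best = j;
                if (label[i] != best) { label[i] = best; changed = true; }
            }
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {      // update step: centroid = mean of members
                count[label[i]]++;
                for (int f = 0; f < d; f++) sum[label[i]][f] += x[i][f];
            }
            for (int j = 0; j < k; j++)
                if (count[j] > 0)              // empty clusters keep their old centroid
                    for (int f = 0; f < d; f++) c[j][f] = sum[j][f] / count[j];
            if (!changed) break;               // converged: no point changed cluster
        }
        return label;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 1}, {1.2, 0.8}, {5, 5}, {5.1, 4.9}, {9, 1}, {8.8, 1.2}};
        System.out.println(Arrays.toString(cluster(x, 3, 100, 42)));
    }
}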

V.II. Hierarchical clustering
A hierarchical method creates a hierarchical representation of a set of data items; dendrograms are made using the tree of clusters. Sibling clusters split the points covered by their shared parent, whereas child clusters exist in every cluster node. Hierarchical algorithms are a typical clustering approach that can be helpful for a range of data mining tasks. A hierarchical clustering technique creates a succession of clusterings in which each grouping is nested into the clustering behind it.

Advantages
• Applicable to all attribute types.
• Easy at handling similarity data.
• Small groups are formed, making analysis and comprehension simpler.
• The number of clusters is not pre-defined, so the user can dynamically select clusters.
• Conceptually simple.

Disadvantages
• Cluster merging/splitting is a permanent process.
• It is impossible to correct erroneous judgments afterwards.
• Divisive techniques can be time-consuming to compute.
• Methods aren't always (necessarily) scalable when dealing with huge datasets.
• A termination/readout condition is required.

Hierarchical clustering can be divided into two sub-categories: agglomerative hierarchical clustering and divisive hierarchical clustering.


I. Agglomerative Hierarchical clustering
The bottom-up approach is often referred to as the agglomerative approach, since it begins with each object forming a separate group. It then keeps merging the nearest objects or groups until all the groups are combined into one, or until the termination condition is met. The aim of the agglomerative clustering technique is to group together objects with similar characteristics. [14] A rough sketch of this merging loop follows below.

II. Divisive Hierarchical clustering
The divisive clustering method, on the other hand, works from the top down, starting with a single cluster at the top and dividing it as it moves toward the bottom. It usually starts with all of the objects in one cluster; then, through the application of k-means clustering, a cluster is divided into smaller clusters. This proceeds until the termination condition is met, in the limit with every object in its own cluster. [13]
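A rough from-scratch sketch of the agglomerative merging loop. Single linkage is an assumption made purely for illustration (Weka's HierarchicalClusterer supports several link types), and the class name and data are invented; divisive clustering runs the process in the opposite direction, splitting instead of merging.

import java.util.ArrayList;
import java.util.List;

public class AgglomerativeSketch {
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    // single linkage: cluster distance = distance of the closest pair of members
    static double linkage(List<double[]> c1, List<double[]> c2) {
        double best = Double.MAX_VALUE;
        for (double[] p : c1)
            for (double[] q : c2) best = Math.min(best, dist(p, q));
        return best;
    }

    static List<List<double[]>> cluster(double[][] x, int k) {
        List<List<double[]>> clusters = new ArrayList<>();
        for (double[] p : x) {                 // every point starts as its own cluster
            List<double[]> c = new ArrayList<>();
            c.add(p);
            clusters.add(c);
        }
        while (clusters.size() > k) {          // repeatedly merge the two closest clusters
            int bi = 0, bj = 1;
            double best = Double.MAX_VALUE;
            for (int i = 0; i < clusters.size(); i++)
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = linkage(clusters.get(i), clusters.get(j));
                    if (d < best) { best = d; bi = i; bj = j; }
                }
            clusters.get(bi).addAll(clusters.remove(bj)); // merging is permanent
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 1}, {1.1, 0.9}, {5, 5}, {5.2, 5.1}, {9, 9}};
        cluster(x, 2).forEach(c -> System.out.println("cluster of size " + c.size()));
    }
}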

VI. WEKA TOOL
Weka is freely available on the Internet and comes with data mining documentation that describes and thoroughly explains all of the techniques included. Applications based on the Weka class libraries may operate on any computer with a Web browser, allowing users to apply machine learning algorithms to their own data, independent of the computer's platform. We used Weka version 3.8.5 in this work to examine the accuracy and speed of the simple K-means and hierarchical clustering algorithms on the given datasets.
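Such an experiment can also be driven from Weka's Java API rather than the Explorer GUI. The following is a hedged sketch under the assumptions that the ARFF file sits at the placeholder path data/diabetes.arff, that the class label is the last attribute, and that k = 3 with an arbitrary seed:

import weka.clusterers.HierarchicalClusterer;
import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class WekaClusteringDemo {
    public static void main(String[] args) throws Exception {
        // load an ARFF dataset; "data/diabetes.arff" is a placeholder path
        Instances data = DataSource.read("data/diabetes.arff");

        Remove rm = new Remove();          // drop the class label before clustering
        rm.setAttributeIndices("last");
        rm.setInputFormat(data);
        Instances train = Filter.useFilter(data, rm);

        SimpleKMeans km = new SimpleKMeans();
        km.setNumClusters(3);              // k = 3, as in the experiment below
        km.setSeed(10);                    // arbitrary seed for reproducibility
        long t0 = System.currentTimeMillis();
        km.buildClusterer(train);
        System.out.println("k-means time (ms): " + (System.currentTimeMillis() - t0));

        HierarchicalClusterer hc = new HierarchicalClusterer();
        hc.setNumClusters(3);
        t0 = System.currentTimeMillis();
        hc.buildClusterer(train);
        System.out.println("hierarchical time (ms): " + (System.currentTimeMillis() - t0));
    }
}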

VII. EXPERIMENT

Various datasets with known clusterings are available in the UCI collection of machine learning databases for testing the accuracy and efficiency of the simple k-means and hierarchical clustering algorithms. The Diabetes and Hypothyroid datasets, along with a brief explanation of the datasets utilized in the experimental evaluation, are used in this study. [11]

Table 1 lists some features of the test datasets: the number of attributes and the number of instances in each dataset.


Table 1. Description of Data Sets

Datasets        No. of Attributes    No. of Instances
Diabetes        9                    768
Hypothyroid     30                   3772

Table 2. Clustering Results for Data Sets

Datasets        k-means running    Hierarchical clustering    k-means       Hierarchical clustering
                time (sec)         running time (sec)         Accuracy %    Accuracy %
Diabetes        0.06               2.14                       51.692        65.104
Hypothyroid     0.16               5.74                       69.64         93.24

Table 2 shows the clustering results for k = 3 clusters.
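Accuracy figures of this kind correspond to Weka's classes-to-clusters evaluation, which maps each cluster to its majority class and counts incorrectly clustered instances. A sketch of that evaluation, following Weka's documented recipe (the file path is a placeholder, and the class is assumed to be the last attribute):

import weka.clusterers.ClusterEvaluation;
import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class ClusterAccuracyDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("data/diabetes.arff"); // placeholder path
        data.setClassIndex(data.numAttributes() - 1);           // class = last attribute

        Remove rm = new Remove();                               // train without the class
        rm.setAttributeIndices("" + (data.classIndex() + 1));
        rm.setInputFormat(data);
        Instances train = Filter.useFilter(data, rm);

        SimpleKMeans km = new SimpleKMeans();
        km.setNumClusters(3);
        km.buildClusterer(train);

        // classes-to-clusters: Weka maps each cluster to its majority class
        // and reports the incorrectly clustered instances
        ClusterEvaluation eval = new ClusterEvaluation();
        eval.setClusterer(km);
        eval.evaluateClusterer(data);
        System.out.println(eval.clusterResultsToString());
    }
}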

Fig 2 shows the running time when both algorithms are applied on the same datasets.

[Figure 2. Running time v/s Datasets: bar chart of running time (sec) for k-means and hierarchical clustering on the diabetes and hypothyroid datasets]

[Figure 3. Accuracy v/s Datasets: bar chart of accuracy (%) for k-means and hierarchical clustering on the diabetes and hypothyroid datasets]


VIII. CONCLUSION

The K-means method performs well in clustering huge data sets, and its performance improves as the number of clusters grows. For categorical data, a hierarchical algorithm was employed; in view of its complexity, a new approach was applied that assigns rank values to each categorical attribute using K-means, in which the categorical data is first transformed to numeric form by assigning rank values to each categorical attribute. The K-means algorithm performs better than the hierarchical clustering algorithm. The RMSE lowers as the number of clusters rises, and the performance of the K-means method improves as the RMSE drops. When clustering noisy data, all of the methods exhibit some uncertainty. When a large dataset is used, the quality of all algorithms improves dramatically. The K-means algorithm is extremely sensitive to dataset noise; this noise makes it difficult for the algorithm to group data into appropriate clusters, and thus has an impact on the method's outcome. When working with large datasets, the K-means method outperforms conventional clustering algorithms while still producing high-quality clusters.

IX. REFERENCES
1. D. Karaboga and C. Ozturk, "A novel clustering approach: Artificial Bee Colony (ABC) algorithm," Applied Soft Computing, vol. 11, no. 1, (2011), pp. 652-657.
2. J. Senthilnath, S. N. Omkar and V. Mani, "Clustering using firefly algorithm: performance study," Swarm and Evolutionary Computation, vol. 1, no. 3, (2011), pp. 164-171.
3. M. Halkidi, Y. Batistakis, and M. Vazirgiannis, "On clustering validation techniques," J. Intell. Inf. Syst., vol. 17, no. 2-3, pp. 107-145, 2001.
4. K. Wang, B. Wang, and L. Peng, "CVAP: validation for cluster analyses," Data Sci. J., vol. 8, pp. 88-93, 2009.
5. M. Verma, M. Srivastava, N. Chack, A. K. Diswar, and N. Gupta, "A Comparative Study of Various Clustering Algorithms in Data Mining," Int. J. Eng. Res. Appl.
6. U. Kaymak and M. Setnes, "Extended fuzzy clustering algorithms," ERIM Report Series Reference No. ERS-2001-51-LIS, (2000).
7. B. Karthikeyan, D. J. George, G. Manikandan, and T. Thomas, "A comparative study on k-means clustering and agglomerative hierarchical clustering," Int. J. Emerg. Trends Eng. Res., vol. 8, no. 5.
8. S. H. Sastry, P. Babu and M. S. Prasada, "Analysis & Prediction of Sales Data in SAP-ERP System using Clustering Algorithms," arXiv preprint arXiv:1312.2678, (2013).
9. S. Ghosh and S. K. Dubey, "Comparative Analysis of K-Means and Fuzzy C-Means Algorithms," International Journal of Advanced Computer Science and Applications, vol. 4, no. 4, 2013.
10. M. V. Reddy, M. Vivekananda, and R. U. V. N. Satish, "Divisive Hierarchical Clustering with K-means and Agglomerative Hierarchical Clustering."
11. N. Sharma, A. Bajpai and R. Litoruya, "Comparison the various clustering algorithms of weka tools," International Journal of Emerging Technology and Advanced Engineering, vol. 2, no. 5, (2012) May.
12. A. Saxena, M. Prasad, A. Gupta, N. Bharill, O. P. Patel, and A. Tiwari, "A Review of Clustering Techniques and Developments."
13. K. Wang, B. Wang, and L. Peng, "CVAP: validation for cluster analyses," Data Sci. J., vol. 8, pp. 88-93, 2009.
14. N. Erman, A. Korosec, and J. Suklan, "Performance of selected agglomerative hierarchical clustering methods."
