INTRO TO ML ASSIGNMENT
NAME:GAYATHRI.K
REG NO:212223230061
1. Introduction to Clustering:
K-Means is a popular clustering algorithm that divides a dataset into k distinct,
non-overlapping clusters. Each cluster is defined by its centroid (the mean of its points), and
every data point is assigned to the cluster with the closest centroid.
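The assignment rule can be illustrated with a minimal sketch. The two centroids below are made-up values chosen only for illustration, and nearest_centroid is a hypothetical helper name, not part of any library.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids, chosen only for illustration
centroids = [(2.0, 3.0), (7.0, 7.0)]
points = [(3, 5), (6, 7)]

# Each point receives the index of its closest centroid
labels = [nearest_centroid(p, centroids) for p in points]
print(labels)
```

Here (3, 5) lies closer to the first centroid and (6, 7) closer to the second, so the two points land in different clusters.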
Advantages:
Disadvantages:
5. Python Implementation:
Points: (2, 3), (3, 3), (6, 7), (8, 8), (3, 5), (7, 6)
Code:
import numpy as np

# Example dataset
data = np.array([[2, 3], [3, 3], [6, 7], [8, 8], [3, 5], [7, 6]])
k = 2  # Number of clusters

# Initial centroids: the first k points (an assumption; the original
# listing does not show how the centroids were initialized)
centroids = data[:k].astype(float)

# K-Means algorithm
for _ in range(100):  # Max iterations
    clusters = [[] for _ in range(k)]
    for point in data:
        # Assign each point to the nearest centroid
        idx = np.argmin([np.linalg.norm(point - c) for c in centroids])
        clusters[idx].append(point)
    # Update each centroid to the mean of its cluster;
    # keep the old centroid if a cluster is empty
    new_centroids = np.array([np.mean(cluster, axis=0) if cluster else centroids[i]
                              for i, cluster in enumerate(clusters)])
    if np.allclose(new_centroids, centroids):  # Check for convergence
        break
    centroids = new_centroids

# Display results
for i, cluster in enumerate(clusters):
    print(f"Cluster {i+1}: {[p.tolist() for p in cluster]}")
print(f"Final Centroids: {centroids.tolist()}")
7. Conclusion:
The K-Means algorithm is a powerful clustering technique for grouping similar data. Its simplicity
and scalability make it a popular choice for many practical applications. However, understanding
its limitations and choosing k wisely are critical for optimal performance.
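One common way to compare candidate values of k is to look at the within-cluster sum of squared distances (WCSS) for each. The sketch below is a standard-library-only illustration on the six sample points. It assumes the first k points serve as initial centroids (a choice made here for reproducibility, not something the assignment specifies), and kmeans and wcss are hypothetical helper names.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means; initial centroids are the first k points (an assumption)."""
    centroids = [tuple(map(float, p)) for p in points[:k]]
    for _ in range(max_iter):
        # Assignment step: group each point with its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # Assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters

def wcss(centroids, clusters):
    """Within-cluster sum of squared distances for a clustering."""
    return sum(math.dist(p, c) ** 2
               for c, cluster in zip(centroids, clusters)
               for p in cluster)

points = [(2, 3), (3, 3), (6, 7), (8, 8), (3, 5), (7, 6)]
scores = {k: wcss(*kmeans(points, k)) for k in (1, 2, 3)}
for k, score in scores.items():
    print(f"k={k}: WCSS={score:.2f}")
```

On this data the WCSS falls from about 52.17 at k = 1 to about 7.33 at k = 2, and adding a third cluster helps far less. A sharp drop followed by a flattening of this kind is often read as a hint that the smaller k is adequate, though it is a heuristic rather than a guarantee.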