Expt 8 Divisive Algorithm Based On MST 121A1091


PROGRAM NO. 8: DIVISIVE CLUSTERING ALGORITHM BASED ON MST

Aim: To implement the Divisive Clustering Algorithm based on a Minimum Spanning
Tree (MST) in Python and understand the graph-based clustering approach.

Theory:
Introduction and Concept:
Divisive Clustering is a top-down hierarchical clustering algorithm: the
entire dataset starts as a single cluster and is iteratively split into smaller clusters
until a stopping criterion is met. Graph-based clustering leverages the relationships
between data points by representing them as nodes connected by weighted edges;
clusters are then formed by splitting the graph. Using a Minimum Spanning Tree
(MST) for divisive clustering partitions the data by removing the heaviest edges,
breaking the dataset along its natural groupings.
Algorithm – Divisive Clustering using MST:
1. Represent the dataset as a graph, with nodes as data points and edges
weighted by the distance (e.g., Euclidean distance) between them.
2. Compute the Minimum Spanning Tree (MST) using algorithms like Prim’s
or Kruskal’s to connect all nodes with minimal edge weight.
3. Identify and remove the heaviest (largest-weight) edge in the MST, breaking
the graph into two clusters.
4. Repeat the process recursively on each new subgraph until the required
number of clusters is obtained or stopping criteria are satisfied.
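As a minimal sketch of steps 1–3 (using a toy three-point dataset chosen purely for illustration), the following builds the distance graph, extracts the MST, and identifies the edge a single split would remove:

```python
import numpy as np
import networkx as nx
from scipy.spatial import distance

# Toy data: two nearby points and one distant outlier
pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
d = distance.cdist(pts, pts)

# Step 1: complete graph weighted by Euclidean distance
G = nx.Graph()
for i in range(len(pts)):
    for j in range(i + 1, len(pts)):
        G.add_edge(i, j, weight=d[i, j])

# Step 2: MST (networkx defaults to Kruskal's algorithm)
mst = nx.minimum_spanning_tree(G)

# Step 3: the heaviest MST edge is the first one a divisive split removes
u, v, attrs = max(mst.edges(data=True), key=lambda e: e[2]["weight"])
print((u, v), attrs["weight"])  # the long edge separating the outlier
```

Removing that single edge leaves two connected components, i.e. the first two clusters.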
Python Implementation:
1. Import necessary libraries: networkx (for graph and MST operations) and
scipy (for distance computation).
2. Create a graph using the dataset and compute the MST.
3. Identify the edge with the largest weight and remove it to split the graph.
4. Continue splitting iteratively to form clusters, and visualize or evaluate the
clusters formed.
import networkx as nx
import scipy.spatial.distance as distance
import numpy as np
import matplotlib.pyplot as plt

def divisive_clustering_mst(data, num_clusters):
    """
    Performs divisive clustering using a Minimum Spanning Tree (MST).

    Args:
        data: A 2D NumPy array containing the data points.
        num_clusters: The desired number of clusters.

    Returns:
        A list of clusters, where each cluster is a list of data point indices.
    """
    # Create a graph with nodes representing data points
    G = nx.Graph()
    for i in range(len(data)):
        G.add_node(i)

    # Compute pairwise distances between data points
    distances = distance.cdist(data, data)

    # Add edges with weights representing distances
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            G.add_edge(i, j, weight=distances[i, j])

    # Compute the Minimum Spanning Tree (MST)
    mst = nx.minimum_spanning_tree(G)

    # Split the graph iteratively to form clusters
    clusters = [list(G.nodes())]
    while len(clusters) < num_clusters:
        # Find the edge with the largest weight in the MST
        max_edge = max(mst.edges(data=True), key=lambda edge: edge[2]['weight'])
        u, v, _ = max_edge

        # Remove the edge to split the graph
        mst.remove_edge(u, v)

        # Rebuild the clusters from the disconnected components
        clusters = [list(component) for component in nx.connected_components(mst)]

    return clusters

# Example usage
data = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
num_clusters = 3
clusters = divisive_clustering_mst(data, num_clusters)
print("Clusters:", clusters)

# Visualization
def plot_clusters(data, clusters):
    plt.figure(figsize=(8, 6))
    colors = ['r', 'g', 'b', 'y', 'c', 'm']

    for cluster_idx, cluster in enumerate(clusters):
        cluster_points = data[cluster]
        plt.scatter(cluster_points[:, 0], cluster_points[:, 1],
                    label=f"Cluster {cluster_idx + 1}",
                    c=colors[cluster_idx % len(colors)])  # cycle colors if needed

    plt.title("Divisive Clustering using MST")
    plt.xlabel("Feature 1")
    plt.ylabel("Feature 2")
    plt.legend()
    plt.show()

plot_clusters(data, clusters)
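A useful cross-check: removing the k−1 heaviest MST edges yields the same partition as single-linkage hierarchical clustering cut into k clusters, since single-linkage merges follow MST edges in ascending weight order. The sketch below (with an illustrative toy dataset, not the one above) uses scipy's single-linkage as an independent reference:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Toy data with two tight pairs and one outlier (illustrative choice)
pts = np.array([[0, 0], [0.5, 0], [5, 0], [5.5, 0], [10, 0]])
k = 3

# Single-linkage agglomeration mirrors the MST, so cutting the
# dendrogram at k clusters matches removing the k-1 heaviest MST edges
Z = linkage(pdist(pts), method="single")
labels = fcluster(Z, t=k, criterion="maxclust")
print(labels)  # the two tight pairs group together; the outlier is alone
```

The labels produced this way should agree (up to relabeling) with the components returned by the divisive MST procedure on the same data.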
Conclusion:
⚫ Divisive clustering using an MST provides an effective way to cluster data by
capturing the natural neighborhood structure of the graph.
⚫ This approach works well when clusters have varying densities and shapes.
Although computationally intensive for large datasets (building the complete
distance graph is quadratic in the number of points), MST-based clustering can
recover partitions that centroid-based methods such as k-means often miss.
