Data Mining Assignment
CLUSTERING
SUBMITTED BY:
Khushal Rastogi (2k19/CS/60)
Ques. 6 TO 10.
Digvijay Nayal (2k19/CS/53)
Ques. 1 TO 5.
B.Sc. Hons. Computer Science
INTRODUCTION
In the sample two-dimensional dataset above, the points form 3 clusters that are
far apart from one another, while points in the same cluster lie close together.
1. Start with each data point as its own cluster.
2. Take the two nearest clusters and join them to form a single cluster.
3. Repeat step 2 until you obtain the desired number of clusters.
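The merge loop in steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the assignment's own code: the function name `agglomerative`, the layout of the `dist` dictionary and the three toy points are assumptions made for the example.

```python
def agglomerative(dist, k, linkage=min):
    """Merge the two nearest clusters until only k clusters remain.

    dist maps frozenset({a, b}) to the distance between points a and b.
    linkage reduces all point-to-point distances between two clusters to
    one number (min gives single link, max gives complete link).
    """
    points = sorted({p for pair in dist for p in pair})
    clusters = [(p,) for p in points]        # every point starts alone
    merges = []
    while len(clusters) > k:                 # step 3: repeat until k remain
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(dist[frozenset((a, b))]
                            for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best                       # step 2: the two nearest clusters
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical toy data: three points with made-up pairwise distances.
toy = {frozenset("ab"): 1.0, frozenset("ac"): 5.0, frozenset("bc"): 4.0}
clusters, merges = agglomerative(toy, k=1)
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

Passing `linkage=max` instead of the default `min` switches the same loop from single link to complete link.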
Several methods are used to measure the similarity between two clusters:
• (Average Linkage) The average of the distances between every pair of points,
one taken from each of the two clusters.
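As a rough illustration of how such measures differ, the sketch below computes average linkage next to the single link and complete link measures that the later questions use. The function names and the two small clusters are hypothetical; distances are assumed to be Euclidean.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def single_link(c1, c2):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(c1, c2))


def complete_link(c1, c2):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(c1, c2))


def average_link(c1, c2):
    """Mean of all pairwise distances between the two clusters."""
    return sum(dist(p, q) for p, q in product(c1, c2)) / (len(c1) * len(c2))


left = [(0.0, 0.0), (0.0, 1.0)]    # hypothetical cluster
right = [(3.0, 0.0), (3.0, 1.0)]   # hypothetical cluster
print(single_link(left, right), complete_link(left, right), average_link(left, right))
```

For any two clusters the average linkage value lies between the single link and complete link values, which is why the three schemes can merge clusters in a different order.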
      X     Y
P1    0.40  0.53
P2    0.22  0.38
P3    0.35  0.32
P4    0.26  0.19
P5    0.08  0.41
P6    0.45  0.30
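The distance matrix for these points can be recomputed from the coordinates. The sketch below assumes Euclidean distance and reads the Y value of P2 as 0.38. Because the coordinates are themselves rounded, a recomputed entry may differ in the second decimal from a hand-written matrix (P1 to P2 comes out as about 0.2343, for example).

```python
from math import dist  # Python 3.8+

points = {
    "P1": (0.40, 0.53),
    "P2": (0.22, 0.38),  # assumes the Y value of P2 is 0.38
    "P3": (0.35, 0.32),
    "P4": (0.26, 0.19),
    "P5": (0.08, 0.41),
    "P6": (0.45, 0.30),
}
names = list(points)

# Lower-triangular distance matrix, one row per point.
matrix = {
    a: {b: dist(points[a], points[b]) for b in names[: i + 1]}
    for i, a in enumerate(names)
}

print("    " + "  ".join(f"{n:>4}" for n in names))
for a in names:
    print(f"{a:<4}" + "  ".join(f"{d:4.2f}" for d in matrix[a].values()))

# The closest pair of points is the first merge under any linkage scheme.
closest = min(
    ((a, b) for a in names for b in matrix[a] if a != b),
    key=lambda pair: matrix[pair[0]][pair[1]],
)
print("closest pair:", closest)
```

The closest pair found this way is P3 and P6, which is the first pair merged in the answer.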
Answer:
P1 P2 P3 P4 P5 P6
P1 P2 P3, P6 P4 P5
P1 0 0.24
P2 0.24 0
P1 0.00
P1 0
P4 0.37 0.15 0
P1 0
      X     Y
P1    0.40  0.53
P2    0.22  0.38
P3    0.35  0.32
P4    0.26  0.19
P5    0.08  0.41
P6    0.45  0.30
Answer:
P1 P2 P3 P4 P5 P6
P1 0.00
P2 0.24 0.00
P1 P2 P3, P6 P4 P5
P1 0.00
P2 0.24 0.00
P1 0.00
P1 0
P4 0.37 0.39 0
Question 3: Use the distance matrix to perform complete link hierarchical clustering. Show your
results by drawing a dendrogram. The dendrogram should clearly show the order in which points are
merged.
P1 P2 P3 P4 P5
Answer:
P1, P2 P3 P4 P5
P1, P2 0.00
P3 0.64 0.00
P1, P2 P3, P4 P5
P1, P2 0.00
P5 0.98 0.00
Question 4: Use the distance matrix to perform single link hierarchical clustering. Show
your results by drawing a dendrogram. The dendrogram should clearly show the order in which points
are merged.
P1 P2 P3 P4 P5
Answer:
P1 P2 P3 P4 P5
P1 0.00
P2 0.40 0.00
P1, P2 P3 P4 P5
P1, P2 0.00
P3 0.41 0.00
P3 0.41 0.00
P4 0.44 0.00
Question 5: Find the clusters using the single link technique and draw the dendrogram.
Answer:
P1 P2 P3 P4 P5 P6 P7
P1 0
P2 40.24 0
P3 25.19 15.52 0
P4 39 6.77 14.25 0
P1 P2 P3,P7 P4 P5 P6
P1 0
P2 40.24 0
P4 39 6.77 14.25 0
P1 0
P2,P5 40.24 0
P4 39 6.77 14.25 0
P2,P5 35.54 0 P1 0
      X     Y
1     4     4
2     8     4
3     15    8
4     24    4
5     24    12
Answer:
        1      2      3      4      5
1       0
2       4.0    0
3       11.7   8.1    0
4       20.0   16.0   9.8    0
5       21.5   17.9   9.8    8.0    0

        12     3      4      5
12      0
3       8.1    0
4       16.0   9.8    0
5       17.9   9.8    8.0    0

        12     3      45
12      0
3       8.1    0
45      16.0   9.8    0

        12,3   45
12,3    0
45      9.8    0
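The tables above can be checked with a short script. This is a sketch under two assumptions: distances are Euclidean, and the merged cluster keeps the smaller of its two parts' distances to every other cluster (single link), which is what the table entries are consistent with. The cluster labels are simply the point numbers joined together.

```python
from math import dist  # Python 3.8+

points = {"1": (4, 4), "2": (8, 4), "3": (15, 8), "4": (24, 4), "5": (24, 12)}

# Pairwise Euclidean distances keyed by an unordered pair of cluster labels.
d = {
    frozenset((a, b)): dist(points[a], points[b])
    for a in points for b in points if a < b
}

clusters = list(points)
merges = []
while len(clusters) > 1:
    # Pick the closest pair of current clusters.
    a, b = min(
        ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
        key=lambda pair: d[frozenset(pair)],
    )
    height = d[frozenset((a, b))]
    merged = "".join(sorted(a + b))          # e.g. "1" and "2" become "12"
    # Single link: the merged cluster is as close to c as its nearer part.
    for c in clusters:
        if c not in (a, b):
            d[frozenset((merged, c))] = min(d[frozenset((a, c))], d[frozenset((b, c))])
    clusters = [c for c in clusters if c not in (a, b)] + [merged]
    merges.append((a, b, round(height, 1)))

for a, b, height in merges:
    print(f"merge {a} and {b} at {height}")
```

The printed heights can be compared with the entries in the tables: each merge happens at the smallest value left in the current matrix.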
Answer:
        E      A      C      B      D
E       0
A       1      0
C       2      2      0
B       2      5      1      0
D       3      3      6      3      0

        EA     C      B      D
EA      0
C       2      0
B       2      1      0
D       3      6      3      0

        EA     BC     D
EA      0
BC      2      0
D       3      3      0

        EA,BC  D
EA,BC   0
D       3      0
Question 8: Use the distance matrix to perform complete linkage hierarchical clustering. Show
your results by drawing a dendrogram. The dendrogram should clearly show the order in which points
are merged.
Answer:
        E      A      C      B      D
E       0
A       1      0
C       2      2      0
B       2      5      1      0
D       3      3      6      3      0

        EA     C      B      D
EA      0
C       2      0
B       5      1      0
D       3      6      3      0

        EA     BC     D
EA      0
BC      5      0
D       3      6      0

Pair(EA,D) with distance = 3.

        EA,D   BC
EA,D    0
BC      6      0
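The only difference between the complete linkage tables above and the tables of the preceding answer (which are consistent with single linkage) is how the row of a merged cluster is filled in. The sketch below shows that one update step for the first merge, E with A. The helper name `merged_row` is hypothetical.

```python
# Distances from the original matrix that involve E or A.
to_E = {"C": 2, "B": 2, "D": 3}
to_A = {"C": 2, "B": 5, "D": 3}


def merged_row(row_a, row_b, combine):
    """Distances from the merged cluster to every other cluster."""
    return {other: combine(row_a[other], row_b[other]) for other in row_a}


print("single link  :", merged_row(to_E, to_A, min))
print("complete link:", merged_row(to_E, to_A, max))
```

With `min` the EA row is 2, 2, 3 and with `max` it is 2, 5, 3, which matches the EA columns of the two second tables.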
Question 9: Use the distance matrix to perform complete linkage hierarchical clustering. Show
your results by drawing a dendrogram. The dendrogram should clearly show the order in which points
are merged.
Answer:
        A      B      C      D      E      F
A       0
B       5      0
C       14     9      0
D       11     20     13     0
E       18     15     6      3      0
F       10     16     8      10     11     0

Pair(D,E).

        A      B      C      DE     F
A       0
B       5      0
C       14     9      0
DE      18     20     13     0
F       10     16     8      11     0

Pair(A,B).

        AB     C      DE     F
AB      0
C       14     0
DE      20     13     0
F       16     8      11     0

Pair(C,F).

        AB     CF     DE
AB      0
CF      16     0
DE      20     13     0

Pair(CF,DE).

        AB     CF,DE
AB      0
CF,DE   20     0
Question 10: Use the distance matrix to perform single linkage hierarchical clustering. Show your
results by drawing a dendrogram. The dendrogram should clearly show the order in which points are
merged.
Answer:
        A      B      C      D      E      F
A       0
B       5      0
C       14     9      0
D       11     20     13     0
E       18     15     6      3      0
F       10     16     8      10     11     0

Pair(D,E).

        A      B      C      DE     F
A       0
B       5      0
C       14     9      0
DE      11     15     6      0
F       10     16     8      10     0

Pair(A,B).

        AB     C      DE     F
AB      0
C       9      0
DE      11     6      0
F       10     8      10     0

Pair(C,DE).

        AB     C,DE   F
AB      0
C,DE    9      0
F       10     8      0

Pair((C,DE),F).

            AB     (C,DE),F
AB          0
(C,DE),F    9      0
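Since the question asks for a dendrogram, the merge order and merge heights can also be derived programmatically. The sketch below is one possible way to do it, not the method used in the answer: it stores the dendrogram as nested tuples and returns the height of each merge. The function name `linkage_tree` and its `combine` parameter are assumptions made for the example.

```python
labels = ["A", "B", "C", "D", "E", "F"]
rows = [            # lower triangle of the original distance matrix
    [0],
    [5, 0],
    [14, 9, 0],
    [11, 20, 13, 0],
    [18, 15, 6, 3, 0],
    [10, 16, 8, 10, 11, 0],
]
d = {
    frozenset((labels[i], labels[j])): rows[i][j]
    for i in range(len(labels)) for j in range(i)
}


def linkage_tree(d, labels, combine):
    """Return (nested-tuple dendrogram, list of merge heights).

    combine=min gives single linkage, combine=max gives complete linkage.
    """
    d = dict(d)                      # work on a copy
    clusters = list(labels)
    heights = []
    while len(clusters) > 1:
        # Closest pair of current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        heights.append(d[frozenset((a, b))])
        merged = (a, b)              # a subtree of the dendrogram
        for c in clusters:
            if c not in (a, b):
                d[frozenset((merged, c))] = combine(d[frozenset((a, c))], d[frozenset((b, c))])
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return clusters[0], heights


tree, heights = linkage_tree(d, labels, min)
print(tree)
print(heights)
```

Each nested pair in `tree` is one join in the dendrogram, and `heights` lists the distance at which each join is drawn, in merge order. Calling the function with `max` gives the complete linkage heights for the same matrix.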