DBSCAN Clustering
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a clustering
algorithm that can find clusters in noisy data. It groups points that lie in dense regions and
labels isolated points as noise, so it works even on datasets where K-Means fails to find
meaningful clusters.
The code snippet below creates clusters in data using DBSCAN.
The Silhouette Coefficient is calculated for each sample from the mean intra-cluster distance (a)
and the mean nearest-cluster distance (b). The Silhouette Coefficient for a sample is
(b - a) / max(a, b). To clarify, b is the mean distance between a sample and the points of the
nearest cluster that the sample is not a part of. Note that the Silhouette Coefficient is only
defined if the number of labels satisfies 2 <= n_labels <= n_samples - 1.
The best value of the Silhouette Coefficient is 1 and the worst value is -1. Values near 0 indicate
overlapping clusters. Negative values generally indicate that a sample has been assigned to
the wrong cluster, because a different cluster is more similar.
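To make the formula concrete, here is a minimal sketch that computes (b - a) / max(a, b) by hand and compares it with scikit-learn's `silhouette_samples` and `silhouette_score`. The tiny dataset, the labels, and the helper name `silhouette_for_sample` are made up for illustration and are not part of the original snippet.

```python
import numpy as np
from sklearn.metrics import silhouette_samples, silhouette_score

# Hypothetical toy data: two well-separated groups on a line
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

def silhouette_for_sample(X, labels, i):
    """Compute (b - a) / max(a, b) for sample i directly from the definition.

    Assumes the cluster of sample i has at least two members.
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    own = labels == labels[i]
    # a: mean distance to the other members of the sample's own cluster
    a = dists[own].sum() / (own.sum() - 1)
    # b: smallest mean distance to the members of any other cluster
    b = min(dists[labels == k].mean() for k in np.unique(labels) if k != labels[i])
    return (b - a) / max(a, b)

manual = np.array([silhouette_for_sample(X, labels, i) for i in range(len(X))])
print("manual per-sample values: ", manual)
print("sklearn per-sample values:", silhouette_samples(X, labels))
print("mean score:", silhouette_score(X, labels))
```

The overall score reported by `silhouette_score` is the mean of the per-sample values, which is why well-separated groups like these score close to 1.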
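# DBSCAN Clustering (X is assumed to be a 2-D NumPy array with at least two columns)
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
db = DBSCAN(eps=0.18, min_samples=2)
labels = db.fit_predict(X)  # cluster label per sample; -1 marks noise
# Plotting the clusters
plt.scatter(x=X[:, 0], y=X[:, 1], c=labels)
plt.show()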
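The snippet above does not show how X is created, so the following is a self-contained sketch under assumptions: it uses `make_moons` as stand-in data (not the original dataset), reuses the same `eps` and `min_samples`, and scores the result only when the number of labels satisfies the condition stated for the Silhouette Coefficient.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.metrics import silhouette_score

# Assumed stand-in data: two interleaving half-moons with a little noise
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

db = DBSCAN(eps=0.18, min_samples=2)
labels = db.fit_predict(X)  # -1 marks samples that DBSCAN treats as noise

n_clusters = len(set(labels) - {-1})
n_noise = int(np.sum(labels == -1))
print("clusters:", n_clusters, "noise points:", n_noise)

# The Silhouette Coefficient needs 2 <= n_labels <= n_samples - 1
score = None
if 1 < len(np.unique(labels)) < len(X):
    score = silhouette_score(X, labels)
    print("silhouette:", score)
```

Note that `silhouette_score` treats the noise label -1 as if it were a cluster of its own, so it may be worth filtering out noise points before scoring when there are many of them.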
Farukh Hashmi
https://thinkingneuron.com/
thinkingneuron@gmail.com
REBECCA V.
AUGUST 2, 2022 AT 6:02 PM
Hi! Thanks for the code snippet. Just a heads up it appears there may be a rendering error in
line 20:
if(len(np.unique(db.fit_predict(X)))>1):