Cosine metric for K-Means algorithm #31541
Closed
Reference Issues/PRs
Addresses #31450
What does this implement/fix? Explain your changes.
This PR adds support for an optional cosine distance metric in the K-means algorithm, allowing it to function as Spherical K-means when selected.
Changes:
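- New metric parameter, defaults to 'euclidean'.
- If metric is 'cosine', normalize both the input and the centroids to unit L2 norm. On unit vectors, squared Euclidean distance equals twice the cosine distance, so Euclidean assignments match cosine-distance assignments.
Any other comments?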
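The normalization argument from the Changes list can be checked with a small standalone sketch. This is not the code from this PR. The helper names (`l2_normalize`, `cosine_distance`, `squared_euclidean`, `nearest`) and the toy data are hypothetical, and only the standard library is used. It illustrates that, after L2-normalizing samples and centroids, nearest-centroid assignment under squared Euclidean distance agrees with assignment under cosine distance on the raw vectors.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cos(angle between a and b), computed on the raw vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids, dist):
    """Index of the centroid closest to `point` under the distance `dist`."""
    return min(range(len(centroids)), key=lambda k: dist(point, centroids[k]))


# Hypothetical toy data, chosen only for illustration.
points = [[3.0, 0.5], [0.2, 4.0], [10.0, 9.0]]
centroids = [[1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]

unit_points = [l2_normalize(p) for p in points]
unit_centroids = [l2_normalize(c) for c in centroids]

# Assignment with cosine distance on the raw vectors.
labels_cosine = [nearest(p, centroids, cosine_distance) for p in points]

# Assignment with squared Euclidean distance on the normalized vectors.
labels_euclid = [
    nearest(p, unit_centroids, squared_euclidean) for p in unit_points
]
```

The two label lists agree because, for unit vectors a and b, ||a - b||^2 = 2 * (1 - a·b). In a full Spherical K-means loop the centroids would presumably also need re-normalizing after each mean update, since the mean of unit vectors is generally not a unit vector. That detail is an assumption here, not something stated in the description.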
Relevant sources:
"... while the performance of the other four measures are quite similar": Similarity Measures for Text Document Clustering - 1942 citations