
sklearn.cluster.AgglomerativeClustering: Can we do without completing the matrix? 'UserWarning: the number of connected components of the connectivity matrix is *>1. Completing it to avoid stopping the tree early.' #5327

Open
@zkuncheva


I have tried this on both the latest release, 0.16.1, and the current development version of sklearn, '0.17.dev0', and the behaviour is the same in both.

I use sklearn.cluster.AgglomerativeClustering(affinity='precomputed', connectivity=Cmat, linkage='complete'),
where Cmat is a connectivity matrix that contains disconnected components.
As the source code indicates, this produces the warning (not an error): UserWarning: the number of connected components of the connectivity matrix is *>1. Completing it to avoid stopping the tree early.
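To make the condition concrete, here is a minimal pure-Python sketch that counts the connected components of a small connectivity matrix. The helper name component_labels and the 5-sample Cmat are made up for illustration and are not part of sklearn. On real (sparse) matrices, scipy.sparse.csgraph.connected_components does the same job.

```python
from collections import deque


def component_labels(conn):
    """Label the connected components of a symmetric 0/1 connectivity
    matrix given as nested lists (hypothetical helper, not sklearn API)."""
    n = len(conn)
    labels = [-1] * n          # -1 means "not visited yet"
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Breadth-first search from the first unvisited sample
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if conn[i][j] and labels[j] == -1:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Illustrative connectivity: samples 0-2 are linked, samples 3-4 are linked,
# and there is no edge between the two groups.
Cmat = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

labels = component_labels(Cmat)
n_components = max(labels) + 1
print(labels, n_components)
```

Any matrix for which n_components is greater than 1 is the kind of input that triggers the warning in my case.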

However, reading the source code, I see that at the point where the connectivity matrix is completed, the developers left a comment asking whether the clustering could be done without completing it:
"XXX: Can we do without completing the matrix?"

This is exactly the feature I am interested in. Is there a plan for sklearn to address this and make it possible to cluster without completing the matrix? I think it should not be too hard.
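In the meantime, one workaround I can imagine is to cluster each connected component on its own, so that no edges are ever added between components. Below is a rough sketch of the splitting step only, under the assumption that the component labels are already known. The function split_by_component and the toy matrices are hypothetical; this is not how sklearn behaves today.

```python
def split_by_component(dist, conn, labels):
    """Cut a precomputed distance matrix and a connectivity matrix into
    one (indices, sub_dist, sub_conn) block per connected component.

    Hypothetical helper for illustration; all inputs are nested lists.
    """
    # Group sample indices by their component label
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)

    blocks = []
    for lab in sorted(groups):
        idx = groups[lab]
        sub_dist = [[dist[i][j] for j in idx] for i in idx]
        sub_conn = [[conn[i][j] for j in idx] for i in idx]
        blocks.append((idx, sub_dist, sub_conn))
    return blocks


# Toy data: 4 samples in two components, {0, 1} and {2, 3}
dist = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 2.0],
    [6.0, 5.0, 2.0, 0.0],
]
conn = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
labels = [0, 0, 1, 1]

blocks = split_by_component(dist, conn, labels)
for idx, sub_dist, sub_conn in blocks:
    print(idx, sub_dist)
```

Each block could then be passed to its own AgglomerativeClustering call, at the cost of having to decide how many clusters to ask for per component. Having this handled inside sklearn would be much cleaner.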

Best,
Zhana
