sklearn.cluster.KMeans in 0.23 is much slower than in 0.22.2 #17230

@MichalRIcar

Description

Used code:

from sklearn import cluster

# X is the dataset being clustered (not included in this report)
for k in range(1, 15):
    cluster.KMeans(
        n_clusters=k,
        random_state=42,
        n_init=10,
        max_iter=2000,
        algorithm='full',
        init='k-means++',
    ).fit(X)

Expected Results

In v0.22.2, the computation finished in about 2 minutes for the whole range of explored k values.

Actual Results

In v0.23.0, the computation takes more than 20 minutes with exactly the same data and setup as before.
Even k=1 takes a very long time. In the previous version, a lower k meant a much faster computation.
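To help narrow this down, the per-k fit time can be measured separately under each installed version. The following is a minimal sketch, not the original benchmark: the data is a random 2000 x 10 array standing in for the real dataset (which is not part of this report), and `time_fits`, `make_kmeans`, and `demo` are hypothetical helper names introduced only for illustration.

```python
import time


def time_fits(make_estimator, X, ks):
    """Fit make_estimator(k) on X for each k and return {k: seconds}."""
    timings = {}
    for k in ks:
        estimator = make_estimator(k)
        start = time.perf_counter()
        estimator.fit(X)
        timings[k] = time.perf_counter() - start
    return timings


def make_kmeans(k):
    """Build KMeans with the same settings as in the report."""
    from sklearn.cluster import KMeans  # imported lazily, only when needed

    # algorithm='full' is the spelling used in 0.22/0.23; later releases
    # renamed this option to 'lloyd'.
    return KMeans(
        n_clusters=k,
        random_state=42,
        n_init=10,
        max_iter=2000,
        algorithm="full",
        init="k-means++",
    )


def demo():
    """Print per-k timings on synthetic data (an assumption, not the real data)."""
    import numpy as np
    import sklearn

    rng = np.random.RandomState(0)
    X = rng.rand(2000, 10)  # synthetic stand-in for the unpublished dataset

    print("sklearn", sklearn.__version__)
    for k, seconds in time_fits(make_kmeans, X, range(1, 15)).items():
        print(f"k={k:2d}  {seconds:.3f} s")
```

Calling `demo()` once in an environment with 0.22.2 and once with 0.23.0 would show whether the slowdown is roughly constant per fit or grows with k. Timing each k on its own, rather than the whole loop, is what makes the k=1 observation easy to check.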

Versions

System:
python: 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\micha\anaconda3\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.0.2
setuptools: 45.2.0.post20200210
sklearn: 0.23.0
numpy: 1.18.1
scipy: 1.4.1
Cython: 0.29.15
pandas: 1.0.3
matplotlib: 3.1.3
joblib: 0.14.1

Built with OpenMP: True
