[MRG] new K-means implementation for improved performances #11950

Merged
merged 187 commits into from
Feb 20, 2020
Commits
7005c0c
ENH New implementation of K-means using chunks, speed improvement and…
jeremiedbb Aug 30, 2018
7966dd0
elkan center_half_distance init to 0 & out center_shift
jeremiedbb Oct 22, 2018
97fcf1f
out center_shift & numpy computations on pairwise_distances
jeremiedbb Oct 22, 2018
78a167d
comment
jeremiedbb Oct 22, 2018
35fd78e
error message minibatchkmeans partial_fit different number of features
jeremiedbb Oct 22, 2018
6dae806
drop python 2 CI
jeremiedbb Oct 22, 2018
f5c0aa1
refactor center_shift computation
jeremiedbb Oct 22, 2018
a1c1fac
docstring
jeremiedbb Oct 22, 2018
8df3b1e
fix center_shift
jeremiedbb Oct 22, 2018
8f111c7
update tests
jeremiedbb Oct 22, 2018
e8be354
range consistency
jeremiedbb Oct 25, 2018
0bcc1f1
cos
jeremiedbb Oct 25, 2018
107290e
fix algorithm check
jeremiedbb Oct 29, 2018
aac2350
typo
jeremiedbb Oct 29, 2018
8e432be
deprecation precompute in tests
jeremiedbb Oct 29, 2018
4531fc6
use libc FLT_MAX
jeremiedbb Oct 31, 2018
52d8aba
setup unlik cblas
jeremiedbb Oct 31, 2018
ff35b29
remove unecessary blas stuff from setup
jeremiedbb Nov 29, 2018
286aed4
Add _clibs module to limit number of threads for C-libs
jeremiedbb Nov 29, 2018
e720abe
fix merge conflict
jeremiedbb Nov 29, 2018
5d82b8d
fix import deprecated
jeremiedbb Nov 29, 2018
2368f70
try to fix clib tests ??
jeremiedbb Nov 29, 2018
4d960a3
doesn't work... revert
jeremiedbb Nov 29, 2018
afc306b
add header for _k_means to export cdef funcs
jeremiedbb Nov 30, 2018
9ffb9ca
calloc instead of malloc
jeremiedbb Dec 5, 2018
aced525
tst build
jeremiedbb Dec 13, 2018
497f899
add get_openblas_version to clibs and skip tests with old openblas
jeremiedbb Dec 14, 2018
6fbe8b1
cython directive language_level
jeremiedbb Dec 14, 2018
bcb727e
fix merge conflicts
jeremiedbb Dec 14, 2018
e4c159c
fix merge conflicts
jeremiedbb Dec 14, 2018
684ea4e
thread limit context manager
jeremiedbb Dec 14, 2018
67900ad
skip openblas
jeremiedbb Dec 14, 2018
c1d262f
new line end of file
jeremiedbb Dec 17, 2018
ed308b7
merge master CI
jeremiedbb Dec 21, 2018
4b76694
merge master CI
jeremiedbb Dec 21, 2018
e77ac24
tst clang version
jeremiedbb Dec 21, 2018
9ccc725
same
jeremiedbb Dec 21, 2018
9a03162
add llvm-openmp to travis
jeremiedbb Dec 21, 2018
215be94
appveyor codecov
jeremiedbb Jan 14, 2019
bec9079
openmp flags
jeremiedbb Jan 16, 2019
43d3fba
openmp flags
jeremiedbb Jan 16, 2019
2c61378
openmp flags
jeremiedbb Jan 18, 2019
7679a9f
fix conflicts
jeremiedbb Jan 18, 2019
f916091
ompenmp
jeremiedbb Jan 18, 2019
4c2da0c
no need
jeremiedbb Jan 25, 2019
a4383fb
flake8
jeremiedbb Jan 31, 2019
cf25383
force init order
jeremiedbb Jan 31, 2019
9868632
remove forced X order
jeremiedbb Jan 31, 2019
7326362
same
jeremiedbb Jan 31, 2019
212ae77
same
jeremiedbb Jan 31, 2019
3c2ad09
Merge branch 'master' into kmeans-perf
jeremiedbb Feb 3, 2019
e9a4cee
directly use _cython_blas
jeremiedbb Feb 3, 2019
41ea6df
ensure order='C' even if copy_x = false
jeremiedbb Feb 3, 2019
745c756
remove unnecessary condition
jeremiedbb Feb 3, 2019
d4d0eea
merge master
jeremiedbb Feb 4, 2019
930be82
merge master
jeremiedbb Feb 4, 2019
084db44
copy_x docstring
jeremiedbb Feb 6, 2019
55a6563
refactor, use memviews more, add sparse elkan
jeremiedbb Feb 12, 2019
8461712
refactor, use memviews more, add sparse elkan
jeremiedbb Feb 12, 2019
9ed4436
docstrings
jeremiedbb Feb 21, 2019
df55e4c
merge master
jeremiedbb Feb 21, 2019
4d93fa5
nitpick
jeremiedbb Feb 21, 2019
31a3052
fix euclean_sparse_dense
jeremiedbb Feb 22, 2019
621661e
fix euclidean sparse dense
jeremiedbb Feb 22, 2019
dda6527
fix relocate empty cluster
jeremiedbb Feb 26, 2019
a48504a
fix relocate empty clusters
jeremiedbb Feb 26, 2019
eb09a06
lint...
jeremiedbb Feb 26, 2019
b723633
Merge branch 'master' into kmeans-perf
jeremiedbb Feb 26, 2019
014956d
tst azure openmp
jeremiedbb Feb 26, 2019
ec74a76
tst openmp
jeremiedbb Feb 26, 2019
5485c96
same
jeremiedbb Feb 26, 2019
8a07a32
adress comments & improve docstrings
jeremiedbb Feb 28, 2019
40de5b3
fix
jeremiedbb Mar 14, 2019
a31158b
merge master
jeremiedbb Mar 14, 2019
0aaee58
revert last changes: bad scalabilty
jeremiedbb Jun 24, 2019
34cd11e
revert last changes: bad scalabbility (continued)
jeremiedbb Jun 24, 2019
0a78fc4
merge master
jeremiedbb Jun 24, 2019
6c13a7d
merge master
jeremiedbb Jun 24, 2019
d8439fd
openmp helper equivalent of effective_n_jobs
jeremiedbb Jun 26, 2019
b8900ab
protect openmp calls
jeremiedbb Jun 26, 2019
e47bdb8
comment openmp max threads
jeremiedbb Jun 26, 2019
8050149
right place comment
jeremiedbb Jun 26, 2019
7532722
avoid copy centers_old <-> centers_new
jeremiedbb Jun 27, 2019
54f8146
avoid copy centers_old <-> centers_new
jeremiedbb Jun 27, 2019
280f551
don't import joblib if unecessary
jeremiedbb Jul 5, 2019
8f5ebfd
Update sklearn/utils/openmp_helpers.pyx
jeremiedbb Aug 4, 2019
edebabf
Update sklearn/utils/openmp_helpers.pyx
jeremiedbb Aug 9, 2019
4e99452
Update sklearn/utils/openmp_helpers.pyx
jeremiedbb Aug 9, 2019
0a95450
vendor threadpoolctl
jeremiedbb Sep 13, 2019
2cbc706
merge master
jeremiedbb Sep 16, 2019
f0198e4
merge master
jeremiedbb Sep 16, 2019
3af82ba
Merge branch 'vendor-threadpoolctl' into kmeans-perf
jeremiedbb Sep 16, 2019
6cb945b
remove _clibs
jeremiedbb Sep 16, 2019
6dd4525
fix merge mistakes
jeremiedbb Sep 16, 2019
0278f67
cln
jeremiedbb Sep 16, 2019
aa8eeba
revert appveyor modifs
jeremiedbb Sep 17, 2019
2881065
Merge branch 'master' into kmeans-perf
jeremiedbb Sep 17, 2019
f1231f5
improve docstring
jeremiedbb Sep 17, 2019
f09aa4b
Merge branch 'master' into openmp-effective-njobs
jeremiedbb Sep 17, 2019
f23ccbb
test deprecated precompute distance
jeremiedbb Sep 18, 2019
e2dd616
test elkan + 1 cluster warning
jeremiedbb Sep 18, 2019
4e2ff78
test error wrong algo
jeremiedbb Sep 18, 2019
b9af0a6
Make it explicit that LOKY_MAX_CPU_COUNT can impact _openmp_effective…
ogrisel Sep 18, 2019
ed148bb
Merge branch 'openmp-effective-njobs' into kmeans-perf
ogrisel Sep 18, 2019
d9ea936
Use _openmp_effective_n_threads in KMeans.fit
ogrisel Sep 18, 2019
851b05f
cln
jeremiedbb Sep 19, 2019
ca8584f
Merge branch 'kmeans-perf' of github.com:jeremiedbb/scikit-learn into…
jeremiedbb Sep 19, 2019
de02372
cln
jeremiedbb Sep 19, 2019
2fec37b
Merge branch 'vendor-threadpoolctl' into kmeans-perf
jeremiedbb Sep 19, 2019
09f9423
merge master
jeremiedbb Dec 30, 2019
830de25
cln
jeremiedbb Dec 30, 2019
b10b927
cln
jeremiedbb Dec 30, 2019
1788435
Merge branch 'master' into kmeans-perf
jeremiedbb Dec 31, 2019
daba537
cln
jeremiedbb Dec 31, 2019
df0cedb
cln
jeremiedbb Dec 31, 2019
ff37713
cln
jeremiedbb Dec 31, 2019
b59e16b
fix docstring example
jeremiedbb Dec 31, 2019
8170ea1
Merge remote-tracking branch 'origin/master' into pr/jeremiedbb/11950
glemaitre Jan 10, 2020
11bc395
skip last E step when hard convergence
jeremiedbb Jan 14, 2020
08d29aa
Elkan
jeremiedbb Jan 14, 2020
b21cc8e
improve docstring update_centers param
jeremiedbb Jan 14, 2020
d57a870
n_threads
jeremiedbb Jan 14, 2020
dd0878e
Merge branch 'kmeans-perf' of github.com:jeremiedbb/scikit-learn into…
jeremiedbb Jan 14, 2020
99ad111
lint
jeremiedbb Jan 14, 2020
a735b8b
comment on X pointer
jeremiedbb Jan 16, 2020
a46b708
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Jan 16, 2020
3ffd301
docstrings
jeremiedbb Jan 16, 2020
41325cc
docstring
jeremiedbb Jan 16, 2020
1880572
test relocate empty clusters helper
jeremiedbb Jan 16, 2020
1eb46b8
add test for 1 kmeans iteration
jeremiedbb Jan 16, 2020
463fcad
cln
jeremiedbb Jan 16, 2020
3ee1cd6
comment on "auto" for algorithm param
jeremiedbb Jan 16, 2020
24f5baf
same
jeremiedbb Jan 16, 2020
087ce55
typo
jeremiedbb Jan 16, 2020
59e0673
spacing
jeremiedbb Jan 16, 2020
869d9da
comment elkan extra memory
jeremiedbb Jan 16, 2020
51c2fea
address comments
jeremiedbb Jan 16, 2020
54a3f91
address comments
jeremiedbb Jan 16, 2020
daab347
same
jeremiedbb Jan 16, 2020
6ac05a0
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Jan 17, 2020
7d916cb
fast tol if 0
jeremiedbb Jan 17, 2020
6f048b7
pep8
jeremiedbb Jan 20, 2020
89d960c
pep8
jeremiedbb Jan 20, 2020
9ae33cd
remove threadpoolctl from externals -> dependency
jeremiedbb Jan 24, 2020
acc66c2
same
jeremiedbb Jan 24, 2020
6257eed
fix
jeremiedbb Jan 24, 2020
60acd73
install threadpoolctl in ci
jeremiedbb Jan 27, 2020
d19ab9c
iter
jeremiedbb Jan 27, 2020
45bc797
iter
jeremiedbb Jan 27, 2020
5b645e6
iter
jeremiedbb Jan 27, 2020
d52572c
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Feb 6, 2020
62337d6
deprecated precompute_distances has no effect
jeremiedbb Feb 6, 2020
0a5655d
comment gemm
jeremiedbb Feb 6, 2020
8493292
fix tests
jeremiedbb Feb 6, 2020
b2a41f2
deprecate n_jobs
jeremiedbb Feb 10, 2020
ddf1584
same
jeremiedbb Feb 10, 2020
bbd73c9
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Feb 10, 2020
5999941
tolerance takes rounding errors into account
jeremiedbb Feb 11, 2020
2f329dc
cln test deprecated n_jobs
jeremiedbb Feb 11, 2020
c8050e6
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Feb 11, 2020
afd176d
deprecate n_jobs for bicluster
jeremiedbb Feb 11, 2020
b7ae100
tol=0, change test, advised against in docstring
jeremiedbb Feb 13, 2020
d90b5e3
pass n_jobs to kmeans in bicluster
jeremiedbb Feb 14, 2020
c7687d4
remove outdated test kmeans++ with 2 jobs
jeremiedbb Feb 14, 2020
9b4d4d4
cln
jeremiedbb Feb 14, 2020
7c8d0ee
update comment of test 1 thread vs 2 threads
jeremiedbb Feb 14, 2020
bb34213
improve comment of test_k_means_1_iteration
jeremiedbb Feb 14, 2020
b69bd72
don't use is to compare dtypes
jeremiedbb Feb 14, 2020
9833a99
address review comments
jeremiedbb Feb 14, 2020
877a991
cln
jeremiedbb Feb 14, 2020
fd20130
cln
jeremiedbb Feb 18, 2020
94ff4f5
don't accept large sparse
jeremiedbb Feb 18, 2020
86ef932
avoid multiple indirect memory access
jeremiedbb Feb 18, 2020
830b14a
explicit squared kwarg
jeremiedbb Feb 18, 2020
b7e7cce
reword structured data
jeremiedbb Feb 18, 2020
3381c1b
reword structured data
jeremiedbb Feb 18, 2020
b62e3c0
what's new
jeremiedbb Feb 18, 2020
f7a63bc
revert squared kwarg
jeremiedbb Feb 18, 2020
74993a6
parallelism in user guide
jeremiedbb Feb 19, 2020
a1a324e
mention openmp in what's new
jeremiedbb Feb 19, 2020
9ad6fac
comment replace ndarray by memview when cython 0.3
jeremiedbb Feb 19, 2020
d3ac803
format params in docstrings
jeremiedbb Feb 19, 2020
56103d0
in out inout
jeremiedbb Feb 19, 2020
be97ff3
format docstring params part 2
jeremiedbb Feb 19, 2020
ecf6ecd
format docstring params part 3
jeremiedbb Feb 19, 2020
2099044
Merge remote-tracking branch 'upstream/master' into kmeans-perf
jeremiedbb Feb 20, 2020
9c21272
add more tests for private helpers
jeremiedbb Feb 20, 2020
6 changes: 6 additions & 0 deletions azure-pipelines.yml
@@ -39,6 +39,7 @@ jobs:
PILLOW_VERSION: '*'
PYTEST_VERSION: '*'
JOBLIB_VERSION: '*'
+ THREADPOOLCTL_VERSION: '2.0.0'
COVERAGE: 'true'

- template: build_tools/azure/posix.yml
@@ -54,6 +55,7 @@ jobs:
DISTRIB: 'ubuntu'
PYTHON_VERSION: '3.6'
JOBLIB_VERSION: '0.11'
+ THREADPOOLCTL_VERSION: '2.0.0'
# Linux + Python 3.6 build with OpenBLAS and without SITE_JOBLIB
py36_conda_openblas:
DISTRIB: 'conda'
@@ -70,6 +72,7 @@ jobs:
SCIKIT_IMAGE_VERSION: '*'
# latest version of joblib available in conda for Python 3.6
JOBLIB_VERSION: '0.13.2'
+ THREADPOOLCTL_VERSION: '2.0.0'
COVERAGE: 'true'
# Linux environment to test the latest available dependencies and MKL.
# It runs tests requiring lightgbm, pandas and PyAMG.
@@ -92,6 +95,7 @@ jobs:
DISTRIB: 'ubuntu-32'
PYTHON_VERSION: '3.6'
JOBLIB_VERSION: '0.13'
+ THREADPOOLCTL_VERSION: '2.0.0'

- template: build_tools/azure/posix.yml
parameters:
@@ -109,6 +113,7 @@ jobs:
PILLOW_VERSION: '*'
PYTEST_VERSION: '*'
JOBLIB_VERSION: '*'
+ THREADPOOLCTL_VERSION: '2.0.0'
COVERAGE: 'true'
pylatest_conda_mkl_no_openmp:
DISTRIB: 'conda'
@@ -120,6 +125,7 @@ jobs:
PILLOW_VERSION: '*'
PYTEST_VERSION: '*'
JOBLIB_VERSION: '*'
+ THREADPOOLCTL_VERSION: '2.0.0'
COVERAGE: 'true'
SKLEARN_TEST_NO_OPENMP: 'true'
SKLEARN_SKIP_OPENMP_TEST: 'true'
4 changes: 3 additions & 1 deletion build_tools/azure/install.cmd
@@ -15,14 +15,16 @@ IF "%PYTHON_ARCH%"=="64" (

call activate %VIRTUALENV%

+ pip install threadpoolctl

IF "%PYTEST_VERSION%"=="*" (
pip install pytest
) else (
pip install pytest==%PYTEST_VERSION%
)
pip install pytest-xdist
) else (
- pip install numpy scipy cython pytest wheel pillow joblib
+ pip install numpy scipy cython pytest wheel pillow joblib threadpoolctl
)
if "%COVERAGE%" == "true" (
pip install coverage codecov pytest-cov
6 changes: 4 additions & 2 deletions build_tools/azure/install.sh
@@ -65,6 +65,8 @@ if [[ "$DISTRIB" == "conda" ]]; then

make_conda $TO_INSTALL

+ pip install threadpoolctl==$THREADPOOLCTL_VERSION

if [[ "$PYTEST_VERSION" == "*" ]]; then
python -m pip install pytest
else
@@ -81,13 +83,13 @@ elif [[ "$DISTRIB" == "ubuntu" ]]; then
sudo apt-get install python3-scipy python3-matplotlib libatlas3-base libatlas-base-dev python3-virtualenv
python3 -m virtualenv --system-site-packages --python=python3 $VIRTUALENV
source $VIRTUALENV/bin/activate
- python -m pip install pytest==$PYTEST_VERSION pytest-cov cython joblib==$JOBLIB_VERSION
+ python -m pip install pytest==$PYTEST_VERSION pytest-cov cython joblib==$JOBLIB_VERSION threadpoolctl==$THREADPOOLCTL_VERSION
elif [[ "$DISTRIB" == "ubuntu-32" ]]; then
apt-get update
apt-get install -y python3-dev python3-scipy python3-matplotlib libatlas3-base libatlas-base-dev python3-virtualenv
python3 -m virtualenv --system-site-packages --python=python3 $VIRTUALENV
source $VIRTUALENV/bin/activate
- python -m pip install pytest==$PYTEST_VERSION pytest-cov cython joblib==$JOBLIB_VERSION
+ python -m pip install pytest==$PYTEST_VERSION pytest-cov cython joblib==$JOBLIB_VERSION threadpoolctl==$THREADPOOLCTL_VERSION
elif [[ "$DISTRIB" == "conda-pip-latest" ]]; then
# Since conda main channel usually lacks behind on the latest releases,
# we use pypi to test against the latest releases of the dependencies.
1 change: 1 addition & 0 deletions build_tools/azure/posix-32.yml
@@ -36,6 +36,7 @@ jobs:
-e JUNITXML=$JUNITXML
-e VIRTUALENV=testvenv
-e JOBLIB_VERSION=$JOBLIB_VERSION
+ -e THREADPOOLCTL_VERSION=$THREADPOOLCTL_VERSION
-e PYTEST_VERSION=$PYTEST_VERSION
-e OMP_NUM_THREADS=$OMP_NUM_THREADS
-e OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS
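The `-e` flags in this hunk forward variables such as `OMP_NUM_THREADS` into the test environment. As a rough illustration of the same idea outside CI, the sketch below starts a child Python process with `OMP_NUM_THREADS` set and reports the value the child sees. The helper name `child_thread_limit` is hypothetical and not part of this PR; whether a given library honours the variable depends on how it was built.

```python
import os
import subprocess
import sys


def child_thread_limit(n_threads):
    """Start a child interpreter with OMP_NUM_THREADS set; return what it sees.

    Hypothetical helper for illustration only (not part of scikit-learn).
    """
    # Copy the current environment and override a single variable.
    env = dict(os.environ, OMP_NUM_THREADS=str(n_threads))
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['OMP_NUM_THREADS'])"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The variable has to be in place before the OpenMP runtime is loaded, which is why it is set on the child's environment rather than changed after start-up.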
13 changes: 7 additions & 6 deletions doc/modules/clustering.rst
@@ -205,12 +205,13 @@ computing cluster centers and values of inertia. For example, assigning a
weight of 2 to a sample is equivalent to adding a duplicate of that sample
to the dataset :math:`X`.

- A parameter can be given to allow K-means to be run in parallel, called
- ``n_jobs``. Giving this parameter a positive value uses that many processors
- (default: 1). A value of -1 uses all available processors, with -2 using one
- less, and so on. Parallelization generally speeds up computation at the cost of
- memory (in this case, multiple copies of centroids need to be stored, one for
- each job).
+ Low-level parallelism
+ ---------------------

+ :class:`KMeans` benefits from OpenMP based parallelism through Cython. Small
+ chunks of data (256 samples) are processed in parallel, which in addition
+ yields a low memory footprint. For more details on how to control the number of
+ threads, please refer to our :ref:`parallelism` notes.

.. warning::
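To make "chunks of data processed in parallel" concrete, here is a minimal pure-Python sketch of the k-means assignment step split into chunks of 256 samples. It is not the code from this PR: the real implementation is written in Cython with OpenMP, and the names `assign_chunk` and `assign_labels` are hypothetical. Python threads are used only to show the structure and will not reproduce the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 256  # chunk size quoted in the user guide text above


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_chunk(chunk, centers):
    """Return the index of the nearest center for each sample in the chunk."""
    return [
        min(range(len(centers)),
            key=lambda j: squared_distance(sample, centers[j]))
        for sample in chunk
    ]


def assign_labels(X, centers, n_threads=2, chunk_size=CHUNK_SIZE):
    """Assignment step of k-means, parallelised over chunks of samples."""
    # Each worker only needs its own chunk plus the shared centers, so the
    # extra memory per worker stays small.
    chunks = [X[i:i + chunk_size] for i in range(0, len(X), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() preserves chunk order, so labels line up with the rows of X.
        per_chunk = pool.map(lambda c: assign_chunk(c, centers), chunks)
        return [label for labels in per_chunk for label in labels]
```

Splitting over samples rather than over initializations means a single run can use several threads, which is the scalability point made in the changelog entry for this PR.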
20 changes: 20 additions & 0 deletions doc/whats_new/v0.23.rst
@@ -64,6 +64,26 @@ Changelog
could not have a `np.int64` type. :pr:`16484`
by :user:`Jeremie du Boisberranger <jeremiedbb>`.

+ - |API| The ``n_jobs`` parameter of :class:`cluster.KMeans`,
+ :class:`cluster.SpectralCoclustering` and
+ :class:`cluster.SpectralBiclustering` is deprecated. They now use OpenMP
+ based parallelism. For more details on how to control the number of threads,
+ please refer to our :ref:`parallelism` notes. :pr:`11950` by
+ :user:`Jeremie du Boisberranger <jeremiedbb>`.

Review comment (Member):

side note, if you add yourself to doc/whats_new/_contributors.rst you can use the short version

+ - |API| The ``precompute_distances`` parameter of :class:`cluster.KMeans` is
+ deprecated. It has no effect. :pr:`11950` by
+ :user:`Jeremie du Boisberranger <jeremiedbb>`.

+ - |Efficiency| The critical parts of :class:`cluster.KMeans` have a more
+ optimized implementation. Parallelism is now over the data instead of over
+ initializations allowing better scalability. :pr:`11950` by
+ :user:`Jeremie du Boisberranger <jeremiedbb>`.

+ - |Enhancement| :class:`cluster.KMeans` now supports sparse data when
+ `solver = "elkan"`. :pr:`11950` by
+ :user:`Jeremie du Boisberranger <jeremiedbb>`.

:mod:`sklearn.compose`
......................

4 changes: 3 additions & 1 deletion setup.py
@@ -56,6 +56,7 @@
NUMPY_MIN_VERSION = '1.13.3'

JOBLIB_MIN_VERSION = '0.11'
+ THREADPOOLCTL_MIN_VERSION = '2.0.0'

# Optional setuptools features
# We need to import setuptools early, if we want setuptools features,
@@ -257,7 +258,8 @@ def setup_package():
install_requires=[
'numpy>={}'.format(NUMPY_MIN_VERSION),
'scipy>={}'.format(SCIPY_MIN_VERSION),
- 'joblib>={}'.format(JOBLIB_MIN_VERSION)
+ 'joblib>={}'.format(JOBLIB_MIN_VERSION),
+ 'threadpoolctl>={}'.format(THREADPOOLCTL_MIN_VERSION)

Review comment (Member):

Wondering if we should add a note in the miscellaneous section of the what's new to indicate the new dependency

Review comment (Member):

technically this requires a SLEP (as per our governance doc) but according to #16242 this is a no brainer so let's just do it I think

],
package_data={'': ['*.pxd']},
**extra_setuptools_args)
19 changes: 16 additions & 3 deletions sklearn/cluster/_bicluster.py
@@ -3,6 +3,7 @@
# License: BSD 3 clause

from abc import ABCMeta, abstractmethod
+ import warnings

import numpy as np

@@ -88,7 +89,7 @@ class BaseSpectral(BiclusterMixin, BaseEstimator, metaclass=ABCMeta):
@abstractmethod
def __init__(self, n_clusters=3, svd_method="randomized",
n_svd_vecs=None, mini_batch=False, init="k-means++",
- n_init=10, n_jobs=None, random_state=None):
+ n_init=10, n_jobs='deprecated', random_state=None):
self.n_clusters = n_clusters
self.svd_method = svd_method
self.n_svd_vecs = n_svd_vecs
@@ -115,6 +116,10 @@ def fit(self, X, y=None):
y : Ignored

"""
+ if self.n_jobs != 'deprecated':
+ warnings.warn("'n_jobs' was deprecated in version 0.23 and will be"
+ " removed in 0.25.", FutureWarning)

X = check_array(X, accept_sparse='csr', dtype=np.float64)
self._check_parameters()
self._fit(X)
@@ -233,6 +238,10 @@ class SpectralCoclustering(BaseSpectral):
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
for more details.

+ .. deprecated:: 0.23
+ ``n_jobs`` was deprecated in version 0.23 and will be removed in
+ 0.25.

random_state : int, RandomState instance, default=None
Used for randomizing the singular value decomposition and the k-means
initialization. Use an int to make the randomness deterministic.
@@ -277,7 +286,7 @@ class SpectralCoclustering(BaseSpectral):
"""
def __init__(self, n_clusters=3, svd_method='randomized',
n_svd_vecs=None, mini_batch=False, init='k-means++',
- n_init=10, n_jobs=None, random_state=None):
+ n_init=10, n_jobs='deprecated', random_state=None):
super().__init__(n_clusters,
svd_method,
n_svd_vecs,
@@ -380,6 +389,10 @@ class SpectralBiclustering(BaseSpectral):
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
for more details.

+ .. deprecated:: 0.23
+ ``n_jobs`` was deprecated in version 0.23 and will be removed in
+ 0.25.

random_state : int, RandomState instance, default=None
Used for randomizing the singular value decomposition and the k-means
initialization. Use an int to make the randomness deterministic.
@@ -425,7 +438,7 @@ class SpectralBiclustering(BaseSpectral):
def __init__(self, n_clusters=3, method='bistochastic',
n_components=6, n_best=3, svd_method='randomized',
n_svd_vecs=None, mini_batch=False, init='k-means++',
- n_init=10, n_jobs=None, random_state=None):
+ n_init=10, n_jobs='deprecated', random_state=None):
super().__init__(n_clusters,
svd_method,
n_svd_vecs,
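The `_bicluster.py` hunks deprecate `n_jobs` by changing its default to the string `'deprecated'` and warning in `fit` whenever any other value is seen. The sketch below isolates that pattern with the indentation restored. `Estimator` is a hypothetical stand-in class, not one of the scikit-learn classes touched by this PR; only the warning message and the sentinel value are taken from the diff.

```python
import warnings


class Estimator:
    """Hypothetical estimator, used only to illustrate the sentinel pattern."""

    def __init__(self, n_clusters=3, n_jobs='deprecated'):
        # __init__ only stores parameters; the check is deferred to fit().
        self.n_clusters = n_clusters
        self.n_jobs = n_jobs

    def fit(self, X):
        # Any value other than the sentinel means the caller passed n_jobs
        # explicitly, so the deprecation warning is emitted.
        if self.n_jobs != 'deprecated':
            warnings.warn("'n_jobs' was deprecated in version 0.23 and will be"
                          " removed in 0.25.", FutureWarning)
        return self
```

A string sentinel is used instead of `None` so that an explicit `n_jobs=None`, the previous default, can be told apart from the argument being omitted; with this sketch, passing `n_jobs=None` also triggers the warning.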