Loosened to `dist <= stop_thresh` to converge in on 1D constant data #28951

akikuno · 2024-05-05T07:49:59Z

Reference Issues/PRs

Discussed in #28926.

What does this implement/fix? Explain your changes.

As @ogrisel suggested, I implemented the condition `dist <= stop_thresh` in the `_mean_shift_single_seed` function to address the issue of `MeanShift` failing to converge on 1D constant data within 300 iterations.
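For readers following along, here is a minimal NumPy sketch of why the strict comparison never fires on constant data. It is not the scikit-learn implementation: `shift_single_seed` and its `strict` flag are hypothetical names, the flat kernel is a simplification, and the threshold scaling `1e-3 * bandwidth` is an assumption. The point is only that when every sample is identical, both the estimated bandwidth and the shift distance are 0, so `0 < 0` is false while `0 <= 0` is true.

```python
import numpy as np


def shift_single_seed(seed, X, bandwidth, max_iter=300, strict=False):
    """Toy flat-kernel mean shift for one seed; returns (mean, n_iterations)."""
    # Assumption: the stop threshold is proportional to the bandwidth.
    stop_thresh = 1e-3 * bandwidth
    mean = np.asarray(seed, dtype=float)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Keep the points that lie within the bandwidth of the current mean.
        within = X[np.linalg.norm(X - mean, axis=1) <= bandwidth]
        if len(within) == 0:
            break
        new_mean = within.mean(axis=0)
        dist = np.linalg.norm(new_mean - mean)
        mean = new_mean
        # strict=True mimics `dist < stop_thresh`; strict=False mimics `<=`.
        converged = dist < stop_thresh if strict else dist <= stop_thresh
        if converged:
            break
    return mean, n_iter


X = np.ones((100, 1))  # 1D constant data
bandwidth = 0.0        # assumed estimate when all pairwise distances are zero
_, n_strict = shift_single_seed(X[0], X, bandwidth, strict=True)
_, n_loose = shift_single_seed(X[0], X, bandwidth, strict=False)
```

In this sketch the strict variant runs until `max_iter`, while the loosened variant stops after the first iteration.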

Any other comments?

…cikit-learn#28926

github-actions · 2024-05-05T07:51:09Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d8e3889. Link to the linter CI: here

jeremiedbb · 2024-05-06T09:06:11Z

Thanks for the PR @akikuno. Please add a non-regression test in `test_mean_shift.py` and an entry in the `v1.5.rst` changelog.

ogrisel · 2024-05-06T16:31:47Z

sklearn/cluster/tests/test_mean_shift.py

+    # Test convergence using 2D constant data
+    x = np.concatenate([np.zeros((10, 10)), np.ones((10, 10))])
+    n_iter = MeanShift().fit(x).n_iter_
+    assert n_iter < 300


I would remove the 2d case. The 1d case is enough as a non-regression test.

akikuno · 2024-05-06

@ogrisel
Thank you so much for all your guidance! I have learnt a lot.

jeremiedbb · 2024-05-07

+1 to remove the 2d case

ogrisel · 2024-05-07

@akikuno This comment has not been addressed yet.

akikuno · 2024-05-09

@ogrisel
Sorry, I mistakenly thought it had already been changed.
I have now removed the 2d case.

jeremiedbb · 2024-05-07T09:39:56Z

doc/whats_new/v1.5.rst

+- |Efficiency| The `clustering.MeanShift` class has now improved computational speed as it properly converges for constant data.
+  :pr:`28951` by :user:`Akihiro Kuno <akikuno>`.
+


I would consider it a bug and move it to the cluster section of the changelog.

akikuno · 2024-05-09

@jeremiedbb
Thanks for your updates! I have moved the log to the sklearn.cluster section.

jeremiedbb

LGTM. thanks @akikuno

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>

Loosened to dist <= stop_thresh to converge in on 1D constant data s…

797157e

…cikit-learn#28926

github-actions bot added the module:cluster label May 5, 2024

Linted using black

2baa163

akikuno added 2 commits May 6, 2024 18:53

Add changelog of MeanShift enhancement scikit-learn#28951

1495676

Add tests for MeanShift to ensure convergence with constant data.

20caeda

ogrisel reviewed May 6, 2024

akikuno and others added 3 commits May 7, 2024 08:39

Update sklearn/cluster/tests/test_mean_shift.py

6ed39ae

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Update sklearn/cluster/tests/test_mean_shift.py

33892d3

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

Merge branch 'main' into feature/meanshift-stop_thresh

28990a1

jeremiedbb reviewed May 7, 2024

akikuno and others added 4 commits May 9, 2024 10:53

Remove the 2d case to test the convergence of constant data

edb19ee

Apply the suggested change from @jeremiedbb

fb9dd4a

Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>

Move the changelog to the sklearn.cluster section

1ecc1f3

Merge branch 'main' into feature/meanshift-stop_thresh

c82828e

akikuno requested review from ogrisel and jeremiedbb May 13, 2024 05:37

jeremiedbb approved these changes May 13, 2024

jeremiedbb added this to the 1.5 milestone May 13, 2024

Merge branch 'main' into feature/meanshift-stop_thresh

d8e3889

jeremiedbb merged commit e796d0a into scikit-learn:main May 17, 2024

akikuno deleted the feature/meanshift-stop_thresh branch May 18, 2024 03:23

glemaitre mentioned this pull request May 18, 2024

Performance Degradation in MeanShift When Data Has No Variance #28926

Closed

jeremiedbb added a commit to jeremiedbb/scikit-learn that referenced this pull request May 20, 2024

FIX convergence criterion of MeanShift (scikit-learn#28951)

ac52198

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>

jeremiedbb mentioned this pull request May 20, 2024

Release 1.5.0 #29054

Merged

14 tasks

jeremiedbb added a commit that referenced this pull request May 21, 2024

FIX convergence criterion of MeanShift (#28951)

9bd7047

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
