[MRG] clusterQR method added to spectral segmentation #12316

lobpcg · 2018-10-06T16:27:13Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

#12164 adds clusterQR method to 'kmeans' and 'discretize' in spectral clustering

Any other comments?

The actual changes for #12164 are just a few lines in only 3 core files: spectral.py, test_spectral.py, and plot_coin_segmentation.py.

eamanu · 2018-10-06T20:38:14Z

@lobpcg Please add the [WIP] tag to the PR title.

examples/cluster/plot_coin_segmentation.py

eamanu · 2018-10-06T20:57:45Z

@lobpcg look in the spectral_clustering docs.

In the line where Parameters are defined, I think the Travis error occurs because the ---- underline below the Parameters title is longer than the title.
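A quick way to look for the kind of mismatch described above is sketched below. It assumes the convention that a docstring section underline should have exactly the same length as its title; whether that is what the CI job actually checks is an assumption, and `mismatched_underlines` is a hypothetical helper, not part of scikit-learn or numpydoc.

```python
import re


def mismatched_underlines(docstring):
    """Return (title, underline) pairs whose lengths differ."""
    lines = [line.strip() for line in docstring.splitlines()]
    problems = []
    for title, underline in zip(lines, lines[1:]):
        # A section underline is a run of dashes directly below a non-empty title.
        is_underline = re.fullmatch(r"-+", underline) is not None
        if title and is_underline and len(title) != len(underline):
            problems.append((title, underline))
    return problems


# Hypothetical docstring fragment: the first underline is one dash too long.
doc = """
    Parameters
    -----------
    affinity : array-like

    Returns
    -------
    labels : array of integers
"""

print(mismatched_underlines(doc))
```

Only the `Parameters` section is reported here, because its underline has 11 dashes under a 10-character title.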

lobpcg · 2018-10-06T22:34:55Z

@eamanu - thanks for your suggestions, now all implemented.

> @lobpcg look in the spectral_clustering docs.
>
> In the line where Parameters are defined. I think that the Travis error is because the ---- below Parameters title is longer than the title

This was a typo indeed, now fixed - @eamanu thanks for noticing! The fix has however not affected the Travis error, still present unfortunately... Any other advice, please?

Since the plot_coin_segmentation example demo is involved, I have added the modified file scikit-learn\doc\modules\clustering.rst to this PR.

I am unsure where to put the generated figures
auto_examples/cluster/images/sphx_glr_plot_coin_segmentation_001.png
auto_examples/cluster/images/sphx_glr_plot_coin_segmentation_002.png
and a new one corresponding to clusterQR
auto_examples/cluster/images/sphx_glr_plot_coin_segmentation_003.png
I attach them here

eamanu · 2018-10-11T05:14:32Z

If you think it is ready, please add the [MRG] tag.

sklearn/cluster/spectral.py

eamanu

It looks good to me.

lobpcg · 2018-10-30T02:45:43Z

This PR is complete. Could someone please add further comments for me to respond to, or just merge it into the base branch? I am already starting to forget my own proposed changes in this PR...

lobpcg · 2018-12-05T17:07:55Z

@eamanu This PR was finalized over a month ago. Could someone please approve it or make new suggestions?

rth · 2018-12-05T17:43:22Z

@lobpcg Thanks for the PR, but as was mentioned in the original issue #12164 by 2 maintainers, this algorithm does not meet the inclusion criteria https://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms and would be more suited as a scikit-learn-contrib project until it gains wider usage and could be considered for inclusion in scikit-learn.

lobpcg · 2018-12-05T18:16:23Z

The actual changes are just a few lines in only 3 core files: spectral.py, test_spectral.py, and plot_coin_segmentation.py. That hardly justifies, IMHO, creating a brand new separate project at http://contrib.scikit-learn.org/ as proposed by @jnothman and @ogrisel. My hope has been that @jnothman and @ogrisel might change their recommendation after seeing the actual changes in the code and the results, waiving some of the inclusion criteria given the rather trivial code changes and the noticeable improvements in performance... I asked @jnothman and @ogrisel in the original issue #12164 almost 2 months ago, but got no response.

jnothman · 2018-12-09T05:57:05Z

> The actual changes are just a few lines in only 3 core files: spectral.py, test_spectral.py, and plot_coin_segmentation.py. That hardly justifies, IMHO, creating a brand new separate project at http://contrib.scikit-learn.org as proposed by @jnothman and @ogrisel.

We simply can't use that justification regularly. Methods for solving optimisation problems change frequently or vary for different applications and we cannot afford to be a complete library of them. This is why we have criteria for maturity. But perhaps if it is highly valuable, we can make spectral segmentation more customisable by allowing a function to be passed instead of one of the built-in solvers?
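The customisation idea above could look roughly like the following sketch, in which the label-assignment argument accepts either the name of a built-in method or a user-supplied function. Everything here is hypothetical: `extract_labels`, `BUILTIN_LABELERS`, and the toy `argmax_labels` built-in are illustrative names, not scikit-learn API.

```python
import numpy as np


def argmax_labels(embedding):
    """Toy built-in: label each row by its largest-magnitude coordinate."""
    return np.abs(embedding).argmax(axis=1)


# Hypothetical registry standing in for the built-in solvers.
BUILTIN_LABELERS = {"argmax": argmax_labels}


def extract_labels(embedding, assign_labels="argmax"):
    """Turn an (n_samples, n_components) embedding into one label per row."""
    embedding = np.asarray(embedding)
    if callable(assign_labels):
        labeler = assign_labels  # user-supplied function
    elif assign_labels in BUILTIN_LABELERS:
        labeler = BUILTIN_LABELERS[assign_labels]
    else:
        raise ValueError(f"unknown assign_labels: {assign_labels!r}")
    labels = np.asarray(labeler(embedding))
    if labels.shape != (embedding.shape[0],):
        raise ValueError("the labeler must return one label per row")
    return labels


embedding = np.array([[0.9, 0.1], [0.2, -0.8], [-0.7, 0.3]])
builtin = extract_labels(embedding)
custom = extract_labels(embedding, assign_labels=lambda e: (e[:, 0] > 0).astype(int))
```

With this shape of interface, a method that does not yet meet the inclusion criteria could live in user code or a contrib package and still be plugged in.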

lobpcg · 2018-12-10T17:59:16Z

I surely understand and share your general concerns and the requirements for maturity. However, please consider the following:

I have done all the needed code writing already in this PR, as far as I can see,
the changes are very small, only several lines of code, so there is just tiny additional maintenance effort,
the new function is not the default in this PR so the code is fully backward compatible,
the new clusterQR is not yet another optimization method, but rather direct extraction of clusters from eigenvectors in spectral clustering, designed specifically for that purpose only,
in contrast to the existing 'kmeans' and 'discretize', the new clusterQR has no tuning parameters, e.g., runs no iterations,
yet the new clusterQR seems to consistently outperform both 'kmeans' and 'discretize' in terms of quality and speed, e.g., see the example above.

While it does not yet technically satisfy the criteria for maturity, the new clusterQR appears to set a new standard for extracting clusters from eigenvectors in spectral clustering, one that seems to me difficult to improve on. It thus deserves to become the default method instead of kmeans in the future, after intensive testing by users, which will hopefully follow if this PR is merged and released.
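For readers who have not seen the method, the points above (direct extraction from eigenvectors, no iterations, no tuning parameters) can be illustrated with a minimal sketch of pivoted-QR label extraction. It assumes a real-valued eigenvector matrix with one column per cluster; `cluster_qr_sketch` is an illustrative name, and the code is not claimed to be identical to the code in this PR.

```python
import numpy as np
from scipy.linalg import qr, svd


def cluster_qr_sketch(vectors):
    """Assign one label per row of an (n_samples, n_clusters) matrix."""
    k = vectors.shape[1]
    # Column-pivoted QR of the transpose selects k well-separated samples.
    _, _, pivots = qr(vectors.T, pivoting=True)
    # Orthogonal polar factor of the selected k x k block, via its SVD.
    u, _, vh = svd(vectors[pivots[:k], :].T)
    # Rotate so each selected sample aligns with one axis, then label
    # every sample by its dominant coordinate.
    rotated = np.abs(vectors @ (u @ vh))
    return rotated.argmax(axis=1)


# Idealised input: scaled cluster indicators hidden behind a random rotation.
rng = np.random.default_rng(0)
true = np.repeat([0, 1, 2], [4, 3, 5])
indicators = np.eye(3)[true] / np.sqrt(np.bincount(true))[true][:, None]
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
labels = cluster_qr_sketch(indicators @ rotation)
```

On this idealised input the returned labels match the true grouping up to a renaming of the clusters; real eigenvectors are noisier, so quality there still has to be judged empirically.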

lobpcg · 2019-08-05T18:11:50Z

@GaelVaroquaux could you please have a look at this PR and express your opinion?

ogrisel · 2019-08-29T12:20:08Z

The AMG fix #13707 is now merged in master.

lobpcg · 2019-08-30T04:19:11Z

Could someone please review and finally decide if this dies or gets merged?

lobpcg · 2019-09-26T02:20:25Z

Could someone please review?

agramfort · 2019-09-26T08:26:49Z

In the referenced issue I see:

> This doesn't meet our basic criteria for inclusion of stable and mature algorithms. What makes you think it is worth our while to maintain an implementation of this? What are the chances that this will remain a canonical approach in 5 years' time?

> +1 for making a prototype Python implementation outside of the scikit-learn code base and running some benchmarks.

So there are some concerns, which could eventually be ruled out with benchmarks showing the empirical superiority of the method. Have these benchmarks been done and made public?

lobpcg · 2019-09-26T13:36:49Z

> in the referenced issue I see:
>
> > This doesn't meet our basic criteria for inclusion of stable and mature algorithms. What makes you think it is worth our while to maintain an implementation of this? What are the chances that this will remain a canonical approach in 5 years' time?
>
> > +1 for making a prototype Python implementation outside of the scikit-learn code base and running some benchmarks.
>
> so there are some concerns which could eventually be ruled out with some benchmarks showing the empirical superiority of the method. Have these benchmarks been done and made public?

@agramfort Yes, of course, please see the plots in #12316 (comment) and/or run the plot_coin_segmentation example demo in this PR yourself. A systematic comparison is performed in the original paper, cited in #12164

agramfort · 2019-09-26T20:34:07Z

> the plot_coin_segmentation example demo in this PR

tells me the code leads to comparable results.

> A systematic comparison is performed in the original paper, cited in #12164

Having a script to replicate the timings with the current implementation is necessary. See, for example, the benchmarks at https://github.com/scikit-learn/scikit-learn/tree/master/benchmarks that have been shared with PRs to demonstrate and replicate timings on different problem dimensions and hardware.
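A replication script of the kind requested could start from a small timing harness like the sketch below. The names (`time_labelers`, `argmax_baseline`) are hypothetical, the input is a random matrix standing in for real eigenvectors, and the layout does not claim to follow the conventions of the scikit-learn benchmarks directory.

```python
import time

import numpy as np


def time_labelers(labelers, sizes, n_clusters=3, repeats=3, seed=0):
    """Return {(name, n_samples): best wall-clock seconds over repeats}."""
    rng = np.random.default_rng(seed)
    timings = {}
    for n_samples in sizes:
        # Stand-in input: random data in place of a real spectral embedding.
        embedding = rng.normal(size=(n_samples, n_clusters))
        for name, labeler in labelers.items():
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                labeler(embedding)
                best = min(best, time.perf_counter() - start)
            timings[(name, n_samples)] = best
    return timings


def argmax_baseline(embedding):
    """Trivial placeholder method, used only to exercise the harness."""
    return np.abs(embedding).argmax(axis=1)


results = time_labelers({"argmax": argmax_baseline}, sizes=[100, 1000])
for (name, n_samples), seconds in sorted(results.items()):
    print(f"{name:>10s}  n={n_samples:<6d}  {seconds:.6f}s")
```

A real comparison would pass the 'kmeans', 'discretize', and clusterQR label-extraction steps as the callables, feed them embeddings computed from actual affinity matrices, and record a quality score alongside the timings.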

lobpcg · 2019-09-26T20:44:57Z

> the plot_coin_segmentation example demo in this PR
> tells me the code leads to comparable results.

Comparable, but better - please just look at the plots.

> have a script to replicate the timings given the current implementation is necessary. see for example the benchmarks https://github.com/scikit-learn/scikit-learn/tree/master/benchmarks that have been shared with the PRs to demonstrate and replicate timings on different problem dimensions and hardware.

Sure, if this is a blocker, it is doable and probably not too difficult. I'll look into it - thanks for the link!

lobpcg · 2021-09-25T19:43:05Z

This PR is stalled and has conflicts. I am closing it and opening [WIP] cluster_qr method added to spectral segmentation #21148 instead.

clusterQR method added to spectral segmentation

c8e8ff2

lobpcg mentioned this pull request Oct 6, 2018

[Closed] adding clusterQR to spectral clustering, and LOBPCG as an SVD solver to PCA and Truncated PCA #12291

Closed

lobpcg added 3 commits October 6, 2018 15:10

fix comment line too long

6f819e7

typo fixed in spectral.py

9902767

spectral.py trailing white space removed

9fa6457

eamanu suggested changes Oct 6, 2018

lobpcg added 2 commits October 6, 2018 17:42

typos fixed

44daccb

Update doc/modules/clustering.rst

3ce05d0

formatting/typo

5789fb8

lobpcg changed the title ~~clusterQR method added to spectral segmentation~~ [WIP] clusterQR method added to spectral segmentation Oct 6, 2018

lobpcg mentioned this pull request Oct 6, 2018

new feature: add clusterQR method to 'kmeans' and 'discretize' in spectral clustering #12164

Closed

spectral.py typo fixed

9a0638b

lobpcg changed the title ~~[WIP] clusterQR method added to spectral segmentation~~ clusterQR method added to spectral segmentation Oct 10, 2018

eamanu suggested changes Oct 11, 2018

sklearn/cluster/spectral.py (outdated)

Update sklearn/cluster/spectral.py

b313306

lobpcg changed the title ~~clusterQR method added to spectral segmentation~~ [MRG] clusterQR method added to spectral segmentation Oct 11, 2018

eamanu approved these changes Oct 12, 2018

Merge pull request #1 from scikit-learn/master

2146c81

master update since Oct 7

lobpcg added 2 commits January 26, 2019 14:09

Merge pull request #2 from scikit-learn/master

1a8a552

master merge with destro latest

Merge pull request #3 from lobpcg/master

33e34fe

merge just updated local master into clusterQR

lobpcg added 2 commits August 1, 2019 19:06

reverse changes irrelevant to this PR

a9e9260

Merge pull request #18 from scikit-learn/master

ced9b50

master update

amueller added the Waiting for Reviewer label Aug 6, 2019

glemaitre self-assigned this Aug 12, 2019

lobpcg mentioned this pull request Aug 13, 2019

[WIP] Improve spectral clustering implementation #10739

Closed

lobpcg added 3 commits August 29, 2019 22:53

Merge branch 'master' into clusterQR

f2b1adc

Merge pull request #21 from lobpcg/clusterQR

13ea224

Cluster qr

Merge pull request #22 from lobpcg/master

6924fa9

Merge pull request #21 from lobpcg/clusterQR

lobpcg and others added 6 commits November 29, 2019 11:13

trying to fix a conflict

78a788a

Merge pull request #23 from scikit-learn/master

5704634

master update

Merge pull request #24 from lobpcg/master

72d8d60

update from latest master

restore the edits

9556999

add imports

540f937

trying to fix rst warnings

9bbd667

github-actions bot added the module:cluster label Mar 2, 2020

cmarmo added the Needs Benchmarks label and removed the Waiting for Reviewer label Jan 6, 2021

Base automatically changed from master to main January 22, 2021 10:50

lobpcg mentioned this pull request Sep 25, 2021

ENH add 'cluster_qr' method to spectral segmentation #21148

Merged

lobpcg closed this Sep 25, 2021

lobpcg deleted the clusterQR branch September 25, 2021 19:44
