-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] clusterQR method added to spectral segmentation #12316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@lobpcg Please add [WIP] tag to name |
@lobpcg look in the In the line where Parameters are defined. I think that the Travis error is because the |
@eamanu - thanks for your suggestions, now all implemented.
This was a typo indeed, now fixed - @eamanu thanks for noticing! The fix has however not affected the Travis error, still present unfortunately... Any other advice, please? Since the plot_coin_segmentation example demo is involved, I have added the modified fine scikit-learn\doc\modules\clustering.rst to this PR. I am unsure where to put the generated figures |
If you think that is ready please add the tag [MRG] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good for me
master update since Oct 7
This PR has been completed. Could someone please make more comments to react to or just commit it to the base? I start already forgetting my own proposed changes in this PR... |
@eamanu This PR has been finalized over a month ago. Could someone please approve it or make new suggestions? |
@lobpcg Thanks for the PR, but as was mentioned in the original issue #12164 by 2 maintainers, this algorithm does not meet the inclusion criteria https://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms and would be more suited as a scikit-learn-contrib project until it gains wider usage and could be considered for inclusion in scikit-learn. |
The actual changes are just a few lines in only 3 core codes, spectral.py, test_spectral.py, and plot_coin_segmentation.py That hardly justifies, in MHO, creating a brand new separate project at http://contrib.scikit-learn.org/ as proposed by @jnothman and @ogrisel . My hope has been that @jnothman and @ogrisel might change their recommendation, after seeing the actual changes in the code and results, waiving some of the inclusion criteria due to rather trivial changes in the code and noticeable improvements in performance... I have asked @jnothman and @ogrisel in the original issue #12164 almost 2 months ago, but got no response. |
We simply can't use that justification regularly. Methods for solving optimisation problems change frequently or vary for different applications and we cannot afford to be a complete library of them. This is why we have criteria for maturity. But perhaps if it is highly valuable, we can make spectral segmentation more customisable by allowing a function to be passed instead of one of the built-in solvers? |
I surely understand and share your general concerns and the requirements for maturity. However, please consider the following:
While not technically satisfying yet the criteria for maturity, the new clusterQR appears to set the new standard for extraction of clusters from eigenvectors in spectral clustering, that seems to me to be difficult to improve. It thus deserves in the future to become the default method instead of kmeans after intensive testing by users, that hopefully should follow if this PR is merged and released. |
master merge with destro latest
merge just updated local master into clusterQR
@GaelVaroquaux could you please have a look at this PR and express your opinion? |
The AMG fix #13707 is now merged in master. |
Cluster qr
Merge pull request #21 from lobpcg/clusterQR
Could someone please review and finally decide if this dies or gets merged? |
Could someone please review? |
in the referenced issue I see:
so there are some concerns which could eventually be ruled out with some benchmarks showing the empirical superiority of the method. Have these benchmarks done and made public? |
@agramfort Yes, of course, please see the plots in #12316 (comment) and/or run the plot_coin_segmentation example demo in this PR yourself. A systematic comparison is performed in the original paper, cited in #12164 |
the plot_coin_segmentation example demo in this PR
tells me the code leads to comparable results.
have a script to replicate the timings given the current implementation is
necessary.
see for example the benchmarks
https://github.com/scikit-learn/scikit-learn/tree/master/benchmarks
that have been shared with the PRs to demonstrate and replicate timings on
different
problem dimensions and hardware.
|
Comparable, but better - please just look at the plots.
Sure, if this is a blocker, it is doable and probably not too difficult. I'll look into it - thanks for the link! |
master update
update from latest master
This PR is stalled and has conflicts. I close it and open instead [WIP] cluster_qr method added to spectral segmentation #21148 |
Reference Issues/PRs
closes #12164
What does this implement/fix? Explain your changes.
#12164 adds clusterQR method to 'kmeans' and 'discretize' in spectral clustering
Any other comments?
The actual changes for #12164 are just a few lines in only 3 core codes, spectral.py, test_spectral.py, and plot_coin_segmentation.py