Fix plot_coin_segmentation speed issue #13383 #13652

Closed
boyuanmike wants to merge 1 commit

Conversation

boyuanmike

I changed the eigen_solver to 'amg' in the function sklearn.cluster.SpectralClustering. The runtime on CircleCI is 4.4 sec after the change.
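
A minimal sketch of the kind of change described above, not the actual diff from this PR. It assumes a small synthetic image in place of the coins picture, and `make_graph` and `run_solver` are hypothetical helper names. The `'amg'` solver is only tried when pyamg can be imported, which is the dependency question raised later in this thread.

```python
import time

import numpy as np
from sklearn.cluster import spectral_clustering
from sklearn.feature_extraction import image


def make_graph(img, beta=10, eps=1e-6):
    """Build a sparse affinity graph from the pixel gradients of an image."""
    graph = image.img_to_graph(img)
    # Use a decreasing function of the gradient so that similar
    # neighbouring pixels are strongly connected.
    graph.data = np.exp(-beta * graph.data / graph.data.std()) + eps
    return graph


def run_solver(img, n_clusters, eigen_solver, assign_labels="kmeans"):
    """Cluster the image graph and return (labels, elapsed seconds)."""
    graph = make_graph(img)
    start = time.perf_counter()
    labels = spectral_clustering(
        graph,
        n_clusters=n_clusters,
        eigen_solver=eigen_solver,
        assign_labels=assign_labels,
        random_state=42,
    )
    return labels, time.perf_counter() - start


# Synthetic stand-in for the coins image (an assumption for illustration):
# two bright rectangles on a dark, slightly noisy background.
rng = np.random.RandomState(0)
img = np.zeros((24, 24))
img[3:10, 3:10] = 1.0
img[14:21, 12:22] = 2.0
img += rng.normal(scale=0.1, size=img.shape)

solvers = ["arpack"]
try:
    import pyamg  # noqa: F401  (optional dependency needed by 'amg')
    solvers.append("amg")
except ImportError:
    pass

results = {solver: run_solver(img, 3, solver) for solver in solvers}
for solver, (labels, elapsed) in results.items():
    print(f"{solver}: {elapsed:.3f} s, {len(np.unique(labels))} clusters found")
```

Timings on an image this small say little about the real example; the point is only where `eigen_solver` is passed.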

@jnothman
Member

So this is 100x faster? That's nice :)

@jnothman
Member

Not sure if we want to require amg to build docs. How does the output compare?

@boyuanmike
Author

> Not sure if we want to require amg to build docs. How does the output compare?

If we don't include pyamg inside build_docs.sh, the CircleCI build will fail.

@jnothman
Member

jnothman commented Apr 16, 2019 via email

@boyuanmike
Author

> "How does the output compare?" was a separate question about whether the example is materially the same with a different solver

When assign_labels in sklearn.cluster.SpectralClustering is "kmeans", the graphs look the same. But when assign_labels is "discretize", the graphs look different.
The new graphs:
Figure_1

Figure_2

The old graphs:
Figure_2
Figure_1
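
"Look the same" can also be checked numerically rather than by eye. Cluster ids are arbitrary, so two label arrays should be compared on which samples are grouped together, not on the ids themselves. The following is a minimal pure-Python sketch of the Rand index under that idea; `rand_index` and the toy label lists are hypothetical, and scikit-learn's `sklearn.metrics.adjusted_rand_score` provides a chance-corrected variant.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two samples in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Toy labelings (made up for illustration, not taken from the PR).
old = [0, 0, 1, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1, 1]  # same grouping, different ids
different = [0, 1, 1, 2, 2, 0]   # genuinely different grouping

print(rand_index(old, relabelled))
print(rand_index(old, different))
```

A score of 1.0 means the two segmentations group pixels identically up to renaming of the labels; lower values indicate a materially different result.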

@jnothman
Member

jnothman commented Apr 16, 2019 via email

@boyuanmike
Author

> The documentation does not much explain how discretize works, and I've not looked into it... But these certainly seem to be very different results. Why?

The documentation at https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html says that using the amg eigen solver may lead to instabilities.

@lobpcg
Contributor

lobpcg commented Aug 13, 2019

> The documentation does not much explain how discretize works, and I've not looked into it... But these certainly seem to be very different results. Why?

Both implemented methods, k-means and discretize, are imperfect and are expected to give very different results. #12316 implements an alternative method for choosing the clusters from the eigenvectors, outperforming both k-means and discretize.

@lobpcg
Contributor

lobpcg commented Aug 13, 2019

> The documentation does not much explain how discretize works, and I've not looked into it... But these certainly seem to be very different results. Why?

> The documentation at https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html says that using the amg eigen solver may lead to instabilities.

  1. Instabilities in AMG simply make the code crash.
  2. #13707 (Fix for spectral clustering error when using 'amg' solver) fixes the AMG instability for spectral clustering.

@ogrisel
Member

ogrisel commented Aug 29, 2019

#13707 resolves the instability problem of AMG preconditioning and is now merged in master. Indeed this does not change anything for the current PR (it's unrelated).

Base automatically changed from master to main January 22, 2021 10:51
@thomasjpfan
Member

I tested this PR with the latest scikit-learn version and was unable to get a clustering that looks like the original. I prefer the slower solver, since it gives more stable results than the faster one. I think we likely need a different approach to speeding up this example. With that in mind, I am closing this PR.

@thomasjpfan thomasjpfan closed this Aug 2, 2022