-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
eigen_tol in _spectral_embedding.py does not propagate to solvers other than arpack #21243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Currently, |
|
To be backward compatible, we can have Does this sound reasonable? |
@thomasjpfan Yes. In a future PR, may be someone should also introduce parameter On a side note, I am unsure that 0 in |
It would indeed be interesting to be able to pass a Having
That would deserve more investigation in tracking down the meaning of |
In addition to The PR already did those changes: #11968 My concern is about the tests. I don't really know how to trigger some convergence issue to be sure that we probably handle the exception. |
@glemaitre The theory tells us that convergence of any eigensolver will be slow for a random initiation if there is a small separation of wanted and unwanted eigenvalues, e.g., finding one, the smallest eigenvalue, in the case where the second smallest is very close. In spectral clustering that would correspond to determining two clusters in the case where there are three or more nearly identical clusters. E.g., make a Gaussian point cloud (a blob) in the positive quadrant in 2D far enough from the origin, then flip the signs of the 2 components of all the points to create symmetric copies of the blob in every quadrant. Now you have 4 identical clusters, due to symmetry, just perturb them slightly and use any distance to generate the graph adjacency matrix. As you try to partition this union of 4 nearly identical clusters into two clusters, where do you partition, vertically or horizontally? This ambiguity should manifest itself in slow convergence in the solver. I have not tried such an example, but the theory predicts that it should work. Could be actually a good new demo... |
While at it, may be also fix #14713 ? |
Describe the bug
In
_spectral_embedding.py
the optiondoes not appear to propagate to solvers 'amg' and 'lobpcg', e.g., adding
eigen_tol
and changing its value inexamples/cluster/plot_coin_segmentation.py
affects the outcome only ifeigen_solver='arpack'
, which is the default but not for the other two: 'lobpcg' and 'amg'.Steps/Code to Reproduce
See above
Expected Results
eigen_tol
works for all supported solversActual Results
eigen_tol
works for `eigen_solver='arpack' only and is ignored otherwiseWhile at it
Also introduce and pass
max_iter
param to all eigen solvers, although maybe it should be namedeigen_max_iter
to avoid confusion.Versions
System:
python: 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Name\AppData\Local\Programs\Python\Python37\python.exe
machine: Windows-10-10.0.19041-SP0
Python dependencies:
pip: 21.2.4
setuptools: 58.1.0
sklearn: 1.0
numpy: 1.21.2
scipy: 1.7.1
Cython: 0.29.24
pandas: 1.3.3
matplotlib: 3.4.3
joblib: 1.0.1
threadpoolctl: 2.2.0
The text was updated successfully, but these errors were encountered: