
eigen_tol in _spectral_embedding.py does not propagate to solvers other than arpack #21243


Closed
lobpcg opened this issue Oct 5, 2021 · 8 comments · Fixed by #23210

@lobpcg
Contributor

lobpcg commented Oct 5, 2021

Describe the bug

In _spectral_embedding.py the option

    eigen_tol : float, default=0.0
        Stopping criterion for eigendecomposition of the Laplacian matrix
        when ``eigen_solver='arpack'``.

does not appear to propagate to the 'amg' and 'lobpcg' solvers. For example, adding eigen_tol and changing its value in examples/cluster/plot_coin_segmentation.py affects the outcome only for eigen_solver='arpack' (the default), but not for the other two solvers, 'lobpcg' and 'amg'.
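For context, both of scipy's underlying backends already accept a stopping tolerance, so eigen_tol could in principle be forwarded to all solvers. A minimal sketch using scipy directly (the test matrix and tolerance values are illustrative; this is not sklearn's internal code):

```python
import numpy as np
from scipy.sparse.linalg import eigsh, lobpcg

# Symmetric test matrix with a known spectrum (eigenvalues 1..50)
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
L = Q @ np.diag(np.arange(1.0, 51.0)) @ Q.T

# ARPACK honors a stopping tolerance via `tol`
# (shift-invert around 0 targets the smallest eigenvalues)...
vals_arpack, _ = eigsh(L, k=3, sigma=0, which="LM", tol=1e-6)

# ...and so does LOBPCG, so eigen_tol could be forwarded to both.
X0 = rng.standard_normal((50, 3))
vals_lobpcg, _ = lobpcg(L, X0, tol=1e-8, maxiter=200, largest=False)
```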

Steps/Code to Reproduce

See above

Expected Results

eigen_tol works for all supported solvers

Actual Results

eigen_tol works for `eigen_solver='arpack'` only and is ignored otherwise

While at it

Also introduce and pass a max_iter parameter to all eigensolvers, although maybe it should be named eigen_max_iter to avoid confusion.

Versions

System:
python: 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Name\AppData\Local\Programs\Python\Python37\python.exe
machine: Windows-10-10.0.19041-SP0

Python dependencies:
pip: 21.2.4
setuptools: 58.1.0
sklearn: 1.0
numpy: 1.21.2
scipy: 1.7.1
Cython: 0.29.24
pandas: 1.3.3
matplotlib: 3.4.3
joblib: 1.0.1
threadpoolctl: 2.2.0

@thomasjpfan
Member

thomasjpfan commented Oct 6, 2021

Currently, the default eigen_tol differs between solvers: lobpcg uses 1e-5 and arpack uses 0.0. Should the solvers have different default tolerances?

@lobpcg
Contributor Author

lobpcg commented Oct 6, 2021

  1. The solvers should ideally have the same default accuracy, but arpack and lobpcg likely use different stopping criteria, so some testing may be needed to figure out which default numeric tolerance values make the resulting accuracy the same.
  2. arpack formally uses 0.0, but mathematically the tolerance cannot be zero - the iterations would never stop due to numeric errors. Thus, arpack probably treats 0.0 as None, i.e., sets some tolerance >0 internally.
  3. At least amg is just lobpcg with some extra tricks, so tolerances in amg and lobpcg should be the same.
  4. Figuring out and setting default tolerances is a different issue, unrelated to the reported bug; e.g., see FIX: tol in _spectral_embedding.py #21194
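Regarding point 2 above: scipy's `eigsh` documents that `tol=0` implies machine precision, so the call still terminates. A small sketch confirming this (illustrative test matrix, not sklearn code):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 40))
L = A @ A.T  # symmetric positive semi-definite test matrix

# tol=0 does not mean "iterate forever": ARPACK falls back to
# machine precision internally, so this call terminates normally.
vals, _ = eigsh(L, k=2, which="LM", tol=0)
```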

@thomasjpfan
Member

To be backward compatible, we can have eigen_tol=None and pass in 0 to arpack, which is machine precision internally. For lobpcg and amg, we continue to pass in 1e-5 when eigen_tol=None. If eigen_tol is a float, we can pass it in directly.

Does this sound reasonable?
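A minimal sketch of this proposal (the helper name `resolve_eigen_tol` is hypothetical, not sklearn's API; the defaults are the historical per-solver values mentioned above):

```python
# Hypothetical helper sketching the proposed backward-compatible
# resolution of eigen_tol before dispatching to a solver.
def resolve_eigen_tol(eigen_tol, eigen_solver):
    if eigen_tol is None:
        # Preserve each solver's historical default.
        return 0.0 if eigen_solver == "arpack" else 1e-5
    # An explicit float is forwarded to every solver unchanged.
    return float(eigen_tol)
```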

@lobpcg
Contributor Author

lobpcg commented Oct 8, 2021

> To be backward compatible, we can have eigen_tol=None and pass in 0 to arpack, which is machine precision internally. For lobpcg and amg, we continue to pass in 1e-5 when eigen_tol=None. If eigen_tol is a float, we can pass it in directly.
>
> Does this sound reasonable?

@thomasjpfan Yes.

In a future PR, maybe someone should also introduce a maxit parameter, pass it to all solvers to override the internal defaults, and set some defaults in spectral_clustering, just as for eigen_tol. That would be a desirable enhancement.

On a side note, I am unsure that 0 in arpack is indeed machine precision internally. But if it is 0, I am confident that it's not a good default for arpack here. E.g., something like 1e-8 seems to work just fine in #21148 plot_coin_segmentation.py while cutting the compute time in half. But that's also not relevant to the present issue and should be addressed separately.

@ogrisel
Member

ogrisel commented Oct 12, 2021

It would indeed be interesting to be able to pass a max_iter param to the eigen solvers, although maybe it should be named eigen_max_iter to avoid confusion with any outer-level max_iter loop in the estimator itself.

Having eigen_max_iter exposed would probably better help study the influence of the eigensolver convergence on the outer statistical estimation problem.
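Both scipy backends already expose an iteration cap, so a hypothetical eigen_max_iter could be forwarded to either solver as-is. A sketch under that assumption (matrix and values illustrative):

```python
import numpy as np
from scipy.sparse.linalg import eigsh, lobpcg

# Symmetric test matrix with a known spectrum (eigenvalues 1..40)
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((40, 40)))
L = Q @ np.diag(np.arange(1.0, 41.0)) @ Q.T

# ARPACK caps iterations via `maxiter`...
vals_arpack, _ = eigsh(L, k=2, which="LM", maxiter=500)

# ...and LOBPCG does too, so one eigen_max_iter could serve both.
X0 = rng.standard_normal((40, 2))
vals_lobpcg, _ = lobpcg(L, X0, tol=1e-8, maxiter=500)
```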

> On a side note, I am unsure that 0 in arpack is indeed machine precision internally. But if it is 0, I am confident that it's not a good default for arpack here. E.g., something like 1e-8 seems to work just fine in #21148 plot_coin_segmentation.py while cutting the compute time in half. But that's also not relevant to the present issue and should be addressed separately.

The meaning of tol=0 in ARPACK would deserve more investigation. To move forward, I like the plan suggested by @thomasjpfan in #21243 (comment): it moves the backward-compatibility problems out of the way and decouples the discussion of the default value for the arpack solver from the fix of the original problem reported here (namely, eigen_tol propagation to the other solvers).

@glemaitre
Member

In addition to spectral_embedding, the same changes need to be done in SpectralEmbedding, spectral_clustering, and SpectralClustering.

The PR already did those changes: #11968
It did most of the backward compatibility things that are discussed here. The tricky one is the max_iter=2000 in the second call to lobpcg.

My concern is about the tests. I don't really know how to trigger some convergence issue to be sure that we properly handle the exception.

@lobpcg
Contributor Author

lobpcg commented Dec 17, 2021

> I don't really know how to trigger some convergence issue

@glemaitre The theory tells us that convergence of any eigensolver will be slow for a random initialization if there is only a small separation between the wanted and unwanted eigenvalues, e.g., finding the single smallest eigenvalue when the second smallest is very close to it. In spectral clustering that corresponds to determining two clusters when there are three or more nearly identical clusters. E.g., make a Gaussian point cloud (a blob) in the positive quadrant in 2D, far enough from the origin, then flip the signs of the two components of all the points to create symmetric copies of the blob in every quadrant. Now you have 4 identical clusters by symmetry; just perturb them slightly and use any distance to generate the graph adjacency matrix. As you try to partition this union of 4 nearly identical clusters into two clusters, where do you cut, vertically or horizontally? This ambiguity should manifest itself as slow convergence in the solver.

I have not tried such an example, but the theory predicts that it should work. It could actually make a good new demo...
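The construction described above can be sketched as follows (a hedged sketch with numpy; the blob location, spread, and perturbation scale are illustrative choices, not prescribed in the discussion):

```python
import numpy as np

# One Gaussian blob in the positive quadrant, far from the origin...
rng = np.random.default_rng(42)
blob = rng.normal(loc=5.0, scale=0.3, size=(100, 2))

# ...mirrored into all four quadrants by flipping component signs,
# giving four nearly identical clusters whose 2-way split is ambiguous.
signs = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]])
X = np.vstack([blob * s for s in signs])
X += rng.normal(scale=1e-3, size=X.shape)  # slight perturbation

# X can now feed e.g. a k-nearest-neighbors affinity; the near-symmetry
# should slow eigensolver convergence when asking for two clusters.
```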

@lobpcg
Contributor Author

lobpcg commented Dec 17, 2021

While at it, maybe also fix #14713?
