Add "Randomized SVD" solver option to KernelPCA for faster partial decompositions, like in PCA #12068
Comments
…s proposed in scikit-learn#12068. It works exactly the same way as for PCA: simply set eigen_solver to 'randomized' to use that method.
Do you have an idea of the work necessary for this? (I don't, and haven't yet looked into it.)
Ah, I see there is a pull request.
You might also want to add LOBPCG (https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lobpcg.html), which is already used in http://scikit-learn.org/stable/modules/generated/sklearn.manifold.SpectralEmbedding.html and http://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html. LOBPCG is expected to outperform both the ARPACK and "randomized" solvers for large problems.
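To make the suggestion concrete, here is a rough sketch of LOBPCG applied to a centered kernel matrix, with ARPACK (scipy.sparse.linalg.eigsh) as a reference. This is not how scikit-learn implements kPCA. The kernel is built by hand with NumPy, and the data, gamma, tolerance, and iteration cap are all illustrative assumptions.

```python
import numpy as np
from scipy.sparse.linalg import eigsh, lobpcg

# Synthetic, anisotropic data so the leading eigenvalues are well separated.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.25])

# RBF kernel matrix; gamma = 0.05 is an arbitrary illustration value.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.05 * sq_dists)

# Double-center the kernel matrix, as kernel PCA requires.
n = K.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
K_centered = H @ K @ H

k = 3
X0 = rng.normal(size=(n, k))  # random initial block of k vectors

# LOBPCG for the k largest eigenpairs.
lobpcg_vals, lobpcg_vecs = lobpcg(K_centered, X0, largest=True,
                                  tol=1e-6, maxiter=200)
# ARPACK reference for the same eigenpairs.
arpack_vals, arpack_vecs = eigsh(K_centered, k=k, which="LA")

# Sort both explicitly in descending order before comparing.
lobpcg_vals = np.sort(lobpcg_vals)[::-1]
arpack_vals = np.sort(arpack_vals)[::-1]
print(lobpcg_vals, arpack_vals)
```

On a problem this small the two solvers should agree closely. Any speed advantage would only show up on much larger matrices and would need to be benchmarked.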
Cool suggestion! I see that 'amg' is also mentioned in the doc. Is there any reason why we should not add it to kPCA too?
'amg' is an option for 'lobpcg' that may make LOBPCG work faster, but the effect is highly problem-dependent. It is unclear to me whether the 'amg' option would work well for kPCA. Once LOBPCG has been added to kPCA, it would be easy to test 'amg' and check whether it is worth adding to kPCA.
OK then. I suggest waiting for the current pull request to be merged first, since it already depends on two others that will be relevant to 'lobpcg for kPCA' (#12145 concerns eigenvalue validation, and #12143 fixes kPCA behaviour in the case of exactly zero eigenvalues). It will be very easy to add the new method to the benchmark provided in #12069, so that you get a nice plot of execution times.
PCA currently offers an svd_solver parameter allowing users to get faster partial decompositions using two methods: 'arpack' and 'randomized' SVD. KernelPCA offers an eigen_solver parameter allowing users to get faster partial decompositions using 'arpack', but 'randomized' SVD is not supported. Would it seem reasonable to add it to the list of possible solvers?
(Thanks grilling for the suggestion and support for implementation!)
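The existing options described above can be sketched as follows. The data are synthetic and the number of components is arbitrary. Only solver values named in this issue are used: 'arpack' and 'randomized' for PCA, and 'arpack' for KernelPCA.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10))

# PCA: two solvers for partial (truncated) decompositions.
pca_arpack = PCA(n_components=3, svd_solver="arpack", random_state=0).fit(X)
pca_rand = PCA(n_components=3, svd_solver="randomized", random_state=0).fit(X)

# KernelPCA: partial decomposition through ARPACK.
kpca_arpack = KernelPCA(n_components=3, kernel="rbf",
                        eigen_solver="arpack", random_state=0)
Z = kpca_arpack.fit_transform(X)

print(pca_arpack.components_.shape, pca_rand.components_.shape, Z.shape)
```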