Feature request: forcing a Gram matrix to be positive definite #8590


Open
chrishmorris opened this issue Mar 15, 2017 · 9 comments

Comments

@chrishmorris

Description

Kernel methods like SVR use a similarity measure rather than a vector of features. The prior knowledge is encoded by devising a domain-appropriate similarity measure. However, there is a technical requirement that is sometimes hard to meet: a valid kernel must be symmetric and positive semidefinite.

Suggested solution

In some circumstances, it works well to coerce the Gram matrix into a symmetric matrix by averaging it with its transpose; and then coercing it into a PSD matrix by discarding negative eigenvalues. For R users, the latter operation is available here: https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/nearPD.html.
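The symmetrize-and-clip recipe above can be sketched in NumPy; `nearest_psd` is a hypothetical helper for illustration, not an existing scikit-learn function:

```python
import numpy as np

def nearest_psd(gram):
    """Coerce a square matrix toward a valid kernel matrix:
    symmetrize it, then clip negative eigenvalues to zero."""
    sym = (gram + gram.T) / 2.0             # average with the transpose
    eigvals, eigvecs = np.linalg.eigh(sym)
    eigvals = np.clip(eigvals, 0.0, None)   # discard negative eigenvalues
    return eigvecs @ np.diag(eigvals) @ eigvecs.T

# An asymmetric, indefinite similarity matrix:
G = np.array([[ 1.0, 0.9, -0.8],
              [ 0.8, 1.0,  0.9],
              [-0.8, 0.9,  1.0]])
G_psd = nearest_psd(G)
print(np.linalg.eigvalsh(G_psd).min() >= -1e-10)  # True
```

This is the same projection nearPD performs (modulo its extra scaling options): keep the eigenvectors, zero out the negative part of the spectrum.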

The user can perform these calculations for SVR, which supports precomputed kernels, but not for the kernel methods in sklearn that do not.
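For reference, SVR's precomputed-kernel path looks roughly like this; a linear Gram matrix stands in here for whatever domain similarity (suitably coerced to PSD) you actually use:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5])

# Train-time Gram matrix: similarities between all training pairs.
gram_train = X @ X.T

model = SVR(kernel="precomputed")
model.fit(gram_train, y)

# Predict-time matrix: similarities between test and training points.
X_test = rng.normal(size=(5, 3))
gram_test = X_test @ X.T
pred = model.predict(gram_test)
print(pred.shape)  # (5,)
```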

Even for SVR, the second transformation is more efficient if it is integrated into the Cholesky decomposition which is already performed by the sklearn package.

I therefore propose an option, coerce_kernel=False, on the fit method of SVR.

@amueller
Member

What kernel methods in sklearn don't accept precomputed kernels? I'm not sold on this because it's really hacky. I have not heard of anyone doing that. Can you give an example of when you'd do that?

@chrishmorris
Author

I do this because the objects in my x space are molecules. My kernel is Tanimoto similarity between molecule fingerprints.
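For binary fingerprints, Tanimoto (Jaccard) similarity is the size of the intersection of the set bits over the size of their union; a minimal sketch (the convention for two empty fingerprints is an assumption):

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return intersection / union if union else 1.0  # convention for two empty sets

fp1 = [1, 1, 0, 1, 0, 0]
fp2 = [1, 0, 0, 1, 1, 0]
print(tanimoto(fp1, fp2))  # 0.5
```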

The point of kernel methods is that devising an appropriate kernel function is an alternative to feature extraction. Mercer's theorem says that a kernel induces a vector space over the input space, possibly an infinite-dimensional one, so this approach is inherently more powerful than identifying features. Reducing the input space to a vector of features and then choosing a kernel function misses the point.

@jnothman
Member

To repeat: What kernel methods in sklearn don't accept precomputed kernels?

@amueller
Member

Tanimoto similarity is the Jaccard similarity, right? That doesn't lead to a positive definite kernel?

@amueller
Member

Hm I guess it does not, but wouldn't exp(similarity) lead to one?

@chrishmorris
Author

To repeat: What kernel methods in sklearn don't accept precomputed kernels?

Gaussian Process Regression.

For Kernel Ridge Regression, only a documentation change is needed: the documentation does not mention it, but KernelRidge does in fact support precomputed kernels.
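KernelRidge does accept kernel="precomputed"; a minimal sketch, again with a linear Gram matrix standing in for a domain similarity:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 4))
y = X.sum(axis=1)

gram = X @ X.T  # precomputed kernel matrix over the training set
model = KernelRidge(kernel="precomputed", alpha=1e-3)
model.fit(gram, y)
print(model.predict(gram).shape)  # (15,)
```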

@chrishmorris
Author

Tanimoto similarity is the Jaccard similarity, right? That doesn't lead to a positive definite kernel?

"It is well known that the Jaccard–Tanimoto similarity measure satisfies the Mercer's conditions and, hence, yields a Mercer kernel "
https://academic.oup.com/bioinformatics/article/26/10/1348/193672
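A quick numerical sanity check consistent with the cited result: the pairwise Tanimoto Gram matrix of random binary fingerprints has no eigenvalue below zero beyond rounding error.

```python
import numpy as np

rng = np.random.default_rng(42)
fps = rng.integers(0, 2, size=(10, 64)).astype(bool)

# Pairwise Tanimoto (Jaccard) Gram matrix via broadcasting.
inter = (fps[:, None, :] & fps[None, :, :]).sum(-1)
union = (fps[:, None, :] | fps[None, :, :]).sum(-1)
gram = inter / union

# Smallest eigenvalue is non-negative up to floating-point rounding.
print(np.linalg.eigvalsh(gram).min())
```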

@amueller
Member

OK, so you don't need to "coerce the Gram matrix". I had asked when you need to coerce the Gram matrix, and you gave that as an example.

@UnixJunkie
Contributor

Hi Chris, a bigger problem (but related, I think) is that scikit-learn currently doesn't implement the Tanimoto kernel. This kernel is very useful in cheminformatics and information retrieval, and not everybody is skilled enough to contribute it to scikit-learn; cf. #29507. I am unsure whether the Tanimoto kernel would be unsuited to some kernel-based methods, but @jpvert might know.
