Description
Kernel methods such as SVR operate on a similarity measure rather than on a vector of features, so prior knowledge can be encoded by designing a domain-appropriate similarity measure. However, there is a technical requirement that is sometimes hard to meet: a valid kernel must be symmetric and positive semidefinite (PSD).
Suggested solution
In some circumstances it works well to coerce the Gram matrix into a symmetric matrix by averaging it with its transpose, and then to coerce it into a PSD matrix by discarding negative eigenvalues. For R users, the latter operation is available here: https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/nearPD.html.
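The two coercion steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not an sklearn API; the helper name coerce_kernel is hypothetical:

```python
import numpy as np

def coerce_kernel(K, tol=0.0):
    """Coerce a square similarity matrix into a valid kernel.

    Sketch of the two steps described above: symmetrize by averaging
    with the transpose, then project onto the PSD cone by discarding
    (clipping) negative eigenvalues. Hypothetical helper, not sklearn API.
    """
    K = 0.5 * (K + K.T)        # step 1: symmetrize
    w, V = np.linalg.eigh(K)   # eigendecomposition of the symmetric matrix
    w = np.clip(w, tol, None)  # step 2: discard negative eigenvalues
    return (V * w) @ V.T       # reassemble V diag(w) V^T
```

This is the eigenvalue-clipping variant; nearPD in R additionally iterates to preserve unit diagonals, which matters for correlation matrices but not for a generic Gram matrix.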
Users can perform these calculations themselves for SVR, which supports precomputed kernels, but not for the other kernel methods in sklearn, which do not.
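For concreteness, here is what the manual workaround with a precomputed kernel looks like today. The similarity matrix below is synthetic, chosen only to illustrate a non-symmetric, possibly indefinite starting point:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

# A domain similarity measure might yield something like this:
# symmetric-ish but noisy, with no PSD guarantee (synthetic example).
S = X @ X.T + rng.normal(scale=0.1, size=(20, 20))

# Manual coercion: symmetrize, then clip negative eigenvalues.
S = 0.5 * (S + S.T)
w, V = np.linalg.eigh(S)
K = (V * np.clip(w, 0.0, None)) @ V.T

# SVR accepts the coerced Gram matrix directly.
model = SVR(kernel="precomputed").fit(K, y)
pred = model.predict(K)  # at predict time, pass similarities to the training points
```

The proposal would fold the coercion step into fit itself, so users of the other kernel estimators, which lack a precomputed-kernel entry point, could benefit as well.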
Even for SVR, the second transformation would be more efficient if it were integrated into the Cholesky decomposition that the sklearn package already performs, rather than done as a separate eigendecomposition beforehand.
I therefore propose adding an option, coerce_kernel=False, to the fit method of SVR.