Intercept in Kernel Ridge Regression #21840
Comments
@agramfort Would this be achievable by internally preprocessing X and y, as in the linear models, before computing the kernel, and then applying the offset at predict time?
Another option that works with precomputed kernels:
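(The body of this comment was lost in extraction. Judging from the follow-up question about kernel matrix centering, it concerned centering a precomputed kernel with KernelCenterer; the following is a minimal sketch of that idea under that assumption, with illustrative function and variable names, not the original content.)

```python
# Sketch (assumed reconstruction): center a precomputed kernel and the
# targets, fit on the centered problem, then add the target mean back.
from sklearn.kernel_ridge import KernelRidge
from sklearn.preprocessing import KernelCenterer


def fit_centered(K_train, y_train, alpha=1.0):
    centerer = KernelCenterer()
    model = KernelRidge(alpha=alpha, kernel="precomputed")
    model.fit(centerer.fit_transform(K_train), y_train - y_train.mean())
    return model, centerer, y_train.mean()


def predict_centered(model, centerer, y_mean, K_test):
    # K_test has shape (n_test, n_train)
    return model.predict(centerer.transform(K_test)) + y_mean
```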
@TomDLT Can you give a source or an explanation for your statement? I have no knowledge of kernel matrix centering as of now. Why is it not sufficient to set …?
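For what it's worth, a quick numeric check (my own, not from the thread) of why kernel matrix centering works: for a linear kernel, KernelCenterer is exactly equivalent to centering the features before taking the Gram matrix.

```python
# check that (I - 11^T/n) K (I - 11^T/n) equals the Gram matrix of
# explicitly centered features, for a linear kernel K = X X^T
import numpy as np
from sklearn.preprocessing import KernelCenterer

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
K = X @ X.T                      # linear kernel on uncentered data
X_c = X - X.mean(axis=0)         # explicitly centered features
K_c = KernelCenterer().fit_transform(K)

np.testing.assert_allclose(K_c, X_c @ X_c.T, atol=1e-10)
```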
Test on the linear-kernel case. (edit: incorrect, see fixed version below)

```python
import numpy as np

from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.preprocessing import KernelCenterer

fit_intercept = True
need_intercept = True
n_samples = 100
n_features = 10
alpha = 0.1

###############################################################################
# create dataset
X_train = np.random.randn(n_samples, n_features)
X_test = np.random.randn(n_samples, n_features)
y_train = np.random.randn(n_samples)

if need_intercept:
    X_train += 10
    X_test += 10
    y_train += 5
else:
    X_train -= X_train.mean(axis=0)
    y_train -= y_train.mean(axis=0)

###############################################################################
# model A: ridge regression
model_a = Ridge(alpha=alpha, fit_intercept=fit_intercept).fit(X_train, y_train)
y_pred_a = model_a.predict(X_test)

###############################################################################
# model B: kernel ridge regression with a linear kernel
if fit_intercept:
    # precompute kernel
    K_train = X_train @ X_train.T
    centerer = KernelCenterer()
    K_train_centered = centerer.fit_transform(K_train)
    # center target
    y_train_mean = y_train.mean(axis=0)
    y_train_centered = y_train - y_train_mean
    # fit centered model
    model_b = KernelRidge(alpha=alpha,
                          kernel="precomputed").fit(K_train_centered,
                                                    y_train_centered)
    K_test = X_test @ X_train.T
    y_pred_b = model_b.predict(K_test)
    # add intercept
    intercept = y_train_mean - centerer.K_fit_rows_ @ model_b.dual_coef_
    y_pred_b += intercept
else:
    y_pred_b = KernelRidge(alpha=alpha,
                           kernel="linear").fit(X_train,
                                                y_train).predict(X_test)

###############################################################################
np.testing.assert_array_almost_equal(y_pred_a, y_pred_b)
```

Not sure how to handle the sample weights though.
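One plausible way to handle sample weights (a sketch of my own, not from the thread): use weighted means for both the kernel and target centering, and solve the weighted dual system directly. All names below are illustrative; the final check relies on Ridge centering with weighted means when fit_intercept=True and sample_weight is given.

```python
import numpy as np
from scipy import linalg
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
n, p, lam = 100, 10, 0.1
X = rng.randn(n, p) + 10
X_test = rng.randn(20, p) + 10
y = rng.randn(n) + 5
sw = rng.rand(n) + 0.5                      # sample weights

w = sw / sw.sum()                           # normalized weights
K = X @ X.T                                 # linear kernel

# weighted centering: K_c = (I - 1 w^T) K (I - 1 w^T)^T
P = np.eye(n) - np.outer(np.ones(n), w)
K_c = P @ K @ P.T
y_mean = w @ y                              # weighted target mean
y_c = y - y_mean

# dual solution of weighted kernel ridge: (K_c + lam * W^{-1}) alpha = y_c
dual_coef = linalg.solve(K_c + lam * np.diag(1.0 / sw), y_c, assume_a="pos")

# center the test kernel with the same weighted statistics, add intercept
K_t = X_test @ X.T
K_t_c = (K_t - np.outer(np.ones(len(X_test)), w @ K)) @ P.T
y_pred = K_t_c @ dual_coef + y_mean

# Ridge with sample_weight also centers with weighted means
ref = Ridge(alpha=lam, fit_intercept=True).fit(X, y, sample_weight=sw)
np.testing.assert_allclose(y_pred, ref.predict(X_test), rtol=1e-6)
```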
It seems the above solution may be incorrect. The adjustment for the intercept should be more than just the term `centerer.K_fit_rows_ @ model_b.dual_coef_`.
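For reference, here is what `centerer.transform` actually applies to a test kernel, numerically checked against scikit-learn (my own check, not from the thread); it makes visible the terms missing from the manual adjustment above.

```python
import numpy as np
from sklearn.preprocessing import KernelCenterer

rng = np.random.RandomState(0)
X_train, X_test = rng.randn(30, 5), rng.randn(10, 5)
K_train, K_test = X_train @ X_train.T, X_test @ X_train.T

centerer = KernelCenterer().fit(K_train)
manual = (K_test
          - centerer.K_fit_rows_            # per-column means of K_train
          - K_test.mean(axis=1)[:, None]    # per-row means of K_test
          + centerer.K_fit_all_)            # grand mean of K_train
np.testing.assert_allclose(manual, centerer.transform(K_test))
```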
I think you are right, the formula was not sufficient: the test kernel also has to be centered with the statistics of the training kernel. Here is the updated version of my script above using this fix.

```python
import numpy as np

from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.preprocessing import KernelCenterer

fit_intercept = True
need_intercept = True
n_samples = 100
n_features = 10
alpha = 0.1

###############################################################################
# create dataset
X_train = np.random.randn(n_samples, n_features)
X_test = np.random.randn(n_samples, n_features)
y_train = np.random.randn(n_samples)

if need_intercept:
    intercepts = np.random.randn(n_features) * 10
    X_train += intercepts
    X_test += intercepts
    y_train += 5
else:
    X_train -= X_train.mean(axis=0)
    y_train -= y_train.mean(axis=0)

###############################################################################
# model A: ridge regression
model_a = Ridge(alpha=alpha, fit_intercept=fit_intercept).fit(X_train, y_train)
y_pred_a = model_a.predict(X_test)

###############################################################################
# model B: kernel ridge regression with a linear kernel
if fit_intercept:
    # precompute kernel
    K_train = X_train @ X_train.T
    centerer = KernelCenterer()
    K_train_centered = centerer.fit_transform(K_train)
    # center target
    y_train_mean = y_train.mean(axis=0)
    y_train_centered = y_train - y_train_mean
    # fit centered model
    model_b = KernelRidge(alpha=alpha,
                          kernel="precomputed").fit(K_train_centered,
                                                    y_train_centered)
    K_test = X_test @ X_train.T
    K_test_centered = centerer.transform(K_test)
    y_pred_b = model_b.predict(K_test_centered)
    # add intercept
    intercept = y_train_mean
    y_pred_b += intercept
else:
    y_pred_b = KernelRidge(alpha=alpha,
                           kernel="linear").fit(X_train,
                                                y_train).predict(X_test)

###############################################################################
np.testing.assert_array_almost_equal(y_pred_a, y_pred_b, decimal=10)
```
Based on your comments, maybe the KernelRidge class can be modified like below?

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

class KernelRidgeRegression(KernelRidge):
    ...

if __name__ == '__main__':
    ...
```
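The body of the snippet above did not survive, so the following is a sketch of my own of what such a modification could look like, based on the centering recipe worked out in this thread; it is not the original snippet or scikit-learn API. The delegate-to-precomputed design is illustrative, and extra kernel parameters (gamma, degree, coef0) are ignored for brevity.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics.pairwise import pairwise_kernels
from sklearn.preprocessing import KernelCenterer


class KernelRidgeRegression(KernelRidge):
    """KernelRidge variant that fits an intercept by centering the
    kernel matrix and the targets (sketch, not scikit-learn API)."""

    def fit(self, X, y):
        K = pairwise_kernels(X, metric=self.kernel)
        self._centerer = KernelCenterer().fit(K)
        self._y_mean = np.asarray(y, dtype=float).mean(axis=0)
        # delegate to an internal precomputed-kernel model so that the
        # centered Gram matrix is not mistaken for raw features
        self._ridge = KernelRidge(alpha=self.alpha, kernel="precomputed")
        self._ridge.fit(self._centerer.transform(K), y - self._y_mean)
        self._X_fit = X
        return self

    def predict(self, X):
        K = pairwise_kernels(X, self._X_fit, metric=self.kernel)
        # center the test kernel with training statistics, add the mean back
        return self._ridge.predict(self._centerer.transform(K)) + self._y_mean


if __name__ == '__main__':
    rng = np.random.RandomState(0)
    X = rng.randn(100, 10) + 10
    y = rng.randn(100) + 5
    model = KernelRidgeRegression(alpha=0.1, kernel="linear").fit(X, y)
    print(model.predict(X[:5]))
```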
Will this be on the roadmap?
**Describe the workflow you want to enable**

Currently, Kernel Ridge Regression as implemented in sklearn.kernel_ridge.KernelRidge does not support an intercept like, e.g., sklearn.svm.SVR does. This can lead to problems if the target value is shifted.
**Describe your proposed solution**

Implement an intercept into KernelRidge (a possible user-facing API is sketched below).
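Something like the following, where fit_intercept is a hypothetical parameter that KernelRidge does not currently accept:

```python
from sklearn.kernel_ridge import KernelRidge

# hypothetical parameter, does not exist in scikit-learn's KernelRidge
model = KernelRidge(alpha=1.0, kernel="rbf", fit_intercept=True)
```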
**Describe alternatives you've considered, if relevant**

Document that there is no intercept present for KernelRidge and that, therefore, manual target value shifting is necessary. This is of course the easier path, with less work and maintenance.

**Additional context**
Demo of problems caused by missing intercept
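The linked demo is not reproduced here; a minimal illustration of my own of the failure mode:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = X @ rng.randn(5) + 100.0   # target shifted far from zero

pred = KernelRidge(alpha=1.0, kernel="linear").fit(X, y).predict(X)
# without an intercept, the ridge penalty shrinks predictions toward 0,
# so the constant offset is systematically underestimated
print(y.mean(), pred.mean())   # pred.mean() falls well short of ~100
```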