I used GaussianProcessRegressor like this:
```python
from sklearn.gaussian_process.kernels import Matern
from sklearn.gaussian_process import GaussianProcessRegressor
import numpy as np

ls = [
    ([2, 3, 4, 5], [1, 1, 3]),
    ([5, 3, 1, 3], [1, 4, 3]),
    ([2, 7, 2, 5], [8, 2, 3]),
    ([-2, 3, 4, 5], [1, 0, 4]),
    ([2, 8, 7, 5], [-1, 2, 3]),
]
X = np.array([ele[0] for ele in ls])
y = np.array([ele[1] for ele in ls])

gp = GaussianProcessRegressor(
    kernel=Matern(nu=2.5),
    alpha=1e-6,
    normalize_y=True,
    n_restarts_optimizer=5,
    random_state=np.random.RandomState(None),
)
gp.fit(X, y)

# x = np.array([200, 300, 600, 500.]).reshape(1, -1)
x = np.array([
    [2, 3, 6, 5.],
    [2, 3, 3, 5.],
    [5, 3, 1, 3.00001],
    [5, 3, 1, 3],
])
v, d = gp.predict(x, return_std=True)
print(v, d)

_, c = gp.predict(x, return_cov=True)
print(c)
```
But I got this error:

```
ValueError: operands could not be broadcast together with shapes (4,) (3,)
```
I scanned the code of GaussianProcessRegressor and found what happens in `predict(self, X, return_std=False, return_cov=False)`: in the code below, `y_var.shape == (4,)` but `self._y_train_std.shape == (3,)`:

```python
# undo normalisation
y_var = y_var * self._y_train_std ** 2
```
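The mismatch can be reproduced in isolation with plain numpy (a standalone sketch using the shapes from the example above, not the actual sklearn internals):

```python
import numpy as np

y_var = np.ones(4)         # predictive variance for 4 query points, shape (4,)
y_train_std = np.ones(3)   # per-target std for 3 targets, shape (3,)

# This is the elementwise product attempted in predict(); (4,) and (3,)
# are not broadcastable, so numpy raises a ValueError.
try:
    y_var * y_train_std ** 2
except ValueError as e:
    print(e)  # operands could not be broadcast together ...
```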
I think `y_var = y_var * self._y_train_std ** 2` works well only when there is a single target. In the multi-target case, it should be changed to something like this:

```python
# undo normalisation
# y_var = y_var * self._y_train_std ** 2
y_var = np.einsum("i,j->ij", y_var, self._y_train_std ** 2)  # outer product -> shape (n_samples, n_targets)
```
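To check that the proposed rescaling yields one variance per (sample, target) pair, here is a standalone sketch with made-up values whose shapes match the example above (4 query points, 3 targets):

```python
import numpy as np

y_var = np.arange(4.0)                   # per-point variance, shape (4,)
y_train_std = np.array([1.0, 2.0, 3.0])  # per-target std, shape (3,)

# Outer product: scale each point's variance by each target's variance.
y_var_rescaled = np.einsum("i,j->ij", y_var, y_train_std ** 2)
print(y_var_rescaled.shape)  # (4, 3)
print(y_var_rescaled[1])     # [1. 4. 9.]
```

This is equivalent to `np.outer(y_var, y_train_std ** 2)`, just written with einsum to stay close to the proposal above.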
PS: I think we also need to add a small `EPS` to `self._y_train_std` to avoid a division-by-zero error:

```python
# Normalize target value
if self.normalize_y:
    self._y_train_mean = np.mean(y, axis=0)
    # self._y_train_std = np.std(y, axis=0)
    self._y_train_std = np.std(y, axis=0) + self.EPS  # avoid division by zero
    # Remove mean and make unit variance
    y = (y - self._y_train_mean) / self._y_train_std
```
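Why the epsilon matters: if any target column is constant, its standard deviation is exactly zero and the normalisation divides by zero. A minimal sketch, using `np.finfo(np.float64).eps` as a hypothetical choice for `EPS` (the `self.EPS` attribute above is also hypothetical, not part of the current class):

```python
import numpy as np

EPS = np.finfo(np.float64).eps  # hypothetical epsilon value

y = np.array([[1.0, 5.0],
              [1.0, 7.0],
              [1.0, 9.0]])      # first target column is constant -> std == 0

y_mean = np.mean(y, axis=0)
y_std = np.std(y, axis=0) + EPS  # without EPS, y_std[0] == 0 and the division yields NaN

y_norm = (y - y_mean) / y_std
print(np.isfinite(y_norm).all())  # True
```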