GaussianProcessRegressor cannot correctly predict std in multi-target scene

I used `GaussianProcessRegressor` with a multi-target `y` (three targets per sample) like this:
```python
from sklearn.gaussian_process.kernels import Matern
from sklearn.gaussian_process import GaussianProcessRegressor
import numpy as np

ls = [
    ([2, 3, 4, 5], [1, 1, 3]),
    ([5, 3, 1, 3], [1, 4, 3]),
    ([2, 7, 2, 5], [8, 2, 3]),
    ([-2, 3, 4, 5], [1, 0, 4]),
    ([2, 8, 7, 5], [-1, 2, 3]),
]

X = np.array([ele[0] for ele in ls])
y = np.array([ele[1] for ele in ls])

gp = GaussianProcessRegressor(
    kernel=Matern(nu=2.5),
    alpha=1e-6,
    normalize_y=True,
    n_restarts_optimizer=5,
    random_state=np.random.RandomState(None),
)

gp.fit(X, y)
# x = np.array([200, 300, 600, 500.]).reshape(1, -1)
x = np.array([
    [2, 3, 6, 5.],
    [2, 3, 3, 5.],
    [5, 3, 1, 3.00001],
    [5, 3, 1, 3]
])

v, d = gp.predict(x, return_std=True)

print(v, d)

_, c = gp.predict(x, return_cov=True)

print(c)
```
But I got this error:

```
ValueError: operands could not be broadcast together with shapes (4,) (3,)
```

I scanned the code of `GaussianProcessRegressor` and found the cause in the method `predict(self, X, return_std=False, return_cov=False)`.
In the code below, `y_var.shape == (4,)` (one entry per query point) but `self._y_train_std.shape == (3,)` (one entry per target), so the two cannot be broadcast together:
```
# undo normalisation
y_var = y_var * self._y_train_std ** 2
```
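
The mismatch can be reproduced with NumPy alone. This is only a sketch: `y_var` and `y_train_std` are stand-ins for the arrays inside `predict`, and the values are made up (only the shapes matter).

```python
import numpy as np

# Stand-ins for the arrays inside predict(); values are invented for illustration.
y_var = np.array([0.5, 0.2, 1e-6, 0.0])   # one variance per query point, shape (4,)
y_train_std = np.array([3.2, 1.4, 0.4])   # one std per target, shape (3,)

try:
    y_var * y_train_std ** 2              # same expression as the "undo normalisation" step
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print(exc)
```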
I think `y_var = y_var * self._y_train_std ** 2` only works when there is a single target.
In the multi-target case, we should change it like this:
```python
# undo normalisation
# y_var = y_var * self._y_train_std ** 2
y_var = y_var.reshape((-1, 1))
y_var = np.einsum("ij,j->ij", y_var, self._y_train_std ** 2)
```
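
As a rough check of the intended result, the same rescaling can be sketched with plain broadcasting instead of `einsum`. The helper name `undo_normalisation` is hypothetical and not part of scikit-learn; it just shows that each query point ends up with one variance per target.

```python
import numpy as np

def undo_normalisation(y_var, y_train_std):
    """Hypothetical helper: scale per-sample variances by each target's squared std."""
    y_var = np.asarray(y_var, dtype=float)
    y_train_std = np.asarray(y_train_std, dtype=float)
    # (n_samples, 1) * (n_targets,) broadcasts to (n_samples, n_targets)
    return y_var[:, np.newaxis] * y_train_std ** 2

y_var = np.array([0.5, 0.2, 0.1, 0.0])    # made-up variances for 4 query points
y_train_std = np.array([2.0, 1.0, 0.5])   # made-up stds for 3 targets
scaled = undo_normalisation(y_var, y_train_std)
print(scaled.shape)
```

With a single target (`y_train_std` of shape `(1,)`), this returns shape `(n_samples, 1)`, so the result may still need to be squeezed to match the current single-target output.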
PS:
I think we should also add a small `EPS` to `self._y_train_std`, so that normalisation does not divide by zero (which produces NaN) when a target is constant across the training samples:
```python
# Normalize target value
if self.normalize_y:
    self._y_train_mean = np.mean(y, axis=0)
    # self._y_train_std = np.std(y, axis=0)
    self._y_train_std = np.std(y, axis=0) + self.EPS # avoid zero division error

    # Remove mean and make unit variance
    y = (y - self._y_train_mean) / self._y_train_std
```
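
A small NumPy sketch of why this matters, assuming one target column is constant in the training data. `EPS` and its value are hypothetical here, chosen only for illustration.

```python
import numpy as np

EPS = 1e-12  # hypothetical constant; the value is an assumption

# Second target is constant, so its standard deviation is exactly zero.
y = np.array([[1.0, 3.0],
              [4.0, 3.0],
              [2.0, 3.0]])
mean = np.mean(y, axis=0)
std = np.std(y, axis=0)

# Without EPS: 0 / 0 gives NaN in the constant column (NumPy warns, it does not raise).
with np.errstate(divide="ignore", invalid="ignore"):
    unsafe = (y - mean) / std

# With EPS: the constant column is normalised to zeros instead.
safe = (y - mean) / (std + EPS)

print(np.isnan(unsafe[:, 1]).all(), np.isfinite(safe).all())
```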

GaussianProcessRegressor cannot correctly predict std in multi-target scene #17394

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

GaussianProcessRegressor cannot correctly predict std in multi-target scene #17394

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions