Describe the bug
The standard deviation returned by `GaussianProcessRegressor.predict` is inconsistent between `return_std=True` and `return_cov=True`. My gut/eyeball says that the `return_cov` scenario is the correct one.

In `_gpr.py`, the construction of the inverse of the covariance matrix `K` seems unstable when using `solve_triangular` in the `return_std=True` scenario:
```python
L_inv = solve_triangular(self.L_.T,
                         np.eye(self.L_.shape[0]))
self._K_inv = L_inv.dot(L_inv.T)
```
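For comparison, the diagonal of the predictive variance can be obtained without ever forming `K_inv` explicitly, using a single triangular solve against the lower Cholesky factor. This is a hedged sketch, not scikit-learn's code; `predictive_std`, `L`, and the other names are illustrative:

```python
import numpy as np
from scipy.linalg import solve_triangular

def predictive_std(kernel, X_train, X_test, L):
    # Sketch of diag(K** - K* K^-1 K*^T) via a triangular solve.
    # L is assumed to be the lower Cholesky factor of K(X_train, X_train).
    K_trans = kernel(X_test, X_train)               # K*, shape (n_test, n_train)
    V = solve_triangular(L, K_trans.T, lower=True)  # V = L^-1 K*^T
    y_var = kernel.diag(X_test) - np.einsum("ij,ij->j", V, V)
    return np.sqrt(np.maximum(y_var, 0.0))          # clip round-off negatives
```

With the fitted model from the reproduction below, calling `predictive_std(m.kernel_, x_train, x_test, m.L_)` should track `std2` closely, since it avoids squaring the conditioning error the way the explicit inverse does.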
Steps/Code to Reproduce
This code reproduces the problem for me on a small example; the deviation seems worse on larger problems. The two methods of computing the stddev do not match. The `return_std=True` path produces variances smaller than 0, which are simply truncated to 0. Multiplying the computed `K_inv` by `K` does not yield an identity matrix; the product has off-diagonal terms on the order of 1e-5.

Note that the data being fit here is nearly linear, which is why the length scales are so large. My guess is that this contributes to the instability of the inversion of K (see the condition-number check after the script).
```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

import platform; print(platform.platform())
import sys; print("Python", sys.version)
print("NumPy", np.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)

# Kernel
noise = 1e-5
signal_var = 8.98576054e+05
length_factor = np.array([5.91326520e+02, 1.32584051e+03])
kernel = ConstantKernel(signal_var, (1e-12, 1e12)) * RBF(length_factor, (1e-12, 1e12)) + WhiteKernel(noise_level=noise)
m = GaussianProcessRegressor(kernel=kernel, alpha=0, n_restarts_optimizer=0, random_state=0)

# Training data
x_train = np.array([[ 0.        ,  0.        ],
                    [ 1.54919334, -0.77459667],
                    [-1.54919334,  0.        ],
                    [ 0.        , -1.54919334],
                    [ 0.77459667,  0.77459667],
                    [-0.77459667,  1.54919334]])
y_train = np.array([[-2.14882017e-10],
                    [-4.66975823e+00],
                    [ 4.01823986e+00],
                    [-1.30303674e+00],
                    [-1.35760156e+00],
                    [ 3.31215668e+00]])

# Fit. We must do this to register the GPR as "fitted".
m.fit(x_train, y_train)
theta = np.exp(m.kernel_.theta)
print(theta)

# Testing data
x_test = np.array([[-1.93649167, -1.93649167],
                   [ 1.93649167, -1.93649167],
                   [-1.93649167,  1.93649167],
                   [ 1.93649167,  1.93649167]])

# Predict the std dev in two ways
pred1, std1 = m.predict(x_test, return_std=True)
pred2, cov = m.predict(x_test, return_cov=True)
std2 = np.sqrt(np.diagonal(cov))

# Compare
print(std1)
print(std2)

# This should be the identity
K = m.kernel_(m.X_train_)
print(np.matmul(m._K_inv, K))
```
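Not part of the original repro, but a quick way to quantify the suspected ill-conditioning (`K` and `m` come from the script above):

```python
# With near-linear data and huge length scales, the RBF part of K is close
# to a rank-one constant matrix, so K should be badly conditioned.
print("cond(K)  =", np.linalg.cond(K))
print("cond(L_) =", np.linalg.cond(m.L_))  # for SPD K, cond(K) = cond(L)^2
```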
Expected Results
`std1` and `std2` should be relatively close, and `_K_inv` multiplied by `K` should yield the identity matrix.
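As a hedged sketch, the expectation could be written as assertions (the tolerances are illustrative, not scikit-learn's):

```python
import numpy as np

# Illustrative tolerances; the acceptable deviation is a judgment call.
np.testing.assert_allclose(std1, std2, rtol=1e-5)
np.testing.assert_allclose(m._K_inv @ K, np.eye(K.shape[0]), atol=1e-10)
```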
Actual Results
`std1` and `std2` do not match; `std1` produces negative variances that get set to 0:

```
[0. 0. 0. 0.]
[0.00754465 0.0063387  0.00642104 0.00834401]
```

The supposed identity has significant off-diagonal terms:

```
[[ 1.00003408e+00  1.26688982e-05  2.07282495e-05  1.40330489e-05
   1.30975155e-05  6.95791413e-06]
 [ 6.70887474e-06  1.00000728e+00  4.80152611e-06  6.97880660e-06
   4.38964496e-06  5.03122769e-06]
 [-2.73120041e-07  1.17911915e-06  1.00000163e+00  1.96316852e-07
   1.40319910e-06 -6.62906536e-07]
 [-1.28473718e-05 -3.97354831e-06 -3.31062869e-06  9.99990618e-01
  -4.49222162e-06 -6.98006418e-06]
 [-8.08983868e-06  5.34078880e-07  3.35425311e-06 -8.92775191e-06
   1.00000882e+00 -7.82074069e-07]
 [-2.70670598e-06  3.74244221e-06  1.10799128e-06  4.11810288e-06
  -7.87127389e-07  1.00000157e+00]]
```
A warning from `_gpr.py` points toward the issue:

```
/mnt/d/Projects/scikit-learn/sklearn/gaussian_process/_gpr.py:374: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
  warnings.warn("Predicted variances smaller than 0. "
```
Versions
```
Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.31
Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
[GCC 9.3.0]
NumPy 1.19.5
SciPy 1.6.2
Scikit-Learn 1.0.dev0
```