
Inconsistency in GaussianProcessRegressor between return_std and return_cov #19936

Closed
iwhalvic opened this issue Apr 20, 2021 · 2 comments

iwhalvic (Contributor) commented Apr 20, 2021

Describe the bug

Inconsistency in the standard deviation returned by GaussianProcessRegressor when predicting with return_std=True versus return_cov=True. My gut/eyeball says that the return_cov result is the correct one.

In _gpr.py, the construction of the inverse of the covariance matrix K appears numerically unstable when solve_triangular is used in the return_std=True path:

L_inv = solve_triangular(self.L_.T,
                         np.eye(self.L_.shape[0]))
self._K_inv = L_inv.dot(L_inv.T)
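
For what it's worth, this construction already drifts from the identity on a generic ill-conditioned SPD matrix. A minimal sketch (independent of scikit-learn; the Hilbert matrix is used purely as a stand-in for a badly conditioned K):

import numpy as np
from scipy.linalg import cholesky, hilbert, solve_triangular

#Hilbert matrix as a stand-in for an ill-conditioned SPD kernel matrix
#(condition number around 1e10 for n = 8)
K = hilbert(8)
L = cholesky(K, lower=True)
#same construction as in _gpr.py: invert the Cholesky factor, then square it
L_inv = solve_triangular(L.T, np.eye(L.shape[0]))
K_inv = L_inv.dot(L_inv.T)
#residual of K_inv @ K against the identity: far above machine epsilon
print(np.abs(K_inv @ K - np.eye(8)).max())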

Steps/Code to Reproduce

This code reproduces the problem for me on a small example; the deviations seem worse on larger problems. The two ways of computing the standard deviation do not match: the return_std=True path produces variances smaller than 0, which are simply clipped to 0. Multiplying the stored _K_inv by K does not yield an identity matrix and has off-diagonal terms on the order of 1e-5.

Note that the data being fit here is nearly linear, hence the very large length scales. My guess is that this contributes to the instability of the K inversion (a quick conditioning check is sketched right after the script).

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

import platform; print(platform.platform())
import sys; print("Python", sys.version)
print("NumPy", np.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)

#kernel
noise=1e-5
signal_var = 8.98576054e+05
length_factor = np.array([5.91326520e+02, 1.32584051e+03])
kernel = ConstantKernel(signal_var, (1e-12, 1e12)) * RBF(length_factor, (1e-12, 1e12)) + WhiteKernel(noise_level=noise)
m = GaussianProcessRegressor(kernel=kernel, alpha=0, n_restarts_optimizer=0, random_state=0)
#Setup Training Data
x_train = np.array([[ 0.        ,  0.        ],
                    [ 1.54919334, -0.77459667],
                    [-1.54919334,  0.        ],
                    [ 0.        , -1.54919334],
                    [ 0.77459667,  0.77459667],
                    [-0.77459667,  1.54919334]])
y_train = np.array([[-2.14882017e-10],
                    [-4.66975823e+00],
                    [ 4.01823986e+00],
                    [-1.30303674e+00],
                    [-1.35760156e+00],
                    [ 3.31215668e+00]])
#fit. We must do this to register the GPR as "fitted"
m.fit(x_train, y_train)
theta = np.exp(m.kernel_.theta)
print(theta)
#testing data
x_test = np.array([[-1.93649167, -1.93649167],
                   [ 1.93649167, -1.93649167],
                   [-1.93649167,  1.93649167],
                   [ 1.93649167,  1.93649167]])
#Predict the std_dev in two ways
pred1, std1 = m.predict(x_test, return_std=True)
pred2, cov = m.predict(x_test, return_cov=True)
std2=np.sqrt(np.diagonal(cov))
#Compare
print(std1)
print(std2)
#This should be the identity
K = m.kernel_(m.X_train_)
print(np.matmul(m._K_inv,K))
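
As a quick follow-up check (a hypothetical continuation of the script above, not part of the original report), the conditioning of the fitted kernel matrix K can be inspected directly; a very large condition number here would explain the loss of accuracy in _K_inv:

#how ill-conditioned is the fitted kernel matrix?
print(np.linalg.cond(K))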


Expected Results

std1 and std2 should be relatively close, and _K_inv multiplied by K should yield (approximately) the identity matrix.
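
For reference, both prediction paths estimate the same standard GP posterior variance (K here being the training covariance including the white-noise term), so in exact arithmetic they would coincide and any difference is purely numerical:

\sigma_*^2 = k(x_*, x_*) - k(x_*, X) \, K^{-1} \, k(X, x_*)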

Actual Results

std1 and std2 do not match; std1 is clipped to 0 by the negative-variance handling:

[0. 0. 0. 0.]
[0.00754465 0.0063387  0.00642104 0.00834401]

The supposed identity matrix has significant off-diagonal terms:

[[ 1.00003408e+00  1.26688982e-05  2.07282495e-05  1.40330489e-05
   1.30975155e-05  6.95791413e-06]
 [ 6.70887474e-06  1.00000728e+00  4.80152611e-06  6.97880660e-06
   4.38964496e-06  5.03122769e-06]
 [-2.73120041e-07  1.17911915e-06  1.00000163e+00  1.96316852e-07
   1.40319910e-06 -6.62906536e-07]
 [-1.28473718e-05 -3.97354831e-06 -3.31062869e-06  9.99990618e-01
  -4.49222162e-06 -6.98006418e-06]
 [-8.08983868e-06  5.34078880e-07  3.35425311e-06 -8.92775191e-06
   1.00000882e+00 -7.82074069e-07]
 [-2.70670598e-06  3.74244221e-06  1.10799128e-06  4.11810288e-06
  -7.87127389e-07  1.00000157e+00]]

The warning emitted from _gpr.py also points toward the issue:

/mnt/d/Projects/scikit-learn/sklearn/gaussian_process/_gpr.py:374: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
  warnings.warn("Predicted variances smaller than 0. "

Versions

Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.31
Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
[GCC 9.3.0]
NumPy 1.19.5
SciPy 1.6.2
Scikit-Learn 1.0.dev0
iwhalvic (Contributor, Author) commented:
I believe I have a solution to this, though it means that _K_inv is never explicitly formed and we instead rely on the Cholesky decomposition (as we already do in the return_cov case). We could still form _K_inv for compatibility.
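
A rough sketch of that idea (my illustration only, not the actual patch; gpr stands for a fitted GaussianProcessRegressor and X_test for the query points):

import numpy as np
from scipy.linalg import solve_triangular

#predictive std via the stored Cholesky factor L_, without ever forming _K_inv
K_trans = gpr.kernel_(X_test, gpr.X_train_)          #k(x_*, X), shape (n_test, n_train)
V = solve_triangular(gpr.L_, K_trans.T, lower=True)  #L^{-1} k(X, x_*)
y_var = gpr.kernel_.diag(X_test) - np.einsum("ij,ij->j", V, V)
y_std = np.sqrt(np.clip(y_var, 0, None))             #clip guards tiny negative round-off

Subtracting the squared triangular solve reuses the already-computed L_ and avoids materializing K^{-1}, which appears to be where the accuracy is lost in the current return_std path.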

glemaitre (Member) commented:
Solved in #19939
