Describe the bug
There is an inconsistency in the standard deviation returned by GaussianProcessRegressor when predicting with return_std=True versus return_cov=True. My gut/eyeball says that the return_cov result is the correct one. In _gpr.py, the explicit construction of the inverse covariance matrix K_inv via solve_triangular in the return_std=True path appears to be numerically unstable.
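For context, the return_std=True path in _gpr.py currently does roughly the following (paraphrased from the source, so details may differ slightly by version): it inverts L explicitly and squares the result, which is where I suspect the precision is lost.
# roughly what _gpr.py's predict() does for return_std=True
L_inv = solve_triangular(self.L_.T, np.eye(self.L_.shape[0]))
self._K_inv = L_inv.dot(L_inv.T)
y_var = self.kernel_.diag(X)
y_var -= np.einsum("ij,ij->i", np.dot(K_trans, self._K_inv), K_trans)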
Steps/Code to Reproduce
The code below reproduces the problem for me on a small example; the deviations seem worse on larger problems. The two ways of computing the standard deviation do not match: the return_std=True path produces variances smaller than 0, which are simply truncated to 0. Multiplying the computed K_inv by K does not yield an identity matrix, and leaves off-diagonal terms on the order of 1e-5.
Note that the data being fit here is nearly linear, hence the very large length factors. My guess is that this contributes to the instability of the inversion of K.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel
import platform; print(platform.platform())
import sys; print("Python", sys.version)
print("NumPy", np.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
# Kernel
noise=1e-5
signal_var = 8.98576054e+05
length_factor = np.array([5.91326520e+02, 1.32584051e+03])
kernel = ConstantKernel(signal_var, (1e-12, 1e12)) * RBF(length_factor, (1e-12, 1e12)) + WhiteKernel(noise_level=noise)
m = GaussianProcessRegressor(kernel=kernel, alpha=0, n_restarts_optimizer=0, random_state=0)
# Set up training data
x_train = np.array([[ 0.        ,  0.        ],
                    [ 1.54919334, -0.77459667],
                    [-1.54919334,  0.        ],
                    [ 0.        , -1.54919334],
                    [ 0.77459667,  0.77459667],
                    [-0.77459667,  1.54919334]])
y_train = np.array([[-2.14882017e-10],
                    [-4.66975823e+00],
                    [ 4.01823986e+00],
                    [-1.30303674e+00],
                    [-1.35760156e+00],
                    [ 3.31215668e+00]])
# Fit. We must do this to register the GPR as "fitted"
m.fit(x_train, y_train)
theta = np.exp(m.kernel_.theta)
print(theta)
# Testing data
x_test = np.array([[-1.93649167, -1.93649167],
                   [ 1.93649167, -1.93649167],
                   [-1.93649167,  1.93649167],
                   [ 1.93649167,  1.93649167]])
# Predict the std dev in two ways
pred1, std1 = m.predict(x_test, return_std=True)
pred2, cov = m.predict(x_test, return_cov=True)
std2 = np.sqrt(np.diagonal(cov))
# Compare
print(std1)
print(std2)
# This should be the identity
K = m.kernel_(m.X_train_)
print(np.matmul(m._K_inv, K))
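As a quick sanity check on that near-linear-data guess (my own addition, not part of the original failure), the conditioning of K can be inspected directly; with signal_var around 9e5 against a noise level of 1e-5, the condition number should be enormous:
# how ill-conditioned is the training covariance?
print("cond(K) =", np.linalg.cond(K))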
Expected Results
std1 and std2 should be relatively close, and _K_inv multiplied by K should yield the identity matrix.
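Stated as a check (the tolerances here are illustrative choices of mine, not from the report):
np.testing.assert_allclose(std1, std2, rtol=1e-5)
np.testing.assert_allclose(np.matmul(m._K_inv, K), np.eye(K.shape[0]), atol=1e-8)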
Actual Results
std1 and std2 do not match: std1 contains negative variances that get clipped to 0, and the supposed identity matrix has significant off-diagonal terms. The warning raised in _gpr.py points toward the issue:
/mnt/d/Projects/scikit-learn/sklearn/gaussian_process/_gpr.py:374: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
warnings.warn("Predicted variances smaller than 0. "
I believe I have a solution to this, though it means that _K_inv is never explicitly formed and we rely on the Cholesky decomposition instead (as we already do in the return_cov case). We could still form _K_inv lazily for backward compatibility.
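A sketch of what I have in mind, reusing the names from the script above (my proposal, not current library code): solve L V = k(X_train, x_test) with a single triangular solve, then the predictive variance is the prior diagonal minus the column-wise sum of squares of V, with no explicit inverse anywhere.
from scipy.linalg import solve_triangular
# k(x_test, X_train), shape (n_test, n_train)
K_trans = m.kernel_(x_test, m.X_train_)
# V = L^-1 k(X_train, x_test): since K = L L^T, K_trans K^-1 K_trans^T = V^T V
V = solve_triangular(m.L_, K_trans.T, lower=True)
# the diagonal of V^T V is the column-wise sum of squares of V
y_var = m.kernel_.diag(x_test) - np.sum(V**2, axis=0)
print(np.sqrt(y_var))  # should agree with std2 from the return_cov path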