Fix Inconsistency in GaussianProcessRegressor between return_std and return_cov #19936 #19980
Reference Issues/PRs
Fixes #19936: Inconsistency in GaussianProcessRegressor between return_std and return_cov
What does this implement/fix? Explain your changes.
This pull request fixes the inconsistency in GaussianProcessRegressor between return_std and return_cov. As discussed in issue #19936, GaussianProcessRegressor.predict(X, return_std=True) and GaussianProcessRegressor.predict(X, return_cov=True) return inconsistent predicted variances at the query points X. The discrepancy comes from numerical instability in the solve_triangular step of GaussianProcessRegressor.predict. I replaced it with cho_solve((self.L_, True), K_trans.T) to compute kernel(X_train, X_train)^(-1) @ kernel(X_train, X_test), and adjusted the subscripts of the subsequent np.einsum that computes the predicted variance. I added 10 randomized test cases checking that the results from return_std and return_cov agree; all of them pass in under one second.
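For reference, here is a minimal sketch of the idea (not the exact diff). It assumes `L` is the lower Cholesky factor of kernel(X_train, X_train) and `K_trans = kernel(X_test, X_train)`, and shows how deriving both the variance and the covariance from the same cho_solve result keeps return_std and return_cov consistent:

```python
import numpy as np
from scipy.linalg import cho_solve


def predictive_std_and_cov(kernel, X_test, L, K_trans):
    # V = K(X_train, X_train)^{-1} @ K(X_train, X_test), computed from the
    # lower Cholesky factor L instead of explicitly inverting the triangular
    # factor (the source of the round-off mismatch).
    V = cho_solve((L, True), K_trans.T)          # shape (n_train, n_test)

    # Full predictive covariance: K(X_test, X_test) - K_trans @ V
    K_test = kernel(X_test)
    y_cov = K_test - K_trans @ V

    # Predictive variance computed from the *same* V, so return_std matches
    # the diagonal of return_cov up to floating-point round-off.
    y_var = np.diag(K_test).copy()
    y_var -= np.einsum("ij,ji->i", K_trans, V)   # diag(K_trans @ V)
    return np.sqrt(np.clip(y_var, 0, None)), y_cov
```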
Any other comments?
I also tried other approaches to fix this bug. For example, I used np.linalg.solve instead of solve_triangular to invert the kernel matrix evaluated at X_train. However, since that kernel matrix is symmetric positive definite and its Cholesky factor is already available from fit, a solve based on the Cholesky decomposition (cho_solve) is the most efficient option. Local benchmarking confirmed the performance difference.
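To illustrate that design choice (this is a hypothetical comparison, not the PR's benchmark): once a Cholesky factor of a symmetric positive-definite matrix is available, cho_solve only performs two triangular solves, whereas np.linalg.solve refactorizes the matrix on every call.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from timeit import timeit

rng = np.random.RandomState(0)
A = rng.randn(500, 500)
K = A @ A.T + 500 * np.eye(500)       # symmetric positive definite by construction
B = rng.randn(500, 100)

# Factorize once, as GaussianProcessRegressor does during fit.
L, lower = cho_factor(K, lower=True)

t_general = timeit(lambda: np.linalg.solve(K, B), number=20)
t_cholesky = timeit(lambda: cho_solve((L, lower), B), number=20)
print(f"np.linalg.solve: {t_general:.3f}s, cho_solve: {t_cholesky:.3f}s")

# Both routes give the same solution up to round-off.
np.testing.assert_allclose(
    np.linalg.solve(K, B), cho_solve((L, lower), B), rtol=1e-7, atol=1e-10
)
```

In GaussianProcessRegressor the factor self.L_ is computed during fit, so predict only pays for the triangular solves.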