FIX Use cho_solve when return_std=True for GaussianProcessRegressor #19939
To be honest I prefer to have the URL :)
Ah yes, I have just seen that both styles exist.
@iwhalvic: you can ignore this comment.
Are those values the ones for the case you are describing?
Yes. This kernel is nearly singular: #19939 (comment)
I assume that this is fine for a non-regression test.
If the problem is about the diagonal elements of `gpr.L_` being close to zero, maybe we could make that explicit in the test, although I am not 100% sure this is the true cause of the problem.
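A minimal sketch of what such an explicit check could look like, assuming a near-singular RBF kernel with a very long length scale (the data, kernel, and quantity inspected below are illustrative assumptions, not the PR's actual test):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# A very long length scale makes all training points almost perfectly
# correlated, i.e. the kernel matrix is close to singular.
X = np.linspace(0.0, 10.0, 15).reshape(-1, 1)
y = 3.0 * X.ravel() + 1.0
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=100.0), optimizer=None)
gpr.fit(X, y)

# If vanishing diagonal entries of L_ were the root cause, some of these
# values would collapse towards zero.
print(np.min(np.diag(gpr.L_)))
```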
Or maybe:
The condition number of `gpr.L_` might also be a good proxy to assert whether solving this system comes with numerical instability. In this case, numerical instability is prone to be present since `cond(gpr.L_) >> 1`.
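A rough illustration of that proxy, built directly from an assumed near-singular RBF kernel matrix plus a small jitter (the length scale, jitter, and data are assumptions, not the values from the PR):

```python
import numpy as np
from scipy.linalg import cholesky
from sklearn.gaussian_process.kernels import RBF

# Near-singular kernel matrix plus a small jitter on the diagonal,
# mimicking what GaussianProcessRegressor factorizes into L_.
X = np.linspace(0.0, 10.0, 15).reshape(-1, 1)
K = RBF(length_scale=100.0)(X) + 1e-10 * np.eye(len(X))
L = cholesky(K, lower=True)  # plays the role of gpr.L_

# A condition number much larger than 1 means that solving against L.T
# (what main does to build the explicit inverse) is numerically unstable,
# even though L itself is not singular.
print(np.linalg.cond(L))
```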
Yep, this is exactly what we can observe on the `main` branch when we try to invert `L_.T` with a constant target. I printed the condition number of `L_`, and when it fails for the constant target, it is due to the fact that `L_` is ill-conditioned. When the condition number is >> 1, the inversion is numerically unstable, but `L_` is not singular.
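A rough reproduction sketch of that observation; the data are made up, and whether the optimizer actually drives the length scale to its upper bound can depend on the setup:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Constant target: maximizing the log-marginal likelihood tends to push the
# RBF length scale towards its upper bound, leaving the kernel matrix, and
# hence L_, badly conditioned.
X = np.linspace(0.0, 10.0, 15).reshape(-1, 1)
y = np.full(X.shape[0], 5.0)
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)

print("fitted length scale:", gpr.kernel_.length_scale)
print("cond(L_):", np.linalg.cond(gpr.L_))
print("min |diag(L_)|:", np.min(np.abs(np.diag(gpr.L_))))  # nonzero, so not singular
```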
Just adding my comments on what is occurring here:

- Covariance matrices must be positive semi-definite, so all eigenvalues are >= 0.
- Inverting a matrix requires all eigenvalues to be != 0.
- So, theoretically, we could have a covariance matrix with a 0 eigenvalue, i.e. a non-invertible one.

Minimal example: if I ask you to invert A = [[1 1] [1 1]], you will likely not have an answer. By analogy, the previous code using `solve_triangular(self.L_.T, np.eye(self.L_.shape[0]))` is asking: "Solve Ax = [1 0]". Once again, there is no solution for x.

In the fixed code, `cho_solve((self.L_, True), K_trans.T)`, both `L_` and `K_trans` are generated from the same kernel function, so we would not expect to ever be posed with such a problem. Instead we pose a problem such as "Solve Ax = [1 1]", at which point you can probably provide an answer (if not quite a few).

Kernel functions such as the one in the test case form matrices like A = [[1 1] [1 1]] because the length scales of the RBF are quite large, so all points become highly correlated with all other points (almost regardless of their distance in X space). I believe a true linear fit using a GP with an RBF kernel would have a correlation length of inf.
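A toy NumPy illustration of this analogy (not scikit-learn code): with the singular matrix A, the right-hand side [1, 0] is inconsistent, while [1, 1] lies in the column space and can be solved exactly.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])  # the singular "kernel-like" matrix from the analogy

for b in (np.array([1.0, 0.0]),   # the kind of rhs posed by the old explicit inverse
          np.array([1.0, 1.0])):  # the kind of rhs posed by the cho_solve-based fix
    # Minimum-norm least-squares solution, since A itself cannot be inverted.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = np.linalg.norm(A @ x - b)
    print(f"b={b}, residual={residual:.3f}")  # nonzero for [1, 0], ~0 for [1, 1]
```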
The `alpha` parameter in GaussianProcessRegressor attempts to avoid this, and with a sufficiently large `alpha` value I predict it would for this case. But regardless, `return_cov=True` and `return_std=True` should be as consistent as possible. But I would welcome a GP expert to weigh in.