-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
BUG Fix covariance and stdev shape in GPR with normalize_y #22199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -682,3 +683,43 @@ def test_y_std_with_multitarget_normalized(): | |||
assert y_pred.shape == (n_samples, n_targets) | |||
assert y_std.shape == (n_samples, n_targets) | |||
assert y_cov.shape == (n_samples, n_samples, n_targets) | |||
|
|||
|
|||
def test_y_std_cov_with_multitarget(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this test, it would be better to use pytest.mark.parametrize
on the previous test to avoid code redundancy.
In addition, I see that I already did a PR but forgot about it: #21996
It was covering one of the problems. Could you check that the solution is equivalent? I also see that we made an additional test to check the value of the std. dev. and covariance. This should be added in yours as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Yes your solution in #21996 is equivalent. In addition, I added line 399-400 which make sure that y_mean
is also the right shape corresponding to the docstring for gpr.predict
.
I merged your fixes in with this branch and updated the test use pytest.mark.parametrize
as suggested. Is the test for the values of std. dev. and covariance that you mentioned the one included in #21996?
…to use pytest.mark.parametrize
…nto gpr_shapes
"""Check the proper normalization of `y_std` and `y_cov` in multi-target scene. | ||
@pytest.mark.parametrize("normalize_y", [True, False]) | ||
@pytest.mark.parametrize("n_targets", [0, 1, 10]) | ||
def test_multitarget_shape(normalize_y, n_targets): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this test is only testing multi-target I think that we don't need to parameterize it with n_targets
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes I think the name on the test should be changed. I included n_targets = 0 and 1 because it was having problems with that, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
… n_targets=None in test parameterization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I merged the comment. The PR LGTM.
@thomasjpfan since you reviewed my original PR, you might want to have a look such that we can merge this fix. I thought that we added it in 1.0.2 indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits in the test, otherwise LGTM
Thanks for working on this PR @Tenavi ! |
…arn#22199) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Nakamura-Zimmerer, Tenavi (ARC-AF) <tenavi.nakamura-zimmerer@nasa.gov>
* Updating to scipy 1.12 * Scikitlearn 1.0 is incompatible with scipy 1.12, so updating to 1.1 * Sometimes only a 1d array is returned. * Removing testing kulsinski metric. * simps is deprecated so switching to simpson * Handle change from scikit-learn/scikit-learn#22199 * Improving test to classify better. * Regold due to scikit change scikit-learn/scikit-learn#22604 * Regold PolyExponential files (had rel err of 1e-03 or less) * Make timestep uniform for scipy update. * Regolding because of changes in scipy 1.12 * Increase limits to improve convergence. * Unpinning xarray and updating numpy * Updating various libraries. * Fix working with newer tensorflow. * Values need to be switched to tuples for hstack in numpy 1.26 * Updating to new ray version. * The deque size can be bigger in python 3.11 * Report difference in row lengths, instead of crashing OrderedCSVDiffer. Also report gold file name. * Remove Fourier__signal_f__period10.0__phase This was either +pi or -pi semirandomly, so nolonger testing it. * Regolding changes to ROM/TimeSeries/DMD/BOPDMD because of library changes. * Support xarray 2024.7 and newer. Pre 2024.7 automatically squeeze()ed groupby results, so now need to explicitly call squeeze(). * Fixing long line. * Increasing zero threshold because of change in libraries. * Remove version from setuptools since ray updated. * Optimizing persistence in BayesianMatyas. * Switch OVO to use estimator that is not constantly zero. * Use keepdims instead of try catch block. * Updating to default using python 3.11
Reference Issues/PRs
Resolves #22174
Resolves #22175
What does this implement/fix? Explain your changes.
_y_train_std
the appropriate shape in casenormalize_y=False
. Thennumpy.outer(y_var, self._y_train_std**2)
andnumpy.outer(y_cov, self._y_train_std**2)
always maken_targets
predictions. These still get squeezed as per convention. This fixes the shapes of arrays returned bypredict
whenreturn_std=True
orreturn_cov=True
.y_mean
inpredict
if the shape is(n_samples, 1)
to make this(n_samples,)
, following the documentation.sample_y
, whenn_targets > 1
then the loop in multivariate normal predictions also loops over the last axis (targets) of the predicted covariance array. This fixes the problem with incorrect shapes returned bysample_y
.Any other comments?
An old test had to be modified to reflect the correct shapes output by
predict
.