
Handling division by zero std for GaussianProcessRegressor #19361


Closed

Conversation


@sobkevich sobkevich commented Feb 5, 2021

Reference Issues/PRs

Related to #994, which was posted to scikit-optimize but actually refers to scikit-learn.

What does this implement/fix?

This PR fixes the behavior for cases where std is zero (or contains zeros).
Currently, division by a zero std (during target value normalization) produces an array of NaNs, which leads to ValueError: array must not contain infs or NaNs.
In the new version, division by zero is handled the same way as in StandardScaler.
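
For context, a minimal sketch of the failure mode with constant targets (illustration only, not this PR's code):

import numpy as np

y = np.full(5, 3.0)                    # constant targets -> std == 0
y_mean, y_std = y.mean(), y.std()
y_norm = (y - y_mean) / y_std          # division by zero produces an array of nan
print(y_norm)                          # [nan nan nan nan nan]

# StandardScaler-style handling: a zero scale is treated as 1, so the
# already-centered values pass through unchanged.
safe_std = y_std if y_std != 0 else 1.0
print((y - y_mean) / safe_std)         # [0. 0. 0. 0. 0.]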

@@ -519,3 +519,20 @@ def _constrained_optimization(self, obj_func, initial_theta, bounds):

def _more_tags(self):
return {'requires_fit': False}


def _handle_zeros_in_std(std, copy=True):
Contributor


Why not rather import the existing one?

from sklearn.preprocessing._data import _handle_zeros_in_scale
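
Roughly, that private helper replaces zero entries of a scale array with 1; a quick illustration (assuming the import path and signature above, which may differ across scikit-learn versions):

import numpy as np
from sklearn.preprocessing._data import _handle_zeros_in_scale

std = np.array([2.0, 0.0, 0.5])
print(_handle_zeros_in_scale(std, copy=True))   # expected: [2.  1.  0.5]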

Author

@sobkevich sobkevich Feb 6, 2021


@kernc
I thought it would be "unconventional" to import a function with an underscore prefix. I'll fix this.
Update: done.

Contributor


Ideally, the function would be rephrased as return np.where(std, std, 1, copy=copy) and phased out.

Author


Done.
It seems that np.where() doesn't have a copy argument; please let me know if I'm missing something.

Contributor


It doesn't. I said ideally. 😅

You are aware I'm not a maintainer here? With the amount of change now, this might have fewer chances of being accepted. Change is expensive. Not to mention that np.where() always produces a copy, which might be less efficient. Best get some maintainer feedback first.
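
A small sketch of both points (illustration only): np.where() has no copy parameter and always allocates a new array.

import numpy as np

std = np.array([2.0, 0.0, 0.5])
safe = np.where(std != 0, std, 1.0)   # zero entries replaced by 1
print(safe)                           # [2.  1.  0.5]
print(safe is std)                    # False: np.where returned a new array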

Author


You are aware I'm not a maintainer here?

Sorry, I've just noticed that 🤦‍♀️

You are right; in some cases the additional memory use caused by np.where() may matter, so it seems logical not to touch the parts that already work correctly and to go back to using _handle_zeros_in_scale with copy=False in GaussianProcessRegressor.

@kernc
Contributor

kernc commented Feb 6, 2021

Also make sure the linters pass. It's what's failing the CI now.

@cmarmo
Contributor

cmarmo commented Feb 7, 2021

Hi @sobkevich, thanks for your pull request. Note that this pull request could fix #18318, for which another pull request (#18388) was already opened.

@sobkevich
Author

@cmarmo thank you for letting me know that there is already a PR fixing this behavior. Could you please tell me whether I should close this pull request now or after PR #18388 is merged?

@cmarmo
Contributor

cmarmo commented Feb 13, 2021

Could you please tell me whether I should close this pull request now or after PR #18388 is merged?

To be honest... I don't know... I find your solution quite 'à la scikit-learn'... :) ... let's wait and see whether there is any answer in the previous PR. Thanks for your patience.

@@ -200,7 +201,8 @@ def fit(self, X, y):
self._y_train_std = np.std(y, axis=0)

# Remove mean and make unit variance
y = (y - self._y_train_mean) / self._y_train_std
y = (y - self._y_train_mean) / \

@jaburke166 jaburke166 Feb 15, 2021


Hi there!

I'm very new to the whole "contributing" side of GitHub but I've been affected by the std_dev==0 issue that has seemingly been discussed since as early as November of last year! I'm glad this is being rectified now.

The reason I'm commenting is that I copied this fix into the current version of _gpr.py (version 0.24.0), but the issue persists. My thought is that self._y_train_std needs to be stored as _handle_zeros_in_scale(np.std(y, axis=0), copy=False) (or something similar), instead of the scaling issue being dealt with only in the denominator.

I write this because on line 201 self._y_train_std will still be 0 and will be carried to other functions (like predict() and sample_y()). For example, further downstream in predict() the covariance matrix on line 357 will be multiplied by this small number: y_cov = y_cov * self._y_train_std**2. This will subsequently cause issues in sample_y() as well.

I could be wrong here as I'm no expert, but I thought I'd comment my thoughts nonetheless!
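
To make the concern concrete, a sketch with made-up numbers (not the library code): if the stored std stays 0, inverting the normalization in predict() zeroes out the covariance.

import numpy as np

y_train_std = 0.0                      # constant targets whose std was never replaced by 1
y_cov = np.array([[1.0, 0.2],          # covariance computed in the normalized space
                  [0.2, 1.0]])
y_cov = y_cov * y_train_std ** 2       # same un-scaling as on line 357
print(y_cov)                           # all zeros, which then degrades sample_y()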

Author


@jaburke166
Thank you for your comment. I've just added a case for y_with_zero_std to test_y_normalization. It works with the current approach but fails if self._y_train_std = _handle_zeros_in_scale(np.std(y, axis=0), copy=False).
Could you please provide more details on your concern about passing zeros as std to predict()?


@jaburke166 jaburke166 Feb 16, 2021


Hi @sobkevich,

With the changes you've made to _gpr.py, it seems that self._y_train_std is not overwritten by _handle_zeros_in_scale() in the case where self._y_train_std is extremely small or 0. Wouldn't this mean that further downstream in predict(self, X, return_cov=True), when the normalisation is inverted (lines 349-350 and 356-357), the output of y_mean and y_cov will not be aligned with the fact that we've set self._y_train_std=1?

Given the assumption that self._y_train_std is almost 0, squaring it and multiplying it by y_cov on line 357 will surely run into numerical issues. These could result in y_cov losing its positive semi-definite property, so when sampling from the GP in sample_y() on line 416 (or 419 for multidimensional y) using rng.multivariate_normal(), we would be trying to sample from a non-positive semi-definite covariance matrix, which would throw back rubbish samples (I think, but I am no GP expert at all).
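
A related degenerate case, sketched with toy values (not the library code): with an exactly zero covariance, multivariate_normal returns the mean for every draw, so sample_y() would produce no variability at all.

import numpy as np

rng = np.random.RandomState(0)
mean = np.array([1.0, 2.0])
cov = np.zeros((2, 2))                 # what y_cov * 0**2 amounts to
samples = rng.multivariate_normal(mean, cov, size=3)
print(samples)                         # every row equals the mean exactly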


Hi @sobkevich,

I'm wondering if you've had any time to have a look at this?

Member


@jaburke166 I believe that the fix you suggest would be to define:

self._y_train_std = _handle_zeros_in_scale(np.std(y, axis=0), copy=False)

instead of detecting constant y at scaling time but I would like to be sure.

I am worried that this would also be wrong in some other way, but I am not sure.

Could you suggest the code snippet for a new test case to add to the test suite that would highlight the problem you are raising above?
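
For example, a hypothetical test sketch along those lines (the test name, kernel choice, and assertions are mine, not the test that was eventually merged):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def test_constant_target_normalize_y():
    # Constant targets have zero std; with normalize_y=True this used to
    # produce NaNs during normalization.
    X = np.linspace(0, 1, 10).reshape(-1, 1)
    y = np.full(10, 3.0)
    gpr = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    gpr.fit(X, y)
    y_mean, y_std = gpr.predict(X, return_std=True)
    assert not np.isnan(y_mean).any()
    assert not np.isnan(y_std).any()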

Member


@jblackburne this solution is being implemented in #18388 but it lacks good tests so feel free to suggest one there.

Author


@ogrisel
@jaburke166, yes, you're right that with my approach all y_cov values are going to be filled with zeros, so I tried the fix you suggested: self._y_train_std = _handle_zeros_in_scale(np.std(y, axis=0), copy=False). With that approach y_cov would still be filled with zeros for some kernels (if y_cov has only zero elements it is still positive semi-definite because all eigenvalues are >= 0; in that case all samples would sit at the y_mean point), but for other kernels it is not zero-like, so it is a better solution.
But I've also noticed that the test test_predict_cov_vs_std(kernel) fails when it receives y_with_zero_std, both with and without normalization. I've considered adding some noise np.random.normal(0, 1e-8, y.shape) to y instead of using _handle_zeros_in_scale; it seems to be a working solution, but I'm also not a GP expert.
Tomorrow I'll check whether I missed anything in the tests.
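
A small sketch of the jitter idea mentioned above (toy values; the 1e-8 scale comes from the comment):

import numpy as np

rng = np.random.RandomState(0)
y = np.full(20, 5.0)                         # constant targets
y_jittered = y + rng.normal(0, 1e-8, y.shape)
print(y.std(), y_jittered.std())             # 0.0 vs. a tiny positive number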

@cmarmo
Contributor

cmarmo commented Apr 21, 2021

Solved in #19703.

@cmarmo cmarmo closed this Apr 21, 2021
Labels
module:gaussian_process, Superseded (PR has been replaced by a newer PR)