Fix RuntimeWarning: invalid value encountered in test_calibration.py #19421


Merged
merged 20 commits into scikit-learn:main from t-kusanagi:fix_test_calibration
Feb 11, 2021

Conversation

t-kusanagi
Contributor

Reference Issues/PRs

#19334

What does this implement/fix? Explain your changes.

The RuntimeWarning was caused by the _CalibratedClassifier.predict_proba function.
The warning occurs when np.sum(proba, axis=1)[:, np.newaxis] has some zero elements.
So I use np.divide to avoid it; without the out parameter, we would still get a warning from assert_allclose(...) in test_calibration_multiclass (see the sketch after the note below).

Note:

Things I've tried that don't work:

  • Use with np.errstate(divide='ignore')
    • With pytest, we still get the warning.
  • np.divide(proba, denominator, where=denominator != 0.0) (without the out parameter)
    • The warning about zero division is resolved, but another warning is raised from assert_allclose(...) in test_calibration_multiclass.
      • proba[np.isnan(proba)] = 1. / n_classes is not enough in this case: some of the values do not become NaN, so the rows of proba cannot sum to 1.0.
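
For illustration, a minimal standalone sketch of the np.divide approach described above (variable names mirror the discussion; the exact code in the PR may differ):

import numpy as np

# Toy input: three samples, one of which has all-zero per-class probabilities,
# the case that previously raised "RuntimeWarning: invalid value encountered".
n_classes = 3
proba = np.array([[0.2, 0.5, 0.3],
                  [0.0, 0.0, 0.0],
                  [0.1, 0.1, 0.8]])

denominator = np.sum(proba, axis=1)[:, np.newaxis]

# Pre-fill the output with the uniform distribution so that rows with a zero
# denominator fall back to 1 / n_classes, and only divide where the
# denominator is non-zero: no 0/0 is ever evaluated, so no warning is raised
# and there is no NaN left to patch up afterwards.
uniform_proba = np.full_like(proba, 1.0 / n_classes)
proba = np.divide(proba, denominator, out=uniform_proba,
                  where=denominator != 0)

print(proba)  # the all-zero row has become [1/3, 1/3, 1/3]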

@jeremiedbb
Member

The reason why np.divide(proba, denominator, where=denominator != 0.0) did not work is because of the next line in the code:

# XXX : for some reason all probas can be 0
proba[np.isnan(proba)] = 1. / n_classes

Since you no longer divide by 0, you don't get NaN values there, so that replacement line does nothing and the probas end up unnormalized.
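
A small standalone illustration of that point (not taken from the PR): with where= but no out=, NumPy allocates the result with empty memory and simply skips the masked entries, so they hold arbitrary values rather than NaN, and the NaN-replacement line has nothing to fix.

import numpy as np

proba = np.zeros((1, 3))                             # all-zero probabilities
denominator = np.sum(proba, axis=1)[:, np.newaxis]   # == 0

# The skipped entries stay uninitialized (arbitrary memory), not NaN.
result = np.divide(proba, denominator, where=denominator != 0)

# Hence the fallback below almost never triggers and the row usually
# does not sum to 1.
result[np.isnan(result)] = 1.0 / 3
print(result.sum(axis=1))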

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
@ogrisel
Member

ogrisel commented Feb 10, 2021

Interesting, I did not know about the where parameter of np.divide. I agree with @jeremiedbb that we can collapse the code to remove any np.nan occurrence from the start.

If this is not already the case, we could probably add a specific test to "calibrate" such a badly behaving classifier that always predicts zeros in predict_proba (a sketch of such a classifier follows the list below) and check that:

  • we get uniform 1/n_classes probabilities as a result;
  • we do not get a warning, using sklearn.utils._testing.assert_no_warnings.
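
For reference, such a degenerate classifier could be sketched as follows (the class name is hypothetical, not from the PR):

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class ZeroProbaClassifier(ClassifierMixin, BaseEstimator):
    """Badly behaving classifier whose predict_proba is always all zeros."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict_proba(self, X):
        # Deliberately invalid: every row sums to 0 instead of 1.
        return np.zeros((len(X), len(self.classes_)))

    def predict(self, X):
        return np.full(len(X), self.classes_[0])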

@t-kusanagi
Contributor Author

The reason why np.divide(proba, denominator, where=denominator != 0.0) did not work is because of the next line in the code:

# XXX : for some reason all probas can be 0
proba[np.isnan(proba)] = 1. / n_classes

Since you no longer divide by 0, you don't get NaN values there, so that replacement line does nothing and the probas end up unnormalized.

Thanks, I had overlooked that this replacement step only handles the division-by-zero case.

@jeremiedbb
Member

we do not get a warning, using sklearn.utils._testing.assert_no_warnings.

@ogrisel I don't think we want to check that no warning is issued. We don't want to check that every line of code doesn't raise a warning. I think that the fact that the warning disappears with this PR is enough.

@ogrisel
Member

ogrisel commented Feb 10, 2021

I think that the fact that the warning disappears with this PR is enough.

Alright, but don't we want to explicitly check that we get uniform probabilities when the base classifier predicts only zeros?

@jeremiedbb
Member

jeremiedbb commented Feb 10, 2021

Alright, but don't we want to explicitly check that we get uniform probabilities when the base classifier predicts only zeros?

I didn't argue against that :)

@t-kusanagi
Contributor Author

I didn't argue against that :)

OK, now I'm going to add this test.

@t-kusanagi
Contributor Author

OK, now I'm going to add this test.

Since it is difficult to control the behavior of the CalibratedClassifierCV class and the test code would become large, I added test_calibration_zero_probability, which is just a test for _CalibratedClassifier.predict_proba.
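
Roughly, such a test can be sketched as below. The ZeroCalibrator helper is hypothetical, and the constructor arguments of the private _CalibratedClassifier (base_estimator, calibrators, classes) are an assumption that may not match every scikit-learn version; this is an illustration of the idea, not the exact test that was merged.

import numpy as np
from numpy.testing import assert_allclose
from sklearn.calibration import _CalibratedClassifier
from sklearn.datasets import make_blobs
from sklearn.dummy import DummyClassifier


class ZeroCalibrator:
    # Calibrator stub: the calibrated probability for every sample is 0,
    # so the per-class probabilities summed in predict_proba are all zero.
    def predict(self, X):
        return np.zeros(X.shape[0])


def test_calibration_zero_probability():
    X, y = make_blobs(n_samples=50, n_features=10, centers=10, random_state=7)
    clf = DummyClassifier().fit(X, y)

    cal_clf = _CalibratedClassifier(
        base_estimator=clf,
        calibrators=[ZeroCalibrator() for _ in clf.classes_],
        classes=clf.classes_,
    )

    proba = cal_clf.predict_proba(X)

    # Every probability should fall back to the uniform 1 / n_classes.
    assert_allclose(proba, 1.0 / clf.classes_.shape[0])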

Member

@ogrisel left a comment

Thanks for the new test. Here are some suggestions to improve it a bit. Otherwise LGTM.

t-kusanagi and others added 4 commits February 10, 2021 23:00
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Member

@ogrisel left a comment

LGTM once @jeremiedbb's remaining comments have been addressed.

t-kusanagi and others added 3 commits February 11, 2021 10:13
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
t-kusanagi and others added 3 commits February 11, 2021 10:15
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Member

@jeremiedbb left a comment

lgtm. Thanks @t-kusanagi

@jeremiedbb merged commit 6489daf into scikit-learn:main on Feb 11, 2021
@t-kusanagi deleted the fix_test_calibration branch on February 11, 2021 09:24
@ogrisel
Member

ogrisel commented Feb 11, 2021

Thank you very much for the fix and the nice new test @t-kusanagi.

@ogrisel
Member

ogrisel commented Feb 11, 2021

ping @lucyleeow to keep you in the loop on your favorite meta-estimator :)

@glemaitre mentioned this pull request on Apr 22, 2021