MAINT Clean up deprecations for 1.5: in log_loss #28851

Conversation
# binary case: check correct boundary values for eps = 0
with pytest.warns(FutureWarning):
    assert log_loss([0, 1], [0, 1], eps=0) == 0
with pytest.warns(FutureWarning):
    assert log_loss([0, 1], [0, 0], eps=0) == np.inf
with pytest.warns(FutureWarning):
    assert log_loss([0, 1], [1, 1], eps=0) == np.inf

# multiclass case: check correct boundary values for eps = 0
with pytest.warns(FutureWarning):
    assert log_loss([0, 1, 2], [[1, 0, 0], [0, 1, 0], [0, 0, 1]], eps=0) == 0
with pytest.warns(FutureWarning):
    assert (
        log_loss([0, 1, 2], [[0, 0.5, 0.5], [0, 1, 0], [0, 0, 1]], eps=0) == np.inf
    )
@lorentzenchr I'm a bit confused. Does removing `eps` (deprecated in #25299) mean that `eps` is now always 0, or that it is always computed based on the dtype? The previous "auto" seems to indicate the latter, but in that case testing the edge cases above is no longer possible.
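For reference, a minimal sketch of what a dtype-dependent `eps` would mean; the variable names here are illustrative, not the actual implementation:

import numpy as np

y_pred = np.asarray([[1.0, 0.0, 0.0]], dtype=np.float64)
# "auto" eps: machine epsilon of the prediction dtype
eps = np.finfo(y_pred.dtype).eps  # ~2.22e-16 for float64
# clip predicted probabilities away from the exact 0/1 boundary
y_pred_clipped = np.clip(y_pred, eps, 1 - eps)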
See #24515 (comment).
My opinion is that eps=0 is the correct behavior (who are we to judge and MODIFY uncalibrated predicted probabilities!). The consensus, however, leaned more toward a dtype-dependent eps.
At least, the clipping should not happen here in my opinion. If y_true = 0 and y_pred = 0, the result should be exactly 0, since xlogy(0, 0) = 0 (no warning).
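A quick check of the xlogy convention referenced above:

from scipy.special import xlogy

# xlogy(x, y) computes x * log(y) with the convention xlogy(0, y) = 0,
# so the 0 * log(0) term of the log loss is exactly 0 and no warning is raised
print(xlogy(0, 0))  # 0.0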
The question is whether we want to return inf when y_true != 0 and y_pred = 0, or a finite value. If the former, we should clip with eps=0; if the latter, we should clip the result of xlogy as suggested in #24515 (comment).
I would go with returning inf, but I don't know whether we rely on the score being finite (maybe in *SearchCV and co), and the deprecation warning said that eps will be non-zero in 1.5, so maybe we should rather keep it as is.
I have the same opinion as you.
Indeed, I find the following weird:
>>> log_loss([0, 1, 2], [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
2.2204460492503136e-16
Shall we conditionally clip to eps only where one_hot_encode(y_true) > 0, and clip to 0 otherwise?
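A hypothetical sketch of that proposal; the function name, the one-hot encoding step, and the aggregation are illustrative, not scikit-learn's implementation:

import numpy as np
from scipy.special import xlogy

def log_loss_conditional_clip(y_true_onehot, y_pred, eps):
    # Raise the lower bound to eps only in the positions that actually
    # enter the loss (where the one-hot encoded y_true is positive);
    # leave the other positions untouched, so that a perfect prediction
    # yields a log loss of exactly 0 instead of ~2.2e-16.
    y_pred = np.where(y_true_onehot > 0, np.clip(y_pred, eps, 1), y_pred)
    return float(-xlogy(y_true_onehot, y_pred).sum(axis=1).mean())

Under this sketch, the perfect-prediction example above would return exactly 0.0.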
Actually, whatever the decision on this, there should be a test to cover the case where log_loss reaches its minimum (perfect predictions), both in binary and multiclass settings.
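A minimal sketch of what such a test could look like (a guess, not the test that was actually added):

import numpy as np
from numpy.testing import assert_allclose
from sklearn.metrics import log_loss

def test_log_loss_perfect_predictions():
    # binary: 1d y_pred holds the probability of the positive class
    assert_allclose(log_loss([0, 1], [0.0, 1.0]), 0.0, atol=1e-15)
    # multiclass: one-hot predictions matching y_true exactly
    y_pred = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    assert_allclose(log_loss([0, 1, 2], y_pred), 0.0, atol=1e-15)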
I'd rather leave this discussion for a separate issue/PR to (try to) keep the focus of this PR on the deprecations clean-up.
I added a test for perfect predictions that only checks that the result is close to 0 for now.
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
LGTM as well. Thanks.
The deprecated `eps` param of `log_loss` was removed. `log_loss` now raises an error when predicted probas do not sum to 1.
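A hypothetical illustration of the new validation; the exception type and message are assumptions (ValueError is typical for scikit-learn input validation), as neither is quoted in this thread:

from sklearn.metrics import log_loss

# each row of y_pred should sum to 1; the rows below sum to 1.2 and 0.8,
# so this is expected to raise instead of being silently renormalized
try:
    log_loss([0, 1], [[0.9, 0.3], [0.1, 0.7]])
except ValueError as exc:
    print(exc)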