Make scorers return python floats #30575
Conversation
I've used the existing common tests for metrics. While doing that I found that they're not very intuitive and lack a check to make sure we're not forgetting any metric (spoiler: we are). I might take a shot at modernizing them, like we did for the common estimator tests, in a later PR. Since there are already many places where we convert to float, and that was done without a changelog entry, I think it's fine to not add one for the remaining cases. It's just a difference in the repr, after all.
LGTM. Thanks @jeremiedbb
@@ -2732,7 +2732,7 @@ Here is a small example of usage of the :func:`max_error` function::

      >>> y_true = [3, 2, 7, 1]
      >>> y_pred = [9, 2, 7, 1]
      >>> max_error(y_true, y_pred)
-     np.int64(6)
+     6.0
This is a change from a numpy scalar with an int dtype to a Python float. This may be a bit more surprising than going from a numpy scalar with a float dtype to a Python float.
I spent 5 minutes trying to find a way it could have unintended side-effects but I could not find anything. Maybe somebody else wants to think about it for 5 minutes as well?
For example:

    from sklearn.metrics import max_error
    1 / max_error([1, 2], [3, 5])

returns np.float64(0.3333333333333333) on main, and 0.3333 with this PR.
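The difference is only in the repr; the numeric value is unchanged. A minimal sketch of the comparison (not using scikit-learn itself, just assuming NumPy is installed):

```python
import numpy as np

x = np.float64(1) / 3  # a NumPy scalar, as on main
y = 1 / 3.0            # a plain Python float, as with this PR

# On NumPy >= 2 the scalar repr is wrapped, e.g. np.float64(0.3333333333333333);
# on NumPy < 2 it prints unwrapped, just like the Python float.
print(repr(x))
print(repr(y))  # 0.3333333333333333

# The two compare equal; only the repr differs.
print(x == y)  # True
```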
I seem to remember there were some differences in numpy scalar handling in NumPy 2.0; it may be worth taking a look at what happens with numpy < 2.
...
Maybe a reason to not change this (or to change to using int/float as appropriate?) is that if you feed integers to max_error you could be expecting to get back integers. A bit like 3 - 1 == 2. But maybe we are being too detail-oriented?
I think for very large integers you have to be careful when converting to floats as some of them can't be represented as float.
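That caveat can be seen with plain Python, independent of scikit-learn: integers above 2**53 are not all exactly representable as 64-bit floats.

```python
big = 2**53 + 1  # smallest positive integer not representable as a double

# Converting to float rounds to the nearest representable value,
# so the +1 is silently lost.
print(float(big))             # 9007199254740992.0
print(float(big) == big)      # False: the round-trip is not exact
print(float(2**53) == 2**53)  # True: 2**53 itself is representable
```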
This one is arguably a fix for an inconsistency with the rest of the code base. max_error doesn't always return an int; it happens that if both y_pred and y_true have an int dtype, it returns an int.
We don't enforce this behavior for any other scorer whose usual return type is float but could be an int in some specific setting. For instance, accuracy_score(normalize=False) counts the number of correctly classified samples, yet still returns a float.
Also, the docstring of max_error already states that the return type is float :)
Well then, no need for an exception :)
Co-authored-by: Tim Head <betatim@gmail.com>
2 approvals already, let's merge this!
closes #27339