Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Make scorers return python floats #30575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make scorers return python floats #30575
Changes from all commits
8b45e53
3e0a15b
7181297
e86436e
4946b11
c595588
9b0c165
d7f57b6
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change from a numpy scalar with an int dtype to a Python float. This may be a bit more suprising than from a numpy scalar with a float dtype to a Python float.
I spent 5 minutes trying to find a way it could have unintended side-effects but I could not find anything. Maybe somebody else wants to think about it during 5 minutes as well?
For example on
main
returns
np.float64(0.3333333333333333)
onmain
and0.3333
with this PR.I seem to remember there were some differences in numpy scalar handling in numpy 2.0 maybe worth a look about what happens with
numpy < 2
...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a reason to not change this (or change to using
int
/float
as appropriate?) is that if you feed integers tomax_error
you could be expecting to get back integers. A bit like3 - 1 == 2
. But maybe we are being too detailed orientated?I think for very large integers you have to be careful when converting to floats as some of them can't be represented as float.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is arguably a fix for an inconsistency with the rest of the code base.
max_error
doesn't always return an int. It happens that if both y_pred and y_true have an int dtype, it returns an int.We don't enforce this behavior for any other scorer when the usual return type is float but could be an int in some specific setting. For instance,
accuracy_score(normalize=False)
counts the number of correctly classified samples, yet still returns a float.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the docstring of
max_error
already states that the return type is float :)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well then, no need for an exception :)