
FIX Raise on empty inputs in accuracy_score #31187


Draft · wants to merge 8 commits into base: main
Conversation

@StefanieSenger (Contributor) commented Apr 12, 2025

Reference Issues/PRs

towards #29048

What does this implement/fix? Explain your changes.

This PR adds a `replace_undefined_by` parameter to `accuracy_score` to handle empty `y_true` and `y_pred`. It also adds tests.
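
For illustration, this is how the proposed parameter could be used (a sketch based on this PR's proposed signature; `replace_undefined_by` is not part of any released scikit-learn API):

```python
import numpy as np

from sklearn.metrics import accuracy_score

# Proposed usage per this PR (not a released API): the caller chooses what
# an ill-defined accuracy on empty inputs should evaluate to.
accuracy_score(np.array([]), np.array([]), replace_undefined_by=np.nan)  # -> nan
accuracy_score(np.array([]), np.array([]), replace_undefined_by=0.0)     # -> 0.0
```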

Open Question

Note that before this PR, `accuracy_score` returned inconsistent results for empty inputs:

>>> accuracy_score(np.array([]), np.array([]))
nan
>>> accuracy_score(np.array([]), np.array([]), normalize=False)
0.0

I would like to treat this inconsistency as a bug and fix it in this PR for the next release without a deprecation cycle, so the fix lands sooner. Would that be okay? What do you think, @adrinjalali?

github-actions bot commented Apr 12, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: a530be7.

@StefanieSenger StefanieSenger added this to the 1.7 milestone Apr 12, 2025
@StefanieSenger StefanieSenger marked this pull request as draft April 12, 2025 17:03
@StefanieSenger StefanieSenger marked this pull request as ready for review April 12, 2025 17:18
@adrinjalali (Member) left a comment

nits, otherwise LGTM.

thus ill-defined. Can take the following values:
- `np.nan` to return `np.nan`
- a floating point value in the range of [0.0, 1.0] or int 0
@adrinjalali (Member)

Suggested change:
- a floating point value in the range of [0.0, 1.0] or int 0
+ a floating point value in the range of $[0.0, 1.0]$ or `int` 0

I think that makes it math-like.

@StefanieSenger (Contributor, Author)

Guillaume once told me we don't use LaTeX in docstrings, so I will use backticks instead.

"defaults to 0 when `normalize=False` is set."
)
warnings.warn(msg, UndefinedMetricWarning, stacklevel=2)
return replace_undefined_by if math.isnan(replace_undefined_by) else 0
@adrinjalali (Member)

Suggested change:
- return replace_undefined_by if math.isnan(replace_undefined_by) else 0
+ return replace_undefined_by if np.isnan(replace_undefined_by) else 0

since we're using np.nan?

@StefanieSenger (Contributor, Author)

Both are possible, but since we're replacing np.isnan with math.isnan for array API support, I think we can already do it here.
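
For context, a quick illustration of why `math.isnan` works here (standard library plus NumPy only):

```python
import math

import numpy as np

# math.isnan accepts any scalar convertible to float, including NumPy
# scalars, so the check is not tied to NumPy itself -- which matters
# when moving toward array API support.
assert math.isnan(float("nan"))
assert math.isnan(np.float64("nan"))
assert not math.isnan(0.0)
```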

@@ -3081,7 +3114,7 @@ def hamming_loss(y_true, y_pred, *, sample_weight=None):
Returns
-------
- loss : float or int
+ loss : float
@adrinjalali (Member)

The return types are changed here. I agree they should always be float for all of these metrics, but it'd be nice to have a test for all the cases to make sure the result is actually a float.

@StefanieSenger (Contributor, Author) commented Apr 29, 2025

I will do this in a separate PR.

These tests had already been added in #30575.
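
For illustration, such a return-type test could look roughly like this (a minimal sketch; the choice of metrics is illustrative, and the actual tests are the ones added in #30575):

```python
import numpy as np
import pytest

from sklearn.metrics import accuracy_score, hamming_loss, zero_one_loss


# Sketch: each metric should return a float (note np.float64 subclasses
# Python float), never a bare int.
@pytest.mark.parametrize("metric", [accuracy_score, hamming_loss, zero_one_loss])
def test_metric_returns_float(metric):
    y_true = np.array([0, 1, 1, 0])
    y_pred = np.array([0, 1, 0, 0])
    assert isinstance(metric(y_true, y_pred), float)
```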

@StefanieSenger (Contributor, Author) left a comment

Thank you for your review, @adrinjalali.

I have addressed all your comments. For testing if all the classification metrics indeed return floats, I will open a separate PR.


@@ -360,6 +376,18 @@ def accuracy_score(y_true, y_pred, *, normalize=True, sample_weight=None):
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
check_consistent_length(y_true, y_pred, sample_weight)

if _num_samples(y_true) == 0:
Member comment:

As discussed during the dedicated meeting, we believe that the fact that accuracy_score does not raise ValueError when called on empty arrays was an oversight:

  • scikit-learn estimators always raise ValueError when fit on empty data,
  • more importantly, they always raise ValueError when predicting or transforming empty data, so it's not possible to get a working pipeline that would output an empty y_pred array;
  • many other metric functions such as roc_auc, f1_score and mean_squared_error already raise ValueError.

I think we can treat the accuracy_score case as a bug and make it raise ValueError instead (without a deprecation cycle).
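
A sketch of that direction (the helper name is hypothetical; `_num_samples` is the private helper already used in the diff above and is subject to change):

```python
from sklearn.utils.validation import _num_samples  # private helper


def _check_not_empty(y_true):
    # Hypothetical helper: fail fast on empty inputs instead of returning
    # an ill-defined score.
    if _num_samples(y_true) == 0:
        raise ValueError(
            "accuracy_score is undefined for empty inputs; y_true and "
            "y_pred must contain at least one sample."
        )
```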

Member comment:

All other metric functions call check_array or a similar function that rejects empty arrays.

This is apparently not the case for _check_targets used in accuracy_score. We should review the list of calls to _check_targets to check whether we should change this function to:

  • reject empty arrays;
  • maybe even call check_array directly, but then we would need to review the existing input validation to avoid the overhead of costly duplicate checks. (See the sketch of check_array's existing behavior below.)
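
To illustrate the behavior check_array already provides (its `ensure_min_samples` parameter defaults to 1):

```python
import numpy as np

from sklearn.utils import check_array

# check_array rejects empty inputs by default, which is why metrics that
# route their validation through it already raise on empty arrays.
try:
    check_array(np.empty((0, 1)))
except ValueError as exc:
    print(exc)  # Found array with 0 sample(s) ... while a minimum of 1 is required.
```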

@ogrisel ogrisel removed this from the 1.7 milestone May 7, 2025
@ogrisel (Member) commented May 7, 2025

I'm removing the 1.7 milestone for this one because there is no urgency to get the fix for the bug described above into 1.7.

@StefanieSenger (Contributor, Author) commented May 7, 2025

Just to be sure: raising a ValueError on empty inputs also means that accuracy_score would not get a replace_undefined_by parameter, because empty inputs are the only case here that could cause a division by zero.
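
For clarity, the underlying arithmetic (plain NumPy, no scikit-learn internals assumed):

```python
import numpy as np

# With normalize=True, accuracy is n_correct / n_samples, so the score is
# ill-defined exactly when n_samples == 0; no other division in the
# computation can hit a zero denominator.
y_true = np.array([0, 1, 1])
y_pred = np.array([0, 1, 0])
accuracy = np.sum(y_true == y_pred) / len(y_true)  # 2/3, well-defined for n > 0
```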

@StefanieSenger StefanieSenger changed the title ENH Add replace_undefined_by to accuracy_score FIX Raise on empty inputs in accuracy_score May 7, 2025
@StefanieSenger StefanieSenger marked this pull request as draft May 7, 2025 10:04