ENH Add zero_division parameter for accuracy_score #29213
Conversation
Introduce the zero_division parameter to the accuracy_score function when y_true and y_pred are empty.
Revise the method for calculating the number of elements in y_pred and y_true.
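The semantics being proposed can be sketched as a minimal stand-alone function. This is an illustration of the intended behavior only, not scikit-learn's actual implementation; the function name is made up for the example:

```python
import warnings

import numpy as np


def accuracy_sketch(y_true, y_pred, zero_division="warn"):
    """Illustrative sketch of the proposed `zero_division` semantics."""
    if len(y_true) == 0 and len(y_pred) == 0:
        # Accuracy is ill-defined for empty inputs:
        # return the configured fallback value instead of dividing by zero.
        if zero_division == "warn":
            warnings.warn(
                "accuracy is ill-defined for empty inputs; returning 0.0"
            )
            return 0.0
        return float(zero_division)  # 0, 1, or np.nan
    # Regular case: fraction of matching labels.
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

With empty inputs, `zero_division=1` yields `1.0`, `zero_division=np.nan` yields NaN, and the default `"warn"` yields `0.0` plus a warning.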
Hi @Jaimin020, regarding the CodeCov failures: you need to add …
I'll leave you some thoughts that might be helpful for furthering this PR, @Jaimin020.
The type of this PR would not be Fix, but ENH, and you need to add an entry into doc/whats_new/v1.6.rst as well.
sklearn/metrics/_classification.py
Outdated
@@ -154,10 +154,16 @@ def _check_targets(y_true, y_pred):
        "y_pred": ["array-like", "sparse matrix"],
        "normalize": ["boolean"],
        "sample_weight": ["array-like", None],
        "zero_division": [
            Options(Real, {0, 1}),
I think you have to include np.nan here.
Done.
Nice, though the input we would expect from users would not be the str "nan", but the float value np.nan, which you can include like this:

Suggested change:
- Options(Real, {0, 1}),
+ Options(Real, {0, 1, np.nan}),

Then you need to delete the "nan" in line 159 again.
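NaN needs this special care in validation because `np.nan != np.nan`: Python's `in` test on a set checks identity before equality, so the `np.nan` singleton passes `value in {0, 1, np.nan}`, while a different NaN object (e.g. `float("nan")`) would not. A hedged sketch of NaN-aware validation — this is not scikit-learn's actual `Options` implementation, just an illustration of the pitfall:

```python
import math


def is_valid_zero_division(value):
    """Sketch: validate `zero_division` against {"warn", 0, 1, nan}."""
    if isinstance(value, str):
        return value == "warn"
    if isinstance(value, float) and math.isnan(value):
        # Accept any NaN object, not just the np.nan singleton,
        # since NaN never compares equal to itself.
        return True
    return value in (0, 1)  # covers 0.0 and 1.0 via numeric equality
```

This is why the parameter constraint needs `np.nan` listed explicitly (or dedicated NaN handling) rather than relying on plain equality.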
sklearn/metrics/_classification.py
Outdated
        Sets the value to return when there is a zero division.

        Notes:
        - If set to "warn", this acts like 0, but a warning is also raised.
To my ears "acts like 0" is not very clear. A suggestion:

Suggested change:
- If set to "warn", this acts like 0, but a warning is also raised.
+ If set to "warn", this behaves like a 0.0 input, but a warning is also raised.
Done
sklearn/metrics/_classification.py
Outdated
    if len_y_true == 0 and len_y_pred == 0:
        score = _check_zero_division(zero_division)
        if zero_division == "warn":
            _warn_prf(None, "Predcited", "Accuracy is", 0)
Suggested change:
- _warn_prf(None, "Predcited", "Accuracy is", 0)
+ _warn_prf(None, "predicted", "`accuracy_score` is", 0)

I think we should be precise and tell our users exactly where the warning originates from.
Done
Added test cases for the 'zero_division' parameter in the accuracy_score function. Additionally, minor bugs were fixed as noted by the reviewer, and 'doc/whats_new/v1.6.rst' has been updated.
Hi @StefanieSenger, I have made all the changes you mentioned.
The modifications are complete and all tests pass.
Nice, thank you, @Jaimin020.
I've made a little comment about the param validation of the np.nan value. Apart from that, it looks good to me. I'm not a maintainer though. @glemaitre, maybe you want to have a look?
Hi @StefanieSenger and @glemaitre, I've implemented all the changes suggested by @StefanieSenger. @glemaitre, please review this PR and let me know if any further modifications are needed.
sklearn/metrics/_classification.py
Outdated
@@ -154,10 +154,16 @@ def _check_targets(y_true, y_pred):
        "y_pred": ["array-like", "sparse matrix"],
        "normalize": ["boolean"],
        "sample_weight": ["array-like", None],
        "zero_division": [
            Options(Real, {0, 1, np.nan}),
Suggested change:
- Options(Real, {0, 1, np.nan}),
+ Options(Real, {0.0, 1.0, np.nan}),
Done
doc/whats_new/v1.6.rst
Outdated
:mod:`sklearn.metrics`
..............................
Suggested change:
- :mod:`sklearn.metrics`
- ..............................
+ :mod:`sklearn.metrics`
+ ......................
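For context on this suggestion: reStructuredText only requires a section underline to be at least as long as the heading text, so the longer row of dots was not a syntax error; trimming it to exactly match the title is a style cleanup, consistent with the surrounding changelog entries. Both of these are valid reST:

```rst
:mod:`sklearn.metrics`
..............................

:mod:`sklearn.metrics`
......................
```

An underline shorter than the title, by contrast, makes Sphinx emit a "Title underline too short" warning.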
Done
sklearn/metrics/_classification.py
Outdated
    # Check y_true and y_pred is empty
    len_y_true = _num_samples(y_true)
    len_y_pred = _num_samples(y_pred)

    if len_y_true == 0 and len_y_pred == 0:
        score = _check_zero_division(zero_division)
        if zero_division == "warn":
            _warn_prf(None, "predicted", "accuracy_score is", 0)
        return score

    # Compute accuracy for each possible representation
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    check_consistent_length(y_true, y_pred, sample_weight)
We should first do the checking of the vectors to raise a proper error. Then we only need to check a single vector, because we have already checked the vectors for consistency. I would also avoid using _warn_prf, because that name is really tied to precision-recall-fscore. So let's just warn directly.

Suggested change:
- # Check y_true and y_pred is empty
- len_y_true = _num_samples(y_true)
- len_y_pred = _num_samples(y_pred)
-
- if len_y_true == 0 and len_y_pred == 0:
-     score = _check_zero_division(zero_division)
-     if zero_division == "warn":
-         _warn_prf(None, "predicted", "accuracy_score is", 0)
-     return score
-
- # Compute accuracy for each possible representation
- y_type, y_true, y_pred = _check_targets(y_true, y_pred)
- check_consistent_length(y_true, y_pred, sample_weight)
+ # Compute accuracy for each possible representation
+ y_type, y_true, y_pred = _check_targets(y_true, y_pred)
+ check_consistent_length(y_true, y_pred, sample_weight)
+
+ if len(y_true) == 0:  # empty vectors
+     if zero_division == "warn":
+         msg = (
+             "accuracy() is ill-defined and set to 0.0. Use the `zero_division` "
+             "param to control this behavior."
+         )
+         warnings.warn(msg, UndefinedMetricWarning)
+     return _check_zero_division(zero_division)
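The reordering the reviewer asks for can be sketched end-to-end as a stand-alone function. Assumptions in this sketch: the `UndefinedMetricWarning` class below is a stand-in for `sklearn.exceptions.UndefinedMetricWarning`, and the explicit length comparison stands in for `check_consistent_length`; the function name is made up:

```python
import warnings

import numpy as np


class UndefinedMetricWarning(UserWarning):
    """Stand-in for sklearn.exceptions.UndefinedMetricWarning."""


def accuracy_checked(y_true, y_pred, zero_division="warn"):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    # 1) Validate first, so inconsistent inputs raise a proper error
    #    even when one of them is empty.
    if y_true.shape[0] != y_pred.shape[0]:
        raise ValueError(
            "Found input variables with inconsistent numbers of samples: "
            f"{y_true.shape[0]} != {y_pred.shape[0]}"
        )
    # 2) After the consistency check, testing a single vector suffices.
    if y_true.shape[0] == 0:
        if zero_division == "warn":
            warnings.warn(
                "accuracy() is ill-defined and set to 0.0. Use the "
                "`zero_division` param to control this behavior.",
                UndefinedMetricWarning,
            )
            return 0.0
        return float(zero_division)
    return float(np.mean(y_true == y_pred))
```

The key design point is the ordering: validating consistency first means a call like `accuracy_checked([1], [])` raises a clear `ValueError` rather than silently hitting the zero-division branch.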
Done
sklearn/metrics/_classification.py
Outdated
        Sets the value to return when there is a zero division.

        Notes:
        - If set to "warn", this behaves like a 0.0 input, but a warning is also
          raised.
Suggested change:
- Sets the value to return when there is a zero division.
- Notes:
- - If set to "warn", this behaves like a 0.0 input, but a warning is also
-   raised.
+ Sets the value to return when there is a zero division, e.g. when `y_true`
+ and `y_pred` are empty. If set to "warn", returns 0.0, but a warning is
+ also raised.
+
+ .. versionadded:: 1.6
Done
@@ -215,6 +215,29 @@ def test_classification_report_zero_division_warning(zero_division):
    assert not record


def test_accuracy_score_zero_division_warning():
Can we use the test_zero_division_nan_no_warning and test_zero_division_nan_warning tests that already exist?
I have revised the test case according to your suggestions. Please review it.
All modifications done.
Hi @glemaitre, I have implemented all the changes you requested. Please review.
LGTM @Jaimin020.
I quickly pushed a commit to merge main into the branch and just modified the message related to skipping the test, which is rather a nitpick.
We will need a second approval.
Thanks @glemaitre for the quick update and merge with the main branch.
Introduce the zero_division parameter to the accuracy_score function when y_true and y_pred are empty.
Reference Issues/PRs
Make zero_division parameter consistent in the different metric #29048 (Task-1)
What does this implement/fix? Explain your changes.
I check the lengths of `y_true` and `y_pred`. If both lengths are 0, the output is generated based on the value of the `zero_division` parameter.
Any other comments?