FIX compute precision-recall at 100% recall #23214

Merged

Conversation

stephanecollot
Contributor

Reference Issues/PRs

Fixes #23213

What does this implement/fix? Explain your changes.

Remove the unnecessary dropping of curve points at high recall.

Any other comments?

Full disclosure: this PR modifies precision_recall_curve(), which is only used by _binary_uninterpolated_average_precision(), which in turn is only used by average_precision_score():

precision, recall, _ = precision_recall_curve(

I think average_precision_score() should not be impacted by this change, and it is tested 54 times in the unit tests.
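
(A quick sanity check of that claim, added here for illustration and not part of the PR: the points this change keeps all sit at recall = 1, so their recall steps are zero and the uninterpolated average-precision sum is unchanged. The data below come from the precision_recall_curve docstring example; the assertion mirrors the uninterpolated average-precision definition.)

    import numpy as np
    from sklearn.metrics import average_precision_score, precision_recall_curve

    y_true = np.array([0, 0, 1, 1])
    y_scores = np.array([0.1, 0.4, 0.35, 0.8])

    precision, recall, _ = precision_recall_curve(y_true, y_scores)

    # Uninterpolated average precision: precision weighted by the recall steps.
    # Extra points at recall == 1 have a zero recall step and change nothing.
    ap_from_curve = -np.sum(np.diff(recall) * precision[:-1])

    assert np.isclose(ap_from_curve, average_precision_score(y_true, y_scores))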

@glemaitre
Member

You will need to fix the failing tests; in most cases these should be docstring tests.
We also need an additional non-regression test: I think we should build a PR curve and check that the last point is equivalent to a decision rule that always predicts the positive class.
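
A minimal sketch of such a test, for illustration only (the data reuse the docstring example; the actual test added in this PR may differ). Since the returned recall is in decreasing order, the point equivalent to always predicting the positive class is the first element of the arrays: recall of 1 and precision equal to the prevalence of the positive class.

    import numpy as np
    from sklearn.metrics import precision_recall_curve

    y_true = np.array([0, 0, 1, 1])
    y_scores = np.array([0.1, 0.4, 0.35, 0.8])

    precision, recall, _ = precision_recall_curve(y_true, y_scores)

    # A decision rule that always predicts the positive class reaches recall 1
    # with precision equal to the fraction of positive samples (here 0.5).
    assert recall[0] == 1.0
    assert precision[0] == y_true.mean()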

@stephanecollot
Contributor Author

@glemaitre I added 3 commits:

  • fixing the existing unit tests
  • fixing the existing doctests
  • adding a non-regression test (I checked that it fails if I revert the change)

@glemaitre
Member

Please add an entry to the change log at doc/whats_new/v1.1.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.
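
For illustration only (the exact wording, section, and position in doc/whats_new/v1.1.rst are up to the contributor, and the name and handle below are placeholders), an entry typically looks like:

    - |Fix| :func:`metrics.precision_recall_curve` no longer drops the points of
      the curve at 100% recall. :pr:`23214` by :user:`Author Name <github-handle>`.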

@glemaitre glemaitre changed the title Fixes precision recall at 100% recall FIX compute precision-recall at 100% recall Apr 26, 2022
@@ -855,11 +855,11 @@ def precision_recall_curve(y_true, probas_pred, *, pos_label=None, sample_weight
     >>> precision, recall, thresholds = precision_recall_curve(
     ...     y_true, y_scores)
     >>> precision
-    array([0.66666667, 0.5       , 1.        , 1.        ])
+    array([0.5       , 0.66666667, 0.5       , 1.        , 1.        ])
Member

We should update the docstring of thresholds to properly define n_thresholds.
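
(One possible definition, consistent with the docstring example in the diff above; the wording actually merged may differ.)

    thresholds : ndarray of shape (n_thresholds,)
        Increasing thresholds on the decision function used to compute precision
        and recall, where ``n_thresholds = len(np.unique(probas_pred))``.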

@glemaitre glemaitre left a comment

Apart from these 2 changes, LGTM

@glemaitre glemaitre left a comment

Thanks for the changes. LGTM

stephanecollot and others added 3 commits April 29, 2022 15:24
@stephanecollot
Contributor Author

stephanecollot commented Apr 29, 2022

It seems the documentation build is failing, but I cannot understand why:

Sphinx Warnings in affected files
doc/whats_new/v1.1.rst:105: WARNING: Unexpected indentation.

There is nothing wrong at line 105.

@glemaitre
Member

This is certainly not related to your change. We can ignore it.

@glemaitre
Member

I merged main into your branch. The problem was solved in main by the following PR: #23246

@jeremiedbb jeremiedbb left a comment

LGTM. Thanks @stephanecollot

@jeremiedbb jeremiedbb merged commit 32c53bc into scikit-learn:main May 2, 2022
@stephanecollot
Contributor Author

It went very smoothly; I'm happy I made this first contribution, thank you @glemaitre.
I will most probably continue to contribute by adding uncertainty bands on precision-recall curves soon.

@glemaitre
Member

@stephanecollot I started a POC on the subject: #21211

The idea is to use cross-validation to get uncertainty bounds; a sketch of that idea follows the list below. I will probably find time to carry on this work in the next release cycle. However, we will need to break the PR down into smaller PRs to facilitate the review process:

  1. Add the return_indices parameter to cross_validate
  2. Then add a new from_cv_results class method for each display
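
For illustration of the underlying idea only, here is a minimal sketch that computes one precision-recall curve per cross-validation fold and summarizes the spread with a percentile band. It does not use the return_indices / from_cv_results API described above; the estimator, dataset, recall grid, and percentiles are arbitrary placeholders.

    import numpy as np
    from sklearn.base import clone
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_recall_curve
    from sklearn.model_selection import StratifiedKFold

    X, y = make_classification(n_samples=500, random_state=0)
    estimator = LogisticRegression(max_iter=1000)

    # One precision-recall curve per CV fold, evaluated on the held-out split.
    fold_curves = []
    for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X, y):
        fitted = clone(estimator).fit(X[train_idx], y[train_idx])
        scores = fitted.predict_proba(X[test_idx])[:, 1]
        precision, recall, _ = precision_recall_curve(y[test_idx], scores)
        fold_curves.append((precision, recall))

    # Interpolate each fold onto a common recall grid (recall is returned in
    # decreasing order, hence the reversal) and take a 5th-95th percentile band.
    recall_grid = np.linspace(0.0, 1.0, 101)
    interpolated = np.array(
        [np.interp(recall_grid, r[::-1], p[::-1]) for p, r in fold_curves]
    )
    lower, upper = np.percentile(interpolated, [5, 95], axis=0)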

glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request May 19, 2022
glemaitre added a commit that referenced this pull request May 19, 2022

Successfully merging this pull request may close these issues.

precision_recall_curve() is not returning the full curve at high recall