ENH PrecisionRecallDisplay add option to plot chance level #26019

Merged
merged 31 commits into scikit-learn:main from Charlie-XIAO:pr-vis-enh on May 22, 2023

Conversation

@Charlie-XIAO Charlie-XIAO (Contributor) commented Mar 29, 2023

Reference Issues/PRs

Towards #25929. Relevant PRs: #25972, #25987.

What does this implement/fix? Explain your changes.

This PR implements the following:

  • Add the chance_level_ attribute to the PrecisionRecallDisplay class.
  • Add an option to plot the chance level line, and support passing a dict to alter its style (a usage sketch follows below).
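
A rough usage sketch of the two additions. The keyword for the style dict is assumed here to be chance_level_kw, mirroring RocCurveDisplay in #25987, so the exact name may differ from the final API:

  # Sketch only: chance_level_kw is assumed to mirror RocCurveDisplay (#25987).
  from sklearn.datasets import make_classification
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import PrecisionRecallDisplay
  from sklearn.model_selection import train_test_split

  X, y = make_classification(weights=[0.7, 0.3], random_state=0)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
  clf = LogisticRegression().fit(X_train, y_train)

  # Plot the PR curve together with the chance level line, restyled via a dict.
  disp = PrecisionRecallDisplay.from_estimator(
      clf, X_test, y_test,
      plot_chance_level=True,
      chance_level_kw={"color": "tab:red", "linestyle": ":"},
  )
  print(disp.chance_level_)  # prevalence of the positive label in y_test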

Any other comments?

Unlike RocCurveDisplay in #25987, the chance level here depends on y: if plot_chance_level=True, plot requires the prevalence of the positive label. This is not a problem when using from_estimator or from_predictions, since the prevalence can be computed from the y that these methods already require. Please let me know if I should approach this differently.
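
For clarity, a minimal sketch of what that prevalence computation amounts to (the actual helper used in the PR may look different):

  import numpy as np

  # The chance level of a precision-recall curve is the prevalence of the
  # positive label, i.e. the precision of a classifier that predicts the
  # positive class for every sample.
  y = np.array([0, 1, 1, 0, 1, 0, 0, 0])
  pos_label = 1
  chance_level = np.mean(y == pos_label)  # 3 positives out of 8 -> 0.375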

By the way, I git grepped all files under examples/ that use PrecisionRecallDisplay, and none of them plot the chance level line, so I made no modifications there.

@glemaitre glemaitre self-requested a review April 4, 2023 09:26
@ArturoAmorQ ArturoAmorQ (Member) left a comment

Thanks for the PR @Charlie-XIAO! This will certainly be a nice addition :) Here are just a couple of comments.

You will also have to resolve the current conflicts with main.

@Charlie-XIAO Charlie-XIAO (Contributor, Author) commented Apr 11, 2023

Hi @ArturoAmorQ, thanks for your review! I have made your suggested changes and resolved conflicts with the main branch. The test cases have passed, but I haven't tested for the error messages I added yet. Please let me know if I should make any further changes and if I need to modify or test for those error messages.

EDIT: Clearly I would need to add test cases for the error messages since Codecov did not pass. Will do soon.

@glemaitre glemaitre (Member) left a comment

Here are a couple comments.

@Charlie-XIAO Charlie-XIAO (Contributor, Author)

Hi @glemaitre, thanks for your review! I've made your suggested changes. Please let me know if there are any further modifications you want me to make.

@glemaitre glemaitre self-requested a review April 14, 2023 09:57
@glemaitre glemaitre (Member) left a comment

It would be cool if we could add plot_chance_level=True inside one of the examples where it makes sense.

@Charlie-XIAO Charlie-XIAO (Contributor, Author)

Thanks for the review @glemaitre @betatim! I have made the suggested changes, except that I have not added the new test yet. I will do that, as well as add this new feature to some examples, ASAP.

@Charlie-XIAO Charlie-XIAO (Contributor, Author)

I have added this new feature to some of the examples @glemaitre. For instance, in the updated plots the chance level line is always at the bottom, as you can see. I think we may need to fix both axes to [0, 1], as suggested in issue #25929. If you agree, I will open another PR for these visual changes (for both PR and ROC curves), similar to what I've done in PR #25972, and then apply them to these examples.
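
For reference, the visual fix being discussed amounts to pinning the axis limits, roughly like this sketch (the actual change would live inside the display classes in the follow-up PR):

  import matplotlib.pyplot as plt

  fig, ax = plt.subplots()
  # ... draw the PR curve and chance level line on `ax` first, e.g. via
  # PrecisionRecallDisplay.from_estimator(clf, X_test, y_test,
  #                                       plot_chance_level=True, ax=ax)

  # Pin both axes to [0, 1] so the chance level line is not squashed against
  # the bottom of an auto-scaled plot.
  ax.set_xlim(0, 1)
  ax.set_ylim(0, 1)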

@glemaitre glemaitre (Member)

> I think we may need to fix both axes to [0, 1]

Yes we will need the other improvement but in another PR.

@Charlie-XIAO Charlie-XIAO (Contributor, Author) commented Apr 14, 2023

> Yes we will need the other improvement but in another PR.

Okay, will do after this PR is accepted (since I want to avoid too many merge conflicts).

@glemaitre glemaitre self-requested a review May 4, 2023 09:06
@glemaitre glemaitre (Member) left a comment

LGTM up to the 2 minor changes.

@glemaitre glemaitre added this to the 1.3 milestone May 4, 2023
@glemaitre glemaitre added the Waiting for Second Reviewer label (First reviewer is done, need a second one!) May 4, 2023
@Charlie-XIAO Charlie-XIAO (Contributor, Author)

Hi @glemaitre, thanks for your review! One thing: Counter does not have a total() method here, and as suggested in the Counter docs, the common pattern for obtaining the total count is sum(c.values()), which is what is currently implemented. Please let me know if this is okay.
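
For context, a quick illustration of the two ways of getting the total count (Counter.total() only exists on Python 3.10+):

  from collections import Counter

  y = [0, 1, 1, 0, 1]
  counts = Counter(y)  # Counter({1: 3, 0: 2})

  # Portable across all supported Python versions:
  n_samples = sum(counts.values())  # 5

  # Equivalent, but only available from Python 3.10 onwards:
  # n_samples = counts.total()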

@glemaitre glemaitre (Member)

True, total() is Python 3.10 only. So let's revert then.

@glemaitre glemaitre (Member)

Ping @betatim: would you mind having another look at this PR? It is good to be merged on my side.

@Charlie-XIAO Charlie-XIAO requested a review from betatim May 16, 2023 02:37
@betatim betatim (Member) left a comment

I left one comment about making it an error to request the chance level line without providing the data needed.

Otherwise this looks good to me. Thanks for the work!

@betatim betatim merged commit 6238968 into scikit-learn:main May 22, 2023
@betatim betatim (Member) commented May 22, 2023

Thanks for the additional exception. Merged.

@glemaitre glemaitre (Member)

Nice. Thanks @Charlie-XIAO

@Charlie-XIAO Charlie-XIAO deleted the pr-vis-enh branch September 23, 2023 13:28
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
@e-pet e-pet commented Nov 26, 2024

I am confused: what exactly does this line represent? The right-most point is the always-positive classifier, as I understand it, which is indeed the baseline to beat in PR analysis, but it is not in any way random (so 'chance level' seems misleading in this context). And how/why is the plotted horizontal extension of this point meaningful? Kull and Flach argue (see link above) that the comparison line to plot (analogous to the diagonal in the ROC curve) would be the F1 isometric curve associated with this classifier, which is hyperbolic and can also easily be computed:

  import numpy as np
  import matplotlib.pyplot as plt

  # y_true: binary ground-truth labels (1 = positive class)
  baserate = y_true.sum() / len(y_true)
  F1_baseline = 2 * (baserate * 1) / (baserate + 1)  # F1 of the always-positive classifier
  # recall values from the point where precision = 1 up to recall = 1
  recall = np.arange(F1_baseline / (2 - F1_baseline), 1 + 1e-7, 0.05)
  prec_baseline = F1_baseline * recall / (2 * recall - F1_baseline)
  plt.plot(recall, prec_baseline, 'k--', label="$F_1$ baseline")

Labels: module:metrics, Waiting for Second Reviewer
Projects: None yet
5 participants