
ENH Add replace_undefined_by param to class_likelihood_ratios #29288


Merged: 47 commits merged into scikit-learn:main on Jan 17, 2025

Conversation

@StefanieSenger (Contributor) commented Jun 18, 2024

Reference Issues/PRs

towards #29048

What does this implement/fix? Explain your changes.

This PR adds a `zero_division` param to `class_likelihood_ratios`, as we are doing for the other metrics in the issue above. Since this function returns two scores, the input to the `zero_division` param also needs to encompass two values.

There is already a `raise_warning` param used for a similar purpose; I deprecated it here in a way that carries its functionality (it exclusively raises warnings, the return values are not affected) over to the new param.

Question:
The output for `zero_division="warn"` (the default) is set to `np.nan`, as it is in the current function in case of a zero division (which is also what the warning says). The idea when we talked about it was to keep backwards compatibility.

I think we don't need to do this and can return the lowest scores for each metric (1 for LR+ and 0 for LR-) right away in case of `zero_division="warn"`, because the return values have nothing to do with the deprecated param.
Does that make sense?

Edit: this question was answered: yes, we keep the np.nan default return value for backwards compatibility until version 1.8.

Any other comments?

The warning that was previously raised if `support_pos == 0` has nothing to do with dividing by zero, so I decided to decouple it from the new param (and from the old one). Since this doesn't change the functionality and only adds an additional warning in a certain case, that is probably alright, isn't it?
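
For context, a minimal usage sketch of the behaviour this PR aims at, written against the parameter name the PR ended up with (`replace_undefined_by`, see the review below); the data and the exact set of accepted values are illustrative assumptions rather than a verbatim excerpt from the PR:

```python
import numpy as np
from sklearn.metrics import class_likelihood_ratios

# y_pred contains no positive predictions, so fp == 0 and LR+ is undefined.
y_true = np.array([0, 0, 0, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0])

# Replace any undefined ratio by the worst (chance-level) score of 1.0 ...
pos_lr, neg_lr = class_likelihood_ratios(y_true, y_pred, replace_undefined_by=1.0)

# ... or control each ratio separately with a dict.
pos_lr, neg_lr = class_likelihood_ratios(
    y_true, y_pred, replace_undefined_by={"LR+": np.nan, "LR-": 1.0}
)
```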


github-actions bot commented Jun 18, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 49ca5d8.

@ArturoAmorQ self-requested a review June 18, 2024 15:22
@glemaitre self-requested a review June 18, 2024 21:09
@glemaitre (Member) left a comment

I think we should add more tests to check all the possibilities for `zero_division != "warn"`. We should also check that we raise the proper deprecation warning for `raise_warning`.
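
For illustration, a hedged sketch of the kind of parametrized test that could cover values other than `"warn"` (the test name, the final parameter name `replace_undefined_by` and the expected replacement values are assumptions here, not the tests that were merged):

```python
import numpy as np
import pytest
from sklearn.metrics import class_likelihood_ratios


@pytest.mark.parametrize(
    "replace_undefined_by, expected_lr_pos",
    [(1.0, 1.0), (np.nan, np.nan), ({"LR+": np.inf, "LR-": 0.0}, np.inf)],
)
def test_replace_undefined_by_values(replace_undefined_by, expected_lr_pos):
    # fp == 0 and tp == 0, so LR+ is undefined and should be replaced.
    y_true = np.array([0, 0, 0, 0, 0, 1])
    y_pred = np.array([0, 0, 0, 0, 0, 0])
    lr_pos, _ = class_likelihood_ratios(
        y_true, y_pred, replace_undefined_by=replace_undefined_by
    )
    assert lr_pos == pytest.approx(expected_lr_pos, nan_ok=True)
```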

"zero_division": [
Hidden(StrOptions({"default"})),
StrOptions({"warn"}),
dict, # this needs to be further defined, but Options only takes immutable
Member

I think dict is enough. Further validation will be handled in the function itself. The parameter validation is not doing advanced checks.

Contributor Author

I think there currently is no validation in place in the function itself, because I was using open else statements for any other input. An option would be to use elifs with well-defined conditions for the zero_division args and raise an input error in the else branch if the user passes something invalid.
But my impression had been that the param validation was meant to handle this rather than the function code, and that the goal of having it was to unburden the actual function code from validations like these. Which way should we proceed?

@StefanieSenger (Contributor Author) commented Nov 4, 2024

> The parameter validation is not doing advanced checks.

Oh, did you mean that the parameter validation is not used for advanced checks? And do advanced checks mean checks that go beyond checking types and maybe immutable arguments? Then I think I understand now and the input validation needs to happen in the function itself.

Edit: Done

To define the return values in case of a division by zero, use
`zero_division` instead.

zero_division : str or dict, default="warn"
Member

Since we know exactly the string, we can state it directly.

I'm also thinking of adding np.nan as a parameter value that would be a shorthand for {"LR+": np.nan, "LR-": np.nan}. It is a bit lighter.

Suggested change
zero_division : str or dict, default="warn"
zero_division : "warn", np.nan or dict, default="warn"

@StefanieSenger (Contributor Author) commented Jun 24, 2024

Okay, I think we then also keep the {"LR+": np.nan, "LR-": np.nan} option and additionally add the np.nan option. I slightly deviated and made it the string "nan", because a) this is how it is used in the other zero_division param cases and b) np.nan turned out to be difficult to work with (in the validation and in other places).

Edit: this is done

Comment on lines 2042 to 2044
# zero_division="warn" does not need to be "np.nan" anymore and should be the lowest
# score for each metric respectively (1 for LR+ and 0 for LR-) to match the other
# functions that take a zero_division param. Return values and warning messages need
Member

Note that this change of behaviour should also be done with a deprecation from 1.8 to 1.10.

Contributor Author

Sorry, I don't understand. Are you saying there should be two deprecation cycles?

@StefanieSenger (Contributor Author) commented Nov 4, 2024

After looking at this anew, I see that you meant that the change of the default value of zero_division from "warn" to the lowest possible scores needs its own deprecation cycle.

Can we also consider doing the two steps in one? The goal is that the functionality of zero_division matches the other functions that take a zero_division param, and the longer this is not the case, the more confusing it gets.

Edit: I have added a FutureWarning for the changing default behaviour and commented on this in the # TODO section.
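
As an aside, a hedged sketch of how such a FutureWarning could be emitted when the user still relies on the old default (the variable names are illustrative stand-ins; the message paraphrases the warning that later shows up in the CI log at the end of this conversation):

```python
import warnings

# Illustrative stand-ins: in the real function these would come from the
# user-supplied parameter and from the computed confusion matrix.
replace_undefined_by = "default"
ratio_is_undefined = True  # e.g. fp == 0 made LR+ undefined

if replace_undefined_by == "default" and ratio_is_undefined:
    warnings.warn(
        "The default return value of `class_likelihood_ratios` in case of a "
        "division by zero has been deprecated in 1.7 and will be changed to "
        "the worst scores (`(1.0, 1.0)`) in version 1.9. Set "
        "`replace_undefined_by=1.0` to use the new default and to silence "
        "this warning.",
        FutureWarning,
    )
```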

negative_likelihood_ratio = neg_num / neg_denom
elif isinstance(
zero_division.get("LR-", None), (int, float)
) and zero_division.get("LR-", None) in [0, 1]:
Member

Here, I don't think that we will raise an error if we get anything other than 0/1. I'm thinking that we could, whenever possible, always create a dictionary and check that the values in the dict are the expected ones (0/1/np.inf). This would be safer at this stage.

Contributor Author

This is very interesting. Before I jump into implementing this, let me reformulate to clarify whether we are talking about the same idea:

- Should we accept any value that the metrics could output as a valid input in the `zero_division` dict?
- Should we allow it for class_likelihood_ratios only, or also for the other classification metrics?
- Would it be useful to someone?

I feel I cannot judge whether it would be useful and would rather stick with how zero_division works on the other metrics by implementing it in a similar way here (which means restricting the possible values to the min and max scores). If you or someone else goes "yes, yes, yes" to the above questions, I'd be happy to learn more about this idea and to make it work this way.

Contributor Author

After we talked about this, I have now implemented a lenient check of the input values for the zero_division param at the beginning of the function, which allows all the values the ratios can take.
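
For illustration, a hedged sketch of what such a lenient in-function check could look like (the helper name and the exact error messages are assumptions, not the merged code; it only restricts the dict values to the ranges the two ratios can take):

```python
import numpy as np


def _check_replacement_values(replace_undefined_by):
    """Lenient validation of the per-ratio replacement values.

    LR+ lies in [1, inf] and LR- lies in [0, 1], so any value in those ranges
    (or np.nan) is accepted as a replacement for an undefined ratio.
    """
    if not isinstance(replace_undefined_by, dict):
        return  # floats such as 1.0 or np.nan are covered by the param constraints
    pos = replace_undefined_by.get("LR+", np.nan)
    neg = replace_undefined_by.get("LR-", np.nan)
    if not (np.isnan(pos) or pos >= 1):
        raise ValueError("The replacement for 'LR+' must be np.nan or >= 1.")
    if not (np.isnan(neg) or 0 <= neg <= 1):
        raise ValueError("The replacement for 'LR-' must be np.nan or in [0, 1].")


_check_replacement_values({"LR+": np.inf, "LR-": 0.0})  # passes silently
```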

@ArturoAmorQ (Member) left a comment

Thanks for the PR @StefanieSenger, I would also briefly document the zero_division behavior in the dedicated mathematical divergences section of the user guide.

@StefanieSenger (Contributor Author) left a comment

Thanks for your review @glemaitre, I have done most of the things you suggested and commented on a few that were not so clear, where I would like to exchange some thoughts or concerns.


@StefanieSenger (Contributor Author) commented Jun 24, 2024

> Thanks for the PR @StefanieSenger, I would also briefly document the zero_division behavior in the dedicated mathematical divergences section of the user guide.

Thank you @ArturoAmorQ, I have mentioned it in this section. I also couldn't hold back from trying to define the conditions for zero divisions a bit more clearly there. Please let me know if my attempt succeeded.

@ArturoAmorQ (Member) left a comment

Thanks for the PR @StefanieSenger, here are just a couple of comments regarding the documentation aspects of this PR. I'll let the others review the actual code :)

Comment on lines 1921 to 1922
interpreted as the classifier never wrongly identifying negative cases as positives.
This happens, for instance, when using a `DummyClassifier` that always predicts the
@ArturoAmorQ (Member) commented Jul 8, 2024

The wording "never wrongly identifying negative cases as positives" does not bring added value, as it is equivalent to fp=0.

The original wording "perfectly identifying positive cases" was intended to remind the user that LR+ is a "higher is better" kind of metric (indeed, if fp=0 any sample classified as positive has to be a real positive and then LR+=inf). The paragraph is meant to highlight that the interpretation "higher is better, ergo inf is perfect" is lost when both fp=0 and tp=0, as can be the case with a DummyClassifier (the absence of positive predictions leads to fp=tp=0).

Member

Maybe my point would be clearer if we give some examples:

  • Case 1: fp=0, tp≠0, LR_+ diverges
    • the classifier is perfect
    • in coherence with being a "higher is better" metric e.g.:
    • y_true = np.array([0, 0, 0, 1, 1, 1])
    • y_pred = np.array([0, 0, 0, 1, 1, 1])
  • Case 2: fp=0, tp=0, LR_+ diverges
    • y_pred only includes the majority class
    • can be misleading as it's the case of a DummyClassifier with imbalanced data e.g.:
    • y_true = np.array([0, 0, 0, 0, 0, 1])
    • y_pred = np.array([0, 0, 0, 0, 0, 0])
  • Case 3: fn=0, tp=0, LR_+ and LR_- diverge
    • the test set only includes the majority class, but the classifier still makes errors
    • can be misleading as it's a common case when cross-validating with imbalanced data e.g.:
    • y_true = np.array([0, 0, 0, 0, 0, 0])
    • y_pred = np.array([0, 0, 0, 0, 0, 1])
  • Case 4: tn=0, LR_- diverges
    • positive and negative classes are inverted
    • can be indicative of divergences in LR_+ if classes are reversed e.g.:
    • y_true = np.array([1, 1, 1, 0, 0, 0])
    • y_pred = np.array([0, 0, 0, 1, 1, 1])

Having this in the shape of a table would be nice.
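
A small runnable illustration of Case 2 above, assuming the pre-PR default behaviour of warning and returning `nan` for the undefined ratio (the data is taken from the case list):

```python
import warnings

import numpy as np
from sklearn.metrics import class_likelihood_ratios

# Case 2: fp == 0 and tp == 0 because y_pred only contains the majority class.
y_true = np.array([0, 0, 0, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lr_pos, lr_neg = class_likelihood_ratios(y_true, y_pred)

print(lr_pos, lr_neg)  # LR+ is nan (undefined), LR- is still finite
print([str(w.message) for w in caught])  # the recorded warning explains why
```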

@ArturoAmorQ (Member) commented Nov 4, 2024

Something similar to:

| Zeros in the Confusion Matrix | Diverges      | Reason                                                         | Interpretation                                                                                    | Example                                      |
|-------------------------------|---------------|----------------------------------------------------------------|---------------------------------------------------------------------------------------------------|----------------------------------------------|
| fp=0, tp≠0                    | LR_+          | The classifier is perfect                                      | In coherence with being a "higher is better" metric                                               | `y_true = [0, 0, 0, 1, 1, 1]`<br>`y_pred = [0, 0, 0, 1, 1, 1]` |
| fp=0, tp=0                    | LR_+          | `y_pred` only includes the majority class                      | Can be misleading, as it’s the case of a `DummyClassifier` with imbalanced data                   | `y_true = [0, 0, 0, 0, 0, 1]`<br>`y_pred = [0, 0, 0, 0, 0, 0]` |
| fn=0, tp=0                    | LR_+ and LR_- | The test set only includes the majority class, but classifier still makes errors | Can be misleading, common when cross-validating with imbalanced data | `y_true = [0, 0, 0, 0, 0, 0]`<br>`y_pred = [0, 0, 0, 0, 0, 1]` |
| tn=0                          | LR_-          | Positive and negative classes are inverted                     | Can indicate divergences in LR_+ if classes are reversed                                          | `y_true = [1, 1, 1, 0, 0, 0]`<br>`y_pred = [0, 0, 0, 1, 1, 1]` |

Contributor Author

Thank you for clarifying this, @ArturoAmorQ. I think I understand the LR+ case (not so sure about the LR- case though, so I didn't touch it).

I have tried to point out the two cases that fp==0 might indicate and also to stay within the scope of a side note/dropdown on mathematical divergences. I feel this could be material for an extensive example, but that is outside the scope of this PR.

Please let me know what you think about my attempt to rephrase it and to give advice on how to interpret the positive likelihood ratio in case of a divergence / warning shown to the user.

Comment on lines 2021 to 2028
Conditions under which this warning is raised in relation to the confusion matrix
deriving from the `y_true` and `y_pred` arguments:
When the number of false positives is 0, the positive likelihood ratio is undefined.
When the number of true negatives is 0, the negative likelihood ratio is undefined.
`UndefinedMetricWarning` is also (and regardles of the `zero_division` and
`raise_warning` arguments) raised when no samples of the positive class are present
in `y_true` (when the sum of true positives and false negatives is 0). Then, both
the positive and the negative likelihood ratios are undefined.
Member

Suggested change
Conditions under which this warning is raised in relation to the confusion matrix
deriving from the `y_true` and `y_pred` arguments:
When the number of false positives is 0, the positive likelihood ratio is undefined.
When the number of true negatives is 0, the negative likelihood ratio is undefined.
`UndefinedMetricWarning` is also (and regardles of the `zero_division` and
`raise_warning` arguments) raised when no samples of the positive class are present
in `y_true` (when the sum of true positives and false negatives is 0). Then, both
the positive and the negative likelihood ratios are undefined.
A warning is raised given the following conditions in `y_true` and `y_pred`:
- False positives are 0: positive likelihood ratio is undefined.
- True negatives are 0: negative likelihood ratio is undefined.
- No positive class samples seen in `y_true`: both likelihood ratios are
undefined.

Contributor Author

Thank you for your suggestion, @ArturoAmorQ. I agree that the previous text was a bit confusing, and I took your suggestion and worked it into my new commit. I have also mentioned the input arguments for zero_division and raise_warning that also influence whether a warning is raised.


@adrinjalali (Member) left a comment

Otherwise this now LGTM.

"raise_warning": ["boolean", Hidden(StrOptions({"deprecated"}))],
"replace_undefined_by": [
Hidden(StrOptions({"default"})),
(StrOptions({"worst"})),
Member

Suggested change
(StrOptions({"worst"})),
1.0,

So here I used to think the worst case for the two scores is different (0 and 1), but if for both of them it's 1, then I think it makes sense for the argument option to simply be 1.0, while explaining in the docstring (as is already done in this PR) what the ranges for the two scores are.

cc @glemaitre
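
For reference, the standard definitions behind this discussion (nothing specific to this PR):

$$
\mathrm{LR}_+ = \frac{\text{sensitivity}}{1 - \text{specificity}} = \frac{tp/(tp+fn)}{fp/(fp+tn)},
\qquad
\mathrm{LR}_- = \frac{1 - \text{sensitivity}}{\text{specificity}} = \frac{fn/(tp+fn)}{tn/(tn+fp)}
$$

A useful classifier has LR+ above 1 and LR- below 1; both degrade towards 1 at chance level, which is why 1.0 is the "worst" replacement value for both ratios even though their ranges differ.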

Contributor Author

I tend to think that "worst" is more intuitive than 1.0, but it's true that this would then differ from the other cases where we have implemented zero_division.
But no strong opinion.

Member

I prefer "worst" for its explicitness, but I would like to remain consistent. I would therefore prefer to go with 1.0.

Contributor Author

I have changed it back from "worst" to 1.0 here, in the tests, in the control flow checks, in the warning messages and in the docstring.

@glemaitre self-requested a review November 25, 2024 18:54
default an appropriate warning message and returns `nan` to avoid pollution when
averaging over cross-validation folds.
The positive likelihood ratio (`LR+`) is undefined when :math:`fp = 0`, meaning the
classifier does not misclassify any negatives as positives. This condition can either
Member

I would suggest calling them "negative samples" (or observations) and "positive samples" instead of "negatives" and "positives".

Contributor Author

I have rephrased this into "does not misclassify any negative labels as positives."
I think that is even clearer.

@glemaitre (Member) left a comment

These are only minor changes. The code is pretty great and well documented, I should say (it is quite easy to get lost in this zero_division mess otherwise).

Really good job @StefanieSenger.

@adrinjalali you might want to be the one merging after the comments are addressed.

@StefanieSenger (Contributor Author) left a comment

Thanks for your review @glemaitre.

I have made all the required changes.


Comment on lines +143 to +145
if (isinstance(constraint, str) and constraint == "nan") or (
isinstance(constraint, float) and np.isnan(constraint)
):
Contributor Author

I have added the test case.

@adrinjalali merged commit 3060384 into scikit-learn:main Jan 17, 2025
31 checks passed
@lesteve (Member) commented Jan 17, 2025

Not of extreme importance on a Friday afternoon, but FYI this broke main; it looks like a warning turned into an error.

Relevant part from the build log:

Extension error:
Here is a summary of the problems encountered when running the examples:

Unexpected failing examples (1):

    ../examples/model_selection/plot_likelihood_ratios.py failed leaving traceback:

    Traceback (most recent call last):
      File "/home/circleci/project/examples/model_selection/plot_likelihood_ratios.py", line 64, in <module>
        pos_LR, neg_LR = class_likelihood_ratios(y_test, y_pred)
      File "/home/circleci/project/sklearn/utils/_param_validation.py", line 218, in wrapper
        return func(*args, **kwargs)
      File "/home/circleci/project/sklearn/metrics/_classification.py", line 2100, in class_likelihood_ratios
        warnings.warn(mgs_changed_default, FutureWarning)
    FutureWarning: The default return value of `class_likelihood_ratios` in case of a division by zero has been deprecated in 1.7 and will be changed to the worst scores (`(1.0, 1.0)`) in version 1.9. Set `replace_undefined_by=1.0` to use the new default and to silence this Warning.
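
A plausible one-line fix for the failing example is to opt into the new default explicitly; this is an assumption about the follow-up rather than the actual fix, shown here with stand-in data so the snippet runs on its own:

```python
import numpy as np
from sklearn.metrics import class_likelihood_ratios

# stand-ins for the example's y_test / y_pred
y_test = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])

# passing replace_undefined_by silences the FutureWarning about the changing default
pos_LR, neg_LR = class_likelihood_ratios(y_test, y_pred, replace_undefined_by=1.0)
```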

@StefanieSenger (Contributor Author)

So, didn't we build the examples in the CI before? I don't understand why this didn't show up earlier.

@adrinjalali (Member)

The whole documentation is not built in the CI unless you add [doc build] to a commit message, but it is built on main.

@StefanieSenger (Contributor Author)

Otherwise only the files I touched are built?

@adrinjalali (Member)

Yes, exactly. I should have remembered to ask you to submit that commit with the message; this happens every now and then. All good, a follow-up PR to fix it is okay.
