
DOC add an example on how to optimize a metric with a constraint in TunedThresholdClassifierCV #28944

Open
glemaitre opened this issue May 3, 2024 · 6 comments

Comments

@glemaitre
Member

We merged TunedThresholdClassifierCV in #26120.
However, we don't expose any way to optimize a metric that is constrained by another as one would do when choosing a point on the ROC or PR curves.

We should have an example that shows how to do such optimization as discussed here:
#26120 (review)

This would be a temporary trick until we settle on the best possible API regarding this constrained scorer.

@github-actions github-actions bot added the Needs Triage Issue requires triage label May 3, 2024
@adrinjalali
Member

Probably also figure out a nice api that we can use for SearchCV as well

@jeremiedbb
Member

I think that having a single scoring param able to handle constrained metrics as well would be ideal. In addition to making an example for that, we could maybe implement a helper for the main use cases, precision at recall and TPR at TNR, that returns a ready-to-use scorer.
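One way such a helper could look, as a rough sketch only. `make_constrained_scorer` is a hypothetical name that does not exist in scikit-learn, and the constraint values 0.8 and 0.9 are arbitrary assumptions:

```python
from functools import partial

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer, precision_score, recall_score


def make_constrained_scorer(objective, constraint, min_constraint):
    """Hypothetical helper (not a scikit-learn function).

    Returns a scorer for `objective` that yields -inf whenever
    `constraint(y_true, y_pred)` is below `min_constraint`.
    """

    def constrained_metric(y_true, y_pred):
        if constraint(y_true, y_pred) < min_constraint:
            return -np.inf
        return objective(y_true, y_pred)

    return make_scorer(constrained_metric)


# TPR is the recall of the positive class, TNR the recall of the negative class.
tpr = partial(recall_score, pos_label=1)
tnr = partial(recall_score, pos_label=0)

precision_at_recall = make_constrained_scorer(precision_score, recall_score, 0.8)
tpr_at_tnr = make_constrained_scorer(tpr, tnr, 0.9)

# Tiny sanity check with a classifier that always predicts the negative class.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 0, 1, 1])
always_negative = DummyClassifier(strategy="constant", constant=0).fit(X, y)

score_precision = precision_at_recall(always_negative, X, y)  # recall is 0
score_tpr = tpr_at_tnr(always_negative, X, y)  # TNR is 1, TPR is 0
print(score_precision, score_tpr)
```

The returned objects are ordinary scorers, so under this assumption they could be passed to the `scoring` parameter of `TunedThresholdClassifierCV` like any other scorer.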

> Probably also figure out a nice api that we can use for SearchCV as well

Do you think it would make sense to search over constraint values? My understanding was that such a constraint is determined by an external factor that you have no flexibility over.

@jeremiedbb jeremiedbb added New Feature Documentation and removed Needs Triage Issue requires triage labels May 6, 2024
@adrinjalali
Member

At least in fairness-related contexts, you might have a constraint on a fairness metric, and among the models that satisfy it you want the one with the best accuracy.

Or, to take the classical example, you want the model with the lowest AIC/BIC among those whose test error is within one standard deviation of the best model's.
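For SearchCV, a related pattern is already possible through the callable form of `refit` in `GridSearchCV`, which receives `cv_results_` and returns the index of the chosen candidate. The sketch below is an illustration under assumptions: it swaps AIC/BIC for a simpler complexity proxy (the strongest regularization, i.e. the smallest `C`), and the function name `simplest_within_one_std` and the parameter grid are made up for the example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


def simplest_within_one_std(cv_results):
    """Pick the smallest C whose mean score is within one std of the best."""
    mean = np.asarray(cv_results["mean_test_score"])
    std = np.asarray(cv_results["std_test_score"])
    best = mean.argmax()
    # Candidates that satisfy the "within one std of the best" constraint.
    candidates = np.flatnonzero(mean >= mean[best] - std[best])
    # Complexity proxy (assumption): smaller C means stronger regularization.
    c_values = np.array([params["C"] for params in cv_results["params"]])
    return int(candidates[c_values[candidates].argmin()])


X, y = make_classification(n_samples=500, random_state=0)

search = GridSearchCV(
    LogisticRegression(),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    refit=simplest_within_one_std,
    cv=5,
)
search.fit(X, y)

print("selected parameters:", search.best_params_)
```

This only covers selection among a fixed grid of candidates; it does not address how a constrained metric would be declared through a single `scoring` parameter.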

@daustria

daustria commented Jul 8, 2024

I would be interested in contributing to this. My understanding is that a short example should be added to show how the current API can be used (in a slightly tricky manner, using -np.inf) to optimize a metric subject to some constraint. As for where to add it, some potential spots I identified are in 3.3.1.1, inside the note, or a new subsection in 3.4.1. Perhaps most ambitiously, it could be inserted into this existing example on cost-sensitive learning.

@glemaitre
Member Author

I think we should not complicate the example that you pointed out. However, the following one might be a better place to show the pattern, along with the associated ROC curve to help understand the constraint.

Then, we can document in the user guide and refer to this example.

@daustria

daustria commented Aug 2, 2024

I added a draft PR for this example. I still need to do a little more writing, but the code, the layout changes, and the overall shape are what I have in mind for reworking this example to include the constrained metric pattern. Please let me know your thoughts.
