EXA add how to optimize a metric under constraints in TunedThresholdClassifierCV #29617
Conversation
(force-pushed from 8bfe8c6 to 830c508)
Could you modify the example so that the CI passes? Right now the linter is failing.
(force-pushed from e1dcc03 to 75891f4)
I'll try to review this example soonish.
So there are some interesting ideas, but I think that we need to rework the order:

- First, I think that we can solely focus on `precision`, `recall`, and `f1_score`. Basically, we were already not happy using `balanced_accuracy`.
- So we could first optimize a model that maximizes the `f1_score` and add the precision-recall display, similarly to what you did for the ROC curve. We could show the trade-off between precision and recall associated with this maximum F1-score (see the sketch after this comment).
- In addition, we can make a plot of the distribution of the threshold for this case.
- Then, we can add the new section that maximizes either the precision for a given recall or the recall for a given precision.
- There, we can repeat the precision-recall curve and show the trade-off, with the constraint on one side and the maximum score on the other.
- I think that we don't need to show the distribution of the threshold for this case.

This actually makes for a nice narrative, because it is something I would expect to resonate with medical applications.
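A rough sketch of those first steps might look like this (the names `model`, `data`, and `target` are assumptions carried over from the example, not code from this PR):

```python
from sklearn.metrics import PrecisionRecallDisplay
from sklearn.model_selection import TunedThresholdClassifierCV, cross_validate

# Tune the decision threshold to maximize the F1-score.
tuned_f1_model = TunedThresholdClassifierCV(estimator=model, scoring="f1")
tuned_f1_model.fit(data, target)

# Show the precision/recall trade-off obtained at the tuned threshold.
PrecisionRecallDisplay.from_estimator(tuned_f1_model, data, target)

# Distribution of the tuned threshold across cross-validation folds.
cv_results = cross_validate(tuned_f1_model, data, target, return_estimator=True)
thresholds = [est.best_threshold_ for est in cv_results["estimator"]]
```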
```python
def run_model_cv_results(model, cv):
    # Collect cross-validation results for `model` into a DataFrame.
    cv_results_model = pd.DataFrame(
        cross_validate(
            model,
            data,
            target,
            scoring=scoring,
            cv=cv,
            return_train_score=True,
            return_estimator=True,
        )
    )
    return cv_results_model
```
This is a bit counter-intuitive and I would not recommend it in a production script, but for our example I would avoid defining a function and instead call `pd.DataFrame` and `cross_validate` explicitly. The reason is that it saves the reader from scrolling up and down.
One way to minimize the repetition is to store the parameters that are shared across the experiments:
```python
params = {
    "scoring": scoring, "cv": cv, "return_train_score": True, "return_estimator": True
}
cv_results_vanilla_model = pd.DataFrame(cross_validate(model, data, target, **params))
cv_results_vanilla_model[cv_scores].aggregate(["mean", "std"]).T
```
Then, we can reuse `params` for the other models, for example:
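A hypothetical continuation reusing the same (assumed) names for the threshold-tuned model:

```python
# Reuse the shared `params` for the threshold-tuned model.
cv_results_tuned_model = pd.DataFrame(cross_validate(tuned_model, data, target, **params))
cv_results_tuned_model[cv_scores].aggregate(["mean", "std"]).T
```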
```python
scoring = {
    "accuracy": "accuracy",
    "balanced_accuracy": "balanced_accuracy",
    "recall": make_scorer(recall_score, pos_label=pos_label),
```
I did not read the rest of the example, but if we want to optimize a metric under a constraint, let's use `precision` and `recall`. The idea is to maximize the recall given a certain level of precision, so we could compute these metrics.
I have to check the entire example, but we could then drop the accuracy and balanced accuracy.
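One way to express "maximize the recall given a certain level of precision" as a scorer that `TunedThresholdClassifierCV` can optimize (an illustrative sketch, not this PR's code; the 0.5 precision floor and the `pos_label` default are arbitrary assumptions):

```python
from sklearn.metrics import make_scorer, precision_score, recall_score


def recall_at_min_precision(y_true, y_pred, min_precision=0.5, pos_label=1):
    # Score a candidate threshold by its recall, but only when the
    # precision constraint is met; otherwise make it unselectable.
    precision = precision_score(y_true, y_pred, pos_label=pos_label, zero_division=0)
    if precision < min_precision:
        return -float("inf")
    return recall_score(y_true, y_pred, pos_label=pos_label)


# Keyword arguments passed to make_scorer are forwarded to the metric.
constrained_scorer = make_scorer(recall_at_min_precision, min_precision=0.5)
```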
```python
from sklearn.model_selection import TunedThresholdClassifierCV

tuned_model = TunedThresholdClassifierCV(estimator=model, scoring="balanced_accuracy")
```
OK, so what we could do is maximize the F1-score and thus compute the precision, recall, and F1-score. The storytelling would be that we could maximize the F1-score, but the harmonic mean of precision and recall (the definition of the F1-score) is a bit arbitrary, and in practice medical doctors actually look at a trade-off between precision and recall; that is the motivation for maximizing a metric under a constraint.
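For reference, a small worked instance of that definition:

```python
# The F1-score is the harmonic mean of precision and recall:
#     f1 = 2 * precision * recall / (precision + recall)
# For example, with a precision of 0.8 and a recall of 0.5:
precision, recall = 0.8, 0.5
f1 = 2 * precision * recall / (precision + recall)  # 0.8 / 1.3 ≈ 0.615
```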
```python
        cv_results_constrained_metric_model
    ),
}
# %%
```
While the plot is cool because it uses CV (we are trying to get a `.from_cv_results`), I think that we need to drop it.
Instead, we are better off doing the same type of analysis with the precision-recall curve, since we are going to compute the precision, recall, and F1-score anyway.
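Such a fold-by-fold precision-recall analysis could look like this (a sketch assuming `cross_validate` was called with `return_estimator=True` and `return_indices=True`, the latter available since scikit-learn 1.3; `data` and `target` are assumed to be pandas objects from the example):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import PrecisionRecallDisplay

# Draw one precision-recall curve per cross-validation fold from the
# fitted estimators and test indices returned by cross_validate.
fig, ax = plt.subplots()
folds = zip(cv_results["estimator"], cv_results["indices"]["test"])
for fold_idx, (est, test_idx) in enumerate(folds):
    PrecisionRecallDisplay.from_estimator(
        est,
        data.iloc[test_idx],
        target.iloc[test_idx],
        name=f"fold {fold_idx}",
        ax=ax,
    )
```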
@daustria Sorry that it took me so much time to come back to you. Would you still have time to make the changes?
No worries, thanks for your time and suggestions; I am still available to work on these changes.
Great. Thanks @daustria, feel free to ping me for the further iterations.
Just writing to say I am still working on this; I will push it soon. Sorry for the delay: I had a trip, broke my hand during it, and got sick recently.
No rush @daustria, hope you get better. Typing with a broken hand must not be easy.
(force-pushed from 75891f4 to e2cecb6)
@daustria do not hesitate to ping my handle when you have made the changes.
Okay, I will ping you on revisions. Sorry for the long delay; after my vacation and after my hand injury healed, I also had some interviews to prepare for. I should have some time now, though.
(force-pushed from e2cecb6 to 6c901fe)
@glemaitre I just noticed I forgot to actually push my changes a couple of days ago when force-pushing my local branch; all I did was update the commit message. Sorry about that, I guess I forgot how to use git. It should be there now, though.
@glemaitre Hi, I'm wondering if this is still wanted? I'm currently working on #29962 and I don't mind rebasing or making changes to this one while I am doing it.
Yes we need this one. Let me provide some feedback.
(force-pushed from 6c901fe to fd61e9e)
…on precision recall tradeoffs instead of accuracy and balanced accuracy.
(force-pushed from fd61e9e to 2120319)
Reference Issues/PRs
Towards #28944
What does this implement/fix? Explain your changes.
Added a third classifier to the example 'Post-hoc tuning the cut-off point of decision function' to show the pattern of optimizing one metric constrained by another. I also added an ROC plot.
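In spirit, the third classifier plugs a constrained scorer into `TunedThresholdClassifierCV`. A hypothetical sketch reusing the `constrained_scorer` idea from the review above (`model`, `data`, and `target` are assumed from the example):

```python
from sklearn.metrics import RocCurveDisplay
from sklearn.model_selection import TunedThresholdClassifierCV

# Tune the threshold against the constrained metric, then plot the ROC
# curve of the resulting classifier.
constrained_model = TunedThresholdClassifierCV(
    estimator=model, scoring=constrained_scorer
)
constrained_model.fit(data, target)
RocCurveDisplay.from_estimator(constrained_model, data, target)
```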
Any other comments?
I moved things around to include this pattern as best I could. Any comments on how I can improve it would be appreciated. I still need to add a few more sentences to make it more readable and educational, but please let me know your thoughts, especially on the code and layout changes.