[WIP] FEA New meta-estimator to post-tune the decision_function/predict_proba threshold for binary classifiers #16525
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,34 @@ | ||||||||||
.. currentmodule:: sklearn.model_selection | ||||||||||
|
||||||||||
.. _prediction_tuning: | ||||||||||
|
||||||||||
================================================ | ||||||||||
Tuning of the decision threshold of an estimator | ||||||||||
================================================ | ||||||||||
|
||||||||||
The real-valued decision functions, i.e. `decision_function` and | ||||||||||
`predict_proba`, of machine-learning classifiers carry the inherited biases of | ||||||||||
the fitted model; e.g., in a class-imbalanced setting, a classifier | ||||||||||
will naturally lean toward the most frequent class. In some other cases, the | ||||||||||
objective function used to train a model is unaware of the criteria used to | ||||||||||
evaluate it; e.g., one might want to penalize a false positive and a | ||||||||||
false negative differently: it will be less | ||||||||||
detrimental to show an MR image without a cancer (i.e., a false positive) to a | ||||||||||
radiologist than to hide one with a cancer (i.e., a false negative) when | ||||||||||
developing a computer-aided diagnosis system. | ||||||||||
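To make the asymmetric-cost argument concrete, here is a minimal sketch using only NumPy. The helper `misclassification_cost`, the cost values (a false negative counted five times as costly as a false positive), and the toy scores are all illustrative assumptions, not part of this PR.

```python
import numpy as np


def misclassification_cost(y_true, y_score, threshold, cost_fp=1.0, cost_fn=5.0):
    """Total cost of thresholded predictions (hypothetical helper).

    The costs are illustrative: a false negative is assumed to be five
    times as costly as a false positive.
    """
    y_pred = (y_score >= threshold).astype(int)
    n_fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    n_fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return cost_fp * n_fp + cost_fn * n_fn


# Toy imbalanced problem: six negatives and two positives, with made-up scores.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.80])

cost_default = misclassification_cost(y_true, y_score, threshold=0.5)
cost_lowered = misclassification_cost(y_true, y_score, threshold=0.4)
print(cost_default, cost_lowered)
```

On this toy data, lowering the threshold trades one missed positive for one false alarm, which is the cheaper mistake under the assumed costs.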
|
||||||||||
In a binary classification scenario, the hard prediction, i.e. `predict`, of a | ||||||||||
classifier most commonly thresholds the output of `predict_proba` | ||||||||||
at 0.5 to output a positive or negative label. Thus, this hard prediction | ||||||||||
suffers from the same drawbacks as those raised in the above paragraph. | ||||||||||
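The relationship between `predict` and a 0.5 cut-off on `predict_proba` can be checked directly. The sketch below assumes a `LogisticRegression` fitted on a synthetic imbalanced dataset; the dataset, the alternative threshold of 0.2, and the variable names are illustrative choices, not taken from this PR.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data with roughly 10% positives (an assumption for illustration).
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Probability of the positive class, i.e. the column matching clf.classes_[1].
proba_pos = clf.predict_proba(X)[:, 1]

# Thresholding at 0.5 by hand reproduces the hard predictions.
manual_pred = clf.classes_[(proba_pos > 0.5).astype(int)]
same_as_predict = np.array_equal(manual_pred, clf.predict(X))

# A lower threshold labels at least as many samples as positive.
lowered_pred = clf.classes_[(proba_pos > 0.2).astype(int)]
print(same_as_predict, manual_pred.sum(), lowered_pred.sum())
```

Moving the cut-off changes only the final labelling step; the fitted model and its probabilities are untouched.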
|
||||||||||
Post-tuning of the decision threshold | ||||||||||
===================================== | ||||||||||
|
||||||||||
:class:`CutoffClassifier` allows for post-tuning the decision threshold using | ||||||||||
either `decision_function` or `predict_proba`, together with an objective metric | ||||||||||
that the threshold should optimize. | ||||||||||
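Since the meta-estimator's interface is still under review in this PR, the following is only a hand-rolled sketch of the underlying idea and does not use `CutoffClassifier` itself. It assumes out-of-fold probabilities from `cross_val_predict`, a fixed grid of candidate thresholds, and balanced accuracy as the objective metric; all three are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_predict

# Synthetic imbalanced data (an assumption for illustration).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities of the positive class, so that the threshold
# is not tuned on scores the model produced for its own training samples.
oof_proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]

# Candidate thresholds 0.05, 0.10, ..., 0.95 (0.5 is exactly on the grid).
thresholds = np.arange(1, 20) / 20
scores = np.array(
    [balanced_accuracy_score(y, (oof_proba >= t).astype(int)) for t in thresholds]
)

best_idx = int(np.argmax(scores))
best_threshold = thresholds[best_idx]
score_at_default = scores[thresholds == 0.5][0]
print(best_threshold, scores[best_idx], score_at_default)
```

Because 0.5 is one of the candidates, the selected threshold can never score worse than the default on the out-of-fold predictions; whether the gain carries over to new data still has to be checked on a separate test set.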
|
||||||||||
Fine-tune using a single objective metric | ||||||||||
----------------------------------------- | ||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
|
@@ -611,6 +611,12 @@ Changelog | |||||||
be removed in 0.25. :pr:`16401` by | ||||||||
:user:`Arie Pratama Sutiono <ariepratama>` | ||||||||
|
||||||||
- |MajorFeature| :class:`model_selection.CutoffClassifier` calibrates the | ||||||||
decision threshold function of a classifier by maximizing a binary | ||||||||
classification metric through cross-validation. | ||||||||
:pr:`16525` by :user:`Guillaume Lemaitre <glemaitre>` and | ||||||||
:user:`Prokopis Gryllos <PGryllos>`. | ||||||||
|
||||||||
:mod:`sklearn.multioutput` | ||||||||
.......................... | ||||||||
|
||||||||
|
Nit, is this more about threshold tuning than prediction tuning?