-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FEA Implementation of "threshold-dependent metric per threshold value" curve #25639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…/scikit-learn into metric_threshold_curve
❌ Linting issuesThis PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling You can see the details of the linting issues under the
|
OK now that we merged the I think this is time to review and prioritize this feature. @vitaliset would you have time to dedicate to work on this feature? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial thoughts. I did not look at the documentation or test but it will come later.
def metric_threshold_curve( | ||
y_true, | ||
y_score, | ||
score_func, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to call this scoring
as well for consistency. But here, we should only accept a callable.
# Make y_true a boolean vector. | ||
y_true = y_true == pos_label | ||
|
||
# Sort scores and corresponding truth values. | ||
desc_score_indices = np.argsort(y_score, kind="mergesort")[::-1] | ||
y_score = y_score[desc_score_indices] | ||
y_true = y_true[desc_score_indices] | ||
if sample_weight is not None: | ||
sample_weight = sample_weight[desc_score_indices] | ||
|
||
# Logic to see if we need to use all possible thresholds (distinct values). | ||
all_thresholds = False | ||
if threshold_grid is None: | ||
all_thresholds = True | ||
elif isinstance(threshold_grid, int): | ||
if len(set(y_score)) < threshold_grid: | ||
all_thresholds = True | ||
|
||
if all_thresholds: | ||
# y_score typically has many tied values. Here we extract | ||
# the indices associated with the distinct values. We also | ||
# concatenate a value for the end of the curve. | ||
distinct_value_indices = np.where(np.diff(y_score))[0] | ||
threshold_idxs = np.r_[distinct_value_indices, y_true.size - 1] | ||
thresholds = y_score[threshold_idxs[::-1]] | ||
elif isinstance(threshold_grid, int): | ||
# It takes representative score points to calculate the metric | ||
# with these thresholds. | ||
thresholds = np.percentile( | ||
list(set(y_score)), np.linspace(0, 100, threshold_grid) | ||
) | ||
else: | ||
# If threshold_grid is an array then run some checks and sort | ||
# it for consistency. | ||
threshold_grid = column_or_1d(threshold_grid) | ||
assert_all_finite(threshold_grid) | ||
thresholds = np.sort(threshold_grid) | ||
|
||
# For each threshold calculates the metric. | ||
metric_values = [] | ||
for threshold in thresholds: | ||
preds_threshold = (y_score > threshold).astype(int) | ||
metric_values.append( | ||
score_func(y_true, preds_threshold, sample_weight=sample_weight) | ||
) | ||
# TODO: should we multithread the metric calculations? | ||
|
||
return np.array(metric_values), thresholds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this code is already implemented by the _score
function of the sklearn.model_selection._classification_threshold._CurveScorer
class.
I think that we should leverage this code by creating this scorer. We probably need to dissociate getting y_score
from the scoring itself such that here we only call the scoring part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now, it makes sense to move the _CurveScorer
in metrics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now, it makes sense to move the
_CurveScorer
inmetrics
.
Do you want me to do this in this PR or create a separate one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to be in a separate PR. Depending on the schedule, I might start to do the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to dissociate getting y_score from the scoring itself such that here we only call the scoring part.
I think it will make sense to do this on this PR after we move the code to the proper place. Do you agree?
Awesome news! I might need a couple of weeks, but I would love to make this feature available! Will work on your comments as soon as I can, @glemaitre. |
It will be tight to get in 1.6 but it will be one of my prioritize PR for 1.7. |
@vitaliset , thanks for your patience on this. We discussed with @glemaitre and wanted to try and prioritize this. Are you still interested in working on this? If not, I am happy to push it forward, and you will still be credited. |
Towards #21391.
Intending to later build the
MetricThresholdCurveDisplay
following the same structure that other Displays have, this PR implements the associate curve. I decided to break the original issue into two parts (curve and Display) for easier review (but I don't mind adding the Display to this PR as well).[Update 08 June 2024] The code example is outdated after Guillaume Lemaitre's first reviews. For instance, I've moved the code to metrics instead of inspection and changed the parameters names. Leaving it here because the idea is still similar. Will update this later.
A quick example of usage of the implementation here:
[Update 08 June 2024] Will be using the code from
_CurveScorer
instead of_binary_clf_curve
as soon as I move it to metrics module.Most of the code for
metric_threshold_curve
function is an adaptation of_binary_clf_curve
.Points of doubt:
[Update 09 June 2024] When I come back to this PR, I need to do this before asking for a new review:
_CurveScorer
frommodel_selection
tometrics
#29216._CurveScorer
so I can dissociate getting y_score from the scoring itself such that here we only call the scoring part.decision_threshold_curve
code to use this new method of_CurveScorer
.