
TunedThresholdClassifierCV: add other metrics #29061


Closed
koaning opened this issue May 21, 2024 · 10 comments
Labels
Needs Triage, New Feature

Comments

@koaning

koaning commented May 21, 2024

Describe the workflow you want to enable

I figured I might use the new tuned thresholder to turn code like this into something more like a grid search, with all the parallelism benefits.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import FixedThresholdClassifier, train_test_split
from tqdm import trange


X, y = make_classification(
    n_samples=10_000, weights=[0.9, 0.1], class_sep=0.8, random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

classifier = LogisticRegression(random_state=0).fit(X_train, y_train)

n_steps = 200
metrics = []
for i in trange(1, n_steps):
    classifier_other_threshold = FixedThresholdClassifier(
        classifier, threshold=i/n_steps, response_method="predict_proba"
    ).fit(X_train, y_train)
    
    y_pred = classifier_other_threshold.predict(X_train)
    metrics.append({
        'threshold': i/n_steps,
        'f1': f1_score(y_train, y_pred),
        'precision': precision_score(y_train, y_pred),
        'recall': recall_score(y_train, y_pred),
        'accuracy': accuracy_score(y_train, y_pred)
    })

This data can give me a very pretty plot with a lot of information.
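For context, the plot itself can be produced with something along these lines (a minimal sketch, assuming pandas and matplotlib are available):

import matplotlib.pyplot as plt
import pandas as pd

# Turn the collected records into a frame indexed by threshold and plot
# every metric as its own line.
df = pd.DataFrame(metrics).set_index("threshold")
df.plot(xlabel="threshold", ylabel="score", title="Metrics per decision threshold")
plt.show()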

[Screenshot: plot of f1, precision, recall, and accuracy as a function of the decision threshold]

But I think I can't make this chart with the new tuned thresholder in the 1.5 release candidate.

I can do this:

from sklearn.model_selection import TunedThresholdClassifierCV
from sklearn.metrics import make_scorer

classifier_other_threshold = TunedThresholdClassifierCV(
    classifier,  
    scoring=make_scorer(f1_score), 
    response_method="predict_proba", 
    thresholds=200, 
    n_jobs=-1, 
    store_cv_results=True
)
classifier_other_threshold.fit(X_train, y_train)

And this gives me data for a pretty plot as well, but it only contains the f1 score. There is no way to add extra metrics in the current implementation.
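For reference, a minimal sketch of how that single-metric plot can be drawn from the stored results (assuming the 1.5 cv_results_ layout, a dict with "thresholds" and "scores" keys, only available with store_cv_results=True):

import matplotlib.pyplot as plt

results = classifier_other_threshold.cv_results_  # dict with "thresholds" and "scores"
plt.plot(results["thresholds"], results["scores"], label="f1 (objective)")
plt.axvline(classifier_other_threshold.best_threshold_, linestyle="--", label="best threshold")
plt.xlabel("threshold")
plt.ylabel("score")
plt.legend()
plt.show()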

[Screenshot: plot of the f1 score as a function of the decision threshold]

Describe your proposed solution

Maybe it makes sense to also allow a metrics input to the tuned cv estimator. That way, it can still collect any extra metrics that we might be interested in.

Describe alternatives you've considered, if relevant

The aforementioned code produces the chart that I am mainly interested in. A single metric never tells the full story, and the extra metrics help prevent me from overfitting on a single variable. Reality tends to be more complex than what a single metric can capture, so I like to add extra context with some (custom) metrics.

Additional context

No response

koaning added the Needs Triage and New Feature labels on May 21, 2024
@glemaitre
Member

Duplicate of #21391
The metric side is being implemented in #25639.
Then, the next step will be to implement the display that calls the metric under the hood.

@glemaitre
Member

glemaitre commented May 21, 2024

Maybe it makes sense to also allow a metrics input to the tuned cv estimator. That way, it can still collect any extra metrics that we might be interested in.

I would be against this API. We should limit the internal attributes to the modelling itself (currently, optimizing a single metric). However, if we have a display, then we can pass in the model and compute and store the metrics there. So in terms of usability, I think the display should be in charge of storing the metric log.

@glemaitre
Member

@koaning Do you think this is fine to close this issue in favor of #21391?

@koaning
Author

koaning commented May 21, 2024

Oh, just to be clear, I am not worried about the display/chart. This is more just an example of a manual chart that I might be interested in making. I am mainly concerned with being able to dive into the effect that a different threshold may have. There can be multiple concerns, and a single metric usually doesn't capture that.

What I want is the ability to add metrics, some of which are custom, that can help ensure that optimizing for one thing, say the f1 score, does not cause issues elsewhere, say on a fairness metric. I am totally fine with optimizing a single metric; that totally makes sense! But I would prefer to also be able to report on other metrics while doing so.

@glemaitre
Member

glemaitre commented May 21, 2024

What I want is the ability to add metrics, some of which are custom, that can help ensure that optimizing for one thing, say the f1 score, does not cause issues elsewhere, say on a fairness metric. I am totally fine with optimizing a single metric; that totally makes sense! But I would prefer to also be able to report on other metrics while doing so.

But once you have the model, this is just an evaluation issue. So providing a function sklearn.metrics.decision_threshold_curve that takes any callable (maybe even a list of callables) should allow for both custom and already defined metrics.

However, this requires an additional line of code because it is not called within the TunedThresholdClassifierCV but afterwards:

model = TunedThresholdClassifierCV(.., scoring=business_metrics, ...).fit(X_train, y_train)
metrics_log = decision_threshold_curve(
    y_test, model.predict_proba(X_test), scoring=[business_metrics, f1_score, ...]
)

The advantage here is that you can compute the score on a provided dataset and not only on an internal validation set. So you will be able to compare train/test or run it through cross-validation.
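Until such a function exists, the same evaluation can be sketched by hand; manual_threshold_curve below is a hypothetical helper, not scikit-learn API:

import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

def manual_threshold_curve(y_true, y_proba, scorers, n_thresholds=100):
    # Evaluate each metric at evenly spaced probability thresholds.
    thresholds = np.linspace(0.01, 0.99, n_thresholds)
    scores = {
        name: [scorer(y_true, (y_proba >= t).astype(int)) for t in thresholds]
        for name, scorer in scorers.items()
    }
    return thresholds, scores

# Computed on a held-out set, not only on an internal validation split.
thresholds, scores = manual_threshold_curve(
    y_test,
    classifier.predict_proba(X_test)[:, 1],
    {"f1": f1_score, "precision": precision_score, "recall": recall_score},
)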

@koaning
Author

koaning commented May 21, 2024

I was not aware of the decision_threshold_curve but I also could not find it. Just to make sure, is that a typo?

[Screenshot: documentation search showing no results for decision_threshold_curve]

@glemaitre
Member

I was not aware of the decision_threshold_curve but I also could not find it. Just to make sure, is that a typo?

Nope, this is my proposal in #25639.

@koaning
Author

koaning commented May 21, 2024

Ahhhh sorry, now I see. Yeah, ok, with a visualisation feature like that I suppose you could always do that exercise without using an automated tuner, and I also see how that is a separate problem.

Fair enough. Closing this one! Thanks for the response.

@koaning koaning closed this as completed May 21, 2024
@skanskan

Is TunedThresholdClassifierCV with refit=True the same as GridSearchCV using the threshold as a hyperparameter?
Do they refit all parameters of the model at the same time as the hyperparameters?

@glemaitre
Member

glemaitre commented Sep 17, 2024

refit=True means that once you have picked the best set of parameters, you retrain the underlying estimator on the full dataset with the selected parameters, which is similar to the GridSearchCV policy.
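For illustration, a rough sketch of the analogy (wrapping FixedThresholdClassifier in GridSearchCV; treat the equivalence as conceptual, not exact):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    FixedThresholdClassifier,
    GridSearchCV,
    TunedThresholdClassifierCV,
)

# Tuning the threshold with TunedThresholdClassifierCV and refit=True ...
tuned = TunedThresholdClassifierCV(
    LogisticRegression(), scoring="f1", refit=True
).fit(X_train, y_train)

# ... is conceptually close to grid-searching the `threshold` parameter of a
# FixedThresholdClassifier: both select the threshold by cross-validation and
# then refit the underlying estimator on the full training set.
grid = GridSearchCV(
    FixedThresholdClassifier(LogisticRegression(), response_method="predict_proba"),
    param_grid={"threshold": np.linspace(0.01, 0.99, 99)},
    scoring="f1",
).fit(X_train, y_train)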
