Describe the workflow you want to enable

Display the recall as a function of the predicted positive rate (PP), using sklearn.metrics.precision_recall_curve to compute the recall and PP as quantiles of the threshold scores. This is currently not possible to do consistently, because sklearn.metrics.precision_recall_curve drops many of the threshold values for which recall = 1. This behavior was introduced recently.
Describe your proposed solution
Add a drop_intermediate parameter to sklearn.metrics.precision_recall_curve, similar to the one in sklearn.metrics.roc_curve, with a default value of False, and keep the extreme threshold values to avoid side effects.
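For comparison, sklearn.metrics.roc_curve already exposes such a parameter (a minimal illustration using the same toy data as the reproduction below; note that its default there is True):

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.concatenate((np.zeros(50), np.ones(50)))
scores = np.arange(100)

# drop_intermediate=False keeps every distinct score as a threshold
# (plus one extreme value prepended by roc_curve).
fpr_all, tpr_all, thr_all = roc_curve(y_true, scores, drop_intermediate=False)

# The default drop_intermediate=True discards thresholds on collinear
# segments of the ROC curve, so far fewer thresholds are returned.
fpr_few, tpr_few, thr_few = roc_curve(y_true, scores)
```

The proposal is to expose the same switch in precision_recall_curve, but defaulting to False so current callers relying on the documented n_thresholds are unaffected.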
Describe alternatives you've considered, if relevant
No response
Additional context
The documentation of sklearn.metrics.precision_recall_curve states that n_thresholds = len(np.unique(probas_pred)). This is no longer the behavior of the function. This might cause a lot of backward incompatibility, hence the suggested default value of False.
Code to reproduce:
import numpy as np
from sklearn.metrics import precision_recall_curve
scoresPredictor = np.arange(100)
groundTruth = np.concatenate((np.zeros(50), np.ones(50)))
precision_PR, recall_PR, thresholds_PR = precision_recall_curve(groundTruth, scoresPredictor)
print(len(np.unique(scoresPredictor)))  # 100 unique scores
print(len(thresholds_PR))               # fewer than 100 in recent versions: thresholds with recall = 1 are dropped
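A minimal sketch of the intended workflow (the pp_rate computation and variable names are my own illustration; the pairing is only consistent when no thresholds, in particular those where recall = 1, have been dropped):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

scores = np.arange(100)
y_true = np.concatenate((np.zeros(50), np.ones(50)))

precision, recall, thresholds = precision_recall_curve(y_true, scores)

# Predicted-positive rate at each threshold: the fraction of samples
# scoring at or above it, computed directly from the scores.
pp_rate = np.array([(scores >= t).mean() for t in thresholds])

# recall[:-1] aligns with thresholds (the final recall value, 0, has no
# threshold), giving recall as a function of the predicted-positive rate.
curve = np.column_stack([pp_rate, recall[:-1]])
```

Each dropped threshold removes a point from this curve, which is why the new behavior makes the quantile-based view inconsistent.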