Description
The ROC functionality reduces its workload by first computing forward differences between scores and then discarding potential thresholds that are too close together, by calling utils.fixes.isclose. The problem is that this happens without notifying the user or giving any chance to intervene (setting the drop_intermediate argument to False does not change anything) and, moreover, without scaling the tolerance with respect to the variance of the score vector.
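To illustrate the mechanism, here is a minimal NumPy-only sketch. It assumes the filter amounts to comparing consecutive score differences against zero with np.isclose at its default tolerances. The helper count_distinct_thresholds is hypothetical and is not scikit-learn's actual code.

```python
import numpy as np

original_scores = np.array([0.1, 0.4, 0.35, 0.8])


def count_distinct_thresholds(scores):
    """Count scores that survive an isclose-based filter (hypothetical helper)."""
    ordered = np.sort(scores)[::-1]
    # np.isclose(d, 0) reduces to |d| <= atol, with atol=1e-8 by default,
    # so the test is absolute and ignores the scale of the scores.
    distinct = ~np.isclose(np.diff(ordered), 0)
    # The last score always counts as a threshold.
    return int(distinct.sum()) + 1


for scale in (1.0, 1e-6, 1e-7, 1e-8):
    print(scale, count_distinct_thresholds(scale * original_scores))
```

Under these assumptions, all four scores stay distinct at scales 1 and 1e-6, one is merged at 1e-7, and all collapse into a single threshold at 1e-8, even though the ranking of the scores never changes.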
Steps/Code to Reproduce
This is a slightly simplified version of the example given in the documentation of metrics.roc_curve:
from __future__ import print_function  # print() behaves the same on Python 2 and 3

import numpy as np
from sklearn import metrics

y = np.array([False, False, True, True])
original_scores = np.array([0.1, 0.4, 0.35, 0.8])

# Rescaling the scores by a positive factor preserves their ranking,
# so the ROC curve and the AUC should be unaffected.
for scale in (1.0, 1e-6, 1e-7, 1e-8):
    scores = scale * original_scores
    fpr, tpr, thresholds = metrics.roc_curve(y, scores)
    print(fpr, tpr, thresholds, metrics.roc_auc_score(y, scores))
Expected Results
All printed lines should show the same fpr, tpr, and AUC values; only the thresholds should differ, by the scale factor applied to the scores.
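As an independent check of this expectation, the AUC can be computed directly from its ranking definition: the fraction of (positive, negative) pairs in which the positive sample scores higher, with ties counted as one half. The sketch below uses only NumPy. The function pairwise_auc is a hypothetical helper for illustration, not part of scikit-learn.

```python
import numpy as np

y = np.array([False, False, True, True])
original_scores = np.array([0.1, 0.4, 0.35, 0.8])


def pairwise_auc(y_true, scores):
    """AUC as the fraction of correctly ordered (positive, negative) pairs."""
    pos = scores[y_true]
    neg = scores[~y_true]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a correctly ordered pair.
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


for scale in (1.0, 1e-6, 1e-7, 1e-8):
    print(scale, pairwise_auc(y, scale * original_scores))
```

Because only the ordering of the scores enters this computation, the result is the same for every scale factor, which is the behaviour expected from metrics.roc_auc_score as well.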
Versions
Windows-10-10.0.10586
('Python', '2.7.11 |Anaconda 2.5.0 (64-bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)]')
('NumPy', '1.10.4')
('SciPy', '0.17.0')
('Scikit-Learn', '0.17')