
FIX remove np.divide with where and without out argument in precision_recall_curve #24382


Merged — 3 commits merged into scikit-learn:main from np-divide-precision-recall on Sep 7, 2022

Conversation

Member

@lesteve lesteve commented Sep 7, 2022

What does this implement/fix? Explain your changes.

np.divide with where and without an out argument leaves uninitialized values outside of the where condition. After #24245, I think this PR removes the only place where we use this pattern in scikit-learn.

See https://numpy.org/doc/stable/reference/ufuncs.html#ufunc and search for "where" (sorry, no direct link):

Note that if an uninitialized return array is created, values of where=False will leave those values uninitialized.
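
A minimal sketch of the pitfall (toy arrays of my own, not the scikit-learn code):

import numpy as np

numerator = np.array([1.0, 2.0, 3.0])
denominator = np.array([2.0, 0.0, 4.0])

# Without `out`, np.divide allocates an uninitialized result array and only
# writes to it where the condition holds, so result[1] may contain arbitrary
# memory garbage rather than 0 or NaN.
result = np.divide(numerator, denominator, where=denominator != 0)
print(result)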

Any other comments?

This np.divide was introduced in #19085 to fix the recall when y_true has only zeros; it did not affect precision. The existing behaviour was to produce zeros there, so I used zeros as the initialization.
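
Concretely, the fix keeps the np.divide call but writes into a zero-initialized output array; roughly (a sketch with toy arrays, not the exact diff):

import numpy as np

# Toy tps/fps arrays standing in for the outputs of _binary_clf_curve.
tps = np.array([0.0, 1.0, 2.0])
fps = np.array([1.0, 1.0, 2.0])

ps = tps + fps
# Pre-initialized output: entries where ps == 0 keep the value 0 instead of
# whatever uninitialized memory np.divide would otherwise leave behind.
precision = np.zeros_like(tps)
np.divide(tps, ps, out=precision, where=ps != 0)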

Having looked at the code, I am fairly confident that tps + fps cannot actually be zero (unless some sample weights are strictly negative), but it is a bit tricky to convince yourself of this, so maybe we should keep the np.divide(..., where=denominator != 0)? Alternatively, I could add a comment trying to summarise the following reasoning (not that easy).

The code is in _binary_clf_curve:

def _binary_clf_curve(y_true, y_score, pos_label=None, sample_weight=None):

  • no sample_weight case: tps + fps = 1 + threshold_idxs, which is never zero
  • sample_weight case is a bit more tricky:
    tps + fps = stable_cumsum(y_true * weight) + stable_cumsum((1 - y_true) * weight)
    
    • if (sample_weight > 0).all(), then tps + fps > 0 and we are all good
    • if (sample_weight == 0).any(), zero-weight samples actually get masked out in

          # Filter out zero-weighted samples, as they should not impact the result
          if sample_weight is not None:
              sample_weight = column_or_1d(sample_weight)
              sample_weight = _check_sample_weight(sample_weight, y_true)
              nonzero_weight_mask = sample_weight != 0
              y_true = y_true[nonzero_weight_mask]
              y_score = y_score[nonzero_weight_mask]
              sample_weight = sample_weight[nonzero_weight_mask]

      so you never see them in the cumulative sum
    • if (sample_weight < 0).any(), tps + fps could be zero, but the consensus is that negative sample weights do not make any sense 🤷. There seem to be some use cases in High Energy Physics, but I assume that is more about fitting an estimator than computing a precision-recall curve. Let's leave this discussion for a separate PR anyway; this is a can of worms.

I tried pragmatically to find a case where tps + fps == 0, and the only way I managed to get one was with negative sample_weight. Here is a snippet if you want to try for yourself:

from sklearn.metrics._ranking import _binary_clf_curve, precision_recall_curve

y_true = [0, 0, 0, 0]
probas_pred = [0.7, 0.6, 0.5, 0.4]
sample_weight = [-1, 1, -1, 1]
fps, tps, thresholds = _binary_clf_curve(
    y_true, probas_pred, pos_label=1, sample_weight=sample_weight
)
print(f"{fps=}")
print(f"{tps=}")
print(f"{thresholds=}")
print(f"{fps+tps == 0=}")

precision, recall, threshold = precision_recall_curve(
    y_true, probas_pred, pos_label=1, sample_weight=sample_weight
)
print(f"{precision=}")

Output:

fps=array([-1.,  0., -1.,  0.])
tps=array([-0.,  0.,  0.,  0.])
thresholds=array([0.7, 0.6, 0.5, 0.4])
fps+tps == 0=array([False,  True, False,  True])

Member

@jeremiedbb jeremiedbb left a comment


LGTM.

To summarize, tps + fps can never be 0 (unless using negative sample weight, which doesn't really make sense here). We still use np.divide instead of just dividing in case someone really wants to use negative sample weights (although not officially supported). So this fix should not impact the behavior in the officially supported case and thus doesn't require a what's new entry.
I'm fine with that.

Member

betatim commented Sep 7, 2022

For my education: if ps can't be zero, why does the where= exist?

I can't quite convince myself either that ps can never be zero (without using negative weights), but somehow it makes sense from looking at the code of _binary_clf_curve(). The bit that makes me think it can't be zero is that fps is (⚠️ massive simplification) calculated as 1 - tps.
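
For context, the unweighted core of _binary_clf_curve boils down to something like the following (a paraphrase with toy inputs, not the exact source):

import numpy as np

# Toy inputs, already sorted by decreasing score.
y_true = np.array([0, 1, 1, 0, 1])
y_score = np.array([0.9, 0.8, 0.7, 0.6, 0.5])

# Indices where the score changes, plus the last index.
distinct_value_indices = np.where(np.diff(y_score))[0]
threshold_idxs = np.r_[distinct_value_indices, y_true.size - 1]

tps = np.cumsum(y_true)[threshold_idxs]
fps = 1 + threshold_idxs - tps  # hence tps + fps == 1 + threshold_idxs >= 1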

Member

@ogrisel ogrisel left a comment


LGTM with an inline comment:

@glemaitre
Member

I quickly checked when we tackled the problem of getting some non-finite values, and it seems it was in this PR: #9980

Without negative sample weights, I could foresee the case where we are in a CV split and sum(sample_weight) == 0 (i.e. the split selects only samples with null weights). I would be surprised if nothing broke during the fit in that case, though.

+1 for the change even though we might never end up in this situation.

@jeremiedbb jeremiedbb merged commit fb22b4f into scikit-learn:main Sep 7, 2022
@lesteve lesteve deleted the np-divide-precision-recall branch September 7, 2022 12:29
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Sep 12, 2022
…n_recall_curve` (scikit-learn#24382)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>