Description
Hi,
scikit-learn seems to implement precision-recall curves (and Average Precision / area under the PR curve) in a non-standard way, without documenting the discrepancy. The standard way of computing precision-recall numbers is to interpolate the curve, as described here:
http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
The motivation is to:
- Smooth out the kinks and reduce the noise contribution to the score
- In any practical application, if the PR curve ever goes up as the threshold is lowered, you would strictly prefer to set your threshold at that point rather than at the original one, achieving both higher precision and higher recall. Hence, people prefer to interpolate the curve, which better integrates out the threshold parameter and gives a more sensible estimate of the real performance (see the sketch after this list).
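For concreteness, here is a minimal sketch of this interpolation (hypothetical helper names; it assumes the arrays returned by `sklearn.metrics.precision_recall_curve`, where recall comes back in decreasing order). Interpolated precision replaces the precision at recall r with the maximum precision at any recall r' >= r:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def interpolated_pr(y_true, y_scores):
    """Interpolated PR curve: p_interp(r) = max over r' >= r of p(r')."""
    precision, recall, _ = precision_recall_curve(y_true, y_scores)
    # recall is in decreasing order, so the running maximum of precision
    # at index i covers exactly the points with recall >= recall[i].
    precision_interp = np.maximum.accumulate(precision)
    return precision_interp, recall

def interpolated_ap(precision_interp, recall):
    """Step-wise average precision over the interpolated curve.
    recall is decreasing, so -diff(recall) is the (positive) width
    of each recall step."""
    return -np.sum(np.diff(recall) * precision_interp[:-1])
```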
This is also what the standard Pascal VOC evaluation code does, and they explain it in their writeup:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.5766&rep=rep1&type=pdf
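For reference, the VOC 2007 protocol used an 11-point interpolated AP: precision is sampled at recall levels 0.0, 0.1, ..., 1.0 and averaged. A rough sketch (the official devkit is MATLAB; `voc_11point_ap` and its input conventions are my own for illustration):

```python
import numpy as np

def voc_11point_ap(precision, recall):
    """VOC 2007-style 11-point interpolated AP. Expects the arrays
    returned by sklearn.metrics.precision_recall_curve."""
    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):
        # Interpolated precision at recall level r: the maximum
        # precision over all operating points with recall >= r.
        mask = recall >= r
        p = precision[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap
```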
VL_FEAT also has options for interpolation:
http://www.vlfeat.org/matlab/vl_pr.html
as shown in their code here: https://github.com/vlfeat/vlfeat/blob/edc378a722ea0d79e29f4648a54bb62f32b22568/toolbox/plotop/vl_pr.m
The concern is that people using the scikit-learn version will see LOWER reported performance than what they might see reported in other papers.
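To make the gap concrete, a small synthetic demonstration (the data and variable names here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.RandomState(0)
y_true = rng.randint(0, 2, size=200)
# informative but noisy scores, so the raw PR curve has kinks
y_scores = y_true * 0.5 + rng.rand(200)

precision, recall, _ = precision_recall_curve(y_true, y_scores)
p_interp = np.maximum.accumulate(precision)  # interpolated precision

ap_raw = average_precision_score(y_true, y_scores)
ap_interp = -np.sum(np.diff(recall) * p_interp[:-1])
print("scikit-learn AP: %.4f   interpolated AP: %.4f" % (ap_raw, ap_interp))
# Interpolation can only raise precision pointwise, so the interpolated
# AP is at least as high on the same scores.
```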