average_precision_score() overestimates AUC value #13074

Closed
barrydebruin opened this issue Jan 31, 2019 · 5 comments

barrydebruin commented Jan 31, 2019

Description

The average_precision_score() function in sklearn doesn't return a correct AUC value.

Steps/Code to Reproduce

Example:

import numpy as np
"""
    Desc: average_precision_score returns overestimated AUC of precision-recall curve
"""
# pathological example
p = [0.833, 0.800] # precision
r = [0.294, 0.235] # recall

# computation of average_precision_score()
print("AUC       = {:3f}".format(-np.sum(np.diff(r) * np.array(p)[:-1]))) # _binary_uninterpolated_average_precision()

# computation of auc() with trapezoid interpolation
print("AUC TRAP. = {:3f}".format(-np.trapz(p, r)))

# possible fix in _binary_uninterpolated_average_precision() **(edited)**
print("AUC FIX   = {:3f}".format(-np.sum(np.diff(r) * np.minimum(p[:-1], p[1:])))

#>> AUC       = 0.049147
#>> AUC TRAP. = 0.048174
#>> AUC FIX   = 0.047200

Expected Results

AUC without interpolation = (0.294 - 0.235) * 0.800 = 0.0472
AUC with trapezoidal interpolation = 0.0472 + (0.294 - 0.235) * (0.833 - 0.800) / 2 ≈ 0.0482

Actual Results

This is what sklearn implements for AUC without interpolation (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html):

sum((r[i] - r[i+1]) * p[i] for i in range(len(p)-1))
>> 0.049147

This is what I initially thought was correct (but see the edit below):

sum((r[i] - r[i+1]) * p[i+1] for i in range(len(p)-1))
>> 0.047200

EDIT: I found that the above 'correct' implementation doesn't always underestimate. It depends on the input. Therefore I have revised the uninterpolated AUC calculation to this:

sum((r[i] - r[i+1]) * min(p[i], p[i+1]) for i in range(len(p)-1))
>> 0.047200

This has the advantage that the AUC calculation is more consistent: compared to the current uninterpolated AUC function, it is either equal or lower, but never higher. Below I show some examples of what it does (a self-contained sketch reproducing all three follows the examples):

  • Example 1: all work fine
p = [0.3, 1.0]
r = [1.0, 0.0]

#Results:
>> 0.30    # sklearn's _binary_uninterpolated_average_precision()
>> 0.30    # my consistent _binary_uninterpolated_average_precision()
>> 0.65    # np.trapz() (trapezoidal interpolation)

[Figure: precision-recall curve for Example 1]

  • Example 2: sklearn's _binary_uninterpolated_average_precision returns an inaccurate number
p = [1.0, 0.3]
r = [1.0, 0.0]

#Results:
>> 1.00    # sklearn's _binary_uninterpolated_average_precision()
>> 0.30    # my consistent _binary_uninterpolated_average_precision()
>> 0.65    # np.trapz() (trapezoidal interpolation)

[Figure: precision-recall curve for Example 2]

  • Example 3: extra example
p = [0.4, 0.1, 1.0]
r = [1.0, 0.9, 0.0]

#Results:
>> 0.13      # sklearn's _binary_uninterpolated_average_precision()
>> 0.10      # my consistent _binary_uninterpolated_average_precision()
>> 0.52      # np.trapz() (trapezoidal interpolation)

[Figure: precision-recall curve for Example 3]
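
For completeness, here is a self-contained sketch (the function names are mine, not sklearn's) that reproduces the three numbers for each example above:

import numpy as np

def ap_current(p, r):
    # sklearn's current rule: precision taken at the higher-recall point
    return sum((r[i] - r[i+1]) * p[i] for i in range(len(p) - 1))

def ap_consistent(p, r):
    # proposed rule: the worse of the two adjacent precisions
    return sum((r[i] - r[i+1]) * min(p[i], p[i+1]) for i in range(len(p) - 1))

def ap_trapz(p, r):
    # trapezoidal interpolation; recall is in decreasing order, hence the minus
    return -np.trapz(p, r)

examples = [([0.3, 1.0], [1.0, 0.0]),
            ([1.0, 0.3], [1.0, 0.0]),
            ([0.4, 0.1, 1.0], [1.0, 0.9, 0.0])]
for p, r in examples:
    print("{:.2f}  {:.2f}  {:.2f}".format(
        ap_current(p, r), ap_consistent(p, r), ap_trapz(p, r)))

#>> 0.30  0.30  0.65
#>> 1.00  0.30  0.65
#>> 0.13  0.10  0.52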

Versions

Windows-10-10.0.17134-SP0
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]
NumPy 1.14.0
SciPy 1.0.0
Scikit-Learn 0.19.1

jnothman (Member) commented Jan 31, 2019 via email

barrydebruin (Author) commented

Thank you for your comment. I did see some of these older issues, but not all of them. I did actually find some cases where the AUC value is underestimated as well, which makes the problem a bit more complex than I initially thought.

For datasets with a small number of precision and recall thresholds, it seems better for now to use the interpolated area under the curve (i.e. sklearn.metrics.auc() or np.trapz()), or am I mistaken?
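
For reference, a minimal sketch of that interpolated alternative (the toy labels and scores are made up for illustration):

import numpy as np
from sklearn.metrics import auc, precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# recall comes back in decreasing order; auc() detects the direction itself
precision, recall, _ = precision_recall_curve(y_true, y_score)
print(auc(recall, precision))  # trapezoidal area under the PR curve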

jnothman (Member) commented Jan 31, 2019 via email

ndingwall (Contributor) commented

I don't think the current implementation overestimates. Consider your second example:

p = [1.0, 0.3]
r = [1.0, 0.0]

If I told you I wanted a recall of 0.5, you couldn't give me the second operating point (p=0.3, r=0.0) because the recall is too low. You could give me the first one (p=1.0, r=1.0) because r=1.0 implies that (more than) half of the positive datapoints have been recalled. Therefore, the most reasonable precision value we can choose to correspond to r=0.5 is p=1.0. That's the motivation for using the precision value at the next operating point (i.e. the one with the next-largest recall value).
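
A quick sketch of that argument (precision_at_recall is a hypothetical helper, not sklearn API):

def precision_at_recall(p, r, target):
    # among operating points that reach at least `target` recall, take the
    # one with the next-largest recall, i.e. the smallest feasible recall
    feasible = [(ri, pi) for pi, ri in zip(p, r) if ri >= target]
    return min(feasible)[1] if feasible else 0.0

p = [1.0, 0.3]
r = [1.0, 0.0]
print(precision_at_recall(p, r, 0.5))  # 1.0 -- only the first point qualifies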

Your implementation uses min(p[i], p[i+1]), which inconsistently chooses either the previous or the subsequent operating point, depending on which one does worse. That systematically underestimates the true average precision score.

thomasjpfan (Member) commented

As part of scikit-learn's triaging guidelines, I am closing this issue because it is a duplicate of #4577.
