
a bug in average precision function #6377

@roadjiang

Description


The bug is as follows:

import numpy as np
from sklearn.metrics import average_precision_score

y_true = np.array([0 for _ in range(10000)])
y_true[0] = 1
y_scores = np.array([0 for _ in range(10000)])
average_precision_score(y_true, y_scores)

Out[562]: 0.50004999999999999

This overestimates the true mAP. Think about it: all 10000 samples have the same score (a dummy ranked list), and it simply gets lucky that the first one is the true positive. How can this AP be 0.5?
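One plausible explanation for the exact value, assuming AP is computed here as the interpolated (trapezoidal) area under the precision-recall curve: with constant scores there is only a single threshold, so the curve reduces to the two endpoints (recall=0, precision=1) and (recall=1, precision=1/10000), and the trapezoid between them has area almost exactly 0.5:

```python
# Trapezoidal area between the two PR-curve endpoints produced by a
# constant scorer on 10000 samples with one positive (a sketch of the
# suspected interpolation, not the actual library internals):
p_at_full_recall = 1 / 10000   # precision when everything is "retrieved"
area = (1 + p_at_full_recall) / 2
print(area)  # 0.50005, matching the reported 0.50004999999999999
```

This matches the output in the report to within floating-point noise.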

If this holds, then on any problem a scorer that outputs nothing useful (constant scores for every sample) gets a respectable AP of 0.5.
On many hard problems, even reaching an average precision of 0.1-0.2 is difficult.

Basically, you need to shuffle the ranked list before calculating mAP to avoid this overestimation.
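A minimal sketch of that suggestion, using a hypothetical `ap_from_ranking` helper that computes AP directly from an ordered list of 0/1 relevance labels (not part of scikit-learn): taking the input order at face value rewards the lucky placement of the positive at rank 1, while shuffling before ranking breaks the tie randomly and yields the tiny AP a constant scorer deserves.

```python
import numpy as np

def ap_from_ranking(labels):
    """AP of a ranked 0/1 relevance list: mean precision at each hit
    (hypothetical helper for illustration)."""
    labels = np.asarray(labels)
    hits = np.cumsum(labels)
    precisions = hits / np.arange(1, len(labels) + 1)
    return precisions[labels == 1].mean()

y_true = np.zeros(10000, dtype=int)
y_true[0] = 1  # the lone positive happens to sit first in the input

# Trusting the input order: the positive lands at rank 1, so AP = 1.0,
# which is wildly optimistic for a scorer that ranked nothing.
ap_lucky = ap_from_ranking(y_true)

# Shuffling before ranking resolves the tie randomly; averaged over
# shuffles, AP drops to roughly 0.001 (1/k for a uniform rank k).
rng = np.random.RandomState(0)
aps = [ap_from_ranking(rng.permutation(y_true)) for _ in range(100)]
ap_shuffled = np.mean(aps)
print(ap_lucky, ap_shuffled)
```

This is only a sketch of the reporter's proposed fix under the stated assumptions; it does not change what `average_precision_score` itself returns.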
