[WIP] Ranking metrics #2805


Closed · wants to merge 10 commits

Conversation

mblondel (Member)

An early PR to make reading the diff easier. Not ready for detailed comments but high-level comments welcome :)

arjoly (Member) commented Jan 31, 2014

Interesting PR!

mblondel mentioned this pull request Jan 31, 2014
assert_equal(mean_ndcg_score([5, 3, 2], [2, 1, 0]), 1.0)
assert_equal(mean_ndcg_score([2, 3, 5], [0, 1, 2]), 1.0)
assert_equal(mean_ndcg_score([5, 3, 2], [2, 1, 0], k=2), 1.0)
assert_equal(mean_ndcg_score([2, 3, 5], [0, 1, 2], k=2), 1.0)
Review comment (Member):

you could also add (to better check the truncation):

assert_equal(mean_ndcg_score([2, 3, 5], [0, 1, 0], k=2), 1.0)

ogrisel (Member) commented Jan 31, 2014

So the handling of ties is implementation specific: it depends on the initial ordering of the target scores and on the tie handling of np.argsort, which is not even stable by default.

I would rather implement the pessimistic tie handling as described in #2580 (comment), at least as an option.
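
For illustration, a minimal sketch of that pessimistic idea under one common DCG variant (gain / log2(rank + 1)); the helper name dcg_pessimistic and the exact tie semantics are my assumptions, not this PR's code or #2580's spec:

import numpy as np

def dcg_pessimistic(y_true, y_score, k=None):
    # Worst-case tie handling: among items with tied predicted scores,
    # rank the most relevant ones last. np.lexsort uses its *last* key
    # as the primary one, so sort by score descending, then break ties
    # by ascending true relevance.
    y_true = np.asarray(y_true, dtype=float)
    order = np.lexsort((y_true, -np.asarray(y_score, dtype=float)))
    gains = y_true[order][:k]
    discounts = np.log2(np.arange(2, gains.size + 2))
    return float(np.sum(gains / discounts))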

ogrisel (Member) commented Jan 31, 2014

Also, it would be great to add a sample_group=None parameter to at least ndcg_score and mean_ndcg_score (and maybe others) to make it easy for users to evaluate predictions made on a multi-query learning-to-rank test set.
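
A rough sketch of what such grouping could look like; mean_ndcg_by_group and its ndcg_fn argument are hypothetical names, and per-query macro-averaging is one reading of the proposal:

import numpy as np

def mean_ndcg_by_group(y_true, y_score, sample_group, ndcg_fn, k=None):
    # Average a single-query NDCG (ndcg_fn) over the query ids given in
    # sample_group, one id per sample.
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    sample_group = np.asarray(sample_group)
    scores = [ndcg_fn(y_true[sample_group == g], y_score[sample_group == g], k=k)
              for g in np.unique(sample_group)]
    return float(np.mean(scores))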

mblondel (Member, Author)

@ogrisel Handling of ties and edge cases is on my todo list.


Parameters
----------
y_true : array, shape = [n_samples]
Review comment (Member):

To follow the NumPy convention we agreed on during the last sprint, write:

array, shape (n_samples,)
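
Applied to the snippet above, the parameter entry would read:

Parameters
----------
y_true : array, shape (n_samples,)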

davidgasquez (Contributor)

I have some scoring functions working (NDCG@k and DCG@k) with make_scorer and am willing to make a pull request if this one is discontinued.

amueller (Member) commented Oct 7, 2016

What happened to this? Also, @davidgasquez, I guess go ahead?

davidgasquez (Contributor)

Hey @amueller! Not 100% sure if I should make a PR with the current implementation. As you can see, NDCG@k requires LabelBinarizer and I don't know whether that's acceptable for scikit-learn! Looking forward to hearing your thoughts on this!
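
For context, a sketch in the spirit of that approach, assuming single-label multiclass targets with more than two classes; ndcg_at_k and its gain/discount choices are my assumptions, not a copy of the linked code:

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import make_scorer

def ndcg_at_k(y_true, y_proba, k=5):
    # Binarize the labels so each sample has exactly one relevant class
    # (assumes more than two classes), then score the top-k classes by
    # predicted probability.
    relevance = LabelBinarizer().fit_transform(y_true)
    order = np.argsort(y_proba, axis=1)[:, ::-1][:, :k]
    gains = np.take_along_axis(relevance, order, axis=1)
    discounts = np.log2(np.arange(2, gains.shape[1] + 2))
    # With a single relevant class per sample the ideal DCG is 1, so
    # DCG@k is already normalized.
    return float(np.mean(np.sum(gains / discounts, axis=1)))

ndcg_scorer = make_scorer(ndcg_at_k, needs_proba=True, k=5)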

jnothman (Member)

Using a LabelBinarizer is fine, but I'm not sure if you mean to be handling multilabel ground truth as well.


davidgasquez (Contributor)

> Using a LabelBinarizer is fine, but I'm not sure if you mean to be handling multilabel ground truth as well.

Thanks! I'll read the contributing guidelines and try to send a PR in the next few days!

davidgasquez (Contributor)

Started working on it with #7739. Any feedback is appreciated! 😄

alfaro96 (Member) commented Oct 9, 2020

Are we interested in implementing kendall_tau_score, pairwise_accuracy_score and spearman_rho_score metrics? (I am +1.)

Indeed, I would be happy to implement these metrics.

Base automatically changed from master to main January 22, 2021 10:48
lorentzenchr (Member) commented Sep 17, 2021

Summary: dcg_score and ndcg_score have been added in #7739, and Kendall's tau or Spearman's rho can very easily be built on SciPy's implementations:

from scipy.stats import kendalltau, spearmanr
from sklearn.metrics import make_scorer

# Both SciPy functions return (statistic, pvalue); keep only the statistic.
kendall_tau_score = make_scorer(lambda y_true, y_pred: kendalltau(y_true, y_pred)[0])
spearman_rho_score = make_scorer(lambda y_true, y_pred: spearmanr(y_true, y_pred)[0])
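
A quick usage sketch (the toy data and Ridge estimator are placeholders, not from this thread); an estimator whose predictions are monotone in y should score close to 1:

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(30, 3)
y = X @ np.array([1.0, 2.0, 3.0])
print(cross_val_score(Ridge(), X, y, scoring=kendall_tau_score, cv=3))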

This leaves us with pairwise_accuracy_score.
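
For reference, one plausible reading of that remaining metric, assuming it means the fraction of comparable pairs that the scores order correctly:

import numpy as np

def pairwise_accuracy_score(y_true, y_score):
    # Fraction of pairs (i, j) with y_true[i] != y_true[j] whose ordering
    # the predicted scores reproduce. Assumes at least one comparable
    # pair; O(n^2) memory, a sketch only.
    diff_true = np.subtract.outer(y_true, y_true)
    diff_score = np.subtract.outer(y_score, y_score)
    comparable = diff_true != 0
    correct = np.sign(diff_true) == np.sign(diff_score)
    return float(np.sum(correct & comparable) / np.sum(comparable))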

I'm content with that and would therefore close. Any objections?

Remark: We could demonstrate this in an example.
