[MRG] Metric documentation (mainly in classification) #1512


Merged
28 commits merged into scikit-learn:master on Jan 9, 2013

Conversation

@arjoly (Member) commented Jan 3, 2013

I would like to partly tackle the issue in #1508.
In this PR, I intend to add definitions for the classification metrics and the remaining regression metrics.

Before moving to the MRG state, I still have to read the added documentation two or three more times to find mistakes.

Furthermore, I am not sure about the documentation on explained_variance_score. It is new to me.

Questions:

  1. After this PR, I would like to add some multilabel metrics; any thoughts about this?
  2. Why are there zero_one and zero_one_score instead of zero_one_loss and accuracy_score?
  3. Why does ClassifierMixin use a "homebrew" accuracy metric in its score method?

@amueller (Member) commented Jan 3, 2013

Thanks a lot! You're on fire :)

  1. Awesome! See #558 (Evaluation metrics for multi-label classifiers).
  2. I'd actually prefer to have zero_one_loss and zero_one_score. They should definitely end with loss and score! We can talk about whether both or only one of them should be called accuracy.
  3. Because it was easier to implement than to import? Feel free to switch to zero_one_score (it needs a lazy import to avoid import loops); a rough sketch follows below.
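
For item 3, a minimal sketch of what such a lazy import could look like (illustrative only, not the change actually made in this PR; it assumes the metric ends up being called accuracy_score, as discussed further down, and that the mixin lives in sklearn/base.py):

class ClassifierMixin(object):
    """Mixin class for all classifiers in scikit-learn."""

    def score(self, X, y):
        # Import inside the method so that sklearn.base does not import
        # sklearn.metrics at module load time, which could create an
        # import loop.
        from .metrics import accuracy_score
        return accuracy_score(y, self.predict(X))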

@amueller (Member) commented Jan 3, 2013

Currently the clustering metrics are documented in the clustering module. I think I would prefer them to be documented in the metrics module. Any opinions on that @arjoly @ogrisel @GaelVaroquaux @robertlayton ?

@arjoly (Member, Author) commented Jan 3, 2013

By the way, there are two metrics that end with _error: mean_absolute_error and mean_square_error.

@amueller I have to leave; I will think about it tomorrow.

Edit: yes indeed mean_squared_error.

@amueller (Member) commented Jan 3, 2013

Lol I see we will become good friends :)

Maybe for regression the losses are called errors? I am not sure how much thought went into a naming scheme.

I think being explicit is important, which is why I don't like zero_one.

Being consistent is nice, too, but we shouldn't overdo it. Is the rest consistent? Is it worth renaming to mean_absolute_loss and mean_squared_loss?

Or could we try to have separate naming schemes for classification, regression and clustering metrics?

OT: mean_square_error is deprecated and is now mean_squared_error.

@GaelVaroquaux (Member):

> Any opinions on that @arjoly @ogrisel @GaelVaroquaux @robertlayton ?

No strong opinion as long as there are some frequent and highly visible links.

G

@ogrisel (Member) commented Jan 3, 2013

I think it's OK to keep the _error suffix instead of _loss when it is a very common name such as mean_squared_error. mean_squared_loss would be too confusing and mean_squared_error_loss quite redundant, in my opinion.

@ogrisel (Member) commented Jan 3, 2013

+1 for being more explicit by deprecating zero_one and adding explicit names such as zero_one_loss and accuracy_score (one being 1 minus the other).

@robertlayton (Member):

+1 for having accuracy_score. Generally, if the thing has an explicit name, we should use that. In cases where the same metric has multiple names, this runs into problems.

As a thought, how do people feel about having aliases? I.e. after the definition of mean_absolute_error, have the line: mean_squared_error = mean_absolute_error.
This would make searching for functions significantly easier, both on the webpage and by doing something like:
print [d for d in dir(sklearn) if "squared" in d]

Oh, and put all metric documentation in the metrics part. Makes sense; that is where I would have expected them to be. With the clustering documentation update I'm (slowly) getting through, the documentation page may get significantly longer.

@GaelVaroquaux (Member):

> As a thought, how do people feel about having aliases?

Not too excited: there should be one and only one preferred way of achieving a result, otherwise people are going to wonder what the differences are.

> This would make searching for functions significantly easier, both on the webpage and by doing something like:
> print [d for d in dir(sklearn) if "squared" in d]

I'd suggest putting disambiguation in the docstrings and recommending the use of numpy.lookfor.
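
For example, a hypothetical lookup along those lines (numpy.lookfor searches the docstrings of a module and its submodules):

import numpy as np

# Print docstring search hits for a keyword under sklearn.metrics.
np.lookfor("squared error", module="sklearn.metrics")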

@amueller (Member) commented Jan 3, 2013

If we are lucky then people can google the docs ;)
Btw mean_squared_error and mean_absolute_error are different ;)

@GaelVaroquaux (Member):

> Btw mean_squared_error and mean_absolute_error are different ;)

That's what I thought, but I didn't dare mention it :)

@robertlayton (Member):

> Btw mean_squared_error and mean_absolute_error are different ;)

Of course they are. In my defence, I only just had my first coffee of the day.

@arjoly (Member, Author) commented Jan 4, 2013

> Currently the clustering metrics are documented in the clustering module. I think I would prefer them to be documented in the metrics module.

It would be nice to have everything in one place, plus a link from the clustering documentation to the model evaluation part.

> Oh, and put all metric documentation in the metrics part. Makes sense; that is where I would have expected them to be. With the clustering documentation update I'm (slowly) getting through, the documentation page may get significantly longer.

Will you make that change, or should I?

I will initiate the name changes (accuracy_score and zero_one_loss). I suppose that I have to add deprecation warnings for 0.15, remove the previous names from classes.rst, and update the narrative doc and __init__.py.
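
A rough sketch of what one of those deprecation shims could look like, using sklearn.utils.deprecated (the message, version number and defaults here are illustrative, not the actual code of this PR):

from sklearn.utils import deprecated
from sklearn.metrics import zero_one_loss


@deprecated("zero_one is deprecated and will be removed in 0.15; "
            "use zero_one_loss instead (note the different default "
            "for normalize).")
def zero_one(y_true, y_pred, normalize=False):
    # Keep the old default behavior: return the count of
    # misclassifications unless normalize=True is passed.
    return zero_one_loss(y_true, y_pred, normalize=normalize)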

@ogrisel (Member) commented Jan 4, 2013

> I will initiate the name changes (accuracy_score and zero_one_loss). I suppose that I have to add deprecation warnings for 0.15, remove the previous names from classes.rst, and update the narrative doc and __init__.py.

Yes.

@ogrisel (Member) commented Jan 4, 2013

And update the API changes section of the whats_new.rst file when ready to merge.

@arjoly (Member, Author) commented Jan 4, 2013

> +1 for being more explicit by deprecating zero_one and adding explicit names such as zero_one_loss and accuracy_score (one being 1 minus the other).

Currently zero_one_loss is the sum of the zero-one loss over the samples, so accuracy_score is not 1 - zero_one_loss.

@arjoly (Member, Author) commented Jan 4, 2013

I think there are still some typos to hunt down, but what I wanted to achieve is there.

So reviews and comments are welcome. :-)

@ogrisel (Member) commented Jan 4, 2013

Alright for keeping zero_one_loss as the sum and making accuracy_score(true, pred) == np.mean(true != pred).

@arjoly (Member, Author) commented Jan 4, 2013

You meant accuracy_score(true, pred) == np.mean(true == pred)?

@amueller (Member) commented Jan 4, 2013

I think that is what he meant.

@ogrisel (Member) commented Jan 4, 2013

Indeed... sorry for the confusion.
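
To spell out the agreed-upon relationship with a tiny concrete example (the label values below are made up for illustration, assuming the new names behave as discussed):

import numpy as np
from sklearn.metrics import accuracy_score, zero_one_loss

y_true = np.array([0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 2])   # one mistake out of four samples

accuracy_score(y_true, y_pred)                  # 0.75, i.e. np.mean(y_true == y_pred)
zero_one_loss(y_true, y_pred, normalize=False)  # 1, the count of misclassified samples
zero_one_loss(y_true, y_pred)                   # 0.25, i.e. 1 - accuracy when normalized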

Review comment from a Member on the documentation diff, which reads:

> roc_curve
>
> Others have been extended to the multiclass case:

I would rather say something along the lines of "also work in". For example, a confusion matrix is quite naturally multiclass and doesn't need to be extended.

@arjoly (Member, Author) commented Jan 9, 2013

I have rebased on top of master to take into account #1534.

@GaelVaroquaux The deprecated decorator is now used.

To summarize, the remaining questions / remarks to take into account:

  1. The explained variance question of @amueller.
  2. The import consistency question.

Regarding these:

  1. I don't know this metric well enough; it is pretty new to me.
  2. This might be solved later.

@amueller (Member) commented Jan 9, 2013

Have you changed the default behavior of the renamed zero_one? Could you please test that the old name has the old default behavior and the new name has the new one?

>>> zero_one_loss(y_true, y_pred)
0.25
>>> zero_one_loss(y_true, y_pred, normalize=False)
1
Review comment from @arjoly (Member, Author) on the doctest above:

@amueller Check the doctest here.

@arjoly (Member, Author) commented Jan 9, 2013

> Have you changed the default behavior of the renamed zero_one?

No.

> Could you please test that the old name has the old default behavior and the new name has the new one?

See the reference.

Does it answer your question?

@arjoly (Member, Author) commented Jan 9, 2013

Taking your question into account, I have also clarified the API changes. Good catch! Thanks.

@ogrisel (Member) commented Jan 9, 2013

Explained variance can be negative for completely random or anti-correlated predictions:

>>> from sklearn.metrics import explained_variance_score
>>> explained_variance_score([0, 1, 2], [1, 2, 0])
-2.0
>>> explained_variance_score([0, 1, 2], [1, 2, 3])
1.0
>>> explained_variance_score([0, 1, 2], [1, 2, 2])
0.66666666666666663

Usually it's positive for a real-life, non-dummy model though (provided the signal is not itself completely random).
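
For reference, a small NumPy sketch of the quantity being computed (as I understand it, 1 - Var[y_true - y_pred] / Var[y_true] for a single output), reproducing the values above:

import numpy as np
from sklearn.metrics import explained_variance_score

def explained_variance_by_hand(y_true, y_pred):
    # 1 - Var(residuals) / Var(target): equals 1.0 for a perfect (possibly
    # biased) fit and goes negative when the predictions are worse than a
    # constant baseline.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

for y_pred in ([1, 2, 0], [1, 2, 3], [1, 2, 2]):
    print(explained_variance_by_hand([0, 1, 2], y_pred),
          explained_variance_score([0, 1, 2], y_pred))
# prints -2.0, 1.0 and 0.666... in turn, matching the doctest above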

assert_equal(zero_one(y_true, y_pred),
             zero_one(y_pred, y_true))

assert_almost_equal(zero_one(y_true, y_pred, normalize=True),
                    zero_one(y_pred, y_true, normalize=True), 2)
Review comment from @amueller (Member) on the tests above:

These raise deprecation warnings, right? Could you please catch them?
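
For example, one way to silence the expected warnings in such tests (a sketch only, not necessarily how it ended up in the PR; assert_equal, zero_one, y_true and y_pred are the names from the test excerpt above):

import warnings

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    # zero_one is the deprecated alias, so calling it emits a
    # DeprecationWarning; recording the warnings keeps the test output clean.
    assert_equal(zero_one(y_true, y_pred),
                 zero_one(y_pred, y_true))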

@amueller (Member) commented Jan 9, 2013

@arjoly Thanks for the pointers, and sorry for being lazy. Having a busy day. Apart from my comment on the deprecation warnings, I think this is good to go.

@arjoly (Member, Author) commented Jan 9, 2013

Thanks @ogrisel for shedding some light on this subject.

@amueller Don't worry, we all have our busy days.

@arjoly (Member, Author) commented Jan 9, 2013

@amueller Warnings are caught!

@amueller (Member) commented Jan 9, 2013

Thanks :) I guess you are good to go. Feel free to merge using your preferred method ;)

Arnaud Joly notifications@github.com wrote:

> Warnings are caught!

@arjoly arjoly merged commit fd5801e into scikit-learn:master Jan 9, 2013
@arjoly (Member, Author) commented Jan 9, 2013

Merged by rebase! :-)

Thanks all for your time and review!!!

@mblondel (Member) commented Jan 9, 2013

A total of 1,306 additions and 215 deletions. That's a huge contribution! Thanks heaps!

@amueller (Member) commented Jan 9, 2013

+1 :)

Mathieu Blondel notifications@github.com wrote:

> A total of 1,306 additions and 215 deletions. That's a huge contribution! Thanks heaps!

@arjoly arjoly deleted the metric-classification-doc branch March 7, 2013 10:37