[MRG] Metric documentation (mainly in classification) #1512


Merged
28 commits merged into scikit-learn:master on Jan 9, 2013

Conversation

@arjoly (Member) commented Jan 3, 2013

I would like to partly tackle the issue in #1508.
In this PR, I intend to add definitions for the classification metrics and the remaining regression metrics.

Before moving to the MRG state, I still have to read the added documentation two or three more times to find mistakes.

Furthermore, I am not sure about the documentation on explained_variance_score. It is new to me.

Questions:

  1. After this PR, I would like to add some multilabel metrics; any thoughts about this?
  2. Why are there zero_one and zero_one_score instead of zero_one_loss and accuracy_score?
  3. Why does ClassifierMixin use a "homebrew" accuracy metric in its score method?

@amueller (Member) commented Jan 3, 2013

Thanks a lot! You're on fire :)

  1. Awesome! See #558 (Evaluation metrics for multi-label classifiers).
  2. I'd actually prefer to have zero_one_loss and zero_one_score. They should definitely end with loss and score! We can talk about whether both or only one of them should be called accuracy.
  3. Because it was easier to implement than to import? Feel free to switch to zero_one_score (it needs a lazy import to avoid import loops); a rough sketch follows below.
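
For item 3, a minimal sketch of what such a lazy import could look like (illustrative only, not the change actually made in this PR; it assumes the metric ends up being called accuracy_score, as discussed further down, and that the mixin lives in sklearn/base.py):

class ClassifierMixin(object):
    """Mixin class for all classifiers in scikit-learn."""

    def score(self, X, y):
        # Import inside the method so that sklearn.base does not import
        # sklearn.metrics at module load time, which could create an
        # import loop.
        from .metrics import accuracy_score
        return accuracy_score(y, self.predict(X))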

@amueller (Member) commented Jan 3, 2013

Currently the clustering metrics are documented in the clustering module. I think I would prefer them to be documented in the metrics module. Any opinions on that @arjoly @ogrisel @GaelVaroquaux @robertlayton ?

@arjoly (Member, Author) commented Jan 3, 2013

By the way, there are two metrics that end with _error: mean_absolute_error and mean_square_error.

@amueller I have to leave; I will think about it tomorrow.

Edit: yes indeed mean_squared_error.

@amueller (Member) commented Jan 3, 2013

Lol I see we will become good friends :)

Maybe for regression the losses are called errors? I am not sure how much thought went into a naming scheme.

I think being explicit is important, which is why I don't like zero_one.

Being consistent is nice, too, but we shouldn't overdo it. Is the rest consistent? Is it worth renaming to mean_absolute_loss and mean_squared_loss?

Or could we try to have separate naming schemes for classification, regression and clustering metrics?

OT: mean_square_error is deprecated and is now mean_squared_error.

@GaelVaroquaux (Member):

> Any opinions on that @arjoly @ogrisel @GaelVaroquaux @robertlayton ?

No strong opinion as long as there are some frequent and highly visible links.

G

@ogrisel (Member) commented Jan 3, 2013

I think it's OK to keep the _error suffix instead of _loss when it is a very common name such as mean_squared_error. mean_squared_loss would be too confusing and mean_squared_error_loss quite redundant, in my opinion.

@ogrisel (Member) commented Jan 3, 2013

+1 for being more explicit by deprecating zero_one and adding explicit names such as zero_one_loss and accuracy_score (one being 1 minus the other).

@robertlayton (Member):

+1 for having accuracy_score. Generally, if the thing has an explicit name, we should use that. In cases where the same metric has multiple names, this runs into problems.

As a thought, how do people feel about having aliases? I.e. after the definition of mean_absolute_error, have the line: mean_squared_error = mean_absolute_error.
This would make searching for functions significantly easier, both on the webpage and by doing something like:
print [d for d in dir(sklearn) if "squared" in d]

Oh, and put all metric documentation in the metrics part. Makes sense; that is where I would have expected them to be. With the clustering documentation update I'm (slowly) getting through, the documentation page may get significantly longer.

@GaelVaroquaux (Member):

> As a thought, how do people feel about having aliases?

Not too excited: there should be one and only one preferred way of achieving a result, otherwise people are going to wonder what the differences are.

> This would make searching for functions significantly easier, both on the webpage and by doing something like:
> print [d for d in dir(sklearn) if "squared" in d]

I'd suggest putting disambiguation in the docstrings and recommending the use of numpy.lookfor.
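
For example, a hypothetical lookup along those lines (numpy.lookfor searches the docstrings of a module and its submodules):

import numpy as np

# Print docstring search hits for a keyword under sklearn.metrics.
np.lookfor("squared error", module="sklearn.metrics")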

@amueller (Member) commented Jan 3, 2013

If we are lucky then people can google the docs ;)
Btw mean_squared_error and mean_absolute_error are different ;)

@GaelVaroquaux (Member):

> Btw mean_squared_error and mean_absolute_error are different ;)

That's what I thought, but I didn't dare mention it :)

@robertlayton (Member):

> Btw mean_squared_error and mean_absolute_error are different ;)

Of course they are. In my defence, I only just had my first coffee of the day.

@arjoly (Member, Author) commented Jan 4, 2013

> Currently the clustering metrics are documented in the clustering module. I think I would prefer them to be documented in the metrics module.

It would be nice to have everything in one place, plus a link from the clustering documentation to the model evaluation part.

> Oh, and put all metric documentation in the metrics part. Makes sense; that is where I would have expected them to be. With the clustering documentation update I'm (slowly) getting through, the documentation page may get significantly longer.

Will you make that change, or should I?

I will initiate the name changes (accuracy_score and zero_one_loss). I suppose that I have to add deprecation warnings for 0.15, remove the previous names from classes.rst, and update the narrative doc and __init__.py.
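
A rough sketch of what one of those deprecation shims could look like, using sklearn.utils.deprecated (the message, version number and defaults here are illustrative, not the actual code of this PR):

from sklearn.utils import deprecated
from sklearn.metrics import zero_one_loss


@deprecated("zero_one is deprecated and will be removed in 0.15; "
            "use zero_one_loss instead (note the different default "
            "for normalize).")
def zero_one(y_true, y_pred, normalize=False):
    # Keep the old default behavior: return the count of
    # misclassifications unless normalize=True is passed.
    return zero_one_loss(y_true, y_pred, normalize=normalize)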

@ogrisel (Member) commented Jan 4, 2013

> I will initiate the name changes (accuracy_score and zero_one_loss). I suppose that I have to add deprecation warnings for 0.15, remove the previous names from classes.rst, and update the narrative doc and __init__.py.

Yes.

@ogrisel (Member) commented Jan 4, 2013

And update the API changes section of the whats_new.rst file when ready to merge.

@arjoly (Member, Author) commented Jan 4, 2013

> +1 for being more explicit by deprecating zero_one and adding explicit names such as zero_one_loss and accuracy_score (one being 1 minus the other).

Currently zero_one_loss is the sum of the zero-one loss over the samples, so accuracy_score is not 1 - zero_one_loss.

@arjoly (Member, Author) commented Jan 4, 2013

I think there are still some typos to hunt down, but what I wanted to achieve is there.

So reviews and comments are welcome. :-)

@ogrisel (Member) commented Jan 4, 2013

Alright for keeping zero_one_loss as the sum and making accuracy_score(true, pred) == np.mean(true != pred).

@arjoly (Member, Author) commented Jan 4, 2013

You meant accuracy_score(true, pred) == np.mean(true == pred)?

@amueller (Member) commented Jan 4, 2013

I think that is what he meant.

@ogrisel (Member) commented Jan 4, 2013

Indeed... sorry for the confusion.
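
To spell out the agreed-upon relationship with a tiny concrete example (the label values below are made up for illustration, assuming the new names behave as discussed):

import numpy as np
from sklearn.metrics import accuracy_score, zero_one_loss

y_true = np.array([0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 2])   # one mistake out of four samples

accuracy_score(y_true, y_pred)                  # 0.75, i.e. np.mean(y_true == y_pred)
zero_one_loss(y_true, y_pred, normalize=False)  # 1, the count of misclassified samples
zero_one_loss(y_true, y_pred)                   # 0.25, i.e. 1 - accuracy when normalized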

Review comment from a Member on the documentation diff, which reads:

> roc_curve
>
> Others have been extended to the multiclass case:

I would rather say something along the lines of "also work in". For example, a confusion matrix is quite naturally multiclass and doesn't need to be extended.

@arjoly (Member, Author) commented Jan 9, 2013

I have rebased on top of master to take into account #1534.

@GaelVaroquaux The deprecated decorator is now used.

To summarize, the remaining questions / remarks to take into account:

  1. The explained variance question of @amueller.
  2. The import consistency question.

Regarding these:

  1. I don't know this metric well enough; it is pretty new to me.
  2. This might be solved later.

@amueller (Member) commented Jan 9, 2013

Have you changed the default behavior of the renamed zero_one? Could you please test that the old name has the old default behavior and the new name has the new one?

>>> zero_one_loss(y_true, y_pred)
0.25
>>> zero_one_loss(y_true, y_pred, normalize=False)
1
Review comment from @arjoly (Member, Author) on the doctest above:

@amueller Check the doctest here.

@arjoly (Member, Author) commented Jan 9, 2013

> Have you changed the default behavior of the renamed zero_one?

No.

> Could you please test that the old name has the old default behavior and the new name has the new one?

See the reference.

Does it answer your question?

@arjoly (Member, Author) commented Jan 9, 2013

Taking your question into account, I have also clarified the API changes. Good catch! Thanks.

@ogrisel (Member) commented Jan 9, 2013

Explained variance can be negative for completely random or anti-correlated predictions:

>>> from sklearn.metrics import explained_variance_score
>>> explained_variance_score([0, 1, 2], [1, 2, 0])
-2.0
>>> explained_variance_score([0, 1, 2], [1, 2, 3])
1.0
>>> explained_variance_score([0, 1, 2], [1, 2, 2])
0.66666666666666663

Usually it's positive for a real-life, non-dummy model though (provided the signal is not itself completely random).
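
For reference, a small NumPy sketch of the quantity being computed (as I understand it, 1 - Var[y_true - y_pred] / Var[y_true] for a single output), reproducing the values above:

import numpy as np
from sklearn.metrics import explained_variance_score

def explained_variance_by_hand(y_true, y_pred):
    # 1 - Var(residuals) / Var(target): equals 1.0 for a perfect (possibly
    # biased) fit and goes negative when the predictions are worse than a
    # constant baseline.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

for y_pred in ([1, 2, 0], [1, 2, 3], [1, 2, 2]):
    print(explained_variance_by_hand([0, 1, 2], y_pred),
          explained_variance_score([0, 1, 2], y_pred))
# prints -2.0, 1.0 and 0.666... in turn, matching the doctest above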

assert_equal(zero_one(y_true, y_pred),
             zero_one(y_pred, y_true))

assert_almost_equal(zero_one(y_true, y_pred, normalize=True),
                    zero_one(y_pred, y_true, normalize=True), 2)
Review comment from @amueller (Member) on the tests above:

These raise deprecation warnings, right? Could you please catch them?
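
For example, one way to silence the expected warnings in such tests (a sketch only, not necessarily how it ended up in the PR; assert_equal, zero_one, y_true and y_pred are the names from the test excerpt above):

import warnings

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    # zero_one is the deprecated alias, so calling it emits a
    # DeprecationWarning; recording the warnings keeps the test output clean.
    assert_equal(zero_one(y_true, y_pred),
                 zero_one(y_pred, y_true))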

@amueller (Member) commented Jan 9, 2013

@arjoly Thanks for the pointers, and sorry for being lazy. Having a busy day. Apart from my comment on the deprecation warnings, I think this is good to go.

@arjoly (Member, Author) commented Jan 9, 2013

Thanks @ogrisel for shedding some light on this subject.

@amueller Don't worry, we all have our busy days.

@arjoly (Member, Author) commented Jan 9, 2013

@amueller Warnings are caught!

@amueller (Member) commented Jan 9, 2013

Thanks :) I guess you are good to go. Feel free to merge using your preferred method ;)

Arnaud Joly notifications@github.com wrote:

> Warnings are caught!

@arjoly arjoly merged commit fd5801e into scikit-learn:master Jan 9, 2013
@arjoly (Member, Author) commented Jan 9, 2013

Merged by rebase! :-)

Thanks all for your time and review!!!

@mblondel (Member) commented Jan 9, 2013

A total of 1,306 additions and 215 deletions. That's a huge contribution! Thanks heaps!

@amueller (Member) commented Jan 9, 2013

+1 :)

Mathieu Blondel notifications@github.com wrote:

> A total of 1,306 additions and 215 deletions. That's a huge contribution! Thanks heaps!

@arjoly arjoly deleted the metric-classification-doc branch March 7, 2013 10:37