[MRG+1] Multi-output scoring, or 2493 cont'd for real #3474


Closed · eickenberg wants to merge 1 commit into scikit-learn:master from eickenberg:2493b2

Conversation

eickenberg (Contributor)

This time taking into account the refactoring of metrics.py into several files.

This replaces #3456, which I closed.

@@ -1006,6 +1023,13 @@ and :math:`y_i` is the corresponding true value, then the mean absolute error

\text{MAE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} \left| y_i - \hat{y}_i \right|.

The :func:`mean_absolute_error` function has an `output_weights` keyword
with two possible values `None` and 'uniform'. If the value provided is
Member:

To render properly in the docs, the code should be between double backticks, thus ``None``.
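
(For context, a minimal NumPy sketch of the semantics described in the snippet above. The `output_weights` keyword is the API proposed by this PR, not a released one, so the per-output and uniform averages are computed by hand here; `output_weights=None` returning the per-output scores is an assumption drawn from the surrounding discussion.)

    import numpy as np

    y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
    y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

    # Per-output MAE (what output_weights=None would presumably return):
    per_output = np.abs(y_true - y_pred).mean(axis=0)  # array([0.5, 1.0])

    # output_weights='uniform' averages the per-output scores equally:
    uniform = per_output.mean()  # 0.75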

@ogrisel (Member) commented Jul 23, 2014

Travis reports 2 failures:

======================================================================
FAIL: Doctest: sklearn.metrics.regression.r2_score
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/doctest.py", line 2201, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for sklearn.metrics.regression.r2_score
  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metrics/regression.py", line 369, in r2_score

----------------------------------------------------------------------
File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metrics/regression.py", line 426, in sklearn.metrics.regression.r2_score
Failed example:
    r2_score(y_true, y_pred)  # doctest: +ELLIPSIS
Expected:
    0.938...
Got:
    0.93680052666227787

>>  raise self.failureException(self.format_failure(<StringIO.StringIO instance at 0x60c9b00>.getvalue()))


======================================================================
FAIL: sklearn.metrics.tests.test_common.test_format_invariance_with_1d_vectors
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7_with_system_site_packages/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metrics/tests/test_common.py", line 484, in test_format_invariance_with_1d_vectors
    assert_raises(ValueError, metric, y1_row, y2_row)
AssertionError: ValueError not raised

@eickenberg (Contributor, Author)

I fixed the "ValueError not raised" failure by adding explained_variance_score to the list of multioutput scores in test_common.py. The other failure was due to the fact that the default averaging of scores is now 'uniform'.
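
(For reference, a pure-NumPy sketch of the two averaging modes; the inputs below are illustrative but reproduce both numbers from the Travis log above.)

    import numpy as np

    y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
    y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

    # Per-output R^2 = 1 - SS_res / SS_tot, column by column.
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    r2_per_output = 1.0 - ss_res / ss_tot

    # 'uniform' (the new default) averages the per-output scores equally:
    print(r2_per_output.mean())                       # 0.93680052666...
    # Weighting by each output's total variance gives the old doctest value:
    print(np.average(r2_per_output, weights=ss_tot))  # 0.938...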

variance of each output is weighted by the scale of the corresponding target
variable.
One can also specify arbitrary weights: If an ``ndarray`` is given,
then a weighted average is formed accordingly.
Member:

This text is almost identical for explained_variance_score, r2_score, mean_absolute_error and mean_squared_error (and missing from the narrative doc). What do you think of creating a separate section to explain the output_weights (and maybe sample_weight) argument?

Contributor Author:

I am having trouble building the docs. I hope to get that working quickly so I can actually see my edits.

@arjoly (Member) commented Jul 30, 2014

Can you run flake8 on the code? Thanks!

output_weights = np.ones_like(output_scores)
elif output_weights == 'variance':
output_weights = denominator
elif output_weights == None:
Member:

output_weights is None
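
(Why identity comparison matters here, sketched: `output_weights` may be an ndarray, and with recent NumPy `==` on an ndarray compares elementwise.)

    import numpy as np

    w = np.ones(3)
    # Elementwise comparison: yields an array, not a usable condition.
    print(w == None)  # [False False False]
    # Identity comparison: always a plain bool.
    print(w is None)  # False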

# y_true is not interesting for scoring a regression anyway
output_scores[nonzero_numerator & ~nonzero_denominator] = 0.
if output_weights == 'uniform':
output_weights = np.ones_like(output_scores)
Member:

You can set this to None to have the same effect.
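
(The equivalence being pointed out, sketched: np.average with weights=None is already the unweighted mean, so the explicit ones array is redundant.)

    import numpy as np

    scores = np.array([0.2, 0.8, 0.5])
    explicit = np.average(scores, weights=np.ones_like(scores))
    implicit = np.average(scores)  # weights=None means an unweighted mean
    assert np.isclose(explicit, implicit)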

@arjoly (Member) commented Aug 1, 2014

> I am having trouble building the docs. I hope to get that working quickly so I can actually see my edits.

I am able to build the docs with:

Sphinx==1.2b1
Pillow==2.0.0
Pygments==1.6
PIL==1.1.7
matplotlib==1.2.1
Jinja2==2.7
MarkupSafe==0.18
docutils==0.10

I hope it helps.

variance = np.average((values - average)**2, weights=sample_weight)
sample_weight = sample_weight.reshape((n_samples, 1))
# if multi output but sample weight only specified in one column,
# then we need to broadcast it over outputs
Member:

Could we avoid the broadcast by first averaging over samples and then averaging over outputs?

Contributor Author:

I could do this if axis=None. Otherwise I need to keep the targets separate. Or bypass this function altogether ...

Member:

Ok, my fear is that it becomes slow. But further optimization could be done in another PR.

Contributor Author:

It is true that this can become slow when (n_test_samples, n_outputs) gets very large. I have around (1000, 100000) and it is still OK. Making this better would mean either getting rid of np.average in the multioutput case, because it forces the broadcast to be done explicitly, or writing an upstream suggestion to handle broadcasting implicitly, which would probably be rejected on grounds of avoiding magical behaviour. If this is a serious problem I can definitely fix it right away.
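
(A small sketch of the alternative discussed above, assuming uniform output weights and axis=None: averaging over samples first and then over outputs matches the broadcast route, so the explicit broadcast is only needed when per-output scores must be kept.)

    import numpy as np

    rng = np.random.RandomState(0)
    errors = rng.rand(1000, 50)      # (n_samples, n_outputs)
    sample_weight = rng.rand(1000)

    # Broadcast route: tile sample weights across outputs, average once.
    w = np.repeat(sample_weight.reshape(-1, 1), errors.shape[1], axis=1)
    one_shot = np.average(errors, weights=w)

    # Two-step route: weighted average over samples, then over outputs.
    two_step = np.average(errors, axis=0, weights=sample_weight).mean()

    assert np.isclose(one_shot, two_step)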

@arjoly (Member) commented Aug 4, 2014

LGTM, the Travis failure is unrelated.

@arjoly changed the title from "[MRG] Multi-output scoring, or 2493 cont'd for real" to "[MRG+1] Multi-output scoring, or 2493 cont'd for real" on Aug 4, 2014
@MechCoder (Member)

@eickenberg Thanks a lot for bringing this back to life :)

@eickenberg (Contributor, Author)

Definitely breathing again -- Just squashed it down to one commit, hope it survived ;)

@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling e40235e on eickenberg:2493b2 into 83223fd on scikit-learn:master.

@arjoly (Member) commented Aug 7, 2014

Travis is happy now :-)

@MechCoder (Member)

@eickenberg hope this doesn't die till someone has to bring it back to life again :)

@arjoly (Member) commented Aug 12, 2014

@MechCoder Have you had time to review the code?

@MechCoder (Member)

@arjoly I definitely can, but I doubt I will have anything more to add beyond your inputs :)

@arjoly (Member) commented Sep 24, 2014

Could we consider this as a +1? :-)

@MechCoder (Member)

@arjoly I'm slightly busy till Thursday next week (grad school exams). I will definitely have a final look on Thursday or just after that.

@@ -114,13 +152,32 @@ def mean_absolute_error(y_true, y_pred, sample_weight=None):
y_pred : array-like of shape = [n_samples] or [n_samples, n_outputs]
Estimated target values.

output_weights : string in ['uniform'] or None
or array-like of shape [n_outputs]
Member:

Should it be ``array-like, shape (n_outputs,)``?

Member:

Ah, I see it is like this everywhere, so this can be left alone.

Contributor Author:

I tried to keep to the local choice of syntax to stay consistent. That said, I don't like it and could change it, but then I would change it all. Putting a low priority on that, but if we conclude it is necessary, I'll do it.

@MechCoder (Member)

@eickenberg Thanks for finishing up my work. If you agree with my changes and are busy with other work, let me know and I can send a PR against your branch. I would like to see this merged.

Also there are merge conflicts.

@eickenberg (Contributor, Author)

I will update the code somewhat this afternoon and then try to rebase. There may be a need for one extra round of discussion, but it looks like we are almost there.

@amueller (Member)

Closing in favor of #4491.

@amueller closed this Apr 29, 2015