[MRG+2] rebasing pr/3474 (multioutput regression metrics) #4491
Conversation
@@ -33,7 +36,7 @@
-def _check_reg_targets(y_true, y_pred):
+def _check_reg_targets(y_true, y_pred, output_weights):
Could you check whether it is acceptable to make output_weights=None a default? This is making Travis fail on mean_absolute_error. It may be better to not set the default and be explicit in mean_absolute_error.
output_weights=None isn't a sane default: it returns a whole array. 'uniform' is better as a default. In any case I'll make it explicit to correct median_absolute_error.
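To illustrate the distinction, a sketch with made-up numbers (the keyword values follow this thread's discussion, not necessarily the final API):

    import numpy as np

    # Hypothetical per-output MAE values for a 3-output problem
    per_output_mae = np.array([0.5, 1.0, 2.5])

    # output_weights=None would hand back the whole array:
    per_output_mae              # array([0.5, 1. , 2.5])

    # output_weights='uniform' collapses it to a single scalar:
    np.average(per_output_mae)  # 1.333...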
OK, then don't put any default.
Thanks very much for taking care of this!
-def _average_and_variance(values, sample_weight=None):
+def _average_and_variance(values, sample_weight=None, axis=None):
In the other PR we had discussed whether this function is useful. Looking at it a few months later, I find it weird that this logic was extracted when it is only used twice lower down.
Could we reach a decision on whether to remove it?
I agree that it looks like over-engineering in this case. I found a related issue, numpy/numpy#5164, but it seems it was ignored.
I was thinking more of just writing it out explicitly where needed.
But since it works as it is right now, in the interest of finishing this PR, I think we should leave it the way it is.
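For reference, "writing it out explicitly" would amount to roughly this at each call site (a sketch with made-up data):

    import numpy as np

    values = np.array([[1.0, 2.0], [3.0, 5.0], [4.0, 7.0]])
    sample_weight = np.array([0.2, 0.3, 0.5])

    # Inline the weighted average and variance instead of calling the helper
    average = np.average(values, weights=sample_weight, axis=0)
    variance = np.average((values - average) ** 2,
                          weights=sample_weight, axis=0)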
We agreed on removing it that day, if I recall correctly.
I have removed that debatable function.
def r2_score(y_true, y_pred,
             output_weights='uniform',
This is a regression compared to the previous behavior.
Sorry, what do you mean exactly?
Regression with respect to the single target behavior or regression with respect to a previous version of this PR?
It's a regression with respect to the current (master) behavior of r2_score.
I was convinced that the outcome of the discussions in the last PRs and at the scikit-learn sprint last summer in Paris was that the sane default behavior for aggregating r2 scores was to average the scaled variances. The behavior in current master was added as the keyword output_weights="variance" for those who understand its consequences and need it (scores being weighted by the variances of the target variables, i.e. dependent on their scale/unit).
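A sketch with made-up per-output numbers, showing how the two aggregation schemes differ and why variance weighting depends on target scale:

    import numpy as np

    r2_per_output = np.array([0.9, 0.5])     # per-output R^2 scores
    var_per_output = np.array([1.0, 100.0])  # second target has a much larger scale

    np.average(r2_per_output)                          # uniform average: 0.7
    np.average(r2_per_output, weights=var_per_output)  # variance-weighted: ~0.504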
I find individual too misleading. collapse_output? scores_map? multioutput_averaging?
Sorry, I meant output_weights='individual' as a replacement for output_weights=None. Also, as Gaël says lower down: output_weights -> multioutput_weight.
Then I suggest an option multioutput with values raw_scores (the current None behavior), uniform_average, and variance_weighted. In this case the default None will raise a deprecation warning and fall back to variance_weighted. Is that OK for everybody?
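A minimal sketch of the proposed deprecation path (the helper name and message are hypothetical, not the merged code):

    import warnings

    def _resolve_multioutput(multioutput):
        # Proposed: None is deprecated and falls back to the old behavior
        if multioutput is None:
            warnings.warn("'multioutput' defaulting to 'variance_weighted' "
                          "is deprecated; pass a value explicitly.",
                          DeprecationWarning)
            multioutput = 'variance_weighted'
        if multioutput not in ('raw_scores', 'uniform_average',
                               'variance_weighted'):
            raise ValueError("Invalid 'multioutput' value: %r" % multioutput)
        return multioutput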
+1
I have done the refactoring, but I can't understand what the hell is going on with Travis.
Thanks. The Travis failure is indeed weird, since it didn't even start the tests (the one config that did run has all tests passing). I don't know if it is possible to ask it to try again on this commit. But what you can always do is add another commit correcting some minor issues, such as PEP8 violations. Another helpful thing would be to bring the docstrings into a more standard (numpydoc) form.
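The before/after docstring examples were not preserved on this page; as an illustration only, the standard numpydoc parameter layout looks like this (the signature follows the excerpt quoted further down, and is not the merged code):

    def mean_absolute_error(y_true, y_pred, sample_weight=None,
                            multioutput='uniform_average'):
        """Mean absolute error regression loss.

        Parameters
        ----------
        y_true : array-like of shape (n_samples,) or (n_samples, n_outputs)
            Ground truth (correct) target values.

        multioutput : string in ['raw_values', 'uniform_average']
                or array-like of shape (n_outputs)
            Defines how to aggregate errors over multiple outputs.
        """
        pass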
Well, I think it's done.
This looks good to me now. I am assuming that even if 1.0 is possibly on the horizon after 0.17, it is still OK to assume 0.18 in the deprecation message; that can always be changed when this becomes clear. @kshmelkov could you squash your commits into a single one called something like …?
Force-pushed from 22d2741 to d0b8ba1.
is ``'uniform_average'``, which entails a uniformly weighted mean over outputs. If
an ``ndarray`` of shape ``(n_outputs,)`` is passed, then its entries are
interpreted as weights and an according weighted average is returned. If
``multioutput`` is ``'raw_scores'``, then all unaltered individual scores are returned.
raw_scores doesn't make sense with loss / error based metrics.
Mostly stylistic comments; once those are addressed, +1.
These functions have a ``multioutput`` keyword argument which specifies
the way the scores for each individual target should be averaged. The default
is ``'uniform_average'``, which entails a uniformly weighted mean over outputs.
I don't think "entails" is the right word.
@kshmelkov could you please squash all your commits in this branch?
Other than that, +1 on my side as well.
Force-pushed from 110ed21 to a745318.
@ogrisel I have applied your corrections and squashed the commits.
Should this get a whatsnew entry?
We should close #4491 when merged.
        Sample weights.

    multioutput : string in ['raw_values', 'uniform_average',
            'variance_weighted'] or array-like of shape (n_outputs)
Does this render OK in the docs?
Sorry for not being part of the discussion, but apart from those 2 minor comments, this looks good to me as well. Thanks @arjoly @kshmelkov and @eickenberg for the help throughout. Let us merge? It's been 2 long years since this PR was started.
Well, it would close automatically if the PR had been started with "Closing #4491", but because of the many re-openings of PRs, this one didn't even reference the original issue.
    y_type, y_true, y_pred, multioutput = _check_reg_targets(
        y_true, y_pred, multioutput)

    y_diff_avg = np.average(y_true - y_pred, weights=sample_weight, axis=0)
Probably does not matter, but you could reuse y_true - y_pred.
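The suggested reuse would look roughly like this (a sketch with made-up data; the numerator line assumes the usual explained-variance computation that follows in the function):

    import numpy as np

    y_true = np.array([[3.0, -0.5], [2.0, 0.0], [7.0, 2.0]])
    y_pred = np.array([[2.5, 0.0], [2.0, -1.0], [8.0, 2.0]])
    sample_weight = None

    diff = y_true - y_pred  # computed once and reused
    y_diff_avg = np.average(diff, weights=sample_weight, axis=0)
    numerator = np.average((diff - y_diff_avg) ** 2,
                           weights=sample_weight, axis=0)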
@kshmelkov Can you just check the rendering, and then we can merge?
I'm on it.
Pushed, with whatsnew.rst updated.
Great!!! Thanks to all who have contributed! :-)
I have tried to rebase PR #3474. If it passes Travis, I guess the mentioned PR can be replaced.
Basically, it adds multioutput support for regression metrics (MAE, MSE, R2, explained variance) in a way that lets us get a whole array of scores instead of some kind of averaging. BTW, I corrected a small mistake in the docs: they stated that median absolute error supports multioutput while it doesn't.
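For context, the behavior this adds, using the option names from the docstring excerpt quoted above (a small self-contained example; values computed by hand):

    import numpy as np
    from sklearn.metrics import mean_absolute_error

    y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
    y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

    # Default: uniformly averaged over the two outputs -> a single scalar
    mean_absolute_error(y_true, y_pred)                            # 0.75
    # Whole array of per-output errors instead of an average
    mean_absolute_error(y_true, y_pred, multioutput='raw_values')  # [0.5, 1.0]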