[MRG+2] ENH: median_absolute_error consistent with other regression metrics #3764
Conversation
@FlorianWilhelm Thanks for handling this case. It looks good. However, are you sure that is how it should be computed? I am not familiar with this metric. Would it be better to have an L1-space median? When in doubt, I would be in favor of raising an error for multi-output data for now. Sorry if my previous comments weren't clear.
@arjoly I think that in this case there is no right or wrong, only consistency. It could also be argued that the way
I am +1 once what happens with multi-output data is stated clearly in the documentation. Thanks for polishing the small details!
@arjoly I added a few words about our approach in
I would be more precise in the median absolute error section. Something like: "This current situation will change and improve whenever #3474 is merged."
@arjoly Added a line directly to the median absolute error section.
and :func:`r2_score`. Multioutput is handled with a two level approach. First,
the chosen metric is applied to each row of the output matrix in order to
generate a single value per sample. This creates a one-dimensional output
vector. In a second step, the metric is applied to the output vector.
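For reference, here is a minimal NumPy sketch of the two-level approach the quoted passage describes. It is only an illustration of the idea with made-up numbers, not the scikit-learn implementation:

```python
import numpy as np

# Hypothetical multi-output targets: 3 samples, 2 outputs each.
y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

# Step 1: apply the metric (here the median of absolute errors) to each row
# of the output matrix, giving one value per sample.
per_sample = np.median(np.abs(y_true - y_pred), axis=1)

# Step 2: apply the metric again to the resulting one-dimensional vector.
two_level_medae = np.median(per_sample)

print(per_sample)       # one median absolute error per sample
print(two_level_medae)  # single score aggregated over samples
```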
Reading this again. This might not be true for r2_score.
What do you suggest? Should I remove my sentences or should r2_score be changed?
+1 for removing the sentence.
done
+1 from my side!
I am not familiar with the use case of this metric. Intuitively, does it not make sense to find the median across every output and then find the median across all such medians? However, in all the other metrics it has been done the same way: average across all outputs, and then average across all samples, instead of averaging the metrics across all outputs, so at least the user has an idea of what to expect. So I am +1 for consistency. Thanks!
On second thoughts, I'm not sure taking the median across the outputs (i.e.
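A small NumPy sketch (again just an illustration with made-up numbers, not project code) of why the order of aggregation matters here: reducing over outputs first and then over samples generally gives a different result from the reverse order for the median, whereas for the mean both orders agree.

```python
import numpy as np

# Absolute errors for 3 samples and 2 outputs (hypothetical values).
errors = np.array([[0.0, 3.0],
                   [1.0, 1.0],
                   [5.0, 0.0]])

# Median per sample (across outputs), then median across samples.
outputs_then_samples = np.median(np.median(errors, axis=1))
# Median per output (across samples), then median across outputs.
samples_then_outputs = np.median(np.median(errors, axis=0))

print(outputs_then_samples, samples_then_outputs)  # differ in general (1.5 vs 1.0 here)

# The mean is insensitive to the order of aggregation.
print(np.mean(np.mean(errors, axis=1)), np.mean(np.mean(errors, axis=0)))  # identical
```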
@MechCoder I think the behavior is as wrong or right as applying the mean in the other metrics, but it would be consistent, and it is stated as such in the documentation. Why not just merge this as a first step? In a second, larger step we should reconsider the whole multi-output handling anyway. Maybe the user should be able to supply a cost functional to judge the quality of a multi-output prediction.
I agree that it is better not to support multi-output here; don't worry
…an_absolute_error
Conflicts: doc/modules/model_evaluation.rst
Sorry for keeping you waiting; I was quite busy the last few weeks. I removed the multioutput feature of median_absolute_error.
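A minimal usage sketch of the behaviour after this change, assuming median_absolute_error is called on one-dimensional targets with the same calling convention as the other regression metrics (multi-output input is not handled by it here; the exact behaviour may differ in later releases):

```python
from sklearn.metrics import median_absolute_error, mean_absolute_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Median of the absolute errors |y_true - y_pred|.
print(median_absolute_error(y_true, y_pred))  # 0.5 for these values

# Same calling convention as the other regression metrics.
print(mean_absolute_error(y_true, y_pred))    # 0.5 for these values
```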
Looks good. @arjoly please merge the next time you see it.
Merged as 167e96e. Thanks @FlorianWilhelm!
As discussed in #3761, this PR should make metrics.median_absolute_error consistent with the other regression metrics, like mean_absolute_error, in the case of multi-output.