Add the RMSE (Root Mean Squared Error) option to cross_val_score #6457


Closed
wants to merge 8 commits

Conversation

@shaynekang

Many Kaggle competitions use RMSE as their official evaluation metric (Home Depot Product Search Relevance, Restaurant Revenue Prediction, Facial Keypoints Detection, etc.).

Usually, Kagglers implement it by hand on top of mean_squared_error. I think this is a waste of time, so I decided to implement RMSE and add it as an option to the cross_val_score function.

Hope this helps Kagglers :D
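
For context, the workaround this PR replaces is usually a one-liner on top of mean_squared_error. A minimal sketch of that hand-rolled version (illustrative, not part of this PR's diff):

    # The usual hand-rolled RMSE: the square root of scikit-learn's MSE.
    import numpy as np
    from sklearn.metrics import mean_squared_error

    y_true = [3.0, -0.5, 2.0, 7.0]
    y_pred = [2.5, 0.0, 2.0, 8.0]

    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    print(rmse)  # ~0.612 (the MSE is 0.375)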


Member (review comment on the docs diff):

    The :func:`root_mean_squared_error` function computes `root mean square
    error <https://en.wikipedia.org/wiki/Root-mean-square_deviation>`_, a risk
    metric corresponding to the expected value of the root mean squared error loss or

*loss or loss

@jnothman (Member)

I'm happy to see this separate scorer, but I don't think this needs to be a separate function from MSE, merely a parameter. Not sure if others will agree though...

@GaelVaroquaux (Member) commented Feb 27, 2016 via email

@jnothman (Member)

I guess I was thinking of the normalize option in other metrics as comparable.

@GaelVaroquaux (Member)

As discussed at the last sprint (I can't find the notes, bad on us): we want to deprecate MSE and use only 'negated_mse', so that bigger is always consistently better:

https://sourceforge.net/p/scikit-learn/mailman/message/31632671/

So we should have a negated_rmse.
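
For reference, negated scoring with cross_val_score looks like the sketch below, using the neg_mean_squared_error spelling that eventually shipped (the thread later settles on the neg_ prefix):

    # Negated MSE: scores are <= 0 and "greater is better", so model
    # selection can always maximize the score.
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=200, noise=5.0, random_state=0)
    scores = cross_val_score(LinearRegression(), X, y,
                             scoring="neg_mean_squared_error", cv=5)
    print(scores)  # negative values; closer to 0 means lower MSE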

@shaynekang (Author)

@jnothman Thank you for the suggestion. Some people might see this as over-engineering. However, I am concerned that the parameter approach (such as the normalize option) might confuse people who want to use this method. I'm fairly open-minded and will follow a better suggestion if there is one; honestly, though, I think this is the best implementation so far. In any case, good opinions are always welcome.

@GaelVaroquaux I remember reading that discussion. I would suggest handling negated_mse and negated_rmse in a separate pull request: if I implement negated_rmse here, we risk ending up with negated_rmse implemented but not negated_mse. I'll send another pull request covering both if time allows.

PS) Sorry for the test failure. :( I'll fix the problem as soon as possible.

@GaelVaroquaux (Member) commented Feb 27, 2016 via email

@shaynekang (Author)

@GaelVaroquaux I understand. You have a point.

I've decided to implement negated_rmse as you suggested, but I'm still not sure about the change that came out of this discussion. Is the following right?

  1. Rename the scoring option of cross_val_score from root_mean_squared_error to negated_rmse.
  2. In addition to No. 1, set the greater_is_better option to True when calling make_scorer (e.g. root_mean_squared_error_scorer = make_scorer(root_mean_squared_error, greater_is_better=True) in scorer.py; see the sketch after this list).
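
One note on item 2: make_scorer performs the negation itself when greater_is_better=False (it flips the sign of the loss), so a negated RMSE scorer would normally pass False rather than True. A minimal sketch of that wiring (names illustrative, not this PR's final code):

    import numpy as np
    from sklearn.metrics import make_scorer, mean_squared_error

    def root_mean_squared_error(y_true, y_pred):
        # RMSE as the square root of the MSE.
        return np.sqrt(mean_squared_error(y_true, y_pred))

    # greater_is_better=False tells make_scorer to negate the loss, so the
    # resulting scorer follows the "bigger is better" convention.
    negated_rmse_scorer = make_scorer(root_mean_squared_error,
                                      greater_is_better=False)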

@shaynekang (Author)

Any updates on this?

@shaynekang (Author)

I read the discussion about the negated_mse issue in #2439. Consequently, I decided it was enough to just change the scoring option of cross_val_score, along the lines of @GaelVaroquaux's suggestion. I welcome any opinions. :D

@nelson-liu (Contributor)

hmm, this seems reasonable (from an api design viewpoint). I'll take a look at the code in a few days when I get a chance.

@shaynekang (Author)

@nelson-liu Sure you can. :D

@nelson-liu (Contributor)

the implementation lgtm, @GaelVaroquaux do you have any thoughts on the api design?

@amueller (Member) commented Oct 11, 2016

Can you please rebase? Also, it should be just neg_, not negated_, but yes, that's the way to do it.
Sorry for the slow reply :(

@shaynekang (Author)

@amueller Sure. :D Wait a moment.

@shaynekang (Author)

I just finished rebasing this pull request. Is there anything else I can help with?

@jnothman (Member)

I still would rather not see root_mean_squared_error as a separate function, documented as a separate metric, particularly if that documentation doesn't emphasise why the sqrt transformation is helpful. Note that elsewhere we have options like euclidean_distances(..., squared=False) and accuracy with a normalize parameter. I don't really see what we gain by an effective alias, except in scoring.
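
In code, the parameter style jnothman describes would look roughly like the hypothetical helper below, mirroring euclidean_distances(..., squared=False). (scikit-learn's own mean_squared_error later gained an equivalent squared parameter; see the end of this thread.)

    import numpy as np
    from sklearn.metrics import mean_squared_error

    # Hypothetical parameter-style helper: one function, with a flag
    # selecting between MSE and RMSE.
    def squared_error(y_true, y_pred, squared=True):
        value = mean_squared_error(y_true, y_pred)
        return value if squared else np.sqrt(value)

    print(squared_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]))                 # 0.375 (MSE)
    print(squared_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8], squared=False))  # ~0.612 (RMSE)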

@shaynekang (Author) commented Oct 13, 2016

Thanks for your opinion. :D

In my understanding, I'm not sure we should treat this the same way as euclidean_distances. We can treat euclidean_distances(squared=True) and euclidean_distances(squared=False) interchangeably because they are conceptually equivalent (the squared=True variant just skips the final sqrt as a performance optimization). So although the two calls don't return the same values, we don't need to document precisely how they differ.

However, I think root_mean_squared_error is different: we usually choose RMSE when large errors are particularly undesirable, so the two metrics are conceptually distinct and deserve separate functions and separate documentation. (And you're right: the reason I wanted to implement this is the scoring option of cross_val_score.)

I'll update the documentation as soon as possible, emphasising when this method is helpful and why it is sometimes better than MSE.


"""
error = mean_squared_error(y_true, y_pred, sample_weight, multioutput)
return error ** 0.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, is this as good as np.sqrt? It might take a different code path.

@amueller (Member)

@shaynekang this is a global monotonic transformation, so it doesn't change anything. If there were a sqrt on the inside, that would be different; but this has exactly the same properties as MSE.
So I tend to agree with @jnothman.
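
amueller's point is easy to check: since sqrt is strictly increasing, ranking candidate models by RMSE or by MSE gives the same order. A quick illustration with made-up numbers:

    import numpy as np

    # Hypothetical per-model MSE values from a model-selection run.
    mse = np.array([4.0, 1.21, 2.89, 0.64])
    rmse = np.sqrt(mse)

    # sqrt is strictly increasing, so both metrics rank the models identically.
    print(np.argsort(mse))   # [3 1 2 0]
    print(np.argsort(rmse))  # [3 1 2 0]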

@GaelVaroquaux (Member)

  • error = mean_squared_error(y_true, y_pred, sample_weight, multioutput)
  • return error ** 0.5

btw, is this as good as np.sqrt? It might take a different code path.

I just did timings and, to my surprise, both take the same amount of time.
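
One way to reproduce that timing (a sketch; results vary by machine, and for arrays NumPy special-cases ** 0.5 onto the same sqrt path, as the numpy source linked below shows):

    import numpy as np
    from timeit import timeit

    x = np.random.rand(1_000_000)

    # Both should take roughly the same time: NumPy rewrites x ** 0.5 as sqrt.
    print(timeit(lambda: x ** 0.5, number=100))
    print(timeit(lambda: np.sqrt(x), number=100))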

@GaelVaroquaux (Member)

Indeed, in terms of model selection, RMSE and MSE are exactly the same thing.

The question is how far down we want to lower the bar for people who don't understand these things, or don't know how to compute a sqrt in Python. My hunch is usually that lowering the bar too much ends up onboarding people that we shouldn't onboard, because they will be only cost and no benefit.

@amueller (Member)

  • I just did timings and, to my surprise, both take the same amount of time.

https://github.com/numpy/numpy/blob/c90d7c94fd2077d0beca48fa89a423da2b0bb663/numpy/core/src/multiarray/number.c#L522

though explicit is better than implicit ;)

@jnothman (Member)

So I get the impression that core dev consensus is something like -1 for a separate metric, +0.5 for a separate scorer?

@amueller (Member)

@jnothman yeah I think so.

@cmarmo (Contributor) commented Dec 14, 2020

@rth, you merged #13467. Am I wrong in saying that this PR is no longer needed, then? Could it be closed? Thanks!

@rth (Member) commented Dec 14, 2020

Yes, thanks @cmarmo!

Thanks to all contributors in this PR. Closing as resolved.

rth closed this Dec 14, 2020