R squared out of sample #21957
Replies: 5 comments 18 replies
-
Could you make your notebook accessible without requiring an access request?
-
To the best of my knowledge, this statement is not correct and lacks evidence. On the contrary, out-of-sample R2 is used in many places, is well understood theoretically, and (as long as the denominator is the same for all models) is well suited for comparing models that predict the conditional expectation of the observed target, E[Y|X]. It has all the same disadvantages as MSE, though.
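To make the link to MSE concrete, here is a minimal sketch, assuming R2 is computed as 1 - SS_res / SS_tot with SS_tot measured around one fixed baseline shared by all models. The function names, the model names, and the numbers are invented for illustration and do not come from the thread.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_with_baseline(y_true, y_pred, baseline):
    """1 - SS_res / SS_tot, with SS_tot taken around a fixed baseline."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - baseline) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy numbers, made up for illustration only.
y_test = [1.0, 2.0, 3.0, 4.0]
baseline = 2.0  # assumed: e.g. the mean of the training targets

# Hypothetical predictions from three hypothetical models.
preds = {
    "model_a": [1.1, 2.1, 2.9, 3.7],
    "model_b": [1.5, 1.5, 3.5, 3.5],
    "model_c": [2.0, 2.0, 2.0, 2.0],  # always predicts the baseline
}

# Rank models from best to worst under each metric.
by_mse = sorted(preds, key=lambda m: mse(y_test, preds[m]))
by_r2 = sorted(
    preds,
    key=lambda m: r2_with_baseline(y_test, preds[m], baseline),
    reverse=True,
)

print(by_mse)
print(by_r2)
```

Because SS_tot is the same constant for every model, R2 here is a decreasing linear function of MSE, so both orderings agree (model_a, model_b, model_c). That is also why this form of R2 inherits the weaknesses of MSE, such as sensitivity to outliers.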
-
Dear Contributor, Thank you for raising this important point about R2 and its limitations for evaluating predictive power on test datasets. You're correct that R2 can be misleading when the variance of y dominates its dependence on X, resulting in negative scores even for well-learned models. To address your questions:
We encourage you to open an issue first to outline your proposal and gather feedback before submitting the PR. Best regards,
-
There is a known problem: R2 should not be used to assess the "predictive power" of a model. In more concrete terms, it is incorrect to measure the quality of a model by calculating R2 on a test dataset that is different from the learning dataset.
Why? Because when the variance of y is high compared to the dependence of y on the features X, R2 will produce a negative score, even if the model learns the relation X -> y perfectly well.
For an empirical proof I've concocted a notebook. There I show how a negative R2 is produced, and how R2 out of sample fixes the problem.
I haven't found any existing issue about R squared out of sample in this repository, so my questions are:
Is there any other discussion or work in progress about it?
If there is no other work in this direction, would a PR be welcome?
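Since the notebook is not reproduced here, the following is only a rough sketch of the comparison described above, not the notebook's code. It assumes that "R2 out of sample" means replacing the test-set mean in the denominator with the mean of the training targets, which is one common definition. The data-generating setup (slope 0.1, unit-variance noise, sample sizes) and all function names are assumptions chosen for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def r2(y_true, y_pred, baseline):
    """1 - SS_res / SS_tot, with SS_tot taken around `baseline`."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - baseline) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


rng = random.Random(0)  # fixed seed so the run is repeatable


def sample(n):
    # Assumed toy setup: a weak signal (slope 0.1) buried in unit-variance noise.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.1 * a + rng.gauss(0.0, 1.0) for a in x]
    return x, y


x_train, y_train = sample(500)
x_test, y_test = sample(20)  # small test set, so the test mean is noisy

intercept, slope = fit_line(x_train, y_train)
y_pred = [intercept + slope * a for a in x_test]

# Conventional R2 on the test set: baseline is the mean of the test targets.
r2_test_mean = r2(y_test, y_pred, mean(y_test))
# Out-of-sample variant (assumed definition): baseline is the training mean.
r2_train_mean = r2(y_test, y_pred, mean(y_train))

print(f"R2 with test-mean baseline:  {r2_test_mean:.4f}")
print(f"R2 with train-mean baseline: {r2_train_mean:.4f}")
```

For the same predictions, the train-mean variant can never be lower than the test-mean one, because the test mean is the constant that minimizes the sum of squared deviations of the test targets. Whether either score is actually negative depends on the particular draw, so the sketch makes no claim about the sign.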