[MRG] Does not store all cv values nor all dual coef in _RidgeGCV fit #15183
Conversation
LGTM. Please add a what's new entry with the Efficiency tag.
sklearn/linear_model/ridge.py
@@ -1048,6 +1048,16 @@ def _matmat(self, v):
        return res


class _IdentityEstimator:
Slightly less hacky...now that it is a class. :)
Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>
thanks @thomasjpfan
Thanks @jeromedockes! Some of the added changes are not covered by tests; it would be good to add more tests for those (unless codecov has issues again).
sklearn/linear_model/ridge.py
"""Hack to call a scorer when we already have the predictions.""" | ||
|
||
def decision_function(self, y_predict): | ||
return y_predict |
This is never called by tests.
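For context, here is a minimal standalone sketch of how this identity-estimator hack lets an ordinary scikit-learn scorer be applied to already-computed predictions. The `predict` method and the toy data below are illustrative additions for this sketch, not part of the diff:

```python
import numpy as np
from sklearn.metrics import get_scorer


class _IdentityEstimator:
    """Hack to call a scorer when we already have the predictions."""

    def decision_function(self, y_predict):
        return y_predict

    def predict(self, y_predict):
        # added for this sketch so prediction-based scorers also work
        return y_predict


# the precomputed (e.g. leave-one-out) predictions are passed where a scorer
# normally expects raw features; the identity estimator simply returns them
y_true = np.array([3.0, -0.5, 2.0, 7.0])
predictions = np.array([2.5, 0.0, 2.0, 8.0])

scorer = get_scorer("neg_mean_squared_error")
print(scorer(_IdentityEstimator(), predictions, y_true))  # -0.375
```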
sklearn/linear_model/ridge.py
alpha_score = scorer(
    _IdentityEstimator(), predictions.ravel(), y.ravel())
if self.store_cv_values:
    self.cv_values_[:, i] = predictions.ravel()
not covered by tests either.
This code path is not reached because of

scikit-learn/sklearn/linear_model/ridge.py, lines 1562 to 1564 in 86aea99:

if self.store_cv_values:
    raise ValueError("cv!=None and store_cv_values=True "
                     " are incompatible")

which means we can assume that `self.store_cv_values` is always false when `scorer` is defined?
No, it is always false when `cv != None`, which means GCV is not used; in that case a different estimator than the one affected here is used. The tests I added now should cover the missing lines.
But it is true that the meaning of the stored "cv values" is different when `scorer` is defined, see #13998.
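For reference, a small example of the guard quoted above (a sketch against the scikit-learn API around the 0.22 timeframe discussed here; the exact error message may differ by version): asking for both a CV splitter and stored LOO values is rejected at `fit` time.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

X, y = make_regression(n_samples=20, n_features=5, random_state=0)

# cv != None switches to GridSearchCV-based selection, which cannot
# produce the per-alpha LOO values, hence the ValueError
try:
    RidgeCV(alphas=[0.1, 1.0, 10.0], cv=5, store_cv_values=True).fit(X, y)
except ValueError as exc:
    print(exc)
```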
sklearn/linear_model/ridge.py
best_coef, best_score, best_alpha = c, alpha_score, alpha

self.alpha_ = best_alpha
self.best_score_ = best_score
If we want to include this, we should document it.
I added a note in the docstring, but note that this is a private class: users don't have access to this `best_score_`. Adding a `best_score_` attribute to the public `RidgeCV` and `RidgeClassifierCV` estimators is discussed in #4667.
doc/whats_new/v0.22.rst
@@ -372,6 +373,12 @@ Changelog
  and `fit_intercept=True`.
  :pr:`15086` by :user:`Alex Gramfort <agramfort>`.

- |Efficiency| :class:`linear_model.RidgeCV` now does not allocate a potentially
By convention, this should appear before the linear_model Fix entries.
sklearn/linear_model/ridge.py
@@ -1048,6 +1048,16 @@ def _matmat(self, v):
        return res


class _IdentityEstimator:
We are starting to need these hacks everywhere... We talked about this with @ogrisel but I can't remember where.
Thanks for the PR @jeromedockes, a few comments:
def scorer(estimator, X, Y):
    pred = estimator.decision_function(X)
    return np.sum((pred - Y)**2)
why do you need this scorer?
assert_allclose(loo_pred, ridge_cv.cv_values_[:, 1])


def test_ridge_gcv_decision_function_scoring():
What is this test testing?
I think that this test was intended to check the equivalence between `scoring=None` and `scoring=scorer`, where the scorer computes the mean squared error.
@jeromedockes I pushed what I thought your tests were meant for. Could you have a look and check whether it is what you intended?
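Roughly, the intended check could look like the following sketch (the data, alphas, and the use of `make_scorer` are illustrative, not the exact test that was pushed):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.metrics import make_scorer, mean_squared_error

X, y = make_regression(n_samples=50, n_features=10, random_state=0)
alphas = [0.1, 1.0, 10.0]

# scoring=None: alpha is chosen by minimizing the mean squared LOO error
ridge_default = RidgeCV(alphas=alphas).fit(X, y)

# an explicit negative-MSE scorer should lead to the same selected alpha
neg_mse = make_scorer(mean_squared_error, greater_is_better=False)
ridge_scored = RidgeCV(alphas=alphas, scoring=neg_mse).fit(X, y)

assert ridge_default.alpha_ == ridge_scored.alpha_
```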
We need to mention `RidgeClassifierCV` in what's new, otherwise this LGTM.
@@ -495,6 +496,16 @@ Changelog
  requires less memory.
  :pr:`14108`, :pr:`14170`, :pr:`14296` by :user:`Alex Henrie <alexhenrie>`.

- |Efficiency| :class:`linear_model.RidgeCV` now does not allocate a
  potentially large array to store dual coefficients for all hyperparameters
  during its `fit`, nor an array to store all LOO predictions unless
LOO predictions or mean squared errors
Actually it depends on `scoring`: LOO predictions if `scoring` is not None, otherwise mean squared errors. But agreed, it should be added.
# equivalent to `scoring=None`

def scorer(estimator, X, Y):
    pred = estimator.decision_function(X)
perhaps predict will be more friendly here :)
for i, alpha in enumerate(self.alphas):
    G_inverse_diag, c = solve(
        float(alpha), y, sqrt_sw, X_mean, *decomposition)
    if error:
        squared_errors = (c / G_inverse_diag) ** 2
        cv_values[:, i] = squared_errors.ravel()
        alpha_score = -squared_errors.mean()
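To make the quoted loop easier to follow, here is a standalone numerical sketch (not the scikit-learn implementation) of the identity it relies on, assuming no intercept and no sample weights: with `G = K + alpha * I` and `c = G^-1 y`, the exact leave-one-out errors of ridge regression are `c / diag(G^-1)`.

```python
import numpy as np

rng = np.random.RandomState(0)
n, p, alpha = 20, 5, 1.0
X = rng.randn(n, p)
y = rng.randn(n)

# dual formulation: G = K + alpha*I, c = G^-1 y
K = X @ X.T
G_inv = np.linalg.inv(K + alpha * np.eye(n))
c = G_inv @ y
loo_errors_shortcut = c / np.diag(G_inv)

# brute-force leave-one-out for comparison
loo_errors_brute = []
for i in range(n):
    mask = np.arange(n) != i
    K_i = X[mask] @ X[mask].T
    c_i = np.linalg.solve(K_i + alpha * np.eye(n - 1), y[mask])
    pred_i = X[i] @ X[mask].T @ c_i
    loo_errors_brute.append(y[i] - pred_i)

print(np.allclose(loo_errors_shortcut, loo_errors_brute))  # True
```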
I think there's another bug: we need to divide by sample_weight in the loop quoted above, otherwise we get a weighted error.
Why shouldn't sample weights be taken into account when computing the error? Note that this has always been the behaviour of the GCV estimator and is made explicit here:

scikit-learn/sklearn/linear_model/_ridge.py, line 1554 in 98cb91b:

or another form of cross-validation, because only generalized
It's embarrassing that such an important thing is documented only in private functions. I think users will expect `RidgeCV()` to be equivalent to `GridSearchCV(Ridge(), cv=LeaveOneOut())`.
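That expectation can be spelled out with a small sketch (illustrative data; without sample weights the two selection strategies should agree on the chosen alpha):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import GridSearchCV, LeaveOneOut

X, y = make_regression(n_samples=30, n_features=5, random_state=0)
alphas = [0.1, 1.0, 10.0]

# efficient generalized / leave-one-out CV built into RidgeCV
ridge_cv = RidgeCV(alphas=alphas).fit(X, y)

# explicit leave-one-out grid search over the same alphas
grid = GridSearchCV(
    Ridge(),
    {"alpha": alphas},
    cv=LeaveOneOut(),
    scoring="neg_mean_squared_error",
).fit(X, y)

# both should pick the same regularization strength on unweighted data
print(ridge_cv.alpha_, grid.best_params_["alpha"])
```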
e.g., if we add sample_weight to test_ridge_gcv_equivalence_prediction_metric, this test will fail.
Good catch @qinhanmin2014. Playing a bit,
I have a draft locally, I'll upload it so you can play with it.
Actually there's another issue regarding RidgeClassifierCV, let's fix RidgeCV first.
Yes, it is. Thanks a lot!
Should this PR be closed in favour of #15648?
- |Fix| In :class:`linear_model.RidgeCV`, the predictions reported by
  `cv_values_` are now rescaled in the original space when `scoring` is not
  `None`. :pr:`13995` by :user:`Jérôme Dockès <jeromedockes>`.
#13995 is an issue, not the PR
We'll close when needed and your name will still be noted. Thanks for contributing.
And if you have time, please review #15648, thanks a lot :)
@qinhanmin2014 @jeromedockes works at Inria with me as well, so we can continue working directly in this PR. It would avoid duplicated comments and keep the history.
That's fine. I provided that PR for you to play with.
Can you guys redirect the discussion there: https://gitter.im/glemaitre/ridgecv
Please ping me when you want me to review.
@glemaitre, the issue this PR was meant to close has been closed in the meantime by #15182 (you are the author). Should this one be closed? What about #15648, which is also meant to close the same issue? Thanks!
We can close this PR. There is still an unresolved issue with Ridge: *Denormalize predictions*.
fixes #15182

In master, `_RidgeGCV` stores the LOO predictions and the dual coefficients for all hyperparameters during `fit`, which can take a lot of memory. This PR only stores the best score and coefficients when `store_cv_values == False`.
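As a usage note (a sketch against the scikit-learn API around the 0.22 timeframe discussed here), the per-alpha LOO values are only kept when the user explicitly opts in:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

X, y = make_regression(n_samples=100, n_features=20, random_state=0)
alphas = [0.01, 0.1, 1.0, 10.0]

# default: after fit, only the selected alpha and its coefficients are kept
ridge = RidgeCV(alphas=alphas).fit(X, y)
print(ridge.alpha_)

# opt in to keeping one column of LOO values per candidate alpha
ridge_stored = RidgeCV(alphas=alphas, store_cv_values=True).fit(X, y)
print(ridge_stored.cv_values_.shape)  # (n_samples, n_alphas) -> (100, 4)
```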