TST tight and clean tests for Ridge #22910

Merged
merged 23 commits into scikit-learn:main from lorentzenchr:clean_ridge_tests
Mar 31, 2022

Conversation

lorentzenchr
Member

@lorentzenchr lorentzenchr commented Mar 20, 2022

Reference Issues/PRs

None...yet.

What does this implement/fix? Explain your changes.

This PR restructures the tests for Ridge, removes duplicates, and adds very tight, general tests for correct solutions.

Any other comments?

Found Bug

The new tests uncover that for wide/fat X, i.e. n_features > n_samples, Ridge(fit_intercept=True) does not yield the minimum norm solution. This is now reported in #22947.

Reason

For wide X, the least squares problem is underdetermined and reads a bit different: the minimum norm solution solves min ||w||_2 subject to Xw = y and is given by w = X'(XX')^-1 y, see e.g. http://ee263.stanford.edu/lectures/min-norm.pdf.
With an explicit intercept w0, i.e. minimizing the joint norm of (w0, w) subject to Xw + w0 1 = y, this becomes w = X'(XX' + 1 1')^-1 y and w0 = 1'(XX' + 1 1')^-1 y, where 1 is a column vector of ones.
Because the intercept is penalized alongside w here, this is incompatible with our mean centering approach, which leaves the intercept unpenalized.
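The two formulas above can be checked numerically; here is a small numpy-only sketch (shapes and variable names are illustrative, not from the PR):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 10  # wide/fat X: n_features > n_samples
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)

# Without intercept, lstsq returns the minimum norm solution of the
# underdetermined system, which matches w = X'(XX')^{-1} y.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
w_formula = X.T @ np.linalg.solve(X @ X.T, y)
assert np.allclose(w_lstsq, w_formula)

# With an explicit, jointly penalized intercept w0, the minimum norm
# solution of [1 X] @ [w0, w] = y uses the Gram matrix XX' + 1 1'.
ones = np.ones(n_samples)
z = np.linalg.solve(X @ X.T + np.outer(ones, ones), y)
w0, w = ones @ z, X.T @ z
assert np.allclose(X @ w + w0, y)  # still interpolates the data
```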

@lorentzenchr lorentzenchr marked this pull request as ready for review March 22, 2022 22:16
Member

@ogrisel ogrisel left a comment


It's indeed a much better way of testing and the fact that we cannot recover the minimum norm solution is troublesome...

# But it is not the minimum norm solution. (This should be equal.)
assert np.linalg.norm(np.r_[model.intercept_, model.coef_]) > np.linalg.norm(
np.r_[intercept, coef]
)
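The discrepancy asserted above can be reproduced with a numpy-only sketch that mirrors the mean centering approach used for fit_intercept=True (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_features = 5, 10  # wide X
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)

# Mean-centering approach: solve min-norm OLS on centered data,
# then recover the (unpenalized) intercept.
X_mean, y_mean = X.mean(axis=0), y.mean()
coef_centered, *_ = np.linalg.lstsq(X - X_mean, y - y_mean, rcond=None)
intercept_centered = y_mean - X_mean @ coef_centered

# Joint minimum norm solution of the augmented system [1 X] @ [w0, w] = y.
A = np.column_stack([np.ones(n_samples), X])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept_min, coef_min = theta[0], theta[1:]

# Both solutions interpolate the data ...
assert np.allclose(X @ coef_centered + intercept_centered, y)
assert np.allclose(X @ coef_min + intercept_min, y)
# ... but centering does not yield the joint minimum norm solution.
norm_centered = np.linalg.norm(np.r_[intercept_centered, coef_centered])
norm_min = np.linalg.norm(np.r_[intercept_min, coef_min])
assert norm_centered > norm_min
```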
Member


The fact that this is not the minimum norm solution for any solver is surprising, no? Do you have any idea what could cause this discrepancy?

Is the difference in norms the same for all solvers? Or are some solvers significantly worse in that respect than others?

Member


I tried to change alpha to a small positive value but this does not fully fix the problem.

Member Author


The fact that this is not the minimum norm solution for any solver is surprising, no? Do you have any idea what could cause this discrepancy?

I edited the PR text to explain this failure. It is connected to mean centering.

Member Author


Is the difference in norms the same for all solvers? Or are some solvers significantly worse in that respect than others?

Yes, the difference in norms is the same for all solvers.

Member


I edited the PR text to explain this failure. It is connected to mean centering.

Thanks. Do you see an easy way out of this?

I think we can merge this PR that improves the test without fixing this bug. But if we do so, we should open a dedicated issue to document this problem and reference the issue URL where appropriate in the tests.

Member Author


I opened #22947.

Member Author


I think we can merge this PR

I would like that very much 😉

"""
X, y, coef, _ = ols_ridge_dataset
n_samples, n_features = X.shape
alpha = 0 # OLS
Member


It seems that doing something like:

Suggested change:
-    alpha = 0  # OLS
+    alpha = 1e-30  # near OLS

moves the solution closer to the minimum norm solution for most (all?) solvers, at least for some seeds. But the difference in norms is still far from negligible (between 0.001 and 0.2) and very dependent on global_random_seed.

Member Author


I consider these two separate issues/tests:

  • Test for exact OLS, i.e. alpha=0 gives minimum norm solution.
  • Test that as alpha gets close to 0, one approaches the minimum norm solution.
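Both cases could be sketched with plain numpy, using the kernel form of ridge for wide X and no intercept for simplicity (names are illustrative, not the PR's test code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 10  # wide X, no intercept
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)

# 1) Exact OLS (alpha=0): lstsq yields the minimum norm solution.
w_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)

def ridge_wide(X, y, alpha):
    # Dual/kernel form w = X'(XX' + alpha*I)^{-1} y, efficient when
    # n_features > n_samples.
    K = X @ X.T
    return X.T @ np.linalg.solve(K + alpha * np.eye(X.shape[0]), y)

# 2) As alpha -> 0, ridge approaches the minimum norm solution.
gaps = [
    np.linalg.norm(ridge_wide(X, y, alpha) - w_min_norm)
    for alpha in (1.0, 1e-3, 1e-6)
]
assert gaps[0] > gaps[1] > gaps[2]  # monotone approach
assert gaps[-1] < 1e-3
```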

@lorentzenchr
Member Author

The new tests pass with SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all" pytest sklearn/linear_model/tests/test_ridge.py.

@lorentzenchr
Member Author

@jeremiedbb You seem to be interested in linear models. Maybe you'll like this PR 😉

@lorentzenchr lorentzenchr added this to the 1.1 milestone Mar 29, 2022
Member

@ogrisel ogrisel left a comment


LGTM. +1 for merging this test-suite refactoring without waiting for the resolution of #22947.

@ogrisel
Member

ogrisel commented Mar 30, 2022

/cc @agramfort

Member

@agramfort agramfort left a comment


thx @lorentzenchr

Now this thing about mean centering affecting the minimum norm solution puzzles me. I'll follow up in #22947.

@agramfort agramfort merged commit 6528e14 into scikit-learn:main Mar 31, 2022
@agramfort
Member

thx @lorentzenchr

@lorentzenchr lorentzenchr deleted the clean_ridge_tests branch April 1, 2022 06:44
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 6, 2022
* MNT replace pinvh by solve

* DOC more info for svd solver

* TST rewrite test_ridge

* MNT remove test_ridge_singular

* MNT restructure into several tests

* MNT remove test_toy_ridge_object

* MNT remove test_ridge_sparse_svd

This is tested in test_ridge_fit_intercept_sparse_error.

* TST exclude cholesky from singular problem

* CLN two fixes

* MNT parametrize test_ridge_sample_weights

* MNT restructure test_ridge_sample_weights

* CLN tighten tolerance for sag solver

* CLN try to fix saga tolerance

* CLN make test_ridge_sample_weights nicer

* MNT remove test_ridge_regression_sample_weights

* MNT rename to test_ridge_regression_sample_weights

* CLN make test_ridge_regression_unpenalized pass for all random seeds

* CLN make tests pass for all random seeds

* DOC fix typos

* TST skip cholesky for singular problems

* MNT move up test_ridge_regression_sample_weights

* CLN set skip reason as comment
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Apr 29, 2022