TST use global_random_seed in sklearn/linear_model/tests/test_base.py #23465


Conversation

svenstehle
Contributor

@svenstehle svenstehle commented May 25, 2022

Reference Issues/PRs

Partially addresses #22827

Task List

  • address FIXMEs for sklearn version 1.2 in code (currently at 1.2.dev0) --> out of scope for this PR
  • test_linear_regression_sample_weights
  • test_raises_value_error_if_sample_weights_greater_than_1d (--> no seed change; this tests ValueError)
  • test_linear_regression_sparse
  • test_linear_regression_sparse_equal_dense (--> out of scope, will be deprecated in v1.2)
  • test_linear_regression_multiple_outcome (--> no seed change, random data has no impact on test)
  • test_linear_regression_sparse_multiple_outcome
  • test_linear_regression_positive_vs_nonpositive
  • test_linear_regression_positive_vs_nonpositive_when_positive
  • test_linear_regression_pd_sparse_dataframe_warning (--> no seed change; this asserts warnings)
  • test_preprocess_data
  • test_preprocess_data_multioutput
  • test_preprocess_data_weighted
  • test_sparse_preprocess_data_offsets
  • test_dtype_preprocess_data
  • test_rescale_data_dense
  • todo: check remaining tests in test_base.py

What does this implement/fix? Explain your changes.

  • Added global_random_seed to all seeded tests in sklearn/linear_model/tests/test_base.py (a sketch of the conversion pattern is shown below)
  • The only test that failed on a few seeds was test_linear_regression_sample_weights
    • Fixed by removing the arbitrary check assert reg.score(X, y) > 0.5 - scores can be below 0.5 for some cases of fit_intercept=False
    • The test now passes for all seeds; more info below
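
For context, the conversion pattern looks roughly like this. This is a minimal sketch rather than an excerpt from the PR; the test name and data shapes are made up for illustration, and global_random_seed refers to the pytest fixture provided by scikit-learn's test suite:

    import numpy as np

    # Before (hypothetical): the seed was hard-coded, e.g. rng = np.random.RandomState(0)
    # After: the seed is supplied by the global_random_seed fixture, so the test
    # is exercised with whichever seed(s) the test run selects.
    def test_some_property(global_random_seed):
        rng = np.random.RandomState(global_random_seed)
        X = rng.normal(size=(6, 5))
        y = rng.normal(size=6)
        assert X.shape == (6, 5) and y.shape == (6,)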

Long version:

Observations:
  • For specific seeds (3, 15, 17, 20, 23, 25, 54, 64, 79, 91, 99), the test test_linear_regression_sample_weights fails with fit_intercept=False but passes with fit_intercept=True:

    >       assert reg.score(X, y) > 0.5
    E       assert 0.3016564425292758 > 0.5
    E        +  where 0.3016564425292758 = <bound method RegressorMixin.score of LinearRegression(fit_intercept=False)>(array([[-0.60612102, -1.05993975, -0.55091967, -0.27568627,  1.22225373],\n       [-0.90585899,  0.06935217,  2.1786556...215,  2.11737829, -0.44639435, -0.66693906],\n       [ 0.48079725,  1.98498011,  0.39190894,  2.3982028 ,  2.41736027]]), array([-1.61411983, -0.21642658,  1.66363212, -0.12201627,  1.39842066,\n       -0.63836468]))
    E        +    where <bound method RegressorMixin.score of LinearRegression(fit_intercept=False)> = LinearRegression(fit_intercept=False).score
    
    sklearn/linear_model/tests/test_base.py:80: AssertionError
    

    In the failing cases with one of these seeds and fit_intercept=False, reg.score is lower than 0.5. Setting fit_intercept=True fixes this for all seeds.

  • Any combination other than n_samples, n_features = 6, 5 results in more failed tests; e.g. reducing to n_samples, n_features = 5, 4 does not help even with fit_intercept=True.

  • n_samples needs to stay small, because it is hard to fit a linear regression to more than a few randomly drawn samples/targets; e.g. with n_samples = 30, every parametrization of the test fails.

Possible solutions
  • remove assert reg.score(X, y) > 0.5

    • we picked that one
  • run only tests with working combinations of global_random_seed and fit_intercept; e.g.

    if np.isin(global_random_seed, (3, 15, 17, 20, 23, 25, 54, 64, 79, 91, 99)) and not fit_intercept:
        pytest.skip("Unsupported Configuration")
  • mark bad combinations as xfail - we have to introduce a gate-keeper fixture for this:

    import numpy as np
    import pytest
    from scipy import sparse


    @pytest.fixture
    def xfail_selected_intercept_seed_combos(request):
        fit_intercept = request.getfixturevalue("fit_intercept")
        seed = request.getfixturevalue("global_random_seed")

        # the test is known to fail on these seeds for `fit_intercept=False`
        allowed_failures = {(False, s) for s in (3, 15, 17, 20, 23, 25, 54, 64, 79, 91, 99)}
        if (fit_intercept, seed) in allowed_failures:
            request.node.add_marker(
                pytest.mark.xfail(reason="known-bad fit_intercept/seed combination", strict=True)
            )


    @pytest.mark.parametrize("array_constr", [np.array, sparse.csr_matrix])
    @pytest.mark.parametrize("fit_intercept", [True, False])
    @pytest.mark.usefixtures("xfail_selected_intercept_seed_combos")
    def test_linear_regression_sample_weights(
        array_constr, fit_intercept, global_random_seed
    ):
        ...  # test body unchanged

Either of the last two options does the job and introduces more seeds to the test. I prefer the xfail variant: it tells a better story, since it exercises every combination and safeguards against combinations that silently start passing in the future.

Comments and opinions are very much appreciated! :)

Any other comments?

I created a task list to track the implementation progress of global_random_seed. Also, let's discuss whether to tackle the two version-1.2 related FIXME items in this PR or in another (linked?) one (how is that normally kept track of?).

@svenstehle svenstehle changed the title [WIP] TST use global_random_seed in sklearn/linear_model/tests/test_base.py TST use global_random_seed in sklearn/linear_model/tests/test_base.py May 28, 2022
@lorentzenchr
Member

@svenstehle Thanks for working on this. I think xfail on specific seeds is an anti-pattern that we should certainly avoid. Only skimming over the code changes, I see the following options:

  • Increase tolerances where applicable.
  • Improve the test where possible or remove it (e.g. why should a score be > 0.5, pretty arbitrary).
  • Mark the whole test as xfail.

@svenstehle
Contributor Author

svenstehle commented Jun 1, 2022

@svenstehle Thanks for working on this. I think xfail on specific seeds is an anti-pattern that we should certainly avoid. Only skimming over the code changes, I see the following options:

  • Increase tolerances where applicable.
  • Improve the test where possible or remove it (e.g. why should a score be > 0.5, pretty arbitrary).
  • Mark the whole test as xfail.

Hi @lorentzenchr, thanks for your reply and input. Dropping the check on reg.score is a valid idea:

  • As far as I understand it, the test checks the correct behaviour of the sample_weight parameter.
  • reg.score is unnecessary for that. It can even be smaller than 0 when the model error exceeds the baseline ((y_true - y_true.mean()) ** 2).sum(), which happens for some of those seeds because we have fit_intercept=False (see the sketch below).
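
To illustrate the point about negative scores, here is a small standalone sketch (not part of the test file; the data are made up purely for illustration):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.RandomState(0)
    X = rng.normal(size=(30, 5))
    y = 2.0 + rng.normal(size=30)  # y has a nonzero mean and is unrelated to X

    reg = LinearRegression(fit_intercept=False).fit(X, y)
    # R^2 = 1 - sum((y - y_pred) ** 2) / sum((y - y.mean()) ** 2)
    # Without an intercept the model cannot absorb the mean of y, so the residual
    # sum of squares can exceed the baseline and the score can go negative even
    # on the training data.
    print(reg.score(X, y))  # typically well below 0 for a draw like this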

Member

@lorentzenchr lorentzenchr left a comment


LGTM. Thanks for fixing this.

@lorentzenchr lorentzenchr added the "Quick Review" label (For PRs that are quick to review) Jun 3, 2022
@svenstehle
Contributor Author

svenstehle commented Jun 4, 2022

LGTM. Thanks for fixing this.

Thanks @lorentzenchr. There are more tests to check in this file, though; I will go through them over the weekend to see how many need changes or rework. Depending on that, we can discuss how to proceed.

Update: at least one other test is impacted.
I will open a task list; other approaches are welcome.

@svenstehle
Contributor Author

svenstehle commented Jun 4, 2022

As a conversation point, I added "address FIXMEs for sklearn version 1.2 in code (currently at 1.2.dev0)" to the task list. But we should probably not address these two FIXME items in this PR; they are about the deprecation of normalize. Will they be handled in a PR concerning sklearn/linear_model/_base.py?

@lorentzenchr
Member

@svenstehle Please keep the instructions of #22827 in mind, e.g.

We probably do not need to convert all scikit-learn tests to use this fixture. We should instead focus our efforts on tests that actually check for important mathematical properties of our estimators or model evaluation tools. For instance, there is no need to check for the seed-insensitivity of tests that check for the exception messages raised when passing invalid inputs.

For instance, test_raises_value_error_if_sample_weights_greater_than_1d does not need the global random seed fixture.

@svenstehle
Contributor Author

@svenstehle Please keep the instructions of #22827 in mind, e.g.

We probably do not need to convert all scikit-learn tests to use this fixture. We should instead focus our efforts on tests that actually check for important mathematical properties of our estimators or model evaluation tools. For instance, there is no need to check for the seed-insensitivity of tests that check for the exception messages raised when passing invalid inputs.

For instance, test_raises_value_error_if_sample_weights_greater_than_1d does not need the global random seed fixture.

Good catch, thanks for raising this. Addressed

…hts_greater_than_1d since we are only checking that no value errors are raised
@glemaitre glemaitre self-requested a review June 13, 2022 18:25
@@ -233,8 +235,9 @@ def test_linear_regression_sparse_equal_dense(normalize, fit_intercept):
assert_allclose(clf_dense.coef_, clf_sparse.coef_)


-def test_linear_regression_multiple_outcome(random_state=0):
+def test_linear_regression_multiple_outcome(global_random_seed):
Member


We can set a local random state here.
We only check that we fit independently on different targets, and the randomness will not impact anything apart from creating the dataset.

Contributor Author


Good point, addressed.

# Test multiple-outcome linear regressions
random_state = check_random_state(global_random_seed)
X, y = make_regression(random_state=random_state)
Member


You can pass random_state=0 directly.
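
(For reference, the suggested simplification would look roughly like the following; this is a sketch, not the actual diff:)

    from sklearn.datasets import make_regression

    # Creating the dataset is the only place randomness is used in this test,
    # so a fixed seed is sufficient; no need to parametrize via global_random_seed.
    X, y = make_regression(random_state=0)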

Contributor Author


addressed

 # Test multiple-outcome linear regressions with sparse data
-random_state = check_random_state(random_state)
+random_state = check_random_state(global_random_seed)
 X, y = make_sparse_uncorrelated(random_state=random_state)
Member


I would say that this is the same case as above, but for sparse matrices, so no real need to parametrize the random state here.

 # Test multiple-outcome nonnegative linear regressions
-random_state = check_random_state(random_state)
+random_state = check_random_state(global_random_seed)
 X, y = make_sparse_uncorrelated(random_state=random_state)
Member


same as above

Member

@glemaitre glemaitre left a comment


I think there are a couple of places where we can avoid parametrizing. Otherwise LGTM.

svenstehle and others added 2 commits June 19, 2022 16:02
…execution order

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
…te' and replace all 'random_state' variables with 'rng'
Member

@lorentzenchr lorentzenchr left a comment


LGTM

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@glemaitre glemaitre merged commit 158b620 into scikit-learn:main Jun 22, 2022
@glemaitre
Member

Thanks @svenstehle LGTM. Merging

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jul 11, 2022
…scikit-learn#23465)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>