TST use global_random_seed in sklearn/metrics/tests/test_regression.py #30865


Merged
merged 7 commits into scikit-learn:main on Apr 11, 2025

Conversation

@sortofamudkip (Contributor) commented Feb 20, 2025

Reference Issues/PRs

Towards #22827.

What does this implement/fix? Explain your changes.

The modified tests are:

  • test_tweedie_deviance_continuity: tolerance was changed from 1e-6 to 1e-5 to allow all tests to pass.
  • test_mean_absolute_percentage_error: all tests passed.
  • test_dummy_quantile_parameter_tuning: all tests passed.
  • test_pinball_loss_relation_with_mae: all tests passed.
  • test_mean_pinball_loss_on_constant_predictions: some tests failed.

In particular, tests failed locally with the following configurations:

| distribution             | n_samples=3000 | n_samples=30000 |
|--------------------------|----------------|-----------------|
| distribution=uniform     | 6/300 failed   | 0/300 failed    |
| distribution=exponential | 6/300 failed   | 0/300 failed    |
| distribution=normal      | 53/300 failed  | 117/300 failed  |

All tests for distribution=lognormal passed.

Increasing the number of samples causes distribution=normal to fail more often.
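For context, the failing test checks (as its name suggests) that among constant predictions the mean pinball loss is minimized near the empirical quantile. A minimal sketch of that property on normal data, where `pinball_loss` is a stand-in helper and not the scikit-learn implementation:

```python
import numpy as np

def pinball_loss(y_true, pred, alpha):
    # Mean pinball (quantile) loss of a constant prediction `pred`.
    diff = y_true - pred
    return np.mean(np.where(diff >= 0, alpha * diff, (alpha - 1) * diff))

rng = np.random.RandomState(0)
y = rng.normal(size=3000)
alpha = 0.5

# Among constant predictions, the loss is minimized near the
# empirical alpha-quantile of y.
best = np.quantile(y, alpha)
grid = np.linspace(best - 0.5, best + 0.5, 101)
losses = [pinball_loss(y, c, alpha) for c in grid]
assert abs(grid[np.argmin(losses)] - best) < 0.05
```

Because the minimum of this convex piecewise-linear objective is only approximately identified from finite samples, the recovered minimizer can drift from the theoretical quantile, which is what makes the assertion seed-sensitive.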

test_tweedie_deviance_continuity
test_mean_absolute_percentage_error
test_mean_pinball_loss_on_constant_predictions
test_dummy_quantile_parameter_tuning
test_pinball_loss_relation_with_mae

github-actions bot commented Feb 20, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3a646e3.

@OmarManzoor (Contributor) left a comment


Thanks for the PR @sortofamudkip

```python
assert result.success
# The minimum is not unique with limited data, hence the large tolerance.
assert result.x == pytest.approx(best_pred, rel=1e-2)
```
Contributor

I think the issue is mostly with recovering the actual value, which can turn out to be somewhat different from the actual best_pred. Moreover, the optimization might not be converging to the actual minimum loss on some occasions when using the default tolerance in the minimize method.
What would be the best approach to handling this, @ogrisel @lorentzenchr?
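To illustrate the convergence concern, a simplified sketch (not the actual test code) of minimizing the mean pinball loss over constant predictions with `scipy.optimize.minimize`; Nelder-Mead is used here as a derivative-free choice suited to the piecewise-linear objective:

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(y, c, alpha=0.5):
    # Mean pinball loss of the constant prediction c.
    d = y - c
    return np.mean(np.where(d >= 0, alpha * d, (alpha - 1) * d))

rng = np.random.RandomState(0)
y = rng.normal(size=3000)

# For alpha=0.5 the minimizer among constants is the sample median,
# so the optimizer should land close to np.median(y).
result = minimize(lambda c: pinball_loss(y, c[0]), x0=[np.mean(y)],
                  method="Nelder-Mead")
assert result.success
assert abs(result.x[0] - np.median(y)) < 2e-2
```

How close `result.x` gets to the true minimizer depends on the optimizer's stopping tolerance, which is why the comparison against best_pred needs a generous `rel` (and, near zero, an `abs`) tolerance.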

Member

Given the approximate nature of this part of the test due to the limited data, increasing the tol looks fine to me. I also added an absolute tol to take into account the normal distribution + 0.5 quantile case (i.e. an expected value close to 0).
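As a sketch of why the absolute tolerance matters: `pytest.approx` with only `rel=` collapses to a near-zero allowed deviation when the expected value is itself near zero (`rel * |expected| ≈ 0`), which is exactly the normal distribution + 0.5 quantile case:

```python
import pytest

expected = 0.0   # e.g. the 0.5 quantile of centered normal data
observed = 1e-3  # a small optimization error

# With rel only, the allowed deviation is rel * |expected| = 0
# (up to pytest's tiny default abs of 1e-12), so any visible error fails.
assert observed != pytest.approx(expected, rel=1e-2)

# Adding abs= puts a floor on the tolerance and the check passes.
assert observed == pytest.approx(expected, rel=1e-2, abs=1e-2)
```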

@jeremiedbb (Member) left a comment


LGTM. Thanks @sortofamudkip.

@jeremiedbb jeremiedbb merged commit 31cbde3 into scikit-learn:main Apr 11, 2025
34 checks passed
5 participants