TST use global_random_seed in sklearn/metrics/tests/test_regression.py #30865


Merged
merged 7 commits into scikit-learn:main on Apr 11, 2025

Conversation

@sortofamudkip (Contributor) commented Feb 20, 2025

Reference Issues/PRs

Towards #22827.

What does this implement/fix? Explain your changes.

The modified tests are:

  • test_tweedie_deviance_continuity: tolerance was changed from 1e-6 to 1e-5 to allow all tests to pass.
  • test_mean_absolute_percentage_error: all tests passed.
  • test_dummy_quantile_parameter_tuning: all tests passed.
  • test_pinball_loss_relation_with_mae: all tests passed.
  • test_mean_pinball_loss_on_constant_predictions: some tests failed.

In particular, tests failed locally with the following configurations:

| distribution             | n_samples=3000 | n_samples=30000 |
|--------------------------|----------------|-----------------|
| distribution=uniform     | 6/300 failed   | 0/300 failed    |
| distribution=exponential | 6/300 failed   | 0/300 failed    |
| distribution=normal      | 53/300 failed  | 117/300 failed  |

All tests for distribution=lognormal passed.

Increasing the number of samples causes distribution=normal to fail more often.
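For context, the failing test checks (as its name suggests) that among constant predictions the mean pinball loss is minimized near the empirical quantile. A minimal sketch of that property on normal data, where `pinball_loss` is a stand-in helper and not the scikit-learn implementation:

```python
import numpy as np

def pinball_loss(y_true, pred, alpha):
    # Mean pinball (quantile) loss of a constant prediction `pred`.
    diff = y_true - pred
    return np.mean(np.where(diff >= 0, alpha * diff, (alpha - 1) * diff))

rng = np.random.RandomState(0)
y = rng.normal(size=3000)
alpha = 0.5

# Among constant predictions, the loss is minimized near the
# empirical alpha-quantile of y.
best = np.quantile(y, alpha)
grid = np.linspace(best - 0.5, best + 0.5, 101)
losses = [pinball_loss(y, c, alpha) for c in grid]
assert abs(grid[np.argmin(losses)] - best) < 0.05
```

Because the minimum of this convex piecewise-linear objective is only approximately identified from finite samples, the recovered minimizer can drift from the theoretical quantile, which is what makes the assertion seed-sensitive.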

test_tweedie_deviance_continuity
test_mean_absolute_percentage_error
test_mean_pinball_loss_on_constant_predictions
test_dummy_quantile_parameter_tuning
test_pinball_loss_relation_with_mae

github-actions bot commented Feb 20, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3a646e3.

@OmarManzoor (Contributor) left a comment


Thanks for the PR @sortofamudkip

```python
assert result.success
# The minimum is not unique with limited data, hence the large tolerance.
assert result.x == pytest.approx(best_pred, rel=1e-2)
```
Contributor

I think the issue is mostly with recovering the actual value, which can turn out to be somewhat different from the actual best_pred. Moreover, the optimization might not be converging to the actual minimum loss on some occasions when using the default tolerance in the minimize method.
What would be the best approach to handling this, @ogrisel @lorentzenchr?
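To illustrate the convergence concern, a simplified sketch (not the actual test code) of minimizing the mean pinball loss over constant predictions with `scipy.optimize.minimize`; Nelder-Mead is used here as a derivative-free choice suited to the piecewise-linear objective:

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(y, c, alpha=0.5):
    # Mean pinball loss of the constant prediction c.
    d = y - c
    return np.mean(np.where(d >= 0, alpha * d, (alpha - 1) * d))

rng = np.random.RandomState(0)
y = rng.normal(size=3000)

# For alpha=0.5 the minimizer among constants is the sample median,
# so the optimizer should land close to np.median(y).
result = minimize(lambda c: pinball_loss(y, c[0]), x0=[np.mean(y)],
                  method="Nelder-Mead")
assert result.success
assert abs(result.x[0] - np.median(y)) < 2e-2
```

How close `result.x` gets to the true minimizer depends on the optimizer's stopping tolerance, which is why the comparison against best_pred needs a generous `rel` (and, near zero, an `abs`) tolerance.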

Member

Given the approximate nature of this part of the test due to the limited data, increasing the tol looks fine to me. I also added an absolute tol to take into account the normal distribution + 0.5 quantile case (i.e. an expected value close to 0).
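As a sketch of why the absolute tolerance matters: `pytest.approx` with only `rel=` collapses to a near-zero allowed deviation when the expected value is itself near zero (`rel * |expected| ≈ 0`), which is exactly the normal distribution + 0.5 quantile case:

```python
import pytest

expected = 0.0   # e.g. the 0.5 quantile of centered normal data
observed = 1e-3  # a small optimization error

# With rel only, the allowed deviation is rel * |expected| = 0
# (up to pytest's tiny default abs of 1e-12), so any visible error fails.
assert observed != pytest.approx(expected, rel=1e-2)

# Adding abs= puts a floor on the tolerance and the check passes.
assert observed == pytest.approx(expected, rel=1e-2, abs=1e-2)
```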

@jeremiedbb (Member) left a comment


LGTM. Thanks @sortofamudkip.

@jeremiedbb jeremiedbb merged commit 31cbde3 into scikit-learn:main Apr 11, 2025
34 checks passed
5 participants