TST use global_dtype in sklearn/metrics/tests/test_pairwise.py #22666

Merged
merged 15 commits into from
Sep 26, 2022

Conversation

jjerphan
Member

@jjerphan jjerphan commented Mar 3, 2022

Reference Issues/PRs

Partially addresses #22881
Precedes #22590

What does this implement/fix? Explain your changes.

This parametrizes tests from test_pairwise.py to run on 32bit datasets.

Any other comments?

We could introduce a mechanism to skip the tests on 32bit datasets if they take too much time to complete.

@jjerphan jjerphan marked this pull request as ready for review March 3, 2022 15:01
Member

@ogrisel ogrisel left a comment

Let's systematically add assertions on the expected dtype for the results of the pairwise distance computation when the dtype of the input is fully specified in the test.

I only made a few suggestions via GitHub because I cannot make suggestions on folded lines, but almost all the newly parametrized tests would deserve such a treatment to make the parametrization more useful.
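To illustrate the kind of assertion suggested here, a minimal NumPy-only sketch follows. `pairwise_sq_euclidean` and `check_output_dtype` are hypothetical stand-ins written for this example, not scikit-learn's implementation or the tests from this PR; the point is only the `D.dtype == dtype` check when the input dtype is fully specified.

```python
import numpy as np


def pairwise_sq_euclidean(X, Y):
    """Hypothetical stand-in for a pairwise distance function (not sklearn's)."""
    # ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, computed in the input dtype.
    XX = (X * X).sum(axis=1)[:, np.newaxis]
    YY = (Y * Y).sum(axis=1)[np.newaxis, :]
    D = XX - 2 * (X @ Y.T) + YY
    # Clip small negative values caused by rounding.
    np.maximum(D, 0, out=D)
    return D


def check_output_dtype(dtype):
    """Build inputs of a given dtype and assert the result keeps that dtype."""
    rng = np.random.RandomState(0)
    X = rng.random_sample((5, 4)).astype(dtype, copy=False)
    Y = rng.random_sample((3, 4)).astype(dtype, copy=False)
    D = pairwise_sq_euclidean(X, Y)
    assert D.shape == (5, 3)
    # The assertion under discussion: output dtype matches input dtype.
    assert D.dtype == dtype
    return D


for dtype in (np.float32, np.float64):
    check_output_dtype(dtype)
```

In an actual test the loop would presumably be replaced by the `global_dtype` fixture or a `pytest.mark.parametrize` decorator.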

@jjerphan jjerphan changed the title TST Adapt test_pairwise.py to test implementations on 32bit datasets TST use global_dtype in sklearn/metrics/tests/test_pairwise.py Mar 23, 2022
@glemaitre glemaitre self-requested a review June 9, 2022 09:45
@@ -291,7 +296,7 @@ def callable_rbf_kernel(x, y, **kwds):
         (pairwise_kernels, callable_rbf_kernel, {"gamma": 0.1}),
     ],
 )
-@pytest.mark.parametrize("dtype", [np.float64, int])
+@pytest.mark.parametrize("dtype", [np.float64, np.float32, int])
Member

Isn't it weird not to use np.int32 or np.int64, since int would be platform-dependent here?
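A small sketch of the concern, assuming only NumPy: the builtin `int` maps to NumPy's default integer type, whose width can differ across platforms and NumPy versions, whereas `np.int32` and `np.int64` are fixed-width. The array `X` is made up for the example.

```python
import numpy as np

# The builtin `int` resolves to NumPy's default integer dtype; its width
# is not the same on every platform / NumPy version.
default_int = np.dtype(int)
print(default_int, default_int.itemsize)

# Fixed-width dtypes have the same size everywhere.
assert np.dtype(np.int32).itemsize == 4
assert np.dtype(np.int64).itemsize == 8

# Parametrizing over explicit widths makes the tested dtype unambiguous.
X = np.arange(6).reshape(3, 2)
for dtype in (np.int32, np.int64):
    assert X.astype(dtype).dtype == dtype
```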

Member

It could also be worth checking the output dtype here.

Member

Checking the dtype fails here because it turns out that the returned dtype depends on n_jobs. This is a bug imo. I'll open an issue about that.

Member

Was the issue opened? If so we could link it here.

Member Author

I am not sure that an issue was created.

@jeremiedbb: do you have a snippet which reproduces this bug? This way we could create an issue and resolve it. Thanks!

Member

I opened #24502

Member

@ogrisel ogrisel left a comment

LGTM. This is already a net improvement even if not all the tests have been converted to use the global_dtype fixture yet.

I locally ran:

SKLEARN_RUN_FLOAT32_TESTS=1 pytest -v sklearn/metrics/tests/test_pairwise.py

and I get 244 passed tests instead of 194. No failures and no new warnings.
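For readers unfamiliar with this kind of switch, here is a minimal sketch of how an environment variable could gate the float32 runs. The variable name comes from the command above; `dtypes_to_test` and its logic are hypothetical and written for illustration only, so they should not be read as scikit-learn's actual fixture code.

```python
import os

import numpy as np


def dtypes_to_test(environ=None):
    """Hypothetical helper: dtypes a parametrized test would run on."""
    environ = os.environ if environ is None else environ
    dtypes = [np.float64]
    # Variable name taken from the command above; the gating logic is assumed.
    if environ.get("SKLEARN_RUN_FLOAT32_TESTS", "0") == "1":
        dtypes.append(np.float32)
    return dtypes


# Without the variable only float64 is exercised; with it, float32 is added.
print(dtypes_to_test({}))
print(dtypes_to_test({"SKLEARN_RUN_FLOAT32_TESTS": "1"}))
```

Under such a scheme, each test using the dtype parameter would run once more when the variable is set, which is consistent with the higher count of passed tests reported above.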

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jjerphan jjerphan mentioned this pull request Sep 19, 2022
3 tasks
Member

@jeremiedbb jeremiedbb left a comment

LGTM. Thanks @jjerphan

@jeremiedbb jeremiedbb merged commit 9a76368 into scikit-learn:main Sep 26, 2022
@jjerphan jjerphan deleted the tst/test_pairwise-32bit branch October 21, 2022 13:25
4 participants