
TST Add minimal setup to be able to run test suite on float32 #22690


Merged
24 commits merged into scikit-learn:main on Mar 17, 2022

Conversation

@jjerphan (Member) commented Mar 4, 2022

Reference Issues/PRs

Solves #22680.

What does this implement/fix? Explain your changes.

Based on @thomasjpfan's #22663 (comment), this implements a minimal setup to run the test suite against several dtypes via a `global_dtype` fixture with a minimal parametrisation.

Any other comments?

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@ogrisel (Member) commented Mar 4, 2022

To avoid merging dead code, could you please use it in at least one test and set the environment variable on one of the CI builds (preferably a fast one)?

@jjerphan (Member, Author) commented Mar 7, 2022

Yes, this is still in draft; I am figuring out which tests to include here based on the discussions in other pull requests.

@jjerphan jjerphan marked this pull request as ready for review March 9, 2022 11:04
@jjerphan (Member, Author) commented Mar 9, 2022

I am just adding a test for this global fixture.
I would prefer to update the set of recent TST PRs accordingly if this PR gets approved.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@ogrisel (Member) left a comment

If we start to use this fixture extensively, it will generate a lot of skipped tests by default, which will make the output very verbose. At some point we might want to avoid the skip mark and instead do:

import os

import numpy as np
import pytest

GLOBAL_DTYPES_TO_TEST = [np.float64]
if os.environ.get("SKLEARN_TESTS_RUN_FLOAT32", "0") == "1":
    print("Enabling float32 tests globally.")
    GLOBAL_DTYPES_TO_TEST.append(np.float32)
else:
    print("Skipping float32 tests globally. Set SKLEARN_TESTS_RUN_FLOAT32=1 to enable.")


@pytest.fixture(params=GLOBAL_DTYPES_TO_TEST)
def global_dtype(request):
    yield request.param

but I think we can keep it this way for now. +1 for merging as is.

@ogrisel ogrisel added the Quick Review For PRs that are quick to review label Mar 10, 2022
@ogrisel (Member) commented Mar 10, 2022

I have an idea to reuse a similar pattern to define a random_seed fixture. I will do a follow-up PR.

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel (Member) commented Mar 10, 2022

@thomasjpfan let's merge?

@thomasjpfan (Member) left a comment

Codewise this looks good. Can we document the variable along with the other ones here?

:SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES:
    When this environment variable is set to a non-zero value, the `Cython`
    directive `boundscheck` is set to `True`. This is useful for finding
    segfaults.
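A doc entry for the new variable in the same style might read (assuming the variable ends up named `SKLEARN_RUN_FLOAT32_TESTS`, mirroring the existing entries; the exact name and wording are not confirmed in this thread):

```rst
:SKLEARN_RUN_FLOAT32_TESTS:
    When this environment variable is set to '1', the tests using the
    `global_dtype` fixture are also run on float32 data. By default,
    they are only run on float64 data.
```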

(A future PR can be to move some of these environment variables out of doc/computing/parallelism.rst and into doc/developers/contributing.rst.)

jjerphan and others added 2 commits March 11, 2022 11:37
So as to have a similar name to `SKLEARN_SKIP_NETWORK_TESTS`.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@jjerphan jjerphan requested a review from thomasjpfan March 11, 2022 13:53
@jeremiedbb (Member) left a comment

I think this fixture is not enough on its own

@jeremiedbb (Member)

In the tests, we usually end up checking that two values are close to each other, and "close" depends on the dtype.

The default rtol of assert_allclose is 1e-7. This is fine for float64, whose machine precision is ~1e-16.
For float32, machine precision is a little bigger than 1e-7, so we can't expect tests to pass when changing the dtype. In that case, rtol needs to be at least 1e-5, or even 1e-4 to account for rounding error accumulation.

There's the same concern about atol. In assert_allclose, atol essentially means that a value below atol is considered equal to zero. In some tests where we want to compare values against 0, rtol is irrelevant and we need to set atol. While 1e-10 could be considered zero in float64, it's way too small for float32.

We need the dtype to come with an associated rtol and atol. The fixture should be something like

"dtype, rtol, atol", [(np.float64, 1e-7, 1e-10), (np.float32, 1e-4, 1e-5)]

The rtol should then be used in all calls to assert_allclose, except in justified cases, where a comment should explain why a different rtol is used.
The atol should only be used when we compare arrays in which some values are expected to be zero.
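Concretely, that parametrisation could be wired into a fixture like this (a sketch; the fixture and variable names are illustrative, not from the PR):

```python
import numpy as np
import pytest

# Bundle each dtype with rtol/atol matched to its machine precision.
DTYPE_TOLERANCES = [
    (np.float64, 1e-7, 1e-10),
    (np.float32, 1e-4, 1e-5),
]


@pytest.fixture(params=DTYPE_TOLERANCES, ids=["float64", "float32"])
def global_dtype_rtol_atol(request):
    # Tests unpack this as: dtype, rtol, atol = global_dtype_rtol_atol
    yield request.param
```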

@thomasjpfan (Member) commented Mar 11, 2022

In PyTorch, they set their own tolerances based on dtype: https://github.com/pytorch/pytorch/blob/1f29b3130af218847a043e58fdc64511bbe072fe/torch/testing/_comparison.py#L43-L51 (They use unittest, so they can use self.assertEqual that sets the tolerances properly)

For our use case, we can have a custom "assert_allclose" that sets the atol and rtol correctly based on the dtypes.
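Such a wrapper might look like the following sketch (the default-selection logic is an assumption for illustration, not necessarily the exact implementation that was merged):

```python
import numpy as np
from numpy.testing import assert_allclose as np_assert_allclose


def assert_allclose(actual, desired, rtol=None, atol=0.0, **kwargs):
    """numpy's assert_allclose, with rtol chosen from the input dtypes.

    If either input is float32, relax rtol to 1e-4; otherwise keep a
    float64-appropriate default of 1e-7.
    """
    dtypes = [np.asarray(a).dtype for a in (actual, desired)]
    if rtol is None:
        rtol = 1e-4 if np.float32 in dtypes else 1e-7
    np_assert_allclose(actual, desired, rtol=rtol, atol=atol, **kwargs)
```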

@jeremiedbb (Member)

I'd be happy with a custom assert_allclose. To me it should be done in this PR, or prior to it, before we start getting a bunch of PRs that add the fixture with ad hoc fixes to make the tests pass.

@ogrisel (Member) commented Mar 13, 2022

+1 as well for a custom assert_allclose in scikit-learn.

@jjerphan (Member, Author) commented Mar 14, 2022

I also think having custom assertions (e.g. assert_allclose) for testing would be convenient.

Currently, assertions in tests are imported both from numpy.testing and from sklearn.utils._testing.
Moreover, sklearn.utils._testing simply declares its assertions by re-exporting the ones from numpy.testing.

In this regard, I think it would be better to import assertions from sklearn.utils._testing rather than from numpy.testing, prior to defining custom versions of those assertions.

What do you think?

@jeremiedbb jeremiedbb dismissed their stale review March 14, 2022 16:35

changes made

@jeremiedbb (Member) left a comment

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

@ogrisel (Member) commented Mar 16, 2022

I suppose the deletion of sklearn/conftest.py was not intentional.

@ogrisel (Member) commented Mar 16, 2022

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

I do not understand.

We need a fixture to enable or disable the float32 tests to avoid doubling the test duration by default, no? We would only run the float32 variant on a few fast CI entries, to spare the slow runners that already take more than 25 min to complete.

Then, if those dtype-parametrized tests use our assert_allclose, the bodies of those test functions will be dtype-agnostic almost magically, no? The only cases where this won't necessarily hold are the tests where we expect zero values, in which case an atol will have to be chosen carefully on a case-by-case basis: #22690 (comment)

jjerphan and others added 5 commits March 17, 2022 08:44
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@jeremiedbb (Member)

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

I do not understand.

Never mind, I thought more about it and I think the custom assert_allclose is better.

@ogrisel (Member) left a comment

Could you please add a test in sklearn/utils/tests/test_testing.py that checks the behavior of the scikit-learn-specific assert_allclose variant?

def test_float32_aware_assert_allclose():
    # The relative tolerance for float32 inputs is 1e-4
    assert_allclose(np.array([1. + 2e-5], dtype=np.float32), 1.)
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1. + 2e-4], dtype=np.float32), 1.)

    # The relative tolerance for other inputs is left to 1e-7 as in
    # the original numpy version.
    assert_allclose(np.array([1. + 2e-8], dtype=np.float64), 1.)
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1. + 2e-7], dtype=np.float64), 1.)

    # atol is left to 0.0 by default, even for float32
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1e-5], dtype=np.float32), 0.)
    assert_allclose(np.array([1e-5], dtype=np.float32), 0., atol=2e-5)

if I am not mistaken (I have not tested).

No need to test the rest of the behavior of assert_allclose that is already tested in the numpy test suite.

@jeremiedbb (Member) left a comment

We also need a bit of doc to explain the difference between rtol and atol, what they mean, and when to set atol. Not sure where to put that. Maybe in contributing.rst?

Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jeremiedbb (Member) left a comment

LGTM. Just a last remark from my side.

Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@ogrisel (Member) left a comment

Final round of nitpicks but other than that LGTM!

@jeremiedbb jeremiedbb merged commit 613773d into scikit-learn:main Mar 17, 2022
@jeremiedbb (Member)

Thanks @jjerphan !

@jjerphan (Member, Author) commented Mar 17, 2022

I made myself suffer for nothing on this one. 😄

@jjerphan jjerphan deleted the float32-test-suite branch March 17, 2022 13:43
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 6, 2022
…-learn#22690)

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>