
TST Add minimal setup to be able to run test suite on float32 #22690


Merged
24 commits merged into scikit-learn:main on Mar 17, 2022

Conversation

@jjerphan (Member) commented Mar 4, 2022

Reference Issues/PRs

Solves #22680.

What does this implement/fix? Explain your changes.

Based on @thomasjpfan's #22663 (comment), this implements a minimal setup to run the test suite against several dtypes via a `global_dtype` fixture with a minimal parametrisation.

Any other comments?

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@ogrisel (Member) commented Mar 4, 2022

To avoid merging dead code, could you please use it in at least one test and set the environment variable on one of the CI builds (preferably a fast one)?

@jjerphan (Member, Author) commented Mar 7, 2022

Yes, this is still in draft; I am figuring out which tests to include here based on the discussions in other pull requests.

@jjerphan jjerphan marked this pull request as ready for review March 9, 2022 11:04
@jjerphan (Member, Author) commented Mar 9, 2022

I am just adding a test for this global fixture.
I would prefer to update the set of recent TST PRs accordingly if this PR gets approved.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@ogrisel (Member) left a comment

If we start to use this fixture extensively, it will generate a lot of skipped tests by default, which will make the output very verbose. At some point we might want to avoid the skip mark and instead do:

import os

import numpy as np
import pytest

GLOBAL_DTYPES_TO_TEST = [np.float64]
if os.environ.get("SKLEARN_TESTS_RUN_FLOAT32", "0") == "1":
    print("Enabling float32 tests globally.")
    GLOBAL_DTYPES_TO_TEST.append(np.float32)
else:
    print("Skipping float32 tests globally. Set SKLEARN_TESTS_RUN_FLOAT32=1 to enable.")


@pytest.fixture(params=GLOBAL_DTYPES_TO_TEST)
def global_dtype(request):
    yield request.param

but I think we can keep it this way for now. +1 for merging as is.

@ogrisel ogrisel added the Quick Review For PRs that are quick to review label Mar 10, 2022
@ogrisel (Member) commented Mar 10, 2022

I have an idea to reuse a similar pattern to define a random_seed fixture. I will do a follow-up PR.

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel (Member) commented Mar 10, 2022

@thomasjpfan let's merge?

@thomasjpfan (Member) left a comment

Codewise this looks good. Can we document the variable along with the other ones here?

:SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES:
    When this environment variable is set to a non-zero value, the `Cython`
    directive `boundscheck` is set to `True`. This is useful for finding
    segfaults.
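A doc entry for the new variable in the same style might read (assuming the variable ends up named `SKLEARN_RUN_FLOAT32_TESTS`, mirroring the existing entries; the exact name and wording are not confirmed in this thread):

```rst
:SKLEARN_RUN_FLOAT32_TESTS:
    When this environment variable is set to '1', the tests using the
    `global_dtype` fixture are also run on float32 data. By default,
    they are only run on float64 data.
```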

(A future PR can be to move some of these environment variables out of doc/computing/parallelism.rst and into doc/developers/contributing.rst.)

jjerphan and others added 2 commits March 11, 2022 11:37
So as to have a similar name to `SKLEARN_SKIP_NETWORK_TESTS`.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@jjerphan jjerphan requested a review from thomasjpfan March 11, 2022 13:53
@jeremiedbb (Member) left a comment

I think this fixture is not enough on its own

@jeremiedbb (Member)

In the tests, we usually end up checking that two values are close to each other, and "close" depends on the dtype.

The default rtol of assert_allclose is 1e-7. This is fine for float64, whose machine precision is ~1e-16.
For float32, machine precision is a little bigger than 1e-7, so we can't expect tests to pass when changing the dtype. In that case, rtol needs to be at least 1e-5, or even 1e-4 to account for rounding error accumulation.

There's the same concern about atol. In assert_allclose, atol essentially means that a value below atol is considered equal to zero. In some tests where we want to compare values against 0, rtol is irrelevant and we need to set atol. While 1e-10 could be considered zero in float64, it's way too small for float32.

We need the dtype to come with an associated rtol and atol. The fixture should be something like

"dtype, rtol, atol", [(np.float64, 1e-7, 1e-10), (np.float32, 1e-4, 1e-5)]

The rtol should then be used in all calls to assert_allclose, except in justified cases, where a comment should explain why a different rtol is used.
The atol should only be used when we compare arrays in which some values are expected to be zero.
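Concretely, that parametrisation could be wired into a fixture like this (a sketch; the fixture and variable names are illustrative, not from the PR):

```python
import numpy as np
import pytest

# Bundle each dtype with rtol/atol matched to its machine precision.
DTYPE_TOLERANCES = [
    (np.float64, 1e-7, 1e-10),
    (np.float32, 1e-4, 1e-5),
]


@pytest.fixture(params=DTYPE_TOLERANCES, ids=["float64", "float32"])
def global_dtype_rtol_atol(request):
    # Tests unpack this as: dtype, rtol, atol = global_dtype_rtol_atol
    yield request.param
```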

@thomasjpfan (Member) commented Mar 11, 2022

In PyTorch, they set their own tolerances based on dtype: https://github.com/pytorch/pytorch/blob/1f29b3130af218847a043e58fdc64511bbe072fe/torch/testing/_comparison.py#L43-L51 (They use unittest, so they can use self.assertEqual that sets the tolerances properly)

For our use case, we can have a custom "assert_allclose" that sets the atol and rtol correctly based on the dtypes.
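Such a wrapper might look like the following sketch (the default-selection logic is an assumption for illustration, not necessarily the exact implementation that was merged):

```python
import numpy as np
from numpy.testing import assert_allclose as np_assert_allclose


def assert_allclose(actual, desired, rtol=None, atol=0.0, **kwargs):
    """numpy's assert_allclose, with rtol chosen from the input dtypes.

    If either input is float32, relax rtol to 1e-4; otherwise keep a
    float64-appropriate default of 1e-7.
    """
    dtypes = [np.asarray(a).dtype for a in (actual, desired)]
    if rtol is None:
        rtol = 1e-4 if np.float32 in dtypes else 1e-7
    np_assert_allclose(actual, desired, rtol=rtol, atol=atol, **kwargs)
```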

@jeremiedbb (Member)

I'd be happy with a custom assert_allclose. To me it should be done in this PR, or prior to it, before we start getting a bunch of PRs that add the fixture with ad hoc fixes to make the tests pass.

@ogrisel (Member) commented Mar 13, 2022

+1 as well for a custom assert_allclose in scikit-learn.

@jjerphan (Member, Author) commented Mar 14, 2022

I also think having custom assertions (e.g. assert_allclose) for testing would be convenient.

Currently, assertions in tests are imported both from numpy.testing and from sklearn.utils._testing.
Moreover, sklearn.utils._testing simply declares its assertions by re-exporting the ones from numpy.testing.

In this regard, I think it would be better to import assertions from sklearn.utils._testing rather than from numpy.testing, prior to defining custom versions of those assertions.

What do you think?

@jeremiedbb jeremiedbb dismissed their stale review March 14, 2022 16:35

changes made

@jeremiedbb (Member) left a comment

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

@ogrisel (Member) commented Mar 16, 2022

I suppose the deletion of sklearn/conftest.py was not intentional.

@ogrisel (Member) commented Mar 16, 2022

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

I do not understand.

We need a fixture to enable or disable the float32 tests to avoid doubling the test duration by default, no? We would only run the float32 variant on a few fast CI entries, to spare the slow runners that already take more than 25 min to complete.

Then, if those dtype-parametrized tests use our assert_allclose, the bodies of those test functions will be dtype-agnostic almost magically, no? The only cases where this won't necessarily hold are the tests where we expect zero values, in which case an atol will have to be chosen carefully on a case-by-case basis: #22690 (comment)

jjerphan and others added 5 commits March 17, 2022 08:44
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@jeremiedbb (Member)

Looking at what we have to do to implement our custom assert_allclose, and the fact that we won't have a dtype-dependent atol, I wonder if it wouldn't be simpler to make the fixture "global_dtype, rtol, atol"?

I do not understand.

Never mind, I thought more about it and I think the custom assert_allclose is better.

@ogrisel (Member) left a comment

Could you please add a test in sklearn/utils/tests/test_testing.py that checks the behavior of the scikit-learn-specific assert_allclose variant?

def test_float32_aware_assert_allclose():
    # The relative tolerance for float32 inputs is 1e-4
    assert_allclose(np.array([1. + 2e-5], dtype=np.float32), 1.)
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1. + 2e-4], dtype=np.float32), 1.)

    # The relative tolerance for other inputs is left to 1e-7 as in
    # the original numpy version.
    assert_allclose(np.array([1. + 2e-8], dtype=np.float64), 1.)
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1. + 2e-7], dtype=np.float64), 1.)

    # atol is left to 0.0 by default, even for float32
    with pytest.raises(AssertionError):
        assert_allclose(np.array([1e-5], dtype=np.float32), 0.)
    assert_allclose(np.array([1e-5], dtype=np.float32), 0., atol=2e-5)

if I am not mistaken (I have not tested).

No need to test the rest of the behavior of assert_allclose that is already tested in the numpy test suite.

@jeremiedbb (Member) left a comment

We also need a bit of doc to explain the difference between rtol and atol, what they mean, and when to set atol. Not sure where to put that. Maybe in contributing.rst?

Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jeremiedbb (Member) left a comment

LGTM. Just a last remark from my side.

Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@ogrisel (Member) left a comment

Final round of nitpicks but other than that LGTM!

@jeremiedbb jeremiedbb merged commit 613773d into scikit-learn:main Mar 17, 2022
@jeremiedbb (Member)

Thanks @jjerphan !

@jjerphan (Member, Author) commented Mar 17, 2022

I made myself suffer for nothing on this one. 😄

@jjerphan jjerphan deleted the float32-test-suite branch March 17, 2022 13:43
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 6, 2022
…-learn#22690)

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>