
Absolute tolerance in approximate equality tests #10562

Closed

Description

@rth

In a number of places in the tests, numpy.testing.assert_allclose is used with its default absolute tolerance, atol=0.

This means in particular that np.testing.assert_allclose(0, 1e-16) will fail. More context on the reasons behind this choice can be found in numpy/numpy#3183 (comment) and PEP 485, which can be summed up as:

If, for a given use case, a user needs to compare to zero, the test will be guaranteed to fail the first time, and the user can select an appropriate value.
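
To illustrate the behaviour described above, here is a minimal reproduction using only numpy.testing (the atol=1e-9 value is just an example of a user-selected tolerance):

```python
import numpy as np
from numpy.testing import assert_allclose

# With the default atol=0, a comparison against zero can never pass for a
# nonzero value, because the tolerance reduces to rtol * |desired| = 0.
try:
    assert_allclose(0.0, 1e-16)  # atol defaults to 0 -> raises
except AssertionError:
    print("fails with the default atol=0")

# Selecting an explicit absolute tolerance makes the comparison meaningful.
assert_allclose(0.0, 1e-16, atol=1e-9)  # passes
```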

The issue is that such tests will occasionally pass, but may then fail on some other platform.

For instance, this test in estimator_checks.py passes CI on master, but then randomly fails on OSX: [1] and [2] correspond to the same commit, one for the PR and one in the forked repo (the test fails with the message below, and not necessarily for the same Python versions):

```
________________________ test_non_meta_estimators[1045] ________________________
args = ('GaussianProcess', GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
  ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
    verbose=False))
[...]
>       assert_allclose(y_pred.ravel(), y_pred_2d.ravel())
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       (mismatch 40.0%)
E        x: array([ -1.089129e-13,   1.000000e+00,   2.000000e+00,   1.272316e-13,
E                1.000000e+00,   2.000000e+00,   5.051515e-14,   1.000000e+00,
E                2.000000e+00,   9.214851e-15])
E        y: array([ -1.086908e-13,   1.000000e+00,   2.000000e+00,   1.155742e-13,
E                1.000000e+00,   2.000000e+00,   5.062617e-14,   1.000000e+00,
E                2.000000e+00,   9.325873e-15])
```

When atol is used, the chosen values are not very consistent across tests.

As for sklearn.utils.testing.assert_allclose_dense_sparse, its default is atol=1e-9, not 0.
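
To make the difference in defaults concrete, a small sketch (assuming the import path and the atol=1e-9 default of assert_allclose_dense_sparse at the time of writing):

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.utils.testing import assert_allclose_dense_sparse

x = np.array([0.0, 1.0])
y = np.array([1e-16, 1.0])

# Passes: the default atol=1e-9 absorbs the 1e-16 difference against zero.
assert_allclose_dense_sparse(x, y)

# Fails: numpy's assert_allclose uses atol=0 by default.
try:
    assert_allclose(x, y)
except AssertionError:
    print("numpy assert_allclose fails with its default atol=0")
```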

While the necessary absolute tolerance is test-dependent, it might still be useful to

  • have a default nonzero value (e.g. 1e-9, say a DEFAULT_ATOL constant in sklearn.utils.testing) when one is needed, except for the cases where it has to be increased for specific reasons (see the sketch after this list).
  • use atol when it is clearly reasonable to do so (e.g. when comparing probabilities).
  • make sklearn.utils.testing.assert_allclose_dense_sparse and sklearn.utils.testing.assert_allclose have the same default atol.
  • check that we don't have floating-point equalities with 0.0 even if the CI tests pass ([1], [2], [3], ...).
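
A minimal sketch of what the first and third points could look like; DEFAULT_ATOL and the wrapper are only illustrative, and the actual value would need to be agreed upon:

```python
import numpy as np

# Hypothetical project-wide default discussed above; the exact value is open.
DEFAULT_ATOL = 1e-9


def assert_allclose(actual, desired, rtol=1e-7, atol=DEFAULT_ATOL, err_msg=''):
    """Thin wrapper around numpy.testing.assert_allclose with a nonzero
    default absolute tolerance, so that comparisons against 0.0 do not fail
    spuriously. Individual tests can still pass a different atol."""
    np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol,
                               err_msg=err_msg)


# Passes with the nonzero default atol; would fail with numpy's atol=0.
assert_allclose(np.zeros(3), np.full(3, 1e-16))
```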

This might help improve the numerical stability of the tests and prevent some of the test failures on less common platforms. cc @yarikoptic

What do you think?
