Description
In a number of places in the tests, `numpy.testing.assert_allclose` is used with the default absolute tolerance, which is `atol=0`.
This means in particular that `np.testing.assert_allclose(0, 1e-16)` will fail. More context on the reasons behind this choice can be found in numpy/numpy#3183 (comment) and PEP 485, which can be summed up as,

> If, for a given use case, a user needs to compare to zero, the test will be guaranteed to fail the first time, and the user can select an appropriate value.
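A minimal illustration of this behavior (the tolerance value `1e-9` below is just a plausible choice, not one prescribed by numpy):

```python
import numpy as np

# With the default atol=0, any nonzero value compared against exact zero
# fails, since the check is |actual - desired| <= atol + rtol * |desired|
# and both terms on the right are zero when desired == 0.
try:
    np.testing.assert_allclose(0, 1e-16)
except AssertionError:
    print("fails with default atol=0")

# Supplying an explicit absolute tolerance makes the comparison pass.
np.testing.assert_allclose(0, 1e-16, atol=1e-9)
print("passes with atol=1e-9")
```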
The issue is that occasionally the tests will pass, but then may fail on some other platform.
For instance, this test in `estimator_checks.py` passes CI on master, but then randomly fails on OSX: [1], [2] correspond to the same commit, one for the PR, one in the forked repo (this test fails with the message below, and not necessarily for the same Python versions),
```
________________________ test_non_meta_estimators[1045] ________________________
args = ('GaussianProcess', GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
verbose=False))
[...]
> assert_allclose(y_pred.ravel(), y_pred_2d.ravel())
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=0
E
E (mismatch 40.0%)
E x: array([ -1.089129e-13, 1.000000e+00, 2.000000e+00, 1.272316e-13,
E 1.000000e+00, 2.000000e+00, 5.051515e-14, 1.000000e+00,
E 2.000000e+00, 9.214851e-15])
E y: array([ -1.086908e-13, 1.000000e+00, 2.000000e+00, 1.155742e-13,
E 1.000000e+00, 2.000000e+00, 5.062617e-14, 1.000000e+00,
E 2.000000e+00, 9.325873e-15])
```
- this test in `test_gradient_boosting_loss_functions.py` compares floats to 0 with `atol=0` but doesn't fail.
- #10561 (`sklearn/cluster/tests/test_mean_shift.py::test_estimate_bandwidth_1sample` failing on the ppc64le platform) is another manifestation of this.
When `atol` is used, its value is not very consistent across tests.
As for `sklearn.utils.testing.assert_allclose_dense_sparse`, its default is `atol=1e-9`, not 0.
While the necessary absolute tolerance is test dependent, it might still be useful to

- have a default value (e.g. 1e-9) when it's needed (e.g. a `DEFAULT_ATOL` in `sklearn.utils.testing`), except for the cases when it has to be increased for specific reasons,
- use `atol` when it's definitely reasonable to do so (e.g. in probability equalities),
- make `sklearn.utils.testing.assert_allclose_dense_sparse` and `sklearn.utils.testing.assert_allclose` have the same default `atol`,
- check that we don't have floating point equalities with 0.0 even if CI tests pass ([1], [2], [3] ...).
This might help improve the numerical stability of the tests and prevent some of the test failures on less common platforms. cc @yarikoptic

What do you think?