
Absolute tolerance in approximate equality tests #10562


Closed
rth opened this issue Jan 31, 2018 · 10 comments

@rth
Member

rth commented Jan 31, 2018

In a number of places in tests, numpy.testing.assert_allclose is used with default absolute tolerance parameter which is atol=0.

This means in particular that np.testing.assert_allclose(0, 1e-16) will fail. More context on the reasons behind this choice can be found in numpy/numpy#3183 (comment) and PEP485, which can be summed up as:

If, for a given use case, a user needs to compare to zero, the test will be guaranteed to fail the first time, and the user can select an appropriate value.

The issue is that occasionally the tests will pass, but then may fail on some other platform.
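The zero-comparison behaviour is easy to reproduce (a minimal sketch using numpy; the atol=1e-9 value is just an illustration):

```python
import numpy as np

# With the default atol=0, the tolerance reduces to rtol * |desired|,
# which is itself ~0 when the desired value is (numerically) zero,
# so any nonzero actual value fails the comparison.
try:
    np.testing.assert_allclose(1e-16, 0)
except AssertionError:
    print("fails with the default atol=0")

# An explicit absolute tolerance makes comparisons near zero meaningful.
np.testing.assert_allclose(1e-16, 0, atol=1e-9)
print("passes with atol=1e-9")
```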

For instance, this test in estimator_checks.py passes CI on master, but then randomly fails on OSX: [1] and [2] correspond to the same commit, one for the PR and one in the forked repo (the test fails with the message below, and not necessarily for the same Python versions):

```
________________________ test_non_meta_estimators[1045] ________________________
args = ('GaussianProcess', GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
  ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
    verbose=False))
[...]
>       assert_allclose(y_pred.ravel(), y_pred_2d.ravel())
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       (mismatch 40.0%)
E        x: array([ -1.089129e-13,   1.000000e+00,   2.000000e+00,   1.272316e-13,
E                1.000000e+00,   2.000000e+00,   5.051515e-14,   1.000000e+00,
E                2.000000e+00,   9.214851e-15])
E        y: array([ -1.086908e-13,   1.000000e+00,   2.000000e+00,   1.155742e-13,
E                1.000000e+00,   2.000000e+00,   5.062617e-14,   1.000000e+00,
E                2.000000e+00,   9.325873e-15])
```

When atol is used, its value is not very consistent across tests.

As for sklearn.utils.testing.assert_allclose_dense_sparse, it has by default atol=1e-9, not 0.

While the necessary absolute tolerance is test dependent, it might still be useful to

  • have a default value (e.g. 1e-9) when it's needed (e.g. a DEFAULT_ATOL constant in sklearn.utils.testing), except for cases where it has to be increased for specific reasons.
  • use atol when it's definitely reasonable to do so (e.g. in probability equalities)
  • make sklearn.utils.testing.assert_allclose_dense_sparse and sklearn.utils.testing.assert_allclose have the same default atol.
  • check that we don't have floating-point equality comparisons with 0.0, even if the CI tests pass ([1], [2], [3] ..)
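The first and third bullets could be sketched as follows (hypothetical: the DEFAULT_ATOL constant and this wrapper illustrate the proposal and are not existing scikit-learn code):

```python
import numpy as np

# Hypothetical project-wide default absolute tolerance for the test suite.
DEFAULT_ATOL = 1e-9

def assert_allclose(actual, desired, rtol=1e-7, atol=DEFAULT_ATOL):
    """numpy's assert_allclose, but with a nonzero default atol."""
    np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol)

# Predictions that are numerically zero (as in the failure above) now
# compare equal instead of failing on the relative tolerance alone:
assert_allclose(np.array([-1.089129e-13, 1.0]),
                np.array([-1.086908e-13, 1.0]))
```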

This might help improve the numerical stability of tests and prevent some test failures on less common platforms. cc @yarikoptic

What do you think?

@jnothman
Member

jnothman commented Jan 31, 2018 via email

@glemaitre
Member

Not sure it is 100% relevant to this issue, but we are using assert_array_almost_equal in a lot of places.
As suggested in the numpy docs, I would think we should change those to use assert_allclose.
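For context, the two helpers use different criteria (a minimal sketch; the numeric values are illustrative):

```python
import numpy as np

# assert_array_almost_equal uses a purely absolute criterion,
# abs(actual - desired) < 1.5 * 10**-decimal (decimal=6 by default),
# so a 100% relative error between two tiny values still passes:
np.testing.assert_array_almost_equal(1e-7, 2e-7)

# assert_allclose uses a relative criterion (rtol=1e-7 by default,
# atol=0) and rejects the same pair:
try:
    np.testing.assert_allclose(1e-7, 2e-7)
except AssertionError:
    print("assert_allclose rejects the 100% relative error")
```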

@jnothman
Member

I don't think we benefit from changing everything to assert_allclose, and we would run the risk of finding that tests fail on some platforms etc., although we can recommend it for new tests.

I'm still somewhat undecided about @rth's proposal here.

@rth
Member Author

rth commented Feb 27, 2018

#10280 (comment) is also a good point by @jnothman

we would run the risk of finding that tests fail on some platforms etc., although we can recommend it for new tests.

yes, I think it would be useful to at least have some recommendations in contributors docs of what to use for new tests.

BTW, for the record, assert x == pytest.approx(y, rel=None, abs=None) will use rel=1e-6 and abs=1e-12; cf. the pytest docs.
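A quick check of those defaults (a sketch; the tolerance values are the documented pytest defaults):

```python
import pytest

# pytest.approx combines rel=1e-6 with a nonzero abs=1e-12 by default,
# so comparisons against zero behave sensibly out of the box:
assert 1e-13 == pytest.approx(0)          # within abs=1e-12
assert 1.0 + 1e-7 == pytest.approx(1.0)   # within rel=1e-6
assert not (1e-11 == pytest.approx(0))    # outside abs=1e-12
```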

@jnothman
Member

jnothman commented Feb 27, 2018 via email

@rth
Member Author

rth commented Feb 27, 2018

Not arrays, just floats.

@gamazeps
Contributor

I could reproduce this issue on master:

================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________ [doctest] sklearn.svm.classes.LinearSVC __________________________________________________________________________________
116     >>> from sklearn.svm import LinearSVC
117     >>> from sklearn.datasets import make_classification
118     >>> X, y = make_classification(n_features=4, random_state=0)
119     >>> clf = LinearSVC(random_state=0)
120     >>> clf.fit(X, y)
121     LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
122          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
123          multi_class='ovr', penalty='l2', random_state=0, tol=0.0001,
124          verbose=0)
125     >>> print(clf.coef_)
Expected:
    [[ 0.08551385  0.39414796  0.49847831  0.37513797]]
Got:
    [[ 0.08551956  0.39414641  0.49848001  0.37514411]]

/Users/gamazeps/dev/scikit-learn/sklearn/svm/classes.py:125: DocTestFailure
_________________________________________________________________________________ [doctest] sklearn.svm.classes.LinearSVR __________________________________________________________________________________
326     --------
327     >>> from sklearn.svm import LinearSVR
328     >>> from sklearn.datasets import make_regression
329     >>> X, y = make_regression(n_features=4, random_state=0)
330     >>> regr = LinearSVR(random_state=0)
331     >>> regr.fit(X, y)
332     LinearSVR(C=1.0, dual=True, epsilon=0.0, fit_intercept=True,
333          intercept_scaling=1.0, loss='epsilon_insensitive', max_iter=1000,
334          random_state=0, tol=0.0001, verbose=0)
335     >>> print(regr.coef_)
Expected:
    [ 16.35750999  26.91499923  42.30652207  60.47843124]
Got:
    [ 16.36245986  26.91619359  42.30787928  60.47656073]

/Users/gamazeps/dev/scikit-learn/sklearn/svm/classes.py:335: DocTestFailure
______________________________________________________________________________________ test_non_meta_estimators[1071] ______________________________________________________________________________________

args = ('GaussianProcess', GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
      ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
        verbose=False)), kwargs = {}

    @wraps(fn)
    def wrapper(*args, **kwargs):
        # very important to avoid uncontrolled state propagation
        clean_warning_registry()
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", self.category)
>           return fn(*args, **kwargs)

args       = ('GaussianProcess', GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
      ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
        verbose=False))
fn         = <function check_supervised_y_2d at 0x10b20b758>
kwargs     = {}
self       = _IgnoreWarnings(record=True)

sklearn/utils/testing.py:326:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

name = 'GaussianProcess', estimator_orig = GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
      ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
        verbose=False)

    @ignore_warnings(category=(DeprecationWarning, FutureWarning))
    def check_supervised_y_2d(name, estimator_orig):
        if "MultiTask" in name:
            # These only work on 2d, so this test makes no sense
            return
        rnd = np.random.RandomState(0)
        X = pairwise_estimator_convert_X(rnd.uniform(size=(10, 3)), estimator_orig)
        y = np.arange(10) % 3
        estimator = clone(estimator_orig)
        set_random_state(estimator)
        # fit
        estimator.fit(X, y)
        y_pred = estimator.predict(X)

        set_random_state(estimator)
        # Check that when a 2D y is given, a DataConversionWarning is
        # raised
        with warnings.catch_warnings(record=True) as w:
            warnings.simplefilter("always", DataConversionWarning)
            warnings.simplefilter("ignore", RuntimeWarning)
            estimator.fit(X, y[:, np.newaxis])
        y_pred_2d = estimator.predict(X)
        msg = "expected 1 DataConversionWarning, got: %s" % (
            ", ".join([str(w_x) for w_x in w]))
        if name not in MULTI_OUTPUT:
            # check that we warned if we don't support multi-output
            assert_greater(len(w), 0, msg)
            assert_true("DataConversionWarning('A column-vector y"
                        " was passed when a 1d array was expected" in msg)
>       assert_allclose(y_pred.ravel(), y_pred_2d.ravel())
E       AssertionError:
E       Not equal to tolerance rtol=1e-07, atol=0
E
E       (mismatch 40.0%)
E        x: array([ -8.559820e-14,   1.000000e+00,   2.000000e+00,   1.150191e-13,
E                1.000000e+00,   2.000000e+00,   5.606626e-14,   1.000000e+00,
E                2.000000e+00,   2.065015e-14])
E        y: array([ -8.548717e-14,   1.000000e+00,   2.000000e+00,   1.152411e-13,
E                1.000000e+00,   2.000000e+00,   5.617729e-14,   1.000000e+00,
E                2.000000e+00,   2.076117e-14])

X          = array([[ 0.5488135 ,  0.71518937,  0.60276338],
       [ 0.54488318,  0.423654...43,  0.63992102,  0.14335329],
       [ 0.94466892,  0.52184832,  0.41466194]])
estimator  = GaussianProcess(beta0=None,
        corr=<function squared_exponential at 0x10...full',
        theta0=array([[ 0.1]]), thetaL=None, thetaU=None, verbose=False)
estimator_orig = GaussianProcess(beta0=None, corr='squared_exponential', normalize=True,
      ...orage_mode='full', theta0=0.1, thetaL=None, thetaU=None,
        verbose=False)
msg        = 'expected 1 DataConversionWarning, got: '
name       = 'GaussianProcess'
rnd        = <mtrand.RandomState object at 0x109b44f00>
w          = []
y          = array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
y_pred     = array([ -8.55981952e-14,   1.00000000e+00,   2.00000000e+00,
         1.150191...  5.60662627e-14,   1.00000000e+00,   2.00000000e+00,
         2.06501483e-14])
y_pred_2d  = array([[ -8.54871729e-14],
       [  1.00000000e+00],
       [  2.00000000e+00...     [  1.00000000e+00],
       [  2.00000000e+00],
       [  2.07611706e-14]])

sklearn/utils/estimator_checks.py:1523: AssertionError
==================================================================== 3 failed, 8671 passed, 83 skipped, 616 warnings in 284.72 seconds =====================================================================
make: *** [test-code] Error 1

My HEAD is at eed8379.

(sk) ➜  scikit-learn git:(master) ✗ pip freeze
attrs==17.4.0
Cython==0.27.3
funcsigs==1.0.2
numpy==1.14.1
pluggy==0.6.0
py==1.5.2
pytest==3.4.2
scipy==1.0.0
six==1.11.0
(sk) ➜  scikit-learn git:(master) ✗ python --version
Python 2.7.10

These tests keep on failing despite a rebuild.

I just cloned, fetched the pip dependencies, and then executed python setup.py && make

@amueller
Member

Do you want to tag this as help wanted, or just do it yourself?

@rth
Member Author

rth commented Jul 2, 2018

Thanks for mentioning it @gamazeps !

do you want to tag as help wanted or just do it yourself?

Will make a PR to fix the issues discussed in #9485 (comment) and #10561; as for the rest of this issue, there doesn't seem to be a clear consensus on what should be done (e.g. #10562 (comment)).

@rth
Member Author

rth commented Mar 18, 2020

Closing as "potentially problematic, but no consensus on actions needed".
