Consistent failure on OSX in draft 0.20.2 #12823

Closed
jnothman opened this issue Dec 18, 2018 · 8 comments

@jnothman
Member

While trying to build wheels for 0.20.2, all Mac builds have failed (e.g. https://travis-ci.org/MacPython/scikit-learn-wheels/jobs/469752588) with:


___________________ test_pca_dtype_preservation[randomized] ____________________
svd_solver = 'randomized'
    @pytest.mark.parametrize('svd_solver', solver_list)
    def test_pca_dtype_preservation(svd_solver):
>       check_pca_float_dtype_preservation(svd_solver)
svd_solver = 'randomized'
../venv/lib/python2.7/site-packages/sklearn/decomposition/tests/test_pca.py:707: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
svd_solver = 'randomized'
    def check_pca_float_dtype_preservation(svd_solver):
        # Ensure that PCA does not upscale the dtype when input is float32
        X_64 = np.random.RandomState(0).rand(1000, 4).astype(np.float64)
        X_32 = X_64.astype(np.float32)
    
        pca_64 = PCA(n_components=3, svd_solver=svd_solver,
                     random_state=0).fit(X_64)
        pca_32 = PCA(n_components=3, svd_solver=svd_solver,
                     random_state=0).fit(X_32)
    
        assert pca_64.components_.dtype == np.float64
        assert pca_32.components_.dtype == np.float32
        assert pca_64.transform(X_64).dtype == np.float64
        assert pca_32.transform(X_32).dtype == np.float32
    
        assert_array_almost_equal(pca_64.components_, pca_32.components_,
>                                 decimal=5)
E       AssertionError: 
E       Arrays are not almost equal to 5 decimals
E       
E       (mismatch 16.6666666667%)
E        x: array([[ 0.62022,  0.15983, -0.38317, -0.66555],
E              [ 0.26318,  0.24085,  0.90801, -0.21966],
E              [-0.12498, -0.88109,  0.16727, -0.42437]])
E        y: array([[ 0.62022,  0.15983, -0.38317, -0.66555],
E              [ 0.26318,  0.24084,  0.90801, -0.21967],
E              [-0.12498, -0.88109,  0.16726, -0.42436]], dtype=float32)
X_32       = array([[ 0.54881352,  0.71518934,  0.60276335,  0.54488319],
       [ 0.423654...],
       [ 0.43487364,  0.83000296,  0.93280619,  0.30833843]], dtype=float32)
X_64       = array([[ 0.5488135 ,  0.71518937,  0.60276338,  0.54488318],
       [ 0.423654...91,  0.34963937],
       [ 0.43487363,  0.83000295,  0.93280618,  0.30833843]])
pca_32     = PCA(copy=True, iterated_power='auto', n_components=3, random_state=0,
  svd_solver='randomized', tol=0.0, whiten=False)
pca_64     = PCA(copy=True, iterated_power='auto', n_components=3, random_state=0,
  svd_solver='randomized', tol=0.0, whiten=False)
svd_solver = 'randomized'

I can't see any change in 0.20.2 that could have caused this new failure.

@qinhanmin2014
Member

I guess this is likely to be a scipy issue: we always use the latest version of scipy, and scipy just released 1.2.0. But I don't have a Mac to test on.

@qinhanmin2014
Copy link
Member

@jnothman xfail for mac or set decimal=4?

@jnothman
Member Author

If we can quickly get a build on OSX here, then we can see if decimal=4 works without fussing around with submodule commits for scikit-learn-wheels.
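To illustrate what loosening the tolerance to `decimal=4` would mean, here is a minimal sketch (the arrays below are hypothetical stand-ins for the mismatched components, not the actual test data; the injected discrepancy mimics the ~1e-5 float32 error seen in the traceback):

```python
import numpy as np
from numpy.testing import assert_array_almost_equal

# Hypothetical stand-in for the PCA components comparison: two arrays
# that agree to ~4 decimals but not 5, mimicking the float32 vs
# float64 randomized-SVD mismatch from the failing test.
a64 = np.array([0.62022, 0.26318, -0.12498], dtype=np.float64)
a32 = (a64 + 2e-5).astype(np.float32)  # inject a ~2e-5 discrepancy

# decimal=5 would fail on this discrepancy; decimal=4 tolerates it.
assert_array_almost_equal(a64, a32, decimal=4)
```

Note that `assert_array_almost_equal(..., decimal=d)` checks `abs(desired - actual) < 1.5 * 10**-d` elementwise, so dropping from 5 to 4 raises the absolute tolerance by an order of magnitude.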

jnothman added a commit to jnothman/scikit-learn that referenced this issue Dec 19, 2018
@qinhanmin2014
Member

leave it open for further investigation?
We're able to reproduce the error on Mac just by updating scipy from 0.17.0 to 1.1.0, so it does seem to be a scipy issue. It would be best if someone could figure out what's happening in scipy.

@jnothman
Member Author

I don't think it's an unreasonable loss of precision for float32, but perhaps we should make them aware.

@ogrisel
Member

ogrisel commented Dec 19, 2018

Maybe also try with the scipy 1.2.0 wheel that now includes openblas?

@qinhanmin2014
Member

> Maybe also try with the scipy 1.2.0 wheel that now includes openblas?

I think we're using scipy 1.2.0 in scikit-learn-wheels when releasing 0.20.2.

@jeremiedbb
Member

> leave it open for further investigation?

I don't think there's a need for further investigation. We can't expect the same level of accuracy when comparing against float32. The default rtol we use for float64 comparisons is 1e-7; for float32, a rtol between 1e-3 and 1e-4 is reasonable, since the number of significant digits is roughly halved.
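The point above can be demonstrated with a relative-tolerance comparison. This is a sketch, not the scikit-learn test itself: it compares singular values of the same random data (the operation underlying the PCA components) at both precisions, using `assert_allclose` with a float32-appropriate rtol. Singular values are used rather than singular vectors to sidestep the sign ambiguity of SVD.

```python
import numpy as np

rng = np.random.RandomState(0)
X64 = rng.rand(1000, 4)            # float64 data, as in the failing test
X32 = X64.astype(np.float32)       # same data at float32 precision

# SVD at each precision; numpy computes in the input dtype.
s64 = np.linalg.svd(X64, compute_uv=False)
s32 = np.linalg.svd(X32, compute_uv=False)

# rtol=1e-7 is appropriate for float64-vs-float64 comparisons;
# float32 carries only ~7 significant digits, so 1e-3..1e-4 is the
# realistic level of agreement one can demand here.
np.testing.assert_allclose(s64, s32, rtol=1e-4)
```

A float64-level tolerance (`rtol=1e-7`) on this comparison would be at the very edge of float32 representation error, which is essentially what the original `decimal=5` assertion ran into.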
