[MRG] Adding min. req. for numpy and scipy to circleci #10557

jotasi · 2018-01-31T06:03:24Z

Reference Issues/PRs

See discussion in #10527

What does this implement/fix? Explain your changes.

As suggested by @lesteve in his comment in the discussion about #10527, this PR specifies the numpy and scipy versions installed in the python 2 build of circleCI. They are set to the minimum requirements (numpy>=1.8.2, scipy>=0.13.3) to verify that the examples all run with those minimum requirements as well.

jnothman

Maybe add a comment on the python2 line saying "Test examples run with minimal dependencies."

lesteve · 2018-02-01T09:04:42Z

Merging, thanks a lot!

qinhanmin2014 · 2018-02-01T15:00:33Z

ping @jnothman @lesteve @jotasi
Seems that this one has made python2 build fail in master (maybe because we don't build the examples in PR, right?)
See https://circleci.com/gh/scikit-learn/scikit-learn/17795

Unexpected failing examples:
/home/circleci/project/examples/gaussian_process/plot_gpc_xor.py failed leaving traceback:
Traceback (most recent call last):
  File "/home/circleci/project/examples/gaussian_process/plot_gpc_xor.py", line 35, in <module>
    clf = GaussianProcessClassifier(kernel=kernel, warm_start=True).fit(X, Y)
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 615, in fit
    self.base_estimator_.fit(X, y)
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 210, in fit
    self.kernel_.bounds)]
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 428, in _constrained_optimization
    fmin_l_bfgs_b(obj_func, initial_theta, bounds=bounds)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py", line 185, in fmin_l_bfgs_b
    **opts)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py", line 314, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py", line 263, in func_and_grad
    f = fun(x, *args)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 59, in __call__
    fg = self.fun(x, *args)
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 202, in obj_func
    theta, eval_gradient=True)
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 346, in log_marginal_likelihood
    self._posterior_mode(K, return_temporaries=True)
  File "/home/circleci/project/sklearn/gaussian_process/gpc.py", line 399, in _posterior_mode
    L = cholesky(B, lower=True)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/linalg/decomp_cholesky.py", line 81, in cholesky
    check_finite=check_finite)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/scipy/linalg/decomp_cholesky.py", line 20, in _cholesky
    a1 = asarray_chkfinite(a)
  File "/home/circleci/miniconda/envs/testenv/lib/python2.7/site-packages/numpy/lib/function_base.py", line 595, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

jnothman · 2018-02-01T20:18:57Z

ahh right ... revert and fix?

jotasi · 2018-02-02T06:07:11Z

I'm very sorry for breaking. I wasn't aware that the examples aren't build in PR-tests and thus did not test whether all examples still worked.

I traced the reason for the error and it turns out to be due to a bug in scipy.special.expit (expit(x)=1/(1-exp(-x)) which used to eval to nan for large numbers (approx. greater than 700) instead of 1. It was fixed in February of 2014 and is thus still present in SciPy 0.13.3 but fixed in SciPy 0.14.0.

The problem arises here:

scikit-learn/sklearn/gaussian_process/gpc.py

Lines 391 to 417 in c5ef50a

    
           for _ in range(self.max_iter_predict): 
        
               # Line 4 
        
               pi = expit(f) 
        
               W = pi * (1 - pi) 
        
               # Line 5 
        
               W_sr = np.sqrt(W) 
        
               W_sr_K = W_sr[:, np.newaxis] * K 
        
               B = np.eye(W.shape[0]) + W_sr_K * W_sr 
        
               L = cholesky(B, lower=True) 
        
               # Line 6 
        
               b = W * f + (self.y_train_ - pi) 
        
               # Line 7 
        
               a = b - W_sr * cho_solve((L, True), W_sr_K.dot(b)) 
        
               # Line 8 
        
               f = K.dot(a) 
        
               # Line 10: Compute log marginal likelihood in loop and use as 
        
               #          convergence criterion 
        
               lml = -0.5 * a.T.dot(f) \ 
        
                   - np.log(1 + np.exp(-(self.y_train_ * 2 - 1) * f)).sum() \ 
        
                   - np.log(np.diag(L)).sum() 
        
               # Check if we have converged (log marginal likelihood does 
        
               # not decrease) 
        
               # XXX: more complex convergence criterion 
        
               if lml - log_marginal_likelihood < 1e-10: 
        
                   break 
        
               log_marginal_likelihood = lml

when one element of f exceeds 717 and thus expit returns an array containing a nan. Then pi->W->W_sr->W_sr_K->B all contain nan's and thus the call to choleskyfails.

So I guess for this example to properly run, the minimum SciPy version is 0.14.0.

… build (#10557)" This reverts commit e7ecdb0.

lesteve · 2018-02-02T07:39:11Z

Sorry I should have checked the generated documentation. The doc only gets build in PRs if you modify one of the examples or if you have [doc build] in your commit message.

Unless there is a straightfoward fix, I would be in favour of reverting the commit until we figure out what to do about it.

jotasi · 2018-02-02T07:47:02Z

Oh, good to know. I'll remember that next time!

I don't really see a straightforward fix, except for bumping the SciPy version up to 0.14.0 as otherwise one would either need to tweak the example to not trigger the bug or even change scikit-learn/sklearn/gaussian_process/gpc.py to take care of it. On the other hand, bumping SciPy up to 0.14.0 would lead to the minimum SciPy version needed for the examples differing from the minimum version required for scikit-learn per documentation (and the reasoning to support the package versions in ubuntu-trusty).

qinhanmin2014 · 2018-02-02T08:29:30Z

ping @lesteve Seems that you commit in a separate branch, not in master?

lesteve · 2018-02-02T09:27:35Z

Should be reverted in 84488b5.

… build (#10557)" This reverts commit e7ecdb0. The examples are failing with Scipy 0.13.3 and we need to investigate.

Adding min. req. for numpy and scipy to circleci

dbaaa98

jnothman approved these changes Jan 31, 2018

View reviewed changes

State reasoning for the versions

0c3fc74

jotasi changed the title ~~Adding min. req. for numpy and scipy to circleci~~ [MRG] Adding min. req. for numpy and scipy to circleci Feb 1, 2018

lesteve merged commit e7ecdb0 into scikit-learn:master Feb 1, 2018

lesteve added a commit that referenced this pull request Feb 2, 2018

Revert "Use minimal versions for numpy and scipy on CircleCI Python 2…

dbfbbfb

… build (#10557)" This reverts commit e7ecdb0.

lesteve added a commit that referenced this pull request Feb 2, 2018

Revert "Use minimal versions for numpy and scipy on CircleCI Python 2…

84488b5

… build (#10557)" This reverts commit e7ecdb0. The examples are failing with Scipy 0.13.3 and we need to investigate.

jotasi mentioned this pull request Feb 8, 2018

[MRG + 1] Remove pilutil from examples #10527

Merged

lesteve mentioned this pull request Feb 12, 2018

[MRG+1] Use almost minimal dependencies in Python 2 build on CircleCI #10627

Merged

jotasi deleted the specify_ci_versions branch February 15, 2018 13:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[MRG] Adding min. req. for numpy and scipy to circleci #10557

[MRG] Adding min. req. for numpy and scipy to circleci #10557

Uh oh!

jotasi commented Jan 31, 2018

Uh oh!

jnothman left a comment

Uh oh!

lesteve commented Feb 1, 2018

Uh oh!

qinhanmin2014 commented Feb 1, 2018

Uh oh!

jnothman commented Feb 1, 2018 via email

Uh oh!

jotasi commented Feb 2, 2018

Uh oh!

lesteve commented Feb 2, 2018

Uh oh!

jotasi commented Feb 2, 2018

Uh oh!

qinhanmin2014 commented Feb 2, 2018

Uh oh!

lesteve commented Feb 2, 2018

Uh oh!

Uh oh!

Uh oh!

[MRG] Adding min. req. for numpy and scipy to circleci #10557

[MRG] Adding min. req. for numpy and scipy to circleci #10557

Uh oh!

Conversation

jotasi commented Jan 31, 2018

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Uh oh!

jnothman left a comment

Choose a reason for hiding this comment

Uh oh!

lesteve commented Feb 1, 2018

Uh oh!

qinhanmin2014 commented Feb 1, 2018

Uh oh!

jnothman commented Feb 1, 2018 via email

Uh oh!

jotasi commented Feb 2, 2018

Uh oh!

lesteve commented Feb 2, 2018

Uh oh!

jotasi commented Feb 2, 2018

Uh oh!

qinhanmin2014 commented Feb 2, 2018

Uh oh!

lesteve commented Feb 2, 2018

Uh oh!

Uh oh!