
0.20rc1; fix not accessing correct installed sklearn path in AppVeyor #7

Merged: 11 commits into MacPython:master on Aug 30, 2018

Conversation

@jnothman (Collaborator) commented Aug 22, 2018

Try to fix AppVeyor testing.

@jnothman changed the title from "Get more verbosity from pip install" to "Get more verbosity from pip install in appveyor" (Aug 22, 2018)
@jnothman (Collaborator Author)

Comparing the logs from the latest working AppVeyor build and this one, I think I'm barking up the wrong tree, but I might continue debugging in this PR.

@amueller (Contributor)

Well, the change was using pytest rather than nosetests (or at least that was the intention). But I know very little about what was going on here in the first place...

@jnothman (Collaborator Author)

Well, I can see that the error seems to occur because it's reading from the wrong sklearn path, but I can't entirely understand why. So I'm trying just removing the checked-out sklearn directory; we'll see in about two minutes whether that succeeds.

@jnothman (Collaborator Author)

It's working!

But we do have test failures.

@jnothman (Collaborator Author)

Several failures, it seems, on the

PYTHON_VERSION=2.7.11, PYTHON_ARCH=32

run, including so far:

sklearn\cluster\bicluster.py FF                                          [  0%]
sklearn\cluster\k_means_.py FF                                           [  0%]
sklearn\cluster\mean_shift_.py F                                         [  0%]
sklearn\feature_extraction\text.py F.F                                   [ 10%]
sklearn\metrics\classification.py ....F...........                       [ 22%]

:(

@amueller (Contributor)

yay 32bit!

@amueller (Contributor)

(my plan is to be somewhat online at least until the RC is out btw)

@jnothman (Collaborator Author) commented Aug 22, 2018

Failures are all doctest failures (ping @adrinjalali?):

For example:

413     >>> clustering.row_labels_
Expected:
    array([1, 1, 1, 0, 0, 0], dtype=int32)
Got:
    array([1, 1, 1, 0, 0, 0])
797     >>> print(vectorizer.get_feature_names())
Expected:
    ['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
Got:
    [u'and', u'document', u'first', u'is', u'one', u'second', u'the', u'third', u'this']
c:\python27\lib\site-packages\sklearn\feature_extraction\text.py:797: DocTestFailure

234     >>> confusion_matrix(y_true, y_pred)
Differences (unified diff with -expected +actual):
    @@ -1,3 +1,3 @@
     array([[2, 0, 0],
            [0, 0, 1],
    -       [1, 0, 2]])
    +       [1, 0, 2]], dtype=int64)
c:\python27\lib\site-packages\sklearn\metrics\classification.py:234: DocTestFailure

Apart from these platform-dependent things, we also get a doctest failure in MeanShift, which may be unstable? I can only guess that it was always unstable across platforms and we've only seen it now because we have a doctest. Not sure how to debug this.

356     >>> from sklearn.cluster import MeanShift
357     >>> import numpy as np
358     >>> X = np.array([[1, 1], [2, 1], [1, 0],
359     ...               [4, 7], [3, 5], [3, 6]])
360     >>> clustering = MeanShift(bandwidth=2).fit(X)
361     >>> clustering.labels_
Expected:
    array([0, 0, 0, 1, 1, 1])
Got:
    array([1, 1, 1, 0, 0, 0])
c:\python27\lib\site-packages\sklearn\cluster\mean_shift_.py:361: DocTestFailure
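
For what it's worth, the expected and actual outputs above describe the same partition up to a relabelling. A quick, hypothetical check with a permutation-invariant metric makes that explicit:

import numpy as np
from sklearn.metrics import adjusted_rand_score

# The doctest's expected labels versus the labels actually produced.
expected = np.array([0, 0, 0, 1, 1, 1])
got = np.array([1, 1, 1, 0, 0, 0])

# The adjusted Rand index is invariant to label permutation;
# 1.0 means the two labelings are the same partition.
print(adjusted_rand_score(expected, got))  # 1.0

So the clustering itself may be stable and only the label order differs across platforms, which a doctest comparing raw labels_ cannot tolerate.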

@jnothman changed the title from "Get more verbosity from pip install in appveyor" to "Fix not accessing correct installed sklearn path in AppVeyor" (Aug 22, 2018)
@jnothman (Collaborator Author)

Feel free to merge this so that we can then tweak on top of it?

@amueller (Contributor)

Doctests are supposed to be skipped on 32-bit; they are in our Makefile.

@amueller (Contributor)

(For example, because of the dtype thing.)
Bedtime in NYC 😴

@jnothman (Collaborator Author)

Well then, let's try running the Makefile here.

@jnothman (Collaborator Author)

Actually, no: on 32-bit, the Makefile only skips tests in doc/, not doctests in sklearn/*.

@jnothman (Collaborator Author)

On Py=2.7, arch=64, we also have doctest failures like:

1231     >>> neigh
Expected:
    [array([0, 3]), array([1]), array([2]), array([0, 3]), array([4])]
Got:
    [array([0, 3], dtype=int64), array([1], dtype=int64), array([2], dtype=int64), array([0, 3], dtype=int64), array([4], dtype=int64)]

and

152     >>> loo.get_n_splits(X)
Expected:
    2
Got:
    2L

and

2522     >>> print(pt.lambdas_)
Expected:
    [1.38668178e+00 5.93926346e-09]
Got:
    [1.38668180e+00 5.93926346e-09]

@jnothman (Collaborator Author)

And conftest.py now skips doctests where numpy < 1.14, but imposes no other requirements.

@jnothman (Collaborator Author)

But I presume these tests weren't checked under nose. I'm not sure how or when to disable doctests nicely in pytest, unless we modify conftest. (Why hasn't this been an issue in the scikit-learn repo's AppVeyor?)
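
For reference, one way to do it from conftest.py (a hypothetical sketch, not what scikit-learn ships) is to mark every collected doctest as skipped on 32-bit platforms:

# conftest.py (sketch): skip all doctests on 32-bit platforms, where the
# expected reprs differ (dtype suffixes, Python 2 longs, and so on).
import struct

import pytest
from _pytest.doctest import DoctestItem

IS_32BIT = struct.calcsize("P") * 8 == 32  # pointer width in bits


def pytest_collection_modifyitems(config, items):
    if not IS_32BIT:
        return
    skip = pytest.mark.skip(reason="doctest output is platform-dependent")
    for item in items:
        if isinstance(item, DoctestItem):
            item.add_marker(skip)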

@jnothman (Collaborator Author)

It's funny. Even on 64-bit runs, we're getting test failures that imply a 32-bit environment. Things like:

Expected:
    array([0, 1], dtype=int32)
Got:
    array([0, 1])
Expected:
    array([0, 0, 0, 1, 1, 1])
Got:
    array([0, 0, 0, 1, 1, 1], dtype=int64)

Is it possible that even on 64-bit we are using a 32-bit build? I can't see how that might be happening.
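
A quick diagnostic one could drop into the CI script (illustrative, not part of this PR). Note that numpy's default integer is the platform's C long, which is 32-bit even on 64-bit Windows, so a genuinely 64-bit run can print reprs that look 32-bit:

import struct
import sys

import numpy as np

print(sys.version)
print(struct.calcsize("P") * 8, "bit Python")  # pointer width: 32 or 64
print(np.dtype(int))  # default integer: int32 on Windows (C long), int64 on Linux
print(np.intp)        # indexing integer: follows the pointer width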

@adrinjalali (Collaborator)

@jnothman I'm not sure there's a way to fix any of these (in a reasonable way).

Basically, on a 64-bit system the dtype is not printed if it's int64 and is printed if it's int32, and vice versa on 32-bit.

The only fix I can think of is to explicitly set the int type throughout the algorithms based on the architecture they run on. But I don't think that's what we're going to do here.

And it's the same with the other failing doctest examples: in Python 3 all strings are unicode by default, and when Python 2 sees unicode it puts a u prefix on the literals to show that.

MeanShift is the only one of those you pasted here that should actually be fixed; I'll investigate it.

For the others, I'm happy to help, but I'm not sure if/how we want to approach the issue.

I've got the 32-bit VM running, so I can run the tests more easily now.
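
A minimal illustration of that repr behaviour (assuming a 64-bit Linux build, where the default integer is int64):

import numpy as np

# The default integer dtype is omitted from the repr; any other is shown.
print(repr(np.arange(3)))                   # array([0, 1, 2])
print(repr(np.arange(3).astype(np.int32)))  # array([0, 1, 2], dtype=int32)

On a build whose default integer is int32, the two reprs flip, so a single hard-coded doctest cannot match both.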

@amueller (Contributor)

Hm, maybe @ogrisel remembers what exactly we usually run? And yes, these errors on 64-bit look suspicious. I think we might be skipping 32-bit and py27, but I'm not sure about the latter.
And failures on 64-bit/py3 are strange.

@amueller (Contributor)

We're running py27 doctests on Travis from what I can see... but I should really start packing now ;)

@amueller (Contributor)

Wait, we're discarding the setup.cfg here, right? That might be the difference?

@rth (Contributor) commented Aug 22, 2018

It would make sense to discard setup.cfg and pass the needed CLI parameters manually (e.g. I'm not sure how to cancel out --doctest-modules otherwise).
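
One hypothetical way to cancel it out (a sketch; empty.cfg is an illustrative name) is to point pytest at an empty config file so the addopts in setup.cfg are never read:

# Sketch: run the installed sklearn's tests while ignoring the addopts
# (including --doctest-modules) that setup.cfg would otherwise inject.
import pathlib

import pytest

pathlib.Path("empty.cfg").write_text("")  # empty config file
pytest.main(["-c", "empty.cfg", "--pyargs", "sklearn"])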

I've got the 32-bit VM running, so I can run the tests more easily now.

There is also a Docker setup in https://github.com/rth/scikit-learn-dev-docker/blob/master/debian/apt/Dockerfile that could be used with the https://hub.docker.com/r/i386/debian/ image to get a 32-bit environment without having to set up a VM.

@rth (Contributor) commented Aug 22, 2018

This is using master from one month ago. If I understood the discussion right, it doesn't work with the latest master? It would have been nice to have the latest one; otherwise we won't know when the issues (reported in scikit-learn/scikit-learn#11878) are fixed.

@jnothman (Collaborator Author)

Okay, I'm trying to disable setup.cfg to avoid doctests, and I've updated scikit-learn to master.

@jnothman (Collaborator Author)

It looks like we have test failures on 32-bit architectures in both Travis and AppVeyor.

[00:09:18] >               assert_almost_equal(lr_32.coef_, lr_64.coef_.astype(np.float32))
[00:09:18] E               AssertionError: 
[00:09:18] E               Arrays are not almost equal to 7 decimals
[00:09:18] E               
[00:09:18] E               (mismatch 100.0%)
[00:09:18] E                x: array([[0.6324632, 0.4584147]], dtype=float32)
[00:09:18] E                y: array([[0.6325136, 0.4583441]], dtype=float32)
[00:09:18] 
[00:09:18] c:\python36\lib\site-packages\sklearn\linear_model\tests\test_logistic.py:1288: AssertionError

and sklearn/cluster/tests/test_optics.py::test_reach_dists seem to be the only culprits, at a glance.
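
For context on the test_logistic failure above (an illustrative check, not necessarily the fix adopted upstream): float32 carries only about seven significant digits in total, so demanding agreement to seven decimal places between two single-precision fits is stricter than the dtype supports, while a dtype-appropriate relative tolerance passes:

import numpy as np
from numpy.testing import assert_allclose

# The two coefficient arrays from the log above.
x = np.array([[0.6324632, 0.4584147]], dtype=np.float32)
y = np.array([[0.6325136, 0.4583441]], dtype=np.float32)

# Agreement to 7 decimals fails, but a relative tolerance of ~1e-3,
# reasonable for float32 solver output, passes.
assert_allclose(x, y, rtol=1e-3)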

@rth (Contributor) commented Aug 23, 2018

Proposed a fix for test_logistic in scikit-learn/scikit-learn#11899

@amueller (Contributor)

This is using master from one month ago.

Is it? I think it's using master from three days ago?

@amueller (Contributor)

Looks like on Travis only the OPTICS thing is failing. So do we just change the precision for that?

@rth (Contributor) commented Aug 27, 2018

Updating master should hopefully make all the CI green.

@rth (Contributor) commented Aug 27, 2018

Now we get failures on Mac (all builds) for test_logistic_regression_multi_class_auto[lbfgs-est1], where est1 = LogisticRegressionCV: the relative error seems to be of the order of 0.2-0.3, which is huge. Not sure why it would fail there but not for other solvers or platforms.

@jnothman (Collaborator Author)

Sometimes testing rigorously shoots us in the foot?

@jnothman (Collaborator Author) commented Aug 27, 2018

So on the first Mac run there, with multi_class='auto', we have:

array([[ -1.76913765,   2.30239397,  -9.4657277 ,  -3.27405007],\n       [ -2.8...-7.93137481],\n       [  4.64859645,   0.38360847,  10.60498514,  11.20542488]])

and with multi_class='multinomial':

array([[ -1.4978193 ,   3.56126163,  -9.12636383,  -3.91053244],\n       [ -3.0...-7.85516744],\n       [  4.58786754,  -0.80120555,  10.22093667,  11.76569988]]))

But on the second:

array([[ -1.61884513,   3.26080141,  -9.34635526,  -3.64453893],\n       [ -3.04425378,  -3.15970995,  -1.22510387,  -7.76084652],\n       [  4.66309891,  -0.10109146,  10.57145912,  11.40538545]])
array([[ -1.61884174,   3.26079774,  -9.34636003,  -3.64453259],\n       [ -3.04425799,  -3.15970338,  -1.22509766,  -7.76085408],\n       [  4.66309973,  -0.10109436,  10.57145768,  11.40538666]])

Note these are much closer, though they still fail.

The third run:

array([[ -1.78577073,   2.9900282 ,  -9.20496655,  -3.94358273],\n       [ -2.93334658,  -3.01462014,  -1.41757392,  -7.51907641],\n       [  4.7191173 ,   0.02459194,  10.62254047,  11.46265915]])
array([[ -1.61884513,   3.26080141,  -9.34635526,  -3.64453893],\n       [ -3.04425378,  -3.15970995,  -1.22510387,  -7.76084652],\n       [  4.66309891,  -0.10109146,  10.57145912,  11.40538545]])

On my MacBook the test passes with:

array([[-1.58362685,  3.23565193, -9.16015996, -3.94813821],\n       [-3.10899051, -3.12192597, -1.1105531 , -7.79403915],\n       [ 4.69261736, -0.11372595, 10.27071306, 11.74217736]])

In a 32-bit VM where I'd been testing OPTICS issues, I get:

array([[-1.94948273,  2.77308919, -9.1520717 , -4.06930446],\n       [-2.839474...91, -7.4417493 ],\n       [ 4.78895732,  0.11330503, 10.66136861, 11.51105376]])

Increasing max_iter changes the result. But the result is extremely unstable across platforms!

I don't know what to do here... yet.

@jnothman (Collaborator Author)

A second run of the same CI gives different results. So the fit is indeed non-deterministic.

@rth (Contributor) commented Aug 28, 2018

Let's skip it and investigate later.
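
A hypothetical sketch of what that skip could look like (decorator placement and reason string are illustrative; the real test lives in sklearn/linear_model/tests/test_logistic.py):

import pytest

# Hypothetical: mark the flaky comparison as skipped until the
# cross-platform non-determinism is understood.
@pytest.mark.skip(reason="coef_ for multi_class='auto' vs 'multinomial' is "
                         "unstable across platforms; investigate later")
def test_logistic_regression_multi_class_auto():
    ...  # body elided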

@rth (Contributor) commented Aug 28, 2018

Sometimes testing rigorously shoots us in the foot?

More like discovering new bugs :) Though bugs never end, unfortunately, and we should release regardless...

@jnothman changed the title from "Fix not accessing correct installed sklearn path in AppVeyor" to "0.20rc1; fix not accessing correct installed sklearn path in AppVeyor" (Aug 29, 2018)
@jnothman (Collaborator Author)

This should be ready for merge to produce wheels for the RC.

@jnothman (Collaborator Author)

@ogrisel or @amueller should merge and manage the PyPI uploads for the RC?

@ogrisel (Contributor) commented Aug 30, 2018

Thanks, sorry for the lack of reply. Let me merge this and give you the rights to this repo so that you are not stuck again in the future.

@ogrisel merged commit 35b76b9 into MacPython:master (Aug 30, 2018)
@jnothman (Collaborator Author)

Thanks. Though I think I still can't upload the wheels to PyPI.

Also, AppVeyor looks like it is failing on master.
