[MRG]: fix tests, centralize warnings and reset __warningregistry__ #2541

Merged

larsmans merged 20 commits from dengemann:fix_dot into scikit-learn:master on Nov 29, 2013

Conversation

dengemann
Contributor

@larsmans this tackles the mysterious UnboundLocalError ValueError ...

@@ -312,7 +312,7 @@ def test_fast_dot():
     try:
         linalg.get_blas_funcs('gemm')
         has_blas = True
-    except AttributeError, ValueError:
+    except [AttributeError, ValueError]:
Contributor Author

ValueError was implicitly being used as the identifier for the caught exception ...
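For reference (not part of this PR), a minimal Python 2 sketch of what the comma form actually does: it catches only the first exception type and rebinds the second name to the caught instance, silently shadowing the builtin.

# Python 2 semantics of the comma form (illustrative only):
try:
    raise AttributeError("boom")
except AttributeError, ValueError:  # same as: except AttributeError as ValueError
    # ValueError is no longer the builtin; it is the caught AttributeError instance
    print type(ValueError)          # <type 'exceptions.AttributeError'>

# Catching several exception types requires a tuple:
try:
    raise ValueError("boom")
except (AttributeError, ValueError) as exc:
    print type(exc)                 # <type 'exceptions.ValueError'>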

Member

Wow. Nice catch.

Could you please use a tuple instead of a list? It's good practice, as tuples take less time to build, although here it makes no difference.

@@ -312,7 +312,7 @@ def test_fast_dot():
     try:
         linalg.get_blas_funcs('gemm')
Contributor Author

I'll add the stable API here. Another commit is coming.

@GaelVaroquaux
Member

Looks good. 👍 to merge if the tests on Travis pass ... although I'd love a tuple :)

@dengemann
Contributor Author

@amueller it seems Travis complains about this second bug (missing warnings), which I can't reproduce on my box. I'll investigate tomorrow on my Linux VMs.

@amueller
Member

Thanks! I thought it would be something like that but really didn't see it ^^ nice catch. I'll pull and see if I can find the problem.

         has_blas = True
-    except AttributeError, ValueError:
+    except (AttributeError, ValueError):
Member

Ouch. Python 2 syntax bit me :)

@ogrisel
Member

ogrisel commented Oct 21, 2013

This fix has not been merged yet? @dengemann could you try to rebase on current master to see if travis is fine?

@ogrisel
Member

ogrisel commented Oct 21, 2013

Actually no need to rebase: I misread the GitHub status. This PR is mergeable but does not fix the Travis failure.

@dengemann
Contributor Author

@ogrisel I'll have a few minutes later to look into the remaining issue with Travis. I'll ping you.

@dengemann
Contributor Author

Hooray, I can reproduce this error with my 32bit Neurodebian VM. I'm on it.

@dengemann
Contributor Author

This is really nasty. I think we've had such things before: the single test passes, but when the whole test suite is run the warnings are missing. A bell is ringing, but ...

@ogrisel
Member

ogrisel commented Oct 21, 2013

Yeah I kind of remember dealing with such weirdness during the sprint. I don't remember the outcome though. @amueller maybe?

@dengemann
Contributor Author

Current outcome on my end: it seems, when going back through the queue of tests, the SVM tests are the last ones that have an impact on this particular test. When commenting out about 1/4 of the functions, the subsequent warning tests pass. But the trail does not stop there: when running nosetests with the SVM tests uncommented, other, earlier tests determine the later success... This is crazy.

@dengemann
Contributor Author

Interesting: when muting other failing tests and picking one of them for deeper inquiry, it seems accessing clf.coef_ triggers the problem.

def test_svc():
    """Check that sparse SVC gives the same result as SVC"""

    clf = svm.SVC(kernel='linear', probability=True, random_state=0)
    clf.fit(X, Y)
    sp_clf = svm.SVC(kernel='linear', probability=True, random_state=0)
    sp_clf.fit(X_sp, Y)

    assert_array_equal(sp_clf.predict(T), true_result)

    assert_true(sparse.issparse(sp_clf.support_vectors_))
    assert_array_almost_equal(clf.support_vectors_,
                              sp_clf.support_vectors_.todense())

    assert_true(sparse.issparse(sp_clf.dual_coef_))
    assert_array_almost_equal(clf.dual_coef_, sp_clf.dual_coef_.todense())

    print clf.coef_  #### the culprit !

    assert_true(sparse.issparse(sp_clf.coef_))
    # assert_array_almost_equal(clf.coef_, sp_clf.coef_.todense())
    assert_array_almost_equal(clf.support_, sp_clf.support_)
    assert_array_almost_equal(clf.predict(T), sp_clf.predict(T))

    # refit with a different dataset
    clf.fit(X2, Y2)
    sp_clf.fit(X2_sp, Y2)
    assert_array_almost_equal(clf.support_vectors_,
                              sp_clf.support_vectors_.todense())
    assert_array_almost_equal(clf.dual_coef_, sp_clf.dual_coef_.todense())
    # assert_array_almost_equal(clf.coef_, sp_clf.coef_.todense())
    assert_array_almost_equal(clf.support_, sp_clf.support_)
    assert_array_almost_equal(clf.predict(T2), sp_clf.predict(T2))
    assert_array_almost_equal(clf.predict_proba(T2),
                              sp_clf.predict_proba(T2), 4)

@dengemann
Contributor Author

I think something is deeply flawed with how warnings work, cf. http://stackoverflow.com/questions/2390766/how-do-i-disable-and-then-re-enable-a-warning/2391106#2391106, especially in combination with nosetests. I can move warning filters from A to B and tests pass for a subset of manually queued nosetests across modules. This is really arbitrary.
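For reference, a minimal sketch of the suppression mechanism suspected here (illustrative only, assuming Python 2.7 warnings semantics; not code from this PR): once a warning has been recorded in the issuing module's __warningregistry__, later filters are never consulted for that exact warning again, so whether a warning test passes depends on which earlier test happened to trigger the same warning first.

import warnings

def noisy():
    warnings.warn("deprecated", DeprecationWarning)

# First call: under the default filters the warning is not shown, but the
# (message, category, lineno) key still lands in this module's __warningregistry__.
noisy()

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    noisy()
    print len(w)   # 0 -- the registry entry short-circuits the "always" filter

# Emptying the registry of the module that issued the warning restores it.
globals().get('__warningregistry__', {}).clear()
with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    noisy()
    print len(w)   # 1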

@amueller
Member

I feared that would be the problem. But the fix seems to be workable, right? We should be able to completely reset the warning registry and all is good (?)

@amueller
Member

OK, I have no idea what is going on ... I tried to empty the warning registry, but that didn't help ...

@dengemann
Contributor Author

@amueller I should have mentioned that I tried different ways of resetting the registry ...
The only thing I did not try was writing a wrapper / new context manager to replace the original simplefilter / catch_warnings blocks so we have full control. Something tells me the problem accumulates and the place where we see the issue is rather arbitrary ... So maybe it is worth implementing a catchwarnings function in utils that takes the warnings filter as an argument and deletes the registry on exit. We would then use this in a with statement instead of catch_warnings + simplefilter.
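A rough sketch of that idea (the helper name and details here are hypothetical, not the code merged in this PR): install the requested filter on entry, record what is caught, and wipe every module-level __warningregistry__ on exit so the next test starts from a clean slate.

import sys
import warnings
from contextlib import contextmanager

@contextmanager
def clean_catch_warnings(filter_action="always"):
    """Record warnings under filter_action and reset all registries on exit."""
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter(filter_action)
            yield caught
    finally:
        # Reset every __warningregistry__ populated so far, so that no later
        # test inherits a "this warning was already shown" entry.
        for module in list(sys.modules.values()):
            if hasattr(module, '__warningregistry__'):
                module.__warningregistry__.clear()

It would then be used as "with clean_catch_warnings() as w: ..." in place of the nested catch_warnings + simplefilter blocks.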

@amueller
Member

Hm... but deleting the registry should reset it, no matter what happened before, shouldn't it?
I won't claim to understand what is happening, though ;)
If you think it is worth investigating, go for it. The alternative is declaring the warning system broken and not testing that warnings are raised.

@arjoly
Member

arjoly commented Oct 22, 2013

I don't know if it will solve your issue, but in the metrics module we had some problems with warning testing. We ended up decorating all tests that generate a spurious warning with sklearn.utils.testing.ignore_warnings. When we actually want to test that a warning is raised, we do the following:

def test_something():
    with warnings.catch_warnings(record=True):
        warnings.simplefilter("always")  # This line is needed to ensure that warnings are raised
        out = assert_warns(RaisedWarning, function, arg1, arg2, ...)
        assert_equal(out, expected)

The discussion in PR #1988 might also help.

PS: assert_warns and assert_no_warns are part of sklearn.utils.testing.

@dengemann
Contributor Author

@arjoly thanks for the hint. Nice that we got those util functions in the meantime; I'll take a look.

@dengemann
Contributor Author

@arjoly why do we call assert_warns from within a context manager block when assert_warns itself uses such a block internally? When calling it outside a with block, the related tests pass. So I don't see any obvious reason to write more than:

out = assert_warns(RaisedWarning, function, arg1, arg2, ...)
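Roughly what such a helper does internally (a simplified sketch, not the exact sklearn.utils.testing implementation), which is why wrapping it in a second catch_warnings / simplefilter block adds nothing:

import warnings

def assert_warns_sketch(warning_class, func, *args, **kw):
    # Capture all warnings raised while func runs, then check the category.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(*args, **kw)
    if not any(issubclass(w.category, warning_class) for w in caught):
        raise AssertionError("%s not raised by %s"
                             % (warning_class.__name__, func.__name__))
    return result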

@ogrisel
Member

ogrisel commented Oct 23, 2013

@dengemann +1 on your last comment.

@arjoly
Member

arjoly commented Oct 23, 2013

I can't remember. Maybe it was related to Travis. If it works, feel free to remove it. :-)

@dengemann
Contributor Author

I can't remember. Maybe it was related to Travis. If it works, feel free to remove it. :-)

OK, I will do that everywhere to bring the warnings under our control. We can then decide whether the warning system is broken or not.

@ogrisel
Member

ogrisel commented Oct 23, 2013

OK, I will do that everywhere to bring the warnings under our control. We can then decide whether the warning system is broken or not.

Thanks @dengemann for tackling this!

@dengemann
Contributor Author

@ogrisel @amueller

Except for a few places where warnings had a special meaning, and except for the common modules, the bulk of the warning handling is now centralized in sklearn.utils.testing:

Deniss-MacBook-Pro:scikit-learn dengemann$ git grep 'import warnings' | cat | grep test_
sklearn/datasets/tests/test_base.py:import warnings
sklearn/externals/joblib/test/test_memory.py:import warnings
sklearn/feature_extraction/tests/test_text.py:import warnings
sklearn/preprocessing/tests/test_data.py:import warnings
sklearn/tests/test_common.py:import warnings
sklearn/tests/test_cross_validation.py:import warnings
sklearn/tests/test_dummy.py:import warnings
sklearn/tests/test_grid_search.py:import warnings
sklearn/tests/test_multiclass.py:import warnings
sklearn/tests/test_naive_bayes.py:import warnings
sklearn/tests/test_random_projection.py:import warnings
sklearn/utils/tests/test_testing.py:import warnings
sklearn/utils/tests/test_utils.py:import warnings

Now let's see what we can find out. One hypothesis would be that the __warningregistry__ belonging to the module that issues the warning must be emptied. Let's see.
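To test that hypothesis, a small helper along these lines could empty the registry of a given module between tests (the helper name and the example target module are hypothetical, not code from this PR):

import sys

def reset_module_warning_registry(module_name):
    """Empty __warningregistry__ for the module that issued the warning."""
    module = sys.modules.get(module_name)
    registry = getattr(module, '__warningregistry__', None)
    if registry is not None:
        registry.clear()

# e.g. reset_module_warning_registry('sklearn.svm.base')  # hypothetical target module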

raise AssertionError("The message received ('%s') for <%s> is "
                     "not the one you expected ('%s')"
                     % (msg, func.__name__, message))
Contributor Author

@ogrisel I think we should adopt this in the long run. I needed it to make sure I don't destroy critical tests.

Member

You mean we should extensively use this variant, which checks the warning message content as well as the type? I agree.

Contributor Author

It does not make sense everywhere, but where you want to check the message string, use this variant. It's a bit of duplication, but I don't see a nice way of merging this into the other function due to the args and kwargs.
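A condensed sketch of that message-checking variant (illustrative only; the helper actually merged may differ in its details): it follows the same pattern as assert_warns but also compares the warning text, raising the AssertionError shown in the diff above when the message does not match.

import warnings

def assert_warns_message_sketch(warning_class, message, func, *args, **kw):
    # Capture warnings, then check both the category and the message text.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(*args, **kw)
    matching = [w for w in caught if issubclass(w.category, warning_class)]
    if not matching:
        raise AssertionError("%s not raised by %s"
                             % (warning_class.__name__, func.__name__))
    msg = str(matching[0].message)
    if message not in msg:
        raise AssertionError("The message received ('%s') for <%s> is "
                             "not the one you expected ('%s')"
                             % (msg, func.__name__, message))
    return result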

@coveralls

Coverage Status

Coverage remained the same when pulling d8ad680 on dengemann:fix_dot into feb4312 on scikit-learn:master.

@GaelVaroquaux
Member

I found the bug! I think that it was an actual bug, not a spurious aspect of the warning system: the problem is that the tolerance becomes negative.

I fixed that on:
https://github.com/GaelVaroquaux/scikit-learn/tree/pr_2541

@larsmans: can you check that the tests pass on your box?

@vene: can you look at my fix? I think you know this part of the codebase well.

@dengemann
Contributor Author

I think that it was an actual bug, not a spurious aspect of the warning system

Amazing! So the new warning system actually helped? I'll rebase.

@dengemann
Contributor Author

I'll rebase.

... once this is merged.

@GaelVaroquaux
Member

... once this is merged.

Can you just cherry-pick my commit into your PR please? All you need is
to fetch from my fork, and do a cherry-pick on the corresponding commit
hash.

@dengemann
Contributor Author

Can you just cherry-pick my commit into your PR please?

Sure, I was almost going to suggest this.

@dengemann
Contributor Author

done

@GaelVaroquaux
Member

OK, this PR should be ready to go. It now passes on my box where it was
failing. People, please pound upon it!

@larsmans
Member

Confirmed, `make test` passes without problems.

larsmans added a commit that referenced this pull request Nov 29, 2013
[MRG]: fix tests, centralize warnings and reset `__warningregistry__`
@larsmans larsmans merged commit 8b037ad into scikit-learn:master Nov 29, 2013
@GaelVaroquaux
Member

Confirmed, `make test` passes without problems.

Awesome! I am waiting for @vene to check my fix: he understands this code
better than I do.

@coveralls

Coverage Status

Coverage remained the same when pulling dc8a03e on dengemann:fix_dot into feb4312 on scikit-learn:master.

@GaelVaroquaux
Member

Merged #2541.

@larsmans is live and daring.

Let's wait and see if all the buildbots pass!

@larsmans
Member

Friday night, time for adventure ;)

@GaelVaroquaux
Member

Honestly, I am really happy that we were able to merge that PR.

@dengemann
Contributor Author

@GaelVaroquaux @larsmans thanks for your assistance. Keeping this biggish PR open while development was moving forward was somewhat nightmarish ...
We can now move on with smaller commits to tackle unaddressed warnings, add to the contributing guide, and improve the context manager.
