MRG factorize common tests. #893
Conversation
Btw I think this PR will work better with #803, as this enforces a more consistent API.
The following classifiers fail on iris with default params: … I am a bit surprised that …
What surprises me is that …
Also true.
I haven't looked into it that much, as there are also other tests failing.
# test cloning
clone(e)
# test __repr__
print(e)
You can use the built-in function repr for that.
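A minimal sketch of that suggestion, assuming e is the estimator instance under test:

# repr() exercises __repr__ without writing to stdout, and its
# result could also be asserted on if desired.
repr(e)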
I have trouble testing the NB classifiers. Maybe @larsmans has some opinion on that. My understanding is that …
@amueller Shuffle the iris dataset first.
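For reference, a minimal sketch of that suggestion; the random_state value here is arbitrary:

from sklearn.datasets import load_iris
from sklearn.utils import shuffle

# The iris samples are ordered by class, so fitting on a contiguous
# slice can expose an estimator to only one or two of the classes.
iris = load_iris()
X, y = shuffle(iris.data, iris.target, random_state=0)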
@mblondel Thanks. That one has bitten me so often already ;) I reverted the SGD default changes since it seems to work. It is quite unstable, though, and I think adding some more iterations by default might be good. What do you think?
You could add a list of estimators that must be handled specifically: this way you could add …
Well, for the moment it works. My question was more: how sensible are the current defaults? Usually I use many more than 5 iterations. I guess these are good with huge, very redundant datasets. But would it hurt to do 20 iterations instead?
Ideally we should have a large number of iterations (e.g. 100) plus some early-stopping criterion based on an exponentially weighted moving average of the test error, measured on a validation set subsampled from the training set if one is not passed explicitly.
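A hypothetical sketch of this idea; nothing here is existing scikit-learn API, the estimator is assumed to support partial_fit, and the smoothing factor, tolerance, and patience values are made up for illustration:

import numpy as np
from sklearn.model_selection import train_test_split

def fit_with_early_stopping(clf, X, y, max_iter=100, smoothing=0.3, patience=5):
    # Subsample a validation set from the training data, since none
    # is passed explicitly.
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.1)
    classes = np.unique(y)
    ewma, best, stalled = None, np.inf, 0
    for _ in range(max_iter):
        clf.partial_fit(X_tr, y_tr, classes=classes)
        err = 1.0 - clf.score(X_val, y_val)
        # Exponentially weighted moving average of the validation error.
        ewma = err if ewma is None else smoothing * err + (1 - smoothing) * ewma
        if ewma < best - 1e-4:
            best, stalled = ewma, 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # the smoothed error stopped improving
    return clf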
2012/6/12 Andreas Mueller: "Well, BernoulliNB can handle negative data, though it doesn't really …"

That's right. It didn't, so far, because our tf-idf implementation was …
@larsmans Maybe I misread you or you misread me: I wanted to say … About tf-idf: was that fixed? I remember some PR hanging around a while ago.
@ogrisel Wouldn't it also be good to stop on the training error? In my view, early stopping is more of a convergence check than a regularization. And I'm a bit afraid of doing too much behind the scenes.
@amueller: ah, I get you now. No, I think @ogrisel fixed tf-idf (?).
@larsmans: My idea was that a per-feature median might be a more natural default threshold than 0. That was just something that came to mind; I'm not sure it makes sense. I'm really not familiar with the kind of data that is used with this classifier.
The common application domains of BernoulliNB, AFAIK, are text classification and NLP tasks like WSD with word-window features. In both cases, the median is likely zero for every feature except the stop words.
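To illustrate the per-feature-median idea under discussion (purely hypothetical; the binarization in BernoulliNB takes a single scalar threshold, so this would have to be done as a preprocessing step):

import numpy as np

def binarize_at_feature_medians(X):
    # One threshold per feature (column) instead of a global 0.
    thresholds = np.median(X, axis=0)
    return (X > thresholds).astype(int)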
Yes, it's "fixed": it no longer outputs negative values. However, it does so in a non-standard way and I might change it again in the future. Last time I tried to implement the canonical normalization from the textbooks, the k-means text clustering example behaved much worse, hence I decided to keep the current scheme for now, as I did not have time to investigate further.
Any comments? Should I merge? There are some smaller fixes that would be good to have in master. I can also cherry-pick them and wait for a proper review of the testing stuff…
If all tests pass on your box, +1 for merging. More tests are always better :)
Ok then I'll have a look again tonight and if everything works I'll merge.
…ionMixin not inherit from ClassifierMixin
…ction of sparse SVM
…ed them). Turned up alpha and n_iter. This corresponds to more regularization and more careful SGD. On which kind of problems do the old defaults work?
…st guessed them)." This reverts commit b659bc399da94be0de4a970da15e69cb778d4101.
Runs fine. Still needs work but should be a good place to start from.
It seems that some of the new tests are broken on Python 2.6: https://jenkins.shiningpanda.com/scikit-learn/job/python-2.6-numpy-1.3.0-scipy-0.7.2/703/consoleText
Yes, I saw that but didn't have time to fix it. I hope I can do it later today.
Should be fixed now.
Thanks!
This is a shot at factorizing common tests for all estimators.
Any suggestions on what to test are very welcome :)
Closes #406.
I think I would like to merge the stuff I already did before going on with more tests. At the moment, everything passes.
There are some (minor) issues for going further, for example the fit_pairwise problem, which prevents me from testing clusterings. I feel there is already quite a lot in this PR, and the longer it gets, the less motivation someone will have to review ;)
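For readers, a rough sketch of the shape such factorized common tests can take. This is not the PR's actual code: all_estimators is the present-day scikit-learn helper, and the checks shown are only the ones mentioned in this thread.

from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.utils import all_estimators

X, y = load_iris(return_X_y=True)
for name, Estimator in all_estimators(type_filter="classifier"):
    try:
        est = Estimator()  # default parameters only
    except TypeError:
        continue  # skip estimators that require constructor arguments
    clone(est)  # every estimator must be clonable
    repr(est)   # and must have a working __repr__
    est.fit(X, y)
    assert est.predict(X).shape == y.shape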