Tests for sample order invariance in estimator_checks #8695
Comments
Hi! I would like to work on this, but I'm new to this project. I took a look at estimator_checks.py and there's a lot there. Would this be a completely new test or would this be added to test_check_estimator? |
You could add part of this to check_classifiers_train, but then would need
the same for regressors, transformers and outlier detectors. The best way
to do it is often more apparent after trying one way.
On 5 April 2017 at 11:21, Jeff Colfer wrote:
> Hi! I would like to work on this, but I'm new to this project. I took a
> look at estimator_checks.py and there's a lot there. Would this be a
> completely new test or would this be added to test_check_estimator?
|
… for sample invariance in predict_proba to ensure that reordering or subsampling does not change the sample-wise output. Addresses: scikit-learn#8695
Adds a check for sample order invariance for regressors in estimator_checks. Addresses: scikit-learn#8695
Would the test for transformers only be in check_transformer_general, or would it need to be present in all the check_transformer* functions? I'm also unsure how I would produce an estimator that would fail one of these tests. Also, I've been unable to find a straightforward way to build from source on Windows. Do you (or anyone) have any recommendations? Thanks so much! |
Adds a simple check for sample order invariance in the predict function when using estimator_checks. Addresses: scikit-learn#8695
I'm sorry, I know nothing about building on Windows beyond what's in the docs. An example of an estimator that fails one of these tests is:

    import numpy as np
    from sklearn.base import BaseEstimator

    class Bad(BaseEstimator):
        def fit(self, X, y=None): return self
        def predict(self, X): return np.arange(len(X))  # depends on row position, not row content

I'm not sure of the best way to structure the tests in the current checking framework. Try one and we'll see if there's a better way when reviewing your pull request. |
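To see why the Bad estimator in the comment above would trip a sample-order check, here is a minimal sketch. It is not the check that ended up in estimator_checks. `predict_is_order_invariant` and `MeanOfRow` are hypothetical names invented for illustration, and `Bad` is written without its BaseEstimator parent so the snippet depends only on NumPy.

```python
import numpy as np


class Bad:
    """The estimator from the comment above, minus the BaseEstimator parent."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        # Output depends on row position, not on row content.
        return np.arange(len(X))


class MeanOfRow:
    """Hypothetical well-behaved estimator: predicts the mean of each row."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return np.asarray(X).mean(axis=1)


def predict_is_order_invariant(estimator, X, seed=0):
    """Return True if predict(X[idx]) matches predict(X)[idx] for a random permutation idx."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))
    estimator.fit(X)
    return bool(np.allclose(estimator.predict(X[idx]), estimator.predict(X)[idx]))


X = np.arange(20, dtype=float).reshape(10, 2)
print("MeanOfRow invariant:", predict_is_order_invariant(MeanOfRow(), X))
print("Bad invariant:", predict_is_order_invariant(Bad(), X))
```

`Bad.predict` always returns 0, 1, 2, ... regardless of which rows it receives, so its output on the shuffled rows cannot line up with the shuffled output on the original rows unless the permutation happens to be the identity.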
Hmm, this might not be the place to ask this, but I'm having a hard time building/testing. I created a VM on my Windows machine and I'm not able to use ERROR: Failure: ImportError (cannot import name _hierarchical) FAILED (errors=143) Building doesn't fail when I just type
Traceback (most recent call last): I'm using Ubuntu 16.04.2 in a VMware Workstation VM on a 64-bit Intel system. |
@jnothman this issue seems to be open and I would like to contribute. Can you please give me some pointers? |
Get your head around |
Thanks, I'm picking it up. |
Hey, is someone working on this? Can I pick this up? |
While sample and feature order can have subtle effects on the model fit by an estimator, I think we should have common tests to ensure that reordering or subsampling X in predict or transform or score_samples or predict_proba or decision_function does not change the sample-wise output. That is, calling the method on a reordered or subsampled X should give the corresponding rows of the output obtained on the full X.

Apologies if we already have such tests, but I can't see them (which is also an issue: we don't actually have a clear list of what is asserted by estimator_checks).
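The property requested in the issue can be sketched roughly as follows. This is an illustrative assumption about how such a check might look, not the code that exists in estimator_checks. `find_sample_dependent_methods`, `Centerer`, and `BatchCenterer` are hypothetical names, and only NumPy is used.

```python
import numpy as np

# Method names listed in the issue text.
METHODS = ("predict", "transform", "score_samples", "predict_proba", "decision_function")


def find_sample_dependent_methods(estimator, X, seed=0):
    """Return (method, selection) pairs whose per-row output changes when X is reordered or subsampled."""
    rng = np.random.RandomState(seed)
    n = len(X)
    selections = {
        "reorder": rng.permutation(n),
        "subsample": rng.choice(n, size=n // 2, replace=False),
    }
    failures = []
    for name in METHODS:
        method = getattr(estimator, name, None)
        if method is None:
            continue  # the estimator does not implement this method
        full = np.asarray(method(X))
        for label, idx in selections.items():
            # Output on the selected rows should equal the selected rows of the full output.
            if not np.allclose(np.asarray(method(X[idx])), full[idx]):
                failures.append((name, label))
    return failures


class Centerer:
    """Hypothetical transformer that subtracts the mean learned during fit."""

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        return X - self.mean_


class BatchCenterer(Centerer):
    """Hypothetical faulty variant that recomputes the mean from the batch it is given."""

    def transform(self, X):
        return X - X.mean(axis=0)


X = np.random.RandomState(0).normal(size=(12, 3))
print(find_sample_dependent_methods(Centerer().fit(X), X))
print(find_sample_dependent_methods(BatchCenterer().fit(X), X))
```

The faulty variant passes the reordering case, because the batch mean is unchanged by shuffling, and is caught only by the subsampling case. That is one reason to test both selections rather than reordering alone.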