
Common check for sample weight invariance with removed samples #16507


Merged: 17 commits, May 10, 2020

Conversation


@rth rth commented Feb 20, 2020

Continues and closes #15015

Closes #5515

This merges the common check which ensures that setting sample_weight to 0 is equivalent to removing the corresponding samples. Estimators that currently fail it are listed in #16298 and are marked as known failures.

We use the _xfail_test estimator tag to mark estimators that xfail this test.

Also related to #11316, #15657
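
For illustration, a rough sketch of how an estimator might declare such a known failure through its tags (the class name is a placeholder, and the tag name follows the _xfail_test name used in this PR; the exact name and structure in the final API may differ):

from sklearn.base import BaseEstimator

class MyEstimator(BaseEstimator):
    def _more_tags(self):
        # mark a specific common check as a known failure, with a reason
        return {
            '_xfail_test': {
                'check_sample_weights_invariance(kind=zeros)':
                    'zero sample_weight is not equivalent to removing samples',
            }
        }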


rth commented Feb 20, 2020

The current output of this test is,

pytest sklearn/tests/test_common.py -k "check_sample_weights_invariance and zeros"       
========================================================= test session starts =========================================================
platform linux -- Python 3.8.0, pytest-5.2.1, py-1.8.0, pluggy-0.13.0                                                                  
rootdir: /home/rth/src/scikit-learn, inifile: setup.cfg                                                                                
plugins: forked-1.1.3, xdist-1.31.0
collected 6243 items / 6078 deselected / 165 selected                                                                                 

sklearn/tests/test_common.py ...............x..................................x...x.....x...............xx...x........x....... [ 59%]
......xx...x................x...............xxxx...................                                                             [100%]

======================================================= short test summary info =======================================================
XFAIL sklearn/tests/test_common.py::test_estimators[CalibratedClassifierCV(base_estimator=LinearDiscriminantAnalysis())-check_sample_we
ights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[IsolationForest()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[KMeans()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[KernelDensity()-check_sample_weights_invariance(kind=zeros)]
  sample_weight must have positive values
XFAIL sklearn/tests/test_common.py::test_estimators[LinearSVC()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[LinearSVR()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[LogisticRegressionCV()-check_sample_weights_invariance(kind=zeros)]                
  zero sample_weight is not equivalent to removing samples                                                                             
XFAIL sklearn/tests/test_common.py::test_estimators[MiniBatchKMeans()-check_sample_weights_invariance(kind=zeros)]                     
  zero sample_weight is not equivalent to removing samples                                                                             
XFAIL sklearn/tests/test_common.py::test_estimators[NuSVC()-check_sample_weights_invariance(kind=zeros)]                               
  zero sample_weight is not equivalent to removing samples                                                                             
XFAIL sklearn/tests/test_common.py::test_estimators[NuSVR()-check_sample_weights_invariance(kind=zeros)]                               
  zero sample_weight is not equivalent to removing samples                                                                             
XFAIL sklearn/tests/test_common.py::test_estimators[OneClassSVM()-check_sample_weights_invariance(kind=zeros)]                        
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[RANSACRegressor(base_estimator=Ridge())-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[SGDClassifier()-check_sample_weights_invariance(kind=zeros)]                      
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[SGDRegressor()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[SVC()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
XFAIL sklearn/tests/test_common.py::test_estimators[SVR()-check_sample_weights_invariance(kind=zeros)]
  zero sample_weight is not equivalent to removing samples
==================================== 149 passed, 6078 deselected, 16 xfailed, 15 warnings in 2.38s ====================================

'check_sample_weights_invariance(kind=zeros)':
'zero sample_weight is not equivalent to removing samples',
}
}
Member Author

Of course the amount of repetition could be reduced by defining this tag in BaseSVC, but the expectation is that these estimators will be fixed one by one, and it's easier to understand what needs fixing this way.

# skip tests marked as a known failure and raise a warning
msg = xfail_checks[check_name]
warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
continue
Member Author

This is a change following #16502, analogous to what was added in test_common.py::test_estimators. Without it, test_check_estimator_clones started to fail with this PR, since it runs check_estimator on MiniBatchKMeans, which now has one xfail test.

Skipping tests marked as xfail in check_estimator with a warning, as done here, is a solution to this issue.
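
For context, a minimal sketch of how this skipping could sit inside the check loop of check_estimator (the loop structure and the xfail_checks lookup are assumptions for illustration; only the three quoted lines above come from the diff):

import warnings
from sklearn.exceptions import SkipTestWarning

def run_checks(estimator, checks, xfail_checks):
    # checks maps a check name to the check callable; xfail_checks maps a
    # check name to the reason it is expected to fail for this estimator
    for check_name, check in checks.items():
        if check_name in xfail_checks:
            # skip tests marked as a known failure and raise a warning
            msg = xfail_checks[check_name]
            warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
            continue
        check(estimator)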


rth commented Mar 4, 2020

To answer the question of why SVC fails even though it should have been fixed in #14286, the failures of,

$ pytest sklearn/tests/test_common.py -k "check_sample_weights_invariance and (SVC or SVR)" --runxfail

are,

Not equal to tolerance rtol=1e-07, atol=0
E                   For LinearSVC sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 0.03157996
E                   Max relative difference: 0.03572515

E                   For LinearSVR sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 0.17451726
E                   Max relative difference: 0.16016769

E                   For NuSVC sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 2.06101805e-05
E                   Max relative difference: 2.06106053e-05

E                   For NuSVR sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 0.00042324
E                   Max relative difference: 0.00042301

E                   For SVC sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 0.00053014
E                   Max relative difference: 0.00053037

E                   For SVR sample_weight is not equivalent to removing samples
E                   Mismatch: 100%
E                   Max absolute difference: 0.00065833
E                   Max relative difference: 0.00059826

so LinearSVC and LinearSVR definitely look broken; they were not addressed in #14286.

To make SVR, SVC, NuSVR and NuSVC pass, one would need to increase the relative tolerance to 6e-4, which is quite a lot. The test added in that PR checked coefficients with rtol=1e-3. Maybe that's the best we can do for libsvm (is there any 32-bit casting happening internally?), not sure.
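
To make the magnitude concrete, the comparison involved is roughly of this kind (the dataset, the estimator choice and the use of decision_function here are my own illustration, not the code of the common check):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, random_state=0)
sw = np.ones(len(y))
sw[:10] = 0  # zero out the weight of the first ten samples

est_weighted = SVC().fit(X, y, sample_weight=sw)
est_trimmed = SVC().fit(X[10:], y[10:])  # same data with those samples removed

# per the failures above, the libsvm based estimators need rtol on the order
# of 6e-4 for this kind of comparison to pass
np.testing.assert_allclose(
    est_weighted.decision_function(X),
    est_trimmed.decision_function(X),
    rtol=6e-4,
)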


@agramfort agramfort left a comment


let's merge this and then we can fix estimators one by one

@rth rth requested a review from NicolasHug March 5, 2020 08:28

@jnothman jnothman left a comment


This control over checks is quite satisfying to see in action :)

Is there a reason we are modifying y and not X? Doing so means this check only works for supervised estimators.

err_msg = (f"For {name} sample_weight=None is not equivalent to "
f"sample_weight=ones")
elif kind == 'zeros':
X2 = np.vstack([X, X])
Member

Deserves a comment like "Construct a dataset that is very different to (X, y) if weights are disregarded, but identical to (X, y) given weights".

However, this doesn't really work when the estimator is unsupervised. This could pass if weights are disregarded.


rth commented Mar 5, 2020

Is there a reason we are modifying y and not X? Doing so means this check only works for supervised estimators.

Good point, thanks! Also added a modification to X.
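
For reference, a minimal sketch of the kind="zeros" construction under discussion (the exact perturbation of X and y used in the PR may differ; this only illustrates building a dataset that matches (X, y) once the zero weights are taken into account):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import shuffle

X, y = make_classification(n_samples=100, random_state=0)

# duplicate the data and perturb the copy so that it is very different from
# (X, y) if the weights are disregarded
X2 = np.vstack([X, X + 1.0])
y2 = np.hstack([y, 1 - y])
sw2 = np.ones(2 * len(y))
sw2[len(y):] = 0  # the perturbed half gets zero weight
X2, y2, sw2 = shuffle(X2, y2, sw2, random_state=0)

# fitting on (X2, y2) with sample_weight=sw2 should then be equivalent to
# fitting on (X, y) without weights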

'check_sample_weights_invariance(kind=zeros)':
'zero sample_weight is not equivalent to removing samples',
}
}
@rth rth Mar 5, 2020

RidgeClassifierCV has now also started to fail, and quite a high tolerance would be required to make it pass.

Member

Completely unrelated to this PR but since you've been doing this (which is helpful, thanks): do you think it'd be useful for github to support comments from PR author to reviewers, but something distinct from regular review comment?

Member Author

Completely unrelated to this PR but since you've been doing this (which is helpful, thanks): do you think it'd be useful for github to support comments from PR author to reviewers, but something distinct from regular review comment?

Maybe, but I can't say the current way it works is an issue for me.

@NicolasHug NicolasHug left a comment

Thanks @rth , a few comments and questions


Comment on lines +462 to +465
# skip tests marked as a known failure and raise a warning
msg = xfail_checks[check_name]
warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
continue
Member

Can we apply the xfail decorator as a function instead of manually replicating its behavior?

I.e.

check = pytest.mark.xfail(check)

or something like that?

Because we have the try / except logic just below (and we might want to update the comment about pandas)

@rth rth Mar 5, 2020

We can't use pytest in this file, since it's supposed to work without it.

There is indeed some slight redundancy with _mark_xfail_checks, but I think trying to factorize it might be more trouble than it's worth. It's just 6 extra lines in the end.

Updated the comment about pandas.

Member

I'd say the introduction of the tag is a good reason to start requiring pytest for using check_estimator now. But agreed that's another story.

It's just 6 extra lines in the end

Sure, though I find our whole test suite really hard to maintain in general.

Member Author

Sure, though I find our whole test suite really hard to maintain in general.

Agreed; I'm just saying that slightly more verbosity and two repetitions are easier to maintain than coming up with some function wrappers. The problem is not so much the number of lines of code as the complexity.

if kind == 'ones':
X2 = X
y2 = y
sw2 = None
Member

Would it be more natural to set this to ones, and set the SW of estimator1 to None instead?

Member Author

Indeed, fixed.
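
Presumably the fixed branch reads roughly like the sketch below (not the literal diff; X, y, name and kind are already defined inside the check, and the err_msg text is taken from the excerpt earlier in this thread):

if kind == 'ones':
    X2, y2 = X, y
    sw2 = np.ones(shape=len(y))
    err_msg = (f"For {name} sample_weight=None is not equivalent to "
               f"sample_weight=ones")
# estimator1 is then fitted with sample_weight=None and estimator2 with sw2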

@NicolasHug NicolasHug left a comment

LGTM, thanks @rth


@@ -816,7 +825,7 @@ def check_sample_weights_shape(name, estimator_orig):


@ignore_warnings(category=FutureWarning)
-def check_sample_weights_invariance(name, estimator_orig):
+def check_sample_weights_invariance(name, estimator_orig, kind="ones"):
Member

Should this check be tested?


rth commented May 10, 2020

Should this check be tested?

Well, for common tests there is a risk of false negatives and false positives:

  • false negatives (i.e. errors by mistake) would break the test suite but would be fairly easy to detect, since CI would fail;
  • false positives (i.e. not erroring when they should) don't happen here, since we had to xfail the tests that failed.

Sorry, I don't have much availability to write detailed tests for this at the moment. I think it's more useful to have this in master (the original PR was 8 months ago) than to leave it in its current state; it also avoids blocking #15554.

Merging with +2.

@rth rth changed the title from "Common check for sample weight invariance with removed samples (continued)" to "Common check for sample weight invariance with removed samples" on May 10, 2020
@rth rth merged commit 77279d6 into scikit-learn:master May 10, 2020
@rth rth deleted the sample-weight-common-check-xfail branch May 10, 2020 13:41

rth commented May 10, 2020

OK, no, actually I should have re-run CI, since there were outdated changes in check_estimator and the way xfail is handled. I was too optimistic. Fix in #17175.

rth added a commit that referenced this pull request May 10, 2020
rth added a commit to rth/scikit-learn that referenced this pull request May 10, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020