add common test that zero sample weight means samples are ignored #15015
Conversation
Added decision_function and predict_proba as well. For the CV ones, I guess the splits are different because the CV is influenced by the zero sample weights. That's a bit weird, but I'm not sure what the expected behavior should be.
The whole SVM family should be fixed by #14286; waiting for a second review there. Edit: actually, that only solves the issue with negative weights.
```python
y2 = _enforce_estimator_tags_y(estimator3, y2)
weights = np.ones(shape=len(y) * 2)
weights[len(y):] = 0
X2, y2, weights = shuffle(X2, y2, weights, random_state=0)
```
Shuffling the data might have an impact, mightn't it? I was checking, and OneClassSVM gives the same result if you do not shuffle the data.
Yes indeed. Your test is better with respect to that, I think. I didn't want things to be sorted by weight (the CV estimators would fail), but I guess I would need to ensure that the order of the non-zero-weight samples stays the same (which is what you did, I think?).
Also, the default `tol` is too high to compare the decision functions. I just tried `SVC` with `tol=1e-12`, and then the decision functions are really close.
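A sketch of the kind of comparison being described (the dataset, the padding construction, and the tolerance values here are illustrative, not the PR's exact code):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.utils import shuffle

X, y = make_classification(random_state=0)

# Pad the data with a zero-weighted duplicate of itself.
X2, y2 = np.vstack([X, X]), np.hstack([y, y])
weights = np.ones(len(y) * 2)
weights[len(y):] = 0
X2, y2, weights = shuffle(X2, y2, weights, random_state=0)

# With the default tol the two decision functions can differ visibly;
# with a much tighter tol they should agree closely.
svc_ref = SVC(tol=1e-12).fit(X, y)
svc_pad = SVC(tol=1e-12).fit(X2, y2, sample_weight=weights)
np.testing.assert_allclose(svc_ref.decision_function(X),
                           svc_pad.decision_function(X),
                           rtol=1e-6, atol=1e-6)
```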
I'd say splitters should take the weights into account when shuffling/splitting the data. If the data has a lot of samples with zero weight, ignoring the weights would mean some splits may consist entirely of samples with `sample_weight=0`. Also, for a shuffled k-fold split it makes sense (to me) to have a similar total weight across the splits. That said, I'm not sure it's a reasonable expectation to get the same set of non-zero-weight samples back after introducing a bunch of zero-weight samples into the data.
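A tiny illustration of that failure mode, on hypothetical data:

```python
import numpy as np
from sklearn.model_selection import KFold

# 10 samples, but only the first two carry any weight.
X = np.arange(10).reshape(-1, 1)
weights = np.zeros(10)
weights[:2] = 1.0

# KFold never sees the weights, so most test folds end up with a
# total weight of zero.
for _, test in KFold(n_splits=5).split(X):
    print(test, weights[test].sum())
```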
I agree with your reasoning. However, that's a behavior change and also an API change: right now, the splitters' `split` signature doesn't take `sample_weight`.

It might also be tricky to define what the desired behavior is. Say that for `KFold` we want the sum of the weights to be the same for each fold (instead of the number of samples). Do we want this to be as stable as possible, or do we want to make sure the folds also contain similar numbers of samples? What does it mean for `StratifiedKFold`?

Now suppose we do that. I assume the case with all equal sample weights (all ones) will result in a degenerate relaxed solution, and so the rounding will be tricky. In that case we can't recover the original unweighted solution from the all-ones case :-/ That possibly means we need to come up with a greedy heuristic (as we did in other cases) that reduces to the unweighted case more easily.

And finally, as you said, we need to define our expectations. We already give different results after shuffling the data in some cases (should this be a tag? hmm... for randomized algorithms this is a bit tricky to define). Do we want to say that if we replace a sample with two identical samples, each with half the weight, the results should be the same? That's probably not mathematically true for SGD, right?

Right now I think the easiest solution might be a tag that says an estimator depends on the order of the data.
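To make the greedy-heuristic idea concrete, here is a hypothetical sketch (not anything scikit-learn implements): each sample is assigned, in order, to the fold whose weight sum is currently smallest.

```python
import numpy as np

def weight_balanced_fold_assignment(sample_weight, n_splits):
    """Hypothetical greedy assignment: each sample goes to the fold
    with the smallest total weight so far, keeping per-fold weight
    sums as even as possible."""
    fold_of = np.empty(len(sample_weight), dtype=int)
    fold_weight = np.zeros(n_splits)
    for i, w in enumerate(sample_weight):
        k = int(np.argmin(fold_weight))
        fold_of[i] = k
        fold_weight[k] += w
    return fold_of

# With all-ones weights this reduces to a round-robin assignment
# (folds of equal size), but not to the contiguous folds that plain
# KFold produces, which is exactly the "reduces to the unweighted
# case" difficulty mentioned above.
print(weight_balanced_fold_assignment(np.ones(9), 3))
```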
I am not sure we want to go down the way of …
Merged master to trigger a CI run with the result of #14286. I also updated the description to be able to tick estimators and track progress.
#14286 alone is not enough to fix the common test for the models in the SVM family. A stricter `tol` might also be required.
From my previous try, we needed to reduce the tolerance for the SVM classifiers. #15038 should also be merged (it should not be an issue here, since those are not the default parameters).
How about marking the failing estimators with xfail, opening a follow-up issue, and merging this? It would be helpful to have this in master (otherwise one has to check out this branch and locally merge master to do anything about it). Whereas if it was in master, one could simply run it with …
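A sketch of how known failures could be marked with pytest's xfail (the failure set and the test wiring here are hypothetical, not the PR's actual machinery; `all_estimators` is scikit-learn's estimator-discovery helper):

```python
import pytest
from sklearn.utils import all_estimators

# Hypothetical set of estimators known to fail the new check.
KNOWN_FAILURES = {"SVC", "NuSVC", "OneClassSVM"}

@pytest.mark.parametrize(
    "name, Estimator",
    [
        pytest.param(
            name, Estimator, id=name,
            marks=pytest.mark.xfail(reason="zero sample_weight is not ignored")
            if name in KNOWN_FAILURES else (),
        )
        for name, Estimator in all_estimators()
    ],
)
def test_zero_sample_weight_means_ignored(name, Estimator):
    ...  # run the common check against Estimator()
```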
I continued this PR in #16507, as suggested above, which would allow merging this test to master with some estimators marked as known failures.
Adding a test following #10873.
This is a pretty simple `sample_weight` test asserting that a weight of 0 is equivalent to not having those samples. I think every failure here should be considered a bug.
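In spirit, the check amounts to something like this (a minimal sketch, not the PR's exact code; `LogisticRegression` is just an example estimator):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.utils import shuffle

X, y = make_classification(random_state=0)

# Pad the data with a second, zero-weighted copy of itself, shuffling
# with a fixed seed so the construction is deterministic.
X2, y2 = np.vstack([X, X]), np.hstack([y, y])
weights = np.ones(len(y) * 2)
weights[len(y):] = 0
X2, y2, weights = shuffle(X2, y2, weights, random_state=0)

est_ref = LogisticRegression().fit(X, y)
est_pad = LogisticRegression().fit(X2, y2, sample_weight=weights)

# Zero-weight samples should be ignored entirely, so the two models
# should make the same predictions.
np.testing.assert_array_equal(est_ref.predict(X), est_pad.predict(X))
```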
I wonder if, for the SGD-based algorithms, there's an issue where we shrink `w` when we see a sample with sample weight 0.
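That suspicion is easy to illustrate with a toy SGD loop (a simplification assuming squared loss and an L2 penalty; this is not `SGDClassifier`'s actual update code):

```python
import numpy as np

def sgd_epoch(w, X, y, sample_weight, eta=0.01, alpha=0.001):
    """One epoch of SGD for squared loss with an L2 penalty.

    If the regularization shrinkage is applied unconditionally (as
    sketched here), a zero-weight sample still shrinks w, so padding
    the data with zero-weight samples changes the result.
    """
    for x_i, y_i, sw in zip(X, y, sample_weight):
        # Shrinkage happens for every sample seen...
        w *= 1 - eta * alpha
        # ...but the loss gradient is scaled by the sample weight,
        # so it vanishes when sw == 0.
        grad = sw * (w @ x_i - y_i) * x_i
        w -= eta * grad
    return w

rng = np.random.RandomState(0)
X, y = rng.randn(20, 3), rng.randn(20)
w0 = np.zeros(3)
w_plain = sgd_epoch(w0.copy(), X, y, np.ones(20))

# Same data padded with zero-weight samples: the extra shrinkage
# steps alone make the result drift away from the unpadded fit.
Xp, yp = np.vstack([X, X]), np.hstack([y, y])
w_padded = sgd_epoch(w0.copy(), Xp, yp, np.r_[np.ones(20), np.zeros(20)])
print(np.abs(w_plain - w_padded).max())  # nonzero
```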