[MRG+1] [BUG] AdaBoostRegressor should not raise errors if the base_estimator does not support sample_weights #5207
Conversation
@glouppe @arjoly Could you please verify whether this is a bug or not? I do not know the internals of boosting well enough to comment on whether the sample_weights provided to fit in the meta-estimator are the same as those passed to fit in the base estimators, but from how it is implemented in AdaBoostClassifier, it seems so.
Interesting. I am encountering a failed test in this PR: #5204.
Yes :) It might be better to keep the changes independent though, i.e., wait till this PR is merged and then rebase the sample_weights PR on top of it.
What is …
I think the idea is to raise an error for (i.e., not support) base estimators which do not support sample_weights. However, I suppose an error should be raised only when sample_weight is not None in these cases? Anyhow, the checks for LogisticRegression can be removed in the other PR, or the solver can be explicitly set to liblinear, since that should also raise a ValueError (although in a different part of the code).
Actually, this is not a bug. The weighting of samples is taken care of by the block of code above, which does a weighted random sampling of the training set. AdaBoostRegressor should work without the base estimator needing sample_weights.
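To make the mechanism concrete, here is a minimal sketch of the resampling idea (not the actual scikit-learn code; the function name and signature are made up for illustration). The base estimator never receives sample_weight; the weights only determine which rows end up in the bootstrap sample that fit sees:

    import numpy as np

    def boost_resample_sketch(estimator, X, y, sample_weight, rng):
        # Draw a bootstrap sample with probability proportional to the weights.
        cdf = np.cumsum(sample_weight, dtype=np.float64)
        cdf /= cdf[-1]
        bootstrap_idx = np.searchsorted(cdf, rng.uniform(size=X.shape[0]))
        # Fit on the resampled data: plain fit(X, y), no sample_weight needed.
        estimator.fit(X[bootstrap_idx], y[bootstrap_idx])
        return estimator, bootstrap_idx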
Force-pushed from 9bd22a5 to b86b846
@vstolbunov I've updated the pull request. Let me know what you think.
Good point. Looks good to me.
I don't understand this PR. AdaBoostRegressor needs the base estimator to use the sample weights that are computed at each iteration (see https://github.com/MechCoder/scikit-learn/blob/ensemble_sample_weight_bug/sklearn/ensemble/weight_boosting.py#L1053 for regression). Therefore the base estimator must support sample weights.
I am confused, but the code does not require these weights to be passed to the base estimator's fit method here (https://github.com/MechCoder/scikit-learn/blob/ensemble_sample_weight_bug/sklearn/ensemble/weight_boosting.py#L1020), and the handling of sample_weights is done externally, from L1011 to L1016, right? Where else is it passed to the base estimator?
Indeed, that is strange. I think this was done on purpose but I am not sure anymore. @ndawe do you remember?
Are you okay with removing the sample_weights check for now (as done in this PR) and opening an issue to discuss this for the future?
This check for sample_weight support was not in the original implementation that I wrote (as I recall). Git blame shows @akshayah3. For regression, sample_weight support is not required in the base estimator since AdaBoost.R2 only uses the weights in the bootstrapping. So this check should be removed.
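A quick way to exercise the behavior this PR enables (a sketch; KNeighborsRegressor is just an arbitrary example of an estimator whose fit does not accept sample_weight):

    import numpy as np
    from sklearn.ensemble import AdaBoostRegressor
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(100, 4)
    y = rng.rand(100)

    # KNeighborsRegressor.fit has no sample_weight parameter; with the check
    # removed, the meta-estimator should still fit, because AdaBoost.R2 only
    # uses the weights to draw the bootstrap sample (and in the weighted
    # median at prediction time).
    regr = AdaBoostRegressor(base_estimator=KNeighborsRegressor(),
                             n_estimators=5, random_state=0)
    regr.fit(X, y)
    print(regr.predict(X[:3]))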
Force-pushed from b86b846 to 89f4c8b
@ndawe Thanks! Would you be able to do a quick code review? This is blocking another PR.
+1 Just had a quick look and this looks fine to me.
""" | ||
class DummyEstimator(BaseEstimator): | ||
|
||
def fit(self, X, y, sample_weight=None): |
sample_weight=None isn't needed here, right?
ah yes, that was the point of the test in the first place :/
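For reference, a self-contained sketch of the kind of check this test performs (not the exact test from the PR; the data and the DummyEstimator body below are illustrative):

    import numpy as np
    from sklearn.base import BaseEstimator
    from sklearn.ensemble import AdaBoostRegressor

    class DummyEstimator(BaseEstimator):
        # A regressor whose fit signature deliberately lacks sample_weight.
        def fit(self, X, y):
            return self

        def predict(self, X):
            return np.zeros(X.shape[0])

    rng = np.random.RandomState(0)
    X = rng.rand(20, 3)
    y = rng.rand(20)

    # With the sample_weight check removed, this should fit without raising:
    # the weights are only used for the internal weighted resampling.
    AdaBoostRegressor(base_estimator=DummyEstimator(), n_estimators=3).fit(X, y)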
Force-pushed from 89f4c8b to bb2aced
I don't see why we shouldn't just merge this right now :)
I'll leave the bold step of doing this to others :)
Sorry for the late review. I just checked the paper; this is indeed the way it is supposed to be for AdaBoost.R2.
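For readers without the paper at hand, this is my reading of AdaBoost.R2 (Drucker, 1997), here with the linear loss and the weights w_i normalized to sum to one: the base learner is trained on a bootstrap sample drawn according to the current weight distribution, and the weights themselves only enter the loss averaging, the weight update, and the final weighted median:

    L_i = \frac{|y_i - \hat{y}_i|}{\max_j |y_j - \hat{y}_j|}, \qquad
    \bar{L} = \sum_i w_i L_i, \qquad
    \beta = \frac{\bar{L}}{1 - \bar{L}}, \qquad
    w_i \leftarrow \frac{w_i \, \beta^{\,1 - L_i}}{Z}

where Z renormalizes the weights and boosting continues only while \bar{L} < 0.5. Nothing in this scheme requires the base estimator's fit to accept sample_weight.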
So was #3711 just a mistake?
Yes, it shouldn't have been added for the regressor. (But this was correct for the classifier.)
Thanks @MechCoder!
AdaBoostRegressor should work without sample_weights in the base estimator
The random weighted sampling is done internally in the _boost method in AdaBoostRegressor.