[MRG] Expose an apply method for gradient boosters #5222
Conversation
Could you also change https://github.com/scikit-learn/scikit-learn/blob/master/examples/ensemble/plot_feature_transformation.py#L76 to make use of this new method?
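Something along these lines might work in the example (a minimal sketch, not the actual example code; it assumes apply returns an array of shape (n_samples, n_estimators, n_classes) as described in this PR, and the variable names are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

# Toy binary classification data standing in for the example's dataset.
X, y = make_classification(n_samples=200, random_state=0)

grd = GradientBoostingClassifier(n_estimators=10, random_state=0)
grd.fit(X, y)

# apply() gives, for every sample and every boosting stage, the index of the
# leaf the sample falls into; for binary classification the trailing class
# axis has length 1, so drop it before one-hot encoding the leaf indices.
leaves = grd.apply(X)[:, :, 0]
enc = OneHotEncoder()
X_transformed = enc.fit_transform(leaves)
```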
Review comment on the following lines of the new method:

    return the index of the leaf x ends up in.
    """

    if self.estimators_ is None or len(self.estimators_) == 0:
Maybe some code from https://github.com/jmschrei/scikit-learn/blob/gbt_apply/sklearn/ensemble/gradient_boosting.py#L1068 could be factored out into a _validate_X_predict method?
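A hypothetical sketch of the kind of _validate_X_predict helper suggested above (not code from this branch; on the estimator it would be a method, here written as a plain function for illustration):

```python
import numpy as np
from sklearn.utils.validation import check_array, check_is_fitted


def _validate_X_predict(estimator, X):
    """Shared input checks for predict-like methods such as apply."""
    # Refuse to run on an unfitted estimator.
    check_is_fitted(estimator, "estimators_")
    # The underlying trees expect C-ordered float32 input.
    X = check_array(X, dtype=np.float32, order="C")
    # The attribute holding the number of fitted features is illustrative;
    # its name differs between scikit-learn versions.
    n_features = estimator.n_features_in_
    if X.shape[1] != n_features:
        raise ValueError(
            "Number of features of the input (%d) does not match the "
            "number of features of the fitted model (%d)."
            % (X.shape[1], n_features)
        )
    return X
```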
I thought we didn't have this method because the base estimator doesn't need to be a tree?
This doesn't consider the base estimator, only the successive trees grown off the gradient of the base estimator.
Ah, I'm sorry, you meant that you can gradient boost models other than trees. Currently it is hard coded as trees, and I haven't seen any movement towards including arbitrary gradient boosting thus far. Maybe @ogrisel can comment more?
I accidentally muddled this PR with my other PR. I am closing this one and will open a new one.
I don't know if it's useful to gradient boost other models in practice. I know XGBoost can use linear models as base learners. However, I have a hard time understanding why this is not equivalent to fitting a linear model directly to the loss function. Would love a simple explanation for that. Maybe @pprett knows?
I think it is similar to stacked linear models - by training over different [...]
The linear model in xgboost is exactly the same as fitting a linear model to the loss function with parallel coordinate descent. It is implemented under the same interface as gradient boosting because they are connected in nature: fitting an additive linear model in a gradient boosting fashion is equivalent to coordinate descent on a single linear model.
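One way to see the equivalence @tqchen describes (my own sketch, not from the thread): a sum of linear base learners is itself a single linear model, so each boosting stage merely updates the combined weight vector, which is a coordinate/block-descent step on the same loss.

```latex
F_M(x) = \sum_{m=1}^{M} w_m^{\top} x
       = \Big(\sum_{m=1}^{M} w_m\Big)^{\top} x
       = W^{\top} x,
\qquad
w_m = \arg\min_{w} \sum_{i} L\bigl(y_i,\, F_{m-1}(x_i) + w^{\top} x_i\bigr)
```

so the staged fit never leaves the hypothesis class of a single linear model; it only changes the path taken to its coefficients.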
Thanks @tqchen. That sounds different from what we would get by using linear models as base estimators in scikit-learn gb classes though.
Yes, I guess this was because of the difference in interface. xgboost's gbm class is an update-style interface where a choice can be made at each step whether to add a new estimator or to improve the current estimator based on the statistics. For linear models, this allows in-place modification of the loss function.
In response to #5209, I have added an apply method for gradient boosters. It returns an array of shape (n_samples, n_estimators, n_classes), where each entry is the index of the terminal leaf the sample ends up in. This mostly wraps the DecisionTree apply method and mirrors the RandomForest one. A simple unit test has been added as well.
cc @ogrisel @pprett @glouppe @arjoly
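For reference, a minimal usage sketch of the proposed method (my own illustration, assuming the return shape described above; this is not the unit test added in this branch):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Three classes, so one tree is grown per class at every boosting stage.
X, y = make_classification(n_samples=100, n_classes=3, n_informative=6,
                           random_state=0)

clf = GradientBoostingClassifier(n_estimators=5, random_state=0)
clf.fit(X, y)

# One leaf index per (sample, boosting stage, per-class tree).
leaves = clf.apply(X)
assert leaves.shape == (100, 5, 3)
```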