Description
Describe the workflow you want to enable
As the partial dependence of a model at a point is defined as an expectation, it should respect sample_weight if someone wishes to use it (for instance, when you know your X does not follow the distribution you are actually interested in).
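To make the expectation concrete, here is an illustrative sketch of what "partial dependence respecting sample_weight" means for a brute-force computation. The helper name `weighted_partial_dependence` is hypothetical, not part of the scikit-learn API:

```python
# Hypothetical sketch (not sklearn API): weighted brute-force partial dependence.
import numpy as np
from sklearn.linear_model import LinearRegression

def weighted_partial_dependence(model, X, feature, grid, sample_weight):
    """Weighted average of predictions with `feature` clamped to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # clamp the feature of interest
        # The expectation over the data is taken with sample_weight.
        averages.append(np.average(model.predict(X_mod), weights=sample_weight))
    return np.array(averages)

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = 2.0 * X[:, 0] + X[:, 1]
w = rng.uniform(size=200)

model = LinearRegression().fit(X, y, sample_weight=w)
grid = np.linspace(-1, 1, 5)
pd_values = weighted_partial_dependence(model, X, 0, grid, w)
```

For this noiseless linear model the weighted partial dependence over feature 0 is linear in the grid value with slope 2, whatever the weights are; the weights only shift the curve through the weighted mean of the other feature.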
#25209 tries to solve this for method='brute' when you have new X. For the older tree-based models trained with sample_weight, method='recursion' keeps track of the training sample_weight and takes it into account when computing the partial dependence.
But, as discussed during the implementation of sample_weight on the HistGradientBoosting estimators (#14696 (comment)), these models store an attribute _fitted_with_sw, and when partial dependence with the recursion method is requested, they raise an error:
scikit-learn/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
Lines 1142 to 1148 in 205f3b7

```python
if getattr(self, "_fitted_with_sw", False):
    raise NotImplementedError(
        "{} does not support partial dependence "
        "plots with the 'recursion' method when "
        "sample weights were given during fit "
        "time.".format(self.__class__.__name__)
    )
```
Describe your proposed solution
As discussed in #24872 (comment), the big difference between the other tree-based algorithms and HistGradientBoosting is that HistGradientBoosting does not save weighted_n_node_samples when building its trees.
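For comparison, a sketch of the attribute the recursion method relies on in the classic tree structure (assuming scikit-learn): the fitted tree records the total sample weight reaching each node, something the HistGradientBoosting predictor nodes currently lack:

```python
# Sketch: weighted_n_node_samples on a classic sklearn tree.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.normal(size=(50, 2))
y = X[:, 0]
w = rng.uniform(0.5, 2.0, size=50)

tree = DecisionTreeRegressor(max_depth=2, random_state=0)
tree.fit(X, y, sample_weight=w)

# The root node holds the total training sample weight; children split it up.
root_weight = tree.tree_.weighted_n_node_samples[0]
print(root_weight)
```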
Describe alternatives you've considered, if relevant
No response
Additional context
No response