ENH partial_dependence plot for HistGradientBoosting estimator fitted with sample_weight #25210

@vitaliset

Description

Describe the workflow you want to enable

Because the partial dependence of a model at a point is defined as an expectation, it should respect sample_weight when the user supplies one (for instance, when you know that your X does not follow the distribution you are interested in).
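To make the "expectation" point concrete, here is a minimal pure-Python sketch of a brute-force partial dependence computed as a weighted average. It is only an illustration, not scikit-learn's implementation: `weighted_partial_dependence`, `toy_model`, and the tiny dataset are hypothetical names and values invented for this example.

```python
def weighted_partial_dependence(predict, X, feature, grid, sample_weight=None):
    """Brute-force partial dependence of `predict` on column `feature`.

    `predict` is a hypothetical callable mapping a list of rows to a list
    of predictions; `X` is a list of rows (lists of floats).
    """
    if sample_weight is None:
        # Uniform weights reproduce the usual unweighted average.
        sample_weight = [1.0] * len(X)
    total_weight = sum(sample_weight)

    averaged = []
    for value in grid:
        # Force the feature of interest to `value` for every row.
        X_mod = [row[:feature] + [value] + row[feature + 1:] for row in X]
        preds = predict(X_mod)
        # Weighted expectation of the predictions over the data.
        weighted_sum = sum(w * p for w, p in zip(sample_weight, preds))
        averaged.append(weighted_sum / total_weight)
    return averaged


def toy_model(rows):
    # Hypothetical model with an interaction between the two features.
    return [row[0] * row[1] for row in rows]


X = [[1.0, 0.0], [1.0, 10.0]]
grid = [1.0, 2.0]

unweighted = weighted_partial_dependence(toy_model, X, feature=0, grid=grid)
weighted = weighted_partial_dependence(
    toy_model, X, feature=0, grid=grid, sample_weight=[3.0, 1.0]
)
print(unweighted)  # [5.0, 10.0]
print(weighted)    # [2.5, 5.0]
```

Up-weighting the first row changes the curve, which is the behaviour one would expect when X is not distributed like the population of interest.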

#25209 tries to solve this for method='brute' when you have a new X. For older tree-based models trained with sample_weight, method='recursion' keeps track of the training sample_weight and takes it into account when computing partial_dependence.

But, as discussed during the implementation of sample_weight on the HistGradientBoosting estimators (#14696 (comment)), these models store an attribute _fitted_with_sw, and when partial_dependence is requested with method='recursion', they raise an error:

if getattr(self, "_fitted_with_sw", False):
    raise NotImplementedError(
        "{} does not support partial dependence "
        "plots with the 'recursion' method when "
        "sample weights were given during fit "
        "time.".format(self.__class__.__name__)
    )

Describe your proposed solution

As discussed in #24872 (comment), the big difference between the other tree-based algorithms and HistGradientBoosting is that HistGradientBoosting does not save weighted_n_node_samples when building the tree.
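To illustrate why that attribute matters, below is a simplified sketch of a recursion-style partial dependence on a toy tree. It is not the scikit-learn code: the `Node` class and `pd_recursion` function are hypothetical, and the only assumption carried over from the discussion is that each node stores a `weighted_n_node_samples` value. When a split uses a feature other than the target one, both children are visited and weighted by their share of the parent's weighted samples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical toy node; real estimators store trees differently.
    weighted_n_node_samples: float
    value: float = 0.0
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def pd_recursion(node, target_feature, target_value, weight=1.0):
    """Weighted traversal of a single tree for one grid value."""
    if node.left is None:
        # Leaf: contribute its value, scaled by the accumulated weight.
        return weight * node.value
    if node.feature == target_feature:
        # Split on the target feature: follow only the matching branch.
        child = node.left if target_value <= node.threshold else node.right
        return pd_recursion(child, target_feature, target_value, weight)
    # Split on another feature: visit both branches, weighting each by
    # the fraction of (weighted) training samples that went that way.
    total = node.weighted_n_node_samples
    left_frac = node.left.weighted_n_node_samples / total
    right_frac = node.right.weighted_n_node_samples / total
    return pd_recursion(
        node.left, target_feature, target_value, weight * left_frac
    ) + pd_recursion(
        node.right, target_feature, target_value, weight * right_frac
    )


# Toy tree: the root splits on feature 1, its left child on feature 0.
tree = Node(
    weighted_n_node_samples=4.0, feature=1, threshold=0.5,
    left=Node(
        weighted_n_node_samples=3.0, feature=0, threshold=0.0,
        left=Node(weighted_n_node_samples=2.0, value=1.0),
        right=Node(weighted_n_node_samples=1.0, value=2.0),
    ),
    right=Node(weighted_n_node_samples=1.0, value=10.0),
)

pd_low = pd_recursion(tree, target_feature=0, target_value=-1.0)
pd_high = pd_recursion(tree, target_feature=0, target_value=1.0)
print(pd_low, pd_high)  # 3.25 4.0
```

Without per-node weighted counts, the branch fractions above can only be computed from unweighted sample counts, which would ignore the sample_weight given at fit time.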

Describe alternatives you've considered, if relevant

No response

Additional context

No response
