HistGradientBoosting* interaction constraints #19148

Closed
lorentzenchr opened this issue Jan 10, 2021 · 12 comments · Fixed by #21020
Labels: Moderate (anything that requires some knowledge of conventions and best practices), module:ensemble, New Feature

Comments

@lorentzenchr
Member

lorentzenchr commented Jan 10, 2021

Describe the workflow you want to enable

I'd like to use HistGradientBoostingClassifier and HistGradientBoostingRegressor with the option to set interaction constraints for certain features. As noted in microsoft/LightGBM#2884 (comment), this is one way to make those black boxes more intuitive and interpretable. In addition, it makes it much easier to marginalize over those features.

Additional context

LightGBM has interaction_constraints (see their docs), and XGBoost has them as well (see their docs).
Also have a look at the XGBoost tutorial on interaction constraints for a nice visualization and for potential benefits:

  • Better predictive performance from focusing on interactions that work – whether through domain specific knowledge or algorithms that rank interactions
  • Less noise in predictions; better generalization
  • More control to the user on what the model can fit. For example, the user may want to exclude some interactions even if they perform well due to regulatory constraints
@lorentzenchr
Member Author

One open question is the API. As long as feature names are not available, I guess it would be index-based, similar to monotonic_cst. See also how LightGBM specifies interaction constraints: https://lightgbm.readthedocs.io/en/latest/Parameters.html#interaction_constraints.

@ogrisel
Member

ogrisel commented Apr 28, 2021

I agree we could allow the same index-based constraint specs as for monotonic_cst, plus a special interaction_cst="univariate" option to make it possible to implement simple pseudo-GAMs.
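A minimal sketch of how such a spec could be normalized, assuming a hypothetical helper `resolve_interaction_cst` (not part of scikit-learn). Under this assumption, the string `"univariate"` expands to one singleton group per feature, which forbids every interaction and gives an additive, GAM-like model.

```python
def resolve_interaction_cst(interaction_cst, n_features):
    """Hypothetical helper: turn a user-facing spec into a list of index groups."""
    if interaction_cst is None:
        # No constraints: one group that contains every feature.
        return [list(range(n_features))]
    if isinstance(interaction_cst, str):
        if interaction_cst == "univariate":
            # One singleton group per feature: no feature may interact with another.
            return [[i] for i in range(n_features)]
        raise ValueError(f"Unknown interaction_cst string: {interaction_cst!r}")
    # Index-based spec, as for monotonic_cst: deduplicate and sort each group.
    groups = [sorted(set(group)) for group in interaction_cst]
    for group in groups:
        if any(not 0 <= idx < n_features for idx in group):
            raise ValueError(f"Feature index out of range in group {group}")
    return groups
```

For three features, `resolve_interaction_cst("univariate", 3)` gives `[[0], [1], [2]]`.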

@lorentzenchr
Member Author

lorentzenchr commented Sep 12, 2021

Concerning the API: in both lightgbm and xgboost, interaction constraints are specified as nested lists:

[[0, 1], [2, 3, 4]]

where each inner list is a group of indices of features that are allowed to interact with each other.
It becomes more involved when a feature appears more than once, e.g.

[[0,1], [1, 3, 4]]

It's not well documented, but xgboost (see dmlc/xgboost#7115) and lightgbm (see microsoft/LightGBM#4481) handle this differently:

lightgbm   allowed       forbidden      forbidden       forbidden       forbidden
xgboost    allowed        allowed       forbidden       forbidden       forbidden
              0              0              0               0               0
             / \            / \            / \             / \             / \
            1  ..          1   ..         1   ..          2  ..           3  ..
           / \            / \            / \
          1   0          0   4          2

I propose to implement LightGBM's choice: child nodes inherit the constraints of their parents.
Note that in the XGBoost version, constraints are broken when constraint groups overlap.
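The inherited-constraints rule can be sketched as follows. This is one reading of the lightgbm row in the diagram above, not LightGBM's actual code, and `allowed_features` is a hypothetical name: a node may split only on features from groups that contain every feature already used on the path from the root.

```python
def allowed_features(constraints, path):
    """Features a node may split on under the inherited-constraints rule.

    constraints: nested list of feature-index groups, e.g. [[0, 1], [1, 3, 4]].
    path: features used by splits on the way from the root to this node.
    """
    used = set(path)
    # Keep only the groups that are compatible with every split made so far.
    compatible = [set(group) for group in constraints if used <= set(group)]
    # The node may use any feature from any still-compatible group.
    return set().union(*compatible)


constraints = [[0, 1], [1, 3, 4]]
print(allowed_features(constraints, [0]))     # after a root split on feature 0
print(allowed_features(constraints, [0, 1]))  # after splits on features 0 and 1
```

With the overlapping groups above, a branch that has split on 0 and then on 1 stays restricted to {0, 1}, so a further split on feature 4 is rejected, matching the second tree in the lightgbm row. Features that appear in no group are not handled by this sketch.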

@thomasjpfan thomasjpfan self-assigned this Sep 12, 2021
@lorentzenchr
Member Author

lorentzenchr commented Sep 12, 2021

@thomasjpfan I also started working on it today. Should I share what I have, or are you already underway?

The main point of impact seems to be in the function find_node_split:

with nogil:
    ...
    for feature_idx in prange(n_features,..):  # Here we should only loop over allowed features.
        ...
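To illustrate the idea in plain Python rather than Cython: a toy sketch in which the split search iterates only over the features allowed at the current node. The function name `best_allowed_feature` and the precomputed `gains` list are assumptions made for this example; the real `find_node_split` computes gains from histograms inside a `prange` loop.

```python
def best_allowed_feature(gains, allowed_features=None):
    """Return the index of the allowed feature with the largest positive gain.

    gains: per-feature split gain, assumed to be precomputed (toy input).
    allowed_features: iterable of feature indices, or None for no constraint.
    Returns None when no allowed feature has a positive gain.
    """
    if allowed_features is None:
        candidates = range(len(gains))
    else:
        # Only loop over the features allowed at this node.
        candidates = sorted(allowed_features)

    best_idx, best_gain = None, 0.0
    for feature_idx in candidates:
        if gains[feature_idx] > best_gain:
            best_idx, best_gain = feature_idx, gains[feature_idx]
    return best_idx
```

Restricting the candidate list, instead of computing every gain and filtering afterwards, also avoids work on features that can never be chosen.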

@thomasjpfan
Member

@lorentzenchr Feel free to work on it! Ping me for reviews. :)

@lorentzenchr
Member Author

@mayer79 ping

@mayer79
Contributor

mayer79 commented Sep 20, 2021

Super exciting, thanks for working on this! Regarding the interface, a natural way to treat unlisted variables is to consider them as one separate interaction group. A typical application of interaction constraints is to force one or a couple of features to act additively on the prediction, while letting the remaining features interact freely.

So, e.g., if there are features 0, 1, 2, 3 and the model should be additive only in feature 0, then we could either specify [[0], [1, 2, 3]] or [[0]].
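A minimal sketch of that convention, using a hypothetical helper `complete_constraints` (the name and signature are assumptions for illustration): every feature not listed in any group is collected into one extra group, so the short form `[[0]]` expands to the explicit form.

```python
def complete_constraints(constraints, n_features):
    """Append one group holding all features that no listed group mentions."""
    listed = set()
    for group in constraints:
        listed.update(group)
    # Unlisted features form a single group and may interact freely with each other.
    rest = [idx for idx in range(n_features) if idx not in listed]
    completed = [list(group) for group in constraints]
    if rest:
        completed.append(rest)
    return completed


print(complete_constraints([[0]], n_features=4))  # [[0], [1, 2, 3]]
```

If every feature is already listed, the input groups are returned unchanged.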

@lorentzenchr
Member Author

lorentzenchr commented Oct 23, 2021

Cross referencing #3482 (comment):

It seemed a bit weird in terms of inclusion criteria to me (what's the reference for general interactions constraints?). I guess we're weighing what the implementations do more heavily than what the literature does now, which is an option but not something we have decided on.

In the end, as long as we have easy ways for users to discover how to do gradient boosting GAMs that are in accordance with what's empirically & academically validated then that's great.

@amueller Sometimes things that are very important in practice, like interaction constraints, are not well investigated in the ivory tower of academic research.

@mayer79 You once mentioned a reference paper for interaction constraints. Do you still remember it?

@mayer79
Contributor

mayer79 commented Oct 23, 2021

@lorentzenchr: this is the first source I could find when digging into the topic:
Simon C. K. Lee and Sheldon Lin and Katrien Antonio (2015). Delta Boosting Machine and its Application in Actuarial Modeling. Institute of Actuaries of Australia
Direct link to pdf: https://actuaries.asn.au/Library/Events/ASTINAFIRERMColloquium/2015/AntonioEtAlDeltaBoostingPaper.pdf

@lorentzenchr
Member Author

@mayer79 Thank you.

A paper that uses more complicated interaction constraints for bundling location related features together is https://dx.doi.org/10.2139/ssrn.3924412.

@lorentzenchr
Member Author

BTW, interaction constraints for HGBT are listed among the Technical Committee's priorities for 2021, as point 3 in https://scikit-learn.fondation-inria.fr/technical-committee-november-5-2020-fr/.

@lorentzenchr
Member Author

I also found the paper https://arxiv.org/abs/2007.05758 "Feature Interactions in XGBoost".


5 participants