
[RFC] Tree module improvements #5212

Open
@jmschrei

I am planning to submit several PRs that merge #5041 in gradually, with the ultimate goal of a clean implementation of multithreaded decision tree building so that Gradient Boosting can be faster. With one of the main concepts already merged (#5203), here is a list of separate PRs that I'd like to merge in the near future.

Longer-range goals that I'd like to work towards, but have no clear plan for yet, are the following:

  • Add an approximate splitter
  • Add multithreading support for single decision trees
  • Add a partial fit method for tree building
  • Support categorical variables
  • Support missing values

At that point, it should be clearer to me which specific changes to Splitter, Criteria, and TreeBuilder are needed to make multithreading possible. @glouppe @arjoly @GaelVaroquaux @pprett if you have any comments, I'd love to hear them.
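To make the "approximate splitter" goal above a bit more concrete, here is a rough pure-Python sketch of the general idea: instead of evaluating every possible threshold for a feature, only a small number of quantile-like candidates are scored. This is only an illustration under my own assumptions, not the proposed implementation; the names `gini` and `approximate_best_split` and the `n_bins` parameter are hypothetical and do not correspond to anything in the existing Splitter or Criterion code.

```python
from typing import List, Optional, Tuple


def gini(counts: List[int]) -> float:
    """Gini impurity of a node given per-class sample counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def approximate_best_split(
    values: List[float], labels: List[int], n_classes: int, n_bins: int = 8
) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted_impurity) for a single feature.

    Hypothetical helper: at most n_bins - 1 candidate thresholds, taken at
    evenly spaced ranks of the sorted values, are scored instead of every
    distinct value. Samples with value <= threshold go to the left child.
    """
    n = len(values)
    if n == 0:
        return None, 0.0

    order = sorted(range(n), key=lambda i: values[i])
    sorted_vals = [values[i] for i in order]
    # Approximate quantiles; the set removes repeated candidates.
    candidates = sorted({sorted_vals[(k * n) // n_bins] for k in range(1, n_bins)})

    total = [0] * n_classes
    for y in labels:
        total[y] += 1

    # Start from the impurity of the unsplit node; a split must improve on it.
    best_threshold, best_score = None, gini(total)
    left = [0] * n_classes
    pos = 0
    for t in candidates:
        # Single sweep: move samples with value <= t into the left child.
        while pos < n and sorted_vals[pos] <= t:
            left[labels[order[pos]]] += 1
            pos += 1
        n_left, n_right = pos, n - pos
        if n_left == 0 or n_right == 0:
            continue  # degenerate split, nothing to compare
        right = [total[c] - left[c] for c in range(n_classes)]
        score = (n_left * gini(left) + n_right * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


if __name__ == "__main__":
    x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(approximate_best_split(x, y, n_classes=2, n_bins=8))
    print(approximate_best_split(x, y, n_classes=2, n_bins=2))
```

With few bins the candidate set can miss the best threshold, which is the accuracy-for-speed trade-off such a splitter would make; how candidates are actually chosen (quantiles, histograms, random subsets) is an open design question.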
