Reconsider default of `max_features` of RandomForestRegressor

In https://github.com/scikit-learn/scikit-learn/issues/7254 there was a long discussion about the `max_features` defaults for random forests. As a result, the default "auto" was changed to "sqrt" for RandomForestClassifier, but unfortunately not for RandomForestRegressor. I would like this decision to be reconsidered.

### What to change?

For RandomForestRegressor, the default `max_features="auto"` should resolve to m/3 or sqrt(m), where m is the number of features.
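Until the default changes, both candidate values can be set explicitly. The following is a minimal sketch on synthetic data: the dataset, the choice of m = 12 and the dictionary labels are assumptions made purely for illustration. It passes `None` (all features), the integer m // 3 and the string `"sqrt"` to `max_features`, then reads back the per-split feature count that each fitted tree actually uses.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with m = 12 features (an arbitrary choice for illustration).
X, y = make_regression(n_samples=200, n_features=12, random_state=0)
m = X.shape[1]

# None uses all m features, an int is used as given,
# and "sqrt" is resolved by scikit-learn to int(sqrt(m)).
candidates = {
    "all m (bagged trees)": None,
    "m/3": max(1, m // 3),
    "sqrt(m)": "sqrt",
}

resolved = {}
for name, max_features in candidates.items():
    forest = RandomForestRegressor(
        n_estimators=10, max_features=max_features, random_state=0
    ).fit(X, y)
    # Each fitted tree records how many features it considers per split.
    resolved[name] = forest.estimators_[0].max_features_

print(resolved)
```

With m = 12, the three settings resolve to 12, 4 and 3 features per split respectively.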

### Why?

1. Good defaults are essential for random forests. The fact that random forests do well even without hyperparameter tuning is one of their few advantages over boosted trees.

2. Every implementation in R, as well as h2o, uses sqrt(m) or m/3 as the default. For example, R's `ranger` package uses sqrt(m) for both regression and classification: https://github.com/imbs-hl/ranger

3. Column subsampling per split is the main source of randomness, leading to less correlated trees. The current default (all m features) removes this effect. Strictly speaking, it does not fit a proper random forest but rather bagged trees. In my experience, random forests perform better than bagged trees in most cases.

4. Training time is roughly proportional to `max_features`, i.e. with a better default one could easily fit 500 trees instead of 100.
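Point 3 can be made precise with a standard identity, which is added here for illustration and was not part of the linked discussion. Assume B trees T_b whose predictions at a point x each have variance σ² and pairwise correlation ρ; these symbols are notation introduced only for this sketch. The variance of the averaged prediction is then:

```math
\operatorname{Var}\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

Adding more trees shrinks only the second term. The first term can only be reduced by lowering ρ, which is what a smaller `max_features` aims to do.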

Note: I am not talking about defaults for completely randomized trees, just about proper random forests.
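Points 3 and 4 are easy to check on one's own data. The sketch below is one possible way to do that, not a benchmark: the Friedman #1 synthetic dataset, the sample size, the 30 features and the 100 trees are all arbitrary assumptions. It fits a forest per `max_features` setting and records the fit time and the test R². Results depend heavily on the dataset, so no expected numbers are given.

```python
import time

from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Friedman #1: 5 informative features, the remaining 25 are pure noise.
X, y = make_friedman1(n_samples=600, n_features=30, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# m = 30, so m/3 = 10; "sqrt" is resolved by scikit-learn to int(sqrt(30)) = 5.
settings = {"all m": None, "m/3": 10, "sqrt(m)": "sqrt"}

results = {}
for name, max_features in settings.items():
    forest = RandomForestRegressor(
        n_estimators=100, max_features=max_features, random_state=0
    )
    start = time.perf_counter()
    forest.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    results[name] = {
        "fit_seconds": fit_seconds,
        "r2": forest.score(X_test, y_test),  # R^2 on the held-out split
    }

for name, row in results.items():
    print(f"{name:>8}: fit {row['fit_seconds']:.2f}s, test R^2 {row['r2']:.3f}")
```

Swapping in a real dataset and repeating the run over several seeds would give a fairer picture than this single split.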
