"auto" value of max_features for RandomForestRegressor is poor #7254

Closed
@odusseys

Currently, setting max_features="auto" for RandomForestRegressor (and ExtraTreesRegressor, for that matter) selects max_features = n_features, i.e., plain bagging.

This is misleading unless the documentation is read carefully, in particular because the value differs for classification, which uses sqrt(n_features) and therefore yields a proper random forest rather than bagging. One may think they are training a random forest when they really aren't.

Furthermore, this is not the recommended value for regression. For example, Hastie et al., Section 15.3 (p. 587), recommend n_features / 3.
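
As a rough illustration of the difference, the sketch below fits two small forests on synthetic data and reads back how many features each tree considers per split. It deliberately avoids the string "auto", whose meaning depends on the installed scikit-learn version, and passes explicit values instead: `None` for all features (the bagging-like behaviour described above) and `n_features // 3` for the textbook suggestion. The dataset, `N_FEATURES`, and the helper `features_per_split` are assumptions made up for this example, not part of scikit-learn.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

N_FEATURES = 12  # illustrative choice, not taken from the issue

# Small synthetic regression problem, used only to have something to fit.
X, y = make_regression(
    n_samples=200, n_features=N_FEATURES, noise=0.1, random_state=0
)


def features_per_split(max_features):
    """Fit a small forest and return the number of features each tree
    considers at a split (hypothetical helper for this sketch)."""
    forest = RandomForestRegressor(
        n_estimators=5, max_features=max_features, random_state=0
    )
    forest.fit(X, y)
    # Each fitted tree exposes the resolved value as `max_features_`.
    return forest.estimators_[0].max_features_


# max_features=None means "use all features", i.e. bagged trees.
bagging_like = features_per_split(None)

# Explicit integer for the n_features / 3 suggestion (at least 1).
one_third = features_per_split(max(1, N_FEATURES // 3))

print("all features:", bagging_like)
print("n_features // 3:", one_third)
```

Passing an integer sidesteps any rounding questions that a fractional `max_features` might raise; whether one third is actually the better setting for a given dataset is something to check with cross-validation rather than assume.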
