Skip to content

Add standard-deviation output to sklearn.ensemble.RandomForestRegressor #23257

Closed
@sebastianpinedaar

Description

@sebastianpinedaar

Describe the workflow you want to enable

I think it would be awesome if the RF regressor returns the standard deviation (not only the mean) of the output of the different trees.

Describe your proposed solution

This is not a definite option, but contains the core idea:

from sklearn.ensemble import RandomForestRegressor
import numpy as np

class RandomForestWithUncertainty:

def __init__(self,**args):

    self.model = RandomForestRegressor(**args)

def predict(self, X, return_std = False):

    pred = np.array([tree.predict(X) for tree in self.model]).T
    pred_mean = np.mean(pred, axis=1)

    if return_std:

      pred_var = (pred - pred_mean.reshape(-1,1))**2
      pred_std = np.sqrt(np.mean(pred_var, axis=1))
      return pred_mean, pred_std

    else:
      return pred_mean

Describe alternatives you've considered, if relevant

No response

Additional context

This is useful if we want to use the RF in algorithms that require uncertainties estimatios, such as Bayesian Optimization.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions