Broken estimator_ attribute on some ensemble models #25588

@BenjaminBossan

Describe the bug

Several ensemble models raise an AttributeError when accessing the estimator_ attribute after fitting, even though the attribute is defined on them.

The problem is that this property tries to access self._estimator, which is set by sklearn.ensemble.BaseEnsemble._validate_estimator, but that method is not called by all subclasses.

def _validate_estimator(self, default=None):
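To make the mechanism concrete, here is a minimal sketch with toy classes. The names ToyBaseEnsemble, ToyBagging, ToyBoosting and probe are hypothetical and this is not the scikit-learn source; it only assumes the pattern described above, where a property reads a private attribute that is set by a validation method some subclasses never call.

```python
class ToyBaseEnsemble:
    """Hypothetical stand-in for a base ensemble class; not sklearn code."""

    def __init__(self, estimator=None):
        self.estimator = estimator

    def _validate_estimator(self, default=None):
        # Sets the private attribute that the property below reads.
        self._estimator = self.estimator if self.estimator is not None else default

    @property
    def estimator_(self):
        # Fails with AttributeError if _validate_estimator was never called.
        return self._estimator


class ToyBagging(ToyBaseEnsemble):
    def fit(self, X, y):
        self._validate_estimator(default="default-tree")
        return self


class ToyBoosting(ToyBaseEnsemble):
    def fit(self, X, y):
        # Never calls _validate_estimator, so _estimator is never set.
        return self


def probe(model):
    """Return the name of the exception raised by .estimator_, or None."""
    try:
        model.estimator_
    except Exception as exc:
        return type(exc).__name__
    return None


X, y = [[1], [2]], [0, 1]
print(probe(ToyBagging().fit(X, y)))
print(probe(ToyBoosting().fit(X, y)))
```

In this sketch the subclass that skips the validation step is the one whose property access fails, which mirrors the split seen in the reproduction script.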

For VotingClassifier and VotingRegressor, it's understandable IMO, but the error message could be better. For gradient boosting, estimator_ could return something useful.
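As an illustration of what a better error message might look like, the sketch below wraps the lookup and re-raises with an explanation. ToyVoting and the message wording are hypothetical, chosen only for this example; this is not a proposal for the actual scikit-learn implementation.

```python
class ToyVoting:
    """Hypothetical ensemble without a single base estimator; not sklearn code."""

    def __init__(self, estimators):
        self.estimators = estimators

    def fit(self, X, y):
        # Deliberately does not set self._estimator.
        return self

    @property
    def estimator_(self):
        try:
            return self._estimator
        except AttributeError:
            # Replace the confusing "no attribute '_estimator'" message
            # with one that names the public attribute and explains why.
            raise AttributeError(
                f"{type(self).__name__} has no attribute 'estimator_': "
                "it has no single base estimator, see `estimators` instead."
            ) from None


model = ToyVoting([("a", object()), ("b", object())]).fit([[1], [2]], [0, 1])
try:
    model.estimator_
except AttributeError as exc:
    print(exc)
```

Raising AttributeError (rather than another exception type) keeps hasattr(model, "estimator_") returning False, so existing duck-typing checks would behave the same way.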

More as a reminder to myself: _validate_estimator is being rewritten in #24250 to return the estimator instead of setting it in place.

Steps/Code to Reproduce

import sklearn.ensemble
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

estimators = [
    sklearn.ensemble.AdaBoostClassifier(),
    sklearn.ensemble.AdaBoostRegressor(),
    sklearn.ensemble.BaggingClassifier(),
    sklearn.ensemble.BaggingRegressor(),
    sklearn.ensemble.ExtraTreesClassifier(),
    sklearn.ensemble.ExtraTreesRegressor(),
    sklearn.ensemble.GradientBoostingClassifier(),
    sklearn.ensemble.GradientBoostingRegressor(),
    sklearn.ensemble.HistGradientBoostingClassifier(),
    sklearn.ensemble.HistGradientBoostingRegressor(),
    sklearn.ensemble.RandomForestClassifier(),
    sklearn.ensemble.RandomForestRegressor(),
    sklearn.ensemble.VotingClassifier([('5', KNeighborsClassifier(5)), ('10', KNeighborsClassifier(10))]),
    sklearn.ensemble.VotingRegressor([('5', KNeighborsRegressor(5)), ('10', KNeighborsRegressor(10))]),
]

X, y = [[1], [2]], [0, 1]

msg = "Got {} error when trying to access .estimator_ in {}"
for estimator in estimators:
    estimator.fit(X, y)
    try:
        estimator.estimator_
    except Exception as e:
        print(msg.format(e.__class__.__name__, estimator.__class__.__name__))

Expected Results

No error is printed.

Actual Results

Got AttributeError error when trying to access .estimator_ in GradientBoostingClassifier
Got AttributeError error when trying to access .estimator_ in GradientBoostingRegressor
Got AttributeError error when trying to access .estimator_ in HistGradientBoostingClassifier
Got AttributeError error when trying to access .estimator_ in HistGradientBoostingRegressor
Got AttributeError error when trying to access .estimator_ in VotingClassifier
Got AttributeError error when trying to access .estimator_ in VotingRegressor

Versions

System:
    python: 3.10.9 | packaged by conda-forge | (main, Feb  2 2023, 20:20:04) [GCC 11.3.0]
executable: /home/name/anaconda3/envs/skops/bin/python
   machine: Linux-5.15.0-60-generic-x86_64-with-glibc2.35

Python dependencies:
      sklearn: 1.2.0
          pip: 22.3.1
   setuptools: 65.5.1
        numpy: 1.23.5
        scipy: 1.9.3
       Cython: None
       pandas: 1.5.3
   matplotlib: 3.6.3
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/name/anaconda3/envs/skops/lib/libopenblasp-r0.3.21.so
        version: 0.3.21
threading_layer: pthreads
   architecture: Haswell
    num_threads: 8

       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /home/name/anaconda3/envs/skops/lib/libgomp.so.1.0.0
        version: None
    num_threads: 8
