HistGradientBoostingClassifier fails for early stopping on the training set with warm start #16661

@mfeurer

Describe the bug

I found a combination of arguments that makes HistGradientBoostingClassifier fail on the second of two consecutive calls to fit: warm_start=True together with early stopping on the training set (scoring='loss', validation_fraction=None).

Steps/Code to Reproduce

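# Explicitly enable the experimental estimator (required in scikit-learn 0.22).
from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401
import sklearn.datasets
import sklearn.ensemble

X, y = sklearn.datasets.load_wine(return_X_y=True)

gb = sklearn.ensemble.HistGradientBoostingClassifier(
    max_iter=1000,
    scoring='loss',
    warm_start=True,
    n_iter_no_change=1,
    validation_fraction=None,  # early stopping is evaluated on the training set
)

gb.fit(X, y)  # first fit succeeds
gb.fit(X, y)  # second, warm-started fit raises UnboundLocalError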

Expected Results

No error is thrown.

Actual Results

Traceback (most recent call last):
  File "/home/feurerm/sync_dir/projects/automl_competition_2015/auto-sklearn/test.py", line 16, in <module>
    gb.fit(X, y)
  File "/home/feurerm/miniconda/3-4.5.4/envs/autosklearn/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 356, in fit
    raw_predictions_val, y_val
UnboundLocalError: local variable 'raw_predictions_val' referenced before assignment
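
For readers unfamiliar with this error class, here is a minimal sketch of the general pattern that produces an UnboundLocalError like the one above. It is not the scikit-learn source: fit_sketch and its warm_started flag are hypothetical names, and the assumption that the warm-started path skips an initialisation branch is only a guess at the mechanism, not something confirmed by the traceback.

```python
def fit_sketch(warm_started):
    """Hypothetical stand-in for a fit method; NOT scikit-learn's code."""
    if not warm_started:
        # Fresh fit: the local name is bound inside this branch only.
        raw_predictions_val = None
    # Warm-started fit: the branch above is skipped, so the name is unbound.
    # Reading it unconditionally then raises UnboundLocalError.
    return raw_predictions_val


# A fresh call binds the name and returns normally.
first_result = fit_sketch(warm_started=False)

# A warm-started call reaches the read without ever binding the name.
try:
    fit_sketch(warm_started=True)
    raised = False
except UnboundLocalError:
    raised = True
```

If the real cause follows this pattern, binding the variable on every code path before it is read (for example, setting it to None up front) would avoid the error.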

Versions

System:
    python: 3.7.1 (default, Dec 14 2018, 19:28:38)  [GCC 7.3.0]
executable: /home/feurerm/miniconda/3-4.5.4/envs/autosklearn/bin/python3.7
   machine: Linux-4.15.0-88-generic-x86_64-with-debian-buster-sid

Python dependencies:
       pip: 10.0.1
setuptools: 39.2.0
   sklearn: 0.22.2
     numpy: 1.14.5
     scipy: 1.2.0
    Cython: 0.28.4
    pandas: 0.25.0
matplotlib: None
    joblib: 0.12.1

Built with OpenMP: True
