Optimize _check_partial_fit_first_call for regressors #21060

Closed
@PSSF23

Description

Describe the bug

The docstring of _check_partial_fit_first_call states the following requirement:

Estimators that implement the ``partial_fit`` API need to be provided with
the list of possible classes at the first call to partial_fit.
Subsequent calls to partial_fit should check that ``classes`` is still
consistent with a previous value of ``clf.classes_`` when provided.

Regressors such as DecisionTreeRegressor have no classes_ attribute, yet they could benefit from a partial_fit method just as classifiers do. I believe adding an is_classifier check to _check_partial_fit_first_call would solve the problem.
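
A minimal sketch of what such a guard might look like. This is a simplified stand-in, not scikit-learn's actual code: check_first_call, is_classifier_like, ToyRegressor and ToyClassifier are hypothetical names, the _estimator_type lookup is an assumption about how the classifier check works, and returning False for regressors is one possible design choice among several.

```python
def is_classifier_like(estimator):
    # Hypothetical stand-in for sklearn.base.is_classifier.
    # Assumption: classifiers are identified by an _estimator_type attribute.
    return getattr(estimator, "_estimator_type", None) == "classifier"


def check_first_call(est, classes=None):
    """Simplified stand-in for the helper, with the proposed guard added."""
    if not is_classifier_like(est):
        # Proposed change: regressors have no classes_, so skip the check.
        # Nothing is initialised here, hence False (a design assumption).
        return False

    if getattr(est, "classes_", None) is None and classes is None:
        raise ValueError("classes must be passed on the first call to partial_fit.")

    if classes is not None:
        if getattr(est, "classes_", None) is None:
            # First call: remember the sorted unique labels.
            est.classes_ = sorted(set(classes))
            return True
        if list(est.classes_) != sorted(set(classes)):
            raise ValueError("classes differs from the value seen on the first call.")
    return False


class ToyRegressor:
    _estimator_type = "regressor"


class ToyClassifier:
    _estimator_type = "classifier"


reg = ToyRegressor()
first_reg = check_first_call(reg)  # no classes needed, no error

clf = ToyClassifier()
first_clf = check_first_call(clf, classes=[1, 0, 1])  # still required here
```

With this guard, classifiers keep the existing first-call requirement, while regressors pass through without a classes argument.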

Steps/Code to Reproduce

Run _check_partial_fit_first_call on a regressor with the partial_fit function implemented.
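
The reproduction can be sketched without a real estimator. The block below assumes a simplified stand-in for the first-call check (the real helper lives in sklearn.utils.multiclass) and a hypothetical IncrementalMeanRegressor that calls it from partial_fit. Neither name is part of scikit-learn.

```python
def check_first_call(est, classes=None):
    """Stand-in that mirrors only the first-call check described in this issue."""
    if getattr(est, "classes_", None) is None and classes is None:
        raise ValueError("classes must be passed on the first call to partial_fit.")
    if classes is not None and getattr(est, "classes_", None) is None:
        est.classes_ = sorted(set(classes))
        return True
    return False


class IncrementalMeanRegressor:
    """Hypothetical regressor that predicts the running mean of y."""

    def __init__(self):
        self.n_seen_ = 0
        self.mean_ = 0.0

    def partial_fit(self, X, y, classes=None):
        # The check runs before any state is updated.
        check_first_call(self, classes)
        for target in y:
            self.n_seen_ += 1
            self.mean_ += (target - self.mean_) / self.n_seen_
        return self


reg = IncrementalMeanRegressor()
try:
    reg.partial_fit([[0.0], [1.0]], [1.0, 3.0])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing an empty classes list sidesteps the check.
workaround = IncrementalMeanRegressor()
workaround.partial_fit([[0.0], [1.0]], [1.0, 3.0], classes=[])
```

In this sketch the first call fails before the regressor sees any data, while the second succeeds only because a meaningless classes argument is supplied.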

Expected Results

No error is raised when classes is not passed to partial_fit for regressors.

Actual Results

ValueError: classes must be passed on the first call to partial_fit.

I ran into this error when testing #18889 and had to pass an empty classes argument as a workaround.

Versions

System:
    python: 3.8.5 (default, Sep  4 2020, 02:22:02)  [Clang 10.0.0 ]
executable: /Users/pssf23/miniconda3/envs/ndd/bin/python
   machine: macOS-10.16-x86_64-i386-64bit

Python dependencies:
          pip: 20.3.3
   setuptools: 51.0.0.post20201207
      sklearn: 0.24.dev0
        numpy: 1.21.0
        scipy: 1.7.1
       Cython: 0.29.21
       pandas: 1.2.0
   matplotlib: 3.3.3
       joblib: 1.0.0
threadpoolctl: 2.1.0

Built with OpenMP: False
