check_estimator is stricter than what is stated in the Estimator API doc #16241

Open
@ogrisel

Description

This issue was raised in a discussion regarding the LightGBM scikit-learn compatible estimators: microsoft/LightGBM#2628 (comment)

The problem is that check_estimator complains about private attributes set in the __init__ of a scikit-learn estimator, whereas our documentation only states the following and does not explicitly prohibit setting private attributes in __init__:

The arguments accepted by __init__ should all be keyword arguments with a default value. In other words, a user should be able to instantiate an estimator without passing any arguments to it. The arguments should all correspond to hyperparameters describing the model or the optimisation problem the estimator tries to solve. These initial arguments (or parameters) are always remembered by the estimator. Also note that they should not be documented under the “Attributes” section, but rather under the “Parameters” section for that estimator.

In addition, every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.

from: https://scikit-learn.org/0.22/developers/develop.html#instantiation

The strict check is defined in sklearn.utils.estimator_checks.check_no_attributes_set_in_init.
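
To make the mismatch concrete, here is a minimal standard-library sketch of the kind of comparison such a check performs. It is a simplified approximation, not the actual scikit-learn code: `PrivateAttrEstimator` and `extra_attributes_set_in_init` are hypothetical names used only for illustration, and the real check also accounts for parameters of parent classes.

```python
import inspect


class PrivateAttrEstimator:
    """Hypothetical estimator that sets a private attribute in __init__."""

    def __init__(self, alpha=1.0, max_iter=100):
        # These two follow the documented contract: one attribute per parameter.
        self.alpha = alpha
        self.max_iter = max_iter
        # Private attribute that is not an __init__ parameter. The documentation
        # quoted above does not explicitly forbid this.
        self._cache = None


def extra_attributes_set_in_init(estimator):
    """Return instance attributes that do not match an __init__ parameter.

    Simplified approximation of the strict check (an assumption, not the
    library implementation): anything set on the instance right after
    construction that is not an __init__ argument is reported.
    """
    signature = inspect.signature(type(estimator).__init__)
    init_params = {name for name in signature.parameters if name != "self"}
    return set(vars(estimator)) - init_params


extra = extra_attributes_set_in_init(PrivateAttrEstimator())
print(sorted(extra))  # ['_cache']
```

Under this sketch the estimator satisfies everything in the quoted documentation (keyword arguments with defaults, each stored as an attribute), yet `_cache` is still reported, which is the stricter behavior described in this issue.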
