Behaviour of warm_start=True and max_iter (and n_estimators) #25522


Closed

glemaitre opened this issue Jan 31, 2023 · 2 comments

glemaitre (Member) commented Jan 31, 2023

This issue is an RFC to clarify the expected behaviour of max_iter and n_iter_ (or, equivalently, n_estimators and len(estimators_)) when used with warm_start=True.

Estimators to be considered

The estimators to be considered can be found in the following manner:

from inspect import signature
from sklearn.utils import all_estimators

type_filter = ["classifier", "regressor"]
for name, klass in all_estimators(type_filter=type_filter):
    params = signature(klass).parameters
    has_iter_param = any(p in params for p in ("max_iter", "n_estimators"))
    if has_iter_param and "warm_start" in params:
        print(name)

which gives:

BaggingClassifier
BaggingRegressor
ElasticNet
ExtraTreesClassifier
ExtraTreesRegressor
GammaRegressor
GradientBoostingClassifier
GradientBoostingRegressor
HistGradientBoostingClassifier
HistGradientBoostingRegressor
HuberRegressor
Lasso
LogisticRegression
MLPClassifier
MLPRegressor
MultiTaskElasticNet
MultiTaskLasso
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
PoissonRegressor
RandomForestClassifier
RandomForestRegressor
SGDClassifier
SGDRegressor
TweedieRegressor

Review the different behaviours

We will evaluate the behaviour by doing the following experiment:

  • set max_iter=2 (or n_estimators=2) and warm_start=True
  • fit the estimator and check n_iter_ (or len(estimators_))
  • set max_iter=3 (or n_estimators=3)
  • fit the estimator and check n_iter_ (or len(estimators_))

The idea is to check whether n_iter_ reports the total number of iterations across calls or just the number of iterations of the latest fit call.
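
All the snippets below assume two small datasets, X_reg, y_reg and X_clf, y_clf, which are not defined in the original snippets. A minimal setup (an assumption; any small dense dataset works, with targets shifted to be positive for the GLMs) could be:

from sklearn.datasets import make_classification, make_regression

X_reg, y_reg = make_regression(n_samples=100, n_features=5, random_state=0)
# Poisson/Gamma/Tweedie regressors require positive targets
y_reg = y_reg - y_reg.min() + 1
X_clf, y_clf = make_classification(n_samples=100, n_features=5, random_state=0)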

GLM estimators

from sklearn.linear_model import GammaRegressor, PoissonRegressor, TweedieRegressor

Estimators = [GammaRegressor, PoissonRegressor, TweedieRegressor]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

In this case, n_iter_ is reported to be 2 and then 3. The verbose output shows that the model actually performed 5 iterations in total.
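
A hedged check of this semantic (assuming the data sketched above and that neither call converges within its budget):

from sklearn.linear_model import PoissonRegressor

est = PoissonRegressor(warm_start=True, max_iter=2).fit(X_reg, y_reg)
assert est.n_iter_ == 2  # first call: 2 LBFGS iterations
est.set_params(max_iter=3).fit(X_reg, y_reg)
assert est.n_iter_ == 3  # latest call only: 3 more iterations, 5 in total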

Ensemble estimators

from sklearn.ensemble import (
    BaggingClassifier,
    BaggingRegressor,
    ExtraTreesClassifier,
    ExtraTreesRegressor,
    GradientBoostingClassifier,
    GradientBoostingRegressor,
    RandomForestClassifier,
    RandomForestRegressor,
)

Estimators = [
    BaggingRegressor,
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(n_estimators=3)
    estimator.fit(X_reg, y_reg)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")

Estimators = [
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, n_estimators=2).fit(X_clf, y_clf)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(n_estimators=3)
    estimator.fit(X_clf, y_clf)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")

In this case, len(estimators_) is reported as 2 and then 3. This differs from the GLMs: only 3 estimators are fitted in total, since warm_start keeps the 2 existing estimators and only grows the missing one.
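
This can be verified directly (a sketch using RandomForestRegressor and the data assumed above; estimators compare by object identity, so the list comparison checks that the same trees are reused):

from sklearn.ensemble import RandomForestRegressor

est = RandomForestRegressor(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
first_trees = list(est.estimators_)
est.set_params(n_estimators=3).fit(X_reg, y_reg)
assert est.estimators_[:2] == first_trees  # the first 2 trees are reused as-is
assert len(est.estimators_) == 3           # exactly 1 new tree was grown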

Similar behaviour for HistGradientBoosting:

from sklearn.ensemble import (
    HistGradientBoostingClassifier, HistGradientBoostingRegressor
)

Estimators = [HistGradientBoostingRegressor, HistGradientBoostingClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

Estimators using coordinate descent

from sklearn.linear_model import ElasticNet, Lasso

Estimators = [ElasticNet, Lasso]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

The behaviour is the same for MultiTaskElasticNet and MultiTaskLasso.

This is equivalent to the GLMs: the coordinate-descent path (_path) is called with self.max_iter without taking the previous n_iter_ into account, so the total number of iterations is 5 while n_iter_ reports 2 and then 3.
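
For users who want the cumulative count, a hedged workaround (a hypothetical pattern, not scikit-learn API) is to accumulate n_iter_ across the warm-started calls:

from sklearn.linear_model import ElasticNet

est = ElasticNet(warm_start=True)
total_iter = 0
for budget in (2, 3):
    est.set_params(max_iter=budget).fit(X_reg, y_reg)
    total_iter += est.n_iter_  # each call reports only its own passes
print(total_iter)  # 5 when neither call converges early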

MLP estimators

from sklearn.neural_network import MLPClassifier, MLPRegressor

Estimators = [MLPRegressor, MLPClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

n_iter_ is reported to be 2 and then 5. The fit behaviour is therefore consistent with the other linear models (3 new iterations are run), but the reported n_iter_ gives the global number of iterations across calls instead of the latest call's count.
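
A hedged check of the discrepancy (assuming no early stopping triggers within 5 epochs, which the default n_iter_no_change=10 prevents):

from sklearn.neural_network import MLPClassifier

est = MLPClassifier(warm_start=True, max_iter=2).fit(X_clf, y_clf)
assert est.n_iter_ == 2
est.set_params(max_iter=3).fit(X_clf, y_clf)
assert est.n_iter_ == 5  # 2 + 3: cumulative, unlike the GLM estimators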

SGD estimators

from sklearn.linear_model import SGDClassifier, SGDRegressor

Estimators = [SGDRegressor, SGDClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

n_iter_ is reported to be 2 and then 3, with 5 iterations performed in total, in line with the GLMs. Perceptron exposes the same behaviour.
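
The same hedged check as for the GLMs (tol=None is an added assumption so that each call runs exactly max_iter epochs):

from sklearn.linear_model import SGDClassifier

est = SGDClassifier(warm_start=True, max_iter=2, tol=None).fit(X_clf, y_clf)
assert est.n_iter_ == 2
est.set_params(max_iter=3).fit(X_clf, y_clf)
assert est.n_iter_ == 3  # latest call only; 5 epochs were run in total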

Other estimators

HuberRegressor behaves the same as the GLMs.

github-actions bot added the Needs Triage label Jan 31, 2023
glemaitre added the RFC label and removed the Needs Triage label Jan 31, 2023
glemaitre (Member, Author) commented:
By opening this PR, I convinced myself that we already do the right thing and that the right fix is to make the MLP estimators consistent when reporting n_iter_.

glemaitre (Member, Author) commented:

I am going to open a PR to clarify the behaviour a bit, at least in the glossary.
