Behaviour of warm_start=True and max_iter (and n_estimators) #25522


Closed

glemaitre opened this issue Jan 31, 2023 · 2 comments

glemaitre (Member) commented Jan 31, 2023

This issue is an RFC to clarify the expected behaviour of max_iter and n_iter_ (or, equivalently, n_estimators and len(estimators_)) when used with warm_start=True.

Estimators to be considered

The estimators to be considered can be found in the following manner:

from inspect import signature
from sklearn.utils import all_estimators

type_filter = ["classifier", "regressor"]
for name, klass in all_estimators(type_filter=type_filter):
    params = signature(klass).parameters
    has_iter_param = any(p in params for p in ("max_iter", "n_estimators"))
    if has_iter_param and "warm_start" in params:
        print(name)

which gives:

BaggingClassifier
BaggingRegressor
ElasticNet
ExtraTreesClassifier
ExtraTreesRegressor
GammaRegressor
GradientBoostingClassifier
GradientBoostingRegressor
HistGradientBoostingClassifier
HistGradientBoostingRegressor
HuberRegressor
Lasso
LogisticRegression
MLPClassifier
MLPRegressor
MultiTaskElasticNet
MultiTaskLasso
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
PoissonRegressor
RandomForestClassifier
RandomForestRegressor
SGDClassifier
SGDRegressor
TweedieRegressor

Review the different behaviours

We will evaluate the behaviour by doing the following experiment:

  • set max_iter=2 (or n_estimators=2) and warm_start=True
  • fit the estimator and check n_iter_ (or len(estimators_))
  • set max_iter=3 (or n_estimators=3)
  • fit the estimator and check n_iter_ (or len(estimators_))

The idea is to check whether n_iter_ reports the total number of iterations across calls or just the number of iterations of the latest fit call.
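
All the snippets below assume two small datasets, X_reg, y_reg and X_clf, y_clf, which are not defined in the original snippets. A minimal setup (an assumption; any small dense dataset works, with targets shifted to be positive for the GLMs) could be:

from sklearn.datasets import make_classification, make_regression

X_reg, y_reg = make_regression(n_samples=100, n_features=5, random_state=0)
# Poisson/Gamma/Tweedie regressors require positive targets
y_reg = y_reg - y_reg.min() + 1
X_clf, y_clf = make_classification(n_samples=100, n_features=5, random_state=0)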

GLM estimators

from sklearn.linear_model import GammaRegressor, PoissonRegressor, TweedieRegressor

Estimators = [GammaRegressor, PoissonRegressor, TweedieRegressor]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

In this case, n_iter_ is reported to be 2 and then 3. The verbose output shows that the model actually performed 5 iterations in total.
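
A hedged check of this semantic (assuming the data sketched above and that neither call converges within its budget):

from sklearn.linear_model import PoissonRegressor

est = PoissonRegressor(warm_start=True, max_iter=2).fit(X_reg, y_reg)
assert est.n_iter_ == 2  # first call: 2 LBFGS iterations
est.set_params(max_iter=3).fit(X_reg, y_reg)
assert est.n_iter_ == 3  # latest call only: 3 more iterations, 5 in total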

Ensemble estimators

from sklearn.ensemble import (
    BaggingClassifier,
    BaggingRegressor,
    ExtraTreesClassifier,
    ExtraTreesRegressor,
    GradientBoostingClassifier,
    GradientBoostingRegressor,
    RandomForestClassifier,
    RandomForestRegressor,
)

Estimators = [
    BaggingRegressor,
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(n_estimators=3)
    estimator.fit(X_reg, y_reg)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")

Estimators = [
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, n_estimators=2).fit(X_clf, y_clf)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(n_estimators=3)
    estimator.fit(X_clf, y_clf)
    print(f"{len(estimator.estimators_)=}")
    print("---------------------------------------------------------------------------")

In this case, len(estimators_) is reported as 2 and then 3. This differs from the GLMs: only 3 estimators are fitted in total, since warm_start keeps the 2 existing estimators and only grows the missing one.
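
This can be verified directly (a sketch using RandomForestRegressor and the data assumed above; estimators compare by object identity, so the list comparison checks that the same trees are reused):

from sklearn.ensemble import RandomForestRegressor

est = RandomForestRegressor(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
first_trees = list(est.estimators_)
est.set_params(n_estimators=3).fit(X_reg, y_reg)
assert est.estimators_[:2] == first_trees  # the first 2 trees are reused as-is
assert len(est.estimators_) == 3           # exactly 1 new tree was grown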

Similar behaviour for HistGradientBoosting:

from sklearn.ensemble import (
    HistGradientBoostingClassifier, HistGradientBoostingRegressor
)

Estimators = [HistGradientBoostingRegressor, HistGradientBoostingClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

Estimators using coordinate descent

from sklearn.linear_model import ElasticNet, Lasso

Estimators = [ElasticNet, Lasso]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

The behaviour is the same for MultiTaskElasticNet and MultiTaskLasso.

This is equivalent to the GLMs: the coordinate-descent path (_path) is called with self.max_iter without taking the previous n_iter_ into account, so the total number of iterations is 5 while n_iter_ reports 2 and then 3.
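
For users who want the cumulative count, a hedged workaround (a hypothetical pattern, not scikit-learn API) is to accumulate n_iter_ across the warm-started calls:

from sklearn.linear_model import ElasticNet

est = ElasticNet(warm_start=True)
total_iter = 0
for budget in (2, 3):
    est.set_params(max_iter=budget).fit(X_reg, y_reg)
    total_iter += est.n_iter_  # each call reports only its own passes
print(total_iter)  # 5 when neither call converges early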

MLP estimators

from sklearn.neural_network import MLPClassifier, MLPRegressor

Estimators = [MLPRegressor, MLPClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

n_iter_ is reported to be 2 and then 5. The fit behaviour is therefore consistent with the other linear models (3 new iterations are run), but the reported n_iter_ gives the global number of iterations across calls instead of the latest call's count.
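
A hedged check of the discrepancy (assuming no early stopping triggers within 5 epochs, which the default n_iter_no_change=10 prevents):

from sklearn.neural_network import MLPClassifier

est = MLPClassifier(warm_start=True, max_iter=2).fit(X_clf, y_clf)
assert est.n_iter_ == 2
est.set_params(max_iter=3).fit(X_clf, y_clf)
assert est.n_iter_ == 5  # 2 + 3: cumulative, unlike the GLM estimators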

SGD estimators

from sklearn.linear_model import SGDClassifier, SGDRegressor

Estimators = [SGDRegressor, SGDClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")

n_iter_ is reported to be 2 and then 3, with 5 iterations performed in total, in line with the GLMs. Perceptron exposes the same behaviour.
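
The same hedged check as for the GLMs (tol=None is an added assumption so that each call runs exactly max_iter epochs):

from sklearn.linear_model import SGDClassifier

est = SGDClassifier(warm_start=True, max_iter=2, tol=None).fit(X_clf, y_clf)
assert est.n_iter_ == 2
est.set_params(max_iter=3).fit(X_clf, y_clf)
assert est.n_iter_ == 3  # latest call only; 5 epochs were run in total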

Other estimators

HuberRegressor behaves the same as the GLMs.

github-actions bot added the Needs Triage label Jan 31, 2023
glemaitre added the RFC label and removed the Needs Triage label Jan 31, 2023
glemaitre (Member, Author) commented:
By opening this PR, I convinced myself that we already do the right thing and that the right fix is to make the MLP estimators consistent when reporting n_iter_.

glemaitre (Member, Author) commented:

I am going to open a PR to clarify the behaviour a bit, at least in the glossary.
