FIX always expose best_loss_, validation_scores_, and best_validation_score #24683
Conversation
My only comment is about English. What I can't judge is whether the idea of only storing a value in the attributes depending on whether early stopping is used makes sense. Given that the values aren't calculated, I assume this was thought about in previous PRs. How are type changes (Python list to NumPy array) handled in terms of deprecation?
One issue is that we cannot document it otherwise. It makes the API more consistent with
I am not sure we need to handle it. An array would behave like a Python list.
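To illustrate that point, a quick sketch (with made-up score values): for read-only access a NumPy array behaves much like a Python list, so the type change is largely transparent; only in-place mutation differs.

```python
import numpy as np

# Hypothetical validation scores, as an MLP might record them.
scores_list = [0.8, 0.85, 0.9]
scores_arr = np.asarray(scores_list)

# Read-only access patterns are identical for list and ndarray:
assert len(scores_arr) == len(scores_list)
assert scores_arr[-1] == scores_list[-1]
assert max(scores_arr) == max(scores_list)

# Mutation differs: lists grow in place, ndarrays have no append().
scores_list.append(0.95)
assert not hasattr(scores_arr, "append")
```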
Co-authored-by: Tim Head <betatim@gmail.com>
Does this need a
Thank you for the PR!
doc/whats_new/v1.2.rst (outdated):

    `validation_scores_`, and `best_validation_score_`. `best_loss_` is set to
    `None` when `early_stopping=True`, while `validation_scores_` and
    `best_validation_score_` are set to `None` when `early_stopping=False`.
    `validation_scores_` also changed from a Python list to a NumPy array.
I do not think we can make `validation_scores_` an ndarray because it breaks `warm_start`:

```python
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression

X, y = make_regression(random_state=42)
mlp = MLPRegressor(max_iter=10, random_state=0, warm_start=True, early_stopping=True)
mlp.fit(X, y)
mlp.set_params(max_iter=20)
# Fails with this PR
mlp.fit(X, y)
```
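A minimal sketch of why the conversion breaks `warm_start` (simplified, not scikit-learn's actual code), assuming the estimator appends new scores to the stored container on a warm-started fit:

```python
import numpy as np

# First fit: scores are collected in a Python list during training.
validation_scores = [0.70, 0.75]

# If fit() converted the list to an ndarray before storing it...
validation_scores = np.asarray(validation_scores)

# ...a warm-started second fit that appends new scores would fail:
try:
    validation_scores.append(0.80)
except AttributeError as exc:
    print(f"warm_start breaks: {exc}")
```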
Good catch. I will revert this and add a non-regression test as well.
Minor comment on testing, otherwise LGTM.
```python
mlp.fit(X_iris, y_iris)
mlp.set_params(max_iter=20)
mlp.fit(X_iris, y_iris)
```
Can we check the length of `validation_scores_` to make sure it is updated?
I think that there is an underlying bug: with `max_iter=20`, we end up with `n_iter_ == 30` for the regressor. I need to investigate, but it could be fixed in another PR.
Isn't this because there were ten (original) + twenty (warm start) iterations?
Not sure yet. I have to go into the details; I have trouble understanding the semantics there. Having `n_iter_ > max_iter` is surprising. Calling `fit` should just mean that you restart the full process from scratch, only not from a random init. The above behaviour would be what I expect from `partial_fit`, because in that case I am starting from a trained model and incrementally learning.
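The two interpretations can be contrasted with a small sketch (a hypothetical helper, not scikit-learn code):

```python
def reported_n_iter(prev_n_iter, max_iter, cumulative):
    """Return n_iter_ after a warm-started fit under each interpretation."""
    if cumulative:
        # partial_fit-like semantics: iterations accumulate across fits
        return prev_n_iter + max_iter
    # restart-from-previous-weights semantics: max_iter caps the whole call
    return max_iter

# First fit with max_iter=10, then a warm-started fit with max_iter=20:
assert reported_n_iter(10, 20, cumulative=True) == 30   # the observed n_iter_
assert reported_n_iter(10, 20, cumulative=False) == 20  # what one might expect
```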
My expectation came from `GradientBoostingClassifier`'s `warm_start`. It doesn't have a `partial_fit()`, so calling `fit()` is the only game in town.
So maybe at the very least there is room for improving the consistency?
We should just make it clear what to expect. But we can defer that to the issue that I opened: #24764
LGTM. Thank you, @glemaitre.
I just have one minor remark, which duplicates one already made above.
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
The first step towards #24411.

`MLPClassifier` and `MLPRegressor` are inconsistent in not exposing some attributes related to early stopping; as a result, those attributes do not appear in the documentation. This PR sets them to `None` when they are not relevant and documents the values stored in those arrays.

In addition, `validation_scores_` is changed from a Python list to a NumPy array, which is more consistent with other estimators.
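A short sketch of the behaviour described in this PR (the `None` semantics apply only to scikit-learn versions that include this change):

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(random_state=42)

# With early stopping, validation scores are tracked per iteration,
# and best_loss_ is set to None (it is not the selection criterion).
mlp = MLPRegressor(max_iter=10, early_stopping=True, random_state=0)
mlp.fit(X, y)
print(mlp.best_validation_score_)
print(len(mlp.validation_scores_))

# Without early stopping, the validation attributes are None instead,
# and best_loss_ holds the minimum training loss.
mlp = MLPRegressor(max_iter=10, early_stopping=False, random_state=0)
mlp.fit(X, y)
print(mlp.best_loss_)
```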