FIX always expose best_loss_, validation_scores_, and best_validation_score #24683


Merged: 9 commits merged into scikit-learn:main on Dec 6, 2022

Conversation

glemaitre (Member, Author)

The first step towards #24411

MLPClassifier and MLPRegressor are inconsistent in not always exposing some attributes related to early stopping; as a result, those attributes do not appear in the documentation.

This PR sets them to None when they are not relevant and documents the values stored in those attributes.

In addition, validation_scores_ is changed from a Python list to a NumPy array, which is more consistent with other estimators.
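
For illustration, a minimal sketch of the intended behaviour (attribute semantics as described in this PR and in the review below; exact scores depend on the run):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(random_state=0)

# With early stopping, the validation-based attributes are populated,
# and best_loss_ is not relevant, so it is set to None.
clf = MLPClassifier(early_stopping=True, random_state=0).fit(X, y)
print(clf.best_validation_score_)  # best accuracy on the held-out validation set
print(clf.best_loss_)              # None

# Without early stopping, the opposite holds.
clf = MLPClassifier(early_stopping=False, random_state=0).fit(X, y)
print(clf.best_loss_)              # lowest training loss reached
print(clf.validation_scores_)      # None
```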

betatim (Member) commented on Oct 18, 2022

My only comment is about the English.

What I can't judge is whether storing a value in these attributes only when early stopping is used (or not used, respectively) makes sense. Given that the values aren't computed otherwise, I assume this was thought through in previous PRs.

How are type changes (Python list to NumPy array) handled in terms of deprecation?

glemaitre (Member, Author) replied

> What I can't judge is whether storing a value in these attributes only when early stopping is used (or not used, respectively) makes sense. Given that the values aren't computed otherwise, I assume this was thought through in previous PRs.

One issue is that we cannot document it otherwise. It also makes the API more consistent with HistGradientBoosting* early stopping, for instance.
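
For reference, a minimal sketch of the HistGradientBoosting* behaviour being referred to (assuming a scikit-learn version where early stopping records per-iteration scores in validation_score_):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import HistGradientBoostingRegressor

X, y = make_regression(random_state=42)

hgb = HistGradientBoostingRegressor(early_stopping=True, validation_fraction=0.1)
hgb.fit(X, y)

# One validation score per boosting iteration, stored as a NumPy array.
print(type(hgb.validation_score_))  # <class 'numpy.ndarray'>
```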

> How are type changes (Python list to NumPy array) handled in terms of deprecation?

I am not sure we need to handle it. An array behaves much like a Python list; an isinstance check would be problematic, but I don't know whether users actually do that.
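
To illustrate the point, a small sketch of hypothetical user code that would be affected by the type change (not code from this PR):

```python
import numpy as np

scores = np.asarray([0.1, 0.2, 0.3])  # validation_scores_ as an array

# len(), indexing, and iteration behave as they would for a list ...
assert len(scores) == 3 and scores[-1] == 0.3

# ... but an isinstance check against list no longer holds.
assert not isinstance(scores, list)
```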

haiatn (Contributor) commented on Oct 21, 2022

Does this need a "Waiting for Second Reviewer" tag?

thomasjpfan (Member) left a comment

Thank you for the PR!

> …`validation_scores_`, and `best_validation_score_`. `best_loss_` is set to `None` when `early_stopping=True`, while `validation_scores_` and `best_validation_score_` are set to `None` when `early_stopping=False`. `validation_scores_` also changed from a Python list to a NumPy array.
thomasjpfan (Member): I do not think we can make validation_scores_ an ndarray, because it breaks warm_start:

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(random_state=42)
mlp = MLPRegressor(max_iter=10, random_state=0, warm_start=True, early_stopping=True)

mlp.fit(X, y)
mlp.set_params(max_iter=20)

# Fails with this PR: the warm-started fit tries to append new scores to
# validation_scores_, which is now an ndarray rather than a list.
mlp.fit(X, y)
```

glemaitre (Member, Author): Good catch. I will revert this and add a non-regression test as well.

thomasjpfan (Member) left a comment: Minor comment on testing, otherwise LGTM.

Comment on lines 903 to 905:

```python
mlp.fit(X_iris, y_iris)
mlp.set_params(max_iter=20)
mlp.fit(X_iris, y_iris)
```
thomasjpfan (Member): Can we check the length of validation_scores_ to make sure it is updated?
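
For instance, a minimal sketch of such a check (hypothetical test code extending the snippet above, with mlp, X_iris, and y_iris as defined there; exact counts depend on how many iterations each fit runs):

```python
mlp.fit(X_iris, y_iris)
n_scores_first_fit = len(mlp.validation_scores_)

mlp.set_params(max_iter=20)
mlp.fit(X_iris, y_iris)

# The warm-started fit should append one score per additional iteration.
assert len(mlp.validation_scores_) > n_scores_first_fit
```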

glemaitre (Member, Author): I think there is an underlying bug: with max_iter=20, we end up with n_iter_ == 30 for the regressor. I need to investigate, but that could be fixed in another PR.

Member: Isn't this because there were ten (original) + twenty (warm start) iterations?

glemaitre (Member, Author): Not sure yet; I have to look into the details. I have trouble understanding the semantics here.

Having n_iter_ > max_iter is surprising. Calling fit should mean restarting the full process from scratch, just not from a random initialization.

The behaviour above is what I would expect from partial_fit, because in that case I start from a trained model and learn incrementally.
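
To make the surprise concrete, a minimal sketch of the behaviour under discussion (iteration counts depend on convergence and early stopping; the value 30 is the one reported above):

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(random_state=42)
mlp = MLPRegressor(max_iter=10, random_state=0, warm_start=True, early_stopping=True)

mlp.fit(X, y)
print(mlp.n_iter_)  # at most 10 after the first fit

mlp.set_params(max_iter=20)
mlp.fit(X, y)
print(mlp.n_iter_)  # reported as 30 in this thread: the counter keeps accumulating
```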

Member: My expectation came from GradientBoostingClassifier's warm_start. It doesn't have a partial_fit(), so calling fit() is the only game in town.

Member: So maybe, at the very least, there is room for improving the consistency?

glemaitre (Member, Author): We should just make it clear what to expect, but we can defer that to the issue I opened: #24764

glemaitre added the To backport label on Dec 5, 2022
glemaitre added this to the 1.2 milestone on Dec 5, 2022
jjerphan (Member) left a comment: LGTM. Thank you, @glemaitre.

I just have one minor, duplicated remark.

jjerphan enabled auto-merge (squash) on December 6, 2022, 09:03
jjerphan merged commit 1f9dc71 into scikit-learn:main on Dec 6, 2022
jeremiedbb pushed a commit to jeremiedbb/scikit-learn referencing this pull request on Dec 6, 2022:

FIX always expose best_loss_, validation_scores_, and best_validation_score (scikit-learn#24683)

Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Labels: module:neural_network, To backport