FEA Turn on early stopping in histogram GBDT by default #14516


Merged
merged 40 commits into from
Feb 12, 2020

Conversation

johannfaouzi
Contributor

Reference Issues/PRs

Fixes #14503.

What does this implement/fix? Explain your changes.

This PR enables early stopping by default in HGBM:

  • n_iter_no_change=10 instead of n_iter_no_change=None (not sure about this value; I looked at #14303, "Should we turn on early stopping in HistGradientBoosting by default?")
  • The docstrings have been changed accordingly
  • The following sentence has been added to the docstrings (suggestions welcome, I lacked inspiration): Early stopping is the default behavior, as it usually makes the fitting process much faster without a substantial difference in terms of predictive performance.
  • The random_state has to be fixed in the examples (for the cross-validation), because the performance (clf.score(X, y)) depends on the splitting.
  • n_iter_no_change=None has been added to some existing tests, so that they can check the expected behavior.
  • A small test has been added that checks that early stopping is enabled by default

@johannfaouzi
Contributor Author

johannfaouzi commented Jul 30, 2019

It looks like HistGradientBoostingClassifier does not pass test_estimators...

Most common errors:

  • ValueError: The test_size = 1 should be greater or equal to the number of classes = 3
  • ValueError: The test_size = 2 should be greater or equal to the number of classes = 3

This error also occurs one time in test_estimators[HistGradientBoostingClassifier-check_classifiers_classes]:

  • TypeError: '<' not supported between instances of 'str' and 'float'

@johannfaouzi johannfaouzi changed the title [MRG]: Turn on early stopping in HGBM by default [WiP] Turn on early stopping in HGBM by default Jul 30, 2019
Member

@NicolasHug NicolasHug left a comment

Thanks for giving this a shot @johannfaouzi.

Please add a simple parametrized test that makes sure ES is enabled by default.

Regarding the estimator checks failures, I think the simplest way to go for now is to update set_checking_parameters in utils/estimator_checks.py and deactivate early stopping here.

@johannfaouzi
Contributor Author

johannfaouzi commented Jul 31, 2019 via email

NicolasHug
NicolasHug previously approved these changes Jul 31, 2019
Member

@NicolasHug NicolasHug left a comment

Nitpick, but LGTM.
Thanks @johannfaouzi !!

So the new behavior is that early stopping is enabled with n_iter_no_change=10, validation_fraction is .1 and the scorer is the estimator's default scorer. I'll make sure to document this in the user guide in #14525 .

@NicolasHug
Member

@johannfaouzi , we're planning to add support for sample weights. It's a non-trivial change, since it would require changes pretty much everywhere in the GBDT code, including (and mostly) in Cython.

We'd be happy to provide guidance if that's something you'd like to work on? I'd definitely tag this as a hard PR, but it could be a great learning experience ;). LMK.

@NicolasHug NicolasHug changed the title [WiP] Turn on early stopping in HGBM by default [MRG] Turn on early stopping in histogram GBDT by default Jul 31, 2019
@NicolasHug
Member

@adrinjalali might want to review this?

@johannfaouzi
Contributor Author

@johannfaouzi , we're planning to add support for sample weights. It's a non-trivial change, since it would require changes pretty much everywhere in the GBDT code, including (and mostly) in Cython.

We'd be happy to provide guidance if that's something you'd like to work on? I'd definitely tag this as a hard PR, but it could be a great learning experience ;). LMK.

Thanks for your proposal! Unfortunately I think that it would be a bit too time-consuming for me, as I have never written a single line of Cython (I am a big fan of numba, though). I saw that you are working on a pure-numba implementation of GBM in pygbm with Olivier. Can't wait for numba to be added as a dependency of scikit-learn, although it might be a while before it happens ;)

@NicolasHug
Member

No problem!

Can't wait for numba to be added as a dependency of scikit-learn, although it might be a while before it happens ;)

Ha, I don't see that happening anytime soon. When we were developing pygbm I found numba to be fairly unstable compared to Cython. And as far as I know they still don't support compilation caching, which is mandatory for us (pygbm takes about 10 seconds to compile every single time you run it). And I'm not even talking about the fact that we don't like adding dependencies ^^

@johannfaouzi
Contributor Author

And as far as I know they still don't support compilation caching which is mandatory for us (pygbm takes about 10 seconds to compile every single time you run it).

There has been a cache option for a while, so it's probably not what you mean (but I clearly lack the low-level knowledge to understand the difference).

And I'm not even talking about the fact that we don't like adding dependencies ^^

That's what I thought :)

@amueller
Member

@NicolasHug scipy is going to add numba as a dependency soon(ish), so sklearn will have it by default... at least if Ralf gets his way ;)

(HistGradientBoostingRegressor(random_state=0), 'recursion')]
(HistGradientBoostingRegressor(random_state=0, n_iter_no_change=None),
'brute'),
(HistGradientBoostingRegressor(random_state=0, n_iter_no_change=None),
Member

why would the test fail with early stopping?

Member

@NicolasHug NicolasHug Jul 31, 2019

there are very small training sets where early stopping leads to bad results

Member

the dataset has 100 samples (the default value for our make_regression), right? I kinda feel like the default parameters should give reasonable results with the defaults of our data generators. Would it be too hard to have an auto early stopping parameter that figures out whether it's a good idea or not?

I really miss having a nice and more dynamic early stopping parameter, like a policy, a function, or an object.

Member

I think I remember a dataset with 40 samples. Not sure.

Would that be too hard to have an auto early stopping parameter to figure if it's a good idea or not?

I don't think that's the place for this discussion.

See also the latest comments on #11324

Contributor Author

The tests fail with early stopping; otherwise I would not have changed this (I didn't know about the inspection module before). I can undo the changes locally to see exactly where it fails if you are interested.

Member

@NicolasHug I think this is a relevant place to discuss this here, because clearly we're changing our defaults in a way that makes the defaults break on a lot of our examples. That kind of brings into question whether that's actually an improvement. Having to change the doctests to adhere to the lower accuracies is not a good sign either.

Member

In other words, I think this has less to do with how to test, and more with what good default settings are.

Member

Ok, I meant to say that if we want to have a discussion around reworking our early-stopping mechanism, we should probably open an issue for it.


I'm not particularly uncomfortable with estimators not passing our common tests with the default values. I don't even think we can achieve that.

Take the first failing test, check_estimators_dtypes. This one has a dataset with n_samples=20 and 3 classes. The default for validation_fraction is .1, which leads to test_size=2, so train_test_split fails with: The test_size = 2 should be greater or equal to the number of classes = 3

What is the right course of action here? I don't think it's worth changing the default, or having a "clever" 'auto' value, just for this test to pass. It's not worth changing the test either; this would be another form of the VW thing that you mentioned before. The test isn't even remotely related to accuracy/overfitting (so it's not related to early stopping either: there's no harm in disabling it)
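The failure described above can be reproduced directly with train_test_split (the toy X/y below are illustrative, shaped like the check_estimators_dtypes data):

```python
from math import ceil

import numpy as np
from sklearn.model_selection import train_test_split

# 20 samples at validation_fraction=0.1 yields a 2-sample validation set,
# which a stratified split cannot fill with 3 classes.
n_samples, validation_fraction = 20, 0.1
print(ceil(n_samples * validation_fraction))  # 2

X = np.zeros((n_samples, 2))
y = np.array([0, 1, 2] * 6 + [0, 1])  # 3 classes

msg = ""
try:
    train_test_split(X, y, test_size=validation_fraction, stratify=y,
                     random_state=0)
except ValueError as exc:
    msg = str(exc)
print(msg)
```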

@adrinjalali
Member

Otherwise LGTM

@johannfaouzi
Contributor Author

As mentioned in the docstrings, HistGBM is not really suited for small datasets. However, the tests use relatively small datasets (it makes them faster). Moreover, on small datasets with early stopping, the validation set is very small and makes the variance of the performance evaluation quite high.

Maybe we should add a new parameter early_stopping:

early_stopping : 'auto' or bool (default='auto')
    If 'auto', early stopping is enabled if the sample size is larger than 1000.
    If True, early stopping is enabled. If False, early stopping is disabled.

1000 is just a placeholder, I don't know what a good value could be.

n_iter_no_change would always be an integer and would be ignored if there is no early stopping.
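A hypothetical sketch of how such an 'auto' value could be resolved at fit time (the function name is illustrative, not scikit-learn API, and the 1000-sample threshold is the placeholder from the comment above):

```python
def resolve_early_stopping(early_stopping, n_samples, threshold=1000):
    """Return True if early stopping should be used for this fit."""
    if early_stopping == "auto":
        # Only enable ES when the dataset is large enough for the internal
        # validation split to give a low-variance performance estimate.
        return n_samples > threshold
    return bool(early_stopping)

print(resolve_early_stopping("auto", 100))    # False
print(resolve_early_stopping("auto", 5000))   # True
print(resolve_early_stopping(False, 50000))   # False
```

For reference, the rule scikit-learn eventually shipped uses a 10,000-sample cutoff for 'auto'.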

@adrinjalali
Member

I would agree with an auto option, and having it the default.

@NicolasHug
Member

I'm OK with an 'auto' default but I personally have no idea what a sensible default would be. Maybe @ogrisel would have some ideas.

Member

@NicolasHug NicolasHug left a comment

Thanks @johannfaouzi , still LGTM.

The test_warm_start_early_stopping should be updated to:

  • set early_stopping=True
  • change the assert to assert 0 < n_iter_second_fit - n_iter_first_fit < n_iter_no_change to avoid the test succeeding for the wrong reason (like it is right now)

@qinhanmin2014 qinhanmin2014 mentioned this pull request Nov 27, 2019
@adrinjalali adrinjalali modified the milestones: 0.22, 0.23 Dec 3, 2019
@ogrisel ogrisel self-requested a review December 23, 2019 13:16
@NicolasHug
Member

Could this get some love maybe @ogrisel @adrinjalali @glemaitre ?

Member

@adrinjalali adrinjalali left a comment

This PR somehow exploded, and it was partly my fault 🙈

I like the changes now. Do we test for its effectiveness somehow, i.e. do we check that it improves performance?

Comment on lines 121 to 125
- |Feature| :func:`inspection.partial_dependence` and
:func:`inspection.plot_partial_dependence` now support the fast 'recursion'
method for both estimators. :pr:`13769` by `Nicolas Hug`_.

:mod:`sklearn.feature_extraction`
Member

O_o

@@ -710,21 +712,25 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting):
and add more estimators to the ensemble. For results to be valid, the
estimator should be re-trained on the same data only.
See :term:`the Glossary <warm_start>`.
scoring : str or callable or None, optional (default=None)
early_stopping : 'auto' or bool (default='auto')
Member

Suggested change
early_stopping : 'auto' or bool (default='auto')
early_stopping : 'auto' or bool, default='auto'

but also happy to have all of them fixed in another PR

Member

Yeah, we should probably have the whole file fixed in another PR, since all docstrings follow the old style right now.

@NicolasHug
Member

Do we test for effectiveness of it somehow? i.e. do we check if it improves the performance?

Benchmarks start here #14516 (comment)

Member

@adrinjalali adrinjalali left a comment

Ah yeah, now I remember: in the end there wasn't anything we could easily add to the tests.

@NicolasHug
Member

Thanks @adrinjalali

Codecov seems unhappy but it's giving me a 500

@ogrisel I'll merge in the following days unless you have any additional comments

Member

@glemaitre glemaitre left a comment

LGTM. I think that we could move forward @NicolasHug

@NicolasHug
Member

Yup thanks for the reminder!

@NicolasHug NicolasHug changed the title [MRG] Turn on early stopping in histogram GBDT by default FEA Turn on early stopping in histogram GBDT by default Feb 12, 2020
@NicolasHug NicolasHug merged commit ee6b369 into scikit-learn:master Feb 12, 2020
@NicolasHug
Member

Thanks for the nice work @johannfaouzi !

@glemaitre
Member

glemaitre commented Feb 12, 2020

Thanks @johannfaouzi


Successfully merging this pull request may close these issues.

Turn on early stopping by default in HistGradientBoosting estimators
6 participants