Skip to content

ENH Uses _openmp_effective_n_threads to get the number of threads in HistGradientBoosting* #20477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 9, 2021

Conversation

thomasjpfan
Copy link
Member

Reference Issues/PRs

Fixes #16016

What does this implement/fix? Explain your changes.

This PR uses _openmp_effective_n_threads to assign num_threads in prange.

  • For objects that are used openmp for fit: TreeGrower, Splitter, and HistogramBuilder, _BinMapper, BaseLoss, n_threads is set during __init__.
  • For the predictor, _openmp_effective_n_threads is queried and passed in because its environment can be different, this logic is in: _predict_iterations.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for fixing this!

Just a few suggestions:

@thomasjpfan thomasjpfan changed the title ENH Uses _openmp_effective_n_threads to get the number of threads ENH Uses _openmp_effective_n_threads to get the number of threads in HistGradientBoosting* Jul 7, 2021
@thomasjpfan
Copy link
Member Author

I made a slight change at bec6f51 where the caller needs to pass in n_threads to _predict_iterations. I made this change to reduce the number of calls to _openmp_effective_n_threads in _staged_raw_predict, which loops through all the predictors.

Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>
@thomasjpfan thomasjpfan force-pushed the hist_gradient_threads branch from 61775f7 to 645cb9b Compare July 7, 2021 18:47
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Just a few remarks

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @thomasjpfan !

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @thomasjpfan and sorry for the late review.

I think we should document this in our parallelization docs: https://scikit-learn.org/stable/computing/parallelism.html#openmp-based-parallelism

Also, for consistency, shouldn't we start using _openmp_effective_n_threads in all other estimators as well? This would make the documentation easier to write, which is usually a good indicator that it's something reasonable to do

Comment on lines +268 to +269
avoids performance problems caused by over-subscription when using those
classes in a docker container for instance. :pr:`20477`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when using those classes in a docker container for instance

This is unrelated to docker, isn't it?
Some docker images will set CPU quotas, but some don't. I think that the over-generalization to all docker containers is confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this only affects linux machines right?

Copy link
Member

@jjerphan jjerphan Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this only affects linux machines right?

This will affect most machines because: the docker deamon needs a Linux kernel (to use cgroups and other features of it); this kernel generally is the host's OS's, or a virtual machine's running Linux.

I think Docker started developing support for Windows-native images, but this is rather niche.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment about linux was dissociated from the one about docker.

Basically, we should probably clarify that this change does not affect Windows or OSX users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it still affect Windows or OSX users because a Linux VM has to be used for Docker in those cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not talking about docker.

I am saying that as far as I understand, this entry will not affect users using scikit-learn on Windows or OSX.
(as long as they don't use docker, which is probably the vast majority of them).

@@ -262,6 +262,13 @@ Changelog
:mod:`sklearn.ensemble`
.......................

- |Enhancement| :class:`~sklearn.ensemble.HistGradientBoostingClassifier` and
:class:`~sklearn.ensemble.HistGradientBoostingRegressor` take cgroups quotas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a link to https://en.wikipedia.org/wiki/Cgroups for further context. I'm not sure we can expect readers to know what cgroups is

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants