Oversubscription in HistGradientBoosting with pytest-xdist #15078

Closed
@rth

Description

When running tests with pytest-xdist on a machine with 12 physical CPUs, the use of OpenMP in HistGradientBoosting seems to lead to significant oversubscription. For instance,

pytest sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py  -v

for me takes 0.85s. This runs 2 doctests that train a GBDT classifier and a regressor on the iris and boston datasets respectively.

  • Running this with 2 parallel processes (-n 2) takes 56s, and 50 threads are created (see the sketch below for one way to count them).
  • Running with 2 processes and OMP_NUM_THREADS=2 takes 0.52s.
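
One way to check the thread count is the following minimal sketch (my own assumptions, not part of the report: Linux-only, counts OS threads via /proc, and uses the experimental enable import required on scikit-learn master at the time of writing):

import os

from sklearn.datasets import load_iris
# Required on current scikit-learn master, where the estimator is experimental.
from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier

def os_thread_count():
    # Linux-only: /proc/self/task contains one entry per OS thread.
    return len(os.listdir("/proc/self/task"))

X, y = load_iris(return_X_y=True)
print("threads before fit:", os_thread_count())
HistGradientBoostingClassifier(max_iter=10).fit(X, y)
# OpenMP runtimes keep their worker pool alive after the parallel region,
# so the pool created by fit is still visible here.
print("threads after fit:", os_thread_count())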

While I understand the case of catastrophic oversubscription when N_CPU_THREADS**2 threads are created on a machine with many cores, here we create only 2*N_CPU_THREADS threads (presumably the ~50 threads observed above, with OpenMP defaulting to one thread per logical core in each of the 2 workers) as compared to 1*N_CPU_THREADS, and get a more than 10x slowdown.

Can someone reproduce it? This is with scikit-learn master, in a conda env on Linux with the latest numpy, scipy, nomkl, and python=3.7.

Because pytest-xdist uses its own parallelism mechanism (I'm not sure what it does exactly), I guess this won't be addressed by threadpoolctl (#14979)?
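
For reference, the kind of per-fit limit that threadpoolctl enables can already be applied manually; the following is a minimal sketch under my own assumptions (threadpoolctl installed, same experimental enable import as above), not something scikit-learn does today:

from threadpoolctl import threadpool_limits

from sklearn.datasets import load_boston
from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingRegressor

X, y = load_boston(return_X_y=True)

# Cap OpenMP at 2 threads for this block, mirroring OMP_NUM_THREADS=2:
# two concurrent test workers would then create 2 * 2 = 4 OpenMP threads
# in total instead of 2 * N_CPU_THREADS.
with threadpool_limits(limits=2, user_api="openmp"):
    HistGradientBoostingRegressor(max_iter=10).fit(X, y)

Each xdist worker is a separate process, so such a limit would have to be applied inside each worker (e.g. in a fixture) rather than once in the parent process.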

Edit: Originally reported in https://github.com/tomMoral/loky/issues/224
