Description
When running tests with pytest-xdist on a machine with 12 physical CPUs, the use of OpenMP in HistGradientBoosting seems to lead to significant over-subscription. For me,

```
pytest sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py -v
```

takes 0.85s. This runs 2 doctests, training a GBDT classifier and a regressor on the iris and boston datasets respectively.
- Running this on 2 parallel processes (`-n 2`) takes 56s (and 50 threads are created).
- Running with 2 processes and `OMP_NUM_THREADS=2` takes 0.52s.
While I understand the case of catastrophic oversubscription when `N_CPU_THREADS**2` threads are created on a machine with many cores, here we create only `2*N_CPU_THREADS` as compared to `1*N_CPU_THREADS`, and get a 10x slowdown.
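To make the thread arithmetic above concrete, here is an illustrative sketch (the 12-core count and 2-worker count are taken from this report; the totals count only OpenMP threads, not the ~50 total threads observed):

```python
# Assumed values for illustration: 12 physical cores (this machine),
# 2 pytest-xdist worker processes (`-n 2`).
n_cores = 12
n_workers = 2

# By default, each worker process gets an OpenMP pool sized to the machine,
# so the total OpenMP thread count is n_workers * n_cores (2 * N_CPU_THREADS):
total_default = n_workers * n_cores

# With OMP_NUM_THREADS=2, each worker's pool is capped at 2 threads:
total_capped = n_workers * 2

print(f"default: {total_default} OpenMP threads, capped: {total_capped}")
```

So the slow case runs with only 24 OpenMP threads on 12 cores, a modest 2x oversubscription, which makes the 10x slowdown surprising.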
Can someone reproduce it? I'm using scikit-learn master, in a conda env on Linux with the latest `numpy`, `scipy`, `nomkl`, `python=3.7`.
Because pytest-xdist uses its own parallelism system (I'm not sure what it does exactly), I guess this won't be addressed by threadpoolctl (#14979)?
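As a stopgap (not a fix in scikit-learn itself), the per-worker OpenMP pool can be capped by setting `OMP_NUM_THREADS` before any OpenMP-backed extension is imported; a minimal sketch, assuming it runs early enough (e.g. at the top of a `conftest.py`) — the helper name here is hypothetical:

```python
import os

def cap_omp_threads(n: int) -> None:
    # OMP_NUM_THREADS is read when the OpenMP runtime initializes, so this
    # must run before importing sklearn / other OpenMP-backed extensions.
    os.environ["OMP_NUM_THREADS"] = str(n)

cap_omp_threads(2)
```

Alternatively, threadpoolctl's `threadpool_limits` context manager can limit already-initialized pools at runtime, though whether that helps under pytest-xdist's process model is exactly the open question here.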
Edit: Originally reported in https://github.com/tomMoral/loky/issues/224