Skip to content

Conversation

jeremiedbb
Copy link
Member

Fixes #20483

@jeremiedbb jeremiedbb added the Quick Review For PRs that are quick to review label Mar 11, 2022
@adrinjalali
Copy link
Member

Can this be tested? 😁

@jeremiedbb
Copy link
Member Author

I don't see how to do that easily. We don't do it for the other pranges. It should mainly be noticeable when running in e.g a docker container with quota on cpu ressources on a machine with many cores.

@thomasjpfan
Copy link
Member

At a high level, we already test that the results are the same with different n_threads setting:

def test_result_equal_in_diff_n_threads(Estimator):
# Check that KMeans/MiniBatchKMeans give the same results in parallel mode
# than in sequential mode.

@jeremiedbb
Copy link
Member Author

jeremiedbb commented Mar 12, 2022

This PR is not about parallelizing a some part of the code. This function is already parallel, it's just that we don't control the number of threads the way we do for other pranges, i.e. with _openmp_effective_n_threads. That's why testing this change is particularily hard. The parallelism in itself is indeed already covered by the test mentioned above.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan
Copy link
Member

thomasjpfan commented Mar 12, 2022

Sorry for being unclear. I was trying to say that we already test for different n_threads settings showing that this PR does not introduce any regressions.

As long as _openmp_effective_n_threads is correct, then I do not think we need to test for functions that use n_threads in terms of oversubscription.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. LGTM!

@ogrisel ogrisel merged commit 09a7293 into scikit-learn:main Mar 14, 2022
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cython module:cluster Quick Review For PRs that are quick to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use _openmp_effective_n_threads to take cgroups CPU quotas into account into account in Elkan and Minibatch KMeans
4 participants