
[WIP] PERF Parallelize W/H updates of NMF with OpenMP #16439


Open · wants to merge 17 commits into main

Conversation

@jeremiedbb (Member)

Continuation of #6641

Here's a benchmark of the speedup using 2 threads. When the number of components is large enough, the speed-up approaches 2x (sometimes slightly better than 2x, because I also switched to a BLAS dot instead of a manual loop).
[Figure: benchmark plot of the W/H update speed-up with 2 threads as a function of the number of components]
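For context, here is a minimal sketch of the kind of kernel this PR parallelizes, assuming a simplified coordinate-descent update. This is not the actual scikit-learn `_update_cdnmf_fast` code; `parallel_update_W` and its arguments are hypothetical. Rows of W are distributed over OpenMP threads with `prange`, and the inner sum over components uses BLAS `ddot` rather than a manual loop:

```cython
# cython: boundscheck=False, wraparound=False
# Minimal sketch, not the actual scikit-learn kernel: one OpenMP thread
# per row of W, with the inner product delegated to BLAS ddot.
from cython.parallel import prange
from libc.math cimport fmax
from scipy.linalg.cython_blas cimport ddot

def parallel_update_W(double[:, ::1] W, double[:, ::1] HHt, double[:, ::1] XHt):
    cdef int n_samples = W.shape[0]
    cdef int n_components = W.shape[1]
    cdef int i, t
    cdef int inc = 1
    cdef double grad, hess

    # prange releases the GIL and spreads rows across OpenMP threads
    for i in prange(n_samples, nogil=True):
        for t in range(n_components):
            # gradient w.r.t. W[i, t]; the sum over components is a BLAS dot
            grad = -XHt[i, t] + ddot(&n_components, &W[i, 0], &inc,
                                     &HHt[t, 0], &inc)
            hess = HHt[t, t]
            if hess != 0:
                # projected coordinate update keeps W non-negative
                W[i, t] = fmax(W[i, t] - grad / hess, 0.)
```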

Since the update now calls a BLAS function inside an OpenMP loop, we first need to prevent oversubscription with threadpoolctl before we can merge this PR.
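A minimal sketch of how threadpoolctl can prevent that oversubscription, assuming the limit is applied around the solver call (`run_parallel_updates` is a hypothetical entry point, not an actual scikit-learn function): the BLAS threadpool is pinned to one thread so that only the outer OpenMP loop parallelizes.

```python
# Sketch: pin BLAS to a single thread while OpenMP owns the parallelism,
# so n_openmp_threads * n_blas_threads does not exceed the core count.
from threadpoolctl import threadpool_limits

with threadpool_limits(limits=1, user_api="blas"):
    # every ddot call inside the prange loop now runs single-threaded
    run_parallel_updates()  # hypothetical entry point for the W/H updates
```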

@NicolasHug (Member)

Please add something in the User Guide, and maybe also in the docstring, linking to https://scikit-learn.org/stable/modules/computing.html#openmp-based-parallelism

Base automatically changed from master to main January 22, 2021 10:52
@jjerphan (Member)

@jeremiedbb: this already looks like a nice improvement! Is there anything left to do here? 🙂

@jeremiedbb (Member, Author)

I tried to run the benchmark again and found very strange results. After digging, it appears there is an unexpected interaction between some builds of OpenBLAS and OpenMP that makes the update 10x slower in some cases. I opened OpenMathLib/OpenBLAS#3187 to track it. Since it occurs with the OpenBLAS shipped by default by conda-forge, I think we should wait a bit before merging this.

@jjerphan (Member)

Should we update and revisit this PR, without the call to `_dot`, to prevent OpenMathLib/OpenBLAS#3187 from happening?
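A minimal sketch of that alternative, assuming we simply inline the dot product (`manual_dot` is a hypothetical helper): no BLAS code runs inside the OpenMP region, which sidesteps the OpenBLAS/OpenMP interaction at the cost of losing vendor-optimized BLAS.

```cython
# Sketch: manual accumulation instead of BLAS ddot, callable from a
# nogil/prange context; the C compiler can still auto-vectorize this loop.
cdef inline double manual_dot(int n, double* x, double* y) nogil:
    cdef int k
    cdef double acc = 0.
    for k in range(n):
        acc += x[k] * y[k]
    return acc
```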
