[MRG] Parallelisation of decomposition/sparse_encode #13005
Conversation
thanks, please add a what's new entry
f2ee8d9 to a3a3114
thanks @nixphix
doc/whats_new/v0.20.rst (Outdated)
:mod:`sklearn.decomposition`
............................

- |Fix| Fixed a bug in :meth:`decomposition.sparse_encode` where computation was single
I think we should use :func: here.
done
a3a3114 to 32c422d
Good catch! Have you looked for other instances from when we introduced effective_n_jobs?
I've checked all the
…arn#13005)" This reverts commit 1ffbb6e.
I have a multiprocessing setup where each process runs a SparseCoder with n_jobs=1, and in the main thread I split my data across these worker processes. In scikit-learn 0.20.2 it works perfectly fine (parallelization makes processing faster, as expected), but when I updated to 0.20.3 it got super slow (parallel computing takes ~10 times longer than doing it in just one process).
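The setup described in that comment can be sketched as follows. This is a minimal, hypothetical reconstruction: `encode_chunk` stands in for a per-process `SparseCoder(..., n_jobs=1).transform(chunk)` call, and the chunking/worker-count values are placeholders, not the reporter's actual code.

```python
from multiprocessing import Pool


def encode_chunk(chunk):
    # Hypothetical stand-in for running a SparseCoder with
    # n_jobs=1 on one chunk of data inside a worker process.
    return [x * 2 for x in chunk]


def parallel_encode(data, n_workers=2, chunk_size=4):
    # Split the data in the main process and farm the chunks out
    # to a pool of workers, mirroring the setup described above.
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)]
    with Pool(n_workers) as pool:
        results = pool.map(encode_chunk, chunks)
    # Flatten the per-chunk results back into one sequence.
    return [x for chunk in results for x in chunk]


if __name__ == '__main__':
    print(parallel_encode(list(range(8))))
```

With this pattern, each worker should do its encoding serially (n_jobs=1), so any slowdown between releases points at the library's internal dispatch rather than the user's pool.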
Reference Issues/PRs
xref #12955
What does this implement/fix? Explain your changes.
sparse_encode (dict learning uses sparse_encode) runs in single-threaded mode irrespective of the n_jobs parameter: the parallel execution code is unreachable, since
effective_n_jobs
always returns a positive integer.

scikit-learn/sklearn/decomposition/dict_learning.py, lines 303 to 316 in ff46f6e
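The class of bug described above can be illustrated with a minimal sketch. The helper below is a simplified stand-in for `joblib.effective_n_jobs`, and the two dispatch functions are hypothetical, not the actual `dict_learning.py` code: the point is that because `effective_n_jobs` always returns a positive (hence truthy) integer, a condition that tests the return value directly instead of comparing it to 1 makes the serial branch unconditional.

```python
import os


def effective_n_jobs(n_jobs=None):
    # Simplified stand-in for joblib.effective_n_jobs: it always
    # resolves to a positive integer (1 for None, cpu-relative for
    # negative values).
    if n_jobs is None:
        return 1
    if n_jobs < 0:
        return max((os.cpu_count() or 1) + 1 + n_jobs, 1)
    return n_jobs


def dispatch_buggy(n_jobs, algorithm):
    # Buggy check: effective_n_jobs(...) is a positive integer,
    # hence always truthy, so the serial branch is always taken
    # and the parallel branch is unreachable.
    if effective_n_jobs(n_jobs) or algorithm == 'threshold':
        return 'serial'
    return 'parallel'  # never reached


def dispatch_fixed(n_jobs, algorithm):
    # Fixed check: compare the resolved job count to 1 explicitly,
    # so n_jobs > 1 actually reaches the parallel branch.
    if effective_n_jobs(n_jobs) == 1 or algorithm == 'threshold':
        return 'serial'
    return 'parallel'
```

Under this sketch, `dispatch_buggy(4, 'omp')` still returns `'serial'`, while `dispatch_fixed(4, 'omp')` returns `'parallel'`, which matches the behavior this PR fixes.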