Closed
Description
Some calls to Parallel in scikit-learn hardcode which joblib backend to use. It would be nice if these could be overridden to use a different backend via the parallel_backend context manager. This would allow optionally using the dask.distributed backend in more places, which may provide speedups (see comment here).
One way to do this would be to check whether a backend has been set globally (by the context manager) and, if so, use it; otherwise fall back to the backend specified at the call site. This might look like:
from sklearn.externals.joblib import Parallel
from sklearn.externals.joblib.parallel import _backend

def active_backend_or(default):
    """If there is an active joblib backend use that, otherwise use the default"""
    # backend_and_jobs, when set, holds a (backend, n_jobs) pair; take the backend
    return getattr(_backend, 'backend_and_jobs', (default, None))[0]

# Use the active backend if set, otherwise use "threading"
Parallel(backend=active_backend_or("threading"))(...)
I'm not sure if this is a fix that should be implemented in scikit-learn or in joblib. Opening this for discussion.