Add ability to override joblib backend for scikit-learn estimators #8804
Now I am wondering why using a context manager only overrides the default backend (when a backend is not specified explicitly) and not all of them. In other words:

```python
with parallel_backend(outer_backend):
    Parallel(backend=inner_backend)(...)
```

Should outer_backend have priority over inner_backend? |
Ping @ogrisel, I seem to remember we chatted about related things but I don't remember the outcome of the discussion. |
One problem that I see is that when the backend is threading, the code
can safely assume that there is shared memory, and use it. This is no
longer the case with other backends.
In your benchmark: did you check that the results were the same in the
different backends, or did you only compare the run-time? I don't
remember if random forests use shared memory or not.
|
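The shared-memory caveat above can be seen directly with joblib (a sketch, assuming joblib and NumPy are installed; `fill` is an illustrative helper, not anything from scikit-learn): with the threading backend, in-place writes made by workers are visible to the caller, while a process-based backend such as loky works on pickled copies for small arrays.

```python
import numpy as np
from joblib import Parallel, delayed

def fill(arr, i):
    # In-place write: visible to the caller only if memory is shared.
    arr[i] = i

a = np.zeros(4)
Parallel(n_jobs=2, backend="threading")(delayed(fill)(a, i) for i in range(4))
# Threads share memory, so a is now [0., 1., 2., 3.]

b = np.zeros(4)
Parallel(n_jobs=2, backend="loky")(delayed(fill)(b, i) for i in range(4))
# Worker processes mutate their own copies (small arrays are pickled,
# not memmapped by default), so b stays all zeros.
```

This is exactly why code written against the threading backend can silently return wrong results when another backend is swapped in.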
That's a good point actually, and this is going to be a problem since #8672, which was merged around a month ago and relies on the threading backend. In https://github.com/scikit-learn/scikit-learn/pull/8672/files#diff-65dc1c804f310acfd90f5ea83286065cR387 the function that was passed into |
I'm not sure exactly what to compare for trees, but they have the same score, and the following passes (for RandomForestClassifier):

```python
for t1, t2 in zip(distributed_res.estimators_, threading_res.estimators_):
    np.testing.assert_array_equal(t1.tree_.value, t2.tree_.value)
```

That is a good point though, and would need to be considered when determining what can have overridable backends.

> Should outer_backend have priority over inner_backend?

I don't think so, as that wouldn't allow internal code to ensure using threading. For the case of RandomForest we could allow fit to use any backend, but predict would require threading (my benchmark found that distributing predict was slower anyway, which makes sense because of the low overhead of predict). |
allow_override=False?
…On 29 Apr 2017 1:56 am, "Jim Crist" ***@***.***> wrote:
|
If we were to do that, my preference would be that
For moving forward on this, is this something that should be an issue/PR to joblib instead of here? |
I think that the notion of overrides is going to be challenging to do in
a non-fragile way.
Maybe we should have semantics in the way we specify the backends. To
give an example, what the threading backend provides is shared memory;
what it doesn't provide is parallel computing for code that holds the GIL.
A semantic specification might look like "backend='nogil,shared_mem'",
which would say that the code needs shared memory, but doesn't need
GIL-level parallelism.
Another dimension to look at would be whether the individual operations
are expected to be very short or not.
|
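One way to read this proposal is as a capability match between what code requires and what backends provide. The sketch below is purely illustrative — the capability names, backend table, and `pick_backend` helper are made up here and were never joblib API:

```python
# Toy illustration of a "semantic" backend specification.
# Capability names and the backend table are hypothetical.
BACKEND_CAPABILITIES = {
    "threading": {"shared_mem"},    # shared memory, but no escape from the GIL
    "multiprocessing": {"nogil"},   # separate processes escape the GIL
}

def pick_backend(spec):
    """Return the first backend providing every capability in `spec`.

    `spec` is a comma-separated requirement string, e.g. 'shared_mem'.
    """
    required = {s.strip() for s in spec.split(",")}
    for name, provided in BACKEND_CAPABILITIES.items():
        if required <= provided:
            return name
    raise ValueError(f"no backend provides all of {sorted(required)}")
```

Under this toy model, requesting both `shared_mem` and `nogil` fails, which makes the tension in the discussion concrete: no single backend satisfies both.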
Hmmm, that's an interesting idea. I will say that as a novice joblib user, I find the concept of a default backend and whether that backend can/cannot be swapped out to be simpler than a set of semantic requirements (probably simpler to implement too). It's not immediately clear to me what nogil means here (could mean either that my function releases the gil or that I want to compute in processes to avoid the gil?). |
I think it would be good to allow users to directly specify what the input
to joblib is if they want to override it. That way they can call upon
documentation from sklearn, joblib, and other packages built on joblib
to better understand what is happening. In the sklearn documentation we
would want to explain why we chose what we did, for example that
'threading' is useful for trees, but not for estimators that don't release
the GIL.
…On Mon, May 1, 2017 at 10:41 AM, Gael Varoquaux ***@***.***> wrote:
> It's not immediately clear to me what nogil means here (could mean
> either that my function releases the gil or that I want to compute in
> processes to avoid the gil?).
I agree that the semantics should be made clear in documentation. I also
agree that better wording must be explored.
The question is rather: are these the right axes to explore?
Ping @ogrisel, @aabadie, @lesteve
|
Is there anything I can do to help expedite this issue? If I was to make an example PR showing what I think should be done, should I submit it here or joblib? Would that be a good next step? |
If it requires changes to joblib, best to raise it there.
…On 23 May 2017 7:25 am, "Jim Crist" ***@***.***> wrote:
|
See joblib/joblib#524. |
This is a really neat addition, thanks for the link. I'll be watching it. |
Uses the new prefer / require keywords from joblib/joblib#602. This allows users to control how jobs are parallelized in more situations. For example, training a RandomForest on a cluster of machines with the dask backend. Closes scikit-learn#8804
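The `prefer` / `require` keywords mentioned above are the joblib API that came out of this discussion (joblib >= 0.12). A minimal sketch, assuming joblib is installed: `prefer` is a soft hint that a `parallel_backend` context can override, while `require="sharedmem"` is a hard constraint that forces a shared-memory backend even inside such a context.

```python
from joblib import Parallel, delayed, parallel_backend

# Soft hint: threads are used when no context manager selects a backend.
out = Parallel(n_jobs=2, prefer="threads")(delayed(abs)(-i) for i in range(5))

# Hard constraint: even inside a loky (process-based) context,
# require="sharedmem" falls back to a shared-memory backend.
with parallel_backend("loky"):
    out2 = Parallel(n_jobs=2, require="sharedmem")(
        delayed(abs)(-i) for i in range(5)
    )
```

This is what lets library code protect shared-memory assumptions (the concern raised earlier in the thread) while still leaving other calls overridable.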
Should this be closed? PR (#11166) allows optional use of a different |
I am not convinced: right now specifying the backend with system joblib will work only if scikit-learn is "unvendored". |
We provide sklearn.utils.parallel_backend for this reason. It's not perfect.
|
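For reference, scikit-learn re-exported joblib's context manager as `sklearn.utils.parallel_backend` (since deprecated in favour of using joblib directly). A minimal sketch, assuming scikit-learn and joblib are installed, of redirecting an estimator's internal `Parallel` calls without touching the estimator:

```python
from joblib import parallel_backend  # sklearn.utils re-exported this helper
from sklearn.ensemble import RandomForestClassifier

X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 0, 1]

# Internal Parallel calls that don't hardcode a backend pick up "threading".
with parallel_backend("threading", n_jobs=2):
    clf = RandomForestClassifier(n_estimators=4, random_state=0).fit(X, y)
```

Swapping `"threading"` for a distributed backend (e.g. dask's) is the use case that motivated this issue.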
@GaelVaroquaux, @ogrisel, joblib is no longer distributed with scikit-learn, right? Shall this issue be closed now? Thanks! |
I think this issue can be closed now. As noted in the dask-ml docs, |
Some calls to `Parallel` in scikit-learn hardcode what joblib backend to use. It would be nice if these could be overridden to use a different backend using the `parallel_backend` contextmanager. This would allow optionally using the `dask.distributed` backend in more places, which may provide speedups (see comment here). One way to do this would be to check if there's a globally set backend (as set by the context manager) and use that, otherwise use the specified fallback. This might look like:
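(The code snippet from the original issue did not survive extraction. A self-contained toy sketch of the described check — names are illustrative and do not reflect joblib's internals — could be:)

```python
from contextlib import contextmanager

_active_backend = None  # set only while a parallel_backend block is active

@contextmanager
def parallel_backend(name):
    """Toy stand-in for joblib's parallel_backend context manager."""
    global _active_backend
    prev, _active_backend = _active_backend, name
    try:
        yield
    finally:
        _active_backend = prev

def effective_backend(fallback="threading"):
    """Use the globally set backend if any, otherwise the hardcoded fallback."""
    return _active_backend if _active_backend is not None else fallback
```

Inside a `with parallel_backend("dask.distributed"):` block, `effective_backend()` returns the global choice; outside it, the hardcoded fallback wins.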
I'm not sure if this is a fix that should be implemented in scikit-learn or in joblib. Opening this for discussion.