
FIX pass explicit configuration to delayed #25290


Closed
wants to merge 11 commits

Conversation

glemaitre
Member

Working alternative to #25242
closes #25242
closes #25239

This is an alternative to #25242, which does not work if the thread that imports scikit-learn is different from the thread making the call to Parallel.

Here, we instead pass explicitly the configuration obtained in the thread that makes the Parallel call.

We raise a warning when the config is not passed explicitly. That makes sure it will turn into an error if we forget to pass the config to delayed. The code will keep working if joblib eventually provides a way to supply a context and a config.
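
A minimal sketch of the proposed pattern (simplified for illustration; the actual change lives in sklearn/utils/fixes.py and may differ in its details):

    import functools
    import warnings

    from sklearn import config_context, get_config


    def delayed(function, config=None):
        # If no config is passed explicitly, fall back to the config of the
        # current thread and warn, so forgetting to pass it gets noticed.
        if config is None:
            warnings.warn(
                "`delayed` should be called with an explicit scikit-learn "
                "config; falling back to the current thread's config.",
                UserWarning,
            )
            config = get_config()

        @functools.wraps(function)
        def delayed_function(*args, **kwargs):
            return _FuncWrapper(function, config), args, kwargs

        return delayed_function


    class _FuncWrapper:
        # Restores the captured configuration around the actual call, so the
        # function sees the caller's config even in a worker thread/process.
        def __init__(self, function, config):
            self.function = function
            self.config = config
            functools.update_wrapper(self, self.function)

        def __call__(self, *args, **kwargs):
            with config_context(**self.config):
                return self.function(*args, **kwargs)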

@glemaitre glemaitre marked this pull request as ready for review January 4, 2023 17:02
Member

@ogrisel ogrisel left a comment

I think this approach is a good trade-off between verbosity and non-magicness.

I think the tests should be expanded to check the thread-safety of the config management in conjunction with joblib calls, both when using the loky backend and the threading backend.
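
For illustration, a hedged sketch of such a backend-parametrized test (hypothetical test code, not part of this diff):

    import pytest
    import sklearn
    from joblib import Parallel
    from sklearn.utils.fixes import delayed


    def get_working_memory():
        return sklearn.get_config()["working_memory"]


    @pytest.mark.parametrize("backend", ["loky", "threading"])
    def test_config_propagated_to_workers(backend):
        # Capture the config in the calling thread and pass it explicitly,
        # as proposed in this PR, so all workers see the same configuration.
        with sklearn.config_context(working_memory=123):
            config = sklearn.get_config()
            results = Parallel(n_jobs=2, backend=backend)(
                delayed(get_working_memory, config=config)() for _ in range(10)
            )
        assert results == [123] * 10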

@@ -840,8 +841,12 @@ def evaluate_candidates(candidate_params, cv=None, more_results=None):
)
)

# Capture the config of the current thread here instead of inside the
# generator expression. The generator expression can be consumed by
# an auxiliary thread in joblib.
Member

This comment was written only once, for the purposes of this PR. Repeating it everywhere might be too verbose. Not sure what to do...

Maybe the content of the warning message is explicit enough.
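
To make the intent concrete, a short hedged illustration of the pattern (the names `parallel`, `task`, and `args` are placeholders):

    # Wrong: get_config() is evaluated lazily, when the generator is
    # consumed, possibly in joblib's auxiliary dispatch thread where the
    # caller's config_context is not active.
    out = parallel(delayed(task, config=get_config())(arg) for arg in args)

    # Right: capture the config eagerly in the calling thread, then pass it.
    config = get_config()
    out = parallel(delayed(task, config=config)(arg) for arg in args)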

)

-    assert_array_equal(results, [123] * 2)
+    assert_array_equal(results, [123] * 10)
Member

Maybe you could extend this test to show that this pattern also works from other threads, e.g. with something like the following (untested):

    results = []
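    # (Assumed from the surrounding test module: `threading`, `sklearn`,
    # `Parallel` from joblib, and `delayed` from sklearn.utils.fixes are
    # imported, and `get_working_memory` / `n_iter` are defined earlier.)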

    def parallel_inspect_config():
        with sklearn.config_context(working_memory=123):
            config = sklearn.get_config()
            results.extend(
                Parallel(n_jobs=2, pre_dispatch=4)(
                    delayed(get_working_memory, config=config)() for _ in range(n_iter)
                )
            )

    other_thread = threading.Thread(target=parallel_inspect_config)
    other_thread.start()
    other_thread.join()

    assert results == [123] * n_iter

It would even be better to have a test with ThreadPoolExecutor that checks that concurrently running threads calling joblib Parallel with different contexts do not end up with mixed-up configurations.
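
One hedged, untested sketch of such a test (helper names are illustrative):

    from concurrent.futures import ThreadPoolExecutor

    import sklearn
    from joblib import Parallel
    from sklearn.utils.fixes import delayed


    def get_working_memory():
        return sklearn.get_config()["working_memory"]


    def run_parallel(working_memory):
        # Each thread sets its own config and passes it explicitly to delayed.
        with sklearn.config_context(working_memory=working_memory):
            config = sklearn.get_config()
            return Parallel(n_jobs=2, pre_dispatch=4)(
                delayed(get_working_memory, config=config)() for _ in range(10)
            )


    # Two threads run concurrently with different configurations; neither
    # should ever observe the other's value.
    with ThreadPoolExecutor(max_workers=2) as executor:
        results = list(executor.map(run_parallel, [123, 456]))

    assert results == [[123] * 10, [456] * 10]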

Member Author

I added this test.

Member

@thomasjpfan thomasjpfan left a comment

I agree this implementation is less magical compared to #25242.

@@ -107,22 +109,39 @@ def _eigh(*args, **kwargs):


# remove when https://github.com/joblib/joblib/issues/1071 is fixed
-def delayed(function):
+def delayed(function, config=None):
Member

I think the original bug is big enough for us to have a private _delayed(function, config) so it can be backported to 1.2.1, to avoid adding new API in a bug-fix release. I'll be okay if this is done in a separate PR.

Unfortunately, developers using joblib will need to update their code to work correctly with globally setting transform_output="pandas". Moreover, developers will need to depend on utils.fixes.delayed for a while. This suggests to me that we need to render the docs for utils.fixes.delayed and properly document it.

@ogrisel
Member

ogrisel commented Jan 9, 2023

Thinking more about it, we could also make this more automatic by subclassing joblib.Parallel as sklearn.fixes.Parallel to override the Parallel.__call__ method: automatically call sklearn.get_config there, then rewrap the generator args of Parallel.__call__ to call delayed_object.set_config(config) on each task.

That would mandate using the sklearn.fixes.Parallel subclass everywhere though.

And indeed, maybe we should consider those tools (Parallel and delayed) semi-public with proper docstrings to explain how they extend the joblib equivalent to propagate scikit-learn specific configuration to worker threads and processes.
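
A hedged sketch of that subclassing idea (hypothetical code, not part of this PR; it assumes tasks are the (function, args, kwargs) triples produced by delayed, and that attaching a `config` attribute is the hook the task wrapper reads):

    import joblib
    from sklearn import get_config


    class Parallel(joblib.Parallel):
        def __call__(self, iterable):
            # Capture the config once, in the thread that calls Parallel.
            config = get_config()
            iterable_with_config = (
                (_with_config(delayed_func, config), args, kwargs)
                for delayed_func, args, kwargs in iterable
            )
            return super().__call__(iterable_with_config)


    def _with_config(delayed_func, config):
        # Hypothetical hook: store the captured config on the task wrapper so
        # it can restore it (e.g. via config_context) when run in a worker.
        delayed_func.config = config
        return delayed_func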


Successfully merging this pull request may close these issues.

ColumnTransformers don't honor set_config(transform_output="pandas") when multiprocessing with n_jobs>1