MAINT replace enable_slep006 fixture by @config_context(enable_metadata_routing=True) #30038
Conversation
/cc @lesteve @adrinjalali
LGTM.
/cc @StefanieSenger just to let you know about this PR for future PRs related to metadata routing.
Let's merge this one, thanks!
I am wondering why the `@config_context(enable_metadata_routing=True)` decorator is placed below `@pytest.mark.parametrize`, and whether the ordering changes how often the context is set.
```diff
@@ -62,13 +62,6 @@
 my_other_weights = rng.rand(N)
 
 
-@pytest.fixture(autouse=True)
```
@ogrisel was there no other way in the files where this needs to be always active? I don't like that every single test has it here.
I personally lean slightly towards being explicit even if that means more repeated code, but yeah, I do get your point.
I guess if you really wanted to, you could do something like below. I am not sure I would recommend it though; it feels slightly too magical. One caveat is that the `for` loop needs to be at the very bottom of the file, otherwise it will not decorate all test functions.
```python
def decorator(f):
    def wrapped(*args, **kwargs):
        print('decorator')
        return f(*args, **kwargs)
    return wrapped


def test_1():
    pass


def test_2():
    pass


# This loop must stay at the very bottom of the file so that every
# test function defined above gets decorated.
my_locals = [
    (name, obj)
    for name, obj in locals().items()
    if name.startswith('test_') and callable(obj)
]
for name, obj in my_locals:
    locals()[name] = decorator(obj)
```
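Note that this trick only works at module scope, where `locals()` is the same dict as `globals()`; inside a function, writes to `locals()` are not guaranteed to take effect, which is part of why it feels too magical.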
Or we simply use a thread-safe version of a pytest fixture?
I wish "simply" applied here, but unfortunately it's not the case 😉
One of the issues is that the fixture runs in the main thread while the test function (through a pytest plugin like pytest-run-parallel) runs inside a different thread. So if you change scikit-learn config settings in the fixture, you are changing the config in the main thread, and that does not affect the config seen by the test function.
See #30007 (comment) for more details.
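For illustration, here is a minimal sketch (not part of the PR) of why the fixture approach breaks down: scikit-learn's config is thread-local, so a context entered in the main thread is invisible to a freshly spawned worker thread.

```python
from threading import Thread

from sklearn import config_context, get_config


def worker():
    # Runs in a separate thread: it gets its own thread-local config,
    # initialized from the global defaults, so the context entered in
    # the main thread has no effect here.
    print("worker:", get_config()["enable_metadata_routing"])  # False


with config_context(enable_metadata_routing=True):
    # The main thread sees the temporary value...
    print("main:", get_config()["enable_metadata_routing"])  # True
    # ...but a thread started inside the context does not.
    t = Thread(target=worker)
    t.start()
    t.join()
```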
Answering @StefanieSenger's remarks:
I don't think it matters, but I thought I would put the non-pytest decorator closer to the function definition to better convey the intention that we want this config context to be set right around the function call, even if the different test parametrizations are not run consecutively, and all the more so if they are run concurrently.
Context is set and unset at each function call in either case (whatever the relative ordering of the decorators). This can be checked by applying the following patch:

```diff
diff --git a/sklearn/_config.py b/sklearn/_config.py
index 05549c88a9..0518a796c0 100644
--- a/sklearn/_config.py
+++ b/sklearn/_config.py
@@ -357,6 +357,7 @@ def config_context(
         ValueError: Input contains NaN...
     """
     old_config = get_config()
+    new_config = {k: locals().get(k) for k in old_config.keys()}
     set_config(
         assume_finite=assume_finite,
         working_memory=working_memory,
@@ -369,8 +370,13 @@ def config_context(
         enable_metadata_routing=enable_metadata_routing,
         skip_parameter_validation=skip_parameter_validation,
     )
-
+    filtered_new_config = {k: v for k, v in new_config.items() if v is not None}
+    print(f"set_config with temporary values: {filtered_new_config}")
     try:
         yield
     finally:
         set_config(**old_config)
+        filtered_old_config = {
+            k: old_config[k] for k, v in new_config.items() if v is not None
+        }
+        print(f"set_config with original values {filtered_old_config}")
```

If we run pytest for a function with the double decoration (`@pytest.mark.parametrize` on top of `@config_context(...)`), the printed output shows the config being set to the temporary values and restored to the original values around each parametrized call.
As expected. Then we can swap the order of the decorators for a specific test and rerun the pytest command for that test function:

```diff
diff --git a/sklearn/ensemble/tests/test_stacking.py b/sklearn/ensemble/tests/test_stacking.py
index e944ecc4ab..e2a78219da 100644
--- a/sklearn/ensemble/tests/test_stacking.py
+++ b/sklearn/ensemble/tests/test_stacking.py
@@ -921,6 +921,7 @@ def test_routing_passed_metadata_not_supported(Estimator, Child):
     )
 
 
+@config_context(enable_metadata_routing=True)
 @pytest.mark.parametrize(
     "Estimator, Child",
     [
@@ -928,7 +929,6 @@ def test_routing_passed_metadata_not_supported(Estimator, Child):
         (StackingRegressor, ConsumingRegressor),
     ],
 )
-@config_context(enable_metadata_routing=True)
 def test_get_metadata_routing_without_fit(Estimator, Child):
     # Test that metadata_routing() doesn't raise when called before fit.
     est = Estimator([("sub_est", Child())])
```

we get the same behavior: the config is set and unset around each individual test call.
But this is what we want: we need the config to be set and unset right around each individual test call. Furthermore, the per-function call overhead of entering and exiting the config context is the same whichever decorator ordering is used.
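A minimal sketch (test names are illustrative, not from the PR) of why the ordering does not matter: `@pytest.mark.parametrize` only attaches a mark to the function object without wrapping the call, while `config_context(...)`, being built with `contextlib.contextmanager`, can be used as a decorator that re-enters a fresh context on every call. Either way, the context is entered and exited once per parametrized call.

```python
import pytest

from sklearn import config_context, get_config


@pytest.mark.parametrize("x", [0, 1])
@config_context(enable_metadata_routing=True)
def test_routing_enabled_below_mark(x):
    # The decorator wraps the actual call, so the flag is set here.
    assert get_config()["enable_metadata_routing"] is True


@config_context(enable_metadata_routing=True)
@pytest.mark.parametrize("x", [0, 1])
def test_routing_enabled_above_mark(x):
    # The pytest mark survives the wrapping (functools.wraps copies the
    # function's __dict__, which holds the mark), so this behaves the same.
    assert get_config()["enable_metadata_routing"] is True
```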
@ogrisel Thank you for your comment on my question. It helped me understand that the context gets set the same number of times either way (still a bit puzzled about the why, but it's okay, this is not what matters).
Happy that this is the case. :)
This is a first step towards making it possible to run the tests with thread-based parallelism, in order to test with the new free-threading mode of Python 3.13.
Discussed in #30007 (comment).
In short: `sklearn.config_context` stores the active config in thread-local variables. However, all the pytest fixtures are typically run in the main thread, while the tests themselves are run in a different thread when using one of the recent pytest plugins dedicated to running the tests in parallel with a thread pool (to detect GIL-related compatibility issues). One could argue that this is a design bug of those plugins, but more pragmatically, we can simplify our test suite by directly using `sklearn.config_context` as a function decorator instead of relying on a dedicated fixture: the decorator is executed on the same thread as the test function itself.
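For context, a hedged sketch of what the migration looks like in a test module; the fixture body is reconstructed from the fixture's name and purpose, not copied from the diff, and the test name is hypothetical.

```python
from sklearn import config_context

# Before: an autouse fixture enabled metadata routing for every test in
# the module. It runs in the main thread, so thread-based pytest plugins
# would not see the temporary config in the thread running the test.
#
# @pytest.fixture(autouse=True)
# def enable_slep006():
#     """Enable SLEP006 (metadata routing) for all tests in this module."""
#     with config_context(enable_metadata_routing=True):
#         yield


# After: each test that needs routing is decorated explicitly; the
# context is entered on the same thread as the test call itself.
@config_context(enable_metadata_routing=True)
def test_metadata_routing_example():
    # Hypothetical test body, for illustration only.
    ...
```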