-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
DOC Improve docstrings for scikit-learn configuration functions #31486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…onfig, config_context)
sklearn/_config.py
Outdated
It is recommended to use this context manager over :func:`set_config` for | ||
experimental features (such as `array_api_dispatch` or `enable_metadata_routing`), | ||
as their support is not yet consistently available across the library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not true. Using context managers for such settings actually probably results in confusion. I'd remove the paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Genuine question, I didn't realise this? In the array API docs this is what we seem to suggest: https://scikit-learn.org/dev/modules/array_api.html#pytorch-support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in array API, we would suggest users to use config_context
and I also remember that we had talked about encouraging users to rather use config_context
for metadata routing, because then they cannot forget to disable it and don't get error messages, that are hard to trace back to what caused them. But that was when it was not as widely available as it is now.
Should I maybe only remove its mention, so that metadata routing is not so exposed here? I think it's outgrowing experimental state and that maybe for it that soon wouldn't be the right advise anymore.
It is recommended to use this context manager over :func:`set_config` for | |
experimental features (such as `array_api_dispatch` or `enable_metadata_routing`), | |
as their support is not yet consistently available across the library. | |
It is recommended to use this context manager over :func:`set_config` for | |
experimental features (such as `array_api_dispatch`), | |
as their support is not yet consistently available across the library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issues with using context managers in array API and metadata routing:
- users might fit with array API enabled, which results in certain private attributes being stored in the same space as the input. Then down the script they forget to enable array API and
predict
would fail since it would convert input to numpy and try to do computation with numpy on one side, and original space of the input data on the other. We don't have a good solution for this and we just error I guess. - with metadata routing, users might set requests via (
set_{method}_request
) for instance, and then down there forget to enable routing and try aGridSearchCV
assuming things get routed.
So in both cases, it's better to either enable it or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I do not necessarily agree, because usage mistakes can happen whether set_config
or config_context
are used, but it's not wrong to leave this recommendation out either, so I will remove this part.
About different namespaces during fit
and predict
:
We don't have a good solution for this and we just error I guess.
We don't error currently. I think if we would error, then the error message could point towards enabling array_api_dispatch
, and then using config_context
would not be a hurdle. This question is currently discussed on the bottom of issue #28668, especially from #28668 (comment) onwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also had an issue about it: #26083
sklearn/_config.py
Outdated
This context manager can be used to apply scikit-learn configuration changes locally | ||
within a specific scope. Once the context exits, the previous global configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This context manager can be used to apply scikit-learn configuration changes locally | |
within a specific scope. Once the context exits, the previous global configuration | |
This context manager can be used to apply scikit-learn configuration changes locally, | |
within a context block. Once the context exits, the previous global configuration |
not 100% on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit redundant and I also wanted to educate people who don't know what a context manager is.
What about:
This context manager can be used to apply scikit-learn configuration changes locally | |
within a specific scope. Once the context exits, the previous global configuration | |
This context manager can be used to apply scikit-learn configuration changes | |
locally, limited to the scope of the with statement. Once the context exits, the | |
previous global configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about?
This context manager can be used to apply scikit-learn configuration changes locally | |
within a specific scope. Once the context exits, the previous global configuration | |
This context manager can be used to apply scikit-learn configuration changes within | |
the scope of the with statement. Once the context exits, the previous global configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's it.
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
This PR improves the docstrings of
set_config
,get_config
,config_context
, to help users understand how to use them.Specifically:
None
) and global defaults