-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DOC Remove scikit-learn-only docs references related to metadata routing #26747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR and taking the time to explain the problem in details. I think this is a legitimate request.
WDYT @adrinjalali? Shall we use static URLs instead? Or is the text suggested in the PR enough?
This issue came up during review of Although, I am okay accepting this PR with static links, so it is easier for third party libraries. |
Ah interesting. I didn't notice that. I guess we just got lucky that we hadn't run into this in LightGBM before, based on the mix of things we import from I think the A note in https://scikit-learn.org/stable/developers/develop.html about this might help other developers of downstream libraries avoid stumbling over this the way I did... but I'll leave it to yall whether or not add that. Would totally understand if you want to keep that document free of such details. Given all this... it's ok with me if you all choose to close this PR without merging it. I'll leave that up to you to decide. Thanks very much for your time and consideration on this issue! |
We are getting the same problem in the nilearn doc build: is this recommendation for third party mentioned somewhere in the doc? Sorry if I missed it. |
I don't think we should change them to static links, since it'll be brittle and links might change. Closing this one as the solution is an easy on to implement through intersphinx. |
The following warning generates and error in doc build. Warning, treated as error: /home/runner/work/brainiak/brainiak/brainiak/factoranalysis/htfa.py:docstring of brainiak.factoranalysis.htfa:1:undefined label: 'metadata_routing' Seems to be discussed here: scikit-learn/scikit-learn#26747
Although that really makes life difficult for people who work on secure cluster where access to external urls via intersphix is limited. |
You can hook into |
Reference Issues/PRs
The release of
scikit-learn
v1.3.0 two days ago (link) included the addition of new public methods for metadata routing, specifically #24027.That change broke documentation generation in
LightGBM
(https://github.com/microsoft/LightGBM), and I suspect it might affect other projects which implement scikit-learn-compatible estimators similarly. The LightGBM-specific details are documented in microsoft/LightGBM#5954 (comment), but I've included the relevant points below.What does this implement/fix? Explain your changes.
Proposes removing reStructuredText
:ref:
references toscikit-learn
's own documentation in public methods of classes which are intended to be inherited from by other libraries.In LightGBM, we follow the advice in "Developing scikit-learn estimators" (link) and inherit from
sklearn.base.BaseEstimator
,sklearn.base.ClassifierMixin
, andsklearn.base.RegressorMixin
to reduce boilerplate and keep up with changes here inscikit-learn
.#24027 introduced reStructuredText references in public methods of those classes which only resolve correctly when rendered inside
scikit-learn
's own docs, like this:scikit-learn/sklearn/utils/_metadata_requests.py
Lines 1290 to 1291 in b88b539
In LightGBM, we use
sphinx.ext.autodoc
to generate documentation of thelightgbm
Python package, e.g. https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMClassifier.html#. And we set the following in our Sphinx configuration (code link)With that combination,
sphinx.ext.autodoc
automatically includes the rendered docstrings for every class attribute and method which doesn't begin with a_
, including those inherited from the class's parent classes. That means as ofscikit-learn
v1.3.0, Sphinx will try to render method documentation for e.g.lightgbm.sklearn.LGBMClassifier.get_metadata_routing
.Because of the scikit-learn-docs-specific references in
scikit-learn
's docstrings, that can lead to docs-generation errors like the following:Any other comments?
I suspect that the setup for LightGBM that I've described is one that other projects implementing scikit-learn-compatible estimators might have for their documentation + strategy for remaining compatible. That's why I think a change to the docs here in
scikit-learn
is preferable to the following things I could do in LightGBM to work around this:get_metadata_routing
and theset_*_request()
methods, and define new docstrings on them, inlightgbm
's source codelightgbm
's Sphinx docs which catch those references like:ref:User Guide <metadata_routing>
and re-route out to the relevant page(s) at https://scikit-learn.org/Since the things in
sklearn.base
are expected to be imported by other projects creating scikit-learn-compatible estimators, specifically to reduce such boilerplate and duplicate effort across downstream projects, I think they should be free from docs references which assume they're being rendered inside scikit-learn's own documentation.Thanks very much for your time and consideration.