-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
RFC Definition of public API #12927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'd rather see a definition which is reflected in the code and import paths, for it to be clear from the code that people write, and not only some documentation. I'd even throw the wild idea of having v0.21 as the last in 0.xx series and have it as an LTS release, and simply add an If not, we can try the hack you mentioned in #11182 (comment), doesn't sound too bad to me. |
Thanks for the feedback!
How would that handle the case of
yeah, v1.0 is a possibility, but ideally, we would need some way to handle the deprecation/removal of
The most general way of moving imports including nested subdirectories I'm aware of is with import hooks. That's definitely not trivial and error-prone (e.g. for pickling, or some systems that implement their own hooks such as pyinstaller) and I don't think that it would be justified in this case. |
My proposal is that we make all files to follow This means the imports would look like:
And I'd say both are private because a part of the import path has an
If v0.21 is going to be the last in the 0.xx series, I guess having a way to handle removal/deprecation of code from externals becomes less critical, i.e. worst case scenario we keep them there, but they're removed/moved to |
I am +1 for this. We are not really consistent up to now but changes all the naming will be costly for all open PRs. The proposal to make it happen at 1.0 might be a way.
I agree with this point and I think that it could be extended for |
Actually, PEP8 make the distinction between public / private API using the documentation only: https://www.python.org/dev/peps/pep-0008/#public-and-internal-interfaces
Regarding,
This would break a lot of existing code though, which can make users unhappy even in a major 1.0 release. |
My proposal is that we make all files to follow blahblah.py style, which
would mean to rename dbscan.py to _dbscan.py and data.py to _data.py
I am +1 too. Explicit is good.
joblib might be a bit more tricky.
Joblib vendoring is going away anyhow, so it's just a question of finding
the right path.
|
This continues the work done in #12639 on dropping the python 2 support by, - ~~removing unnecessary `from __future__` imports~~ - removing unused `sklearn.utils.fixes` assuming we can agree in #12927 that `sklearn.utils.fixes` are private as was stated e.g. in #6616 (comment)
Briefly (having only skimmed the discussion):
Renaming modules breaks unpickling. I'm not convinced it should be done
lightly even if we provide no guarantees. There are also other things in
the module namespace that we hope are not being imported, i.e. things
imported from elsewhere. I think we need to just state which import paths
are considered public.
The glossary also provides some relevant definition:
https://scikit-learn.org/0.20/glossary.html#term-backwards-compatibility
|
On this subject https://github.com/glyph/publication is an intersting read (even if I'm not sure I agree with the proposed final solution). In particular, see the section about disadvanges of using underscores everwhere. |
You mean my proposal ;P #9250 |
One of the reasons I proposed the underscore is that it simplifies tab-completion, but I think the definition of If we do that and have otherwise empty |
So you mean defining empty `__all__` in most places? I'd agree to that.
|
This continues the work done in scikit-learn#12639 on dropping the python 2 support by, - ~~removing unnecessary `from __future__` imports~~ - removing unused `sklearn.utils.fixes` assuming we can agree in scikit-learn#12927 that `sklearn.utils.fixes` are private as was stated e.g. in scikit-learn#6616 (comment)
This continues the work done in scikit-learn#12639 on dropping the python 2 support by, - ~~removing unnecessary `from __future__` imports~~ - removing unused `sklearn.utils.fixes` assuming we can agree in scikit-learn#12927 that `sklearn.utils.fixes` are private as was stated e.g. in scikit-learn#6616 (comment)
I have opened #14913 that renames Imports from I would like to do that for all modules. Please comment whether you'd like to see this happen. Advantages:
Drawbacks:
|
I would like to do that for all modules. Please comment whether you'd like to see this happen.
+1 for this happening, with appropriate deprecation cycle.
|
Please note that this approach will break pickles once the deprecation ends.
|
(not that we care so much about pickle compatibility, but that we should be
aware that users will)
|
+1 on moving forward with this. |
I proposed this in 2017 (#9250) and I'm still in favor. |
I'm wondering about the effect on merge conflicts of this. If it was just a rename, git would probably handle it fine, but we are both moving So I think we should at least evaluate some approaches that would preserve backward compatibility of imports from |
Or maybe what we need is a tool to help solve such merge conflicts. It's
still the same patch, it just needs to be applied with respect to a
different path.
|
I agree that the conflicts aren't as trivial as I thought. The solution I could come up with so far is that the PR authors must create a commit where they rename Then the merge with master is easy. But the process isn't necessarily obvious yeah |
Closing as a duplicate of #9250 If interested, please subscribe to that issue, as currently the conversation is split between here and there, which is confusing to follow. |
In several PRs (e.g. #11182 #12916) the question arises whether we need to deprecate some object before it can be removed or changed. This goes back to defining what is public API in scikit-learn.
The lest controversial definition is that,
_
are privateHowever, you could do, for instance,
Does it mean that we are supposed to preserve backward compatibility on
sklearn.cluster.dbscan_.NearestNeighbors
in terms of import path? How aboutsklearn.preprocessing.data.sparse
(scipy.sparse
)?I guess not, meaning that just because we have an import path without an underscore does not mean that it is part of the public API. At the very least it also needs to be documented or used in examples.
If we take this definition,
sklearn.utils
, is public API (as documented https://scikit-learn.org/stable/modules/classes.html#module-sklearn.utils) but certainly not all 151 objects listed in Public vs. private utils again #6616 (comment)sklearn.utils.fixes
is not.sklearn.externals
are not (except forsklearn.externals.joblib
) that we previously used in examples.This would mean that we can e.g. remove
sklearn.externals.six
in #12916 without a deprecation warning (but possibly with a what's new entry). I have a hard time seeing a user reasonably complaining that we didn't go through a deprecation cycle there.This would also help resolving the "public vs private utils" discussion in #6616
WDYT, do you have other ideas of how we should define what is public API in scikit-learn?
cc @scikit-learn/core-devs
The text was updated successfully, but these errors were encountered: