Description
On the Utilities for Developers page, it states:
Warning: These utilities are meant to be used internally within the scikit-learn package. They are not guaranteed to be stable between versions of scikit-learn. Backports, in particular, will be removed as the scikit-learn dependencies evolve.
If we want to provide utilities to support third-party estimators, we should treat some of these utilities as "first class" citizens.
For example, `safe_indexing` would be extremely useful for third parties that want to support DataFrames as input. Currently, the options for third-party developers are to build their own "safe_indexing" or to depend on our private version, which may not be stable.
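To illustrate what "build their own" can look like, here is a minimal sketch of a row-selection helper a third party might write. The name `simple_safe_indexing` and its duck-typing rules are hypothetical and are not scikit-learn's implementation; a real helper would also need to handle boolean masks, sparse matrices, and column selection.

```python
def simple_safe_indexing(X, indices):
    """Select rows of X by integer position.

    Hypothetical third-party helper, not scikit-learn's private utility.
    """
    if hasattr(X, "iloc"):
        # pandas-like containers: use positional indexing
        return X.iloc[indices]
    if hasattr(X, "shape"):
        # NumPy-like arrays: integer-list indexing selects rows
        return X[indices]
    # Plain Python sequences: pick rows one by one
    return [X[i] for i in indices]


rows = [[1, 2], [3, 4], [5, 6]]
subset = simple_safe_indexing(rows, [0, 2])
print(subset)
```

Every project that wants DataFrame support ends up maintaining some variant of this, which is the duplication a public, supported utility would avoid.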
Another example is scikit-learn/enhancement_proposals#22, which defines an `n_features_in_` contract that we will comply with internally through private methods. Third-party estimators would need to build their own methods or functions to work with the SLEP.
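As a rough sketch of what a third-party estimator would have to write on its own, the helper below records the feature count during `fit` and validates it during `predict`. The function name `check_n_features`, the `reset` flag, and the `MeanPredictor` class are hypothetical, and the behavior shown is an assumption about the contract; the SLEP is the authoritative definition.

```python
def check_n_features(estimator, X, reset):
    """Record or validate the number of input features.

    Hypothetical helper; assumes X is a non-empty list of equal-length rows.
    """
    n_features = len(X[0])
    if reset:
        # Called from fit: remember how many features were seen
        estimator.n_features_in_ = n_features
        return
    if n_features != estimator.n_features_in_:
        raise ValueError(
            f"X has {n_features} features, but the estimator was fitted "
            f"with {estimator.n_features_in_} features."
        )


class MeanPredictor:
    """Toy estimator that always predicts the mean of the training targets."""

    def fit(self, X, y):
        check_n_features(self, X, reset=True)
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        check_n_features(self, X, reset=False)
        return [self.mean_ for _ in X]


est = MeanPredictor().fit([[1, 2], [3, 4]], [1.0, 3.0])
print(est.n_features_in_, est.predict([[5, 6]]))
```

If a helper like this were public and supported, third-party estimators could follow the SLEP without each reimplementing the check.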
TLDR: Now that many of the utilities are "private", we can make deliberate decisions about which utilities should be public and supported by us. This would mean deprecation cycles, etc. If we support some of the `utils` module, it will be easier to build estimators, which will enrich the ecosystem of scikit-learn compatible estimators.
CC @scikit-learn/core-devs