Skip to content

RFC Make estimator tags available at class level instead of instance level #20590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
adrinjalali opened this issue Jul 23, 2021 · 6 comments
Closed

Comments

@adrinjalali
Copy link
Member

#20350 uses class attributes such as _metadata_request__sample_weight to change or set default values for requests related to certain metadata. Those attributes need to be available at the class level (as opposed to instance) since we generate certain methods for the classes and that part of the process has access to the class, and not the instance. Therefore as it stands we can't put default request information in estimator tags.

But what we do there, is that we go through the MRO, and we find all we need, and we create the default values. This is very similar to what estimator tags do.

On the other hand, estimator tags are really an attribute of the class, and not the instance, therefore it'd make sense for them to be class attributes.

Are there any reasons why we have them specifically on the instance? If not, I propose we move them to be a class thing.

WDYT @scikit-learn/core-devs ?

@thomasjpfan
Copy link
Member

Fundamentally, I have been thinking that the estimator tags feel more natural attached to the class.

We use the instance when we define "pairwise":

def _more_tags(self):
return {"pairwise": self.affinity == "precomputed"}

That was a recent addition, where I assumed that tags will stay connected to the instance. (If we want to move tags into the class level, I would greatly prefer "pairwise" to just be an public attribute third party developers can set "X_pairwise_".

@NicolasHug
Copy link
Member

Are there any reasons why we have them specifically on the instance?

Meta-estimators I assume, since they must delegate most tags to the inner estimator which is only known after instantiation, and sometimes even after fit()

@thomasjpfan
Copy link
Member

thomasjpfan commented Jul 23, 2021

Meta-estimators I assume, since they must delegate most tags to the inner estimator which is only known after instantiation, and sometimes even after fit()

Hmm, I guess the most visible one is *SearchCV when searching through estimators, *SearchCV should inherit the tags of the estimator that it found. (I do not think this is implemented)

@NicolasHug
Copy link
Member

There are other meta-estimators like SFS and RFE, but not as many as I expected:

(sklearn-dev) ➜  scikit-learn git:(main) ✗ git grep -n _safe_tags\(self
sklearn/feature_selection/_base.py:86:            force_all_finite=not _safe_tags(self, key="allow_nan"),
sklearn/feature_selection/_from_model.py:314:        return {"allow_nan": _safe_tags(self.estimator, key="allow_nan")}
sklearn/feature_selection/_rfe.py:415:            "allow_nan": _safe_tags(self.estimator, key="allow_nan"),
sklearn/feature_selection/_sequential.py:236:            "allow_nan": _safe_tags(self.estimator, key="allow_nan"),
sklearn/model_selection/_search.py:382:            "pairwise": _safe_tags(self.estimator, "pairwise"),
sklearn/multiclass.py:577:        return {"pairwise": _safe_tags(self.estimator, key="pairwise")}
sklearn/multiclass.py:880:        return {"pairwise": _safe_tags(self.estimator, key="pairwise")}
sklearn/pipeline.py:642:        return {"pairwise": _safe_tags(self.steps[0][1], "pairwise")}
sklearn/utils/_tags.py:34:    where `est` comes from: typically `_safe_tags(self.base_estimator)` where

(I do not think this is implemented)

Indeed, tag delegation doesn't seem to be fully implemented. But it does seem that we rely on it at least partially, so making the tags a class-level property might be a bit tricky

@jnothman
Copy link
Member

For good or bad, the parameters of an estimator fundamentally change its capabilities/API at times. I think the answer to this is "sorry, no, find another way".

@adrinjalali
Copy link
Member Author

Ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants