Skip to content

FIX Fixes tags for passthrough in Pipeline #18820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

thomasjpfan
Copy link
Member

Reference Issues/PRs

Fixes #18815

What does this implement/fix? Explain your changes.

This PR makes pipeline a little more forgiving when working with steps that does not have _get_tags.

@thomasjpfan thomasjpfan added this to the 0.24 milestone Nov 12, 2020
Copy link
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do we need an entry in what's new. It does look so private :)

@NicolasHug
Copy link
Member

Thanks for the PR, let's not merge this until we decide on #18797 (comment) please, I think this is strongly related. In particular the suggested option 2 would fix the original issue in a more general and consistent way

@thomasjpfan
Copy link
Member Author

This PR also is a bug fix because of Pipeline also accepts None and 'passthrough' which does not have tags.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the remark of @thomasjpfan above. WDYT @NicolasHug shall we merge this independently of the other issues?

estimator_tags = self.steps[0][1]._get_tags()
return {'pairwise': estimator_tags.get('pairwise', False)}
try:
pairwise = self.steps[0][1]._get_tags().get('pairwise', False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer just trying hasattr because otherwise it's not clear what may raise the AttributeError (could be e.g. self.steps).

Also about .get('pairwise', False): I think it's fair to assume that if _get_tags exists, then the key will exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also about .get('pairwise', False): I think it's fair to assume that if _get_tags exists, then the key will exist.

This opens the discussion of "Which tags should we expect to exist". This becomes a bigger question when BaseEstimator is not expected to be inherited.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which tags should we expect to exist

If _get_tags() exists, then all of them should exist I'd say

Also, I'm uncomfortable with having the tag's default defined twice: here, and in sklearn.base. The default might not change anytime soon, but this is still a pattern that we should avoid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If _get_tags() exists, then all of them should exist I'd say

This feels like something to be tested in check_estimator for "api-only".

@NicolasHug
Copy link
Member

I understand this is a bugfix for None, but more generally this is a bugfix for any object that does not have get_tags(). So maybe we can wait until Monday's discussion (w.r.t. inheriting from BaseEstimator)?

If we agree to use _safe_tags in meta-estimators, which is my preferred solution for now, then we could just call _safe_tags() here.

Also I believe that keeping this issue open will be helpful during Monday's discussions. If there's a rush to get things in because of the release, I don't want to block it either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error raised during grid search on pipeline with None for transformer step
5 participants